Is a 4-Bit Synaptic Weight Resolution Enough? – Constraints on Enabling Spike-Timing Dependent Plasticity in Neuromorphic Hardware

Large-scale neuromorphic hardware systems typically face a trade-off between level of detail and required chip resources. Especially when implementing spike-timing dependent plasticity, reduction in resources leads to limitations as compared to floating point precision. By design, a natural modification that saves resources would be reducing synaptic weight resolution. In this study, we give an estimate for the impact of synaptic weight discretization on different levels, ranging from random walks of individual weights to computer simulations of spiking neural networks. The FACETS wafer-scale hardware system offers a 4-bit resolution of synaptic weights, which is shown to be sufficient within the scope of our network benchmark. Our findings indicate that increasing the resolution may not even be useful in light of further restrictions of customized mixed-signal synapses. In addition, variations due to production imperfections are investigated and shown to be uncritical in the context of the presented study. Our results represent a general framework for setting up and configuring hardware-constrained synapses. We suggest how weight discretization could be considered for other backends dedicated to large-scale simulations. Thus, our proposition of a good hardware verification practice may give rise to synergy effects between hardware developers and neuroscientists.


INTRODUCTION
Computer simulations have become an important tool to study cortical networks (e.g. Markram et al., 1997; Brunel, 2000; Morrison et al., 2005, 2007; Brette et al., 2007; Johansson and Lansner, 2007; Vogelstein et al., 2008; Kunkel et al., 2011; Yger et al., 2011). While they provide insight into activity dynamics that cannot otherwise be measured in vivo or calculated analytically, they can be very time-consuming and consequently unsuitable for statistical analyses, especially for learning neural networks. Even the ongoing enhancement of the von Neumann computer architecture is not likely to reduce simulation runtime significantly, as both single- and multi-core scaling face their limits in terms of transistor size (Thompson and Parthasarathy, 2006), energy consumption (Esmaeilzadeh et al., 2011), or communication (Perrin, 2011).
Neuromorphic hardware systems are an alternative to von Neumann computers that alleviates these limitations. Their underlying VLSI microcircuits are specifically designed to solve neuron dynamics and can be highly accelerated compared to biological time (Indiveri et al., 2011). For most neuron models whose dynamics can be stated analytically, the equations can be evaluated either digitally (Plana et al., 2007) by means of numerical methods or with analog circuits that solve the neuron equations intrinsically. The analog approach has the advantage of maximal parallelism, as all neuron circuits evolve simultaneously in continuous time. Furthermore, high acceleration factors compared to biological time (e.g. up to 10^5 reported by Millner et al., 2010) can be achieved by reducing the size of the analog neuron circuits. Nevertheless, many neuromorphic hardware systems are developed for operation in real-time to be applied in sensor applications or medical implants (Fromherz, 2002; Vogels et al., 2005; Levi et al., 2008).
Typically, the large number of programmable and possibly plastic synapses accounts for the major part of chip resources in neuromorphic hardware systems (Figure 1). Hence, the limited chip area requires a trade-off between the number and size of neurons and their synapses, while providing sufficiently complex dynamics. For example, decreasing the resolution of synaptic weights offers an opportunity to reduce the area required for synapses and therefore allows more synapses on a chip, rendering the synaptic weights discretized.
In this study, we will analyze the consequences of such a weight discretization and propose generic configuration strategies for spike-timing dependent plasticity on discrete weights. Deviations from original models caused by this discretization are quantified by particular benchmarks. In addition, we will investigate further hardware restrictions specific to the FACETS¹ wafer-scale hardware system (FACETS, 2010), a pioneering neuromorphic device that implements a large number of both configurable and plastic synapses (Schemmel et al., 2008; Brüderle et al., 2011). To this end, custom hardware-inspired synapse models are integrated into a network benchmark using the simulation tool NEST (Gewaltig and Diesmann, 2007). The objective is to determine the smallest hardware implementation of synapses that does not distort the behavior of theoretical network models that have been validated by computer simulations.
www.frontiersin.org
FIGURE 1 | Photograph of the HICANN (High Input Count Analog Neural Network) chip, the basic building block of the FACETS wafer-scale hardware system. Notice the large area occupied by mixed-signal synapse circuits (yellow boxes) compared to neuron circuits (orange boxes). A digital communication infrastructure (area between red and green boxes) ensures a high density of connections between neurons on the same and to other HICANN chips.

SPIKE-TIMING DEPENDENT PLASTICITY
Here, Spike-Timing Dependent Plasticity (STDP) is treated as a pair-based update rule as reviewed by e.g. Morrison et al. (2008). Most pair-based STDP models (Song et al., 2000; van Rossum et al., 2000; Gütig et al., 2003; Morrison et al., 2007) separate weight modifications δw into a spike-timing dependent factor x(∆t) and a weight-dependent factor F(w):
$$\delta w(w, \Delta t) = F(w)\, x(\Delta t), \qquad (1)$$
where ∆t = t_i − t_j denotes the interval between spike times t_j and t_i at the pre- and postsynaptic terminal, respectively. Typically, x(∆t) is chosen to be exponentially decaying (e.g. Gerstner et al., 1996; Kempter et al., 1999). In contrast, the weight-dependence F(w), which is divided into F_+(w) for a causal and F_−(w) for an anti-causal spike-timing-dependence, differs between different STDP models. Examples are given in Table 1. As F_+(w) is positive and F_−(w) negative for all these STDP models, causal relationships (∆t > 0) between pre- and postsynaptic spikes potentiate and anti-causal relationships (∆t < 0) depress synaptic weights. In this study, the intermediate Gütig STDP model (bounded to the weight range [0, 1]) is chosen as an example STDP model. It represents a mixture of the multiplicative (µ = 1) and additive (µ = 0) STDP model and has been shown to provide stability in competitive synaptic learning (Gütig et al., 2003). Nevertheless, the following studies can be applied to any pair-based STDP model with exponentially decaying time-dependence, e.g. all models listed in Table 1.
¹ Fast Analog Computing with Emergent Transient States

SYNAPSES IN LARGE-SCALE HARDWARE SYSTEMS
The FACETS wafer-scale hardware system (Schemmel et al., 2008; Brüderle et al., 2011) represents an example of a possible synapse size reduction in neuromorphic hardware systems. Figure 2 schematizes the hardware implementation of a synapse enabling STDP, similar to that presented in Schemmel et al. (2006) and Schemmel et al. (2007). It provides the functionality to store the value of the synaptic weight, to measure the spike-timing-dependence between pre- and postsynaptic spikes and to update the synaptic weight according to this measurement. Synapse density is maximized by separating the accumulation of the spike-timing-dependence x(∆t) from the weight update controller, which is the hardware implementation of F(w). This allows 4·10^7 synapses on a single wafer.
Synaptic dynamics in the FACETS wafer-scale hardware system exploits the fact that weight dynamics typically evolves slower than electrical neuronal activity (Kunkel et al., 2011). Therefore, weight updates can be divided into two steps (Figure 2). First, a measuring and accumulation step locally determines the relative spike times between pairs of neurons and thus x(∆t). This stage is designed in analog hardware (red area in Figure 2), as analog measurement and accumulation circuits require less chip resources than digital realizations thereof. Second, the digital weight update controller (upper green area in Figure 2) implements F(w) based on the previous analog result. A global weight update controller² is responsible for the consecutive updates of many synapses (Schemmel et al., 2006) and hence limits the maximal rate at which a synapse can be updated, the update controller frequency v_c.
Frontiers in Neuroscience | Neuromorphic Engineering
FIGURE 2 | Schematic drawing of local hardware synapses which are consecutively processed by a global weight update controller. Analog circuits are highlighted in red (with solid frame) and digital circuits in green (dashed frames). The spike-timing-dependence (here one standard spike pair (SSP) with ∆t_s, see text) between the pre- and postsynaptic neuron is (a) measured (here a_SSP) and (b) accumulated (here to a_c in case of a causal spike pair; a_a for anti-causal spike pairs is not affected). Then, the global weight update controller evaluates the accumulated spike-timing-dependence by means of a crossed threshold a_th (here a_c > a_th) and modifies the digital weight of the hardware synapse accordingly. The new synaptic weight w_{n+1} is retrieved from the LUT according to the accumulated spike-timing-dependence and the current weight w_n and is written back to the hardware synapse. The analog measurement and accumulation circuit is furthermore minimized by using the reduced symmetric nearest-neighbor spike pairing scheme (Morrison et al., 2008): instead of considering all past and future spikes (all-to-all spike pairing scheme), only the latest and the following spike at both terminals of the synapse are taken into account.
Sharing one weight update controller reduces synapses to small analog measurement and accumulation circuits as well as a digital circuit that implements the synaptic weight (Figure 2). The area required to implement these digital weights with a resolution of r bits is proportional to 2^r, the number of discrete weights. Consequently, assuming the analog circuits to be fixed in size, the size of a synapse is determined by its weight storage, which grows exponentially with the weight resolution. For example, the FACETS wafer-scale hardware system has a weight resolution of r = 4 bits, for which the previously described analog and digital circuits occupy equal areas on the chip.
Modifications in the layout of synapse circuits are time-consuming and involve expensive re-manufacturing of chips. Thus, the configuration of connections between neurons is designed to be flexible enough to avoid these modifications and provide a general-purpose modeling environment. For the same reason, STDP conforms to the majority of available update rules. The STDP models listed in Table 1 share the same time-dependence x(∆t). Its exponential shape is mimicked by small analog circuits that do not allow for other time-dependencies (Schemmel et al., 2006, 2007). The widely differing weight-dependences F(w), on the other hand, are programmable into the weight update controller. Due to limited weight update controller resources, arithmetic operations F(w) as listed in Table 1 are not realizable and are replaced by a programmable look-up table (LUT; Schemmel et al., 2006). Such a LUT lists, for each discrete weight, the resulting weights in case of causal or anti-causal spike-timing-dependence between pre- and postsynaptic spikes. Instead of performing arithmetic operations during each weight update (equation 1), LUTs are used as a recallable memory consisting of precalculated weight modifications. Hence, LUTs do not limit the flexibility of weight updates if their weight-dependence (Table 1) does not change over time. Throughout this study, we prefer the concept of LUTs to arithmetic operations, because we want to focus on the discretized weight space, a state space of limited dimension.
² One weight update controller for all 256 neurons with 224 synapses each.
In addition to STDP, the FACETS wafer-scale hardware system also supports a variant of short-term plasticity mechanisms (Tsodyks and Markram, 1997; Bi and Poo, 1998; Schemmel et al., 2007), which, however, leaves synaptic weights unchanged and therefore lies outside the scope of this study.

DISCRETIZATION OF SYNAPTIC WEIGHTS
Continuous weight values w_c ∈ [0, 1], as assumed for the STDP models listed in Table 1, are transformed into r-bit coded discrete weight values w_d:
$$w_d = c \left\lfloor \frac{w_c}{c} + \frac{1}{2} \right\rfloor, \qquad (2)$$
where c = 1/(2^r − 1) denotes the width of a bin and ⌊x⌋ the floor function, the largest integer less than or equal to x. This procedure divides the range of weight values I = [0, 1] into 2^r bins. The term 1/2 allows for a correct discretization of weight values near the borders of I, effectively halving the width of the end bins (otherwise, only w_c = 1 would be mapped to w_d = 1).
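The rounding rule of this section can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `discretize` is chosen here for illustration and is not taken from the hardware software stack.

```python
import math


def discretize(w_c: float, r: int) -> float:
    """Round a continuous weight w_c in [0, 1] to one of 2**r discrete values."""
    c = 1.0 / (2 ** r - 1)  # width of a bin
    # The +1/2 inside the floor turns truncation into rounding to the nearest bin.
    return c * math.floor(w_c / c + 0.5)


# Example: with r = 2 bits the discrete weights are 0, 1/3, 2/3 and 1.
for w in (0.0, 0.1, 0.2, 0.5, 0.9, 1.0):
    print(w, round(discretize(w, 2), 3))
```

Without the term 1/2 the floor function would truncate, so that every weight below 1 would be mapped to a value below 1; with it, the two end bins are half as wide as the interior ones.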

DISCRETIZATION OF SPIKE-TIMING DEPENDENT PLASTICITY
A single weight update, resulting from a pre- and postsynaptic spike, might be too fine-grained to be captured by a low weight resolution (equation 2). Therefore, it is necessary to accumulate the effect of weight updates of several consecutive spike pairs in order to reach the next discrete weight value (Figure 2). This is equivalent to stating that the implementation of the STDP model assumes additive features for ms range intervals. To this end, we define a standard spike pair (SSP) as a spike pair with a time interval between a pre- and postsynaptic spike of ∆t_s = 10 ms, in accordance with biological measurements by Bi and Poo (2001), Sjöström et al. (2001), Markram (2006), in order to provide a standardized measure for the spike-timing-dependence. The exact value of this time interval is arbitrary and defines the granularity only (fine enough for the weight resolutions of interest). It is valid for both pre-post and post-pre spike pairs, as x(∆t) depends only on the absolute value of ∆t.

The values for a LUT are constructed as follows. First, the parameters r (weight resolution) and n (number of SSPs consecutively applied for an accumulated weight update) as well as the STDP rule-specific parameters τ_STDP, λ, µ, α (Table 1) are chosen. Next, starting with a discrete weight w_d, weight updates δw(w, ∆t_s) specified by equation (1) are recursively applied n times in continuous weight space using exclusively either F_+(w) or F_−(w). This results in two accumulated weight updates ∆w_{+/−}, one for each weight-dependence F_{+/−}(w). Finally, the resulting weight value in continuous space is transformed back to its discrete representation according to equation (2). This process is then carried out for each possible discrete weight value w_d (Table 2). We will further compare different LUTs, letting n be a free parameter. In the following, a weight update refers to ∆w, if not specified otherwise.
Although we are focusing on the Gütig STDP model, the updated weight values can in general under- or overrun the allowed weight interval I due to finite weight updates ∆w. In this case, the weight is clipped to its minimum or maximum value, respectively.
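The construction steps above can be sketched in Python as follows. The sketch assumes the Gütig weight dependence in its usual form, F_+(w) = λ(1 − w)^µ and F_−(w) = −λα w^µ, together with x(∆t) = exp(−|∆t|/τ_STDP). The function names and the default parameter values are placeholders chosen for illustration, not the values used in this study. The table stores bin indices rather than weights, and clipping is applied after every single update, which is an implementation choice of this sketch.

```python
import math


def build_lut(r, n, lam=0.005, mu=0.4, alpha=1.05, tau_stdp=20.0, dt_s=10.0):
    """Return a LUT for r-bit weights and n accumulated SSPs.

    All default parameter values are illustrative assumptions.
    Times are in ms. The LUT maps a bin index to the resulting bin index.
    """
    x = math.exp(-abs(dt_s) / tau_stdp)  # spike-timing factor of one SSP
    levels = 2 ** r
    c = 1.0 / (levels - 1)               # width of a bin
    lut = {"causal": [], "anticausal": []}
    for i in range(levels):
        for key in ("causal", "anticausal"):
            w = i / (levels - 1)         # start at the discrete weight
            for _ in range(n):           # apply the update n times
                if key == "causal":
                    w += lam * (1.0 - w) ** mu * x
                else:
                    w -= lam * alpha * w ** mu * x
                w = min(1.0, max(0.0, w))  # clip to the interval [0, 1]
            index = min(levels - 1, math.floor(w / c + 0.5))
            lut[key].append(index)
    return lut


lut = build_lut(r=4, n=36)
print(lut["causal"])
print(lut["anticausal"])
```

With these placeholder parameters, a very small n leaves every entry pointing to itself, whereas a very large n sends every entry to one of the two borders of the weight range.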

EQUILIBRIUM WEIGHT DISTRIBUTIONS
We analyze long-term effects of weight discretization by studying the equilibrium weight distribution of a synapse that is subject to Poissonian pre- and postsynaptic firing. Thus, potentiation and depression are equally probable (p_d = p_p = 1/2). Equilibrium weight distributions in discrete weight space of low resolution (between 2 and 10 bits) are compared to those with high resolution (16 bits) via the mean squared error MSE_eq. Consecutive weight updates are performed based on precalculated LUTs.
Equilibrium weight distributions of discrete weights for a given weight resolution of r bits are calculated as follows. First, a LUT for 2^r discrete weights is configured with n SSPs. Initially, all 2^r discrete weight values w_i have the same probability P_{i,0} = 1/2^r. For a compact description, the discrete weights w_i are mapped to a 2^r-dimensional space with unit vectors e_i ∈ ℕ^{2^r}. Then, for each iteration cycle j, the probability distribution is defined by
$$P_j = \sum_{i=0}^{2^r-1} P_{i,j-1}\,\left(p_p\, e_c + p_d\, e_a\right),$$
where P_{i,j−1} is the probability for each discrete weight value w_i of the previous iteration cycle j − 1. The indices c and a of e_c and e_a are those of the discrete weight values resulting from w_i in case of a causal and anti-causal weight update, respectively, and are given by the LUT. We define an equilibrium state as reached if the Euclidean norm ‖P_{j−1} − P_j‖ is smaller than a threshold h = 10^−12. An analytical approach for obtaining equilibrium weight distributions is derived in Section 6.1.
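The iteration described above amounts to repeatedly pushing probability mass through the LUT. A minimal Python sketch is given below. The function name, the `max_iter` safeguard and the toy LUT in the usage example are assumptions made for illustration only.

```python
import math


def equilibrium_distribution(lut_causal, lut_anticausal,
                             p_p=0.5, p_d=0.5, h=1e-12, max_iter=1_000_000):
    """Iterate the probability distribution over bin indices until it settles.

    lut_causal[i] and lut_anticausal[i] are the bin indices reached from
    bin i by a potentiation or depression step, respectively.
    """
    size = len(lut_causal)
    prob = [1.0 / size] * size           # P_{i,0} = 1 / 2**r
    for _ in range(max_iter):
        new = [0.0] * size
        for i, p_i in enumerate(prob):
            new[lut_causal[i]] += p_p * p_i
            new[lut_anticausal[i]] += p_d * p_i
        # Euclidean norm of the change between two iteration cycles
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(prob, new)))
        prob = new
        if change < h:
            return prob
    raise RuntimeError("distribution did not settle within max_iter cycles")


# Toy LUT with 4 bins: potentiation moves one bin up, depression resets to 0.
print(equilibrium_distribution([1, 2, 3, 3], [0, 0, 0, 0]))
```

The `max_iter` guard is not part of the described procedure. It is added here because a LUT that produces a strictly periodic walk would otherwise never satisfy the stopping criterion.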

SPIKING NETWORK BENCHMARKS
In addition to the behavior under Poissonian noise, we study the impact of discretized weights with a software implementation of hardware synapses, enabling us to analyze synapses in isolation as well as in network benchmarks. The design of our simulation environment is flexible enough to take further hardware constraints and biological applications into account.

Software implementation of hardware synapses
The hardware constraints considered in this study are implemented as a customized synapse model within the framework of the NEST simulation tool (Gewaltig and Diesmann, 2007), allowing their well controlled application in simulator-based studies on large-scale neural networks. The basic properties of such a hardware-inspired synapse model are described as follows and are illustrated in Figures 2 and 5.
For each LUT configuration defined by its weight resolution r and number n of SSPs, the threshold for allowing weight updates is set to
$$a_{\mathrm{th}} = n \cdot a, \qquad (3)$$
defining a = Σ_i x(∆t_i) as the spike pair accumulation for arbitrary intervals. Here, a single SSP is used, setting a = a_SSP = x(∆t_s). If either the causal or the anti-causal spike pair accumulation (a_c or a_a) crosses the threshold a_th, the synapse is "tagged" for a weight update. At the next cycle of the weight update controller all tagged synapses are updated according to the LUT. Afterward, the spike pair accumulation (causal or anti-causal) is reset to zero. Untagged synapses remain unprocessed by the update controller, and spike pairs are further accumulated without performing any weight update. If a synapse accumulates both a_c and a_a above threshold between two cycles of the weight update controller, both are reset to zero without updating the synaptic weight. This threshold process implies that the frequency v_w of weight updates depends on n, which in turn determines the threshold a_th, but also on the firing rates and the correlation between the pre- and postsynaptic spike train. In general, a increases faster with higher firing rates or higher correlations. To circumvent these dependencies on network dynamics, we will use n as a generalized description for the weight update frequency v_w. The weight update frequency v_w should not be confused with the update controller frequency v_c, at which synapses are checked for threshold crossings and which hence limits v_w. Furthermore, we have implemented a reference synapse model in NEST, which is based on Gütig et al. (2003). As a simplification, it employs nearest-neighbor instead of all-to-all spike pairing (Morrison et al., 2008).
All simulations involving synapses are carried out with NEST. Spike trains are applied to built-in parrot neurons, which simply repeat their input, in order to control the pre- and postsynaptic spike trains of the interconnecting synapses.
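The tagging and reset logic described above can be summarized in a small Python class. This is a hedged sketch, not the NEST synapse model used in this study: the class and method names, the default τ_STDP and the small numerical tolerance in the threshold comparison are all assumptions made for illustration.

```python
import math


class HardwareSynapseSketch:
    """Toy model of spike pair accumulation with a thresholded LUT update."""

    def __init__(self, lut_causal, lut_anticausal, index, n,
                 tau_stdp=20.0, dt_s=10.0):
        self.lut_causal = lut_causal
        self.lut_anticausal = lut_anticausal
        self.index = index                # current discrete weight (bin index)
        self.tau_stdp = tau_stdp          # ms, illustrative value
        self.a_th = n * math.exp(-abs(dt_s) / tau_stdp)  # a_th = n * a_SSP
        self.a_c = 0.0                    # causal accumulation
        self.a_a = 0.0                    # anti-causal accumulation

    def add_spike_pair(self, dt):
        """Accumulate x(dt) for one spike pair with interval dt (in ms)."""
        x = math.exp(-abs(dt) / self.tau_stdp)
        if dt > 0:
            self.a_c += x
        elif dt < 0:
            self.a_a += x

    def controller_cycle(self):
        """One visit of the weight update controller; returns the bin index."""
        tol = 1e-12 * self.a_th           # guards against rounding in the sums
        causal = self.a_c >= self.a_th - tol
        anticausal = self.a_a >= self.a_th - tol
        if causal and anticausal:         # both crossed: reset, no update
            self.a_c = self.a_a = 0.0
        elif causal:
            self.index = self.lut_causal[self.index]
            self.a_c = 0.0
        elif anticausal:
            self.index = self.lut_anticausal[self.index]
            self.a_a = 0.0
        return self.index


synapse = HardwareSynapseSketch([1, 2, 3, 3], [0, 0, 1, 2], index=1, n=3)
for _ in range(3):
    synapse.add_spike_pair(10.0)          # three causal SSPs
print(synapse.controller_cycle())
```

Calling `controller_cycle` only every k-th simulation step mimics a limited update controller frequency v_c, because the accumulation can then overshoot the threshold before it is evaluated.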

Single synapse benchmark
We compare the weight evolutions of hardware-inspired and reference synapses receiving correlated pre- and postsynaptic spike trains, drawn from a multiple interaction process (MIP; Kuhn et al., 2003). This process introduces excess synchrony between two realizations by randomly thinning a template Poisson process. SSPs are then obtained by shifting one of the processes by ∆t_s.
In this first scenario the spike pair accumulation a is checked for crossing a_th with a frequency of v_c = 10 Hz to focus on the effects of discrete weights only. This frequency corresponds to the simulation step size, preventing the spike pair accumulation from overshooting the threshold a_th without eliciting a weight update.
Synaptic weights are recorded in time steps of 3 s for an overall period of 150 s and are averaged over 30 random MIP realizations. Afterward, the mean weight at each recorded time step is compared between the hardware-inspired and the reference synapse model by means of the mean squared error MSE_w.
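The thinning construction of the MIP can be sketched with the Python standard library alone. The function name and its arguments are chosen for illustration. The sketch assumes a template Poisson process of rate ν/c from which every spike is copied into each realization independently with probability c, so that each realization has rate ν.

```python
import random


def mip_spike_trains(rate_hz, c, duration_s, n_trains=2, seed=None):
    """Draw correlated spike trains by thinning one template Poisson process.

    Requires 0 < c <= 1. Returns a list of sorted spike time lists (seconds).
    """
    rng = random.Random(seed)
    template_rate = rate_hz / c
    trains = [[] for _ in range(n_trains)]
    t = rng.expovariate(template_rate)    # first template spike
    while t < duration_s:
        for train in trains:
            if rng.random() < c:          # keep the spike with probability c
                train.append(t)
        t += rng.expovariate(template_rate)
    return trains


pre, post = mip_spike_trains(rate_hz=10.0, c=0.5, duration_s=150.0, seed=1)
dt_s = 0.010                              # 10 ms shift turns coincidences into SSPs
post_shifted = [t + dt_s for t in post]
print(len(pre), len(post_shifted))
```

For c = 1 both realizations are identical copies of the template, while small c yields nearly independent Poisson trains.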

Network benchmarks
The detection of presynaptic synchrony is taken as a benchmark for synapse implementations. Two populations of 10 neurons each converge to an integrate-and-fire neuron with exponentially decaying synaptic conductances (see schematic in Figure 7A and model description in Tables 7 and 8) by either hardware-inspired or reference synapses. These synapses are excitatory, and their initial weights are drawn randomly from a uniform distribution over [0, 1). The amplitude of the postsynaptic conductance is w·g_max with g_max = 100 nS. One population draws its spikes from a MIP with correlation coefficient c (Kuhn et al., 2003), the other from a Poisson process (MIP with c → 0). We choose presynaptic firing rates of 7.2 Hz such that the target neuron settles at a firing rate of 2-22 Hz depending on the synapse model. The exact postsynaptic firing rate is of minor importance as long as the synaptic weights reach an equilibrium state. The synaptic weights are recorded for 2,000 s with a sampling frequency of 0.1 Hz. The two resulting weight distributions are compared by applying the Mann-Whitney U test (Mann and Whitney, 1947).

Further constraints.
Not only the discretization of synaptic weights, but also the update controller frequency v_c and the reset behavior are constraints of the FACETS wafer-scale hardware system.
To study effects caused by a limited update controller frequency, we choose v_c such that the interval between successive cycles is a multiple of the simulator time step. Consequently, weight updates can only occur on a time grid.
A common reset means that both the causal and anti-causal spike pair accumulations are reset, although only one of a_c and a_a has crossed a_th. Because the common reset requires only one reset line instead of two, it reduces the chip resources required by synapses and is implemented in the current FACETS wafer-scale hardware system. As a basis for a possible compensation mechanism for the common reset, we suggest analog-to-digital converters (ADCs) with a 4-bit resolution that read out the spike pair accumulations. Such ADCs require only a small chip area in the global weight update controller compared to the large area occupied by additional reset lines covering all synapses and are therefore resource-saving alternatives to second reset lines. An ADC allows the spike pair accumulations to be compared against multiple thresholds. Implementations of the common reset as well as ADCs are added to the existing software model. Multiple thresholds require the same number of LUTs, which have to be chosen carefully. To provide symmetry within the order of consecutive causal and anti-causal weight updates, the spike pair accumulation (causal or anti-causal) that dominates in the sense of crossing a higher threshold is applied first.

Peri-stimulus-time-histograms.
The difference between static and STDP synapses in eliciting postsynaptic spikes in the above network benchmark can be analyzed with peri-stimulus-time-histograms (PSTHs). Here, PSTHs show the probability of postsynaptic spike occurrences in dependence on the delay between a presynaptic trigger and its following postsynaptic spike. Spike times are recorded within the last third of an elongated simulation of 3,000 s with c = 0.025. During the last 1,000 s the mean weights are already in their equilibrium state, but are still fluctuating around it. The first spike of any two presynaptic spikes within a time window of ∆t_on = 1 ms is used as a trigger. The length of ∆t_on is chosen small compared to the membrane time constant τ_m = 15 ms, such that the excitatory postsynaptic potentials of both presynaptic spikes overlap each other and increase the probability of eliciting a postsynaptic spike. On the other hand, ∆t_on is chosen large enough to not only include the simultaneous spikes generated by the MIP, but also include coincident spikes within the uncorrelated presynaptic population.
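One possible reading of the trigger definition and the resulting histogram is sketched below in Python. The function names, the bin width and the maximum delay are assumptions made for illustration. The sketch pools all presynaptic spikes, treats the earlier spike of every consecutive pair closer than ∆t_on as a trigger, and uses the first postsynaptic spike strictly after each trigger.

```python
from bisect import bisect_right


def find_triggers(pre_spikes, dt_on=1.0):
    """Return the earlier spike of each consecutive pair within dt_on (ms)."""
    times = sorted(pre_spikes)
    return [t0 for t0, t1 in zip(times, times[1:]) if t1 - t0 <= dt_on]


def psth(triggers, post_spikes, bin_width=1.0, t_max=50.0):
    """Histogram of delays from each trigger to the next postsynaptic spike.

    Counts are divided by the number of triggers, so every bin holds the
    fraction of triggers followed by a spike at that delay.
    """
    post = sorted(post_spikes)
    counts = [0] * int(t_max / bin_width)
    for t in triggers:
        k = bisect_right(post, t)         # first postsynaptic spike after t
        if k < len(post):
            delay = post[k] - t
            if delay < t_max:
                counts[min(len(counts) - 1, int(delay / bin_width))] += 1
    n = len(triggers)
    return [cnt / n for cnt in counts] if n else [0.0] * len(counts)


pre = [0.0, 0.5, 10.0, 20.0, 20.2]        # pooled presynaptic spikes in ms
post = [3.2, 25.0]
triggers = find_triggers(pre)
print(triggers, psth(triggers, post)[:8])
```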

HARDWARE VARIATIONS
In contrast to arithmetic operations in software models, analog circuits vary due to the manufacturing process, although they are identically designed. The choice of precision for all building blocks should be governed by those that distort network functionality most. In this study, we assume that variations within the analog measurement and accumulation circuits are likely to be a key requirement for these choices, as they operate on the lowest level of STDP. Circuit variations are measured and compared between the causal and anti-causal part within a synapse and between synapses. All measurements are carried out with the FACETS chip-based hardware system (Schemmel et al., 2006, 2007) with hardware parameters listed in Table 6. The FACETS chip-based hardware system shares a conceptually nearly identical STDP circuit with the FACETS wafer-scale hardware system (for details see Section 2), which was still in the assembly process during the course of this study. The hardware measurements are written in PyNN (Davison and Frégnac, 2006) and use the workflow described by Brüderle et al. (2011).

Measurement
The circuit variations due to production imperfection are measured by recording STDP curves and comparing their integrals for ∆t > 0 and ∆t < 0. The curves are recorded by applying equidistant pairs of pre- and postsynaptic spikes with a predefined latency ∆t. Presynaptic spikes can be fed into the hardware precisely. However, in contrast to NEST's parrot neurons, postsynaptic spikes are not directly adjustable and therefore have to be evoked by several synchronous external triggers (for details see Section 6.3). After discarding the first 10 spike pairs to ensure regular firing, the pre- and postsynaptic spike trains are shifted until the desired latency ∆t is measured. Due to the low spike pair frequency of 10 Hz, only the correlations within and not between the spike pairs are accumulated. The number N of consecutive spike pairs is increased until the threshold is crossed and hence a correlation flag is set (Figure 8A). The inverse of this number versus ∆t is called an STDP curve. Such curves were recorded for 252 synapses within one synapse column; the remaining 4 synapses in this column were discarded.
For each STDP curve the total area A_t = A_a + A_c is calculated and normalized by the mean Ā_abs of the absolute area A_abs = |A_a| + |A_c| over all STDP curves. Ideally, A_t would vanish if both circuits were manufactured identically. The standard deviation σ_a (assuming Gaussian distributed measurement data) of these normalized total areas A_t is taken as one measure for circuit variations. Besides this asymmetry, which measures the variation within a synapse, a measure for variation across synapses is the standard deviation σ_t of the absolute areas A_abs. To this end, the absolute areas A_abs under each STDP curve are again normalized by Ā_abs, and the mean of all these normalized absolute areas is subtracted.
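The two variation measures can be computed from the per-synapse areas as sketched below. The function name is hypothetical, the anti-causal areas are assumed to be passed as signed (negative) values, and the population standard deviation is used, which is an assumption since the estimator is not specified in the text.

```python
from statistics import mean, pstdev


def circuit_variations(areas_causal, areas_anticausal):
    """Return (sigma_a, sigma_t) from signed STDP curve areas per synapse."""
    a_total = [a_c + a_a for a_c, a_a in zip(areas_causal, areas_anticausal)]
    a_abs = [abs(a_c) + abs(a_a)
             for a_c, a_a in zip(areas_causal, areas_anticausal)]
    mean_abs = mean(a_abs)                # normalization constant
    # Asymmetry within synapses: spread of the normalized total areas
    sigma_a = pstdev([a / mean_abs for a in a_total])
    # Variation across synapses: spread of the normalized, centered absolute areas
    normalized = [a / mean_abs for a in a_abs]
    center = mean(normalized)
    sigma_t = pstdev([a - center for a in normalized])
    return sigma_a, sigma_t


# Two synapses with equal absolute area but opposite asymmetry
print(circuit_variations([1.2, 0.8], [-0.8, -1.2]))
```

Subtracting the mean does not change the standard deviation; the step is kept only to mirror the description above.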

Software analysis
In order to predict the effects of the previously measured variations on the network benchmark, these variations are integrated into computer simulations. The thresholds for the causal and anti-causal spike pair accumulations are drawn from two overlaying Gaussian distributions defined by the ideal thresholds (equation 3) and their variations σ_t and σ_a. Again, the same network benchmark as described above is used, but with a fixed correlation coefficient of c = 0.025 and an 8-bit LUT configured with n = 12 SSPs.

RESULTS
Synaptic weights of the FACETS wafer-scale hardware system have a 4-bit resolution. We show that such a weight resolution is enough to exhibit learning in a neural network benchmark for synchrony detection. To this end, we analyze the effects of weight discretization in three steps as summarized in Table 3.

DYNAMIC RANGE OF STDP ON DISCRETE WEIGHTS
We choose the configuration of STDP on discrete weights according to Sections 2.3 and 2.4 to obtain weight dynamics comparable to that in continuous weight space. Each configuration can be described by a LUT "projecting" each discrete weight to new values, one for potentiation and one for depression (Table 4a). For a given weight resolution r the free configuration parameter n (number of SSPs) has to be adjusted to avoid a further reduction of the usable weight resolution by dead discrete weights. Dead discrete weights are defined as weights that either project to themselves in case of both potentiation and depression or do not receive any projections from other discrete weights. The percentage of dead discrete weights d defines the lower and upper limit of feasible values for n, the dynamic range. The absolute value of the interval within an SSP (∆t_s) is an arbitrary choice that merely defines the granularity and does not affect the results (not shown). Note that spike-timing precision in vivo, which is observed for high dimensional input such as dense noise and natural scenes, rarely goes beyond 5-10 ms (Butts et al., 2007; Desbordes et al., 2008, 2010; Marre et al., 2009; J. Frégnac, personal communication), and the choice of 10 ms as a granular step is thus justified biologically.
Generally, low values of n realize frequent, small weight updates. However, if n is too low, some discrete weights may project to themselves (see rounding in equation 2) and prevent synaptic weights from evolving dynamically (see Table 4b; n = 15 in Figure 3A).
On the other hand, if n exceeds the upper limit of the dynamic range, intermediate discrete weights may not be reached by others. Rare, large weight updates favor projections to discrete weights near the borders of the weight range I and lead to a bimodal equilibrium weight distribution as shown in Table 4c and Figure 3A (n = 500).
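The definition of dead discrete weights translates directly into a check on the two columns of a LUT. The following Python sketch assumes that a LUT is given as two lists of bin indices. The function name is chosen for illustration.

```python
def dead_weight_percentage(lut_causal, lut_anticausal):
    """Percentage d of dead discrete weights for a LUT given as index lists."""
    size = len(lut_causal)
    receives = [False] * size             # reached from another discrete weight?
    for i in range(size):
        for target in (lut_causal[i], lut_anticausal[i]):
            if target != i:
                receives[target] = True
    dead = 0
    for i in range(size):
        self_only = lut_causal[i] == i and lut_anticausal[i] == i
        if self_only or not receives[i]:
            dead += 1
    return 100.0 * dead / size


# n too low: every weight projects to itself
print(dead_weight_percentage([0, 1, 2, 3], [0, 1, 2, 3]))
# n too high: intermediate weights are never reached
print(dead_weight_percentage([3, 3, 3, 3], [0, 0, 0, 0]))
# nearest-neighbor steps: no dead weights
print(dead_weight_percentage([1, 2, 3, 3], [0, 0, 1, 2]))
```

Scanning n for a fixed resolution r and recording where d leaves zero gives the two limits of the dynamic range in this simplified picture.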
The lower limit of the dynamic range decreases with increasing resolution (Figure 3B). Compared to a 4-bit weight resolution, an 8-bit weight resolution is sufficiently high to resolve weight updates down to a single SSP (Figure 3D). This allows frequent weight updates comparable to weight evolutions in continuous weight space. The upper limit of the dynamic range does not change over increasing weight resolutions, but is critical for limited update controller frequencies as investigated in Section 3.3.

Table 3 | Outline of analyses on the effects of weight discretization and further hardware constraints.
Description | Results | Methods
LOOK-UP TABLE ANALYSIS
(A) Basic analyses on the configuration of STDP on discrete weights by means of look-up tables | Section 3.1 | Sections 2.3 and 2.4
(B) Long-term dynamics of these configurations | Section 3.2 | Section 2.5
SPIKING NETWORK BENCHMARKS
(C) Software implementation of hardware-inspired synapses with discrete weights for application in spiking neural environments | - | Section 2.6.1
(D) Analyses of their effects on short-term weight dynamics in single synapses | Section 3.3.1 | Section 2.6.2
(E) Analyses of their effects on short-term weight dynamics in neural networks | Section 3.3.2 | Section 2.6.3
(F) Analyses on how additional hardware constraints affect the network benchmark | Section 3.3.3 | Section 2.6.3

EQUILIBRIUM WEIGHT DISTRIBUTIONS
Studying learning in neural networks may span long periods of time. Therefore we analyze equilibrium weight distributions being the temporal limit of Poissonian distributed pre-and postsynaptic spiking. These distributions are obtained by applying random walks on LUTs with uniformly distributed occurrences of potentiations and depressions (Section 2.5). Figure 4A shows i.a. boundary effects caused by LUTs configured within the upper part of the dynamic range. E.g. for n = 144, the relative frequencies of both boundary values are increased due to large weight steps (red and cyan distributions). Frequent weights, in turn, increase the probability of weights to which they project (according to the LUT). This effect decreases with the number of look-ups, due to the random nature of the stimulus, however, causing intermediate weight values to occur at higher probability. The impact of weight discretization on long-term weight dynamics is quantified by comparing equilibrium weight distributions between low and high weight resolutions. Weight discretization involves distortions caused by rounding effects for small n (equation 2; Figure 3) and boundary effects for high n (Figures 4A,C). High weight resolutions can compensate for rounding effects, but not for boundary effects (Figure 4B).
This analysis of long-term weight dynamics (Figure 4C) refines the choice of n that was roughly estimated from the dynamic range (Figure 3C).
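The random-walk procedure described above can be sketched as follows. This is a minimal illustration, not the analysis code of this study: the function name `equilibrium_distribution`, the toy LUTs (every update moves the weight by exactly one entry, clipped at the bounds), the number of steps and the seed are all assumptions chosen for readability.

```python
import random
from collections import Counter


def equilibrium_distribution(lut_pot, lut_dep, n_steps=100_000, p_pot=0.5, seed=0):
    """Relative frequency of each discrete weight index during a random walk.

    lut_pot[i] / lut_dep[i] give the index reached from index i after a
    potentiation / depression. p_pot = 0.5 corresponds to uniformly
    distributed occurrences of potentiations and depressions.
    """
    assert len(lut_pot) == len(lut_dep)
    rng = random.Random(seed)
    index = len(lut_pot) // 2  # arbitrary starting weight in mid-range
    counts = Counter()
    for _ in range(n_steps):
        lut = lut_pot if rng.random() < p_pot else lut_dep
        index = lut[index]  # one look-up = one weight update
        counts[index] += 1
    return [counts[i] / n_steps for i in range(len(lut_pot))]


# Hypothetical 4-bit LUTs: each update moves one entry and saturates at the bounds.
n_weights = 2 ** 4
lut_pot = [min(i + 1, n_weights - 1) for i in range(n_weights)]
lut_dep = [max(i - 1, 0) for i in range(n_weights)]

dist = equilibrium_distribution(lut_pot, lut_dep)
for i, freq in enumerate(dist):
    print(f"weight index {i:2d}: relative frequency {freq:.3f}")
```

Replacing the toy LUTs by LUTs configured for a given n would allow the resulting histograms to be compared across weight resolutions, in the spirit of Figure 4.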

SPIKING NETWORK BENCHMARKS
We extend the above studies on temporal limits by analyses of short-term dynamics with unequal probabilities for potentiation p_p and depression p_d. A hardware-inspired synapse model is used in computer simulations of spiking neural networks; an example of its typical dynamics is shown in Figure 5. As the pre- and postsynaptic spike trains are correlated in a causal fashion, the causal spike pair accumulation increases faster than the anticausal one (Figure 5A). It crosses the threshold twice, evoking two potentiation steps (at around 7 and 13 s) before the anticausal spike pair accumulation evokes a depression at around 14 s (Figures 5A,B). The first two potentiations project to the subsequent entry of the LUT, whereas the following depression rounds to the next but one discrete weight (omitting one entry in the LUT) due to the asymmetry measure α in the STDP model by Gütig et al. (2003).

Single synapse benchmark
This benchmark compares single weight traces between hardware-inspired and reference synapses (Section 2.6.2). A synapse receives correlated pre- and postsynaptic input (Figure 6A), resulting in weight dynamics as shown in Figure 6B. The standard deviation for discrete weights (hardware-inspired synapse model) is larger than that for continuous weights (reference model). This difference is caused by rare, large weight jumps (induced by high n), which are also responsible for the broadening of equilibrium weight distributions (Figure 4A). Consequently, the standard deviation increases further with decreasing weight resolution (not shown here). The dependence of the deviation between discrete and continuous weight traces on the weight resolution r and the number n of SSPs is qualitatively comparable to that of comparisons between equilibrium weight distributions (Figures 6D,E). This similarity, especially in dependence on n (Figure 6D), emphasizes the crucial impact of LUT configurations on both short- and long-term weight dynamics.
To further illustrate underlying rounding effects when configuring LUTs, the asymmetry value α in Gütig's STDP model can be taken as an example. In an extreme case both potentiation and depression are rounded down (compare weight step size for potentiation and depression in Figure 5B). This would increase the originally slight asymmetry drastically and therefore enlarge the distortion caused by weight discretization.
The weight update frequency v_w is determined by the weight resolution r and the number n of SSPs. High frequencies are beneficial for keeping up in time with weight evolutions in continuous weight space. They can be realized by small numbers of SSPs, which lower the threshold a_th (equation 3). On the other hand, rounding effects in the LUT configuration become more severe if the number of SSPs is too small (Figure 6D). For a weight resolution of r = 4 bits (r = 8 bits), choosing n = 36 (n = 12) for the LUT configuration represents a good balance between a high weight update frequency and proper short- and long-term weight dynamics (Figures 3B, 4B and 6C). Note that n can be chosen smaller for higher weight resolutions, because the distorting impact of rounding effects decreases.
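To make the rounding effects in the LUT configuration concrete, the following sketch builds potentiation and depression LUTs by applying n SSPs to each discrete weight and rounding the result to the nearest discrete weight. It assumes the weight dependence of Gütig et al. (2003) with an exponential time dependence; the function name `build_luts` and all parameter values (`lam`, `alpha`, `mu`, `dt`, `tau`) are illustrative assumptions and not the settings used in this study.

```python
import math


def build_luts(r_bits, n_ssp, lam=0.005, alpha=1.05, mu=0.4, dt=10.0, tau=20.0):
    """Potentiation and depression LUTs for weights normalized to [0, 1].

    All default parameter values are hypothetical and serve illustration only.
    """
    levels = 2 ** r_bits - 1            # index of the largest discrete weight
    x = math.exp(-abs(dt) / tau)        # time dependence of a single spike pair
    lut_pot, lut_dep = [], []
    for i in range(levels + 1):
        w_pot = w_dep = i / levels
        for _ in range(n_ssp):          # apply n spike pairs in continuous weight space
            w_pot = min(1.0, w_pot + lam * (1.0 - w_pot) ** mu * x)
            w_dep = max(0.0, w_dep - lam * alpha * w_dep ** mu * x)
        # Round back to the nearest discrete weight (Python rounds ties to even).
        lut_pot.append(round(w_pot * levels))
        lut_dep.append(round(w_dep * levels))
    return lut_pot, lut_dep


lut_pot, lut_dep = build_luts(r_bits=4, n_ssp=36)
unchanged = sum(1 for i in range(len(lut_pot)) if lut_pot[i] == i and lut_dep[i] == i)
print("potentiation LUT:", lut_pot)
print("depression LUT:  ", lut_dep)
print("entries left unchanged by both LUTs:", unchanged)
```

With these assumed values, a single SSP (n = 1) changes the weight by less than half a discrete step at 4-bit resolution, so every LUT entry maps onto itself and no weight update would ever take effect. Larger n avoids this at the cost of less frequent updates.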

Network benchmark: synchrony detection
Not only the exact weight traces of single synapses (Section 3.3), but also those of synapse populations are crucial for fulfilling tasks, e.g. the detection of synchronous firing within neural networks. The principle of synchrony detection is a crucial feature of various neural networks with plasticity, e.g. reported by Senn et al. (1998), Kuba et al. (2002), Davison et al. (2009), El Boustani et al. (2012). Here, it is introduced by means of an elementary benchmark neural network (Figure 7A; Section 3), using the hardware-inspired or reference synapse model, respectively.

Figure 7B shows a delay distribution of postsynaptic spike occurrences, relative to the trigger onset, i.e. synchronous presynaptic firing (Section 2). For the shown range of ∆t_del, the postsynaptic neuron is more likely to fire if connected with STDP (black trace) instead of static (dark gray trace) synapses. The correlated population causes its afferent synapses to strengthen more compared to those from the uncorrelated population. This can be seen in Figure 7C, where w saturates at different values (t ≈ 700 s). The same effect can be observed for discretized weights in Figure 7D. For ∆t_del > 170 ms the delay distribution for static synapses is larger than that for STDP synapses (not shown here), because such delayed postsynaptic spikes are barely influenced by their presynaptic counterparts. This is due to small time constants of the postsynaptic neuron (see τ_m = C_m/g_L and τ_syn in Tables 7 and 8) compared to ∆t_del.

www.frontiersin.org

FIGURE 7 | ... of having the same median of weights within both groups of synapses (with correlated and uncorrelated input) at t = 2,000 s versus the correlation coefficient c. The hardware-inspired synapse model is represented in red (r = 4 bits and n = 36), green (r = 8 bits and n = 36) and blue (r = 8 bits and n = 12). Black depicts the reference synapse model (r = 64 bits). The background shading represents the significance levels: p < 0.05, p < 0.01, and p < 0.001. (F) Dependence of the p-value on the update controller frequency v_c for c = 0.025. Colors as in (E). (G) Black and red trace as in (E). Additionally, p-values for hardware-inspired synapses with common resets are plotted in yellow (r = 4 bits and n = 36) and magenta (r = 8 bits and n = 12). Compensations with ADCs are depicted in gray (r = 4 bits and n = 15-45 in steps of 2) and cyan (r = 8 bits and n = 1-46 in steps of 3).
Figure 7E shows the p-values of the Mann-Whitney U test applied to both groups of synaptic weights at t = 2,000 s for different configurations of weight resolution r and number n of SSPs. Generally, p-values (probability of having the same median within both groups of weights) decrease with an increasing correlation coefficient. Even when previously selected "healthy" LUT configurations are applied, weight discretization changes the correlation coefficient required for reaching the significance level (gray shaded areas). Incrementing the weight resolution while retaining the number of SSPs n does not change the p-values significantly. Low weight resolutions cause larger spacings between discrete weights that can further facilitate the distinction between both medians (for n = 36 compare r = 4 bits to r = 8 bits in Figure 7E). However, reducing n for high weight resolutions shortens the accumulation period and consequently allows the synapses to capture fluctuations in the spike pair accumulation a on smaller time scales. This improves the p-value, but is inconvenient for low weight resolutions, because these LUT configurations do not yield the desired weight dynamics (Figures 3, 4 and 6).
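As an illustration of how such a p-value can be obtained from two groups of discrete weights, the sketch below implements a two-sided Mann-Whitney U test using the normal approximation. Mid-ranks and a tie-corrected variance are used because discretized weights produce many ties. The function name `mann_whitney_u` and the example weight indices are hypothetical, and the exact test variant used for Figure 7E is not specified here, so this should be read as one possible realization.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns the U statistic of sample x and the approximate p-value.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0          # average of the ranks i+1 .. j
        t = j - i                             # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:                          # all values identical
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical 4-bit weight indices of synapses with correlated / uncorrelated input.
correlated = [9, 10, 10, 11, 12, 12, 13]
uncorrelated = [5, 6, 6, 7, 8, 8, 9]
u, p = mann_whitney_u(correlated, uncorrelated)
print(f"U = {u}, p = {p:.4f}")
```

For small groups an exact test would be preferable to the normal approximation; the approximation is used here only to keep the example short.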

Network benchmark: further constraints
In addition to the discretization of synaptic weights that has been analyzed so far, we also consider additional hardware constraints of the FACETS wafer-scale system (Section 2.6.3). This allows us to compare the effects of other hardware constraints to those of weight discretization. First, we take into account a limited update controller frequency v_c. Figure 7F shows that low frequencies (<1 Hz) distort the weight dynamics drastically and deteriorate the distinction between correlated and uncorrelated inputs. Ideally, a weight update would be performed whenever the spike pair accumulations cross the threshold (Figure 5A). However, these weight updates of frequency v_w are now limited to a time grid with frequency v_c. The larger the latency between a threshold crossing and the arrival of the weight update controller, the more likely it is that the accumulation has already overshot the threshold. Hence, the weight update is underestimated and delayed. Low weight resolutions are less affected, because a high ratio v_c/v_w reduces threshold overruns and hence distortions. A low resolution requires a high number of SSPs, which in turn increases the threshold a_th (equation 3) and thus lowers the weight update frequency v_w.
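The effect of restricting weight updates to a time grid can be illustrated with a deliberately simplified sketch. It assumes a spike pair accumulation that grows at a constant rate (in the network benchmark it is driven by stochastic spike pairs) and a controller that performs at most one weight update per visit and then resets the accumulation. The function name `count_updates` and all numbers are hypothetical.

```python
def count_updates(rate, a_th, v_c, duration):
    """Number of weight updates performed by a controller visiting at frequency v_c.

    rate     -- growth of the spike pair accumulation per second (assumed constant)
    a_th     -- threshold of the spike pair accumulation
    v_c      -- update controller frequency in Hz
    duration -- simulated time in seconds
    """
    period = 1.0 / v_c
    n_visits = int(duration * v_c)
    accumulation = 0.0
    updates = 0
    for _ in range(n_visits):
        accumulation += rate * period   # charge accumulated since the last visit
        if accumulation >= a_th:
            updates += 1                # one update, however far the threshold was exceeded
            accumulation = 0.0          # the overshoot is lost with the reset
    return updates


# The ideal weight update frequency is rate / a_th = 1 Hz in this toy setting.
fast = count_updates(rate=1.0, a_th=1.0, v_c=8.0, duration=100.0)
slow = count_updates(rate=1.0, a_th=1.0, v_c=0.25, duration=100.0)
print("updates with v_c = 8 Hz:   ", fast)   # 100, every threshold crossing is served
print("updates with v_c = 0.25 Hz:", slow)   # 25, three out of four crossings are lost
```

In this toy setting the slow controller underestimates the weight change by a factor of four, which corresponds to the threshold overruns described above.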
Second, hardware-inspired synapses with the limitation to common reset lines cease to discriminate between correlated and uncorrelated input (Figure 7G, yellow and magenta traces). A crossing of the threshold by one spike pair accumulation resets the other (Figure 5) and suppresses its further weight updates, leading to underestimation of synapses with less correlated input.
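The suppression caused by a common reset line can be reproduced in a toy model. The sketch assumes deterministic, constant growth of the causal and anticausal spike pair accumulations, with the causal one growing faster; the function name `count_updates_with_resets` and the rates are illustrative assumptions only.

```python
def count_updates_with_resets(rate_causal, rate_anticausal, a_th, n_ticks, common_reset):
    """Count potentiation and depression steps for independent or common resets."""
    acc_causal = acc_anticausal = 0.0
    potentiations = depressions = 0
    for _ in range(n_ticks):
        acc_causal += rate_causal
        acc_anticausal += rate_anticausal
        crossed_causal = acc_causal >= a_th
        crossed_anticausal = acc_anticausal >= a_th
        potentiations += int(crossed_causal)
        depressions += int(crossed_anticausal)
        if common_reset:
            # A single reset line: any threshold crossing clears both accumulations.
            if crossed_causal or crossed_anticausal:
                acc_causal = acc_anticausal = 0.0
        else:
            # Independent reset lines: only the accumulation that crossed is cleared.
            if crossed_causal:
                acc_causal = 0.0
            if crossed_anticausal:
                acc_anticausal = 0.0
    return potentiations, depressions


independent = count_updates_with_resets(0.5, 0.25, 1.0, 100, common_reset=False)
common = count_updates_with_resets(0.5, 0.25, 1.0, 100, common_reset=True)
print("independent resets (potentiations, depressions):", independent)  # (50, 25)
print("common reset       (potentiations, depressions):", common)       # (50, 0)
```

With a common reset the slower anticausal accumulation is cleared before it can reach the threshold, so depression never occurs in this toy setting.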
To compensate for common resets we suggest ADCs that allow the comparison of spike pair accumulations to multiple thresholds. Nevertheless, ADCs compensate for common resets only for high weight resolutions (Figure 7G). Again, for low weight resolutions and hence high numbers of SSPs, fluctuations cannot be taken into account (Figure 7G, gray values). This is the case for a 4-bit weight resolution, whereas an 8-bit weight resolution is high enough to resolve small fluctuations down to single SSPs (Figure 7G, cyan values). Each threshold has its own LUT configured with a number of SSPs that matches the dynamic range (Figure 3). The upper limit of n is chosen according to the results of Section 3.2. The update controller frequency is chosen to be low enough (v_c = 0.2 Hz) to enable all thresholds to be hit.

HARDWARE VARIATIONS
So far, we have neglected production imperfections in real hardware systems. However, fixed pattern noise induced by these imperfections is a crucial limitation on the transistor level and may distort the functionality of the analog synapse circuit, making higher weight resolutions unnecessary. The smaller and denser the transistors, the larger the discrepancies from their theoretical properties (Pelgrom et al., 1989). Using the protocol illustrated in Figure 8A we recorded STDP curves on the FACETS chip-based hardware system (Figures 8B,C; Section 2.7.1). Variations within (σ_a) and between (σ_t) individual synapses are shown as distributions in Figures 8D,E, both suggesting variations of around 20%. Both variations are incorporated into computer simulations of the network benchmark (Figure 7A; Section 2.7.2) to analyze their effects on synchrony detection. The p-value (as in Figures 7E-G) rises with increasing asymmetry within synapses, but is hardly affected by variations between synapses (Figure 8F).

CONFIGURATION OF STDP ON DISCRETE WEIGHTS
In this study, we demonstrate generic strategies to configure STDP on discrete weights as, e.g. implemented in neuromorphic hardware systems. The resulting weight dynamics depend critically on the frequency of weight updates, which has to be adjusted to the available weight resolution. Choosing a frequency within the dynamic range (Figure 3) is a prerequisite for exploiting the discrete weight space and ensuring proper weight dynamics. Analyses of long-term dynamics using Poisson-driven equilibrium weight distributions help to refine this choice (Figure 4). The obtained configuration space is similar to that of short-term dynamics, i.e. the evolution of single synaptic weights (Figure 6). This similarity confirms the crucial impact of the LUT configuration on weight dynamics, which is caused by rounding effects. Based on these results, we have chosen two example LUT configurations (r = 4 bits; n = 36 and r = 8 bits; n = 12) for further analysis, both realizable on the FACETS wafer-scale hardware system. High weight resolutions allow for higher frequencies of weight updates approximating the ideal model, occasionally requiring several spike pairs to evoke a weight update. Correspondingly, in the associative pairing literature, a minimal number of associations is required to detect functional changes (expressed by the spiking or postsynaptic potential response), and this number varies from study to study from a few to several tens (Cassenaer and Laurent, 2007, 2012).
Discretization not only affects the accuracy of weights, but also broadens their equilibrium weight distributions (Figure 4), which are actually shown to be narrow in large-scale neural networks. Furthermore, this broadening can distort the functionality of neural networks, e.g. it deteriorates the distinction between the two groups of weights (of synapses originating from the correlated or uncorrelated population) within the network benchmark (compare Figures 7C,D). On the other hand, weight discretization can also be advantageous for synchrony detection, if, e.g. groups of weights separate due to large step sizes between neighboring discrete weights (compare red and green in Figure 7E).
In summary, these analyses of STDP on discrete weights are necessary for obtaining appropriate configurations for a variety of STDP models and weight resolutions.

4-BIT WEIGHT RESOLUTION
Simulations of the network benchmark show that a 4-bit weight resolution is sufficient to detect synchronous presynaptic firing significantly (Figure 7). Groups of synapses receiving correlated input strengthen and in turn increase the probability of synchronous presynaptic activity to elicit postsynaptic spikes as compared to static synapses (Figure 7B). Thus, the weight distribution within the network reflects synchrony within sub-populations of presynaptic neurons. Increasing the weight resolution causes both weight distributions, for the correlated and uncorrelated input, to narrow and separate from each other. Consequently, an 8-bit resolution is sufficient to reproduce the p-values of continuous weights with floating point precision (corresponds to discrete weights with r = 64 bits, Figure 7E). This resolution requires the combination of two hardware synapses and is under development. On the other hand, increasing the weight resolution, but retaining the frequency of weight updates (number of SSPs), results in weight distributions of comparable width and consequently does not improve the p-values significantly (Figure 7E).
Other neuromorphic hardware systems implement bistable synapses corresponding to a 1-bit weight resolution (Badoni et al., 2006; Indiveri et al., 2010). Bistable synapse models are shown to be sufficient for memory formation (Amit and Fusi, 1994; Fusi et al., 2005; Brader et al., 2007; Clopath et al., 2008). However, these models do not only employ spike-timings (Levy and Steward, 1983; Markram, 2006; Mu and Poo, 2006; Cassenaer and Laurent, 2007; Bill et al., 2010), but also read the postsynaptic membrane potential (Sjöström et al., 2001; Trachtenberg et al., 2002), which requires additional hardware resources. So far, there is no consensus on a general synapse model, and neuromorphic hardware systems are mostly limited to subclasses of these models.
Studies on weight discretization are not limited to the FACETS hardware systems, but are applicable to other backends for neural network simulations. For example, our results can be applied to the fully digital neuromorphic hardware system described by Jin et al. (2010b), who also report STDP with a reduced weight resolution. Furthermore, weight discretization may be a further approach to reduce memory consumption of "classical" neural simulators.

FIGURE 9 | The configuration space of STDP on discrete weights spanned by the weight resolution r and the number n of SSPs that is inversely proportional to the weight update frequency v_w. The darkest gray area depicts the configurations with dead discrete weights (Figure 3). The lower limits of configurations for proper equilibrium weight distributions (Figure 4) and single synapse dynamics (Figure 6) are shown with brighter shades. The dashed rectangle marks configurations realizable by the FACETS wafer-scale hardware system (assuming an acceleration factor of 10^3, all synapses enabled for STDP and SSPs applied with 10 Hz). The working points for a 4-bit (n = 36) and 8-bit (n = 12) weight resolution are highlighted as a triangle and circle, respectively.

FURTHER HARDWARE CONSTRAINTS
In addition to a limited weight resolution, we have studied further constraints of the current FACETS wafer-scale hardware system with the network benchmark. A limited update controller frequency implying a minimum time interval between subsequent weight updates does not affect the p-values down to a critical frequency v_c ≈ 1 Hz (Figure 7F). The update controller frequency decreases linearly with the number of hardware synapses enabled for STDP. Assuming a hardware acceleration factor of 10^3 all synapses can be enabled for STDP staying below this critical frequency. However, the number of STDP synapses should be decreased if a higher update controller frequency is required, e.g. for a configuration with an 8-bit weight resolution and a small number of SSPs.
Common resets of spike pair accumulations reduce synapse chip resources by requiring one instead of two reset lines, but suppress synaptic depression and bias the weight evolution toward potentiation. This is due to the feed-forward network architecture, in which causal relationships between pre- and postsynaptic spikes are more likely than anti-causal ones. Long periods of accumulation (large numbers of SSPs) lower the probability of synaptic depression. Hence, all weights tend to saturate at the maximum weight value, impeding a distinction between both populations of synapses within the network benchmark (Figure 7G). The probability of synaptic depression can be increased by high weight update frequencies (small numbers of SSPs) shortening the accumulation periods (equation 3) and subsequently approaching the behavior of independent resets. However, high weight update frequencies require high weight resolutions and thus high update controller frequencies, which decreases the number of available synapses enabled for STDP.
As a compensation for common resets, we suggest that the single spike pair accumulation threshold is expanded to multiple thresholds implemented as ADCs. In comparison to synapses with common resets, ADCs improve p-values significantly only for an 8-bit weight resolution (Figure 7G, compare cyan to magenta values). However, the combination of two 4-bit hardware synapses makes it possible to mimic independent resets and hence yields p-values comparable to 8-bit synapses using ADCs (Figure 7G, compare red to cyan values). Mimicking independent resets is under development for the FACETS wafer-scale hardware system. Each of the two combined synapses will be configured to accumulate only either causal or anti-causal spike pairs, while both synapses are updated in a common process. This requires only minor hardware design changes within the weight update controller and should be preferred to more expensive changes for realizing ADCs. The implementation of real second reset lines is not possible without major hardware design changes, but is considered for future chip revisions.

Parameter Description Value
V_clrc Amount of charge that will be accumulated on the capacitor C_1 (Schemmel et al., 2006)

x(∆t) = exp(−|∆t|/τ_STDP)
Benchmark simulations incorporating the measured variations within and between synapse circuits due to production imperfections result in p-values worse (higher) than for a 4-bit weight resolution (compare asterisk in Figure 8F to red value for c = 0.025 in Figure 7E). Consequently, a 4-bit weight resolution is sufficient for the current implementation of the measurement and accumulation circuits. We suppose that the separately analyzed effects of production imperfections and weight discretization add up and that each limits the best possible p-value achievable with the other. An analysis of combinations of hardware restrictions would make it possible to quantify how their effects add up and is a subject for further studies. However, hardware variations can also be considered as a limitation on the transistor level making higher weight resolutions unnecessary.
Figure 9 summarizes the results on how to configure STDP on discrete weights. For a given weight resolution r the number n of SSPs has to be chosen as low as possible to allow for high weight update frequencies v_w. However, n must be high enough to ensure STDP dynamics comparable to continuous weights (lightest gray shaded area) and to stay within the configuration space realizable by the FACETS wafer-scale hardware system. The hardware system limits the update controller frequency v_c and hence distorts STDP especially for low n.

OUTLOOK
Currently, STDP in neuromorphic hardware systems is enabled for only ten to a few tens of thousands of synapses in real-time (Arthur and Boahen, 2006; Zou et al., 2006; Daouzli et al., 2008; Ramakrishnan et al., 2011). Large-scale systems do not implement long-term plasticity (Merolla and Boahen, 2006; Vogels et al., 2011) or operate in real-time only (Jin et al., 2010a). Enabling a large-scale (over 4·10^7 synapses) and highly accelerated neuromorphic hardware system (the FACETS wafer-scale hardware system) with configurable STDP requires trade-offs between number and size of synapses, which raises constraints in their implementation (Schemmel et al., 2006). Table 5 summarizes these trade-offs and gives an impression of the hardware costs and effects on STDP. In this study, we introduced novel analysis tools that allow hardware constraints to be investigated, and the hardware design to be verified and improved, without the need for expensive and time-consuming prototyping. Ideally, this validation process should be shifted to an earlier stage of hardware design combining the expertise from Computational Neuroscience and Neuromorphic Engineering, as, e.g. published by Linares- . This kind of work is crucial for enabling researchers to use and understand studies carried out on neuromorphic hardware systems, and thereby to turn such hardware into a tool that can substitute for von Neumann computers in Computational Neuroscience. Brüderle et al. (2011) report the development of a virtual hardware, a simulation tool replicating the functionality and configuration space of the entire FACETS wafer-scale hardware system. This tool will allow further analyses of hardware constraints, e.g. in the communication infrastructure and configuration space.
The presented results verify the current implementation of the FACETS wafer-scale hardware system in terms of balance between weight resolution, update controller frequency and circuit variations. Further improvement of the existing hardware implementation would require improvements of all aspects. The only substantial bottleneck has been identified to be common resets, already leading to design improvements of the wafer-scale system.
Although all presented studies refer to the intermediate Gütig STDP model, any other STDP model relying on equation (1) and an exponentially decaying time-dependence can be investigated with the existing software tools in a generic way, e.g. those models listed in Table 1. In contrast to the fixed exponential time-dependence implemented as analog circuits in the FACETS wafer-scale hardware system, the weight-dependence is freely programmable and stored in a LUT. The categories refer to the model description in Table 7.
Ideally, the resolution should be high in the weight range of highest plausibility, i.e. the effective resolution should be high. Bounded STDP models (e.g. the intermediate Gütig STDP model applied in this study) are well suited for a 4-bit weight resolution and allow a linear mapping of continuous to discrete weights. A 4-bit weight resolution causes large weight updates and hence broadens the weight distribution so that it spans the whole weight range. This results in a high effective resolution. On the other hand, unbounded STDP models (e.g. the power law and van Rossum STDP models) have long tails toward high weights. Cutting the tail by only mapping low weights to discrete weights would increase the frequency of the highest discrete weight. A possible solution is a non-linear mapping of continuous to discrete weights: large differences between high discrete weights and small differences between low discrete weights. However, a variable distance between discrete weights would require more hardware effort.
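The difference between a linear and a non-linear mapping of continuous to discrete weights can be sketched as follows. The quadratic spacing of the discrete weights (`exponent=2.0`) is purely an assumption for illustration; no specific non-linear mapping is proposed here, and the helper names `make_levels` and `quantize` are hypothetical.

```python
import bisect


def make_levels(r_bits, w_max, exponent=1.0):
    """Discrete weight values; exponent > 1 places them closer together at low weights."""
    n = 2 ** r_bits - 1
    return [w_max * (k / n) ** exponent for k in range(n + 1)]


def quantize(w, levels):
    """Index of the discrete weight nearest to w (levels sorted in ascending order)."""
    pos = bisect.bisect_left(levels, w)
    if pos == 0:
        return 0
    if pos == len(levels):
        return len(levels) - 1              # weights above the range saturate
    return pos if levels[pos] - w < w - levels[pos - 1] else pos - 1


linear = make_levels(r_bits=4, w_max=1.0)                    # equidistant discrete weights
nonlinear = make_levels(r_bits=4, w_max=1.0, exponent=2.0)   # assumed quadratic spacing

for w in (0.02, 0.2, 0.9):
    print(f"w = {w}: linear index {quantize(w, linear)}, "
          f"non-linear index {quantize(w, nonlinear)}")
```

With the quadratic spacing, the distance between the two lowest discrete weights is about 0.004, whereas the two highest are about 0.13 apart, so low weights are resolved more finely at the expense of high weights.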
An all-to-all spike pairing scheme applied to the reference synapses within the network benchmark results in p-values worse (higher) than for synapses implementing a reduced symmetric nearest-neighbor spike pairing scheme (not shown, but comparable to 4-bit discrete weights in Figure 7E, see red values). Different spike pairing schemes could be analyzed in detail in further studies.
As a next step, our hardware synapse model can replace the regular STDP synapses in simulations of established neural networks, to test their robustness and applicability for physical emulation in the FACETS wafer-scale hardware system. The synapse model is available in the following NEST release and can easily be applied to NEST or PyNN network descriptions. If neural networks, or modifications of them, qualitatively reproduce the simulation, they can be applied to the hardware system, with which similar results can be expected. Thus, the presented simulation tools allow network architectures to be modified beforehand to ensure compatibility with the hardware system.
With respect to more complex long-term plasticity models, the hardware system is currently being extended by a programmable microprocessor that is in control of all weight modifications. This processor makes it possible to combine synapse rows in order to compensate for common resets. With possible access to further neuron or network properties the processor would allow for more complex plasticity rules as, e.g. those of Clopath et al. (2008) and Vogelstein et al. (2007). Even modifications of multiple neurons are feasible, a phenomenon observed in experiments with neuromodulators (Eckhorn et al., 1990; Itti and Koch, 2001; Reynolds and Wickens, 2002; Shmuel et al., 2005). Nevertheless, more experimental data and consensus about neuromodulator models and their applications are required to further customize the processor. New hardware revisions are rather expensive and consequently should only cover established models that are prepared for hardware implementation by dedicated studies.
The presented evaluation of the FACETS wafer-scale hardware system is meant to encourage neuroscientists to benefit from neuromorphic hardware without leaving their environment in terms of neuron, synapse and network models. We further endorse that, toward an efficient exploitation of hardware resources, the design of synapse models will be influenced by hardware implementations rather than only by their mathematical treatability (e.g. Badoni et al., 2006).