On Spike-Timing-Dependent-Plasticity, Memristive Devices, and Building a Self-Learning Visual Cortex

In this paper we present a very exciting overlap between emergent nanotechnology and neuroscience, which has been discovered by neuromorphic engineers. Specifically, we are linking one type of memristor nanotechnology devices to the biological synaptic update rule known as spike-time-dependent-plasticity (STDP) found in real biological synapses. Understanding this link allows neuromorphic engineers to develop circuit architectures that use this type of memristors to artificially emulate parts of the visual cortex. We focus on the type of memristors referred to as voltage or flux driven memristors and focus our discussions on a behavioral macro-model for such devices. The implementations result in fully asynchronous architectures with neurons sending their action potentials not only forward but also backward. One critical aspect is to use neurons that generate spikes of specific shapes. We will see how by changing the shapes of the neuron action potential spikes we can tune and manipulate the STDP learning rules for both excitatory and inhibitory synapses. We will see how neurons and memristors can be interconnected to achieve large scale spiking learning systems, that follow a type of multiplicative STDP learning rule. We will briefly extend the architectures to use three-terminal transistors with similar memristive behavior. We will illustrate how a V1 visual cortex layer can assembled and how it is capable of learning to extract orientations from visual data coming from a real artificial CMOS spiking retina observing real life scenes. Finally, we will discuss limitations of currently available memristors. The results presented are based on behavioral simulations and do not take into account non-idealities of devices and interconnects. The aim of this paper is to present, in a tutorial manner, an initial framework for the possible development of fully asynchronous STDP learning neuromorphic architectures exploiting two or three-terminal memristive type devices. All files used for the simulations are made available through the journal web site1.

In this paper we present a very exciting overlap between emergent nanotechnology and neuroscience, which has been discovered by neuromorphic engineers. Specifically, we are linking one type of memristor nanotechnology devices to the biological synaptic update rule known as spike-time-dependent-plasticity (STDP) found in real biological synapses. Understanding this link allows neuromorphic engineers to develop circuit architectures that use this type of memristors to artificially emulate parts of the visual cortex. We focus on the type of memristors referred to as voltage or flux driven memristors and focus our discussions on a behavioral macro-model for such devices. The implementations result in fully asynchronous architectures with neurons sending their action potentials not only forward but also backward. One critical aspect is to use neurons that generate spikes of specific shapes. We will see how by changing the shapes of the neuron action potential spikes we can tune and manipulate the STDP learning rules for both excitatory and inhibitory synapses. We will see how neurons and memristors can be interconnected to achieve large scale spiking learning systems, that follow a type of multiplicative STDP learning rule. We will briefly extend the architectures to use three-terminal transistors with similar memristive behavior. We will illustrate how a V1 visual cortex layer can assembled and how it is capable of learning to extract orientations from visual data coming from a real artificial CMOS spiking retina observing real life scenes. Finally, we will discuss limitations of currently available memristors. The results presented are based on behavioral simulations and do not take into account non-idealities of devices and interconnects. The aim of this paper is to present, in a tutorial manner, an initial framework for the possible development of fully asynchronous STDP learning neuromorphic architectures exploiting two or three-terminal memristive type devices. All files used for the simulations are made available through the journal web site 1 .
Keywords: STDP, memristor, synapses, spikes, learning, nanotechnology, visual cortex, neural network of individual neurons, circuits, and overall architectures creates desirable computations, affects how information is represented, influences robustness to damage, incorporates learning and development, and facilitates evolutionary change.

IntroductIon
Neuromorphic engineering 2 is a new interdisciplinary discipline that takes inspiration from biology, physics, mathematics, computer science, and engineering to design artificial neural systems, such as vision systems, head-eye systems, auditory processors, and autonomous robots, the physical architecture, and design principles of which are based on those of biological nervous systems. The term neuromorphic was coined by Carver Mead, in the late 1980s (Mead, 1989) to describe very-large-scale integration (VLSI) systems containing electronic analog circuits that mimic neurobiological architectures present in the nervous system. In recent times the term neuromorphic has been used to describe both analog, digital, or mixed-mode analog/digital VLSI systems that implement models of neural systems (for perception, motor control, or sensory processing) and also software algorithms. A key aspect of neuromorphic design is understanding how the morphology

StdP
Spike-time-dependent-plasticity is a family of learning mechanisms originally postulated in the context of artificial machine learning algorithms (or computational neuroscience), exploiting spike-based computations (as in brains) with great emphasis on the relative timings of spikes. Gerstner started to report the firstspike-timing-dependent learning algorithms (Gerstner et al., 1993(Gerstner et al., , 1996 in 1993. STDP has been shown to be better than Hebbian correlation-based plasticity at explaining cortical phenomena (Young, 2007;Finelli et al., 2008), and has been proven successful in learning hidden spiking patterns (Masquelier et al., 2008) or performing competitive spike pattern learning (Masquelier et al., 2009a). Astonishingly, experimental evidence of STDP has been reported by neuroscience groups during the past decade (Markram et al., 1997;Poo, 1998, 2001;Zhang et al., 1998;Feldman, 2000;Mu and Poo, 2006;Cassenaer and Laurent, 2007;Jacob et al., 2007), so today we can state that the physiological existence of STDP has been reasonably well established 3 .
However, the full implications of the molecular and electrochemical principles behind STDP are still under debate (Rubin et al., 2005). Before describing STDP mathematically, let us first explain how neurons interchange information and what the synaptic connections are. Figure 1 illustrates two neurons connected by a synapse. The pre-synaptic neuron is sending a pre-synaptic spike V mem-pre (t) through one of its axons to the synaptic junction. Neural spikes are membrane voltages from the outside of the cellular membrane V pre + with respect to the inside V pre − . Thus . The "large" membrane voltages during a spike (in the order of a hundred mV) cause a variety of selective molecular membrane channels to open and close allowing many ionic and molecular substances to flow, or preventing them from flowing through the membrane. At the same time, synaptic vesicles inside the pre-synaptic cell containing "packages" of neurotransmitters fuse with the membrane in such a way that these "packages" are released into the synaptic cleft (the inter cellular space between both neurons at the synaptic junction). Neurotransmitters are collected in part by the post-synaptic membrane, contributing to a change in its membrane conductivity. The cumulative effect of pre-synaptic spikes (coming from this or other pre-synaptic neurons) will eventually trigger the generation of a new spike at the post-synaptic neuron. Each synapse is characterized by a "synaptic strength" (or weight) w which determines the efficacy of a pre-synaptic spike in contributing to this cumulative action at the post-synaptic neuron. This weight w could well be interpreted as the size and/or number of neurotransmitter packages released during a pre-synaptic spike. However, for our analyses, we will interpret w more generally as some kind of structural parameter of the synapse (like the amount of one or more metabolic substances) that directly controls the efficacy of this synapse per spike. The synaptic weight w is considered to be non-volatile and analog in nature, but it changes in time as a function of the spiking activity of pre-and post-synaptic neurons. This phenomenon was originally relatively unrelated disciplines (nanotechnology, biology, and computer science), which have nevertheless recently been drawn together by researchers in neuromorphic engineering (Snider, 2007(Snider, , 2008Linares-Barranco and Serrano-Gotarredona, 2009). STDP was originally postulated as a family of computer learning algorithms (Gerstner et al., 1993), and is being used by the machine intelligence and computational neuroscience community (Delorme et al., 2001;Guyonneau et al., 2004;Thorpe, 2007, 2010;Young, 2007;Finelli et al., 2008;Masquelier et al., 2008Masquelier et al., , 2009aWeidenbacher and Neumann, 2008). At the same time its biological and physiological foundations have been reasonably well established during the past decade (Markram et al., 1997;Poo, 1998, 2001;Zhang et al., 1998;Feldman, 2000;Mu and Poo, 2006;Cassenaer and Laurent, 2007;Jacob et al., 2007). If memristance and STDP can be related, then (a) recent discoveries in nanophysics and nanoelectronic principles may shed new light on the intricate molecular and physiological mechanisms behind STDP in neuroscience, and (b) new neuromorphic-like computers built out of nanotechnology memristive devices could incorporate biological STDP mechanisms, yielding a new generation of self-adaptive ultra-high-dense intelligent machines. Here we explain how by combining memristance models with the electrical wave signals of neural impulses (spikes) converging from pre-and post-synaptic neurons into a synaptic junction, STDP behavior emerges naturally (Linares- ). This helps us to understand how neural and memristance parameters modulate STDP, and may offer new insights to neurophysiologists searching for the ultimate physiological mechanisms responsible for STDP in biological synapses. At the same time, it also provides a direct means of incorporating STDP learning mechanisms into a new generation of nanotechnology computers employing memristors. Here we focus on this second aspect.
In this paper we first quickly review STDP and memristor concepts. Then we explain how the memristance mechanism, and one particular formulation of it, can explain the experimental characterization of the STDP phenomena in biological synapses. We will see how the shape of action potentials is a crucial component which influences and defines the mathematical learning of STDP, and how by changing action potential shapes the STDP learning rule can be modulated and changed. The paper then concentrates on proposing circuit techniques and architectures for achieving STDP learning neural systems using memristors as synapses. We will see that ideal voltage/flux driven memristors implement a particular type of multiplicative STDP (mSTDP) with quadratic law. We will show how excitatory and inhibitory learning synapses can be implemented and efficiently simulated using a memristor macro-model with neurons described behaviorally in electric circuit simulators. We will also briefly extend the discussion from two-terminal passive memristor devices to three-terminal active field effect transistor (FET) devices with memristive-like adaptation mechanisms. We will then show how to implement a prototype V1 layer mimicking the early processing part of the visual cortex. We will use real data collected from a fabricated Spiking retina chip observing life scenes and use it to train this artificial simulated V1 layer. As a result, this artificial V1 layer learns to become orientation sensitive, like its biological counterpart. 3 For a historical overview on how STDP research evolved independently among computational and experimentalist groups, please refer to the last paragraph in Sjöström and Gerstner (2010).

StdP verSuS AntI-StdP
The STDP learning function ξ(∆T) as defined in Figures 3A,B is useful for synapses with positive weights. In these cases, weight w is strengthened if it is increased (∆w > 0) when ∆T > 0, and vice versa. However, if the weight is negative (w < 0), as in some inhibitory synapse implementations, the STDP learning function in Figure 3B is not appropriate because an increase in weight (∆w > 0) would weaken the strength of the synapse, and vice versa. For negative weight synapses an STDP learning function with a shape similar to that shown in Figure 3C (Lubenov and Siapas, 2008) is required. In this case, the synapse is strengthened by decreasing its weight (∆w < 0), which should happen for ∆T > 0. Let us call this an Anti-STDP synaptic update or learning function. Other more exotic shapes for ξ(∆T) are also possible, as we will discuss later in Sections 4.1 and 5.1. observed and reported in 1949 by Hebb (1949), who introduced his Hebbian learning postulate: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased." Traditionally, this has been described by computational neuroscientists and machine learning computer engineers as producing an increment in synaptic weight ∆w proportional to the product of the mean firing rates of pre-and post-synaptic neurons. STDP is a refinement of this 1949 rule which takes into account the precise relative timing of individual pre-and postsynaptic spikes, and not their average rates over time. In STDP the change in synaptic weight ∆w is expressed as a function of the time difference between the post-synaptic spike at t pos and the pre-synaptic spike at t pre (see Figure 2). Specifically, as is shown in Figure 3, ∆w = ξ(∆T), with ∆T = t pos − t pre . The shape of the STDP function ξ can be interpolated from experimental data from Bi and Poo (2001) as shown in Figure 3A. For positive ∆T (that is to say, the pre-synaptic spike has a highly relevant role in producing the post-synaptic spike) there will be a potentiation of synaptic weight ∆w > 0, which will be stronger as |∆T| reduces. For negative ∆T (that is to say, the pre-synaptic spike is highly irrelevant for the generation of the post-synaptic spike), there will be a depression of synaptic weight ∆w < 0, which will be stronger as |∆T| reduces. Bi and Poo concluded that they had observed an asymmetric critical window for ∆T of about ±40-80 ms for synaptic modification to take place. Mathematically, this ξ(∆T) STDP learning function is described by computational neuroscientists as Voltage v MR is the difference between the post-synaptic membrane voltage V mem − pos and the pre-synaptic membrane voltage V mem − pre .
(1) voltage difference between two-terminals "v," (2) current flowing through into a device terminal "i," (3) charge flowing through a device terminal or integral of current q = ∫i(t)dt, and (4) flux or integral of voltage φ = ∫v(t)dt. A two-terminal device is said to be canonical (Chua et al., 1987) if either two of the four basic electrical quantities are related by a static 4 relationship, as shown in Figure 4. A resistor has a static relationship between terminal voltage v and device current i, as shown in Figure 4A. A capacitor shows a static relationship between charge q and voltage v, as shown in Figure 4B. An inductor has a static relationship between its current i and flux v, as shown in Figure 4C. These three devices have been very well known since the origins of Electronics and Electricity. However, there are other possibilities for combining the four basic electrical quantities: (q, i), (v, φ), and (q, φ). Ignoring the combinations of a quantity with its time derivative leaves us with one single additional possibility: (q, φ). This reasoning led Chua to postulate the existence of a fourth basic two-terminal element, which he called the Memristor. The memristor would show a static relationship between charge q and flux φ, as shown in Figure 4D.
If the q versus φ relationship is linear, the memristor degenerates into a linear resistor. Memristors behave as resistances in which the resistance changes through some of the basic electrical quantities, and is somehow memorized. The simple concept of memristance as defined in Figure 4D can be extended to refer to any device exhibiting resistive behavior whose resistance can change through

AddItIve verSuS MultIPlIcAtIve StdP
Most of the present day literature on STDP presents a learning function ξ which depends on ∆T but not on the actual weight value w. This type of weight-independent STDP learning rule is usually known as "additive STDP." Additive STDP requires the weight values to be bounded to an interval because weights will stabilize at one of their boundary values (van Rossum et al., 2000;Rubin et al., 2001).
On the other hand, in "multiplicative STDP" or mSTDP (van Rossum et al., 2000;Rubin et al., 2001;Gütig et al., 2003) the learning function is also a function of the actual weight value ξ m (w, ∆T). Furthermore, there usually appears a weight dependent factor which multiplies the original additive STDP learning function ξ a , and which may generally be different for the positive (∆T > 0) and In mSTDP weights can stabilize to intermediate values inside the boundary definitions. Thus, it is often not even necessary to enforce boundary conditions for the weight values (van Rossum et al., 2000). As we will see later in Section 4.2, for the particular type of memristive devices we are analyzing (voltage/flux driven) the resulting STDP learning rule will be of a multiplicative type with function F having a quadratic dependence with w, while the values of w are kept bounded.

4
By "static" we mean it is not altered by changes of the above electrical quantities, or by their history, integrals, derivatives, etc. These "static" curves can, however, be time varying if the change is caused by an external agent. For example, a motor driven potentiometer would have a "static" i/v curve that is time varying.
where I o and v o are parameters which may or may not depend on w. This shape of f is shown in Figure 5B. Many other mathematical formulations can be used (Chua and Kang, 1976). In order to relate memristance to biological STDP, as will be done in Section 4, we need a voltage/flux controlled memristor with thresholding behavior, exponential behavior beyond threshold, and bidirectional behavior (i.e., to be able to increment and decrement w).
Since a memristor has polarity, as indicated in Figure 5A, we need to establish a criterion to assign one of the terminals as the positive terminal. The criterion adopted to assign polarity is that if a sufficiently large positive voltage v MR is applied to the memristor (i.e., larger than its positive threshold), it will increase its conductance. Otherwise, if a sufficiently large negative voltage v MR is applied (i.e., increasing beyond its negative threshold), it will decrease its conductance.

MeMrIStor MAcro-Model for two-terMInAl devIceS
A macro-model of a device is a behavioral model made of circuit elements (ideal or not) that describes its behavior. Some circuit simulators allow a device to be defined mathematically using for example AHDL or Verilog-A circuit description languages. However, if the device can be described with a macro-model circuit, this will have some advantages 5 .
(1) First, it uses already built-in components providing faster simulations; (2) second, as it is made of circuit elements it gives (analog) circuit designers a richer intuitive insight into how it works and performs, and how to improve it for specific goals; (3) it is very intuitive when adding parasitic components (resistors and capacitors) to aid in the convergence of the simulator's internal algorithms; (4) and if care is taken to keep the operating voltages and currents of internal nodes to the levels the simulator expects from conventional circuits, simulations converge easier and faster. Reported memristors (as in Strukov et al., 2008;Jo et al., 2010) adhere very well to the "moving wall model" (see Figure 6A), where the wall position w separates two different resistive regions in some of the four basic electrical quantities, but at the same time exhibiting memory for that resistance. In that case, more elaborate mathematical descriptions are required (Chua and Kang, 1976).
Memristance has recently been demonstrated (with extraordinary impact among the research community) in nanoscale two-terminal devices, such as certain titanium-dioxide (Strukov et al., 2008;Borghetti et al., 2009) and amorphous Silicon (Jo et al., 2009) crosspoint switches. However, memristive devices were reported earlier by other groups (Argall, 1968;Swaroop et al., 1998). Memristance arises naturally in nanoscale devices because small voltages can yield enormous electric fields that produce the motion of charged atomic or molecular species, changing structural properties of a device (such as its doping profile) while it operates. Memristors are asymmetric two-terminal passive devices. Consequently, their circuit symbol must indicate somehow their polarity. Figure 5A shows two possible symbols. Here we will consider one particular subset of memristors described by (Snider, 2007(Snider, , 2008 where w is some physical (structural) parameter, i MR is the current through the device, v MR the voltage drop across it, and g is its (nonlinear) conductance. Since the change of structural parameter w is driven by voltage v MR , we say this memristor is voltage (or flux) driven. The group at the Michigan University claims to have fabricated a memristor of this kind (Jo et al., 2009(Jo et al., , 2010. If function f() in eq. (4) is driven by memristor current i MR , then we say the memristor is current (or charge) driven. The HP group tends to model their memristor as one of this type (Strukov et al., 2008;Borghetti et al., 2009). In memristive nanoscale devices, function f may describe ionic drift under electric fields. Although this may conceivably be modeled by a linear dependence of f with voltage v MR (Strukov et al., 2008), it is clear that in reality such dependence is more likely to grow exponentially and/or include a threshold barrier v th (Jo et al., 2010). For our discussions, let us assume the following dependence 5 Note that our aim in providing a macro-model circuit is to have a means of simulating large number of memristors efficiently in a circuit simulator, and hence take advantage of its computational efficiency and ease of use. Our aim is not to provide a means of building physical circuits out of such macro-models (as Chua, 1971, did in the past using mutator circuits). A direct physical circuit realization of Figure 6B would result, for example, in leaks of the memory value w due to unavoidable current leak paths in parallel with the capacitor.
Parameter k R scales between the voltage domain range of w ( usually within a few volts, for proper simulator convergence) and the resistance domain range of R which can be as high as hundreds of Megaohms (Jo et al., 2009(Jo et al., , 2010. Figure 7 shows the simulation results 9 of a memristor connected in series with a 5-MΩ resistance stimulated with a 2-V sinusoid of decreasing frequency from 5 KHz to 0 Hz in 26 cycles. Maximum and minimum memristor resistance limits were R max = 100 MΩ and R min = 100 MΩ, symmetric threshold voltages were |v th | = 1 V, the exponential f(v MR ) was characterized by I o = 10 µA, v o = 0.1 V, v th = 1 V, and the other macro-model parameters were 222mA, w o = 12.2 V, and C MR = 10 mF.

relAtIon between StdP And MeMrIStAnce
How can STDP be related to memristance? The key is to consider carefully the shape of the electric neural spikes (Linares-Barranco and Serrano-Gotarredona, 2009). The exact shape of neural spikes, usually called "action potentials" among neuroscientists, is difficult to measure precisely since the experimental setup influences strongly. Furthermore, different action potential shapes have been recorded for different types of neurons, although in general they all display a certain resemblance. For our discussion it suffices to assume a generic action potential shape with the following properties (see Figure 8A). During spike on-set, which happens during a time t ail + , membrane voltage increases exponentially to a positive peak amplitude A mp + . After this, it changes quickly to a peak negative amplitude − − A mp and returns smoothly to its resting potential during a time t ail − . A shape of the type shown in Figure 8A can be expressed mathematically, for example, as Parameters τ ail + and τ ail − control the curvature of the on-set and off-set sides of the action potential.
series, and moves depending on device current or voltage. A circuit macro-model that implements eqs. (3-6) is shown in Figure 6B. It comprises a controlled resistor in which resistance R is controlled linearly by internal state voltage w, R(w) = k R × (w + w o ). Voltage w represents the "structural" parameter of the wall position, which is bounded to [0,L] 6 . Component NOTA is a non-linear differential-input voltage-controlled current source (transconductor), also known as non-linear operational transconductance amplifier (OTA) which provides an output current i g (v MR ) = f(v MR ) controlled by input differential voltage v MR , as given in eqs. (5-6) and Figure 5B. Non-linear element g sat is a non-linear resistor with a shape as shown in Figure 5C, which limits the range of the resistance R(w) to [R min , R max ], thus keeping w inside its natural boundary [0,L]. Consequently, the macro-model circuit in Figure 6 is mathematically described by  6 For current/charge driven memristors, the boundary constraint was modeled by adding a multiplicative windowing function to allow for non-linear drift (Biolek et al., 2009;Joglekar and Wolf, 2009). 7 These equations are similar to Di Ventra's macro-model Pershin and Di Ventra, in press), except that we use an additive rather than a multiplicative term g sat to constrain the interval for w. A multiplicative term may result in the stacking of w at one of its limits, as mentioned by Biolek et al. (2009). 8 We originally reported the macro-model of Figure 6B in May 2010 (Pérez-Carrasco et al., 2010a). Simultaneously, a mathematically equivalent macro-model was reported by Shin et al. (2010), based on Charge-Flux Constitutive Relationships. 9 Spectre netlist files for these simulations are available in folder Spectre_for_Fig7 in the AdditionalMaterial file, downloadable from the Journal web site.
negative areas (below −v th , when ∆T < 0) result in decrements for w (∆w < 0). As |∆T| approaches zero, the peak of the red area in v MR is higher. Since this peak is amplified exponentially, the contribution for incrementing/decrementing w will be more pronounced as |∆T| is reduced. The resulting function ∆w(∆T), computed using the memristor model through eq. (13) is shown in Figure 3B. Actually, it imitates the behavior of the STDP function ξ obtained by Bi and Poo from physiological experiments, which was shown in Figure 3A. For this numerical computation we used the following parameters: . , v o = 1/7, t ail + = 5 ms, t ail − = 75 ms, t + = 40 ms, t − = 3 ms. Making α pos ≠ α pre breaks the symmetry of function j(∆T), and making them very different removes one of the branches in j(∆T). It is possible to have more freedom to achieve a desired shape for j(∆T) by setting α pos = α pre = 1, but instead shaping the spikes traveling forward and backward independently. Also, note that for ∆T values very close to zero ∆w(∆T) is approximately linear and crosses the origin. This is because as ∆T approaches zero, v MR approaches zero for any t [see eq. (12)].
This result shows that a memristive type of mechanism could be behind the biological STDP phenomenon 10 .

Influence of ActIon PotentIAl ShAPe
The shape of the action potential function spk(t) strongly influences the shape of the resulting STDP function ξ(∆T). As an illustration, Figure 9 shows how for several shapes of action potentials ("spikes") different STDP learning functions ξ(∆T) are obtained 11 . For example, if the exponential shape degenerates into a triangular type of shape, then the central region of ξ(∆T) will display a smoother transition from the negative peak to the positive peak. Note that this would weaken learning for cases with small |∆T|. Making the positive peak of the spike smaller than the negative peak, makes the negative branch for ξ(∆T) stronger than the positive branch. If the action potential is substituted by a rectangular shape signal, then the central region becomes linear and a saturation effect might occur. If the rectangular spike is made more symmetric, then ξ(∆T) degenerates into a triangular type of shape, which is very different from the original biological STDP learning function. In general, to obtain a biological-like STDP learning function a narrow short positive pulse of large amplitude and a longer relaxing slowly decreasing negative tail are required. However, from a computational point of view, it might be more interesting to massage the shape of the "spikes" and tune the STDP learning function as desired.

MeMrIStorS IMPleMent A MultIPlIcAtIve tyPe of StdP
It is worth mentioning here that memristors actually implement a multiplicative type of STDP. The reason is because, according to eq. (13), the structural parameter updates ∆w(∆T) follow an additive type of STDP rule, independent of w. Parameter w is the memristor "wall" position separating the low and high resistivity Consider the case of pre-and post-synaptic neurons in Figure 1 being of the same type, and thus generating the same action potential shape, spk(t) of eq. (10), when they fire. Axons and dendrites operate as transmission lines, so it is reasonable to expect some attenuation when the spikes arrive at the respective synapses. Let α pre be the attenuation for the pre-synaptic spike V mem-pos (t) = α pre spk(t − t pre ), and α pos for the post-synaptic spike V mem-pos (t) = α pos spk(t − t pos ). When both spikes are more or less simultaneously present at the two cell membranes of the synapse, then channels on both membranes are open. Consequently, in principle, it makes sense to assume that during such time there could be a path for substances in the inside of one cell to move directly to the inside of the other cell and vice versa. Furthermore, let us assume now that such motion of substances obeys a memristive law similar to those described by eqs. (3-6). This means, that we would have a two-terminal memristive device between the inner sides of the two cells; more specifically, between V pos − and V pre − in Figure 1B A simple change of variables t = t′ − t pos and recalling that This memristor voltage v MR is shown in Figure 2 for the cases of ∆T being positive or negative. According to eqs. (5-6), memristive update will take place only if v MR exceeds threshold v th , as indicated by the red shaded area in Figure 2. As we postulated earlier, during this memristive update some amount of synaptic structural substance(s) ∆w would be interchanged between the two sides of the synapse. The amount of substance ∆w will ultimately affect the synaptic strength of this synapse. If this amount of synaptic structural substance interchanged between the two synaptic terminations obeys a memristive law as in eqs. (3-6), then from eq. (4) which is the red area of the shaded regions in Figure 2, previously amplified exponentially through function f() of eqs. (5-6). Positive areas (above v th , when ∆T > 0) yield increments for w (∆w > 0), while −20 40 60 80 20 −20 40 60 80 20

10
A few months after announcing our finding (Linares-Barranco and Serrano-Gotarredona, 2009), we became aware of a similar result (Afifi et al., 2009), which is a particular case of eqs. (10-12) for α pre = α pos = 1 and a different action potential shape (the one shown in Figure 9B1).

11
Matlab files for generating these figures are available in folder Matlab_for_Fig9 in the AdditionalMaterial file, downloadable from the Journal web site.
Note that the fact that memristors implement a multiplicative type of STDP derives from the fact that the wall position w is linear with resistance, and thus inversely proportional to synaptic strength. Consequently, we should expect mSTDP also from synchronous STDP realizations, using either current or voltage-driven memristors, as the update would also be weight dependent. On the other hand, we have assumed that function f() in eqs. (5, 6, and 13) does not depend on w. In practice, there might exist such dependence, and if true, the resulting mSTDP learning function might deviate from the one discussed here.

connectIng MeMrIStorS wIth SPIkIng neuronS for ASynchronouS StdP leArnIng
Synchronous memristive STDP learning architectures were proposed by Snider (2007Snider ( , 2008, assuming voltage/flux driven memristors, and recently demonstrated by the group at Michigan University (Jo et al., 2010). In that proposal each neural spike is mapped into a sequence of precisely spaced fixed amplitude digital pulses which must maintain global synchronization to separate regions (see Figure 6A). According to eq. (8), the memristor instantaneous resistance R(w) is linear with w. Consequently, the memristive STDP resistance update ∆R(∆T) = k R × ∆w(∆T) = k R j(∆T) will follow an additive STDP update rule as well, independent of the actual value of R. However, as we will see in the next Section, when memristors (or resistors) are used as synapses in a neural circuit, the synaptic strength of such synapses is proportional to their conductance G = 1/R, because as conductance increases more current will be delivered to the post-synaptic neuron. Consequently, synaptic strength update is given by which is quadratically proportional to the actual conductance. Memristors will therefore yield larger update steps for high conductances, but smaller steps for low conductances. This suggests that, before training, weights (conductances) should be initialized to rather high values, so that as learning progresses the updates tend to become smaller. and through a local, compact digital-to-analog converter provide the programmed action potential. Implementation details of such spike generation circuit are beyond the scope of this paper. For the synapses, it is possible to fabricate very high density memristor crossbar structures (Strukov et al., 2008;Borghetti et al., 2009) which connect to the neural layers, as shown in Figure 11A. Neurons generate action potentials with a shape similar to those given in Figure 8. Note that the positive terminals of the memristors connect to the pre-synaptic neurons. This way, when ∆T > 0 the memristors see a negative voltage beyond threshold and their resistance (inverse of synaptic strength) will decrease. Alternatively, the same result can be obtained by having the neurons generate an inverted spike action potential (as in Figure 8B) but connecting the memristors with opposite polarity, as shown in Figure 11B.
In an excitatory synapse a pre-synaptic action potential spike should produce an increment in the neuron integral, i.e., make the integrator output voltage V int approach V REF (see Figure 10). During neuronal spike integration a neuron simply accumulates the contributions of incoming spikes on its integrator. All synapses connected to its input node do not experiment any weight update and operate as resistances of constant value. This is guaranteed by making the action potential peaks lower than the threshold value v th in Figure 5. In order to have a constant positive resistance contribute a net positive charge packet during each incoming pre-synaptic spike, the net area under the spike waveform (see Figure 8) should be positive. For the particular case of parameter selection that results in the action potential shapes in Figure 8, it turns out that the spike in Figure 8A presents a net negative area while the spike in Figure 8B presents a net positive area. Consequently, using spikes with the shape and parameters as in Figure 8A results in synapses delivering net negative charges, while using spikes with the shape and parameters as in Figure 8B results in synapses delivering net positive charges. If neurons are set such that V REF > V rest , the incoming net negative charge packets make the integrator output V int approach V REF . In this case synapses delivering net negative charge packets operate as excitatory synapses. On the contrary, if neurons are set such that V REF < V rest then synapses delivering net positive charge packets operate as excitatory synapses. The arrangement shown in Figure 11A, which uses spikes as in Figure 8A, therefore results in excitatory synapses delivering net negative charge packets if V REF > V rest . On the other hand, for Figure 11B which uses spikes as in Figure 8B, synapses are excitatory, delivering net positive charge Interestingly, the strength of STDP learning can be modulated by changing the amplitudes (or shapes) of the electric spikes in time. This would allow the implemention of faster learning at the beginning of a learning process, and progressively slow learning down as the system stores acquired knowledge, or even turn it off completely after some time. This is a very desired feature for STDP machine learning systems (W. Maass, Private communication).
This way of interconnecting memristors with neurons as in Figures 10 and 11 avoids cross-coupling of spikes between rows and columns, because all lines are driven by (ideal) voltage sources. Using this arrangement with the memristor macro-model of Figure 6 we performed intensive behavioral simulations in Cadence-Spectre to test the concept on the 4 × 4 feed forward array shown in Figure 12. Note that the neuron used (shown in the inset) is a particular case of the integration phase of neural activity from the synaptic weight update phase. This global synchronization requirement imposes severe difficulties when the system scales up to very large sizes.
On the other hand, from the discussions in previous Sections, we present an alternative approach which is fully asynchronous (Linares-Barranco and Serrano-Gotarredona, 2009). Consequently, no global synchronizations will be required nor global separations into neuron-integration phases and synapse-learning phases, making this approach attractive for scaling up to very large numbers of neurons and synapses.
We first need a neural circuit that integrates spikes until a threshold is reached. At that moment, it should provide a spike of the desired shape. A possible schematic diagram for a leaky integrate-and-fire (LIF) neuron block is shown in Figure 10. The neurons need to include a current summing and sinking input terminal so that in the absence of spike output the integral of input current spike signals can be computed, while maintaining the input node tied to a fixed voltage. This can be done by using a lossy integrator with a clamped voltage input. The output of this accumulated integral V int is compared against a reference V REF .
If this reference is reached, the comparator output will trigger a spike generation circuit, which provides the output spike of the neuron. During spike generation, the integration capacitor is charged to refractory voltage V refr , while the input opamp is configured as a voltage buffer, thus copying the spike waveform to the neuron input node. An attenuated version of the post-synaptic neural voltage α pos V pos (t) is thus made available to the synaptic memristors connected to this neuron input. Another attenuated version of the spike is fed forward to the output of the neuron V pre (t) = α pre spk(t). During the whole time of the spike (typically in the order of 20-100 ms) the neuron is not integrating (computationally inactive). This time is also called "refractory time." During the absence of spike output, the spike generation circuit provides a constant voltage V rest .
For the Spike Circuit in Figure 10 an analog circuit can be devised that generates a specific action potential shape with some tunable parameters (Indiveri et al., under review). However, for STDP experiments it is more desirable to allow for full programmability of arbitrarily shaped action potentials. Since all neurons should have the same spike shape (at least all neurons of a part of a whole system), one interesting option is to have a circuit at the chip periphery broadcasting digital samples of the action potentials at different phases to all neurons. The spike generation circuits would then capture the closest phase, delay it properly, Memristors are reversed with respect to the cases in Figure 11. Note that now, for anti-STDP, when ∆T > 0 the memristors see a positive v MR voltage beyond threshold (which will produce an increase in resistance and a decrease in synaptic strength), while for ∆T < 0 they see a negative voltage beyond threshold (which will produce a synaptic strength increment). Memristors are physically positive resistances (of time varying values). Whether memristors act as excitatory or inhibitory synapses is determined by the combination of (1) net area under the action potential waveform (i.e., sign of net charge sent to the post-synaptic neuron) and (2) whether V REF is above or below V rest , as mentioned in the discussion around Figure 11. Another twist in STDP variations is obtained by having the neurons send back an inverted version of the spike sent forward, as shown in Figure 15A 13 . In this case, it is possible to have the resulting STDP learning function show a positive learning window around ∆T = 0 with positive increments for both positive and negative values of ∆T close to zero. Beyond a specific value of |∆T| there are decrements in the weights (∆w < 0), for both positive and negative sides of ∆T. This is shown in Figure 15C. This type of learning is useful under some circumstances (W. Maass, Private communication).

AchIevIng StdP wIth MeMrIStIve fet-lIke devIceS
Most present day literature on memristors refers to two-terminal (nano)devices. In essence, memristors operate as resistors the resistance of which can be changed and maintained in a nonvolatile manner. This feature (and their compact size) is basically what makes them highly attractive not only for building neuromorphic systems, but also for memories and computing systems in general. However, memristance is not necessarily restricted to two-terminal devices. It is well known that it is possible to use three (or four) terminal FETs as resistors, current sources, or (volatile) memory elements. If the same nanoscale principles that give rise to memristance in two-terminal devices could be extrapolated to three or four terminal FETs, then the adaptive memristive circuits presented so far could be extrapolated to more generic FET-based circuits as well. FETs have more terminals and consequently will the one in Figure 10 with V refr = V rest = 0 (spike resting potential) and R = ∞. The results are shown in Figure 13 12 . Only the first two column synapses are stimulated, with 200 ms period spikes (of 45 ms duration) with a 25-ms relative delay between the two columns. As can be seen, only the synapses at the first two columns change their resistance, while those on the other two columns do not, confirming the correct operation of STDP without any crosstalk between columns or rows. This demonstrates that this architecture can be scaled up to arbitrary size, at least conceptually. Practical considerations that could limit maximum size are mainly fan-out of neurons, interconnect delays, and parasitic crosstalk. Note that in Figure 13 memristor resistances do not always converge to their extreme values R min or R max (as in additive STDP) but that some of them (R 12 , R 21 , R 31 , R 34 ) have converged to intermediate values (as is characteristic for mSTDP).

StdP vArIAtIonS
Standard STDP aims to implement the synaptic learning functions of the shape shown in Figure 3B. In the case of synapses with negative weights anti-STDP learning functions similar to the one shown in Figure 3C need to be implemented. This is achieved with memristors by simply changing their polarity. Figure 14 shows how neurons and 12 Spectre netlist files for these simulations are available in folder Spectre_for_Fig13 in the AdditionalMaterial file, downloadable from the Journal web site.

Figure 11 | Two possible interconnection schemes between memristors and neurons for STDP learning. (A)
Memristor polarities for spikes as in Figure 8A. (B) Memristor polarities for inverted spikes as in Figure 8B. Figure 12 | Small feed forward memristive array simulated behaviorally with Cadence-Spectre. 13 Matlab files for these simulations are available in folder Matlab_for_Fig15 in the AdditionalMaterial file, downloadable from the Journal web site. Figure 16A shows a 3-terminal FET symbol, and Figure 16B illustrates how such a FET can be used as a very-wide-range tunable current source by maintaining its V GS voltage constant but changing its threshold voltage. result in less dense structures than their two-terminal counterparts. However, FETs can present very wide tuning ranges. For example, imagine a (nano)FET transistor in which the threshold voltage could be tuned through some memristive-like mechanism. from previous layer neurons activate the FET drain terminals. The FET source terminals are connected to fixed voltages V ref2 clamped by the neurons' input. The neurons' inputs are now always clamped to a fixed voltage, i.e., the neuron output spike is not sent back through the neuron input terminal. Now, the neuron spike is sent back through an independent extra terminal which connects to the gates of the FETs in the same row. If a pre-synaptic spike (on a synapse FET drain terminal) is prior to a post-synaptic spike (on the same synapse FET gate terminal) then the FET's gate-to-drain voltage would exceed positive threshold v th in Figure 17B, activating the corresponding NOTA, increasing w. This would decrease the FET's threshold voltage and shift its current versus V GS curve in Figure 16B to the left, thus increasing its driving current for the same V GS voltage. Consequently, this results precisely in an STDP positive update. If the pre-synaptic spike occurs later than the post-synaptic spike, the opposite behavior is obtained: a negative STDP update. The circuit in Figure 18 was simulated in Cadence-Spectre using the macro-model in Figure 17 for the FETs. The simulation results are shown in Figure 19 15 . For the neuron we used a simplified version of the circuit in Figure 10 which is not lossy and which has a refractory voltage equal to the resting potential. For the simulations in Figure 19 we set the range Some researchers are developing nanoscale FET devices in which the threshold voltage can be adjusted by manipulating their terminal voltages above certain thresholds 14 (Lai et al., 2008;Agnus et al., 2010;Bichler et al., 2010;Jeong et al., 2010). Such behavior suggests a memristive (nanoscale) weight update mechanism similar to the one we have assumed for two-terminal memristors in the previous Sections. Let us assume we can change the FET threshold voltage up and down whenever either the gate-to-drain V GD and/or gate-tosource V GS voltages exceed certain positive and negative thresholds. A FET macro-model for describing such behavior, inspired by the one in Figure 6 for memristors, is shown in Figure 17A. There, a fixed-threshold (memory-less) FET is used, in which the internal gate terminal is connected through a variable voltage source to the external device gate G. This voltage source is controlled by an internal weight w, which changes by means of a mechanism similar to that of the memristor weight in Figure 6. The NOTA non-linear function, shown in Figure 17B is identical to that in Figure 5B, and the weight saturation function shown in Figure 17C is identical to that in Figure 5C.
Using such memristive FETs we can assemble a crossbar-like feed forward perceptron network, as shown in Figure 18. The synapses are now implemented using FETs of this type. Input spikes coming 14 Some other researchers have demonstrated STDP with 4-terminal FETs that include a floating gate (Ramakrishnan et al., 2010).   Mahowald, 1992). For over 10 years AER sensory systems were reported by only a handful of research groups, examples being Lazzaro et al. (1993) and Johns Hopkins University (Cauwenberghs et al., 1998) pioneering work on audition, or Boahen (1999Boahen ( , 2002 early developments on retinas. However, during these years some basic progress was made. A better understanding of asynchronous design (Sparsø and Furber, 2001;Martin and Nyström, 2006) leading to robust unarbitrated (Mortara et al., 1995) and arbitrated (Boahen, 1996(Boahen, , 2000 asynchronous event readout, combined with the availability of user-friendly FPGA external support for interfacing and new submicron technologies allowing complex pixels in reduced areas, heralded a new trend in AER bio-inspired Spiking Sensor developments. Since 2003 many researchers have embraced this trend. AER has been used fundamentally in vision (retina) sensors, for purposes such as simple light intensity to frequency transformations (Culurciello et al., 2003;Posch et al., 2010), time-to-first-spike coding (Ruedi et al., 2003;Chen and Bermak, 2007), foveated sensors (Azadmehr et al., 2005), spatial contrast (Costas-Santos et al., 2007;Massari et al., 2008;Ruedi et al., 2009;Leñero-Bardallo et al., 2010), temporal contrast (Mallik et al., 2005;Posch et al., 2007Posch et al., , 2010Lichtsteiner et al., 2008;Leñero-Bardallo et al., 2011), motion sensing and computation (Boahen, 1999), and combined spatial and temporal contrast sensing (Zaghloul and  of w as w max = 10 V and w min = −10 V. We can see again that some weights converge to the boundary values, while others stabilize at intermediate values, as is characteristic for mSTDP.

AddreSS-event-rePreSentAtIon
Address-event-representation (AER) is a well established technology among neuromorphic engineers. AER was originally proposed 20 years ago in Mead's Caltech research lab (Sivilotti, 1991; But AER has also been employed for post-sensing event-driven processing, emulating biological cortical structures. Vernier et al. (1997) developed AER convolutional filters with elliptic-like kernels while Choi et al. (2005) reported more sophisticated Gabor-like filters. Serrano-Gotarredona et al. (1999) reported an AER architecture that would allow more generic kernels, although with some Boahen, 2004a,b). AER has also been used for auditory systems (Lazzaro et al., 1993;Cauwenberghs et al., 1998;Sarpeshkar et al., 2005;Wen and Boahen, 2006;Chan et al., 2007), competition, and winner-takes-all networks (Chicca et al., 2007;Oster et al., 2008), and even for systems distributed over wireless networks (Teixeira et al., 2005). Figure 17A) for the FeT network in Figure 18. a real artificial AER retina, and finally we will show the receptive fields formed through STDP learning in the artificial memristive V1 layer with this training data. The biological V1 visual cortex layer is known to be sensitive to specific orientations (Hubel and Wiesel, 1959). We will show how such orientation sensitive receptive fields arise naturally when building an artificial memristive V1 layer with STDP learning and stimulated with real spiking data obtained with an artificial AER motion sensitive retina.

Figure 19 | Simulated evolution of weights (w in
The spontaneous formation of orientation sensitive receptive fields through STDP learning has already been reported by other researchers 16 (Delorme et al., 2001;Snider, 2007). In those works static luminance images were used for training. Pixel intensities were coded into spikes through some kind of computational transformation: either a stochastic rate coding scheme (Snider, 2007), or a rank-order coding scheme (Delorme et al., 2001). Here we directly use the continuous AER output stream of events produced by a real motion sensitive retina CMOS sensor.

toPology of v1 vISuAl cortex lAyer And PhySIcAl reAlIzAtIon
The simplified V1 topology we want to emulate can be explained with the help of Figure 21A. The retina sends spikes to the V1 cortex layer through synaptic connections 17 . The V1 layer is structured in a number of "Feature Maps." We can think of the retina as an array of "pixels," each with coordinate (x, y). Let us assume each "Feature Map" in V1 replicates the same coordinates (x, y), so that each pixel in the retina has a corresponding pixel in each "Feature Map" with the same coordinate. Each pixel (x c , y c ) in a "Feature Map" receives inputs not only from pixel (x c , y c ) in the retina, but also from all neighbors within a spatial neighborhood (x c + x r , y c + y r ). Alternatively, we may say that each pixel in the retina (x c , y c ) connects to a Projection Field of pixels (x c + x r , y c + y r ) in each of the Feature Maps. Thus, projection fields include a number of synaptic connections, so that the spikes produced by one pixel in the retina are sent to the pixels of the projection field in each feature geometric symmetry restrictions. In 2006 the same group started to report working AER Convolution chips with arbitrary shape programmable kernels of size up to 32 × 32 (Serrano-Gotarredona et al., 2006, 2008Camuñas-Mesa et al., in press). Figure 20 explains the basic idea behind a point-to-point AER link. An emitter chip (or module) includes an array of neurons generating spikes. Each neuron is assigned an address, such as its (x, y) coordinate within the array. Neurons generate spikes at low frequency (10-1000 Hz), and these are arbitrated and put on an inter-chip (or inter-module) high-speed asynchronous AER bus. The AER bus is a multi-bit (either parallel, serial, or mixed) bus which transmits the addresses of the emitting neurons. Typical delays for transmitting Address Events between AER chips range from about 30 ns (Costas-Santos et al., 2007) to 1 µs (Lichtsteiner et al., 2008) per event for parallel AER, and have been reported down to 24 ns per event for serial AER with potential to go as low as 8 ns per event (Berge and Hafliger, 2005). These addresses are received, read, and decoded by the receiver chip (or module) and sent to the corresponding destination neuron or neurons. Figure 20 illustrates a point-to-point AER link with a single emitter and a single receiver. The use of AER splitters and mergers ) allows extension to one-to-many, manyto-one, or many-to-many AER links. Inserting AER mappers ) allows coordinate transformations (rotations, translations, etc.) to be performed while address events travel between modules. Current research is looking at how large numbers of AER convolutional modules can be combined through independent and multiple AER links to build high-speed object and texture recognition systems (Camuñas-Mesa et al., 2010;Pérez-Carrasco et al., 2010b).

buIldIng A Self-leArnIng vISuAl cortex wIth MeMrIStorS And StdP-reAdy Aer hArdwAre
In previous Sections we have shown how to interconnect memristors with spiking neurons to achieve STDP learning systems. We have illustrated this with a very specific topology, a feed forward crossbar structure (Figures 11, 14, and 15), where all neurons in one layer connect to all neurons in the next layer. However, the methodology is not restricted to this specific spatial topology, and can be extended to any generic neural network topology. In this Section we will apply those same concepts to a topology representing the first processing layer of the visual cortex, namely layer V1. We will first explain the V1 layer topology we will use, show how to build it physically, then we will describe the training data we will use from

16
It should be clarified, however, that the present body of experimental neuroscience literature confirms that preliminary orientation tuned receptive fields build up during development before eye opening. Afterward, visual experience contributes to the maturation and sharpening of orientation selectivity (White et al., 2001). 17 Here we are ignoring the lateral geniculate nucleus (LGN) between the retina and V1, as in many visual system models.
LGN seems to introduce a relay only, without performing any computation.
map. Feature maps operate as feature extractors. Specifically, the feature maps in V1 detect the presence of oriented edges at specific orientations and scales (Hubel and Wiesel, 1959).
The physical implementation of one such Feature Map with AER CMOS neurons and a layer of memristor crossbar structure on top is shown in Figure 21B (Snider, 2008). The lower CMOS contains the array of neurons (or pixels) of one V1 Feature Map. Each neuron has coordinate (x, y), as its corresponding retina pixel. Address Event spikes of coordinate (x, y) coming from the retina are sent to pixel (x, y) in the Feature Map. This neuron then sends out a spike of the desired shape (for example, as in Figure 8) through its output node. In Figure 21B each neuron has an output node (green) and an input node (red). The output node connects to a thin line tilted slightly with respect to the CMOS tile (as in CMOL; Strukov and Likharev, 2005), so that it does not intersect with any other neuron output node in the CMOS tile. This thin line has many others in parallel, each connecting to one neuron output node. Perpendicular to all these lines there are other thin lines (at a different altitude), each connecting to the input node of one neuron. The two sets of perpendicular lines form a "sandwich" with a separation layer formed by memristive material. This way, at the intersection of each perpendicular thin line there is a memristor. For example, in Figure 21B neuron 1 output node connects to the vertical pink line, while neuron 2 input node connects to the horizontal pink line. The synaptic memristor connecting neuron 1 output to neuron 2 input is at the intersection of the two lines (blue circle). The vertical pink line (neuron 1 output) has memristive intersections with all horizontal thin lines. Consequently, neuron 1 output connects to all other neuron inputs. In the same manner, all neuron output nodes connect to all neuron input nodes. For projection field based topologies, each neuron output does not connect to all other neuron inputs. Instead, connectivity is limited to a given spatial neighborhood. This is achieved by having lines of limited length (instead of one reaching over the full CMOS array). For square projection fields of size 10k = 100 2 , for example, each line has to be extended to 50 cells on each side.
Below we present some simulation results from training a set of such AER Feature Maps with real stimuli coming from a temporal derivative AER retina watching life scenes. First, we briefly explain the AER temporal derivative retina and what kind of spikes it produces. We then describe how we used this data to stimulate a set of AER hybrid CMOS-memristor Feature Maps and what kind of selectivity these Feature Maps developed.

Aer teMPorAl dIfference retInA
We will use the spiking data obtained from an AER Temporal Difference Retina chip (Posch et al., 2007(Posch et al., , 2010Lichtsteiner et al., 2008;Leñero-Bardallo et al., 2011) to train an artificial V1 STDP visual cortex layer. The retina has an array of 128 × 128 pixels. Each pixel (x, y) has a photo sensor that provides a continuous photo current I ph (x, y) plus a circuitry that generates a signed spike every time its photo current changes by a given relative amount |∆I ph / I ph | > C th . Figure 22A shows the output events produced by the retina when observing a dot rotating at 400 Hz. Blue dots represent positive events (going from dark to bright) and red dots represent negative events (going from bright to dark). The address events collected during 7 ms are plotted in (x, y, t) coordinates. Figure 22B shows the events collected during 20 ms when observing two people walking. White pixels correspond to positive events (∆I ph /I ph > 0), while black pixels to negative events (∆I ph /I ph < 0).

StdP trAInIng reSultS of v1 lAyer
In this Section we will analyze the learning behavior a hybrid CMOS-memristive V1-like system when it is trained through STDP using the architectural and circuital principles outlined throughout the paper and using real stimuli obtained from a 128 × 128 pixel AER temporal derivative retina. Specifically, we used a 521-s recording with 20.5 million events showing scenes observed when driving in a car (Delbruck). We used a simplified network structure to simulate and see what kind of receptive fields would naturally arise. The network structure is shown in Figure 23. From the retina visual field of 128 × 128 pixels we cropped 324 non-overlapping patches of 7 × 7 pixels each, and concatenated all these events sequentially making a recording of 324 × 521 = 168804 s (47 h) with 19.6 million events. This concatenation was used for one training epoch, and we required a total of five epochs to observe convergence in the learned weights. The events from each patch are separated into two additional 7 × 7 fields depending on the event sign. The activity of  Figure 24 shows the evolution of the receptive fields when using the dedicated event based simulator [case (2) in previous paragraph]. We see the receptive fields of the 32 neurons, where the positive and negative weights are grouped together in the same 7 × 7 square by assigning "white" to positive weights and "black" to negative weights. The central gray color means zero weight. As can be seen, there is a clear tendency for the receptive fields to become orientation selective.
It is worth mentioning that the type of continuous processing involved here differs from time-to-first-spike (or rank-order) coding schemes, where a stimulus on-set provides a reference time these two subfields is projected onto 32 neurons 18 . Consequently, there are 32 × 2 × 7 × 7 trainable weights. Weights are always positive. Each of the 32 neurons inhibits the other neurons through lateral inhibitory connections, as in reference (Masquelier et al., 2009a). Each neuron is as shown in Figure 10, with a leak and a refractory voltage. Inhibitory lateral connections have fixed weights, while the weights of the feed forward connections follow STDP learning. Weights were initialized either to random values, or to maximum values. The STDP learning functions were such that the ratio of the negative side area over the positive side area was [see eq. (1)] a − t − /a + t + = 1.25, meaning that STDP was biased toward depression. The time constant for the positive side was t + = 13.6 ms, while that of the negative side was t − = 15.2 ms, and there is a central linear region for |∆t| < 0.5 ms. Memristor resistances were bounded to the interval R min = 10 MΩ, R max = 100 MΩ. We simulated this system theoretically in several ways: (1) by solving the differential equations of biological integrate-and-fire neurons via an Euler method with a time step update of 0.1 ms, using the Brian simulator (Goodman and Brette, 2008) and a conventional additive STDP learning rule; (2) by using a dedicated event based simulator adapted from reference (Masquelier et al., 2009a) implementing the quadratic STDP learning rule of the memristors and the neuron dynamics corresponding to the circuit in Figure 10 with spikes as in Figure 8 19 ; and (3) by simulating a simplified event-driven matlab code with instantaneous neuron dynamics (but including a non-instantaneous leak) and with quadratic mSTDP 20 . In all cases, receptive fields became clearly orientation selective. The resulting receptive fields are biased to vertical edges, similar to the visual input stimuli we have used for this experiment. 18 Compared to the arrangement in Figure 21A, each of the 32 neurons in Figure 23 represents one of the Feature Maps in Figure 21A. Consequently, to implement the full V1 structure physically, each neuron with all its 7 × 7 input synapses in Figure 23 has to be "cloned" in a 128 × 128 array. 19 The corresponding Matlab and C code files for these simulations are available in folder STDP_V1_C_code_Fig24 in the AdditionalMaterial file, downloadable from the Journal web site. 20 The Matlab file is available in folder STDP_V1_matlab in the AdditionalMaterial file, downloadable from the Journal web site.  STDP learning function characterized through electrical spectre simulations (blue dots) to match the ideal function used in the theoretical simulations (red circles) 23 .
At this point we would like to highlight an important difference between the memristor-based network of integrate-and-fire neurons with STDP synapses presented here, and an equivalent network as used normally among neurocomputational researchers (see the integrate-and-fire neuron model in Gerstner, 1999). In this latter case, the evolution of membrane voltage following an input spike at t spk is as if the spike injects a current I t t I e spk spk m t t spk spk ( ) at node V pos in Figure 10. Parameters I m and t spk defining this spike contribution are independent of the parameters a + , a − , t + , t − in eq.
(1) used to characterize the STDP learning function. However, in case of the memristor implementation, each spike injects a current (Delorme et al., 2001;Guyonneau et al., 2004;Thorpe, 2007, 2010;Weidenbacher and Neumann, 2008). It also differs from Phase-of-Firing coding, where the peak of a population activity oscillation is used as a reference time (Masquelier et al., 2009b). Here there is no oscillation, nor stimulus on-set, nor any reference time, but a continuous flow of spikes, and yet STDP is able to pick patterns that are consistently present in the training data, confirming previous results (Masquelier et al., 2008). Future work, however, will evaluate the use of periodic resets in the AER retina, leading to time-to-first-spike coding with respect to those resets. We also simulated the network in Spectre using the memristor macro-model in Section 3.1, but for 16 neurons only 21 . However, electrical circuit simulation was very slow. Simulating for just one of the 324 input patches (with about 154 K events) took 559 ks CPU time (6.5 days) running on a SUN Fire X2200 M2 Linux cluster with dual cores at 2.2 GHz and 4 GB RAM. In this case we could only verify that the initial evolution of weights was similar to those of the software programs, as shown in Figure 25.
For the circuit simulations we used the topology and spike shapes shown in 22 Figure 26. There are two input memristor arrays, one for the positive and one for the negative subfields in  Spectre and Matlab files for these simulations are available in folder Spectre_ STDP_V1 in the AdditionalMaterial file, downloadable from the Journal web site. 22 This topology and these spike shapes also correspond to the theoretical simulations of case (2) and Figure 24. The only difference between the two cases is in the number of neurons used: 32 neurons for the theoretical simulations, and 16 neurons for the circuit simulations. 23 Spectre and Matlab files for these simulations are available in folder Tune_ Memristor_ in_Spectre in the AdditionalMaterial file, downloadable from the Journal web site. no longer so ideal, then there will be crosstalk between lines. For example, if a spike is sent to a column then the voltage on all rows would change slightly. The consequence of this is that part of the charge provided by the incoming spike will be lost through nondesired synapses and the impact of the spike on the target neurons will be weaker. During learning, the situation is less severe because for STDP update the memristor voltage has to exceed the learning threshold [v th in eq. (5)]. The effect of having non-ideal voltage sources is that the terminal voltage difference on the memristors needing synaptic update would be slightly less than in the ideal situation and learning would be weaker than expected ideally. However, having non-ideal voltage sources would not induce STDP update in undesired synapses. Another parasitic issue related to crosstalk is parasitic capacitive crosstalk between lines, which can be more pronounced as pitch and line distances decrease.
Also, one highly critical aspect which needs to be evaluated is the influence of component mismatches. Nanoscale devices suffer from high mismatch in general. Consequently, we should expect nanoscale memristors too to suffer from great parameter variations from one to another. It is true that they will operate as adaptive devices that will learn their functionality hopefully compensating for (some) mismatches. However, their learning and adaptation rules will also suffer from mismatch, making some synapses learn faster than others, or in slightly different fashions. In any case, the main sources of mismatch in memristor devices still need to be identified, and then their influence in the overall system learning behavior evaluated. However, to undertake such an initiative, we first need ready access to large arrays of reliable memristors fabricated in a stable and repeatable manner.
In general, an important issue is precise memristor modeling. Throughout this paper we have assumed an idealized voltage-driven memristor macro-model. This is useful to devise possible system architectures to achieve a desired functionality, such as STDP learning. However, to estimate realistic performance figures of resulting systems, it will be necessary to include non-ideal effects, both of the memristors and companion CMOS circuits. In this paper, no high order effects have been modeled, such as those related to noise, mismatch, and other memristor non-idealities not yet reported. For example, we have assumed that the wall model boundary condition is imposed by the voltage dependent function in eqs. (5 and 6). If this function turns out to be dependent on w, has other non-ideal effects, or there are other important mechanisms involved in the boundary conditions, this might affect the resulting behavior of the STDP learning function when integrating eq. (13), questioning the quadratic mSTDP behavior we have assumed.

concluSIon
In this paper we have shown that STDP learning can be induced by the voltage/flux driven formulation of a memristor device. We have used this formulation to develop fully asynchronous circuit architectures capable of performing STDP, by having neurons send their spikes not only forward but also backward. We have seen that the STDP learning rule induced this way is of a multiplicative, specifically quadratic type. We have shown how the shape of spikes is critical to achieve and modulate a specific STDP learning function. We have provided a memristor macro-model for simulating arrays of memristors efficiently in circuit simulators. Finally, at node V pos in Figure 10 proportional to the spike waveform delivered by the neurons. Since this waveform also determines the shape of the STDP learning function, it turns out that there is now a strong dependency between the parameters defining the evolution of the membrane voltage and those defining the STDP learning function. They are no longer independent and it is consequently more difficult to adjust all true independent parameters properly for a desired behavior.

PrActIcAl lIMItAtIonS, reAlIStIc SIzeS, PItcheS, denSIty, croSStAlk And Power conSIderAtIonS
Nanoscale memristor technology is still quite incipient and no realistic large scale systems have been reported at the time of writing (as far as we know). However, we can estimate an orientative scale and density of what may realistically be achieved in the near future, and the main limitations which may be encountered in a real physical implementation.
Regarding the wiring density of synaptic memristors, a pitch of 100 nm is conservatively realistic for present day technologies (Jung et al., 2006;Green et al., 2007), while the near future might bring us closer 10 nm (Jeon et al., 2010). Assuming wafer scale dies of 100 nm pitch 2D memristor arrays capable of interfacing reliably with lower CMOS become available some time soon, this would result in a synaptic density of 10 10 synapses/cm 2 .
In the brain, the number of synapses per neuron is about 10 3 -10 4 . If we want to maintain the 10 4 ratio, we would need to fabricate CMOS neurons with a pitch of 10 µm, resulting in 10 6 neurons/cm 2 . Such neuron sizes are quite realistic for present day nanometer scale CMOS (45 or 32 nm), given the complexity of the neurons needed.
Another problem is that of resistance value ranges of the memristors' R min (synapse ON) and R max (synapse OFF). Reported memristors present resistance values from the kΩ range up to the MΩ range (Borghetti et al., 2009;Jo et al., 2009Jo et al., , 2010. The memristor resistance value range affects the performance, reliability, crosstalk and power dissipation of a full large scale system. For example, it affects the driving capability of the neurons and their power consumption. If one neuron needs to drive 10 4 synapses of average value 1 MΩ to an average 1 V level, it has to be able to provide an average current of 10 mA during a spike (of say 20 ms), delivering 10 mW per spike. If there are 10 6 neurons/cm 2 each firing at an average of 10 Hz (which is similar to biological neurons), the synapses would dissipate a power of about 2 kW. The neurons would need at least the same power, presumably more. It is obvious that such a structure would melt quickly. The resistance range needs to be increased by a minimum factor of 100, so that minimum resistances are at least 100 MΩ, or even larger. As pitches are lowered, resistances would need to increase quadratically with pitch decrease, to maintain the power limitation. Another option would be to scale down voltage, but there is not much range. Even our 1 V maximum voltage assumption is quite optimistic for available present day memristors, which tend to operate between 2 and 10 V (Borghetti et al., 2009;Jo et al., 2009Jo et al., , 2010. Also, we have always assumed so far that voltage sources driving memristor terminals behave as ideal voltage sources, or at least, that the output resistance of such voltage sources is negligible compared to the total resistance they have to drive. Again, this will be achieved more easily if memristors present rather high resistance values. If driving voltage sources are we have studied an emulation of the V1 visual cortex capable of self-learning how to extract orientations from spiking inputs provided by a real physical AER spiking retina fabricated in CMOS. At the end we have also discussed possible limitations of present day memristors. The presented results are ideal extrapolations based on behavioral simulations. As memristor devices are further developed and non-ideal effects become known, the impact of non-idealities in the presented architectures and methods can be further assessed. Future work has to evolve toward more realistic memristor models and improved memristor devices, specially devices with much higher resistivities. One critical property that memristors need to provide for efficient STDP and non-volatility is the central dead-zone in Figure 5B, which the already reported memristor from Michigan University (Jo et al., 2009) seems to present. Another issue relates to the quadratic type of mSTDP followed by the presented devices and architectures. This is a quite unusual form of STDP, which needs to be further investigated from a theoretical point of view. In general, there might be stability issues with generic STDP when used in complex biological models (Izhikevich and Desai, 2003;Watt and Desai, 2010). Similarly, since the presented approach allows the shape of the neural spikes, and therefore the shape of the STDP learning curves to be changed in time, further theoretical studies are required to incorporate time varying STDP learning functions for speeding up, stabilizing, or in general improving learning performance.