NeuroPack: An Algorithm-Level Python-Based Simulator for Memristor-Empowered Neuro-Inspired Computing

Emerging two-terminal nanoscale memory devices, known as memristors, have demonstrated great potential for implementing energy-efficient neuro-inspired computing architectures over the past decade. As a result, a wide range of technologies have been developed that, in turn, are described via distinct empirical models. This diversity of technologies requires the establishment of versatile tools that can enable designers to translate memristors' attributes into novel neuro-inspired topologies. In this study, we present NeuroPack, a modular, algorithm-level Python-based simulation platform that can support studies of memristor neuro-inspired architectures for performing online learning or offline classification. The NeuroPack environment is designed with versatility as a central goal, allowing the user to choose from a variety of neuron models, learning rules, and memristor models. Its hierarchical structure empowers NeuroPack to predict any memristor state changes and the corresponding neural network behaviour across a variety of design decisions and user parameter options. The use of NeuroPack is demonstrated herein via an application example of performing handwritten digit classification on the MNIST dataset with an existing empirical model for metal-oxide memristors.


INTRODUCTION
Over the last decade, neuro-inspired computing (NIC) has experienced immense growth, manifesting itself in a range of advances across theory, hardware and infrastructure. Theoretical NIC has proposed a very wide range of artificial neural network (ANN) configurations, such as convolutional neural networks (LeCun et al., 1999) and LSTMs (Hochreiter and Schmidhuber, 1997), that may operate at various levels of weight and signal quantisation (Dundar and Rose, 1995) and span both spiking (Wu and Feng, 2018) and non-spiking (Sengupta et al., 2019) approaches. Evidently, this design process comprises multiple decision points that overall render a very substantial and complex design space.
Simultaneously, research on novel hardware technologies has developed along multiple strands including fully digital architectures (Davies et al., 2018; Akopyan et al., 2015; Painkras et al., 2013), supra-threshold (Schmitt et al., 2017) and sub-threshold (Benjamin et al., 2014) analogue architectures, with some more recent contenders (Burr et al., 2016) utilising post-CMOS electronic components called "memristors" (Chua, 1971). This latter category is the focus of this work. Fundamentally, memristive devices are electrically tuneable (non-linear) resistors which have shown great promise at efficiently implementing the most numerous components found in neural networks: the synapses and the mapping of their weights. Memristors feature the potential for extreme downscaling (Khiat et al., 2016), back-end-of-line (BEOL) integrability (Xia et al., 2009), sub-ns switching capability (Choi et al., 2016) and very low switching energy (Goux et al., 2012). Multiple families of memristive devices have been developed exploiting different electrochemical effects, ranging from atomic-level effects in metal-oxides (Prodromakis et al., 2010) and atomic switches (Aono and Hasegawa, 2010), to bulk crystallisation/amorphisation effects in phase-change memory devices (Bedeschi et al., 2009) and magneto-resistive effects in spin-torque-transfer (STT) devices (Vincent et al., 2015), to name a few. Each of these families features its own idiosyncrasies in terms of electrical behaviour and is correspondingly described via distinct empirical models.
The broad interest in employing memristors within neuro-inspired hardware mainly stems from their multi-bit storage capability (Stathopoulos et al., 2017) and their simple architecture that can be tessellated in large arrays (Sivan et al., 2019). Those excellent features make memristors good candidates for the multiply-accumulate operations (Serb et al., 2017) required in in-memory computing (IMC) applications. Moreover, the intrinsic properties of memristors are similar to those of biological synapses (Serrano-Gotarredona et al., 2013). Inspired by this fact, designs such as Serb et al. (2016) and Covi et al. (2016) successfully applied memristors as synapses in online learning with spike-timing-dependent plasticity (STDP) (Markram et al., 1997), a learning rule inspired by biological NNs. Memristors have also been employed as components for NIC, from offline classification (Yao et al., 2020) to online learning (Payvand et al., 2020). Along these lines, software-based simulation platforms designed for memristor-based neuromorphic systems become significant for fast validation of design ideas and prediction of device behaviour.
Current simulators (e.g. MNSIM (Xia et al., 2018) and NeuroSim (Chen et al., 2018)) focus more on circuit-level designs, serving as tools either to simulate the behaviour of different hardware modules, or to estimate the performance of memristor-based neuromorphic hardware in integrated circuit designs. Sitting at a higher level of abstraction would be an "algorithm-level device-model-in-the-loop" simulator (or "algo-simulator" for short) designed to test functionally defined (as opposed to explicitly designed) circuits with memristive device models at algorithm level, e.g. performing specific online or offline learning tasks with memristors as synapses in spike-based NNs. Such a tool would allow fast verification of design concepts before serious hardware design effort is committed, in essence answering the question: is my design likely to function given knowledge of my memristive devices, assuming the rest of the circuit functions flawlessly? If yes, work can proceed to the next stage.
In this paper, we present NeuroPack: a simulator for memristor-based neuro-inspired computing at algorithm level. NeuroPack is a complete, hierarchical framework for simulating spiking-based neural networks, supporting various neuron models, learning rules, memristor models, memristor devices, neural networks, and different applications. Written in Python, it can be easily extended and customised by users, as will be shown. NeuroPack also integrates an empirical memristor model proposed by Messaris et al. (2017b). Between processing algorithms and setting and monitoring memristor states, there is the significant step of applying a pulse of specific voltage and duration to trigger a memristor state change corresponding to some desired weight change calculated from learning rules. NeuroPack integrates a module that converts desired weight changes to estimated stimulation pulse parameters to bridge this gap. Finally, NeuroPack includes a built-in analysis tool with a user-friendly graphical user interface (GUI) for visualising and processing classification results. The main contributions of this work include:
1. Developing an algo-simulator for memristor-powered neuro-inspired computing with selectable neuron and device models, as well as learning rules.
2. Modelling memristor state changes in neuro-inspired computing tasks given user-defined memristor parameters.
3. Converting expected weight changes prescribed by learning rules into parameters of bias pulses used for triggering memristor state changes in weight updating.
The rest of the paper is organised as follows: in section 2 we introduce the architecture of NeuroPack with its core parts and workflow. Section 3 demonstrates an example application of handwritten digit recognition on the MNIST dataset performed in NeuroPack, and section 4 summarises the paper.

Design Overview
NeuroPack is designed for predicting the outcomes of online learning or offline classification tasks under selectable neuron, plasticity and memristive device models, as well as for triggering and monitoring memristor state changes. To achieve those two tasks, NeuroPack's workflow (see Figure 1) is divided into five parts: input file handling, virtual memristor array, neuron core, plasticity core, and analysis tool.
For input data handling, there are three input files that need to be generated by users: a configuration file, a connectivity matrix file, and a stimuli file. The configuration file contains the main parameters for building up NNs (e.g. network size, NN depth, number of neurons per layer), setting up neuron models (e.g. leakage, noise scale), and initialising memristor devices (e.g. upper and lower boundaries for memristor resistance). The connectivity matrix file is used to define neuron connectivity and to map synapses to virtual memristors. The stimuli file stores both input signals and output labels. When loaded in the NeuroPack main panel, input signals and output labels are split by checking whether each neuron is an output neuron, using the information provided by the configuration file.
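As an illustration, the synapse-to-device mapping carried by the connectivity matrix file can be pictured as below. The dict-based format and the function name are our own sketch for exposition, not NeuroPack's actual file format:

```python
# Hypothetical sketch: map every synapse (pre, post) of a fully connected
# layer onto a (wordline, bitline) address of the virtual memristor array.
# NeuroPack loads this mapping from the connectivity matrix file; here we
# simply build the equivalent structure in memory.

def build_connectivity(n_inputs, n_outputs):
    """Return {(pre, post): (wordline, bitline)} for a fully connected layer."""
    mapping = {}
    for pre in range(n_inputs):
        for post in range(n_outputs):
            # One device per synapse: rows index presynaptic neurons,
            # columns index postsynaptic neurons.
            mapping[(pre, post)] = (pre, post)
    return mapping

conn = build_connectivity(4, 2)   # 4 input neurons, 2 output neurons
```

A richer file could map several devices per synapse (e.g. differential pairs); the one-device-per-synapse layout above is the simplest consistent choice.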
We now walk through the procedure for carrying out a classification task in a spiking network. First, the input signals to the neuron model are converted to spikes and saved in the stimuli file. Training and test datasets are loaded separately. At each time step an input neuron can either be spiking (1) or silent (0). Next, the neuron core reads out the resistive states (RS) from the virtual memristor array, and calculates internal variables, such as membrane voltages, according to the selected neuron model. The new fire states are then calculated in two steps: firstly considering whether neurons are supposed to fire by checking if their membrane voltages surpass the threshold, and secondly adding network-level constraints (for example in winner-take-all networks). When the current input stimuli belong to a training dataset, inference results are sent to the plasticity core to trigger weight updates as per the selected learning rule. When the test dataset is processed, plasticity updates are skipped.
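The two-step fire-state calculation just described can be sketched roughly as follows, assuming the weights have already been read out of the virtual array into a matrix. Function names and the exact reset behaviour are our illustrative assumptions, not NeuroPack's code:

```python
import numpy as np

def lif_step(W, x, v_prev, alpha, v_th):
    """One inference time step: leak + integrate, threshold, then a
    network-level winner-take-all constraint (at most one spike)."""
    v = alpha * v_prev + W @ x          # leak plus weighted sum of input spikes
    candidates = v >= v_th              # step 1: neurons allowed to fire
    y = np.zeros_like(v)
    if candidates.any():
        # step 2: network-level constraint - only the strongest candidate fires
        winner = np.argmax(np.where(candidates, v, -np.inf))
        y[winner] = 1.0
    v = np.where(y > 0, 0.0, v)         # reset the neuron that fired
    return v, y

W = np.array([[0.9, 0.1],
              [0.2, 0.8]])
v, y = lif_step(W, np.array([1.0, 1.0]), np.zeros(2), alpha=0.5, v_th=0.9)
```

In NeuroPack the weights `W` would come from `read()` calls on the virtual memristor array rather than a plain NumPy matrix.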
Weight updates happen during the plasticity phase. For STDP, long-term potentiation (LTP) or long-term depression (LTD) "events" are directly applied to memristor devices based on a calculation taking nothing into account except spike timing information. For other learning rules other types of information may be necessary; these can be made available through the system configuration file. Importantly, there are two main conceptual ways of specifying plasticity events. The simpler way is to create a function that maps plasticity-relevant variables directly to pulsing parameters. Thereafter the physics of the memristor (correspondingly the response of the memristor model) will determine the actual resistive state change, which in turn will be reflected into a weight change via a resistance-to-weight mapping function. The more complex route involves mapping plasticity-relevant variable configurations to a requested weight change, and then searching the device model (a model is necessary for this approach) for a set of pulse parameters predicted to produce the required change. The same process is repeated for the given number of epochs. Inference results as well as internal variables and parameters, including membrane voltages, fire history, and weights, can be sent to the built-in analysis tool for further visualisation and analysis.

Neuron Models and Learning Rules
Neuron models and learning rules are placed in the "neuron core" and "plasticity core" respectively in Figure 1. For some learning rules, such as backpropagation, which requires calculation of error gradients, the equations depend on the neuron model. Therefore, neuron models must be treated as first-class objects. However, a simpler solution implemented in this tool is to ensure that each learning rule is paired with one neuron core and that both are placed together in the same "core" document. NeuroPack provides four example core files: leaky integrate-and-fire (LIF) neuron (Abbott, 1999) with STDP, LIF neuron with backpropagation (BP) (Rumelhart et al., 1986), Tempotron (Gütig and Sompolinsky, 2006), and Izhikevich neuron (Izhikevich, 2003) with direct random target projection (DRTP) (Frenkel et al., 2021). Other neuron and plasticity rules can be customised according to users' needs by simply using existing example cores as standard templates and introducing user-defined cores. In this section, we provide one specific example, where we use a LIF neuron model and BP, to implement a fully-connected multi-layer spiking neural network with winner-take-all functionality (Oster et al., 2009).

Leaky-integrate-and-fire neuron and Back-propagation
Leaky integrate-and-fire (LIF) is a simple and relatively computationally friendly neuron model. Most neuromorphic accelerators, with (e.g. Guo et al. (2019)) or without memristors (e.g. Merolla et al. (2011); Yin et al. (2017)), use LIF neurons. LIF has been implemented in NeuroPack using the equations below, adapted from the differential form (Gerstner et al., 2014) by assuming discretised time steps:

V_t = α V_{t-1} (1 - y_{t-1}) + W x_t
y_t = h(V_t - V_th)

where V_t represents the membrane voltage at time t, W is the weight, x_t is the incoming spike to the neuron (taken as 1 or 0), α ∈ [0, 1] is a leakage term, y_t is the output spike, h(x) is the step function and V_th is the neuron threshold. The equations describe that the neuron membrane voltage at any time step is determined by two parts: the weighted sum of incoming spikes at this time step and the membrane voltage at the last time step. If the membrane voltage surpasses the threshold, a spike is generated and the membrane voltage is reset. The cost function E at time step t is then given by:

E_t = (1/2) Σ_{i=1}^{N} (ŷ_{i,t} - y_{i,t})²

where N is the number of output neurons, i is the output neuron index, and ŷ_{i,t} is the correct firing state of output neuron i at time step t. Finally, weight changes are given by:

ΔW_k = -η ∂E_t/∂W_k = -η δ_k x_{k-1,t}^T,  k = 1, ..., K
δ_K = (y_t - ŷ_t) ⊙ h'(V_{K,t} - V_th)
δ_k = (W_{k+1}^T δ_{k+1}) ⊙ h'(V_{k,t} - V_th),  k < K

where η is the learning rate, k is the layer index, and K is the number of layers. W_k gives the weight matrix between layer k − 1 and k, ⊙ means element-wise multiplication, and δ gives the error back-propagated from the output neurons. The step function h(V_t − V_th) is discontinuous, so its derivative is replaced by noise as suggested in Lee et al. (2016). For the full derivation, please refer to the supplementary material. We note that the only potentially problematic variable is ŷ_{i,t}, which is provided in the stimuli file; everything else is accessible directly to NeuroPack.
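A minimal sketch of the output-layer BP update described above is given below, with the step-function derivative replaced by noise following Lee et al. (2016). The noise scale, seed and function names are our illustrative assumptions, not NeuroPack's exact code:

```python
import numpy as np

rng = np.random.default_rng(0)

def bp_output_update(W, x, y, y_hat, eta, noise_scale=0.1):
    """Gradient-descent update for the output layer of a spiking NN.

    The undefined derivative h'(V - V_th) of the step function is replaced
    by small random positive values (the 'noise' surrogate)."""
    h_prime = noise_scale * rng.random(y.shape)
    delta = (y - y_hat) * h_prime          # back-propagated output error
    return W - eta * np.outer(delta, x)    # W_new = W - eta * delta x^T

W = np.full((2, 3), 0.5)                   # 2 output neurons, 3 inputs
x = np.array([1.0, 0.0, 1.0])              # presynaptic spikes
W_new = bp_output_update(W,
                         x,
                         y=np.array([1.0, 0.0]),      # actual firing
                         y_hat=np.array([0.0, 0.0]),  # target firing
                         eta=0.1)
```

Weights connected to silent inputs (x = 0) stay unchanged, and the wrongly firing neuron's active weights are depressed, matching the sign convention of the equations above.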

Adding winner-take-all functionality
We now introduce winner-take-all (WTA) functionality, which constrains the output layer to have at most one firing neuron at each time step. This is done by adding one softmax layer and making the neuron with the largest softmax result fire:

s_{i,t} = exp(V_{i,t}) / Σ_{n=1}^{N} exp(V_{n,t})
y_{i,t} = 1 if i = argmax_n s_{n,t}, and 0 otherwise

The cost function correspondingly needs to be adjusted by taking a cross-entropy form:

E_t = −ln(s_{j,t})

where j is the index of the output neuron that should fire. Based on the new cost function, weight changes are calculated as:

ΔW_K = −η (s_t − ŷ_t) x_{K−1,t}^T

As before the application of WTA, all variables are accounted for; the freshly introduced softmax can be computed entirely within the same core file by simply adding network-level constraints. For the full derivation, please refer to the supplementary material.
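The WTA modification can be sketched as below. This is an illustrative stand-alone snippet (names are ours): a softmax over membrane voltages selects the single firing neuron, and the softmax/cross-entropy combination yields the familiar output error term s − ŷ:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())            # subtract max for numerical stability
    return e / e.sum()

def wta_fire(v):
    """Network-level WTA constraint: only the softmax winner fires."""
    y = np.zeros_like(v)
    y[np.argmax(softmax(v))] = 1.0
    return y

def cross_entropy(v, j):
    """Cost when output neuron j is the one that should fire."""
    return -np.log(softmax(v)[j])

v = np.array([2.0, 1.0, 0.1])          # output-layer membrane voltages
y = wta_fire(v)
delta = softmax(v) - np.array([1.0, 0.0, 0.0])   # output error s - y_hat
```

Because the softmax outputs sum to one, the components of `delta` always sum to zero, a quick sanity check on the gradient.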
In summary, to configure a multi-layer spiking neural network with WTA using LIF neurons with BP in NeuroPack, parameters including learning rate, noise scale, threshold and leakage are loaded from the configuration file. Input spikes encoded from stimuli files are fed to the "neuron core" in the core file. The "neuron core" reads the memristor states from the device array, calculates membrane voltages, and updates fire history during the inference phase. If training is enabled, inference results, ground truth information as well as internal variables are loaded into the "plasticity core", which then adjusts the memristive weights according to the calculated weight changes, precisely as summarised in Figure 1.

Memristor Model
When using virtual memristive devices, as is the case in Figure 1, a memristive model has to be used for predicting memristor RS as a function of read-out voltage and RS changes in response to plasticity events. Here we use an empirical memristor model proposed by Messaris et al. (2017b):

dR/dt = A_{p,n} (e^{|v|/t_{p,n}} − 1) (r_{p,n}(v) − R)²

where A_n and A_p are scaling factors. The (e^{|v|/t_{p,n}} − 1) term marks the main, exponential dependency of the switching rate on bias voltage, with t_n, t_p as fitting parameters extracted during modelling. Next, the last term encapsulates the dependence of the switching rate on the current resistive state with the aid of fitting parameters a_{0p}, a_{1p}, a_{0n}, a_{1n}. The main effect here is that the closer the RS is to the upper (lower) limit, the harder it is to continue pushing it further up (down), implementing "RS saturation". Finally, r_{p,n}(v) is a simple, 1st-order polynomial helper function that captures the exact nature of the switching rate's dependency on the current RS:

r_{p,n}(v) = a_{0p,n} + a_{1p,n} · v

Whilst models may take different forms (e.g. charge-flux models (Shin et al., 2010)), the fundamental condition that is observed is that the device model is always fed a series of time-wise discretised voltages and must then be able to calculate current and change in resistive state. Models that by default take current inputs are presently not supported. dt is treated as a pre-defined, fixed parameter and remains constant throughout simulation in the current implementation. NeuroPack organises memristor model instances in a virtual array with user-defined parameters as inputs, using the class ParametricDevice(*args) to create a device object and the methods initialise(R) and step_dt(V, dt) in the same class to set the current resistance and to send a pulse with magnitude V and pulsewidth dt to the memristor device, correspondingly. The "virtual memristor array" also provides methods such as read(w, b) and pulse(w, b, v, pw) to read the RS of the device at a given wordline w and bitline b address within the virtual array and to apply a pulse with magnitude v and pulsewidth pw to the device at said address, correspondingly. It is these functions that neuron models and learning rules in the core files use to access memristor RS and trigger RS changes. Using model parameter sets from different types of devices (different stacks or even devices with different underlying electrochemical mechanisms), as well as varying the parameters of the model in an exploratory fashion, are great tools for gaining an understanding of how different memristive devices can influence the outcome of basic machine learning algorithms.
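To make the device-model interface concrete, here is a deliberately simplified toy device with the same interface (`initialise`, `step_dt`) and the same qualitative structure: an exponential voltage term times a state-dependent window centred on a polynomial r(v). The parameter values and the linear (rather than squared) window are invented so the example runs; NeuroPack's ParametricDevice wraps the full published model:

```python
import math

class ToyDevice:
    """Toy stand-in for ParametricDevice: NOT the fitted Messaris model."""

    def __init__(self, a0p, a1p, tp, Ap, a0n, a1n, tn, An):
        self.p = (a0p, a1p, tp, Ap)   # positive-bias parameter set
        self.n = (a0n, a1n, tn, An)   # negative-bias parameter set
        self.R = None

    def initialise(self, R):
        self.R = R                    # set the current resistive state

    def step_dt(self, V, dt):
        """Apply a pulse of magnitude V and width dt; return the new RS."""
        a0, a1, t, A = self.p if V > 0 else self.n
        r = a0 + a1 * V                              # 1st-order helper r(v)
        rate = A * (math.exp(abs(V) / t) - 1.0) * (r - self.R)
        self.R += rate * dt           # RS drifts towards r(v), i.e. saturates
        return self.R

dev = ToyDevice(a0p=2e3, a1p=0.0, tp=0.5, Ap=1e-3,
                a0n=12e3, a1n=0.0, tn=0.5, An=1e-3)
dev.initialise(11e3)
R1 = dev.step_dt(1.0, 1e-3)           # positive pulse pulls RS down
R2 = dev.step_dt(-1.0, 1e-3)          # negative pulse pushes RS up
```

The window term makes each successive pulse less effective as R approaches r(v), reproducing the "RS saturation" behaviour discussed above.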

Weight Mapping and Updating
Most memristor-based neuromorphic designs (e.g. Hu et al. (2018) and Li et al. (2018)) store weights as memristor conductance and apply incoming inputs as voltages across memristors, so that multiplication results can be easily attained by measuring the current according to Ohm's law. Therefore, weights are mapped linearly to memristor conductance by default in NeuroPack, but users are allowed to define other mapping methods if needed. However, memristors being nonlinear devices, obtaining well-controlled weight changes is challenging because the resistive state after application of a triggering pulse depends both on the current resistance state and on the biasing parameters (typically pulse voltage and duration). To deal with this issue, NeuroPack includes a module for calculating biasing parameters expected to produce the desired weight changes. The inputs of this module are the memristor model parameters for initialising a memristor device, the current state, the target state, dt, and a list of tuples, each of which contains a magnitude and pulsewidth to represent a pulse.
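The default linear resistance-to-weight mapping can be sketched as below. The conductance bounds are example values (taken from the operating range used later in the paper); NeuroPack computes them from the device parameters:

```python
# Hedged sketch of the default linear mapping: weights live in [0, 1] and are
# stored as conductance G = 1/R rescaled between two conductance bounds.
# The bound values here are illustrative, not universal constants.

G_MIN, G_MAX = 5.3e-5, 4.48e-4      # siemens; example operating range

def resistance_to_weight(R):
    g = 1.0 / R
    return (g - G_MIN) / (G_MAX - G_MIN)

def weight_to_resistance(w):
    # Inverse mapping: used when a learning rule requests a target weight
    # and we need the target resistance to program into the device.
    return 1.0 / (G_MIN + w * (G_MAX - G_MIN))

w = resistance_to_weight(11e3)      # high-R device maps to a small weight
```

The two functions are exact inverses of each other, which is what lets the plasticity core translate a requested weight change into a target resistance.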
Inside the module, a ParametricDevice object is created by passing the memristor model parameters. The resulting resistance for each set of pulse parameters is calculated by calling the step_dt(V, dt) method. After all resulting resistance values are attained, the distance to the resistance expected to be written to the memristor is calculated, and the set of parameters that leads to the shortest distance is selected. The pseudocode of this module is included in the supplementary material.
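The selection logic reduces to a closest-prediction search, sketched below. The `predict` callable stands in for the device model's step method; its linear form is invented purely so the example runs:

```python
def select_pulse(r_current, r_target, pulse_options, predict):
    """Return the (voltage, pulsewidth) pair whose predicted outcome lands
    closest to the target resistance, plus the残 remaining distance."""
    best, best_dist = None, float("inf")
    for v, pw in pulse_options:
        r_pred = predict(r_current, v, pw)   # model-in-the-loop prediction
        dist = abs(r_pred - r_target)
        if dist < best_dist:
            best, best_dist = (v, pw), dist
    return best, best_dist

# Stand-in predictor: resistance moves proportionally to v * pw.
predict = lambda r, v, pw: r - 1e6 * v * pw

options = [(0.9, 1e-4), (1.0, 1e-4), (1.2, 1e-4)]
pulse, dist = select_pulse(11e3, 10.9e3, options, predict)
```

In NeuroPack the prediction comes from a freshly initialised ParametricDevice rather than a lambda, but the argmin structure is the same.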
The weight update scheme using the pulse parameter selection module is as follows: to begin with, the user-defined R-tolerance, defined as the maximum relative difference between target and actual resistance below which NeuroPack assumes the device state has converged, and the maximum number of updating steps are loaded into NeuroPack. After calculating the target resistance for a device, the actual resistance is read and the resulting value is used as the initial resistance in the pulse parameter selection module. After attaining the set of parameters that will lead to the closest resistance, a pulse is sent and the new resistance is read. This predict-write-verify process is repeated until the calculated difference is smaller than the R-tolerance or the maximum step number is reached.
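The predict-write-verify loop can be condensed into the sketch below. `read`, `write_pulse` and `choose_pulse` are stand-ins for the virtual array's methods and the selection module; the toy device behaviour is invented so the loop visibly converges:

```python
def update_weight(r_target, read, write_pulse, choose_pulse,
                  r_tol=0.01, max_steps=10):
    """Predict-write-verify: pulse the device until its resistance is within
    a relative R-tolerance of the target, or the step budget runs out."""
    for _ in range(max_steps):
        r = read()
        if abs(r - r_target) / r <= r_tol:      # converged within R-tolerance
            break
        write_pulse(choose_pulse(r, r_target))  # predict, then write
    return read()                               # final verify read

state = {"R": 11e3}
read = lambda: state["R"]
# Toy device: each pulse moves R 60% of the way towards the requested value.
write = lambda req: state.__setitem__("R", state["R"] + 0.6 * (req - state["R"]))
choose = lambda r, rt: rt
r_final = update_weight(10e3, read, write, choose, r_tol=0.01)
```

With a 1% R-tolerance the loop stops after a handful of pulses; widening the tolerance makes it stop earlier, which is exactly the cut-off effect studied in the results section.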

Customisation, usage scenarios and interface
With Python as its programming language, NeuroPack can be flexibly customised at both algorithm level and device level. At algorithm level, NeuroPack can apply user-defined neuron models and learning rules. As shown in Figure 1, the neuron core and plasticity core are placed in the same core file. NeuroPack provides four example core files, explained in detail in the section on neuron models and learning rules. By writing and selecting user-defined core files, users can apply any customised neuron models and plasticity rules. At device level, users can easily set and change memristor parameters to describe different device characteristics.
There are two main usage scenarios for NeuroPack. One is as a supporting tool to explore how different parameter values and settings affect classification performance in NIC tasks (Figure 2 (A)). In this scenario, users can load parameters in the configuration file with different values to monitor how memristor devices behave differently, or to investigate how classification accuracy or error rate is affected. The other usage scenario is to use NeuroPack to test and validate NN algorithms, including neuron models and learning rules (Figure 2 (B)). In this scenario, a user-defined core file or an example template is loaded to perform NN simulations. Users can then quickly test the algorithm and validate the idea by visualising and displaying memristor device states and other key NN variables (such as membrane voltages).
The visualisation and analysis tool in NeuroPack provides a user-friendly GUI to display inference results. This tool is separated from the main panel of NeuroPack. When executing a classification task, NeuroPack saves variables in every epoch, including membrane voltages, input stimuli, fire history, output errors, weights as measured from the memristors, etc., in a NumPy (.npz) file. Array-related variables such as weights are displayed in a colour map, and neuron-related variables, for example membrane voltages and fire history, are visualised as curves to show their changing tendencies.

RESULTS
We use an MNIST handwritten digit recognition task as an example application to validate NeuroPack. Original images with 28 × 28 pixels from the MNIST dataset have been cropped to 22 × 22 and binarised (see input images for 10 digits in Figure 3 (A)). Background pixels and digit pixels are represented as 0 and 1 respectively. A single image is unrolled to a 484-dimensional vector of 0s and 1s to be sent to the input neurons in parallel. The spike encoding scheme sends a spike when the input bit is 1, and no spike for 0. The neural network used in this task is a 484-input, 10-output, two-layer feedforward winner-take-all spiking neural network with the leaky integrate-and-fire neuron model and gradient descent learning rule (see Figure 3 (E) for the neural network architecture for this task). To map all 484 × 10 synapses, a 100 × 100 array of memristors is used. The network was trained for 10000 epochs before being fed with 2000 test images to evaluate classification accuracy. Parameters used in NeuroPack for the MNIST handwritten digit classification task can be found in Table 1. Memristor parameters listed in the table are based on the extraction method from Messaris et al. (2017a) and the devices presented in Stathopoulos et al. (2017), given a voltage range from 0.9 V to 1.2 V for positive bias and from −0.9 V to −1.2 V for negative bias, with 11 kΩ as the initialised resistance. Therefore, pulse options used to update memristors are all within those ranges. The model yielded an estimated memristor operating range between 2.23 and 12.8 kΩ given a bias voltage of ±1.2 V, and between 12.5 and 18.9 kΩ given a bias voltage of ±0.9 V. The resulting conductance caused by bias voltages of ±0.9 V to ±1.2 V is 5.3 × 10⁻⁵ to 4.48 × 10⁻⁴ S. To make sure the weights can be mapped to the range [0, 1], the linear mapping between memristor conductance G and weight w is given by:

w = (G − 5.3 × 10⁻⁵) / (4.48 × 10⁻⁴ − 5.3 × 10⁻⁵)

Before training, memristor RSs are initialised to 11 kΩ with a maximum variation of ±500 Ω.
Memristor initial RSs are mapped to small weights close to the bottom boundary of the operating range, given that conductance is the linear mapping of synaptic weights. Conductance sets before training, after 2000 epochs, and after 10000 epochs can be found in Figure 3 (B)-(D). During training, conductances associated with digit pixels and labelled output neurons are increased gradually, while other conductances stay small; therefore the targeted digits show up in the conductance sets along the training process. The weight sets show the same tendency, as the mapping is linear. The baseline given by the version storing weights directly, without using memristor models, achieved a test accuracy of 83.55%, which indicates the accuracy bottleneck is not the memristor model.
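The input encoding described above (crop, binarise, unroll) can be sketched as follows. The central-window crop offset is our assumption; the paper only states the cropped size:

```python
import numpy as np

def encode(image_28x28, threshold=0.5):
    """Crop a 28x28 image to a central 22x22 window, binarise, and unroll
    into a 484-element 0/1 spike vector (one element per input neuron)."""
    cropped = image_28x28[3:25, 3:25]            # central 22x22 window (assumed)
    spikes = (cropped > threshold).astype(np.uint8)
    return spikes.reshape(-1)                    # 484-dimensional vector

img = np.zeros((28, 28))
img[10:15, 10:15] = 1.0                          # fake 5x5 "digit" blob
x = encode(img)
```

Each 1 in the resulting vector triggers a spike at the corresponding input neuron; 0s stay silent, matching the encoding scheme of the experiment.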
We now use NeuroPack to illustrate how the intimately device- and programming-protocol-related issue of selecting an appropriate R-tolerance affects recognition accuracy. Figure 4 (A) and (B) show the training accuracy curves and test set accuracy results for different R-tolerance values. When the R-tolerance is small (within 1%), the accuracy is not affected significantly. With a larger R-tolerance, training accuracy increases initially but then starts to drop. This is also reflected in the corresponding test set accuracy. To investigate the causes, we look into the resistance changes in both stimulated and non-stimulated synapses with different R-tolerance values, as displayed in Figure 4 (C). The red line shows the baseline virtual resistance values calculated according to the weight mapping scheme for a stimulated synapse (specifically the one between input neuron 250 and output neuron 6) as yielded by a non-memristor, software synapse. The baseline shows a gradually decreasing tendency throughout the whole 10000 training epochs. In contrast, the resistance updates cut off increasingly early as R-tolerance increases: for 0.1%, 1%, 2%, and 3%, saturation occurs roughly at epochs ∼9000, 7000, 1000, and 100 respectively. Intuitively speaking, if we use a smooth, continuous curve to fit the baseline trace, we find that its gradient progressively decreases. This can be explained by the decreasing gradient of the cost function during the training process. This indicates that the required resistance changes between successive time steps reduce as training continues. Meanwhile, the R-tolerance is defined in NeuroPack as the relative resistance change |ΔR|/R below which an update is considered converged, so it can be regarded as the cut-off ratio of memristor RS change. In other words, when the resistance change between two time steps is smaller than the R-tolerance, the resistance update stops. Therefore, the larger the R-tolerance, the earlier memristor RSs stop updating. Figure 4 (D)-(H) shows the memristor RS sets before training (D), after 1000 epochs with R-tolerance of 2% (E) or 0.1% (F), and after 10000 epochs with R-tolerance of 2% (G) or 0.1% (H); the total range of memristor RSs is from 2.26 kΩ to 11.5 kΩ, corresponding to weights from 0.98 down to 0.086. There are three colour regions that can be clearly seen: blue (high resistive range), white (middle resistive range), and red (low resistive range). Before training, memristor RSs are initialised to the high resistive range. After 1000 epochs, both versions with R-tolerance of 2% and 0.1% show distinguishable high and middle resistive ranges, reflecting the increasing training accuracy before 1000 epochs in Figure 4 (A). After 10000 epochs, the version with R-tolerance of 0.1% clearly displays high, middle, and low resistive regions, while the low resistive region is merged into the middle region in the version with R-tolerance of 2%, because of the cut-off caused by a too-large R-tolerance. The version with R-tolerance of 2% is not able to distinguish specific images when the middle region becomes larger, as stimuli from different digits will all be assigned weights with middle values. Therefore, the accuracy curves for large R-tolerance values display a decreasing tendency after certain points in Figure 4 (A).
Finally, we present a comparison between two biasing schemes: a) bias voltages are only applied to selected devices (corresponding to selector-based crossbar array scenarios), and b) half-bias voltages are also applied to unselected devices in the same bitlines and wordlines (as in selectorless crossbars). Figure 5 (A) shows the accuracy of both versions. In the training accuracy curves, both versions show the same tendency with a noticeable gap in between. The test accuracy bar chart further displays the ∼20% gap. In order to explore the cause of the accuracy gap, we look into the resistance change curves of memristors representing a stimulated synapse (between input neuron 250 and output neuron 6) and a non-stimulated synapse (between input neuron 10 and output neuron 6). The baseline curves (red) are given by the virtual resistance values of the baseline, which stores weights directly without using memristor models. The resistance change for the stimulated synapse in the selector-based version (dark blue trace) shows the same decreasing tendency as the baseline (red), while the resistance in the selectorless version (orange) displays an increasing tendency. Zooming into epochs 0 to 200, we observe unexpected resistance increases when no resistance update should happen in the selectorless scenario. This is because some memristors representing synapses connected to neurons that are not supposed to fire have been given pulses to increase their weights according to the learning rule, and they share wordlines or bitlines with devices whose resistance is supposed to decrease. A single cycle of half-voltage bias only causes a trivial resistance increment, but many such unexpected cycles happen in the same time step, as there are many devices in the same wordlines/bitlines; therefore the resistance increments accumulate and cause a large gap to the baseline. For the non-stimulated synapse, both the selector-based (purple) and the baseline (green) traces stay around their initial values throughout the whole training phase, whilst unexpected resistance increases occur in the selectorless version. However, the resistance of the memristor representing the stimulated synapse is still slightly smaller than that of the non-stimulated synapse in the given examples in the selectorless scenario. Because of this slight resistance difference, the NN based on the selectorless array is still able to learn images and classify correctly in some cases. We present the memristor resistive states before and after training for both versions in Figure 5 (C)-(E). Due to the half-voltage bias, the memristor operating range changes to 2.23-28 kΩ, giving a new conductance range from 3.57 × 10⁻⁵ S to 4.48 × 10⁻⁴ S. Therefore, the linear mapping from conductance to weights is now changed to:

w = (G − 3.57 × 10⁻⁵) / (4.48 × 10⁻⁴ − 3.57 × 10⁻⁵)

Memristor RSs are initialised to ∼11 kΩ before training. After 10000 epochs, the resistive states of stimulated memristors in the selector-based array (Figure 5 (D)) decrease to the low resistive range (∼2.26 kΩ), while the non-stimulated ones stay in the high resistive range (∼11 kΩ). In the selectorless version, stimulated memristor RSs increase to a very high resistive range (∼18 kΩ), with non-stimulated ones increasing even higher (∼22 kΩ). Thus, NeuroPack has helped us uncover the perhaps surprising result that even in the presence of invasive unexpected weight updates the NN is still capable of distinguishing the MNIST digits substantially better than chance, albeit with a very different weight distribution than the selector-based network.
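The accumulation of half-bias disturbs can be illustrated with a deliberately crude model: writing one device also nudges every device sharing its wordline or bitline. The linear "disturb" response is entirely made up; the point is only how small per-write disturbances compound over many writes:

```python
import numpy as np

def write_with_half_bias(R, w, b, dR_full, dR_half):
    """Write device (w, b) with a full-bias update dR_full, while every other
    device on the same wordline or bitline receives a half-bias drift dR_half."""
    R[w, b] += dR_full                 # selected device: intended update
    R[w, :] += dR_half                 # same wordline: half-bias disturb
    R[:, b] += dR_half                 # same bitline: half-bias disturb
    R[w, b] -= 2 * dR_half             # selected device was disturbed twice above
    return R

R = np.full((4, 4), 11e3)              # toy 4x4 array, all devices at 11 kOhm
for _ in range(100):                   # many writes accumulate the disturbs
    R = write_with_half_bias(R, 0, 0, dR_full=-10.0, dR_half=+1.0)
```

After 100 writes the selected device has moved down as intended, devices sharing its lines have drifted upward despite never being selected, and unshared devices are untouched, mirroring the selectorless behaviour observed in Figure 5.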

CONCLUSION
In this paper, we presented NeuroPack, a versatile algorithm-level software emulator for memristor-based neuro-inspired computing systems. NeuroPack allows users to customise the simulator at both system level and device level. This platform can work as a standalone tool to emulate neuro-inspired computing with different neuron models, learning rules, memristor models, different types and numbers of memristor devices, different neural network architectures, and different applications. We further showcased an application example using NeuroPack to simulate a two-layer SNN for handwritten digit recognition with the MNIST dataset. We explored how different factors, such as R tolerance in weight updating and biasing methods in different array structures, affect system classification accuracy, and quickly reached two conclusions: a) even a surprisingly lax 1% tolerance in resistive state (an engineering parameter) allows for sufficient training efficiency to closely match the performance of an ideal model for this architecture and dataset, indicating that memristor-based systems may achieve competitive performance without requiring expensive precision circuits, at least in some scenarios; b) even in a scenario with unexpected weight updates due to the half-voltage biasing method in selectorless arrays, the network still learns to classify substantially better than chance.

DERIVATION OF BACK PROPAGATION FOR LEAKY INTEGRATE-AND-FIRE NEURON MODELS
The leaky integrate-and-fire neuron model in matrix form is:

$$V_t = \alpha V_{t-1} \odot (1 - y_{t-1}) + W x_t, \qquad y_t = h(V_t - V_{th})$$

where $V_t$ is the membrane voltage vector of the selected layer at time $t$, of size $n \times 1$ ($n$ is the total number of neurons in this layer); $x_t$ is the input signal fed to the selected layer at time $t$, of size $m \times 1$ ($m$ is the total number of neurons in the previous layer); $W$ is the weight matrix between the current layer and the previous layer, of size $n \times m$; $\alpha$ is the leakage factor; $\odot$ denotes the element-wise product; $V_{th}$ is the membrane voltage threshold; $h(\cdot)$ is the Heaviside step function; and $y_t$ is the output vector of the selected layer, of size $n \times 1$, containing only 0s and 1s to indicate whether a spike is generated. Furthermore, $\hat{y}_t$ is the correct output state for the output layer, with the same size as the output layer; $E$ is the cost function measuring the distance between $y_t$ in the output layer and $\hat{y}_t$,

$$E = \frac{1}{2} \sum_{i=1}^{N} (y_{i,t} - \hat{y}_{i,t})^2$$

where $N$ is the number of output neurons; and $\Delta W$ gives the weight update matrix, with the same size as $W$. Back-propagating the error to the hidden layers gives

$$\frac{\partial E}{\partial W_{l-1,t}} = \left( W_{l,t}^T \delta_{l,t} \odot h'(V_{l-1,t} - V_{th}) \right) x_{l-1,t}^T = \delta_{l-1,t} \, x_{l-1,t}^T, \quad \text{letting } \delta_{l-1,t} = W_{l,t}^T \delta_{l,t} \odot h'(V_{l-1,t} - V_{th}).$$

Therefore, the weight update matrix $\Delta W$ can be expressed as follows:

$$\Delta W_{l,t} = -\eta \, \delta_{l,t} \, x_{l-1,t}^T, \qquad \delta_{K,t} = (y_{K,t} - \hat{y}_t) \odot h'(V_{K,t} - V_{th})$$

where $K$ is the index of the output layer and $\eta$ is the learning rate.
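The matrix-form update can be sketched in NumPy as below (a minimal illustration of one simulation time step under the equations as written here, with a hard reset after firing; NeuroPack's own implementation may differ in details such as the reset behaviour):

```python
import numpy as np

def lif_step(V_prev, y_prev, W, x, alpha=0.9, V_th=1.0):
    """One time step of a leaky integrate-and-fire layer in matrix form.

    V_prev: (n,) membrane voltages at t-1
    y_prev: (n,) spike indicators (0/1) at t-1; fired neurons are reset
    W:      (n, m) weights from the previous layer
    x:      (m,) input spikes from the previous layer
    alpha (leakage factor) and V_th (threshold) are illustrative values.
    """
    V = alpha * V_prev * (1.0 - y_prev) + W @ x  # leak, reset, integrate
    y = (V >= V_th).astype(float)                # Heaviside firing decision
    return V, y
```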

DERIVATION OF BACK PROPAGATION FOR LEAKY INTEGRATE-AND-FIRE NEURON MODELS (WITH WINNER-TAKE-ALL)
To impose a winner-take-all, network-level restriction that allows at most one neuron to fire at each time step, a softmax layer is added after the previous output layer:

$$S_{i,t} = \frac{e^{Z_{i,t}}}{\sum_{k=1}^{N} e^{Z_{k,t}}}, \qquad Z_t = V_t \odot y_t$$

where $S_t$ is a vector of size $N \times 1$. With the new output representation taking continuous values, we apply the cross-entropy loss as the cost function:

$$E = -\sum_{i=1}^{N} \hat{y}_{i,t} \ln(S_{i,t}) = -\ln(S_{j,t})$$

where $j$ is the index of the output neuron that should fire according to the ground truth.
Therefore, the gradient for the remaining layers follows the same recursion as before:

$$\frac{\partial E}{\partial W_{l-1,t}} = \left( W_{l,t}^T \delta_{l,t} \odot h'(V_{l-1,t} - V_{th}) \right) x_{l-1,t}^T = \delta_{l-1,t} \, x_{l-1,t}^T, \quad \text{letting } \delta_{l-1,t} = W_{l,t}^T \delta_{l,t} \odot h'(V_{l-1,t} - V_{th}).$$

Therefore, the final weight update matrix takes the same form as in the derivation without winner-take-all, with the output-layer $\delta$ replaced by its winner-take-all version.
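A compact NumPy sketch of the winner-take-all output-layer error term follows (an illustration only: since the Heaviside derivative is zero almost everywhere, $h'$ is replaced here by a rectangular surrogate of assumed width, a common workaround that the paper does not prescribe):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift by the max for numerical stability
    return e / e.sum()

def wta_output_delta(V, y, y_hat, V_th=1.0, width=0.5):
    """delta = (S - y_hat) * (y + V * h'(V - V_th)), with S = softmax(V * y).

    The rectangular surrogate for h' is an assumption for illustration.
    """
    S = softmax(V * y)                                   # Z = V (elementwise) y
    h_prime = (np.abs(V - V_th) < width).astype(float)   # surrogate derivative
    return (S - y_hat) * (y + V * h_prime)
```

The output-layer gradient is then the outer product `np.outer(delta, x)` with the layer input, matching $\delta_{l,t} x_{l,t}^T$.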

Figure 3. Image recognition task with a TiOx memristor-based neural network simulated in NeuroPack. (A) Binary handwritten digits from the MNIST dataset, cropped to 22 × 22. Dark blue and yellow represent values 0 and 1, respectively. (B)-(D) Conductance sets before training (B), after 2000 epochs (C) and after 10000 epochs (D). Conductance spans from 8.69 × 10⁻⁵ S to 4.416 × 10⁻⁴ S, indicating final memristor RSs ranging from around 2264 Ω to 11507 Ω. (E) Concept diagram of the spiking neural network used in this image recognition task. In practice the network is fully connected. The 22 × 22-pixel images are unrolled into 484-bit inputs sent to 484 input neurons as input spikes. In this task, a two-layer network with 484 input neurons and 10 output neurons is applied. (F) Accuracy curves over the training process for both the memristor and non-memristor versions. A minibatch size of 100 is used to calculate the accuracy during training. Given 2000 images from a separate test set, the memristor and non-memristor versions achieved accuracies of 82.00% and 83.55%, respectively.
Figure 3 (F) shows the accuracy evolution during training with minibatch size = 100. After training, 2000 images from a separate test set were fed to the network, and 1640 of them were classified correctly, giving a general inference accuracy of 82.00%.

Figure 4 .
Figure 4. Classification accuracy as affected by R tolerance. (A) and (B) Accuracy in the training and test phases, respectively, for different R tolerance values. (C) Resistance changes of a memristor representing a stimulated synapse (between input neuron 250 and output neuron 6) during training for different R tolerance values. Notice that we use virtual resistance values for the baseline. (D)-(H) Memristor resistive states before training (D), after 1000 epochs for R tolerance of 2% (E) or 0.1% (F), and after 10000 epochs for R tolerance of 2% (G) or 0.1% (H). The total range of memristor RSs is from 2.26 kΩ to 11.5 kΩ, corresponding to weights from 0.98 to 0.086.
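R tolerance acts as the stopping criterion when a memristor is programmed towards a target resistance during a weight update. A minimal program-and-verify sketch is given below (illustrative only: `read_resistance` and `apply_pulse` are hypothetical callbacks standing in for NeuroPack's device model, not its actual API):

```python
def program_to_target(read_resistance, apply_pulse, R_target,
                      r_tolerance=0.01, max_pulses=100):
    """Pulse the device until its resistance is within a relative
    r_tolerance of R_target, or the pulse budget is exhausted.

    read_resistance() -> float and apply_pulse(direction) are
    hypothetical callbacks; direction -1 lowers resistance (SET),
    +1 raises it (RESET).
    """
    for _ in range(max_pulses):
        R = read_resistance()
        if abs(R - R_target) / R_target <= r_tolerance:
            return R                              # verified: within tolerance
        apply_pulse(-1 if R > R_target else +1)   # nudge towards the target
    return read_resistance()
```

A looser tolerance terminates earlier and leaves coarser weights; Figure 4 suggests that even a ∼1% tolerance barely degrades accuracy for this task.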

Figure 5 .
Figure 5. Classification using selector-based and selectorless memristor arrays. (A) Accuracy for selector-based and selectorless arrays. (B) Resistance changes in memristors representing a stimulated synapse (between input neuron 250 and output neuron 6) and a non-stimulated synapse (between input neuron 10 and output neuron 6). (C)-(E) Memristor resistive states before training (C) and after training for the selector-based (D) and selectorless (E) versions. The color bar shows memristor states from 2.26 kΩ to 25 kΩ, corresponding to weights in the range 0.98 to 0.01.

3 PSEUDOCODE FOR PULSE PARAMETER SELECTION MODULE
procedure PULSE PARAMETER SELECTION ACCORDING TO ∆W
    memristor = ParametericDevice(*args)    ⊲ Create a ParametericDevice object
    res = []    ⊲ Create an empty list to store resulting resistances
    for pulse in pulseList do    ⊲ Loop over all pulse parameter options
        memristor.initialise(R)    ⊲ Set current resistance to R
        for timestep in range(pulse[1] / dt) do    ⊲ pulse[1] stores the pulse width
            memristor.step_dt(pulse[0], dt)    ⊲ pulse[0] stores the magnitude
        end for
        res.append(memristor.Rmem)    ⊲ Append the resulting resistance to the list
    end for
    resDist = abs(res - R_expected)    ⊲ Distance between expected and computed resistances
    selectedPulseIndex = argmin(resDist)    ⊲ Find the index of the minimal distance
    selectedPulse = pulseList[selectedPulseIndex]    ⊲ Find the parameters of the chosen pulse option
end procedure

Figure 1. NeuroPack workflow. The system reads three configuration files: connectivity matrix, (neural) stimuli and config. The memristor model is embedded within the virtual memristor array, which initialises a number of memristor devices according to instructions in the configuration and connectivity matrix files. Neuron models and learning rules are placed in the core file. In the "updating fire history" stage, there are three separate steps: calculating fire states assuming neurons fire 'freely'; adding network-level constraints; and updating the fire history matrix. "Calculating ∆W" is supported by most learning rules, except STDP. By replacing the core file, users can apply different neuron models and learning rules.

In terms of applications, we use NeuroPack to demonstrate image classification on the MNIST dataset in our 'Results' section. We also analyse results for systems with different R tolerance, a parameter used in weight updating, and for two biasing methods, as examples showcasing how NeuroPack helps users investigate how key design, device and architectural factors affect memristor-based neuromorphic computing systems.
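The pseudocode can be rendered as runnable Python. In the sketch below the device is a toy stand-in, since the real `ParametericDevice` class and its constructor arguments live inside NeuroPack; only the selection logic mirrors the pseudocode:

```python
import numpy as np

class ToyDevice:
    """Stand-in for NeuroPack's ParametericDevice: resistance drifts in
    proportion to the applied voltage over each simulation step dt."""
    def __init__(self, k=500.0):
        self.k = k          # toy switching-rate coefficient
        self.Rmem = None
    def initialise(self, R):
        self.Rmem = R
    def step_dt(self, magnitude, dt):
        self.Rmem -= self.k * magnitude * dt

def select_pulse(R, R_expected, pulse_list, dt=1e-6, device=None):
    """Pick the (magnitude, width) pair whose simulated outcome lands
    closest to the expected resistance, as in the pseudocode."""
    memristor = device or ToyDevice()
    res = []
    for magnitude, width in pulse_list:
        memristor.initialise(R)                 # reset to the current resistance
        for _ in range(int(round(width / dt))):
            memristor.step_dt(magnitude, dt)    # integrate over the pulse width
        res.append(memristor.Rmem)
    res_dist = np.abs(np.array(res) - R_expected)
    return pulse_list[int(np.argmin(res_dist))]  # parameters of the best pulse
```

Note that the selected entry comes from the pulse list (the pulse parameters), not from the list of simulated resistances.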
Other user-defined models are also compatible with NeuroPack. With different parameter values, this model has shown the flexibility to capture a large range of memristor devices. The model expresses the RS switching rate ($dR/dt$) as a function of the initial RS and the biasing voltage. The switching rate equation is reproduced here for convenience:
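To illustrate how such a switching-rate model is consumed by the simulator, the sketch below Euler-integrates a $dR/dt$ model over a voltage pulse. The `switching_rate` function here is a hypothetical placeholder, not the actual empirical model (whose fitted form and parameters are given in the original model publication):

```python
def switching_rate(R, V):
    """Hypothetical placeholder for an empirical dR/dt model: the rate
    depends on the current resistive state R and the bias voltage V.
    The real model has its own fitted functional form and parameters."""
    return -1e3 * V * (R / 1e4)  # toy values, ohms per second

def apply_pulse(R0, V, width, dt=1e-6):
    """Euler-integrate dR/dt over a pulse of the given width (seconds)."""
    R = R0
    for _ in range(int(round(width / dt))):
        R += switching_rate(R, V) * dt
    return R
```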

Table 1. Parameters used in NeuroPack for the MNIST handwritten digit classification task.

Even in the selectorless scenario with unexpected weight updates, it may still be possible to decode useful information from the state of the system after training. These investigations illustrate the role NeuroPack can play in assisting users to design and validate neuro-inspired concepts and to improve system performance involving emerging nanoscale memory technologies. We envisage that, by varying datasets, biasing and other experimental parameters, device technology and connectivity patterns, users from across the community will be able to use this tool to generate the results that suit their needs quickly and efficiently.
arXiv:2201.03339v2 [cs.ET] 16 Feb 2022

According to the chain rule, $\partial E / \partial W_{l,t}$ for the output layer can be derived as below:

$$\frac{\partial E}{\partial W_{l,t}} = (y_{l,t} - \hat{y}_t) \odot h'(V_{l,t} - V_{th}) \cdot x_{l,t}^T = \delta_{l,t} \, x_{l,t}^T, \quad \text{letting } \delta_{l,t} = (y_{l,t} - \hat{y}_t) \odot h'(V_{l,t} - V_{th}).$$

In the winner-take-all case, letting $Z_{l,t} = V_{l,t} \odot y_{l,t}$, the output-layer gradient becomes

$$\frac{\partial E}{\partial W_{l,t}} = (S_t - \hat{y}_t) \odot \left( y_{l,t} + V_{l,t} \odot h'(V_{l,t} - V_{th}) \right) x_{l,t}^T = \delta_{l,t} \, x_{l,t}^T, \quad \text{letting } \delta_{l,t} = (S_t - \hat{y}_t) \odot \left( y_{l,t} + V_{l,t} \odot h'(V_{l,t} - V_{th}) \right).$$

From the output-layer results, we can back-propagate the errors and obtain the expressions for the other layers as before.