On the Self-Repair Role of Astrocytes in STDP Enabled Unsupervised SNNs

Neuromorphic computing is emerging as a disruptive computational paradigm that attempts to emulate various facets of the underlying structure and functionalities of the brain in the algorithm and hardware design of next-generation machine learning platforms. This work goes beyond the focus of current neuromorphic computing architectures on computational models of the neuron and synapse to examine other computational units of the biological brain that might contribute to cognition, and especially to self-repair. We draw inspiration and insights from computational neuroscience regarding the functionalities of glial cells and explore their role in the fault-tolerant capacity of Spiking Neural Networks (SNNs) trained in an unsupervised fashion using Spike-Timing Dependent Plasticity (STDP). We characterize the degree of self-repair that can be enabled in such networks with varying degrees of faults, ranging from 50 to 90%, and evaluate our proposal on the MNIST and Fashion-MNIST datasets.


INTRODUCTION
Neuromorphic computing has made significant strides over the past few years, both from the hardware (Merolla et al., 2014; Sengupta and Roy, 2017; Davies et al., 2018; Singh et al., 2020) and the algorithmic perspective (Diehl and Cook, 2015; Neftci et al., 2019; Sengupta et al., 2019; Lu and Sengupta, 2020). However, the quest to decode the operation of the brain has mainly focused on spike-based information processing in the neurons and plasticity in the synapses. Over the past few years, there has been increasing evidence that glial cells, and in particular astrocytes, play a crucial role in a multitude of brain functions (Allam et al., 2012). As a matter of fact, astrocytes represent a large proportion of the cell population in the human brain (Allam et al., 2012). There have also been suggestions that the complexity of astrocyte functionality can significantly contribute to the computational power of the human brain. Astrocytes are strategically positioned to ensheath tens of thousands of synapses, axons and dendrites, among others, thereby having the capability to serve as a communication channel between multiple components and to behave as a sensing medium for ongoing brain activity (Chung et al., 2015). This has led neuroscientists to conclude that astrocytes, in addition to neurons and synapses, play a major role in higher-order brain functions like learning and memory. Over the past few years, there have been multiple studies revising the neuron-circuit model for describing higher-order brain functions to incorporate astrocytes as part of a neuron-glia network model (Allam et al., 2012; Min et al., 2012). These investigations clearly indicate and quantify that incorporating astrocyte functionality in network models influences neuron excitability, synaptic strengthening and, in turn, plasticity mechanisms like Short-Term Plasticity and Long-Term Potentiation, which are important learning tools used by neuromorphic engineers.
The key distinguishing factors of our work against prior efforts can be summarized as follows: (i) While recent literature reports astrocyte computational models and their impact on fault-tolerance and synaptic learning (Allam et al., 2012;De Pittà et al., 2012;Gordleeva et al., 2012;Min et al., 2012;Wade et al., 2012), the studies have been mostly confined to small scale networks. This work is a first attempt to explore the self-repair role of astrocytes at scale in unsupervised SNNs in standard visual recognition tasks.
(ii) In parallel, there is a long history of implementing astrocyte functionality in analog and digital CMOS implementations (Irizarry et al., 2013;Nazari et al., 2015;Ranjbar and Amiri, 2015;Lee and Parker, 2016;Liu et al., 2017;Amiri et al., 2018;Karimi et al., 2018). More recently, emerging physics in post-CMOS technologies like spintronics are also being leveraged to mimic glia functionalities at a one-to-one level (Garg et al., 2020). However, the primary focus has been on a brain-emulation perspective, i.e., implementing astrocyte computational models with high degree of detail in the underlying hardware. We explore the aspects of astrocyte functionality that would be relevant to self-repair in the context of SNN based machine learning platforms and evaluate the degree of bio-fidelity required.
(iii) While Refs. (Hazan et al., 2019; Saunders et al., 2019b) discuss the impact of faults in unsupervised STDP enabled SNNs, self-repair functionality in such networks has not been studied previously.
While neuromorphic hardware based on emerging post-CMOS technologies (Jo et al., 2010; Kuzum et al., 2011; Ramakrishnan et al., 2011; Jackson et al., 2013; Sengupta and Roy, 2017) has made significant advancements in reducing the area and power efficiency gap of Artificial Intelligence (AI) systems, such emerging hardware is characterized by a host of non-idealities that have greatly limited its scalability. Our work provides motivation toward autonomous self-repair of such faulty neuromorphic hardware platforms. The efficacy of our proposed astrocyte enabled self-repair process is measured by the following steps: (i) training SNNs using unsupervised STDP learning rules in networks equipped with lateral inhibition and homeostasis, (ii) introducing "faults" in the trained weight maps by setting a randomly chosen subset of the weights to zero and (iii) implementing self-repair by re-training the faulty network with astrocyte functionality augmented STDP learning rules. We also compare our proposal with a sole STDP based retraining strategy and substantiate our results on the MNIST and F-MNIST datasets.

[FIGURE 1 | 2-AG is a local signal associated with each synapse while e-SP is a global signal. A1 is the astrocyte (Wade et al., 2012).]

Astrocyte Preliminaries
In addition to astrocyte mediated meta-plasticity for learning and memory (Nadkarni and Jung, 2004, 2007; Volman et al., 2007; Wade et al., 2012), there has been indication that retrograde signaling via astrocytes probably underlies self-repair in the brain. Computational models demonstrate that when faults occur in synapses corresponding to a particular neuron, an indirect feedback signal (mediated through retrograde signaling by the astrocyte via endocannabinoids, a type of retrograde messenger) from other neurons in the network implements repair functionality by increasing the transmission probability across all healthy synapses of the affected neuron, thereby restoring the original operation (Wade et al., 2012). Astrocytes modulate this synaptic transmission probability (PR) through two feedback signaling pathways: direct and indirect, responsible for synaptic depression (DSE) and potentiation (e-SP), respectively. Multiple astrocyte computational models (Nadkarni and Jung, 2004, 2007; Volman et al., 2007; Wade et al., 2012) describe the interaction of astrocytes and neurons via the tripartite synapse, where the astrocyte's sensitivity to 2-arachidonyl glycerol (2-AG), a type of endocannabinoid, is considered. Each time a post-synaptic neuron fires, 2-AG is released from the post-synaptic dendrite, which can be described as:

$$\frac{d(AG)}{dt} = \frac{-AG}{\tau_{AG}} + r_{AG}\,\delta(t - t_{sp}) \tag{1}$$

where AG is the quantity of 2-AG, τ_AG is the decay rate of 2-AG, r_AG is the 2-AG production rate and t_sp is the time of the post-synaptic spike.
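The decay-plus-impulse dynamics of Equation (1) can be illustrated with a simple Euler integration; the sketch below uses arbitrary illustrative parameter values, not the calibrated constants of Wade et al. (2012):

```python
import numpy as np

def simulate_2ag(spike_times, t_end=10.0, dt=1e-3, tau_ag=2.0, r_ag=1.0):
    """Euler integration of d(AG)/dt = -AG/tau_AG + r_AG * delta(t - t_sp).
    Parameter values are illustrative only."""
    steps = int(t_end / dt)
    ag = np.zeros(steps)
    spike_steps = {int(round(t / dt)) for t in spike_times}
    for k in range(1, steps):
        ag[k] = ag[k - 1] + dt * (-ag[k - 1] / tau_ag)  # exponential decay
        if k in spike_steps:
            ag[k] += r_ag  # impulse release on each post-synaptic spike
    return ag

# 2-AG jumps at each post-synaptic spike and decays in between
ag = simulate_2ag([1.0, 4.0])
```

Each post-synaptic spike injects a fixed quantity r_AG of 2-AG, which then decays with time constant τ_AG, reproducing the sawtooth-like traces reported in the astrocyte modeling literature.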
The 2-AG binds to receptors (CB1Rs) on the astrocyte process and instigates the generation of IP3, which subsequently binds to IP3 receptors on the Endoplasmic Reticulum (ER) to open channels that allow the release of Ca²⁺. It is this increase in cytosolic Ca²⁺ that causes the release of gliotransmitters into the synaptic cleft, which is ultimately responsible for the indirect signaling. The Li-Rinzel model (Li and Rinzel, 1994) uses three channels to describe the Ca²⁺ dynamics within the astrocyte: J_pump models how Ca²⁺ is pumped into the ER from the cytoplasm via the Sarco-Endoplasmic-Reticulum Ca²⁺-ATPase (SERCA) pumps, J_leak describes Ca²⁺ leakage into the cytoplasm and J_chan models the opening of Ca²⁺ channels by the mutual gating of Ca²⁺ and IP3 concentrations. The Ca²⁺ dynamics are thus given by:

$$\frac{d(Ca^{2+})}{dt} = J_{chan} + J_{leak} - J_{pump} \tag{2}$$

The details of the equations and their derivations can be obtained from Wade et al. (2012) and De Pittà et al. (2009). The intracellular astrocytic calcium dynamics control the glutamate release from the astrocyte, which drives e-SP. This release can be modeled by:

$$\frac{d(Glu)}{dt} = \frac{-Glu}{\tau_{Glu}} + r_{Glu}\,\delta(t - t_{Ca}) \tag{3}$$

where Glu is the quantity of glutamate, τ_Glu is the glutamate decay rate, r_Glu is the glutamate production rate and t_Ca is the time of the Ca²⁺ threshold crossing. The e-SP dynamics are modeled as:

$$\tau_{eSP}\,\frac{d(eSP)}{dt} = -eSP + m_{eSP}\,Glu(t) \tag{4}$$

where τ_eSP is the decay rate of e-SP and m_eSP is a scaling factor. Equation (4) substantiates that the level of e-SP is dependent on the quantity of glutamate released by the astrocyte. The released 2-AG also binds directly to pre-synaptic CB1Rs (direct signaling). A linear relationship is assumed between DSE and the quantity of 2-AG:

$$DSE(t) = AG(t)\,K_{AG} \tag{5}$$

[FIGURE 6 | Test accuracy of a 225 neuron network on the MNIST dataset with 70% faulty connections with normal and enhanced learning rates during the STDP re-training process. Re-training with the A-STDP rule is also depicted.]
where AG is the amount of 2-AG released by the post-synaptic neuron, found from Equation (1), and K_AG is a scaling factor. The PR associated with each synapse is given by:

$$PR(t) = \frac{PR(t_0)}{100}\,\left(100 + eSP(t) + DSE(t)\right) \tag{6}$$

where PR(t_0) is the initial PR of the synapse, and e-SP and DSE are given by Equations (4) and (5), respectively. In the computational models, the effect of DSE is local to the synapses connected to a particular neuron, whereas all the tripartite synapses connected to the same astrocyte receive the same e-SP. Under the no-fault condition, DSE and e-SP reach a dynamic equilibrium where the PR is unchanged over time, resulting in a fixed firing rate for the neurons. When a fault occurs, this balance subsides and the PR changes according to Equation (6) to restore the firing rate to its previous value. To showcase this effect, consider Figure 1, where a simple SNN with two post-synaptic neurons is depicted. Let us assume that each post-neuron receives input spikes from 10 pre-neurons. The initial PR of the synapses was set to 0.5. Figure 1A is the case with no faults, while in Figure 1B, faults have occurred after some time in 70% of the synapses associated with post-neuron N2 (Figure 2). Note that here "faults" imply that the synapses do not take part in transmission of the input spikes, i.e., have a PR of 0. This results in a drop of the firing frequency associated with N2, while the operation of N1 is not impacted. Thus, the amount of 2-AG released by N2 decreases, which increases DSE and in turn increases the PR of the associated synapses of N2 where no faults have occurred. Hence, we observe in Figure 2D that the increased PR recovers the firing rate and approaches the ideal firing frequency. Note that the degree of self-recovery, i.e., the difference between the recovered and ideal frequency, is a function of the fault probability. The simulation conditions and parameters for the modeling are based on Wade et al. (2012).
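The feedback loop described above can be caricatured at the rate level: the post-neuron's firing rate is proxied by the summed PR of its healthy synapses, 2-AG tracks that rate, DSE (a negative, depressing term) tracks 2-AG, and e-SP is held at its pre-fault equilibrium. The sketch below is a deliberately simplified toy, not the spiking simulation of Wade et al. (2012); all constants are illustrative:

```python
import numpy as np

def simulate_pr_recovery(n_syn=10, fault_at=200, t_end=400,
                         fault_frac=0.7, pr0=0.5, gain=200.0):
    """Toy rate-level sketch of Equation (6). Faults reduce the post-neuron's
    drive, 2-AG falls, the magnitude of (negative) DSE shrinks, and the PR of
    surviving synapses rises to partially restore the firing rate."""
    pr = np.full(n_syn, pr0)
    healthy = np.ones(n_syn, dtype=bool)
    ideal_rate = pr.sum()                  # pre-fault firing-rate proxy
    rates = []
    for t in range(t_end):
        if t == fault_at:                  # fault_frac of synapses fail (PR -> 0)
            healthy[: int(fault_frac * n_syn)] = False
            pr[~healthy] = 0.0
        rate = pr.sum()                    # firing-rate proxy
        ag = rate / ideal_rate             # 2-AG release tracks the firing rate
        esp = gain                         # e-SP at equilibrium (percent units)
        dse = -gain * ag                   # DSE depression term (percent units)
        pr[healthy] = np.clip(pr0 * (100 + esp + dse) / 100, 0.0, 1.0)
        rates.append(rate)
    return np.array(rates)

rates = simulate_pr_recovery()
```

Before the fault, e-SP and DSE cancel and the rate is fixed; after 70% of the synapses fail, the PR of the survivors is boosted and the rate climbs back toward, but does not fully reach, the ideal value, mirroring the partial recovery in Figure 2D.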
Interested readers are directed to Wade et al. (2012) for an extensive discussion on the astrocyte computational model and the underlying processes governing the retrograde signaling.
A key question that we have attempted to address in this work is the level of computational complexity at which the feedback mechanism needs to be modeled to implement autonomous repair in such self-learning networks. Simplifying the feedback modeling would enable us to implement such functionalities with efficient hardware primitives. For instance, the core functionality of astrocyte self-repair occurs in conjunction with STDP based learning in synapses. Figure 3 shows a typical STDP learning rule where the change in synaptic weight varies exponentially with the spike time difference between the pre- and post-neuron (Liu et al., 2018), according to measurements performed in rat glutamatergic synapses (Bi and Poo, 2001). Typically, the height of the STDP weight update window for potentiation/depression is constant (A+/A−). However, astrocyte mediated self-repair suggests that the weight update should be a function of the firing rate of the post-neuron (Liu et al., 2018). Assuming the fault-less expected firing rate of the post-neuron to be f_ideal and the non-ideal firing rate to be f, the synaptic weight update window height should be a function of Δf = f_ideal − f. The concept is explained further in Figure 3 and is also in accordance with Figure 2, where the PR increase after fault introduction varies in a non-linear fashion over time and eventually stabilizes once the self-repaired firing frequency approaches the ideal value. The functional dependence is assumed to be that of a sigmoid function, indicating that as the magnitude of the fault, i.e., the deviation from the ideal firing frequency of the neuron, increases, the height of the learning window increases in proportion to compensate for the fault (Liu et al., 2018). Note that the term "fault" for the machine learning workloads described herein refers to synaptic weights (symbolizing PR) stuck at zero.
Therefore, with an increasing amount of synaptic faults, f ≪ f_ideal, thereby increasing the STDP learning window height significantly. During the self-healing process, the frequency deviation gradually reduces, the re-learning rate thereby becomes less pronounced, and it finally saturates once the ideal frequency is reached. While our proposal is based on Liu et al. (2018), that prior work was explored in the context of a prototype artificial neural network with only 4 input neurons and 4 output neurons. Extending the framework to an unsupervised SNN based machine learning framework therefore requires significant explorations, highlighted next.
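The sigmoidal dependence of the learning window height on the rate deviation Δf = f_ideal − f can be sketched as follows; the function name and all constants (a_min, a_max, k) are illustrative choices, with only the sigmoidal form taken from Liu et al. (2018):

```python
import math

def stdp_window_height(f_ideal, f, a_min=0.01, a_max=0.1, k=0.5):
    """Sigmoid modulation of the STDP potentiation window height A+ as a
    function of the firing-rate deviation df = f_ideal - f. A larger fault
    (larger df) yields a taller learning window; as self-repair restores the
    rate, df -> 0 and the window height relaxes back toward its baseline."""
    df = f_ideal - f
    return a_min + (a_max - a_min) / (1.0 + math.exp(-k * df))

h_small = stdp_window_height(20.0, 19.0)  # nearly repaired network
h_large = stdp_window_height(20.0, 2.0)   # heavily faulty network
```

A heavily faulty neuron thus re-learns aggressively, while a nearly repaired one updates gently, matching the saturation behavior described above.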

Neuron Model and Synaptic Plasticity
We utilized the Leaky Integrate and Fire (LIF) spiking neuron model in our work. The temporal LIF neuron dynamics are described as:

$$\tau_{mem}\,\frac{dv(t)}{dt} = (v_{rest} - v(t)) + I(t) \tag{7}$$

where v(t) is the membrane potential, τ_mem is the membrane time constant, v_rest is the resting potential and I(t) denotes the total input to the neuron at time t, represented by the weighted summation of synaptic inputs. When the neuron's membrane potential crosses a threshold value, v_th(t), it fires an output spike and the membrane potential is reset to v_reset. The neuron's membrane voltage is fixed at the reset potential for a refractory period, δ_ref, after it spikes, during which it does not receive any inputs.
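A discrete-time version of these LIF dynamics can be obtained with a forward Euler step; the sketch below uses illustrative parameter values (the paper's actual settings are listed in Table 1):

```python
import numpy as np

def lif_step(v, i_in, refrac, tau_mem=100.0, v_rest=-65.0,
             v_reset=-60.0, v_th=-52.0, dt=1.0, ref_steps=5):
    """One Euler step of tau_mem * dv/dt = (v_rest - v) + I(t), with reset to
    v_reset on spiking and a refractory counter that clamps the membrane and
    ignores input. Operates on numpy arrays (one element per neuron)."""
    in_refrac = refrac > 0
    dv = dt / tau_mem * ((v_rest - v) + i_in)
    v = np.where(in_refrac, v_reset, v + dv)      # clamp during refractory period
    fired = (v >= v_th) & ~in_refrac
    v = np.where(fired, v_reset, v)               # reset on spike
    refrac = np.where(fired, ref_steps, np.maximum(refrac - 1, 0))
    return v, fired, refrac

# drive a single neuron with a constant supra-threshold input
v, refrac, n_spikes = np.array([-65.0]), np.array([0]), 0
for _ in range(500):
    v, fired, refrac = lif_step(v, np.array([30.0]), refrac)
    n_spikes += int(fired[0])
```

With a constant input the membrane charges toward v_rest + I, fires periodically, and is clamped at v_reset for ref_steps steps after each spike.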
In order to ensure that single neurons do not dominate the firing pattern, homeostasis (Diehl and Cook, 2015) is also implemented through an adaptive thresholding scheme. The membrane threshold of each neuron is given by the following temporal dynamics:

$$v_{th}(t) = \theta_0 + \theta(t), \qquad \tau_{\theta}\,\frac{d\theta(t)}{dt} = -\theta(t) \tag{8}$$

where θ_0 is a constant with θ_0 > v_rest, v_reset, and τ_θ is the adaptive threshold time constant. The adaptive threshold, θ(t), is increased by a constant quantity, θ+, each time the neuron fires, and decays exponentially according to the dynamics in Equation (8). A trace (Morrison et al., 2008) based synaptic weight update rule was used for the online learning process (Diehl and Cook, 2015; Saunders et al., 2019b). The pre- and post-synaptic traces are given by x_pre and x_post, respectively. Whenever the pre (post) -synaptic neuron fires, the variable x_pre (x_post) is set to 1; otherwise it decays exponentially to 0 with spike trace decay time constant, τ_trace. The STDP weight update rule is characterized by the following dynamics:

$$\Delta w = \begin{cases} \eta_{post}\,x_{pre} & \text{on post-synaptic spike} \\ -\eta_{pre}\,x_{post} & \text{on pre-synaptic spike} \end{cases} \tag{9}$$

where η_pre/η_post denote the learning rates for pre-synaptic/post-synaptic updates, respectively. The weights of the neurons are bounded in the range [0, w_max]. It is worth mentioning here that the sum of the weights associated with each post-synaptic neuron is normalized to a constant factor, w_norm (Saunders et al., 2019a).
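The trace-based rule of Equation (9) can be sketched for a full weight matrix as below; learning rates and the trace time constant are illustrative placeholders:

```python
import numpy as np

def stdp_update(w, x_pre, x_post, pre_spikes, post_spikes,
                eta_pre=1e-4, eta_post=1e-2, tau_trace=20.0, dt=1.0,
                w_max=1.0):
    """One step of the trace-based STDP rule of Equation (9) for a weight
    matrix w of shape (n_pre, n_post). Traces are pinned to 1 on a spike and
    decay exponentially otherwise."""
    # decay traces, then pin the traces of spiking neurons to 1
    x_pre = np.where(pre_spikes, 1.0, x_pre * np.exp(-dt / tau_trace))
    x_post = np.where(post_spikes, 1.0, x_post * np.exp(-dt / tau_trace))
    # potentiate on post-synaptic spikes, depress on pre-synaptic spikes
    w = w + eta_post * np.outer(x_pre, post_spikes.astype(float))
    w = w - eta_pre * np.outer(pre_spikes.astype(float), x_post)
    return np.clip(w, 0.0, w_max), x_pre, x_post

# pre-synaptic spikes followed one step later by a post-synaptic spike
w = np.full((2, 1), 0.5)
x_pre, x_post = np.zeros(2), np.zeros(1)
w, x_pre, x_post = stdp_update(w, x_pre, x_post,
                               np.array([True, True]), np.array([False]))
w, x_pre, x_post = stdp_update(w, x_pre, x_post,
                               np.array([False, False]), np.array([True]))
```

A pre-then-post spike ordering leaves a large pre-trace at the time of the post spike, so the weights potentiate, reproducing the causal branch of Figure 3.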

Network Architecture
Our SNN based unsupervised machine learning framework is based on single layer architectures inspired from cortical microcircuits (Diehl and Cook, 2015). Figure 4 shows the network connectivity of spiking neurons utilized for pattern-recognition problems. Such a network topology has been shown to be efficient in several problems, such as digit recognition (Diehl and Cook, 2015) and sparse encoding (Knag et al., 2015). The SNN under consideration has an Input Layer with the number of neurons equal to the dimensionality of the input data. Input neurons generate spikes by converting each pixel in the input image to a Poisson spike train whose average firing frequency is proportional to the pixel intensity. This layer connects in an all-to-all fashion to the Output Layer through excitatory synapses. The Output Layer has n LIF neurons characterized by homeostasis functionality. It also has static (constant weight) recurrent inhibitory synapses with weight value w_recurrent for lateral inhibition, achieving a soft Winner-Take-All (WTA) condition. Each neuron in the Output Layer has an inhibitory connection to all the neurons in that layer except itself. The trace-based STDP mechanism is used to learn the weights of all synapses between the Input and Output Layers. The neurons in the Output Layer are assigned classes based on their highest response (spike frequency) to input training patterns (Diehl and Cook, 2015).
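The rate-proportional Poisson encoding of the Input Layer can be sketched as follows; the maximum firing rate and time step are illustrative choices (not necessarily the values used in the paper's BindsNET setup):

```python
import numpy as np

def poisson_encode(image, time_steps=250, dt=1.0, max_rate=63.75):
    """Convert pixel intensities in [0, 1] to Poisson spike trains whose mean
    firing rate is proportional to intensity. max_rate is in Hz and dt in ms
    (illustrative values). Returns a boolean array of shape
    (time_steps, n_pixels)."""
    rates = image.flatten() * max_rate        # Hz, proportional to intensity
    p_spike = rates * dt / 1000.0             # spike probability per time step
    rng = np.random.default_rng(0)            # seeded for reproducibility
    return rng.random((time_steps, rates.size)) < p_spike

# a dark pixel never spikes; a bright pixel spikes at ~max_rate
spikes = poisson_encode(np.array([[0.0, 1.0]]))
```

Each pixel becomes an independent Bernoulli process per time step, which approximates a Poisson train for small per-step spike probabilities.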

Challenges and Astrocyte Augmented STDP (A-STDP) Learning Rule Formulation
One of the major challenges in extending the astrocyte based macro-modeling to such self-learning networks lies in the fact that the ideal neuron firing frequency is a function of the specific input class the neuron responds to. This is substantiated by Figure 5, which depicts the histogram distribution of the ideal firing rate of the winning neuron in the fault-less network. Further, due to sparse neural firing, the total number of output spikes of the winning neurons over the inference window is also small, thereby limiting the amount of information (number of discrete levels) that can be encoded in the frequency deviation, Δf. This leads to the question: Can we utilize another surrogate signal that gives us information about the degree of self-repair occurring in the network over time while being independent of the class of the input data? While the above challenge is related to the process of reducing the STDP learning window over time, we observed that using sole STDP learning, or STDP with a constant enhanced learning rate, consistently reduced the network accuracy over time (Figure 6). Figure 7 also depicts that under normal STDP retraining, faulty networks slowly lose their learnt representations over time. Re-learning all the healthy synaptic weights uniformly using STDP with an enhanced learning rate should at least result in some accuracy improvement for the initial epochs of re-training, even if the modulation of the learning window height over time is not incorporated in the self-repair framework. The degradation of network accuracy starting from the commencement of the retraining process signified that some additional factor may have been absent in the astrocyte functionality macro-modeling process, independent of the above challenge of modulating the temporal behavior of the STDP learning window.
In that regard, we draw inspiration from Equation (6), where we observe that the initial fault-free value of the PR acts as a scaling factor for the self-repair feedback terms DSE and e-SP. We perform a similar simulation for the network shown in Figure 1, with each neuron receiving input from 10 synapses. However, in this case, we set the initial PR of all of the synapses to 0.5, except one connected to N2, for which the initial PR was set to 0.1. In other words, 9 of the synapses connected to N2 have PR(t_0) = 0.5, while one has PR(t_0) = 0.1. The lower initial PR value symbolizes a weaker connection. The network is simulated for 400 s and at 200 s, the associated PR of 8 of the synapses with higher initial PR is reduced to 0 to signify the faulty condition (Figure 8). We observe that after the introduction of the faults, the PR of the synapses with the higher initial PR value is enhanced greatly compared to the one with the lower initial PR. This leads us to the conclusion that synapses that play a greater role in post-synaptic firing also play a greater role in the self-repair process compared to other synapses.
Since our unsupervised SNN is characterized by analog synaptic weights in the range [0, w_max], we hypothesized that this characteristic might underlie the accuracy degradation and designed a preferential self-repair learning rule for healthier synapses. This was found to result in significant accuracy improvement during the retraining process (discussed in the next section). Our A-STDP learning rule formulation is therefore also guided by the following question: Can we aggressively increase the healthy synaptic weights during the initial learning epochs in a manner that preserves the original representations learnt by the network?
Driven by the above observations, we formulated our Astrocyte Augmented STDP (A-STDP) learning rule for the self-repair process as:

$$\Delta w = \begin{cases} \eta_{post}\,x_{pre}\,\left(\dfrac{w}{w_{\alpha}}\right)^{\sigma} & \text{on post-synaptic spike} \\ -\eta_{pre}\,x_{post} & \text{on pre-synaptic spike} \end{cases} \tag{10}$$

where w_α represents the weight value at the α-th percentile of the network and serves as the surrogate signal to guide the retraining process. Figure 9 depicts the temporal behavior of w_α for the 98-th percentile of the weight distribution. After faults are introduced, w_α is significantly reduced and slowly increases over time during the re-learning process. It finally saturates at the bounded value w_max. The term (w/w_α)^σ ensures that the effective learning rate for healthier synapses (w > w_α) is much higher than the learning rate for weaker connections (w < w_α), while σ dictates the degree of non-linearity. Since w_α increases over time, the enhanced learning effect also reduces and finally stops once w_α saturates. It is worth mentioning here that α, σ and w_max are hyperparameters of the A-STDP learning rule. All hyperparameter settings and simulation details are presented in the next section.
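The potentiation branch of Equation (10) can be sketched as below; the α and σ values shown are illustrative defaults, not necessarily the settings used in the experiments:

```python
import numpy as np

def astdp_potentiation(w, x_pre, post_spikes, alpha=98, sigma=2.0,
                       eta_post=1e-2, w_max=1.0, eps=1e-12):
    """Potentiation term of the A-STDP rule of Equation (10) for a weight
    matrix w of shape (n_pre, n_post): the update is scaled by
    (w / w_alpha)**sigma, so synapses above the alpha-th weight percentile
    re-learn much faster than weak ones."""
    w_alpha = np.percentile(w, alpha) + eps       # surrogate repair signal
    scale = (w / w_alpha) ** sigma
    dw = eta_post * np.outer(x_pre, post_spikes.astype(float)) * scale
    return np.clip(w + dw, 0.0, w_max)

# a strong synapse (0.9) receives a far larger boost than a weak one (0.1)
w = np.array([[0.9], [0.1]])
w_new = astdp_potentiation(w, np.array([1.0, 1.0]), np.array([True]))
```

As re-training drives the surviving weights upward, w_α itself rises toward w_max, so the (w/w_α)^σ factor shrinks toward 1 and the enhanced learning effect self-terminates, as described above.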

RESULTS
We evaluated our proposal in the context of unsupervised SNN training on standard image recognition benchmarks under two settings: scaling in network size and scaling in dataset complexity. We used the MNIST (LeCun and Cortes, 2010) and Fashion-MNIST (Xiao et al., 2017) datasets for our analysis. Both datasets contain 28 × 28 grayscale images of handwritten digits / fashion products (belonging to one of 10 categories) with 60,000 training examples and 10,000 testing examples. All experiments are run in the PyTorch framework using a single GPU with a batch size of 16 images. In addition to the standard input preprocessing for generating the Poisson spike train, the images in the F-MNIST dataset also undergo Sobel filtering for edge detection before being converted to spike trains. The SNN implementation is done using a modified version of the mini-batch processing enabled SNN simulation framework (Saunders et al., 2019b) in BindsNET (Hazan et al., 2018), a PyTorch based package (Link). In addition to dataset complexity scaling, we also evaluated two networks of increasing size (225 and 400 neurons) on the MNIST dataset. For the MNIST dataset, the baseline test accuracy of the ideal network was 89.53 and 92.02%, respectively. A 400-neuron network was used for the F-MNIST dataset, with 77.35% accuracy. The baseline test accuracies are superior or comparable to prior reported accuracies for unsupervised learning on both datasets. For instance, Diehl and Cook (2015) reports 87% accuracy for an STDP trained network with 400 neurons, while Zhu and Wang (2020) reports a best accuracy of 73.1% for state-of-the-art clustering methods on the F-MNIST dataset. Table 1 lists the network simulation parameters used in this work. It is worth mentioning here that all hyperparameters were kept unchanged (from their initial values during training) in the self-repair process. We also kept the hyperparameters α and σ of the A-STDP rule unchanged for all fault simulations.
Figure 10 shows a typical ablation study of the hyperparameters α and σ. For this study, we trained a 225-neuron network with 90% faults. We divided the training set into training and validation subsets in the ratio of 5:1 through random sampling. The two accuracy plots shown in Figure 10 are for models retrained on the training subset and then evaluated on the held-out validation set. Further hyperparameter optimization for different fault conditions can potentially improve the accuracy even further. The network is first trained with the sole STDP learning rule for 2 epochs and the maximum test accuracy network is chosen as the baseline model. Subsequently, faults are introduced by randomly deleting synapses (from the Input to the Output Layer) post-training. Each synaptic connection was assigned a deletion probability, p_del, to decide whether the connection would be retained in the faulty network. For this work, p_del was varied between 0.5 and 0.9 to analyze the network and re-train after introducing faults. Note that the A-STDP learning rule is only used during this self-repair phase. It is worth mentioning that weight normalization by the factor w_norm (mentioned in section 2.2) is applied before starting the re-training process. This helps to adjust the magnitude of the firing threshold relative to the weights of the neurons (since the resultant weight magnitude diminishes due to fault injection). Figure 11 shows the test classification accuracy as a function of re-learning epochs for 225/400 neuron networks with 80% probability of faulty synapses. After the faults are introduced, the network accuracy improves over time during the self-repair process. The mean and standard deviation of test accuracy from 5 independent runs are plotted in Figure 11.
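The fault-injection step, followed by the per-neuron weight normalization applied before re-training, can be sketched as below; the w_norm value shown is an illustrative placeholder (the value actually used is listed in Table 1):

```python
import numpy as np

def inject_faults(w, p_del=0.8, w_norm=78.4, seed=0):
    """Randomly delete synapses with probability p_del (stuck-at-zero faults)
    for an input-to-output weight matrix w of shape (n_in, n_out), then
    re-normalize each output neuron's incoming weights to sum to w_norm."""
    rng = np.random.default_rng(seed)
    mask = rng.random(w.shape) >= p_del           # True = synapse survives
    w_faulty = w * mask
    col_sums = w_faulty.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0                 # guard fully-dead columns
    return w_faulty * (w_norm / col_sums), mask

# delete ~80% of 100x5 synapses, then re-normalize the survivors
w = np.ones((100, 5))
w_faulty, mask = inject_faults(w, p_del=0.8)
```

Normalizing the surviving weights restores the overall input drive relative to the firing thresholds, which would otherwise collapse after deleting most synapses.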
Figure 12 depicts the initial and self-repaired weight maps of the 225 (MNIST) and 400 (F-MNIST) neuron networks, substantiating that the original learnt representations are preserved during the re-learning process. Table 2 summarizes our results for all networks with varying degrees of faults. The numbers in parentheses denote the standard deviation in accuracy over the 5 independent runs. Since sole STDP learning resulted in accuracy degradation for most of the runs, its accuracy is reported after 1 re-learning epoch. For some cases, a degree of accuracy improvement through normal STDP was also observed.
The maximum accuracy is reported for the A-STDP re-training process. After repair through A-STDP, the network is able to achieve accuracy improvement across all levels of faults, ranging from 50 to 90%. Interestingly, A-STDP is able to repair faults even in a 90% faulty network and improve the testing accuracy by almost 9% (5%) for the MNIST (F-MNIST) dataset. Further, the accuracy improvement due to A-STDP scales up with increasing degree of faults. Note that the standard deviation of the final accuracy over 5 independent runs is much smaller for A-STDP than for normal STDP re-training, signifying that the astrocyte enabled self-repair is consistently stable, irrespective of the initial fault locations.

DISCUSSION
This work provides proof-of-concept results toward the development of a new generation of neuromorphic computing platforms that are able to autonomously self-repair faulty, non-ideal hardware operation. Extending beyond unsupervised STDP learning, augmenting astrocyte feedback in supervised gradient descent based training of SNNs needs to be explored, along with implementation on neuromorphic datasets (Orchard et al., 2015). In this work, we also focused on aspects of astrocyte operation that would be relevant from a macro-modeling perspective for self-repair. Further investigations on understanding the role of neuroglia in neuromorphic computing can potentially forge new directions related to synaptic learning and temporal binding, among others.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.