<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurosci.</journal-id>
<journal-title>Frontiers in Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-453X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnins.2023.1266003</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>First-spike coding promotes accurate and efficient spiking neural networks for discrete events with rich temporal structures</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Liu</surname> <given-names>Siying</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2313523/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/data-curation/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/software/"/>
<role content-type="https://credit.niso.org/contributor-roles/validation/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Leung</surname> <given-names>Vincent C. H.</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/2387953/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Dragotti</surname> <given-names>Pier Luigi</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/2517706/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/project-administration/"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff><institution>Communications and Signal Processing Group, Department of Electrical and Electronic Engineering, Imperial College London</institution>, <addr-line>London</addr-line>, <country>United Kingdom</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Timoth&#x000E9;e Masquelier, Centre National de la Recherche Scientifique (CNRS), France</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Malu Zhang, National University of Singapore, Singapore; Bernard Girau, Universit&#x000E9; de Lorraine, France; Seongsik Park, Korea Institute of Science and Technology (KIST), Republic of Korea</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Siying Liu <email>siying.liu20&#x00040;imperial.ac.uk</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>10</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>17</volume>
<elocation-id>1266003</elocation-id>
<history>
<date date-type="received">
<day>24</day>
<month>07</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>09</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Liu, Leung and Dragotti.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Liu, Leung and Dragotti</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>Spiking neural networks (SNNs) are well-suited to processing asynchronous event-based data. Most existing SNNs use rate-coding schemes that focus on the firing rate (FR), and so they generally ignore the spike timing in events. In contrast, methods based on temporal coding, particularly time-to-first-spike (TTFS) coding, can be accurate and efficient but are difficult to train. Currently, there is limited research on applying TTFS coding to real events, since traditional TTFS-based methods impose a one-spike constraint, which is not realistic for event-based data. In this study, we present a novel decision-making strategy based on first-spike (FS) coding that encodes the FS timings of the output neurons to investigate the role of the first-spike timing in classifying real-world event sequences with complex temporal structures. To achieve FS coding, we propose a novel surrogate gradient learning method for discrete spike trains. In the forward pass, output spikes are encoded into discrete times to generate FS times. During backpropagation, we develop an error-assignment method that propagates the error from FS times to spikes through a Gaussian window, and supervised learning for spikes is then implemented through a surrogate gradient approach. Additional strategies are introduced to facilitate the training of FS timings, such as adding empty sequences and employing different parameters for different layers. We make a comprehensive comparison between FS and FR coding in the experiments. Our results show that FS coding achieves comparable accuracy to FR coding while leading to superior energy efficiency and distinct neuronal dynamics on data sequences with very rich temporal structures. Additionally, a longer time delay in the first spike leads to higher accuracy, indicating that important information is encoded in the timing of the first spike.</p></abstract>
<kwd-group>
<kwd>spiking neural networks</kwd>
<kwd>first-spike coding</kwd>
<kwd>firing rate coding</kwd>
<kwd>time-to-first-spike</kwd>
<kwd>surrogate gradient</kwd>
<kwd>event-based data</kwd>
<kwd>temporal structures</kwd>
</kwd-group>
<counts>
<fig-count count="9"/>
<table-count count="7"/>
<equation-count count="17"/>
<ref-count count="62"/>
<page-count count="19"/>
<word-count count="13150"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Neuromorphic Engineering</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>The emergence of event-driven neuromorphic devices has given further impetus to the development of spiking neural networks (SNNs) (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>). SNNs more closely mimic biological neural systems by processing and transmitting information with sparse and asynchronous binary spikes (Pfeiffer and Pfeil, <xref ref-type="bibr" rid="B39">2018</xref>). By incorporating spike timing in their neuron model, SNNs have become effective tools for acquiring and processing temporal information (Wang et al., <xref ref-type="bibr" rid="B47">2020</xref>). Neuromorphic devices such as dynamic vision sensors (DVS) and dynamic audio sensors (DAS) produce asynchronous events which are well-suited to be used as the input of SNNs. Combining SNNs with the output of neuromorphic devices can potentially enable the development of power-efficient systems that more closely mimic biological processing.</p>
<p>SNNs process a sequence of spikes in each layer, which is referred to as spike trains. A spike train is mathematically defined by <inline-formula><mml:math id="M1"><mml:mi>s</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="script">I</mml:mi></mml:mrow></mml:mrow></mml:munder><mml:mi>&#x003B4;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, where <italic>t</italic><sub><italic>i</italic></sub> represents the timing of individual spikes in the set <inline-formula><mml:math id="M2"><mml:mrow><mml:mi mathvariant="script">I</mml:mi></mml:mrow></mml:math></inline-formula>. In terms of information encoding, rate coding and temporal coding are two distinct approaches in SNNs (Rullen and Thorpe, <xref ref-type="bibr" rid="B41">2001</xref>; Huxter et al., <xref ref-type="bibr" rid="B19">2003</xref>; Brette, <xref ref-type="bibr" rid="B7">2015</xref>; Kiselev, <xref ref-type="bibr" rid="B26">2016</xref>; Liu and Wang, <xref ref-type="bibr" rid="B29">2022</xref>). Rate coding focuses on the firing rate (FR) of neurons, in which information is represented by the average number of spikes within a certain time window. 
Although it is widely used and effective, rate coding is less efficient because it involves a large number of spikes and ignores the relative timing between spikes, which encodes important information about stimuli in the visual (Gollisch and Meister, <xref ref-type="bibr" rid="B14">2008</xref>), auditory (Heil, <xref ref-type="bibr" rid="B18">2004</xref>; Fontaine and Peremans, <xref ref-type="bibr" rid="B12">2009</xref>), and other systems (Panzeri et al., <xref ref-type="bibr" rid="B35">2001</xref>; Huxter et al., <xref ref-type="bibr" rid="B19">2003</xref>). Alternatively, temporal coding schemes rely on the precise timing of individual spikes, which potentially provides a faster and more efficient way of processing and transmitting signals. In particular, the first spike after a stimulus (Panzeri et al., <xref ref-type="bibr" rid="B35">2001</xref>; Johansson and Birznieks, <xref ref-type="bibr" rid="B23">2004</xref>) can reliably convey considerable information. This has inspired methods based on time-to-first-spike (TTFS) coding, which use fewer spikes and enable efficient computation (Bonilla et al., <xref ref-type="bibr" rid="B6">2022</xref>; Yu et al., <xref ref-type="bibr" rid="B57">2023</xref>). In practice, most of these methods force each neuron to fire at most one spike (Mostafa, <xref ref-type="bibr" rid="B31">2018</xref>; Kheradpisheh and Masquelier, <xref ref-type="bibr" rid="B25">2020</xref>; G&#x000F6;ltz et al., <xref ref-type="bibr" rid="B15">2021</xref>; Mirsadeghi et al., <xref ref-type="bibr" rid="B30">2021</xref>; Zhou et al., <xref ref-type="bibr" rid="B61">2021</xref>; Com&#x0015F;a et al., <xref ref-type="bibr" rid="B8">2022</xref>) or assume a very long refractory period after a spike (Kotariya and Ganguly, <xref ref-type="bibr" rid="B27">2021</xref>) to allow the computation of exact derivatives of postsynaptic spike times with respect to presynaptic times. 
This means that these networks can only process static inputs (Mostafa, <xref ref-type="bibr" rid="B31">2018</xref>; Zhou et al., <xref ref-type="bibr" rid="B61">2021</xref>; Com&#x0015F;a et al., <xref ref-type="bibr" rid="B8">2022</xref>; Sakemi et al., <xref ref-type="bibr" rid="B42">2023</xref>), such as spikes converted from the intensity of each pixel in an image, but not a continuous stream of events. Hence, this type of single-spike encoding is not biologically plausible.</p>
<p>In addition, there has been limited research investigating the temporal structures in neuromorphic data sequences and their role in SNNs. In the context of spiking signals, temporal structures refer to the patterns, changes, or behaviors that occur over time in the generation and transmission of these signals. Data sequences containing rich temporal structures indicate that useful information is encoded in the temporal domain. As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, various types of data produce varying degrees of temporal structures, which can yield diverse results when using different coding schemes. However, some widely used datasets lack diverse temporal structures in their sequences because events are generated by repeatedly moving a neuromorphic device around static images, such as neuromorphic MNIST (N-MNIST), N-Caltech101 (Orchard et al., <xref ref-type="bibr" rid="B33">2015</xref>), and CIFAR10-DVS (Li et al., <xref ref-type="bibr" rid="B28">2017</xref>). Results presented by Iyer et al. (<xref ref-type="bibr" rid="B20">2021</xref>) illustrated that rate-based SNNs outperform timing-based methods on the N-MNIST dataset. The authors argued that the spike timings of sequences in N-MNIST may not contain much useful information. Moreover, recent evidence (Jiang et al., <xref ref-type="bibr" rid="B22">2023</xref>) indicates that timing-based computation is superior in tasks involving abundant temporal information. As for TTFS coding, although some studies focused on event data of static scenes (Park et al., <xref ref-type="bibr" rid="B37">2020</xref>; Kotariya and Ganguly, <xref ref-type="bibr" rid="B27">2021</xref>), no studies have applied TTFS coding to real event sequences that exhibit rich temporal structures.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Event sequences with varying degrees of temporal structures (blue/red: positive/negative events). DVSGesture and DVSPlane are visual datasets that heavily rely on the spatial information in decision-making. The sequences of DVSGesture are periodic in the temporal domain, while DVSPlane data exhibits more complex temporal structures as it lacks temporal repetition. On the other hand, audio data sequences in SHD and N-TIDIGITS are non-repetitive and it is difficult to differentiate between classes solely based on spatial information. DVSGesture (Amir et al., <xref ref-type="bibr" rid="B2">2017</xref>), SHD (Cramer et al., <xref ref-type="bibr" rid="B9">2022</xref>), N-TIDIGITS (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>), DVSPlane (Afshar et al., <xref ref-type="bibr" rid="B1">2019</xref>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0001.tif"/>
</fig>
<p>We, therefore, propose a novel decision-making scheme based on first-spike (FS) coding that encodes the FS timings of output neurons for real-world event sequences with rich temporal structures, aiming to understand whether the timing of the FS plays a distinct role and whether it exploits the temporal information more effectively than the FR. FS coding differs from traditional TTFS-based studies in that information is conveyed without relying on exact timings, thereby eliminating the restriction that each neuron may fire at most once. Instead, we only encode the timing of the first spike in the output layer and use it in the loss function for supervised learning. However, supervised learning involving precise timing poses several challenges. First, it is impractical to use continuous time encoding due to the high computational cost caused by the substantial number of events generated by neuromorphic devices. In addition, exact gradients of continuous spike times with respect to spike trains are ill-defined, which prevents the standard backpropagation process used in conventional artificial neural networks (ANNs). Some existing methods overcome this issue by estimating the derivatives of continuous time with respect to the membrane potential around the threshold, such as SpikeProp (Bohte et al., <xref ref-type="bibr" rid="B5">2002</xref>) and its variations (Xu et al., <xref ref-type="bibr" rid="B53">2013</xref>; Shrestha and Song, <xref ref-type="bibr" rid="B44">2016</xref>, <xref ref-type="bibr" rid="B45">2017</xref>) and EventProp (Wunderlich and Pehle, <xref ref-type="bibr" rid="B50">2021</xref>). 
Other methods utilize the relationship between time and membrane potential to achieve supervised learning of precise spike timing, such as probabilistic models of firing intensity (Pfister et al., <xref ref-type="bibr" rid="B40">2006</xref>; Gardner et al., <xref ref-type="bibr" rid="B13">2015</xref>) and implicit differentiation on the equilibrium state (Xiao et al., <xref ref-type="bibr" rid="B52">2020</xref>, <xref ref-type="bibr" rid="B51">2023</xref>). Recent methods have attempted to calculate exact derivatives of the postsynaptic spike times with respect to presynaptic spike times (Com&#x0015F;a et al., <xref ref-type="bibr" rid="B8">2022</xref>) or potential (Zhang et al., <xref ref-type="bibr" rid="B59">2022</xref>). For example, Zhang et al. (<xref ref-type="bibr" rid="B59">2022</xref>) proposed a rectified linear postsynaptic potential function to alleviate problems such as the non-differentiable spike function, exploding gradients, and dead neurons during backpropagation in deep SNNs utilizing temporal coding. Most methods train the network to learn the timing of desired spike trains, restricting their adaptability to diverse input scenarios. Furthermore, the complicated rules of error propagation and the dependency between spike times in these methods limit their utilization in deep networks. Therefore, to simplify training and alleviate the restrictions on spike times, we exclusively apply discrete temporal coding to the output spikes in the final layer. In this way, we can concentrate on error propagation from output FS timings to subsequent spikes by leveraging surrogate gradient learning (Wu et al., <xref ref-type="bibr" rid="B49">2018</xref>; Neftci et al., <xref ref-type="bibr" rid="B32">2019</xref>; Yin et al., <xref ref-type="bibr" rid="B55">2021</xref>) for spikes. 
Specifically, the error of FS time in the output layer is propagated to multiple spikes through a Gaussian window, and then the SuperSpike method (Zenke and Ganguli, <xref ref-type="bibr" rid="B58">2018</xref>), based on surrogate gradient descent, is utilized to achieve the supervised learning of spikes in the network. Additionally, this approach enables a flexible configuration of the network architecture, which can include a combination of convolutional layers and fully connected (FC) structures with recurrent connections.</p>
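<p>The Gaussian error-assignment idea described above can be illustrated with a short sketch. This is our illustrative reading, not the authors' implementation; the function name and the window width <italic>sigma</italic> (in time steps) are assumptions.</p>

```python
import numpy as np

def fs_error_window(t_fs, T, sigma=3.0):
    """Gaussian window centred on the first-spike time step t_fs.

    A sketch of spreading the FS-time error over several time steps of
    the output spike train instead of through the first spike alone;
    sigma (in time steps) is an assumed parameter.
    """
    n = np.arange(T)
    g = np.exp(-0.5 * ((n - t_fs) / sigma) ** 2)
    return g / g.sum()  # normalise so the total error is preserved
```

<p>Multiplying the scalar FS-time error by these per-step weights yields an error signal over the whole spike train, which surrogate gradient methods such as SuperSpike can then backpropagate.</p>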
<p>Another difficulty in training SNNs is dealing with neurons that fail to generate any spikes within the given time window, commonly referred to as inactive neurons. This issue is particularly prevalent in training based on the first-spike time, as the error is only propagated through the first output spike. Consequently, the weights associated with subsequent spikes cannot be updated, leading to lower firing rates and increasing neuron inactivity. Additional strategies are usually necessary to solve this problem, such as weight initialization (Bohte et al., <xref ref-type="bibr" rid="B5">2002</xref>), a large penalty on inactive neurons (Mostafa, <xref ref-type="bibr" rid="B31">2018</xref>; Com&#x0015F;a et al., <xref ref-type="bibr" rid="B8">2022</xref>), and synchronization pulses as temporal biases (Com&#x0015F;a et al., <xref ref-type="bibr" rid="B8">2022</xref>). Hence, in this study, we design strategies to facilitate training based on FS timings, specifically for event sequences. First, we assign the error of inactive neurons across multiple steps rather than just one. Second, the time window is enlarged by adding empty sequences to reduce the number of inactive neurons that generate spikes beyond the observed window due to a significant time delay. Finally, to enhance the performance of FS coding, we apply smaller time constants and thresholds in the initial layers to effectively extract local features, while we use larger values in the final layers to facilitate decision-making based on previous stimuli.</p>
<p>In the experiments, we make a comprehensive comparison of the FS and FR coding schemes on several commonly used visual and auditory neuromorphic datasets, including DVSGesture (Amir et al., <xref ref-type="bibr" rid="B2">2017</xref>), SHD (Cramer et al., <xref ref-type="bibr" rid="B9">2022</xref>), N-TIDIGITS (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>), and DVSPlane (Afshar et al., <xref ref-type="bibr" rid="B1">2019</xref>). These data sequences demonstrate different levels of temporal structure, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. Results show that FS coding achieves accuracy comparable to FR coding, typically with a lower temporal delay. There is a trade-off between classification accuracy and first-spike latency: an appropriate temporal delay allows the network to make accurate decisions after receiving sufficient information. Furthermore, the FS models exhibit distinct neuronal behavior on different types of data sequences. In particular, networks based on FS coding demonstrate enhanced performance and superior energy efficiency on audio data sequences with very rich temporal structures. On the other hand, when processing visual data sequences containing repetitive signals and rich spatial information, the FS and FR models demonstrate similar neuronal dynamics and produce similar spike counts.</p></sec>
<sec sec-type="materials and methods" id="s2">
<title>2. Materials and methods</title>
<p>Consider a stream of events emitted by a neuromorphic sensor, <inline-formula><mml:math id="M3"><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mspace width="0.3em" class="thinspace"/><mml:mo>,</mml:mo><mml:mi>M</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, over a certain time window. An event <italic>e</italic><sub><italic>i</italic></sub> in continuous space and time can be represented as a function <italic>e</italic><sub><italic>i</italic></sub>(<italic><bold>x</bold></italic>, <italic>t</italic>) &#x0003D; <italic>p</italic><sub><italic>i</italic></sub>&#x003B4;(<italic><bold>x</bold></italic>&#x02212;<italic><bold>x</bold></italic><sub><italic>i</italic></sub>, <italic>t</italic>&#x02212;<italic>t</italic><sub><italic>i</italic></sub>), which means that an event with polarity <italic>p</italic><sub><italic>i</italic></sub> is emitted at the location <italic><bold>x</bold></italic><sub><italic>i</italic></sub> and at the timestamp <italic>t</italic><sub><italic>i</italic></sub>. The polarity <italic>p</italic><sub><italic>i</italic></sub> &#x0003D; &#x000B1;1 represents whether the brightness change is positive or negative. To reduce computational cost, we transform <inline-formula><mml:math id="M4"><mml:mrow><mml:mi>&#x003B5;</mml:mi></mml:mrow></mml:math></inline-formula> into a discretized spatio-temporal representation <italic><bold>E</bold></italic> as the input of SNNs. The input tensor contains <italic>T</italic> temporal bins by accumulating raw events at a resolution of &#x00394;<italic>t</italic>. Each pixel location takes the number of positive or negative events within each temporal bin. 
In this setup, every pixel is associated with two channels to indicate the polarity of events. As a result, for a vision sensor with an image plane of dimensions <italic>H</italic>&#x000D7;<italic>W</italic>, the input <italic><bold>E</bold></italic> forms a 4-D tensor of size 2 &#x000D7; <italic>H</italic>&#x000D7;<italic>W</italic>&#x000D7;<italic>T</italic>. Since audio events carry no polarity, the input <italic><bold>E</bold></italic> is represented as a 2-D tensor with dimensions <italic>F</italic>&#x000D7;<italic>T</italic>, where <italic>F</italic> denotes the number of channels of the audio sensor.</p>
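<p>The discretization described above can be sketched as follows. The helper name and the event tuple layout (x, y, t, p) are our assumptions for illustration only.</p>

```python
import numpy as np

def events_to_tensor(events, H, W, T, dt):
    """Bin a stream of (x, y, t, p) events into a 2 x H x W x T tensor.

    Hypothetical helper illustrating the discretisation in the text:
    each temporal bin of width dt counts the positive and negative
    events at every pixel, one channel per polarity.
    """
    E = np.zeros((2, H, W, T), dtype=np.float32)
    for x, y, t, p in events:
        n = int(t // dt)           # temporal bin index
        if n < T:                  # drop events outside the window
            c = 0 if p > 0 else 1  # channel 0: positive, 1: negative
            E[c, y, x, n] += 1.0
    return E
```

<p>For an audio sensor, the same temporal binning would be applied per frequency channel instead of per pixel, yielding an <italic>F</italic>&#x000D7;<italic>T</italic> tensor.</p>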
<p>In terms of the coding schemes of SNNs, spike trains can be encoded into different formats to convey information (Guo et al., <xref ref-type="bibr" rid="B16">2021</xref>). <xref ref-type="fig" rid="F2">Figure 2A</xref> presents a comparison of spike-based coding schemes in decision-making. FR coding focuses on the average spike count within a certain time window. In population coding, several neurons in each population capture different features of input stimuli over time, and their responses are combined to make a decision (Panzeri et al., <xref ref-type="bibr" rid="B34">2015</xref>). In burst coding, a burst of spikes is emitted at one time, in which information is carried in the spike count and the inter-spike intervals within the burst (Izhikevich et al., <xref ref-type="bibr" rid="B21">2003</xref>). Traditional TTFS coding restricts each neuron to fire at most once, and information is conveyed in the exact timings, whereas our FS coding focuses only on the first spike of the output neurons, since the output neuron that fires first determines the classification outcome.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Coding schemes and the SNN architecture. <bold>(A)</bold> Comparison of different spike-based coding schemes in decision-making. Our proposed FS coding is only applied to the output layer. <bold>(B)</bold> The SNN architecture for the classification problem using FR <bold>(top)</bold> and FS <bold>(bottom)</bold> coding for the output layer. The input events are represented with spatio-temporal grids <italic><bold>E</bold></italic>. The output spike trains in the final layer are encoded into FR and FS timings for classification. For FS coding, the predicted class corresponds to the neuron that fires the earliest spike, while for FR coding, the predicted class corresponds to the neuron that fires the most spikes.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0002.tif"/>
</fig>
<p>A standard multi-layer SNN architecture is used in this study, implemented with either convolutional layers or fully connected layers depending on the task. The overall dynamics is shown in <xref ref-type="fig" rid="F2">Figure 2B</xref>. The network takes the spatio-temporal representation of events <italic><bold>E</bold></italic> as the input, and each neuron generates a spike train of length <italic>T</italic>. In the output layer, FR or FS coding is used for classification. The models using FR or FS coding are denoted as the FR or FS model in the rest of the article. For the FR model, the predicted class is determined by the highest firing rate. For the FS model, output spike trains are encoded into temporal codes to obtain the FS timing of each neuron. The predicted class in this case depends on the earliest spike across all the output neurons.</p>
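<p>The two decision rules can be summarized in a brief sketch. The function names are ours, and the output spike trains are assumed to be binary arrays of shape (number of classes, <italic>T</italic>).</p>

```python
import numpy as np

def predict_fr(out_spikes):
    """FR decision: class of the output neuron with the most spikes.

    out_spikes: binary array of shape (num_classes, T).
    """
    return int(np.argmax(out_spikes.sum(axis=1)))

def predict_fs(out_spikes, T):
    """FS decision: class of the output neuron that fires first.

    Neurons that never fire are assigned time T, i.e. beyond the window,
    so they can never win the argmin.
    """
    has_spike = out_spikes.any(axis=1)
    first = np.where(has_spike, np.argmax(out_spikes, axis=1), T)
    return int(np.argmin(first))
```

<p>Note that for a binary spike train, <italic>np.argmax</italic> along the time axis returns the index of the first 1, i.e. the first-spike time step.</p>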
<p>Our SNN model can be seen as a hybrid system in which multiple spikes are transmitted in hidden layers, but only the first output spike is utilized to make a decision. However, the information within the hidden layers not only depends on the firing rate (FR) of neurons but also considers the order in which the spikes occur. This is because the first-spike timing is emphasized in the output layer, introducing the aspect of spike order as an informative factor. This distinguishes it from standard FR coding, where only the spike count over a certain time window holds significance.</p>
<p>We introduce the SNN model with discrete time encoding in Section 2.1, the error propagation through FS timings in Section 2.2, and strategies facilitating the training based on FS timings in Section 2.3.</p>
<sec>
<title>2.1. SNN model and time encoding</title>
<p>In this subsection, we introduce the current-based leaky integrate-and-fire (CUBA-LIF) neuron model in Section 2.1.1 and then extend it to a multi-layer SNN for event sequences in Section 2.1.2. In our SNN model, binary spikes are transmitted and processed between layers, and discrete time encoding is applied to the spike trains of the output layer to obtain the FS timings of the system.</p>
<sec>
<title>2.1.1. Neuron model</title>
<p>One of the most commonly used neuron models is the CUBA-LIF neuron. Consider a set of presynaptic neurons <italic>j</italic> &#x0003D; 1, 2, &#x022EF;&#x000A0;, <italic>J</italic> connected to a postsynaptic neuron <italic>i</italic>; the dynamics of the CUBA-LIF model is then as follows:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac></mml:mtd><mml:mtd><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0003C;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>&#x003B4;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac></mml:mtd><mml:mtd><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003C4;<sub><italic>m</italic></sub> and &#x003C4;<sub><italic>s</italic></sub> are the time constants of the membrane potential <italic>U</italic>(<italic>t</italic>) and the synaptic current <italic>I</italic>(<italic>t</italic>), respectively, and &#x003B4;(<italic>t</italic>) is the Dirac delta function representing a spike. Here, <italic>t</italic><sub><italic>j, m</italic></sub>&#x0003C;<italic>t</italic> is the firing time of the <italic>m</italic>th spike generated by the <italic>j</italic>th presynaptic neuron. Moreover, the synaptic weight between neurons <italic>i</italic> and <italic>j</italic> is denoted as <italic>w</italic><sub><italic>ij</italic></sub>, and <italic>U</italic><sub><italic>r</italic></sub> is the resting potential, which we set to <italic>U</italic><sub><italic>r</italic></sub> &#x0003D; 0.</p>
<p>Neuron <italic>i</italic> fires a spike when its membrane potential <italic>U</italic><sub><italic>i</italic></sub>(<italic>t</italic>) reaches a threshold &#x003B8;. After spiking, the potential drops below <italic>U</italic><sub><italic>r</italic></sub> and then recovers to <italic>U</italic><sub><italic>r</italic></sub> within a refractory period. In our model, the refractory period is ignored, meaning that <italic>U</italic><sub><italic>i</italic></sub>(<italic>t</italic>) is reset to <italic>U</italic><sub><italic>r</italic></sub> &#x0003D; 0 instantly.</p></sec>
<sec>
<title>2.1.2. SNN model with discrete time encoding</title>
<p>The CUBA-LIF neuron model is then discretized to construct a multi-layer SNN. Given an SNN model with <italic>L</italic> layers, the membrane potential <italic><bold>U</bold></italic><sup><italic>l, n</italic></sup> is evolved through layers <italic>l</italic> &#x0003D; 1, 2, &#x022EF;&#x000A0;, <italic>L</italic> and time steps <italic>n</italic> &#x0003D; 1, 2, &#x022EF;&#x000A0;, <italic>T</italic>. When the membrane potential of neuron <italic>i</italic> in layer <italic>l</italic> at time step <italic>n</italic> is greater than a threshold: <inline-formula><mml:math id="M6"><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02265;</mml:mo><mml:mi>&#x003B8;</mml:mi></mml:math></inline-formula>, a spike is generated and is denoted as <inline-formula><mml:math id="M7"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>. Otherwise <inline-formula><mml:math id="M8"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula>.</p>
<p>We follow the same discretization scheme used by Neftci et al. (<xref ref-type="bibr" rid="B32">2019</xref>). The updates of the synaptic current <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, membrane potential <inline-formula><mml:math id="M10"><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and spike firing from step <italic>n</italic> to <italic>n</italic>&#x0002B;1 are as follows:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M11"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:mo>=</mml:mo><mml:mi>&#x003B2;</mml:mi><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>Q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>Q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E3"><label>(3)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:mo>=</mml:mo><mml:mi>&#x003B1;</mml:mi><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msubsup><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E4"><label>(4)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M14"><mml:mi>&#x003B1;</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x00394;</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="M15"><mml:mi>&#x003B2;</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x00394;</mml:mi><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:math></inline-formula>. Moreover, in our models, &#x003C4; &#x0003D; &#x003C4;<sub><italic>s</italic></sub> &#x0003D; &#x003C4;<sub><italic>m</italic></sub>. The last term in Eq. (2) represents optional recurrent connections in the fully connected layer, where <italic>v</italic><sub><italic>ik</italic></sub> is the weight of the recurrent connection between the <italic>k</italic>th and <italic>i</italic>th neurons in the same layer <italic>l</italic>. The number of neurons in the <italic>l</italic>th layer is denoted as <italic>Q</italic>(<italic>l</italic>). In Eq. (3), the membrane potential <inline-formula><mml:math id="M16"><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is reset by multiplying by <inline-formula><mml:math id="M17"><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. 
Finally, the spiking process can be described as a step function of the membrane potential:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M18"><mml:mrow><mml:mi>s</mml:mi><mml:mo>=</mml:mo><mml:mi>f</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>U</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>U</mml:mi><mml:mo>&#x02265;</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>U</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>The forward propagation of a single neuron is shown as solid arrows in <xref ref-type="fig" rid="F3">Figure 3A</xref>. The synaptic current, membrane potential, and spikes are updated in both spatial and temporal domains using Eqs (2)&#x02013;(4).</p>
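For concreteness, the discretized forward pass of a single layer can be sketched as below. This is a minimal NumPy sketch, not the authors' implementation: the function name, parameter values, and the exact ordering of updates within a step are illustrative assumptions, with &#x003C4;<sub><italic>s</italic></sub> &#x0003D; &#x003C4;<sub><italic>m</italic></sub> &#x0003D; &#x003C4; as stated above.

```python
import numpy as np

def snn_forward_layer(s_prev, W, T, tau=5.0, dt=1.0, theta=1.0, V=None):
    """Simulate one discretized CUBA-LIF layer for T steps (Eqs. 2-5).

    s_prev : (T, Q_prev) binary input spikes from layer l-1
    W      : (Q, Q_prev) feedforward weights
    V      : optional (Q, Q) recurrent weights
    Returns a (T, Q) binary array of output spikes.
    """
    alpha = beta = np.exp(-dt / tau)   # tau_m = tau_s = tau, as in the text
    Q = W.shape[0]
    I, U, s = np.zeros(Q), np.zeros(Q), np.zeros(Q)
    out = np.zeros((T, Q))
    for n in range(T):
        # Eq. (3): integrate the previous step's current; reset by (1 - s)
        U = alpha * U * (1.0 - s) + (1.0 - alpha) * I
        # Eqs. (4)-(5): hard threshold produces a binary spike
        s = (U >= theta).astype(float)
        # Eq. (2): decay the current and add weighted input (and recurrent) spikes
        rec = V @ s if V is not None else 0.0
        I = beta * I + W @ s_prev[n] + rec
        out[n] = s
    return out
```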
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Forward and backward propagation in hidden and output layers. <bold>(A)</bold> Forward and backward propagation of a single neuron through spikes in the SNN model. The <italic>j</italic>th neuron in layer <italic>l</italic>&#x02212;1 is connected to the <italic>i</italic>th neuron in layer <italic>l</italic>. Connections between <inline-formula><mml:math id="M19"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>s</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M20"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> represent recurrent connections. <bold>(B)</bold> Discrete temporal encoding and error assignment from time-to-spike of neuron <italic>i</italic> in the output layer <italic>L</italic>. The output spike train is encoded into discrete times according to the time sequence, while the time of a silent step is encoded as a large value, denoted as <inline-formula><mml:math id="M21"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>f</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. During backpropagation, a Gaussian filter is used to distribute the error from one step to the others. 
(a) For a valid time of FS, the error <inline-formula><mml:math id="M22"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> is propagated to spikes <inline-formula><mml:math id="M23"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> through a Gaussian window; (b) for an inactive output neuron, the error is assigned to every time step. (c) As the Gaussian window size <italic>W</italic> increases, the error distribution becomes more similar to the error generated by (d) FR, where each spike is assigned the same error.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0003.tif"/>
</fig>
<p>The output spike trains of the network can be encoded into different formats for classification. To obtain the FS timings of neurons, temporal coding is required. We therefore apply discrete time encoding to the spike trains of neurons in the last layer <italic>L</italic>. According to the time sequences, a spike is encoded as its time step, but it is unclear how to encode a silent step that does not generate a spike. Directly encoding it as infinity would cause an error in the computation of the loss. We instead replace it with a fixed large value, denoted as <italic>t</italic><sup><italic>inf</italic></sup>, which should be greater than <italic>T&#x00394;t</italic>.</p>
<p>Specifically, the output time of the <italic>i</italic>th neuron at step <italic>n</italic> is given by:</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M24"><mml:mrow><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>h</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:mi>n</mml:mi><mml:mi>&#x00394;</mml:mi><mml:mi>t</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mi>t</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>f</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where</p>
<disp-formula id="E7"><label>(7)</label><mml:math id="M25"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>f</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>&#x00394;</mml:mi><mml:mi>t</mml:mi><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The discrete temporal coding process in the last layer <italic>L</italic> is illustrated in <xref ref-type="fig" rid="F3">Figure 3B</xref>. Spikes are encoded into discrete timestamps in accordance with the time sequence, while other steps are encoded with <italic>t</italic><sup><italic>inf</italic></sup>.</p>
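The encoding of Eqs. (6), (7) amounts to replacing each spike at step <italic>n</italic> with its timestamp <italic>n</italic>&#x00394;<italic>t</italic> and each silent step with <italic>t</italic><sup><italic>inf</italic></sup> &#x0003D; (<italic>T</italic>&#x0002B;1)&#x00394;<italic>t</italic>. A minimal sketch (the helper name and array layout are assumptions, not from the original implementation):

```python
import numpy as np

def temporal_encode(spikes, dt=1.0):
    """Encode output spike trains into discrete times (Eqs. 6-7).

    spikes : (T, C) binary array of output spikes; step n (1-indexed)
             maps to time n*dt, silent steps map to t_inf = (T + 1) * dt.
    """
    T, C = spikes.shape
    t_inf = (T + 1) * dt                          # Eq. (7)
    steps = (np.arange(1, T + 1) * dt)[:, None]   # n * dt, broadcast over neurons
    return np.where(spikes == 1, steps, t_inf)    # Eq. (6)
```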
<p>Finally, the time of the first spike from the <italic>i</italic>th output neuron, denoted as <inline-formula><mml:math id="M26"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula>, is given by the minimum of the temporal codes:</p>
<disp-formula id="E8"><label>(8)</label><mml:math id="M27"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>However, another concern arises when all the neurons fail to fire within the time window. The network cannot make a prediction by the first spike, because it becomes challenging to determine which neuron fires first. To solve this problem, we utilize the maximum membrane potential over time of each neuron <inline-formula><mml:math id="M28"><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo class="qopname">max</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> to facilitate the prediction. The higher the membrane potential, the more likely the neuron is to fire earlier. Therefore, the predicted label <italic>y</italic><sup><italic>P</italic></sup> corresponds to either the neuron that fires the first spike or, if all the output neurons are inactive, the one with the highest membrane potential.</p></sec></sec>
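The prediction rule above (first spike if any output neuron fires, otherwise the maximum membrane potential) can be sketched as follows. The helper name and array shapes are illustrative assumptions; the temporal codes are assumed to follow Eqs. (6)&#x02013;(8).

```python
import numpy as np

def predict_label(t_codes, U_trace, t_inf):
    """Predict the class label from first-spike times, falling back to the
    maximum membrane potential when no output neuron fires.

    t_codes : (T, C) temporal codes (spike time or t_inf per step)
    U_trace : (T, C) membrane potentials of the output layer over time
    """
    t_first = t_codes.min(axis=0)          # Eq. (8): FS time per neuron
    if np.all(t_first >= t_inf):           # no output neuron fired at all
        return int(U_trace.max(axis=0).argmax())  # U_i^max fallback
    return int(t_first.argmin())           # neuron with the earliest spike
```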
<sec>
<title>2.2. Backpropagation through FS timings</title>
<p>In this section, we propose a supervised learning framework for FS coding. First, we define the loss function as the cross-entropy loss based on the FS times of the output neurons. We use this to minimize the FS time of the target neuron and maximize that of non-target neurons:</p>
<disp-formula id="E9"><label>(9)</label><mml:math id="M29"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x02112;</mml:mi><mml:mrow><mml:mi>F</mml:mi><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>C</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mi>log</mml:mi><mml:mfrac><mml:mrow><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>C</mml:mi></mml:msubsup><mml:mrow><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msubsup><mml:mi>t</mml:mi><mml:mi>j</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003BB;</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mstyle 
displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mo>&#x0007B;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x0007C;</mml:mo><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup><mml:mo>&#x0003E;</mml:mo><mml:mi>T</mml:mi><mml:mi>&#x00394;</mml:mi><mml:mi>t</mml:mi><mml:mo>&#x0007D;</mml:mo></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo stretchy='false'>[</mml:mo><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>]</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>y</italic><sub><italic>i</italic></sub> is the target label (0 or 1) of the <italic>i</italic>th class, <italic>C</italic> is the number of output neurons (classes), and &#x003B1;<sub>0</sub>, &#x003B2;<sub>0</sub> and &#x003BB;<sub><italic>t</italic></sub> are constant coefficients, where &#x003B1;<sub>0</sub> is used to control the speed of training and prevent the exponential function from taking an excessively high value. The second term is a constraint to penalize a target neuron which never fires. For ease of notation, we use <inline-formula><mml:math id="M30"><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:math></inline-formula> to represent <inline-formula><mml:math id="M31"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>F</mml:mi><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in the rest of the article.</p>
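Eq. (9) can be transcribed directly into code. The sketch below is only an illustration: the coefficient values are placeholders, and the log-softmax form is used for numerical stability rather than the literal fraction.

```python
import numpy as np

def fs_loss(t_first, y, T, dt=1.0, alpha0=0.1, beta0=0.1, lam_t=0.01):
    """Cross-entropy over negated, scaled FS times (Eq. 9), plus a penalty
    on target neurons that never fire (t > T*dt)."""
    z = -alpha0 * t_first
    log_p = z - np.log(np.exp(z).sum())     # log-softmax of -alpha0 * t
    ce = -(y * log_p).sum()
    silent = t_first > T * dt               # neurons encoded beyond the window
    penalty = lam_t * (y[silent] * (np.exp(beta0 * t_first[silent]) - 1.0)).sum()
    return ce + penalty
```

An earlier first spike of the target neuron raises its softmax probability and hence lowers the loss, which is the intended training signal.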
<p>To compare FS with FR coding, similarly, a cross-entropy loss function maximizing the FR of the target neuron is used and is given by:</p>
<disp-formula id="E10"><label>(10)</label><mml:math id="M32"><mml:mrow><mml:msub><mml:mi>&#x02112;</mml:mi><mml:mrow><mml:mi>F</mml:mi><mml:mi>R</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>C</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mi>log</mml:mi><mml:mfrac><mml:mrow><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>f</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>C</mml:mi></mml:msubsup><mml:mrow><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>f</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where &#x003B1;<sub>1</sub> is a constant value and <inline-formula><mml:math id="M33"><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:mfrac><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the spike rate within <italic>T</italic> steps of the <italic>i</italic>th neuron in the last layer <italic>L</italic>. Since the training of FR usually does not suffer from inactive neurons, the constraint used in Eq. (9) is not required here.</p>
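For comparison, Eq. (10) is a standard softmax cross-entropy over firing rates and can be sketched analogously (function name and defaults are assumptions; log-softmax is again used for stability):

```python
import numpy as np

def fr_loss(spikes, y, alpha1=1.0):
    """Cross-entropy over firing rates (Eq. 10).

    spikes : (T, C) binary output spikes; y : one-hot target labels.
    """
    f = spikes.mean(axis=0)                 # spike rate f_i over T steps
    z = alpha1 * f
    log_p = z - np.log(np.exp(z).sum())     # log-softmax of alpha1 * f
    return -(y * log_p).sum()
```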
<p>To learn the weights <inline-formula><mml:math id="M34"><mml:mstyle mathvariant="bold-italic"><mml:mi>W</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M35"><mml:mstyle mathvariant="bold-italic"><mml:mi>V</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula>, since the discretization described in Eqs (2)&#x02013;(4) effectively leads to an SNN as visualized in <xref ref-type="fig" rid="F3">Figure 3A</xref>, we can simply perform standard error backpropagation, just as in conventional ANNs. However, there remain two challenges for the learning process based on FS timings. First, we have to propagate the error from the FS time <inline-formula><mml:math id="M36"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> to the postsynaptic spikes <inline-formula><mml:math id="M37"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> at the output layer. We, therefore, introduce a novel error assignment scheme in Section 2.2.1. Another obstacle is the non-differentiability of the spike function in Eq. (4), which describes the relationship between membrane potential and postsynaptic spikes. This can be overcome by using a surrogate gradient to approximate the gradient of the step function (see Section 2.2.4).</p>
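Since the specific surrogate is deferred to Section 2.2.4, the sketch below uses a fast-sigmoid surrogate purely as one common illustrative choice; the functional form and the slope <italic>k</italic> are assumptions, not the authors' stated method.

```python
import numpy as np

def surrogate_grad(U, theta=1.0, k=25.0):
    """Surrogate derivative of the step function f(U) in Eq. (5).

    The true gradient is zero almost everywhere (and undefined at theta),
    so backprop replaces it with a smooth approximation; here a
    fast-sigmoid shape peaked at the threshold (illustrative choice).
    """
    return k / (1.0 + k * np.abs(U - theta)) ** 2
```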
<sec>
<title>2.2.1. Error propagation from FS times to spikes</title>
<p>The error of the FS time <inline-formula><mml:math id="M38"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> is computed by the loss function, denoted as <inline-formula><mml:math id="M39"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula>. To enable the flow of error throughout the entire network, it is necessary to propagate the error for the single (first) step to all the associated spikes in the last layer. This process is divided into two steps: first, from the FS time <inline-formula><mml:math id="M40"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> to all steps <inline-formula><mml:math id="M41"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, and then from times <inline-formula><mml:math id="M42"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> to spikes <inline-formula><mml:math 
id="M43"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. According to the chain rule, the error of the spike in the output layer can be computed based on the error of the FS time and related gradients as follows:</p>
<disp-formula id="E11"><label>(11)</label><mml:math id="M44"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi 
mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The key issue is to compute the gradient <inline-formula><mml:math id="M45"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> and <inline-formula><mml:math id="M46"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula>. <xref ref-type="fig" rid="F3">Figure 3B</xref> illustrates the error assignment through these two steps.</p></sec>
<sec>
<title>2.2.2. From FS times to temporal codes</title>
<p>First, for active neurons, since the FS time is given by the minimum of the temporal codes, the error of the FS time relates only to the corresponding time step; that is, the derivative of the FS time with respect to the temporal code <inline-formula><mml:math id="M47"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is 1 only when <inline-formula><mml:math id="M48"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is the first-spike time. Specifically,</p>
<disp-formula id="E12"><label>(12)</label><mml:math id="M49"><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x02202;</mml:mo><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup></mml:mrow><mml:mrow><mml:mo>&#x02202;</mml:mo><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>t</mml:mi><mml:mi>i</mml:mi><mml:mi>&#x02131;</mml:mi></mml:msubsup><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:mi>o</mml:mi><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>w</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>e</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>However, for inactive neurons, propagating the error through a single step would make the weights difficult to update. To address this issue, the error <inline-formula><mml:math id="M50"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> is assigned to all time steps for inactive neurons, which means that <inline-formula><mml:math id="M51"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="script">F</mml:mi></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> is always equal to 1. These two cases for active and inactive neurons are illustrated in <xref ref-type="fig" rid="F3">Figures 3B(a</xref>, <xref ref-type="fig" rid="F3">b</xref>).</p>
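<p>The two branches of this error assignment can be sketched in a few lines of Python. This is a minimal illustrative sketch, not the authors' implementation; the function name and the convention that steps with no spike hold a sentinel value <italic>t</italic><sub>max</sub> are our assumptions.</p>

```python
def fs_time_grad(temporal_codes, t_max):
    """Derivative of the first-spike (FS) time w.r.t. the per-step temporal
    codes, following Eq. (12) plus the all-steps rule for inactive neurons.

    temporal_codes: one spike time per time step; a step with no spike
    holds the sentinel t_max (an assumed convention for this sketch).
    """
    fs_time = min(temporal_codes)
    if fs_time < t_max:
        # Active neuron: the gradient is 1 only at the first-spike step(s).
        return [1.0 if t == fs_time else 0.0 for t in temporal_codes]
    # Inactive neuron: spread the error uniformly over all time steps.
    return [1.0] * len(temporal_codes)
```

<p>For an active neuron the gradient selects the first-spike step as in Eq. (12); for an inactive neuron it is 1 at every step, so the total firing time is pushed in the direction required by the loss.</p>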
<p>Note that the strategy for inactive neurons has opposite effects on target and non-target neurons. This contrast arises because the loss function aims to minimize the first-spike time for target neurons while maximizing it for non-target neurons. Hence, assigning error to all time steps is equivalent to minimizing (maximizing) the total firing time for target (non-target) neurons. Consequently, this strategy promotes the activation of dormant target neurons while reinforcing the inactivity of non-target neurons that are already inactive.</p></sec>
<sec>
<title>2.2.3. From temporal codes to spikes</title>
<p>The second gradient <inline-formula><mml:math id="M52"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> cannot be computed directly. In inference, temporal codes are exclusively linked to their corresponding spikes at a single step: the timestamp <inline-formula><mml:math id="M53"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is generated from <inline-formula><mml:math id="M54"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> at the same step <italic>n</italic>. However, it is essential to involve spikes occurring around step <italic>n</italic> (<italic>m</italic>&#x02260;<italic>n</italic>) in the optimization of the spike time at step <italic>n</italic>. During learning, changes in spike times lead to changes in the connection weights. If the optimization focused only on a single spike at step <italic>n</italic> and its corresponding weights, the weight update would be inherently unstable. For target neurons, the learning process optimizes not only the spike at step <italic>n</italic> but also the spikes and associated weights from earlier steps (<italic>m</italic>&#x0003C;<italic>n</italic>) to reduce output times. For non-target neurons, optimization should also involve spikes and related weights from later steps (<italic>m</italic>&#x0003E;<italic>n</italic>). Furthermore, since spike times can shift by at most a few time steps in each iteration, the impact of the error at step <italic>n</italic> diminishes as the time step moves farther from <italic>n</italic>; spikes closer to step <italic>n</italic> should therefore receive a larger error assignment.</p>
<p>Therefore, a surrogate gradient needs to be designed to distribute error from <inline-formula><mml:math id="M55"><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> to <inline-formula><mml:math id="M56"><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. A Gaussian window is used to ensure smooth and gradual weight updates. In addition, the error of spikes should have the opposite sign to the error of times, because decreasing the spike time is equivalent to increasing the probability of spike firing in early steps, in other words, increasing the value of spikes from 0 to 1. Specifically, our approach is to distribute the error of the time at step <italic>m</italic> to the spikes around it through a negative Gaussian window:</p>
<disp-formula id="E13"><label>(13)</label><mml:math id="M57"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>m</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msubsup><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mo>-</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>g</italic>(<italic>x</italic>) is given by:</p>
<disp-formula id="E14"><label>(14)</label><mml:math id="M58"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x003C0;</mml:mi><mml:mi>&#x003C3;</mml:mi></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>A</italic>&#x0003E;0 is the amplitude and <inline-formula><mml:math id="M59"><mml:mi>&#x003C3;</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo>&#x0230A;</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mrow><mml:mi>D</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo>&#x0230B;</mml:mo></mml:mrow></mml:math></inline-formula> is the standard deviation, determined by the length of the sequence <italic>T</italic> and a constant factor <italic>D</italic>.</p>
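<p>Under these definitions, the window of Eqs. (13), (14) can be evaluated directly. The sketch below assumes integer <italic>T</italic> and <italic>D</italic> with <italic>D</italic> &#x02264; <italic>T</italic> (so that &#x003C3; &#x02265; 1); the function name is illustrative and not from the paper's code.</p>

```python
import math

def gaussian_window_grad(m, n, T, A, D):
    """Negative Gaussian window of Eqs. (13)-(14): the surrogate gradient
    distributing the error of the temporal code at step m to the spike at
    step n.  g(x) = -(A / (2*pi*sigma)) * exp(-x^2 / (2*sigma^2)),
    with sigma = floor(T / D)."""
    sigma = T // D  # assumes D <= T so that sigma >= 1
    x = m - n
    return -(A / (2 * math.pi * sigma)) * math.exp(-(x ** 2) / (2 * sigma ** 2))
```

<p>The window is symmetric in <italic>m</italic> &#x02212; <italic>n</italic>, always negative (the opposite sign to the time error, as argued above), and most negative at <italic>m</italic> &#x0003D; <italic>n</italic>.</p>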
<p>The width of the Gaussian window is determined by the 3-sigma limit and the length of the whole sequence, <italic>W</italic> &#x0003D; min{6&#x003C3;&#x0002B;1, <italic>T</italic>}. As depicted in <xref ref-type="fig" rid="F3">Figure 3B(c</xref>), when the factor <italic>D</italic> increases, both the standard deviation &#x003C3; and the window size <italic>W</italic> decrease, which means that the error assignment only focuses on the time steps surrounding the first spike. When <italic>D</italic> &#x02192; &#x0221E;, &#x003C3; &#x02192; 0, and <italic>W</italic> &#x02192; 1, the error is only assigned to the current step. On the contrary, when <italic>D</italic> &#x02192; 0, &#x003C3; &#x02192; &#x0221E;, and <italic>W</italic>&#x02192;<italic>T</italic>, the error is propagated to all the steps with approximately the same value, which is similar to the error propagation of the loss based on FR [see <xref ref-type="fig" rid="F3">Figure 3B(d</xref>)]. The parameters <italic>A</italic> and <italic>D</italic> are determined empirically. The value of <italic>A</italic> should not be too small, usually around 2<italic>T</italic>; otherwise, training is slow and performance deteriorates. The values of <italic>D</italic> and the window size <italic>W</italic> have a significant impact on the performance and are discussed further in Section 3.5.</p></sec>
<sec>
<title>2.2.4. Surrogate gradient descent training through spikes</title>
<p>After propagating error from FS times to spikes in the output layer, the error can be propagated through spikes in the rest of the network. To solve the non-differentiability of the spike function, a surrogate gradient (Zenke and Ganguli, <xref ref-type="bibr" rid="B58">2018</xref>) is used to estimate the derivative of postsynaptic spikes with respect to membrane potential. Specifically, the gradient of the step function in Eq. (5) is estimated using a fast sigmoid function in the backward pass:</p>
<disp-formula id="E15"><label>(15)</label><mml:math id="M60"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02248;</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>-</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
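<p>Eq. (15) is straightforward to evaluate; a minimal one-line sketch (argument names are ours; &#x003C1; &#x0003D; 5 matches the value used in the experiments of Section 3):</p>

```python
def fast_sigmoid_surrogate(u, theta=1.0, rho=5.0):
    """Fast-sigmoid surrogate gradient of the step spike function,
    Eq. (15): f'(U) ~ 1 / (1 + rho * |U - theta|)^2."""
    return 1.0 / (1.0 + rho * abs(u - theta)) ** 2
```

<p>The surrogate peaks at 1 when the membrane potential equals the threshold and decays smoothly on both sides, so gradients flow even for neurons that did not spike.</p>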
<p>Having the error assignment from FS times to spikes and surrogate gradients of spike function, the overall backpropagation pipeline can be constructed. As shown in <xref ref-type="fig" rid="F3">Figures 3A</xref>, <xref ref-type="fig" rid="F3">B</xref>, the error flows from time <inline-formula><mml:math id="M61"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>t</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula> to spike <inline-formula><mml:math id="M62"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>s</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula> in the last layer, given by Eqs (12), (13). 
In each layer, the error of spikes <inline-formula><mml:math id="M63"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>s</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula> is propagated to the membrane potential <inline-formula><mml:math id="M64"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>U</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula> through the surrogate gradient in Eq. (15), and then the error of the synaptic current <inline-formula><mml:math id="M65"><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:math></inline-formula> can be calculated. Finally, the derivatives of the error with respect to the weights <italic><bold>W</bold></italic><sup><italic>l</italic></sup>, <italic><bold>V</bold></italic><sup><italic>l</italic></sup> in each layer are calculated by taking the derivative of Eq. (2):</p>
<disp-formula id="E16"><label>(16)</label><mml:math id="M66"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>W</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>W</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>s</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>and</p>
<disp-formula id="E17"><label>(17)</label><mml:math id="M67"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>V</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>V</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>s</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
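<p>For a single synapse, the accumulations in Eqs. (16), (17) reduce to sums of products over time steps. The sketch below is scalar and purely illustrative; real layers would form outer products over the neuron dimensions, and the function name is ours.</p>

```python
def weight_grads(dL_dI, s_prev, s_same):
    """Scalar sketch of Eqs. (16)-(17):
    dL/dW^l = sum_n dL/dI^{l,n} * s^{l-1,n}   (feedforward weights)
    dL/dV^l = sum_n dL/dI^{l,n} * s^{l,n}     (recurrent weights)

    dL_dI:  per-step current errors dL/dI^{l,n}
    s_prev: spikes of the previous layer, s^{l-1,n}
    s_same: spikes of the same layer, s^{l,n}
    """
    dW = sum(g * s for g, s in zip(dL_dI, s_prev))
    dV = sum(g * s for g, s in zip(dL_dI, s_same))
    return dW, dV
```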
</sec></sec>
<sec>
<title>2.3. Strategies facilitating training based on FS timings</title>
<p>In addition to the computation of gradients, parameter initialization is also crucial to the training of SNNs. The time constant &#x003C4;, threshold &#x003B8;, temporal resolution &#x00394;<italic>t</italic>, length of the sequence <italic>T</italic>, and weight initialization all have a great impact on the results. Moreover, training based on FS timings is more challenging because focusing only on the first spike leads to more inactive neurons during training. Thus, apart from the time-to-spike error assignment described in Section 2.2.1, other strategies can be used in the parameter settings to facilitate the training process.</p>
<sec>
<title>2.3.1. Different time constants and thresholds for feature extraction and decision</title>
<p>The time constant &#x003C4; and threshold &#x003B8; determine the firing rate and neuron activity in the system, and appropriate values can enhance its performance. Empirically, we found that smaller values of &#x003C4; and &#x003B8; in the first few layers and larger values in the final layers lead to better performance with FS coding, which is consistent with our intuition. <xref ref-type="fig" rid="F4">Figure 4A</xref> illustrates the responses of neurons with different &#x003C4; and &#x003B8; to the same input sequences. The neuron with a small value of &#x003C4; has a short memory due to the rapid decay of its membrane potential. Meanwhile, a small &#x003B8; is used to maintain a high firing rate, thereby facilitating the transmission of sufficient information. Therefore, small values of &#x003C4; and &#x003B8; are well-suited for capturing local features in the initial layers.</p>
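<p>The effect of &#x003C4; and &#x003B8; can be reproduced with a minimal discrete leaky integrate-and-fire simulation. This is an illustrative sketch with exponential decay and reset to zero, not necessarily the exact neuron model of Eq. (2); the function name and unit-weight inputs are our assumptions.</p>

```python
import math

def lif_response(spikes, tau, theta, dt=1.0):
    """Minimal discrete LIF sketch: the membrane potential decays by
    exp(-dt/tau) each step, integrates unit-weight input spikes, and
    resets to 0 after crossing the threshold theta."""
    u, out = 0.0, []
    decay = math.exp(-dt / tau)
    for s in spikes:
        u = u * decay + s
        if u >= theta:
            out.append(1)   # output spike
            u = 0.0         # reset
        else:
            out.append(0)
    return out
```

<p>With a short burst input, a neuron with small &#x003C4; and &#x003B8; fires at once for each local input spike, whereas a neuron with large &#x003C4; and &#x003B8; fires only after integrating several spikes, mirroring the middle and bottom panels of Figure 4A.</p>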
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Illustrations of strategies to facilitate training with the FS loss. <bold>(A)</bold> Dynamics of neurons with different time constant &#x003C4; and threshold &#x003B8;. <bold>(Top)</bold> Input spike sequence (all the weights are equal to 1). <bold>(Middle)</bold> The neuron with &#x003C4; &#x0003D; 5<italic>ms</italic>, &#x003B8; &#x0003D; 0.5 responds rapidly to local features. <bold>(Bottom)</bold> The neuron with large parameters &#x003C4; &#x0003D; 50<italic>ms</italic>, &#x003B8; &#x0003D; 2 keeps a longer memory of input signals. <bold>(B)</bold> Illustration of how an active neuron with a large time delay can be seen as inactive. We can see that there exists a delay for spikes propagating down the network, causing the neuron to fail to fire a spike within the input window. By extending the time window with an empty sequence, the inactive neuron becomes active.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0004.tif"/>
</fig>
<p>On the contrary, the final layer requires a longer delay to make a correct decision after enough information has been accumulated. We can see from <xref ref-type="fig" rid="F4">Figure 4A</xref> that large values of &#x003C4; and &#x003B8; help keep a longer memory of previous stimuli, ensuring that the target output neuron fires its first spike after receiving enough spikes from previous layers. Experiments in Section 3.3 confirm that an appropriately longer time delay of the first spike leads to higher prediction accuracy.</p></sec>
<sec>
<title>2.3.2. Extension of time window with empty sequences</title>
<p>One of the challenges in training an FS-coded model is dealing with inactive neurons. Too many inactive neurons in the network will stop the gradient flow and hamper the update of weights, making training more difficult. In the output layer, particularly, there is no precise timing for the optimization of inactive neurons. However, we found that some of them are inactive only because of the delay between layers, which pushes their spikes beyond the observed time window, as visualized in <xref ref-type="fig" rid="F4">Figure 4B</xref>.</p>
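<p>A remedy used here is a simple padding operation on the input; a minimal list-based sketch (names are ours):</p>

```python
def extend_window(spike_train, T_E):
    """Append an empty (all-zero) sequence of length T_E to an input spike
    train, so that late spikes caused by inter-layer delay fall inside
    the observed window during training."""
    return spike_train + [0] * T_E
```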
<p>We thus extend the input window by appending an empty sequence to its end during training. We can see from <xref ref-type="fig" rid="F4">Figure 4B</xref> that the empty sequence allows the output neuron to fire a later spike outside the original window. The length of the empty sequence <italic>T</italic><sub><italic>E</italic></sub> is determined empirically by the time constant &#x003C4; of the last layer. A network with a larger time constant usually exhibits a longer delay, so a larger <italic>T</italic><sub><italic>E</italic></sub> should be used in that case. Note that this strategy is only used to facilitate training; it enhances accuracy when &#x003C4; in the last layer is very large and the original window size is relatively small. The downside of this approach is a higher computational workload and a longer delay of the target FS (see Section 3.3). For the experiments in Section 3, we set <italic>T</italic><sub><italic>E</italic></sub> &#x0003D; 0 unless otherwise specified.</p></sec></sec></sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. Experimental settings</title>
<p>We compare models using FS and FR coding schemes on classification tasks. We avoid unrealistic datasets, such as N-MNIST (Orchard et al., <xref ref-type="bibr" rid="B33">2015</xref>) and CIFAR10-DVS (Li et al., <xref ref-type="bibr" rid="B28">2017</xref>), in which the first spike is not meaningful because events are generated by moving a neuromorphic device around static images, leaving no significant temporal differences in the sequences (Iyer et al., <xref ref-type="bibr" rid="B20">2021</xref>). Instead, we test our model on realistic datasets in which important information is encoded in spike timings. For the visual datasets DVSGesture (Amir et al., <xref ref-type="bibr" rid="B2">2017</xref>) and DVSPlane (Afshar et al., <xref ref-type="bibr" rid="B1">2019</xref>), events are generated by real cameras capturing dynamic scenes. For the auditory datasets N-TIDIGITS (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>) and SHD (Cramer et al., <xref ref-type="bibr" rid="B9">2022</xref>), spikes are derived by converting existing datasets using a neuromorphic device or a realistic simulator. As shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, the temporal structures of the event sequences in these datasets differ. For example, auditory signals in SHD and N-TIDIGITS are non-repetitive, and their temporal complexity is much higher than that of the visual datasets, since spatial information is crucial in the prediction of visual data. In addition, signals in DVSGesture contain repetitive information, as a person performs the same gesture several times. By contrast, DVSPlane exhibits more temporal complexity since the movement is not periodic. We observed that FS and FR models exhibit different behavior on these signals with varying temporal structures.</p>
<p>In our settings, the architecture and parameters vary among tasks. Note that our aim is to compare FR and FS coding rather than to seek the highest accuracy; hence, relatively simple architectures are used in our study. The architecture notation remains consistent throughout the article. For example, [2,32,32]-32C5S2-P2-64C3-FC128(R)-10 represents a network with a [2,32,32] input, where the first convolutional (C) layer contains 32 kernels (5 &#x000D7; 5) with a stride (S) of 2, followed by a max pooling (P) (2 &#x000D7; 2) and 64 convolutional kernels (3 &#x000D7; 3) with a stride of 1 (default), and finally a fully connected (FC) layer of 128 neurons with recurrent (R) connections and 10 output classes. Weights are all initialized using the Xavier uniform distribution, and the Adam optimizer with a weight decay of 1<italic>e</italic><sup>&#x02212;4</sup> is used. The surrogate gradient parameter &#x003C1; is set to 5 in all cases. Notations for the learning rate, batch size, and number of epochs are &#x003B7;, <italic>B</italic>, and <italic>N</italic><sub><italic>ep</italic></sub>, respectively. We denote the time constant and threshold as &#x003C4;<sub>1</sub> and &#x003B8;<sub>1</sub> when used in feature extraction, and as &#x003C4;<sub>2</sub> and &#x003B8;<sub>2</sub> when used in the decision process. In a convolutional SNN, &#x003C4;<sub>1</sub>/&#x003B8;<sub>1</sub> are used in the convolutional and pooling layers, and &#x003C4;<sub>2</sub>/&#x003B8;<sub>2</sub> in the FC layers. For an FC architecture, &#x003C4;<sub>1</sub>/&#x003B8;<sub>1</sub> are used in the hidden layers while &#x003C4;<sub>2</sub>/&#x003B8;<sub>2</sub> are used in the output layer. In our experiments, &#x003C4;<sub>1</sub>/&#x003B8;<sub>1</sub>/&#x003B8;<sub>2</sub> are determined empirically to obtain the optimal results. 
The value of &#x003C4;<sub>2</sub> affects time delay and accuracy significantly, whereas the value of &#x003B8;<sub>2</sub> does not have a great impact on the performance when &#x003C4;<sub>2</sub> is fixed. We, therefore, focus on &#x003C4;<sub>2</sub> and test different values with &#x003C4;<sub>2</sub> &#x0003D; &#x003BC;&#x003C4;<sub>1</sub>. Further details are introduced in the following and in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
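<p>For readers implementing the models, the architecture notation can be parsed mechanically. The following parser is purely illustrative and not from the authors' code; the tuple layouts and the default stride of 1 are our own conventions, matching the description above.</p>

```python
import re

def parse_arch(spec):
    """Hypothetical parser for the architecture notation used in the text:
    '[c,h,w]' input shape; 'kCsSd' = k conv kernels of size s, stride d
    (stride defaults to 1); 'Pk' = k x k pooling; 'FCn(R)' = fully
    connected layer of n neurons, optionally recurrent; a bare number is
    the output size."""
    layers = []
    for tok in spec.split('-'):
        if tok.startswith('['):
            layers.append(('input', [int(x) for x in tok[1:-1].split(',')]))
            continue
        m = re.fullmatch(r'(\d+)C(\d+)(?:S(\d+))?', tok)
        if m:
            layers.append(('conv', int(m.group(1)), int(m.group(2)),
                           int(m.group(3) or 1)))
            continue
        m = re.fullmatch(r'P(\d+)', tok)
        if m:
            layers.append(('pool', int(m.group(1))))
            continue
        m = re.fullmatch(r'FC(\d+)(\(R\))?', tok)
        if m:
            layers.append(('fc', int(m.group(1)), bool(m.group(2))))
            continue
        layers.append(('out', int(tok)))
    return layers
```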
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Hyperparameters for different tasks.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="left" colspan="4"><bold>Neuron</bold></th>
<th valign="top" align="left" colspan="5"><bold>FS</bold></th>
<th valign="top" align="left"><bold>FR</bold></th>
<th valign="top" align="left" colspan="3"><bold>Training</bold></th>
</tr>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="left"><bold><italic>&#x00394;t</italic></bold></th>
<th valign="top" align="left"><bold>&#x003C4;<sub>1</sub></bold></th>
<th valign="top" align="left"><bold>&#x003B8;<sub>1</sub></bold></th>
<th valign="top" align="left"><bold>&#x003B8;<sub>2</sub></bold></th>
<th valign="top" align="left"><bold>&#x003B1;<sub>0</sub></bold></th>
<th valign="top" align="left"><bold>&#x003BB;</bold></th>
<th valign="top" align="left"><bold>&#x003B2;<sub>0</sub></bold></th>
<th valign="top" align="left"><bold><italic>A</italic></bold></th>
<th valign="top" align="left"><bold><italic>D</italic></bold></th>
<th valign="top" align="left"><bold>&#x003B1;<sub>1</sub></bold></th>
<th valign="top" align="center"><bold>&#x003B7;</bold></th>
<th valign="top" align="center"><bold><italic>N</italic><sub><italic>ep</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>B</italic></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">DVSGesture</td>
<td valign="top" align="left">10 ms</td>
<td valign="top" align="left">50 ms</td>
<td valign="top" align="left">0.5</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">0.02</td>
<td valign="top" align="left">300</td>
<td valign="top" align="left">4</td>
<td valign="top" align="left">10</td>
<td valign="top" align="center">1<italic>e</italic><sup>&#x02212;4</sup></td>
<td valign="top" align="center">40</td>
<td valign="top" align="center">16</td>
</tr> <tr>
<td valign="top" align="left">SHD</td>
<td valign="top" align="left">10 ms</td>
<td valign="top" align="left">50 ms</td>
<td valign="top" align="left">5</td>
<td valign="top" align="left">10</td>
<td valign="top" align="left">0.2</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">0.02</td>
<td valign="top" align="left">200</td>
<td valign="top" align="left">16</td>
<td valign="top" align="left">20</td>
<td valign="top" align="center">1<italic>e</italic><sup>&#x02212;3</sup></td>
<td valign="top" align="center">80</td>
<td valign="top" align="center">128</td>
</tr> <tr>
<td valign="top" align="left">N-TIDIGITS</td>
<td valign="top" align="left">5 ms</td>
<td valign="top" align="left">25 ms</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">2</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">0.02</td>
<td valign="top" align="left">500</td>
<td valign="top" align="left">16</td>
<td valign="top" align="left">20</td>
<td valign="top" align="center">1<italic>e</italic><sup>&#x02212;3</sup></td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">128</td>
</tr>
<tr>
<td valign="top" align="left">DVSPlane</td>
<td valign="top" align="left">2 ms</td>
<td valign="top" align="left">10 ms</td>
<td valign="top" align="left">0.5</td>
<td valign="top" align="left">2</td>
<td valign="top" align="left">0.1</td>
<td valign="top" align="left">0.01</td>
<td valign="top" align="left">0.02</td>
<td valign="top" align="left">500</td>
<td valign="top" align="left">8</td>
<td valign="top" align="left">15</td>
<td valign="top" align="center">3<italic>e</italic><sup>&#x02212;4</sup></td>
<td valign="top" align="center">50</td>
<td valign="top" align="center">16</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec>
<title>3.1.1. DVSGesture dataset</title>
<p>DVSGesture (Amir et al., <xref ref-type="bibr" rid="B2">2017</xref>) is captured by the DVS128 camera and includes 11 hand gestures recorded under three different lighting conditions. Since gestures in the last class are random, we only take the first 10 classes. Recordings are split into 1,078 training and 264 testing samples. Each sequence is around 6 s; we clip 1.2 s for training and 2.5 s for testing, with a temporal resolution &#x00394;<italic>t</italic> &#x0003D; 10 ms. The frame size is 128 &#x000D7; 128, which is downsampled by 4 for the input. The architecture is [2,32,32]-64C3-128C3-P2-128C3-P2-FC128(R)-10.</p></sec>
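The event-to-frame preprocessing described above (&#x00394;<italic>t</italic> &#x0003D; 10 ms bins, frames downsampled by 4) can be sketched roughly as follows. This is a minimal illustration assuming events arrive as (timestamp in &#x000B5;s, x, y, polarity) tuples; the authors' exact pipeline may differ.

```python
def events_to_frames(events, dt_us=10_000, downsample=4, size=(128, 128), n_bins=120):
    """Bin DVS events into [n_bins, 2, H/ds, W/ds] spike-count frames.

    Illustrative sketch: dt_us is the bin width (10 ms), `size` the sensor
    resolution, and polarity selects one of two channels.
    """
    h, w = size[0] // downsample, size[1] // downsample
    # frames[t][polarity][row][col] holds per-bin event counts
    frames = [[[[0] * w for _ in range(h)] for _ in range(2)] for _ in range(n_bins)]
    for t, x, y, p in events:
        b = t // dt_us                      # temporal bin index
        if b < n_bins:                      # clip the sequence to n_bins steps
            frames[b][p][y // downsample][x // downsample] += 1
    return frames
```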
<sec>
<title>3.1.2. SHD dataset</title>
<p>The Spiking Heidelberg Dataset (SHD) is a dynamic audio dataset generated using Lauscher, an artificial cochlea model, and includes 20 classes of spoken digits from 0 to 9 in both German and English. A total of 7,736 samples are used for training and 2,264 samples for testing. Each sequence is around 1 s; we clip 0.8 s for training and 1 s for testing, with &#x00394;<italic>t</italic> &#x0003D; 10 ms. The architecture is 700-FC256(R)-20, in which the time constant in the hidden layer is initialized with &#x003C4;<sub>1</sub> but the related variables &#x003B1;, &#x003B2; are trainable using the method of Perez-Nieves et al. (<xref ref-type="bibr" rid="B38">2021</xref>).</p></sec>
<sec>
<title>3.1.3. N-TIDIGITS dataset</title>
<p>The N-TIDIGITS dataset (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>) transforms the TIDIGITS dataset into spikes with the dynamic audio sensor CochleaAMS1b. It includes 11 classes: the spoken digits 0&#x02013;9 and the word &#x0201C;oh.&#x0201D; The length of sequences ranges from around 0.08 to 2.46 s, but most sequences are &#x0003C;1.2 s. &#x00394;<italic>t</italic> &#x0003D; 5 ms and <italic>T</italic> &#x0003D; 250 are used for all sequences. The training and testing datasets include 2,463 and 2,486 samples, respectively. The architecture is 64-FC256(R)-FC256(R)-11, in which the time constants in the hidden layers are trainable.</p></sec>
<sec>
<title>3.1.4. DVSPlane dataset</title>
<p>The DVSPlane dataset (Afshar et al., <xref ref-type="bibr" rid="B1">2019</xref>) is captured by an asynchronous time-based image sensor (ATIS). Here, four different airplane models are dropped free-hand from varying heights and distances in front of the camera. The length of sequences is 242 &#x000B1; 21 ms. We set &#x00394;<italic>t</italic> &#x0003D; 2 ms and <italic>T</italic> &#x0003D; 100 in training, and <italic>T</italic> &#x0003D; 120 in testing. The 800 samples are split into 640 training and 160 testing samples. Each frame, of size 304 &#x000D7; 240, is downsampled by 4 for the input. The architecture is [2,76,60]-32C5S2-64C3-P2-128C3-P2-FC256(R)-4.</p></sec>
<sec>
<title>3.1.5. Evaluation</title>
<p>The results are compared in terms of accuracy, time delay, and spike count. If not specified, the accuracy of the FS or FR model is evaluated in a manner that is consistent with its training.</p>
<p>As shown in <xref ref-type="fig" rid="F5">Figure 5</xref>, the accuracy increases with a longer time window. We, therefore, evaluate the time delay as the time at which the accuracy reaches 50 or 90% of the peak accuracy within the given time window, denoted as <italic>t</italic><sub><italic>d</italic></sub>(50%) and <italic>t</italic><sub><italic>d</italic></sub>(90%), respectively.</p>
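This time-delay metric can be computed from an accuracy-vs-window curve as sketched below. The function name and the `acc_curve` representation (accuracy when inference is stopped at step t) are our own illustrative choices, not the authors' code.

```python
def time_delay(acc_curve, frac=0.9):
    """Earliest time step at which accuracy reaches `frac` of its peak
    within the given time window, i.e. t_d(50%) or t_d(90%)."""
    target = frac * max(acc_curve)
    for t, acc in enumerate(acc_curve):
        if acc >= target:
            return t
    return len(acc_curve) - 1  # unreachable if acc_curve attains its max
```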
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>An example of accuracy variation with increasing time window size. The time delay <italic>t</italic><sub><italic>d</italic></sub> is evaluated using the time when the accuracy reaches 50% (dashed line) or 90% (dotted line) of the peak value within the given time window.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0005.tif"/>
</fig>
<p>In terms of energy consumption, many studies calculate the number of synaptic operations as a measure. However, these metrics are usually employed when comparing power consumption between SNNs and ANNs, or between different architectures. They typically involve multiplying the number of connections by the number of spikes (Wu et al., <xref ref-type="bibr" rid="B48">2022</xref>), or factoring in firing rate, time step, and FLOPs (Zhou et al., <xref ref-type="bibr" rid="B62">2023</xref>). In our settings, however, we only change the coding scheme in the output layer while keeping the architecture the same. In addition, the FS coding scheme does not introduce additional operations during inference. Therefore, energy consumption mainly depends on the spike count in the system. Here, we use the average number of spikes per neuron in the system (denoted as <italic>N</italic><sub><italic>s</italic></sub>) to evaluate power consumption.</p>
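Both the average spike count <italic>N</italic><sub><italic>s</italic></sub> and a penalty of the form &#x003BB;<sub><italic>s</italic></sub>|<italic>N</italic><sub><italic>s</italic></sub> &#x02212; &#x000D1;<sub><italic>s</italic></sub>| used in this section to constrain it can be sketched with plain numbers as follows. In actual training such a term would act on the differentiable spike tensors; the names here are our own.

```python
def avg_spikes_per_neuron(layer_spike_counts, layer_sizes):
    """Average number of spikes per neuron across the system, N_s."""
    return sum(layer_spike_counts) / sum(layer_sizes)

def spike_count_penalty(ns, target_ns, lambda_s):
    """Optional loss term L_s = lambda_s * |N_s - N~_s|, where N~_s is the
    target average spike count per neuron and lambda_s a constant weight."""
    return lambda_s * abs(ns - target_ns)
```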
<p>Furthermore, to test the required number of spikes for correct classification, an optional constraint for the spike count is added in the loss function to constrain the total number of spikes in the system, i.e., <inline-formula><mml:math id="M68"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="script">L</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x000D1;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:math></inline-formula>, where &#x000D1;<sub><italic>s</italic></sub> is the target average spike count per neuron and &#x003BB;<sub><italic>s</italic></sub> is a constant value. The range of &#x003BB;<sub><italic>s</italic></sub> extends from 0.1 to 3, and we empirically adjusted it for various FS and FR models. Results under different spike count constraints are discussed in Sections 3.3 and 3.4.</p></sec></sec>
<sec>
<title>3.2. Results overview</title>
<p>We compare the results of models using FS and FR codings, respectively, on the four datasets. During the training of FS models, an empty sequence is added with lengths <italic>T</italic><sub><italic>E</italic></sub> &#x0003D; 40 and <italic>T</italic><sub><italic>E</italic></sub> &#x0003D; 20 for DVSGesture and SHD data sequences, respectively. As shown in <xref ref-type="table" rid="T2">Table 2</xref>, the FS model demonstrates performance comparable with the FR model. We also tested different time constants &#x003C4;<sub>2</sub> for the decision layers. For FS models, a larger &#x003C4;<sub>2</sub> leads to higher accuracy, but both time-delay metrics <italic>t</italic><sub><italic>d</italic></sub>(50%) and <italic>t</italic><sub><italic>d</italic></sub>(90%) also increase. In contrast, for FR models, &#x003C4;<sub>2</sub> does not affect the accuracy significantly. It seems that a longer first-spike latency encodes more information. The relationship between accuracy and time delay is further discussed in Section 3.3. In addition, the response time <italic>t</italic><sub><italic>d</italic></sub> of the FR model is overall shorter than that of the FS model. Note that the disparity between the time delays of FR and FS models is large when reaching 50% of the peak accuracy, but the gap narrows when reaching 90% of the best performance in some cases, especially for models with small time constants.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Comparison of FR and FS models with different time constant &#x003C4;<sub>2</sub> for decision layers (&#x003C4;<sub>2</sub> &#x0003D; &#x003BC;&#x003C4;<sub>1</sub>) in terms of accuracy (Acc), spike count (<italic>N</italic><sub><italic>s</italic></sub>), and time delay (<italic>t</italic><sub><italic>d</italic></sub>).</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th/>
<th valign="top" align="center" colspan="4"><bold>Trained using FR</bold></th>
<th valign="top" align="left" colspan="4"><bold>Trained using FS</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td/>
<td valign="top" align="left">&#x003BC;</td>
<td valign="top" align="center"><bold>Acc(%)</bold>&#x02191;</td>
<td valign="top" align="left"><italic>N</italic><sub><italic>s</italic></sub>&#x02193;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(50%)&#x02193;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(90%)&#x02193;</td>
<td valign="top" align="center"><bold>Acc (%)</bold>&#x02191;</td>
<td valign="top" align="left"><italic>N</italic><sub><italic>s</italic></sub>&#x02193;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(50%)&#x02193;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(90%)&#x02193;</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">DVSGesture</td>
<td valign="top" align="left">1</td>
<td valign="top" align="center">94.0</td>
<td valign="top" align="left">10.0</td>
<td valign="top" align="center"><bold>36.2</bold></td>
<td valign="top" align="center">58.6</td>
<td valign="top" align="center">90.1</td>
<td valign="top" align="left">10.2</td>
<td valign="top" align="center">38.2</td>
<td valign="top" align="center"><bold>56.8</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="center">94.2</td>
<td valign="top" align="left">9.5</td>
<td valign="top" align="center">41.6</td>
<td valign="top" align="center">58.8</td>
<td valign="top" align="center">89.7</td>
<td valign="top" align="left">9.6</td>
<td valign="top" align="center">41.8</td>
<td valign="top" align="center">59.0</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">12</td>
<td valign="top" align="center"><bold>95.1</bold></td>
<td valign="top" align="left">11.0</td>
<td valign="top" align="center">45.4</td>
<td valign="top" align="center">60.2</td>
<td valign="top" align="center">92.8</td>
<td valign="top" align="left"><bold>8.9</bold></td>
<td valign="top" align="center">61.6</td>
<td valign="top" align="center">77.6</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">SHD</td>
<td valign="top" align="left">1</td>
<td valign="top" align="center">79.3</td>
<td valign="top" align="left">26.8</td>
<td valign="top" align="center"><bold>28.8</bold></td>
<td valign="top" align="center"><bold>46.2</bold></td>
<td valign="top" align="center">78.3</td>
<td valign="top" align="left">24.9</td>
<td valign="top" align="center">33.8</td>
<td valign="top" align="center">47.2</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="center">78.8</td>
<td valign="top" align="left">31.0</td>
<td valign="top" align="center">30.8</td>
<td valign="top" align="center">47.2</td>
<td valign="top" align="center">85.5</td>
<td valign="top" align="left">12.6</td>
<td valign="top" align="center">48.4</td>
<td valign="top" align="center">67.0</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">12</td>
<td valign="top" align="center">77.8</td>
<td valign="top" align="left">35.7</td>
<td valign="top" align="center">33.0</td>
<td valign="top" align="center">49.4</td>
<td valign="top" align="center"><bold>87.6</bold></td>
<td valign="top" align="left"><bold>9.0</bold></td>
<td valign="top" align="center">58.0</td>
<td valign="top" align="center">74.8</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">N-TIDIGITS</td>
<td valign="top" align="left">1</td>
<td valign="top" align="center">88.2</td>
<td valign="top" align="left">10.6</td>
<td valign="top" align="center"><bold>74.4</bold></td>
<td valign="top" align="center">133.0</td>
<td valign="top" align="center">87.6</td>
<td valign="top" align="left">12.1</td>
<td valign="top" align="center">88.0</td>
<td valign="top" align="center"><bold>129.4</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="center">88.7</td>
<td valign="top" align="left">11.4</td>
<td valign="top" align="center">76.6</td>
<td valign="top" align="center">136.8</td>
<td valign="top" align="center"><bold>89.3</bold></td>
<td valign="top" align="left">5.4</td>
<td valign="top" align="center">94.4</td>
<td valign="top" align="center">137.8</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">8</td>
<td valign="top" align="center">88.1</td>
<td valign="top" align="left">11.5</td>
<td valign="top" align="center">80.6</td>
<td valign="top" align="center">139.4</td>
<td valign="top" align="center"><bold>89.3</bold></td>
<td valign="top" align="left"><bold>5.2</bold></td>
<td valign="top" align="center">98.2</td>
<td valign="top" align="center">144.0</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">DVSPlane</td>
<td valign="top" align="left">1</td>
<td valign="top" align="center">83.0</td>
<td valign="top" align="left"><bold>1.3</bold></td>
<td valign="top" align="center">57.6</td>
<td valign="top" align="center">84.4</td>
<td valign="top" align="center">87.6</td>
<td valign="top" align="left">2.0</td>
<td valign="top" align="center"><bold>52.0</bold></td>
<td valign="top" align="center"><bold>65.2</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="center">85.4</td>
<td valign="top" align="left">1.5</td>
<td valign="top" align="center">68.0</td>
<td valign="top" align="center">85.8</td>
<td valign="top" align="center">90.5</td>
<td valign="top" align="left">2.0</td>
<td valign="top" align="center">54.6</td>
<td valign="top" align="center">67.6</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">8</td>
<td valign="top" align="center">89.3</td>
<td valign="top" align="left">1.8</td>
<td valign="top" align="center">67.8</td>
<td valign="top" align="center">82.8</td>
<td valign="top" align="center"><bold>92.7</bold></td>
<td valign="top" align="left">1.9</td>
<td valign="top" align="center">62.0</td>
<td valign="top" align="center">77.0</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>The results are the average of 5 trials. The bold values represent the best results obtained when comparing the FR and FS systems for each dataset.</p>
</table-wrap-foot>
</table-wrap>
<p>One benefit of using FS coding is the reduction in the number of spikes, thereby leading to better energy efficiency. We can see from <xref ref-type="table" rid="T2">Table 2</xref> that the number of spikes <italic>N</italic><sub><italic>s</italic></sub> in an FS system is usually smaller than in an FR system, especially for SHD and N-TIDIGITS classification. Further experiments demonstrate that FS models are robust with fewer spikes, as shown in Section 3.4.</p>
<p>Furthermore, we generated output spike raster plots to analyze the neuronal activities of the FS and FR systems. As depicted in <xref ref-type="fig" rid="F6">Figure 6</xref>, non-target neurons of an FS model fire fewer spikes than those of an FR model, which reduces the likelihood of misclassification based on FR. The reason is that the first-spike time of a non-target neuron is optimized toward the end of the sequence, which significantly reduces its probability of firing anywhere in the sequence. This phenomenon allows the FS model to make decisions based on FR as well. Further results on data sequences with different temporal structures are analyzed in Section 3.5.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Representative output spike raster plots of correct predictions using models trained with FR (top, blue) and FS (bottom, red), respectively; &#x003C4;<sub>2</sub> &#x0003D; 4&#x003C4;<sub>1</sub> in all models. Training with FR leads to faster responses but non-target neurons fire more spikes, while FS coding leads to a lower firing rate and reduces the likelihood of misclassification because non-target neurons fire fewer spikes.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0006.tif"/>
</fig>
<p>To validate the effectiveness of FS coding, further experiments are conducted on models using adaptive LIF neurons in Section 3.6.</p></sec>
<sec>
<title>3.3. Trade-off between accuracy and time delay</title>
<p>During training, we found that FS models are easier to train when the time constant is relatively large and a proper empty sequence is added. The results also indicate a trade-off between accuracy and response time. The two main factors that affect the time delay, the time constant and the length of the empty sequence, are analyzed in the following.</p>
<sec>
<title>3.3.1. Time constant</title>
<p>As observed from <xref ref-type="table" rid="T2">Table 2</xref>, an FS model with a larger time constant &#x003C4;<sub>2</sub> in the last layers achieves higher accuracy and fewer spikes in the system overall, but at the cost of a longer time delay.</p></sec>
<sec>
<title>3.3.2. Time window and empty sequence</title>
<p>Another factor that affects the time delay is the length of the input time window, which is determined by the original length of the data <italic>T</italic> and the length of the added empty sequence <italic>T</italic><sub><italic>E</italic></sub>. First, we used a fixed <italic>T</italic> and tested different <italic>T</italic><sub><italic>E</italic></sub> values on the DVSGesture dataset. The model with &#x003C4;<sub>2</sub> &#x0003D; 12&#x003C4;<sub>1</sub> is tested because an empty sequence is needed only when the original window size <italic>T</italic> is small relative to the long time delay induced by a large &#x003C4;<sub>2</sub>. As shown in <xref ref-type="table" rid="T3">Table 3</xref>, increasing <italic>T</italic><sub><italic>E</italic></sub> leads to a longer time delay and higher overall accuracy.</p>
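The empty-sequence padding can be sketched as follows: <italic>T</italic><sub><italic>E</italic></sub> all-zero frames are appended after the data sequence. This is a minimal pure-Python illustration assuming the input is a list of per-step frames (nested lists); an actual implementation would operate on tensors.

```python
def append_empty_sequence(frames, t_e):
    """Append t_e empty (all-zero) frames to an event-frame sequence.

    Illustrative sketch of the empty-sequence padding used when training
    FS models; `frames` is a list of per-step frames of identical shape.
    """
    if not frames:
        return frames
    def zeros_like(x):  # recursively build an all-zero frame of the same shape
        return [zeros_like(v) for v in x] if isinstance(x, list) else 0
    return frames + [zeros_like(frames[0]) for _ in range(t_e)]
```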
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Results of DVSGesture classification with different length of data sequence <italic>T</italic> and added empty sequence <italic>T</italic><sub><italic>E</italic></sub> (<italic>N</italic><sub><italic>ep</italic></sub> &#x0003D; 50).</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left" colspan="4"><italic><bold>T</bold></italic> &#x0003D; 120, &#x003C4;<sub><bold>2</bold></sub> &#x0003D; 12&#x003C4;<sub><bold>1</bold></sub></th>
<th valign="top" align="left" colspan="4"><italic><bold>T</bold></italic><sub><bold><italic><bold>E</bold></italic></bold></sub> &#x0003D; 0, &#x003C4;<sub><bold>2</bold></sub> &#x0003D; &#x003C4;<sub><bold>1</bold></sub></th>
</tr>
</thead>
<tbody>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td valign="top" align="left"><italic>T</italic><sub><italic>E</italic></sub></td>
<td valign="top" align="left"><bold>Acc(%)</bold> &#x02191;</td>
<td valign="top" align="left"><italic>t</italic><sub><italic>d</italic></sub>(50%)&#x02193;</td>
<td valign="top" align="left"><italic>t</italic><sub><italic>d</italic></sub>(90%)&#x02193;</td>
<td valign="top" align="left"><italic>T</italic></td>
<td valign="top" align="center"><bold>Acc (%)</bold> &#x02191;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(50%)&#x02193;</td>
<td valign="top" align="center"><italic>t</italic><sub><italic>d</italic></sub>(90%)&#x02193;</td>
</tr> <tr>
<td valign="top" align="left">0</td>
<td valign="top" align="left">92.4</td>
<td valign="top" align="left"><bold>59</bold></td>
<td valign="top" align="left"><bold>82</bold></td>
<td valign="top" align="left">60</td>
<td valign="top" align="center">81.1</td>
<td valign="top" align="center"><bold>29</bold></td>
<td valign="top" align="center"><bold>35</bold></td>
</tr> <tr>
<td valign="top" align="left">20</td>
<td valign="top" align="left">93.6</td>
<td valign="top" align="left">64</td>
<td valign="top" align="left">95</td>
<td valign="top" align="left">80</td>
<td valign="top" align="center">86.0</td>
<td valign="top" align="center">35</td>
<td valign="top" align="center">44</td>
</tr> <tr>
<td valign="top" align="left">40</td>
<td valign="top" align="left">93.6</td>
<td valign="top" align="left">65</td>
<td valign="top" align="left">97</td>
<td valign="top" align="left">100</td>
<td valign="top" align="center">88.3</td>
<td valign="top" align="center">36</td>
<td valign="top" align="center">48</td>
</tr> <tr>
<td valign="top" align="left">60</td>
<td valign="top" align="left"><bold>93.9</bold></td>
<td valign="top" align="left">66</td>
<td valign="top" align="left">99</td>
<td valign="top" align="left">120</td>
<td valign="top" align="center">90.9</td>
<td valign="top" align="center">38</td>
<td valign="top" align="center">53</td>
</tr> <tr>
<td valign="top" align="left">80</td>
<td valign="top" align="left"><bold>93.9</bold></td>
<td valign="top" align="left">77</td>
<td valign="top" align="left">114</td>
<td valign="top" align="left">140</td>
<td valign="top" align="center"><bold>89.4</bold></td>
<td valign="top" align="center">48</td>
<td valign="top" align="center">82</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>A larger window size <italic>T</italic>&#x0002B;<italic>T</italic><sub><italic>E</italic></sub> leads to higher accuracy but a longer time delay. The bold values represent the best results in each column.</p>
</table-wrap-foot>
</table-wrap>
<p>Furthermore, different lengths <italic>T</italic> of the input data are tested for the model with &#x003C4;<sub>2</sub> &#x0003D; &#x003C4;<sub>1</sub>, where <italic>T</italic><sub><italic>E</italic></sub> &#x0003D; 0 for all trials. The right half of <xref ref-type="table" rid="T3">Table 3</xref> shows that the accuracy is higher with a larger window size.</p></sec>
<sec>
<title>3.3.3. Accuracy vs. time delay under spike count constraints</title>
<p>We further imposed different spike count constraints on FS and FR models with different time constants &#x003C4;<sub>2</sub>. The time delay and corresponding accuracy of each trial are displayed in <xref ref-type="fig" rid="F7">Figure 7</xref>. We can see that the time delay is mainly determined by the time constant. FR models usually exhibit shorter time delays than FS models, except on DVSPlane. FS models for the SHD and N-TIDIGITS datasets clearly achieve higher accuracy at longer time delays. The spike raster plots of SHD and N-TIDIGITS classifications in <xref ref-type="fig" rid="F6">Figure 6</xref> also indicate that FS models achieve higher accuracy with longer time delays and fewer spikes, whereas FR models respond faster but the additional spikes produced by non-target neurons interfere with classification. For the DVSGesture and DVSPlane datasets, although the relationship between FS and FR models is less obvious, the best FS results are obtained at larger temporal latencies.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Relationship between accuracy and time delay <italic>t</italic><sub><italic>d</italic></sub> for models with different time constants &#x003C4;<sub>2</sub> under different spike count constraints. Red: FS models. Blue: FR models. Darker colors indicate results with larger target average spike count &#x000D1;<sub><italic>s</italic></sub>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0007.tif"/>
</fig>
<p>It is also worth noting that, under the same time constant, there are points with longer <italic>t</italic><sub><italic>d</italic></sub> but lower accuracy. These points represent models with a smaller number of spikes, which highlights that the time delay tends to be longer with fewer spikes. Overall, fewer spikes in the system usually lead to lower accuracy and a longer time delay. The relationship between accuracy and spike count is further discussed in Section 3.4.</p>
<p>To summarize, an appropriate time delay ensures that the first spike makes a correct decision. In other words, the first spike encodes more information with a longer time delay. However, a model with a larger &#x003C4;<sub>2</sub> risks lacking sufficient spikes to make a decision due to a lower output firing rate. In addition, it becomes more difficult to improve the accuracy as the cost of time delay increases, because the accuracy is not only determined by the time delay but also limited by the model itself and other factors.</p></sec></sec>
<sec>
<title>3.4. Energy efficiency</title>
<p>In SNNs, the power consumption mainly depends on the mean spike activity and the number of synaptic operations (Parameshwara et al., <xref ref-type="bibr" rid="B36">2021</xref>). When deploying SNNs on neuromorphic hardware such as the Intel Loihi (Davies et al., <xref ref-type="bibr" rid="B10">2018</xref>), reducing the number of spikes in the system could lead to gains in energy efficiency.</p>
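As a rough accounting in the spirit of such metrics, synaptic operations can be estimated by weighting each layer's emitted spikes by its fan-out. This is an illustrative sketch under our own naming, not a hardware energy model.

```python
def synaptic_operations(layer_spikes, layer_fanout):
    """Estimate synaptic operations as sum over layers of
    (spikes emitted in layer l) x (fan-out of layer l)."""
    return sum(s * f for s, f in zip(layer_spikes, layer_fanout))
```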
<p><xref ref-type="table" rid="T2">Table 2</xref> illustrates that FS systems produce fewer spikes than FR systems overall, especially on the SHD and N-TIDIGITS datasets. DVSGesture and DVSPlane classifications with the CNN architecture generate approximately the same number of spikes in both systems. An FS system with a larger time constant is usually more energy efficient, while an FR system generates fewer spikes with a smaller &#x003C4;<sub>2</sub>.</p>
<p>As mentioned in Section 3.3, reducing the number of spikes through a spike count constraint results in a decline in performance. <xref ref-type="fig" rid="F8">Figure 8</xref> presents the relationship between accuracy and average spike count <italic>N</italic><sub><italic>s</italic></sub> for models with different time constants &#x003C4;<sub>2</sub>, in which red and blue curves represent the results of FS and FR models, respectively. We can see that the accuracy of FR models decreases significantly overall, while FS models are more stable with fewer spikes, especially for the SHD and N-TIDIGITS tasks. For the DVSGesture dataset, the original spike counts of FS and FR models are close, so the accuracy of both models drops as the spike count decreases. Note that the original <italic>N</italic><sub><italic>s</italic></sub> of FS models is larger than that of FR models in DVSPlane classification, but the accuracy of FR models decreases more significantly. In addition, an FS model with a large &#x003C4;<sub>2</sub> is more robust to a reduced number of spikes. Overall, FS models with a large time constant are more energy efficient and more robust to the spike count constraint.</p>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Accuracy with different spike count <italic>N</italic><sub><italic>s</italic></sub> for models with different time constant &#x003C4;<sub>2</sub>. Red: FS models. Blue: FR models.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0008.tif"/>
</fig></sec>
<sec>
<title>3.5. Performance and behavior on event sequences with different temporal structures</title>
<p>From <xref ref-type="table" rid="T2">Table 2</xref>, we can see that the performance varies across datasets. Overall, the FS model outperforms the FR model on the SHD/N-TIDIGITS/DVSPlane datasets, whereas its performance on the DVSGesture task is relatively inferior. These differences stem from the different temporal structures of the input sequences. As illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>, in the DVSGesture dataset, the same gesture is repeated several times; the target neuron is therefore expected to keep firing and make a consistent decision. In contrast, the audio data pattern is non-repetitive. The frequency content of spoken digits changes over time within a short period, so the target neuron does not have to keep firing after a prediction has been made. Data sequences in DVSPlane are also non-repetitive, but the spatial features do not change significantly while an airplane is dropping.</p>
<sec>
<title>3.5.1. Neuron activities</title>
<p>To better compare the firing patterns of FS and FR models, apart from <xref ref-type="fig" rid="F6">Figure 6</xref>, we aggregated all the output spikes generated in response to different input signals for each class in a single raster plot, as shown in <xref ref-type="fig" rid="F9">Figure 9</xref>. Each color corresponds to the output spikes from a single trial. The ideal case is that each neuron generates spikes of only one color, such as FS-DVSGesture in <xref ref-type="fig" rid="F9">Figure 9</xref>; this indicates that only the target neuron is active while the other neurons remain inactive. On the other hand, mixed colors (such as FR-SHD) indicate that non-target neurons fire more spikes, thereby affecting classification.</p>
<fig id="F9" position="float">
<label>Figure 9</label>
<caption><p>Output spike raster plots of FR and FS systems for different types of data. Each figure illustrates the output spikes generated in response to various input signals that belong to different classes. Each color corresponds to the output spikes from a single trial. We can observe that the neuronal activities of FS and FR systems are significantly different on the SHD dataset with rich temporal structures, whereas for the signals with temporal repetition (DVSGesture), the output spike patterns of both systems are similar.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnins-17-1266003-g0009.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F6">Figures 6</xref>, <xref ref-type="fig" rid="F9">9</xref>, FS models exhibit notably distinct neuronal behavior on different types of data. For the visual signals in DVSGesture and DVSPlane, FS models produce periodic firing patterns similar to those produced by FR models, in which the target neuron keeps firing to make a consistent prediction. However, for the SHD data sequences without temporal repetition, FS systems generate far fewer spikes than FR systems, and the target neuron almost stops firing after the classification has been made.</p>
<p>The distinct neuronal activity on SHD also illustrates why FS coding can outperform FR coding even though FS coding makes decisions based on only a portion of the input spikes. First, FS models usually exhibit a longer temporal delay than FR models, which highlights that high accuracy demands sufficient input information: the time delay reflects the minimum information required for accurate decision-making. For FS-SHD in <xref ref-type="fig" rid="F9">Figure 9</xref>, the time delay varies notably across trials, indicating that different lengths of input data are necessary for correct decisions. While FR coding utilizes the entire input, that input can include redundant information. As shown by FR-SHD in <xref ref-type="fig" rid="F9">Figure 9</xref>, output neurons start firing at approximately the same time across trials, even when the available information is not yet sufficient for a correct decision; non-target neurons then generate additional spikes that disrupt the decision process, reducing the precision of decision-making.</p>
<p>The distinctive output spike pattern suggests that FS models are also capable of accurate classification under an FR readout. In <xref ref-type="table" rid="T4">Table 4</xref>, both FS and FR accuracies are reported for the two types of models. The results demonstrate that FS models, although trained on FS timings, perform well under FR readout and sometimes even better than FR models. FR models, however, struggle to predict accurately based on FS.</p>
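The two readouts compared in Table 4 can be summarized simply: FS picks the output neuron that fires earliest, FR picks the neuron that fires most. The following is an illustrative sketch of these readouts (not the authors' implementation; function names and array layout are our assumptions):

```python
import numpy as np

def fs_prediction(spikes):
    """First-spike (FS) readout: pick the output neuron that fires earliest.

    spikes: binary array of shape (num_classes, num_timesteps).
    Neurons that never fire are assigned a latency of T (i.e., "never").
    """
    T = spikes.shape[1]
    first_times = np.where(spikes.any(axis=1), spikes.argmax(axis=1), T)
    return int(first_times.argmin())

def fr_prediction(spikes):
    """Firing-rate (FR) readout: pick the neuron with the most spikes."""
    return int(spikes.sum(axis=1).argmax())

# Toy example: neuron 1 fires first, but neuron 0 fires most.
spikes = np.zeros((3, 10), dtype=int)
spikes[0, 5:] = 1   # 5 spikes, first at t = 5
spikes[1, 2] = 1    # 1 spike, first at t = 2
print(fs_prediction(spikes))  # -> 1
print(fr_prediction(spikes))  # -> 0
```

Under this view, cross-testing in Table 4 simply applies the other readout to a model trained with one of the two losses.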
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>The accuracy of FS and FR models with different time constants &#x003C4;<sub>2</sub> for decision layers (&#x003C4;<sub>2</sub> &#x0003D; &#x003BC;&#x003C4;<sub>1</sub>).</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th/>
<th valign="top" align="left" colspan="2"><bold>Trained using FR loss (%)</bold></th>
<th valign="top" align="left" colspan="2"><bold>Trained using FS loss (%)</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td/>
<td valign="top" align="left">&#x003BC;</td>
<td valign="top" align="left"><bold>Acc (FS)</bold>&#x02191;</td>
<td valign="top" align="left"><bold>Acc (FR)</bold>&#x02191;</td>
<td valign="top" align="left"><bold>Acc (FS)</bold>&#x02191;</td>
<td valign="top" align="left"><bold>Acc (FR)</bold>&#x02191;</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">DVSGesture</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">72.3</td>
<td valign="top" align="left">94.0</td>
<td valign="top" align="left">90.1</td>
<td valign="top" align="left">93.7</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="left">81.0</td>
<td valign="top" align="left">94.2</td>
<td valign="top" align="left">89.7</td>
<td valign="top" align="left">94.3</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">12</td>
<td valign="top" align="left">82.6</td>
<td valign="top" align="left"><bold>95.1</bold></td>
<td valign="top" align="left"><bold>92.8</bold></td>
<td valign="top" align="left">94.6</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">SHD</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">34.3</td>
<td valign="top" align="left">79.3</td>
<td valign="top" align="left">78.3</td>
<td valign="top" align="left">70.5</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="left">39.4</td>
<td valign="top" align="left">78.8</td>
<td valign="top" align="left">85.5</td>
<td valign="top" align="left">79.1</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">12</td>
<td valign="top" align="left">42.8</td>
<td valign="top" align="left">77.8</td>
<td valign="top" align="left"><bold>87.6</bold></td>
<td valign="top" align="left"><bold>79.4</bold></td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">N-TIDIGITS</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">47.9</td>
<td valign="top" align="left">88.2</td>
<td valign="top" align="left">87.6</td>
<td valign="top" align="left">75.6</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="left">47.4</td>
<td valign="top" align="left"><bold>88.7</bold></td>
<td valign="top" align="left"><bold>89.3</bold></td>
<td valign="top" align="left">87.0</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">8</td>
<td valign="top" align="left">51.0</td>
<td valign="top" align="left">88.1</td>
<td valign="top" align="left"><bold>89.3</bold></td>
<td valign="top" align="left">88.0</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">DVSPlane</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">69.5</td>
<td valign="top" align="left">83.0</td>
<td valign="top" align="left">87.6</td>
<td valign="top" align="left">88.5</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="left">76.5</td>
<td valign="top" align="left">85.4</td>
<td valign="top" align="left">90.5</td>
<td valign="top" align="left"><bold>90.3</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">8</td>
<td valign="top" align="left">82.0</td>
<td valign="top" align="left">89.3</td>
<td valign="top" align="left"><bold>92.7</bold></td>
<td valign="top" align="left">90.0</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>The results are the average of 5 trials. Accuracy is tested under both FS and FR readouts: Acc (FS) and Acc (FR) indicate the accuracy when testing using FS and FR, respectively. The bold values represent the best result when comparing the FS and FR accuracy of each model.</p>
</table-wrap-foot>
</table-wrap></sec>
<sec>
<title>3.5.2. Gaussian window size in error propagation</title>
<p>Another interesting observation is that the choice of Gaussian window size in the error propagation is related to the data type. We observed that repetitive visual data achieve better performance with a larger Gaussian window <italic>W</italic> (i.e., with a smaller <italic>D</italic>), while non-repetitive audio sequences prefer a smaller value of <italic>W</italic>. Specifically, we obtain optimal results using <italic>D</italic> &#x0003D; 16 for SHD and N-TIDIGITS, and <italic>D</italic> &#x0003D; 4 and <italic>D</italic> &#x0003D; 8 for DVSGesture and DVSPlane, respectively. A larger window in error propagation means that the error of the FS times is propagated over a wider time range, so more spikes are optimized. As a result, the firing patterns become more similar to rate coding, with a higher firing rate but a loss of precise spike timing. Conversely, with a smaller window, the error is propagated to fewer spikes and precise timing is emphasized. <xref ref-type="table" rid="T5">Table 5</xref> shows the results of the FS model on repetitive (DVSGesture) and non-repetitive (SHD) data for different Gaussian window sizes <italic>W</italic>. As <italic>W</italic> decreases, the firing rate of the target neuron decreases since fewer spikes are optimized, leading to a drop in FR accuracy. Nevertheless, the FS accuracy behaves differently on repetitive and non-repetitive signals: on repetitive data it follows a similar trend to the FR accuracy, whereas on non-repetitive data it improves with a smaller window, as precise timing is more important in that case.</p>
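The error assignment described above can be sketched as follows: the error computed at the FS time is distributed over the neuron's output spikes with Gaussian weights, so that the window size <italic>W</italic> controls how many spikes are optimized. This is an illustrative sketch only; the mapping sigma = W/2 and the normalization are our assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def gaussian_error_weights(spike_times, t_first, W):
    """Distribute the error of the first-spike time t_first over the output
    spikes with a Gaussian window of size W.

    spike_times: 1-D sequence of output spike times for one neuron.
    Returns one weight per spike; spikes close to t_first receive most of
    the error, and a larger W spreads the error over more spikes.
    """
    sigma = W / 2.0  # assumed mapping from window size to std. deviation
    w = np.exp(-((np.asarray(spike_times, dtype=float) - t_first) ** 2)
               / (2.0 * sigma ** 2))
    return w / w.sum()  # normalize so the total error is preserved

spike_times = [10, 12, 15, 40, 80]
weights = gaussian_error_weights(spike_times, t_first=10, W=20)
# A large W optimizes many spikes (rate-like behavior); a small W
# concentrates the error on spikes near t_first (timing-focused).
```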
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Comparison of results on repetitive visual data (DVSGesture) and non-repetitive audio data (SHD) with different Gaussian window size <italic>W</italic> in error assignment.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left" colspan="4"><bold>DVSGesture (</bold>&#x003C4;<sub><bold>2</bold></sub> &#x0003D; 4&#x003C4;<sub><bold>1</bold></sub><bold>)</bold></th>
<th valign="top" align="left" colspan="4"><bold>SHD (</bold>&#x003C4;<sub><bold>2</bold></sub> &#x0003D; 4&#x003C4;<sub><bold>1</bold></sub><bold>)</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td valign="top" align="left"><italic>D</italic></td>
<td valign="top" align="left"><italic>W</italic></td>
<td valign="top" align="center"><bold>Acc (%)(FS)</bold> &#x02191;</td>
<td valign="top" align="center"><bold>Acc (%) (FR)</bold> &#x02191;</td>
<td valign="top" align="left"><italic>D</italic></td>
<td valign="top" align="left"><italic>W</italic></td>
<td valign="top" align="center"><bold>Acc (%) (FS)</bold> &#x02191;</td>
<td valign="top" align="center"><bold>Acc(%) (FR)</bold> &#x02191;</td>
</tr> <tr>
<td valign="top" align="left">4</td>
<td valign="top" align="left">120</td>
<td valign="top" align="center"><bold>89.0</bold></td>
<td valign="top" align="center"><bold>94.3</bold></td>
<td valign="top" align="left">4</td>
<td valign="top" align="left">120</td>
<td valign="top" align="center">82.9</td>
<td valign="top" align="center"><bold>81.8</bold></td>
</tr> <tr>
<td valign="top" align="left">8</td>
<td valign="top" align="left">91</td>
<td valign="top" align="center">88.6</td>
<td valign="top" align="center">93.2</td>
<td valign="top" align="left">8</td>
<td valign="top" align="left">73</td>
<td valign="top" align="center">84.5</td>
<td valign="top" align="center">80.3</td>
</tr> <tr>
<td valign="top" align="left">12</td>
<td valign="top" align="left">61</td>
<td valign="top" align="center">86.7</td>
<td valign="top" align="center">88.6</td>
<td valign="top" align="left">16</td>
<td valign="top" align="left">37</td>
<td valign="top" align="center"><bold>85.6</bold></td>
<td valign="top" align="center">77.3</td>
</tr> <tr>
<td valign="top" align="left">16</td>
<td valign="top" align="left">46</td>
<td valign="top" align="center">88.6</td>
<td valign="top" align="center">90.2</td>
<td valign="top" align="left">32</td>
<td valign="top" align="left">19</td>
<td valign="top" align="center">85.5</td>
<td valign="top" align="center">76.5</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>As the window size <italic>W</italic> decreases, the FR accuracy drops due to a lower firing rate. The FS accuracy of repetitive signals declines as well, whereas that of non-repetitive signals shows the opposite trend, since precise timing is more important there. The bold values represent the best results in each column.</p>
</table-wrap-foot>
</table-wrap>
<p>We can, therefore, conclude that FS coding is better suited to the classification of non-repetitive data and to cases where the precise timing of the first spike is required. FR neurons should be used for repetitive signals or cases in which a consistent and stable prediction is required.</p></sec></sec>
<sec>
<title>3.6. Results of models using AdLIF neurons</title>
<p>To further validate the effectiveness of FS coding, we conducted additional experiments in which the CUBA-LIF neurons in the hidden layers were replaced with adaptive LIF (AdLIF) neurons (Bittar and Garner, <xref ref-type="bibr" rid="B4">2022</xref>). We tested the FC models on the SHD and N-TIDIGITS tasks, with recurrent connections removed. Note that CUBA-LIF neurons with fixed parameters are still used in the output layer, because we found that this neuron type supports the longer delays required for accurate decision-making. We only tested models with a large time constant in the output layer, and applied batch normalization and dropout in the hidden layers to obtain the best accuracy. Detailed parameter settings are listed in the <xref ref-type="app" rid="A1">Appendix</xref>.</p>
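For reference, an AdLIF neuron augments the LIF dynamics with an adaptation variable that is driven by the membrane potential and past spikes and feeds back negatively on the input current. The discrete-time sketch below follows the general form of Bittar and Garner (2022); the parameter names and constants are our illustrative assumptions, not the settings used in our experiments:

```python
def adlif_step(u, w, s_prev, I, alpha=0.9, beta=0.95, a=0.5, b=1.0, theta=1.0):
    """One discrete-time step of a toy adaptive LIF (AdLIF) neuron.

    u: membrane potential, w: adaptation variable, s_prev: previous spike,
    I: input current. The adaptation current w grows with activity and
    subtracts from the input drive, so sustained input produces
    spike-frequency adaptation rather than constant-rate firing.
    """
    u_new = alpha * u + (1 - alpha) * (I - w) - theta * s_prev  # leak, drive, soft reset
    w_new = beta * w + a * u + b * s_prev                        # activity-dependent adaptation
    s_new = float(u_new > theta)                                 # threshold crossing
    return u_new, w_new, s_new

# Drive the neuron with a constant current and record its spike train:
u = w = s = 0.0
spikes = []
for t in range(200):
    u, w, s = adlif_step(u, w, s, I=15.0)
    spikes.append(s)
```

With these toy constants the neuron fires densely at stimulus onset and then thins out as the adaptation variable builds up.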
<p><xref ref-type="table" rid="T6">Table 6</xref> presents a comparison between FS and FR models using AdLIF neurons. The same conclusions can be drawn from these results as from the models using CUBA-LIF neurons: FS coding yields higher accuracy and superior energy efficiency (fewer spikes) than FR coding on data sequences with rich temporal structures. On the other hand, FS models exhibit a longer time delay than FR models, although the gap narrows once the accuracy reaches 90% of its peak value. Furthermore, compared with the SHD and N-TIDIGITS results in <xref ref-type="table" rid="T2">Table 2</xref>, the models with AdLIF neurons show a significant accuracy improvement over the models employing CUBA-LIF neurons. This highlights that our approach is flexible and works well with various neuron types, and demonstrates that the FS coding scheme can achieve even higher accuracy on data sequences with complex temporal structures when advanced architectures and strategies are employed. Comparison results with other state-of-the-art methods are presented in the <xref ref-type="app" rid="A1">Appendix</xref>.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Comparison of FR and FS models with AdLIF neurons.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>AdLIF</bold></th>
<th valign="top" align="center"><bold>Loss</bold></th>
<th valign="top" align="center"><bold>Acc (%)&#x02191;</bold></th>
<th valign="top" align="center"><bold><italic>N</italic><sub><italic>s</italic></sub>&#x02193;</bold></th>
<th valign="top" align="center"><bold><italic>t</italic><sub><italic>d</italic></sub>(50<italic>%</italic>)&#x02193;</bold></th>
<th valign="top" align="center"><bold><italic>t</italic><sub><italic>d</italic></sub>(90<italic>%</italic>)&#x02193;</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" rowspan="2">SHD</td>
<td valign="top" align="center">FR</td>
<td valign="top" align="center">90.18</td>
<td valign="top" align="center">7.24</td>
<td valign="top" align="center"><bold>28.4</bold></td>
<td valign="top" align="center"><bold>57.8</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">FS</td>
<td valign="top" align="center"><bold>94.08</bold></td>
<td valign="top" align="center"><bold>3.65</bold></td>
<td valign="top" align="center">55.0</td>
<td valign="top" align="center">77.2</td>
</tr>
<tr>
<td valign="top" align="left" rowspan="2">NTIDIGITS</td>
<td valign="top" align="center">FR</td>
<td valign="top" align="center">91.11</td>
<td valign="top" align="center">13.19</td>
<td valign="top" align="center"><bold>64.6</bold></td>
<td valign="top" align="center"><bold>122.0</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">FS</td>
<td valign="top" align="center"><bold>92.25</bold></td>
<td valign="top" align="center"><bold>6.19</bold></td>
<td valign="top" align="center">88.8</td>
<td valign="top" align="center">129.4</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The results are the average of 5 trials. The bold values represent the best results obtained when comparing the FR and FS models for each dataset.</p>
</table-wrap-foot>
</table-wrap></sec></sec>
<sec sec-type="conclusions" id="s4">
<title>4. Conclusion</title>
<p>In this study, we introduce a novel decision-making scheme based on FS coding for realistic event sequences, which encodes the FS timings of the output neurons, and we propose a supervised training framework based on these timings. In the forward pass, discrete temporal coding is applied to the spike trains in the output layer. In the backward pass, the error of the FS times is assigned to individual spikes through a Gaussian window, and a surrogate gradient descent method is then leveraged to achieve supervised learning. Additional strategies, such as adding empty sequences and using different time constants and thresholds for the feature-extraction and decision layers, are designed to facilitate training and mitigate the influence of inactive neurons.</p>
<p>In the experiments, we test the FS coding scheme on classifying various types of event data with rich temporal structures and make a comprehensive comparison with FR coding. Our results provide insight into the distinct mechanisms underlying FS and FR coding. First, FS coding achieves performance comparable to FR coding, but there is a trade-off between accuracy and time delay: a relatively longer latency of the first spike allows more information to be encoded, leading to higher FS accuracy. Furthermore, models based on FS and FR coding exhibit distinct neuronal behavior on different types of data sequences in terms of firing patterns and sparsity. In particular, FS systems are much more energy efficient than FR systems for non-repetitive audio sequences with highly complex temporal structures, whereas for visual data sequences with temporal repetition and spatial information, the behavior of the FS and FR models is more closely aligned. FS systems tend to exhibit longer response times than FR systems; future research could explore strategies to reduce the temporal delay of the first spike.</p></sec>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.</p></sec>
<sec sec-type="author-contributions" id="s6">
<title>Author contributions</title>
<p>SL: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Writing&#x02014;original draft. VL: Supervision, Writing&#x02014;review and editing. PD: Project administration, Supervision, Writing&#x02014;review and editing.</p></sec>
</body>
<back>
<sec sec-type="funding-information" id="s7">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. SL was supported by the scholarship from the Department of Electrical and Electronic Engineering, Imperial College London, UK.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s8">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Afshar</surname> <given-names>S.</given-names></name> <name><surname>Hamilton</surname> <given-names>T. J.</given-names></name> <name><surname>Tapson</surname> <given-names>J.</given-names></name> <name><surname>Van Schaik</surname> <given-names>A.</given-names></name> <name><surname>Cohen</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>Investigation of event-based surfaces for high-speed detection, unsupervised feature extraction, and object recognition</article-title>. <source>Front. Neurosci</source>. <volume>12</volume>, <fpage>1047</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2018.01047</pub-id><pub-id pub-id-type="pmid">30705618</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Amir</surname> <given-names>A.</given-names></name> <name><surname>Taba</surname> <given-names>B.</given-names></name> <name><surname>Berg</surname> <given-names>D.</given-names></name> <name><surname>Melano</surname> <given-names>T.</given-names></name> <name><surname>McKinstry</surname> <given-names>J.</given-names></name> <name><surname>Di Nolfo</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>&#x0201C;A low power, fully event-based gesture recognition system,&#x0201D;</article-title> in <source>Proc. IEEE Comput. Vis. Pattern Recognit (CVPR)</source> (<publisher-loc>Honolulu, HI</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>7388</fpage>&#x02013;<lpage>7397</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.781</pub-id><pub-id pub-id-type="pmid">32903824</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anumula</surname> <given-names>J.</given-names></name> <name><surname>Neil</surname> <given-names>D.</given-names></name> <name><surname>Delbruck</surname> <given-names>T.</given-names></name> <name><surname>Liu</surname> <given-names>S.-C.</given-names></name></person-group> (<year>2018</year>). <article-title>Feature representations for neuromorphic audio spike streams</article-title>. <source>Front. Neurosci</source>. <volume>12</volume>, <fpage>23</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2018.00023</pub-id><pub-id pub-id-type="pmid">29479300</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bittar</surname> <given-names>A.</given-names></name> <name><surname>Garner</surname> <given-names>P. N.</given-names></name></person-group> (<year>2022</year>). <article-title>A surrogate gradient spiking baseline for speech command recognition</article-title>. <source>Front. Neurosci</source>. <volume>16</volume>, <fpage>865897</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2022.865897</pub-id><pub-id pub-id-type="pmid">36117617</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bohte</surname> <given-names>S. M.</given-names></name> <name><surname>Kok</surname> <given-names>J. N.</given-names></name> <name><surname>La Poutr&#x000E9;</surname> <given-names>H.</given-names></name></person-group> (<year>2002</year>). <article-title>Error-backpropagation in temporally encoded networks of spiking neurons</article-title>. <source>Neurocomputing</source> <volume>48</volume>, <fpage>17</fpage>&#x02013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1016/S0925-2312(01)00658-0</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bonilla</surname> <given-names>L.</given-names></name> <name><surname>Gautrais</surname> <given-names>J.</given-names></name> <name><surname>Thorpe</surname> <given-names>S.</given-names></name> <name><surname>Masquelier</surname> <given-names>T.</given-names></name></person-group> (<year>2022</year>). <article-title>Analyzing time-to-first-spike coding schemes: a theoretical approach</article-title>. <source>Front. Neurosci</source>. <volume>16</volume>, <fpage>971937</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2022.971937</pub-id><pub-id pub-id-type="pmid">36225737</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brette</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <article-title>Philosophy of the spike: rate-based vs. spike-based theories of the brain</article-title>. <source>Front. Syst. Neurosci</source>. <volume>9</volume>, <fpage>151</fpage>. <pub-id pub-id-type="doi">10.3389/fnsys.2015.00151</pub-id><pub-id pub-id-type="pmid">26617496</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Com&#x0015F;a</surname> <given-names>I.-M.</given-names></name> <name><surname>Potempa</surname> <given-names>K.</given-names></name> <name><surname>Versari</surname> <given-names>L.</given-names></name> <name><surname>Fischbacher</surname> <given-names>T.</given-names></name> <name><surname>Gesmundo</surname> <given-names>A.</given-names></name> <name><surname>Alakuijala</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>&#x0201C;Temporal coding in spiking neural networks with alpha synaptic function: learning with backpropagation,&#x0201D;</article-title> in <source>IEEE Transactions on Neural Networks and Learning Systems</source>, vol. 33 (IEEE), <fpage>5939</fpage>&#x02013;<lpage>5952</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3071976</pub-id><pub-id pub-id-type="pmid">33900924</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cramer</surname> <given-names>B.</given-names></name> <name><surname>Stradmann</surname> <given-names>Y.</given-names></name> <name><surname>Schemmel</surname> <given-names>J.</given-names></name> <name><surname>Zenke</surname> <given-names>F.</given-names></name></person-group> (<year>2022</year>). <article-title>The Heidelberg spiking datasets for the systematic evaluation of spiking neural networks</article-title>. <source>IEEE Trans. Neural Netw. Learning Syst</source>. <volume>33</volume>, <fpage>2744</fpage>&#x02013;<lpage>2757</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2020.3044364</pub-id><pub-id pub-id-type="pmid">33378266</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davies</surname> <given-names>M.</given-names></name> <name><surname>Srinivasa</surname> <given-names>N.</given-names></name> <name><surname>Lin</surname> <given-names>T.-H.</given-names></name> <name><surname>Chinya</surname> <given-names>G.</given-names></name> <name><surname>Cao</surname> <given-names>Y.</given-names></name> <name><surname>Choday</surname> <given-names>S. H.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>&#x0201C;Loihi: a neuromorphic manycore processor with on-chip learning,&#x0201D;</article-title> in <source>IEEE Micro</source>, vol. 38 (IEEE), <fpage>82</fpage>&#x02013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1109/MM.2018.112130359</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fang</surname> <given-names>W.</given-names></name> <name><surname>Yu</surname> <given-names>Z.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Masquelier</surname> <given-names>T.</given-names></name> <name><surname>Huang</surname> <given-names>T.</given-names></name> <name><surname>Tian</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;Incorporating learnable membrane time constant to enhance learning of spiking neural networks,&#x0201D;</article-title> in <source>Proc. IEEE Int. Conf. Comput. Vis</source>. (<italic>ICCV)</italic> (Montreal, QC: IEEE), <fpage>2641</fpage>&#x02013;<lpage>2651</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV48922.2021.00266</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fontaine</surname> <given-names>B.</given-names></name> <name><surname>Peremans</surname> <given-names>H.</given-names></name></person-group> (<year>2009</year>). <article-title>Bat echolocation processing using first-spike latency coding</article-title>. <source>Neural Netw</source>. <volume>22</volume>, <fpage>1372</fpage>&#x02013;<lpage>1382</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2009.05.002</pub-id><pub-id pub-id-type="pmid">19481904</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gardner</surname> <given-names>B.</given-names></name> <name><surname>Sporea</surname> <given-names>I.</given-names></name> <name><surname>Gr&#x000FC;ning</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Learning spatiotemporally encoded pattern transformations in structured spiking neural networks</article-title>. <source>Neural Comput</source>. <volume>27</volume>, <fpage>2548</fpage>&#x02013;<lpage>2586</lpage>. <pub-id pub-id-type="doi">10.1162/NECO_a_00790</pub-id><pub-id pub-id-type="pmid">26496039</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gollisch</surname> <given-names>T.</given-names></name> <name><surname>Meister</surname> <given-names>M.</given-names></name></person-group> (<year>2008</year>). <article-title>Rapid neural coding in the retina with relative spike latencies</article-title>. <source>Science</source> <volume>319</volume>, <fpage>1108</fpage>&#x02013;<lpage>1111</lpage>. <pub-id pub-id-type="doi">10.1126/science.1149639</pub-id><pub-id pub-id-type="pmid">18292344</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x000F6;ltz</surname> <given-names>J.</given-names></name> <name><surname>Kriener</surname> <given-names>L.</given-names></name> <name><surname>Baumbach</surname> <given-names>A.</given-names></name> <name><surname>Billaudelle</surname> <given-names>S.</given-names></name> <name><surname>Breitwieser</surname> <given-names>O.</given-names></name> <name><surname>Cramer</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Fast and energy-efficient neuromorphic deep learning with first-spike times</article-title>. <source>Nat. Mach. Intell</source>. <volume>3</volume>, <fpage>823</fpage>&#x02013;<lpage>835</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-021-00388-x</pub-id><pub-id pub-id-type="pmid">37523463</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guo</surname> <given-names>W.</given-names></name> <name><surname>Fouda</surname> <given-names>M. E.</given-names></name> <name><surname>Eltawil</surname> <given-names>A. M.</given-names></name> <name><surname>Salama</surname> <given-names>K. N.</given-names></name></person-group> (<year>2021</year>). <article-title>Neural coding in spiking neural networks: a comparative study for robust neuromorphic systems</article-title>. <source>Front. Neurosci</source>. <volume>15</volume>, <fpage>638474</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2021.638474</pub-id><pub-id pub-id-type="pmid">33746705</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hammouamri</surname> <given-names>I.</given-names></name> <name><surname>Khalfaoui-Hassani</surname> <given-names>I.</given-names></name> <name><surname>Masquelier</surname> <given-names>T.</given-names></name></person-group> (<year>2023</year>). <article-title>Learning delays in spiking neural networks using dilated convolutions with learnable spacings</article-title>. <source>arXiv [Preprint]</source>. arXiv:2306.17670.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heil</surname> <given-names>P.</given-names></name></person-group> (<year>2004</year>). <article-title>First-spike latency of auditory neurons revisited</article-title>. <source>Curr. Opin. Neurobiol</source>. <volume>14</volume>, <fpage>461</fpage>&#x02013;<lpage>467</lpage>. <pub-id pub-id-type="doi">10.1016/j.conb.2004.07.002</pub-id><pub-id pub-id-type="pmid">15321067</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huxter</surname> <given-names>J.</given-names></name> <name><surname>Burgess</surname> <given-names>N.</given-names></name> <name><surname>O&#x00027;Keefe</surname> <given-names>J.</given-names></name></person-group> (<year>2003</year>). <article-title>Independent rate and temporal coding in hippocampal pyramidal cells</article-title>. <source>Nature</source> <volume>425</volume>, <fpage>828</fpage>&#x02013;<lpage>832</lpage>. <pub-id pub-id-type="doi">10.1038/nature02058</pub-id><pub-id pub-id-type="pmid">14574410</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iyer</surname> <given-names>L. R.</given-names></name> <name><surname>Chua</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Is neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain</article-title>. <source>Front. Neurosci</source>. <volume>15</volume>, <fpage>608567</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2021.608567</pub-id><pub-id pub-id-type="pmid">33841072</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Izhikevich</surname> <given-names>E. M.</given-names></name> <name><surname>Desai</surname> <given-names>N. S.</given-names></name> <name><surname>Walcott</surname> <given-names>E. C.</given-names></name> <name><surname>Hoppensteadt</surname> <given-names>F. C.</given-names></name></person-group> (<year>2003</year>). <article-title>Bursts as a unit of neural information: selective communication via resonance</article-title>. <source>Trends Neurosci</source>. <volume>26</volume>, <fpage>161</fpage>&#x02013;<lpage>167</lpage>. <pub-id pub-id-type="doi">10.1016/S0166-2236(03)00034-1</pub-id><pub-id pub-id-type="pmid">12591219</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>Z.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>T.</given-names></name> <name><surname>Poo</surname> <given-names>M.-m.</given-names></name> <name><surname>Xu</surname> <given-names>B.</given-names></name></person-group> (<year>2023</year>). <article-title>Origin of the efficiency of spike timing-based neural computation for processing temporal information</article-title>. <source>Neural Netw</source>. <volume>160</volume>, <fpage>84</fpage>&#x02013;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2022.12.017</pub-id><pub-id pub-id-type="pmid">36621172</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johansson</surname> <given-names>R. S.</given-names></name> <name><surname>Birznieks</surname> <given-names>I.</given-names></name></person-group> (<year>2004</year>). <article-title>First spikes in ensembles of human tactile afferents code complex spatial fingertip events</article-title>. <source>Nat. Neurosci</source>. <volume>7</volume>, <fpage>170</fpage>&#x02013;<lpage>177</lpage>. <pub-id pub-id-type="doi">10.1038/nn1177</pub-id><pub-id pub-id-type="pmid">14730306</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kaiser</surname> <given-names>J.</given-names></name> <name><surname>Mostafa</surname> <given-names>H.</given-names></name> <name><surname>Neftci</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>Synaptic plasticity dynamics for deep continuous local learning (DECOLLE)</article-title>. <source>Front. Neurosci</source>. <volume>14</volume>, <fpage>424</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2020.00424</pub-id><pub-id pub-id-type="pmid">32477050</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kheradpisheh</surname> <given-names>S. R.</given-names></name> <name><surname>Masquelier</surname> <given-names>T.</given-names></name></person-group> (<year>2020</year>). <article-title>Temporal backpropagation for spiking neural networks with one spike per neuron</article-title>. <source>Int. J. Neur. Syst</source>. <volume>30</volume>, <fpage>2050027</fpage>. <pub-id pub-id-type="doi">10.1142/S0129065720500276</pub-id><pub-id pub-id-type="pmid">32466691</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kiselev</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Rate coding vs. temporal coding - is optimum between?&#x0201D;</article-title> in <source>2016 International Joint Conference on Neural Networks (IJCNN)</source> (<publisher-loc>Vancouver, BC</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1355</fpage>&#x02013;<lpage>1359</lpage>. <pub-id pub-id-type="doi">10.1109/IJCNN.2016.7727355</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kotariya</surname> <given-names>V.</given-names></name> <name><surname>Ganguly</surname> <given-names>U.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Spiking-GAN: a spiking generative adversarial network using time-to-first-spike coding,&#x0201D;</article-title> in <source>2022 International Joint Conference on Neural Networks (IJCNN)</source> (<publisher-loc>Padua</publisher-loc>: <publisher-name>IEEE</publisher-name>). <pub-id pub-id-type="doi">10.1109/IJCNN55064.2022.9892262</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Liu</surname> <given-names>H.</given-names></name> <name><surname>Ji</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name> <name><surname>Shi</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>CIFAR10-DVS: an event-stream dataset for object classification</article-title>. <source>Front. Neurosci</source>. <volume>11</volume>, <fpage>309</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2017.00309</pub-id><pub-id pub-id-type="pmid">28611582</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>X.-P.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name></person-group> (<year>2022</year>). <article-title>Distinct neuronal types contribute to hybrid temporal encoding strategies in primate auditory cortex</article-title>. <source>PLOS Biol</source>. <volume>20</volume>, <fpage>e3001642</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pbio.3001642</pub-id><pub-id pub-id-type="pmid">35613218</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mirsadeghi</surname> <given-names>M.</given-names></name> <name><surname>Shalchian</surname> <given-names>M.</given-names></name> <name><surname>Kheradpisheh</surname> <given-names>S. R.</given-names></name> <name><surname>Masquelier</surname> <given-names>T.</given-names></name></person-group> (<year>2021</year>). <article-title>STiDi-BP: spike time displacement based error backpropagation in multilayer spiking neural networks</article-title>. <source>Neurocomputing</source> <volume>427</volume>, <fpage>131</fpage>&#x02013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2020.11.052</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mostafa</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Supervised learning based on temporal coding in spiking neural networks</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <volume>29</volume>, <fpage>3227</fpage>&#x02013;<lpage>3235</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2017.2726060</pub-id><pub-id pub-id-type="pmid">28783639</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neftci</surname> <given-names>E. O.</given-names></name> <name><surname>Mostafa</surname> <given-names>H.</given-names></name> <name><surname>Zenke</surname> <given-names>F.</given-names></name></person-group> (<year>2019</year>). <article-title>Surrogate gradient learning in spiking neural networks: bringing the power of gradient-based optimization to spiking neural networks</article-title>. <source>IEEE Signal Process. Mag</source>. <volume>36</volume>, <fpage>51</fpage>&#x02013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.1109/MSP.2019.2931595</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Orchard</surname> <given-names>G.</given-names></name> <name><surname>Jayawant</surname> <given-names>A.</given-names></name> <name><surname>Cohen</surname> <given-names>G. K.</given-names></name> <name><surname>Thakor</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>Converting static image datasets to spiking neuromorphic datasets using saccades</article-title>. <source>Front. Neurosci</source>. <volume>9</volume>, <fpage>437</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2015.00437</pub-id><pub-id pub-id-type="pmid">26635513</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Panzeri</surname> <given-names>S.</given-names></name> <name><surname>Macke</surname> <given-names>J. H.</given-names></name> <name><surname>Gross</surname> <given-names>J.</given-names></name> <name><surname>Kayser</surname> <given-names>C.</given-names></name></person-group> (<year>2015</year>). <article-title>Neural population coding: combining insights from microscopic and mass signals</article-title>. <source>Trends Cogn. Sci</source>. <volume>19</volume>, <fpage>162</fpage>&#x02013;<lpage>172</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2015.01.002</pub-id><pub-id pub-id-type="pmid">25670005</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Panzeri</surname> <given-names>S.</given-names></name> <name><surname>Petersen</surname> <given-names>R. S.</given-names></name> <name><surname>Schultz</surname> <given-names>S. R.</given-names></name> <name><surname>Lebedev</surname> <given-names>M.</given-names></name> <name><surname>Diamond</surname> <given-names>M. E.</given-names></name></person-group> (<year>2001</year>). <article-title>The role of spike timing in the coding of stimulus location in rat somatosensory cortex</article-title>. <source>Neuron</source> <volume>29</volume>, <fpage>769</fpage>&#x02013;<lpage>777</lpage>. <pub-id pub-id-type="doi">10.1016/S0896-6273(01)00251-3</pub-id><pub-id pub-id-type="pmid">11301035</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Parameshwara</surname> <given-names>C. M.</given-names></name> <name><surname>Li</surname> <given-names>S.</given-names></name> <name><surname>Ferm&#x000FC;ller</surname> <given-names>C.</given-names></name> <name><surname>Sanket</surname> <given-names>N. J.</given-names></name> <name><surname>Evanusa</surname> <given-names>M. S.</given-names></name> <name><surname>Aloimonos</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;SpikeMS: deep spiking neural network for motion segmentation,&#x0201D;</article-title> in <source>2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</source> (<publisher-loc>Prague</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>3414</fpage>&#x02013;<lpage>3420</lpage>. <pub-id pub-id-type="doi">10.1109/IROS51168.2021.9636506</pub-id><pub-id pub-id-type="pmid">27534393</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Park</surname> <given-names>S.</given-names></name> <name><surname>Kim</surname> <given-names>S.</given-names></name> <name><surname>Na</surname> <given-names>B.</given-names></name> <name><surname>Yoon</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;T2FSNN: deep spiking neural networks with time-to-first-spike coding,&#x0201D;</article-title> in <source>2020 57th ACM/IEEE Design Automation Conference (DAC)</source> (<publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1109/DAC18072.2020.9218689</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perez-Nieves</surname> <given-names>N.</given-names></name> <name><surname>Leung</surname> <given-names>V. C. H.</given-names></name> <name><surname>Dragotti</surname> <given-names>P. L.</given-names></name> <name><surname>Goodman</surname> <given-names>D. F. M.</given-names></name></person-group> (<year>2021</year>). <article-title>Neural heterogeneity promotes robust learning</article-title>. <source>Nat. Commun</source>. <volume>12</volume>, <fpage>5791</fpage>. <pub-id pub-id-type="doi">10.1038/s41467-021-26022-3</pub-id><pub-id pub-id-type="pmid">34608134</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pfeiffer</surname> <given-names>M.</given-names></name> <name><surname>Pfeil</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <article-title>Deep learning with spiking neurons: opportunities and challenges</article-title>. <source>Front. Neurosci</source>. <volume>12</volume>, <fpage>774</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2018.00774</pub-id><pub-id pub-id-type="pmid">30410432</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pfister</surname> <given-names>J.-P.</given-names></name> <name><surname>Toyoizumi</surname> <given-names>T.</given-names></name> <name><surname>Barber</surname> <given-names>D.</given-names></name> <name><surname>Gerstner</surname> <given-names>W.</given-names></name></person-group> (<year>2006</year>). <article-title>Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning</article-title>. <source>Neural. Comput</source>. <volume>18</volume>, <fpage>1318</fpage>&#x02013;<lpage>1348</lpage>. <pub-id pub-id-type="doi">10.1162/neco.2006.18.6.1318</pub-id><pub-id pub-id-type="pmid">16764506</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rullen</surname> <given-names>R. V.</given-names></name> <name><surname>Thorpe</surname> <given-names>S. J.</given-names></name></person-group> (<year>2001</year>). <article-title>Rate coding versus temporal order coding: what the retinal ganglion cells tell the visual cortex</article-title>. <source>Neural Comput</source>. <volume>13</volume>, <fpage>1255</fpage>&#x02013;<lpage>1283</lpage>. <pub-id pub-id-type="doi">10.1162/08997660152002852</pub-id><pub-id pub-id-type="pmid">11387046</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sakemi</surname> <given-names>Y.</given-names></name> <name><surname>Morino</surname> <given-names>K.</given-names></name> <name><surname>Morie</surname> <given-names>T.</given-names></name> <name><surname>Aihara</surname> <given-names>K.</given-names></name></person-group> (<year>2023</year>). <article-title>A supervised learning algorithm for multilayer spiking neural networks based on temporal coding toward energy-efficient VLSI processor design</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <volume>34</volume>, <fpage>394</fpage>&#x02013;<lpage>408</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3095068</pub-id><pub-id pub-id-type="pmid">34280109</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shrestha</surname> <given-names>S. B.</given-names></name> <name><surname>Orchard</surname> <given-names>G.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;SLAYER: spike layer error reassignment in time,&#x0201D;</article-title> in <source>Proceedings of the 32nd International Conference on Neural Information Processing Systems</source> (<publisher-loc>Montr&#x000E9;al, QC</publisher-loc>: <publisher-name>Curran Associates Inc.</publisher-name>), <fpage>1419</fpage>&#x02013;<lpage>1428</lpage>. <pub-id pub-id-type="doi">10.5555/3326943.3327073</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shrestha</surname> <given-names>S. B.</given-names></name> <name><surname>Song</surname> <given-names>Q.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Event based weight update for learning infinite spike train,&#x0201D;</article-title> in <source>2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)</source> (<publisher-loc>Anaheim, CA</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>333</fpage>&#x02013;<lpage>338</lpage>. <pub-id pub-id-type="doi">10.1109/ICMLA.2016.0061</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shrestha</surname> <given-names>S. B.</given-names></name> <name><surname>Song</surname> <given-names>Q.</given-names></name></person-group> (<year>2017</year>). <article-title>Robust spike-train learning in spike-event based weight update</article-title>. <source>Neural Netw</source>. <volume>96</volume>, <fpage>33</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2017.08.010</pub-id><pub-id pub-id-type="pmid">28957730</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shrestha</surname> <given-names>S. B.</given-names></name> <name><surname>Zhu</surname> <given-names>L.</given-names></name> <name><surname>Sun</surname> <given-names>P.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;Spikemax: spike-based loss methods for classification,&#x0201D;</article-title> in <source>2022 International Joint Conference on Neural Networks (IJCNN)</source> (<publisher-loc>Padua</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1109/IJCNN55064.2022.9892379</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Lin</surname> <given-names>X.</given-names></name> <name><surname>Dang</surname> <given-names>X.</given-names></name></person-group> (<year>2020</year>). <article-title>Supervised learning in spiking neural networks: a review of algorithms and evaluations</article-title>. <source>Neural Netw</source>. <volume>125</volume>, <fpage>258</fpage>&#x02013;<lpage>280</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2020.02.011</pub-id><pub-id pub-id-type="pmid">32146356</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>D.</given-names></name> <name><surname>Yi</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>X.</given-names></name></person-group> (<year>2022</year>). <article-title>A little energy goes a long way: build an energy-efficient, accurate spiking neural network from convolutional neural network</article-title>. <source>Front. Neurosci</source>. <volume>16</volume>, <fpage>759900</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2022.759900</pub-id><pub-id pub-id-type="pmid">35692427</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Deng</surname> <given-names>L.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name> <name><surname>Zhu</surname> <given-names>J.</given-names></name> <name><surname>Shi</surname> <given-names>L.</given-names></name></person-group> (<year>2018</year>). <article-title>Spatio-temporal backpropagation for training high-performance spiking neural networks</article-title>. <source>Front. Neurosci</source>. <volume>12</volume>, <fpage>331</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2018.00331</pub-id><pub-id pub-id-type="pmid">29875621</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wunderlich</surname> <given-names>T. C.</given-names></name> <name><surname>Pehle</surname> <given-names>C.</given-names></name></person-group> (<year>2021</year>). <article-title>Event-based backpropagation can compute exact gradients for spiking neural networks</article-title>. <source>Sci. Rep</source>. <volume>11</volume>, <fpage>12829</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-91786-z</pub-id><pub-id pub-id-type="pmid">34145314</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>M.</given-names></name> <name><surname>Meng</surname> <given-names>Q.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Lin</surname> <given-names>Z.</given-names></name></person-group> (<year>2023</year>). <article-title>SPIDE: a purely spike-based method for training feedback spiking neural networks</article-title>. <source>Neural Netw</source>. <volume>161</volume>, <fpage>9</fpage>&#x02013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2023.01.026</pub-id><pub-id pub-id-type="pmid">36736003</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname> <given-names>R.</given-names></name> <name><surname>Tang</surname> <given-names>H.</given-names></name> <name><surname>Ma</surname> <given-names>Y.</given-names></name> <name><surname>Yan</surname> <given-names>R.</given-names></name> <name><surname>Orchard</surname> <given-names>G.</given-names></name></person-group> (<year>2020</year>). <article-title>An event-driven categorization model for AER image sensors using multispike encoding and learning</article-title>. <source>IEEE Trans. Neural Netw. Learning Syst</source>. <volume>31</volume>, <fpage>3649</fpage>&#x02013;<lpage>3657</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2019.2945630</pub-id><pub-id pub-id-type="pmid">31714243</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Zeng</surname> <given-names>X.</given-names></name> <name><surname>Han</surname> <given-names>L.</given-names></name> <name><surname>Yang</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>A supervised multi-spike learning algorithm based on gradient descent for spiking neural networks</article-title>. <source>Neural Netw</source>. <volume>43</volume>, <fpage>99</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2013.02.003</pub-id><pub-id pub-id-type="pmid">23500504</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>M.</given-names></name> <name><surname>Gao</surname> <given-names>H.</given-names></name> <name><surname>Zhao</surname> <given-names>G.</given-names></name> <name><surname>Wang</surname> <given-names>D.</given-names></name> <name><surname>Lin</surname> <given-names>Y.</given-names></name> <name><surname>Yang</surname> <given-names>Z.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;Temporal-wise attention spiking neural networks for event streams classification,&#x0201D;</article-title> in <source>2021 IEEE/CVF International Conference on Computer Vision (ICCV)</source> (<publisher-loc>Montreal, QC</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>10201</fpage>&#x02013;<lpage>10210</lpage>. <pub-id pub-id-type="doi">10.1109/ICCV48922.2021.01006</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yin</surname> <given-names>B.</given-names></name> <name><surname>Corradi</surname> <given-names>F.</given-names></name> <name><surname>Boht&#x000E9;</surname> <given-names>S. M.</given-names></name></person-group> (<year>2021</year>). <article-title>Accurate and efficient time-domain classification with adaptive spiking recurrent neural networks</article-title>. <source>Nat. Mach. Intell</source>. <volume>3</volume>, <fpage>905</fpage>&#x02013;<lpage>913</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-021-00397-w</pub-id></citation>
</ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>C.</given-names></name> <name><surname>Gu</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>G.</given-names></name> <name><surname>Wang</surname> <given-names>A.</given-names></name> <name><surname>Li</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>STSC-SNN: spatio-temporal synaptic connection with temporal convolution and attention for spiking neural networks</article-title>. <source>Front. Neurosci</source>. <volume>16</volume>, <fpage>1079357</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2022.1079357</pub-id><pub-id pub-id-type="pmid">36620452</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>M.</given-names></name> <name><surname>Xiang</surname> <given-names>T.</given-names></name> <name><surname>Srivatsa</surname> <given-names>P.</given-names></name> <name><surname>Chu</surname> <given-names>K. T. N.</given-names></name> <name><surname>Amornpaisannon</surname> <given-names>B.</given-names></name> <name><surname>Tavva</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>A TTFS-based energy and utilization efficient neuromorphic CNN accelerator</article-title>. <source>Front. Neurosci</source>. <volume>17</volume>, <fpage>1121592</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2023.1121592</pub-id><pub-id pub-id-type="pmid">37214405</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zenke</surname> <given-names>F.</given-names></name> <name><surname>Ganguli</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>SuperSpike: supervised learning in multilayer spiking neural networks</article-title>. <source>Neural Comput</source>. <volume>30</volume>, <fpage>1514</fpage>&#x02013;<lpage>1541</lpage>. <pub-id pub-id-type="doi">10.1162/neco_a_01086</pub-id><pub-id pub-id-type="pmid">29652587</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>M.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Wu</surname> <given-names>J.</given-names></name> <name><surname>Belatreche</surname> <given-names>A.</given-names></name> <name><surname>Amornpaisannon</surname> <given-names>B.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Rectified linear postsynaptic potential function for backpropagation in deep spiking neural networks</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <volume>33</volume>, <fpage>1947</fpage>&#x02013;<lpage>1958</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3110991</pub-id><pub-id pub-id-type="pmid">34534091</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Li</surname> <given-names>P.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Spike-train level backpropagation for training deep recurrent spiking neural networks,&#x0201D;</article-title> in <source>Proceedings of the 33rd International Conference on Neural Information Processing Systems</source>, number 701 (<publisher-loc>Red Hook, NY</publisher-loc>: <publisher-name>Curran Associates Inc.</publisher-name>), <fpage>7802</fpage>&#x02013;<lpage>7813</lpage>.</citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Chandrasekaran</surname> <given-names>S. T.</given-names></name> <name><surname>Sanyal</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Temporal-coded deep spiking neural network with easy training and robust performance</article-title>. <source>Proc. AAAI Conf. Artif. Intell</source>. <volume>35</volume>, <fpage>11143</fpage>&#x02013;<lpage>11151</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v35i12.17329</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>Z.</given-names></name> <name><surname>Zhu</surname> <given-names>Y.</given-names></name> <name><surname>He</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name> <name><surname>Tian</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>&#x0201C;Spikformer: when spiking neural network meets transformer,&#x0201D;</article-title> in <source>The Eleventh International Conference on Learning Representations</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://openreview.net/forum?id=frE4fUwz_h">https://openreview.net/forum?id=frE4fUwz_h</ext-link></citation>
</ref>
</ref-list>
<app-group>
<app id="A1">
<title>Appendix</title>
<p>For the experiments in Section 3.6, we replaced the CUBA-LIF neurons in the hidden layers with AdLIF neurons (Bittar and Garner, <xref ref-type="bibr" rid="B4">2022</xref>) and trained with the FS loss. This substitution introduces adaptation-related parameters: the adaptation time constant &#x003C4;<sub><italic>w</italic></sub> and the adaptation parameters <italic>a</italic> and <italic>b</italic>. Following Bittar and Garner (<xref ref-type="bibr" rid="B4">2022</xref>), &#x003C4;<sub><italic>w</italic></sub> is initialized to 100 ms. Whereas they constrain <italic>a</italic>&#x02208;[&#x02212;1, 1] and <italic>b</italic>&#x02208;[0, 2], we instead use <italic>a</italic>&#x02208;[0, 1] and <italic>b</italic>&#x02208;[0, 1], which we found leads to more stable training and improved performance. These parameters are trained jointly with &#x003C4;<sub><italic>m</italic></sub>.</p>
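<p>The AdLIF update equations are not restated in this appendix; the following is a minimal, self-contained sketch of one discrete-time step in our own Euler-style discretization, in the spirit of Bittar and Garner (2022). The function name and default constants are illustrative assumptions, except for the 100 ms initialization of &#x003C4;<sub><italic>w</italic></sub> and the [0, 1] clamps on <italic>a</italic> and <italic>b</italic> described above.</p>

```python
import math

def adlif_step(u, w, s_prev, inp, dt=1.0, tau_m=20.0, tau_w=100.0,
               a=0.5, b=0.5, theta=1.0):
    """One discrete-time AdLIF update (illustrative sketch, not the exact
    training code).

    u: membrane potential, w: adaptation variable, s_prev: previous spike (0/1).
    """
    # Clamp the adaptation parameters to [0, 1], as in our training setup.
    a = min(max(a, 0.0), 1.0)
    b = min(max(b, 0.0), 1.0)
    alpha = math.exp(-dt / tau_m)  # membrane decay factor
    beta = math.exp(-dt / tau_w)   # adaptation decay factor (tau_w init: 100 ms)
    # Subthreshold coupling (a) plus spike-triggered increment (b).
    w_new = beta * w + (1.0 - beta) * a * u + b * s_prev
    # Hard reset after a spike, leaky integration of input minus adaptation.
    u_new = alpha * u * (1.0 - s_prev) + (1.0 - alpha) * (inp - w_new)
    s_new = 1.0 if u_new > theta else 0.0
    return u_new, w_new, s_new
```

<p>A strong input drives a spike on the first step, after which the adaptation variable <italic>w</italic> becomes positive and opposes further integration; weak input leaves the neuron silent.</p>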
<p>In particular, for the SHD task, the architecture is 700-FC256-FC256-20, with &#x003C4;<sub>1</sub> &#x0003D; 5 <italic>ms</italic>, &#x003C4;<sub>2</sub> &#x0003D; 12&#x003C4;<sub>1</sub>, &#x003B8;<sub>1</sub> &#x0003D; 0.5, and &#x003B8;<sub>2</sub> &#x0003D; 10. For the NTIDIGITS task, the architecture is 64-FC256-FC256-11, with &#x003C4;<sub>1</sub> &#x0003D; 5 <italic>ms</italic>, &#x003C4;<sub>2</sub> &#x0003D; 8&#x003C4;<sub>1</sub>, &#x003B8;<sub>1</sub> &#x0003D; 0.5, and &#x003B8;<sub>2</sub> &#x0003D; 10. The other training parameters are the same as for the models using CUBA-LIF neurons. In addition, batch normalization (BN) and dropout (DP) are applied in each hidden layer; a single hidden layer thus comprises a linear operation &#x0002B; BN &#x0002B; neuron &#x0002B; DP, with the dropout rate set to 0.25.</p>
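<p>As a rough illustration of model size (our own tally, not reported in the paper), the dense connectivity implied by the two architectures above can be counted with a short helper; BN parameters and the trainable neuron parameters (&#x003C4;, <italic>a</italic>, <italic>b</italic>) are excluded.</p>

```python
def dense_param_count(layer_sizes, bias=True):
    """Count weights (and optional biases) of a fully connected stack.

    layer_sizes lists the width of each layer, input to output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + (n_out if bias else 0)
    return total

shd_params = dense_param_count([700, 256, 256, 20])       # 700-FC256-FC256-20
ntidigits_params = dense_param_count([64, 256, 256, 11])  # 64-FC256-FC256-11
```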
<p><xref ref-type="table" rid="TA1">Table A1</xref> compares models using CUBA-LIF and AdLIF neurons with other state-of-the-art methods on the DVSGesture, SHD, and NTIDIGITS datasets. The models with CUBA-LIF neurons generally perform worse, particularly on DVSGesture, indicating that FS coding is not well suited to sequences involving temporal repetition. In contrast, the models with AdLIF neurons show significant accuracy improvements, achieving performance comparable to state-of-the-art methods, especially on the SHD dataset.</p>
<table-wrap position="float" id="TA1">
<label>Table A1</label>
<caption><p>Result comparison of FS models with other methods.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Acc (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center" rowspan="7">DVSGesture</td>
<td valign="top" align="left">Hetero. RSNN (Perez-Nieves et al., <xref ref-type="bibr" rid="B38">2021</xref>)</td>
<td valign="top" align="center">82.9</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">SLAYER (Shrestha and Orchard, <xref ref-type="bibr" rid="B43">2018</xref>)</td>
<td valign="top" align="center">93.64 &#x000B1; 0.49</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">DECOLLE (Kaiser et al., <xref ref-type="bibr" rid="B24">2020</xref>)</td>
<td valign="top" align="center">95.54</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">SLAYER &#x0002B; SpikeMax (Shrestha et al., <xref ref-type="bibr" rid="B46">2022</xref>)</td>
<td valign="top" align="center">95.83 &#x000B1; 0.48</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">PLIF (Fang et al., <xref ref-type="bibr" rid="B11">2021</xref>) (STBP)</td>
<td valign="top" align="center">97.57</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">STSC-SNN (Yu et al., <xref ref-type="bibr" rid="B56">2022</xref>)</td>
<td valign="top" align="center"><bold>98.96</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Ours (CUBA-LIF)</td>
<td valign="top" align="center">92.8</td>
</tr>
<tr>
<td valign="top" align="center" rowspan="9">SHD</td>
<td valign="top" align="left">Hetero. RSNN (Perez-Nieves et al., <xref ref-type="bibr" rid="B38">2021</xref>)</td>
<td valign="top" align="center">82.7 &#x000B1; 0.8</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">RSNN with data aug. &#x0002B; noise (Cramer et al., <xref ref-type="bibr" rid="B9">2022</xref>)</td>
<td valign="top" align="center">83.2 &#x000B1; 1.3</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Adaptive SRNN (Yin et al., <xref ref-type="bibr" rid="B55">2021</xref>)</td>
<td valign="top" align="center">90.4</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">TA-SNN (Yao et al., <xref ref-type="bibr" rid="B54">2021</xref>)</td>
<td valign="top" align="center">91.08</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">STSC-SNN (Yu et al., <xref ref-type="bibr" rid="B56">2022</xref>)</td>
<td valign="top" align="center">92.36</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">RadLIF (Bittar and Garner, <xref ref-type="bibr" rid="B4">2022</xref>)</td>
<td valign="top" align="center">94.62</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">SNN with learned delays (Hammouamri et al., <xref ref-type="bibr" rid="B17">2023</xref>)</td>
<td valign="top" align="center"><bold>95.07</bold> <bold>&#x000B1;</bold> <bold>0.24</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Ours (CUBA-LIF)</td>
<td valign="top" align="center">87.6</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Ours (AdLIF)</td>
<td valign="top" align="center">94.08</td>
</tr>
<tr>
<td valign="top" align="center" rowspan="7">NTIDIGITS</td>
<td valign="top" align="left">GRU-RNN (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>)</td>
<td valign="top" align="center">90.9</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Phased-LSTM (Anumula et al., <xref ref-type="bibr" rid="B3">2018</xref>)</td>
<td valign="top" align="center">91.25</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">ST-RSBP (Zhang and Li, <xref ref-type="bibr" rid="B60">2019</xref>)</td>
<td valign="top" align="center">93.63 &#x000B1; 0.27</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">SLAYER &#x0002B; spike-rate (Shrestha et al., <xref ref-type="bibr" rid="B46">2022</xref>)</td>
<td valign="top" align="center"><bold>94.19</bold> <bold>&#x000B1;</bold> <bold>0.18</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">SLAYER &#x0002B; SpikeMax (Shrestha et al., <xref ref-type="bibr" rid="B46">2022</xref>)</td>
<td valign="top" align="center">93.21 &#x000B1; 0.32</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Ours (CUBA-LIF)</td>
<td valign="top" align="center">89.3</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Ours (AdLIF)</td>
<td valign="top" align="center">92.25</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The bold values represent the best results for each dataset.</p>
</table-wrap-foot>
</table-wrap>
</app>
</app-group>
</back>
</article>