Rethinking skip connections in Spiking Neural Networks with Time-To-First-Spike coding

Time-To-First-Spike (TTFS) coding in Spiking Neural Networks (SNNs) offers significant advantages in terms of energy efficiency, closely mimicking the behavior of biological neurons. In this work, we examine the role of skip connections, a widely used concept in Artificial Neural Networks (ANNs), within SNNs with TTFS coding. We focus on two distinct types of skip connection architectures: (1) addition-based skip connections and (2) concatenation-based skip connections. We find that addition-based skip connections introduce an additional delay in spike timing. Concatenation-based skip connections, on the other hand, avoid this delay but produce time gaps between the convolutional and skip-connection paths, thereby restricting the effective mixing of information from the two paths. To mitigate these issues, we propose a learnable delay for skip connections in the concatenation-based architecture. This approach bridges the time gap between the convolutional and skip branches, facilitating improved information mixing. We conduct experiments on public datasets including MNIST and Fashion-MNIST, illustrating the advantage of skip connections in TTFS coding architectures. Additionally, we demonstrate the applicability of TTFS coding to tasks beyond image recognition, extending it to scientific machine-learning problems and broadening the potential uses of SNNs.


Introduction
The communication between spiking neurons in the brain, characterized by its binary, event-driven, and sparse nature, offers significant potential for creating flexible and energy-efficient artificial intelligence (AI) systems [1,2]. Spiking Neural Networks (SNNs), unlike traditional Artificial Neural Networks (ANNs), leverage binary spikes, thereby offering a unique dimension of time in their operation. Recent studies have shown promising results with SNNs, making them suitable for competitive and energy-efficient applications in neuromorphic hardware [1,3-6].
A primary application of SNNs lies in image recognition [1,2]. To transform a static image into binary spike trains, a range of coding schemes have been introduced [6-8]. Rate coding conveys information through the firing rate of spikes [9-13]. Phase coding, meanwhile, embeds temporal information in spike patterns using a global oscillator [14]. In contrast, burst coding transmits bursts of spikes within brief time periods, which boosts the reliability of synaptic communication between neurons [7]. While these coding schemes have proven successful in training SNNs, they generate a large number of spikes, which presents challenges when applied to ultra-low-power devices.
To leverage temporal spike information in ultra-low-power environments, researchers have increasingly focused on Time-To-First-Spike (TTFS) coding [15,16]. The core concept is to represent information through spike timing, with each neuron generating a single spike during the forward pass. A line of work focuses on training temporal-coded SNNs with backpropagation [6,17-21], which highlights the biological plausibility and efficiency of temporal coding. Much of the previous work has centered on developing improved synaptic models capable of effectively processing temporal information. For instance, [17] employed non-leaky integrate-and-fire neurons to compute locally exact gradients for backpropagation, while [6] introduced the alpha-synaptic function to enhance SNNs' accuracy. Recently, [19] proposed ReLU-like spike dynamics that effectively mitigate the dead-neuron issue caused by the leaky nature of spike functions.
While advances in synaptic modeling have illuminated the understanding of neuronal dynamics, the exploration of network architecture in temporal SNNs has been relatively limited. In this paper, we explore architectural improvements for TTFS coding, focusing on the role of skip connections in neural networks. Skip connections are a widely employed technique in ANNs, facilitating training and enhancing performance by allowing information to bypass certain layers. We examine two types of skip connection architectures: (1) addition-based skip connections, as proposed in ResNet [22], and (2) concatenation-based skip connections, as utilized in the ShuffleNetV2 architecture [23]. We find that, when implemented in temporal SNN architectures, addition-based skip connections can introduce extra delays in the time dimension. Conversely, concatenation-based skip connections substantially reduce inference latency, but yield limited performance improvements due to discrepancies between the distributions from the convolutional and skip branches. To improve the performance of concatenation-based skip connections, we propose a learnable delay for skip connections, which diminishes the distribution gap between the skip and convolutional branches, allowing for more effective information mixing between the two distributions.
In addition to our exploration of a new architecture for TTFS SNNs, we also investigate applications outside of image recognition, specifically scientific machine-learning tasks. We venture into the domain of time-reversal wave localization problems, a significant challenge in physics and engineering. This problem aims to trace back a wave's source given the wave shape at a later time [24-26]. Through these experiments, we aim to demonstrate the versatility and potential of SNNs in various complex tasks, significantly expanding their applicability beyond traditional domains.
In summary, our contributions in this paper are threefold. (1) First, we explore the network architecture of temporal SNNs, with a particular emphasis on skip connections, examining both addition-based residual connections and concatenation-based skip connections in the context of temporal SNNs. (2) Second, we propose a learnable delay for skip connections that improves the performance of concatenation-based skip connections by reducing the distribution gap between the skip and convolutional branches, enabling more effective information mixing. (3) Lastly, we extend the application of Time-To-First-Spike (TTFS) coding beyond image recognition to the time-reversal problem of source localization using wave signals. These contributions not only advance our understanding of network architecture in temporal SNNs but also broaden the potential applications of TTFS coding in various domains.

Spiking Neural Networks
Spiking Neural Networks (SNNs), unlike traditional Artificial Neural Networks (ANNs), operate using temporal spikes, thereby offering a unique dimension of time in their operation [1,2]. Among their components, the Leaky-Integrate-and-Fire (LIF) neuron is key, as it functions as a non-linear activation unit. LIF neurons stand out due to the "memory" held within their membrane potential, where spikes are incrementally gathered. Once this potential exceeds a certain threshold, the neuron fires an output spike and the potential resets.
Training algorithms have been a primary focus of SNN research. Several proposed methods transform pre-trained ANNs into SNNs using weight or threshold balancing strategies [27-31]. While these techniques are generally effective, they require a multitude of timesteps to emulate float activations using binary spikes. A set of recent studies has suggested the use of surrogate functions to circumvent the non-differentiable backpropagation issue [13,32-39]. These methods, which account for temporal dynamics during weight training, exhibit high performance and short latency.
Another significant aspect of SNN research pertains to the coding scheme. Several schemes have been proposed for image classification with SNNs. Burst coding, for example, communicates a burst of spikes within a short duration, enhancing synaptic communication reliability [7]. Phase coding encodes temporal information into spike patterns based on a global oscillator [14]. Furthermore, rate coding has been applied to large-scale settings and is currently used by state-of-the-art methods [5,13,32]. This scheme generates a spike train over T timesteps, where the total spike count reflects the magnitude of the input values. However, the generation of numerous spikes can pose issues for ultra-low-power devices. To address this, Time-To-First-Spike (TTFS) coding has gained interest [6,15,17], as it generates a single spike per neuron, with spike latency inversely related to information importance. Despite progress in synaptic modeling, the architectural exploration of temporal SNNs remains limited. In this paper, we delve into the architectural enhancement of TTFS coding, specifically emphasizing the significance of skip connections in neural networks.

Skip Connection Architecture
The concept of skip connections, or shortcut connections, has been a cornerstone in the development of deep learning architectures, contributing significantly to the performance of various models. [22] proposed the ResNet architecture, which employs skip connections to allow signals to bypass layers, flowing directly from one layer to a layer further into the network. This design helps alleviate the problem of vanishing gradients in deep networks, thereby enabling the training of networks significantly deeper than previously possible. Highway Networks, introduced by [40], utilize gated skip connections, where the data flow is regulated by learned gating functions; this allows the network to control the information flow dynamically. [41] proposed DenseNet, where each layer receives direct inputs from all preceding layers and passes its own feature maps to all subsequent layers. This dense connectivity promotes feature reuse and substantially reduces the number of parameters. In the ShuffleNetV2 architecture [23], channel shuffle operations and pointwise group convolutions are combined with skip connections to create highly efficient networks suitable for mobile devices. These architectures have shown the importance of skip connections in enhancing the performance of neural networks. However, most of them were designed for traditional Artificial Neural Networks (ANNs), and their application and efficacy in the context of Spiking Neural Networks (SNNs) remain to be thoroughly investigated.

Temporal Neuron
Our neuron model is based on the non-leaky integrate-and-fire neuron proposed in [17]. The neuron employs exponentially decaying synaptic current kernels, denoted ε. The influence of an input spike occurring at time t_i on the membrane potential can be expressed as

dV_j(t)/dt = Σ_i w_ji ε(t − t_i).   (1)

Here, V_j(t) represents the membrane potential of neuron j at time t, and w_ji is the weight connecting neuron j to neuron i in the preceding layer. The synaptic current kernels ε are defined as

ε(t) = U(t) exp(−t/τ_syn),   (2)

where U(x) is the step function: U(x) = 1 when x ≥ 0 and U(x) = 0 otherwise, which ensures that the synaptic kernel only accounts for the time subsequent to the input spike. By combining Eq. 1 and Eq. 2 (and setting τ_syn = 1), we can derive the equation of the membrane potential. We assume each neuron generates at most one spike, so the membrane potential V(t) increases until there is an output spike. With N input spikes, the membrane potential is

V(t) = Σ_{k=1..N} w_k U(t − t_k) (1 − exp(−(t − t_k))).   (3)

Here, t_k is the timing of the k-th input spike and w_k is the corresponding weight. The neuron generates an output spike once the membrane potential exceeds 1, i.e.,

V(t_out) ≥ 1,   (4)

where t_out denotes the output spike timing. For all input spikes with t_k < t_out, we refer to the set of input spike indices as the causal set C, following the definition from the previous work [17].
We then reorganize Eq. 4 to solve for the spike timing. At threshold, Eq. 4 gives

Σ_{i∈C} w_i (1 − exp(−(t_out − t_i))) = 1.   (5)

For simplicity, we apply the transformation exp(t_i) → z_i, i.e., a z-transformation [42]. Then Eq. 5 can be rewritten as

z_out = Σ_{i∈C} w_i z_i / (Σ_{i∈C} w_i − 1).   (6)

With TTFS coding, the network determines the class of an image based on the neuron in the final layer that fires the earliest spike. For example, consider a 10-class classification problem. If the neuron corresponding to the "dog" category emits the earliest spike in the final layer, the network immediately classifies the given image as a "dog". This allows for faster predictions, as the classification is determined as soon as the first spike is generated, without the need to wait for spikes from other neurons.
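The closed-form spike time in Eq. 6 can be evaluated by scanning input spikes in temporal order and checking, for each candidate causal set, whether the implied output time falls before the next input arrives. Below is a minimal sketch of this procedure under the assumptions above (non-leaky neuron, τ_syn = 1); the function name and validity check are our own illustrative choices, not code from [17]:

```python
import math

def ttfs_spike_time(times, weights, threshold=1.0):
    """First-spike time of a non-leaky IF neuron with exponentially
    decaying synaptic kernels (tau_syn = 1), solved in z-space
    (z = exp(t)) as in Eq. 6. Returns inf if the neuron never fires."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    w_sum, wz_sum = 0.0, 0.0
    for rank, i in enumerate(order):
        w_sum += weights[i]
        wz_sum += weights[i] * math.exp(times[i])
        if w_sum <= threshold:
            continue  # the potential cannot reach threshold yet
        z_out = wz_sum / (w_sum - threshold)
        t_out = math.log(z_out)
        next_t = times[order[rank + 1]] if rank + 1 < len(order) else math.inf
        # the candidate is valid only if it falls in the current interval,
        # i.e., before the next input spike would join the causal set
        if times[i] <= t_out < next_t:
            return t_out
    return math.inf
```

For a single input spike at t = 0 with weight 2, the membrane potential 2(1 − e^(−t)) crosses 1 at t = ln 2, which the routine reproduces.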

Temporal Delay in a Layer
Our objective is to accelerate the first spike timing in the final layer without compromising the accuracy of the model. In this context, one might wonder: what factors contribute to the delay in temporal coding? The delay is induced by the spiking neurons themselves, since each neuron requires time to charge its membrane potential before generating an output spike, as demonstrated in Fig. 1(a). Fig. 1(b) presents a histogram of spike timings, illustrating that the distribution shifts towards later times as the layers go deeper. We aim to reduce this inherent temporal delay by bypassing the convolutional layers.

Architectures
We focus on the design of skip connections in temporal coding. We examine two types of skip connection architectures: (1) addition-based skip connections, as seen in ResNet [22], and (2) concatenation-based skip connections, as utilized in the ShuffleNetV2 architecture [23]. The addition-based architecture adds a skip connection to the main convolutional branch. Let X_l represent the input to the l-th block, and F(X_l) represent the non-linear transformation operations (e.g., convolution, batch normalization, and non-linearity) within a residual block. The operation of a residual block can then be written as

X_{l+1} = F(X_l) + X_l.

The overall operation of the addition-based skip connection is illustrated in Fig. 2(a). The major problem with this scheme is that adding the two branches (i.e., the skip connection and the convolutional branch) induces a significant delay in spike timing. For example, adding a spike from the skip connection at time t_A and a spike from the convolutional branch at time t_B results in an output at time t_A + t_B. Therefore, the addition-based skip connection is not appropriate for TTFS coding.
On the other hand, a concatenation-based skip connection utilizes a channel split operation, where the input tensor is split into two parts along the channel dimension. One part is transformed through a series of operations while the other part passes through a skip connection. The two parts are then concatenated and shuffled to ensure equal information sharing among channels. Mathematically, for the input activation X_l of the l-th block, the operation of a concatenation-based skip connection can be represented as

[X_{l,1}, X_{l,2}] = Split(X_l),
X_{l+1} = Shuffle(Concat(F(X_{l,1}), X_{l,2})),

where F(·) represents the non-linear transformation operation, and X_{l,1}, X_{l,2} are the two tensors obtained by splitting the input along the channel dimension. The overall illustration of the concatenation-based skip connection is shown in Fig. 2(b).
Unlike the addition-based skip connection, the concatenation-based skip connection allows spikes to pass directly through, thereby expediting spike timing. However, this method comes with a notable disadvantage: a timing discrepancy between the spikes in the convolutional branch and those in the skip-connection branch. Specifically, we observe that the distributions from the convolutional and skip-connection branches have little overlap. This lack of overlap can make it difficult to integrate information effectively between the two distributions in later layers, because of an inherent property of temporal neurons in TTFS coding: after a neuron spikes, there is no further activity, which means a TTFS neuron will not consider any later input once it has emitted a spike. Consequently, these neurons tend to prioritize the earlier input, usually coming from the skip connection. This skewed consideration can potentially lead to a drop in accuracy, as vital information from later inputs may be overlooked. This observation underscores the importance of appropriately managing the timing of inputs in temporal SNNs to ensure effective information integration and high network performance.
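The split-transform-concat-shuffle flow described above can be sketched on spike-timing maps as follows. This is a minimal NumPy illustration; the function names and the generic branch_fn placeholder are our own choices, not the paper's implementation:

```python
import numpy as np

def channel_shuffle(x, groups=2):
    # x: (C, H, W) map of spike timings; interleave channel groups so that
    # skip and convolutional channels mix in subsequent layers
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

def concat_skip_block(x, branch_fn):
    # split along the channel dimension: one half is transformed,
    # the other half passes through the skip connection unchanged
    c = x.shape[0] // 2
    x1, x2 = x[:c], x[c:]
    return channel_shuffle(np.concatenate([branch_fn(x1), x2], axis=0))
```

With branch_fn standing in for the convolutional branch, the skip half reaches the output with its original (earlier) timings, which is exactly the source of the timing discrepancy discussed above.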
To summarize, addition-based skip connections introduce additional timing delays in temporal SNNs. Concatenation-based skip connections, despite reducing inference latency, may overlook crucial information from the convolutional branch.

Adding Learnable Delay with a Skip Connection
Hence, a question naturally arises: how can we improve accuracy while reducing latency in TTFS coding? We focus on the key problem of concatenation-based skip connections, namely the timing discrepancy between the convolutional branch and the skip branch. To address this, we introduce a delay into the skip connection, designed to minimize the timing disparity between the two branches. Fig. 3 illustrates the delay implementation within the skip connection, where a delay is added to each channel. This introduces a slight adjustment to the original concatenation-based architecture. We first partition the feature map along the channel dimension:

[X_{l,1}, X_{l,2}] = Split(X_l).

We then apply a convolution layer F(·) to X_{l,1} and a delay block D(·) to X_{l,2}, and concatenate the outputs of the two branches:

X_{l+1} = Shuffle(Concat(F(X_{l,1}), D(X_{l,2}))),

where the delay block can be written as

D(X_{l,2}) = X_{l,2} + θ_l.

Here, θ_l ∈ R^D represents the parameters of the delay block in the l-th layer, where D is the channel dimension, so a distinct delay is applied to each channel. We train these parameters alongside the other weight parameters of the network. To further align the distributions from the convolutional layer and the skip connection, we introduce an additional loss term, L_delay, during optimization that penalizes the discrepancy between the spike-timing distributions of the two branches. This loss encourages the two distributions to converge, enhancing the efficacy of information mixing between the convolutional and skip branches.
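A minimal sketch of the learnable per-channel delay and an alignment penalty follows. Note that the exact form of the alignment loss is our assumption (a squared gap between mean branch timings); the paper's precise formulation may differ:

```python
import numpy as np

def delay_block(t_skip, theta):
    """Apply a learnable per-channel delay to skip-branch spike timings.
    t_skip: (C, H, W) spike timings; theta: (C,) delay parameters."""
    return t_skip + theta.reshape(-1, 1, 1)

def alignment_loss(t_conv, t_delayed):
    # hypothetical form: squared gap between the mean spike timings of the
    # convolutional and (delayed) skip branches; any such penalty pulls
    # the two timing distributions together during training
    return float((t_conv.mean() - t_delayed.mean()) ** 2)
```

In an autodiff framework the gradient of this penalty with respect to theta shifts the skip branch later in time, toward the (slower) convolutional branch.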

Overall Optimization
For a given image X, we first convert the float input to spike timings. Similar to rate coding [9,35,43], we place one convolutional block (Conv-BN-ReLU) at the first layer and use the ReLU output directly as the spike timing, t = ReLU(BN(Conv(X))), which is valid because the output of the ReLU layer is always non-negative. The signal then passes through multiple temporal neuron layers as described in Section 3.1. In the output layer, the class probability is computed from the spike timings. The objective is to train the network such that the neuron associated with the correct class is the first to fire among all neurons. This is achieved by applying the cross-entropy loss to the output spike timings O ∈ R^C of the last layer:

L_CE = − Σ_c y_c log( exp(−O_c) / Σ_k exp(−O_k) ).

Here, y_c is the one-hot encoding of the class index, and O_c denotes the c-th output neuron's spike timing. Multiplying the output timings by −1 assigns higher probability weighting to earlier spikes. Alongside the cross-entropy loss for classification, following the prior research [17], we introduce an additional term that penalizes the input weight vectors of neurons whose sum is below 1 (the denominator of Eq. 6):

L_W = Σ_l Σ_i max(0, 1 − Σ_j w_ji).

Here, i is the neuron index of layer l, and j is the input neuron index from the previous layer to neuron i. Overall, the total loss function is defined as

L_total = L_CE + λ_1 L_W + λ_2 L_delay,

where λ_1 and λ_2 are hyperparameters trading off the losses, and L_delay is the distribution-alignment loss on the skip-connection delay introduced above.
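The classification loss and weight-sum penalty above can be sketched as follows; this is an illustrative NumPy version (function name ours), not the paper's training code:

```python
import numpy as np

def ttfs_losses(out_times, label, weight_matrices):
    """Cross-entropy on negated spike timings plus the weight-sum penalty
    from [17]. out_times: (C,) output spike timings; weight_matrices:
    list of (out, in) weight arrays, one per layer."""
    logits = -np.asarray(out_times)   # earlier spike -> larger logit
    logits -= logits.max()            # stabilize the softmax numerically
    p = np.exp(logits) / np.exp(logits).sum()
    ce = -np.log(p[label])
    # penalize neurons whose incoming weights sum to less than 1;
    # such neurons may never drive the potential past the threshold
    wpen = sum(np.maximum(0.0, 1.0 - W.sum(axis=1)).sum()
               for W in weight_matrices)
    return float(ce), float(wpen)
```

When the correct-class neuron fires earliest, its negated timing is the largest logit, so the cross-entropy term is small; mis-ordered timings are penalized.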

Implementation Details
We evaluate our method on MNIST [44] and Fashion-MNIST [45]. We train the model with a batch size of 128 using the Adam optimizer with weight decay 1e-3. The initial learning rate is set to 6e-4 and decayed with cosine learning rate scheduling [46]. We train for a total of 100 epochs. We set λ_1 and λ_2 to 1 and 1e-6, respectively. We use PyTorch for implementation.
To compare these architectures, we introduce latency as an additional metric alongside accuracy. In this context, latency is defined as the average time taken for the first spike to occur in the final layer. This is a particularly relevant measure for TTFS coding, as operations can be terminated as soon as the first spike occurs in the last layer. As such, we present both latency and accuracy in our results, offering a comprehensive view of the trade-off between speed and precision across the architectural designs.
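The latency metric defined above reduces to a one-line computation over a batch of output spike timings; the function name is our own illustrative choice:

```python
import numpy as np

def ttfs_latency(final_times):
    """Latency: earliest spike time in the last layer, averaged over a batch.
    final_times: (batch, num_classes) array of output spike timings."""
    return float(final_times.min(axis=1).mean())
```

The per-sample minimum reflects the early-exit property of TTFS coding: inference ends at the first spike of the final layer, regardless of which class neuron produced it.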
In Table 1 and Table 4.2, we present the accuracy and latency results for the MNIST and Fashion-MNIST datasets, respectively. Several key observations can be made from these results: (1) The addition of skip connections significantly improves model accuracy compared to the baseline model. Specifically, the skip connection model improves performance by approximately 4% on both MNIST and Fashion-MNIST. (2) While the addition-based skip connection enhances accuracy, it also results in increased latency. This additional latency stems from the additive operation between the skip and convolutional branches, as discussed in Section 3.2.2.
(3) The concatenation-based skip connection model, without a delay block, achieves a reduction in latency. However, its performance is comparatively lower than that of the addition-based architecture. (4) Incorporating a delay block into the concatenation-based model leads to improved performance. Notably, the addition of the delay block does not significantly increase latency on either dataset. In summary, among the tested architectures, the concatenation-based skip connection model with a delay block provides the best balance of high performance and low latency.

Comparison with Previous Methods
To establish the effectiveness of our proposed architecture, we draw comparisons between our model, the concatenation-based skip connection with delay, and previous models. We focus on models that have been applied to the MNIST and Fashion-MNIST datasets. Table 4.3 shows results for the MNIST dataset: our model achieves an accuracy of 98.5%, which is competitive with prior work. For the Fashion-MNIST dataset, Table 4.3 illustrates the superior performance of our model, which achieves an accuracy of 91.4%, outperforming prior methods including [19]. This advantage may be attributed to our model's ability to address the timing discrepancy between the convolutional and skip branches, thus maintaining high performance while achieving low latency.

Energy Efficiency: Spike Rate Comparison
As highlighted in an earlier section, the ability to terminate operation within the network as soon as the first spike occurs at the last layer is an inherent advantage of TTFS coding. To fully leverage this characteristic, we assess the number of spikes in each layer when implementing an early-exit strategy. In Fig. 5, we visualize the spike rates of the baseline, the addition-based skip connection architecture, and the concatenation-based skip connection architecture. Here, we use the concatenation-based skip connection results with the added delay. In this context, the spike rate refers to the proportion of firing neurons in a given layer. This metric is instrumental in understanding the speed and efficiency of each model and, consequently, its suitability for real-time or latency-sensitive applications.
The experimental results in Fig. 5 provide a clear comparison between the baseline model, the addition-based skip connection model, and the concatenation-based skip connection model in terms of spike rate at different layers of the network. In the baseline model, the spike rates for the conv2, conv3, conv4, and conv5 layers are 74.49%, 53.45%, 64.37%, and 26.99%, respectively. For the addition-based skip connection model, the spike rates increase significantly, recorded at 85.5%, 83.12%, 90.61%, and 86.02% for the conv2, conv3, conv4, and conv5 layers, respectively. These high spike rates demonstrate the model's heightened activity during computation. In contrast, the concatenation-based skip connection model exhibits considerably lower spike rates: 61.59%, 49.43%, 30.67%, and 7.59% for the conv2, conv3, conv4, and conv5 layers, respectively. This indicates that fewer neurons are active during computation, leading to lower latency and potentially faster computation times.
In summary, while the addition-based skip connection model tends to increase activity across the network, the concatenation-based model successfully reduces the spike rate, potentially improving the efficiency of the network, particularly in latency-sensitive scenarios. These results further establish the importance of an appropriate skip connection strategy in designing efficient spiking neural networks.

Comparison with ANN
SNNs are renowned for their energy efficiency in comparison to traditional ANNs. To demonstrate this, we compare the approximate energy consumption of SNNs and ANNs, assuming both are built on the same architecture. Due to their event-driven nature and binary {1, 0} spike processing, SNNs have reduced computational complexity: a multiply-accumulate (MAC) operation reduces in an SNN to a simple floating-point (FP) addition, thus necessitating only an accumulation (AC) operation, whereas traditional ANNs require full MAC operations. In line with previous studies such as [32,52], we estimate the energy consumption of SNNs by counting the total operations involved. Using standard 45nm CMOS technology as a reference [53], we assign the energies E_MAC = 4.6 pJ and E_AC = 0.9 pJ for MAC and AC operations, respectively; a MAC operation thus consumes approximately 5.11 times more energy than an AC operation. Since neurons in SNNs only consume energy when they spike, we multiply the spike rate R_s(l) at layer l with the layer's FLOPs to obtain the SNN operation count. The total inference energies of the ANN (E_ANN) and the SNN (E_SNN) are calculated as E_ANN = Σ_l FLOPs(l) × E_MAC and E_SNN = Σ_l FLOPs(l) × R_s(l) × E_AC, respectively, where FLOPs(l) denotes the number of FLOPs in layer l.
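The two energy formulas above translate directly into code. The per-layer FLOP counts below are placeholder values for illustration; the spike rates are the concatenation-based model's rates reported earlier:

```python
# 45nm CMOS estimates from [53]: one MAC vs. one accumulate
E_MAC, E_AC = 4.6e-12, 0.9e-12  # Joules

def ann_energy(flops_per_layer):
    # every ANN operation is a full multiply-accumulate
    return sum(f * E_MAC for f in flops_per_layer)

def snn_energy(flops_per_layer, spike_rates):
    # only spiking neurons trigger (accumulate-only) computation
    return sum(f * r * E_AC for f, r in zip(flops_per_layer, spike_rates))
```

With equal FLOPs per layer and the spike rates 61.59%, 49.43%, 30.67%, and 7.59%, the SNN/ANN energy ratio comes out well under the 13% figure's order of magnitude, consistent with the comparison in the text.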
The energy consumption comparison between our method and a conventional ANN is presented in Table 4.4.3. Notably, while maintaining competitive performance, our method, which combines concatenation-based skip connections with a delay block, significantly reduces the relative energy cost. Against the ANN's relative energy cost of 1, our method operates at just 13% of the ANN's energy expenditure, demonstrating the superior energy efficiency of our approach.

Delay Design
There are several design variants for delay blocks. Here, we compare three types: (1) Layer-wise delay applies the same delay across all neurons in a layer.
(2) Channel-wise delay adds a timing delay for each channel, which is used in our method.
(3) Pixel-wise delay adds a timing delay for each spatial location; for intermediate layers with feature tensors of size C × H × W, we apply a delay at each of the H × W locations. Table 4.4.3 presents the comparative performance of these three delay block design variants, evaluating their impact on accuracy and latency on the Fashion-MNIST dataset. The results suggest that the effect of the delay varies significantly with the level at which it is applied. Specifically, channel-wise delay, as employed in our method, demonstrates the highest accuracy and the lowest latency, indicating its suitability for integration into concatenation-based skip connections. This demonstrates the benefit of applying a distinct delay to each channel, providing an effective balance between performance and computational efficiency.

Analysis on Delay Value
In this section, we investigate the impact of different initial values for the delay block. For this purpose, we set the initial delay values to [0, 0.25, 0.5, 0.75, 1.0] and train the model accordingly. As depicted in Fig. 6 (left), we present the performance corresponding to each initialization. We observe that when initialized with a small delay, the resulting latency remains small. Conversely, when the delay is initialized to a higher value, the resulting latency increases, but accuracy also improves. Fig. 6 (right) visualizes how these delay values fluctuate across epochs. We note that regardless of the initial value, all cases tend to gravitate towards a middle value over time: high initial values decrease, while low initial values increase.

Advantage of Skip Connection: Training Stability
Our analysis further explores the benefits of incorporating skip connections within the temporal SNN architecture. Prior research on artificial neural networks (ANNs) suggests that skip connections enhance training stability and accelerate convergence [22]. We examine this premise in our context by tracking and visualizing the evolution of training loss and accuracy over successive epochs, as depicted in Fig. 7. Our findings corroborate these assertions, demonstrating that our optimized SNN architectures converge quickly while maintaining high test accuracy. This reinforces the value of skip connections as a significant contributor to the performance and efficiency of temporal SNNs.

Experiments on Wave Equation
To broaden the scope of our investigation, we extend the application of TTFS coding to tasks within the domain of scientific machine learning (SciML). An important topic in this field is how to accurately approximate solutions of partial differential equations.
In particular, we are interested in solving the inverse (time-reversal) problem of locating sources in an underwater acoustic domain from measurements taken at a later time. There has been substantial research in this domain, with and without machine learning, as surveyed in [54]. Transformer-based architectures have recently been applied to this task as well [55]. Another proposed approach combines the time-reversal method with a machine-learning-based inference system [24-26]. Most methods still rely on ANNs, which can be expensive to train. The challenge here is formulating the problem as a classification task, which aligns with the current implementation of TTFS. The mathematical formulation of the wave problem investigated in this work is

ü(x, y, t) = c² (u_xx + u_yy),   u(x, y, 0) = u_0(x, y),

where u(x, y, t) is the acoustic wave pressure, and c is the wave propagation velocity (assumed constant in this work and equal to 1484 m/s). A single dot over u denotes a first derivative with respect to time; a double dot denotes a second derivative. The function u_0 is the initial condition for the wave propagation, determined by the source. This initial condition is taken as a small Gaussian eruption, f(x) = exp(−((x − x̄)/0.05)²) with x̄ = (x_max − x_min)/2, that mimics a localized source (a point source that has been smoothed to avoid numerical artifacts). The goal is to recover the location of the small initial eruption from measurements of the wave pressure at some later time t. This location is defined as a grid point. To turn the problem into a classification task, we split the grid into zones and infer the zone in which the source is located. The more zones we use, the more precise the localization; however, more zones create a harder classification task with growing computational demands.
Dataset Configuration: We generate a synthetic dataset based on the wave equation for a domain of N_x × N_y locations. We use a finite-difference numerical scheme with central differences, preserving up to second-order accuracy. We choose the ratio between the spatial and temporal discretizations to satisfy the Courant-Friedrichs-Lewy (CFL) condition, so that the scheme is stable. To create the synthetic dataset, for each sample we choose a location for the source, use the solver to march in time, and compute the wave pressure across the domain at the 100-th time step. We then create a label based on the location of the source, so the pairs consisting of the pressure image at the 100-th time step and the corresponding label (source) form the dataset.
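The time-marching step of such a solver can be sketched as follows. This is a minimal illustration, not the paper's solver: it uses periodic padding (via np.roll) for brevity, whereas a realistic simulation would impose proper boundary conditions:

```python
import numpy as np

def wave_step(u_prev, u_curr, c, dt, dx):
    """One explicit second-order central-difference update of the 2-D wave
    equation u_tt = c^2 (u_xx + u_yy). Stable when c*dt/dx <= 1/sqrt(2)
    (the CFL condition for this 2-D scheme)."""
    lap = (np.roll(u_curr, 1, 0) + np.roll(u_curr, -1, 0) +
           np.roll(u_curr, 1, 1) + np.roll(u_curr, -1, 1) - 4.0 * u_curr) / dx**2
    return 2.0 * u_curr - u_prev + (c * dt)**2 * lap
```

Calling this routine 100 times from the Gaussian initial condition yields the pressure field used as the network input for each sample.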
In detail, we posit the existence of a wave source at every location within the domain, excluding a 10-pixel border around the outer boundary, resulting in (N_x − 10) × (N_y − 10) data samples. As mentioned above, for each wave source location we compute the wave pressure at the 100-th time step. To create the labels, we segment the domain into M × M zones and assign each source location a label from 0 to M² − 1, as shown in Fig. 8 for 3 × 3 zone-based labeling. The labeling process is a basic quantization of the domain into smaller segments, assigning each source a label according to the region it belongs to. For 3 × 3 labeling, we have nine zones and thus nine classes. To obtain more precise localization, one can use 6 × 6 zones (as shown later), yielding 36 classes for the classification mechanism. Finally, we randomly partition these data samples into training and testing sets at an 80:20 ratio.
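The zone-based labeling amounts to integer quantization of the source coordinates. Below is a minimal sketch; the row-major zone ordering and the function name `zone_label` are our assumptions — the text only requires that each zone receive a distinct label (cf. Fig. 8).

```python
def zone_label(ix, iy, nx, ny, m=3):
    """Map a source grid location (ix, iy) on an nx x ny grid to one
    of m * m zone labels in 0 .. m**2 - 1 (row-major zone ordering)."""
    zx = min(ix * m // nx, m - 1)  # zone row index
    zy = min(iy * m // ny, m - 1)  # zone column index
    return zx * m + zy

# For 3 x 3 labeling on a hypothetical 90 x 90 grid there are nine classes.
labels = {zone_label(ix, iy, 90, 90, m=3)
          for ix in range(90) for iy in range(90)}
```

Switching to 6 × 6 zones is just `m=6`, which quadruples the number of classes without changing the data-generation pipeline.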
In Table 7, we report the accuracy and latency on the wave equation problem, using the architecture shown in Fig. 4. Note that 6 × 6 zone-based labeling is a more difficult task than 3 × 3 zone-based labeling, as the model must classify a larger number of zones. We make the following observations: (1) The general performance trend aligns with what we observed for image recognition tasks. Concatenation-based skip connection architectures, especially when paired with a delay block, show superior performance in balancing high accuracy with low latency. This supports the effectiveness of our proposed architecture not only on static image datasets but also on SciML tasks such as solving time-reversal problems for wave equations. (2) The table also reveals that changing the labeling strategy from 3 × 3 zones to 6 × 6 zones increases latency across all models. This is intuitive, as a larger number of zones (classes) involves more computation and hence a longer latency. In addition to the increased latency, a larger number of zones also results in slightly reduced accuracy for all methods. This is likely due to the increased complexity of the task, which may require a more sophisticated model or additional training to match the accuracy observed with fewer zones. In Table 8, we compare the energy efficiency of the SNN with an ANN. As in image classification, our SNN consumes only 14% of the ANN's energy while sacrificing ≤ 1% accuracy.

Conclusion
In conclusion, this study has made significant strides in the exploration of TTFS coding and the optimization of skip connection architectures for improving the efficiency and accuracy of SNNs. We discovered that while addition-based skip connections introduce temporal delays, concatenation-based skip connections tend to miss crucial information from the non-linear operation branch. To address these challenges, we proposed a novel approach that introduces a learnable delay for skip connections, bridging the gap between the spike timing discrepancies of the convolution and skip branches. We demonstrated that this method not only accelerates the first spike's timing but also maintains accuracy, offering an effective solution for faster prediction in TTFS coding. We also extended our exploration to SciML tasks, unveiling the potential of TTFS coding beyond image recognition applications. Our findings suggest that there is room for further research in optimizing the network architecture of temporal SNNs, and we hope that our work will inspire new approaches and applications in this exciting field. In the future, we aim to further improve the effectiveness of our proposed method and explore its applicability to even larger and more complex tasks. We believe that the continuing evolution of SNN architectures will significantly contribute to the advancement of low-power, efficient, and highly accurate artificial intelligence systems.
is a multi-program national laboratory operated for the U.S. Department of Energy (DOE) by Battelle Memorial Institute under Contract No. DE-AC05-76RL01830.

Fig. 1 Illustration of a spike timing delay through layers. (a) Change of membrane potential when there are input spikes at times 1, 2, and 3. (b) We visualize the histogram of spike timing after the Conv3, Conv4, and Conv5 layers.

Fig. 2 Illustration of the spike timing delay of two skip connection architectures. We visualize the histogram of spike timing for ① the skip branch, ② the convolutional branch, and ③ after combining the two branches.

Fig. 3 Illustration of the concatenation-based skip connection architecture with a delay block.

Fig. 4 Illustration of the architectures of Baseline CNN, Addition-based skip connection, and Concatenation-based skip connection.

Fig. 5 Spike rate in different architectures. We use the Fashion-MNIST dataset and measure the spike rate in the Conv2 ∼ Conv5 layers.

Fig. 6 Left: Accuracy and latency trade-off with respect to the time initialization in the delay block. Right: Change of delay as training progresses. We train and test the model on Fashion-MNIST.

Fig. 7 Comparison of the (a) training loss and (b) training accuracy across different architectures on Fashion-MNIST.

Fig. 8 The figure on the left illustrates an example of zone-based labeling for the source localization problem for the wave equation. The image is divided into 3 × 3 zones, with each zone assigned a distinct label. On the right, nine wave images corresponding to the nine labels are displayed. Each image represents the wave shape at the 100-th time step, originating from a wave source located at the center of the respective zone.

Table 1
Classification Accuracy (%) and latency of skip connection architectures on the MNIST dataset.

Table 2
Classification Accuracy (%) and latency of skip connection architectures on the Fashion-MNIST dataset.
Classification Accuracy (%) and latency of skip connection architectures on the Fashion-MNIST dataset.

Table 3
Accuracy (%) comparison among the previous work on the MNIST dataset.

Table 4
Accuracy (%) comparison among the previous work on the Fashion-MNIST dataset.

Table 5
Energy-efficiency comparison between ANN and SNN (with concatenation-based skip + delay) at inference.We use the Fashion-MNIST dataset for the experiments.

Table 6
Accuracy (%) comparison among the different delay block designs.We use the Fashion-MNIST dataset for the experiments.

Table 7
Classification Accuracy (%) and latency on the time-reversal source localization problem for the wave equation. Labeling Strategy means we use M × M zones for assigning a label to each source location. Fig. 8 shows an example of 3 × 3 zone-based labeling.

Table 8
Energy-efficiency comparison between ANN and SNN (with concatenation-based skip + delay) at inference on the wave equation problem.We use the 6 × 6 zone-based labeling for the experiments.