AUTHOR=Xu Yingfu, Shidqi Kevin, van Schaik Gert-Jan, Bilgic Refik, Dobrita Alexandra, Wang Shenqi, Meijer Roy, Nembhani Prithvish, Arjmand Cina, Martinello Pietro, Gebregiorgis Anteneh, Hamdioui Said, Detterer Paul, Traferro Stefano, Konijnenburg Mario, Vadivel Kanishkan, Sifalakis Manolis, Tang Guangzhi, Yousefzadeh Amirreza
TITLE=Optimizing event-based neural networks on digital neuromorphic architecture: a comprehensive design space exploration
JOURNAL=Frontiers in Neuroscience
VOLUME=18
YEAR=2024
URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2024.1335422
DOI=10.3389/fnins.2024.1335422
ISSN=1662-453X
ABSTRACT=Neuromorphic processors promise low-latency, energy-efficient processing by adopting novel brain-inspired design methodologies. Yet current neuromorphic solutions still struggle to rival the performance and area efficiency of conventional deep learning accelerators in practical applications. Event-driven data-flow processing and near/in-memory computing are the two dominant design trends in neuromorphic processors. However, challenges remain in reducing the overhead of event-driven processing and increasing the mapping efficiency of near/in-memory computing, both of which directly impact performance and area efficiency. In this work, we discuss these challenges and present our exploration of optimizing event-based neural network inference on SENECA, a scalable and flexible neuromorphic architecture. To address the overhead of event-driven processing, we perform a comprehensive design space exploration and propose spike-grouping to reduce total energy and latency. Furthermore, we introduce event-driven depth-first convolution to improve the area efficiency and latency of convolutional neural networks (CNNs) on the neuromorphic processor. We benchmarked our optimized solution on keyword spotting, sensor fusion, digit recognition, and high-resolution object detection tasks.
Compared with other state-of-the-art large-scale neuromorphic processors, our proposed optimizations result in a 6$\times$ to 300$\times$ improvement in energy efficiency, a 3$\times$ to 15$\times$ improvement in latency, and a 3$\times$ to 100$\times$ improvement in area efficiency. Our optimizations for event-based neural networks can potentially be generalized to a wide range of event-based neuromorphic processors.