
ORIGINAL RESEARCH article

Front. High Perform. Comput.

Sec. HPC Applications

Volume 3 - 2025 | doi: 10.3389/fhpcp.2025.1638203

This article is part of the Research Topic: Advancements in Extreme-Scale I/O, Storage Systems, and Data Analytics

Toward a Persistent Event-Streaming System for High-Performance Computing Applications

Provisionally accepted
Matthieu Dorier 1*; Amal Gueroudji 1; Valerie Hayot-Sasson 2; Hai Nguyen 1,2; Seth Ockerman 3; Renan Souza 4; Tekin Bicer 1; Haochen Pan 2; Philip Carns 1; Kyle Chard 1,2; Ryan Chard 1; Maxime Gonthier 1,2; Eliu A Huerta 1; Ben Lenard 1; Bogdan Nicolae 1; Parth Patel 1; Justin M Wozniak 1; Ian Foster 1,2; Nageswara Rao 4; Robert B Ross 1
  • 1 Argonne National Laboratory, Lemont, United States
  • 2 The University of Chicago Department of Computer Science, Chicago, United States
  • 3 University of Wisconsin-Madison, Madison, United States
  • 4 Oak Ridge National Laboratory, Oak Ridge, United States

The final, formatted version of the article will be published soon.

High-performance computing (HPC) applications have traditionally relied on parallel file systems and file transfer services to manage data movement and storage. Alternative approaches have been proposed that use direct communications between application components, trading persistence and fault tolerance for speed. Event-driven architectures, as popularized in enterprise contexts, present a compelling middle ground, avoiding the performance cost and API constraints of parallel file systems while retaining persistence and offering impedance matching between application components. However, adapting streaming frameworks to HPC workloads requires addressing challenges unique to HPC systems. This paper investigates the potential for a streaming framework designed for HPC infrastructures and use cases. We introduce Mofka, a persistent event-streaming framework designed specifically for HPC environments. Mofka combines the capabilities of a traditional streaming service with optimizations tailored to the HPC context, such as support for massively multicore nodes, efficient scaling for large producer-consumer workflows, RDMA-enabled high-performance network communications, specialized network fabrics with multiple links per node, and efficient handling of large scientific data payloads. Built using the Mochi suite of HPC data service components, Mofka provides a lightweight, modular, and high-performance solution for persistent streaming in HPC systems. We present the architecture of Mofka and evaluate its performance against Kafka and Redpanda using benchmarks on diverse platforms, including Argonne's Polaris and Oak Ridge's Frontier supercomputers, showing up to 8× improvement in throughput in some scenarios. We then demonstrate its utility in several real-world applications: a tomographic reconstruction pipeline, a workflow for the discovery of metal-organic frameworks for carbon capture, and the instrumentation of Dask workflows for provenance tracking and performance analysis.

Keywords: HPC, I/O, Streaming, Mochi, Mofka, Kafka, Redpanda

Received: 30 May 2025; Accepted: 07 Aug 2025.

Copyright: © 2025 Dorier, Gueroudji, Hayot-Sasson, Nguyen, Ockerman, Souza, Bicer, Pan, Carns, Chard, Chard, Gonthier, Huerta, Lenard, Nicolae, Patel, Wozniak, Foster, Rao and Ross. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Matthieu Dorier, Argonne National Laboratory, Lemont, United States

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.