HYPOTHESIS AND THEORY article

Front. High Perform. Comput.

Sec. Architecture and Systems

Volume 3 - 2025 | doi: 10.3389/fhpcp.2025.1393936

This article is part of the Research Topic: Optimizing I/O Performance in High-Performance Computing Systems

An analysis of the I/O semantic gaps of HPC storage stacks

Provisionally accepted
  • 1 Technical University Dresden, Dresden, Germany
  • 2 University of Göttingen, Göttingen, Lower Saxony, Germany

The final, formatted version of the article will be published soon.

Modern High Performance Computing (HPC) I/O systems consist of stacked hardware and software layers that provide interfaces for data access. Depending on application needs, developers usually choose higher layers with richer semantics for ease of use or lower layers for performance. Each I/O interface on a given stack consists of a set of operations and their syntactic definition, as well as a set of semantic properties. To function properly, high-level libraries such as HDF5 need to map their semantics to lower-level APIs such as POSIX. Lower-level storage backends provide different I/O semantics than the layers above them in the stack, while sometimes implementing the same interface. However, most I/O interfaces do not transport semantic information through their APIs. Ideally, no semantics of an I/O operation should be lost while passing through the I/O stack, allowing lower layers to optimize performance. Unfortunately, there is no general definition or unified taxonomy of I/O semantics. Similarly, system-level APIs offer little support for passing semantics to underlying layers. Thus, passing semantic information between layers is currently not feasible. In this paper, we systematically compare I/O interfaces by examining their semantics across the HPC I/O stack. Our primary goal is to provide a taxonomy and comparative analysis, not to propose a new I/O interface or implementation. We propose a general definition of I/O semantics and present a unified classification of I/O semantics based on the categories of concurrent access, persistency, consistency, spatiality, temporality and mutability. This allows us to compare I/O interfaces in terms of their I/O semantics. We show that semantic information is lost while traveling through the storage stack, which often prevents the underlying storage backends from making the proper performance and consistency decisions. In other words, each layer acts like a semantic filter for the lower layers.
We discuss how higher-level abstractions could propagate their semantics and assumptions down through the lower levels of the I/O stack. As a possible mitigation, we discuss the conceptual design of semantics-aware interfaces to illustrate how such interfaces might address semantic loss, though we do not propose a concrete new implementation.
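
The "semantic filter" idea can be pictured with a small toy model. The sketch below is purely illustrative and is not taken from the article: the six field names follow the categories listed in the abstract, while the `Layer` class, the layer names, the example values, and the choice of which categories each layer forwards are all hypothetical assumptions that do not describe any real library or file system.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class IOSemantics:
    """One optional slot per category named in the abstract."""
    concurrent_access: Optional[str] = None
    persistency: Optional[str] = None
    consistency: Optional[str] = None
    spatiality: Optional[str] = None
    temporality: Optional[str] = None
    mutability: Optional[str] = None


@dataclass(frozen=True)
class Layer:
    """Hypothetical stack layer; `forwards` lists what its API can express."""
    name: str
    forwards: frozenset

    def pass_down(self, sem: IOSemantics) -> IOSemantics:
        # Categories the interface cannot express are dropped (set to None).
        kept = {
            f.name: (getattr(sem, f.name) if f.name in self.forwards else None)
            for f in fields(sem)
        }
        return IOSemantics(**kept)


def run_stack(layers, sem: IOSemantics) -> IOSemantics:
    """Push the semantics through every layer, top to bottom."""
    for layer in layers:
        sem = layer.pass_down(sem)
    return sem


def surviving(sem: IOSemantics):
    """Names of the categories that still carry information."""
    return sorted(f.name for f in fields(sem) if getattr(sem, f.name) is not None)


# Invented application intent and invented layer capabilities.
intent = IOSemantics(
    concurrent_access="single writer",
    persistency="durable at close",
    consistency="visible after close",
    spatiality="contiguous",
    temporality="write once, read later",
    mutability="immutable after close",
)
stack = [
    Layer("library_layer", frozenset({"persistency", "consistency", "spatiality"})),
    Layer("system_layer", frozenset({"persistency"})),
]
result = run_stack(stack, intent)
print(surviving(result))
```

In this toy stack, each layer can only reduce the set of categories that reach the layer below it, which is the sense in which a layer acts as a filter. A semantics-aware interface would, under the same assumptions, correspond to a layer whose `forwards` set covers all six categories.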

Keywords: HPC, I/O, I/O stack, I/O semantics, POSIX, MPIIO

Received: 29 Feb 2024; Accepted: 18 Jul 2025.

Copyright: © 2025 Oeste, Höhn, Kluge and Kunkel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Sebastian Oeste, Technical University Dresden, Dresden, Germany

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.