HYPOTHESIS AND THEORY article
Front. High Perform. Comput.
Sec. Architecture and Systems
Volume 3 - 2025 | doi: 10.3389/fhpcp.2025.1393936
This article is part of the Research Topic: Optimizing I/O Performance in High-Performance Computing Systems
An analysis of the I/O semantic gaps of HPC storage stacks
Provisionally accepted
- 1 Technical University Dresden, Dresden, Germany
- 2 University of Göttingen, Göttingen, Lower Saxony, Germany
Modern High Performance Computing (HPC) I/O systems consist of stacked hardware and software layers that provide interfaces for data access. Depending on application needs, developers usually choose higher layers with richer semantics for ease of use, or lower layers for performance. Each I/O interface on a given stack consists of a set of operations and their syntactic definition, as well as a set of semantic properties. To function properly, high-level libraries such as HDF5 need to map their semantics to lower-level APIs such as POSIX. Lower-level storage backends provide different I/O semantics than the layers above them in the stack, while sometimes implementing the same interface. However, most I/O interfaces do not transport semantic information through their APIs. Ideally, no semantics of an I/O operation should be lost while passing through the I/O stack, allowing lower layers to optimize performance. Unfortunately, there is no general definition or unified taxonomy of I/O semantics. Similarly, system-level APIs offer little support for passing semantics to underlying layers. Thus, passing semantic information between layers is currently not feasible.
In this paper, we systematically compare I/O interfaces by examining their semantics across the HPC I/O stack. Our primary goal is to provide a taxonomy and comparative analysis, not to propose a new I/O interface or implementation. We propose a general definition of I/O semantics and present a unified classification of I/O semantics based on the categories of concurrent access, persistency, consistency, spatiality, temporality and mutability. This allows us to compare I/O interfaces in terms of their I/O semantics. We show that semantic information is lost while traveling through the storage stack, which often prevents the underlying storage backends from making the proper performance and consistency decisions. In other words, each layer acts like a semantic filter for the lower layers.
We discuss how higher-level abstractions could propagate their semantics and assumptions down through the lower levels of the I/O stack. As a possible mitigation, we discuss the conceptual design of semantics-aware interfaces to illustrate how such interfaces might address semantic loss, though we do not propose a concrete new implementation.
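The "semantic filter" idea can be illustrated with a small toy model. The sketch below is not taken from the article: the layer names, the `Layer` and `propagate` helpers, and the sets of categories each layer can express are all hypothetical, chosen only to show how information about a request shrinks as it moves down a stack. Only the six category names come from the classification described above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Semantic(Enum):
    """The six categories of I/O semantics named in the abstract."""
    CONCURRENT_ACCESS = auto()
    PERSISTENCY = auto()
    CONSISTENCY = auto()
    SPATIALITY = auto()
    TEMPORALITY = auto()
    MUTABILITY = auto()


@dataclass(frozen=True)
class Layer:
    """A hypothetical I/O layer (illustrative, not a real API)."""
    name: str
    # Categories this layer's interface is assumed able to pass downward.
    expressible: frozenset


def propagate(request, stack):
    """Return, for each layer, the semantics still known after passing it.

    Each layer keeps only what its interface can express, so the known
    set can shrink on the way down but never grow back.
    """
    known = frozenset(request)
    trace = []
    for layer in stack:
        known = known & layer.expressible
        trace.append((layer.name, known))
    return trace


# Hypothetical three-layer stack; the category sets are assumptions
# made for illustration, not measurements of any real interface.
stack = [
    Layer("high_level_library", frozenset(Semantic)),
    Layer("middleware", frozenset({Semantic.CONCURRENT_ACCESS,
                                   Semantic.CONSISTENCY,
                                   Semantic.SPATIALITY})),
    Layer("low_level_api", frozenset({Semantic.CONSISTENCY,
                                      Semantic.PERSISTENCY})),
]

# Semantics the application attaches to one I/O request (illustrative).
request = {Semantic.CONSISTENCY, Semantic.SPATIALITY, Semantic.MUTABILITY}

trace = propagate(request, stack)
for name, known in trace:
    print(name, sorted(s.name for s in known))
```

In this toy model the backend ends up knowing strictly less about the request than the application did, which is the situation a semantics-aware interface would aim to avoid by widening what each layer can express.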
Keywords: HPC, I/O, I/O stack, I/O semantics, POSIX, MPIIO
Received: 29 Feb 2024; Accepted: 18 Jul 2025.
Copyright: © 2025 Oeste, Höhn, Kluge and Kunkel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Sebastian Oeste, Technical University Dresden, Dresden, Germany
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.