
ORIGINAL RESEARCH article

Front. High Perform. Comput.
Sec. Architecture and Systems
Volume 2 - 2024 | doi: 10.3389/fhpcp.2024.1360720

Using Open-Science Workflow Tools to Produce SCEC CyberShake Physics-Based Probabilistic Seismic Hazard Models
Provisionally Accepted

Scott Callaghan1*, Philip J. Maechling1, Fabio Silva1, Mei-Hui Su1, Kevin R. Milner1, Robert W. Graves2, Kim B. Olsen3, Yifeng Cui4, Karan Vahi5, Albert Kottke6, Ewa Deelman5, Christine A. Goulet7, Thomas H. Jordan8, Yehuda Ben-Zion1
  • 1Statewide California Earthquake Center, University of Southern California, United States
  • 2Earthquake Hazards Program, United States Geological Survey, United States
  • 3Department of Geological Sciences, San Diego State University, United States
  • 4San Diego Supercomputer Center, Jacobs School of Engineering, University of California, San Diego, United States
  • 5Information Sciences Institute, University of Southern California, United States
  • 6Pacific Gas and Electric Company, United States
  • 7Earthquake Science Center, United States Geological Survey, United States
  • 8Department of Earth Sciences, University of Southern California, United States

The final, formatted version of the article will be published soon.

The Statewide (formerly Southern) California Earthquake Center (SCEC) conducts multidisciplinary earthquake system science research that aims to develop predictive models of earthquake processes and to produce accurate seismic hazard information that can improve societal preparedness and resiliency to earthquake hazards. As part of this program, SCEC has developed the CyberShake platform, which calculates physics-based probabilistic seismic hazard analysis (PSHA) models for regions with high-quality seismic velocity and fault models. PSHA hazard models provide estimates of possible future peak ground motions for sites of interest that are useful for building engineering, insurance rates, and disaster planning. The CyberShake platform implements a sophisticated computational pipeline that includes over 15 individual codes written by 6 developers. These codes are heterogeneous, ranging from short-running, high-throughput, serial CPU codes to large, long-running, parallel GPU codes. Additionally, CyberShake simulation campaigns are computationally extensive, typically producing tens of terabytes of meaningful scientific data and metadata over several months of around-the-clock execution on leadership-class supercomputers. We present the workflow software stack required to support CyberShake campaigns, including open-source workflow tools and custom solutions. We identify how the CyberShake platform and supporting tools enable us to meet a variety of challenges that come with large-scale simulations, such as automated remote job submission, data management, and verification and validation. This platform enabled us to perform our most recent simulation campaign, CyberShake Study 22.12, from December 2022 to April 2023. During this time, our workflow tools executed approximately 32,000 jobs and used about 770,000 node-hours on the Summit system at the Oak Ridge Leadership Computing Facility. At peak, we utilized 73% of the system. Our workflow tools managed about 2.5 PB of total data and automatically staged 19 million output files totaling 74 TB back to archival storage on the University of Southern California's Center for Advanced Research Computing systems. CyberShake extreme-scale workflows have generated simulation-based probabilistic seismic hazard models that are being used by seismological, engineering, and governmental communities.
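
To give a rough sense of scale, the campaign totals quoted above can be turned into per-job and per-file averages. The following is only a back-of-the-envelope sketch: the inputs are the approximate figures from the abstract, the variable names are illustrative, and decimal units (1 TB = 10^12 bytes) are an assumption. Because the pipeline mixes short serial jobs with large parallel ones, these averages do not describe any typical job or file.

```python
# Back-of-the-envelope averages from the approximate totals in the abstract.
# Variable names are illustrative; decimal units (1 TB = 10**12 bytes) are assumed.
JOBS = 32_000              # "approximately 32,000 jobs"
NODE_HOURS = 770_000       # "about 770,000 node-hours"
FILES_STAGED = 19_000_000  # "19 million output files"
STAGED_TB = 74             # "totaling 74 TB"

# Mean compute cost per job, in node-hours.
node_hours_per_job = NODE_HOURS / JOBS

# Mean size of a staged output file, in megabytes (1 MB = 10**6 bytes).
avg_file_mb = STAGED_TB * 10**12 / FILES_STAGED / 10**6

print(f"{node_hours_per_job:.1f} node-hours per job on average")
print(f"{avg_file_mb:.1f} MB per staged output file on average")
```

Under these assumptions the averages come out to roughly 24 node-hours per job and about 4 MB per staged file, which is consistent with a workload dominated by many small output files.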

Keywords: scientific workflows, probabilistic seismic hazard analysis, high performance computing, seismic simulations, distributed computing, computational modeling

Received: 23 Dec 2023; Accepted: 19 Feb 2024.

Copyright: © 2024 Callaghan, Maechling, Silva, Su, Milner, Graves, Olsen, Cui, Vahi, Kottke, Deelman, Goulet, Jordan and Ben-Zion. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mx. Scott Callaghan, Statewide California Earthquake Center, University of Southern California, Los Angeles, United States