Building Precise Local Submarine Earthquake Catalogs via a Deep-Learning-Empowered Workflow and its Application to the Challenger Deep

Submarine active faults and earthquakes, which contain crucial information about seafloor tectonics and submarine geohazards, can be effectively characterized by precise submarine earthquake catalogs. However, building such catalogs precisely and rapidly is challenging for three reasons: (i) intense noise in ocean seismic data; (ii) sparse seismic networks; and (iii) the lack of historical near-field observations. In this paper, we build a deep-learning-based workflow named ESPRH for automatically building submarine earthquake catalogs from continuous seismograms. The ESPRH workflow integrates Earthquake Transformer (EqT) and Siamese Earthquake Transformer (S-EqT) for initial earthquake detection and phase picking, PickNet for phase refinement, REAL for earthquake association and rough location, and HypoInverse and HypoDD for precise earthquake relocation. We apply ESPRH to the continuous data recorded by an array of 12 broadband Ocean Bottom Seismographs (OBS) near the Challenger Deep at the southernmost Mariana subduction zone from Dec. 2016 to Jun. 2017. We obtain a high-resolution local earthquake catalog that provides new insights into the geometry of shallow fault zones. We report active submarine faults delineated by seismicity in the Challenger Deep, the deepest place on Earth. These faults are a significant reference for submarine geological hazards and provide evidence for serpentinization. Hence, ESPRH is able to construct comprehensive local submarine earthquake catalogs automatically, rapidly, and precisely from raw OBS seismic data.


INTRODUCTION
Submarine seismicity and active faults are essential for the analysis and monitoring of submarine geohazards. A local earthquake catalog can reveal the detailed geometry of faults, providing critical insights into tectonics and earthquake disasters. However, seismic datasets are generally large, and manual extraction of earthquake signals by human experts is subjective and time-consuming. Many traditional automatic earthquake detection methods have been proposed to address this problem, such as the short-term average/long-term average algorithm (STA/LTA) (Allen, 1978) and autoregression with the Akaike Information Criterion (AIC) (Sleeman and van Eck, 1999). However, these methods are less precise than human experts and rely on hyperparameters, limiting their performance when processing complex seismic data with different types of noise and variable signal-to-noise ratios. The template matching method (Gibbons and Ringdal, 2006; Peng and Zhao, 2009) is widely used for building earthquake catalogs; it exploits the waveform similarity between nearby earthquakes using previously identified earthquake templates. However, its computational cost is relatively high, and sufficient templates are generally unavailable for OBS networks due to the lack of historical observations. Unlike conventional methods that only use a few manually designed features, machine-learning-based methods (Bishop, 2006), especially deep neural networks (Lecun et al., 2015), can automatically extract rich features from extensive seismic data. Recently, researchers have made considerable progress in earthquake detection and phase picking via deep-learning-based methods (e.g., Perol et al., 2018; Mousavi et al., 2019b; Mousavi et al., 2020; Pardo et al., 2019; Ross et al., 2019; Wang et al., 2019; Wu et al., 2019; Zhou et al., 2019; Zhu and Beroza, 2019).
The Earthquake Transformer (EqT) (Mousavi et al., 2020) model achieves state-of-the-art performance of ~99% precision, ~99% recall, and ~0.01 s mean absolute error for picking P and S phases on the STanford EArthquake Dataset (STEAD) (Mousavi et al., 2019a), outperforming the other popular models. Xiao et al. (2021) proposed the Siamese Earthquake Transformer (S-EqT) model to address the false-negative issue in the EqT model. However, due to the limited distribution of the training set (e.g., 92% of seismograms in the STEAD dataset are within 110 km epicenter distance), the phase picking precision of both EqT and S-EqT decreases on seismograms with epicenter distances larger than 110 km. Although Wang et al. (2019) proposed a neural network (PickNet) for regional seismic arrival picking with epicenter distances up to ~1,000 km, their method does not include earthquake detection and requires a pre-existing regional earthquake catalog.
Thus, in our study, we combine these methods into a practical workflow, named ESPRH, for building submarine earthquake catalogs (Figure 1). The workflow consists of three stages: the first stage is earthquake detection and phase picking by EqT, S-EqT, and PickNet; the second stage is earthquake association and initial location by REAL; the final stage is relocation of the detected earthquakes by HypoInverse and HypoDD. We applied ESPRH to continuous seismic data recorded by an array of 12 broadband OBSs deployed at the Challenger Deep, the deepest point in the ocean, from Dec. 2016 to Jun. 2017. The Challenger Deep is located in the Mariana Trench, a subduction zone between the Philippine and Pacific plates. The high seismicity and unique geographical location make this dataset ideal for validating our workflow. As a result, we obtained a catalog containing 1,383 relocated local earthquakes, which is about two times larger than that of recent work using the same data with template matching. Our catalog reveals the detailed geometry of the faults and seismicity in the Challenger Deep and shows that the ESPRH workflow is suitable for building local submarine earthquake catalogs, which may contribute to the understanding of the Earth's interior.

DATA
The data used in this study were recorded by an array of 12 OBSs (Figure 2) deployed at ocean depths of more than 8,000 m during the Southern Mariana OBS Experiment (SMOE) from Dec. 2016 to Jun. 2017. The OBS (STS-G60), equipped with a three-component sensor, was developed by the Institute of Geology and Geophysics, Chinese Academy of Sciences. The OBS sensor is designed for a low-frequency response of 30 s with a sampling rate of 100 Hz. After time correction, the data are treated differently in different processing steps: they are filtered to 1-45 Hz when fed to the EqT and S-EqT models; they are left unfiltered when passed to the PickNet model; they are filtered to 0.2-10 Hz with instrument response
In the first stage, we apply EqT for the initial earthquake detection and phase picking because it leverages advanced deep-learning techniques, such as transformers and residual connections, and achieves state-of-the-art performance on the STEAD dataset (Mousavi et al., 2019a), which is currently the largest public dataset. 92% of seismograms in STEAD are within 110 km epicenter distance, and most passive OBS experiments are within this range. Hence, EqT is well suited to initial earthquake detection and phase picking. We then feed the outputs of EqT to S-EqT to further reduce the false-negative rate of the EqT model. S-EqT is a pair-wise deep-learning model that retrieves previously missed phase picks in low-SNR seismograms based on their similarities to other confident phase picks in high-dimensional spaces. The outputs are then fed to PickNet, which is trained for phase arrival picking on a dataset of seismograms with epicenter distances up to ~1,000 km, for phase arrival time refinement. Here "refinement" means that the P and S phase picks created by EqT and S-EqT are used to create input time windows for PickNet. Because the PickNet model predicts only one arrival time per time window, the picks refined by PickNet are limited to a 2-s time range around those from EqT and S-EqT to prevent PickNet from refining one pick multiple times. The thresholds for earthquake detection, P-phase picking, and S-phase picking in the three models are 0.3, 0.1, and 0.1, respectively.
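The 2-s constraint on refined picks described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `refine_pick` is hypothetical, the "2-s time range" is interpreted as an absolute shift of at most 2 s from the initial pick, and falling back to the initial pick when the refined pick is rejected is an assumed behavior.

```python
from typing import Optional

# Maximum allowed shift between the initial (EqT/S-EqT) pick and the
# refined (PickNet) pick, in seconds. The 2 s value is from the text;
# treating it as a +/- bound is an assumption.
MAX_SHIFT_S = 2.0


def refine_pick(initial_time: float,
                refined_time: Optional[float],
                max_shift: float = MAX_SHIFT_S) -> float:
    """Return the refined arrival time if it stays within `max_shift`
    seconds of the initial pick; otherwise keep the initial pick
    (assumed fallback)."""
    if refined_time is None:
        # The refinement step produced no pick for this window.
        return initial_time
    if abs(refined_time - initial_time) <= max_shift:
        return refined_time
    return initial_time
```

Arrival times here are plain floats in seconds relative to a common origin; a real pipeline would more likely carry absolute timestamps.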
In the second stage, we use REAL to associate these phases into events and obtain coarse locations. REAL employs grid searching and is rapid and reliable for these tasks. In the REAL method, the center of the searched area is at the station that recorded the initiating P phase. The coarse location of an event is the grid point that has the maximum number of picks. If several grid points share the maximum number of picks, the grid point with the smallest residual is chosen. We discard events associated with fewer than 3 P picks or fewer than 4 P and S picks in total. The magnitudes of associated earthquakes are estimated on the Richter magnitude scale (Richter, 1935) using 50-second-long slices after the P and S phases.
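The selection and rejection rules above can be sketched as a small function. This is an illustrative sketch only and does not reproduce REAL's implementation or interface: the `GridCandidate` class, its field names, and `select_location` are hypothetical, and each candidate is assumed to already carry its pick counts and travel-time residual.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GridCandidate:
    """One grid point with its associated picks (hypothetical structure)."""
    lat: float
    lon: float
    depth_km: float
    n_p: int            # number of associated P picks
    n_s: int            # number of associated S picks
    residual_s: float   # travel-time residual at this grid point, seconds


def select_location(cands: List[GridCandidate],
                    min_p: int = 3,
                    min_total: int = 4) -> Optional[GridCandidate]:
    """Pick the grid point with the most picks, breaking ties by the
    smallest residual; reject the event if it has too few picks."""
    if not cands:
        return None
    # Sort key: more picks first, then smaller residual.
    best = max(cands, key=lambda c: (c.n_p + c.n_s, -c.residual_s))
    if best.n_p < min_p or best.n_p + best.n_s < min_total:
        return None
    return best
```

The thresholds of 3 P picks and 4 total picks come from the text; they are exposed as parameters so the sketch can be tried with other values.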
In the third stage, we use HypoInverse to improve the grid-search locations and then use HypoDD to further improve the relative locations of these events. HypoInverse is an absolute location method based on gradient descent. HypoDD is a double-difference hypocenter location method, which significantly improves the relative location accuracy and reveals concentrated seismic streaks. The 1-D velocity model is a combination of CRUST1.0 and the 2-D crustal P-wave velocity models from Wan et al. (2019).
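For reference, the quantity minimized by a double-difference method such as HypoDD is commonly written as the residual below. The notation is introduced here for illustration and is not defined in this paper: $t_k^{i}$ denotes the arrival time of event $i$ at station $k$, and the superscripts "obs" and "cal" denote observed and calculated values for a pair of nearby events $i$ and $j$.

```latex
dr_k^{ij} = \left(t_k^{i} - t_k^{j}\right)^{\mathrm{obs}} - \left(t_k^{i} - t_k^{j}\right)^{\mathrm{cal}}
```

Because the two events share nearly the same ray path to station $k$, errors from the velocity model largely cancel in the difference, which is why the relative locations become sharper than the absolute ones.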

RESULTS AND DISCUSSIONS
To validate the ESPRH workflow, we apply it to the continuous data of the 12 OBS stations from Dec. 2016 to Jun. 2017 and conduct an ablation study on the deep-learning models in the first stage. As shown in Table 2, the combination of EqT and S-EqT detects ~7.5 times more earthquakes in the REAL catalog than EqT alone. This shows the necessity of S-EqT in the first stage. We only keep earthquakes with both horizontal and depth location errors less than 20 km. Because only well-located events are kept, the slight increase in the number of earthquakes in the HypoInverse and HypoDD catalogs after applying PickNet indicates that the arrival times are refined to higher precision. Figure 3 shows the results of HypoInverse and HypoDD corresponding to Table 2. The number of shallow earthquakes in the catalogs, especially on the south side of the Challenger Deep, increased significantly after applying S-EqT, which is important for analyzing the change of seafloor topography and the coupling relationship between seafloor and seawater. Figures 3C,F show that the locations of these earthquakes coincide with the seafloor topography, which demonstrates the reliability of the ESPRH workflow.
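The 20 km location-error criterion above amounts to a simple filter on the located events. The following is a minimal sketch, assuming each event is a dictionary with hypothetical keys `erh_km` (horizontal error) and `erz_km` (depth error); the actual catalog format used in the workflow is not specified here.

```python
from typing import Dict, List

MAX_ERR_KM = 20.0  # threshold from the text


def keep_well_located(events: List[Dict[str, float]],
                      max_err_km: float = MAX_ERR_KM) -> List[Dict[str, float]]:
    """Keep events whose horizontal and depth location errors are both
    strictly below `max_err_km`."""
    return [ev for ev in events
            if ev["erh_km"] < max_err_km and ev["erz_km"] < max_err_km]
```

Counting the events that pass this filter before and after pick refinement gives the kind of comparison used in the ablation study.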
As shown in profile A-B in Figure 4, most of the earthquakes in the catalog produced with only EqT in the first stage are shallow. Few of them reach a depth of 60 km in the HypoInverse results, and all are shallower than 40 km in the HypoDD results. The combination of EqT and S-EqT detects more deep earthquakes, down to 100 km by HypoInverse and 80 km by HypoDD. Figures 4C, 5F show PickNet's fine-tuning of the initial detection results, where subtle differences in small areas can lead to more precise locations. In particular, after relocation, the distribution of hypocenters is consistent with the morphology of the subducting slab.
In this study, we analyze the seismicity based on the final HypoDD catalog. The relative positions of these seismic events can portray the geometric structure of the faults where the earthquakes occur. As shown in Figure 6, a total of 1,270 earthquakes were relocated by HypoDD during the period of OBS observation. These earthquakes are mainly distributed near the OBS array. In addition to the subducting slab, these earthquakes are distributed in the outer-rise region, the overriding plate, and the subduction interface. Divided by the trench axis, the number of earthquakes in the north is much higher than that in the southern outer-rise region. As can be seen from the profile, the former includes some very deep earthquakes, down to 80 km. We consider that these events outline the morphology of the subducting slab. The latter earthquakes are shallower than 25 km. As shown in the figure, these shallow seismic streaks can be corroborated in the topographic map. We speculate that the shallow outer-rise earthquakes are normal-faulting events. We also find a large number of moderately deep earthquakes above the subducting slab, which may reveal the ongoing influence of subduction activity on the overriding plate. We further evaluate the detection performance of ESPRH by comparing it with an existing template-matching catalog, which is based on six OBS stations of our array and a station located in Guam. A comparison with our results is shown in Figure 5.
Hence, we only compare the earthquake catalogs in the area near the stations (Figure 5). In terms of the number of detected earthquakes, we obtained 1,285 seismic events that were located by HypoInverse with both ERH and ERZ less than 20 km, which is comparable to the number obtained by matched filtering. However, 1,270 of these earthquakes can be relocated, which is almost three times the number obtained by matched filtering. In terms of earthquake locations, our catalog has a more balanced distribution, which is consistent with the local earthquake pattern. This can provide more reliable support for seismicity and geohazard analysis. Although the catalog obtained by matched filtering also contains a large number of small seismic events not found in the USGS catalog, these are clustered in the SWMR south of the array and near the trench. The basic structure of the subducting slab can be clearly outlined on the profiles of both catalogs. However, Zhu's catalog shows a denser aggregation pattern with gaps near the trench, and earthquakes are also relatively rare in the overriding area. Throughout the study area, the ESPRH catalog allows for a more detailed and complete analysis of seismicity and a more robust outline of fault geometry. In particular, the area between the SWMR and the Challenger Deep, above the subducting slab, can be seen as a fault produced by extrusion deformation. In the overriding plate, the faults represented by the seismic strips can also be found. The gaps in the matched-filtering catalog are due to the limitations of that method. The basic idea of the matched-filter approach to building a catalog is to find earthquakes similar to the templates. It is very difficult to detect earthquakes in areas not covered by any template. Therefore, having a good template set is a prerequisite for obtaining a complete catalog, which relies on preliminary observations.
Frontiers in Earth Science | www.frontiersin.org | February 2022 | Volume 10 | Article 817551
The matched filtering method can be applied well and quickly in areas with abundant seismic observations and has good performance in aftershock detection.
Most submarine areas lack historical near-field seismic data. It is also not possible to predict where earthquakes will occur when analyzing seismicity and submarine geohazards. This situation shows the advantage of ESPRH, which follows the standard procedure of picking P and S arrivals, associating phases, and locating events, and therefore does not depend on pre-existing templates.
We also analyzed the statistical features of the earthquakes (Figure 6). Figure 6A shows that earthquakes still occur frequently in the vicinity of the Challenger Deep, confirming the observation of continued subduction of the Pacific plate. The depth distribution shows that earthquakes mainly concentrate at ~10 km, where submarine geological hazards occur. The distribution of magnitudes shows that most of the earthquakes are concentrated around magnitude 3, which indicates the poor quality of the seabed data.
Our location results provide some evidence for the geological structure of the Challenger Deep. It is clear from the location of these faults and the depth of the earthquakes that the faults extend into the mantle (Figure 7). Every subducting plate inevitably undergoes a transition from horizontal to vertical motion as it passes through the outer rise of a subduction trench. If the plate ruptures, a normal fault is created, which can be observed by earthquake location. Seismic velocity models obtained from previous surface wave imaging results show a significant low-velocity anomaly in the subducted slab that extends to a depth of about 25 km inside the mantle, indicating that the southern Mariana subducted slab carries a large amount of water into the Earth's interior (Zhu et al., 2021). Compared with central Mariana (Kato et al., 2003), the velocity reduction of the southern forearc mantle is not as obvious, suggesting that the degree of serpentinization of the forearc mantle is lower in the south. The difference in the degree of serpentinization of the southern and central Mariana forearc mantle may reflect the different geological processes and development of the forearc mantle in the two regions.
We marked the positions of the faults in the subducting slab and the outer rise by visual inspection. Our catalog shows that tectonic activity in the southern forearc region is more intense due to the expansion of the Mariana Trough and the rapid retreat of the Pacific plate. The southernmost Mariana forearc is strongly deformed, with a large number of parallel and vertical normal and strike-slip faults, suggesting that the southern forearc is experiencing strong along-strike and vertical extension along the trench. The intense tectonic activity may prevent fluid focusing and affect the degree of serpentinization in the forearc mantle. This supports the conclusions obtained by previous work with Rayleigh wave tomography.

CONCLUSION
In this study, we build a deep-learning-empowered, fully automatic workflow named ESPRH that can quickly build regional earthquake catalogs with corresponding high-quality P and S wave arrival times. Applied to an OBS array in the Challenger Deep, ESPRH obtains a complete earthquake catalog that provides novel insights into the geometry of the faults and seismicity around the Challenger Deep and provides evidence for serpentinization of the Pacific plate in the southern Mariana Trench. This application demonstrates that our pipeline is practical for constructing comprehensive local submarine earthquake catalogs automatically, rapidly, and precisely. This study presents a comprehensive local earthquake catalog around the Challenger Deep and provides a powerful tool for future seismic studies of submarine earthquakes.

DATA AVAILABILITY STATEMENT
The earthquake catalogs that support the conclusions of this article are publicly available in the Supplementary Material. For the raw OBS data in this study, please contact the corresponding author.

AUTHOR CONTRIBUTIONS
XW conducted the research, plotted the figures, and wrote the manuscript. ZX helped in developing the codes of the workflow. SH and ZX contributed to the conception and design of the study. YW completed time correction. All authors contributed to manuscript revision, read, and approved the submitted version.