Citizen Scientists Help Detect and Classify Dynamically Triggered Seismic Activity in Alaska

In this citizen science project, we ask citizens to listen to relevant sections of seismograms that are converted to audible frequencies. Citizen scientists helped identify local seismic events whose recorded signals are much smaller than those associated with the surface waves that have triggered these local events. The local events include small earthquakes as well as tectonic tremor. While progress has been made in understanding how these events might be triggered by surface waves from large teleseismic earthquakes around the world, there is no consensus on the physical mechanism of triggering. The aim of our project is to engage the help of citizen scientists to increase general knowledge of triggered seismic events that may or may not occur during transient strain changes, such as from propagating surface waves. A better understanding of triggered seismic events is expected to provide important clues toward a fundamental understanding of how earthquakes nucleate and the physical mechanisms that connect different earthquakes and other slip events. From the volunteers’ classifications we determined that citizen scientists achieve a higher reliability in detecting earthquakes and noise than in detecting tremor or other signals and that citizen scientists more accurately identify earthquake signals than a trained machine-learning algorithm. For tremor classifications we currently depend entirely on humans as no machine has yet learned to detect triggered tremor.

INTRODUCTION
Surface waves generally have the longest duration and largest displacement of all seismic waves. When they pass through a seismically active region, surface waves from distant earthquakes may locally trigger an earthquake or tremor (Miyazawa and Mori, 2005; Gomberg et al., 2008; Rubinstein et al., 2009; Chao et al., 2012; Ide, 2012). Determining the frequency and conditions under which triggered seismic events occur will lead to a better understanding of the dynamic triggering of earthquakes (Peng and Gomberg, 2010; Brodsky and van der Elst, 2014). Seismometers continuously record ground motion at stations around the world, including seismic waves of small events which may be detected at one or at multiple instrument locations. Due to the large number of seismometers, the available seismograms are too numerous to be examined by seismologists (Liang et al., 2016). With Earthquake Detective, we utilize the Zooniverse platform to engage citizen scientists in an experiment to test if many human ears and eyes can replace the process of a professional seismologist in identifying dynamically triggered seismic events. We focus on data from seismic stations in Alaska, including USArray stations of EarthScope. Our approach has three advantages: (1) the human ear naturally performs a time-frequency analysis and is capable of discerning a wide range of different signals (Zwicker, 1961), (2) many human ears listening to the same data provide statistics that rank seismograms in order of their likelihood to contain a recording of a local event, which is helpful to researchers' analysis of these data (Kilb et al., 2012), and (3) part of the citizen scientists' responses can be compared to the results of a machine-learning algorithm to assess their performance.
Different seismic events can be classified by citizen scientists when listening to the audio data alongside the visual graphs. When sufficient data is classified, seismologists and data scientists can use it to train a machine-learning algorithm (an example of artificial intelligence) to automate the classification of seismograms (Xing et al., 2003; Perol et al., 2018; Tang et al., 2020). From there, seismic models for how, where, when, and why earthquakes happen may be refined by seismologists. The work citizen scientists put into this project contributes to the fundamental understanding of our planet and supports a more sustainable society by allowing professionals to better assess hazards from future seismic events. An electronic supplement provides details on interface diagrams of the project and portions of data utilized.

MATERIALS AND METHODS
Far-field surface waves of large magnitude earthquakes can dynamically trigger seismic events such as small, local earthquakes (Prejean et al., 2004) and tectonic tremor (Peng and Gomberg, 2010). Here, we address results from the citizen scientists' classifications of data from USArray (TA) and the Alaska Regional Network (AK), which were recorded in the US from 2013 to 2018 (see section "Acknowledgments and Data" for details). The seismic waveforms presented to citizen scientists are downloaded from the IRIS (Incorporated Research Institutions for Seismology) Data Management System (DMS) (see section "Acknowledgments and Data"). The downloaded waveforms (Figure 1A) have a start time of 60 minutes before and an end time of 180 minutes after the origin times of selected large earthquakes with moment magnitude (Mw) greater than 7.5 (Table 1; Aiken and Peng, 2014; Chao and Obara, 2016). Waveforms were converted to ground velocity by deconvolving the instrument response from the recorded waveforms, and rotated to radial, transverse and vertical components (Figure 1B). The waveforms are then band-pass filtered between 2 and 8 Hz (Figure 1C) to remove Rayleigh waves from the radial and vertical components and Love waves from the transverse component. After determining the beginning of the surface-wave window for each station based on its distance from the epicenter and using a group velocity of 4.5 km/s, we selected the first 2000 s of the time series after this start time (Figure 1D). We generated audio files by speeding up the time series by a factor of 800 and applying an arctangent function to the amplitudes for dynamic-range compression (Figure 1E). This provides improved audibility for signals with smaller amplitude while preventing events with larger amplitude signals from excessive loudness.
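The windowing and audio-conversion steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's actual processing code: the function names, the 40 Hz sampling rate, the 9000 km epicentral distance, the synthetic trace, and the median-based scaling inside the arctangent are all hypothetical. Only the 60-minute pre-origin offset, the 4.5 km/s group velocity, the 2000 s window, and the speed-up factor of 800 come from the text.

```python
import numpy as np

GROUP_VELOCITY_KM_S = 4.5   # group velocity used to predict the surface-wave start (from the text)
WINDOW_S = 2000.0           # window length after the predicted start (from the text)
SPEEDUP = 800               # playback speed-up factor (from the text)
PRE_ORIGIN_S = 60 * 60.0    # the downloaded record starts 60 minutes before origin time

def surface_wave_window(trace, fs, distance_km):
    """Cut the 2000 s window that begins at the predicted surface-wave start."""
    arrival_s = distance_km / GROUP_VELOCITY_KM_S        # seconds after origin time
    start = int(round((PRE_ORIGIN_S + arrival_s) * fs))  # sample index in the record
    stop = start + int(round(WINDOW_S * fs))
    if stop > len(trace):
        raise ValueError("record is too short for the requested window")
    return trace[start:stop]

def arctan_compress(x, scale=None):
    """Arctangent dynamic-range compression, normalised to the open interval (-1, 1).

    The choice of `scale` is an assumption; the text does not state how
    amplitudes were scaled before the arctangent was applied.
    """
    if scale is None:
        scale = np.median(np.abs(x))
        if scale == 0:
            scale = 1.0
    return np.arctan(x / scale) / (np.pi / 2)

# Synthetic stand-in for one filtered component (240 minutes at a hypothetical 40 Hz).
fs = 40.0
rng = np.random.default_rng(0)
trace = rng.normal(size=int(240 * 60 * fs))

window = surface_wave_window(trace, fs, distance_km=9000.0)
audio = arctan_compress(window)
audio_rate = fs * SPEEDUP                 # samples per second at playback
audio_duration_s = len(audio) / audio_rate
```

With a speed-up factor of 800, the 2000 s window plays back in 2.5 s, and the 2 to 8 Hz pass band maps to 1600 to 6400 Hz, well inside the range of human hearing.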
Waveforms with either gaps in the time series, calibrations or re-centering signals, or other glitches were discarded before presenting the data to citizen scientists on the largest people-powered research platform, "Zooniverse" (Supplementary Figures S1-S3). With this platform, we were able to provide tutorial and practice sessions for training our citizen scientists to identify "earthquakes," "tremor," and "noise" signals. Citizen scientists are asked to choose "none of the above" when the seismic signals do not clearly fall in one of the other categories or more than one different signal is present in the data (Supplementary Figures S4, S5). Seismic waves that are caused by the displacement of tectonic plates along a fault are known as earthquake signals. They are caused by the sudden release of seismic energy, making them short in duration and resembling the sound of a slamming door. Tremors have a longer duration and are generated by a slow release of acoustic and seismic energy. Sped up to audible frequencies, tremor can sound like a train darting over railroad tracks.
The Earth is in constant motion under the influence of forces from atmosphere, hydrosphere (e.g., ocean currents and waves), and biosphere, including anthropogenic activity, generated by traffic or industry, for example. Therefore, every seismogram contains relatively steady noise, even in the absence of seismic signals or distinct noise events, which converts to a slowly varying, white noise "baseline" for the sound file. These noise signals sound like whistling wind, crinkling aluminum foil, or radio static.
Earthquakes and tremors as well as natural and anthropogenic sources generate seismic signals that may or may not exceed the baseline noise level of a seismogram. These different sources can be distinguished by the sound of their signals.

RESULTS
Of 2467 seismograms recorded by the AK network, 1103 seismograms were classified as earthquakes by citizen scientists, 141 as tremor, 770 as noise, and 228 were labeled as pertaining to none of these categories. The distribution of classifications in the four categories (Figure 2) indicates that earthquakes (74% of all classifications on seismograms identified as earthquakes are made for this category) and noise (66%) were identified with more certainty by citizen scientists than tremor (50%) and other, unclear events (51%). Hence, citizen scientists were able to classify earthquakes and noise more consistently than tremor and other events.
For one Mw 7.5 earthquake on December 5, 2018, seismologists independently classified the seismograms for which 7 of 10 citizen scientists agreed, in order to assess the accuracy of the project volunteers. For comparison, we applied a machine-learning (ML) algorithm, trained to detect earthquake signals only (Tang et al., 2020), and compared its output with our expert labels as well. Assuming that the expert labels are "true," citizen scientists' labels were 85% accurate in classifying earthquakes and did not mislabel any seismogram without earthquakes, though 23% of all earthquakes remained undetected by citizen scientists (Figure 3). Figure 4 shows results from ML as projections into two-dimensional spaces via the PCA (Principal Component Analysis) of 10-dimensional embeddings. PCA is a non-parametric statistical technique (George and Vidyapeetham, 2012) used for dimensionality reduction in machine learning and the principal components are the coefficients of orthogonal linear combinations of the variables in the dataset. Contours indicate the distributions of the training dataset and the symbols represent the testing dataset. The machine-learning algorithm achieved only 76.2% accuracy in classifying earthquakes in the same dataset, a score nearly 10% lower than citizen scientists (Figure 4).
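To make the projection step concrete, the following sketch reduces 10-dimensional embeddings to two dimensions with PCA computed from a singular value decomposition. It is an illustration only: the random arrays stand in for the network embeddings, the function name is hypothetical, and fitting the components on the training set before projecting the test set is an assumption about how a figure of this kind could be produced, not a description of the procedure in Tang et al. (2020).

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Return the mean and the leading principal directions of X (rows = samples)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)      # fraction of variance along each direction
    return mean, vt[:n_components], explained[:n_components]

def project(X, mean, components):
    """Project samples onto the principal directions."""
    return (np.asarray(X, dtype=float) - mean) @ components.T

# Placeholder embeddings: 200 training and 84 test samples in 10 dimensions.
rng = np.random.default_rng(1)
train_embeddings = rng.normal(size=(200, 10))
test_embeddings = rng.normal(size=(84, 10))

mean, components, explained = fit_pca(train_embeddings)
train_2d = project(train_embeddings, mean, components)   # would be drawn as contours
test_2d = project(test_embeddings, mean, components)     # would be drawn as symbols
```

Projecting the test set with the mean and directions fitted on the training set keeps both sets in the same coordinate system, which is what allows symbols to be compared against contours.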

DISCUSSION
Seismograms are retired after having been classified by 10 different users on Zooniverse. Of 2467 seismograms, 2242 have received a conclusive label, meaning that the number of classifications made for one category is not reached for any other category as shown by a narrow distribution. There was a larger level of agreement between volunteers when the seismograms contained either earthquakes or noise (Figure 2). Citizen scientists agree less on which seismograms contain tremor and other signals as shown by a wider distribution of classifications (Figure 5). We assume that the degree of agreement of classifications between citizen scientists reflects the collective confidence of citizen scientists in identifying the seismic signals. We found that earthquakes and noise have characteristic waveforms and associated audio signals that make it easy to distinguish them from other seismic signals. Citizen scientists are directed to classify seismic signals as not pertaining to any of the other categories when the seismograms contain several different signals or have unclear waveforms or audio signals. In these situations the seismograms are often classified as an earthquake or tremor. It is therefore unsurprising that the agreement of classifications made on seismograms in the category "none of the above" is lower than on seismograms in the other categories.
FIGURE 4 | The convolutional neural network for classifying the training dataset (contour maps) and the test dataset (symbols) from the December 5, 2018 Mw 7.5 teleseismic earthquake. Shown in red are seismograms with an earthquake; seismograms without earthquakes are shown in blue.
Frontiers in Earth Science | www.frontiersin.org
The 225 seismograms which have not received a conclusive label (Figure 5), meaning that the highest number of classifications has been reached for more than one category, amount to only 9% of all seismograms. Unsurprisingly, the distribution of classifications made on these seismograms shows no clear preference for any of the categories. However, it stands out that classifications for tremor and "none of the above" are more numerous than for earthquakes and noise, reflecting that these seismic signals are more difficult to identify and confirming the observations made for seismograms with a conclusive label. This may bias citizen scientists (Hart et al., 2009; Swanson et al., 2016) to classify seismograms with "none of the above" events as earthquakes, tremor or noise. These "none of the above" events reflect that seismograms within the surface wave intervals may contain instrument signals, and signals of anthropogenic and natural sources (Smith and Tape, 2019).
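The distinction between conclusive and inconclusive labels can be expressed as a plurality vote with ties left unresolved. The sketch below is a hypothetical illustration of that rule: the function name and the example votes are invented, and the project's actual aggregation code is not shown in the text.

```python
from collections import Counter

def consensus(votes):
    """Return (label, agreement) for one retired seismogram.

    `label` is the category with strictly more votes than any other,
    or None when the top count is shared (an inconclusive label).
    `agreement` is the fraction of votes cast for the top category.
    """
    counts = Counter(votes)
    top = max(counts.values())
    leaders = [category for category, n in counts.items() if n == top]
    label = leaders[0] if len(leaders) == 1 else None
    return label, top / len(votes)

# Hypothetical vote sets, each with the 10 classifications needed for retirement.
clear_case = ["earthquake"] * 7 + ["noise"] * 3
tied_case = ["tremor"] * 4 + ["none of the above"] * 4 + ["noise"] * 2

clear_label, clear_agreement = consensus(clear_case)
tied_label, tied_agreement = consensus(tied_case)

# Bookkeeping from the text: conclusive plus inconclusive seismograms.
total = 2242 + 225
inconclusive_share = 225 / total
```

The first example yields a conclusive "earthquake" label with 70% agreement, while the second has no unique leader and stays unlabeled. The counts reported in the text are consistent with each other: 2242 + 225 = 2467, and 225 of 2467 is about 9%.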
The classifications made by citizen scientists of Zooniverse make it possible to locate the stations with additional seismic signals that occurred during the passage of surface waves of teleseismic earthquakes in the AK network (Figures 6-8). Surface waves from the earthquake on December 5, 2018 with Mw 7.5 southeast of the Loyalty Islands triggered local earthquakes within 300 km north of Anchorage (Figure 6). During the passage of surface waves from the September 8, 2017 Mw 8.2 Mexico earthquake, tremor occurred in central Alaska (Figure 7). The signals recorded during the passage of surface waves from the September 28, 2018 Mw 7.5 Sulawesi earthquake (Figure 8) show a random mix of classifications by citizen scientists, implying that signals are present, but are ambiguous in nature. The focus of this study has been on harnessing the intelligence of citizen scientists to identify triggered seismic events. In the subtask of detecting triggered earthquakes, we compared the results of citizen scientists to an existing machine-learning algorithm (Tang et al., 2020). The confusion matrices in Figure 3 show that the machine-learning algorithm misidentified 11 of the expert-labeled non-earthquake signals as earthquake signals and missed 9 of the expert-labeled earthquakes, while correctly labeling 47 earthquake and 17 non-earthquake signals. On the other hand, citizen scientists correctly identified 43 earthquakes and missed 13 earthquake signals, while correctly labeling 28 non-earthquake signals.
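The accuracy and miss rates quoted in the Results follow directly from these confusion-matrix counts. The short calculation below reproduces them; the function and variable names are illustrative, and the zero false-positive count for citizen scientists is taken from the statement that they did not mislabel any seismogram without earthquakes.

```python
def detection_metrics(tp, fn, fp, tn):
    """Summary rates from confusion-matrix counts, with expert labels as truth."""
    total = tp + fn + fp + tn
    return {
        "total": total,
        "accuracy": (tp + tn) / total,        # fraction of all labels that match the experts
        "miss_rate": fn / (tp + fn),          # earthquakes that went undetected
        "false_alarm_rate": fp / (fp + tn),   # non-earthquakes labeled as earthquakes
    }

# Counts reported in the text for the December 5, 2018 subset.
citizens = detection_metrics(tp=43, fn=13, fp=0, tn=28)
machine = detection_metrics(tp=47, fn=9, fp=11, tn=17)

for name, m in (("citizen scientists", citizens), ("machine learning", machine)):
    print(f"{name}: accuracy {m['accuracy']:.1%}, "
          f"missed {m['miss_rate']:.1%}, false alarms {m['false_alarm_rate']:.1%}")
```

Both methods were evaluated on the same 84 seismograms, 56 of which the experts labeled as containing an earthquake. The citizen scientists reach 71/84 (about 85%) accuracy with 13/56 (about 23%) of earthquakes missed, and the algorithm reaches 64/84 (about 76.2%).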
From the above results, both methods can successfully identify triggered earthquakes, but citizen scientists can detect non-earthquake signals better than the machine-learning algorithm. Citizen scientists are more successful at identifying non-earthquake signals because we encourage them to classify seismograms without clear earthquake signals as "none of the above," the same standard used by seismologists to label the data. However, the machine-learning algorithm may identify triggered earthquakes hidden by high background noise as positive examples (Figure 9). Hence, the algorithm detects 11 more earthquake signals than seismologists.

CONCLUSION
Over 2000 citizen scientists helped classify more than 2000 seismograms from 30 large worldwide earthquakes with magnitudes over 7.5 in the citizen science project "Earthquake Detective" on Zooniverse. Citizen scientists generally agree more with each other when identifying (1) seismograms with earthquake signals and (2) the absence of distinct signals (noise) than when identifying tremor or other signals. A subset of data was also classified by experts (seismologists among the authors) and a machine-learning algorithm trained to detect triggered earthquakes (Tang et al., 2020). We compared the earthquake classifications from the machine-learning algorithm, citizen scientists and seismologists with each other. We found that citizen scientists did not misidentify seismograms without an earthquake (no false positives) but missed 13 earthquake signals in seismograms (false negatives), while correctly labeling 43 earthquake and 28 non-earthquake signals. The machine-learning algorithm misidentified 11 non-earthquake signals and failed to detect 9 earthquake signals in seismograms, while correctly labeling 47 earthquake and 17 non-earthquake signals. Both the citizen scientists and the machine-learning algorithm perform well in identifying earthquakes, but the citizen scientists outperformed the machine-learning algorithm in labeling non-earthquake signals. Earthquake Detectives and a machine-learning algorithm experience similar degrees of difficulty, for example in identifying other seismic signals, which is more challenging and requires more intelligence than identifying earthquakes, even though citizen scientists are currently better at both.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
VT, BR, and SL conceived and designed the project. VT organized the database of seismic data. VT, BR, JN, and SL developed the project. JT and MP contributed miscellaneous support for project development, management, presentation, and strategy. KC contributed to data collection and critical discussions of triggered seismic events. VT, BR, and SL wrote the first draft of the manuscript and figures. All authors contributed to the revision of the manuscript, read and approved the submitted version.

FUNDING
This research was funded by the Integrated Data-Driven Discovery in Earth and Astrophysical Sciences (ID³EAS) program under National Science Foundation grant NSF-NRT 1450006.