VINEDA—Volcanic INfrasound Explosions Detector Algorithm

Efficient event detection from large infrasound databases gathered in volcanic settings relies on the availability of robust and automated work-flows. While numerous triggering algorithms for event detection have been proposed in the past, they mostly focus on applications to seismological data. Analyses of acoustic infrasound for signal detection is often performed manually or by application of the traditional short-term average/long-term average (STA/LTA) algorithms, which have shown limitations when applied in volcanic environments, or more generally to signals with poor signal-to-noise ratios. Here, we present a new algorithm specifically designed for automated detection of volcanic explosions from acoustic infrasound data streams. The algorithm is based on the characterization of the shape of the explosion signals, their duration, and frequency content. The algorithm combines noise reduction techniques with automatic feature extraction in order to allow confident detection of signals affected by non-stationary noise. We have benchmarked the performances of the new detector by comparison with both the STA/LTA algorithm and human analysts, with encouraging results. In this manuscript, we present our algorithm and make its software implementation available to other potential users.


INTRODUCTION
Seismic and acoustic signals are key in monitoring and characterizing volcanic unrest. Recent technological advances in sensor development, data transmission, and archival protocols have made the collection of large amounts of geophysical data commonplace at active volcanoes and other monitoring environments. The sheer amount of data recorded makes their manual analysis a challenging, frequently unfeasible, task. The implementation of automated tools to address this challenge is, thus, vital for effective monitoring operations.
Automatic event detection and classification work-flows applied to seismic data include an initial segmentation stage, commonly via the application of short-term average/long-term average (STA/LTA) algorithms in order to parse the continuous seismograms into individual earthquake waveforms with varied characteristics and sources (Allen., 1982). In more advanced processing work-flows, this is followed by automatic classification of the signals by different methods, including Neural Networks (Scarpetta et al., 2005), pattern recognition (e.g., Curilem et al., 2014), Hidden Markov Models (Ibáñez et al., 2009), Support Vector Machines (Giacco et al., 2009), or statistical properties (Bueno et al., 2019). A wealth of new algorithms are constantly published in the literature in order to improve the efficiency of automatic detection and classification procedures for different types of signals, including those associated with tectonic earthquakes (Di Stefano et al., 2006;Álvarez et al., 2013;Bhatti et al., 2016), low-frequency volcano-seismic events (Frank and Shapiro, 2014), avalanches (Marchetti et al., 2015), and debris flows (Schimmel and Hübl, 2016). Collectively, these algorithms represent an important toolbox for the creation of high-quality research databases.
Volcano infrasound is becoming increasingly popular as a monitoring tool (Johnson and Ripepe, 2011;; among other applications, acoustic data are frequently used for detection and characterization of explosive volcanic activity (e.g., Garcés et al., 1999;Johnson et al., 2004;Vergniolle and Ripepe, 2008;Caplan-Auerbach et al., 2010;Fee and Matoza, 2013;Lamb et al., 2015;De Angelis et al., 2019). Volcanic explosions are commonly recorded by infrasound microphones in the 0.01-20 Hz frequency band as signals characterized by impulsive onsets followed by codas with variable duration, from few seconds to several minutes. Due to the increasing amount of acoustic infrasound data routinely collected on active volcanoes, the development of tools for automated signal detection is crucial for efficient monitoring. Here, we introduce an adaptative infrasound detector based on time-and frequency-domain characterization of volcanic explosion signals. We take advantage of advanced signal processing techniques, in combination with apriori knowledge of recorded explosions, to implement a robust infrasound detector based on adaptive multi-band processing. The datasets selected for this study are obtained from two previous monitoring campaigns at Santiaguito    Lamb et al. (2019) provides a seismic and infrasound analysis of three eruptive phases in a multiparametric monitoring framework, associating infrasound and seismic signals. Hence, the sheer volume of recorded explosions at both volcanoes makes an ideal ground for testing the VINEDA algorithm, as the geological properties are already known.

Algorithm Description
Detection of volcanic explosions in infrasound data streams is an important and challenging task. The algorithm detailed here, the Volcanic INfrasound Explosions Detector Algorithm (VINEDA), parses raw acoustic data streams, x(n) into a normalized characteristic function (CF) within a given frequency range [f low ,f high ]; the frequency band is chosen to include the majority of energy transported by explosion infrasound. Explosion temporal boundaries are defined by the abrupt onset arrival and return to background amplitude, with expected average durations given by D = {D min . . . D max }, with D min and D max the minimum and maximum expected duration of infrasound explosions, respectively. The number of duration bands used in the discriminant analysis is defined as N db . Background amplitude is defined as characteristic low-amplitude stationary signals (i.e., wind, noise, or lack of infrasound activity). The data processing pipeline is illustrated in Figure 1, key parameters for the algorithm are given in Algorithm 1, and Figure 2 shows the outputs of each step of the workflow using infrasound recorded of an explosion at Santiaguito volcano, Guatemala.
As the first step, an anti-aliasing Finite Impulse Response (FIR) low-pass filter is applied with a corner frequency f high , corresponding to the highest frequency component of interest in the explosion signal investigated. This stage eliminates noise outside the frequency band of interest for explosions, eases computational load and reduces signal complexity, while preserving important signal onset information. The sampling frequency, fs, is usually much higher than f high (typical values for fs and f high are of the order of 100 Hz and below 5 Hz, respectively); this allows, in the second processing step, downsampling of the input signal, x(n), by a factor of R = fs/fs ′ , being fs ′ = 2 · f high . Downsampling is performed by a polyphase anti-aliasing filter to compensate for potential delays due to re-sampling computations.
An adaptive de-trending filter is then applied to remove noise that could affect the lower frequency range of infrasound explosions, such as from wind or long-period instrument drift. Detrending is implemented as a zero-phase high-pass FIR filter. The order of this filter is given by the maximum duration of the explosion (D max ), whereas the cutoff frequency is fixed at f low Hz. This filter also helps to mitigate the influence of low frequency noise, such as microbaroms, which is often located at very low frequency bands (i.e. [0.1-0.5] Hz) (Landès et al., 2012) A median filter is further applied to remove spikes in the signal, such as those associated with instrument or data transmission noise. This filter reduces extreme values and smooths the input signal ( Figure 2B).
Since explosions are characterized by sharp onsets, a multiband filter stage has been specifically designed to provide no edge delay (Álvarez et al., 2013). The design of the filter bank (number of bands, central frequencies and bandwidths) depends on the frequency content of the explosions to be detected. Figure 2E depicts an instance of the impulse response of the FIR filters designed. The center frequencies are uniformly distributed within the frequency band of the explosion signal [f low , f high ], according to: where N fb is the number of sub-bands of the filter bank.
Frontiers in Earth Science | www.frontiersin.org Once the multi-bank filter is applied, signal envelopes are computed within each sub-band; the envelope, e k (n ′ ), for each sub-band is estimated as: whereŝ k (n ′ ) is the Hilbert Transform of s k (n ′ ) (Bracewell, 1999). All the sub-band envelopes are then added to obtain the signal e(n ′ ). This global envelope allows characterizing how energy is delivered at the explosion onset (i.e., an abrupt rise followed by a slow decay). Figure 2C shows the detected envelope e(n ′ ) of the explosion. The abrupt onset is detected, and the smooth decay is preserved. The next step in the detection algorithm is the application of a discriminant filter to the signal e(n ′ ) in order to compute a characteristic function, CF(n ′ ), and detect explosion onsets ( Figure 2D). The impulse response h(n ′ ) of the proposed discriminant filter is shown in Figure 2F; this filter is designed to enhance signals with a sharp rise and gradual decay, such as infrasound explosions, with a duration in the order of D. Thus, the discriminant detector mitigates the effect of nonstationary noises while finding the best match of infrasonic signals (Álvarez et al., 2013). A penalty factor for non-impulsive onsets, β, is added to increase the robustness of the filter with respect to background noise. The value of β controls the impulsivity of the signal we are going to detect, and it should be selected on the basis of the expected onset. Larger β values are required to detect very impulsive onsets with respect to background noise.
In the final stage of the detection workflow, the characteristic function, CF(n ′ ), is normalized following a non-linear companding method to emphasize onset arrivals without loosing amplitude information (Rabiner and Gold, 1975). The peak of CF(n ′ ) corresponds to the onset of the detected explosion event (Figure 2D).
The sequence of filtering stages is summarized in Figures 2A-D. Note that the spike present in the raw signal x(n) at ∼18 s is suppressed during processing and the long-period trend is removed while the onset of the explosion signal is preserved. The CF(n ′ ) for the original explosion waveform is shown in Figure 2D. The amplitude of the CF is proportional to the sharpness of the original explosion onset. Notice that the  (Álvarez et al., 2013). discriminant detector has suppressed background noise, thus highlighting the explosion onset.
Finally, VINEDA is designed to be used with any singlestation recordings. However, the flexibility of VINEDA allows its configuration in a parallel framework to work in a multi-station setting. Once the signals have been detected, derived CFs can be merged or interfaced with physical propagation models to locate the events of interest.

Multi-Station Application Example on Etna Volcano
VINEDA is a highly flexible algorithm that can be applied to network data to enhance detection of infrasound signals associated with volcanic explosions. Here, we demonstrate an application to network data recorded at Mt. Etna volcano by three infrasound sensors, during June 2017, with a frequency content between [1.0-3.0] Hz and maximum duration of D max = 5.0 s (Diaz-Moreno et al., 2019). Each sensor is independently processed, as we aim to investigate how VINEDA detects infrasound explosions from stations installed at different locations from the volcanic vent. Figure 3 shows the CFs along with signal envelopes (e 1 (n ′ ), e 2 (n ′ ), e 3 (n ′ )) for each of the input signals (x 1 (n), x 2 (n), x 3 (n)). Note that, generally, the proposed pre-processing steps yield robust envelopes, as noise and long-period trends have been filtered out, and the no-edge delay filter along with the envelope detector characterize explosion onsets. However, when stations are closer to the vent, the recorded explosions are less attenuated with larger amplitudes and more obvious onsets (ET01, ET04, and ET10 located at 1.0, 1.3, and 6 km from the vent, respectively; Diaz-Moreno et al., 2019). While closer stations exhibit larger CFs as observed for CF 1 (n ′ ) and CF 2 (n ′ ), the detector is still able to observe a signal at the farthest station, CF 3 (n ′ ). The results presented in Figure 3 highlight the capabilities of VINEDA to suppress much of the noise and still being able to detect attenuated explosions at greater distances. The robustness of the algorithm to perform multi-station detection permits a direct embedding with atmospheric propagation models to compute

DATA EXPERIMENTATION AND RESULTS
We applied the VINEDA algorithm to infrasound data from low-intensity Strombolian explosions recorded at Mt. Etna volcano during an experiment in the summer of 2017 . Additionally, the generalization capabilities of the algorithm across different volcanic settings are tested with infrasound data from Santiaguito . A traditional STA/LTA algorithm is compared with VINEDA in order to assess its detection performance and robustness. In addition, 2 days of continuous data from both volcanoes were also manually evaluated by a group of 6 experts. For every waveform of a detected event, each individual expert checks the presence/absence of an event and assign a quality factor Q. Faced with the question "Is this an explosion?", experts chose one of three quality assessments: "strongly agree" (Q = 1), "agree" (Q = 2), and "undecided" (Q = 3). Figure 4 depicts three infrasound signals with different quality factors, as reviewed by experts. Notice that signal to noise ratio (SNR) along is the highest for Q1, where the explosion is clearly visible. For lower quality values, the segmented explosion is almost masked with background fluctuations. All our experiments were performed on a 64 bit computer with an i7-8700k CPU (3.70 GHz) processor, 16 GB RAM, and Ubuntu 16.04. On this machine, VINEDA takes and average of 0.72 s (over 10 runs) to process 1 day of infrasound data of both volcanoes.
The Receiver Operating Characteristic (ROC) curve is an excellent graphical technique to visualize the trade-off between the sensitivity and the specificity of any detection system, for a particular decision threshold (Fawcett, 2006). The sensitivity is a metric to evaluate the goodness of a model to detect true events and according to the common definition from Signal Detection Theory, the sensitivity can be calculated as: where infrasound explosions are successfully detected (making a true positive, TP), mistakenly ignored (making a false negative, FN), and vice versa: other events or noise can be mistakenly detected as explosions (making a false positive, FP). FIGURE 4 | Infrasound signals from Etna 2017 experiment, as presented to the experts during the annotation process. The assigned quality factors (Q1, Q2, Q3) is also shown. Notice that the highest quality of Q1 implies the best signal to noise ratio, with the explosion clearly distinguishable. For Q2, the event is detected by VINEDA, despite lower amplitude values. For Q3, atmospheric disturbances affect the infrasound signal, and even if the explosion is detected, a multi-station analysis would help to decrease false positives.
The specificity measures the proportion of actual negatives that are correctly identified as such. Since VINEDA is a detector (it detects infrasound explosions in an input signal in which an event could be found at any time) but not a classifier, the specificity of the detector is expressed as the rate of false positives per hour (FP/h). Figure 5 illustrates the ROC curves of detection for Etna (left) and Santiaguito (right) volcanoes performed by VINEDA algorithm (red) when compared to STA/LTA algorithm (blue), and two quality criteria Q ≤ 2 and Q ≤ 3 as assessed by human analysts. For Santiaguito volcano, explosions were characterized by a duration of 4 s, with a frequency content in the range of [1.0-3.0] Hz. Mount Etna explosions were characterized by an average duration of 2 s, and a frequency content in the range of [1.0-3.0] Hz. In both settings the parameters β and N fb were fixed to 3. Note that, from Equation (1), a N fb = 3 yields a set of central frequencies that covers the range of infrasonic explosions for both volcanoes. The definition of the frequency band [f low , f high ] is essential to guarantee correct segmentation boundaries. For Etna and Santiaguito volcano with given band of 1.0-3.0 Hz., the computed frequencies for the multi-band analysis are fc = [1.33, 2.00, 2.67] (see Equation 1), encompassing the range of frequencies of interest for all explosive activity analyzed. Similarly, a value of β = 3 helps to select events with sharp onsets mitigating the influence of the background noise. Given the mathematical design of the discriminant filter, larger values of β fit sharp onsets whilst decreasing the computed CF function for smaller (or attenuated) explosions. The vector range of durations D is defined, for both volcanoes and a delta timestep of 1 s, between D min = 2.0 and D max = 5.0 s. In practice, the selected parameters suffice for the range of frequencies in infrasonic explosions for these volcanoes, but this definition is highly flexible and is left to the analyst on a per-case basis.
All streams were sampled at fs =100 Hz. The detection thresholds (thr) varied between 0 and 240. STA/LTA is a simple and effective method to detect transient events that defines a characteristic function for detection as the ratio between the STA and the LTA (the absolute values of the seismogram averaged over short and long windows, respectively). In such way, the LTA tracks the background seismic noise, while the STA/LTA ratio increases when there is a sudden increase in signal amplitude  and Santiaguito (B) volcanoes, for the events manually assessed by the experts with the three different quality criterion considered: "strongly agree" (Q = 1), "agree" (Q = 2), and "undecided" (Q = 3). N indicates the total number of detected explosions by the algorithm with the assigned quality factor. (Allen., 1982). Using the infrasound data gathered at Santiaguito and Mount Etna, STA/LTA was applied to the band of interest ([1.0-3.0] Hz for both volcanoes), with short and long windows of 1 and 10 s, respectively.
The area under the ROC curve could provide an idea of the benefits of any particular algorithm. For an algorithm with a good performance, the area under this curve would be maximum since the ideal detection threshold would be the one providing a result closest to Sensitivity = 1 and FP/h = 0. Figure 5 shows that VINEDA detector outperforms STA/LTA for both volcanoes. The performance including events in which experts are "sure" (Q ≤ 2) is better than that for "not sure" events (Q ≤ 3) in all cases. The improvement in sensitivity and rate of false positives per hour for VINEDA compared to STA/LTA in all cases is associated to the specific processing of infrasound explosion signals that the algorithm VINEDA carries out. The maximum value of the CF function depends on the quality of the explosion signal. Figure 6 shows the real value of the CF against the quality factor assigned by the experts, in combination with the total number of detected explosions (N). We observe that higher values on the CF function are obtained for events with better quality. These events would be detected using high values of the detection threshold, while keeping the number of detected explosions low. This is in line with the design of the algorithm, as restrictive detection thresholds will only retrieve very distinctive explosions with abrupt onsets, discarding the rest. By contrast, for less demanding thresholds, the number of detected explosions increases but the maximum value of the CF decreases. This behavior is expected, and should be taken into account to set the detection threshold depending on the particular needs or application.
The ever-increasing availability of infrasonic data requires the development of mathematical routines that can be used to detect events of geophysical interest. We have presented VINEDA; a generic and scalable multi-step algorithm designed to detect infrasound explosions. Our experimental evaluation with data from two volcanoes, Santiaguito and Etna, suggest that VINEDA improves performance over STA/LTA approaches. For both volcanoes, the refinement of the detector shown in the ROC curve, jointly with the value of the estimated CF function, confirms the capabilities of VINEDA to surpass STA/LTA. The discriminant detector helps to filter out nonstationary noises and acts as a penalty to temporally longer infrasound events, such as rockfalls, avalanches, or degassing. Further, the quality of detections is strengthened by VINEDA's capabilities to function in a multi-station setting. We suggest that the procedure described here can be used to annotate highquality data from sequential infrasound streams for further postprocessing, including the training of advanced machine learning models, picking algorithms or geo-statistical modeling.

CONCLUSIONS
Acoustic infrasound provides unique insights on the dynamics of erupting volcanoes. The detection and characterization of explosions from large streams of continuous, multi-channel, infrasound data is a challenging task. In this manuscript, we have introduced VINEDA, an infrasound detector which makes extensive use of signal processing techniques in order to characterize continuous volcano acoustic records and extract explosion signals. This algorithm stands as a middle point between the complete knowledge of the target signal beforehand, and incomplete knowledge of the explosions on the recorded infrasonic-stream as some prior knowledge of signal features suffice for VINEDA to detect target signals. VINEDA is suitable for deployment within volcano monitoring systems and offers a trade-off between quality and quantity of detections. We suggest that real-time implementations of algorithms like VINEDA are crucial to improve existing infrasound datasets and ultimately, increase our ability to monitor unrest at active volcanoes.

DATA AVAILABILITY STATEMENT
The Santiaguito and Etna infrasound datasets selected for this study are available via request to the authors. VINEDA code is open-source and can be downloaded at https://github.com/ srsudo/vineda.