Accounting for Natural Uncertainty Within Monitoring Systems for Induced Seismicity Based on Earthquake Magnitudes

To reduce the probability of future large earthquakes, traffic light systems (TLSs) define appropriate reactions to observed induced seismicity depending on each event's range of local earthquake magnitude (ML). The impact of velocity uncertainties and station site effects may be greater than a whole magnitude unit of ML, which can make the difference between a decision to continue (“green” TLS zone) and an immediate stop of operations (“red” zone). We show how to include these uncertainties in thresholds such that events only exceed a threshold with a fixed probability. This probability can be set by regulators to reflect their tolerance to risk. We demonstrate that with the new TLS, a red-light threshold would have been encountered earlier in the hydraulic fracturing operation at Preston New Road, UK, halting operations and potentially avoiding the later large magnitude events. It is therefore critical to establish systems which permit regulators to account for uncertainties when managing risk.


INTRODUCTION
The increasing number of industrial operations related to hydrocarbon extraction, geothermal power production, hydraulic fracturing for shale gas exploitation, wastewater injection, water impoundment, hydrocarbon storage, and mining operations in recent years, and the potential for large-scale subsurface CO 2 storage in future, has increased the importance of understanding and de-risking induced seismicity both to the scientific community and to the public who live near such operations (Grigoli et al., 2017). The potential to induce seismicity by human activities is wellknown (McGarr et al., 2002;Elsworth et al., 2016;Foulger et al., 2018;Keranen and Weingarten, 2018;Schultz et al., 2020). Military waste fluid injected in the Rocky Mountain Arsenal in the 1960's near Denver, Colorado (Healy et al., 1968), induced the so-called "Denver earthquakes." Since then induced earthquakes related to mining (Arabasz et al., 2005;Fritschen, 2010), oil and gas field depletion (Bardainne et al., 2008;Van Thienen-Visser and Breunese, 2015) shale gas exploitation (Bao and Eaton, 2016;Clarke et al., 2019;Lei et al., 2019), geothermal exploitation (Häring et al., 2008;Deichmann and Giardini, 2009), and waste water disposal (Ellsworth, 2013) have been documented around the world (Baisch et al., 2019). In the UK, induced earthquakes related to hydraulic fracturing at Preese Hall (Clarke et al., 2014), and Preston New Road (Clarke et al., 2019) have been observed, and the latter led to an indefinitely imposed UK government moratorium on fracking.
Traffic light systems (TLS; Bommer et al., 2006;Majer et al., 2012;Mignan et al., 2017;Baisch et al., 2019) are used widely to manage hazard and risk due to induced seismicity in geothermal and hydrocarbon industries, whereby operations are continued ("green"), amended ("amber"), or stopped ("red") based on the local event magnitude. In the original TLS developed by Bommer et al. (2006), the TLS thresholds are based on peak ground velocity, but other TLSs have been implemented based on earthquake magnitude or other ground motion parameters, such as peak ground acceleration (Ader et al., 2020). Depending on the industrial activities, criteria for a TLS may be very different. Baisch et al. (2019) and He et al. (2020) summarized some examples of existing TLSs that correspond to different industrial activities. In the UK the "amber" and "red" thresholds for induced seismicity related to unconventional oil and gas operations are set to local earthquake magnitudes M L = 0 and M L = 0.5, respectively, and this has led to multiple halts of hydraulic fracturing operations during the past few years (Clarke et al., 2019) and finally to an immediate moratorium of operations in November 2019.
The thresholds between zones in TLS are often defined based on limited case studies and on a priori assumptions in a best effort to provide simple schemes (Grigoli et al., 2017;Baisch et al., 2019). Consequently, they do not necessarily take into account the range of possible scenarios, nor uncertainties in event magnitudes, and hence some operations will incorrectly continue, increasing the risk of larger triggered earthquakes, while others will be wrongly halted. To ensure actions taken are robust, it is therefore necessary to estimate local magnitudes with uncertainties, and to consider them in the choice of M L thresholds in TLSs.
Assessing accurate magnitudes for human-induced earthquakes such as shale gas stimulation, waste water storage, or enhanced geothermal systems is difficult, because they are affected by lack of knowledge about the Earth's subsurface between the source and receivers and by the magnitude scale used . A standard approach to determine M L is to first locate the earthquake and then apply an empirical scaling relation to the source-to-receiver distance (Gutenberg and Richter, 1942;Gutenberg, 2013). Source location-related uncertainties in M L can then be evaluated using the location confidence ellipses. Unfortunately, estimating errors on M L due to velocity model uncertainties, energy attenuation during propagation or site effects such as wavefield focusing is difficult.
It is well-known that the accuracy of hypocenter locations depends largely on the velocity model accuracy (Husen and Hardebeck, 2010). Various efforts have been made to estimate velocity model uncertainties, by including a correction term to traveltime curve predictions (Myers et al., 2007), making random perturbations around a given velocity model (Poliannikov et al., 2013) and by locating seismic events in an ensemble of velocity models obtained by a Bayesian analysis of independent data (Gesret et al., 2011;Hauser et al., 2011). Recently, Garcia-Aristizabal et al. (2020) analyzed different sources of uncertainty that can be relevant for the determination of earthquake source locations, and introduced a logic-tree-based ensemble modeling approach for framing the problem in a decision-making context. Their approach, however, is not fully probabilistic, but limited to a finite set of explored models.
Here we propose a way to calculate local magnitudes with uncertainties for microseismic events, and to include the uncertainties in the design of TLS. We use a 3D Monte Carlo non-linear traveltime tomography method to jointly invert for hypocenter locations and velocity model. This allows us to obtain posterior distributions for local magnitude M L , which cover both velocity and source location uncertainties. Results clearly show that velocity uncertainties and station site effects are significant and change the zones of the TLS to which events are assigned, hence they directly affect safety related decisions. We then apply our method to the hydraulic fracturing induced seismicity at Preston New Road, UK and a mining site, and demonstrate that a red-light would have been encountered earlier if uncertainties would have been accounted for in the TLS thresholds.

METHODS AND DATA
Usually, local magnitudes M L are calculated by first locating the earthquake using standard linearized earthquake location methods (e.g., Klein, 2002), which require simple assumptions about the unknown underlying subsurface seismic velocity structure, and then applying an empirical scaling relation to the source-receiver distance to determine M L (Gutenberg and Richter, 1942;Gutenberg, 2013). The solution found by such location methods depends on the a priori best guess velocity model, and so it is not guaranteed to find a location near that of the true earthquake. They also cannot represent uncertainties on M L related to velocity model uncertainties, energy attenuation during propagation, or site effects such as wavefield focusing.

Non-linear Joint Hypocenter-Velocity Travel-Time Tomography
We use a probabilistic approach to jointly invert for hypocenter locations and 3D subsurface velocity. Our approach is based on a reversible jump Markov chain Monte Carlo algorithm (Green, 1995), which is an iterative stochastic method to generate samples from a target probability density. In a Bayesian approach all information is described in probabilistic terms. The goal is to calculate the posterior probability distribution function (pdf) which describes the probability of model m being true given observed data d and other relevant, a priori information. The posterior pdf is defined using Bayes' theorem (Jaynes, 2003): this combines prior knowledge about the model the prior probability p(m) with a likelihood function p(d|m) that describes the probability of observing the data if the particular given model m was true. In our approach, the posterior probability is a transdimensional function: the number of parameters is not fixed, and hence the posterior pdf is defined across a number of spaces with different dimensionalities.
We use the approach and code of Zhang et al. (2020) and use arrival times of P and S body waves from local earthquakes as data, and include the velocity model, the average arrival time uncertainties, source locations and original time as parameters. The 3D subsurface velocity model is defined in terms of a Voronoi tessellation of constant velocity cells, where both the position of Voronoi cells and their number can change during sampling, guided by the data and prior information. However, due to the parsimony of Bayesian inference, complicated models (models with many cells) tend to be rejected in favor of simpler models, if they fit the data equally well. The full model vector m is given by where n is the number of Voronoi cells, s describes their positions, and V s and V p describe the S-and P wave velocity within each Voronoi cell. The vector e = e 1 x , e 1 y , e 1 z , e 1 t , ...e N x , e N y , e N z , e N t contains source locations and origin times of N events, and σ is the arrival time data uncertainty. The travel time uncertainties for event i are defined as Zhang et al. (2018): where σ 0 and σ 1 are noise hyperparameters and t the P or S travel time.
We initialize 20 Markov chains with randomly generated starting models drawn from the prior distribution so that each chain starts from a different point in model space. To minimize dependence on this initial model, chains progress through a large number of samples called the burn-in phase from which all models are discarded. To reduce dependence of each sample on the next, after burn-in we only store every 200th model to use as samples of the posterior distribution. Each chain sampled 1.88 million models. At each step of the Markov chain a new model m ′ is generated by perturbing the current model. In our approach we have seven types of possible perturbation: adding, removing or moving a Voronoi cell (i.e., changing s), changing the P or S velocity of a randomly chosen Voronoi cell (V p , V s ), changing the noise hyperparameter σ , or changing the source coordinates of one randomly chosen source (e). The type of perturbation is selected randomly at each iteration, and the candidate model m ′ is accepted with a probability α (Green, 1995) given by: where J is the Jacobian matrix of the transformation from m to m ′ and is used to account for the volume changes of parameter space during jumps between dimensions, and q m|m ′ are proposal distributions that we use to propose new models m ′ at each step. In our case, it can be shown that the Jacobian is an identity matrix (Zhang et al., 2018).
A key function in the acceptance probability is the likelihood p(d|m) which quantifies the misfit between the observed data d obs and estimated data d est obtained by an eikonal solver using the fast marching method (Rawlinson and Sambridge, 2004) in model m. The likelihood is defined as: and σ i is the P or S wave travel time uncertainty for event i given by Equation (2). The likelihood function contains both the effect of the errors in the source locations and the velocity model uncertainties on the travel times. We choose uniform priors for the source location coordinates and the number of Voronoi cells, and Gaussian priors for all other parameters. A full and more detailed description of the methodology can be found in Zhang et al. (2018Zhang et al. ( , 2020.

M L Scaling Relations
A general local magnitude scaling relation is described by where r is the hypocentral distance in km, and A is the zero-topeak amplitude in nm on the horizontal components filtered with a Wood-Anderson response (Ottemöller and Sargeant, 2013;Butcher et al., 2017;Luckett et al., 2018). Parameters a, b, c, d, and f are region dependent constants which describe the geometrical spreading (a), attenuation (b), the base level (c), and distance dependent correction terms (d and f ), respectively. The original BGS scaling relation given by Ottemöller and Sargeant (2013) is M OS L = log 10 (A) + 1.11 log 10 (r) + 0.00189r − 2.09 . (7) This was updated by Butcher et al. (2017) to account for short source-receiver distances, giving M B L = log 10 (A) + 1.17 log 10 (r) + 0.0514r − 3 .
(9) The latter scale was used for the BGS locations throughout this paper.

Data
We use data from surface seismic monitoring arrays at two sites in the United Kingdom: (1) Preston New Road, where hydraulic fracturing took place in the Bowland Shale tight gas reservoir, and (2) Thoresby Colliery, a deep coal mine in Nottinghamshire.
At Preston New Road, hydraulic fracturing started on 15 October 2018 at the PNR-1z well in Lancashire, UK under the guidance of Cuadrilla Resources Ltd. and targeted the Bowland shale at a depth of ∼2,300 m (Clarke et al., 2019). During operations, the British Geological Survey (BGS) detected 172 local seismic events with local magnitudes M L between −1.8 and 1.6. The M L = 0 threshold ("amber") was exceeded by nine events, six of which had local magnitudes larger than 0.5 ("red" zone). In late October 2018, five events occurred that exceeded  the red light TLS thresholds after which operations were paused for a month, but microseismicity still occurred during the hiatus (Figure 1). The largest event with M L = 1.6, which was felt by some local residents, occurred on 11 December 11:21:15 UTC after operations resumed on 8 December. Hydraulic fracturing operations of the well ended on 17 December 2018. Over the course of 3 months more than 38,000 microseismic events were detected in real-time with the geophone array, with moment magnitudes M w between −3.1 and 1.6 (Clarke et al., 2019). We analyzed the P-and S-wave travel time data for the 172 largest earthquakes which were recorded at 11 seismic stations by the BGS (Figure 2). The majority of these events occurred between 2 and 2.5 km depth and occur in the vicinity of the well.

Thoresby Colliery
Thoresby Colliery in New Ollerton has a history of seismicity related to mining (Bishop et al., 1993), and in response to felt earthquakes between December 2013 and January 2014, the British Geological Survey (BGS) installed a temporary seismic network with seven seismometer stations, four of which are three-component broadband stations (Figure 3). Mininginduced earthquakes are some of the most widely studied and their magnitude and depth range is similar to fracking induced earthquake magnitudes (Davies et al., 2013), hence provide an excellent analog for the study of hydrofracturing induced seismicity.
Most of the seismic events used in this study are located north and south of the coal seams (Figure 3), and the majority of the events occurs at 800 m depth, which coincides with the depth of the coal seams (Butcher et al., 2017). The northern cluster occurred later in 2014 than the southern one. To reduce the computational costs we only use 61 seismic events out of the 305 recorded, giving 769 P-and S travel times to invert.

RESULTS
The McMC joint inversion provides us 3D posterior distributions of seismic velocities (V p and V s ), and of the earthquake hypocenter locations (Figures 4A,B). Therefore, we can calculate hypocentral distance posterior distributions (Figure 4C), which in turn allows us to estimate station-average local magnitudes M L posterior distributions ( Figure 4D) using a scaling relation (e.g., one of Equations 6-8). These distributions include the effects of velocity and source location uncertainties as well as the source radiation pattern on the pdf for event magnitudes. The station-averaged M L posterior distribution for one source may have a width that spans more than one zone of the traffic light system (e.g., cyan distribution in Figure 4D) which indicates that velocity model uncertainties alone can change the TLS zone to which the earthquake is attributed. Thus, uncertainties affect real operational decisions.   8) (Figure 5A; Supplementary Figure 1) and therefore make a difference between a continuation ("green") and an immediate stop ("red"). Note, however, that the original BGS scaling relation was not used by the BGS to calculate M L for these events; we include it here to show the effect of magnitude scale choices. The difference in M L between the M B L scale (Equation 7) and the most  (Figure 5A).

Scaling Relation and Station Site Effects on M L
Station site effects such as attenuation, focusing, and radiation pattern become evident by comparing M L distributions at individual stations. These uncertainties can shift the M L posterior distribution for one source by half a magnitude unit, sometimes more, easily sufficient to move the source into another zone of the TLS (Figure 5B; Supplementary Figure 2). We compare velocity and station site effect uncertainties on local magnitudes in Figure 6 for the hydraulic fracturing induced earthquakes at Preston New Road. The site effect uncertainties are estimated by calculating the mean local magnitude for each seismic source at all 4 stations (using the amended BGS scaling relation, Equation, 8), and then taking the difference between the smallest and largest mean station magnitude as a measure of site-related uncertainties. The velocity-related uncertainties are defined as the width of the interval between the 5-95% percentile of the station-averaged local magnitudes distributions. Their effects each average around ±0.125 and ±0.05 magnitude units, respectively, in our case study, but they vary and can have a combined effect that alters magnitude estimates by up to a whole magnitude unit (Figure 6). We observe that uncertainties are also roughly equally important for the mining induced seismicity at New Ollerton-their effects average around ±0.3 and ±0.05 magnitude units for site and velocity-related effects, respectively-with a combined effect that again can alter magnitude estimates by up to a whole magnitude unit, and potentially move events from "green" to "red" zones (Figure 7).

A PROBABILISTIC TRAFFIC LIGHT SYSTEM
We can now include the velocity and station site effect uncertainties in M L in a traffic light system (TLS). To do this, we first calculate M L threshold probability curves using the stationaveraged M L posterior distributions of the microseismic events at Preston New Road. Threshold probability curves describe the probability that an earthquake of a given magnitude is in any one zone of the TLS. They take into account velocity and stationsite effect uncertainties, as well as attenuation and geometrical spreading in M L . Furthermore, the threshold probability curves allow us to draw conclusions about the range of observed M L for which the probability of any earthquake being in a zone drops below a given confidence level α. The last point is particularly interesting for regulators and operators because it enables them to define the thresholds between zones in such a way that the probability of an earthquake being in each zone of the TLS is always above a chosen confidence level α.
M L threshold probability curves are obtained by shifting each of the 172 station-averaged event M L pdfs along the local magnitude axis and estimating the percentage of the distribution lying in each of the three zones of the TLS Frontiers in Earth Science | www.frontiersin.org  8) due to velocity uncertainties (purple error bars) and station site effects (gray bars). Black dots mark the maximum of the station-averaged ML distribution for each seismic event. Background colors indicate the three zones of the UK traffic light system, "green," "amber," and "red." (B) Normalized histograms of the magnitude uncertainties displayed in (A). Numbers display the velocity and station-site effect uncertainties at which the histograms take maximum values. Figure 3). By averaging over all threshold probabilities curves, we obtain one curve that describes the probability of an earthquake with a given magnitude being in any zone of the TLS. This can then be used to draw conclusions about (1) probabilities of earthquakes of a certain event magnitude M L being in any one of the TLS zones, and (2) the M L range for which the probability of any earthquake being in a zone drops below a given confidence level α.

(Supplementary
Our approach here is approximate, but once the first inversion if performed, subsequent assignment of a new event's M L to the correct zone of an adjusted TLS is trivial and can be done in real time, as explained below. In theory, however, the most rigorous approach to incorporating uncertainty into the calculation of M L for any one event is to retrieve its full posterior M L distribution, which is the averaged pdf across all stations for that one event. Although at this point we can do this for any existing event in our dataset, in practice we want to be able to do this for each new event that occurs, in real time. This presents a large challenge, however, since formally we must add the travel times from this new event to our dataset and re-run the whole sampling procedure again. We have added one new earthquake to the dataset and sampled 140,000 new models. This took 14,880 CPUhours on the ARCHER HPC machine, and so remains practically impossible for real time monitoring. Furthermore, the M L pdf of the new event is still sparsely sampled, and hence does not allow for a robust M L uncertainty quantification.

TLS With Realistic Uncertainties
A regulator or operator can choose whether they wish to minimize the probability of any such event exceeding a TLS threshold undetected, or to maximize the certainty that an event truly has exceeded the legal magnitude limits in order to avoid unnecessary, costly halt of operations. We term the first TLS -, where the M L thresholds are shifted toward smaller apparent-magnitude thresholds. In this way, the risk of smallermagnitude events leading to large earthquakes is reduced because operations are both halted and put "on caution" earlier. In the latter, the TLS thresholds would effectively be increased to higher values (TLS + ), so that operations could still continue up to larger apparent earthquake magnitudes. The choice of the risk system by the operator (TLS + or TLS -) is, however, subjective and depends on the country's governmental policies. The choice of the confidence level defines the TLS thresholds, but these as well as the risk strategy can be changed at any time.
For example, say a regulator or operator chooses that the confidence level with which each event is assigned to the correct zone must be at least 80% for decisions to be made. The range of estimated M L values that would have less than α = 0.8 probability is −0.036 < M L < 0.035 (zone A) and 0.46 < M L < 0.53 (zone B) (gray zones in Figure 8A) using the current UK TLS thresholds. Then, in a TLS -, all of zone A would be attributed to "amber, " and zone B to "red, " effectively moving the TLS FIGURE 8 | Developing TLS systems where the risk of larger triggered earthquakes is potentially reduced (TLS -), and a TLS where the risk of unnecessary, costly stop of operations is reduced (TLS + ). The threshold probability curves describe the probability of an event of local magnitude M L being in any one of the TLS zones (color coded red/amber/green for each zone). (A) Earthquakes that have M L estimates in zones A and B cannot be assigned to "red/amber/green" with 80% confidence. (B) For a 20% risk of an event exceeding a TLS threshold undetected, zones A and B are attributed to "amber" and "red," respectively (TLS -). (C) For 80% certainty that any event has exceeded a threshold, zones A and B are attributed to "green" and "amber," respectively (TLS + ). Black dots in (C) mark probabilities for example earthquakes of different M L . thresholds to lower values ( Figure 8B). Alternatively, in a TLS + , zone A would be assigned to "green" and zone B to "amber, " so the TLS thresholds would effectively be increased to higher values ( Figure 8C).
The uncertainties in M L discussed here are site specific so need to be determined for each geographical area or industrial operation individually. However, our approach can be applied to any site and to any form of induced seismicity. We have also demonstrated that for the Thoresby Colliery mining site in the UK the velocity model and station site effect uncertainties in M L are non-negligible (Figure 7), and can be accounted for in the choice of TLS thresholds (Supplementary Figure 4).

APPLICATION OF A PROBABILISTIC TLS TO PRESTON NEW ROAD SEISMICITY
We can use the three TLSs (Figure 8) to analyze retrospectively how decisions would have changed at Preston New Road under a TLS + or TLS - (Figure 9). We compare here the classification in the UK-TLS, a TLSand TLS + for a 80% confidence level (Figure 5). That means, the risk of exceeding a TLS threshold in TLSis 20%, while in the TLS + , the certainty that a threshold was exceeded is 80%. The earthquake on October 19th would have been classified as "amber" in all three TLSs using the maximum probability magnitude, whereas it was classified as "green" by the operator. Hence, action would have been taken earlier and the probability of subsequent larger events would have been reduced. The same is true for the seismicity on October 24th (Figure 9B), where operations would have stopped immediately with a safety prioritizing system (TLS -), and also in the UK TLS with a M L that accounts for uncertainties. This demonstrates the importance of accounting for uncertainties in local magnitudes M L in the decision-making process.
We acknowledge that the occurrence of induced seismicity is a multi-parameter phenomenon, depending on details of subsurface structures as well as on the complete history of operational measures and therefore cannot be predicted. Deformation processes may continue and can still induce seismicity after injections stop. The delay time between hydraulic fracturing completion and the cessation of the observed seismicity can be up to several years (Baisch et al., 2019). It is therefore speculative that an earlier stop would have prevented large magnitude post-injection seismicity at PNR. Nevertheless, it has been shown that lower M L threshold values in the TLS used for the geothermal stimulation in Basel, Switzerland could have prevented larger magnitude postinjection seismicity (Baisch et al., 2019). We therefore argue that it is critical to establish systems which permit regulators to account for uncertainties while managing risk, as we propose here.

CONCLUSIONS
We implemented a fully Bayesian approach for analysing uncertainties, such as velocity model and source location uncertainties in local earthquake magnitudes and evaluate their influence on decision-making for induced seismicity. We conclude that these uncertainties are important, as they can make a difference of up to one or two magnitude unit, and hence directly affect operational decisions by potentially moving an earthquake two zones in a traffic light system (TLS) leading to radically different operational outcomes.
To build a site-specific probabilistic TLS that accounts for uncertainties, the following three steps are necessary: (1) run one fully non-linear hypocenter-velocity tomography for the site and calculate M L posterior distributions for each earthquake. (2) calculate threshold probability curves, choose a desired confidence level α, and determine the M L zones A and B below the desired confidence level. (3) attribute zone A and B to "green/amber" or "amber/red" according to the desired safety system (reduce the risk of larger magnitude events (TLS -) or reduce the risk of halting operations unnecessarily (TLS + )). From this point on, real time assignment of any new event's M L to the correct TLS zone is trival, yet incorporates all the uncertainty in the measurements.
We applied our method to anthropogenic seismicity at a hydraulic fracturing site in the UK, and demonstrate that a red-light threshold would have been encountered earlier in a TLS -, which possibly could have prevented the UKwide shut-down. We also applied our methods to miningrelated seismicity at Thoresby Colliery, UK and find they FIGURE 9 | Classification of seismic events for 4 days (A-D) in the UK TLS (top two rows). Row 1 shows results using the local magnitudes determined by the BGS while other rows use the maximum probability magnitude determined in this work. Row 3: Classifications for a TLS where the risk of unnecessary, costly stop of operations is reduced (TLS + ). Row 4: TLS where the risk of larger triggered earthquakes is potentially reduced (TLS -). Both were calculated for an 80% confidence level. apply equally well in this different setting. Hence, our approach can be applied to any site and any form of seismicity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Materials, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
CR processed the data, performed the analysis, prepared the figures, and wrote the paper. XZ developed the inversion code. BB provided the data of both case studies. All authors contributed to the interpretation of the results and the writing of the paper.