Long Memory in Earthquake Time Series: The Case Study of the Geysers Geothermal Field

Barani, S.; Cristofaro, L.; Taroni, M.; Gil-Alaña, L. A.; Ferretti, G.

doi:10.3389/feart.2021.563649

ORIGINAL RESEARCH article

Front. Earth Sci., 05 April 2021

Sec. Solid Earth Geophysics

Volume 9 - 2021 | https://doi.org/10.3389/feart.2021.563649

This article is part of the Research TopicAchievements and New Frontiers in Research Oriented to Earthquake ForecastingView all 28 articles

Long Memory in Earthquake Time Series: The Case Study of the Geysers Geothermal Field

S. Barani¹*

¹DISTAV, University of Genoa, Genoa, Italy
²Department of Statistical Sciences, Sapienza University of Rome, Rome, Italy
³Istituto Nazionale Di Geofisica e Vulcanologia, Rome, Italy
⁴University of Navarra, Faculty of Economics and ICS-NCID, Pamplona, Spain
⁵University Francisco de Vitoria, Department of Business Administration, Madrid, Spain

The present study aims at proving the existence of long memory (or long-range dependence) in the earthquake process through the analysis of time series of induced seismicity. Specifically, we apply alternative statistical techniques borrowed from econometrics to the seismic catalog of The Geysers geothermal field (California), the world’s largest geothermal field. The choice of the study area is essentially guided by the completeness of the seismic catalog at smaller magnitudes (a drawback of conventional catalogs of natural seismicity). Contrary to previous studies, where the long-memory property was examined by using non-parametric approaches (e.g., rescaled range analysis), we assume a fractional integration model for which the degree of memory is defined by a real parameter d, which is related to the best known Hurst exponent. In particular, long-memory behavior is observed for d > 0. We estimate and test the value of d (i.e., the hypothesis of long memory) by applying parametric, semi-parametric, and non-parametric approaches to time series describing the daily number of earthquakes and the logarithm of the (total) seismic moment released per day. Attention is also paid to examining the sensitivity of the results to the uncertainty in the completeness magnitude of the catalog, and to investigating to what extent temporal fluctuations in seismic activity induced by injection operations affect the value of d. Temporal variations in the values of d are analyzed together with those of the b-value of the Gutenberg and Richter law. Our results indicate strong evidence of long memory, with d mostly constrained between 0 and 0.5. We observe that the value of d tends to decrease with increasing the magnitude completeness threshold, and therefore appears to be influenced by the number of information in the chain of intervening related events. Moreover, we find a moderate but significant negative correlation between d and the b-value. A negative, albeit weaker correlation is found between d and the fluid injection, as well as between d and the annual number of earthquakes.

Introduction

The concept of long memory in the seismic process has a relatively long history. Most research on this topic dates back to the 80s, when Mandelbrot interpreted earthquakes as self-similar “objects” (Mandelbrot, 1983) and Bak et al. (1987) introduced the notion of self-organized criticality (SOC). According to this latter study and Bak and Tang (1989), earthquakes are interpreted as the result of interactive dissipative dynamical systems that evolve spontaneously toward critical (i.e., at the edge of collapsing), stationary states that can collapse in events of any size, forming self-similar fractal patterns. Systems that show significant statistical correlations across large time scales are said to be long-memory processes or processes with long-range dependence (LRD) (Karagiannis et al., 2004). The word “memory” is related to the connections between certain observations and those occurring after an amount of time has passed. The word “long” indicates that observations are correlated across widely separated times (i.e., the dependence among observations is significant, even across large time shifts). Interested readers on this topic may refer to the exhaustive dissertation of Samorodnitsky (2006).

Since then, many studies interpreted seismicity and tectonic deformations as the results of a dynamical process exhibiting a self-organized critical behavior (e.g., Sornette and Sornette, 1989; Ito and Matsuzaki, 1990; Sornette, 1991; Turcotte, 1999). In particular, Scholz (1991) used data from artificially induced seismicity to assert that the continental crust is virtually everywhere in a state close to seismic failure. Concurrently, Kagan and Jackson (1991) observed that earthquake clustering is a major long-term manifestation of the seismic process. In particular, it was observed that regions of recent high seismic activity have a larger than usual chance of producing new strong earthquakes (e.g., Lomnitz, 1994; Kagan and Jackson, 1999; Kagan and Jackson, 2013). In other words, periods of high release of seismic deformation will most likely be followed by years of higher-than-average seismic strain release. Conversely, seismically quiet periods will tend to be followed by quiet years. This implies that the process of accumulation and release of seismic strain is governed by long memory and presents a persistent behavior (e.g., Barani et al., 2018).

These findings have important consequences on the reliability of most earthquake forecasting models. Specifically, long-term clustering and self-organized criticality invalidate three important assumptions that are widely used in probabilistic seismic hazard analysis (PSHA), a key element of earthquake-resistant design, and long-term forecasting (e.g., Bak, 1991; Kagan et al., 2012): 1) seismicity as a Poisson process; 2) quasi-periodic or recurrent behavior; 3) characteristic earthquake model. A first attempt to incorporate long-term earthquake interactions into a forecasting model is the double-branching model of Marzocchi and Lombardi (2008). In this model, the first-step branching accounts for the short-term clustering, in which the occurrence of triggered events is modeled according to the modified Omori law (Utsu, 1961), while the second-step branching models the long-term triggering associated with post-seismic stress transfer produced by stronger events. The long-term evolution is essentially modeled according to an inverse exponential distribution of the form $e^{- \frac{t}{τ}}$ , where t is the elapsed time from the occurrence of the earthquake that generates stress variations, and τ is the characteristic time of the interaction.

Attempts to detect long memory in the earthquake process in a statistical robust way, were made by analyzing time series of earthquake frequency (Ogata and Abe, 1991; Li and Chen 2001; Jimenez et al., 2006), released energy (Jimenez et al., 2006), seismic moment (Cisternas et al., 2004; Barani et al., 2018; Mukhopadhyay and Sengupta, 2018), and time series of b-value (Chen, 1997). Gkarlaouni et al. (2017) searched for persistency in time series of magnitudes, inter-event times, and inter-event epicentral distances of successive events. Series of inter-event periods were also analyzed by Liu et al. (1995) and Alvarez-Ramirez et al. (2012). Most of these studies examine the long-memory feature through the rescaled range analysis (R/S analysis), which was originally proposed by Hurst (1951) and described rigorously by Mandelbrot and Wallis (1968). This technique estimates the Hurst exponent (H), a parameter that measures the level of correlation in time series. It takes values between 0 and 1 and indicates long memory for values different from 0.5. A value close to 0.5 indicates a stochastic process whose values have no dependence or mutual correlation (e.g., white noise). Most recently, in order to determine H, Peng et al. (1994) proposed the Detrended Fluctuation Analysis (DFA). The DFA was used in seismological studies by Lennartz et al. (2008), Telesca et al. (2004), and Shadkhoo et al. (2009). Shadkhoo and Jafari (2009) used a generalization of DFA, known as multifractal-DFA (Kandelhardt et al., 2002).

Although the majority of the previous studies give H > 0.5 (indicating a persistent, long-memory process), no general agreement about the presence of long memory in seismicity seems to emerge. Hence, this issue can be considered still open, and motivates the need of spending further efforts in this field of research with the aim of filling the lack of robust statistical evidence of this hypothesis.

In the present work, we apply for the first time techniques borrowed from econometrics to investigate the long-memory feature in seismicity. We analyze time series of earthquakes (i.e., daily number of earthquakes versus time and logarithm of seismic moment released per day versus time) associated with the industrial activity of The Geysers geothermal field (northern California), a dry-steam field in a natural greywacke sandstone reservoir at around 1.5–3 km depth. In this area, part of the seismic activity is induced by injection of fluids, but also present is triggered seismicity that includes aftershock sequences, swarms, and earthquake clusters (e.g., Johnson and Majer, 2017). The choice of the study area is essentially guided by the completeness of the seismic catalog at smaller magnitudes, which is a major drawback of conventional catalogs of natural seismicity. This allows examining the sensitivity of the results to the uncertainty in the completeness magnitude of the catalog. Moreover, since it has been shown that both seismic activity and clustering properties vary according to injection operations (e.g., Martínez-Garzón et al., 2013; Martínez-Garzón et al., 2014; Martínez-Garzón et al., 20s18; Trugman et al., 2016; Langenbruch and Zoback 2016), The Geysers field is an ideal area for investigating temporal variations of the long-memory property in seismicity time series. Such variations are examined in relation to those of the b-value of the Gutenberg and Richter relation.

Contrary to the studies mentioned above, which investigate the property of long memory in a non-parametric way (i.e., they do not assume any specific parametric model for the time series of interest), we consider a parametric model based on the concept of fractional integration that imposes a pole or singularity in the spectrum of the time series at zero frequency (see “Methodology” section). The level of correlations among observations in the time series is expressed by a real parameter d, which is related to the Hurst exponent. Long memory behavior is observed for d > 0. The results obtained by using the parametric approach are compared to those resulting from semi-parametric (i.e., the “local” Whittle approach of Robinson (1995)) and non-parametric (i.e., modified R/S analysis) methods applied to the same time series.

Data

The data set considered in this study is the Berkeley-Geysers (BG) catalog, made available by the Northern California Earthquake Data Center (NCEDC, 2014). It collects 458,459 earthquakes with local magnitude (M_L) from 0 to 4.7 recorded between April 2003 and June 2016 in the area of The Geysers geothermal field (Figure 1). As stated above, the catalog includes earthquakes induced by injection of fluids, but also present is triggered seismicity (i.e., aftershock sequences, swarms, and earthquake clusters). Johnson and Majer (2017) found no basic differences between the source mechanisms of these different types of earthquakes.

FIGURE 1

FIGURE 1. Seismicity distribution in The Geysers geothermal field (from the Berkeley-Geysers catalog (NCEDC, 2014)). The polygon in red delimits the study area: the dashed line separates the northwestern sector (Zone 1) from the southeastern one (Zone 2). Faults are displayed by white lines (see acknowledgments).

In the subsequent analyses, we consider the events located within the red polygon in Figure 1. Specifically, first we analyze the property of long memory in the entire The Geysers field by comparing and contrasting alternative statistical techniques. Then, we analyze how temporal fluctuations in seismic activity induced by injection operations affect the value of d. This latter analysis is presented for the entire data set and northwestern sector (Zone 1 in Figure 1). Many authors (Stark, 2003; Beall et al., 2010; Beall and Wright, 2010) have observed that this part of The Geysers field is more active seismically than the southeastern part (Zone 2), following the expansion of the field in that sector. Following Beall and Wright (2010) and Convertito et al. (2012), the northwestern area includes all earthquakes with magnitude greater than 4, whereas the southeastern one is characterized by smaller magnitude events. Moreover, in Zone 1, earthquake hypocenters reach greater depths, reflecting the migration of fluids deeper into the reservoir (Johnson et al., 2016).

In order to compute the completeness magnitude (M_c) of the data set (i.e., the minimum magnitude value above which a data set can be assumed representative of the seismic productivity of an area), we apply the maximum curvature approach (Wiemer and Wyss, 2000), which defines M_c as the magnitude value corresponding to the maximum of the incremental magnitude-frequency distribution. In our catalog, the maximum curvature of the distribution corresponds to M_l = 1.0 (see Figure 2). Following Woessner and Wiemer (2005), a correction term is added to this value in order to avoid possible underestimation. We consider two alternative correction values, equal to 0.3 and 0.5 magnitude units, which lead to values of M_c of 1.3 and 1.5, respectively. Note that these correction values are both greater than 0.2, a standard value in such kind of applications (e.g., Tormann et al., 2015). Thus, both M_c values considered in the present study can be interpreted as conservative estimates of the completeness of the catalog. For both these values, Figure 3 displays bar graphs showing the variability of the monthly number of earthquakes in the period covered by the catalog along with the (cumulative) seismic moment (M₀) released over time. The seismic moment (in Nm) was calculated by applying the M_l-M₀ relations of Chen and Chen (1989) for California:

\log M_{0} = M_{l} + 10.5 (M_{l} \leq 3.6) (1)

\log M_{0} = 1.5 M_{l} + 8.7 (3.6 < M_{l} \leq 5.0) (2)

\log M_{0} = 3 M_{l} + 1.2 (5.0 < M_{l} \leq 6.3) (3)

The reliability of these relations when applied to induced seismicity is checked by comparison with the M_l-M_w data set of induced earthquakes of Edwards and Douglas (2014), which also includes data relevant to The Geysers geothermal field (Figure 4). Moment magnitude (M_w) values were calculated from M₀ by using the well-known Hanks and Kanamori (1979) relation.

FIGURE 2

FIGURE 2. Incremental (red dots) and cumulative (blue circles) magnitude-frequency distribution. The gray and black lines indicate alternative magnitude completeness thresholds: M_c = 1.3 (gray line) and M_c = 1.5 (black line). Bins of 0.1 magnitude units are used.

FIGURE 3

FIGURE 3. Variability of the number of earthquakes in the period covered by the Berkeley-Geysers earthquake catalog (NCEDC, 2014) (April 2003–June 2016). Light and dark gray bars indicate the monthly number of events above the completeness thresholds M_c = 1.3 and M_c = 1.5, respectively. Curves of cumulative seismic moment are superimposed. The solid line is for M_c = 1.3, while the dashed one is for M_c = 1.5.

FIGURE 4

FIGURE 4. Relation between local and moment magnitude for induced earthquakes in Hengill (Iceland), Basel (Switzerland), and Geysers (California) geothermal areas (modified from Edwards and Douglas (2014)). The Chen and Chen (1989) scaling relation is superimposed.

Besides fluctuations of seismic activity, which can be related to injection operations, Figure 3 shows relatively long periods of much milder (or absent) seismicity (e.g., October 2004, March 2009, November 2014; October 2015) that can be attributed to temporary issues in seismic monitoring operations rather than to actual periods of seismic inactivity. Such “instabilities” in the seismic catalog, along with short-period ones, are also discernible from the time series analyzed in the following (Figure 5). We indicate the time series of the daily number of earthquakes with N13+ and N15+ (where N stands for ‘number’, while ‘13+’ and ‘15+’ recall the values of the completeness magnitude M_c). Likewise, the time series of M₀ are labeled as M13+ and M15+.

FIGURE 5

FIGURE 5. Time series plots. Time series in the left column consider earthquakes with magnitudes above the completeness threshold M_c = 1.3, while time series in the right column are for M_c = 1.5. Top panels: time series of the daily number of earthquakes (N13+ and N15+). Bottom panels: time series of the (logarithmic) seismic moment M₀ released per day (M13+ and M15+). Time series go to zero when no data are reported in the catalog.

Methodology

As mentioned in the earlier parts of the manuscript, we use alternative techniques to examine the long-memory property. We start by providing two definitions of this concept, one in the time domain and the other one in the frequency domain.

According to the time domain definition, a covariance stationary process {x_t, t = 1, 2, … , T} with autocovariance function $E [(x_{t} - E [x_{t}]) (x_{t + u} - E [x_{t + u}])] = γ_{u}$ (with the usual notation E for the expectation operator) displays long memory if the infinite sum of the absolute value of its autocovariances is infinite, namely:

\lim_{T \to \infty} \sum_{u = - T}^{T} | γ_{u} | = \infty (4)

Now, assuming that x_t has a spectral density function f(λ), defined as the Fourier transform of the autocovariance function

f (λ) = \frac{1}{2 π} \sum_{u = - \infty}^{\infty} γ_{u} c o s (λ u) - π < λ \leq π (5)

the frequency domain definition of long memory states that the spectral density function must display a pole or singularity at least at one frequency in the interval [0, π), namely:

f (λ) \to \infty, as λ \to λ^{*}, for λ^{*} \in [0, π) (6)

In simple words, Eq. 6 states that, in the context of long-memory processes (i.e., d > 0), the spectral density function is unbounded at a specific frequency λ, typically the zero frequency. The periodogram of the data¹, which is an estimator of the spectral density function of the series, should reproduce that behavior, with a large value at the smallest frequency. This can be observed in Figure 6A, which displays the first 200 values of the periodograms of the time series considered in this study for the completeness threshold M_c = 1.3. However, taking the first differences of the data (i.e., $y_{t} = x_{t} - x_{t - 1}$ ), the periodograms (Figure 6B) display values close to zero at the zero frequency, which is clearly an indication of over-differentiation. This is indicative of d < 1 in the original time series. This was precisely the issue mentioned in Granger (1980) for the motivation of fractional integration with order of integration d in the range (0, 1). For further readings, see Granger (1980), Granger (1981), Granger and Joyeux (1980), and Hosking (1981).

FIGURE 6

FIGURE 6. First 200 values of the periodograms relative to the N13+ and M13+ time series (top and bottom panels of (A), respectively) and relative to the time series of the first differences of N13+ and M13+ (top and bottom panel of (B), respectively). The horizontal axis refers to the index i of the frequency $λ_{i} = 2 π i / T$ , where i = 1, 2, … , 200.

In this article, we claim that the time series under investigation satisfy the two properties above, and we will test the hypothesis of long memory by using alternative methods based on parametric, semi-parametric, and non-parametric approaches.

Among the many methods to test the hypothesis of long memory, we focus on the fractional integration or I(d) approach, which was originally developed by Granger (1980), Granger (1981), Granger and Joyeux (1980), and Hosking (1981), and was widely used in the last twenty years in different fields (interested readers may refer Gil-Alana and Hualde (2009) for a review of applications involving fractional integration). In particular, assuming a generic time series {x_t, t = 1, 2, … , T}, the fractional integration approach is based on the following equation:

{(1 - L)}^{d} x_{t} = u_{t}, t = 1, 2, \dots, T (7)

where L is the lag operator (i.e., $L^{k} x_{t} = x_{t - k}$ ), d is a real value, and u_t is assumed to be a short memory or I (0) time series, defined as a covariance stationary process satisfying

\lim_{T \to \infty} \sum_{u = - T}^{T} | γ_{u} | < \infty (8)

or, in the frequency domain,

0 < f (λ) < \infty, for λ \in [0, π) (9)

In this framework, x_t is said to be integrated of order d, or I(d), in the sense that “d differences” (i.e., the operator (1 – L) is applied d times on x_t as expressed in Eq. 7) are required to remove the long memory from x_t, thus leading to a new time series u_t characterized by short memory (i.e., integrated of zero order, I (0)). Applying the Binomial expansion to the left-hand side of Eq. 7 leads to:

{(1 - L)}^{d} x_{t} = \sum_{k = 0}^{t - 1} (\begin{matrix} d \\ k \end{matrix}) {(- 1)}^{k} L^{k} x_{t} (10)

Thus, if d is a positive, non-integer value, then x_t depends on all its past history (i.e., the dependence among observations is significant, even across large time shifts), and the higher the value of d is, the greater the level of dependence between the observations is. Therefore, d can be interpreted as an indicator of the degree of dependence in the data.

In the context of I(d) models, three values of d assume an important meaning. The first, d = 0, implies that the series is I (0) or short memory. The second, d = 0.5, allows distinction between covariance stationary processes (i.e., d < 0.5) and nonstationary processes (i.e., d ≥ 0.5)². Finally, d = 1 is another turning value, as separates series that display the property of mean reversion (i.e., d < 1), for which any random shock will have a transitory effect disappearing in the long run, and series (with d ≥ 1) for which the effect of the shocks become permanent and increases as d increases above 1. Thus, the fractional integration approach is a very flexible method because it allows considering different categories of I(d) processes, with the following properties:

a. d < 0: anti-persistence (i.e., a time series switches between high and low values in adjacent pairs);

b. d = 0: I(0) or short-memory behavior

c. 0 < d < 0.5: covariance stationary, long-memory behavior with mean reversion;

d. 0.5 ≤ d < 1: nonstationary though mean-reverting long-memory behavior, with long-lasting effects of shocks (i.e., any random shock in the series will disappear in the long run though taking longer time than in the previous case);

e. d = 1: nonstationary I (1) behavior with shocks having permanent effects in the series (i.e., lack of mean reversion);

f. d > 1: explosive behavior with lack of mean reversion.

Note that u_t in Eq. 7 can also display some type short memory structure like the one produced by the stationary and invertible Auto Regressive Moving Average (ARMA)-type of models. Thus, for example, if u_t follows an ARMA (p, q) model (where p and q are respectively the orders of the AR and MA components), we say that x_t is a fractionally integrated ARMA (i.e., ARFIMA (p, d, q)) model (Sowell, 1992; Beran, 1993).

There are several methods for estimating the value of the parameter d (or testing the hypothesis of long memory) in time series. The subsections below describe those applied in the present work.

Robinson Tests for Fractional Differentiation

The tests of Robinson (1994) are a number of parametric tests in the frequency domain that allow verifying the hypothesis of long-memory in time series assuming a fractional integration model. They consist in disproving the null hypothesis $H_{O} : d = d_{O}$ for different real values $d_{O}$ . Such tests are widely used in many disciplines, including climatology (Gil-Alana, 2005; Gil-Alana, 2008; Gil-Alana and Sauci, 2019), economics (Gil-Alana and Robinson, 1997), finance (Gil-Alana and Moreno, 2012; Abbritti et al., 2016), and environmental sciences (Solarin and Bello, 2018; Gil-Alana and Trani, 2019).

In the present study, we use two tests. The first one (RBWN hereinafter) imposes a linear trend to the data, and uses the residuals to compute u_t (Eq. 7) for different values d₀ in order to find the value leading to a white noise series. Specifically, the regression model fitted through the data is expressed by:

x_{t} = β_{0} + β_{1} t + ξ_{t} (11)

where β₀ and β₁ are regression coefficients, and ξ_t indicates the residual. In the present study, we consider three standard cases: 1) β₀ = β₁ = 0 (Model 1), 2) β₁ = 0 and β₀ determined from the data (Model 2), and 3) both β₀ and β₁ estimated from the data (Model 3).

The residuals are then used to compute u_t by applying Eq. 7:

{(1 - L)}^{d_{0}} ξ_{t} = u_{t} (12)

The value d₀ that leads to a white noise series u_t is assumed the best value for d (i.e., the null hypothesis $H_{0} : d = d_{0}$ is accepted if d₀ leads to a white noise series).

The second test (RBBL hereinafter), which is based on the original approach of Bloomfield (1973), is similar to the previous one but assumes short-memory as autocorrelation feature for u_t. The best value for d is chosen as the value d₀ that generates a time series u_t with spectral density function of the form of:

f (λ) = \frac{σ^{2}}{2 π} e^{(2 \sum_{j = 1}^{p} τ_{j} \cos (λ j))} (13)

where p = 1 in the present study (this means that any value in the series is correlated to the previous one only), and τ_j indicates the autocorrelation coefficients (see Robinson (1994) for details).

Bloomfield (1973) showed that the logarithm of $f (λ)$ approximates very well the logarithm of the spectrum of autoregressive (AR) processes. Thus, Eq. 13 is used here to express the autocorrelation in u_t, which also accommodates very well in the context of fractional integration (see also Gil-Alana (2004)).

Robinson Test Based on the “Local” Whittle Approach

The Robinson test based on the “local” Whittle approach is a semi-parametric test originally proposed by Robinson (1995) (RB95 hereinafter). The word “local” is used to indicate that the method only considers a band of frequencies close to zero. The estimate of d ( $\hat{d}$ ) is calculated as:

\hat{d} = \underset{- 0.5 \leq d \leq 0.5}{arg min} (\log C {(d)}^{*} - 2 d \frac{1}{m} \sum_{i = 1}^{m} \log λ_{i}) (14)

where $m = T^{δ}$ (in this study, we assume δ = 0.40, 0.45, … , 0.70 with step of 0.05) is the bandwidth parameter, $λ_{i} = 2 π i / T$ , and

C {(d)}^{*} = \frac{1}{m} \sum_{i = 1}^{m} I (λ_{i}) λ_{i}^{2 d} (15)

Although this method was extended and improved by many authors (e.g., Velasco, 1999; Shimotsu and Phillips, 2006; Abadir et al., 2007), we use the Robinson (1995) one due to its simplicity. In particular, unlike other methods that require additional user-chosen parameters and the results are typically very sensitive to them, it only requires the bandwidth parameter m.

Modified Rescaled Range Analysis

The rescaled range analysis (R/S analysis) is a non-parametric method that allows computation of the so-called Hurst exponent, H (Hurst, 1951). As with d, H measures the level of correlation in time series. It takes values between 0 and 1, and indicates anti-persistence if H < 0.5 and persistent behavior if H > 0.5. A value close to 0.5 indicates no correlation within the series. In the case of covariance stationary processes, the Hurst exponent is related to d through the following relationship (e.g., Beran, 1993):

H = d + 0.5 (16)

In the present work, we apply the modified R/S procedure proposed by Lo (1991). Given a stationary time series x_t of length T, sample mean $\bar{x}$ , sample variance ${\hat{σ}}_{x}^{2}$ , and sample autocovariance at lag u given by ${\hat{γ}}_{u}$ , the modified R/S statistic is:

Q_{T} (q) = \frac{1}{{\hat{σ}}_{T} (q)} (\max_{1 \leq k \leq T} \sum_{t = 1}^{k} (x_{t} - \bar{x}) - \min_{1 \leq k \leq T} \sum_{t = 1}^{k} (x_{t} - \bar{x})) (17)

where

{\hat{σ}}_{T}^{2} (q) = {\hat{σ}}_{x}^{2} + 2 \sum_{u = 1}^{q} ω_{u} (q) {\hat{γ}}_{u} (18)

and

ω_{u} (q) = 1 - \frac{u}{q + 1} 1 \leq u \leq q < T (19)

The parameter q in the previous equations is known as the bandwidth number. In this study, we assume q = 0, 1, 3, 5, 10, 30, 50.

An advantage of the modified R/S statistic is that it allows us to obtain a simple formula to estimate the value of the fractional differencing parameter d, namely:

\hat{d} = \frac{\log Q_{T} (q)}{\log T} - 0.5 (20)

In the previous formulation, assuming q = 0 leads to the classic Hurst R/S analysis (Hurst, 1951).

The estimates of d are considered acceptable for values of the parameter $V_{T} (q)$ above 1.862 or less than 0.809 (corresponding to a 5% level of significance) (Lo, 1991), being $V_{T} (q)$ defined as:

V_{T} (q) = \frac{Q_{T} (q)}{\sqrt{T}} (21)

The limit distribution of V_T(q) is derived in Lo (1991), and the 95% confidence interval with equal probabilities in both tails is [0.809, 1.862].

Monte Carlo experiments conducted, for example, by Teverovsky et al. (1999) showed that the modified R/S analysis is biased in favor of accepting the null hypothesis of no long memory as q increases. Thus, using Lo’s modified method alone is not recommended as it can distort the results.

Results

We first present the results obtained by using the two parametric approaches of Robinson (1994). Table 1 summarizes the estimates of d (and the relative 95% confidence band) obtained by using RBWN (Table 1) and RBBL (Table 1). As stated in the foregoing, we show the results for three standard cases: β₀ = β₁ = 0 (Model 1), β₁ = 0 and β₀ determined from the data (Model 2), and both β₀ and β₁ estimated from the data (Model 3). We have marked in bold the most significant model according to the statistical significance (i.e., t-value) of the values of β₀ and β₁ for the three cases considered (see Table 2).

TABLE 1

TABLE 1. Values of d (and relative 95% confidence interval) resulting from the two parametric approaches of Robinson (1994): a) RBWN; b) RBBL. Results are presented for three standard cases: β₀ = β₁ = 0 (Model 1), β₁ = 0 and β₀ determined from the data (Model 2), and both β₀ and β₁ estimated from the data (Model 3). In bold is the model accepted according to the t-values associated with the estimates of β₀ and β₁ for the three cases (see Table 2).

TABLE 2

TABLE 2. Values of β₀ and β₁ (Eq. 11) for the three models in Table 1: a) RBWN; b) RBBL. The t-values computed for each coefficient are reported in brackets. A model has to be rejected if at least one of its coefficients presents a t-value less than 1.95.

Analyzing the values of d in Table 1 reveals that all time series present the long-memory feature (d > 0), thus confirming the results from a number of previous studies showing that seismicity is a long-memory process (see the “Introduction” for a list of references about this issue). Model 2 has always been resulted the most significant model based on t-values (Table 2). Specifically, both Table 1 show that the values of d corresponding to the time series of the daily number of earthquakes (N13+ and N15+) are in the range (0, 0.5), while those relative to the time series of M₀ (M13+ and M15+) are in the range (0.5, 1). Precisely, concerning these latter series, the values of d indicate a nonstationary, mean-reverting behavior with long-lasting effects of shocks. For these series, the values obtained by using RBBL are significantly lower (up to ≈25%) than those resulting from RBWN. This may be explained by recalling that RBBL assumes short-memory as autocorrelation feature for u_t (Eq. 13) while RBWN assumes that u_t is white noise. Roughly speaking, this means that if the data presents some short memory (which is reflected in ξ_t in Eq. 11), RBBL takes it into consideration (as a feature of u_t) while RBWN does not. Hence, in the case of RBWN, the values of d may be biased by the short memory in the data. Thus, the higher level of nonstationarity indicated by the values of d resulting from the application of RBWN to M13+ and M15 + may be attributed to some short memory in ξ_t. Comparing the estimates of d obtained by applying RBWN and RBBL to N13+ and N15 + shows that the latter approach provides greater values. However, in this case, the difference is negligible, as it is less than 10%. Finally, for both types of time series, we can observe that the value of d tends to decrease with increasing the magnitude completeness threshold M_c.

Table 3 summarizes the results obtained by applying the RB95 semi-parametric approach for different values of the bandwidth parameter $m = T^{δ}$ (i.e., δ = 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, and 0.70). We observe that, for the N13+ and N15 + series, the values of d reach around 0.5, which is the highest value allowed (see Eq. 14). This indicates an effect of saturation of d at lower m values (see also Figure 7, top panels). To overcome such an effect, we take the first differences of the data and apply RB95 to them. The final value of d is then obtained by adding 1 to the value resulting from the application of RB95 to the series of the first differences of the original data. The d values calculated this way are shown in Figure 7 (red lines) in comparison with those obtained from the original data series (blue lines). Note that repeating the analysis on the series of the first differences produces a saturation of d as m increases. Saturated values have to be obviously discarded. Such a saturation effect along with the high variability of d, which is particularly evident for smaller values of m, could be indicative of “spurious” long memory due to regime change (e.g., short memory) or smoothly varying trend (Qu, 2011). This is confirmed by the results (not shown here for the sake of brevity) of test statistic proposed by Qu (2011) (which uses the local Whittle estimate of d for m frequency components) for the null hypothesis that a given time series is a “pure” stationary long-memory process. The saturation is not observed for the d values obtained for the M13+ and M15 + series (Figure 7, bottom panels), which range between about 0.1 and 0.4, thus indicating a covariance stationary, long-memory behavior with mean reversion. As opposed to the values of d obtained for N13+ and N15+ (Figure 7, top panels), for these series d tends to increase markedly with m. This leads of the observation that the values of d computed through the application of RB95 tend to converge toward those resulting from RBWN and, above all, RBBL as m increases. Finally, note that increasing the magnitude completeness threshold yields lower values of d only in the case of the time series of the daily number of earthquakes (i.e., N13+ and N15+). Conversely, the values of d obtained for the time series of M₀ are insensitive to M_c.

TABLE 3

TABLE 3. Values of d obtained by applying the Robinson test based on the “local” Whittle approach (RB95) for different values of the bandwidth parameter $m = T^{δ}$ .

FIGURE 7

FIGURE 7. Values of d obtained by applying the Robinson test based on the “local” Whittle approach (RB95) versus the bandwidth parameter $m = T^{δ}$ . Charts in the left column refer to the magnitude completeness threshold M_c = 1.3, while those in the right column are for M_c = 1.5. Top panels: d values computed from the time series of the daily number of earthquakes (N13+ and N15+). Bottom panels: d values computed from the time series of the (logarithmic) seismic moment M₀ released per day (M13+ and M15+). Blue and red lines display the d values resulting from the original data series and by taking the first differences of the data, respectively (see the main text for details).

Table 4 lists the values of d determined through the modified R/S analysis for a selected number of q values. Evidence of long memory is once again found in all cases, with the values of d constrained between 0 and 0.5 for all series. On average, the values of d are lower than those discussed previously, with a better agreement for lower values of q. This seems in agreement with the observations of Teverovsky et al. (1999) showing that the modified R/S analysis is biased in favor of accepting the null hypothesis of no long memory as q increases. Again, increasing the magnitude completeness threshold yields lower values of d. As with RB95, this is observed only for the time series of the daily number of earthquakes.

TABLE 4

TABLE 4. Values of d determined through the application of the modified R/S analysis for different values of the bandwidth number q. The number in brackets are the values of $V_{T} (q)$ in Eq. 21.

Discussion

As is widely known and remarked in the introduction, the rate of seismicity in geothermal areas is strongly related to fluid injection operations. Fluid injection induces perturbation in the crust that affect the local stress field and may lead to fault failure. Therefore, changes in the injection can induce changes in earthquake productivity (i.e., number of earthquakes), clustering properties, and known seismological parameters such as the b-value of the Gutenberg and Richter relation. Temporal variations in seismic activity at The Geysers geothermal field were studied by different authors (e.g., Eberhart-Phillips and Oppenheimer, 1984; Martínez-Garzón et al., 2013; Martínez-Garzón et al., 2014; Martínez-Garzón et al., 2018; Kwiatek et al., 2015; Johnson et al., 2016; Trugman et al., 2016), as well as changes in the b-value (Convertito et al., 2012; Martínez-Garzón et al., 2014; Kwiatek et al., 2015; Trugman et al., 2016). This section discusses how temporal fluctuations in fluid injection affect seismic activity and, consequently, the degree of memory in seismicity time series. To this end, time series of the daily number of earthquakes were analyzed in order to examine temporal variations in the value of d since 2005 both in the entire The Geysers area and in the northwestern sector (Zone 1 in Figure 1)³. For both cases, we divided the catalog into sections, each covering 1 year⁴, and, for each of them, we computed the value of the completeness magnitude M_c. Specifically, we used a moving window approach with a 1-month shift between consecutive annual time series. Given the substantial fluctuation of the M_c value with time (1.0 ≤ M_c ≤ 1.5, not shown here), which may bias the results, we chose to consider only events with M_c ≥ 1.5. Then, we applied the RBBL, RBWN (following the results in Table 1, the regression coefficients in Eq. 11 were determined according to Model 2), and standard R/S (i.e., assuming q = 0 in Eq. 17) techniques to determine the value of d for each section of the catalog. The RB95 method was not used because of the sensitivity of the results to the choice of the amplitude of the bandwidth parameter m (i.e., because of the subjectivity in the choice of the m value). Contextually, we computed the b-value of the Gutenberg and Richter relation through the maximum likelihood approach (Aki 1965) in the search of a possible relation between it and d.

For the entire The Geysers field, the charts in Figure 8 present the temporal variation of fluid injection (we aggregated monthly injection data provided by the Department of Conservation State of California to obtain values of yearly fluid injection), annual number of earthquakes above M_c = 1.5 (N (M_c)), b-value, and d. On average, the value of d ranges between 0.15 and 0.35, thus indicating again a stationary long-memory process. However, temporal fluctuations are evident, suggesting that the value of d is modulated to some extent by the hydraulic operations in the study area. A visual examination of the trend of d in conjunction with that of the fluid injection seems to indicate a negative correlation, with d that tends to assume smaller values during periods of increased fluid injection. However, the level of correlation between these variables, which is quantified here by the well-known Pearson correlation coefficient (r), is weak (and not always significant), as r varies between −0.08 and −0.34 depending on the method used to compute d (the values of r are summarized in Table 5 along with the associated p-values). Similarly, a weak negative correlation is found between d and N (M_c) (−0.05 ≤ r ≤ −0.46), with this latter variable which, as expected, is positively correlated to the fluid injection (r = 0.69). Such a positive correlation is consistent with the results of Martínez-Garzón et al. (2018), showing that periods of higher seismicity coincide with periods of higher fluid injection. During these periods, the b-value tends to decrease slightly, in agreement with the study of Martínez-Garzón et al. (2014). Following Martínez-Garzón et al. (2018), Eberhart-Phillips and Oppenheimer (1984), and Johnson et al. (2016), such an increase of N (M_c), and the concurrent decrease of both d- and b-values, may be related to increased stress in the reservoir, mainly accommodated by strike-slip and thrust faults (Martínez-Garzón et al., 2014). Hence, fluid injection is considered the major factor driving the seismicity at The Geysers. Finally, moderate but significant negative correlation (−0.53 ≤ r ≤ −0.31) is observed between d and the b-value of the Gutenberg and Richter law. This finding is consistent with the results of Lee et al. (2009) and Lee et al. (2012), showing a negative correlation between the Hurst exponent and the b-value, both using a sandpile model and analyzing real seismicity in Taiwan. According to the explanations of Lee et al. (2009), the decreasing b-value during periods of high fluid injection, which reflects the increased number of events with larger magnitude, indicates that the system is in a critical state or is approaching it (see also De Santis et al., 2011). This is in agreement with the observations of Martínez-Garzón et al. (2018), showing an increase of main shock magnitude along with increased foreshock activity during these periods. The concurrent increase of d-value may be interpreted as an increase of the degree of the persistence of larger and larger events in the system. Conversely, the system is moving away from the critical state during periods of increased b-values, presenting an energy deficit. During these periods, the value of d tends to decrease, thus indicating a decrease of persistence.

FIGURE 8

FIGURE 8. Temporal variation of yearly fluid injection (top panel), annual number of earthquakes above M_c = 1.5 (N (M_c)) (mid panel, black line), b-value (mid panel, gray line), and d (bottom panel) in the entire The Geysers field (Zone 1 + Zone 2 in Figure 1). Data points are plotted at the end of each time interval considered in the moving window analysis.

TABLE 5

TABLE 5. Values of the Pearson correlation coefficient (r) and associated p-values for the pairs of variables specified in the first column. N(M_c) indicates the annual number of earthquakes above M_c = 1.5; the subscript of d_RBWN, d_RBBL, and d_RS indicates the technique used to compute d. Bold text indicates significant p-values (i.e., p < 0.05/17 = 0.0029 where 17 is the number of correlations tested according to the Bonferroni correction for multiple tests (Kato, 2019)). The correlation with fluid injection is not quantified for Zone 1, as specific injection data are not available.

Similar observations can be done by analyzing the moving window results for Zone 1 (Figure 9 and Table 5). According to the findings of Martínez-Garzón et al. (2018), the similarity between the results obtained for this zone and the entire field suggests that the northwestern sector provides a fair representation of the processes that govern earthquake clustering in the whole field.

FIGURE 9

FIGURE 9. Same as Figure 8 but for the northerwestern sector of The Geysers area (Zone 1 in Figure 1). The fluid injection variation shown in the top panel refers to the entire The Geysers field.

Summary and Conclusions

Our study has analyzed the long-memory feature in seismicity through a comprehensive statistical analysis of a set of earthquake time series associated with the activity in The Geysers geothermal field. In particular, we have applied different statistical techniques to time series of the daily number of earthquakes and seismic moment for different magnitude completeness thresholds. We have focused on the seismicity recorded in The Geysers geothermal field because of the high level of completeness of the seismic catalog.

Unlike to previous studies, which investigated the property of long memory in a non-parametric way, we have considered a parametric model based on the concept of fractional integration. The level of correlations among observations in the time series is expressed by a real parameter d, which is indicative of long memory for values greater than zero. The results obtained by using the parametric approach have been compared to those resulting from semi-parametric and non-parametric methods.

Our results have pointed out the presence of long memory for all the time series and methods considered, showing that seismicity is a process characterized by long memory. In particular, the estimates of the differencing parameter d obtained by applying the parametric RBWN and RBBL approaches have shown that the time series of the daily number of earthquakes present long memory with stationary behavior. Indeed, d ranges between around 0.3 and 0.4. The short memory does not affect the results, as the values of d resulting from RBWN are close to those estimated by RBBL. The long-memory feature has also been observed in the time series of seismic moment. However, nonstationary behavior has been shown in this case, as d resulted between 0.5 and 1. We have observed that the estimates obtained by using RBWN could be biased toward higher values (up to around 0.75) due to short-memory autocorrelation, which affects the values of d computed via RBBL to a lower extent (d ≈ 0.55).

Similar results have been obtained by applying the semi-parametric RB95 approach and the non-parametric R/S analysis, although they were shown to be rather sensitive to the choice of the value of the bandwidth parameter. In particular, as regards RB95, we have shown that the values of d tend to converge to those resulting from RBBL at increasing values of the bandwidth parameter m. The opposite occurs considering the results of the modified R/S analysis. This would confirm the bias due to short-memory in the results obtained by applying RBWN to the seismic moment time series.

Finally, albeit moderately, our results have pointed out that the value of d is influenced by the number of information in the chain of intervening related events. In particular, d tends to decrease with increasing the magnitude completeness threshold M_c. This is more evident for the time series of the number of earthquakes. Indeed, the value of M_c is controlled by small events and therefore significantly affects the daily number of earthquakes in the series. Conversely, the total seismic moment released daily is controlled by stronger events and, therefore, the time series are less influenced by M_c. Nevertheless, according to the results obtained by applying RBWN to the seismic moment time series, small magnitude earthquakes with short memory effects would affect nonstationarity. Indeed, we have observed that d increases significantly above 0.5 as M_c decreases. Therefore, catalog completeness assessments may play a fundamental role on the reliability of d estimates.

The analysis of the temporal variation of d, besides corroborating the previous considerations about the long memory in the seismic process, has revealed a certain degree of correlation with hydraulic operations. Specifically, our results indicate a moderate but significant negative correlation between d and the b-value of the Gutenberg and Richter relation. A negative, albeit weaker correlation, which deserves further investigation in the future, is found between d and the fluid injection, as well as between d and the annual number of earthquakes.

Besides proving long memory in seismicity, our study has pointed out the importance of using multiple approaches to obtain reliable estimates of the parameters capturing the long-memory behavior in the earthquake process. The augmented possibility of computing reliable values of such parameters, along with the increasing availability of earthquake data, can strengthen the awareness of scientists on the importance of developing and using earthquake forecasting models based on stochastic processes that allow for long-memory dynamics (e.g., Garra and Polito, 2011).

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://ncedc.org/egs/catalog-search.html.

Author Contributions

This study started from an original idea of all authors. SB and MT provided the time series that were subsequently analyzed by LC, SB, LG-A, and MT. MT also calculated the completeness of the data set. SB and LG-A wrote the main body of the manuscript in collaboration with LC. GF supervised the work since the early stages and contributed to the choice of the magnitude conversion equations. SB, LC, and MT reviewed the manuscript and contributed to the interpretation of the results. Figures and Tables were prepared by SB and LC except for Figure 2 and Table 5 that were provided by MT.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We are grateful to two anonymous reviewers for their valuable comments and suggestions that have significantly improved the manuscript. SB is thankful to Benjamin Edwards and Matteo Picozzi for the fruitful discussions about magnitude conversion. LC kindly acknowledges prof. Elvira Di Nardo (Università degli Studi di Torino) for her support and suggestions during the preparation of the article. LG-A gratefully acknowledges financial support from the MINEIC-AEI-FEDER ECO 2017–85503-R project from ‘Ministerio de Economía, Industria y Competitividad’ (MINEIC), ‘Agencia Estatal de Investigación’ (AEI) Spain and ‘Fondo Europeo de Desarrollo Regional’ (FEDER). Waveform data, metadata, or data products for this study were accessed through the Northern California Earthquake Data Center (NCEDC), doi:10.7932/NCEDC (https://ncedc.org/egs/catalog-search.html). Aggregated monthly injection data for the Geysers field were accessed through the Department of Conservation State of California T(https://www.conservation.ca.gov/calgem/geothermal/manual/Pages/production.aspx). Faults in Figure 1 were accessed through the United States Geological Survey (United States Geological Survey and California Geological survey, Quaternary fault and fold database for the United States, accessed May 10, 2020, at: https://www.usgs.gov/natural-hazards/earthquake-hazards/faults).

Footnotes

¹The periodogram of a time series of length T {x_t, t = 1, 2, … , T}, is defined as:

I (λ) = \frac{1}{2 π T} \sum_{t = 1}^{T} \sum_{s = 1}^{T} x_{t} x_{s} e^{λ i (t - s)}

²It is nonstationary in the sense that the variance of the partial sums increases with d.

³Although the analysis was also carried out for Zone 2, results are not presented because of their instability, which possibly reflects the combined effect of catalog instabilities related to monitoring issues and the milder seismic activity of this zone. Both these factors contribute to the incompleteness of the catalog at smaller magnitudes and, consequently, to a significant proportion of zeroes in the time series analyzed.

⁴The length of the series was chosen in order to avoid instabilities in the results due to too short time series and short-term instabilities in the seismic catalog. The use of series with less than 300 samples is generally not recommended in this type of applications as leading to excessively wide confidence intervals.

References

Abadir, K. M., Distaso, W., and Giraitis, L. (2007). Nonstationarity-extended local Whittle estimation. J. Econom. 141, 1353–1384. doi:10.1016/j.jeconom.2007.01.020