Long Memory in Earthquake Time Series: The Case Study of the Geysers Geothermal Field

The present study aims at proving the existence of long memory (or long-range dependence) in the earthquake process through the analysis of time series of induced seismicity. Specifically, we apply alternative statistical techniques borrowed from econometrics to the seismic catalog of The Geysers geothermal field (California), the world's largest geothermal field. The choice of the study area is essentially guided by the completeness of the seismic catalog at smaller magnitudes (a drawback of conventional catalogs of natural seismicity). Contrary to previous studies, where the long-memory property was examined using non-parametric approaches (e.g., rescaled range analysis), we assume a fractional integration model in which the degree of memory is defined by a real parameter d, which is related to the better-known Hurst exponent. In particular, long-memory behavior is observed for d > 0. We estimate and test the value of d (i.e., the hypothesis of long memory) by applying parametric, semi-parametric, and non-parametric approaches to time series describing the daily number of earthquakes and the logarithm of the (total) seismic moment released per day. Attention is also paid to examining the sensitivity of the results to the uncertainty in the completeness magnitude of the catalog, and to investigating to what extent temporal fluctuations in seismic activity induced by injection operations affect the value of d. Temporal variations in the values of d are analyzed together with those of the b-value of the Gutenberg and Richter law. Our results indicate strong evidence of long memory, with d mostly constrained between 0 and 0.5. We observe that the value of d tends to decrease as the magnitude completeness threshold increases, and therefore appears to be influenced by the amount of information in the chain of intervening related events. Moreover, we find a moderate but significant negative correlation between d and the b-value.
A negative, albeit weaker, correlation is found between d and the fluid injection, as well as between d and the annual number of earthquakes.


INTRODUCTION
The concept of long memory in the seismic process has a relatively long history. Most research on this topic dates back to the 1980s, when Mandelbrot interpreted earthquakes as self-similar "objects" (Mandelbrot, 1983) and Bak et al. (1987) introduced the notion of self-organized criticality (SOC). According to this latter study and Bak and Tang (1989), earthquakes are interpreted as the result of interactive dissipative dynamical systems that evolve spontaneously toward critical (i.e., at the edge of collapsing), stationary states that can collapse in events of any size, forming self-similar fractal patterns. Systems that show significant statistical correlations across large time scales are said to be long-memory processes or processes with long-range dependence (LRD) (Karagiannis et al., 2004). The word "memory" refers to the dependence between observations and those occurring after a given amount of time has passed. The word "long" indicates that observations are correlated across widely separated times (i.e., the dependence among observations is significant, even across large time shifts). Interested readers may refer to the exhaustive dissertation of Samorodnitsky (2006).
Since then, many studies have interpreted seismicity and tectonic deformations as the results of a dynamical process exhibiting self-organized critical behavior (e.g., Sornette and Sornette, 1989; Ito and Matsuzaki, 1990; Sornette, 1991; Turcotte, 1999). In particular, Scholz (1991) used data from artificially induced seismicity to assert that the continental crust is virtually everywhere in a state close to seismic failure. Concurrently, Kagan and Jackson (1991) observed that earthquake clustering is a major long-term manifestation of the seismic process. In particular, it was observed that regions of recent high seismic activity have a larger than usual chance of producing new strong earthquakes (e.g., Lomnitz, 1994; Kagan and Jackson, 1999; Kagan and Jackson, 2013). In other words, periods of high release of seismic deformation will most likely be followed by years of higher-than-average seismic strain release. Conversely, seismically quiet periods will tend to be followed by quiet years. This implies that the process of accumulation and release of seismic strain is governed by long memory and presents a persistent behavior (e.g., Barani et al., 2018).
These findings have important consequences on the reliability of most earthquake forecasting models. Specifically, long-term clustering and self-organized criticality invalidate three important assumptions that are widely used in probabilistic seismic hazard analysis (PSHA), a key element of earthquake-resistant design and long-term forecasting (e.g., Bak, 1991; Kagan et al., 2012): 1) seismicity as a Poisson process; 2) quasi-periodic or recurrent behavior; 3) the characteristic earthquake model. A first attempt to incorporate long-term earthquake interactions into a forecasting model is the double-branching model of Marzocchi and Lombardi (2008). In this model, the first-step branching accounts for short-term clustering, in which the occurrence of triggered events is modeled according to the modified Omori law (Utsu, 1961), while the second-step branching models the long-term triggering associated with post-seismic stress transfer produced by stronger events. The long-term evolution is essentially modeled according to an inverse exponential distribution of the form e^(−t/τ), where t is the elapsed time from the occurrence of the earthquake that generates stress variations, and τ is the characteristic time of the interaction.
Attempts to detect long memory in the earthquake process in a statistically robust way were made by analyzing time series of earthquake frequency (Ogata and Abe, 1991; Li and Chen, 2001; Jimenez et al., 2006), released energy (Jimenez et al., 2006), seismic moment (Cisternas et al., 2004; Barani et al., 2018; Mukhopadhyay and Sengupta, 2018), and time series of b-value (Chen, 1997). Gkarlaouni et al. (2017) searched for persistency in time series of magnitudes, inter-event times, and inter-event epicentral distances of successive events. Series of inter-event periods were also analyzed by Liu et al. (1995) and Alvarez-Ramirez et al. (2012). Most of these studies examine the long-memory feature through the rescaled range analysis (R/S analysis), which was originally proposed by Hurst (1951) and described rigorously by Mandelbrot and Wallis (1968). This technique estimates the Hurst exponent (H), a parameter that measures the level of correlation in time series. It takes values between 0 and 1, with values different from 0.5 indicating long memory. A value close to 0.5 indicates a stochastic process whose values have no dependence or mutual correlation (e.g., white noise). More recently, in order to determine H, Peng et al. (1994) proposed the Detrended Fluctuation Analysis (DFA). The DFA was used in seismological studies, for example, by Lennartz et al. (2008) and Telesca et al. (2004). A generalization of DFA, known as multifractal-DFA (Kantelhardt et al., 2002), has also been applied.
Although the majority of the previous studies give H > 0.5 (indicating a persistent, long-memory process), no general agreement about the presence of long memory in seismicity seems to emerge. Hence, this issue can still be considered open, motivating further efforts to provide robust statistical evidence for this hypothesis.
In the present work, we apply for the first time techniques borrowed from econometrics to investigate the long-memory feature in seismicity. We analyze time series of earthquakes (i.e., daily number of earthquakes versus time and logarithm of seismic moment released per day versus time) associated with the industrial activity of The Geysers geothermal field (northern California), a dry-steam field in a natural greywacke sandstone reservoir at around 1.5-3 km depth. In this area, part of the seismic activity is induced by injection of fluids, but triggered seismicity is also present, including aftershock sequences, swarms, and earthquake clusters (e.g., Johnson and Majer, 2017). The choice of the study area is essentially guided by the completeness of the seismic catalog at smaller magnitudes, which is a major drawback of conventional catalogs of natural seismicity. This allows examining the sensitivity of the results to the uncertainty in the completeness magnitude of the catalog. Moreover, since it has been shown that both seismic activity and clustering properties vary according to injection operations (e.g., Martínez-Garzón et al., 2013; Martínez-Garzón et al., 2014; Martínez-Garzón et al., 2018; Trugman et al., 2016; Langenbruch and Zoback, 2016), The Geysers field is an ideal area for investigating temporal variations of the long-memory property in seismicity time series. Such variations are examined in relation to those of the b-value of the Gutenberg and Richter relation.
Contrary to the studies mentioned above, which investigate the property of long memory in a non-parametric way (i.e., without assuming any specific parametric model for the time series of interest), we consider a parametric model based on the concept of fractional integration, which imposes a pole or singularity in the spectrum of the time series at zero frequency (see "Methodology" section). The level of correlation among observations in the time series is expressed by a real parameter d, which is related to the Hurst exponent. Long-memory behavior is observed for d > 0. The results obtained by using the parametric approach are compared to those resulting from semi-parametric (i.e., the "local" Whittle approach of Robinson (1995)) and non-parametric (i.e., modified R/S analysis) methods applied to the same time series.

DATA
The data set considered in this study is the Berkeley-Geysers (BG) catalog, made available by the Northern California Earthquake Data Center (NCEDC, 2014). It collects 458,459 earthquakes with local magnitude (M_L) from 0 to 4.7 recorded between April 2003 and June 2016 in the area of The Geysers geothermal field (Figure 1). As stated above, the catalog includes earthquakes induced by injection of fluids, but triggered seismicity is also present (i.e., aftershock sequences, swarms, and earthquake clusters). Johnson and Majer (2017) found no basic differences between the source mechanisms of these different types of earthquakes.
In the subsequent analyses, we consider the events located within the red polygon in Figure 1. Specifically, we first analyze the property of long memory in the entire The Geysers field by comparing and contrasting alternative statistical techniques. Then, we analyze how temporal fluctuations in seismic activity induced by injection operations affect the value of d. This latter analysis is presented for the entire data set and for the northwestern sector (Zone 1 in Figure 1). Several authors (e.g., Stark, 2003) have observed that this part of The Geysers field is seismically more active than the southeastern part (Zone 2), following the expansion of the field in that sector. Following Convertito et al. (2012), the northwestern area includes all earthquakes with magnitude greater than 4, whereas the southeastern one is characterized by smaller magnitude events. Moreover, in Zone 1, earthquake hypocenters reach greater depths, reflecting the migration of fluids deeper into the reservoir (Johnson et al., 2016).
In order to compute the completeness magnitude (M_c) of the data set (i.e., the minimum magnitude above which a data set can be assumed representative of the seismic productivity of an area), we apply the maximum curvature approach (Wiemer and Wyss, 2000), which defines M_c as the magnitude value corresponding to the maximum of the incremental magnitude-frequency distribution. In our catalog, the maximum curvature of the distribution corresponds to M_L = 1.0 (see Figure 2). Following Woessner and Wiemer (2005), a correction term is added to this value in order to avoid possible underestimation. We consider two alternative correction values, equal to 0.3 and 0.5 magnitude units, which lead to values of M_c of 1.3 and 1.5, respectively. Note that these correction values are both greater than 0.2, a standard value in this kind of application (e.g., Tormann et al., 2015). Thus, both M_c values considered in the present study can be interpreted as conservative estimates of the completeness of the catalog. For both these values, Figure 3 displays bar graphs showing the variability of the monthly number of earthquakes in the period covered by the catalog, along with the (cumulative) seismic moment (M_0) released over time. The seismic moment (in Nm) was calculated by applying the M_L-M_0 relations of Chen and Chen (1989) for California. The reliability of these relations when applied to induced seismicity is checked by comparison with the M_L-M_w data set of induced earthquakes of Edwards and Douglas (2014), which also includes data relevant to The Geysers geothermal field (Figure 4). Moment magnitude (M_w) values were calculated from M_0 by using the well-known Hanks and Kanamori (1979) relation.
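As an illustration, the maximum curvature estimate can be sketched in a few lines of Python. This is an assumed, minimal implementation run on synthetic magnitudes, not the authors' code or the BG catalog:

```python
# Illustrative sketch of the maximum curvature approach: M_c is the magnitude
# bin with the highest count in the incremental magnitude-frequency
# distribution, plus a correction term (Woessner and Wiemer, 2005).
# Synthetic catalog, complete only above M = 1.0 (not the BG data).
from collections import Counter
import math
import random

def max_curvature_mc(magnitudes, bin_width=0.1, correction=0.3):
    """Return M_c = mode of the binned magnitude distribution + correction."""
    counts = Counter(round(m / bin_width) for m in magnitudes)
    peak_bin = max(counts, key=counts.get)  # most populated magnitude bin
    return peak_bin * bin_width + correction

# Gutenberg-Richter magnitudes (b = 1) on a 0.1 grid, truncated below 1.0,
# so the incremental distribution peaks at M = 1.0.
rng = random.Random(0)
mags = [round(1.0 + 0.1 * int(rng.expovariate(math.log(10)) / 0.1), 1)
        for _ in range(100000)]
mc = max_curvature_mc(mags, correction=0.3)
print(mc)  # -> 1.3
```

With the alternative 0.5 correction used in the text, the same peak would map to M_c = 1.5.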
Besides fluctuations of seismic activity, which can be related to injection operations, Figure 3 shows relatively long periods of much milder (or absent) seismicity (e.g., October 2004, March 2009, November 2014, October 2015) that can be attributed to temporary issues in seismic monitoring operations rather than to actual periods of seismic inactivity. Such "instabilities" in the seismic catalog, along with short-period ones, are also discernible from the time series analyzed in the following (Figure 5). We indicate the time series of the daily number of earthquakes with N13+ and N15+ (where N stands for 'number', while '13+' and '15+' recall the values of the completeness magnitude M_c). Likewise, the time series of M_0 are labeled as M13+ and M15+.

METHODOLOGY
As mentioned in the earlier parts of the manuscript, we use alternative techniques to examine the long-memory property. We start by providing two definitions of this concept, one in the time domain and the other one in the frequency domain.
According to the time domain definition, a covariance stationary process x_t with autocovariance function γ_u = E[(x_t − E(x_t))(x_{t+u} − E(x_t))] (with the usual notation E for the expectation operator) displays long memory if the infinite sum of the absolute value of its autocovariances is infinite, namely:

Σ_{u=−∞}^{+∞} |γ_u| = ∞.

Now, assuming that x_t has a spectral density function f(λ), defined as the Fourier transform of the autocovariance function,

f(λ) = (1/2π) Σ_{u=−∞}^{+∞} γ_u e^{−iλu}, −π < λ ≤ π,

the frequency domain definition of long memory states that the spectral density function must display a pole or singularity at least at one frequency in the interval [0, π), namely:

f(λ*) = ∞ for some λ* ∈ [0, π). (6)

In simple words, Eq. 6 states that, in the context of long-memory processes (i.e., d > 0), the spectral density function is unbounded at a specific frequency λ*, typically the zero frequency. The periodogram of the data, which is an estimator of the spectral density function of the series, should reproduce that behavior, with a large value at the smallest frequency. This can be observed in Figure 6A, which displays the first 200 values of the periodograms of the time series considered in this study for the completeness threshold M_c = 1.3. However, taking the first differences of the data (i.e., y_t = x_t − x_{t−1}), the periodograms (Figure 6B) display values close to zero at the zero frequency, which is clearly an indication of over-differencing. This is indicative of d < 1 in the original time series. This was precisely the issue mentioned in Granger (1980) as motivation for fractional integration with order of integration d in the range (0, 1). For further reading, see Granger (1980), Granger (1981), Granger and Joyeux (1980), and Hosking (1981).
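This spectral signature can be reproduced on simulated data. The sketch below is an assumed pure-Python illustration (all names are ours): a fractionally integrated series, built from the truncated MA expansion of (1 − L)^(−d), has a large periodogram value at the lowest Fourier frequency, while its first differences do not, which is the over-differencing effect described above.

```python
# Assumed illustration (not the authors' code): a fractionally integrated
# series (d = 0.4) shows a large periodogram value at the lowest Fourier
# frequency, while its first differences do not -- the over-differencing
# signature discussed in the text (cf. Figure 6B).
import cmath
import math
import random

def fractional_noise(T, d, seed=1):
    """ARFIMA(0, d, 0) sample from the truncated MA expansion of (1 - L)^(-d):
    x_t = sum_j psi_j * eps_{t-j}, with psi_j = psi_{j-1} * (j - 1 + d) / j."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(2 * T)]
    psi = [1.0]
    for j in range(1, T):
        psi.append(psi[-1] * (j - 1 + d) / j)
    return [sum(psi[j] * eps[T + t - j] for j in range(T)) for t in range(T)]

def periodogram(x, j):
    """I(lambda_j) at the Fourier frequency lambda_j = 2*pi*j/T."""
    T = len(x)
    lam = 2.0 * math.pi * j / T
    dft = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(T))
    return abs(dft) ** 2 / (2.0 * math.pi * T)

x = fractional_noise(1000, d=0.4)
dx = [x[t] - x[t - 1] for t in range(1, len(x))]
pole = periodogram(x, 1)   # large: the pole at the zero frequency
flat = periodogram(dx, 1)  # much smaller: over-differenced series
print(pole > flat)
```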
In this article, we claim that the time series under investigation satisfy the two properties above, and we will test the hypothesis of long memory by using alternative methods based on parametric, semi-parametric, and non-parametric approaches.
Among the many methods to test the hypothesis of long memory, we focus on the fractional integration or I(d) approach, which was originally developed by Granger (1980), Granger (1981), Granger and Joyeux (1980), and Hosking (1981), and has been widely used in the last twenty years in different fields (interested readers may refer to Gil-Alana and Hualde (2009) for a review of applications involving fractional integration). In particular, assuming a generic time series {x_t, t = 1, 2, . . . , T}, the fractional integration approach is based on the following equation:

(1 − L)^d x_t = u_t, t = 1, 2, . . . , T, (7)

where L is the lag operator (i.e., L^k x_t = x_{t−k}), d is a real value, and u_t is assumed to be a short-memory or I(0) time series, defined as a covariance stationary process satisfying

Σ_{u=−∞}^{+∞} |γ_u| < ∞,

or, in the frequency domain, having a spectral density function that is positive and bounded at all frequencies. In this framework, x_t is said to be integrated of order d, or I(d), in the sense that "d differences" (i.e., the operator (1 − L) applied d times to x_t, as expressed in Eq. 7) are required to remove the long memory from x_t, thus leading to a new time series u_t characterized by short memory (i.e., integrated of order zero, I(0)). Applying the binomial expansion to the left-hand side of Eq. 7 leads to:

(1 − L)^d x_t = x_t − d x_{t−1} + (d(d − 1)/2) x_{t−2} − . . . = u_t.

Thus, if d is a positive, non-integer value, then x_t depends on all its past history (i.e., the dependence among observations is significant, even across large time shifts), and the higher the value of d, the greater the level of dependence between the observations. Therefore, d can be interpreted as an indicator of the degree of dependence in the data.

FIGURE 4 | Relation between local and moment magnitude for induced earthquakes in Hengill (Iceland), Basel (Switzerland), and Geysers (California) geothermal areas (modified from Edwards and Douglas (2014)). The Chen and Chen (1989) scaling relation is superimposed.
In the context of I(d) models, three values of d assume an important meaning. The first, d = 0, implies that the series is I(0), or short memory. The second, d = 0.5, allows distinction between covariance stationary processes (i.e., d < 0.5) and nonstationary processes (i.e., d ≥ 0.5). Finally, d = 1 is another turning value, as it separates series that display the property of mean reversion (i.e., d < 1), for which any random shock will have a transitory effect disappearing in the long run, from series (with d ≥ 1) for which the effects of shocks become permanent and increase as d increases above 1. Thus, the fractional integration approach is a very flexible method because it allows considering different categories of I(d) processes, with the following properties: a. d < 0: anti-persistence (i.e., a time series switches between high and low values in adjacent pairs); b. d = 0: I(0) or short-memory behavior; c. 0 < d < 0.5: covariance stationary, long-memory behavior with mean reversion; d. 0.5 ≤ d < 1: nonstationary though mean-reverting long-memory behavior, with long-lasting effects of shocks (i.e., any random shock in the series will disappear in the long run, though taking a longer time than in the previous case); e. d = 1: nonstationary I(1) behavior with shocks having permanent effects on the series (i.e., lack of mean reversion); f. d > 1: explosive behavior with lack of mean reversion.
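The binomial weights in the expansion above satisfy the simple recursion π_j = π_{j−1}(j − 1 − d)/j, which makes fractional differencing straightforward to implement. The following sketch (an assumed minimal implementation with illustrative parameters, not the authors' code) fractionally integrates Gaussian white noise to obtain an I(0.4) series, then applies (1 − L)^0.4 to recover a short-memory residual:

```python
# Minimal sketch of fractional differencing via the truncated binomial
# weights pi_j = pi_{j-1} * (j - 1 - d) / j: integrating white noise with
# d = 0.4 produces a persistent series; applying (1 - L)^{0.4} to it
# removes the memory again. Illustrative parameters only.
import random

def frac_diff(x, d):
    """Apply (1 - L)^d via the truncated binomial weights pi_j."""
    pi = [1.0]
    for j in range(1, len(x)):
        pi.append(pi[-1] * (j - 1 - d) / j)
    return [sum(pi[j] * x[t - j] for j in range(t + 1)) for t in range(len(x))]

def lag1_autocorr(x):
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    return num / sum((v - m) ** 2 for v in x)

rng = random.Random(7)
eps = [rng.gauss(0.0, 1.0) for _ in range(1200)]
x = frac_diff(eps, -0.4)   # fractional integration: an I(0.4) series
u = frac_diff(x, 0.4)      # fractional differencing with the true d
rho_x = lag1_autocorr(x)   # strongly positive: persistence
rho_u = lag1_autocorr(u)   # near zero: short memory recovered
print(rho_x > 0.2, abs(rho_u) < 0.15)
```

Negative d in `frac_diff` inverts the filter, which is how the persistent test series is generated in the first place.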
Note that u_t in Eq. 7 can also display some type of short-memory structure, like the one produced by the stationary and invertible Auto Regressive Moving Average (ARMA) type of models. Thus, for example, if u_t follows an ARMA (p, q) model (where p and q are respectively the orders of the AR and MA components), we say that x_t is a fractionally integrated ARMA (i.e., ARFIMA (p, d, q)) model (Sowell, 1992; Beran, 1993).
There are several methods for estimating the value of the parameter d (or testing the hypothesis of long memory) in time series. The subsections below describe those applied in the present work.

Robinson Tests for Fractional Differentiation
The tests of Robinson (1994) are parametric tests of the null hypothesis H_0: d = d_0 in Eq. 7, for any real value d_0. In the present study, we use two of them. The first one (RBWN hereinafter) imposes a linear trend on the data and uses the residuals to compute u_t (Eq. 7) for different values of d_0, in order to find the value leading to a white noise series. Specifically, the regression model fitted through the data is expressed by:

x_t = β_0 + β_1 t + ξ_t, t = 1, 2, . . . , T, (11)

where β_0 and β_1 are regression coefficients, and ξ_t indicates the residual. In the present study, we consider three standard cases: 1) β_0 = β_1 = 0 (Model 1); 2) β_1 = 0 and β_0 determined from the data (Model 2); and 3) both β_0 and β_1 estimated from the data (Model 3).
The residuals are then used to compute u_t by applying Eq. 7:

(1 − L)^{d_0} ξ_t = u_t.

The value d_0 that leads to a white noise series u_t is taken as the best value for d (i.e., the null hypothesis H_0: d = d_0 is accepted if d_0 leads to a white noise series). The second test (RBBL hereinafter), which is based on the original approach of Bloomfield (1973), is similar to the previous one but assumes a short-memory autocorrelation structure for u_t. The best value for d is chosen as the value d_0 that generates a time series u_t with a spectral density function of the form:

f(λ; τ) = (σ²/2π) exp(2 Σ_{j=1}^{p} τ_j cos(λj)), (13)

where p = 1 in the present study (meaning that any value in the series is correlated with the previous one only), and τ_j indicates the autocorrelation coefficients (see Robinson (1994) for details). Bloomfield (1973) showed that the logarithm of f(λ) approximates very well the logarithm of the spectrum of autoregressive (AR) processes. Thus, Eq. 13 is used here to express the autocorrelation in u_t, and it also accommodates very well the context of fractional integration (see also Gil-Alana (2004)).
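To convey the flavor of the RBWN procedure, the sketch below grid-searches d_0 and picks the value whose fractionally differenced (demeaned, mimicking Model 2) series looks most like white noise. This is a deliberate simplification of ours: Robinson's (1994) test is a formal Lagrange Multiplier statistic, whereas whiteness is judged here only by the lag-1 sample autocorrelation.

```python
# Simplified stand-in for the RBWN idea (NOT Robinson's actual LM test):
# grid-search d0, fractionally difference the demeaned data, and keep the
# d0 whose residual u_t is "whitest" by the lag-1 sample autocorrelation.
import random

def frac_diff(x, d):
    """Apply (1 - L)^d via the truncated binomial weights pi_j."""
    pi = [1.0]
    for j in range(1, len(x)):
        pi.append(pi[-1] * (j - 1 - d) / j)
    return [sum(pi[j] * x[t - j] for j in range(t + 1)) for t in range(len(x))]

def lag1_autocorr(x):
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    return num / sum((v - m) ** 2 for v in x)

def rbwn_like_d(x, grid):
    """Pick the d0 whose fractional difference of the demeaned data is whitest."""
    xm = sum(x) / len(x)
    xi = [v - xm for v in x]  # Model 2 residuals: x_t - beta_0
    return min(grid, key=lambda d0: abs(lag1_autocorr(frac_diff(xi, d0))))

# Simulated I(0.4) series: fractional integration of Gaussian white noise.
rng = random.Random(11)
eps = [rng.gauss(0.0, 1.0) for _ in range(800)]
x = frac_diff(eps, -0.4)  # (1 - L)^(-0.4) applied to eps
d_hat = rbwn_like_d(x, [k / 10 for k in range(0, 9)])  # grid: 0.0, 0.1, ..., 0.8
print(d_hat)  # close to the true value 0.4
```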

Robinson Test Based on the "Local" Whittle Approach
The Robinson test based on the "local" Whittle approach is a semi-parametric test originally proposed by Robinson (1995) (RB95 hereinafter). The word "local" is used to indicate that the method only considers a band of frequencies close to zero. The estimate of d (d̂) is calculated as:

d̂ = arg min_d [ log Ĉ(d) − 2d (1/m) Σ_{i=1}^{m} log λ_i ], with d ∈ (−0.5, 0.5], (14)

where m = T^δ (in this study, we assume δ = 0.40, 0.45, . . . , 0.70, with a step of 0.05) is the bandwidth parameter, λ_i = 2πi/T, and

Ĉ(d) = (1/m) Σ_{i=1}^{m} I(λ_i) λ_i^{2d},

with I(λ_i) the periodogram of the series. Although this method was extended and improved by many authors (e.g., Velasco, 1999; Shimotsu and Phillips, 2006; Abadir et al., 2007), we use the Robinson (1995) version due to its simplicity. In particular, unlike other methods that require additional user-chosen parameters, to which the results are typically very sensitive, it only requires the bandwidth parameter m.
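An assumed, grid-based implementation of the local Whittle objective (ours, not the authors'; a formal optimizer would replace the grid) behaves as expected on a simulated I(0.4) series:

```python
# Assumed sketch of the "local" Whittle objective
# R(d) = log((1/m) sum_i lambda_i^{2d} I(lambda_i)) - (2d/m) sum_i log(lambda_i),
# minimized over d in (-0.5, 0.5] using only the m = T^delta lowest frequencies.
import cmath
import math
import random

def periodogram(x, j):
    T = len(x)
    lam = 2.0 * math.pi * j / T
    dft = sum(x[t] * cmath.exp(-1j * lam * t) for t in range(T))
    return abs(dft) ** 2 / (2.0 * math.pi * T)

def local_whittle_d(x, delta=0.6):
    T = len(x)
    m = int(T ** delta)  # bandwidth: only the m lowest Fourier frequencies
    lams = [2.0 * math.pi * j / T for j in range(1, m + 1)]
    pgram = [periodogram(x, j) for j in range(1, m + 1)]
    def R(d):
        c = sum(l ** (2.0 * d) * i for l, i in zip(lams, pgram)) / m
        return math.log(c) - 2.0 * d * sum(map(math.log, lams)) / m
    grid = [k / 100.0 for k in range(-49, 51)]  # d in (-0.5, 0.5]
    return min(grid, key=R)

# Fractionally integrated test series with true d = 0.4 (truncated MA weights).
rng = random.Random(5)
T = 1000
eps = [rng.gauss(0.0, 1.0) for _ in range(2 * T)]
psi = [1.0]
for j in range(1, T):
    psi.append(psi[-1] * (j - 1 + 0.4) / j)
x = [sum(psi[j] * eps[T + t - j] for j in range(T)) for t in range(T)]
d_hat = local_whittle_d(x, delta=0.6)
print(d_hat)  # close to 0.4
```

The hard upper bound of the grid at 0.5 reproduces the saturation effect discussed in the "Results" section.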

Modified Rescaled Range Analysis
The rescaled range analysis (R/S analysis) is a non-parametric method that allows computation of the so-called Hurst exponent, H (Hurst, 1951). As with d, H measures the level of correlation in time series. It takes values between 0 and 1, and indicates anti-persistence if H < 0.5 and persistent behavior if H > 0.5. A value close to 0.5 indicates no correlation within the series. In the case of covariance stationary processes, the Hurst exponent is related to d through the following relationship (e.g., Beran, 1993):

H = d + 1/2.

In the present work, we apply the modified R/S procedure proposed by Lo (1991). Given a stationary time series x_t of length T, with sample mean x̄, sample variance σ̂²_x, and sample autocovariance at lag u given by ĉ_u, the modified R/S statistic is:

Q_T(q) = (1/σ̂_T(q)) [ max_{1≤k≤T} Σ_{t=1}^{k} (x_t − x̄) − min_{1≤k≤T} Σ_{t=1}^{k} (x_t − x̄) ],

and

σ̂²_T(q) = σ̂²_x + 2 Σ_{u=1}^{q} [1 − u/(q + 1)] ĉ_u. (17)

The parameter q in the previous equations is known as the bandwidth number. In this study, we assume q = 0, 1, 3, 5, 10, 30, 50. An advantage of the modified R/S statistic is that it allows us to obtain a simple formula to estimate the value of the fractional differencing parameter d, namely:

d = log Q_T(q)/log T − 1/2.

In the previous formulation, assuming q = 0 leads to the classic Hurst R/S analysis (Hurst, 1951). The estimates of d are considered acceptable for values of the parameter V_T(q) above 1.862 or less than 0.809 (corresponding to a 5% level of significance) (Lo, 1991), V_T(q) being defined as:

V_T(q) = Q_T(q)/√T.

The limit distribution of V_T(q) is derived in Lo (1991), and the 95% confidence interval with equal probabilities in both tails is [0.809, 1.862].
Monte Carlo experiments conducted, for example, by Teverovsky et al. (1999) showed that the modified R/S analysis is biased in favor of accepting the null hypothesis of no long memory as q increases. Thus, using Lo's modified method alone is not recommended as it can distort the results.
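For concreteness, the statistic and the plug-in estimate of d can be sketched as follows (an assumed implementation of ours on synthetic series, not the authors' code):

```python
# Sketch of Lo's modified R/S statistic Q_T(q), the diagnostic
# V_T(q) = Q_T(q)/sqrt(T), and the plug-in estimate d = log(Q_T)/log(T) - 1/2.
# q = 0 reduces to the classic Hurst R/S analysis.
import math
import random

def modified_rs(x, q=0):
    T = len(x)
    mean = sum(x) / T
    dev = [v - mean for v in x]
    # sigma_T^2(q): sample variance plus Bartlett-weighted autocovariances
    s2 = sum(v * v for v in dev) / T
    for u in range(1, q + 1):
        c_u = sum(dev[t] * dev[t - u] for t in range(u, T)) / T
        s2 += 2.0 * (1.0 - u / (q + 1.0)) * c_u
    # range of the partial sums of the deviations from the mean
    s, low, high = 0.0, 0.0, 0.0
    for v in dev:
        s += v
        low, high = min(low, s), max(high, s)
    return (high - low) / math.sqrt(s2)

def d_from_rs(x, q=0):
    return math.log(modified_rs(x, q)) / math.log(len(x)) - 0.5

rng = random.Random(2)
white = [rng.gauss(0.0, 1.0) for _ in range(1500)]
walk, s = [], 0.0
for v in white:
    s += v
    walk.append(s)  # strongly persistent series (cumulative sum)

d_white = d_from_rs(white)  # close to 0: no memory
d_walk = d_from_rs(walk)    # well above d_white: persistence
v_white = modified_rs(white) / math.sqrt(len(white))  # compare with [0.809, 1.862]
print(d_white < d_walk)
```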

RESULTS
We first present the results obtained by using the two parametric approaches of Robinson (1994). Table 1 summarizes the estimates of d (and the relative 95% confidence bands) obtained by using RBWN (Table 1a) and RBBL (Table 1b). As stated in the foregoing, we show the results for three standard cases: β_0 = β_1 = 0 (Model 1), β_1 = 0 and β_0 determined from the data (Model 2), and both β_0 and β_1 estimated from the data (Model 3). We have marked in bold the most significant model according to the statistical significance (i.e., t-value) of the estimates of β_0 and β_1 for the three cases considered (see Table 2).
Analyzing the values of d in Table 1 reveals that all time series present the long-memory feature (d > 0), thus confirming the results from a number of previous studies showing that seismicity is a long-memory process (see the "Introduction" for a list of references about this issue). Model 2 always turned out to be the most significant model based on t-values (Table 2). Specifically, both parts of Table 1 show that the values of d corresponding to the time series of the daily number of earthquakes (N13+ and N15+) are in the range (0, 0.5), while those relative to the time series of M_0 (M13+ and M15+) are in the range (0.5, 1). Precisely, concerning these latter series, the values of d indicate a nonstationary, mean-reverting behavior with long-lasting effects of shocks. For these series, the values obtained by using RBBL are significantly lower (up to ≈25%) than those resulting from RBWN. This may be explained by recalling that RBBL assumes a short-memory autocorrelation structure for u_t (Eq. 13), while RBWN assumes that u_t is white noise. Roughly speaking, this means that if the data present some short memory (which is reflected in ξ_t in Eq. 11), RBBL takes it into consideration (as a feature of u_t) while RBWN does not. Hence, in the case of RBWN, the values of d may be biased by the short memory in the data. Thus, the higher level of nonstationarity indicated by the values of d resulting from the application of RBWN to M13+ and M15+ may be attributed to some short memory in ξ_t. Comparing the estimates of d obtained by applying RBWN and RBBL to N13+ and N15+ shows that the latter approach provides greater values. However, in this case, the difference is negligible, as it is less than 10%. Finally, for both types of time series, we can observe that the value of d tends to decrease as the magnitude completeness threshold M_c increases.
Table 3 summarizes the results obtained by applying the RB95 semi-parametric approach for different values of the bandwidth parameter m = T^δ (i.e., δ = 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, and 0.70). We observe that, for the N13+ and N15+ series, the values of d reach around 0.5, which is the highest value allowed (see Eq. 14). This indicates an effect of saturation of d at lower m values (see also Figure 7, top panels). To overcome such an effect, we take the first differences of the data and apply RB95 to them. The final value of d is then obtained by adding 1 to the value resulting from the application of RB95 to the series of the first differences of the original data. The d values calculated this way are shown in Figure 7 (red lines) in comparison with those obtained from the original data series (blue lines). Note that repeating the analysis on the series of the first differences produces a saturation of d as m increases. Saturated values obviously have to be discarded. Such a saturation effect, along with the high variability of d, which is particularly evident for smaller values of m, could be indicative of "spurious" long memory due to regime change (e.g., short memory) or a smoothly varying trend (Qu, 2011). This is confirmed by the results (not shown here for the sake of brevity) of the test statistic proposed by Qu (2011) (which uses the local Whittle estimate of d for m frequency components) for the null hypothesis that a given time series is a "pure" stationary long-memory process. The saturation is not observed for the d values obtained for the M13+ and M15+ series (Figure 7, bottom panels), which range between about 0.1 and 0.4, thus indicating a covariance stationary, long-memory behavior with mean reversion.

TABLE 1 | Estimates of d obtained by using the tests of Robinson (1994): a) RBWN; b) RBBL. Results are presented for three standard cases: β_0 = β_1 = 0 (Model 1), β_1 = 0 and β_0 determined from the data (Model 2), and both β_0 and β_1 estimated from the data (Model 3). In bold is the model accepted according to the t-values associated with the estimates of β_0 and β_1 (see Table 2).

TABLE 2 | t-values associated with the estimates of β_0 and β_1 for the three models. A model has to be rejected if at least one of its coefficients presents a t-value less than 1.95.

Table 4 lists the values of d determined through the modified R/S analysis for a selected number of q values. Evidence of long memory is once again found in all cases, with the values of d constrained between 0 and 0.5 for all series. On average, the values of d are lower than those discussed previously, with closer agreement for lower values of q. This is consistent with the observations of Teverovsky et al. (1999), showing that the modified R/S analysis is biased in favor of accepting the null hypothesis of no long memory as q increases. Again, increasing the magnitude completeness threshold yields lower values of d.
As with RB95, this is observed only for the time series of the daily number of earthquakes.

DISCUSSION
As is widely known and remarked in the Introduction, the rate of seismicity in geothermal areas is strongly related to fluid injection operations. Fluid injection induces perturbations in the crust that affect the local stress field and may lead to fault failure. Therefore, changes in the injection can induce changes in earthquake productivity (i.e., number of earthquakes), clustering properties, and known seismological parameters such as the b-value of the Gutenberg and Richter relation. Temporal variations in seismic activity at The Geysers geothermal field were studied by different authors (e.g., Eberhart-Phillips and Oppenheimer, 1984; Martínez-Garzón et al., 2013; Martínez-Garzón et al., 2014; Martínez-Garzón et al., 2018; Kwiatek et al., 2015; Johnson et al., 2016; Trugman et al., 2016), as were changes in the b-value (Convertito et al., 2012; Martínez-Garzón et al., 2014; Kwiatek et al., 2015; Trugman et al., 2016).
This section discusses how temporal fluctuations in fluid injection affect seismic activity and, consequently, the degree of memory in seismicity time series. To this end, time series of the daily number of earthquakes were analyzed in order to examine temporal variations in the value of d since 2005, both in the entire The Geysers area and in the northwestern sector (Zone 1 in Figure 1). For both cases, we divided the catalog into sections, each covering 1 year, and, for each of them, we computed the value of the completeness magnitude Mc. Specifically, we used a moving window approach with a 1-month shift between consecutive annual time series. Given the substantial fluctuation of the Mc value with time (1.0 ≤ Mc ≤ 1.5, not shown here), which may bias the results, we chose to consider only events with M ≥ 1.5. Then, we applied the RBBL, RBWN (following the results in Table 1, the regression coefficients in Eq. 11 were determined according to Model 2), and standard R/S (i.e., assuming q = 0 in Eq. 17) techniques to determine the value of d for each section of the catalog. The RB95 method was not used because of the sensitivity of its results to the choice of the bandwidth parameter m (i.e., because of the subjectivity in the choice of the m value). Contextually, we computed the b-value of the Gutenberg and Richter relation through the maximum likelihood approach (Aki, 1965), in search of a possible relation between it and d.
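As a minimal illustration of the maximum-likelihood b-value estimator of Aki (1965), b = log10(e)/(M̄ − Mc), the sketch below implements the basic formula; the catalog magnitudes are invented for the example and the function name is ours, not from the study.

```python
import math

def b_value_aki(magnitudes, mc):
    """Maximum-likelihood b-value (Aki, 1965): b = log10(e) / (mean(M) - Mc),
    computed over the events with magnitude at or above the completeness level Mc."""
    m = [x for x in magnitudes if x >= mc]
    mean_m = sum(m) / len(m)
    return math.log10(math.e) / (mean_m - mc)

# toy catalog of magnitudes, all at or above Mc = 1.5
mags = [1.5, 1.6, 1.8, 2.1, 1.7, 1.9, 2.4, 1.6, 1.5, 2.0]
print(round(b_value_aki(mags, 1.5), 2))  # → 1.4
```

A binning correction (replacing Mc with Mc − ΔM/2 for magnitudes binned with width ΔM) is often applied in practice; it is omitted here for clarity.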
For the entire The Geysers field, the charts in Figure 8 present the temporal variation of fluid injection (we aggregated monthly injection data provided by the Department of Conservation, State of California, to obtain values of yearly fluid injection), annual number of earthquakes above Mc = 1.5 (N(Mc)), b-value, and d. On average, the value of d ranges between 0.15 and 0.35, thus indicating again a stationary long-memory process. However, temporal fluctuations are evident, suggesting that the value of d is modulated to some extent by the hydraulic operations in the study area. A visual examination of the trend of d in conjunction with that of the fluid injection seems to indicate a negative correlation, with d tending to assume smaller values during periods of increased fluid injection. However, the level of correlation between these variables, which is quantified here by the well-known Pearson correlation coefficient (r), is weak (and not always significant), as r varies between −0.08 and −0.34 depending on the method used to compute d (the values of r are summarized in Table 5). A decrease of the b-value, which corresponds to an increased number of events with larger magnitude, indicates that the system is in a critical state or is approaching it (see also De Santis et al., 2011). This is in agreement with the observations of Martínez-Garzón et al. (2018), showing an increase of main shock magnitude along with increased foreshock activity during these periods. The concurrent increase of the d-value may be interpreted as an increase of the degree of persistence of larger and larger events in the system. Conversely, the system moves away from the critical state during periods of increased b-values, presenting an energy deficit. During these periods, the value of d tends to decrease, thus indicating a decrease of persistence.
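For illustration, the Pearson coefficient used above can be computed as in the sketch below; the two series are hypothetical stand-ins for the annual d estimates and yearly injected volumes, not the actual values behind Table 5.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# hypothetical series: annual d estimates vs. yearly injected volume (arbitrary units)
d_series = [0.35, 0.30, 0.22, 0.18, 0.25, 0.28]
injection = [60, 70, 85, 90, 75, 68]
print(round(pearson_r(d_series, injection), 2))  # → -0.98
```

The strong anticorrelation here is an artifact of the invented numbers; the values actually observed in the study are much weaker (−0.08 to −0.34).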
Similar observations can be made by analyzing the moving window results for Zone 1 (Figure 9 and Table 5). In agreement with the findings of Martínez-Garzón et al. (2018), the similarity between the results obtained for this zone and those for the entire field suggests that the northwestern sector provides a fair representation of the processes that govern earthquake clustering in the whole field.

SUMMARY AND CONCLUSIONS
Our study has analyzed the long-memory feature in seismicity through a comprehensive statistical analysis of a set of earthquake time series associated with the activity in The Geysers geothermal field. In particular, we have applied different statistical techniques to time series of the daily number of earthquakes and of the seismic moment released daily, for different magnitude completeness thresholds. We have focused on The Geysers because of the high level of completeness of its seismic catalog.
Unlike previous studies, which investigated the property of long memory in a non-parametric way, we have considered a parametric model based on the concept of fractional integration. The level of correlation among observations in the time series is expressed by a real parameter d, which is indicative of long memory for values greater than zero. The results obtained by using the parametric approach have been compared to those resulting from semi-parametric and non-parametric methods.
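The fractional integration model underlying the parametric approach can be illustrated with a short simulation. The sketch below (ours, not the estimation code used in the study) generates an ARFIMA(0, d, 0) series by truncating the MA(∞) expansion of (1 − B)^(−d); for 0 < d < 0.5 the resulting series is stationary with long memory, while d ≥ 0.5 yields nonstationary behavior.

```python
import random

def fi_weights(d, n):
    """MA(inf) weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, n):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def simulate_arfima(d, n, seed=0):
    """Simulate an ARFIMA(0, d, 0) series by convolving Gaussian white
    noise with the (truncated) fractional-integration weights."""
    random.seed(seed)
    eps = [random.gauss(0.0, 1.0) for _ in range(n)]
    psi = fi_weights(d, n)
    return [sum(psi[k] * eps[t - k] for k in range(t + 1)) for t in range(n)]

# a stationary long-memory series (0 < d < 0.5)
x = simulate_arfima(d=0.3, n=500)
```

Because the weights psi_k decay hyperbolically rather than geometrically, the autocorrelations of x decay slowly; this is the signature that distinguishes long memory from short-memory (e.g., ARMA) dependence.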
Our results have pointed out the presence of long memory for all the time series and methods considered, supporting the view of seismicity as a process characterized by long memory. In particular, the estimates of the differencing parameter d obtained by applying the parametric RBWN and RBBL approaches have shown that the time series of the daily number of earthquakes present long memory with stationary behavior, with d ranging between around 0.3 and 0.4. Short-memory effects do not bias these results, as the values of d resulting from RBWN are close to those estimated by RBBL. The long-memory feature has also been observed in the time series of seismic moment; in this case, however, the behavior is nonstationary, with d between 0.5 and 1. We have observed that the estimates obtained by using RBWN could be biased toward higher values (up to around 0.75) due to short-memory autocorrelation, which affects the values of d computed via RBBL to a lesser extent (d ≈ 0.55).
Similar results have been obtained by applying the semi-parametric RB95 approach and the non-parametric R/S analysis, although both proved rather sensitive to the choice of the bandwidth parameter. In particular, as regards RB95, we have shown that the values of d tend to converge to those resulting from RBBL as the bandwidth parameter m increases; the opposite occurs for the modified R/S analysis. This would confirm the short-memory bias in the results obtained by applying RBWN to the seismic moment time series.
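For reference, the classical rescaled range statistic (i.e., the unmodified R/S, corresponding to q = 0) and the resulting Hurst-exponent estimate can be sketched as follows; for a fractionally integrated process, d = H − 1/2. This is a minimal illustration under our own naming, not the modified R/S procedure applied in the analysis.

```python
import math
import random

def rs_statistic(x):
    """Classical rescaled range R/S for a single window."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # range of the cumulative deviations from the mean
    z, cum = [], 0.0
    for v in dev:
        cum += v
        z.append(cum)
    r = max(z) - min(z)
    s = math.sqrt(sum(v * v for v in dev) / n)
    return r / s

def hurst_rs(x, window_sizes):
    """Estimate H as the least-squares slope of log(mean R/S) vs. log(n)
    over non-overlapping windows of the given sizes."""
    pts = []
    for w in window_sizes:
        rs_vals = [rs_statistic(x[i:i + w]) for i in range(0, len(x) - w + 1, w)]
        pts.append((math.log(w), math.log(sum(rs_vals) / len(rs_vals))))
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    return sum((p[0] - mx) * (p[1] - my) for p in pts) / sum((p[0] - mx) ** 2 for p in pts)

# example: for white noise the estimate should sit near 0.5, although the
# classical R/S is known to be upward-biased in small samples
random.seed(7)
noise = [random.gauss(0.0, 1.0) for _ in range(512)]
h = hurst_rs(noise, [8, 16, 32, 64])
```

This small-sample bias is precisely why modified versions of the statistic (with q > 0 lags of short-memory correction) are preferred in the study.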
Finally, albeit moderately, our results have pointed out that the value of d is influenced by the amount of information in the chain of intervening related events. In particular, d tends to decrease as the magnitude completeness threshold Mc increases. This is more evident for the time series of the number of earthquakes. Indeed, the value of Mc is controlled by small events and therefore significantly affects the daily number of earthquakes in the series. Conversely, the total seismic moment released daily is controlled by stronger events, and the corresponding time series are therefore less influenced by Mc. Nevertheless, according to the results obtained by applying RBWN to the seismic moment time series, short-memory effects associated with small-magnitude earthquakes would affect nonstationarity: indeed, we have observed that d increases significantly above 0.5 as Mc decreases. Therefore, catalog completeness assessments may play a fundamental role in the reliability of d estimates.
The analysis of the temporal variation of d, besides corroborating the previous considerations about the long memory in the seismic process, has revealed a certain degree of correlation with hydraulic operations. Specifically, our results indicate a moderate but significant negative correlation between d and the b-value of the Gutenberg and Richter relation. A negative, albeit weaker, correlation, which deserves further investigation in the future, is found between d and the fluid injection, as well as between d and the annual number of earthquakes (the correlation with fluid injection is not quantified for Zone 1, as specific injection data are not available). Besides proving long memory in seismicity, our study has pointed out the importance of using multiple approaches to obtain reliable estimates of the parameters capturing the long-memory behavior in the earthquake process. The improved ability to compute reliable values of such parameters, along with the increasing availability of earthquake data, can strengthen the awareness of scientists of the importance of developing and using earthquake forecasting models based on stochastic processes that allow for long-memory dynamics (e.g., Garra and Polito, 2011).

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: http://ncedc.org/egs/catalog-search.html.

AUTHOR CONTRIBUTIONS
This study started from an original idea of all authors. SB and MT provided the time series that were subsequently analyzed by LC, SB, LG-A, and MT. MT also calculated the completeness of the data set. SB and LG-A wrote the main body of the manuscript in collaboration with LC. GF supervised the work since the early stages. The results reported in Table 5 were provided by MT.

ACKNOWLEDGMENTS
We are grateful to two anonymous reviewers for their valuable comments and suggestions that have significantly improved the manuscript. SB is thankful to Benjamin Edwards and Matteo Picozzi for the fruitful discussions about magnitude conversion. LC kindly acknowledges prof. Elvira Di Nardo (Università degli Studi di Torino) for her support and suggestions during the preparation of the article.