Induced microearthquakes predict permeability creation in the brittle crust

Predicting the evolution of permeability accurately during stimulation at the reservoir scale and at the resolution of individual fractures is essential to characterize the fluid transport and the reactive/heat-transfer characteristics of reservoirs where stress exerts significant control. Here, we develop a hybrid machine learning (ML) model to visualize in situ permeability evolution for an intermediate-scale (∼10 m) hydraulic stimulation experiment. This model includes an ML model that was trained using the well history of flow rate and wellhead pressure and MEQ data from the first three stimulation episodes to predict average permeability from the statistical features of the MEQs alone for later episodes. Moreover, a physics-inspired model is integrated to estimate in situ fracture permeability spatially. This method relates fracture permeability to fracture dilation and scales dilation to the equivalent MEQ magnitude, according to laboratory observations. The seismic data are then applied to define incremental changes in permeability in both space and time. Our results confirm the excellent agreement between the ground truth and model-predicted permeability evolution. The resulting permeability map defines and quantifies flow paths in the reservoir with the averaged permeability comparing favorably with the ground truth of permeability.


Introduction 1
Enhanced geothermal systems (EGS) offer the possibility to provide plentiful and continuous 2 carbon-free energy for the increasing global demand of electric power (Kneafsey et al., 2018& 3 2019; Williams et al., 2008) and have significant potential to shift the current dependence on 4 fossil fuels (Fridleifsson et., al 2018). The net recoverable energy and power depend on reservoir 5 temperature, utilizable reservoir volume (Bauer et al., 2019) and rates of fluid mass and heat 6 transfer, which are in turn controlled by reservoir-scale petrophysical parameters (Laubach et al., 7 2009;Bauer et al., 2017;Kushnir et al., 2018). Thus, the ability to create a reservoir and to 8 predict permeability evolution before, during and after stimulation at reservoir scale and at the 9 resolution of individual fractures is essential to accurately estimate fluid mass transfer and heat 10 recovery from the reservoir. 11 A key challenge is in defining reservoir-scale response in the opaque subsurface where 12 observation is limited to the acoustic signal from the micro-earthquakes (MEQs) that typically 13 accompany the stimulation. This is exacerbated by the highly stress-and pressure-sensitive 14 controls in creating new fractures and reactivating existing fractures. This resulting and 15 improved fluid transmission and heat transfer response may be evaluated if the geometric 16 evolution and connectivity of the fracture network is known/predictable. However, the 17 complexity of this nonlinear response and reinforcing feedbacks linking changes in reservoir 18 pressure to changes in permeability renders such problems intrinsically ill-constrained and ill-19 defined. This situation is compounded by the reality that initial conditions are also poorly 20 constrained and perhaps unknowable. Additionally, these models rely on the integration of 21 multiple geologic, geochemical, and geophysical observations collected during and after well 22 stimulation. Thus, this innate complexity renders accurate predictions of permeability evolution 23 during well stimulation nigh impossible -limiting the utility of traditional forecasting methods 24 in light of such complexity. 25 The application of machine learning (ML) models potentially eases this difficulty by 26 accommodating prior learning from multiple prior experiences in previously-probed and self-27 similar reservoirs with common nonlinear interactions and feedbacks. Machine learning (ML), 28 deep learning (DL) and hybrid methods combined with other techniques offer promise in 29 predicting static permeability structure in reservoirs. This includes permeability structure 30 recovered from key petrophysical features (e.g. density, grain or pore diameter and porosity) 31 recovered from core data (Al Khalifah et al., 2019;Erofeev et al., 2019) using ML methods and 32 from well logging data (Gholami et al., 2012;Eriavbe et al., 2019;Ahmadi et al., 2019;Khanet 33 al., 2019;Karimpouli et al., 2020) using DL. Combining 3D seismic and magnetotelluric (MT) 34 datasets (Matzel, et al., 2021) or pore structure recovered from X-ray micro- CT (Wu et al., 2018;35 Tian et al., 2020;Tembely et al., 2021) have replicated permeability structure using hybrid and 36 DL models. These methods provide valuable insights into applying ML or DL methods to predict 37 permeability. However, lacking are ML methods to predict permeability evolution where 38 stimulation at reservoir scale results in significant changes in permeability. 39

43
We present a novel hybrid ML method to map 3D permeability evolution using microearthquake 44 (MEQ) data recovered concurrently with fluid injection history during well stimulation. This 45 strategy combines an ML model to predict average permeability evolution in the reservoir with a 46 physical model connecting fracture permeability MEQ magnitude. We use experimental data 47 wellbore to an external pressure boundary, (2) this boundary migrates outwards with the cloud of 83 MEQs, and (3) the MEQs are partitioned between those resulting from changes in effective stress 84 (80% assumed) and those beyond this region resulting from changes in total stress (20% 85 assumed). Thus, the activated reservoir is confined to a volume that only contains fractures 86 (MEQs) reactivated by fluid percolating directly from the injection (80% of the events) and that 87 this stimulated volume grows with time as the external cylindrical boundary contour migrates as 88 more distant MEQs are recorded. Thus, this migrating cylindrical envelope is estimated from the 89 cumulative frequency of MEQs-with-radius and capped where cumulative frequency is 0.8 90 ( Figure 2 b1-b5). 91  outwards during these two latter experiments. The radius is zero before any MEQ and grows to 106 ~7.5 m at the end of episode #4 and ~12 m for episode #5, suggesting a continuous flow pathway 107 between injection and production wells.

118
We estimate average permeability from the injectivity by presuming radial flow from the 10m 119 long borehole with interior and exterior pressure boundary conditions fixed to those of the 120 injection and production well, respectively, as, = µ ( / ) , where is the average 121 permeability (m 2 ), µ is the viscosity of water (8.9×10 -4 Pa·s), r is the migrating radius of the 122 cylindrical volume (m), which is derived from the migration of MEQs during the experiment. 123 is the interior/injection wellbore radius (0.048m). This represents an approximation of the 124 geometric correction for flow that is likely heterogeneous with azimuth but is here considered 125 uniform. The deterministic estimate of permeability for episodes #4 and #5 is shown as the light 126 green line in Figure. 3(d) and 3(e). Figure 3(f) shows well pressure (black line) and flow rate 127 (blue line) histories for each of the injection (top) and production wells (bottom) in episode #4. 128 Flow rate was observed to spike for approximately one minute at the production well, 129 highlighted in the blue shaded area, indicating that fluid had propagated to the production well. 130 The permeability during this short period can be estimated using Darcy's Law, = µ , where 131 dq (1.68 LPM) is the net average flowrate between the two wells, L is the distance between 132 injection and production wells (~10m), A is the cross-sectional area (π· ( ) 2 m 2 ), dp (25.14 MPa) 133 is the average pressure difference between two wells. The resulting permeability is 1.2 × 134 − m 2 , which is close to our estimation of 2 × − m 2 at the same time (T) based on the 135 cylindrical flow geometry. This further confirms that the assumption of ~80% of seismic events 136 directly induced by effective stress (direct contact with elevated fluid pressures) is reasonable. 137

ML predicted average permeability in the reservoir [
Step 2]. The ML model is built using 138 data from the first three injection episodes #1-#3 with ~450 MEQs and ~2.5 hours of well 139 stimulation (training data). This comprises nine statistical features of the seismic data calculated 140 over a small moving time window and labeled by injectivity over the same time window. Each 141 time window is one minute in duration. The first 80% of the dataset is used to train the model 142 with the remaining 20% used to quantify whether the model is overfitting. Hyperparameters are 143 set by five-fold cross-validation and the optimal model is selected using root mean squared error 144 (RMSE) as an evaluation metric. A step-by-step description of both the data analysis and 145 machine learning methods is documented in the Methods section. Note that the training data may 146 be any single episode, or multiple episodes, that precede the predicted episode. For example, to 147 estimate injectivity in episode #4 (testing data), the training data can be episode #3 only, or all 148 three episodes from #1 to #3. We run six tests in total to determine the optimal training dataset 149 and input features to predict injectivity for episodes #4 and #5. The nine statistical features of 150 the MEQs produce a total of 511 possible combinations. Each combination (of features) is 151 applied to build the ML model for each test. The best model for each test is defined with R 2 152 closet to unity using the difference between the deterministic and predicted in each episode. 153 These models are selected for each of the best with its input features, R 2 , RMSE and 154 hyperparameters shown in Table 1. 155  ML predicted injectivities. Importantly, this injectivity is predicted purely based on training of 167 the ML algorithm against injectivity-vs-MEQ histories over episode #3 to predict injectivity over 168 episodes #4 and #5 from the MEQs-alone. These predictions are derived purely from the 169 statistical features of the seismic events corresponding to a history of injectivity that the model 170 has never seen. We emphasize that there is no past nor future information considered when 171 making such a prediction. Each prediction uses only the statistical features of seismic events 172 within a single moving window. Thus, we can predict the corresponding history of injectivity in 173 episodes #4 & #5 from the seismic catalog, alone. 174

183
Feature importance of input features.

185
The trained XGBoost model automatically evaluates the relative importance on the predictive 186 features in honoring the training data and therefore provides a nascent physical understanding for 187 the problem. The F-score defines the relative importance of each input feature, which infers the 188 relative importance of that corresponding feature in controlling response of the physical system. 189 The evolution of the predictive key features for episodes #4 and #5 are shown in Figures 4c and  190 5c, together with their feature importance in Figures 4d and 5d

254
We average the evolving fracture-by-fracture spatial distribution of permeability previously 257 recovered from equation (10) (Step 3) to compare against the average permeability recovered 258 from Steps 1 & 2. We mesh a 20×20×10m cube that contains the 10m radius cylinder and 259 discretize the volume using 0.1 m grid. Each calculated fracture permeability value is then 260 assigned into the mesh block based on its location. If several fractures are in the same mesh 261 block, then the largest (most dominant) of the fracture permeabilities is selected. Mesh blocks 262 without fractures are assigned a matrix permeability of 10 -19 m 2 . We arithmetically average and 263 volume-weight the permeability across all mesh blocks. The cross comparison of average 264 permeability by different methods at the end of episodes #4 and #5 are shown in Table.2. 265 Permeability between two wells for episode#4 is calculated by Darcy's law as shown in Figure  266 3e-insert. No data of flow rate and pressure at the production well for episode #5 are collected, 267 thus permeability estimated using this method is missing. It is observed that average permeability 268 estimated by the physics-inspired model (equation 10) is only slightly higher than the other 269 methods. The deterministic calculation of permeabilities for episodes #4 and #5, recovered from 270 the injectivity ground truth (Step 1), are of the same magnitude as the average permeabilities 271 estimated from the fracture permeability maps (Step 3) -indicating that fracture permeabilities 272 may be directly constrained from MEQs and based on moment magnitude. 273

Discussion 274
We apply deterministic ground truthing (Step 1), hybrid ML methods (Step 2) and physics-275 inspired MEQ-permeability constraints (Step 3) to cross correlate permeability evolution in an 276 unusually tightly constrained field experiment. Based on the ML methods, injectivity is predicted 277 from the statistical features of the seismic events corresponding to a history of injectivity that the 278 model has never seen. This is accomplished by training the ML model in early episodes of 279 observation and predicting over the later (final two) episodes. Average permeability of the 280 fractured reservoir is then calculated from injectivity, using a geometric correction of the steady 281 state. This causality between injectivity/permeability and MEQs is then used to develop a 282 physics-inspired model linking permeability evolution to MEQ magnitude. This relies on the 283 parallel plate analog linking permeability to fracture aperture then dilation and then 284 supplemented by laboratory observations scaling dilation to equivalent MEQ magnitude. These 285 data are then scaled to link individual MEQs of known magnitude and location to define 286 incremental changes in permeability in both space and in time. This pointwise distribution of 287 permeability is then used to define a map of permeability structure and then averaged to recover 288 the mean permeability evolution. This prediction closely matches the measured ground truth of 289 ensemble permeability recovered from this unusually well-constrained injection experiment. The 290 promising approach to quantify evolution of permeability as a function of observable MEQs 291 provides a valuable tool in the characterization of reservoirs with the potential to visualize the 292 evolution of fluid flow paths, evaluate reactive/heat-transfer surface area and capable of further 293 constraint using other superimposed signals, say of tracer response. Challenges for the 294 application of this method to larger scale include dealing with much noisier data, applying longer 295 injection histories and operated under various conditions need to be collected as a database to 296 build ML models and to evaluate and improve accuracy. 297 Domain knowledge plays an essential role in choosing the key statistical features that condition 298 response for ML models. We select nine statistical features in the model to estimate the 299 dependence of injectivity on (1) number of MEQs, (2) magnitude of MEQs, and (3) the distance 300 of MEQs from injection well in the regression problem. The results illustrate that the distance to 301 MEQs from the injection well is the dominant feature in predicting injectivity, followed by the 302 number of MEQs and then the magnitude of MEQs. It should also be noted that the success of 303 supervised ML approaches are heavily dependent on the training data used. Our results show that 304 using the data solely from episode #3 to predict response in episodes #4 or #5 is much more 305 accurate than using all three episodes #1-#3. This is because flow rates following episode #3 306 were more than an order of magnitude larger than those used in prior stimulations (episode #1-2) 307 with similar injection pressure, resulting in a factor of ten increase in injectivity. If we choose 308 episodes #1-#3 instead of episode #3 alone, this will bias the training with ~78% (110 out of 140 309 minutes) of low injectivity data conforming to only ~33% (150 out of 450 seismic events) of 310 MEQs -data irrelevant for the higher injectivities observed during the later episodes #4 and #5. 311 Using the most recent and relevant data, alone. undoubtedly increases the accuracy in the 312 prediction. Therefore, choosing a dataset with similar characteristics delivers better prediction 313 than using a larger dataset including less relevant performance data. 314

Methods 315
Data analysis and machine learning methods. 316 (1) Data arrangement. In this project, well injection data is continuously recorded every second, 317 and seismic catalogue is recorded for each seismic event, which is not continuous in time like 318 well data. We start by creating files of well injection history and seismic catalogue with a 319 common time base. We then scan both well data and seismic data using the same moving time 320 windows, and compute statistic features of seismic data and average injectivity for each of these 321 windows. We define 9 features (the statistic features of seismic data) for ML model including  The time series of variables is then labelled with the injectivity correspondingly. The size of time 329 moving window is picked for a minute, and each time window moves forward for a second. We 330 build variables and the corresponding labels, and then add the resulting list of labelled features to 331 the database at each time increment. Finally, each line of dataset i is a list {xi 1 , xi 2 , xi 3 … xi n , yi}, 332 with xi n the n th feature of the i th time window of seismic data, n is the total number of features, 333 and yi the average injectivity during the same time moving window. 334 (2) Train-validate-test split. We aim to use ML model to predict injectivity in episodes #4 and #5. 335 The training data can be the dataset of one single episode or the combination of few episodes that 336 occurred before the predicted episode. For example, to estimate injectivity in episode #4 (testing 337 data), the training data can be episode #3 only, or episodes #1 to #3. The detail of training data 338 and testing data is given in Table1. As the database has a time series features of the seismic data 339 and corresponding average injectivity, the train-validate split must be of two continuous datasets 340 in time, not a random split. Here we use the first 80% of the training data to construct the ML 341 model and rest 20% of the training data for validating the model. 342 (3) Choose the features. The dataset is now composed of 9 features describing the seismic data 343 versus time moving window. Our goal is to predict the injectivity based on the given features. 344 The features can be any combinations of these 9 features, which results in 511 possibilities. 345 (5) Make prediction. We can now make predictions using the best model (final model). The 357 combination of features will be selected and then input into the best model to predict injectivity. 358 The predicted data will be compared with ground truth of injection to assess the performance of 359 the model. 360 (6) Feature importance. we can look for the best features identified by our model, to try to 361 understand how the model reached its estimations. Here we apply feature importance type as 362 'gain', which implies the relative contribution of the corresponding feature to the model 363 calculated by taking each feature's contribution for each tree in the model. A higher value of this 364 metric when compared to another feature implies it is more important for generating a prediction. 365 Linking seismic magnitude with fracture permeability using empirical equation. Here we 366 develop a new method to link MEQs and fracture permeability change during hydraulic 367 stimulation of a fractured reservoir. First, a simple configuration of hydro-shear failure is 368 assumed to be generated by the microearthquakes (MEQs) during stimulation, with a shear 369 displacement of u (m) uniformly applied to a square fracture plane of fracture length L (m). The 370 fracture area A (m 2 ) is then simply defined as 371

=
(1) 372 It has been demonstrated that fracture length can be estimated by aperture using an exponent 373 relationship as . 374

=
(2) 375 where b is the fracture aperture (m) and α and β are the coefficient that link the fracture aperture 376 and length. Permeability (m 2 ) can be estimated from the fracture aperture using the cubic law 377 as 378

= (4) 381
Where G is shear modulus of the reservoir rock, usually of the order of 30 GPa (McGarr,2014) 382 Seismic moment can be converted to an MEQ magnitude using 383 = .
( ) + . To determine the coefficients α1 and β for the reservoir injection here (EGS-Collab project) we 392 use experimental data that concurrently measured aperture and shear displacement during slip on 393 laboratory faults as pore pressure was incremented due to fluid injection (Fig. 1a of Li et al., 394 2021). These are the same reservoir rocks as in the field experiment. Seismic moment is then 395 calculated from measured reactivation displacement (equation (4)) and permeability is evaluated 396 from measured aperture (equation (3)) -both independently. Substitution of multiple correlated 397 values of permeability kf and moment Mw into equation (8) then yields the coefficients α1 and β. 398 The results for change in fracture permeability kf as a function of seismic moment Mw are shown 399 in Figure 7 with regression defining the fitting parameters as, 400 � � = .