METHODS article

Front. Earth Sci., 30 September 2022
Sec. Solid Earth Geophysics
Volume 10 - 2022

A deep learning approach for signal identification in the fluid injection process during hydraulic fracturing using distributed acoustic sensing data

Yikang Zheng1,2, Yibo Wang1,2*, Xing Liang3, Qingfeng Xue1,2, Enmao Liang4, Shaojiang Wu1,2, Shujie An5, Yi Yao1,2, Chen Liu3, Jue Mei3
  • 1Key Laboratory of Petroleum Resource Research, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, China
  • 2Innovation Academy for Earth Science, Chinese Academy of Sciences, Beijing, China
  • 3PetroChina Zhejiang Oilfield Company, Hangzhou, China
  • 4China State Shipbuilding Corporation, Limited 715th Research Institute, Hangzhou, China
  • 5Optical Science and Technology (Chengdu) Ltd, Chengdu, China

Full-cycle, real-time monitoring of wellbore flow during hydraulic fracturing is challenging in unconventional oil and gas development. In the past few years, distributed acoustic sensing (DAS) has provided opportunities to measure the acoustic energy distribution along the entire horizontal well. It is a promising tool for real-time monitoring and understanding of the fluid injection process. However, identifying the signal of effective flow in the wellbore from DAS data is cumbersome and prone to error. We propose a deep learning approach to solve this problem. The neural network combines Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory Networks (BiLSTM) to extract the spatial and temporal features from the DAS data. The trained model is applied to field data collected in a horizontal well. The results demonstrate its capability for intelligent monitoring and real-time evaluation of hydraulic fracturing.

Introduction


Hydraulic fracturing in horizontal wells has become the most effective stimulation technology for unconventional, low-permeability reservoirs. Real-time evaluation of the fracturing process provides important information for designing the unconventional-reservoir completion and improving production (Montgomery and Smith, 2010). Conventional monitoring methods, such as microseismic, time-lapse seismic, and pressure monitoring, are limited in coverage and resolution. Recently, distributed acoustic sensing (DAS) has emerged as a real-time downhole sensing technology. The fiber cable is installed permanently on the outside of a casing string and measures the vibration along the wellbore. In the DAS system, the interrogator unit transmits laser pulses along the cable, and the interferometer measures the changes in the Rayleigh back-scattering pattern associated with any deformation of the cable caused by incident waves (Mateeva et al., 2014; Spica et al., 2020). It is superior to other wellbore detection methods in offering real-time measurement, high spatial resolution, and convenient deployment.

The high-density data recorded by the fiber cable in the injection well can directly show the fluid migration in the wellbore. Detailed surveillance of the fluid during stimulation allows the design of the commonly used plug-and-perf completion to be optimized. Operation parameters such as fluid type, pumping method, injection volume, and sand concentration can then be chosen to achieve low-cost, high-efficiency production (Jin and Roy, 2017; Richter et al., 2019). However, manual analysis of DAS data is inefficient and prone to error. Applying machine learning or deep learning to this problem is an attractive solution. Jin et al. (2019) propose an artificial neural network (ANN) algorithm to identify fracture-hit signals in the DAS data recorded at offset monitor wells. Binder and Tura (2020) use convolutional neural networks (CNNs) to detect microseismic events in downhole DAS data. Stork et al. (2020) show the successful application of CNNs to microseismic event detection in DAS data. The purpose of this study is to identify the signal related to fluid injection in the borehole. The CNN is combined with Bidirectional Long Short-Term Memory Networks (BiLSTM) to extract the spatial and temporal features from the DAS data. The results demonstrate the feasibility and effectiveness of the proposed framework for large DAS data volumes.


Convolutional Neural Networks (CNNs) are a class of feedforward neural networks built from convolution operations and non-linear activation functions (O'Shea and Nash, 2015). They are among the representative algorithms of deep learning and are commonly used to analyze visual images. They are also known as shift-invariant or space-invariant artificial neural networks (SIANN), because they are based on a shared-weight structure of convolution kernels or filters that slide along the input features and provide translation-equivariant responses. Counterintuitively, most CNNs are only equivariant to translation, not invariant. Their applications include image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, and natural language processing (Gu et al., 2018).
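The difference between equivariance and invariance can be illustrated with a one-dimensional toy convolution. The sketch below is not part of the proposed network; the trace, the kernel, and the helper names `conv1d_valid` and `shift_right` are assumptions chosen only for illustration.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` along `signal` (cross-correlation, no padding)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def shift_right(values, steps):
    """Shift a list to the right by `steps`, padding with zeros on the left."""
    return [0.0] * steps + list(values[: len(values) - steps])


# A toy trace with a single pulse, and a simple difference kernel.
trace = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 0.0, -1.0]

response = conv1d_valid(trace, kernel)
response_of_shifted = conv1d_valid(shift_right(trace, 2), kernel)

# Equivariance: shifting the input shifts the response by the same amount.
is_equivariant = response_of_shifted == shift_right(response, 2)
# Not invariance: the response to the shifted input differs from the original.
is_invariant = response_of_shifted == response
```

Here `is_equivariant` is true and `is_invariant` is false: the filter response follows the pulse when it moves rather than staying fixed.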

For the processing of time series data such as DAS data, recurrent neural networks (RNNs) are a classic structure for data prediction (Medsker and Jain, 2001). They are used to find relations within the data volume and to predict the data from the corresponding context. However, due to their simple structure, RNNs suffer from vanishing and exploding gradients when dealing with long sequences (Salehinejad et al., 2017). Long Short-Term Memory (LSTM) networks are a type of neural network developed from RNNs, with a stronger capability for time series prediction (Hochreiter and Schmidhuber, 1997; Van Houdt et al., 2020). An LSTM consists of one or more functional units with forgetting and memory functions. The model was proposed to solve the vanishing of the backpropagated gradient that traditional RNNs exhibit on long sequences. The core components of LSTM networks are the forget, input, and output gates. LSTM networks are well suited to classification, processing, and forecasting problems for time series data. Because of their basic structure, conventional RNNs and LSTM networks process a sequence in one direction only: they are good at predicting the next time step from the current data but cannot use later data to predict an earlier time step. In many sequence prediction problems, however, the data depend on context in both time directions, which limits the prediction ability of unidirectional RNNs and LSTM networks. To overcome this limitation, bidirectional RNNs (BRNNs) make use of the context on both sides by processing the data in both directions with two separate hidden layers, which are then fed forward to the same output layer (Schuster and Paliwal, 1997). Combining BRNNs with LSTM gives bidirectional LSTM (BiLSTM), which can access long-range context in both input directions (Graves et al., 2013).
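The idea of bidirectional processing can be sketched with a one-unit recurrent cell. For brevity the example uses a plain tanh RNN rather than an LSTM, and the weights `w_in` and `w_rec` and the input sequence are arbitrary assumptions; it only illustrates how a backward pass gives early time steps access to later data.

```python
import math


def simple_rnn(inputs, w_in=0.5, w_rec=0.8):
    """Run a one-unit tanh RNN and return the hidden state at every step."""
    hidden, states = 0.0, []
    for x in inputs:
        hidden = math.tanh(w_in * x + w_rec * hidden)
        states.append(hidden)
    return states


def bidirectional_rnn(inputs):
    """Pair each step's forward state (past context) with its backward state (future context)."""
    forward = simple_rnn(inputs)
    # Run the same cell over the reversed sequence, then restore the time order.
    backward = simple_rnn(inputs[::-1])[::-1]
    return list(zip(forward, backward))


# A pulse that arrives only at the last step of the sequence.
sequence = [0.0, 0.0, 0.0, 1.0]
states = bidirectional_rnn(sequence)
forward_at_start, backward_at_start = states[0]
```

At the first time step the forward state is still zero, because nothing has happened yet, while the backward state is already non-zero because it has seen the pulse that arrives later.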

CNNs are well-known artificial neural networks that are widely applied in image recognition, classification, and segmentation. However, they can only map spatial features from the input to the output. The DAS data are time series, and their temporal relations cannot be learned and predicted by CNNs. RNNs are able to extract temporal dynamic characteristics but struggle to retain information over long sequences. LSTM can be considered an improved version of RNNs and is suitable for learning long-term dependencies. A Bidirectional LSTM (BiLSTM) is a model that consists of two LSTMs, which receive the forward and backward information respectively. It effectively increases both the preceding and the subsequent information available to the network. In processing DAS data for signal identification, we combine CNNs and BiLSTM to extract both the spatial and the temporal features, so that the proposed model benefits from the advantages of both: the image features are captured by the CNNs, and the long-term dependency of the data is learned by the BiLSTM. Figure 1 shows the detailed scheme of the network architecture used in this study. The size of the input image is 128 × 128, as shown in Figure 1. Adjacent segments overlap by 50% to ensure that continuous segmentation does not miss valid signals. The network model consists of CNN layers, BiLSTM layers, and fully connected layers. The implementation of the proposed network is based on Keras, the Python deep learning API, with TensorFlow as the backend. Table 1 describes the specific structure and parameters of the proposed network in detail. These parameters were chosen once the input and output had been defined and were optimized over several tests. The CNN layers focus on extracting spatial feature information, the BiLSTM layers focus on extracting time series features, and the fully connected layers fuse the features extracted by the CNN and BiLSTM layers to achieve classification and recognition.
The DAS data are divided into two types: effective injection signal and background noise. The input is the sequential DAS data, and its spatiotemporal characteristics are used to identify the fluid injection information. First, the original data are segmented along the spatial and time axes to obtain images of size 128 × 128. Then each sequence of 100 images in time is collected and used as the input. Because the DAS response of fluid injection depends on the channel number and time step of the input DAS monitoring data, and the feature information obtained from different channels and time steps is highly correlated, the proposed network uses three BiLSTM layers in succession to increase the ability of time series prediction and reduce the identification error. The task is a dynamic recognition problem in both space and time. The nonlinear-to-linear conversion involved in fusing and classifying the space-time features is prone to error (Tang et al., 2021), so we add a fully connected network to improve the conversion performance. This modification can improve computational efficiency and reduce overfitting.
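The overall layout described above (a CNN applied to each 128 × 128 image, three successive BiLSTM layers over the 100-image sequence, and fully connected layers for the two-class decision) can be sketched in Keras as follows. This is a minimal sketch, not the network of Table 1: the filter counts, kernel sizes, pooling, LSTM widths, optimizer, and loss are all assumptions made for illustration, as is the function name `build_cnn_bilstm`.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn_bilstm(seq_len=100, height=128, width=128, n_classes=2):
    """CNN per image, three stacked BiLSTM layers over time, then dense layers."""
    inputs = keras.Input(shape=(seq_len, height, width, 1))

    # Spatial features: the same small CNN is applied to every image in the sequence.
    x = layers.TimeDistributed(
        layers.Conv2D(8, 3, padding="same", activation="relu"))(inputs)
    x = layers.TimeDistributed(layers.MaxPooling2D(4))(x)
    x = layers.TimeDistributed(
        layers.Conv2D(16, 3, padding="same", activation="relu"))(x)
    x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)

    # Temporal features: three BiLSTM layers in succession.
    x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)
    x = layers.Bidirectional(layers.LSTM(32, return_sequences=True))(x)
    x = layers.Bidirectional(layers.LSTM(32))(x)

    # Fusion and classification: injection signal versus background noise.
    x = layers.Dense(32, activation="relu")(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)


model = build_cnn_bilstm()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Wrapping the convolutional layers in `TimeDistributed` keeps the time axis intact, so the BiLSTM layers receive one feature vector per image in the sequence.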


FIGURE 1. The schematic structure of the proposed network.


TABLE 1. The parameters used in the proposed network architecture.

Training data

The DAS system is deployed along the injection well in a shale gas field. The monitoring geometry is shown in Figure 2. The length of the cable is approximately 2.5 km. The spatial resolution is 1 m and the temporal interval is 0.25 ms. Figure 3 shows the processing steps applied to the raw data. The data are segmented along the time and channel axes. The datasets are selected from the data recorded in three wells at the same site over about 1 month. The recorded data are labeled manually by visual inspection to generate the training dataset. Figure 4 shows a typical labeled result for a data slice. After the manual labeling, the dataset is split into a training dataset and a testing dataset at a ratio of 8:2.
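The acquisition figures above translate into window counts with a few lines of arithmetic. The sketch below assumes that the 128-sample windows overlap by 50% along both axes and that the 8:2 split is made by random shuffling; neither detail is confirmed by the text, and the helper names and the one-second record length are hypothetical.

```python
import random


def window_starts(n_samples, size=128, overlap=0.5):
    """Start indices of windows of length `size` that overlap by the given fraction."""
    stride = int(size * (1 - overlap))
    return list(range(0, n_samples - size + 1, stride))


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle the labeled items and split them 8:2 into training and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


n_channels = 2500                        # about 2.5 km of cable at 1 m spacing
samples_per_second = round(1 / 0.25e-3)  # 0.25 ms temporal interval

channel_starts = window_starts(n_channels)
time_starts = window_starts(samples_per_second)  # one second of data, for illustration
patches = [(c, t) for c in channel_starts for t in time_starts]
train, test = split_train_test(patches)
```

With these numbers one second of recording holds 4000 samples per channel, a 128-sample window spans 32 ms, and each second yields 38 × 61 overlapping patches. Because neighboring patches share half their samples, a random split places overlapping windows in both sets; splitting by fracturing stage or by well would avoid that if it matters for the evaluation.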


FIGURE 2. The geometry of the horizontal wells used to collect DAS data. The cable is deployed along Well 2 (blue line) and the red dots indicate the position of the data used in the application.


FIGURE 3. The process to generate the training and testing data sets.


FIGURE 4. The typical labeled result of the raw data.

Network training

The goal of signal detection for fluid injection in hydraulic fracturing is to establish a rapid real-time evaluation and response system with high accuracy and high sensitivity. The following parameters are used to judge the performance of the trained model.

1) Effective detection rate (EDR)

The proportion of effective signals that are detected, which is equal to the recall rate. It is calculated as follows:

$$\mathrm{EDR} = \frac{TP}{TP + FN}$$

where TP is the number of true positives, i.e., effective injection signals correctly detected by the trained network, and FN is the number of false negatives, i.e., effective injection signals wrongly identified as noise.

2) False alarm rate (FAR)

The ratio of false and correct identified signals of fluid injection, which is


where FP is the number of false positives, i.e., noise wrongly identified as an effective injection signal.

3) F1 score

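A measure that combines precision and recall as their harmonic mean:

$$F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \qquad \mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN}$$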

4) Response time

This parameter indicates the time consumed by the proposed workflow. It is the difference between the time of the first sample and the time at which the first identified effective injection signal is output.
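The first three metrics above can be computed directly from confusion counts, as in the sketch below. EDR and F1 follow from their definitions as recall and as the harmonic mean of precision and recall. FAR is taken here as FP / (TP + FP), one common reading of "the ratio of false and correct identified signals"; this is an assumption, and the counts in the example are hypothetical.

```python
def detection_metrics(tp, fp, fn):
    """Return EDR (recall), FAR, precision and F1 from confusion counts."""
    edr = tp / (tp + fn)        # effective detection rate, equal to recall
    precision = tp / (tp + fp)
    far = fp / (tp + fp)        # ASSUMED definition: share of detections that are false
    f1 = 2 * precision * edr / (precision + edr)
    return {"EDR": edr, "FAR": far, "precision": precision, "F1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
metrics = detection_metrics(tp=90, fp=10, fn=5)
```

Under the assumed FAR definition, FAR and precision sum to one, so a low false alarm rate and a high precision are the same statement.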

Using the training dataset, we trained the proposed model and then validated its performance on the testing dataset. The results are shown in Table 2. EDR evaluates how completely the effective signals are detected, FAR indicates how often noise is mistaken for an effective signal, and the F1 score is an overall evaluation that weights recall and precision evenly. The results show that the trained model can effectively identify the signal from the raw data and that the processing time meets the requirements for real-time monitoring. Training took about 5 days on a computation node with four Nvidia Titan (Pascal) GPUs.


TABLE 2. The performance of the trained model on the testing dataset.

Application to field data

For the application, we use data collected in different stages that are not included in the training and testing datasets. Figure 5 shows data slices with relatively high and low signal-to-noise ratios. The identification results of the trained model are shown in Figure 6. They show that the signals related to fluid injection are identified with high accuracy.


FIGURE 5. The raw DAS data with high (A) and low signal-to-noise ratio (B).


FIGURE 6. The signals related to fluid injection identified from the data shown in Figure 5.

To further demonstrate the validity of the proposed model, the accumulated energy (the square of the amplitude) of the recorded data is compared with the production curve. Figure 7 shows the results. With the conventional method of directly accumulating energy over the full record, the DAS response is inconsistent with the slurry rate curve, mainly because of the continuous background noise during the monitoring process. The results based on the identified DAS response fit the slurry rate curve accurately, as the extracted DAS responses are directly related to the fluid injection procedure. The model works effectively for data collected in the same area because the validation data are similar to the training data, but it may need to be updated when the data have different characteristics. With more DAS data, the performance of the trained model can be further improved. New deep learning algorithms developed for action recognition in video signals can also be introduced to improve the efficiency of the proposed method.
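The comparison described above amounts to accumulating squared amplitude either over the full record or only over the samples that the model flags as injection signal. The following is a minimal sketch with a hypothetical six-sample trace and mask; the function name `cumulative_energy` is an assumption and the values are not field data.

```python
from itertools import accumulate


def cumulative_energy(amplitudes, is_signal=None):
    """Running sum of squared amplitude, optionally restricted to samples flagged as signal."""
    if is_signal is None:
        is_signal = [True] * len(amplitudes)
    energy = [a * a if keep else 0.0 for a, keep in zip(amplitudes, is_signal)]
    return list(accumulate(energy))


# Hypothetical trace: constant background noise with an injection interval in the middle.
amplitudes = [1.0, 1.0, 3.0, 3.0, 1.0, 1.0]
identified = [False, False, True, True, False, False]

raw_curve = cumulative_energy(amplitudes)
identified_curve = cumulative_energy(amplitudes, identified)
```

In this toy case the raw curve keeps rising before and after the injection interval because the background noise is accumulated as well, whereas the curve built from the identified samples rises only while fluid is being injected.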


FIGURE 7. The comparison of the accumulation of DAS energy with the slurry rate curve of the well (blue line). The red line denotes the results of the raw DAS data, and the ochre line denotes the results of the identified signals using the proposed model.

Conclusion


We propose a deep-learning approach for real-time evaluation of raw DAS data to identify the signals related to fluid injection in hydraulic fracturing. The trained model demonstrates its effectiveness and accuracy when applied to field data. The effective detection rate of the injection signal is 95.1%, which enables real-time evaluation of the hydraulic fracturing operation from downhole DAS data. The structure combining CNNs and BiLSTM performs reasonably well in spatiotemporal signal classification. The current model can be further improved in practical applications with more DAS data and better action recognition strategies.

Data availability statement

The data analyzed in this study is subject to the following licenses/restrictions: Data associated with this research are confidential and cannot be released. Requests to access these datasets should be directed to

Author contributions

YZ performed the data analysis. YZ and YW wrote and revised the manuscript. YW and XL provided the research ideas and supervised the findings of this work. QX, EL, SW, SA, YY, CL, and JM collected the original dataset and performed the preprocessing. All authors discussed the results and contributed to the final manuscript.

Funding


This study was funded by the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-020), and the National Natural Science Foundation of China (Grant No. 42025403).

Acknowledgments


We would like to thank three reviewers for their valuable comments that improved this manuscript significantly.

Conflict of interest

Authors XL, CL and JM were employed by the company PetroChina Zhejiang Oilfield Company. Author EL was employed by the company China State Shipbuilding Corporation, Limited 715th Research Institute. Author SA was employed by the company Optical Science and Technology (Chengdu) Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References


Binder, G., and Tura, A. (2020). Convolutional neural networks for automated microseismic detection in downhole distributed acoustic sensing data and comparison to a surface geophone array. Geophys. Prospect. 68 (9), 2770–2782. doi:10.1111/1365-2478.13027

Graves, A., Mohamed, A.-r., and Hinton, G. (2013). “Speech recognition with deep recurrent neural networks,” in 2013 IEEE international conference on acoustics, speech and signal processing (IEEE).

Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., et al. (2018). Recent advances in convolutional neural networks. Pattern Recognit. 77, 354–377. doi:10.1016/j.patcog.2017.10.013

Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9 (8), 1735–1780. doi:10.1162/neco.1997.9.8.1735

Jin, G., Mendoza, K., Roy, B., and Buswell, D. G. (2019). Machine learning-based fracture-hit detection algorithm using LFDAS signal. Lead. Edge 38 (7), 520–524. doi:10.1190/tle38070520.1

Jin, G., and Roy, B. (2017). Hydraulic-fracture geometry characterization using low-frequency DAS signal. Lead. Edge 36 (12), 975–980. doi:10.1190/tle36120975.1

Mateeva, A., Lopez, J., Potters, H., Mestayer, J., Cox, B., Kiyashchenko, D., et al. (2014). Distributed acoustic sensing for reservoir monitoring with vertical seismic profiling. Geophys. Prospect. 62 (4), 679–692. doi:10.1111/1365-2478.12116

Medsker, L. R., and Jain, L. (2001). Recurrent neural networks: Design and applications. 5, 64.

Montgomery, C. T., and Smith, M. B. (2010). Hydraulic fracturing: History of an enduring technology. J. Petroleum Technol. 62 (12), 26–40. doi:10.2118/1210-0026-jpt

O'Shea, K., and Nash, R. (2015). An introduction to convolutional neural networks. Available at: (Accessed November 26, 2015).

Richter, P., Parker, T., Woerpel, C., Wu, Y., Rufino, R., and Farhadiroushan, M. (2019). Hydraulic fracture monitoring and optimization in unconventional completions using a high-resolution engineered fibre-optic Distributed Acoustic Sensor. First break 37 (4), 63–68. doi:10.3997/1365-2397.n0021

Salehinejad, H., Sankar, S., Barfett, J., Colak, E., and Valaee, S. (2017). Recent advances in recurrent neural networks. Available at: (Accessed December 29, 2017).

Schuster, M., and Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45 (11), 2673–2681. doi:10.1109/78.650093

Spica, Z. J., Perton, M., Martin, E. R., Beroza, G. C., and Biondi, B. (2020). Urban seismic site characterization by fiber-optic seismology. J. Geophys. Res. Solid Earth 125 (3), e2019JB018656. doi:10.1029/2019JB018656

Stork, A. L., Baird, A. F., Horne, S. A., Naldrett, G., Lapins, S., Kendall, J.-M., et al. (2020). Application of machine learning to microseismic event detection in distributed acoustic sensing data. Geophysics 85 (5), KS149–KS160. doi:10.1190/geo2019-0774.1

Tang, J., Xia, H., Zhang, J., Qiao, J., and Yu, W. (2021). Deep forest regression based on cross-layer full connection. Neural comput. Appl. 33 (15), 9307–9328. doi:10.1007/s00521-021-05691-7

Van Houdt, G., Mosquera, C., and Nápoles, G. (2020). A review on the long short-term memory model. Artif. Intell. Rev. 53 (8), 5929–5955. doi:10.1007/s10462-020-09838-1

Keywords: distributed acoustic sensing, deep learning, signal identification, convolutional neural networks, bidirectional long short-term memory

Citation: Zheng Y, Wang Y, Liang X, Xue Q, Liang E, Wu S, An S, Yao Y, Liu C and Mei J (2022) A deep learning approach for signal identification in the fluid injection process during hydraulic fracturing using distributed acoustic sensing data. Front. Earth Sci. 10:999530. doi: 10.3389/feart.2022.999530

Received: 21 July 2022; Accepted: 08 September 2022;
Published: 30 September 2022.

Edited by:

Maxim Lebedev, Curtin University, Australia

Reviewed by:

Shuang Zheng, Aramco Services Company, United States
Verónica Rodríguez Tribaldos, Berkeley Lab (DOE), United States
Salam Al-Rbeawi, Middle East Technical University, Turkey

Copyright © 2022 Zheng, Wang, Liang, Xue, Liang, Wu, An, Yao, Liu and Mei. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yibo Wang,