Predicting groundwater level using traditional and deep machine learning algorithms

Feng, Fan; Ghorbani, Hamzeh; Radwan, Ahmed E.

doi:10.3389/fenvs.2024.1291327

ORIGINAL RESEARCH article

Front. Environ. Sci., 16 February 2024
Sec. Big Data, AI, and the Environment
Volume 12 - 2024 | https://doi.org/10.3389/fenvs.2024.1291327

Fan Feng¹*

Hamzeh Ghorbani²*

Ahmed E. Radwan³

¹University of Applied Sciences for Engineering and Economics, Berlin, Germany
²Young Researchers and Elite Club, Ahvaz Branch, Islamic Azad University, Ahvaz, Iran
³Faculty of Geography and Geology, Institute of Geological Sciences, Jagiellonian University, Kraków, Poland

This research aims to evaluate various traditional and deep machine learning algorithms for the prediction of groundwater level (GWL) using three key input variables specific to Izeh City in the Khuzestan province of Iran: groundwater extraction rate (E), rainfall rate (R), and river flow rate (P) (at a distance of 3 km). The traditional and deep machine learning (DML) algorithms evaluated were convolutional neural network (CNN), recurrent neural network (RNN), support vector machine (SVM), decision tree (DT), random forest (RF), and generative adversarial network (GAN). The CNN algorithm demonstrated superior performance among all the algorithms evaluated in this study. The CNN model exhibited robustness against noise and variability, scalability for handling large datasets with multiple input variables, and parallelization capabilities for fast processing. Moreover, it autonomously learned and identified data patterns, resulting in fewer outlier predictions. The CNN model achieved the highest accuracy in GWL prediction, with an RMSE of 0.0558 and an R² of 0.9948. It also showed no outlier data predictions, indicating its reliability. Spearman and Pearson correlation analyses revealed that P and E were the variables in the dataset with the greatest influence on GWL. This research has significant implications for water resource management in Izeh City and the Khuzestan province of Iran, aiding in conservation efforts and increasing local crop productivity. The approach can also be applied to predicting GWL in other regions of the world facing water scarcity due to population growth. Future researchers are encouraged to consider these factors for more accurate GWL predictions. Additionally, the CNN algorithm’s performance can be further enhanced by incorporating additional input variables.

1 Introduction

The groundwater level (GWL) is of critical importance, especially in arid and semi-arid countries (Alfarrah and Walraevens, 2018; Bovolo et al., 2009; Priyan, 2021). In many areas, the overexploitation of groundwater has led to irreparable damage to groundwater sources (Alfarrah and Walraevens, 2018; Bovolo et al., 2009; Priyan, 2021). Predicting GWL is a key challenge in hydrogeological investigations, effective aquifer management, and the assessment of subterranean water volume (Sun et al., 2022; Barzegar et al., 2017). Hydrogeological studies have been conducted to estimate the potential of underground water, predict changes in the GWL, and examine the current state of underground water resources (Hay and Mimura, 2005; Russo and Taddia, 2009). Empirical time series models have been extensively used to predict GWL (Eriksson, 1970). Empirical and numerical models such as the finite element groundwater flow system (FEFLOW) (Ma et al., 2022), the modular finite-difference flow model (MODFLOW) (Hughes et al., 2022), and HydroGeoSphere (Kang et al., 2017) have also proved helpful in predicting the GWL (Trefry and Muffels, 2007; Wang et al., 2008; Brunner and Simmons, 2012).

1.1 Problem statement

The prediction of GWL is crucial for sustainable water resource management, as accurate forecasts contribute to understanding the availability and distribution of groundwater, which is essential for purposes such as agriculture, drinking water supply, and ecosystem maintenance (Singh et al., 2021a; Pragnaditya et al., 2021; Khan et al., 2023). Machine learning (ML) techniques offer the potential to analyze large and complex datasets, identify patterns, and make predictions that inform decision-making in water resource management (Singh, 2015; Singh et al., 2021b; Pham et al., 2022; Ghobadi and Kang, 2023; Singh et al., 2024). By applying ML to predict GWL, we can enhance our ability to monitor and manage water resources effectively, ensuring their sustainable use over time (Tao et al., 2022a; Pham et al., 2022). However, in Izeh City, Khuzestan province of Iran, challenges such as low rainfall, increasing temperature, consecutive droughts, and overexploitation of groundwater for agricultural purposes make GWL difficult to predict for this region. The absence of accurate predictive models tailored to Izeh City’s unique context poses a significant obstacle to achieving reliable predictions. Addressing these challenges is crucial for developing robust ML models that accurately forecast GWL in the region, thereby facilitating more effective water resource management strategies.

1.2 Literature review

Using the mathematical model of the aquifer is one of the best methods for managing and controlling the drop in water levels (Rajaee et al., 2019). In GWL mathematical models, differential equations are utilized to simulate GWL flow (Rajaee et al., 2019). Since the dynamic behavior of a hydrological system changes with the passage of time, the indicated models do not have adequate ability to predict the characteristics of water resources and are not suitable models (Rathinasamy et al., 2014). Physical models generally excel at capturing and delineating the relationships between variables, as they are built upon established scientific principles and laws. These models are grounded in a fundamental understanding of the underlying processes and mechanisms governing the system under consideration. This allows physical models to provide valuable insights into the behavior and interactions of various components within the system. Since the relationships between the variables affecting the GWL are complex and non-linear, physical models in practice require a lot of data to simulate the fluctuations of the GWL (Nayak et al., 2006; Khan and Valeo, 2016). Deep learning models, while powerful in terms of prediction performance, are often considered black-box models with limited interpretability, making it challenging to understand the exact relationships between variables. Based on artificial neural network (ANN), these models have shown remarkable success in various fields, such as image recognition, natural language processing, and gameplay (Nadiri et al., 2013).

Researchers have developed innovative approaches to predict the water level in aquifers in light of the numerous issues with artificial models for modeling aquifers (Tao et al., 2022b). Artificial intelligence (AI) models have been applied in a number of areas recently, including hydrogeological and underground water research (Nadiri et al., 2014). AI algorithms can use sparse, brief data to mimic irregular and non-linear time series with high accuracy. Due to their accuracy and usefulness, these models have been employed in recent years to anticipate the GWL (Franses and Van Dijk, 2000). Gong et al. (2015) tested the validity of three nonlinear time-series intelligence models, namely ANN, support vector machine (SVM), and adaptive neuro-fuzzy inference system (ANFIS), for the prediction of GWL considering surface water-groundwater interaction. The models were applied to two wells near Lake Okeechobee in Florida, United States, using a 10-year dataset of hydrological parameters. Evaluation measures showed that the ANFIS and SVM models provided more accurate predictions than the ANN model. Taking into account the surface water-groundwater interaction improved the prediction accuracy, particularly in areas close to the surface water, such as the lake area (Gong et al., 2016). Wen et al. (2017) introduced the wavelet analysis–artificial neural network (WA-ANN) model to predict the GWL in China for the next 1, 2, and 3 months. GWL, climate data, and water level were taken into consideration as input data in this study. They concluded that the suggested model is most accurate when the previous GWL is used as input data. In conclusion, it can be claimed that the WA-ANN model is a reliable and effective tool for estimating GWL (Wen et al., 2017). Kaya et al. (2018) used 196 data points from 2000 to 2015 to predict the GWL in the Turkish province of Reyhanli. They applied ANN and M5tree (M5T) model approaches in their investigation. 
They claimed that the methodologies suggested in this study are remarkably accurate for estimating the GWL and that the approaches presented in this study perform effectively (Kaya et al., 2018). Zhang et al. (2018) developed a Long short-term memory (LSTM) time series model to predict water table depth in agricultural areas with complex hydrogeological characteristics. Their proposed model outperformed the traditional feed-forward neural network (FFNN) in GWL prediction, achieving higher R² scores (0.789–0.952). The dropout method effectively prevented overfitting, and the model’s architecture demonstrated a strong learning ability on time series data. The study suggests that the LSTM-based model can be a valuable alternative for the prediction of GWL, particularly in data-scarce areas (Zhang et al., 2018). Kombo et al. (2020) introduced the K-Nearest Neighbour-random forest (KNN-RF) model along with ANN, KNN, SVM, and RF models to predict changes in the GWL of an aquifer in eastern Rwanda. The KNN-RF model is more accurate than other models, as they determined from their research. They asserted that planning and managing GWL resources can benefit from the KNN-RF approach (Kombo et al., 2020). Kumar et al. (2020) predicted GWL using a DL model alongside extreme learning machine (ELM) and Gaussian process regression (GPR) models, in the Konan basin, Japan. They assessed the DL model’s accuracy, which showed excellent agreement during validation (RMSE = 0.08, r = 0.95, NSE = 0.87). Re-validation at different stations demonstrated its robustness and generalization capabilities, making it a reliable tool for predicting GWL and optimizing resource allocation in groundwater systems (Kumar et al., 2020). Sharafati et al. (2020) employed gradient boosting regression (GBR) to predict monthly GWL in the Rafsanjan aquifer, Iran, using various input variables, including satellite data and pumping rates. 
They used the gamma test (GT) for feature selection and assessed performance using error metrics. The GBR yielded high predictive accuracy, especially with the gravity recovery and climate experiment (GRACE) dataset (Sharafati et al., 2020). Correlation analysis showed coefficient of determination values ranging from 0.66 to 0.94 for different lead times, with better accuracy in regions with higher water depth and pumping rates. The study offers valuable insights for water resource planning based on accurate modeling (Sharafati et al., 2020). Banadkooki et al. (2020) aimed to predict GWL using precipitation and temperature data with various temporal delays. They employed the radial basis function–whale algorithm (RBF–WA), multilayer perceptron–whale algorithm (MLP–WA), and genetic programming (GP) to build hybrid ANN models. Results showed that the MLP–WA model outperformed the others when using temperature data with delays of 3, 6, and 9 months. Combining precipitation and temperature data with these delays yielded the best results (Banadkooki et al., 2020).

Osman et al. (2021) used three Xgboost models, an ANN, and support vector regression to predict the GWL in Selangor, Malaysia. Data for the models were collected over 11 months, from October 2017 to July 2018, and included rainfall, temperature, the previous day’s water level, and evaporation. The study’s conclusions showed that the Xgboost model produces more accurate prediction results (Osman et al., 2021). Malakar et al. (2021) predicted future GWL trends in India using GRACE-derived GWS, WaterGap model-based GWR, and GWW. Their LSTM model outperformed FNN and RNN, showing >84% of wells with r > 0.6 and RMSE <0.7. They anticipate declining GWL trends in northwest, north-central, and south India, which could impact water supply and crop production for 1.3 billion people (Malakar et al., 2021). Afan et al. (2021) employed deep learning (DL) and ensemble deep learning (EDL) techniques to predict GWL in Malaysia. Their results revealed that EDL outperformed DL in estimating GWL, except for the Paya Indah Wetland. Additionally, EDL demonstrated superior performance in predicting daily GWL across all stations, reducing errors and providing precise results within a shorter time lag. Overall, they showed that the EDL model has the potential to contribute to the sustainable management of GWL in Malaysia (Afan et al., 2021). Khan et al. (2023) reviewed GWL prediction models comprehensively. They examined 109 research articles and concluded that ML and deep learning approaches are efficient for modeling GWL. They also suggested future research directions to enhance prediction accuracy and understanding in this field (Khan et al., 2023). Dehghani and Torabi Poudeh (2022) predicted GWL in southwest Iran by employing several meta-heuristic algorithms, including Feed-forward neural network (FNN) and automated item generation (AIG) models. 
Utilizing data on monthly rainfall, temperature, and water table height from the Lorestan Regional Water Corporation spanning 2008 to 2018, their study demonstrated the superior accuracy of the ANN-AIG hybrid model compared to other methods (Dehghani and Torabi Poudeh, 2022). Mohammed et al. (2023) combined a numerical model called GMS with methods such as GA-ANN, GA-ICA, extreme learning machine (ELM), and ORELM to predict the GWL using piezometric data and rainfall information. The results of this investigation showed that the ORELM method predicts GWL more accurately than the other methods (Mohammed et al., 2023). Table 1 summarizes the previous work on GWL prediction covered in this literature review.

TABLE 1. List of the previous work on the prediction of the GWL based on DML.

So far, no systematic study has been conducted to estimate the GWL in Izeh City, which is located in the Khuzestan province of Iran. Given that agriculture is the region’s primary occupation and that groundwater is widely used for domestic, agricultural, and industrial purposes, accurate GWL prediction can significantly impact water supply and crop production in this area.

2 Methodology

Figure 1 illustrates the prediction procedure for GWL employing both traditional and deep ML algorithms, including DT, RF, SVM, CNN, GAN, and RNN. The first step of this methodology is the collection of data from the study area in Iran. The dataset then undergoes sorting and preprocessing, which includes the removal of outliers and duplicate data points. Following this, the data points are normalized to the range from −1 to +1 using Eq. (1).

ω_{i}^{l} = \left( \frac{ω_{i}^{l} - ω_{\min}^{l}}{ω_{\max}^{l} - ω_{\min}^{l}} \right) \times 2 - 1 (1)
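
The scaling in Eq. (1) can be illustrated with a minimal Python sketch. The function name and the sample rainfall values are hypothetical and are not taken from the study's dataset; the sketch only assumes that the minimum and maximum are computed per variable.

```python
def scale_to_unit_range(values):
    """Map each value linearly onto [-1, 1], following Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Eq. (1) is undefined when the variable is constant.
        raise ValueError("cannot scale a constant variable")
    return [(v - lo) / (hi - lo) * 2 - 1 for v in values]


# Hypothetical rainfall readings, for illustration only.
rainfall = [0.0, 2.5, 5.0, 10.0]
scaled = scale_to_unit_range(rainfall)
print(scaled)  # [-1.0, -0.5, 0.0, 1.0]
```

The smallest value maps to −1 and the largest to +1, so all three input variables end up on a common scale regardless of their original units.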

FIGURE 1. Illustration of a flowchart for prediction of GWL based on traditional and deep ML (DT, RF, SVM, CNN, GAN, and RNN).

Finally, the dataset is randomly partitioned into training, testing, and validation sets. Sophisticated ML models such as CNN, GAN, and RNN are developed using the preprocessed data. To compare traditional and deep ML, metrics such as RMSE and R² are computed for each algorithm. Ultimately, the models’ performances are compared, leading to the selection of CNN as the optimal approach for predicting GWL.

2.1 Traditional machine learning

2.1.1 Decision tree (DT)

The DT is a widely used supervised ML algorithm that is particularly valuable for classification and prediction tasks; it repeatedly divides the data into subsets, branching into sub-trees (Kotsiantis, 2013). In this algorithm, the input variables (R, P, and E) are the features tested at the internal nodes, the control parameters of the DT algorithm govern how the nodes are split, and the value at the final leaf is the GWL prediction. This study employed a regression decision tree model with specified parameters. The maximum depth of the tree was set to 100, which is the maximum number of levels in the tree structure. The criterion for measuring the quality of a split was chosen as “Gini,” which typically assesses impurity in classification tasks; for regression tasks, criteria such as “mse” (Mean Squared Error) are more common. The splitter strategy was set to “best,” meaning that the algorithm considers all candidate splits and selects the one that most reduces impurity or the mean squared error.
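
To make the “best” splitter concrete, the following minimal Python sketch searches every candidate threshold on a single feature and keeps the one with the lowest weighted mean squared error. It is an illustration under stated assumptions, not the implementation used in the study: the function names and the four sample points are hypothetical, and only one feature is considered.

```python
def mse(values):
    """Mean squared deviation of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(xs, ys):
    """Return (threshold, weighted_mse) of the best binary split on one feature."""
    pairs = sorted(zip(xs, ys))
    best = (None, mse(ys))  # baseline: no split at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical extraction rates and groundwater levels.
extraction = [1.0, 2.0, 3.0, 4.0]
gwl = [150.0, 150.0, 140.0, 140.0]
threshold, score = best_split(extraction, gwl)
print(threshold, score)  # 2.5 0.0
```

A full regression tree applies this search recursively to each resulting subset, across all input variables, until a stopping rule such as the maximum depth is reached.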

2.1.2 Random forest (RF)

The RF algorithm combines the predictions of all constituent trees within the forest, averaging them to yield a prediction that is more robust and more accurate than that of a single tree (Gomes et al., 2017). This ensemble approach counteracts the influence of individual trees that produce erroneous predictions or adhere too closely to the idiosyncrasies of the training data. In the RF context, each tree is trained on the input variables (R, P, and E), and the averaged output of all trees is taken as the GWL prediction. This study employed a regression random forest model with specific parameter settings. The maximum depth of the trees in the forest was set to 100, which is the maximum number of levels in each decision tree. The random state was fixed at 0, ensuring reproducibility by keeping the randomness constant. The number of decision trees in the forest is reported as 0; because a forest needs at least one tree, this value cannot be taken literally, and tree growth was limited only by the specified maximum depth. The objective function used for training the model was the Mean Squared Error (MSE), a measure that quantifies the average squared difference between the predicted and actual values, guiding the optimization process toward minimizing prediction errors.
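
The averaging step can be sketched in a few lines of Python. This is a simplified illustration, assuming bootstrap resampling of the training rows; each "tree" is replaced by a hypothetical stand-in that predicts the mean target of its own sample, and the sample rows and the tree count of 25 are invented for the example.

```python
import random


def fit_stump_mean(sample):
    """Stand-in for a fitted tree: predicts the mean target of its sample."""
    mean = sum(y for _, y in sample) / len(sample)
    return lambda x: mean


def fit_forest(rows, n_trees, seed=0):
    """Fit n_trees stand-in trees, each on a bootstrap resample of rows."""
    rng = random.Random(seed)  # fixed seed plays the role of the random state
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        trees.append(fit_stump_mean(sample))
    return trees


def forest_predict(trees, x):
    """Average the predictions of all trees, as a regression forest does."""
    predictions = [tree(x) for tree in trees]
    return sum(predictions) / len(predictions)


# Hypothetical (extraction rate, GWL) rows.
rows = [(1.0, 150.0), (2.0, 148.0), (3.0, 143.0), (4.0, 140.0)]
trees = fit_forest(rows, n_trees=25)
prediction = forest_predict(trees, 2.0)
```

Because the seed is fixed, repeated runs give the same forest and the same prediction, which is the reproducibility that a fixed random state provides.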

2.1.3 Support vector machines (SVM)

The SVM can be used for both classification and regression tasks, as can the DT and RF algorithms (Wang et al., 2022). The algorithm seeks to maximize the margin around its decision function, following the theory of generalization error, and thereby to minimize errors as far as possible (Kecman, 2001). This aligns with SVM’s overarching objective of achieving optimal separation between distinct classes or, in regression scenarios, predicting accurate numerical values (Ozer et al., 2020). In this algorithm, the input variables (R, P, and E) are the predictors, while the output variable (GWL) is the predicted parameter. In this study, the prediction model was developed with specific hyperparameters: a batch size of 100, which determines the number of training samples processed in a single iteration; a regularization parameter (C) of 0.1, which helps control overfitting by penalizing large coefficients in the model; and a polynomial kernel.
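
For readers unfamiliar with the polynomial kernel, a minimal Python sketch of its usual form is given here. The degree, gamma, and coef0 values are assumptions chosen for illustration, since the text reports only that a polynomial kernel was used; the two input vectors are likewise hypothetical.

```python
def polynomial_kernel(u, v, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel K(u, v) = (gamma * <u, v> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# Two hypothetical scaled samples of (R, P, E).
sample_a = [1.0, 2.0, 0.0]
sample_b = [0.0, 1.0, 1.0]
similarity = polynomial_kernel(sample_a, sample_b)
print(similarity)  # 27.0
```

The kernel measures similarity between two samples in an implicit higher-dimensional feature space, which lets the SVM fit a non-linear relationship between the inputs and GWL.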

2.2 Deep machine learning

2.2.1 Recurrent neural network (RNN)

An RNN is a specialized neural network for handling sequential and time-series data, and it is particularly suited to predicting GWL from the input parameters P, R, and E (Panahi et al., 2020). Unlike multilayer perceptron (MLP) and CNN architectures, RNNs emphasize time considerations. While feed-forward networks like CNNs are common, RNNs incorporate a feedback loop, enabling them to retain prior inputs and process input sequences, preserving information across time steps (Kanjo et al., 2019; Garbin et al., 2020). This characteristic ensures that historical data are retained within the network. Figure 2A presents an RNN cell example (Han et al., 2021).

FIGURE 2. Illustration of the (A) chain of RNN network, (B) GAN network, (C) CNN algorithm for prediction of GWL.

An RNN consists of a hidden state memory input ‘h’ and a primary input ‘x’ (R, E, and P) (Ming et al., 2017). Processing occurs through layers ‘wh’ and ‘wx’ for ‘h’ and ‘x’, respectively. ‘ht-1’ and ‘xt’ are multiplied by the ‘wh’ and ‘wx’ weight matrices (Mirsalari et al., 2020), summed as per Eq. (2), and passed through an activation function such as tanh, sigmoid, or ReLU (Giordano et al., 2019) to yield ‘ht’. See Figure 2A for the RNN architecture.

h_{t} = f (w_{h} h_{t - 1} + w_{x} x_{t} + b_{h}) (2)

The output of Eq. (2) is the next hidden state (h_t), which is also the output of the RNN at time t. In Figure 2A, X(1) is the first element of the input sequence; h(0) and X(1) are combined in the first stage, and the resulting h(1) is combined with X(2) to form the input of the following stage. In this way, previous inputs are remembered during training (Shi et al., 2017). Unfolding the RNN over time creates a network chain. Hyperparameters are detailed in Table 2. When predicting GWL with an RNN algorithm, controlling chaos is important for stability. Adjusting the time constant of the model units helps capture the system’s dynamics, while optimizing the learning rate fine-tunes the model’s responsiveness. The time constant of the filtered white noise inputs and the weight assigned to the white noise inputs influence how noise is incorporated and demand careful calibration. Iterating the training process and modifying the data time step size keeps the model accurate as data patterns evolve. Furthermore, the RNN integration step size affects temporal resolution and must be adjusted to balance precision against computational efficiency in predicting GWL.
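
The recurrence in Eq. (2) can be traced with a minimal Python sketch. It assumes tanh as the activation f, a hidden state of size 2, and three inputs per time step (R, E, and P); all weights and inputs are hypothetical values chosen for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_step(h_prev, x_t, w_h, w_x, b_h):
    """One application of Eq. (2): h_t = tanh(w_h h_{t-1} + w_x x_t + b_h)."""
    recurrent = matvec(w_h, h_prev)
    driven = matvec(w_x, x_t)
    return [math.tanh(r + d + b) for r, d, b in zip(recurrent, driven, b_h)]


def run_rnn(xs, h0, w_h, w_x, b_h):
    """Unfold the cell over a sequence, returning every hidden state."""
    h, states = h0, []
    for x_t in xs:
        h = rnn_step(h, x_t, w_h, w_x, b_h)
        states.append(h)
    return states


# Hypothetical weights and a two-step input sequence of (R, E, P).
w_h = [[0.5, 0.0], [0.0, 0.5]]
w_x = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]
b_h = [0.0, 0.0]
sequence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
states = run_rnn(sequence, [0.0, 0.0], w_h, w_x, b_h)
```

Each hidden state depends on the previous one, so the second state carries information about the first input even though that input is no longer presented.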

TABLE 2. The hyperparameters for RNN algorithm.

2.2.2 Generative adversarial networks (GAN)

The GANs, comprising a generator and a discriminator, identify data patterns autonomously, engaging in a competition to evolve the dataset (Shi et al., 2017). Figure 2B shows a GAN network.

The GANs consist of two neural networks: a generator G(z) and a discriminator D(x). The generator takes noise vectors and produces synthetic (fake) samples intended to fool the discriminator (Dong and Lin, 2019). Real and fake data are then fed to the discriminator, which categorizes them (Li et al., 2019). The model is trained by calculating the loss at the discriminator’s end and adjusting parameters via backpropagation (Alarsan and Younes, 2021). The GAN training process involves selecting real data (X), passing it through the discriminator and applying sigmoid activation, creating noise data (Z), generating samples (G(Z)), evaluating the loss, backpropagating to update the discriminator weights, using the generator output to update the generator weights, and iterating until optimal weights are achieved for both networks.

The discriminator loss function assesses D’s prediction on real/fake data, calculated from errors made. Errors backpropagate to update parameters (Azari et al., 2022). It comprises terms for real (x) and fake (G(z)) inputs, with the real input loss term defined as expressed in Eq. (3). The second term is for fake input (G(z)) as expressed in Eq. (4).

l_{d_{1}} = \log σ (D (x)) (3)

l_{d_{2}} = \log (1 - σ (D (G (z)))) (4)

In the equation, σ is the sigmoid function with an output range of 0–1. An output near 1 implies accurate real data recognition by D, resulting in minimal loss (Szandała, 2021).
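
The two terms in Eqs. (3) and (4) can be evaluated numerically with a minimal Python sketch. The raw discriminator scores passed in are hypothetical, and the function names are introduced here for illustration only.

```python
import math


def sigmoid(a):
    """Logistic function, mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def discriminator_terms(d_real, d_fake):
    """Return (l_d1, l_d2) from Eqs. (3) and (4).

    d_real is the raw score D(x) on a real sample and d_fake is the
    raw score D(G(z)) on a generated sample.
    """
    l_d1 = math.log(sigmoid(d_real))
    l_d2 = math.log(1.0 - sigmoid(d_fake))
    return l_d1, l_d2


# An undecided discriminator scores both samples at 0 (probability 0.5).
undecided = discriminator_terms(0.0, 0.0)
# A confident, correct discriminator scores real high and fake low.
confident = discriminator_terms(10.0, -10.0)
total_undecided = sum(undecided)
total_confident = sum(confident)
```

Both terms are logarithms of probabilities and are therefore never positive; they approach zero as the discriminator becomes confident and correct, so the summed value for the confident case is much closer to zero than for the undecided case.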

The two loss terms are computed and summed to give the overall discriminator loss. The GAN hyperparameters, which are pivotal for performance, are detailed in Table 3. Because performance varies between GAN types, tuning is essential. The generator time impacts training duration and data quality; the lower filter quantile prevents biased results, and the top filter quantile maintains alignment. The loss function improves the GAN by distinguishing real from generated data, and the maximum depth affects complexity and overfitting. The maximum bin and learning rate control convergence, while the random state ensures reproducibility. Estimators boost diversity, batch size influences stability, and patience counters overfitting. Control parameters are fixed through careful calibration and iterative experimentation to optimize performance for the prediction of GWL.

TABLE 3. The hyperparameters for the GAN algorithm.

2.2.3 Convolutional neural network (CNN)

The CNN emulates the visual cortex with neurons, weights, and biases. It comprises convolutional, pooling, and fully connected layers (Azizah et al., 2017). The convolutional layer applies convolution operations to extract features, while the fully connected layer maps the extracted features to the output. The CNNs maintain the structure of the input, highlighting relationships within the data (Yamashita et al., 2018). Training entails optimizing parameters via backpropagation and gradient descent. Figure 2C shows the structure of the CNN algorithm.

The core building block of the CNN is the convolutional layer, which performs most of the network’s computations (Wang et al., 2017). Each convolutional layer in the CNN consists of a set of filters, and the output is created by convolving the filters with the input layer (O'Shea and Nash, 2015). The output of the convolutional layer is called a feature map.

In CNNs, the convolution operator slides a kernel over the input, multiplying the kernel values with the input values and summing the products to create a feature map (Wang et al., 2021). The kernel size (often 3 × 3, 5 × 5, or 7 × 7) and the number of kernels dictate the complexity of the operation, and the number of kernels determines the depth of the output feature map. Padding maintains the input size. The CNN hyperparameters are in Table 4. Each control parameter serves a specific function in the CNN algorithm for predicting GWL. The input layer processes the initial data, the CONV layer extracts features through convolution, the POOL layer reduces spatial dimensions, and the Fully Connected layer combines features for classification. Softmax provides probability scores. Kernel size determines the scope of feature extraction, and stride size controls the movement of the filter. The parameters are fixed by tuning through iterative training and adjusting them based on model performance and validation results.
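
The sliding-kernel operation, together with stride and padding, can be sketched in one dimension with plain Python. As in most CNN libraries, the kernel is applied without flipping (strictly a cross-correlation). The signal and kernel values are hypothetical, and a one-dimensional input is an assumption made to keep the example short.

```python
def conv1d(signal, kernel, stride=1, padding=0):
    """Slide a kernel over a zero-padded 1-D signal and return the feature map."""
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    width = len(kernel)
    feature_map = []
    for start in range(0, len(padded) - width + 1, stride):
        window = padded[start:start + width]
        feature_map.append(sum(w * x for w, x in zip(kernel, window)))
    return feature_map


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]  # responds to the local slope of the signal

same_size = conv1d(signal, kernel, stride=1, padding=1)
strided = conv1d(signal, kernel, stride=2, padding=0)
print(same_size)  # [-2.0, -2.0, -2.0, -2.0, 4.0]
print(strided)    # [-2.0, -2.0]
```

With a kernel of width 3, one element of zero padding on each side keeps the feature map the same length as the input, while a stride of 2 without padding reduces five inputs to two outputs.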

TABLE 4. The hyperparameters for the CNN algorithm.

2.3 Spearman’s and Pearson’s correlation and error metrics

One of the best methods for determining the relative importance of the independent input variables for the dependent output variable (GWL) is Pearson’s correlation coefficient (R). This coefficient takes values between −1 and +1: a value of +1 indicates the strongest positive linear relationship, a value of −1 indicates the strongest negative linear relationship, and a value of zero means that there is no linear relationship between the two variables, that is, the input has no linear effect on the output. Pearson’s correlation is given by Eq. (5) (De Winter et al., 2016).

Z = \frac{\sum_{i = 1}^{n} (P_{i} - \bar{P}) (K_{i} - \bar{K})}{\sqrt{\sum_{i = 1}^{n} {(P_{i} - \bar{P})}^{2}} \sqrt{\sum_{i = 1}^{n} {(K_{i} - \bar{K})}^{2}}} (5)

Spearman’s coefficient (ρ) is a rank-based measure of the association between each input variable and the output variable: the data are first ranked, and the correlation is then computed on the ranks. It is given by Eq. (6), which has the same form as Eq. (5) when P_{i} and K_{i} are taken as ranks (Alsaqr, 2021).

ρ = \frac{\sum_{i = 1}^{n} (P_{i} - \bar{P}) (K_{i} - \bar{K})}{\sqrt{\sum_{i = 1}^{n} {(P_{i} - \bar{P})}^{2} \sum_{i = 1}^{n} {(K_{i} - \bar{K})}^{2}}} (6)
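
A minimal Python sketch of both coefficients follows. It assumes that Spearman's coefficient is obtained by applying the Pearson formula to average ranks (tied values share the mean of their positions), which is the standard construction; the sample data are hypothetical.

```python
import math


def pearson(p, k):
    """Pearson correlation of two equally long sequences, as in Eq. (5)."""
    n = len(p)
    p_bar, k_bar = sum(p) / n, sum(k) / n
    numerator = sum((a - p_bar) * (b - k_bar) for a, b in zip(p, k))
    denominator = math.sqrt(sum((a - p_bar) ** 2 for a in p)) * math.sqrt(
        sum((b - k_bar) ** 2 for b in k)
    )
    return numerator / denominator


def ranks(values):
    """1-based average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = average
        i = j + 1
    return result


def spearman(p, k):
    """Spearman correlation: the Pearson formula applied to the ranks."""
    return pearson(ranks(p), ranks(k))


# A monotonic but non-linear relationship (hypothetical data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
r_value = pearson(x, y)
rho_value = spearman(x, y)
```

For this example Spearman's coefficient equals 1 up to rounding because y always increases with x, whereas Pearson's coefficient is slightly below 1 because the relationship is not linear.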

To compare the models and measure their performance, the statistical error metrics given in Equations (7–9) are used.

STD = \sqrt{\frac{\sum_{i = 1}^{n} {\left( {\left( \frac{{GWL}_{Meas.} - {GWL}_{Pred.}}{{GWL}_{Meas.}} \times 100 \right)}_{i} - \frac{1}{n} \sum_{i = 1}^{n} \left( {GWL}_{Meas., i} - {GWL}_{Pred., i} \right) \right)}^{2}}{n - 1}} (7)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {\left( {GWL}_{Meas., i} - {GWL}_{Pred., i} \right)}^{2}} (8)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {\left( {GWL}_{Meas., i} - {GWL}_{Pred., i} \right)}^{2}}{\sum_{i = 1}^{n} {\left( {GWL}_{Pred., i} - \frac{\sum_{i = 1}^{n} {GWL}_{Meas., i}}{n} \right)}^{2}} (9)
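
The RMSE of Eq. (8) and a coefficient of determination can be computed with a minimal Python sketch. One assumption is made explicit: the R² function uses the conventional total sum of squares of the measured values about their mean, whereas Eq. (9) as printed places the predicted values in the denominator. The measured and predicted values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error, following Eq. (8)."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination with the conventional denominator.

    Assumption: the total sum of squares is taken over the measured
    values about their mean, which differs from Eq. (9) as printed.
    """
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted groundwater levels.
gwl_measured = [140.0, 142.0, 145.0, 148.0]
gwl_predicted = [140.5, 141.5, 145.5, 147.5]
error = rmse(gwl_measured, gwl_predicted)
fit = r_squared(gwl_measured, gwl_predicted)
print(error)  # 0.5
```

Every prediction in this example is off by 0.5, so the RMSE is exactly 0.5, and R² is close to 1 because the errors are small relative to the spread of the measured values.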

3 Data gathering and data distribution

River water inflow and precipitation are positive contributors to GWL, whereas groundwater withdrawal is a negative one (Machiwal and Singh, 2015; Zhang et al., 2019). To predict GWL, which is essential and vital for human society, 2,136 data points were gathered for this article from Izeh City in the Khuzestan province of Iran between 2018 and 2022, using a water level sensor for groundwater level, a flow meter for groundwater extraction rate, a rain gauge for rainfall rate, and a stream gauge for river flow rate (at a distance of 3 km). Hydrology in Izeh City is influenced by both human activities and seasonal changes (Kalantari et al., 2009; Hoseini, 2022). Human activities such as agriculture and industrialization significantly impact the region’s hydrological system (Nassery et al., 2009; Rashidi and Hosseinzadeh, 2019). Agricultural practices, particularly irrigation, rely heavily on groundwater extraction (E), which can lower the water level and alter the natural balance (Jafari et al., 2015; Neissi et al., 2020). Industrial activities and urbanization contribute to changes in surface runoff patterns and can introduce pollutants into water sources (Rashidi and Hosseinzadeh, 2019; Ziyari and Latifi, 2022). Additionally, seasonal variations play a crucial role in the hydrological cycle of Izeh City (Bakhtiari et al., 2021). During the rainy season, increased precipitation and runoff lead to rising GWL, while dry seasons result in decreased groundwater recharge due to higher evaporation rates (Kalantari et al., 2009). Understanding the intricate relationship between human activities, seasonal changes, and hydrology is vital for sustainable water resource management in Izeh City, Khuzestan province, Iran (e.g., Nassery et al., 2009; Mahdavi et al., 2021).

To build the AI models, 70% of the data are used for training, 15% for testing, and 15% for validation, with the data points assigned to each subset at random. The statistical information for the data used in this article, including the range and values of the statistical parameters, is reported in Table 5.
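
A random 70/15/15 partition of the 2,136 data points can be sketched in Python as follows. The function name, the fixed seed, and the rule that the validation set receives the rows left over after rounding are assumptions made for illustration, not details reported in the study.

```python
import random


def split_dataset(rows, train=0.70, test=0.15, seed=0):
    """Shuffle rows and cut them into train, test, and validation subsets.

    The validation subset receives whatever remains after the train
    and test subsets have been rounded to whole rows.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# Row indices stand in for the 2,136 records of the dataset.
train_rows, test_rows, val_rows = split_dataset(range(2136))
print(len(train_rows), len(test_rows), len(val_rows))  # 1495 320 321
```

Every record lands in exactly one subset, and fixing the seed makes the partition repeatable across runs.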

TABLE 5. Statistical summary of the input/output variables used to predict GWL for the data from Izeh City, Khuzestan province, Iran.

A heat map is used to show how the data are distributed. As shown in Figure 3, 400 data points of the E are in the range of 145 < GWL <150, 1,300 data points of the E are in the range of 140 < GWL <145, and 436 data points of the E are in the range of GWL <140.

FIGURE 3. Illustration of heat map for input variable for prediction of GWL based on traditional and deep ML (DT, RF, SVM, CNN, GAN, and RNN).

As shown in that figure, 300 data points of the R are in the range of 145 < GWL <150, 1,100 data points of the R are in the range of 140 < GWL <145, and 736 data points of the R are in the range of GWL <140.

Also, 300 data points of the P are in the range of 145 < GWL <150, 1,236 data points of the P are in the range of 140 < GWL <145, and 600 data points of the P are in the range of GWL <140.

To present the Mean and StDev values, we have included the data distribution and data values in Figure 4. As depicted in Figure 4, the histograms visually represent the input/output variables, including E, R, P, and GWL. The distribution of the recorded values is displayed in the histogram for E, which also reveals the frequency of various extraction levels. Similarly, the histogram for R presents the distribution of R values, allowing us to observe the frequency of different precipitation amounts. The histogram for P displays the distribution of P data, providing an overview of the frequency of P measurements. Lastly, the histogram for GWL illustrates the distribution of measured values, giving us an understanding of the frequency of different water level readings. By examining this histogram, we can gain insights into the variability and distribution of GWL, which are crucial for assessing groundwater resources and potential fluctuations (e.g., Kumar and Ahmed, 2003; Ahmadi and Sedghamiz, 2007; Dash et al., 2010).

FIGURE 4. Illustration of histograms for input/output variables (groundwater extraction rate (E), rainfall rate (R), river flow rate (P), and groundwater level (GWL)).

4 Discussion of results

To predict GWL, traditional and deep ML algorithms (DT, RF, SVM, CNN, GAN, and RNN) have been used. The results for the training, testing, validation, and total datasets are given in Table 6.

TABLE 6. Statistical results for the training, testing, validation, and total datasets for GWL prediction using traditional and deep ML algorithms (DT, RF, SVM, CNN, GAN, and RNN).

This article uses traditional and deep ML methods (DT, RF, SVM, CNN, GAN, and RNN) to predict GWL. The statistical parameters R², STD, and RMSE reported in Table 6 are used to evaluate the models. The results in Table 6 show that the CNN model performs better than the RNN and GAN models. For the CNN model, the RMSE values for the train, test, validation, and total data are 0.0507, 0.0561, 0.0594, and 0.0558, and the corresponding R² values are 0.9968, 0.9951, 0.9926, and 0.9948, respectively.
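For readers who want to reproduce this kind of evaluation, the three statistics can be computed as in the sketch below. It assumes the common definitions (RMSE as the root of the mean squared error, R² as the coefficient of determination, and STD as the sample standard deviation of the prediction errors); the article's own definitions may differ in detail. The function names and the four sample values are hypothetical and are not taken from Table 6.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def error_std(measured, predicted):
    """Sample standard deviation of the prediction errors (assumed definition)."""
    errors = [m - p for m, p in zip(measured, predicted)]
    mean_e = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_e) ** 2 for e in errors) / (len(errors) - 1))


# Hypothetical GWL values, for illustration only.
measured_gwl = [140.0, 142.0, 144.0, 146.0]
predicted_gwl = [140.1, 141.9, 144.1, 145.9]

print(rmse(measured_gwl, predicted_gwl))
print(r_squared(measured_gwl, predicted_gwl))
print(error_std(measured_gwl, predicted_gwl))
```

Applying the same three functions separately to the train, test, validation, and total subsets would give one row per subset, in the layout of Table 6.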

Figure 5 shows cross plots of the measured data points against the predicted data for the six traditional and deep ML models (DT, RF, SVM, CNN, GAN, and RNN), from which the best regression model can be identified. The figure allows the models to be compared on the basis of R². The results for the whole dataset show that the CNN algorithm has a higher accuracy than the other algorithms, and that the accuracy of the algorithms ranks as SVM < RF < DT < GAN < RNN < CNN.

FIGURE 5. Cross plots of measured versus predicted GWL for the traditional and deep ML algorithms: RF (orange), SVM (gray), DT (purple), CNN (red), RNN (green), and GAN (blue).

Figure 6 shows histograms of the GWL prediction error for the developed ML algorithms. The prediction errors are distributed symmetrically about zero, with errors falling on both the positive and negative sides, and for the CNN algorithm the distribution is normal.

FIGURE 6. Histograms of the GWL prediction error for the traditional and deep ML algorithms: RF (orange), SVM (gray), DT (purple), CNN (red), RNN (green), and GAN (blue).

Figure 7 illustrates the relative error (%) versus data index for GWL prediction using the DT, RF, SVM, CNN, GAN, and RNN algorithms, from which the error range associated with each algorithm can be analyzed (e.g., Yoon et al., 2011; Banadkooki et al., 2020; Di Nunno and Granata, 2020; Azari et al., 2021). The relative error for the CNN algorithm falls between −0.192 and 0.194. In contrast, the RNN algorithm exhibits an error range of −0.693 to 0.729, the GAN algorithm −0.850 to 0.850, the DT algorithm −8.1936 to 10.4948, the RF algorithm −13.7735 to 14.085, and the SVM algorithm −14.7825 to 15.2825. These error ranges show the magnitude and direction of the relative errors between the predicted and actual GWL values (e.g., Yoon et al., 2011; Marchant et al., 2016). The GWL predictions made by the CNN algorithm therefore exhibit a smaller relative error (%) than those of the RNN, GAN, DT, RF, and SVM algorithms, and the accuracy ranking is as follows: CNN > RNN > GAN > DT > RF > SVM.

FIGURE 7. Relative error (%) versus data index for GWL prediction using traditional and deep ML algorithms: RF (orange), SVM (gray), DT (purple), CNN (red), RNN (green), and GAN (blue).
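The per-algorithm error ranges quoted above are the minimum and maximum of the relative error over all data indices. A minimal sketch of that calculation follows; the sign convention (measured minus predicted, divided by measured) is an assumption, and the function names and sample values are hypothetical rather than taken from the study.

```python
def relative_error_percent(measured, predicted):
    """Relative error in percent for each data index.

    Assumed convention: 100 * (measured - predicted) / measured.
    """
    return [100.0 * (m - p) / m for m, p in zip(measured, predicted)]


def error_range(relative_errors):
    """Smallest and largest relative error, as quoted per algorithm."""
    return min(relative_errors), max(relative_errors)


# Hypothetical GWL values, for illustration only.
measured_gwl = [140.0, 145.0, 150.0]
predicted_gwl = [140.14, 144.71, 150.15]

rel_err = relative_error_percent(measured_gwl, predicted_gwl)
lowest, highest = error_range(rel_err)
print(lowest, highest)
```

A narrower range around zero, as reported for CNN, means that no single prediction deviates far from its measured value in either direction.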

Figure 8 and Table 6 report the RMSE and R² for GWL prediction using the traditional and deep ML algorithms (DT, RF, SVM, CNN, GAN, and RNN). The two metrics move in opposite directions: as the R² value increases, the corresponding RMSE value decreases. A higher R² indicates a stronger correlation between the predicted and actual GWL values (e.g., Sakaguchi and Berge, 1998; Seifi et al., 2020; Wu et al., 2023), and a lower RMSE signifies a smaller average error in the prediction (e.g., Mukherjee and Ramachandran, 2018; Yosefvand and Shabanlou, 2020; Iqbal et al., 2021; Lin et al., 2022; Samantaray et al., 2022). The higher R² and lower RMSE values associated with CNN demonstrate its superior performance compared to the other algorithms, and the figure supports the following accuracy ranking: CNN > RNN > GAN > DT > RF > SVM.

FIGURE 8. RMSE and R² for GWL prediction using traditional and deep ML algorithms (DT, RF, SVM, CNN, GAN, and RNN).
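The opposite movement of RMSE and R² follows directly from their definitions when both are evaluated on the same data. The short derivation below assumes that R² is the coefficient of determination and that RMSE is computed over the same n points; the symbols y_i (measured GWL), ŷ_i (predicted GWL), and ȳ (mean measured GWL) are introduced here for illustration, and the article's own definitions may differ.

```latex
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}},
\qquad
R^{2} = 1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}} \\[4pt]
\Rightarrow\quad R^{2} &= 1-\frac{n\,\mathrm{RMSE}^{2}}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}}
\end{aligned}
```

Because n and the denominator depend only on the measured values, they are the same for every algorithm evaluated on a given dataset, so under these assumptions a lower RMSE necessarily corresponds to a higher R².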

A comparison of the Pearson and Spearman correlation coefficients in Figure 9 provides insight into the relationship between the input variables and GWL (e.g., Hauke and Kossowski, 2011; Worsa-Kozak et al., 2020; Balacco et al., 2022). The negative correlation observed between the groundwater recharge variables (R and P) and GWL indicates that these inputs enter the linear relationship governing GWL inversely. In contrast, the positive correlation observed between E and GWL indicates that E is directly proportional to GWL in that relationship (e.g., Hauke and Kossowski, 2011; Mukherjee and Ramachandran, 2018). E involves drawing water from underground aquifers for purposes such as irrigation, industry, and domestic use (Foster and Chilton, 2003; Worsa-Kozak et al., 2020). This often leads to declining GWL as extraction outpaces natural replenishment from R and infiltration, creating a positive correlation between E and GWL reduction.

FIGURE 9. Pearson and Spearman correlation coefficients between the input variables and GWL.

In contrast, P can exhibit a negative correlation with GWL due to stream-aquifer interaction. Elevated GWL can feed the river, bolstering its flow, while low GWL prompts the river to recharge adjacent aquifers by seeping water into the ground, a dynamic that yields a negative correlation between P and GWL. Because the Pearson value for R is approximately −0.00357, which is close to zero, this parameter can be assumed to have little effect on GWL. The use of both Pearson and Spearman correlation methods provides a robust analysis of the data, and the results can be used to develop GWL prediction models based on the input variables (Hauke and Kossowski, 2011; Worsa-Kozak et al., 2020). Expressing the relationships between input variables and GWL in Eq. (10) allows quantitative data analysis and facilitates comparison with other studies.

$$GWL \propto E \quad \text{and} \quad GWL \propto \frac{1}{(P, R)} \tag{10}$$

The Spearman and Pearson correlation coefficient values indicate that variables P and E have a stronger influence on GWL than variable R (e.g., Kim et al., 2016; Csáfordi et al., 2017). However, the relative importance of these variables may vary depending on the specific site conditions and hydrological characteristics. The relative contributions of these variables to GWL can be determined with additional analysis, such as regression modeling, which can also offer insights into the underlying mechanisms causing the observed correlations (Hauke and Kossowski, 2011). The interpretation of the correlation coefficients should also consider the statistical significance of the results as well as the potential for confounding variables or measurement error (Mukherjee and Ramachandran, 2018; Iqbal et al., 2021).
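The two coefficients compared above can be computed as in the following sketch, which uses only the Python standard library. It is an illustration rather than the authors' implementation: the function names and the toy values are hypothetical, and ties are handled with average ranks, which is the usual convention but an assumption here. In practice, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` provide the same statistics together with p-values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values, for illustration only (not data from the study).
input_variable = [1.0, 2.0, 3.0, 4.0, 5.0]
rising_response = [1.0, 4.0, 9.0, 16.0, 25.0]
falling_response = [10.0, 8.0, 7.0, 3.0, 1.0]

print(pearson(input_variable, rising_response))
print(spearman(input_variable, rising_response))
print(spearman(input_variable, falling_response))
```

The toy example shows why reporting both coefficients is informative: a monotonic but nonlinear relationship gives a Spearman coefficient of exactly 1 while the Pearson coefficient stays below 1.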

Deep learning’s outstanding capabilities include forecasting crucial GWL characteristics (Sit et al., 2020; Afan et al., 2021; Wunsch et al., 2021), and AI more broadly has demonstrated its value in predicting GWL parameters. By employing these algorithms, accurate predictions can be obtained that help meet the water supply needs of the community in the Izeh area, safeguard their wellbeing, and enhance their quality of life.

5 Limitation

The main limitation of this article is the lack of access to additional information about water diversion, evaporation rate, and temperature in the target area, especially Izeh City. It is recommended that other researchers consider the influence of these parameters, where they are available, when predicting GWL. Including these parameters in the prediction model can provide a more accurate estimate of groundwater resources, because factors such as evaporation and temperature affect GWL. Additionally, no similar article has been published for Izeh City so far, and it has been somewhat challenging to provide data at this scale.

Furthermore, the effectiveness of CNN algorithms in predicting GWL is enhanced when a substantial number of input variables are employed. In this article, only three parameters were used as input variables, so increasing the number of inputs is expected to improve the accuracy of GWL predictions. Hence, it is recommended that researchers incorporate a greater number of input variables to enhance the algorithm’s accuracy.

6 Conclusion

An extensive dataset of 2,136 time-series data points was collected from Izeh City in the Khuzestan province of Iran. The data were used with traditional and deep machine learning (DML) techniques to predict the GWL in the proximate wellbore regions from three input variables: groundwater extraction rate (E), rainfall rate (R), and river flow rate (P). The analysis shows that traditional and deep ML algorithms, namely recurrent neural network (RNN), convolutional neural network (CNN), generative adversarial network (GAN), decision tree (DT), random forest (RF), and support vector machine (SVM), can be employed to predict GWL with high precision. Moreover, the Pearson and Spearman correlation analyses revealed that GWL is negatively (inversely) related to the groundwater recharge input variables (R and P), whereas the input variable E exhibits a positive correlation with GWL.

Furthermore, the Spearman and Pearson correlation coefficients show that the input variables P and E have a more significant influence on GWL than variable R. Because the Pearson value for R is approximately −0.00357, which is close to zero, it can be inferred that this parameter has little effect on GWL. However, deep learning algorithms can select impactful features and eliminate less influential ones. The CNN model, applied to all data records in the dataset, achieved the highest GWL prediction accuracy: RMSE = 0.0558 and R² = 0.9948. CNN is a robust and effective deep ML tool, and its capability to learn and detect patterns in large datasets makes it an excellent choice for predicting data points from multiple input variables. Some of the advantages of using CNN over the DT, RF, SVM, GAN, and RNN algorithms include robustness to noise and variability, scalability to extensive datasets with multiple input variables, parallelization for rapid processing in real-time and near-real-time applications, the ability to learn and identify patterns in data without explicit programming, and fewer outlier predictions. This research can assist the residents of Izeh City in the Khuzestan province in conserving and managing their water resources and achieving increased crop productivity for the local economy. The approach can also be applied to predict GWL in other parts of the world, and it can potentially improve water management in regions facing water scarcity due to global population growth.

Data availability statement

The data analyzed in this study is subject to the following licenses/restrictions: Data can be made available upon reasonable requests for academic purposes through the corresponding authors. Requests to access these datasets should be directed to hamzehghorbani68@yahoo.com.

Author contributions

FF: Conceptualization, Formal Analysis, Funding acquisition, Methodology, Resources, Software, Validation, Visualization, Writing–original draft, Writing–review and editing. HG: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Methodology, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing. AR: Conceptualization, Funding acquisition, Methodology, Resources, Validation, Visualization, Writing–original draft, Writing–review and editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

¹FEFLOW is a computer program for simulating groundwater flow, mass transfer, and heat transfer in porous and fractured media.

²MODFLOW is the U.S. Geological Survey modular finite-difference flow model, a computer code that solves the groundwater flow equation.

³HydroGeoSphere is a 3D control-volume finite element groundwater model that accounts for surface and subsurface flow, solute and energy transport, and heat transport.

References

Afan, H. A., Ibrahem Ahmed Osman, A., Essam, Y., Ahmed, A. N., Huang, Y. F., Kisi, O., et al. (2021). Modeling the fluctuations of groundwater level by employing ensemble deep learning techniques. Eng. Appl. Comput. Fluid Mech. 15, 1420–1439. doi:10.1080/19942060.2021.1974093
