Application of Artificial Neural Networks for Predicting the Stability of Rectangular Tunnels in Hoek–Brown Rock Masses

An artificial neural network (ANN) model for predicting the stability of rectangular tunnels in rock masses based on the Hoek–Brown (HB) failure criterion is presented in this study. Since the safety assessment of the tunnel stability is one critical issue for civil engineers during the construction, it is very important to develop a reliable and accurate stability analysis of such problems. The finite element limit analysis (FELA) with the HB failure criterion is used to develop the numerical upper and lower bound solutions of the problem of rectangular tunnels in rock masses. A novel machine learning-aided prediction of this problem is then developed based on the datasets of the numerical bound solutions obtained from the FELA. The inputs consist of six dimensionless parameters including the cover-depth ratio of tunnels, the width ratio of tunnels, the normalized uniaxial compressive strength, the geological strength index, the m i parameter, and the degree of disturbance of rock masses. The results show that the optimal ANN models provide very great accuracy in predicting the stability of the rectangular tunnels based on the HB failure criterion. The solutions will provide a prompt assessment of tunnel stability in rock masses for geotechnical engineers during the construction of rock tunnels.


INTRODUCTION
Tunnel safety has been a classic issue for geotechnical engineers during the processes of both construction and operation. It has been challenging to design and construct a large tunnel (e.g., highway tunnel and railway tunnel) especially in the area located on a weak ground. The catastrophic collapses of the tunnel due to the loss of stability and external disturbance have been found around the globe (Chung et al., 1995;Shin et al., 2006;Aygar and Gokceoglu, 2020). Therefore, it is very important to ensure the stability to prevent a collapse inside the tunnel during the construction process. To assess the stability of rock masses during the tunnel construction by an open-face conventional tunneling or a tunneling boring machine, the failure criterion for capturing the collapse of rocks is required to accurately compute the tunnel stability in rocks. The Hoek-Brown (HB) failure criterion is one of the famous criteria for capturing the failure behaviors of rock masses. The first version of this failure criterion was introduced by Hoek and Brown (1980) in the 1980s by employing the curve-fitting of triaxial test data of intact and jointed rocks. Later, in 2002, Hoek et al. (2002 updated the old version by accounting for the effect of highly fractured properties. A brief history and the development of this failure criterion can be found in the study by Hoek (Hoek, 2004;Hoek, 2007).
To estimate the stability of tunnels in rock masses, the finite element limit analysis (FELA) is one of the efficient techniques which has been commonly used to provide the stability solutions of the tunnels (Sloan, 2013). This technique employs plastic bounding theorems, finite element discretization, and non-linear programming. This technique consists of the associated upper and lower bound theorems (UB and LB) in conjunction with a perfectly plastic material with an associated flow rule. A true stability solution can be obtained by bracketing from UB (above) and LB solutions (below). The FELA with the HB failure criterion  has been used to solve the tunnel stability by following the methods of Ukritchon and Keawsawasvong (2019a), Keawsawasvong and Ukritchon (2020), and Xiao et al. (2021) for a single circular, square, and rectangular tunnel in a rock mass, respectively. The solutions of the stability of a plane strain heading of tunnels were also studied by Ukritchon and Keawsawasvong (2019b). Moreover, the stability of unlined dual circular, square, and horseshoe tunnels in rock masses was also carried out by Zhang et al. (2019), Xiao et al. (2019), and Rahaman and Kumar (2020), respectively. However, these works only proposed the solutions of the tunnel stability in rock masses as tables and design charts which cannot be directly applied for arbitrary values of all considered parameters such as the HB parameters or the tunnel's geometries without the approximation or the interpolation of the solutions. In addition, the previous study by Xiao et al. (2019) provided some numerical results for the stability of rectangular tunnels in rock masses. However, they did not comprehensively consider the impact of the degree of disturbance of rock masses on their numerical results (only demonstrating some examples). Therefore, a new procedure providing an accurate and reliable calculation of this stability problem should be carried out to accurately obtain the solutions of the stability of tunnels in rock masses by fully considering the effect of the degree of disturbance of rock masses.
Soft computing appeared as an alternative to the common analytic and numeric approaches, especially an artificial neural network (ANN) approach. This approach enables us to learn from a sufficiently dense dataset, and then configure a black-boxtype prediction model in order to solve the problems in the form of a closed simple equation. The ANN approach has been used to identify various rock parameters from several empirical tests in the field of rock engineering (e.g., Yang and Zhang, 1997;Mert et al., 2011;Ocak and Seker, 2012;Gholami et al., 2013;Miah et al., 2020;Mohamad Ali Ridho et al., 2021). In addition, the soft computing of the bearing capacity of foundations on rock masses using the ANN approach has also been presented by a few researchers (Ziaee et al., 2015;Alavi and Sadrossadat, 2016;Millán et al., 2021). An extreme learning neural network and terminal steepest descent algorithm were carried out by Li et al. (2016) to predict the stability of rock slopes by adopting the dataset from the FELA with the HB failure criterion. For the application of tunnel stability, Naghadehi et al. (2019) utilized the ANN approach to estimate the face stability of mechanized shield tunneling in cohesive-frictional soils. To the best of the author's knowledge, there is no previous study of the assessment of the stability of tunnels in Hoek-Brown rock masses using the ANN technique and the FELA solutions. The objective of this study is to develop a convenient tool based on the ANN approach for providing a prompt assessment of the stability of rectangular tunnels in Hoek-Brown rock masses.

PROBLEM STATEMENT
Hoek-Brown Failure Criterion  principal (compressive) stress. The form of a power-law relationship between the major and minor principal stresses (i.e., σ 1 and σ 3 ) is the mathematical expression of the HB failure criterion. Taking tensile normal stresses as positive, the HB failure criterion can be expressed as (Hoek et al., 2002) where σ ci is the uniaxial compressive strength of intact rock mass, and the parameters m b , s, and a are expressed in Equations 2-4.
( 4 ) In the previous equations, the geological strength index (GSI) has typical values from 10 to 100 (extremely poor rock mass to a perfectly intact rock mass). Also, DF represents the degree of disturbance, and it has typical values from 0 (undisturbed in-situ rock masses) to 1 (extremely disturbed in situ rock masses).
Parameter m i is a material constant that is related to the frictional strength of an intact rock mass and has typical values from 5 to 30. Figure 1 shows the problem definition of 2D unlined rectangular tunnels in a plane strain condition. Due to the assumption of the plane strain condition, the problem represents a very long unlined rectangular tunnel. The tunnels have a width (D), a length (B), and a cover depth (C) above their crown. The parameters of rock masses based on the Hoek-Brown failure criterion consist of GSI, DF, m i , σ ci , and unit weight, γ. A uniform surcharge pressure (σ s ) is applied over the rock surface as shown in Figure 1.

Problem Definition
According to the aforementioned parameters, there are eight design parameters in this study (i.e., C, D, B, σ ci , GSI, DF, m i , and γ). The dimensionless output parameter of this problem is the stability factor denoted by σ s /σ ci . Eq. 5 represents the stability factor as a function of six dimensionless parameters as follows: where B/D represents the width ratio, C/D represents the cover depth ratio, DF represents the degree of disturbance, GSI represents the geological strength index, mi represents a material constant related to the frictional strength, and γD/σci represents the normalized uniaxial compressive strength.
In this study, a non-linear input-output mapping of the system of tunnels in rock masses is constructed using a neural network trained by an extreme learning algorithm. The training data are the FELA solutions of the tunnel stability factor.

METHODOLOGY Finite Element Limit Analysis
The finite element limit analysis (FELA), which is the computational method based on a perfectly plastic material with an associated flow rule, employs the plastic bound theorems, finite element discretization, and mathematical optimization (Sloan, 2013;Keawsawasvong and Ukritchon, 2017;Ukritchon and Keawsawasvong, 2017;Krishnan et al., 2019;Ukritchon et al., 2019;Ukritchon and Keawsawasvong, 2020a;Ukritchon and Keawsawasvong, 2020b;Ukritchon et al., 2020;Keawsawasvong and Ukritchon, 2021). This FELA technique is carried out to derive the bracket of the true limit load from the targeted upper Bound (UB) and lower Bound (LB) solutions. A new computer software, namely, OptumG2 (OptumCE, 2020), is employed to compute the active collapse pressure (σ s ) of unlined rectangular tunnels in rock masses. The results will be used as datasets for constructing a machine learning model.
Figures 2A-C show three numerical models of unlined rectangular tunnels in rock masses for the cases of B/D = 0.5, 1, and 2, respectively, where the others are C/D = 2, γD/σ ci = 0, GSI = 80, DF = 0, and m i = 20. Due to the symmetry of the problem, only half of the domain is used in the modeling. The boundary conditions at the left plane of symmetry and the right plane are set to move only in the vertical direction. The boundary condition at the bottom plane is not allowed to move in both vertical and horizontal directions. The sizes of the domain are chosen to be sufficiently large in order to avoid any error from the problem size on the computed bound solutions. The tunnel is unlined, and there is no pressure applied on the periphery of the tunnel. At the rock surface, the uniform surcharge σ s is applied overall the area. This surcharge at the active collapse state will be optimized and used as the output from the LB and UB FELA using OptumG2. The recent adaptivity meshing technique is also employed in this study to improve computational efficiency (e.g., Ciria et al., 2008). Using this adaptivity meshing technique, a number of elements are automatically added in the zones that contain large plastic shear strain. As a result, the differences between UB and LB solutions become smaller after a few iteration steps. The setting of the adaptivity meshing technique in this study is set as an initial mesh of 5,000 elements at the first step and then will be increased to 10,000 elements after three iterations of mesh adaptivity. Examples of typical adaptive meshes can be seen in Figures 3A-C for the cases of B/D = 0.5, 1, and 2, respectively.
Examples of absolute velocity contours of an unlined rectangular tunnel in the rock mass are also shown in Figures 4A-C for the cases of B/D = 0.5, 1, and 2, respectively, which depict the associated failure mechanisms of this tunnel stability problem. All results of the tunnel stability obtained from the LB and UB FELA (about 2,160 solutions) will be averaged and then used as the training datasets in the machine learning approach as described later in the next section. The results of the stability index obtained by the FELA will be used in the output layer datasets in order to construct a machine learning model. Table 1 concludes the ranges of dimensionless parameters considered in the FELA. It is important to note that these ranges cover realistic geometrical and geological properties of the tunnel and rock masses. The input includes major parameters: γD/σ ci , GSI, m i , DF, C/D, and B/D affecting the stability index that is analyzed using the FELA and applied as an output variable in machine learning models.

Multiple Linear Regression
Linear regression is a linear approach for modeling the linear relationship between the dependent variable (scalar response) and  one or more independent variables (also known as explanatory variables). The case of one dependent variable is called simple linear regression. In this study, there are six independent variables so that the process is called multiple linear regression. It is noted that this method is one of the most well-known and simplest algorithms in statistical analysis and machine learning. The output is a dependent variable that can be calculated from the combination of the input or independent variables as shown in the following equation: wherey i = dependent variable (output);x i1 , x i1 , . . . x ip = independent variables (input);β 0 = y-intercept (constant term); β 1 , β 1 , . . . , β p = slope coefficients for each explanatory variable;ϵ = the model's error term (also known as the residuals).
A regression model assumes that the linear relationship between dependent variable y and the p-vector of regressors x is linear. This relationship is modeled via a residual term or error variable ϵ-an unobserved random variable that adds "noise" to the linear relationship between the dependent and independent variables. This study uses the linear regression function in WEKA to perform standard least-squares multiple linear regression and optionally perform attribute selection, either greedily using backward elimination or by building a full model from all attributes and dropping the terms one by one, in the decreasing order of their standardized coefficients, until a stopping criterion is reached.

Artificial Neural Network
An artificial neural network (ANN) is also applied in this study. This method is a data prediction framework based on existing features created from the human mind structure. It simulates the processing mechanism of the human brain's nervous system to complex information. A neural network is a computational model consisting of a large number of nodes (or neurons) connected to each other.
ANN consists of three layers: input layer, hidden layer, and output layer, as shown in Figure 5. The first layer is the input layer, where the feature vector is passed through. In this study, the input layer consists of six nodes representing γD/σ ci , GSI, m i , DF, C/D, and B/D. The second layer is the hidden layer, which consists of one or more threshold logic unit layers. Generally, the number of hidden layers and hidden neurons is chosen on the basis of a trial-and-error method until the best model is obtained. This layer aims to convert the information into content so that the output layer can be used to predict the data. It is within this layer that the weighted sums of the inputs are calculated and a step function is applied to it before being sent off as an output through the use of the rectified linear unit (ReLU) activation function, which provides non-linearity in the network. The final layer is the output layer presenting an independent variable or a predicted value. The output layer consists of one node presenting a predicted stability factor of rectangular tunnels in rock masses.

Cross-Validation
In the case of a limited number of datasets, splitting the method into train and test datasets is not reliable. A more general technique to offset any bias produced by the individual sample used for holdout is to repeat the entire procedure, training, and testing, with various random samples several times. The general way of estimating either the accuracy or error of a machine learning technique given a single, fixed sample of data is to use stratified 10-fold cross-validation. The data are divided randomly into 10 parts with the class represented in approximately the same proportions as in the complete dataset. Each part is held out in turn and the learning scheme trained on the remaining nine-tenths before calculating the error rate is calculated on the holdout set. Hence, the learning procedure has been executed a total of 10 times on different training sets. The average of the 10 errors is finally calculated to yield overall error estimation. However, a single tenfold cross-validation might not be enough to get a reliable error estimate. Different tenfold cross-validation experiments with the same learning scheme and dataset often produce different results because of the effect of random datasets.
It is recommended to repeat the cross-validation process 10 times-that is, 10 times tenfold cross-validation-and average the results. This involves invoking the learning algorithm 100 times on datasets that are all nine-tenths the size of the original.

Performance Measures
In order to investigate the performance of the trained models in this study, three statistical analyses named correlation coefficient (R), root mean squared error (RMSE), and mean absolute error (MAE) are employed.
The correlation coefficient measures the statistical correlation between the predicted value and actual value. The correlation coefficient ranges from 0 when there is no correlation to one for perfectly correlated results. However, a value that is less than zero signifies a negative relationship. Correlation is slightly different from the other measures because it is scale-independent in that, if a particular set of predictions is taken, the error is unchanged if all the predictions are multiplied by a constant factor and the actual values are left unchanged. This factor appears in every term of S PA in the numerator and in every term of S P in the denominator, thus canceling out. (This is not true for the relative error figures, despite normalization; if all the predictions are multiplied by a large constant, then the difference between the predicted and actual values will change dramatically, as will the percentage errors.) It is also different in that good performance leads to a large value of the correlation coefficient, whereas because the other methods measure error, good performance is indicated by small values.  The coefficient of determination (R-squared, R 2 ) is the square of the coefficient of correlation. It is important to note that in the case of multiple variables, R 2 can better quantify the strength of the developed model since r cannot be calculated when the variables are more than 1. Thus, this study uses R 2 as one of the performance measures of the developed model.
In addition, the mean absolute error (MAE) is an average of the magnitude of the individual errors without taking account of their sign. The mean-squared error tends to exaggerate the effect of outliers-instances when the prediction error is larger than the others-but the absolute error does not have this effect. All sizes of error are treated evenly according to their magnitude. The equation for MAE is shown in MAE p 1 − a 1 + . . . + p n − a n n .
In addition, the mean-squared error (MSE) is the principal and most commonly used measure; sometimes, the square root (root mean-squared error, RMSE) is taken to give it the same dimensions as the predicted value itself. Many mathematical techniques use the mean-squared error because it tends to be the easiest measure to manipulate mathematically. It is, as mathematicians say, "well behaved." However, RMSE is more widely used than MSE to evaluate the performance of the regression model with other random models as it has the same units as the dependent variable. Note that the lower MSE and RMSE value indicates a model with higher accuracy.

Multiple Linear Regression
A multilinear regression model is first conducted, and the regression coefficients or weights are obtained. The where y represents the stability factor σ s /σ ci , whereas x 1 , x 2 , x 3 , x 4 , x 5 , and x 6 are the dimensionless input parameters, namely, γD/σ ci , GSI, m i , DF, C/D, and B/D, respectively. Figure 6 presents the comparison between the actual stability factor (average values from UB and LB solutions) obtained from the FELA and the predicted value obtained from the multiple linear regression. The performance of the developed equation can be accessed via statistical tests, R 2 , MAE, and RMSE which are found to be 0.8466, 3.2983, and 4.5924, respectively (see Table 2).

Artificial Neural Network
In order to maximize the accuracy of the ANN models, the number of hidden layers and neurons should be optimized. In this study, one hidden layer is considered, while the number of hidden neurons is varied. Figure 7 presents the performance of ANN models for the stability factor of rectangular tunnels in rock masses. It is clear that the performance of ANN models is increased if the number of hidden neurons is increased. It should be noted that R 2 , MAE, and RMSE must be calculated as  the R 2 value alone cannot really show the performance of the models as the variation of MAE and RMSE can still be seen clearly. Figure 8 presents the performance of the models against the number of hidden neurons. It is found that when the number of hidden neurons is over 11, the performance of ANN models is likely to be stabilized. In this case, ANN with the architecture of 6-11-1 is chosen to be the optimal MLP model as it shows the lowest MAE and RMSE values among the other models, while R 2 is the highest among the models. Table 2 also compares the performance between the MLR and MLP models. It is clear that the MLP model performs much better than the MLR model. This optimal MLP model with the architecture of 6-11-1 is used in the next section for the sensitivity analysis. After obtaining the optimal ANN architecture, the approximate general functions can be employed considering the weighted inputs and the transfer function to create the outputs. The layer number defines the superscript on the weight matrix in multiple-layer networks, as seen in Figure 9. The proper notation is used in the two-layer tansig/purelin network. This network is useful for approximating general functions. Given a sufficient number of neurons in the hidden layer, it can arbitrarily approximate well any function with a finite number of discontinuities. In this section, the final weights of each parameter have been calculated in order to study the effects of each parameter on the stability index. Figure 9 shows an example of the dimension of weight matrix and bias of the optimal ANN model for the heading tunnel. Predictive  Equation 11 can be developed based on the tansig function, weight, and bias from the ANN model.
where N is the number of hidden neurons, X is the number of input variables, and J is the number of input variables. The weight matrix (IW1 and IW2) and bias (b 1i and b 2 ) in the hidden and output layers corresponding to the optimal ANN models are obtained. Hidden weight (IW1) is obtained based on the number of input parameters (J) and hidden neurons (N). There is one weight for every input to neuron connection between the layers. Each neuron in the hidden layer has its own bias constant (b 1i ). As for the output weight matrix (IW2), the number of rows matches the number of hidden layer neurons (N) and the number of columns matches the number of output layer neurons (k). There is one column for every neuron in the output layer. In this case, the output layer contains only one column. Table 3 presents the neural network constants of the optimal ANN model including the weight matrix and bias for the stability index calculation of rectangular tunnels. These values obtained from the optimal ANN networks can be used to develop predictive equation functions and test on new datasets with different variations of parameters within required ranges.

Sensitivity Analysis
The sensitivity analysis of the stability of rectangular tunnels in rock masses using the ANN approach is presented next to portray the influences of all considered input dimensionless parameters (e.g., B/D, C/D, DF, GSI, m i , and γD/σ ci ) on the stability factor (σ s /σ ci ). Figures 10A,B, respectively, show the effect of the width ratio B/D on the stability factor σ s /σ ci for the cases of (γD/σ ci = 0, DF = 0, GSI = 80, and m i = 5 and 30). The non-linear relationship between σ s /σ ci and B/D can be seen in Figure 10. It is found that an increase in B/D can decrease the geometrical arching effect resulting in the reduction of tunnel stability. As a result, a larger B/D ratio causes a smaller value of σ s /σ ci . In Figure 10, the stability factor of the lowest case (B/ D = 0.5) is higher than that of the highest case (B/D = 2), which is about 120-300%. The impact of cover depth ratio C/D on the stability factor σ s /σ ci for the cases of γD/σ ci = 0, DF = 0, m i = 20, and GSI = 60 and 100 are shown in Figures 11A,B, respectively. A non-linear relationship between σ s /σ ci and C/D is observed in Figure 11. A larger C/D value also yields an incensement of the geometrical arching effect which is positive in improving the tunnel stability factor (σ s /σ ci ). Generally, when B/D = 0.5 to 1, the case of the deepest cover depth (C/D = 5) has a larger stability factor about three to five times of the case of C/D = 1, as shown in Figure 11. However, when B/D = 2, the difference between the stability for the cases of C/D = 1 and 5 becomes very large about 40 times. The influences of Hoek-Brown parameters for rock masses including DF, GSI, m i , and γD/σ ci are presented next. From Figures 12A,B, the impact of geological strength index GSI on the stability factor σ s /σ ci is illustrated for the cases of γD/σ ci = 0, C/D = 3, B/D = 0.75, and DF = 0.25 and 0.5, respectively. Numerical results in Figure 12 have shown that there is an exponential relationship between GSI and σ s /σ ci , where an increase of GSI results in a non-linear increase of σ s /σ ci for all m i values. Such results can be referred to the exponential function used in the model of the Hoek-Brown failure criterion as expressed in Eqs 2-4. It is due to the fact that a large GSI value is a highly undisturbed rock mass which yields an increase in the stability of rock tunnels. From Figure 12, it is also found that when the value of m i is large, the non-linearity of the σ s /σ ci and GSI relationship is also high. The effect of the m i parameter on the stability factor σ s /σ ci is presented in Figures 13A,B for the cases of DF = 0.25 and 0.5, respectively. The plots are for γD/σ ci = 0, C/D = 3, and B/D = 0.75. The results in Figure 13 show that the relationship between σ s /σ ci and m i is linearly increasing. An increase of m i results in an increase of σ s /σ ci for all B/D values. In a physical meaning, m i depends upon the mineralogy, composition, and grain size of the intact rock. From Figure 13, it can be observed that the slope of the σ s /σ ci and m i line becomes higher when the value of GSI becomes larger, meaning that the impact of m i is more prominent when the GSI value is high. The influence of the degree of disturbance DF on the stability factor σ s /σ ci is demonstrated in Figures 14A,B for the cases of GSI = 60 and 100, respectively, where the others are γD/σ ci = 0, C/D = 3, and m i = 20. A high value of DF represents a larger degree of disturbance meaning that the stability factor σ s /σ ci is reduced. The slopes of the lines of DF effect decrease as the values of GSI increase (see Figures 14A,B). For the case of GSI = 60 (see Figure 14A), the structure of rock masses for this case is very blocky and has partially distributed masses so that the impact of DF is very high due to the low quality of rocks. On the other hand, when GSI = 100 (see Figure 14B), the structure of rock masses is perfectly intact with few very widely spaced discontinuities. As a result, the effect of DF on this perfectly intact rock on the stability factor is then smaller than that of blocky rocks. Finally, the influence of the normalized unit weight and the uniaxial compressive strength ratio γD/σ ci is presented in Figures 15A,B for the cases of DF = 0, GSI = 60, C/D = 3, and m i = 5 and 30, respectively. Obviously, all data are plotted horizontally, meaning that the increase of σ ci /γD does not significantly affect the results of σ s / σ ci . This is because the stress induced by the unit weight γ of rock masses is low in comparison to that of σ ci for rock masses. Thus, the impact of γD/σ ci is negligible for the tunnel stability in rock masses.

CONCLUSION
To the authors' knowledge, this study is the first to establish a machine learning aided design for predicting the stability factor of the rectangular tunnel that is located in a rock mass following the Hoek-Brown (HB) failure criterion. The stability factor of the problem is investigated in terms of six dimensionless parameters including the width ratio, cover depth ratio, degree of disturbance, geological strength index, material constant related to the frictional strength, and normalized uniaxial compressive strength. For practical engineers, it is time-consuming to develop the algorithm of FELA with the HB failure criterion for obtaining stability solutions of tunnels in rock masses. Moreover, proper software is not usually user-friendly, and additional resources capable of providing information useful for decision-making are required. This study provides the optimal machine learning models for predicting the stability factor of rectangular tunnels. It is notable that only one hidden layer is sufficient to create a high-performance neural network model as R 2 is already high and MSE is extremely low, showing that the optimal model can be used to accurately predict the stability factor. The trained networks are obtained and can be further used to test new data for predicting the stability factor of the tunnel located in a rock mass using the weight matrix and bias derived in this study. However, the proposed ANN models should not be used when the values of parameters are out of the certain ranges presented in this study.