Geopolymer Concrete Compressive Strength via Artificial Neural Network, Adaptive Neuro Fuzzy Interface System, and Gene Expression Programming With K-Fold Cross Validation

Ultrafine fly ash (FA) is a hazardous material collected from coal production, which has been proficiently employed for the manufacturing of geopolymer concrete (GPC). In this study, three artificial intelligence (AI) techniques, namely, artificial neural network (ANN), adaptive neuro-fuzzy interface system (ANFIS), and gene expression programming (GEP), are used to establish a reliable and accurate model to estimate the compressive strength (f ′ c ) of fly ash–based geopolymer concrete (FGPC). A database of 298 instances is developed from peer-reviewed published work. The database consists of the ten most prominent explanatory variables, with f ′ c of FGPC as the response parameter. The statistical error checks and criteria suggested in the literature are considered for the verification of the predictive strength of the models. The statistical measures considered in this study are MAE, RSE, RMSE, RRMSE, R, and the performance index (ρ). These checks verify that the ANFIS predictive model gives an outstanding performance, followed by the GEP and ANN predictive models. In the validation stage, the coefficient of correlation (R) for the ANFIS, GEP, and ANN models is 0.9783, 0.9643, and 0.9314, respectively. All three models also fulfill the external verification criterion suggested in the literature. Generally, the GEP predictive model is ideal as it delivers a simple mathematical equation for future use. The k-fold cross-validation (CV) of the GEP model is also conducted, which verifies the robustness of the GEP predictive model. Furthermore, a parametric study is carried out via the proposed GEP expression. This confirms that the GEP model accurately covers the influence of all the explanatory variables used for the prediction of f ′ c of FGPC. Thus, the proposed GEP equation can be used in the preliminary design of FGPC.


INTRODUCTION
Fly ash is the unburned residue obtained from coal production; it is carried out by the gases expelled from the boiler and then collected by means of mechanical or electrostatic precipitators (Rafieizonooz et al., 2016; Aprianti S, 2017; Akbar et al., 2021). Every year, about 375 million tons of FA is generated, with a retention cost of $20-$40 per ton (Dwivedi and Jain, 2014). FA contains hazardous minerals like alumina, ferric oxide, and silica; dumping it at landfill sites without sufficient treatment has a destructive and harmful effect on the ecology (Carlson and Adriano, 1993; Kumar Tiwari et al., 2016; Nadesan and Dinakar, 2017; Ghazali et al., 2019). Sound waste management is needed to sustain a healthy environment. Also, the ultrafine particles of FA, when they reach the respiratory system, cause health issues such as cancer, anemia, physiological disorder, dermatitis, and diarrhea. FA also pollutes the groundwater and endangers aquatic life (Carlson and Adriano, 1993; Kumar Tiwari et al., 2016; Ghazali et al., 2019).
Concrete is the most widely used construction material in the world and the most consumed substance after water (Liew and Akbar, 2020). Around 3 tons of concrete is manufactured per person, which amounts to 25 billion tons of concrete production per year (Watts, 2019). This requires 2.6 billion tons of cement production every year, which is expected to increase by 25 percent in the next decade (Wongsa et al., 2020). However, the processing of cement plays a detrimental role in polluting the environment: the manufacture of 1 ton of cement emits 1 ton of carbon dioxide into the atmosphere. Cement production also consumes limestone, and a serious deficiency of limestone may arise after 25-50 years (Farooq et al., 2020c; Sumanth Kumar et al., 2020). Thus, the production of green concrete is needed, which leads to sustainable development and a healthy environment. FA acts as a supplementary cementitious material in the concrete mix and has been effectively used by researchers in the production of green concrete (Wang et al., 2017; Wang et al., 2019a; Chen et al., 2019). The consumption of FA in construction is a better choice, as it will not only reduce the harmful impact of dumping it into landfills but will also decrease the use of cement.
Over the last 20 years, the use of fly ash-based geopolymer concrete in the construction industry has increased rapidly because it reduces the consumption of cement (Gülşan et al., 2019; Kondepudi and Subramaniam, 2019; Xie et al., 2019; Zhang et al., 2020a; Bajpai et al., 2020; Meesala et al., 2020; Noushini et al., 2020; Nuaklong et al., 2020). Because of the anomalous behavior of FA, its application in the construction industry is still limited (Jena et al., 2019; Nguyen et al., 2020; Sandanayake et al., 2020). FGPC is used in the construction industry, yet no method is available for the prediction of its compressive strength (f ′ c ) based on the mix design parameters with maximal variables. The f ′ c of FGPC varies with numerous parameters, such as the temperature required for curing of the sample (T), the time required for curing of the sample (t), the age of the sample (A), the molarity (M) of the sodium hydroxide (NaOH) solution used, the percentage of silicon dioxide (SiO 2 ) to water ratio (% S/W) for preparing the sodium silicate (Na 2 SiO 3 ) solution, the ratio of the sodium silicate (Na 2 SiO 3 ) solution to NaOH (N s /N o ), the percentage by volume of total aggregates (% A G ), the ratio of fine aggregate to total aggregates (F/A G ), the ratio of alkali to fly ash (A L /F A ), the percentage of plasticizer (% P), and the percentage of extra added water (% E W ) (Luhar et al., 2019; Tran et al., 2019; Van Dao et al., 2019; Wang et al., 2019b; Zhang et al., 2019; Zhang et al., 2020b; Prachasaree et al., 2020; Farooq et al., 2021). This raises uncertainty in the prediction of f ′ c of FGPC. Moreover, a rise in the use of supervised machine learning techniques for the development of empirical models has been observed in recent times.
Throughout the globe, AI techniques are being used to estimate concrete properties. Researchers use various AI techniques, such as fuzzy interface system (FIS), response surface methodology (RSM), adaptive neuro-fuzzy interface system (ANFIS), extreme learning machine (ELM), artificial neural network (ANN), support vector machine (SVM), random forest (RF), particle swarm optimization algorithm (PSOA), backpropagation neural network (BPNN), genetic algorithm (GA), genetic programming (GP), and gene expression programming (GEP). Table 1 covers the recent research conducted for the prediction of concrete properties via AI techniques. The ANN and ANFIS techniques can detect and generalize complicated patterns. Therefore, they can be effectively used to solve complex engineering problems (Noori et al., 2010; Chou and Pham, 2013). However, the presence of a large number of hidden neurons sometimes makes it difficult to develop the relationship between input and output variables. These models show a strong correlation between inputs and outputs but do not provide an empirical equation that can be further used in the field. This is due to the complex structure of ANN and ANFIS models, which limits their wide-scale adoption (Noori et al., 2010; Sebaaly et al., 2018).
Genetic programming (GP) is a valuable soft computing method, as it does not rely on a previously developed relationship when establishing the model (Gandomi et al., 2012; Gandomi et al., 2013). Recently, gene expression programming (GEP), an extension of GP, has been introduced in civil engineering. GEP uses a fixed-length linear chromosome that encodes a small program (Ferreira, 2006). It is advantageous because it provides a simple empirical equation for predicting the response, which can be used in practice (Behnia et al., 2013; Beheshti Aval et al., 2017; Gholampour et al., 2017; Sadrossadat et al., 2018; Iqbal et al., 2020).
In the design and analysis of concrete, compressive strength (f ′ c ) is the key factor. Extensive experimental research has been carried out to find the f ′ c of FGPC. To avoid costly experimental procedures, to save time, and to support the usage of FA in the building industry, a reliable, precise, and accurate mathematical equation is desirable that relates the maximum number of mix proportion variables to the f ′ c of FGPC. Alkroosh and Sarker (2019) established a GEP-based empirical relationship for the prediction of f ′ c of FGPC, based on 56 instances taken from a previous study (Hardjito and Rangan, 2005). The application of this model is limited to a confined database, that is, to the accompanying experimental results. Also, no variable was considered for the preparation of the Na 2 SiO 3 solution. Furthermore, their model displays a strong increasing linear relation between the molarity of NaOH and f ′ c of FGPC, which contradicts other studies that confirm a decrease in the f ′ c of FGPC with increasing molarity of the NaOH solution (Joseph and Mathew, 2012).
In this research, a comprehensive database of 298 instances has been established from previous peer-reviewed published work. It contains 101 cylindrical samples with a size of 200 × 100 mm (height × diameter), 31 cube samples with a size of 100 mm, and 166 cube samples with a size of 150 mm. The comprehensive database ensures the consistency of the AI models. Because AI techniques involve complex programming and require careful optimization, three AI methods, that is, ANN, ANFIS, and GEP, are employed to predict the f ′ c of FGPC. The performance of these models is verified by k-fold cross-validation, statistical checks, and a sensitivity and parametric study. Also, the performance of all these models is compared with each other to counter the complexity of programming.

MACHINE LEARNING MODELING TECHNIQUES
This study considers three different artificial intelligence (AI) algorithms, to establish a model for estimating the compressive strength of FGPC. The execution of these models does not need any prior knowledge of the experimental procedure. This section briefly describes the overview of the AI modeling techniques used in this research.

Gene Expression Programming
Koza suggested an artificial intelligence method, that is, genetic programming (GP), as a substitute for GA, which works on fixed-size strings (Koza and Poli, 2005). GP is a flexible and adjustable programming method as it uses a nonlinear parse tree structure, which accommodates the nonlinearity inherent in the data. Such nonlinearity has been handled in earlier work (Koza and Poli, 2005; Alkroosh and Sarker, 2019). However, GP has no independent genome: the nonlinear structure serves as both genotype and phenotype. This makes GP questionable for assembling rudimentary and simple equations. To resolve the discrepancies in GP, Ferreira introduced a novel methodology called GEP (Koza and Poli, 2005). The noteworthy change in GEP is that it passes the genome on to the next generation. Another unique feature is the formation of entities via chromosomes made up of genes, each of which is divided into a head and a tail (Saridemir, 2010). In GEP, every gene comprises fixed-length parameters, a function set of arithmetic operations, and a terminal set of constants. In the operation of the genetic code, there exists a one-to-one relationship between the symbols of the chromosomes and the corresponding functions. The essential information needed for the establishment of an empirical equation is stored in the chromosomes. A novel language, Karva, was developed to read this information. Figure 1 shows the flow diagram of the GEP technique. The first step is the random generation of fixed-size chromosomes for every individual. The chromosomes are then expressed as expression trees (ET), and the fitness is calculated for every individual. The cycle continues with the addition of new individuals for several generations until the best model is achieved. To renew the population, genetic operations like reproduction, mutation, and crossover are carried out.
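The head/tail gene structure and its translation into an expression tree can be illustrated with a minimal sketch. It assumes a toy function set (`+`, `-`, `*`, `/`, and `Q` for square root) and single-letter terminals; the names `ARITY`, `tail_length`, `decode`, and `evaluate` are hypothetical and are not taken from GeneXproTools or from this study. The tail length relation t = h(n_max − 1) + 1 follows Ferreira's formulation of GEP, where h is the head length and n_max the largest function arity.

```python
import math
from collections import deque

# Arity of each function symbol; any other symbol is treated as a terminal.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "Q": 1}  # Q denotes square root


def tail_length(head_length, max_arity):
    """Tail size that guarantees every function in the head gets its arguments."""
    return head_length * (max_arity - 1) + 1


def decode(gene):
    """Read a gene breadth-first (Karva order) into nested [symbol, children] nodes."""
    root = [gene[0], []]
    queue = deque([root])
    pos = 1
    while queue:
        node = queue.popleft()
        for _ in range(ARITY.get(node[0], 0)):
            child = [gene[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, values):
    """Evaluate an expression tree for the given terminal values."""
    symbol, children = node
    if symbol not in ARITY:
        return values[symbol]
    args = [evaluate(child, values) for child in children]
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    if symbol == "/":
        return args[0] / args[1]
    return math.sqrt(args[0])  # symbol == "Q"


# Head "*+a" (length 3) and tail "bcab" (length 4); the last two symbols are unused.
gene = "*+abcab"
tree = decode(gene)
result = evaluate(tree, {"a": 2.0, "b": 3.0, "c": 4.0})  # encodes (b + c) * a
```

Because the tail contains only terminals, any mutation of the head still decodes to a valid expression, which is what lets GEP apply genetic operators freely to fixed-length chromosomes.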

Artificial Neural Network
ANN is an artificial intelligence (AI) method for analyzing data that mimics the learning ability of the human brain. The most extensively used form of ANN is the feed-forward backpropagation (FFBP) algorithm. Figure 2 illustrates that FFBP comprises a minimum of three layers, that is, the input, hidden, and output layers. These layers are linked through nodes in an appropriate order along with approximated weights. The purpose of the input layer is to receive the data from outside; its nodes do not perform any operation on the input data. The data are weighted, biased, and summed in the hidden layer. The processed data are then transferred to the output layer (Gandomi and Roke, 2015).
Two types of FFBP are generally used, that is, the single-layer perceptron (SLP) and the multilayer perceptron (MLP). The SLP is simple but cannot capture nonlinear relationships, while the more complex MLP effectively handles the nonlinear relation between output and input variables. The steps involved in the mathematical operation of the MLP are as follows:
Step 1: The input data are weighted and summed, where n, I i , and ω ij represent the total number of inputs, the current input, and the weight between the prior layer and the j th neuron, respectively, while b represents the bias term.
Step 2: An activation function is applied to the weighted sum. Different activation functions are used, such as ramp, Gaussian, and sigmoid functions. The sigmoid function is utilized in this study (Eq. 2).
Step 3: The final output is determined from the outputs estimated by the hidden neurons, where ω jk represents the weighted link between the j th hidden node and the k th output node, and b k ′ is the bias of the k th output node.
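The three steps above can be sketched as a single forward pass. The equations themselves did not survive in this copy of the text, so the sketch assumes the standard forms implied by the symbol definitions: a weighted sum plus bias for each hidden neuron, the logistic sigmoid 1/(1 + e^(−x)) as activation, and a weighted sum of hidden outputs plus an output bias. The linear output layer, the layer sizes, and the function names are assumptions made for illustration, not the configuration used in the study.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-x))


def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer forward pass.

    inputs:   (n_samples, n_inputs)
    w_hidden: (n_inputs, n_hidden)  weights omega_ij
    b_hidden: (n_hidden,)           hidden biases b
    w_out:    (n_hidden, n_outputs) weights omega_jk
    b_out:    (n_outputs,)          output biases b'_k
    """
    net = inputs @ w_hidden + b_hidden  # Step 1: weight and sum the inputs
    hidden = sigmoid(net)               # Step 2: apply the activation function
    return hidden @ w_out + b_out       # Step 3: combine hidden outputs (linear output assumed)


# Illustrative shapes only: 10 explanatory variables, 5 hidden neurons, 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 10))
prediction = mlp_forward(
    x,
    rng.normal(size=(10, 5)),
    np.zeros(5),
    rng.normal(size=(5, 1)),
    np.zeros(1),
)
```

Training by backpropagation would adjust the four weight and bias arrays to minimize the error between `prediction` and the measured strength; that part is omitted here.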

Artificial Neuro-Fuzzy Interface
ANFIS is another AI technique that combines fuzzy logic and ANN (Çaydaş et al., 2009). The ANN component is generally employed to lessen the probability of error in the outcome, whereas the fuzzy logic component is used to represent expert knowledge and is applied in the mathematical modeling of the anticipated input and output dataset (Jaafari et al., 2019). Generally, ANFIS works on five layers, described as follows:
1st layer: This is the fuzzification layer, which contains the membership functions of the input parameters. A Gaussian membership function is used, where ε i and a i are the membership function parameters.
2nd layer: The nodes in this layer output the product of their inputs, which gives the firing strength (weight) of each rule. This layer implements the fuzzy AND operation.
3rd layer: This layer normalizes the firing strengths by computing the ratio of each rule's firing strength to the sum of all firing strengths.
4th layer: This is the defuzzification layer, which uses square nodes to evaluate the consequent of each fuzzy rule, where r i , m i , and n i are the linear parameters.
5th layer: This layer sums the outputs of the previous layer to produce the final output.
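The five layers can be sketched for a small two-input system. This is a minimal illustration under stated assumptions, not the model fitted in the study: it uses a first-order Sugeno form, a Gaussian membership function written with a center and a width (standing in for the parameters ε i and a i , whose exact roles are not recoverable from the text), and a linear consequent m i x + n i y + r i for each rule. All function and argument names are hypothetical.

```python
import numpy as np


def gaussian_mf(x, center, width):
    """Gaussian membership function (assumed parameterization)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, first-order Sugeno ANFIS.

    premise:    one entry per rule, ((center_x, width_x), (center_y, width_y))
    consequent: array of shape (n_rules, 3) with columns m_i, n_i, r_i
    """
    # Layer 1: fuzzification of each input
    mu_x = np.array([gaussian_mf(x, c, s) for (c, s), _ in premise])
    mu_y = np.array([gaussian_mf(y, c, s) for _, (c, s) in premise])
    # Layer 2: rule firing strengths via the product (fuzzy AND)
    w = mu_x * mu_y
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: weighted linear consequents
    m, n, r = np.asarray(consequent, dtype=float).T
    rule_out = w_bar * (m * x + n * y + r)
    # Layer 5: overall output as the sum of rule outputs
    return rule_out.sum()


# Two symmetric rules; at x = y = 1 both fire equally, so the output is the
# average of the two consequents.
premise = [((0.0, 1.0), (0.0, 1.0)), ((2.0, 1.0), (2.0, 1.0))]
consequent = [[1.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
output = anfis_forward(1.0, 1.0, premise, consequent)
```

In a trained ANFIS, the premise parameters and the consequent coefficients are the quantities tuned during the epochs.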
The compiled dataset has f ′ c as the response variable and the following explanatory variables: the temperature required for curing of the sample (T), the age of the sample (A), the molarity (M) of the sodium hydroxide (NaOH) solution used, the percentage of silicon dioxide (SiO 2 ) to water ratio (% S/W) for preparing the sodium silicate (Na 2 SiO 3 ) solution, the ratio of the sodium silicate (Na 2 SiO 3 ) solution to NaOH (N s /N o ), the percentage by volume of total aggregates (% A G ), the ratio of fine aggregate to total aggregates (F/A G ), the ratio of alkali to fly ash (A L /F A ), the percentage of plasticizer (% P), and the percentage of extra added water (% E W ). For all the collected samples, the time (t) required for the initial curing of the sample is 24 h. It is true that f ′ c rises with curing time (t), but the rate of increase in f ′ c of FGPC is rapid only up to 24 h (Hardjito and Rangan, 2005). Moreover, some researchers stated that, due to quick geopolymerization, f ′ c does not improve after 24 h (Van Jaarsveld et al., 2002). Therefore, only limited studies have been conducted on prolonged curing times. The performance of every model relies on the distribution of the input variables (Gandomi and Roke, 2015). Figure 3 shows the cumulative percentage and frequency distribution for all ten input parameters used in the modeling of f ′ c of FGPC. The data points of every input parameter are distributed over its range. Table 2 lists the range, variance, maximum, minimum, and mean values of the response and explanatory variables. To get accurate and precise results, it is suggested to use the proposed models for the prediction of f ′ c of FGPC only within the prescribed ranges.
It must be noted that many trials were executed to ensure the validity, consistency, and reliability of the dataset. The instances that deviate by about 20% from the global norm were not counted in the development of the models. In total, 298 data points were used to develop the ANN, ANFIS, and GEP models for the prediction of f ′ c of FGPC. The overall dataset is randomly subdivided into two statistically consistent subsets, that is, a training subset (70%, 208 instances) and a validation subset (30%, 90 instances). The training set was used to train the models, and the validation set was used to evaluate and verify their generalization capacity (Gholampour et al., 2017).
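The random 70/30 subdivision can be sketched as follows. The function name, the seed, and the use of NumPy are assumptions for illustration; the study does not state how its split was generated. Truncating 0.7 × 298 to an integer reproduces the reported sizes of 208 training and 90 validation instances.

```python
import numpy as np


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(n_samples * train_fraction)  # truncates 208.6 to 208
    return order[:n_train], order[n_train:]


train_idx, valid_idx = split_indices(298)
print(len(train_idx), len(valid_idx))
```

Checking that both subsets cover similar ranges of each input variable is one simple way to confirm that they are statistically consistent.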

DEVELOPMENT AND EVALUATION OF MACHINE LEARNING MODELS
The first and foremost step in establishing a model with a machine learning algorithm is the selection of the input variables. To develop generalized AI models, the input parameters that most strongly influence the properties of FGPC are chosen. The most influential input variables considered in this study for the compressive strength of FGPC are given in Eq. 9:
f ′ c = f (T, A, M, % S/W, N s /N o , % A G , F/A G , A L /F A , % P, % E W ) (9)

Development of Artificial Neural Network Model
The first step in the establishment of the ANN model is the adjustment of the fitting parameters, which include the number of hidden layers, the number of hidden neurons in each layer, the function used for training the neural network, the number of epochs, and the maximum number of repetitions.

Development of Artificial Neuro-Fuzzy Interface Model
Likewise, before the execution of the ANFIS algorithm, the fitting parameters were provided, which include the function used for the activation of the ANFIS algorithm, the number of epochs, and the maximum number of repetitions.

Development of Gene Expression Programming Model
Three groups of fitting parameters are used in the development of the GEP model: the general model parameters, the numerical constants, and the genetic operators. The general parameters include the population size (that is, the number of chromosomes), the number of genes, the linking function, the head size, and the function set. The numerical constants cover the number of constants used per gene, the data type, and their lower and upper bounds. The genetic operators involve the mutation rate, the transposition rates for the root insertion sequence (RIS) and insertion sequence (IS), and the recombination rate for combining and splitting two chromosomes. To achieve a generalized model, the parameter settings suggested in a previous study were used (Iqbal et al., 2020). Table 4 shows the detailed description of the GEP setting parameters. GeneXproTools was used to run the GEP-based algorithm.

Model Performance Evaluation Criteria
Generally, the coefficient of correlation (R) is used to study the performance of the models. Because R is insensitive to the multiplication or division of the response by a constant, it cannot be used on its own to judge the accuracy and precision of a model (Babanajad et al., 2017). Thus, this study also evaluates the models via several statistical error parameters, that is, the relative squared error (RSE), root mean squared error (RMSE), mean absolute error (MAE), relative root mean squared error (RRMSE), and the performance index (ρ). The performance index (ρ) evaluates the model as a function of both RRMSE and R (Gandomi and Roke, 2015). Equations 10-15 show the mathematical expressions of these statistical error parameters.
In these expressions, y i , x i , ȳ, and x̄ are the i th model output, the i th experimental output, the average model output, and the average experimental output, respectively, and n is the number of instances in the dataset. The best calibrated model is the one that yields lower error statistics and a higher value of R. Researchers have reported that, for a strongly correlated model, the value of R must be greater than 0.8, and for an ideal model, it should be 1. The value of the performance index (ρ) ranges from zero to positive infinity; the closer ρ is to zero, the better the model performance.
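The error statistics can be sketched in a few lines. Equations 10-15 are not reproduced in this copy of the text, so the sketch assumes the definitions commonly used in this literature: RSE as the squared error relative to the variance of the experimental values about their mean, RRMSE as RMSE divided by the mean experimental value, and ρ = RRMSE / (1 + R). The function name and the sample values are hypothetical and are not data from the study.

```python
import numpy as np


def error_metrics(experimental, predicted):
    """Return MAE, RMSE, RSE, RRMSE, R, and the performance index rho."""
    x = np.asarray(experimental, dtype=float)  # x_i: experimental outputs
    y = np.asarray(predicted, dtype=float)     # y_i: model outputs
    residual = y - x
    mae = np.mean(np.abs(residual))
    rmse = np.sqrt(np.mean(residual ** 2))
    rse = np.sum(residual ** 2) / np.sum((x - x.mean()) ** 2)
    rrmse = rmse / abs(x.mean())
    r = np.corrcoef(x, y)[0, 1]
    rho = rrmse / (1.0 + r)
    return {"MAE": mae, "RMSE": rmse, "RSE": rse, "RRMSE": rrmse, "R": r, "rho": rho}


# Illustrative strengths in MPa.
metrics = error_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Under this definition, ρ falls toward zero both when the relative error shrinks and when R approaches 1, which is why it is used as a combined indicator.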

K-Fold Cross-Validation Model
Cross-validation (CV) is a technique generally used to judge the performance and flexibility of a machine learning model, that is, how well the results of a statistical analysis generalize to an independent dataset. There are various types of CV techniques, for example, bootstrapping, the Jack Knife test, the disjoint sets test, the three-way split test, and the Monte Carlo test (Saud et al., 2020). The k-fold cross-validation (CV) is carried out to minimize sampling bias and overfitting.
In this research, the k-fold CV algorithm is used, which is a part of the Jack Knife test. The k-fold CV is a technique used to judge the performance of a model by splitting the whole dataset into "k" equal subsets, of which k-1 subsets are used for training and one subset is held out for validation or testing (Saud et al., 2020). The entire procedure is repeated k times, varying the testing and training data samples. The best model is then chosen by finding the minimal error based on different error statistics. The advantage of CV is that all instances are used for both training and validation of the model, and every instance is used exactly once for validation. The steps involved in the k-fold CV are as follows:
• The whole dataset is split into "k" equal parts, known as folds.
• Among the "k" folds, one fold is chosen for testing and the remaining "k-1" folds are used for training.
• The model is fitted on the training folds and used to predict the test fold. This is repeated for all the folds.
• To identify the best model, the error is estimated via statistical checks such as the correlation coefficient (R) and root mean squared error (RMSE). The coefficient value that corresponds to the lowest error is selected.
Kohavi (1995) reported that the ten-fold CV algorithm provides reliable variance with reduced computational complexity. The whole dataset of 298 instances is divided into ten folds. Nine folds are used for training the models, and one fold is held out to test the best coefficient value provided by the nine folds. The entire process is repeated 10 times, once for each fold into which the data are divided. Among the 10 coefficient values, the one with the minimal RMSE value is chosen. The flow diagram of the whole k-fold CV is shown in Figure 4.
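The ten-fold procedure described above can be sketched with NumPy. Since a GEP run cannot be reproduced in a few lines, an ordinary least-squares linear model stands in for the GEP model here; that substitution, the synthetic data, and all function names are assumptions for illustration only. With 298 instances, ten folds cannot be exactly equal, so the sketch produces eight folds of 30 and two folds of 29.

```python
import numpy as np


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so that each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


def fit_linear(X, y):
    """Stand-in model: least-squares fit with an intercept."""
    design = np.c_[X, np.ones(len(X))]
    return np.linalg.lstsq(design, y, rcond=None)[0]


def predict_linear(coef, X):
    return np.c_[X, np.ones(len(X))] @ coef


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return an array of (R, RMSE) rows, one per held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit(X[train_idx], y[train_idx])
        pred = predict(model, X[test_idx])
        rmse = np.sqrt(np.mean((pred - y[test_idx]) ** 2))
        r = np.corrcoef(pred, y[test_idx])[0, 1]
        scores.append((r, rmse))
    return np.array(scores)


# Synthetic data with the same number of instances as the database.
rng = np.random.default_rng(1)
X = rng.normal(size=(298, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.1, size=298)
scores = cross_validate(X, y, fit_linear, predict_linear, k=10)
best_fold = int(np.argmin(scores[:, 1]))  # fold with the minimal RMSE
```

The spread of R and RMSE across the rows of `scores` is the quantity that indicates how sensitive the model is to the particular choice of training data.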

Performance Assessment of Artificial Neural Network Model
ANN is important in resolving complex engineering problems. It was initially developed to model complicated procedures that are nonlinear in nature. The simulation of the process requires input parameters (garbage in) to predict the output (garbage out). The results of the ANN predictive model are presented in Figure 5. Figure 5A shows that the slopes of the regression lines for the training and validation data points are 0.9715 and 0.9762, respectively, and Figure 5B displays the dispersion of absolute error for the whole dataset used in ANN modeling. The average and maximum percentage errors are 9.83 and 14.67%, respectively. Nearly 80% of the data points have error values less than 10%. The hidden layers of ANN act like a black box, and it is very hard to extract an explicit equation from them.

Performance Assessment of Artificial Neuro-Fuzzy Interface Model
A fuzzy system works on fuzzy reasoning and IF-THEN rules; ANFIS is a more powerful tool that combines fuzzy logic and a neural network. Figure 6A shows that the slope of the regression line in the training and validation stages is 0.9949 and 1.000, respectively, which indicates a strong correlation between the ANFIS predicted values and the experimental values of f ′ c of FGPC. Figure 6B displays the outstanding performance of the ANFIS predictive model. The average and maximum percent deviations between the ANFIS predictions and the experimental values are small, that is, 2.58 and 6.09%. In addition, 95% of the data points show an error of less than 5%, which demonstrates the superiority of the ANFIS model over the ANN model. Furthermore, the frequency of the maximum absolute error is very low.

Gene Expression Programming-Based Empirical Equation
As presented in Figure 7, GEP provides an expression tree (ET) with four sub-ETs, which is translated into an empirical relation for the prediction of the compressive strength (f ′ c ) of FGPC. Table 5 describes the indicators used in the expression tree of the GEP model. Eq. 16 is the final form of the GEP empirical equation that can be used for the prediction of f ′ c of FGPC in MPa. The variables A, B, C, and D, given in Eqs 17-20, are derived from sub-ET-1, sub-ET-2, sub-ET-3, and sub-ET-4, respectively.

Performance Assessment of Gene Expression Programming Model
Figure 8A shows a strong correlation, via the slope of the regression line, between the predicted results of the GEP model and the experimental values. For the training and validation instances, the regression line slope is calculated as 1.000 and 0.9892, respectively. The distribution of absolute error values between experimental and predicted outcomes is shown in Figure 8B. The maximum error percentage and the average percentage of absolute error are quite close, that is, 8.32 and 6.47%, respectively. In comparison with ANN, the occurrence of maximum absolute error values is much lower. In the validation stage, 90% of the GEP predicted values have an error lower than 10%, with an average percent error of less than 5.560%.
ANFIS gives an outstanding performance as compared to GEP, but it fails in providing a flexible and simplistic empirical equation for future use.

Statistics and External Verification of Artificial Neural Network, Artificial Neuro-Fuzzy Interface, and Gene Expression Programming Models
The statistics considered in this study for the error analysis of the training and validation sets of the ANN, ANFIS, and GEP models are shown in Table 6. The results indicate that all three models performed effectively, giving low error values. This shows a robust correlation between the model predictions and the experimental values. Among the three models, ANFIS gives an outstanding performance, followed by the GEP and ANN models. For ANFIS, the MAE, RMSE, RSE, and R equal 3.286, 4.086, 0.294, and 0.9256, respectively, for the training instances, and 2.084, 2.593, 0.0493, and 0.9783 for the validation instances. For GEP, these values are 5.823, 5.971, 0.325, and 0.8586 for the training instances and 2.057, 2.643, 0.0675, and 0.9643 for the validation instances. The consistency of the GEP model depends on the number of data points in the dataset. The literature reveals that, for the development of a reliable and consistent GEP model, the ratio of the total number of instances to the total number of input variables must be at least 3 (Gandomi and Roke, 2015). This study uses a much higher ratio of about 30, which comes from 298 data points and 10 input variables. Thus, as compared to ANN, an accurate and reliable GEP predictive model has been accomplished. In general, GEP modeling is preferred over ANFIS and ANN as it provides an empirical relation between the input variables and the response, whereas ANN and ANFIS fail to provide an empirical relationship because of their complex architectures. As presented in Table 6, the performance index (ρ) for all the predictive models is nearly equal to zero. Thus, the developed GEP equation is reliable and accurate and can be used for the prediction of fresh data lying within the ranges provided in Table 2. The predicted results of all three models are also verified through the statistical checks suggested in the literature.
The slope of the regression line passing through the origin, that is, m ′ or m, must be close to 1 (Aslam et al., 2020). The authors also recommended that the squared coefficient of correlation (through the origin) between the experimental outputs and the model predictions, that is, R²o, or between the model predictions and the experimental outputs, that is, R′²o, must be close to 1. These external verification checks are summarized in Table 7. This indicates that all the predictive models are correct and accurate and do not merely act as a correlation but have genuine predictive capability.
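The checks through the origin can be sketched as follows. The exact expressions behind Table 7 are not reproduced in the text, so this sketch assumes ordinary least-squares lines forced through the origin in both directions and the usual coefficient of determination for each of those fits. Which of the two slopes the paper labels m and which m ′ is not stated here, so descriptive names are used instead; all names and the sample values are hypothetical.

```python
import numpy as np


def origin_fit(u, v):
    """Least-squares slope of v on u through the origin, and R^2 of that fit."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    slope = np.sum(u * v) / np.sum(u ** 2)
    r2 = 1.0 - np.sum((v - slope * u) ** 2) / np.sum((v - v.mean()) ** 2)
    return slope, r2


def external_checks(experimental, predicted):
    """Slopes and R^2 values for both regression directions through the origin."""
    slope_pe, r2_pe = origin_fit(experimental, predicted)  # predicted vs experimental
    slope_ep, r2_ep = origin_fit(predicted, experimental)  # experimental vs predicted
    return {
        "slope_pred_on_exp": slope_pe,
        "r2_pred_on_exp": r2_pe,
        "slope_exp_on_pred": slope_ep,
        "r2_exp_on_pred": r2_ep,
    }


# Illustrative strengths in MPa; values near 1 in all four entries pass the check.
checks = external_checks([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

A model that is merely correlated with the data (for example, predictions that are systematically twice the measured values) gives a high R but slopes far from 1, which is the situation these checks are meant to expose.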

Comparison Between Artificial Neural Network, Artificial Neuro-Fuzzy Interface, and Gene Expression Programming Models
Figure 9 illustrates the comparison of the outputs predicted by the ANN, ANFIS, and GEP models for both the training and validation phases. It shows that all the models can capture the output precisely within an acceptable range of error. The performance index (ρ) and RMSE for the ANFIS model are lower than those of the ANN and GEP models in both the training and validation phases. However, GEP performed better than the ANN model. In the training stage, ρ and RMSE for the ANFIS model are 49% and 32% better than for the GEP model, respectively, and 51% and 32% better than for the ANN model. In the validation stage, ρ and RMSE for the ANFIS model are 22% and 2% better than for GEP, respectively, and 55% and 48% better than for the ANN model. ANFIS is a combination of ANN and fuzzy logic and thus gives an outstanding performance in both the validation and training phases. Generally, the GEP predictive model is ideal as it delivers a simple mathematical equation for future use.
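The RMSE percentages quoted for ANFIS relative to GEP follow directly from the RMSE values reported in the statistics section (training: 4.086 for ANFIS against 5.971 for GEP; validation: 2.593 against 2.643). A short check, with a hypothetical helper name:

```python
def relative_improvement(reference, candidate):
    """Percentage by which the candidate error is lower than the reference error."""
    return 100.0 * (reference - candidate) / reference


# RMSE of GEP (reference) and ANFIS (candidate) as reported in the text.
train_gain = relative_improvement(5.971, 4.086)  # about 32%
valid_gain = relative_improvement(2.643, 2.593)  # about 2%
print(round(train_gain), round(valid_gain))
```

The small validation gain shows that, on unseen data, the GEP equation is nearly as accurate as ANFIS in terms of RMSE.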

K-Fold Cross-Validation of the Gene Expression Programming Model
Validation of the model is of great importance in machine learning, both to test the performance and generalization ability of the model and to assure its accuracy. CV is conducted via the k-fold CV algorithm to verify the robustness, reliability, and effectiveness of the developed GEP model. The fluctuation in the selected statistical measures, that is, R and RMSE, is shown in Figure 10. The maximum, minimum, and mean values of R for the predictive model are 0.9723, 0.8706, and 0.9239, respectively, and those of RMSE are 9.5031, 4.4537, and 6.9605, respectively. The standard deviations of R and RMSE are 0.0328 and 1.5173, respectively. Based on these statistical indicators, the results of the k-fold cross-validation confirm the generalization capacity and accuracy of the predictive model.

Sensitivity and Parametric Analysis
In this study, sensitivity analysis (SA) is conducted using the GEP model outputs. The purpose is to assess the relative contribution of each of the ten explanatory variables used for the estimation of the compressive strength (f ′ c ) of FGPC (Iqbal et al., 2020). The SA expresses the dependence of the model output on the explanatory variables via Eqs 21 and 22.
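Written out from the definitions that follow, and assuming the normalization commonly used for this type of analysis, the two relations can be expressed as below. The index j and the number of inputs n (here n = 10) are notation introduced for this sketch, and the assignment of the two expressions to the equation numbers 21 and 22 is not asserted.

```latex
\begin{align}
N_k  &= f_{\max}(x_k) - f_{\min}(x_k) \\
SA_k &= \frac{N_k}{\sum_{j=1}^{n} N_j} \times 100
\end{align}
```

Under this form, the SA values of all inputs sum to 100%, so each value can be read as a relative contribution.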
Here, x k is the k th input domain, and f min (x k ) and f max (x k ) are the minimum and maximum output values, respectively, obtained over the k th input domain while the other explanatory variables are kept at their mean values. N k represents the range of the k th domain, calculated as the difference between f max (x k ) and f min (x k ). Both the SA and the parametric study were performed using only the training instances, because the training and validation data sets are consistent (Iqbal et al., 2020). Figure 11 displays the results of the SA for the f ′ c of FGPC (Iqbal et al., 2020). In this research, a parametric study is also conducted using the GEP model (Equation 16) to assess the trend of the predicted f ′ c of FGPC with each explanatory variable. The variation in f ′ c is determined by varying one explanatory variable at a time between its minimum and maximum, while all the others are kept constant at their mean values. The results of the parametric study of f ′ c for the GEP model are shown in Figure 12.
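The one-at-a-time procedure described above can be sketched in a few lines. The sketch assumes that the relative contribution of an input is its output range N k divided by the sum of the ranges of all inputs, expressed as a percentage. The model `toy_model`, its three inputs, and the data rows are hypothetical; they do not represent Equation 16 or the database of this study.

```python
import statistics


def sweep(model, data, k, steps=50):
    """Model output while input k runs from its minimum to its maximum,
    with every other input held at its mean value (parametric study)."""
    means = [statistics.fmean(column) for column in zip(*data)]
    low = min(row[k] for row in data)
    high = max(row[k] for row in data)
    outputs = []
    for s in range(steps + 1):
        point = list(means)
        point[k] = low + (high - low) * s / steps
        outputs.append(model(point))
    return outputs


def sensitivity(model, data, steps=50):
    """Relative contribution of each input in percent, assuming
    SA_k = 100 * N_k / sum(N_j) with N_k = f_max(x_k) - f_min(x_k)."""
    ranges = []
    for k in range(len(data[0])):
        outputs = sweep(model, data, k, steps)
        ranges.append(max(outputs) - min(outputs))  # N_k
    total = sum(ranges)
    return [100.0 * n_k / total for n_k in ranges]


def toy_model(x):
    """Hypothetical closed-form model with three inputs (not Equation 16)."""
    return 5.0 + 0.3 * x[0] - 2.0 * x[1] + 0.01 * x[2] ** 2


# Hypothetical training rows: one list of three input values per instance.
data = [
    [40.0, 0.0, 10.0],
    [60.0, 2.0, 12.0],
    [80.0, 4.0, 14.0],
    [100.0, 6.0, 16.0],
]

shares = sensitivity(toy_model, data)
trend_input_1 = sweep(toy_model, data, k=1)
print([round(s, 2) for s in shares])
```

The list returned by `sweep` corresponds to one curve of a parametric plot, while `sensitivity` condenses each curve into a single percentage.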
In working with GPC, the initial curing temperature (T) of the samples is the most critical parameter. Figure 11 supports this and shows that T has a relative contribution of 25.30% to the f ′ c of FGPC. Figure 12 illustrates the trend between the explanatory variables and the response. It shows a linearly   N o ), % A G , (F/A G ), and A is increasing with a different rate.
In the production of FGPC, the alkaline solution releases hydroxides and silicates, which create aluminosilicate polymers. Extra heat is required to expedite the reaction of the solution with the source material and to enhance the f ′ c of FGPC. Figure 12 depicts a rise in f ′ c until the curing temperature reaches 100°C. Joseph and Mathew (2012) reported that curing FGPC at higher temperatures results in a loss of moisture content, even if the specimens are sealed properly. The f ′ c reduces after 240 days because the gel fills up the voids, resulting in the development of a compact and semi-homogeneous structure (Wardhono et al., 2017). Figure 12 shows that f ′ c improves with an increase in the volume of total aggregates; however, the volume of total aggregates is directly linked to the ratio of fine aggregates to total aggregates.
The effects of the molarity (M) of the NaOH solution, the A L /F A ratio, and the N s /N o ratio on the f ′ c of FGPC are linked with each other. However, the amount of Na 2 SiO 3 alters the microstructure and significantly affects the f ′ c of FGPC. Therefore, in the formulation of the Na 2 SiO 3 solution, the silica percentage ratio (%S/W) needs to be higher to achieve a greater f ′ c . A lower A L /F A ratio, in combination with a lower molarity of the NaOH solution and a higher N s /N o ratio, ensures a high f ′ c of FGPC. However, the NaOH solution must be present in an adequate amount to complete the dissolution of the geopolymers. Similar results have been reported in a previous study (Lokuge et al., 2018).
To achieve a highly workable FGPC mix and to prevent cracking, the addition of extra water and plasticizer is required (Nuruddin et al., 2011a). Figure 11 shows that the inclusion of plasticizers and the extra addition of water (E W ) contribute 6.71% and 18.85%, respectively, to f ′ c relative to the other input variables. Figure 12 shows that f ′ c rises with the inclusion of plasticizers and falls with the addition of E W , because E W may cause segregation and bleeding in green concrete if it exceeds a certain limit.
The parametric and sensitivity analyses capture the effect of all the input parameters considered in the development of the machine learning models for the prediction of f ′ c of FGPC. Furthermore, results similar to those in Figure 12 have also been reported by other authors (Nuruddin et al., 2011a; Lokuge et al., 2018).

CONCLUSION
In this study, three AI techniques, namely, ANN, ANFIS, and GEP, are used for estimating the compressive strength (f ′ c ) of FGPC. Ten influential parameters are used as explanatory variables for the prediction of f ′ c of FGPC. The k-fold CV, statistical error checks, and criteria suggested in the literature are considered for the verification of the predictive capability of the models. The statistical measures considered in this study are MAE, RSE, RMSE, RRMSE, R, and the performance index (ρ). These checks verify that the ANFIS predictive model gives an outstanding performance, followed by the GEP and ANN predictive models. In the validation stage, the coefficient of correlation (R) for the ANFIS, GEP, and ANN models is 0.9783, 0.9643, and 0.9314, respectively. All three models also fulfill the external verification criterion suggested in the literature. Generally, the GEP predictive model is the more practical choice, as it delivers a simple, explicit mathematical equation for future use. Furthermore, the k-fold CV of the GEP model verifies the accuracy and robustness of the GEP predictive model. The parametric study is carried out via the proposed GEP expression. This confirms that the GEP model covers the influence of all the explanatory variables used for the prediction of f ′ c of FGPC. Thus, the proposed GEP equation can be used in the preliminary design of FGPC.
However, it is strongly recommended to conduct a leachate study before fly ash (FA) is used as a geopolymer material. This study offers a practical and effective basis for the use of hazardous FA in concrete as an alternative to discarding it in landfills. This would be a step toward viable, sustainable, and efficient construction with reduced greenhouse gas emissions and lower energy consumption. In terms of the disposal cost of FA and carbon credits, it would also benefit the economy of a country.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
MAK: methodology, software, data curation, and writing - original draft; AZ: supervision, funding acquisition, and project administration; FF: investigation, writing - review and editing; MFJ: conceptualization, software, writing - review and editing; RA: resources and validation; HA: visualization and resources; MIK: resources and formal analysis.