Prediction of the compressive strength of tailings backfill using an EO-LightGBM model: performance comparison and feature importance analysis

Tang, Li; Zhang, Xiaoliang

doi:10.3389/feart.2025.1758600

ORIGINAL RESEARCH article

Front. Earth Sci., 18 February 2026

Sec. Geohazards and Georisks

Volume 13 - 2025 | https://doi.org/10.3389/feart.2025.1758600

Li Tang*, Xiaoliang Zhang*

Kunming Metallurgy College, Kunming, Yunnan, China


Abstract

Objectives:

As a green construction material utilizing mine tailings, tailings-based concrete plays a significant role in promoting resource reuse. Its compressive strength is critical for structural safety and long-term durability. Accurate prediction of mechanical properties is essential for mix design and engineering applications.

Methods:

Based on experimental data, this study proposes a predictive model for the 7-day compressive strength of tailings concrete using an EO-LightGBM model, which integrates the Equilibrium Optimizer (EO) with the Light Gradient Boosting Machine (LightGBM). The model’s performance is compared with several traditional machine learning methods, including Linear Regression, Support Vector Regression, Random Forest, GBDT, XGBoost, and LightGBM. The experiments used copper tailings with a weight concentration of 77%–85% and considered variables such as tailings ratio (2.33–9.00) and cement-sand ratio (0.033–0.067). A total of 32 mix designs were developed, and their slump, yield stress, viscosity coefficient, and compressive strength were measured.

Results:

The EO-LightGBM model achieved a mean squared error (MSE) of 0.022, root mean square error (RMSE) of 0.148, and a coefficient of determination (R²) of 0.94 on the test set, outperforming all other models in terms of prediction accuracy and generalization capability. Feature importance analysis showed that weight concentration, tailings ratio, and cement-sand ratio were the most influential factors on compressive strength, while yield stress and viscosity coefficient had relatively minor contributions.

Conclusion:

The proposed model offers a reliable theoretical and technical basis for the design and strength prediction of tailings-based concrete materials.

1 Introduction

Tailings concrete, as a green building material derived from tailings resources, mainly aims to partially or fully replace conventional sand and gravel aggregates with mineral processing tailings to produce concrete with satisfactory mechanical properties, thereby promoting resource reutilization and reducing environmental impacts (Sheng et al., 2024; Xue and Yilmaz, 2022). Among the mechanical properties, compressive strength is a key indicator that directly determines the structural stability and engineering applicability of tailings concrete. However, due to the complex particle gradation, variable chemical composition, and strongly nonlinear interactions during hydration with cementitious materials, accurately predicting the mechanical properties of tailings concrete remains challenging (Li et al., 2023; Hu et al., 2024). Therefore, systematic investigation of the key factors governing compressive strength and the development of high-precision prediction models are of great importance for material optimization and engineering quality control.

Traditional approaches for predicting the mechanical properties of tailings concrete mainly rely on laboratory testing and numerical simulations. Laboratory experiments establish relationships between compressive strength and material composition through various mix designs and mechanical tests (Gao et al., 2022a; Zou et al., 2024). Dong (2020) investigated the mechanical properties and strength prediction of continuously graded tailings–cement concrete. By combining uniaxial compression tests, ultrasonic testing, and acoustic emission monitoring, they analyzed the influence of tailings gradation schemes on mechanical behavior. Their results indicated that different tailings-to-sand ratios significantly affect ultrasonic wave velocity distribution and stratification mechanisms. Based on Talbot gradation theory, an optimized gradation scheme was proposed, and a strength prediction model incorporating wave velocity and density parameters was developed, achieving a correlation coefficient greater than 0.99. Although this model demonstrated high prediction accuracy, such experimental approaches require substantial time and cost investments and cannot practically cover all possible mix combinations, which limits their applicability in large-scale engineering practice. Consequently, there is a strong demand for rapid, efficient, and accurate prediction methods to replace or supplement conventional experimental and simulation techniques.

In recent years, machine learning techniques have achieved remarkable progress in engineering applications, particularly in predicting material mechanical properties. These methods can automatically extract nonlinear feature patterns from experimental data and establish complex mappings between input variables and output responses. Common machine learning approaches include Linear Regression (LR) (Cao et al., 2021), Support Vector Regression (SVR) (Yu et al., 2022), Random Forest (RF) (Qi et al., 2021), Gradient Boosting Decision Trees (GBDT) (Min et al., 2023), and XGBoost (Yuan et al., 2023). These techniques have been widely applied to predict the mechanical behavior of construction materials, composites, and geological materials. For instance, Qi et al. (2018a) proposed an intelligent framework combining machine learning and genetic algorithms to predict the mechanical properties of cemented tailings concrete. Based on experimental data, three models—Decision Tree (DT), Gradient Boosting Machine (GBM), and Random Forest (RF)—were evaluated, with GBM showing the best performance, achieving correlations of 0.963, 0.887, 0.866, and 0.899 for UCS, yield strength, Young’s modulus, and UTS, respectively. The developed intelligent software significantly improved design efficiency while reducing experimental workload and cost. Qi et al. (2018b) further combined Boosted Regression Trees (BRT) with Particle Swarm Optimization (PSO) to predict the strength of recycled waste tailings–cement concrete, demonstrating that PSO effectively optimized BRT hyperparameters and enhanced prediction accuracy.

Among these algorithms, Light Gradient Boosting Machine (LightGBM), an efficient implementation of Gradient Boosting Decision Trees, employs a histogram-based feature discretization strategy that significantly improves computational efficiency while reducing the risk of overfitting. Consequently, LightGBM has attracted increasing attention in engineering prediction tasks. Zhang B. et al. (2022) integrated experimental data and machine learning to investigate the compressive strength (UCS) of cemented tailings concrete. Using 180 experimental datasets, they trained four neural network models—BPNN, RBFNN, GRNN, and LSTM—and introduced a correction coefficient to reduce discrepancies between laboratory predictions and engineering UCS. By combining GWO with LSTM, they successfully predicted the UCS of 153 tailings concrete samples under various gradation conditions and curing ages, providing an effective intelligent prediction approach for mine backfilling applications.

Fatah et al. (2024) applied machine learning techniques to study the stabilization of contaminated mine sludge (CMS) and to predict unconfined compressive strength (UCS) and heavy metal leaching behavior. Three tree-based models—XGBoost, Decision Tree (DT), and LightGBM—were trained and tested using 337 datasets. Monte Carlo simulations and ten-fold cross-validation were employed to enhance model robustness. The results indicated that XGBoost achieved the highest prediction accuracy, whereas LightGBM exhibited relatively lower performance. Importantly, the study highlighted that LightGBM performance is highly sensitive to hyperparameter selection, including learning rate, maximum tree depth, number of leaf nodes, and L1/L2 regularization parameters.

Conventional hyperparameter optimization methods, such as Grid Search and Bayesian Optimization, often suffer from high computational cost in high-dimensional search spaces and may struggle to identify globally optimal solutions. In recent years, intelligent optimization algorithms, including Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Equilibrium Optimizer (EO), have demonstrated strong capability in machine learning hyperparameter tuning. EO is a recently developed optimization algorithm inspired by mass balance principles, featuring adaptive switching between global exploration and local exploitation, thereby reducing the risk of premature convergence and showing excellent performance in high-dimensional optimization problems (Meng et al., 2024; Hou et al., 2023). Accordingly, integrating EO with LightGBM to optimize hyperparameter selection and enhance prediction accuracy constitutes the central objective of this study.

In response to the above challenges, the contributions of this study are mainly reflected in two aspects. First, from a modeling perspective, an EO-LightGBM hybrid framework is proposed to construct a compressive strength prediction model that combines intelligent global search with gradient boosting learning. By leveraging the convergence and exploration capabilities of EO, key LightGBM hyperparameters are effectively optimized, reducing reliance on empirical parameter selection and improving model stability and generalization performance. Second, regarding data acquisition and variable analysis, 32 distinct mix designs were prepared using copper tailings with weight concentrations ranging from 77% to 85%. Systematic experiments were conducted to obtain slump, yield stress, viscosity coefficient, and 7-day compressive strength data. Through visualization analysis and feature importance evaluation, critical variables—such as weight concentration, tailings-to-sand ratio, and cement-to-tailings ratio—were identified as dominant factors governing mechanical performance, providing quantitative guidance for tailings concrete mix design.

Overall, the EO-LightGBM-based strength prediction model developed in this study integrates high accuracy, computational efficiency, and robust generalization capability. It overcomes limitations associated with traditional experimental and empirical modeling approaches and offers a promising technical framework for tailings resource utilization and performance prediction of concrete materials.

2 Experimental materials and design

To construct a reliable prediction model, the material characteristics and mechanical performance data of tailings concrete must first be obtained through systematic experiments. This section introduces the materials used in the experiments, the mix design, and the relevant testing methods.

2.1 Waste tailings

The waste copper mine tailings used in this study were obtained from the tailings generated during production at a mining area. The tailings are a gray-brown powder, with particle sizes mainly between 0.075 and 2 mm and a high proportion of fine particles, as shown by the particle size distribution in Figure 1. The physical properties of the copper mine tailings were determined experimentally: density 2.65 g/cm³, loose bulk density 1.52 g/cm³, compacted bulk density 1.87 g/cm³, loose porosity approximately 42.6%, and compacted porosity 29.4%. The specific results are listed in Table 1. The chemical composition was analyzed using X-ray fluorescence (XRF), which showed that the main components of the tailings are SiO₂ (62.70%), Al₂O₃ (11.30%), and CaO (9.20%), followed by Fe₂O₃ (6.40%), MgO (1.82%), K₂O (0.75%), TiO₂ (0.52%), and other oxides. The specific chemical composition is listed in Table 2.

FIGURE 1

TABLE 1

Density (g/cm³)	Loose bulk density (g/cm³)	Compacted bulk density (g/cm³)	Loose porosity (%)	Compacted porosity (%)
2.65	1.48	1.87	44.15	29.43

Physical property parameters of copper mine tailings.
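
The porosity columns of Table 1 can be cross-checked against the density columns. The sketch below assumes the usual relation porosity = (1 − bulk density / particle density) × 100; the source does not state which relation was used, and the function name is illustrative.

```python
def porosity_percent(bulk_density: float, particle_density: float) -> float:
    """Void fraction (%) implied by a bulk density and the particle density."""
    return (1.0 - bulk_density / particle_density) * 100.0


# Values taken from Table 1 (g/cm^3).
density = 2.65
loose_bulk = 1.48
compacted_bulk = 1.87

loose_porosity = porosity_percent(loose_bulk, density)
compacted_porosity = porosity_percent(compacted_bulk, density)
print(round(loose_porosity, 2), round(compacted_porosity, 2))  # 44.15 29.43
```

Under this assumption the computed values match the tabulated porosities to two decimal places.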

TABLE 2

SiO₂	Al₂O₃	CaO	Fe₂O₃	MgO	K₂O	TiO₂	Na₂O	Other
66.52	13.21	7.84	4.51	1.82	1.24	0.83	0.52	4.71

Chemical composition analysis of copper mine tailings (%).

2.2 Cement

The cement used in this study is Ordinary Portland Cement (P·O 42.5), which appears as a gray powder. The initial setting time is approximately 135 min, and the final setting time is 195 min. The specific surface area is 350 m²/kg, with a compressive strength (28 days) of 46.2 MPa and a tensile strength (28 days) of 8.1 MPa. The specific mechanical performance parameters are listed in Table 3. According to XRF testing, the main chemical component of the cement is CaO, accounting for 63.15%, followed by SiO₂ at 21.60%. Additionally, it contains Al₂O₃ (5.12%), Fe₂O₃ (3.42%), MgO (2.35%), and small amounts of SO₃, K₂O, Na₂O, and other oxides. The chemical composition is listed in Table 4.

TABLE 3

Initial setting time (min)	Final setting time (min)	Specific surface area (m²/kg)	Compressive strength (MPa)	Tensile strength (MPa)
135	195	350	46.2	8.1

Basic mechanical properties of Portland cement (P·O 42.5).

TABLE 4

CaO	SiO₂	Al₂O₃	Fe₂O₃	MgO	SO₃	K₂O	Na₂O	Other
63.15	21.60	5.12	3.42	2.35	1.82	0.65	0.30	1.59

Chemical composition analysis of Portland cement (P·O 42.5) (Unit:%).

2.3 Tailings concrete mix design

To identify the optimal mix design for waste copper mine tailings concrete, 32 mix designs were prepared in this experiment. The specific mix parameters are listed in Table 5. The mix design considers three key factors: weight concentration, tailings-to-throw ratio (total tailings/dry throw tailings), and cement-to-tailings ratio (cement/tailings). The mechanical properties under the different mix conditions are analyzed and discussed in the following sections.

TABLE 5

Experiment number	Weight concentration (%)	Tailings-to-throw ratio	Cement-to-tailings ratio	Total tailings content (kg/m³)
1	77	9.00	0.033	1,356.48
2	77	5.67	0.040	1,266.61
3	77	4.00	0.050	1,175.16
4	77	2.33	0.067	1,002.82
5	78	9.00	0.033	1,383.96
6	78	5.67	0.040	1,308.22
7	78	4.00	0.050	1,188.09
8	78	2.33	0.067	1,044.48
9	79	9.00	0.033	1,407.75
10	79	5.67	0.040	1,302.52
11	79	4.00	0.050	1,257.02
12	79	2.33	0.067	1,081.47
13	80	9.00	0.033	1,423.50
14	80	5.67	0.040	1,357.48
15	80	4.00	0.050	1,282.27
16	80	2.33	0.067	1,116.43
17	80	9:1	0	1,514.97
18	80	8:2	0	1,331.13
19	81	9:1	0	1,555.37
20	81	8:2	0	1,366.22
21	81	7:3	0	1,181.48
22	81	6:4	0	1,001.01
23	82	9:1	0	1,596.92
24	82	8:2	0	1,402.27
25	82	7:3	0	1,212.29
26	82	6:4	0	1,026.80
27	83	8:2	0	1,439.34
28	83	7:3	0	1,243.94
29	83	6:4	0	1,053.28
30	84	6:4	0	1,080.49
31	84	7:3	0	1,276.47
32	85	6:4	0	1,108.45

Slurry test parameters.

2.4 Experimental testing

2.4.1 Fluidity test

To characterize the flowability of tailings slurry under different mix conditions, the slump cone test method was used in this study, following the GB/T 2419–2005 standard. During the experiment, freshly mixed tailings slurry was placed into the slump cone in three layers, with each layer tamped evenly 25 times using a tamping rod (Liu et al., 2023; Fu et al., 2021). After filling, the slump cone was lifted, and the slurry was allowed to fall freely under its own weight. The maximum diameter of the spread after the slump was measured and used to evaluate the flowability of the slurry. The experiment is shown in Figure 2.

FIGURE 2

2.4.2 Rheological test

The rheological properties of the slurry were measured using an R/S-type four-blade paddle rotational rheometer. The yield stress was measured using the controlled shear rate (CSR) method. The shear rate was ramped from 0 s⁻¹ to 120 s⁻¹ and back to 0 s⁻¹, with a stirring time of 2 min, and the Bingham model was used for fitting (Gao et al., 2022b). In this way, the yield stress and viscosity coefficient of the slurry were obtained for the 32 combinations of weight concentration, tailings-to-throw ratio (total tailings/dry throw tailings), and cement-to-tailings ratio (cement/(dry throw tailings + total tailings)).
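
As an illustration of the Bingham fit, the model τ = τ₀ + μ·γ̇ is linear in the shear rate, so the yield stress τ₀ and viscosity coefficient μ can be estimated by ordinary least squares. The sketch below is a minimal version under that assumption; the flow-curve points are synthetic, not measured data, and the rheometer software may use a different fitting procedure.

```python
def fit_bingham(shear_rates, shear_stresses):
    """Least-squares fit of tau = tau0 + mu * gamma_dot; returns (tau0, mu)."""
    n = len(shear_rates)
    mean_x = sum(shear_rates) / n
    mean_y = sum(shear_stresses) / n
    sxx = sum((x - mean_x) ** 2 for x in shear_rates)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(shear_rates, shear_stresses))
    mu = sxy / sxx                 # slope: viscosity coefficient (Pa·s)
    tau0 = mean_y - mu * mean_x    # intercept: yield stress (Pa)
    return tau0, mu


# Synthetic flow curve (hypothetical values): tau0 = 130 Pa, mu = 0.5 Pa·s.
rates = [20.0, 40.0, 60.0, 80.0, 100.0, 120.0]    # shear rate, 1/s
stresses = [130.0 + 0.5 * r for r in rates]       # shear stress, Pa

tau0, mu = fit_bingham(rates, stresses)
print(tau0, mu)
```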

2.4.3 Mechanical properties test

The tests were conducted according to the Chinese standard GB/T 50107–2019. A total of 96 samples were prepared (32 different mixtures, with each mixture repeated 3 times), and their compressive strength was tested with a hydraulic servo testing machine. Compressive strength is typically measured according to the ASTM standard, which recommends cylindrical samples. However, because cubic samples are easier to cast and demold, this study used cubic samples. The testing machine applied load until the specimen failed, and the failure load was recorded to calculate the compressive strength (Ruan et al., 2023; Balasooriya, 2023). The calculation formula is shown in Equation 1:

$$f_c = \frac{F}{A} \tag{1}$$

In the formula, f_c is the compressive strength (MPa); F is the failure load (N); and A is the cross-sectional area of the specimen (mm²). With the materials and experimental design outlined above, this study obtained rheological and strength data for 32 different mix designs (a total of 96 sets of data). These data serve as the foundation for the subsequent model training and analysis.
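
A short worked example of the strength calculation: the failure load in newtons divided by the loaded area in mm² gives MPa directly. The cube side length and the three failure loads below are hypothetical, since the source does not report the specimen dimensions or the raw loads.

```python
def compressive_strength(failure_load_n: float, side_mm: float) -> float:
    """f_c = F / A in MPa (N/mm^2) for a cube loaded on one face."""
    return failure_load_n / (side_mm * side_mm)


# Assumed cube side length (mm) and hypothetical failure loads (N)
# for the three replicates of a single mix design.
SIDE_MM = 70.7
loads = [5200.0, 5000.0, 5400.0]

strengths = [compressive_strength(f, SIDE_MM) for f in loads]
mean_strength = sum(strengths) / len(strengths)
print([round(s, 3) for s in strengths], round(mean_strength, 3))
```

Averaging the three replicates gives one representative strength per mix design.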

3 Machine learning model

Based on the complete experimental data, constructing a sound prediction model is key to evaluating the performance of tailings concrete. This section introduces the modeling process of the EO-LightGBM model, the parameter optimization strategy, and the performance evaluation methods.

3.1 LightGBM model

LightGBM (Light Gradient Boosting Machine) is an efficient machine learning algorithm based on Gradient Boosting Decision Trees (GBDT), which achieves strong ensemble learning performance through the gradient boosting framework (Wang et al., 2024; Lu et al., 2024). It uses a histogram-based method for decision tree splitting and improves computational efficiency and prediction accuracy through a leaf-wise growth strategy. The core idea of GBDT is to train multiple weak learners iteratively, with each new model fitting the residuals of the previous round, so that the overall model gradually approaches the optimal solution. Given the dataset D = {(x_i, y_i)}, i = 1, …, n, where x_i represents the input features and y_i represents the true labels, GBDT iteratively updates the model as shown in Equation 2:

$$F_t(x) = F_{t-1}(x) + \eta\, h_t(x) \tag{2}$$

In the formula, F_t(x) represents the cumulative prediction model at the t-th round; h_t(x) represents the weak learner (regression tree) trained in the t-th round; and η is the learning rate, which controls the model’s update magnitude. In each iteration, GBDT computes the gradient of the loss function and fits the negative gradient values with a new regression tree, as shown in Equation 3:

$$r_i^{(t)} = -\left[\frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)}\right]_{F = F_{t-1}} \tag{3}$$

In the formula, L(y_i, F(x_i)) represents the loss function. The commonly used squared loss, L = ½ (y_i − F(x_i))², measures the prediction error; for this loss, the negative gradient reduces to the residual y_i − F_{t−1}(x_i).
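
The boosting update described above can be sketched in a few lines. This is a minimal illustration, assuming squared loss (so the negative gradient equals the residual), a single input feature, and depth-1 regression stumps as weak learners. It is not LightGBM's implementation, and the data points are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regression stump under squared error."""
    best = None
    for thr in sorted(set(xs))[:-1]:          # exclude max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def gradient_boost(xs, ys, rounds=50, eta=0.1):
    """F_t(x) = F_{t-1}(x) + eta * h_t(x), with h_t fitted to the residuals."""
    base = sum(ys) / len(ys)                  # F_0: constant prediction
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]   # negative gradient
        h = fit_stump(xs, residuals)
        stumps.append(h)
        preds = [p + eta * h(x) for p, x in zip(preds, xs)]
    return lambda x: base + eta * sum(h(x) for h in stumps)


# Hypothetical data: weight concentration (%) vs. strength (MPa).
xs = [77, 78, 79, 80, 81, 82, 83, 84, 85]
ys = [0.5, 0.6, 0.8, 0.9, 1.1, 1.3, 1.4, 1.6, 1.9]

model = gradient_boost(xs, ys)
mean_y = sum(ys) / len(ys)
mse_mean = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_boost = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse_mean, mse_boost)
```

Each round shrinks the training error relative to the constant baseline, and a smaller learning rate η trades slower convergence for smoother updates.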

LightGBM improves on traditional GBDT in both the splitting strategy and the training process, through histogram-based splitting, efficient data storage, a leaf-wise growth strategy, and native handling of categorical features. For feature partitioning, LightGBM discretizes continuous features into a fixed number of bins, where each bin represents a range of feature values. This reduces memory usage and accelerates computation. For a feature x, LightGBM constructs a histogram H as shown in Equation 4:

$$H_j = \sum_{i:\, x_i \in B_j} g_i, \qquad H_j' = \sum_{i:\, x_i \in B_j} h_i \tag{4}$$

In the formula, g_i and h_i represent the first and second-order gradients of the objective loss function, respectively; B_j represents the range of feature values in the jth bin; H_j and H_j′ represent the sum of the first and second-order gradients of all samples in that bin.

LightGBM selects the best split point by calculating the gain from the histogram. The gain calculation formula is shown in Equation 5:

$$\text{Gain} = \frac{H_L^2}{H_L'} + \frac{H_R^2}{H_R'} - \frac{H_T^2}{H_T'} \tag{5}$$

In the formula, H_L and H_L′ represent the sums of the first and second-order gradients of the left child node, respectively; H_R and H_R′ represent the sums of the first and second-order gradients of the right child node, respectively; H_T and H_T′ represent the sums of the first and second-order gradients of the current node.
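
The histogram construction and split search can be illustrated as follows. This is a sketch under stated assumptions: equal-width bins, squared loss with an initial prediction of zero (so g_i = −y_i and h_i = 1), and an unregularized gain of the form H_L²/H_L′ + H_R²/H_R′ − H_T²/H_T′. The function names and data are hypothetical, and LightGBM's actual binning and gain computation include further details.

```python
def build_histogram(values, grads, hess, n_bins, lo, hi):
    """Accumulate per-bin gradient sums H_j and Hessian sums H_j'."""
    width = (hi - lo) / n_bins
    H = [0.0] * n_bins
    Hp = [0.0] * n_bins
    for v, g, h in zip(values, grads, hess):
        j = min(int((v - lo) / width), n_bins - 1)   # bin index of this sample
        H[j] += g
        Hp[j] += h
    return H, Hp


def best_split(H, Hp):
    """Scan bin boundaries; return (last bin of left child, gain)."""
    HT, HTp = sum(H), sum(Hp)
    best_j, best_gain = None, 0.0
    HL = HLp = 0.0
    for j in range(len(H) - 1):
        HL += H[j]
        HLp += Hp[j]
        HR, HRp = HT - HL, HTp - HLp
        if HLp == 0 or HRp == 0:
            continue                                  # skip empty children
        gain = HL ** 2 / HLp + HR ** 2 / HRp - HT ** 2 / HTp
        if gain > best_gain:
            best_j, best_gain = j, gain
    return best_j, best_gain


# Hypothetical feature values and targets with a clear step between 80 and 81.
values = [77.0, 78.0, 79.0, 80.0, 81.0, 82.0, 83.0, 84.0]
targets = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
grads = [-y for y in targets]      # gradient of 1/2 (y - F)^2 at F = 0
hess = [1.0] * len(targets)        # second derivative of the squared loss

H, Hp = build_histogram(values, grads, hess, n_bins=8, lo=77.0, hi=85.0)
split_bin, gain = best_split(H, Hp)
print(split_bin, gain)  # 3 8.0
```

Because only bin boundaries are scanned, the cost of the split search depends on the number of bins rather than the number of samples.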

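During tree growth, LightGBM splits the leaf node with the maximum gain at each step (leaf-wise growth), which can produce deeper and less balanced trees than level-wise growth. The leaf selection rule is shown in Equation 6, where 𝓛 denotes the set of current leaf nodes:

$$l^{*} = \arg\max_{l \in \mathcal{L}} \text{Gain}(l) \tag{6}$$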

This strategy allows for faster loss reduction, improves the model’s expressiveness, and avoids unnecessary computations. LightGBM adds L1 and L2 regularization terms to the objective function to prevent overfitting. The regularized objective function is shown in Equation 7 (Liu et al., 2023):

$$\text{Obj} = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \lambda_1 \sum_{j} |w_j| + \lambda_2 \sum_{j} w_j^2 \tag{7}$$

In the formula, w_j denotes the leaf weights; λ₁ Σ|w_j| is the L₁ regularization term, which increases sparsity; λ₂ Σ w_j² is the L₂ regularization term, which prevents the parameters from becoming too large and causing overfitting. Feature importance is computed from the splits that use each feature, as shown in Equation 8:

$$\text{Importance}_j = \sum_{t \in T_j} \text{Gain}_t \tag{8}$$

In the formula, T_j represents the set of all tree nodes that use feature j for splitting; Gain_t is the gain value of node t. Finally, features with lower importance scores can be pruned to improve training efficiency. Training typically terminates under one of the following conditions: the number of training rounds reaches the maximum iteration limit T, or the change in the loss function is smaller than the set threshold ϵ, as shown in Equation 9:

$$\left| L_t - L_{t-1} \right| < \epsilon \tag{9}$$

Here L_t denotes the value of the loss function after round t.

3.2 Equilibrium Optimizer algorithm (EO)

The Equilibrium Optimizer (EO) algorithm was proposed by Faramarzi et al. in 2020. Its basic idea is to perform a global optimization search by simulating the phenomenon of mass balance (Wang et al., 2021; Micev et al., 2021). The design of EO is based on the mass balance equation of a control volume, which allows it to search the solution space effectively and avoid getting trapped in local optima. The search process moves the optimization variables towards the target equilibrium state. The optimization problem is to minimize an objective function, as shown in Equation 10:

$$\min_{X \in \Omega} f(X) \tag{10}$$

In the formula, X = [x₁, x₂, …, x_n] represents a candidate solution in the search space; Ω is the feasible solution domain. EO introduces multiple equilibrium solutions X_eq, representing the optimal solutions that may be reached during the search process (Zhang et al., 2023). The equilibrium solutions are calculated based on the historical best solutions and updated using statistical information from the search population, defined as shown in Equation 11:

$$X_{eq} = \sum_{i=1}^{N} w_i X_i \tag{11}$$

In the formula, N represents the number of equilibrium solutions; w_i is the weight coefficient; and X_i represents the best solution selected from the current population.

The core of the algorithm is to update the position of each solution through the mass balance equation. EO establishes an optimization model based on equilibrium states, in which the update of the solution X_t at iteration t is assumed to satisfy Equation 12:

In the formula, r₁ and r₂ are random numbers that control the search step size and search direction; F is the search factor, defined as shown in Equation 13:

In the formula, λ is the control parameter; U_B and L_B represent the upper and lower bounds of the variable, respectively; r₃ is a random disturbance factor that enhances the diversity of the search. The weights are calculated using a fitness ranking method to ensure a more stable optimization search, as shown in Equation 14:

In the formula, f(X_best) represents the optimal fitness value in the current population. The balance factor G is introduced to control the search intensity, as shown in Equation 15:

In the formula, G₀ is the initial balance factor; T is the maximum number of iterations; and t is the current iteration number. The optimization process typically stops when the maximum number of iterations T is reached, or when the convergence condition in Equation 16 is met:

$$\left| f(X_{t+1}) - f(X_t) \right| < \epsilon \tag{16}$$

In the formula, ϵ is the convergence threshold, indicating that the change in the objective function is sufficiently small.
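
For orientation, the sketch below implements a simplified version of the widely used Equilibrium Optimizer formulation of Faramarzi et al. (an equilibrium pool of the four best solutions plus their average, an exponential term F, and a generation term G). It does not reproduce the specific weighting scheme of Equations 11–15 in this section, it omits the per-particle memory step of the original algorithm, and the parameter defaults (a1 = 2, a2 = 1, GP = 0.5) and the toy objective are assumptions.

```python
import math
import random


def equilibrium_optimizer(f, lower, upper, n_particles=20, max_iter=200,
                          seed=0, a1=2.0, a2=1.0, gp=0.5):
    """Minimise f over the box [lower, upper]; returns (best_x, best_f)."""
    rng = random.Random(seed)
    dim = len(lower)
    pop = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    pool = []                                   # best (fitness, position) pairs so far
    for it in range(max_iter):
        scored = [(f(x), x[:]) for x in pop]
        pool = sorted(pool + scored, key=lambda p: p[0])[:4]
        avg = [sum(p[1][d] for p in pool) / len(pool) for d in range(dim)]
        candidates = [p[1] for p in pool] + [avg]
        # Time factor: close to 1 early (exploration), shrinking towards 0 late.
        t = (1.0 - it / max_iter) ** (a2 * it / max_iter)
        for x in pop:
            ceq = rng.choice(candidates)        # random equilibrium candidate
            r1, r2 = rng.random(), rng.random()
            gcp = 0.5 * r1 if r2 >= gp else 0.0  # generation rate control
            for d in range(dim):
                lam = rng.random() or 1e-12     # turnover rate, kept non-zero
                sign = math.copysign(1.0, rng.random() - 0.5)
                F = a1 * sign * (math.exp(-lam * t) - 1.0)
                G = gcp * (ceq[d] - lam * x[d]) * F
                new = ceq[d] + (x[d] - ceq[d]) * F + (G / lam) * (1.0 - F)
                x[d] = min(max(new, lower[d]), upper[d])   # keep inside bounds
    final = sorted(pool + [(f(x), x[:]) for x in pop], key=lambda p: p[0])
    return final[0][1], final[0][0]


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_val = equilibrium_optimizer(sphere, [-5.0, -5.0], [5.0, 5.0])
print(best_x, best_val)
```

In a hyperparameter search, `f` would be replaced by a validation error and each position component by a (scaled) hyperparameter.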

3.3 Model evaluation methods

3.3.1 Machine learning model evaluation

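The predictive performance of the proposed method is evaluated using three metrics: Mean Squared Error (MSE), Root Mean Square Error (RMSE), and the correlation coefficient (R) (Sun et al., 2023; Zhang et al., 2020). The formulas are shown in Equations 17–19:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2 \tag{17}$$

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{18}$$

$$R = \frac{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2 \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}} \tag{19}$$

In the formula, ŷ_i and y_i represent the predicted and actual measured values, respectively; n is the number of data samples; the barred ŷ is the average predicted value; and ȳ is the average actual value.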
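
The three metrics can be computed directly from their definitions. The sketch below assumes R is the Pearson correlation between predicted and measured values, as the definitions of the two averages suggest; the sample values are hypothetical.

```python
import math


def mse(pred, actual):
    """Mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def rmse(pred, actual):
    """Root mean square error."""
    return math.sqrt(mse(pred, actual))


def pearson_r(pred, actual):
    """Correlation coefficient between predicted and measured values."""
    n = len(actual)
    mean_p, mean_a = sum(pred) / n, sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(pred, actual))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


# Hypothetical strengths (MPa): every prediction is 0.5 MPa too high.
actual = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]

print(mse(pred, actual), rmse(pred, actual), pearson_r(pred, actual))
```

In this example a constant offset leaves R at 1 while MSE and RMSE are non-zero, so R is best read together with the error metrics.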

3.3.2 K-fold cross-validation

Regarding the calibration of regression models, early studies applied methods such as the simple substitution method, the boosting method, the holdout method, and the bootstrap method (Yan and Shi, 2024). Among the methods for validating training data, this study uses the most widely adopted one, k-fold cross-validation (CV), with k set to 5. During hyperparameter tuning, the training dataset is therefore split into five folds. The algorithm is trained on four folds and validated on the remaining fold. This process is repeated 5 times, each time using a different fold as the validation set. The final result is determined by the fold that yields the smallest error. The CV process is summarized in Figure 3.
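
The fold construction can be sketched without any library. The example assumes a shuffled split and a training set of 76 samples (roughly 80% of the 96 data sets); that size, the seed, and the function name are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]       # k nearly equal-sized folds
    for i in range(k):
        val = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val


splits = list(k_fold_indices(76, k=5))
for train, val in splits:
    print(len(train), len(val))
```

Every sample appears in exactly one validation fold, so each of the five validation errors is computed on data the model did not see during that round.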

FIGURE 3

4 Construction of the EO-LightGBM model for predicting the mechanical properties of tailings concrete

This study proposes an EO-LightGBM-based mechanical property prediction model for tailings concrete, and the overall workflow is illustrated in Figure 4. The model dataset is constructed using experimental data obtained from tailings concrete mix design parameters, physical and chemical characteristics, and mechanical test results. The complete dataset is divided into a training set and a test set with an 8:2 ratio, where the training set is used for model development and the test set is reserved for independent performance evaluation.

FIGURE 4

During the training stage, the training data are first input into the LightGBM model, which employs gradient boosting decision trees to learn the nonlinear relationship between input features and the compressive strength of tailings concrete. To further enhance predictive performance, the Equilibrium Optimizer (EO) is introduced to optimize the hyperparameters of the LightGBM model. The optimized hyperparameters mainly include the learning rate, number of decision trees, maximum tree depth, number of leaf nodes, and other key complexity-related parameters. The EO algorithm is inspired by mass balance principles and performs an adaptive search within the predefined hyperparameter space. An objective function based on the mean squared error is defined to minimize the discrepancy between predicted and measured strength values, thereby constructing the optimized EO-LightGBM model.
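
One way to connect a continuous optimizer to LightGBM's hyperparameters is to let each particle position live in the unit hypercube and decode it into a parameter dictionary. The sketch below is illustrative only: the search bounds are hypothetical (the source does not list the ranges used), the parameter names follow LightGBM's scikit-learn interface, and `cv_mse` is a placeholder for a user-supplied function that trains the model with five-fold cross-validation and returns the mean squared error.

```python
# Hypothetical search bounds: (lower, upper, type) per hyperparameter.
SEARCH_SPACE = {
    "learning_rate": (0.01, 0.3, float),
    "n_estimators": (50, 500, int),
    "max_depth": (2, 10, int),
    "num_leaves": (4, 64, int),
}


def decode(position):
    """Map a position in [0, 1]^d to a LightGBM-style parameter dict."""
    params = {}
    for u, (name, (lo, hi, kind)) in zip(position, SEARCH_SPACE.items()):
        u = min(max(u, 0.0), 1.0)               # clamp to the unit interval
        value = lo + u * (hi - lo)
        params[name] = int(round(value)) if kind is int else value
    return params


def objective(position, cv_mse):
    """Fitness minimised by the optimizer: CV error of the decoded parameters."""
    return cv_mse(decode(position))


print(decode([0.5, 0.5, 0.5, 0.5]))
```

Rounding integer-valued parameters inside `decode` keeps the optimizer itself unaware of which dimensions are discrete.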

To assess the model’s generalization capability and reduce the risk of overfitting, five-fold cross-validation is employed during the training process. After the optimization and training procedures are completed, the optimal model is retained and subsequently evaluated using the test dataset. In addition to the proposed EO-LightGBM model, several conventional machine learning models are also trained for comparative analysis. Model performance is quantitatively assessed using evaluation metrics including mean squared error, root mean square error, and the correlation coefficient, in order to verify the effectiveness and comparative superiority of the EO-LightGBM model in predicting the mechanical properties of tailings concrete.

4.1 Analysis of the characteristics of the experimental data set

4.1.1 Basic mechanical property analysis

4.1.1.1 Flowability analysis

The flowability of tailings concrete is a key factor affecting both construction performance and structural stability. It is typically measured using the slump test. A higher slump value indicates better flowability of the slurry, but an excessively high slump may lead to segregation and settling issues, while an excessively low slump could affect the filling density and pumping performance. As shown in Figure 5, the slump of tailings concrete was measured under different combinations of weight concentration, tailings-to-throw ratio, and cement-to-tailings ratio.

FIGURE 5

The experimental results indicate that the slump of tailings concrete decreases markedly with increasing weight concentration. At a weight concentration of 77%, the average slump ranges from 27.2 cm to 28.3 cm, reflecting good flowability. When the concentration increases to 79% and above, the slump is reduced to approximately 10.0 cm and 7.0 cm, respectively. This pronounced reduction can be attributed to the decrease in free water content in high-concentration slurries, accompanied by intensified inter-particle friction and increased cohesiveness and internal resistance, which collectively lead to a significant loss of flowability (Johansson et al., 2024; Duan et al., 2022). Examination of the mix designs further shows that higher concentration corresponds to a substantial increase in tailings content per unit volume. For instance, in Group 9 (79%), the tailings content reaches 1,407.75 kg/m³, compared with 1,356.48 kg/m³ in Group 1 (77%). The resulting increase in particle volume fraction effectively lowers the water-to-cement ratio, producing a denser slurry structure and further restricting flowability.

The tailings-to-throw ratio also exerts a notable influence on slump behavior. When the tailings-to-throw ratio is relatively low (e.g., 2.33), the measured slump is considerably higher than that observed at higher ratios (e.g., 9.00). This behavior suggests that a lower tailings-to-throw ratio increases the proportion of fine particles and the availability of free water, thereby enhancing slurry flowability. In contrast, a higher tailings-to-throw ratio increases the proportion of coarse particles, resulting in a more compact internal structure and reduced workability. Similarly, an increase in the cement-to-tailings ratio leads to a gradual decrease in slump. When the cement-to-tailings ratio rises from 0.033 to 0.067, the average slump decreases by approximately 2.0 cm. The higher cement content increases slurry viscosity and promotes particle flocculation, while the early formation of hydration-product skeletons restricts particle rearrangement and flow, ultimately leading to a noticeable deterioration in workability.

4.1.1.2 Rheological property analysis

Yield stress and viscosity coefficient are important parameters that characterize the rheological properties of the slurry. Yield stress reflects the slurry’s resistance to shear, and a higher yield stress may lead to increased delivery difficulty, while a lower yield stress may cause slurry segregation and layering. The viscosity coefficient measures the internal friction characteristics of the slurry, with a higher viscosity coefficient indicating greater flow resistance, which affects the filling efficiency. As shown in Figure 6, the yield stress and viscosity coefficient under different mix conditions are summarized.

FIGURE 6

As the weight concentration increases, both the yield stress and viscosity coefficient of tailings concrete exhibit clear increasing trends. At a weight concentration of 77%, the yield stress ranges from 40.55 Pa to 133.46 Pa, and the viscosity coefficient ranges from 0.3981 Pa s to 0.7369 Pa s. When the concentration exceeds 80%, the yield stress reaches a maximum of 382.47 Pa and the viscosity coefficient increases to 2.7021 Pa s. This behavior indicates that higher weight concentration increases the amount of tailings per unit volume in the slurry (e.g., 1,423.50 kg/m³ in experimental Group 13 compared with 1,356.48 kg/m³ in Group 1). Consequently, the packing density of solid particles increases, inter-particle contacts become more frequent and stronger, and internal friction and bonding effects are intensified, which collectively elevate the shear resistance and flow resistance of the slurry.

Variations in the tailings-to-throw ratio also substantially affect rheological parameters. When the tailings-to-throw ratio is high (e.g., 9:1), the proportion of coarse particles increases, leading to a more compact particle skeleton and thus higher yield stress and viscosity coefficient (Quan et al., 2022). For example, at a weight concentration of 78%, a tailings-to-throw ratio of 9.00 results in a yield stress of 237.44 Pa and a viscosity coefficient of 1.0613 Pa s, whereas reducing the ratio to 2.33 decreases these values to 139.81 Pa and 0.7357 Pa s, respectively.

The cement-to-tailings ratio further regulates the rheological behavior of the slurry. When the cement-to-tailings ratio increases from 0.033 to 0.067, both yield stress and viscosity coefficient show a decreasing tendency. At a 77% weight concentration, the yield stress and viscosity coefficient at a cement-to-tailings ratio of 0.033 are 133.46 Pa and 0.4886 Pa s, respectively, while increasing the ratio to 0.067 reduces them to 40.55 Pa and 0.3981 Pa s. This trend is associated with changes in mixture water availability and early structural development: a higher cement content under the same overall concentration can increase the effective water-to-cement ratio, thereby increasing the fraction of free water and improving lubrication between particles, which reduces inter-particle friction and weakens the early flocculated structure. As a result, the slurry exhibits lower shear resistance and improved flowability. The scatter plot in Figure 6 further supports these observations, showing consistent trends across different experimental runs, with yield stress and viscosity coefficient increasing markedly under high concentration and high tailings-to-throw ratio conditions, highlighting their strong controlling effects on rheological behavior.

4.1.1.3 Mechanical Property Analysis

The compressive strength of tailings concrete is an important indicator for evaluating its mechanical properties, directly impacting its stability and service life. Compressive strength is influenced by several factors, including weight concentration, tailings-to-throw ratio, cement-to-tailings ratio, and total tailings content. Experimental data show that under different mix conditions, the 7-day compressive strength of tailings concrete ranges from 1.39 MPa to 2.46 MPa. The specific results are shown in Figure 7.

FIGURE 7

When the weight concentration increases from 77% to 80%, the compressive strength generally shows an upward trend, ranging from 2.15 MPa to 2.46 MPa. However, when the weight concentration is further increased to 81% and above, the compressive strength begins to decrease. At a weight concentration of 82%, the compressive strength drops to between 1.49 MPa and 1.83 MPa, indicating that excessively high concentrations may affect the uniformity of the slurry, reducing its strength.

The change in the tailings-to-throw ratio has a significant impact on compressive strength. A higher tailings-to-throw ratio (e.g., 9:1) corresponds to a lower compressive strength, while a lower tailings-to-throw ratio (e.g., 2.33:1) results in relatively higher compressive strength. Under a weight concentration of 78%, the compressive strength is 1.99 MPa when the tailings-to-throw ratio is 9:1, whereas it increases to 2.26 MPa when the tailings-to-throw ratio is 2.33:1. This is because a higher tailings-to-throw ratio leads to a looser slurry structure, reducing the density of the tailings concrete.

The change in the cement-to-tailings ratio also affects compressive strength. When the cement-to-tailings ratio increases from 0.033 to 0.067, the compressive strength shows an upward trend. At a weight concentration of 77%, the compressive strength is 1.89 MPa when the cement-to-tailings ratio is 0.033, while it increases to 2.16 MPa when the ratio is 0.067. A higher cement-to-tailings ratio helps improve the cementing properties of the slurry, thus enhancing the structural strength of tailings concrete. However, an excessively high cement-to-tailings ratio may lead to increased brittleness of the slurry, which could affect long-term durability (Zhang F. et al., 2022; Luo et al., 2022).

Figure 7 summarizes the variation in the compressive strength of tailings concrete under different mix conditions. When the weight concentration is between 77% and 80%, the compressive strength increases with the weight concentration; when the weight concentration is raised to 81% and above, the compressive strength decreases.

4.1.2 Statistical analysis of the dataset

To gain deeper insight into the relationships between these variables, data visualization was used to explore the distribution characteristics of the dataset and the correlations between variables. Figure 8 presents the results: the scatter plot matrix of the variables (Figure 8A) and the correlation matrix heatmap (Figure 8B).

FIGURE 8

Figure 8A shows the scatter distribution relationships between different variables, with univariate distribution histograms for each variable provided along the diagonal. From the histograms, it can be seen that variables such as weight concentration and total tailings content have relatively uniform distributions, while yield stress and viscosity coefficient exhibit certain skewed distributions. Figure 8B presents the correlation matrix heatmap of the variables, with different colors representing the strength and direction of the correlations. Red indicates positive correlation, while blue indicates negative correlation.
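The entries of a correlation heatmap such as Figure 8B are usually Pearson coefficients. The sketch below assumes that convention (the section does not name the coefficient) and shows the calculation for one pair of variables. The function name `pearson_r` and the slump and yield stress values are hypothetical placeholders for illustration, not the study’s measurements.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical slump (cm) and yield stress (Pa) values, for illustration only.
slump = [28.0, 24.0, 15.0, 10.0, 7.0]
yield_stress = [45.0, 120.0, 180.0, 260.0, 370.0]

r = pearson_r(slump, yield_stress)
print(f"r = {r:.2f}")  # negative: higher slump pairs with lower yield stress
```

Applying the same function to every pair of columns would fill the full matrix shown in the heatmap.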

The weight concentration shows a moderate negative correlation with compressive strength (R = −0.53), indicating that higher weight concentration may reduce the final mechanical performance of the filling material, which is related to changes in the internal porosity of the slurry. The tailings-to-throw ratio is also negatively correlated with compressive strength (R = −0.55), suggesting that a higher tailings-to-throw ratio may decrease compressive strength because cementitious materials are relatively insufficient in such systems, which affects the density of the structure. Slump shows a negative correlation with yield stress (R = −0.78), meaning that slurries with better flowability tend to have lower cohesion and therefore lower yield stress. Yield stress is positively correlated with compressive strength (R = 0.39), meaning that higher yield stress represents stronger slurry stability, which benefits the strength development of tailings concrete. The total tailings content is positively correlated with yield stress (R = 0.50) and viscosity coefficient (R = 0.58), meaning that an increase in tailings content enhances the cohesion of the slurry, resulting in higher yield stress and viscosity coefficient.

Although the proposed EO-LightGBM framework shows promising predictive performance, the dataset size in this study is relatively limited due to the practical constraints of laboratory preparation, curing, and mechanical testing, as well as the restricted availability of consistent tailings sources. Such a small dataset may increase the uncertainty of performance estimates and limit the generalizability of the conclusions to broader tailings-concrete systems. Therefore, the reported metrics should be interpreted as evidence of model effectiveness within the current experimental domain rather than universal superiority. To ensure a more reliable assessment under limited data, model training and evaluation were conducted with strict data splitting and robust validation procedures, and the conclusions are stated with a conservative scope. Further verification using larger and more diverse datasets will be necessary to strengthen cross-site and cross-mixture generalization in future work.

4.2 Model training and parameter optimization

In this study, the EO-LightGBM model is used to predict the compressive strength of tailings concrete, with optimization strategies incorporated during the training process to enhance model performance. Input features are normalized to reduce the impact of dimensional differences on model convergence (Johansson et al., 2024). LightGBM, an efficient implementation of Gradient Boosting Decision Trees (GBDT), improves computational efficiency through a leaf-wise splitting strategy and accelerates the feature splitting search using a histogram-based method, as shown in Figure 9.

A total of 32 unique mixture designs were used. The dataset was split into training and test sets using an 8:2 random split, resulting in 26 samples for training and six samples for independent testing. The split was performed randomly (with a fixed random seed to ensure reproducibility) rather than by stratified sampling, because the limited sample size makes multi-factor stratification difficult without producing extremely sparse strata. To check whether the small test set is reasonably representative, we compared the distributions (range and basic statistics) of the target compressive strength and key input variables between the training and test subsets, and confirmed that the test samples fall within the overall data ranges. Nevertheless, we acknowledge that the test set size is small, which may increase the uncertainty of the performance estimates; therefore, the reported metrics should be interpreted with this limitation in mind.

FIGURE 9

During the training process, the Mean Squared Error (MSE) is chosen as the loss function, and the optimization objective is to minimize the error between the predicted and actual values. Since LightGBM’s performance heavily depends on hyperparameter settings, this study introduces the Equilibrium Optimizer (EO) for automatic hyperparameter tuning to improve prediction accuracy and reduce training time. The EO algorithm simulates the mass balance process, dynamically adjusting the search direction through information exchange between individuals and selecting the optimal hyperparameter combination based on the fitness function during the iterative process.
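To make the search mechanism more concrete, the following is a simplified sketch of an Equilibrium Optimizer loop, written from the commonly published form of the update rule (an equilibrium pool of the four best candidates plus their average, an exponential term, and a generation term). It is not the authors’ implementation: the defaults `a1=2`, `a2=1`, `gp=0.5`, the population size, and the toy `sphere` objective are assumptions. In the study’s setting the objective would instead be the cross-validated error of a LightGBM model built from the candidate hyperparameters.

```python
import math
import random

def equilibrium_optimizer(objective, bounds, n_particles=20, n_iter=100,
                          a1=2.0, a2=1.0, gp=0.5, seed=0):
    """Minimize `objective` inside the box `bounds` with a simplified EO loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(value, d):
        return min(max(value, bounds[d][0]), bounds[d][1])

    particles = [[lo + rng.random() * (hi - lo) for lo, hi in bounds]
                 for _ in range(n_particles)]
    pool = []      # up to four best (fitness, position) pairs found so far
    history = []   # best fitness after each pool update
    for it in range(n_iter):
        # Refresh the equilibrium pool with the current particle positions.
        pool.extend((objective(p), list(p)) for p in particles)
        pool = sorted(pool, key=lambda fp: fp[0])[:4]
        history.append(pool[0][0])
        # Candidate equilibrium states: the pool members plus their average.
        average = [sum(fp[1][d] for fp in pool) / len(pool) for d in range(dim)]
        candidates = [fp[1] for fp in pool] + [average]
        t = (1.0 - it / n_iter) ** (a2 * it / n_iter)
        for p in particles:
            c_eq = rng.choice(candidates)
            r1, r2 = rng.random(), rng.random()
            gcp = 0.5 * r1 if r2 >= gp else 0.0   # generation rate control
            for d in range(dim):
                lam = max(rng.random(), 1e-12)    # guard against division by zero
                sign = math.copysign(1.0, rng.random() - 0.5)
                f = a1 * sign * (math.exp(-lam * t) - 1.0)
                g = gcp * (c_eq[d] - lam * p[d]) * f
                p[d] = clip(c_eq[d] + (p[d] - c_eq[d]) * f + (g / lam) * (1.0 - f), d)
    final = min(pool + [(objective(p), list(p)) for p in particles],
                key=lambda fp: fp[0])
    return final[1], final[0], history

def sphere(x):
    """Toy objective standing in for the cross-validated model error."""
    return sum(v * v for v in x)

best_x, best_f, history = equilibrium_optimizer(sphere, [(-5.0, 5.0)] * 2)
print(best_x, best_f)
```

Because the pool only ever keeps the best candidates seen so far, the recorded best fitness cannot get worse from one iteration to the next.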

The optimized variables include learning rate, maximum tree depth, number of leaf nodes, L1/L2 regularization parameters, and others. The search range for these variables is set based on empirical values and initially filtered through grid search. During training, 5-fold cross-validation is used to assess the model’s generalization ability. After each iteration, training and validation errors are calculated, and the learning rate is adjusted according to the convergence trend to improve stability.
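As a rough illustration of the 5-fold procedure, the sketch below generates fold indices, assuming cross-validation is run on the 26 training samples and that the indices have already been shuffled. The helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not start <= i < start + size]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=26, k=5))
print([len(val) for _, val in folds])  # [6, 5, 5, 5, 5]
```

With so few samples, each validation fold holds only five or six mixes, so fold-level errors can be expected to vary noticeably.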

Figure 10 shows how the model’s loss value changes under different hyperparameter combinations; the optimized model shows better fitting performance on the validation set.

FIGURE 10

Figure 10 shows the trend of the loss value as the number of decision trees increases under different maximum depth (Max Depth) values. Overall, the loss value gradually decreases as the number of trees increases, suggesting that more trees improve the model’s learning capacity. A comparison of the curves shows that the loss value for the smallest depth (Max Depth = 3) is consistently higher than that of the other curves, indicating weaker fitting ability. As the maximum depth increases, the loss value decreases significantly, particularly at Max Depth = 9 and 11, where the loss drops most noticeably and convergence is fastest.

The reduction in loss with increasing number of trees and maximum depth can be explained by the learning mechanism of gradient boosting decision trees. In LightGBM, trees are added sequentially, and each new tree is trained to fit the residual errors of the current ensemble. Increasing the number of trees therefore enlarges the functional space of the model, allowing previously unexplained patterns in the data to be gradually captured, which leads to a continuous decrease in loss. Meanwhile, the maximum depth controls the representational capacity of individual trees. Deeper trees enable finer feature space partitioning and stronger modeling of nonlinear relationships and higher-order interactions among mixture variables, which improves approximation accuracy and accelerates convergence under the same number of trees. As a result, larger depths exhibit lower loss values and faster convergence trends (Poudel et al., 2025). However, this improvement reflects increased model capacity rather than unconditional superiority, and excessive depth may raise variance and overfitting risk, which should be controlled through regularization and early stopping. This behavior is consistent with the established understanding of boosting-based models and recent studies on hyperparameter effects in machine-learning-based strength prediction.
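The residual-fitting mechanism can be illustrated with a toy gradient-boosting loop built from depth-1 trees (stumps) and a squared-error loss. This is a generic sketch, not LightGBM’s leaf-wise, histogram-based implementation; the helper names, the learning rate of 0.3, and the data values are hypothetical. It only demonstrates the effect of adding trees, not the effect of depth.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump (least squares) on a 1-D feature."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda xi: lm if xi <= threshold else rm

def boost(x, y, n_trees, learning_rate=0.3):
    """Return the training MSE recorded after each added stump."""
    pred = [sum(y) / len(y)] * len(y)   # start from the mean prediction
    losses = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)  # each new tree fits the current residuals
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
        losses.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))
    return losses

# Hypothetical weight concentration (%) and strength (MPa) values, illustration only.
x = [77, 78, 79, 80, 81, 82, 83, 84]
y = [1.9, 2.1, 2.3, 2.4, 2.0, 1.7, 1.6, 1.5]

losses = boost(x, y, n_trees=20)
print(f"MSE after 1 tree: {losses[0]:.4f}, after 20 trees: {losses[-1]:.4f}")
```

The training loss falls with every added stump, which mirrors the downward curves in Figure 10; a validation set would still be needed to see where the extra capacity starts to overfit.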

To ensure reproducibility, the hyperparameter search space optimized by the EO algorithm is explicitly defined as follows. The EO-LightGBM optimization considered key complexity- and generalization-related hyperparameters, including the number of trees (n_estimators, 100–2000), maximum tree depth (max_depth, 2–12), learning rate (learning_rate, 0.005–0.20), number of leaves (num_leaves, 8–256), minimum data in leaf (min_data_in_leaf, 5–100), feature fraction (feature_fraction, 0.6–1.0), bagging fraction (bagging_fraction, 0.6–1.0) and bagging frequency (bagging_freq, 0–10), as well as L1/L2 regularization terms (lambda_l1, 0–5; lambda_l2, 0–10). Continuous variables were searched in their real-valued ranges, while integer parameters (e.g., n_estimators, max_depth, num_leaves, min_data_in_leaf, bagging_freq) were rounded to the nearest valid integers before model training. The objective function minimized the cross-validated error metric on the training set, so the EO process selected hyperparameters within the above bounds while balancing accuracy and model complexity.
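The bounds listed above can be written down as a small configuration, together with a decoding step that clips a candidate to the bounds and rounds the integer parameters. The dictionary layout, the function name `decode`, and the example candidate vector are illustrative assumptions; only the parameter names and ranges come from the text.

```python
# (lower bound, upper bound, type) for each hyperparameter, as listed in the text.
SEARCH_SPACE = {
    "n_estimators":     (100, 2000, int),
    "max_depth":        (2, 12, int),
    "learning_rate":    (0.005, 0.20, float),
    "num_leaves":       (8, 256, int),
    "min_data_in_leaf": (5, 100, int),
    "feature_fraction": (0.6, 1.0, float),
    "bagging_fraction": (0.6, 1.0, float),
    "bagging_freq":     (0, 10, int),
    "lambda_l1":        (0.0, 5.0, float),
    "lambda_l2":        (0.0, 10.0, float),
}

def decode(position):
    """Turn a real-valued candidate vector into a valid hyperparameter dict."""
    if len(position) != len(SEARCH_SPACE):
        raise ValueError("position length must match the number of hyperparameters")
    params = {}
    for value, (name, (low, high, kind)) in zip(position, SEARCH_SPACE.items()):
        value = min(max(value, low), high)   # clip to the stated bounds
        params[name] = int(round(value)) if kind is int else float(value)
    return params

# Hypothetical candidate produced by the optimizer (lambda_l2 is out of range).
candidate = [812.6, 6.4, 0.05, 31.7, 9.2, 0.85, 0.9, 3.6, 0.1, 12.0]
params = decode(candidate)
print(params)
```

Clipping before rounding keeps every integer parameter inside its bounds, because the bounds themselves are integers.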

Figure 11 further compares the prediction error distribution of the model on the test set before and after optimization. The results show that after EO optimization, the error distribution of LightGBM becomes more concentrated, significantly improving prediction stability. To avoid overfitting, early stopping is applied to control the number of training rounds, with the stopping criterion being the absence of significant validation error reduction over several consecutive rounds.
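The stopping criterion described above can be sketched as a patience rule on the validation loss. The function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical; the text does not state the thresholds used in the study, and LightGBM provides its own early-stopping mechanism that would normally be used in practice.

```python
def early_stopping_round(val_losses, patience=10, min_delta=1e-4):
    """Return the index of the best round, stopping the scan once `patience`
    rounds have passed without an improvement larger than `min_delta`."""
    best_loss = float("inf")
    best_round = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_round = loss, i
        elif i - best_round >= patience:
            break
    return best_round

# Hypothetical validation losses per boosting round.
val_losses = [0.090, 0.060, 0.045, 0.040, 0.041, 0.042, 0.043, 0.044]
print(early_stopping_round(val_losses, patience=3))  # 3
```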

FIGURE 11

Figure 11 shows the distribution of prediction errors on the test set before and after optimization. It can be observed that after optimization (in red), the error distribution is more concentrated, with the mean shifting towards zero, indicating that the optimized model has smaller prediction errors and improved stability on the test set. Before optimization (in blue), the error distribution is more scattered, suggesting that the original model exhibits larger fluctuations in predictions on the test set, with a broader error range. This indicates that after EO optimization, the prediction performance of LightGBM has been significantly improved.

4.3 Model results evaluation

Model evaluation is a key step in validating prediction performance. In this study, the EO-LightGBM model is used to predict the compressive strength of tailings concrete, and quantitative analysis is performed based on Mean Squared Error (MSE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). Figure 12 compares the predicted values with the actual measurements.

FIGURE 12

The left side of Figure 12 shows the fitting results on the training set, with data points mainly distributed along the ideal fitting line (y = x), indicating that the model has good fitting ability on the training data. The results on the test set, shown on the right, also demonstrate a high degree of fitting, with a small deviation between the predicted and actual values, and an R² of 0.9782, suggesting that the model can accurately predict the compressive strength of unseen data. The MSE on the test set is 0.0024, and the RMSE is 0.0485, both of which are relatively low, indicating that the model’s error is within a reasonable range. The model shows good fitting ability on both the training and test sets, with high prediction accuracy and strong generalization ability, making it suitable for the practical prediction of tailings concrete compressive strength.
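For reference, the three metrics can be computed as in the short sketch below, which uses the standard definitions. The function name and the three-point example are illustrative only and are not taken from the study’s data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and the coefficient of determination R2."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "R2": 1.0 - ss_res / ss_tot}

metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)  # MSE = 1/3, RMSE ~ 0.577, R2 = 0.5
```

Note that RMSE is the square root of MSE by definition, so the two always carry the same ranking information; R² additionally depends on the spread of the measured values in the evaluated subset.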

4.4 Variable importance analysis

When constructing a tailings concrete compressive strength prediction model, identifying the key variables that influence compressive strength is crucial for optimizing material mix design and enhancing performance. This study uses the LightGBM model to perform a quantitative analysis of the importance of input features to assess the contribution of different variables to the prediction of compressive strength. The feature importance evaluation in LightGBM is based on the gain values calculated during decision tree node splits, which measure the reduction in loss when a particular feature is used for splitting. The cumulative gain values across all trees can quantify the contribution of that feature to the overall prediction task. As shown in Figure 13, the feature importance scores of each input variable for compressive strength prediction are displayed.
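The accumulation of split gains can be sketched as follows. The split records are hypothetical, and the normalization to a total of 1 is an assumption made here because the scores reported in this section sum to 1.00; raw gain importances are otherwise unnormalized.

```python
from collections import defaultdict

def gain_importance(splits):
    """Sum split gains per feature and normalize so the scores add up to 1."""
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain
    total_gain = sum(totals.values())
    return {feature: gain / total_gain for feature, gain in totals.items()}

# Hypothetical (feature, gain) records collected over all trees.
splits = [
    ("weight_concentration", 4.0),
    ("cement_to_tailings_ratio", 1.5),
    ("weight_concentration", 2.0),
    ("slump", 0.5),
]
importance = gain_importance(splits)
print(importance)
```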

FIGURE 13

From Figure 13, it can be seen that weight concentration has the most significant impact on compressive strength, with an importance score of 0.35. The tailings-to-throw ratio and cement-to-tailings ratio follow, with scores of 0.20 and 0.18, respectively, indicating that these two variables make substantial contributions to predicting compressive strength. The total tailings content has a score of 0.15, showing that it has a significant impact on compressive strength, but it is lower than the top three variables.

Among the rheological parameters, slump has a score of 0.05, while yield stress and viscosity coefficient have scores of 0.04 and 0.03, respectively. This suggests that these parameters contribute relatively little to compressive strength, but still play a role in model prediction. Overall, structural composition parameters (weight concentration, tailings-to-throw ratio, cement-to-tailings ratio) are the main factors influencing compressive strength, while rheological properties (slump, yield stress, viscosity coefficient) have relatively low weights in the prediction.
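The group totals implied by the reported scores can be checked with a few lines of arithmetic. The scores are those quoted above; the dictionary keys and the grouping variables are naming choices made for this illustration.

```python
# Importance scores as reported for Figure 13.
scores = {
    "weight_concentration": 0.35,
    "tailings_to_throw_ratio": 0.20,
    "cement_to_tailings_ratio": 0.18,
    "total_tailings_content": 0.15,
    "slump": 0.05,
    "yield_stress": 0.04,
    "viscosity_coefficient": 0.03,
}
composition = ["weight_concentration", "tailings_to_throw_ratio", "cement_to_tailings_ratio"]
rheology = ["slump", "yield_stress", "viscosity_coefficient"]

total = round(sum(scores.values()), 2)
composition_share = round(sum(scores[name] for name in composition), 2)
rheology_share = round(sum(scores[name] for name in rheology), 2)
print(total, composition_share, rheology_share)  # 1.0 0.73 0.12
```

The three composition parameters therefore account for 0.73 of the total importance, and the three rheological parameters for 0.12.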

4.5 Comparison with traditional machine learning models

This study uses the EO-LightGBM model to predict the compressive strength of tailings concrete and compares its performance with several traditional machine learning methods (Asteris et al., 2021; Armaghani and Asteris, 2021), including Linear Regression (LR), Support Vector Regression (SVR), Random Forest (RF), Gradient Boosting Decision Trees (GBDT), XGBoost, and LightGBM. The evaluation metrics include Mean Squared Error (MSE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). Table 6 presents the performance comparison of different models on the test set.

TABLE 6

Model	MSE	RMSE	R²	Time/s
LR	0.052	0.228	0.78	0.05
SVR	0.045	0.212	0.82	0.32
RF	0.037	0.192	0.86	1.14
GBDT	0.031	0.176	0.89	2.03
XGBoost	0.028	0.167	0.91	1.79
LightGBM	0.026	0.161	0.92	1.35
EO-LightGBM	0.022	0.148	0.94	1.42

Performance comparison of different models on the test set.

Table 6 lists the MSE, RMSE, and R² of the different models on the test set. Traditional regression methods (such as LR and SVR) show larger prediction errors, with MSE values of 0.052 and 0.045 and corresponding R² values of only 0.78 and 0.82, indicating their limited ability to fit complex nonlinear relationships. Random Forest (RF) and GBDT show some improvement in MSE and R², with GBDT’s R² reaching 0.89. XGBoost and LightGBM further reduce the errors, with MSE of 0.028 and 0.026 and R² of 0.91 and 0.92, respectively, demonstrating stronger generalization ability. EO-LightGBM performs the best across all accuracy metrics, with MSE reduced to 0.022, RMSE at 0.148, and R² increased to 0.94, indicating that the optimized LightGBM can more accurately capture the feature patterns of tailings concrete compressive strength.
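Two quick checks on Table 6 are sketched below: that each RMSE equals the square root of its MSE to the reported precision, and how large the MSE reduction of EO-LightGBM is relative to untuned LightGBM. The values are copied from Table 6; the helper names are illustrative.

```python
import math

# model: (MSE, RMSE, R2), values copied from Table 6
TABLE_6 = {
    "LR": (0.052, 0.228, 0.78),
    "SVR": (0.045, 0.212, 0.82),
    "RF": (0.037, 0.192, 0.86),
    "GBDT": (0.031, 0.176, 0.89),
    "XGBoost": (0.028, 0.167, 0.91),
    "LightGBM": (0.026, 0.161, 0.92),
    "EO-LightGBM": (0.022, 0.148, 0.94),
}

def rmse_consistent(mse, rmse, tol=1e-3):
    """True if the reported RMSE matches sqrt(MSE) within the rounding tolerance."""
    return abs(math.sqrt(mse) - rmse) < tol

def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

all_consistent = all(rmse_consistent(mse, rmse) for mse, rmse, _ in TABLE_6.values())
mse_gain = relative_reduction(TABLE_6["LightGBM"][0], TABLE_6["EO-LightGBM"][0])
print(all_consistent, f"{mse_gain:.1%}")  # True 15.4%
```

The MSE reduction from 0.026 to 0.022 corresponds to roughly 15%; on a six-sample test set, a difference of this size should be read with the small-sample caveat stated earlier.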

In terms of training time, LR requires the least time (0.05 s) owing to its simple computation, but it has lower prediction accuracy. SVR has a higher computational complexity, with a training time of 0.32 s. RF, GBDT, and XGBoost have longer training times, at 1.14, 2.03, and 1.79 s, respectively. LightGBM, using a histogram-based splitting strategy, trains in 1.35 s, which is faster than GBDT and XGBoost although slower than RF. EO-LightGBM has a slightly longer training time than LightGBM (1.42 s), but it achieves the best prediction accuracy among all compared methods.

The performance evaluation shows that the EO-LightGBM model outperforms the traditional machine learning methods in both accuracy and generalization ability, supporting its feasibility for predicting the compressive strength of tailings concrete. The next section provides an in-depth discussion and mechanism analysis of the model results.

5 Discussion

Presenting the prediction results alone is insufficient for fully understanding their physical nature and engineering applicability. Therefore, this section further discusses the factors behind the model’s performance from two aspects: mechanism analysis and outlier exploration.

In the EO-LightGBM model constructed in this study, the Equilibrium Optimizer (EO) performs a global search over key hyperparameters (such as learning rate, maximum depth, and number of leaf nodes), which significantly enhanced the model’s learning efficiency and fitting accuracy. The prediction results show that the model achieves a coefficient of determination (R²) of 0.94 and a mean squared error (MSE) of 0.022 on the test set, outperforming all comparison models, including linear regression (LR), SVR, random forest (RF), and XGBoost. The superior performance of the model can be attributed to the tree ensemble built on the gradient boosting framework of LightGBM, which efficiently models the nonlinear relationship between input variables (such as weight concentration, tailings-to-throw ratio, cement-to-tailings ratio) and the output response (compressive strength). Meanwhile, the EO algorithm follows a search path that simulates mass conservation to avoid local optima, effectively improving the model’s adaptability to complex multivariable systems.

Hyperparameter optimization is widely recognized as an important factor influencing the predictive performance of machine learning models. Recent studies have shown that systematic tuning can significantly affect both absolute accuracy and relative comparisons among different algorithms. In this study, hyperparameter optimization was primarily focused on the proposed EO-LightGBM framework to evaluate the effectiveness of the EO strategy in enhancing model performance. The baseline models were implemented using commonly adopted parameter settings to provide a consistent reference for comparison. While further tuning of all baseline models could potentially improve their performance, the present comparison emphasizes the contribution of the EO-driven optimization mechanism rather than an exhaustive cross-model optimization. Similar observations regarding the sensitivity of model performance to hyperparameter settings have been reported in the literature.

Further analysis of the test-set errors was conducted to avoid a purely qualitative discussion of extreme-mix deviations. We calculated the sample-wise residuals (residual = predicted − actual) and the absolute percentage errors, and explicitly identified outliers as the points with the largest deviations (top three by absolute percentage error). For the current test set, the most prominent outliers correspond to sample #3 (actual 1.82 MPa, predicted 1.689 MPa; APE 7.20%), sample #8 (actual 1.62 MPa, predicted 1.5206 MPa; APE 6.14%), and sample #11 (actual 1.61 MPa, predicted 1.6839 MPa; APE 4.59%). Figure 14 plots residuals across samples and shows that these high-deviation points are concentrated near boundary conditions of the mix-design space. This behavior is physically plausible for extreme ratios (e.g., weight concentration ≥83%, tailings-to-throw ratio ≥8.0, or cement-to-tailings ratio ≤0.033): (i) at very high weight concentration, the reduction in free water restricts hydration and promotes nonuniform structure development (“surface hardening with internal looseness”), weakening strength growth and increasing sensitivity to small compositional fluctuations; (ii) at very high tailings-to-throw ratio, the coarse-tailings fraction increases and particle gradation becomes imbalanced, reducing packing density and cement paste coverage and creating interfacial discontinuities, which leads to nonlinearly amplified strength variability that is difficult to learn reliably when such extreme samples are rare in training. Therefore, the practical operational boundary of the model is defined by the calibrated range of key mix variables in the dataset, and predictions near or beyond those extreme ranges should be interpreted cautiously (Thapa and Ghani, 2025; Benzaamia et al., 2025).
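The residuals and absolute percentage errors quoted above follow directly from the stated definitions (residual = predicted − actual; APE = |residual| / actual × 100). The sketch below reproduces them from the three reported samples; the function and variable names are illustrative.

```python
# sample id: (actual MPa, predicted MPa), as reported in the text
samples = {3: (1.82, 1.689), 8: (1.62, 1.5206), 11: (1.61, 1.6839)}

def residual_and_ape(actual, predicted):
    """Return the residual (predicted - actual) and the absolute percentage error."""
    residual = predicted - actual
    return residual, abs(residual) / actual * 100.0

results = {sid: residual_and_ape(a, p) for sid, (a, p) in samples.items()}
for sid, (residual, ape) in results.items():
    print(f"sample #{sid}: residual = {residual:+.4f} MPa, APE = {ape:.2f}%")
# sample #3: residual = -0.1310 MPa, APE = 7.20%
# sample #8: residual = -0.0994 MPa, APE = 6.14%
# sample #11: residual = +0.0739 MPa, APE = 4.59%
```

Samples #3 and #8 are under-predicted while sample #11 is over-predicted, so the largest deviations are not all in the same direction.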

FIGURE 14

From the perspective of variable influence mechanisms, weight concentration is the most significant factor affecting compressive strength. Its increase not only enhances the physical support between aggregates but also reduces the unit water-to-cement ratio, making the slurry denser, thus improving the strength. The regulation of tailings-to-throw ratio and cement-to-tailings ratio impacts the slurry’s uniformity and cementing performance by adjusting the particle gradation and cementitious material proportion. Notably, yield stress and viscosity coefficient are lower in the feature importance ranking, yet in certain local samples, both show a strong coupling with compressive strength, reflecting the indirect influence of slurry flow behavior on strength development. Future model development could explore incorporating dynamic rheological parameters (such as shear rate intervals before and after yield point, time-dependent modulus changes, etc.) to enhance the model’s ability to analyze non-steady-state slurry behaviors. The EO-LightGBM model’s advantages are not only reflected in its global prediction accuracy but also in its high consistency and scientific approach to variable selection and interpretability mechanisms. However, attention must still be paid to the model’s inadequate fitting capability for extreme boundary data, and future improvements could include ensemble residual analysis methods, the introduction of physical prior constraints, or combining deep structures such as CNN-LSTM to further enhance the model’s adaptability and robustness in complex material systems.

6 Conclusion

Based on the preceding research and discussion, this study conducted a systematic investigation into the performance characteristics of tailings concrete and its efficient prediction methods, leading to the following main conclusions:

1. This study designed 32 different mix ratios for copper mine tailings concrete with a weight concentration range of 77%–85%, and measured rheological and mechanical properties such as slump, yield stress, viscosity coefficient, and 7-day compressive strength. The results show that weight concentration and cement-to-tailings ratio significantly affect compressive strength. Excessively high weight concentrations reduce slurry uniformity, which in turn affects the final mechanical properties.
2. The EO-LightGBM model was used for compressive strength prediction and compared with traditional regression models (LR, SVR, RF, GBDT, XGBoost, LightGBM). The evaluation results on the test set showed that the EO-LightGBM model achieved an MSE of 0.022, an RMSE of 0.148, and an R² of 0.94, outperforming all other models. The model demonstrates high prediction accuracy and stability.
3. Variable importance analysis revealed that weight concentration, tailings-to-throw ratio, and cement-to-tailings ratio are the key factors affecting compressive strength, together contributing about 73% of the total importance (0.35 + 0.20 + 0.18), while yield stress and viscosity coefficient have a smaller contribution. This finding suggests that in tailings concrete mix optimization, the influence of these variables should be prioritized to improve the mechanical performance of tailings concrete.
4. The EO-LightGBM model proposed in this study provides an efficient method for predicting the mechanical properties of tailings concrete, which can be applied to fill material optimization and engineering practice. Future research may further introduce richer datasets and combine deep learning methods to improve the model’s applicability and broader value.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding authors.

Author contributions

LT: Software, Writing – original draft, Investigation, Writing – review and editing, Data curation, Methodology, Formal Analysis, Conceptualization. XZ: Validation, Formal Analysis, Data curation, Writing – review and editing, Writing – original draft, Investigation, Funding acquisition.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Acknowledgments

The authors would like to express their gratitude to the colleagues and laboratory staff at Kunming Metallurgy College who provided valuable assistance during the design, construction, and testing of the tailings concrete compressive strength prediction model using EO-LightGBM. Their support in experimental platform assembly, data collection, and analysis greatly contributed to the success of this study.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Keywords

compressive strength, EO, equilibrium optimizer, machine learning, tailings concrete

Citation

Tang L and Zhang X (2026) Prediction of the compressive strength of tailings backfill using an EO-LightGBM model: performance comparison and feature importance analysis. Front. Earth Sci. 13:1758600. doi: 10.3389/feart.2025.1758600

Received: 01 December 2025

Revised: 20 December 2025

Accepted: 22 December 2025

Published: 18 February 2026

Volume: 13 - 2025

Edited by

Krzysztof Skrzypkowski, AGH University of Krakow, Poland

Reviewed by

Panagiotis G. Asteris, School of Pedagogical and Technological Education, Greece

Sushant Poudel, Lamar University, United States

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiaoliang Zhang, 13769116228@163.com; Li Tang, ltang8042@foxmail.com
