
ORIGINAL RESEARCH article

Front. Earth Sci., 05 January 2026

Sec. Georeservoirs

Volume 13 - 2025 | https://doi.org/10.3389/feart.2025.1721227

Hybrid optimization of interpretable ensemble machine learning for petrophysical property prediction from well logs


Uti Ikitsombika Markus 1, Jing Ba 1*, Muhammad Abid 1, Faruwa Ajibola Richard 1 and Emo Obadiah 2
  • 1 School of Earth Sciences and Engineering, Hohai University, Nanjing, China
  • 2 Department of Chemical Engineering, Ahmadu Bello University, Zaria, Nigeria

The precise prediction of petrophysical properties in tight reservoirs is essential for accurate reservoir characterization but remains impeded by significant lithological heterogeneity and complex, nonlinear relationships among well-log features. To address this, we propose a robust and interpretable machine learning framework that synergizes a stacked ensemble architecture with a post hoc physics-informed refinement step for predicting porosity and water saturation. The methodology employs a multi-stage process: (1) model-specific recursive feature elimination with cross-validation (RFECV) to identify optimal feature subsets; (2) a hybrid Genetic Algorithm–Particle Swarm Optimization (GA–PSO) strategy for efficient hyperparameter tuning; and (3) a stacked ensemble integrating Random Forest (RF), LightGBM, and CatBoost, with a Ridge regression meta-learner. We evaluate two configurations: hyperparameter optimization alone (Hybrid_Hyper_XGB) and joint optimization of hyperparameters and stacking weights (Stacked_Hybrid_Full). The superior Stacked_Hybrid_Full model is further enhanced by a post hoc physics-based refinement, where priors derived from the Wyllie time-average equation augmented with density-neutron crossplots and the Archie-Simandoux model are blended as soft regularizers, ensuring geological consistency without retraining. Comprehensive validation demonstrates that the physics-informed Stacked_Hybrid_Full model achieves superior performance, with R² values exceeding 0.91 for porosity and 0.83 for water saturation. Depth-resolved analysis confirms a significant reduction in prediction error and improved capture of structural features, particularly within laminated and low-porosity intervals. Model interpretability, probed via SHapley Additive exPlanations (SHAP), identifies permeability, resistivity, gamma ray, and shear velocity as the dominant predictive features and elucidates nontrivial interaction effects aligned with petrophysical principles. This work presents a transferable workflow that successfully bridges data-driven prediction with physical plausibility. The framework significantly enhances predictive robustness and model transparency for petrophysical characterization in heterogeneous tight reservoirs, offering substantial practical utility for reservoir evaluation in unconventional plays.

1 Introduction

Unconventional reservoirs, including shale gas, tight oil, and coalbed methane formations, have emerged as vital contributors to global energy security, accounting for a substantial portion of the world’s available hydrocarbon resources. Accurate prediction of petrophysical properties in unconventional reservoirs is essential for adequate reservoir characterization and optimizing exploration strategies (Zou et al., 2013). However, these petrophysical properties, derived from well log data, exhibit complex nonlinear relationships driven by geological heterogeneity (Yang and Zou, 2019).

Predicting reservoir properties in tight oil formations remains particularly challenging due to inherent complexities, including spatial heterogeneity and anisotropy, which complicate the application of conventional petrophysical and geophysical methodologies (Yang et al., 2016). Unlike traditional petrophysical models, machine learning (ML) algorithms are capable of independently identifying hidden, nonlinear relationships between input features and target outputs, thereby demonstrating superior performance in modeling complex systems (Bai et al., 2022; Dong et al., 2023; Sang et al., 2022; Wood, 2022). Recent studies have shown that ensemble machine learning (EML) models, particularly those based on decision tree architectures, achieve predictive performance comparable to that of deep learning models when applied to tabular datasets (Shwartz-Ziv and Armon, 2021). Notably, the architecture and hyperparameters of EML models can be efficiently optimized by using advanced heuristic techniques (Akande et al., 2017; Gu et al., 2021a; Salem et al., 2022).

Given that well-logging datasets are inherently tabular, EML models are well-suited for predictive tasks such as estimating reservoir properties, often outperforming conventional petrophysical approaches (Bai et al., 2022; Gu et al., 2022; Wang et al., 2020). These models effectively capture complex, nonlinear interdependencies among well-log variables, which is particularly beneficial for characterizing unconventional reservoirs (Abbas et al., 2023; Al-Mudhafar, 2015; Anifowose et al., 2019). Among the most widely employed EML algorithms for reservoir property prediction are Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) (Al-Mudhafar, 2020; Al-Mudhafar and Wood, 2022).

Despite their predictive power, EML models are often criticized for their limited interpretability, frequently being referred to as “black-box” models, which is an issue that poses barriers to their acceptance in industrial applications (Adadi and Berrada, 2018; Lipton, 2016; Murdoch et al., 2019). To address this issue, recent developments in artificial intelligence have prioritized the advancement of explainable machine learning techniques, collectively referred to as Explainable Artificial Intelligence (XAI). XAI methodologies facilitate transparency and foster trust by bridging the gap between model outputs and human interpretability (Barredo Arrieta et al., 2020; Loh et al., 2022). Among these, Shapley Additive Explanations (SHAP) has emerged as a prominent tool, offering both global and local interpretability insights (Feng et al., 2021; Kavzoglu and Teke, 2022; Markus et al., 2025).

Recent advances in metaheuristic optimization and feature selection have increased efficiency in the use of machine learning for geoscience (Nssibi et al., 2023; Selvam et al., 2024; Zhang et al., 2025). The hybrid approaches Genetic Algorithm-Particle Swarm Optimization (GA-PSO) and Particle Swarm Optimization-XGBoost have improved the prediction of permeability and lithology in tight sandstones by hyperparameter optimization and feature selection (Gu et al., 2021a; Gu et al., 2021b; Sheykhinasab et al., 2023). Enriched PSO–XGBoost interpretable models using SHAP have developed accurate and interpretable models of permeability (Liu and Liu, 2022), and carbonates have had logging-based permeability estimated with novel simulated annealing–genetic hybrid SA–GA–XGBoost (Huang et al., 2025). SHAP-based interpretability has been used to explain permeability predictions (Feng et al., 2024; Mohammadian et al., 2022; Zhang et al., 2024) and shear wave velocity estimations (Zhang et al., 2023), highlighting its effectiveness in identifying key influential features and enhancing trust in ML-derived insights. Regardless, most of the studies neglect hybrid optimization and explainability, treating them separately and focusing on adopted distinctive approaches (LightGBM, XGBoost) instead of a seamless ensemble interpretable design.

Physics-informed machine learning (PIML) integrates governing physical laws such as rock-physics relationships and fluid-flow principles directly into neural network architectures to improve generalization, reduce overfitting, and enhance physical interpretability (Shao et al., 2024). PIML has been applied to upscale permeability from core to reservoir scales using time-lapse geo-electrical data, improving characterization of subsurface fluid dynamics (Sakar et al., 2024). Probabilistic PIML formulations have also strengthened seismic petrophysical inversion by incorporating wave-physics constraints, yielding more reliable porosity estimates from seismic attributes (Khassaf et al., 2025). Advances in multi-scale fracture analysis and hybrid optimization further emphasize the importance of interpretable ML approaches in tight reservoirs, where bedding-parallel fractures exert strong control on permeability (Su et al., 2025; Wen et al., 2025). Similarly, peridynamic simulations of rock deformation demonstrate how nonlinear mechanical behavior can be integrated into PIML frameworks for improved porosity and saturation prediction under heterogeneous overburden conditions (Li et al., 2025; Tian et al., 2025). PIML approaches that incorporate petrophysical constraints such as Gassmann’s equations, Archie’s law, and geological priors for organic-rich intervals have shown substantial improvements in predicting reservoir properties in tight and unconventional formations (Abid et al., 2025; Gai et al., 2025; Pothana and Ling, 2025; Shao et al., 2024). These methods enhance robustness across heterogeneous systems, reducing prediction errors in low-porosity zones and during dynamic processes such as waterflooding (Mabiala et al., 2025).

EML models are increasingly utilized for petrophysical prediction, leveraging optimized hyperparameter tuning to enhance performance. Unlike previous studies where optimization and stacking are decoupled, this work performs simultaneous GA-PSO optimization of both base-model hyperparameters and stacking weights, followed by a post hoc refinement step applied to the stacked ensemble predictions, where a prior is computed from well logs using the Wyllie time-average equation augmented with density-neutron adjustments for porosity and Archie-Simandoux hybrid for water saturation, serving as a soft regularizer. This prior constrains the ensemble outputs via weighted blending tuned via cross-validation, ensuring physical consistency without retraining the models or risking data leakage. This study framework integrates model-specific recursive feature elimination with cross-validation (RFECV) for robust feature selection, a GA-PSO approach for simultaneous tuning of hyperparameters and EML stacking weights, a stacked ensemble model employing a Ridge regression meta-learner, and physics-informed blending to inject geological priors. To evaluate the framework’s efficiency, three distinct labelled scenarios are investigated: (i) Stacked_Hybrid_Full, which optimizes both hyperparameters and stacking weights augmented with physics-informed constraints; (ii) Hybrid_Hyper_XGB, which optimizes only base model hyperparameters with equal-weight averaging; and (iii) Baseline_XGB, a reference model with fixed hyperparameters. To enhance interpretability, SHAP analysis is employed to quantify the contributions of individual features and stacked models, providing geological insights.

2 Overview of study area and data preparation

The Ordos Basin, a major hydrocarbon-rich basin in northern China, comprises six principal tectonic units: the Yimeng Uplift, Western Margin Thrust Belt, Tianhuan Depression, Yishan Slope, Weibei Uplift, and Jinxi Flexural Fold Belt. During the Late Triassic, progressive tectonic closure drove a basinwide transition from shallow-marine to predominantly lacustrine conditions (Ji et al., 2022). Within this framework, the Yanchang Formation, particularly the Chang 7 Member of muddy shales with thin sandstone interbeds, records maximum lacustrine expansion and hosts kerogen-rich, thermally mature successions central to unconventional shale-oil prospectivity (Shi et al., 2022). The study area lies at the confluence of sediment supply from the southwestern and northeastern basin margins, where hydrocarbons in tight sandstones and associated source rocks exhibit minimal lateral migration, indicating largely in-situ accumulation. These accumulations include high-quality shale oils and tight oils, with notable enrichment in the Changqing Oilfield (Liu et al., 2022).

Multi-well log datasets from 10 boreholes are integrated into a unified structure with consistent formatting and identifiers. The wells are distributed within the same work area, spanning lateral facies transitions from proximal delta-front sand-dominated successions to more distal lacustrine mud-rich intervals. The wells therefore share the same geological conditions and lithological characteristics at the target formation, and this spatial spread ensures that the training dataset captures the representative heterogeneity of the Chang 7 tight reservoirs (Liu et al., 2022; Yang et al., 2016). Preprocessing involves imputing missing values through linear interpolation and backward filling, removing duplicate and non-informative columns, and normalizing features to zero mean and unit variance for numerical stability. This yields 5,697 reservoir data points, with sample counts varying across wells. To assess generalizability, three wells with comprehensive log coverage are held out as the independent test set, while the remaining 3,864 samples train the ensemble models. Figure 1 illustrates a representative well log, highlighting depth-dependent variability in elastic and petrophysical properties.
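To make this preprocessing step concrete, a minimal sketch is given below; the well/depth column names (e.g., "WELL", "DEPTH") and the helper structure are illustrative assumptions rather than the study's actual code, but the steps (per-well linear interpolation with backward filling, removal of duplicate and constant columns, and standardization fitted on the training wells only) follow the description above.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

def preprocess(df: pd.DataFrame, feature_cols, test_wells):
    # Per-well imputation: linear interpolation, then backward fill for leading gaps
    df = df.sort_values(["WELL", "DEPTH"]).copy()
    df[feature_cols] = (
        df.groupby("WELL")[feature_cols]
          .transform(lambda col: col.interpolate(method="linear").bfill())
    )
    # Remove duplicate rows and constant (non-informative) columns
    df = df.drop_duplicates()
    keep = [c for c in df.columns if df[c].nunique(dropna=False) > 1]
    df, feature_cols = df[keep], [c for c in feature_cols if c in keep]

    # Hold out the designated test wells before fitting the scaler (no leakage)
    train = df[~df["WELL"].isin(test_wells)].copy()
    test = df[df["WELL"].isin(test_wells)].copy()
    scaler = StandardScaler().fit(train[feature_cols])  # zero mean, unit variance
    train[feature_cols] = scaler.transform(train[feature_cols])
    test[feature_cols] = scaler.transform(test[feature_cols])
    return train, test, scaler
```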


Figure 1. Well log A (a) V P (P-wave velocity), (b) V S (S-wave velocity), (c) density, (d) porosity, (e) water saturation, and (f) shale volume.

To augment the geophysical relevance of the dataset, rock-physics-derived attributes, including the V P/V S ratio, acoustic impedance ($I_P = V_P \rho_b$), shear impedance ($I_S = V_S \rho_b$), and Poisson's ratio $\nu = \frac{V_P^2/V_S^2 - 2}{2\left(V_P^2/V_S^2 - 1\right)}$, are engineered from conventional logs. These features capture elastic contrasts and lithological variations essential for reservoir property prediction. The inputs are V P, V S, V P/V S, I P, I S, permeability, resistivity, and gamma ray; the targets are porosity and water saturation.
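A short sketch of this feature-engineering step is shown below; it assumes Vp and Vs are stored in consistent velocity units alongside the bulk density log rho_b, with the column names chosen for illustration only.

```python
import pandas as pd

def add_elastic_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["Vp_Vs"] = out["Vp"] / out["Vs"]              # velocity ratio
    out["Ip"] = out["Vp"] * out["rho_b"]              # acoustic impedance I_P = V_P * rho_b
    out["Is"] = out["Vs"] * out["rho_b"]              # shear impedance   I_S = V_S * rho_b
    r2 = out["Vp_Vs"] ** 2                            # squared velocity ratio
    out["poisson"] = (r2 - 2.0) / (2.0 * (r2 - 1.0))  # Poisson's ratio
    return out
```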

3 Methodology

This section presents a comprehensive hybrid machine learning framework to predict porosity and water saturation in unconventional reservoirs using well log data. The methodology integrates hybrid feature selection and ensemble learning enhanced by metaheuristic optimization and interpretable machine learning techniques.

3.1 Feature selection

The implementation of RFECV requires the selection of a machine learning algorithm as the base estimator, which is responsible for generating the feature importance rankings used during the recursive elimination process (Chang et al., 2020). Hence, feature selection is conducted by using RFECV tailored to three base learners: RF, LightGBM, and CatBoost. Each model undergoes independent selection cycles by using 5-fold cross-validation to identify the most informative predictors. Feature selection stability is quantified by using the Jaccard similarity between cross-validation folds together with a selection frequency analysis; features selected in ≥80% of folds are considered stable. Unified feature sets are derived by aggregating SHAP importance across all models and selecting the top k features, where k is the median optimal feature count across the individual models. The Jaccard similarity between two folds is $J(S_i, S_j) = |S_i \cap S_j| / |S_i \cup S_j|$, where $S_i$ and $S_j$ are the feature sets selected in folds i and j. The mean Jaccard similarity per model is $J_{\mathrm{mean}} = \frac{2}{K(K-1)} \sum_{i<j} J(S_i, S_j)$, where K is the number of cross-validation folds. The selection frequency of feature k is $f_k = \frac{1}{K} \sum_{i=1}^{K} \mathbb{I}(k \in S_i)$, where $\mathbb{I}$ is the indicator function, and the stability ratio is $SR = \frac{1}{N} \sum_{k=1}^{N} \mathbb{I}(f_k \ge \tau)$, where N is the total number of features and τ is the stability threshold.
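These stability metrics can be computed directly from the per-fold RFECV selections; the sketch below is an illustrative implementation in which fold_sets (a list of selected-feature sets, one per fold) and the threshold tau are assumed inputs.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    # J(S_i, S_j) = |S_i ∩ S_j| / |S_i ∪ S_j|
    return len(a & b) / len(a | b) if (a | b) else 1.0

def stability_metrics(fold_sets, all_features, tau=0.8):
    # Mean Jaccard similarity over all fold pairs
    pairs = list(combinations(fold_sets, 2))
    mean_jaccard = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    # Selection frequency f_k of each feature across the K folds
    K = len(fold_sets)
    freq = {f: sum(f in s for s in fold_sets) / K for f in all_features}
    # Stability ratio: share of features selected in at least tau of the folds
    stability_ratio = sum(v >= tau for v in freq.values()) / len(all_features)
    return mean_jaccard, freq, stability_ratio
```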

3.2 Hybrid optimization

3.2.1 Model scenarios

To evaluate the impact of hybrid optimization and model integration strategies, three scenarios are implemented:

a. Stacked_Hybrid_Full: Focuses on optimizing both base model hyperparameters and stacking weights through a meta-model that learns optimal blending.

b. Hybrid_Hyper_XGB: Optimizes only the hyperparameters of the base models, followed by a simple equal-weight ensemble average without explicit weight optimization.

c. Baseline_XGB: A reference model with fixed, non-optimized hyperparameters (e.g., n_estimators = 100, max_depth = 4, learning_rate = 0.05).

3.2.2 Optimization framework

The optimization framework utilizes a hybrid GA-PSO approach to tune hyperparameters for base models (RF, LightGBM, CatBoost) and, in the case of Stacked_Hybrid_Full, stacking weights. This method integrates the global search efficiency of genetic algorithms with the local refinement precision of particle swarm optimization, facilitating effective exploration of the hyperparameter space. The defined hyperparameter ranges are as follows:

• RF: Number of trees (n_estimators, 50–300), maximum depth (max_depth, 3–15), minimum samples to split (min_samples_split, 2–15), minimum samples per leaf (min_samples_leaf, 1–10).

• LightGBM: Number of trees (n_estimators, 50–300), maximum depth (max_depth, 3–15), learning rate (learning_rate, 0.01–0.2), feature fraction (feature_fraction, 0.5–0.8), bagging fraction (bagging_fraction, 0.5–0.8), minimum data in leaf (min_data_in_leaf, 10–50).

• CatBoost: Number of iterations (iterations, 50–300), depth (depth, 3–15), learning rate (learning_rate, 0.01–0.2), L2 regularization (l2_leaf_reg, 1–10), border count (border_count, 32–255).

This yields a total of 15 hyperparameters across the three base models. For the Stacked_Hybrid_Full scenario, an additional three dimensions are included to optimize the stacking weights, enhancing the ensemble’s adaptability.
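For reference, this search space can be written compactly as a bounds dictionary; the sketch below mirrors the parameter names of the respective libraries, and integer-valued parameters would be rounded when a candidate solution is decoded.

```python
# 4 + 6 + 5 = 15 hyperparameter dimensions; Stacked_Hybrid_Full appends three stacking weights.
SEARCH_SPACE = {
    "rf": {"n_estimators": (50, 300), "max_depth": (3, 15),
           "min_samples_split": (2, 15), "min_samples_leaf": (1, 10)},
    "lightgbm": {"n_estimators": (50, 300), "max_depth": (3, 15),
                 "learning_rate": (0.01, 0.2), "feature_fraction": (0.5, 0.8),
                 "bagging_fraction": (0.5, 0.8), "min_data_in_leaf": (10, 50)},
    "catboost": {"iterations": (50, 300), "depth": (3, 15),
                 "learning_rate": (0.01, 0.2), "l2_leaf_reg": (1, 10),
                 "border_count": (32, 255)},
}
```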

3.2.3 GA-PSO hybrid optimization

Hyperparameters for the base models (RF, LightGBM and CatBoost) and the stacking weights for Stacked_Hybrid_Full are optimized using a hybrid GA-PSO approach. The optimization population consists of N = 10 particles, evolved over G = 15 generations. Each particle $\mathbf{x}_i = \left(x_{i1}, \ldots, x_{iD}\right)$ (where $i = 1, \ldots, 10$ and D is the number of hyperparameters or weights) is initialized randomly within feasible bounds. The PSO component updates particle velocity and position as shown in Equations 1, 2:

$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 r_1 \left(pbest_{id} - x_{id}^{t}\right) + c_2 r_2 \left(gbest_{d} - x_{id}^{t}\right),$ (1)
$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$ (2)

For $t = 1, \ldots, 15$, where w is the inertia weight, $c_1$, $c_2$ are learning factors and $r_1, r_2 \sim U(0, 1)$.

The GA component applies crossover (Equation 3):

$x_{cd} = \alpha\, x_{pd} + \left(1 - \alpha\right) x_{qd}, \qquad \alpha \sim U(0, 1),$ (3)

and mutation, applied with probability $p_m$ to refine the population (Equation 4):

$x_{id}' = x_{id} + \mathcal{N}\left(0, \sigma^2\right),$ (4)

The fitness function minimizes the 10-fold cross-validated mean squared error (MSE) using Equation 5:

$f\left(\mathbf{x}_i\right) = \frac{1}{10} \sum_{k=1}^{10} \frac{1}{n_k} \sum_{j \in \mathrm{fold}_k} \left(y_j - \hat{y}_j\left(\mathbf{x}_i\right)\right)^2.$ (5)

This optimization is applied to the Stacked_Hybrid_Full and Hybrid_Hyper_XGB scenarios, each loading its scenario-specific parameters, while Baseline_XGB uses fixed hyperparameters.
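A hedged sketch of the hybrid GA-PSO loop (Equations 1–4) is given below. The population size (N = 10), generation count (G = 15), and a cross-validated fitness follow the text; the partner-selection scheme for crossover, the mutation scale, and the inertia/learning factors are illustrative assumptions.

```python
import numpy as np

def ga_pso(fitness, bounds, n_particles=10, n_gens=15,
           w=0.7, c1=1.5, c2=1.5, p_mut=0.1, sigma=0.05, seed=0):
    """bounds: NumPy array of shape (D, 2) holding (lower, upper) per dimension."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    D = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, D))   # positions
    v = np.zeros_like(x)                             # velocities
    f = np.array([fitness(p) for p in x])
    pbest, pbest_f = x.copy(), f.copy()
    gbest = x[np.argmin(f)].copy()

    for _ in range(n_gens):
        # PSO update (Eqs. 1-2)
        r1 = rng.random((n_particles, D))
        r2 = rng.random((n_particles, D))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        # GA crossover (Eq. 3): blend each particle with a random partner
        partners = rng.permutation(n_particles)
        alpha = rng.random((n_particles, 1))
        x = alpha * x + (1.0 - alpha) * x[partners]
        # GA mutation (Eq. 4): Gaussian perturbation applied with probability p_mut
        mask = rng.random((n_particles, D)) < p_mut
        x = np.clip(x + mask * rng.normal(0.0, sigma * (hi - lo), x.shape), lo, hi)
        # Evaluate and update personal and global bests
        f = np.array([fitness(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, pbest_f.min()
```

In the full workflow, fitness(p) would decode the particle p into the 15 hyperparameters (plus the three stacking weights for Stacked_Hybrid_Full), fit the base models, and return the 10-fold cross-validated MSE of Equation 5.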

3.2.4 Convergence monitoring and stability assessment

To ensure robust convergence and address optimization stability, we implemented comprehensive monitoring mechanisms (Equations 6–8):

Convergence: $\frac{1}{w} \sum_{i=t-w+1}^{t} \left(R_i^2 - R_{i-1}^2\right) < \varsigma,$ (6)
$\max_{i \in \left[t-w+1,\, t\right]} \left(R_i^2 - R_{i-1}^2\right) < 2\varsigma,$ (7)

where w = 4, and ς = 0.002, tracking improvement rates across generations. Hyperparameter trajectory analysis tracks evolution of key parameters (n_estimators, max_depth, learning_rate) across all 15 generations. The stability quantification is:

$\sigma_{R^2} = \sqrt{\frac{1}{N-1} \sum_{i=G-4}^{G} \left(R_i^2 - \bar{R}^2\right)^2},$ (8)

where G = 15, N = 5 final generations. Generation sufficiency validation is assessed through the marginal gain $\Delta R^2_{\mathrm{final}} = R^2_{G} - R^2_{\mathrm{conv}}$ and the hyperparameter coefficient of variation $CV_{\mathrm{param}} = \sigma_{\mathrm{param}} / \mu_{\mathrm{param}}$. Computational feasibility is key for industry adoption; our configuration balances exploration/convergence (N = 10, G = 15; fitness via 10-fold CV). On a standard GPU (RTX 3080), full optimization takes ∼45 min for 5,000 samples (RFECV: 10 min; GA-PSO: 25 min; stacking: 10 min).
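Operationally, the convergence test of Equations 6, 7 reduces to a short check on the per-generation best-R² history; the sketch below uses the stated window w = 4 and threshold ς = 0.002.

```python
def has_converged(r2_history, w=4, zeta=0.002):
    # r2_history: best R^2 per generation, in order
    if len(r2_history) < w + 1:
        return False
    deltas = [r2_history[i] - r2_history[i - 1]
              for i in range(len(r2_history) - w, len(r2_history))]
    # Mean improvement below zeta AND no single jump above 2*zeta (Eqs. 6-7)
    return (sum(deltas) / w < zeta) and (max(deltas) < 2 * zeta)
```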

3.2.5 Stacking and ensemble modeling

The Stacked_Hybrid_Full scenario integrates enhanced stacking techniques to optimize both hyperparameters and weights. Base learner predictions $\hat{y}_i^{(m)} = f_m\left(\mathbf{x}_i; \theta_m\right)$ are generated for M = 3 models (RF, LightGBM and CatBoost), with $\theta_m$ optimized via GA-PSO. Out-of-fold (OOF) predictions are computed using 10-fold cross-validation (Equation 9):

$\hat{y}_i^{(m,k)} = f_m\left(\mathbf{x}_i; \theta_m^{(-k)}\right), \qquad \hat{y}_i^{(m)} = \frac{1}{9} \sum_{k \ne k(i)} \hat{y}_i^{(m,k)},$ (9)

where $\theta_m^{(-k)}$ denotes the model trained without fold k and $k(i)$ is the fold containing sample i, forming the stacking matrix $S$ with rows $\mathbf{s}_i = \left[\hat{y}_i^{(1)}, \hat{y}_i^{(2)}, \hat{y}_i^{(3)}\right]$. The meta-model employs Ridge regression to optimize the weights $\boldsymbol{\beta}$ (Equation 10):

$\min_{\boldsymbol{\beta}} \; \left\| \mathbf{y} - S\boldsymbol{\beta} \right\|_2^2 + \alpha \left\| \boldsymbol{\beta} \right\|_2^2,$ (10)

solved as $\boldsymbol{\beta} = \left(S^{T} S + \alpha I\right)^{-1} S^{T} \mathbf{y}$, with α selected via RidgeCV from [0.1, 1.0, 10]. The final prediction is $\hat{y}_i = \mathbf{s}_i \boldsymbol{\beta}$.

The derived weights β provide critical interpretability insights; for porosity estimation, optimal α = 1.0 yields β = [-0.032, 0.331, 0.701] for [RF, LightGBM, CatBoost], indicating CatBoost’s dominant role (70.1%) in capturing complex porosity relationships. For water saturation, stronger regularization (α = 10.0) produces β = [−0.169, 0.552, 0.617], reflecting balanced contributions between CatBoost and LightGBM for saturation physics. Negative weights function as error-correction mechanisms, downweighting models prone to systematic biases while maintaining ensemble diversity. This approach reduces overfitting (via CV), enhances diversity, and improves interpretability through the geologically meaningful linear coefficients β.
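A compact sketch of this stacking stage is shown below. It follows the common leave-fold-out construction of the OOF stacking matrix rather than the nine-fold averaging written in Equation 9, and fits a RidgeCV meta-learner over the stated alpha grid; the base models are assumed to be already GA-PSO-tuned, scikit-learn-compatible estimators, and X, y are NumPy arrays.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def fit_stack(base_models, X, y, n_splits=10, seed=42):
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    S = np.zeros((len(X), len(base_models)))           # stacking matrix of OOF predictions
    for m, model in enumerate(base_models):
        for train_idx, val_idx in kf.split(X):
            model.fit(X[train_idx], y[train_idx])      # train without the held-out fold
            S[val_idx, m] = model.predict(X[val_idx])
    meta = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(S, y)  # learns the weights beta (Eq. 10)
    for model in base_models:                          # refit on all training data for deployment
        model.fit(X, y)
    return base_models, meta

def predict_stack(base_models, meta, X_new):
    S_new = np.column_stack([m.predict(X_new) for m in base_models])
    return meta.predict(S_new)                         # y_hat = s_i . beta
```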

The Hybrid_Hyper_XGB scenario optimizes the base model hyperparameters $\theta_m$ via GA-PSO to minimize (Equation 11):

$MSE\left(\theta_m\right) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - f_m\left(\mathbf{x}_i; \theta_m\right)\right)^2,$ (11)

for M = 3 (RF, LightGBM and CatBoost). The ensemble prediction uses equal-weight averaging (Equation 12):

$\hat{y}_i = \frac{1}{3} \sum_{m=1}^{3} f_m\left(\mathbf{x}_i; \theta_m\right),$ (12)

lacking a meta-model, which distinguishes it from Stacked_Hybrid_Full and preserves its focus on hyperparameter optimization alone.

The Baseline_XGB scenario serves as a reference with fixed hyperparameters $\theta_0$ = {n_estimators = 100, max_depth = 4, learning_rate = 0.05}, yielding predictions $\hat{y}_i = f_{\mathrm{XGB}}\left(\mathbf{x}_i; \theta_0\right)$ without optimization, providing a benchmark for comparison.

3.2.6 Physics-informed enhancement

A physics-informed hybrid ensemble is introduced through a post hoc refinement step that constrains the stacked ensemble predictions $\hat{y}_i = \mathbf{s}_i \boldsymbol{\beta}$ using petrophysical priors derived from log features. We compute a raw Wyllie/density estimate of porosity from the logs and then calibrate it to the original porosity labels on the training wells only (Equation 13):

$\phi_r = 100 \left[ 0.6\, \frac{\Delta t - \Delta t_{ma}}{\Delta t_{f} - \Delta t_{ma}} \left(1 - v_{sh}\right) + 0.4\, \frac{\rho_{ma} - \rho_b}{\rho_{ma} - \rho_f} \right],$ (13)

where $\Delta t = 10^{6} / V_P$ (with $V_P$ in ft/s), $v_{sh}$ is the volume of shale, and $\Delta t_{ma}$, $\Delta t_{f}$, $\rho_{ma}$, $\rho_f$ are the standard matrix/fluid constants.

We fit a monotone map g (Equation 14) (isotonic or robust linear, constrained to be increasing) so that

$g = \underset{g\;\mathrm{monotone}}{\arg\min} \sum_{i \in \mathrm{train}} \left(\phi_i^{\mathrm{label}} - g\left(\phi_{r,i}\right)\right)^2,$ (14)

where $\phi_i^{\mathrm{label}}$ is the original porosity label for sample i.

Applying g yields the porosity prior on any data set; this estimate is used only as a prior in the blend (Equation 15):

$\phi_{\mathrm{prior}} = g\left(\phi_r\right),$ (15)

With the porosity and the true resistivity $R_t$, Archie's equation gives the water saturation prior (Equation 16):

$S_{w,\mathrm{prior}} = 100 \left( \frac{a R_w}{\phi^{m} R_t} \right)^{1/n}, \qquad a = 1,\; m = 2,\; n = 2.$ (16)

The values (a = 1, m = 2, n = 2) are standard for water-wet, low-clay sandstones, as validated in Ordos Basin studies (e.g., average a = 0.98 ± 0.05, m = 1.95 ± 0.1, n = 2.05 ± 0.08 from core–log calibration in Chang 7) (Yang et al., 2016). These assume a clean quartz matrix with $R_w$ derived from spontaneous potential logs, avoiding Simandoux corrections for minor shaliness ($V_{sh}$ < 30%). Training and validation use the original porosity (the dataset label) inside Archie's equation to build $S_{w,\mathrm{prior}}$ in an OOF fashion.

Since the stacked ensemble prediction is $\hat{y}_i = \mathbf{s}_i \boldsymbol{\beta}$, let $y_{\mathrm{prior}} \in \left\{\phi_{\mathrm{prior}}, S_{w,\mathrm{prior}}\right\}$ be the corresponding physics prior. We form a soft physics constraint via a convex blend (Equation 17):

$\hat{y}_{\mathrm{phys}}\left(\eta\right) = \left(1 - \eta\right) \hat{y} + \eta\, y_{\mathrm{prior}}, \qquad \eta \in \left[0, 1\right].$ (17)

η is selected separately for porosity and water saturation by OOF cross-validation on the training wells over a small grid (e.g., {0.00, 0.05, …, 0.40}), minimizing the OOF MSE. This update is the unique minimizer of Equation 18:

$\min_{\mathbf{y}} \; \left\| \mathbf{y} - \hat{\mathbf{y}} \right\|_2^2 + \lambda \left\| \mathbf{y} - \mathbf{y}_{\mathrm{prior}} \right\|_2^2,$ (18)

Here, $\eta = \lambda / \left(1 + \lambda\right)$ ensures that the physics prior acts purely as a soft regularizer: when the stacked model is confident, η is small; when the data signal is weak or uncertain, η increases, providing stability and geological plausibility. A data-adaptive variant in which the blend weight is learned as a function of the input features may also be evaluated; however, the scalar η formulation above remains the primary, transparent specification of the physics-informed refinement.
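The sketch below illustrates how the priors of Equations 13–17 can be assembled and blended. The sandstone matrix/fluid constants are typical textbook values, the default R_w is a placeholder (the study derives R_w from spontaneous potential logs), and the function structure is an assumption rather than the authors' code.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

DT_MA, DT_F = 55.5, 189.0      # typical sandstone matrix / fluid transit times (us/ft)
RHO_MA, RHO_F = 2.65, 1.0      # typical matrix / fluid densities (g/cm^3)

def porosity_prior(vp_ft_s, vsh, rho_b, phi_label_train, train_mask):
    dt = 1.0e6 / vp_ft_s                                      # sonic transit time
    phi_sonic = (dt - DT_MA) / (DT_F - DT_MA) * (1.0 - vsh)   # shale-corrected Wyllie term
    phi_dens = (RHO_MA - rho_b) / (RHO_MA - RHO_F)            # density-porosity term
    phi_raw = 100.0 * (0.6 * phi_sonic + 0.4 * phi_dens)      # Eq. 13, in percent
    # Monotone calibration g fitted on the training wells only (Eq. 14)
    g = IsotonicRegression(increasing=True, out_of_bounds="clip")
    g.fit(phi_raw[train_mask], phi_label_train)
    return g.predict(phi_raw)                                 # Eq. 15

def sw_prior(phi_percent, rt, rw=0.05, a=1.0, m=2.0, n=2.0):  # rw value is a placeholder
    phi = np.clip(phi_percent / 100.0, 1e-3, None)
    return 100.0 * (a * rw / (phi ** m * rt)) ** (1.0 / n)    # Archie, Eq. 16

def blend(y_hat, y_prior, eta):
    return (1.0 - eta) * y_hat + eta * y_prior                # Eq. 17

def select_eta(y_true_oof, y_hat_oof, y_prior_oof, grid=np.arange(0.0, 0.45, 0.05)):
    # eta chosen per target by minimizing the OOF MSE over the small grid
    mses = [np.mean((y_true_oof - blend(y_hat_oof, y_prior_oof, e)) ** 2) for e in grid]
    return grid[int(np.argmin(mses))]
```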

3.3 SHAP interpretability analysis

SHAP, introduced by Lundberg and Lee (2017), is a unified framework designed to interpret predictions from complex machine learning models. To overcome the interpretability limitations of black-box models, SHAP constructs a simplified surrogate model that approximates the behavior of the original predictive model. This surrogate decomposes the output into additive contributions from each input feature, facilitating transparent insight into the reasoning behind individual predictions. SHAP analysis, a core component of the methodology, quantifies feature contributions for the base models (RF, LightGBM, CatBoost) within Stacked_Hybrid_Full, providing insights into geological drivers (e.g., Vp, Vs, Vp/Vs). SHAP values are computed by using the TreeExplainer for each base model, enabling interpretation of individual model contributions to the stacked ensemble. For a given base model m, the prediction for a sample $\mathbf{x} \in \mathbb{R}^{n}$ is decomposed as Equation 19:

$f_m\left(\mathbf{x}\right) = E\left[f_m\left(\mathbf{x}\right)\right] + \sum_{j=1}^{n} \psi_j,$ (19)

where $\psi_j$ is the SHAP value of the j-th feature, and the summation runs over the number of features in the dataset. The SHAP value $\psi_j$ is computed as Equation 20:

$\psi_j = \sum_{S \subseteq \left\{1, \ldots, n\right\} \setminus \left\{j\right\}} \frac{\left|S\right|!\,\left(n - \left|S\right| - 1\right)!}{n!} \left[ f_m\left(\mathbf{x}_{S \cup \left\{j\right\}}\right) - f_m\left(\mathbf{x}_S\right) \right],$ (20)

where S is a subset of the feature indices excluding j, and $f_m\left(\mathbf{x}_S\right)$ is the model's output using only the features in S. SHAP values are computed for the training data and per test well to explain performance variability. Mean absolute SHAP values are calculated as Equation 21:

$\bar{\psi}_j = \frac{1}{n} \sum_{i=1}^{n} \left|\psi_{j,i}\right|.$ (21)
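In practice these quantities are obtained per base model with SHAP's TreeExplainer; a minimal sketch, assuming already-fitted tree-based base models, is shown below.

```python
import numpy as np
import shap

def shap_importance(base_models, X, feature_names):
    # base_models: dict of fitted tree-based regressors, e.g. {"rf": ..., "lgbm": ..., "catboost": ...}
    importance = {}
    for name, model in base_models.items():
        explainer = shap.TreeExplainer(model)
        shap_values = explainer.shap_values(X)          # array of shape (n_samples, n_features)
        mean_abs = np.abs(shap_values).mean(axis=0)     # mean |SHAP| per feature (Eq. 21)
        importance[name] = dict(zip(feature_names, mean_abs))
    return importance
```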

3.4 Workflow overview

Figure 2 illustrates the proposed framework for predicting porosity and water saturation in unconventional reservoirs by using well log data. The methodology integrates robust data preprocessing, including mean imputation and robust scaling, with a multi-stage machine learning pipeline. The workflow employs a hybrid GA-PSO to simultaneously tune the hyperparameters of the base models (RF, LightGBM, CatBoost) and the stacking weights. The pipeline constructs a stacked ensemble with a Ridge regression meta-learner and a post hoc physics-informed refinement, incorporating bias correction and regularization to address overestimation. Performance is assessed via 5-fold cross-validation, per-well metrics (R², RMSE), and leave-one-well-out CV. SHAP analysis quantifies the contributions of individual features and base models, enhancing model interpretability and providing geological insights.


Figure 2. Workflow of the methodology.

4 Results

4.1 Feature selection analysis

RFECV revealed clear model-dependent variability across the ensemble learners for both porosity and water saturation prediction targets (Figures 3, 4). The optimal number of features ranged between 8 and 11 depending on the model, reflecting differences in feature interaction complexity captured by each algorithm. For the porosity target, RF achieved optimal performance with 10 features, CatBoost with eight features, and LightGBM with 11 features (Figure 3). Similarly, for the water saturation target, the optimal subsets comprised eight features (Random Forest), 11 features (CatBoost), and nine features (LightGBM) (Figure 4).


Figure 3. RFECV analysis with confidence intervals for porosity, (a) RF, (b) CatBoost and (c) LightGBM.


Figure 4. RFECV analysis with confidence intervals for water saturation, (a) RF, (b) CatBoost and (c) LightGBM.

Feature selection stability analysis across five cross-validation folds indicated that Random Forest exhibited the highest robustness (mean Jaccard = 0.86 ± 0.05), followed by LightGBM (0.62 ± 0.08) and CatBoost (0.36 ± 0.12). Pairwise fold comparisons for Random Forest exceeded 0.80 in 85% of cases, highlighting its ability to consistently identify key predictors (Figures 5a,d for porosity; Figures 6a,d for water saturation). Core geophysical attributes including V S, Poisson’s ratio, and ρ displayed selection frequencies above 80% across all models (Figure 5b for porosity; Figure 6b for water saturation). These attributes form the backbone of elastic and petrophysical interdependencies, making them consistently favored by the models. In contrast, derived and engineered attributes such as Vp/Vs showed intermediate selection frequencies (20%–60%) and strong model-specific variability (Figure 5c for porosity; Figure 6c for water saturation).


Figure 5. Feature selection stability analysis for porosity. (a) Mean Jaccard similarity across models, (b) heatmap of feature selection frequency per fold (color scale 0–1), (c) histogram of optimal feature counts per model and (d) bar chart comparing stability metrics (mean Jaccard, STD, min Jaccard).


Figure 6. Feature selection stability analysis for water saturation. (a) Mean Jaccard similarity across models, (b) heatmap of feature selection frequency per fold (color scale 0–1), (c) histogram of optimal feature counts per model and (d) bar chart comparing stability metrics (mean Jaccard, STD, min Jaccard).

4.2 GA-PSO optimization convergence

The hybrid GA-PSO optimization demonstrated robust convergence across all modeling scenarios (Figure 7). Convergence analysis revealed that, for porosity, Hybrid_Hyper_XGB converged at generation 7 (46.7% of the total) with excellent stability and Stacked_Hybrid_Full converged at generation 5 (33.3%) with high stability, whereas for water saturation, Hybrid_Hyper_XGB converged at generation 8 (53.3%) with perfect stability and Stacked_Hybrid_Full converged at generation 10 (66.7%) with good stability.


Figure 7. Convergence metrics, (a) convergence generation analysis, (b) optimization marginal gains, (c) R² stability in final generations and (d) relative performance improvement.

Hyperparameter trajectories (Figure 8) indicate consistent stabilization trends for n_estimators (220–250), max_depth (8–9) and learning_rate (0.07) after generation 6.


Figure 8. Hyperparameter trajectories, (a) n_estimators evolution, (b) max_depth evolution, (c) learning rate evolution and (d) convergence speed comparison.

R² evolution profiles (Figure 9) highlight that both porosity and water saturation targets achieved monotonic improvements during optimization. The Stacked_Hybrid_Full model maintained higher average and best R² values at each generation, outperforming Hybrid_Hyper_XGB by +0.002–0.003 in porosity and +0.004 in water saturation. Relative performance improvements reached 0.22% (porosity) and 0.29% (water saturation), with stable R² standard deviations <2.4 × 10⁻⁴ in final generations (Table 1).


Figure 9. R² evolution profiles, (a) porosity best R² evolution, (b) porosity average R² evolution, (c) water saturation best R² evolution and (d) water saturation average R² evolution.


Table 1. Optimization convergence metrics.

As shown in Table 1, marginal R² gains fell below 0.0002 beyond generation 10, confirming an optimal exploration–exploitation balance within 15 generations. The Stacked_Hybrid_Full model achieved faster convergence (5–10 generations) than Hybrid_Hyper_XGB (7–8 generations) while delivering higher R² stability in the final iterations (Figure 9).

4.3 Learning behaviour and cross-well validation on training data

To understand the impact of training data size on model performance and stability, learning curve analysis is performed for each target property. The learning curves are generated by progressively increasing the number of training samples from the combined set of seven wells and computing the mean cross-validation R² score at each increment. Learning curves (Figure 10) reveal that all models improve with increased training data, plateauing around 3,500 samples. Stacked_Hybrid_Full maintained the highest R² across all sample sizes, showing better sample efficiency and stability.


Figure 10. Learning curves for (a) porosity, (b) water saturation.

Leave-One-Well-Out (LOWO) cross-validation was employed on the seven-well training dataset to quantify model robustness and check for overfitting. This technique systematically holds out data from one well at a time for validation while training on the remaining wells, ensuring the model's performance is assessed on entirely unseen spatial/geological contexts. This mitigates overfitting by preventing the model from memorizing training-specific patterns and promotes generalization across diverse well conditions. Figures 11a,b demonstrate the LOWO-validated performance for porosity and water saturation predictions. These dual-axis bar charts show R² and RMSE for each held-out well. Hatched bars indicate the Hybrid_Hyper_XGB scenario; solid bars indicate Stacked_Hybrid_Full. Consistently high R² (>0.8 for most wells in porosity; variable but generally >0.7 in water saturation) and low RMSE (e.g., <1.2 for porosity across wells) confirm robust generalization without signs of overfitting, such as inflated training metrics that degrade on validation.


Figure 11. Cross-well performance evaluation on training data (a) porosity and (b) water saturation.

4.4 Depth-wise well-log predictions

To quantify the impact of the post hoc physics constraints, we performed an ablation comparing the pure, pre-blending ensemble predictions (Stacked_Hybrid_Pure) against the physics-constrained version (Stacked_Hybrid_Full). The metrics (Table 2) show that blending boosts R² by 2%–4% (e.g., from 0.826 to 0.8945 for the porosity aggregate; 95% CI [0.88, 0.90]) and reduces MAE by 10%–15% relative to the pure ensemble, with narrower residuals in low-porosity (<5%) zones (Figure 12). This outperforms the direct Wyllie/Archie baselines (Traditional; R² = 0.47/0.58) while stabilizing variance, confirming that the soft constraints mitigate overfitting without restricting high-fidelity predictions. Depth overlays (Figure 13) highlight improved tracking of shaly transitions, validating geological consistency. These gains underscore the post hoc approach as a scalable PIML bridge for tight reservoirs.


Table 2. Ablation metrics.


Figure 12. Porosity predictions comparing Stacked_Hybrid_Full, Stacked_Hybrid_Pure and Traditional ablations for Wells, (a) 1, (b) 2 and (c) 3.


Figure 13. Water saturation predictions comparing Stacked_Hybrid_Full, Stacked_Hybrid_Pure and Traditional ablations for Wells, (a) 1, (b) 2 and (c) 3.

With the physics refinement confirmed, we now compare the scenarios, emphasizing the impact of the optimization strategy. For porosity (Figure 14), Stacked_Hybrid_Full predictions track actual values closely across depths (1,920–2,020 m in (a), 2,060–2,140 m in (b) and (c)), capturing heterogeneities such as porosity spikes at ∼1,980 m and ∼2,120 m. Hybrid_Hyper_XGB and Baseline_XGB overestimate low-porosity zones, leading to smoothed profiles that overlook fine-scale variations. Water saturation profiles (Figure 15) show a similar Stacked_Hybrid_Full superiority, with accurate reproduction of saturation gradients (e.g., sharp transitions at ∼2,100–2,120 m), whereas the baseline models introduce artifacts in high-saturation layers.


Figure 14. Porosity predictions comparing Stacked_Hybrid_Full, Hybrid_Hyper_XGB and Baseline models for wells (a) 1, (b) 2, (c) 3.


Figure 15. Water saturation predictions comparing Stacked_Hybrid_Full, Hybrid_Hyper_XGB and Baseline models for wells (a) 1, (b) 2, (c) 3.

4.5 Model performance evaluation

Scatter plots of predicted versus actual values for porosity (Figure 16) and water saturation (Figure 17) across the three independent test wells demonstrate the Stacked_Hybrid_Full model's superior performance. For porosity, Stacked_Hybrid_Full achieves R² values of 0.783, 0.901, and 0.860 (a–c), with points closely aligned along the 1:1 line (0%–14% range), outperforming Hybrid_Hyper_XGB (R² = 0.586, 0.600, 0.823) and Baseline_XGB (R² = 0.272, 0.562, 0.576), though all models show increased scatter at <4% porosity. For water saturation, Stacked_Hybrid_Full yields R² values of 0.902, 0.823, and 0.927, aligning tightly with actual values (30%–100% range), surpassing Hybrid_Hyper_XGB (R² = 0.744, 0.673, 0.740) and Baseline_XGB (R² = 0.604, 0.331, 0.541), with greater deviations in the 50%–80% range. Stacked_Hybrid_Full's hybrid stacking enhances its ability to model nonlinear reservoir properties, while Hybrid_Hyper_XGB and Baseline_XGB's limitations suggest a need for improved handling of heterogeneity.


Figure 16. Predicted vs. actual porosity for wells (a) 1, (b) 2, (c) 3.


Figure 17. Predicted vs. actual water saturation for wells (a) 1, (b) 2, (c) 3.

Table 3 shows the R², RMSE, and MAE for each model (Stacked_Hybrid_Full, Hybrid_Hyper_XGB, Baseline_XGB) across all three wells. The error metrics include both absolute (RMSE, MAE) and relative (R²) measures, providing a complete picture of model performance.


Table 3. Error metrics R², RMSE, and MAE for each model.

Residual histograms (Figures 18, 19) reveal that Stacked_Hybrid_Full has narrow, symmetric error distributions centered near zero, indicating minimal bias. Hybrid_Hyper_XGB shows moderate spread, while Baseline_XGB residuals are skewed with heavier tails.


Figure 18. Residual histogram for porosity, wells (a) 1, (b) 2, (c) 3.


Figure 19. Residual histogram for water saturation, wells (a) 1, (b) 2, (c) 3.

Residual-versus-depth scatter plots (Figures 20, 21) test for depth-dependent biases. Stacked_Hybrid_Full residuals cluster randomly around zero without trends, affirming model independence from depth. Hybrid_Hyper_XGB and Baseline_XGB display funnelling (increasing spread at greater depths) and slight positive biases below 2,100 m, potentially linked to unmodeled geological factors.


Figure 20. Porosity residual depth profiles, wells (a) 1, (b) 2, (c) 3.


Figure 21. Water saturation residual depth profiles, wells (a) 1, (b) 2, (c) 3.

4.6 Model interpretability analysis using SHAP

For transparency and interpretability in the ensemble prediction pipeline, SHAP analysis is performed exclusively on the Stacked_Hybrid_Full scenario. This model is selected for interpretability as it exhibits superior generalization performance and structural fidelity across all evaluation metrics.

The bar charts and beeswarm plots of mean SHAP values assess the significance and directional influence of input features on porosity and water saturation predictions. Global SHAP feature importance (Figures 22a, 23a) identifies permeability, shale volume, and depth as the dominant predictors for porosity; permeability, resistivity, and depth dominate for water saturation. Beeswarm plots (Figures 22b, 23b) show how feature value ranges influence predictions: for example, high permeability and low shale volume increase porosity, while low resistivity and high permeability raise water saturation.


Figure 22. (a) Bar chart of the mean SHAP values and (b) Beeswarm summary plots for porosity.


Figure 23. (a) Bar chart of the mean SHAP values and (b) Beeswarm summary plots for water saturation.

Interaction plots (Figures 24, 25) reveal coupled lithology–fluid effects. For porosity, permeability interacts strongly with gamma ray, shale volume, and density; for water saturation, interactions are strongest between permeability and V S, resistivity, and shale volume. These interactions highlight the multi-factor controls on reservoir properties, the model's ability to capture nonlinear relationships, and the value of integrating multiple logs.


Figure 24. Interaction plots for porosity. (a) Permeability with gamma, (b) shale volume with density, (c) depth with permeability, and (d) density with permeability.


Figure 25. Interaction plots for water saturation. (a) Permeability with V S, (b) resistivity with permeability, (c) depth with shale volume, and (d) gamma with permeability.

Collectively, this interpretability analysis validates the scientific robustness of the stacked ensemble. By leveraging SHAP analysis, the framework not only delivers superior prediction accuracy but also provides geologically meaningful explanations for the observed trends. Such transparency is vital for reservoir characterization, risk-informed decision-making, and field development planning.

5 Discussion

This study presents a PIML framework for predicting petrophysical properties in tight reservoirs. The results demonstrate a robust integration of feature selection, model optimization, and physical principles, leading to significant improvements in predictive accuracy and geological consistency.

RFECV identified model-dependent optimal feature subsets, typically comprising 8–11 features for predicting porosity and water saturation. The RF algorithm demonstrated superior selection stability, evidenced by a mean Jaccard similarity of 0.86 ± 0.05, whereas CatBoost exhibited higher variability (0.36 ± 0.12). Core petrophysical attributes, namely V S, Poisson's ratio, and ρ, were consistently selected with a frequency exceeding 80%, underscoring their fundamental relationship with the target properties.

Hyperparameter optimization via the GA-PSO hybrid algorithm converged efficiently, requiring only 5–10 generations for the Stacked_Hybrid_Full model and 7–8 for the Hybrid_Hyper_XGB model. This process stabilized key parameters within narrow, effective ranges (n_estimators: 220–250; max_depth: 8–9; learning_rate: 0.07). The marginal improvement in R² diminished below a threshold of 0.0002 after the 10th generation, with the Stacked_Hybrid_Full model achieving a final performance gain of +0.002–0.004 over other ensembles. Learning curve analysis confirmed the sample efficiency of the approach, with model performance plateauing at approximately 3,500 training samples.

Cross-well validation affirmed the robustness of the developed models. For porosity prediction, the Stacked_Hybrid_Full model consistently excelled (R²: 0.835–0.969; minimum RMSE: 0.61%), with the exception of one well where the Hybrid_Hyper_XGB model performed best. Predictions for water saturation exhibited greater inter-well variability, with a notably high RMSE in Well 5, highlighting the persistent challenges in modeling fluid saturation within highly heterogeneous intervals.

A post hoc physics-refinement step was implemented, blending log-derived physical models from Wyllie’s equation for porosity and the Archie-Simandoux model for water saturation as soft regularizers within the ML framework. This scalable PIML step yielded a 2%–4% gain in R 2 and a 10%–15% reduction in MAE, without requiring model retraining. The resulting depth profiles demonstrated a superior capacity to capture subsurface heterogeneity, outperforming traditional methods by 50%–90% in accuracy while ensuring geological consistency.

Analysis of test-well predictions confirmed the superiority of the Stacked_Hybrid_Full model. Residual scatter plots were symmetric and showed no systematic bias with depth, in contrast to baseline models which exhibited significant skewness and funnel-shaped error distributions.

SHAP analysis provided model interpretability, identifying permeability, shale volume, and depth as the primary drivers for porosity, which are consistent with mechanical compaction principles. For water saturation, permeability, resistivity, and depth were the dominant features, aligning with the theoretical foundations of Archie’s law. Beeswarm plots and interaction analysis further revealed directional trends (e.g., low shale volume increasing porosity) and feature couplings (e.g., permeability–gamma ray), thereby validating the synergistic use of multiple well logs.

6 Conclusion

This study has introduced and validated a robust, interpretable ensemble learning framework for the prediction of porosity and water saturation in complex tight reservoirs. The methodology directly addresses the persistent challenges of pronounced lithological heterogeneity and strong nonlinear feature interactions by integrating a structured machine learning pipeline with foundational petrophysical principles. The core of the framework leverages a stacked ensemble architecture, synergizing RF, LightGBM, and CatBoost as base learners with a Ridge regression meta-learner.

RFECV identified compact, model-specific feature subsets, with core attributes like V S, Poisson’s ratio, and ρ consistently selected, which underscores the model’s inherent alignment with established petrophysical relationships.

The dual-phase GA-PSO strategy effectively combined the global exploration capability of Genetic Algorithms with the local refinement of Particle Swarm Optimization. This hybrid approach achieved rapid convergence within 5–10 generations, stabilizing optimal hyperparameters and yielding an efficient and diverse set of base learners for the stacking ensemble. The joint optimization of both hyperparameters and stacking weights in the Stacked_Hybrid_Full configuration proved critical, enabling it to consistently outperform the hyperparameter-tuned Hybrid_Hyper_XGB model. Cross-well validation confirmed its robustness, with the model achieving superior accuracy and demonstrating stable, unbiased residuals across diverse well conditions. A fundamental contribution is the post hoc integration of rock physics priors, using Wyllie's equation for porosity and the Archie-Simandoux model for water saturation as soft regularizers. This scalable PIML step resulted in significant gains and improved the capture of heterogeneous depth profiles, outperforming conventional methods by 50%–90% without requiring retraining.

Furthermore, the framework provides critical interpretability through SHAP analysis, which quantitatively identified permeability, shale volume, and depth as primary drivers for porosity, and permeability, resistivity, and depth for water saturation. These findings are in direct agreement with mechanical compaction theory and Archie’s law. The analysis further revealed specific feature interactions (e.g., permeability-gamma ray coupling), validating the model’s ability to capture the multi-log synergies essential for accurate petrophysical characterization.

While this framework demonstrates robust performance within the geological context of the studied basin, the method should be applicable, with minor adjustments, to reservoir rocks of similar tight clastic lithologies in other areas. Its generalizability to formations with fundamentally different lithology or petrophysical characteristics (e.g., carbonates or gas hydrates) requires further validation. Consequently, a key objective for future work is to extend and adapt this methodology for application across diverse tectonic units and reservoir types, which will involve domain adaptation techniques and the integration of additional, domain-specific physical laws.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

UM: Validation, Conceptualization, Software, Methodology, Resources, Visualization, Funding acquisition, Formal Analysis, Writing – original draft, Data curation. JB: Validation, Methodology, Writing – review and editing, Supervision, Data curation, Funding acquisition, Resources, Visualization. MA: Methodology, Visualization, Validation, Writing – review and editing. FR: Methodology, Data curation, Writing – review and editing, Validation. EO: Software, Writing – review and editing, Data curation, Visualization.

Funding

The authors declare that financial support was received for the research and/or publication of this article. This work is supported by the Fundamental Research Funds for Central Universities of Hohai University (Grant no. B240201039) and the National Natural Science Foundation of China (Grant no. 42174161).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abbas, M. A., Al-Mudhafar, W. J., and Wood, D. A. (2023). Improving permeability prediction in carbonate reservoirs through gradient boosting hyperparameter tuning. Earth Sci. Inf. 16 (4), 3417–3432. doi:10.1007/s12145-023-01099-0

Abid, M., Ba, J., Markus, U. I., Tariq, Z., and Ali, S. H. (2025). Modified approach to estimate effective porosity using density and neutron logging data in conventional and unconventional reservoirs. J. Appl. Geophys. 233, 105571. doi:10.1016/J.JAPPGEO.2024.105571

Adadi, A., and Berrada, M. (2018). Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access 6, 52138–52160. doi:10.1109/ACCESS.2018.2870052

Akande, K. O., Owolabi, T. O., Olatunji, S. O., and AbdulRaheem, A. A. (2017). A hybrid particle swarm optimization and support vector regression model for modelling permeability prediction of hydrocarbon reservoir. J. Petroleum Sci. Eng. 150, 43–53. doi:10.1016/J.PETROL.2016.11.033

Al-Mudhafar, W. (2015). Integrating bayesian model averaging for uncertainty reduction in permeability modeling. Proc. Annu. Offshore Technol. Conf. 1, 33–52. doi:10.4043/25646-MS

Al-Mudhafar, W. J. (2020). “Integrating electrofacies and well logging data into regression and machine learning approaches for improved permeability estimation in a carbonate reservoir in a giant southern Iraqi oil field,” in Proceedings of the annual offshore technology conference, 2020-May. doi:10.4043/30763-MS

Al-Mudhafar, W. J., and Wood, D. A. (2022). “Tree-based ensemble algorithms for lithofacies classification and permeability prediction in heterogeneous carbonate reservoirs,” in Proceedings of the annual offshore technology conference. doi:10.4043/31780-MS

Anifowose, F., Abdulraheem, A., and Al-Shuhail, A. (2019). A parametric study of machine learning techniques in petroleum reservoir permeability prediction by integrating seismic attributes and wireline data. J. Petroleum Sci. Eng. 176, 762–774. doi:10.1016/J.PETROL.2019.01.110

Bai, Y., Tan, M., Cao, H., Tang, J., and Liang, Z. (2022). Intelligent classification of carbonate reservoir quality using multisource geophysical logging and seismic data. IEEE Trans. Geoscience Remote Sens. 60, 1–12. doi:10.1109/TGRS.2022.3140790

Barredo Arrieta, A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., et al. (2020). Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 58, 82–115. doi:10.1016/J.INFFUS.2019.12.012

Chang, J., Li, J., Kang, Y., Lv, W., Xu, T., Li, Z., et al. (2020). Unsupervised domain adaptation using maximum mean discrepancy optimization for lithology identification. Geophysics, 86, ID19–ID30. doi:10.1190/geo2020-0391.1

Dong, S. Q., Sun, Y. M., Xu, T., Zeng, L. B., Du, X. Y., Yang, X., et al. (2023). How to improve machine learning models for lithofacies identification by practical and novel ensemble strategy and principles. Petroleum Sci. 20 (2), 733–752. doi:10.1016/j.petsci.2022.09.006

Feng, D.-C., Wang, W.-J., Mangalathu, S., and Taciroglu, E. (2021). Interpretable XGBoost-SHAP machine-learning model for shear strength prediction of squat RC walls. J. Struct. Eng. 147 (11), 04021173. doi:10.1061/(ASCE)ST.1943-541X.0003115

Feng, P., Wang, R., Sun, J., Yan, W., Chi, P., and Luo, X. (2024). An interpretable ensemble machine-learning workflow for permeability predictions in tight sandstone reservoirs using logging data. Geophysics 89 (5), MR265–MR280. doi:10.1190/GEO2023-0657.1

Gai, J., Jiang, W., Wang, T., Su, X., Dong, C., Yang, E., et al. (2025). A hybrid physics-informed machine learning framework for water cut prediction in waterflooding reservoirs. Results Eng. 28, 107856. doi:10.1016/j.rineng.2025.107856

Gu, Y., Zhang, D., and Bao, Z. (2021a). A new data-driven predictor, PSO-XGBoost, used for permeability of tight sandstone reservoirs: a case study of member of Chang 4 + 5, Western Jiyuan Oilfield, Ordos Basin. J. Petroleum Sci. Eng. 199, 108350. doi:10.1016/j.petrol.2021.108350

Gu, Y., Bao, Z., and Zhang, D. (2021b). A smart predictor used for lithologies of tight sandstone reservoirs: a case study of member of Chang 4 + 5, Jiyuan Oilfield, Ordos Basin. Petroleum Sci. Technol. 39 (7–8), 175–195. doi:10.1080/10916466.2021.1881114

Gu, Y., Yang, Y., Gao, Y., Yan, S., Zhang, D., and Zhang, C. (2022). Data-driven estimation for permeability of simplex pore-throat reservoirs via an improved light gradient boosting machine: a demonstration of sand-mud profile, Ordos Basin, northern China. J. Petroleum Sci. Eng. 217, 110909. doi:10.1016/j.petrol.2022.110909

Huang, C., Zhu, X., Lu, M., Zhang, Y., and Yang, S. (2025). XGBoost algorithm optimized by simulated annealing genetic algorithm for permeability prediction modeling of carbonate reservoirs. Sci. Rep. 15 (1), 14882. doi:10.1038/S41598-025-99627-Z

Ji, X., Wang, H., Ge, Y., Liang, J., and Xu, X. (2022). Empirical mode decomposition-refined composite multiscale dispersion entropy analysis and its application to geophysical well log data. J. Petroleum Sci. Eng. 208, 109495. doi:10.1016/J.PETROL.2021.109495

Kavzoglu, T., and Teke, A. (2022). Predictive performances of ensemble machine learning algorithms in landslide susceptibility mapping using random forest, extreme gradient boosting (XGBoost) and natural gradient boosting (NGBoost). Arabian J. Sci. Eng. 47 (6), 7367–7385. doi:10.1007/S13369-022-06560-8

Khassaf, A. K., Al-Hameed, Z. M., Al-Mohammedawi, N. R., Al-Mudhafar, W. J., Wood, D. A., Abbas, M. A., et al. (2025). “Physics-informed machine learning for enhanced permeability prediction in heterogeneous carbonate reservoirs,” in Proceedings of the annual offshore technology conference. doi:10.4043/35892-MS

Li, L., Chen, J., Gao, C., Zhou, Z., Li, M., Zhang, D., et al. (2025). Peridynamics simulation of hydraulic fracturing in three-dimensional fractured rock mass. Phys. Fluids 37 (7). doi:10.1063/5.0274871

Lipton, Z. C. (2016). The mythos of model interpretability. Commun. ACM 61 (10), 36–43. doi:10.1145/3233231

Liu, J. J., and Liu, J. C. (2022). Permeability predictions for tight sandstone reservoir using explainable machine learning and particle swarm optimization. Geofluids 2022, 1–15. doi:10.1155/2022/2263329

Liu, Q., Li, P., Jin, Z., Sun, Y., Hu, G., Zhu, D., et al. (2022). Organic-rich formation and hydrocarbon enrichment of lacustrine shale strata: a case study of Chang 7 member. Sci. China Earth Sci. 65 (1), 118–138. doi:10.1007/s11430-021-9819-y

Loh, H. W., Ooi, C. P., Seoni, S., Barua, P. D., Molinari, F., and Acharya, U. R. (2022). Application of explainable artificial intelligence for healthcare: a systematic review of the last decade (2011–2022). Comput. Methods Programs Biomed. 226, 107161. doi:10.1016/j.cmpb.2022.107161

Lundberg, S. M., and Lee, S. I. (2017). A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst., 4766–4775. doi:10.48550/arXiv.1705.07874

Mabiala, A. P., Cai, Z., Kouassi, A. K. F., Zhang, H., Mwakipunda, G. C., and Mahamadou, A. S. (2025). Integrating advanced machine learning models for accurate prediction of porosity and permeability in fractured and Vuggy carbonate reservoirs: insights from the Tarim Basin, Northwestern, China. SPE J. 30 (06), 3307–3333. doi:10.2118/226198-PA

Markus, U. I., Ba, J., Abid, M., Faruwa, A. R., and Oli, I. C. (2025). Rock physics and machine learning for lithology identification and estimation of unconventional reservoir properties. Arabian J. Sci. Eng., 1–22. doi:10.1007/S13369-025-10101-4

Mohammadian, E., Kheirollahi, M., Liu, B., Ostadhassan, M., and Sabet, M. (2022). A case study of petrophysical rock typing and permeability prediction using machine learning in a heterogenous carbonate reservoir in Iran. Sci. Rep. 12 (1), 1–15. doi:10.1038/s41598-022-08575-5

Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl, R., and Yu, B. (2019). Interpretable machine learning: definitions, methods, and applications. Proc. Natl. Acad. Sci. U. S. A. 116 (44), 22071–22080. doi:10.1073/pnas.1900654116

Nssibi, M., Manita, G., and Korbaa, O. (2023). Advances in nature-inspired metaheuristic optimization for feature selection problem: a comprehensive survey. Comput. Sci. Rev. 49, 100559. doi:10.1016/J.COSREV.2023.100559

Pothana, P., and Ling, K. (2025). Physics-integrated neural networks for improved mineral volumes and porosity estimation from geophysical well logs. Energy Geosci. 6 (2), 100410. doi:10.1016/J.ENGEOS.2025.100410

Sakar, C., Schwartz, N., and Moreno, Z. (2024). Physics-informed neural networks trained with time-lapse geo-electrical tomograms to estimate water saturation, permeability and petrophysical relations at heterogeneous soils. Water Resour. Res. 60 (8), e2024WR037672. doi:10.1029/2024wr037672

Salem, A. M., Yakoot, M. S., and Mahmoud, O. (2022). A novel machine learning model for autonomous analysis and diagnosis of well integrity failures in artificial-lift production systems. Adv. Geo-Energy Res. 6 (2), 123–142. doi:10.46690/AGER.2022.02.05

Sang, W., Yuan, S., Han, H., Liu, H., and Yu, Y. (2022). Porosity prediction using semi-supervised learning with biased well log data for improving estimation accuracy and reducing prediction uncertainty. Geophys. J. Int. 232 (2), 940–957. doi:10.1093/GJI/GGAC371

Selvam, R., Hiremath, P., Cs, S. K., Ramakrishna Bhat, R., Tomar, V., Bansal, M., et al. (2024). Metaheuristic algorithms for optimization: a brief review. Eng. Proc. 59 (1), 238. doi:10.3390/ENGPROC2023059238

Shao, R., Wang, H., and Xiao, L. (2024). Reservoir evaluation using petrophysics informed machine learning: a case study. Artif. Intell. Geosciences 5, 100070. doi:10.1016/J.AIIG.2024.100070

Sheykhinasab, A., Mohseni, A. A., Barahooie Bahari, A., Naruei, E., Davoodi, S., Aghaz, A., et al. (2023). Prediction of permeability of highly heterogeneous hydrocarbon reservoir from conventional petrophysical logs using optimized data-driven algorithms. J. Petroleum Explor. Prod. Technol. 13 (2), 661–689. doi:10.1007/s13202-022-01593-z

Shi, J., Zou, Y. R., Cai, Y. L., Zhan, Z. W., Sun, J. N., Liang, T., et al. (2022). Organic matter enrichment of the Chang 7 member in the Ordos Basin: insights from chemometrics and element geochemistry. Mar. Petroleum Geol. 135, 105404. doi:10.1016/j.marpetgeo.2021.105404

Shwartz-Ziv, R., and Armon, A. (2021). Tabular data: deep learning is not all you need. Inf. Fusion 81, 84–90. doi:10.1016/j.inffus.2021.11.011

Su, X., Zhu, R., Zhang, J., Liu, C., Gong, L., Jiang, X., et al. (2025). Multi-scale characterization and control factors of bedding-parallel fractures in continental shale reservoirs: insights from the Qingshankou Formation, Songliao Basin, China. Mar. Petroleum Geol. 182, 107580. doi:10.1016/j.marpetgeo.2025.107580

Tian, F., Liu, Z., Zhou, J., and Shao, J. (2025). Rock cracking simulation in tension and compression by peridynamics using a novel contact-friction model with a twin mesh and potential functions. J. Rock Mech. Geotechnical Eng. 17 (6), 3395–3419. doi:10.1016/J.JRMGE.2024.10.018

Wang, P., Chen, X., Wang, B., Li, J., and Dai, H. (2020). An improved method for lithology identification based on a hidden Markov model and random forests. Geophysics 85 (6), IM27–IM36. doi:10.1190/geo2020-0108.1

Wen, P., Wang, S., Li, J., Dong, K., Ren, Z., Li, Y., et al. (2025). Multiobjective optimization of a pressure maintaining ball valve structure based on RSM and NSGA-II. Sci. Rep. 15 (1), 21342. doi:10.1038/s41598-025-02158-w

Wood, D. A. (2022). Gamma-ray log derivative and volatility attributes assist facies characterization in clastic sedimentary sequences for formulaic and machine learning analysis. Adv. Geo-Energy Res. 6 (1), 69–85. doi:10.46690/ager.2022.01.06

Yang, Z., and Zou, C. (2019). “Exploring petroleum inside source kitchen”: connotation and prospects of source rock oil and gas. Petroleum Explor. Dev. 46 (1), 181–193. doi:10.1016/S1876-3804(19)30018-7

Yang, H., Li, S., and Liu, X. (2016). Characteristics and resource prospects of tight oil in Ordos Basin, China. Petroleum Res. 1 (1), 27–38. doi:10.1016/S2096-2495(17)30028-5

Zhang, T., Chai, H., Wang, H., Guo, T., Zhang, L., and Zhang, W. (2023). Interpretable machine learning model for shear wave estimation in a carbonate reservoir using LightGBM and SHAP: a case study in the Amu Darya right bank. Front. Earth Sci. 11. doi:10.3389/FEART.2023.1217384

Zhang, J., Ma, G., Yang, Z., Mei, J., Zhang, D., Zhou, W., et al. (2024). Knowledge extraction via machine learning guides a topology-based permeability prediction model. Water Resour. Res. 60 (7), e2024WR037124. doi:10.1029/2024WR037124

Zhang, R., Wang, J., Liu, C., Su, K., Ishibuchi, H., and Jin, Y. (2025). Synergistic integration of metaheuristics and machine learning: latest advances and emerging trends. Artif. Intell. Rev. 58 (9), 1–64. doi:10.1007/S10462-025-11266-Y

Zou, C. N., Yang, Z., Tao, S. Z., Yuan, X. J., Zhu, R. K., Hou, L. H., et al. (2013). Continuous hydrocarbon accumulation over a large area as a distinguishing characteristic of unconventional petroleum: the Ordos Basin, North-Central China. Earth-Science Rev. 126, 358–369. doi:10.1016/j.earscirev.2013.08.006

Keywords: petrophysical property prediction, tight reservoirs, stacked ensemble, GA-PSO optimization, physics-informed machine learning

Citation: Markus UI, Ba J, Abid M, Richard FA and Obadiah E (2026) Hybrid optimization of interpretable ensemble machine learning for petrophysical property prediction from well logs. Front. Earth Sci. 13:1721227. doi: 10.3389/feart.2025.1721227

Received: 09 October 2025; Accepted: 27 November 2025;
Published: 05 January 2026.

Edited by:

Soroush Abolfathi, University of Warwick, United Kingdom

Reviewed by:

Suhaib Umer Ilyas, Jeddah University, Saudi Arabia
Zhengzheng Cao, Henan Polytechnic University, China
Njitacke Tabekoueng Zeric, University of Buea, Cameroon

Copyright © 2026 Markus, Ba, Abid, Richard and Obadiah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jing Ba, jba@hhu.edu.cn
