
ORIGINAL RESEARCH article

Front. Plant Sci., 19 January 2026

Sec. Functional Plant Ecology

Volume 16 - 2025 | https://doi.org/10.3389/fpls.2025.1700203

The response and coupling coordination of soil moisture to vegetation in the Yellow River’s primary tributaries: a multi-source data analysis of the Wanchuan River Basin

  • 1College of Forestry, Gansu Agricultural University, Lanzhou, China
  • 2College of Management, Gansu Agricultural University, Lanzhou, China
  • 3College of Resources and Environment, Gansu Agricultural University, Lanzhou, China
  • 4School of Architecture, South China University of Technology, Guangzhou, China

Introduction: Vegetation status and soil moisture play crucial roles in ecosystem supply-demand services. Understanding the spatiotemporal variations of soil moisture (SM) and vegetation conditions is essential for assessing the stability of terrestrial ecosystems. However, the long-term response mechanisms and synergistic relationships between soil moisture and vegetation in the upper Yellow River’s Gansu section remain unclear, introducing uncertainties in evaluating the Loess Plateau ecosystem.

Methods: Utilizing multi-source data from the Wanchuan River Basin, a major tributary of the Yellow River, an inversion model was constructed to simulate the spatiotemporal distribution of soil moisture at depths of 0–30 cm.

Results and Discussion: The results reveal a distinct spatial pattern of lower moisture in the northern part and higher moisture in the southern part, with no significant variation across soil depths. A comprehensive vegetation index (VEG) integrating net primary productivity (NPP), leaf area index (LAI), and atmosphere-resistant vegetation index (ARVI) was found to be most suitable for this region. Overall vegetation conditions improved annually, with interannual variation following a normal distribution. Significant interaction between vegetation and soil moisture was observed, with the area of statistically significant positive correlation substantially exceeding that of negative correlation, indicating a synergistic relationship between the two variables. Coupling coordination analysis showed notable seasonal differences: in summer, coordination between SM and VEG was significantly divergent (coordinated in the south, uncoordinated in the north), while in other seasons, the relationship was characterized as marginally coordinated. This suggests a need for continuous structural optimization, with marginal coordination representing the predominant state of the soil moisture–vegetation system in the basin. Variable importance screening using the Gini index-enhanced random forest algorithm identified land surface temperature and potential evapotranspiration as the dominant factors influencing the soil moisture–vegetation coupling coordination degree. These findings provide valuable data and theoretical support for understanding the synergistic mechanisms between soil, water, and vegetation in the Gansu section of the Yellow River, contributing to more effective ecosystem management strategies in the Loess Plateau region.

1 Introduction

Ecosystems provide vital services—regulating, provisioning, cultural, and supporting—that sustain human societies (Braat and de Groot, 2012). Among regulatory services, water conservation is critical for watershed stability and erosion control. Its effectiveness largely depends on soil moisture (SM) dynamics, which directly mediate water regulation and ecosystem resilience (Costanza et al., 2017).

Soils constitute the primary reservoir for moisture retention, with SM governing spatial heterogeneity and overall regulatory capacity (Gan et al., 2019). Key soil properties—organic matter, bulk density, and porosity—strongly influence SM retention (Rawls et al., 2003; Yang et al., 2021). Recent advances in remote sensing and modeling have improved SM quantification. Liang et al. (2020) and Liu C. et al. (2022) developed predictive models using multivariate regression and neural networks, while Deodoro et al. (2023) demonstrated SAR-based soil texture inversion. While methods for visualizing SM have matured in recent years, inversion techniques that integrate both soil physicochemical properties and remote sensing indicators still require further investigation. Further integration of remote and ground-based data remains essential to refine these models and enhance ecological management.

Vegetation dynamics significantly influence soil moisture, typically assessed using long-term vegetation parameters such as the Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Net Primary Productivity (NPP), and Leaf Area Index (LAI). Hou et al. (2022) utilized NDVI and its coefficient of variation to analyze spatial patterns of grassland degradation/restoration in Qilian County on the northeastern Tibetan Plateau. Similarly, Zhang et al. (2020) applied the CASA model to investigate spatiotemporal changes in grassland NPP in Gansu Province from 1982 to 2011 and examined their relationship with climatic drivers. These findings demonstrate that vegetation assessment based on remote sensing data serves as a powerful tool for visualizing vegetation conditions. Compared to remotely sensed SM estimation, vegetation indicators offer more diverse forms of characterization, with NPP, LAI, and various vegetation indices being commonly used descriptors. However, integrated assessments combining these multiple vegetation indicators remain limited.

The Loess Plateau, located in northwestern China, is a critical region for ecological restoration projects along the mainstem and tributaries of the Yellow River. Accurate evaluation of vegetation conditions and soil moisture directly informs ecological recovery strategies in this arid to semi-arid region. The coupling coordination degree (CCD) model has emerged as an effective quantitative tool for assessing vegetation–soil moisture interactions and their synergistic relationships. Zhao et al. (2022) successfully applied this model to analyze the synergy between soil habitat indicators and vegetation characteristics, establishing its reliability for such evaluations. Building on this, Gao et al. (2022) developed an advanced grey relational coupling model to investigate coupling mechanisms between species composition and soil properties, highlighting the fundamental importance of vegetation pattern–soil moisture interactions in maintaining regional ecological stability.

In addition to interactions and synergies between vegetation and soil moisture, external factors affecting both have become a research focus in recent years. Liu et al. (2024) demonstrated through comprehensive analysis that hydrothermal conditions are key factors influencing both vegetation and soil moisture on the Loess Plateau. Zhang et al. (2024) constructed a soil moisture inversion model and integrated it with vegetation indices using the CCD model, identifying topography as the factor with the highest weight among various influencers. Thus, applying the CCD model to couple vegetation and soil moisture provides an effective means to systematically study their interactions and synergistic effects.

Despite its successful application in various ecosystems, research employing the CCD model to examine the synergistic relationship between soil moisture and vegetation dynamics in major Yellow River tributaries of the Loess Plateau remains scarce. Further work is needed to identify the dominant influencing factors and deepen the understanding of their coupled mechanisms. Therefore, this study selects, for the first time, the Wanchuan River Basin—a primary tributary of the Yellow River in the Loess Plateau region of Gansu Province—as the study area. By integrating multi-source parameters including in-situ physiological measurements, remote sensing data, and long-term vegetation indices, we established multiple regression models to achieve high-precision spatial visualization of regional soil moisture. Furthermore, using principal component analysis (PCA), we systematically combined multi-source vegetation descriptors—including NPP, LAI, and vegetation indices—to construct a comprehensive vegetation index, enabling large-scale spatial visualization of vegetation conditions. Through high-resolution spatiotemporal correlation analysis and coupling coordination degree modeling, this study provides an in-depth investigation of the interactions and spatial synergies between soil moisture and vegetation characteristics in the Yellow River Basin of the Loess Plateau. In addition, the random forest algorithm was applied to quantify the relative contribution of various influencing factors. These findings offer a theoretical foundation for understanding the sustainable dynamics of soil moisture and vegetation conditions in major tributaries of the Yellow River.

2 Materials and methods

2.1 Site description

This study focuses on the Wanchuan River Basin (Figure 1), a primary tributary of the upper Yellow River, located in the western Loess Plateau (103°07′-104°02′E, 35°03′-36°01′N). As a crucial ecological barrier for Lanzhou City in Gansu Province, the basin exhibits a stepped topography with higher elevations in the north and south and lower central areas. The main channel spans approximately 80 km, draining a total area of 1,900 km² entirely within Yuzhong County, Lanzhou.

Figure 1
Map collage of the Yellow River and Wanchuan River regions. The top left shows Asia with the Yellow River watershed in orange. The bottom left highlights Chinese provinces overlain by the watershed. The top right displays a topographic map of the Wanchuan River with sampling points. The bottom right illustrates land use with farmland, water bodies, forests, grasslands, cities, and bare land in various colors. Legends explain symbols and elevations, ranging from 1,002 to 4,355 meters.

Figure 1. Spatial distribution of study sites.

The basin experiences typical temperate continental monsoon climate and holds significant ecological importance. Its headwater originates from the Xinglong Mountain National Nature Reserve (peak elevation: 3,300 m), while its confluence with the Yellow River occurs near Heping Town (minimum elevation: 1,500 m). Topographic gradients create pronounced spatial heterogeneity in vegetation coverage between northern and southern banks. Dominant land use types include cropland and built-up areas, with the Xinglong Mountain area serving as a critical climate regulation zone that maintains regional ecological equilibrium.

2.2 Data sources

2.2.1 Soil sample collection and property measurement

Field investigations were conducted in the study area from July to August during 2023–2024, employing a stratified random sampling design with 170 sampling points (Figure 1), maintaining 1 km intervals between adjacent sites (Yigini and Panagos, 2016). At each sampling location, two 50 cm × 50 cm quadrats spaced 50 m apart were established, with GPS-recorded coordinates and elevation data, vegetation landscape photographs, and measurements of average plant height within quadrats. Soil sampling utilized 100 cm³ cutting rings to collect undisturbed cores for determining fresh weight, porosity, soil moisture and bulk density, while composite samples were obtained via the five-point method across stratified 0–30 cm depths (0–10 cm, 10–20 cm, and 20–30 cm). All samples were air-dried, sieved, and analyzed using a Bettersize2600 laser granulometer for particle size distribution (clay<0.002 mm, silt 0.002–0.05 mm, sand 0.05–2 mm) alongside other physicochemical properties, with detailed analytical methods provided in Figure 2 (Rosa and Franz, 2005).

Figure 2
Flowchart detailing soil indicators. The first category is “soil physicochemical indicators” with subcategories including soil moisture, pH, and phosphorus, each linked to specific testing methods like the Ring Knife and Potentiometric methods. The second category is “vegetation and soil structural indicators” with subcategories including clay, silt, sand, average plant height, and leaf area index, connected to methods like Laser Diffraction. Each indicator has an explanation of its ecological importance, such as influencing plant growth or regulating nutrient cycles.

Figure 2. Soil physicochemical indicators: analytical methods and ecological implications.

2.2.2 Remote sensing data sources and processing

This study employed multi-source remote sensing data to construct models while mitigating multicollinearity effects, primarily utilizing Sentinel-1 and Sentinel-2 datasets from the Copernicus Data Space Ecosystem (CDSE, https://browser.dataspace.copernicus.eu/). Among the selected remote sensing indicators, the VH and VV radar bands were obtained from Sentinel-1; they represent the cross-polarized and co-polarized backscattering coefficients, respectively, which are closely related to soil moisture content. The optical bands were acquired from Sentinel-2, comprising 11 spectral bands (B1–B8A and B11–B12, excluding the 60-meter resolution B9 and B10 bands). All bands were uniformly resampled to the same resolution in a GIS platform, followed by coordinate system transformation and watershed mask extraction. The selected vegetation indices were also calculated using Sentinel-2 spectral bands, aiming to incorporate well-performing remote sensing indicators to the greatest extent for constructing the soil moisture regression model. Vegetation indices were systematically derived using the spectral analysis tools within ENVI 5.3 software. To address persistent atmospheric contamination in optical satellite imagery, we implemented a two-stage quality control protocol. First, temporal outliers caused by cloud and snow cover were identified through a combined radiometric and spectral threshold approach, then reconstructed using linear interpolation algorithms implemented through MATLAB scripting. Second, the corrected time series were processed using Maximum Value Composition (MVC) to generate monthly composite datasets, selectively retaining the highest-quality observations within each temporal window. This integrated methodology reduces atmospheric interference while maintaining the integrity of vegetation spectral signals, following the established practices in remote sensing data processing (Yu et al., 2015).
Monthly data were aggregated into seasonal datasets (spring: March-May; summer: June-August; autumn: September-November; winter: December-February) following international seasonal classification standards (Madonsela et al., 2015). The analysis was supplemented with watershed boundary and hydrographic network data provided by Lanzhou Hydrological Station to support comprehensive watershed characterization. The land use data were generated through supervised classification of Sentinel-2 imagery from August 2023, achieving a Kappa coefficient of 0.75. The classification criteria strictly adhered to the Land Use Classification System established by the Chinese Academy of Sciences. All remote sensing data were resampled to a uniform resolution of 20m × 20m, consistent with the spatial resolution of the Sentinel band data.
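The gap filling, monthly compositing, and seasonal aggregation described above can be illustrated with a minimal NumPy sketch. It is not the authors' MATLAB implementation: the function names, the array layout (time, rows, cols), and the assumption that a per-date validity mask from cloud/snow screening is already available are all hypothetical.

```python
import numpy as np

# Season definitions as stated in the text (month numbers).
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}

def fill_gaps_linear(series, valid):
    """Replace flagged samples of a 1-D time series by linear interpolation.

    `valid` is a boolean mask (assumed to come from cloud/snow screening).
    """
    t = np.arange(series.size)
    return np.interp(t, t[valid], series[valid])

def monthly_mvc(stack, months):
    """Maximum value composite: per-pixel maximum of all scenes in a month.

    stack: array of shape (time, rows, cols); months: array of shape (time,).
    """
    return {int(m): stack[months == m].max(axis=0) for m in np.unique(months)}

def seasonal_mean(monthly):
    """Average whichever monthly composites are available for each season."""
    out = {}
    for name, members in SEASONS.items():
        layers = [monthly[m] for m in members if m in monthly]
        if layers:
            out[name] = np.mean(layers, axis=0)
    return out
```

Averaging the monthly composites within a season is one plausible reading of "aggregated"; a seasonal maximum or median would fit the same structure.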

2.3 Development of soil moisture inversion models

This study systematically developed remote sensing inversion models for soil moisture content using multivariate statistical modeling approaches (Lai et al., 2021). Initial Pearson correlation analysis conducted in SPSS 26 identified optimal spectral bands from Sentinel-1/2 data and vegetation indices significantly correlated (p < 0.05) with measured soil parameters including moisture content, bulk density, and porosity, which were subsequently selected as modeling variables (Shahriari et al., 2019).
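As a rough illustration of this screening step, the sketch below keeps only predictors whose Pearson correlation with measured soil moisture passes a two-sided t-test. The authors used SPSS 26; the function name, the placeholder band names, and the synthetic data are assumptions, and the critical value 1.974 only approximates the 5% threshold of Student's t for 168 degrees of freedom (n = 170).

```python
import numpy as np

def pearson_screen(X, y, names, t_crit):
    """Return the names of columns of X significantly correlated with y.

    Significance uses t = |r| * sqrt((n - 2) / (1 - r^2)) compared with a
    caller-supplied critical value.
    """
    n = y.size
    selected = []
    for j, name in enumerate(names):
        r = np.corrcoef(X[:, j], y)[0, 1]
        t_stat = abs(r) * np.sqrt((n - 2) / max(1.0 - r**2, 1e-12))
        if t_stat > t_crit:
            selected.append(name)
    return selected

# Synthetic stand-in data: one predictor tracks the response, one does not.
sm = np.arange(170.0)                                   # stand-in for measured SM
flip = np.where(np.arange(170) % 2 == 0, 1.0, -1.0)     # alternating pattern
X = np.column_stack([2.0 * sm + 5.0 * flip, flip])
kept = pearson_screen(X, sm, ["band_a", "band_b"], t_crit=1.974)
```

Passing the critical value as an argument keeps the sketch free of a statistics dependency; with SciPy available, an exact p-value could be computed instead.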

At the same time, the optimal regression model was selected, and the physical and chemical properties of each soil type were characterized using remote sensing bands and vegetation indices, thereby enabling comprehensive spatial and temporal representation of all variables.

Three regression methods were employed: (1) Stepwise Multiple Linear Regression (SMLR) optimized model structure by retaining only significant predictors while mitigating multicollinearity (Khaledian and Miller, 2020; Zare et al., 2019); (2) Partial Least Squares Regression (PLSR) combined principal component analysis with regression to address small-sample, high-dimensional data challenges (Boyan et al., 2015); (3) Ridge regression applied L2 regularization by incorporating a penalty term proportional to the sum of squared weight parameters into the loss function. This approach mitigates overfitting and enhances model generalizability, making it particularly suitable for handling multicollinearity in Sentinel-2 multi-band datasets (Alsaleh et al., 2025). The 170 sampling points were partitioned into 136 modeling samples and 34 validation samples (4:1 ratio) (Madonsela et al., 2015), with model performance evaluated through the coefficient of determination (R²), statistical significance (p-value), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) to ensure inversion reliability (Madonsela et al., 2015; Mngadi et al., 2022). The evaluation metrics were calculated as shown in Equations 1–4:

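$$R^{2}=\frac{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}{\sum_{i=1}^{n}(x_{2}-\bar{x})^{2}} \tag{1}$$
$$RMSE=\sqrt{\frac{\sum_{i=1}^{n}(x_{1}-x_{2})^{2}}{n}} \tag{2}$$
$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|x_{1}-x_{2}\right| \tag{3}$$
$$MAPE=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{x_{1}-x_{2}}{x_{1}}\right|\times 100\% \tag{4}$$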

Here, x1 represents the measured value of the soil moisture indicator; x2 denotes the predicted value of the soil moisture indicator; x̄ indicates the mean value of the measured soil moisture indicators; and n represents the number of sampling points.
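To make the fitting and evaluation workflow concrete, the following sketch fits a closed-form ridge regression on a 136/34 split and computes the four error metrics. Everything here is illustrative: the data are synthetic, the penalty value and function names are hypothetical, and R² is computed in the common 1 − SS_res/SS_tot form, which is an assumption and may differ from the form intended in Equation 1.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalised intercept.

    Minimises ||y - b0 - X b||^2 + lam * ||b||^2.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def metrics(measured, predicted):
    """R2 (1 - SS_res/SS_tot, an assumed form), RMSE, MAE and MAPE (%)."""
    err = measured - predicted
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    return {
        "R2": 1.0 - np.sum(err**2) / ss_tot,
        "RMSE": np.sqrt(np.mean(err**2)),
        "MAE": np.mean(np.abs(err)),
        "MAPE": np.mean(np.abs(err / measured)) * 100.0,
    }

# Synthetic stand-in for 170 sampling points with three predictors.
rng = np.random.default_rng(42)
X = rng.uniform(size=(170, 3))
y = 0.1 + X @ np.array([0.2, -0.05, 0.1]) + rng.normal(scale=0.005, size=170)

# 4:1 split into modelling and validation samples, as in the text.
perm = rng.permutation(170)
train_idx, valid_idx = perm[:136], perm[136:]

coef, intercept = ridge_fit(X[train_idx], y[train_idx], lam=1e-3)
scores = metrics(y[valid_idx], X[valid_idx] @ coef + intercept)
```

Centering the predictors before solving leaves the intercept out of the penalty, which is the usual convention for ridge regression.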

2.4 Construction of comprehensive vegetation index

Principal Component Analysis (PCA) is a multivariate statistical method employing dimensionality reduction to transform initial variables into fewer linear combinations that retain most original information. This technique effectively identifies principal components from multiple variables, revealing their quantitative relationships. In this study, PCA was applied to address high correlations among vegetation parameters by using low-dimensional representations to explain the majority of variance in NPP, LAI, and vegetation indices (Zhou et al., 2021). To align with soil moisture measurements from 2017–2024, we first calculated mean values of vegetation indices, LAI, and NPP across their respective raster images. Using ArcGIS’s “Raster to Point” tool, these averaged rasters were converted to vector data, with longitude and latitude fields added to the attribute tables before export, ensuring consistent spatial coordinates and point counts across all three vegetation parameter datasets. Furthermore, seven vegetation indices, namely NDVI, EVI, Ratio Vegetation Index (RVI), Difference Vegetation Index (DVI), Atmospherically Resistant Vegetation Index (ARVI), Soil Adjusted Vegetation Index (SAVI), and Green Normalized Difference Vegetation Index (GNDVI), were comparatively analyzed through PCA to select the optimal vegetation index, which together with NPP and LAI formed the final composite vegetation index for subsequent analysis. The indices were calculated as follows (Equations 5–11):

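$$NDVI=\frac{NIR-Red}{NIR+Red} \tag{5}$$
$$EVI=\frac{2.5\times(NIR-Red)}{NIR+6\times Red-7.5\times Blue+1} \tag{6}$$
$$SAVI=\frac{1.5\times(NIR-Red)}{NIR+Red+0.5} \tag{7}$$
$$DVI=NIR-Red \tag{8}$$
$$RVI=\frac{NIR}{Red} \tag{9}$$
$$ARVI=\frac{NIR-(2\times Red-Blue)}{NIR+(2\times Red-Blue)} \tag{10}$$
$$GNDVI=\frac{NIR-Green}{NIR+Green} \tag{11}$$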

Among these, NIR refers to the near-infrared band, corresponding to Band 8 (B8) of Sentinel-2 imagery; Red represents the red band, corresponding to Band 4 (B4); Blue denotes the blue band, corresponding to Band 2 (B2); and Green indicates the green band, corresponding to Band 3 (B3). Due to uncertainties in the retrieval accuracy of LAI values derived from leaf area scanners, this study additionally incorporated LAI product data for subsequent analysis. When the retrieval accuracy from scanned data was insufficient, the LAI product data were utilized to support the construction of comprehensive vegetation indices. This study utilized the MOD17A2H and MOD15A2H product datasets from NASA Earthdata Search (https://search.earthdata.nasa.gov/search) to acquire NPP and LAI data. The datasets were resampled to achieve a spatial resolution of 20 m × 20 m for subsequent analysis. This spatial alignment was implemented to maintain consistency with the resolution characteristics of vegetation indices, ensuring dimensional homogeneity across all remote sensing datasets used for constructing the comprehensive vegetation index (VEG) and performing subsequent correlation analyses.
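The index formulas and the PCA-based combination can be sketched as follows. The index expressions follow Equations 5–11; the composite step is only an assumption, since the text does not state how the principal components are weighted. Here the first principal component of standardized columns stands in for the composite, and all function names are hypothetical.

```python
import numpy as np

def vegetation_indices(nir, red, green, blue):
    """Seven indices from Sentinel-2 reflectance (B8, B4, B3, B2)."""
    atm = 2.0 * red - blue  # atmospheric correction term used by ARVI
    return {
        "NDVI": (nir - red) / (nir + red),
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "SAVI": 1.5 * (nir - red) / (nir + red + 0.5),
        "DVI": nir - red,
        "RVI": nir / red,
        "ARVI": (nir - atm) / (nir + atm),
        "GNDVI": (nir - green) / (nir + green),
    }

def first_principal_component(columns):
    """PC1 scores and explained-variance ratio of standardised columns.

    columns: array of shape (n_points, n_variables), e.g. NPP, LAI and one
    vegetation index sampled at the same points (an assumed layout).
    """
    z = (columns - columns.mean(axis=0)) / columns.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    if loadings.sum() < 0:          # fix the arbitrary sign of the component
        loadings = -loadings
    return z @ loadings, s[0] ** 2 / np.sum(s**2)

idx = vegetation_indices(nir=0.5, red=0.1, green=0.1, blue=0.05)
```

Comparing the explained-variance ratio obtained with each candidate index is one way the "optimal" index could be selected, although the selection criterion actually used is not specified in this section.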

2.5 Partial correlation analysis and significance test of soil moisture and vegetation

The analysis of raster data correlation quantitatively describes the degree of association between raster datasets while outputting correlation coefficients. Significance testing involves formulating hypotheses about the distribution or parameters of random variables, then evaluating these hypotheses against sample data to determine whether observed differences are statistically significant (Gu et al., 2018). Given the well-established relationship between soil moisture and vegetation dynamics, and because the spatiotemporally continuous dataset may not satisfy linearity or normality assumptions, we employed Spearman’s rank correlation analysis to investigate vegetation responses to soil moisture. Using RStudio 4.5.2 with the terra and Hmisc packages, we computed pixel-level Spearman correlation coefficients and associated p-values (derived from t-tests) between soil moisture and vegetation factors at both seasonal and annual scales from 2017 to 2024, thereby capturing their spatiotemporal relationships. The calculation equation is as follows (Equation 12):

$$\rho=\frac{\sum_{i=1}^{n}(R_{VEG,i}-\bar{R}_{VEG})(R_{SM,i}-\bar{R}_{SM})}{\sqrt{\sum_{i=1}^{n}(R_{VEG,i}-\bar{R}_{VEG})^{2}\sum_{i=1}^{n}(R_{SM,i}-\bar{R}_{SM})^{2}}} \tag{12}$$

where n represents the total number of raster cells for both VEG and SM, which must be consistent between the two variables; $R_{VEG,i}$ denotes the rank of the i-th VEG value; $R_{SM,i}$ indicates the rank of the i-th SM value; $\bar{R}_{VEG}$ is the mean rank of VEG values; $\bar{R}_{SM}$ is the mean rank of SM values; and $\rho$ represents Spearman’s correlation coefficient, which ranges from -1 to 1. To ensure computational validity, all raster datasets were rigorously standardized to identical coordinate systems, resolutions, and dimensional properties (matching row/column counts) prior to analysis.
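A small NumPy sketch of the rank correlation in Equation 12 is given below. The authors worked in R with terra and Hmisc; this Python version is only an illustration, the function names are hypothetical, and the pixel-wise wrapper assumes that the paired values at each pixel come from the time dimension of two aligned stacks.

```python
import numpy as np

def average_ranks(values):
    """Ranks 1..n, with tied values sharing their mean rank."""
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(values.size, dtype=float)
    ranks[order] = np.arange(1, values.size + 1)
    for v in np.unique(values):
        tie = values == v
        ranks[tie] = ranks[tie].mean()
    return ranks

def spearman(veg, sm):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    dv = average_ranks(veg) - average_ranks(veg).mean()
    ds = average_ranks(sm) - average_ranks(sm).mean()
    return np.sum(dv * ds) / np.sqrt(np.sum(dv**2) * np.sum(ds**2))

def pixelwise_spearman(veg_stack, sm_stack):
    """rho per pixel for two stacks of shape (time, rows, cols)."""
    _, rows, cols = veg_stack.shape
    out = np.full((rows, cols), np.nan)
    for r in range(rows):
        for c in range(cols):
            out[r, c] = spearman(veg_stack[:, r, c], sm_stack[:, r, c])
    return out

rho = spearman(np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
               np.array([2.0, 1.0, 4.0, 3.0, 5.0]))
```

The explicit double loop favours readability over speed; for full-resolution rasters a vectorized or chunked implementation would be preferable.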

2.6 Construction of a coupling coordination degree model for soil moisture and vegetation

The dynamic characteristics and patterns of soil moisture and the comprehensive vegetation index serve as crucial indicators for assessing soil-vegetation environmental improvements, with their synergistic interactions being quantitatively measurable through coupling degree analysis (Qi et al., 2022). Focusing on these two interconnected systems, we established a coupling degree model for the soil moisture-vegetation system using the following Equation 13:

$$C=\frac{2\sqrt{VEG\times SM}}{VEG+SM} \tag{13}$$

VEG represents the comprehensive vegetation evaluation function, and SM denotes the soil moisture evaluation function. The coupling degree C between soil moisture and vegetation ranges from 0 to 1. When C approaches 1, it indicates a positive coupling state between vegetation and soil moisture, signifying favorable ecological restoration development; when C approaches 0, it reflects the opposite condition (Qi et al., 2022).

While C effectively quantifies interaction intensity, we advanced the analysis by developing a coupling coordination degree model to holistically evaluate system synergy, incorporating both coupling effects and developmental status to prevent potential misinterpretations from using coupling degree alone (Zuo et al., 2021) (Equations 14, 15):

$$D_{vs}=\sqrt{C\times T} \tag{14}$$
$$T=\alpha\times VEG+\beta\times SM \tag{15}$$

Dvs represents the coupling coordination degree between soil moisture and vegetation; T denotes the comprehensive system harmonization index; α and β are the undetermined weighting coefficients for the comprehensive vegetation index and soil moisture index respectively, with α+β = 1. Based on previous research findings, this study considers both factors to be equally important, thus setting α=β=0.5 (Zuo et al., 2021; Zhang et al., 2024). Similar to C, the coupling coordination degree Dvs between soil moisture and vegetation also ranges from 0 to 1. A smaller Dvs value indicates poorer vegetation restoration effectiveness and greater imbalance in the coupling relationship between soil moisture and vegetation.

This study adopted the classification scheme from previous research and divided the coupling coordination degree into five levels using equal intervals (Table 1):

Table 1. Coupling types between SM and VEG.
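The coupling and coordination calculations can be sketched in a few lines. The code assumes the widely used square-root forms C = 2√(VEG·SM)/(VEG+SM) and D = √(C·T), inputs already scaled to the range 0 to 1, and five equal-width classes of 0.2; the function names and the numeric class codes (1 to 5) are hypothetical, and the class labels of Table 1 are not reproduced.

```python
import numpy as np

def coupling_coordination(veg, sm, alpha=0.5, beta=0.5):
    """Return coupling degree C, comprehensive index T and coordination D.

    veg, sm: arrays of evaluation values assumed to lie in [0, 1].
    """
    total = veg + sm
    safe = np.where(total > 0, total, 1.0)          # avoid division by zero
    c = np.where(total > 0, 2.0 * np.sqrt(veg * sm) / safe, 0.0)
    t = alpha * veg + beta * sm
    d = np.sqrt(c * t)
    return c, t, d

def coordination_level(d, n_levels=5):
    """Map D in [0, 1] to equal-interval classes numbered 1..n_levels."""
    return np.minimum((d * n_levels).astype(int), n_levels - 1) + 1

veg = np.array([0.25, 0.64, 0.0])
sm = np.array([1.00, 0.64, 0.0])
c, t, d = coupling_coordination(veg, sm)
levels = coordination_level(d)
```

The first pixel shows why D is reported alongside C: a high coupling degree (0.8) combined with unequal component values still yields only moderate coordination.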

2.7 Screening of influencing factors for soil moisture-vegetation coupling coordination model based on random forest and Gini index

When assessing the importance of influencing factors in the soil moisture–vegetation coupling coordination model, the random forest algorithm is widely employed. Random forest constructs numerous decision trees, each grown from a randomly drawn (bootstrap) sample of the training data, and aggregates their predictions to enhance model accuracy and robustness. During the construction of each decision tree, selecting the optimal node split point is crucial for its performance. An ideal split should minimize node impurity, with the Gini index serving as a common metric to quantify this impurity. In random forest, the importance of each feature is evaluated by calculating the average reduction in the Gini index achieved when splitting nodes using that feature. The relative importance of a feature is then determined by the proportion of its average Gini index reduction to the total reduction across all features (Breiman, 2001; Louppe, 2014; Liu et al., 2025). The calculation equation is as follows (Equation 16):

$$Gini(m)=1-\sum_{j=1}^{k}\left[p(j\mid m)\right]^{2} \tag{16}$$

p(j | m) represents the probability of class j at node m. When the feature Xk is used for node splitting, the change in the Gini index before and after the split is calculated. The calculation equation is as follows (Equation 17):

$$VIM_{k}=Gini(m)-Gini(r)-Gini(l) \tag{17}$$

Gini(m) denotes the Gini index of node m before splitting, while Gini(r) and Gini(l) represent the Gini indices of the right and left child nodes after splitting, respectively.

Within a single decision tree, the Gini changes from all split points that use feature Xk are accumulated. Across the entire random forest, the total Gini change for feature Xk is computed by summing its Gini changes over all decision trees. The calculation equation is as follows (Equation 18):

$$VIM_{Gini,k}=\sum_{i=1}^{n}VIM_{ik} \tag{18}$$

n represents the total number of trees in the random forest.

The importance score of a feature is obtained by calculating the percentage of its total Gini change relative to the total Gini changes of all features. The calculation equation is as follows (Equation 19):

$$VIM_{k}=\frac{VIM_{k}}{\sum_{i=1}^{c}VIM_{i}} \tag{19}$$

c represents the total number of features.
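The Gini calculations in Equations 16, 17 and 19 can be worked through on a toy split. This is a hand-rolled sketch rather than a random forest implementation: the function names are hypothetical, the importance totals passed to the normalization step are made-up numbers, and the optional `weighted` flag reflects the common practice of scaling child impurities by their sample share, which Equation 17 as printed does not show.

```python
import numpy as np

def gini(labels):
    """Equation 16: one minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p**2)

def gini_decrease(parent, left, right, weighted=False):
    """Impurity change for one split.

    weighted=False follows Equation 17 as written; weighted=True scales each
    child's impurity by its share of the parent samples (an assumption).
    """
    if weighted:
        w_left, w_right = len(left) / len(parent), len(right) / len(parent)
    else:
        w_left = w_right = 1.0
    return gini(parent) - w_left * gini(left) - w_right * gini(right)

def normalise(totals):
    """Equation 19: each feature's share of the summed Gini change."""
    grand_total = sum(totals.values())
    return {name: value / grand_total for name, value in totals.items()}

parent = np.array([0, 0, 1, 1])
drop = gini_decrease(parent, parent[:2], parent[2:])

# Hypothetical accumulated Gini changes for two of the candidate factors.
shares = normalise({"LST": 0.3, "PET": 0.1})
```

A split that separates the two classes perfectly removes all of the parent's impurity, so the decrease equals the parent's Gini index under either weighting.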

The influencing factors selected in this study for the soil moisture–vegetation coupling coordination degree are categorized into anthropogenic activity factors (population density), topographic factors (elevation, slope, aspect), and climatic factors (mean annual temperature, mean annual precipitation, land surface temperature, and potential evapotranspiration), covering all direct driving elements. Among them, the population density dataset was developed by the Oak Ridge National Laboratory (ORNL) of the U.S. Department of Energy and provided by East View Cartographic (https://landscan.ornl.gov/). The mean annual temperature, mean annual precipitation, and potential evapotranspiration (Peng, 2022) datasets were obtained from the National Tibetan Plateau Data Center (https://data.tpdc.ac.cn/zh-hans/data/). The land surface temperature dataset (Wan et al., 2021) was sourced from the NASA Earthdata portal (https://www.earthdata.nasa.gov/). Topographic factors were derived from DEM data (https://search.earthdata.nasa.gov/search) using the spatial analysis tools in ArcGIS 10.7. All datasets were resampled to a uniform resolution of 20 m × 20 m to maintain consistency with the data described in Sections 2.2.2 and 2.4.

3 Results

3.1 Screening of soil moisture indicators

Analysis of three soil depths revealed consistent correlations between soil moisture (SM) and six physicochemical parameters: bulk density (BD), capillary porosity (Pcap), non-capillary porosity (Pnon), available potassium (AK), total nitrogen (TN), and total phosphorus (TP). Among these, BD and porosity measures (Pcap and Pnon) showed the strongest correlations (Supplementary Figure S1).

Remote sensing data identified four significant reflectance bands (VH, B12, B8, and B8A) that maintained stable relationships with SM across all soil depths. Soil texture analysis demonstrated that clay content significantly affected SM only in surface soils (0–10 cm depth), revealing depth-dependent textural control on moisture retention. Since the LAI obtained from the leaf area scanner showed no significant correlation with any of the factors considered, it could not be effectively used for inversion or spatial representation. Therefore, LAI product data were employed for the subsequent stages of this study.

3.2 Evaluation of model fitting results

This study investigated the relationship between soil characteristics and soil moisture through regression modeling using statistically significant soil physicochemical indicators and reflectance bands strongly correlated with soil moisture (Supplementary Figure S1). Although sand content (0–10 cm depth) and leaf area index (20–30 cm depth) showed significant correlations with soil moisture, they were excluded from regression models due to the absence of significantly correlated spectral bands required for reliable predictive modeling.

The developed soil moisture regression model demonstrated excellent predictive performance across all three soil depths, with R² values exceeding 0.95, indicating strong agreement between predicted and observed values. Error analysis further validated the model’s accuracy, revealing that except for ridge regression at 0–10 cm and SMLR at 10–20 cm depth which exhibited relatively higher MAPE, all other RMSE, MAPE and MAE values remained within acceptable ranges, collectively confirming the model’s high precision in soil moisture estimation.

Among the three evaluated regression methods (all with p < 0.001), model selection was conducted based on R² values while maintaining RMSE<0.05, MAE<0.02, and MAPE between 0-20%. For 0–10 cm depth, while ridge regression achieved the highest R², SMLR showed superior performance in error metrics and was deemed more suitable. At 10–20 cm depth, ridge regression performed best with the highest R² values and lowest RMSE. For 20–30 cm depth, SMLR outperformed other methods across all evaluation metrics, establishing itself as the optimal choice. This systematic model selection process effectively addressed multicollinearity challenges in soil moisture inversion studies (Figure 3b). The final regression equations are presented in Table 2, with correlation analysis in Supplementary Figure S1 indicating that BD and Pcap had the greatest influence on model construction. Importantly, the ridge regression equation for 10–20 cm depth demonstrated the necessity of integrating multi-source data for inversion model development, which yields more accurate results.
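The selection rule described above (highest R² among models that satisfy the error thresholds) can be expressed as a short filter. The function name is hypothetical, and the metric values in the example are invented purely to show the mechanics; they are not results from this study.

```python
def select_model(candidates, rmse_max=0.05, mae_max=0.02, mape_max=20.0):
    """Pick the highest-R2 model among those meeting all error thresholds.

    candidates: dict mapping model name to a dict with keys
    "R2", "RMSE", "MAE" and "MAPE" (MAPE in percent).
    Returns None when no model passes the thresholds.
    """
    eligible = {
        name: m for name, m in candidates.items()
        if m["RMSE"] < rmse_max
        and m["MAE"] < mae_max
        and 0.0 <= m["MAPE"] <= mape_max
    }
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name]["R2"])

# Invented numbers for illustration only (not taken from the paper).
example = {
    "Ridge": {"R2": 0.99, "RMSE": 0.03, "MAE": 0.015, "MAPE": 25.0},
    "SMLR": {"R2": 0.97, "RMSE": 0.02, "MAE": 0.010, "MAPE": 12.0},
    "PLSR": {"R2": 0.96, "RMSE": 0.04, "MAE": 0.018, "MAPE": 15.0},
}
best = select_model(example)
```

In this made-up case the model with the highest R² is excluded by its MAPE, which mirrors the situation the text reports for the 0–10 cm layer.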

Figure 3
Six scatter plots comparing SM measured values with SM simulation values across three soil depths: 0-10 cm, 10-20 cm, and 20-30 cm. Each row represents a different dataset size (n=136 and n=34). The plots include regression lines for different models (SMLR, Ridge Regression, PLSR) with associated statistical metrics such as R², RMSE, MAPE, MAE. Data points are color-coded by model, and equations for regression lines are provided.

Figure 3. Regression equations for soil moisture at different depths: (a) Training set; (b) Validation set.

Table 2. Optimal equations for soil moisture function index at various depths.

3.3 Spatio-temporal dynamic analysis of soil moisture content

3.3.1 Seasonal scale spatial variation characteristics

The regression equations in Table 2 were used to estimate soil moisture content by combining soil physicochemical indicators with remote sensing bands and vegetation indices. Analysis of seasonal average spatial distribution patterns across different soil depths (2017-2024, Supplementary Figure S2) revealed several key findings:

During summer months, all soil layers showed similar spatial distributions, with high moisture values concentrated in the southern bank and Xinglong Mountain areas, while low values predominated in northern regions, creating a distinct south-to-north moisture gradient. In other seasons (spring, autumn, and winter), the surface (0–10 cm) and deep (20–30 cm) layers displayed remarkably similar distribution patterns, though with significant differences in moisture content between them. In contrast, the middle layer (10–20 cm) exhibited relatively uniform moisture distribution with minimal spatial variation.

These results demonstrate that while seasonal moisture patterns show some consistent features across depths, surface and deep layers display pronounced spatial heterogeneity throughout the year, contrasting sharply with the more stable middle layer (10–20 cm). This differential behavior indicates that moisture in both surface (0–10 cm) and deep layers (20–30 cm) is less stable and more variable, while the intermediate 10–20 cm layer maintains greater stability with consistent spatial patterns across the entire watershed.

3.3.2 Annual scale spatial variation characteristics

Analysis of annual average SM spatial distribution (Figure 4) revealed distinct depth-dependent patterns. Compared to the intermediate depth, areas with low moisture content covered more extensive spatial ranges in both the surface and deep layers, primarily in the northern and eastern regions of the study area. The Xinglong Mountain area showed unique vertical SM distribution characteristics, maintaining consistently higher moisture in the surface and deep layers while exhibiting intermittently lower moisture at the intermediate depth.

Figure 4
Three soil moisture maps of a region, each representing different soil depths: 0-10 cm, 10-20 cm, and 20-30 cm. Colors range from red, indicating lower moisture, to blue, indicating higher moisture. The scale bar at the bottom shows distances up to 20 kilometers.

Figure 4. The annual-scale spatial distribution of SM (2017–2024).

3.4 Construction of Integrated Vegetation Index

3.4.1 Correlation matrix analysis

The annual average rasters of the seven vegetation factors, along with NPP and LAI, were converted to point data using the Raster to Point tool in ArcGIS. This ensured consistency in spatial boundaries and resolution. The exported data were then analyzed using Principal Component Analysis (PCA) in SPSS 26. Supplementary Table S1 presents the Pearson correlation coefficients between each vegetation index and both NPP and LAI. The results demonstrate that only ARVI and NDVI exhibited correlations with LAI exceeding 0.6 (p < 0.05 for all significant relationships).

3.4.2 KMO and Bartlett tests

The Kaiser-Meyer-Olkin (KMO) measure reflects the partial correlations among variables: values approaching 1 indicate that the data are well suited to principal component analysis (PCA), while values below 0.5 indicate that they are unsuitable for factor analysis. As presented in Supplementary Tables S2a-g, ARVI demonstrated a KMO value of 0.652, exceeding the 0.50 threshold and confirming its suitability for PCA. Notably, none of the other vegetation indices achieved KMO values above 0.6. Bartlett’s Test of Sphericity yielded statistically significant results (p < 0.001), verifying substantial correlations among variables, consistent with the correlation matrix results (Supplementary Table S1). Based on these diagnostic tests, ARVI was selected in combination with NPP and LAI for constructing the integrated vegetation index, as these parameters collectively exhibited the best sampling adequacy (KMO > 0.65) and statistically verified inter-correlations (p < 0.001). This selection process supported dimensionality reduction while preserving the most hydrologically relevant vegetation characteristics for subsequent analyses.
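For readers unfamiliar with the KMO statistic, the following minimal sketch computes the overall KMO for a set of exactly three variables (such as one vegetation index together with NPP and LAI), where each partial correlation controls for the single remaining variable. The function names are hypothetical and the correlation values in the usage line are illustrative, not those of Supplementary Table S1; the study itself used SPSS.

```python
import math


def partial_corr(r_ij, r_ik, r_jk):
    """Partial correlation of variables i and j after controlling for k."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1.0 - r_ik ** 2) * (1.0 - r_jk ** 2))


def kmo_three(r_ab, r_ac, r_bc):
    """Overall KMO for three variables a, b, c from their pairwise correlations.

    KMO = sum(r^2) / (sum(r^2) + sum(partial r^2)), summed over variable pairs.
    Large partial correlations relative to the simple correlations lower the KMO.
    """
    p_ab = partial_corr(r_ab, r_ac, r_bc)  # a-b controlling for c
    p_ac = partial_corr(r_ac, r_ab, r_bc)  # a-c controlling for b
    p_bc = partial_corr(r_bc, r_ab, r_ac)  # b-c controlling for a
    sum_r2 = r_ab ** 2 + r_ac ** 2 + r_bc ** 2
    sum_p2 = p_ab ** 2 + p_ac ** 2 + p_bc ** 2
    return sum_r2 / (sum_r2 + sum_p2)


# Illustrative pairwise correlations (hypothetical values).
print(round(kmo_three(0.5, 0.5, 0.5), 4))
```

A KMO above 0.5 under this definition means that the shared correlation structure outweighs the pairwise-specific (partial) correlations, which is the condition the text uses to screen candidate indices.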

3.4.3 Comprehensive evaluation of vegetation based on principal component analysis

The communality analysis of the original variables revealed exceptionally high variance extraction values for the selected parameters, with NPP, LAI, and ARVI showing extraction values of 0.927, 0.935, and 0.999, respectively. These results indicate that the principal components explain 92.7%, 93.5%, and 99.9% of the variance in NPP, LAI, and ARVI, respectively, confirming their suitability for dimensionality reduction (Table 3).

Table 3. Table of common factor variances for vegetation factors.

The principal component analysis yielded substantial dimensionality reduction, with only the first component having an initial eigenvalue greater than 1 (2.322), accounting for 77.408% of the total variance in the dataset and surpassing the conventional 70% threshold for adequate representation. Although the second component’s eigenvalue (0.539) fell below the Kaiser criterion threshold, its inclusion raised the cumulative variance explained to 95.376%. This two-component solution therefore captures nearly all (95.4%) of the systematic variation in the original vegetation parameters, an improvement of 18.0 percentage points in explanatory power over the single-component model (Table 4).

Table 4. Total variance explained for vegetation factors.
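The variance shares quoted for the two components follow directly from the eigenvalues, because the eigenvalues of a correlation matrix of three standardized variables sum to 3. The short check below uses only the two eigenvalues reported in the text; the variable names are hypothetical, and small differences from the published percentages arise because the eigenvalues are rounded to three decimals.

```python
# Eigenvalues of the first two principal components, as reported in the text.
eigenvalues = [2.322, 0.539]

# Three standardized input variables (NPP, LAI, ARVI): the trace of their
# correlation matrix, and hence the sum of all eigenvalues, equals 3.
n_variables = 3

# Percentage of total variance explained by each retained component.
share = [100.0 * ev / n_variables for ev in eigenvalues]
cumulative = sum(share)

# Kaiser criterion: retain components whose eigenvalue exceeds 1.
kaiser = [ev > 1.0 for ev in eigenvalues]

print([round(s, 1) for s in share], round(cumulative, 1), kaiser)
```

The second entry of `share` is the gain, in percentage points, from retaining the second component even though it does not meet the Kaiser criterion.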

Furthermore, the scree plot analysis revealed a distinct inflection point occurring at the second principal component, providing additional statistical justification for retaining two components in our model (Figure 5).

Figure 5
Scree plot showing eigenvalues on the vertical axis and component numbers on the horizontal axis. Points are plotted at three components, with eigenvalues decreasing from approximately 2.5 to 0.3.

Figure 5. Scree plot.

Based on the total variance explanation (Table 4) and component matrix (Table 5) results, the final component score coefficient matrix (Table 6) yields the following principal component equations:

Table 5. Component matrix for vegetation factors.

Table 6. Component score coefficient matrix for vegetation factors.

F1 = 0.570×NPP + 0.637×LAI - 0.367×ARVI

F2 = -0.130×NPP - 0.245×LAI + 1.159×ARVI

The absolute values of coefficients reveal distinct ecological interpretations:

1. In F1, LAI has the dominant influence (0.637), followed by NPP (0.570) and ARVI (-0.367), suggesting that this component primarily represents vegetation structural characteristics.

2. F2 shows ARVI’s overwhelming contribution (1.159), indicating spectral reflectance dominance in this component.

The composite score weights were derived from each component’s proportion of total eigenvalues, with rotated sums of squared loadings being 1.778 (F1) and 1.083 (F2). This produces the integrated formula:

F = (1.778×F1 + 1.083×F2)/(1.778 + 1.083) = 0.621×F1 + 0.379×F2

Through coefficient normalization of the vegetation indicators, the final comprehensive vegetation index (VEG) is calculated as:

VEG = 0.372×NPP + 0.370×LAI + 0.258×ARVI
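The chain from component score coefficients to the VEG weights can be reproduced numerically. The sketch below takes the F1 and F2 coefficients and the two rotated sums of squared loadings reported above, evaluates the component weights directly as ratios of those sums, combines the coefficients, and rescales them to sum to 1. The variable names are hypothetical, and the rescaling step is an assumption about what "coefficient normalization" means here; under that assumption the result agrees with the published VEG coefficients to within about 0.001.

```python
# Component score coefficients reported for F1 and F2.
f1 = {"NPP": 0.570, "LAI": 0.637, "ARVI": -0.367}
f2 = {"NPP": -0.130, "LAI": -0.245, "ARVI": 1.159}

# Rotated sums of squared loadings for F1 and F2, as reported in the text.
ssl_f1, ssl_f2 = 1.778, 1.083

# Component weights: each component's share of the summed loadings.
w1 = ssl_f1 / (ssl_f1 + ssl_f2)
w2 = ssl_f2 / (ssl_f1 + ssl_f2)

# Weighted combination of the two components, indicator by indicator.
combined = {name: w1 * f1[name] + w2 * f2[name] for name in f1}

# Assumed normalization: rescale the combined coefficients so they sum to 1.
total = sum(combined.values())
veg_weights = {name: value / total for name, value in combined.items()}

print(round(w1, 3), round(w2, 3))
print({name: round(value, 3) for name, value in veg_weights.items()})
```

Because the weights sum to 1, VEG stays on the same scale as its inputs provided NPP, LAI, and ARVI have been brought to a common range beforehand.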

3.4.4 Spatio-temporal distribution characteristics of the comprehensive vegetation index

The temporal and spatial analysis of the comprehensive vegetation index (VEG) revealed distinct patterns across multiple scales. Seasonal boxplots consistently showed peak VEG values during the summer months, with values approximately normally distributed at all temporal scales (Figure 6). Annual trend analysis showed sustained vegetation improvement from 2018 onward, with only a slight decline observed in 2024, suggesting generally positive vegetation dynamics in the study area.

Figure 6
Panel (a) is a box plot showing the seasonal average of VEG across winter, spring, summer, and autumn, with summer having the highest median. Panel (b) is a line graph illustrating the annual average of VEG from 2016 to 2025, showing a rising trend from 2019 onwards peaking in 2023.

Figure 6. Mean of VEG from 2017 to 2024: (a) seasonal scale; (b) annual scale.

Spatially, the southwestern Xinglong Mountain area consistently exhibited the highest VEG values across all seasons, a pattern that coincides with the established forest park in this region. Notably, the spatial distribution of VEG was highly consistent with the spatial patterns of soil moisture (Supplementary Figure S2; Figure 7). Both indices displayed significantly lower values in the northern reaches than in the southern areas of the main watershed. This spatial congruence between VEG and soil moisture supports the robustness of integrating NPP, LAI and ARVI for spatial vegetation characterization, as the index captures the eco-hydrological relationships in the watershed system.

Figure 7
Four maps display seasonal vegetation data labeled as “VEG” for spring, summer, autumn, and winter. Green indicates low vegetation levels and blue indicates high levels. A directional north arrow and a 20-kilometer scale bar are included.

Figure 7. The seasonal-scale spatial distribution of VEG from 2017 to 2024.

3.5 The response of soil moisture to vegetation

To examine the spatial correlations and response mechanisms between SM and VEG, we conducted a high-resolution, pixel-scale Spearman correlation analysis using long-term (2017-2024) VEG and SM data.
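As an illustration of the pixel-scale analysis, the sketch below computes Spearman's rank correlation for each pixel's paired SM and VEG time series and classifies the result. Everything here is a simplified stand-in: the pixel names and series are made up, real rasters would be processed as arrays rather than dictionaries, and significance is approximated by comparing |ρ| with 0.738, the commonly tabulated two-sided 5% critical value for n = 8 observations (assuming one value per year for 2017-2024) rather than by the p-values used in the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))


def classify(rho, rho_crit=0.738):
    """Label a pixel using an assumed critical value (n = 8, two-sided 5%)."""
    if rho >= rho_crit:
        return "significant positive"
    if rho <= -rho_crit:
        return "significant negative"
    return "non-significant"


# Hypothetical per-pixel annual series: (SM, VEG) for eight years.
pixels = {
    "pixel_a": ([0.12, 0.13, 0.15, 0.14, 0.17, 0.18, 0.20, 0.19],
                [0.30, 0.31, 0.35, 0.33, 0.38, 0.40, 0.44, 0.42]),
    "pixel_b": ([0.20, 0.18, 0.19, 0.17, 0.15, 0.16, 0.13, 0.14],
                [0.30, 0.33, 0.31, 0.35, 0.38, 0.36, 0.42, 0.40]),
    "pixel_c": ([0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80],
                [0.35, 0.31, 0.37, 0.33, 0.32, 0.38, 0.34, 0.36]),
}

labels = {name: classify(spearman(sm, veg)) for name, (sm, veg) in pixels.items()}

# Share of pixels (a proxy for area on a regular grid) in each class, in percent.
shares = {c: 100.0 * list(labels.values()).count(c) / len(labels) for c in set(labels.values())}
print(labels)
```

On a regular grid, the class shares computed this way correspond to the area proportions summarized later for each soil depth.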

Supplementary Figure S3 reveals that in spring, areas with statistically significant correlations (p < 0.05) between SM and VEG at 0–10 cm and 10–20 cm depths were primarily concentrated in the southern main channel region. The correlation coefficients displayed a distinct spatial pattern, with positive correlations in the south and negative correlations in the north. At 20–30 cm depth, positive correlations prevailed throughout the study area, though the spatial extent of significant correlations was reduced compared to shallower layers.

Summer patterns (Supplementary Figure S4) showed notable differences from spring conditions. The significantly correlated areas expanded considerably, with the north-south contrast disappearing and the entire basin exhibiting strong positive correlations. This pattern likely reflects summer vegetation growth and maturity, coupled with enhanced soil moisture.

Autumn spatial correlations closely resembled spring patterns (Supplementary Figure S5). However, winter showed markedly reduced significance in spatial correlations across the basin, with no clear spatial differentiation (Supplementary Figure S6). This seasonal decline may be attributed to vegetation dormancy in winter, particularly for the deciduous tree species that dominate northern regions.

Figure 8 shows that from 2017 to 2024, soil moisture at all three soil depths exhibited significant correlations with VEG in the basin’s main areas, with significantly correlated areas being much more extensive at the annual scale than at the seasonal scale. All areas classified as significantly positively or negatively correlated have p-values < 0.05. At 0–10 cm and 10–20 cm depths, the correlations showed a distinct spatial pattern of significant positive correlations in southern areas and negative correlations in northern regions. In contrast, at 20–30 cm depth, the correlations were predominantly positive across the study area, without the clear north-south differentiation observed in shallower layers.

Figure 8
Maps show correlation coefficients and significant correlations between soil moisture (SM) and vegetation (VEG) at depths of 0-10 cm, 10-20 cm, and 20-30 cm. The top row uses a gradient from blue (low) to red (high), indicating correlation strength. The bottom row categorizes areas into red for significant positive, blue for significant negative, and white for no significant correlation, displaying spatial patterns across different soil depths.

Figure 8. The annual-scale spatial correlation analysis of SM and VEG.

By calculating the proportion of statistically significant correlation areas (Figure 9), it can be observed that approximately half of the Wanchuan River Basin shows significant correlations between vegetation and soil moisture at depths of 0–10 cm and 10–20 cm, indicating a strong interactive response between these two variables. In contrast, at the 20–30 cm depth, non-significant correlation areas dominate the basin. As for the significant correlation areas, positive correlation is predominant, suggesting a synergistic relationship between vegetation and soil moisture in the Wanchuan River Basin, where both tend to increase or decrease simultaneously. Moreover, this coordinated pattern is unlikely to be attributed to random chance, as the correlations are statistically significant.

Figure 9
Three pie charts display the proportion of significant correlations at different soil depths: 0-10 cm, 10-20 cm, and 20-30 cm. Each shows categories of correlation types: significant positive (orange), significant negative (blue), and non-significant (gray). The first chart shows 31.6% significant positive, 17.5% significant negative, and 50.9% non-significant. The second chart shows 27.6% significant positive, 22.6% significant negative, and 49.8% non-significant. The third chart shows 33% significant positive, 4.4% significant negative, and 62.6% non-significant.

Figure 9. Proportion of significant correlation areas between SM and VEG.

3.6 Coupling and coordination analysis of soil moisture and comprehensive vegetation index

The spatial coupling coordination degree between SM and VEG reveals their landscape-scale synergistic interactions (Supplementary Figure S7). Summer patterns exhibited pronounced spatial heterogeneity across all three soil depths, consistently showing a clear north-south divergence characterized by coordinated conditions in southern areas and uncoordinated states in northern regions. The Xinglong Mountain area in the southwest stood out with moderate-to-high coordination levels, reflecting strong system synergy in which SM and VEG functioned as an integrated system with robust capacity for sustainable development.

During spring and autumn, watershed-scale coupling coordination was primarily marginally coordinated, indicating a basic but suboptimal synergistic relationship between the two systems that would require structural optimization to improve. In winter, uncoordinated states prevailed at all depths except the 10–20 cm layer. This pattern reflects limited system integration, in which one component’s development lagged substantially behind the other, particularly in the surface and deep soil layers.

The annual-scale coupling coordination pattern between SM and VEG in the Wanchuan River Basin shows characteristics similar to spring and autumn conditions (Figure 10). The proportional distribution of the five coupling coordination degree (Dvs) levels (Figure 11) shows that at the 0–10 cm and 10–20 cm depths, the marginally coordinated category is clearly dominant, accounting for over 85% of the basin. At the 20–30 cm depth, although the share of the marginally coordinated type remains relatively high, it is more than 30 percentage points lower than at the shallower depths, while the slightly disordered category increases notably. Overall, marginal coordination represents the primary manifestation of the soil moisture–vegetation coupling relationship in the Wanchuan River Basin.
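To make the Dvs values easier to interpret, the sketch below implements the widely used two-system form of the coupling coordination degree model: a coupling degree C, a comprehensive index T, and D = sqrt(C × T). This is an assumption about the general form only; the equal weights, the min-max scaling helper, and the equal-width level cut-offs (0.2, 0.4, 0.6, 0.8) are hypothetical and may differ from the definitions given in the methods section of this study.

```python
import math


def min_max(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def coupling_coordination(u_sm, u_veg, alpha=0.5, beta=0.5):
    """Coupling coordination degree for two normalized subsystems.

    C = 2 * sqrt(u_sm * u_veg) / (u_sm + u_veg)   (coupling degree)
    T = alpha * u_sm + beta * u_veg               (comprehensive index)
    D = sqrt(C * T)
    The equal weights alpha = beta = 0.5 are an assumption.
    """
    if u_sm + u_veg == 0:
        return 0.0
    c = 2.0 * math.sqrt(u_sm * u_veg) / (u_sm + u_veg)
    t = alpha * u_sm + beta * u_veg
    return math.sqrt(c * t)


# Hypothetical equal-width cut-offs for five levels (upper bound, label).
LEVELS = [
    (0.2, "extreme disorder"),
    (0.4, "slight disorder"),
    (0.6, "marginal coordination"),
    (0.8, "moderate coordination"),
    (1.0, "high coordination"),
]


def level(d):
    """Map a coordination degree in [0, 1] to its (assumed) level label."""
    for upper, label in LEVELS:
        if d <= upper:
            return label
    return LEVELS[-1][1]


d = coupling_coordination(0.5, 0.5)
print(round(d, 4), level(d))
```

Under this form, D is high only when both subsystems are well developed and balanced, so a pixel with high vegetation but low soil moisture (or the reverse) scores low.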

Figure 10
Three maps showing soil coordination at depths of zero to ten centimeters, ten to twenty centimeters, and twenty to thirty centimeters. Colors indicate coordination levels: green for extreme disorder, light green for mild disorder, yellow for barely coordinated, orange for moderately coordinated, and red for highly coordinated areas.

Figure 10. The annual-scale spatial coupling coordination degree of SM and VEG.

Figure 11
Three pie charts represent the proportion distribution of Dvs values at different depths. The first chart (0-10cm) shows 89.7% in light yellow, 8.2% in orange, and 2% in blue. The second chart (10-20cm) shows 85.8% in light yellow, 13.9% in orange, and 0.3% in blue. The third chart (20-30cm) shows 52.2% in light yellow, 44.8% in blue, and 3% in orange. The key indicates color ranges corresponding to value categories.

Figure 11. Proportion of coupling coordination degree areas between SM and VEG.

3.7 Analysis of influencing factor importance for Dvs indicators

The Gini-index-enhanced random forest algorithm was employed to screen the influencing factors of Dvs, incorporating factors from three categories: anthropogenic (population density), climatic (mean annual precipitation, mean annual temperature, land surface temperature, potential evapotranspiration), and topographic (elevation, slope, aspect). As shown in Figure 12, the importance rankings are fully consistent across all three soil depths: land surface temperature (x2) and potential evapotranspiration (x7) are the most significant factors influencing Dvs in the Wanchuan River Basin, with a combined contribution close to 50%. Aspect and slope (x3, x4) have the least influence, while elevation (x8) and population density (x1) exhibit similar impact levels. Precipitation (x6) and air temperature (x5) rank next after x2 and x7, but their effects are notably less pronounced than those of land surface temperature and potential evapotranspiration.
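The quantity underlying Gini-based importance can be illustrated without a full forest. In a random forest, a factor's importance is commonly obtained by summing the impurity decrease of every split that uses it, averaging over trees, and normalizing to percentages. The sketch below shows one such split and the final normalization; it assumes Dvs is handled as discrete coordination levels (a continuous target would normally use variance reduction instead), and the function names and toy data are hypothetical. It does not reproduce the specific enhancement used in this study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels (0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(feature, labels, threshold):
    """Impurity decrease from splitting samples at feature <= threshold."""
    left = [lab for f, lab in zip(feature, labels) if f <= threshold]
    right = [lab for f, lab in zip(feature, labels) if f > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def to_percent(importances):
    """Normalize accumulated impurity decreases to percentages summing to 100."""
    total = sum(importances.values())
    return {name: 100.0 * value / total for name, value in importances.items()}


# Toy example: one candidate factor and coordination levels for four samples.
factor_values = [1.0, 2.0, 3.0, 4.0]
levels = ["marginal", "marginal", "moderate", "moderate"]
print(gini_decrease(factor_values, levels, threshold=2.5))
```

Percentages of the kind shown in Figure 12 correspond to the output of a normalization like `to_percent` applied to the accumulated decreases of all eight factors.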

Figure 12
Three bar charts display the importance of factors at different depths: 0-10 cm, 10-20 cm, and 20-30 cm. In all charts, factor x2 has the highest significance, followed by x7 and x5. For 0-10 cm, x2 is 26.9 percent. For 10-20 cm, it is 23.7 percent. For 20-30 cm, it is 26.4 percent. Method used is Random Forest with 500 trees.

Figure 12. Ranking of influencing factors by importance for Dvs.

4 Discussion

To visualize soil moisture and vegetation characteristics and investigate their interactions, this study first employed Pearson correlation analysis to identify indicators significantly correlated with SM for regression model construction, thereby improving model accuracy and enabling multi-temporal spatial visualization of SM. Simultaneously, by evaluating seven commonly used vegetation indices, three vegetation descriptors—vegetation indices, NPP, and LAI—were integrated to develop a comprehensive vegetation index (VEG) equation, facilitating multi-temporal spatial simulation of vegetation conditions.

Pixel-based high-resolution spatial correlation analysis was applied to the generated SM and VEG raster data to preliminarily assess their interactions, including the proportion of statistically significant correlations and their spatial distribution. To gain deeper insight into their synergistic mechanisms, a coupling coordination degree model was adopted to quantify the coordination relationship, clarifying the synergistic patterns and spatial structure between the two variables.

Furthermore, using a Gini index-optimized random forest algorithm, factors directly related to both SM and VEG—including human activity, climatic, and topographic variables—were analyzed to identify the most influential drivers. This analysis provides a scientific basis for offering effective recommendations for the coordinated development of soil moisture and vegetation in the Wanchuan River Basin.

4.1 The spatio-temporal variations of the SM and VEG of the Wanchuan River

Soil physicochemical properties serve as crucial indicators for determining soil moisture content. Nguemezi et al. (2020) demonstrated that these properties directly influence soil moisture and thereby limit plant growth. In this study, with soil moisture content as the dependent variable and SM-related indicators at different soil depths as independent variables, we established regression models for SM at each soil depth to investigate its spatial distribution characteristics. The results indicate that, among the physicochemical properties, bulk density and soil porosity exhibit the highest correlation coefficients with SM and appear most frequently as independent variables in the regression models. These two factors are therefore identified as the dominant drivers of SM estimation and have the greatest impact on SM variation. This finding aligns with existing studies (Ilek et al., 2017; Cui et al., 2023; Tian et al., 2018), confirming that the water storage capacity of soil layers is closely related to bulk density and that changes in bulk density affect soil hydraulic properties. The simulation results showed distinct methodological advantages: SMLR proved more suitable for soil moisture inversion at the 0–10 cm and 20–30 cm depths, while ridge regression emerged as the optimal approach at the 10–20 cm depth. These results are consistent with the findings of Alsaleh et al. (2025). SM exhibits no significant variation with depth, showing consistent spatial patterns across all three measured depths: lower values on the northern bank and higher values on the southern bank of the basin. Three interacting mechanisms may explain this spatial heterogeneity: (1) dense vegetation cover and thick litter layers enhance surface soil moisture, (2) reduced potential evapotranspiration decreases moisture loss from deep soil layers, and (3) extensive root systems of mature trees improve water storage capacity. These results demonstrate vegetation’s critical role in regulating complex vertical water redistribution dynamics in forest soil ecosystems.

The establishment of a vegetation status model for the Wanchuan River Basin enables timely and accurate recording of the spatiotemporal patterns and dynamic changes of vegetation cover, providing a scientific basis for strengthening ecological protection and soil-water conservation efforts (Liu D. et al., 2022). For comprehensive vegetation evaluation, principal component analysis was used to study the correlations among three vegetation parameters. The absolute magnitude of the coefficients in the first principal component equation followed LAI > NPP > ARVI, while ARVI dominated the second principal component, differing from the results of Zhou et al. (2021). In contrast to Zhou et al. (2021), who combined NDVI, LAI, and NPP for PCA based on subjective judgment without conducting KMO and Bartlett’s tests or correlation analysis, this study implemented a rigorous preliminary screening process. We determined that ARVI, NPP, and LAI are more suitable for constructing a composite vegetation index. The normalized results demonstrate improved accuracy in vegetation characterization. Accordingly, we recommend using a composite vegetation index integrating NPP, LAI, and ARVI for vegetation characterization and for its correlation analysis with soil moisture. This conclusion differs from Javier et al. (2025), who identified NDVI as the most suitable vegetation index for moisture-related studies. The discrepancy may be attributed to the high proportion of built-up areas in the Wanchuan River Basin, particularly the main urban area of Yuzhong County, Lanzhou, located on the south bank of the river, where aerosol content is notably elevated. Since ARVI is more effective than NDVI in regions with high aerosol concentrations, these findings have both a theoretical foundation and practical value, making them more suitable for guiding vegetation-soil moisture conservation strategies in the Wanchuan River Basin.

Furthermore, the study showed that, temporally, the comprehensive vegetation index exhibited a gradually increasing trend with a clear normal distribution, indicating continuous improvement in vegetation conditions in the study area. High-value areas were mainly distributed in the Xinglong Mountain area in the southwest of the study area, while low-value areas were mainly located in low-coverage regions along the northern bank. The comprehensive vegetation index had significant effects on SM across different periods, consistent with the findings of Li et al. (2025).

4.2 Investigating SM–VEG relationships and influencing factors in the Wanchuan River Basin

This study examined the spatial response between vegetation and soil moisture at a high resolution of 10 m × 10 m, which improves analytical precision and spatial detail compared with previous low-resolution grid studies. The results indicate that even small vegetation patches have significant effects on soil moisture, similar to the findings of Xu et al. (2025). However, this study also reveals that as soil depth increases, the proportion of statistically significant correlation areas decreases, the synergistic effect between vegetation and soil moisture weakens, and their mutual influence diminishes. This finding contradicts the results reported by Xu et al. (2025). The difference may be due to significant terrain variations in the Wanchuan River Basin, where soils are mainly loose loess and the dominant vegetation in most areas consists of shrubs or herbs, which have greater water retention effects on surface soil than on deeper layers.

Vegetation-soil coordination is key to the success of farmland-to-forest conversion projects (Feng et al., 2023). Coupling coordination analysis can reflect interactions between vegetation-soil systems and quantitatively evaluate their coordination during ecological development. The vegetation-soil coupling model proposed by Feng et al. (2023) was used to study the synergistic effects between vegetation and soil and to determine suitable vegetation restoration measures for local ecosystems. The results showed that the interaction and coordination between the hydrological environment and vegetation varied significantly by season: summer showed the greatest differences, with coordinated and uncoordinated areas distributed on the south and north banks, respectively; spring and autumn showed mostly marginal coordination across the basin; winter showed mostly uncoordinated conditions except at the 10–20 cm depth. At the annual scale, the coupling coordination degree predominantly exhibits marginal coordination, consistent with the findings of Chen et al. (2022). Except for the Xinglong Mountain area, which demonstrates moderate coordination, most of the basin is characterized by a marginally coordinated state. The areal proportion results further confirm that marginal coordination is the dominant pattern characterizing the soil moisture–vegetation system in the Wanchuan River Basin. These results indicate that although current ecological restoration efforts have established a fundamental synergy between hydrological processes and vegetation dynamics, further optimization of ecosystem structure is still required. Integrating ecological conservation with economic development will be essential to progressively achieve regional sustainable development.

Li et al. (2014) demonstrated the influence of precipitation on soil moisture through correlation analysis, while Wang et al. (2020), using regression and correlation analysis, identified human activities as the most significant factor affecting vegetation indices. Gao et al. (2025), integrating the Optimal Parameter-based Geographical Detector (OPGD), Multiscale Geographically Weighted Regression (MGWR), and Partial Least Squares Structural Equation Modeling (PLS-SEM), found that topographic indicators contributed most to vegetation index variation. Similarly, Sur et al. (2013) confirmed the role of topography in influencing soil moisture. This study considered a broad set of potential influencing factors and selected eight drivers encompassing human activities, climate, topography, and potential evapotranspiration for analyzing the importance of factors affecting Dvs. Unlike previous approaches, we employed a Gini index-enhanced random forest algorithm to identify the most critical factors. The results show that land surface temperature and potential evapotranspiration are the most important factors influencing Dvs in the Wanchuan River Basin. This finding diverges from previous studies, indicating that updated screening methods and more inclusive factor selection have influenced the outcome. It also demonstrates that in the Wanchuan River Basin, the dominant factors affecting vegetation-soil moisture interactions are not conventional drivers such as precipitation, human activities, or topography, but rather more directly influential factors such as potential evapotranspiration and land surface temperature.

4.3 Limitations and prospects

This study presents, for the first time, a comprehensive methodology for constructing a comprehensive vegetation index and analyzing the response relationship between soil moisture and vegetation in the Wanchuan River Basin, a major tributary of the Yellow River. It systematically reveals the spatiotemporal distribution patterns and coupling coordination status of soil moisture and vegetation. Although the influencing factors identified in this study differ from previous research findings, the difference can be attributed to the variability in driving factors and soil physicochemical characteristics across the major tributaries of the Yellow River. This variability necessitates tailored model selection and case-specific analysis.

Furthermore, the foundational data used in this study were derived from Sentinel-1 and Sentinel-2 imagery, which have been operational only since approximately 2016. This significantly limits the long-term applicability of the results. Additionally, there are resolution discrepancies between the high-resolution Sentinel data and other datasets such as NPP, LAI, population density, and climatic factors.

To address these limitations and enhance the generalizability of the research, future work should focus on the following aspects:

1. Conduct systematic field sampling and image processing in other sub-basins of the upper Yellow River, such as the Zuli River Basin closest to the study area, to improve the generalizability of the soil moisture–vegetation coupling coordination model and the integrated vegetation index.

2. Integrate multi-sensor satellite data, including Landsat and MODIS, to develop accurate soil moisture–vegetation coupling coordination models and comprehensive vegetation index construction methods for periods prior to 2016.

5 Conclusion

This study systematically developed inversion models for assessing soil moisture in the Wanchuan River Basin by integrating field-measured soil physicochemical properties, Sentinel-1/2 remote sensing data, and multiple vegetation indices through comprehensive regression analysis. The results demonstrate that bulk density and soil porosity constitute the most critical physicochemical factors influencing soil moisture. The comparative evaluation of regression methods demonstrates that ridge regression is more suitable for the intermediate soil layer, while SMLR performs better for moisture inversion in both surface and deep soil layers, showing optimal simulation effectiveness for surface soil moisture in this ecologically sensitive area. No significant vertical variation in soil moisture was observed across the study area, while the overall spatial distribution exhibited a clear north–south gradient, with lower values in the northern part and higher values in the southern part.

Principal component analysis incorporating NPP, LAI, and multiple vegetation indices identified ARVI as the most suitable vegetation index to combine with NPP and LAI in a comprehensive vegetation index (VEG). VEG increased gradually over the study period, with interannual variation following a normal distribution, indicating measurable improvement in vegetation conditions across the basin. High-resolution spatial correlation analysis revealed strong interactions between soil moisture and vegetation: among the areas with statistically significant correlations, positive correlations dominated. Spatially, correlations were predominantly positive in the south and negative in the north, highlighting the mutual influence between soil moisture and vegetation dynamics.
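The pixel-wise correlation analysis summarized above can be sketched as follows. This is a minimal example under stated assumptions rather than the study's workflow: it assumes a Pearson correlation between the annual SM and VEG series of one pixel, tested with the usual t statistic, and the function names and the critical value passed in `t_crit` are hypothetical choices made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length series (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(float("inf"), r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def classify_pixel(sm, veg, t_crit):
    """Label one pixel as 'positive', 'negative', or 'not significant'.

    t_crit is the two-sided critical t value chosen by the caller for
    n - 2 degrees of freedom (for example 2.447 for n = 8 at the 0.05 level).
    """
    r = pearson_r(sm, veg)
    if math.isnan(r):
        return "not significant"
    t = t_statistic(r, len(sm))
    if abs(t) <= t_crit:
        return "not significant"
    return "positive" if r > 0 else "negative"
```

Applying `classify_pixel` to every pixel and summing the area in each class would give the kind of positive-versus-negative comparison described in the text.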

The coupling coordination degree model revealed distinct seasonal patterns in the SM-VEG relationship. Summer showed pronounced spatial variability, with coordinated conditions in the south and uncoordinated conditions in the north; spring and autumn exhibited widespread marginal coordination; and winter was predominantly uncoordinated basin-wide. Overall, marginal coordination is the predominant state of the soil moisture–vegetation coupling relationship across the basin. Variable importance screening indicated that land surface temperature and potential evapotranspiration are the most critical factors influencing the soil moisture–vegetation coupling coordination degree in the Wanchuan River Basin. These findings suggest that although ecological restoration on the Loess Plateau has established a marginal synergy between soil moisture and vegetation, the current level of coupling coordination remains suboptimal. Continued vegetation improvement and soil and water conservation measures are needed to moderate land surface temperature and potential evapotranspiration. Further optimization of ecosystem structure and integrated management strategies will be important for progressing toward sustainable development goals in the Yellow River Basin.
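For readers unfamiliar with the coupling coordination degree, the sketch below uses the commonly applied two-system form, in which C = 2·sqrt(U1·U2)/(U1 + U2), T = αU1 + βU2, and D = sqrt(C·T), with U1 and U2 the normalized SM and VEG values. The equal weights and, in particular, the class boundaries in `classify_ccd` are illustrative assumptions and not the thresholds used in this study; all function names are hypothetical.

```python
import math


def min_max(values):
    """Scale a list of values to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def coupling_coordination(u1, u2, alpha=0.5, beta=0.5):
    """Coupling coordination degree D for two normalized subsystems.

    u1, u2      : normalized SM and VEG values in [0, 1].
    alpha, beta : subsystem weights (assumed equal here), summing to 1.
    """
    if u1 + u2 == 0:
        return 0.0
    c = 2.0 * math.sqrt(u1 * u2) / (u1 + u2)   # coupling degree
    t = alpha * u1 + beta * u2                 # comprehensive development index
    return math.sqrt(c * t)


def classify_ccd(d, marginal_from=0.4, coordinated_from=0.6):
    """Map D to a class using ASSUMED, illustrative boundaries."""
    if d < marginal_from:
        return "uncoordinated"
    if d < coordinated_from:
        return "marginally coordinated"
    return "coordinated"
```

Under these assumed boundaries, a pixel with U1 = U2 = 0.2 gives D = sqrt(0.2) ≈ 0.45 and is labelled marginally coordinated, which shows that two well-matched but low-valued subsystems still score low because T enters the degree alongside C.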

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Author contributions

XinL: Conceptualization, Data curation, Formal Analysis, Methodology, Software, Visualization, Writing – original draft. XiaL: Conceptualization, Formal Analysis, Methodology, Writing – review & editing. XZ: Formal Analysis, Methodology, Writing – review & editing. JY: Conceptualization, Writing – original draft. ZL: Data curation, Investigation, Writing – review & editing. HZ: Investigation, Writing – review & editing. XuL: Conceptualization, Data curation, Funding acquisition, Project administration, Supervision, Validation, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. Department of Agriculture and Rural Affairs of Gansu Province: Soil Suitability Evaluation for Gansu Local Specialties (e.g., Forsythia and Astragalus) (03725156) and Scientific research start-up Project of Gansu Agricultural University: Driving Mechanisms and Ecological Effects of Spatiotemporal Changes in Vegetation Characteristics and Soil Nutrients in the Yellow River Basin (224024).

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2025.1700203/full#supplementary-material

References

Alsaleh, A. R., Alcibahy, M., Gafoor, F. A., Hashemi, H. A., Athamneh, B., Alhammadi, A. A., et al. (2025). Estimation of soil organic carbon in arid agricultural fields based on hyperspectral satellite images. Geoderma 453, 117151. doi: 10.1016/j.geoderma.2024.117151

Crossref Full Text | Google Scholar

Boyan, K., Yucel, T., and Abdul, M. M. (2015). Comparison between artificial neural network and partial least squares for on−line visible and near infrared spectroscopy measurement of soil organic carbon, pH and clay content. Soil Tillage Res. 146, 243–252. doi: 10.1016/j.still.2014.11.002

Crossref Full Text | Google Scholar

Braat, L. and de Groot, R. (2012). The ecosystem services agenda: bridging the worlds of natural science and economics, conservation and development, and public and private policy. Ecosystem Serv. 1, 4–15. doi: 10.1016/j.ecoser.2012.07.011

Crossref Full Text | Google Scholar

Breiman, L. (2001). Random forests. Mach. Learning. 45, 5–32. doi: 10.1023/A:1010933404324

Crossref Full Text | Google Scholar

Chen, J. H, Zhang, W. P, Song, L, and Wang, Y. F. (2022). The coupling effect between economic developmentand the urban ecological environment in Shanghai port. Sci. Total Environment. 841, 156734. doi: 10.1016/j.scitotenv.2022.156734

PubMed Abstract | Crossref Full Text | Google Scholar

Costanza, R., Groot, R., Braat, L., Kubiszewski, I., Fioramonti, L., Sutton, P., et al. (2017). Twenty years of ecosystem services: How far have we come and how far do we still need to go? Ecosystem Serv. 28, 1–6. doi: 10.1016/j.ecoser.2017.09.008

Crossref Full Text | Google Scholar

Cui, Q., Li, Z. X., Qi, F., Zhang, B. J., and Gui, J. (2023). Soil bulk density and altitude are primary drivers of soil water content and soil temperature in the Three Rivers Headwaters Region, China. Soil Use Managment 39, 1364–1387. doi: 10.1111/sum.12910

Crossref Full Text | Google Scholar

Deodoro, S. C., Moral, R. A., Fealy, R., McCarthy, T., and Fealy, R. (2023). An assessment of Sentinel-1 synthetic aperture radar, geophysical and topographical covariates for estimating topsoil particle-size fractions. Eur. J. Soil Science. 74, e13414. doi: 10.1111/ejss.13414

Crossref Full Text | Google Scholar

Feng, L. Q, Guo, M. M, Wang, W. L, Shi, Q. H, Guo, W.., Lou, Y. B, et al. (2023). Evaluation of the effects of long–term natural and artificial restoration on vegetation characteristics, soil properties and their coupling coordinations. Sci. Total Environ. 884, 163828. doi: 10.1016/j.scitotenv.2023.163828

PubMed Abstract | Crossref Full Text | Google Scholar

Gan, M., Jia, Y. H., Shao, M. A., Guo, C. J., and Li, T. C. (2019). Permanent gully increases the heterogeneity of soil water retention capacity across a slope-gully system. Agric. Ecosyst. Environ. 272, 206–217. doi: 10.1016/j.agee.2018.10.007

Crossref Full Text | Google Scholar

Gao, R., Ai, N., Liu, G. G., Liu, C. H., Qiang, F. F., Zhang, Z. Y., et al. (2022). The coupling relationship between herb communities and soil in a coal mine reclamation area after different years of restoration. Forests 13, 1481. doi: 10.3390/f13091481

Crossref Full Text | Google Scholar

Gao, J, Shi, C. Q, Yang, J. Y, Zhao, T. N, and Xie, W. X. (2025). Vegetation dynamics and responses to natural and anthropogenic drivers in a typical southern red soil region, China. Remote Sening 17, 2941. doi: 10.3390/rs17172941

Crossref Full Text | Google Scholar

Gu, Z. J., Duan, X. W., Shi, Y. D., Li, Y., and Pan, X. (2018). Spatiotemporal variation in vegetation coverage and its response to climatic factors in the Red River Basin, China. Ecol. Indic. 93, 54–64. doi: 10.1016/j.ecolind.2018.04.033

Crossref Full Text | Google Scholar

Hou, Q. Q., Ji, Z. X., Yang, H., and Yu, X. J. (2022). Impacts of climate change and human activities on different degraded grassland based on NDVI. Sci. Rep. 12, 1–18. doi: 10.1038/s41598-022-19943-6

PubMed Abstract | Crossref Full Text | Google Scholar

Ilek, A., Kucza, J., and Szostek, M. (2017). The effect of the bulk density and the decomposition index of organic matter on the water storage capacity of the surface layers of forest soils. Geoderma 285, 27–34. doi: 10.1016/j.geoderma.2016.09.025

Crossref Full Text | Google Scholar

Javier, A, Rafael, P, Francisco, J, and María, J. P. (2025). Using NDVI-derived vegetation vigour as a proxy for soil water content in Mediterranean-mountain traditional water management systems: Seasonal variability and restoration impacts. Ecol. Indic. 174, 113468. doi: 10.1016/j.ecolind.2025.113468

Crossref Full Text | Google Scholar

Khaledian, Y. and Miller, B. A. (2020). Selecting appropriate machine learning methods for digital soil mapping. Appl. Math. Model. 81, 401–418. doi: 10.1016/j.apm.2019.12.016

Crossref Full Text | Google Scholar

Lai, Y. Q., Wang, H. L., and Sun, X. L. (2021). A comparison of importance of modelling method and sample size for mapping soil organic matter in Guangdong, China. Ecol. Indic. 126, 107618. doi: 10.1016/j.ecolind.2021.107618

Crossref Full Text | Google Scholar

Li, Z. X, Cui, Q, Feng, Q, Qiao, B, and Gui, J. (2025). The variation of soil water content and its driving factors in different melting periods in the Three River Headwaters Region: A implication for vegetation restoration. Ecol. Eng. 212, 107495. doi: 10.1016/j.ecoleng.2024.107495

Crossref Full Text | Google Scholar

Li, X. Y, Liu, L. C, Duan, Z. H, and Wang, N. (2014). Spatio-temporal variability in remotely sensed surface soil moisture and its relationship with precipitation and evapotranspiration during the growingseason in the Loess Plateau, China. Environ. Earth Sci. 71, 1809–1820. doi: 10.1007/s12665-013-2585-7

Crossref Full Text | Google Scholar

Liang, S., Li, X. F., Zheng, X. M., Jiang, T., Li, X. J., and Qiao, D. J. (2020). Effects of winter snow cover on spring soil moisture based on remote sensing data product over farmland in northeast China. Remote Sens 12, 2716. doi: 10.3390/rs12172716

Crossref Full Text | Google Scholar

Liu, X. T., Cai, L., Li, M. Y., Yan, Y. L., Chen, H., and Wang, F. (2024). Why does afforestation policy lead to a drying trend in soil moisture on the Loess Plateau? Sci. Total Environ. 953, 175912. doi: 10.1016/j.scitotenv.2024.175912

PubMed Abstract | Crossref Full Text | Google Scholar

Liu, K., Gu, Y. C., Tang, L., Du, Y. H., Zhang, C., and Zhu, J. W. (2025). Random forest grid fault prediction based on genetic algorithm optimization. Front. Phys. 13. doi: 10.3389/fphy.2025.1480749

Crossref Full Text | Google Scholar

Liu, C. X, Zhang, X. D, Wang, T, Chen, G. Z, Zhu, K, Wang, Q, et al (2022). A GA-BP neural network regression model for predicting soil moisture in slope ecological protection. Sustainability 14, 1386. doi: 10.3390/su14031386

Crossref Full Text | Google Scholar

Liu, C. X., Zhang, X. D., Wang, T., et al. (2022). Detection of vegetation coverage changes in the Yellow River Basin from 2003 to 2020. Ecol. Indicators. 138, 108818. doi: 10.1016/j.ecolind.2022.108818

Crossref Full Text | Google Scholar

Louppe, G. (2014). Understanding random forests: from theory to practice. arXiv:1407.7502. 4:55–87. doi: 10.48550/arXiv.1407.7502

Crossref Full Text | Google Scholar

Madonsela, S., Cho, M. A., Ramoelo, A., Mutanga, O., and Naidoo, L. (2015). Estimating tree species diversity in the savannah using NDVI and woody canopy cover. Int. J. Appl. Earth observation geoinformation 66, 106–115. doi: 10.1016/j.jag.2017.11.005

Crossref Full Text | Google Scholar

Mngadi, M., Odindi, J., Mutanga, O., and Sibanda, M. (2022). Estimating aboveground net primary productivity of reforested trees in an urban landscape using biophysical variables and remotely sensed data. Sci. Total Environ. 802, 1–9. doi: 10.1016/j.scitotenv.2021.149958

PubMed Abstract | Crossref Full Text | Google Scholar

Nguemezi, C., Tematio, P., Yemefack, M., Tsozue, D., and Silatsa, T. B. F. (2020). Soil quality and soil fertility status in major soil groups at the Tombel area, South–West Cameroon. Heliyon 6, e03432. doi: 10.1016/j.heliyon.2020.e03432

PubMed Abstract | Crossref Full Text | Google Scholar

Peng, S. (2022) in 1-km monthly potential evapotranspiration dataset for China (1901-2024) (National Tibetan Plateau/Third Pole Environment Data Center). doi: 10.11866/db.loess.2021.001

Crossref Full Text | Google Scholar

Qi, Y., Farnoosh, A., and Lin, L. (2022). Coupling coordination analysis of China’s provincial water–energy–food nexus. Environ. Sci. pollut. Res. 29, 23303–23313. doi: 10.1007/s11356-021-17036-x

PubMed Abstract | Crossref Full Text | Google Scholar

Rawls, W. J., Pachepsky, Y. A., Ritchie, J. C., Sobecki, T. M., and Bloodworth, H. (2003). Effect of soil organic carbon on soil water retention. Geoderma 116, 61–76. doi: 10.1016/S0016-7061(03)00094-6

Crossref Full Text | Google Scholar

Rosa, M. and Franz, S. (2005). Manual for soil analysis - monitoring and assessing soil bioremediation. Soilbiol 5, 47–95. doi: 10.1007/3-540-28904-6

Crossref Full Text | Google Scholar

Shahriari, M., Delbari, M., Afrasiab, P., and Pahlavan-Rad, M. R. (2019). Predicting regional spatial distribution of soil texture in floodplains using remote sensing data: A case of southeastern Iran. Catena 182, 104149. doi: 10.1016/j.catena.2019.104149

Crossref Full Text | Google Scholar

Sur, C., Jung, Y., and Choi, M. (2013). The effect of the bulk density and the decomposition index of organic matter on the water storage capacity of the surface layers of forest soils. Geoderma 207-208, 234–243. doi: 10.1016/j.geoderma.2013.05.007

Crossref Full Text | Google Scholar

Tian, Z. C, Gao, W. D, Kool, D, Ren, T. S, Horton, R, Heitman, J. L, et al. (2018). Approaches for estimating soil water retention curves at various bulk densities with the extended van Genuchten model. Water ResourcesResearch 54, 5584–5601. doi: 10.1029/2018WR022871

Crossref Full Text | Google Scholar

Wan, Z., Hook, S., and Hulley, G. (2021). MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V061. (Distributed by NASA EOSDIS Land Processes Distributed Active Archive Center). doi: 10.5067/MODIS/MOD11A2.061

Crossref Full Text | Google Scholar

Wang, H, Yao, F, Zhu, H. S, and Zhao, Y. Y. (2020). Spatiotemporal variation of vegetation coverage and its response to climate factors and human activities in arid and semi-arid areas: case study of the Otindag sandy land in China. Sustainability 12, 5214. doi: 10.3390/su12125214

Crossref Full Text | Google Scholar

Xu, H. H, Jiang, Z. Y, Ding, J. Y, Guo, X, Wang, Y. J, Li, X. Y, et al. (2025). Enrichment effects of vegetation patches on soil water content depend on precipitation and vegetation conditions: A global meta-analysis. J. Hydrology 661, 133590. doi: 10.1016/j.jhydrol.2025.133590

Crossref Full Text | Google Scholar

Yang, H., Yoo, H., Lim, H., Kim, J., and Choi, H. T. (2021). Impacts of soil properties, topography, and environmental features on Soil Water Holding Capacities (SWHCs) and their interrelationships. Land 10, 1–15. doi: 10.3390/land10121290

Crossref Full Text | Google Scholar

Yigini, Y. and Panagos, P. (2016). Assessment of soil organic carbon stocks under future climate and land cover changes in Europe. Sci. Total Environ. 557-558, 838–850. doi: 10.1016/j.scitotenv.2016.03.085

PubMed Abstract | Crossref Full Text | Google Scholar

Yu, X., Zhang, C. S., and Jie, X. (2015). Areas benefiting from water conservation in key ecological function areas in China. J. Resour. Ecol. 6, 375–385. doi: 10.5814/j.issn.1674-764x.2015.06.005

Crossref Full Text | Google Scholar

Zare, S., Shamsi, S. R. F., and Abtahi, S. A. (2019). Weakly−coupled geo−statistical mapping of soil salinity to Stepwise Multiple Linear Regression of MODIS spectral image products. J. Afr. Earth Sci. 152, 101–114. doi: 10.1016/j.jafrearsci.2019.01.008

Crossref Full Text | Google Scholar

Zhang, X. N., Li, X. D., Nian, L. L., Samuel, A., Liu, X. Y., Liu, X. L., et al. (2024). Topographic and climatic factors effect spatiotemporal coupling relationship of soil water conservation function with vegetation in source of the yellow river. Sustainability 16, 6039. doi: 10.3390/su16146039

Crossref Full Text | Google Scholar

Zhang, M. L., Liu, X. N., Nazieh, S., Wang, X. Y., Nkrumah, T., and Hong, S. L.. (2020). Spatiotemporal distribution of grassland NPP in Gansu province, China from 1982 to 2011 and its impact factors. PLoS One 15, 1–15. doi: 10.1371/journal.pone.0242609

PubMed Abstract | Crossref Full Text | Google Scholar

Zhao, B. Q., Shen, Y. Y., Hu, X. K., Wu, Y. H., Zhang, L., Xia, D., et al. (2022). Evaluation of coupling coordination relationship between different habitat materials and vegetation system in the engineering disturbed area. Front. Materials 9. doi: 10.3389/fmats.2022.976489

Crossref Full Text | Google Scholar

Zhou, F. C., Han, X. Z., Tang, S. H., Song, X. N., and Wang, H. (2021). An improved model for evaluating ecosystem service values using land use/cover and vegetation parameters. J. Meteorological Res. 35, 148–156. doi: 10.1007/s13351-021-9199-x

Crossref Full Text | Google Scholar

Zuo, Z. L., Guo, H. X., Cheng, J. H., and Li, Y. L. (2021). How to achieve new progress in ecological civilization construction?—Based on cloud modeland coupling coordination degree model. Ecol. Indic. 127, 107789. doi: 10.1016/j.ecolind.2021.107789

Crossref Full Text | Google Scholar

Keywords: coupling coordination, random forest, soil moisture, vegetation dynamics, Wanchuan River Basin, Yellow River tributaries

Citation: Liu X, Li X, Zhang X, Yang J, Li Z, Zhang H and Liu X (2026) The response and coupling coordination of soil moisture to vegetation in the Yellow River’s primary tributaries: a multi-source data analysis of the Wanchuan River Basin. Front. Plant Sci. 16:1700203. doi: 10.3389/fpls.2025.1700203

Received: 08 September 2025; Revised: 02 December 2025; Accepted: 18 December 2025; Published: 19 January 2026.

Edited by:

Xiao-Dong Yang, Ningbo University, China

Reviewed by:

Tayierjiang Aishan, Xinjiang University, China
Zengyan Li, Ningbo University, China
Changqing Jing, Xinjiang Agricultural University, China

Copyright © 2026 Liu, Li, Zhang, Yang, Li, Zhang and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xuelu Liu
