- 1School of Geological Engineering, Qinghai University, Xining, Qinghai, China
- 2School of Engineering, Qinghai Institute of University, Xining, Qinghai, China
Landslide disasters frequently occur in the upper reaches of the Yellow River, particularly within the Gonghe to Xunhua section. A precise evaluation of landslide susceptibility is vital for effective disaster prevention and mitigation. Integrated models that combine statistical methods with machine learning techniques have been widely adopted for landslide susceptibility assessments. However, the quality and composition of the positive sample training data have a significant impact on the accuracy of the outcomes. This study uses historical landslide data from the region and applies two statistical approaches, the information value (IV) and certainty factor (CF) methods, alongside three machine learning models: Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). Six integrated models (IV-RF, IV-SVM, IV-XGBoost, CF-RF, CF-SVM, and CF-XGBoost) are developed to evaluate landslide susceptibility in the Yellow River’s upper reaches (from Gonghe to Xunhua). The Receiver Operating Characteristic (ROC) curve and Accuracy (ACC) values are used to assess the models’ performance, and the susceptibility results are compared with the spatial distribution of newly identified landslides, which were detected using Small Baseline Subset-Interferometric Synthetic Aperture Radar (SBAS-InSAR) technology together with optical remote sensing images. The CF-XGBoost model is identified as the most effective. New landslide data were then added to the positive sample dataset to retrain the CF-XGBoost model, enhancing its predictive performance. The methodology proposed in this study not only enables effective evaluation of the accuracy and reliability of computational results derived from ensemble models, but also addresses the limitations caused by the delayed acquisition and insufficient number of landslide samples. Furthermore, the resulting landslide susceptibility assessment establishes a reliable technical foundation for local disaster management authorities to formulate scientifically sound risk mitigation and control strategies.
1 Introduction
Landslides are a common geological hazard, distinguished by their sudden occurrence and widespread impact (Jia et al., 2022; Jiang et al., 2022), presenting direct threats to nearby infrastructure and the safety of residents’ lives and property (Pareek et al., 2025). In the upper reaches of the Yellow River (from Gonghe to Xunhua), the region’s complex geological features, steep topography, sparse vegetation, and increasing human activities in recent years have led to a higher frequency of landslides (Tu et al., 2023; Zhao et al., 2022). Therefore, it is essential to improve the management of landslide risks and enhance the capacity for disaster prevention and mitigation in this area. Landslide susceptibility assessment, a key method for disaster prevention, helps identify high-risk zones through precise, reliable, and efficient technical systems, providing a scientific foundation for effective disaster reduction and prevention efforts (Wang and Bai, 2023; He et al., 2023; Bhandary et al., 2013).
The goal of landslide susceptibility assessment is to forecast the likelihood of landslides by examining the spatial patterns of past landslides and the factors that influence their occurrence in a specific area (Sabatakakis et al., 2014; Rohan et al., 2023). The development of landslide disasters is influenced by a combination of internal factors (e.g., topography, geology, geological structure, transportation, and water systems) and external triggers (e.g., rainfall, earthquakes, and human engineering activities). The likelihood of a landslide varies depending on these factors (Lu et al., 2024). Traditional statistical methods calculate the probability of landslides by establishing mathematical relationships; they are simple and straightforward to apply but struggle to capture the complex interactions between landslides and various factors, leading to relatively low prediction accuracy (Zhang et al., 2022). With advancements in computer technology, machine learning models have increasingly been used for landslide susceptibility prediction (Dou et al., 2023; Qi et al., 2024; Huang et al., 2023). Unlike traditional statistical methods, machine learning models are capable of identifying nonlinear relationships between landslides and influencing factors, significantly improving prediction accuracy (Huang et al., 2020). However, single machine learning models often struggle to match training data with real-world conditions, making it difficult to fully capture the nonlinear interactions between landslides and evaluation factors. Combining statistical methods with machine learning models can help address this issue (Umar et al., 2014). The integration of these methods for landslide susceptibility assessment has become a prominent trend in research. For example, Wang et al. used the IV and CF methods along with the RF model for landslide susceptibility assessment in Ningnan County, demonstrating that the integrated model performed better than individual models (Wang J. et al., 2024). Liu et al. proposed the SF-Stacking method, which incorporates spatial heterogeneity and feature selection, for landslide susceptibility assessment in Yibin City. The results showed that SF-Stacking outperformed individual models such as BPNN, SVM, and KNN in terms of accuracy (Liu and Chen, 2024). Wang Jingjing et al. employed a bidirectional long short-term memory model based on landslide density (LD-BiLSTM) for landslide susceptibility assessment in Luding County, achieving higher accuracy than both the RF and IV models. These studies have shown that integrated models can effectively overcome the limitations of single models and improve landslide prediction accuracy (Wang L. et al., 2024).
In the existing body of literature, many studies have relied on historical landslide data as training datasets for landslide susceptibility assessments (Hong et al., 2024; Mao et al., 2021; Xing et al., 2021; Gu et al., 2024; Xing et al., 2023), often overlooking newly occurring landslide events. However, older, larger, and more destructive landslides may since have been mitigated through measures such as slope reinforcement by the relevant geological disaster management authorities, so predictions based solely on historical data could be less accurate. This study introduces a coupled approach that integrates statistical methods, machine learning models, and SBAS-InSAR technology to assess landslide susceptibility in the upper reaches of the Yellow River. The study is structured in three key components. First, historical landslide data from 1998 to 2012, provided by the China Geological Survey (https://www.cgs.gov.cn/), were used to form the sample set. Two statistical methods (IV and CF) were combined with advanced machine learning models, including RF, SVM, and XGBoost. Landslide susceptibility predictions were generated for six integrated models (IV-RF, IV-SVM, IV-XGBoost, CF-RF, CF-SVM, CF-XGBoost). Second, SBAS-InSAR technology along with optical remote sensing images was applied to detect new landslides that occurred in the study area from 2021 to 2023. The newly identified landslides were then compared with the susceptibility results from the six models, which revealed that the CF-XGBoost model was the most effective. Finally, the newly identified landslide data were incorporated into the CF-XGBoost model as a positive sample set to calculate the landslide susceptibility index for the study area, and risk zoning was performed using the natural breaks method. These findings provide an important scientific foundation for landslide risk management and prevention in the upper reaches of the Yellow River (from Gonghe to Xunhua).
In summary, to address the current limitations in research on landslide susceptibility assessment in the upper reaches of the Yellow River, the Gonghe to Xunhua section was selected as the study area for conducting systematic landslide susceptibility prediction. The main contributions of this study are as follows:
• A positive sample set was established based on historical landslide points identified in the study area between 1998 and 2012. Landslide susceptibility was predicted by integrating statistical methods with advanced machine learning techniques. The predictive performance of the integrated model was further validated using newly identified landslide points from 2021 to 2023, which were detected through SBAS-InSAR and optical remote sensing imagery.
• The newly identified landslide points were incorporated into the positive sample set to update it. The optimal model (CF-XGBoost) was retrained using the updated dataset, resulting in an improved landslide susceptibility assessment for the upper reaches of the Yellow River. This approach ensures that the training samples remain temporally relevant and enhances the model’s predictive accuracy.
2 Materials and methods
2.1 Research area
The upper reaches of the Yellow River, particularly the Gonghe and Xunhua sections, are located in the southeastern part of the Qinghai-Tibet Plateau. The topography features elevated areas in the west, north, and south, with lower elevations in the east. Altitudes in this region range from 1657 to 4121 m. The river passes through significant areas, including the Longyang Gorge, Lijia Gorge, Guide County, Jianzha County, and Xunhua County (Wang Q. et al., 2024; Fei et al., 2023; Du et al., 2023). The climate in the study area is a plateau continental type, with average annual rainfall over the last 5 years ranging from 550 to 670 mm. Due to the relatively low precipitation, the normalized difference vegetation index (NDVI) remains below 0.3 in most parts of the region, indicating sparse vegetation and considerable desertification. In this ecologically fragile environment, which is further affected by river erosion and human engineering activities, landslide occurrences are common (Shi et al., 2019; Dong et al., 2018). Figure 1 illustrates the general situation and historical landslide data for the area. These landslides pose significant risks to infrastructure along the riverbanks and threaten the safety of residents and their property.
2.2 Data sources
To accurately detect new, unrecorded landslides, this study employed a method that combines SBAS-InSAR technology with optical remote sensing imagery. Data from Sentinel-1A ascending and descending tracks from January 2021 to December 2023 were used, with 123 scenes for ascending and 149 scenes for descending tracks. SRTM external elevation data with a 30 m resolution and precise orbit data were utilized for orbit error correction. Optical remote sensing images from Landsat-8, also with a 30 m resolution, were chosen for this study. The specific data sources are shown in Table 1. For the selection of non-landslide points, this study randomly selected 167 non-landslide points outside the 2 km buffer zone of landslide locations to maintain a 1:1 ratio between positive and negative samples. This balanced sampling approach prevents potential degradation in ensemble model accuracy caused by imbalanced sample distribution. The classification criteria of evaluation factors, spatial distribution of landslide points, and detailed procedures for non-landslide point selection are visually presented in Figure 2.
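The selection of non-landslide points outside the 2 km buffer can be sketched as simple rejection sampling. This is a minimal illustration, not the procedure actually used in the study: the function name `sample_non_landslides`, the bounding box, and the three landslide coordinates are hypothetical, and coordinates are assumed to be in a projected system with units of metres.

```python
import math
import random

def sample_non_landslides(landslides, bbox, n, buffer_m=2000.0, seed=0, max_tries=100000):
    """Rejection-sample n points lying farther than buffer_m from every landslide point."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    picked = []
    tries = 0
    while len(picked) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("buffer leaves too little room for sampling")
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Keep the candidate only if it is outside every landslide buffer.
        if all(math.hypot(x - lx, y - ly) > buffer_m for lx, ly in landslides):
            picked.append((x, y))
    return picked

# Hypothetical landslide locations (metres) and study-area bounding box.
landslides = [(5000.0, 5000.0), (12000.0, 8000.0), (20000.0, 3000.0)]
negatives = sample_non_landslides(
    landslides, bbox=(0.0, 0.0, 30000.0, 15000.0), n=len(landslides)
)
```

Drawing as many negatives as there are landslide points reproduces the 1:1 ratio described above; in practice the candidates would also need to be restricted to the study-area polygon rather than a bounding box.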

Figure 2. The classification of each evaluation factor and the landslide point and non-landslide location.
Landslide disasters develop through a complex process, typically influenced by a combination of natural factors and human engineering activities (Nguyen et al., 2025). In this study, 16 evaluation factors were initially selected from five key landslide influencing categories: geological environment, topography and geomorphology, meteorology and hydrology, vegetation and soil, and human engineering activities. These factors include elevation, slope, aspect, plan curvature, profile curvature, topographic wetness index, normalized difference vegetation index, rainfall, distance to faults, distance to rivers, distance to roads, formation lithology, land use, surface roughness, topographic relief, and surface cutting degree. To maintain consistency in the spatial representation of each factor, a 30 m spatial resolution was applied. Continuous factors were classified using the natural break method, while discrete factors were categorized based on their actual states (Wu et al., 2016). More details are provided in Supplementary Material.
2.3 Methods
2.3.1 Evaluation factor screening
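To ensure accurate landslide prediction results, it is important to conduct a correlation analysis of the 16 primary evaluation factors to assess their independence (Li et al., 2022). While all selected factors play a role in the development of landslide hazards, strong correlations between them can affect the evaluation outcomes and cause collinearity problems (Yang C. et al., 2023). Therefore, screening the evaluation factors is crucial to maintain the accuracy of the results. In this study, the Pearson correlation coefficient method was employed, and its calculation formula is as follows (Li C. et al., 2024):
$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \tag{1}$$
The correlation between the factors can be measured according to the calculated Pearson correlation coefficient ($r$), where $x_i$ and $y_i$ are the values of two evaluation factors at sample $i$, $\bar{x}$ and $\bar{y}$ are their mean values, and $n$ is the number of samples. The closer $|r|$ is to 1, the stronger the correlation between the two factors.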
2.3.2 Statistical approaches
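The IV method is based on assessing the uncertainty of information. The information value of each class of every evaluation factor affecting landslides in the study area is calculated, and a higher information value suggests a greater likelihood of landslide occurrence. The formula for computing the information value is as follows (Lv et al., 2024):
$$I = \ln\frac{N_i/N}{S_i/S} \tag{2}$$
In Equation 2, $I$ is the information value of class $i$ of an evaluation factor, $N_i$ is the number of landslides within class $i$, $N$ is the total number of landslides in the study area, $S_i$ is the area of class $i$, and $S$ is the total area of the study area.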
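The CF method estimates the tendency toward landslide occurrence for each state of the index factors, using landslide point data. The CF value ranges from −1 to 1, where, similar to the IV method, a higher value indicates a greater tendency for landslides to occur. The formula for calculating CF is as follows (Ding et al., 2025):
$$CF = \begin{cases} \dfrac{PP_a - PP_s}{PP_a\,(1 - PP_s)}, & PP_a \ge PP_s \\[2ex] \dfrac{PP_a - PP_s}{PP_s\,(1 - PP_a)}, & PP_a < PP_s \end{cases} \tag{3}$$
In Equation 3, $PP_a$ is the conditional probability of landslide occurrence within class $a$ of a factor (the ratio of the number of landslides in that class to the area of the class), and $PP_s$ is the prior probability of landslide occurrence over the whole study area (the ratio of the total number of landslides to the total area).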
2.3.3 Machine learning algorithms
Random forest (RF) is an ensemble learning algorithm that integrates multiple classification and regression trees. It constructs several decision trees using subsets of the data, aggregates the predictions from these trees, and ultimately determines the optimal result (Akinci, 2022). The RF algorithm randomly selects portions of the training dataset and features from these samples to train each individual learner, ensuring both independence among the trees and greater accuracy in the aggregated predictions. This method surpasses the performance of a single decision tree by averaging the outcomes, which minimizes overfitting and enhances predictive accuracy (Yang et al., 2024). The fundamental formula is as follows:
In Equation 4,
Support Vector Machine (SVM) is a widely used machine learning model for classification and regression tasks, with its primary concept being the identification of an optimal hyperplane to separate various categories of data (Zhang et al., 2023; Huang et al., 2022). Initially, all evaluation factors (
In Equation 5, the terms
The eXtreme Gradient Boosting (XGBoost) optimizes the loss function by employing the second derivative information, and determines the split node based on whether a reduction is achieved. The core formula is as follows (Guo et al., 2024):
In Equation 6, objective function is represented by
2.3.4 SBAS-InSAR technology
This study utilizes Sentinel-1A ascending and descending orbit data from January 2021 to December 2023 to calculate the 3-year average annual surface deformation rate in the study area using SBAS-InSAR technology. The SBAS-InSAR processing involves correcting track errors with precision track data and DEM, and performing phase unwrapping using the minimum-cost flow method (Yang S. et al., 2023). To generate sufficient interference pairs, the time baseline is set to 90–120 days and the spatial baseline is set at 120 m. A deformation rate threshold of 10 mm/a is applied, with values below this considered stable. If the deformation rate exceeds this threshold, optical remote sensing images are combined with visual interpretation to assess whether the area is affected by landslides.
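The baseline constraints described above can be illustrated with a short interferometric pair selection sketch. It is a simplification made under stated assumptions: the upper end of the 90–120 day range is taken as the maximum temporal baseline, the 120 m value is applied to the difference in perpendicular baseline, and the scene dates and baseline values are hypothetical. The study itself used the SARscape module, whose internal pair selection is not reproduced here.

```python
from datetime import date
from itertools import combinations

def select_pairs(scenes, max_days=120, max_perp_m=120.0):
    """Return acquisition-date pairs that satisfy both baseline thresholds.

    scenes: list of (acquisition_date, perpendicular_baseline_m) tuples.
    """
    pairs = []
    for (d1, b1), (d2, b2) in combinations(sorted(scenes), 2):
        if (d2 - d1).days <= max_days and abs(b2 - b1) <= max_perp_m:
            pairs.append((d1, d2))
    return pairs

# Hypothetical acquisitions: (date, perpendicular baseline in metres).
scenes = [
    (date(2021, 1, 5), 0.0),
    (date(2021, 1, 17), 45.0),
    (date(2021, 1, 29), -80.0),
    (date(2021, 6, 10), 10.0),
]
pairs = select_pairs(scenes)
```

With these toy inputs only the pairs starting on 5 January pass both checks; the 17 and 29 January scenes differ by 125 m, and every pair involving the June scene exceeds 120 days.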
2.3.5 Accuracy evaluation
To ensure the validity of the research method, the accuracy (ACC) and ROC curve were utilized to assess the model’s performance. Accuracy is the proportion of correctly predicted samples out of the total number of samples. The ACC value serves as a direct indicator of the model’s precision, with larger values reflecting higher accuracy (Wang J. et al., 2024). The ROC curve is frequently used to evaluate the classification effectiveness of a model; it plots the true positive rate (TPR) against the false positive rate (FPR), and the area under the curve (AUC) measures the model’s accuracy (Liu and Chen, 2024). A higher AUC value signifies improved model accuracy and stronger predictive performance. The fundamental concept of ACC is as follows:
$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
In Equations 7–9, $TP$ is the number of landslide samples correctly predicted as landslides, $TN$ is the number of non-landslide samples correctly predicted as non-landslides, $FP$ is the number of non-landslide samples incorrectly predicted as landslides, and $FN$ is the number of landslide samples incorrectly predicted as non-landslides.
Precision refers to the proportion of all samples predicted as landslides by the model that are correctly identified as landslide samples. The fundamental concept of Precision is as follows:
$$Precision = \frac{TP}{TP + FP} \tag{8}$$
Recall represents the proportion of correctly predicted landslide samples among all actual landslide samples, and its mathematical expression is as follows:
$$Recall = \frac{TP}{TP + FN} \tag{9}$$
In Equation 10, F1-Score represents the harmonic mean of precision and recall, which can quantitatively evaluate the accuracy and completeness of a model. Its mathematical expression is as follows:
$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{10}$$
Assuming the model shows good accuracy, the new landslide data identified by SBAS-InSAR technology in conjunction with optical remote sensing images are compared with the landslide susceptibility prediction results from different models. If all the new landslide data fall within high-risk areas, the effectiveness of the model for landslide susceptibility assessment in the upper reaches of the Yellow River is confirmed, thereby verifying the accuracy of the model’s predictions. The technical approach employed in this study is depicted in Figure 3. First, a correlation analysis was conducted on the initially selected 16 evaluation indicators using the Pearson correlation coefficient method. Strongly correlated factors were eliminated to establish a landslide susceptibility evaluation index system. Second, 167 historical landslide locations within the study area were selected as positive samples, while 167 non-landslide points, randomly chosen from areas outside the 2 km buffer zones surrounding historical landslides, were used as negative samples. To address spatial autocorrelation, a spatial block cross-validation approach was applied. Specifically, all samples were first divided into a regular 10 × 10 geographic grid based on their spatial coordinates. Subsequently, group-based cross-validation was conducted to ensure that all samples within the same spatial block were exclusively assigned to either the training set or the validation set, thereby maintaining spatial independence between these two datasets. Finally, 70% of the samples were used for training and 30% for validation. The IV and CF values of each influencing factor were calculated and integrated with three models (RF, XGBoost, and SVM) to generate six integrated models. Through model training and prediction, six landslide susceptibility maps were produced.
The performance of these models was evaluated using ROC curves, with accuracy and precision metrics, and susceptibility results were classified using the natural breaks method. Additionally, an overlay analysis was performed between the susceptibility zonation results and 227 newly identified landslides detected via InSAR and optical imagery to further validate the methodology’s reliability. Finally, the optimal model (CF-XGBoost) was applied to predict landslide susceptibility using training data derived from new landslide points, non-landslide points, and evaluation factors. For the training data, 227 new landslide points were defined as positive samples, with an equal number of non-landslide points randomly selected outside a 2 km buffer zone of these new landslides. Regarding the evaluation factors, NDVI, rainfall, and land use required resampling, while other factors were treated as static over time.
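The spatial block partitioning described above (a regular 10 × 10 grid, with every block assigned wholly to training or validation) can be sketched as follows. This is an illustrative sketch only: the bounding box and random points are hypothetical, the helper names `block_id` and `spatial_block_split` are not from the study, and because whole blocks are assigned, the 70/30 proportion is only approximate.

```python
import random

def block_id(x, y, bbox, n_side=10):
    """Row-major index of the grid cell that contains point (x, y)."""
    xmin, ymin, xmax, ymax = bbox
    col = min(int((x - xmin) / (xmax - xmin) * n_side), n_side - 1)
    row = min(int((y - ymin) / (ymax - ymin) * n_side), n_side - 1)
    return row * n_side + col

def spatial_block_split(points, bbox, train_frac=0.7, n_side=10, seed=0):
    """Assign whole blocks to training until about train_frac of the samples is reached."""
    blocks = {}
    for i, (x, y) in enumerate(points):
        blocks.setdefault(block_id(x, y, bbox, n_side), []).append(i)
    ids = sorted(blocks)
    random.Random(seed).shuffle(ids)
    target = train_frac * len(points)
    train, val = [], []
    for b in ids:
        # A block never straddles the two sets, which keeps them spatially separate.
        (train if len(train) < target else val).extend(blocks[b])
    return train, val

# Hypothetical sample coordinates (metres): 167 landslides plus 167 non-landslides.
bbox = (0.0, 0.0, 100000.0, 60000.0)
rng = random.Random(42)
points = [(rng.uniform(0, 100000), rng.uniform(0, 60000)) for _ in range(334)]
train_idx, val_idx = spatial_block_split(points, bbox)
```

Where scikit-learn is available, passing the block index as the `groups` argument of a group-aware splitter such as `GroupKFold` achieves the same separation across cross-validation folds.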
3 Results and analysis
3.1 SBAS-InSAR and new landslide identification results
Using Sentinel-1A data from January 2021 to December 2023 in the study area, this research employs SBAS-InSAR technology to calculate the average annual surface deformation rate over the past 3 years. The findings are presented in Figure 4. Specifically, Figure 4a depicts the deformation rate in the ascending orbit direction, while Figure 4b illustrates the deformation rate in the descending orbit direction. The entire SBAS-InSAR technical workflow was implemented using the SARscape module within ENVI 5.6 software.

Figure 4. Average annual surface deformation rate and spatial distribution of new landslides in the study area (2021–2023). (a) represents deformation in the ascending direction, (b) represents deformation in the descending direction, and (c) illustrates the spatial distribution characteristics of different landslides.
To achieve accurate landslide identification within the study area, deformation rates were overlaid onto Landsat imagery and Google Earth basemaps. A deformation threshold of 10 mm/a was established based on ascending and descending orbital datasets, with areas exhibiting deformation rates below this threshold classified as relatively stable regions. Preliminary landslide boundaries were delineated by comparing regional deformation rates against the established threshold. Subsequently, visual interpretation methods were systematically applied to verify each preliminary landslide polygon. The final landslide inventory presented in Figure 4c identifies 227 new landslides throughout the study area. Among these, 171 landslides were detected in ascending orbit data (indicated by white points in Figure 4c), 154 landslides were identified in descending orbit data (blue points), and 98 landslides showed detection consistency in both orbital directions (red points). Historical landslides are represented by black points in Figure 4c. Spatial distribution analysis reveals approximately 40 new landslides occupy pre-existing landslide footprints, while the majority constitute newly developed slope failures. Notably, both newly identified and historical landslides demonstrate clustered distributions concentrated within the Longyang Gorge, Lijia Gorge, Jianzha County, and Hualong County sectors.
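As a consistency check on these counts, and assuming the 98 landslides detected in both orbital directions are exactly the overlap between the ascending set $A$ and the descending set $D$, inclusion-exclusion reproduces the inventory total:

$$|A \cup D| = |A| + |D| - |A \cap D| = 171 + 154 - 98 = 227$$

Under the same assumption, 73 landslides were detected only in ascending data and 56 only in descending data.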
3.2 Screening primary factor
Before training the coupling model, a correlation analysis was performed on the primary evaluation factors to prevent data redundancy due to high correlations, which could impact the model’s precision and the accuracy of landslide predictions. Pearson correlation coefficients were calculated to evaluate the relationships between the factors, as shown in Figure 5. The results reveal that the absolute correlation coefficient between surface roughness and slope exceeds 0.5, indicating a strong correlation. Excluding surface roughness led to an improvement of about 0.02 in the ROC value for each coupling model. As a result, surface roughness was excluded from the subsequent landslide susceptibility modeling.
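The screening step can be sketched as a pairwise Pearson check against the 0.5 threshold mentioned above. The factor values below are hypothetical and serve only to show the mechanics; the real analysis would use the factor values sampled at the landslide and non-landslide points.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlated_pairs(factors, threshold=0.5):
    """List factor pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for a, b in combinations(factors, 2):
        r = pearson(factors[a], factors[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged

# Hypothetical factor values at six sample points.
factors = {
    "slope": [5, 12, 20, 28, 35, 41],
    "surface_roughness": [1.00, 1.02, 1.06, 1.13, 1.22, 1.33],
    "aspect": [90, 270, 45, 310, 180, 10],
}
flagged = correlated_pairs(factors)
```

For each flagged pair, one of the two factors would then be dropped, as was done here for surface roughness.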
3.3 IV and CF values of the second-level partition of each evaluation factor
Prior to calculating the IV and CF values for the secondary sub-regions of each factor, continuous factors should be classified using the natural breaks method, while discrete factors should be categorized according to their actual states. Following this, the respective areas and landslide counts within each classification interval of the factors are tallied. Subsequently, the IV and CF values for each classified factor are respectively computed based on Equation 2 and Equation 3.
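The tallying and calculation steps can be sketched as below, assuming the commonly used forms of the two methods: IV as the logarithm of the class landslide density over the study-area density, and CF as the piecewise ratio of conditional to prior probability. The slope classes, landslide counts, and cell counts are hypothetical (chosen only so that the landslide counts sum to 167), and areas are expressed in grid cells so that densities stay below 1.

```python
import math

def information_value(n_i, s_i, n_total, s_total):
    """IV of a class: ln of class landslide density over study-area density (needs n_i > 0)."""
    return math.log((n_i / s_i) / (n_total / s_total))

def certainty_factor(n_i, s_i, n_total, s_total):
    """CF of a class in [-1, 1]; positive values favour landslide occurrence."""
    pp_a = n_i / s_i          # conditional probability within the class
    pp_s = n_total / s_total  # prior probability over the study area
    if pp_a >= pp_s:
        return (pp_a - pp_s) / (pp_a * (1 - pp_s))
    return (pp_a - pp_s) / (pp_s * (1 - pp_a))

# Hypothetical tallies per slope class: (landslide count, number of 30 m cells).
slope_classes = {
    "0-10 deg": (12, 400000),
    "10-20 deg": (45, 500000),
    "20-30 deg": (70, 350000),
    ">30 deg": (40, 450000),
}
n_total = sum(n for n, _ in slope_classes.values())
s_total = sum(s for _, s in slope_classes.values())
iv = {k: information_value(n, s, n_total, s_total) for k, (n, s) in slope_classes.items()}
cf = {k: certainty_factor(n, s, n_total, s_total) for k, (n, s) in slope_classes.items()}
```

Both measures change sign at the same point (class density equal to the overall density), so they rank classes similarly; they differ in scale, with CF bounded and IV unbounded.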
The IV and CF values for secondary zones across different evaluation factors reflect their contribution to landslide occurrence, with higher values signifying a stronger influence. According to the calculation results, the highest IV and CF values occur under the following conditions: elevation from 2590 to 2754 m, slope between 20° and 30°, north-facing slopes (337.5°–360°), plan curvature from −1 to 0, profile curvature between −5.8 and −3.8, topographic wetness index below 4.9, topographic relief from 71 to 109 m, surface cutting degree between 40 and 60 m, NDVI from 0.31 to 0.39, distance to rivers between 400 and 800 m, distance to roads below 400 m, distance to faults between 800 and 1200 m, lithology from Paleogene to recent, rainfall exceeding 651 mm/year, and cultivated land as the land use type, as detailed in Supplementary Material.
3.4 Model accuracy evaluation
Hyperparameter optimization is a critical step for enhancing the overall performance of machine learning models. It not only strengthens model robustness and generalization capabilities, effectively mitigating overfitting and improving training stability, but also significantly reduces computational resource consumption.
This study employed a strategy combining random search with fivefold cross-validation to identify the optimal hyperparameter configuration. Specifically, 2000 sets of hyperparameters were randomly sampled. For each set, a fivefold cross-validation procedure was performed: the training set was uniformly partitioned into five mutually exclusive subsets. Sequentially, models were trained on four subsets while the remaining subset served as the validation set for performance evaluation, rotating the validation set across each fold. After evaluating all parameter combinations, the system selected the hyperparameter set achieving the highest average AUC score across the five cross-validation rounds as the final configuration.
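The search loop described above can be sketched generically. This is not the study's implementation: the search space in `sample_params` is hypothetical, and `toy_score` is a stand-in for "fit the model on the training folds and return the AUC on the held-out fold", used here so the sketch runs without any machine learning library.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k mutually exclusive folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def random_search(sample_params, score_fold, n_samples, n_iter=2000, k=5, seed=0):
    """Return the sampled parameter set with the highest mean score over k folds."""
    rng = random.Random(seed)
    folds = k_fold_indices(n_samples, k, seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = sample_params(rng)
        scores = []
        for f in range(k):
            # Train on the other k-1 folds, validate on fold f.
            train = [i for g in range(k) if g != f for i in folds[g]]
            scores.append(score_fold(params, train, folds[f]))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

def sample_params(rng):
    # Hypothetical search space, for illustration only.
    return {"max_depth": rng.randint(2, 10), "learning_rate": rng.uniform(0.01, 0.3)}

def toy_score(params, train_idx, val_idx):
    # Stand-in for a real fit-and-evaluate step; not a real model or a real AUC.
    return 1.0 - 0.01 * abs(params["max_depth"] - 6) - abs(params["learning_rate"] - 0.1)

best_params, best_score = random_search(sample_params, toy_score, n_samples=234)
```

In a scikit-learn workflow the same pattern is typically expressed with `RandomizedSearchCV` using `n_iter=2000`, `cv=5`, and `scoring="roc_auc"`.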
The optimal hyperparameters of each model are shown in Table 2. Hyperparameter optimization and model training were implemented in Python 3.9 using the scikit-learn (sklearn) package.
The IV and CF values, derived from the information value method and the certainty factor method, were used to train three machine learning models (RF, XGBoost, and SVM) to generate landslide susceptibility evaluation results for six integrated models. The ROC curves and ACC values for each integrated model are presented in Figure 6. Across the precision indices, the machine learning models coupled with the certainty factor method generally outperformed those coupled with the information value method. The AUC values for all integrated models surpassed 0.84, indicating strong fitting accuracy and predictive capability. Among them, the CF-XGBoost model achieved the highest accuracy with an AUC value of 0.916. The similar performance of the integrated models may be attributed to the resemblance between randomly generated non-landslide points and the environmental conditions of landslide points, minimizing the impact of subjective influences.
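For readers less familiar with the AUC, it equals the probability that a randomly chosen landslide sample receives a higher predicted score than a randomly chosen non-landslide sample. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for three landslide and three non-landslide samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

Because the AUC depends only on the ranking of scores, it is unaffected by the choice of decision threshold, unlike ACC, Precision, and Recall.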
Table 3 shows the comprehensive precision metrics for six ensemble models. The CF-XGBoost demonstrates relatively superior overall accuracy. However, all models exhibit substantially lower recall rates compared to other precision metrics. This limitation arises from two primary factors: 1) the insufficient representation of positive-class instances (n = 167) in the training dataset, which hinders effective feature learning; and 2) the application of a fixed 0.5 decision threshold, which imposes stringent criteria for identifying positive-class outcomes. This is also the main reason why 9% of the new landslides occurred outside high-risk areas, as shown in Table 4. As a result, the models tend to minimize false positives while increasing the likelihood of false negatives in landslide detection. Nevertheless, the consistently high precision values indicate strong reliability in distinguishing true positive cases.
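The effect of the fixed 0.5 decision threshold on Recall can be illustrated with a small sketch. The labels and scores are hypothetical and are not taken from Table 3; the point is only that lowering the threshold converts false negatives into true positives at the cost of more false positives.

```python
def confusion_metrics(labels, scores, threshold=0.5):
    """ACC, Precision, Recall and F1 for scores binarized at the given threshold."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"acc": (tp + tn) / len(labels), "precision": precision, "recall": recall, "f1": f1}

# Hypothetical predictions: four landslide samples followed by four non-landslide samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.71, 0.48, 0.35, 0.55, 0.40, 0.32, 0.10]
strict = confusion_metrics(labels, scores, threshold=0.5)
relaxed = confusion_metrics(labels, scores, threshold=0.3)
```

In this toy case Recall rises from 0.5 to 1.0 when the threshold drops from 0.5 to 0.3, while Precision falls from 2/3 to 4/7.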
In addition to using the above five precision metrics, SBAS-InSAR technology combined with optical remote sensing images was also employed to compare and analyze the spatial distribution of landslides identified by the integrated models and the predicted landslide susceptibility. This allowed for further validation of the models’ predictive accuracy. When comparing the landslide susceptibility results obtained from statistical analysis with the spatial distribution of newly detected landslides, as shown in Table 4, it is evident that most of the new landslides are concentrated in moderate, high, and very high-risk areas, with only a small fraction located in low-risk regions. The CF-XGBoost model predicted that 91% of the new landslides occurred in high-risk and very high-risk areas, with no new landslides found in low-risk areas, further confirming its superior performance in landslide prediction.
3.5 Landslide susceptibility results
In this study, the natural breaks method was adopted to classify the landslide susceptibility indices predicted by six integrated models into five classes: very low, low, moderate, high, and very high susceptibility. The landslide susceptibility thresholds corresponding to each class are shown in Table 5. The results demonstrate that the threshold structure defined by the natural breaks method enables the CF-XGBoost model to exhibit exceptional capability in precisely isolating very low-risk zones and effectively distinguishing very high-risk zones, thereby confirming its superior predictive performance. Compared to other ensemble models, this threshold structure significantly enhances the characterization accuracy of the spatial gradient of landslide probability. This advancement holds substantial practical significance for geohazard risk management planning.
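The natural breaks idea (choose class boundaries that minimize the within-class sum of squared deviations) can be sketched with a small dynamic program. This is an illustrative implementation under stated assumptions, not the GIS routine used in the study: it is quadratic in the number of values and therefore suited only to small samples, the function names are hypothetical, and the toy example uses three classes rather than five.

```python
from bisect import bisect_left

def jenks_upper_bounds(values, k):
    """Upper bound of each of k classes minimizing within-class squared deviation."""
    xs = sorted(values)
    n = len(xs)
    ps, ps2 = [0.0], [0.0]  # prefix sums of x and x^2
    for x in xs:
        ps.append(ps[-1] + x)
        ps2.append(ps2[-1] + x * x)

    def sse(i, j):
        # Sum of squared deviations from the mean for xs[i:j].
        s = ps[j] - ps[i]
        return (ps2[j] - ps2[i]) - s * s / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                v = cost[c - 1][i] + sse(i, j)
                if v < cost[c][j]:
                    cost[c][j], cut[c][j] = v, i
    bounds, j = [], n
    for c in range(k, 0, -1):  # backtrack from the last class to the first
        bounds.append(xs[j - 1])
        j = cut[c][j]
    return sorted(bounds)

def classify(value, bounds):
    """Class index (0 = lowest) of a value given the class upper bounds."""
    return min(bisect_left(bounds, value), len(bounds) - 1)

bounds = jenks_upper_bounds([1, 2, 3, 10, 11, 12, 20, 21, 22], k=3)
```

For the susceptibility maps, `k=5` would correspond to the very low, low, moderate, high, and very high classes.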
The landslide susceptibility results of each integrated model are illustrated in Figure 7. The findings show that areas to the north of Lijiaxia, Jianzha County, and Hualong County exhibit higher risk levels, corresponding with zones of landslide concentration. This is due to factors such as steep terrain, relatively high rainfall, a dense population along the Yellow River, frequent human engineering activities, and weak stratigraphic lithology, which collectively increase the likelihood of landslides. The susceptibility maps nevertheless differ among the six models, and these differences mainly stem from variations in statistical methods and model characteristics. In terms of statistical methods, different approaches may result in significant discrepancies during feature extraction in the training phase, thereby affecting the composition of feature subsets used in model development. With regard to the machine learning models themselves, tree-based ensemble models such as RF and XGBoost are capable of effectively capturing and utilizing nonlinear relationships among features. In contrast, SVMs with linear kernel functions depend heavily on the linear separability of input features. Moreover, key hyperparameters in tree-based models, such as the number of trees, maximum depth, and learning rate, have a direct impact on model performance. Similarly, the regularization parameter plays a critical role in determining the performance of SVMs with linear kernels. For instance, the CF-RF, CF-SVM, IV-RF, and IV-SVM models identify fewer low-risk regions but more areas in the medium to high-risk categories. On the other hand, the IV-XGBoost model identifies more low-risk areas but provides lower prediction accuracy for landslides. The CF-XGBoost model successfully predicts high-risk areas based on historical landslide data, with a strong alignment to actual landslide distributions.

Figure 7. Results of historical landslide susceptibility mapping. (a–f) respectively represent the model results of CF-XGBoost, CF-RF, CF-SVM, IV-XGBoost, IV-RF, and IV-SVM.
4 Discussion
4.1 Prediction of landslide susceptibility in the upper reaches of the Yellow River (from Gonghe to Xunhua section)
Using historical landslide data as the training sample set for the integrated models, it was found that the CF-XGBoost model demonstrated high accuracy and effective prediction performance. Consequently, this model was applied to predict landslide susceptibility in the upper reaches of the Yellow River (from Gonghe to Xunhua). When processing the evaluation factors, only the normalized difference vegetation index and rainfall data were updated to the 2021 to 2023 period, while the other factors were kept unchanged. To ensure the authenticity of the existing landslide data and the accuracy of the positive sample data, new landslide data identified by SBAS-InSAR technology and optical remote sensing images were selected as the training sample set.
Following model retraining with an increased number of positive samples (from 167 to 227), the CF-XGBoost model demonstrated a notable improvement in Recall, which increased from 0.818 to 0.911, as shown in Table 6. This indicates that after sufficiently learning the characteristics of the positive class, the model successfully captured a greater number of actual landslides. However, Precision experienced a slight decrease of 0.028. This decline is attributable to an inevitable increase in false positives (non-landslides incorrectly classified as landslides) as the model reduced the number of missed detections. Overall, the model exhibited improvements in its discriminative ability (AUC), overall accuracy (ACC), landslide detection performance (Recall), and comprehensive performance (F1-Score). These enhancements collectively indicate strengthened model generalization capability and stability.
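The trade-off between Recall and Precision described above can be illustrated with a minimal sketch that derives the four metrics from confusion-matrix counts. The function name `classification_metrics` and all counts below are hypothetical and chosen only to reproduce the qualitative pattern (Recall rises, Precision falls slightly, F1-Score and ACC rise); they are not the values behind Table 6.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical counts (NOT the study's data): after retraining, fewer
# landslides are missed (fn drops) at the cost of a few more false alarms.
before = classification_metrics(tp=40, fp=5, fn=10, tn=45)
after = classification_metrics(tp=46, fp=8, fn=4, tn=42)

for name in ("precision", "recall", "f1", "accuracy"):
    print(f"{name:9s} before={before[name]:.3f} after={after[name]:.3f}")
```

In this toy case the model recovers six additional landslides while producing three additional false positives, so Recall increases more than Precision decreases and the F1-Score improves.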
Figure 8a illustrates the landslide susceptibility zoning results obtained using the new landslide data and trained with the CF-XGBoost model. Compared to the historical landslide susceptibility evaluations, Jianzha County and Hualong County remain in high-risk zones, but there is an increase in low-risk areas, with the distribution of high-risk zones becoming clearer, particularly concentrated in Jianzha, Hualong, and Xunhua Counties. This suggests that the use of older historical landslide data could lead to inaccuracies in identifying high-risk areas. Field surveys in a high-risk area of Jianzha County, shown in Figures 8b–f, further validated the model’s ability to accurately identify the spatial distribution and landslide susceptibility of new landslides. The red area in Figure 8b highlights the sliding boundary, while Figure 8c shows subsidence on the slope surface, with the red area marking the active landslide front that has subsided by about 1 m. Figure 8d offers a closer look at Figure 8c, and Figures 8e,f display tensile fractures caused by the active landslide.

Figure 8. New landslide susceptibility results and detailed map of field investigation. (a) displays the prediction results of landslide susceptibility in the study area. (b) shows landslides verified through field investigation. (c) illustrates the front scarp of an active landslide, and (d) demonstrates a detailed view of settlement features in (c). (e,f) represent tensile fractures caused by the active landslide.
The landslide susceptibility prediction results indicate that urban development and infrastructure planning should prioritize avoiding areas of high and very high susceptibility, directing siting efforts toward zones of low and very low susceptibility. Furthermore, comprehensive factor importance analysis reveals that disaster prevention measures require enhanced implementation in high and very high susceptibility regions during concentrated rainfall seasons, particularly in areas exhibiting dense concentrations of high susceptibility zones such as Jianzha, Hualong, and Xunhua counties.
4.2 Feature importance analysis
Machine learning models not only offer strong predictive performance but also quantitatively assess the importance of each evaluation factor, providing insights into the contribution of various factors to landslide occurrence and facilitating the development of targeted preventive strategies. In this study, Weight is employed as the metric for calculating feature importance within the CF-XGBoost model. This metric quantifies feature importance by tallying the total number of times each feature is utilized as a split node across all trees in the ensemble. The weights of each factor calculated by the CF-XGBoost model are shown in Figure 9a. It can be observed that all factors play a significant role in landslide hazard development. According to the magnitude of factor weights, the three factors with greater impact on landslide hazards are rainfall, slope aspect, and stratigraphic lithology.

Figure 9. Importance of evaluation factors. (a) represents the CF-XGBoost model, and (b) represents the CF-RF model.
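As an illustration of the Weight metric, the following minimal sketch tallies how often each feature appears as a split node across a small ensemble. The nested-dictionary tree representation, the function names, and the toy trees are assumptions made for illustration and do not reproduce the trained CF-XGBoost model. In the XGBoost Python package, the same kind of count is what `Booster.get_score(importance_type="weight")` reports.

```python
from collections import Counter


def count_splits(node, counts):
    """Recursively tally the feature used at every internal (split) node."""
    if "feature" not in node:  # leaf node: nothing to count
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)


def weight_importance(trees):
    """'Weight' importance: total number of splits per feature over all trees."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return dict(counts)


# Hypothetical two-tree ensemble; structure and feature names are illustrative.
leaf = {"value": 0.0}
toy_trees = [
    {"feature": "rainfall",
     "left": {"feature": "slope_aspect", "left": leaf, "right": leaf},
     "right": {"feature": "lithology", "left": leaf, "right": leaf}},
    {"feature": "rainfall",
     "left": leaf,
     "right": {"feature": "slope_aspect", "left": leaf, "right": leaf}},
]
importance = weight_importance(toy_trees)
print(importance)
```

Because Weight only counts splits, it does not reflect how much each split improves the model, which is one reason rankings from different importance measures or different models need not agree exactly.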
To further identify the principal controlling factors influencing landslides, this study computed the feature importance ranking using the CF-RF model, as illustrated in Figure 9b. Consistent with the results from the CF-XGBoost model, rainfall, slope aspect, and lithology remain the most significant factors affecting landslide occurrence. However, a discrepancy exists between the two models regarding the relative importance ranking of slope aspect and lithology, which is likely attributable to differences in their hyperparameter configurations. Overall, the CF-XGBoost and CF-RF models exhibit a high degree of consistency in the ranking of all factor weights, with both confirming the predominant role of rainfall among the influencing factors.
Analysis combined with the Supplementary Material shows that rainfall promotes landslide occurrence when annual rainfall exceeds 636 mm, and that north- and south-facing slope aspects have a significant influence on landslides. Furthermore, based on geotechnical mechanical properties, this study divides the research area into four categories: hard lithology, moderately hard lithology, moderately weak lithology, and weak lithology, as shown in Figure 4c. Most new and old landslides occur in moderately weak lithology areas. This is because rainwater preferentially infiltrates south-facing slopes, further softening the already fragile lithology. After rainwater infiltrates north-facing slopes, low water evaporation leads to long-term high soil moisture content, continuously reducing the shear strength of geomaterials and increasing the sliding force. Therefore, the interaction among rainfall, slope aspect, and stratigraphic lithology significantly increases landslide hazard risk.
Secondary factors influencing landslides are elevation, topographic relief, and vegetation coverage. This is particularly significant within the ranges of elevation 2590–2754 m, topographic relief 71–109 m, and vegetation coverage (NDVI) 0.31–0.39. This occurs because gravitational potential energy increases with elevation difference, and root stabilization effectiveness weakens in areas with low vegetation coverage, leading to tension crack formation under the self-weight of geomaterials. Additionally, under rainfall infiltration, crack generation in slopes is accelerated by these combined influences. The influence of other factors on landslides is relatively small, demonstrating that landslides in the Upper Yellow River region are mainly controlled by the area’s unique geographical conditions. Human activity factors such as distance to roads and land use exhibit no significant effects on landslide movement.
4.3 Comparison with existing studies
4.3.1 Comparison of the spatial distribution of newly detected landslides with existing studies
Due to the extensive spatial coverage of the study area, field verification of all identified landslides across the entire region was impractical. Consequently, a comparative analysis with existing monitoring results was performed to validate the accuracy of the landslide detection outcomes presented in this study. The detected landslides are predominantly distributed in southeastern Longyang Gorge, north of Lijia Gorge, Jianzha County, and Hualong County. The comparative analysis revealed a high degree of consistency between the landslide detection results obtained in this study and those reported in previous research. For instance, Du et al. employed Stacking-InSAR integrated with optical remote sensing imagery to identify landslide distribution within the upper reaches of the Yellow River (Du et al., 2023). Similarly, Zhao et al. utilized SBAS-InSAR combined with optical remote sensing imagery to determine the precise geographical locations of landslides in this region (Zhao et al., 2022). Notably, the landslides identified by both research groups were also primarily located in southeastern Longyang Gorge, north of Lijia Gorge, Jianzha County, and Hualong County.
4.3.2 Comparison between model predictive results and existing studies
Existing research on landslide susceptibility assessment in the upper Yellow River remains limited. The most comparable study is that of Li et al. (2016), who employed the Analytic Hierarchy Process (AHP) to evaluate susceptibility in the Longyang Gorge to Gongboxia Gorge segment. Their results identified high susceptibility zones across the northwestern and southwestern sectors of Longyang Gorge. While the present study similarly detected localized high-susceptibility areas in these sectors, their spatial extent is significantly reduced relative to Li et al.'s findings, with low-susceptibility domains predominating. This discrepancy likely stems from methodological differences, divergent evaluation criteria, and temporal environmental variations.
4.3.3 Comparative analysis of the model’s landslide prediction performance with existing studies
The integrated model (CF-XGBoost) employed in this study demonstrated superior performance in landslide prediction. Predictive results on new landslide data revealed that 91% of the landslides were located within high-risk and very high-risk zones. Comparative analysis with existing research indicates that this performance remains highly competitive. For example, Zhu et al. (2024) evaluated how different nonlandslide sample selection methods, specifically whole area random selection method, Buffer method, Frequency Ratio method, and Analytic Hierarchy Process (AHP), affected RF and XGBoost model performance in Huize County. Their optimal model (XGBoost-AHP) correctly predicted 85.03% of landslides. Yu et al. (2025) proposed a novel framework based on Dynamic Ensemble Selection (DES) to capture the spatial development patterns of different landslide types, conducting experiments in Wanzhou District, Chongqing, China. The DES model achieved an accuracy of 80.84% in classifying landslides into high-risk and very high-risk zones. Zhou et al. (2024) conducted a landslide susceptibility assessment for the Zigui-Badong section of China’s Three Gorges Reservoir area using a coupled approach integrating ensemble learning and machine learning. Their best-performing integrated model (LR-MLP-Boosting) correctly identified 82.34% of landslide pixels as situated within high-risk and very high-risk zones.
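The percentages compared above all express the share of landslides that fall within the high-risk and very high-risk zones. A minimal sketch of that calculation follows, assuming each landslide has already been assigned the susceptibility class of the zone it lies in. The function name `hit_rate`, the class labels, and the sample are hypothetical and are not the study's inventory.

```python
def hit_rate(landslide_classes, target=("high", "very high")):
    """Share of landslides whose susceptibility class is in `target`."""
    if not landslide_classes:
        raise ValueError("no landslides supplied")
    hits = sum(1 for c in landslide_classes if c in target)
    return hits / len(landslide_classes)


# Hypothetical class labels sampled at ten landslide locations.
sample = ["very high"] * 6 + ["high"] * 3 + ["moderate"] * 1
rate = hit_rate(sample)
print(f"{rate:.0%} of landslides lie in high or very high susceptibility zones")
```

Note that this share depends on how the susceptibility index is partitioned into classes and on whether landslides are counted as points or as pixels, so values reported by different studies are only approximately comparable.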
In summary, the upper reaches of the Yellow River constitute a landslide-prone region in China, yet research on landslide susceptibility assessment in this area remains limited. Consequently, this study’s approach integrating statistical methods, machine learning models, and SBAS-InSAR technology for landslide susceptibility evaluation in the upper Yellow River holds significant scientific merit. Furthermore, comparative analysis with existing studies reveals that the CF-XGBoost model employed in this work demonstrates superior landslide predictive performance.
4.4 Limitations and prospects
A primary limitation of this study stems from the temporal mismatch between the modeling and validation datasets: the historical landslide inventory covers the period 1998–2012, while the InSAR deformation observations used for model validation span 2021–2023. Environmental changes that may have occurred during this interval, such as land use transitions (e.g., urbanization, deforestation) and alterations in vegetation cover (e.g., degradation or succession), could reduce the model’s applicability to current conditions by modifying key landslide-controlling mechanisms. These mechanisms include root reinforcement, rainfall-infiltration-runoff interactions, and pore-water pressure dynamics. Such environmental variability may introduce systematic biases into model predictions, potentially leading to underestimation of current instability risks in areas with significant vegetation loss or, conversely, overestimation of risk in fundamentally altered environments. Therefore, although the model primarily reflects landslide occurrence patterns under historical environmental conditions, direct application of its predictions to interpret InSAR observations from 2021 to 2023 should be approached cautiously and integrated with concurrent assessments of environmental change. Future research should incorporate time-series remote sensing data (e.g., on land use and vegetation cover dynamics) to update model parameters and develop dynamic risk assessment frameworks compatible with near-real-time InSAR monitoring.
The selection of evaluation factors strongly influences the accuracy of machine learning models. This study initially considered 16 potential landslide-influencing factors. Screening with Pearson’s correlation coefficient led to the exclusion of surface roughness, leaving 15 causative factors for model training. However, data availability limitations precluded the incorporation of certain factors. For instance, earthquakes, as natural and uncontrollable phenomena, frequently trigger numerous landslides. Future studies should therefore prioritize earthquake-related factors to enhance the comprehensiveness and robustness of the analysis.
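The correlation-based screening step can be sketched as follows. This is a minimal illustration under stated assumptions: the threshold of 0.8, the greedy keep-first rule, the function names, and the toy factor values are all hypothetical, and pairing surface roughness with slope is chosen only for illustration. The actual cut-off and the factor with which surface roughness was correlated are not stated here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_factors(factors, threshold):
    """Keep a factor only if |r| with every already-kept factor is below threshold."""
    kept = []
    for name, values in factors.items():
        if all(abs(pearson(values, factors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical factor values sampled at five locations (illustrative only).
toy_factors = {
    "slope": [5.0, 12.0, 20.0, 31.0, 44.0],
    "elevation": [2100.0, 2900.0, 2300.0, 2700.0, 2500.0],
    "surface_roughness": [1.01, 1.03, 1.07, 1.15, 1.30],
}
kept = screen_factors(toy_factors, threshold=0.8)  # 0.8 is an assumed cut-off
print(kept)
```

In this toy case the third factor is dropped because it is almost perfectly correlated with the first, which mirrors the purpose of the screening: removing redundant inputs before model training.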
Additionally, the natural break-point method partitioned the susceptibility index predicted by the integrated model, maximizing inter-group differences while minimizing intra-group variation. Future research should explore alternative partitioning methods to achieve more realistic zoning.
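To make the partitioning criterion concrete, the sketch below searches exhaustively for the class boundaries that minimize the total within-class sum of squared deviations. For a fixed dataset this is equivalent to maximizing the between-class differences. It is an illustrative brute-force version that is only practical for small samples. The function names and the toy index values are assumptions, and GIS software generally relies on a more efficient optimization to obtain natural breaks.

```python
from itertools import combinations


def within_class_ss(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks(values, n_classes):
    """Exhaustive natural-breaks search over all contiguous partitions."""
    data = sorted(values)
    best_cost, best_groups = float("inf"), None
    # Choose n_classes - 1 cut positions between consecutive sorted values.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0,) + cuts + (len(data),)
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(within_class_ss(g) for g in groups)
        if cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical susceptibility index values (illustrative only).
toy_index = [0.05, 0.08, 0.10, 0.42, 0.45, 0.50, 0.88, 0.90, 0.95]
groups = natural_breaks(toy_index, n_classes=3)
print(groups)
```

Because the breaks follow the distribution of the index values themselves, the resulting class limits change whenever the model or the study area changes, which is one motivation for exploring alternative partitioning methods.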
5 Conclusion
This study aimed to enhance landslide hazard prediction in the upper reaches of the Yellow River (Gonghe to Xunhua section) by obtaining high-precision and accurate landslide susceptibility evaluation results. Historical landslide data were used to train six integrated models (IV-RF, IV-SVM, IV-XGBoost, CF-RF, CF-SVM, CF-XGBoost), and each model’s accuracy was assessed using ROC curves and ACC values. New landslide data, identified through SBAS-InSAR technology and optical remote sensing images, were then overlaid with the susceptibility results from each model. The model that performed best, CF-XGBoost, was analyzed further. The results from this model, based on the new landslide data, were used as a key factor in predicting landslide occurrences in the study area. The key conclusions are as follows:
1 The CF-XGBoost model provided the highest accuracy among the six integrated models, with an AUC value of 0.916. Overlaying the model’s predictions with new landslide data showed a high degree of accuracy, with 91% of new landslides identified in high-risk and very high-risk areas, and no landslides detected in low-risk areas.
2 Spatial differences were observed between the susceptibility results based on historical data and those using new landslide data. The models using new landslide data more accurately reflected actual conditions, whereas those based on historical data tended to misidentify high-risk areas due to long-term landslide control, which led to inaccurate positive sample data for training.
3 The landslide susceptibility evaluation indicated that the highest-risk areas are concentrated in Jianzha County, Hualong County, and Xunhua County. Based on the factor weights, natural geographical conditions are the primary drivers of landslide occurrence, with rainfall being the most significant external factor. As such, landslide prevention efforts should be intensified in these counties during the rainy season.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Author contributions
JZ: Writing – original draft, Methodology, Visualization, Investigation, Writing – review and editing. WT: Formal Analysis, Investigation, Supervision, Writing – review and editing. XW: Investigation, Writing – review and editing. XZ: Investigation, Writing – review and editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was funded by Qinghai Institute of Technology “Kunlun Talent” Talent Introduction Research Project (2023-QLGKLYCZX-25).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feart.2025.1652646/full#supplementary-material
References
Akinci, H. (2022). Assessment of rainfall-induced landslide susceptibility in Artvin, Turkey using machine learning techniques. J. Afr. Earth Sci. 191, 104535. doi:10.1016/j.jafrearsci.2022.104535
Bhandary, N. P., Dahal, R. K., Timilsina, M., and Yatabe, R. (2013). Rainfall event-based landslide susceptibility zonation mapping. Nat. Hazards 69, 365–388. doi:10.1007/s11069-013-0715-x
Ding, D., Wu, Y., Wu, T., and Gong, C. (2025). Landslide susceptibility assessment in tongguan District Anhui China using information value and certainty factor models. Sci. Rep. 15, 12275. doi:10.1038/s41598-025-93704-z
Dong, G., Zhang, F., Liu, F., Zhang, D., Zhou, A., Yang, Y., et al. (2018). Multiple evidences indicate no relationship between prehistoric disasters in Lajia site and outburst flood in upper Yellow River valley, China. Sci. China Earth Sci. 61, 441–449. doi:10.1007/s11430-017-9079-3
Dou, H., Huang, S., Jian, W., and Wang, H. (2023). Landslide susceptibility mapping of mountain roads based on machine learning combined model. J. Mt. Sci. 20, 1232–1248. doi:10.1007/s11629-022-7657-2
Du, J., Li, Z., Song, C., Zhu, W., Ji, Y., Zhang, C., et al. (2023). InSAR-Based active landslide detection and characterization along the upper reaches of the Yellow River. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 16, 3819–3830. doi:10.1109/JSTARS.2023.3263003
Fei, X., Tian, Y., Zhao, C., Liu, H., and Chen, H. (2023). Identification and deformation monitoring of unstable slopes in Longyangxia Reservoir area, the upper reach of Yellow River, China based on multi-temporal InSAR technology. J. Earth Sci. Environ. 45 (03), 578–589. doi:10.19814/j.jese.2022.11042
Gu, T., Duan, P., Wang, M., Li, J., and Zhang, Y. (2024). Effects of non-landslide sampling strategies on machine learning models in landslide susceptibility mapping. Sci. Rep. 14, 7201. doi:10.1038/s41598-024-57964-5
Guo, F., Wu, D., Ge, M., Dong, J., Fang, H., and Tian, D. (2024). The influence of continuous variable factor classification and machine learning model on the accuracy of landslide susceptibility evaluation. Inf. Sci. Wuhan. Univ. doi:10.13203/j.whugis20230413
He, Y., Wang, W., Zhang, L., Chen, Y., Chen, Y., Chen, B., et al. (2023). An identification method of potential landslide zones using InSAR data and landslide susceptibility. Geomat. Nat. Hazards Risk 14 (1), 2185120. doi:10.1080/19475705.2023.2185120
Hong, H., Wang, D., Zhu, A., and Wang, Y. (2024). Landslide susceptibility mapping based on the reliability of landslide and non-landslide sample. Expert Syst. Appl. 243, 122933. doi:10.1016/j.eswa.2023.122933
Huang, F., Cao, Z., Guo, J., Jiang, S., Li, S., and Guo, Z. (2020). Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. CATENA 191, 104580. doi:10.1016/j.catena.2020.104580
Huang, W., Ding, M., Li, Z., Zhuang, J., Yang, J., Li, X., et al. (2022). An efficient user-friendly integration tool for landslide susceptibility Mapping Based on Support Vector Machines: SVM-LSM toolbox. Remote Sens. 14, 3408. doi:10.3390/rs14143408
Huang, F., Xiong, H., Yao, C., Catani, F., Zhou, C., and Huang, J. (2023). Uncertainties of landslide susceptibility prediction considering different landslide types. J. Rock Mech. Geotech. 15, 2954–2972. doi:10.1016/j.jrmge.2023.03.001
Jia, H., Wang, Y., Ge, D., Deng, Y., and Wang, R. (2022). InSAR Study of landslides: early detection, Three-Dimensional, and long-term surface displacement Estimation—A case of Xiaojiang River Basin, China. Remote Sens. 14, 1759. doi:10.3390/rs14071759
Jiang, Z., Zhao, C., Yan, M., Wang, B., and Liu, X. (2022). The early identification and spatio-temporal characteristics of loess landslides with SENTINEL-1A datasets: a case of Dingbian County, China. Remote Sens. 14, 6009. doi:10.3390/rs14236009
Li, Y., Zhu, H., and Chen, S. (2016). Landslide hazard assessment in the upper reaches of Yellow River based on AHP Method. Sci. Surv. Mapp. 41 (08), 67–70+75. doi:10.16251/j.cnki.1009-2307.2016.08.014
Li, B., Liu, K., Wang, M., He, Q., Jiang, Z., Zhu, W., et al. (2022). Global dynamic rainfall-induced landslide susceptibility mapping using machine learning. Remote Sens. 14, 5795. doi:10.3390/rs14225795
Li, C., Liu, Y., Lai, S., Wang, D., He, X., and Liu, Q. (2024a). Landslide susceptibility analysis based on the coupling model of logistic regression and support vector machine. J. Nat. Disasters 33 (02), 75–86. doi:10.13577/j.jnd.2024.0208
Li, Z., Leng, L., Sun, Y., Huo, Y., and He, Y. (2024b). Landslide susceptibility assessment in the river cascade development basin based on the IV-LM coupling model. Bull. Surv. Mapp., 237–241. doi:10.13474/j.cnki.11-2246.2024.S147
Liu, Y., and Chen, C. (2024). Landslide susceptibility evaluation method considering spatial heterogeneity and feature selection. Acta Geod. Cartogr. Sinica 53 (7), 1417–1428.
Lu, J., He, Y., Zhang, L., Zhang, Q., Gao, B., Chen, H., et al. (2024). Ensemble learning landslide susceptibility assessment with optimized non-landslide samples selection. Geomat. Nat. Hazards Risk 15 (1), 2378176. doi:10.1080/19475705.2024.2378176
Lv, Z., Wang, S., Yan, S., Han, J., and Zhang, G. (2024). Landslide susceptibility assessment based on Multisource remote sensing considering inventory quality and modeling. Sustainability 16, 8466. doi:10.3390/su16198466
Mao, Y., Mwakapesa, D., Wang, G., Nanehkaran, Y., and Zhang, M. (2021). Landslide susceptibility modelling based on AHC-OLID clustering algorithm. Adv. Space Res. 68, 301–316. doi:10.1016/j.asr.2021.03.014
Nguyen, D., Tiep, N., Bui, Q., Le, H., Prakash, I., Costache, R., et al. (2025). Landslide susceptibility mapping using rbfn-based ensemble machine learning models. Comput. Model Eng. Sci. 142 (1), 467–500. doi:10.32604/cmes.2024.056576
Pareek, T., Bhuyan, K., Westen, C. V., Rajaneesh, A., Sajinkumar, K. S., and Lombardo, L. (2025). Analyzing the posterior predictive capability and usability of landslide susceptibility maps: a case of Kerala, India. Landslides 22, 655–670. doi:10.1007/s10346-024-02389-4
Qi, T., Meng, X., and Zhao, Y. (2024). Landslide susceptibility assessment in active tectonic areas using machine learning algorithms. Remote Sens. 16, 2724. doi:10.3390/rs16152724
Rohan, T., Shelef, E., Mirus, B., and Coleman, T. (2023). Prolonged influence of urbanization on landslide susceptibility. Landslides 20, 1433–1447. doi:10.1007/s10346-023-02050-6
Sabatakakis, N., Koukis, G., Vassiliades, E., and Lainas, S. (2014). Landslide susceptibility zonation in Greece. Nat. Hazards 65, 523–543. doi:10.1007/s11069-012-0381-4
Shi, X., Yang, C., Zhang, L., Jiang, H., Liao, M., Zhang, L., et al. (2019). Mapping and characterizing displacements of active loess slopes along the upstream Yellow River with multi-temporal InSAR datasets. Sci. Total Environ. 674, 200–210. doi:10.1016/j.scitotenv.2019.04.140
Su, C. (2023). Study on the risk evaluation of geoenvironmental hazards in the mountainous areas of southern Ningxia in the middle and upper reaches of the Yellow River (master’s thesis). Chang'an University. doi:10.26976/d.cnki.gchau.2023.001209
Tu, K., Ye, S., Zou, J., Hua, C., and Guo, J. (2023). InSAR displacement with high-resolution optical remote sensing for the early detection and deformation analysis of active landslides in the upper Yellow River. Water 15, 769. doi:10.3390/w15040769
Umar, Z., Pradhan, B., Ahmad, A., Jebur, M., and Tehrany, M. (2014). Earthquake induced landslide susceptibility mapping using an integrated ensemble frequency ratio and logistic regression models in West Sumatera Province, Indonesia. CATENA 118, 124–135. doi:10.1016/j.catena.2014.02.005
Wang, X., and Bai, S. (2023). Landslide susceptibility mapping and interpretation in the upper Minjiang River Basin. Remote Sens. 15, 4947. doi:10.3390/rs15204947
Wang, J., Jaboyedoff, M., Chen, G., Luo, X., Derron, M., Hu, Q., et al. (2024a). Landslide susceptibility prediction and mapping using the LD-BiLSTM model in seismically active mountainous regions. Landslides 21, 17–34. doi:10.1007/s10346-023-02141-4
Wang, L., Lv, G., Du, J., Zhu, J., Zhao, G., Wang, D., et al. (2024b). InSAR detection and spatiotemporal characteristics of active landslides in the Maqin section of the upper Yellow River. Inf. Sci. Wuhan. Univ. doi:10.13203/j.whugis20240490
Wang, Q., Xiong, J., Cheng, W., Cui, X., Pang, Q., Liu, J., et al. (2024c). Landslide susceptibility mapping methods coupling with statistical methods, machine learning models and clustering algorithms. J. Geo-Inf. Sci. 26 (3), 620–637. doi:10.12082/dqxxkx.2024.230427
Wu, X., Shen, S., and Niu, R. (2016). Landslide susceptibility prediction using GIS and PSO-SVM. Inf. Sci. Wuhan. Univ. 41 (05), 665–671. doi:10.13203/j.whugis20130566
Xing, X., Wu, C., Li, J., Li, X., Zhang, L., and He, R. (2021). Susceptibility assessment for rainfall-induced landslides using a revised logistic regression method. Nat. Hazards 106, 97–117. doi:10.1007/s11069-020-04452-4
Xing, Y., Huang, S., Yue, J., Chen, Y., Xie, W., Wang, P., et al. (2023). Patterns of influence of different landslide boundaries and their spatial shapes on the uncertainty of landslide susceptibility prediction. Nat. Hazards 118, 709–727. doi:10.1007/s11069-023-06025-7
Yang, C., Liu, L., Huang, F., Huang, L., and Wang, X. (2023a). Machine learning-based landslide susceptibility assessment with optimized ratio of landslide to non-landslide samples. Gondwana Res. 123, 198–216. doi:10.1016/j.gr.2022.05.012
Yang, S., Li, D., Liu, Y., Xu, Z., Sun, Y., and She, X. (2023b). Landslide identification in human-modified alpine and canyon area of the Niulan River Basin based on SBAS-InSAR and optical images. Remote Sens. 15, 1998. doi:10.3390/rs15081998
Yang, X., Fan, X., Wang, K., and Zhou, Z. (2024). Research on landslide susceptibility prediction model based on LSTM-RF-MDBN. Environ. Sci. Pollut. Res. 31, 1504–1516. doi:10.1007/s11356-023-31232-x
Yu, L., Pradhan, B., and Wang, Y. (2025). A comparative study of various combination strategies for landslide susceptibility mapping considering landslide types. Geosci. Front. 16 (2), 101999. doi:10.1016/j.gsf.2024.101999
Zhang, Z., Deng, M., Xu, S., Zhang, Y., Fu, H., and Li, Z. (2022). Comparison of landslide susceptibility assessment models in Zhenkang County, Yunnan Province, China. Chin. J. Rock Mech. Eng. 41 (01), 157–171. doi:10.13722/j.cnki.jrme.2021.0360
Zhang, Y., Xu, P., Liu, J., He, J., Yang, H., Zeng, Y., et al. (2023). Comparison of LR, 5-CV SVM, GA SVM, and PSO SVM for landslide susceptibility assessment in Tibetan Plateau area, China. J. Mt. Sci. 20, 979–995. doi:10.1007/s11629-022-7685-y
Zhao, S., Zeng, R., Zhang, Z., Wang, H., and Meng, X. (2022). Early identification and influencing factors of potential landslides in the upper reaches of the Yellow River, China. Mt. Res. 40 (2), 249–264. doi:10.16089/j.cnki.1008-2786.000669
Zhou, C., Wang, Y., Cao, Y., Singh, R. P., Ahmed, B., Motagh, M., et al. (2024). Enhancing landslide susceptibility modelling through a novel non-landslide sampling method and ensemble learning technique. Geocarto Int. 39 (1), 2327463. doi:10.1080/10106049.2024.2327463
Keywords: upper Yellow River (China), statistical approaches, machine learning, SBAS-InSAR technology, landslide susceptibility assessment
Citation: Zeng J, Tuo W, Wang X and Zhao X (2025) Landslide susceptibility assessment of upper Yellow River using coupling statistical approaches, machine learning algorithms and SBAS-InSAR technique. Front. Earth Sci. 13:1652646. doi: 10.3389/feart.2025.1652646
Received: 24 June 2025; Accepted: 04 August 2025; Published: 29 August 2025.
Edited by:
Chong Xu, Ministry of Emergency Management, China
Reviewed by:
Bo Liu, China University of Geosciences, China
Yang Dongxu, Chengdu University of Technology, China
Zhihan Wang, Yangtze University, China
Copyright © 2025 Zeng, Tuo, Wang and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Wanbing Tuo, wbtuo@qhit.edu.cn