- 1 National Institute of Natural Hazards, Ministry of Emergency Management of China, Beijing, China
- 2 Key Laboratory of Compound and Chained Natural Hazards Dynamics, Ministry of Emergency Management of China, Beijing, China
- 3 School of Geology and Mining Engineering, Xinjiang University, Urumqi, China
- 4 Beijing Engineering Corporation Limited, Beijing, China
- 5 China Railway Xi’an Group Company Limited, Xi’an, Shaanxi, China
- 6 China Railway Design Corporation, Tianjin, China
The Beijing–Guangzhou Railway is a critical transportation corridor in China, and its Zhuzhou–Guangzhou segment is particularly susceptible to landslides because of steep terrain and marked climatic variability. This study combined 713 large-scale landslide relics with 14 influencing factors (topography, geology, and hydrology) to analyze the spatial distribution of landslides and proposed a multiscale landslide susceptibility assessment framework based on a “global modeling–local analysis” approach. The contribution of each factor to susceptibility was quantified using a random forest model coupled with the SHapley Additive exPlanations (SHAP) method. Spatial analysis revealed that landslides are concentrated in areas at 188–752 m elevation, with slopes of 15°–30°, curvatures of −0.35 to 0.22, Jurassic strata, and high precipitation, and are significantly influenced by river erosion and railway engineering activities. The susceptibility model performed well (AUC = 0.88). Global analysis showed that areas of high and very high susceptibility are located primarily in the central and southern sections of the study area, particularly in northern Lechang, Shaoguan, and Suxian District, Chenzhou. Local analysis of three representative sections further indicated that slope, Topographic Wetness Index (TWI), elevation, and curvature are the primary hazard factors, with the order of contribution varying across sections, reflecting the regional characteristics and complexity of landslide mechanisms along the railway. This study provides a scientific basis for landslide risk prevention and control along the Beijing–Guangzhou Railway and a transferable reference for susceptibility assessment in similar regions.
1 Introduction
Landslides, as highly destructive and sudden geological hazards, pose serious threats to infrastructure and the safety of residents in mountainous regions (Hungr et al., 2014). Railways, a vital component of China’s transportation system, frequently traverse geologically complex regions such as mountainous and hilly terrains, rendering them particularly susceptible to landslides and related geohazards such as debris flows and rockfalls. The Beijing–Guangzhou Railway, one of the most important north–south transportation arteries in China, passes through several provinces with high geological hazard susceptibility, including Hunan and Guangdong. It serves as a national Class I trunk line with heavy passenger and freight traffic. The Zhuzhou–Guangzhou section of the railway runs through low mountainous and hilly areas, characterized by intense geomorphic dissection, slow vegetation recovery, active geological structures, and intense and frequent precipitation. Combined with increasing human activities, these conditions contribute to a prominent risk of landslide disasters along the route. In 2020, a landslide triggered by heavy rainfall on the Beijing-Guangzhou Railway in Yongxing County, Hunan Province, caused a train to derail after colliding with the collapsed material, resulting in direct economic losses estimated at 3.14 million USD (Bureau, 2020). This incident severely threatened operational safety, emphasizing the urgent need for effective landslide risk mitigation in this region.
In light of such threats, landslide susceptibility assessment serves as a critical tool for disaster prevention and mitigation, aiming to quantify the probability of landslide occurrence in a region by analyzing influencing factors such as topography, geology, and climate (Aleotti and Chowdhury, 1999; Reichenbach et al., 2018). Since the 1970s, landslide susceptibility evaluation methods have undergone significant development (Kubwimana et al., 2021; Huang, 2023). Early research methods primarily relied on qualitative analysis based on expert experience, such as the Analytic Hierarchy Process (AHP) and fuzzy comprehensive evaluation methods (Xu et al., 2009; Corominas et al., 2014; Liu et al., 2021; Wang et al., 2024b). Although these methods exhibit certain applicability in scenarios with limited data, their outcomes are often limited in objectivity and reproducibility (Yong et al., 2022). With the advancement of data acquisition capabilities, quantitative methods based on mathematical statistics, such as the information value method and frequency ratio method, have gradually become mainstream (Kamp et al., 2008; Xu et al., 2010; Yuan et al., 2022). However, these methods still exhibit limitations when dealing with complex nonlinear relationships, such as the reliance on subjective judgment for weight determination and the difficulty in fully capturing the intricate interactions between landslides and multiple influencing factors (Hong et al., 2017a; Kouhartsiouk and Perdikou, 2021).
In recent years, the introduction of machine learning techniques has provided new perspectives for landslide susceptibility evaluation (Merghadi et al., 2020; Fang et al., 2021; He et al., 2023; Guo et al., 2024). The Random Forest (RF) model has gradually emerged as a mainstream method for landslide susceptibility evaluation due to its strong resistance to overfitting and excellent compatibility with high-dimensional data (Bai et al., 2011; Youssef et al., 2016; Habumugisha et al., 2022). Studies have shown that the RF model performs exceptionally well in regional landslide prediction, with its AUC values typically surpassing those of traditional methods. For instance, Hong et al. compared logistic regression with the Random Forest model and found that the latter achieved higher prediction accuracy in the Wuyuan region (Hong et al., 2017b). Zhao et al. validated the superiority of the RF model in the Hengduan Mountains region through a comparison of six different models (Zhao et al., 2022). Zhang et al. utilized the RF and XGBoost models to assess landslide susceptibility in Fengjie County, Chongqing. The results demonstrated that the RF model outperformed the XGBoost model in landslide susceptibility evaluation for this region, with higher AUC values, accuracy, and F-scores (Zhang et al., 2023).
However, existing research largely focuses on natural geographical units such as watersheds and counties, lacking targeted analysis of linear engineering corridors (such as railways). Railway corridors not only have complex terrain and highly patchy landslide distribution along their lines, but are also influenced by multiple factors such as human disturbance and operational safety requirements, resulting in strong spatial heterogeneity and mechanistic diversity in their disaster risks. Current research on railway-related geological hazards mainly concentrates on the substructural response, deformation monitoring, or seismic behavior of engineering structures such as bridges and tunnels. For example, Sam et al. conducted a systematic study on the seismic behavior of tunnels under high ground stress conditions (Sam, 2024). In addition, although advanced technologies such as microseismic methods have provided insights into local ground motion (Susilo et al., 2024), and the importance of triggering factors such as extreme rainfall is increasingly recognized (Yanfatriani et al., 2024), systematic modeling studies of natural geological conditions and landslide susceptibility across entire railway corridors remain insufficient. In particular, there is still a lack of spatial mechanism analysis and local diagnostic studies for typical high-risk sections.
To address these gaps, this study focuses on the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway and proposes a multiscale landslide susceptibility assessment framework based on a “global modeling–local analysis” approach. By integrating extensive landslide relic data with 14 influencing factors (including topographic, geological, and hydrological variables), the framework employs a Random Forest model for analysis and incorporates the SHapley Additive exPlanations (SHAP) interpretability method to extract factor contributions, thereby enhancing model transparency and explainability.
Building on the overall model, this study further identifies high-risk segments along the railway and develops a “typical section detailed analysis module.” Three representative regions are selected to examine the spatial distribution of dominant factors, differences in factor responses, and disaster-triggering mechanisms, enabling fine-scale spatial identification and risk interpretation at the segment level. This refined approach not only addresses the limited adaptability of traditional models in complex terrains but also embodies an in-depth extension from whole-region modeling to localized diagnosis, providing scientific support for segmented risk management and differentiated mitigation strategies of landslide hazards along railways.
2 Study area and data
2.1 Study area
The Beijing-Guangzhou Railway spans the north-south axis of China, extending from 39°48′N to 23°05′N latitude and from 113°04′E to 114°53′E longitude. It extends from Fengtai District in Beijing in the north to Guangzhou City in Guangdong Province in the south, with a total length of approximately 2,298 km. The railway traverses three major geographical regions: the North China Plain, the Jianghan Plain, and the Pearl River Delta. It intersects with several key rail lines, including the Beijing-Shanghai Railway, the Shanghai-Kunming Railway, the Hankou-Yichang Railway, and the Wuhan-Guangzhou High-Speed Railway, forming the core framework of China’s railway network. This study selects the Zhuzhou to Guangzhou section of the Beijing–Guangzhou Railway and its surrounding areas as the research focus. The railway passes through several cities including Zhuzhou, Hengyang, Chenzhou, Shaoguan, and Qingyuan, covering a total length of approximately 668 km (Figure 1). This section is located on the eastern edge of the Nanling tectonic belt and is one of the segments of the Beijing–Guangzhou Railway that traverses the most mountainous terrain. Along the route, it frequently crosses narrow valleys, steep slopes, and river-cut landscapes.
The study area features complex terrain dominated by mountains and hills, with significant overall elevation variation, ranging from approximately 200 to 800 m above sea level. Some sections have steep slopes, deep valleys, and well-developed gullies. The region exhibits diverse lithologies, primarily granite, gneiss, sandstone, shale, and clastic rocks. Some rock masses have well-developed joints and fractures and weak resistance to weathering, resulting in loose stratigraphic structures and soft lithologies. These conditions reduce slope stability and favor the development of geological hazards such as landslides. The climate is subtropical monsoon, with four distinct seasons and rainfall that coincides with the warm period. Annual precipitation ranges from about 400 to 1,600 mm, and the mean annual temperature is approximately 15 °C–20 °C. Heavy rainfall events are concentrated mainly from April to June, and frequent extreme precipitation serves as a significant external trigger for landslides. The combined influence of these factors makes the Beijing–Guangzhou Railway corridor a landslide-prone area, posing considerable threats to the safe operation of the railway.
2.2 Large-scale landslide relics inventory
Landslide relics refer to areas where landslides have occurred or are currently occurring. While most of these areas have stabilized, they may still undergo deformation due to external forces such as earthquakes or rainfall. These relics serve as the only direct data source for understanding the developmental context of local landslides (Guerriero et al., 2021; Wang et al., 2024c). This study utilizes multi-source high-resolution satellite imagery (QuickBird, IKONOS, SPOT-5, etc.) provided by Google Earth to systematically identify and catalog landslide remains in the study area using visual interpretation methods (Highland, 2004; Li et al., 2024; Xue et al., 2025). The cataloging results are verified by referring to historical materials and existing research to ensure their reliability.
Landslides with an area of ≥0.05 km² were defined as large landslide relics. A total of 713 large landslide relics were cataloged, with individual landslide areas ranging from 0.05 to 2.02 km² and a total landslide area of 117.35 km² (Figures 2, 3). Investigating the inventory of landslide relics aids in identifying high-risk landslide areas and assessing the likelihood and potential impacts of future landslides, thereby providing a scientific basis for formulating effective disaster prevention and mitigation strategies (Feng et al., 2024b; Gao et al., 2024; Huang et al., 2024; Xie et al., 2025).
Figure 2. Typical landslide images in the study area. (a) at 23°44′26.37″N, 113°47′15.85″E; (b) at 24°28′16.83″N, 112°09′1.82″E; (c) at 23°44′22.27″N, 113°46′30.93″E; (d) at 24°29′43.20″N, 112°13′37.10″E.
2.3 Influencing factors
Landslides are the result of interactions among various factors, including geological conditions, topography, hydrogeological conditions, seismic activity, human activities, and climate change (Brabb, 1984; Cruden, 1991). Therefore, based on the characteristics of the study area, this study selected 14 factors in four categories: topography, basic geology, hydrometeorology, and surface cover (Figure 4 and Table 1). The topographic factors include elevation, slope, aspect, Terrain Position Index (TPI), and curvature; the basic geological factors include geology, distance to fault, and peak ground acceleration (PGA); the hydrometeorological factors include average precipitation, distance to river, and Topographic Wetness Index (TWI); and the surface cover factors include Fractional Vegetation Cover (FVC), distance to railway, and landcover. These 14 influencing factors were used for landslide susceptibility analysis.
Figure 4. Factor classification maps. (a) Elevation; (b) Slope; (c) Aspect; (d) TPI; (e) Curvature; (f) Geology; (g) Average precipitation; (h) FVC; (i) TWI; (j) PGA; (k) Landcover; (l) Distance to Fault; (m) Distance to River; (n) Distance to Railway.
The elevation data were sourced from SRTM (Jarvis et al., 2008), and slope, aspect, and curvature information were extracted using GeoScience software. The slope position data were obtained through the Land Facet Corridor program. The TWI was generated based on DEM within the GRASS GIS environment. The land cover type was derived from the land cover dataset GLCNMO (Tateishi et al., 2011). The fault data were obtained from the National Seismic Active Fault Data and categorized into 11 classes (Xu et al., 2016; Wu et al., 2024). The stratigraphic data were sourced from the 1:2.5 million Geological Map of China (China Geological Survey) and partial geological maps of Asian countries (United States Geological Survey), and classified into 12 categories based on geological age. The average precipitation data were obtained from global climate data with a resolution of 1 km, and the average precipitation map for the study area was generated using linear interpolation methods (Fick and Hijmans, 2017). The PGA data were sourced from the United States Geological Survey (USGS), representing the coseismic PGA distribution map derived from a combination of station data and numerical simulations following seismic events (Xu et al., 2019). All factor data were processed into 30 m × 30 m grid format to prepare for subsequent modeling.
To avoid multicollinearity issues among the influencing factors, this study conducted a correlation analysis of the 14 factors using the Pearson correlation coefficient (Huang et al., 2025). Figure 5 shows that the absolute values of the correlations for all factors are below 0.7, indicating no multicollinearity, and thus, they can be input into the model for training.
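The screening step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' workflow: the function names `pearson` and `collinear_pairs` and the five-cell demonstration values are hypothetical, and the 0.7 cutoff is taken from the text.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def collinear_pairs(factors, threshold=0.7):
    """Return (factor_a, factor_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(factors), 2):
        r = pearson(factors[a], factors[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical factor values sampled at five grid cells (illustrative only).
demo = {
    "elevation": [120.0, 340.0, 560.0, 410.0, 230.0],
    "slope": [5.0, 18.0, 27.0, 22.0, 31.0],
    "twi": [9.1, 6.4, 7.2, 5.0, 8.8],
}
print(collinear_pairs(demo))
```

An empty list from `collinear_pairs` would correspond to the situation reported in Figure 5, where no factor pair reaches the 0.7 threshold.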
3 Methods
This paper uses the random forest method to evaluate the landslide susceptibility of the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway and the surrounding areas. The main steps are as follows (Figure 6): (a) Data preparation: compile a local landslide database and prepare the influencing factors; (b) Data processing: use a 30-m grid as the mapping unit, reclassify the influencing factors to simplify the calculation, select as many random points in non-landslide areas as there are landslide samples to serve as negative samples, assign values of 1 and 0 to the positive and negative sample points, respectively, and randomly split all samples into training and test sets in an 8:2 ratio; (c) Model training and evaluation: train the model on the training set, then verify and evaluate it on the test set; (d) Landslide susceptibility evaluation: use the trained model to predict and evaluate the regional landslide susceptibility.
3.1 Model sampling
In this study, a total of 1,426 sample points were selected and the ratio of positive to negative samples was 1:1. The positive samples were obtained by converting 713 large landslide relics into points, while the negative samples were created as random points in non-landslide areas. Values of 1 and 0 were assigned to the positive and negative sample points, respectively, to provide sample data for subsequent model training (Figure 7).
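The sample construction and the 8:2 split can be sketched as follows. This is an illustrative sketch under stated assumptions: the string identifiers stand in for real point locations with their 14 factor values, the helper names `build_samples` and `split_8_2` are hypothetical, and rounding the test share up is one possible convention rather than the one the study necessarily used.

```python
import math
import random


def build_samples(n_landslides, seed=42):
    """Label landslide points 1 and an equal number of random points 0, then shuffle."""
    # Identifiers are placeholders; real samples would carry coordinates
    # and the values of the 14 influencing factors.
    positives = [(f"landslide_{i}", 1) for i in range(n_landslides)]
    negatives = [(f"random_{i}", 0) for i in range(n_landslides)]
    samples = positives + negatives
    random.Random(seed).shuffle(samples)
    return samples


def split_8_2(samples, test_fraction=0.2):
    """Hold out the last ceil(test_fraction * n) shuffled samples as the test set."""
    n_test = math.ceil(test_fraction * len(samples))
    cut = len(samples) - n_test
    return samples[:cut], samples[cut:]


samples = build_samples(713)
train, test = split_8_2(samples)
print(len(samples), len(train), len(test))
```

With 713 landslide relics this gives 1,426 labeled points in total, matching the sample count stated above.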
3.2 Random Forests
Random Forests (RF) is an ensemble method that trains many binary decision trees independently and is known for its strong generalization capability and flexibility (Breiman, 2001; Torres-Vázquez et al., 2025). The RF algorithm is applicable to both classification and regression problems and is relatively tolerant of outliers and noise, making it less prone to overfitting (Zhang et al., 2017; Dey et al., 2024). Each tree is trained on a different random subsample of the dataset, so RF provides more robust and accurate predictions than a single decision tree. The RF prediction can be expressed as (Equation 1):

$$\hat{f}(x) = \frac{1}{B}\sum_{b=1}^{B} T_b(x, \theta_b) \tag{1}$$

where $B$ is the number of decision trees, $T_b(x, \theta_b)$ is the prediction function of the $b$-th tree, and $\theta_b$ represents the parameters of the $b$-th tree. By averaging the predictions of multiple trees, the RF algorithm reduces variance and enhances the model’s generalization capability. It demonstrates good predictive performance across various data distributions and has become a widely used method for classification and prediction tasks.
This study employs the GridSearchCV method to repeatedly tune hyperparameters on the training set. The search ranges include: n_estimators [100, 120, 140, 160, 180], max_depth [3, 5, 7, 10, None] (where None indicates no depth restriction), min_samples_split [2, 5, 10], min_samples_leaf [1, 2, 4], and max_features [‘sqrt’, ‘log2’]. A 10-fold cross-validation is applied, and the optimal parameters are selected by maximizing the validation set AUC. The final optimal hyperparameter combination is as follows: the random forest consists of 140 decision trees, with a maximum depth of 5, a minimum split sample number of 2, and a minimum of 1 sample per leaf node. The dataset is randomly divided into training and test sets in an 8:2 ratio, and the training features are standardized. The susceptibility model is then constructed with a fixed random seed of 42.
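To give a sense of the size of this search, the sketch below writes the stated ranges as a dictionary and enumerates every combination. It is an illustration only: the names `PARAM_GRID`, `enumerate_grid`, and `REPORTED_BEST` are hypothetical, and no model is trained here.

```python
from itertools import product

# Search ranges as listed in the text.
PARAM_GRID = {
    "n_estimators": [100, 120, 140, 160, 180],
    "max_depth": [3, 5, 7, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2"],
}
CV_FOLDS = 10


def enumerate_grid(grid):
    """Expand a parameter grid into a list of candidate settings."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


candidates = enumerate_grid(PARAM_GRID)
n_fits = len(candidates) * CV_FOLDS  # one fit per candidate per fold

# Optimum reported in the text (max_features is not reported there).
REPORTED_BEST = {
    "n_estimators": 140,
    "max_depth": 5,
    "min_samples_split": 2,
    "min_samples_leaf": 1,
}
matches = [c for c in candidates if all(c[k] == v for k, v in REPORTED_BEST.items())]
print(len(candidates), n_fits, len(matches))
```

A dictionary of this shape is what scikit-learn's `GridSearchCV` accepts as `param_grid`; combining it with `scoring="roc_auc"` and `cv=10` would be one plausible way to reproduce the search described above, although the exact call used in the study is not given.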
3.3 Model validation
The receiver operating characteristic (ROC) curve is a tool for evaluating the performance of binary classification models. It illustrates model performance at different classification thresholds by plotting the true positive rate (TPR) against the false positive rate (FPR), and it is widely used to assess machine learning models (Zhang et al., 2007). The two rates are defined as follows (Equations 2, 3):

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{2}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{3}$$

where TP indicates that both the prediction and the actual value are landslides; FP indicates that the prediction is a landslide while the actual value is not; TN indicates that both the prediction and the actual value are non-landslides; and FN indicates that the prediction is a non-landslide while the actual value is a landslide. The area under the ROC curve (AUC) summarizes the model’s discriminative ability and ranges from 0 to 1. The closer the AUC is to 1, the better the model’s classification performance.
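The relationship between the confusion-matrix counts, the ROC curve, and the AUC can be illustrated with a short sketch. The six labels and scores below are hypothetical, the function names are illustrative, and the trapezoidal rule is one common way to integrate the curve; the study's own evaluation code is not given.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are predicted as landslides."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(y_true, scores):
    """(FPR, TPR) pairs for every distinct score used as a threshold.

    Assumes y_true contains at least one landslide and one non-landslide.
    """
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = landslide) and predicted susceptibility scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(y_true, scores)), 4))
```

In this toy case eight of the nine landslide and non-landslide pairs are ranked correctly, so the area equals 8/9.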
3.4 SHAP interpretability method
To reveal the specific contribution of each influencing factor to the landslide susceptibility predictions of the Random Forest model, this study introduces the SHapley Additive exPlanations (SHAP) method (Xiao et al., 2024; Halder et al., 2025). SHAP is based on Shapley values from cooperative game theory and aims to distribute the model output fairly among the input features. Compared with LIME (Local Interpretable Model-agnostic Explanations), which may produce unstable interpretations due to local sampling, and permutation importance, which provides only global feature rankings, SHAP provides both global feature importance and local instance-level interpretations. This is crucial for revealing how the dominant landslide factors differ among the typical railway sections within the study area (Fisher et al., 2019; Chen and Fan, 2024). For any input feature vector $x = (x_1, x_2, x_3, \ldots, x_m)$ with model prediction $f(x)$, the SHAP method calculates the marginal contribution $\phi_i$ of each feature $x_i$, satisfying the following additive property (Equation 4):

$$f(x) = \phi_0 + \sum_{i=1}^{m} \phi_i \tag{4}$$

Here, $\phi_0$ is the base value (the expected model output when no feature values are known), and $\phi_i$ is the Shapley value of feature $i$ (Equation 5):

$$\phi_i = \sum_{S \subseteq \{1,\ldots,m\} \setminus \{i\}} \frac{|S|!\,(m-|S|-1)!}{m!}\left[f_{S \cup \{i\}}(x) - f_S(x)\right] \tag{5}$$

Here, $S$ is any subset of features excluding feature $i$, and $f_S$ is the expected model output conditioned on the features in subset $S$. By considering all possible feature combinations, SHAP values quantify the average contribution of a single feature to the model’s prediction, thereby ensuring theoretical fairness and consistency. In this study, the Python shap library is used to interpret the Random Forest model, obtaining both global and local contribution distributions for each factor, which helps identify the key driving factors of landslide occurrence and their directional effects.
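The Shapley weighting over feature subsets can be made concrete with a brute-force sketch on a toy model. Everything here is an assumption for illustration: `toy_model` is hypothetical, features outside the subset are replaced by a single baseline value as a simple stand-in for the conditional expectation, and the shap library itself relies on far more efficient algorithms than this exhaustive enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |S|! (m - |S| - 1)! / m! for subsets of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi.append(total)
    return phi


def toy_model(x):
    """Hypothetical model with one interaction term (not the study's model)."""
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]


def make_value_fn(model, instance, baseline):
    """Features in the subset take the instance value; the rest take the baseline."""
    def value_fn(subset):
        mixed = [instance[j] if j in subset else baseline[j]
                 for j in range(len(instance))]
        return model(mixed)
    return value_fn


instance, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
value_fn = make_value_fn(toy_model, instance, baseline)
phi = shapley_values(value_fn, 3)
phi_0 = toy_model(baseline)
print(phi, phi_0 + sum(phi), toy_model(instance))
```

The last line checks the additive property: the base value plus the three contributions recovers the model output for the instance, and the interaction term is shared equally between the two features involved.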
4 Results and analysis
4.1 Spatial distribution of large-landslide relics
To analyze the spatial distribution characteristics of landslides more accurately, we selected landslide number density (LND) and landslide area percentage (LAP) as indicators of landslide abundance (Huang et al., 2022). LND reflects the number of landslides per unit area, which helps identify regions where landslides are frequent (Equation 6); LAP represents the proportion of a given area occupied by landslides, which describes their spatial extent (Equation 7):

$$\mathrm{LND} = \frac{\text{number of landslides in the class}}{CA} \tag{6}$$

$$\mathrm{LAP} = \frac{\text{landslide area in the class}}{CA} \times 100\% \tag{7}$$

where $CA$ represents the area of each class of the influencing factors. Together, these two indicators allow a comprehensive evaluation of landslide abundance and spatial distribution. This enables an in-depth investigation of the relationship between influencing factors and landslides and reveals the distribution patterns of large landslide relics in the study area (Li et al., 2021; Cui et al., 2022).
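The two indicators are simple ratios per factor class, as the following sketch shows. The class names, counts, and areas are hypothetical values chosen for illustration and are not taken from the study; areas are assumed to be in km².

```python
def lnd(landslide_count, class_area_km2):
    """Landslide number density: landslides per unit area of the class."""
    return landslide_count / class_area_km2


def lap(landslide_area_km2, class_area_km2):
    """Landslide area percentage: share of the class area covered by landslides."""
    return 100.0 * landslide_area_km2 / class_area_km2


def summarize(classes):
    """Compute LND and LAP for each class of one influencing factor."""
    return {
        name: {
            "LND": lnd(c["count"], c["class_area_km2"]),
            "LAP": lap(c["landslide_area_km2"], c["class_area_km2"]),
        }
        for name, c in classes.items()
    }


# Hypothetical classes of one factor (illustrative numbers only).
demo_classes = {
    "class_A": {"count": 10, "landslide_area_km2": 2.5, "class_area_km2": 1000.0},
    "class_B": {"count": 4, "landslide_area_km2": 0.5, "class_area_km2": 2000.0},
}
summary = summarize(demo_classes)
for name, values in summary.items():
    print(name, values)
```

Because both indicators are normalized by class area, a class with few landslides can still rank highest if the class itself covers little ground.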
4.1.1 Topographic factors
The elevation of the study area was divided into nine classes using the natural breaks method: −51–91, 91–188, 188–304, 304–435, 435–583, 583–752, 752–956, 956–1,222, and 1,222–2,030 m (Figure 8). The large landslide relics are predominantly distributed within the elevation range of 188–752 m, comprising 518 landslides, which account for 72.65% of the total large landslide relics in the study area. The landslide number density (LND) and landslide area percentage (LAP) are highest in the elevation range of 583–752 m, with values of 0.0153% and 0.28%, respectively, indicating a higher probability of landslide occurrence in this range.
For the slope, most landslides occur between 15° and 30°, accounting for about 70.13% of the total number of landslides. Among these, the slope range of 24.09°–29.92° has the highest number of landslides, totaling 187 (Figure 9). Only 2.24% of the landslides occurred in the 0°–9° slope range. The highest LND and LAP values were found in the 29.92°–71.23° interval, and these metrics continued to increase from 12.01° to 15.66°, suggesting that landslide density roughly rises with slope.
Figure 10 shows the relationship between curvature and LND and LAP. The large-scale landslide relics in this area are mainly distributed in the curvature range of −0.35 to 0.22, with a total of 536 landslides, which account for 75.18% of the total number. LND and LAP are highest in the range of −0.58 to −0.35. This is mainly because negative curvature values indicate concave terrain, which tends to create areas of water flow convergence, increasing soil saturation and reducing soil shear strength, thereby elevating the risk of landslides. In addition, these areas may also feature complex geological structures, fragmented rocks, and intense weathering conditions, further exacerbating landslide susceptibility.
For aspect, apart from flat areas, the number of large-scale landslide relics in each class is relatively balanced, with slightly more landslides in the Northeast and Southwest classes (110 and 108, respectively) (Figure 11). The LND is highest in the Northeast (0.0086) and lowest in the Northwest (0.0048). In contrast, the LAP is highest in the Southwest, at 0.16%. This pattern can be attributed to variations in sunlight, precipitation, and vegetation growth across different slope aspects. The Northeast and the Southwest receive strong solar radiation, leading to significant thermal expansion and contraction of rocks and soil, which promotes weathering and fragmentation and results in poor stability. Additionally, these aspects are more susceptible to landslides due to the substantial impact of precipitation and surface runoff. In the Southwest, dense vegetation may increase soil porosity and moisture content, reducing shear strength. Furthermore, vegetation can alter the path and velocity of surface runoff, resulting in greater susceptibility to landslides.
For slope position, the steep-slope class has both the largest number and the largest area of landslides, at 547 and 36,764.26 km², respectively. Both LND and LAP are also highest in this class, which is consistent with actual observations (Figure 12). Steep slopes are characterized by complex geological conditions and rock and soil with low shear strength, making them prone to the formation of sliding surfaces. The development of joints, fractures, bedding planes, and faults in the geological structure further increases the likelihood of landslides. Additionally, the study area is located in a subtropical humid monsoon climate zone with high annual rainfall. The substantial infiltration of rainwater saturates the soil and rock layers on the slopes, increasing the weight of the sliding mass and reducing the shear strength of the soil and rock layers. These conditions make landslides more likely to occur.
4.1.2 Basic geological factors
Most of the landslides are distributed in Jurassic, Devonian, Cambrian, and Precambrian strata, accounting for about 77.84% of the total. Only 29 landslides occurred in the three classes from Quaternary to Cretaceous, a small proportion (Figure 13). The highest LND is 0.086 in the Jurassic, followed by 0.0071 in the Devonian, and the lowest is in the Tertiary. The lithology of the Jurassic, Devonian, Cambrian, and Precambrian strata has low shear strength and is prone to the formation of sliding surfaces and landslides. In contrast, the lithology of the Quaternary to Cretaceous strata has higher shear strength and stability, resulting in relatively fewer landslide hazards. The highest LND is observed in the Jurassic strata, likely due to the widespread distribution of Jurassic strata within the study area and their lithological structure and geological conditions being more conducive to landslide occurrence.
Distance to faults was divided into 11 classes at 1 km intervals: 0–1 km, 1–2 km, 2–3 km, 3–4 km, 4–5 km, 5–6 km, 6–7 km, 7–8 km, 8–9 km, 9–10 km, and >10 km (Figure 14). The majority of landslides occur in the >10 km class, with 569 landslides accounting for 79.80% of the total. The landslides in the other classes are distributed approximately equally. The LND fluctuates considerably, with the highest value of 0.0092 at 6–7 km from the fault and the lowest value of 0.0043 at 1–2 km. The peak at 6–7 km is attributed primarily to the fragmentation of rock masses in secondary fault zones, the coupling effects of steep terrain and heavy rainfall, and human activity disturbances. In contrast, the lowest LND in the near-fault region (1–2 km) is attributed to consolidated and stable rock masses, limited development, and monitoring biases.
4.1.3 Hydrometeorological factors
Figure 15 shows that 85.69% of landslides occur within the average precipitation range of 1,485–1,802 mm. Among these, the 1,652–1,702 mm interval has the highest number of landslides, totaling 111. This is primarily because the ample moisture in this range increases soil saturation and pore water pressure, reducing slope stability and making landslides more likely to occur. The LND and LAP are highest in the 1,802–1,995 mm rainfall range, with values of 0.1680% and 0.21%, respectively. This is primarily because the study area has an average precipitation ranging from 400 to 1,600 mm, and regions within the 1,802–1,995 mm range are relatively small. However, when these regions experience high rainfall, the probability of landslides increases significantly, resulting in the highest LND and LAP values.
About 74.33% of the landslides occurred within 1,941.62 m of a river (Figure 16). The 0–447.21 m interval contains the largest number of landslides (168), while the 5,375.87–8,747.57 m interval contains the smallest (only 4). This is mainly because river scouring at the toe of a slope destroys the slope’s supporting structure, leading to slope instability and landslides. In addition, the geological conditions near rivers are often complex, with features such as faults and weak rock layers that are inherently prone to triggering landslides. Coupled with the erosive effects of rivers, these conditions further promote landslide occurrence.
Figure 17 indicates that 616 landslides occur in the TWI 491–865 interval, accounting for 86.40% of the total number of landslides. In contrast, the 1,030–2,150 interval contains only 1.54% of the total. The LND and LAP of TWI exhibit a trend of initially increasing and then decreasing, reaching their highest values of 0.0164% and 0.26%, respectively, within the 679–783 range. This is because when the TWI value is low, the terrain is dry and the soil moisture content is low, resulting in a lower probability of landslides. As TWI values increase, humidity rises, soil moisture content increases, and shear strength decreases, leading to a higher probability of landslides and an increase in LND and LAP. However, when TWI values are excessively high, the soil becomes overly moist or even saturated, further reducing shear strength. Despite this, the probability of landslides decreases due to increased stability, causing LND and LAP to decline. The peak values occur within the TWI range where humidity is optimal.
4.1.4 Surface coverage factors
The FVC 41–52 interval contains the most landslides (125, or 17.53% of the total), while the FVC −1 to 9 interval contains the fewest (26, or 3.65%) (Figure 18). The highest LND is observed in the FVC 86–100 range, primarily because the areas with high vegetation cover along the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway are located in regions with complex geological structures and steep terrain, where slope stability is already poor. High FVC values cannot completely offset these unfavorable geological and topographic conditions. Once triggering factors occur, landslides remain likely, resulting in large LND and LAP values.
Most landslides are distributed in four land cover types: broadleaf evergreen forest, broadleaf deciduous forest, needleleaf evergreen forest, and tree open. Together these contain 525 large-scale landslide relics, accounting for 73.63% of the total (Figure 19). The tree open class alone contains 148 relics, while urban areas and mixed forests contain only single-digit numbers. The LND and LAP fluctuate greatly across classes: the highest LND (0.0126) occurs in the broadleaf evergreen class, and the highest LAP (0.23%) occurs in the herbaceous class.
For distance to railway, seven categories were defined: 0–0.2 km, 0.2–0.5 km, 0.5–1 km, 1–2 km, 2–5 km, 5–10 km, and >10 km (Figure 20). Landslides mainly occurred in the >10 km category, with 674 events, accounting for 94.45% of the total. Landslides were relatively few in the other categories. The LND (landslide number density) shows significant variation, with the highest value of 0.0070 observed in the 0.2–0.5 km range and the lowest value of 0.0007 in the 1–2 km range. The highest LND at 0.2–0.5 km from the railway is primarily attributed to the strong disturbance of slope stability caused by railway engineering activities, including slope cutting, train-induced vibrations, and drainage modifications. Slope cutting and excavation during railway construction directly weaken the slope toe support, leading to stress concentration on the potential sliding surface. The installation and modification of drainage ditches may alter surface and groundwater flow paths, increasing pore water pressure. Vibrations and cyclic loads generated by train operation can also cause cumulative plastic deformation of the slope soil, accelerating the reactivation of old landslides. Furthermore, railway maintenance, reinforcement, and expansion activities may also alter the local stress balance, increasing the risk of slope instability.
Overall, landslides in the study area are predominantly distributed within the elevation range of 188–752 m, slope angles of 15°–30°, and curvature range of −0.35–0.22. The number of landslides on northeast and southwest aspects is slightly higher, and the number and area of landslides in the steep slope interval are the highest. Landslides mostly occur in lithological units such as the Jurassic strata, and within the average precipitation range of 1,485–1,802 mm. The number of landslides is highest in areas with FVC of 41–52, while landslide density is highest in the FVC range of 86–100. Most landslides occur more than 10 km from faults, but the highest landslide density is found within 6–7 km of faults. Approximately 74.33% of landslides are located within 1,941.62 m of rivers, with the highest number occurring within 0–447.21 m. Landslides with TWI in the 491–865 interval account for 86.40% of the total, and the landslide density and area percentage are highest in the 679–783 interval. Landslides are predominantly observed in vegetation types such as broadleaf evergreen forests, with the highest landslide density in areas dominated by broadleaf evergreen vegetation and the highest percentage of landslide area in areas covered by herbaceous vegetation. Notably, most landslides (94.45%) occurred more than 10 km from the railway; however, the highest landslide density (LND = 0.0070) was observed within the 0.2–0.5 km range. This distinctive distribution pattern likely reflects localized slope stability disturbances caused by railway engineering activities.
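The contrast between landslide counts and landslide density noted above can be illustrated with a minimal Python sketch. It assumes that LND is the number of landslides per unit class area and that LAP is the landslide area within a class divided by the class area; these definitions, the `FactorClass` container, and all class areas and landslide areas below are illustrative assumptions rather than values from this study (only the count of 674 is taken from the text).

```python
from dataclasses import dataclass

@dataclass
class FactorClass:
    label: str
    class_area_km2: float       # area covered by this factor class (hypothetical)
    landslide_count: int        # landslides located inside the class
    landslide_area_km2: float   # total landslide area inside the class (hypothetical)

def density_stats(classes):
    """Return share of landslides, LND, and LAP for every factor class."""
    total = sum(c.landslide_count for c in classes)
    return {
        c.label: {
            "share_pct": 100.0 * c.landslide_count / total,
            "lnd_per_km2": c.landslide_count / c.class_area_km2,
            "lap_pct": 100.0 * c.landslide_area_km2 / c.class_area_km2,
        }
        for c in classes
    }

# Hypothetical areas, chosen only to illustrate the effect.
classes = [
    FactorClass("0.2-0.5 km", 600.0, 5, 1.2),
    FactorClass(">10 km", 100_000.0, 674, 150.0),
]
stats = density_stats(classes)
most_landslides = max(stats, key=lambda k: stats[k]["share_pct"])
highest_density = max(stats, key=lambda k: stats[k]["lnd_per_km2"])
print(most_landslides, highest_density)
```

A large class can hold most of the landslides while a small class close to the railway still has the highest density, which is why counts and LND are reported separately.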
4.2 Model validation
In this study, fourteen influencing factors were selected: elevation, slope, aspect, TPI, curvature, geology, average precipitation, FVC, TWI, PGA, land cover, distance to faults, distance to railway, and distance to rivers. The data were split 8:2 into training and validation sets to construct a landslide susceptibility assessment model for the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway.
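As an illustration of the 8:2 partition, the following Python sketch splits sample indices class by class so that both subsets keep the same landslide to non-landslide ratio. Stratification, the equal number of non-landslide samples, the random seed, and the function name `stratified_split` are all assumptions made for the example; the text above does not specify how the split was drawn.

```python
import random

def stratified_split(labels, train_frac=0.8, seed=42):
    """Split sample indices into train/validation sets, class by class."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx += idx[:cut]
        valid_idx += idx[cut:]
    return sorted(train_idx), sorted(valid_idx)

# Assumption: 713 landslide samples (label 1) and as many non-landslide samples (label 0).
labels = [1] * 713 + [0] * 713
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```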
To comprehensively evaluate model performance, this study used a combination of metrics, including the area under the ROC curve (AUC), precision, recall, and F1 score. The RF model achieved an AUC of 0.88, indicating high accuracy in landslide prediction and an effective ability to distinguish between landslide and non-landslide occurrences (Figure 21). Furthermore, confusion matrix analysis (see Table 2) reveals an overall accuracy of 0.77. Specifically, the landslide class achieved a precision of 0.70, a recall of 0.91, and an F1-score of 0.79, while the non-landslide class had a precision of 0.88, a recall of 0.64, and an F1-score of 0.74. The model therefore identifies most landslide samples, and its non-landslide predictions are rarely wrong. Overall, the model shows good applicability and reliability.
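The relation between these metrics can be checked with a short Python sketch. The F1-score is the harmonic mean of precision and recall, and the AUC can be read as the probability that a randomly chosen landslide sample receives a higher score than a randomly chosen non-landslide sample. The helper names and the six toy scores are assumptions for illustration; only the precision and recall of the non-landslide class are taken from the text.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Non-landslide class: precision 0.88 and recall 0.64 as reported above.
f1_non_landslide = f1_score(0.88, 0.64)
print(round(f1_non_landslide, 2))

# Toy scores (hypothetical): 1 = landslide, 0 = non-landslide.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

The pairwise formulation is quadratic in the number of samples, so it is suited to small validation sets or to checking the output of a library routine.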
4.3 Global susceptibility assessment of large-scale landslide relics
The accuracy-validated susceptibility assessment model was used to predict landslide susceptibility across the study area. The Natural Breaks (Jenks) method (Jenks, 1967) was then used to reclassify the results into five susceptibility levels: very low, low, moderate, high, and very high (Figure 22).
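The reclassification step can be sketched in Python as follows. Natural breaks place class boundaries so that the sum of squared deviations within classes is minimal; the sketch solves this exactly by dynamic programming on a small sorted sample. The function names, the ten probability values, and the `classify` helper are hypothetical, and a full-resolution raster would in practice be subsampled or handled by a GIS tool because the procedure is quadratic in the number of values.

```python
from bisect import bisect_left

def jenks_breaks(values, n_classes):
    """Upper bound of each class minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    s1, s2 = [0.0], [0.0]                 # prefix sums of x and x^2
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):                        # squared deviation of data[i:j]
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting data[:j] into k classes
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], cut[k][j] = c, i
    bounds, j = [], n
    for k in range(n_classes, 0, -1):     # backtrack from the last class
        bounds.append(data[j - 1])
        j = cut[k][j]
    return bounds[::-1]

LEVELS = ["very low", "low", "moderate", "high", "very high"]

def classify(p, breaks):
    """Map a susceptibility value to one of the five levels."""
    return LEVELS[min(bisect_left(breaks, p), len(LEVELS) - 1)]

# Hypothetical predicted probabilities for ten grid cells.
probs = [0.02, 0.04, 0.21, 0.23, 0.45, 0.47, 0.68, 0.70, 0.93, 0.95]
breaks = jenks_breaks(probs, 5)
print(breaks, classify(0.5, breaks))
```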
As shown in Figure 22, the high and very high susceptibility zones in the study area are primarily concentrated in the central and southern regions, with additional high and very high susceptibility zones in the northeastern part. The area and area ratio of each susceptibility zone were then calculated. The very low susceptibility zone covers approximately 43,034.36 to 45,784.54 km², accounting for 41.22%–38.64% of the total study area; the low susceptibility zone covers about 23,240.75 to 23,314.76 km², representing 20.99%–18.7%; the moderate susceptibility zone covers roughly 18,063.42 to 17,110.09 km², accounting for 15.40%–16.22%; the high susceptibility zone covers approximately 15,939.49 to 14,311.06 km², making up 12.89%–14.31%; and the very high susceptibility zone covers around 10,546.77 to 11,092.54 km², which is 9.50%–9.6% of the total study area (Table 3). Overall, the southern section of the Beijing–Guangzhou Railway from Zhuzhou to Guangzhou is more prone to landslide hazards than the northern section, likely because of the steeper terrain in the south. Therefore, it is recommended that relevant authorities enhance monitoring and early warning systems in these high-risk areas, develop targeted disaster prevention and mitigation measures, and improve local residents’ awareness of disaster prevention and emergency response capabilities.
4.4 Importance of influencing factors
Two methods were used to analyze the importance of the influencing factors: the built-in importance measures of Random Forest and the SHAP interpreter (Xiao et al., 2024; Halder et al., 2025). First, the built-in Random Forest method was used to evaluate the contribution of each influencing factor within the model. This method primarily relies on two metrics: Mean Impurity Decrease and Mean Accuracy Decrease (Yu et al., 2024). Using Python, we calculated and sorted the importance values of the 14 influencing factors. The results are as follows: Slope (0.3910) > TWI (0.1393) > Elevation (0.0936) > Average Precipitation (0.0899) > Curvature (0.0631) > Slope Position (0.0444) > Lithology (0.0316) > Land Cover Type (0.0307) > Distance to Fault (0.0285) > Distance to Railway (0.0277) > Vegetation Cover (0.0261) > Aspect (0.0204) > Distance to River (0.0132) > PGA (0.0003) (Figure 23). The ranking shows that slope, TWI, elevation, and average precipitation have relatively high importance, whereas PGA and distance to river have low importance.
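The Mean Accuracy Decrease idea can be sketched in a few lines of Python: each factor is shuffled in turn, and the resulting drop in accuracy is taken as its importance. The sketch is a simplified illustration, not the procedure used to obtain the values above. A fixed threshold rule stands in for the trained Random Forest, and the two-factor data set, the 20° threshold, and all function names are hypothetical.

```python
import random

def accuracy(model, X, y):
    return sum(model(row) == t for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column at a time."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)                      # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Stand-in model (assumption): predicts a landslide when slope exceeds 20 degrees.
def toy_model(row):
    return int(row[0] > 20.0)

data_rng = random.Random(1)
X = [[data_rng.uniform(0, 45), data_rng.uniform(0.05, 0.10)] for _ in range(200)]  # [slope, PGA]
y = [toy_model(row) for row in X]

imp_slope, imp_pga = permutation_importance(toy_model, X, y)
print(round(imp_slope, 3), imp_pga)
```

Because the stand-in rule never reads the second column, shuffling it leaves accuracy unchanged and its importance is zero, mirroring the near-zero weight reported for PGA.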
To evaluate the importance of the influencing factors more comprehensively, we further employed the SHAP interpreter. SHAP values provide a more intuitive representation of the contribution of each feature to the model’s output (Lundberg and Lee, 2017; Xiao et al., 2024). Figure 24 illustrates the SHAP-based feature importance ranking, sorted by the mean absolute SHAP value of each feature. Importance decreases from top to bottom, each point represents a sample, and the color gradient from red to blue indicates feature values from highest to lowest. The analysis indicates that slope, TWI, average precipitation, and elevation are significant factors contributing to landslide occurrence. Slope exhibits a significant positive correlation with landslide occurrence: the steeper the slope, the more prone it is to landslides. TWI exhibits a certain negative correlation with landslide occurrence, with lower TWI values contributing more to landslide initiation. Meanwhile, distance to rivers, distance to faults, and PGA remain factors of lower importance.
A comparison of the two importance analysis methods shows that slope, TWI, elevation, and average precipitation are the dominant factors influencing landslide occurrence in the study area. However, discrepancies exist in the importance rankings of certain factors, such as slope, TPI, land cover type, and vegetation cover. This is primarily due to the different mechanisms behind the two methods: the random forest algorithm assesses feature importance based on Gini impurity or permutation importance, while SHAP is based on Shapley values, which account for the marginal contribution of each feature across all possible combinations. SHAP is more sensitive to data distribution and variations in model predictions, and it can capture complex interactions between features. In contrast, random forest provides a relatively stable importance ranking but does not explicitly consider feature interactions. Despite these local ranking differences, both methods consistently identify the key triggering factors of landslides in the study area, offering theoretical support for accurately identifying high-risk zones and formulating differentiated prevention and mitigation strategies.
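The Shapley mechanism described above can be made concrete with a small Python sketch that enumerates every feature subset. It assumes a simplified value function in which features outside a subset are replaced by a single baseline value; the two-factor toy model with an interaction term, the baseline, and the function names are hypothetical. Tree-based explainers reach such values far more efficiently than this brute-force enumeration, which is exponential in the number of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))   # marginal contribution
        phi.append(total)
    return phi

# Toy susceptibility score (assumption) with a slope-TWI interaction term.
def toy_predict(z):
    slope, twi = z
    return 0.02 * slope + 0.01 * twi + 0.001 * slope * twi

x = [30.0, 8.0]           # sample to explain: slope 30 degrees, TWI 8
baseline = [10.0, 6.0]    # reference sample
phi = shapley_values(toy_predict, x, baseline)
print([round(p, 3) for p in phi])
```

The two contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the interaction term is shared between the two factors. A split-based importance score does not provide this additivity, which is one reason the two rankings can differ.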
4.5 Local analysis of landslide susceptibility in representative sections
The Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway crosses two provinces, Hunan and Guangdong, encompassing 21 cities. Among them, Qingyuan City hosts the largest number of large landslide relics, accounting for 34.08% of the total, followed by Guangzhou City with 19.50%. Based on the landslide susceptibility assessment results, this study selects three typical segments along the railway corridor for focused analysis: the Chenzhou urban area, Lechang City in Shaoguan, and Yingde City in Qingyuan. These sections are not only high and very high susceptibility areas directly traversed by the railway, but are also representative in terms of geological structure, precipitation distribution, and landslide density. The Chenzhou section is characterized by weak rock layers and fault zones, the Lechang section by hard rocks and deeply incised river valleys, and the Yingde section by karst landforms and dense faults. In addition, precipitation differs among the three areas, with the Yingde section having the highest annual precipitation, and all three are located within high-density landslide areas identified in this survey. Through SHAP value interpretation, the dominant factors and their mechanisms driving landslide formation in each segment are revealed, embodying a hierarchical assessment strategy of “global modeling–localized interpretation.”
As shown in Figure 25, the very high and high susceptibility zones in the urban area of Chenzhou cover approximately 598.03 km², accounting for 27.66% of the total area. These zones are mainly distributed in Feitianshan Town, Bailutang Town, Wugaishan Town, Aoshang Town, and Yangtianhu Yao Ethnic Township. Because the number of samples in this section is relatively limited, direct factor contribution analysis may lead to unstable results. To address this issue, a bootstrap sampling strategy was adopted, performing 100 resampling iterations with replacement. In each iteration, the SHAP method was applied to extract feature contributions, and the mean contribution values and standard deviations were calculated to derive a robust factor ranking and corresponding plots. This approach improves the reliability and interpretability of factor diagnostics under small-sample conditions. The results indicate that slope (0.151), Topographic Wetness Index (TWI, 0.062), and elevation (0.041) are the primary controlling factors for landslide susceptibility in this area, followed by curvature (0.031) and average precipitation (0.026) (Figure 26). The region features complex geological structures with developed faults and severely weathered strata, including weak rock types such as mudstone and shale, which easily soften and destabilize under rainfall. Frequent mining activities further weaken slope stability. Moreover, the area experiences abundant and concentrated rainfall, with frequent heavy storms. Combined with the steep terrain, these conditions readily trigger landslide disasters.
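The bootstrap procedure can be sketched in Python as follows. The sketch assumes that SHAP values have already been computed for the samples of a section and that only the rows of this matrix are resampled; whether the study instead recomputed SHAP values in each iteration is not specified here. The synthetic three-factor matrix, the seed, and the function name are hypothetical.

```python
import random
from statistics import mean, stdev

def bootstrap_importance(shap_rows, n_iter=100, seed=0):
    """Mean and standard deviation of mean |SHAP| per feature over bootstrap resamples."""
    rng = random.Random(seed)
    n, m = len(shap_rows), len(shap_rows[0])
    per_iter = []
    for _ in range(n_iter):
        sample = [shap_rows[rng.randrange(n)] for _ in range(n)]   # draw with replacement
        per_iter.append([mean([abs(r[j]) for r in sample]) for j in range(m)])
    return [
        (mean([it[j] for it in per_iter]), stdev([it[j] for it in per_iter]))
        for j in range(m)
    ]

# Synthetic SHAP matrix (hypothetical) for 40 samples and three factors.
names = ["slope", "TWI", "elevation"]
shap_rows = []
for k in range(40):
    sign = 1 if k % 2 == 0 else -1
    scale = 0.5 + (k % 10) / 10
    shap_rows.append([sign * 0.16 * scale, -sign * 0.065 * scale, 0.04 * scale])

summary = bootstrap_importance(shap_rows)
ranking = [name for name, _ in sorted(zip(names, summary), key=lambda t: -t[1][0])]
print(ranking)
```

Reporting the standard deviation next to the mean indicates whether two adjacent factors in the ranking are separated by more than resampling noise.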
Lechang City in Shaoguan is the area with the highest concentration of very high and high landslide susceptibility zones along the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway. As shown in Figure 27, the areas and proportions of the landslide susceptibility zones in this city are as follows: very high, 324.88 km² (13.40%); high, 629.52 km² (25.97%); moderate, 598.31 km² (24.69%); low, 491.61 km² (20.28%); and very low, 379.43 km² (15.65%). The very high and high susceptibility zones are mainly concentrated in the northern towns of Huangpu, Baishi, Liangjiang, and Jiufeng, and in the central town of Dayuan.
The SHAP contribution ranking is: slope (0.138), curvature (0.041), TWI (0.038), elevation (0.036), and distance to fault (0.025) (Figure 28). This area lies in mid-to-low mountainous terrain with dramatic topographic variation. The dominant lithologies are intrusive rocks and carbonate rocks, with thick, loose weathered layers. The railway often crosses valley sections where surface and groundwater erosion weakens and destabilizes slope toes. Additionally, concentrated rainfall, combined with human activities such as transportation engineering construction, is a key trigger of landslides here.
From the landslide susceptibility results in Yingde City, Qingyuan, the areas of very high and high susceptibility zones are 600.13 km² and 965.24 km², respectively, together accounting for 27.76% of the total area. These zones are mainly concentrated in Shigutang Town, Hengshitang Town, Shakou Town, Wangbu Town, Donghua Town, and Dawan Town (Figure 29).
Figure 30 presents the SHAP analysis results for Yingde City, showing that slope (0.161), TWI (0.055), elevation (0.047), curvature (0.033), and average precipitation (0.029) are the dominant factors influencing landslide susceptibility. This area has strong surface water infiltration capacity, so heavy rainfall can rapidly increase subsurface pore water pressure and trigger landslides. Meanwhile, fractured rock masses and developed faults increase the potential for slope instability. The average annual precipitation reaches 1,906.2 mm, mainly concentrated during the flood season from April to September, further increasing the frequency and intensity of landslide occurrences.
5 Discussion
5.1 Spatial heterogeneity of landslides and segmental differences
This study evaluated landslide susceptibility along the Zhuzhou–Guangzhou segment of the Beijing–Guangzhou Railway, achieving high overall model performance (AUC = 0.88). However, the spatial distribution of landslide susceptibility exhibits significant heterogeneity. High susceptibility zones are primarily concentrated in specific areas characterized by fragmented terrain, intense rainfall, and fractured lithology, such as the urban district of Chenzhou, Lechang City in Shaoguan, and Yingde City in Qingyuan. Local SHAP value analysis reveals that while slope remains the dominant controlling factor across all three regions, the ranking of other contributing factors differs markedly, reflecting the region-specific characteristics of landslide triggers.
In the urban district of Chenzhou, the top five controlling factors are slope (0.151), TWI (0.062), elevation (0.041), curvature (0.031), and average precipitation (0.026). The region has a complex and fragmented geological structure with well-developed faults and intense weathering of strata. The widespread distribution of weak rock layers (such as mudstone and shale) is a key intrinsic factor contributing to the region’s high susceptibility to landslides. These rock layers have low cohesion and poor permeability; after rainfall infiltration, pore water pressure rises rapidly, and the rock mass softens upon contact with water, resulting in a sharp loss of internal friction angle and cohesion, thus creating favorable material sources and mechanical conditions for landslides. Furthermore, extensive mining activities in the area cause significant disturbance to slope structures (Xia, 2008). The importance of TWI and precipitation highlights the key role of rainfall-induced pore water pressure increases in destabilizing weak rock layers. The combined effects of tectonic environment, climatic conditions, and anthropogenic engineering disturbances constitute the fundamental causes of frequent landslides in this area.
The landslide mechanisms in Lechang City, Shaoguan, are primarily driven by the combined effects of topography and rainfall. The area exposes hard intrusive, metamorphic, and carbonate rocks, which, although possessing high strength, have developed joints and fissures due to long-term weathering and tectonic activity. This significantly reduces the overall shear strength of the rock mass and increases its permeability. Residual slope deposits, reaching thicknesses of 5–30 m, further weaken slope stability (Zhang and Zhang, 2014). Engineering disturbances, such as railway construction, have caused stress redistribution within the slope mass, exacerbating instability. Continuous scouring of the slope toe by the Wujiang River further weakens slope support. Intense and concentrated rainfall frequently triggers slope failures. The landslide causation in Lechang exhibits a multifactorial overlay, indicating that mitigation strategies should comprehensively address geological, engineering, and hydrological factors, with particular emphasis on toe reinforcement and hydrological management.
The distribution of landslides in Yingde City, Qingyuan, is controlled by the combined effects of steep topography, intense weathering profiles, and valley incision. The area features karst landforms and a dense fault zone, which lead to rapid increases in pore water pressure. Coupled with concentrated rainfall, this results in a typical causative pattern characterized by steep slopes, heavy rainfall, and fractured rock masses (Zhao, 2018). The coupled influence of hydrogeological processes and tectonic conditions significantly increases landslide susceptibility in the area, highlighting the need for enhanced hydrological monitoring and identification of structurally weak zones.
Based on the observed spatial heterogeneity of landslide susceptibility, differentiated prevention and mitigation strategies should be adopted for different high-risk segments. In extremely high susceptibility zones—such as Huangpu Town in Lechang City and Feitianshan Town in Suxian District—automated monitoring equipment, including inclinometers and rain gauges, should be deployed to establish a threshold-based early warning system tailored to dominant triggering factors (Reichenbach et al., 2018). In areas with significant anthropogenic disturbance, such as mining zones in Chenzhou, priority should be given to composite retaining structures using rock bolts and lattice beams (Dai et al., 2002). For rainfall-sensitive areas like Yingde City, subsurface drainage systems are recommended. Additionally, community emergency preparedness should be strengthened through multiple communication channels such as television, radio, and the internet (Petley, 2012; Han and Wu, 2024). To enhance monitoring capabilities, a space-air-ground integrated system combining UAV-based LiDAR and InSAR technologies should be implemented (Jaboyedoff et al., 2012; Feng et al., 2024a; Feng et al., 2024b). Such region-specific mitigation strategies will provide strong technical support for the long-term safe operation of the Beijing–Guangzhou railway corridor.
5.2 Refined evaluation framework: Applicability and innovation
In response to the practical demands for high-precision hazard identification and prevention in railway corridor projects, this study proposes and develops a refined landslide susceptibility evaluation framework based on a “global modeling–local refinement” approach. At the macro scale, the framework employs a Random Forest model to comprehensively explore the nonlinear coupling relationships between landslide occurrence and 14 influencing factors, including topography, geology, and climate. The model demonstrates strong performance, with an AUC of 0.88, high classification accuracy, as well as robust generalization capability.
Unlike black-box models such as deep learning, RF not only exhibits higher computational efficiency but also retains interpretability (Merghadi et al., 2020). Moreover, through feature randomization and ensemble mechanisms, RF effectively reduces overfitting risks, making it well-suited for complex geological environments with multidimensional and heterogeneous input features.
Multiple studies support RF’s superiority in landslide susceptibility assessment. For instance, Kavzoglu and Teke compared RF, XGBoost, and NGBoost, finding that these ensemble methods outperform traditional single models in both training and testing phases, demonstrating stronger generalization capabilities (Kavzoglu and Teke, 2022). Similarly, a study by Chauhan et al. in the Indian Himalayas confirmed that RF and XGBoost not only provide accurate predictions but also offer valuable references for land-use planning and disaster management (Chauhan et al., 2025). Additionally, researchers have noted that RF’s feature randomization and majority voting mechanisms help prevent overfitting while enhancing generalization, particularly for high-dimensional geological data (Luu et al., 2024; Tun et al., 2024).
To improve model interpretability, this study incorporates the SHAP method based on the global Random Forest model, enabling quantitative contribution analysis at the factor level. Compared with traditional feature importance ranking methods, SHAP not only quantifies the global importance of each influencing factor but also reveals local effects at the individual sample level (Lundberg and Lee, 2017).
In recent years, SHAP has been widely applied in geological hazard risk assessments, including landslides and floods, significantly improving model transparency and reliability (Zeng et al., 2024; Choubin et al., 2025). In this study, SHAP analysis elucidates the spatial heterogeneity and nonlinear influences of key factors such as slope gradient, TWI, and average precipitation, providing a scientific basis for precise disaster prevention and mitigation strategies.
Traditional landslide susceptibility studies have mostly focused on uniform regional-scale modeling, making it difficult to fully reflect the complex spatial heterogeneity and differences in disaster mechanisms along linear engineering corridors (Jebur et al., 2014; Hong et al., 2015; Ali et al., 2022; Liu et al., 2022; Liu et al., 2024; Wang et al., 2024a). To overcome these limitations, this study innovatively introduces a “local refinement” module in the refined evaluation framework, selecting three high-risk typical sections in Qingyuan, Chenzhou, and Shaoguan for comparative analysis of fine-scale factor distribution characteristics and disaster mechanisms. The results show significant differences among the sections in terms of geomorphic types, lithological combinations, and external disturbance intensity. The ranking and contribution of dominant controlling factors exhibit distinct regional characteristics, revealing the spatial heterogeneity of landslide-prone environments.
Based on the analysis of differences in typical sections, this study further examines the model’s cross-regional applicability from the perspective of performance evaluation. To preliminarily verify the transferability of the proposed framework, a comparative analysis was conducted using random forest model results reported in previous studies from different regions. For instance, Hong et al. applied the random forest model in Wuyuan, Jiangxi Province, achieving an AUC of approximately 0.86 (Hong et al., 2017b); Gupta et al. obtained comparable accuracy (AUC = 0.876) along the NH-7 highway in the Himalayan region of India (Gupta et al., 2025); and Akinci et al. reported an AUC of about 0.87 in the Artvin coastal area of Turkey (Akinci et al., 2020). These comparable results suggest that, despite substantial variations in geological structures and environmental conditions, the random forest model maintains high predictive capability. This suggests that the proposed framework is robust and applicable across regions. Further pilot verification will be carried out along railway and ultra-high-voltage transmission corridors with diverse geological settings to quantitatively assess its potential for cross-regional deployment.
5.3 Limitations of methods and data
Although the landslide susceptibility assessment framework developed in this study achieves good spatial accuracy and model interpretability, certain methodological and data limitations remain. First, regarding data accuracy, this study primarily utilizes 30-m resolution DEM and remote sensing data for factor extraction. While sufficient for regional-scale analysis, the relatively coarse spatial resolution may obscure key geomorphological features in areas with significant local elevation variations or complex geological structures, limiting the model’s sensitivity to microtopographic changes. Therefore, future research should consider incorporating higher-resolution data sources (such as 10-m or finer DEMs, LiDAR imagery) to more precisely characterize landslide-prone environments. Meanwhile, this study’s landslide inventory derived from remote sensing interpretation primarily focuses on large landslide remnants (≥0.05 km²). While this strategy ensures the saliency of the samples at the regional scale and the stability of model training, it may not fully capture the distribution patterns of small and shallow landslides that also pose threats to railway operations, thereby introducing potential biases into the detailed assessment of local risks along railway lines.
Regarding model adaptability, while random forest models possess strong generalization ability and interpretability, they are essentially static and cannot fully capture the dynamic triggering processes of landslides. External triggers such as extreme rainfall or earthquake–rainfall coupling can significantly alter surface hydrological conditions and the mechanical state of soil and rock, thereby affecting slope stability and model predictions. Future research could investigate the model’s sensitivity and stability under various external disturbances through numerical simulations or time-series modeling, gradually constructing dynamic, time-series susceptibility models capable of responding to environmental changes in real time. At the same time, the limited sample size in some typical areas may reduce the generalization ability of traditional statistical models. To address this, the present study introduces a bootstrap analysis strategy to enhance the interpretative stability in small-sample regions, although this method still has inherent limitations in statistical sampling.
In terms of factor construction and data integration, this study considers 14 types of factors, encompassing topography, geology, and climate. While these factors cover the main disaster-causing environments, the quantification of human activities—such as railway construction, slope reinforcement, and mining disturbances—remains limited, which may lead to an underestimation of local landslide risks. Moreover, the data sources are primarily static historical remote sensing and environmental datasets, without incorporating dynamic monitoring information such as real-time rainfall or groundwater fluctuations. The use of historical landslide relics may also introduce temporal biases, including changes in land use, vegetation cover, or local climate conditions since the landslide events, potentially affecting the accuracy of susceptibility analysis. Future work could integrate multi-source data (e.g., radar, InSAR, groundwater monitoring) and, building upon the existing RF–SHAP framework, develop time-series dynamic landslide susceptibility models to enable real-time updates and dynamic early warning. Additionally, given sufficient data availability, more specific railway engineering disturbance indicators could be incorporated to improve the targeting and predictive accuracy of landslide risk assessment along railway corridors.
6 Conclusion
This study focuses on the Zhuzhou–Guangzhou section of the Beijing–Guangzhou Railway and analyzes the spatial distribution patterns of landslides based on 713 large-scale landslide relics and 14 environmental factors (topography, geology, and hydrology). A landslide susceptibility assessment framework was constructed, integrating the random forest model with the SHAP interpretability method. The study proposes a multi-scale collaborative approach of “global modeling and local analysis,” enabling systematic evaluation from regional modeling to the diagnosis of key controlling factors in representative sections.
Spatial analysis shows that landslides are primarily distributed at elevations of 188–752 m, on slopes of 15°–30°, and at curvatures of −0.35–0.22. Landslides are slightly more common on northeast- and southwest-facing slopes. The highest number and area of landslides occur on steeper slopes. Landslides occur predominantly in Jurassic strata and areas of high precipitation. FVC and proximity to rivers also significantly influence their distribution. The highest landslide density is within 0.2–0.5 km of the railway, reflecting localized disturbance of slope stability by railway construction.
The susceptibility model performed well (AUC = 0.88). At the global scale, high and very high susceptibility areas are concentrated in the central and southern parts of the study area, especially in Suxian District of Chenzhou City and the northern part of Lechang City, Shaoguan. Global factor importance analysis indicates that slope, TWI, elevation, and average precipitation are the primary controlling factors for landslide occurrence in the study area. Localized SHAP interpretation of typical high-risk segments (e.g., Chenzhou, Shaoguan, Qingyuan) shows that, although the dominant factors are broadly similar, their contribution rankings differ, reflecting the spatial heterogeneity and complexity of landslide triggering mechanisms along the railway corridor. For example, landslides in Suxian District are mainly influenced by the combined effects of mining disturbance and rainfall; the Lechang segment in Shaoguan requires particular attention to the interaction between engineering disturbance and hydrological factors; and the Yingde segment in Qingyuan is predominantly controlled by tectonic and hydrological interactions. These findings provide data support and mechanistic insight for segment-specific landslide risk identification and differentiated prevention strategies along the railway.
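To make the distinction between global and local factor rankings concrete, the following minimal sketch computes exact Shapley values by brute-force enumeration for a toy scoring function and ranks factors by mean absolute contribution, first over all samples and then within each section. It is an illustration under stated assumptions, not the study's implementation: `toy_model` is a hypothetical stand-in for the trained random forest, the feature values and the all-zero baseline are invented, and the enumeration replaces the tree-based estimator that a SHAP library would normally apply to a random forest.

```python
from itertools import combinations
from math import factorial

FEATURES = ["slope", "twi", "elevation"]


def toy_model(x):
    """Hypothetical susceptibility score; stands in for a trained random forest."""
    return 0.5 * x["slope"] + 0.3 * x["twi"] + 0.2 * x["slope"] * x["elevation"]


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(FEATURES)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def ranking(samples, baseline):
    """Rank features by mean absolute Shapley value over the given samples."""
    mean_abs = {f: 0.0 for f in FEATURES}
    for x in samples:
        for f, v in shapley_values(toy_model, x, baseline).items():
            mean_abs[f] += abs(v) / len(samples)
    return sorted(FEATURES, key=lambda f: mean_abs[f], reverse=True)


# Invented, normalized feature values for two hypothetical sections.
baseline = {"slope": 0.0, "twi": 0.0, "elevation": 0.0}
section_a = [{"slope": 1.0, "twi": 0.2, "elevation": 0.1},
             {"slope": 0.8, "twi": 0.3, "elevation": 0.2}]
section_b = [{"slope": 0.1, "twi": 1.0, "elevation": 0.5},
             {"slope": 0.2, "twi": 0.9, "elevation": 0.4}]

print("global   :", ranking(section_a + section_b, baseline))
print("section A:", ranking(section_a, baseline))
print("section B:", ranking(section_b, baseline))
```

With these invented inputs the global ranking places slope first, whereas section B is led by TWI, which mirrors the kind of section-level reordering described above. Brute-force enumeration scales exponentially with the number of factors, so it is practical only for small illustrations of this kind.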
This framework improves the adaptability of traditional models in complex geological settings and demonstrates an integrated analytical approach that moves from regional modeling to detailed local diagnosis, with strong engineering applicability and potential for wider adoption. Future work may integrate remote sensing monitoring and measured rainfall data to enable dynamic model updating and real-time early warning, further supporting targeted landslide prevention and control along the railway.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author contributions
HL: Methodology, Investigation, Supervision, Conceptualization, Visualization, Formal Analysis, Writing – original draft. CX: Conceptualization, Methodology, Investigation, Writing – review and editing, Supervision, Funding acquisition. LF: Conceptualization, Writing – review and editing. PW: Writing – review and editing, Data curation. JS: Writing – review and editing, Data curation. XZ: Writing – review and editing, Data curation. JW: Project administration, Writing – review and editing. QS: Project administration, Writing – review and editing. KL: Methodology, Writing – review and editing.
Funding
The authors declare that financial support was received for the research and/or publication of this article. This work was supported by a grant from Chongqing Water Resources Bureau, China (Project No. CQS24C00836), Research Institute of China Southern Power Grid Co., Ltd. [1500002024030103SJ00003 (CG1500062001647685-001)], Research Institute of China Southern Power Grid Co., Ltd. [1500002024030103SJ00009 (CG1500062001634723-001)], and Key Project of China Railway Design Corporation (Project No. 2023A0226409).
Conflict of interest
Author PW was employed by Beijing Engineering Corporation Limited. Author JW was employed by China Railway Xi'an Group Company Limited. Author QS was employed by China Railway Design Corporation.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors declare that this study received funding from China Southern Power Grid Co., Ltd. and China Railway Design Corporation. The funders had the following involvement in the study: provision of essential research data.
The author CX declared that they were an editorial board member of Frontiers at the time of submission. This had no impact on the peer review process and the final decision.
Generative AI statement
The authors declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Akinci, H., Kilicoglu, C., and Dogan, S. (2020). Random forest-based landslide susceptibility mapping in coastal regions of Artvin, Turkey. ISPRS Int. J. Geo Information 9 (9), 553. doi:10.3390/ijgi9090553
Aleotti, P., and Chowdhury, R. (1999). Landslide hazard assessment: summary review and new perspectives. Bull. Eng. Geol. Environment 58, 21–44. doi:10.1007/s100640050066
Ali, S. A., Parvin, F., Pham, Q. B., Khedher, K. M., Dehbozorgi, M., Rabby, Y. W., et al. (2022). An ensemble random forest tree with SVM, ANN, NBT, and LMT for landslide susceptibility mapping in the rangit river watershed, India. Nat. Hazards 113 (3), 1601–1633. doi:10.1007/s11069-022-05360-5
Bai, S., Lü, G., Wang, J., Zhou, P., and Ding, L. (2011). GIS-Based rare events logistic regression for landslide-susceptibility mapping of lianyungang, China. Environ. Earth Sci. 62, 139–149. doi:10.1007/s12665-010-0509-3
Brabb, E. E. (1984). Innovative approaches to landslide hazard and risk mapping. Tokyo, Japan: Japan Landslide Society.
Bureau, G. R. S. A. A. (2020). Announcement on the investigation findings of the “March 30” major railway accident involving the derailment of passenger train T179 on the Beijing-Guangzhou railway line. Available online at: http://www.nra.gov.cn/zzjg/jgj/gzgl/gsgz/202004/t20200430_337548.shtml.
Chauhan, V., Gupta, L., and Dixit, J. (2025). Landslide susceptibility assessment for Uttarakhand, a Himalayan state of India, using multi-criteria decision making, bivariate, and machine learning models. Geoenvironmental Disasters 12 (1), 2. doi:10.1186/s40677-024-00307-3
Chen, C., and Fan, L. (2024). Interpretability of statistical, machine learning, and deep learning models for landslide susceptibility mapping in three gorges Reservoir area. arXiv Preprint arXiv:2405.11762. doi:10.48550/arXiv.2405.11762
Choubin, B., Jaafari, A., Henareh, J., Karimi, O., and Hosseini, F. S. (2025). Explainable artificial intelligence (XAI) for interpreting predictive models and key variables in flood susceptibility. Results Eng. 27, 105976. doi:10.1016/j.rineng.2025.105976
Corominas, J., van Westen, C., Frattini, P., Cascini, L., Malet, J.-P., Fotopoulou, S., et al. (2014). Recommendations for the quantitative analysis of landslide risk. Bull. Engineering Geology Environment 73, 209–263. doi:10.1007/s10064-013-0538-8
Cruden, D. (1991). A simple definition of a landslide. Bull. Eng. Geol. Environ. 43 (1), 27–29. doi:10.1007/bf02590167
Cui, Y., Hu, J., Xu, C., Miao, H., and Zheng, J. (2022). Landslides triggered by the 1970 Ms 7.7 Tonghai earthquake in Yunnan, China: an inventory, distribution characteristics, and tectonic significance. J. Mt. Sci. 19 (6), 1633–1649. doi:10.1007/s11629-022-7321-x
Dai, F., Lee, C. F., and Ngai, Y. Y. (2002). Landslide risk assessment and management: an overview. Eng. Geology 64 (1), 65–87. doi:10.1016/s0013-7952(01)00093-x
Dey, H., Haque, M. M., Shao, W., VanDyke, M., and Hao, F. (2024). Simulating flood risk in Tampa Bay using a machine learning driven approach. NPJ Nat. Hazards 1 (1), 40. doi:10.1038/s44304-024-00045-4
Fang, R., Liu, Y., and Huang, Z. (2021). A review of the methods of regional landslide hazard assessmentbased on machine learning. Chin. J. Geol. Hazard Control 32 (4), 1–8. doi:10.16031/j.cnki.issn.1003-8035.2021.04-01
Feng, L., Qi, W., Xu, C., Yang, W., Yang, Z., Xiao, Z., et al. (2024a). Landslide research from the perspectives of Qinling mountains in China: a critical review. J. Earth Sci. 35 (5), 1546–1567. doi:10.1007/s12583-023-1935-9
Feng, L., Xu, C., Tian, Y., Li, L., Sun, J., Huang, Y., et al. (2024b). Landslides of China's qinling. Geoscience Data J. 11 (4), 725–741. doi:10.1002/gdj3.246
Fick, S. E., and Hijmans, R. J. (2017). WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. Int. Journal Climatology 37 (12), 4302–4315. doi:10.1002/joc.5086
Fisher, A., Rudin, C., and Dominici, F. (2019). All models are wrong, but many are useful: learning a variable's importance by studying an entire class of prediction models simultaneously. J. Mach. Learn. Res. 20 (177), 177–181. Available online at: http://jmlr.org/papers/v20/18-760.html.
Gao, H., Xu, C., Xie, C., Ma, J., and Xiao, Z. (2024). Landslides triggered by the July 2023 extreme rainstorm in the Haihe River Basin, China. Berlin, Germany: Springer.
Guerriero, L., Prinzi, E. P., Calcaterra, D., Ciarcia, S., Di Martire, D., Guadagno, F. M., et al. (2021). Kinematics and geologic control of the deep-seated landslide affecting the historic center of Buonalbergo, southern Italy. Geomorphology 394, 107961. doi:10.1016/j.geomorph.2021.107961
Guo, F., Lai, P., Huang, F., Liu, L., Wang, X., and He, Z. (2024). Literature review and research progress of LandslideSusceptibility mapping based on knowledge graph. Earth Sci. 49 (5), 1584–1606. doi:10.3799/dqkx.2023.058
Gupta, K., Yunus, A. P., Siddique, T., and Ahamad, A. (2025). Landslide susceptibility along National Highway-7 in the Himalayas using random forest-based machine learning tool. J. Earth Syst. Sci. 134 (2), 74. doi:10.1007/s12040-025-02533-1
Habumugisha, J. M., Chen, N., Rahman, M., Islam, M. M., Ahmad, H., Elbeltagi, A., et al. (2022). Landslide susceptibility mapping with deep learning algorithms. Sustainability 14 (3), 1734. doi:10.3390/su14031734
Halder, K., Srivastava, A. K., Ghosh, A., Das, S., Banerjee, S., Pal, S. C., et al. (2025). Improving landslide susceptibility prediction through ensemble recursive feature elimination and meta-learning framework. Sci. Rep. 15 (1), 5170. doi:10.1038/s41598-025-87587-3
Han, Z., and Wu, G. (2024). Why do people not prepare for disasters? A national survey from China. Npj Nat. Hazards 1 (1), 1. doi:10.1038/s44304-024-00001-2
He, L., Wu, X., He, Z., Xue, D., Luo, F., Bai, W., et al. (2023). Susceptibility assessment of landslides in the Loess plateau based on machine learning models: a case study of xining city. Sustainability 15 (20), 14761. doi:10.3390/su152014761
Hong, H., Pradhan, B., Xu, C., and Bui, D. T. (2015). Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 133, 266–281. doi:10.1016/j.catena.2015.05.019
Hong, H., Chen, W., Xu, C., Youssef, A. M., Pradhan, B., and Tien Bui, D. (2017a). Rainfall-induced landslide susceptibility assessment at the Chongren area (China) using frequency ratio, certainty factor, and index of entropy. Geocarto International 32 (2), 1–16. doi:10.1080/10106049.2015.1130086
Hong, H., Tsangaratos, P., Ilia, I., Chen, W., and Xu, C. (2017b). Comparing the performance of a logistic regression and a random forest model in landslide susceptibility assessments. The Case of Wuyaun Area, China. In: Advancing culture of living with landslides: volume 2 advances in landslide science. Cham: Springer), 1043–1050.
Huang, W. (2023). Landslide susceptibility assessment in large range basedon deep learning: a case Study of the Qinghai-TibetPlateau transportation corridor. Xi'an, China: Master, Chang’an University.
Huang, Y., Xu, C., Li, L., He, X., Cheng, J., Xu, X., et al. (2022). Inventory and spatial distribution of ancient landslides in Hualong County, China. Land 12 (1), 136. doi:10.3390/land12010136
Huang, Y., Xu, C., He, X., Cheng, J., Huang, Y., Wu, L., et al. (2024). Distribution characteristics and cumulative effects of landslides triggered by multiple moderate-magnitude earthquakes: a case study of the comprehensive seismic impact area in Yibin, Sichuan, China. Landslides 21 (12), 2927–2943. doi:10.1007/s10346-024-02351-4
Huang, Y., Xu, C., He, X., Cheng, J., Xu, X., and Tian, Y. (2025). Landslides induced by the 2023 Jishishan Ms6. 2 earthquake (NW China): spatial distribution characteristics and implication for the seismogenic fault. Npj Nat. Hazards 2 (1), 14. doi:10.1038/s44304-025-00064-9
Hungr, O., Leroueil, S., and Picarelli, L. (2014). The varnes classification of landslide types, an update. Landslides 11, 167–194. doi:10.1007/s10346-013-0436-y
Jaboyedoff, M., Oppikofer, T., Abellán, A., Derron, M.-H., Loye, A., Metzger, R., et al. (2012). Use of LIDAR in landslide investigations: a review. Nat. Hazards 61, 5–28. doi:10.1007/s11069-010-9634-2
Jarvis, A., Guevara, E., Reuter, H., and Nelson, A. (2008). Hole-filled SRTM for the globe: version 4: data grid. France: CGIAR Consortium for Spatial Information.
Jebur, M. N., Pradhan, B., and Tehrany, M. S. (2014). Optimization of landslide conditioning factors using very high-resolution airborne laser scanning (LiDAR) data at catchment scale. Remote Sens. Environ. 152, 150–165. doi:10.1016/j.rse.2014.05.013
Kamp, U., Growley, B. J., Khattak, G. A., and Owen, L. A. (2008). GIS-based landslide susceptibility mapping for the 2005 Kashmir earthquake region. Geomorphology 101 (4), 631–642. doi:10.1016/j.geomorph.2008.03.003
Kavzoglu, T., and Teke, A. (2022). Predictive performances of ensemble machine learning algorithms in landslide susceptibility mapping using random forest, extreme gradient boosting (XGBoost) and natural gradient boosting (NGBoost). Arabian J. Sci. Eng. 47 (6), 7367–7385. doi:10.1007/s13369-022-06560-8
Kouhartsiouk, D., and Perdikou, S. (2021). The application of DInSAR and Bayesian statistics for the assessment of landslide susceptibility. Nat. Hazards 105 (3), 2957–2985. doi:10.1007/s11069-020-04433-7
Kubwimana, D., Brahim, L. A., and Abdelouafi, A. (2021). A new approach in the development and analysis of the landslide susceptibility map of the hillslopes of Bujumbura, Burundi. Eureka Phys. Eng., 26–34. doi:10.21303/2461-4262.2021.001724
Li, L., Xu, C., Xu, X., Zhang, Z., and Cheng, J. (2021). Inventory and distribution characteristics of large-scale landslides in Baoji city, Shaanxi province, China. ISPRS Int. J. Geo Information 11 (1), 10. doi:10.3390/ijgi11010010
Li, T., Xu, C., Li, L., and Xu, J. (2024). The landslide traces inventory in the transition zone between the Qinghai-Tibet Plateau and the Loess Plateau: a case study of Jianzha County, China. Front. Earth Sci. 12, 1370992. doi:10.3389/feart.2024.1370992
Liu, F., Wang, L., Xiao, D., and Wang, J. (2021). Evaluation of landslide susceptibility in Ningnan County basedon fuzzy comprehensive evaluation. J. Nat. Disasters 30 (5), 237–246. doi:10.13577/j.jnd.2021.0523
Liu, R., Yang, X., Xu, C., Wei, L., and Zeng, X. (2022). Comparative study of convolutional neural network and conventional machine learning methods for landslide susceptibility mapping. Remote Sens. 14 (2), 321. doi:10.3390/rs14020321
Liu, Y., Xu, S., Liu, C., and Ma, Y. (2024). Landslide susceptibility assessment considering multi-method integrated feature selection and negative sample optimization. Bull. Surveying Mapping (9), 74. doi:10.13474/j.cnki.11-2246.2024.0914
Lundberg, S. M., and Lee, S.-I. (2017). A unified approach to interpreting model predictions. Adv. Neural Information Processing Systems, 30.
Luu, C., Ha, H., Tran, X. T., Vu, T. H., and Bui, Q. D. (2024). Landslide susceptibility and building exposure assessment using machine learning models and geospatial analysis techniques. Adv. Space Res. 74 (11), 5489–5513. doi:10.1016/j.asr.2024.08.046
Merghadi, A., Yunus, A. P., Dou, J., Whiteley, J., ThaiPham, B., Bui, D. T., et al. (2020). Machine learning methods for landslide susceptibility studies: a comparative overview of algorithm performance. Earth Science Rev. 207, 103225. doi:10.1016/j.earscirev.2020.103225
Petley, D. (2012). Global patterns of loss of life from landslides. Geology 40 (10), 927–930. doi:10.1130/g33217.1
Reichenbach, P., Rossi, M., Malamud, B. D., Mihir, M., and Guzzetti, F. (2018). A review of statistically-based landslide susceptibility models. Earth Science Reviews 180, 60–91. doi:10.1016/j.earscirev.2018.03.001
Sam, J. (2024). The effects of seismic behavior on high ground stress soft rock tunnel: a review. Civ. Eng. J. 10 (9), 3090–3121. doi:10.28991/cej-2024-010-09-020
Susilo, A., Zulaikah, S., Pohan, A. F., Hasan, M. F. R., Hisyam, F., Rohmah, S., et al. (2024). Vulnerability index assessment for mapping ground movements using the microtremor method as geological hazard mitigation. Civ. Eng. J. 10 (5), 1616–1626. doi:10.28991/cej-2024-010-05-017
Tateishi, R., Uriyangqai, B., Al-Bilbisi, H., Ghar, M. A., Tsend-Ayush, J., Kobayashi, T., et al. (2011). Production of global land cover data–GLCNMO. Int. J. Digital Earth 4 (1), 22–49. doi:10.1080/17538941003777521
Torres-Vázquez, M. Á., Herrera, S., Gincheva, A., Halifa-Marín, A., Cavicchia, L., Di Giuseppe, F., et al. (2025). Enhancing seasonal fire predictions with hybrid dynamical and random forest models. NPJ Nat. Hazards 2 (1), 20. doi:10.1038/s44304-025-00069-4
Tun, S. H., Changnv, Z., and Jamil, F. (2024). GIS-based landslide susceptibility assessment using random forest and support vector machine models: a case study of chin state, Myanmar. Acta Geodyn. Geomaterialia 21 (3). doi:10.13168/AGG.2024.0019
Wang, B., Li, S., Xu, W., Yang, Y., and Li, Y. (2024a). A comparative Study of landslide susceptibility evaluation based onThree different machine learning algorithms. Northwest. Geol. 57 (1), 34–43. doi:10.12401/j.nwg.2023033
Wang, J., Zang, M., Xu, C., Liu, T., and Huang, Y. (2024b). Landslide susceptibility assessment following the 2022 Luding earthquake:a coupled analytic hierarchy process and area under the receiver operating characteristic curve algorithm. J. Eng. Geol. 32 (5), 1696–1711. doi:10.13544/j.cnki.jeg.2024-0176
Wang, W., Huang, Y.-d., Xu, C., Shao, X.-y., Li, L., Feng, L.-y., et al. (2024c). Identification and distribution of 13003 landslides in the northwest margin of Qinghai-Tibet Plateau based on human-computer interaction remote sensing interpretation. China Geol. 7 (2), 171–187. doi:10.31035/cg2023140
Wu, X., Xu, X., Yu, G., Ren, J., Yang, X., Chen, G., et al. (2024). The China active faults database (CAFD) and its web system. Earth Syst. Sci. Data 16 (7), 3391–3417. doi:10.5194/essd-16-3391-2024
Xia, L. (2008). Application of RS in mine exploration and Monitor-a case study in Suxian, Chenzhou, Hunan. Beijing: Master China University of Geosciences.
Xiao, X., Zou, Y., Huang, J., Luo, X., Yang, L., Li, M., et al. (2024). An interpretable model for landslide susceptibility assessment based on optuna hyperparameter optimization and Random Forest. Geomatics, Nat. Hazards Risk 15 (1), 2347421. doi:10.1080/19475705.2024.2347421
Xie, C., Xu, C., Huang, Y., Liu, J., Jin, J., Xu, X., et al. (2025). Detailed inventory and initial analysis of landslides triggered by extreme rainfall in the northern Huaiji County, Guangdong Province, China, from June 6 to 9, 2020. Geoenvironmental Disasters 12 (1), 7. doi:10.1186/s40677-025-00311-1
Xu, C., Dai, F., Yao, X., Chen, J., Tu, X., Sun, Y., et al. (2009). GIS-based landslide susceptibility assessment usinganalytical hierarchy process in Wenchuanearthquake region. Chin. J. Rock Mech. Eng., 28. doi:10.3321/j.issn:1000-6915.2009.z2.100
Xu, C., Dai, F., Yao, X., Zhao, Z., and Xiao, J. (2010). GIS platform and certainty factor analysis method based wenchuan earthquake induced landslide susceptibility evaluation. J. Eng. Geol. 18 (1), 15. doi:10.3969/j.issn.1004-9665.2010.01.003
Xu, X., Han, Z., Yang, X., Zhang, S., Yu, G., Zhou, B., et al. (2016). Seismotectonic map in China and its adjacent regions. Beijing: Seismological Press.
Xu, C., Xu, X., Zhou, B., and Shen, L. (2019). Probability of coseismic landslides: a new generation of earthquake-triggered landslide hazardmodel. J. Eng. Geol. 27 (05), 1122–1130. doi:10.13544/j.cnki.jeg.2019084
Xue, Z., Xu, C., Zhang, Z., Feng, L., Li, H., Zhang, H., et al. (2025). Inventory of landslide relics in Zhenxiong County based on human-machine interactive visual interpretation, Yunnan Province, China. Front. Earth Sci. 12, 1518377. doi:10.3389/feart.2024.1518377
Yanfatriani, E., Marzuki, M., Vonnisa, M., Razi, P., Hapsoro, C. A., Ramadhan, R., et al. (2024). Extreme rainfall trends and hydrometeorological disasters in tropical regions: implications for climate resilience. Emerg. Sci. J. 8 (5), 1860–1874. doi:10.28991/esj-2024-08-05-012
Yong, C., Jinlong, D., Fei, G., Bin, T., Tao, Z., Hao, F., et al. (2022). Review of landslide susceptibility assessment based on knowledge mapping. Stoch. Environ. Res. Risk Assess. 36 (9), 2399–2417. doi:10.1007/s00477-021-02165-z
Youssef, A. M., Pourghasemi, H. R., Pourtaghi, Z. S., and Al-Katheeri, M. M. (2016). Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 13, 839–856. doi:10.1007/s10346-015-0614-1
Yu, B., Xing, H., and Yan, J. (2024). Susceptibility assessment of multi-hazards using random forest—back propagation neural network coupling model: a Hangzhou city case study. Sci. Rep. 14 (1), 21783. doi:10.1038/s41598-024-71053-7
Yuan, X., Liu, C., Nie, R., Yang, Z., Li, W., Dai, X., et al. (2022). A comparative analysis of certainty factor-based machine learning methods for collapse and landslide susceptibility mapping in Wenchuan County, China. Remote Sens. 14 (14), 3259. doi:10.3390/rs14143259
Zeng, T., Wang, L., Zhang, Y., Cheng, P., and Wu, F. (2024). Landslide susceptibility modeling and interpretability based on CatBoost-SHAP model. Chin. J. Geol. Hazard Control 35 (1), 37–50. doi:10.16031/j.cnki.issn.1003-8035.202309035
Zhang, H., and Zhang, G. (2014). Distribution and prevention of geo-hazards in Lechang city. J. Geol. Hazards Environ. Preserv. 25 (04), 47–50. doi:10.3969/j.issn.1006-4362.2014.04.008
Zhang, X., Jiang, C., and Luo, M. (2007). Application of ROC analysis in machine learning. Comput. Eng. And Appl. (04), 243–248. doi:10.3321/j.issn:1002-8331.2007.04.074
Zhang, K., Wu, X., Niu, R., Yang, K., and Zhao, L. (2017). The assessment of landslide susceptibility mapping using random forest and decision tree methods in the Three Gorges Reservoir area, China. Environ. Earth Sci. 76, 405–420. doi:10.1007/s12665-017-6731-5
Zhang, W., He, Y., Wang, L., Liu, S., and Meng, X. (2023). Landslide Susceptibility mapping using random forest and extreme gradient boosting: a case study of Fengjie, Chongqing. Geol. J. 58 (6), 2372–2387. doi:10.1002/gj.4683
Zhao, J. (2018). Research on geological disaster development characteristics and Formation conditions in Yingde City,Guangdong Province. Ground Water 40 (03), 108–109. doi:10.3969/j.issn.1004-1184.2018.03.037
Keywords: landslide susceptibility assessment, spatial distribution patterns, random forest, Beijing–Guangzhou Railway, SHAP
Citation: Liu H, Xu C, Feng L, Wang P, Sun J, Zhang X, Wang J, Sun Q and Li K (2026) Spatial distribution patterns and landslide susceptibility analysis from a global–local perspective along the Zhuzhou-Guangzhou section of the Beijing–Guangzhou railway. Front. Earth Sci. 13:1722201. doi: 10.3389/feart.2025.1722201
Received: 10 October 2025; Accepted: 14 November 2025;
Published: 12 January 2026.
Edited by:
Gioacchino Francesco Andriani, University of Bari Aldo Moro, Italy
Reviewed by:
Yang Hailong, Chengdu University of Technology, China
Thapthai Chaithong, Kasetsart University, Thailand
Hooman Mousavi, K. N. Toosi University of Technology, Iran
Copyright © 2026 Liu, Xu, Feng, Wang, Sun, Zhang, Wang, Sun and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chong Xu, xc11111111@126.com
Hanxing Liu1,2