Slope Unit-Based Landslide Susceptibility Mapping Using Certainty Factor, Support Vector Machine, Random Forest, CF-SVM and CF-RF Models

Landslide susceptibility mapping is very important for landslide risk evaluation and land use planning. Toward this end, this paper presents a case study in Ningqiang County, Shaanxi Province, China. Slope units were selected as the basic mapping units. A traditional statistical model, the certainty factor model (CF), two machine learning models, the support vector machine model (SVM) and the random forest model (RF), and two hybrid models, CF-SVM and CF-RF, were applied to analyze landslide susceptibility. First, 10 landslide conditioning factors were selected, namely slope-angle, altitude, slope aspect, degree of relief, lithology, distance to rivers, distance to faults, distance to roads, average annual rainfall and normalized difference vegetation index. A total of 23,169 slope units were generated from a Digital Elevation Model, and the corresponding 10 conditioning factor layers were produced from geological and geographical data. Then, landslide susceptibility mapping was carried out using each of the five models. Next, the landslide density (LD), frequency ratio (FR), the area under the curve (AUC) and other indicators were used to validate the rationality, performance and accuracy of the models. The results showed that the susceptibility maps produced from the different models were all reasonable. In each map, the LD and FR were greatest in the zones classed as having very high landslide susceptibility, followed by the high, moderate, low and very low landslide susceptibility classes, respectively. From the comparison of the different maps and ROC curves, the RF model based on slope units was the most appropriate for landslide susceptibility mapping in the study area. It was also found that combining a weaker learner (the CF model here) with a stronger learner (the SVM and RF models here) can affect the applicability of the stronger model.


INTRODUCTION
As one of Earth's major geological hazards, landslides are widely distributed and occur at a high frequency, causing heavy damage that often leads to huge economic losses and casualties. Landslide susceptibility mapping is important for the prevention and mitigation of such hazards. In recent years, landslide susceptibility mapping based on GIS technology has gradually become the focus of landslide research (Tien Bui et al., 2012; Chen et al., 2014; Chen and Li, 2020; Chen et al., 2021). In this field of research, the selection of the mapping unit and the choice of the susceptibility analysis method are the two main components that affect the accuracy of the resulting landslide susceptibility map.
The mapping unit is the smallest indivisible unit of space used in the landslide susceptibility analysis, and it can be either regular or irregular. According to current research, all units can be classified into five types (Table 1): grid units, terrain units, unique condition units, slope units and topographic units (Guzzetti et al., 1999). The advantages and limitations of each type of unit are summarized in Table 1. Among these types of mapping units, grid units are the most widely used for medium- or small-scale landslide susceptibility zonation because they are easy to calculate and use for spatial analysis (Feizizadeh et al., 2017; Dang et al., 2019; Nam and Wang, 2020; Chen et al., 2021). However, grid units are not closely related to geological environments. For large-scale studies, with the support of abundant geological and geomorphic data, high-resolution satellite images and detailed landslide survey data, a more accurate landslide susceptibility map can be obtained by using GIS-based slope units (Guzzetti et al., 1999; Erener and Düzgün, 2011; Ba et al., 2018).
The analysis methods for landslide susceptibility mapping can be categorized into statistical and machine learning methods. Methods of the first type have a relatively weak learning ability and mainly include the analytic hierarchy process model (Park et al., 2012; Kayastha et al., 2013), the information value model (Lin and Tung, 2004; Sarkar et al., 2013; Sharma et al., 2014; Tan et al., 2015), the certainty factor model (Binaghi et al., 1998; Pistocchi et al., 2002) and the logistic regression model (Ohlmacher and Davis, 2003; Ercanoglu and Temiz, 2011; Das et al., 2012; Regmi et al., 2013; Lee et al., 2014). Methods of the second type generally have a stronger learning ability and mainly include the support vector machine model (Tien Bui et al., 2012; Feizizadeh et al., 2017), the random forest model (Youssef et al., 2016; Behnia and Blais-Stevens, 2018; Dang et al., 2019; Nam and Wang, 2020) and the adaptive-network-based fuzzy inference model (Chen et al., 2021), among others. Although certain models have been used for landslide susceptibility mapping in specific areas, no model has been put forward that can be used across all kinds of landslide conditions. Therefore, in recent years, landslide mapping methods have shifted from single-model to hybrid-model approaches (Nieto et al., 2015; Meng et al., 2016; Zhou et al., 2016; Moayedi et al., 2018; Mokarram and Zarei, 2018; Dou et al., 2020; Li and Chen, 2020; Zhao and Chen, 2020). In this way, not only can different models learn from each other, allowing optimization of the evaluation results, but they can also be applied to different geological conditions. This study was based on Ningqiang County, which is located in the hinterland of the Qinba Mountains of China. Landslides are one of the most frequent and severe geological hazards in the area, and geological and geomorphological conditions in the area have a significant impact on landslide occurrence and distribution.
Initially, we completed a 1:50,000 scale landslide survey, documenting detailed landslide characteristics and geological data about the area. How to make full use of such materials to obtain more accurate landslide susceptibility zonation maps is of great practical significance for landslide risk evaluation and land use planning in Ningqiang County. Based on the above comparison of the different types of mapping units, combined with knowledge from the landslide survey, we chose slope units as the basic unit type for the landslide susceptibility mapping. For the analysis methods, we focused mainly on applying existing mature models to produce more reliable landslide susceptibility maps, rather than exploring the latest models. Therefore, the certainty factor model (CF), the support vector machine model (SVM) and the random forest model (RF) were applied. The CF model is a widely used statistical model with a weak learning ability, while the SVM and RF models are two machine learning models that have often been used in recent years and have strong learning abilities. In addition, in order to explore the effects of a hybrid approach, the CF model was coupled with the SVM model (named the CF-SVM model) and the RF model (named the CF-RF model), respectively, and these were also applied to analyze landslide susceptibility in this area.

STUDY AREA
Ningqiang County is located in the southwest corner of Shaanxi Province, between 105°21′10″-106°35′18″ E and 32°37′06″-33°12′42″ N, with a total area of about 3246 km² (Figure 1). It lies in the Qinba Mountains, with an elevation from 520 m to 2103 m above sea level according to Digital Elevation Model (DEM) data. This area has a subtropical humid monsoon climate, with an annual average temperature of 13.5°C and annual precipitation of 960-1600 mm. Two major river systems, the Jialing River and the Han River, flow through this district with a network density of 1.4 km/km². Geologically, the lithology in the study area includes phyllite, sandstone, shale and limestone. In addition, alluvium, eluvium, slopewash and some clay materials (collectively referred to as accumulations) are extensively distributed in the area. Tectonically, the Kunlun-Qinling fold system and other major faults may have a large influence on slope stability in the study area. Due to these complex geological conditions, Ningqiang County has become a well-studied area for landslide susceptibility analysis in the Qinba Mountains. There were 332 landslides (including landslides and potential landslides) recorded in the landslide survey of the region, which were largely affected by the surrounding geological conditions, especially the fault lines, lithology, rainfall and slope characteristics (Zhao et al., 2012).

METHODOLOGY
The objective of this work was to produce a more accurate landslide susceptibility map for Ningqiang County by comparative analysis of the certainty factor model (CF), the support vector machine model (SVM), the random forest model (RF), the CF-SVM model and the CF-RF model. The workflow is illustrated in Figure 2, which shows the key steps undertaken in the study: the landslide survey, preparation of the landslide conditioning factors, model evaluation and map generation, results comparison, model choice, and determination of the final landslide susceptibility map.

Data Resources
The data sources used in this study included: 1) a 30 m resolution DEM; 2) Landsat-8 satellite remote sensing images; 3) topographic and geological maps at 1:50,000 scale; 4) real-time rainfall monitoring data; and 5) existing reports and field survey data on landslides.

Slope Unit Production
Compared with traditional grid cells, slope units better reflect the actual environmental conditions that lead to landslide development and have definite geological significance. In this work, the hydrological analysis module of ArcGIS was used to produce the slope units for the landslide susceptibility mapping from DEM data. A total of 23,169 slope units were obtained, of which the minimum area was 900 m² and the maximum area was 1,690,200 m² (Figure 3).

Landslide Inventory
A landslide inventory map defines the location and type of existing landslides. Determining the location of landslides accurately is essential for landslide susceptibility mapping. In order to produce a detailed and reliable landslide inventory map, a comprehensive field survey at a scale of 1:50,000 was performed in the study area, in addition to the collection of relevant reports and photos that were available. A total of 332 landslides were identified and mapped (Figure 1). An analysis of the map showed that a large proportion (98%) of these landslides were shallow landslides, with slide body thicknesses of less than 10 m. The smallest landslide was about 120 m², the largest was about 5.0 × 10⁵ m² and the average was about 2.5 × 10⁴ m². Centroids were used to represent the corresponding landslide locations. There were 198 landslides distributed on convex slopes, 84 on concave slopes, 150 on 30°-40° slopes and 166 on 10-60 m high slopes.
For training and testing the models, the same number of non-landslide samples (332) was generated in ArcGIS, and all of the landslides and non-landslides were randomly divided into two datasets: 70% (a total of 464 samples) for training the models and 30% (a total of 200 samples) for validation.
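The split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the sample identifiers, the random seed and the function name `split_balanced` are hypothetical, and the per-class test size of 100 is taken from the 100/100 test composition reported later in the paper.

```python
import random

def split_balanced(landslide_ids, non_landslide_ids, n_test_per_class, seed=0):
    """Randomly hold out n_test_per_class samples from each class."""
    rng = random.Random(seed)
    train, test = [], []
    for label, ids in ((1, landslide_ids), (0, non_landslide_ids)):
        shuffled = list(ids)
        rng.shuffle(shuffled)
        test += [(i, label) for i in shuffled[:n_test_per_class]]
        train += [(i, label) for i in shuffled[n_test_per_class:]]
    return train, test

# 332 landslide and 332 non-landslide samples with hypothetical identifiers
landslides = [f"L{i}" for i in range(332)]
non_landslides = [f"N{i}" for i in range(332)]

train, test = split_balanced(landslides, non_landslides, n_test_per_class=100)
print(len(train), len(test))  # 464 200
```

Holding out a fixed number per class keeps both datasets balanced (232/232 and 100/100), which a plain 70/30 shuffle of all 664 samples would not guarantee.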

Conditioning Factor Preparation
From the data available and the characteristics of the surveyed landslides, this study selected 10 factors of landslide susceptibility: slope-angle, altitude, slope aspect, degree of relief, lithology, distance to rivers, distance to faults, distance to roads, average annual rainfall and the normalized difference vegetation index (NDVI).
Once the slope unit map had been produced, a thematic layer was generated for each landslide conditioning factor. As some conditioning factors were extracted from grid data sources, such as altitude data from the DEM, these layers had to be converted into slope units. This was done using the regional analysis function of ArcGIS (Figure 4).
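The grid-to-slope-unit conversion can be illustrated with a small zonal aggregation. This is a sketch under assumptions: the text does not say which statistic was used, so the mean is assumed here for a continuous factor, and the tiny rasters and the function name `zonal_mean` are made up for illustration.

```python
from collections import defaultdict

def zonal_mean(values, zones):
    """Average the grid values that fall inside each zone (slope unit)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is None:  # skip no-data cells
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}

# Toy 3 x 3 elevation grid (m) and the slope unit id of each cell
elevation = [[520.0, 540.0, 900.0],
             [530.0, None, 950.0],
             [610.0, 640.0, 1000.0]]
slope_unit_id = [[1, 1, 2],
                 [1, 1, 2],
                 [3, 3, 2]]

means = zonal_mean(elevation, slope_unit_id)
print(means)  # {1: 530.0, 2: 950.0, 3: 625.0}
```

For categorical layers such as lithology, a majority (most frequent class) statistic would be the natural counterpart of the mean.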

(1) Digital Elevation Model and Derivatives
The DEM of the study area was used to extract conditioning factors such as slope-angle, altitude, slope aspect and degree of relief. Slope-angle is the degree of steepness of the surface unit, which is often expressed by the ratio of the vertical height of the slope to the horizontal distance. The slope-angle affects the stability of the slope and therefore the occurrence of landslides. From the DEM data of Ningqiang County, the slope information in the study area was extracted using ArcGIS terrain analysis and divided into the following four categories: 0-15°, 15-20°, 20-25° and >25° (Figure 4A; Table 2). The altitude information was extracted and divided into the following six categories: <800 m, 800-1000 m, 1000-1200 m, 1200-1400 m, 1400-1600 m and >1600 m (Figure 4B; Table 2).
The slope aspect information was divided into the following nine categories: plane, north, northeast, east, southeast, south, southwest, west and northwest (Figure 4C; Table 2). The degree of relief was divided into three categories: 0-50 m, 50-100 m and >150 m (Figure 4D; Table 2).

(2) Lithology
Lithology describes the source material that the landslide is formed from, and different types of lithology have different influences on landslides. Based on the 1:50,000 geological map of Ningqiang County, the lithology in the study area can be divided into the following seven categories: sandstone, shale, limestone, tuff, granite, keratophyre and schist (Figure 4E; Table 2).

(3) Distance to Faults, Rivers and Roads
Due to the effect of stress, many cracks can be produced in rock and soil around fault zones, providing channels for infiltration and recharge of groundwater. Rivers can erode their banks, reducing the rock strength of slopes. Engineering activities during road construction can also affect the stability of slopes and induce landslides. Based on buffer analysis, the distance to faults was divided into five categories: 0-500 m, 500-1000 m, 1000-1500 m, 1500-2000 m and >2000 m (Figure 4F; Table 2); the distance to rivers was divided into five categories: 0-150 m, 150-300 m, 300-450 m, 450-600 m and >600 m (Figure 4G; Table 2); and the distance to roads was divided into five categories: 0-500 m, 500-1000 m, 1000-1500 m, 1500-2000 m and >2000 m (Figure 4H; Table 2).

(4) Rainfall
Rainfall is one of the main factors that induces landslides, as it can increase the weight of the sliding body and decrease the strength of rock and soil on the slope. Based on data collected from meteorological stations in the study area, the annual average rainfall was divided into the following nine categories: <950 mm, 950-1000 mm, 1000-1050 mm, 1050-1100 mm, 1100-1150 mm, 1150-1200 mm, 1200-1250 mm, 1250-1300 mm and >1300 mm (Figure 4I; Table 2).

(5) Normalized Difference Vegetation Index (NDVI)
The normalized difference vegetation index (NDVI) is an indicator of vegetation growth and spatial distribution, which characterizes the surface vegetation cover. Based on the Landsat-8 satellite remote sensing images of the study area, the NDVI was divided into five categories: <0.40, 0.40-0.45, 0.45-0.50, 0.50-0.55 and >0.55 (Figure 4J; Table 2).
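For reference, NDVI is computed per pixel from the near-infrared and red reflectances; for Landsat-8 OLI these are band 5 and band 4, respectively. The sketch below shows only the index itself. The reflectance values are invented, and returning 0.0 for an empty pixel is an assumption of this example rather than something stated in the paper.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # assumed convention for pixels with no signal
    return (nir - red) / denominator

# Invented reflectances for a densely vegetated pixel
value = ndvi(nir=0.5, red=0.1)
print(round(value, 4))  # 0.6667
```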

Multi-Collinearity Diagnostics
All of the training and test data were imported into SPSS for multi-collinearity diagnostics, and the results are shown in Table 3. Among the 10 landslide conditioning factors selected, slope had the largest variance inflation factor (VIF) and the smallest tolerance (TOL) (VIF = 3.665, TOL = 0.273), which is still within the acceptable range (a VIF value <10 or a TOL value >0.1 indicates that there is no serious collinearity problem). Therefore, the 10 landslide conditioning factors used in this study show no serious multi-collinearity and can be used for landslide susceptibility assessment.
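As a brief aside on how the two diagnostics relate (a standard definition that the text does not spell out): for factor $j$, let $R_j^2$ be the coefficient of determination obtained by regressing factor $j$ on the remaining factors. Then

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \mathrm{TOL}_j = 1 - R_j^2 = \frac{1}{\mathrm{VIF}_j}.$$

For the slope factor, $1/3.665 \approx 0.273$, which matches the reported TOL and corresponds to $R^2 \approx 0.727$.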

CF Model

Principle
The certainty factor (CF) method is a probability function that analyses the sensitivity of the various factors affecting an event, and it has been widely used in the evaluation of landslide susceptibility. This method was first proposed by Shortliffe and Buchanan (1975) and then gradually improved by subsequent researchers. The principle of the CF model is expressed as follows:

$$CF = \begin{cases} \dfrac{P_a - P_s}{P_a (1 - P_s)}, & P_a \ge P_s \\[2ex] \dfrac{P_a - P_s}{P_s (1 - P_a)}, & P_a < P_s \end{cases} \tag{1}$$

where $P_a$ is the conditional probability of a landslide event in category a (e.g., this category could refer to slopes in the range of 20°-30°). In practical applications, this parameter can be represented by the ratio of the landslide area to the total land area in category a. $P_s$ is the prior probability of landslide events in the study area, expressed by the ratio of the total landslide area to the total land area. It can be seen that the CF value is in the range of [-1, 1]. The closer the CF value is to 1, the greater the probability of landslide occurrence; the closer the CF value is to -1, the smaller the probability of landslide occurrence. When the CF value is 0, the probability of landslide occurrence in the category equals the prior probability, so the category provides no evidence either way. For each slope unit, each of the 10 conditioning factors takes a single CF value, so the sum of the CF values for each slope unit was calculated to express the landslide sensitivity index (LSI) as

$$LSI = \sum_{i=1}^{n} CF_i \tag{2}$$

where $CF_i$ is the CF value of the i-th conditioning factor for the slope unit and n = 10 is the number of conditioning factors.
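The CF calculation and the LSI sum can be sketched in a few lines. The sketch assumes the widely used piecewise form of the certainty factor (positive branch when the category probability is at least the prior, negative branch otherwise); the probabilities and the three-factor unit below are invented for illustration and are not values from Table 4.

```python
def certainty_factor(p_a, p_s):
    """Piecewise certainty factor for a factor class.

    p_a: landslide probability inside the class (landslide area / class area)
    p_s: prior landslide probability (total landslide area / total area)
    """
    if p_a == p_s:
        return 0.0  # the class carries no evidence either way
    if p_a > p_s:
        return (p_a - p_s) / (p_a * (1.0 - p_s))
    return (p_a - p_s) / (p_s * (1.0 - p_a))

def landslide_sensitivity_index(cf_values):
    """LSI of one slope unit: the sum of the CF values of its factor classes."""
    return sum(cf_values)

p_s = 0.02                                 # invented prior probability
cf_high = certainty_factor(0.05, p_s)      # class with more landslides than average
cf_low = certainty_factor(0.01, p_s)       # class with fewer landslides than average
cf_none = certainty_factor(0.0, p_s)       # class with no landslides at all

lsi = landslide_sensitivity_index([cf_high, cf_low, cf_none])
print(round(cf_high, 4), round(cf_low, 4), round(cf_none, 4))  # 0.6122 -0.5051 -1.0
```

With 10 factors each contributing a value in [-1, 1], the LSI of a unit is bounded by [-10, 10]; the range reported for the study area (-4.734 to 2.849) lies well inside these bounds.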

Application
The classification information and the corresponding CF values of the 10 conditioning factors selected in this paper were calculated (Table 4). As shown in Table 4, the distance to faults, distance to roads, distance to rivers, rainfall and lithology were the main conditioning factors of landslide susceptibility in Ningqiang County, which is consistent with the conclusions obtained by Zhao et al. (2012) using an information quantity model.

SVM Model
Principle
Support Vector Machine (SVM) is a supervised learning method based on statistical learning theory. Its core idea is to construct an optimal hyperplane that separates different samples by maximizing the margin between them, and it is often used to solve binary classification problems. Support vector machines can classify data that are either linearly separable or linearly inseparable.
Suppose there is a linearly separable training sample set $\{x_i, y_i\}$, $i = 1, 2, \ldots, n$, $y_i \in \{+1, -1\}$. The support vector machine tries to find a hyperplane that separates the two sets of samples while maximizing the margin, that is, the distance between the hyperplane and the nearest samples of each set. The original (primal) SVM problem is stated as:

$$\min_{w, b} \ \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n \tag{3}$$

In the formula, $\|w\|$ is the 2-norm of w, b is the offset, and n is the total number of training samples. Landslide data are difficult to separate linearly in landslide susceptibility assessment, so an SVM for linearly inseparable data has to be used. The original linearly inseparable problem is expressed as follows:

$$\min_{w, b, \xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, n \tag{4}$$

In the formula, the relaxation (slack) variable $\xi_i$ is introduced to describe the classification margin error, and C is the penalty factor, which controls the penalty applied to misclassified samples. This is a convex quadratic programming problem with inequality constraints, for which the dual problem can be obtained by using the Lagrange multiplier method, expressed as:

$$\max_{a} \ \sum_{i=1}^{n} a_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j y_i y_j \, k(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} a_i y_i = 0, \quad 0 \le a_i \le C \tag{5}$$

where $a_i$ are the Lagrange multipliers and $k(x_i, x_j)$ is a kernel function that implicitly evaluates inner products in a high-dimensional feature space. Commonly used kernel functions include the linear kernel, the polynomial kernel, the Gaussian radial basis kernel and the sigmoid kernel. The Gaussian radial basis kernel function, which is widely used and performs well for both large and small samples, is expressed in Eq. 6; the model parameters to be optimized are the penalty factor C and the kernel parameter γ.

$$k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \tag{6}$$

The final decision function is given by Eq. 7.

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n} a_i y_i \, k(x_i, x) + b\right) \tag{7}$$

Application
Using the grid search algorithm, C = 1.3195 and γ = 1.8661 were found to be the optimal values. The SVM model was constructed and tested with the 200 test samples. Among them, 142 samples were correctly predicted, giving a prediction success rate of 71.0% (Figure 5A). This shows that the established SVM model had good predictive performance.
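To make the roles of the two tuned parameters concrete, the sketch below evaluates a Gaussian radial basis kernel and an SVM decision value in plain Python, and builds a logarithmically spaced search grid. Several things here are assumptions of this example: the second tuned parameter is taken to be the RBF width parameter (called `gamma` in the code), the support vectors and multipliers are toy values rather than a trained model, and the base-2 grid is inferred only from the observation that the reported optima coincide with 2^0.4 and 2^0.9. The paper itself does not describe the grid.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian radial basis kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_value(x, support_vectors, labels, alphas, b, gamma):
    """Weighted kernel sum whose sign gives the predicted class."""
    return sum(a * y * rbf_kernel(sv, x, gamma)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b

def log2_grid(low_exp, high_exp, step):
    """Candidate values 2**e for e = low_exp, low_exp + step, ..., high_exp."""
    n_steps = round((high_exp - low_exp) / step)
    return [2.0 ** (low_exp + k * step) for k in range(n_steps + 1)]

# Toy model: one support vector per class (NOT a fitted model)
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
labels = [-1, +1]
alphas = [1.0, 1.0]
gamma = 2.0 ** 0.9   # 1.8661 to four decimals

near_positive = decision_value((0.9, 0.9), support_vectors, labels, alphas, 0.0, gamma)
near_negative = decision_value((0.1, 0.1), support_vectors, labels, alphas, 0.0, gamma)

grid = log2_grid(-2.0, 2.0, 0.1)   # hypothetical search range
print(round(2.0 ** 0.4, 4), round(2.0 ** 0.9, 4))  # 1.3195 1.8661
```

In a grid search, each (C, gamma) pair from such a grid would be scored by cross-validation on the training samples and the best-scoring pair retained.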

RF Model
Principle
Random forest (RF) is an ensemble learning technique based on decision trees (Youssef et al., 2016; Chu et al., 2019), which was first proposed by Breiman (2001). This method decides which category each sample belongs to by using multiple classification and regression trees generated from a large number of sample data sets. Each decision tree acts as a classifier, and by combining many trees the classification and predictive ability of an RF model is better than that of a single traditional classification and regression tree. In the RF model, the bootstrap resampling method is used to draw samples (generally covering about 2/3 of the original sample set) from the original sample set T to generate a new training sample set. A decision tree is generated from each bootstrap training set, and the n decision trees generated in this way constitute the forest. The remaining samples (generally about 1/3 of the original sample set) are called Out of Bag (OOB) samples, which constitute the test sample set and are used to verify the performance of the model. The class predictions of all of the decision trees in the forest are counted, and the class that receives the largest number of votes for a sample is assigned as its category. Further details of the RF model can be found in Breiman (2001).
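The bootstrap and voting steps described above can be sketched without any tree-fitting code. This is only an illustration of the resampling mechanics: the seed and function names are hypothetical, and the trees themselves are omitted. Drawing as many samples as the training set holds, with replacement, leaves roughly 63% of the samples in the bag and about 37% out of bag, which is the "about 2/3 and 1/3" split mentioned in the text.

```python
import random
from collections import Counter

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; return in-bag and OOB indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

def majority_vote(votes):
    """Class predicted by the largest number of trees."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(42)
in_bag, out_of_bag = bootstrap_indices(464, rng)   # 464 training samples
oob_fraction = len(out_of_bag) / 464

# Votes of five hypothetical trees for one slope unit (1 = landslide)
predicted = majority_vote([1, 0, 1, 1, 0])
print(predicted)  # 1
```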

Application
In this paper, the attribute values of each landslide conditioning factor were extracted from the original 464 training samples to construct the new training sample data set. The number of decision trees and the number of variables selected at random were set to 500 and 3, respectively. The RF model was constructed and then tested with the set of 200 test samples. Among them, 156 samples were correctly predicted, giving a prediction success rate of 78.0% (Figure 5B). This shows that the established RF model also had good predictive performance.

CF-SVM and CF-RF Models

Principle
In order to explore the effects of hybrid statistical and machine learning methods for landslide susceptibility analysis in Ningqiang County, the CF values of all of the training samples (including landslide samples and non-landslide samples) calculated by the CF model were taken as a new sample dataset. On this basis, SVM and RF were applied to construct an ensemble CF-SVM model and CF-RF model, respectively.

Application
In this study, the CF values of the conditioning factors were extracted from the 464 original training samples (232 landslide samples and 232 non-landslide samples) to construct a new set of training data. The CF values of the conditioning factors were also extracted from the 200 original test samples (100 landslide samples and 100 non-landslide samples) to construct a new set of test data. For the CF-SVM model, the optimal value of the penalty factor C was found to be 1.4142 and that of the kernel parameter γ was 2.2974; for the CF-RF model, the number of decision trees and the number of variables selected at random were set to 500 and 3, respectively. The test results were as follows: 155 samples were correctly predicted by the CF-SVM model (a prediction success rate of 77.5%, Figure 5C) and 162 samples were correctly predicted by the CF-RF model (a prediction success rate of 81.0%, Figure 5D). These results imply that both the CF-RF and the CF-SVM models performed well, but the CF-RF model performed slightly better than the CF-SVM model.
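The hybrid input can be pictured as a lookup step: each sample's factor classes are replaced by the CF values of those classes, and the resulting numeric vector is what the SVM or RF model is trained on. The sketch below shows that encoding only. The CF numbers, the two-factor table and the names `CF_TABLE`, `FACTOR_ORDER` and `encode_with_cf` are placeholders invented for this example; they are not the values of Table 4.

```python
# Hypothetical CF lookup: factor -> class label -> CF value.
# Placeholder numbers only, NOT the values reported in Table 4.
CF_TABLE = {
    "slope_angle": {"0-15": -0.40, "15-20": 0.10, "20-25": 0.25, ">25": 0.05},
    "distance_to_faults_m": {"0-500": 0.35, "500-1000": 0.10, ">2000": -0.30},
}
FACTOR_ORDER = ["slope_angle", "distance_to_faults_m"]

def encode_with_cf(sample, cf_table=CF_TABLE, factor_order=FACTOR_ORDER):
    """Replace each factor class of a sample by its CF value, in a fixed order."""
    return [cf_table[factor][sample[factor]] for factor in factor_order]

sample = {"slope_angle": "20-25", "distance_to_faults_m": "0-500"}
features = encode_with_cf(sample)
print(features)  # [0.25, 0.35]
```

A fixed factor order matters because the downstream classifier identifies each feature only by its position in the vector.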

Landslide Susceptibility Maps
After implementing the models, LSI values for all of the 23,169 slope units were calculated based on the five different models and these were reclassified by the natural break classification scheme in ArcGIS; the results are shown in Table 5.
Based on the CF model, the LSI values of all 23,169 slope units were in the range of -4.734 to 2.849. According to the natural break classification scheme, these were divided into five grades from low to high, (-4.734, -1.889), (-1.889, -0.856), (-0.856, -0.048), (-0.048, 0.766) and (0.766, 2.849), which represent the very low, low, moderate, high and very high susceptibility classes, respectively. A landslide susceptibility map was derived from the LSI values of the CF model (Figure 6A).
Based on the SVM model, the LSI values of all of the slope units were in the range of -2.398 to 2.010, and these were again divided into five grades representing very low to very high susceptibility: (-2.398, -0.987), (-0.987, -0.380), (-0.380, 0.206), (0.206, 0.793) and (0.793, 2.010). The landslide susceptibility map based on the SVM model is shown in Figure 6B.
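Once the break values are known, assigning a class to a slope unit is a simple interval lookup. The sketch below uses the four interior break values reported for the CF model; it does not compute natural breaks itself, and treating a value equal to a break as belonging to the lower class is an assumption of this example.

```python
from bisect import bisect_left

# Interior break values reported for the CF model
CF_BREAKS = [-1.889, -0.856, -0.048, 0.766]
CLASSES = ["very low", "low", "moderate", "high", "very high"]

def susceptibility_class(lsi, breaks=CF_BREAKS):
    """Map an LSI value to one of five susceptibility classes."""
    return CLASSES[bisect_left(breaks, lsi)]

print(susceptibility_class(-3.0))   # very low
print(susceptibility_class(-0.5))   # moderate
print(susceptibility_class(0.0))    # high
print(susceptibility_class(2.0))    # very high
```

The same function works for the other models by passing their break values through the `breaks` argument.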

ROC Curves
The receiver operating characteristic (ROC) curve and the area under the curve (AUC) are generally utilized for evaluating the accuracy of landslide susceptibility analysis models. In this study, the ROC was represented by the success-rate and prediction-rate curves, which were generated from the training and test datasets, respectively. The LSI values were sorted in descending order, and the cumulative percentage of area was divided into 100 classes at 1% intervals to form the horizontal axis; the cumulative percentage of landslides falling within the corresponding range of LSI values formed the vertical axis. In this way, the success-rate and prediction-rate curves were obtained and displayed as continuous lines. The AUC was obtained for both the training data and the validation data (Figure 7).
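The construction of such a cumulative curve and its AUC can be sketched as follows. This is a simplified illustration: it accumulates unit by unit rather than in 100 classes of 1% area, integrates with the trapezoidal rule, and uses four invented slope units of equal area.

```python
def rate_curve_auc(lsi_values, areas, landslide_counts):
    """AUC of the cumulative landslide share versus cumulative area share,
    with units ranked from the highest to the lowest LSI."""
    order = sorted(range(len(lsi_values)), key=lambda i: lsi_values[i], reverse=True)
    total_area = sum(areas)
    total_landslides = sum(landslide_counts)
    auc = 0.0
    x_prev = y_prev = 0.0
    cum_area = cum_landslides = 0.0
    for i in order:
        cum_area += areas[i]
        cum_landslides += landslide_counts[i]
        x = cum_area / total_area
        y = cum_landslides / total_landslides
        auc += (x - x_prev) * (y + y_prev) / 2.0   # trapezoid between steps
        x_prev, y_prev = x, y
    return auc

# Four invented slope units of equal area
lsi = [0.9, 0.5, 0.1, -0.3]
areas = [1.0, 1.0, 1.0, 1.0]
good_ranking = rate_curve_auc(lsi, areas, [2, 1, 1, 0])   # landslides in high-LSI units
poor_ranking = rate_curve_auc(lsi, areas, [0, 1, 1, 2])   # landslides in low-LSI units
print(good_ranking, poor_ranking)  # 0.6875 0.3125
```

An AUC of 0.5 corresponds to a ranking no better than chance, so values such as 0.919 for a success-rate curve indicate that most landslides fall in the units ranked highest.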

Landslide Susceptibility Map Rationality
The rationality of the landslide susceptibility maps was evaluated by overlaying the mapped landslide locations on each susceptibility map. Detailed information on each zone in the maps produced by the different models is given in Figure 8 and Table 6. The moderate class of the CF model covered the largest area by percentage (27.67%), followed by the high (26.77%), low (21.86%), very high (15.84%) and very low (7.86%) classes, respectively. Moreover, the largest share of landslides was observed in the very high class (37.65%), followed by the high (29.52%), moderate (19.28%), low (12.05%) and very low (1.50%) classes, respectively. In the SVM model, the percentage area of land that belonged to each susceptibility class followed the same sequence as for CF. However, the largest share of landslides fell in the high susceptibility class (34.94%), followed by the very high (31.33%), moderate (19.88%), low (11.14%) and very low (2.71%) classes. In terms of the RF model, the high class was found to cover the largest area (23.37%), followed by the moderate (21.16%), very low (19.18%), low (19.07%) and very high (17.23%) classes, respectively. Although the very high class covered the smallest area, it contained the largest share of landslides (70.18%), followed by the high (19.28%), moderate (6.93%), low (2.71%) and very low (0.90%) classes, respectively. In the CF-SVM model, 25.81% of the land area was assigned to the very low class, followed by the very high (20.57%), low (19.77%), high (18.07%) and moderate (15.78%) classes. Moreover, 59.04% of landslides were assigned to the very high susceptibility class, followed by the high (15.06%), moderate (10.84%), low (9.64%) and very low (5.42%) classes, respectively. For the CF-RF model, the high susceptibility class covered the largest area (21.68%), followed by the low (21.06%), moderate (19.30%), very low (19.27%) and very high (18.6%) classes. The percentages of landslides in these classes were as follows: very high (68.98%), high (21.69%), moderate (5.12%), low (3.01%) and very low (1.20%).
These analyses show that the landslide susceptibility maps produced by most of the above models were reasonable, with the percentage of landslides increasing progressively from the very low to the very high susceptibility class; the SVM model was the exception, as its high class contained more landslides than its very high class. Overall, the map produced by the RF model seemed the most reasonable in comparison to the other models.
Landslide density (LD, the number of landslides per 100 km²) and landslide frequency ratio (FR, the percentage of landslides divided by the percentage of area in each susceptibility class) were another two important indexes used to evaluate the rationality of the mapping results. The larger the LD or FR value, the higher the landslide susceptibility. The LD and FR of each model are summarized in Table 6 and shown in Figure 9 and Figure 10, respectively. In terms of the LD and FR, the RF model was the most reasonable, followed by the CF-RF model, the SVM model, the CF model and the CF-SVM model. This shows that the hybrid approach in the form of CF-RF and CF-SVM reduced the reliability compared to the stronger models on their own.
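As a worked example of the two indexes, the sketch below recomputes FR and LD for the CF model from the class percentages quoted in the text. It assumes the rounded totals given elsewhere in the paper (332 landslides, about 3246 km²), so the results may differ slightly from the values published in Table 6, which is not reproduced here.

```python
TOTAL_AREA_KM2 = 3246.0      # approximate area of the study area
TOTAL_LANDSLIDES = 332       # landslides in the inventory

# CF model: class -> (percentage of area, percentage of landslides), from the text
CF_MODEL = {
    "very low": (7.86, 1.50),
    "low": (21.86, 12.05),
    "moderate": (27.67, 19.28),
    "high": (26.77, 29.52),
    "very high": (15.84, 37.65),
}

def frequency_ratio(area_pct, landslide_pct):
    """Share of landslides divided by share of area for one class."""
    return landslide_pct / area_pct

def landslide_density(area_pct, landslide_pct):
    """Number of landslides per 100 km^2 in one class."""
    n_landslides = landslide_pct / 100.0 * TOTAL_LANDSLIDES
    area_km2 = area_pct / 100.0 * TOTAL_AREA_KM2
    return n_landslides / area_km2 * 100.0

fr = {name: frequency_ratio(a, l) for name, (a, l) in CF_MODEL.items()}
ld = {name: landslide_density(a, l) for name, (a, l) in CF_MODEL.items()}
for name in CF_MODEL:
    print(f"{name:>9}: FR = {fr[name]:.2f}, LD = {ld[name]:.2f}")
```

An FR above 1 means a class holds more landslides than its share of area would suggest; for the CF model only the high and very high classes exceed 1 under these inputs.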

Landslide Susceptibility Map Accuracy
As shown in Figure 7A, the highest AUC value for the success-rate curve was obtained for the RF model (0.919), followed by the CF-RF model (0.894), the CF-SVM model (0.751), the SVM model (0.676) and the CF model (0.675). The prediction-rate curve is shown in Figure 7B. The CF-RF model had the highest prediction rate (0.714), and prediction rates of 0.698, 0.682, 0.680 and 0.661 were obtained for the RF, CF-SVM, SVM and CF models, respectively. The RF model based on slope units therefore had the best fit to the training data and a prediction rate second only to that of the CF-RF model. Overall, based on the above comparisons and analyses, the susceptibility map produced from the RF model had the highest rationality and accuracy, followed by the CF-RF model. This is consistent with the conclusions of previous studies that found that tree-based models achieve excellent results compared to other models (Hong et al., 2020). Unfortunately, the two hybrid models (the CF-SVM model and the CF-RF model) did not meet our expectations. That is to say, combining the weaker learner (the CF model here) with a stronger learner (the SVM and RF models here) affected the applicability of the stronger model on its own, which has been a concern of researchers (Dou et al., 2019), although in most cases the hybrid model is better than the single model (Hong et al., 2017; Pradhan et al., 2017; Mallick et al., 2018; Xiao et al., 2019; Fang et al., 2020; Napoli et al., 2020; Saha and Saha, 2020).

CONCLUSION
Landslides are a highly dangerous and hugely destructive hazard that occurs all around the world. For this reason, landslide susceptibility analysis is very important for local government agencies. In GIS-based landslide susceptibility mapping, scientists have put forward many models and used many different types of mapping units. However, until now, there has been no model that can accommodate every type of landslide condition for susceptibility mapping. Additionally, this large-scale susceptibility analysis using detailed data has shown that slope units are well suited as mapping units. In this paper, taking slope units as the basic analysis units, landslide susceptibility mapping of Ningqiang County was carried out using a CF model, SVM model, RF model, CF-SVM model and CF-RF model. The results showed that the RF model performed better than the others. In addition, this paper provides valuable information for local government agencies for landslide risk evaluation and land utilization planning in the study area and other similar areas in the Qinba Mountains. It also shows the value of using slope units for landslide susceptibility mapping.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.