ORIGINAL RESEARCH article

Front. Earth Sci., 30 March 2023

Sec. Geohazards and Georisks

Volume 11 - 2023 | https://doi.org/10.3389/feart.2023.1118160

Application of different watershed units to debris flow susceptibility mapping: A case study of Northeast China

  • 1. College of Construction Engineering, Jilin University, Changchun, China

  • 2. China Water Resources Bei Fang Investigation, Design & Research Co. LTD., Tianjin, China

Abstract

The main purpose of this study was to compare two types of watershed units, divided by the hydrological analysis method (HWUs) and the mean curvature method (CWUs), for debris flow susceptibility mapping (DFSM) in Northeast China. Firstly, a debris flow inventory consisting of 129 debris flows and 129 non-debris flows was randomly divided in a ratio of 70% to 30% for training and testing. Secondly, 13 influencing factors were selected, and the correlations between these factors and the debris flows were determined by frequency ratio analysis. Then, the two types of watershed units (HWUs and CWUs) were divided, and logistic regression (LR), multilayer perceptron (MLP), classification and regression tree (CART) and Bayesian network (BN) were selected as the evaluation models. Finally, the predictive capabilities of the models were verified using the predictive accuracy (ACC), the Kappa coefficient and the area under the receiver operating characteristic curve (AUC). In the training stage, the mean AUC, ACC and Kappa of the four models (LR, MLP, CART and BN) were 0.977, 0.931, and 0.861, respectively, for the HWUs and 0.961, 0.905, and 0.810, respectively, for the CWUs. In the testing stage, they were 0.904, 0.818, and 0.635, respectively, for the HWUs and 0.883, 0.800, and 0.601, respectively, for the CWUs. These results show that the HWU-based models have a higher debris flow prediction performance than the CWU-based models. The CWU-based models can nevertheless reflect the overall spatial distribution probability of debris flows in the study area and can be used as an alternative.

1 Introduction

According to the China Statistical Yearbook (http://www.stats.gov.cn/tjsj/ndsj/), a total of 7,840 geological disasters occurred in China in 2020, resulting in 197 casualties and direct economic losses of 740 million dollars, of which debris flows accounted for 11.46%. Debris flows are among the most frequent and destructive disasters in mountainous areas (Dash et al., 2022; Jiang et al., 2022; Qiu et al., 2022). Debris flow susceptibility mapping (DFSM), representing where debris flows are likely to occur, plays an important role in debris flow management strategies and has been a hot topic in disaster research worldwide (Ilia and Tsangaratos, 2015; Qin et al., 2019; Sun et al., 2021; Yao et al., 2022).

There are many uncertainties in the process of disaster susceptibility mapping, such as selecting appropriate mapping units, determining evaluation models, screening influencing factors, determining the proportion of training and testing data and others (Tien Bui et al., 2015; Cama et al., 2016; Zezere et al., 2017; Chen et al., 2018; Du et al., 2018; Dou et al., 2019; Qiao et al., 2021). Among the above uncertainty factors, selecting appropriate mapping units is the first step to address disasters and environmental factors. The mapping unit is the basic functional spatial element for dividing the study area (Cama et al., 2016). The term refers to a portion of the land surface which contains a set of ground conditions that differ from the adjacent units across definable boundaries (Van Den Eeckhaut et al., 2009). The selection of mapping units affects the methods used to address the uncertainty in the input data, the model fitting, the reliability of disaster susceptibility mapping and the application of disaster susceptibility mapping in disaster prevention and mitigation (Fausto Guzzetti et al., 1999; Cama et al., 2016; Qiao et al., 2021). At present, mapping units mainly include the following classes: grid cell units, slope units, watershed units, topographic units, geohydrological units, political or administrative units, and unique condition units (Van Den Eeckhaut et al., 2009; Chen et al., 2019; Sun et al., 2020).

For DFSM, grid cell units and watershed units are used frequently. Grid cell units are the most popular mapping units because of their uniform cell size, fast processing speed and simple algorithm (Reichenbach et al., 2018). However, the division into grid cells destroys the integrity of debris flows and is almost completely unrelated to geological and topographic information (Dragut and Eisank, 2011; Wang et al., 2017). Moreover, since debris flows are a dynamic process, DFSM based on grid cell units cannot comprehensively reflect spatial information (Qin et al., 2019). A watershed is the river catchment area enclosed by the water-parting line; it is the basic unit for the development and activity of debris flows, and it is the object of exploration, research, and prevention of debris flows. Furthermore, the watershed unit includes the formation area, circulation area, and accumulation area of a debris flow (Qin et al., 2019). Compared with grid cell units, watershed units can capture the spatial information of a debris flow more completely. Some scholars have carried out DFSM based on watershed units and obtained reliable results. Qin et al. (2019) explored the accuracy and practicability of mapping units for the evaluation of debris flow susceptibility based on grid cell units and watershed units, and the results showed that watershed units were more feasible than grid cell units when considering the effects of geology and geomorphology on the occurrence of debris flows. Qiao et al. (2021) proposed a region-partitioning method for DFSM based on the topographic characteristics of watershed units, and the results demonstrated that this method can enable more reasonable regional-scale DFSM. Li et al. (2017) presented an application of the rock engineering system and fuzzy C-means algorithm for debris flow susceptibility assessment using watershed units as mapping units in the Wudongde Dam area; the evaluation results agreed well with field investigations. 
Zou et al. (2019) developed a quantitative method for regional risk assessment of debris flows by analyzing in-depth the relationships among hazard-forming environments, disaster factors and elements at risk based on hydrological response units. The presented method may serve as pertinent guidance for regional risk assessment of debris flows. In addition, some scholars have used watershed units to evaluate and compare the performance of different evaluation models for DFSM (Liang et al., 2020; Xiong et al., 2020), and the conclusions provide helpful data for assessing and mitigating debris flow hazards. Therefore, it is important to carry out research based on watershed units, which provide more evidence and views for DFSM research. The commonly used watershed units are based on the hydrological analysis model, also known as hydrological response units (Li et al., 2021). In addition, watershed units can be generated based on the mean curvature model (Romstad and Etzelmüller, 2012). To compare the results of applying different watershed units in DFSM, we extracted the watershed units based on the hydrological analysis method and mean curvature method in the study.

There are plenty of evaluation models for disaster susceptibility mapping, ranging from qualitative to quantitative approaches (Aditian et al., 2018; Huang et al., 2020; Asadi et al., 2022). Qualitative methods are based on air photo and field interpretation and the opinions of an individual or a group of experts (Aditian et al., 2018; Ghasemian et al., 2022b). Some qualitative methods include ranking and weighting, such as the analytic hierarchy process and weighted linear combination (Ayalew and Yamagishi 2005; Rozos et al., 2010). These qualitative or semi-quantitative methods are subjective and highly dependent on experts’ knowledge, and are not suitable for large study areas (Bălteanu et al., 2010). Quantitative statistical models are built on appropriate mathematical models to analyze the statistical relations between disasters and influencing factors (Hadmoko et al., 2017; Ghasemian et al., 2022b), including the information value (Xu et al., 2012), certainty factor concepts (Devkota et al., 2012), frequency ratio method (Balamurugan et al., 2016), bivariate statistical analysis (Ayalew and Yamagishi 2005), index of entropy (Shirani et al., 2018), weight of evidence (Constantin et al., 2010), evidential belief functions (Carranza 2014), logistic regression (Cao et al., 2019), etc. Machine learning models are now widely used because they can analyze the non-linear correlations between past events and the influencing factors and predict where disasters will occur (He et al., 2012; Xiong et al., 2020). 
These models include artificial neural networks (Pham et al., 2017; Chen et al., 2021; Chen et al., 2022), support vector machines (Colkesen et al., 2016), random forest (Hong et al., 2016), decision trees (Althuwaynee et al., 2014), classification and regression tree (Youssef et al., 2015), boosted regression trees (Xiong et al., 2020), Bayesian network (Song et al., 2012), adaptive neuro-fuzzy inference (Jaafari et al., 2019), logistic model tree (Tien Bui et al., 2015) and random gradient descent (Hong et al., 2020). Reichenbach et al. (2018) reviewed the statistically-based landslide susceptibility assessment literature from 1983 to 2016, and found that the most common statistical methods for landslide susceptibility modeling include logistic regression, neural network analysis, data-overlay and index-based and weight of evidence analyses. In this study, to avoid the model uncertainty caused by different evaluation models, we use logistic regression (LR), multilayer perceptron (MLP), classification and regression tree (CART) and Bayesian network (BN) to carry out DFSM based on two types of watershed units.

This study compared and analysed the applicability of two different watershed units in regional DFSM based on four models (LR, MLP, CART, and BN). The main purpose is to support the selection of watershed units for DFSM. Yongji county in the Jilin Province, China was taken as the study region because it is under serious threat of frequent debris flows. The division process and results of two types of watershed units were compared. Eight DFSMs are discussed and AUC, ACC, and Kappa analyses were used to evaluate the accuracy of the debris flow susceptibility models.

2 Study area

2.1 General settings

Yongji county is located in central eastern Jilin Province, China (Figure 1), and covers a total area of 2,620 km2. The number of debris flows in Yongji county increased from 71 in 2007 to 129 in 2021, causing several deaths and destroying hundreds of houses and thousands of acres of farmland. The debris flows scoured roadbeds and piled up on roads, paralysing traffic. It is therefore necessary and urgent to map the susceptibility of debris flows in Yongji county.

FIGURE 1

The geographic location of the study area.

The study area lies between 125°48′09″E and 126°40′01″E longitude and between 43°18′07″N and 43°35′00″N latitude. There are four landforms in the entire area: middle mountains, low mountains, platform, and river valley. From southeast to northwest, the landforms of the study area are middle mountains, low mountains and platform, with altitudes ranging from 1,386 to 182 m. Apart from some andesites and metamorphic rocks, the main rock type is Yanshan Early Granite. The study area lies in the Tianshan–Xingan geosyncline fold area of the Jilin and Heilongjiang fold system (Qin et al., 2019). Folds and faults are relatively developed in Yongji county, which provides conditions for the occurrence of geological disasters (Figure 2A). Yongji county is in the mid-latitude subtemperate continental climate zone with an annual average precipitation of 722.75 mm. There are 39 rivers covering an area of more than 20 km2. The main rivers include the Yinma River, Wende River, Chalu River and Aolong River.

FIGURE 2

Geological map and debris flow field photos of the study area: (A) geological map; (B–E) debris flow field photos.

2.2 Debris flow data inventory

A debris flow inventory map is a prerequisite for DFSM (Xu et al., 2012; Arabameri et al., 2020; Dash et al., 2022). A total of 129 debris flows were collected based on field surveys and historical materials. Figure 2A shows that debris flows are mainly distributed across the southeast mountain area. Statistics show that among the 129 debris flows, only 7 are medium in size and 122 are small. In recent years, the increase in debris flow frequency in Yongji county has been closely related to deforestation and reclamation. With the destruction of forest vegetation, rainfall is more likely to cause soil erosion, which gradually forms a series of gullies. These gullies provide circulation conditions for debris flows. Figures 2B–E show field photos of debris flows that have occurred in the study area.

3 Watershed units

3.1 Division methods for different watershed units

In this study, the extraction of watershed units was completed in ArcGIS 10.2 software (Tien Bui et al., 2015; Cao et al., 2019). The most commonly used watershed units (HWUs) are classified by the hydrological principles (Fausto Guzzetti et al., 1999). HWUs are derived based on an 8-direction flow algorithm (Horton et al., 2013). Establishing the HWUs consists of the following six steps: 1) filling the original DEM, 2) extracting the flow direction, 3) calculating the flow accumulation, 4) extracting river networks based on a threshold, 5) stream linking, 6) dividing HWUs based on flow direction and stream linking. The detailed classification process is shown in Figure 3A.
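Step 2 of the HWU workflow (extracting the flow direction) can be illustrated with a minimal 8-direction sketch. This is not the ArcGIS implementation: the function name `d8_flow_direction`, the neighbour indexing and the tiny DEM are assumptions made for illustration, and pits or flat cells are simply marked with -1 rather than filled.

```python
import numpy as np

# Row/column offsets of the eight neighbours considered by the 8-direction scheme.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_flow_direction(dem, cell_size=30.0):
    """For each cell, return the index into NEIGHBOURS of the steepest
    downslope neighbour, or -1 if no neighbour is lower (pit or flat)."""
    rows, cols = dem.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for k, (dr, dc) in enumerate(NEIGHBOURS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Diagonal neighbours are farther away by a factor of sqrt(2).
                    distance = cell_size * (2 ** 0.5 if dr and dc else 1.0)
                    slope = (dem[r, c] - dem[rr, cc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        direction[r, c] = k
    return direction

# Hypothetical 3 x 3 DEM (elevations in metres) that drains to the lower-right corner.
dem = np.array([[9.0, 8.0, 7.0],
                [8.0, 6.0, 5.0],
                [7.0, 5.0, 1.0]])
flow = d8_flow_direction(dem)
```

In a full workflow, the flow accumulation, stream network and watershed boundaries would then be derived from this direction grid; those steps are omitted here.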

FIGURE 3

Classification process of the watershed units: (A) hydrological analysis method and (B) mean curvature method.

In addition, watershed units can be generated based on the mean curvature method (CWUs). The mean curvature is a simple combination of profile curvature and plan curvature. Its maximum and minimum values can indicate the changes in aspect and slope positions at the same time. Therefore, the mean curvature can reflect the ridge line, valley line, platform edge and wide valley edge (Romstad and Etzelmüller, 2012). Establishing the CWUs consists of the following five steps: 1) smoothing the original DEM, 2) calculating the mean curvature, 3) extracting the flow direction, 4) filling depressions based on flow direction data, and 5) dividing CWUs based on flow direction and depressions. The detailed classification process is shown in Figure 3B.

3.2 Watershed unit classification results

For HWUs, the number and size of the units are closely related to the DEM resolution and the flow threshold, whereas for CWUs the only control factor is the DEM resolution. For HWUs, flow threshold values of 500, 1,000, 2,000, 5,000, and 10,000 were tested based on a DEM with a resolution of 30 m. For CWUs, we resampled the DEM to resolutions of 50, 100, 200, 300, 500, and 1,000 m. To ensure that the number and size of the two types of watershed units were similar and consistent with the actual watersheds, a flow threshold of 1,000 and a DEM resolution of 300 m were selected to divide the watershed units. For the HWUs, the study area was divided into 1,092 watershed units. The smallest unit was 0.10 km2, the largest unit was 13.63 km2, and the mean size was 2.40 km2 (Figure 4A). For the CWUs, the study area was divided into 1,211 watershed units. The smallest unit was 0.11 km2, the largest unit was 8.87 km2, and the mean size was 2.17 km2 (Figure 4B).

FIGURE 4

Division of watershed units: (A) hydrological analysis method and (B) mean curvature method.

4 Materials and methods

The flowchart of the research methodology is shown in Figure 5. DFSM of Yongji county using four models (LR, MLP, CART, and BN) and two types of watershed units (HWUs and CWUs) was carried out in five main steps: 1) data collection and screening of influencing factors, 2) division of the two types of watershed units, 3) calculation of frequency ratio values (FRs) for all influencing factors, 4) building the debris flow models and constructing the DFSMs, and 5) validation and comparison of the debris flow models using AUC, ACC and Kappa.

FIGURE 5

Flowchart of the research methodology.

4.1 Screening influencing factors

The occurrence of debris flows is affected by many factors, including topographic, geomorphologic, geological, ecological and meteorological factors (Zhang et al., 2012; Bregoli et al., 2014; Hu et al., 2014). Based on field observations, available literature and expert experience, fourteen influencing factors were considered, namely watershed area, relative height difference, watershed average elevation, watershed slope, mean curvature, fault density, river density, stream power index (SPI), topographic wetness index (TWI), normalized difference vegetation index (NDVI), landforms, precipitation, land use and lithology.

Because substantial collinearity leads to model instability, collinearity analysis is essential before influencing factors are applied to DFSM (Qiu et al., 2022). Pearson’s correlation coefficient was calculated to test the collinear relationship among these factors, and the results are shown in Table 1. Two factors are considered free of collinearity when the absolute value of their correlation coefficient is less than 0.7 (Dormann et al., 2013; Yao et al., 2022). There was high collinearity between relative height difference and watershed average elevation, with a Pearson’s correlation coefficient of 0.86. In addition, the correlation coefficient between watershed average elevation and precipitation was 0.69, close to the threshold. Therefore, the watershed average elevation was eliminated.
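The screening rule described above can be sketched in a few lines. The helper name `collinear_pairs` and the synthetic data are assumptions for illustration only; the 0.7 cut-off is the one stated in the text.

```python
import numpy as np

def collinear_pairs(data, names, threshold=0.7):
    """Return factor pairs whose absolute Pearson correlation reaches the threshold.

    data: 2-D array with one row per watershed unit and one column per factor.
    """
    corr = np.corrcoef(data, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], round(float(corr[i, j]), 2)))
    return pairs

# Synthetic example (not the study data): relief is built to track elevation.
rng = np.random.default_rng(0)
elevation = rng.normal(500.0, 100.0, size=200)
relief = 0.8 * elevation + rng.normal(0.0, 30.0, size=200)
slope = rng.normal(10.0, 3.0, size=200)
data = np.column_stack([elevation, relief, slope])
flagged = collinear_pairs(data, ["E", "RHD", "WS"])
```

Here only the elevation and relief columns are flagged, mirroring the way one factor of a strongly correlated pair is dropped before modelling.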

TABLE 1

Factors WA E RHD WS F R NDVI SPI TWI Pre MC LU LF Li
WA 1.00
E 0.04 1.00
RHD 0.25 0.86 1.00
WS 0.00 0.33 0.29 1.00
F −0.03 0.22 0.09 0.02 1.00
R −0.05 −0.44 −0.36 −0.36 −0.18 1.00
NDVI −0.10 −0.30 −0.38 −0.22 −0.08 0.17 1.00
SPI 0.06 0.45 0.46 0.23 0.14 −0.19 −0.23 1.00
TWI 0.00 0.00 0.04 −0.55 0.00 0.20 0.02 0.36 1.00
Pre −0.02 0.69 0.48 0.21 0.18 −0.24 −0.43 0.42 0.02 1.00
MC −0.10 −0.19 −0.19 −0.03 −0.08 0.03 0.06 −0.42 −0.25 −0.11 1.00
LU 0.02 −0.17 −0.17 −0.59 0.11 0.31 0.18 −0.04 0.37 −0.08 −0.13 1.00
LF 0.03 0.33 0.30 0.42 0.12 −0.34 −0.28 0.27 −0.16 0.36 −0.16 −0.26 1.00
Li 0.00 −0.10 −0.09 −0.48 −0.04 0.23 0.13 −0.15 0.29 −0.03 0.05 0.31 −0.28 1.00

Pearson’s correlation coefficients among the influencing factors.

(‘WA’ represents ‘Watershed area’, ‘E’ represents ‘Watershed average elevation’, ‘RHD’ represents ‘Relative height difference’, ‘WS’ represents ‘Watershed slope’, ‘F’ represents ‘Fault density’, ‘R’ represents ‘River density’, ‘NDVI’ represents ‘Normalized difference vegetation index’, ‘SPI’ represents ‘Stream power index’, ‘TWI’ represents ‘Topographic wetness index’, ‘Pre’ represents ‘Precipitation’, ‘MC’ represents ‘Mean curvature’, ‘LU’ represents ‘Land use’, ‘LF’ represents ‘Landforms’ and ‘Li’ represents ‘Lithology’).

The watershed area, relative height difference, watershed slope, mean curvature, SPI and TWI were extracted from the DEM with a resolution of 30 m. Fault, river, and lithology data were acquired from the geological map of Yongji county and field investigations. The Landsat 8 image taken on 11 August 2021 was used to produce the NDVI. Landforms, precipitation, and land use were provided by government reports. The thirteen influencing factors were converted to grid cells with a resolution of 30 m in ArcGIS 10.2 (Chen et al., 2017). Table 2 shows the data source and scale of the influencing factors. When watershed units are applied to DFSM, the grid pattern of each factor needs to be transferred to the corresponding watershed units. For watershed area, the geometric calculation in the attribute table was used to calculate the area of each watershed. The difference between the highest and the lowest points in each watershed was calculated as the relative height difference (Qin et al., 2019). For watershed slope, mean curvature, SPI, TWI, and NDVI, the zonal statistics tool in the spatial analysis tools was used with the statistic type set to “mean.” The length of faults and rivers in each watershed was extracted using the intersection tool, and the fault density and river density in each watershed were then calculated using the field calculator. Precipitation for each watershed was determined based on the principle of majority, and this principle was also applied to landforms, land use and lithology. The data types of precipitation, landforms, land use and lithology are discrete, while those of the other factors are continuous. The influencing factor layers based on HWUs with a flow threshold of 1,000 are shown in Figure 6.
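The transfer of grid values to watershed units can be sketched as follows. This is not the ArcGIS zonal statistics tool; the helper names `zonal_mean` and `zonal_majority` and the tiny rasters are assumptions for illustration, and every zone is assumed to contain at least one cell.

```python
import numpy as np

def zonal_mean(values, zones, n_zones):
    """Mean of a continuous raster within each zone (watershed unit)."""
    sums = np.bincount(zones.ravel(), weights=values.ravel(), minlength=n_zones)
    counts = np.bincount(zones.ravel(), minlength=n_zones)
    return sums / counts

def zonal_majority(classes, zones, n_zones):
    """Most frequent class code of a discrete raster within each zone."""
    result = np.empty(n_zones, dtype=int)
    for z in range(n_zones):
        result[z] = np.bincount(classes[zones == z]).argmax()
    return result

# Hypothetical 2 x 3 rasters: zone ids, slope in degrees, land-use class codes.
zones = np.array([[0, 0, 1],
                  [0, 1, 1]])
slope = np.array([[2.0, 4.0, 10.0],
                  [6.0, 12.0, 14.0]])
land_use = np.array([[1, 1, 3],
                     [2, 3, 3]])

mean_slope = zonal_mean(slope, zones, 2)
major_land_use = zonal_majority(land_use, zones, 2)
```

Continuous factors such as slope, SPI, TWI and NDVI would use the mean, while discrete factors such as landforms, land use and lithology would use the majority class.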

TABLE 2

| Factors | Data source | Scale |
| --- | --- | --- |
| Watershed area (km2) | DEM | 30 m × 30 m |
| Relative height difference (m) | DEM | 30 m × 30 m |
| Watershed slope (°) | DEM | 30 m × 30 m |
| Mean curvature | DEM | 30 m × 30 m |
| Fault density (km/km2) | The geological map of Yongji County and field investigations | 1:200,000 |
| River density (km/km2) | The geological map of Yongji County and field investigations | 1:200,000 |
| SPI | DEM | 30 m × 30 m |
| TWI | DEM | 30 m × 30 m |
| NDVI | The Landsat 8 image | 30 m × 30 m |
| Landforms | Government reports | 1:200,000 |
| Precipitation (mm) | Government reports | 1:200,000 |
| Land use | Government reports | 1:200,000 |
| Lithology | The geological map of Yongji County and field investigations | 1:200,000 |

Data source and scale of influencing factors.

FIGURE 6

Maps of influencing factors based on HWUs with a flow threshold of 1,000: (A) watershed area; (B) relative height difference; (C) watershed slope; (D) mean curvature; (E) fault density; (F) river density; (G) SPI; (H) TWI; (I) NDVI; (J) landforms; (K) precipitation; (L) land use; (M) lithology.

The FRs of the influencing factor subclasses were used as the input variables of the DFSM models (Huang et al., 2020). Based on a series of previous studies (Xu et al., 2012; Aditian et al., 2018; Vakhshoori et al., 2019; Chang et al., 2020), we divided each continuous factor into eight levels using the natural breaks method. Taking HWUs with a flow threshold of 1,000 as an example, the FRs for each level of the thirteen factors are shown in Table 3.
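The article does not spell out the FR formula, so the sketch below assumes the commonly used definition: the share of debris flows falling in a class divided by the share of watershed units (or area) in that class. The function name and the counts are hypothetical.

```python
def frequency_ratio(flows_per_class, units_per_class):
    """Frequency ratio per class: (share of debris flows) / (share of units).

    Assumed definition; values above 1 indicate classes in which debris
    flows are over-represented relative to the class extent.
    """
    total_flows = sum(flows_per_class)
    total_units = sum(units_per_class)
    ratios = []
    for flows, units in zip(flows_per_class, units_per_class):
        flow_share = flows / total_flows
        unit_share = units / total_units
        ratios.append(flow_share / unit_share if unit_share > 0 else 0.0)
    return ratios

# Hypothetical counts for four classes of one factor (not taken from Table 3).
flows = [0, 10, 30, 60]
units = [250, 250, 250, 250]
fr = frequency_ratio(flows, units)
```

Under this definition, a class containing no debris flows receives an FR of 0, which matches the zero entries that appear for some classes.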

TABLE 3

| Factors | Descriptions of influencing factors | Class | FR |
| --- | --- | --- | --- |
| Watershed area (km2) | The watershed area is generally proportional to the amount of water in the catchment and material sources. The greater the material sources are, the stronger the destructive power is | 0.10–1.17 | 0.92 |
| | | 1.17–2.01 | 0.36 |
| | | 2.01–2.84 | 1.01 |
| | | 2.84–3.86 | 1.43 |
| | | 3.86–5.03 | 0.95 |
| | | 5.03–6.64 | 0.14 |
| | | 6.64–8.84 | 3.14 |
| | | 8.94–13.64 | 1.77 |
| Relative height difference (m) | The relative height difference is the difference between the highest and lowest elevations in a watershed unit (Shi et al., 2015). The greater the height difference is, the greater the potential energy of the debris flow is; loose deposits move easily under water scouring, and the kinetic energy of the debris flow is also high | 1.88–58.41 | 0.00 |
| | | 58.41–125.09 | 0.22 |
| | | 125.09–187.63 | 0.82 |
| | | 187.63–251.52 | 0.80 |
| | | 251.52–323.73 | 1.28 |
| | | 323.73–429.99 | 1.76 |
| | | 429.99–590.74 | 3.95 |
| | | 590.74–817.72 | 2.98 |
| Watershed slope (°) | The watershed slope represents the average slope within the watershed. The greater the watershed slope is, the worse the slope stability is, which provides materials for debris flows | 0.02–3.77 | 0.00 |
| | | 3.77–5.54 | 0.68 |
| | | 5.54–7.42 | 0.80 |
| | | 7.42–9.40 | 1.25 |
| | | 9.40–11.46 | 2.18 |
| | | 11.46–14.46 | 2.29 |
| | | 14.46–22.29 | 2.43 |
| Mean curvature | Curvature is generally used to describe the physical characteristics of the watershed and understand the erosion process (Li et al., 2021) | −0.0585–−0.0094 | 2.34 |
| | | −0.0094–−0.0037 | 2.15 |
| | | −0.0037–−0.0016 | 1.93 |
| | | −0.0016–−0.0002 | 1.18 |
| | | −0.0002–0.0009 | 0.50 |
| | | 0.0009–0.0028 | 0.87 |
| | | 0.0028–0.0060 | 0.61 |
| | | 0.0060–0.0226 | 4.84 |
| Fault density (km/km2) | Faults are often related to seismic activity. Earthquakes produce loose deposits, which provide materials for debris flows (Hong et al., 2015) | 0–0.07 | 0.82 |
| | | 0.07–0.19 | 3.22 |
| | | 0.19–0.33 | 0.62 |
| | | 0.33–0.44 | 1.82 |
| | | 0.44–0.58 | 1.45 |
| | | 0.58–0.79 | 0.37 |
| | | 0.79–1.11 | 4.41 |
| | | 1.11–2.54 | 1.04 |
| River density (km/km2) | River density is the ratio of total river length in a watershed to the watershed area (Lei et al., 2010). The river will erode the rock mass and destroy the stability of the slope, which provides conditions for the formation of debris flows | 0–0.18 | 0.61 |
| | | 0.18–0.46 | 0.78 |
| | | 0.46–0.69 | 2.07 |
| | | 0.69–0.97 | 1.10 |
| | | 0.97–1.36 | 0.65 |
| | | 1.36–1.90 | 0.22 |
| | | 1.90–2.81 | 0.15 |
| | | 2.81–5.46 | 0.00 |
| SPI | SPI is used to measure the erosion power of the stream (Althuwaynee et al., 2014) | −4.96–1.13 | 0.00 |
| | | 1.13–1.73 | 0.14 |
| | | 1.73–2.29 | 0.57 |
| | | 2.29–2.81 | 0.87 |
| | | 2.81–3.33 | 1.93 |
| | | 3.33–4.53 | 2.50 |
| TWI | The higher the TWI value is, the higher the soil water content is, indicating a higher potential for triggering debris flows in the watershed (Esper Angillieri, 2020) | 5.85–6.79 | 2.15 |
| | | 6.79–7.23 | 1.59 |
| | | 7.23–7.73 | 1.03 |
| | | 7.73–8.32 | 0.54 |
| | | 8.32–8.96 | 0.12 |
| | | 8.96–18.43 | 0.00 |
| NDVI | The NDVI reflects vegetation coverage. High vegetation coverage is conducive to soil and water conservation and can reduce the amount of materials in debris flows | −0.09–0.08 | 1.52 |
| | | 0.08–0.12 | 1.31 |
| | | 0.12–0.17 | 1.17 |
| | | 0.17–0.23 | 1.53 |
| | | 0.23–0.29 | 0.40 |
| | | 0.29–0.36 | 0.99 |
| | | 0.36–0.43 | 0.35 |
| | | 0.43–0.54 | 0.21 |
| Landforms | Landforms affect the formation, movement and scale of debris flows. Mountain areas are prone to debris flows due to the large slope and fast water catchment speed | Platform | 0.00 |
| | | River valley | 0.27 |
| | | Low mountains | 0.37 |
| | | Middle mountains | 3.15 |
| Precipitation (mm) | Precipitation provides water and dynamic conditions for debris flows | 650–675 | 0.00 |
| | | 675–700 | 0.25 |
| | | 700–715 | 2.05 |
| | | 715–730 | 0.83 |
| Land use | Land use is closely related to the occurrence of debris flows. For example, farmland is prone to debris flows due to serious soil erosion | Forest | 1.45 |
| | | Residential land | 1.04 |
| | | Surface of the water | 0.00 |
| | | Farmland | 0.51 |
| Lithology | For different lithologies, their hardness, resistance to erosion and weathering are also different | Relatively hard clastic rock | 0.94 |
| | | Soft clastic rock | 0.87 |
| | | Hard bedded rock | 0.00 |
| | | Hard massive rock | 2.00 |
| | | Soil mass | 0.25 |

Description and FRs of all the influencing factors (HWUs with a flow threshold of 1,000).

4.2 Logistic regression (LR)

Logistic regression (LR) may be the most widely used statistical technique in susceptibility assessment (Colkesen et al., 2016). As a multivariate regression method, LR can find a model to describe the relationship between multiple independent variables and a dependent variable (Lee and Pradhan 2006; Lee 2007; Pourghasemi et al., 2013). For DFSM, the influencing factors are considered the independent variables and the occurrence or non-occurrence of debris flows is considered the dependent variable. For LR, variables may be continuous, discrete or any combination of the two types (Lee, 2007). LR can be expressed as follows (Ayalew and Yamagishi 2005; Yalcin et al., 2011; Schlögel et al., 2018):

$$P = \frac{1}{1 + e^{-z}}$$

$$z = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$$

where $P$ denotes the probability of a debris flow occurrence in each watershed, varying between 0 and 1; $z$ is the linear combination fitted to the dependent variable, which takes the value 0 for non-debris flows and 1 for debris flows; $\alpha$ represents the intercept of the regression function; $\beta_1$, $\beta_2$, …, $\beta_n$ are the regression coefficients; and $x_1$, $x_2$, …, $x_n$ are the debris flow influencing factors.
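As a minimal sketch of how a fitted LR model turns factor values into a probability, the function below evaluates the logistic function of an intercept plus a weighted sum of the factors. The function name, coefficients and inputs are hypothetical and are not the values fitted in this study.

```python
import math

def debris_flow_probability(factors, coefficients, intercept):
    """Logistic model: probability = 1 / (1 + exp(-z)),
    where z = intercept + sum(coefficient_i * factor_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for two FR-encoded factors.
coefficients = [1.2, 0.8]
p_neutral = debris_flow_probability([0.0, 0.0], coefficients, 0.0)
p_high = debris_flow_probability([3.15, 2.43], coefficients, -3.0)
```

With a zero linear term the probability is exactly 0.5, and larger factor values with positive coefficients push it towards 1.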

4.3 Multilayer perceptron (MLP)

Multilayer perceptron (MLP) is a kind of artificial neural network and has been widely used in classification (Tien Bui et al., 2015; Pham et al., 2017). The MLP generally consists of three main components, namely, input layers, hidden layers, and output layers (Kavzoglu and Mather 2003). For DFSM, the input layer receives the influencing factors of debris flows, the output layer gives the classification result (debris flow or non-debris flow), and the hidden layers convert the input into the output. The MLP model with only one hidden layer is the most basic three-tier structure model, which can fit and predict many non-linear problems (Li et al., 2019). In this study, a single-hidden-layer MLP model is used in DFSM. For example, let $n$, $h$ and $m$ represent the number of neurons in the input, hidden and output layers, respectively, and let the input variables be $x_1, x_2, \ldots, x_n$. Then, the input and output of the $j$th neuron in the hidden layer are (Li et al., 2019; Huang et al., 2020):

$$net_j = \sum_{i=1}^{n} w_{ij} x_i - \theta_j, \qquad y_j = f(net_j)$$

where $x_i$, $\theta_j$ and $y_j$ represent the $i$th input variable, and the threshold value and output variable of the $j$th neuron in the hidden layer, respectively; $w_{ij}$ represents the weight value between the $i$th input neuron and the $j$th neuron in the hidden layer; and $f$ represents the activation function. Then the input and output of the $k$th neuron in the output layer are:

$$net_k = \sum_{j=1}^{h} v_{jk} y_j - \gamma_k, \qquad o_k = f(net_k)$$

where $y_j$, $\gamma_k$ and $o_k$ represent the $j$th input variable, and the threshold value and output variable of the $k$th neuron in the output layer, respectively, and $v_{jk}$ represents the weight value between the $j$th neuron in the hidden layer and the $k$th output neuron.
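The forward pass of a single-hidden-layer MLP can be sketched as below. Several choices here are assumptions rather than details from the article: the sigmoid activation, the convention of subtracting the threshold from the weighted sum, and all weight values.

```python
import numpy as np

def sigmoid(x):
    """Assumed activation function; the article does not name the one used."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, w_hidden, theta_hidden, w_output, theta_output):
    """Forward pass: weighted sum minus threshold, then activation, per layer."""
    hidden = sigmoid(x @ w_hidden - theta_hidden)
    return sigmoid(hidden @ w_output - theta_output)

# Hypothetical network: 3 input factors, 4 hidden neurons, 1 output neuron.
x = np.array([1.0, 2.0, 0.5])
w_hidden = np.full((3, 4), 0.1)
theta_hidden = np.zeros(4)
w_output = np.full((4, 1), 0.5)
theta_output = np.array([1.0])

out = mlp_forward(x, w_hidden, theta_hidden, w_output, theta_output)
```

The single output lies between 0 and 1 and can be read as a susceptibility score; training (adjusting weights and thresholds) is not shown.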

4.4 Classification and regression tree (CART)

The decision tree model is a technique that uses a tree structure to discover and describe structural patterns in data. It does not require a preestablished relationship between the input variables and the target variable (Hitoshi Saito and Matsuyama, 2009). As an algorithm of the decision tree model, the classification and regression tree (CART) was first proposed by Breiman et al. (1984). A CART consists of a root node, a set of internal nodes and a set of leaf nodes. The leaf nodes correspond to the classification results, and the other nodes correspond to the classification rules. CART was selected as the decision tree model in this study in view of its performance efficiency (Wang et al., 2015).
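To illustrate how a classification rule at an internal node can be chosen, the sketch below searches for the binary split of one factor that minimises the weighted Gini impurity, the criterion usually associated with CART classification. The function names and data are hypothetical, and the exact settings used in the software are not reproduced.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(values, labels):
    """Return (threshold, weighted Gini) for the best rule 'value <= threshold'."""
    best = (None, gini(labels))
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical FR-encoded slope values and debris flow labels for six units.
fr_slope = [0.0, 0.68, 0.80, 1.25, 2.18, 2.43]
labels = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(fr_slope, labels)
```

A full tree would apply this search recursively over all factors until a stopping rule, such as a maximum depth, is reached.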

4.5 Bayesian network (BN)

The Bayesian network (BN) is a graphical model for probabilistic relationships among a set of variables (Song et al., 2012). A BN can be represented by a directed acyclic graph and conditional probabilities, reflecting the independent and interdependent relationships among the variables. The calculation formula is given as follows (Han et al., 2019):

$$P(x_1, x_2, x_3) = P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_1, x_2)$$

where $P(x_1)$ is the prior probability, i.e., the probability of a node without a parent node; $P(x_2 \mid x_1)$ is the conditional probability, indicating the occurrence probability of $x_2$ under condition $x_1$; and $P(x_3 \mid x_1, x_2)$ is the conditional probability, indicating the occurrence probability of $x_3$ under conditions $x_1$ and $x_2$.
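The factorisation of a joint probability into a prior and conditional probabilities can be illustrated with three binary variables. The variable names and all probability values below are hypothetical and serve only to show the mechanics.

```python
# Hypothetical probability tables for three binary variables a, b and c,
# where b depends on a, and c depends on both a and b.
p_a = {1: 0.3, 0: 0.7}
p_b_given_a = {1: {1: 0.8, 0: 0.2},
               0: {1: 0.1, 0: 0.9}}          # indexed as p_b_given_a[a][b]
p_c_given_ab = {(1, 1): {1: 0.9, 0: 0.1},
                (1, 0): {1: 0.5, 0: 0.5},
                (0, 1): {1: 0.4, 0: 0.6},
                (0, 0): {1: 0.05, 0: 0.95}}  # indexed as p_c_given_ab[(a, b)][c]

def joint(a, b, c):
    """Joint probability as prior times the two conditional probabilities."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_ab[(a, b)][c]

p_all_true = joint(1, 1, 1)
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
```

Summing the joint probability over all eight combinations returns 1, which is a quick consistency check on the tables.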

4.6 Model performance evaluation

In this study, three commonly used criteria, including the predictive accuracy (ACC), the Kappa coefficient and the area under the receiver operating characteristic curve (AUC) were used to evaluate the prediction ability of DFSMs. The calculation of the three criteria is based on the confusion matrix (Ghasemian et al., 2022a). The confusion matrix, also known as the error matrix, is a standard format for accuracy evaluation. The confusion matrix can represent the difference between the model prediction results and the actual observation results (Xiong et al., 2020). In this study, the confusion matrix of the debris flow susceptibility predictive models is shown in Table 4. For example, a true positive (TP) suggests that the prediction result is ‘Debris-Flow’, and the actual observation result is ‘Debris-Flow’.

TABLE 4

| Actuality | Prediction: Debris-flow (1) | Prediction: Non-debris-flow (0) |
| --- | --- | --- |
| Debris-flow (1) | True positive (TP) | False negative (FN) |
| Non-debris-flow (0) | False positive (FP) | True negative (TN) |

Confusion matrix.

The predictive accuracy (ACC) represents the ratio of correctly predicted observations to total observations. This index shows how well the debris flow model works:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$

The Kappa index is used to assess the acceptability of debris flow models and can be calculated by:

$$k = \frac{P_o - P_e}{1 - P_e}, \qquad P_e = \frac{a_1 b_1 + a_0 b_0}{n^2}$$

where $k$ is the Kappa coefficient; $P_o$ is the overall classification accuracy, namely, ACC; $a_1$ is the actual number of debris flows, i.e., the sum of TP and FN (Table 4); $a_0$ is the actual number of non-debris flows, i.e., the sum of FP and TN; $b_1$ is the predicted number of debris flows, i.e., the sum of TP and FP; $b_0$ is the predicted number of non-debris flows, i.e., the sum of FN and TN; and $n$ is the total number of samples, i.e., the sum of $a_1$ and $a_0$ or of $b_1$ and $b_0$. In practice, $k$ varies from 0.0 to 1.0; the higher the value is, the better the classification accuracy of the model.
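Both indices can be computed directly from the four cells of the confusion matrix. In the sketch below, the expected agreement by chance is taken as the sum of products of actual and predicted class totals divided by the squared sample size, which is the standard form of Cohen's Kappa; the function name and the counts are hypothetical.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (ACC, Kappa) from the confusion matrix counts."""
    n = tp + fn + fp + tn
    acc = (tp + tn) / n
    actual_pos, actual_neg = tp + fn, fp + tn
    pred_pos, pred_neg = tp + fp, fn + tn
    # Agreement expected by chance, given the class totals.
    p_expected = (actual_pos * pred_pos + actual_neg * pred_neg) / n ** 2
    kappa = (acc - p_expected) / (1.0 - p_expected)
    return acc, kappa

# Hypothetical counts for a balanced set of 78 testing samples.
acc, kappa = accuracy_and_kappa(tp=35, fn=4, fp=6, tn=33)
```

For these counts the accuracy is 68/78 and the chance agreement is 0.5, so Kappa is roughly 0.74.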

The receiver operating characteristic (ROC) curve and the area under the curve (AUC) can be used to compare the prediction performance of different classifiers (Akgun et al., 2012). The abscissa and ordinate of the ROC curve are the false-positive rate (FPR) and the true-positive rate (TPR), respectively. They can be obtained from the following equations (Pourghasemi et al., 2013):

$$TPR = \frac{TP}{TP + FN}$$

$$FPR = \frac{FP}{FP + TN}$$

The AUC summarizes how reliably a model distinguishes the occurrence from the non-occurrence of debris flows. For a model that is no worse than random guessing, the AUC ranges from 0.5 to 1.0, and the higher the AUC value, the better the prediction performance of the model.
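The construction of the ROC curve and the AUC can be sketched as follows, assuming each sample has a label (1 for debris flow, 0 for non-debris flow) and a predicted susceptibility index. The function names and the six example samples are hypothetical; SPSS computes these quantities internally, and this is not its implementation.

```python
def roc_points(labels, scores):
    """(FPR, TPR) pairs obtained by thresholding at each distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted as a debris flow when its score >= threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and susceptibility indices, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = auc(roc_points(labels, scores))
```

In this toy example, 8 of the 9 possible (debris flow, non-debris flow) pairs are ranked correctly, so the AUC equals 8/9.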

5 Results and validation

5.1 Model parameters

The whole analysis process was implemented in IBM SPSS software (Sun et al., 2019; Sun et al., 2021). For LR, the forward stepwise mode was adopted to screen variables. For the BN model, the structure type was Tree Augmented Naive Bayes (TAN), and Bayesian adjustment for small cell counts was selected as the parameter learning method. For MLP, one hidden layer was selected, and the maximum training time was used as the termination rule. For CART, the maximum tree depth was set to 10, and percentage was used as the termination rule. All other parameters were left at their default values.

5.2 Contribution of debris flow influencing factors

The Chi-Squared statistic was employed to identify the most important factors affecting the occurrence of debris flows in the study area (Ghasemian et al., 2022a). Figure 7 shows that landforms have the highest impact (128.5) on debris flows in the study area, followed by relative height difference and SPI (114.5), watershed slope (110.5), TWI (82.5), precipitation (74.5), lithology (60), land use (43.5), watershed area (27.5), river density (26.5), mean curvature (24), NDVI (23), and fault density (21.5).

FIGURE 7

Contribution of influencing factors.
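The source does not describe how the scores in Figure 7 were computed. As one possible reading, the sketch below computes a Pearson Chi-Squared statistic for a contingency table whose rows are the classes of a single influencing factor and whose columns are the debris flow and non-debris flow counts. The function name and the example table are hypothetical.

```python
def chi_squared(table):
    """Pearson Chi-Squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if factor class and outcome were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical factor with two classes; columns are (debris flow, non-debris flow).
example_table = [[10, 20],
                 [20, 10]]
example_statistic = chi_squared(example_table)
```

A larger statistic indicates a stronger association between the factor classes and debris flow occurrence. The sketch assumes that every row and column total is non-zero.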

5.3 Spatial datasets for model building

According to field surveys and historical materials, a total of 129 debris flows were collected. Meanwhile, 129 non-debris flows were selected, each at least 500 m away from the nearest debris flow (Figure 4) (Dou et al., 2019; Sun et al., 2020). Debris flows and non-debris flows were assigned values of 1 and 0, respectively. The FRs of the thirteen influencing factors shown in Table 2 were taken as the input variables, and the debris flow and non-debris flow labels were taken as the output variable. Of the 258 samples, 70% (n = 180) were selected randomly as training data, which were used to create the DFSM models. The remaining 30% (n = 78) were used as testing data, which were applied to validate the DFSM models. Based on two types of watershed units (HWUs and CWUs) and four models (LR, MLP, CART and BN), eight DFSMs of Yongji county were completed.
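The random 70/30 partition described above can be sketched as follows. The seed, the function name and the placeholder sample identifiers are assumptions for illustration, and the source does not state whether the split was stratified by class; this sketch is not.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and testing subsets."""
    rng = random.Random(seed)      # fixed seed (hypothetical) for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder samples: (sample id, label), with 129 debris flows (1)
# and 129 non-debris flows (0), as in the inventory described above.
samples = [(i, 1) for i in range(129)] + [(i, 0) for i in range(129, 258)]
train, test = split_samples(samples)
```

With 258 samples, truncating 258 × 0.7 = 180.6 gives the 180 training and 78 testing samples reported in the text.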

5.4 Eight DFSMs

In this paper, IBM SPSS software was chosen to build the debris flow susceptibility predictive models. The model outputs are the debris flow susceptibility indices of all watershed units in the study area. Each index is the probability of debris flow occurrence, which ranges from 0 to 1 (Xiong et al., 2020). In ArcGIS, the debris flow susceptibility indices were converted into raster format to produce the debris flow susceptibility maps. Quantile classification was applied to divide the final maps into five classes, namely, very low susceptibility (VL), low susceptibility (L), moderate susceptibility (M), high susceptibility (H), and very high susceptibility (VH) (Martha et al., 2013; Hussin et al., 2016; Steger et al., 2017).
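The idea behind quantile classification can be illustrated with a rank-based sketch that places an equal number of units in each of the five classes. This is a simplification and not ArcGIS's exact algorithm (for instance, tied values may fall into different classes here); the function name and the ten example indices are hypothetical.

```python
def quantile_classes(values, labels=("VL", "L", "M", "H", "VH")):
    """Assign each value to one of len(labels) classes of (near-)equal count."""
    k = len(labels)
    n = len(values)
    # Indices of the values, ordered from lowest to highest susceptibility.
    order = sorted(range(n), key=lambda i: values[i])
    classes = [None] * n
    for rank, i in enumerate(order):
        classes[i] = labels[rank * k // n]
    return classes


# Hypothetical susceptibility indices of ten watershed units.
indices = [0.05, 0.95, 0.40, 0.60, 0.20, 0.80, 0.10, 0.70, 0.30, 0.50]
unit_classes = quantile_classes(indices)
```

Because the classes are equal in unit count rather than in area, the area ratios of the five classes can still differ from one another.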

As shown in Figure 8, the susceptibility distributions of the eight models have common characteristics. Very high and high susceptibility areas are mainly distributed in the southeast, moderate susceptibility areas are mainly distributed in the middle, and very low and low susceptibility areas are mainly distributed in the northwest of Yongji county, which is consistent with previous research results (Qin et al., 2019). The landform in the southeast of the study area is mainly middle mountains, and the land use is mainly forest and farmland. The watershed units in the southeast have large relative height differences and steep slopes, which leads to frequent debris flow disasters. Precipitation decreases from southeast to northwest, which is consistent with the susceptibility distribution. The lithology in southeastern Yongji county is hard massive rock, mainly granite. Weathered granite is a component of debris flows, which increases the density and destructive power of debris flows (Figures 2B–E).

FIGURE 8

Eight DFSMs: (A) DFSM of HWUs and LR; (B) DFSM of HWUs and MLP; (C) DFSM of HWUs and CART; (D) DFSM of HWUs and BN; (E) DFSM of CWUs and LR; (F) DFSM of CWUs and MLP; (G) DFSM of CWUs and CART and (H) DFSM of CWUs and BN.

For the eight DFSMs, the area ratios of the five susceptibility classes (very high, high, moderate, low, and very low) were 12.85%–19.96%, 13.82%–21.51%, 14.87%–23.79%, 17.70%–28.33%, and 16.56%–36.70%, respectively (Figure 9A); the debris flow ratios of the five susceptibility classes were 62.79%–76.74%, 12.40%–24.81%, 6.98%–14.73%, 0.78%–3.10%, and 0%–0.78%, respectively (Figure 9B). The debris flow density, that is, the ratio of the debris flow percentage to the area percentage in each susceptibility class, was calculated to evaluate the performance of the DFSMs (Figure 9C) (Pham et al., 2016). The maximum values of the debris flow density of the eight models appear in the very high susceptibility class, varying from 3.15 to 5.61. The minimum values all appear in the very low susceptibility class, varying from 0.00 to 0.04. The debris flow density increases gradually from the very low class to the very high class, which provides a good visualization of the spatial predictions of debris flows (Pham et al., 2017; Asadi et al., 2022).

FIGURE 9

The classification of DFSMs and debris flow density: (A) area ratio; (B) debris flows ratio; (C) debris flow density.
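The debris flow density in panel (C) is the ratio of the debris flow percentage to the area percentage of each class. A minimal sketch of that calculation follows; the percentages are hypothetical placeholders chosen for easy arithmetic, not the values plotted in Figure 9.

```python
def debris_flow_density(flow_pct, area_pct):
    """Ratio of debris flow percentage to area percentage for each class."""
    return {cls: flow_pct[cls] / area_pct[cls] for cls in area_pct}


# Hypothetical percentages per susceptibility class (each column sums to 100).
flow_pct = {"VL": 0.0, "L": 2.0, "M": 10.0, "H": 18.0, "VH": 70.0}
area_pct = {"VL": 30.0, "L": 25.0, "M": 20.0, "H": 15.0, "VH": 10.0}
density = debris_flow_density(flow_pct, area_pct)
```

A density above 1 means that a class contains a larger share of debris flows than its share of the area, so a density that rises steadily from VL to VH indicates that the map concentrates debris flows in the higher classes.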

5.5 Validation and comparison of the models

Model validation is a vital step in disaster susceptibility mapping (Wang et al., 2022). By considering the three commonly used performance metrics of ACC, AUC and Kappa, eight models were verified. The AUC, ACC and Kappa coefficient values of the eight models on the training and testing data are shown in Figure 10.

FIGURE 10

The AUC, ACC, and Kappa coefficient values of the eight models for the training and testing data.

In the training phase, when HWUs were used as the mapping unit, the HWUs_CART model had the highest ACC (0.990), followed by HWUs_MLP (0.932), HWUs_BN (0.919) and HWUs_LR (0.881). This shows that the HWUs_CART model correctly classified almost all debris flow and non-debris flow locations. The highest and lowest Kappa values were 0.980 and 0.762 for HWUs_CART and HWUs_LR, respectively, with HWUs_MLP (0.864) and HWUs_BN (0.838) ranked in between. In terms of AUC, the HWUs_CART model, with a value of 0.991, performed better than HWUs_MLP (0.982), HWUs_BN (0.970) and HWUs_LR (0.966). When CWUs were used as the mapping unit, the ACC, Kappa and AUC values of the CWUs_CART model were 0.980, 0.960 and 0.985, respectively, which showed that the performance of the CWUs_CART model was the highest, followed by CWUs_MLP (0.893, 0.787, and 0.967), CWUs_BN (0.899, 0.778, and 0.946) and CWUs_LR (0.858, 0.716, and 0.946) (Figure 10). Although all four algorithms performed well, CART had the highest ability in debris flow classification and susceptibility mapping in the study area. In terms of watershed units, the ACC, Kappa and AUC values decreased when HWUs were replaced by CWUs, indicating that HWUs were more suitable for DFSM in the study area than CWUs.

The right side of Figure 10 shows the prediction capabilities of the eight models on the testing dataset. These results are very important for evaluating the applicability and robustness of the models. When HWUs were used as the mapping unit, the highest ACC was 0.834 for the HWUs_CART model, followed by the HWUs_MLP (0.826), HWUs_BN (0.808) and HWUs_LR (0.802) models. The Kappa of the HWUs_CART model was the highest at 0.668, whereas it was 0.652, 0.617, and 0.604 for HWUs_MLP, HWUs_BN, and HWUs_LR, respectively. The highest and lowest AUC values were 0.942 and 0.882 for HWUs_CART and HWUs_LR, respectively, with HWUs_MLP (0.902) and HWUs_BN (0.892) ranked in between. The ACC, Kappa and AUC values for CWUs are also shown in Figure 10 and follow a similar pattern to those for HWUs. The CART model produced the highest ACC, Kappa and AUC values of 0.832, 0.663, and 0.909, respectively, which indicates that it is the best model for the study area. At the same time, the HWU-based models performed better than the CWU-based models for DFSM in the study area.

5.6 One-way ANOVA test

The results of the models were tested by one-way ANOVA in SPSS. For HWUs, CART differed significantly from each of the other three methods (LR, MLP, and BN), whereas there were no significant differences among LR, MLP, and BN. For CWUs, MLP did not differ significantly from either LR or BN, but all other pairs of methods differed significantly.

6 Discussion

6.1 Watershed unit classification processes and results comparison

As shown in Figure 3, the extraction process for HWUs is more complex than that for CWUs, because HWUs require six steps while CWUs require five. Model Builder in ArcGIS is a workflow tool that connects a series of geoprocessing tools (Qin et al., 2019), taking the output of one tool as the input of the next. Model Builder can greatly reduce operation time and improve work efficiency. We built two workflows in Model Builder, one for extracting HWUs and one for extracting CWUs. Experiments on the two types of watershed units showed that HWU extraction required 17 s, while CWU extraction required only 3 s. In addition, the division of HWUs depends on both the DEM resolution and the flow threshold, while the division of CWUs depends only on the DEM resolution. In summary, it takes more time and effort to extract HWUs than CWUs.

There are also significant differences between the two types of watershed units in the classification results. As shown in Figure 4, HWUs are mostly strip-shaped and widely different in size, while CWUs are nearly square and relatively uniform in size. The watershed unit boundaries extracted by the hydrological analysis method in areas with obvious topographic changes are relatively consistent with reality (Figure 11A). However, there are many small and narrow units in flat areas, because there is no clear flow direction in flat areas for hydrological analysis (Massimiliano et al., 2016) (Figure 11C). For the curvature method, the division of watershed units in flat areas is relatively satisfactory, and there is no parallel line problem similar to the hydrological analysis method (Figure 11D). However, in areas with obvious topographic changes, the boundaries of watershed units do not match well with the actual situation (Figure 11B).

FIGURE 11

Watershed unit classification results comparison: (A) The boundaries of HWUs are relatively consistent with reality in mountainous areas; (B) The boundaries of CWUs do not match well with the actual situation in mountainous areas; (C) Small and narrow units in flat areas of HWUs; and (D) The division of CWUs in flat areas is relatively satisfactory.

6.2 Comparison of DFSMs based on different evaluation models

Four models, LR, MLP, CART and BN, were used to complete the DFSMs of Yongji county in this study. Figure 10 shows the predictive ability of the eight models. When HWUs were used as mapping units, CART had the highest evaluation criteria in the training stage, with AUC, ACC and Kappa values of 0.991, 0.990, and 0.980, respectively, followed by MLP (0.982, 0.932, 0.864), BN (0.970, 0.919, 0.838) and LR (0.966, 0.881, 0.762). In the testing stage, CART had the highest prediction accuracy, with AUC, ACC and Kappa values of 0.942, 0.834 and 0.668, respectively, followed by MLP (0.902, 0.826, 0.652), BN (0.892, 0.808, 0.617) and LR (0.882, 0.802, 0.604). When CWUs were used as mapping units, the evaluation results showed the same trend as for HWUs. The comparison of the four evaluation models shows that CART had the best predictive ability. This finding agrees with previous research results. Wang et al. (2015) analyzed landslide susceptibility based on five mathematical models (artificial neural network, frequency ratio, CART, LR and weights of evidence methods) and three sampling strategies. They indicated that the results obtained from CART show steady prediction power with an AUC value larger than 0.7. Felicísimo et al. (2012) indicated that CART is one of the most predictive models, with an AUC value of 0.77. Using random forest (RF), boosted regression tree (BRT), classification and regression tree (CART), and general linear model (GLM), Youssef et al. (2015) found that the success rate of CART was 0.816 and that its prediction rate was the highest, with a value of 0.862. CART represents information in an intuitive and easily visualized way, and is widely used in many fields (Bevilacqua et al., 2003; Malinowska 2014; Kim et al., 2015; Youssef et al., 2015; Yang et al., 2016).

Several studies have been conducted in similar areas. Selecting the frequency ratio (FR) model as the statistical method, Qin et al. (2019) explored the accuracy and practicability of HWUs and grid cell units (GCUs) in evaluating debris flow susceptibility in Yongji county. Qiao et al. (2021) built debris flow susceptibility models via the analytical hierarchy process (AHP) method and generated maps of Yongji county. The AUC values of the testing data in different studies are shown in Table 5. When the HWUs were chosen as mapping units, the AUC values of the DFSMs based on AHP, FR, LR, BN, MLP and CART were 0.812, 0.879, 0.882, 0.892, 0.902, and 0.942 respectively. The main difference among these DFSMs is the selection of different evaluation models, which indicates that machine learning models can improve the prediction accuracy of DFSMs. These results are consistent with previous studies, indicating that machine learning models are more suitable for DFSM than heuristic and general statistical models (Huang et al., 2020; Sun et al., 2021).

TABLE 5

Studies | Method | Mapping units | AUC
This study | LR | HWUs | 0.882
This study | MLP | HWUs | 0.902
This study | CART | HWUs | 0.942
This study | BN | HWUs | 0.892
Qin et al. (2019) | FR | HWUs | 0.879
Qiao et al. (2021) | AHP | HWUs | 0.812

AUC values of testing data in different studies.

6.3 Comparison of DFSMs based on different watershed units

The selection of mapping units is one of the key issues for the rationality and correctness of disaster susceptibility mapping (Van Den Eeckhaut et al., 2009; Chen et al., 2019; Sun et al., 2020). The impact of different mapping units on disaster susceptibility mapping is greater than that of statistical methods (Zezere et al., 2017). Although many studies have compared different evaluation models for disaster susceptibility mapping (Achour et al., 2018; Liang et al., 2020; Xiong et al., 2020; Dash et al., 2022; Qiu et al., 2022), very few have considered different mapping units. Qin et al. (2019) explored the effect of grid cell units and HWUs on the susceptibility mapping of debris flows and found that HWUs accurately reflect the geological and geomorphic environmental conditions of a debris flow. Li et al. (2021) discussed the influence of four different HWUs on debris flow susceptibility assessment results and showed that an appropriate watershed division scheme yields more reasonable results. In this study, HWUs and CWUs were selected to map debris flow susceptibility. When CART was selected as the machine learning model, the HWUs generated higher AUC, ACC, and Kappa values for the training data (0.991, 0.990 and 0.980) than the CWUs (0.985, 0.980, and 0.960). For the testing data, the AUC, ACC, and Kappa of the HWUs were 0.942, 0.834, and 0.668, respectively, while those of the CWUs were 0.909, 0.832, and 0.663, respectively. The results suggest that the HWU model has a higher debris flow prediction performance than the CWU model. The same trend can be observed in the LR, MLP, and BN models. Therefore, the HWU-based model is superior to the CWU-based model in debris flow susceptibility assessment owing to its higher training and testing accuracy.

As described in “6.1 Watershed unit classification processes and results comparison,” compared with CWUs, HWUs agree well with the actual watershed units in mountainous areas, but small and narrow units appear in plain areas. Since the frequency of debris flows in mountainous areas is much higher than that in plain areas, the division of watershed units in mountainous areas is more important than that in plain areas. Therefore, the HWU model is more practical than the CWU model. CWUs can also represent the distribution of watersheds and can be used as an alternative scheme.

Although this paper discussed the application of two types of watershed units in DFSM and obtained positive results, there are some limitations: 1) the number of debris flows is small, and 2) only HWUs with a threshold of 1,000 and CWUs with a resolution of 300 are selected for comparison. In future research, we will constantly update the debris flow database to improve the data quality. Moreover, it is necessary to explore the similarities and differences of multiscale watershed units in DFSM.

7 Conclusion

This paper mainly explored the influence of using different watershed units (HWUs and CWUs) in debris flow susceptibility assessment models. LR, MLP, CART, and BN were chosen as evaluation models to reduce the uncertainty that arises from relying on a single model. Yongji county, with 129 recorded debris flows and 13 related influencing factors, was used as the study area, and eight DFSMs were produced.

The DFSM results showed, through the analysis of AUC, ACC and Kappa, that CART has the best predictive ability among the four models. By using Model Builder in ArcGIS, 1,092 HWUs and 1,211 CWUs were extracted. Compared with HWUs, the extraction process of CWUs is simpler. In terms of the watershed unit division results, HWUs have more advantages in areas with undulating terrain, but they are not satisfactory in areas with flat terrain. CWUs perform well in flat areas but do not match the actual watershed boundaries in areas with undulating terrain. Since debris flows mostly occur in mountainous areas, the DFSM based on HWUs is more accurate and practical than that based on CWUs. In addition, the AUC, ACC and Kappa values showed that the HWU-based model has higher debris flow prediction performance than the CWU-based model. This result means that HWUs are more effective in the debris flow susceptibility assessment of the study area. The CWU-based model can also reflect the overall spatial distribution probability of debris flows in the study area and can be used as an alternative model. Further studies should propose a more appropriate watershed unit for DFSM.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

JL: Conceptualization, methodology, formal analysis, investigation, writing-original draft, writing-review and editing. SUQ: Methodology, validation, resources, data curation, writing-original draft, visualization, project administration, funding acquisition. JC: Validation, investigation, supervision, project administration. SGQ: Investigation, supervision. JGY: Software, data curation. XZ: Conceptualization, supervision. RC: Software, supervision. JHY: Investigation, data curation.

Funding

This work was funded by the National Natural Science Foundation of China under Grants 41977221 and 41972267, and in part by the Jilin Provincial Science and Technology Department (Grant No. 20190303103SF).

Acknowledgments

The authors thank the reviewers for their valuable feedback on the manuscript.

Conflict of interest

JL, XZ, RC, and JHY were employed by China Water Resources Bei Fang Investigation, Design & Research Co. LTD.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

    Achour Y. Garçia S. Cavaleiro V. (2018). GIS-based spatial prediction of debris flows using logistic regression and frequency ratio models for Zêzere River basin and its surrounding area, Northwest Covilhã, Portugal. Arabian J. Geosciences 11, 550. 10.1007/s12517-018-3920-9

  • 2

    Aditian A. Kubota T. Shinohara Y. (2018). Comparison of GIS-based landslide susceptibility models using frequency ratio, logistic regression, and artificial neural network in a tertiary region of Ambon, Indonesia. Geomorphology 318, 101–111. 10.1016/j.geomorph.2018.06.006

  • 3

    Akgun A. Kincal C. Pradhan B. (2012). Application of remote sensing data and GIS for landslide risk assessment as an environmental threat to Izmir city (west Turkey). Environ. Monit. Assess. 184, 5453–5470. 10.1007/s10661-011-2352-8

  • 4

    Althuwaynee O. F. Pradhan B. Park H. J. Lee J. H. (2014). A novel ensemble decision tree-based CHi-squared Automatic Interaction Detection (CHAID) and multivariate logistic regression models in landslide susceptibility mapping. Landslides 11, 1063–1078. 10.1007/s10346-014-0466-0

  • 5

    Arabameri A. Saha S. Roy J. Chen W. Blaschke T. Tien Bui D. (2020). Landslide susceptibility evaluation and management using different machine learning methods in the gallicash river watershed, Iran. Remote Sens. 12, 475. 10.3390/rs12030475

  • 6

    Asadi M. Goli Mokhtari L. Shirzadi A. Shahabi H. Bahrami S. (2022). A comparison study on the quantitative statistical methods for spatial prediction of shallow landslides (case study: Yozidar-Degaga Route in Kurdistan Province, Iran). Environ. Earth Sci. 81, 51. 10.1007/s12665-021-10152-4

  • 7

    Ayalew L. Yamagishi H. (2005). The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 65, 15–31. 10.1016/j.geomorph.2004.06.010

  • 8

    Balamurugan G. Ramesh V. Touthang M. (2016). Landslide susceptibility zonation mapping using frequency ratio and fuzzy gamma operator models in part of NH-39, Manipur, India. Nat. Hazards 84, 465–488. 10.1007/s11069-016-2434-6

  • 9

    Bălteanu D. Chendeş V. Sima M. Enciu P. (2010). A country-wide spatial assessment of landslide susceptibility in Romania. Geomorphology 124, 102–112. 10.1016/j.geomorph.2010.03.005

  • 10

    Bevilacqua M. Braglia M. Montanari R. (2003). The classification and regression tree approach to pump failure rate analysis. Reliab. Eng. Syst. Saf. 79, 59–67. 10.1016/s0951-8320(02)00180-1

  • 11

    Bregoli F. Medina V. Chevalier G. Hürlimann M. Bateman A. (2014). Debris-flow susceptibility assessment at regional scale: Validation on an alpine environment. Landslides 12, 437–454. 10.1007/s10346-014-0493-x

  • 12

    Breiman L. Friedman J. Olshen R. Stone C. (1984). Classification and regression trees. Wadsworth. Biometrics 40, 358.

  • 13

    Cama M. Conoscenti C. Lombardo L. Rotigliano E. (2016). Exploring relationships between grid cell size and accuracy for debris-flow susceptibility models: A test in the giampilieri catchment (sicily, Italy). Environ. Earth Sci. 75, 238. 10.1007/s12665-015-5047-6

  • 14

    Cao J. Zhang Z. Wang C. Liu J. Zhang L. (2019). Susceptibility assessment of landslides triggered by earthquakes in the Western Sichuan Plateau. Catena 175, 63–76. 10.1016/j.catena.2018.12.013

  • 15

    Carranza E. J. M. (2014). Data-Driven evidential belief modeling of mineral potential using few prospects and evidence with missing values. Nat. Resour. Res. 24, 291–304. 10.1007/s11053-014-9250-z

  • 16

    Chang Z. Du Z. Zhang F. Huang F. Chen J. Li W. et al. (2020). Landslide susceptibility prediction based on remote sensing images and GIS: Comparisons of supervised and unsupervised machine learning models. Remote Sens. 12, 502. 10.3390/rs12030502

  • 17

    Chen J. Dai Z. Dong S. Zhang X. Sun G. Wu J. et al. (2022). Integration of deep learning and information theory for designing monitoring networks in heterogeneous aquifer systems. Water Resour. Res. 58, 429. 10.1029/2022wr032429

  • 18

    Chen J. Dai Z. Yang Z. Pan Y. Zhang X. Wu J. et al. (2021). An improved tandem neural network architecture for inverse modeling of multicomponent reactive transport in porous media. Water Resour. Res. 57, 595. 10.1029/2021wr030595

  • 19

    Chen W. Shahabi H. Shirzadi A. Hong H. Akgun A. Tian Y. et al. (2018). Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling. Bull. Eng. Geol. Environ. 78, 4397–4419. 10.1007/s10064-018-1401-8

  • 20

    Chen W. Xie X. Wang J. Pradhan B. Hong H. Bui D. T. et al. (2017). A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. Catena 151, 147–160. 10.1016/j.catena.2016.11.032

  • 21

    Chen Z. Liang S. Ke Y. Yang Z. Zhao H. (2019). Landslide susceptibility assessment using different slope units based on the evidential belief function model. Geocarto Int. 35, 1641–1664. 10.1080/10106049.2019.1582716

  • 22

    Colkesen I. Sahin E. K. Kavzoglu T. (2016). Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression. J. Afr. Earth Sci. 118, 53–64. 10.1016/j.jafrearsci.2016.02.019

  • 23

    Constantin M. Bednarik M. Jurchescu M. C. Vlaicu M. (2010). Landslide susceptibility assessment using the bivariate statistical analysis and the index of entropy in the Sibiciu Basin (Romania). Environ. Earth Sci. 63, 397–406. 10.1007/s12665-010-0724-y

  • 24

    Dash R. K. Falae P. O. Kanungo D. P. (2022). Debris flow susceptibility zonation using statistical models in parts of northwest Indian himalayas—Implementation, validation, and comparative evaluation. Nat. Hazards 111, 2011–2058. 10.1007/s11069-021-05128-3

  • 25

    Devkota K. C. Regmi A. D. Pourghasemi H. R. Yoshida K. Pradhan B. Ryu I. C. et al. (2012). Landslide susceptibility mapping using certainty factor, index of entropy and logistic regression models in GIS and their comparison at Mugling–Narayanghat road section in Nepal Himalaya. Nat. Hazards 65, 135–165. 10.1007/s11069-012-0347-6

  • 26

    Dormann C. F. Elith J. Bacher S. Buchmann C. Carl G. Carré G. et al. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 36, 27–46. 10.1111/j.1600-0587.2012.07348.x

  • 27

    Dou J. Yunus A. P. Tien Bui D. Merghadi A. Sahana M. Zhu Z. et al. (2019a). Assessment of advanced random forest and decision tree algorithms for modeling rainfall-induced landslide susceptibility in the Izu-Oshima Volcanic Island, Japan. Sci. Total Environ. 662, 332–346. 10.1016/j.scitotenv.2019.01.221

  • 28

    Dou Q. Qin S. Zhang Y. Ma Z. Chen J. Qiao S. et al. (2019b). A method for improving controlling factors based on information fusion for debris flow susceptibility mapping: A case study in Jilin Province, China. Entropy (Basel) 21, 695. 10.3390/e21070695

  • 29

    Dragut L. Eisank C. (2011). Object representations at multiple scales from digital elevation models. Geomorphol. (Amst) 129, 183–189. 10.1016/j.geomorph.2011.03.003

  • 30

    Du G. Zhang Y. Yang Z. Guo C. Yao X. Sun D. (2018). Landslide susceptibility mapping in the region of eastern himalayan syntaxis, Tibetan plateau, China: A comparison between analytical hierarchy process information value and logistic regression-information value methods. Bull. Eng. Geol. Environ. 78, 4201–4215. 10.1007/s10064-018-1393-4

  • 31

    Esper Angillieri M. Y. (2020). Debris flow susceptibility mapping using frequency ratio and seed cells, in a portion of a mountain international route, Dry Central Andes of Argentina. Catena 189, 104504. 10.1016/j.catena.2020.104504

  • 32

    Guzzetti F. Carrara A. Cardinali M. Reichenbach P. (1999). Landslide hazard evaluation: A review of current techniques and their application in a multi-scale study, central Italy. Geomorphology 31, 181–216. 10.1016/S0169-555X(99)00078-1

  • 33

    Felicísimo Á. M. Cuartero A. Remondo J. Quirós E. (2012). Mapping landslide susceptibility with logistic regression, multiple adaptive regression splines, classification and regression trees, and maximum entropy methods: A comparative study. Landslides 10, 175–189. 10.1007/s10346-012-0320-1

  • 34

    Ghasemian B. Shahabi H. Shirzadi A. Al-Ansari N. Jaafari A. Geertsema M. et al. (2022a). Application of a novel hybrid machine learning algorithm in shallow landslide susceptibility mapping in a mountainous area. Front. Environ. Sci. 10, 897254. 10.3389/fenvs.2022.897254

  • 35

    Ghasemian B. Shahabi H. Shirzadi A. Al-Ansari N. Jaafari A. Kress V. R. et al. (2022b). A robust deep-learning model for landslide susceptibility mapping: A case study of kurdistan Province, Iran. Sensors (Basel) 22, 1573. 10.3390/s22041573

  • 36

    Hadmoko D. S. Lavigne F. Samodra G. (2017). Application of a semiquantitative and GIS-based statistical model to landslide susceptibility zonation in Kayangan Catchment, Java, Indonesia. Nat. Hazards 87, 437–468. 10.1007/s11069-017-2772-z

  • 37

    Han L. Zhang J. Zhang Y. Lang Q. (2019). Applying a series and parallel model and a bayesian networks model to produce disaster chain susceptibility maps in the changbai mountain area, China. Water 11, 2144. 10.3390/w11102144

  • 38

    He S. Pan P. Dai L. Wang H. Liu J. (2012). Application of kernel-based Fisher discriminant analysis to map landslide susceptibility in the Qinggan River delta, Three Gorges, China. Geomorphology 171–172, 30–41. 10.1016/j.geomorph.2012.04.024

  • 39

    Saito H. Nakayama D. Matsuyama H. (2009). Comparison of landslide susceptibility based on a decision-tree model and actual landslide occurrence: The Akaishi Mountains, Japan. Geomorphology 109, 108–121. 10.1016/j.geomorph.2009.02.026

  • 40

    Hong H. Pourghasemi H. R. Pourtaghi Z. S. (2016). Landslide susceptibility assessment in lianhua county (China): A comparison between a random forest data mining technique and bivariate and multivariate statistical models. Geomorphology 259, 105–118. 10.1016/j.geomorph.2016.02.012

  • 41

    Hong H. Pradhan B. Xu C. Tien Bui D. (2015). Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 133, 266–281. 10.1016/j.catena.2015.05.019

  • 42

    Hong H. Tsangaratos P. Ilia I. Loupasakis C. Wang Y. (2020). Introducing a novel multi-layer perceptron network based on stochastic gradient descent optimized by a meta-heuristic algorithm for landslide susceptibility mapping. Sci. Total Environ. 742, 140549. 10.1016/j.scitotenv.2020.140549

  • 43

    Horton P. Jaboyedoff M. Rudaz B. Zimmermann M. (2013). Flow-R, a model for susceptibility mapping of debris flows and other gravitational hazards at a regional scale. Nat. Hazards Earth Syst. Sci. 13, 869–885. 10.5194/nhess-13-869-2013

  • 44

    Hu W. Xu Q. Rui C. Huang R. Q. van Asch T. W. J. Zhu X. et al (2014). An instrumented flume to investigate the initiation mechanism of the post-earthquake huge debris flow in the southwest of China. Bull. Eng. Geol. Environ. 74, 393–404. 10.1007/s10064-014-0627-3

  • 45

    Huang F. Cao Z. Guo J. Jiang S. H. Li S. Guo Z. (2020). Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. Catena 191, 104580. 10.1016/j.catena.2020.104580

  • 46

    Hussin H. Y. Zumpano V. Reichenbach P. Sterlacchini S. Micu M. van Westen C. et al (2016). Different landslide sampling strategies in a grid-based bi-variate statistical susceptibility model. Geomorphology 253, 508–523. 10.1016/j.geomorph.2015.10.030

  • 47

    Ilia I. Tsangaratos P. (2015). Applying weight of evidence method and sensitivity analysis to produce a landslide susceptibility map. Landslides 13, 379–397. 10.1007/s10346-015-0576-3

  • 48

    Jaafari A. Panahi M. Pham B. T. Shahabi H. Bui D. T. Rezaie F. et al (2019). Meta optimization of an adaptive neuro-fuzzy inference system with grey wolf optimizer and biogeography-based optimization algorithms for spatial prediction of landslide susceptibility. Catena 175, 430–445. 10.1016/j.catena.2018.12.033

  • 49

    Jiang H. Zou Q. Zhou B. Hu Z. Li C. Yao S. et al (2022). Susceptibility assessment of debris flows coupled with ecohydrological activation in the eastern Qinghai-Tibet Plateau. Remote Sens. 14, 1444. 10.3390/rs14061444

  • 50

    Kavzoglu T. Mather P. M. (2003). The use of backpropagating artificial neural networks in land cover classification. Int. J. Remote Sens. 24, 4907–4938. 10.1080/0143116031000114851

  • 51

    Kim K. N. Kim D. W. Jeong M. A. (2015). The usefulness of a classification and regression tree algorithm for detecting perioperative transfusion-related pulmonary complications. Transfusion 55, 2582–2589. 10.1111/trf.13202

  • 52

    Lee S. (2007). Application of logistic regression model and its validation for landslide susceptibility mapping using GIS and remote sensing data. Int. J. Remote Sens. 26, 1477–1491. 10.1080/01431160412331331012

  • 53

    Lee S. Pradhan B. (2006). Landslide hazard mapping at Selangor, Malaysia using frequency ratio and logistic regression models. Landslides 4, 33–41. 10.1007/s10346-006-0047-y

  • 54

    Lei T. C. Wan S. Chou T. Y. Pai H. C. (2010). The knowledge expression on debris flow potential analysis through PCA + LDA and rough sets theory: A case study of Chen-Yu-Lan watershed, Nantou, Taiwan. Environ. Earth Sci. 63, 981–997. 10.1007/s12665-010-0775-0

  • 55

    Li D. Huang F. Yan L. Cao Z. Chen J. Ye Z. (2019). Landslide susceptibility prediction using particle-swarm-optimized multilayer perceptron: Comparisons with multilayer-perceptron-only, BP neural network, and information value models. Appl. Sci. 9, 3664. 10.3390/app9183664

  • 56

    Li Y. Wang H. Chen J. Shang Y. (2017). Debris flow susceptibility assessment in the Wudongde Dam area, China based on rock engineering system and fuzzy C-means algorithm. Water 9, 669. 10.3390/w9090669

  • 57

    Li Z. Chen J. Tan C. Zhou X. Li Y. Han M. (2021). Debris flow susceptibility assessment based on topo-hydrological factors at different unit scales: A case study of Mentougou District, Beijing. Environ. Earth Sci. 80, 365. 10.1007/s12665-021-09665-9

  • 58

    Liang Z. Wang C. M. Zhang Z. M. Khan K. U. J. (2020). A comparison of statistical and machine learning methods for debris flow susceptibility mapping. Stoch. Environ. Res. Risk Assess. 34, 1887–1907. 10.1007/s00477-020-01851-8

  • 59

    Malinowska A. (2014). Classification and regression tree theory application for assessment of building damage caused by surface deformation. Nat. Hazards 73, 317–334. 10.1007/s11069-014-1070-2

  • 60

    Martha T. R. van Westen C. J. Kerle N. Jetten V. Vinod Kumar K. (2013). Landslide hazard and risk assessment using semi-automatically created landslide inventories. Geomorphology 184, 139–150. 10.1016/j.geomorph.2012.12.001

  • 61

    Massimiliano A. Ivan M. Paola R. Mauro R. Francesca A. Federica F. et al (2016). Automatic delineation of geomorphological slope units with r.slopeunits v1.0 and their optimization for landslide susceptibility modeling. Geosci. Model. Dev. Discuss. 9, 3975–3991. 10.5194/gmd-9-3975-2016

  • 62

    Pham B. T. Tien Bui D. Prakash I. Dholakia M. B. (2017). Hybrid integration of Multilayer Perceptron Neural Networks and machine learning ensembles for landslide susceptibility assessment at Himalayan area (India) using GIS. Catena 149, 52–63. 10.1016/j.catena.2016.09.007

  • 63

    Pham B. T. Tien Bui D. Prakash I. Dholakia M. B. (2016). Rotation forest fuzzy rule-based classifier ensemble for spatial prediction of landslides using GIS. Nat. Hazards 83, 97–127. 10.1007/s11069-016-2304-2

  • 64

    Pourghasemi H. R. Moradi H. R. Fatemi Aghda S. M. (2013). Landslide susceptibility mapping by binary logistic regression, analytical hierarchy process, and statistical index models and assessment of their performances. Nat. Hazards 69, 749–779. 10.1007/s11069-013-0728-5

  • 65

    Qiao S. S. Qin S. W. Sun J. B. Che W. C. Yao J. Y. Su G. et al (2021). Development of a region-partitioning method for debris flow susceptibility mapping. J. Mt. Sci. 18, 1177–1191. 10.1007/s11629-020-6497-1

  • 66

    Qin S. Lv J. Cao C. Ma Z. Hu X. Liu F. et al (2019). Mapping debris flow susceptibility based on watershed unit and grid cell unit: A comparison study. Geomatics, Nat. Hazards Risk 10, 1648–1666. 10.1080/19475705.2019.1604572

  • 67

    Qiu C. Su L. Zou Q. Geng X. (2022). A hybrid machine-learning model to map glacier-related debris flow susceptibility along Gyirong Zangbo watershed under the changing climate. Sci. Total Environ. 818, 151752. 10.1016/j.scitotenv.2021.151752

  • 68

    Reichenbach P. Rossi M. Malamud B. D. Mihir M. Guzzetti F. (2018). A review of statistically-based landslide susceptibility models. Earth-Science Rev. 180, 60–91. 10.1016/j.earscirev.2018.03.001

  • 69

    Romstad B. Etzelmüller B. (2012). Mean-curvature watersheds: A simple method for segmentation of a digital elevation model into terrain units. Geomorphology 139–140, 293–302. 10.1016/j.geomorph.2011.10.031

  • 70

    Rozos D. Bathrellos G. D. Skillodimou H. D. (2010). Comparison of the implementation of rock engineering system and analytic hierarchy process methods, upon landslide susceptibility mapping, using GIS: A case study from the Eastern Achaia County of Peloponnesus, Greece. Environ. Earth Sci. 63, 49–63. 10.1007/s12665-010-0687-z

  • 71

    Schlögel R. Marchesini I. Alvioli M. Reichenbach P. Rossi M. Malet J. P. (2018). Optimizing landslide susceptibility zonation: Effects of DEM spatial resolution and slope unit delineation on logistic regression models. Geomorphology 301, 10–20. 10.1016/j.geomorph.2017.10.018

  • 72

    Shi M. Chen J. Song Y. Zhang W. Song S. Zhang X. (2015). Assessing debris flow susceptibility in Heshigten Banner, Inner Mongolia, China, using principal component analysis and an improved fuzzy C-means algorithm. Bull. Eng. Geol. Environ. 75, 909–922. 10.1007/s10064-015-0784-z

  • 73

    Shirani K. Pasandi M. Arabameri A. (2018). Landslide susceptibility assessment by Dempster–Shafer and index of entropy models, Sarkhoun basin, southwestern Iran. Nat. Hazards 93, 1379–1418. 10.1007/s11069-018-3356-2

  • 74

    Song Y. Gong J. Gao S. Wang D. Cui T. Li Y. et al (2012). Susceptibility assessment of earthquake-induced landslides using Bayesian network: A case study in Beichuan, China. Comput. Geosciences 42, 189–199. 10.1016/j.cageo.2011.09.011

  • 75

    Steger S. Brenning A. Bell R. Glade T. (2017). The influence of systematically incomplete shallow landslide inventories on statistical susceptibility models and suggestions for improvements. Landslides 14, 1767–1781. 10.1007/s10346-017-0820-0

  • 76

    Sun J. Qin S. Qiao S. Chen Y. Su G. Cheng Q. et al (2021). Exploring the impact of introducing a physical model into statistical methods on the evaluation of regional scale debris flow susceptibility. Nat. Hazards 106, 881–912. 10.1007/s11069-020-04498-4

  • 77

    Sun X. Chen J. Han X. Bao Y. Zhan J. Peng W. (2019). Application of a GIS-based slope unit method for landslide susceptibility mapping along the rapidly uplifting section of the upper Jinsha River, South-Western China. Bull. Eng. Geol. Environ. 79, 533–549. 10.1007/s10064-019-01572-5

  • 78

    Sun X. Chen J. Han X. Bao Y. Zhou X. Peng W. (2020). Landslide susceptibility mapping along the upper Jinsha River, south-western China: A comparison of hydrological and curvature watershed methods for slope unit classification. Bull. Eng. Geol. Environ. 79, 4657–4670. 10.1007/s10064-020-01849-0

  • 79

    Tien Bui D. Tuan T. A. Klempe H. Pradhan B. Revhaug I. (2015). Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 13, 361–378. 10.1007/s10346-015-0557-6

  • 80

    Vakhshoori V. Pourghasemi H. R. Zare M. Blaschke T. (2019). Landslide susceptibility mapping using GIS-based data mining algorithms. Water 11, 2292. 10.3390/w11112292

  • 81

    Van Den Eeckhaut M, R. P. Guzzetti F. Rossi M. Poesen J. (2009). Combined landslide inventory and susceptibility assessment based on different mapping units: An example from the Flemish Ardennes, Belgium. Nat. Hazards Earth Syst. Sci. 9, 507–521. 10.5194/nhess-9-507-2009

  • 82

    Wang F. Xu P. Wang C. Wang N. Jiang N. (2017). Application of a GIS-based slope unit method for landslide susceptibility mapping along the Longzi River, southeastern Tibetan Plateau, China. ISPRS Int. J. Geo-Information 6, 172. 10.3390/ijgi6060172

  • 83

    Wang L. J. Guo M. Sawada K. Lin J. Zhang J. (2015). A comparative study of landslide susceptibility maps using logistic regression, frequency ratio, decision tree, weights of evidence and artificial neural network. Geosciences J. 20, 117–136. 10.1007/s12303-015-0026-1

  • 84

    Wang X. Huang F. Fan X. Shahabi H. Shirzadi A. Bian H. et al (2022). Landslide susceptibility modeling based on remote sensing data and data mining techniques. Environ. Earth Sci. 81, 50. 10.1007/s12665-022-10195-1

  • 85

    Xiong K. Adhikari B. R. Stamatopoulos C. A. Zhan Y. Wu S. Dong Z. et al (2020). Comparison of different machine learning methods for debris flow susceptibility mapping: A case study in the Sichuan Province, China. Remote Sens. 12, 295. 10.3390/rs12020295

  • 86

    Xu W. Yu W. Jing S. Zhang G. Huang J. (2012). Debris flow susceptibility assessment by GIS and information value model in a large-scale region, Sichuan Province (China). Nat. Hazards 65, 1379–1392. 10.1007/s11069-012-0414-z

  • 87

    Yalcin A. Reis S. Aydinoglu A. C. Yomralioglu T. (2011). A GIS-based comparative study of frequency ratio, analytical hierarchy process, bivariate statistics and logistics regression methods for landslide susceptibility mapping in Trabzon, NE Turkey. Catena 85, 274–287. 10.1016/j.catena.2011.01.014

  • 88

    Yang T. Gao X. Sorooshian S. Li X. (2016). Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme. Water Resour. Res. 52, 1626–1651. 10.1002/2015wr017394

  • 89

    Yao J. Qin S. Qiao S. Liu X. Zhang L. Chen J. (2022). Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping. Bull. Eng. Geol. Environ. 81, 148. 10.1007/s10064-022-02615-0

  • 90

    Youssef A. M. Pourghasemi H. R. Pourtaghi Z. S. Al-Katheeri M. M. (2015). Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 13, 839–856. 10.1007/s10346-015-0614-1

  • 91

    Zezere J. L. Pereira S. Melo R. Oliveira S. C. Garcia R. A. C. (2017). Mapping landslide susceptibility using data-driven methods. Sci. Total Environ. 589, 250–267. 10.1016/j.scitotenv.2017.02.188

  • 92

    Zhang W. Chen J. P. Wang Q. An Y. Qian X. Xiang L. et al (2012). Susceptibility analysis of large-scale debris flows based on combination weighting and extension methods. Nat. Hazards 66, 1073–1100. 10.1007/s11069-012-0539-0

  • 93

    Zou Q. Cui P. He J. Lei Y. Li S. (2019). Regional risk assessment of debris flows in China: An HRU-based approach. Geomorphology 340, 84–102. 10.1016/j.geomorph.2019.04.027

Summary

Keywords

debris flow susceptibility mapping, watershed units, hydrological analysis method, mean curvature method, machine learning model

Citation

Lv J, Qin S, Chen J, Qiao S, Yao J, Zhao X, Cao R and Yin J (2023) Application of different watershed units to debris flow susceptibility mapping: A case study of Northeast China. Front. Earth Sci. 11:1118160. doi: 10.3389/feart.2023.1118160

Received

07 December 2022

Accepted

21 March 2023

Published

30 March 2023

Volume

11 - 2023

Edited by

Wei Zhao, Institute of Mountain Hazards and Environment (CAS), China

Reviewed by

Shuai Chen, Central South University, China

Paraskevas Tsangaratos, National Technical University of Athens, Greece

Himan Shahabi, University of Kurdistan, Iran

Copyright

*Correspondence: Shengwu Qin,

This article was submitted to Geohazards and Georisks, a section of the journal Frontiers in Earth Science

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
