Soil Erosion Assessment of Alpine Grassland in the Source Park of the Yellow River on the Qinghai-Tibetan Plateau, China

The source park of the Yellow River (SPYR), as a vital ecological shelter on the Qinghai-Tibetan Plateau, is suffering different degrees of degradation and desertification, resulting in soil erosion in recent decades. Therefore, studying the mechanism, influencing factors and current situation of soil erosion in the alpine grassland ecosystems of the SPYR are significant for protecting the ecological and productive functions. Based on the 137Cs element tracing technique and machine learning algorithms, five strategic variable selection algorithms based on machine learning algorithms are used to identify the minimal optimal set and analyze the main factors that influence soil erosion in the SPYR. The optimal model for estimating soil erosion in the SPYR is obtained by comparisons model outputs between the RUSLE and machine learning algorithms combined with variable selection models. We identify the spatial distribution pattern of soil erosion in the study area by the optimal model. The results indicated that: (1) A comprehensive set of variables is more objective than the RUSLE model. In terms of verification accuracy, the simulated annealing -Cubist model (R = 0.67, RMSD = 1,368 t km–2⋅a–1) simulation results represents the best while the RUSLE model (R = 0.49, RMSD = 1,769 t⋅km–2⋅a–1) goes on the worst. (2) The soil erosion is more severe in the north than the southeast of the SPYR. The average erosion modulus is 6,460.95 t⋅km–2⋅a–1 and roughly 99% of the survey region has an intensive erosion modulus (5,000–8,000 t⋅km–2⋅a–1). (3) Total erosion loss is relatively 8.45⋅108 t⋅a–1 in the SPYR, which is commonly 12.64 times greater than the allowable soil erosion loss. The economic monetization of SOC loss caused by soil erosion in the entire research area was almost $47.90 billion in 2014. These results will help provide scientific evidences not only for farmers and herdsmen but also for environmental science managers and administrators. In addition, a new ecological policy recommendation was proposed to balance grassland protection and animal husbandry economic production based on the value of soil erosion reclassification.


INTRODUCTION
The Qinghai-Tibetan Plateau is acknowledged as the "Third Pole" and has received an increasing attention toward ecological and environmental concerns (Yao et al., 2012;Madsen, 2016). The source park of the Yellow River (SPYR), located in the North-Eastern range of the Qinghai-Tibetan Plateau, is a dynamic ecological shelter and is one of the most vulnerable ecological zones in the world (Meng et al., 2016). Because of the conservation of water bodies and provision of water resources to both communities and adjacent areas, the SPYR is recognized as the "Water Tower of the Yellow River" (Ge et al., 2018;Wu et al., 2018). The main part of the SPYR is covered by alpine grassland. The alpine grasslands in the SPYR not only plays an exceptional role in ecological functions such as water conservation, biodiversity protection, and carbon fixation (Harris, 2010), but also play critical roles in livestock production, representing the main sources of income for local pastoralists. However, nearly 90% of this alpine grassland suffers different degrees of degradation and desertification due to climate change, global warming and anthropological activities (Evans, 2005;Fassnacht et al., 2015;Mariano et al., 2018). Portions of the grasslands are experiencing moderate and severe degradation, 40.8 and 17.58%, respectively (Miehe et al., 2019) and a direct soil erosion is also resulted in recent decades (Yao et al., 2016). This erosion not only reduces soil fertility and pollutes water resources but also responsible for sediment accumulation, river obstructions, downstream flooding and flow patterns (Evans et al., 2017). Consequently, exploring the mechanism, influencing factors, and current situation of soil erosion in the alpine grassland ecosystems of the SPYR is significant for protecting the ecological and productive functions.
Soil erosion is a complicated process that depends on soil properties, ground slope, vegetation, freeze/thawing, wind erosivity, and rainfall/precipitation volume and intensity Zhang et al., 2019). Soil erosion research is generally conducted by field observations (Mhazo et al., 2016;Zeng et al., 2018), tracer studies (Jia et al., 2016;Wang et al., 2017), experimental operations (Zhang et al., 2019) and soil erosion simulations (Konz et al., 2012). 137 Cs is an anthropogenic radioisotope, which has been widely used to quantify soil erosion at the Spatio-Temporal scale, and to calibrate or validate the erosion models (Li et al., 2021). Conventional experimental methods can be easily applied to small-scale studies, such as research at the patch scale and slope scale (Bakker et al., 2005), and the results help to investigate the potential occurrence of soil erosion (Teng et al., 2018). However, traditional experimental methods are labor-intensive, expensive, and difficult to apply in large-scale research (Efthimiou, 2018). Therefore, model simulation is a convenient method for studying large-scale soil erosion and is commonly applied in such research fields (Abdelwahab et al., 2018;Starkloff et al., 2018). The earliest soil erosion model [the universal soil loss equation (USLE)] was established by Wischmeier and Smith (1958). It is widely reported that erosion plot data collected in United States have been used to develop and calibrate the Revised Universal Soil Loss Equation (RUSLE), which was further used to estimate global soil erosion (Doetterl et al., 2012;Prasannakumar et al., 2012), and it can provide valuable input for the SPYR as well as China, based on the challenge of sustainable grassland resource uses. In addition, various physical models of soil erosion processes have also been established in recent years, including the Water Erosion Prediction Project (WEPP) model (Kinnell, 2017), Revised Wind Erosion Equation (RWEQ) (Teng et al., 2021), European soil erosion model (EUROSEM) (Veihe et al., 2001), Limburg soil erosion model (LISM) and soil erosion model of Mediterranean regions (SEMMED) (De Jong et al., 1999). However, these models are derived from specific scientific concerns and applicable to certain regions. The factors related to the specific alpine grassland grazing ecosystem of the SPYR in the Qinghai-Tibetan Plateau, such as freeze/thaw cycles, grazing-induced erosion (Evans, 2005;Lin et al., 2008), wind erosion, and human activities, have not been considered in the soil erosion process (Gourfi et al., 2018). Therefore, it is unreasonable to use these models in the present study because they may cause the prediction results to be biased or inaccurate for this region.
With the development of "3s" technologies [geographic information system (GIS), remote sensing (RS), a global positioning system (GPS)], it is possible to identify comprehensive environmental variables to characterize soil erosion (Gholami et al., 2018). Thus, a comprehensive set of soil erosion variables can be integrated, including meteorological, soil, topography, vegetation, and management variables. A strategic variable screening process was developed to avoid redundancy and collinearities between variables to determine the minimal-optimal set for developing simulation models by a machine learning algorithm (Xiong et al., 2014). A machine learning algorithm is a process used to fit a model to a dataset through training or learning (Willcock et al., 2018). Many types of machine learning algorithms have been used extensively in determining soil characteristics, such as soil organic carbon (SOC), soil nutrient content, and soil parent material (Heung et al., 2016;Zhi et al., 2018). However, innovative research on soil erosion based on multiple machine learning algorithms is rare, especially in the alpine grassland ecosystem of the SPYR. In this study, five strategic variable selection algorithms based on machine learning algorithms, namely, the Boruta all-relevant (BOR), simulated annealing (SA), genetic algorithm (GA), recursive features elimination (RFE), and univariate filters (UF), are used to identify the minimal-optimal set (You et al., 2014;Hosseini et al., 2016;Yang R. M. et al., 2016;Li and Ma, 2018). Compared to logistic regression, bivariate and multivariate statistical models, several modern machine learning approaches such as the generalized linear model (GLM), cubist model (cubist), random forest model (RFM), and boosting tree (BST) model can overcome the obstacles of spatial modeling in the field of soil erosion assessment (Gayen et al., 2019) and achieve a higher prediction accuracy (Marjanoviae et al., 2011;Micheletti et al., 2013). Hence the GLM, cubist, RFM, and BST model would be adopted to estimate soil erosion, and a cross-comparison of those models can be performed (Li et al., 2017;Chang et al., 2018;Peng et al., 2018;Wang et al., 2019).
The objectives of this study were to (1) Categorize the soil erosion parameters set, screened by the machine learning method and analyze the main factors that influence soil erosion in the SPYR. (2) Determine the optimal model for estimating soil erosion in the SPYR that can be combined with specific environmental factors of the alpine grassland ecosystem of the SPYR. (3) Identify the spatial distribution pattern of soil erosion in the study area by the optimal prediction model. The soil erosion classification and direct economic losses caused by soil erosion enable farmers and policymakers to generate grassland management strategies with the expectation of profit and corresponding risk. These results will help to provide scientific direction for grassland protection, soil restoration and establishment of policy framework for ecological development in the SPYR.

Study Area
SPYR (N32 • 02 -36 • 13 , E95 • 42 -102 • 17 ) comprises an area of about 130,798 km 2 , and the area records for 35% of the total runoff of the Yellow River (Zheng et al., 2018). The elevation of the SPYR ranges from 2,052 to 6,227 m and descends from the Southwest to the Northeast (Figure 1). Topographic features and geographical situations pose an impact on the natural environment, with characteristics such as frost temperatures, intense diurnal temperatures, frequent seasonal rainfall, recurrent strong winds, and powerful solar radiation (Ge et al., 2017). Most of the study area has a typical continental plateau climate, with an average annual temperature ranging −4 and 2 • C rising from Western to Eastern. Average annual rainfall ranges from 350 to 750 mm and is concentrated from June to September. The soil types are mainly alpine meadow soil and steppe soil. Alpine grassland accounts for 95.12% of the total area of the SPYR, and it includes alpine meadow at 83.12% and alpine steppe at 12% (Liang et al., 2016).

Soil Sampling Design
Considering that the terrain of the study region is treacherous with high mountains and glaciers, two transect surveys were performanced which were superimposed an additional constraint that sampling sites were selected based on NDVI levels under various conditions of grassland types and topography to ensure potential representative sampling locations. The one runs throughout the whole study area from Northeast to Southwest along national road G214, while the other transect survey along the Dawu River in the Northeast of Guo Luo Tibetan Autonomous Prefecture (Figure 1). Each sampling point was located at least 500 m away from the highway or river bank to avoid the disturbance caused by proximity. During July and August 2014 (keeping in view of the returning green stage and the accessibility to the sampling sites), 165 soil samples were collected from the soil surface using an internal diameter cores sampler with 20 cm long and 5 cm diameter from the soil surface. The bulk density of each sample of 0-20 cm layers was obtained from each soil mass and sampled volume as described earlier by Nosrati et al. (2015) and Wang et al. (2017). The SOC was determined using the potassium dichromate, external heating method (Cao et al., 2011). The vegetation and soil samples were measured at each sampling site by a standard method (Ren, 1998).

Cesium-137 Content Analysis
Collected samples were air-dried, ground, and passed through a 2 mm sieve. Each sample 137 Cs radionuclide content was analyzed by low background gamma-ray spectrometry using a hyper pure coaxial germanium detector linked to a multichannel digital analyzer system (EG&G, ORTEC) (Haribala et al., 2016) at the School of Nuclear Science and Technology, Lanzhou University. The 137 Cs radionuclide content (Bq kg −1 ) was detected at the 661.6 keV peak over a counting time of 24 h. Each sample weight exceeded 300 g and supplied a precision of approximately ±5% at the 95% confidence level.
The observed soil erosion modulus was calculated by following three model (Wang Y. B. et al., 2014;Li et al., 2021): where CPI i is the areal activity (Bq·m −2 ) of i th sampling point, i the sample number, C i 137 Cs radionuclide content of the i th sample (Bq kg −1 ), W the mass of ith soil sample (kg), S the corer area (m 2 ).
where CPR i is 137 Cs content change rate (%) of the ith sample, CRI 137 Cs background value (Bq·m −2 ) for the study area. In this study, the soil 137 Cs background value was 2,229.1 Bq·m −2 , which was determined by the SPYR (Wang et al., 2017).
where E i is the observed soil erosion modulus of the ith sampling point (t·hm −2 ·a −1 ), Bd i soil bulk density (Mg·m −3 ), DI i soil layer thickness (m) of the ith sampling point, T time between the 137 Cs settling peak (1963) and the sample time.

Data Sources
Thirty-three environmental variable data sets based on available data sources on the climate, soil, topography, vegetation, and management were collected (Appendix Table 1). These data sources were compiled into raster data by ArcGIS 10.2 software to construct the soil erosion model. About 40% of these variables (12) were categorical, including soil taxonomic properties, landuse and land-cover change (LULC), and vegetation type, while 60% (21) of the variables (Table 1) were continuous, including organic matter content, vegetation cover, NDVI, frozen earth period, and climatic and biotic variables.

Gridded Site Characteristics
The elevation, aspect, slope, slope length, and slope factor (LS factor) were calculated from the digital elevation model (DEM) using ArcGIS 10.2 software. The DEM data were downloaded from 90 m Shuttle Radar Topography Mission (SRTM) images 1 .  And resample the image's spatial resolution from 90 to 100 m. The projection-type of DEM data was Albers.

Gridded Soil Characteristics
The SOC, soil pH, soil carbonate content, soil sand content, soil silt content, soil clay content, and frozen earth period were obtained from the China Soil Map-Based Harmonized World Soil Database (V1.1). The data source mentioned above comes from the Cold and Arid Regions Sciences Data Centre of the Chinese Academy of Sciences (CAS) at Lanzhou 2 .

Climate Date
The daily precipitation, daily average temperature, and average wind velocity were based on a dataset of daily surface observation values in China (V3.0). The 838 meteorological stations data was acquired from the China Meteorological Data Sharing Service System website 3 .

Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI)
Normalized Difference Vegetation Index and EVI obtained from TM/ETM satellite images. The vegetation index products (MOD13Q1) were downloaded from a NASA website 4 .

Data Pre-processing
The daily precipitation and daily temperature were averaged every month; the daily wind velocities were accumulated every year; and the monthly mean temperature, monthly total precipitation, and yearly wind velocity were obtained based on meteorological data using meteorological interpolation software (ANUSPLIN, version 4.3). The monthly total precipitation was used to simulate rainfall erosivity. The MOD13Q1 data from January to December 2014 were transformed and registered into a GeoTIFF format using the MODIS reprojection tool (MRT). The maximum NDVI and maximum EVI were selected from NDVI and EVI datasets from January to December. Vegetation cover was simulated by the NDVI data. Soil erodibility data and vegetation management factor data were calculated by the RUSLE model (Prasannakumar et al., 2012). The details of the other variables are included in Appendix Table 1.

Environmental Variable Selection
The parameters and sample data were integrated using the rasterpackage function in R 3.4.3 (R Development Core Team, 2017) software. Then, variables were selected by the BOR, GA, SA, RFE, and UF algorithms (You et al., 2014;Hosseini et al., 2016;Yang R. M. et al., 2016;Li and Ma, 2018). The five variable filtering algorithms were used in R 3.4.3 (R Development Core Team, 2017).

Determining the Optimal Soil Erosion Model
The whole dataset was randomly split into a calibration set (70%) and a validation set (30%) for model establishment and validation. The GLM forms a multivariate regression relation between a response variable and several predictor variables. The advantage of a GLM over simple linear regression is that the variables may be continuous, categorical, or any combination of these two allowing for non-linearity in the data (Atkinson and Massari, 1998;Rudy et al., 2016;Li et al., 2017). Cubist model is a ruled-based regression technique that builds multivariate linear regression models at the terminal leaves of a tree and parts the predictor variates into different subsets (Moisen et al., 2006;Akpa et al., 2016). RFM trains classification samples through decision trees and makes predictions based on the results of the classification, and has the ability to obtain classification and regression analysis data from various measurement scales with non-parametric statistics without making assumptions (Li et al., 2017;Wang et al., 2019;Tang et al., 2021). BST model is based on a first classification tree with subsequent trees generated by assigning greater weights to incorrectly classified training data (Pouteau et al., 2011;Yang R. M. et al., 2016). The GLM, cubist, RFM, and BST models (Li et al., 2017;Chang et al., 2018;Peng et al., 2018;Wang et al., 2019) were implemented with the mboost, bst, plyr, cubist, and RF packages, respectively, in R 3 the predicted values were plotted against the predicted data to find the trend of the slope of the expected curves. If the expected curve tends to make an angle of 45 degree with the axes, this means that there is no significant difference between the actual and predicted values. To evaluate the predictive capability of the 24 models, the following three performance indicators were used : the coefficient of determination (R 2 ), root mean squared deviation (RMSD) and residual prediction deviation (RPD) (Lin et al., 2013). The goodness of fit for the RUSLE model and machine learning algorithms combined with variable selection models were assessed via comparisons between the predicted results and observed values to evaluate the reliability and accuracy of the model output. The root mean square deviation (RMSD), standard deviation (SD), and correlation coefficients (R) were used to evaluate model accuracy (Lin et al., 2013). The cross-comparison results of the R and RMSD in different models are displayed in the Taylor diagram. The Taylor diagram illustrated R, RMSD, the standard difference in observations, and the standard difference in predicted values (Choubin et al., 2018). The polar axes and the radial axes separately illustrated the correlation coefficient and the RMSD of the model validation results. The predicted models were more approached observed point on the x-axis, and they presented a relatively higher correlation and low RMSD (Jia et al., 2016).

Soil Erosion Classification
Prediction results by the optimal soil erosion model were classified by the Standards for Classification and Gradation of Soil Erosion (Ministry of Water Resources of the People's Republic of China, 2007). The classification results were used to provide a reference for the soil restoration and ecological policy. The economic monetization of SOC loss caused by soil erosion in the SPYR was based on the market value approach (Costanza et al., 1998). The net soil erosion modulus minus the tolerable soil erosion modulus is the actual soil erosion modulus (Hancock et al., 2015). Tolerable soil erosion loss is generally 500 t·km −2 ·a −1 (Chen et al., 2000;Li et al., 2009). The values of SOC were estimated by the C tax ($180·t C) (Patton et al., 2015).

Environment Variable Selections
Among the 33 environmental variables, the number of variables selected by the GA, RFE, BOR, UF, and SA algorithms was 29, 26, 22, 22, and 13 variables, respectively (Appendix Table 2). The five algorithms selected precipitation, wind velocity, rainfall erosivity, frozen earth period, slope, aspect, water flow direction, the NDVI, land cover, and vegetation type as variables in common that may play significant roles in explaining the soil erosion mechanism.
Figure 2 compared five variable search methods across four modeling techniques (GLM, cubist, RFM, and BST) in terms of both prediction accuracy and model complexity. The crossvalidation results of GA BST model that selected 29 variables showed slightly inferior performance to the UF cubist model. This indicated that the 22 variables selected by the UF cubist model contained better predictive power to that of 29 variable selected by the GA BST model. Comparing with the UF cubist model, the SA cubist model had fairly comparable performance in similar model accuracy, but dramatically decreased model complexity. The SA cubist model struck a balance between model complexity and performance. It greatly reduced the number of variables to only 13 while preserving most of the predicting power to infer soil erosion. The cross-validation identified and optimized model parameters. The 13 variables selected by the SA cubist model captured the major factors -soil organic carbon, soil property and soil erodibility besides nine common variables mentioned above involving climate (e.g., precipitation, wind velocity, and rainfall erosivity), soil (e.g., frozen earth period, soil organic carbon, soil property, and soil erodibility), topography (e.g., slope, aspect, and water flow direction), vegetation (e.g., vegetation type and NDVI) and management (e.g., land cover) properties (Figure 2 and Appendix Table 2).

The Optimal Soil Erosion Model
The goodness of fit for the RUSLE models and machine learning algorithms combined with variable selection models showed that the GLM (R = 0.63) and cubist model (R = 0.67) performed the best while RUSLE (R = 0.49) performed the worst (Figure 3). The Taylor diagram show that (1) the fitting result was poorer for the RUSLE model than the 24 machine learning models (the location labeled with 'RUSLE' were the outermost from the observation point).
(2) Compared with the other five variable selection models, the SA variable set combined with the GLM and cubist model was more consistent with the actual measurements, and the SA variable set combined with BST was the least consistent with the actual measurements. (3) Compared with the BST and RF models, the GLM and cubist models were slightly more consistent with the field measurements (the points labeled with 'GLM' and 'cubist' are nearer to the observation point than the points labeled with 'BST' and 'RFM'). The cross combination of the BST model and all variable sets had the lowest agreement (the points labeled with 'BST' is the farthest to the observation point than the other points). (4) Contrast to the AR, BOR, GA, UF, and RFE variables, the GLM did not present a significant difference in fitting results (the gaps of the points labeled with 'RFM' from the observation point were nearly the same). The results show that in terms of classification accuracy, the RFE performs best among the five feature-selection algorithms, and the Cubist model performs best among the four machine learning algorithms. In these 24 combinations, the optimal model is the SA cubist model (R = 0.67, RMSD = 1368).

Spatial Distribution of Soil Erosion
The SA cubist model estimated the total erosion loss was about 8.45 10 8 t·a −1 in 2014, which is 12.64 times the allowable soil erosion loss. The soil erosion modulus showed substantial spatial heterogeneity in the SPYR (Figure 4). The soil erosion modulus ranged from 4841.68 to 8054.09 t·km −2 ·a −1 , and the average soil erosion modulus was 6460.95 t·km −2 ·a −1 . In general, 99% of the area in the SPYR belongs to the severe soil erosion modulus (5,000-8,000 t·km −2 ·a −1 ). According to the Standards for Classification and Gradation of Soil Erosion, the region with the most severe soil erosion modulus (7,000-8,000 t·km −2 ·a −1 ) occupied 12.38% of the total soil erosion regions. It mainly appeared in the Northern part of the SPYR. The moderately severe region of the soil erosion modulus (6,000-7,000 t·km −2 ·a −1 ) occupied 71.19% of the total soil erosion regions and was distributed in the middle of the study area. The slightly severe region of the soil erosion modulus (5,000-6,000 t·km −2 ·a −1 ) occupied 16.43% of the total soil erosion regions and was distributed South of the SPYR (Figure 4).

Selected Key Variables May Provide a More Precise Soil Erosion Explanation
The simulated soil erosion modulus values were more significant than that in two previous report (276 and 3,208 t·km −2 ·a −1 ), which used the RUSLE model (Teng et al., 2018) and RWEQ model (Teng et al., 2021) to calculate the soil water and wind erosion modulus on the Qinghai-Tibetan Plateau, respectively. At the same time, the RUSLE model emphasizes soil erosion loss by rainfall, slope, aspect, vegetation, and soil erodibility but ignores soil erosion loss from winds, freeze-thaw cycles, and human activity. Although few research studies have provided a comprehensive prediction of soil erosion (Sadeghi et al., 2018;Shen et al., 2018), high levels of precipitation have been reported to lead to severe erosion, while low temperatures combined with adequate precipitation may cause even greater soil losses in the study area . The same as the RWEQ model.
Under freeze-thaw conditions, soil moisture is transported, and the soil structure is damaged; moreover, the soil porosity, bulk density, shear strength, aggregate stability, and organic matter are all changed, which leads to high soil erodibility and erosion intensity increases (Wang L. et al., 2014;Sadeghi et al., 2018). In addition, soil erosion modulus caused by wind erosion is also an important aspect of soil erosion because the wind speed is a vital factor that affects soil erosion (Teng et al., 2021). Soil particles erode as the sheer pressure exerted by wind-force leads to soil particles being hard to grip on the soil surface Zhang et al., 2018). Furthermore, grazing livestock and human activity are both key factors in the intensification of soil erosion. Intensive grazing will alter and destroy soil surfaces, causing whole sod to erode and expose mineral soil, especially in areas with low vegetation cover (Lin et al., 2008;Evans et al., 2017;Zhang et al., 2018).
Besides, frequent human activities will cause soil erosion, such as collecting Chinese Caterpillar Fungus (Wang C. G. et al., 2018) and overexploitation of tourist attractions, and so on . The comprehensive set of variables based on the inclusion of freeze-thaw erosion, wind erosion, water erosion, and human-caused erosion was more objective than that used in the RUSLE or RWEQ model. The inclusion of these variables may explain why the soil erosion results derived from this set of variables were more excellent than the results from the RUSLE or RWEQ model because this comprehensive set of variables involved more soil erosion factors, including meteorological, soil, topographic, vegetation, and management factors, which enhance our understanding of soil erosion processes using a data-based model.

Comparison of Categorical Variables and Model
Evaluation results of model accuracy comparisons among five variable selection methods and four prediction models are illustrated in Figure 3. The model validation included machine learning models and the RUSLE model. The results showed that variable screening technology could further reduce model complexity. The SA method selected 13 out of the 33 relevant variables, i.e., climatic factors, including precipitation, wind velocity, and rainfall; soil factors, including SOC, soil properties, soil erodibility, and frozen earth period; topography factors, including slope, aspect, and water flow direction; and vegetation factors, including vegetation type and the NDVI. The UF method largely reduces the number of variables that were most able to explicate soil erosion. The Boruta algorithm screening out the majority of climatic and topographic variables, thus abated model collinearity. The feature-selection method can simplify model parameters and reduce operational costs; in addition, this method can also improve the estimation accuracy of a model by eliminating redundant interference parameters.
Our results showed that in terms of classification accuracy, the RFE performs the best among the five feature-selection algorithms and the Cubist model performs the best among the four machine learning algorithms. The projected results of the RUSLE model were limited by fixed parameters, which included only rainfall, slope, slope direction, vegetation, and soil texture. The accuracy of the RUSLE model is the lowest among the models because some factors that may affect soil erosion were ignored, such as diurnal temperature, strong winds, freeze-thaw, and so on.
In order to gain much more insight through statistical analysis and modeling compared to biased linear sampling along roads, the conditional Latin Hypercube design for soil sampling to even its spatial distribution would be adopted in our near future studies (Ma et al., 2020;Tang et al., 2021).

New Strategy for the Prevention and Control of Soil Erosion Loss in the Source Park of the Yellow River
According to the Standards for Classification and Gradation of Soil Erosion, 99% of the SPYR area experiences severe soil erosion. (Ministry of Water Resources of the People's Republic of China, 2007), soil erosion modulus was further reclassified into three grades: level I (severe, soil erosion modulus in 5,000-6,000 t·km −2 ·a −1 ), level II (moderately severe, the soil erosion modulus in 6,000-7,000 t·km −2 ·a −1 ), and level III (most severe, the soil erosion modulus 7,000-8,000 t·km −2 ·a −1 ). The soil erosion modulus of each county was reclassified into three grades ( Figure 5). Level III erosion occurred mainly in the triangle zone of Qumalai County and Maduo County. Level II erosion occurred in Xinghai County, Dari County, Chengdu County, Western Zeku County, Western Maqin County, Tongde County, and parts of Jiuzhi County, Henan County, Gande County, and Yushu County. Level I erosion primarily occurred in Eastern Maqin County, Eastern Zeku County, and most areas of Jiuzhi County, Henan County, Gande County, and Yushu County (Figure 5).
Severe soil erosion causes significant ecological degradation as well as substantial ecological, economic losses in the SPYR. For example, the economic monetization of SOC losses caused by soil erosion total $47.90 billion in the SPYR. The economic losses were approximately fivefold of the output value of animal husbandry ($10.10 billion) in 2014 (Qinghai Bureau Statistics, 2014). We can infer that the economic monetization of soil erosion would be far huger than the SOC loss. Actually natural resources are priceless. Monetization of soil erosion loss is only a caution light for taking a partial solution in eco-compensation. Therefore, a new ecological policy recommendation based on soil erosion reclassification should be proposed to balance grassland conservation and animal husbandry production for the prevention and control of soil erosion loss (Figure 6).
A conservative management strategy was suggested for some fragile areas (soil erosion at level III) since grassland's economic products have led to serious soil erosion (Figure 6). This grassland sector should not be used for grazing (Yin et al., 2019). At this level, the government should provide ecological  subsidies to compensate for the economic losses caused by not allowing grazing. Additionally, the government should relocate local herder households to new countryside neighbors to cities. At the same time management department should also supply training projects to sustain transitions to alternative livelihoods for these herdsmen, such as making Tibetan typical handicrafts and building Tibetan distinctive hotels for developing Tourism. Notably, over the long term, simply banning grazing may result in a snowballing population increase of deleterious wild animal species that are short of plenty natural predators, such as rabbits and pika, thus probably resulting in new ecological problems . Therefore, after 5 years of banning grazing, grasslands should be managed under a grassanimal balance policy, or grazing should be continued to be banned. The implemented policy will depend on the condition of the grassland. The moderately severe sector (level II) should be managed intensively to improve the basic conditions (Figure 6). Even now the balance of livestock and pasture production is an important factor at level II soil erosion regions (Yin et al., 2019). The study conducted by Fan et al. (2010) propose that the determination of appropriate carrying capacity should follow the principle of the "law of the minimum limiting factors" in order to minimize degradation of the grasslands, i.e., stocking rates should be primarily based on herbage yields of low-productivity years, there is also the opportunity to conserve forage in high-yield years and carry this over to alleviate grazing pressure in low-yield years. Further, there is the possibility of reducing grazing pressure on more degraded parts of the grassland in high-yield years to aid their recovery. For sustainable development, it is insufficient to manage grazing systems solely within the framework of pasture productivity (Lin et al., 2011) especially for distinctive erosion areas (level II). The 'optimum' stocking rate should be obtained from grazing experiments not only considering the foragelivestock balance, but also maintaining the soil erosion at a tolerable level. Sun et al. (2015) conducted a 3-year study on Tibetan sheep on the Eastern Qinghai-Tibetan Plateau. They found that the 'optimum' stocking rate that ensures economic sustainability is 24 sheep months ha −1 (Sun et al., 2015). Considering moderately severe soil erosion conditions in this area, the stocking rate should not exceed one-third of the 'optimum' stocking rate in order to prevent and control the soil erosion loss at level II soil erosion regions. If herder households comply with this policy, then they should be able to receive a subsidy.
The severe sector (level I) should be managed intensively to improve grassland productivity via the application of certain measures (fertilization, tillage, and resowing) for the sustainable utilization of grassland (Figure 6). When soil erosion is at the level I, the stocking amount should not exceed two-thirds of the 'optimum' stocking rate in order to obtain a certain production income, meanwhile prevent and control the soil erosion. In addition, the stocking rate in the cold-period pastures should be less than the stocking rate in the warm-season pastures (Sun et al., 2015).

CONCLUSION
New models using machine learning techniques that combine local ecological variables (n = 33) were constructed based on the observed soil erosion modulus determined by the 137 Cs element tracing technique. 13 environmental variables played critical roles in explaining the process of soil erosion. The results from the comprehensive set of variables were more objective than the produced from the RUSLE model. Nearly 99% of the study region has an intensive erosion modulus (5,000-8,000 t·km −2 ·a −1 ). Northern soil erosion modulus was higher intense than that in the Southern of the SPYR. The total erosion loss was almost 8.45·10 8 t·a −1 in SPYR. The annual economic loss of SOC caused by soil erosion throughout the entire survey zone was generally $47.90 billion in 2014. In addition, integrating monetization of nature resource into soil erosion classification outlined a partial solution in eco-compensation, and provided ecological policy recommendations for three levels of soil erosion. That is a more accurate and flexible method of managing grasslands.
The economic monetization of soil erosion may be underestimated. This manuscript only accounted for SOC loss during soil erosion. Obviously, natural resources are priceless. Monetization of nature resource is only a partial solution in eco-compensation.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
HL conceptualized this study and led the writing. HL and YZ collected and analyzed the data, interpreted the results, and revised the text. Both authors contributed to this work and approved the final manuscript before submission.

FUNDING
This work was supported by the National Natural Science Foundation of China (32171680 and 31772666) and the Fundamental Research Funds for the Central Universities (lzujbky-2021-kb13).