Data-Driven Landslide Nowcasting at the Global Scale

Landslides affect nearly every country in the world each year. To better understand this global hazard, the Landslide Hazard Assessment for Situational Awareness (LHASA) model was developed previously. LHASA version 1 combines satellite precipitation estimates with a global landslide susceptibility map to produce a gridded map of potentially hazardous areas from 60° North-South every 3 h. LHASA version 1 categorizes the world’s land surface into three ratings: high, moderate, and low hazard with a single decision tree that first determines if the last seven days of rainfall were intense, then evaluates landslide susceptibility. LHASA version 2 has been developed with a data-driven approach. The global susceptibility map was replaced with a collection of explanatory variables, and two new dynamically varying quantities were added: snow and soil moisture. Along with antecedent rainfall, these variables modulated the response to current daily rainfall. In addition, the Global Landslide Catalog (GLC) was supplemented with several inventories of rainfall-triggered landslide events. These factors were incorporated into the machine-learning framework XGBoost, which was trained to predict the presence or absence of landslides over the period 2015–2018, with the years 2019–2020 reserved for model evaluation. As a result of these improvements, the new global landslide nowcast was twice as likely to predict the occurrence of historical landslides as LHASA version 1, given the same global false positive rate. Furthermore, the shift to probabilistic outputs allows users to directly manage the trade-off between false negatives and false positives, which should make the nowcast useful for a greater variety of geographic settings and applications. In a retrospective analysis, the trained model ran over a global domain for 5 years, and results for LHASA version 1 and version 2 were compared. Due to the importance of rainfall and faults in LHASA version 2, nowcasts would be issued more frequently in some tropical countries, such as Colombia and Papua New Guinea; at the same time, the new version placed less emphasis on arid regions and areas far from the Pacific Rim. LHASA version 2 provides a nearly real-time view of global landslide hazard for a variety of stakeholders.


INTRODUCTION
Landslides cause thousands of casualties and substantial socioeconomic impacts around the world every year (Kirschbaum et al., 2015a;Froude and Petley, 2018). Rainfall is the most frequent trigger of these landslide events, although earthquakes and anthropogenic impacts such as mining and construction can also be tremendously destructive. Near realtime information on the spatiotemporal distribution of potential landslide hazards may mitigate loss and improve the speed and effectiveness of disaster response and recovery. Remotely sensed data can provide a global view of this hazard to advance understanding of landslide processes and improve landslide monitoring and prediction.
At a local scale, the basic physics behind mass movements of rock and soil (herein landslides) are well understood and incorporated into the practice of geotechnical engineering. While physically based models can be difficult to apply over large areas, some practitioners have reported success with this approach (Raia et al., 2014;Hsu et al., 2018;Thomas et al., 2018). Physically based models typically require a significant amount of information about subsurface conditions and can be sensitive to small errors in these values (Iverson et al., 2015). Because accurate descriptions of the ground are rarely available for large areas, most regional landslide hazard assessment systems instead rely on an empirical approach .
Many different empirical approaches to spatial and temporal prediction of rainfall-triggered landslides have been applied since the 1970s, with hazard assessment or early warning systems applied in many regions of the world . Most commonly, researchers rely on thresholds that take into account both the intensity and the duration of rain storms (e.g. Gariano et al., 2015;Segoni et al., 2018), with inclusion of snow and snowmelt as a contributing factor in some models (e.g. Chleborad et al., 2008;Krøgli et al., 2018). A growing number of systems also use soil moisture as a predictor of potential instabilities (e.g. Brocca et al., 2012;Mirus et al., 2018;Felsberg et al., 2021). While snow water equivalent and soil moisture content have a straightforward connection to the saturated conditions under which many landslides occur, the same is not true of land cover. Although rarely considered as an influencing factor for landslide triggering , land cover is often used to calculate landslide susceptibility; in turn, susceptibility is often used to predict rainfall-triggered landslides (Hong and Adler, 2007;Kirschbaum et al., 2015b;Monsieurs et al., 2019).
Although many landslide models have been developed at local and regional scales, few have characterized landslide hazard at a global scale. The National Aeronautics and Space Administration (NASA) has developed a nearly global landslide nowcasting system that locates the most hazardous conditions in near real-time. Landslide Hazard Assessment for Situational Awareness (LHASA) version 1 is a decision tree model that produces a map of potentially hazardous landslide areas between 60°North and South latitude with three categorizations: low hazard, moderate hazard, and high hazard (Kirschbaum and Stanley, 2018;Emberson et al., 2020). It combines satellite precipitation estimates with a global landslide susceptibility map that incorporates information on roads, faults, geology, forest loss, and topography . LHASA version 1 was evaluated with landslide reports from the Global Landslide Catalog (GLC; Kirschbaum et al., 2015a), but the model structure, susceptibility analysis, and rainfall threshold were derived from literature review and expert judgment, rather than a data-fitting process. LHASA version 1 runs at NASA's Goddard Space Flight Center and integrates a multi-satellite product developed by the Global Precipitation Measurement (GPM) mission, which merges multiple satellite precipitation estimates to produce the Integrated Multi-satellitE Retrievals for GPM (IMERG; Huffman et al., 2020). The most recent nowcast can be accessed through an ESRI REST API or viewed at https://landslides.nasa.gov/viewer/ (Figure 1).
Although LHASA version 1 has provided nearly global situational awareness of potential landslide activity, new landslide inventories and satellite-based datasets have recently enabled the use of a data-driven approach to landslide nowcasting at the global scale. Machine learning is now widely used in landslide susceptibility mapping (Korup and Stolle, 2014;Segoni et al., 2015;Reichenbach et al., 2018), and it has been proposed for use in dynamic predictions of slope failure in some cases (van Natijne et al., 2020). Machine learning promises important benefits: improved overall accuracy and probabilistic outputs that better reflect the low but non-zero chance of landslides in flat and dry regions. However, machine learning also presents several challenges. First, the combination of numerous variables with large datasets can lead to overfitted models that are unreliable or even physically unrealistic. Second, these models are often complex to the point of inscrutability, which decreases trust in even well-founded methods. Finally, model performance is almost entirely tied to the available data, which presents a significant challenge at the global scale. Despite these limitations, recent research shows that these issues can be addressed when reasonable precautions are taken .
In this work, we have outlined a data-driven approach to global rainfall-triggered landslide hazard assessment that outperforms LHASA version 1. The goal was to advance from categorical to probabilistic global nowcasts of rainfall-triggered landslide hazard, as well as substantially increase the model's accuracy. We present the methodology for LHASA version 2 here and compare the performance of both versions over the period 2015-2020 with new landslide inventories. These nowcasts are intended to facilitate disaster planning and response at regional to global scales by a broad range of stakeholders such as governments, relief agencies, emergency responders, and insurers.

Explanatory Variables
Several globally extensive datasets were considered for use as predictive variables (predictors) in LHASA. These included Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 the variables previously used to generate the global landslide susceptibility map  and to run LHASA version 1 (Kirschbaum and Stanley, 2018).
In addition, several new variables with the potential for explaining the variability in landslide occurrence were tested. The predictors retained were those with an obvious FIGURE 1 | This map shows a nowcast (version 1) for November 18, 2020, during the passage of Hurricane Iota through Nicaragua and Honduras. The latest global landslide nowcast can be viewed (along with supplementary information) at https://landslides.nasa.gov/viewer/. A prerequisite for considering a dataset was that it was globally extensive and publicly available.
Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 3 relevance to landslide occurrence and were used in one or more decision trees during model training. These variables are snow mass, soil moisture, slope, distance to faults, lithologic strength, antecedent rainfall, and current daily rainfall (Table 1).
Slope is a strong predictor of landslides because it can identify areas too flat for most landslides to occur. A key component of the existing global susceptibility map, this variable was calculated by first deriving the slope for each 90 m grid cell in the Viewfinder Panoramas digital elevation model (de Ferranti, 2014), then aggregating these values by selecting the maximum value within each 30-arcsecond grid cell. The maximum value was used in order to avoid omitting generally flat regions in which some hazardous locations may still exist.
The primary rationale for using distance to active faults as a landslide predictor is that earthquakes trigger landsliding, while also rendering landscapes more landslide-prone for years following (Marc et al., 2015). From a long-term perspective, faults are the site of tectonic activity, which often involves mountain building and high erosion rates, including mass wasting. The distance to the nearest active fault exhibits a strong empirical association with landslide occurrence, as will be shown in A Data-Driven Nowcast. This variable was derived from a global database of active faults that is not complete but covers most of the world's land surface (GEM Hazard Team, 2019).
The physical properties of earth materials control whether and how landslides occur under forcing conditions such as intense rainfall and seismic shaking. Although detailed properties of the soil and rock profile are not globally available, use of regional geologic maps improves landslide susceptibility determinations (Amatya et al., 2019). Even simplified groups based on rock type are useful for modulating rainfall thresholds at the regional scale (Peruccacci et al., 2012). The release of open data sets such as the global lithologic map used in this work (Hartmann and Moosdorf, 2012) should enhance landslide research at the global scale. The map contains 16 major lithologic classes, which we use to inform models of landslide occurrence by broadly representing geotechnical properties based on weathering susceptibility that are independent of topography and seismicity. In A Global Lithologic Rating, we describe how this map was converted into a lithologic strength rating.
Antecedent soil moisture can reduce the rainfall necessary to induce slope failure Zhao et al., 2019). In situ sensors provide good estimates of soil moisture fluctuations in hilly terrain (Thomas et al., 2019), but are not globally available as a dense network with rapid reporting. Both models and satellites can offer this global coverage, albeit at a coarse spatial resolution. The Soil Moisture Active Passive Level 4 (SMAP L4) product (Reichle et al., 2018) combines a hydrologic model with satellite observations to produce a highquality product with continuous coverage. Bessette-Kirton et al. (2019) found that wet areas identified by SMAP L4 were associated with areas of dense landslide occurrence in Puerto Rico and Felsberg et al. (2021) found that SMAP data performed well relative to other global measurements of soil moisture. However, SMAP L4 has a latency of roughly 2.5 days, which means that it can only be used in a real-time product as an antecedent contributing factor, not the proximate cause of slope failure. SMAP L4 includes several variables related to soil moisture. LHASA version 2 uses "total profile soil wetness," a dimensionless measure of the soil water content between the ground surface and bedrock. Values range from 0, indicating no water content, to 1, indicating complete soil saturation. Unlike the volumetric soil moisture, this variable represents a locally standardized quantity, obviating any data transformation by the nowcast algorithm.
The presence of snowpack can be an important factor that increases the severity of rainfall-triggered landslide events in some regions (Chleborad, 2000;Sarikhan et al., 2008;Musselman et al., 2018;Vega et al., 2020). Falling rain can melt snow, which increases the total volume of water available to infiltrate and increase subsurface pore pressures. Although a few variables related to snow are available in the SMAP L4 product, only snow mass was chosen to avoid inclusion of highly correlated model inputs and to reduce processing time. The snow mass variable is measured in kilograms per square meter and is not normalized by LHASA version 2. The inclusion of snow mass enables the global landslide nowcast to reflect the potential for melting snow to run off or flow into groundwater, combining with the total from rainfall.
Precipitation is the fundamental triggering variable considered in this model. Consistent with LHASA version 1, IMERG is used to identify conditions under which landslide triggering is more likely. IMERG provides high-quality, lowlatency estimates of precipitation (Huffman, 2016;Huffman et al., 2020) and has been used in landslide modeling at multiple scales (Kirschbaum et al., 2015b;Kirschbaum and Stanley, 2018;Hartke et al., 2020). For this study, we use IMERG version 06B, which encompasses data from both the Tropical Rainfall Measuring Mission (TRMM) and the GPM mission, because it provides a nearly 20 years record of precipitation from June 2000 to present. IMERG provides three latencies of products to support different user groups: early (∼4 h latency), late (∼12-14 h latency), and final (3.5 months latency). In this model, we use the IMERG Early Product, which is available at a 0.1 degree, 30 min spatiotemporal resolution at https://gpm. nasa.gov/data/imerg. In addition to estimating precipitation depth, IMERG estimates the probability that precipitation will be in liquid phase. This is determined with reference to the wet-bulb surface temperature from numerical weather predictions, which is the key variable for separating precipitation phases (Sims and Liu, 2015). (Wet-bulb temperatures combine temperature with humidity to account for the effects of evaporation). Because falling snow and freezing rain are unlikely to trigger rapid snowmelt or landslides, LHASA version 2 uses liquid precipitation, referred to as "rain" or "rainfall" in this manuscript. For this study, daily data were used to develop the global landslide nowcast because the exact time of occurrence is not available for most historical landslide events.

Landslide Data
In order to represent the spatiotemporal distribution of rainfalltriggered landslides, dozens of published landslide inventories were compiled with a few privately held landslide inventories. Original published landslide datasets were retrieved from Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 university research articles, government agencies, and professional contacts within the landslide research community. From these, we chose inventories for use in model development ( Table 2) that were authoritative, welldocumented, or from a trusted source. In addition to previously developed inventories, the NASA landslides team developed an additional set of event-based inventories triggered primarily by major rainfall events using a semi-automatic landslide detection method described below. Manual mapping of landslides after a major triggering event can be extremely time consuming. Amatya et al. (2021) presents an open-source tool for rapid mapping of landslides from high resolution optical imagery using semi-automated techniques. The Semi-Automatic Landslide Detection (SALaD) system relies on object-based image analysis and machine learning to create polygon representations of landslides. This tool has been applied after several disasters, including events in Zimbabwe and Vanuatu ( Table 2). It was also applied retrospectively to map landslides triggered by Cyclone Komen in Myanmar. Similar inventories were produced with commercial software for events in Burundi, Brazil, and Japan (Amatya et al., 2019). In addition, landslide outlines were hand mapped from optical satellite imagery for events in the Philippines and India (Emberson et al., 2021). Finally, an inventory of landslide initiation points in Pokot, Kenya was manually mapped from a Sentinel-2 satellite image (Benz and Stanley, 2020).
In addition to these event-based landslide inventories, NASA has maintained the GLC, a multitemporal inventory of rainfalltriggered landslides. This database contains over 11,000 landslide reports (Table 2), which have been obtained primarily from news media (Kirschbaum et al., 2010;Kirschbaum et al., 2015a). The GLC has been completed for the years 2007-2017, and the first half of 2018 is also complete. Some linguistic and economic biases are known to affect the GLC, which are outlined in Kirschbaum et al. (2010) and(2015). Nevertheless, it is an invaluable resource for research at the global scale because it represents far more individual landslide events than are available as event-based landslide inventories. The GLC has recently been supplemented by the Landslide Reporter Catalog (LRC), an inventory of landslides reported by citizen scientists (Juang et al., 2019). Although it is much smaller than the GLC, the quality of reports in the LRC appears to be comparable to those in the GLC. In addition, the spatiotemporal distribution of points in the LRC is somewhat different from the GLC, so it may help correct some limitations of the GLC. Both catalogs can be viewed and downloaded at https://landslides.nasa.gov/viewer. As part of the National Climate Assessment, landslide inventories from across the United States of America were compiled into a single database of rainfall-triggered landslides with known dates (Kelkar et al., 2017). Although these multitemporal inventories contain many independent landslide events, most reports are not within the temporal window considered for this analysis, which starts in 2015, or lack the necessary spatial precision for model training.
Several datasets from other researchers were also used for developing a global landslide nowcast. The Australian government has published a long-term national record of landslides (Geoscience Australia, 2018), while Ecuador has published many landslides that occurred during the year 2016 (Secretaría de Gestión de Riesgos -Escenarios, 2016). Inventories of landslides triggered by major rainfall events in Colombia (Marc et al., 2018), Dominica (van Westen and Sijmons, 2016; Some landslides within these inventories were not used due to the event date or other concerns. Although these inventories represent a broad swath of terrain from every inhabited continent, the collection is far from a complete record of rainfall-triggered landslides, even for the years 2015-present. Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 van Westen and Zhang, 2018), and Puerto Rico (Hughes et al., 2019) enriched the coverage of the Americas. Taken together, the combined inventories from NASA and other researchers provide a much more extensive view of landslide occurrence than is available from any single dataset.

A Climatic Correction Factor
The rainfall necessary to initiate slope failure varies geographically (Baum and Godt, 2009), due to topographic, lithological (Peruccacci et al., 2012), and climatic factors (Wilson and Jayko, 1997). With respect to climate, normalizing model precipitation thresholds by mean annual precipitation alone is unlikely to succeed over large areas because it fails to consider the frequency and intensity of the rainfall events that compose annual totals (Wilson, 2000). Instead, geographic variations can be accounted for by evaluating real-time precipitation relative to extreme events. In LHASA version 2, we use the 99th percentile to represent relatively extreme events, and divide daily rainfall totals at each pixel by that amount. To calculate the historical 99th percentile rainfall at each grid cell, a set of empirical observations for each 0.1°grid cell in the area from 60°N to 60°S was generated from the final IMERG daily rainfall totals for the years 2000-2018. For each grid cell, a synthetic log-normal distribution was fitted to the data with the SciPy python library (Virtanen et al., 2020), because the log-normal distribution fits satellite rainfall data well (Cho et al., 2004). Moreover, the lognormal distribution is not as sensitive to the skewness of rainfall data when compared with the gamma distribution, which means the estimation of extreme values using synthetic fits is less influenced by outliers in the empirical data. A global 0.1°r aster that represented the 99th percentile rainfall was generated from the log-normal distributions defined for each grid cell.

A Global Lithologic Rating
The global lithologic map (GLiM; Hartmann and Moosdorf, 2012) was converted to a lithologic strength rating by using maximum topographic slope as an indicator of material strength. This analysis relies on the physical assumption that stronger rocks will, on average, support steeper slopes in mountainous terrain (Schmidt and Montgomery, 1995). For each lithologic unit, we extracted the distribution of maximum slope values at 1 km 2 resolution. The average of this maximum slope distribution was calculated for each lithologic category, from which a ranked order of lithology was determined (Table 3).
For this analysis, we excluded low-relief areas (maximum slope < 25°) because much of the world's terrain lacks relief for reasons other than lithologic strength. This choice of slope threshold was subjective but balanced the size of the area observed with the need to focus on mountain belts. Some units from the GLiM database were combined and others were excluded. Combining units with similar slope distributions but small spatial extents avoided spurious results and produced a classification where most lithologic groups had similar spatial representation in our analysis of between 12 and 24% of the total analyzed area ( Table 3). For example, volcanic rocks and plutonic rocks of variable chemistry were merged into two categories, respectively. The exception to these roughly equal-extent categories was "weak sedimentary rocks," which is represented by evaporites and unconsolidated sediments and covers just 3% of the analyzed area. GLiM categories of "no-data," water bodies, and ice were excluded from the slope analysis and assigned a lithologic rating of zero.
The resulting ranking by slope largely follows weathering susceptibility of primary rock forming minerals (e.g. Wilson, 2004), with the exception of weak sedimentary materials where lithification likely plays a dominant role. Although this rating was constructed with a small portion of the global slope dataset, it represents an independent and physically meaningful quantity. It is worth noting that this classification relies on lithology alone, and therefore lacks consideration of rock age used by other proposed lithology ranking schemes for landslide susceptibility (e.g. Nadim et al., 2006). We favor use of GLiM because the map resolution is much higher than other global datasets that include rock age. We also recognize that soil cover is highly variable and contributes to landslide susceptibility that is not considered by our analysis. This approach also avoids the use of landslide data as a means of determining the relative strength of lithologic units, which would pose problems related to the small number of landslides recorded in some lithologic units, confounding variables, and the independence of the validation data. A recent global analysis of erodibility used similar methods and data to produce a series of lithologic ratings that resemble but are not identical to that in Table 3 (Ott, 2020).

A Global Gridded Landslide Inventory
In order to reflect the diversity of landslides triggered by rainfall, numerous landslide inventories were obtained ( Table 2). These data spanned a gamut of spatial, tabular, and text formats. Each inventory was stored in a geodatabase if basic information such as the spatial coordinate reference system could be ascertained. Events without an associated date were omitted from the geodatabase. Landslides that occurred before the availability of SMAP data in 2015 were removed, as were landslides with spatial uncertainty worse than 1 km, and landslides caused by triggers other than rainfall (e.g. earthquakes). Some specific types of mass

A Data-Driven Nowcast
In order to achieve the full potential of machine learning, we employed XGBoost, a widely used framework (Chen and Guestrin, 2016). Due to its accuracy and speed, XGBoost has been applied to numerous problems, including landslide research (Zhao et al., 2018;Chakraborty et al., 2019;Sahin, 2020;Stanley et al., 2020). This algorithm builds an ensemble of trees by creating a single tree at each iteration. Each successive tree corrects the deficiencies of the existing ensemble. These trees are often called "weak learners" because they are poorly predictive as individuals but can be extremely powerful as a collective.
XGBoost can be applied in other ways, but LHASA version 2 has been structured as a binary classification problem in which the algorithm was trained to predict the presence or absence of landslides. In addition to speed and accuracy, XGBoost offers several additional benefits that led to its use in the global landslide nowcast. It has a large user community, extensive documentation, and access via multiple programming languages. XGBoost also offers features that can make the trained model more realistic and reliable. Interaction constraints control which variables can be included in the same tree. Monotonicity constraints limit the direction in which each variable contributes to the final output, but the weight of that contribution is still determined empirically. The latter is an important tool to prevent overfitting . Finally, the structural similarity between its trees and the tree structure of LHASA version 1 (Kirschbaum and Stanley, 2018) should help users to make the switch. For these reasons, the new global landslide nowcast was created by training a model with the XGBoost library.
In LHASA version 2, information from a variety of time scales is merged to produce a map of landslide hazard for the current moment. The model incorporates slowly changing or static variables, such as topography and lithologic strength, as well as time variant ones including rainfall and soil moisture ( Table 1). LHASA version 2 incorporates the most recent daily rainfall available in IMERG, which has a 4 h latency ( Figure 2). The possible contribution of melting snowpack to runoff and groundwater is represented by the variable snow mass from SMAP L4. The full profile soil wetness from SMAP L4 represents the state of surficial ground water prior to the current rainfall. Because SMAP L4 has a latency of about 3 days, the gap between antecedent soil wetness and current daily rainfall must be filled by an antecedent rainfall metric (Figure 2). In LHASA version 2, the antecedent rainfall is simply summed over the most FIGURE 2 | Four dynamic predictors, derived from satellite remote sensing, are used to produce the global landslide nowcast. The full-profile soil wetness and snow mass variables from the SMAP level 4 have a latency greater than 2 days. Therefore, 2 days of antecedent rainfall are used to fill the gap between these variables and the most recent rainfall.
Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 recent 2 day period. This combination of data from multiple time periods was intended to succinctly represent processes that contribute to landslide hazard over a range of time scales. These variables were translated to a probability of landslide occurrence with XGBoost by using the gridded global landslide inventory as a target dataset with a daily 30-arcsecond resolution. (All predictors were regridded to this resolution with the nearest-neighbor method.) The nowcast model was trained with data from the period April 3, 2015 to December 31, 2018. (Any incomplete data points were removed.) This dataset included 9,700 landslide grid cells and over 1 million grid cells randomly selected across space and time to represent the nonoccurrence of terrestrial landslides. Each class was split randomly into a training dataset that contained 80% of the available grid cells and a testing dataset that contained the remaining 20%. This testing dataset was used only for model development; landslide inventories from the years 2019 and 2020 were retained for model evaluation, as described below. An ensemble of 1,000 trees was trained, but performance on the test dataset declined after 300 iterations. Therefore, only the first 300 trees were retained for subsequent predictions, including model validation. Table 4 summarizes the model settings. Model depth was limited to a maximum of 2, in order to enable a specific arrangement of interaction constraints. A rarely used feature of XGBoost, interaction constraints control which variables can be included in the same model tree. In this scheme, all variables were forced to interact with rainfall, but other interactions were disabled. This arrangement was intended to represent the effects of all other preconditions on the rescaled daily rainfall necessary to trigger landslides. This model design parallels the "triggercause" framework proposed by Bogaard and Greco (2018), although its intellectual origins lie with the recent-antecedent rainfall thresholds for the city of Seattle in the western United States of America (Chleborad, 2000;Scheevel et al., 2017). Relatively few changes to other model settings were required ( Table 4).
The trained model relies most heavily on the current daily rainfall to determine whether landslides are probable, as indicated by standard metrics of variable importance: gain, coverage, and frequency ( Table 5). Gain represents the information obtained from the average split on that variable, coverage represents the portion of data points for which a variable impacts the model outputs, and frequency indicates the number of times XGBoost makes a split on a variable. Due to the design of the interaction constraints, rainfall would be expected to show a high frequency-but the high gain and coverage indicate a strong empirical relationship between current daily rainfall and landslide occurrence. The same pattern can be seen in an example tree (Figure 3), but the other variables occupy a large proportion of the nodes in other trees. These variables were retained in the final model because they may be important in scenarios such as rain-onsnow events, even if the overall weight is less than for rainfall. Data preprocessing and model training occurred on a virtual machine with ten central processing units, 120 gigabytes of memory, and a Linux operating system. It took approximately 8 min to produce a global nowcast for 1 day. Input-output operations were more significant to system performance than the machine-learning model itself, which requires less than a minute to process the entire world for 1 day. In order to speed up processing, the retrospective model run was divided across multiple virtual machines, each handling 1 year of data. These resources were provided by the Advanced Data Analytics Platform at the NASA Center for Climate Simulation (https:// www.nccs.nasa.gov/systems/ADAPT).

Evaluation Across Inventories
The model was applied to data from May 2015 to April 2020 based on the length of the SMAP L4 archive. As in model training, incomplete data points were dropped; these were typically coastal pixels masked in SMAP L4. The resulting daily maps of landslide probability were limited to the zone between 60°North and South. In order to evaluate the model's performance against the discrete outputs of LHASA version 1 (Emberson et al., 2020), a threshold of 0.12 was applied to the probabilistic outputs of LHASA version 2; this ensured that the discretized nowcasts would have the same false positive rate (FPR). This threshold is solely intended for comparison and is not necessarily optimal for any specific use. An operational system should consider multiple goals prior to the selection of one or more probability thresholds to determine landslide hazard levels.
Temporal separation of training and validation data has been recommended for hydrological modeling (KlemeŠ, 1986) and  Gain represents the information obtained from the average split on that variable, coverage represents the portion of data points for which a variable impacts the model outputs, and frequency indicates the number of times XGBoost makes a split on a variable. The current daily rainfall was the most important variable by all 3 metrics, but seismicity and topography are also important factors.
Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 machine learning (Galvez et al., 2019), because the use of random train-test splits or cross-validation can reduce the independence of the validation dataset by using data from the same event for both model development and evaluation. In keeping with this practice, landslide inventories collected for rainfall events in the years 2019 and 2020 were used to evaluate the model. Then, the predicted probability for the date and location of each landslide was extracted. For comparison, the LHASA version 1 high-hazard nowcast was also extracted for each landslide. Landslides with probabilities higher than the threshold were considered true positives, and the others were considered false negatives. With these values, a true positive rate (TPR) was calculated for each validation inventory, as well as the combined meta-inventory.

RESULTS
LHASA version 2 outperformed version 1 for all landslide inventories evaluated ( Table 6). Overall results were excellent FIGURE 3 | The 1st of 300 trees in the XGBoost ensemble uses current rainfall and distance to faults (m). This indicates that hazard is highest when rainfall is high and distance to faults is low. Other trees in the model have a similar structure but may split these or other variables at different numerical values. LHASA version 2 outperformed version 1 for all inventories. The true positive rate (TPR) was highest for major landslide events triggered by tropical cyclones. The mean prediction was obtained by determining the probability that would have been output for the time and date of each historical landslide; then, the arithmetic mean of these values was calculated. Percentages have been rounded.
Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 for both the training and validation data. However, LHASA version 2 failed to predict a probability greater than 0.12 for the majority of landslides in several inventories: the GLC, LRC, USLI, and the databases for Australia, Burundi, and Rio de Janeiro. With the exceptions of Burundi and Rio de Janeiro, high probabilities of landslide occurrence were obtained for all the event-based landslide inventories (Figure 4). Possible reasons for these failures are discussed below. Given a threshold of 0.12, LHASA version 2 produced nowcasts most frequently in landslide hotspots ( Figure 5). We assume that the frequency of nowcasts serves as an approximation of the true false alarm rate, because landslide events are quite rare in most locations. The most notable hotspots were in the nations of Colombia, Indonesia, and Papua New Guinea, where nowcasts were produced more than 10% of the time in small areas. However, LHASA version 2 would have produced nowcasts quite rarely in most of the world (Figure 5), and even large mountainous regions would have a false alarm rate less than 1%. In contrast, LHASA version 1 produces nowcasts at a relatively consistent rate in susceptible zones because it uses a percentile-based threshold (Kirschbaum and Stanley, 2018). A few of these nowcasts represent landslide events, but most are not associated with reported landslides.
In addition to evaluating the model's performance relative to the existing global nowcasts, the predictions can provide some insight into patterns of landslide hazard-although the 5 years timeframe precludes the detection of long-term trends or climate patterns. The mean daily probability of landslide occurrence for each 30-arcsecond grid cell shows the relative hazard across the world's temperate and tropical regions. This average is very low in flat or dry regions such as the Sahara Desert and the Australian Outback ( Figure 6). Like the global landslide susceptibility map , this map highlights the major mountain ranges and the Pacific Rim. However, the importance in LHASA version 2 of rainfall and proximity to active faults ( Table 5) has reduced the emphasis on certain areas, such as eastern North America, coastal Brazil, and Tibet, relative to the susceptibility map. This retrospective analysis also identified a  few locations with unusually high probabilities of landslide occurrence: the Northern Andes, the Indonesian archipelago, and the East African Rift ( Figure 6). The distribution of hazard across the latter region largely corresponded to that shown in a data-based landslide susceptibility map of Africa (Broeckx et al., 2018), but LHASA version 2 places relatively less emphasis on other African regions. Figure 6 shows greater differences with a map of precipitation-triggered landslide hazard for Indonesia (Cepeda et al., 2010), which placed a higher emphasis on the islands of Sulawesi, Borneo, and Flores. Despite some changes in emphasis, the overall geographic distribution of landslide hazard identified by LHASA version 2 generally corresponds with findings from prior landslide research.

DISCUSSION
Comparison of the results for different inventories shows a remarkable divergence in accuracy ( Table 6). One explanation could be the predominance of landslides triggered by tropical cyclones within the training data. In this hypothesis, the model is well-calibrated to predicting major cyclonic events but not isolated landslides. However, we believe this is not the sole-or even the primary-cause of the divergence. Some important false negatives, such as the February 6, 2019 landslides in Rio de Janeiro ( Figure 4) and the Regent landslide in Sierra Leone, were not predicted due to the absence of heavy rainfall shown by IMERG. IMERG tends to underestimate the intensity of some mesoscale convective systems (Cui et al., 2020) because the orbiting passive microwave sensors within the IMERG constellation can miss short, intense peaks in precipitation if there is not a recent overpass. However, underestimation of peak rain rates should be less important for the IMERG Early product (Maranan et al., 2020). In contrast, IMERG estimates heavy rainfall from tropical cyclones well (Omranian et al., 2018;Huang et al., 2019). Recent object-based analysis also found that IMERG performs better with large, intense storms, although it often underestimates peak precipitation rates . These known errors may limit the resolution of short but intense convective events that can form and dissipate quickly, causing localized landslides and flashfloods.
Although LHASA version 2 may miss some of these events, the IMERG algorithm team is working to better resolve extreme precipitation events, which should improve the model's future performance.
In addition to errors in precipitation estimates, the divergence could be explained by anthropogenic factors. The multitemporal inventories, such as the GLC and USLI, contain many landslides in anthropogenically modified terrain, such as along roads and in mines. These sites may have required far less rain to trigger landslides than the relatively natural areas covered by much of the event-based inventories. The performance of the model on the LRC data points suggests that these submissions are of comparable quality to the GLC and other report-based inventories; thus, future models could be trained with these data. This should not be too surprising, since all points in the LRC are reviewed by NASA staff prior to publication (Juang et al., 2019). LHASA version 2 does not incorporate any measures of anthropogenic disturbance, which limits its ability to predict many life-threatening landslides. Deforestation is one mechanism through which humans alter the environment and increase landslide hazard. This factor was not included in LHASA version 2, but it may be possible to detect the effects of deforestation on landslide hazard at the global scale. Future work can also address the challenge of anthropogenic impact by considering one or more indices of disturbance that could be added to the model as explanatory variables.
Although incorporation of SMAP data in LHASA version 2 provides valuable information on antecedent conditions, it also limits the nowcast to a specific spatiotemporal domain. SMAP's land mask excludes some small coastal areas such as the delta north of Balubuk, Indonesia, but these are very small in comparison to the earth's total land mass. More significantly, the SMAP L4 product does not cover dates prior to the Spring of 2015. This limits the applicability of LHASA version 2 to long-term analyses of climate and landslide hazard, as well as the length of the model calibration period. However, it is still possible to train the global landslide nowcast with the available data, and the inclusion of both snow mass and soil moisture in some decision trees ( Table 5) indicates that these variables are acting to reduce errors. In addition to the limited domain, using the SMAP product introduces a second limitation to the model's real-time implementation to the prior potential for interruptions to FIGURE 6 | The mean prediction over the time period May 1, 2015 to April 30, 2020 was lowest in many of the world's deserts and highest in small areas of the Northern Andes and New Guinea.
Frontiers in Earth Science | www.frontiersin.org May 2021 | Volume 9 | Article 640043 IMERG data production and transfer. These interruptions are expected to be rare, but one such outage coincided with the passage of Hurricane Iota through Central America in November 2020 with the result that nowcast publication was delayed by one day. Nevertheless, based on the model evaluation, the benefits of incorporating antecedent conditions from the SMAP L4 product outweigh the challenges. Future iterations of the LHASA version 2 model may consider additional soil moisture products or integrate information from upcoming missions. The most labor-intensive task in developing LHASA version 2 was compiling the gridded global landslide inventory. The lack of standardized file formats, schema, and publication methods prevented the use of fully automated methods for filtering and merging data from a wide variety of institutions. Even locating openly available landslide inventories required substantial effort. The United States Geological Survey hosts a repository of earthquake-triggered landslide inventories that has achieved a relatively high level of participation by the research community (Schmitt et al., 2017), which facilitates finding and obtaining the relevant data. Unfortunately, no comprehensive collection of rainfall-triggered landslides has achieved comparable success. A few researchers have shared data through NASA's Cooperative Online Landslide Repository 1 , but most data on that portal was produced by NASA. We hope that the new LandAware effort will help to coordinate data sharing across a diverse research community (Calvello et al., 2020).
Machine learning has a reputation for producing "black box" models that cannot be easily understood. Although this might be true for recurrent neural networks, tree-based models like XGBoost are much easier to interpret. The contents of each tree can be shown in human readable format and reviewed individually for plausibility. Even for ensembles with hundreds of trees, this effort could be completed in finite time by a human reviewer. In the development of LHASA version 2, we took additional measures to build trust and enhance model interpretability. The use of monotonicity constraints, interaction constraints, and a tree depth limit of 2 forced the model into a simple and consistent structure. Not coincidentally, this structure parallels that of LHASA version 1, with its division into triggering rainfall and contributing factors. This familiarity should help end users transition between model versions. These practices not only facilitated model interpretation, but also prevented overfitting. We consider this ease of interpretation a crucial strength of LHASA version 2, and one that should more easily allow regulators and other stakeholders to trust the model results.
In addition to building trust, model interpretability may enhance scientific research by illuminating which known physical processes have acted most frequently, most strongly, or in specific instances. Sensitivity analysis, scenario analysis, and visualization can also be important tools for this purpose. In evaluating the model, we produced numerous diagrams in which two factors were systematically varied from historical conditions. Figure 7 shows FIGURE 7 | Contours from an analysis in which the current day's rainfall (rescaled at each grid cell) and antecedent rainfall were varied over a wide range of potential inputs, while keeping all other variables consistent with historical conditions on December 16, 2015 in Eastern Luzon, Philippines (122.3542,16.73756). Dark blue indicates low probability of landslide occurrence, while yellow indicates high probabilities. The shape and spacing of the contours indicate that both variables play an important role in landslide hazard, but the tighter spacing of the contours along the y axis suggests that the model weights current daily rainfall more heavily than prior rainfall. one example of this process in which the relative contributions of current and antecedent rainfall can be seen in one historical scenario.
The contour shape and spacing shows that the rescaled daily rainfall variable has a strong influence on model outputs, while the antecedent rainfall has a lesser but still substantial influence. Since standard validation metrics can be misleading (Steger et al., 2016), we recommend checks like these to all empirical landslide modelers.
The shift from LHASA version 1 to version 2 doubled the accuracy of the global landslide nowcast for the inventories evaluated. Furthermore, the new version produces a continuous output that facilitates use cases with a wider variety of trade-offs between false positives and false negatives. The model succeeded at identifying landslides triggered by major tropical cyclones, but missed a majority of landslides contained in national and global multitemporal inventories, which tend to contain more isolated and anthropogenically influenced landslides. Nevertheless, LHASA version 2 outperforms the existing product on these inventories as well. NASA Goddard Space Flight Center has begun publishing the new global landslide nowcast, which should be available by the middle of 2021. To enable other institutions to run the model, the source code will be published at https://github.com/nasa/LHASA. Future research should be directed toward locating and quantifying the effects of anthropogenic slope modification, given the underestimates of landslide hazard in disturbed terrain. In addition, the dynamic effects of wildfire, earthquakes, and forest loss have not been fully captured by the variables used in this model; more research must be done to incorporate these into LHASA. Finally, more work needs to be done to convert these results into real-time estimates of global landslide risk. We anticipate that with greater sharing of landslide inventories, vulnerability estimates, and other information across the global communities of scientific research and disaster response, these goals may be achieved. LHASA version 2 advances our understanding of landslide hazard and could help to identify populations and infrastructure at risk from landslides around the world.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
DK was responsible for the strategic vision and stakeholder interactions. MC designed the global lithologic rating, and WM prototyped this variable. RE calculated the statistical distribution of IMERG rainfall estimates and manually mapped landslides. PA mapped landslides with a semiautomatic object-based approach. GB searched for landslide inventories and converted all inventories into a standard format for ingest into LHASA. TS developed the LHASA version 2 model and performed all other tasks.

FUNDING
This research was supported by NASA's Disasters program through the solicitation for Earth Science Applications: Disaster Risk Reduction and Response (NNH18ZDA001N).