Probabilistic Tsunami Hazard and Risk Analysis: A Review of Research Gaps

Tsunamis are unpredictable and infrequent but potentially large impact natural disasters. To prepare, mitigate and prevent losses from tsunamis, probabilistic hazard and risk analysis methods have been developed and have proved useful. However, large gaps and uncertainties still exist and many steps in the assessment methods lack information, theoretical foundation, or commonly accepted methods. Moreover, applied methods have very different levels of maturity, from already advanced probabilistic tsunami hazard analysis for earthquake sources, to less mature probabilistic risk analysis. In this review we give an overview of the current state of probabilistic tsunami hazard and risk analysis. Identifying research gaps, we offer suggestions for future research directions. An extensive literature list allows for branching into diverse aspects of this scientific approach.


INTRODUCTION
Tsunamis are rare but potentially devastating natural hazards. With often limited available data, a coherent framework that incorporates data, physical assumptions (i.e., the general model of the system), and statistical methods for hazard and risk analysis is necessary to assess consequences affecting different layers of societies. To further develop, standardize and document such a framework is the underlying objective of COST Action AGITHAR (Accelerating Global Science in Tsunami Hazard and Risk Analysis; AGITHAR, 2020) and this article forms one outcome of the Action.
Probabilistic tsunami hazard and risk analyses (PTHA and PTRA, respectively) offer structured and rigorous procedures that allow for tracing and weighting the key elements in understanding the potential tsunami hazard and risk in globally distributed applications (e.g., Basili et al., 2021). Because of this, PTHA are becoming a standard basis for tsunami risk assessment around the world. Significant challenges in this analysis method are 1) the choice of hypothetical events and assigning "correct" probabilities, and ii) the impact of source regions distributed throughout an ocean basin and, conceivably, unifying distinct types of sources in a homogeneous probabilistic framework with a comprehensive treatment of uncertainty. The great importance of PTHA is due to its practical implications for society providing information for long-term planning and coastal management in areas where potential tsunamis may occur. Conversely, PTRA are still less abundant and standardized than PTHA, as elaborated in this review.
Few mega-tsunamis have been observed in the instrumental period, a timeframe spanning from approximately the 1960s to today. Thus, it is challenging to confidently assess the rate at which consequential tsunamis will occur. Predominantly seismically triggered tsunamis comprise about 80% of all tsunamis worldwide (e.g., Harbitz et al., 2014) with the remainder caused by landslides, volcanoes, or meteorological phenomena.
The sparsity of background data and requirements in engineering applications have driven the development of probabilistic methods for assessing tsunami hazard and risk aiming for unbiased comparisons of different hazards (natural and anthropogenic) as well as their uncertainty quantification. In recent years, the probabilistic framework has been increasingly applied. However, broadly accepted approaches are not yet defined, and potentially incompatible implementations of probabilistic methods are used in different regions across the world, and different tsunami source types are often treated separately and are rarely combined.
In this study, we have documented current gaps and open research questions related to PTHA and PTRA. We have organized this review into two main sections, one focused on tsunami hazard and the other on risk. We preface these topics with a brief introduction to the probabilistic framework underlying both PTHA and PTRA. Note that we grouped the gaps in numerical modeling in the hazard analysis related section, even though modeling may also be considered a cross-cutting topic. We believe, however, that the mentioned gaps are more related and addressed in a similar way as the other hazards related research gaps.
While PTHA and PTRA allow for including uncertainty in a consistent way, it is necessary to point out that it is not always simple to describe the knowledge gaps formally, for example through alternative models, and quantify their impact on hazard and risk models in terms of epistemic uncertainty (i.e., caused by lack of knowledge or data, Kiureghian and Ditlevsen, 2009). Overall, the research gaps identified in this study are "known unknowns" (e.g., Logan, 2009) and deserve more thorough research efforts, in order to determine their influence on the overall outcome of the PTHA or PTRA workflow.
This fact makes it hard to determine quantitatively the importance of each of the research gaps. Nevertheless, we tried to assess-in a qualitative way-the relative priority of research gaps and discuss this in the last section of this report.

PROBABILISTIC FRAMEWORK
In this section, we present a structure for probabilistic hazard and risk analyses. An overview is given in Figure 1. More in-depth reviews of identified gaps related to the individual probabilistic framework components are discussed in sections "Probabilistic Tsunami Hazard Analysis" and "Probabilistic Tsunami Risk Assessment".
The purpose of PTHA is to find the probability for a tsunami intensity measure (IM) to exceed a given threshold in a predefined time interval. Note that, in the PTHA framework, "Intensity Measure" is used with a meaning that differs from the "tsunami intensity scale" used, for instance, in tsunami catalogs to define the "size" of a tsunami or the effects it produces inland. In the PTHA context, an IM is a physical observable strictly connected to the physics of the process. Common IMs are wave amplitude, flow depth, current velocity, momentum flux, or maximum inundation height, depending on the problem setting (Grezio et al., 2017).
Different probabilistic framework alternatives for computational PTRA exist. One option, rooted in seismic risk analysis, is performance-based risk assessment, presented by PEER (Pacific Earthquake Engineering Center) in 2000. The term performance-based is often used in contraposition to traditional prescriptive assessment procedures for seismicresistant building design (Fardis, 2009). The performancebased framework aims to provide a practical yet rigorous workflow and has also been used for risk assessment for hurricanes (e.g., van de Lindt and Dao, 2009;Barbato et al., 2013), floods (De Risi et al., 2013;Jalayer et al., 2016), and tsunamis (Chock et al., 2011;Chock, 2016;Attary et al., 2017). This framework can be organized in different modules; for example, hazard and vulnerability or hazard, fragility and connects vulnerability with fragility and describes physical damage. DV connects fragility with consequences and reaches out to decisionmakers with numbers of casualties, repair costs, or downtimes. Interestingly, several risk-informed decision-making processes related to tsunamis are based on PTHA information only (e.g., hazard-based evacuation zones, hazard-based land-use planning). As an example, the criterion "flow depth (IM) larger than a threshold (im)" can be used as a basis for decision-making (e.g., assigning evacuation zones). In other words, an IM can act as an intermediate variable (intensity measure) as well as a decision variable.
To illustrate the framework, suppose a finite set of N hypothetical tsunamigenic sources representing all possible tsunami events affecting the site of interest. Each event occurs randomly in time and independently of all others (i.e., as a Poisson process). The tsunami hazard curve-the main outcome of PTHA-describes the mean annual rate of a tsunami event affecting location x with an intensity measure IM(x) greater than some threshold im, denoted as λ(IM(x)≥im). This can be expressed as: where λ M min,i is the mean annual rate of occurrence of tsunamigenic events from source i (e.g., earthquakes, landslides, etc.) having magnitudes M exceeding M min , f M is the conditional probability density function for M ≥ M min,i , and f S|M is the probability density function of the set of source parameters S given magnitude M. The aleatoric uncertainty associated with variable source characteristics can be represented by probabilistic prediction models of the source parameters. Finally, p(IM(x)≥im|s,m) is the complementary cumulative distribution function of IM given S s and M m, and can be evaluated through tsunami simulations. Note that Eq. 1 can be used only if sources are independent; a counterexample being a landslide generated from the same earthquake that amplifies the ensuing tsunami's destruction. Epistemic uncertainty in PTHA is often accounted for using logic trees or, more recently, ensemble modeling, which allow alternative hypotheses for uncertain parameters, each of which is assigned a weight reflecting confidence in the respective parameter value (e.g., Geist and Parsons, 2006;Selva et al., 2016;Grezio et al., 2017). Equation 1 is computed for each logic tree 'end branch'.
Building on tsunami hazard, the tsunami loss curve at any location is calculated by convolving vulnerability and hazard over the entire span of IM: where λ(DV ≥ dv) is the mean annual rate of occurrence of DV larger than a threshold dv. Vulnerability is expressed through the complementary probability distribution function denoted as G DV|IM (dv|im), for DV given IM, and is itself calculated by integrating fragility and consequence functions (see also Figure 1): with • f DM|IM , the tsunami fragility function, predicts the probability of incurring a particular value (dm) of damage measure DM (e.g., damage states) for a given IM im; • G DV|DM (dv|dm), the tsunami consequence function (e.g., the damage-loss function), expressed as the complementary cumulative distribution function of DV given DM.
Strictly speaking, Eqs. 1 and 2 do not consider multi-hazard and multi-risk aspects such as cascading effects, combined damage due to tsunami loading and earthquake shaking. Assuming a Poisson process, the rate of exceedance λ is often transformed the first excursion of a specific value dv for a generic decision variable DV in the time Δt (e.g., 1 year, 50 years):

PROBABILISTIC TSUNAMI HAZARD ANALYSIS
This section discusses gaps in PTHA, focusing on those in tsunami sources and hydrodynamic modeling. Each subsection includes a summary of the present state-of-the-art, followed by an in-depth discussion of the gaps.

Identified Gaps
Limited Past Events and Data to Inform Hazard Models (S1) Completeness and quality of historical earthquake data, needed to constrain seismic source parameters, varies greatly depending on the history of the investigated geographical region (Stucchi et al., 2004;Albini et al., 2014). Event catalogs are generally too short to account for the source frequency needed to model large average return periods in PTHA. The description of earthquake Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 4 recurrence and of the tail of the frequency-magnitude distribution is highly uncertain (Kagan, 2002;Geist and Parsons, 2014;Rong et al., 2014;Bommer and Stafford, 2016). In the attempt of constraining this uncertainty, seismic source parameters have been estimated globally using seismic or geodetic data or both (e.g., Bird and Kagan, 2004;; however, these types of input data are not always considered by PTHAs. Moreover, a framework for constraining PTHA directly from tsunami observations exists (Geist and Parsons, 2006;Grezio et al., 2017), while treatment of incomplete catalogs is described by Smit et al. (2017). Where possible, other data types should also be considered. Paleoseismic and paleo-tsunami catalogs may help constrain or validate at least large event recurrence (e.g., Priest et al., 2017;Paris et al., 2020), while GPS-constrained strain accumulation can indicate the total seismic moment rate (e.g., Hayes et al., 2018). Care should be taken of potential biases coming from overweighting evidence of large tsunamis in the past (Geist and Parsons, 2006).

Fault Identification, Fault and Source Zone Parameterization and Tsunamigenic Potential Characterization (S2)
Tsunami sources are often constrained from infrequent offshore geologic studies investigating very large areas. Therefore, geologic fault data are often incomplete, causing a wide range of source knowledge levels (Basili et al., 2013). Seismic source characterization for SPTHA generally refers to properties of pre-existing large faults, and often only to great subduction zone sources. All other-mostly crustal-faults are seldomly considered in PTHA, although non-subduction earthquakes may control tsunami hazard, especially when located near the target site (Selva et al., 2016). Despite overall good constraint of subduction interface geometries (e.g., Hayes et al., 2018), alongstrike trench segmentation and its impact on rupture propagation remains uncertain, limiting rupture forecasts and hindering estimates of subduction earthquake maximum magnitude (e.g., Bilek, 2010;Kopp, 2013;Grezio et al., 2017). Whenever fault knowledge is incomplete, more randomized "background" seismicity modeling is needed, with less predictable geometry and seismic behavior compared to subduction interfaces (Sørensen et al., 2012;Selva et al., 2016). Fault slip rates can constrain seismicity recurrence parameters; these can vary both spatially (Zechar and Frankel, 2009) and temporally (e.g., Ota and Yamaguchi, 2004;Ramírez-Herrera et al., 2011;Tiberti et al., 2014) but usually only averages are considered due to a lack of information. Kagan and Jackson (2014) pointed out that more research would be needed for focal mechanism forecasting; identifying the prevailing faulting mechanism is a critical task particularly in tectonically complex environments. This is expected, in turn, to exert a strong influence on tsunami hazard.

Variety, Complexity, and Dynamics of Fault Mechanics (S3)
Source simplification represents a dominant uncertainty in SPTHA (Geist and Oglesby, 2014). Its effect on seafloor deformation needs to be investigated better, concerning deformation models that incorporate complex material properties, geometrical complexity, varying depth-dependent fault conditions, dynamic simulations including off-fault damage and near-surface amplification, which all may increase tsunami hazard (Masterlark, 2003;Ma, 2012;Kozdon and Dunham, 2013;Ryan et al., 2015;Murphy et al., 2016;Lotto et al., 2017;Murphy et al., 2018;Scala et al., 2019;Scala et al., 2020;Tonini et al., 2020). Secondary ruptures including splay faulting may happen as an independent source or as part of a large earthquake on the subduction interface (Wendt et al., 2009;Li et al., 2014;Hananto et al., 2020).
Tsunami earthquakes produce excessively large tsunami intensities compared to their moment magnitude (Polet and Kanamori, 2016), and their global and local frequency is unconstrained. A simplified characterization of tsunami earthquakes, which is sometimes adopted, assumes larger slip associated with less rigid materials at shallow depths to preserve the seismic moment (e.g., Bilek and Lay, 1999;Geist and Bilek, 2001). These and other very complex ruptures, potentially containing fault branching, rupture jumping, and mixed-mode slip (e.g., Ulrich et al., 2019a;Ulrich et al., 2019b), are not well represented in PTHA. On a larger scale, rupture area may be shared by more than one subduction interface, like in the case of triple junctions (e.g., Solomon event 2007, Lorito et al., 2016). Due to a lack of observations the likelihood of such events is uncertain and quantification of their relative contribution to SPTHA therefore difficult.
Due to all these uncertainties and the extreme computational demand for dynamic computation, numerical simulations are de facto replaced with heterogeneous stochastic slip modeling (e.g., Herrero and Bernard, 1994;Mai and Beroza, 2002;Davies et al., 2015;Sepúlveda et al., 2017), and less frequently with stochastic stress modeling (e.g., Wendt et al., 2009). Because source observations are relatively scarce, more statistical tests (Davies and Griffin, 2019) are needed for source model validation.

Empirical Scaling Relations (S4)
Several different empirical scaling relations are used to define earthquake rupture properties, such as length, width, average slip, and earthquake magnitude (e.g., Wells and Coppersmith, 1994;Murotani et al., 2008;Blaser et al., 2010;Strasser et al., 2010;Murotani et al., 2013;Goda et al., 2016;Skarlatoudis et al., 2016;Allen and Hayes, 2017;Thingbaijam et al., 2017). These relationships quantify appreciable uncertainties that are seldomly accounted for in SPTHA. These relations imply stress drop and time-dependent rupture characteristics and self-similarity of earthquakes across scales, but this is apparently violated in some cases. For example, the 2011 Tohoku earthquake released a huge amount of slip in a relatively small portion of the subduction interface compared to the Sumatra 2004 or Chile 1960 earthquakes (Okal, 2015); scaling relations are not directly applicable to abnormally slow and unusually large shallow slip occurring in low-rigidity zones during tsunami earthquakes.

Complex, Non-stationary Seismic Cycle (S5)
Even in the simplest subduction environment, the seismic cycle over co-seismic, inter-seismic and post-seismic phases is complex and non-stationary, for example due to visco-elastic rheology and Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 5 the role of fluids (Wang et al., 2012;Moreno et al., 2014;Melnick et al., 2017). Time-dependent models could potentially be used to estimate the stress transfer from one earthquake to the neighboring faults (King et al., 1994). Stress transfer from megathrust earthquakes triggering outer-rise ruptures or possibly even the opposite are such examples (e.g., Lorito et al., 2016). Based on seismic catalogs, it is possible to infer non-Poissonian earthquake recurrence, for example earthquake clustering (Kagan and Jackson, 1991). A time-dependent model, which could better describe the probability of earthquake occurrence for some specific applications or timeframes, is taken into account by only a few PTHAs (e.g., Goda et al., 2017;Goda, 2020).

Other Constraints (S6)
It is reasonable to assume that high seismic coupling correlates with future slip location. Under simplifying assumptions, alongstrike geodetic coupling variation can be inferred from geodetic strain (Métois et al., 2012). Large uncertainty remains, particularly regarding the near-trench region (Loveless and Meade, 2011). Recent developments in seafloor geodesy and modeling techniques are offering improved constraints (e.g., increasing offshore coupling resolution, Bürgmann and Chadwell, 2014;Foster et al., 2020), and slow slip events and consequently the stress evolution on the fault (e.g., Araki et al., 2017). High seismic coupling combined with stress accumulation in areas of seismic inactivity is described as a seismic gap. The possibility of using seismic gaps to identify zones of enhanced seismic hazard has long been debated (e.g., Bilek and Lay, 2018). Attempts to obtain physically motivated constraints on the maximum magnitude utilizing convergence rate, age of the oceanic crust and sediment thickness have been rather unsuccessful (Okal, 2015). Ongoing research explores these and other controlling factors of subduction zone seismicity, including small-and large-scale roughness of the subduction interface, static friction coefficient, upper plate strain and rigidity, dip angle and curvature (e.g., Heuret et al., 2012;Bletery et al., 2016;Sallarès and Ranero, 2019;Rijsingen, et al., 2019;Muldashev and Sobolev, 2020). Additionally, rupture cycles and supercycles over multiple segments controlled by geological asperities have been proposed (Philibosian and Meltzner, 2020). Similar to some of the previously discussed items in this section, no consensus has been reached on the statistical meaning of such information and on how to frame it within SPTHA.

Identified Gaps Lack of Understanding and Likelihoods for Tsunamigenic Landslide Volumes (L1)
For submarine landslides, we refer to the reviews of Huhn et al. (2019) and Harbitz et al. (2014). The challenge can be attributed to several factors: • Limited or insufficient mapping of past landslide occurrences. Their characteristics and lack of dating prevent constraining the age of the sediments without excessive uncertainty ranges (e.g., Geist et al., 2013). The new global landslide database initiative ) is a good starting point for standardizing, but not yet complete enough for feeding LPTHA. Good data coverage exists for certain regions such as the Mediterranean (Urgeles and Camerlenghi, 2013), Gulf of Mexico (Pampell-Manis et al., 2016) and the US East Coast (Chaytor et al., 2009. • Limited understanding of how past landslide recurrence can be projected into the future hazard, including time and geological context dependency. For example, we cannot yet generally link climatically driven trends to past landslide frequency (Urlaub et al., 2013). However, it is concluded that the last ice age affect present landslide probability offshore US (Lee, 2009) and Norway (Bryn et al., 2005). • Limited available geological and geotechnical data inhibit identification of failure-prone sediments and discrimination from stable areas, including weak failure zones, pore pressure conditions or fractures, as well as obstacles or structures. When data exist, they may be proprietary, and a challenge is related to the need for covering very large geographical and heterogeneous regions. A methodological gap exists in bridging geotechnical data and slope stability models (e.g., Carlton et al., 2019) to volume-frequency relationships.
Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 6 • Limited data and knowledge on triggers of landslides, such as meteorological or seismic events, impedes the quantitative assessment of potential landslide magnitude.

Difference of Onshore and Offshore Landslides (L2)
The specific character of subaerial and submarine landslides is often vastly different. Potential direct or indirect trigger mechanisms are sometimes not fully understood or difficult to embed into the probability of failure (e.g., precipitation-induced landslides, weak zones and fluid overpressure, range of failure propagation and cascading failure propagation spread). Understanding and estimating the annual probability of landslide failure in rock slopes with complex fracture patterns and stress conditions is associated with large uncertainty. Extensively monitored rock slopes in Norway (e.g., Blikra et al., 2005) show large motion over decades before failure takes place, rendering assessment of failure probability difficult. Matching expert judgment (e.g., Hermanns et al., 2013) to observed landslide magnitude frequency statistics (e.g., Nes, 2018) will help aggregate understanding of landslide frequencies and help link knowledge on failure-prone areas to probability. While epistemic uncertainties in the described situations are large, current LPTHA models do not incorporate them.

Limited Constraints on Landslide Dynamics and Material Behavior (L3)
The interplay of diverse tsunamigenic landslide parameters makes the generation complex, implying that much less voluminous landslides may be more effective tsunami generators than respectively larger ones. As an example, we note that the approximately 500 km 3 Traenadjupet Slide that occurred 4,500 years BP likely produced a moderate coastal impact possibly of just a few meters (e.g., Løvholt et al., 2017), while the 100 times less voluminous 1998 Papua New Guinea landslide induced more than 10 m run-up locally (e.g., Tappin et al., 2008). Because tsunami genesis is tightly linked to landslide acceleration as well as rate of mobilization of the landslide volume (e.g., Løvholt et al., 2005), quantifying the rate and nature of the slope failure is important. Just a few studies discuss the effect of initial failure rate on tsunami generation (e.g., Trapper et al., 2015;Germanovich et al., 2016;Puzrin et al., 2016) and related aspects such as remoulding and cascading failures on the landslide tsunami generation (e.g., Løvholt et al., 2017;Kim et al., 2019;Zengaffinen et al., 2020). How to include these factors and their associated probabilities in PTHA is not resolved. While advanced numerical models for landslide dynamics exist (e.g., Tinti et al., 1997;Jop et al., 2006;Savage et al., 2014;Si et al., 2018a;Si et al., 2018b;Kim et al., 2019;Wang et al., 2019;Gallotti and Tinti, 2020), their complexity and variety obfuscate understanding on which models are most suitable to be used. Furthermore, some models (e.g., Savage et al., 2014;Si et al., 2018a;Si et al., 2018b) are presently too comprehensive to be used in PTHA. Procedures for linking them to measured material properties and geological settings are not in place. Finally, fluid resistance forces (pressure drag, skin friction, and added mass) may be as important as the landslide properties, in particular for submarine landslides and further investigating physical understanding is necessary to constrain epistemic uncertainty.

Limited Availability of Benchmarks (L4)
Suitable benchmarks have recently been made available (e.g., Huang and Garcia, 1998;NTHMP, 2018;Kim et al., 2019), but are arguably less mature and fewer than their hydrodynamic modeling counterparts (e.g., Pedersen, 2008;Synolakis et al., 2008). A challenge is a transition from simplified laboratory tests to real-world landslide representation. Moreover, while numerous empirical lab experiments exist, they are significantly influenced by scale effects (Heller, 2011). Neither complex rheological behavior nor real-world complexity is covered in the benchmarks. Complex laboratory experiments (e.g., Rondon et al., 2011) can be used for validating CFD models, but CFD models are presently too computationally expensive for tsunami hazard analysis modeling.

Limited Past Events to Inform Hazard Models (L5)
Information about past landslides and tsunamis can be used to infer landslide dynamics uncertainty. This can be done using landslide run-out information alone (e.g., Salmanidou et al., 2017), which consequently yields broad epistemic uncertainties in LPTHA. By using tsunami information, such uncertainties can be drastically reduced (e.g., Gylfadóttir et al., 2017;Kim et al., 2019;Løvholt et al., 2020). In practice, however, very few landslide tsunami data are available.

Existing Methods
Volcanic PTHA, coined VPTHA here, is even less developed than LPTHA (Grezio et al., 2017). Among the few examples are the VPTHA framework developed in Ulvrova et al. (2016) and Paris et al. (2019) for underwater explosions at Campi Flegrei, and Grezio et al. (2020) for pyroclastic flows of Vesuvius. Given that risk reduction measures at volcanoes are often related to the identification of precursory patterns preceding eruptions or to recognizing unrest episodes with increased volcanic activity, the volcanic hazard is often computed conditional to eruptions or unrest, and without an explicit quantification of long-term probability. For example, in Paris et al. (2019), the hazard analysis (Campi Flegrei, Naples, Italy) is confined to conditional tsunami intensity probabilities, due to probabilistic realizations of eruptions with different vent size and location.

Identified Gaps
Variety of Potential Volcanic Sources (V1) Tsunamigenic volcanic events are diverse and they include both eruptive and non-eruptive triggering phenomena, such as underwater explosions, pyroclastic flows, lahars, slope failures, volcanic earthquakes, shock waves from large explosions, and caldera subsidence (Latter, 1981;Kienle et al., 1987;Begét et al., 2005;Day, 2015;Paris, 2015;Grezio et al., 2017). A large range of wave characteristics is typical for volcano tsunamis, even if most such sources are localized and generate mainly short-period waves with greater dispersion and limited far-field effects compared to earthquake-generated tsunamis (e.g., Yokoyama, 1987;Nomanbhoy and Satake, 1995;Le Méhauté and Wang, 1996;Choi et al., 2003;Watts and Waythomas, 2003;Bellotti et al., 2009;Maeno and Imamura, 2011;Ulvrova et al., 2016;Selva et al., 2019Selva et al., , 2020. However, tsunamis are among the farthest propagating volcanic perils, often generating regional impact (e.g., Krakatau, Stromboli, Ischia, etc., see for example Paris et al., 2014;Rosi et al., 2018;Selva et al., 2019;. Notably, some of the tsunamigenic volcanic events overlap with those recorded for seismic and landslide tsunami: flank collapse, slope failure and even pyroclastic flows are related to landslides (Løvholt et al., 2015b;Paris, 2015); volcano-tectonic earthquakes occur with high frequency in volcanic areas (Paris, 2015). Such frequency information as well as understanding material properties and transformation during flow should draw upon volcanological expertise. It is often difficult to define a single generation phenomenon since different potentially tsunamigenic processes can occur during the same volcanic episode, especially during large caldera-forming eruptions (Paris, 2015).

Difficulties in Constraining Recurrence Rates (V2)
Since volcanic tsunami generation is so diverse, constraining recurrence rates for the different source types as eruptive phenomena (Walter et al., 2019), unrest episodes (Tinti et al., 1999;Selva et al., 2020), and triggered subaerial landslides  is difficult. The integration into a multi-source VPTHA is further complicated by the need for accounting for the complex interdependencies that may exist among the different source mechanisms. The hazard is often nonstationary through time (e.g., Bebbinghton, 2008;Bebbinghton, 2010), which represents another challenge.

Gaps in Modeling Tsunami Generation and Propagation (V3)
Extensive reviews on existing strategies to model volcanic sources are found in Paris, (2015), Grezio et al. (2017) and Paris et al. (2019). Given the complexity, an important part of the hazard analysis is oriented toward understanding the physical mechanism of generation, and how to represent this probabilistically. Similar to landslide generated tsunamis, volcano tsunami modeling suffers from the difficulty of coupling the complex dynamics of the generating event and its interaction with wave propagation. For example, pyroclastic flows are complex, multi-phase phenomena involving the interaction of high-temperature gases and volcanic clasts covering a very large range of granulometric dimensions (Freundt, 2003;Bougouin et al., 2020). This difficulty leads to simplified modeling schemes (e.g., Bevilacqua et al., 2017;Sandri et al., 2018). These simplified strategies may be too reduced for an effective constraint of their tsunami potential (Grezio et al., 2020). Some phenomena may be represented by empirical models (for submarine explosions, see Paris et al., 2019, and for caldera collapse, see Ulvrova et al., 2016). Experimental and numerical simulations coupled with field data increased understanding of the physics and main parameters of volcanic tsunamis (Grezio et al., 2017).

Lack of Data From the Geological Record (V4)
Tsunami is often not dealt with in the volcanological community, although it may be more fatal than other volcanic hazards such as lava flows or ash falls (Auker et al., 2013;Brown et al., 2017). Consequently, a systematic investigation of tsunami-related data in geological surveys at volcanoes is often missing. Because different volcanic phenomena may trigger tsunamis, even when tsunami data exist, attributing the observation to a specific mechanism is difficult (e.g., Krakatau 1883 eruption: Paris et al., 2014). Therefore, a systematic collection of available volcano-generated tsunami data and linking to potential volcanic generating processes is required. This will imply defining a strategy of tsunami-oriented monitoring around coastal volcanoes. It would be useful to combine such efforts with existing data collections such as the Global Volcanism Program (Global Volcanism Program, 2013).

Limited Availability of Well Recorded Past Events or Benchmark Studies (V5)
Only a few past events are well constrained in terms of both the source and of the subsequent tsunami (e.g., Unzen 1792, Karymskoye Lake 1996; Montserrat 1997, Anak Krakatau 2018Stromboli 2002 and. The lack of consensus in modeling procedures for each type of tsunamigenic volcanic event, along with the tendency to consider all sources as "unique", complicates the task of defining benchmarks for volcano tsunamis.

Existing Methods
Meteotsunami PTHA, coined MPTHA here, was developed only recently (see Grezio et al., 2017). A framework for MPTHA development is proposed by . The dynamics of meteotsunamis are fairly well-known (e.g., Monserrat et al., 2006;Sibley et al., 2020), related to unusually strong and rapid atmospheric pressure fluctuations and resonance effects causing strong waves closely associated with the behavior of tsunamis. The source mechanisms of meteotsunamis are also well understood (Monserrat et al., 2006;Pattiaratchi and Wijeratne, 2015) with a major driver a Proudman resonance (Proudman, 1929). Because meteotsunamis are strongly linked to (un)favorable combinations of pressure fluctuations, shallow (shelf) bathymetry, and directivity of the weather system, they take place more frequently in specific geographical areas, such as in the Adriatic Sea (Vilibić andŠepić, 2009), the Baltic Sea (Pellikka et al., 2020), and the East Coast of the United States (Pasquet et al., 2013). The main input data for meteotsunamis include meteorological pressure data, preferably with full spatial and temporal characteristics of the pressure field for given meteorological events. Such data can be used to provide synthetic probabilistic source scenarios as input to an MPTHA, where an example for the Northeast US coastline is given by . While this field does Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 8 not share the data sparsity issues that are associated with volcanoes and landslides, large uncertainties persist, as briefly discussed below.

Identified Gaps
Lack of Understanding the Potential and Likelihood for Tsunamigenic Meteorological Patterns (M1) A systematic assessment of potential source areas and exposed coastal regions is not available. Some regional studies can serve as a preliminary indication (e.g., Dusek et al., 2019;Šepić et al., 2012;Sepić et al., 2016), but a rigorous catalog is missing. Climatological information is likely available, but a systematic extraction of data concerning meteotsunami potential has not been performed. It is not clear whether the resolution of available climatological data (e.g., from reanalysis) is sufficiently fine to allow for the extraction of corresponding relevant meteotsunami source patterns.

High Sensitivity to Several Parameters and Lack of Understanding of Local Amplification Factors (M2)
Whitmore and Knight (2014) demonstrate the high sensitivity of typical tsunami impact to source parameters and hence a large gap in knowledge on relevant localized parameters. The size, speed, amplitude, directivity, and duration of an atmospheric disturbance resonating with the water column in a specific topographic setting need to be known to assess the hazard. Therefore, such parameters need to be derived for all tsunamigenic regions, then applied to available climatological data sets, and finally fed into corresponding models for assessment of hazard. An assessment of amplifying tidal conditions in each of such regions is also missing.

Limited Availability of Benchmark Studies (M3)
While there are many individual meteotsunami events described in the literature (e.g., Churchill et al., 1995;González et al., 2001;Pasquet et al., 2013;Vilibić et al., 2014), no truly validated benchmark data are available for meteotsunami benchmarking. In principle, a similar methodology as described in Synolakis et al. (2008) could be followed. However, only very little unification of source modeling has been achieved and except for preliminary simplified tests (as in Vilibić, 2008), there exists no widely accepted test suite. This applies in particular to verification and validation of the probabilistic workflow of MPTHA.

Limited Past Events and Data to Inform Hazard Models (M4)
There is no consistent catalog of occurrences, although regional studies have been performed (e.g., Haslett et al., 2009;Woodruff et al., 2018). As stated before, there are no unified parameterizations of meteotsunami sources, which could be entered into such a catalog. Even though many individual events are described in the literature (see subsection above), these are by no means representative or complete to be used in hazard models. More rigorous collection of data with the special focus on meteotsunamis-background climatology, meteorological situation, ocean state, topo-bathymetry-for the diverse areas of interest would be desirable.

Existing Methods
Hydrodynamic tsunami modeling includes numerical simulation of tsunami generation, propagation as well as coastal and onshore impact. It is an essential part of any PTHA or PTRA analysis. Reviews of commonly applied methods are available (e.g., Pedersen, 2008;Synolakis et al., 2008;Behrens and Dias, 2015). The pre-eminent challenge is the need to bridge a broad range of scales. First, in the probabilistic regime, a comprehensive PTRA must consider a very large number of scenarios to cover all relevant tsunamigenic sources, explore wave physics, and quantify uncertainties. Second, for each individual scenario source, large-scale propagation and coastal inundation modeling (optimally at scales of 1-10 m) need to be represented to quantify tsunami-related on-shore damages and losses. However, the fastest HPC simulation workflows (e.g., de la Asunción et al., 2013;Oishi et al., 2015;Macías et al., 2017;Musa et al., 2018) still require typically 10-60 min to simulate tsunami inundation at a scale of tens of meters, rendering them unsuitable for extensive PTRA studies with up to millions of scenarios (Basili et al., 2021). To overcome this "challenge of scales", modeling approximations are presently necessary for PTHA feasibility and can either involve 1) largely reducing the number of inundation scenarios (e.g., González et al., 2009;Lorito et al., 2015;Volpe et al., 2019;Williamson et al., 2020), 2) use of approximate models or statistics such as amplification factors (e.g., Løvholt et al., 2012;Kriebel et al., 2017;Gailler et al., 2018;Glimsdal et al., 2019), or 3) machine learning-based tsunami emulators (e.g., Sarri et al., 2012;Salmanidou et al., 2017;Giles et al., 2020).

Identified Gaps PTHA Uncertainty Treatment for Tsunami Inundation Processes (H1)
At present, we lack well tested local PTHA benchmarks where the sources of uncertainties are effectively characterized, in a way that allows their formal propagation along the PTHA/PTRA assessment chain. Moreover, the effect of coseismic coastal displacement due to near field sources (e.g., Volpe et al., 2019), which affects tsunami inundation, should be investigated more deeply, especially when using techniques for reducing the number of scenarios. For this purpose, a large number of inundation scenarios are needed to quantify the epistemic uncertainty and bias caused by simplifications introduced through approximate methods. A local PTHA application using more than 40,000 earthquake sources (Gibbons et al., 2020) is only a start.

Tsunami Generation (H2)
Unit source models (Kajiura, 1963;Nosov and Kolesov, 2007;Molinari et al., 2016) of varying computational cost and complexity approximate the volumetric deep-water source displacements. While Lotto et al. (2019) clarified that the horizontal momentum does not effectively contribute to tsunami generation in deep-water sources, an extensive sensitivity analysis of how such simplifications affect PTHA Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 9 has not been carried out. Incorporating time-dependent and moving sources, be it earthquakes (e.g., Ulrich et al., 2019a), landslides (e.g., Løvholt et al., 2015b) or volcanoes, will involve much higher computational burden. How to limit the number of source time steps for time-dependent source modeling is sparsely studied (e.g., Zengaffinen et al., 2020). For landslide tsunamis, closed-form models (e.g., Cecioni and Bellotti, 2010) represent a simple alternative but can introduce biases when conveyed to real geographical settings, due to oversimplification or inadequacy for the real situation. Subaerial landslides and volcanoes are often simplified because the required consideration of full 3D hydrodynamics (e.g., Abadie et al., 2020) into PTHA poses too high computational demand. Hence, more research is needed for developing simplified timedependent sources compatible with PTHA demands, while quantifying the epistemic uncertainty and bias caused by the simplification. New methods may facilitate more detailed characterization of past inundation scenarios and their sources (e.g., Chagué-Goff et al., 2012;Sugawara et al., 2014;Paris et al., 2020).

Uncertainty and Variability due to Numerical Model for Tsunami Propagation (H3)
Most non-linear shallow water (NLSW) simulation codes produce similar results in the propagation phase, in particular in controlled benchmark cases (e.g., Synolakis et al., 2008). However, clear model differences can appear due to varying components (applied numerical method, workflow, sources, setup etc.) in practical applications. Comparing different numerical forecast models in the Indian Ocean, Greenslade et al. (2014) found large variations, attributed to differences in the workflow and source representation rather than to the tsunami model itself. Testing how such kinds of uncertainty quantification relate to "heterogeneous modeling practices" has not been carried out systematically. Moreover, a rigorous investigation of the performance of far-field propagation is sparse (Dao and Tkalich, 2007;Davies and Griffin, 2020). Differences in numerical dissipation and discretization can also contribute to modeling deviations. As there is no standardized test case for far-field propagation that could reveal the differences in performance of different approaches, it is pressing to address these issues more systematically. Due to the computational burden, most PTHA applications today employ shallow water type models, neglecting frequency dispersion, which can lead to bias. Dispersion can be incorporated through conventional dispersive wave solvers (e.g., Bellotti et al., 2008;Løvholt et al., 2008;Kim et al., 2009;Shi et al., 2012), or through manipulating numerical schemes in NLSW codes (like MOST, e.g., Wei et al., 2008), although the general applicability of the latter is presently not clear. A systematic investigation quantifying the effect of dispersion (as in Glimsdal et al., 2013) on PTHA for practical source configurations would be desirable.

Nonlinearity and Resonances (H4)
Most tsunami simulations to date start from an ocean at rest and assume that interaction of currents with gravity waves is negligible. Androsov et al. (2011) demonstrated that significant alterations of the wave height can be attributed to tidal activity. A quantitative sensitivity analysis of this effect, its dependence on bathymetry, and its correlation to the choice of model (NLSW) is necessary. Huthnance (1975) described the phenomenon of trapped waves on continental shelves that may trigger edge waves and other amplified phenomena. Tsunami resonance effects in Chile and the Balearic Islands are studied in Aranguiz et al. (2019) and Vela et al. (2014). Pattiaratchi and Wijeratne (2015) describe the effect of such phenomena as amplifying factors for meteotsunamis. It is currently unclear how such amplifying phenomena can be represented in the numerical model, nor if the strength is captured adequately.

Quantifying the Influence of Modeling Assumptions and Scaling (H5)
A hierarchy of modeling approaches, from shallow water assumption, over dispersive long wave solvers, to Navier Stokes type models, can be used to numerically treat tsunami hazard analysis in varying complexity. Due to ever-increasing computational resources, a trend toward more involved model equations can be observed. However, a clear quantitative assessment of the difference has only partly been performed. Lynett et al. (2017) use extensive benchmarking to study and compare modeling approaches to currents induced by tsunami waves. While this study is enlightening and provides very good benchmarking tools, further assessment is necessary to quantify the influence of higher fidelity modeling techniques. Generally, we note that current benchmarking (e.g., Synolakis et al., 2008) stays behind current high-fidelity modeling capabilities. Additionally, some benchmarks based on laboratory experiments have issues with scaling (see Heller, 2011;Pedersen et al., 2013), and related bias and accuracy have not been investigated systematically.

Modeling Situations With Complex Tsunami Inundation (H6)
NLSW models are predominantly used to simulate tsunami inundation. However, real inundation situations involve features too complex for NLSW approximate modeling, such as urban structures, or damage and erosion due to debris transport. At present, these topics are only partly represented, often using heuristic model formulations. Examples include spatially variable friction mapping (e.g., Gayer et al., 2010;Kaiser et al., 2011), or porous body equivalent friction models representing buildings (e.g., Yamashita et al., 2018). Bottom friction parameterization is almost insensitive for offshore modeling (see Arcos and LeVeque, 2015). However, variable bottom friction parameterizations may pose a viable tool for simulating detailed inundation, but large uncertainties still prevail (e.g., Griffin et al., 2015;Macías et al., 2020). While small scale laboratory tests exist (Park et al., 2013), the heuristic nature of named models and the difficulty to perform controlled tests, implies potentially large epistemic uncertainties. Debris impact and transport are predominantly addressed through post-disaster surveys and experimental analysis of data so far (e.g., Nistor et al., 2017a; Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 2017b; Stolle et al., 2019), and is mostly embedded in only vulnerability analysis (see below), and not in hydrodynamic modeling or PTHA to our knowledge. Extending the modeling dimensions and physical complexity is desirable (e.g., Marras and Mandli, 2021). Open and related to this issue is the influence and potential bias of the accuracy of topo-bathymetric grids, including filtering of structures and vegetation, on the accuracy of inundation simulations (see Griffin et al., 2015;Goda and Song, 2019). Unphysical bias can also be introduced when coupling high resolution (nested) models to large-scale propagation models as shown in Harig et al. (2008).

PROBABILISTIC TSUNAMI RISK ASSESSMENT
This section discusses identified gaps in PTRA. We go through current state for exposure modeling, physical vulnerability, and risk and resilience metrics, as they naturally follow each other in a consequence-based risk workflow (Figure 1). Methods characterizing the complex social, organizational, and economic context in a tsunami risk assessment are discussed subsequently.

Existing Methods
Exposure data provide information about the characteristics and location of people and assets at risk. There are several techniques for the acquisition of exposure data, with different degrees of resolution and precision (Pittore et al., 2017). Data from governmental agencies are most commonly used, as they are open and available in most developed countries. These data often provide coverage for the entire building inventory (e.g., physical assets) and are regularly updated for asset management (e.g., national technical maps) and fiscal reasons (e.g., cadastral data). Different exposure databases exist. The Global Exposure Database-GED (De Bono and Mora, 2014; De Bono and Chatenoux, 2015) developed for GAR13 and updated later for GAR15 (UNISDR, 2013; 2015) provides a global dataset at 5 km grid resolution at inland and 1 km at coastal locations, including data for buildings, their use, and exposed value. The 2013 and 2015 versions of the GED served as the exposure databases for the global risk model by the United Nations Office for Disaster Risk Reduction, which considered earthquakes, hurricanes, tsunamis and riverine floods as hazards. The DRMKC Risk Data Hub WebGIS tool (Antofie et al., 2019) has been developed to provide access and sharing of EU-wide information relevant for disaster risk management. Initiatives such as the Open Exposure Data (OED) with roots in proprietary catastrophe modeling and reinsurance industry, provide the opportunity to generate exposure data, including those relevant to tsunami risk, with interoperability between different modeling tools. These databases mainly contain data from census or remote sensing. A recent interview-based approach, relying on local practicing engineers with knowledge of building features, has been adopted for the compilation of building inventories at regional scales (Polese et al., 2020). Careful validation needs to address possible heterogeneity in data. At present, the only guidelines and tools that exist for capturing and classifying exposure data for a tsunami are the multi-hazard exposure taxonomy, and associated tools, provided by GED4ALL (Silva et al., 2018b).

Identified Gaps Lack of Detail (E1)
Most available exposure data have not been collected for the purpose of tsunami risk assessment and may be missing important information for modeling tsunami fragility or vulnerability. For instance, population cadastral data are often collected at the municipal, district or residential unit level, requiring extra assumptions to determine the geographical distribution. Tsunami hazard intensities can vary considerably between two nearby locations. Accurate geo-localization of the exposed assets and people is needed to obtain robust results, necessitating a minimum resolution level for the exposure databases. While main building construction characteristics are often known, tsunami relevant features (e.g., building lateral load resistance, foundation) are missing (Rivera et al., 2020). Exposure data for critical structures and infrastructure should include functionality information for the exposed asset. This would allow for proper modeling and hence assessment of community resilience, considering different services such as healthcare and education. In other cases, data gaps and uncertainties are associated with regulatory and privacy limitations or outdated sources.

Lack of Exposure Data (E2)
In many developing countries, where cities have rapid urbanization processes and long-term planning is not consistently enforced, exposure data are not always available or updated. Such data may be inferred from satellite and aerial imagery, from freeware data made available from international projects (e.g., NASA's EOSDIS), from volunteered geographic information systems (e.g., Huyck et al., 2011;Huyck and Eguchi, 2017;OpenStreetMap, 2020), or through intergovernmental organizations (e.g., JRC Risk Data Hub, 2020).

Lack of Tsunami Exposure Model and Taxonomy (E3)
Significant efforts have been made in the earthquake risk community to define a common exposure taxonomy (e.g., GED4GEM, Silva et al., 2018a;METEOR, Huyck et al., 2019). However, these taxonomies do not contain all the required structural attributes for estimating tsunami risk such as geomorphological, land use, and land cover datasets, or number and size of openings in buildings. A recent development is GED4ALL, a multi-hazard taxonomy (Silva et al., 2018b), which considers tsunami as a hazard. GED4ALL also discusses multiple asset types like buildings, people, infrastructure systems and crops. Common taxonomy and attributes are fundamental to avoid heterogeneity, especially when considering multiple asset types.

Spatio-Temporal Variability (E4)
Most exposure models are static in time and do not consider the spatio-temporal variability of exposure components. This aspect Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 is critical when modeling human exposure since there can be daily and seasonal variations. For example, coastal regions often attract tourists, visitors and seasonal workers, leading to significant seasonal fluctuations in the population (Fraser et al., 2014). Spatio-temporal variation in exposure heavily influences the tsunami risk.

Existing Methods
As tsunami losses are closely connected to damages to buildings and infrastructure, the vulnerability component is often cut into two parts: a tsunami-to-damage fragility function, and a damageto-loss consequence function (Figure 1). Advancements in tsunami vulnerability models have significantly lagged behind those of tsunami hazard, with almost no studies found to precede the 2004 Indian Ocean Tsunami (Charvet et al., 2017). However, with the recent devastating tsunamis providing a large quantity of observed damage and loss data to develop and validate fragility and vulnerability models, this field of study has rapidly grown. Several empirical fragility functions for the assessment of buildings (Koshimura et al., 2009;Mas et al., 2012;Suppasri et al., 2014;Charvet et al., 2015;Chock et al., 2016) and infrastructure (Eguchi et al., 2014;Hatayama, 2014) Alam et al., 2018;Karafagka et al., 2018;Páez-Ramírez et al., 2020), and under sequential earthquake and tsunami impact (Park et al., 2012;Attary et al., 2019;Petrone et al., 2020). Only a few studies exist that move from fragility to vulnerability modeling . There is a lack of consensus on many aspects of physical fragility and vulnerability modeling.

Identified Gaps Limitation in Asset Types and Geographical Scope (P1)
The vast majority of existing tsunami fragility and vulnerability models relate to buildings, few exist for bridges, fuel tanks, or other types of infrastructure. The main reason is that most fragility functions are empirical, and few observational damage or loss data are available for infrastructure components. Even for buildings, the geographical scope of existing vulnerability and fragility models is limited. Most empirical fragility functions are based on data from the 2004 Indian Ocean and 2011 Tohoku events, and hence represent non-engineered buildings in countries surrounding the Indian Ocean and engineered buildings typical of Japan. With analytical fragility functions only covering a small number of building types, large portions of the world's exposure remain unrepresented by current studies.

Effect of Multiple Hazard on Empirical Tsunami Fragility Mode (P2)
Tsunamis are commonly triggered by large earthquakes. Nearsource, observational data on asset damage and loss collected after the tsunami often include the combined effects of earthquake ground shaking and tsunami inundation. Hence, empirical fragility and vulnerability models derived from such data inherently comprise the effects of both hazards. Therefore, corresponding empirical fragility models may be regarded as inappropriate for use in a tsunami-only risk assessment. Pure tsunami damage data is rare and currently limited to nonengineered structures (Charvet et al., 2017).

Lack of Consensus Regarding Best Tsunami Intensity Measure (P3)
The intensity measure IM (Figure 1) links the hazard and vulnerability components within risk models. Traditionally, tsunami inundation maps are presented in terms of inundation depth. While the majority of fragility and vulnerability models adopt inundation depth as IM, other tsunami IM have also been used such as the flow velocity or momentum flux. The absence of inundation velocity measurements in field data requires running tsunami inundation simulations to use such IM (Koshimura et al., 2009;Song et al., 2017). More recently, force-based IM (e.g., flow velocity, momentum flux) were used in fragility functions for engineered buildings yielding better correlation to observed damage than inundation depth (Macabuag et al., 2016). However, no consensus on the most appropriate IM could be reached. As a consequence, mismatches between representations of hazard and vulnerability in risk modeling may exist.

Gaps in Building Analysis and Assessment for Use in Analytical Tsunami Fragility (P4)
Buildings are often used as vertical evacuation shelters and an assessment of their structural fragility is therefore an important information in the risk assessment workflow. Tsunami engineering being a younger discipline than earthquake engineering has adopted approaches from the latter community. This was supported by the physical similarity of both hazards applying predominantly horizontal loads to structures. However, there are fundamental differences in how earthquake and tsunami loads are applied to buildings. For example, tsunami loads affect the lower floors of a high-rise building, whereas seismic loads are inertial forces usually causing increasing magnitude for higher floors . Thus, earthquakes induce large bending moments in structural elements, whereas tsunamis typically induce large shear. Since typical structural modeling approaches tend to prioritize flexural effects, the bias in tsunami fragility assessment may be large. Furthermore, seismic loads are dynamic, whereas loads from tsunami inundation can be considered quasi-static, and Rossetto et al. (2018) have shown that building ductility is often not crucial in the tsunami response of structures. Although no consensus has been reached in this regard, more fragility functions based on static rather than time-dependent non-linear approaches are derived now Rossetto et al., 2019). As a tsunami applies direct pressures to a structure, non-structural components like infill walls (and their openings) are seen to play an important role in determining tsunami forces (Del Zoppo et al., 2021). Furthermore, buoyancy, foundation scour and debris impact, which significantly affect building damage from tsunami inundation are rarely modeled (Del Zoppo et al., 2019). These Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 effects are still to be investigated; therefore, published analytical tsunami fragility functions are subjected to large modeling uncertainties. Progress towards more comprehensive and reliable analytical fragility and vulnerability models is needed.

Existing Methods
Tsunami risk assessments typically reflect the impact on the exposed population and infrastructure. The most commonly used decision variables (or metrics) are the number of fatalities, injuries, affected people, besides the direct and indirect economic losses. Direct economic losses represent the repairing/replacement costs of damaged assets, whereas indirect losses reflect costs as down-time, partial loss of functionality of buildings and infrastructure, loss or reduction in network connectivity, flow and/or capacity. These metrics can be used in alternative approaches such as worst-case scenarios, scenariobased for a prescribed return-period, and fully probabilistic. A review of early methods for tsunami risk assessment can be found in Jelínek and Krausmann (2008). Fully probabilistic risk assessments require the integration of hazard estimates (PTHA) with vulnerability functions (see Figure 1, Løvholt et al., 2015a;. Since the results of PTHA are not always available, tsunami risk assessments are often performed considering selected (worst-case) scenarios as hazard input (e.g., Triantafyllou et al., 2019), which sometimes represent past disasters (e.g., Daniell et al., 2017). Having the results of PTHA available, tsunami risk assessment can also be performed for a limited number of scenarios (e.g., Nadim and Glade, 2006;Okumura et al., 2017). When the PTHA results are available in the form of stochastic event sets, a fully probabilistic tsunami risk assessment (PTRA) can be performed (Ordaz, 2000;Strunz et al., 2011;Salgado-Gálvez et al., 2014), although these types of analyses usually demand an extensive computational effort (e.g., Løvholt et al., 2015a;Jaimes et al., 2016;Goda and Song, 2019;Ordaz et al., 2019).
In a fully probabilistic tsunami risk assessment workflow, risk results are obtained in terms of exceedance frequencies for the above-mentioned metrics (Figure 1). For instance, loss exceedance curves (LEC) provide the relationship between loss values and their exceedance frequencies (Løvholt et al., 2015a;Jaimes et al., 2016;Attary et al., 2017;Ordaz et al., 2019). The area under the LEC corresponds to the average annual loss (AAL), a metric that provides a long-term overview of risk and accounts for the contribution of large and infrequent events as well as small and more frequent ones. From the LEC, loss values associated with a given return period can be obtained, such as loss values estimated by Løvholt et al. (2015a) at a global level representing direct losses. The Hazus tsunami loss estimation methodology provides state-of-theart decision-support software for estimating potential losses from tsunami events (FEMA, 2017a;FEMA, 2017b).
Risk assessment is not necessarily limited to quantifying the direct and indirect impact on exposed populations and infrastructures. The evaluation of safety and reliability of physical systems is of interest too and for this, fragility functions ("Gaps in Physical Vulnerability" section) can be integrated with hazard to obtain the frequency of exceeding a given damage level (see Figure 1, e.g., Park et al., 2019;Fukutani et al., 2019). The risk metrics provide valuable data also for the assessment of quantitative resilience (also denoted as engineering resilience), which aims to estimate the resilience of a network, an infrastructure, or even an urban ecosystem to a specific natural hazard (see Mebarki et al., 2016 for industrial plants, Akiyama et al., 2020 for bridges). Quantitative resilience should not be confused with coastal community resilience which is discussed in detail in the following section.

Identified Gaps Gaps Related to Characterization and Propagation of Uncertainties (R1)
Most existing PTRA models rely on a homogeneous Poisson process as the probabilistic backbone for the occurrence process (Eq. 4). The Poisson model, strictly speaking, should be used for propagating only those uncertainty sources that renew with the occurrence of each new event (Kiureghian, 2005). This means that propagation of other sources of uncertainties in a PTRA framework (i.e., those that lack renewal properties), such as the uncertainties in modeling, analysis method, and in general epistemic uncertainties, need more research Goda, 2020). One possible direction could point to Bayesian methods (Jalayer and Ebrahimian, 2020).

Challenges in Characterizing Vulnerability Functions (R2)
PTRA lacks a clear distinction and definition of the different loss components that are quantified through the vulnerability functions. On the one hand, direct economic losses can be estimated with a good degree of confidence using existing methodologies (Pagnoni et al., 2019). Long-term direct (e.g., cost of maintenance) and indirect losses (e.g., down-time and reduced functionality including business interruption) typically represent a significant component of the total economic loss (direct + indirect) yet require better quantitative approaches.

Lack of a Tsunami Consequences Database (R3)
There is a lack of tsunami-specific consequence databases accounting for casualties and losses (Yamao et al., 2015). These types of databases exist for disasters in general (e.g., EM-DAT) and more specifically for earthquakes (So et al., 2012;Cardona et al., 2018). They are useful not only to keep a consistent record of past events and the affected regions but to disaggregate the impacts of large events in terms of losses (direct and indirect) and casualties (fatalities and injured), besides assessing the consequences in particular sectors (e.g., road networks, heritage sites, etc.) at different resolution levels. The information included in the consequences databases provides valuable data to validate and calibrate different components of the models (e.g., fragility curves, vulnerability functions). Some data can be partially acquired from collections of documented eyewitness accounts (Santos and Koshimura, 2015), or other sources (e.g., ITIC, 2020).

General Lack of Risk Studies for Networks and Lifelines (R4)
Current implementations of PTRA are mainly focused on residential buildings and emergency planning activities such as Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 the definition of evacuation routes. However, the resilience of coastal areas relies on conventional and strategic infrastructures (Akiyama et al., 2013;Pitilakis et al., 2019). Conventional infrastructure such as roads, bridges, power, water, sanitation and communication networks, underpin economic and social activities in most urban areas . Schools and hospitals support the provision of education and health services, which are essential to recovery. Critical infrastructures in coastal areas include harbors (nuclear) power plants, gas and oil storage, and early warning infrastructure, such as tidal buoys and offshore bottom pressure gauges . Such infrastructures are complex, often interconnected and geographically distributed systems involving multiple sectors (Duenas-Osorio and Vemuru, 2009;Argyroudis et al., 2019), where further research is needed to quantify their resilience to tsunamis.

Assessing Tsunami Risk in a Multi-Hazard and Multi-Risk Framework (R5)
As triggered events, tsunamis fit naturally within a multi-hazard framework. Moreover, there can be several cascading consequences associated with the occurrence of tsunamis, such as technological disasters induced by natural hazards known as NATECH risks (e.g., the Fukushima Disaster), disruption to supply chains, and societal impacts. Therefore, management and decision-making for tsunami risk should be framed in a multi-risk context. To be able to make risk-informed decisions considering tsunamis, it is important to model the interaction of tsunamis with other phenomena at the level of hazards, vulnerabilities, and socio-economic consequences. An important gap related to risk assessment for tsunamis (and in general) is the lack of a streamlined and standard workflow for modeling the multi-hazard and multi-risk aspects. Currently, most studies consider the different hazards to be independent or "simultaneous" (e.g., earthquake and tsunami as independent events); whereas, few works consider interacting hazards such as coupled simulation of tsunami and earthquake (De Risi and Goda, 2016;Goda et al., 2017;Goda and De Risi, 2018;Ordaz et al., 2019;Park et al., 2019), the cumulation of tsunami and earthquake damages and losses (Ordaz, 2015;Attary et al., 2019;Park et al., 2019;Petrone et al., 2020), and interaction of tsunami and aging infrastructure (Akiyama et al., 2020).

Lack of Understanding and Quantification of Mortality (R6)
Strikingly, the 2004 tsunami with more than 226,000 dead and missing people (EM-DAT, 2020) caused an order of magnitude higher fatalities than the 2011 Tohoku tsunami with 19,846 (EM-DAT, 2020). Hence, past major disasters indicate that the vulnerability to tsunami mortality of a population is much more sensitive to demographic factors  than to physical vulnerabilities ("Gaps in Physical Vulnerability" section).
Correlations of tsunami flow depth and number of fatalities following the 2004 Indian Ocean, 2006 Java and 2011 Tohoku tsunamis reveal much larger scatter than those observed in physical vulnerability functions, even when derived from the same events (Reese et al., 2007;Koshimura et al., 2009;Suppasri et al., 2016). As human behavior influences mortality strongly (Johnston et al., 2016;Blake et al., 2018), deriving simplified vulnerability charts based on single tsunami intensity measures may not be appropriate. Tsunami awareness and availability of tsunami early warning systems and infrastructure are important (Gregg et al., 2006;Fraser et al., 2014), as well as proximity to source areas. Our understanding and ability to quantify and assess the effect of all these factors on tsunami mortality is still very limited.
The Weakness of Capturing Multi-Faceted Aspects of Vulnerability (R7) Quantitative risk assessments typically address several socioeconomic parameters (e.g., safety, downtime, direct and indirect economic losses, and even human behavior and response) as dimensions of consequences to disruptive tsunami events. However, PTRA falls short in modeling some dimensions of vulnerability that are part of a given context and not directly caused by a disruptive event (e.g., governance-related issues, adaptation and coping capacities, societal inequalities).
There are no established methodologies, within the context of the PTRA framework (Equations 1-4), for characterizing contextbased impacts of tsunami on the social, political and economic dimensions, leaving it unclear how to address these dimensions.
Integrated and heuristic approaches such as "MOVE" (Birkmann et al., 2013) or holistic approaches as those proposed by Carreño et al. (2007) or Aguirre-Ayerbe et al. (2018), have strived to address the context-based and multi-dimensional nature of vulnerability and risk and could be adapted to be used as physical risk indicators in the outcomes of PTRA.

Existing Methods
Although not directly addressing tsunami risk, Jasanoff (1993) pointed out the urge to bridge the two cultures of quantitative and qualitative risk assessment, stressing the importance to view risk in a larger context of social justice (who should we protect, from which harm, at what cost, and by foregoing what other opportunities). The societal factors impacting vulnerability and risk are mainly rooted in a complex and diverse aggregate, which varies over time and space. Qualitative vulnerability investigations use models and frameworks considering several dimensions (e.g., economic, demographic, psychological, political or physical), summarized by composite vulnerability and risk indices. These indicators can be distinguished from the risk and resilience metrics discussed in the previous section ("Gaps in Risk and Resilience Metrics" section) since some of them cannot be directly integrated into a computational PTRA procedure. Examples of existing multi-dimensional vulnerability and risk indicators are: The community resilience (e.g., Lam et al., 2016;Saja et al., 2019), the urban disaster risk index (Carreño et al., 2007;Salgado-Gálvez et al., 2016), the social vulnerability index (Cutter et al., 2003;Flanagan et al., 2011), the Coastal vulnerability index (McLaughlin and Cooper, 2010), Metropolitan Tsunami Human Vulnerability Assessment (Tufekci et al., 2018).

Identified Gaps
The Difficulty of "Quantifying" Social Vulnerability (I1) Social vulnerability describes combinations of social, cultural, economic, political, and institutional processes that determine differentials in the experience of hazards and recovery from dangerous events (Spielman et al., 2020). Experts may construct meaningful indicators to include a social component in hazard planning, preparation, and response. Integrating social vulnerability research into emergency and disaster risk management is essential, but caution is required to assign quantitative elements. Integration of social factors may allow planners and decision-makers to better identify problems in case of destructive events and provide insights into addressing recovery solutions (Cardona, 2001;Chakraborty et al., 2005;Schmidtlein et al., 2008). Social Vulnerability Index (SoVI) is a single quantitative indicator which was developed through a review of hazard case studies by Cutter et al. (2003) examining the spatial patterns of social vulnerability to natural hazards at the county level in the United States. Because of the complex and multidimensional nature of factors contributing to vulnerability, no variable has yet been identified to fully validate SoVI. An alternative approach to assess its reliability is to identify how the changes in the SoVI algorithm construction may lead to the changes in the outcome. Schmidtlein et al. (2008) investigated the sensitivity of quantitative features of the SoVI such as the scale of application, the set of used variables, and various geographic contexts.

Ambiguities in Definition of Community Resilience (I2)
Resilience is a frequently used term to discuss the capacity of a society or ecosystem to recover quickly from a disaster. The United Nations Office for Disaster Risk Reduction has defined resilience as "the capacity of a system, community or society potentially exposed to hazards to adapt, by resisting or changing in order to reach and maintain an acceptable level of functioning and structure. This is determined by the degree to which the social system is capable of organizing itself to increase this capacity for learning from past disasters for better future protection and to improve risk reduction measures" (UNISDR, 2007). A comprehensive review of various definitions of resilience can be found in Davoudi et al. (2012) and Ayyub (2014). The definition of coastal resilience is hindered by varying definitions and non-unified terminology, difficulties in selecting and combining different resilience indicators, and lack of data for validation (Lam et al., 2016). In fact, resilience is still lacking rigorous measurement methods (Bozza et al., 2015), especially in the context of tsunami hazard (Genadt, 2019).

Lack of Tsunami Vulnerability Index (I3)
A specific tsunami Disaster Risk Index (TDRI), similarly to the Disaster Risk Index (DRI) developed by the UN Development Program to compare disaster risk between countries exposed to hazards (UNDP, 2004) or the Urban Seismic Risk Index by Carreño et al. (2007) should be developed.

Integrated Approaches to Consider the Multi-Dimensional Aspects of Tsunami Risk (I4)
Vulnerability and risk are multi-faceted concepts and encompass various assets, physical, organizational, and institutional dimensions (e.g., Herslund et al., 2016). Vulnerability and risk assessment considering these different facets often requires different scientific backgrounds and approaches (Hufschmidt et al., 2005). A consequence-based approach to risk assessment (e.g., the PEER framework, or computational PTRA in general) has its roots in engineering. The approach follows a logical flow from causes associated with a disruptive event toward quantifying its direct and indirect socio-economic consequences. This approach focuses on the physical dimension of vulnerability, acting as a "container" of functions and services and thereby invokes-directly or indirectly-other dimensions of vulnerability such as social, economic and organizational vulnerability. On the other hand, the context-based approach (e.g., approaches based on integrated indicators) has its roots in the humanities and social disciplines. This approach deals with the context and the interactions between different actors, the respective territory, the different drivers (climate, societal changes) and how decisions can affect the overall context and the complex interplay between actors and drivers. Needless to say, the two approaches complement each other and have to be taken into account in policymaking for DRR in an integrated manner (O'Brien et al., 2007).

Considering Community Response and Organizational Capacities (I5)
Recent tsunami events worldwide have highlighted the need to critically revisit how human behavior in tsunami evacuation, and more generally, the human dimension of preparedness for tsunamis is addressed within the risk assessments. The lessons from Japan 2011, Chile 2010 and Indonesia in 2010 and 2018 events highlight such needs. Questions arise on how and if the different and seemingly inconsistent human behavior can be addressed in tsunami risk assessments. Moreover, atypical events such as the Krakatoa, Indonesia 2018, do not allow for conventional prevention, warning and mitigation strategies. In most cases, aid and help arrive late due to limited organizational capacities, leaving the affected communities in even more vulnerable conditions, especially during the first critical hours and days after the event. Events with growing levels of complexity are likely to continue to occur in the future and this calls for a more in-depth consideration of how different communities respond and how those variations can be integrated within the risk assessment framework.

Incorporating Risk Perception in the Formulation and Analysis of Complex Risks (I6)
Perceptions are dynamic and socially constructed. Perceptions can change abruptly or gradually, depending on the context. Understanding evacuation behavior requires an understanding of risk perceptions. This can help explain why the response to tsunami drills may be different than when responding to a real event. It is quite challenging for risk methodologies to consider the dynamic, complex and subjective aspects of risk perception. Only by understanding the subjective meanings of perceived risks allows risk communication to be designed and applied more effectively.

CONCLUSIONS AND DIRECTIONS
In this review, we discuss a large number of research gaps in PTHA and PTRA. It becomes obvious that methods have substantially improved over the past decades, but also that open questions remain in the physical description, conceptualization, modeling, as well as the social and psychological dimensions of the topic. The physics and geological complexity of tsunamigenic sources are still not captured nor understood adequately, leading to large uncertainties. For SPTHA, neither all earthquake faults nor their exact location, geometry, boundary and initial conditions (e.g., stress, friction) are fully constrained. Statistical models of recurrence constitute the largest uncertainties in large and rare events, including tsunami earthquakes. Uncertainty may become excessive for landslide tsunamis, where statistics on past events often are absent, and our understanding of slope failure probability is limited. The need for covering vast geographical scales, source diversity and related uncertainties, render LPTHA extremely challenging. For VPTHA additional difficulties arise due to the complexity of tsunamigenic volcano sources and triggers, but they are constrained spatially. MPTHA may benefit from a large meteorological data network allowing for (prototypical) forecasting as well as PTHA applications, but sensitivity to source parameters is still unconstrained.
While modeling and parameterization of individual phenomena are possible, they are often excessively computationally expensive or highly uncertain due to missing constraints on input parameters. The multiple scales involved in PTHA from far-field propagation over oceanic distances to the need to resolve small scale inundation features while capturing physics and resolving uncertainties still represent an open challenge. Yet, this solution is needed to convey PTHA information properly into risk analysis.
Even more challenging is the situation in PTRA, where gaps exist in the transformation of physical hazard to risk and quantifying the uncertainties in the assessment of risk and resilience. Key concepts, such as physical vulnerability and mortality and their related uncertainties, are less developed than the main PTHA elements. There are gaps regarding selection of IM, limited observed damage asset-and locationwise, and limited experimental validation.
Furthermore, tsunami science is immature concerning embedding issues with intrinsically multi-hazard and multirisk aspects, such as the cascading events that are entangled with tsunami hazards. A weakly developed link between quantitative PTRA and the social sciences is a clear gap. At this point, it is worth noting that terms "vulnerability" and "resilience" are multi-dimensional concepts that are used both in the consequence-based-natural sciences inspired-as well as context-based approaches-motivated by social sciences. Therefore, they may have quite different interpretations depending on the analysis context.
The overarching issue is integrating all the above components and developing an overall consistent sensitivity and uncertainty quantification framework, to understand tsunami risk and identify risk drivers, from the probability of the sources causing hazards to the probability of their physical consequences and societal impact. This understanding must be developed and prioritized in future research.
To guide such efforts, we have performed an expert judgment exercise that we discuss in the following subsection. It may help to identify most pressing research needs as well as prioritize research efforts.

Prioritizing Research Gaps
A scientific sensitivity analysis of the impact of each research gap, as conducted for individual sources in Sepúlveda et al. (2017) or Davies and Griffin (2020), on the overall result of a PTHA or PTRA is certainly out of the scope of a single review paper. However, some guidance on prioritization of efforts is certainly desirable. Since we focused our description on research gaps, we suggest two important metrics for the prioritization: The susceptibility of PTHA and PTRA results on uncertainty due to the research gap (sensitivity) and the difficulty or amount of research effort needed to fill that respective gap (tractability).
In order to assess these two metrics, we conducted a first-pass expert judgment among the more than 50 co-authors of this article-all experts in one or more of the aspects of our review. A questionnaire was designed that asked three questions for each of the 47 research gap subsections that we have described before. The first two questions addressed the two metrics just mentioned. The third question asked if experts were of the opinion if the research gap existed because of a missing theoretical understanding, a lack of data, or both. While this somewhat ad hoc prioritization is not as solid as a rigorous expert elicitation (e.g., Cooke, 1991;Budnitz et al., 1997;Morgan, 2014; for tsunami hazard see an application in Basili et al., 2021, or the discussion in Grezio et al., 2017) and hence could be somehow biased, we believe it still provides a valuable starting point for future efforts. It is a qualitative broad-brush answer to the question, which research gap may be of highest importance. More details on this exercise are given in the Supplementary Material.
The result of this exercise is visualized in a priority matrix ( Figure 2). It may appear natural to respond first to those research gaps that are located in the left upper quadrant of the matrix, since these gaps are considered less difficult to solve, while they are expected to influence the risk considerably. It can be noted that most of the research gaps are judged hard to solve but with a highly sensitive impact on the overall result. This seems natural, since high impact but simple problems would have been solved already.
Based on our qualitative assessment, we can therefore identify some overall trends. First, we see some common challenges related to establishing annualized source probability of occurrence, which tend to cluster in the upper right corner of Figure 2. This means that they are considered relatively most important, yet hardest to solve. Of these, obtaining landslide related annual source probabilities (L1) is considered both the largest yet most important obstacle, while a just slightly lower similar prioritization is evident for earthquake and volcano sources (S1 and V2). Another aspect that is considered important (and challenging) is the multi-hazard and cascading hazard aspect (R5). On the other hand, the research gaps that appear to be least sensitive and also easy to be filled are related to the numerical modeling of wave propagation (H3), as well as lack of joint intensity measures (I3) and gaps related to earthquake scaling relations (S4). Finally, we also note Figure 2 allows us to analyze several instances of components with similar sensitivity but with clearly different tractability. For instance, the lack of tsunami exposure data (E2) is considered as important as modeling complicated aspects of inundation (H6), but the former is assumed by the authors of this paper to be more easily achieved. Several other similar examples can be analyzed from Figure 2.
It is noteworthy that most of the research gaps that most experts find consensus on are highly sensitive in their impact (all located at the upper margin of the point cloud). It is also worth noting that most research gaps are considered to relate to data and theory gaps and that those that relate to only a missing theoretical understanding are considered of relatively low sensitivity. This may be related to the fact that when we don't understand a phenomenon, we cannot really judge whether it affects our results or not. In other words, this may be an "unknown". Whereas a data related research gap may already have proved to be sensitively influential by a specific example, but due to a lack of data cannot be involved concisely into the workflow.
This priority matrix is just a very first approach. Since tsunami research eventually aims at protecting life from natural hazard, one could also prioritize those research gaps with direct impact on this goal. These would be in particular those topics mentioned in sections "Gaps in Physical Vulnerability," "Gaps in Risk and FIGURE 2 | Priority Matrix for all the 47 research gaps identified. Letters indicate seismic source gaps (S), landslide source gaps (L), volcanic source gaps (V), meteorological source gaps (M), hydrodynamical modeling gaps (H), exposure related gaps (E), physical vulnerability related gaps (P), resilience related gaps (R), social vulnerability and risk indicators related gaps (I). The size of each marker relates to the agreement of experts, larger marker size means less spread in the answers. Colors are used to indicate if the gap is caused by missing theoretical understanding (blue), a lack of data (red), or both (cyan).
Frontiers in Earth Science | www.frontiersin.org April 2021 | Volume 9 | Article 628772 Resilience Metrics," and "Gaps in Social Vulnerability, Multi-Dimensional Vulnerability and Risk Indicators" (marked with P, R, and I; respectively).

Final Considerations
We have described and prioritized a comprehensive list of research gaps in PTHA and PTRA. While our approach to prioritization and the metric used to do so are to some extent subjective, it remains for the scientific community and further investigation as well as future incentives to decide, which directions to choose from. Nevertheless, our priority matrix will serve as a first impression on the weight of each of the identified research gaps. An important part of the future puzzle will be exploring how uncertainties propagate to risk across disciplines. While uncertainties are more extensively explored in earthquakerelated hazard analysis, non-seismic hazard, vulnerability, exposure and risk are lagging behind. On the other hand, different levels of maturity of methods and understanding will always exist. Hence, it is imperative to develop PTRA standards and guidelines to appropriately merge all risk analysis components considering their different uncertainty exploration and maturity level.
While validation of individual components has been addressed in several of the sections in our text, validating the PTHA and PTRA workflow as a whole is still ongoing research. Marzocchi and Jordan (2014) propose a methodology for a meaningful test of general probabilistic hazard models and an example of a successful application can be found in Meletti et al. (2021).
Certainly, research gaps exist also outside of the scope of PTHA and PTRA. New computational methods, like fuzzy methods, machine learning techniques and even advances in classical computational methods have to be considered. Rigorous, information theory inspired approaches to validation may also be explored.
Considering the goals of the Sendai Framework for Disaster Risk Reduction and acknowledging the vast number of challenges outlined in the sections before, a concerted interdisciplinary effort to close the most pressing gaps is required. Attempts to gather expertize, facilitate exchange and development, and coordinate community efforts are represented by the Global Tsunami Model (GTM, 2020) and the COST Action AGITHAR. A thorough consolidation of available sources of information in openly accessible databases, documentation of standard workflows, unification of terminology and metrics, as well as information hubs need to be established.

ACKNOWLEDGMENTS
This work would not have been possible without the help of many researchers, who are too many to be named individually, but we thank all participants of COST Action AGITHAR for their valuable input in official or informal conversations, workshops and interactions. We would like to thank the reviewers and editor for very considerate and challenging suggestions that improved this manuscript substantially.