A Source Clustering Approach for Efficient Inundation Modeling and Regional Scale Probabilistic Tsunami Hazard Assessment

For coastal regions on the margin of a subduction zone, near-field megathrust earthquakes are the source of the most extreme tsunami hazards, and are important to handle properly as one aspect of any Probabilistic Tsunami Hazard Assessment. Typically, great variability in inundation depth at any point is possible due to the extreme variation in extent and pattern of slip over the fault surface. In this context, we present an approach to estimating inundation depth probabilities (in the form of hazard curves at a set of coastal locations) that consists of two components. The first component uses a Karhunen-Loève expansion to express the probability density function (PDF) for all possible events, with PDF parameters that are geophysically reasonable for the Cascadia Subduction Zone. It is then easy and computationally cheap to generate a large N number of samples from this PDF; doing so and performing a full tsunami inundation simulation for each provides a brute force approach to estimating probabilities of inundation. However, to obtain reasonable results, particularly for extreme flooding due to rare events, N would have to be so large as to make the tsunami simulations prohibitively expensive. The second component tackles this difficulty by using importance sampling techniques to adequately sample the tails of the distribution and properly re-weight the probability assigned to the resulting realizations, and by grouping the realizations into a small number of clusters that we believe will give similar inundation patterns in the region of interest. In this approach, only one fine-grid tsunami simulation need be computed from a representative member of each cluster. We discuss clustering based on proxy quantities that are cheap to compute over a large number of realizations, but that can identify a smaller number of clusters of realizations that will have similar inundation depths. The fine-grid simulations for each cluster representative can also be used to develop an improved strategy, in which these are combined with cheaper coarse-grid simulations of other members of the cluster. We illustrate the methodology by considering two coastal locations: Crescent City, CA and Westport, WA.


INTRODUCTION
The primary goal of this work is to present a general methodology for developing the hazard curve for a quantity of interest (e.g., maximum water depth) at a coastal location that may be inundated by tsunamis. An inundation hazard curve shows the annual probability that the flooding depth will exceed each value in a range of specified exceedance values. The same techniques could be applied to other quantities of interest (e.g., maximum flow speed or momentum flux) but here we concentrate on water depth h for illustration, and use h max to represent the maximum value of h over the entire simulation (at some particular point of interest). Let h denote some particular exceedance value. The hazard curve is then obtained by determining P[h max > h], the annual probability that h max exceeds h at this particular location, as a function of h. The ultimate goal is to develop such a hazard curve at each point on a fine grid covering a community of interest, from which it is possible to then create hazard maps that show the spatial distribution of maximum water depth expected for a given annual probability, or the spatial distribution of annual probability for a given exceedance value, or potentially other products useful to emergency managers or community planners. We give some examples of how this can be done in Section 6.
A full probabilistic tsunami hazard assessment (PTHA) would have to include all potential sources of tsunamis, far-field as well as near-field, and also possibly tsunamis induced by landslides or other processes; a complete review of this can be found in Grezio and Babeyko (2017). Here we concentrate on one aspect of PTHA, assessing the probabilities in a coastal region due to a megathrust event on a nearby subduction zone. This is a difficult aspect of PTHA because variations in the spatial distribution of the slip can have a significant effect on the resulting tsunami (Goda et al., 2016;. In addition, some events may cause substantial subsidence or uplift of the coast around the location of interest, which can also greatly effect the inundation extent and depth of the resulting tsunami. To perform PTHA it is necessary to first have some model for the probability density function (PDF) of all possible events. It is impossible to know the correct distribution due to the high degree of epistemic uncertainty in subduction zones with infrequent past megathrust events. However, recent studies have suggested ways to generate a geophysically reasonable distribution that can be easily sampled to generate large numbers of hypothetical events ; accordingly, the approach we use is based on a Karhunen-Loève (K-L) expansion to generate slip patterns with correlation lengths that are thought to be "reasonable" from studies of past events, e.g., (Mai and Beroza, 2002;Goda et al., 2016;Melgar and Hayes, 2019;Crempien et al., 2020), as discussed further in Section 2. We stress, however, that we do not claim we have the "correct" distribution, or even the best possible based on available science, and so our focus is on a methodology that could also easily be applied to other choices of the PDF. We also suggest that any PTHA study intended as guidance for decision makers should include a sensitivity study that considers how robust the results are to changes in the assumed earthquake distribution, along with other approaches for assessing the effects of the epistemic uncertainty inherent in this problem, see, for example, Davies and Griffin (2020). In addition there is a need for further testing of random tsunami models comparing their statistical properties with historical tsunamis as is done in Davies (2019) for 18 recent events in the Pacific and Indian Oceans. Of course too few data points are available from the Cascadia Subduction Zone (CSZ) itself to allow a local validation of any PDF.
In this paper we focus on how best to handle the aleatoric uncertainty, i.e., assuming that we have a probability distribution to use for the PTHA, how do we efficiently create hazard curves based on this distribution? The brute force approach would be to choose a very large number N of samples from the distribution, perform a numerical tsunami simulation with each, and then (for each location of interest and each exceedance value) determine the number N of samples for which h max exceeds h. Then the ratio N/N is an estimate of a conditional probability that h max exceeds h given that some event from the set of all possible events occurs. If P total is the annual probability that any event from the classes considered occurs, then P total N/N could be used as the annual probability of h max exceeding h.
The primary difficulty we address is that N may need to be very large in order to get meaningful statistics, particular for the relatively unlikely but most dangerous higher values of h. Since a single tsunami simulation with a reasonable spatial resolution can take several minutes if not hours of computer time, this is problematic; and even more so if one also wants to do sensitivity studies and/or must consider many different communities over hundreds of miles of coastline.
A fundamental problem already arises when we ask for a reasonable value of P total since it depends very much on what set of possible earthquakes to consider. Since earthquakes with magnitude less than Mw 7.5 rarely cause damaging tsunamis we could define P total as the annual probability of any event with magnitude greater than this. In Section 2 we discuss our choice of PDF for the distribution of earthquakes. Although not necessarily correct for large subduction zone events, the Gutenberg-Richter law is a reasonable starting point for choosing a distribution. According to this, a magnitude Mw 7.5 event is 10 times more likely than an 8.5 event and roughly 32 times more likely than Mw 9.0. Hence, for example, if we sample N 3, 200 events we would expect perhaps 100 to be Mw 9.0 or larger, a rather sparse sampling of these important and potentially quite diverse events. Moreover, most of the samples would be small events for which there is little or no inundation, a clear waste of computing resources.  Table of the four magnitude classes used in this work, with the annual probability P j for an event from each class, along with the corresponding return period 1/P j , based on the Gutenberg-Richter formula p 10 6.279−Mw for a magnitude M w event.
Class j M w Annual probability P j Return period (years) We can address this by a simple application of importance sampling. We first split the space of all possible events of interest into a small number of classes. For illustration in this paper we use four classes based on magnitude: Mw 7.5, 8.0, 8.5, and 9.0, but this could easily be expanded. We then assign an annual probability to each class, call this P j for class j (j 1, , 2, 3, 4 for our case, for which we assign the probabilities shown in Table 1). We then take N j samples from class j, compute the fraction N j that exceed h, and use P j N j /N j as an estimate of the annual probability of exceeding h by an event from class j. These can then be combined to obtain the annual probability of exceeding h by any event (see Section 6). The advantage of splitting into classes is that we can choose a large number of events in a class corresponding to high impact but low probability (Mw 9.0 in our case) and then the corresponding fraction N j /N j is weighted by a smaller annual probability P j when combining with the probabilities obtained from other classes. For illustration, we have chosen to take N j 500 for each of the four classes so that we only consider 2,000 events in total but 500 of them are in the Mw 9.0 class (In Section 6 we discuss the rationale and implications of choosing this number of realizations.).
Next we tackle the problem that even 2,000 tsunami simulations may be excessively demanding, particularly when we expect that many of these events will give very similar inundation patterns and depths as other events, and so in principle it should be possible to estimate the hazard curve with fewer simulations of judiciously chosen representative tsunamis. We develop an approach for clustering that can be applied to the 2,000 events before doing any fine-scale tsunami modeling, in order to identify clusters of events that we expect to give very similar tsunami impact in the location of interest. We then do a fine-scale tsunami model of only one realization from each cluster (which we call the "cluster representative") and assign it a weight that is based on the collection of events in that cluster. Based on this we can estimate the contribution that this cluster should make to each hazard curve. This clustering is explained in much more detail in Section 5. Other studies have used clustering to achieve scenario reduction. For example, see Lorito et al. (2015) for hazard assessment, Gusman et al. (2014) for early warning and Volpe et al. (2019) for a study more closely related to this paper.
The clustering approach we illustrate in this paper is based on doing a coarse-grid tsunami simulation for each of the 2,000 realizations, with a grid resolution that allows much faster simulation, but is too coarse to properly represent the tsunami inundation over the communities of interest. However, we show that these coarse grid simulations give information in the form of proxy variables that can be used to very effectively cluster the events.
Moreover, the coarse-grid simulations can be greatly enhanced to provide "pseudo-fine" results that are at the resolution of the desired fine grid and that agree very well with the actual fine-grid simulations of the same realization, but are much cheaper to compute. This enhancement is performed in part using information about the difference between the coarse and fine grid simulations performed for the few realizations where both are available (the cluster representatives). This procedure is described in more detail in Section 4.
For illustration we consider two sample communities: Crescent City, CA, which is near the southern extent of the CSZ and Westport, WA, roughly 570 km north of Crescent City. These communities are both at high risk to CSZ tsunamis and have been the subject of past studies. They also have quite different topographic features as discussed further in Section 3. The same set of 2,000 CSZ realizations was used for each site, although the clustering algorithm is applied separately to each, since the set of realizations that give similar inundation patterns at one site may not form a suitable cluster at the second site. For illustration we show that selecting only 18 clusters (and hence performing only 18 fine-grid simulations for each site) gives hazard curves and maps that compare very well with those obtained if all 2,000 realizations are simulated on the fine grid, particularly after adding in additional information obtained from coarse-grid simulations of each realization. These results are presented in Section 7.
Some of the techniques presented in this paper were first developed as part of a project funded by Federal Emergency Management Agency Region IX, and presented in the project Final Report by Adams et al. (2017). Subsequently we have improved some of these techniques. We are also now using a probability distribution that is potentially more realistic than the original choice, and we consider two different target communities with quite different topography in order to better test the general applicability of these ideas. The original report and associated webpages (Adams et al., 2017) contain more discussion of some of these ideas, along with illustrations of some related approaches that are not reported in this paper. Research on PTHA using stochastic collections of sources goes back many years, see, for example, the early review Geist and Parsons (2006) and the more recent ones of Geist and Lynett (2014) and Grezio and Babeyko (2017) for many more references.
Recently, several researchers have adopted the use of a K-L expansion to generate large suites of realizations for PTHA studies of particular regions and/or to study sensitivities and uncertainty. For example, Gonzalez et al. (2020) generated 400 realizations for a hybrid deterministic/PTHA to Iquique, Chile, and Crempien et al. (2020) generated 10,000 realizations on a idealized fault and performed GeoClaw tsunami simulations of each on idealized topography to study the effect of spatial slip correlation on tsunami intensity. The techniques developed in this paper could help to accelerate such studies.
Research on reducing the work required to handle large sets of realizations has also been done by others. For example, Sepúlveda et al. (2017) used the K-L expansion together with a stochastic reduced order model to obtain better results than with a brute force Monte Carlo simulation, and Sepúlveda et al. (2019) used these techniques to do a PTHA analysis including a sensitivity study for Hong Kong and Taiwan locations due to earthquakes on the Manila Subduction Zone. These techniques reduced the number of simulations needed from 10,000 to 200 for each of 11 sets of earthquakes, followed by fine-grid simulations of the resulting 2,200 realizations. The reduction was based on seafloor deformation statistics at the earthquake sources. That paper also has a strong emphasis on using sensitivity analysis to quantify errors in PTHA.
Even closer to the methodology presented in this paper is the source filtering approach developed recently by Volpe et al. (2019). They started with a suite of more than 1.7 million scenarios that affect their study region, and used a clustering algorithm based on cheaply obtained proxies to reduce this to a smaller set of 1,154. They then performed fine-grid simulations for one representative from each cluster. One significant difference in our approach is that we use coarse-grid simulations (using the full nonlinear tsunami model, including onshore inundation over the study region) and the associated pseudo-fine results, which allowed us to obtain PTHA results with fewer fine-grid simulations. On the other hand, we performed 2,000 coarse-grid simulations to obtain these, whereas Volpe et al. (2019) performed the clustering based on proxy data that was more cheaply obtained. A hybrid approach might be to apply our methodology to the 1,154 cluster representatives identified in Volpe et al. (2019), performing only coarse-grid inundation simulations of these, and then further clustering into a much smaller set for the fine-grid simulations.
The approach we use to create pseudo-fine results is also similar to the idea of multilevel or multifidelity Monte Carlo methods (Giles, 2015;Peherstorfer et al., 2018), in which results from two or more different resolution simulations are combined to reduce the computational load. This is often done in the context of creating a surrogate model or emulator that can be very cheaply evaluated for new parameter choices in order to do a more extensive Monte Carlo simulation. This approach has been used in connection with tsunami modeling by de Baar and Roberts (2017), and by Salmanidou et al. (2017) for underwater landslide and tsunami modeling. For a review of these types of statistical approaches, see Viana et al. (2017). Our approach is somewhat different in the way we use cluster representatives and the differences in the local topography at different resolutions in defining the corrections.

EARTHQUAKE PROBABILITY DENSITY AND REALIZATIONS
Probability distributions proposed for CSZ earthquake magnitudes have included both characteristic and Gutenberg-Richter (G-R) types. More generally, Parsons et al. (2012) noted that this is a long-standing controversy for many other fault zones. Consequently, they developed both types of distribution models for the Nankai Trough, based on data from many past events. The characteristic earthquake model was based on fixed rupture geometries and historical/palaeoseismic recurrence times, and the G-R model was based on fault-slip rates and an estimated distribution slope (b-value). They found that the G-R distribution, constrained with a spatially variable long-term slip rate, replicated much of the spatial distribution of observed segmented rupture rates along the Nankai, Tonankai, and Tokai subduction zones, although with some rate differences between the two methods in the Tokai zone. Thus, where supporting information exists (e.g., palaeoseismic and historical recurrence data), and fault segmentation observations are absent, they suggested that very simple earthquake rupture simulations based on empirical data and fundamental earthquake laws could be useful forecast tools in settings with sparse data from past events. Models using a G-R distribution but without the explicit guidance of a varying longterm slip rate have also been employed, both globally and specifically along the Cascadia margin (Rong et al., 2014). We thus view a G-R distribution of magnitudes as adequate for this study.
We generated 2,000 slip realizations over four magnitude classes: Mw 7.5, 8.0, 8.5, and 9.0 (with 500 of each). To determine the annual probabilities of earthquakes in each of the magnitude classes, we follow a G-R law using a b-value of 1, indicating "normal" seismic behavior. We also assume a yearly rate of occurrence of a Mw 9.0 along the CSZ as once every 526 years based on paleotsunami records from Goldfinger et al. (2012). This implies an a-value of 6.279 in the G-R relation and gives us annual probabilities P j in Table 1 for each of our magnitude classes. Incidentally, Table 1 could also be extrapolated to show that the CSZ should have a M6.3 earthquake every year. However, this is not the case on the anomalously-quiet CSZ. Nonetheless, for the purposes of presenting a PTHA methodology, using a Gutenberg-Ricter law is a starting point. In this study, we use 0.5 magnitude unit spacing between our classes. Other studies, such as Li et al. (2016) use much smaller spacing between classes. For a full PTHA analysis, a fine spacing that allows for the complete overlap of earthquake properties between different magnitudes would be preferred.
We limit our earthquake realizations to imitate a series of thrust events located on the megathrust interface along the CSZ. To introduce variability to each realization, we allow for geophysically reasonable variations in slip distribution, location, and rupture dimension. An example of a rupture from each magnitude class is shown in Figure 1. We employ a regional fault geometry that approximates the CSZ from McCrory et al. (2012). This is then discretized into triangular subfaults using the three-dimensional finite element mesh generator GMSH (Geuzaine and Remacle, 2009). A triangular mesh allows a variable strike and dip that can better approximate the McCrory et al. (2012) geometry than a rectangular discretization. Our area of interest extends along the entire CSZ margin and down to a depth of 30 km beyond which slip is not expected to continue (Frankel et al., 2015).
In order to introduce variability in ruptures of the same magnitude, as is observed from past earthquakes, the length and width of each realization is obtained from a probabilistic source dimension scaling law (Blaser et al., 2010). For each individual rupture we sample from a lognormal PDF such that where σ L and σ W depend on the faulting environment and for reverse faulting (subduction zones) are 0.18 and 0.17, respectively (Blaser et al., 2010). The rupture extent is then centered about a Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 randomly chosen subfault within our CSZ mesh geometry. If the chosen subfault is located in such a place that the rupture extent exceeds the bounds of the rupture geometry, then it is moved up/ down dip and/or along strike until it falls completely within the CSZ. Once the bounds of the rupture area are established, we generate a stochastic slip distribution using an application of the Karhunen-Loève (K-L) expansion following LeVeque et al. (2016) and Melgar et al. (2016). Here, we assign slip over participating subfaults using a von Karman correlation function, C(r), which replicates the statistics of slip distributions as observed from finite-fault solutions of past moderate sized earthquakes (Mai and Beroza, 2002). Here, the correlation between the sth and dth subfaults (in along strike and along dip directions) is where H is the Hurst exponent (set in this study as 0.75), K H is the modified Bessel function of the second kind, and (r sd ) is a length measurement for sth and dth subfaults that depends on the distance between subfaults in the along strike (r s ) and along dip (r d ) directions as well as the correlation length along strike (a s ) and dip (a d ), written as The correlation length and width for each realization governs the size of asperities and uses a magnitude dependent scaling law from Melgar and Hayes (2019) where a s 17.7 + 0.35L eff , a d 6.7 + 0.41W eff .
The variables L eff and W eff are based on the effective length and width scales (in kilometers) from Eq. 1 and are determined from Mai and Beroza (2000). Using the defined correlation function the distribution of slip across the fault surface is treated as a spatially random field. The vector s containing the amount of slip at each subfault is then given by the Karhunen-Loève expansion as, where μ is the mean slip over the entire fault, λ k and v are the eigenvalues and eigenvectors of the chosen correlation function, and z k are random numbers normally distributed with a mean of 0 and standard deviation of 1. After defining the correlation function we assume a marginal log-normal distribution as described by LeVeque et al. (2016) where we use a standard deviation of 0.45 of the mean slip in any given model. This value is Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 5 obtained from an analysis of a database of slip models dating back to 1990 (Melgar and Hayes, 2019). It is important to note that there are as many eigenvectors or eigenmodes (n f ) as there are subfaults in the model geometry. Equation 6 is a statement of how each eigenmode distributes slip away from the background mean model modulated by a random number z k and thus achieving a stochastic realization of slip. For tsunami modeling because seafloor deformation is a relatively long period phenomena LeVeque et al. (2016) showed how high order modes contribute relatively little to tsunamigenesis so it is possible to truncate the sum to just a few tens or even hundreds of terms. Here we limit the number of contributing modes to no more than 200. Finally for the background mean model we assume enough homogeneously distributed slip to match the target magnitude given the chosen fault dimensions. It is also possible to make other choices for µ such as a known slip distribution or a geodetic locking model (Goldberg and Melgar, 2020).
We cap the upper level of slip possible for any realization in this study to 60 m, as was recommended in Melgar et al. (2016) and based on plate convergence rates from McCaffrey et al. (2007). We achieve this by rejecting and re-running any realization where any subfault in our mesh has an assigned slip that is greater than our maximum slip threshold. This cap is in place in order to limit the possibility of unrealistically large amounts of slip in any earthquake realization. It should be noted that this cap creates an upper limit in tsunami intensity that may be reflected in any final PTHA analysis. We do not enforce the target magnitudes in a strict sense. The resulting magnitude after the stochastic process can be slightly higher or lower than the requested values. We do not re-scale the slip in anyway to force the rupture to have the target magnitude exactly. This, and the maximum slip requirement, can introduce departures from the desired PDF (Sepúlveda et al., 2017) but this is generally an effect that is much smaller than the epistemic uncertainty.
We calculate the total seafloor deformation of each earthquake realization using angular dislocations for triangular subfaults in an elastic half space (Comninou and Dundurs, 1975). This method can be seen as a variant of the Okada equations, which focus on rectangular subfaults (Okada, 1985). We obtain the deformation over the entire CSZ study region with at a 30" spatial resolution. This is a fine enough spacing to ensure Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 6 that we recover slip features that may be present at our smallest magnitude class, including rupturing asperities at the smallest reported slip correlation length (see examples in Figure 2).
Seafloor deformation is directly translated to a disturbance at sea level by assuming an incompressible water column. While some large magnitude earthquakes can have rupture durations extending multiple minutes, this kinematic effect on the tsunami in the near-field is minimal (Satake, 1987;Williamson et al., 2019). Here, we simplify the rupture process by treating all seafloor deformation as instantaneous and occurring at the initial time step of our tsunami model. It is this initial disturbance that initializes the tsunami model, as discussed further in Section 3. Figure 2 shows both slip on the fault and the resulting seafloor deformation for two of the magnitude 9.0 realizations. We use R i to denote the ith realization and the figure shows the realizations R 1,655 and R 1,999 out of the N 2, 000 realizations, chosen because R 1655 has slip concentrated on the southern margin of the fault while R 1,999 has it concentrated to the north, and hence they have very different effects in Westport and Crescent City, as discussed in the next section. This illustrates that even within a single magnitude class there are significant variations between the tsunamis generated.

COMMUNITIES OF INTEREST AND TSUNAMI MODELING
We consider two sample communities as shown in Figure 3. Crescent City, CA was used in previous work on this topic (Adams et al., 2017), and was the subject of a previous PTHA analysis by González et al. (2014) and Adams et al. (2015). Tsunamis tend to focus in Crescent City due to the offshore bathymetry and harbor (Horrillo et al., 2008), and the central business district is bounded by the harbor, the low-lying Elk River valley to the east, and higher hills to the north.
Westport, WA lies on a peninsula at the entrance to Grays Harbor. The topography is below roughly 10 m everywhere on FIGURE 3 | Communities of interest in this study. (A) Regional and inset view of Westport, WA and two representative cross sections shown in Section 7. (B) Regional and inset view of Crescent City, CA with two representative cross sections. Both regional views have a bathymetric (dashed) and topographic (solid) contour interval of 10 m. Inset figures use a contour interval of 5 m. The coastline is differentiated with a bolded brown line.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 the peninsula, and a number of north-south running ridges protect some areas from the direct waves arriving from the west that may still be flooded from the east after the tsunami enters Grays Harbor. Westport is the site of the Ocosta Elementary School, recently rebuilt to include the first tsunami vertical evacuation structure constructed in the United States, for which tsunami modeling was presented by González et al. (2013). We selected these two communities to showcase the versatility clustering PTHA methodology. However, this methodology can be extended to any coastal community. Tsunami simulations are performed using GeoClaw Version 5.7.0, distributed as part of the open source Clawpack software (Clawpack Development Team, 2020). This solves the twodimensional depth-averaged non-linear shallow water equations using adaptive mesh refinement on rectangular grid patches (in longitude-latitude coordinates). GeoClaw allows each cell to be wet or dry and to change dynamically, so that the wet/ dry boundary of the coastline evolves as the tsunami inundates the coastal site of interest.
For this study we simulated the tsunami from each of the 2,000 realizations in two separate simulations. The first set were the "fine-grid runs" where refinement down to 1 arcsecond (roughly 30 m in latitude, less in longitude) was enforced over both study sites. This provided the "ground truth model" hazard curves and maps to use for comparison purposes, i.e., we assume that our goal is to produce good approximations to these curves and maps with much less work than was required to run 2,000 fine-grid simulations. The second set of simulations were the "coarse-grid runs" in which the refinement only went down to 9," and hence a factor of 81 fewer grid points on the finest level than in the finegrid runs. Moreover on these coarser grids it is also possible to take larger time steps [while still respecting the Courant-Friedrichs-Lewy condition required by the explicit finite volume method used in GeoClaw], potentially giving another factor of 9. However, since some of the computation takes place on coarser grids over the entire computational domain, the coarse grid simulations are on average 5 times faster than the fine grid simulations; see below. We also note that for a real PTHA we might want to use even finer grids, e.g., 1/3" is often used now used for hazard studies, and 1/9" topography is becoming available in many locations. In this case the relative speedup for coarse-grid simulations could be much more dramatic.
We use adaptive mesh refinement to optimize the computational cost of each tsunami simulation. All simulations used three levels of refinement in the openocean, with grid resolution 1″, 6″, and 3″, and with regridding every few time steps to follow the propagating waves (based on a tolerance on the sea surface elevation). On the continental shelf, refinement is allowed to the next Adaptive Mesh Refinement level at 90". An additional refinement level of 9" is enforced around both study sites. For the coarse-grid simulations only these five levels of Adaptive Mesh Refinement are used. For the fine-grid runs, two additional Adaptive Mesh Refinement levels are introduced at 3" and 1" resolutions, and the study areas are forced to be resolved at the finest 1" level. The ETOPO1 topography Digital Elevation Model at 1 arcminute resolution (Amante and Eakins, 2009) was used over the full computational domain. A subset of the Astoria, Oregon 1/3" Digital Elevation Model (NOAA NCEI, 2017) was used around Westport, and around Crescent City a version of the Crescent City, California 1/3" Digital Elevation Model (NOAA NCEI, 2012) was used that was modified to remove the pier in the harbor, since water flows under the pier, for an earlier PTHA study of this region by González et al. (2014).
In each simulation we monitor the maximum water depth h over a grid of points covering the study area (at the finest resolution of the simulation) over the duration of the simulation. For this study we ran each tsunami simulation to 4 h of simulated time after the instantaneous seafloor displacement. Examining the results we found that in a few cases there were still significant edge waves trapped along the coast that could have lead to slightly larger values of the maximum at some points, so a realistic PTHA should run some realizations out to later times. For the purposes of this study our reference solution uses the maximum h over the same time period as our approximations and so comparisons are still valid.
At each point where h is monitored, these maximum values (denoted by h f max for the fine-grid runs) are used to compute a reference hazard curve. The coarse-grid simulations produce their own set of h c max values on a coarser set of points (which can be extended to the fine grid by piecewise constant interpolation within each coarse grid cell). These coarse-grid values are used both in the clustering algorithm and in computing a pseudo-fine result from each coarse result, as explained in the sections below.
The reference hazard curve is affected by the spatial distribution and properties of the rupture realizations which act as our ground truth. Therefore, it is important to have enough realizations at each magnitude class to capture all tsunamigenic behavior that is possible given the seismic constraints we presented in Section 2. The total number of realizations per magnitude bin is based on the variability in the likelihood of exceeding a set of tsunami amplitudes in the harbors of both Crescent City, and Westport, as illustrated in Figure 4. Here, we calculate the likelihood that h exceeds a set of tsunami thresholds, ranging from 1 cm to 10 m at one particular point in each study region. As more realizations in a magnitude class are added, the variability in the probability of exceeding each tsunami threshold reduces. We can estimate that we have enough realizations to act as our ground truth when each probability curve has flattened out. Here, this occurs at about 400 realizations per magnitude.
The tsunami simulations were performed using the OpenMP feature of GeoClaw using 30 threads on a Linux server. The total Central Processing Unit time varied for each realization, depending on whether the initial deformation came from a small localized slip patch (requiring small regions of refinement in the ocean and possibly resulting in a negligible tsunami) or a larger rupture. Total Central Processing Unit time (summed over all threads and over all 2,000 simulations) was 49.2 h for the coarse-grid runs and 255.6 h for the fine-grid runs.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 8 Before proceeding, we first show h max in each study region for the two sample M w 9.0 realizations shown in Figure 2. Figures 5, 6 show h max for R 1,665 at Westport and Crescent City, respectively, and similarly Figures 7, 8 show h max for R 1,999 . In each case, panel A shows the fine-grid result h f max while B shows the coarse-grid result h c max . Note that the 9" grid cell resolution is clearly visible in B and that this coarse grid cannot resolve all features of the flow, but that the general order of magnitude is correct. Panel C shows the difference between coarse and fine results, which are substantial in some regions.
The remaining panels of each figure show the result of enhancing the coarse-grid results using techniques developed in the next section, where these will be discussed in more detail.

COARSE-MOD AND PSEUDO-FINE ENHANCEMENTS
Our PTHA approach starts by sampling N realizations, which we denote by R i (for i 1, 2, . . . , N). These may consist of N j realizations from class j as described in Section 1, with N j N j . Performing coarse-grid simulations of each gives us h c max at each location on a coarse grid covering the study region. We wish to avoid doing fine-grid simulations of all realizations, and instead we will use a clustering approach, described in detail in below, to group these into K clusters and to choose one representative realization from each cluster. This "cluster representative" will be denoted by R k for the particular realization from cluster k 1, 2, . . . , K. Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 One approach to approximating the hazard curves for N realizations using K clusters is to perform only one fine-grid simulation for each cluster (for a "cluster representative" realization selected from the cluster), and assign a weight to each that is the sum of the weights of all realizations in the cluster. We show results of this approach in Section 7. However, using only K events will give a hazard curve with only K jump discontinuities and cannot well approximate the true hazard curve if K is much smaller than N, particularly at the lower probabilities. In our example application, N 2, 000 and we will choose K 18.
Much better results are obtained if we also make use of the remaining N − K 1, 982 coarse-grid simulations that were performed to do the clustering. The coarse grid results alone do not give sufficient resolution of h max for use directly, but they can be enhanced to approximate the inundation that each would produce on a fine grid with much less work than required to do the fine-grid simulation. This is done as a two-stage process. In the first ("coarse-mod") step the coarse grid results are combined with the fine-grid topography to give estimates of the maximum depth on the fine topography. This is independent of the clustering and can be done immediately following each coarse grid simulation. The second ("pseudo-fine") step uses the clustering, and the idea that the difference between the coarsemod and fine-grid simulations at the cluster representative (both of which are available) gives a good indication of how other coarse-mod results in the same cluster should be adjusted to better approximate the result of a fine-grid simulation. We describe each of these in turn.

Modified Coarse Grid Corrections
Each of the N 2, 000 coarse-grid runs provides an estimate of h c max at a set of coarse grid points covering the study region. These were computed using a coarse grid (with resolution nine" in our Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 case), which is inadequate to represent the community of interest. For example, the top row of Figures 5-8 shows h max in each study region for two particular realizations, and the difference in resolution is apparent between the fine-grid simulation (h f max in Panel A of each figure) and the corresponding coarse-grid simulation (h c max in Panel B). Panel C of each figure shows the difference between the fine and coarse results, after the coarse-grid results are interpolated to the fine grid as a piecewise constant function over each coarse grid cell.
The coarse-mod corrections are based on the observation that the maximum water surface η max B + h max (where B is the pre-seismic topography) is often much more smoothly varying over a community than is the maximum water depth h max (Note that even a constant η max throughout the study region would still have large variations in h max η max − B due to the variations in topography B.). Hence at any point, if we assume η max is roughly correct, we can get a better estimate of h max by subtracting off the fine-grid topography at this point from the η c max value predicted by the coarse-grid simulation. As usual, we focus on values at a single grid point on the fine grid. Let B f represent topography from the fine-grid simulation at this point and B c the topography value from the coarse-grid simulation in the coarse cell containing this point. Then where In other words, we simply adjust h c max at each fine grid point by ΔB, the difference between the fine and coarse topography at this point. This represents the most common situation where water reaches both the coarse and fine bathymetry levels and is given in the first two lines in Eq. 9.
However, there are a few special cases where we can not use (7). Clearly, if ΔB > h c max > 0, we can not allow h cm max to become negative (the water reaches the coarse bathymetry level but not the fine level), so h cm max is set to 0 in the third line in Eq. 9.
The last three lines in Eq. 9 refer to the special case when water does not reach the coarse bathymetry level (h c max 0). In this case, water may or may not reach the fine bathymetry level. To determine if it does, we define an η threshold value, called η T and now use η c to denote the η max value at a point of interest on the coarse grid. Over the four neighboring grid cells around the coarse cell containing the location of interest, we find η T , the FIGURE 6 | Sample results for realization 1,665 at Crescent City, where the tsunami was large. Panels as described in Figure 5. Purple is above 8 m. and Green is land not inundated.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 maximum η c value where h c max > 0; that is, the threshold where we have seen flooding locally. Lines four and five in Eq. 9 give h cm max when water can reach the fine bathymetry but not the coarse bathymetry level. In line four, the threshold is at least the fine bathymetry, but doesn't exceed the coarse bathymetry, so only η T − B f meters of water can be placed above the fine bathymetry. In line five, the threshold η T computed from including the four neighboring coarse cells is at least the coarse bathymetry level, but we do not allow h cm max to exceed the value of B c − B f since no water appeared above B c in the cell of interest. Lastly, the sixth line in Eq. 9 gives the situation where water does not reach either the fine or coarse bathymetry levels since η T did not exceed B f . In all cases, h cm max will remain as h c max at locations where the fine and coarse bathymetries were equal (ΔB 0). Based on the above discussion the modified coarse grid value is given by: The second row of Figures 5-8 shows examples of the effect of this. Panel D is again the fine grid h max but now Panel E shows the h max estimated on the fine grid after applying these coarse-mod corrections. Panel F shows the resulting errors.
An earlier version of this coarse-mod strategy was used in (Adams et al., 2017), and more examples are given in the Appendix of that report showing how well the modified coarse data can compare to the original coarse data and the fine data. We have since improved this strategy by looking only locally for the FIGURE 7 | Sample results for realization 1,999 at Westport, where the tsunami was large. Panels as described in Figure 5. Purple is above 8 m and Green is land not inundated.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 threshold value η T as opposed to across the entire community in the Federal Emergency Management Agency report, and have allowed this value to even be negative, which makes the strategy more applicable to locations near the shoreline that see little inundation during an uplift event. These improvements also result in a more effective pseudo-fine strategy as discussed below.

Pseudo Fine Grid Corrections
We now present an approach to further improve each of the coarse-mod results defined by (9) by also using the fine-grid runs performed for each cluster representative. We begin by clustering the N modified coarse grid runs (or the original N coarse grid runs) into a small number of non-overlapping clusters. For this paper, the clustering was done using the original N 2, 000 coarse grid runs, to produce K 18 clusters, as described in Section 5. These clusters contain different numbers of runs. Each cluster has one run designated as its cluster representative, which we will denote by R k for the particular realization from cluster k 1, 2, . . . , K.
After the clustering has been done, more information is available that can be used to further improve the coarse-mod approximations. Since we assume that the cluster representative R k is somewhat typical of the pattern of flooding seen for all realizations in the cluster, and since we have both fine-grid and coarse-mod results available for this representative realization, we can use the difference between these as an estimate of what the difference between fine-grid and coarse-mod results would be for all realizations in the cluster. This is used to modify each coarsemod result to get a better approximation to the expected fine-grid result. This is what we call the pseudo-fine result for each realization. We again use h cm max (R i ) to denote the h max value obtained for a particular realization R i from the coarse-grid simulation after applying the coarse-mod corrections, and similarly h f max (R i ) comes from the fine-grid simulation, as always focusing on a single spatial location. Then the pseudofine approximation at this location for each realization R i in Cluster k is given by Note in particular that the pseduo-fine result for the cluster representative itself (i.e, for R i R k ) agrees exactly with the finegrid result for that realization. For non-representative cluster members, the pseudo-fine results improve as the clustering improves, since differences between their coarse-mod and fine results become closer to the difference between the cluster representative's coarse-mod and fine results (the last two terms in Eq. 10). Increasing the number of clusters would increase the number of cluster representatives while reducing the number of non-representatives per cluster, and could improve the pseudo-fine results at the expense of additional fine grid simulations. We have not investigated these tradeoffs, as the pseudo-fine results reported in the next sections were quite good already for our two chosen communities of interest using a small number of clusters. We also believe the modified-coarse results provide an excellent start in building the pseudo-fine results because the coarse grid GeoClaw simulations already contain information about the nonlinear flow dynamics on land. This is in contrast to cheaper coarse results that could have been obtained in deep water using the linearlized shallow water equations, such as the thousands used by Li et al. (2016), but this is an approximation we wished to avoid since our focus is the inundation on land.
The third row of Figures 5-8 show examples of the effect of this. Panel G is again the fine grid h max but now Panel H shows the h max estimated on the fine grid after applying these pseudo-fine corrections. Panel I shows the resulting errors relative to the finegrid result. Additional illustrations of this idea can be found in the Appendix to (Adams et al., 2017).

CLUSTERING
In this section we discuss how one can subdivide the individual events into a small number of clusters that are likely to have similar inundation patterns. The clustering will be based on proxy quantities for each event that can be computed solely from the coarse-grid (low-resolution) simulations, whose runtime is orders of magnitude smaller than the fine-grid (high-resolution) tsunami inundation simulations.
In our previous work (Adams et al., 2017), proxy variables based only on the seafloor deformation of each realization were also considered. These are much cheaper to compute than the coarse-grid simulation proxies, but did not do as good a job of clustering, even though in that work we only considered ruptures on the southern margin of CSZ and only one near-field study region, Crescent City. Given the wider range of events now being considered, where many events are localized far away from a study region, we believe that it may be harder to develop robust proxies based only on the seafloor deformations. However, due to the greater efficiency of that approach, this could be a fruitful area for future research. We also note that Volpe et al. (2019) used seafloor deformation near the study region, both to classify realizations into near-field and far-field, and also in clustering the near-field realizations.
In this work we use three proxy variables computed from the coarse-grid simulations for each realization. Each realization thus corresponds to a point in a three-dimensional space and various clustering methods can then be used to identify clusters. We consider variables that attempt to capture aspects of the spatial variation of the inundation patterns. For each realization, and each coastal location, we compute the value η c from the coarsegrid tsunami simulation that measures the surface level at maximum inundation, defined by where B c represents the pre-event topography on the coarse grid. We consider the spatial variation of η c over the onshore points that are flooded: the grid-cells in the simulation centered at (x i , y j ) that satisfy both B c (x i , y j ) ≥ 0 and h c max (x i , y j ) > 0. We will denote these values from the coarse-grid simulation by η c ij . The total number of flooded onshore points will be denoted by N f lood .
As the proxy variables, we will use the following statistics of η c , where the sums are over all onshore flooded points (i, j): The first variable η c logsum is a measure of the total extent and elevation of the flooding, while the second variable η c mean is the mean surface elevation. The third variable η c sd measures the spatial variation of surface elevation over the onshore flooded region. The first two variables summarize the severity of the flooding while the third variable summarizes the spatial variation of the flooding pattern.
Utilizing these quantities, the coarse-grid inundation from each individual realization can be mapped to a point in the threedimensional space of proxy variables, which we will call "proxyspace," as shown in Figure 9. To perform clustering it is also necessary to define a metric that measures the distance between two points in this space, and here we simply use the Euclidean distance (the square root of the sum of squares of differences in each of the three proxy variables). We then use K-means clustering (Lloyd, 1982), as implemented in scipy. sklearn by Pedregosa et al. (2011) to cluster the 2,000 points in proxy-space into K clusters, with the property that each point belongs to the cluster with the closest centroid (as measured in the specified metric). Note that because η c mean typically has larger magnitudes than the other two proxy variables (see Figure 9), the use of the Euclidean distance effectively weights differences in η c mean more heavily than differences in the other proxy variables. We also tried first normalizing all proxy variables (equivalent to using a different weighted metric) and somewhat different clusters were generated but with very similar final results in the hazard curves. If using proxy variables with vastly different magnitudes, and/or where some variables are thought to be more important than others, some care should be used in choosing the metric.
This clustering is done independently for the two study regions, and the results are shown in the scatter plots in Figure 9. The plot also highlights which realization is closest to the centroid of each cluster, which we refer to as the cluster representative R k for Cluster k. The number of clusters is determined by the user, and in this instance 18 clusters were used to subdivide the 2,000 events of varying magnitude. We also tried clustering with only 6 or 12 clusters but the results did not match the all-fine (ground truth model) as well, while using more than 18 clusters did not seem necessary for this particular data set. Volpe et al. (2019) use an iterative procedure to select the number of clusters by enforcing a maximum distance between cluster members and the centroid and this might be a good approach in general.
From Figure 9 one observes that there is a general monotonic behavior with respect to all three variables. Higher magnitude events tend to have higher values for all three variables. But also note that the scatter plot for the two coastal communities show qualitatively different patterns. While the points in proxy space for Westport show a more predictable behavior with higher magnitude events more concentrated along a smooth, monotone increasing curve, there is much more variation in the proxy variable η c sd in Crescent City and consequently more scatter. This can be explained by the difference in topography. In Westport there are ridges in the north-south direction which roughly separates the onshore regions into zones that can flood independently, leading to more variation in η c sd for most realizations. However, for realizations with the most severe surface elevations, all zones are inundated, leading to smaller variation in η c sd . In Crescent City, most of the onshore region is facing the south-west direction and is rising more monotonically away from the coast, but there is a sharp gradient in the topography in the west shoreline, acting as a barrier. The variation of inundation along this barrier is significant for extreme realizations, causing more scatter in η c sd .

HAZARD CURVES AND MAPS
Finally we combine the techniques developed in the previous sections to produce approximate hazard curves with much less work than would be required to perform all N fine-grid simulations. We first summarize our notation and the definition of the hazard curves and these approximations. For other discussions of hazard curves, see, for example, (González et al., 2014;Adams et al., 2015), and also the review paper (Grezio and Babeyko, 2017). and associated Jupyter notebooks that illustrate these concepts interactively.
We assume that we have split all possible events into J classes indexed by j 1, 2, . . . , J (in our case J 4 and the classes are for Mw 7.5, 8.0, 8.5, and 9.0). We assume each class has an associated annual probability P j . We also make the reasonable assumption that these probabilities are sufficiently small that the probability of two events happening in a year is negligible. More specifically, we assume that the annual probability of at least one earthquake (from the classes considered) is well approximated by the sum of the P j . In fact if the different classes represent potentially independent events, then the probability of at least one of them occurring is given by As long as the probabilities are small, the final line of (13) is a good approximation to the true value. For the probabilities listed in Table 1, P total 0.08527 when calculated using the product formula in the first line of (13), and is well approximated by the slightly larger value obtained using the sum P j 0.08704.
We next chose N j realizations from each class j, for a total of N j N j realizations. We use R i as shorthand for "Realization i", for i 1, 2, . . . , N (enumerating all realizations from all of the classes). For each R i we assign an associated weight w i defined as w i P j /N j if R i is of class j.
Now consider a fixed location in the study region where we have computed h max , the maximum tsunami inundation depth, for each realization. The value computed on the fine grid for FIGURE 9 | Clustering results for the two study regions. In each case all 2,000 realizations are represented as a point in the three-dimensional proxy-space, and colored by cluster after with 18 clusters in each case. The crosses indicate the cluster representative, i.e., the realization closest to the centroid of each cluster.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 realization R i will be denoted by h f max (R i ). If we perform finegrid simulations for all realizations, then we can define the ground truth model hazard curve (at this location) as follows. For any exceedance value h ≥ 0 we might choose, let {i : h f max (R i ) > h} denote the indices of the set of realizations for which h max computed on the fine grid exceeds this value. Then we define Plotting p f ( h) vs. h gives the hazard curves, as shown, for example, in Figure 10. From the hazard curve at each point on a grid covering the study region, it is possible to extract the data needed to produce a hazard map; see Section 7.2. Note that summing the weights w i over all R i for which h is exceeded, as done in (14), is equivalent to computing J j 1 ( N j /N j )P j , where N j is the number of realizations from Class j for which h f max (R i ) > h (Since each w i P j /N j for some j and we add in one such contribution for each realization that exceeded h.). We refer to the w i as weights rather than probabilities because we do not mean to imply that every realization in a class has the same probability of occurring, even though they each have the same weight. Some of the realizations may be outliers that are very unlikely to occur, while most of them will come from closer to the center of the distribution. But because we assume that we sampled the distribution within each class properly, the fraction N j /N j is the proper frequency to modify the probability P j , and our choice of weights accomplishes this via the definition (14).
In Section 5 we discussed an approach to clustering the R i into clusters indexed by k 1, 2, . . . , K, for some number of clusters K that is much smaller than N. For each cluster we identified one realization R k from the cluster that we will call the "cluster representative," with the hope that a single fine-grid simulation of the tsunami resulting from R k will give a good indication of the flooding expected for all realizations in the cluster.
One simple strategy for approximating the hazard curve is then to assign a weight w k to Cluster k, defined by w k w i summed over {i : R i is in cluster k}, and then approximate P[h max > h] by a function we will denote p cf ( h) with the superscript standing for "cluster-fine": Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 In other words, we assume that if the tsunami from the cluster representative R k gives an h max exceeding h then all realizations in the cluster will, and we add in the cluster weight w k in this case, which is just the sum of the weights w i for all realizations in the cluster.
Another approximation, which requires no clustering, would be to approximate P[h max > h] by where h c max (R i ) is the h max value obtained with a coarse grid simulation of realization R i . This uses information from all N realizations, but is presumably not a good approximation because, by definition, the coarse grid is not sufficiently fine to resolve the study region adequately.
Much better results can be obtained by using all of the coarse-grid results after enhancing them using the techniques presented in Section 4. Using only the coarse-mod corrections to each coarse-grid result would lead to the approximation while also using the clustering to produce the pseudo-fine corrections gives In general using Eq. 19 as an approximation to the all-fine grid (ground truth model) hazard curve defined by p f ( h) has been found to give very good results with much less work. Only K fine-grid simulations need to be performed, but the results are based on all N realizations, with pseudo-fine approximations that are often nearly as good as the all-fine grid results when incorporated into PTHA. Hazard curves, hazard plan view and transect maps, and a table of differences for comparing these models to the all-fine grid ground truth model are given in Section 7 below.

PROBABILISTIC TSUNAMI HAZARD ASSESSMENT RESULTS
We now explore the results of performing PTHA using the clustering strategies developed in Section 5, either alone or in conjunction with additional coarse-mod or pseudo-fine results as developed in Section 4. Recall that we have sampled N 2, 000 realizations of a CSZ event using the techniques described in Section 2, and for the purposes of this paper we assume that the hazard curves (and resulting maps) that are generated from a fine-grid tsunami simulation of each of these events is the correct reference solution, which we are trying to approximate more cheaply using the clustering and pseudo-fine grid techniques. To assess the accuracy of our approximations, we performed fine-grid simulations of all realizations in order to compute p f ( h), although in practice this is what we wish to avoid.

Hazard Curves
Recall from Section 6 that a hazard curve is defined at each point in the study region where h max values have been calculated over the entire simulation. Figure 10 shows sample hazard curves at two locations in Westport, and two in Crescent City. At each location the figure shows the reference curve p f ( h) and three of the approximations discussed above, the clusters-fine, coarsemod, and pseudo-fine strategies. The particular spatial points were chosen to illustrate typical hazard curves. Additional hazard curves can be found on the website Williamson et al. (2020).
Note that the all-fine, coarse-mod, and pseudo-fine hazard curves obtained using 2,000 realizations typically have 2,000 jump discontinuities, one at the location of h max for each realization. The magnitude of the jump in probability at each discontinuity is equal to the weight w i assigned to that realization. This is because this h max value contributes w i to the estimated annual probability for any smaller exceedance value, but not for any larger exceedance value. Also note that the smallest nonzero probability that can occur on any hazard curve is the weight we assign to M w 9 events, w 4 P 4 /500 3.8 × 10 −6 where P 4 is from Table 1.
The hazard curve p cf ( h) for the cluster-fine strategy is computed using only the fine-grid results h f max (R k ) for the 18 cluster representatives. As a result, it has only 18 jump discontinuities and the jump in annual probability at each discontinuity is the cluster weight w k . This generally gives a reasonable approximation to the true hazard curve within the constraint of a piecewise constant function with so few jumps. It may not agree well for the most extreme events (smallest probabilities) since it assigns 0 annual probability to any exceedance value h greater than the maximum of h f max (R k ) over the k 1, 2, . . . , 18, whereas the true hazard curve goes to 0 only above h max i h f max (R i ) maximized over all 2,000 realizations. If the realization that maximizes this happens to be a cluster representative then the two hazard curves indicate the same maximum possible flooding, but in general this will not be the case. Similarly, for other very small values of p that correspond to inundation by only a few of the 2,000 realizations, the probability can not be properly represented when only using the 18 cluster representatives.
Using the coarse-mod enhancement of each coarse-grid result gives the hazard curve p cm ( h). Even though all N simulations are now used, this correction is not sufficient to give good results in general, and this hazard curve generally deviates significantly from the correct hazard curve p f ( h). However, using the pseudo-fine version of all N coarse-grid simulations gives much better results, as seen in Figure 10 and also generally seen at other locations.
In evaluating the results shown in Figure 10, it is important to remember that we cannot expect very good agreement at the smallest annual probabilities, where the results depend entirely on the most extreme tsunamis out of the 2,000 selected. Of most interest in this study is the portion of the hazard curve above say p 10 −4 , which corresponds to a return time of T 10, 000 years. Developing an accurate hazard curve for lower probabilities would require more than 2,000 realizations, even considering only the aleatoric Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 uncertainty. Moreover, the epistemic lack of knowledge of the proper probability distribution would also be a limiting factor.

Hazard Maps and Transects
The hazard curves p f ( h) at every point on the h max grid can be combined to produce hazard maps, as follows. For a fixed annual probability p we determine from each hazard curve the corresponding value of h such that p f ( h) p. After doing this at every grid point in the study region we can produce plan view plots of the expected inundation depth h for this p. Figures 11, 12 show such plots for two different probabilities, p 0.002 (return time T 500 years) and p 0.0004 (return time T 2, 500 years). In each case we show the results for four of the strategies listed above. Again the all-fine strategy gives the reference result, and we compare this to the cluster-fine, coarsemod, and pseudo-fine strategies. In general the pseudo-fine strategy gives the best approximation to the all-fine results.
Note that in these maps we show offshore points as well as onshore points, since h max for both the fine-grid and coarse-grid simulations were obtained by monitoring the maximum water depth on rectangular grids also covering some offshore points. At these points h max is always at least as great as the original preseismic water depth, so at these points we do not plot h max itself but rather the quantity we call zeta, defined by ζ max h max + B, where B is the pre-seismic topography at the point. More generally we define Then ζ max agrees with h max onshore, is continuous at the shoreline, and offshore it indicates the maximum tsunami elevation relative to sealevel (We always use the pre-seismic topography in defining this, since each realization can have a different amount of uplift or subsidence.). It is hard to quantitatively compare these hazard maps, and impossible to present results for more than one probability p on the same map of this type. So in Figures 11, 12 we also show two selected transects in each study region (each at some fixed latitude). We then plot a cross section of the hazard map along this transect for four different values of T 1/p as FIGURE 11 | Sample hazard maps and transects for select locations in Westport. The top panels show plan view plots for return times T 500 years (left) and 2,500 years (right), over the same spatial domain as shown in Figure 3. In each case panel (A) shows the reference all-fine hazard maps produced with the all-fine hazard curves p f ( h), (B) shows the map produced with the coarse-mod hazard curves p cm ( h), (C) shows the map produced with the clusters-fine hazard curves p cf ( h), (D) shows the map produced with the pseudo-fine hazard curves p pf ( h). The bottom figures show transects of the hazard maps for four different choices of annual probabilities p, corresponding to different return times T 1/p as indicated in the legend. In each case the dashed line is the all-fine reference solution, while the solid line is the approximation generated using 18 clusters and the pseudo-fine improvements of the other coarse-grid runs.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 indicated in the legend. As T increases the expected flood level naturally increases. In these plots we also compare two strategies, the reference all-fine result as a dashed line and the pseudo-fine strategy as the solid line. These plots clearly show that the pseudo-fine strategy does a remarkably good job of estimating the all-fine (ground truth model) inundation for all four return times, at least along these particular transects. These are fairly typical of the results seen at other locations in the study regions, and additional plots are available on the webpage Williamson et al. (2020). Finally, in Table 2, we give the maximum and mean differences between ζ max over each study region for two different return times, for each of the three strategies illustrated above. Note that in each case, the pseudo-fine strategy has mean errors less than 15 cm, even though the mean value of ζ max varied from 1.88 to 5.18 m in the four FIGURE 12 | Sample hazard maps and transects for Crescent City. As described in the caption to Figure 11. TABLE 2 | The magnitude of ζ max and differences between ζ max as computed using the coarse-mod, clusters-fine, and all-pseudo strategies, compared to the reference allfine strategy at both Westport and Crescent City (CC), for return times T 2, 500 and 500 years (p 0.0004 and 0.002). 14 All values are in meters. The columns labeled max (Δ) and mean (Δ) are the maximum and mean of the difference over all grid points (i, j), e.g., for the coarse-mod row, Δ ζ cm max − ζ f max . For comparison, the corresponding maximum and mean values of ζ max across the community are also listed for each strategy, in the columns labeled max (ζ) and mean (ζ). Recall that ζ represents the maximum flooding depth on land, or the flooding depth added to the pre-seismic bathymetry for points offshore.
Frontiers in Earth Science | www.frontiersin.org October 2020 | Volume 8 | Article 591663 cases shown (T 2, 500 and 500 years, at the two study regions). This indicates that our approach is capable of giving very small relative errors compared to the all-fine (ground truth model) in the maximum flooding depth of the tsunami, even for the longer return time shown.

CONCLUSIONS
We have presented a general approach to performing PTHA when given a) a set of classes of possible events with an annual probability or return time for each class, and b) a probability density within each class that can be sampled to obtain a sufficiently large number of sample realizations that hazard curves can be accurately approximated by performing fine grid tsunami simulations for each realization. The problem we considered is that the number of realizations needed may be too large to perform fine grid simulations of each, particularly if many coastal locations are of interest, and so the goal is to obtain good approximations to the hazard curves that would be generated by all the fine-grid simulations with much less work, employing only coarse-grid simulations, clustering, and correction procedures. We considered a model problem where 2,000 fine-grid inundation simulations were performed in order to obtain a reference hazard curve to test our methodology, which we again summarize. We first performed coarse-grid inundation simulations for each realization, using a set of four magnitude classes and a K-L expansion to define the probability density within each class for illustrative purposes. We then clustered them into only 18 clusters and performed one fine-grid simulation for a single representative from each cluster. We also used these 18 simulations, with very little additional work, to enhance the remaining coarse-grid results. The resulting 2,000 pseudo-fine results were then used to produce approximate hazard curves that are much more accurate than those obtained using the cluster representatives alone.
Although we chose geophysically reasonable parameters, we do not claim that the results of our fine-grid hazard curves are correct, only that they are reasonable reference solutions against which to compare cheaper strategies. We have also ignored other magnitude events on the CSZ, other fault mechanisms such as splay faults, along with distant earthquakes and other tsunami sources, so the results in this paper should not be interpreted as providing realistic estimates of hazards in either Crescent City or Westport. Certainly any realistic PTHA meant to inform decision-making should also include sensitivity studies, particularly in light of the large epistemic uncertainty in the parameters that go into the probability densities (whether generated by K-L expansion or by any other technique). The techniques of this paper can be useful for such sensitivity studies since they can accelerate the PTHA for any set of realizations and thus allow testing more sets of realizations in order to investigate the resulting variation in hazard curves. Sets of realizations might be generated with different density parameters, or be of different sizes. Different random sets of realizations of the same size can also help to better explore the aleatoric uncertainty.
More work is needed to better understand and optimize the clustering method used in this paper. In particular there is a need to better quantify the number of clusters that should be used, and the best set of proxy variables to use in performing the clustering.
However, we note that we were generally able to achieve nearly the same hazard curves with our pseudo-fine results as when using all the fine-grid results, down to an annual probability of p 10 −4 or less for each of our test communities. We therefore believe these techniques can be a useful component in a full probabilistic study of these sites, and many others. We also note that they can be applied to any other choice of classes and probability densities, and adapted to work with any tsunami modeling software.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: http://depts. washington.edu/ptha/frontiers2020a/.