Dynamic Statistical Models for Pyroclastic Density Current Generation at Soufrière Hills Volcano

To mitigate volcanic hazards from pyroclastic density currents, volcanologists generate hazard maps that provide long-term forecasts of areas of potential impact. Several recent efforts in the field develop new statistical methods for application of flow models to generate fully probabilistic hazard maps that both account for, and quantify, uncertainty. However a limitation to the use of most statistical hazard models, and a key source of uncertainty within them, is the time-averaged nature of the datasets by which the volcanic activity is statistically characterized. Where the level, or directionality, of volcanic activity frequently changes, e.g. during protracted eruptive episodes, or at volcanoes that are classified as persistently active, it is not appropriate to make short term forecasts based on longer time-averaged metrics of the activity. Thus, here we build, fit and explore dynamic statistical models for the generation of pyroclastic density current from Soufriere Hills Volcano (SHV) on Montserrat including their respective collapse direction and flow volumes based on 1996-2008 flow datasets. The development of this approach allows for short-term behavioral changes to be taken into account in probabilistic volcanic hazard assessments. We show that collapses from the SHV lava dome follow a clear pattern, and that a series of smaller flows in a given direction often culminate in a larger collapse and thereafter directionality of the flows change. Such models enable short term forecasting (weeks to months) that can reflect evolving conditions such as dome and crater morphology changes and non-stationary eruptive behavior such as extrusion rate variations. For example, the probability of inundation of the Belham Valley in the first 180 days of a forecast period is about twice as high for lava domes facing Northwest toward that valley as it is for domes pointing East toward the Tar River Valley. As rich multi-parametric volcano monitoring dataset become increasingly available, eruption forecasting is becoming an increasingly viable and important research field. We demonstrate an approach to utilize such data in order to appropriately ‘tune’ probabilistic hazard assessments for pyroclastic flows. Our broader objective with development of this method is to help advance time-dependent volcanic hazard assessment, by bridging the

To mitigate volcanic hazards from pyroclastic density currents, volcanologists generate hazard maps that provide long-term forecasts of areas of potential impact. Several recent efforts in the field develop new statistical methods for application of flow models to generate fully probabilistic hazard maps that both account for, and quantify, uncertainty. However, a limitation to the use of most statistical hazard models, and a key source of uncertainty within them, is the time-averaged nature of the datasets by which the volcanic activity is statistically characterized. Where the level, or directionality, of volcanic activity frequently changes, e.g., during protracted eruptive episodes, or at volcanoes that are classified as persistently active, it is not appropriate to make short term forecasts based on longer time-averaged metrics of the activity. Thus, here we build, fit and explore dynamic statistical models for the generation of pyroclastic density currents from Soufrière Hills Volcano (SHV) on Montserrat including their respective collapse direction and flow volumes based on 1996-2008 flow datasets. The development of this approach allows for short-term behavioral changes to be taken into account in probabilistic volcanic hazard assessments. We show that collapses from the SHV lava dome follow a clear pattern, and that a series of smaller flows in a given direction often culminate in a larger collapse and thereafter directionality of the flows changes. Such models enable short term forecasting (weeks to months) that can reflect evolving conditions such as dome and crater morphology changes and non-stationary eruptive behavior such as extrusion rate variations. For example, the probability of inundation of the Belham Valley in the first 180 days of a forecast period is about twice as high for lava domes facing Northwest toward that valley as it is for domes pointing East toward the Tar River Valley. As rich multi-parametric volcano monitoring datasets become increasingly available, eruption forecasting is becoming an increasingly viable and important research field. We demonstrate an approach to utilize such data in order to appropriately tune probabilistic hazard assessments for pyroclastic flows. Our broader objective with development of this method is to help advance time-dependent volcanic hazard assessment, by bridging the gap between eruption forecasting based on monitoring time series data and development of cutting edge probabilistic volcanic hazard maps.

INTRODUCTION
The Soufrière Hills Volcano (SHV), Montserrat, has been extensively studied since it erupted on 18 July, 1995 for the first time in centuries.
The multi-parametric datasets coupled with detailed observations of the surface activity for the main eruptive period facilitates exploration of forecasting models. Thus, SHV provides an ideal setting to build and explore next-generation models for short-term (weeks to months) pyroclastic density current forecasting where eruptive precursors and current conditions such as dome growth and orientation may be brought to bear (see Section 3.1). Here we propose models for the frequency, volume, and initial directions of pyroclastic density currents (PDCs) at SHV, with the end goal of improving forecasts for these sometimes devastating events. Most efforts at modeling for PDC forecasts either focus on retrospective forecasting (e.g., Sulpizio et al., 2010;Charbonnier and Gertisser, 2012) or long-term forecasting (e.g., Bayarri et al., 2009Bayarri et al., , 2015Connor et al., 2012;Sandri et al., 2014;Neri et al., 2015), based on the stationary assumption that behavior remains constant over time. Recently, non-stationary models have been developed for long-term PDC forecasting  and for very long-term hazard evolution modeling (Jaquet et al., 2017). Non-stationary models have also been developed for, and applied to, modeling the state of volcanic systems (onset, active, repose, etc.) (e.g., Carta et al., 1981;Bebbington, 2007Bebbington, , 2010Bebbington, , 2012 and for forecasting volcanic events ≥VE1 (Bebbington, 2012). Here we present the first approach to short-term probability-based PDC forecasting based on dynamic statistical models.
Earlier, Bayarri et al. (2009) modeled the times of PDC events as a stationary stochastic process, with distributions for their volumes and initial directions that did not change over time. While our results suggest that this is adequate for longterm forecasts and hazard assessment (but see Bebbington, 2010;Bevilacqua et al., 2017;Jaquet et al., 2017 for a contrasting view), we show that the frequency and directional distribution vary markedly over shorter time intervals. Both Aspinall et al. (2003) and Woo (2018) have also highlighted the need for short-term, dynamic forecasting that can take into account monitoring and precursor information. Although much work toward this end focuses on combining probabilities of individual processes using Bayesian Belief Networks or Bayesian Event Trees (e.g., Marzocchi et al., 2008;Sandri et al., 2009;Hinks et al., 2014;Woo and Aspinall, 2014), this work does not focus on probabilistically modeling the individual components. Here we build dynamic statistical models in which PDC frequency alternates between high-activity and low-activity states, and in which the directional distribution can change following domecollapse events.
The dates, valleys affected, and estimated volumes of 845 PDCs of volume ≥ 0.15 · 10 6 m 3 (i.e., 0.15 million cubic meters) recorded at the Montserrat Volcano Observatory (MVO) through July 2008 are reported in Ogburn et al. (2012), one of the most complete PDC dataset ever assembled. PDCs and rockfalls with volumes < 0.15 · 10 6 m 3 are not recorded here. All of the modeling and data analysis in this work is based on that dataset, and by "PDC event" we will mean a PDC at SHV of volume at least 0.15 · 10 6 m 3 . Figures 1A,B show the drainage and valley structure surrounding SHV. Figure 1C shows the volumes (on log scale, as radius, with a small amount of jitter to reveal multiplicity) and initial angles (as azimuth counter-clockwise from due East) of all recorded PDCs. Only PDCs' valleys are known, so interval-censored PDC angles {φ j } are represented as arcs j .
While there is some suggestion of heavier flow activity to the east, the evidence against independent uniform distributions is not strong. Further, the evidence against independence between uniformly distributed initiation angles {φ j } and Paretodistributed volumes {V j } for significant flows is not strong (for flows exceeding 10 6 m 3 , a chi-squared test of independence for categorized data with three levels of volume and five groups of valleys gave a value of χ 2 = 13.3 on 8 degrees of freedom, for a p-value of 0.10 showing little evidence against independence). While this stationary model may be reasonable for long-term forecasts (several years or more), there is evidence that it is not appropriate for short-term forecasts. The PDC frequency appears to vary over time, with alternating high-frequency and lowfrequency periods; the angular distribution of initial flow angles appears non-uniform over short periods; and its distribution too may change over time. Figure 2 shows directional tendencies for the periods between major PDCs (those whose volumes exceed 6 · 10 6 m 3 ), along with a red arc representing the valley inundated by the last major PDC of that period or epoch. Note the strong directional concentration within most of these periods, and the abrupt changes that often occur at the times of major PDCs at the end of each epoch. A Generalized Likelihood Ratio Test confirms that the angular distributions differ across epochs; see section (2.2.3) for details. Figure 3 presents the cumulative number of PDCs for the study period, just over 12 years following 1st January 1996. The slope of the line in this figure, measured in PDC events per year, is one measure of volcanic activity level. Evidently it is not constant over the decade shown and it is recognized that these changes relate, to first order, to the actual growth and pause periods of volcanic activity (Wadge et al., 2010).
Here we propose and support a new model for PDC generation whose frequency and whose initiation angle distributions are random stochastic processes. The PDC frequency jumps on occasion back and forth between two levels, one low and one high. The angular distribution is weighted toward some central direction toward which the lava dome grows; this direction will change to some randomly-chosen new direction following the generation of a major PDC and dome collapse event.
We employ the Objective Bayesian inferential paradigm ) in order to reflect in our volcanic forecasts, both uncertainty about estimates of model parameters, as well as that arising from the natural uncertainty and variability of volcanic systems. In that paradigm, uncertainty about the model parameters of interest (e.g., slopes of cumulative PDC counts), here represented by a vector θ , is represented in the form of a posterior probability density function π(θ | data), proportional (by "Bayes' Rule") to the product of the likelihood function, f (data | θ ), and a prior density function π(θ ) chosen to represent initial ignorance. We present and justify our choices of probability models (and hence the likelihood functions) and prior distributions, and then use Markov chain Monte Carlo (MCMC) numerical methods to generate both parameter estimates (with attendant measures of uncertainty) and model forecasts (also with uncertainty quantified), from their posterior distributions (Rosenthal, 2011). With this model the conditional distribution of the time to the next PDC, and its magnitude and initial direction, may depend on known features of the data set, such as the directions and Days from January 1, 1996 FIGURE 3 | Cumulative count of PDC events, illustrating that PDC frequency varies over time. Note slope of curve is sometimes steep (in high-activity periods) and sometimes shallow (in low-activity periods or dome growth pauses).
times of recent PDCs. In section (3.1) we illustrate how this can be used to improve short-term forecasts. It is worth noting that we do not model the (difficult to observe) magma ascent directly. That said, the timing of events (which we do model) and magma ascent share similar non-stationary patterns (Wadge et al., 2010), but only the event timing is readily observable here. We present the model in two parts: first in section (2.1) we describe the model for the frequency of pyroclastic density currents, then in section (2.2) we describe the model for the volume and initial directions of those flows. In section (3.1) we illustrate and validate how this model can support shortterm forecasting, by making model forecasts for a 3-year period following a specified date (we chose 1st January 2006), based only on data prior to that date, and comparing model forecasts to actual observations. Finally, in section (4) we present our conclusions and some directions for future development. Computational details and lists of hyperparameter values, probability distributions, mathematical notation, likelihood and posterior distributions are collected in an Appendix.

A Dynamic Model for Eruptive Activity
In this section we construct a model for the times of PDC events sufficiently large to be recorded at SHV, i.e., of volume 0.15 · 10 6 m 3 or more.  To accommodate this we model the times {τ j } of PDC events as an inhomogeneous Poisson point process with rate λ(t) that may vary with time. Thus, the numbers of events in disjoint time intervals (a i , b i ] will have independent Poisson distributions with means given by (1) for an uncertain time-varying PDC frequency λ(t), measured in events/ yr. The simplest choice for a function λ(t) that appears consistent with the data would be a two-state model in which λ(t) takes only two possible values 0 < λ lo < λ hi < ∞, switching from the low rate to the high rate and back at transition times {s m , t m } respectively, or .., beginning in the "low" state. Also introduce t 0 : = 0 to simplify formulas below (here the notation "a : = b" means that a is defined to be b). A simulation of such a function appears in Figure 4, with an observation period of [0, T] and a subsequent forecast period of (T, T ′ ] with T = 10 year and T ′ = 15 year. Note that the values of the rates λ lo and λ hi in the upper panel correspond to values of the slopes of the cumulative event counts in the lower panel. The likelihood function for the parameters λ = (λ lo , λ hi ) and transition time vectors st = {s 1 , t 2 , · · · } from such a realization will depend on the data vector τ = {τ j } of observed PDC event times in a specified observation period [0, T] only through two features of the data: the counts N hi and N lo of events that occur in periods of high and low activity, respectively, and the total amounts of time T hi and T lo spent in those periods. These can be evaluated from the observed event times {τ j } and modeled transition times {s m , t m } as follows: Here the notation "(s, t]" denotes the interval of all those times τ for which s < τ ≤ t, with the inclusion of the right end-point indicated by a square bracket and the exclusion of the left endpoint indicated by a round one. Such an interval is said to be "half open on the left". The notation "s ∧ t" denotes the smaller of two real numbers s and t, while 1 A (τ ) for a set A and a real number τ is the "indicator function, " equal to one if τ ∈ A and otherwise zero. Thus, (2) gives a quick way of evaluating the numbers N hi and N lo of PDCs that occur during the unions ∪ m (s m , t m ] of all the intervals in which λ(t) = λ hi takes its "high" values, and ∪ m (t m−1 , s m ] in which λ(t) = λ lo takes its "low" values, respectively. Similarly (3) gives a simple way of evaluating the total lengths of time within the study period [0, T] (see Figure 4) that the PDC frequency is at its high and low levels, respectively.
Under the model described in (1), for fixed event rates λ = (λ lo , λ hi ) and transition times st = {s 1 , t 1 , s 2 , · · · }, the number N hi of events in the high periods will have a Poisson probability distribution N hi ∼ Po(T hi λ hi ) with mean E[N hi ] = T hi λ hi , the product of the total high-frequency duration T hi and the high event rate λ hi (the symbol "∼" means "is distributed as"). Similarly the number N lo of events in the low periods will have the Poisson N lo ∼ Po(T lo λ lo ) distribution, with N hi and N lo independent. This leads to the Poisson likelihood function: (the notation "f (·) ∝ g(·)" means that the function f is proportional to g, i.e., that f (x) = c · g(x) for some unspecified constant c > 0. Note that, unlike probability density functions, likelihood functions are only defined up to an arbitrary positive multiplicative factor). Note that all the {τ j } lie in the interval [0, T], so only transition times {s m , t m } in that interval contribute to the sums in either Equations (2, 3), and hence only those appear in the likelihood function of Equation (4). We still model transition times subsequent to [0, T], though, for the purpose of making forecasts.

Prior Distributions for Eruptive Activity
Our initial efforts to model λ(t) with a two-state Markov chain (i.e., a renewal process with exponentially-distributed waiting times between transitions-see Chaps. 4, 5 of Karlin and Taylor, 1975) failed: under such a model, some high-and low-activity periods would last just days while others would last years. The observed durations of high-activity and low-activity periods are more regular than independent exponential distributions would allow. This led us to the following approach.

Prior Distribution for Transition Times
To reflect this regularity we introduce a renewal process with independent gamma distributions for the durations of low-activity and high-activity intervals, respectively (here "iid" means the random variables are independent and identically distributed, and "Ga(α, β)" denotes the Gamma distributionsee Appendix). We take distinct unitless shape parameters α lo , α hi > 0 and a common rate parameter β > 0 (in yr −1 ). For shape parameters α lo , α hi exceeding one, these intervals will be more regular than exponentially-distributed ones would be (the unitless "coefficient of variation", the standard deviation as a fraction of the mean, is unity for the exponential distribution and σ/µ = 1/ √ α for the Ga(α, β) distribution, so for α > 1 these distributions are less variable than exponential distributions).
We model the transition times {s m , t m } explicitly for a fixed number M of intervals, chosen large enough to ensure that the range [0, t M ] will include the entire study period [0, T ′ ] (of observations and forecasts) with very high probability (this can be ensured by taking M ≫ T ′ β/(α lo + α hi ), since t M ∼ Ga(M(α lo + α hi ), β)). This leads to a prior probability density function (pdf) of the form where f lo (t) and f hi (t) are the pdfs for the low-activity and highactivity period lengths, respectively. For the Gamma distributions we employ, this prior pdf is or log prior pdf (often easier to work with) given by where Ŵ(z) denotes Euler's gamma function (Abramowitz and Stegun, 1964, section 6.1).
Values for the prior hyper-parameters (i.e., quantities governing the probability distributions of the parameters, like {s m } and {t m }, that in turn govern the probability distributions of observable quantities like the times and volumes of eruptions) α lo , α hi , β governing interval lengths are based on their elicited means α lo /β ≈ 3 year and α hi /β ≈ 1 year and coefficient of variation σ/µ ≈ 0.5 for full cycles (s m+1 − s m ). Results were insensitive to these choices.

Prior Distribution for Levels of Activity
We model the rates 0 < λ lo < λ hi < ∞ of PDCs with volume V > ǫ with a joint prior distribution with density for unitless shape parameters a lo , a hi > 0 and repulsion parameter r > −1, with rate parameter b > 0 (in yr). Here the notation 1 E for an event E denotes the indicator function taking the value one if the event E occurs and otherwise zero. For r = 0 this gives independent Gamma random variables conditioned to satisfy the order relation λ lo < λ hi , but taking r > 0 will encourage larger separation |λ hi − λ lo | between the high and low rates.
Values for the hyper-parameters a lo , a hi , b, r governing high and low event rates are based on their elicited typical values a lo /b ≈ 0.10 d −1 for λ lo and a hi /b ≈ 0.50 d −1 for λ hi , and their coefficients of variation and correlation (note values must be converted from d to yr). This prior distribution is "conjugate" in the sense that the posterior distribution is of the same form, with new values for the hyper-parameters a lo , a hi , b that will depend on the data.   from low to high. This may, for example, impact decisions on the timing of enlarging no-entry zones when faced with indications of newly heightened activity levels.

A Dynamic model for PDC Volume and Direction
Here we model the volumes V and initial directions φ of PDCs at SHV.

A Dynamic Model for PDC Volumes
Large flow volumes {v j } at Soufrière Hills Volcano are well fit by a power law or Pareto probability model in which the conditional probability that a PDC volume exceeds any value v, given that it exceeds a lower threshold v 0 , is approximately proportional to a negative power of v: The near-linearity on the log-log scale plot of frequency-vs.volume presented in Figure 7 suggests that this holds for α ≈ 1.10 and any v 0 ≥ 6 · 10 6 m 3 (note that α is commonly referred to as the "shape parameter" of the Pareto distribution). Artifacts in the dataset (such as the clumping of reported values at round numbers of 5 · 10 6 m 3 and below) affect reported volumes for smaller flows, making it difficult to confirm or refute relation (7) for lower values of v 0 . We make the modeling assumption that Equation (7) continues to hold for values of v 0 as small as the lower limit of flows reported in Ogburn et al. (2012), 0.15 · 10 6 m 3 . We denote this lower limit by ǫ : = 0.15 · 10 6 m 3 , and limit our modeling efforts to flows of volume V ≥ ǫ. This assumption makes very little difference in forecasts of flows large enough to be hazardous, e.g., those exceeding 10 6 m 3 (which we term "significant"). A generalized likelihood-ratio test gave a GLR of = 3.6 · 10 −30 for Log-Normal distributions compared to Pareto for all 145 recorded PDCs of volume exceeding ǫ and = 4.3 × 10 −7 for the 45 significant PDCs of volume exceeding 1 · 10 6 m 3 , overwhelming evidence in favor of the Pareto. Note this contrasts with the results of Neri et al. (2015), who found the evidence about log-normal vs. Pareto inconclusive for a smaller data set and who expressed their own preference for the log-normal.
The Pareto distribution has much heavier tails than do more commonly encountered distributions like the exponential, lognormal, or gamma, i.e., the probability of exceeding high thresholds for Paretos does not decrease as fast as it does for these other distributions. As a consequence, under this Pareto model one is much more likely to observe future volumes that far exceed those in the recent history. For example, if {V i } iid ∼ Ex(λ) are independent with identical exponential distributions, the probability that a new observation V 11 would be ten time higher than the largest max(V 1 , . . . , V 10 ) of ten recent ones is about 5·10 −6 ; for Pareto random variables (like ours) with α ≈ 1, it's about 10 −2 . The probability that V 101 is ten times higher than the most recent 100 values max(V 1 , . . . , V 100 ) for exponential random variables is about 2 · 10 −14 ; it's far larger for Pareto random variables, about 10 −3 .
The evidence against stationarity of PDC volumes, like that against independence of initiation angles and PDC volumes described in section (1), is very weak. Thus, we model PDC volumes {V j } exceeding threshold ǫ : = 0.15 · 10 6 m 3 as independent and identically-distributed Pa(α, ǫ) random variables, independent of the times {τ j } and initiation angles {φ j }. Note, this modeling choice is the only one shared between this work and our earlier modeling efforts (Bayarri et al., 2009), presenting a simpler stationary model.

Interval Censoring
Some PDC volumes v j are specified precisely in our data set; for others (typically smaller ones), we learn only that v j lies in a reported interval [v min j , v max j ]. Such data are said to be "interval censored". Let J 0 be the set of indices for the uncensored observations, let J 1 be the index set for censored observations, and let J : = J 0 ∪ J 1 denote their union, the set indexing all observations. From (7), the likelihood and log

Flow Volume vs. Frequency
Volume V (m 3 ) Fraction with volume ≥ V 2 ⋅10 6 5 ⋅10 6 10 7 2 ⋅10 7 5 ⋅10 7 10 8 2 ⋅10 8 FIGURE 7 | Plot of fraction of PDCs with volume exceeding V, vs. V, on log-log scale, for PDCs of volume V ≥ 5 · 10 6 m 3 (shown as vertical blue line). Near linearity suggests Pareto (power-law) distribution. Measurement artifacts distort entries below about 6 · 10 6 m 3 : some small volumes are reported only as ranges, indicated here by horizontal lines, and some small volumes are rounded off, leading to apparent ties in the data.
likelihood for the Pareto shape parameter α, based on observed values {v j : j ∈ J 0 } of independent Pareto-distributed random variables {V j : j ∈ J 0 } iid ∼ Pa(α, ǫ) representing the volumes of PDCs exceeding ǫ = 0.15 · 10 6 m 3 , would be anything proportional to L 0 and any constant plus ℓ 0 , respectively, for: The precise volumes v j of some (particularly smaller) PDCs in our data, however, are not reported-instead, an Some of these are shown in Combining (8a) (and setting v min j : = v j ) for the uncensored volume observations {v j : j ∈ J 0 }, and (8b) for the censored ones reflecting the evidence about α from both censored and uncensored observations. This reflects fully the uncertainty arising from imprecisely reported (i.e., interval censored) volumes. Figure 2 suggests that, in the periods between major domecollapse PDC events, initiation angles {φ j } are not uniformly distributed on the circle-rather, within those epochs they appear to be concentrated in the vicinity of some central direction, which may change following PDC events large enough to change the dome morphology (often by dome collapse). We take flows exceeding a threshold volume of : = 6 · 10 6 m 3 as surrogates for "dome collapse events", and model initial flow angles in the epoch (T e , T e+1 ] between successive major flows of volume V > with von Mises distributions:

A Dynamic Model for PDC Angles
indicates that the {φ j } are independent with the specified distributions, here the indicated von Mises) for a common concentration parameter κ φ and epochspecific centrality parameters {µ e }. Note that we model the precise direction φ j of initial dome collapse, but we observe only the valley(s) inundated by the resulting PDC. A Generalized Likelihood Ratio Test comparing the fit to data of a single von Mises distribution for all PDCs, compared to fitting separate von Mises distributions within epochs, generated a χ 2 statistic of 214.5 on 20 degrees of freedom, for a p-value of 1.5 · 10 −34 , overwhelming evidence against uniformity of the angular distribution across epochs.
The von Mises distribution (von Mises, 1918;Mardia and Jupp, 1999, section 3.5.4) on the circle S 1 of possible initiation angles φ is governed by two parameters: the central direction µ ∈ S 1 and the concentration κ ≥ 0. A concentration of κ = 0 gives the uniform distribution on S 1 with no concentration at all, while larger values of κ are more and more concentrated in a small range symmetrically distributed around µ. The probability density function for the vM(µ, κ) distribution (in degrees) with centrality parameter µ ∈ S 1 and concentration parameter κ ≥ 0 is given in Equation (10b) (in which I 0 denotes the modified Bessel function of order zero; see Abramowitz and Stegun, 1964, section 9.6.16) , and is plotted for a few representative values of κ (all centered at µ = 135 • ) in Figure 8. Figure 2 suggests that the initial angles of PDC events seem to be centered within particular valleys between successive major events of volume V > = 6 · 10 6 m 3 . To explore this phenomenon, we fit a sequence of von Mises vM µ e (τ ), κ µ distributions whose centrality parameter µ e is a function of time τ .
In an attempt to guide our modeling efforts, we explore von Mises fits to data in intervals between major events. Each histogram is centered at 0 • , due East. Note these are much narrower than prediction histograms for actual angles {φ j } (cf. Figure 2). evolves. Generally, as time progresses the fit to the centrality parameter gets tighter and often happens to coincide with the valley inundated by the next major flow. This agreement is highlighted by the black vertical bars, the 95% credible interval of the centrality parameter at the time T e+1 of the next major event (using all valley data between major PDCs, T e < τ ≤ T e+1 ), plotted along with (and often on top of) the red bars. The success of the von Mises fits within each epoch bracketed by major events motivates the following modeling choices.
To reflect this phenomenon, we update the centrality parameter µ e following the major PDC of volume V j > at time τ j = T e+1 that ends the epoch (T e , T e+1 ], with a new one: drawn from another von Mises distribution centered at the old direction µ e with its own fixed concentration parameter κ µ . The resulting posterior conditional density is: and vM(µ, κ) distributions, respectively, evaluated at volume v and direction φ.
Here "e j " denotes the epoch e of the jth PDC, which occurs at a time τ j satisfying T e < τ j ≤ T e+1 . The epoch boundaries {T e } and event epochs e j can be calculated from the data {(V j , φ j , τ j )} by setting T 0 : = 0 and, for e ∈ N : = {1, 2, 3, . . . } and j ∈ {1, 2, . . . , N}, T e : = min{τ j > T e−1 : V j > } e j : = max{e : T e < τ j } In practice the times τ j for all PDC events are observed with adequate precision, but some PDC volumes V j and initial angles φ j are not-for some we learn only a range [v min j , v max j ] for the volume, and for all we learn only which valley(s) a flow inundated, and not the specific initial angle φ j , so the data are intervalcensored. We accommodate the interval-censored volumes using the approach of section (2.2.2), and the interval-censored angles by integrating each term f vM (φ j | µ e j , κ φ ) in (10d) with respect to φ j over the appropriate intervals j representing river valley(s) for that flow. Note this approach fully reflects in model forecasts all uncertainty about the precise (but unreported) volumes and angles of flows. Our model for the times and angles of PDCs may be viewed as a self-exciting point process model (i.e., one in which uncertain point intensities are positively correlated at nearby times). Other authors (e.g., Bebbington and Cronin, 2011;Bevilacqua et al., 2016) have used self-exciting point processes to model event locations and times on geological time scales. Figure 10 shows posterior histograms of PDC central directions {µ e } for ten epochs (T e , T e+1 ]; compare with Figure 2, presenting raw data for the same ten epochs. Figure 11 shows the posterior predictive density function for angles {φ j } within each epoch (T e , T e+1 ], as light blue curves, along with the valleys filled by the major PDCs ending those epochs, as shaded intervals.

Retrospective Short Term Forecasts
To illustrate how our methods may be used to make short-term forecasts, and to validate those forecasts, we select a specific time in our data set (1st January 2006, a decade into the eruption) and compare model forecasts for the next few years based only on data prior to that date with actual observations from those same years. Figure 12 shows 100 forecasts of cumulative PDC volume from 1st January 2006 to 1st August 2008 (just over 2.5 years), all based on eruption data prior to 2006. Solid black symbols show the actual cumulative PDC volume then observed in that forecast interval, exhibiting eruption rates and jump sizes similar to those in model forecasts. This may be viewed as a validation of our forecast approach: forecasts of "future" collapse frequencies within the existing data set, based on prior data, are broadly consistent with observations of what actually occurred. For example, the central 90% predictive forecast interval for cumulative volumes after 180 days is 4-250 · 10 6 m 3 , and the actual cumulative volume at that date was 120 · 10 6 m 3 ; the central 90% predictive interval for the highest-volume PDC (and so the biggest jump) in the first year is 2.2-350 · 10 6 m 3 , while the largest observed PDC volume is 200 · 10 6 m 3 . Figure 13 shows the forecast activity levels λ(t), cumulative event counts, central directions {µ e } (counter-clockwise from due East), and cumulative volumes in Figures 13A-D, respectively, from a single draw from the posterior predictive distribution of activity from 1st January 2006 through 1st August 2008. Note that during time intervals of high frequency λ(t) (as indicated in Figure 13A), PDC event frequencies in Figure 13B are higher (i.e., cumulative event count slope is steeper), and density of points is higher in Figures 13C,D. Also note that change times for centrality parameters µ e in Figure 13C coincide with major jumps in cumulative volume Figure 13D), i.e., with large-volume PDCs. Figure 14 offers another way of visualizing the same posterior predictive simulation presented in Figure 13. Time (in days) is represented radially on a linear scale. Central angles {µ e } are represented as radial line segments that may change at the times (indicated by concentric circles) of dome-collapse PDCs, whose angles are indicated by filled colored circles. This draw featured two such major PDCs, and so two full epochs and part of a third, in that first approximately 2.5 years. Probability forecasts based on Monte Carlo simulations of PDCs (i.e., repeatedly sampling volumes V j , times τ j , and initiation angles φ j as illustrated in Figures 12, 13), based on the posterior distributions described in this work, quantify both epistemic and aleatoric uncertainty-let us explain further. Note that in Figures 5, 10 that there is not one single "right" value of λ lo , λ hi , or {µ e }, but for each of these parameters there is a distribution of possible values, some more likely than others. This is also true for the Pareto shape parameter α (the negative slope of the log-log volume distribution intended to model the data in Figure 1). In the Bayesian paradigm, such distributions of parameters reflect the epistemic uncertainty resulting from the fact that these models (and parameters) are fit using a limited quanity of imperfect data. Each thin colored curve in Figure 12 has its own sample values of λ lo , λ hi , and α, that specify the Poisson and Pareto distributions used to sample PDC event times and volumes. Each thin colored curve in the cumulative volume forecast is then calculated by summing up (sampled) PDC volumes at the (sampled) times they occur. This idea is perhaps easier to see in Figure 13. The solid blue lines in the panels represent samples from parameter probability distributions (frequencies in the top panel, central angles in the third panel). The orange dots are various PDC samples over time.
Together the panels in Figure 13 represent one realization of a PDC forecast. A different realization would have both different parameter samples (blue curves) and PDC samples (orange dots).
Aleatoric uncertainty describes the natural randomness of a system, so modeling the system probabilistically is the first step to quantifying aleatoric uncertainty. Our modeling approach lets us take this one step further to incorporate information about the current state of the system. That is, we can begin a probability forecast in either a state of high or low PDC frequency, chosen to match the current (at the beginning of the forecast period) level of activity available from monitoring data. Likewise, we can begin a forecast where the central angle µ 0 is set to reflect the current dome alignment. The later case leads to dramatic differences in short-term forecasts, as illustrated in Figure 15. In either case if we were to forecast over longer time horizons, say years, the effects of non-stationary modeling would wash out. That is, the system would naturally evolve away from its initial state (high or low frequency, initial dome alignment, etc.) making probability forecasts much less sensitive to assumptions about the initial state of the system. This illustrates the importance of both nonstationary probabilistic modeling and potential incorporation of monitoring data into short term forecasts. Figure 15A shows the forecast probability of large flows (V > V thresh for V thresh = 2, 5, or 10·10 6 m 3 ) down the azimuth sector covered by the Belham valley in the first t days following 1st . This illustrates how easily changing conditions can be reflected in model forecasts in this modeling approach, and how they affect forecasts. Figure 15B shows daily hazard, the probability of Belham Valley inundation on the t th day after 1st January 2006 conditionally on no prior inundation. For V thresh = 5 · 10 6 m 3 , at t = 180 d the probability is about 1.5 · 10 −3 / d if the initial dome faces the Belham valley (µ 0 = 135 • ), notably higher than the value 1.0 · 10 −3 / d if the initial dome faces the Tar River valley (µ 0 = 0 • ). By t = 600 d there is no difference in hazard-it is approximately 1.1 · 10 −3 / d for both µ 0 = 135 • and µ 0 = 0 • . This illustrates how the present methodology is capable of supporting short-term forecasts that reflect current conditions, unlike models embodying stationary assumptions.
Although Figures 12, 15 both present forecasts, they are different in nature and it is worth contrasting them. The information presented in Figure 12 about cumulative volumes (model samples, confidence intervals, and data) provides a validation of sorts-the data (black asterisks) look much like a sample from our model (colored curves). In Figure 15 we present probability forecasts and instantaneous hazard curves conditioned on a current state of the volcano (dome growth direction) that could be informed by monitoring data. This kind of short-term forecasting, that can be readily updated as new data become available or as the system evolves, is how we imagine such modeling efforts will be utilized by volcanologists, observatories, and civil protection authorities.

DISCUSSION AND CONCLUSIONS
Statistical modeling assumptions based on stationarity are reasonable for long-time pyroclastic density current forecasting (5-10 years), but they are neither reasonable nor consistent with data over short time scales (days-months). In this work, we introduce models for frequency, volume, and initial direction of pyroclastic density currents that can capture non-stationary and heavy-tailed behavior consistent with the data at SHV. The data clearly contain periods of high and low frequency of flows associated with phases of dome growth and extrusion pauses. This is consistent with geological observations that as a given lava lobe extrudes, PDCs are generated around that locus of activity but after a significant dome collapse, subsequent new growth may occur on a different part of the dome (Calder et al., 2002).
Furthermore, the directional data show that sequential small flows tend to inundate the same or adjacent valleys and that they tend to precede a moderately large to large flow with a similar orientation, after which the dominant direction may change.
Our statistical modeling assumptions are simple and, importantly, are not tied to geological observations or explanations, yet are consistent with the behavior of the data. Our model allows the system to change from a high-tolow or low-to-high frequency at random times. It allows the directions (or valleys) of successive flows to vary randomly yet have a shared preference. It also allows this shared preference to be re-set after each major pyroclastic density current. In addition, our volume model indicates that we are in a heavy-tailed regime where larger-than-recorded events must be accounted for in forecasting. This is a key outcome, which is critically important for volcanic hazard assessment. We employ the Bayesian paradigm for inference and forecasting, which provides a coherent mechanism to account for uncertainty about both model fit (epistemic uncertainty, arising from our imprecise knowledge of model parameters and the imperfect fit of the model to nature) and forecasts (which must also reflect aleatoric uncertainty, or the natural variability of geophysical systems). Data constrain this model well, and retrospective forecasts are consistent with the data.
We believe that the power of this modeling framework lies in its ability to support updating short-term forecasts during periods of rapidly changing activity. (Note that this or a similar framework could be applied to other active volcanic systems, other volcanic hazard types, or to other natural hazards.) That is, we can condition forecasts on the knowledge that we are currently in a period of high-activity (or low-activity), but still account for the fact that there is a significant chance of transition between those levels during the forecast interval. , as solid curves. Threshold volume V thresh is 2 · 10 6 m 3 for cyan curves, 5 · 10 6 m 3 for blue curves, and 10 · 10 6 m 3 for black curves.
Forecasts are based only on data prior to 1st January 2006. (B) forecast daily hazard of Belham inundation by a flow exceeding 5 · 10 6 m 3 for a longer period, for a dome facing the Belham (µ 0 = 135 • ), as upper blue curve, and for a dome facing due East (µ 0 = 0 • ), as lower red curve. Note hazards differ strikingly for first 200 days but are indistinguishable after 600 days. Daily (or instantaneous) hazard is the probability of inundation on day t, conditional on having no earlier inundation.
Likewise, we can make short-term forecasts based on other readily observable traits of the system (e.g., dome growing toward east vs. dome growing toward northwest). The fact that our model also accounts for epistemic uncertainty is critical for hazard assessment as quantitative analysis of data is necessary for fully defensible decision making (Bretton and Aspinall, 2017). Probabilistic volcanic hazard maps such as those developed in Connor et al. (2012); Volentik and Houghton (2015); Biass et al. (2016); Mead and Magill (2017) represent the state of the art in volcanic hazard assessment. Each of these works combines probabilistic scenario models and physical simulations of potentially hazardous volcanic processes (lava flows, lahars, and tephra fallouts, respectively). However, the computational expense of exercising a physical model at a large number of samples from a scenario model is often a limiting factor in constructing a probabilistic assessment. This typically has the result that a probabilistic hazard map needs to "settle" for a single scenario distribution.
Looking forward, modeling efforts like those presented here provide additional flexibility. In particular, they may be even more powerful when used in conjunction with complex geophysical models, and in simulations of pyroclastic density currents to make short-term probabilistic hazard maps. Probabilistic hazard maps for pyroclastic density currents display probabilities of inundation spatially in potentially vulnerable areas surrounding a volcano (Bayarri et al., 2009(Bayarri et al., , 2015Spiller et al., 2014). These works introduce an efficient approach to combine scenario and physical models that results in the rapid construction of probabilistic hazard maps that enable the exploration of various aleatoric scenarios. Short-term probabilistic hazard maps, constructed this way, would allow one to visualize the change in hazard threat as scenario models are updated.

AUTHOR CONTRIBUTIONS
RW: did most of the model development, derived likelihood and posterior distributions, oversaw the continuing development of the model as it evolved, did some of the computer coding. ES: significant contributions to model development, most of the computer coding and figure generation, significant contributions to model evolution. EC: motivated entire project; supplied the needed expertise in geology and volcanology, supplied the data, helped ensure the model's consistency with geophysical processes.