Real-Time Prediction of Impending Ground Shaking: Review of Wavefield-Based (Ground-Motion-Based) Method for Earthquake Early Warning

Earthquake early warning (EEW) systems aim to provide advance warning of impending ground shaking, and the technique used for real-time prediction of shaking is a crucial element of EEW systems. Many EEW systems are designed to predict the strength of seismic ground motions (peak ground acceleration, peak ground velocity, or seismic intensity) based on rapidly estimated source parameters (the source-based method), such as hypocentral location, origin time, magnitude, and extent of fault rupture. Recently, however, the wavefield-based (or ground-motion-based) method has been developed to predict future ground motions based directly on the current wavefield, i.e., ground motions monitored in real-time at neighboring sites, skipping the process of estimation of the source parameters. The wavefield-based method works well even for large earthquakes with long duration and huge rupture extents, highly energetic earthquakes that deviate from standard empirical relations, and multiple simultaneous earthquakes, for which the conventional source-based method sometimes performs inadequately. The wavefield-based method also enables prediction of the ongoing seismic waveform itself using the physics of wave propagation, thus providing information on the duration, in addition to the strength of strong ground motion for various frequency bands. In this paper, I review recent developments of the wavefield-based method, from simple applications using relatively sparse observation networks to sophisticated data assimilation techniques exploiting dense networks.


INTRODUCTION
Earthquake early warning (EEW) aims to prevent/mitigate earthquake disasters by providing people with enough time to take appropriate safety measures in advance of impending strong ground motion. EEW systems have been researched and developed in Japan, Mexico, Taiwan, the United States, European countries, China, Turkey, south Korea, and many other regions (e.g., Hoshiba et al., 2008;Cuellar et al., 2014;Chen et al., 2015;Cochran et al., 2018). The prediction of strong ground motions is an important element of EEW, and many methods have been proposed; the basic algorithms are classified into three prediction concepts: 1): Predicting seismic wave propagation; 2): Predicting the amplitude of S-waves from those of preceding P-wave; and 3): Predicting an entire rupture based on the initial rupture.
EEW algorithms are constructed based on one or combination of these concepts. For example, the operational EEW system of the Japan Methodological Agency (JMA) combines Concepts 1) and 2) but does not adopt Concept 3) (Hoshiba, 2014).
Concept 1) is based on the fact that modern communication speeds (∼10 5 km/s) are much faster than seismic wave velocity (∼10 0 -10 1 km/s); information about seismic waves detected near a hypocenter can be relayed to distant locations much faster than the seismic waves propagate. This concept assumes that observation sites are closer to the hypocenter than the prediction (target) site; the seismic observation network thus plays an important role.
Concept 2) is based on the fact that amplitudes of early P-wave arrivals are usually smaller than those of late arriving S-waves. The S-wave/P-wave amplitude ratio has been estimated theoretically or empirically, and many authors take a value of ∼5. Because communication is not necessarily required, this concept can be used for even a single isolated site. However, this concept cannot be used in cases when the earthquake rupture duration is longer than S-P time (the time between P and S wave arrives), because P waves from later large ruptures may be contaminated by S waves from earlier small ruptures. This situation usually occurs at sites near the hypocenter (that is, short S-P time) of large (that is, long rupture duration) earthquakes. Discrimination of P waves in S wave train is required.
Concept 3) is based on the hypothesis that initial parts of small and large earthquake ruptures differ. Some authors have claimed that the final moment magnitude (Mw) can be estimated from the first several seconds of P-wave portion even for large events of long rupture duration (typical durations are ∼10 and 30 s for Mw7 and 8 earthquakes, respectively), and that it may be possible to rapidly estimate the final Mw while the rupture is still ongoing (e.g., Olson and Allen, 2005;Zollo et al., 2006;Noda and Ellsworth, 2016;Melgar and Hayes, 2019). However, many authors have challenged this deterministic view of ruptures claiming that they are nondeterministic and statistically common rupture growth (e.g., Rydelek and Horiuchi, 2006;Meier et al., 2016;Okuda and Ide, 2018;Trugman et al., 2019). The debate over the deterministic nature of earthquake rupture has continued for more than 2 decades in research on EEW and the physics of earthquake rupture dynamics. Although some EEW algorithms are based on this concept, but many others are not.
In this review paper, I focus on Concept (1). When the characteristics of wave propagation are expressed simply and the source location and strength are estimated quickly, it is possible to easily predict waves strength at a given location. c(x) 2 € u(x, t) Assumption of plane wave propagation: Many authors have thus proposed methods to rapidly estimate origin time, hypocentral location, and earthquake magnitude for EEW purposes (the point-source algorithm). Moreover, the rapid estimation of source extent has also been investigated (the finite fault algorithm) (Yamada, 2014;Böse et al., 2018). In these "source-based methods," the strength of ground motion (e.g., peak ground acceleration, PGA; peak ground velocity, PGV; and/or seismic intensity) is usually evaluated based on Mw and distance (e.g., hypocentral distance, epicentral distance, or fault distance) using a ground motion prediction equation (GMPE). Nonetheless, the source-based methods have vulnerabilities, including underpredicting ground motion for extended ruptures in point-source algorithms, and incorrectly identifying multiple simultaneous earthquakes. Even when the hypocenter and magnitude are estimated precisely, the precision of predicted ground-motion strengths are controlled by the uncertainty in the GMPE. Recent theoretical works have suggested that the source-based methods are inherently limited in terms of the timeliness and accuracy of its prediction (Minson et al., 2018Hoshiba, 2020).
As an alternative to source-based methods, another algorithm that has been intensively investigated over recent decade does not necessarily require source parameters to predict the strength of ground motion. Instead, future ground motions are predicted directly from observed ground motion, skipping the process of source estimation. The "wavefield-based method" or "ground-motion-based method" (hereinafter "wavefield-based method" for simplicity) first estimates the current wavefield, then predicts future wavefield based on the physics of wave propagation. Because source parameters are not estimated, this method avoids the vulnerabilities of the source-based method (rupture extent, and simultaneous multiple earthquakes). This paper reviews recent developments, the current situation, and future prospects of the wavefield-based method by comparison with the source-based method.

THEORETICAL BACKGROUND
Prediction of seismic wave propagation is a key element of Concept (1). Because seismic wave is controlled by the physics of wave propagation, future wavefield can be predicted by wave propagation theory. In this section, I explain the theoretical background based on three independent approaches (summarized in Table 1): the finite difference method, the boundary integral equation method, and radiative transfer theory (RTT). In the following subsections, scalar wave expression is used for simplicity although seismic waves are vector waves.

Finite Difference Method
Wave propagation is expressed by the wave equation: where u (x, t) is the wave amplitude at location x and at time t, c is phase velocity, ∇ 2 is the Laplacian, and ü is the second order differential of u with respect to t (i.e., z 2 u/zt 2 ). This equation implies that the timeevolution of a wave's amplitude, ü, is determined by its spatial distribution (∇ 2 u). Therefore, future wavefields can be predicted from the spatial distribution of a known wavefield when the velocity structure, c(x), is known ( Figure 1). Eq. 1 is approximated as The wavefield one time step Δt in the future, u (x, t+Δt), can be estimated from the current wavefield, u (x, t), and that one time step prior, u (x, t-Δt). Then, u (x, t+2Δt) is computed from u (x, t+Δt) and u (x, t) as, Thus, the wavefield at any future time can be obtained by repeating this procedure as needed. Note that this assumes that no new waves are radiated in the future, i.e., that no new earthquakes occur over the prediction period.
In this approach, precise monitoring of the spatial distribution of u (x, t) and u (x, t-Δt) is important for precise predictions. Once the detailed distribution of u (x, t) and u (x, t-Δt) are obtained, source parameters (radiation location and the strength of radiated waves, i.e., hypocenter location and magnitude) are not needed to predict the future wavefield. For precise estimation of the current wavefield, data assimilation is a powerful tool, which will be explained in a later section. Furumura et al. (2019) and Oba et al. (2020) applied the finite difference approach to predict long-period ground motions (>3-10 s). Because they used a three-dimensional (3-D) velocity structure, they were able to predict not only the waveforms of direct P-and S-waves, but also those of reflected, refracted and surface (Rayleigh and Love) waves. At present, however, the finite difference method is not useful for computing short-period ground motions (<1 s) because the very fine mesh size required exceeds modern computing capabilities and the very precise velocity structure required to simulate wave propagation is not easily obtained by current survey techniques.

Boundary Integral Equation Method
Whereas the finite differential method represents wave propagation using differential form of the wave equation (Eq. (1)), boundary integral equation takes its integral form; in 3-D space, where S is the surface enclosing the target site, x p , x 1 is a location on S, z/zn is the derivative with respect to the normal vector to S at x 1 , and G is the Green's function. S can be taken arbitrarily ( Figure 2A). Here, the integrations are performed with respect to x 1 on S (i.e., the 2D integral), and with respect to the convolution integral τ. Eq. 4 is valid when there are no radiations (i.e., no sources) inside S. Eq. 4 is known as the Kirchhoff-Fresnel integral in wave propagation theory. Huygens' principle is its preliminary qualitative description, in which points on wavefront are virtual sources of secondary waves aligned along the wavefront.
As an example, let us take S 1 as a sphere of radius of l and centered on x p in a homogenous velocity structure, c 0 , such FIGURE 1 | The process for predicting future wavefield using the finite difference method. From the current wavefield, u (x, t-Δt) and u (x, t), the future wavefield, u (x, t+Δt), is predicted, and then, from u (x, t) and u (x, t+Δt), u (x, t+2Δt) is forecast. Repeating this process makes it possible to obtain the wavefields at any time in the future.
FIGURE 2 | Schematic illustration of the boundary integral equation method (After Hoshiba, 2013a). The surface S can be taken arbitrarily around the target site x p , and x 1 is a location on S, and n is the normal vector to S at x 1 . Here θ is the angle of incoming ray paths from the surface normal. (B) Surface S 1 is taken around x p as a sphere of radius l. When an incident plane wave is assumed, travel distance of path A is larger than B by l-l cos θ. (C) An example of deploying pairs of monitoring sites to predict ground motion at target site x p . The seismometer pairs are located at radius l on the ground surface and underground. Each seismometer in each pair is spaced by Δl. (D) An example of plane wave propagation. The surface S is divided into S A and S B . S A is an infinite plane normal to the z axis, and S B is a half space of infinite radius. Incident plane wave normal to S A is assumed.
Frontiers in Earth Science | www.frontiersin.org September 2021 | Volume 9 | Article 722784 that |x p -x 1 | l ( Figure 2B). Because G (x p , t; x 1 , 0) 0 when t < l/c 0 , and taking travel-time into consideration, G (x p , t + l/ c 0 -τ ; x 1 , 0) 0 when t + l/c 0 -τ < l/c 0 , i.e., when τ> t, so that This means that the waveform at the target site at time l/c 0 in the future is predicted based on u (x 1 , τ) between the past (time -∞) and present (time t).
Green's function in a homogeneous 3-D structure is expressed as, When the wavelength is much smaller than the spatial fluctuation of the absolute amplitudes of u (x p , t) and G (x p , t-τ ; x 1 , 0), the high frequency approximation is valid, and Eq. 5 is approximated as where θ ( θ(x 1 , t)) is the angle of incoming ray paths from the surface normal (Shearer 1999;Hoshiba 2013a). The waveform at time t + l/c 0 is the weighted sum of the time differential of waveforms at x 1 and t. This means that when _ u(x 1 , t) and θ(x 1 , t) are obtained on surface S 1 , we can predict the wave motion at x p with a lead time of l/c 0 . Here, source information (hypocentral location and magnitude) is not required. For near future predictions small values of l (i.e., small S 1 ) are used, and for predictions of more distant future larger values of l (larger S 1 ) are accordingly required. When the angle of waves approaching x p is 0°, cos θ 1 and cos θ +1 2, indicating their large contribution to constitute waveforms at x p . When the angle of waves travelling away from x p is 180°, cos θ -1 and cos θ +1 0, indicating no contribution. Thus, waves approaching x p are important for predicting waveforms at x p , and those traveling away from x p are negligible. At θ 90°(as for large-angle refractions), cos θ +1 1; such waves are weighted half as strongly as those of θ 0°. However, taking wavelength into consideration, the contribution is more concentrated around θ 0. Based on Fresnel theory, the area that mainly affects the wave motion at x p is approximately given by the relation of "(half wavelength) ≥(Path A)-(Path B)" in Figure 2B, where an incident plane wave is assumed. This area is large for lowfrequency waves, and small for high-frequency waves. For example, in a homogeneous velocity structure (c 0 3 km/s) and with l 30 km, θ ≤0.32 rad (18°) for a 1 Hz wave, and θ ≤0.14 rad (8°) for a 5 Hz wave. Because contributions from outside this area are small, large angle refractions do not affect high-frequency waves, and the ray theory approach is valid. Nagashima et al. (2008) and Kuyuk and Motosaka (2009) proposed a front-detection method. They tried to predict ground motion using waveforms of halfway applying empirical transfer function, which corresponds to empirically estimated Green's function. Iervolino et al. (2007) and Iervolino (2014) proposed to deploy some observation sites on a circle whose center location is the target site, which is similar to Eqs 5, 7; this idea is essentially the basis of the boundary integral equation method. Hoshiba (2013a) explicitly introduced the boundary integral equation method for predicting future ground motion. Although it is relatively easier to take high-frequency waves in the boundary integral equation method than in the finite difference method, information on the time-dependent propagation direction (θ) is required. An array technique is useful to estimate θ. When pairs of seismometers are deployed as shown in Figure 2C and Eq. 5 is approximated as where n 1 is the inward-facing normal vector to S 1 at x 1 , and Δl is the distance between the seismometers in each pair. Each pair simply takes the place of an array. At this point I wish to return to Eq. 4 to review a relation between the boundary integral equation method and the source-based method. A point source at x 0 instantaneously radiating at t t 0 of amplitude A 0 is expressed by the source A 0 δ(xx 0 ) δ(tt 0 ). Waveforms at x 1 on the surface S are given as A 0 G (x 1 , t; x 0 , t 0 ). When the source has volume V 0 and an arbitrary duration, and the source function is expressed as A (x 0 , t 0 ), the waveforms at x 1 are described as: In the source-based method, A (x 0 , t 0 ) is first estimated from waveforms u (x 1 , t) observed at multiple x 1 locations, and then the waveform at x p is predicted as: In the boundary integral equation method, the surface S is taken such that all x 1 are located on S, but the source volume V 0 is outside S. Note that surface S can be of arbitrary shape. Using Eqs. 4, 10, Frontiers in Earth Science | www.frontiersin.org September 2021 | Volume 9 | Article 722784 is used, because G (x 1 , t; x 0 , t 0 ) satisfies Eq. 4, Eq. 12 means that the boundary integral equation method bypasses the process of estimating source function, A (x 0 , t 0 ), to predict the waveform at x p by using observations at many x 1 locations. In the source-based method, the hypocentral distance is required to evaluate geometrical spreading attenuation (for example, the inverse of hypocentral distance for body waves). However, wave propagation physics suggests that geometrical spreading attenuation is determined by the local curvature of the wavefront: large curvatures give strong attenuation, small curvatures weak attenuation, and no curvature (e.g., plane wave) no attenuation. When the wavefield is estimated precisely and the curvature of wavefront is obtained, it is not necessary to estimate hypocentral distance to evaluate geometrical spreading attenuation.

Radiative Transfer Theory (RTT)
When the ray theory approach is valid, radiative transfer theory (RTT) is a powerful tool for representing high frequency wave propagation; scattering, attenuation and reflection are easily treated, although it is not easy to include refraction. In RTT, the propagation of energy is calculated instead of propagation of the wave itself. Many authors obtain the time history of energy, F (x, t), from the running average of the squared amplitude of the band-pass filtered waveform, u (x, t), at location x and time t (e.g., Hoshiba, 1995;Sato et al., 2012;Ogiso et al., 2018). RTT has been widely used to represent the envelope shape of high-frequency (≳1 Hz) seismic waveforms (Sato et al., 2012). Hoshiba and Aoki (2015) applied RTT to EEW, but they considered in 2-D space. Here, I explain RTT in 3-D space. Following Hoshiba and Aoki (2015), when isotropic scattering is assumed, RTT is expressed as: where f is the energy density at location x and time t traveling in direction q (here q is the unit vector), c(x) is the velocity of seismic wave at x, and g s (x) and h s (x) are the strength of scattering and intrinsic absorption at x, respectively (Sato et al., 2012). Here it is assumed that scattering does not cause wave conversion (i.e., P→S or S→P). The time history of energy, F (x, t), is the sum of f (x, t: q) in all directions: Here, for simplicity, I assume that both the velocity and attenuation structures are homogeneous; thus, velocity, scattering strength, and intrinsic absorption are independent of x: c(x) c 0 , g s (x) g 0 and h s (x) h 0 . Then Eq. 14 is expressed as, The left-hand side of Eq. 16 represents advection, the first term on the right-hand side means attenuation, and the second term represents scattering from direction q′ to q. Because the first term on the left-hand side is the differential of f with respect to time, Eq. 16 means that it is possible to predict future f provided that the current spatial and directional distributions of f are known. Eq. 16 is approximated as: Repeating this process makes it possible to predict f at any future time. Note that information about the earthquake hypocenter and magnitude is not required for this prediction.
To efficiently calculate RTT simulation, a particle method based on the Monte-Carlo technique has been widely used in recent decades (e.g., Gusev and Abubakirov, 1987;Hoshiba, 1991Hoshiba, , 1995Hoshiba, , 1997Yoshimoto, 2000). In this method, the propagation of wave energy is simulated by the movement of a very large number of particles. Instead of the Eulerian representation expressed in Eq. 16, I use the Lagrangian representation, First, let us consider the case of h 0 0, which means no attenuation due to absorption. The first term on the right-hand side means that the energy propagating in direction q is attenuated by scattering in proportion to f. The energy density, f, decreases exponentially as travel distance increases. The second term is the contribution of energy changing direction from q′ to q. As shown in Figure 3, this physical process is simulated by many particles through a probabilistic process (Hoshiba and Aoki, 2015). The probability that energy travels without scattering during the time interval Δt is exp (-g 0 c 0 Δt), which is approximated as 1-g 0 c 0 Δt when g 0 c 0 Δt <<1. The probability that scattering occurs during this time interval is given by 1-exp (-g 0 c 0 Δt) ≈ g 0 c 0 Δt. When scattering occurs, the probability density of scattering from q′ toward q is 1/(4π) because scattering is assumed to be isotropic. These processes are simulated by using a very large number of particles.
Let the number of particles be M, and let X n,m be the location of the mth particle at time, n. When traveling in direction q m , the particle is expected to move by v 0 Δt q m over the time interval Δt, if scattering does not occur. In contrast, if scattering occurs, the particle changes direction. Let R 1 , R 2 and R 3 be independent uniform random variables between 0 and 1. When R 1 ≥ g 0 v 0 Δt (i.e., no scattering), the particle moves to X n,m X n-1,m + c 0 Δt q m in the next time step. When R 1 < g 0 c 0 Δt (i.e., with scattering), the particle's new direction is determined as q N (2π R 2 , cos −1 (1-2R 3 )) and the particle is located at X n,m X n-1,m + c 0 Δt q N after Δt. At the next time step, q m = q N is used as the propagation direction.
The third term on the right-hand side of Eq. 18 represents attenuation due to absorption. When the mth particle has energy q n-1,m at time n-1, the energy is attenuated as: with increasing n, where h 0 v 0 Δt <<1 is assumed. The energy of the particle is attenuated as the elapsed time increases regardless of scattering. In this paper, because h 0 is assumed to be homogeneous (i.e., independent of x), amount of attenuation of each particle is assumed to be the same for all m. Hoshiba and Aoki (2015) and Ogiso et al. (2018) applied RTT to real-time predictions of the strength of seismic ground motion for EEW. They called the method "Numerical Shake Prediction," because of its analogy to numerical weather prediction in meteorology, in which physical processes are simulated from a precise estimate of the present conditions. Because they focused on predicting seismic intensity, which is mainly determined by relatively high frequency component of waveforms, ray theory is a good approximation and RTT is applicable. They succeeded in predicting the time trace of future seismic intensities. Whereas the finite difference method is a good approach for predicting low-frequency waveforms but not high frequency waveforms, RTT is valid for high-frequency but does not necessarily hold for low-frequency because RTT is based on ray theory.

Propagation of Local Undamped Motion (PLUM) Method
For real-world application of the boundary integral equation method, a precise estimation of the wavefield distribution is required. Because quite dense observation networks are not yet available at present except few cases, it is not easy to directly apply the boundary integral equation method. Some approximations have been introduced, and the propagation of local undamped motion (PLUM) method is one of them (Kodera et al., 2018).
For plane wave propagation, let us assume the surface enclosing x p to be an infinite plane, S A , and half sphere of infinite radius, S B , as shown in Figure 2D. Because S B is located infinitely far from x p , contribution from S B is negligible. Plane waves propagating in the +z direction are expressed as u (x, t) u (x, y, z, t) u p (t-z/c 0 ). Then, as z/zn z/zz, zu/zn zu/zz. The contribution from S A is Here is used, where G is given by Eq. 6 and H is the step function. This is the Green's function in 1-D space. Eq. 20 means that plane wave propagates without attenuation: where |·| maxt indicates the maximum amplitude, such as PGA for accelerograms or PGV for velocity waveforms. Because plane waves form when a hypocenter is located infinitely far away, some attenuation is expected for hypocenters at finite distances, for waves arriving at x p later than x 1 , where PGA (or PGV) is expected to occur at x p later than x 1 . Plane waves correspond to FIGURE 3 | Schematic illustration of radiative transfer theory, simulated using a very large number of particles. When the propagation direction, q m , is known, the future locations of the mth particle (X n+1,m , X n+2,m , . . . ) are easily predicted from its current location X n,m , that is X n+1,m X n,m + c 0 Δt q m and X n+2,m X n,m + 2c 0 Δt q m , if scattering does not occur. If scattering occurs, the propagation direction is newly given by the probability density as shown for X n+1,3 .
Frontiers in Earth Science | www.frontiersin.org September 2021 | Volume 9 | Article 722784 the prediction for the most severe scenario. When several stations are available for monitoring the wavefield around the target site, the simple relation, may be valid, where i is the index of the monitoring site. Strictly speaking, this relation does not hold well for multiple simultaneous sources (i.e., multiple waves propagating towards x p ); nonetheless, it is an indicator of the possible strength of ground motions. Eq. 24 is the basis of the PLUM method, in which Max i |u (x i , t < t c )| max t is the predicted strength of ground motions accounting for the site amplification (see later section for site amplification correction) and t c is the current time. Because it assumes plane wave propagation and thus the most severe scenario, PLUM tends to overpredict ground motions. However, because the prediction is based on actual observations of ground motions, PLUM acts to reduce underpredictions and missed alarms. Because PLUM assumes plane waves propagation, the distance between the monitoring and target sites must be much smaller than the hypocentral distance. The use of sites far from x p as monitoring sites gives a long lead time, but is prone to overprediction. For this reason, long lead time predictions are not given by the PLUM method. In contrast, when monitoring sites around x p are sparse, few sites are available for use in the PLUM method. Kodera et al. (2016)  The PLUM method was implemented into the JMA's operational EEW system in 2018, in addition to the pointsource algorithms. Since then, PLUM has prevented underprediction caused by uncertainties in the GMPE, and sometimes issued earlier warnings than the point-source algorithm. Kodera et al. (2020) summarized its performance. For example, during the 2018 Eastern Iburi, Hokkaido, Japan, earthquake (Mw6.6, focal depth: 37 km), PLUM issued a public warning 13.35 s after the origin time, 3.1 s earlier than the pointsource algorithms. For this earthquake, the GMPE significantly underpredicted PGA and PGV at near-source sites (Dhakal et al., 2019), meaning that the point-source algorithm underpredicted ground motions even when source parameters (hypocentral location and magnitude) were precisely estimated. In contrast, PLUM predicted them appropriately, reflecting the strong ground motion observed at neighboring sites.

DATA ASSIMILATION
The first step in obtaining a precise prediction using the wavefield-based method is to estimate the current wavefield. Data assimilation is a powerful technique for estimating current conditions that is widely used in numerical weather prediction, oceanography and rocket control (Kalnay, 2003;Awaji et al., 2009). Figure 4 illustrates the data assimilation procedure: the spatial distribution of the wavefield is estimated from not only actual observations but also the simulation of wave propagation based on wave propagation physics, leading to a precise estimation of the current wavefield. Therefore, data assimilation incorporates actual observations into the simulation of wave propagation. In this section, I will explain data assimilation technique, following Hoshiba and Aoki (2015).
Let u n indicate the wavefield in the model space at time t n nΔt, in which u n [u (x, nΔt), u (x (n-1)Δt) ] in the finite difference method, or u n [f (x, nΔt: q) ] in RTT. When the 3-D space is discretized as 0 to L x Δx, 0 to L y Δy and 0 to L z Δz, the number of elements of u n is I 2·L x L y ·L z in the finite difference method, and when the azimuth is discretized as 0 to L q Δq, the number is I L x L y ·L z L q in RTT. When u n-1 is given, we can predict u n by simulating the propagation of the wave; this prediction one time-step-ahead is expressed as u n P (u n-1 ), where P is the operation of Eq. 2 or Eq. 17. To discriminate between u n before and after being combined with the actual observations, the wavefields before and after are denoted u b n and u a n . P is applied to the wavefield after the combination at one time step before, i.e., t n-1 ; thus, the one step-ahead prediction is expressed as: u b n P(u a n−1 ).
Let v n (v n1 , v n2 , v n3 , . . . v nj , . . . v nJ ) T be the actual observation in observational space at time t n , in which v nj means the observed data at the jth element. Let the total number of observation elements be J. Usually I (the number of grids in model space) is much larger than J (number of observation elements). Data assimilation is expresses as Here H is the J×I matrix called the observation matrix and it means the interpolation of grid points onto the location of the observation point, and then (v n -Hu b n ) is the difference between the one-step-ahead prediction and the actual observation at time t n . W is the I×J matrix called weight matrix, and W (v n -Hu b n ) indicates the correction of the one-step-ahead prediction in the simulation of wave propagation. From u n a , u n+1 b is obtained from Eq. 25. Iterative application of Eqs 25, 26 produces time-evolutional estimation of wavefield. The process of the left side in Figure 4 indicates repeated application of one-step-ahead prediction, that is, the simulation of wave propagation. In the data assimilation technique, actual observations are included in the simulation to correct the wavefield at each time step; that is, actual data are assimilated in the simulation process.
The parameter setting of W is important in data assimilation, and several techniques have been proposed. The simplest is the optimal interpolation method, in which W is constant irrespective of time n, though in the Kalman filter method it changes with increasing n. In the optimal interpolation method, Matrix W is expressed in relation to the errors in the one-stepahead prediction (background error, σ b ), and in the observations (observational error, σ o ). When the correlation distance of the background error and the ratio σ o /σ b are assumed, matrix W is obtained (for detail, see Awaji et al., 2009;Kalnay, 2003;Hoshiba and Aoki, 2015).
When the correlation distance is large, the W (v n -Hu b n ) correction is applied to a wide area around each observation point, and when the distance is small, the correction is restricted to a small area around each observation point. The density of observation network, therefore, may influence the parameter setting of the correlation distance: large distance for sparse network and small distance for dense network. For sparse network, observation of each site needs to represent wavefield of wide area around the site, but small area for dense network. In general, dense network can reconstruct the complicated wavefield in data assimilation better than sparse network. When too large correlation distance is used, seismic wave propagates artificially faster than actual velocity in the process of W (v n -Hu b n ) correction, which reduces the accuracy of arrival time of strong shaking. The correlation distance at each observation point can be varied according to the network distribution: for example, small correlation distances where station interval is small around the site, and large correlation distances where station interval is large.
When the observational errors are assumed to be much larger than the background errors, σ o /σ b >>1, W≈0 and then u n a ≈ u n b . Iterative application of Eqs 25, 26, therefore, results in just the simulation of wave propagation, because the observation have no effect. In contrast, σ o /σ b ≈ 0 corresponds to the case where the contours of the actual observations are drawn independently at each time step, because the one-step-ahead prediction has little effect in Eq. 26. The ratio, σ o /σ b , at each observation point also can be varied according to quality condition. For example, large FIGURE 4 | The flow of the data assimilation process. One-step-ahead prediction, u b n P (u a n-1 ), is combined with actual observations, v n , to correct the estimation of the current wavefield (after Hoshiba and Aoki, 2015). By repeating this process, the current wavefield, u a n , is estimated by using not only the current observation (v n ), but also all past observations (v n-1 , v n-2 , v n-3 , . . . ).
Frontiers in Earth Science | www.frontiersin.org September 2021 | Volume 9 | Article 722784 σ o /σ b for noisy stations in urban areas and small σ o /σ b for quiet stations in mountain areas. Although the wavefield is observable at the ground surface when stations are densely deployed at the surface (i.e., 2-D space), the underground wavefield at depths of more than a few kilometers cannot be observed because many borehole observations deeper than a few kilometers are not realistic. Because actual seismic wavefields are expressed in 3-D space, assumptions are required to apply data assimilation to estimate the 3-D wavefield. Handling the difference between the 2-and 3-D spaces is an important subject for future advancement of the data assimilation technique.

PREDICTION
Once the present wavefield, u n a , has been estimated precisely by the data assimilation technique, the future wavefield, u P , is predicted from the current wavefield, u a n , u P n+1 P u a n (27) and u P n+2 is forecast from u P n+1 , that is u P n+2 P (u P n+1 ) P 2 (u a n ). Repeating this process Future wavefield at any time can be predicted from the current wavefield.

REAL-TIME CORRECTION OF THE SITE AMPLIFICATION FACTOR
Site amplification is an important factor to determine seismic-wave amplitude in addition to source and propagation factors, and it depends on frequency. In application of the wavefield-based method, the frequencydependent site amplification should be removed from the observed amplitude when simulating wave propagation described in the previous section, and then include it to evaluate waveforms at target sites, especially those characterized by large amplification factors. Many previous studies have investigated frequency-dependent site amplification in the frequency domain by assuming a model, where O kl (f), S k (f), T kl (f), and A l (f) represent the observed seismic wave spectrum from event k at site l, the source spectrum characterizing event k, the propagation factor between event k and site l, and the site amplification factor at site l, respectively, and f is the frequency of the seismic waves. When borehole is available at the site, fine vertical structures of velocity and attenuation can be obtained, and A l (f) is estimated theoretically. However, borehole observation at all sites is not realistic at present. Instead of the theoretical approach, many empirical approaches have been proposed to obtain the relative site amplification factors, {A 2 (f)/A 1 (f)}: spectral ratio (e.g., Ikeura and Kato, 2011); spectrum inversion of source, propagation, and site factors (e.g., Iwata and Irikura, 1988;Kato et al., 1992); coda normalization method (e.g., Phillips and Aki, 1986), and others. These approaches usually neglect phase characteristics, focusing on amplitude characteristics, |A 2 (f)/A 1 (f)|. For example, in the spectral ratio method, when sites 1 and 2 are adjacent, compared to the hypocentral distance, |T k2 (f)| ≈|T k1 (f)| is assumed, then The relative site amplification factor, |A 2 ( f )/A 1 ( f )|, can be estimated from spectral ratio of observed waveforms, |O k2 ( f )/ O k1 ( f )|. When the site amplification factor at site 1 relative to site 1' (different location from site 1) is known, that is | A 1 ( f)/A 1' ( f )|, it is easy to estimate the amplification factor at site 2 relative to site 1′ from By repeating the process, it is possible to estimate relative site amplification factor even when the two sites are not adjacent. Here, |A 2 ( f )/A 1 ( f )| (and | A 2 ( f )/A 1' ( f )|) is used to correct the difference of siteamplification condition.
In EEW, it is preferable to correct the frequencydependent site amplification factor in real time. As | A 2 ( f )/A 1 ( f )| is expressed in the frequency domain, it may be possible to apply a fast Fourie transform (FFT) with a short time interval (i.e., every 1 s or less) to ongoing waveforms at site 1, multiply them by |A 2 ( f )/A 1 ( f )|, and then perform an inverse FFT to predict ongoing waveforms at site 2 ( Figure 5). Instead of methods in the frequency domain, a method in the time domain was proposed (Hoshiba, 2013b;Pilz and Parolai, 2016), in which a causal recursive filter, which allows correction of site amplification factors in real time, is used. The time domain filter alleviates computational workload of the system, comparing with that of the frequency domain analysis, and makes it easy to estimate continuously sitecorrected waveforms regardless of whether earthquakes are occurring or not. Trigger is not required in the continuous operation, which minimizes the fluctuation of the workload. Hoshiba (2013b) proposed to model the frequency dependent site amplification using the form: where N and M are the numbers of the firstand secondorder filters, respectively, and s i (2πf). Here ω 1n , ω 2n , ω 1m , and ω 2m are the angular frequencies and h 1m and h 2m are the damping factors that characterize the frequency dependence, respectively. Note that s 2 +2hω m s +ω m 2 represents a damping oscillation. Parameter ω 1n , ω 2n , ω 1m , ω 2m , h 1m and h 2m are positive numbers, and are estimated to satisfy |A 2 (f)/ A 1 (f)|≈|F(s)|.
The filter is modeled as Eq. 31, where F(s) is represented by combination of first and second orders of s, By a mapping procedure called the bilinear transform, and pre-warping for ω 1n , ω 2n , ω 1m , and ω 2m of the digital filtering technique, the transfer function, F(z), is obtained (Scherbaum, 1996) in a form of infinite impulse response (IIR) filter, where ΔT is the sampling interval of the digital waveforms and z exp (sΔT): The recursive filters are given by y k g 0 (a 0 x k + a 1 x k-1 )b 1 y k-1 for F 1n (z), and by y k g 0 (a 0 x k + a 1 x k-1 +a 2 x k-2 ) -(b 1 y k-1 + b 2 y k-2 ) for F 2m (z), where x k and y k are the input and output of the time series of the digitized waveform, respectively.
In many researches of site amplification factors to estimate |A 2 (f)/ A 1 (f)|, hard rock site is chosen for site 1 as a reference site. However, site 1 is not necessarily hard rock site here, because the purpose of the site factor correction is to make virtually the site conditions common to isolate seismic wave propagation. For the boundary equation method, for example, x 1 and x p corresponds to site 1 and site 2, respectively. Many locations are assumed for site 1 (x 1 ), but site 2 (x p ) is single location. By applying F(z) to waveforms obtained at site 1, the waveforms are virtually converted to those having the site amplification of site 2. The integrals in Eqs 5, 7, 9 are carried out for waveforms of the common site amplification. In finite difference method and RTT, future wavefields are predicted by using the sitefactor corrected waveforms. For prediction of amplitude at each location, it is necessary to convert inversely the waveforms to those having the site amplification factor at the individual location. For the inverse process, that is application of |A 1 (f)/A 2 (f)|, F −1 (s) can be used. Because denominator and numerator are the same order of s (Eq. 31), and h 1m and h 2m are positive numbers, both forward, F(s), and inverse, F −1 (s), filters are stable, where all poles and all zeros are distributed in the left half space in the s-plane (Scherbaum, 1996).
For the PLUM method (Eq. 24), the IIR filter F pi (z) can be applied, which represents site amplification at the target site p relative to site i. Applying F pi (z) to u (x i , t) to obtain the corrected waveforms, u c (x i , t), and then can be used instead of Eq. 24. For the boundary integral equation method, can be applied instead of Eq. 4.

EXAMPLE OF APPLICATION
Here, I provide an example of a real-time prediction by the wavefield-based approach. Figure 6 shows the case of a Mw6.4 earthquake, the largest aftershock of the Mw6.7 Chuetsu, Niigata, Japan, earthquake (October 23, 2004) using radiative transfer theory. Small dots in the panels indicate the locations of observation points (many stations for site 1 in Eq. 30). Site amplification factors are corrected relative to that of the target site, Ohtemachi, Tokyo; i.e., Tokyo is the site 2. The distribution of observation of strong ground motions was complicated, FIGURE 6 | An example application of the wavefield based method using a Mw6.4 earthquake (the largest aftershock of the Mw6.7 Chuetsu, Niigata, Japan earthquake; October 23, 2004). Data from K-NET and KiK-net of the National Research Institute for Earth Science and Disaster Prevention (NIED) were used. The locations of observation points are shown by small dots. The actual eventual distribution of ground motion was complicated (A) relatively strong ground motion was observed at Kanto, and weak one at Tokai. This complicated distribution is not reproduced by the source-based method, as shown in (B,C) using M7.0 and M6.0, respectively. In the wavefield-based method, the predicted eventual distribution from the observation at t 11s (D) is similar to the prediction using a M7.0 event (B). At t 50s, strong ground motions were observed to be propagating toward Tokyo (E), and the updated prediction indicates the arrival of the strong ground motions at Tokyo at t 60s (F) The predicted eventual distribution from t 50s (G) reproduces well the actual distribution (A).
Frontiers in Earth Science | www.frontiersin.org September 2021 | Volume 9 | Article 722784 differing from the concentric circles predicted by the sourcebased method. For example, relatively strong shaking was observed in Kanto, but weak shaking was observed in Tokai. Especially strong shaking was observed towards Tokyo. At t 11s, strong ground motions were observed around the focal region, and the prediction from the observation indicates that the strong ground motion propagates like an expanding circle, with a final distribution similar to that predicted for a M7.0 event. At t 50s, strong ground motions were observed toward Tokyo, and the prediction shows that the strong ground motion arrives at Tokyo 10 s later. The final eventual distribution is predicted to be the almost the same as the actual observation. Prediction of distant future is normally less precise than that of near future. As shown in Figure 6, predicted wavefield at 70 s from 11 s does not forecast well the actual wavefield, but that from 50 s predicts it better. For improvement of distantfuture prediction, introduction of velocity and attenuation structures is a key. Ogiso et al. (2018) introduced heterogeneous structures in the calculation of the radiative transfer theory. They estimated g s (x) and h s (x) at first, and then used Eq. 14 instead of Eq. 16 in the prediction. Estimation of detail velocity and attenuation structures contributes to the precise prediction, especially that of distant future. Hoshiba and Aoki (2015) applied the technique to data of the 2011 Tohoku earthquake (Mw9.0) and the 2004 Chuetsu, Niigata, earthquake (mainshock, Mw6.7), and Ogiso et al. (2018) did it to data of the 2016 Kumamoto earthquake (Mw7.1) incorporating the heterogeneous attenuation structure. Wang et al. (2017a, b) also used the technique for real-time prediction of ground shaking. Furumura et al. (2019) and Oba et al. (2020) used the finite difference approach to predict long period ground motions (>3-10 s) for the 2007 Off Niigata earthquake (Mw6.6) and the 2004 Off Kii Peninsula earthquake (Mw7.4), respectively.

SUMMARY
One of important key elements of EEW is the real-time prediction of ground motion, as well as the rapid transmission of monitored data and the wide dissemination of warnings. Because seismic motion is a wave propagation phenomenon, the physics of wave propagation, which have been well studied in many research fields, is the basis of real-time prediction of ground motion. In history of EEW research, in the source-based method many authors have focused on rapidly estimating source parameters (e.g., source location, magnitude, source extent), from which peak ground motions (PGA, PGV or seismic intensity) are estimated using a GMPE. Although GMPEs usually indicates empirical relations between the peak ground motion and the source parameters, they do not express the physics of wave propagation in detail; notably, the causality of the occurrence of peak ground motions is not included in it. In the near fault region, PGA and PGV occur before rupture completion, that is, before estimation of eventual magnitude, which essentially leads to late EEWs (Hoshiba, 2020). GMPEs are mainly aimed at explaining the peak ground motion of anticipated future earthquakes (or of past events at locations where seismometers were not deployed). Thus, GMPEs are not necessarily constructed for the purpose of real-time prediction, such as EEW. Instead of borrowing GMPEs, new technique based on the physics of wave propagation for the purpose of the real-time prediction of ground motion is a key to more precise and timely warning. The wavefield-based method described in this review follows this strategy.
To conclude, I compare the wavefield-based (groundmotion-based) method with the conventional method based on source parameters (Figure 7) (Hoshiba and Aoki, 2015). In the source-based method, using monitored data from past to present we look back to the "oldest past situation". Not only is the future situation predicted, the past situation is post-dicted from the "oldest past situation". The time needed for estimation of the source parameters can be regarded as blind time with respect to EEW. In contrast, all data up to the present are used in the wavefield-based method to estimate the "present situation" of wave propagation. Post-diction is not performed in the process.
In the source-based method, information about the very complicated space-time distribution of ground motion is first compressed in order to represent it with a limited number of parameters (such as latitude, longitude, focal depth, origin time, and M), and then it is expanded again to predict ground motion. It is difficult to completely reconstruct the spatial distribution of ground motion even for the present situation: inevitably there are discrepancies between the predicted present situation and the actual present observation. As a result, even if estimation of the source parameters is precise, the prediction of ground motion is not necessarily precise. In contrast, in the wavefield-based method the actual present observation is reflected as much as possible in the estimate of the present situation. Discrepancies are minimized between the estimated present situation and the actual present observation before proceeding to the prediction.

AUTHOR CONTRIBUTIONS
MH designed the study, performed analyses, and drafted the article.

FUNDING
This research was partially supported by the JSPS KAKENHI Grant Number 17H0264.