A Novel Robust Strategy for Discontinuous Galerkin Methods in Computational Fluid Mechanics: Why? When? What? Where?

In this paper we will review a recent emerging paradigm shift in the construction and analysis of high order Discontinuous Galerkin methods applied to approximate solutions of hyperbolic or mixed hyperbolic-parabolic partial differential equations (PDEs) in computational physics. There is a long history using DG methods to approximate the solution of partial differential equations in computational physics with successful applications in linear wave propagation, like those governed by Maxwell’s equations, incompressible and compressible fluid and plasma dynamics governed by the Navier-Stokes and the Magnetohydrodynamics equations, or as a solver for ordinary differential equations (ODEs), e.g., in structural mechanics. The DG method amalgamates ideas from several existing methods such as the Finite Element Galerkin method (FEM) and the Finite Volume method (FVM) and is specifically applied to problems with advection dominated properties, such as fast moving fluids or wave propagation. In the numerics community, DG methods are infamous for being computationally complex and, due to their high order nature, as having issues with robustness, i.e., these methods are sometimes prone to crashing easily. In this article we will focus on efficient nodal versions of the DG scheme and present recent ideas to restore its robustness, its connections to and influence by other sectors of the numerical community, such as the finite difference community, and further discuss this young, but rapidly developing research topic by highlighting the main contributions and a closing discussion about possible next lines of research.


A BRIEF INTRODUCTION TO DG
The first discontinuous Galerkin (DG) type discretisation is either attributed to Reed and Hill in 1973 [1] for an application to steady state scalar hyperbolic linear advection to model neutron transport, or to Nitsche in 1971 [2] who introduced a discontinuous finite element method (FEM) to solve elliptic problems with non-conforming approximation spaces. It was however a series of papers by Cockburn and Shu et al. starting 20 years later [3][4][5][6] that introduced the modern form of the so-called Runge-Kutta DG scheme. They applied the method especially to nonlinear hyperbolic problems such as the compressible Euler equations on unstructured simplex grids with slope limiting to capture shocks. Bassi and Rebay were the first that extended the DG method to the compressible Navier-Stokes equations [7]. They used a fully discontinuous ansatz based on a mixed variational formulation, where they rewrote the second order partial differential equation (PDE) into a first order system. The resulting DG formulation requires numerical fluxes for the advective as well as for the diffusive part. Although the methods gave reasonable results for the compressible Navier-Stokes equations, an analysis of the method in Arnold and Brezzi et al. [8,9] applied to pure elliptic problems revealed how to improve the method in terms of convergence rate, adjoint consistency, and stability. Since the introduction of its modern form, the DG method has been applied and advanced by many researchers across different scientific disciplines around the world. The DG method is used in a wide range of applications such as compressible flows [10][11][12], electromagnetics and optics [13][14][15][16], acoustics [17][18][19][20][21], meteorology [22][23][24][25], and geophysics [26,27]. The first book available on DG was basically a collection of papers [28]. Since then, many different text books on DG are available focusing on theoretical developments as well as specific implementation details, e.g., [29][30][31].
One of the main applications of DG methods is the discretisation of nonlinear advection-diffusion problems of the form where u is the vector of conserved quantities, e.g., the mass, momentum, or energy. The vector f(u) defines the flux functions that in general depend nonlinearly on the solution u, and can be compactly written with the double arrow notation as block vectors with the fluxes f i in each spatial direction x i , i 1,2,3. The viscous fluxes are denoted by f v and not only depend on the solution, but also on its spatial gradient thus modeling parabolic effects, e.g., heat conduction. The problem is typically defined on a given spatial domain Ω ⊂ R 3 , with a final time T, and suitable initial and boundary conditions. The DG scheme is based on a Galerkin type weak formulation. For the sake of simplicity, we drop the viscous second order terms in what follows and refer to, e.g. [32], for a complete description of the advection-diffusion case. To construct the approximation space of the DG method, the domain is split into non-overlapping elements E ⊂ Ω. Each component of the solution u is represented as a polynomial function inside each element where P(N) is the number of polynomial basis functions depending on the polynomial degree N. The time dependent polynomial coefficients are U j (t), and ϕ j ( x → ) spans the polynomial basis. The DG approximation space is polynomial inside an element, but discontinuous across element interfaces. For a given element E, we define first the inner product for state vectors Similarly, for block vectors, We obtain the weak formulation by multiplying each equation by a polynomial test function ϕ( x → ). Next, we integrate over the element E and use integration-by-parts to move the spatial derivatives off of the physical fluxes onto the test function If we choose the test function ϕ to be all the polynomial basis functions from the solution ansatz space {ϕ j } P j 0 it generates P equations in each element for each state variable. This matches exactly the number of unknown polynomial coefficients U j (t). Due to the discontinuous ansatz, the flux normal f · n → at the surface integral is not uniquely defined. Borrowing ideas from the Finite Volume (FV) community, this non-unique normal flux function is replaced and approximated with a so-called numerical surface flux function that depends on the two values at the element interface, i.e. on the value inside the element U and outside from the neighbor element U + . Typically, the numerical surface fluxes are constructed from (approximate) Riemann solvers, e.g., [33]. With the surface numerical flux, we arrive at the semi-discrete DG formulation of the advection problem in weak form or if we apply integration-by-parts once more to the volume terms it becomes the so-called strong DG formulation where the surface contribution is a penalty between the numerical surface flux and the normal flux evaluated from the interior of an element. Note, the weak and strong form DG formulations are equivalent [34]. There are still many critical decisions necessary before the DG formulation produces an algorithm that can be implemented. The type of element shape needs to be decided as well as which polynomial basis. For example, modal polynomial basis functions for tetrahedral elements or nodal tensor product polynomials for hexahedral elements. In addition, the surface integral and the volume integral needs to be discretized. In most cases, the integrals are approximated with numerical quadrature rules, e.g., high order Gauss-Legendre quadrature and cubature. Many variants and detailed descriptions can be found in text books on DG methods and their implementation, e.g., [28][29][30][31]. It is important to note that these choices involving the element type, basis functions, and approximation of inner products all have a major impact on the performance of the resulting DG scheme in terms of computational complexity and robustness due, e.g., to the presence of spurious oscillations near discontinuities that result in unphysical solution states (like negative density or pressure) or aliasing instabilities. Many mechanisms exist in the DG community to combat spurious oscillations (i.e., shock capturing) such as slope [3,5,35] or WENO [36,37] limiters, filtering [29,38,39], finite volume sub-cells [40][41][42], MOOD-type limiting [43][44][45][46][47], or artificial viscosity [48,49]. The issue of shock capturing will not be discussed further. However, aliasing errors, how they arise within the DG method and strategies to remove said errors and increase robustness will be discussed at length in this article.
With the above decisions, we arrive at the generic semidiscrete ordinary differential equation (ODE) form of the DG scheme, which can be integrated in time with an appropriate high order explicit or implicit ODE solver, e.g., [50][51][52][53][54].
The resulting DG method is high order accurate and has excellent dispersion and dissipation behavior, e.g., [55,56]. Furthermore, due to its compact stencil (only interface neighbor data is needed) the DG scheme is well known for its excellent parallel scaling, e.g., [57,58], and its ability to handle unstructured and non-conforming grids, e.g., [16,54,[59][60][61][62][63]. These nice properties of the DG methodology are one reason why more and more researchers apply and extend the DG methodology to many different problem setups in computational physics. However, DG is not the perfect discretisation and there are unfortunately some issues that necessitate detailed analysis and discussion.
The remainder of this review article gives the answers to why we need novel developments, Section 2, when the novel developments started, Section 3, what the key ideas of these novel strategies are, Section 4, and where there are still open questions toward future research directions, Section 5.

WHY DO WE NEED A NOVEL ROBUST STRATEGY?
Throughout the analysis and discussions in this manuscript we describe different types of stability for a numerical approximation. Principally, we concentrate on the stability and boundedness of the spatial DG discretization.

On the L 2 -Stability of the DG Method
It is easy to show that the DG scheme is L 2 -stable for linear advection problems with constant coefficients due to its Galerkin nature, e.g., as a special case in Ref. 64. As a brief illustrative example let's consider the one-dimensional scalar linear advection model where f(u) au with constant velocity a. The respective strong form DG scheme reads We get the discrete evolution of the L 2 -norm U 2 dx by inserting ϕ U for the polynomial test function Assuming time continuity, the first term reduces to z t U 2 /2 dx.
We observe that in the volume integral f x aU x is a polynomial of degree N−1 and ϕ a polynomial of degree N. Thus, we need the quadrature rule to be exact for polynomials with at least degree 2N−1. This is guaranteed by all Gauss-Legendre type quadrature rules with at least N+1 nodes, such as the Legendre-Gauss-Lobatto (LGL) quadrature. The volume term contribution is 0, as a 2 U U + is a unique discrete energy flux at every interface and cancels when summing over all elements. Thus, for linear advection, the DG scheme with at least 2N−1 accurate quadrature is L 2 -stable, i.e. the discrete L 2 -norm is bounded for all times t. Note, that for quadrature rules with less than 2N integration precision, the estimate is not for the exact L 2norm, but for a discrete L 2 -norm corresponding to the quadrature rule chosen.
For linear problems basically all DG variants are stable, but what about nonlinear problems? To address nonlinear problems, there was a crucial contribution by Jiang and Shu in 1994 [65], who demonstrated that for scalar nonlinear hyperbolic problems, the DG method is L 2 -stable provided: 1) Exact evaluation of all integrals are used; 2) Entropy stable numerical surfaces fluxes are used at the element interfaces. The L 2 -stability result with conditions 1) and 2) also extends to symmetric variable coefficient hyperbolic systems [66]. Again, as a brief example to illustrate the important steps of the analysis, we consider a scalar one-dimensional problem Eq. 11 with a simple quadratic flux function f (u) 1 2 u 2 , the so-called Burgers' equation. The DG scheme is, again, given by Eq. 13 and we get the evolution of the discrete L 2 -norm by replacing the test function with the DG solution ϕ U. Note, that for this nonlinear problem, the volume integral requires a quadrature rule with higher integration precision to be exact. For the quadratic flux function f ∼ u 2 , the quadrature rule needs 3N−1 integration precision. With the exact evaluation of the volume integral, its contribution is, once again, lifted onto the boundary of the element to give Following the same arguments as above for the linear advection problem, we sum over all of the elements in the domain and obtain L 2 -stability for the nonlinear scalar hyperbolic problem for quadrature rules with at least 3N−1 integration precision.
Unfortunately, even ignoring the practical issues with the assumption of exact integration for a moment, the results of Jiang and Shu cannot be directly extended to general nonlinear hyperbolic systems, e.g., the compressible Euler equations. A key step in the analysis of Jiang and Shu is that the test functions ϕ in the DG formulation Eq. 10 are replaced with the discrete DG solution which itself is a linear combination of the test functions {ϕ j } P(N) j 0 and, hence, an element of the test function space. While this gives an L 2 -norm estimate for symmetric systems, this approach does not lead to a proper norm estimate (in the continuous as well as in the discrete case) for general nonlinear systems. This lack of a stability estimate, even when using "exact integration," is the explanation why the DG method can still crash for complex PDEs, e.g., [67].

On the Entropy Stability of the DG Method
In the previous section it was shown for a scalar nonlinear conservation law that the DG solution U is bounded in the L 2 -norm if a special choice of the numerical surface flux is chosen, such as the one in Eq. 20. We conjecture that an analogous statement of nonlinear stability should be true for systems of nonlinear conservation laws. However, in general, stability in the L 2 -norm is insufficient to exclude unphysical phenomena such as expansion shocks [68]. To remove the possibility of such phenomena we generalize the notion of a stability estimate for nonlinear problems. Before we discuss the mathematics of a general nonlinear hyperbolic system, a detour is taken to examine an important underlying physical principle. In particular, we introduce concepts from thermodynamics, which is a branch of physics relating the heat, temperature, or entropy of a given physical system to energy and work. The laws of thermodynamics are some of the most fundamental laws in all of physics. This is because they play an important role in describing how, and predicting why, physical systems behave and evolve the way that we observe them. Moreover, thermodynamics provides fundamental rules to decide how a physical system cannot behave. That is, what type of solution behavior is physically meaningful and what is not. From a mathematical point of view, we note that satisfying the second law of thermodynamics is not enough to guarantee uniqueness of the PDE solution and that conditions for uniqueness are an active topic of research in the analysis of said PDEs, see for example [69][70][71][72].
The first law of thermodynamics concerns the conservation of the total energy in a closed system. The second law of thermodynamics states that the entropy of a closed physical system tends to increase overtime and, importantly, that it cannot shrink. The laws of thermodynamics must be satisfied simultaneously at all times, otherwise a mathematical solution can exhibit strange and obviously incorrect behavior. For example, a fluid that only conserves its total energy but does not take care on the entropy, i.e. satisfying the first law of thermodynamics but not the second law, could transfer all of its internal energy into kinetic energy. The result would be a very fast, but very cold jet of air. Such a flow configuration has never been observed in nature. This discrepancy is removed when incorporating the second law of thermodynamics where the transfer of energies are regulated. For reversible processes the entropy remains constant over time (isentropic) and the time derivative of the total system entropy is zero. For irreversible processes the entropy increases and the time derivative is positive. Solution dynamics where the total system entropy shrinks in time are never observed and deemed unphysical.
To discuss these ideas in a mathematical context consider a system of nonlinear hyperbolic conservations laws where we take the viscous flux components in Eq. 1 to be zero. Typically, the diffusion terms are dissipative in nature and are mostly uncritical. A prototypical example of a purely hyperbolic system of nonlinear conservation laws is the compressible Euler equations modeling inviscid gas dynamics. A smooth solution that satisfies the system of PDEs Eq. 23 corresponds to a reversible process. One of the difficulties, either analytically or numerically, of nonlinear hyperbolic PDEs is that the solution may develop discontinuities (e.g., shocks) regardless of the continuity of the initial conditions [73]. A discontinuous solution of Eq. 23 corresponds to an irreversible process and dissipates entropy. To mathematically account for possible discontinuous solutions, system Eq. 23 is considered in its weak form. Just as in the Galerkin discretisation in Section 1, the weak form of the PDE is found by multiplying the governing equations by a smooth test function ϕ(t, x → ) with compact support and integrating over R + × R 3 . Integration-by-parts is again applied to move the derivatives onto the test function and weaken the smoothness requirements on possible solutions. Hence, weak solutions of system Eq. 23 satisfy Another form of the conservation law is its integral form that, under the assumption of differentiable fluxes, arises from Gauss' theorem and holds for arbitrary control volumes Ω, e.g., [74]. Unfortunately, weak solutions of a PDE are, in general, not unique and must be supplemented with extra admissibility criteria in order to single out the physically relevant solution [75][76][77]. This is precisely where the laws of thermodynamics play a pivotal role because, as already discussed, due to their intrinsic ability to select physically relevant solutions. In most applications, e.g., compressible fluid dynamics or astrophysics, the total entropy is not part of the state vector of conservative variables u. However, we know from the discussion above that for reversible (isentropic) processes the total entropy is a conserved quantity. Where is this conservation law "hiding"? It turns out that there are additional conserved quantities, e.g., the entropy, which are not explicitly built into the nonlinear hyperbolic system Eq. 23 but are still a consequence of the PDE. In order to reveal this auxiliary conservation law we define a convex (mathematical) entropy function s s(u) that is a scalar function and depends nonlinearly on the conserved variables u. This allows the definition of a new set of entropy variables that provides a one-to-one mapping between the conservative variable space and entropy space [78]. If we contract the nonlinear hyperbolic system Eq. 23 from the left with the entropy variables w we have From the definition of the entropy variables and assuming continuity in time, we know that Further, each of the flux vectors in the coordinate directions x i must satisfy a compatibility condition where f i s , i 1,2,3 is a corresponding entropy flux [78]. We point out that a chain rule is the linchpin of the manipulations Eqs. 28 and 29 to move from the space of conservative state variables into the space of entropy variables. In the continuous setting this is not an issue under certain continuity assumptions. However, in a numerical setting it is extraordinarily difficult (or even impossible) to recover the chain rule with discrete differentiation, e.g., [79]. We postpone the discussion on this issue and what it means for a high order DG numerical approximation to Section 4.
By definition of a mathematical entropy, contracting the nonlinear hyperbolic system into entropy space, as in Eq. 27, and assuming the solution is smooth (i.e. a reversible process) results in the auxiliary conservation law for the entropy The corresponding integral form of the entropy conservation is given by for an arbitrary volume Ω. For irreversible processes, physical entropy is increasing. In the mathematical community however, entropy is defined as a decaying function and hence the entropy conservation law Eq. 31 becomes the entropy inequality [78] for discontinuous solutions.
As an illustrative example for the contraction of a nonlinear hyperbolic system of PDEs into entropy space, we consider the compressible Euler equations of gas dynamics in one spatial dimension with the ideal gas assumption where γ is the adiabatic constant. The convex entropy function s(u) for the compressible Euler equations is not unique [80]. However, a common choice for the mathematical entropy function is the scaled negative thermodynamic entropy [80][81][82][83][84] with the corresponding entropy flux f s v s(u). From this definition of the mathematical entropy function we get the entropy variables and see that each component of the entropy variables is a highly nonlinear function of the state vector components. Regardless, the variables Eq. 36 contract the one dimensional compressible Euler equations into entropy space and, when integrated over the domain Ω, become the entropy conservation law Eq. 31 for smooth solutions or the entropy inequality Eq. 32 for discontinuous solutions. It is worth noting that the entropy variables w are further useful in the analysis of the system, as they allow to derive a symmetric form of the PDE [85]. The discrete equivalent of the entropy inequality Eq. 32 is referred to as entropy stability. It is a generalization of the L 2stability statement to systems of nonlinear hyperbolic PDEs, e.g., [79,86]. An additional requirement built into the entropy stability condition is that the fluxes f remain bounded [68], which restricts the flow to physically realisable states, e.g., positive density and pressure in gas dynamics. Overall, entropy stability ensures that a numerical approximation obeys the fundamental laws of thermodynamics and is viewed as an important quality to capture [86][87][88]. But it is an active area of research to investigate the role of entropy stability and how it fits into the question of provable nonlinear stability [82,89,90].
At present, we restrict ourselves to one of the key ingredients in the analysis to derive a nonlinear entropy stability estimate for general nonlinear hyperbolic systems, see e.g., [78,86]. It is natural to develop an entropy stable DG approximation because the continuous and discrete analysis both rely on a weak form of the governing equations. However, for entropy stability the nonlinear system is not multiplied by the solution u as was the case for the L 2 -stability analysis. Instead the equation is multiplied with the entropy variables w Eq. 26, which are nonlinear functions of the state u, see e.g., the compressible Euler equations in (36). Thus, a direct combination of this approach with the analysis of Jiang and Shu from Section 2.1 is not possible. For a polynomial DG ansatz U, the discrete entropy variables W W(U) are no longer polynomials of degree N and do not belong to the space of test functions ϕ. Hence, it is not allowed to replace the test functions ϕ in the DG formulation Eq. 10 with W. Technically, only a projection of W onto the space of polynomials with degree N can be inserted; however, in this case the analysis does not lead to an entropy stability estimate as the chain rule holds for the full entropy variables and not their projections.
To overcome the issue with the test function space and to enable an entropy stability estimate for the DG method, Hughes et al. [91] as well as Hildebrand and Mishra et al. [92][93][94] used a space-time DG approach with an ansatz directly written in terms of the entropy variables. The idea is to make the DG ansatz in entropy space, i.e. to approximate the entropy variables with a polynomial of degree N. Ignoring the time discretisation for brevity, the DG formulation changes to which shows that the scheme is still formulated in conservative form, however all the conserved variables u now depend implicitly on the polynomial approximation of the entropy variables W. As this approach is naturally implicit, a straightforward and elegant extension of the scheme in time is to use a temporal DG scheme on top of the spatial DG scheme, resulting in a fully implicit space-time DG formulation. Hildebrand and Mishra proved that the resulting discretisation is entropy stable provided: i) Exact evaluation of all integrals; ii) Entropy stable numerical fluxes at the spatial surface integrals and upwind fluxes (due to causality) in the temporal surface integrals are used. These conditions on the space-time DG approximation are very similar to those imposed by Jiang and Shu for the scalar nonlinear hyperbolic case discussed for the nonlinear Burgers' equation.
Unfortunately, the assumption 1) on exact integration is extremely difficult to guarantee and implement in practical simulations, which we describe in more detail in the next subsection. Without proper exact quadratures of the integral terms, the chain rule does not hold discretely. Such inexact approximations of functions are referred to as aliasing errors. These can occur in realistic simulations and might cause instabilities. Thus, robustness is still an issue and ad hoc dealiasing or stabilization mechanisms, e.g., artificial diffusion [92] are necessary.

The Unpleasant Role of Numerical
Integration, Nonlinearities, Variational Crimes, and Aliasing As described above it is necessary to discretize the variational formulation and make certain choices in the DG approximation such as the polynomial basis functions and, especially, the discrete integration to approximate the surface and volume integrals. Unfortunately, only in the rarest and simplest cases is it possible to avoid these discretisation steps and use an exact evaluation of the integrals. Hence, the notion of "variational crimes" is introduced to express the steps necessary to turn the formulation into an actual algorithm that can be implemented.
One of the biggest problems when discrediting nonlinear advection-diffusion problems is that in many interesting cases, the nonlinearity is non-polynomial. Our exemplary problem, the compressible Euler equations depend on the mass density u 1 , the momentum density in the x direction u 2 , and energy density u 3 . Often, these are also denoted as u 1 ρ and u 2 ρ v, where v is the velocity. We compute the velocity from the conserved variables as This is important in the context of the DG discretisation because if the variables u 1 and u 2 are polynomials of degree N, the velocity v is not a polynomial, but a rational function. This occurs not only for the velocity but also for other quantities that are needed to evaluate the advective fluxes f, such as e.g., the pressure p. Hence, the fluxes f are no longer polynomials of degree N, and possibly rational functions, as in case of the compressible Euler equations. When approximating the integrals with high order quadrature formulae, such as the Legendre-Gauss rules, it is important to realize that these numerical integration rules are constructed for polynomial integrands. Hence, in theory, they cannot integrate non-polynomial functions exactly no matter how many quadrature nodes are considered. If we focus for instance on the strong form DG volume integral from Eq. 38, we see that the core part to evolve the DG solution in time is an L 2 projection of the flux divergence onto the If this projection is not evaluated exactly, due to either the aforementioned variational crimes, the nonlinearities of the flux function, or a combination of the two, the exact L 2 projection turns into a discrete projection, most often taking the form of an interpolation at the quadrature nodes. This is a subtle but important observation. In contrast to an exact L 2 projection, which cleanly "cuts out" high order content of the flux divergence with polynomial degrees larger than N, a discrete L 2 projection interprets (i.e. aliases) some of the high order content as part of the projection polynomial. This artificially and unpredictably decreases or increases the polynomial coefficients of the projection. This "incorrect interpretation" of high order content is also well known in Fourier analysis and signal processing. If the sampling rate (in this case the number of Legendre-Gauss quadrature nodes) is not high enough according to the Nyquist theorem, high frequency data (high order content) gets interpreted as low frequency data (onto the polynomial of degree N) and pollutes the result. This analogy to Fourier analysis illustrates the possibility that high frequency information can masquerade as low frequency information when represented on a discrete and unresolved grid. This is the fundamental issue often termed aliasing. As the issue of the discrete projection onto a space of polynomials is similar in spirit, the term aliasing is also often used in the DG community, as well as the spectral and finite difference communities, to give potential consequences of the variational crimes a name. In summary, basically all of the DG algorithms for nonlinear advection dominated problems have the issue of inexact evaluation of the integrals and hence all DG algorithms have aliasing errors.
Unfortunately, these aliasing issues are not simply an abstract and "ugly" theoretical oddity without practical consequences. On the contrary, aliasing plays an important role when using DG methods for realistic complex applications to model nonlinear phenomena. It is worth pointing out that one of the advantages of DG, it's very low dissipation errors, are in this particular point of view also it's biggest problem. Due to the inherent low numerical dissipation in a high order DG method, there is no in-built selfdefence against the aliasing issues and any instability that they may create. A repercussion of this fact is that it has become naturalized in the numerics community that especially the high order variants of the DG method, with very low dissipation errors, have robustness issues in practical applications. For instance DG approximations of the compressible Euler and Navier-Stokes equations are known to sometimes fail due to aliasing instabilities, e.g., [39]. This instability can manifest itself through the observation that the kinetic energy artificially grows in the simulation, while the inner energy decreases. Note, that the total energy is conserved by construction with the DG method; however, this exchange of kinetic and internal energy is unphysical and violates the second law of thermodynamics and is purely a result of the variational crimes (inexact integration).
An obvious solution to these problematic variational crimes, nonlinearities, and aliasing is to decrease their deleterious effects as much as possible. While technically unavoidable in the strictest mathematical sense, it is possible to increase the amount of Legendre-Gauss quadrature nodes to evaluate all integrals "consistently" such that the inexactness errors are on the order of machine precision, see e.g., Kirby et al. [95]. This approach is quite effective and immediately has a positive stabilizing effect on many applications with nonlinear PDEs, see e.g., [39,96]. However, it is clear that the computational complexity drastically increases when arbitrarily increasing the number of quadrature nodes. Hence, one often tunes the increased number of Legendre-Gauss quadrature nodes and takes as many as needed to make a simulation stable-which is, of course, unsatisfying and ad hoc, as it highly depends on the particular problem setup. In comparison, another approach is to not directly fight the variational crimes themselves, but the consequences they induce. This is achieved by applying well designed filters to the solution, with the purpose to clip out higher order aliasing content in an effort to decrease its effect, e.g., [39]. It is needless to say that this filtering approach is also very ad hoc and depends on many parameters that need tuning depending on the particular problem one wishes to solve with the high order DG scheme.
While these ad hoc stabilization techniques are reinforced by little to no mathematical analysis or rigor, the prevailing consensus in the DG community has been that, in practice, they work reasonably well. In that, consistent integration (sometimes referred to as over-integration) and filtering increase the robustness of high order DG methods to a level such that they could be applied to model challenging physical problems allowing them to shine with their high order accuracy and low dispersion and dissipation errors. Especially in the turbulence community, many research groups started to apply high order DG methods with stabilization in the context of implicit large eddy simulation with excellent results competitive with others from the broader numerics community, e.g., [97,98]. However, in Moura et al. [67] it was reported that certain configurations of the DG method for the inviscid Taylor Green vortex problem kept crashing, even when drastically increasing the number of quadrature nodes in the surface and volume integrals. In fact, the amount of quadrature points was increased up to the point where the DG scheme was no longer computationally feasible, but the simulations still crashed. The inviscid Taylor Green vortex setup in this case was used to investigate the case of a very high Reynolds number flow with severe under resolution common in realistic turbulence setups. These findings were also verified by the authors of this review article and have a strong consequence for the DG community. While the ad hoc stabilization techniques were "good enough" in the sense that they helped to make the DG scheme run for a broad range of interesting problems, this approach is apparently not bullet proof. Further, it is impossible to tell a priori for which cases the stabilization will work and for which cases it will not.
This one example, where the high order DG scheme was not stable and could not finish the simulation illustrates, fundamentally, that the removal of aliasing and variational crimes cannot be reliably done in an ad hoc fashion. Instead we need a better understanding of these aliasing errors and how they can be removed from inexact and/or under resolved discretisations. Furthermore, we require a mathematically sound approach to address these aliasing errors in the DG approximation. That is, we need a novel strategy to design robust high order DG methods to approximate the solution of nonlinear advection-diffusion systems.

WHEN DID THE NOVEL DEVELOPMENT START?
In 2013, two landmark results completely reshaped the development of the DG method moving forward. First, in his PhD thesis, Fisher extended the work on entropy stable schemes LeFloch and Rhode [99] as well as the high order entropy stable schemes of LeFloch et al. [100] to the summation-by-parts (SBP) finite difference framework with high order boundary closures in Refs. 88 and 101 respectively. LeFloch et al. and Fisher et al. found that the entropy stability estimates for low-order FV methods, developed by Tadmor [78], can be extended to high order accuracy. Whereas the high order reconstruction of LeFloch et al. was for periodic domains (i.e. without considering finite domain boundaries), the SBP finite difference framework includes special boundary closures and are applicable for finite domains. Kreiss et al. [102][103][104][105] introduced the SBP finite difference framework to specifically mimic integration-byparts. Integration-by-parts is a valuable tool for the construction of stability estimates. Further discussion on SBP is given by, e.g., Olsson [106,107], Strand [108], Nordström [109] and Svärd and Nordström [110].
To briefly introduce the main ideas of the classic SBP finite difference framework, we consider a discretisation in one spatial dimension on a finite interval E [−1,1]. Within this interval, we consider a set of N+1 regular grid nodes x j that include the boundaries x 0 1 and x N 1. On this grid, a continuous function u(x,t) is represented as the grid node values U j (t) u(x j , t). In short notation, we collect the nodal values into the vector quantity U. For the approximation of the PDE, we need two discrete operators: One that approximates integration, M ∈ R (N+1)×(N+1) ; and one that approximates differentiation, D ∈ R (N+1)×(N+1) . In this article, we only consider diagonal matrices M, sometimes referred to as diagonal norm SBP finite difference operators. With these operators we have The discrete integration and differentiation need to be compatible for a SBP operator to satisfy the property where B is the boundary integral evaluation operator with B diag(−1, 0, . . . , 0, 1). Multiplying Eq. 41 by grid values U T of an arbitrary function u(x) from the left and the approximation V of an arbitrary function v(x) from the right gives Grouping terms and using that M is diagonal such that M M T we have which is a discrete approximation of the integration-by-parts formula for the corresponding functions u and v hence the name summation-by-parts. With the SBP property, it is directly possible to show L 2stability of the finite difference scheme for constant coefficient linear advection problems. Starting again as an example with the scalar problem Eq. 11 and the linear flux f au, we get the following SBP finite difference semi-discretisation As stated above, one motivation of the SBP framework is to mimic the energy analysis of finite element discretisations like that found in the DG analysis presented above for linear advection. We proceed and first get a corresponding Galerkin type variational form by multiplying with the discrete integration matrix M This form is valid for all grid functions V T multiplied from the left and hence is a direct approximation of the variational Galerkin form, e.g., Eq. 12. We point out that this finite difference approximation ignores the surface terms that are specific to the discontinuous Galerkin scheme. Here, the arbitrary grid function V T takes the role of the test function ϕ. Hence, we can mimic the next step in the analysis, i.e. replacing the test function with the DG solution ϕ U, by multiplying Eq. 46 with U T from the left The volume term can be reformulated with the SBP property to move the discrete derivative onto the test function and generate boundary data (48) From this step we see that the contribution of the volume terms in the SBP finite difference scheme can be again lifted to the interval boundaries but this time without any assumption on a necessary quadrature rule precision. In fact, this is general and holds for all diagonal norm SBP finite difference operators. Again, with the assumption of time continuity and periodic boundary conditions (U N U 0 ), it follows that the discrete L 2 -norm of the SBP finite difference solution U 2 SBP U T M U is bounded for all t. We emphasize again that neither the reconstruction techniques of LeFloch et al., nor the SBP finite difference framework as a whole, depend on integration or exact evaluation of integrals. Thus, in contrast to the DG stability results discussed above, the stability results obtained for SBP finite differences by Fisher et al. do not assume exact evaluation of any integrals. Thus, such methods yield efficient algorithms with feasible implementations that have provable stability estimates.
The second important result was separately discovered in 2013 by Gassner [111]. He realized that the base operators of the nodal discontinuous Galerkin spectral element method (DGSEM) have the diagonal norm SBP property as long as the collocation nodes {x j } N j 0 and weights {ω j } N j 0 were chosen to be those associated with LGL quadrature. It is interesting to note, that earlier, in 2010, Kopriva and Gassner [34] already found out that for DGSEM with LGL quadrature, the weak DG formulation and the strong DG formulation are discretely equivalent. As shown in Eqs. 9 and 10, the weak form and strong form can be transformed into one another with integration-by-parts. Thus, when both forms are discretely equivalent, it basically means that discrete integrationby-parts, i.e., SBP, holds. We point out that in 1996, in the context of spectral methods with Chebyshev-Lobatto nodes or LGL nodes, Carpenter and Gottlieb [112] showed a similar property as SBP for these spectral operators, however they assumed that integration-by-parts holds for the proof. The results in Refs. 34 and 111 complete their findings as they remove the assumption of exact integration.
In the nodal DGSEM-LGL framework, similar to the finite difference framework, the solution coefficients of the DG polynomial are nodal values U j (t) at the location of the LGL nodes x j . The nodal DG polynomial is represented with Lagrange basis functions {ℓ j (x)} N j 0 spanned with the LGL nodes which have the Kronecker delta property that ℓ j (x i ) δ ij , i.e., 1 if i j and 0 otherwise. With this choice of basis function and quadrature rule, it is possible to find discrete versions of the corresponding integral operator and the differentiation operator.
For the integral we consider with M diag(ω 0 , . . . , ω N ) and the vector of LGL nodal values U and V. Furthermore, we have for the discrete differentiation where we used the short hand notation for the spatial derivative of the Lagrange basis d dx ℓ(x) ℓ ′ (x). Introducing the differentiation matrix as we get As was shown in Ref. 111, these two discrete operators are again compatible and provide the SBP property which means that the DGSEM-LGL operators belong to the class of diagonal norm SBP operators. This simple property of onedimensional discrete integration-by-parts is the basis for a whole polynomial spectral calculus [113] that includes, for instance, discrete version of Gauss' law on curvilinear grids in three spatial dimensions.
Returning to the discussion on stability, the LGL quadrature rule with N+1 points has an integration precision of 2N−1. Thus, the DGSEM-LGL is stable for scalar linear advection as shown above. However, the DGSEM-LGL is not stable for nonlinear problems, e.g., for the quadratic flux function discussed in Section 2.1 where an integration precision of 3N−1 is necessary i.e., exact integration of the volume terms. But using the SBP property of the DGSEM-LGL operators, it is possible to apply ideas similar to Fisher et al. and construct a novel DGSEM with LGL quadrature, that is discretely L 2 -stable for the nonlinear Burgers' equation, without the assumption on exact evaluation of the integrals [111]. These first results have been extended and compounded upon for the compressible Euler equations [114][115][116][117][118], the shallow water equations [63,[119][120][121], the compressible Navier-Stokes equations [32,122,124], non-conservative multi-phase problems [124], magnetohydrodynamics [125,126], relativistic Euler [127], relativistic magnetohydrodynamics [128], the Cahn-Hilliard equations [129], incompressible Navier-Stokes (INS) [130], and coupled Cahn-Hilliard and INS [131] among many other complex PDE models and DG discretization types e.g., [132].

WHAT IS THE KEY IDEA?
To recap the discussion, there are several obstacles that make it difficult to obtain entropy stability estimates for high order DG methods in the case of a general nonlinear system of hyperbolic PDEs: 1) The assumption of exact evaluation of integrals is unfeasible in practice; 2) We need to contract with entropy variables w that nonlinearly depend on the conservative quantities u, which means that we need to replace the test functions by a projection (or interpolation) of w(u) ; 3) We need to satisfy a discrete version of the chain rule to contract the flux divergence into entropy space, i.e. a discrete version of where now the entropy variables and the flux functions are discrete projections and the derivative is replaced with our discrete derivative operator.
In what follows, the key ideas to resolve all three issues are presented. We focus on issue 1) and consider a scalar nonlinear problem with quadratic fluxes discussed above first. Then we ramp-up the complexity in the second subsection and discuss how to extend the novel approach to general systems and how to resolve all issues (i)-(iii).

On the Conservative Form, Split Forms and Skew-Symmetry
To illustrate the general idea of a split form and how to incorporate it into a high order DG approximation we examine our simple scalar nonlinear hyperbolic conservation law, the Burgers' equation. We start with the conservative form that can be rewritten into its advective form which is equivalent in the continuous case for smooth solutions. We can also consider an equivalent combination of the two forms where α ∈ R is an arbitrary parameter. This form is called the split form of Burgers' equation, with α being the split form parameter. While in the continuous case with smooth solutions all of these forms are equivalent, it is important to note that in the discrete case this is not true. Considering the DGSEM operators with N+1 LGL nodes, the volume terms for the conservative form are given by where U is the vector of values of u at the LGL nodes, D is the DGSEM-LGL derivative operator and U diag(U 0 , . . . , U N ) is a matrix that has the nodal values U injected onto its diagonal. Analogously, the volume terms for the advective form are Only for polynomial functions u with degree ≤ N/2 and their corresponding nodal values U, do we have In all other cases, i.e. in the general case for arbitrary nodal vectors U, we get The discrete forms are different because of different aliasing errors. Whereas the conservative form computes a discrete derivative of U 2 , the second form computes a "clean" derivative of U, but on the other hand needs to compute the product of two functions U and DU on a grid with only N+1 nodes. An interesting question is, if we can make use of the different (aliasing) errors in the two forms and find combinations via the split formulation where these errors cancel. We note that the idea of split formulations was already introduced in the spectral community to develop stable numerical methods for the incompressible Navier-Stokes equations e.g., [133], but is especially prominent in the finite difference fluid dynamics community e.g., [134][135][136][137][138][139]. Split formulations are used as a built-in dealiasing mechanism to stabilize numerical methods e.g., [140]. Combinations of different forms of the advective terms of the compressible Euler equations yield finite difference approximations that are more robust than the standard conservative ones. In a perfect world, it would be desirable if we could choose the split form parameter α such that the different aliasing errors cancel exactly. Unfortunately, in general, it is not possible to cancel the aliasing errors for each grid node; however, what we will show next is that it is possible to cancel the aliasing errors in a way to get a global L 2 -stability estimate similar to the estimates by Jiang and Shu [65], but without the assumption of exact integration.
First, we derive the strong DG formulation of the split form of Burgers' Eq. 58 by multiplying with a test function ϕ, integrating over a grid cell E, inserting a numerical flux function f* at the grid cell interface to account for the discontinuous nature of our ansatz, and use integration-by-parts again to arrive at where the flux is f(U) U 2 /2. We consider specifically the DGSEM-LGL variant to get discrete operators that satisfy the Frontiers in Physics | www.frontiersin.org January 2021 | Volume 8 | Article 500690 SBP property with diagonal norm (mass matrix) M. We arrive at the DGSEM-LGL variant when we replace the integrals by discrete quadrature with N+1 LGL nodes and when using the same N+1 LGL nodes to span the Lagrange basis functions used for our polynomial ansatz. This gives the following discrete DGSEM-LGL split form where B is the boundary evaluation matrix from the SBP property Eq. 41, F F(U) is the vector of collocated nodal flux values i.e., F j f (U j ) ∀j, and F* is a vector that contains the numerical fluxes at the interfaces "left" and "right" in its first and last entry and is zero elsewhere. The value U + , again, indicates that the numerical flux functions depends not only on local element values U, but also on the values from the neighbor grid cells.
We refer to Gassner [111] for a detailed derivation of this form and its connection to the SBP framework with simultaneousapproximation-terms (SAT). Next, we follow the standard procedure to derive an L 2stability estimate by multiplying the scheme with the DG solution and use the quadrature rule to numerically integrate over the element i.e., we multiply by U T M where we used the fact that M U U M because both matrices are diagonal. Again, we consider the semi-discrete version and assume continuity in time to have the evolution of the discrete L 2 -norm in the grid cell E. Next, we focus on the volume terms. Note, that in the analysis of Jiang and Shu exact integration was assumed to contract the volume contribution to the surface. This is very important, as it allows direct control over the stability of the scheme with the choice of the numerical interface flux F*. Without the assumption of exact integration however, we look at the influence of the choice of the split form parameter α instead. We realize that the second term in the volume integral can be transposed U T U MD U U T (MD) T U U and is similar to the first term in the volume integral, except for (M D) T is now transposed. Using the SBP property (MD) T B − MD we get The term with the boundary evaluation matrix B is a surface term, however the remainder term is a volume term that can either increase or decrease the L 2 -norm. Hence, this term can be potentially critical in cases where it increases the norm, as this is an unstable behavior that could lead to break down of the simulation. We note that this volume term is another expression of the aliasing issues. To guarantee that this term does not affect stability, we need to guarantee that it vanishes. We see that there is a single (unique) choice of the split form parameter α 2/3 that cancels the remaining volume term. With this choice, the discrete change of the L 2 -norm reads as This estimate is now analogous to the one with exact integration Eq. 19 and hence, with the same arguments, the choice of the numerical flux functions as gives again a discrete stability estimate for the split form DGSEM-LGL with α 2/3 when summing over all grid cells with periodic boundary conditions. It is important to observe that this estimate is discrete in the sense that it did not assume exact integration and that it can be only derived with the particular choice α 2/3 to cancel out the volume contribution of the aliasing errors. With this, we have a novel method where we have solved issue i) mentioned in the beginning of the section. We note that this particular choice of numerical flux function exactly preserves the discrete L 2 -norm. If one considers nonsmooth solutions this choice would be inappropriate as for e.g., shocks, because the L 2 -norm needs to decrease as u 2 is a mathematical entropy for Burgers' equation. For scalar equations, s(u) u 2 /2 is the square entropy (which leads to an L 2 -stability estimate e.g., [99,141]) that gives the simple entropy variables w(u) ds du u. Hence, the specific choice of numerical flux Eq. 72 is often referred to as an entropy conservative (EC) flux function. For an entropy dissipative flux function, there are many choices available. It can be shown that the class of E-fluxes, e.g., [33], are guaranteed dissipative and lead to the estimate As an example, a simple choice of an entropy dissipative numerical flux function is that of Rusanov An important question is how we can extend this approach to general nonlinear systems. Following the same ideas, it was possible to derive split forms for the shallow water equations [63,119,142], a simplified version of the compressible Euler equations. There are many split forms for compressible Euler Frontiers in Physics | www.frontiersin.org that, for instance, give kinetic energy preserving properties e.g., [143,144]. However, up to now, no split form for the compressible Euler equations is known that gives the desired discrete entropy stability estimate. The problems are issues ii) and iii) mentioned in the beginning of the section, where we need the discrete chain rule property to contract the volume terms to the surface. Concluding this subsection we revisit the derivations of Burgers' equation and make two important observations. First, with the proper choice of α 2/3, we get the so-called skew-symmetric form Skew-symmetry is strongly connected to entropy, see e.g., Tadmor [85]. Multiplying the spatial derivative term by u as in the L 2 -stability analysis gives which shows that the skew-symmetric form gives a product-rule type form in the stability analysis that can directly be contracted to the divergence form i.e., contracts to the surface when integrating. In fact, for this simple problem, the chain rule needed for contraction reduces to the simpler product rule. Analogously, we get for the discrete skew-symmetric volume terms of the DGSEM-LGL Thus, in our derivation, we already used a specific discrete version of the chain rule (product rule) to get our estimate. The question is, how to extend this idea to the general case?
The second important observation pioneered for SBP schemes by Fisher in his PhD thesis [88] (in the spirit of earlier work by LeFloch et al. [100]) is that the particular skew-symmetric volume terms 1 3 D U U + U D U can be rewritten for any diagonal norm SBP operator (hence, also for the DGSEM-LGL case) into with f *,EC being the particular numerical flux Eq. 69 that is symmetric in its arguments and that leads to exact conservation of U 2 . We further introduced the shorthand notation D EC f for the volume term, that indicates that we use a specific derivative operator built on the EC-flux. As a remark, we note that this relation is easy to prove, as the discrete derivative of a constant is zero and then, for instance, In combination with the first observation we get the property that this new discrete derivative operator (or divergence operator in the multi-dimensional case) satisfies where F s is the collocated nodal vector of the entropy flux f s u 3 /3 for Burgers' equation with the square entropy s(u) u 2 /2. This relation is the important discrete analogue of the chain-rule property u f x f s x , as it follows with integration that (82) is the discrete analogue of We will see in the next subsection, how these observations guide the path to discrete stability estimates for general hyperbolic PDE systems.

On the Discrete Entropy Stability of the DGSEM-LGL
We have demonstrated how to build a high order skewsymmetric DG approximation of the scalar nonlinear Burgers' equation. To do so required a very particular discrete derivative operator Eq. 78 that was the key to restore discrete entropy stability. We now discuss how to extend the split form approach to general systems of nonlinear hyperbolic conservation laws. For general nonlinear systems, it is unclear how to explicitly construct the split form to obtain a discrete chain rule property. In particular, the compatibility condition on the physical fluxes obtained when one contracts into entropy space Eq. 29 that we reproduce here, assuming one spatial dimension, due to their pertinence in the present discussion As previously indicated, the chain rule is either unfeasible or even impossible to directly recover with discrete differentiation. With this in mind we apply the product rule to this compatibility condition between the physical flux f and the entropy flux f s to find A principle motivation for this manipulation is because it is far easier to recover the product rule discretely than it is the chain rule. That is, we already have a particular discrete equivalent for the product rule if the discrete derivative matrix D is a SBP operator. Next, we aim to find a discrete version of the new compatibility condition Eq. 81 following the ideas of Tadmor [78]. Tadmor analyzed low order FV schemes and developed conditions on the numerical surface flux to derive a discretely entropy conserving scheme. In the context of the low order FV methodology, our unknowns in the elements are mean values that are naturally discontinuous across grid cell interfaces. As mentioned above, the idea to resolve these discontinuities with numerical flux functions was also used in the construction of the DG approximation. Consider the contribution to the compatibility condition Eq. 81 at an arbitrary interface. It depends on the discrete values in the current cell and the direct neighbor of that cell, denoted again with a "+". We approximate all derivatives with first order differences and define Tadmor's entropy conservation condition on the numerical surface flux function where Δx is the size of each grid cell. Equivalently, we arrive at the following general condition on the numerical surface flux for entropy conservation For scalar nonlinear problems this condition can be solved explicitly [145]. For example, in the case of Burgers' equation, we have w(u) u, f(u) u 2 /2, and f s (u) u 3 /3 such that solving Eq. 83 which matches the particular entropy conservative flux derived in Section 4.1. We note again that the entropy conservative flux is symmetric in its arguments U + and U, and is consistent to the physical flux in the sense that for the same arguments we recover the PDE flux f *,EC (U, U) f (U). However, for systems of nonlinear hyperbolic conservation laws Eq. 83 is a single algebraic condition for a system vector of unknown flux quantities. Therefore, care must be taken to define an entropy conservative numerical flux function that remains physically consistent. That being said, the entropy conservation condition on the numerical surface flux Eq. 83 is an incredibly powerful statement. Provided we know an explicit form of the entropy variables, the physical flux, and the entropy flux we can define an appropriate numerical flux that ensures entropy consistency for a low order FV numerical approximation. A general form for such a numerical flux was developed by Tadmor [77] defined as a phase integral To evaluate the phase integral form of the numerical flux function requires a certain quadrature rule that defines a path through phase space. Though theoretically useful, this phase integral form is computationally prohibitive for practical simulations, even for low order numerical approximations. However, over the past 20 years "affordable" versions of the entropy conservative flux function f *,EC (U + , U) have been developed for a variety of nonlinear systems like the shallow water equations [119,146], compressible Euler [81,83], and ideal magnetohydrodynamics [147].
The key to these numerically tractable versions of the entropy conservative numerical flux function is to evaluate the components of the physical flux at various mean states between U + and U. Note that these mean states can take on incredibly complex forms that depend on the arithmetic mean, the product of arithmetic means, or more uncommon quantities like the logarithmic mean. Complete details on the derivation of such numerical flux functions can be found in e.g., [81,83,146,147].
For illustrative purposes we summarize the specific form of an entropy conservative numerical flux for the one dimensional compressible Euler equations with the ideal gas assumption due to Chandrashekar [81]. First, we introduce notation for the arithmetic mean and the logarithmic mean for two quantities a and a + : Also, we introduce a variable proportional to the inverse of the temperature which simplifies the form of the entropy variables Eq. 36 to be Then, applying the entropy conservation condition Eq. 83 and many algebraic manipulations later, we arrive at an analytical expression of an entropy conservative numerical flux for the compressible Euler equations So far, the discussion on entropy conservative numerical approximations has all been in the context of low order finite volume methods. It is possible to create a high order entropy aware scheme with ENO [148] or WENO type reconstructions [149]. However, as mentioned above, a critical and remarkable result of Fisher's work is that a low order finite volume entropy conservative scheme can be extended to an arbitrarily high order accurate spatial scheme, when it is based on diagonal norm SBP operators [101]. As was described for Burgers' equation in the last two observations in Section 4.1, the crucial part is to move the contribution to the entropy production from the volume terms to the surface via a discrete version of the chain rule. Once stability is only governed by surface contributions, it is possible to control the stability with a proper choice of numerical interface flux function. In this sense, the analysis of the high order scheme reduces to a similar problem as for the low order finite volume method Eq. 83 and the well-established theoretical analysis tools and results can be reused. It is worth mentioning that f *,EC is not unique and that a particular choice of the entropy conservative flux generates a different split form of the governing equations, e.g., [144,150,151]. We return to a strong form DG approximation of a nonlinear hyperbolic system of conservation laws that takes a form identical to Eq. 38 but now only considered in one spatial dimension Here the standard tools of a nodal DG approximation have been applied: 1. The solution and physical fluxes are approximated with polynomials. 2. Any integrals in the variational formulation are approximated with high order LGL quadrature. 3. The interpolation and quadrature nodes are collocated.
Importantly, these steps mean that the discrete DG differentiation matrix is a SBP operator. Furthermore, we use the entropy conservative numerical surface flux at the interface and the particular discrete derivative projection D EC defined in Eq. 76 in the volume contribution to arrive at the entropy conservative DG approximation Now, if we take the test function ϕ W with W j w(U j ) evaluated at each LGL node x j , we obtain assuming continuity in time. Also, the discrete differentiation operator D EC moves volume information onto the boundary, see Refs. 32, 101, and 125 for complete details, such that We note that this remarkable property holds for general nonlinear systems with available entropy estimate and a corresponding low order entropy conserving flux f *,EC . Combining this with the definition of the entropy conservative flux Eq. 83, the discrete entropy evolution of the DGSEM-LGL becomes which is the discrete analogue of the integral form of the entropy conservation law discussed in Section 2.2. The resulting DGSEM-LGL is entropy conservative by construction and it is important to note we have assumed no exactness on the integration. From this baseline entropy conservative numerical scheme, that does not dissipate entropy by construction, we can create a high order DGSEM-LGL that enforces the entropy inequality Eq. 32. We do so by introducing dissipation at the element interfaces via the choice of the numerical surface flux function, e.g., the Rusanov flux Eq. 72. More complex dissipation techniques are also available that dissipate solution information according to the different wave strengths with complete details found in, e.g., [152][153][154][155][156]. We finally note that this discussion was restricted to one spatial dimension for the sake of convenience and simplicity. Extensions to general three dimensional curvilinear coordinate systems are available, see e.g., [32,101,116,125,157,158] for details.

Validation of Robustness and Application to Space Physics of the Entropy Stable DGSEM
In this subsection, we demonstrate two exemplary simulation results of a DG scheme based on the key ideas outlined above. The general split form DGSEM with LGL nodes on three dimensional curvilinear hexahedral unstructured meshes is implemented in the open source software FLUXO (project-fluxo/fluxo at github), written in modern Fortran with a special emphasis on massively parallel CPU based hardware. The main focus of the software is on compressible Navier-Stokes and visco-resistive MHD equations. Time integration of the semi-discrete form is done with a fourth order accurate low storage Runge-Kutta method of Carpenter and Kennedy [159]. For the validation of the robustness we revisit an important numerical contribution of Moura et al. [67]. They were the first to report of a test case that the DG scheme with (numerical) exact integration was not able to run, demonstrating, that further improvement on the robustness of the DG methodology was necessary. For this validation test, we consider the compressible Navier-Stokes equations (viscous case) or the compressible Euler equations (inviscid case, basically setting the viscosity parameter to zero). The considered problem is the Taylor-Green vortex in a fully periodic domain, which serves as a test case for a fully periodic turbulent box [0,1] 3 , that starts with a smooth initial velocity field v 1 v 0 sin(2 π x 1 )cos(2 π x 2 )cos(2 π x 3 ), v 2 −v 0 cos(2 π x 1 )sin(2 π x 2 )cos(2 π x 3 ), and transitions to turbulence during its temporal evolution until it reaches a state similar to homogeneous turbulence. The initial density is uniform ρ 1 and the initial pressure is give by p p 0 + 1 16 (cos(4 π x 1 ) + cos(4 π x 2 ))(cos(4 π x 3 ) + 2), where p 0 is a background pressure and v 0 the velocity amplitude used to set the initial Mach number. In our case, we choose the Mach number to be Ma 0.1. Unresolved vortical driven flows are especially prone to the aliasing issues discussed above. The difficulty of this test case lies in its wide range of scales when the Reynolds number increases (i.e., for low viscosities), e.g., [39]. For the DGSEM discretisation, we choose the polynomial degree N and the number of grid cells N 3 Q . Thus, the total number of degrees of freedom (DOF) for one conserved quantity is For the numerical flux function at the surface, we use the Rusanov flux.
For the robustness investigation, we consider an inviscid flow (with the viscosity parameter is zero) and focus on three particular setups, where N 1,3,7 with number of elements N Q 56,28,14 respectively. This ensures that for all three computations the overall number of DOF per conserved quantity is equal, about 1.4 million. There are many more investigations of different configurations presented in Ref. 160, but they all demonstrate the same behavior: while the low order variants N 1,3 seem to be relatively robust with full integration, the higher the polynomial degree, the less stable the DG method becomes. And for the case N 7 with N Q 14 the simulation crashed at about a simulation time of t crash 8.4, even when increasing the quadrature nodes from 8 3 up to 32 3 32768 per element. In contrast, the novel entropy stable DGSEM with standard LGL nodes runs all configurations without crashing.
Furthermore, it is possible to run this challenging test case even without any artificial dissipation, i.e. with the F *,EC numerical flux instead of the Rusanov flux function. This is very interesting, as it allows us to fully observe and control the artificial numerical dissipation generated by the scheme. To demonstrate this, we consider again the Taylor-Green vortex test case, but this time with non-zero viscosity such that the Reynolds number is Re 1600. In this Navier-Stokes case, it is possible to relate the kinetic energy decay over time with the temporal behavior of the enstrophy to get an estimate for the Reynolds number. In theory, this should be Re 1600 for the simulation. In practice, the finite resolution causes numerical errors such as dispersion and dissipation, e.g., [55]. We present two results in Figure 1 for the viscous test case with N 7 and N Q 8.
We note, that this test case would crash for the standard DG scheme, however for the presented novel DGSEM-LGL with the proper discrete chain rule, it runs with the dissipative Rusanov numerical flux F *,Rusanov (entropy stable DGSEM-LGL) and even with the non-dissipative numerical flux F *,EC (entropy conservative DGSEM-LGL). After an initial transition zone, the entropy conservative scheme retains the physical Reynolds number remarkably well with Re numerical 1600 and the simulation is virtually dissipation free throughout the temporal evolution. The entropy stable variant clearly introduces stabilizing dissipation as soon as the spatial scales can no longer be resolved. It is interesting to note, that these results hint toward the possibility of quantifying and controlling the artificial dissipation of the DGSEM for under resolved turbulence and use this to construct high fidelity turbulence models, see e.g., for a proof of concept [161].
For an exemplary application we consider a complex test case from space physics. We focus on the electrodynamic and plasma interaction of the moon Io with the strong magnetic field of Jupiter. Io is embedded in a dense plasma torus, induced by the magnetosphere of Jupiter, and it exhibits interesting plasma flow characteristics containing steep gradients and discontinuities [162]. The general problem setup is illustrated in Figure 2. Neglecting neutral density, relativistic, viscous, resistive, and Hall effects, this MHD flow within Io's plasma torus can be modeled with the ideal MHD equations. For such magnetohydrodynamic flows, the entropy stable DGSEM solver in FLUXO uses a hyperbolic divergence cleaning mechanism to enforce the divergence-free constraint on the magnetic field variables B → [125]. Additionally, the solver must be augmented with a shock capturing technique in order to handle strong discontinuities [163].
As Io orbits Jupiter, plasma from the torus streams in and forms a local ionosphere that induces polarisation charges and modifies the electric field, thus changing the local Lorentz force and damping electron and ion flow close to Io. The plasma flow is strongly reduced inside Io's (very weak) atmosphere, which can be modeled by incorporating a neutral collision source term to the ideal MHD system, Saur et al. [162,164], with the collision frequency where ϖ in > 0 is constant. The inner atmosphere of Io is represented as a neutral gas cloud U. In order to model the ionosphere, we also introduce a smooth transition area T by an exponential blending dependent on the radii r U , r and the dilatation factor d. In this region the neutral atmosphere thins causing ionospheric conductivities to shrink such that they are no longer able to maintain the ionospheric current perpendicular to the magnetic field. Eventually, electric current is continued along the magnetic field lines out of Io's ionosphere, where it is finally fed into Io's Alfvén wings as illustrated in Figure 3.
In the dimensionless computational domain we scale the radius of the atmosphere to take a spherical shape with radius one and locate the center of the sphere at the origin The ionospheric processes uses the exponential blending of the collision frequency above with a dilatation factor d 150/1820. The initial conditions for the flow are taken as that will evolve to a final time T 5. The gas constant is taken to be c 5/3. The boundary states at the left, front and back boundary faces are constant to this reference state, whereas we define outflow boundary conditions at the right, top and bottom of the domain. In anticipation to capture the relevant physical interactions at the sphere as well as the development of the Alfvén wings best, we exploit the geometric flexibility of the entropy stable DGSEM solver and divide the computational domain into an unstructured, curvilinear mesh presented in Figure 4. Within each element we use a polynomial order of N 3.
In Figure 5 we show a 2D-slice of the B 1 and v 1 components at y 0 of the entropy stable approximation at the final time which presents the numerically generated Alfvén wings from the entropy stable DGSEM. It also demonstrates the expected positive correlation of the B 1 variable with the velocity variable v 1 in the northern Alfvén wing and the negative correlation in the southern wing. Moreover, we consider profile slices in these B 1 and v 1 along the line z 5 and compare the results to a solution computed by the open source software ZEUS (www.astro. princeton.edu/jstone/zeus.html) in Figure 6. ZEUS is a FV solver written in spherical coordinates that uses explicit time integration. For the presented comparison, ZEUS used 10 million grid cells (DOF) in total whereas the entropy stable DGSEM solver used approximately 14,336 elements with N 3 polynomials and a total of about 1 million DOF. The reduction of DOFs also translated in a nearly 10 fold reduction of overall CPU time when both codes were run in parallel using MPI on 100 cores. This increased efficiency of the high order DGSEM-LGL, the increased robustness due to discrete entropy stability, and the geometrical flexibility are several advantages of this novel DG framework.

WHERE TO GO NEXT?
The response of the DG community to the split form DGSEM with LGL quadrature on tensor-product hexahedra has been astounding. However, naturally, there are still many limitations of this method. Some that have been recently addressed and many others still open. So far, we discussed semi-discrete DGSEM-LGL variants with tensor product expansions on possible curvilinear unstructured hexahedral meshes. Direct extensions of this variant include nonconforming meshes [166,167], moving meshes [168][169][170], different related versions such as e.g., the line DG method [171], and a fully discrete space-time approach without the assumption on time continuity [172][173][174][175]. An exciting recent development are explicit modified Runge-Kutta methods that retain the semi-discrete entropy stability estimates [176].
A downside of LGL is that in comparison to the Legendre-Gauss (LG) points, the accuracy in dispersion and dissipation is lower, see e.g., [56]. Unfortunately, LG points do not include the boundary nodes, hence they do not directly satisfy the classic SBP property and the presented developments cannot be directly applied to this case. However, there are several developments where the framework was extended to construct entropy stable variants with LG nodes. One is based on a staggered grid approach by Parsani et al. [176]. The authors used the standard LG nodes to span the solution, however to compute the discrete derivative operator, they interpolate onto a higher order staggered LGL grid where they can use the SBP property. Then they ensure that the back-projection is still entropy stable to retain the stability estimate on top of the accuracy of the LG nodes. Another approach is presented by e.g., Ortleb [177] where the author directly constructed a scheme that preserves the kinetic energy with LG nodes. Although SBP could not directly be applied, the difference lies only in the boundary operator B. For classic SBP B is diagonal, whereas in case of LG nodes B has some columns filled. Ortleb fixed this by considering special correction terms at the boundary, while using similar ideas as in the LGL case for the volume terms. In his PhD thesis, Fernández [178] extended the classic SBP property to general node sets that do or do not include the boundary nodes, where all grid nodes lie inside or even outside of the considered domain.
A generalization onto multi-dimensional domains is given in e.g., [179], termed multidimensional SBP operators. This is an interesting development, as it resolves another limitation of the classic DGSEM-LGL. In some applications, it is favourable to have more flexibility when generating meshes for geometries with complex shapes. In this case, meshes with simplex element types such as triangles and tetrahedra or even hybrid element types such as prisms and pyramids are desired. We will mention some recent developments and extensions here, but stress that this list is not complete and there are many more. Returning to the multidimensional SBP framework, this approach's strength is that it can be used to construct stable methods on simplex meshes e.g., by Chan [115,180], Chen and Shu [132], Hicken et al. [181], and Crean et al. [182]. Another interesting approach to generate DG scheme on general meshes is presented by Chan [115]. He shows that a special projection directly with the entropy variables collocated at the grid nodes can give a SBP type property that can be used to construct stable discretisations.
As stated the response of the DG community has been astonishing with an explosion of developments, extensions, and new insights. However, there are still many unresolved issues that need to be researched in the future to further evolve high order DG methods into a viable tool for computational physics. The most important problem is still robustness. Although entropy stability significantly improves the stability of the scheme in many applications, there are still situations where no significant gain in robustness of the DG scheme can be observed [183,184]. A possible reason could be, for instance, the incorrect choice of mathematical entropy function. While there is typically only one physical (thermodynamic) entropy, there are many mathematical ones that often lead to a stability estimate in corresponding norms of the solution. So what are the important entropy quantities to consider? What about e.g., kinetic energy, cross helicity and FIGURE 3 | Standing Alfvén current tube magneto-spheric disturbances, termed Alfvén wings, that spread and propagate away along field lines in both directions. The development of the Alfvén wings are observable in the regions with decreasing plasma bulk velocity v 1 and perturbed magnetic field B 1 north and south of Io in the xz plane. It is known that the Alfvén current tubes are bent back by a constant angle with respect to the unperturbed background magnetic field. Further, the perturbation of the magnetic field is positively correlated with the perturbation of the velocity field in the northern Alfvén wing and negatively correlated in the southern wing [133]. FIGURE 5 | Alfvén wings numerically computed with the entropy stable DGSEM for the plasma interaction of a spherical gas cloud. The snapshot is a slice in the xz plane at y 0 at the final time T 5. The polynomial order in each spatial direction was N 3 in each spatial direction. As expected, the Alfvén wings evolve from the northern and southern poles of the neutral gas cloud and are bent back by a constant angle with respect to the background magnetic field. This bending was taken into account in the construction of the curvilinear mesh.
Frontiers in Physics | www.frontiersin.org related quantities? For example, in turbulence, the kinetic energy and proper prediction of its behavior seems to play an important role [144,161,185,186]. Besides getting discrete entropy stability estimates, it is possible to use the split form DGSEM-LGL approach from Section 4.1 to construct DG schemes that discretely preserve the kinetic energy and are discretely compatible for the inner and the total energy, e.g., [116]. It could be shown in Refs. 160 and 161 that such kinetic energy preserving DG schemes behave favourably for the simulation of compressible turbulence in comparison to the standard DG, especially in combination with subgrid turbulence models. However, just as in the development of entropy conservative (and in turn entropy stable) numerical flux functions there is nonuniqeuness. There exist many possible solutions to create a numerical flux that is e.g., kinetic energy preserving or entropy conservative or both e.g., [81,83,144,185,187]. Moreover, the discrete behavior of the kinetic energy as it evolves in time for under-resolved turbulent flows is quite different even between fluxes that are all provably kinetic energy preserving on paper [116,187,188]. So, an important question for the future is thus: What are the important quantities not only from a mathematical, but also from a physical point of view?
A fundamental issue in this context are problem setups that involve physical discontinuities e.g., shock waves in the compressible Euler or ideal MHD equations. Discontinuities trigger another instability, inherent in high order methods: the Gibbs phenomenon i.e., numerical oscillations. These oscillations can be devastating as under-or overshoots can cause non-physical state solutions e.g., negative density or pressure. Hence, positivity is a necessary criterion for all numerical methods when simulating such problems. However, up to this date there is still not enough research into the topic of entropy stability and positivity e.g., [40]. It is worth pointing out, that mathematically, the entropy function is only well-defined for positive solutions and, hence, is strongly connected to positivity. Generalizing this discussion, it is evident that entropy stability is "not enough" as a property for the numerical method. We need more properties, such as e.g., positivity. However, this is also where we reach uncharted research territory as even for many continuous problems e.g., the compressible Navier-Stokes equations it is up to this point unclear to show positivity even for the model itself.
The overview in this work focused on the volume contributions and the underlying tools (physical and mathematical) which led to the entropy stable DG method. However, the contributions at the physical boundaries have been ignored. Properly posing the boundary conditions to be entropy stable for a given model, like the compressible Euler or Navier-Stokes equations, has been considered [189][190][191][192][193][194][195], but this is remains an active area of research particularly because the treatment and behavior of the solution (whether on the continuous or discrete level) is directly related to the validity of a mathematical PDE model and directly tied into issues of well-posedness.
Concluding, we are currently at an exciting development stage with high order DG methods, where we can mimic important continuous stability estimates by careful construction of discrete operators. However, practical simulations show that these are not enough for the most complex problems that we desire to simulate. Plus, our numerical schemes and their properties are very close to the current analytical knowledge we have about the physical models. It is very hard to progress with the numerical developments further than what is analytically known: Which properties are important? How do you show positivity? Or a more general question: How does one prove physicality of the solutions for a given PDE model? It seems that the answers can only be given in close collaboration of researchers from physics and mathematics. FIGURE 6 | One dimensional visualization of the Alfvén wing solution along the line z 5 from the xz plane slice at y 0. A comparison is performed between the entropy DG solver with N 3 in each spatial direction and the first order finite volume solver ZEUS. The entropy stable DG approximation uses 90% fewer DOF compared to the 10 million DOF used for the ZEUS computation. Qualitatively, the solutions are very similar.