Edited by: Philippe Renard, Université de Neuchātel, Switzerland
Reviewed by: Guillaume Pirot, Universität Lausanne, Switzerland; Roseanna M. Neupauer, University of Colorado System, United States
This article was submitted to Freshwater Science, a section of the journal Frontiers in Environmental Science
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
We present a generalized hybrid Monte Carlo (gHMC) method for fast, statistically optimal reconstruction of release histories of reactive contaminants. The approach is applicable to large-scale, strongly nonlinear systems with parametric uncertainties and data corrupted by measurement errors. The use of discrete adjoint equations facilitates numerical implementation of gHMC without putting any restrictions on the degree of nonlinearity of advection-dispersion-reaction equations that are used to describe contaminant transport in the subsurface. To demonstrate the salient features of the proposed algorithm, we identify the spatial extent of a distributed source of contamination from concentration measurements of a reactive solute.
An accurate reconstruction of the release history of contaminants in geophysical systems is essential to regulatory and remedial efforts. These efforts rely on measurements of pollutant concentration to identify the sources and/or release history of a pollutant. Unfortunately, available concentration data are typically sparse in both space and time and are corrupted by measurement errors. Source identification and reconstruction of release history are further complicated by both spatial heterogeneity of model parameters and their insufficient characterization, although we do not consider these effects in the present work.
Detailed reviews of the historic developments and state-of-the-art in the field of inverse modeling as related to contaminant source identification are presented in Atmadja and Bagtzoglou (
While existing probabilistic approaches, such as random walk particle tracking for the backward transport equation (Bagtzoglou et al.,
Finally, most existing approaches to the reconstruction of release history are restricted to linear transport phenomena, that is, transport phenomena for which the transport equation is linear in the concentration, and thus are limited to migration of contaminants that are either conservative (all the references above) or exhibit first-order (linear) reaction rates (Neupauer and Wilson,
Purely statistical approaches to history reconstruction, such as the geostatistical inversion with Bayesian updating (Snodgrass and Kitanidis,
We present an optimal reconstruction of contaminant release history that fully utilizes all available information and requires neither the linearity of governing transport equations nor the Gaussianity of the underlying fields. In section 2 we formulate the problem of reconstructing the contaminant release history from noisy observations. Section 3 introduces our general computational framework, which is further implemented in section 4 for various examples.
We consider migration of a single chemically active contaminant in a porous medium Ω ⊂ ℝ
together with corresponding boundary conditions. Here
Introducing the dimensionless quantities
and the normalized reaction term
The non-dimensionalization of the reaction term is specific to its functional form. The non-dimensionalization of a particular case employed in this manuscript is discussed in section 4. For all the following discussions we will consider the dimensionless Equation (2) and we will drop the prime notation for denoting dimensionless quantities.
For the remainder of this work we assume that
Concentration measurements are corrupted by measurement errors. We assume that the measured concentrations
where the errors ϵ
In a typical situation, one has prior information (or a belief) about potential sources of contamination (a region Ω
To simplify the exposition, we assume a spatially distributed chemical release at time
where
where
This formulation assumes that the measurement errors ϵ
The contribution of highly fluctuating or unphysical initial conditions is reduced by adding a regularization term
where
and the weight γ > 0 is a tuning hyperparameter. The regularization term
A conceptual difference between our approach and maximum likelihood methods is worth emphasizing. Rather than sampling the Gibbs distribution exp{−
In principle, one can sample the Gibbs distribution by using Markov-chain Monte Carlo (MCMC) (e.g., Michalak and Kitanidis,
Hybrid Monte Carlo (HMC) refers to a class of methods that combine Hamiltonian molecular dynamics with Metropolis-Hastings Monte Carlo simulations (see Neal,
In HMC one treats the log-likelihood function
At any given time, the state of the system is completely described by (
and the total Hamiltonian of the system is
It follows that the Hamiltonian dynamics are given by
where
for
The remaining part of HMC consists of deciding whether to accept or reject the new state
The momenta variables
As we have noted before, the method samples from the multivariate target distribution, ~ exp(−Ĥ), by computing a Markov chain. Sampling from this density allows us to estimate the mean state (reconstructed initial configuration) and its variance. Markov chain sampling from the posterior distribution involves a
In many cases, the generalized hybrid Monte Carlo (gHMC) of Toral and Ferreira (
where
The two formulations, (9) and (12), are identical if
In order to illustrate how the introduction of the matrix
so that the forcing is given by −
The disadvantage of such a configuration is that too long of a step for a certain component might increase the total Hamiltonian enough to produce a rejection according to (11). If the rejection rate of the chain is too large, one would have to reduce δτ, which affects all components. The issue of the rejection rate would be addressed, but then some components would be updated with very short steps, increasing their correlation. To solve this issue, one can remove the appearance of
Unfortunately, in general, the Hamiltonian (6b) for our problem does not have a simple bilinear form for which an appropriate selection of acceleration matrix
In this section we illustrate how the framework outlined above can be applied to source identification problems. In the first case we study the implementation of HMC to contaminant transport problems with a nonlinear reaction term for different configurations of observations. In the second case we study a linear advection-dispersion problem and explore possible selections of the gHMC acceleration matrix.
We consider a one-dimensional transport with uniform velocity
The Hamiltonian corresponding to this setup is
It is evaluated, together with its sensitivity with respect to the initial condition, by using a method-of-lines discretization of the concentration field
To partially alleviate this problem, we use a single-step ODE integration scheme for the forward problem and compute the sensitivity of
and
respectively, where d
This implies that the sensitivities can be evaluated by repeatedly computing products of the form (d
We test this formulation on a one-dimensional transport problem defined in the domain [0,
corresponding to a nonlinear heterogeneous (precipitation/dissolution) reaction with equilibrium concentration
Breakthrough curves of contaminant at observation locations along the transport domain. Noisy measurements used for inference are shown in dashed.
The release configuration is inferred using the HMC scheme, carried out with hybrid timestep Δτ = 0.17, number of leapfrog steps
where
Reconstruction of release profile via HMC.
In order to study the construction of an acceleration matrix
with uniform coefficients
and (unknown) initial condition
Similar to the case studied in section 4.1, available concentration data consist of a set of measurements continuous in time on the interval (0,
The state variable
which defines the likelihood of the measurements given an initial release vector
which defines the
Let
where
and (·)* denotes the Hermitian adjoint. By projection, (16) is discretized into the set of uncoupled ODEs for the Fourier modes
with initial conditions
Substituting (22) and (24) into (19) yields the following expression for the measurement Hamiltonian:
where
If the measurements are available at every node of the computational domain, i.e., if
which is equivalent to
where the coefficients ĝ
Note that all coefficients ĝ
It follows from (25) that
where
An empirical study of the SVD decomposition
The regularization Hamiltonian extends the distributions of
where γ is a regularization hyperparameter and
or
where
Note that a Fourier-space discretization of the regularization Hamiltonian leads to a similar bilinear form for
For the full Hamiltonian
where
The added cost of computing the acceleration matrix
An advantage of the Cholesky factorization is that the vector of momenta
A less computationally costly alternative for the construction of
which gives
Note that we have replaced the transpose of
Since
In order to retain the validity of the leapfrog method with generalized momenta, we require said momenta to be associated with a kinetic energy following a bilinear form. We achieve this by assuming that
For
where
Let
since both
It follows from (32) that only the components
Hence the vector
We apply the gHMC algorithm to the model problem (16) with parameters
Breakthrough curves of contaminant at observation locations along the transport domain. Noisy measurements used for inference are shown in dashed lines.
Reconstruction of release profile via gHMC and
Next, we study the effect of the choice of acceleration matrix
where ρ(
We compute 10 gHMC chains for each choice of
Autocorrelation functions for
The maximum effective sample size reduction ratio η for each choice of
Effective sample size reduction ratio η for various choices of regularization parameter γ and acceleration matrix
1 × 10−2 | 3.4 | 20.7 | 22.3 |
1 × 10−3 | 4.1 | 102.8 | 161.2 |
For problems with reaction terms, the forcing
Such an approximation can be obtained by disregarding the nonlinear reaction term and using
We presented a computationally efficient and accurate algorithm for identification of sources and release histories of (geo)chemically active solutes. The algorithm is based on a generalized hybrid Monte Carlo approach, in which MC sampling is accelerated by the use of discrete adjoint equations. Some of the salient features of our approach are: (1) its ability to handle nonlinear systems, since it requires no linearizations, and (2) its compatibility with various regularization strategies.
The introduction of an acceleration matrix to the gHMC scheme was tested for an advection-dispersion problem. While the example presented was limited to one-dimensional domains, periodic boundary conditions, and homogeneous porous media, our analysis demonstrated that the proposed acceleration matrices improve upon basic HMC; therefore, we consider the proposed acceleration strategy to be promising. The generalization of these constructions to problems with nonlinear reaction terms, two- and three-dimensional, heterogeneous media, and non-periodic boundary conditions, will be the subject of future work.
Finally, we note the importance of considering the heterogeneity of flow and transport parameters, such as the hydraulic conductivity and dispersion coefficient tensors, for source identification tasks (Xu and Gömez-Hernández,
DB-S implemented the presented algorithms in MATLAB. All authors contributed equally to the preparation of this manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The handling editor is currently co-organizing a Research Topic with one of the authors DT, and confirms the absence of any other collaboration.
We thank N. Gulbahce and T. Bewley for fruitful discussions.
For the problem in section 4.1 we use the linearized Runge-Kutta (Rosenbrock) method ROS2 of Verwer et al. (
We assume that the advection-dispersion-reaction equation can be discretized into an autonomous system of ODEs
where
where
The discussion in section 4.1 led us to conclude that it is necessary to compute products of the form (d
The next task is to derive formulas for the Jacobians of the stage derivatives
and
with
The computation of the products