On the Efficiency of Covariance Localisation of the Ensemble Kalman Filter Using Augmented Ensembles

Farchi, Alban; Bocquet, Marc

doi:10.3389/fams.2019.00003

ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 26 February 2019

Sec. Dynamical Systems

Volume 5 - 2019 | https://doi.org/10.3389/fams.2019.00003

This article is part of the Research Topic: Data Assimilation of Nonlocal Observations in Complex Systems

A commentary has been posted on this article:

Commentary: On the Efficiency of Covariance Localisation of the Ensemble Kalman Filter Using Augmented Ensembles

Alban Farchi^*

Marc Bocquet

CEREA, Joint Laboratory École des Ponts ParisTech and EDF R&D, Université Paris-Est, Champs-sur-Marne, France

Localisation is one of the main reasons for the success of the ensemble Kalman filter (EnKF) in high-dimensional geophysical data assimilation problems. It is based on the assumption that correlations between variables of a dynamical system decrease at a fast rate with the physical distance. In the EnKF, two types of localisation methods have emerged: domain localisation and covariance localisation. In this article, we explore possible implementations for covariance localisation in the deterministic EnKF using augmented ensembles in the analysis. We first discuss and compare two different methods to construct the augmented ensemble. The first method, known as modulation, relies on a factorisation property of the background covariance matrix. The second method is based on randomised singular value decomposition (svd) techniques, and has not been previously applied to covariance localisation. The qualitative properties of both methods are illustrated using a simple one-dimensional covariance model. We then show how localisation can be introduced in the perturbation update step using this augmented ensemble framework and we derive a generic ensemble square root Kalman filter with covariance localisation (LEnSRF). Using twin simulations of the Lorenz-1996 model we show that the LEnSRF is numerically more efficient when combined with the randomised svd method than with the modulation method. Finally we introduce a realistic extension of the LEnSRF that uses domain localisation in the horizontal direction and covariance localisation in the vertical direction. Using twin simulations of a multilayer extension of the Lorenz-1996 model, we show that this approach is adequate to assimilate satellite radiances, for which domain localisation alone is insufficient.

1. Introduction

The ensemble Kalman filter (EnKF, [1]) is an ensemble data assimilation (DA) method that has been applied with success to a wide range of dynamical systems in geophysics (see for example, [2, 3]). When the ensemble size is small, ensemble estimates are unreliable, which is why localisation techniques have been introduced in the EnKF. With a chaotic model, Bocquet and Carrassi [4] have shown that localisation is necessary when the ensemble size is smaller than the number of unstable and neutral modes of the dynamics.

Localisation is based on the assumption that correlations between variables in a dynamical system decrease at a fast rate with the physical distance. This assumption is used either to make the assimilation of observations local (domain localisation; [5, 6]) or to artificially taper distant spurious correlations (covariance localisation, [7]). Although both approaches are based on the same rationale, each one has its own implementation and leads to different algorithms. EnKF algorithms using domain localisation are by construction embarrassingly parallel but cannot assimilate non-local observations without ad hoc approximations. By contrast, EnKF algorithms using covariance localisation rely on a single global analysis with a tapered (localised) background covariance, which is more complex to implement especially in a deterministic context. In practice, the two approaches coincide in the limit where the analysis is driven by the background statistics [8] and could differ otherwise.

The ability to assimilate non-local observations becomes increasingly important with the prominence of satellite observations [9]. EnKF algorithms using domain localisation have been adapted to the case of satellite radiances [see e.g., 10, 11]. In these algorithms, the shape of the weighting function associated to a specific satellite channel is used to give an approximate location to this channel (usually the function mode). However, using a realistic one-dimensional model with satellite radiances, Campbell et al. [9] have shown that this approach systematically yields higher errors than covariance localisation.

In this article, we focus on the implementation of covariance localisation in the deterministic EnKF and we put forward two difficulties: how to construct an accurate representation of the localised background covariance and how to efficiently update the perturbations using this representation.

Regarding the first issue, the EnKF literature shows a growing interest in using augmented ensembles during the analysis step, that is when the ensemble size during the analysis step is larger than during the forecast step. Buehner [12] has proposed a method to construct a modulated ensemble that follows the localised background covariance based on a factorisation property shown by Lorenc [13]. This method has then been built upon by Bishop and Hodyss [14] and used in the literature to perform covariance localisation [15–19]. With an alternative point of view, Kretschmer et al. [20] have included localisation in the ensemble transform Kalman filter (ETKF, [21]) by means of a climatologically augmented ensemble. Finally, Lorenc [22] has shown that the background error covariance matrix can be improved in hybrid ensemble variational data assimilation systems by using time-lagged and time-shifted perturbations.

Another possibility would be to construct the augmented ensemble using randomised singular value decomposition (svd) techniques [23]. Such a method will be detailed in this article.

With an augmented ensemble, standard formulae cannot be used for the perturbation update because the number of perturbations to propagate is smaller than the augmented ensemble size. This issue is discussed by Bocquet [18], with proposed update formulae. The same update was later derived again by Bishop et al. [19] in a different but formally equivalent form. In this article, we will consider several alternatives for the perturbation update and we will eventually select the update of Bocquet [18], which we found to be the most adequate.

A brief introduction of the EnKF is given in section 2. In section 3, we explain how covariance localisation can be implemented in the deterministic EnKF using augmented ensembles. In section 4, we compare the accuracy of the methods designed to approximate the localised background covariance. Section 5 is dedicated to the numerical illustration of the resulting algorithms using the one-dimensional Lorenz 1996 [L96, 24] model. We explain in section 6 how these methods can be used to assimilate satellite radiances and give an illustration using a multilayer extension of the L96 model that mimics satellite radiances. Conclusions follow in section 7.

2. Background and Motivation

Before getting to the matter of covariance localisation, we recall the basics of the EnKF and we introduce the notation.

2.1. The Kalman Filter Analysis

Consider the DA problem that consists in estimating a state vector $x_{k} \in ℝ^{N_{x}}$ at discrete times t_k, k ∈ ℕ, through independent realisations of the observation vectors $y_{k} \in ℝ^{N_{y}}$ given by

\begin{array}{l} x_{k} = M_{k} (x_{k - 1}) + w_{k}, \quad w_{k} \sim N (0, Q_{k}), & (1) \end{array}

\begin{array}{l} y_{k} = H_{k} (x_{k}) + v_{k}, \quad v_{k} \sim N (0, R_{k}) . & (2) \end{array}

When the dynamical model $M$ and the observation operator $H$ are linear and when the initial probability density function (pdf) is Gaussian, the analysis pdf at all times is Gaussian with mean vector and error covariance matrix given by the dynamical Riccati equation.

In the following, we focus on one linear Gaussian analysis step. For simplicity we drop the time index k and the conditioning on the previous observations. The linear observation operator is written H. The background and analysis pdfs are

\begin{array}{l} p (x) = N (x | x_{f}, B), & (3) \end{array}

\begin{array}{l} p (x | y) = N (x | x_{a}, P), & (4) \end{array}

where $x_{a}$ and P are given by

\begin{array}{l} K = B H^{T} {(R + H B H^{T})}^{- 1}, & (5) \end{array}

\begin{array}{l} x_{a} = x_{f} + K (y - H x_{f}), & (6) \end{array}

\begin{array}{l} P = (I - K H) B . & (7) \end{array}
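Equations (5)–(7) can be illustrated with a minimal NumPy sketch. The function name `kalman_analysis` and the random test matrices are illustrative assumptions, not part of the original study.

```python
import numpy as np

def kalman_analysis(x_f, B, H, R, y):
    """Linear Gaussian analysis step, Equations (5)-(7)."""
    S = R + H @ B @ H.T                      # innovation covariance R + H B H^T
    # K = B H^T S^{-1}; solve() avoids an explicit inverse (B and S are symmetric).
    K = np.linalg.solve(S, H @ B).T
    x_a = x_f + K @ (y - H @ x_f)            # Equation (6)
    P = (np.eye(len(x_f)) - K @ H) @ B       # Equation (7)
    return x_a, P, K

# Arbitrary (hypothetical) test problem.
rng = np.random.default_rng(0)
n_x, n_y = 5, 3
A = rng.standard_normal((n_x, n_x))
B = A @ A.T + np.eye(n_x)                    # symmetric positive definite background covariance
H = rng.standard_normal((n_y, n_x))
R = np.eye(n_y)
x_f = rng.standard_normal(n_x)
y = rng.standard_normal(n_y)
x_a, P, K = kalman_analysis(x_f, B, H, R, y)
```

In this sketch the analysis covariance P remains symmetric and its trace is smaller than that of B, as expected when observations are assimilated.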

2.2. The EnKF Analysis

In the EnKF [25], the statistics are carried through by the ensemble $\{x^{i}, i = 1 \dots N_{e}\}$. Let E be the ensemble matrix, that is the N_x × N_e matrix whose columns are the ensemble members. Let $\bar{x}$ be the ensemble mean and X be the normalised anomaly matrix:

\begin{array}{l} \bar{x} = \frac{E 1}{N_{e}}, & (8) \end{array}

\begin{array}{l} X = \frac{E - \bar{x} 1^{T}}{\sqrt{N_{e} - 1}}, & (9) \end{array}

where $1 \in ℝ^{N_{e}}$ is the vector whose components are 1.
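As a short illustration of Equations (8) and (9), the following sketch computes the ensemble mean and the normalised anomalies with NumPy. The helper name `mean_and_anomalies` and the random ensemble are assumptions made for the example.

```python
import numpy as np

def mean_and_anomalies(E):
    """Ensemble mean (Equation 8) and normalised anomaly matrix (Equation 9)."""
    n_e = E.shape[1]
    x_bar = E.mean(axis=1)
    X = (E - x_bar[:, None]) / np.sqrt(n_e - 1)
    return x_bar, X

rng = np.random.default_rng(1)
E = rng.standard_normal((6, 4))              # hypothetical ensemble: N_x = 6, N_e = 4
x_bar, X = mean_and_anomalies(E)
```

By construction the anomalies sum to zero over the members, and X Xᵀ coincides with the unbiased sample covariance of the ensemble.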

The main EnKF assumptions are

\begin{array}{l} x_{f} = \bar{x}, & (10) \end{array}

\begin{array}{l} B = X X^{T} . & (11) \end{array}

This means that the background pdf $N (x_{f}, B)$ is represented by the empirical pdf $N (\bar{x}, {X X}^{T})$ . The ensemble update depends on the specific implementation of the EnKF. For example, in the ETKF it is given by

\begin{array}{l} Y = H X, & (12) \end{array}

\begin{array}{l} \bar{y} = \frac{Y 1}{N_{e}}, & (13) \end{array}

\begin{array}{l} x_{a} = \bar{x} + X {(I + Y^{T} R^{- 1} Y)}^{- 1} Y^{T} R^{- 1} (y - \bar{y}), & (14) \end{array}

\begin{array}{l} X_{a} = X {(I + Y^{T} R^{- 1} Y)}^{- \frac{1}{2}} U, & (15) \end{array}

\begin{array}{l} E \leftarrow x_{a} 1^{T} + \sqrt{N_{e} - 1} X_{a}, & (16) \end{array}

where U is an arbitrary orthogonal matrix that verifies $U 1 = 1$ and the $\frac{1}{2}$ exponent denotes the square root for diagonalisable matrices with non-negative eigenvalues, defined as follows. Let M be the matrix GDG⁻¹ with G an invertible square matrix and D a diagonal matrix with non-negative elements. We define its square root to be the matrix $M^{\frac{1}{2}} = {G D}^{\frac{1}{2}} G^{- 1}$ .

2.3. Rank Deficiency of the EnKF

The empirical error covariance matrix XX^T has rank limited by N_e−1, which is probably too small to accurately represent the full error covariance matrix of a high-dimensional system where N_x ≫ N_e. Indeed, when the ensemble is too small, the empirical error covariance matrix is characterised by large sampling errors, which often take the form of spurious correlations at long distance.

To fix this, covariance localisation uses an N_x × N_x localisation matrix ρ and regularises the background error empirical covariance matrix by a Schur (element-wise) product

\begin{array}{l} B = ρ \circ (X X^{T}) . & (17) \end{array}

The localisation matrix ρ is a short-range matrix that describes the correlation in the physical domain. If ρ is positive definite, then B is positive semi-definite and therefore can be used as a covariance matrix [26]. This new background covariance matrix also has the following desirable properties: (i) if ρ is short-range then spurious correlations at long distance are removed in B and (ii) the rank of B is no longer limited by N_e − 1. In practice, B is always full-rank (and hence positive definite).
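The effect of Equation (17) on the rank can be checked numerically. The sketch below is only an illustration: the exponential taper used for ρ and all sizes are hypothetical choices, not the localisation function used in this article.

```python
import numpy as np

n_x, n_e = 40, 5
rng = np.random.default_rng(2)
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)

# Hypothetical short-range localisation matrix: exponential taper of the periodic distance.
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 3.0)

B_ens = X @ X.T                 # empirical covariance, rank at most N_e - 1
B_loc = rho * B_ens             # Schur (element-wise) product, Equation (17)

rank_ens = np.linalg.matrix_rank(B_ens)
rank_loc = np.linalg.matrix_rank(B_loc)
min_eig_loc = np.linalg.eigvalsh(B_loc).min()
```

With these settings the empirical covariance has rank N_e − 1 whereas the localised covariance is full-rank and positive definite.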

3. Implementing Covariance Localisation in the Deterministic EnKF

3.1. Principle

Covariance localisation as presented in Equation (17) is formulated in state space, while the ETKF ensemble update (Equations 12–16) is formulated in ensemble space. As is, these two approaches are irreconcilable. However, there are other variants of the deterministic EnKF in which covariance localisation can be easily integrated. In the “deterministic ensemble Kalman filter” [27], the ensemble update is based on the Kalman gain Equation (5) only, where covariance localisation can be included in B. In the serial ensemble square root filter [28], the ensemble update is based on a modified scalar Kalman gain, for which the localisation matrix ρ can be applied entry-wise.

Another possibility to include covariance localisation in the EnKF is to use augmented ensembles. In this case, the EnKF analysis step would be split into three sub-steps:

1. compute a set of ${\hat{N}}_{e}$ perturbations $\hat{X}$ such that $B = ρ \circ {X X}^{T} \approx \hat{X} {\hat{X}}^{T}$ ;

2. apply an EnKF analysis step (e.g. the ETKF) using the background state $\bar{x}$ and the ${\hat{N}}_{e}$ perturbations $\hat{X}$ to compute the analysis state $x_{a}$ and the ${\hat{N}}_{e}$ perturbations ${\hat{X}}_{a}$ ;

3. form N_e updated members using the analysis state $x_{a}$ and the ${\hat{N}}_{e}$ analysis perturbations ${\hat{X}}_{a}$ .

Augmented ensembles are currently used in operational centres to implement localisation in four-dimensional ensemble variational methods [29–31].

In section 3.2, we present different methods that can be used to construct the augmented ensemble (sub-step 1) and in section 3.3 we discuss potential implementations of the perturbation update within this augmented ensemble context (sub-step 3).

3.2. Approximate Factorisation of the Prior Covariance Matrix

3.2.1. Mathematical Goal

Given the prior covariance matrix B = ρ ∘ (XX^T), we want an $N_{x} \times {\hat{N}}_{e}$ matrix $\hat{X}$ such that

\begin{array}{l} \hat{X} {\hat{X}}^{T} \approx B, & (18) \end{array}

\begin{array}{l} \hat{X} 1 = 0 . & (19) \end{array}

Note that, although $\hat{X}$ represents a set of ${\hat{N}}_{e}$ perturbations, we call it “augmented ensemble” for simplicity.

3.2.2. Approximation via Modulation

Suppose that we have an N_x × N_m matrix W such that ρ ≈ WW^T. We define the N_x × N_mN_e matrix WΔX by

\begin{array}{l} {[W Δ X]}_{n}^{j N_{e} + i} = {[W]}_{n}^{j} {[X]}_{n}^{i}, & (20) \end{array}

which is a mix between a Schur product (for the state variable index n) and a tensor product (for the ensemble indices i and j). WΔX is the modulation product of X and W. As shown by Lorenc [13]:

\begin{array}{l} (W Δ X) {(W Δ X)}^{T} = (W W^{T}) \circ (X X^{T}) . & (21) \end{array}

Moreover, it is easy to verify that, as long as $X 1 = 0$ , we have $(W Δ X) 1 = 0$ . Therefore, $\hat{X} = W Δ X$ is a solution to Equations (18) and (19) with ${\hat{N}}_{e} = N_{m} N_{e}$ perturbations.
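The modulation product of Equation (20) and the factorisation property of Equation (21) can be sketched as follows. The function name `modulation_product` and the random matrices W and X are illustrative assumptions; indices are zero-based in the code.

```python
import numpy as np

def modulation_product(W, X):
    """Modulation product of Equation (20): column j*N_e + i holds W[:, j] * X[:, i]."""
    n_x, n_m = W.shape
    n_e = X.shape[1]
    # Broadcasting builds an (N_x, N_m, N_e) array; C-order reshape maps (j, i) to j*N_e + i.
    return (W[:, :, None] * X[:, None, :]).reshape(n_x, n_m * n_e)

rng = np.random.default_rng(3)
n_x, n_m, n_e = 10, 3, 4
W = rng.standard_normal((n_x, n_m))
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)   # centred anomalies, X 1 = 0

X_hat = modulation_product(W, X)
lhs = X_hat @ X_hat.T                 # (W Δ X)(W Δ X)^T
rhs = (W @ W.T) * (X @ X.T)           # (W W^T) ∘ (X X^T), Equation (21)
```

The two sides agree to rounding error, and the modulated perturbations remain centred.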

The name “modulation” was given by Bishop et al. [19]. It stems from the fact that the columns of W should be the main modes of ρ.

Using Equation (20), we conclude that $\hat{X}$ is constructed with complexity:

\begin{array}{l} T_{mod} = O (N_{x} {\hat{N}}_{e}) . & (22) \end{array}

In this equation, we excluded the cost of computing the matrix W. Indeed, if the localisation matrix ρ is constant in time, then the same matrix W can be used for all analysis steps and it only needs to be computed once. A fair comparison with the other methods must take into account this fact.

One remaining question is: how large must N_m be? This question is widely discussed in the literature related to principal component analysis (see e.g., [32]). However, its answer highly depends on the spatial structure of ρ itself. In the numerical experiments of sections 4, 5, and 6, we illustrate how our performance criterion depends on the number of modes N_m. Yet at this point, it is not clear which degree of accuracy we need for the factorisation of B. Finally note that, in high dimensional spaces, W can be obtained using the random svd (Algorithm 2) derived by Halko et al. [23] and described in Appendix B.

3.2.3. Including Balance in the Modulation

In this section, we describe a refinement of the modulation method, based on a new idea. When there is variability between the state variables, it could be interesting to remove part of this variability by transferring it to W as follows. Let Λ be the N_x × N_x diagonal matrix containing the standard deviations of the ensemble:

\begin{array}{l} Λ = {[diag (X X^{T})]}^{\frac{1}{2}} . & (23) \end{array}

The background error covariance matrix can then be written

\begin{array}{l} B = (Λ ρ Λ) \circ \left[ (Λ^{- 1} X) {(Λ^{- 1} X)}^{T} \right] . & (24) \end{array}

Suppose now that the N_x × N_m matrix W verifies ΛρΛ ≈ WW^T. If we have an N_x × (N_m + δN_m) matrix W₊ such that $W_{+} W_{+}^{T} \approx ρ$ , then W can be constructed as the N_m main modes of ΛW₊. For the same reason as in section 3.2.2, $\hat{X} = W Δ (Λ^{- 1} X)$ is a solution to Equations (18) and (19) with ${\hat{N}}_{e} = N_{m} N_{e}$ perturbations.

In the transformed anomalies Λ⁻¹ X, all state variables have the same variability (namely 1). The variability transfer from X to W means that the matrix W can be deformed and adapted to the current situation in order to yield a more accurate representation of the prior covariance matrix B.

Using this method, the longest algorithmic operation consists in obtaining W from the svd of ΛW₊. Therefore, $\hat{X}$ is constructed with approximate complexity:

\begin{array}{l} T_{mod, bal} = O (N_{x} {(N_{m} + δ N_{m})}^{2}) . & (25) \end{array}

Again, we excluded the cost of computing the matrix W₊ because it only needs to be computed once.
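The balance refinement can be sketched as follows, under stated assumptions: ρ is a hypothetical exponential taper, W₊ is taken from a dense eigendecomposition of ρ, and all ensemble variances are strictly positive so that Λ is invertible. The function names are illustrative.

```python
import numpy as np

def modulation_product(W, X):
    """Modulation product of Equation (20)."""
    return (W[:, :, None] * X[:, None, :]).reshape(W.shape[0], -1)

def balanced_augmented_ensemble(W_plus, X, n_m):
    """Keep the n_m main modes of Λ W₊, then modulate Λ⁻¹ X (section 3.2.3)."""
    lam = np.sqrt(np.sum(X ** 2, axis=1))            # diagonal of Λ, Equation (23)
    U, s, _ = np.linalg.svd(lam[:, None] * W_plus, full_matrices=False)
    W = U[:, :n_m] * s[:n_m]                          # W W^T approximates Λ ρ Λ
    return modulation_product(W, X / lam[:, None])

rng = np.random.default_rng(4)
n_x, n_e, n_m, delta_n_m = 30, 5, 8, 4
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 3.0)                                # hypothetical localisation matrix
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)

w, V = np.linalg.eigh(rho)                            # eigenvalues in ascending order
k = n_m + delta_n_m
W_plus = V[:, -k:] * np.sqrt(w[-k:])                  # leading N_m + δN_m modes of ρ
X_hat = balanced_augmented_ensemble(W_plus, X, n_m)
```

When all modes are retained, the construction reproduces B = ρ ∘ (XXᵀ) exactly, which is a convenient consistency check of Equation (24).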

3.2.4. Approximation via Truncated svd

Suppose that we have a truncated svd of B given by

\begin{array}{l} B = ρ \circ (X X^{T}) \approx U Σ U^{T}, & (26) \end{array}

where U is an N_x × N_m orthogonal matrix and Σ is an N_m × N_m diagonal matrix. Since B is symmetric positive definite, Equation (26) is a truncated eigendecomposition. Let $\hat{X}$ be an N_x × (N_m + 1) matrix such that

\begin{array}{l} \hat{X} {\hat{X}}^{T} = (U Σ^{\frac{1}{2}}) {(U Σ^{\frac{1}{2}})}^{T}, & (27) \end{array}

\begin{array}{l} \hat{X} 1 = 0 . & (28) \end{array}

Appendix A describes a method to construct $\hat{X}$ from $U Σ^{\frac{1}{2}}$. Then $\hat{X}$ is a solution to Equations (18) and (19) with ${\hat{N}}_{e} = N_{m} + 1$ perturbations.
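One possible way to satisfy Equations (27) and (28) is sketched below. It is an assumption made for illustration and is not necessarily the construction of Appendix A: the factor UΣ^{1/2} is multiplied by a matrix with orthonormal columns that are all orthogonal to the vector 1. The dense eigendecomposition and the exponential taper are also hypothetical choices.

```python
import numpy as np

def centred_factor(Z):
    """Return X_hat with one more column than Z, X_hat X_hat^T = Z Z^T and X_hat 1 = 0."""
    n_m = Z.shape[1]
    # Build an orthonormal basis of R^(n_m + 1) whose first vector is proportional to 1.
    M = np.eye(n_m + 1)
    M[:, 0] = 1.0
    Q, _ = np.linalg.qr(M)
    V = Q[:, 1:]                  # (n_m + 1) x n_m, orthonormal columns, V^T 1 = 0
    return Z @ V.T

rng = np.random.default_rng(9)
n_x, n_e, n_m = 30, 5, 6
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 3.0)                                 # hypothetical localisation matrix
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)
B = rho * (X @ X.T)

w, U = np.linalg.eigh(B)                               # ascending eigenvalues
Z = U[:, -n_m:] * np.sqrt(w[-n_m:])                    # U Σ^{1/2} with the N_m leading modes
X_hat = centred_factor(Z)
```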

How can we efficiently obtain the truncated svd of B in Equation (26)? Since B is an N_x × N_x matrix, its svd can be computed with complexity $O (N_{x}^{3})$ . A more adequate solution is to use the random svd (Algorithm 2) derived by Halko et al. [23]. A brief description of this algorithm can be found in Appendix B. For a detailed description, we refer to the original article by Halko et al. [23].

Using this method, the longest algorithmic operations are empirically

1. applying the background error covariance matrix B (steps 2, 5, 7, and 10 of Algorithm 2);

2. computing the QR factorisations (steps 3, 6, and 8 of Algorithm 2).

This means that $\hat{X}$ is constructed with approximate complexity:

\begin{array}{l} T_{trunc svd} = 2 (q + 1) {\hat{N}}_{e} T_{B} + (2 q + 1) O (N_{x} {\hat{N}}_{e}^{2}), & (29) \end{array}

where T_B is the complexity of applying the matrix B to a vector and the parameter q is the number of iterations performed in Algorithm 2. With a dense representation of the matrix B, $T_{B} = O (N_{x}^{2})$ . However, for any vector $v$ of size N_x, we have the identity:

\begin{array}{l} B v = \sum_{i = 1}^{N_{e}} X^{i} \circ (ρ (X^{i} \circ v)), & (30) \end{array}

where Xⁱ is the i-th column of matrix X. If ρ is banded with non-zero coefficients on N_b diagonals, the matrix product Bv has complexity $T_{B} = O (N_{e} N_{x} N_{b})$ . Furthermore, if ρ is a circulant matrix (this corresponds to an invariance by translation in physical space) it is diagonal in spectral space and $T_{B} = O (N_{e} N_{x} \ln N_{x})$ .
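The identity of Equation (30) and its spectral variant for a circulant ρ can be sketched as follows. The function names and the exponential taper are illustrative assumptions; the spectral version assumes that ρ is symmetric and circulant, so that it is fully described by its first row.

```python
import numpy as np

def apply_B(X, rho, v):
    """Equation (30) with a dense rho: B v = sum_i X^i ∘ (rho (X^i ∘ v))."""
    return np.sum(X * (rho @ (X * v[:, None])), axis=1)

def apply_B_circulant(X, rho_row, v):
    """Same product for a symmetric circulant rho, applied in spectral space with the FFT."""
    n_x = len(v)
    rho_hat = np.fft.rfft(rho_row)
    XV = X * v[:, None]                                   # columns X^i ∘ v
    rho_XV = np.fft.irfft(rho_hat[:, None] * np.fft.rfft(XV, axis=0), n=n_x, axis=0)
    return np.sum(X * rho_XV, axis=1)

rng = np.random.default_rng(5)
n_x, n_e = 32, 4
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 3.0)                                    # hypothetical circulant taper
X = rng.standard_normal((n_x, n_e))
v = rng.standard_normal(n_x)

Bv_dense = (rho * (X @ X.T)) @ v                          # reference with explicit B
Bv_eq30 = apply_B(X, rho, v)
Bv_fft = apply_B_circulant(X, rho[0], v)
```

Neither function forms the N_x × N_x matrix B, which is the point of the identity.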

Finally, as explained in detail by Halko et al. [23] and recalled in Appendix B.2, the matrix multiplications involving B in the truncated svd method can be parallelised, which reduces T_{trunc svd} to:

\begin{array}{l} T_{trunc svd} = 2 (q + 1) \frac{{\hat{N}}_{e}}{N_{thr}} T_{B} + (2 q + 1) O (N_{x} {\hat{N}}_{e}^{2}), & (31) \end{array}

where N_thr is the number of available threads.
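For orientation, a generic randomised subspace iteration in the spirit of Halko et al. [23] is sketched below. It is a simplified variant written for this illustration: it does not reproduce the step numbering of Algorithm 2, and its number of products with B differs from the count used in Equation (29). The function name, the taper, and the sizes are assumptions.

```python
import numpy as np

def randomised_eig(apply_B, n_x, n_modes, q=1, seed=0):
    """Approximate the leading eigenpairs of a symmetric positive semi-definite matrix
    that is only available through products apply_B(M) = B @ M."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(apply_B(rng.standard_normal((n_x, n_modes))))
    for _ in range(q):                        # q subspace iterations sharpen the range estimate
        Q, _ = np.linalg.qr(apply_B(Q))
    T = Q.T @ apply_B(Q)                      # small n_modes x n_modes projected matrix
    w, V = np.linalg.eigh((T + T.T) / 2)
    order = np.argsort(w)[::-1]               # decreasing eigenvalues
    return Q @ V[:, order], np.clip(w[order], 0.0, None)

rng = np.random.default_rng(7)
n_x, n_e, n_modes = 60, 5, 20
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 4.0)                        # hypothetical localisation matrix
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)
B = rho * (X @ X.T)

U, w = randomised_eig(lambda M: B @ M, n_x, n_modes, q=2)
X_hat = U * np.sqrt(w)                        # uncentred factor U Σ^{1/2}
err = np.linalg.norm(B - X_hat @ X_hat.T) / np.linalg.norm(B)
```

Each call to `apply_B` acts on independent columns, which is where the parallelisation over N_thr threads would apply.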

Again, the same question remains: how large must N_m be? For the same reasons as in section 3.2.2, we cannot provide a clear answer at this point. However, in the numerical experiments of sections 4, 5, and 6, we illustrate how our performance criterion depends on the number of modes N_m.

3.3. Updating the Perturbations

Once the augmented ensemble $\hat{X}$ is constructed, we need to specify how we are going to update the perturbations. This is a non-trivial problem because the number of perturbations that compose $\hat{X}$ , ${\hat{N}}_{e}$ , is potentially different from (and most of the time greater than) the number of perturbations to update, N_e.

3.3.1. Updating the Perturbations Without Localisation

The perturbation update of the ETKF is given by

\begin{array}{l} T_{e} = {(I + Y^{T} R^{- 1} Y)}^{- \frac{1}{2}}, & (32) \end{array}

\begin{array}{l} X_{a} = X T_{e} . & (33) \end{array}

This is a simplified version of Equation (15) that rigorously satisfies

\begin{array}{l} X_{a} X_{a}^{T} = P = (I - K H) X X^{T} . & (34) \end{array}
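Equation (34) can be verified numerically for the right-transform of Equations (32) and (33). The sketch below uses random matrices with hypothetical sizes; K is computed from Equation (5) with B = XXᵀ.

```python
import numpy as np

rng = np.random.default_rng(5)
n_x, n_y, n_e = 8, 4, 5
E = rng.standard_normal((n_x, n_e))
X = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(n_e - 1)
H = rng.standard_normal((n_y, n_x))
R = np.diag(rng.uniform(0.5, 2.0, n_y))

Y = H @ X
S = np.eye(n_e) + Y.T @ np.linalg.solve(R, Y)     # I + Y^T R^{-1} Y, symmetric positive definite
w, V = np.linalg.eigh(S)
T_e = (V / np.sqrt(w)) @ V.T                       # S^{-1/2}, Equation (32)
X_a = X @ T_e                                      # Equation (33)

B = X @ X.T
K = B @ H.T @ np.linalg.inv(R + H @ B @ H.T)       # Equation (5)
P = (np.eye(n_x) - K @ H) @ B                      # right-hand side of Equation (34)
```

Since Y1 = 0, the transform leaves the vector 1 unchanged and the updated perturbations stay centred.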

Sakov and Bertino [8] have shown that Equation (33) is equivalent to

\begin{array}{l} T_{x} = {(I + B H^{T} R^{- 1} H)}^{- \frac{1}{2}}, & (35) \end{array}

\begin{array}{l} X_{a} = T_{x} X . & (36) \end{array}

Note that I + BH^T R⁻¹ H is not necessarily symmetric. However, if we suppose that B is positive definite (the generalisation to positive semi-definite matrices is not fundamental) then BH^T R⁻¹ H is similar (in the matrix sense) to

\begin{array}{l} B^{- \frac{1}{2}} B H^{T} R^{- 1} H B^{\frac{1}{2}} = B^{\frac{1}{2}} H^{T} R^{- 1} H B^{\frac{1}{2}}, & (37) \end{array}

which is symmetric positive semi-definite. Therefore, BH^T R⁻¹ H is diagonalisable with non-negative eigenvalues and I + BH^T R⁻¹ H is diagonalisable with positive eigenvalues. This means that T_x is well-defined.

3.3.2. Updating the Perturbations With Localisation Using the Augmented Ensemble

The right-transform T_e is formulated in ensemble space. As a result, there is no way to enforce covariance localisation (formulated in state space) using this approach. By contrast, the left-transform T_x is formulated in state space and is therefore more adequate to covariance localisation.

Using the augmented ensemble constructed in section 3.2, the prior covariance matrix is approximated by Equation (18). This means that the left-transform can be approximated by

\begin{array}{l} T_{x} \approx {\hat{T}}_{e} = {(I + \hat{X} {\hat{Y}}^{T} R^{- 1} H)}^{- \frac{1}{2}}, & (38) \end{array}

where $\hat{Y} = H \hat{X}$ . Using this expression for the update still implies linear algebra in state space: $I + \hat{X} {\hat{Y}}^{T} R^{- 1} H$ is an N_x × N_x matrix, which is intractable with high-dimensional systems. However, Equation (25) of Bocquet [18] shows that

\begin{array}{l} {\hat{T}}_{e} = I - \hat{X} {(I + {\hat{Y}}^{T} R^{- 1} \hat{Y} + {(I + {\hat{Y}}^{T} R^{- 1} \hat{Y})}^{\frac{1}{2}})}^{- 1} {\hat{Y}}^{T} R^{- 1} H, & (39) \end{array}

where the linear algebra is now performed in the augmented ensemble space (that is, using ${\hat{N}}_{e} \times {\hat{N}}_{e}$ matrices). This perturbation update has been rediscovered by Bishop et al. [19] and included in their gain ETKF (GETKF) algorithm. However, the update formula used in the GETKF is prone to numerical cancellation errors as opposed to Equation (39).
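Equation (39) can be checked numerically on a small problem. In the sketch below, `apply_left_transform` is a hypothetical helper that applies the transform to a set of perturbations using only factorisations in the augmented ensemble space. The explicit N_x × N_x matrices are formed for verification only, which would not be done in high dimension.

```python
import numpy as np

def apply_left_transform(X_hat, H, R, X):
    """Apply the left-transform of Equation (39) to the columns of X."""
    Y_hat = H @ X_hat
    C = Y_hat.T @ np.linalg.solve(R, H)            # Y_hat^T R^{-1} H
    S = np.eye(X_hat.shape[1]) + C @ X_hat         # I + Y_hat^T R^{-1} Y_hat
    S = (S + S.T) / 2                              # remove rounding asymmetry
    w, V = np.linalg.eigh(S)
    S_half = (V * np.sqrt(w)) @ V.T                # symmetric square root
    return X - X_hat @ np.linalg.solve(S + S_half, C @ X)

rng = np.random.default_rng(10)
n_x, n_y, n_hat_e = 12, 6, 7
X_hat = rng.standard_normal((n_x, n_hat_e))
H = rng.standard_normal((n_y, n_x))
R = np.diag(rng.uniform(0.5, 2.0, n_y))

T_hat = apply_left_transform(X_hat, H, R, np.eye(n_x))            # explicit matrix, for checking only
M = np.eye(n_x) + X_hat @ (H @ X_hat).T @ np.linalg.solve(R, H)   # matrix inside Equation (38)
residual = np.linalg.norm(T_hat @ M @ T_hat - np.eye(n_x))
```

The residual measures how well the transform squares to the inverse of the matrix in Equation (38); it is at the level of rounding error in this example.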

In the rest of this article, we use the following perturbation update:

\begin{array}{l} X_{a} = {\hat{T}}_{e} X, & (40) \end{array}

and we compute the left-transform ${\hat{T}}_{e}$ using Equation (39). Algorithm 1 summarises the analysis step of the resulting generic ensemble square root Kalman filter with covariance localisation (LEnSRF). Finally note that the consistency of the perturbation update in deterministic EnKF algorithms using covariance localisation is the subject of another study by Bocquet and Farchi (under review).

Algorithm 1. Analysis step for a generic LEnSRF used in this article. The augmented ensemble at step 3 can be constructed using the modulation method with or without the balanced refinement or using the truncated svd method.
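The three sub-steps of section 3.1 can be assembled into a minimal sketch of an LEnSRF analysis, given below. It is an illustration under stated assumptions rather than a transcription of Algorithm 1: the augmented ensemble is built by modulation from a given factor W of ρ, the mean update uses the innovation y − H x̄ for a linear H, and the function name, the exponential taper, and the observation network are hypothetical.

```python
import numpy as np

def lensrf_analysis(E, W, H, R, y):
    """One analysis step with covariance localisation using a modulated augmented ensemble."""
    n_x, n_e = E.shape
    x_bar = E.mean(axis=1)
    X = (E - x_bar[:, None]) / np.sqrt(n_e - 1)
    # Sub-step 1: augmented ensemble by modulation, Equation (20).
    X_hat = (W[:, :, None] * X[:, None, :]).reshape(n_x, -1)
    Y_hat = H @ X_hat
    C = Y_hat.T @ np.linalg.solve(R, H)                # Y_hat^T R^{-1} H
    S = np.eye(X_hat.shape[1]) + C @ X_hat             # I + Y_hat^T R^{-1} Y_hat
    S = (S + S.T) / 2
    # Sub-step 2: mean update in the augmented ensemble space (assumed innovation y - H x_bar).
    rhs = Y_hat.T @ np.linalg.solve(R, y - H @ x_bar)
    x_a = x_bar + X_hat @ np.linalg.solve(S, rhs)
    # Sub-step 3: perturbation update with Equations (39) and (40).
    w, V = np.linalg.eigh(S)
    S_half = (V * np.sqrt(w)) @ V.T
    X_a = X - X_hat @ np.linalg.solve(S + S_half, C @ X)
    return x_a[:, None] + np.sqrt(n_e - 1) * X_a

rng = np.random.default_rng(6)
n_x, n_y, n_e = 20, 10, 4
idx = np.arange(n_x)
d = np.abs(idx[:, None] - idx[None, :])
d = np.minimum(d, n_x - d)
rho = np.exp(-d / 3.0)                                 # hypothetical localisation matrix
lam, V_rho = np.linalg.eigh(rho)
W = V_rho * np.sqrt(np.clip(lam, 0.0, None))           # exact factor, rho = W W^T
H = np.eye(n_x)[::2]                                   # observe every other variable
R = np.eye(n_y)
E = rng.standard_normal((n_x, n_e)) + 3.0
y = rng.standard_normal(n_y) + 3.0
E_a = lensrf_analysis(E, W, H, R, y)
```

With an exact factor W, the mean of the updated ensemble matches the Kalman analysis of Equation (6) computed with B = ρ ∘ (XXᵀ). A truncated W would trade accuracy for a smaller augmented ensemble.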

4. Accuracy of the Approximate Factorisation

4.1. A Simple One-Dimensional Model for the Background Error Covariance Matrix

In this section, we investigate the accuracy of the factorisation Equation (18) obtained with the methods described in section 3.2. The background error covariance matrix is obtained as follows.

Let G be the piecewise rational function of Gaspari and Cohn [33]. Let d be the Euclidean distance over the set {1…N_x} with periodic boundary conditions. For any radius r ∈ ℝ⁺, we define C(r) as the N_x × N_x matrix whose coefficients satisfy

\begin{array}{l} {[C (r)]}_{m, n} = G (\frac{d (m, n)}{r}) . & (41) \end{array}

For any vector $v \in ℝ^{N_{x}}$ , we define $D (v)$ as the N_x × N_x diagonal matrix whose diagonal is precisely $v$ .

We draw a vector c from the distribution $N (1, α_{c} C (r_{c}))$ , where α_c and r_c are two parameters. We define:

\begin{array}{l} C_{ref} = D (c) C (r_{ref}) D (c) . & (42) \end{array}

C_ref plays the role of a reference covariance matrix in our model. We now draw a sample of N_e independent members from the distribution $N (0, C_{ref})$ . Without loss of generality, we assume that the associated ensemble matrix X is normalised by $\sqrt{N_{e} - 1}$ . We define the background error covariance matrix as

\begin{array}{l} B = ρ \circ (X X^{T}), & (43) \end{array}

with the localisation matrix ρ = C(r_ref).
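The covariance model of this section can be sketched as follows. Several elements are assumptions: the fifth-order piecewise rational function is written in its usual form with compact support for arguments below 2, the argument is taken as d/r directly, and the values of N_x, N_e, α_c, r_c, and r_ref are hypothetical (they are not the values of Table 1). All function names are illustrative.

```python
import numpy as np

def gaspari_cohn(z):
    """Fifth-order piecewise rational function of Gaspari and Cohn, zero for |z| >= 2."""
    z = np.abs(np.asarray(z, dtype=float))
    g = np.zeros_like(z)
    a = z <= 1.0
    b = (z > 1.0) & (z < 2.0)
    g[a] = -0.25 * z[a]**5 + 0.5 * z[a]**4 + 0.625 * z[a]**3 - (5 / 3) * z[a]**2 + 1.0
    g[b] = (z[b]**5 / 12 - 0.5 * z[b]**4 + 0.625 * z[b]**3
            + (5 / 3) * z[b]**2 - 5 * z[b] + 4 - 2 / (3 * z[b]))
    return g

def gc_matrix(r, n_x):
    """Matrix C(r) of Equation (41) with the periodic distance on {1, ..., N_x}."""
    idx = np.arange(n_x)
    d = np.abs(idx[:, None] - idx[None, :])
    d = np.minimum(d, n_x - d)
    return gaspari_cohn(d / r)

def sample_gaussian(rng, mean, cov, n):
    """Draw n samples (as columns) from N(mean, cov) using a symmetric eigendecomposition."""
    w, V = np.linalg.eigh(cov)
    L = V * np.sqrt(np.clip(w, 0.0, None))
    return mean[:, None] + L @ rng.standard_normal((len(mean), n))

n_x, n_e = 100, 8                         # hypothetical sizes
alpha_c, r_c, r_ref = 0.1, 10.0, 5.0      # hypothetical parameters
rng = np.random.default_rng(3)
c = sample_gaussian(rng, np.ones(n_x), alpha_c * gc_matrix(r_c, n_x), 1)[:, 0]
C_ref = c[:, None] * gc_matrix(r_ref, n_x) * c[None, :]    # Equation (42)
E = sample_gaussian(rng, np.zeros(n_x), C_ref, n_e)
X = E / np.sqrt(n_e - 1)                  # normalised as stated in the text
rho = gc_matrix(r_ref, n_x)
B = rho * (X @ X.T)                       # Equation (43)
```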

4.2. Experimental Setup

In the following test series, we use two matrices B₁ and B₂. They are constructed using different realisations of the model described in section 4.1 with the parameters specified in Table 1. Note that we used short-range correlations (r_ref = 20) to construct B₁ and mid-range correlations (r_ref = 100) to construct B₂. Figure 1 displays the matrices B₁ and B₂.

Table 1. Parameters used to construct B₁ and B₂ with the model described in section 4.1.

Figure 1. Background error covariance matrices B₁ (left) and B₂ (right).

The modulation method described in sections 3.2.2 and 3.2.3 requires an approximate factorisation for ρ = C(r_ref), which we precompute by keeping the first N_m or N_m + δN_m (when using the balance refinement) modes in the svd of ρ. Since ρ is sparse, we use the random svd (Algorithm 2) to obtain this factorisation.

4.3. Results and Discussion

We apply the methods described in section 3.2 to obtain an approximate factorisation for B₁ and B₂ and we compute the normalised Frobenius error:

\begin{array}{l} e_{F, i} = \frac{{‖ B_{i} - \hat{X} {\hat{X}}^{T} ‖}_{F}}{{‖ B_{i} ‖}_{F}} . & (44) \end{array}

Using the Eckart–Young theorem [34], we conclude that the lowest normalised Frobenius error for a factorisation with rank ${\hat{N}}_{e} - 1$ is

\begin{array}{l} e_{F, i}^{min} = \frac{\sqrt{\sum_{k = {\hat{N}}_{e}}^{N_{x}} σ_{k}^{2} (B_{i})}}{{‖ B_{i} ‖}_{F}}, & (45) \end{array}

where σ_k(M) is the k-th singular value of the matrix M.
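Equations (44) and (45) translate directly into code. In the sketch below the function names and the random positive semi-definite test matrix are illustrative assumptions.

```python
import numpy as np

def frobenius_error(B, X_hat):
    """Normalised Frobenius error of Equation (44)."""
    return np.linalg.norm(B - X_hat @ X_hat.T) / np.linalg.norm(B)

def min_frobenius_error(B, n_hat_e):
    """Equation (45): lowest error achievable with rank n_hat_e - 1 (Eckart-Young)."""
    s = np.linalg.svd(B, compute_uv=False)            # singular values in decreasing order
    return np.sqrt(np.sum(s[n_hat_e - 1:] ** 2)) / np.linalg.norm(B)

rng = np.random.default_rng(4)
A = rng.standard_normal((30, 30))
B = A @ A.T                                           # hypothetical covariance matrix
n_hat_e = 6

# Factor built from the n_hat_e - 1 leading eigenpairs: it should attain the minimum.
w, U = np.linalg.eigh(B)
Z = U[:, -(n_hat_e - 1):] * np.sqrt(w[-(n_hat_e - 1):])
e_trunc = frobenius_error(B, Z)
e_min = min_frobenius_error(B, n_hat_e)

# Any other factor of the same rank cannot do better.
e_rand = frobenius_error(B, rng.standard_normal((30, n_hat_e - 1)))
```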

Figure 2 shows the evolution of the normalised Frobenius error e_{F, i} as a function of the augmented ensemble size ${\hat{N}}_{e}$ when the factorisation is computed using the truncated svd method (section 3.2.4) or the modulation method with (section 3.2.3) or without (section 3.2.2) the balance refinement. The value reported for e_{F, i} is the average value over 100 independent realisations of the random svd. For q ≥ 1, the Frobenius error of the truncated svd method (not shown here) cannot be distinguished from the minimum possible value. For the modulation method, using the balance refinement with δN_m > 10 (not shown here) does not yield a clear improvement over the case δN_m = 10. The singular values of B₂ (mid-range case) decay much faster than the singular values of B₁ (short-range case). This explains why the normalised Frobenius errors e_{F, 2} are systematically lower than the normalised Frobenius errors e_{F, 1}.

Figure 2. Evolution of the normalised Frobenius errors e_{F, 1} (top) and e_{F, 2} (bottom) as a function of the augmented ensemble size ${\hat{N}}_{e}$ . The approximate factorisation of B₁ (top) and B₂ (bottom) is computed either using the truncated svd method with several values for the parameter q for the random svd Algorithm 2 (blue lines) or using the modulation method with (solid red line) or without (dashed red line) the balance refinement. The lowest normalised achievable Frobenius errors $e_{F, i}^{min}$ are plotted in yellow for both cases.

The modulation method is very fast but yields a poor approximation of the background error covariance matrix B. With the balance refinement, the approximation is a bit better and the method is still very fast. By contrast, the truncated svd method is much slower but yields an approximation of B close to optimal. At this point, it is not clear what level of precision is needed for B. Yet, we can already conclude that, in a cycled DA context, we will have to find a balance between speed and accuracy in the construction of the augmented ensemble and in the perturbation update.

Finally, different matrix norms could have been used in this test series. Indeed, even if equivalent, two matrix norms give different information. This is why we have computed the normalised spectral error (not shown here) and found quite similar results to those depicted in Figure 2. This shows that our results are not specific to the Frobenius norm.

5. Cycled DA Experiments With a One-Dimensional Model

5.1. The Lorenz 1996 Model

The L96 model [24] is a low-order one-dimensional discrete model whose evolution is given by the following ordinary differential equations (ODEs):

\begin{array}{l} \frac{d x_{n}}{d t} = (x_{n + 1} - x_{n - 2}) x_{n - 1} - x_{n} + F, n = 1 \dots N_{x}, & (46) \end{array}

where the indices are to be understood with periodic boundary conditions: x₋₁ = x_{N_x−1}, x₀ = x_{N_x}, x₁ = x_{N_x+1} and where the system size N_x can take arbitrary values. These ODEs are integrated using a fourth-order Runge–Kutta method with a time step of 0.05 time unit.
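A minimal sketch of Equation (46) and of the fourth-order Runge–Kutta integration is given below. The function names are illustrative; the default values F = 8 and a time step of 0.05 follow the text.

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Right-hand side of Equation (46) with periodic indices."""
    # np.roll(x, -1)[n] = x[n+1], np.roll(x, 2)[n] = x[n-2], np.roll(x, 1)[n] = x[n-1]
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step of the L96 model."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Slightly perturbed rest state, integrated for a few steps.
x = 8.0 * np.ones(40)
x[0] += 0.01
for _ in range(10):
    x = rk4_step(x)
```

The uniform state x_n = F is a fixed point of Equation (46), which provides a simple sanity check of the implementation.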

The standard configuration N_x = 40 and F = 8 is widely used in DA to assess the performance of the DA algorithms. It yields chaotic dynamics with a doubling time around 0.42 time unit. In this section, we use an extended standard configuration where N_x = 400 and F = 8, which essentially repeats the standard configuration ten times. The observations are given by

\begin{array}{l} y = x + v, \quad v \sim N (0, I), & (47) \end{array}

and the time interval between consecutive observations is Δt = 0.05 time unit, which represents 6 h of real time and corresponds to a model autocorrelation around 0.967. The standard deviation of the observation noise is approximately one tenth of the climatological variability of each observation.

For the localisation, we use the Euclidean distance over the set {1…N_x} with periodic boundary conditions (the same as the one that was used in section 4).

Note that we use N_x = 400 state variables instead of 40 such that typical local domains (with a localisation radius around r = 20 grid points) do not cover the whole domain.

5.2. Experimental Setup

In this section, we illustrate the performance of the LEnSRF Algorithm 1 using twin simulations of the L96 model in the extended configuration. The augmented ensemble (step 3 of Algorithm 1) is computed using the truncated svd method (section 3.2.4) or the modulation method with (section 3.2.3) or without (section 3.2.2) the balance refinement. Both methods use an ensemble of N_e = 10 members and a localisation matrix ρ = C(r), where the localisation radius r is a free parameter.

For the modulation method, the approximate factorisation of ρ is precomputed once for each twin experiment by keeping the first N_m or N_m + δN_m (when using the balance refinement) modes in the svd of ρ.

For the truncated svd method, the matrix multiplications involving B are computed using Equation (30) and ρ is applied in spectral space.

The runs are 2 × 10⁴Δt long (with an additional 2 × 10³Δt spin-up period) and our performance criterion is the time-average analysis root mean square error, hereafter called the RMSE.

5.3. Augmented Ensemble Size—RMSE Evolution

Figure 3 shows the evolution of the RMSE as a function of the augmented ensemble size ${\hat{N}}_{e}$ . Both the truncated svd and the modulation methods are able to produce an estimate of the true state with an accuracy equivalent to that of the local ensemble transform Kalman filter (LETKF, [35]). As expected after the experiments in section 4, for a given level of RMSE score we need a much smaller augmented ensemble size ${\hat{N}}_{e}$ when using the truncated svd method than when using the modulation method. However, before we conclude that the truncated svd method is more efficient, we need to take into account the fact that computing the augmented ensemble is much slower with this method than with the modulation method.

Figure 3. Evolution of the RMSE as a function of the augmented ensemble size ${\hat{N}}_{e}$ (left) and of the wall-clock analysis time for the 22 × 10³ cycles as a function of the RMSE (right) for the LEnSRF Algorithm 1. For each data point, the localisation radius r and the inflation factor λ are optimally tuned to yield the lowest RMSE. The augmented ensemble is computed either using the truncated svd method with several values for the parameter q in the random svd Algorithm 2 (green and blue lines) or using the modulation method with (solid red line) or without (dashed red line) the balance refinement. In the left panel, the horizontal solid cyan baseline is the RMSE of the LETKF with an ensemble of N_e = 10 members and optimally tuned localisation radius r and inflation factor λ.

With the truncated svd method, the augmented ensemble size ${\hat{N}}_{e}$ needs to be at least of the same order as the number of unstable and neutral modes of the model dynamics (around 133 here) in order to yield accurate results [4]. We have also tested q > 1 in the truncated svd method or δN_m > 20 in the modulation method (not shown here) and found that none of these parameters yields clear improvements in RMSE scores.

5.4. RMSE—Wall-Clock Time Evolution

The longest algorithmic operations in Algorithm 1 are:

1. constructing the augmented ensemble (step 3);

2. computing the svd of $\hat{S}$ (required for steps 8 and 9).

When using the truncated svd method, some level of parallelisation can be included in the construction of the augmented ensemble as remarked in Appendix B.2. When using the modulation method (even with the balance refinement), constructing the augmented ensemble is almost instantaneous compared to computing the svd of $\hat{S}$ . Therefore, we only enable parallelisation when using the truncated svd method.

To compute the analysis state $x_{a}$ and analysis perturbations X_a, we have tested several approaches and concluded that the most efficient is to compute the svd of $\hat{S}$ with a direct svd algorithm, which cannot be parallelised.

Figure 3 shows the evolution of the wall-clock time of one analysis step as a function of the RMSE. All simulations are performed on the same double Intel Xeon E5-2680 platform, which has a total of 24 cores. Parallelisation is enabled when possible using a fixed number of OpenMP threads. For a given level of RMSE score, the truncated svd method is clearly faster than the modulation method. This shows the advantage of using the truncated svd method over the modulation method, especially when parallelisation is possible. However, this result is specific to the problem considered in this section and may not generalise to all situations.

6. Cycled DA Experiments With Satellite Radiances

6.1. Is Covariance Localisation Viable With High-Dimensional Models?

In section 5, we have implemented covariance localisation in the EnKF and successfully applied the resulting algorithm to a one-dimensional DA problem with N_x = 400 state variables. With a high-dimensional system, covariance localisation in the EnKF will probably require a very large augmented ensemble size ${\hat{N}}_{e}$ , too large to be affordable. In this case, the use of domain localisation will be mandatory.

When observations are local, domain localisation is simple to implement and yields efficient algorithms such as the LETKF. However, when observations are non-local, one must resort to ad hoc approximations to implement domain localisation in the EnKF, for example assigning an approximate location to each observation. In this section, we discuss the case of satellite radiances, which are non-local observations, and we show how existing variants of the LETKF deal with such observations. We then give an extension of our LEnSRF Algorithm 1 designed to assimilate satellite radiances in a spatially extended model. Finally, we introduce a multilayer extension of the L96 model that mimics satellite radiances and we illustrate the performance of our method using twin simulations of this model.

6.2. The Case of Satellite Radiances

Suppose that the physical space consists of a multilayer space with P_z vertical levels of P_h state variables. For any h ∈ {1…P_h} and z ∈ {1…P_z}, the state variable located at the h-th horizontal position and at the z-th vertical level is written x_{(z, h)}. For any state vector x, the sub-vector of the components located at the h-th horizontal position at any level is written x_h and called the h-th column of x.

Suppose that each state vector column is observed independently via

\begin{array}{l} y_{h} = Ω x_{h}, & (48) \end{array}

where Ω is a P_c × P_z weighting matrix and $y_{h}$ is the vector containing the P_c observations at the h-th horizontal position. P_c is the number of channels. The full observation vector $y$ is the concatenation of all $y_{h}, h = 1 \dots P_{h}$ . It has N_y = P_c × P_h components and for any h ∈ {1…P_h} and c ∈ {1…P_c}, the observation located at the h-th horizontal position and corresponding to the c-th channel is written y_{(c, h)}.

This simple model describes a typical situation for satellite radiances. From these definitions, it is clear that each observation is attached to a horizontal position, but has no well-defined vertical position (unless the weighting matrix Ω is diagonal). Several variants of the LETKF have been designed to assimilate such observations. When the weighting function of a channel has a single and well-located maximum, the vertical location of this maximum can play the role of an approximate height for this channel. This is the approach followed for example by Fertig et al. [11]. Based on these vertical positions, they use the channels to update “adjacent” vertical levels as long as the corresponding weighting function is above a threshold value. Campbell et al. [9] have followed the same approach to define the approximate height of the channels. However, their update formula includes a vertical tapering of the anomalies that depends on the vertical distance. When the weighting functions are flat, another possibility is to define the approximate height of a channel as the middle of the support of its weighting function [36]. Miyoshi and Sato [10] have proposed an alternative that does not require the definition of an approximate height of the channels: their update formula includes a vertical tapering of the anomalies that depends on the shape of the weighting functions only. Finally, in the algorithm of Penny et al. [37], vertical localisation has been removed.

Using a realistic one-dimensional model with satellite radiances, Campbell et al. [9] have shown that ad hoc approaches based on domain localisation only systematically yield higher errors than covariance localisation. In a spatially extended model with satellite radiances, it seems natural to apply domain localisation in the horizontal direction, in which observations are local, while using covariance localisation in the vertical direction, in which observations are non-local.

6.3. Including Domain Localisation in the LEnSRF

Following the approach of Bishop et al. [19], we apply four modifications to the generic LEnSRF Algorithm 1 in order to include domain localisation in a way similar to the LETKF.

1. We perform P_h local analyses instead of one global analysis. The aim of the h-th local analysis is to give an update for the P_z state variables that form the h-th column. The linear algebra must be adapted accordingly.

2. We taper the anomalies related to each observation with respect to the horizontal distance to the h-th column. This is usually implemented in $R^{- \frac{1}{2}}$ .

3. Observations whose horizontal position is far from the h-th column (i.e., observations that are not in the local domain) do not contribute to the update. These observations are therefore omitted in the local analysis in order to save some computational time.

4. Since observations that are outside of the local domain are omitted, we only need to compute an augmented ensemble for the state variables inside the local domain. Since covariance localisation is only applied in the vertical direction, the augmented ensemble ${\hat{X}}^{ℓ}$ must have empirical covariance matrix given by

\begin{array}{l} {\hat{X}}^{ℓ} {({\hat{X}}^{ℓ})}^{T} \approx ρ_{v} \circ (X^{ℓ} {(X^{ℓ})}^{T}), & (49) \end{array}

where X^ℓ is the set of perturbations that are located within the local domain and ρ_v is the vertical localisation matrix, whose coefficients only depend on the vertical layer indices.

This modified LEnSRF, hereafter called local analysis LEnSRF (L²EnSRF), implements domain localisation in the horizontal direction and covariance localisation in the vertical direction and therefore can be used to assimilate vertically non-local observations such as satellite radiances.
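The target property of Equation (49) can be checked numerically on a toy problem. The sketch below follows the spirit of the modulation method under simplifying assumptions that are not taken from the experiments: a single column of P_z levels, a Gaussian-shaped vertical localisation matrix, and a full (untruncated) factorisation of ρ_v, in which case the identity is exact rather than approximate. The helper name `modulate` is hypothetical.

```python
import numpy as np

def modulate(X, rho):
    """Augmented perturbations such that X_hat @ X_hat.T equals rho * (X @ X.T)
    (Schur product) when all modes of rho are kept (hypothetical helper)."""
    n = X.shape[0]
    eigval, eigvec = np.linalg.eigh(rho)
    # Factorisation rho = W W^T; tiny negative eigenvalues are clipped to zero.
    W = eigvec * np.sqrt(np.clip(eigval, 0.0, None))
    # Column (j, i) of the result is the Schur product W[:, j] * X[:, i].
    return (W[:, :, None] * X[:, None, :]).reshape(n, -1)

# Toy setup (assumed values): 6 vertical levels, 4 perturbations.
rng = np.random.default_rng(1)
P_z, N_e, r_v = 6, 4, 2.0
z = np.arange(P_z)
rho_v = np.exp(-0.5 * ((z[:, None] - z[None, :]) / r_v) ** 2)
X = rng.standard_normal((P_z, N_e))

X_hat = modulate(X, rho_v)
residual = np.abs(X_hat @ X_hat.T - rho_v * (X @ X.T)).max()
```

Here the augmented ensemble has P_z × N_e columns; keeping only the leading modes of ρ_v reduces this size at the price of an approximate equality, as in Equation (49).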

6.4. The Multilayer L96 Model

We now introduce a multilayer extension of the L96 model, hereafter called mL96 model. This multilayer extension is used to test and illustrate the performance of the L²EnSRF algorithm.

The mL96 model consists of P_z coupled layers of the one-dimensional L96 model with P_h variables. Keeping the notations defined in section 6.2, the evolution of the model is given by the following P_zP_h ODEs:

\begin{array}{l} \begin{array}{l} \frac{d x_{(z, h)}}{d t} = (x_{(z, h + 1)} - x_{(z, h - 2)}) x_{(z, h - 1)} - x_{(z, h)} + F_{z} \\ + δ_{\{z > 1\}} Γ (x_{(z - 1, h)} - x_{(z, h)}) \\ + δ_{\{z < P_{z}\}} Γ (x_{(z + 1, h)} - x_{(z, h)}) . \end{array} & (50) \end{array}

The first line of Equation (50) corresponds to the original L96 ODE Equation (46), with a forcing term that may depend on the vertical layer index z. The second and third lines correspond to the coupling between adjacent layers, with a constant intensity Γ. As for the L96, the horizontal indices are to be understood with periodic boundary conditions: x_{(z, −1)} = x_{(z, P_h−1)}, x_{(z, 0)} = x_{(z, P_h)}, and x_{(z, 1)} = x_{(z, P_h+1)}. These ODEs are integrated using a fourth-order Runge–Kutta method with a time step of 0.05 time unit.

For this experiment, we use P_z = 32 layers and P_h = 40 to mimic the standard configuration of the L96 model. The forcing term linearly decreases from F₁ = 8 at the lowest level to F_{P_z} = 4 at the highest level. Without the coupling, these values would render the lower levels dynamics chaotic and the higher levels dynamics laminar, which is a typical behaviour in the atmosphere. Finally, we take Γ = 1 such that adjacent layers are highly correlated (correlation around 0.87). To be more specific, the correlation between the z-th level and the (z + δz)-th level first rapidly decreases with δz. It reaches approximately −0.1 for δz = 6 and then it starts increasing. Finally, its absolute value is below 10⁻² when δz > 10. This model is chaotic and the dimension of the unstable or neutral subspace is around 50.
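The tendency of Equation (50) can be sketched as follows. This is an illustrative sketch only: the array layout `x[z, h]`, the function name `ml96_tendency`, and the reading of the coupling terms (the lowest and highest layers are coupled to their single neighbouring layer only) are assumptions made for the example.

```python
import numpy as np

def ml96_tendency(x, forcing, gamma=1.0):
    """Right-hand side of the multilayer model (hypothetical helper).

    x has shape (P_z, P_h) with x[z, h]; forcing has shape (P_z,).
    """
    # Horizontal L96 dynamics in each layer, periodic in h.
    dx = (np.roll(x, -1, axis=1) - np.roll(x, 2, axis=1)) * np.roll(x, 1, axis=1)
    dx += -x + forcing[:, None]
    # Coupling with the layer below (absent for the lowest layer).
    dx[1:] += gamma * (x[:-1] - x[1:])
    # Coupling with the layer above (absent for the highest layer).
    dx[:-1] += gamma * (x[1:] - x[:-1])
    return dx

# Configuration described in the text: 32 layers, 40 columns,
# forcing decreasing linearly from 8 (lowest layer) to 4 (highest layer).
P_z, P_h = 32, 40
forcing = np.linspace(8.0, 4.0, P_z)
dx = ml96_tendency(np.full((P_z, P_h), 2.0), forcing)
```

For a state that is uniform over all layers and columns, both the advection and the coupling terms vanish, so the tendency reduces to F_z minus the state value.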

The observation operator H follows the model described in section 6.2. We use P_c = 8 channels and a weighting matrix Ω designed to mimic satellite radiances as shown in Figure 4. The observations are given by

\begin{array}{l} y = H x + v, \quad v \sim N (0, I), & (51) \end{array}

and the time interval between consecutive observations is the same as the one used with the L96 model, Δt = 0.05 time unit. Once again, the standard deviation of the observation noise is approximately one tenth of the climatological variability of each observation.

Figure 4. Observation operator H. Each line represents the weighting function of a different channel, corresponding to a row of the weighting matrix Ω. Every channel has a single maximum and is relatively broad (half-width around 10 vertical layers). The sum of the weights has been adjusted individually such that every channel yields an observation with approximately the same climatological variability.

For the horizontal localisation, we use the Euclidean distance d_h over the set {1…P_h} with periodic boundary conditions. For the vertical localisation, we use the Euclidean distance d_v over the set {1…P_z}.

6.5. Experimental Setup

In this section, we give some details on how localisation is implemented in the L²EnSRF algorithm for the mL96 model. We then describe which approximations are made to implement an LETKF algorithm. For each algorithm, the runs are 10⁴Δt long (with an additional 10³Δt spin-up period) and our performance criterion is the time-average analysis RMSE.

6.5.1. Horizontal Localisation

Let r_h be the horizontal localisation radius. During the h₁-th local analysis, the anomalies related to observation y_{(c, h₂)} are tapered by a factor

\begin{array}{l} \sqrt{G (\frac{d_{h} (h_{1}, h_{2})}{r_{h}})}, & (52) \end{array}

where G is the piecewise rational function of Gaspari and Cohn [33]. This means that the h-th local domain consists of the columns {h − ⌊r_h⌋ … h + ⌊r_h⌋}, where the indices are to be understood with P_h-periodic boundary conditions and ⌊r_h⌋ is the integer part of r_h.
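The horizontal tapering of Equation (52) can be sketched as follows. The sketch uses the standard fifth-order piecewise rational function of Gaspari and Cohn, rescaled so that it vanishes when its argument reaches 1; this scaling is an assumption chosen to match the extent of the local domain described above, and the helper names and the 0-based column indices are illustrative.

```python
import numpy as np

def gaspari_cohn(x):
    """Gaspari-Cohn piecewise rational function, rescaled (assumed convention)
    so that G(0) = 1 and G(x) = 0 for x >= 1."""
    z = 2.0 * np.abs(np.atleast_1d(np.asarray(x, dtype=float)))
    g = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi, zo = z[inner], z[outer]
    g[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                - (5.0 / 3.0) * zi**2 + 1.0)
    g[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return np.maximum(g, 0.0)  # guard against round-off near the cut-off

def periodic_distance(h1, h2, P_h):
    """Euclidean distance over a periodic set of P_h columns."""
    d = np.abs(np.asarray(h2) - np.asarray(h1)) % P_h
    return np.minimum(d, P_h - d)

def horizontal_taper(h1, P_h, r_h):
    """Tapering factor of Equation (52) for every observed column h2 (0-based)."""
    d = periodic_distance(h1, np.arange(P_h), P_h)
    return np.sqrt(gaspari_cohn(d / r_h))

# Illustrative values: 40 columns, radius r_h = 5.5, local analysis at column 0.
taper = horizontal_taper(h1=0, P_h=40, r_h=5.5)
local_columns = np.flatnonzero(taper > 0.0)
```

With these illustrative values, the columns with a non-zero tapering factor are those within ⌊r_h⌋ = 5 grid points of the analysed column.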

6.5.2. Vertical Localisation

Let r_v be the vertical localisation radius. The vertical localisation matrix ρ_v has coefficients given by

\begin{array}{l} {[ρ_{v}]}_{(z_{1}, h_{1}), (z_{2}, h_{2})} = G (\frac{d_{v} (z_{1}, z_{2})}{r_{v}}) . & (53) \end{array}

The local domain gathers $P_{h}^{ℓ} = 2 \lfloor r_{h} \rfloor + 1$ columns, hence ρ_v is a $P_{z} P_{h}^{ℓ} \times P_{z} P_{h}^{ℓ}$ block-diagonal matrix. Since its coefficients only depend on the vertical layer indices, it can also be seen as a P_z × P_z matrix.

The $P_{z} P_{h}^{ℓ} \times {\hat{N}}_{e}$ matrix ${\hat{X}}^{ℓ}$ of the augmented ensemble is computed using the truncated svd method (section 3.2.4) or the modulation method with (section 3.2.3) or without (section 3.2.2) the balance refinement. Both methods use an ensemble of N_e = 8 members and the localisation matrix ρ_v.

For the modulation method, the approximate factorisation of ρ_v is precomputed once for each twin experiment by keeping the first N_m or N_m+δN_m (when using the balance refinement) modes in the svd of the P_z × P_z matrix ρ_v.

For the truncated svd method, the matrix multiplications involving B are computed using Equation (30). Because the coefficients of the localisation matrix ρ_v only depend on the vertical layer indices, applying the $P_{z} P_{h}^{ℓ} \times P_{z} P_{h}^{ℓ}$ matrix ρ_v to a vector with $P_{z} P_{h}^{ℓ}$ components reduces to applying the P_z × P_z matrix ρ_v to a vector with P_z components. It should be relatively quick and therefore we do not perform this product in spectral space. This means that the implementation can be straightforwardly adapted to the general case where the vertical layers are not equally distributed in height.

6.5.3. Ad hoc Approximations for the LETKF

We define an approximate height z_c for the c-th channel:

\begin{array}{l} z_{c} = \frac{\sum_{z = 1}^{P_{z}} z {[Ω]}_{c, z}}{\sum_{z = 1}^{P_{z}} {[Ω]}_{c, z}} \in [1, P_{z}] . & (54) \end{array}

We did not define z_c as the vertical position of the maximum of the c-th weight function because we wanted to account for the fact that our weight functions are skewed in the vertical direction.

In the LETKF algorithm, we perform P_zP_h local analyses (one for each state variable). In the local analysis for the state variable x_{(z, h₁)}, the anomalies related to observation y_{(c, h₂)} are tapered by a factor

\begin{array}{l} \sqrt{G (\sqrt{{(\frac{δ h}{r_{h}})}^{2} + {(\frac{δ z}{r_{v}})}^{2}})}, & (55) \end{array}

where

\begin{array}{l} δ h = \min \{ | h_{2} - h_{1} |, P_{h} - | h_{2} - h_{1} | \}, & (56) \end{array}

\begin{array}{l} δ z = | z - z_{c} |, & (57) \end{array}

and r_h and r_v are the horizontal and vertical localisation radii, respectively.
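Equations (54), (56), and (57) can be sketched as follows. The weighting matrix used here is a small made-up example and not the one shown in Figure 4; the function names are hypothetical, and the sketch only computes the argument of G in Equation (55), leaving the choice of G to the caller.

```python
import numpy as np

def channel_heights(omega):
    """Approximate height of each channel, Equation (54).

    omega has shape (P_c, P_z); vertical levels are numbered from 1 to P_z.
    """
    levels = np.arange(1, omega.shape[1] + 1)
    return (omega @ levels) / omega.sum(axis=1)

def letkf_taper_argument(z, h1, z_c, h2, P_h, r_h, r_v):
    """Argument of G in Equation (55) for all channels at column h2."""
    dh = min(abs(h2 - h1), P_h - abs(h2 - h1))  # Equation (56)
    dz = np.abs(z - z_c)                        # Equation (57)
    return np.sqrt((dh / r_h) ** 2 + (dz / r_v) ** 2)

# Made-up weighting matrix with 2 channels and 4 levels (illustration only).
omega = np.array([[1.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 3.0]])
z_c = channel_heights(omega)
arg = letkf_taper_argument(z=2, h1=1, z_c=z_c, h2=40, P_h=40, r_h=3.0, r_v=2.0)
```

In this toy case, the second channel is skewed towards the highest level, so its approximate height (3.75) differs from the mid-point of its support (3.5), which is the kind of asymmetry Equation (54) is meant to capture.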

6.6. Results

Figure 5 shows the evolution of the RMSE and of the wall-clock time of one analysis step as a function of the horizontal localisation radius r_h for the L²EnSRF and the LETKF. All simulations are performed on the same double Intel Xeon E5-2680 platform, which has a total of 24 cores. Parallelisation is enabled for the P_h independent local analyses using a fixed number of OpenMP threads N_thr = 20. In the L²EnSRF algorithm, the augmented ensemble is computed either using the truncated svd method with q = 0 in the random svd (Algorithm 2) or using the modulation method without the balance refinement (δN_m = 0). Preliminary experiments with q > 0 or δN_m > 0 (not shown here) did not display clear improvements in RMSE score over the cases q = 0 and δN_m = 0.

Figure 5. Evolution of the RMSE (top) and of the wall-clock analysis time for the 10³ cycles (bottom) as a function of the horizontal localisation radius r_h for the L²EnSRF algorithm. For each data point, the vertical localisation radius r_v and the inflation factor λ are optimally tuned to yield the lowest RMSE. The augmented ensemble is computed either using the truncated svd method (blue lines) with q = 0 in the random svd Algorithm 2 or using the modulation method (red lines) without the balance refinement (δN_m = 0). As a reference, we draw the same data for the LETKF with the ad hoc approximations described in section 6.5.3 (cyan line).

The LETKF produces rather high RMSE scores (compared to the standard deviation of the observation noise, which is 1), while not completely failing to reconstruct the true state. Although domain localisation in the horizontal direction is a powerful tool, vertical localisation is necessary in this DA problem. Because the weight functions of the channels are quite broad, observations cannot be considered local and domain localisation in the vertical direction is inefficient. By contrast, with a reasonable augmented ensemble size ${\hat{N}}_{e}$ , the L²EnSRF yields significantly lower RMSEs. This shows that combining domain localisation in the horizontal direction and covariance localisation in the vertical direction is an adequate approach to assimilate satellite radiances.

The comparison between the truncated svd and the modulation methods is not as simple as it was in the L96 test series. As expected, for a given augmented ensemble size ${\hat{N}}_{e}$ , the truncated svd method yields lower RMSE scores. However, for a given level of RMSE score, using the truncated svd method is not always the fastest approach. For example, the RMSE of the truncated svd method with ${\hat{N}}_{e} = 64$ is approximately the same as the RMSE of the modulation method with ${\hat{N}}_{e} = 128$ , but in this case the modulation method is faster by a factor of 1.5 on average. This can be explained by two factors. First, in the truncated svd method the vertical localisation matrix is not applied in spectral space. Second, both the truncated svd and the modulation methods benefit from parallelisation, but the parallelisation potential of the truncated svd method is not fully exploited here because our computational platform has a limited number of cores. This would change if we could use several threads for each local analysis. Finally, these results confirm that, for high-dimensional DA problems where the memory requirement is an issue, the truncated svd method is the best approach to obtain accurate results while using only a limited augmented ensemble size ${\hat{N}}_{e}$ .

7. Conclusions

Localisation is widely used in DA to make the EnKF viable in high-dimensional systems. In the EnKF, two different approaches can be used to include localisation: domain localisation or covariance localisation. In this article, we have explored possible implementations for covariance localisation in the deterministic EnKF using augmented ensembles in the analysis step. We have discussed the two main difficulties related to augmented ensembles: how to construct the augmented ensemble and how to update the perturbations.

We have used two different methods to construct the augmented ensemble. The first one is based on a factorisation property of the background error covariance matrix. It is already widespread in the geophysical DA literature under the name modulation. For this method, we also introduced a balance refinement in order to smooth some variability between the state variables. As an alternative, we proposed a second method based on randomised svd techniques, which are very efficient when the localisation matrix is easy to apply. The random svd algorithm is commonly used in the statistical literature but it had never been applied in this context. We have called this approach truncated svd method.

We have shown how covariance localisation can be included in the perturbation update using the augmented ensemble framework. The resulting update formula [18] uses linear algebra in the augmented ensemble space. It is included in the generic LEnSRF detailed in this article.

Using numerical simulations of a very simple one-dimensional covariance model with 400 state variables, we have shown that the truncated svd method yields a much more accurate approximation of the background covariance than the modulation method. This result has been confirmed by twin simulations of the one-dimensional L96 model with 400 variables. In a standard DA setup, we have found that the balance between fast computation of the augmented ensemble and fast perturbation update is in favour of the truncated svd method. In other words, for a given level of RMSE score, it is worth spending more time to construct a smaller but more reliable augmented ensemble with the truncated svd method and then use a fast perturbation update.

We have defined the L²EnSRF algorithm as an extension of the LEnSRF suited to assimilate satellite radiances in spatially extended models. It implements domain localisation in the horizontal direction, in a way similar to the LETKF, and covariance localisation in the vertical direction. Such an extension had been discussed by Bishop et al. [19] but without numerical illustration.

Finally, we have constructed a simple multilayer extension of the L96 model, called mL96 model. We have performed twin simulations of this model using a satellite-like observation operator. As expected, the LETKF hardly reconstructs the true state. By contrast, the L²EnSRF yields an estimate of the true state with an acceptable accuracy. We have concluded that using domain localisation in the horizontal direction and covariance localisation in the vertical direction is an adequate approach to assimilate satellite radiances in a spatially extended model. For a given level of RMSE score, the modulation method is the fastest approach in this DA problem. This result is, however, tempered by the fact that our computational setup does not use the full parallelisation potential of the truncated svd method. Moreover, when the augmented ensemble size ${\hat{N}}_{e}$ is limited, the truncated svd method is the best approach to obtain accurate results.

Author Contributions

All authors have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

CEREA is a member of Institut Pierre–Simon Laplace (IPSL). MB acknowledges early discussions with N. Bousserez on the randomised svd techniques.

References

1. Evensen G. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J Geophys Res. (1994) 99:10143–62. doi: 10.1029/94JC00572

2. Houtekamer PL, Mitchell HL, Pellerin G, Buehner M, Charron M, Spacek L, et al. Atmospheric data assimilation with an ensemble Kalman filter: results with real observations. Mon Weather Rev. (2005) 133:604–20. doi: 10.1175/MWR-2864.1

3. Sakov P, Counillon F, Bertino L, Lisæter KA, Oke PR, Korablev A. TOPAZ4: an ocean-sea ice data assimilation system for the North Atlantic and Arctic. Ocean Sci. (2012) 8:633–56. doi: 10.5194/os-8-633-2012

4. Bocquet M, Carrassi A. Four-dimensional ensemble variational data assimilation and the unstable subspace. Tellus B (2017) 69:1304504. doi: 10.1080/16000870.2017.1304504

5. Houtekamer PL, Mitchell HL. A sequential ensemble Kalman filter for atmospheric data assimilation. Mon Weather Rev. (2001) 129:123–37. doi: 10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2

6. Ott E, Hunt BR, Szunyogh I, Zimin AV, Kostelich EJ, Corazza M, et al. A local ensemble Kalman filter for atmospheric data assimilation. Tellus B (2004) 56:415–28. doi: 10.1111/j.1600-0870.2004.00076.x

7. Hamill TM, Whitaker JS, Snyder C. Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon Weather Rev. (2001) 129:2776–90. doi: 10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2

8. Sakov P, Bertino L. Relation between two common localisation methods for the EnKF. Computat Geosci. (2011) 15:225–37. doi: 10.1007/s10596-010-9202-6

9. Campbell WF, Bishop CH, Hodyss D. Vertical covariance localization for satellite radiances in ensemble Kalman filters. Mon Weather Rev. (2010) 138:282–90. doi: 10.1175/2009MWR3017.1

10. Miyoshi T, Sato Y. Assimilating satellite radiances with a local ensemble transform Kalman filter (LETKF) applied to the JMA global model (GSM). SOLA (2007) 3:37–40. doi: 10.2151/sola.2007-010

11. Fertig EJ, Hunt BR, Ott E, Szunyogh I. Assimilating non-local observations with a local ensemble Kalman filter. Tellus B (2007) 59:719–30. doi: 10.1111/j.1600-0870.2007.00260.x

12. Buehner M. Ensemble-derived stationary and flow-dependent background-error covariances: evaluation in a quasi-operational NWP setting. Q J Roy Meteor Soc. (2005) 131:1013–43. doi: 10.1256/qj.04.15

13. Lorenc AC. The potential of the ensemble Kalman filter for NWP-a comparison with 4D-Var. Q J Roy Meteor Soc. (2003) 129:3183–203. doi: 10.1256/qj.02.132

14. Bishop CH, Hodyss D. Ensemble covariances adaptively localized with ECO-RAP. Part 2: a strategy for the atmosphere. Tellus B (2009) 61:97–111. doi: 10.1111/j.1600-0870.2008.00372.x

15. Brankart JM, Cosme E, Testut CE, Brasseur P, Verron J. Efficient local error parameterizations for square root or ensemble Kalman filters: application to a basin-scale ocean turbulent flow. Mon Weather Rev. (2011) 139:474–93. doi: 10.1175/2010MWR3310.1

16. Bishop CH, Hodyss D. Adaptive ensemble covariance localization in ensemble 4D-VAR state estimation. Mon Weather Rev. (2011) 139:1241–55. doi: 10.1175/2010MWR3403.1

17. Leng H, Song J, Lu F, Cao X. A new data assimilation scheme: the space-expanded ensemble localization Kalman filter. Adv Meteorol. (2013) 2013:410812. doi: 10.1155/2013/410812

18. Bocquet M. Localization and the iterative ensemble Kalman smoother. Q J Roy Meteor Soc. (2016) 142:1075–89. doi: 10.1002/qj.2711

19. Bishop CH, Whitaker JS, Lei L. Gain form of the ensemble transform Kalman filter and its relevance to satellite data assimilation with model space ensemble covariance localization. Mon Weather Rev. (2017) 145:4575–92. doi: 10.1175/MWR-D-17-0102.1

20. Kretschmer M, Hunt BR, Ott E. Data assimilation using a climatologically augmented local ensemble transform Kalman filter. Tellus B (2015) 67:26617. doi: 10.3402/tellusa.v67.26617

21. Bishop CH, Etherton BJ, Majumdar SJ. Adaptive sampling with the ensemble transform Kalman filter. Part I: theoretical aspects. Mon Weather Rev. (2001) 129:420–36. doi: 10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2

22. Lorenc AC. Improving ensemble covariances in hybrid variational data assimilation without increasing ensemble size. Q J Roy Meteor Soc. (2017) 143:1062–72. doi: 10.1002/qj.2990

23. Halko N, Martinsson PG, Tropp JA. Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. (2011) 53:217–88. doi: 10.1137/090771806

24. Lorenz EN, Emanuel KA. Optimal sites for supplementary weather observations: simulation with a small model. J Atmos Sci. (1998) 55:399–414. doi: 10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2

25. Evensen G. Data Assimilation: The Ensemble Kalman Filter. Berlin; Heidelberg: Springer-Verlag (2009).

26. Horn RA, Johnson CR. Matrix Analysis. New York, NY: Cambridge University Press (2012).

27. Sakov P, Oke PR. A deterministic formulation of the ensemble Kalman filter: an alternative to ensemble square root filters. Tellus B (2008) 60:361–71. doi: 10.1111/j.1600-0870.2007.00299.x

28. Whitaker JS, Hamill TM. Ensemble data assimilation without perturbed observations. Mon Weather Rev. (2002) 130:1913–24. doi: 10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2

29. Desroziers G, Camino JT, Berre L. 4DEnVar: link with 4D state formulation of variational assimilation and different possible implementations. Q J Roy Meteor Soc. (2014) 140:2097–110. doi: 10.1002/qj.2325

30. Desroziers G, Arbogast É, Berre L. Improving spatial localization in 4DEnVar. Q J Roy Meteor Soc. (2016) 142:3171–85. doi: 10.1002/qj.2898

31. Arbogast É, Desroziers G, Berre L. A parallel implementation of a 4DEnVar ensemble. Q J Roy Meteor Soc. (2017) 143:2073–83. doi: 10.1002/qj.3061

32. Peres-Neto PR, Jackson DA, Somers KM. How many principal components? Stopping rules for determining the number of non-trivial axes revisited. Comput Stat Data Ann. (2005) 49:974–97. doi: 10.1016/j.csda.2004.06.015

33. Gaspari G, Cohn SE. Construction of correlation functions in two and three dimensions. Q J Roy Meteor Soc. (1999) 125:723–57. doi: 10.1002/qj.49712555417

34. Eckart C, Young G. The approximation of one matrix by another of lower rank. Psychometrika (1936) 1:211–8. doi: 10.1007/BF02288367

35. Hunt BR, Kostelich EJ, Szunyogh I. Efficient data assimilation for spatiotemporal chaos: a local ensemble transform Kalman filter. Phys D (2007) 230:112–26. doi: 10.1016/j.physd.2006.11.008

36. Anderson J, Lei L. Empirical localization of observation impact in ensemble Kalman filters. Mon Weather Rev. (2013) 141:4140–53. doi: 10.1175/MWR-D-12-00330.1

37. Penny SG, Behringer DW, Carton JA, Kalnay E. A hybrid global ocean data assimilation system at NCEP. Mon Weather Rev. (2015) 143:4660–77. doi: 10.1175/MWR-D-14-00376.1

Appendix

A. Recentre the Perturbations

Let X be an N_x × (N_e − 1) matrix. We want to construct a matrix that has the same empirical covariance matrix and which is centred. Since X has rank at most N_e − 1, we need to find an N_x × N_e matrix Z such that

\begin{array}{l} Z Z^{T} = X X^{T}, & (A1) \end{array}

\begin{array}{l} Z 1 = 0. & (A2) \end{array}

For ϵ ∈ {−1, 1}, we define $λ = \sqrt{N_{e}} {(\sqrt{N_{e}} - ϵ)}^{- 1}$ and Q_ϵ as the N_e × N_e matrix whose coefficients are

\begin{array}{l} {[Q_{ϵ}]}_{i, j} = \begin{cases} \frac{ϵ}{\sqrt{N_{e}}} & \text{if } i = 1 \text{ or } j = 1, \\ 1 - \frac{λ}{N_{e}} & \text{if } i = j \geq 2, \\ - \frac{λ}{N_{e}} & \text{otherwise}. \end{cases} & (A3) \end{array}

It can be easily checked that $Q_{ϵ} Q_{ϵ}^{T} = I$ (i.e., Q_ϵ is an orthogonal matrix) and that $Q_{ϵ} 1 = ϵ \sqrt{N_{e}} \, {\vec{e}}_{1}$, which is proportional to the first basis vector ${\vec{e}}_{1}$.

Let W be the N_x × N_e matrix whose first column is zero and whose other columns are those of X, that is

\begin{array}{l} W = [0, X] . & (A4) \end{array}

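By construction, Z = WQ_ϵ is a solution of Equations (A1) and (A2). Indeed, $Z Z^{T} = W Q_{ϵ} Q_{ϵ}^{T} W^{T} = W W^{T} = X X^{T}$, and Z1 = WQ_ϵ1 is proportional to the first column of W, which is zero.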
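The construction above can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function names `recentring_matrix` and `recentre` are hypothetical, and the matrix Q_ϵ is built directly from the coefficients of Equation (A3).

```python
import numpy as np


def recentring_matrix(n_e, eps=1):
    """Build the N_e x N_e matrix Q_eps of Equation (A3), with eps in {-1, 1}."""
    lam = np.sqrt(n_e) / (np.sqrt(n_e) - eps)
    # Generic coefficient: -lambda / N_e.
    q = np.full((n_e, n_e), -lam / n_e)
    # Diagonal coefficients for i = j >= 2 (indices 1..N_e-1 here): 1 - lambda / N_e.
    idx = np.arange(1, n_e)
    q[idx, idx] += 1.0
    # First row and first column: eps / sqrt(N_e).
    q[0, :] = eps / np.sqrt(n_e)
    q[:, 0] = eps / np.sqrt(n_e)
    return q


def recentre(x, eps=1):
    """Map an N_x x (N_e - 1) matrix X to a centred N_x x N_e matrix Z = [0, X] Q_eps."""
    n_x, m = x.shape
    w = np.hstack([np.zeros((n_x, 1)), x])  # Equation (A4)
    return w @ recentring_matrix(m + 1, eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((5, 3))
    z = recentre(x)
    print(np.allclose(z @ z.T, x @ x.T))     # Equation (A1)
    print(np.allclose(z.sum(axis=1), 0.0))   # Equation (A2)
```

Both values of ϵ give a valid solution; the choice only changes the rotation applied to the columns.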

B. A Random svd Algorithm

B.1. The Algorithm

Algorithm 2 describes the random svd algorithm proposed by Halko et al. [23]. The objective of this algorithm is to efficiently compute an approximate truncated svd with P columns of the M × N matrix A as a parallelisable alternative to Lanczos techniques.

The random svd algorithm is based on two ideas. First, suppose that there is a matrix Q with P orthonormal columns which approximates the range of the matrix A. In other words A ≈ QQ^T A. Then, an approximate truncated svd can be obtained for A using the svd of the smaller matrix Q^T A. Second, the matrix Q can be constructed using random draws. Indeed, if $X = \{x_{1}, \dots, x_{P}\}$ is a set of random vectors, then it is most likely a linearly independent set. Therefore, the set $Y = \{A x_{1}, \dots, A x_{P}\}$ is most likely linearly independent, which means that it spans the range of A.

One major contribution of Halko et al. [23] and the references therein is that they have provided a mathematical justification of these ideas. In particular, they have given statistical performance bounds for the random svd algorithm and emphasised the fact that, on average, the (spectral or Frobenius) error of the resulting truncated svd with P columns should be close to the minimal error for a truncated svd with P columns.

Finally, Halko et al. [23] have introduced two elements to improve the numerical stability and efficiency of the random svd algorithm. The first element is a loop over i ∈ {1…q}, which forces the algorithm to construct singular vectors of (AA^T)^q A instead of A. However, (AA^T)^q A and A share the same singular vectors. Moreover, the singular values of (AA^T)^q A decay faster than those of A, which means that this technique enables a better approximation of the decomposition as shown by Corollary 10.10 of Halko et al. [23]. The second element is to include QR factorisations to make the algorithm less vulnerable to round-off errors. Both elements have been taken into account in Algorithm 2.

Algorithm 2. Random svd algorithm
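The ingredients described above (random draws, a loop over q power iterations, and QR factorisations) can be sketched in NumPy as follows. This is an illustrative sketch in the spirit of Halko et al. [23], not a transcription of Algorithm 2: the function name `random_svd`, its arguments, and the order of the steps are assumptions, so the step numbers quoted in this appendix do not map onto the lines below. The matrix A is only accessed through two user-supplied maps, `matvec` (B ↦ AB) and `rmatvec` (B ↦ A^T B).

```python
import numpy as np


def random_svd(matvec, rmatvec, n, p, q=2, seed=None):
    """Approximate truncated svd with p columns of an M x N matrix A.

    matvec(B) must return A @ B for an N x p block B,
    rmatvec(B) must return A.T @ B for an M x p block B.
    Returns (U, s, Vt) such that A is approximately U @ diag(s) @ Vt.
    """
    rng = np.random.default_rng(seed)
    # Random draws: the columns of A @ omega most likely span the range of A.
    omega = rng.standard_normal((n, p))
    qmat, _ = np.linalg.qr(matvec(omega))
    # Power iterations: work with (A A^T)^q A, re-orthonormalising with QR
    # at each application to limit round-off errors.
    for _ in range(q):
        qt, _ = np.linalg.qr(rmatvec(qmat))
        qmat, _ = np.linalg.qr(matvec(qt))
    # svd of the small p x N matrix Q^T A, then lift back with Q.
    b = rmatvec(qmat).T
    u_b, s, vt = np.linalg.svd(b, full_matrices=False)
    return qmat @ u_b, s, vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))  # rank 3
    u, s, vt = random_svd(lambda b: a @ b, lambda b: a.T @ b, n=20, p=3, seed=1)
    print(np.linalg.norm(a - (u * s) @ vt))  # close to zero for an exactly rank-3 matrix
```

Each call to `matvec` or `rmatvec` acts on a block of p columns, which is where the column-wise parallelisation mentioned in this appendix would apply.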

B.2. Application to the Prior Covariance Matrix

In section 3.2.4, we need to compute the truncated svd Equation (26) of the prior covariance matrix. To do this, we can apply Algorithm 2 using the input matrix A = ρ∘ (XX^T). The prior covariance matrix is a large N_x × N_x matrix. However, Algorithm 2 can work with the map

\begin{array}{l} \begin{array}{rcl} ℝ^{N_{x}} & \to & ℝ^{N_{x}} \\ v & \mapsto & (ρ \circ (X X^{T})) v, \end{array} & (A5) \end{array}

which can be efficiently computed using Equation (30). Steps 2, 5, 7, and 10 of Algorithm 2 can even be parallelised by applying A = ρ ∘ (XX^T) independently to each column.
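As an illustration, the map (A5) can be applied without ever forming the N_x × N_x matrix. Equation (30) is not reproduced in this appendix; the sketch below assumes it is the standard identity (ρ ∘ (XX^T)) v = Σ_n x_n ∘ (ρ (x_n ∘ v)), where x_n are the columns of X. The function name `apply_localised_cov` and the callable `rho_matvec` (which applies ρ to a block of columns, for instance through a sparse or banded product) are hypothetical.

```python
import numpy as np


def apply_localised_cov(rho_matvec, x, v):
    """Return (rho ∘ (X X^T)) @ v using only products with rho.

    rho_matvec(B) must return rho @ B for an N_x x k block B,
    x is the N_x x N_e matrix X, v is a vector or an N_x x k block.
    """
    v2 = np.asarray(v, dtype=float).reshape(x.shape[0], -1)
    out = np.zeros_like(v2)
    for x_n in x.T:
        # x_n ∘ (rho (x_n ∘ v)), applied to every column of v at once.
        out += x_n[:, None] * rho_matvec(x_n[:, None] * v2)
    return out.reshape(np.shape(v))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_x, n_e = 12, 4
    x = rng.standard_normal((n_x, n_e))
    dist = np.abs(np.subtract.outer(np.arange(n_x), np.arange(n_x)))
    rho = np.exp(-(dist / 3.0) ** 2)  # illustrative localisation matrix
    v = rng.standard_normal(n_x)
    direct = (rho * (x @ x.T)) @ v
    print(np.allclose(apply_localised_cov(lambda b: rho @ b, x, v), direct))
```

The cost is N_e products with ρ per block of columns, so the approach pays off when ρ can be applied cheaply.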

Finally, the approximate truncated svd

\begin{array}{l} A = ρ \circ (X X^{T}) \approx U Σ V^{T}, & (A6) \end{array}

resulting from Algorithm 2 does not necessarily satisfy U = V even though the input matrix is symmetric positive definite. The simplest fix is to make the additional approximation

\begin{array}{l} U Σ V^{T} \approx U Σ U^{T} . & (A7) \end{array}

Keywords: ensemble Kalman filter, covariance localisation, modulation, random svd, satellite radiances, non-local observations

Citation: Farchi A and Bocquet M (2019) On the Efficiency of Covariance Localisation of the Ensemble Kalman Filter Using Augmented Ensembles. Front. Appl. Math. Stat. 5:3. doi: 10.3389/fams.2019.00003

Received: 08 November 2018; Accepted: 14 January 2019;
Published: 26 February 2019.

Edited by:

Ulrich Parlitz, Max-Planck-Institute for Dynamics and Self-Organisation, Max Planck Society (MPG), Germany

Reviewed by:

Peter Jan Van Leeuwen, University of Reading, United Kingdom
Xin Tong, National University of Singapore, Singapore

Copyright © 2019 Farchi and Bocquet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Alban Farchi
