Error Propagation of Capon’s Minimum Variance Estimator

The error propagation of Capon’s minimum variance estimator resulting from measurement errors and position errors is derived within a linear approximation. It turns out, that Capon’s estimator provides the same error propagation as the conventionally used least square fit method. The shape matrix which describes the location depence of the measurement positions is the key parameter for the error propagation, since the condition number of the shape matrix determines how the errors are amplified. Furthermore, the error resulting from a finite number of data samples is derived by regarding Capon’s estimator as a special case of the maximum likelihood estimator.


INTRODUCTION
The reconstruction of model parameters from a given set of measurements is one of the most important tasks in geophysical and space science studies. The measurements are always affected by measurement errors which result in estimation errors for the wanted model parameters. The Capon method [1][2][3], also known as minimum variance distortionless response estimator (MVDR), is currently being considered as a robust inversion method for the analysis of planetary magnetic fields. In the past, the method has successfully been applied to the analysis of waves [1,4] and therefore, specific attention has been paid to errors of the spectrum resulting from random perturbations in the amplitude and phase of sensor arrays [5]. Since Capon's method is based on the evaluation of statistically averaged data, the spectrum is also affected by errors resulting from a finite number of samples [6]. Within the estimation of the frequency-wavenumber spectrum, the difference between static structures and waves or their combination can be discerned through dispersion relation analysis [7][8][9][10][11][12]. Concerning the application of the Capon method for the analysis of planetary magnetic fields, the error propagation of Capon's estimator itself is of major importance for assessing the quality of the reconstructed model parameters. In general, two essential types of errors are expectable. On the one hand, the measured magnetic field data are affected by e.g., offsets, gains resulting from thermal variations and spacecraft magnetic disturbances [13]. On the other hand, the determination of the spacecraft's positions can be defective (measurement position errors), which results in a defective shape matrix. As a follow-up of the generalized derivation of Capon's method [3] and the error estimation of the power spectrum [6], in this work the effects of measurement errors, measurement position errors as well as finite sample sizes on Capon's estimator are considered.

CAPON'S METHOD
Before deducing the error propagation of Capon's method, the main ideas of the method are shortly revisited [2,3]. Due to the complexity of several physical problems, the entire parametrization of experimental data is unrealizable. Thus, it is useful to decompose the measurements B into parametrized parts H g, where g contains the corresponding wanted model coefficients and the shape matrix H describes the distribution of the measurement positions with respect to the underlying model, non-parametrized parts v as well as measurement noise n, so that is valid. The measurement noise is assumed to be Gaussian with variance σ n and zero mean 〈n〉 0 . Since the shape matrix H is not invertible and the non-parametrized parts are unknown, the exact solution for the wanted model coefficients g is not available in general. Capon's method delivers an estimator g C for the ideal solution g. The method is based on the construction of a filter matrix w, that minimizes the output power tr w † M w (2) with respect to the distortionless constraint where tr w † M w is the trace of the matrix w † M w and I is the identity matrix. The matrix M 〈B+B〉 denotes the data covariance matrix. Capon's estimator realizing the minimal output power results in The robustness of the method can be improved by the diagonal loading technique M → M + σ 2 d I, where σ d is the so called diagonal loading parameter [3]. Thus, the estimator depends on the measurements and on the measurement positions so that measurement errors as well as measurement position errors transfer onto the estimator.

ERROR PROPAGATION
In the following, the error propagation of Capon's estimator is deduced. Since it is expectable that the errors are much smaller than the measurements themselves, the error propagation is derived by making use of a linearization. For the approximation of the matrix inversions that are necessary for calculating Capon's estimator, the Neumann series bears of essential meaning for which reason we discuss it in a seperate section.

Neumann Series
The Neumann series is a special case of the functional calculus for linear operators or matrices, respectively, [14] and enables the approximation of matrix inversions.
Let T denote a bounded matrix with norm T < 1. Then, where I is the identity matrix and T k denotes the k'th power of the matrix T. The demand T < 1 guarantees the convergence of the series. Thus, the Neumann series can be understood as a generalization of the geometric series for linear operators. As a direct consequence it follows that For the estimation of Capon's error propagation a generalized formulation of the Neumann series is required. To expand the inverse of the sum of a matrix S and a matrix T, where it is assumed that S is invertible, the sum can be rewritten as If S − 1 T < 1, the Neumann series can be applied to the sum I + S − 1 T resulting in

Measurement Errors
In the following, the error of Capon's estimator resulting from measurement errors is derived. The model for the (temporal) statistically averaged accurate data 〈B〉 without measurement errors is given by where g is the wanted coefficient vector, H denotes the shape matrix which describes the spatial dependence of the measurement positions with respect to the underlying model and 〈v〉 denotes the (temporal) statistically averaged parts of the measurements that are not parametrized by the model H g [3]. The corresponding accurate estimator resulting from Capon's method is given by where denotes the accurate filter matrix composed of the shape matrix H and the data covariance matrix M 〈B+B〉 [3]. The perturbed measurementsB can be rewritten as where δB denotes the measurement error. Because of the linearity of the averaging process, the perturbed averaged measurements 〈B〉 are given by 〈B〉 〈B〉 + 〈δB〉 (13) where 〈δB〉 denotes the statistically averaged measurement error. The corresponding perturbed estimator results iñ wherew † denotes the perturbed filter matrix. Thus, the difference between the accurate and the perturbed estimator results in Within the practical application only the perturbed measurements 〈B〉 are known and thus, only the filter matrix w † is known. For the calculation of the difference between the two filter matrices, in the following a linearized approximation is applied. By means of Eq. 12, the unknown accurate data covariance matrix can be rewritten as whereM 〈B +B 〉. We assume, that the measurement errors are much smaller than the measurements themselves, i.e., δB ≪ B . This assumption is surely justified in the majority of applications. Considering for example the analysis of planetary magnetic fields, the measurement errors are smaller than 1 nT, so that δB / B < 1%. Thus, in the following all terms being quadratic within the errors (e.g., 〈δB+δB〉) will be neglected, so that the data covariance matrix results in In the case of M − 1 ΔM 2 < 1, where . 2 denotes the spectral norm, the application of the Neumann series (Section 3.1) delivers as well as for the unknown coefficient matrix P [3]. Using it follows that where I again denotes the identity matrix. Thus, the difference between the filter matrices can be approximated by Inserting this approximation into Eq. 16 delivers within the linear approximation. Further transformation for estimating the relative error of the estimator delivers Making use of the Cauchy-Schwarz inequality yields where For the calculation of Capon's estimator, the weighted difference the unknown set of model parametersg , resulting in [3]. A discussion about the calculation of the matrix M − 1/2 is given within the Appendix.
Assuming that the weighted model mismatches are neglibigly small compared to the measurement errors, the deviation can be estimated upwards via or equivalently Frontiers in Physics | www.frontiersin.org for the relative error. When the measurements are adequately described by the underlying model, i.e., 〈B〉 ≈ H g C , it follows that Comparing this expression with the relative error of the least square fit estimator [15].
where H + denotes the pseudoinverse of the shape matrix H, shows that the error propagation of Capon's estimator follows the structure of the error propagation of the least square fit estimator. Since Capon's method is based on the evaluation of averaged data, it can be seen that Gaussian errors with a vanishing mean value (i.e., 〈δB〉 0) do not influence the estimator. Making use of the distortionless constraint [3].
where Σ min denotes the smallest singular value of the shape matrix H [15], i.e. 1/Σ min describes the largest singular value of H + . Using the definition of the condition number where Σ max denotes the largest singular value of the shape matrix H, yields Thus, Capon's estimator propagates the measurement errors in the same way as the least square fit estimator. This mathematical property can be interpreted as follows: These two methods differ in the filter matrices H + and w † which weight or eliminate parts of the data in different ways. Since these matrices are applied to the same set of measurements, which is characterized by a given measurement error, the different weighting of the data or the elimination of subsets does not reduce the errors, since the weighted data originate from the same ensemble. The upper bound of the relative estimation error is qualitatively sketched in Figure 1. Furthermore, it should be noted that the condition number of the shape matrix determines how the measurement errors are amplified. Thereby, the condition number depends on the underlying model and the measurement positions, whereas the measurement error is produced by the sensor. For a given underlying model, the condition number solely depends on the measurement positions. Thus, the estimation errors can be reduced by analyzing measurements from suitable data points. Considering the analysis of Mercury's magnetic field, the underlying model describes the geometry of the magnetic field, for example the internal dipole and quadrupole field [2,3]. For the estimation of the corresponding Gauss coefficients [16,17], the data points have to cover the geometry of the field properly. For example, the superposition of Mercury's internal dipole field with the quadrupole field can equivalently be described as a northward shifted dipole field. When only measurement positions in the northern hemisphere are available, the condition number of the shape matrix increases, resulting in large estimation errors so that the internal field can be misinterpreted as a strong dipole field. The analysis of measurement points covering the southern and the northern hemisphere symmetrically, like that of the BepiColombo mission, decreases the condition number and enables a more detailed characterization of the geometry of Mercury's internal magnetic field [18]. The smaller the condition number of the shape matrix becomes, the better the geometry of the field is covered by the data points [18].
Before discussing the errors resulting from the perturbed determination of the measurement positions, let us make a general comment about the error of Capon's estimator resulting from measurement errors. The perturbed data can be rewritten in the form 〈B 〉 〈B〉 + 〈δB〉 H g + 〈v〉 + 〈δB〉 H g + 〈ṽ 〉, (36) where 〈ṽ 〉 〈v〉 + 〈δB〉. Since the filter matrixw † eliminates the non-parametrized parts from the total measured field [3], one might suppose that the measurement errors do not influence the estimation since these errors can as well be interpreted as non-parametrized parts. From the elimination of the non-parametrized parts it does not follow thatw † 〈δB〉 0. Since this term dominates the error of the estimator (cf. Eq. 28), the measurement errors are not eliminated by the filter matrix in general.

Measurement Position Errors
The model for the correctly determined measurement positions is given by The corresponding accurate estimator results in where The perturbed determination of the sensor's positions transfers onto a perturbed shape matrix H. Thus, the underlying model H g and the non-parametrized parts 〈v〉 are perturbed so that the noisy model is given by and the corresponding perturbed estimator results iñ The deviation between the accurate and the perturbed estimator results in which describes the magnetic field of a current sheet superposed with Mercury's internal dipolar field [6], where R M indicates the planetary radius of Mercury and r 1 and r 2 are the measurement positions. The perturbed filter matrix is given byH Assuming that Δr i ≪ r i , for i 1, 2, the perturbed matrix can be rewritten as by making use of a Taylor series expansion. It should be noted that the shape matrix can be linearized for any underlying model by performing a Taylor series expansion for the functions within the model. Linearization of the error terms for the estimation of the unknown accurate filter matrix delivers Making use of the Neumann series results in so that or equivalently Thus, the deviation of the estimator can be approximated via Byw † 〈B〉 g C it follows that and therefore Within the application of Capon's method to Mercury's magnetic field analysis, the first summand on the right hand side of Eq. 54 is of the order of about 10 nT 2 , whereas P ΔH † M − 1/2 ∼ 500 · 10 − 7 nT 2 and thus, the weighted model mismatches are negligibly small compared to the measurement errors, so that the deviation can be estimated upwards via The relative error results in where κ H denotes the condition number of the perturbed shape matrixH . The upper bound for the relative error with respect to the condition number is qualitatively sketched in propagation of Capon's estimator again equals the error propagation of the least square fit estimator [15].

Measurement Errors and Measurement Position Errors
Within the former sections the influences of measurement errors and measurement position errors have been discussed separately. Within the practical application it is expectable that both errors occur simultanously. The relative error resulting from the noisy measurements and the perturbed measurement positions is given by the quadratic sum of the two cases discussed above This can be derived as follows: The accurate model is given by so that the corresponding accurate estimator results in The measured data are not affected by the perturbed determination of the sensor's positions. The data are solely affected by measurement errors 〈δB〉 so that the perturbed model can be written as The corresponding perturbed estimator results iñ and M M + ΔM. The noisy shape matrix can again be written as H H + ΔH within a linear approximation. Thus, the deviation between the accurate and the perturbed estimator is given by By making use of the Neumann series, the unknown filter matrix w † can be rewritten within a linear approximation resulting in Using FIGURE 2 | Sketch of the upper bound for the relative estimation error resulting from measurement position errors with respect to the condition number κ H of the shape matrix.
Frontiers in Physics | www.frontiersin.org June 2021 | Volume 9 | Article 684410 6 and again making use of the Neumann series delivers (67) Taking into account that w † 〈δB〉 w † 〈δB〉 (68) within the linear approximation as well as the deviation between the estimators can be approximated by and thus, Assuming that 〈B 〉 ≈Hg C the relative error can be estimated by It is important to note that the right sides of Eqs. 35, 56 and 73 represent upper bounds for the error of the estimator. These bounds can be much larger than the true errors. The great advantage of the bounds lies in the fact that they solely depend on known quantities, wheras the accurate estimator for calculating the true estimation error is unknown within the practical application of the method. Thus, the above derived upper bounds are conservative and guarantee that the true estimation errors cannot exceed these bounds as long as the measurement errors as well as the measurement position errors are much smaller than the measurements themselves so that the linear approximation is valid.

FINITE SAMPLE AVERAGING
For the calculation of Capon's estimator, averaged magnetic field data are required [3]. Here, Q denotes the number of measurements at a fixed set of data points. Within the practical application of the method only a finite number Q < ∞ of samples is available, which yields a standard deviation of Capon's estimator g C . In the following, an approximation for the variance of Capon's estimator is derived by regarding Capon's method as a special case of the maximum likelihood method [6].
In the vicinity of its maximum value the likelihood function can be approximated as where N denotes the noise covariance matrix [6,19]. Before deducing the error resulting from the finite number of samples let us make a general comment about the construction of the likelihood function in terms of the averaging process. The noise matrix is already statistically evaluated so that the likelihood function is constructed by making use of two different averaging processes. Within the first averaging process the noise matrix N is calculated. This averaging process does not incorporate the underlying model since only the distribution of the data around the mean value 〈B〉 is determined which allows to assess the general quality of the measurements. When the distribution of the measurements is large, significant errors within the subsequent data fitting are expectable. Especially, these errors are independent of the chosen model. Within the second averaging process a specific underlying model H g is fitted to the data B α . For most practical applications, the noise matrix N is unknown and has to be approximated. In the special case of Gaussian errors with zero mean and variance σ n , the noise covariance matrix can be written as N σ 2 n I, where I denotes the identity matrix. Substituting the noise covariance matrix N by the data covariance matrix M 〈B+B〉, the maximum likelihood estimator may be converted into Capon's estimator [6]. The corresponding likelihood function is modified to The weighted difference between the model and the data can be rewritten as As discussed above, the data covariance matrix M is already statistically evaluated so that the averaging process results in Insertion into Eq. 76 yields Thus, the error of the m'th component of Capon's estimator results in which declines as 1/ Q √ . The functional dependence of the error is qualitatively illustrated in Figure 3.

SUMMARY AND OUTLOOK
The analysis of the error propagation is of major importance for the application of linear inversion methods. Within a linear approximation, upper bounds for the errors of Capon's estimator resulting from measurement errors and measurement position errors are derived. These upper bounds solely depend on known quantities, i.e., measurements and measurement positions, whereas the true estimation error cannot be calculated within the practical application of the method, since the accurate estimator is unavailable. It turns out that Capon's method provides the same error propagation as the least square fit method. These two methods differ in the filter matrices which weight or eliminate parts of the data in different ways. Since these matrices are applied to the same set of measurements, the different weighting of the data or the elimination of subsets does not reduce the errors. The condition number of the shape matrix is the key parameter for the error propagation, since it determines how the errors are amplified. The measurement errors as well as the measurement position errors have to be estimated from the measurements. For a given underlying model, the condition number of the shape matrix solely depends on the measurement positions. Thus, the amplification of the errors can be reduced by choosing preferred data points.
Furthermore, Capon's method is based on the evaluation of statistically averaged data. For the practical application of the method only a finite number Q of samples is available. This limited number of samples results in an error of Capon's estimator which can be derived by regarding Capon's method as a special case of the maximum likelihood estimator. The more samples are available, the smaller the error becomes, since the error declines as 1/ Q √ . As a follow-up of the generalized derivation of Capon's method, the present work establishes the mathematical basis of Capon's error propagation for the practical application of the method.
Appendix: Matrix Power of the Data Covariance Matrix