Frontiers journals are at the top of citation and impact metrics

Original Research ARTICLE

Front. Appl. Math. Stat., 29 February 2016 | https://doi.org/10.3389/fams.2016.00001

Efficient Spectral Estimation by MUSIC and ESPRIT with Application to Sparse FFT

Daniel Potts1*, Manfred Tasche2 and Toni Volkmer1
  • 1Faculty of Mathematics, Technische Universität Chemnitz, Chemnitz, Germany
  • 2Institute of Mathematics, University of Rostock, Rostock, Germany

In spectral estimation, one has to determine all parameters of an exponential sum for finitely many (noisy) sampled data of this exponential sum. Frequently used methods for spectral estimation are MUSIC (MUltiple SIgnal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Technique). For a trigonometric polynomial of large sparsity, we present a new sparse fast Fourier transform by shifted sampling and using MUSIC resp. ESPRIT, where the ESPRIT based method has lower computational cost. Later this technique is extended to a new reconstruction of a multivariate trigonometric polynomial of large sparsity for given (noisy) values sampled on a reconstructing rank-1 lattice. Numerical experiments illustrate the high performance of these procedures.

1. Introduction

The problem of spectral estimation resp. frequency analysis arises quite often in signal processing, electrical engineering, and mathematical physics (see e.g., the books [1, 2] or the survey [3]) and reads as follows:

(P1) Recover the positive integer M, the distinct frequencies φj(-12,12], and the complex coefficients cj ≠ 0 (j = 1, …, M) in the exponential sum of sparsity M

h(x):=j=1Mcje2πiφjx(x),    (1.1)

if noisy sampled data h~k:=h(k)+ek (k = 0, …, N − 1) with N ≥ 2M are given, where ek ∈ ℂ are small error terms with |ek|110minj|cj|.

Introducing so-called left/right signal spaces and noise spaces in Section 2, we explain the numerical solution of the problem (P1) by the MUSIC method (created by Schmidt [4], see also Manolakis et al. [1, Section 9.6.3] and the references therein) and the ESPRIT method (created by Roy and Kailath [5], see also Manolakis et al. [1, Section 9.6.5], Stoica and Moses [2, Chapter 4] and the references therein). In a new unified approach to MUSIC and ESPRIT, we show that both methods are based on singular value decomposition (SVD) of a rectangular Hankel matrix of given sampled data. For the MUSIC and ESPRIT method, it is important to choose the window length L (number of rows) of the rectangular Hankel matrix in an optimal way. Based on Theorem 2.5, where we estimate the singular values and the spectral norm condition number of the rectangular Hankel matrix of noiseless data, one can see that LN2 is a good choice. By the right choice of L, one can detect the correct sparsity M of (1.1) and avoid the computation of spurious frequencies.

The main disadvantages of MUSIC and ESPRIT are the relatively high computational cost in the case of large sparsity M, caused mainly by the SVD. The known algorithms for MUSIC and ESPRIT have moderate computational cost only for small sparsity M. Thus, following question arises: How can one improve the MUSIC and ESPRIT methods for spectral estimation of exponential sums with large sparsity M? In this paper, we show that this is possible by a special divide-and-conquer technique. In the numerical examples of Section 5, the cases M = 256 and M = 1024 are handled.

The computational cost of an algorithm is measured in the number of arithmetical operations, where all operations are counted equally. Often the computational cost of an algorithm is reduced to the leading term, i.e., all lower order terms are omitted. For a unified approach to Prony–like methods for the parameter estimation of (1.1), namely the classical Prony method, the matrix pencil method [6, 7], and the ESPRIT method [5, 8], we refer also to Potts and Tasche [9] and the references therein. For a survey of the most successful methods for the data fitting problem with linear combinations of complex exponentials, we refer to Pereyra and Scherer [10].

Section 3 is the core of this paper. Here we present a new efficient spectral estimation with low computational cost for large sparsity M and a moderate number of given samples, if one has to recover a trigonometric polynomial of large sparsity M. This means we specialize the problem (P1). Let S > 0 be a large even integer. Assume that φj=ωjS(-12,12], where ωj(-S2,S2]. Replacing the variable x by Sx in (1.1), we consider the 1-periodic trigonometric polynomial of sparsity M

g(x):=h(Sx):=j=1Mcje2πiωjx(x)    (1.2)

with integer frequencies ωj(-S2,S2]. Consequently we investigate the following spectral estimation problem:

(P2) Recover the sparsity M ∈ ℕ, all integer frequencies ωj(-S2,S2] as well as all non-zero coefficients cj ∈ ℂ of the trigonometric polynomial (1.2) for noisy sampled values g~k :=g(kS)+ek=h(k)+ek (k = 0, …, N − 1) with N ≥ 2M, where ek ∈ ℂ are small error terms with |ek|110minj|cj|. Often one considers the modified problem (P2*), where the sparsity M is known.

A numerical solution of problem (P2) or (P2*) with low computational cost is called sparse fast Fourier transform (sparse FFT). Using divide-and-conquer technique, the trigonometric polynomial (1.2) of large sparsity M is split into some trigonometric polynomials of lower sparsity and corresponding samples are determined by fast Fourier transform (FFT). Here we borrow an idea from sparse FFT in Lawlor et al. [11] and Christlieb et al. [12] and use shifted sampling of (1.2), i.e., equidistant sampling with few equidistant shifts. Then the trigonometric polynomials of lower sparsity can be recovered by MUSIC resp. ESPRIT. The computational cost of the new sparse FFT is analyzed too.

A similar splitting technique is suggested in Lawlor et al. [11] and Christlieb et al. [12], but with a different method to detect frequencies, when aliasing between two or more frequencies occurs. The method in Lawlor et al. [11] and Christlieb et al. [12] follows an idea of Iwen [13], which is based on the Chinese Remainder Theorem, see also Ben-Or and Tiwari [14]. A different method for the sparse FFT, based on efficient filters is suggested in Hassanieh et al. [15] and Gilbert et al. [16]. We remark that there are two types of methods, deterministic (see [13]) and randomized (see [15, 11, 16]). Further related randomized methods based on compressed sensing can be found in the papers [17, 18, 19] and in the monograph [20]. Please note that the sparse FFT methods mentioned before solve the problem (P2*), i.e., one assumes that the sparsity (or an upper bound) is known, whereas our new deterministic sparse FFT also detects the sparsity M. We remark that preliminary tests of the implementation [21, 22] of the sfft version 3 algorithm [15] suggest that this method also works if the sparsity input parameter is chosen larger than the actual sparsity of the signal. For further references on sparse FFTs, we refer to Remark 3.1.

In Section 4, we extend our method to a new reconstruction of multivariate trigonometric polynomials of large sparsity, where sampled data on a convenient rank-1 lattice are given. In Section 5, several numerical experiments with noiseless resp. noisy sampled data illustrate the high performance of the sparse FFT as proposed in Section 3. Note that in the case of successful recovery of the sparse trigonometric polynomial (1.2) all frequencies are correctly detected. For the modified sparse FFT of Section 4, numerical examples for the reconstruction of six-variate trigonometric polynomials of sparsity 256 are given too. Moreover, we compare our results with preliminary tests of the implementation [21, 22] of the sfft version 3 algorithm [15].

In summary we present a splitting method, in between the well-known methods ESPRIT, MUSIC, and FFT for the problem (P2) with the parameters in Table 1. For the results, see the Tables 3, 4 in Section 3. Furthermore, we use a reconstructing rank-1 lattice in order to reconstruct multivariate trigonometric polynomials, see Section 4. Here in the case of successful recovery of a sparse multivariate trigonometric polynomial, all frequency vectors are detected without errors.

TABLE 1
www.frontiersin.org

Table 1. Numbers of required samples and computational costs for ESPRIT, MUSIC, and FFT, where M denotes the sparsity of (1.1) and S is a large even integer so that all frequencies φj are of the form ωjS with ωj(-S2,S2].

2. Reconstruction of Exponential Sums

The main difficulty is the recovery of the frequency set Φ: = {φ1, …, φM} in (1.1).

We introduce the rectangular Fourier–type matrix FN,M :=(e2πiφj(k1))k,j=1N,M. Note that FN, M coincides with the rectangular Vandermonde matrix VN,M(z) :=(zjk-1)k,j=1N,M with z :=(zj)j=1M, where zj :=e2πiφj (j = 1, …, M) are distinct nodes on the unit circle. Then the spectral estimation problem can be formulated in following matrix-vector form

VN,M(z)c=h˜,    (2.1)

where h~ :=(h~k)k=0N-1 is the vector of noisy sampled data and c :=(cj)j=1M the vector of complex coefficients of (1.1).

Under the natural assumption that the nodes zj (j = 1, …, M) are well-separated on the unit circle, it can be shown that FP, M has a uniformly bounded spectral norm condition number for sufficiently large integer P > M.

Theorem 2.1 (see [23, Theorem 2]) Assume that the frequencies φj(-12,12] (j = 1, …, M) are well-separated by the separation distance

q:=minj(minn|φj+nφ|)>0

and that P>max{M,2π+1q}.

Then the discrete Ingham inequalities related to FP, M indicate that for all x ∈ ℂM

α1(P)x22FP,Mx22α2(P)x22    (2.2)

with α1(P) :=P(2π2πP2q24P) and

α2(P):={P(42π+2πP2q2+32P)for evenP,(P+1)(42π+2π(P+1)2q2+32P+1)for oddP.

Furthermore, the rectangular Fourier–type matrix FP, M has a uniformly bounded spectral norm condition number

cond2FP,Mα2(P)α1(P).

Proof. The assumption P>2π+1q is sufficient for the gap condition

q>1P(12πP)1/2    (2.3)

to hold. The gap condition (2.3) ensures that α1(P) > 0. For a proof of the discrete Ingham inequalities (2.2) under the gap condition (2.3) see Liao and Fannjiang [23, Theorem 2]. Let λ1 ≥ … ≥ λM > 0 be the ordered eigenvalues of FP,M*FP,MCM×M. Using the Raleigh–Ritz Theorem and (2.2), we obtain that for all x ∈ ℂM

α1(P)x22λMx22FP,Mx22λ1x22α2(P)x22

and hence

0<α1(P)λMλ1α2(P)<.    (2.4)

Thus, FP,M*FP,M is positive definite and

cond2FP,M=λ1λMα2(P)α1(P).

This completes the proof. 

Corollary 2.2 Assume that the frequencies φj(-12,12] (j = 1, …, M) are well-separated by the separation distance q > 0 and that P>max{M,2π+1q}.

Then the discrete Ingham inequalities related to FP,MT indicate that for all y ∈ ℂP

α1(P)y22FP,MTy22α2(P)y22.    (2.5)

Proof. The matrices FP, M and FP,MT possess the same singular values λj (j = 1, …, M). By the Rayleigh–Ritz Theorem we obtain that

λMy22FP,MTy22λ1y22

for all y ∈ ℂP. Applying (2.4), we obtain the discrete Ingham inequalities (2.5). 

Remark 2.3 The Riesz stability of the exponentials e2πiφjx (j = 1, …, M) in the Hilbert space 2(N) follows immediately from the discrete Ingham inequalities (2.2), where ℤN : = {0, …, N − 1} denotes the sampling grid. If the assumptions of Theorem 2.1 are fulfilled for P = N, then the exponentials e2πiφjx (j = 1, …, M) are Riesz stable with respect to the discrete norm of 2(N), i.e.,

α1(N)c22k=0N1|h(k)|2α2(N)c22

for all exponential sums (1.1) with arbitrary coefficient vectors c=(cj)j=1MM. Note that the Riesz stability of these exponentials related to continuous norms was formerly discussed and applied in spectral estimation in Peter et al. [24] and Potts and Tasche [25]. 

In practice, the sparsity M of the exponential sum (1.1) is often unknown. Assume that L ∈ ℕ is a convenient upper bound of M with MLNM + 1. In applications, such an upper bound L is mostly known a priori. If this is not the case, then one can choose LN2. As mentioned in Remark 2.6, the choice LN2 is optimal in some sense. Often the sequence {h~0,h~1,,h~N-1} of sampled data is called a time series of length N. Then we form the L-trajectory matrix of this time series

H˜L,NL+1:=(h˜+m),m=0L1,NL    (2.6)

with the window length L ∈ {M, …, NM + 1}. Analogously, we define the L-trajectory matrix of noiseless data

HL,NL+1:=(h(+m)),m=0L1,NL.    (2.7)

Obviously, (2.6) and (2.7) are L×(NL+1) Hankel matrices. For simplicity, we consider mainly the noiseless case, i.e. h~k=h(k) (k = 0, …, N − 1).

The main step in the solution of the frequency analysis problem (P1) is the determination of the sparsity M and the computation of the frequencies φj or alternatively of the nodes zj=e2πiφj (j = 1, …, M). Afterwards one can calculate the coefficient vector c ∈ ℂM as least squares solution of the overdetermined linear system (2.1), i.e., the coefficient vector c is the solution of the least squares problem

mincMVN,M(z)c(h˜k)k=0N12.

We denote square matrices with only one index and refer to the well known fact that the square Vandermonde matrix VM(z) is invertible and the matrix VN, L(z) with L ∈ {M, …, NM + 1} has full column rank. Additionally we introduce the rectangular Hankel matrices

H˜L,NL(s):=H˜L,NL+1(1:L, 1+s:NL+s)(s=0, 1).    (2.8)

In the case of noiseless data h~k=h(k) (k = 0, …, N − 1), the related Hankel matrices (2.8) are denoted by HL, NL(s) (s = 0, 1).

Remark 2.4 The Hankel matrix HL, NL + 1 of noiseless data has the rank M for each window length L ∈ {M, …, NM + 1} and the related Hankel matrices HL, NL(s) (s = 0, 1) possess the same rank M for each window length L ∈ {M, …, NM} (see [9, Lemma 2.1]). Consequently, the sparsity M of the exponential sum (1.1) coincides with the rank of these Hankel matrices. 

By the Vandermonde decomposition of the Hankel matrix HL, NL + 1 we obtain that

HL,NL+1=VL,M(z)(diag c)(VNL+1,M(z))T.    (2.9)

Under mild conditions, the Hankel matrix HL, NL + 1 of noiseless data has a bounded spectral norm condition number too.

Theorem 2.5 Let L, N ∈ ℕ with MLNM + 1 and min{L,N-L+1}>2π+1q be given. Assume that the frequencies φj(-12,12] (j = 1, …, M) are well-separated by the separation distance q > 0 and that the non-zero coefficients cj (j = 1, …, M) of the exponential sum (1.1) fulfill the condition

0<γ1|cj|γ2<(j=1,,M).    (2.10)

Then for all y ∈ ℂNL + 1

γ12α1(L)α1(NL+1)y22HL,NL+1y22       γ22α2(L)α2(NL+1)y22.    (2.11)

Further, the lowest resp. largest positive singular value of HL, NL + 1 can be estimated by

0<γ1α1(L)α1(NL+1)σMσ1  γ2α2(L)α2(NL+1).    (2.12)

The spectral norm condition number of HL, NL + 1 is bounded by

cond2HL,NL+1γ2γ1α2(L)α2(NL+1)α1(L)α1(NL+1).    (2.13)

Proof. By the Vandermonde decomposition (2.9) of the Hankel matrix HL, NL + 1, we obtain that for all y ∈ ℂNL + 1

HL,NL+1y22=FL,M(diag c)FNL+1,MTy22.

By the discrete Ingham inequalities (2.2) and the assumption (2.10), it follows that

γ12α1(L)FNL+1,MTy22HL,NL+1y22γ22α2(NL+1)                              FNL+1,MTy22.

Using the discrete Ingham inequalities (2.5), we obtain the estimates (2.11). Finally, the estimates of the lowest resp. largest positive singular value and the spectral norm condition number of HL, NL + 1 arise from (2.11) and the Rayleigh–Ritz Theorem.

Remark 2.6 For fixed N, the positive singular values as well as the spectral norm condition number of the Hankel matrix HL, NL + 1 depend strongly on L ∈ {M, …, NM + 1}. A good criterion for the choice of optimal window length L is to maximize the lowest positive singular value σM of HL, NL + 1. It was shown in Potts and Tasche [submitted, Lemma 3.1 and Remark 3.3] that the squared singular values increase almost monotonously for L=M,,N2 and decrease almost monotonously for L=N2,,N-M+1. Note that the lower bound (2.12) of the lowest positive singular value σM is maximal for LN2. Further the upper bound (2.13) of the spectral norm condition number of (2.7) is minimal for LN2. Therefore, we prefer to choose LN2 as optimal window length. Thus, we can ensure that σM > 0 is not too small. This property is decisively for the correct detection of the sparsity M in the first step of the MUSIC resp. ESPRIT Algorithm 2.8 resp. 2.9. 

The ranges of HL, NL + 1 and VL, M(z) coincide in the noiseless case with MLNM + 1 by (2.9). If L > M, then the range of VL, M(z) is a proper subspace of ℂL. This subspace is called left signal space SL. The left signal space SL is of dimension M and is generated by the M columns eLj) (j = 1, …, M), where

eL(φ):=(e2πiφ)=0L1(φ[12,12]).

Note that ||eL(φ)||2=L for each φ[-12,12]. The left noise space NL is defined as the orthogonal complement of SL in ℂL. The dimension of NL is equal to LM.

Remark 2.7 Let ML < NM + 1 be given. If we use HL,N-L+1* instead of HL, NL + 1, then we can define the right signal space as the range of VN-L,M(z̄), where z̄ denotes the complex conjugate of z. The right signal space is an M-dimensional subspace of ℂNL + 1 and is generated by the M linearly independent vectors eNL + 1 (− φj) (j = 1, …, M). Then the corresponding right noise space is the orthogonal complement of the right signal space in ℂNL + 1

By QL we denote the orthogonal projection onto the left noise space NL. Since eLj) ∈ SL (j = 1, …, M) and NLSL, we obtain that

QLeL(φj)=0(j=1,,M).

If the number φ(-12,12)\Φ, then the vectors eL(φ1),,eL(φM),eL(φ)L are linearly independent, since the square Vandermonde matrix

(eL(φ1)||eL(φM)|eL(φ))(1:M+1,1:M+1)

is invertible for each LM + 1. Hence eL(φ) ∉ SL = span{eL1), …, eLM)}, i.e. QLeL(φ) ≠ 0. Thus, the frequency set Φ can be determined via the zeros of the left noise-space correlation function

NL(φ):=1LQLeL(φ)2(φ(12,12]),

since NLj) = 0 for each j = 1, …, M and 0 < NL(φ) ≤ 1 for all φ(-12,12]\Φ, where QLeL(φ) can be computed on an equispaced fine grid. Alternatively, one can seek the peaks of the left imaging function

JL(φ):=LQLeL(φ)21(φ(12,12]).

In this approach, we prefer the zeros resp. the lowest local minima of the left noise-space correlation function NL(φ).

In the next step we determine the orthogonal projection QL onto the left noise space NL. Here we can use the SVD or the QR decomposition of the L-trajectory matrix HL, NL + 1. For an application of QR decomposition see Potts and Tasche [9]. Applying SVD, we obtain that

HL,NL+1=ULDL,NL+1WNL+1,

where ULL×L and WN-L+1(N-L+1)×(N-L+1) are unitary matrices and where DL,N-L+1L×(N-L+1) is a rectangular diagonal matrix. The diagonal entries of DL, NL+1 are the singular values σj of the L-trajectory matrix arranged in non-increasing order σ1 ≥ … ≥ σM > σM + 1 = … = σmin{L, NL + 1} = 0. Thus, we can determine the sparsity M of the exponential sum (1.1) by the number of positive singular values σj.

Introducing the matrices

UL,M(1):=UL(1:L,1:M),UL,LM(2):=UL(1:L,M+1:L)

with orthonormal columns, we see that the columns of UL,M(1) form an orthonormal basis of SL and that the columns of UL,L-M(2) are an orthonormal basis of NL. Hence the orthogonal projection onto the left noise space NL has the form

QL=UL,LM(2)(UL,LM(2)).

Consequently, we obtain that

QLeL(φ)22=QLeL(φ),QLeL(φ)=(QL)2eL(φ),eL(φ)      =QLeL(φ),eL(φ)      =(UL,LM(2))eL(φ),(UL,LM(2))eL(φ)      =(UL,LM(2))eL(φ)22.

Hence the left noise-space correlation function can be represented by

NL(φ)=1L(UL,LM(2))eL(φ)2(φ(12,12]).

In MUSIC, one determines the lowest local minima of the left noise-space correlation function, see e.g., [1, 4, 26, 27].

Algorithm 2.8 (MUSIC via SVD)

Input: N ∈ ℕ (N ≥ 2M) number of samples, LN2 window length, h~k=h(k)+ek (k = 0, …, N − 1) noisy sampled values of (1.1), 0 < ε ≪ 1 tolerance.

1. Compute the SVD of the rectangular Hankel matrix H~L,N-L+1=U~LD~L,N-L+1W~N-L+1* from (2.6), where the singular values σ~ (ℓ = 1, …, min{L, NL + 1}) are arranged in non-increasing order. Determine the numerical rank M of (2.6) such that σ~Mεσ~1 and σ~M+1<εσ~1. Form the matrix U~L,L-M(2)=U~L(1:L,M+1:L).

2. Calculate the left noise-space correlation function ÑL(φ) :=1L||(U~L,L-M(2))*eL(φ)||2 on an equispaced grid {-12+1S,,12-1S,12} for sufficiently large S.

3. The M lowest local minima of ÑL(2k-S2S) (k = 1, …, S) form the frequency set Φ~ :={φ~1,,φ~M}. Set z~j :=e2πiφ~j (j = 1, …, M).

4. Compute the coefficient vector c~ :=(c~j)j=1MM as solution of the least squares problem

minc˜MVN,M(z˜)c˜(h˜k)k=0N12,

where z~ :=(z~j)j=1M denotes the vector of computed nodes.

Output: M ∈ ℕ sparsity, φ~j(-12,12] frequencies, c~j coefficients (j = 1, …, M).

Let L, N ∈ ℕ with M < LNM + 1 be given. For noisy sampled data h~k=h(k)+ek (k = 0, …, N − 1), the MUSIC Algorithm 2.8 is relatively insensitive to small perturbations on the data (see [23, Theorem 3]).

In opposite to the MUSIC Algorithm 2.8, the following ESPRIT Algorithm is based on orthogonal projection onto a right signal space. For details see [5, 9, submitted].

Algorithm 2.9 (ESPRIT via SVD)

Input: N ∈ ℕ (N ≥ 2M) number of samples, LN2 window length, h~k=h(k)+ek (k = 0, …, N − 1) noisy sampled values of (1.1), 0 < ε≪1 tolerance.

1. Compute the SVD of the rectangular Hankel matrix H~L,N-L+1=U~LD~L,N-L+1W~N-L+1* from (2.6), where the singular values σ~ (ℓ = 1, …, min{L, NL + 1}) are arranged in non-increasing order. Determine the numerical rank M of (2.6) such that σ~Mεσ~1 and σ~M+1<εσ~1. Form the submatrices

W˜NL,M(s):=W˜NL+1(1+s:NL+s,1:M)(s=0,1).

2. Calculate the matrix

F˜M:=W˜NL,M(1)(W˜NL,M(0)),

where (W˜NL,M(0)*) denotes the Moore–Penrose pseudoinverse.

3. Determine all eigenvalues zj (j = 1, …, M) of F~M. Set

φ˜j:=12πArgzj|zj|(12,12](j=1,,M),

where Argz ∈ (−π, π] means the principal value of the argument of z ∈ ℂ\{0}.

4. Compute the coefficient vector c~ :=(c~j)j=1MM as solution of the least squares problem

minc˜MVN,M(z˜)c˜(h˜k)k=0N12,

where z~ :=(z~j)j=1M denotes the vector of computed nodes z~j :=e2πiφ~j.

Output: M ∈ ℕ sparsity, φ~j(-12,12] frequencies, c~j coefficients (j = 1, …, M).

Note that in Algorithms 2.8 and 2.9 the tolerance can be theoretically chosen as ε=(2cond2HL,N-L+1)-1. Then by Theorem 2.5, the tolerance ε is not too small for LN2. A simple procedure for the practical choice of ε is described in the Example 5.1. By the right choice of the window length LN2, we can recover the correct sparsity M of (1.1) and avoid the determination of spurious frequencies.

The numbers of required samples and the computational costs of the Algorithms 2.8 and 2.9 are listed in Table 2. Thus, the main disadvantages of these algorithms are the high computational costs for large sparsity M, caused mainly by the SVD. Therefore, in Potts and Tasche [28], we have suggested to use a partial SVD (based on partial Lanczos bidiagonalization) instead of a complete SVD. For both Algorithms 2.8 and 2.9, one needs too many operations in the case of large sparsity M, see Table 2.

TABLE 2
www.frontiersin.org

Table 2. Computational costs for the Algorithms 2.8 and 2.9 in the case of N given samples, where S is a large even integer so that all frequencies φj are of the form ωjS with ωj(-S2,S2].

3. Sparse Fast Fourier Transform

In this section, we apply Algorithm 2.8 (MUSIC) resp. Algorithm 2.9 (ESPRIT) to the reconstruction of sparse trigonometric polynomials. Clearly, one can approximate the unknown frequencies φj of the exponential sum (1.1) by fractions. Therefore, we assume that the unknown frequencies φj of (1.1) are fractions ωjS with ωj(-S2,S2], where S is a large even integer. Replacing the variable x by Sx in (1.1), we obtain the new exponential sum (1.2). Then (1.2) is a 1-periodic trigonometric polynomial with sparsity M. Consequently we consider the spectral estimation problem (P2) as mentioned in Section 1. A fast algorithm, which solves the problem (P2) or (P2*), is called sparse fast Fourier transform (sparse FFT). In recent years many sublinear algorithms for sparse FFTs were proposed, see Section 1 and Remark 3.1.

In the following, we propose a new deterministic sparse FFT for solving the problem (P2) of a trigonometric polynomial (1.2) with large sparsity M. Using divide–and–conquer technique, we split the trigonometric polynomial (1.2) of large sparsity M into some trigonometric polynomials of lower sparsity and determine corresponding samples. Here we borrow an idea from sparse FFT in Christlieb et al. [12] and use shifted sampling of (1.2). For a positive integer PS and a parameter K ∈ ℕ, K ≥ 2, we construct a discrete array of samples of size P×(2K + 1) via

gP[s,k]:=g(sP+kS),(s=0,,P1;k=0,,2K).    (3.1)

For each k = 0, …, 2K we form the discrete Fourier transform (DFT) of length P and obtain

g^P[,k]:=s=0P1gP[s,k]e2πis/P(=0,,P1).    (3.2)

The fast Fourier transform (FFT) allows the rapid computation of this DFT of length P in O(P log P) operations. In Figure 1, the sampling scheme and the applied DFTs are visualized. Next, for each ℓ = 0, …, P − 1, it follows that

g^P[,k]=s=0P1j=1Mcje2πiωj(s/P+k/S)e2πis/P=j=1Mcje2πiωjk/Ss=0P1e2πi(ωj)s/P   =0orP.
FIGURE 1
www.frontiersin.org

Figure 1. Illustration of the sampling scheme (3.1) and the applied DFTs (3.2).

Now we define the index sets

IP():={j{1,,M}:ωj(mod P)}

Such that

g^P[,k]=PjIP()cje2πiωjk/S.

Consequently, for each ℓ = 0, …, P − 1, we may interpret ĝP[ℓ, k] (k = 0, …, 2K) as samples of a trigonometric polynomial with frequencies supported only on the index set {ωj}jIP() ⊂ {ω1, …, ωM}, where the samples are taken at the nodes kS (k = 0, …, 2K). In simplified terms, the trigonometric polynomial (1.2) is partitioned into P many trigonometric polynomials of smaller sparsity.

Next, we use K ∈ ℕ, K ≥ 2, as sparsity cut-off parameter. This means for each ℓ = 0, …, P − 1, we apply Algorithm 2.8 resp. 2.9 to the “samples” h~k :=ĝP[,k] (k = 0, …, 2K) and we check if we can uniquely identify all frequencies ωj, jIP(ℓ), i.e., if the determined sparsity M: = M from Algorithm 2.8 resp. 2.9 fulfills M < K. In this case, we use the obtained local fractions φ~j to compute the frequencies ωj=round(φ~jS) by rounding to nearest integer and we use the corresponding coefficients cj.

When the condition |IP(ℓ)| < K is fulfilled for all ℓ = 0, …, P − 1, this approach requires (2K + 1)P samples of g and 2K + 1 FFTs of length P. The computational costs for the corresponding algorithms are listed in Table 3.

TABLE 3
www.frontiersin.org

Table 3. Computational cost of one iteration step of Algorithm 3.2 in the case of (2K + 1)P given samples (3.1).

If we cannot uniquely identify all frequencies, i.e., if |IP(ℓ)| ≥ K for some ℓ, then we form iteratively the new trigonometric polynomial

g1(x):=g(x)jIcje2πiωjx,    (3.3)

where I is the union of all IP(ℓ) with the property |IP(ℓ)| < K. In the next iteration step, we choose a positive integer P1S different from P and repeat the method on the trigonometric polynomial g1. In doing so, we can compute the values

jIcje2πiωj(sP1+kS)=jI(cje2πiωjP1s)e2πiωjSk                        (s=0,,P11;k=0,,2K)

by the non-equispaced fast Fourier transform (NFFT) [29] in O(P1(K logK + |I|)) arithmetic operations.

We perform additional iterations until all frequencies can be identified, i.e., if |IP1(ℓ)| < K for all ℓ = 0, …, P1 − 1. Note that our algorithm is related to the sparse FFT proposed in Christlieb et al. [12]. But here we use the methods of Section 2, if aliasing with respect to modulo P occurs.

Remark 3.1 (Relations to other sparse FFTs) In Hassanieh et al. [15], an algorithm for the problem (P2*) is presented, which allows to determine the (unknown) support I and the Fourier coefficients cj from O(M log S) samples with computational cost of O(M log S) operations, as well as a second algorithm, which allows the M-sparse ℓ2 best approximation of the Fourier coefficients of g from O(M log(S) log(SM)) samples with computational cost of O(M log(S) log(SM)) operations. As mentioned in the introduction, preliminary tests of the implementation [21] suggest that this method may also work for the problem (P2), if an upper bound for the sparsity of the signal is used as sparsity input parameter. In Indyk et al. [30], another variant was discussed, where the number of samples is O(M logS)(log logS)O(1) and the computational cost is O(M log2S)(log logS)O(1).

Recently in Indyk and Kapralov [31], a result was presented for the multivariate case where the number of required samples is O(M log S) for constant dimensions d and the computational cost is O(Sd log O(1)S). In general the exact constants, especially the dependence on d, are unknown due to missing implementations. For instance the number of samples O(M log S) of the last mentioned algorithm contains a factor of dO(d), see Section [31, Section 4].

Moreover, a deterministic sparse FFT, using the Chinese Remainder Theorem, was presented in [13] for the univariate case and in Iwen [32] for the multivariate case, which takes O(d4M2 log4(dS)) samples and arithmetic operations. This means there is neither a exponential/super-exponential dependency on the dimension d ∈ ℕ nor a dependency on a failure probability in the asymptotics of the number of samples and arithmetic operations for this method. Besides this deterministic algorithm, there also exists a randomized version which only requires O(d4M log4(dS)) samples and arithmetic operations.

Recently, another sparse FFT, which is based on a multiscale approach, was presented in Christlieb et al. [12] as an extension of the method [11]. Their algorithm is able to handle (additive) noise and requires O(M log(SM)) on average.

For further references on the sparse FFT we refer to the nice webpage http://groups.csail.mit.edu/netmit/sFFT/. Despite the fact that the computational cost of the sparse FFT is lower, it is not clear which algorithm is indeed faster and more stable for the practical problems mentioned in the Section 5.

We stress again that it is already well known that one can use Prony-type methods for the sparse FFT, see e.g., Heider et al. [33]. Using the proposed splitting approach, one can significantly decrease the high computational cost of MUSIC and ESPRIT, but keep the numerical stable evaluation. Clearly one can combine the suggested method with a reconstruction of the non-zero Fourier coefficients in a dimension incremental way [34]. 

All of the methods described in Section 2 apply an SVD and use the tolerance ε as a relative threshold parameter to determine the local sparsity M of the signal. A good choice of this parameter may depend without limitation on noise in the sampling values of the trigonometric polynomial g and on the smallest distance between two frequencies, where this distance may change for each ℓ ∈ {0, …, P − 1} in each iteration. We propose to use a (small) list of possible relative threshold parameters ε, which are tested for each ℓ ∈ {0, …, P − 1} in each iteration.

Our sparse FFT for recovery of a trigonometric polynomial (1.2) with large sparsity M reads as follows:

Algorithm 3.2 (Sparse FFT via MUSIC resp. ESPRIT, see Algorithm 6.1 for detailed listing with extended parameter list).

Input: S ∈ 2ℕ frequency grid parameter, K ∈ ℕ (K ≥ 2) Hankel matrix size parameter, P ∈ ℕ (PS) initial FFT length, g~ 1-periodic sparse trigonometric polynomial (1.2) of unknown sparsity M ∈ ℕ with frequencies in (-S2,S2] perturbed by noise.

1. For each k = 0, …, 2K, sample g~P[s,k]:=g~(sP+kS), s = 0, …, P − 1, and compute ĝP[,k]=0P-1 by FFT of g~P[s,k]s=0P-1.

2. for ℓ = 0, …, P − 1

2.1. Apply Algorithm 2.8 resp. 2.9 with L: = K and N: = 2K + 1 on the values h~k :=ĝP[,k]k=02K to obtain the local sparsity M: = M and the local fractions φ~,m :=φ~m(-12,12] for m = 1, …, M. If MK, then go to step 2 and continue with next ℓ. Otherwise, compute the local frequencies ω,m :=round(φ~,mS) by rounding to nearest integer.

2.2. Compute the local coefficients cℓ, m as least squares solution of the overdetermined Vandermonde system

min(c,m)m=1MMP(e2πikω,m/S)k=0,m=12K,M(c,m)m=1M(g^P[,k])k=02K2.    (3.4)

2.3. If the residual of (3.4) is small (see step 3.3.6 of Algorithm 6.1), then append the frequencies ωℓ, m (m = 1, …, M) to the frequency set Ω.

3. If the (global) residual

maxs=0,,P1k=0,,2K|j=1|Ω|C[j]e2πiΩ[j](sP+kS)g(sP+kS)|

is small, then exit the algorithm. Otherwise, form the new trigonometric polynomial (3.3). In the next iteration step choose a positive integer P1S different from P, sample (3.3) on sP1+kS for s = 0, …, P1 − 1 and k = 0, …, 2K, and repeat the above method.

Output: Ω(-S2,S2] set of recovered frequencies ωj (j = 1, …, M), M: = |Ω| detected sparsity, cj ∈ ℂ coefficient related to ωj (j = 1, …, M).

For (2K + 1)P given samples (3.1), the computational cost for one iteration of Algorithm 3.2 is shown in Table 3.

If we take (2K + 1)P = O(M) samples, then we obtain minimal computational cost for the parameters K = O(M1∕3) and P = O(M2∕3). For this case, we compare the numbers of required samples and computational costs for different methods of spectral estimation in Table 4 such as sparse FFT via MUSIC, sparse FFT via ESPRIT, MUSIC, ESPRIT, and classical FFT. As we can see, the sparse FFT via ESPRIT is very useful for the spectral estimation by a relatively low number of samples and low computational cost.

TABLE 4
www.frontiersin.org

Table 4. Numbers of required samples and computational costs using the splitting approach for one iteration of Algorithm 3.2 as well as for Algorithms 2.8 and 2.9 in the case K = O(M1∕3), P = O(M2∕3) and ML∕2 ≈ N∕4.

4. Reconstruction of Multivariate Trigonometric Polynomials

Let d, M ∈ ℕ with d > 1 be given. We consider the d-variate exponential sum of sparsity M

g(x):=j=1Mcje2πiωj·x    (4.1)

for x:=(x1,,xd)Td with non-zero coefficients cj ∈ ℂ and distinct frequency vectors ωjd. Here the dot in the exponent denotes the usual scalar product in ℝd. Note that the function (4.1) is a d-variate trigonometric polynomial of sparsity M which is 1-periodical with respect to each variable. Let Ω: = {ω1, …, ωM} be the set of the frequency vectors.

Assume that it is known a priori that ωj are contained in a frequency set Γ ⊂ ℤd. Then the cardinality of Γ satisfies |Γ| ≥ M. Examples of possible frequency sets Γ are the cube {kd:||k||N} and the symmetric hyperbolic cross

{k=(ks)s=1dd:s=1dmax{1,|ks|}N}.

For given z ∈ ℤd and S ∈ ℕ, the set

Λ(z,S):={xk=kSzmod1;k=0,,S1}𝕋d[0,1)d

is called rank–1 lattice, where 1: = (1, …, 1)T. Note that xk = xk+nS for k = 0, …, S − 1 and n ∈ ℤ. For given Γ ⊂ ℤd, there exists a reconstructing rank-1 lattice Λ(z, S) such that the matrix

AS,|Γ|:=(e2πik·x)xΛ(z,S),kΓ

fulfills the condition (see [35] and [36, Section 3.2])

AS,|Γ|AS,|Γ|=SI|Γ|.    (4.2)

Then we consider the following spectral estimation problem:

(P3) Assume that ωj ∈ Γ (j = 1, …, M) and that Λ(z, S) is a reconstructing rank–1 lattice with respect to Γ. Recover the sparsity M ∈ ℕ, all frequencies ωj ∈ Γ as well as all non-zero coefficients cj ∈ ℂ of the d-variate exponential sum (4.1), if noisy sampled data

g˜k:=g(xk)+ek(|ek|110minj|cj|)

for all k = 0, …, 2L − 2 are given, where xk ∈ Λ(z, S), SL > M and ek ∈ ℂ are small error terms.

For simplicity we discuss only noiseless data. Let HL:  =(g(xk+n))k,n=0L1 be the response matrix of the given data. Then HL is a Hankel matrix. Further we introduce the rectangular Fourier-type matrix

FL,M:=(e2πiωj·xk)k=0,j=1L1,M.

From (4.2) it follows in the case L = S that FS,M*FS,M=SIM and hence for all x ∈ ℂM

FS,Mx22=xFS,MFS,Mx=Sx22.

Consequently, all positive singular values of FS, M are equal to S and cond2FS, M = 1.

The matrix HL can be represented in the form

HL=FL,M(diag(cj)j=1M)FL,MT.    (4.3)

The ranges of HL and FL, M coincide in the noiseless case by (4.3). The range of FL, M is a proper subspace of ℂL. This subspace is called left signal space SL. The left signal space SL is of dimension M and is generated by the M columns eL(ωj) (j = 1, …, M), where

eL(ω):=(e2πiω·xk)k=0L1(ωΓ).

Note that ||eL(ω)||2=L for each ω ∈ Γ. The left noise space NL is defined as the orthogonal complement of SL in ℂL. The dimension of NL is equal to LM > 0.

By QL we denote the orthogonal projection onto the left noise space NL. Since eL(ωj) ∈ SL (j = 1, …, M) and NLSL, we obtain that

QLeL(ωj)=0(j=1,,M).

If ω ∈ Γ \ Ω, then the vectors eL(ω1),,eL(ωM),eL(ω)L are linearly independent for SL > M. This can be seen as follows: For distinct ω, ω′ ∈ Γ, it follows by [36, Lemma 3.1] that

ω·zω·z(modS).

Consequently the vectors

eL(ωj):=(e2πi(ωj·z)kS)k=0L1(j=1,,M)

and eL(ω) with ω ∈ Γ \ Ω are linearly independent for SL > M, since the square Vandermonde matrix

(eL(ω1)||eL(ωM)|eL(ω))(1:M+1,1:M+1)

is invertible for each LM + 1. Hence

eL(ω)SL=span{eL(ω1),,eL(ωM)},

i.e. QLeL(ω) ≠ 0.

Thus, the frequency vectors can be determined via the M zeros resp. lowest local minima of the left noise-space correlation function

NL(ω):=1LQLeL(ω)2(ωΓ)

or via the M peaks of the left imaging function

JL(ω):=LQLeL(ω)21(ωΓ).

Similar to Section 2, one can determine the left noise-space correlation function resp. the left imaging function on Γ by using SVD of the response matrix HL.

Now, we proceed analogously to Section 3. For a positive integer PS and a parameter K ∈ ℕ, K ≥ 2, we construct the sampling array of (4.1) of size P × (2K + 1) via

gP[s,k]:=g((sP+kS)z)(s=0,,P1;k=0,,2K).

As in the univariate case, for each k = 0, …, 2K we form the DFT of length P

g^P[,k]:=s=0P1gP[s,k]e2πis/P(=0,,P1).

For each ℓ = 0, …, P − 1, we obtain that

g^P[,k]=s=0P1j=1Mcje2πi(s/P+k/S)ωj·ze2πis/P   =j=1Mcje2πikωj·z/Ss=0P1e2πi((ωj·z))s/P.

Introducing the index sets

IP():={j{1,,M}:ωj·z(modP)}(=0,,P1),

it follows that

g^P[,k]=PjIP()cje2πikωj·z/S.

This means for each ℓ = 0, …, P − 1, we may interpret ĝP[ℓ, k] (k = 0, …, 2K) as samples of a multivariate trigonometric polynomial with frequencies supported on the index set {ωj}jIP(ℓ) ⊂ {ω1, …, ωM}, where the samples are taken at the nodes kSz (k = 0, …, 2K). In Figure 2, a two-dimensional example is shown which visualizes the partitioning by the index sets IP(ℓ).

FIGURE 2
www.frontiersin.org

Figure 2. Illustration of an example frequency index set {ωj}j=119 and corresponding one-dimensional frequencies ωj · z mod S partitioned by IP(ℓ) with parameters P: = 5, z: = (1, 33)T, S: = 37.

Next, we apply Algorithm 2.8 resp. 2.9 with L: = K and N: = 2K + 1 on the values h~k :=ĝP[,k] (k = 0, …, 2K) for each ℓ = 0, …, P − 1 to obtain the local sparsity M = M and the local fractions φ~,m :=φ~m(-12,12] for m = 1, …, M. Afterwards, we compute the one-dimensional frequencies ω,m :=round(φ~,mS)(-S2,S2] by rounding to nearest integer. We transform these one-dimensional frequencies ωℓ, m into their d-dimensional counterparts ωℓ, m ∈ Γ using the relation ωℓ, m · z ≡ ωℓ, m (mod S) given by the reconstructing rank-1 lattice Λ(z, S). Then we compute coefficients cℓ, m from the samples ĝP[ℓ, k] (k = 0, …, 2K) by solving the corresponding overdetermined Vandermonde system. If we cannot identify all the frequencies, i.e., if |IP(ℓ)| ≥ K for some indices ℓ, we consider the new trigonometric polynomial

g1(x):=g(x)jIcje2πiωj·x=j=1Mcje2πiωj·xjIcje2πiωj·x                   (x𝕋d)    (4.4)

in an additional iteration, where the index set I contains all index sets IP(ℓ) with |IP(ℓ)| < K. In the next iteration, we choose a positive integer P1S different from P and repeat the method for the trigonometric polynomial g1. In doing so, we compute the values

jIcje2πi(sP1+kS)ωj·z=jI(cje2πiωjP1s)e2πiωj·zSk          (s=0,,P11;k=0,,2K)

of the second sum in (4.4) evaluated at the nodes x=(sP1+kS)z with the univariate NFFT [29] in O(P1(K log K + |I|)) arithmetic operations. We perform additional iterations until all frequencies can be identified, i.e. |IP1(ℓ)| < K for all ℓ = 0, …, P1 − 1.

We modify Algorithm 3.2 from Section 3 as described above and additionally in the following way. Here, we describe the changes in the detailed listing (see Algorithm 6.1) of Algorithm 3.2. In step 1, we sample the multivariate trigonometric polynomial at the nodes (sP+kS)·z(s=0,,P-1,k=0,,2K). In step 3.3.3, we compute the local frequencies ω,m :=round(φ~,mS) for m = 1, …, M. Next, we compute the d-dimensional counterparts ωℓ, m of the one-dimensional frequencies ωℓ, m using the relation ωℓ, m · z ≡ ωℓ, m (mod S). In step 3.3.4, we filter the frequencies ωℓ, m by removing non-unique ones and by keeping only those with ωℓ, m · z ≡ ℓ(mod P). We remark that we have to modify step 3.3.4 and that we have to perform the conversion of one-dimensional frequencies ωℓ, m to their d-dimensional counterparts ωℓ, m before the filtering, since the conditions ωℓ, m · z ≡ ℓ(mod P) and ωℓ, m ≡ ℓ(mod P) are not equivalent in general if P is not a divisor of S.

5. Numerical Experiments

In this section, we present some numerical results for Algorithm 3.2. The related software is available from the authors' homepage. All computations are performed in MATLAB with IEEE double–precision arithmetic. First we consider noiseless sampled data and later the case, where the sampled data are perturbed by additive (white) Gaussian noise. Finally we present some numerical results for the modified Algorithm 3.2 of Section 4.

5.1. Noiseless Sampled Data

Example 5.1 From noiseless sampled values, we reconstruct 100 trigonometric polynomials (1.2) of sparsity M = 256 with random frequencies ωj(-S2,S2] and random coefficients cj on the unit circle. We set the array of relative SVD threshold values epsilon_svd_list: = [10−3, 10−4, …, 10−8], the parameter εspatial :=10-8, the absolute value of minimal non-zero coefficients εfc_min :=10-1=10-1·minj|cj| and the maximal number of iterations R: = 10, see Algorithm 6.1 for the extended parameter list. Applying the sparse FFT Algorithm 3.2 with MUSIC in the case S = 216 with parameters K ∈ {6, 12, 16} and P ∈ {16, 32, 64, 128}, we can successfully detect all integer frequencies ωj. In Table 5, the column “iterations” depicts the maximal number of iterations actually used by the Algorithm 3.2 (computed over 100 trigonometric polynomials). The column “samples” contains the maximal number of sampled values used by the Algorithm 3.2. The column “ℓ2–errors” shows the maximal relative ℓ2–error of the coefficients, which are locally computed in step 2.2 of Algorithm 3.2. The column “updated ℓ2–errors” shows the maximal relative ℓ2–error of the coefficients, which are determined by additionally solving one large Vandermonde system at the end of Algorithm 3.2 with all frequencies as well as all samples of (1.2). For comparison, the classical FFT of length 216 requires 216 samples and the resulting ℓ2–error is 2.6e-16. The minimal number of samples for the cases K ∈ {6, 12, 16} and P ∈ {16, 32, 64, 128} is reached for K = P = 16 with 1716 samples, the next smallest number of samples is 1725 for K = 12 and P = 32. If we do not use the splitting approach (P = 1 and R = 1), we observe that the detection of some frequencies fails for exactly 1 of the 100 signals for K = 750 and the detection of all frequencies of all 100 signals succeeds for K = 850 requiring 1701 samples. This number of samples is very close to the minimum of 1716 samples from above. However, a direct application of MUSIC method (entry K = 850) has distinctly more computational cost than using the sparse FFT Algorithm 3.2. Note that R denotes the maximal number of iterations in Algorithm 6.1, the detailed description of Algorithm 3.2. 

TABLE 5
www.frontiersin.org

Table 5. Results for Algorithm 3.2 via MUSIC for frequency grid parameter S = 216 and sparsity M = 256 with random coefficients on the unit circle.

Example 5.2 Now we apply Algorithm 3.2 via ESPRIT with the same parameters as in Example 5.1. Then we obtain identical results for the number of iterations and samples as well as almost identical ℓ2-errors, but Algorithm 3.2 via ESPRIT has lower computational cost. If we do not use the splitting approach (P = 1 and R = 1), we observe that the detection of some frequencies fails for exactly 2 of the 100 signals for K = 750 and the detection of all frequencies of all 100 signals again succeeds for K = 850 requiring 1701 samples.

Additionally, we apply the implementation [21] of the sfft version 3 algorithm. The number of used samples is 7669, which is about two to four times the number of samples for the MUSIC and ESPRIT algorithm in Table 5, and the maximal relative ℓ2-error of the coefficients is 3.6e-04, which is about five orders of magnitude larger. 

Example 5.3 We generate 100 random trigonometric polynomials (1.2), where the coefficients cj are drawn uniformly at random from [−1, 1] + [−1, 1]i with |cj|10-2. We set the absolute value of minimal non-zero coefficients εfc_min :=10-3=10-1·minj|cj|. For the cases K ∈ {6, 12, 16} and P ∈ {16, 32, 64, 128}, we apply Algorithm 3.2 via ESPRIT. We obtain results almost identical to the ones from Example 5.2. This means, all frequencies of all 100 signals are correctly detected and the maximal relative ℓ2-errors differ slightly from those in Table 5. The maximal numbers of iterations and samples are identical.

Additionally, we apply the implementation [21] of the sfft version 3 algorithm. For each signal, the number of taken samples is 7669. Only for 16 of the 100 signals, all frequencies are detected correctly, whereas between 1 and 7 frequencies are not correctly detected for 84 signals. 

Example 5.4 Next we apply Algorithm 3.2 via ESPRIT for signals with higher sparsity. From noiseless sampled values, we reconstruct 100 trigonometric polynomials (1.2) of sparsity M = 1024 with random frequencies ωj(-S2,S2] and random coefficients cj on the unit circle. The results for the frequency grid parameter S: = 222 are shown in Table 6. The minimal number of samples is about six times higher compared to the results in Table 5.

In general, we observe that the maximal number of used iterations decreases for increasing initial FFT length P ∈ {64, 128, 256, 512} as well as for increasing values K ∈ {8, 10, 12}. In the cases, where all frequencies of all the 100 trigonometric polynomials are correctly detected, the number of required samples first decreases and later increases again for increasing initial FFT length P and fixed values K. The reason for this is that the number of samples per iteration increases for growing FFT length, while the number of used iterations decreases until its minimum one is reached.

Again, we apply the implementation [21] of the sfft version 3 algorithm. Here, the number of used samples is 31718, which is up to three times the number of samples of Algorithm 3.2 via ESPRIT in Table 6, and the maximal relative ℓ2-error of the coefficients is 3.6e-04, which is about five orders of magnitude larger than in Table 6

TABLE 6
www.frontiersin.org

Table 6. Results for Algorithm 3.2 via ESPRIT for frequency grid parameter S = 222 and sparsity M = 1024 with random coefficients on the unit circle.

5.2. Noisy Case

In this subsection, we test the robustness to noise of Algorithm 3.2. For this, we perturb the samples of the trigonometric polynomials g from (1.2) by additive complex white Gaussian noise with zero mean and standard deviation σ, i.e., we have measurements g~(kS+sP)=g(kS+sP)+ηk,s, where ηk, s ∈ ℂ are independent and identically distributed complex Gaussian noise values. Then, we may approximately compute the signal-to-noise ratio (SNR) in our case by

SNR1Sk=0S1|g(kS)|21Sk=0S1|ηk,0|2j=1M|cj|2σ2.

Correspondingly, we choose σ=(cj)j=1M2SNR for a targeted SNR value. For the numerical computations in MATLAB, we generate the noise by ηk,s :=σ2 *(randn + 1i*randn). Moreover, we choose the parameter εspatial: = 5σ and this means that the absolute value of the noise |ηk, s| ≤ εspatial for more than 99.9998% of the noise values ηk, s. Since we assume that the absolute value of the noise 110minj|cj| throughout this paper, we should choose the SNR such that 5σ=5(cj)j=1M2SNR110minj|cj|, which yields SNR502(cj)j=1M22(minj|cj|)2.

Example 5.5 As in Example 5.1, we generate 100 trigonometric polynomials (1.2) of sparsity M = 256 with random frequencies ωj(-S2,S2] and random coefficients cj from the unit circle. We set the frequency grid parameter S = 216, the signal sparsity M = 256, the array of relative SVD threshold values epsilon_svd_list: = [10−3, 10−4, …, 10−8] and the maximal number of iterations R: = 10. Here, we set the absolute value of minimal non-zero coefficients εfc_min :=10-1=10-1·minj|cj| and we use the parameters P ∈ {32, 64, 128} and K ∈ {12, 24}. For possible SNR values, we have SNR50(cj)j=1M22(minj|cj|)2=6.4·105 and we consider the SNR values 1010, 108, and 106 in our numerical tests. The results of Algorithm 3.2 via ESPRIT are presented in Table 7. Additionally, we test the sparsity cut-off parameter K2 ∈ ℕ differently from the Hankel matrix size parameter K, see Algorithm 6.1. Here, we use the parameter combinations (K, K2) ∈ {(12, 6), (12, 12), (24, 12)}. In general, we observe that we require more samples for SNR = 106 than for SNR = 108 and again more for SNR = 108 than for SNR = 1010. The relative errors are about one order of magnitude larger for SNR = 106 than for SNR = 108 as well as for SNR = 108 than for SNR = 1010, since the maximal noise values ηk, s are larger by about one order of magnitude with high probability each time. Moreover, the maximal number of samples in the noisy case is higher than in the noiseless case, cf. Table 5. For some parameter combinations at SNR = 108, exactly one of the 100 signals is not correctly detected and this is indicated by the entry “–” in the column “ℓ2-errors” resp. “updated ℓ2-errors.” For SNR = 106 with K = 12, K2 = 6, and P = 64 exactly one frequency at three of the 100 signals is not correctly detected, whereas all frequencies of all 100 signals for SNR = 106 are correctly detected in the other cases shown in Table 7. All parameters of (1.2) are correctly detected in the case SNR = 1010 for the parameter combinations (K, K2) ∈ {(12, 6), (12, 12), (24, 12)} and P = 32. For the considered test parameters, the choices (K, K2) ∈ {(12, 6), (24, 12)}, which yield a higher oversampling within the ESPRIT algorithm, give slightly better results compared to the choice K = K2 = 12.

TABLE 7
www.frontiersin.org

Table 7. Results for Algorithm 3.2 via ESPRIT for frequency grid parameter S = 216 and sparsity M = 256 with noisy data.

Additionally, we apply the implementation [21] of the sfft version 3 algorithm. We remark that this particular algorithm is not suited for noisy samples. The number of taken samples is 7669 for all signals, which is about 50 percent higher than for Algorithm 3.2 at SNR = 1010 and similar at SNR = 108. In the cases SNR = 1010 and SNR = 108, all frequencies of all 100 signals are correctly detected and the maximal relative ℓ2-errors of the Fourier coefficients are about 3.6e-04. For SNR = 106, all frequencies of 94 of the 100 signals are correctly detected and up to four frequencies at four of the 100 signals are not correctly detected. 

Example 5.6 Additionally, we generate 100 random trigonometric polynomials (1.2), where the coefficients cj are drawn uniformly at random from [−1, 1] + [−1, 1]i with |cj|10-2. We set the absolute value of minimal non-zero coefficients εfc_min :=10-3=10-1·minj|cj|. In the case SNR = 108, we observe in each considered parameter combination that the correct detection of one or two frequencies fails for several of the 100 trigonometric polynomials. The most likely reason is the fact that the smallest coefficient can be very close to the noise level. If we decrease the noise by one order of magnitude, i.e. SNR = 1010, the frequency detection succeeds for all considered parameter combinations.

Furthermore, we generate 100 random trigonometric polynomials (1.2), where the coefficients cj are drawn uniformly at random from [−1, 1] + [−1, 1]i with |cj|10-1. Then we set the absolute value of minimal non-zero coefficients εfc_min :=10-2=10-1·minj|cj|. This means that the smallest possible coefficient as well as the parameter εfc_min are by one order of magnitude larger than before. Now in both of the cases SNR = 1010 and SNR = 108, we observe for each parameter combination (K, K2) ∈ {(12, 6), (12, 12), (24, 12)} and P ∈ {32, 64, 128} that all frequencies of all trigonometric polynomials are correctly detected.

For results of the implementation [21] of the sfft version 3 algorithm, we refer to Example 5.3. 

5.3. Reconstruction of 6-variate Trigonometric Polynomials

Finally we test the modified Algorithm 3.2 of Section 4 for the reconstruction of six-variate trigonometric polynomials with the sparsity M = 256.

Example 5.7 We choose the index set Γ of possible frequency vectors as the six-dimensional hyperbolic cross Γ :={k6:s=16max{1,|ks|}16} of cardinality 169209. Further we use the reconstructing rank-1 lattice Λ(z, S) with generating vector z = (1, 33, 579, 3628, 21944, 169230)T and rank-1 lattice size S = 1105193, see Kämmerer et al. [37, Table 6.2]. We generate 100 random trigonometric polynomials (4.1) with sparsity M = 256, where the frequency vectors ωj (j = 1, …, M) are chosen uniformly at random from Γ (without repetition) and the corresponding coefficients cj are randomly chosen on the unit circle. We set the array of relative SVD threshold values epsilon_svd_list: = [10−3, 10−4, …, 10−8], the absolute value of minimal non-zero coefficients εf_min :=10-1, and the maximal number of iterations R: = 10. In the noiseless case, we set the parameter εspatial :=10-8, and in the noisy case as described in Section 5.2. The results of the modified Algorithm 3.2 via ESPRIT are presented in Table 8.

TABLE 8
www.frontiersin.org

Table 8. Results of the modified Algorithm 3.2 via ESPRIT with sparsity M = 256, frequency vectors within six-dimensional hyperbolic cross index set Γ={k6:s=16max{1,|ks|}16}, and reconstructing rank-1 lattice Λ(z, S) with z = (1, 33, 579, 3628, 21944, 169230)T and S = 1105193.

The columns of Table 8 have the same meaning as in Section 5.2. For the noiseless case, i.e., “SNR = ∞,” we observe the same behavior as in the one-dimensional case in Section 5.1. The detection of all frequency vectors of all 100 trigonometric polynomials (4.1) succeeds for K = K2 ∈ {8, 10, 12} and P ∈ {8, 16, 32, 64, 128}. For the noisy case, the results are worse than in Table 7 of the one-dimensional case. The reason for this is that we have a bijective mapping between six-dimensional ωj and one-dimensional frequencies ωj by means of the reconstructing rank-1 lattice, ωj · z ≡ ωj(modS), and the rank-1 lattice size S influences how close two distinct one-dimensional fractional frequencies ωjS and ωjS may get in the ESPRIT algorithm, see also Figure 2.

We tried to apply the implementation [21] of the sfft version 3 algorithm in this test setting. However, the implementation failed with an internal error caused by the used problem size n = S = 1105193 and sparsity M = 256. Another test run with rank-1 lattice size S = 221 = 2097152 yielded 7938 samples and a maximal relative ℓ2-error of 6.8e-04, where the latter is about 4 orders of magnitude larger than the results in Table 8. Moreover, we consider the noisy case. Again, we remark that the implementation [21] is not suited for noisy samples. For SNR = 1010, the detection of one frequency fails for one of the 100 signals. If we increase the SNR to 108, then between one and 9 frequencies are not correctly detected for 81 of the 100 signals. 

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The first and last named author gratefully acknowledge the funding by the European Union and the Free State of Saxony (EFRE/ESF NBest-SF). The results of this paper were first presented during the Dagstuhl Seminar 15251 on “Sparse modeling and multi-exponential analysis” (June 14 – 19, 2015). Moreover, the authors would like to thank the referees for their valuable suggestions.

References

1. Manolakis DG, Ingle VK, Kogon SM. Statistical and Adaptive Signal Processing. Boston, MA: McGraw-Hill (2005).

Google Scholar

2. Stoica PG, Moses RL. Spectral Analysis of Signals. Upper Saddle River, NJ: Prentice Hall (2005).

Google Scholar

3. Plonka G, Tasche M. Prony methods for recovery of structured functions. GAMM–Mitt. (2014) 37:239–58. doi: 10.1002/gamm.201410011

CrossRef Full Text | Google Scholar

4. Schmidt RO. Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag. (1986) 34:276–80.

Google Scholar

5. Roy R, Kailath T. ESPRIT—estimation of signal parameters via rotational invariance techniques. IEEE Trans Acoustic Speech Signal Process. (1989) 37:984–94.

Google Scholar

6. Hua Y, Sarkar TK. Matrix pencil method for estimating parameters of exponentially damped/undamped sinusoids in noise. IEEE Trans Acoust Speech Signal Process. (1990) 38:814–24.

Google Scholar

7. Golub GH, Milanfar P, Varah J. A stable numerical method for inverting shape from moments. SIAM J Sci Comput. (1999) 21:1222–43.

Google Scholar

8. Roy R, Kailath T. ESPRIT—estimation of signal parameters via rotational invariance techniques. In: Auslander L, Grünbaum FA, Helton JW, Kailath T, Khargoneka P, Mitter S, editors. Signal Processing, Part II. Vol. 23 of IMA Volumes in Mathematics and its Applications. New York, NY: Springer (1990). pp. 369–411.

9. Potts D, Tasche M. Parameter estimation for nonincreasing exponential sums by Prony-like methods. Linear Algebra Appl. (2013) 439:1024–39. doi: 10.1016/j.laa.2012.10.036

CrossRef Full Text | Google Scholar

10. Pereyra V, Scherer G. Exponential data fitting. In: Pereyra V, Scherer G, editors. Exponential Data Fitting and its Applications. Sharjah: Bentham Science Publishers; IEEE Computer Society (2010). pp. 1–26.

Google Scholar

11. Lawlor D, Wang Y, Christlieb A. Adaptive sub-linear time Fourier algorithms. Adv Adapt Data Anal. (2013) 5:25. doi: 10.1142/S1793536913500039

CrossRef Full Text | Google Scholar

12. Christlieb A, Lawlor D, Wang. A multiscale sub-linear time Fourier algorithm for noisy data. Appl Comput Harmon Anal. (in press). doi: 10.1016/j.acha.2015.04.002

CrossRef Full Text

13. Iwen MA. Combinatorial sublinear-time Fourier algorithms. Found Comput Math. (2010) 10:303–38. doi: 10.1007/s10208-009-9057-1

CrossRef Full Text | Google Scholar

14. Ben-Or M, Tiwari P. A deterministic algorithm for sparse multivariate polynomial interpolation. In: Twentieth Annual ACM Symposium on the Theory of Computing. New York, NY: ACM Press (1988). pp. 301–9.

15. Hassanieh H, Indyk P, Katabi D, Price E. Nearly optimal sparse Fourier transform. In: Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing. New York, NY: ACM (2012). pp. 563–78.

16. Gilbert AC, Indyk P, Iwen M, Schmidt L. Recent developments in the sparse Fourier transform: a compressed Fourier transform for big data. IEEE Signal Proc Mag. (2014) 31:91–100. doi: 10.1109/MSP.2014.2329131

CrossRef Full Text | Google Scholar

17. Rauhut H. Random sampling of sparse trigonometric polynomials. Appl Comput Harmon Anal. (2007) 22:16–42. doi: 10.1016/j.acha.2006.05.002

CrossRef Full Text | Google Scholar

18. Kunis S, Rauhut H. Random sampling of sparse trigonometric polynomials II, orthogonal matching pursuit versus basis pursuit. Found Comput Math. (2008) 8:737–63. doi: 10.1007/s10208-007-9005-x

CrossRef Full Text | Google Scholar

19. Gröchenig K, Pötscher BM, Rauhut H. Learning trigonometric polynomials from random samples and exponential inequalities for eigenvalues of random matrices. arXiv:math/0701781v2. (2010).

Google Scholar

20. Foucart S, Rauhut H. A mathematical introduction to compressive sensing. In: Applied and Numerical Harmonic Analysis. New York, NY: Birkhäuser;Springer (2013).

Google Scholar

21. Schumacher J. sFFT 0.1.0 (2013). Available online at: http://spiral.net/software/sfft.html.

22. Schumacher J, Püschel M. High-performance sparse fast Fourier transforms. In: Proceedings of IEEE Workshop on Signal Processing Systems (SIPS). Belfast, UK: IEEE (2014). pp. 1–6.

23. Liao W, Fannjiang A. MUSIC for single-snapshot spectral estimation: stability and super-resolution. Appl Comput Harmon Anal. (2016) 40:33–67. doi: 10.1016/j.acha.2014.12.003

CrossRef Full Text | Google Scholar

24. Peter T, Potts D, Tasche M. Nonlinear approximation by sums of exponentials and translates. SIAM J Sci Comput. (2011) 33:314–34. doi: 10.1137/100790094

CrossRef Full Text | Google Scholar

25. Potts D, Tasche M. Parameter estimation for multivariate exponential sums. Electron Trans Numer Anal. (2013) 40:204–224.

Google Scholar

26. Fannjiang AC. The MUSIC algorithm for sparse objects: a compressed sensing analysis. Inverse Problems (2011) 27:32. doi: 10.1088/0266-5611/27/3/035013

CrossRef Full Text | Google Scholar

27. Kirsch A. The MUSIC algorithm and the factorization method in inverse scattering theory for inhomogeneous media. Inverse Problems (2002) 18:1025–1040. doi: 10.1088/0266-5611/18/4/306

CrossRef Full Text | Google Scholar

28. Potts D, Tasche M. Fast ESPRIT algorithms based on partial singular value decompositions. Appl Numer Math. (2015) 88:31–45. doi: 10.1016/j.apnum.2014.10.003

CrossRef Full Text | Google Scholar

29. Keiner J, Kunis S, Potts D. Using NFFT3 - a software library for various nonequispaced Fast Fourier Transforms. ACM Trans Math Softw. (2009) 36: 1–30. doi: 10.1145/1555386.1555388

CrossRef Full Text | Google Scholar

30. Indyk P, Kapralov M, Price E. (Nearly) sample-optimal sparse Fourier transform. In: Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing. Portland: ACM (2014). pp. 563–78.

31. Indyk P, Kapralov M. Sample-optimal Fourier sampling in any constant dimension. In: Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on. Philadelphia, PA (2014). pp. 514–23.

Google Scholar

32. Iwen MA. Improved approximation guarantees for sublinear-time Fourier algorithms. Appl Comput Harmon Anal. (2013) 34:57–82. doi: 10.1016/j.acha.2012.03.007

CrossRef Full Text | Google Scholar

33. Heider S, Kunis S, Potts D, Veit M. A sparse prony FFT. In: 10th International Conference on Sampling Theory and Applications. Bremen (2013).

Google Scholar

34. Potts D, Volkmer T. Sparse high-dimensional FFT based on rank-1 lattice sampling. Appl Comput Harmon Anal. (in press). doi: 10.1016/j.acha.2015.05.002

CrossRef Full Text | Google Scholar

35. Kämmerer L. Reconstructing multivariate trigonometric polynomials from samples along rank-1 lattices. In: Fasshauer GE, Schumaker LL, editors. Approximation Theory XIV: San Antonio 2013. Cham: Springer International Publishing (2014). pp. 255–71.

36. Kämmerer L. High Dimensional Fast Fourier Transform Based on Rank-1 Lattice Sampling. Dissertation. Universitätsverlag Chemnitz (2014). Available online at: http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-157673

37. Kämmerer L, Potts D, Volkmer T. Approximation of multivariate periodic functions by trigonometric polynomials based on rank-1 lattice sampling. J Complex. (2015) 31:543–76. doi: 10.1016/j.jco.2015.02.004

CrossRef Full Text | Google Scholar

Appendix

Detailed Sparse FFT Algorithm

Algorithm 6.1 (Detailed listing of Algorithm 3.2 with extended parameter list)

Input: S ∈ 2ℕ frequency grid parameter, K ∈ ℕ Hankel matrix size parameter, K2 ∈ ℕ sparsity cut-off parameter (default value K), P ∈ ℕ initial FFT length, g 1-periodic sparse trigonometric polynomial of unknown sparsity M ∈ ℕ with frequencies in (-S2,S2], epsilon_svd_list array of relative SVD threshold values 0 < εSVD < 1 in descending order, εspatial > 0 estimate for maximal noise value, εfc_min > 0 lower bound of absolute values of non-zero coefficients, R ∈ ℕ maximal number of iterations.

Create empty index set array Ω and coefficient array C.

for iteration r: = 1, …, R

1. Construct the discrete array of samples of g of length P×(2K + 1) via gP[s,k] :=g(sP+kS)-j=1|Ω|C[j]e2πiΩ[j](sP+kS)(s=0,,P-1;k=0,,2K).

2. Compute for each k = 0, …, 2K an FFT of length P and obtain array ĝP of length P×(2K + 1), ĝP[,k] :=s=0P-1gP[s,k]e-2πisP for ℓ = 0, …, P − 1, if P > 1. Otherwise if P = 1, then set ĝP[ℓ, k]: = gP[ℓ, k] for ℓ = 0, …, P − 1.

3. for ℓ: = 0, …, P − 1

3.1. If ∥ĝP[ℓ, 0:2K]∥P < εspatial, then go to 3. and continue with next ℓ.

3.2. Set variable found_svd: = 0.

3.3. for εSVD in epsilon_svd_list

3.3.1. Apply Algorithm 2.8 resp. 2.9 with L: = K, N: = 2K + 1 and ε: = εSVD on the values h~k :=ĝP[,k]k=02K to obtain the (local) sparsity M: = M and fractions φ~,m :=φ~m(-12,12] for j = m, …, M.

3.3.2. If MK2 then go to 3.3. and continue with next (smaller) εSVD.

3.3.3. Compute local frequencies ω,m :=round(φ~,mS) for m = 1, …, M.

3.3.4. Filter frequencies ωℓ, m by removing non-unique ones and by keeping only those where ωℓ,m ≡ ℓ (modP). Set M to number of resulting frequencies ωℓ, m.

3.3.5. Compute (local) Fourier coefficients cℓ, m as least squares solution from the overdetermined Vandermonde system (ĝP[,0:2K])T(e2πikω,mS)k=0;m=12K; M (P·c,m)m=1M.

3.3.6. If the residual (e2πikω,mS)k=0,m=12K,M (c,m)m=1M-(ĝP[,0:2K])TP10·εspatial, then set variable found_svd: = 1, leave for εSVD loop and go to 3.5. Otherwise, go to 3.3. and continue with next (smaller) εSVD.

3.3. end for εSVD

3.4. If found_svd ≠ 1, then go to 3. and continue with next ℓ.

3.5. If a frequency has already been found, i.e., ω,m=Ω[j] for any m = 1, …, M, then update the corresponding coefficient C[j′] by computing C[j]:=C[j]+c,m.

3.6. Append new frequencies of ωℓ, m, m = 1, …, M, to array Ω and append corresponding coefficients to array C.

3. end for

4. Remove small coefficients C[j]<εf_min from array C and remove corresponding frequencies from array Ω for any j′.

5. If the residual

maxs=0,,P1k=0,,2K|j=1|Ω|C[j]e2πiΩ[j](sP+kS)g(sP+kS)|<10·εspatial,

then set Rused: = r and exit r-loop.

Otherwise, determine next prime number larger than current FFT length P and use this larger prime as P in the next iteration.

end for iteration r

Output: Detected sparsity M: = |Ω| ∈ ℕ, array Ω(-S2,S2] of detected frequencies, array C ∈ ℂM of corresponding coefficients.

Keywords: Spectral estimation, ESPRIT, MUSIC, parameter identification, exponential sum, sparse trigonometric polynomial, sparse FFT

AMS Subject Classifications: 65T50, 42A16, 94A12.

Citation: Potts D, Tasche M and Volkmer T (2016) Efficient Spectral Estimation by MUSIC and ESPRIT with Application to Sparse FFT. Front. Appl. Math. Stat. 2:1. doi: 10.3389/fams.2016.00001

Received: 04 December 2015; Accepted: 05 February 2016;
Published: 29 February 2016.

Edited by:

Charles K. Chui, Stanford University and Hong Kong Baptist University, USA

Reviewed by:

Ronny Bergmann, University Kaiserslautern, Germany
Hyenkyun Woo, Korea University of Technology and Education, South Korea

Copyright © 2016 Potts, Tasche and Volkmer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Daniel Potts, potts@mathematik.tu-chemnitz.de