ORIGINAL RESEARCH article

Front. Phys., 13 July 2022

Sec. Statistical and Computational Physics

Volume 10 - 2022 | https://doi.org/10.3389/fphy.2022.885402

A Hybrid Norm for Guaranteed Tensor Recovery

  • 1. School of Automation, Guangdong University of Technology, Guangzhou, China

  • 2. RIKEN AIP, Tokyo, Japan

  • 3. Key Laboratory of Intelligent Detection and the Internet of Things in Manufacturing, Ministry of Education, Guangzhou, China


Abstract

Benefiting from the superiority of tensor Singular Value Decomposition (t-SVD) in excavating low-rankness in the spectral domain over other tensor decompositions (like Tucker decomposition), t-SVD-based tensor learning has shown promising performance and has recently become an emerging research topic in computer vision and machine learning. However, by focusing on modeling spectral low-rankness, t-SVD-based models may be insufficient to exploit low-rankness in the original domain, leading to limited performance when learning from tensor data (like videos) that are low-rank in both the original and spectral domains. To this end, we define a hybrid tensor norm dubbed the “Tubal + Tucker” Nuclear Norm (T2NN) as the sum of two tensor norms induced, respectively, by t-SVD and Tucker decomposition, which simultaneously imposes low-rankness in both the spectral and original domains. We further utilize the new norm for tensor recovery from linear observations by formulating a penalized least squares estimator. The statistical performance of the proposed estimator is then analyzed by establishing upper bounds on the estimation error in both deterministic and non-asymptotic manners. We also develop an efficient algorithm within the framework of the Alternating Direction Method of Multipliers (ADMM). Experimental results on both synthetic and real datasets show the effectiveness of the proposed model.

1 Introduction

Thanks to the rapid progress of computer technology, data in tensor format (i.e., multi-dimensional array) are emerging in computer vision, machine learning, remote sensing, quantum physics, and many other fields, triggering an increasing need for tensor-based learning theory and algorithms [1–6]. In this paper, we carry out both theoretic and algorithmic research studies on tensor recovery from linear observations, which is a typical problem in tensor learning aiming to learn an unknown tensor when only a limited number of its noisy linear observations are available [7]. Tensor recovery finds applications in many industrial circumstances where the sensed or collected tensor data are polluted by unpredictable factors such as sensor failures, communication losses, occlusion by objects, shortage of instruments, and electromagnetic interferences [7–9], and is thus of both theoretical and empirical significance.

In general, reconstructing an unknown tensor from only a small number of its linear observations is hopeless, unless some assumptions on the underlying tensor are made [9]. The most commonly used assumption is that the underlying tensor possesses some kind of low-rankness which can significantly limit its degrees of freedom, such that the signal can be estimated from a small but sufficient number of observations [7]. However, as a higher-order extension of matrix low-rankness, tensor low-rankness has many different characterizations due to the multiple definitions of tensor rank, e.g., the CANDECOMP/PARAFAC (CP) rank [10], Tucker rank [11], Tensor Train (TT) rank [12], and Tensor Ring (TR) rank [13]. As has been discussed in [7] from a signal processing standpoint, the rank functions listed above are defined in the original domain of the tensor signal and may thus be insufficient to model low-rankness in the spectral domain. The recently proposed tensor low-tubal-rankness [14] within the algebraic framework of tensor Singular Value Decomposition (t-SVD) [15] complements them by exploiting low-rankness in the spectral domain defined via the Discrete Fourier Transform (DFT), and has witnessed significant performance improvements in comparison with original domain-based low-rankness for tensor recovery [6, 16, 17].

Despite the popularity of low-tubal-rankness, the fact that it is defined solely in the spectral domain also naturally poses a potential limitation on its usability to some tensor data that are low-rank in both spectral and original domains. To address this issue, we propose a hybrid tensor norm to encourage low-rankness in both spectral and original domains at the same time for tensor recovery in this paper. Specifically, the contributions of this work are four-fold:

  • To simultaneously exploit low-rankness in both spectral and original domains, we define a new norm named T2NN as the sum of two tensor nuclear norms induced, respectively, by the t-SVD for spectral low-rankness and Tucker decomposition for original domain low-rankness.

  • Then, we apply the proposed norm to tensor recovery by formulating a new tensor least squares estimator penalized by T2NN.

  • Statistically, the performance of the proposed estimator is analyzed by establishing upper bounds on the estimation error in both deterministic and non-asymptotic manners.

  • Algorithmically, we propose an algorithm based on ADMM to compute the estimator and evaluate its effectiveness on three different types of real data.

The rest of this paper proceeds as follows. First, the notations and preliminaries of low-tubal-rankness and low-Tucker-rankness are introduced in Section 2. Then, we define the new norm and apply it to tensor recovery in Section 3. To understand the statistical behavior of the estimator, we establish an upper bound on the estimation error in Section 4. To compute the proposed estimator, we design an ADMM-based algorithm in Section 5 with empirical performance reported in Section 6.

2 Notations and Preliminaries

Notations. We use lowercase boldface, uppercase boldface, and calligraphic letters to denote vectors (e.g., v), matrices (e.g., M), and tensors, respectively. For any real numbers a, b, let a ∨ b = max{a, b} and a ∧ b = min{a, b}. If the size of a tensor is not given explicitly, then it is in ℝ^(d1×d2×d3). We use c, c′, c1, etc., to denote constants whose values can vary from line to line. For notational simplicity, let d\k = d1d2d3/dk for k = 1, 2, 3.

Given a matrix M ∈ ℝ^(d1×d2), its nuclear norm and spectral norm are defined as ‖M‖* = Σi σi and ‖M‖ = maxi σi, respectively, where {σi | i = 1, 2, …, d1 ∧ d2} are its singular values. Given a tensor T, define its l1-norm and F-norm as ‖T‖1 = ‖vec(T)‖1 and ‖T‖F = ‖vec(T)‖2, respectively, where vec(⋅) denotes the vectorization operation of a tensor [18]. Given T ∈ ℝ^(d1×d2×d3), let T^(i) denote its ith frontal slice. For any two (real or complex) tensors A, B of the same size, define their inner product as the inner product of their vectorizations, ⟨A, B⟩ = ⟨vec(A), vec(B)⟩. Other notations are introduced at their first appearance.
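As a quick illustration, the vectorization-based norms and inner product above can be sketched in Python (the helper names `l1_norm`, `f_norm`, and `inner` are ours, not the paper's):

```python
import numpy as np

def l1_norm(T):
    """l1-norm: the l1-norm of vec(T), i.e., sum of absolute entries."""
    return np.abs(T).sum()

def f_norm(T):
    """F-norm: the l2-norm of vec(T)."""
    return np.sqrt((T ** 2).sum())

def inner(A, B):
    """Inner product <A, B> = <vec(A), vec(B)>."""
    return np.vdot(A.ravel(), B.ravel()).real
```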

2.1 Spectral Rankness Modeled by t-SVD

The low-tubal-rankness defined within the algebraic framework of t-SVD is a typical example to characterize low-rankness in the spectral domain. We give some basic notions about t-SVD in this section.

Definition 1 (t-product [15]). Given A ∈ ℝ^(d1×d2×d3) and B ∈ ℝ^(d2×d4×d3), their t-product C = A ∗ B ∈ ℝ^(d1×d4×d3) is a tensor whose (i, j)-th tube is C(i, j, :) = Σl A(i, l, :) • B(l, j, :), where • is the circular convolution [15].

Definition 2 (tensor transpose [15]). Let A be a tensor of size d1 × d2 × d3; then A⊤ is the d2 × d1 × d3 tensor obtained by transposing each of the frontal slices and then reversing the order of the transposed frontal slices 2 through d3.

Definition 3 (identity tensor [15]). The identity tensor I ∈ ℝ^(d×d×d3) is a tensor whose first frontal slice is the d × d identity matrix and all other frontal slices are zero.

Definition 4 (f-diagonal tensor [15]). A tensor is called f-diagonal if each frontal slice of the tensor is a diagonal matrix.

Definition 5 (orthogonal tensor [15]). A tensor Q ∈ ℝ^(d×d×d3) is orthogonal if Q⊤ ∗ Q = Q ∗ Q⊤ = I. Then, t-SVD can be defined as follows.

Definition 6 (t-SVD, tubal rank [15]). Any tensor T ∈ ℝ^(d1×d2×d3) has a tensor singular value decomposition T = U ∗ S ∗ V⊤, where U ∈ ℝ^(d1×d1×d3) and V ∈ ℝ^(d2×d2×d3) are orthogonal tensors and S ∈ ℝ^(d1×d2×d3) is an f-diagonal tensor. The tubal rank of T is defined as the number of non-zero tubes of S, i.e., rtb(T) = #{i : S(i, i, :) ≠ 0}, where # counts the number of elements in a set. For convenience of analysis, the block-diagonal matrix of 3-way tensors is also defined.

Definition 7 (block-diagonal matrix [15]). Let T̄ denote the block-diagonal matrix of the tensor T in the Fourier domain, i.e., the block-diagonal matrix whose kth diagonal block is the kth frontal slice of the Fourier version of T.

Definition 8 (tubal nuclear norm, tensor spectral norm [17]). Given T ∈ ℝ^(d1×d2×d3), let T̂ be its Fourier version in ℂ^(d1×d2×d3), obtained by taking the DFT along all tubes of T. The Tubal Nuclear Norm (TNN) ‖⋅‖tnn of T is defined as the averaged nuclear norm of the frontal slices of T̂, i.e., ‖T‖tnn = (1/d3) Σk ‖T̂^(k)‖*, whereas the tensor spectral norm ‖⋅‖ is the largest spectral norm of the frontal slices, ‖T‖ = maxk ‖T̂^(k)‖. We can see from Definition 8 that TNN captures low-rankness in the spectral domain and is thus more suitable for tensors with spectral low-rankness. As visual data (like images and videos) often possess strong spectral low-rankness, TNN has achieved superior performance over many original domain-based nuclear norms in visual data restoration [6, 17].
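A minimal sketch of Definition 8, assuming (as is standard in the t-SVD literature) that the DFT is taken along the third (tube) mode; the function names are ours:

```python
import numpy as np

def tubal_nuclear_norm(T):
    """TNN: the average of the nuclear norms of the frontal slices
    of the Fourier version of T (DFT along the tubes)."""
    d3 = T.shape[2]
    Tf = np.fft.fft(T, axis=2)                 # Fourier version of T
    return sum(np.linalg.norm(Tf[:, :, k], 'nuc') for k in range(d3)) / d3

def tensor_spectral_norm(T):
    """Tensor spectral norm: the largest spectral norm over the
    frontal slices of the Fourier version."""
    Tf = np.fft.fft(T, axis=2)
    return max(np.linalg.svd(Tf[:, :, k], compute_uv=False)[0]
               for k in range(T.shape[2]))
```

For example, a tensor whose two frontal slices both equal the 2 × 2 identity has Fourier slices 2I and 0, so its TNN and tensor spectral norm are both 2.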

2.2 Original Domain Low-Rankness Modeled by Tucker Decomposition

The low-Tucker-rankness is a classical higher-order extension of matrix low-rankness in the original domain and has been widely applied in computer vision and machine learning [19–21]. Given any K-way tensor T, its Tucker rank is defined as the following vector: rktc(T) = (rank(T(1)), rank(T(2)), …, rank(T(K))), where T(k) denotes the mode-k unfolding (matrix) of T [18] obtained by concatenating all the mode-k fibers of T as column vectors. We can see that the Tucker rank measures the low-rankness of all the mode-k unfoldings T(k) in the original domain.

By relaxing the matrix rank in Eq. 4 to its convex envelope, i.e., the matrix nuclear norm, we get a convex relaxation of the Tucker rank, called the Sum of Nuclear Norms (SNN) [20], which is defined as follows: ‖T‖snn = Σk αk ‖T(k)‖*, where the αk’s are positive constants satisfying Σk αk = 1. As a typical tensor low-rankness penalty in the original domain, SNN has found many applications in tensor recovery [19, 20, 22].
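The SNN relaxation above can be sketched as follows for a 3-way tensor (the `unfold`/`snn` helper names and the uniform default weights are our assumptions):

```python
import numpy as np

def unfold(T, k):
    """Mode-k unfolding T_(k): mode-k fibers become columns."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def snn(T, alphas=(1/3, 1/3, 1/3)):
    """Sum of Nuclear Norms: weighted sum of the nuclear norms of
    the mode-k unfoldings, with weights summing to 1."""
    return sum(a * np.linalg.norm(unfold(T, k), 'nuc')
               for k, a in enumerate(alphas))
```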

3 A Hybrid Norm for Tensor Recovery

In this section, we first define a new norm to exploit low-rankness in both spectral and original domains and then use it to formulate a penalized tensor least squares estimator.

3.1 The Proposed Norm

Although TNN has shown superior performance in many tensor learning tasks, it may still be insufficient for tensors that are low-rank in both spectral and original domains, because it is defined solely in the spectral domain. Moreover, it is also unsuitable for tensors whose spectral low-rankness is less significant than their original domain low-rankness. Thus, it is necessary to extend the vanilla TNN so that the original domain low-rankness can also be exploited for sounder low-rank modeling.

Inspired by SNN’s impressive low-rank modeling capability in the original domain, our idea is quite simple: combine the advantages of both TNN and SNN through their weighted sum. In this line of thinking, we arrive at the following hybrid tensor norm.

Definition 9 (T2NN). The hybrid norm called the “Tubal + Tucker” Nuclear Norm (T2NN) of any 3-way tensor T is defined as the weighted sum of its TNN and SNN as follows: ‖T‖t2nn = γ‖T‖tnn + (1 − γ) Σk αk ‖T(k)‖*, where γ ∈ (0, 1) is a constant balancing the low-rank modeling in the spectral and original domains. As can be seen from its definition, T2NN degenerates to TNN as γ → 1 and to SNN as γ → 0. Thus, it can be viewed as an interpolation between TNN and SNN, which provides more flexibility in low-rank tensor modeling. We also define the dual norm of T2NN (named the dual T2NN norm), which is frequently used in analyzing the statistical performance of the T2NN-based tensor estimator.
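Definition 9 then amounts to a one-line combination of the two norms. The sketch below (with assumed helper names) makes the interpolation behavior at the endpoints γ → 1 (TNN) and γ → 0 (SNN) concrete:

```python
import numpy as np

def unfold(T, k):
    """Mode-k unfolding: mode-k fibers become columns."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def tnn(T):
    """Tubal nuclear norm: averaged nuclear norm of Fourier slices."""
    Tf = np.fft.fft(T, axis=2)
    return sum(np.linalg.norm(Tf[:, :, i], 'nuc')
               for i in range(T.shape[2])) / T.shape[2]

def t2nn(T, gamma=0.5, alphas=(1/3, 1/3, 1/3)):
    """T2NN (Definition 9): gamma * TNN + (1 - gamma) * SNN."""
    snn_val = sum(a * np.linalg.norm(unfold(T, k), 'nuc')
                  for k, a in enumerate(alphas))
    return gamma * tnn(T) + (1 - gamma) * snn_val
```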

Lemma 1. The dual norm of the proposed T2NN can be equivalently formulated as follows:

Proof of Lemma 1. Using the definition of T2NN, the supremum in Problem (7) can be equivalently converted to the negative of an infimum. By introducing a multiplier λ ≥ 0, we obtain the Lagrangian function of Problem (9). Since Slater’s condition [23] is satisfied in Problem (9), strong duality holds. Thus, we proceed by computing the dual function, where step (i) is obtained by the trick of splitting the variable into four auxiliary tensors for simpler analysis, and step (ii) holds because for any positive constant α and any norm f(⋅) with dual norm f*(⋅), the corresponding scaling relationship holds. This completes the proof.

Although an expression of the dual T2NN norm is given in Lemma 1, it is still an optimization problem whose optimal value cannot be straightforwardly computed from the variable tensor. Following the tricks in [22], we instead give an upper bound on the dual T2NN norm, directly in terms of the variable tensor, in the following lemma:

Lemma 2. The dual T2NN norm can be upper bounded as follows:

Proof of Lemma 2. The proof is a direct application of the basic inequality “harmonic mean ≤ arithmetic mean” with a careful construction of the auxiliary tensors in Eq. 8, where the denominator M is given accordingly. By substituting this particular setting of the auxiliary tensors into Eq. 8, and then using “harmonic mean ≤ arithmetic mean” on the right-hand side of Eq. 11, we obtain an inequality that directly leads to Eq. 10.

3.2 T2NN-Based Tensor Recovery

3.2.1 The Observation Model

We use T to denote the underlying tensor, which is unknown. Suppose one observes N ≤ d1d2d3 scalars yi = ⟨Xi, T⟩ + σξi, i = 1, …, N, where the Xi’s are known (deterministic or random) design tensors, the ξi’s are i.i.d. standard Gaussian noises, and σ is a known standard deviation constant measuring the noise level.

Let y = (y1, …, yN)⊤ and ξ = (ξ1, …, ξN)⊤ denote the collections of observations and noises, respectively. Define the design operator X(T) = (⟨X1, T⟩, …, ⟨XN, T⟩)⊤ with adjoint operator X*(y) = Σi yi Xi. Then, the observation model (13) can be rewritten in the compact form y = X(T) + σξ.
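The design operator and its adjoint can be sketched as follows (a minimal NumPy sketch with assumed names; the sizes and noise level are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_operator(design):
    """Given a stack of design tensors of shape (N, d1, d2, d3),
    return the linear map X(T)_i = <X_i, T> and its adjoint
    X*(y) = sum_i y_i * X_i."""
    def X(T):
        return np.einsum('nijk,ijk->n', design, T)
    def X_adj(y):
        return np.einsum('n,nijk->ijk', y, design)
    return X, X_adj

# Compact observation model: y = X(T) + sigma * xi (illustrative sizes).
d, N, sigma = 4, 10, 0.1
design = rng.standard_normal((N, d, d, d))
T_true = rng.standard_normal((d, d, d))
X, X_adj = make_operator(design)
y = X(T_true) + sigma * rng.standard_normal(N)
```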

3.2.2 Two Typical Settings

With different settings of the design tensors Xi, we consider two classical examples in this paper:

  • Tensor completion. In tensor completion, the design tensors are i.i.d. random tensor bases drawn from the uniform distribution on the canonical basis in the space of d1 × d2 × d3 tensors, i.e., tensors of the form ea ∘ eb ∘ ec, where ei denotes the vector whose ith entry is 1 with all other entries 0 and ∘ denotes the tensor outer product [18].

  • Tensor compressive sensing. When X is a random Gaussian design, Model (13) is the tensor compressive sensing model with Gaussian measurements [24]. X is called a random Gaussian design when the Xi’s are random tensors with i.i.d. standard Gaussian entries [22].
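For the completion design, observing ⟨Xi, T⟩ with a random tensor basis ea ∘ eb ∘ ec simply reads one entry of T, as the following sketch (with assumed, illustrative sizes) shows:

```python
import numpy as np

rng = np.random.default_rng(1)
d1, d2, d3, N = 4, 5, 6, 8

# Draw N i.i.d. index triples uniformly; the design tensor for sample i
# is the outer product e_a o e_b o e_c of canonical basis vectors, so
# <X_i, T> just reads the entry T[a, b, c].
idx = rng.integers(0, [d1, d2, d3], size=(N, 3))
T = rng.standard_normal((d1, d2, d3))
y = T[idx[:, 0], idx[:, 1], idx[:, 2]]   # noiseless observations
```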

3.2.3 The Proposed Estimator

The goal of this paper is to recover the unknown low-rank tensor from noisy linear observations y satisfying the observation model (13).

Inspired by the capability of the newly defined T2NN in simultaneously modeling low-rankness in both spectral and original domains, we define the T2NN penalized least squares estimator to estimate the unknown truth, where the squared l2-norm is adopted as the fidelity term for Gaussian noises, the proposed T2NN is used to impose both spectral and original low-rankness in the solution, and λ is a penalization parameter which balances the residual fitting accuracy and the parameter complexity (characterized by low-rankness) of the model.

Given the estimator in Eq. 15, one may naturally ask how well it can estimate the truth and how to compute it. In the following two sections, we first study the estimation performance of by upper bounding its estimation error and then develop an ADMM-based algorithm to efficiently compute it.

4 Statistical Guarantee

In this section, we first come up with a deterministic upper bound of the estimation error and then establish non-asymptotic error bounds for the special cases of tensor compressive sensing with random Gaussian design and noisy tensor completion.

First, to describe the low-rankness of the underlying tensor, we consider both its low-tubal-rank and low-Tucker-rank structures as follows:

  • Low-tubal-rank structure: Let r denote the tubal rank of the underlying tensor. Suppose it has a reduced t-SVD, where the factor tensors are orthogonal and the middle tensor is f-diagonal. Then, following [25], we define the following projections of any tensor:

where I denotes the identity tensor of appropriate dimensionality.

  • Low-Tucker-rank structure: Let (r1, r2, r3) denote the Tucker rank of the underlying tensor. Then, each mode-k unfolding has a reduced SVD factorization, where the factor matrices are orthogonal and the middle matrix is diagonal. Similar to [22], we define the following two projections of an arbitrary tensor for any mode k = 1, 2, 3:

where I denotes the identity matrix of appropriate dimensionality.

4.1 A Deterministic Bound on the Estimation Error

Before bounding the Frobenius-norm estimation error, we first characterize the error tensor under a certain choice of the regularization parameter λ involving the dual T2NN norm in the following proposition.

Proposition 1.

By setting the regularization parameter, we have
  • (I) rank inequality:

  • (II) sum of norms inequality:

  • (III) an upper bound on the “observed” error:

Proof of Proposition 1. The proof is given as follows.

Proof of Part (I): According to the definition of the projection in Eq. 16 and the corresponding facts in [26], the rank inequality follows. Also, according to the definition of the projection in Eq. 17 and the corresponding facts in [26], the analogous bound holds.

Proof of Part (II) and Part (III): The optimality of the estimator for Problem (15) implies a basic inequality. By the definition of the error tensor Δ, this leads to a bound in which the last inequality holds due to the definition of the adjoint operator. According to the definition and upper bound of the dual T2NN norm in Lemma 1 and Lemma 2, and according to the decomposability of TNN (see the supplementary material of [25]) and the decomposability of the matrix nuclear norm [27], we obtain the intermediate inequalities. Using the definition of T2NN and the triangle inequality, and further using the setting of λ, yields Part (III); by combining step (i) with the previous display, Part (II) can be directly proved. Inequality (ii) holds due to the compatibility inequalities between TNN and the matrix nuclear norm [25, 27], and inequality (iii) holds because one can easily verify the corresponding facts in [25] and [27].

Note that inequality (20) gives an upper bound on a quantity that can be seen as the “observed” error. However, we are more concerned with upper bounds on the error ‖Δ‖F itself rather than its observed version. The following assumption builds a bridge between the observed error and ‖Δ‖F.

Assumption 1 (RSC condition). The observation operator X is said to satisfy the Restricted Strong Convexity (RSC) condition with parameter κ if the corresponding inequality holds for any tensor belonging to the restricted direction set. Then, a straightforward combination of Proposition 1 and Assumption 1 leads to a deterministic bound on the estimation error.

Theorem 1. By setting the regularization parameter λ as in Proposition 1, we have the following error bound for any solution to Problem (15):

Note that we do not require information about the distribution of the noise ξ in Theorem 1, which indicates that Theorem 1 provides a deterministic bound for general noise types. The bound on the right-hand side of Eq. 26 is in terms of a quantity which serves as a measure of structural complexity, reflecting the natural intuition that a more complex structure causes a larger error. The result is consistent with the results for sum-of-norms-based estimators in [5, 22, 24, 28]. A more general analysis in [24, 28] indicates that the performance of sum-of-norms-based estimators is determined by all the structural complexities of a simultaneously structured signal, just as shown by the proposed bound (26).

4.2 Tensor Compressive Sensing

In this section, we consider tensor compressive sensing from a random Gaussian design where the Xi’s are random tensors with i.i.d. standard Gaussian entries [22]. First, the RSC condition holds under the random Gaussian design, as shown in the following lemma.

Lemma 3 (RSC of random Gaussian design). If X is a random Gaussian design, then a version of the RSC condition is satisfied with probability at least 1 − 2 exp(−N/32) for any tensor in the restricted direction set defined in Eq. 25.

Proof of Lemma 3. The proof is analogous to that of Proposition 1 in [27]. The difference lies in how we lower bound the right-hand side of (H.7) in [27], where the random vector and random tensor involved are independent, with i.i.d. entries. We bound the quantity in Eq. 27 by splitting it into terms that can be controlled according to Lemma 4. The rest of the proof follows that of Proposition 1 in [27]. The remaining bound is shown in the following lemma.

Lemma 4 (bound on the noise term). Let X be a random Gaussian design. With high probability, the quantity in question is concentrated around its mean, which can be bounded as follows:

Proof. Since the ξm’s are i.i.d. variables, the first bound holds with high probability according to Proposition 8.1 in [29]. For k = 1, 2, 3, let X(ξ)(k) be the mode-k unfolding of the corresponding random tensor. A direct use of Lemma C.1 in [27] leads to the second bound with high probability, and a similar argument yields the third bound with high probability. Combining Eqs. 30 and 31, we can complete the proof. Then, the non-asymptotic error bound is finally obtained as follows.

Theorem 2 (non-asymptotic error bound). Under the random Gaussian design setup, there are universal constants c3, c4, and c5 such that for a sample size N greater than a structure-dependent threshold, any solution to Problem (15) with an appropriately chosen regularization parameter satisfies an error bound which holds with high probability.

To understand the proposed bound, we consider a three-way cubical tensor with regularization weights γ = (1 − γ)α1 = (1 − γ)α2 = (1 − γ)α3 = 1/4. Then, the bound in Theorem 2 simplifies to the following element-wise error, which means the estimation error is controlled by the tubal rank and Tucker rank of the underlying tensor simultaneously. From the right-hand side of Eq. 33, it can be seen that the more observations (i.e., the larger N), the smaller the error; it is also reflected that larger tensors with more complex structures lead to larger errors. This interpretation is consistent with our intuition. Equation 33 also indicates that the sample size N should exceed a structure-dependent threshold for approximate tensor sensing. Another interesting result is that by setting the noise level σ = 0 in Eq. 33, the upper bound reaches 0, which means the proposed estimator can exactly recover the unknown truth in the noiseless setting.

4.3 Noisy Tensor Completion

For noisy tensor completion, we consider a slightly modified estimator, where a known constant constrains the magnitude of the entries of the solution. The constraint is very mild because real signals are all of limited magnitude; e.g., the intensity of pixels in visible light images cannot be greater than 255. The constraint also provides theoretical convenience in excluding “spiky” tensors while controlling the identifiability of the underlying tensor. Similar “non-spiky” constraints are also considered in related work [6, 16, 30].

We consider noisy tensor completion under uniform sampling in this section.

Assumption 2 (uniform sampling scheme). The design tensors Xi are i.i.d. random tensor bases drawn from the uniform distribution Π on the canonical basis set. Recall that Proposition 1 in Section 4.1 gives an upper bound on the “observed part” of the estimation error. As our goal is to establish a bound on ‖Δ‖F, we then connect the observed error with ‖Δ‖F by quantifying the probability of the following RSC property of the sampling operator X when the error tensor Δ belongs to a set defined by an F-norm tolerance parameter and a rank parameter mr = (r, r1, r2, r3), whose values will be specified in the sequel.

Lemma 5 (RSC condition under uniform sampling). For any tensor in the set defined above, the RSC property holds with probability at least the stated level, where e is the base of the natural logarithm and the entries ϵi of the Rademacher vector are i.i.d. Rademacher random variables. Before proving Lemma 5, we first define a subset by upper bounding the F-norm of any element Δ in it, and a quantity ZT defined as the maximal absolute deviation from its expectation over this subset. Lemma 6 shows the concentration behavior of ZT.

Lemma 6 (concentration of ZT). There exists a constant c0 such that the stated concentration inequality holds.

Proof of Lemma 6. The proof is similar to that of Lemma 10 in [31]. The difference lies in the step of the symmetrization argument. Then, Lemma 5 can be proved by using the peeling argument [30].

Proof of Lemma 5. For any positive integer l, we define disjoint subsets of the error set with appropriate constants. Let D = d1d2d3 for simplicity, and define the failure event together with its sub-events. Note that Lemma 6 implies a bound on the probability of each sub-event; taking a union bound over l then leads to the result of Lemma 5. Based on the RSC condition in Lemma 5, we are able to give an upper bound on the estimation error ‖Δ‖F in the following proposition.

Proposition 2. With an appropriate parameter choice, the estimation error satisfies the stated bound with high probability.

Proof of Proposition 2. A direct consequence of property (II) in Proposition 1 and the triangle inequality is that the error tensor Δ satisfies a cone-type condition. Let mr denote the rank complexity of the underlying tensor. By discussing whether the error tensor lies in the restricted set, we consider the following cases. Case 1: If the error tensor is outside the set, then the bound follows from the definition of the set. Case 2: If the error tensor is in the set, then by Proposition 1 and Lemma 5 the bound holds with high probability. By performing some algebra (as in the proof of Theorem 3 in [30]) and combining Case 1 and Case 2, we obtain the result of Proposition 2.

According to Proposition 2 and Lemma 5, it remains to bound the two noise-related quantities; the following lemmas give their bounds, respectively. As the noise variables {ξi} are i.i.d. standard Gaussian, they belong to the sub-exponential family [32], and thus there exists a constant ϱ as the smallest number satisfying the moment condition in [30]. Suppose the sample complexity N in noisy tensor completion satisfies Eq. 44. Then, we have the following Lemma 7 and Lemma 8.

Lemma 7. Under the sample complexity of noisy tensor completion in Eq. 44, the stated bound holds with probability at least the given level, where Cϱ is a constant dependent on the ϱ defined in Eq. 43.

Proof of Lemma 7. The proof can be straightforwardly obtained by adopting the upper bound of the dual T2NN norm in Lemma 2, Lemma 5 in the supplementary material of [25], and Lemma 5 in [30] as follows:

  • First, Lemma 5 in the supplementary material of [25] shows that letting N ≥ d1d3 ∨ d2d3, it holds with probability at least the stated level that

  • For k = 1, 2, 3, let X(ξ)(k) be the mode-k unfolding of the corresponding random tensor. Then, Lemma 5 in [30] indicates that letting N ≥ dk ∨ d\k, it holds with probability at least the stated level that

Then, combining Eqs. 46 and 47 and using the union bound, Eq. 45 can be obtained.

Lemma 8. Under the sample complexity of noisy tensor completion in Eq. 44, it holds that

Proof of Lemma 8. Similar to the proof of Lemma 7, the proof can be straightforwardly obtained by adopting the upper bound of the dual T2NN norm in Lemma 2, Lemma 6 in the supplementary material of [25], and Lemma 6 in [30].

  • First, Lemma 6 in the supplementary material of [25] shows that letting N ≥ d1d3 ∨ d2d3 and N ≥ 2(d1 ∨ d2) log²(d1 ∨ d2) log(d1d3 + d2d3), the following inequality holds:

  • For k = 1, 2, 3, let X∗(ϵ)(k) be the mode-k unfolding of the corresponding random tensor. Then, Lemma 6 in [30] indicates that letting N ≥ dk ∨ d\k and N ≥ 2(dk ∨ d\k) log²(dk ∨ d\k) log(dk + d\k), the following inequality holds:

Then, Eq. 48 can be obtained by combining Eqs. 49 and 50. Further combining Lemma 7, Lemma 8, and Proposition 2, we arrive at an upper bound on the estimation error in the following theorem.

Theorem 3. Suppose Assumption 2 is satisfied and let the sample size N satisfy Eq. 44. By setting the regularization parameter appropriately, the estimation error of any estimator defined in Problem (35) can be upper bounded as in Eq. 52 with high probability.

To understand the proposed bound in Theorem 3, we consider a three-way cubical tensor with regularization weights γ = (1 − γ)α1 = (1 − γ)α2 = (1 − γ)α3 = 1/4. Then, the bound in Eq. 52 simplifies to the element-wise error in Eq. 53, which means the estimation error is controlled by the tubal rank and Tucker rank of the underlying tensor simultaneously. Equation 53 also indicates that the sample size N should exceed a structure-dependent threshold for approximate tensor completion.

5 Optimization Algorithm

The ADMM framework [33] is applied to solve the proposed model. Adding auxiliary variables and to Problem (15) yields an equivalent formulation,

To solve Problem (55), an ADMM-based algorithm is proposed. First, the augmented Lagrangian is formed as follows, where the two additional tensors are the dual variables.

The primal variables can be divided into two blocks: the first block has one tensor variable, whereas the second block consists of four auxiliary variables. We use the minimization scheme of ADMM to update the two blocks alternately at the tth iteration (t = 0, 1, ⋯):

Update the first block: We update the first-block variable by solving the following subproblem with all the other variables fixed:

By taking the derivative with respect to the first-block variable and setting it to zero, we obtain the following equation:

Solving the above equation yields the closed-form update, where I denotes the identity operator.

Update the second block: We update the second-block variables in parallel, keeping all the other variables fixed. First, the TNN-related variable is updated by solving its subproblem, whose solution is given by the proximal operator of TNN in Lemma 9.

Then, the remaining variables are updated by solving the corresponding subproblems (k = 1, 2, 3), where the folding function reshapes a mode-k matricization back to its original tensor format and the solution is given by the proximal operator of the matrix nuclear norm in Lemma 10.

Lemma 9 (proximal operator of TNN [34]). Let a tensor have the t-SVD T0 = U ∗ S ∗ V⊤, where U and V are orthogonal tensors and S is the f-diagonal tensor of singular tubes. Then, the proximal operator of the function ‖⋅‖tnn at point T0 with parameter τ can be computed as follows, where the two operators denote the fast DFT and fast inverse DFT on all tubes of a given tensor, respectively:
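Lemma 9 can be sketched as slice-wise singular value thresholding in the Fourier domain (a minimal sketch with an assumed function name; thresholding every Fourier slice by τ matches the tensor-SVT result for the averaged TNN of Definition 8):

```python
import numpy as np

def prox_tnn(T, tau):
    """Proximal operator of tau * ||.||_tnn: soft-threshold the singular
    values of every frontal slice in the Fourier domain, then apply the
    inverse DFT along the tubes."""
    Tf = np.fft.fft(T, axis=2)
    out = np.empty_like(Tf)
    for k in range(T.shape[2]):
        U, s, Vh = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        out[:, :, k] = (U * np.maximum(s - tau, 0)) @ Vh
    return np.fft.ifft(out, axis=2).real
```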

Lemma 10 (proximal operator of the matrix nuclear norm [35]). Let a matrix have the SVD T0 = USV⊤, where U and V are orthogonal matrices and S is a diagonal matrix of singular values. Then, the proximal operator of the function ‖⋅‖* at point T0 with parameter τ can be computed as follows:

Update the dual variables. We use dual ascent [33] to update the dual variables as follows:

Termination condition. Given a tolerance ϵ > 0, check the termination condition on the primal variables and the convergence of the constraints. The ADMM-based algorithm is described in Algorithm 1.
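Similarly, Lemma 10 is the classical matrix singular value thresholding; a minimal sketch (the name `svt` is ours):

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of tau * ||.||_*: shrink the singular values
    of M by tau and reassemble, i.e., U diag(max(s - tau, 0)) V^T."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0)) @ Vh
```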

Algorithm 1

We then discuss the convergence of Algorithm 1 as follows.

Theorem 4 (convergence of Algorithm 1). For any positive constant ρ, if the unaugmented Lagrangian function has a saddle point, then the iterations in Algorithm 1 satisfy the residual convergence, objective convergence, and dual variable convergence (defined in [33]) of Problem (55) as t → ∞.

Proof of Theorem 4. The key idea is to rewrite Problem (55) as a standard two-block ADMM problem. For notational simplicity, we group the variables into u, v, w, and A, where vec(⋅) denotes the operation of tensor vectorization (see [18]). It can be verified that f(⋅) and g(⋅) are closed, proper convex functions. Then, Problem (55) can be rewritten in the standard two-block form. According to the convergence analysis in [33], convergence holds, where f⋆ and g⋆ are the optimal values of f(u) and g(v), respectively, and the variable w is a dual optimal point formed from the dual variables in a saddle point of the unaugmented Lagrangian. Since there are only equality constraints in the convex problem (55), strong duality holds naturally as a corollary of Slater’s condition [23], which further indicates that the unaugmented Lagrangian has a saddle point. Moreover, according to the analysis in [36], the convergence rate of general ADMM-based algorithms is O(1/T), where T denotes the iteration number. In this way, the convergence behavior of Algorithm 1 is analyzed.

6 Experimental Results

In this section, we first conduct experiments on synthetic datasets to validate the theory for tensor compressed sensing and then evaluate the effectiveness of the proposed T2NN on three types of real data for noisy tensor completion. MATLAB implementations of the algorithms are deployed on a PC running the UOS system with an AMD 3 GHz CPU and 40 GB of RAM.

6.1 Tensor Compressed Sensing

Our theoretical results on tensor compressed sensing are validated on synthetic data in this subsection. Motivated by [7], we consider a constrained T2NN minimization model that is equivalent to Model (15) for ease of parameter selection. For performance evaluation, the proposed T2NN is also compared with TNN-based tensor compressed sensing [37]. First, the underlying tensor and its compressed observations {yi} are synthesized by the following two steps:

  • Step 1: Generate a tensor that is low-rank in both spectral and original domains. Given positive integers d1, d2, d3, and r ≤ min{d1, d2, d3}, we first generate an intermediate tensor as the product of two factor tensors with i.i.d. standard Gaussian entries. Then, we multiply it along the third mode, where ×3 is the tensor mode-3 product [18], by a matrix with i.i.d. standard Gaussian entries. Our extensive numerical results show that, with high probability, the tubal rank and the Tucker rank of the generated tensor are all equal to r.

  • Step 2: Generate N compressed observations {yi}. Given a positive observation number N, we first generate N design tensors with i.i.d. standard Gaussian entries. Then, N noise variables {ξi} are generated as i.i.d. standard Gaussian variables. The standard deviation is set by σ = cσ0, where σ0 is a rescaled magnitude of the underlying tensor and c denotes the noise level. Finally, {yi} are formed according to the observation model (13). The goal of tensor compressed sensing is to reconstruct the unknown tensor from its noisy compressed observations {yi}.
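The two generation steps above can be sketched in Python/NumPy as follows (a minimal illustration under our own reading of the construction: the t-product is implemented in the standard DFT-based way, function names are ours, and dimensions are reduced for speed):

```python
import numpy as np

rng = np.random.default_rng(0)

def tprod(A, B):
    """DFT-based t-product: transform along mode-3, multiply matching
    frontal slices, then transform back."""
    Af, Bf = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    Cf = np.einsum('irk,rjk->ijk', Af, Bf)
    return np.real(np.fft.ifft(Cf, axis=2))

d, r = 8, 2                                    # small sizes for illustration

# Step 1: a tensor low-rank in both domains (one plausible reading).
P = rng.standard_normal((d, r, r))             # i.i.d. Gaussian factor tensors
Q = rng.standard_normal((r, d, r))
G = tprod(P, Q)                                # d x d x r, tubal rank <= r
M = rng.standard_normal((d, r))                # i.i.d. Gaussian matrix
X = np.einsum('ijs,ks->ijk', G, M)             # mode-3 product: d x d x d

# Step 2: N noisy Gaussian measurements y_i = <A_i, X> + sigma * xi_i.
N, sigma = 4 * r * d * d, 0.0                  # N/N0 = 4 with N0 = r d^2
As = rng.standard_normal((N, d, d, d))         # design tensors
y = np.einsum('nijk,ijk->n', As, X) + sigma * rng.standard_normal(N)
```

By construction, the mode-3 Tucker rank of `X` is at most r; the paper reports that, empirically, the tubal rank and the full Tucker rank also equal r with high probability.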

For simplicity, we consider cubic tensors, i.e., d1 = d2 = d3 = d, and choose the parameters of T2NN by γ = 1/4 and α1 = α2 = α3 = 1/3. Recall that the underlying tensor generated by the above Step 1 has tubal rank and Tucker rank all equal to r with high probability. We consider tensors with dimensionality d ∈ {16, 20, 24} and rank proxy r ∈ {2, 3}. Then, if the proposed main theorem for tensor compressed sensing (i.e., Theorem 2) is correct, the following two phenomena should be observed:

  • (1) Phenomenon 1: In the noiseless setting, i.e., σ = 0, if the observation number N is larger than C0rd² for a sufficiently large constant C0, then the estimation error can be zero, which means exact recovery. Let N0 = rd² be a unit measure of the sample complexity. Then, by increasing the observation number N gradually from 0, we should observe a phase transition point of the estimation error in the noiseless setting: if N/N0 < C0, the estimation error is relatively "large"; once N/N0 ≥ C0, the error drops dramatically to 0.

  • (2) Phenomenon 2: In the noisy case, the estimation error scales linearly with the variance σ² of the random noise once the observation number N ≥ C0N0.

To check whether Phenomenon 1 occurs, we conduct tensor compressed sensing with the noise variance σ² = 0 and gradually increase the normalized observation number N/N0 from 0.25 to 5. For each setting of d, r, and N/N0, we repeat the experiment 10 times and report the averaged estimation error. For both TNN [37] and the proposed T2NN, we plot the curves of estimation error (in logarithm) versus the normalized observation number N/N0 for tensors with rank proxy r = 2 in Figure 1. It can be seen that Phenomenon 1 occurs for the proposed T2NN: when N/N0 < 1.75, the estimation error is relatively "large"; once N/N0 ≥ 1.75, the error drops dramatically to 0. The same phenomenon also occurs for TNN, with a phase transition point near 3.5. Thus, the sample complexity for exact tensor compressed sensing of T2NN is lower than that of TNN, indicating the superiority of the proposed T2NN. Since similar phenomena have also been observed for tensors of other sizes and rank proxies, we omit those results.

FIGURE 1

Estimation error (in logarithm) vs. the normalized observation number N/N0 for tensor compressed sensing of underlying tensors of size 16 × 16 × 16 and rank proxy r = 2. The proposed T2NN is compared with TNN [37].

For the validation of Phenomenon 2, we consider noisy settings with normalized sample complexity N/N0 = 3.5, which is nearly the phase transition point of TNN and much greater than that of T2NN. We gradually increase the noise level c = σ/σ0 from 0.025 to 0.25. For each setting of d, r, and c, we repeat the experiment 10 times and report the averaged estimation error. For both TNN [37] and the proposed T2NN, we plot the curves of estimation error (in logarithm) versus the squared noise level for tensors with rank proxy r = 2 in Figure 2. It can be seen that Phenomenon 2 also occurs for the proposed T2NN: the estimation error scales approximately linearly with the squared noise level. The same phenomenon can also be observed for TNN, with a higher estimation error than T2NN, indicating that T2NN is more accurate than TNN. We omit the results for tensors of other sizes and rank proxies because the error curves are very similar to Figure 2.

FIGURE 2

Estimation error vs. the squared noise level for tensor compressed sensing of underlying tensors of size 16 × 16 × 16 and rank proxy r = 2. The proposed T2NN is compared with TNN [37].

6.2 Noisy Tensor Completion

This subsection evaluates the effectiveness of the proposed T2NN through performance comparison with the matrix nuclear norm (NN) [30], SNN [22], and TNN [25] by carrying out noisy tensor completion on three different types of visual data: video data, hyperspectral images, and seismic data.

6.2.1 Experimental Settings

Given the tensor data, the goal is to recover it from its partial noisy observations. We consider uniform sampling with ratio p ∈ {0.05, 0.1, 0.15}, that is, {95%, 90%, 85%} of the entries of a tensor are missing. The noise follows an i.i.d. Gaussian distribution with standard deviation σ = 0.05σ0, where σ0 is the rescaled magnitude of the tensor.
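The observation process just described can be sketched as follows (a minimal illustration; the definition of the rescaled magnitude σ0 as the root-mean-square entry is our assumption, since the paper's formula is not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_observations(X, p, c=0.05):
    """Uniformly sample a fraction p of the entries and add i.i.d. Gaussian
    noise with standard deviation c * sigma0, where sigma0 is taken here as
    ||X||_F / sqrt(#entries) (an illustrative rescaled magnitude)."""
    mask = rng.random(X.shape) < p                 # keep each entry w.p. p
    sigma0 = np.linalg.norm(X) / np.sqrt(X.size)
    noise = c * sigma0 * rng.standard_normal(X.shape)
    return mask, np.where(mask, X + noise, 0.0)

X = rng.standard_normal((16, 16, 8))
mask, Y = make_observations(X, p=0.10)
```

The completion algorithms then receive only `Y` on the support `mask` and must impute the remaining 90% of the entries.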

6.2.2 Performance Evaluation

The effectiveness of the algorithms is measured by the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity (SSIM) [38]. Specifically, the PSNR of an estimator is defined with respect to the underlying tensor, and the SSIM is computed from the local means, standard deviations, cross-covariance, and dynamic range of the magnitudes of the estimated and underlying tensors. Larger PSNR and SSIM values indicate higher quality of the estimator. In each setting, we test each tensor for 10 trials and report the averaged PSNR (in dB) and SSIM values.
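A common tensor PSNR convention can be sketched as follows (the paper's exact formula is not reproduced above, so this is the standard definition used in much tensor-completion work, stated as an assumption):

```python
import numpy as np

def psnr(X_hat, X):
    """Tensor PSNR in dB: 10 * log10(n * peak^2 / ||X_hat - X||_F^2),
    where n is the number of entries and peak is the maximum magnitude of X.
    This is a common convention; the paper's definition may differ slightly."""
    n = X.size
    peak = np.abs(X).max()
    mse = np.sum((X_hat - X) ** 2)
    return 10.0 * np.log10(n * peak ** 2 / mse)
```

For example, a uniform perturbation of 0.01 on a tensor with unit peak yields a PSNR of 40 dB under this convention.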

6.2.3 Parameter Setting

For NN [30], we set the regularization parameter accordingly. For SNN [22], we set the regularization parameter λ = λι and choose the weight α by α1 : α2 : α3 = 1 : 1 : 1. For TNN [25], we set the parameter accordingly. For the proposed T2NN, we set the regularization parameter and choose the weights γ = 0.5 and α with α1 : α2 : α3 = 1 : 1 : 10. The factor λι is then tuned in {10^-3, 10^-2, … , 10^3} for each norm, and we choose the value attaining the highest PSNRs in most cases in the parameter-tuning phase.
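The tuning phase described above amounts to a simple grid search (a hypothetical sketch; `solve_completion` stands in for any of the four estimators and is not a function from the paper):

```python
import numpy as np

def tune_lambda(solve_completion, Y, mask, X_true, score):
    """Sweep the regularization factor over {1e-3, ..., 1e3} and return the
    value whose reconstruction scores highest (e.g., by PSNR) against X_true."""
    grid = [10.0 ** e for e in range(-3, 4)]
    scores = {lam: score(solve_completion(Y, mask, lam), X_true) for lam in grid}
    return max(scores, key=scores.get)
```

In practice the score would be the PSNR on a held-out validation portion of the observed entries rather than on the unknown ground truth.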

6.2.4 Experiments on Video Data

We first conduct noisy video completion on four widely used YUV videos: Akiyo, Carphone, Grandma, and Mother-daughter. Owing to computational limitations, we use only the first 30 frames of the Y components of all the videos and obtain four tensors of size 144 × 176 × 30. We first report the averaged PSNR and SSIM values obtained by the four norms for quantitative comparison in Table 1 and then give visual examples in Figure 3, where 95% of the tensor entries are missing, for qualitative evaluation. A demo of the source code is available at https://github.com/pingzaiwang/T2NN-demo.

TABLE 1

(a) Akiyo

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 15.37 | 19.49 | 27.04 | 27.51 |
| | SSIM | 0.1864 | 0.6047 | 0.8019 | 0.8302 |
| 10% | PSNR | 18.01 | 22.54 | 29.18 | 30.08 |
| | SSIM | 0.2858 | 0.7186 | 0.8556 | 0.8828 |
| 15% | PSNR | 19.64 | 24.37 | 30.60 | 31.43 |
| | SSIM | 0.3694 | 0.7812 | 0.8791 | 0.8968 |

(b) Carphone

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 13.58 | 17.58 | 23.81 | 24.16 |
| | SSIM | 0.1378 | 0.5282 | 0.6725 | 0.7226 |
| 10% | PSNR | 16.04 | 20.49 | 25.38 | 25.87 |
| | SSIM | 0.2352 | 0.6425 | 0.7309 | 0.7813 |
| 15% | PSNR | 17.86 | 22.42 | 26.40 | 26.81 |
| | SSIM | 0.3242 | 0.7139 | 0.7663 | 0.8057 |

(c) Grandma

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 16.52 | 19.13 | 28.53 | 28.84 |
| | SSIM | 0.1928 | 0.5503 | 0.8135 | 0.8516 |
| 10% | PSNR | 18.34 | 22.52 | 31.44 | 32.64 |
| | SSIM | 0.2992 | 0.6755 | 0.8822 | 0.9141 |
| 15% | PSNR | 19.66 | 24.88 | 33.02 | 34.28 |
| | SSIM | 0.3757 | 0.7525 | 0.9088 | 0.9307 |

(d) Mother–daughter

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 16.17 | 20.22 | 27.66 | 27.96 |
| | SSIM | 0.1895 | 0.5824 | 0.7491 | 0.7816 |
| 10% | PSNR | 18.55 | 23.48 | 29.38 | 30.32 |
| | SSIM | 0.2780 | 0.6840 | 0.8035 | 0.8412 |
| 15% | PSNR | 20.32 | 25.31 | 30.49 | 31.18 |
| | SSIM | 0.3548 | 0.7405 | 0.8293 | 0.8526 |

PSNR and SSIM values obtained by four norms (NN [30], SNN [22], TNN [25], and our T2NN) for noisy tensor completion on the YUV videos.

The highest PSNR/SSIM values are highlighted in bold.

FIGURE 3

Visual results obtained by the four norms for noisy tensor completion with 95% missing entries on the YUV-video dataset. The first to fourth rows correspond to the videos Akiyo, Carphone, Grandma, and Mother-daughter, respectively. The sub-plots (A) to (F): (A) a frame of the original video, (B) the observed frame, (C) the frame recovered by NN [30], (D) the frame recovered by SNN [22], (E) the frame recovered by the vanilla TNN [25], and (F) the frame recovered by our T2NN.

6.2.5 Experiments on Hyperspectral Data

We then carry out noisy tensor completion on subsets of the two representative hyperspectral datasets described as follows:

  • Indian Pines: The dataset was collected by the AVIRIS sensor in 1992 over the Indian Pines test site in North-western Indiana and consists of 145 × 145 pixels and 224 spectral reflectance bands. Owing to the limitation of computing resources, we use only the first 30 bands in the experiments.

  • Salinas A: The data were acquired by the AVIRIS sensor over the Salinas Valley, California, in 1998, and consist of 224 bands over a spectral range of 400–2500 nm. This dataset has a spatial extent of 86 × 83 pixels with a resolution of 3.7 m. We also use only the first 30 bands in the experiments.

The averaged PSNR and SSIM values are given in Table 2 for quantitative comparison. We also show visual examples in Figure 4 when 85% of the tensor entries are missing for qualitative evaluation.

TABLE 2

(a) Indian Pines

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 20.44 | 22.01 | 25.68 | 26.00 |
| | SSIM | 0.3895 | 0.6359 | 0.6293 | 0.6730 |
| 10% | PSNR | 22.23 | 24.94 | 27.45 | 28.27 |
| | SSIM | 0.4836 | 0.7171 | 0.7226 | 0.7724 |
| 15% | PSNR | 23.52 | 26.61 | 28.54 | 29.11 |
| | SSIM | 0.5438 | 0.7668 | 0.7713 | 0.7979 |

(b) Salinas A

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 15.21 | 20.79 | 22.55 | 23.68 |
| | SSIM | 0.2594 | 0.7547 | 0.5667 | 0.7013 |
| 10% | PSNR | 20.62 | 25.56 | 25.72 | 27.93 |
| | SSIM | 0.4775 | 0.8284 | 0.7027 | 0.8291 |
| 15% | PSNR | 23.09 | 27.99 | 28.06 | 29.67 |
| | SSIM | 0.5643 | 0.8622 | 0.7804 | 0.8671 |

PSNR and SSIM values obtained by four norms (NN [30], SNN [22], TNN [25], and our T2NN) for noisy tensor completion on the hyperspectral datasets.

The highest PSNR/SSIM values are highlighted in bold.

FIGURE 4

FIGURE 4

Visual results obtained by four norms for noisy tensor completion with 85% missing entries on the hyperspectral dataset (gray data shown with pseudo-color). The first and second rows correspond to Indian Pines and Salinas A, respectively. The sub-plots from (A) to (F): (A) a frame of the original data, (B) the observed frame, (C) the frame recovered by NN [30], (D) the frame recovered by SNN [22], (E) the frame recovered by the vanilla TNN [25], and (F) the frame recovered by our T2NN.

6.2.6 Experiments on Seismic Data

We use a seismic data tensor of size 512 × 512 × 3, which is extracted from the test data "seismic.mat" of a toolbox for seismic data processing from the Center of Geophysics, Harbin Institute of Technology, China. For quantitative comparison, we present the PSNR and SSIM values for three sampling ratios in Table 3.

TABLE 3

| Sampling ratio | Index | NN | SNN | TNN | T2NN |
|---|---|---|---|---|---|
| 5% | PSNR | 22.25 | 22.28 | 22.49 | 22.80 |
| | SSIM | 0.4369 | 0.4906 | 0.3928 | 0.4794 |
| 10% | PSNR | 23.58 | 23.53 | 23.48 | 24.05 |
| | SSIM | 0.5462 | 0.5740 | 0.5004 | 0.5845 |
| 15% | PSNR | 24.74 | 24.66 | 24.51 | 25.25 |
| | SSIM | 0.6266 | 0.6552 | 0.5898 | 0.6657 |

PSNR and SSIM values obtained by four norms (NN [30], SNN [22], TNN [25], and our T2NN) for noisy tensor completion on the Seismic dataset.

The highest PSNR/SSIM values are highlighted in bold.

6.2.7 Summary and Analysis of Experimental Results

According to the experimental results on the three types of real tensor data shown in Tables 1–3 and Figure 3, the summary and analysis are presented as follows:

  • 1) In all cases, the tensor norms (SNN, TNN, and T2NN) perform better than the matrix norm (NN). This can be explained as follows: tensor norms faithfully preserve the multi-way structure of tensor data, so that the rich inter-modal and intra-modal correlations of the data can be exploited to impute the missing values, whereas the matrix norm can only handle two-way structure and thus fails to model the multi-way structural correlations of the tensor data.

  • 2) In most cases, TNN outperforms SNN, which is consistent with the results reported in [14, 17, 25]. One explanation is that the video, hyperspectral, and seismic data used here all possess stronger low-rankness in the spectral domain than in the original domain, which is successfully captured by TNN.

  • 3) In most cases, the proposed T2NN performs best among the four norms. We attribute this promising performance to the capability of T2NN to simultaneously exploit low-rankness in both spectral and original domains.

7 Conclusion and Discussions

7.1 Conclusion

Due to its definition solely in the spectral domain, the popular TNN may be incapable of exploiting low-rankness in the original domain. To remedy this weakness, a hybrid tensor norm named the "Tubal + Tucker" Nuclear Norm (T2NN) was first defined as the weighted sum of TNN and SNN to model low-rankness in both spectral and original domains. It was further used to formulate a penalized least squares estimator for tensor recovery from noisy linear observations. Upper bounds on the estimation error were established in both deterministic and non-asymptotic senses to analyze the statistical performance of the proposed estimator. An ADMM-based algorithm was also developed to efficiently compute the estimator. The effectiveness of the proposed model was demonstrated through experimental results on both synthetic and real datasets.

7.2 Limitations of the Proposed Model and Possible Solutions

Generally speaking, the proposed estimator has the following two drawbacks due to the adoption of T2NN:

  • Sample inefficiency: The analysis of [24, 28] indicates that for tensor recovery from a small number of observations, T2NN cannot provide essentially lower sample complexity than TNN.

  • Computational inefficiency: Compared to TNN, T2NN is more time-consuming since it involves computing both TNN and SNN.

We list several directions that this work can be extended to overcome the above drawbacks.

  • For sample inefficiency: First, inspired by the attempt of adopting the "best" norm (e.g., Eq. 8 in [28]), a constrained model can be considered for a certain noise level ϵ ≥ 0. Although Model (65) enjoys significantly higher accuracy and lower sample complexity according to the analysis in [28], it is impractical because it requires rank parameters (k = 1, 2, 3) that are unknown in advance. Motivated by [39], a more practical penalized model can be formulated, where β > 0 is a regularization parameter.

  • For computational inefficiency: To improve the efficiency of the proposed T2NN-based models, we can use more efficient solvers of Problem (15) by adopting the factorization strategy [40, 41] or sampling-based approaches [42].

7.3 Extensions to the Proposed Model

In this subsection, we discuss possible extensions of the proposed model to general K-order (K > 3) tensors, general spectral domains, robust tensor recovery, and multi-view learning, respectively.

  • Extensions to K-order (K > 3) tensors: Currently, the proposed T2NN is defined solely for 3-order tensors and cannot be directly applied to tensors of order higher than 3, such as color videos. For general K-order tensors, we suggest replacing the tubal nuclear norm in the definition of T2NN with the orientation-invariant tubal nuclear norm [5], which is defined to exploit multi-orientational spectral low-rankness for general higher-order tensors.

  • Extensions to general spectral and original domains: This paper considers the DFT-based tensor product for spectral low-rank modeling. Recently, the DFT-based t-product has been generalized to the *L-product defined via any invertible linear transform [43], under which the tubal nuclear norm is also extended to the *L-tubal nuclear norm [44] and the *L-Spectral k-support norm [7]. It is natural to generalize the proposed T2NN by replacing the tubal nuclear norm with the *L-tubal nuclear norm or the *L-Spectral k-support norm. It is also interesting to consider other tensor decompositions, such as CP, TT, and TR, for original-domain low-rankness modeling as future work.

  • Extensions to robust tensor recovery: In many real applications, the tensor signal may also be corrupted by gross sparse outliers. Motivated by [5], the proposed T2NN can also be used to resist sparse outliers for robust tensor recovery, where the outlier tensor is encouraged to be sparse by the tensor l1-norm ‖⋅‖1, and μ > 0 is a regularization parameter.

  • Extensions to multi-view learning: Due to its superiority in modeling multi-linear correlations of multi-modal data, TNN has been successfully applied to multi-view self-representations for clustering [45, 46]. Our proposed T2NN can also be utilized for clustering by straightforwardly replacing TNN in the formulation of multi-view learning models (e.g., Eq. 9 in [45]).
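The *L-product mentioned in the extension to general spectral domains above can be sketched as follows (a minimal illustration of the construction of [43]; the function name `l_product` is ours): the DFT of the t-product is replaced by an arbitrary invertible transform L applied along mode-3.

```python
import numpy as np

def l_product(A, B, L):
    """*L-product of 3-way tensors: apply the invertible transform L along
    mode-3, multiply matching frontal slices, then invert the transform."""
    Linv = np.linalg.inv(L)
    Ah = np.einsum('ijs,ks->ijk', A, L)            # transform the tubes of A
    Bh = np.einsum('ijs,ks->ijk', B, L)
    Ch = np.einsum('irk,rjk->ijk', Ah, Bh)         # face-wise matrix products
    return np.einsum('ijs,ks->ijk', Ch, Linv)      # back-transform

# With L equal to the (unnormalized) DFT matrix, this reduces to the usual
# DFT-based t-product; with L = I it degenerates to slice-wise products.
```

A T2NN variant for a general transform would then replace the DFT in the tubal nuclear norm with L throughout.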

Statements

Author contributions

Conceptualization and methodology—YL and AW; software—AW; formal analysis—YL, AW, GZ, and QZ; resources—YL, GZ, and QZ; writing: original draft preparation—YL, AW, GZ, and QZ; project administration and supervision—GZ, and QZ; and funding acquisition—AW, GZ, and QZ. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 61872188, 62073087, 62071132, 62103110, 61903095, U191140003, and 61973090, in part by the China Postdoctoral Science Foundation under Grant 2020M672536, and in part by the Natural Science Foundation of Guangdong Province under Grants 2020A1515010671, 2019B010154002, and 2019B010118001.

Acknowledgments

AW is grateful to Prof. Zhong Jin in Nanjing University of Science and Technology for his long-time and generous support in both research and life. In addition, he would like to thank the Jin family in Zhuzhou for their kind understanding in finishing the project of tensor learning in these years.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

1.^The Fourier version is obtained by performing the 1D-DFT on all tubes of the tensor, i.e., along the third mode in MATLAB.

References

  • 1. Guo C, Modi K, Poletti D. Tensor-Network-Based Machine Learning of Non-Markovian Quantum Processes. Phys Rev A (2020) 102:062414.
  • 2. Ma X, Zhang P, Zhang S, Duan N, Hou Y, Zhou M, et al. A Tensorized Transformer for Language Modeling. Adv Neural Inf Process Syst (2019) 32.
  • 3. Meng Y-M, Zhang J, Zhang P, Gao C, Ran S-J. Residual Matrix Product State for Machine Learning. arXiv preprint arXiv:2012.11841 (2020).
  • 4. Ran S-J, Sun Z-Z, Fei S-M, Su G, Lewenstein M. Tensor Network Compressed Sensing with Unsupervised Machine Learning. Phys Rev Res (2020) 2:033293. doi: 10.1103/physrevresearch.2.033293
  • 5. Wang A, Zhao Q, Jin Z, Li C, Zhou G. Robust Tensor Decomposition via Orientation Invariant Tubal Nuclear Norms. Sci China Technol Sci (2022) 34:6102. doi: 10.1007/s11431-021-1976-2
  • 6. Zhang X, Ng MK-P. Low Rank Tensor Completion with Poisson Observations. IEEE Trans Pattern Anal Mach Intell (2021). doi: 10.1109/tpami.2021.3059299
  • 7. Wang A, Zhou G, Jin Z, Zhao Q. Tensor Recovery via *L-Spectral k-Support Norm. IEEE J Sel Top Signal Process (2021) 15:522–34. doi: 10.1109/jstsp.2021.3058763
  • 8. Cui C, Zhang Z. High-Dimensional Uncertainty Quantification of Electronic and Photonic IC with Non-Gaussian Correlated Process Variations. IEEE Trans Computer-Aided Des Integrated Circuits Syst (2019) 39:1649–61. doi: 10.1109/TCAD.2019.2925340
  • 9. Liu X-Y, Aeron S, Aggarwal V, Wang X. Low-Tubal-Rank Tensor Completion Using Alternating Minimization. IEEE Trans Inform Theor (2020) 66:1714–37. doi: 10.1109/tit.2019.2959980
  • 10. Carroll JD, Chang J-J. Analysis of Individual Differences in Multidimensional Scaling via an N-Way Generalization of "Eckart-Young" Decomposition. Psychometrika (1970) 35:283–319. doi: 10.1007/bf02310791
  • 11. Tucker LR. Some Mathematical Notes on Three-Mode Factor Analysis. Psychometrika (1966) 31:279–311. doi: 10.1007/bf02289464
  • 12. Oseledets IV. Tensor-Train Decomposition. SIAM J Sci Comput (2011) 33:2295–317. doi: 10.1137/090752286
  • 13. Zhao Q, Zhou G, Xie S, Zhang L, Cichocki A. Tensor Ring Decomposition. arXiv preprint arXiv:1606.05535 (2016).
  • 14. Zhang Z, Ely G, Aeron S, Hao N, Kilmer M. Novel Methods for Multilinear Data Completion and De-Noising Based on Tensor-SVD. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014). p. 3842–9. doi: 10.1109/cvpr.2014.485
  • 15. Kilmer ME, Braman K, Hao N, Hoover RC. Third-Order Tensors as Operators on Matrices: A Theoretical and Computational Framework with Applications in Imaging. SIAM J Matrix Anal Appl (2013) 34:148–72. doi: 10.1137/110837711
  • 16. Hou J, Zhang F, Qiu H, Wang J, Wang Y, Meng D. Robust Low-Tubal-Rank Tensor Recovery from Binary Measurements. IEEE Trans Pattern Anal Mach Intell (2021). doi: 10.1109/tpami.2021.3063527
  • 17. Lu C, Feng J, Chen Y, Liu W, Lin Z, Yan S. Tensor Robust Principal Component Analysis with a New Tensor Nuclear Norm. IEEE Trans Pattern Anal Mach Intell (2020) 42:925–38. doi: 10.1109/tpami.2019.2891760
  • 18. Kolda TG, Bader BW. Tensor Decompositions and Applications. SIAM Rev (2009) 51:455–500. doi: 10.1137/07070111x
  • 19. Li X, Wang A, Lu J, Tang Z. Statistical Performance of Convex Low-Rank and Sparse Tensor Recovery. Pattern Recognition (2019) 93:193–203. doi: 10.1016/j.patcog.2019.03.014
  • 20. Liu J, Musialski P, Wonka P, Ye J. Tensor Completion for Estimating Missing Values in Visual Data. IEEE Trans Pattern Anal Mach Intell (2013) 35:208–20. doi: 10.1109/tpami.2012.39
  • 21. Qiu Y, Zhou G, Chen X, Zhang D, Zhao X, Zhao Q. Semi-Supervised Non-Negative Tucker Decomposition for Tensor Data Representation. Sci China Technol Sci (2021) 64:1881–92. doi: 10.1007/s11431-020-1824-4
  • 22. Tomioka R, Suzuki T, Hayashi K, Kashima H. Statistical Performance of Convex Tensor Decomposition. In: Proceedings of Annual Conference on Neural Information Processing Systems (2011). p. 972–80.
  • 23. Boyd S, Vandenberghe L. Convex Optimization. Cambridge: Cambridge University Press (2004).
  • 24. Mu C, Huang B, Wright J, Goldfarb D. Square Deal: Lower Bounds and Improved Relaxations for Tensor Recovery. In: International Conference on Machine Learning (2014). p. 73–81.
  • 25. Wang A, Lai Z, Jin Z. Noisy Low-Tubal-Rank Tensor Completion. Neurocomputing (2019) 330:267–79. doi: 10.1016/j.neucom.2018.11.012
  • 26. Zhou P, Lu C, Lin Z, Zhang C. Tensor Factorization for Low-Rank Tensor Completion. IEEE Trans Image Process (2018) 27:1152–63. doi: 10.1109/tip.2017.2762595
  • 27. Negahban S, Wainwright MJ. Estimation of (Near) Low-Rank Matrices with Noise and High-Dimensional Scaling. Ann Stat (2011) 39:1069–97. doi: 10.1214/10-aos850
  • 28. Oymak S, Jalali A, Fazel M, Eldar YC, Hassibi B. Simultaneously Structured Models with Application to Sparse and Low-Rank Matrices. IEEE Trans Inform Theor (2015) 61:2886–908. doi: 10.1109/tit.2015.2401574
  • 29. Foucart S, Rauhut H. A Mathematical Introduction to Compressive Sensing. Basel, Switzerland: Birkhäuser Basel (2013).
  • 30. Klopp O. Noisy Low-Rank Matrix Completion with General Sampling Distribution. Bernoulli (2014) 20:282–303. doi: 10.3150/12-bej486
  • 31. Klopp O. Matrix Completion by Singular Value Thresholding: Sharp Bounds. Electron J Stat (2015) 9:2348–69. doi: 10.1214/15-ejs1076
  • 32. Vershynin R. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge: Cambridge University Press (2018).
  • 33. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Foundations Trends Machine Learn (2011) 3:1–122. doi: 10.1561/2200000016
  • 34. Wang A, Wei D, Wang B, Jin Z. Noisy Low-Tubal-Rank Tensor Completion Through Iterative Singular Tube Thresholding. IEEE Access (2018) 6:35112–28. doi: 10.1109/access.2018.2850324
  • 35. Cai J-F, Candès EJ, Shen Z. A Singular Value Thresholding Algorithm for Matrix Completion. SIAM J Optim (2010) 20:1956–82. doi: 10.1137/080738970
  • 36. He B, Yuan X. On the O(1/n) Convergence Rate of the Douglas-Rachford Alternating Direction Method. SIAM J Numer Anal (2012) 50:700–9. doi: 10.1137/110836936
  • 37. Lu C, Feng J, Lin Z, Yan S. Exact Low Tubal Rank Tensor Recovery from Gaussian Measurements. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (2018). p. 1948–54. doi: 10.24963/ijcai.2018/347
  • 38. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans Image Process (2004) 13:600–12. doi: 10.1109/tip.2003.819861
  • 39. Zhang X, Zhou Z, Wang D, Ma Y. Hybrid Singular Value Thresholding for Tensor Completion. In: Twenty-Eighth AAAI Conference on Artificial Intelligence (2014). p. 1362–8.
  • 40. Wang A-D, Jin Z, Yang J-Y. A Faster Tensor Robust PCA via Tensor Factorization. Int J Mach Learn Cyber (2020) 11:2771–91. doi: 10.1007/s13042-020-01150-2
  • 41. Liu G, Yan S. Active Subspace: Toward Scalable Low-Rank Learning. Neural Comput (2012) 24:3371–94. doi: 10.1162/neco_a_00369
  • 42. Wang L, Xie K, Semong T, Zhou H. Missing Data Recovery Based on Tensor-CUR Decomposition. IEEE Access (2017).
  • 43. Kernfeld E, Kilmer M, Aeron S. Tensor-Tensor Products with Invertible Linear Transforms. Linear Algebra Its Appl (2015) 485:545–70. doi: 10.1016/j.laa.2015.07.021
  • 44. Lu C, Peng X, Wei Y. Low-Rank Tensor Completion with a New Tensor Nuclear Norm Induced by Invertible Linear Transforms. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019). p. 5996–6004. doi: 10.1109/cvpr.2019.00615
  • 45. Lu G-F, Zhao J. Latent Multi-View Self-Representations for Clustering via the Tensor Nuclear Norm. Appl Intelligence (2021). doi: 10.1007/s10489-021-02710-x
  • 46. Liu Y, Zhang X, Tang G, Wang D. Multi-View Subspace Clustering Based on Tensor Schatten-p Norm. In: 2019 IEEE International Conference on Big Data (Big Data). Los Angeles, CA, USA: IEEE (2019). p. 5048–55. doi: 10.1109/bigdata47090.2019.9006347

Keywords

tensor decomposition, tensor low-rankness, tensor SVD, tubal nuclear norm, tensor completion

Citation

Luo Y, Wang A, Zhou G and Zhao Q (2022) A Hybrid Norm for Guaranteed Tensor Recovery. Front. Phys. 10:885402. doi: 10.3389/fphy.2022.885402

Received

28 February 2022

Accepted

27 April 2022

Published

13 July 2022

Volume

10 - 2022

Edited by

Peng Zhang, Tianjin University, China

Reviewed by

Jingyao Hou, Southwest University, China

Yong Peng, Hangzhou Dianzi University, China

Jing Lou, Changzhou Institute of Mechatronic Technology, China

Guifu Lu, Anhui Polytechnic University, China

Copyright

*Correspondence: Andong Wang,

This article was submitted to Statistical and Computational Physics, a section of the journal Frontiers in Physics

