ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 06 May 2025

Sec. Optimization

Volume 11 - 2025 | https://doi.org/10.3389/fams.2025.1587681

This article is part of the Research Topic: Large Tensor Analysis and Applications.

Internet traffic data recovery via a low-rank spatio-temporal regularized optimization approach without d-th order T-SVD


Yuxuan Duan1, Chen Ling2, Jinjie Liu1* and Xinmin Yang1
  • 1National Center for Applied Mathematics in Chongqing, Chongqing Normal University, Chongqing, China
  • 2Department of Mathematics, Hangzhou Dianzi University, Hangzhou, China

Accurate recovery of Internet traffic data can mitigate the adverse impact of incomplete data on network task processes. In this study, we propose a low-rank recovery model for incomplete Internet traffic data with a fourth-order tensor structure, incorporating spatio-temporal regularization while avoiding the use of the d-th order T-SVD. Based on the d-th order tensor product, we first establish the equivalence between the d-th order tensor nuclear norm and the minimum sum of the squared Frobenius norms of two factor tensors in the unitary transformation domain. This equivalence allows us to dispense with the d-th order T-SVD, significantly reducing the computational complexity of solving the problem. In addition, we integrate the alternating direction method of multipliers (ADMM) to design an efficient and stable algorithm for solving the model. Finally, we validate the proposed approach by simulating scenarios with random and structured missing data on two real-world Internet traffic datasets. Experimental results demonstrate that our method exhibits significant advantages in data recovery performance compared to existing methods.

1 Introduction

Internet traffic data, characterized by distinct spatio-temporal attributes, are crucial for documenting information transmission volumes over specified time periods. They play a vital role in network design and management [1–3]. However, in practice, uncontrollable factors often result in incomplete or corrupted traffic data, hindering their effective use. Therefore, accurately recovering the original data from incomplete traffic data is of significant importance.

Since Zhang et al. [4] introduced the sparse regularized matrix factorization (SRMF) model for data recovery, which incorporates the unique characteristics of Internet traffic data, research on Internet traffic data recovery has continued to advance. As higher-order generalizations of matrices, tensors are more effective at capturing the structural characteristics within the data. Consequently, many researchers have proposed effective methods for Internet traffic data recovery based on different tensor decomposition techniques, such as the CANDECOMP/PARAFAC (CP) decomposition [5, 6], the tensor-train (TT) decomposition [7], and the third-order T-product factorization [8]. Given that low-rankness is a typical characteristic of incomplete data in scenarios with high levels of missing samples, low-rank modeling serves as another effective strategy beyond tensor decomposition. Candès et al. [9] were the pioneers in establishing a low-rank matrix data recovery model. To address its inherent non-convexity and NP-hardness, they employed the nuclear norm of a matrix to convexly relax the model, enabling an effective solution. Unlike the rank of a matrix, which is well-defined, the rank of a tensor depends on the specific decomposition method used.

In 2014, based on the tensor singular value decomposition (T-SVD), Zhang et al. [10] proposed the third-order tensor nuclear norm (TNN), employed it to convexly relax the tubal rank of tensors, and established a model for image restoration. On this basis, Li et al. [11] developed the SRTNN model for the recovery of Internet traffic data by combining spatio-temporal regularization with the nuclear norm of third-order tensors, which demonstrates significant advantages in data recovery effectiveness. Meanwhile, to mitigate the computational burden associated with the SVD, He et al. [12] proposed the equivalence between the so-called transformed tubal nuclear norm of a third-order tensor and the minimum of the sum of two factor tensors' squared Frobenius norms under a general invertible linear transform.

Currently, the recovery of Internet traffic data primarily relies on models built from third-order tensors. While these studies have achieved certain advancements, their effectiveness in restoring data with structural missingness remains limited. Addressing this challenge requires data recovery models to account more thoroughly for the intrinsic structural characteristics of the data. Given the complexity and multidimensionality of Internet traffic data, they can be organized as a fourth-order tensor $\mathcal{X}\in\mathbb{R}^{D\times T\times N\times N}$ (where $N$ signifies the number of network nodes, $T$ the number of daily recording times, and $D$ the number of recording days) to capture the internal structural features of the data more comprehensively. Therefore, exploring a low-rank fourth-order tensor model for Internet traffic data recovery is essential.

Notably, Martin et al. [13] extended the third-order T-product to the d-th order T-product, which is defined recursively but, for computational speed, is implemented directly using the fast Fourier transform. Inspired by this, Qin et al. [14] generalized the multiplication in the discrete Fourier transform (DFT) domain to a tensor product in the domain of general invertible transforms and introduced the d-th order tensor nuclear norm. They employed the d-th order TNN to perform convex relaxation on the fourth-order tensor low-rank model, establishing the HTNN method for recovering visual data. However, optimization problems involving nuclear norms are typically solved via the thresholding operator, which requires performing the SVD on a large number of matrices and significantly increases the computational time [11, 14]. Hence, a natural question arises: how can one build a low-rank recovery model for a d-th order tensor without the need for SVD computation?

In this study, by arranging the observed Internet traffic data as a tensor $\mathcal{M}\in\mathbb{R}^{D\times T\times N\times N}$, we apply unitary transformations on mode-3 and mode-4 to fully integrate the internal data of these modes. Meanwhile, a spatio-temporal regularization term is applied to modes 1 and 2, aiming to preserve the periodicity and similarity of the recovered data as much as possible, thereby enhancing the accuracy of data recovery. We develop a new low-rank model for the recovery of fourth-order Internet traffic data as follows:

$$
\begin{aligned}
\min_{\mathcal{X}}\ \ &\|\mathcal{X}\|_{\star,L}+\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{X})\|_F^2+\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{X})\|_F^2+\gamma\|\mathcal{X}\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}),
\end{aligned}
$$

where $\|\cdot\|_{\star,L}$ denotes the tensor nuclear norm under the unitary transform $L$, and $\mathrm{Mat}_1(\mathcal{X})$ and $\mathrm{Mat}_2(\mathcal{X})$ consist of flow data of consecutive days and time points, respectively. Through the equivalence, established here for the first time, between the d-th order tensor nuclear norm and the minimum sum of two factor tensors' squared Frobenius norms in the unitary transformation domain under the d-th order T-product, we reformulate this model into the following version that does not rely on the d-th order T-SVD:

$$
\begin{aligned}
\min_{\mathcal{X},\mathcal{U},\mathcal{V}}\ \ &\frac12\big(\|\mathcal{U}\|_F^2+\|\mathcal{V}\|_F^2\big)+\frac{\lambda}{2}\|\mathcal{X}-\mathcal{U}*_L\mathcal{V}\|_F^2+\gamma\|\mathcal{X}\|_F^2\\
&+\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{X})\|_F^2+\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{X})\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}).
\end{aligned}
$$

We give a detailed explanation of these two models in Section 3.

The rest of this study is organized as follows. Some notions and preliminaries are listed in Section 2. In Section 3, we establish a low-rank Internet traffic data recovery model without the d-th order tensor SVD and design a computational framework for solving it based on the alternating direction method of multipliers (ADMM). In Section 4, we conduct a convergence analysis of Algorithm 1, showing that it converges to a stationary point. Numerical experiments are presented in Section 5, where we apply the proposed method to two real-world Internet traffic datasets and simulate potential structural missingness scenarios encountered in practice. The experimental results demonstrate that our method exhibits significant advantages in both recovery accuracy and computational efficiency. Finally, we conclude this study in Section 6.

Algorithm 1. Tensor completion algorithm with d-th order tensor product decomposition and spatio-temporal regularization (SRdTPD).

2 Preliminary

In this study, the real number field and the complex number field are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively. We use lowercase letters $a,b,\dots$ to denote scalars, bold lowercase letters $\mathbf{a},\mathbf{b},\dots$ to represent vectors, capital letters $A,B,\dots$ to signify matrices, and calligraphic letters $\mathcal{A},\mathcal{B},\dots$ to denote tensors. For any positive integer $n$, we define the set $[n]:=\{1,2,\dots,n\}$. For $A\in\mathbb{C}^{m\times n}$, $A^H$ denotes the conjugate transpose of $A$ (when $A$ is real, this reduces to the transpose $A^\top$), and its nuclear norm is $\|A\|_{\star}=\sum_{i=1}^{\min\{m,n\}}\sigma_i(A)$, where $\sigma_i$ is the $i$-th singular value of $A$. For a $d$-th order real tensor $\mathcal{A}=(a_{i_1 i_2\cdots i_d})\in\mathbb{R}^{n_1\times n_2\times\cdots\times n_d}$, the Frobenius norm is $\|\mathcal{A}\|_F=\big(\sum_{i_1=1}^{n_1}\cdots\sum_{i_d=1}^{n_d}|a_{i_1\cdots i_d}|^2\big)^{1/2}$. The inner product between $\mathcal{A},\mathcal{B}\in\mathbb{R}^{n_1\times n_2\times\cdots\times n_d}$ is $\langle\mathcal{A},\mathcal{B}\rangle=\sum_{i_1=1}^{n_1}\cdots\sum_{i_d=1}^{n_d}a_{i_1\cdots i_d}b_{i_1\cdots i_d}$. Denote by $\mathcal{A}(i_1,\dots,i_{l-1},:,i_{l+1},\dots,i_d)\in\mathbb{R}^{n_l}$ the column vector formed by fixing all modes except mode-$l$, and by $\mathcal{A}(i_1,\dots,i_{p-1},:,i_{p+1},\dots,i_{q-1},:,i_{q+1},\dots,i_d)\in\mathbb{R}^{n_p\times n_q}$ ($p<q$) the matrix slice along mode-$p$ and mode-$q$. In particular, the matrix slice encompassing the first two modes is designated as $A^{(j)}=\mathcal{A}(:,:,i_3,\dots,i_d)\in\mathbb{R}^{n_1\times n_2}$, where $j=(i_d-1)n_3\cdots n_{d-1}+\cdots+(i_4-1)n_3+i_3$, $i_d\in[n_d]$. Denote

$$
\mathrm{bdiag}(\mathcal{A})=\begin{bmatrix}A^{(1)}&&&\\&A^{(2)}&&\\&&\ddots&\\&&&A^{(J)}\end{bmatrix}\in\mathbb{C}^{(n_1 n_3 n_4\cdots n_d)\times(n_2 n_3 n_4\cdots n_d)},
$$

where $J=n_3\cdots n_d$. We now elucidate the face-wise product for $d$-th order tensors.

Definition 1 (Face-wise product [14]). The face-wise product $\mathcal{A}\triangle\mathcal{B}$ of two $d$-th order tensors $\mathcal{A}\in\mathbb{C}^{n_1\times l\times n_3\times\cdots\times n_d}$ and $\mathcal{B}\in\mathbb{C}^{l\times n_2\times n_3\times\cdots\times n_d}$ is the element of $\mathbb{C}^{n_1\times n_2\times n_3\times\cdots\times n_d}$ defined according to

$$\mathcal{C}=\mathcal{A}\triangle\mathcal{B}\ \Longleftrightarrow\ \mathrm{bdiag}(\mathcal{C})=\mathrm{bdiag}(\mathcal{A})\cdot\mathrm{bdiag}(\mathcal{B}).\qquad(1)$$
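Definition 1 amounts to a slice-by-slice matrix multiplication. A minimal NumPy sketch (our own helper name) illustrates this, using `einsum` so that modes 3 through d are handled by the ellipsis:

```python
import numpy as np

def facewise_product(A, B):
    """Face-wise product of two d-th order tensors (Definition 1):
    each frontal slice of C is the matrix product of the matching
    frontal slices of A and B. Shapes: A is (n1, l, n3, ..., nd),
    B is (l, n2, n3, ..., nd)."""
    # einsum contracts the shared middle index slice by slice;
    # the trailing ellipsis covers modes 3..d.
    return np.einsum('ik...,kj...->ij...', A, B)

# Sanity check against an explicit loop over frontal slices.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4, 5))
B = rng.standard_normal((3, 2, 4, 5))
C = facewise_product(A, B)
for i3 in range(4):
    for i4 in range(5):
        assert np.allclose(C[:, :, i3, i4], A[:, :, i3, i4] @ B[:, :, i3, i4])
```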

For a given set of invertible transformation matrices $\{L_i\in\mathbb{C}^{n_i\times n_i}\}_{i=3}^d$, the linear transformation of a $d$-th order tensor $\mathcal{A}$ under $L$ is defined as $\mathcal{A}_L\triangleq L(\mathcal{A})=\mathcal{A}\times_3 L_3\times_4\cdots\times_d L_d$, where the symbol $\times_k$ represents the mode-$k$ product of a tensor with a matrix defined in [15], and the inverse transformation of $\mathcal{A}$ under $L$ is denoted as $L^{-1}(\mathcal{A})=\mathcal{A}\times_3 L_3^{-1}\times_4\cdots\times_d L_d^{-1}$. Thus, when the corresponding matrices $\{L_i\}_{i=3}^d$ of the invertible linear transform $L$ are unitary, we can obtain

$$\|\mathcal{A}\|_F=\|\mathcal{A}_L\|_F.\qquad(2)$$

Next, we give precise definitions of the multiplication operation and related concepts for a $d$-th order tensor $\mathcal{A}$ within the domain of general invertible transforms.

Definition 2 (d-th order T-product [14]). Let $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times\cdots\times n_d}$ and $\mathcal{B}\in\mathbb{C}^{n_2\times l\times n_3\times\cdots\times n_d}$. Then, the invertible linear transforms $L$ based T-product is defined as

$$\mathcal{C}=\mathcal{A}*_L\mathcal{B}=L^{-1}(\mathcal{A}_L\triangle\mathcal{B}_L).$$
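Definition 2 can be sketched concretely for the special case where $L$ is the (unitary) DFT along every mode from the third onward, which is also the transform chosen in the experiments. This is an illustrative implementation of ours, not the authors' released code:

```python
import numpy as np

def t_product_fft(A, B):
    """d-th order T-product (Definition 2) with L chosen as the
    normalized (unitary) DFT along modes 3..d: transform, multiply
    face-wise, then transform back."""
    d = A.ndim
    axes = tuple(range(2, d))
    AL = np.fft.fftn(A, axes=axes, norm='ortho')
    BL = np.fft.fftn(B, axes=axes, norm='ortho')
    CL = np.einsum('ik...,kj...->ij...', AL, BL)  # face-wise product
    return np.fft.ifftn(CL, axes=axes, norm='ortho')
```

For real inputs the result is real up to floating-point roundoff, since the DFT's conjugate symmetry is preserved by the slice-wise products.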

Definition 3 (d-th order tensor conjugate transpose [14]). The invertible linear transforms $L$ based conjugate transpose of a tensor $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3\times\cdots\times n_d}$ is the tensor $\mathcal{B}\in\mathbb{C}^{n_2\times n_1\times n_3\times\cdots\times n_d}$ whose matrix slices encompassing the first two modes satisfy $\mathcal{B}_L(:,:,i_3,\dots,i_d)=(\mathcal{A}_L(:,:,i_3,\dots,i_d))^H$. We denote by $\mathcal{A}^H$ the conjugate transpose of a tensor $\mathcal{A}$.

Definition 4 (d-th order TNN [14]). Let $\mathcal{A}\in\mathbb{C}^{n_1\times\cdots\times n_d}$. The tensor nuclear norm of $\mathcal{A}$ is defined as

$$\|\mathcal{A}\|_{\star,L}:=\frac{1}{\rho}\|\mathrm{bdiag}(\mathcal{A}_L)\|_{\star}=\frac{1}{\rho}\sum_{i_3=1}^{n_3}\cdots\sum_{i_d=1}^{n_d}\|\mathcal{A}_L(:,:,i_3,\dots,i_d)\|_{\star},$$

where ρ > 0 is a positive constant determined by the invertible linear transforms L.

Remark 1. The constant $\rho$ in the key definition and theorem arises when the corresponding matrices $\{L_i\}_{i=3}^d$ of the invertible linear transform $L$ fulfill the following equation:

$$(L_d\otimes L_{d-1}\otimes\cdots\otimes L_3)\cdot(L_d^H\otimes L_{d-1}^H\otimes\cdots\otimes L_3^H)=(L_d^H\otimes L_{d-1}^H\otimes\cdots\otimes L_3^H)\cdot(L_d\otimes L_{d-1}\otimes\cdots\otimes L_3)=\rho I_{n_3 n_4\cdots n_d},$$

where $\otimes$ is the Kronecker product and $I_{n_3 n_4\cdots n_d}$ is the identity matrix of order $n_3 n_4\cdots n_d$.

Based on Definition 4, we can establish the equivalence between the $d$-th order tensor nuclear norm and the minimum of the sum of two factor tensors' squared Frobenius norms:

Theorem 1. If the corresponding matrices $\{L_i\}_{i=3}^d$ of the invertible linear transform $L$ are unitary, then for $\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times\cdots\times n_d}$, we have

$$\|\mathcal{A}\|_{\star,L}=\min_{\mathcal{X},\mathcal{Y}}\Big\{\frac12\big(\|\mathcal{X}\|_F^2+\|\mathcal{Y}\|_F^2\big):\ \mathcal{A}=\mathcal{X}*_L\mathcal{Y}^H\Big\}.$$

Proof: By Definition 4, we have

$$
\begin{aligned}
\|\mathcal{A}\|_{\star,L}&=\frac{1}{\rho}\sum_{i_3=1}^{n_3}\cdots\sum_{i_d=1}^{n_d}\|\mathcal{A}_L(:,:,i_3,\dots,i_d)\|_{\star}\\
&=\min_{\mathcal{X}_L,\mathcal{Y}_L}\Big\{\frac{1}{2\rho}\sum_{i_3=1}^{n_3}\cdots\sum_{i_d=1}^{n_d}\big(\|\mathcal{X}_L(:,:,i_3,\dots,i_d)\|_F^2+\|\mathcal{Y}_L(:,:,i_3,\dots,i_d)\|_F^2\big):\\
&\qquad\qquad\mathcal{X}_L(:,:,i_3,\dots,i_d)\cdot\big(\mathcal{Y}_L(:,:,i_3,\dots,i_d)\big)^H=\mathcal{A}_L(:,:,i_3,\dots,i_d)\Big\}\\
&=\min_{\mathcal{X},\mathcal{Y}}\Big\{\frac12\big(\|\mathcal{X}\|_F^2+\|\mathcal{Y}\|_F^2\big):\ \mathcal{A}=\mathcal{X}*_L\mathcal{Y}^H\Big\},
\end{aligned}
$$

where the second equality can be deduced from the proofs of Lemma 5.1 and Proposition 2.1 of [16], and the third equality holds because $\{L_i\}_{i=3}^d$ are unitary matrices.
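The equality in Theorem 1 can be sanity-checked numerically in the simplest instance, a single frontal slice (the matrix case, $L$ the identity and $\rho=1$): the balanced factorization built from the SVD attains the bound. A minimal sketch with our own variable names:

```python
import numpy as np

# Matrix instance of Theorem 1: ||A||_* equals
# (1/2)(||X||_F^2 + ||Y||_F^2) at the balanced factorization
# X = U sqrt(S), Y = V sqrt(S) built from the SVD A = U S V^H.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
U, s, Vh = np.linalg.svd(A, full_matrices=False)
X = U * np.sqrt(s)            # scales column i of U by sqrt(s_i)
Y = Vh.conj().T * np.sqrt(s)  # scales column i of V by sqrt(s_i)
nuclear = s.sum()
bound = 0.5 * (np.linalg.norm(X) ** 2 + np.linalg.norm(Y) ** 2)
assert np.allclose(A, X @ Y.conj().T)   # the constraint A = X Y^H holds
assert np.isclose(nuclear, bound)       # the minimum value is attained
```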

3 Model and algorithm

As mentioned in the introduction, Internet traffic data consist of network traffic records measured T times daily from N origin nodes to N destination nodes. Consequently, data collected over D consecutive days can be organized as a fourth-order tensor in ℝD×T×N×N (i.e., n1 = D, n2 = T, n3 = n4 = N), comprehensively capturing the spatio-temporal dynamics of the network traffic.

3.1 Design of model and transformation

Based on the low-rank property of the observed incomplete tensor $\mathcal{M}\in\mathbb{R}^{n_1\times n_2\times n_3\times n_4}$, we construct a data recovery model for missing traffic data within the domain of unitary transformation (i.e., the corresponding matrices $\{L_i\}_{i=3}^4$ in $L$ are unitary):

$$
\begin{aligned}
\min_{\mathcal{X}}\ \ &\|\mathcal{X}\|_{\star,L}+\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{X})\|_F^2+\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{X})\|_F^2+\gamma\|\mathcal{X}\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}),
\end{aligned}\qquad(3)
$$

where $\alpha_1$, $\alpha_2$, and $\gamma$ are positive parameters, $\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3\times n_4}$ is the Internet traffic tensor to be estimated, $\Omega$ is the index set corresponding to the observed entries of $\mathcal{M}$, and $P_{\Omega}(\cdot)$ is the linear operator that keeps the known elements in $\Omega$ while setting the others to zero. Here, $\mathrm{Mat}_1(\mathcal{X})=[\mathcal{X}(:,1,1,1),\dots,\mathcal{X}(:,n_2,1,1),\dots,\mathcal{X}(:,n_2,n_3,1),\dots,\mathcal{X}(:,n_2,n_3,n_4)]\in\mathbb{R}^{n_1\times(n_2 n_3 n_4)}$, where adjacent rows represent flow data measurements from consecutive days, and $\mathrm{Mat}_2(\mathcal{X})=[\mathcal{X}(1,:,1,1),\dots,\mathcal{X}(n_1,:,1,1),\dots,\mathcal{X}(n_1,:,n_3,1),\dots,\mathcal{X}(n_1,:,n_3,n_4)]\in\mathbb{R}^{n_2\times(n_1 n_3 n_4)}$, where adjacent rows represent flow data measurements from consecutive time points. The regularization term $\gamma\|\mathcal{X}\|_F^2$ is employed to ensure the boundedness of the sequence generated by the proposed algorithm. To better capture the periodicity of the data measured on each observation day and the similarity of the data obtained between adjacent observation time points within each day, we choose the temporal constraint matrices $H=\mathrm{Toeplitz}(0,1,-1)$ of size $(n_1-1)\times n_1$ and $K=\mathrm{Toeplitz}(0,1,-1)$ of size $(n_2-1)\times n_2$, respectively, i.e.,

$$
H=\begin{bmatrix}1&-1&0&\cdots&0&0\\0&1&-1&\cdots&0&0\\\vdots&&\ddots&\ddots&&\vdots\\0&0&\cdots&0&1&-1\end{bmatrix}\in\mathbb{R}^{(n_1-1)\times n_1},\qquad
K=\begin{bmatrix}1&-1&0&\cdots&0&0\\0&1&-1&\cdots&0&0\\\vdots&&\ddots&\ddots&&\vdots\\0&0&\cdots&0&1&-1\end{bmatrix}\in\mathbb{R}^{(n_2-1)\times n_2}.
$$
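Both temporal constraint matrices are first-difference operators and can be built in a few lines (a sketch; the helper name is ours):

```python
import numpy as np

def temporal_constraint(n):
    """Build the (n-1) x n Toeplitz matrix with 1 on the diagonal and
    -1 on the superdiagonal, as used for H and K in model (3)."""
    T = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    T[idx, idx] = 1.0
    T[idx, idx + 1] = -1.0
    return T

H = temporal_constraint(7)    # n1 = D = 7 days
K = temporal_constraint(96)   # n2 = T = 96 daily measurements (GEANT)
# Each row of H @ Mat1(X) is a difference between consecutive days, so
# the penalty ||H Mat1(X)||_F^2 promotes day-to-day smoothness.
```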

It is well-known that solving model (3) requires the SVD to compute the $\|\cdot\|_{\star,L}$ minimization subproblem, which takes considerable computing time for large-scale tensors. Therefore, with the help of Theorem 1, we reformulate model (3) as the following tensor factorization version:

$$
\begin{aligned}
\min_{\mathcal{X},\mathcal{U},\mathcal{V}}\ \ &\frac12\big(\|\mathcal{U}\|_F^2+\|\mathcal{V}\|_F^2\big)+\frac{\lambda}{2}\|\mathcal{X}-\mathcal{U}*_L\mathcal{V}\|_F^2+\gamma\|\mathcal{X}\|_F^2\\
&+\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{X})\|_F^2+\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{X})\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}),
\end{aligned}\qquad(4)
$$

where $\mathcal{U}\in\mathbb{R}^{n_1\times s\times n_3\times n_4}$ and $\mathcal{V}\in\mathbb{R}^{s\times n_2\times n_3\times n_4}$. Clearly, such an optimization model no longer requires the SVD to find its solution; we therefore call model (4) an SVD-free model. Moreover, to facilitate its solution, we introduce auxiliary variables $\mathcal{Y},\mathcal{Z}\in\mathbb{R}^{n_1\times n_2\times n_3\times n_4}$, so that model (4) can be equivalently written as

$$
\begin{aligned}
\min_{\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z}}\ \ f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})=&\ \frac12\big(\|\mathcal{U}\|_F^2+\|\mathcal{V}\|_F^2\big)+\frac{\lambda}{2}\|\mathcal{X}-\mathcal{U}*_L\mathcal{V}\|_F^2\\
&+\frac{\gamma}{2}\|\mathcal{Y}\|_F^2+\frac{\gamma}{2}\|\mathcal{Z}\|_F^2+\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{Y})\|_F^2+\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{Z})\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}),\quad \mathcal{Y}=\mathcal{X},\quad \mathcal{Z}=\mathcal{X}.
\end{aligned}\qquad(5)
$$

3.2 Description of algorithm

We apply the alternating direction method of multipliers (ADMM) to solve model (5), whose augmented Lagrangian function is written as follows:

$$\mathcal{L}(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z},\mathcal{W}_1,\mathcal{W}_2)=f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})+\langle\mathcal{W}_1,\mathcal{X}-\mathcal{Y}\rangle+\frac{\beta_1}{2}\|\mathcal{X}-\mathcal{Y}\|_F^2+\langle\mathcal{W}_2,\mathcal{X}-\mathcal{Z}\rangle+\frac{\beta_2}{2}\|\mathcal{X}-\mathcal{Z}\|_F^2,\qquad(6)$$

where $\mathcal{W}_1,\mathcal{W}_2\in\mathbb{R}^{n_1\times n_2\times n_3\times n_4}$ are the Lagrangian multipliers, and $\beta_1$ and $\beta_2$ are positive penalty parameters. Below, given the $q$-th iterate $\Theta^q:=(\mathcal{X}^q,\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)$, we provide specific solutions for the updates of each variable.

▸ The X-subproblem

We update $\mathcal{X}$ by solving the following subproblem:

$$
\begin{aligned}
\min_{\mathcal{X}}\ \ &\frac{\lambda}{2}\|\mathcal{X}-\mathcal{U}^q*_L\mathcal{V}^q\|_F^2+\langle\mathcal{W}_1^q,\mathcal{X}-\mathcal{Y}^q\rangle+\frac{\beta_1}{2}\|\mathcal{X}-\mathcal{Y}^q\|_F^2+\langle\mathcal{W}_2^q,\mathcal{X}-\mathcal{Z}^q\rangle+\frac{\beta_2}{2}\|\mathcal{X}-\mathcal{Z}^q\|_F^2\\
\text{s.t.}\ \ &P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}).
\end{aligned}\qquad(7)
$$

The solution of problem (7) is given by

$$
\mathcal{X}^{q+1}_{ijkl}=\begin{cases}\dfrac{1}{\lambda+\beta_1+\beta_2}\big(\lambda\,\mathcal{U}^q*_L\mathcal{V}^q+\beta_1\mathcal{Y}^q+\beta_2\mathcal{Z}^q-\mathcal{W}_1^q-\mathcal{W}_2^q\big)_{ijkl},&(i,j,k,l)\notin\Omega,\\[4pt]\mathcal{M}_{ijkl},&\text{otherwise}.\end{cases}\qquad(8)
$$
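The closed-form X-update (8) is elementwise and can be sketched as follows (our own function name; `UV` stands for the precomputed product $\mathcal{U}^q*_L\mathcal{V}^q$ and `mask` is the boolean indicator of $\Omega$):

```python
import numpy as np

def update_X(UV, Y, Z, W1, W2, M, mask, lam, beta1, beta2):
    """X-update (8): a weighted average of the data-fit and splitting
    terms on unobserved entries; observed entries are reset to M."""
    X = (lam * UV + beta1 * Y + beta2 * Z - W1 - W2) / (lam + beta1 + beta2)
    X[mask] = M[mask]  # enforce the constraint P_Omega(X) = P_Omega(M)
    return X
```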

▸ The U-subproblem

We update U by

$$\mathcal{U}^{q+1}=\arg\min_{\mathcal{U}}\Big\{\frac12\|\mathcal{U}\|_F^2+\frac{\lambda}{2}\|\mathcal{X}^{q+1}-\mathcal{U}*_L\mathcal{V}^q\|_F^2\Big\}.\qquad(9)$$

Because the corresponding matrices $\{L_i\}_{i=3}^4$ in $L$ are unitary, it holds that

$$
\frac12\|\mathcal{U}\|_F^2+\frac{\lambda}{2}\|\mathcal{X}^{q+1}-\mathcal{U}*_L\mathcal{V}^q\|_F^2=\frac12\|\mathcal{U}_L\|_F^2+\frac{\lambda}{2}\|\mathcal{X}_L^{q+1}-\mathcal{U}_L\triangle\mathcal{V}_L^q\|_F^2=\frac12\sum_{j=1}^{n_3n_4}\Big(\|U_L^{(j)}\|_F^2+\lambda\big\|(X_L^{(j)})^{q+1}-U_L^{(j)}(V_L^{(j)})^q\big\|_F^2\Big).\qquad(10)
$$

Due to the separable structure of Equation (10) with respect to the decision variables $U_L^{(j)}$ for $j\in[n_3n_4]$, solving problem (9) is equivalent to solving

$$\min_{U_L^{(j)}}\ \frac12\|U_L^{(j)}\|_F^2+\frac{\lambda}{2}\big\|(X_L^{(j)})^{q+1}-U_L^{(j)}(V_L^{(j)})^q\big\|_F^2,\quad j\in[n_3n_4],\qquad(11)$$

whose solution is given by

$$(U_L^{(j)})^{q+1}=\lambda(X_L^{(j)})^{q+1}\big((V_L^{(j)})^{q}\big)^H\Big(\lambda(V_L^{(j)})^{q}\big((V_L^{(j)})^{q}\big)^H+I\Big)^{-1}.$$

With the help of the obtained (ULj)q+1 for j∈[n3n4], we can get Uq+1 by

$$\mathcal{U}^{q+1}=L^{-1}(\mathcal{U}_L^{q+1}).\qquad(12)$$

▸ The V-subproblem

We update V by

$$\mathcal{V}^{q+1}=\arg\min_{\mathcal{V}}\Big\{\frac12\|\mathcal{V}\|_F^2+\frac{\lambda}{2}\|\mathcal{X}^{q+1}-\mathcal{U}^{q+1}*_L\mathcal{V}\|_F^2\Big\}.\qquad(13)$$

Similarly to the U-subproblem, solving Equation (13) is equivalent to solving the following problem:

$$\min_{V_L^{(j)}}\ \frac12\|V_L^{(j)}\|_F^2+\frac{\lambda}{2}\big\|(X_L^{(j)})^{q+1}-(U_L^{(j)})^{q+1}V_L^{(j)}\big\|_F^2,\qquad(14)$$

for every $j\in[n_3n_4]$. Clearly, the solution of problem (14) is given by

$$(V_L^{(j)})^{q+1}=\Big(\lambda\big((U_L^{(j)})^{q+1}\big)^H(U_L^{(j)})^{q+1}+I\Big)^{-1}\Big(\lambda\big((U_L^{(j)})^{q+1}\big)^H(X_L^{(j)})^{q+1}\Big),$$

for every j∈[n3n4]. Thus, Vq+1 can be obtained via

$$\mathcal{V}^{q+1}=L^{-1}(\mathcal{V}_L^{q+1}).\qquad(15)$$

▸ The Y-subproblem

We update Y by

$$\mathcal{Y}^{q+1}=\arg\min_{\mathcal{Y}}\Big\{\frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{Y})\|_F^2+\langle\mathcal{W}_1^q,\mathcal{X}^{q+1}-\mathcal{Y}\rangle+\frac{\beta_1}{2}\|\mathcal{X}^{q+1}-\mathcal{Y}\|_F^2+\frac{\gamma}{2}\|\mathcal{Y}\|_F^2\Big\},\qquad(16)$$

which is equivalent to solving

$$\min_{\mathrm{Mat}_1(\mathcal{Y})}\ \frac{\alpha_1}{2}\|H\cdot \mathrm{Mat}_1(\mathcal{Y})\|_F^2+\frac{\beta_1}{2}\Big\|\mathrm{Mat}_1(\mathcal{Y})-\Big(\mathrm{Mat}_1(\mathcal{X}^{q+1})+\frac{1}{\beta_1}\mathrm{Mat}_1(\mathcal{W}_1^q)\Big)\Big\|_F^2+\frac{\gamma}{2}\|\mathrm{Mat}_1(\mathcal{Y})\|_F^2.\qquad(17)$$

By the optimality condition of problem (17), we have

$$\alpha_1 H^{\top}H\,\mathrm{Mat}_1(\mathcal{Y}^{q+1})+\beta_1\Big(\mathrm{Mat}_1(\mathcal{Y}^{q+1})-\Big(\mathrm{Mat}_1(\mathcal{X}^{q+1})+\frac{1}{\beta_1}\mathrm{Mat}_1(\mathcal{W}_1^q)\Big)\Big)+\gamma\,\mathrm{Mat}_1(\mathcal{Y}^{q+1})=0,\qquad(18)$$

which implies that the optimal solution of problem (17) is

$$\mathrm{Mat}_1(\mathcal{Y}^{q+1})=\big(\alpha_1 H^{\top}H+(\beta_1+\gamma)I\big)^{-1}\big(\beta_1\mathrm{Mat}_1(\mathcal{X}^{q+1})+\mathrm{Mat}_1(\mathcal{W}_1^q)\big).\qquad(19)$$

As a result, the optimal solution of Equation 16 is

$$\mathcal{Y}^{q+1}=\mathrm{IMat}_1\Big(\big(\alpha_1 H^{\top}H+(\beta_1+\gamma)I\big)^{-1}\big(\beta_1\mathrm{Mat}_1(\mathcal{X}^{q+1})+\mathrm{Mat}_1(\mathcal{W}_1^q)\big)\Big),\qquad(20)$$

where IMat1(·) is the inverse of Mat1(·).

▸ The Z-subproblem

We update Z by

$$\mathcal{Z}^{q+1}=\arg\min_{\mathcal{Z}}\Big\{\frac{\alpha_2}{2}\|K\cdot \mathrm{Mat}_2(\mathcal{Z})\|_F^2+\langle\mathcal{W}_2^q,\mathcal{X}^{q+1}-\mathcal{Z}\rangle+\frac{\beta_2}{2}\|\mathcal{X}^{q+1}-\mathcal{Z}\|_F^2+\frac{\gamma}{2}\|\mathcal{Z}\|_F^2\Big\}.\qquad(21)$$

Similarly to the Y-subproblem, the optimal solution of Equation (21) is

$$\mathcal{Z}^{q+1}=\mathrm{IMat}_2\Big(\big(\alpha_2 K^{\top}K+(\beta_2+\gamma)I\big)^{-1}\big(\beta_2\mathrm{Mat}_2(\mathcal{X}^{q+1})+\mathrm{Mat}_2(\mathcal{W}_2^q)\big)\Big),\qquad(22)$$

where IMat2(·) is the inverse of Mat2(·).

▸ The W1 and W2-subproblems

We update W1 and W2 by

$$\mathcal{W}_1^{q+1}=\mathcal{W}_1^q+\beta_1(\mathcal{X}^{q+1}-\mathcal{Y}^{q+1}),\qquad \mathcal{W}_2^{q+1}=\mathcal{W}_2^q+\beta_2(\mathcal{X}^{q+1}-\mathcal{Z}^{q+1}).\qquad(23)$$

Following the above analysis, the algorithmic framework for solving model (5) is presented in Algorithm 1.

Remark 2. In the U- and V-subproblems, the inverses of the two related matrices are computed directly. In the Y- and Z-subproblems, we first exploit the special structure of $H$ and $K$ to perform Cholesky decompositions of the two matrices $\alpha_1 H^{\top}H+(\beta_1+\gamma)I$ and $\alpha_2 K^{\top}K+(\beta_2+\gamma)I$, decomposing each into the product of two triangular matrices, and then obtain their inverses. This significantly reduces the computational cost. In addition, since these two inverses remain constant across iterations, we compute them only once and simply reuse the cached factors in every subsequent Y- and Z-subproblem.
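A minimal sketch of the caching strategy described in Remark 2 (the function name and interface are our own illustration): the Cholesky factor is computed once, and every Y-update (19) then reduces to two triangular-system solves.

```python
import numpy as np

def make_Y_solver(H, alpha1, beta1, gamma):
    """Precompute the Cholesky factor of
    A = alpha1 * H^T H + (beta1 + gamma) * I once; each Y-update (19)
    then applies A^{-1} via two solves with the cached factor instead
    of a fresh inversion."""
    n = H.shape[1]
    A = alpha1 * (H.T @ H) + (beta1 + gamma) * np.eye(n)
    L = np.linalg.cholesky(A)  # A = L L^T, L lower triangular

    def solve(B):
        # Returns A^{-1} B: forward solve with L, then back solve with L^T.
        return np.linalg.solve(L.T, np.linalg.solve(L, B))

    return solve

# Usage inside the Y-update:
#   solve_Y = make_Y_solver(H, alpha1, beta1, gamma)   # once
#   Mat1_Y = solve_Y(beta1 * Mat1_X + Mat1_W1)         # every iteration
```

(`np.linalg.solve` does not itself exploit the triangular structure; a dedicated triangular solver would reduce the cost further.)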

4 Convergence analysis

In this section, we investigate the convergence properties of Algorithm 1. Although the convergence of the ADMM algorithm for solving general multi-block optimization problems cannot be guaranteed [17], we can still exploit the special structure of model (5) to obtain the convergence of Algorithm 1; the proofs are similar to those in Section 3.2 of [18]. We first present the Karush-Kuhn-Tucker (KKT) conditions for model (5):

$$
\begin{aligned}
&\nabla_{\mathcal{X}_{\Omega^c}} f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})+(\mathcal{W}_1)_{\Omega^c}+(\mathcal{W}_2)_{\Omega^c}=\mathcal{O}_{\Omega^c}, &(24)\\
&\nabla_{\mathcal{U}} f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})=\mathcal{O}, &(25)\\
&\nabla_{\mathcal{V}} f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})=\mathcal{O}, &(26)\\
&\nabla_{\mathcal{Y}} f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})-\mathcal{W}_1=\mathcal{O}, &(27)\\
&\nabla_{\mathcal{Z}} f(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z})-\mathcal{W}_2=\mathcal{O}, &(28)\\
&P_{\Omega}(\mathcal{X})=P_{\Omega}(\mathcal{M}), &(29)\\
&\mathcal{Y}=\mathcal{X}, &(30)\\
&\mathcal{Z}=\mathcal{X}, &(31)
\end{aligned}
$$

where $\Omega^c$ represents the complement of the set $\Omega$. Moreover, we call $(\mathcal{X},\mathcal{U},\mathcal{V},\mathcal{Y},\mathcal{Z},\mathcal{W}_1,\mathcal{W}_2)$ satisfying (24)–(31) a stationary point of model (5), where $\mathcal{W}_1$ and $\mathcal{W}_2$ are the Lagrange multipliers associated with the constraints $\mathcal{Y}=\mathcal{X}$ and $\mathcal{Z}=\mathcal{X}$, respectively. For any integer $q\ge0$, denote $\Theta^q=(\mathcal{X}^q,\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)$.

Proposition 1. Let $\{\Theta^q\}_{q=0}^{\infty}$ be the sequence generated by Algorithm 1. Then, we have

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}^{q+1}-\mathcal{X}^q\|_F^2\le\mathcal{L}(\mathcal{X}^q,\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q).\qquad(32)$$

Proof: We consider three cases: (a) $\Omega=\emptyset$, (b) $\emptyset\subsetneq\Omega\subsetneq[n_1]\times[n_2]\times[n_3]\times[n_4]$, and (c) $\Omega=[n_1]\times[n_2]\times[n_3]\times[n_4]$. In case (a), the X-subproblem (7) becomes an unconstrained optimization problem, which is equivalent to $\min_{\mathcal{X}} f_1(\mathcal{X})$, where

$$f_1(\mathcal{X})=\frac{\lambda}{2}\|\mathcal{X}-\mathcal{C}^q\|_F^2+\frac{\beta_1}{2}\|\mathcal{X}-\mathcal{D}^q\|_F^2+\frac{\beta_2}{2}\|\mathcal{X}-\mathcal{E}^q\|_F^2\qquad(33)$$

with $\mathcal{C}^q=\mathcal{U}^q*_L\mathcal{V}^q$, $\mathcal{D}^q=\mathcal{Y}^q-(1/\beta_1)\mathcal{W}_1^q$, and $\mathcal{E}^q=\mathcal{Z}^q-(1/\beta_2)\mathcal{W}_2^q$. It is obvious that $f_1$ is strongly convex with modulus $\lambda+\beta_1+\beta_2$. Consequently, by Theorem 5.24 in [19], it holds that

$$f_1(\mathcal{X})\ge f_1(\mathcal{X}')+\langle\nabla f_1(\mathcal{X}'),\mathcal{X}-\mathcal{X}'\rangle+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}-\mathcal{X}'\|_F^2\qquad(34)$$

for any $\mathcal{X},\mathcal{X}'\in\mathbb{R}^{n_1\times n_2\times n_3\times n_4}$. Since $\mathcal{X}^{q+1}$ is the optimal solution of $\min_{\mathcal{X}} f_1(\mathcal{X})$, which means $\nabla f_1(\mathcal{X}^{q+1})=\mathcal{O}$, from inequality (34) it holds that

$$f_1(\mathcal{X}^q)\ge f_1(\mathcal{X}^{q+1})+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}^q-\mathcal{X}^{q+1}\|_F^2,\qquad(35)$$

which, together with the definition of $\mathcal{L}$, implies that inequality (32) holds.

In case (b), the simple equality constraints in problem (7) can be eliminated by substituting them into its objective function, and the corresponding X-subproblem is converted into an unconstrained optimization problem whose objective function has the same structure as $f_1$, but with $\mathcal{X}$, $\mathcal{C}^q$, $\mathcal{D}^q$, and $\mathcal{E}^q$ replaced by $\mathcal{X}_{\Omega^c}$, $\mathcal{C}^q_{\Omega^c}$, $\mathcal{D}^q_{\Omega^c}$, and $\mathcal{E}^q_{\Omega^c}$, respectively. Noticing $\mathcal{X}^{q+1}_{\Omega}=\mathcal{M}_{\Omega}$ and arguing as in case (a), we know that inequality (32) still holds. In case (c), since $\mathcal{X}^{q+1}=\mathcal{X}^q=\mathcal{M}$, inequality (32) is obvious. We complete the proof.

Proposition 2. Let $\{\Theta^q\}_{q=0}^{\infty}$ be the sequence generated by Algorithm 1. Then, we have

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac12\|\mathcal{U}^{q+1}-\mathcal{U}^q\|_F^2\le\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)\qquad(36)$$

and

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac12\|\mathcal{V}^{q+1}-\mathcal{V}^q\|_F^2\le\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q).\qquad(37)$$

Proof: For every $j\in[n_3n_4]$ and integer $q\ge0$, we define the function $f_j^q:\mathbb{C}^{n_1\times s}\to\mathbb{R}$ by

$$f_j^q(U_L^{(j)})=\frac12\|U_L^{(j)}\|_F^2+\frac{\lambda}{2}\big\|(X_L^{(j)})^{q+1}-U_L^{(j)}(V_L^{(j)})^q\big\|_F^2.$$

It is obvious that fjq is strongly convex with modulus at least 1. Similar to the proof for (a) of Proposition 1, we have

$$f_j^q\big((U_L^{(j)})^{q+1}\big)+\frac12\big\|(U_L^{(j)})^{q+1}-(U_L^{(j)})^q\big\|_F^2\le f_j^q\big((U_L^{(j)})^q\big),$$

which, together with Equation (2), implies that

$$\frac12\|\mathcal{U}^{q+1}\|_F^2+\frac{\lambda}{2}\|\mathcal{X}^{q+1}-\mathcal{U}^{q+1}*_L\mathcal{V}^q\|_F^2+\frac12\|\mathcal{U}^{q+1}-\mathcal{U}^q\|_F^2\le\frac12\|\mathcal{U}^q\|_F^2+\frac{\lambda}{2}\|\mathcal{X}^{q+1}-\mathcal{U}^q*_L\mathcal{V}^q\|_F^2.$$

Moreover, by the definition of $\mathcal{L}$, we know that inequality (36) holds. Inequality (37) can be proved similarly.

Employing a proof analogous to that of Proposition 1, we can derive the following proposition.

Proposition 3. Let $\{\Theta^q\}_{q=0}^{\infty}$ be the sequence generated by Algorithm 1. Then, we have

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^{q+1},\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac{\beta_1+\gamma}{2}\|\mathcal{Y}^{q+1}-\mathcal{Y}^q\|_F^2\le\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^q,\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q),\qquad(38)$$

and

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^{q+1},\mathcal{Z}^{q+1},\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac{\beta_2+\gamma}{2}\|\mathcal{Z}^{q+1}-\mathcal{Z}^q\|_F^2\le\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^{q+1},\mathcal{Z}^q,\mathcal{W}_1^q,\mathcal{W}_2^q).\qquad(39)$$

Denote

$$\xi_1=\beta_1(\beta_1+\gamma)-2\big(\alpha_1\sigma_1(H^{\top}H)+\gamma\big)^2\quad\text{and}\quad\xi_2=\beta_2(\beta_2+\gamma)-2\big(\alpha_2\sigma_1(K^{\top}K)+\gamma\big)^2.$$

From Propositions 1–3, we have the following theorem, which characterizes the sufficient decrease property and the boundedness of the sequence $\{\Theta^q\}_{q=0}^{\infty}$.

Theorem 2. Let $\{\Theta^q\}_{q=0}^{\infty}$ be the sequence generated by Algorithm 1. If $\xi_1>0$ and $\xi_2>0$, then the sequence satisfies

$$\mathcal{L}(\Theta^{q+1})+\mu\|\Lambda^{q+1}-\Lambda^q\|_F^2\le\mathcal{L}(\Theta^q),\qquad(40)$$

where $\Lambda^q=(\mathcal{X}^q,\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q)$ and $\mu=\frac12\min\{1,\lambda+\beta_1+\beta_2,\xi_1/\beta_1,\xi_2/\beta_2\}$. Furthermore, if $\eta_1:=\beta_1\gamma-(\alpha_1\sigma_1(H^{\top}H)+\gamma)^2>0$ and $\eta_2:=\beta_2\gamma-(\alpha_2\sigma_1(K^{\top}K)+\gamma)^2>0$, then the sequence $\{\Theta^q\}_{q=0}^{\infty}$ is bounded.

Proof: From Equations (6) and (23), we have

$$\mathcal{L}(\Theta^{q+1})=\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^{q+1},\mathcal{Z}^{q+1},\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac{1}{\beta_1}\|\mathcal{W}_1^{q+1}-\mathcal{W}_1^q\|_F^2+\frac{1}{\beta_2}\|\mathcal{W}_2^{q+1}-\mathcal{W}_2^q\|_F^2.\qquad(41)$$

Moreover, from inequalities (32) and (36)–(39), we have

$$\mathcal{L}(\mathcal{X}^{q+1},\mathcal{U}^{q+1},\mathcal{V}^{q+1},\mathcal{Y}^{q+1},\mathcal{Z}^{q+1},\mathcal{W}_1^q,\mathcal{W}_2^q)+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}^{q+1}-\mathcal{X}^q\|_F^2+\frac12\|\mathcal{U}^{q+1}-\mathcal{U}^q\|_F^2+\frac12\|\mathcal{V}^{q+1}-\mathcal{V}^q\|_F^2+\frac{\beta_1+\gamma}{2}\|\mathcal{Y}^{q+1}-\mathcal{Y}^q\|_F^2+\frac{\beta_2+\gamma}{2}\|\mathcal{Z}^{q+1}-\mathcal{Z}^q\|_F^2\le\mathcal{L}(\Theta^q).\qquad(42)$$

Consequently, by Equation (41) and inequality (42), it holds that

$$\mathcal{L}(\Theta^{q+1})+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}^{q+1}-\mathcal{X}^q\|_F^2+\frac12\|\mathcal{U}^{q+1}-\mathcal{U}^q\|_F^2+\frac12\|\mathcal{V}^{q+1}-\mathcal{V}^q\|_F^2+\frac{\beta_1+\gamma}{2}\|\mathcal{Y}^{q+1}-\mathcal{Y}^q\|_F^2+\frac{\beta_2+\gamma}{2}\|\mathcal{Z}^{q+1}-\mathcal{Z}^q\|_F^2\le\mathcal{L}(\Theta^q)+\frac{1}{\beta_1}\|\mathcal{W}_1^{q+1}-\mathcal{W}_1^q\|_F^2+\frac{1}{\beta_2}\|\mathcal{W}_2^{q+1}-\mathcal{W}_2^q\|_F^2.\qquad(43)$$

On the other hand, it holds that

$$\mathrm{Mat}_1(\mathcal{W}_1^{q+1})=(\alpha_1 H^{\top}H+\gamma I)\,\mathrm{Mat}_1(\mathcal{Y}^{q+1})\qquad(44)$$

for every $q\ge0$. Consequently, by Equation (44), we have

$$\|\mathcal{W}_1^{q+1}-\mathcal{W}_1^q\|_F^2=\|\mathrm{Mat}_1(\mathcal{W}_1^{q+1})-\mathrm{Mat}_1(\mathcal{W}_1^q)\|_F^2\le\big(\alpha_1\sigma_1(H^{\top}H)+\gamma\big)^2\|\mathrm{Mat}_1(\mathcal{Y}^{q+1})-\mathrm{Mat}_1(\mathcal{Y}^q)\|_F^2=\big(\alpha_1\sigma_1(H^{\top}H)+\gamma\big)^2\|\mathcal{Y}^{q+1}-\mathcal{Y}^q\|_F^2.\qquad(45)$$

Similarly, we have

$$\mathrm{Mat}_2(\mathcal{W}_2^{q+1})=(\alpha_2 K^{\top}K+\gamma I)\,\mathrm{Mat}_2(\mathcal{Z}^{q+1})\qquad(46)$$

for every q≥0, which implies

$$\|\mathcal{W}_2^{q+1}-\mathcal{W}_2^q\|_F^2\le\big(\alpha_2\sigma_1(K^{\top}K)+\gamma\big)^2\|\mathcal{Z}^{q+1}-\mathcal{Z}^q\|_F^2.\qquad(47)$$

Combining inequality (43) with (45) and (47), we obtain

$$\mathcal{L}(\Theta^{q+1})+\frac{\lambda+\beta_1+\beta_2}{2}\|\mathcal{X}^{q+1}-\mathcal{X}^q\|_F^2+\frac12\|\mathcal{U}^{q+1}-\mathcal{U}^q\|_F^2+\frac12\|\mathcal{V}^{q+1}-\mathcal{V}^q\|_F^2+\frac{\xi_1}{2\beta_1}\|\mathcal{Y}^{q+1}-\mathcal{Y}^q\|_F^2+\frac{\xi_2}{2\beta_2}\|\mathcal{Z}^{q+1}-\mathcal{Z}^q\|_F^2\le\mathcal{L}(\Theta^q).\qquad(48)$$

Finally, by the definition of $\mu$, we know that inequality (40) holds.

Now, we prove the boundedness of $\{\Theta^q\}_{q=0}^{\infty}$. Because $\|\mathcal{W}_1^q\|_F^2\le(\alpha_1\sigma_1(H^{\top}H)+\gamma)^2\|\mathcal{Y}^q\|_F^2$ and $\|\mathcal{W}_2^q\|_F^2\le(\alpha_2\sigma_1(K^{\top}K)+\gamma)^2\|\mathcal{Z}^q\|_F^2$ for any $q\ge0$, by Equation (6) we can obtain

$$\mathcal{L}(\Theta^q)\ge f(\mathcal{X}^q,\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q)+\frac{\beta_1}{2}\Big\|\mathcal{X}^q-\mathcal{Y}^q+\frac{1}{\beta_1}\mathcal{W}_1^q\Big\|_F^2-\frac{(\alpha_1\sigma_1(H^{\top}H)+\gamma)^2}{2\beta_1}\|\mathcal{Y}^q\|_F^2+\frac{\beta_2}{2}\Big\|\mathcal{X}^q-\mathcal{Z}^q+\frac{1}{\beta_2}\mathcal{W}_2^q\Big\|_F^2-\frac{(\alpha_2\sigma_1(K^{\top}K)+\gamma)^2}{2\beta_2}\|\mathcal{Z}^q\|_F^2\qquad(49)$$

for every $q\ge0$. According to inequality (40), for any $q\ge0$, it is clear that

$$\mathcal{L}(\Theta^q)+\mu\sum_{s=0}^{q-1}\|\Lambda^{s+1}-\Lambda^s\|_F^2\le\mathcal{L}(\Theta^0).\qquad(50)$$

Therefore, by the definition of $f$, together with Equation (6) and inequalities (49) and (50), it holds that for any $q\ge0$,

$$\|\mathcal{U}^q\|_F^2\le2\mathcal{L}(\Theta^0),\quad\|\mathcal{V}^q\|_F^2\le2\mathcal{L}(\Theta^0),\quad\eta_1\|\mathcal{Y}^q\|_F^2\le2\beta_1\mathcal{L}(\Theta^0)\quad\text{and}\quad\eta_2\|\mathcal{Z}^q\|_F^2\le2\beta_2\mathcal{L}(\Theta^0).\qquad(51)$$

Since $\eta_1>0$ and $\eta_2>0$, we know that $\{(\mathcal{U}^q,\mathcal{V}^q,\mathcal{Y}^q,\mathcal{Z}^q)\}_{q=0}^{\infty}$ is bounded, which implies the boundedness of $\{(\mathcal{W}_1^q,\mathcal{W}_2^q)\}_{q=0}^{\infty}$ by Equations (44) and (46). In addition, since $\beta_1\|\mathcal{X}^q-\mathcal{Y}^q+(1/\beta_1)\mathcal{W}_1^q\|_F^2\le2\mathcal{L}(\Theta^0)$ for any $q\ge0$, it follows from inequality (49) that $\{\mathcal{X}^q\}_{q=0}^{\infty}$ is also bounded. Based on the comprehensive analysis presented above, we have completed the proof.

Remark 3. Given the predefined dimensions of the matrices $H$ and $K$, the largest singular values of $H^{\top}H$ and $K^{\top}K$ can be computed in advance. Consequently, we only need to select parameters that satisfy the hypotheses of Theorem 2.
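Following Remark 3, the hypotheses of Theorem 2 can be verified numerically before running the algorithm. A minimal sketch (the helper name and interface are our own, mirroring our reading of the quantities $\xi_1,\xi_2,\eta_1,\eta_2$ defined above):

```python
import numpy as np

def check_convergence_conditions(H, K, alpha1, alpha2, beta1, beta2, gamma):
    """Check the hypotheses of Theorem 2: xi_1, xi_2 > 0 for the
    sufficient-decrease property and eta_1, eta_2 > 0 for boundedness.
    sigma_1 denotes the largest singular value of H^T H (resp. K^T K)."""
    s1H = np.linalg.norm(H.T @ H, 2)  # spectral norm = sigma_1(H^T H)
    s1K = np.linalg.norm(K.T @ K, 2)
    xi1 = beta1 * (beta1 + gamma) - 2 * (alpha1 * s1H + gamma) ** 2
    xi2 = beta2 * (beta2 + gamma) - 2 * (alpha2 * s1K + gamma) ** 2
    eta1 = beta1 * gamma - (alpha1 * s1H + gamma) ** 2
    eta2 = beta2 * gamma - (alpha2 * s1K + gamma) ** 2
    return xi1 > 0 and xi2 > 0 and eta1 > 0 and eta2 > 0
```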

Theorem 3. Let $\{\Theta^q\}_{q=0}^{\infty}$ be the sequence generated by Algorithm 1. If the parameters satisfy the conditions in Theorem 2, then any cluster point $\Theta^*$ of $\{\Theta^q\}_{q=0}^{\infty}$ is a stationary point of model (5).

Proof: According to Theorem 2, the sequence $\{\Theta^q\}_{q=0}^{\infty}$ is bounded. Let $\{\Theta^{q_i}\}_{i=0}^{\infty}$ be a convergent subsequence with $\lim_{i\to\infty}\Theta^{q_i}=\Theta^*:=(\mathcal{X}^*,\mathcal{U}^*,\mathcal{V}^*,\mathcal{Y}^*,\mathcal{Z}^*,\mathcal{W}_1^*,\mathcal{W}_2^*)$. First, from inequality (50), we have

$$\mu\sum_{q=0}^{q_i}\|\Lambda^{q+1}-\Lambda^q\|_F^2\le\mathcal{L}(\Theta^0)-\mathcal{L}(\Theta^{q_i}).\qquad(52)$$

Taking the limit of both sides of Equation (52) as $i\to\infty$, it holds that

$$\mu\sum_{q=0}^{\infty}\|\Lambda^{q+1}-\Lambda^q\|_F^2\le\mathcal{L}(\Theta^0)-\mathcal{L}(\Theta^*),\qquad(53)$$

which means $\lim_{q\to\infty}\|\Lambda^{q+1}-\Lambda^q\|_F=0$. Moreover, from the optimality conditions of the X- through Z-subproblems, as well as the updates of $\mathcal{W}_1$ and $\mathcal{W}_2$, it is easy to see that

$$
\begin{cases}
\nabla_{\mathcal{X}_{\Omega^c}} f(\mathcal{X}^{q_i+1},\mathcal{U}^{q_i},\mathcal{V}^{q_i},\mathcal{Y}^{q_i},\mathcal{Z}^{q_i})+(\mathcal{W}_1^{q_i})_{\Omega^c}+\beta_1(\mathcal{X}^{q_i+1}-\mathcal{Y}^{q_i})_{\Omega^c}+(\mathcal{W}_2^{q_i})_{\Omega^c}+\beta_2(\mathcal{X}^{q_i+1}-\mathcal{Z}^{q_i})_{\Omega^c}=\mathcal{O}_{\Omega^c},\\
(\mathcal{X}^{q_i+1})_{\Omega}=(\mathcal{M})_{\Omega},\\
\nabla_{\mathcal{U}} f(\mathcal{X}^{q_i+1},\mathcal{U}^{q_i+1},\mathcal{V}^{q_i},\mathcal{Y}^{q_i},\mathcal{Z}^{q_i})=\mathcal{O},\\
\nabla_{\mathcal{V}} f(\mathcal{X}^{q_i+1},\mathcal{U}^{q_i+1},\mathcal{V}^{q_i+1},\mathcal{Y}^{q_i},\mathcal{Z}^{q_i})=\mathcal{O},\\
\nabla_{\mathcal{Y}} f(\mathcal{X}^{q_i+1},\mathcal{U}^{q_i+1},\mathcal{V}^{q_i+1},\mathcal{Y}^{q_i+1},\mathcal{Z}^{q_i})-\mathcal{W}_1^{q_i}=\mathcal{O},\\
\nabla_{\mathcal{Z}} f(\mathcal{X}^{q_i+1},\mathcal{U}^{q_i+1},\mathcal{V}^{q_i+1},\mathcal{Y}^{q_i+1},\mathcal{Z}^{q_i+1})-\mathcal{W}_2^{q_i}=\mathcal{O},\\
\mathcal{W}_1^{q_i+1}=\mathcal{W}_1^{q_i}+\beta_1(\mathcal{X}^{q_i+1}-\mathcal{Y}^{q_i+1}),\\
\mathcal{W}_2^{q_i+1}=\mathcal{W}_2^{q_i}+\beta_2(\mathcal{X}^{q_i+1}-\mathcal{Z}^{q_i+1}).
\end{cases}\qquad(54)
$$

Since $\lim_{i\to\infty}\|\Lambda^{q_i+1}-\Lambda^{q_i}\|_F=0$ and $\lim_{i\to\infty}\Theta^{q_i}=\Theta^*$, it holds that $\lim_{i\to\infty}\|\mathcal{W}_1^{q_i+1}-\mathcal{W}_1^{q_i}\|_F=0$ from inequality (45). Similarly, we have $\lim_{i\to\infty}\|\mathcal{W}_2^{q_i+1}-\mathcal{W}_2^{q_i}\|_F=0$ from inequality (47). Consequently, by letting $i\to\infty$ in Equation (54), we know that $\Theta^*$ is a stationary point of model (5). We complete the proof.

5 Numerical experiments

In this section, we apply our approach (SRdTPD) to two authentic Internet traffic datasets: the GÉANT dataset [20], which logs Internet traffic data between 23 origin and 23 destination nodes every 15 min, and the Abilene1 dataset, which records Internet traffic data between 11 origin and 11 destination nodes every 5 min. For both datasets, we select data spanning a period of 7 days. The sizes of these two datasets, arranged in fourth-order tensor format, are 7 × 96 × 23 × 23 and 7 × 288 × 11 × 11, respectively. For each experimental setting, we reconstruct the Internet traffic data five times, and all experimental results presented are the averages of the five repetitions. As demonstrated by recovery tests on synthetic experiments and many real-world applications [14], algorithms using the discrete Fourier transform not only exhibit similar recovery accuracy but also incur lower computational costs than methods using the discrete cosine transform or random orthogonal transforms. It is well known that the fast Fourier transform (FFT) has a computational complexity of $O(N\log N)$ [21]. Using this result, it is not difficult to see that the computational complexity of SRdTPD with the FFT is $O\big((n_1+n_2)n_1n_2+s\,n_3n_4\log(n_3n_4)\big)$, which is significantly lower than that of methods using other transforms. Considering the potential computational advantage of SRdTPD equipped with the FFT when processing large-scale data, we choose the normalized discrete Fourier transform as $L_3$ and $L_4$ in $L$, i.e.,

$$
L_3=L_4=\frac{1}{\sqrt{N}}\begin{bmatrix}1&1&1&\cdots&1\\1&w&w^2&\cdots&w^{N-1}\\1&w^2&w^4&\cdots&w^{2(N-1)}\\\vdots&\vdots&\vdots&&\vdots\\1&w^{N-1}&w^{2(N-1)}&\cdots&w^{(N-1)^2}\end{bmatrix}
$$

where $w=e^{-\frac{2\pi i}{N}}$. All the experiments are performed on a notebook computer (13th Gen Intel(R) Core(TM) i9-13900HX CPU @2.20 GHz with 24 GB of memory) running a 64-bit Windows operating system. The code of SRdTPD is released at https://github.com/Duan-Yuxuan/Traffic-data-recovery/tree/master.
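The normalized DFT matrix above can be generated and checked for unitarity in a few lines (a sketch with our own helper name; unitarity of $L_3,L_4$ is the property required by Equation (2) and Theorem 1):

```python
import numpy as np

def unitary_dft(N):
    """Normalized DFT matrix used for L3 and L4: entry (j, k) equals
    w^{jk} / sqrt(N) with w = exp(-2*pi*i/N)."""
    j, k = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    return np.exp(-2j * np.pi * j * k / N) / np.sqrt(N)

L3 = unitary_dft(23)   # GEANT: N = 23 nodes
# Unitarity: L3 L3^H = I, so Frobenius norms are transform-invariant.
assert np.allclose(L3 @ L3.conj().T, np.eye(23))
```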

To verify the performance of SRdTPD, we compare it with five existing approaches designed for data recovery, the first of which is a matrix completion method, and the remaining four are tensor completion methods:

SRMF, which is a low-rank matrix completion with a spatio-temporal regularization [4];

CPWOPT, which is a tensor CP factorization completion approach [22];

T2C, which is a tensor filling approach that decomposes third-order tensors into three third-order low-rank tensors in a balanced manner [23];

SRTNN, which is a tensor-based approach with spatio-temporal regularization terms [11];

BGCP, which is a Bayesian Gaussian CP tensor decomposition approach [24].

We use the Normalized Mean Absolute Error (NMAE) to measure the quality of the data recovered by the models and algorithms, where the NMAE is defined as follows:

$$\mathrm{NMAE}=\frac{\sum_{(i,j,k,l)\notin\Omega}\big|(\mathcal{X}_{\mathrm{true}})_{ijkl}-(\mathcal{X}_{\mathrm{rec}})_{ijkl}\big|}{\sum_{(i,j,k,l)\notin\Omega}\big|(\mathcal{X}_{\mathrm{true}})_{ijkl}\big|},$$

where $\mathcal{X}_{\mathrm{true}}$ and $\mathcal{X}_{\mathrm{rec}}$ are the original and recovered data, respectively. Clearly, a lower NMAE value means better quality of the recovered data. We first consider the recovery effectiveness of the various approaches for missing data with data sampling rates ranging from 10% to 90%. Under random missingness in the GÉANT dataset, we set α1 = 0.1, α2 = 200, β1 = 0.01, β2 = 10, λ = 0.001, γ = 0.01, while for the Abilene dataset, we choose α1 = β2 = 0.01, α2 = 10, β1 = λ = γ = 0.001. Figure 1 presents the NMAE values for missing data recovery using the various methods at different sampling rates. It can be visually observed that SRdTPD outperforms the other methods in terms of data recovery effectiveness; as the missing data rate increases, the superiority of our method's recovery performance becomes increasingly apparent.
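The NMAE above can be computed in a few lines (a sketch with our own function name; `mask` marks the observed set $\Omega$, so its complement indexes the entries being recovered):

```python
import numpy as np

def nmae(X_true, X_rec, mask):
    """Normalized Mean Absolute Error over the unobserved entries."""
    miss = ~mask  # entries outside Omega, i.e. the recovered ones
    return np.abs(X_true[miss] - X_rec[miss]).sum() / np.abs(X_true[miss]).sum()
```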

Figure 1. Numerical comparison on NMAE and computing time (seconds) with respect to different random sample ratios.

In addition to the recovery of data missing at random, we next consider the following four types of structural missingness scenarios, which are given in [4], to demonstrate that our method fully exploits the structural information within the data.

xxTimeRandLoss This scenario simulates the phenomenon of structured data absence at specific time points influenced by certain factors, such as the utilization of unreliable data transmission equipment. Specifically, these data are missing at these specific time points in a certain random proportion. In our simulation, we randomly select xx% of the matrix slices along the last two modes of the Internet traffic tensor X of size n1×n2×n3×n4, and subsequently, we further randomly delete q% of its elements in each selected matrix.

xxElemRandLoss This scenario simulates the structured absence of data for specific OD (Origin-Destination) nodes under the influence of certain factors, such as the use of unreliable data transmission equipment. In this context, OD node data are missing in a certain random proportion. To simulate this phenomenon, we randomly select xx% of the matrix slices along the first two modes of the Internet traffic tensor X of size n1×n2×n3×n4 and further randomly delete q% of the elements in each selected matrix.

xxElemSyncLoss This scenario simulates the structured data missingness for specific OD nodes due to a uniform underlying cause, resulting in temporally synchronized missingness among these OD nodes. To emulate this condition, we randomly select xx% of the slices from the Internet traffic tensor X of size n1×n2×n3×n4, which are matrix representations expanded along its first two modes. Subsequently, for each selected matrix, we randomly choose a common set of time indices (q%) and delete the corresponding elements at these indices.

RowRandLoss With flow level measurements, data are collected by a router. If a router is unable to collect data for a period of time, all data collected during those specific time points will be missing. To simulate this scenario, we randomly select p% of the time points within a day and delete all data recorded at those time points across a 7-day period.
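The four masking procedures above can be emulated with a short NumPy sketch that builds a boolean observation mask (True = observed) for a traffic tensor of size n1×n2×n3×n4. This is our own illustrative code, not the authors' implementation; the function names are ours, and percentages are rounded down to whole counts of slices and entries:

```python
import numpy as np

def _drop_q_percent(rng, view, q):
    """Set exactly floor(q% of size) entries of a 2-D mask view to False."""
    n_drop = int(view.size * q / 100)
    idx = rng.choice(view.size, size=n_drop, replace=False)
    view.flat[idx] = False  # .flat writes through the view into the mask

def time_rand_loss(shape, xx, q, seed=0):
    """xxTimeRandLoss: for xx% of the n1 x n2 slices indexed by the last
    two modes, randomly delete q% of the entries of each slice."""
    n1, n2, n3, n4 = shape
    rng = np.random.default_rng(seed)
    mask = np.ones(shape, dtype=bool)  # True = observed
    for flat in rng.choice(n3 * n4, size=int(n3 * n4 * xx / 100), replace=False):
        k, l = divmod(int(flat), n4)
        _drop_q_percent(rng, mask[:, :, k, l], q)
    return mask

def elem_rand_loss(shape, xx, q, seed=0):
    """xxElemRandLoss: as above, but over the n3 x n4 slices indexed by
    the first two (OD) modes."""
    n1, n2, n3, n4 = shape
    rng = np.random.default_rng(seed)
    mask = np.ones(shape, dtype=bool)
    for flat in rng.choice(n1 * n2, size=int(n1 * n2 * xx / 100), replace=False):
        i, j = divmod(int(flat), n2)
        _drop_q_percent(rng, mask[i, j, :, :], q)
    return mask

def elem_sync_loss(shape, xx, q, seed=0):
    """xxElemSyncLoss: all selected OD slices lose the SAME q% of time
    indices (temporally synchronized missingness)."""
    n1, n2, n3, n4 = shape
    rng = np.random.default_rng(seed)
    mask = np.ones(shape, dtype=bool)
    common = rng.choice(n3 * n4, size=int(n3 * n4 * q / 100), replace=False)
    for flat in rng.choice(n1 * n2, size=int(n1 * n2 * xx / 100), replace=False):
        i, j = divmod(int(flat), n2)
        mask[i, j, :, :].flat[common] = False
    return mask

def row_rand_loss(shape, p, seed=0):
    """RowRandLoss: p% of the daily time points (mode 3) are lost for
    every OD pair and on every day (mode 4)."""
    n1, n2, n3, n4 = shape
    rng = np.random.default_rng(seed)
    mask = np.ones(shape, dtype=bool)
    lost = rng.choice(n3, size=int(n3 * p / 100), replace=False)
    mask[:, :, lost, :] = False
    return mask
```

Each function returns a mask whose False entries mark the deleted data; multiplying the traffic tensor elementwise by the mask produces the incomplete observation.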

Under these scenarios of structural missingness, we fix β1 = γ = 0.01. The parameters for the first two structural missingness cases are identical within each dataset (GÉANT: α1 = 0.1, α2 = 200, β2 = 10, λ = 0.001; Abilene: α1 = 0.1, α2 = 2, β2 = 1, λ = 0.001), while the parameters for the last two structural missingness cases are likewise consistent within each dataset (GÉANT: α1 = 0.01, α2 = 2, β2 = 1, λ = 0.01; Abilene: α1 = 0.01, α2 = 1, β2 = 0.1, λ = 0.01). Among these four types of structured missingness scenarios, we simulate the first three by considering the following 12 specific missingness cases:

Missing Id 1-4: xx = 25 with q = 30, 50, 70, and 90, respectively;
Missing Id 5-8: xx = 50 with q = 30, 50, 70, and 90, respectively;
Missing Id 9-12: xx = 75 with q = 30, 50, 70, and 90, respectively.

For the fourth type of structured missingness (RowRandLoss), we specifically select p = 15, 30, 45, 50, and 75.
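The 12 (xx, q) combinations above form a simple Cartesian grid; for reference, they can be enumerated in order of Missing Id as follows (a trivial sketch using our own variable names):

```python
# Missing Id 1-12: pairs ordered first by xx, then by q
cases = [(xx, q) for xx in (25, 50, 75) for q in (30, 50, 70, 90)]
for missing_id, (xx, q) in enumerate(cases, start=1):
    print(f"Missing Id {missing_id}: xx = {xx}, q = {q}")
```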

The recovery performance and computational time of the different methods under the distinct structural missingness scenarios are comprehensively presented in Figures 2–4 and Table 1. Experimental results demonstrate that the proposed SRdTPD approach achieves substantial improvements in handling structurally missing data. Regarding recovery accuracy, SRdTPD demonstrates superior performance compared to the listed methods, attaining competitive results across all test cases. In terms of computational efficiency, SRdTPD maintains favorable runtimes: while a moderate increase in computational time is observed under extreme missingness conditions (attributable to the iterative nature of the solution process), the runtime remains within practical thresholds for real-world applications. These findings collectively suggest that SRdTPD effectively balances recovery accuracy and computational cost, providing a robust solution for structural missing data recovery tasks.

Figure 2. Numerical comparison on NMAE and computing time (seconds) with respect to different cases of xxTimeRandLoss.

Figure 3. Numerical comparison on NMAE and computing time (seconds) with respect to different cases of xxElemRandLoss.

Figure 4. Numerical comparison on NMAE and computing time (seconds) with respect to diverse missing ratio of RowRandLoss.

Table 1. Results on NMAE and computing time (seconds) with respect to diverse cases of xxElemSyncLoss.

6 Conclusion

This study establishes an equivalence relationship between the d-th order tensor nuclear norm (TNN) in the unitary transformation domain and the minimum sum of the squared Frobenius norms of its two factor tensors. Based on this relationship, we constructed a novel low-rank recovery method for Internet traffic data that effectively incorporates the spatio-temporal characteristics of traffic data while significantly reducing computational time. Auxiliary variables are introduced into the model, and the resulting problem is solved using the ADMM. Numerical experiments demonstrate that the proposed method exhibits significant advantages in the recovery of Internet traffic data, particularly in cases of structural missingness.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

YD: Investigation, Methodology, Writing – original draft. CL: Validation, Writing – review & editing. JL: Validation, Writing – review & editing. XY: Validation, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. CL's work was supported in part by National Natural Science Foundation of China (No. 11971138). JL's work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202200506), by Chongqing Talent Program Lump-sum Project (No. cstc2022ycjh-bgzxm0040), and by the Foundation of Chongqing Normal University (22XLB005, ncamc2022-msxm02). XY's work was supported by the Major Program of National Natural Science Foundation of China (Nos. 11991020, 11991024).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Roughan M, Thorup M, Zhang Y. Traffic engineering with estimated traffic matrices. In: Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement. Miami Beach, FL: Association for Computing Machinery (2003). p. 248–258.

2. Cahn RS. Wide area network design: concepts and tools for optimization. IEEE Commun Mag. (2000) 38:28–30. doi: 10.1109/MCOM.2000.867843

3. Zhang Y, Ge Z, Greenberg A, Roughan M. Network anomography. In: Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement. Berkeley, CA: USENIX Association (2005). p. 30.

4. Zhang Y, Roughan M, Willinger W, Qiu L. Spatio-temporal compressive sensing and internet traffic matrices. In: Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication. Barcelona: Association for Computing Machinery (2009). p. 267–278.

5. Carroll JD, Chang JJ. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika. (1970) 35:283–319. doi: 10.1007/BF02310791

6. Harshman RA. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multi-modal factor analysis. In: UCLA Work Papers Phonetics. Ann Arbor; Los Angeles, CA: University Microfilms (1970). p. 84.

7. Oseledets IV. Tensor-train decomposition. SIAM J Sci Comput. (2011) 33:2295–317. doi: 10.1137/090752286

8. Kilmer ME, Martin CD. Factorization strategies for third-order tensors. Linear Algebra Appl. (2011) 435:641–58. doi: 10.1016/j.laa.2010.09.020

9. Candès E, Recht B. Exact matrix completion via convex optimization. Found Comput Math. (2009) 9:717–72. doi: 10.1007/s10208-009-9045-5

10. Zhang Z, Ely G, Aeron S, Hao N, Kilmer M. Novel methods for multilinear data completion and de-noising based on tensor-SVD. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH: IEEE (2014). p. 3842–3849.

11. Li C, Chen Y, Li D. Internet traffic tensor completion with tensor nuclear norm. Comput Optim Appl. (2024) 87:1033–57. doi: 10.1007/s10589-023-00545-5

12. He H, Ling C, Xie W. Tensor completion via a generalized transformed tensor T-product decomposition without t-SVD. J Sci Comput. (2022) 93:47. doi: 10.1007/s10915-022-02006-3

13. Martin CD, Shafer R, LaRue B. An order-p tensor factorization with applications in imaging. SIAM J Sci Comput. (2013) 35:A474–90. doi: 10.1137/110841229

14. Qin W, Wang H, Zhang F, Wang J, Luo X, Huang T. Low-rank high-order tensor completion with applications in visual data. IEEE Trans Image Process. (2022) 31:2433–48. doi: 10.1109/TIP.2022.3155949

15. Kolda TG, Bader BW. Tensor decompositions and applications. SIAM Rev. (2009) 51:455–500. doi: 10.1137/07070111X

16. Recht B, Fazel M, Parrilo PA. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. (2010) 52:471–501. doi: 10.1137/070697835

17. Chen C, He B, Ye Y, Yuan X. The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Math Program. (2016) 155:57–79. doi: 10.1007/s10107-014-0826-5

18. Popa J, Lou Y, Minkoff SE. Low-rank tensor data reconstruction and denoising via ADMM: algorithm and convergence analysis. J Sci Comput. (2023) 97:49. doi: 10.1007/s10915-023-02364-6

19. Beck A. First-Order Methods in Optimization. Philadelphia: SIAM. (2017). doi: 10.1137/1.9781611974997

20. Uhlig S, Quoitin B, Lepropre J, Balon S. Providing public intradomain traffic matrices to the research community. ACM SIGCOMM Comput Commun Rev. (2006) 36:83–6. doi: 10.1145/1111322.1111341

21. Cooley JW, Tukey JW. An algorithm for the machine calculation of complex Fourier series. Math Comp. (1965) 19:297–301. doi: 10.1090/S0025-5718-1965-0178586-1

22. Acar E, Dunlavy DM, Kolda TG, Mørup M. Scalable tensor factorizations for incomplete data. Chemom Intell Lab Syst. (2011) 106:41–56. doi: 10.1016/j.chemolab.2010.08.004

23. Chen X, He Z, Chen Y, Lu Y, Wang J. Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model. Transp Res Part C: Emerg Technol. (2019) 104:66–77. doi: 10.1016/j.trc.2019.03.003

24. Chen X, He Z, Sun L. A Bayesian tensor decomposition approach for spatiotemporal traffic data imputation. Transp Res Part C: Emerg Technol. (2019) 98:73–84. doi: 10.1016/j.trc.2018.11.003

Crossref Full Text | Google Scholar

Keywords: Internet traffic data recovery, d-th order TNN, spatio-temporal regularization, ADMM algorithm, tensor completion

Citation: Duan Y, Ling C, Liu J and Yang X (2025) Internet traffic data recovery via a low-rank spatio-temporal regularized optimization approach without d-th order T-SVD. Front. Appl. Math. Stat. 11:1587681. doi: 10.3389/fams.2025.1587681

Received: 04 March 2025; Accepted: 07 April 2025;
Published: 06 May 2025.

Edited by:

Yannan Chen, South China Normal University, China

Reviewed by:

Douglas Soares Goncalves, Federal University of Santa Catarina, Brazil
Can Li, Honghe University, China
Xueli Bai, Guangdong University of Foreign Studies, China

Copyright © 2025 Duan, Ling, Liu and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jinjie Liu, jinjie.liu@cqnu.edu.cn
