LatLRR-FCNs: Latent Low-Rank Representation With Fully Convolutional Networks for Medical Image Fusion

Medical image fusion, which aims to derive complementary information from multi-modality medical images, plays an important role in many clinical applications, such as medical diagnostics and treatment. We propose the LatLRR-FCNs, which is a hybrid medical image fusion framework consisting of the latent low-rank representation (LatLRR) and the fully convolutional networks (FCNs). Specifically, the LatLRR module is used to decompose the multi-modality medical images into low-rank and saliency components, which can provide fine-grained details and preserve energies, respectively. The FCN module aims to preserve both global and local information by generating the weighting maps for each modality image. The final weighting map is obtained using the weighted local energy and the weighted sum of the eight-neighborhood-based modified Laplacian method. The fused low-rank component is generated by combining the low-rank components of each modality image according to the guidance provided by the final weighting map within pyramid-based fusion. A simple sum strategy is used for the saliency components. The usefulness and efficiency of the proposed framework are thoroughly evaluated on four medical image fusion tasks, including computed tomography (CT) and magnetic resonance (MR), T1- and T2-weighted MR, positron emission tomography and MR, and single-photon emission CT and MR. The results demonstrate that by leveraging the LatLRR for image detail extraction and the FCNs for global and local information description, we can achieve performance superior to the state-of-the-art methods in terms of both objective assessment and visual quality in some cases. Furthermore, our method has a competitive performance in terms of computational costs compared to other baselines.


INTRODUCTION
Medical image fusion is a key technology that has been used extensively in clinical diagnosis and treatment planning (James and Dasarathy, 2014). Modern medical imaging techniques mainly include computed tomography (CT), magnetic resonance (MR), single-photon emission computed tomography (SPECT), and positron emission tomography (PET) (Walrand et al., 2017). CT has a high spatial and density resolution for dense structures (e.g., bones and implants), while MR has a high resolution for soft tissue (e.g., muscle, tendon, and fascia). PET is an advanced nuclear medical examination technique that allows visualization of biomolecular metabolism, receptors, and neurotransmitter activity in vivo. SPECT is often applied to quantify the physiological and pathological changes of organs or tissues. Evaluating these imaging techniques from different perspectives reveals that they do, to an extent, complement each other (Walrand et al., 2017). Thus, medical image fusion can be utilized to combine different medical images into a new fused image that provides the clinical information from each original image (Huang et al., 2020).
Among existing fusion approaches, multi-scale transform (MST)-based techniques are among the most widely studied. The key point of MST-based fusion techniques is to decompose the original images into a multiscale transform domain (Li et al., 1995); fusion rules are then applied to merge the transformed coefficients, and the merged coefficients are employed to reconstruct the composite image. The current literature indicates that the non-subsampled shearlet transform (NSST) and the non-subsampled contourlet transform (NSCT) achieve the best image representation performance among MST-based methods (Anitha et al., 2015; Yin et al., 2018; Zhu et al., 2019). Zhu et al. (2019) used the NSCT to decompose medical image pairs into low-pass and high-pass sub-bands, where a phase congruency rule was applied to fuse the high-pass sub-bands and a local Laplacian energy-based fusion rule was utilized for the low-pass sub-bands. Later, Yin et al. (2018) introduced a framework in which the high-frequency coefficients were fused by a parameter-adaptive pulse coupled neural network (PA-PCNN), while the weighted local energy and the weighted sum of the eight-neighborhood-based modified Laplacian were utilized to fuse the low-frequency bands in the NSST domain. However, due to the nature of the transformation, MST-based (including NSCT- and NSST-based) fusion methods may fail to express and extract certain significant structures of the source images and remain sensitive to misregistration.
To address the misregistration problem of MST-based methods, sparse representation (SR) has emerged as another popular and powerful theory in the medical image fusion field (Liu and Wang, 2014; Liu et al., 2016, 2019; Fei et al., 2017).
A typical SR-based medical image fusion method includes three basic steps: (1) a given dictionary is used to find the sparsest representation of the source images; (2) fusion rules are used to integrate the sparse representation coefficients; and (3) the integrated coefficients and the given dictionary are used to construct the fused image. For example, Liu and Wang (2014) proposed an adaptive sparse representation model for medical image fusion, in which a set of more compact sub-dictionaries is learned to replace the single redundant dictionary of the traditional SR approach, achieving better results. Although SR-based and extended methods are robust to noise and misregistration to some extent, they cannot capture global information and suffer from significant energy loss.
In the field of medical image fusion, a key issue is the calculation of a weight map, since it reflects the pixel activity information of the different modality images and determines the quality of the final fused image. The weight map is calculated in two steps: activity level measurement and weight assignment. However, these two steps suffer from a robustness problem because traditional methods cannot deal with noise and misregistration well, as indicated in Liu et al. (2017). To improve the robustness of activity level measurement and weight assignment, Liu et al. (2017) introduced a deep learning fusion method in which a simple multi-layer convolutional neural network (CNN) generates a decision map that, together with the medical images, is used to reconstruct the fused medical image under a pyramid-based image fusion framework. While such a method achieves some success in specific medical image fusion tasks, it may fail in multi-modal image fusion because the simple use of a CNN cannot extract fine-grained details efficiently.
To address the aforementioned challenges, we propose a novel hybrid medical image fusion framework with two principal elements (i.e., LatLRR and FCNs), inspired by Liu and Wang (2014) and Liu et al. (2017). The main contributions of this paper are as follows:
• The latent low-rank representation (LatLRR) is applied to decompose the medical image into low-rank (for the extraction of details) and saliency (for the preservation of energies) components.
• In the context of the low-rank component, to avoid the fixed-length feature vector of the final fully connected layer and the information loss of a traditional CNN, three different FCNs (which accept an input image of arbitrary size) are applied to produce a correspondingly-sized feature map with an efficient deconvolution layer (Guo et al., 2018), where a prediction is generated for each pixel and the spatial information of the original input image is retained. A sum strategy is used to fuse the saliency parts for energy preservation.
• To the best of our knowledge, this fusion strategy combining LatLRR and FCNs is the first to be applied in the medical image domain.
The remainder of this paper is structured as follows. In section 2, the proposed fusion strategy is described in detail. Section 3 gives the experimental configurations. Section 4 illustrates a comparative study between the proposed frameworks and five representative medical image fusion methods in terms of visual quality and quantitative and computational cost assessments. The conclusion is drawn in section 5.

METHODOLOGY
As shown in Figure 1, each proposed framework is fed with a pair of pre-registered multi-modality medical source images and outputs the fused medical image via the following four steps:
• We use the LatLRR theory to decompose the two medical source images into low-rank and saliency components (see section 2.1).
• To capture the detailed information of each source, a novel fusion framework for the low-rank components based on FCNs, score maps, weight maps, and pyramid fusion is applied (see section 2.2).
• To retain the energies of each source, a simple sum strategy is used to fuse the saliency components (see section 2.3).
• The final fused image is reconstructed by combining the fused low-rank and fused saliency components (Figure 1d).

LatLRR Decomposition
The LatLRR theory was first proposed by Liu and Yan (2011); it integrates subspace segmentation and feature extraction simultaneously to extract the global and local structure from raw data in the context of natural images. It can be summarized as the following problem (Li and Wu, 2018):

$$\min_{X, Y, Z} \|X\|_{*} + \|Y\|_{*} + \lambda \|Z\|_{1}, \quad \text{s.t.} \quad Img = Img\,X + Y\,Img + Z, \tag{1}$$

where $\|\cdot\|_{*}$ denotes the nuclear norm, $\|\cdot\|_{1}$ denotes the l1-norm, and λ > 0 is the balance coefficient. Img is the observed data matrix, and X and Y denote the low-rank and saliency coefficients, respectively. Figure 2 illustrates the constraint of Equation (1), where ImgX = Img X, ImgY = Y Img, and Z are the low-rank, saliency, and noise components of Img, respectively. In this paper, Equation (1) is solved by the inexact augmented Lagrangian multiplier method (Wang et al., 2013), which extracts the low-rank and saliency components (i.e., ImgX_j and ImgY_j) from each medical image Img_j, j = 1, 2 (here, we consider two medical images, as shown in Figure 1a).
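For readers who want to experiment with the decomposition, the following is a minimal NumPy sketch of Equation (1) solved with the inexact augmented Lagrangian multiplier (ALM) scheme of Liu and Yan (2011); the update order, stopping rule, and all variable names are our own illustrative choices, not the authors' implementation.

```python
# A hedged sketch of LatLRR via inexact ALM (Liu and Yan, 2011).
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def shrink(M, tau):
    """Soft thresholding: proximal operator of the l1-norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def latlrr(Img, lam=0.8, rho=1.1, mu=1e-6, mu_max=1e6, tol=1e-7, max_iter=500):
    """Solve min ||X||_* + ||Y||_* + lam*||Z||_1  s.t.  Img = Img@X + Y@Img + Z."""
    d, n = Img.shape
    X, Y, Z = np.zeros((n, n)), np.zeros((d, d)), np.zeros((d, n))
    Y1, Y2, Y3 = np.zeros((d, n)), np.zeros((n, n)), np.zeros((d, d))
    In, Id = np.eye(n), np.eye(d)
    for _ in range(max_iter):
        # Auxiliary low-rank variables via singular value thresholding.
        J = svt(X + Y2 / mu, 1.0 / mu)
        S = svt(Y + Y3 / mu, 1.0 / mu)
        # Closed-form updates of the low-rank and saliency coefficients.
        X = np.linalg.solve(In + Img.T @ Img,
                            Img.T @ (Img - Y @ Img - Z) + J + (Img.T @ Y1 - Y2) / mu)
        R = (Img - Img @ X - Z) @ Img.T + S + (Y1 @ Img.T - Y3) / mu
        Y = np.linalg.solve(Id + Img @ Img.T, R.T).T
        # Sparse noise term via soft thresholding.
        Z = shrink(Img - Img @ X - Y @ Img + Y1 / mu, lam / mu)
        # Multiplier and penalty updates.
        E = Img - Img @ X - Y @ Img - Z
        Y1 += mu * E; Y2 += mu * (X - J); Y3 += mu * (Y - S)
        mu = min(rho * mu, mu_max)
        if max(np.abs(E).max(), np.abs(X - J).max(), np.abs(Y - S).max()) < tol:
            break
    return Img @ X, Y @ Img, Z   # low-rank, saliency, and noise components
```

With λ = 0.8 (the value used in section 3.4), calling latlrr on a grayscale image matrix returns the low-rank, saliency, and noise parts used in the rest of the pipeline.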

Fusion of Low-Rank Components
The fusion of the low-rank components is detailed in Figure 1b. It includes the FCN model for producing score maps (section 2.2.1); the zero-phase component analysis (ZCA) (Kessy et al., 2018) and l1-norm operations (section 2.2.2) for whitening the score maps and generating the weight maps, respectively; the weighted local energy (WLE) and weighted sum of eight-neighborhood-based modified Laplacian (WSEML) (Yin et al., 2018) operations (section 2.2.3) for obtaining the fused weight map; and the pyramid fusion strategy (section 2.2.4) for reconstructing the fused low-rank component.

FCN Model
Fully convolutional networks (FCNs) have achieved significant performance in image semantic segmentation, as demonstrated in many studies (Long et al., 2015; Wang L. et al., 2015; Chen et al., 2017; Guo et al., 2018). In the FCN architecture, after multiple convolution and pooling operations, the feature maps become progressively smaller with lower resolution, resulting in a coarse heatmap output. To keep the output the same size as the input, a skip architecture is used for upsampling. In this work, three different scenarios are tested, as shown in Figure 3. For each scenario, there are 38 layers before upsampling, including 16 convolutional layers (blue blocks in Figure 3A), 15 ReLU layers, five pooling layers (green blocks in Figure 3A), and two dropout layers. In Figure 3A, FCN-32s is a single-stream net that upsamples stride-32 predictions back to pixel resolution in a single step, but the upsampled output is very coarse. To obtain the refined outputs of FCN-16s, the predictions of the final layer and the pool4 layer are combined at stride 16 (Figure 3B). In Figure 3C, to obtain the more precise outputs of FCN-8s, the predictions of the pool3 layer, the pool4 layer, and the final layer are combined at stride 8. As shown in Figure 1b, the three trained FCNs (FCN-32s, FCN-16s, and FCN-8s) are utilized to classify a pair of low-rank components of the medical source images, Lr_1 = ImgX_1 and Lr_2 = ImgX_2, pixel by pixel, producing the corresponding score maps S^j_{1:C}, with C = 21 and j = 1, 2 (the choice of C = 21 is explained in section 3.4).
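As an illustration of how the three skip variants differ only in where predictions are fused, the following is a minimal PyTorch sketch of an FCN-8s prediction head. It assumes a VGG16-style backbone whose pool3, pool4, and final feature maps have 256, 512, and 4096 channels at strides 8, 16, and 32 (inputs divisible by 32), and it omits the cropping details of the original implementation.

```python
# A hedged sketch of the FCN-8s skip head; FCN-32s and FCN-16s simply drop
# the pool3 (and pool4) branches. Layer sizes are illustrative.
import torch
import torch.nn as nn

class FCN8sHead(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        # 1x1 convolutions turn backbone feature maps into per-class scores.
        self.score_final = nn.Conv2d(4096, num_classes, 1)  # stride-32 features
        self.score_pool4 = nn.Conv2d(512, num_classes, 1)   # stride-16 features
        self.score_pool3 = nn.Conv2d(256, num_classes, 1)   # stride-8 features
        # Learned upsampling via transposed convolutions.
        self.up2a = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up2b = nn.ConvTranspose2d(num_classes, num_classes, 4, stride=2, padding=1)
        self.up8  = nn.ConvTranspose2d(num_classes, num_classes, 16, stride=8, padding=4)

    def forward(self, pool3, pool4, final):
        s = self.up2a(self.score_final(final))   # stride 32 -> 16
        s = s + self.score_pool4(pool4)          # fuse pool4 prediction
        s = self.up2b(s)                         # stride 16 -> 8
        s = s + self.score_pool3(pool3)          # fuse pool3 prediction
        return self.up8(s)                       # stride 8 -> full resolution
```

The output is a C = 21 channel score map of the same spatial size as the input, which is exactly the per-pixel prediction used as S^j_{1:C} below.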

ZCA and l 1 -Norm Operations
The details of the ZCA and l1-norm operations are depicted in Figure 4. To project the original redundant score maps into a sparse subspace, we use ZCA to whiten the score maps S^j_{1:C}, obtaining the whitened score maps Ŝ^j_{1:C}. In ZCA, the covariance matrix Co^j_i is decomposed as follows:

$$Co^j_i = U \Sigma V^{T}, \tag{2}$$

where i = 1, 2, ..., C indexes the i-th channel score map and j = 1, 2 indexes the source; U, Σ, and V denote the left singular vector, singular value, and right singular vector matrices, respectively. The whitened map Ŝ^j_i is then given as follows:

$$\hat{S}^j_i = U (\Sigma + \eta I)^{-1/2} U^{T} S^j_i, \tag{3}$$

where η is a small value that avoids an ill-conditioned matrix inversion and I is the identity matrix. Then, local l1-norm and averaging operations are used to calculate the initial weight map W_j:

$$W_j(u, v) = \frac{1}{(2k+1)^2} \sum_{a=-k}^{k} \sum_{b=-k}^{k} \left\| \hat{S}^j_{1:C}(u+a, v+b) \right\|_{1}, \tag{4}$$

where k = 2, i.e., the l1-norm over the C channels is averaged within a (2k+1) × (2k+1) window centered at Ŝ^j_{1:C}(u, v).
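A minimal NumPy/SciPy sketch of Equations (2)-(4) follows. We form each channel's covariance as Co = S S^T / n, which is one reasonable reading of the text; η and the window radius k = 2 follow the paper, and all other names are ours.

```python
# A hedged sketch of the ZCA whitening and local l1-norm weight-map steps.
import numpy as np
from scipy.ndimage import uniform_filter

def zca_whiten_channel(S_i, eta=1e-5):
    """Whiten one H x W score map: S_hat = U (Sigma + eta I)^(-1/2) U^T S."""
    cov = S_i @ S_i.T / S_i.shape[1]       # channel covariance (Equation 2)
    U, s, _ = np.linalg.svd(cov)           # cov = U diag(s) V^T (V = U here)
    return U @ np.diag(1.0 / np.sqrt(s + eta)) @ U.T @ S_i   # Equation (3)

def initial_weight_map(scores, k=2):
    """scores: C x H x W score maps of one source -> H x W initial weight map."""
    white = np.stack([zca_whiten_channel(s) for s in scores])
    l1 = np.abs(white).sum(axis=0)         # per-pixel l1-norm over the C channels
    return uniform_filter(l1, size=2 * k + 1)   # (2k+1)^2 local average (Eq. 4)
```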

WLE and WSEML Operations
Once the initial weight maps W_1 and W_2 are calculated, the WLE and WSEML operations are applied to acquire the final fused weight map F_w, as described in the orange block of Figure 1b.
FIGURE 1 | Schematic diagram of the proposed end-to-end frameworks (LatLRR-FCNs). The proposed LatLRR-FCNs enable the fused image to extract details and preserve energies from the paired sources. The framework is composed of four parts: (a) LatLRR decomposition, (b) fusion of low-rank components, (c) fusion of saliency components, and (d) reconstruction of the fused image. Img_1 and Img_2 are the source medical images; Lr_1 and Lr_2 are the low-rank components of Img_1 and Img_2; Ls_1 and Ls_2 are the saliency components of Img_1 and Img_2; S^1_{1:C} and S^2_{1:C} are the score maps; W_1 and W_2 are the initial weight maps of Lr_1 and Lr_2; and F_w is the final fused weight map. F_lr is the fused low-rank component, F_ls is the fused saliency component, and F is the final fused image.
FIGURE 2 | The LatLRR decomposing operation. Img is the observed image. ImgX and ImgY are the low-rank and saliency components of Img, respectively. Z denotes the noisy component.
First, the WLE of each W_j (denoted WLE_j) is calculated as follows:

$$\mathrm{WLE}_j(u, v) = \sum_{m=-r}^{r} \sum_{n=-r}^{r} W(m+r+1, n+r+1)\, W_j(u+m, v+n)^2, \tag{5}$$

where j ∈ {1, 2} and W denotes a (2r + 1) × (2r + 1) weighting matrix. The value of each element of W is 2^{2r−d}, where r is the radius and d denotes the four-neighborhood distance of the element to the center. If r is 1, W is equal to

$$W = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}.$$

Second, the WSEML of each W_j (denoted WSEML_j) is given as follows:

$$\mathrm{WSEML}_j(u, v) = \sum_{m=-r}^{r} \sum_{n=-r}^{r} W(m+r+1, n+r+1)\, \mathrm{EML}_j(u+m, v+n), \tag{6}$$

where EML is expressed as follows:

$$\begin{aligned}\mathrm{EML}_j(u, v) = &\ |2 W_j(u, v) - W_j(u-1, v) - W_j(u+1, v)| \\ &+ |2 W_j(u, v) - W_j(u, v-1) - W_j(u, v+1)| \\ &+ \tfrac{1}{\sqrt{2}}\, |2 W_j(u, v) - W_j(u-1, v-1) - W_j(u+1, v+1)| \\ &+ \tfrac{1}{\sqrt{2}}\, |2 W_j(u, v) - W_j(u-1, v+1) - W_j(u+1, v-1)|. \end{aligned} \tag{7}$$

Finally, the fused weight map F_w is calculated by the following rule:

$$F_w(u, v) = \begin{cases} W_1(u, v), & \mathrm{WLE}_1(u, v)\, \mathrm{WSEML}_1(u, v) \ge \mathrm{WLE}_2(u, v)\, \mathrm{WSEML}_2(u, v) \\ W_2(u, v), & \text{otherwise.} \end{cases} \tag{8}$$
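A minimal NumPy/SciPy sketch of Equations (5)-(8) is given below; the r = 1 weighting matrix follows the 2^{2r−d} rule above (the global 1/16 scaling is harmless, since Equation (8) only compares the two activity products), and the function names are ours.

```python
# A hedged sketch of the WLE and WSEML activity measures (Yin et al., 2018)
# applied to the two initial weight maps.
import numpy as np
from scipy.ndimage import convolve

W3 = np.array([[1., 2., 1.],
               [2., 4., 2.],
               [1., 2., 1.]]) / 16.0   # 2^(2r-d) weights for r = 1, normalized

def wle(img):
    """Weighted local energy: weighted sum of squared values in the window."""
    return convolve(img ** 2, W3, mode='nearest')

def eml(img):
    """Eight-neighborhood modified Laplacian at each pixel (Equation 7)."""
    p = np.pad(img, 1, mode='edge')
    c = p[1:-1, 1:-1]
    horiz = np.abs(2 * c - p[1:-1, :-2] - p[1:-1, 2:])
    vert  = np.abs(2 * c - p[:-2, 1:-1] - p[2:, 1:-1])
    diag1 = np.abs(2 * c - p[:-2, :-2] - p[2:, 2:]) / np.sqrt(2)
    diag2 = np.abs(2 * c - p[:-2, 2:] - p[2:, :-2]) / np.sqrt(2)
    return horiz + vert + diag1 + diag2

def fuse_weight_maps(W1, W2):
    """Select, per pixel, the weight map with the larger WLE x WSEML product."""
    a1 = wle(W1) * convolve(eml(W1), W3, mode='nearest')
    a2 = wle(W2) * convolve(eml(W2), W3, mode='nearest')
    return np.where(a1 >= a2, W1, W2)
```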

Pyramid Fusion Strategy
As shown in Figure 1b, the fused weight map F_w is decomposed into a Gaussian pyramid G{S}_l (green arrow), and the low-rank components Lr_1 and Lr_2 are decomposed into Laplacian pyramids L{C}_l and L{M}_l, respectively (dark blue arrows). Here, l denotes the l-th decomposition level; the number of levels is determined from the spatial size X × Y of the low-rank component via Equation (9), where ⌊·⌋ is the flooring operation.
Next, the fused pyramid coefficients L{F}_l are calculated at each decomposition level l:

$$L\{F\}_l(x, y) = \begin{cases} G\{S\}_l(x, y)\, L\{C\}_l(x, y) + (1 - G\{S\}_l(x, y))\, L\{M\}_l(x, y), & Q_l(x, y) \ge \tau \\ L\{C\}_l(x, y), & Q_l(x, y) < \tau \ \text{and}\ E^l_C(x, y) \ge E^l_M(x, y) \\ L\{M\}_l(x, y), & \text{otherwise,} \end{cases} \tag{10}$$

where the threshold τ determines the corresponding fusion mode (weighted averaging vs. selection). The local similarity measure Q_l(x, y) is given as follows:

$$Q_l(x, y) = \frac{2 \sum_{u, v} L\{C\}_l(x+u, y+v)\, L\{M\}_l(x+u, y+v)}{E^l_C(x, y) + E^l_M(x, y)}, \tag{11}$$

where E^l_C(x, y) and E^l_M(x, y) are the local energy maps of L{C}_l and L{M}_l, respectively, defined as follows:

$$E^l_C(x, y) = \sum_{u, v} L\{C\}_l(x+u, y+v)^2, \qquad E^l_M(x, y) = \sum_{u, v} L\{M\}_l(x+u, y+v)^2, \tag{12}$$

with (u, v) ranging over a small local window. Finally, the Laplacian pyramid reconstruction method (Mertens et al., 2009) (bottle-green arrow in Figure 1b) is used to reconstruct the fused low-rank component F_lr from L{F}_l, as indicated in Equation (10).
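To make the pyramid step concrete, here is a minimal OpenCV/NumPy sketch of the guided Laplacian-pyramid blend; for brevity, the per-level rule is simplified to plain weighted averaging by G{S}_l, and the τ-gated energy comparison of Equations (10)-(12) would replace the marked line in pyramid_fuse.

```python
# A hedged sketch of weight-map-guided Laplacian pyramid fusion.
import cv2
import numpy as np

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    gp = gaussian_pyramid(img, levels)
    lp = [gp[i] - cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
          for i in range(levels)]
    lp.append(gp[-1])                    # coarsest level kept as-is
    return lp

def pyramid_fuse(Lr1, Lr2, Fw, levels=4):
    gw = gaussian_pyramid(Fw.astype(np.float32), levels)     # G{S}_l
    lc = laplacian_pyramid(Lr1.astype(np.float32), levels)   # L{C}_l
    lm = laplacian_pyramid(Lr2.astype(np.float32), levels)   # L{M}_l
    # Simplified per-level rule; Equations (10)-(12) would go here.
    fused = [g * c + (1.0 - g) * m for g, c, m in zip(gw, lc, lm)]
    # Collapse the fused Laplacian pyramid back to an image.
    out = fused[-1]
    for lvl in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=(fused[lvl].shape[1],
                                      fused[lvl].shape[0])) + fused[lvl]
    return out
```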

The Flowchart of the Proposed LatLRR-FCNs
Once a pair of low-rank components has been obtained from the LatLRR decomposition, one of the FCN architectures (FCN-32s, FCN-16s, or FCN-8s) is inserted to produce the two score maps with the focus property (hereafter, the proposed frameworks are named LatLRR-FCN-32s, LatLRR-FCN-16s, and LatLRR-FCN-8s, respectively). The overall procedure is as follows:
Part 1: Perform the LatLRR decomposition on each source image Img_j (j = 1, 2) to obtain the low-rank component Lr_j = ImgX_j and the saliency component Ls_j = ImgY_j.
Part 2: (i) Feed Lr_1 and Lr_2 into the trained FCN to obtain the score maps S^j_{1:C}; (ii) apply the ZCA and l1-norm operations [Equations (2-4)] to obtain the initial weight maps W_1 and W_2; (iii) use Equations (5-8) to obtain F_w; (iv) apply the pyramid fusion strategy (Mertens et al., 2009) to reconstruct the fused low-rank components' image F_lr.
Part 3: Sum the saliency components to obtain the fused saliency components' image F_ls.
Part 4: Obtain the final fused image F by combining F_lr and F_ls.
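Putting the parts together, the following hedged sketch mirrors the four-part procedure above; latlrr, initial_weight_map, fuse_weight_maps, and pyramid_fuse refer to the earlier sketches, and fcn_score_maps is a hypothetical stand-in for a forward pass through one of the trained FCNs.

```python
# A hedged end-to-end driver for the LatLRR-FCN pipeline (names are ours).
def latlrr_fcn_fuse(Img1, Img2):
    # Part 1: LatLRR decomposition of each source.
    Lr1, Ls1, _ = latlrr(Img1)
    Lr2, Ls2, _ = latlrr(Img2)
    # Part 2: fuse the low-rank components.
    W1 = initial_weight_map(fcn_score_maps(Lr1))   # FCN + ZCA + local l1-norm
    W2 = initial_weight_map(fcn_score_maps(Lr2))
    Fw = fuse_weight_maps(W1, W2)                  # WLE and WSEML selection
    F_lr = pyramid_fuse(Lr1, Lr2, Fw)              # pyramid reconstruction
    # Part 3: fuse the saliency components with a simple sum.
    F_ls = Ls1 + Ls2
    # Part 4: reconstruct the final fused image.
    return F_lr + F_ls
```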

FCN Training Sets
Currently, transfer learning (Bar et al., 2015; Liu et al., 2017; Razzak et al., 2018; Lu et al., 2019, 2020) has become an active topic in the field of medical image analysis. In this study, we directly adopted a transfer learning strategy and trained the FCNs (FCN-32s, FCN-16s, and FCN-8s) on the PASCAL VOC 2012 dataset (Everingham et al., 2012) and the semantic boundary dataset (SBD) (Hariharan et al., 2011). The PASCAL VOC 2012 dataset contains 20 foreground object classes and one background class, with 1,464 (train), 1,449 (val), and 1,456 (test) pixel-level annotated images. The training set is augmented with extra annotations from the SBD (Hariharan et al., 2011), resulting in 10,582 training images.

Source Medical Image Testing Sets
In our experiments, we used 40 pairs of multi-modal medical images (each medical image fusion problem contains 10 image pairs) to demonstrate the usefulness and efficiency of the proposed methods. Most of the test images were gathered from the Whole Brain Atlas databases (Vidoni, 2012) and have been widely adopted in previous related publications (Liu and Wang, 2014;Liu et al., 2017Liu et al., , 2019Yin et al., 2018;Zhu et al., 2019). Each pair of images was geometrically aligned, and all the test images were normalized to 256 × 256.

State-of-the-Art Methods
Five state-of-the-art medical image fusion methods were collected for comparison against our proposed methods: the adaptive sparse representation (ASR) method (Liu and Wang, 2014), the Laplacian pyramid and CNN-based (LP-CNN) method (Liu et al., 2017), the NSCT-based method with phase congruency and local Laplacian energy rules (NSCT-PC-LLE) (Zhu et al., 2019), the NSST-based method with a parameter-adaptive PCNN (NSST-PAPCNN) (Yin et al., 2018), and the convolutional sparsity-based morphological component analysis (CSMCA) method (Liu et al., 2019).

Parameter Choices
The parameters of all compared methods were set to their default values. The key parameters of our proposed algorithms are given in Table 1. According to this table, the parameter λ in the LatLRR decomposition was set to 0.8 (Li and Wu, 2018), and the threshold τ in Equation (10) was set to 0.8. The PASCAL VOC 2012 dataset contains 20 foreground object classes and one background class, so C in S^j_{1:C} is equal to 21. Note that we directly adopted a transfer learning strategy to train the FCN-VGG16 architecture (Long et al., 2015) using the implementation by MacInnes, and the trained models were obtained after 50 epochs on the training data. The choice of the number of epochs was based on Figure 5. When the number of epochs was lower than 50, the accuracy on both the training and validation sets increased with the number of epochs. However, the validation accuracy leveled off beyond 50 epochs, although the training accuracy still increased, regardless of the scenario (FCN-32s, FCN-16s, or FCN-8s). In terms of the loss function, the values for all FCN architectures decreased with the number of epochs on the training set, but the loss of all scenarios tended to converge at 50 epochs. Therefore, to balance computational complexity and accuracy, the number of training epochs for the FCN models was set to 50.

Experimental Environment
All the experiments were implemented in MATLAB R2019a on a 64-bit Windows machine with an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz and 8 GB RAM. The models of the proposed method were trained with MATLAB R2019a + VS2017 + MatConvNet 1.0-beta25.

Objective Evaluation Metrics
In this study, five common representative quantitative metrics are adopted: (i) entropy (EN), (ii) the mutual information-based metric Q_MI, (iii) the edge-based similarity measure Q^{AB/F} (Xydeas et al., 2000), (iv) the sum of the correlations of differences (SCD) (Aslantas and Bendes, 2015), and (v) the visual information fidelity for fusion (VIFF) (Han et al., 2013).
(i) Entropy (EN) measures the amount of information contained in the fused image F:

$$\mathrm{EN} = -\sum_{l=0}^{255} p(l) \log_2 p(l),$$

where p(l) is the normalized histogram of gray level l in F.
(ii) Mutual information-based metric Q_MI: for two images C and D, the mutual information is defined as

$$\mathrm{MI}(C, D) = \sum_{c, d} p(c, d) \log \frac{p(c, d)}{p(c)\, p(d)},$$

where p(c) and p(d) denote the marginal probability distributions of C and D, respectively, and p(c, d) denotes the joint probability distribution of C and D. Therefore, the quality of the fused image F with respect to the input images Img_1 and Img_2 can be defined as:

$$Q_{MI} = \mathrm{MI}(Img_1, F) + \mathrm{MI}(Img_2, F).$$

(iii) Edge-based similarity measure Q^{AB/F}: the authors in Xydeas et al. (2000) proposed the metric Q^{AB/F} to measure the similarity between the edges transferred in the fusion process. This metric is defined as follows:

$$Q^{AB/F} = \frac{\sum_{u=1}^{N} \sum_{v=1}^{M} \left( Q^{AF}(u, v)\, w^{A}(u, v) + Q^{BF}(u, v)\, w^{B}(u, v) \right)}{\sum_{u=1}^{N} \sum_{v=1}^{M} \left( w^{A}(u, v) + w^{B}(u, v) \right)},$$

where A, B, and F represent the two input images (Img_1 and Img_2) and the fused image, respectively, each of size N × M, and w^A and w^B are perceptual weights. Q^{AF}(u, v) and Q^{BF}(u, v) are defined as follows:

$$Q^{AF}(u, v) = Q^{AF}_{g}(u, v)\, Q^{AF}_{\alpha}(u, v), \qquad Q^{BF}(u, v) = Q^{BF}_{g}(u, v)\, Q^{BF}_{\alpha}(u, v),$$

where Q^{*F}_g(u, v) and Q^{*F}_α(u, v) are the edge strength and orientation preservation values at location (u, v) for images A and B, respectively. The dynamic range of Q^{AB/F} is [0, 1], and a larger value indicates a better fusion result. For more details on this metric, please refer to Xydeas et al. (2000).
(iv) The sum of the correlations of differences (SCD) (Aslantas and Bendes, 2015) is a quality metric formulated as follows:

$$\mathrm{SCD} = r(D_1, Img_1) + r(D_2, Img_2),$$

where D_1 = F − Img_2, D_2 = F − Img_1, F is the fused image, and Img_1 and Img_2 are the input images. The r(·) function calculates the correlation between D_k and Img_k, given as:

$$r(D_k, Img_k) = \frac{\sum_{u, v} \left( D_k(u, v) - \bar{D}_k \right) \left( Img_k(u, v) - \overline{Img}_k \right)}{\sqrt{\sum_{u, v} \left( D_k(u, v) - \bar{D}_k \right)^2 \sum_{u, v} \left( Img_k(u, v) - \overline{Img}_k \right)^2}},$$

where k = 1, 2, and D̄_k and Īmg_k are the averages of the pixel values of D_k and Img_k, respectively.
(v) Visual information fidelity for fusion (VIFF) (Han et al., 2013): four steps are needed to obtain the VIFF. First, the source and fused images are filtered and then divided into blocks. Second, the visual information is evaluated with and without distortion information in each block. Third, the VIFF of each sub-band is calculated. Finally, the overall quality measure is determined by weighting the VIFF of each sub-band.
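As a concrete illustration, the following is a minimal NumPy sketch of two of these metrics, EN and SCD; the helper names are ours, and inputs are assumed to be 8-bit grayscale arrays.

```python
# A hedged sketch of the EN and SCD metrics.
import numpy as np

def entropy(img):
    """Shannon entropy (EN) of an 8-bit image, in bits."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def corr(a, b):
    """Pearson correlation coefficient between two arrays."""
    a = a - a.mean(); b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

def scd(F, Img1, Img2):
    """Sum of the correlations of differences (Aslantas and Bendes, 2015)."""
    F, Img1, Img2 = (x.astype(np.float64) for x in (F, Img1, Img2))
    # D1 = F - Img2 should carry the information contributed by Img1,
    # and vice versa; SCD sums the two correlations.
    return corr(F - Img2, Img1) + corr(F - Img1, Img2)
```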

Color Space Fusion
In our proposed methods, the YUV color space was used to handle the fusion of grayscale and RGB color (PET, SPECT) images. First, the RGB color image was converted into the YUV color space, resulting in three channel components Y, U, and V. Then, the grayscale image and the Y channel were fused using the proposed fusion methods described in section 2. Finally, the fused Y-channel component, together with the original U- and V-channel components, was transformed back from the YUV color space to RGB, obtaining the fused color image.
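A minimal OpenCV sketch of this color-handling step is shown below; fuse_gray is a hypothetical stand-in for the grayscale fusion pipeline of section 2, and the 8-bit clipping is our assumption.

```python
# A hedged sketch of YUV-space fusion of a grayscale and an RGB modality.
import cv2
import numpy as np

def fuse_color(gray_img, rgb_img, fuse_gray):
    yuv = cv2.cvtColor(rgb_img, cv2.COLOR_RGB2YUV)
    y, u, v = cv2.split(yuv)
    fused_y = fuse_gray(gray_img, y)        # fuse e.g. MR with the Y channel
    fused_y = np.clip(fused_y, 0, 255).astype(np.uint8)
    return cv2.cvtColor(cv2.merge([fused_y, u, v]), cv2.COLOR_YUV2RGB)
```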

RESULTS AND DISCUSSION
This section is devoted to showing that the proposed LatLRR-FCNs can improve detail extraction and energy preservation in terms of visual quality assessment (section 4.1), quantitative assessment (section 4.2), and computational cost assessment (section 4.3), compared with five recently proposed methods: ASR (Liu and Wang, 2014), LP-CNN (Liu et al., 2017), NSCT-PC-LLE (Zhu et al., 2019), NSST-PAPCNN (Yin et al., 2018), and CSMCA (Liu et al., 2019). In this study, the usefulness and efficiency of each method are investigated with four sets of medical image fusion studies, including CT and MR, MR-T1 and MR-T2, PET and MR, and SPECT and MR.

Visual Quality
The fusion examples of CT and MR images are given in Figure 6, where one representative region of each result is enlarged for better comparison. The ASR and CSMCA methods reveal a significant energy loss in both the CT and MR images (resulting in decreased intensity and contrast in the fused images), especially for the bone and lesion regions in Figures 6a3-c3,a7-c7. The fusion results of the NSCT-PC-LLE, LP-CNN, NSST-PAPCNN, and proposed methods preserve the information of the CT and MR modalities better. However, the NSCT-PC-LLE, LP-CNN, and NSST-PAPCNN methods cannot extract the detailed information in the MR image well, as seen in Figures 6a4-c4,a5-c5,a6-c6 and the corresponding highlighted close-ups. Furthermore, the ASR method fails to extract the structural and edge details of the CT modality (see Figures 6a3-c3). The NSCT-PC-LLE and NSST-PAPCNN methods outperform the ASR method, even though some structural details still cannot be extracted (see Figures 6a3-c3,a4-c4,a6-c6). In contrast, the proposed frameworks and the LP-CNN method can effectively extract the structural and edge details from both the CT and MR modalities (see Figures 6a8-a10,b8-b10,c8-c10 and Figures 6a5-c5, respectively).
Figure 7 shows the fusion examples of MR-T1 and MR-T2 images. The proposed methods preserve the detailed and structural information well in all three examples, whereas the NSCT-PC-LLE, LP-CNN, and NSST-PAPCNN methods cannot preserve the detailed information (see the close-ups in Figures 7a4-c4,a5-c5,a6-c6, respectively). Furthermore, the ASR and NSCT-PC-LLE methods exhibit a lower ability to extract the structural and edge details of the MR-T1 modality, as shown by the close-ups in Figures 7a3-c3,a4-c4. Finally, compared with the other tested methods, our proposed LatLRR-FCN-based methods achieve the best performance, as shown by the close-ups in Figures 7a8-c8,a9-c9,a10-c10.
Figure 8 shows the three fusion examples of MR and PET images. The ASR and CSMCA methods lose a significant amount of energy in both the MR and PET modalities, as seen in Figures 8a3-c3,a7-c7 and the corresponding close-ups. The NSCT-PC-LLE and LP-CNN methods suffer from severe color distortion (see the close-ups in Figures 8a4-c4,a5-c5), and some color distortion also exists in the fusion results of the NSST-PAPCNN method (see Figures 8a6-c6 and the close-ups). Overall, the color preservation of our proposed algorithms (see Figures 8a8-c8,a9-c9,a10-c10 together with their close-ups) is significantly better than that of the other methods.
The fusion examples of three sets of MR and SPECT images are shown in Figure 9. The ASR and CSMCA methods still lose much energy in both the MR and SPECT modalities (see Figures 9a3-c3,a7-c7). Moreover, color distortion exists in the results of the NSCT-PC-LLE and LP-CNN methods (see the close-ups in Figures 9a4-c4,a5-c5). Color distortion also appears in the results of the NSST-PAPCNN method (see Figures 9a6-c6, especially the close-ups). The visual quality of the color preservation of our proposed methods significantly outperforms that of the others.

Quantitative Assessment
Here, the five quantitative metrics described in section 3.6 are employed to appraise the fusion performance. The average scores of each method on the four fusion problems are listed in Table 2, from which the following conclusions can be drawn for the different metrics.
(1) For the EN metric, our proposed techniques achieve the best energy preservation in all four medical image fusion problems.
(2) The Q_MI metric shows that our proposed LatLRR-FCN-8s and LatLRR-FCN-16s architectures extract detail information better than the other methods for the CT and MR and the PET and MR fusion problems.
(3) In terms of the Q^{AB/F} metric, our proposed frameworks are comparable to the other algorithms in edge and direction retention.
(4) For the SCD metric, our proposed methods achieve a higher cross-correlation between the fused image and the input images than the others in all four medical image fusion problems.
(5) For the VIFF metric, compared with the other methods, our proposed approaches are more consistent with the visual mechanism of the human eye in all four medical image fusion problems.
Moreover, Figure 10 shows the objective performance of the different methods on each fusion problem, where the ten scores of each method on each fusion problem are connected for each metric. The three proposed methods clearly show the best performance overall. More specifically, the proposed LatLRR-FCNs occupy the top three ranks on the EN, SCD, and VIFF metrics for all four problems, consistent with the conclusions drawn from Table 2.

Computational Cost Assessment
The average computational costs of the different methods, for both gray-level and color images, are shown in Table 3. Although LP-CNN, NSCT-PC-LLE, and NSST-PAPCNN are faster than the proposed methods, the proposed methods achieve better performance in terms of both visual perception and objective assessment. Moreover, the processing costs of ASR and CSMCA are about 6 and 10 times higher, respectively, than those of our proposed methods. Overall, the experimental results show that the proposed methods can achieve competitive performance in terms of computational costs in practice.

CONCLUSION
In this paper, three LatLRR-FCNs have been proposed to improve energy preservation and detail extraction in medical image fusion. Based on LatLRR, the LatLRR-FCNs decompose a medical image into low-rank and saliency components, which enhances the extraction of details compared with SR-based methods. Then, three different fully convolutional networks (FCN-32s, FCN-16s, and FCN-8s), together with the ZCA, l1-norm, WLE, and WSEML operations and a pyramid-based fusion method, are applied to fuse the low-rank components, which simultaneously enhances energy preservation and detail extraction. We sum the saliency components to obtain the fused saliency components. Finally, the fused image is obtained by combining the fused low-rank components and fused saliency components. The proposed frameworks were evaluated on four kinds of medical image fusion problems (CT and MR, MR-T1 and MR-T2, PET and MR, and SPECT and MR), where they achieved competitive or superior performance compared with five state-of-the-art methods in terms of visual quality, objective assessment, and computational cost.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: http://www.med.harvard.edu/aanlib/.

AUTHOR CONTRIBUTIONS
ZX and JL: conceptualization. ZX and WX: methodology and writing-review and editing. ZX: software, visualization, and writing-original draft preparation. SZ, CM-C, and ZC: validation. RZ and XC: formal analysis. JL: resources and supervision. BL: data curation and project administration. All authors have read and agreed to the published version of the manuscript.