Multi-Modal Medical Image Fusion With Geometric Algebra Based Sparse Representation

Multi-modal medical image fusion can reduce information redundancy, increase the understandability of images and provide medical staff with more detailed pathological information. However, most traditional methods treat the channels of multi-modal medical images as three independent grayscale images, which ignores the correlation between the color channels and leads to color distortion, attenuation and other adverse effects in the reconstructed image. In this paper, we propose a multi-modal medical image fusion algorithm with geometric algebra based sparse representation (GA-SR). Firstly, the multi-modal medical image is represented as a multi-vector, and the GA-SR model is introduced for multi-modal medical image fusion to avoid losing the correlation between channels. Secondly, the orthogonal matching pursuit algorithm based on geometric algebra (GAOMP) is introduced to obtain the sparse coefficient matrix. The K-means clustering singular value decomposition algorithm based on geometric algebra (K-GASVD) is introduced to obtain the geometric algebra dictionary and to update the sparse coefficient matrix and the dictionary. Finally, we obtain the fused image by a linear combination of the geometric algebra dictionary and the coefficient matrix. The experimental results demonstrate that the proposed algorithm outperforms existing methods in both subjective and objective quality evaluation, which shows its effectiveness for multi-modal medical image fusion.


INTRODUCTION
Medical image fusion integrates technologies from many fields, such as computer technology, sensor technology, artificial intelligence, and image processing. It extracts the image information collected by different sensors and concentrates it in a single image, which can reduce the information redundancy of the image, enhance its readability and provide more specific disease information for diagnosis (Riboni and Murtas, 2019;Li et al., 2021;Wang et al., 2022).
According to the types of fused images, medical image fusion can be divided into unimodal medical image fusion and multi-modal medical image fusion (Tirupal et al., 2021). Unimodal medical images are multiple images of a patient's organ collected by the same device, which are combined into one image by a corresponding fusion algorithm. The purpose is to collect image information under different contrasts (Zhang et al., 2021). Multi-modal medical images are images obtained by different imaging methods. Different types of medical images contain different information, and the fused image can summarize the various features to provide medical staff with more comprehensive pathological information (Zhu et al., 2017). Common medical images include CT images, MR images, and SPECT images (Thieme et al., 2012;Nazir et al., 2021;Engudar et al., 2022).
Multi-modal medical image fusion mainly includes the following methods: morphological methods, knowledge-based methods, wavelet-based methods, neural network-based methods, fuzzy logic-based methods, and so on (James and Dasarathy, 2014). Naeem et al. (2016) used the discrete wavelet transform (DWT) to fuse images with different details, which changed the uniformity of the details contained in the fused image. Guruprasad et al. (2013) proposed an image fusion algorithm based on DWT-DBSS and used the maximum selection rule to obtain the detail fusion coefficients. Alfano et al. (2007) presented a wavelet-based method to fuse medical images according to the MRA approach, which aims to put the right "semantic" content in the fused image by applying two different quality indexes: variance and modulus maxima. Marshall and Matsopoulos (2002) presented a hierarchical image fusion scheme that preserves the details of the input images that are most relevant for visual perception.
Sparse representation (Shao et al., 2020) exploits the natural sparsity of signals, in line with the physiological properties of the human visual system. It represents a signal as a linear combination of dictionary atoms and sparse coefficients, using as few atoms as possible from a given overcomplete dictionary. Yang and Li (2010) first introduced sparse representation into image fusion and adopted the sliding window technique to make the fusion process robust to noise and misregistration. Zong and Qiu (2017) proposed a fusion method based on classified image blocks, which uses the histogram of oriented gradients feature to classify image blocks and establish sub-dictionaries. It can reduce the loss of image details and improve the quality of image fusion.
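To make the idea of representing a signal with as few atoms as possible concrete, the following is a minimal sketch of classical real-valued orthogonal matching pursuit (OMP). It is not the GAOMP algorithm used later in this paper, which operates on multi-vectors; the function name `omp` and the parameters `max_atoms` and `tol` are assumptions chosen for illustration.

```python
import numpy as np

def omp(D, x, max_atoms, tol=1e-6):
    """Greedy real-valued OMP: approximate x with at most max_atoms columns of D."""
    coeffs = np.zeros(D.shape[1])
    residual = np.asarray(x, dtype=float).copy()
    support = []
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coeffs[support] = sol
    return coeffs

# Toy example with an orthonormal dictionary (assumption: identity atoms).
D = np.eye(8)
x = np.zeros(8)
x[1], x[5] = 2.0, -3.0
a = omp(D, x, max_atoms=4)
```

With an orthonormal dictionary the greedy selection recovers the two active atoms exactly; with an overcomplete dictionary, recovery depends on how correlated the atoms are.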
Traditional sparse representation fusion methods usually process the color channels separately, which easily destroys the correlation between image channels and results in loss of color in the fused image. Geometric algebra (GA) is regarded as one of the most powerful tools in multidimensional signal processing and has been applied successfully in a wide range of fields, such as physics, quantum computing, electromagnetism, satellite navigation, neural computing, camera geometry, image processing, robotics, and computer vision (da Rocha and Vaz, 2006;Wang et al., 2019a;Wang et al., 2021a). Inspired by Wang et al. (2019b), geometric algebra-based sparse representation (GA-SR) is introduced for multi-modal medical image fusion in this paper.
The rest of this paper is organized as follows. Section 2 introduces the basic knowledge of geometric algebra. Section 3 introduces the GA-SR algorithm and the fusion steps of the proposed algorithm. Section 4 provides the experimental analysis, including subjective and objective quality evaluations. Finally, Section 5 concludes the paper.

GEOMETRIC ALGEBRA
Geometric algebra combines quaternions and Grassmann algebra, and can extend operations to higher-dimensional spaces. The geometric algebra space does not rely on coordinate information for calculation (Batard et al., 2009), and all geometric operators are included in the space. Any multi-modal medical image can be represented on the orthonormal basis of the geometric algebra as a multi-vector and processed as a whole, which preserves the correlation between the channels of the image (da Rocha and Vaz, 2006;López-González et al., 2016).
The orthonormal basis of the geometric algebra space $G_n$ satisfies the basic operation rules in Eqs 2-5, where $\wedge$ denotes the outer product, $\cdot$ denotes the inner product, and $\gamma_i \gamma_j$ denotes the geometric product of $\gamma_i$ and $\gamma_j$, which is equal to the sum of their inner and outer products, $\gamma_i \gamma_j = \gamma_i \cdot \gamma_j + \gamma_i \wedge \gamma_j$.
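The behavior of the geometric product on an orthonormal basis can be illustrated with a short sketch. It assumes a Euclidean signature (every basis vector squares to +1) and stores each basis blade as a bitmask, so that bit k stands for the (k+1)-th basis vector; the names `reorder_sign` and `geometric_product` and the dictionary-based multi-vector layout are illustrative assumptions, not notation from this paper.

```python
def reorder_sign(a: int, b: int) -> int:
    """Sign produced when the product of basis blades a and b (bitmasks)
    is rearranged into canonical order; each swap of two distinct basis
    vectors contributes a factor of -1."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1

def geometric_product(u: dict, v: dict) -> dict:
    """Geometric product of two multi-vectors given as {blade_bitmask: coefficient}.
    Assumes a Euclidean metric, so repeated basis vectors cancel (XOR of bitmasks)."""
    out = {}
    for blade_a, coeff_a in u.items():
        for blade_b, coeff_b in v.items():
            blade = blade_a ^ blade_b
            sign = reorder_sign(blade_a, blade_b)
            out[blade] = out.get(blade, 0.0) + sign * coeff_a * coeff_b
    return {blade: c for blade, c in out.items() if c != 0}

# Basis of G_2: scalar (0b00), gamma_1 (0b01), gamma_2 (0b10), gamma_12 (0b11).
E1, E2, E12 = 0b01, 0b10, 0b11
u = {E1: 1.0}
v = {E1: 1.0, E2: 1.0}
uv = geometric_product(u, v)  # scalar part: inner product, E12 part: outer product
vu = geometric_product(v, u)  # the outer-product part changes sign
```

In this example the scalar component of `uv` is the inner product of the two vectors and the `E12` component is their outer product, and swapping the operands flips only the outer-product part, which shows the non-commutativity of the geometric product.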

MULTI-MODAL MEDICAL IMAGE FUSION BASED ON GEOMETRIC ALGEBRA BASED SPARSE REPRESENTATION
In this section, the GA-SR based multi-modal medical image fusion is provided.

Geometric Algebra Based Sparse Representation Model
The sparse representation model of a GA multi-vector can be defined as

$$\min_{a} \|a\|_0, \quad \text{s.t.} \; q = Da \tag{6}$$

where $D$ is a geometric algebra dictionary containing $M$ dictionary atoms, and $a$ is a sparse coefficient vector in geometric algebra form. $\|a\|_0$ is the objective function, which counts the number of non-zero elements in the vector $a$. The multi-modal medical image based on the GA-SR model can be described as in Eq. 7. A three-channel multi-modal medical image can be converted into a vector $q' \in (G_2)^N$ of length $N$, and the vector $q'$ can be expressed as shown in Eq. 8. For a three-channel multi-modal medical image, the sparse representation model can be defined as shown in Eq. 9

$$\min_{a'} \|a'\|_0, \quad \text{s.t.} \; q' = D'a' \tag{9}$$

where $D'$ is the geometric algebra dictionary, $a'$ is the corresponding geometric algebra coefficient vector, and $\|a'\|_0$ counts the number of non-zero elements in the vector $a'$.
Therefore, the GA-SR model of the three-channel medical image can be described as in Eq. 10. The general form of the sparse coefficient matrix of a three-channel medical image can be obtained by Eq. 11.

The Representation of Multi-Modal Medical Image
Any pixel $F$ of a multi-modal medical image can be represented as a multi-vector in $G_n$ space, as shown in Eq. 12, where $\gamma_i$, $\gamma_{ij}$, ..., $\gamma_{1 \ldots n}$ are the orthonormal basis elements of the geometric algebra, and $F_i(i, j)$, $F_{ij}(i, j)$, ..., $F_{1 \ldots n}(i, j)$ represent the pixels of the multi-modal medical image at $(i, j)$. Each channel of a multi-modal medical image can be encoded on an orthonormal basis element of the geometric algebra. Therefore, a multi-modal medical image $Q \in (G_n)^{M \times N}$ of size $M \times N$ can be expressed as in Eq. 13. Assume that the multi-modal medical image is divided into image blocks, where $N$ represents the size of an image block and $K$ represents the number of image blocks. Each block can be converted into a vector of length $N$, and the geometric algebra form of the image block $q$ is shown in Eq. 14.
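As a rough illustration of encoding channels on basis elements, the sketch below stores a three-channel image as an array of $G_2$ multi-vectors with four components per pixel. The component order (scalar, $\gamma_1$, $\gamma_2$, $\gamma_{12}$), the choice to leave the scalar part at zero, and the helper names `to_multivector_image` and `patch_to_vector` are all assumptions made for this example; the exact layout used by the paper is defined by its own equations.

```python
import numpy as np

def to_multivector_image(img):
    """Map an (H, W, 3) image to an (H, W, 4) array of G_2 multi-vectors.
    Assumed layout: [scalar, gamma_1, gamma_2, gamma_12], scalar part set to 0."""
    h, w, c = img.shape
    if c != 3:
        raise ValueError("expected a three-channel image")
    mv = np.zeros((h, w, 4), dtype=float)
    mv[..., 1:] = img  # one channel per non-scalar basis element
    return mv

def patch_to_vector(mv, top, left, n):
    """Flatten the n x n block starting at (top, left) into a vector of
    N = n * n multi-vectors, returned as an (N, 4) array."""
    return mv[top:top + n, left:left + n].reshape(n * n, 4)

# Small synthetic image (hypothetical data, for shape checking only).
img = np.arange(48, dtype=float).reshape(4, 4, 3)
mv = to_multivector_image(img)
q = patch_to_vector(mv, top=1, left=1, n=2)
```

Keeping the three channels inside one multi-vector per pixel means that any later product or decomposition acts on the channels jointly rather than on three independent grayscale images.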

The Proposed Fusion Algorithm
Let $M_1$ and $M_2$ represent the two multi-modal medical source images. The framework of GA-SR based multi-modal medical image fusion is shown in Figure 1.
(1) The sliding window technique is used to divide the two source images into sub-image blocks. The sliding window is generally of size $n \times n$ with a step size of 1. The image blocks are converted into column vectors, and the column vectors formed from the $i$-th image blocks of the two source images are denoted $x_i^1$ and $x_i^2$.
(2) The sparse representation coefficients $\alpha_i^1$ and $\alpha_i^2$ of the column vectors are calculated with the GAOMP algorithm of Wang et al. (2019b), as described in Eqs 15 and 16, where $D$ represents the adaptive dictionary of image blocks obtained by dictionary training, and $\alpha_i^1$ and $\alpha_i^2$ are the sparse coefficient vectors obtained by GAOMP, which can be combined to obtain a sparse coefficient matrix. $\|\alpha_i\|_0 \le J$ is the cutoff condition for dictionary training.
(3) The fused sparse coefficient matrix is obtained by the L1-norm maximum rule (Yanan et al., 2020). The L1 norm is the sum of the absolute values of the elements. It is the optimal convex approximation of the L0 norm, and it is more efficient and easier to optimize than the L0 norm. The L1 norms of the corresponding columns of the two sparse coefficient matrices are calculated, and the column with the larger norm is used as the column of the fused sparse coefficient matrix. The fusion rule for the sparse coefficients is given in Eq. 17.
(4) A dictionary training algorithm is used to obtain the dictionary required for sparse representation. K-SVD is a classic dictionary training algorithm in sparse representation (Fu et al., 2019). The K-GASVD algorithm of Wang et al. (2019b) consists of two steps, sparse coding (Sandhya et al., 2021) and dictionary update (Thanthrige et al., 2020). The K-GASVD algorithm is used to train and update the dictionary on the obtained sparse coefficient matrix.
(5) The fusion result of $x_i^1$ and $x_i^2$ is obtained according to the GA-SR model of the three-channel multi-modal medical image, as shown in Eq. 18.
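Steps (1), (3) and (5) can be sketched for a single real-valued channel as follows. This is a simplified illustration, not the GA-SR implementation: the coefficients here are real numbers rather than multi-vectors, ties between equal L1 norms are resolved in favor of the first image, and overlapping blocks are averaged when the image is reassembled. The last two choices, and the function names `extract_patches`, `fuse_max_l1` and `reconstruct`, are assumptions.

```python
import numpy as np

def extract_patches(img, n, step=1):
    """Slide an n x n window over a 2-D image and return an (n*n, K) matrix
    whose columns are the vectorized blocks (row-major scan order)."""
    h, w = img.shape
    cols = [img[r:r + n, c:c + n].reshape(-1)
            for r in range(0, h - n + 1, step)
            for c in range(0, w - n + 1, step)]
    return np.stack(cols, axis=1)

def fuse_max_l1(A1, A2):
    """Column-wise max-L1 rule: for each block keep the coefficient column
    with the larger L1 norm (assumption: ties go to A1)."""
    take_first = np.abs(A1).sum(axis=0) >= np.abs(A2).sum(axis=0)
    return np.where(take_first, A1, A2)

def reconstruct(patches, shape, n, step=1):
    """Put blocks back at their positions and average overlaps.
    With step=1 every pixel is covered at least once."""
    h, w = shape
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    k = 0
    for r in range(0, h - n + 1, step):
        for c in range(0, w - n + 1, step):
            acc[r:r + n, c:c + n] += patches[:, k].reshape(n, n)
            cnt[r:r + n, c:c + n] += 1
            k += 1
    return acc / cnt

# Toy data (hypothetical): a 4 x 4 image and two small coefficient matrices.
img = np.arange(16, dtype=float).reshape(4, 4)
P = extract_patches(img, n=2)
A1 = np.array([[1.0, 0.0], [0.0, 0.0]])
A2 = np.array([[0.0, 3.0], [0.0, -1.0]])
AF = fuse_max_l1(A1, A2)
```

Given a trained dictionary `D`, the fused blocks would then be obtained as `D @ AF` and passed to `reconstruct`, mirroring the linear combination of dictionary and coefficients in step (5).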

EXPERIMENTAL ANALYSIS
In order to verify the effectiveness of the GA-SR based multi-modal medical image fusion, experiments are carried out in MATLAB on four groups of multi-modal medical images selected from the Harvard Medical School database. The proposed algorithm is compared with existing methods, such as the Laplacian Pyramid algorithm (Liu et al., 2019), the DWT-DBSS algorithm (Guruprasad et al., 2013), the SIDWT-Haar algorithm (Xin et al., 2013) and the Morphological Difference Pyramid algorithm (Matsopoulos et al., 1995). The source images are SPECT images obtained with different radionuclide elements. The spatial resolution of each image is 256 × 256. The source images used in the experiments are shown in Figure 2.

Subjective Quality Evaluation
The multi-modal medical images are fused by six different algorithms, and the results are shown in Figures 3-6. In each group, panels A and B are the source images used in the experiment, and panels C-H are the fused results obtained by the six algorithms. Subjectively, the edges of the images obtained by the first four algorithms are relatively complete, but the middle part is darker. The contrast and clarity of these images are low, which indicates that these four algorithms cannot fuse the two source images completely. As a result, the fused image information is incomplete. The fused images obtained by the SR algorithm and the GA-SR algorithm are relatively complete and clear, and comprehensively cover the color and structure information of the two source images. However, the images obtained by the SR algorithm contain multiple red spots of different sizes, which distort the result. The red spots cover the correct information of the source images, which is not conducive to clinical diagnosis. As can be seen from panel H of each group, the images are relatively clear, and there is no obvious occluded area. The contrast of the images is relatively high, which indicates that the fused images obtained by the GA-SR algorithm comprehensively cover the source images. It can provide comprehensive pathological information.

Objective Quality Evaluation
Objective evaluation indicators are adopted to assess image quality. In this paper, four indicators, CC (Correlation Coefficient) (Li and Dai, 2009), PSNR (Peak Signal to Noise Ratio) (Hore and Ziou, 2010), RMSE (Root Mean Square Error) (Zhao et al., 2020) and Joint-Entropy (Okarma and Fastowicz, 2020), are used to analyze the performance of the six fusion algorithms. The results for the four groups of images are shown in Tables 1-4.
For all four groups, the CC of the images obtained by the GA-SR algorithm is higher than that obtained by the other algorithms, indicating that the images obtained by the GA-SR algorithm are more strongly correlated with the source images and that the obtained image information is more complete. At the same time, the PSNR of the images obtained by the GA-SR algorithm is higher, and the RMSE correspondingly lower, than those obtained by the other algorithms, indicating that the fused images obtained by the GA-SR algorithm are closer to the source images and have less distortion and more comprehensive information (Xiao et al., 2021;Gao et al., 2022a;Gao et al., 2022b;Gao et al., 2022c).
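For reference, three of the indicators can be computed as in the sketch below, which uses the standard textbook definitions (Pearson correlation for CC, and PSNR derived from RMSE with an assumed peak value of 255 for 8-bit images). The paper's exact formulas may differ in normalization, and the function names are illustrative.

```python
import numpy as np

def rmse(ref, img):
    """Root mean square error between a reference image and a fused image."""
    ref = np.asarray(ref, dtype=float)
    img = np.asarray(img, dtype=float)
    return float(np.sqrt(np.mean((ref - img) ** 2)))

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit intensities."""
    err = rmse(ref, img)
    return float("inf") if err == 0 else float(20.0 * np.log10(peak / err))

def cc(ref, img):
    """Pearson correlation coefficient between two images."""
    a = np.asarray(ref, dtype=float).ravel()
    b = np.asarray(img, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)))

# Toy check (hypothetical data): a reference and a copy shifted by one gray level.
ref = np.array([[0.0, 10.0], [20.0, 30.0]])
shifted = ref + 1.0
```

Because PSNR is a decreasing function of RMSE for a fixed peak value, a higher PSNR against the same reference always goes together with a lower RMSE.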

Further Analysis
Dictionary training is very important for sparse representation, and the quality of the dictionary directly affects the quality of image fusion. The dictionaries trained by the K-SVD and K-GASVD algorithms are shown in Figure 7. Figure 7A is the dictionary image obtained by the K-SVD algorithm, and Figure 7B is the dictionary image obtained by the K-GASVD algorithm. The colors of the dictionary image obtained by the K-SVD algorithm are clearly rather monotonous. This is because the K-SVD algorithm cannot fully handle the spectral components of the source image, so the generated dictionary image contains a large number of gray image blocks. The dictionary image of K-GASVD contains richer and more comprehensive information. In order to verify the effect of the number of dictionary atoms on the quality of the fused image, we change the number of dictionary atoms to obtain different dictionary images, and then obtain the corresponding fused images. The relationship between the PSNR of the fused images and the number of dictionary atoms is shown in Figure 8.
We find that, as the number of dictionary atoms increases, the PSNR of the fused images obtained by the K-GASVD model is significantly higher than that of the K-SVD model. Moreover, for the same PSNR, the number of dictionary atoms required by the K-GASVD model is about 3/10 of the number required by the K-SVD model. Therefore, K-GASVD achieves the same fusion performance with significantly fewer atoms, and its atoms can present more colorful structures.
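To show what the dictionary-update step does to the atoms, here is a minimal sketch of one update sweep of the classical real-valued K-SVD algorithm. It is given only as a reference point: K-GASVD performs the analogous update on geometric algebra dictionaries, which this sketch does not implement. The function name `ksvd_update` and the random test data are assumptions.

```python
import numpy as np

def ksvd_update(D, A, X):
    """One dictionary-update sweep of real-valued K-SVD.
    D: (n, M) dictionary, A: (M, K) sparse coefficients, X: (n, K) signals."""
    D = D.copy()
    A = A.copy()
    for k in range(D.shape[1]):
        users = np.flatnonzero(A[k])  # signals whose code uses atom k
        if users.size == 0:
            continue  # unused atoms are left unchanged
        # Residual of those signals with atom k's own contribution added back.
        E = X[:, users] - D @ A[:, users] + np.outer(D[:, k], A[k, users])
        # Best rank-1 approximation of E gives the new atom and its coefficients.
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]
        A[k, users] = s[0] * Vt[0]
    return D, A

# Hypothetical random data, used only to exercise the update.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 30))
D0 = rng.standard_normal((8, 12))
D0 /= np.linalg.norm(D0, axis=0)
A0 = rng.standard_normal((12, 30)) * (rng.random((12, 30)) < 0.2)
D1, A1 = ksvd_update(D0, A0, X)
err_before = np.linalg.norm(X - D0 @ A0)
err_after = np.linalg.norm(X - D1 @ A1)
```

Each atom update replaces one rank-1 term by the best rank-1 fit to its residual, so the overall representation error cannot increase during the sweep, while the sparsity pattern of the coefficients is preserved.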
In terms of computational complexity, the proposed algorithm usually requires longer computation time for multi-modal medical image fusion than existing real-valued algorithms, because of the non-commutativity of the geometric product. Inspired by the work of Wang et al. (2021b), reduced geometric algebra (RGA) will be introduced to lower the computational complexity of our algorithm.

CONCLUSION
In this paper, the multi-modal medical image is represented as a multi-vector, and the GA-SR model is introduced for multi-modal medical image fusion to avoid losing the correlation between channels. In addition, a dictionary learning method based on geometric algebra is provided to obtain more specific disease information for diagnosis. The experimental results validate its rationality and effectiveness. In future work, we will focus on the analysis and diagnosis of pathological information using GA-SR based multi-modal medical image fusion.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
YL: contributed to the conception of the study; NF: performed the experiment, performed the data analyses and wrote the manuscript; HW: contributed significantly to analysis and manuscript preparation; RW: helped perform the analysis with constructive discussions.