Multi-Exposure Image Fusion Algorithm Based on Improved Weight Function

High-dynamic-range (HDR) images have a wide range of applications, but acquiring them directly is difficult. Multi-exposure image fusion (MEF) techniques have attracted wide attention because they can produce images similar to HDR images. To reduce the detail loss that occurs during image reconstruction in MEF, moderate exposure evaluation and relative brightness are used as joint weight functions. Built on the existing Laplacian pyramid fusion algorithm, the improved weight function captures image details more accurately, thereby making the fused image more detailed. On 20 sets of multi-exposure image sequences, six multi-exposure image fusion methods are compared in both subjective and objective terms. Both qualitative and quantitative analyses of the experimental results confirm that the proposed multi-scale decomposition image fusion method can produce high-quality HDR images.


INTRODUCTION
Due to the limited dynamic range of imaging equipment, existing devices cannot capture all the details of a scene with a single exposure. Therefore, underexposure or overexposure often occurs in daily shooting, which seriously affects the visual quality of images and the display of key information. High-dynamic-range (HDR) imaging techniques overcome this limitation, but most standard monitors currently in use have a low dynamic range (LDR) (Ma et al., 2015a). A tone mapping process is therefore required to compress the dynamic range of HDR images for display after they are acquired. Multi-exposure image fusion (MEF) methods offer a cost-effective way to resolve the dynamic range mismatch between HDR imaging and LDR display. Source image sequences with different exposure levels are taken as input, and their brightness information is fused in accordance with the human visual system (Ma et al., 2017) to generate HDR images with rich information and good perceptual quality.
In recent years, many MEF algorithms have been developed. Like multi-source image fusion (Jin et al., 2021a), MEF algorithms are usually divided into four categories (Liu et al., 2020): spatial domain methods, transform domain methods, combinations of spatial domain and transform domain methods, and deep learning methods (Jin et al., 2021b). This article mainly studies MEF methods in the spatial domain, which obtain the fused image as a weighted sum of the input exposures. Different MEF methods use different techniques to obtain a suitable weight map. Li et al. (2013) obtained the corresponding base and detail layers by decomposing the source image at two scales, and then processed them separately to obtain the final fused image. Liu and Wang (2015) applied the dense scale-invariant feature transform (SIFT) (Liu et al., 2016) to obtain both contrast and spatial consistency weights based on local gradient information. Mertens et al. (2010) applied multi-resolution exposure sequences to Laplacian pyramid-based image fusion. The weighted average value was first calculated from the weights determined by contrast, saturation and good exposure, and then applied to obtain the pyramid coefficients. Finally, image fusion was achieved by reconstructing the obtained pyramid coefficients. Shen et al. (2014) proposed an exposure fusion method based on hybrid exposure weights and an improved Laplacian pyramid. This method considers the gradient vectors between different exposure source images, and uses an improved Laplacian pyramid to decompose input signals into both base and detail layers. Shen et al. (2011) proposed a probability model of MEF. According to two quality indicators of the source image sequences, local contrast and color consistency, the generalized random walk framework was first used to calculate the optimal probability set.
Then, the obtained probability set was used as the corresponding weights to realize image fusion. Fei et al. (2017) applied an image smoothing algorithm based on weighted least squares to MEF to extract the details of HDR scenes. The extracted detail information was used in the multi-scale exposure fusion algorithm to achieve image fusion, so that fused images with rich colors and detailed information can be obtained. Li and Kang (2012) proposed a fusion method based on a weighted sum. First, three image features, local contrast, brightness and color differences, are measured to estimate the weight, and then the weight map is optimized by a recursive filter. Zhang and Cham (2012) proposed a simple and effective method that uses gradient information to complete multi-exposure image synthesis in static and dynamic scenes. Given multiple images with different exposures, the method can seamlessly synthesize them under the guidance of gradient-based quality evaluation, so as to produce a pleasant tone-mapped high dynamic range image. Ma et al. (2017) proposed a method based on image structure block decomposition, which represents each image block with average intensity, signal intensity and signal structure, and then uses the intensity and exposure factor of the image block for weighted fusion. It can be used for both static and dynamic scene fusion. Moriyama et al. (2019) proposed using a light conversion method that preserves hue and saturation to generate new multi-exposure images for fusion, realizing brightness conversion based on local color correction and obtaining the fused image by weighted average (the weight is calculated from saturation).
Wang and Zhao (2020) proposed using super-pixel segmentation to divide the input image into non-overlapping image blocks composed of pixels with similar visual attributes, decomposing each image block into three independent components (signal intensity, image structure and intensity), and then fusing the three components separately according to the characteristics of the human visual system and the exposure level of the input image. Qi et al. (2020) used an exposure quality prior to select the reference image, used the reference image in a structural consistency test to solve the ghosting problem in dynamic scenes, and then decomposed the image with a guided filter. They proposed a fusion method combining spatial domain scale decomposition, image block structure decomposition and moderate exposure evaluation. Another work proposed a multi-exposure image fusion algorithm based on an improved pyramid transform. The algorithm improves the local contrast information of the image with adaptive histogram equalization, and calculates the image fusion weight coefficients from contrast information, image entropy and exposure. Hayat and Imran (2019) proposed a ghosting-free multi-exposure image fusion technique based on the dense SIFT descriptor and guided filter. Ulucan et al. (2020) proposed a new, simple and effective still image exposure fusion method. This technique uses weight map extraction based on linear embedding and watershed masking. Xu et al. (2021) proposed a new multi-exposure image fusion method based on the tensor product and tensor singular value decomposition (t-SVD). A new fusion strategy is designed by using the tensor product and t-SVD. The luminance and chrominance channels are fused separately to maintain color consistency. Finally, the chrominance and luminance channels are combined to obtain the fused image.
Both the multi-scale decomposition method and the fusion strategy for multi-scale coefficients determine the performance of an image fusion framework based on multi-scale decomposition. Pyramid transformation is a commonly used multi-scale decomposition method. Because the scales and resolutions differ, each decomposition layer carries different image feature information. In addition, the design of the weight function used for feature extraction plays a decisive role in the final fusion result. Therefore, this article proposes a fast and effective image fusion method based on an improved weight function. The fusion weight map is calculated through the evaluation of moderate exposure and relative brightness. Combined with pyramid multi-scale decomposition, images with different resolutions are fused to generate the required high dynamic range image.
The rest of this article is organized as follows. The second section describes the overall process of the fusion algorithm. The third section explains the weight function in detail. The fourth section describes the Gaussian pyramid decomposition and Laplacian pyramid decomposition of images. The fifth section presents the experimental results and analysis. The sixth section summarizes this article.

WORKFLOW OF IMAGE FUSION ALGORITHM
MEF aims to generate an image containing the best pixel information from a series of images with different exposure levels. Pixel-based MEF performs weighted image fusion as follows.

$$FI(x, y) = \sum_{n=1}^{N} W_n(x, y)\, I_n(x, y)$$

where $FI$ represents the fused image, $(x, y)$ represents pixel coordinates, $N$ represents the number of images, $I_n$ represents the pixel intensity of the nth image, and $W_n$ represents the pixel weight of the nth image. The workflow of the proposed image fusion based on the improved weight function is shown in Figure 1. Labels such as Equation (7) and Equation (8) in Figure 1 refer to the formulas below and indicate that the corresponding operation is performed at that step. The symbol before Equation (12) in Figure 1 represents the multiplication sign.
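As a minimal illustration of the pixel-wise weighted sum described above, the following NumPy sketch fuses a stack of single-channel exposures. The function name `fuse_weighted`, the `(N, H, W)` array layout, and the per-pixel normalization of the weights are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def fuse_weighted(images, weights, eps=1e-12):
    """Pixel-wise weighted sum of N exposures.

    images, weights: arrays of shape (N, H, W), intensities in [0, 1].
    The weights are normalized per pixel so that they sum to 1
    (an assumption of this sketch).
    """
    images = np.asarray(images, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    norm = weights / (weights.sum(axis=0, keepdims=True) + eps)
    return (norm * images).sum(axis=0)

# Toy example: two 1x2 "images" with hand-picked weights.
imgs = np.array([[[0.2, 0.8]], [[0.6, 0.4]]])
w = np.array([[[1.0, 1.0]], [[1.0, 3.0]]])
fused = fuse_weighted(imgs, w)
```

In the toy example the first pixel is a plain average of the two exposures, while the second pixel leans toward the second exposure because its weight is three times larger.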

WEIGHT FUNCTION
As the core part of the proposed image fusion method, a reasonable weight function is designed based on an appropriate evaluation of exposure levels (Shen-yu et al., 2015). The gray value is an important measure of visible image information, and the fusion weight is usually determined by the distance between the gray value and 0.5. However, this single index causes information loss in the fused image and leaves some areas of the image dark. With the evaluation of moderate exposure, the fusion weight is instead determined by the distance between the pixel value and a target that combines 0.5 with the mean gray value of the multi-exposure images at that point, which retains more image information. Additionally, relative brightness is applied as a second measure of the weight.

Evaluation of Moderate Exposure
In the evaluation process, the brightness and darkness changes of different pixels obtained by the limited sampling of a scene are analyzed, and each image pixel value in the scene under the optimal moderate exposure is estimated. The differences between the pixel values of each input image and the corresponding optimal pixel values are compared to evaluate moderate exposure. The evaluation value can be used directly as the corresponding weight value for image fusion. For $N$ images with different exposures from the same scene, $I_n(x, y)$ represents the pixel value at the coordinate $(x, y)$ of the nth image, and the evaluation indicator of moderate exposure is the sum of weights used to obtain the fused image.
In Equation (2), $\mu(x, y)$ represents the optimal pixel value of the pixel at the coordinate $(x, y)$ of the image, which is estimated by Equation (3). On one hand, the value of $\mu(x, y)$ should be around 0.5, which ensures an ideal human visual experience.
On the other hand, in order to reflect the real-world light-dark contrast information, it is necessary to approximate the brightness information from the limited sampling of the scene. Therefore, the average value of each pixel over the images with different exposures is calculated by Equation (4). $\mu(x, y)$ is the weighted sum of 0.5 and this average value. The weight factor $\beta$ is a balance parameter between detail information and light-dark contrast information.
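The moderate exposure evaluation described above can be sketched as follows. This is only one plausible reading: the Gaussian-shaped score around $\mu(x, y)$, the width `sigma`, the default `beta`, and the choice of which term $\beta$ multiplies are all assumptions of this example, since the exact expressions of Equations (2) to (4) are not reproduced here.

```python
import numpy as np

def moderate_exposure_weights(images, beta=0.5, sigma=0.2):
    """Score each pixel by its closeness to an estimated optimal value mu.

    images: array of shape (N, H, W), intensities in [0, 1].
    beta:   balance between the fixed target 0.5 and the per-pixel mean
            (assumed here to multiply the mean term).
    sigma:  width of the Gaussian-shaped score (hypothetical parameter).
    Returns (weights, mu).
    """
    images = np.asarray(images, dtype=np.float64)
    mean = images.mean(axis=0)              # per-pixel average over exposures
    mu = (1.0 - beta) * 0.5 + beta * mean   # weighted sum of 0.5 and the average
    weights = np.exp(-((images - mu) ** 2) / (2.0 * sigma ** 2))
    return weights, mu

# Three constant exposures: dark, medium, bright.
stack = np.stack([np.full((2, 2), v) for v in (0.1, 0.5, 0.9)])
me_weights, mu = moderate_exposure_weights(stack)
```

With these inputs the per-pixel mean is 0.5, so the medium exposure receives the largest score and the dark and bright exposures receive equal, smaller scores.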

Relative Brightness
The evaluation indicator of moderate exposure cannot capture well the information from dark areas of long-exposure images or bright areas of short-exposure images. Therefore, the relative brightness proposed by Lee et al. (2018) is added as another weight indicator. Specifically, when the overall image is bright (long exposure), the relatively dark areas are given greater weights. Conversely, when the overall image is dark (short exposure), the relatively bright areas are given greater weights.
The average pixel intensity of the nth image is denoted as $m_n$. When $I_n(x, y)$ is close to $1 - m_n$, the corresponding weight should be relatively large. Therefore, the relative brightness can be expressed as follows.
In addition, when an input image differs considerably from its adjacent exposures, different objects in the two images are often in a good exposure state. Therefore, when the average brightness $m_n$ of the nth image differs considerably from the average brightness $m_{n-1}$, $m_{n+1}$ of the adjacent images, a larger $\delta_n$ value is given. $\alpha$ is a constant with a value of 0.75. $\delta_n$ controls the weight according to the different $m_n$ values of the images, which can be expressed as follows.
Therefore, the final weight function can be expressed as follows.
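The sketch below illustrates how the relative brightness indicator and the final weight could be computed. It is not the authors' exact formulation: the Gaussian form centred at $1 - m_n$, the use of adjacent mean differences scaled by $\alpha$ for $\delta_n$, the small `floor` guard, and the combination of the two indicators by a normalized product are assumptions. The images are assumed to be ordered by exposure.

```python
import numpy as np

def spread_from_means(means, alpha=0.75, floor=1e-3):
    """delta_n from the mean brightness of neighbouring exposures (assumed form)."""
    m = np.asarray(means, dtype=np.float64)
    if m.size < 2:
        raise ValueError("at least two exposures are required")
    delta = np.empty_like(m)
    delta[0] = 2.0 * alpha * (m[1] - m[0])      # first image: one-sided difference
    delta[-1] = 2.0 * alpha * (m[-1] - m[-2])   # last image: one-sided difference
    delta[1:-1] = alpha * (m[2:] - m[:-2])      # interior images: both neighbours
    return np.maximum(np.abs(delta), floor)     # guard against a zero width

def relative_brightness_weights(images, alpha=0.75):
    """Large weight where I_n(x, y) is close to 1 - m_n."""
    images = np.asarray(images, dtype=np.float64)
    m = images.mean(axis=(1, 2))
    delta = spread_from_means(m, alpha)[:, None, None]
    target = (1.0 - m)[:, None, None]
    return np.exp(-((images - target) ** 2) / (2.0 * delta ** 2))

def combine_weights(exposure_w, brightness_w, eps=1e-12):
    """Product of both indicators, normalized per pixel (assumed combination)."""
    w = np.asarray(exposure_w, dtype=np.float64) * np.asarray(brightness_w, dtype=np.float64) + eps
    return w / w.sum(axis=0, keepdims=True)

# Dark, medium, and mostly bright exposures of a 1x4 scene.
stack = np.stack([
    np.full((1, 4), 0.2),
    np.full((1, 4), 0.5),
    np.array([[0.2, 1.0, 1.0, 1.0]]),
])
rb = relative_brightness_weights(stack)
final_w = combine_weights(np.ones_like(rb), rb)
```

In the bright third image (mean 0.8), the single dark pixel sits exactly at $1 - m_n = 0.2$ and therefore receives the largest relative brightness score in that image.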

MULTI-SCALE IMAGE DECOMPOSITION
Because neighboring pixels of an image are closely related, it is more reliable to use a wider range of pixels to calculate the fusion weight. In addition, real-world objects have different structures at different scales, so observing the same object at different scales gives different results. Therefore, computing the result image with an image pyramid under multi-scale decomposition yields better fusion results. Gaussian pyramid decomposition is first performed on the weight maps and the multi-exposure image sequences. Then Laplacian pyramid decomposition is applied to the multi-exposure image sequences. After the Gaussian pyramid of the weights and the Laplacian pyramid of the images are fused between the corresponding layers, the upper layer image of the fused pyramid is up-sampled and added to the lower layer image to obtain an image with the same size as the images to be fused.

Gaussian Pyramid Decomposition
The Gaussian pyramid obtains a series of down-sampled images through Gaussian smoothing and sub-sampling. A Gaussian kernel is first used to convolve the image of layer $l$, and then all even rows and columns are deleted to obtain the image of pyramid layer $l + 1$. The decomposition algorithm is shown as follows.

$$G_{l+1}(x, y) = \sum_{m} \sum_{n} w(m, n)\, G_l(2x + m,\, 2y + n), \quad 0 \le l < L_{ev}$$

where $G_l$ is the image of the lth layer of the Gaussian pyramid, $C_l$ and $R_l$ are the total numbers of rows and columns of the lth layer image, $w(m, n)$ is the value in the mth row and nth column of the Gaussian filter template, $L_{ev}$ represents the number of Gaussian pyramid layers, and the maximum decomposable number of layers is $\log_2[\min(C_0, R_0)]$.
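The smoothing and sub-sampling step described above can be sketched in plain NumPy. The 5-tap binomial kernel `[1, 4, 6, 4, 1] / 16`, the edge-replicating border handling, and the function names are assumptions of this example; the paper does not state which filter template it uses.

```python
import numpy as np

# Assumed separable 5-tap kernel; its 2-D outer product plays the role of w(m, n).
KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable 5x5 smoothing with edge-replicated borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    tmp = sum(KERNEL_1D[k] * padded[:, k:k + w] for k in range(5))   # along columns
    return sum(KERNEL_1D[k] * tmp[k:k + h, :] for k in range(5))     # along rows

def pyr_reduce(img):
    """Smooth, then keep every second row and column."""
    return blur(img)[::2, ::2]

def max_levels(shape):
    """Largest number of reductions allowed by log2(min(rows, cols))."""
    return int(np.floor(np.log2(min(shape))))

def gaussian_pyramid(img, levels):
    """Return [G_0, G_1, ..., G_levels] for a single-channel image."""
    pyramid = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyramid.append(pyr_reduce(pyramid[-1]))
    return pyramid

flat = np.ones((16, 12))
pyr = gaussian_pyramid(flat, max_levels(flat.shape))
```

Because the kernel sums to 1, a constant image stays constant at every layer, which is a quick sanity check for the filter.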

Laplace Pyramid Decomposition
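The Gaussian pyramid obtained by Gaussian convolution and downsampling often loses detailed image information. Therefore, Mertens et al. (2010) employed the Laplacian pyramid to restore detailed image information. From the image of each layer of the Gaussian pyramid, the predicted image obtained by upsampling and Gaussian convolution of the upper layer image is subtracted to obtain a series of difference images, which are the Laplacian decomposition images. First, the upsampling process is expressed as follows.

$$\mathrm{expand}(G_{l+1})(x, y) = 4 \sum_{m} \sum_{n} w(m, n)\, G'_{l+1}\!\left(\frac{x+m}{2}, \frac{y+n}{2}\right)$$

$$G'_{l+1}\!\left(\frac{x+m}{2}, \frac{y+n}{2}\right) = \begin{cases} G_{l+1}\!\left(\dfrac{x+m}{2}, \dfrac{y+n}{2}\right), & \dfrac{x+m}{2}, \dfrac{y+n}{2} \in Z \\ 0, & \text{else} \end{cases}$$

where $Z$ represents the integers and $\mathrm{expand}(\cdot)$ indicates that an upsampling operation is performed on a layer of the Gaussian pyramid. As shown in Equation (11), the expanded upper layer image is subtracted from the image $G_l$ of the lth layer of the Gaussian pyramid to obtain the lth layer image $L_l$ containing detailed information.

$$L_l = \begin{cases} G_l - \mathrm{expand}(G_{l+1}), & 0 \le l < L_{ev} \\ G_{L_{ev}}, & l = L_{ev} \end{cases} \tag{11}$$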
The Laplace decomposition process of the image is shown in Figure 2. In this article, the number of layers of image pyramid is 7.
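A compact sketch of the Laplacian decomposition and its inverse is given below. As in any such illustration, the 5-tap kernel, the zero-insertion upsampling followed by smoothing and a gain of 4, and the function names are assumptions rather than the paper's exact implementation.

```python
import numpy as np

KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed filter

def blur(img):
    """Separable 5x5 smoothing with edge-replicated borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    tmp = sum(KERNEL_1D[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL_1D[k] * tmp[k:k + h, :] for k in range(5))

def pyr_reduce(img):
    return blur(img)[::2, ::2]

def pyr_expand(img, shape):
    """Insert zeros between samples, smooth, and compensate the gain."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = img
    return 4.0 * blur(up)

def laplacian_pyramid(img, levels):
    """Return [L_0, ..., L_{levels-1}, G_levels]."""
    gauss = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        gauss.append(pyr_reduce(gauss[-1]))
    lap = [gauss[l] - pyr_expand(gauss[l + 1], gauss[l].shape) for l in range(levels)]
    lap.append(gauss[-1])  # coarsest layer keeps the low-pass residual
    return lap

def reconstruct(lap):
    """Collapse the pyramid from the coarsest layer down to full size."""
    out = lap[-1]
    for layer in reversed(lap[:-1]):
        out = layer + pyr_expand(out, layer.shape)
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 24))
lap = laplacian_pyramid(image, 3)
restored = reconstruct(lap)
```

Since each Laplacian layer stores exactly what the expanded upper layer fails to predict, collapsing the pyramid recovers the input up to floating-point error.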

Image Fusion and Reconstruction
According to the above process, the Gaussian pyramid of each weight map and the Laplacian pyramid of each multi-exposure image are first obtained, and then fused between the corresponding layers.

$$FI_l = \sum_{k=1}^{N} W_{k,l}\, L_{k,l} \tag{12}$$

where $FI_l$ represents the fused image data of the lth layer, $W_{k,l}$ represents the lth layer data of the kth weight map, $L_{k,l}$ represents the lth layer data of the Laplacian pyramid of the kth multi-exposure image, $L_{ev}$ represents the total number of pyramid layers, and $N$ represents the number of images. The upper layer image of the fused pyramid is first upsampled and then added to the lower layer image to obtain an image with the same size as the images to be fused, as follows.
where $FI_l$ represents the lth layer image of the fused pyramid, up represents upsampling, $L_{ev}$ represents the number of pyramid levels, and $H$ represents the final fused image. The overall workflow of the proposed method is shown in Algorithm 1.
Frontiers in Neurorobotics | www.frontiersin.org
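The layer-wise fusion and reconstruction described above can be sketched for single-channel images as follows. The kernel, the border handling, the helper names, and the assumption that the weight maps are already normalized per pixel are choices made for this example only.

```python
import numpy as np

KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed filter

def blur(img):
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    tmp = sum(KERNEL_1D[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL_1D[k] * tmp[k:k + h, :] for k in range(5))

def pyr_expand(img, shape):
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = img
    return 4.0 * blur(up)

def gaussian_pyramid(img, levels):
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyr.append(blur(pyr[-1])[::2, ::2])
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    return [g[l] - pyr_expand(g[l + 1], g[l].shape) for l in range(levels)] + [g[-1]]

def fuse_exposures(images, weights, levels):
    """Blend Laplacian layers with Gaussian-smoothed weights, then collapse.

    images, weights: sequences of (H, W) arrays; weights are assumed to
    sum to 1 at every pixel.
    """
    fused = None
    for img, w in zip(images, weights):
        w_pyr = gaussian_pyramid(w, levels)
        l_pyr = laplacian_pyramid(img, levels)
        contrib = [wl * ll for wl, ll in zip(w_pyr, l_pyr)]
        fused = contrib if fused is None else [f + c for f, c in zip(fused, contrib)]
    out = fused[-1]                        # start from the coarsest fused layer
    for layer in reversed(fused[:-1]):     # upsample and add, layer by layer
        out = layer + pyr_expand(out, layer.shape)
    return out

rng = np.random.default_rng(1)
scene = rng.random((32, 32))
w1 = rng.random((32, 32))
result = fuse_exposures([scene, scene], [w1, 1.0 - w1], levels=3)
```

Fusing two identical inputs with complementary weights returns the input itself, which gives a simple consistency check for the pipeline.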

Subjective Comparison
Firstly, experiments are carried out on the "Arno" scene, and the fusion results of different algorithms are shown in Figure 3. When DSIFT processes the clouds in the right part of the sky, the result is generally dark and does not capture the details of the clouds well. The GFF and SPD algorithms produce low brightness on the bridge, resulting in the loss of detail information and a poor visual effect. GD, PMEF and the algorithm proposed in this article can maintain the uniformity of [...]

Algorithm 1 | Multi-exposure image fusion algorithm based on improved weight function.
Input: LDR image sequences $I_k$, $k = 1, 2, \ldots, N$, where $N$ is the total number of images, $l$ represents the number of decomposition layers, and $(x, y)$ is the pixel position.
Output: the fused image.
1 Calculation of image fusion weights: [...]

[...] means the details of the overexposed image areas can be well captured. But the overall scene is dark, resulting in the detail loss of underexposed image areas. The image fused by GFF has a slight halo on the edge of the hot air balloon. Additionally, part of the sky is dark and the image color is slightly distorted. The sunset area of the image fused by SPD is abnormal. In addition, the image color is seriously distorted, which degrades the overall quality of the fused image. When comparing the enlarged details, MESPD, GD, Fmmr, SPD, and PMEF have low brightness, poor visibility and serious loss of details in this area.
In the experimental results of the "Kluki" scene, as shown in Figure 5, the saturation of SPD and PMEF is too high, resulting in some color distortion in the resulting image and poor retention of the details of the clouds in the sky. Other algorithms retain the details of the clouds, and the visual effect is good. In contrast, the fusion results obtained by the proposed method and DSIFT preserve the details of both the bright and dark areas of the scene, so the colors are realistic, the contrast is clear, and the visual performance of the fused images is good. In the enlarged details of the trees on the left, DSIFT, SPD, and PMEF show low brightness and high saturation, resulting in poor detail retention.

Objective Evaluation Indicator Analysis
This article uses both the structural similarity index (SSIM) and image information entropy for objective evaluation. As shown in Figure 6 and Tables 1, 2, the results confirm that the proposed method achieves good performance in both subjective and objective evaluations. The abscissa in Figure 6 represents the value of information entropy, and the abscissa in Figure 7 represents the value of structural similarity. The ordinates of the two figures are the same: 1-20 represent the different multi-exposure sequences, and 21 represents the average value.
1) Image information entropy indicator comparison
Image information entropy is one of the important factors that determine the final effect of image fusion. The larger the information entropy, the more detailed information the result image contains; conversely, the smaller the information entropy, the less detailed information it contains. The evaluation results are shown in Figure 6 and Table 1. The entropy of the multi-exposure fusion algorithm under multi-scale decomposition is slightly lower than that of the SPD algorithm based on image block decomposition and higher than that of the other algorithms. This is because the SPD algorithm avoids the partial loss of information caused by up- and down-sampling in multi-scale decomposition. The calculation formula of image entropy is as follows.

$$E = -\sum_{i} P_i \log_2 P_i$$

where $P_i$ represents the proportion of pixels with gray value $i$ in the image.
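For reference, the entropy of an 8-bit grayscale image can be computed from its histogram as in the sketch below. The base-2 logarithm, the 256 gray levels, and the function name `image_entropy` are assumptions of this example.

```python
import numpy as np

def image_entropy(gray):
    """Shannon entropy (in bits) of an 8-bit grayscale image."""
    gray = np.asarray(gray, dtype=np.uint8)
    counts = np.bincount(gray.ravel(), minlength=256)
    p = counts / counts.sum()      # proportion of pixels per gray value
    p = p[p > 0]                   # empty bins contribute nothing
    return float(-(p * np.log2(p)).sum())

constant = np.full((4, 4), 128, dtype=np.uint8)
two_level = np.array([[0, 255], [0, 255]], dtype=np.uint8)
all_levels = np.arange(256, dtype=np.uint8).reshape(16, 16)
```

A constant image carries no information (entropy 0), two equally frequent gray values give 1 bit, and an image that uses all 256 gray values equally often reaches the 8-bit maximum.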
2) MEF-SSIM comparison
This article uses the MEF quality evaluation model (Ma et al., 2015b) for evaluation. The proposed method is objectively compared with six existing MEF methods. Natural images usually contain object information at different scales. Multi-scale processing preserves the correlation between pixels at different scales and improves image fusion. Structural similarity is an index that measures the similarity of two images. As shown in Figure 7 and Table 2, the MEF method under multi-scale decomposition achieves the best SSIM.
From the perspective of image composition, structural information is defined as an attribute that reflects the structure of objects in the scene independent of brightness and contrast. Additionally, distortion is modeled as a combination of three different factors: brightness, contrast, and structure. The mean is used as an estimate of brightness, the standard deviation as an estimate of contrast, and the covariance as a measure of structural similarity. All the definitions are shown as follows.

$$L(x, y) = \frac{2\mu_x \mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1}$$

$$C(x, y) = \frac{2\delta_x \delta_y + c_2}{\delta_x^2 + \delta_y^2 + c_2}$$

$$S(x, y) = \frac{\delta_{xy} + c_3}{\delta_x \delta_y + c_3}$$

$$\mathrm{SSIM}(x, y) = [L(x, y)]^{\alpha}\, [C(x, y)]^{\beta}\, [S(x, y)]^{\gamma}$$

$L(x, y)$, $C(x, y)$, and $S(x, y)$ are the comparison results of image brightness, contrast, and structure, respectively. $\mu_x$ and $\mu_y$ are the mean values of the image pixels. $\delta_x$ and $\delta_y$ are the standard deviations of the image pixel values. $\delta_{xy}$ is the covariance of $x$ and $y$. $c_1$, $c_2$, and $c_3$ are constants that avoid errors when the denominator is 0. $\alpha$, $\beta$, and $\gamma$ are used to adjust the weight of each component, usually $\alpha = \beta = \gamma = 1$. The structural similarity index is computed at different scales, and the final image quality score $Q$ is obtained through Formula (19).

$$Q = \prod_{l=1}^{L} \left[\mathrm{SSIM}_l(x, y)\right]^{\beta_l} \tag{19}$$

where $L$ is the total number of scales and $\beta_l$ is the weight assigned to the lth scale.
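The three comparison terms and the multi-scale combination can be illustrated with a global (whole-image) computation. This is a simplified sketch, not the MEF-SSIM model of Ma et al. (2015b): it compares two images rather than a fused image against an exposure sequence, and the constants `c1`, `c2`, `c3` are conventional values for intensities in [0, 1] chosen here as assumptions.

```python
import numpy as np

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2, c3=None,
                alpha=1.0, beta=1.0, gamma=1.0):
    """Brightness, contrast, and structure terms from global statistics."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if c3 is None:
        c3 = c2 / 2.0                                  # assumed convention
    mu_x, mu_y = x.mean(), y.mean()
    sd_x, sd_y = x.std(), y.std()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    lum = (2.0 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2.0 * sd_x * sd_y + c2) / (sd_x ** 2 + sd_y ** 2 + c2)
    struct = (cov + c3) / (sd_x * sd_y + c3)
    return lum ** alpha * con ** beta * struct ** gamma

def multiscale_score(scores, betas):
    """Product of per-scale scores raised to their scale weights."""
    scores = np.asarray(scores, dtype=np.float64)
    betas = np.asarray(betas, dtype=np.float64)
    return float(np.prod(scores ** betas))

rng = np.random.default_rng(2)
a = rng.random((16, 16))
same = ssim_global(a, a)
inverted = ssim_global(a, 1.0 - a)
combined = multiscale_score([0.9, 0.8], [0.5, 0.5])
```

An image compared with itself scores 1, while its intensity-inverted copy scores much lower because the covariance term becomes negative.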

Ablation Experiment of Weight Function
To verify that a weight function built from two different feature indexes, moderate exposure evaluation and relative brightness, improves multi-exposure image fusion, the following ablation experiments were carried out. As shown in Table 3, the fused image obtained with the improved weight function performs well on the objective evaluation indexes.

Comparison and Analysis of Computational Efficiency
As shown in Table 4, the computational efficiency of the multi-exposure fusion algorithm based on the improved weight function is better than that of the comparison algorithms. Although the algorithm uses the Laplacian image pyramid, the number of pixels shrinks to about one quarter at each down-sampling step, so all pyramid layers together contain fewer than 4/3 of the pixels of the original image and the amount of calculation increases only a little. In addition, because the weight calculation of this algorithm is simple, it adds little extra time. Therefore, the computational efficiency of this algorithm is clearly better than that of the other comparison algorithms.

CONCLUSION
In this article, the weight function is improved, and the weight map is calculated by using the evaluation of moderate exposure and relative brightness. Pyramid-based multi-scale decomposition is used to fuse images with different resolutions to generate the final HDR image. The proposed method can effectively capture the rich image details and solve the issues such as splicing traces and border discontinuities in the fused image, avoiding the generation of artifacts. Both MEF-SSIM and image information entropy are used to evaluate the performance of image fusion. Experimental results confirm that the proposed method achieves good image fusion performance.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.