<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioeng. Biotechnol.</journal-id>
<journal-title>Frontiers in Bioengineering and Biotechnology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioeng. Biotechnol.</abbrev-journal-title>
<issn pub-type="epub">2296-4185</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">923364</article-id>
<article-id pub-id-type="doi">10.3389/fbioe.2022.923364</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioengineering and Biotechnology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Two-To-One Deep Learning General Framework for Image Fusion</article-title>
<alt-title alt-title-type="left-running-head">Zhu et al.</alt-title>
<alt-title alt-title-type="right-running-head">General Neural Network Framework</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Zhu</surname>
<given-names>Pan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Ouyang</surname>
<given-names>Wanqi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1762856/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Guo</surname>
<given-names>Yongxing</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhou</surname>
<given-names>Xinglin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Key Laboratory of Metallurgical Equipment and Control Technology</institution>, <institution>Ministry of Education</institution>, <institution>Wuhan University of Science and Technology</institution>, <addr-line>Wuhan</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering</institution>, <institution>Wuhan University of Science and Technology</institution>, <addr-line>Wuhan</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Precision Manufacturing Institute</institution>, <institution>Wuhan University of Science and Technology</institution>, <addr-line>Wuhan</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1254880/overview">Tinggui Chen</ext-link>, Zhejiang Gongshang University, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1444787/overview">Javad Hassannataj Joloudari</ext-link>, University of Birjand, Iran</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1435929/overview">Yongfeng Li</ext-link>, Henan Institute of Science and Technology, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Wanqi Ouyang, <email>ouyangwanqi@163.com</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Bionics and Biomimetics, a section of the journal Frontiers in Bioengineering and Biotechnology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>14</day>
<month>07</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>923364</elocation-id>
<history>
<date date-type="received">
<day>19</day>
<month>04</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>09</day>
<month>06</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Zhu, Ouyang, Guo and Zhou.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Zhu, Ouyang, Guo and Zhou</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Image fusion algorithms have great application value in computer vision: they give the fused image a more comprehensive and clearer description of the scene, benefiting both human visual recognition and automatic machine detection. In recent years, image fusion algorithms have achieved great success in different domains, yet the generalization of multi-modal image fusion remains a major challenge. In response to this problem, this paper proposes a general image fusion framework based on an improved convolutional neural network. First, the feature information of the input image is captured by multiple feature extraction layers, and the resulting feature maps are stacked along the channel dimension to obtain the fused feature map. Finally, the feature maps derived from the multiple feature extraction layers are stacked in high dimensions through skip connections and convolutional filtering for reconstruction, producing the final result. In this paper, multi-modal images are drawn from multiple datasets to build a large sample space and adequately train the network. Compared with existing convolutional neural networks and traditional fusion algorithms, the proposed model not only offers generality and stability but also shows strengths in subjective visualization and objective evaluation, while its average running time is at least 94% shorter than that of the reference neural-network-based algorithm.</p>
</abstract>
<kwd-group>
<kwd>bionic vision</kwd>
<kwd>multi-modal image fusion</kwd>
<kwd>convolutional neural network</kwd>
<kwd>y-distribution structure</kwd>
<kwd>multi-convolution kernel</kwd>
<kwd>adaptive feature analysis</kwd>
</kwd-group>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Deep learning is a bio-inspired intelligent computing technology based on the principles of neurotransmission in the human brain, resembling the pattern of connections between brain neurons (<xref ref-type="bibr" rid="B42">Xu et al., 2021</xref>). Unlike classical bionic techniques such as ant colony algorithms (<xref ref-type="bibr" rid="B12">Deng et al., 2020</xref>), bee algorithms (<xref ref-type="bibr" rid="B11">&#xc7;il et al., 2020</xref>), and particle swarm optimization (<xref ref-type="bibr" rid="B13">Elbes et al., 2019</xref>), deep learning has an impressive ability to resolve the complexity of real-world problems, which has attracted the attention of many scholars and has been successfully applied to practical problems (<xref ref-type="bibr" rid="B9">Chen et al., 2021b</xref>; <xref ref-type="bibr" rid="B5">Chen et al., 2022a</xref>; <xref ref-type="bibr" rid="B8">Chen et al., 2022c</xref>; <xref ref-type="bibr" rid="B36">Sun et al., 2022</xref>). In recent years, deep learning, especially neural networks, has become one of the most rapidly growing and widely applied artificial intelligence technologies. Several studies have demonstrated the superior performance of neural networks in target detection (<xref ref-type="bibr" rid="B18">Jiang et al., 2021a</xref>; <xref ref-type="bibr" rid="B16">Huang et al., 2021</xref>; <xref ref-type="bibr" rid="B15">Huang et al., 2022</xref>), image segmentation (<xref ref-type="bibr" rid="B19">Jiang et al., 2021b</xref>), data processing (<xref ref-type="bibr" rid="B6">Chen et al., 2021a</xref>; <xref ref-type="bibr" rid="B7">Chen et al., 2022b</xref>), and depth estimation (<xref ref-type="bibr" rid="B20">Jiang et al., 2019</xref>). 
In addition, image fusion, an essential branch of neural network research, has been extensively implemented in various areas, especially civil, military, and industrial applications, as research on neural networks has gradually advanced. For example, mobile phones often integrate high dynamic range (<xref ref-type="bibr" rid="B28">Ma et al., 2015</xref>; <xref ref-type="bibr" rid="B25">Liu et al., 2018</xref>; <xref ref-type="bibr" rid="B32">Qi et al., 2021</xref>) or refocus algorithms (<xref ref-type="bibr" rid="B34">Saha et al., 2013</xref>; <xref ref-type="bibr" rid="B3">Bai et al., 2015</xref>; <xref ref-type="bibr" rid="B46">Zhang and Levine, 2016</xref>) to obtain stable and information-rich images. Visible and infrared image fusion can provide a more direct monitoring environment to observers (<xref ref-type="bibr" rid="B43">Xue and Blum, 2003</xref>; <xref ref-type="bibr" rid="B37">Wan et al., 2009</xref>; <xref ref-type="bibr" rid="B52">Zhou et al., 2016</xref>; <xref ref-type="bibr" rid="B48">Zhang et al., 2017</xref>).</p>
<p>The convolutional neural network (CNN), a category of neural networks, is usually superior to traditional hand-crafted feature extractors (<xref ref-type="bibr" rid="B45">Yan et al., 2017</xref>; <xref ref-type="bibr" rid="B22">Li et al., 2018</xref>), and its number of convolutional filters is significantly larger than that of traditional filters. Therefore, a CNN can capture richer image details and is frequently used for image feature extraction. As such a potent tool, the CNN provides new ideas and directions for research on image fusion. In general, neural networks can excavate implicit rules from massive datasets and then predict results from the learned rules, which gives these models exceptional generalization ability (<xref ref-type="bibr" rid="B10">Cheng et al., 2021</xref>; <xref ref-type="bibr" rid="B16">Huang et al., 2021</xref>). For traditional image fusion algorithms, multi-modal image fusion usually implies different fusion rules, and it is difficult to find a unified approach. CNNs, for their part, are not fully exploited in most cases and are primarily applied to image feature extraction. Although a few fully convolutional neural networks, which need no handcrafted preprocessing or fusion rules, can automate image fusion, their fusion targets are restricted to single-modal images. Therefore, studying the generality of multi-modal image fusion remains a tremendous challenge.</p>
<p>In this paper, a general CNN framework for image fusion, called IY-Net, is designed. The structure of IY-Net is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The proposed model has two innovations. First, it has the characteristics of a fully convolutional neural network with relatively good generality: it needs no hand-specified fusion rules and has a simple network structure. This is the key innovation. Second, since the quality of the training dataset constrains model performance in deep learning, an appropriate dataset is particularly critical. In theory, a model trained on images of a single modality is more stable and accurate on that modality. This paper nevertheless selects multi-modal images as the training dataset, and the proposed model can, to some extent, avoid mutual interference among the fusion results of different modalities. These two innovations make the proposed model stand out from current CNN methods.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>The architecture of IY-Net. M represents the size of the feature map. The number at the top block represents the feature depth.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g001.tif"/>
</fig>
<p>The main contribution of this work is a general image fusion framework. It is superior to many traditional algorithms and CNN methods in terms of visual quality, and it achieves excellent performance in multi-focus, infrared-visible, and multi-exposure image fusion, among others. There are two more specific contributions. First, a multi-feature extraction module is introduced, which effectively extends the receptive field of the convolutional layer and thus captures more feature information. Second, an image reconstruction scheme is constructed that effectively mitigates the vanishing- and exploding-gradient problems of deep CNNs.</p>
<p>The rest of this paper is organized as follows. In <xref ref-type="sec" rid="s2">Section 2</xref>, the paper discusses the related work. <xref ref-type="sec" rid="s3">Section 3</xref> introduces the proposed model in detail. <xref ref-type="sec" rid="s4">Section 4</xref> describes the experimental results and discusses them. In <xref ref-type="sec" rid="s5">Section 5</xref>, the paper shows the conclusion and future research directions.</p>
</sec>
<sec id="s2">
<title>2 Related Work</title>
<p>Regarding both CNN-based and traditional algorithms, although considerable research results have been achieved in image fusion, there is still room for optimization and improvement. In addition, most methods can only address image fusion for a few modalities and lack generality.</p>
<p>In general, traditional image fusion algorithms can be divided into two categories, i.e., spatial-domain and transform-domain algorithms. For spatial-domain image fusion algorithms (<xref ref-type="bibr" rid="B17">Huang and Jing, 2007</xref>; <xref ref-type="bibr" rid="B51">Zhou et al., 2014</xref>; <xref ref-type="bibr" rid="B48">Zhang et al., 2017</xref>; <xref ref-type="bibr" rid="B2">Amin-Naji et al., 2022</xref>), the source image is first divided into small blocks or regions according to certain criteria. The significance of the corresponding regions is then evaluated, and finally the most salient regions are fused. These algorithms are mainly applied to same-modality images and may reduce the edge sharpness and contrast of the fused image, or even produce halos at the edges. For transform-domain image fusion algorithms (<xref ref-type="bibr" rid="B14">Haghighat et al., 2011</xref>), on the other hand, the source image is first decomposed into a feature domain by a multi-scale geometric transform. Feature-weighted fusion is then performed on the multiple input images, and finally the fused image is obtained by inverse transformation of the fused features. Among current transform-domain algorithms, multi-scale transform image fusion (MSTIF) algorithms are becoming increasingly popular. Examples of such transforms include pyramid-based decomposition (<xref ref-type="bibr" rid="B26">Liu et al., 2001</xref>), the curvelet transform (<xref ref-type="bibr" rid="B38">Tessens et al., 2007</xref>), the dual-tree complex wavelet transform (DTCWT) (<xref ref-type="bibr" rid="B21">Lewis et al., 2007</xref>), the discrete wavelet transform (DWT) (<xref ref-type="bibr" rid="B49">Zheng et al., 2007</xref>; <xref ref-type="bibr" rid="B39">Tian and Chen, 2012</xref>), and the non-subsampled contourlet transform (NSCT) (<xref ref-type="bibr" rid="B29">Moonon and Hu, 2015</xref>). 
MSTIF relies on the choice of multi-scale decomposition method and the fusion strategy for the multi-scale coefficients. As a result, such algorithms involve a relatively large manual component, which leads to obvious weaknesses and a lack of generality. For example, NSCT is weak at capturing curve details, while the curvelet transform is computationally complex and performs poorly on multi-exposure and remote sensing image fusion. When fusing certain modalities, pyramid-based decomposition introduces distortion, and the Laplacian pyramid transform incurs redundant information, making it unsuitable for infrared and visible image fusion. In conclusion, traditional MSTIF offers a wide variety of filters but is always restricted in generality.</p>
<p>In recent years, image fusion methods based on neural networks have grown rapidly (<xref ref-type="bibr" rid="B25">Liu et al., 2018</xref>). <xref ref-type="bibr" rid="B23">Liu et al. (2017)</xref> regarded the fusion of multi-focus images as a classification task and used a CNN to predict the focus map to obtain the fused image. <xref ref-type="bibr" rid="B35">Song et al. (2018)</xref> applied two neural networks to perform super-resolution processing of low-resolution terrestrial images and extract feature maps; high-pass modulation and weighting strategies were then used to reconstruct the feature maps into fused images. <xref ref-type="bibr" rid="B4">Bhalla et al. (2022)</xref> integrated fuzzy theory with a Siamese convolutional network to extract salient features and high-frequency information from the source images, and finally obtained fusion results by a pixel strategy mapping directly to the source image. The above methods require pre-processing to generate fused images; in addition, they can only fuse single-modal images and lack generality. <xref ref-type="bibr" rid="B47">Zhang et al. (2020)</xref> proposed a CNN-based image fusion framework trained in an end-to-end manner, whose parameters can be jointly optimized without any subsequent processing. Although they designed a generalized model, it adopts human-selected fusion rules in the feature fusion phase, which degrades both the generality of the model and the fusion performance. For example, when infrared and visible images are fused, the model applies MAX feature fusion to yield the best result, but when multi-exposure images are fused, it employs SUM feature fusion. In summary, although CNNs have achieved some success in image fusion, the majority of current models lack generality. 
In addition, most CNN models are not designed end-to-end (<xref ref-type="bibr" rid="B40">Wang et al., 2019a</xref>) and require additional steps to complete the task. Therefore, CNN-based image fusion has not been fully exploited, and there is still much potential to be unlocked in terms of generality.</p>
</sec>
<sec id="s3">
<title>3 Methods and Materials</title>
<sec id="s3-1">
<title>3.1 Feature Extraction Module</title>
<p>The convolutional layers in a CNN extract different feature information from the training image through convolution kernels and then update the filter parameters automatically. Therefore, the selection of convolution kernels is crucial for feature extraction. The specific structure is shown in <xref ref-type="sec" rid="s11">Supplementary Figure S1</xref>. A small convolution kernel extracts low-frequency and fine-detail information but cannot detect high-frequency and large-scale detail. Conversely, a large convolution kernel is preferable for identifying high-frequency and large-scale detail information.</p>
<p>As stated above, this paper utilizes multiple feature extraction layers, each containing convolution kernels of sizes 3 &#xd7; 3, 5 &#xd7; 5, and 7 &#xd7; 7, to capture both low- and high-frequency information. The specific structure is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The proposed model detects the feature information of the input image through three multi-feature extraction layers, but repeated convolutions can lead to over-fitting and increased training time. Therefore, a max-pooling layer is added after each of the first two multi-feature extraction layers to avoid these problems.</p>
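In PyTorch terms, one such multi-feature extraction layer can be sketched as three parallel convolution branches whose outputs are concatenated along the channel dimension; the class name, per-branch channel count, and ReLU activation below are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MultiFeatureExtraction(nn.Module):
    """Parallel 3x3 / 5x5 / 7x7 convolutions, concatenated along channels."""
    def __init__(self, in_ch, out_ch_per_branch):
        super().__init__()
        # "same" padding keeps the spatial size so the branches can be stacked
        self.b3 = nn.Conv2d(in_ch, out_ch_per_branch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, out_ch_per_branch, 5, padding=2)
        self.b7 = nn.Conv2d(in_ch, out_ch_per_branch, 7, padding=3)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # small kernels pick up fine detail, larger kernels larger-scale structure
        return self.act(torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1))

layer = MultiFeatureExtraction(in_ch=1, out_ch_per_branch=16)  # grayscale input
pool = nn.MaxPool2d(2)                      # halves H and W after the layer
y = pool(layer(torch.randn(1, 1, 256, 256)))
print(y.shape)                              # torch.Size([1, 48, 128, 128])
```

With 16 channels per branch, a 1 &#xd7; 256 &#xd7; 256 grayscale input yields a 48-channel, 128 &#xd7; 128 map after max-pooling, matching the halving described above.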
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Structure of multi-feature extraction layer.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g002.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>3.2 Feature Fusion Module</title>
<p>There are two general methods for feature fusion: 1) the feature maps are concatenated along the channel dimension, or 2) the feature maps are fused according to certain fusion rules. Choosing the second method would reduce the generality of the model, so this paper adopts the first to obtain the fused feature map. The specific structure is shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. First, the feature maps are concatenated along the channel dimension to obtain the initial fused feature map, which is then filtered by a convolutional layer. Finally, its dimensionality is reduced to produce the final cross-channel fused feature map.</p>
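As a minimal sketch, assuming a 1 &#xd7; 1 convolution performs the cross-channel filtering and dimensionality reduction (the exact kernel size of this step is an assumption here), the rule-free fusion could look like:

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Concatenate two feature maps along channels, then reduce dimensionality."""
    def __init__(self, ch):
        super().__init__()
        # filter the stacked map and reduce it back to `ch` channels
        self.reduce = nn.Conv2d(2 * ch, ch, kernel_size=1)

    def forward(self, f_a, f_b):
        fused = torch.cat([f_a, f_b], dim=1)   # stack along the channel dimension
        return self.reduce(fused)              # cross-channel fused feature map

fuse = FeatureFusion(48)
f = fuse(torch.randn(1, 48, 64, 64), torch.randn(1, 48, 64, 64))
print(f.shape)  # torch.Size([1, 48, 64, 64])
```

Because no hand-picked rule (MAX, SUM, etc.) appears here, the convolution weights that combine the two inputs are learned from data, which is what preserves the model's generality.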
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Feature fusion structure.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g003.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>3.3 Image Reconstruction Module</title>
<p>Under the effect of the pooling layers, the image size shrinks from 256 &#xd7; 256 to 64 &#xd7; 64, which greatly reduces the resolution of the original image and may blur some features. To restore the size of the source image, this paper applies an up-sampling operation (i.e., transposed convolution) to recover the resolution and improve image quality. However, up-sampling alone causes edge information to be dropped and blurred, so this problem is addressed by adding a skip connection on top of the up-sampling operation, which further enhances the image edge information. The module performs three up-sampling operations, each of which doubles the image size, and eventually produces a grayscale image at the original size. The specific up-sampling and skip-connection structure is shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. First, the feature map and the fused feature map are skip-connected, and then up-sampling operations are executed on them. Finally, the high-dimensional map is reduced to a low-dimensional map by convolutional layers.</p>
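One up-sampling step with its skip connection can be sketched as follows; the kernel sizes and channel counts are assumptions for illustration, not the paper's exact values.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One reconstruction step: transposed conv x2, skip concat, channel reduce."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        # stride-2 transposed convolution doubles H and W
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        # after concatenating the encoder (skip) feature, reduce back to out_ch
        self.conv = nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1)

    def forward(self, x, skip):
        x = self.up(x)                      # e.g. 64x64 -> 128x128
        x = torch.cat([x, skip], dim=1)     # skip connection restores edge detail
        return torch.relu(self.conv(x))

up = UpBlock(in_ch=96, skip_ch=48, out_ch=48)
out = up(torch.randn(1, 96, 64, 64), torch.randn(1, 48, 128, 128))
print(out.shape)  # torch.Size([1, 48, 128, 128])
```

Chaining three such blocks doubles the map three times (64 &#x2192; 128 &#x2192; 256), recovering the original 256 &#xd7; 256 size as described above.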
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Up-sampling operations and skip connection structure.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g004.tif"/>
</fig>
</sec>
<sec id="s3-4">
<title>3.4 Loss Function</title>
<p>Before training the model, an appropriate loss function is needed to compare the predicted values with the actual values and optimize the model parameters. The proposed model aims to form a fused image by regression over two input images. Therefore, this paper chooses the structural similarity index (SSIM) (<xref ref-type="bibr" rid="B41">Wang et al., 2004</xref>) to cope with this problem, as shown in the following equation:<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:mi mathvariant="bold-italic">SSIM</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">y</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3bc;</mml:mi>
<mml:mi mathvariant="bold-italic">x</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3bc;</mml:mi>
<mml:mi mathvariant="bold-italic">y</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">C</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">xy</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">C</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">&#x3bc;</mml:mi>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">&#x3bc;</mml:mi>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">C</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mi mathvariant="bold-italic">y</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="bold-italic">C</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>where <italic>x</italic> is the real image, <italic>y</italic> is the predicted image, <inline-formula id="inf1">
<mml:math id="m2">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>x</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf2">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the means, <inline-formula id="inf3">
<mml:math id="m4">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>x</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf4">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c3;</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the standard deviations, and <inline-formula id="inf5">
<mml:math id="m6">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the covariance. <inline-formula id="inf6">
<mml:math id="m7">
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf7">
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> are stabilizing constants. <italic>L</italic> is the dynamic range of pixel values, <inline-formula id="inf8">
<mml:math id="m9">
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.01</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf9">
<mml:math id="m10">
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.03</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. The sliding window size is set as <inline-formula id="inf10">
<mml:math id="m11">
<mml:mrow>
<mml:mn>11</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:mn>11</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, and it moves pixel by pixel across the image, starting from the top-left corner.</p>
<p>Thus, SSIM loss function can be defined as:<disp-formula id="e2">
<mml:math id="m12">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="bold-italic">L</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold-italic">ssim</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="bold-italic">n</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi mathvariant="bold-italic">SSIM</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">y</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>where <italic>n</italic> represents the total number of sliding windows.</p>
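A pure-NumPy sketch of Eqs 1, 2, assuming the 11 &#xd7; 11 window slides pixel by pixel as stated above (the actual PyTorch implementation would compute this on tensors to stay differentiable):

```python
import numpy as np

rng = np.random.default_rng(0)

def ssim_loss(x, y, L=255.0, win=11, k1=0.01, k2=0.03):
    """Mean of (1 - SSIM) over all win x win sliding windows (Eqs 1-2)."""
    C1, C2 = (k1 * L) ** 2, (k2 * L) ** 2          # stabilizing constants
    H, W = x.shape
    losses = []
    for i in range(H - win + 1):                    # window moves pixel by pixel
        for j in range(W - win + 1):
            a = x[i:i + win, j:j + win]
            b = y[i:i + win, j:j + win]
            mu_a, mu_b = a.mean(), b.mean()
            var_a, var_b = a.var(), b.var()
            cov = ((a - mu_a) * (b - mu_b)).mean()
            ssim = ((2 * mu_a * mu_b + C1) * (2 * cov + C2)) / (
                (mu_a ** 2 + mu_b ** 2 + C1) * (var_a + var_b + C2))
            losses.append(1.0 - ssim)               # per-window term of Eq. 2
    return float(np.mean(losses))                   # (1/n) * sum(1 - SSIM)

img = rng.uniform(0, 255, size=(16, 16))
print(ssim_loss(img, img))   # ~0.0 for identical images
```

For identical inputs every window scores SSIM = 1, so the loss is zero; any blur, contrast shift, or structural difference between the prediction and the target raises it.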
<p>All components of the loss function are differentiable; thus, the model parameters can be updated by stochastic gradient descent and back-propagation.</p>
</sec>
<sec id="s3-5">
<title>3.5 Training Dataset</title>
<p>It is well known that CNNs are data-driven, so large-scale image datasets are the basis for favorable performance. <xref ref-type="bibr" rid="B23">Liu et al. (2017)</xref> randomly selected multi-focus images from the ImageNet dataset and blurred the focused images with Gaussian kernels of random scale to generate a dataset of 2 million pairs of 16 &#xd7; 16 images. Since no large-scale multi-exposure image dataset was available, <xref ref-type="bibr" rid="B33">Ram Prabhakar et al. (2017)</xref> randomly cropped 64 &#xd7; 64 segments from small multi-exposure images to generate a multi-exposure dataset.</p>
<p>As mentioned above, current experimental datasets consist mainly of small single-modal image blocks, which cannot fulfill the experimental requirements here. Therefore, multi-focus images, multi-exposure images, and remote sensing images are selected from several datasets to form a training dataset with an image size of 256 &#xd7; 256. The images in the training dataset are randomly rotated, randomly contrast-shifted, and randomly stretched to boost diversity. Parts of the multi-modal images in the dataset are shown in <xref ref-type="sec" rid="s11">Supplementary Figure S2</xref>.</p>
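The three augmentations can be sketched in NumPy as follows; the parameter ranges (rotation by multiples of 90&#xb0;, &#xb1;20% contrast gain, &#xb1;10% stretch) are illustrative assumptions, as the paper does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Random rotation, contrast shift, and stretch on a square patch."""
    # random 90-degree rotation (square images keep their 256x256 size)
    img = np.rot90(img, k=int(rng.integers(0, 4)))
    # random contrast shift: scale deviations around the mean, keep [0, 255]
    gain = rng.uniform(0.8, 1.2)
    img = np.clip((img - img.mean()) * gain + img.mean(), 0, 255)
    # random horizontal stretch via nearest-neighbour column resampling
    s = rng.uniform(0.9, 1.1)
    w = img.shape[1]
    cols = np.clip((np.arange(w) / s).astype(int), 0, w - 1)
    return img[:, cols]

patch = rng.uniform(0, 255, size=(256, 256))
out = augment(patch)
print(out.shape)  # (256, 256)
```

Each call produces a differently transformed 256 &#xd7; 256 patch, so the effective sample space grows without collecting new images.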
</sec>
</sec>
<sec id="s4">
<title>4 Experiments and Results</title>
<sec id="s4-1">
<title>4.1 Experimental Settings</title>
<p>IY-Net is implemented in PyTorch 1.8.1 with Python 3.9.4. The proposed model is trained and tested on a computer equipped with an Intel i5-1035G1 CPU (1&#xa0;GHz) and a 2&#xa0;GB GPU, and training is performed on the CPU. During training, 1826 pairs of images with an image size of 256 &#xd7; 256 are used with a batch size of 40; the whole process takes about 1&#xa0;h. The Adam optimizer (<xref ref-type="bibr" rid="B41">Wang et al., 2004</xref>) is used with the learning rate set to 0.0005.</p>
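The training configuration above can be sketched as follows. The tiny network and MSE loss are stand-ins (the actual model is IY-Net with the SSIM loss of Section 3.4), while the Adam optimizer and learning rate match the stated settings.

```python
import torch
import torch.nn as nn

# toy two-to-one stand-in for IY-Net: two stacked inputs -> one fused output
model = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1))

optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)  # lr as stated
loss_fn = nn.MSELoss()   # stand-in; the paper trains with an SSIM-based loss

a = torch.rand(4, 1, 64, 64)         # two source images (toy batch of 4)
b = torch.rand(4, 1, 64, 64)
target = torch.max(a, b)             # toy regression target

for step in range(3):
    optimizer.zero_grad()
    fused = model(torch.cat([a, b], dim=1))  # two-to-one: stack the inputs
    loss = loss_fn(fused, target)
    loss.backward()                          # back-propagation
    optimizer.step()                         # Adam update at lr = 5e-4
```

Scaled up to the stated 1826 image pairs at 256 &#xd7; 256 with batch size 40, this loop corresponds to the roughly one-hour CPU training run reported above.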
<p>In this paper, the proposed model is compared with traditional multi-scale transform algorithms, i.e., the discrete wavelet transform (DWT) (<xref ref-type="bibr" rid="B49">Zheng et al., 2007</xref>) and the non-subsampled contourlet transform (NSCT) (<xref ref-type="bibr" rid="B29">Moonon and Hu, 2015</xref>). To further validate its advantages in deep learning, it is also compared with three current neural-network-based image fusion models, i.e., the multi-focus image fusion model (MFCNN) (<xref ref-type="bibr" rid="B23">Liu et al., 2017</xref>), the CNN ensemble model for image fusion (ECNN) (<xref ref-type="bibr" rid="B1">Amin-Naji et al., 2019</xref>), and the unsupervised deep model for image fusion (SESF) (<xref ref-type="bibr" rid="B27">Ma et al. 2021</xref>). To verify the generality of the proposed model, five types of datasets (including multi-focus images, infrared and visible images, etc.) are used for experiments and evaluation. The five image test datasets are shown in <xref ref-type="sec" rid="s11">Supplementary Figures S7,S8,S9,S10,S11</xref>.</p>
<p>For evaluation, the paper first judges the visual quality of the fused images qualitatively. However, the performance of different image fusion methods cannot be distinguished by visual effects alone. Therefore, five metrics are introduced to further quantify the performance of IY-Net on multi-modal image fusion: spatial frequency (SF), information entropy (IE), average gradient (AG) (<xref ref-type="bibr" rid="B30">Petrovi&#x107;, 2007</xref>), the Piella index (Piella) (<xref ref-type="bibr" rid="B31">Piella and Heijmans, 2003</xref>), and edge preservation information (Q<sub>AB</sub>) (<xref ref-type="bibr" rid="B44">Xydeas and Petrovic, 2000</xref>).</p>
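Three of these metrics have compact closed forms. The sketch below implements the commonly used definitions of SF (root of squared row and column frequencies), AG (mean local gradient magnitude), and IE (Shannon entropy of the grey-level histogram); the Piella and Q<sub>AB</sub> metrics require edge and structural-similarity machinery and are omitted here. These are the standard textbook formulas, not code from the paper.

```python
import numpy as np

def spatial_frequency(img):
    # SF = sqrt(RF^2 + CF^2), with row/column frequencies from pixel differences.
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))
    return np.sqrt(rf ** 2 + cf ** 2)

def average_gradient(img):
    # AG averages the local gradient magnitude over the image interior.
    dx = np.diff(img, axis=1)[:-1, :]
    dy = np.diff(img, axis=0)[:, :-1]
    return np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2))

def information_entropy(img, levels=256):
    # IE is the Shannon entropy (bits) of the grey-level histogram.
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))
```

Higher values of all three indicate sharper, more information-rich fused images.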
</sec>
<sec id="s4-2">
<title>4.2 Experimental Results and Analysis</title>
<sec id="s4-2-1">
<title>4.2.1 Multi-Focus Image Fusion</title>
<p>Experiments are conducted on the multi-focus image test dataset shown in <xref ref-type="sec" rid="s11">Supplementary Figure S3</xref> to verify that the proposed model performs well in multi-focus image fusion. Take &#x201c;Boy&#x201d;, shown in <xref ref-type="sec" rid="s11">Supplementary Figure S8 (A) and (B)</xref>, as an example. The fusion result of DWT is blurred in some regions and fails to retain complete details and features, whereas the other algorithms capture suitable feature information with better visual effects. <xref ref-type="fig" rid="F5">Figure 5</xref> provides the fusion results of all algorithms on the multi-focus test dataset. Experimental results show that, visually, the proposed model is practical and stable in multi-focus image fusion.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Experiment on 4 pairs of multi-focus images. <bold>(A)</bold> DWT, <bold>(B)</bold> NSCT, <bold>(C)</bold> MFCNN, <bold>(D)</bold> ECNN, <bold>(E)</bold> SESF, <bold>(F)</bold> IY-Net.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g005.tif"/>
</fig>
</sec>
<sec id="s4-2-2">
<title>4.2.2 Infrared and Visible Image Fusion</title>
<p>As shown in <xref ref-type="sec" rid="s11">Supplementary Figure S4</xref>, four groups of infrared and visible images cover different scenes. Experiments are carried out on them to confirm the capability of IY-Net in infrared and visible image fusion. For brevity, &#x201c;Car&#x201d; in <xref ref-type="sec" rid="s11">Supplementary Figure S9</xref> is analyzed in detail. DWT largely preserves the infrared and visible features, but the fused image has relatively low contrast. MFCNN fails to capture the infrared features, and its visual effect is weak. NSCT, ECNN, and SESF produce large areas of dark spots and shadows, yielding undesired results. In contrast, IY-Net achieves the most legible fusion result, with abundant visible details and infrared features, as shown in <xref ref-type="sec" rid="s11">Supplementary Figure S9 (H)</xref>. A similar situation occurs in <xref ref-type="fig" rid="F6">Figure 6</xref>, which is obtained from the images in <xref ref-type="sec" rid="s11">Supplementary Figure S4</xref>. Overall, IY-Net not only yields the best visual effect but also shows evident stability and adaptability in infrared and visible image fusion.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Experiment on 4 pairs of infrared and visible images. <bold>(A)</bold> DWT, <bold>(B)</bold> NSCT, <bold>(C)</bold> MFCNN, <bold>(D)</bold> ECNN, <bold>(E)</bold> SESF, <bold>(F)</bold> IY-Net.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g006.tif"/>
</fig>
</sec>
<sec id="s4-2-3">
<title>4.2.3 Infrared Intensity and Polarization Image Fusion</title>
<p>
<xref ref-type="sec" rid="s11">Supplementary Figure S5</xref> shows four pairs of infrared intensity and polarization images used to check the performance of the proposed model. A representative group of results, for &#x201c;SUV&#x201d;, is presented in <xref ref-type="sec" rid="s11">Supplementary Figure S10</xref>, together with the source polarization and infrared intensity images. From the experimental results, DWT maintains the polarization and intensity information, but some regions are obscured, resulting in poor visual quality. MFCNN cannot fuse the source images effectively at all. ECNN and SESF combine the polarization and intensity information only in parts of the scene and generate many pixel blocks and black spots, which seriously degrade the overall visual perception. In contrast, IY-Net and NSCT integrate the two kinds of images well, showing that both can be employed effectively for infrared intensity and polarization image fusion. The remaining fusion results are shown in <xref ref-type="fig" rid="F7">Figure 7</xref>. The experiments demonstrate that MFCNN, ECNN, and SESF fail to fuse infrared intensity and polarization images in dark environments, and in bright environments they produce image distortion and partially blurred textures. NSCT and IY-Net, however, adapt to infrared intensity and polarization image fusion in different environments.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Experiment on 4 pairs of infrared intensity and polarization images. <bold>(A)</bold> DWT, <bold>(B)</bold> NSCT, <bold>(C)</bold> MFCNN, <bold>(D)</bold> ECNN, <bold>(E)</bold> SESF, <bold>(F)</bold> IY-Net.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g007.tif"/>
</fig>
</sec>
<sec id="s4-2-4">
<title>4.2.4 Multi-Exposure Image Fusion</title>
<p>Furthermore, fusion experiments are conducted on the multi-exposure images shown in <xref ref-type="sec" rid="s11">Supplementary Figure S6</xref> to evaluate the capability of the proposed model. The source &#x201c;Computer&#x201d; images, shown in <xref ref-type="sec" rid="s11">Supplementary Figure S4 (A) and (B)</xref>, are the high- and low-exposure inputs, and <xref ref-type="sec" rid="s11">Supplementary Figure S4 (C)&#x2010;(H)</xref> shows the fusion results of all algorithms. DWT keeps the source image features, but the curtain region is ambiguous. The fused results of NSCT, ECNN, and SESF contain numerous black spots, and MFCNN fails to preserve some details. In contrast, IY-Net preserves the full texture and achieves a good visual effect in multi-exposure image fusion. The fusion results for the whole test dataset are shown in <xref ref-type="fig" rid="F8">Figure 8</xref>. DWT generates blurred textures in some regions. NSCT, MFCNN, and ECNN handle fusion effectively in dark environments but fail for images with bright information. SESF performs poorly in both environments; for example, its fused images contain extensive black spots and distorted textures. In contrast to these reference algorithms, the proposed model is well suited for multi-exposure image fusion, and its results show clearer features and appropriate visual perception.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Experiment on 4 pairs of multi-exposure images. <bold>(A)</bold> DWT, <bold>(B)</bold> NSCT, <bold>(C)</bold> MFCNN, <bold>(D)</bold> ECNN, <bold>(E)</bold> SESF, <bold>(F)</bold> IY-Net.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g008.tif"/>
</fig>
</sec>
<sec id="s4-2-5">
<title>4.2.5 Remote Sensing Image Fusion</title>
<p>Finally, this paper evaluates the performance of the proposed model in remote sensing image fusion; the test dataset is shown in <xref ref-type="sec" rid="s11">Supplementary Figure S7</xref>. The source &#x201c;Building&#x201d; images are shown in <xref ref-type="sec" rid="s11">Supplementary Figures S12 (A) and (B)</xref>, and <xref ref-type="sec" rid="s11">Supplementary Figures S12 (C)&#x2010;(H)</xref> show the fusion results of all algorithms. DWT, ECNN, SESF, and NSCT retain most of the detailed features, but some small details are vague. MFCNN and IY-Net detect textures and details completely; nevertheless, IY-Net has higher contrast and more distinct intensity information than MFCNN, and thus the better visual effect for remote sensing image fusion. The other fusion results are shown in <xref ref-type="fig" rid="F9">Figure 9</xref>. The experiments reveal that DWT exhibits texture distortion, while NSCT has excessively high contrast that obscures some feature information. MFCNN conveys the feature information of only a single source image, and ECNN and SESF show many local shadows and black spots. The proposed model, by contrast, offers a good visual effect and proper contrast.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Experiment on 4 pairs of remote sensing images. <bold>(A)</bold> DWT, <bold>(B)</bold> NSCT, <bold>(C)</bold> MFCNN, <bold>(D)</bold> ECNN, <bold>(E)</bold> SESF, <bold>(F)</bold> IY-Net.</p>
</caption>
<graphic xlink:href="fbioe-10-923364-g009.tif"/>
</fig>
</sec>
</sec>
<sec id="s4-3">
<title>4.3 Quantitative Comparison and Discussion</title>
<p>
<xref ref-type="table" rid="T1">Table 1</xref>, <xref ref-type="table" rid="T2">Table 2</xref>, <xref ref-type="table" rid="T3">Table 3</xref>, <xref ref-type="table" rid="T4">Table 4</xref>, and <xref ref-type="table" rid="T5">Table 5</xref> show the quantitative metrics corresponding to the above multi-modal image fusion results, respectively. In these tables, each value is the average measured value over the corresponding test dataset, and the best values are bolded. Combined with the subjective visual comparison, these metrics reveal the fusion performance of all algorithms fairly and objectively. As shown in <xref ref-type="table" rid="T1">Table 1</xref>, IY-Net attains the best Piella value, which indicates that its fused images are more highly correlated with the source images than those of the reference algorithms. Although the proposed model does not achieve the optimal values of the other metrics, the values it attains are acceptable.</p>
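The tabulation procedure just described can be sketched in a few lines: compute each metric per image pair, average over the dataset, and bold the best method per metric. The method names and scores below are toy placeholders, not values from the paper.

```python
import numpy as np

# Per-image SF scores for three methods over a toy two-image dataset
# (illustrative numbers only; see the tables for the paper's actual values).
scores = {
    "DWT": [21.4, 20.9],
    "NSCT": [27.7, 27.5],
    "IY-Net": [22.3, 22.1],
}

# Each table cell is the dataset average; the per-metric best is bolded.
averages = {m: float(np.mean(v)) for m, v in scores.items()}
best = max(averages, key=averages.get)  # method to bold for this metric
```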
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Quantitative evaluation results of multi-focus image fusion. </p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Metrics</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SF</td>
<td align="char" char=".">21.41603</td>
<td align="char" char=".">27.6822</td>
<td align="char" char=".">27.5599</td>
<td align="char" char=".">
<bold>29.5400</bold>
</td>
<td align="char" char=".">29.4076</td>
<td align="char" char=".">22.3491</td>
</tr>
<tr>
<td align="left">AG</td>
<td align="char" char=".">7.066</td>
<td align="char" char=".">9.5068</td>
<td align="char" char=".">9.3801</td>
<td align="char" char=".">9.6744</td>
<td align="char" char=".">
<bold>9.7212</bold>
</td>
<td align="char" char=".">8.2074</td>
</tr>
<tr>
<td align="left">IE</td>
<td align="char" char=".">7.4132</td>
<td align="char" char=".">7.4845</td>
<td align="char" char=".">7.4694</td>
<td align="char" char=".">
<bold>7.4783</bold>
</td>
<td align="char" char=".">7.4713</td>
<td align="char" char=".">7.4591</td>
</tr>
<tr>
<td align="left">Q<sub>AB</sub>
</td>
<td align="char" char=".">0.5012</td>
<td align="char" char=".">0.7267</td>
<td align="char" char=".">
<bold>0.7430</bold>
</td>
<td align="char" char=".">0.7296</td>
<td align="char" char=".">0.7212</td>
<td align="char" char=".">0.6880</td>
</tr>
<tr>
<td align="left">Piella</td>
<td align="char" char=".">0.0076</td>
<td align="char" char=".">0.0062</td>
<td align="char" char=".">0.0065</td>
<td align="char" char=".">0.0064</td>
<td align="char" char=".">0.0072</td>
<td align="char" char=".">
<bold>0.0090</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Quantitative evaluation results of infrared and visible image fusion.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Metrics</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SF</td>
<td align="char" char=".">8.1647</td>
<td align="char" char=".">12.7831</td>
<td align="char" char=".">9.5506</td>
<td align="char" char=".">18.3357</td>
<td align="char" char=".">
<bold>24.9147</bold>
</td>
<td align="char" char=".">12.5291</td>
</tr>
<tr>
<td align="left">AG</td>
<td align="char" char=".">3.0915</td>
<td align="char" char=".">5.0239</td>
<td align="char" char=".">3.6153</td>
<td align="char" char=".">5.4813</td>
<td align="char" char=".">
<bold>7.2602</bold>
</td>
<td align="char" char=".">4.8389</td>
</tr>
<tr>
<td align="left">IE</td>
<td align="char" char=".">6.4426</td>
<td align="char" char=".">7.166</td>
<td align="char" char=".">6.6088</td>
<td align="char" char=".">7.1048</td>
<td align="char" char=".">
<bold>7.3101</bold>
</td>
<td align="char" char=".">6.8087</td>
</tr>
<tr>
<td align="left">Q<sub>AB</sub>
</td>
<td align="char" char=".">0.328</td>
<td align="char" char=".">0.5085</td>
<td align="char" char=".">0.4563</td>
<td align="char" char=".">
<bold>0.5811</bold>
</td>
<td align="char" char=".">0.5695</td>
<td align="char" char=".">0.451</td>
</tr>
<tr>
<td align="left">Piella</td>
<td align="char" char=".">0.0064</td>
<td align="char" char=".">0.0043</td>
<td align="char" char=".">
<bold>0.0189</bold>
</td>
<td align="char" char=".">0.0052</td>
<td align="char" char=".">0.0129</td>
<td align="char" char=".">0.0065</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Quantitative evaluation results of infrared intensity and polarization image fusion.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Metrics</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SF</td>
<td align="char" char=".">8.8792</td>
<td align="char" char=".">13.7983</td>
<td align="char" char=".">11.1476</td>
<td align="char" char=".">17.0852</td>
<td align="char" char=".">
<bold>19.5567</bold>
</td>
<td align="char" char=".">14.2838</td>
</tr>
<tr>
<td align="left">AG</td>
<td align="char" char=".">3.0686</td>
<td align="char" char=".">5.1073</td>
<td align="char" char=".">3.9309</td>
<td align="char" char=".">5.4514</td>
<td align="char" char=".">
<bold>6.1073</bold>
</td>
<td align="char" char=".">5.1259</td>
</tr>
<tr>
<td align="left">IE</td>
<td align="char" char=".">6.4152</td>
<td align="char" char=".">
<bold>7.1872</bold>
</td>
<td align="char" char=".">5.987</td>
<td align="char" char=".">6.2169</td>
<td align="char" char=".">6.4883</td>
<td align="char" char=".">6.9192</td>
</tr>
<tr>
<td align="left">Q<sub>AB</sub>
</td>
<td align="char" char=".">0.3268</td>
<td align="char" char=".">0.528</td>
<td align="char" char=".">0.5246</td>
<td align="char" char=".">0.6061</td>
<td align="char" char=".">
<bold>0.6156</bold>
</td>
<td align="char" char=".">0.4627</td>
</tr>
<tr>
<td align="left">Piella</td>
<td align="char" char=".">0.0073</td>
<td align="char" char=".">0.0042</td>
<td align="char" char=".">
<bold>0.0358</bold>
</td>
<td align="char" char=".">0.0242</td>
<td align="char" char=".">0.0241</td>
<td align="char" char=".">0.0049</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Quantitative evaluation results of multi-exposure image fusion.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Metrics</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SF</td>
<td align="char" char=".">15.5429</td>
<td align="char" char=".">23.285</td>
<td align="char" char=".">19.9213</td>
<td align="char" char=".">29.3432</td>
<td align="char" char=".">
<bold>30.4046</bold>
</td>
<td align="char" char=".">22.0839</td>
</tr>
<tr>
<td align="left">AG</td>
<td align="char" char=".">5.4503</td>
<td align="char" char=".">8.7552</td>
<td align="char" char=".">6.8542</td>
<td align="char" char=".">9.6245</td>
<td align="char" char=".">
<bold>9.8643</bold>
</td>
<td align="char" char=".">8.0654</td>
</tr>
<tr>
<td align="left">IE</td>
<td align="char" char=".">7.1778</td>
<td align="char" char=".">7.2096</td>
<td align="char" char=".">7.1206</td>
<td align="char" char=".">
<bold>7.3695</bold>
</td>
<td align="char" char=".">7.2344</td>
<td align="char" char=".">7.2672</td>
</tr>
<tr>
<td align="left">Q<sub>AB</sub>
</td>
<td align="char" char=".">0.4376</td>
<td align="char" char=".">0.7668</td>
<td align="char" char=".">0.6826</td>
<td align="char" char=".">
<bold>0.7916</bold>
</td>
<td align="char" char=".">0.7453</td>
<td align="char" char=".">0.7074</td>
</tr>
<tr>
<td align="left">Piella</td>
<td align="char" char=".">0.0048</td>
<td align="char" char=".">0.0027</td>
<td align="char" char=".">
<bold>0.0103</bold>
</td>
<td align="char" char=".">0.0036</td>
<td align="char" char=".">0.0042</td>
<td align="char" char=".">0.0037</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Quantitative evaluation results of remote sensing image fusion.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Metrics</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SF</td>
<td align="char" char=".">24.9828</td>
<td align="char" char=".">31.6788</td>
<td align="char" char=".">25.4795</td>
<td align="char" char=".">34.3604</td>
<td align="char" char=".">
<bold>36.9394</bold>
</td>
<td align="char" char=".">30.3675</td>
</tr>
<tr>
<td align="left">AG</td>
<td align="char" char=".">9.7509</td>
<td align="char" char=".">12.6092</td>
<td align="char" char=".">10.1474</td>
<td align="char" char=".">12.9351</td>
<td align="char" char=".">
<bold>13.7295</bold>
</td>
<td align="char" char=".">11.9796</td>
</tr>
<tr>
<td align="left">IE</td>
<td align="char" char=".">7.0700</td>
<td align="char" char=".">
<bold>7.2978</bold>
</td>
<td align="char" char=".">6.7970</td>
<td align="char" char=".">6.8664</td>
<td align="char" char=".">6.9975</td>
<td align="char" char=".">6.9814</td>
</tr>
<tr>
<td align="left">Q<sub>AB</sub>
</td>
<td align="char" char=".">0.4699</td>
<td align="char" char=".">0.6895</td>
<td align="char" char=".">0.6557</td>
<td align="char" char=".">
<bold>0.7131</bold>
</td>
<td align="char" char=".">0.7049</td>
<td align="char" char=".">0.6580</td>
</tr>
<tr>
<td align="left">Piella</td>
<td align="char" char=".">0.0063</td>
<td align="char" char=".">0.0048</td>
<td align="char" char=".">
<bold>0.0106</bold>
</td>
<td align="char" char=".">0.0054</td>
<td align="char" char=".">0.0069</td>
<td align="char" char=".">0.0103</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>As the objective metrics in <xref ref-type="table" rid="T2">Table 2</xref> show, SESF attains the best SF, AG, and IE values, while ECNN attains the best Q<sub>AB</sub> value. However, their fused images present undesirable visual effects, as shown in <xref ref-type="fig" rid="F6">Figure 6</xref>. Although the corresponding values of the proposed model are not optimal, they are entirely acceptable, especially when combined with the visual quality of its fusion results, which retain rich detail and feature information from the source images.</p>
<p>Similar to the objective values in <xref ref-type="table" rid="T1">Tables 1</xref> and <xref ref-type="table" rid="T2">2</xref>, although SESF attains the best SF, AG, and Q<sub>AB</sub> values in <xref ref-type="table" rid="T3">Table 3</xref>, these are again caused mainly by the unreasonable distortion shown in <xref ref-type="fig" rid="F7">Figure 7</xref>; similar situations occur for DWT, MFCNN, and ECNN. Even though NSCT achieves a visual effect similar to the proposed model, its SF, AG, and Piella values are lower than those of IY-Net, which indicates that the proposed model produces sharper images with richer edge information that are highly correlated with the source images.</p>
<p>In <xref ref-type="table" rid="T4">Table 4</xref>, although the best SF and AG values are attained by SESF and the best Q<sub>AB</sub> and IE values by ECNN, these result from the distorted and discordant fusion results shown in <xref ref-type="fig" rid="F8">Figure 8</xref>. In contrast to these reference algorithms, the proposed model is consistently stable in its fusion results, and its objective metrics remain acceptable even though IY-Net does not lead on every metric.</p>
<p>Similar to <xref ref-type="table" rid="T4">Table 4</xref>, SESF and ECNN in <xref ref-type="table" rid="T5">Table 5</xref> also produce inflated SF, AG, and Q<sub>AB</sub> values caused by partial loss and distortion of image edge information. NSCT achieves a high IE value because some of its fusion results contain redundant feature information. Unlike these reference algorithms, the proposed model provides excellent visual perception together with sound objective values.</p>
<p>In addition to the visual analysis and the objective evaluation metrics discussed above, the average running time is an important indicator of algorithm performance. <xref ref-type="table" rid="T6">Table 6</xref> lists the average running times of all algorithms, with the shortest value bolded. The average running time of IY-Net is clearly the best among the compared methods, and the proposed neural network model is at least 94% faster than the reference network algorithms. In general, the proposed model has a significant advantage in average running time compared to these reference algorithms.</p>
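A running-time comparison of this kind can be measured as follows. This is a generic wall-clock timing sketch, not the paper's benchmarking code: `fuse` stands in for any fusion method, and the pixel-wise average used in the demo is a placeholder, not one of the compared algorithms.

```python
import time
import numpy as np

def average_runtime(fuse, pairs, repeats=3):
    """Mean wall-clock fusion time per image pair, averaged over `repeats`
    runs to smooth out timer noise (illustrative sketch)."""
    per_pair = []
    for a, b in pairs:
        t0 = time.perf_counter()
        for _ in range(repeats):
            fuse(a, b)
        per_pair.append((time.perf_counter() - t0) / repeats)
    return sum(per_pair) / len(per_pair)

# Toy stand-in for a fusion method: pixel-wise average of the two inputs.
rng = np.random.default_rng(0)
pairs = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(4)]
t = average_runtime(lambda a, b: (a + b) / 2, pairs)
```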
<table-wrap id="T6" position="float">
<label>TABLE 6</label>
<caption>
<p>Average running time of various algorithms (Time unit: second).</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="center">DWT</th>
<th align="center">NSCT</th>
<th align="center">MFCNN</th>
<th align="center">ECNN</th>
<th align="center">SESF</th>
<th align="center">IY-Net</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Runtime</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">2.025</td>
<td align="char" char=".">0.38</td>
<td align="char" char=".">0.34</td>
<td align="char" char=".">0.31</td>
<td align="char" char=".">
<bold>0.16</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold indicates best values.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>Although the reference algorithms yield the best metrics for some modalities, most of these values are inflated by the incongruous texture features in their fusion results, and the algorithms lack generality and stability across different image patterns. For example, MFCNN, SESF, and ECNN achieve acceptable visual effects only in multi-focus image fusion, and DWT yields favorable visual effects only in multi-exposure image fusion. NSCT, despite producing valuable visual effects in infrared intensity and polarization image fusion and in multi-focus image fusion, is likewise limited in generality. In contrast, IY-Net attains reasonable and acceptable quantitative metrics, shows significant strengths in the visual quality of multi-modal image fusion, and runs much faster than the reference algorithms, revealing that the proposed model offers strong generality, stability, and speed. The quantitative analysis and running-time comparison make clear that IY-Net achieves outstanding results in certain aspects, though further progress remains to be made.</p>
</sec>
</sec>
<sec id="s5">
<title>5 Conclusion</title>
<p>In this paper, a general CNN framework for image fusion is proposed. Compared to current image fusion models, the proposed model has three main advantages: 1) being fully convolutional, it can be trained end-to-end without pre-processing; 2) although the training dataset comprises multi-modal images, the fused images not only have outstanding visual effects but are also unaffected by the other modalities; 3) its structure is similar to MSTIF, and hence it has outstanding generality in multi-modal image fusion. In summary, IY-Net surpasses several traditional multi-scale algorithms and existing neural network image fusion methods in terms of generality.</p>
<p>Through numerous fusion experiments, the proposed model provides the best visual effects among the compared algorithms, but its quantitative metrics are slightly inadequate, and several problems remain to be solved to obtain a better-performing image fusion model. First, the training dataset is small, and enlarging it may improve model performance. Second, the proposed model contains only three multiple-feature-extraction layers, a relatively simple design, and a deeper network structure could enhance its effectiveness. Third, the loss functions of the model are relatively simple, and constructing more elaborate, better-optimized loss functions may enhance the stability and adaptability of the model.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>, further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>PZ provided the algorithmic ideas and theoretical analysis. WO performed the data processing and manuscript editing. YG guided the writing of the manuscript. All authors read and contributed to the manuscript.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This work was supported by grants from the National Natural Science Foundation of China (Grant No: 61901310, E080703, 51778509).</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbioe.2022.923364/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbioe.2022.923364/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Presentation1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Amin-Naji</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Aghagolzadeh</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ezoji</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Ensemble of CNN for Multi-Focus Image Fusion</article-title>. <source>Inf. fusion</source> <volume>51</volume>, <fpage>201</fpage>&#x2013;<lpage>214</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2019.02.003</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2019.02.003">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Ensemble+of+CNN+for+Multi-Focus+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B2">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Amin-Naji</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Aghagolzadeh</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mahdavinataj</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>).<article-title>Fast Multi Focus Image Fusion Using Determinant</article-title>, <conf-name>2022 International Conference on Machine Vision and Image Processing (MVIP)</conf-name>. <publisher-name>IEEE</publisher-name>, <fpage>1</fpage>&#x2013;<lpage>6</lpage>. <conf-loc>Ahvaz, Iran, Islamic Republic of</conf-loc>, <conf-date>23-24 Feb. 2022</conf-date>. <pub-id pub-id-type="doi">10.1109/MVIP53647.2022.9738555</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/MVIP53647.2022.9738555">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Fast+Multi+Focus+Image+Fusion+Using+Determinant&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Quadtree-based Multi-Focus Image Fusion Using a Weighted Focus-Measure</article-title>. <source>Inf. Fusion</source> <volume>22</volume>, <fpage>105</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2014.05.003</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2014.05.003">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Quadtree-based+Multi-Focus+Image+Fusion+Using+a+Weighted+Focus-Measure&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bhalla</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Koundal</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Bhatia</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Khalid Imam Rahmani</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Tahir</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Fusion of Infrared and Visible Images Using Fuzzy Based Siamese Convolutional Network</article-title>. <source>Comput. Mater. Contin.</source> <volume>70</volume>, <fpage>5503</fpage>&#x2013;<lpage>5518</lpage>. <pub-id pub-id-type="doi">10.32604/cmc.2022.021125</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.32604/cmc.2022.021125">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Fusion+of+Infrared+and+Visible+Images+Using+Fuzzy+Based+Siamese+Convolutional+Network&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2022a</year>). <article-title>Identifying Emergence Process of Group Panic Buying Behavior under the COVID-19 Pandemic</article-title>. <source>J. Retail. Consumer Serv.</source> <volume>67</volume>, <fpage>102970</fpage>. <pub-id pub-id-type="doi">10.1016/j.jretconser.2022.102970</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.jretconser.2022.102970">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Identifying+Emergence+Process+of+Group+Panic+Buying+Behavior+under+the+COVID-19+Pandemic&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Evolutionary Game of Multi-Subjects in Live Streaming and Governance Strategies Based on Social Preference Theory during the COVID-19 Pandemic</article-title>. <source>Mathematics</source> <volume>9</volume> (<issue>21</issue>), <fpage>2743</fpage>. <pub-id pub-id-type="doi">10.3390/math9212743</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/math9212743">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Evolutionary+Game+of+Multi-Subjects+in+Live+Streaming+and+Governance+Strategies+Based+on+Social+Preference+Theory+during+the+COVID-19+Pandemic&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Qiu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2022b</year>). <article-title>Analysis of Effects on the Dual Circulation Promotion Policy for Cross-Border E-Commerce B2B Export Trade Based on System Dynamics during COVID-19</article-title>. <source>Systems</source> <volume>10</volume> (<issue>1</issue>), <fpage>13</fpage>. <pub-id pub-id-type="doi">10.3390/systems10010013</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/systems10010013">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Analysis+of+Effects+on+the+Dual+Circulation+Promotion+Policy+for+Cross-Border+E-Commerce+B2B+Export+Trade+Based+on+System+Dynamics+during+COVID-19&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Rong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2022c</year>). <article-title>Modeling Rumor Diffusion Process with the Consideration of Individual Heterogeneity: Take the Imported Food Safety Issue as an Example during the COVID-19 Pandemic</article-title>. <source>Front. Public Health</source> <volume>10</volume>, <fpage>781691</fpage>. <pub-id pub-id-type="doi">10.3389/fpubh.2022.781691</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/35330754/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fpubh.2022.781691">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Modeling+Rumor+Diffusion+Process+with+the+Consideration+of+Individual+Heterogeneity:+Take+the+Imported+Food+Safety+Issue+as+an+Example+during+the+COVID-19+Pandemic&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Cong</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>Modeling Multi-Dimensional Public Opinion Process Based on Complex Network Dynamics Model in the Context of Derived Topics</article-title>. <source>Axioms</source> <volume>10</volume> (<issue>4</issue>), <fpage>270</fpage>. <pub-id pub-id-type="doi">10.3390/axioms10040270</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/axioms10040270">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Modeling+Multi-Dimensional+Public+Opinion+Process+Based+on+Complex+Network+Dynamics+Model+in+the+Context+of+Derived+Topics&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Gesture Recognition Based on Surface Electromyography-Feature Image</article-title>. <source>Concurrency Comput. Pract. Exp.</source> <volume>33</volume> (<issue>6</issue>), <fpage>e6051</fpage>. <pub-id pub-id-type="doi">10.1002/cpe.6051</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1002/cpe.6051">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Gesture+Recognition+Based+on+Surface+Electromyography-Feature+Image&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>&#xc7;il</surname>
<given-names>Z. A.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Mete</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>&#xd6;zceylan</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Mathematical Model and Bee Algorithms for Mixed-Model Assembly Line Balancing Problem with Physical Human&#x2013;Robot Collaboration</article-title>. <source>Appl. Soft Comput.</source> <volume>93</volume>, <fpage>106394</fpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2020.106394</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.asoc.2020.106394">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Mathematical+Model+and+Bee+Algorithms+for+Mixed-Model+Assembly+Line+Balancing+Problem+with+Physical+Human&#x2013;Robot+Collaboration&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>An Effective Improved Co-evolution Ant Colony Optimisation Algorithm with Multi-Strategies and its Application</article-title>. <source>Int. J. Bio-Inspired Comput.</source> <volume>16</volume> (<issue>3</issue>), <fpage>158</fpage>&#x2013;<lpage>170</lpage>. <pub-id pub-id-type="doi">10.1504/ijbic.2020.10033314</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1504/ijbic.2020.10033314">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=An+Effective+Improved+Co-evolution+Ant+Colony+Optimisation+Algorithm+with+Multi-Strategies+and+its+Application&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elbes</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Alzubi</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kanan</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Al-Fuqaha</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hawashin</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Survey on Particle Swarm Optimization with Emphasis on Engineering and Network Applications</article-title>. <source>Evol. Intel.</source> <volume>12</volume> (<issue>2</issue>), <fpage>113</fpage>&#x2013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1007/s12065-019-00210-z</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s12065-019-00210-z">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=A+Survey+on+Particle+Swarm+Optimization+with+Emphasis+on+Engineering+and+Network+Applications&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haghighat</surname>
<given-names>M. B. A.</given-names>
</name>
<name>
<surname>Aghagolzadeh</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Seyedarabi</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Multi-focus Image Fusion for Visual Sensor Networks in DCT Domain</article-title>. <source>Comput. Electr. Eng.</source> <volume>37</volume> (<issue>5</issue>), <fpage>789</fpage>&#x2013;<lpage>797</lpage>. <pub-id pub-id-type="doi">10.1016/j.compeleceng.2011.04.016</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.compeleceng.2011.04.016">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Multi-focus+Image+Fusion+for+Visual+Sensor+Networks+in+DCT+Domain&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Hao</surname>
<given-names>Z.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>Multi-scale Feature Fusion Convolutional Neural Network for Indoor Small Target Detection</article-title>. <source>Front. Neurorobotics</source> <volume>85</volume>, <fpage>881021</fpage>. <pub-id pub-id-type="doi">10.3389/fnbot.2022.881021</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fnbot.2022.881021">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Multi-scale+Feature+Fusion+Convolutional+Neural+Network+for+Indoor+Small+Target+Detection&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hao</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Detection Algorithm of Safety Helmet Wearing Based on Deep Learning</article-title>. <source>Concurrency Comput. Pract. Exp.</source> <volume>33</volume> (<issue>13</issue>), <fpage>e6234</fpage>. <pub-id pub-id-type="doi">10.1002/cpe.6234</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1002/cpe.6234">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Detection+Algorithm+of+Safety+Helmet+Wearing+Based+on+Deep+Learning&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Jing</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Evaluation of Focus Measures in Multi-Focus Image Fusion</article-title>. <source>Pattern Recognit. Lett.</source> <volume>28</volume> (<issue>4</issue>), <fpage>493</fpage>&#x2013;<lpage>500</lpage>. <pub-id pub-id-type="doi">10.1016/j.patrec.2006.09.005</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.patrec.2006.09.005">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Evaluation+of+Focus+Measures+in+Multi-Focus+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Manipulator Grabbing Position Detection with Information Fusion of Color Image and Depth Image Using Deep Learning</article-title>. <source>J. Ambient. Intell. Hum. Comput.</source> <volume>12</volume> (<issue>12</issue>), <fpage>10809</fpage>&#x2013;<lpage>10822</lpage>. <pub-id pub-id-type="doi">10.1007/s12652-020-02843-w</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s12652-020-02843-w">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Manipulator+Grabbing+Position+Detection+with+Information+Fusion+of+Color+Image+and+Depth+Image+Using+Deep+Learning&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>Semantic Segmentation for Multiscale Target Based on Object Recognition Using the Improved Faster-RCNN Model</article-title>. <source>Future Gener. Comput. Syst.</source> <volume>123</volume>, <fpage>94</fpage>&#x2013;<lpage>104</lpage>. <pub-id pub-id-type="doi">10.1016/j.future.2021.04.019</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.future.2021.04.019">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Semantic+Segmentation+for+Multiscale+Target+Based+on+Object+Recognition+Using+the+Improved+Faster-RCNN+Model&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Gesture Recognition Based on Binocular Vision</article-title>. <source>Clust. Comput.</source> <volume>22</volume> (<issue>6</issue>), <fpage>13261</fpage>&#x2013;<lpage>13271</lpage>. <pub-id pub-id-type="doi">10.1007/s10586-018-1844-5</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10586-018-1844-5">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Gesture+Recognition+Based+on+Binocular+Vision&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lewis</surname>
<given-names>J. J.</given-names>
</name>
<name>
<surname>O&#x2019;Callaghan</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Nikolov</surname>
<given-names>S. G.</given-names>
</name>
<name>
<surname>Bull</surname>
<given-names>D. R.</given-names>
</name>
<name>
<surname>Canagarajah</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Pixel- and Region-Based Image Fusion with Complex Wavelets</article-title>. <source>Inf. Fusion</source> <volume>8</volume> (<issue>2</issue>), <fpage>119</fpage>&#x2013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2005.09.006</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2005.09.006">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Pixel-+and+Region-Based+Image+Fusion+with+Complex+Wavelets&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>PMSC: PatchMatch-Based Superpixel Cut for Accurate Stereo Matching</article-title>. <source>IEEE Trans. Circuits Syst. Video Technol.</source> <volume>28</volume> (<issue>3</issue>), <fpage>679</fpage>&#x2013;<lpage>692</lpage>. <pub-id pub-id-type="doi">10.1109/TCSVT.2016.2628782</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/TCSVT.2016.2628782">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=PMSC:+PatchMatch-Based+Superpixel+Cut+for+Accurate+Stereo+Matching&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Multi-focus Image Fusion with a Deep Convolutional Neural Network</article-title>. <source>Inf. Fusion</source> <volume>36</volume>, <fpage>191</fpage>&#x2013;<lpage>207</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2016.12.001</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2016.12.001">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Multi-focus+Image+Fusion+with+a+Deep+Convolutional+Neural+Network&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z. J.</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2018a</year>). <article-title>Deep Learning for Pixel-Level Image Fusion: Recent Advances and Future Prospects</article-title>. <source>Inf. Fusion</source> <volume>42</volume>, <fpage>158</fpage>&#x2013;<lpage>173</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2017.10.007</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2017.10.007">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Deep+Learning+for+Pixel-Level+Image+Fusion:+Recent+Advances+and+Future+Prospects&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Blasch</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Bhatnagar</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>John</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>R. S.</given-names>
</name>
</person-group> (<year>2018b</year>). <article-title>Fusing Synergistic Information from Multi-Sensor Images: an Overview from Implementation to Performance Assessment</article-title>. <source>Inf. Fusion</source> <volume>42</volume>, <fpage>127</fpage>&#x2013;<lpage>145</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2017.10.010</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2017.10.010">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Fusing+Synergistic+Information+from+Multi-Sensor+Images:+an+Overview+from+Implementation+to+Performance+Assessment&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tsukada</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Hanasaki</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Ho</surname>
<given-names>Y. K.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>Y. P.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Image Fusion by Using Steerable Pyramid</article-title>. <source>Pattern Recognit. Lett.</source> <volume>22</volume> (<issue>9</issue>), <fpage>929</fpage>&#x2013;<lpage>939</lpage>. <pub-id pub-id-type="doi">10.1016/s0167-8655(01)00047-2</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/s0167-8655(01)00047-2">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Image+Fusion+by+Using+Steerable+Pyramid&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ban</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Mukeshimana</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Sesf-fuse: An Unsupervised Deep Model for Multi-Focus Image Fusion</article-title>. <source>Neural Comput. Applic.</source> <volume>33</volume> (<issue>11</issue>), <fpage>5793</fpage>&#x2013;<lpage>5804</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-020-05358-9</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s00521-020-05358-9">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Sesf-fuse:+An+Unsupervised+Deep+Model+for+Multi-Focus+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Perceptual Quality Assessment for Multi-Exposure Image Fusion</article-title>. <source>IEEE Trans. Image Process.</source> <volume>24</volume> (<issue>11</issue>), <fpage>3345</fpage>&#x2013;<lpage>3356</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2015.2442920</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/26068317/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/tip.2015.2442920">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Perceptual+Quality+Assessment+for+Multi-Exposure+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moonon</surname>
<given-names>A.-U.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Multi-focus Image Fusion Based on NSCT and NSST</article-title>. <source>Sens. Imaging</source> <volume>16</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/s11220-015-0106-3</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s11220-015-0106-3">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Multi-focus+Image+Fusion+Based+on+NSCT+and+NSST&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Petrovi&#x107;</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Subjective Tests for Image Fusion Evaluation and Objective Metric Validation</article-title>. <source>Inf. Fusion</source> <volume>8</volume> (<issue>2</issue>), <fpage>208</fpage>&#x2013;<lpage>216</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2005.10.002</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2005.10.002">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Subjective+Tests+for+Image+Fusion+Evaluation+and+Objective+Metric+Validation&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B31">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Piella</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Heijmans</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>A New Quality Metric for Image Fusion</article-title>, <conf-name>Proceedings 2003 International Conference on Image Processing (Cat. No. 03CH37429)</conf-name>. <publisher-name>IEEE</publisher-name>. <conf-loc>Barcelona, Spain</conf-loc>, <conf-date>14-17 Sep. 2003</conf-date>. <pub-id pub-id-type="doi">10.1109/ICIP.2003.1247209</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICIP.2003.1247209">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=A+New+Quality+Metric+for+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Deep Unsupervised Learning Based on Color Un-referenced Loss Functions for Multi-Exposure Image Fusion</article-title>. <source>Inf. Fusion</source> <volume>66</volume>, <fpage>18</fpage>&#x2013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2020.08.012</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2020.08.012">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Deep+Unsupervised+Learning+Based+on+Color+Un-referenced+Loss+Functions+for+Multi-Exposure+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ram Prabhakar</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Sai Srikar</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Venkatesh Babu</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deepfuse: A Deep Unsupervised Approach for Exposure Fusion with Extreme Exposure Image Pairs</article-title>. <conf-name>Proceedings of the IEEE international conference on computer vision</conf-name>, <fpage>4714</fpage>&#x2013;<lpage>4722</lpage>. <conf-loc>Venice Italy</conf-loc>, <conf-date>22-29 Oct. 2017</conf-date>. <pub-id pub-id-type="doi">10.1109/ICCV.2017.505</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICCV.2017.505">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Deepfuse:+A+Deep+Unsupervised+Approach+for+Exposure+Fusion+with+Extreme+Exposure+Image+Pairs&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saha</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Bhatnagar</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Q. M. J.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Mutual Spectral Residual Approach for Multifocus Image Fusion</article-title>. <source>Digit. Signal Process.</source> <volume>23</volume> (<issue>4</issue>), <fpage>1121</fpage>&#x2013;<lpage>1135</lpage>. <pub-id pub-id-type="doi">10.1016/j.dsp.2013.03.001</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.dsp.2013.03.001">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Mutual+Spectral+Residual+Approach+for+Multifocus+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Song</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Hang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Spatiotemporal Satellite Image Fusion Using Deep Convolutional Neural Networks</article-title>. <source>IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.</source> <volume>11</volume> (<issue>3</issue>), <fpage>821</fpage>&#x2013;<lpage>829</lpage>. <pub-id pub-id-type="doi">10.1109/jstars.2018.2797894</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/jstars.2018.2797894">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Spatiotemporal+Satellite+Image+Fusion+Using+Deep+Convolutional+Neural+Networks&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Tong</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>Low-illumination Image Enhancement Algorithm Based on Improved Multi-Scale Retinex and ABC Algorithm Optimization</article-title>. <source>Front. Bioeng. Biotechnol.</source> <volume>10</volume>, <fpage>396</fpage>. <pub-id pub-id-type="doi">10.3389/fbioe.2022.865820</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fbioe.2022.865820">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Low-illumination+Image+Enhancement+Algorithm+Based+on+Improved+Multi-Scale+Retinex+and+ABC+Algorithm+Optimization&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wan</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Canagarajah</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Achim</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Segmentation-driven Image Fusion Based on Alpha-Stable Modeling of Wavelet Coefficients</article-title>. <source>IEEE Trans. Multimed.</source> <volume>11</volume> (<issue>4</issue>), <fpage>624</fpage>&#x2013;<lpage>633</lpage>. <pub-id pub-id-type="doi">10.1109/tmm.2009.2017640</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/tmm.2009.2017640">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Segmentation-driven+Image+Fusion+Based+on+Alpha-Stable+Modeling+of+Wavelet+Coefficients&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B38">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Tessens</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ledda</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pizurica</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Philips</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Extending the Depth of Field in Microscopy through Curvelet-Based Frequency-Adaptive Image Fusion</article-title>, <conf-name>2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP&#x27;07</conf-name>. <publisher-name>IEEE</publisher-name>, <fpage>I-861</fpage>&#x2013;<lpage>I-864</lpage>. <conf-loc>Honolulu, HI, USA</conf-loc>, <conf-date>15-20 Apr. 2007</conf-date>. <pub-id pub-id-type="doi">10.1109/icassp.2007.366044</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/icassp.2007.366044">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Extending+the+Depth+of+Field+in+Microscopy+through+Curvelet-Based+Frequency-Adaptive+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Adaptive Multi-Focus Image Fusion Using a Wavelet-Based Statistical Sharpness Measure</article-title>. <source>Signal Process.</source> <volume>92</volume> (<issue>9</issue>), <fpage>2137</fpage>&#x2013;<lpage>2146</lpage>. <pub-id pub-id-type="doi">10.1016/j.sigpro.2012.01.027</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.sigpro.2012.01.027">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Adaptive+Multi-Focus+Image+Fusion+Using+a+Wavelet-Based+Statistical+Sharpness+Measure&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Dou</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kemao</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Di</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2019a</year>). <article-title>Y-net: a One-To-Two Deep Learning Framework for Digital Holographic Reconstruction</article-title>. <source>Opt. Lett.</source> <volume>44</volume> (<issue>19</issue>), <fpage>4765</fpage>&#x2013;<lpage>4768</lpage>. <pub-id pub-id-type="doi">10.1364/ol.44.004765</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/31568437/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1364/ol.44.004765">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Y-net:+a+One-To-Two+Deep+Learning+Framework+for+Digital+Holographic+Reconstruction&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Bovik</surname>
<given-names>A. C.</given-names>
</name>
<name>
<surname>Sheikh</surname>
<given-names>H. R.</given-names>
</name>
<name>
<surname>Simoncelli</surname>
<given-names>E. P.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Image Quality Assessment: from Error Visibility to Structural Similarity</article-title>. <source>IEEE Trans. Image Process.</source> <volume>13</volume> (<issue>4</issue>), <fpage>600</fpage>&#x2013;<lpage>612</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2003.819861</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/15376593/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/tip.2003.819861">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Image+Quality+Assessment:+from+Error+Visibility+to+Structural+Similarity&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Retinal Vessel Segmentation Algorithm Based on Residual Convolution Neural Network</article-title>. <source>Front. Bioeng. Biotechnol.</source> <volume>9</volume>, <fpage>786425</fpage>. <pub-id pub-id-type="doi">10.3389/fbioe.2021.786425</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/34957078/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fbioe.2021.786425">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Retinal+Vessel+Segmentation+Algorithm+Based+on+Residual+Convolution+Neural+Network&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B43">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Xue</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Blum</surname>
<given-names>R. S.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Concealed Weapon Detection Using Color Image Fusion</article-title>, <conf-name>Proceedings of the 6th International Conference on Information Fusion</conf-name>. <publisher-name>IEEE</publisher-name>, <fpage>622</fpage>&#x2013;<lpage>627</lpage>. <conf-loc>Cairns, QLD, Australia</conf-loc>, <conf-date>08-11 July 2003</conf-date>. <pub-id pub-id-type="doi">10.1109/ICIF.2003.177504</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICIF.2003.177504">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Concealed+Weapon+Detection+Using+Color+Image+Fusion&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xydeas</surname>
<given-names>C. S.</given-names>
</name>
<name>
<surname>Petrovi&#x107;</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Objective Image Fusion Performance Measure</article-title>. <source>Electron. Lett.</source> <volume>36</volume> (<issue>4</issue>), <fpage>308</fpage>&#x2013;<lpage>309</lpage>. <pub-id pub-id-type="doi">10.1049/el:20000267</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1049/el:20000267">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Objective+Image+Fusion+Performance+Measure&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Single Image Depth Estimation with Normal Guided Scale Invariant Deep Convolutional Fields</article-title>. <source>IEEE Trans. Circuits Syst. Video Technol.</source> <volume>29</volume> (<issue>1</issue>), <fpage>80</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1109/TCSVT.2017.2772892</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/TCSVT.2017.2772892">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Single+Image+Depth+Estimation+with+Normal+Guided+Scale+Invariant+Deep+Convolutional+Fields&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Levine</surname>
<given-names>M. D.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Robust Multi-Focus Image Fusion Using Multi-Task Sparse Representation and Spatial Context</article-title>. <source>IEEE Trans. Image Process.</source> <volume>25</volume> (<issue>5</issue>), <fpage>2045</fpage>&#x2013;<lpage>2058</lpage>. <pub-id pub-id-type="doi">10.1109/tip.2016.2524212</pub-id> <ext-link ext-link-type="uri" xlink:href="https://pubmed.ncbi.nlm.nih.gov/26863661/">PubMed Abstract</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/tip.2016.2524212">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Robust+Multi-Focus+Image+Fusion+Using+Multi-Task+Sparse+Representation+and+Spatial+Context&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>IFCNN: A General Image Fusion Framework Based on Convolutional Neural Network</article-title>. <source>Inf. Fusion</source> <volume>54</volume>, <fpage>99</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2019.07.011</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2019.07.011">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=IFCNN:+A+General+Image+Fusion+Framework+Based+on+Convolutional+Neural+Network&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Bai</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Infrared and Visual Image Fusion through Infrared Feature Extraction and Visual Information Preservation</article-title>. <source>Infrared Phys. Technol.</source> <volume>83</volume> (<issue>1</issue>), <fpage>227</fpage>&#x2013;<lpage>237</lpage>. <pub-id pub-id-type="doi">10.1016/j.infrared.2017.05.007</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.infrared.2017.05.007">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Infrared+and+Visual+Image+Fusion+through+Infrared+Feature+Extraction+and+Visual+Information+Preservation&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Essock</surname>
<given-names>E. A.</given-names>
</name>
<name>
<surname>Hansen</surname>
<given-names>B. C.</given-names>
</name>
<name>
<surname>Haun</surname>
<given-names>A. M.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>A New Metric Based on Extended Spatial Frequency and its Application to DWT Based Fusion Algorithms</article-title>. <source>Inf. Fusion</source> <volume>8</volume> (<issue>2</issue>), <fpage>177</fpage>&#x2013;<lpage>192</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2005.04.003</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2005.04.003">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=A+New+Metric+Based+on+Extended+Spatial+Frequency+and+its+Application+to+DWT+Based+Fusion+Algorithms&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Multi-scale Weighted Gradient-Based Fusion for Multi-Focus Images</article-title>. <source>Inf. Fusion</source> <volume>20</volume>, <fpage>60</fpage>&#x2013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2013.11.005</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2013.11.005">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Multi-scale+Weighted+Gradient-Based+Fusion+for+Multi-Focus+Images&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Perceptual Fusion of Infrared and Visible Images through a Hybrid Multi-Scale Decomposition with Gaussian and Bilateral Filters</article-title>. <source>Inf. Fusion</source> <volume>30</volume> (<issue>c</issue>), <fpage>15</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1016/j.inffus.2015.11.003</pub-id> <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.inffus.2015.11.003">CrossRef Full Text</ext-link> &#x7c; <ext-link ext-link-type="uri" xlink:href="https://scholar.google.com/scholar?hl=en&#x0026;as_sdt=0%2C5&#x0026;q=Perceptual+Fusion+of+Infrared+and+Visible+Images+through+a+Hybrid+Multi-Scale+Decomposition+with+Gaussian+and+Bilateral+Filters&#x0026;btnG=">Google Scholar</ext-link>
</citation>
</ref>
</ref-list>
</back>
</article>