<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Physiol.</journal-id>
<journal-title>Frontiers in Physiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Physiol.</abbrev-journal-title>
<issn pub-type="epub">1664-042X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1290820</article-id>
<article-id pub-id-type="doi">10.3389/fphys.2023.1290820</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Physiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>DRI-Net: segmentation of polyp in colonoscopy images using dense residual-inception network</article-title>
<alt-title alt-title-type="left-running-head">Lan et al.</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fphys.2023.1290820">10.3389/fphys.2023.1290820</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Lan</surname>
<given-names>Xiaoke</given-names>
</name>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Chen</surname>
<given-names>Honghuan</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2435348/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/validation/"/>
<role content-type="https://credit.niso.org/contributor-roles/Writing - review &#x26; editing/"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Jin</surname>
<given-names>Wenbing</given-names>
</name>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/Writing - review &#x26; editing/"/>
</contrib>
</contrib-group>
<aff>
<institution>College of Internet of Things Technology</institution>, <institution>Hangzhou Polytechnic</institution>, <addr-line>Hangzhou</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1115242/overview">Ruizheng Shi</ext-link>, Central South University, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1951545/overview">Yiling Lu</ext-link>, University of Derby, United Kingdom</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1954916/overview">Hui Wang</ext-link>, Hangzhou Cancer Hospital, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1752875/overview">Yucheng Song</ext-link>, Central South University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Honghuan Chen, <email>hhchen@hdu.edu.cn</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>25</day>
<month>10</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1290820</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>09</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>10</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Lan, Chen and Jin.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Lan, Chen and Jin</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Colorectal cancer is a common malignant tumor of the gastrointestinal tract that usually evolves from adenomatous polyps. However, because polyps are similar in color to their surrounding tissue in colonoscopy images and vary widely in size, shape, and texture, intelligent diagnosis still faces great challenges. For this reason, we present a novel dense residual-inception network (DRI-Net) that utilizes U-Net as the backbone. Firstly, in order to increase the width of the network, a modified residual-inception block is designed to replace the traditional convolutional block, thereby improving its capacity and expressiveness. Moreover, a dense connection scheme is adopted to increase the network depth so that more complex feature inputs can be fitted. Finally, an improved down-sampling module is built to reduce the loss of image feature information. For a fair comparison, we validated all methods on the Kvasir-SEG dataset using three popular evaluation metrics. Experimental results consistently show that DRI-Net attains 77.72%, 85.94% and 86.51% on IoU, Mcc and Dice, which are 1.41%, 0.66% and 0.75% higher than the suboptimal model. Ablation studies likewise demonstrate the effectiveness of our approach for colorectal semantic segmentation.</p>
</abstract>
<kwd-group>
<kwd>image segmentation</kwd>
<kwd>colonoscopy</kwd>
<kwd>residual-inception</kwd>
<kwd>dense</kwd>
<kwd>down-sampling</kwd>
</kwd-group>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Computational Physiology and Medicine</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1 Introduction</title>
<p>Cancer has become one of the most serious diseases threatening human health. Owing to genetic, environmental, dietary and other factors, the number of colorectal cancer patients keeps rising, and its mortality rate ranks second among cancers. Research shows that colorectal cancer lesions are closely related to colorectal polyps; therefore, early detection and treatment can effectively control disease progression and reduce mortality. To date, colonoscopy is an effective diagnostic method for detecting polyps in the intestine and has become the gold standard for early screening of colorectal cancer. Although the size, shape and lesions of tumors can be observed visually through colonoscopy, the analysis of the pathological images depends entirely on experienced physicians. This approach not only involves a long examination cycle and high labor intensity, but also relies heavily on the subjective judgment and experience of doctors. Moreover, as the number of patients increases, so does the demand for specialists, which places a heavy burden on the medical workforce. For these reasons, combining computer vision technology with pathological image diagnosis has become extremely important in the medical field.</p>
<p>At present, deep learning performs very well in computer vision, especially in medical image-assisted diagnosis (<xref ref-type="bibr" rid="B8">Dang et al., 2023</xref>; <xref ref-type="bibr" rid="B21">Maria et al., 2023</xref>; <xref ref-type="bibr" rid="B22">Morita et al., 2023</xref>; <xref ref-type="bibr" rid="B31">Yang et al., 2023</xref>; <xref ref-type="bibr" rid="B33">Zhang et al., 2023</xref>). Compared with traditional segmentation frameworks (<xref ref-type="bibr" rid="B27">Srikanth and Bikshalu, 2022</xref>; <xref ref-type="bibr" rid="B5">Chen et al., 2023</xref>), the core advantage of deep learning is that it can independently discover and learn higher-level image features directly from the training data, which greatly reduces the need for hand-crafted feature extraction and enables end-to-end image processing in deep architectures. The convolutional neural network (CNN) (<xref ref-type="bibr" rid="B18">Lecun et al., 1998</xref>) is one of the most popular deep learning models. By introducing local receptive fields, weight sharing, and pooling operations, it greatly improves the generalization ability of the model. However, when such a network is used to assign a label to every pixel, and medical images often contain millions of pixels, processing millions of forward passes takes a great deal of time. In addition, all pixels are classified independently, resulting in spatial inconsistencies in the segmentation results. To solve these problems, <xref ref-type="bibr" rid="B20">Long et al. (2015)</xref> proposed the fully convolutional network (FCN). By replacing the fully connected layers in a CNN with convolutional layers, the spatial information of images can be preserved using the features of different layers together with up-sampling strategies. At the same time, this method can accept input images of any size and is easier to implement than the traditional patch-based classification approach.</p>
<p>Inspired by FCN, many similar network architectures have emerged, mainly improving on dilated convolutions (<xref ref-type="bibr" rid="B19">Liu et al., 2020</xref>; <xref ref-type="bibr" rid="B17">Karthika and Senthilselvi, 2023</xref>), recurrent neural networks (<xref ref-type="bibr" rid="B29">Tan et al., 2021</xref>; <xref ref-type="bibr" rid="B6">Chen J et al., 2022</xref>), multi-scale features (<xref ref-type="bibr" rid="B9">Dourthe et al., 2022</xref>; <xref ref-type="bibr" rid="B11">Goyal et al., 2022</xref>), residual connections (<xref ref-type="bibr" rid="B2">Anil and Dayananda, 2023</xref>; <xref ref-type="bibr" rid="B26">Selvaraj and Nithiyaraj, 2023</xref>) and attention mechanisms (<xref ref-type="bibr" rid="B16">Kanimozhi and Franklin, 2023</xref>; <xref ref-type="bibr" rid="B24">Rasti et al., 2023</xref>). Among them, <xref ref-type="bibr" rid="B30">Tang et al. (2022)</xref> proposed a guidance network for medical image segmentation that learns from and copes with uncertainty end-to-end. Specifically, the method consists of three parts: first, a rough segmentation module produces a coarse segmentation and an uncertainty map; second, a feature refinement module embeds multiple dual-attention blocks to generate the final segmentation; finally, to extract richer context information, a multi-scale feature extractor is inserted between the encoder and decoder of the coarse segmentation module. <xref ref-type="bibr" rid="B28">Sun et al. (2022)</xref> proposed a dual-path CNN with DeepLabV3&#x2b; as the backbone. In this method, soft shape supervision blocks are inserted between the region path and the shape path to realize a cross-path attention mechanism, so as to accurately detect and segment thyroid nodules. <xref ref-type="bibr" rid="B34">Zhang et al. (2022)</xref> proposed a retinal vessel segmentation algorithm based on M-Net. First, to reduce the influence of noise, a dual-attention mechanism based on channel and space was designed. Then, the self-attention mechanism of the Transformer is introduced into the skip connections to re-encode features and explicitly model long-range relationships. <xref ref-type="bibr" rid="B10">Fu et al. (2022)</xref> proposed an automatic segmentation method for cardiac MRI images. On the one hand, CNNs are used for feature extraction and spatial encoding of the inputs. On the other hand, a Transformer is used to add long-range dependencies to high-level features, so that the model&#x2019;s ability to capture details can be fully exploited.</p>
<p>In this research, we propose a new dense residual-inception network (DRI-Net) for the segmentation of colorectal polyps and perform comparative experiments on a public dataset. Compared with other networks, our contributions are as follows:<list list-type="simple">
<list-item>
<p>1) Building on the standard U-Net architecture, DRI-Net is presented to provide guidance for the accurate segmentation of polyps.</p>
</list-item>
<list-item>
<p>2) In DRI-Net, to make the network wider without gradient vanishing, the plain convolutional blocks are replaced with dense residual-inception blocks.</p>
</list-item>
<list-item>
<p>3) The down-sampling layer is carefully redesigned using average-pooling to reduce the loss of image feature information.</p>
</list-item>
<list-item>
<p>4) We conduct ablation studies on the residual-inception, dense and down-sampling modules. Compared with several classical algorithms, our approach achieves better performance.</p>
</list-item>
</list>
</p>
</sec>
<sec sec-type="methods" id="s2">
<title>2 Methods</title>
<p>DRI-Net is a classic encoder-decoder structure, and its overall network is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The encoder on the left includes four dense residual-inception modules, each of which is followed by a pooling layer that down-samples the image. The decoder on the right also contains four dense residual-inception modules, and the resolution is successively increased by up-sampling operations until it matches the resolution of the input image. Skip connections concatenate each up-sampled result with the output of the encoder module at the same resolution, and the concatenation serves as the input to the next decoder module. Finally, the activation function of the last layer is a Sigmoid function that generates binary segmentation results, while the remaining activation functions are linear. In the following, we explain each block in detail.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Proposed DRI-Net architecture.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g001.tif"/>
</fig>
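<p>As a minimal illustration of this encoder-decoder layout, the sketch below assembles a four-level U-Net-style model in Keras (assumed here purely for illustration; the paper does not state its framework). The helpers <monospace>dri_block</monospace> and <monospace>down_sample</monospace> stand for the dense residual-inception module of Sections 2.1 and 2.2 and the improved down-sampling layer of Section 2.3, and the channel widths are illustrative rather than the paper&#x2019;s exact configuration.</p>
<preformat>
# Hedged sketch of the DRI-Net encoder-decoder skeleton (Keras assumed).
# dri_block() and down_sample() are sketched in the following subsections;
# the filter counts below are illustrative, not the paper's configuration.
from tensorflow.keras import Input, Model, layers

def build_dri_net(input_shape=(256, 256, 3), widths=(32, 64, 128, 256)):
    inputs = Input(shape=input_shape)
    skips, x = [], inputs
    for w in widths:                          # encoder: four dense residual-inception levels
        x = dri_block(x, w)
        skips.append(x)
        x = down_sample(x, w)                 # improved down-sampling instead of max-pooling
    x = dri_block(x, widths[-1] * 2)          # bottleneck
    for w, skip in zip(reversed(widths), reversed(skips)):
        x = layers.UpSampling2D()(x)          # restore resolution level by level
        x = layers.Concatenate()([x, skip])   # skip connection from the encoder
        x = dri_block(x, w)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # binary polyp mask
    return Model(inputs, outputs)
</preformat>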
<sec id="s2-1">
<title>2.1 Residual-inception</title>
<p>In deep learning, many algorithms achieve better results simply by deepening or widening neural networks. However, this not only greatly increases the number of parameters and the amount of computation, but also causes problems such as over-fitting, vanishing gradients and insufficient feature diversity. To overcome these difficulties, we propose an improved inception module with multiple convolution kernels of 1 &#xd7; 1, 1 &#xd7; 3, 3 &#xd7; 1 and 3 &#xd7; 3, as shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. With this parallel structure, the weight of each convolution kernel is adjusted adaptively during training, so that the network can adapt to images of different scales. At the same time, the parallel sets of convolution kernels convert fully connected connections into sparse connections, thus improving computational efficiency and extracting more features. However, it is important to note that as the number of convolution kernels increases, the number of parameters also grows. Therefore, each parallel branch first undergoes a 1 &#xd7; 1 convolution to reduce the channel dimension, thereby achieving dimensionality reduction of the feature maps.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The inception block.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g002.tif"/>
</fig>
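<p>A minimal sketch of this modified inception block is given below (Keras assumed). The equal split of the output channels across the four branches is an assumption; only the kernel sizes and the leading 1 &#xd7; 1 dimensionality reduction follow the description above.</p>
<preformat>
# Hedged sketch of the modified inception block (Figure 2): parallel branches
# with 1x1, 1x3, 3x1 and 3x3 kernels, each preceded by a 1x1 convolution that
# reduces the channel dimension. The per-branch filter split is an assumption.
from tensorflow.keras import layers

def inception_block(x, filters):
    f = filters // 4
    b1 = layers.Conv2D(f, (1, 1), padding="same")(x)       # 1x1 branch
    b2 = layers.Conv2D(f, (1, 1), padding="same")(x)
    b2 = layers.Conv2D(f, (1, 3), padding="same")(b2)      # 1x3 branch
    b3 = layers.Conv2D(f, (1, 1), padding="same")(x)
    b3 = layers.Conv2D(f, (3, 1), padding="same")(b3)      # 3x1 branch
    b4 = layers.Conv2D(f, (1, 1), padding="same")(x)
    b4 = layers.Conv2D(f, (3, 3), padding="same")(b4)      # 3x3 branch
    return layers.Concatenate()([b1, b2, b3, b4])          # widen the network
</preformat>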
<p>To further improve the feature extraction capability of the network, a residual-inception block is designed, as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. Firstly, the proposed residual-inception structure connects the input and output of the inception layer and of the 1 &#xd7; 1 convolution layer, respectively. These connections are sequential overall, the two connected network layers are close together, and each connection spans only a single network layer. Then, the input features of the block are connected with the output of the second connection layer. In this structure, the fused features therefore include both the adjacent outputs of the short skip connections and the distant output of the long connection.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>The residual-inception block.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g003.tif"/>
</fig>
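<p>The sketch below illustrates one possible reading of this block (Keras assumed): short residual additions around the inception layer and around the following 1 &#xd7; 1 convolution, plus a long connection from the block input to the output. The initial channel-matching 1 &#xd7; 1 convolution is an assumption needed to make the additions dimensionally consistent.</p>
<preformat>
# Hedged sketch of the residual-inception block (Figure 3): short skip
# connections around the inception layer and the following 1x1 convolution,
# plus a long skip from the block input to the final output.
from tensorflow.keras import layers

def residual_inception_block(x, filters):
    inp = layers.Conv2D(filters, (1, 1), padding="same")(x)  # channel matching (assumption)
    y = inception_block(inp, filters)
    y = layers.Add()([inp, y])                 # short skip around the inception layer
    z = layers.Conv2D(filters, (1, 1), padding="same")(y)
    z = layers.Add()([y, z])                   # short skip around the 1x1 convolution
    return layers.Add()([inp, z])              # long skip from the block input
</preformat>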
</sec>
<sec id="s2-2">
<title>2.2 Dense connections</title>
<p>As shown in <xref ref-type="fig" rid="F4">Figure 4</xref>, the DRI-Net network adopts a dense structure, and the top convolutional layer is directly connected to the subsequent convolutional layer. After each convolution layer there is a Batch Normalization (BN) layer and a Rectified Linear Unit (ReLU) layer. This connection integrates the larger eigenvalues of the bottom layer into the smaller eigenvalues of the top layer, which can effectively alleviate the problems of over-fitting and gradient disappearance. In addition, the number of existing colon image datasets is small, which will make deep neural network training difficult. At the same time, the disappearance of gradients during training will seriously limit the improvement of the accuracy of neural networks, and dense structures can alleviate these problems to some extent.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>The dense connections block.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g004.tif"/>
</fig>
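<p>A minimal sketch of this dense connection scheme is shown below (Keras assumed). The number of stacked stages and the growth width are assumptions; the convolution, BN and ReLU ordering follows the description above.</p>
<preformat>
# Hedged sketch of the dense connection scheme (Figure 4): every stage
# receives the concatenated outputs of all earlier stages, and each
# convolution is followed by Batch Normalization and ReLU.
from tensorflow.keras import layers

def dense_block(x, filters, num_layers=3):     # depth of 3 is an assumption
    features = [x]
    for _ in range(num_layers):
        y = features[0] if len(features) == 1 else layers.Concatenate()(features)
        y = layers.Conv2D(filters, (3, 3), padding="same")(y)
        y = layers.BatchNormalization()(y)      # BN after each convolution
        y = layers.ReLU()(y)                    # followed by ReLU
        features.append(y)
    return layers.Concatenate()(features)
</preformat>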
</sec>
<sec id="s2-3">
<title>2.3 Down-sampling layer</title>
<p>The traditional U-Net (<xref ref-type="bibr" rid="B25">Ronneberger et al., 2015</xref>) uses max-pooling to reduce and compress features along the contracting path. However, this causes a lot of useful information in the image to be lost. To retain more fine-grained feature information and reduce the information loss caused by pooling, this paper processes the input in parallel with two 1 &#xd7; 1 convolutions, one 3 &#xd7; 3 convolution, one 5 &#xd7; 5 convolution and an average-pooling layer, as shown in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>The down-sampling layer.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g005.tif"/>
</fig>
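<p>The sketch below shows one possible form of this parallel down-sampling layer (Keras assumed). Using a stride of 2 in every branch and splitting the output channels evenly are assumptions; only the branch composition follows the description above.</p>
<preformat>
# Hedged sketch of the improved down-sampling layer (Figure 5): parallel
# strided convolution branches plus average-pooling, concatenated so that
# less fine-grained information is lost than with max-pooling alone.
from tensorflow.keras import layers

def down_sample(x, filters):
    f = filters // 4
    b1 = layers.Conv2D(f, (1, 1), strides=2, padding="same")(x)    # 1x1 branch
    b2 = layers.Conv2D(f, (1, 1), padding="same")(x)                # 1x1 reduction
    b2 = layers.Conv2D(f, (3, 3), strides=2, padding="same")(b2)    # 3x3 branch
    b3 = layers.Conv2D(f, (1, 1), padding="same")(x)                # 1x1 reduction
    b3 = layers.Conv2D(f, (5, 5), strides=2, padding="same")(b3)    # 5x5 branch
    b4 = layers.AveragePooling2D(pool_size=2, padding="same")(x)    # average-pooling branch
    return layers.Concatenate()([b1, b2, b3, b4])
</preformat>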
</sec>
</sec>
<sec id="s3">
<title>3 Experiments and results</title>
<p>The colorectal polyp images from the Kvasir-SEG (<xref ref-type="bibr" rid="B14">Jha et al., 2020</xref>) dataset were used to evaluate the performance of DRI-Net. The dataset contains 1,196 images, of which 700 were used for training, 300 for validation and 196 for testing. The programming language used in the experiments was Python 3.6, and the operating system was Windows 10. The system memory was 24&#xa0;GB, and the GPU was an NVIDIA Quadro RTX 6000. Based on the behavior of the network, we selected the Adam optimizer with an initial learning rate of 0.001 (<xref ref-type="bibr" rid="B4">Badshah and Ahmad, 2021</xref>), a batch size of 16, 200 training epochs, and the Dice loss function.</p>
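<p>For reproducibility, the sketch below wires these reported training settings together (Keras assumed). The Dice loss implementation, the call to <monospace>build_dri_net</monospace> from the skeleton in Section 2, and the data variables <monospace>x_train</monospace>, <monospace>y_train</monospace>, <monospace>x_val</monospace> and <monospace>y_val</monospace> are placeholders, not taken from the paper.</p>
<preformat>
# Hedged sketch of the reported training configuration: Adam with an initial
# learning rate of 0.001, batch size 16, 200 epochs, and Dice loss.
# x_train, y_train, x_val, y_val stand for the Kvasir-SEG splits (assumed).
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1.0):
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true * y_pred)
    union = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (union + smooth)

model = build_dri_net()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss=dice_loss)
model.fit(x_train, y_train, validation_data=(x_val, y_val), batch_size=16, epochs=200)
</preformat>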
<sec id="s3-1">
<title>3.1 Evaluation metrics</title>
<p>Several quantitative metrics, including Intersection over Union (IoU) (<xref ref-type="bibr" rid="B1">Ahmed et al., 2021</xref>), Matthews correlation coefficient (Mcc) (<xref ref-type="bibr" rid="B15">Jiang et al., 2021</xref>), and Dice (<xref ref-type="bibr" rid="B32">Yang et al., 2020</xref>), were adopted to evaluate the performance of each algorithm. These indicators are calculated as:<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
<disp-formula id="e2">
<mml:math id="m2">
<mml:mrow>
<mml:mi>M</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>c</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m3">
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>c</mml:mi>
<mml:mi>e</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>T</mml:mi>
<mml:mi>P</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>N</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>where TP indicates that the actual target is a positive sample and the algorithm also judges it as a positive sample; TN indicates that the actual target is a negative sample and the algorithm also judges it as a negative sample; FP indicates a negative sample that the algorithm incorrectly judges as a positive sample; and FN indicates a positive sample that the algorithm incorrectly judges as a negative sample.</p>
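<p>Equations 1&#x2013;3 can be computed directly from these confusion counts; a minimal NumPy sketch (assumed here for illustration) for binary segmentation masks is given below.</p>
<preformat>
# Hedged sketch of Equations 1-3: IoU, Mcc and Dice computed from the
# confusion counts of a predicted binary mask against the ground truth.
import numpy as np

def segmentation_metrics(pred, truth):
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = float(np.sum(np.logical_and(pred, truth)))
    tn = float(np.sum(np.logical_and(np.logical_not(pred), np.logical_not(truth))))
    fp = float(np.sum(np.logical_and(pred, np.logical_not(truth))))
    fn = float(np.sum(np.logical_and(np.logical_not(pred), truth)))
    iou = tp / (tp + fn + fp)                                  # Equation 1
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fn) * (tp + fp) * (tn + fn) * (tn + fp))         # Equation 2
    dice = 2.0 * tp / (2.0 * tp + fn + fp)                     # Equation 3
    return iou, mcc, dice
</preformat>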
</sec>
<sec id="s3-2">
<title>3.2 Comparison with other methods</title>
<p>To quantitatively analyze the performance of the models, IoU, Mcc and Dice were calculated by comparing the automatic segmentations with the manual annotations, as shown in <xref ref-type="table" rid="T1">Table 1</xref>. By adding a gated attention mechanism to the U-Net structure, AttUNet, Connected-AttUNet and FF-UNet effectively improve the precision of colonoscopy image segmentation and achieve good results. The results of DenseUnet, ASF-Net and DRI-Net are very close, and the comparison among them objectively reflects the advantages of dense connections. Although the dense mechanism can enlarge the network and reduce over-fitting without using a pre-trained model, its ability to increase the size of the network is limited. Clearly, our approach is superior to the other methods in deep feature characterization and obtains more accurate segmentation results. As can be seen from the last two columns, although DRI-Net achieves better segmentation results, it requires more parameters and a longer runtime due to the introduction of the additional modules.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>The results of comparison with other methods.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center"/>
<th align="center">IoU (%)</th>
<th align="center">Mcc (%)</th>
<th align="center">Dice (%)</th>
<th align="center">Parameter (M)</th>
<th align="center">Time (ms/step)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">U-Net <xref ref-type="bibr" rid="B25">Ronneberger et al. (2015)</xref>
</td>
<td align="center">69.09</td>
<td align="center">80.09</td>
<td align="center">81.36</td>
<td align="center">2.06</td>
<td align="center">13</td>
</tr>
<tr>
<td align="center">AttUNet <xref ref-type="bibr" rid="B23">Oktay et al. (2018)</xref>
</td>
<td align="center">73.72</td>
<td align="center">83.13</td>
<td align="center">83.53</td>
<td align="center">8.49</td>
<td align="center">21</td>
</tr>
<tr>
<td align="center">DenseUnet <xref ref-type="bibr" rid="B12">Huang et al. (2017)</xref>
</td>
<td align="center">76.31</td>
<td align="center">85.28</td>
<td align="center">85.76</td>
<td align="center">2.29</td>
<td align="center">17</td>
</tr>
<tr>
<td align="center">NestedUNet <xref ref-type="bibr" rid="B35">Zhou et al. (2018)</xref>
</td>
<td align="center">72.02</td>
<td align="center">82.17</td>
<td align="center">83.13</td>
<td align="center">8.74</td>
<td align="center">33</td>
</tr>
<tr>
<td align="center">Connected-AttUNet <xref ref-type="bibr" rid="B3">Baccouche et al. (2021)</xref>
</td>
<td align="center">0.728825</td>
<td align="center">0.828933</td>
<td align="center">0.836375</td>
<td align="center">5.60</td>
<td align="center">29</td>
</tr>
<tr>
<td align="center">ASF-Net <xref ref-type="bibr" rid="B7">Chen P et al. (2022)</xref>
</td>
<td align="center">0.766183</td>
<td align="center">0.853072</td>
<td align="center">0.862914</td>
<td align="center">5.63</td>
<td align="center">18</td>
</tr>
<tr>
<td align="center">FF-UNet <xref ref-type="bibr" rid="B13">Iqbal et al. (2022)</xref>
</td>
<td align="center">0.7488</td>
<td align="center">0.843762</td>
<td align="center">0.851601</td>
<td align="center">3.97</td>
<td align="center">27</td>
</tr>
<tr>
<td align="center">DRI-Net</td>
<td align="center">77.72</td>
<td align="center">85.94</td>
<td align="center">86.51</td>
<td align="center">29.42</td>
<td align="center">102</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>
<xref ref-type="fig" rid="F6">Figure 6</xref> shows the comparison of the visualization segmentation results on the Kvasir-SEG dataset between the DRI-Net and the models proposed by some researchers in recent years. It can be seen from the segmentation example that in the structure of U-net, due to the lack of support for convolutional low-level information, the segmentation details are poor and there are many false negatives. Compared with U-Net, the results of other models are better, and the false negative is reduced. However, due to the loss of global association, the phenomenon of over-segmentation appeared, and the false positives of polyp segmentation were relatively high. As can be seen from the comparison between the visual segmentation results and Ground-truth, compared with other methods, our method can well distinguish polyp boundaries, and is better in maintaining the consistency of polyp morphological features, with lower FP and FN.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Comparison experiment with other methods on Kvasir-SEG dataset. <bold>(A)</bold> original images; <bold>(B)</bold> Ground-truth; <bold>(C&#x2013;J)</bold> are the results of U-Net, AttUNet, DenseUnet, NestedUNet, Connected-AttUNet, ASF-Net, FF-UNet and DRI-Net.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g006.tif"/>
</fig>
<p>To demonstrate that each module added to the proposed DRI-Net contributes to the segmentation of colorectal polyp images, ablation experiments were conducted for each module. Experimental comparisons were carried out on the Kvasir-SEG dataset using U-Net, U-Net with only the residual-inception module, U-Net with only the dense module, U-Net with only the down-sampling module, U-Net with the residual-inception&#x2b;dense modules, U-Net with the residual-inception&#x2b;down-sampling modules, U-Net with the dense&#x2b;down-sampling modules, and DRI-Net. As can be seen from <xref ref-type="table" rid="T2">Table 2</xref>, when the combination of residual-inception, dense and down-sampling modules is added to the network, all evaluation indicators are better than those of the plain U-Net, which demonstrates the effectiveness of these modules.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Ablation experiment of DRI-Net on Kvasir-SEG dataset.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">U-net</th>
<th align="center">Residual-inception</th>
<th align="center">Dense</th>
<th align="center">Down-sampling</th>
<th align="center">IoU (%)</th>
<th align="center">Mcc (%)</th>
<th align="center">Dice (%)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">69.09</td>
<td align="center">80.09</td>
<td align="center">81.36</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">73.27</td>
<td align="center">83.17</td>
<td align="center">84.16</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">74.67</td>
<td align="center">83.95</td>
<td align="center">84.70</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">72.45</td>
<td align="center">82.64</td>
<td align="center">83.26</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">75.77</td>
<td align="center">84.93</td>
<td align="center">85.61</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">75.28</td>
<td align="center">84.75</td>
<td align="center">85.42</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#xd7;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">75.61</td>
<td align="center">84.76</td>
<td align="center">85.38</td>
</tr>
<tr>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">
<bold>&#x221a;</bold>
</td>
<td align="center">77.72</td>
<td align="center">85.94</td>
<td align="center">86.51</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>For further analysis of the ablation performance, partial segmentation results are shown in <xref ref-type="fig" rid="F7">Figure 7</xref>. Visually, it can be seen that before the modules are fused, under-segmentation still occurs at the boundaries of some lesions and the complete tumor region cannot be segmented well. After adding the residual-inception, dense and improved down-sampling modules, the segmentation accuracy of the whole network is greatly improved. Therefore, the results of the ablation experiments further verify the validity of these modules.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Ablation experiment of DRI-Net on Kvasir-SEG dataset. <bold>(A)</bold> original images; <bold>(B)</bold> Ground-truth; <bold>(C&#x2013;J)</bold> are the combination results of baseline (U-Net), U-Net&#x2b;residual-inception, U-Net&#x2b;dense, U-Net&#x2b;down-sampling, U-Net&#x2b;residual-inception&#x2b;dense, U-Net&#x2b;residual-inception&#x2b;down-sampling, U-Net&#x2b;dense&#x2b;down-sampling, and U-Net&#x2b;residual-inception&#x2b;dense&#x2b;down-sampling.</p>
</caption>
<graphic xlink:href="fphys-14-1290820-g007.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="conclusion" id="s4">
<title>4 Conclusion</title>
<p>Motivated by the color similarity between colon polyps and their surrounding tissues and by the diversity of polyp size, shape and texture, a dense residual-inception network structure is proposed as an effective extension of the encoder-decoder U-Net network. Firstly, we integrate the residual-inception module and dense connections into U-Net to effectively extract more discernible features of colon cancer tissue from a large amount of information. Then, the redesigned down-sampling module aims to suppress useless information and improve the recognition accuracy of the network. We assessed all methods on the Kvasir-SEG dataset using three popular evaluation metrics. Experimental results consistently show that DRI-Net achieves better results than other typical networks. In the future, we will investigate the lightweight design and over-fitting problems of these methods and apply them to more medical image segmentation tasks.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6">
<title>Ethics statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislations and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="s7">
<title>Author contributions</title>
<p>XL: Conceptualization, Writing&#x2013;original draft. HC: Validation, Writing&#x2013;review and editing. WJ: Visualization, Writing&#x2013;review and editing.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ahmed</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Ahmad</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jeon</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A real-time efficient object segmentation system based on U-Net using aerial drone images</article-title>. <source>J. Real-Time Image Process</source> <volume>18</volume>, <fpage>1745</fpage>&#x2013;<lpage>1758</lpage>. <pub-id pub-id-type="doi">10.1007/s11554-021-01166-z</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Anil</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Dayananda</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Automatic liver tumor segmentation based on multi-level deep convolutional networks and fractal residual network</article-title>. <source>IETE J. Res.</source> <volume>69</volume>, <fpage>1925</fpage>&#x2013;<lpage>1933</lpage>. <pub-id pub-id-type="doi">10.1080/03772063.2021.1878066</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baccouche</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Garcia-Zapirain</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Castillo Olea</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Elmaghraby</surname>
<given-names>A. S.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Connected-UNets: a deep learning architecture for breast mass segmentation</article-title>. <source>NPJ Breast Cancer</source> <volume>7</volume>, <fpage>151</fpage>. <pub-id pub-id-type="doi">10.1038/s41523-021-00358-x</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Badshah</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Ahmad</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>ResBCU-Net: deep learning approach for segmentation of skin images</article-title>. <source>Biomed. Signal Process. Control</source> <volume>40</volume>, <fpage>103137</fpage>. <pub-id pub-id-type="doi">10.1016/j.bspc.2021.103137</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>H. Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H. Q.</given-names>
</name>
<name>
<surname>Zhen</surname>
<given-names>X. J.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>A hybrid active contour image segmentation model with robust to initial contour position</article-title>. <source>Multimed. Tools Appl.</source> <volume>82</volume>, <fpage>10813</fpage>&#x2013;<lpage>10832</lpage>. <pub-id pub-id-type="doi">10.1007/s11042-022-13782-3</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gong</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>ASF-Net: adaptive screening feature network for building footprint extraction from remote-sensing images</article-title>. <source>IEEE Trans. Geosci. Remote. Sens.</source> <volume>60</volume>, <fpage>1</fpage>&#x2013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1109/TGRS.2022.3165204</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yue</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Skin lesion segmentation using recurrent attentional convolutional networks</article-title>. <source>IEEE Access</source> <volume>10</volume>, <fpage>94007</fpage>&#x2013;<lpage>94018</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2022.3204280</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>X. X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>X. Q.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>LVSegNet: a novel deep learning-based framework for left ventricle automatic segmentation using magnetic resonance imaging</article-title>. <source>Comput. Commun.</source> <volume>208</volume>, <fpage>124</fpage>&#x2013;<lpage>135</lpage>. <pub-id pub-id-type="doi">10.1016/j.comcom.2023.05.011</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dourthe</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Shaikh</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Pai</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Fels</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Brown</surname>
<given-names>S. H. M.</given-names>
</name>
<name>
<surname>Wilson</surname>
<given-names>D. R.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>Automated segmentation of spinal muscles from upright open MRI using a multiscale pyramid 2D convolutional neural network</article-title>. <source>Spine</source> <volume>47</volume>, <fpage>1179</fpage>&#x2013;<lpage>1186</lpage>. <pub-id pub-id-type="doi">10.1097/BRS.0000000000004308</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fu</surname>
<given-names>Z. Y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>R. Y.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y. T.</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>D. D.</given-names>
</name>
<name>
<surname>Xia</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>TF-Unet: an automatic cardiac MRI image segmentation method</article-title>. <source>Math. Biosci. Eng.</source> <volume>19</volume>, <fpage>5207</fpage>&#x2013;<lpage>5222</lpage>. <pub-id pub-id-type="doi">10.3934/mbe.2022244</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goyal</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Lepcha</surname>
<given-names>D. C.</given-names>
</name>
<name>
<surname>Dogra</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S. H.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A weighted least squares optimisation strategy for medical image super resolution via multiscale convolutional neural networks for healthcare applications</article-title>. <source>Complex Intell. Syst.</source> <volume>8</volume>, <fpage>3089</fpage>&#x2013;<lpage>3104</lpage>. <pub-id pub-id-type="doi">10.1007/s40747-021-00465-z</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Van Der Maaten</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Weinberger</surname>
<given-names>K. Q.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Densely connected convolutional networks</article-title>,&#x201d; in <source>IEEE conference on computer vision and pattern recognition</source>, <fpage>21</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.243</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Iqbal</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Sharif</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Khan</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Nisar</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Alhaisoni</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>FF-UNet: a u-shaped deep convolutional neural network for multimodal biomedical image segmentation</article-title>. <source>Cogn. Comput.</source> <volume>14</volume>, <fpage>1287</fpage>&#x2013;<lpage>1302</lpage>. <pub-id pub-id-type="doi">10.1007/s12559-022-10038-y</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jha</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Smedsrud</surname>
<given-names>P. H.</given-names>
</name>
<name>
<surname>Riegler</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Halvorsen</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>de Lange</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Johansen</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). &#x201c;<article-title>Kvasir-SEG: a segmented polyp dataset</article-title>,&#x201d; in <source>Proceedings of the international conference on multimedia modeling</source>. <comment>Available online: <ext-link ext-link-type="uri" xlink:href="https://datasets.simula.no/kvasir-seg/">https://datasets.simula.no/kvasir-seg/</ext-link>.</comment>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zhai</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A novel deep learning model DDU-net using edge features to enhance brain tumor segmentation on MR images</article-title>. <source>Artif. Intell. Med.</source> <volume>121</volume>, <fpage>102180</fpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2021.102180</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kanimozhi</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Franklin</surname>
<given-names>J. V.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>An automated cervical cancer detection scheme using deeply supervised shuffle attention modified convolutional neural network model</article-title>. <source>Automatika</source> <volume>64</volume>, <fpage>518</fpage>&#x2013;<lpage>528</lpage>. <pub-id pub-id-type="doi">10.1080/00051144.2023.2196114</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Karthika</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Senthilselvi</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Smart credit card fraud detection system based on dilated convolutional neural network with sampling technique</article-title>. <source>Multimed. Tools Appl.</source> <volume>82</volume>, <fpage>31691</fpage>&#x2013;<lpage>31708</lpage>. <pub-id pub-id-type="doi">10.1007/s11042-023-15730-1</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lecun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Bottou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Bengio</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Haffner</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>1998</year>). &#x201c;<article-title>Gradient-based learning applied to document recognition</article-title>,&#x201d; in <source>Proceedings of the IEEE</source> (<publisher-name>IEEE</publisher-name>), <fpage>2278</fpage>&#x2013;<lpage>2324</lpage>. <pub-id pub-id-type="doi">10.1109/5.726791</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Q. H.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Gong</surname>
<given-names>X. Q.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Densely dilated spatial pooling convolutional network using benign loss functions for imbalanced volumetric prostate segmentation</article-title>. <source>Curr. Bioinform.</source> <volume>15</volume>, <fpage>788</fpage>&#x2013;<lpage>799</lpage>. <pub-id pub-id-type="doi">10.2174/1574893615666200127124145</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Long</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Shelhamer</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Darrell</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Fully convolutional networks for semantic segmentation</article-title>. <source>IEEE Trans. Pattern Anal.Mach. Intell.</source> <volume>39</volume>, <fpage>640</fpage>&#x2013;<lpage>651</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2016.2572683</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maria</surname>
<given-names>H. H.</given-names>
</name>
<name>
<surname>Jossy</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Malarvizhi</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>A hybrid deep learning approach for detection and segmentation of ovarian tumours</article-title>. <source>Neural comput. Appl.</source> <volume>35</volume>, <fpage>15805</fpage>&#x2013;<lpage>15819</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-023-08569-y</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Morita</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Mazen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Tsujiko</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Otake</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Numajiri</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Deep-learning-based automatic facial bone segmentation using a two-dimensional U-Net</article-title>. <source>Int. J. Oral Max. Surg.</source> <volume>52</volume>, <fpage>787</fpage>&#x2013;<lpage>792</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijom.2022.10.015</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Oktay</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Schlemper</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Folgoc</surname>
<given-names>L. L.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Heinrich</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Misawa</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <source>Attention U-Net: learning where to look for the pancreas</source>. <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1804.03999v2">https://arxiv.org/abs/1804.03999v2</ext-link>.</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rasti</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Biglari</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Rezapourian</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z. Y.</given-names>
</name>
<name>
<surname>Farsiu</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>RetiFluidNet: a self-adaptive and multi-attention deep convolutional network for retinal OCT fluid segmentation</article-title>. <source>IEEE Trans. Med. Imaging</source> <volume>42</volume>, <fpage>1413</fpage>&#x2013;<lpage>1423</lpage>. <pub-id pub-id-type="doi">10.1109/TMI.2022.3228285</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ronneberger</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Fischer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Brox</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>U-Net: convolutional networks for biomedical image segmentation</article-title>,&#x201d; in <source>International conference on medical image computing and computer-assisted intervention</source> (<publisher-name>Springer</publisher-name>), <fpage>234</fpage>&#x2013;<lpage>241</lpage>. <comment>arXiv:1505.04597</comment>.</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Selvaraj</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Nithiyaraj</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>CEDRNN: a convolutional encoder-decoder residual neural network for liver tumour segmentation</article-title>. <source>Neural process. Lett.</source> <volume>55</volume>, <fpage>1605</fpage>&#x2013;<lpage>1624</lpage>. <pub-id pub-id-type="doi">10.1007/s11063-022-10953-z</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Srikanth</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Bikshalu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Chaotic multi verse improved Harris hawks optimization (CMV-IHHO) facilitated multiple level set model with an ideal energy active contour for an effective medical image segmentation</article-title>. <source>Multimed. Tools Appl.</source> <volume>81</volume>, <fpage>20963</fpage>&#x2013;<lpage>20992</lpage>. <pub-id pub-id-type="doi">10.1007/s11042-022-12344-x</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname>
<given-names>J. W.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>C. Y.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>Z. D.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X. Q.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>TNSNet: thyroid nodule segmentation in ultrasound imaging using soft shape supervision</article-title>. <source>Comput. Meth. Prog. Bio.</source> <volume>215</volume>, <fpage>106600</fpage>. <pub-id pub-id-type="doi">10.1016/j.cmpb.2021.106600</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tan</surname>
<given-names>Q. X.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yip</surname>
<given-names>T. C. F.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>G. L. H.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Explainable uncertainty-aware convolutional recurrent neural network for irregular medical time series</article-title>. <source>IEEE Trans. Neur. Net. Lear. Syst.</source> <volume>32</volume>, <fpage>4665</fpage>&#x2013;<lpage>4679</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2020.3025813</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>P. L.</given-names>
</name>
<name>
<surname>Nie</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Unified medical image segmentation by learning from uncertainty in an end-to-end manner</article-title>. <source>Knowledge-Based Syst.</source> <volume>241</volume>, <fpage>108215</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.108215</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>T. T.</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>L. L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>P. Z.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Real-time automatic assisted detection of uterine fibroid in ultrasound images using a deep learning detector</article-title>. <source>Ultrasound Med. Biol.</source> <volume>49</volume>, <fpage>1616</fpage>&#x2013;<lpage>1626</lpage>. <pub-id pub-id-type="doi">10.1016/j.ultrasmedbio.2023.03.013</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Y. Y.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>R. F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Automatic segmentation model combining U-Net and level set method for medical images</article-title>. <source>Expert Syst. Appl.</source> <volume>153</volume>, <fpage>113419</fpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2020.113419</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Vinodhini</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Muthu</surname>
<given-names>B. A.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Deep learning assisted medical insurance data analytics with multimedia system</article-title>. <source>Int. J. Interact. Multi. Artif. Intell.</source> <volume>8</volume>, <fpage>69</fpage>&#x2013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.9781/ijimai.2023.01.009</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>H. B.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z. J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y. A.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z. L.</given-names>
</name>
<name>
<surname>Lv</surname>
<given-names>J. Q.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <article-title>TiM-Net: transformer in M-Net for retinal vessel segmentation</article-title>. <source>J. Healthc. Eng.</source> <volume>2022</volume>, <fpage>9016401</fpage>. <pub-id pub-id-type="doi">10.1155/2022/9016401</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>Z. W.</given-names>
</name>
<name>
<surname>Siddiquee</surname>
<given-names>M. M. R.</given-names>
</name>
<name>
<surname>Tajbakhsh</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Liang</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>UNet&#x2b;&#x2b;: a nested U-Net architecture for medical image segmentation</article-title>,&#x201d; in <source>Deep learning in medical image analysis and multimodal learning for clinical decision support</source>, <fpage>3</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.48550/arXiv.1807.10165</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>