<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Phys.</journal-id>
<journal-title>Frontiers in Physics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Phys.</abbrev-journal-title>
<issn pub-type="epub">2296-424X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1117261</article-id>
<article-id pub-id-type="doi">10.3389/fphy.2022.1117261</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Physics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Atomic number prior guided network for prohibited items detection from heavily cluttered X-ray imagery</article-title>
<alt-title alt-title-type="left-running-head">Chen et al.</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fphy.2022.1117261">10.3389/fphy.2022.1117261</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Chen</surname>
<given-names>Jinwen</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/2126782/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Leng</surname>
<given-names>Jiaxu</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2129100/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Gao</surname>
<given-names>Xinbo</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Mo</surname>
<given-names>Mengjingcheng</given-names>
</name>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Guan</surname>
<given-names>Shibo</given-names>
</name>
</contrib>
</contrib-group>
<aff>
<institution>School of Computer Science and Technology</institution>, <institution>Chongqing University of Posts and Telecommunications</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1307533/overview">Huafeng Li</ext-link>, Kunming University of Science and Technology, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2133171/overview">Lulu Wang</ext-link>, Kunming University of Science and Technology, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2133181/overview">Neng Dong</ext-link>, Nanjing University of Science and Technology, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2133312/overview">Xiaosong Li</ext-link>, Foshan University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Jiaxu Leng, <email>lengjx@cqupt.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Radiation Detectors and Imaging, a section of the journal Frontiers in Physics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>05</day>
<month>01</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>1117261</elocation-id>
<history>
<date date-type="received">
<day>06</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>19</day>
<month>12</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Chen, Leng, Gao, Mo and Guan.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Chen, Leng, Gao, Mo and Guan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Prohibited item detection in X-ray images is an effective measure for maintaining public safety. Recent prohibited item detection methods based on deep learning have achieved impressive performance. Some methods improve detection performance by introducing prior knowledge of prohibited items, such as the edge and size of an object. However, items within baggage are often placed randomly, resulting in cluttered X-ray images, which can seriously compromise the correctness and effectiveness of such prior knowledge. In particular, we find that items of different materials in X-ray images are clearly distinguishable according to their atomic number Z information, which is vital for suppressing the interference of irrelevant background information by mining material cues. Inspired by this observation, we combine the atomic number Z feature and propose a novel atomic number Z Prior Guided Network (ZPGNet) to detect prohibited objects in heavily cluttered X-ray images. Specifically, we propose a Material Activation (MA) module that flows the atomic number Z information across scales through the network to mine material clues and reduce the interference of irrelevant information in detecting prohibited items. However, collecting atomic number images requires much labor, increasing costs. Therefore, we propose a method to automatically generate atomic number Z images by exploring the color information of X-ray images, which significantly reduces the manual acquisition cost. Extensive experiments demonstrate that our method accurately and robustly detects prohibited items in heavily cluttered X-ray images. Furthermore, we extensively evaluate our method on HiXray and OPIXray, and our best result is 2.1% <italic>mAP</italic>
<sub>50</sub> higher than the state-of-the-art models on HiXray.</p>
</abstract>
<kwd-group>
<kwd>object detection</kwd>
<kwd>X-ray image</kwd>
<kwd>prohibited items detection</kwd>
<kwd>prior knowledge</kwd>
<kwd>public safety</kwd>
</kwd-group>
<contract-num rid="cn001">62102057 62036007</contract-num>
<contract-num rid="cn002">CSTB2022NSCQ-MSX1024</contract-num>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">Natural Science Foundation of Chongqing<named-content content-type="fundref-id">10.13039/501100005230</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>As society develops, the flow of people on public transport keeps increasing. X-ray security machines are widely used in security inspections at railway stations and airports and are critical facilities for maintaining public and transportation safety. However, traditional security checks mostly rely on manual identification. After prolonged working hours, security inspectors easily become fatigued, significantly increasing the risk of missed and false detections and creating many hidden dangers for public safety. Therefore, identifying prohibited items through intelligent algorithms is increasingly necessary.</p>
<p>Different from traditional detection tasks, in this scenario there are various items in passengers&#x2019; luggage, placed in random permutations, resulting in heavily cluttered X-ray images [<xref ref-type="bibr" rid="B1">1</xref>&#x2013;<xref ref-type="bibr" rid="B4">4</xref>]. Therefore, object detection algorithms designed for general natural images do not perform well on cluttered X-ray images, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. Fortunately, the tremendous success of deep learning [<xref ref-type="bibr" rid="B5">5</xref>&#x2013;<xref ref-type="bibr" rid="B11">11</xref>] has made the intelligent detection of prohibited items possible by transforming it into an object detection task in computer vision [<xref ref-type="bibr" rid="B12">12</xref>&#x2013;<xref ref-type="bibr" rid="B14">14</xref>]. Hence, many researchers have applied deep learning methods to prohibited item detection. Flitton et al. [<xref ref-type="bibr" rid="B15">15</xref>] explored 3D feature descriptors with application to threat detection in Computed Tomography (CT) airport baggage imagery. Bhowmik et al. [<xref ref-type="bibr" rid="B16">16</xref>] investigated the difference in detection performance achieved using real and synthetic X-ray training imagery for CNN architectures. Gaus et al. [<xref ref-type="bibr" rid="B17">17</xref>] evaluated several leading variants spanning the Faster R-CNN, Mask R-CNN, and RetinaNet architectures to explore the transferability of such models between varying X-ray scanners. Hassan et al. [<xref ref-type="bibr" rid="B18">18</xref>] presented a cascaded structure tensor framework that automatically extracts and recognizes suspicious items in multi-vendor X-ray scans. Zhao et al. [<xref ref-type="bibr" rid="B19">19</xref>] established associations between feature channels and different labels and adjusted the features according to the assigned labels (or pseudo labels) to tackle the overlapping object problem. These methods all improve detection performance to a certain extent but do not exploit the unique imaging characteristics of X-ray images.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Various items in passengers&#x2019; luggage and random permutations between articles result in cluttered X-ray images. For general object detectors, a large amount of irrelevant background information interference can easily lead to missed detections. With the assistance of the atomic number prior knowledge, our method can suppress background interference and detect items correctly.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g001.tif"/>
</fig>
<p>Recently, some works have tried adding prior information about X-ray images to guide network learning, as shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The method of [<xref ref-type="bibr" rid="B20">20</xref>] obtained edge images using the traditional Sobel edge detection algorithm. Chang et al. [<xref ref-type="bibr" rid="B4">4</xref>] found that different classes of prohibited objects are clearly distinguished by physical size and used Otsu&#x2019;s threshold segmentation algorithm [<xref ref-type="bibr" rid="B21">21</xref>] to segment the original image into foreground and background, treating the foreground region as the approximate size of the detected object. Although these two methods improve detection accuracy to a certain extent by introducing such prior information, the obtained prior information is easily disturbed by other irrelevant information due to the messy distribution of prohibited items, which hinders further performance improvement. Specifically, in the presence of cluttered items, the boundary information of prohibited items obtained by the former method is severely interfered with by the boundary information of irrelevant items. Furthermore, the latter cannot guarantee the accuracy of treating the binarized foreground as the area of the detected items, especially when other items appear inside the detection region.</p>
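Otsu's threshold segmentation mentioned above selects the gray level that maximizes the between-class variance of foreground and background. A compact NumPy sketch of the classic algorithm (an illustration only, not Chang et al.'s implementation):

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Exhaustive-search Otsu threshold for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    total = hist.sum()
    cum = np.cumsum(hist)                        # cumulative pixel counts
    cum_mean = np.cumsum(hist * np.arange(256))  # cumulative intensity sums
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = cum[t] / total                      # background weight
        w1 = 1.0 - w0                            # foreground weight
        if w0 == 0.0 or w1 == 0.0:
            continue
        m0 = cum_mean[t] / cum[t]                          # background mean
        m1 = (cum_mean[-1] - cum_mean[t]) / (total - cum[t])  # foreground mean
        var = w0 * w1 * (m0 - m1) ** 2           # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# A bimodal image separates cleanly between its two modes.
bimodal = np.array([[10, 10, 200, 200]], dtype=np.uint8)
t = otsu_threshold(bimodal)
```

Any threshold between the two modes (10 and 200) gives the same split, so the returned value lands in that interval.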
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Framework comparisons between existing methods based on prior knowledge and our method. In each row, the left is the network framework, and the right is the visualization of prior knowledge. The prohibited objects in each X-ray image are annotated with red bounding boxes. <bold>(A)</bold> The method obtaining the boundary information of prohibited items is seriously interfered with by the boundary information of unrelated items. <bold>(B)</bold> The method cannot guarantee the accuracy of treating the binarized foreground as the area of the detected object, especially when other items appear inside the detection box. <bold>(C)</bold> Unlike them, our method pays more attention to the atomic number feature, taking advantage of the distinction in atomic numbers to reduce the interference of useless background information.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g002.tif"/>
</fig>
<p>In this paper, we propose a novel atomic number Z Prior Guided Network (ZPGNet) for heavily cluttered X-ray images, which removes irrelevant background information by effectively incorporating the atomic number feature. Unlike optical images, X-ray images are generated by illuminating objects with X-rays. X-ray security inspection machines detect the effective atomic number based on differences in how objects absorb X-rays and then render distinct colors [<xref ref-type="bibr" rid="B22">22</xref>]. Specifically, the color information in X-ray images represents material information: blue represents inorganic material, orange represents organic material, and green represents mixtures [<xref ref-type="bibr" rid="B23">23</xref>], as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. Atomic number images, as variants of X-ray images, directly reflect the material type of an item, which is the dominant information in X-ray images. This characteristic motivates us to exploit this critical information to improve detection accuracy by removing irrelevant background information. Bhowmik et al. [<xref ref-type="bibr" rid="B24">24</xref>] examined the impact of atomic number images <italic>via</italic> the use of CNN architectures for the object detection task posed within X-ray baggage security screening and clearly illustrated the benefits of using atomic number images for object detection and segmentation tasks. However, they simply concatenate atomic number images with RGB images and do not fully exploit them. To make full use of the atomic number features of items, we design a Material Activation (MA) module. It flows atomic number information across scales through the network to mine deep material clues, which helps reduce the interference of irrelevant information in detecting prohibited items.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>From left to right are inorganic matter, organic matter, and mixture.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g003.tif"/>
</fig>
<p>Atomic number images need to be collected manually, which increases costs. In particular, X-ray imaging systems render different materials in different colors: blue represents inorganic material, orange represents organic material, and green represents mixtures, as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. Therefore, we can obtain the material classification of each pixel by analyzing its color. Thus, we propose an atomic number Z Prior Generation (ZPG) module, which automatically generates the atomic number feature according to the imaging color of X-ray images, as shown in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
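The per-pixel color-to-material rule described above can be sketched by reading off the dominant channel of each RGB pixel. This is a minimal illustration under the stated color convention; the function name and the -1 background marker are our own assumptions, not part of the ZPG module:

```python
import numpy as np

# Material classes keyed by the dominant channel of an RGB pixel:
# orange (R dominant) -> organic, green (G dominant) -> mixture,
# blue (B dominant) -> inorganic; pure white carries no item.
MATERIALS = {0: "organic", 1: "mixture", 2: "inorganic", -1: "background"}

def classify_pixels(rgb: np.ndarray) -> np.ndarray:
    """Return an (H, W) map of material indices for an (H, W, 3) RGB image."""
    dominant = rgb.argmax(axis=-1)        # index of the strongest channel
    white = (rgb == 255).all(axis=-1)     # pure white: no item present
    return np.where(white, -1, dominant)  # -1 marks empty background

img = np.array([[[230, 120, 40],          # orange-ish -> organic (0)
                 [60, 200, 90],           # green-ish  -> mixture (1)
                 [255, 255, 255]]],       # white      -> background (-1)
               dtype=np.uint8)
labels = classify_pixels(img)
```

Mixed hues in real scans are noisier than this rule suggests, which is why the ZPG module below refines the map with learned layers rather than using the raw class index directly.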
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>X-ray image samples from the OPIXray dataset. In each pair, the left image is the original and the right is the atomic number image generated by our proposed ZPG method. The prohibited objects in each X-ray image are annotated with red bounding boxes.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g004.tif"/>
</fig>
<p>Overall, the contributions of our work can be summarized as follows:<list list-type="simple">
<list-item>
<p>&#x2022; We propose a novel atomic number Z Prior Guided Network (ZPGNet) to improve the detection accuracy of cluttered items by effectively incorporating the atomic number feature. In addition, the proposed method is generic and can be easily embedded into existing detection frameworks as a module.</p>
</list-item>
<list-item>
<p>&#x2022; We propose an atomic number Z Prior Generation (ZPG) module, which automatically generates the atomic number feature according to the imaging color of X-ray images. Compared with the manual collection, the costs are significantly reduced.</p>
</list-item>
<list-item>
<p>&#x2022; We design a Material Activation (MA) module to cross-scale fuse image features with the atomic number feature and then flow the fused features from high-level to low-level to enhance the ability of the model to mine deep material clues.</p>
</list-item>
<list-item>
<p>&#x2022; We evaluate ZPGNet on the HiXray and OPIXray datasets and demonstrate that the performance of our ZPGNet is superior to state-of-the-art methods in identifying prohibited objects from cluttered X-ray baggage images.</p>
</list-item>
</list>
</p>
</sec>
<sec id="s2">
<title>2 Related work</title>
<p>In this section, we first introduce the existing public datasets for detecting prohibited items in X-ray images and then describe some generic object detection methods and some strategies to solve the clutter problem in X-ray images.</p>
<sec id="s2-1">
<title>2.1 Security inspection image dataset</title>
<p>X-ray security inspection machines show different colors for items of different materials based on differences in how objects absorb X-rays [<xref ref-type="bibr" rid="B22">22</xref>]. Therefore, X-ray imaging is applied in many tasks, such as security inspection [<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B25">25</xref>&#x2013;<xref ref-type="bibr" rid="B27">27</xref>] and medical imaging analysis [<xref ref-type="bibr" rid="B8">8</xref>, <xref ref-type="bibr" rid="B28">28</xref>&#x2013;<xref ref-type="bibr" rid="B33">33</xref>]. However, there are very few X-ray image datasets due to the particularity of security inspection scenes. To our knowledge, four recently published datasets are GDXray [<xref ref-type="bibr" rid="B22">22</xref>], SIXray [<xref ref-type="bibr" rid="B26">26</xref>], OPIXray [<xref ref-type="bibr" rid="B20">20</xref>], and HiXray [<xref ref-type="bibr" rid="B34">34</xref>]. The GDXray dataset has 19,407 images covering three categories of prohibited items, namely, guns, darts, and razors. However, GDXray only contains grayscale images, which are far from realistic scenarios. SIXray includes 1,059,231 X-ray images, of which only 8,929 are labeled. The pictures in SIXray were captured by real security machines at several subway stations, making the dataset more in line with the data distribution of real scenes. The OPIXray dataset is the first high-quality security object detection dataset, containing five categories of prohibited items, namely, folding knives, straight knives, scissors, utility knives, and multitool knives, with a total of 8,885 X-ray images. The HiXray dataset contains 44,364 X-ray images from daily security checks at international airports, covering eight categories of prohibited items common in daily life, such as lithium batteries, liquids, and lighters. Each image in HiXray is annotated by airport staff, which ensures the accuracy of the data.</p>
</sec>
<sec id="s2-2">
<title>2.2 Generic object detection</title>
<p>Object detection is an essential computer vision task that supports many downstream tasks [<xref ref-type="bibr" rid="B35">35</xref>&#x2013;<xref ref-type="bibr" rid="B38">38</xref>]. Methods based on convolutional neural networks can be divided into two categories: single-stage [<xref ref-type="bibr" rid="B39">39</xref>&#x2013;<xref ref-type="bibr" rid="B43">43</xref>] and multi-stage [<xref ref-type="bibr" rid="B44">44</xref>&#x2013;<xref ref-type="bibr" rid="B46">46</xref>]. In recent years, single-stage detection methods have been widely adopted over multi-stage ones due to their simple design and strong performance. YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>] balances real-time performance and accuracy by predicting objects at multiple scales. RetinaNet [<xref ref-type="bibr" rid="B41">41</xref>] improves detection accuracy while maintaining inference speed by addressing the class imbalance problem, and it far exceeds general multi-stage detection methods in both real-time performance and accuracy. FCOS [<xref ref-type="bibr" rid="B43">43</xref>] is anchor-box free as well as proposal free, solving object detection in a per-pixel prediction fashion. In addition, YOLOv5 [<xref ref-type="bibr" rid="B47">47</xref>] makes several improvements over YOLOv3 that significantly increase detection speed and accuracy. However, most object detection methods so far target natural images. In security check scenes, the various items in passengers&#x2019; luggage and the random permutations between them result in heavily cluttered X-ray images, so detection performance often degrades.</p>
</sec>
<sec id="s2-3">
<title>2.3 Solutions to heavily cluttered problems</title>
<p>Previous works have also focused on the problem of heavily cluttered X-ray images. Shao et al. [<xref ref-type="bibr" rid="B48">48</xref>] proposed a foreground-background separation framework for X-ray prohibited item detection that separates prohibited items from other items to exclude irrelevant background information. Tao et al. [<xref ref-type="bibr" rid="B34">34</xref>] proposed a lateral inhibition module that eliminates the influence of noisy neighboring regions on object regions of interest and activates item boundaries by intensifying them.</p>
</sec>
</sec>
<sec id="s3">
<title>3 Proposed method</title>
<p>Atomic number images, as variants of X-ray images, directly reflect item material, which is the dominant information in X-ray images. Inspired by this, we propose a novel atomic number Z Prior Guided Network (ZPGNet) for cluttered X-ray images, as shown in <xref ref-type="fig" rid="F5">Figure 5</xref>. ZPGNet consists of three main components: 1) an atomic number Z Prior Generation (ZPG) module that automatically generates atomic number images, reducing the cost of collecting them manually; 2) a Material Activation (MA) module that fuses the atomic number feature to remove irrelevant background information; and 3) a Bidirectional Enhancement (BE) module that enriches feature expression through bidirectional information flow.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Overall framework of the proposed atomic number Z Prior Guided Network (ZPGNet). The network consists of three key modules, i.e., an atomic number Z Prior Generation (ZPG) module generating the atomic number feature, a Material Activation (MA) module fusing the image features with the atomic number feature across scales, and a Bidirectional Enhancement (BE) module mining contextual semantics for enhancing feature representation. CBR is composed of a convolution layer, a batch normalization layer, and a ReLU activation function. SENet stands for Squeeze-and-Excitation Networks [<xref ref-type="bibr" rid="B49">49</xref>].</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g005.tif"/>
</fig>
<p>Specifically, we first design the ZPG module, which exploits the fact that different materials appear in different colors to map a three-channel (RGB) color image to a single-channel atomic number image. Then, we repeatedly pass the atomic number feature generated by the ZPG module into the network so that it pays more attention to item material information. To effectively fuse the extracted image features with the atomic number feature, MA flows the atomic number feature across the extracted multi-scale features and uses a channel attention module to adaptively weight the importance of different features. Finally, we add a layer of low-sampling-rate features to obtain more detailed information and mine contextual semantics to enrich the feature representation.</p>
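The channel attention step mentioned above follows the standard Squeeze-and-Excitation design [49]: global average pooling, a bottleneck of two fully connected layers, and a sigmoid gate that reweights channels. A minimal NumPy sketch (the weights here are random placeholders standing in for trained parameters):

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-Excitation over a (C, H, W) feature map.

    w1: (C//r, C) reduction weights; w2: (C, C//r) expansion weights.
    """
    squeeze = feat.mean(axis=(1, 2))         # global average pool -> (C,)
    excite = np.maximum(w1 @ squeeze, 0.0)   # FC + ReLU -> (C//r,)
    scale = sigmoid(w2 @ excite)             # FC + sigmoid -> (C,), in (0, 1)
    return feat * scale[:, None, None]       # reweight each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8, 8))       # a fused feature map
w1 = rng.standard_normal((4, 16)) * 0.1      # reduction ratio r = 4
w2 = rng.standard_normal((16, 4)) * 0.1
out = se_block(feat, w1, w2)
```

Because the gate lies in (0, 1), each channel is attenuated rather than amplified; in MA this lets the network suppress channels dominated by background clutter relative to the material-informative ones.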
<sec id="s3-1">
<title>3.1 Z Prior Generation</title>
<p>Unlike optical images, X-ray images are generated by illuminating objects with X-rays, whose penetration is related to the material&#x2019;s density, size, and composition [<xref ref-type="bibr" rid="B22">22</xref>]. X-ray security machines detect the atomic number of objects based on differences in X-ray absorption and then display a distinct color. Bhowmik et al. [<xref ref-type="bibr" rid="B24">24</xref>] proved <italic>via</italic> extensive experiments that introducing atomic number images is an effective way to improve detection performance. Inspired by this, our ZPG module compresses three-channel X-ray images into a single channel to generate atomic number images that highlight material differences. Compared with manually collecting atomic number images, this significantly reduces costs.</p>
<p>For each pixel in the RGB image, the channel with the maximum value determines the rendered color. We use the index of this channel to classify different materials.<disp-formula id="e1">
<mml:math id="m1">
<mml:msub>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>g</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">ijk</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(1)</label>
</disp-formula>where <italic>x</italic>
<sub>
<italic>ijk</italic>
</sub> denotes the value of the <italic>k</italic>-th channel at position (<italic>i</italic>, <italic>j</italic>) of the input image, and <italic>argmax</italic> (&#x2022;) denotes the index of the maximum element.</p>
<p>Materials of the same class tend to present different depths of color due to different thicknesses. We therefore introduce two variables: the base-value <italic>B</italic> and the width-value <italic>W</italic>. The former distinguishes different materials, and the latter reflects differences within the same material.<disp-formula id="e2">
<mml:math id="m2">
<mml:msub>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m3">
<mml:mtable class="align" columnalign="left">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2217;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2217;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>/</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>255</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>255</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right"/>
<mml:mtd columnalign="left">
<mml:mspace width="1em"/>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2217;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
<mml:mo>&#x2217;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>/</mml:mo>
<mml:mn>255</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(3)</label>
</disp-formula>where <italic>&#x3b1;</italic> and <italic>&#x3b2;</italic> are hyperparameters that control the base-value <italic>B</italic> and the width-value <italic>W</italic>, respectively.</p>
<p>Finally, the base-value <italic>B</italic> and width-value <italic>W</italic> are added and normalized, and then passed through a series of convolutional layers to obtain the atomic number feature <italic>Z</italic>.<disp-formula id="e4">
<mml:math id="m4">
<mml:msub>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="{" close="">
<mml:mrow>
<mml:mtable class="cases">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if</mml:mtext>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>255,255,255</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>/</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if&#x2009;others</mml:mtext>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="e5">
<mml:math id="m5">
<mml:mi>Z</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(5)</label>
</disp-formula>where <inline-formula id="inf1">
<mml:math id="m6">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2022;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula> denotes <italic>n</italic> layers of the &#x201c;Conv-BN-ReLU&#x201d; operation. Since no items are present in white areas, we treat the pixel value (255, 255, 255) specially.</p>
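Eqs 1-4 above can be transcribed directly into a short function. This is a sketch only: the learned Conv-BN-ReLU refinement of Eq 5 is omitted, and the alpha/beta values are illustrative rather than the tuned hyperparameters:

```python
import numpy as np

def zpg(rgb: np.ndarray, alpha: float = 0.1, beta: float = 0.5) -> np.ndarray:
    """Map an (H, W, 3) RGB X-ray image to a single-channel atomic number
    image following Eqs 1-4 (alpha and beta values are illustrative)."""
    x = rgb.astype(np.float64)
    g = x.argmax(axis=-1)                                  # Eq 1: dominant channel index
    b = g + alpha                                          # Eq 2: base-value B
    x_g = np.take_along_axis(x, g[..., None], axis=-1)[..., 0]
    w = ((x.sum(axis=-1) - x_g) * (1 - beta) * (1 - alpha) / (255 + 255)
         + x_g * beta * (1 - alpha) / 255)                 # Eq 3: width-value W
    z = (b + w) / 3.0                                      # Eq 4: add and normalize
    z[(rgb == 255).all(axis=-1)] = 0.0                     # white pixels: no item
    return z
```

The learned refinement Z = phi_n(Z) of Eq 5 would follow as trained Conv-BN-ReLU layers and is not reproduced here.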
</sec>
<sec id="s3-2">
<title>3.2 Material activation</title>
<p>In particular, items of different materials in X-ray images are clearly distinguishable according to their atomic number information, which is vital for suppressing the interference of background information by mining deep material cues.</p>
<p>In cluttered X-ray images, the boundary and color information of prohibited items is easily interfered with by background information. MA introduces the atomic number feature to mine material cues, which helps reduce the interference of useless background information in detecting prohibited items, as shown in <xref ref-type="fig" rid="F6">Figure 6</xref>.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>The bottom part shows the edge detection results obtained directly by the Canny algorithm [<xref ref-type="bibr" rid="B50">50</xref>], and the top part is obtained by first passing the image through the ZPG module and then applying Canny detection. The edges of items processed by the ZPG module are visibly clearer than in the original. The prohibited objects in each X-ray image are annotated with red bounding boxes.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g006.tif"/>
</fig>
<p>Specifically, the backbone network has <italic>n</italic> feature map outputs <italic>F</italic> &#x3d; {<italic>f</italic>
<sub>0</sub>, &#x2026; , <italic>f</italic>
<sub>
<italic>n</italic>&#x2212;1</sub>}. As shown in <xref ref-type="fig" rid="F7">Figure 7</xref>, the MA module takes the first <italic>k</italic> layers of <italic>F</italic> as input. Given the feature maps <italic>Z</italic> and <italic>F</italic> output by the ZPG module and the backbone, we pool the atomic number feature <italic>Z</italic> to enlarge the receptive field and add the feature flowing down from the previous layer, obtaining a more robust feature <italic>M</italic>. We then concatenate <italic>M</italic> with <italic>F</italic> for information fusion and apply a channel attention operation (Squeeze-and-Excitation Networks [<xref ref-type="bibr" rid="B49">49</xref>]) <inline-formula id="inf2">
<mml:math id="m7">
<mml:mi>SE</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2022;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula> on the fused features to balance the importance of the material feature against the other image features (edge, texture, size, etc.).<disp-formula id="e6">
<mml:math id="m8">
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(6)</label>
</disp-formula>
<disp-formula id="e7">
<mml:math id="m9">
<mml:msub>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="italic">SE</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x2016;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(7)</label>
</disp-formula>where &#x2016; denotes the concatenation operation, <inline-formula id="inf3">
<mml:math id="m10">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2022;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula> denotes the Pooling operation.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Illustration of the proposed Material Activation (MA) module, where <italic>k</italic> indicates that the input of the MA module has <italic>k</italic> different-scale feature maps.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g007.tif"/>
</fig>
<p>We separate <italic>F</italic>
<sub>
<italic>ei</italic>
</sub> into <inline-formula id="inf4">
<mml:math id="m11">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf5">
<mml:math id="m12">
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> along the channel dimension, whose dimensions are the same as <italic>f</italic>
<sub>
<italic>i</italic>
</sub> and <inline-formula id="inf6">
<mml:math id="m13">
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>, respectively, where the <inline-formula id="inf7">
<mml:math id="m14">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is used as the input of the next BE module, and the <inline-formula id="inf8">
<mml:math id="m15">
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is passed to the next layer of the MA module as an enhanced atomic number feature to obtain a more robust representation.<disp-formula id="e8">
<mml:math id="m16">
<mml:mfenced open="{" close="">
<mml:mrow>
<mml:mtable class="cases">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mspace width="1em"/>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mspace width="1em"/>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(8)</label>
</disp-formula>
<disp-formula id="e9">
<mml:math id="m17">
<mml:msub>
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="script">U</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2033;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(9)</label>
</disp-formula>where <inline-formula id="inf9">
<mml:math id="m18">
<mml:msubsup>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf10">
<mml:math id="m19">
<mml:msubsup>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> denote the two features obtained by separating <italic>F</italic>
<sub>
<italic>ei</italic>
</sub> along the channel, <inline-formula id="inf11">
<mml:math id="m20">
<mml:mi mathvariant="script">U</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mo>&#x2022;</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> denotes the upsampling operation. In particular, <inline-formula id="inf12">
<mml:math id="m21">
<mml:msub>
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>.</p>
</sec>
<sec id="s3-3">
<title>3.3 Bidirectional Enhancement</title>
<p>A high downsampling rate easily yields large receptive fields and abundant large-scale information, which benefits the detection of large prohibited objects. For small prohibited items, however, an excessive downsampling rate discards too much of their detailed feature information.</p>
<p>In the high-quality HiXray [<xref ref-type="bibr" rid="B34">34</xref>] prohibited items dataset, the average image resolution is 1,200 &#xd7; 900, with the largest being 2,000 &#xd7; 1,024, while some small lighters occupy only 21 &#xd7; 57 pixels, about 1/1,000 the size of the original image. After excessive downsampling, the feature information of lighters is severely lost, resulting in poor detection by SSD [<xref ref-type="bibr" rid="B51">51</xref>], LIM [<xref ref-type="bibr" rid="B34">34</xref>], DOAM [<xref ref-type="bibr" rid="B20">20</xref>], and other detection models.</p>
<p>The BE module adds a low-sampling-rate feature to capture more detailed information about tiny prohibited items. However, this low-sampling-rate feature often contains additional noise, which we remove by performing multiple pooling operations.<disp-formula id="e10">
<mml:math id="m22">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="script">U</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(10)</label>
</disp-formula>where <inline-formula id="inf13">
<mml:math id="m23">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is the final denoised low-sampling-rate feature, and in particular <inline-formula id="inf14">
<mml:math id="m24">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>.</p>
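The iterative denoising of Eq. 10 can be sketched as follows, under our own simplifying assumptions: pooling and upsampling are fixed 2&#xd7; operators, and the learned 1-layer &#x201c;Conv-BN-ReLU&#x201d; is approximated by a 0.5 scaling of the summed branches so the sketch runs without trained weights.

```python
import numpy as np

def avg_pool2(x):
    # pooling operator: 2x2 average pooling over (C, H, W)
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    # upsampling operator: nearest-neighbour 2x upsampling of (C, H, W)
    return x.repeat(2, axis=1).repeat(2, axis=2)

def be_denoise(f3, steps=3):
    # Eq. 10: f3^{i+1} = phi_1( U(D_i(f3^i)) + f3^i ). The learned conv
    # phi_1 is replaced here by a 0.5 scaling so no weights are needed;
    # each pool-then-upsample pass suppresses high-frequency noise.
    x = f3
    for _ in range(steps):
        x = 0.5 * (upsample2(avg_pool2(x)) + x)
    return x
```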
<p>Finally, the material activation feature <inline-formula id="inf15">
<mml:math id="m25">
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> obtained by the MA module, the backbone output features {<italic>f</italic>
<sub>
<italic>k</italic>
</sub>, &#x2026; , <italic>f</italic>
<sub>2</sub>}, and <inline-formula id="inf16">
<mml:math id="m26">
<mml:msubsup>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> are streamed bidirectionally, mining contextual semantics to enrich the feature representation.</p>
</sec>
</sec>
<sec id="s4">
<title>4 Experiments</title>
<sec id="s4-1">
<title>4.1 Datasets and evaluation metrics</title>
<p>We conduct extensive experiments to evaluate our proposed model on two prohibited item detection datasets, HiXray [<xref ref-type="bibr" rid="B34">34</xref>] and OPIXray [<xref ref-type="bibr" rid="B20">20</xref>]. The HiXray dataset consists of 45,364 X-ray images from routine security checks at international airports and contains 102,928 prohibited items in 8 categories commonly seen in daily life, such as lithium batteries, liquids, and lighters. Each image in the HiXray dataset was annotated by an airport employee, which ensures the accuracy of the data. The OPIXray dataset is the first high-quality object detection dataset for security inspection; it focuses on the widely occurring prohibited item &#x201c;cutter&#x201d; and was annotated manually by professional inspectors from an international airport. The dataset contains five categories of prohibited objects in a total of 8,885 X-ray images (7,109 for training and 1,776 for testing).</p>
<p>Average Precision (AP) denotes the area under the precision-recall curve of the detection results for a single object category. To fairly evaluate the performance of all models, we compute the mean average precision (mAP) at an IoU threshold of .5. In addition, we report the AP of every category for each model to show the per-category improvement.</p>
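As a concrete illustration of the metric, the sketch below computes non-interpolated AP for one class from confidence-sorted detections; the `tp` flags are assumed to be pre-matched to ground truth at IoU &#x2265; .5, and note that the official VOC-style mAP additionally interpolates the precision-recall curve.

```python
import numpy as np

def average_precision(tp, n_gt):
    # tp: 1/0 flags for detections sorted by descending confidence,
    #     tp[i] = 1 if detection i matches a ground-truth box (IoU >= .5)
    # n_gt: number of ground-truth boxes of this class
    tp = np.asarray(tp, dtype=float)
    precision = np.cumsum(tp) / (np.arange(len(tp)) + 1)
    # each true positive contributes precision * (1 / n_gt) of recall
    return float(precision[tp == 1].sum() / n_gt)

def mean_average_precision(per_class):
    # per_class: list of (tp_flags, n_gt) tuples, one entry per category
    return sum(average_precision(tp, n) for tp, n in per_class) / len(per_class)
```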
</sec>
<sec id="s4-2">
<title>4.2 Implementation details</title>
<p>All our experiments were implemented in PyTorch and trained on one NVIDIA RTX 3090 GPU with the initial learning rate set to 1e-2. The parameters were optimized by stochastic gradient descent (SGD), with momentum and weight decay set to .937 and .0005, respectively. In addition, the ZPG module introduces two new hyperparameters, <italic>&#x3b1;</italic> and <italic>&#x3b2;</italic>, which control the base value <italic>B</italic> and width value <italic>W</italic>, respectively; their values are set to .4 and .5.</p>
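The training setup above corresponds to a PyTorch configuration along these lines. This is a sketch only: `model` is a placeholder for the ZPGNet-integrated detector, and the <italic>&#x3b1;</italic>, <italic>&#x3b2;</italic> values are consumed inside the ZPG module rather than by the optimizer.

```python
import torch

# placeholder module standing in for the ZPGNet-integrated detector
model = torch.nn.Conv2d(3, 16, kernel_size=3)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-2,            # initial learning rate
    momentum=0.937,
    weight_decay=5e-4,
)

# ZPG hyperparameters (Section 4.2): alpha controls the base value B,
# beta controls the width value W.
zpg_cfg = {"alpha": 0.4, "beta": 0.5}
```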
</sec>
<sec id="s4-3">
<title>4.3 Quantitative results</title>
<p>We evaluate model performance on the HiXray [<xref ref-type="bibr" rid="B34">34</xref>] and OPIXray [<xref ref-type="bibr" rid="B20">20</xref>] datasets. Specifically, we embed ZPGNet into YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>] and YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>] and compare it with the state-of-the-art (SOTA) methods DOAM [<xref ref-type="bibr" rid="B20">20</xref>] and LIM [<xref ref-type="bibr" rid="B34">34</xref>]. <xref ref-type="table" rid="T1">Table 1</xref> presents the experimental results of DOAM, LIM, and the proposed ZPGNet on the HiXray and OPIXray datasets. To illustrate the effectiveness of our method and enable a fair comparison with the existing SOTA models, we use YOLOv3 and YOLOv5s as the baselines.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Quantitative evaluation results on the HiXray dataset and the OPIXray dataset. PO1, PO2, WA, LA, MP, TA, CO, and NL denote &#x201c;Portable Charger 1 (lithium-ion prismatic cell)&#x201d;, &#x201c;Portable Charger 2 (lithium-ion cylindrical cell)&#x201d;, &#x201c;Water,&#x201d; &#x201c;Laptop,&#x201d; &#x201c;Mobile Phone,&#x201d; &#x201c;Tablet,&#x201d; &#x201c;Cosmetic,&#x201d; and &#x201c;Non-metallic Lighter&#x201d; in the HiXray dataset. FO, ST, SC, UT, and MU denote &#x201c;Folding Knife,&#x201d; &#x201c;Straight Knife,&#x201d; &#x201c;Scissor,&#x201d; &#x201c;Utility Knife,&#x201d; and &#x201c;Multi-tool Knife&#x201d; in the OPIXray dataset, respectively.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="center">Method</th>
<th colspan="9" align="center">HiXray</th>
<th colspan="6" align="center">OPIXray</th>
</tr>
<tr>
<th align="left">
<italic>mAP</italic>
<sub>50</sub>
</th>
<th align="left">PO1</th>
<th align="left">PO2</th>
<th align="left">WA</th>
<th align="left">LA</th>
<th align="left">MP</th>
<th align="left">TA</th>
<th align="left">CO</th>
<th align="left">NL</th>
<th align="left">
<italic>mAP</italic>
<sub>50</sub>
</th>
<th align="left">FO</th>
<th align="left">ST</th>
<th align="left">SC</th>
<th align="left">UT</th>
<th align="left">MU</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">SSD [<xref ref-type="bibr" rid="B51">51</xref>]</td>
<td align="left">71.4</td>
<td align="left">87.3</td>
<td align="left">81.0</td>
<td align="left">83.0</td>
<td align="left">97.6</td>
<td align="left">93.5</td>
<td align="left">92.2</td>
<td align="left">36.1</td>
<td align="left">.01</td>
<td align="left">70.9</td>
<td align="left">76.9</td>
<td align="left">35.0</td>
<td align="left">93.4</td>
<td align="left">65.9</td>
<td align="left">83.3</td>
</tr>
<tr>
<td align="left">SSD &#x2b; DOAM [<xref ref-type="bibr" rid="B20">20</xref>]</td>
<td align="left">72.1</td>
<td align="left">88.6</td>
<td align="left">82.9</td>
<td align="left">83.6</td>
<td align="left">97.5</td>
<td align="left">94.1</td>
<td align="left">92.1</td>
<td align="left">38.2</td>
<td align="left">.01</td>
<td align="left">74.0</td>
<td align="left">81.4</td>
<td align="left">41.5</td>
<td align="left">95.1</td>
<td align="left">68.2</td>
<td align="left">83.8</td>
</tr>
<tr>
<td align="left">SSD &#x2b; LIM [<xref ref-type="bibr" rid="B34">34</xref>]</td>
<td align="left">73.1</td>
<td align="left">89.1</td>
<td align="left">84.3</td>
<td align="left">84.0</td>
<td align="left">97.7</td>
<td align="left">92.4</td>
<td align="left">92.4</td>
<td align="left">42.3</td>
<td align="left">0.1</td>
<td align="left">74.6</td>
<td align="left">81.4</td>
<td align="left">42.4</td>
<td align="left">95.9</td>
<td align="left">71.2</td>
<td align="left">82.1</td>
</tr>
<tr>
<td align="left">Xdet [<xref ref-type="bibr" rid="B4">4</xref>]</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">86.7</td>
<td align="left">90.4</td>
<td align="left">76</td>
<td align="left">91.5</td>
<td align="left">84.3</td>
<td align="left">91.3</td>
</tr>
<tr>
<td align="left">FCOS [<xref ref-type="bibr" rid="B43">43</xref>]</td>
<td align="left">75.7</td>
<td align="left">88.6</td>
<td align="left">86.4</td>
<td align="left">86.8</td>
<td align="left">89.9</td>
<td align="left">88.9</td>
<td align="left">88.9</td>
<td align="left">63.0</td>
<td align="left">13.3</td>
<td align="left">82.0</td>
<td align="left">86.4</td>
<td align="left">68.5</td>
<td align="left">90.2</td>
<td align="left">78.4</td>
<td align="left">86.6</td>
</tr>
<tr>
<td align="left">FCOS &#x2b; DOAM [<xref ref-type="bibr" rid="B20">20</xref>]</td>
<td align="left">76.2</td>
<td align="left">88.6</td>
<td align="left">87.5</td>
<td align="left">87.8</td>
<td align="left">89.9</td>
<td align="left">89.7</td>
<td align="left">88.8</td>
<td align="left">63.5</td>
<td align="left">12.7</td>
<td align="left">82.4</td>
<td align="left">86.5</td>
<td align="left">68.6</td>
<td align="left">90.2</td>
<td align="left">78.8</td>
<td align="left">87.7</td>
</tr>
<tr>
<td align="left">FCOS &#x2b; LIM [<xref ref-type="bibr" rid="B34">34</xref>]</td>
<td align="left">77.3</td>
<td align="left">88.9</td>
<td align="left">88.2</td>
<td align="left">88.3</td>
<td align="left">90.0</td>
<td align="left">89.8</td>
<td align="left">89.2</td>
<td align="left">69.8</td>
<td align="left">14.4</td>
<td align="left">83.1</td>
<td align="left">86.6</td>
<td align="left">71.9</td>
<td align="left">90.3</td>
<td align="left">79.9</td>
<td align="left">86.8</td>
</tr>
<tr>
<td align="left">ATSS [<xref ref-type="bibr" rid="B19">19</xref>]</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">86.6</td>
<td align="left">92.3</td>
<td align="left">72.0</td>
<td align="left">96.6</td>
<td align="left">80.38</td>
<td align="left">91.7</td>
</tr>
<tr>
<td align="left">ATSS &#x2b; DOAM [<xref ref-type="bibr" rid="B19">19</xref>]</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">85.6</td>
<td align="left">90.7</td>
<td align="left">66.8</td>
<td align="left">96.2</td>
<td align="left">81.8</td>
<td align="left">92.5</td>
</tr>
<tr>
<td align="left">ATSS &#x2b; Lacls [<xref ref-type="bibr" rid="B19">19</xref>]</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">&#x2014;</td>
<td align="left">88.3</td>
<td align="left">90.0</td>
<td align="left">75.0</td>
<td align="left">97.6</td>
<td align="left">85.7</td>
<td align="left">93.0</td>
</tr>
<tr>
<td align="left">YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>]</td>
<td align="left">81.7</td>
<td align="left">95.5</td>
<td align="left">94.5</td>
<td align="left">92.8</td>
<td align="left">97.9</td>
<td align="left">98.0</td>
<td align="left">94.9</td>
<td align="left">63.7</td>
<td align="left">16.3</td>
<td align="left">87.8</td>
<td align="left">93.4</td>
<td align="left">67.9</td>
<td align="left">98.1</td>
<td align="left">85.4</td>
<td align="left">94.1</td>
</tr>
<tr>
<td align="left">YOLOv5s &#x2b; DOAM [<xref ref-type="bibr" rid="B20">20</xref>]</td>
<td align="left">82.2</td>
<td align="left">95.9</td>
<td align="left">94.7</td>
<td align="left">93.7</td>
<td align="left">98.1</td>
<td align="left">98.1</td>
<td align="left">95.8</td>
<td align="left">65.0</td>
<td align="left">16.1</td>
<td align="left">88.0</td>
<td align="left">93.3</td>
<td align="left">69.3</td>
<td align="left">97.9</td>
<td align="left">84.4</td>
<td align="left">95.0</td>
</tr>
<tr>
<td align="left">YOLOv5s &#x2b; LIM [<xref ref-type="bibr" rid="B34">34</xref>]</td>
<td align="left">83.2</td>
<td align="left">96.1</td>
<td align="left">95.1</td>
<td align="left">
<bold>93.9</bold>
</td>
<td align="left">
<bold>98.2</bold>
</td>
<td align="left">
<bold>98.3</bold>
</td>
<td align="left">
<bold>96.4</bold>
</td>
<td align="left">65.8</td>
<td align="left">21.3</td>
<td align="left">90.6</td>
<td align="left">94.8</td>
<td align="left">77.6</td>
<td align="left">
<bold>98.2</bold>
</td>
<td align="left">
<bold>88.9</bold>
</td>
<td align="left">93.8</td>
</tr>
<tr>
<td align="left">YOLOv5s &#x2b; ZPGNet (Ours)</td>
<td align="left">83.9</td>
<td align="left">95.7</td>
<td align="left">
<bold>95.2</bold>
</td>
<td align="left">92.5</td>
<td align="left">96.5</td>
<td align="left">97.7</td>
<td align="left">94.4</td>
<td align="left">66.4</td>
<td align="left">
<bold>33.0</bold>
</td>
<td align="left">
<bold>90.7</bold>
</td>
<td align="left">
<bold>95.0</bold>
</td>
<td align="left">
<bold>79.3</bold>
</td>
<td align="left">98.0</td>
<td align="left">86.8</td>
<td align="left">94.2</td>
</tr>
<tr>
<td align="left">YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>]</td>
<td align="left">83.0</td>
<td align="left">
<bold>96.7</bold>
</td>
<td align="left">94.9</td>
<td align="left">91.9</td>
<td align="left">97.9</td>
<td align="left">97.7</td>
<td align="left">94.0</td>
<td align="left">71.9</td>
<td align="left">18.6</td>
<td align="left">78.2</td>
<td align="left">92.5</td>
<td align="left">36.0</td>
<td align="left">97.3</td>
<td align="left">70.8</td>
<td align="left">
<bold>94.4</bold>
</td>
</tr>
<tr>
<td align="left">YOLOv3&#x2b;ZPGNet (Ours)</td>
<td align="left">
<bold>84.4</bold>
</td>
<td align="left">96.6</td>
<td align="left">95.2</td>
<td align="left">92.7</td>
<td align="left">97.7</td>
<td align="left">98.0</td>
<td align="left">95.2</td>
<td align="left">
<bold>73.8</bold>
</td>
<td align="left">26.1</td>
<td align="left">85.4</td>
<td align="left">88.5</td>
<td align="left">65.1</td>
<td align="left">96.7</td>
<td align="left">83.5</td>
<td align="left">93.3</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values represent the best performance in the same evaluation index.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<sec id="s4-3-1">
<title>4.3.1 Results on HiXray dataset</title>
<p>The experimental results of different algorithms on the HiXray [<xref ref-type="bibr" rid="B34">34</xref>] dataset are shown in <xref ref-type="table" rid="T1">Table 1</xref>. For a fair comparison, we adopt the same YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>] baseline as DOAM [<xref ref-type="bibr" rid="B20">20</xref>] and LIM [<xref ref-type="bibr" rid="B34">34</xref>], on which both methods achieve their best results. The proposed ZPGNet with the YOLOv5s baseline improves mean average precision to 83.9%, outperforming DOAM and LIM by 1.7% <italic>mAP</italic>
<sub>50</sub> and .7% <italic>mAP</italic>
<sub>50</sub>, respectively. To further verify the effectiveness of our model, we also adopt the YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>] baseline, which is still 1.2% <italic>mAP</italic>
<sub>50</sub> higher than the SOTA method (YOLOv5s &#x2b; LIM).</p>
<p>The YOLOv3&#x2b;ZPGNet results show that our method trails some methods in the Water, Laptop, Mobile Phone, and Tablet categories, but improves <italic>AP</italic> by 8.0% and 4.8% in the cosmetic and lighter categories, respectively, compared to the SOTA method LIM. Cosmetics are mixtures and are commonly disturbed by organic substances (such as plastics), which lowers detection confidence or even causes missed detections. The significant improvement on cosmetics indicates that introducing the atomic number feature map better reduces the interference of useless information, as shown in <xref ref-type="fig" rid="F8">Figure 8</xref>. This advantage comes from paying extra attention to material information through the atomic number features. Lighters in luggage are tiny and prone to severe feature loss after downsampling. Our method achieves an 11.7% <italic>AP</italic> improvement over LIM [<xref ref-type="bibr" rid="B34">34</xref>] with the same YOLOv5s baseline in the lighter category, because the BE module uses a low-sampling-rate feature map to preserve the information of small prohibited items.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Visualizations of the original images, atomic number images, and detection results of the ZPGNet-integrated model. Our proposed ZPGNet uses atomic number images to pay more attention to material information and thus achieves better performance.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g008.tif"/>
</fig>
</sec>
<sec id="s4-3-2">
<title>4.3.2 Results on OPIXray dataset</title>
<p>
<xref ref-type="table" rid="T1">Table 1</xref> reports the performance of our method on the OPIXray [<xref ref-type="bibr" rid="B20">20</xref>] dataset. With the same YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>] baseline, ZPGNet outperforms DOAM [<xref ref-type="bibr" rid="B20">20</xref>] and LIM [<xref ref-type="bibr" rid="B34">34</xref>] by 2.7% <italic>mAP</italic>
<sub>50</sub> and .1% <italic>mAP</italic>
<sub>50</sub>, respectively. In particular, ZPGNet has the highest score on <italic>mAP</italic>
<sub>50</sub> among all the models. The proposed ZPGNet also achieves a significant performance improvement based on YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>]; in particular, the <italic>AP</italic> of the severely occluded prohibited item &#x201c;straight knife&#x201d; improves by 29.1%. This benefit comes from the fact that our method effectively removes the interference of irrelevant background information.</p>
</sec>
</sec>
<sec id="s4-4">
<title>4.4 Generality verification</title>
<p>To further evaluate the effectiveness of the proposed ZPGNet and verify that it can be applied to various detection networks, we embed our method into the classical detection models YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>], RetinaNet [<xref ref-type="bibr" rid="B41">41</xref>], and YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>]. Experiments were performed on the OPIXray dataset [<xref ref-type="bibr" rid="B20">20</xref>]. As shown in <xref ref-type="table" rid="T2">Table 2</xref>, ZPGNet improves YOLOv3 by 7.2% <italic>mAP</italic>
<sub>50</sub>, RetinaNet by .7% <italic>mAP</italic>
<sub>50</sub>, and YOLOv5s by 2.9% <italic>mAP</italic>
<sub>50</sub>, respectively. Many objects are commonly disturbed by irrelevant items, which quickly leads to low confidence or even missed detection in general detection models. As shown in <xref ref-type="fig" rid="F9">Figure 9</xref>, the comparison of results in the first and second rows shows a clear improvement after introducing the atomic number features, even for detections that already had high confidence. Embedding ZPGNet makes the network pay more attention to object material information, reducing the interference of ineffective information and alleviating low confidence and missed detection. This indicates that our model can be embedded into most detection networks as a plug-and-play component to minimize the interference of useless background information and achieve better performance.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Comparisons between the ZPGNet-integrated network and three object detection methods.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="left">
<italic>mAP</italic>
<sub>50</sub>
</th>
<th align="left">FO</th>
<th align="left">ST</th>
<th align="left">SC</th>
<th align="left">UT</th>
<th align="left">MU</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">RetinaNet [<xref ref-type="bibr" rid="B41">41</xref>]</td>
<td align="left">87.4</td>
<td align="left">89.4</td>
<td align="left">69.2</td>
<td align="left">98.2</td>
<td align="left">
<bold>86.3</bold>
</td>
<td align="left">
<bold>94.0</bold>
</td>
</tr>
<tr>
<td align="left">RetinaNet &#x2b; ZPGNet</td>
<td align="left">
<bold>88.1</bold>
</td>
<td align="left">
<bold>91.3</bold>
</td>
<td align="left">
<bold>72.1</bold>
</td>
<td align="left">
<bold>98.7</bold>
</td>
<td align="left">85.8</td>
<td align="left">92.6</td>
</tr>
<tr>
<td align="left">YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>]</td>
<td align="left">87.8</td>
<td align="left">93.4</td>
<td align="left">67.9</td>
<td align="left">
<bold>98.1</bold>
</td>
<td align="left">85.4</td>
<td align="left">94.1</td>
</tr>
<tr>
<td align="left">YOLOv5s &#x2b; ZPGNet</td>
<td align="left">
<bold>90.7</bold>
</td>
<td align="left">
<bold>95.0</bold>
</td>
<td align="left">
<bold>79.3</bold>
</td>
<td align="left">98.0</td>
<td align="left">
<bold>86.8</bold>
</td>
<td align="left">
<bold>94.2</bold>
</td>
</tr>
<tr>
<td align="left">YOLOv3 [<xref ref-type="bibr" rid="B42">42</xref>]</td>
<td align="left">78.2</td>
<td align="left">
<bold>92.5</bold>
</td>
<td align="left">36.0</td>
<td align="left">
<bold>97.3</bold>
</td>
<td align="left">70.8</td>
<td align="left">
<bold>94.4</bold>
</td>
</tr>
<tr>
<td align="left">YOLOv3&#x2b;ZPGNet</td>
<td align="left">
<bold>85.4</bold>
</td>
<td align="left">88.5</td>
<td align="left">
<bold>65.1</bold>
</td>
<td align="left">96.7</td>
<td align="left">
<bold>83.5</bold>
</td>
<td align="left">93.3</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>We embedded our method into three different baseline models. Each baseline and its ZPGNet-integrated counterpart form a group; bold figures denote the best performance within a group.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Visual results of the baseline YOLOv3 and the ZPGNet-integrated model. The baseline YOLOv3 misses many prohibited items or detects them with low confidence. After embedding the proposed ZPGNet, detection improves significantly, especially on heavily cluttered X-ray images.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g009.tif"/>
</fig>
</sec>
<sec id="s4-5">
<title>4.5 Ablation study</title>
<p>In this subsection, we conduct a series of ablation experiments to analyze the influence of the involved hyperparameters and the contribution of the critical components of the proposed ZPGNet. All ablation experiments were performed on the HiXray dataset [<xref ref-type="bibr" rid="B34">34</xref>].</p>
<sec id="s4-5-1">
<title>4.5.1 Effectiveness of ZPG, MA, and BE</title>
<p>ZPG, MA, and BE are the essential modules of ZPGNet, and we embed them one by one into YOLOv5s [<xref ref-type="bibr" rid="B47">47</xref>] to evaluate their contributions. Since ZPG requires the support of MA, we insert ZPG and MA into the model together. All experiments here uniformly set the number of MA layers to 2. As shown in <xref ref-type="table" rid="T3">Table 3</xref>, the network embedded with the ZPG and MA modules improves performance by 1.4% <italic>mAP</italic><sub>50</sub> over the base model, most notably by 5.3% <italic>mAP</italic><sub>50</sub> in the cosmetics category. Cosmetics are commonly disturbed by organic substances (such as plastics), resulting in low confidence and missed detections. The significant improvement on cosmetics indicates that introducing the atomic number features helps our method suppress useless information, as shown in <xref ref-type="fig" rid="F10">Figure 10</xref>. After applying the Bidirectional Enhancement (BE) module, the performance is 2.2% <italic>mAP</italic><sub>50</sub> higher than the base model and 0.8% <italic>mAP</italic><sub>50</sub> higher than the variant with only ZPG and MA, which proves the effectiveness of the BE module.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Ablation results of the proposed ZPG, MA, and BE on the HiXray dataset.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="left">
<italic>mAP</italic>
<sub>50</sub>
</th>
<th align="left">PO1</th>
<th align="left">PO2</th>
<th align="left">WA</th>
<th align="left">LA</th>
<th align="left">MP</th>
<th align="left">TA</th>
<th align="left">CO</th>
<th align="left">NL</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">YOLOV5s [<xref ref-type="bibr" rid="B47">47</xref>]</td>
<td align="left">81.7</td>
<td align="left">95.5</td>
<td align="left">94.5</td>
<td align="left">
<bold>92.8</bold>
</td>
<td align="left">
<bold>97.9</bold>
</td>
<td align="left">
<bold>98.0</bold>
</td>
<td align="left">
<bold>94.9</bold>
</td>
<td align="left">63.7</td>
<td align="left">16.3</td>
</tr>
<tr>
<td align="left">&#x2b;ZPG &#x2b; MA</td>
<td align="left">83.1</td>
<td align="left">95.3</td>
<td align="left">
<bold>95.5</bold>
</td>
<td align="left">92.4</td>
<td align="left">94.9</td>
<td align="left">97.7</td>
<td align="left">93.6</td>
<td align="left">
<bold>69.0</bold>
</td>
<td align="left">26.0</td>
</tr>
<tr>
<td align="left">&#x2b;ZPG &#x2b; MA &#x2b; BE</td>
<td align="left">
<bold>83.9</bold>
</td>
<td align="left">
<bold>95.7</bold>
</td>
<td align="left">95.2</td>
<td align="left">92.5</td>
<td align="left">96.5</td>
<td align="left">97.7</td>
<td align="left">94.4</td>
<td align="left">66.4</td>
<td align="left">
<bold>33.0</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values represent the best performance in the same evaluation index.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Performance comparison across categories. The number on the gray line indicates the log-average miss rate. Interference from useless background information can easily lead to missed detections of prohibited items. With the proposed ZPG, MA, and BE, the log-average miss rate of prohibited items (e.g., cosmetics and lighters) is significantly reduced.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g010.tif"/>
</fig>
</sec>
<sec id="s4-5-2">
<title>4.5.2 Number of layers in MAs</title>
<p>We also examine the effect of the number of layers in the proposed MA, as shown in <xref ref-type="fig" rid="F11">Figure 11</xref>. The model performs best with two layers, while adding more layers degrades the performance of the MA module. A likely reason is that over-introducing the atomic number feature suppresses other essential cues, which hurts performance. Two MA layers balance the atomic number feature well against the other features, so we set the number of layers in each MA to 2 in all other experiments.</p>
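The layer-count trade-off can be made concrete with a toy computation (our illustration, not the paper's architecture): if each MA layer multiplies features by the same Z-derived gate, stacking layers compounds the amplification, so a deep stack lets the prior swamp the other cues.

```python
import numpy as np

def apply_ma_stack(features, z_gate, num_layers=2):
    """Toy model: each layer amplifies Z-salient regions by (1 + gate).
    Stacking compounds the factor, illustrating why too many layers
    over-weight the atomic number feature relative to other cues."""
    out = features.copy()
    for _ in range(num_layers):
        out = out * (1.0 + z_gate)
    return out

f = np.ones((4, 4))            # toy features
g = np.full((4, 4), 0.5)       # toy Z gate
shallow = apply_ma_stack(f, g, num_layers=2)  # factor 1.5**2 = 2.25
deep = apply_ma_stack(f, g, num_layers=6)     # factor 1.5**6, prior dominates
```

Under this toy model, two layers roughly double the weight of Z-salient regions, while six layers inflate it more than tenfold, consistent with the degradation observed for deeper MA stacks.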
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption>
<p>Bar graph of the AP variation across all categories for MA modules with different numbers of layers.</p>
</caption>
<graphic xlink:href="fphy-10-1117261-g011.tif"/>
</fig>
</sec>
</sec>
</sec>
<sec sec-type="conclusion" id="s5">
<title>5 Conclusion</title>
<p>Prohibited item detection in X-ray images is an effective measure for maintaining public safety. The interference of a large amount of useless background information, caused by the disordered placement of objects, is an urgent problem in prohibited item detection. Inspired by the imaging characteristics of X-ray images, this paper proposes an atomic number Z Prior Generation (ZPG) method that automatically generates atomic number images and reduces the cost of manual acquisition. Furthermore, we design an atomic number Z Prior Guided Network (ZPGNet) to address the interference of useless background information in prohibited item detection. ZPGNet propagates the atomic number Z information across scales through the network to mine deep material clues and reduce the interference of irrelevant background. We comprehensively evaluate ZPGNet on the HiXray and OPIXray datasets, and the results show that ZPGNet can be embedded into most detection networks as a plug-and-play module and achieves higher performance. Severe occlusion remains a problem in X-ray images that this paper does not address. In the future, we intend to use features such as contour and scale to resolve occlusion between items.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xlink:href="https://github.com/OPIXray-author/OPIXray">https://github.com/OPIXray-author/OPIXray</ext-link>.</p>
</sec>
<sec id="s7">
<title>Author contributions</title>
<p>Conceptualization, JC, JL, MM, XG, and SG; methodology, JC; software, MM, and SG; validation, JC; investigation, JL and SG; writing&#x2014;original draft preparation, JC and JL; writing&#x2014;review and editing, XG, MM, and SG; visualization, JC; funding acquisition, JL and XG. All authors have read and agreed to the published version of the manuscript.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This work was supported in part by the National Natural Science Foundation of China under Grants No. 62102057 and No. 62036007, in part by the Natural Science Foundation of Chongqing under Grant No. CSTB2022NSCQ-MSX1024, in part by the Chongqing Postdoctoral Innovative Talent Plan under Grant No. CQBX202217, in part by the Postdoctoral Science Foundation of China under Grant No. 2022M720548, in part by the Special Project on Technological Innovation and Application Development under Grant No. cstc2020jscx-dxwtB0032, and in part by the Chongqing Excellent Scientist Project under Grant No. cstc2021ycjh-bgzxm0339.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gaus</surname>
<given-names>YFA</given-names>
</name>
<name>
<surname>Bhowmik</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Ak&#xe7;ay</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Guill&#xe9;n-Garcia</surname>
<given-names>PM</given-names>
</name>
<name>
<surname>Barker</surname>
<given-names>JW</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>TP</given-names>
</name>
</person-group>. <article-title>Evaluation of a dual convolutional neural network architecture for object-wise anomaly detection in cluttered x-ray security imagery</article-title>. In: <conf-name>2019 international joint conference on neural networks (IJCNN)</conf-name>; <conf-date>July 14-19, 2019</conf-date>; <conf-loc>Budapest, Hungary</conf-loc> (<year>2019</year>).</citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Hassan</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Bettayeb</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ak&#xe7;ay</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Khan</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bennamoun</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Werghi</surname>
<given-names>N</given-names>
</name>
</person-group>. <article-title>Detecting prohibited items in x-ray images: A contour proposal learning approach</article-title>. In: <conf-name>2020 IEEE International Conference on Image Processing (ICIP)</conf-name>; <conf-date>October 25-28, 2020</conf-date> (<year>2020</year>).</citation>
</ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Isaac-Medina</surname>
<given-names>BK</given-names>
</name>
<name>
<surname>Willcocks</surname>
<given-names>CG</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>TP</given-names>
</name>
</person-group>. <article-title>Multi-view object detection using epipolar constraints within cluttered x-ray security imagery</article-title>. In: <conf-name>2020 25th International Conference on Pattern Recognition (ICPR)</conf-name>; <conf-date>10 - 15 January 2021</conf-date>; <conf-loc>ITALY</conf-loc> (<year>2021</year>). p. <fpage>9889</fpage>.</citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Detecting prohibited objects with physical size constraint from cluttered x-ray baggage images</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>237</volume>:<fpage>107916</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2021.107916</pub-id>
</citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xia</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Ling</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>An efficient and robust target detection algorithm for identifying minor defects of printed circuit board based on phfe and fl-rfcn</article-title>. <source>Front Phys</source> (<year>2021</year>) <volume>9</volume>:<fpage>661091</fpage>. <pub-id pub-id-type="doi">10.3389/fphy.2021.661091</pub-id>
</citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Franzel</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Roth</surname>
<given-names>S</given-names>
</name>
</person-group>. <article-title>Object detection in multi-view x-ray images</article-title>. In: <source>Joint DAGM (German association for pattern recognition) and OAGM symposium</source>. <publisher-loc>Charm</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2012</year>). p. <fpage>144</fpage>&#x2013;<lpage>54</lpage>.</citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gao</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Huyan</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>B</given-names>
</name>
<etal/>
</person-group> <article-title>Small foreign metal objects detection in x-ray images of clothing products using faster r-cnn and feature pyramid network</article-title>. <source>IEEE Trans Instrumentation Meas</source> (<year>2021</year>) <volume>70</volume>:<fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1109/tim.2021.3077666</pub-id>
</citation>
</ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Luz</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Silva</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Silva</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Silva</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Guimar&#xe3;es</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Miozzo</surname>
<given-names>G</given-names>
</name>
<etal/>
</person-group> <article-title>Towards an effective and efficient deep learning model for Covid-19 patterns detection in x-ray images</article-title>. <source>Res Biomed Eng</source> (<year>2022</year>) <volume>38</volume>:<fpage>149</fpage>&#x2013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1007/s42600-021-00151-6</pub-id>
</citation>
</ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akcay</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>T</given-names>
</name>
</person-group>. <article-title>Towards automatic threat detection: A survey of advances of deep learning within x-ray security imaging</article-title>. <source>Pattern Recognition</source> (<year>2022</year>) <volume>122</volume>:<fpage>108245</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.108245</pub-id>
</citation>
</ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Fei</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zuo</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>C-W</given-names>
</name>
</person-group>. <article-title>Deep learning on image denoising: An overview</article-title>. <source>Neural Networks</source> (<year>2020</year>) <volume>131</volume>:<fpage>251</fpage>&#x2013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2020.07.025</pub-id>
</citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goudet</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Grelier</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Hao</surname>
<given-names>J-K</given-names>
</name>
</person-group>. <article-title>A deep learning guided memetic framework for graph coloring problems</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>258</volume>:<fpage>109986</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.109986</pub-id>
</citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Xiang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Duan</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>Y</given-names>
</name>
</person-group>. <article-title>Incremental learning based multi-domain adaptation for object detection</article-title>. <source>Knowledge-Based Syst</source> (<year>2020</year>) <volume>210</volume>:<fpage>106420</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2020.106420</pub-id>
</citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>P&#xe9;rez-Hern&#xe1;ndez</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Tabik</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Lamas</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Olmos</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Fujita</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Herrera</surname>
<given-names>F</given-names>
</name>
</person-group>. <article-title>Object detection binary classifiers methodology based on deep learning to identify small objects handled similarly: Application in video surveillance</article-title>. <source>Knowledge-Based Syst</source> (<year>2020</year>) <volume>194</volume>:<fpage>105590</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2020.105590</pub-id>
</citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zuo</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>Vision transformers for dense prediction: A survey</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>253</volume>:<fpage>109552</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.109552</pub-id>
</citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Flitton</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>TP</given-names>
</name>
<name>
<surname>Megherbi</surname>
<given-names>N</given-names>
</name>
</person-group>. <article-title>A comparison of 3d interest point descriptors with application to airport baggage object detection in complex ct imagery</article-title>. <source>Pattern Recognition</source> (<year>2013</year>) <volume>46</volume>:<fpage>2420</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2013.02.008</pub-id>
</citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bhowmik</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Gaus</surname>
<given-names>YFA</given-names>
</name>
<name>
<surname>Szarek</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>TP</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>The good, the bad and the ugly: Evaluating convolutional neural networks for prohibited item detection using real and synthetically composited x-ray imagery</article-title>. <comment>arXiv preprint arXiv:1909.11508</comment>
</citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gaus</surname>
<given-names>YFA</given-names>
</name>
<name>
<surname>Bhowmik</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Akcay</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>T</given-names>
</name>
</person-group>. <article-title>Evaluating the transferability and adversarial discrimination of convolutional neural networks for threat object detection and classification within x-ray security imagery</article-title>. In: <conf-name>2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)</conf-name>; <conf-date>December 16-19, 2019</conf-date>; <conf-loc>Boca Raton, Florida, USA</conf-loc> (<year>2019</year>). p. <fpage>420</fpage>&#x2013;<lpage>5</lpage>.</citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hassan</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Khan</surname>
<given-names>SH</given-names>
</name>
<name>
<surname>Akcay</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Bennamoun</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Werghi</surname>
<given-names>N</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Cascaded structure tensor framework for robust identification of heavily occluded baggage items from multi-vendor x-ray scans</article-title>. <comment>
<italic>arXiv preprint arXiv:1912.04251</italic>
</comment>
</citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Dou</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Detecting overlapped objects in x-ray security imagery by a label-aware mechanism</article-title>. <source>IEEE Trans Inf Forensics Security</source> (<year>2022</year>) <volume>17</volume>:<fpage>998</fpage>&#x2013;<lpage>1009</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2022.3154287</pub-id>
</citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>Occluded prohibited items detection: An x-ray security inspection benchmark and de-occlusion attention module</article-title>. In: <conf-name>Proceedings of the 28th ACM International Conference on Multimedia</conf-name>; <conf-date>October 12 - 16, 2020</conf-date>; <conf-loc>Seattle WA USA</conf-loc> (<year>2020</year>). p. <fpage>138</fpage>&#x2013;<lpage>46</lpage>.</citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Otsu</surname>
<given-names>N</given-names>
</name>
</person-group>. <article-title>A threshold selection method from gray-level histograms</article-title>. <source>IEEE Transactions Systems, Man, Cybernetics</source> (<year>1979</year>) <volume>9</volume>:<fpage>62</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1109/tsmc.1979.4310076</pub-id>
</citation>
</ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mery</surname>
<given-names>D</given-names>
</name>
</person-group>. <source>Computer vision for x-ray testing</source>, <volume>10</volume>. <publisher-loc>Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name> (<year>2015</year>). p. <fpage>973</fpage>&#x2013;<lpage>8</lpage>.</citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liang</surname>
<given-names>KJ</given-names>
</name>
<name>
<surname>Heilmann</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Gregory</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Diallo</surname>
<given-names>SO</given-names>
</name>
<name>
<surname>Carlson</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Spell</surname>
<given-names>GP</given-names>
</name>
<etal/>
</person-group> <article-title>Automatic threat recognition of prohibited items at aviation checkpoint with x-ray imaging: A deep learning approach</article-title>. <source>Anomaly Detect Imaging X-Rays (Adix) (Spie)</source> (<year>2018</year>) <volume>10632</volume>:<fpage>1063203</fpage>. <pub-id pub-id-type="doi">10.1117/12.2309484</pub-id>
</citation>
</ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Bhowmik</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gaus</surname>
<given-names>YFA</given-names>
</name>
<name>
<surname>Breckon</surname>
<given-names>TP</given-names>
</name>
</person-group>. <article-title>On the impact of using x-ray energy response imagery for object detection via convolutional neural networks</article-title>. In: <conf-name>2021 IEEE International Conference on Image Processing (ICIP)</conf-name>; <conf-date>19-22 September, 2021</conf-date>; <conf-loc>Alaska, USA</conf-loc> (<year>2021</year>). p. <fpage>1224</fpage>.</citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Viriyasaranon</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Chae</surname>
<given-names>S-H</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>J-H</given-names>
</name>
</person-group>. <article-title>Mfa-net: Object detection for complex x-ray cargo and baggage security imagery</article-title>. <source>Plos one</source> (<year>2022</year>) <volume>17</volume>:<fpage>e0272961</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0272961</pub-id>
</citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Miao</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Jiao</surname>
<given-names>J</given-names>
</name>
<etal/>
</person-group> <article-title>Sixray: A large-scale security inspection x-ray benchmark for prohibited item discovery in overlapping images</article-title>. In: <conf-name>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>; <conf-date>June 20 2021 to June 25 2021</conf-date>; <conf-loc>Nashville, TN, USA.</conf-loc> (<year>2019</year>). p. <fpage>2119</fpage>&#x2013;<lpage>28</lpage>.</citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mery</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Riffo</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Zscherpel</surname>
<given-names>U</given-names>
</name>
<name>
<surname>Mondrag&#xf3;n</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Lillo</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Zuccar</surname>
<given-names>I</given-names>
</name>
<etal/>
</person-group> <article-title>Gdxray: The database of x-ray images for nondestructive testing</article-title>. <source>J Nondestructive Eval</source> (<year>2015</year>) <volume>34</volume>:<fpage>42</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1007/s10921-015-0315-7</pub-id>
</citation>
</ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Talamonti</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Kanxheri</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Pallotta</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Servoli</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Diamond detectors for radiotherapy x-ray small beam dosimetry</article-title>. <source>Front Phys</source> (<year>2021</year>) <volume>9</volume>:<fpage>632299</fpage>. <pub-id pub-id-type="doi">10.3389/fphy.2021.632299</pub-id>
</citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fourcade</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Khonsari</surname>
<given-names>R</given-names>
</name>
</person-group>. <article-title>Deep learning in medical image analysis: A third eye for doctors</article-title>. <source>J stomatology, Oral Maxill Surg</source> (<year>2019</year>) <volume>120</volume>:<fpage>279</fpage>&#x2013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1016/j.jormas.2019.06.002</pub-id>
</citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Sodha</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Rahman Siddiquee</surname>
<given-names>MM</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Tajbakhsh</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gotway</surname>
<given-names>MB</given-names>
</name>
<etal/>
</person-group> <article-title>Models Genesis: Generic autodidactic models for 3d medical image analysis</article-title>. In: <conf-name>International conference on medical image computing and computer-assisted intervention</conf-name>; <conf-date>13-17 October 2019</conf-date>; <conf-loc>Shenzhen, China</conf-loc> (<year>2019</year>). p. <fpage>384</fpage>&#x2013;<lpage>93</lpage>.</citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>T</given-names>
</name>
</person-group>. <article-title>Mh-net: Model-data-driven hybrid-fusion network for medical image segmentation</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>248</volume>:<fpage>108795</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.108795</pub-id>
</citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Nie</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
</person-group>. <article-title>Unified medical image segmentation by learning from uncertainty in an end-to-end manner</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>241</volume>:<fpage>108215</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.108215</pub-id>
</citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Huangliang</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>H</given-names>
</name>
</person-group>. <article-title>TransUNet&#x2b;: Redesigning the skip connection to enhance features in medical image segmentation</article-title>. <source>Knowledge-Based Syst</source> (<year>2022</year>) <volume>256</volume>:<fpage>109859</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2022.109859</pub-id>
</citation>
</ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Tao</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<etal/>
</person-group> <article-title>Towards real-world x-ray security inspection: A high-quality benchmark and lateral inhibition module for prohibited items detection</article-title>. In: <conf-name>Proceedings of the IEEE/CVF International Conference on Computer Vision</conf-name>; <conf-date>11-17 October 2021</conf-date>; <conf-loc>virtual</conf-loc> (<year>2021</year>). p. <fpage>10923</fpage>.</citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Redmon</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Divvala</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Girshick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Farhadi</surname>
<given-names>A</given-names>
</name>
</person-group>. <article-title>You only look once: Unified, real-time object detection</article-title>. In: <conf-name>Proceedings of the IEEE conference on computer vision and pattern recognition</conf-name>; <conf-date>27-30 June 2016</conf-date>; <conf-loc>Las Vegas, NV, USA</conf-loc> (<year>2016</year>). p. <fpage>779</fpage>&#x2013;<lpage>88</lpage>.</citation>
</ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhao</surname>
<given-names>Z-Q</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>S-T</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>Object detection with deep learning: A review</article-title>. <source>IEEE Trans Neural Networks Learn Syst</source> (<year>2019</year>) <volume>30</volume>:<fpage>3212</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1109/tnnls.2018.2876865</pub-id>
</citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zou</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>J</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Object detection in 20 years: A survey</article-title>. <comment>
<italic>arXiv preprint arXiv:1905.05055</italic>
</comment>
</citation>
</ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Velayudhan</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Hassan</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Damiani</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Werghi</surname>
<given-names>N</given-names>
</name>
</person-group>. <article-title>Recent advances in baggage threat detection: A comprehensive and systematic survey</article-title>. <source>ACM Comput Surv</source> (<year>2022</year>). <pub-id pub-id-type="doi">10.1145/3549932</pub-id>
</citation>
</ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>C-W</given-names>
</name>
</person-group>. <article-title>Se-ssd: Self-ensembling single-stage object detector from point cloud</article-title>. In: <conf-name>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>; <conf-date>19-25 June 2021</conf-date>; <conf-loc>Nashville, TN, USA</conf-loc> (<year>2021</year>). p. <fpage>14494</fpage>&#x2013;<lpage>503</lpage>.</citation>
</ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z</given-names>
</name>
</person-group>. <article-title>A lightweight one-stage defect detection network for small object based on dual attention mechanism and pafpn</article-title>. <source>Front Phys</source> (<year>2021</year>) <volume>9</volume>:<fpage>708097</fpage>. <pub-id pub-id-type="doi">10.3389/fphy.2021.708097</pub-id>
</citation>
</ref>
<ref id="B41">
<label>41.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Lin</surname>
<given-names>T-Y</given-names>
</name>
<name>
<surname>Goyal</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Girshick</surname>
<given-names>R</given-names>
</name>
<name>
<surname>He</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Doll&#xe1;r</surname>
<given-names>P</given-names>
</name>
</person-group>. <article-title>Focal loss for dense object detection</article-title>. In: <conf-name>Proceedings of the IEEE international conference on computer vision</conf-name>; <conf-date>22-29 October 2017</conf-date>; <conf-loc>Venice, Italy</conf-loc> (<year>2017</year>). p. <fpage>2980</fpage>&#x2013;<lpage>8</lpage>.</citation>
</ref>
<ref id="B42">
<label>42.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Redmon</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Farhadi</surname>
<given-names>A</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Yolov3: An incremental improvement</article-title>. <comment>
<italic>arXiv preprint arXiv:1804.02767</italic>
</comment>
</citation>
</ref>
<ref id="B43">
<label>43.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>He</surname>
<given-names>T</given-names>
</name>
</person-group>. <article-title>Fcos: Fully convolutional one-stage object detection</article-title>. In: <conf-name>Proceedings of the IEEE/CVF international conference on computer vision</conf-name>; <conf-date>27 October - 2 November 2019</conf-date>; <conf-loc>Seoul, South Korea</conf-loc> (<year>2019</year>). p. <fpage>9627</fpage>.</citation>
</ref>
<ref id="B44">
<label>44.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ouyang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Qiu</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>Deepid-net: Multi-stage and deformable deep convolutional neural networks for object detection</article-title>. <comment>arXiv preprint arXiv:1409.3505</comment>
</citation>
</ref>
<ref id="B45">
<label>45.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Du</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>Overview of two-stage object detection algorithms</article-title>. <source>J Phys Conf Ser</source> (<year>2020</year>) <volume>1544</volume>:<fpage>012033</fpage>. <pub-id pub-id-type="doi">10.1088/1742-6596/1544/1/012033</pub-id>
</citation>
</ref>
<ref id="B46">
<label>46.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Xie</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>J</given-names>
</name>
</person-group>. <article-title>Oriented r-cnn for object detection</article-title>. In: <conf-name>Proceedings of the IEEE/CVF International Conference on Computer Vision</conf-name>; <conf-date>11-17 Oct. 2021</conf-date> (<year>2021</year>). p. <fpage>3520</fpage>&#x2013;<lpage>9</lpage>.</citation>
</ref>
<ref id="B47">
<label>47.</label>
<citation citation-type="other">
<person-group person-group-type="author">
<name>
<surname>Jocher</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Stoken</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Borovec</surname>
<given-names>J</given-names>
</name>
<name>
<surname>NanoCode012</surname>
</name>
<name>
<surname>Changyu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <comment>ultralytics/yolov5: v3</comment>
</citation>
</ref>
<ref id="B48">
<label>48.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shao</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Z</given-names>
</name>
</person-group>. <article-title>Exploiting foreground and background separation for prohibited item detection in overlapping x-ray images</article-title>. <source>Pattern Recognition</source> (<year>2022</year>) <volume>122</volume>:<fpage>108261</fpage>. <pub-id pub-id-type="doi">10.1016/j.patcog.2021.108261</pub-id>
</citation>
</ref>
<ref id="B49">
<label>49.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>G</given-names>
</name>
</person-group>. <article-title>Squeeze-and-excitation networks</article-title>. In: <conf-name>Proceedings of the IEEE conference on computer vision and pattern recognition</conf-name>; <conf-date>18-22 June 2018</conf-date>; <conf-loc>Salt Lake City, UT, USA</conf-loc> (<year>2018</year>). p. <fpage>7132</fpage>.</citation>
</ref>
<ref id="B50">
<label>50.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ding</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Goshtasby</surname>
<given-names>A</given-names>
</name>
</person-group>. <article-title>On the Canny edge detector</article-title>. <source>Pattern Recognition</source> (<year>2001</year>) <volume>34</volume>:<fpage>721</fpage>&#x2013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1016/s0031-3203(00)00023-6</pub-id>
</citation>
</ref>
<ref id="B51">
<label>51.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Anguelov</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Erhan</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Szegedy</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Reed</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>C-Y</given-names>
</name>
<etal/>
</person-group> <article-title>Ssd: Single shot multibox detector</article-title>. In: <conf-name>European conference on computer vision</conf-name>; <conf-date>8-16 October 2016</conf-date>; <conf-loc>Amsterdam, The Netherlands</conf-loc> (<year>2016</year>). p. <fpage>21</fpage>&#x2013;<lpage>37</lpage>.</citation>
</ref>
</ref-list>
</back>
</article>