<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2022.1024360</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Inter-row information recognition of maize in the middle and late stages <italic>via</italic> LiDAR supplementary vision</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Li</surname>
<given-names>Zhiqiang</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Xie</surname>
<given-names>Dongbo</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Liu</surname>
<given-names>Lichao</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2054949"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Hai</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1727201"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Chen</surname>
<given-names>Liqing</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1965669"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>College of Engineering, Anhui Agricultural University</institution>, <addr-line>Hefei</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Anhui Intelligent Agricultural Machinery Equipment Engineering Laboratory</institution>, <addr-line>Hefei</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Discipline of Engineering and Energy, Murdoch University</institution>, <addr-line>Perth, WA</addr-line>, <country>Australia</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Huajian Liu, University of Adelaide, Australia</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Zhaoyu Zhai, Nanjing Agricultural University, China; Yecheng Lyu, Volvo Car Technology USA, United States; Yanbo Huang, United States Department of Agriculture (USDA), United States; Xiangjun Zou, South China Agricultural University, China</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Liqing Chen, <email xlink:href="mailto:lqchen@ahau.edu.cn">lqchen@ahau.edu.cn</email>
</p>
</fn>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>01</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>1024360</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>08</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>10</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Li, Xie, Liu, Wang and Chen</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Li, Xie, Liu, Wang and Chen</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In the middle and late stages of maize, light is limited and non-maize obstacles exist. When a plant protection robot uses the traditional visual navigation method to obtain navigation information, some information will be missing. Therefore, this paper proposed a method using LiDAR (laser imaging, detection and ranging) point cloud data to supplement machine vision data for recognizing inter-row information in the middle and late stages of maize. Firstly, we improved the YOLOv5 (You Only Look Once, version 5) algorithm based on the characteristics of the actual maize inter-row environment in the middle and late stages by introducing MobileNetv2 and ECANet. Compared with YOLOv5, the frame rate of the improved YOLOv5 (Im-YOLOv5) increased by 17.91% and the weight size decreased by 55.56%, while the average accuracy was reduced by only 0.35%, improving the detection performance and shortening the model inference time. Secondly, we identified obstacles (such as stones and clods) between the rows using the LiDAR point cloud data to obtain auxiliary navigation information. Thirdly, the auxiliary navigation information was used to supplement the visual information, which not only improved the recognition accuracy of the inter-row navigation information in the middle and late stages of maize but also provided a basis for the stable and efficient operation of the inter-row plant protection robot in these stages. The experimental results from a data acquisition robot equipped with a camera and a LiDAR sensor are presented to show the efficacy and remarkable performance of the proposed method.</p>
</abstract>
<kwd-group>
<kwd>inter-row information recognition</kwd>
<kwd>point cloud</kwd>
<kwd>maize plant protection</kwd>
<kwd>lidar</kwd>
<kwd>machine vision</kwd>
</kwd-group>
<contract-sponsor id="cn001">National Natural Science Foundation of China<named-content content-type="fundref-id">10.13039/501100001809</named-content>
</contract-sponsor>
<counts>
<fig-count count="11"/>
<table-count count="3"/>
<equation-count count="19"/>
<ref-count count="42"/>
<page-count count="14"/>
<word-count count="6888"/>
</counts>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<title>1 Introduction</title>
<p>Maize is one of the five most productive cereals in the world (the other four being rice, wheat, soybean, and barley) (<xref ref-type="bibr" rid="B21">Patricio and Rieder, 2018</xref>) and is an important source of food and feed. In recent years, with the rapid increase in maize consumption, an efficient and intelligent maize production process has been required to increase productivity (<xref ref-type="bibr" rid="B25">Tang et&#xa0;al., 2018</xref>; <xref ref-type="bibr" rid="B37">Yang et&#xa0;al., 2022a</xref>). Inter-row navigation is key to realizing intelligent maize planting. Pest control in the middle and late stages of maize determines the crop yield and quality. A small autonomous navigation plant protection robot is a good solution for plant protection in the middle and late stages of maize development (<xref ref-type="bibr" rid="B16">Li et&#xa0;al., 2019</xref>). However, in these stages, the high plant height (<xref ref-type="bibr" rid="B7">Chen et&#xa0;al., 2018</xref>), insufficient light, and several non-maize obstacles lead to a typical high-occlusion environment (<xref ref-type="bibr" rid="B11">Hiremath et&#xa0;al., 2014</xref>; <xref ref-type="bibr" rid="B38">Yang et&#xa0;al., 2022b</xref>). Commonly used navigation systems such as GPS (Global Positioning System) and BDS (BeiDou Navigation Satellite System) have shown poor signal quality in a high-occlusion environment (<xref ref-type="bibr" rid="B8">Gai et&#xa0;al., 2021</xref>); therefore, accurately obtaining navigation information between rows in the middle and late stages of maize has become the key issue in realizing the autonomous navigation of plant protection robots. 
At present, machine vision is the mainstream method used to obtain inter-row navigation information in a high-occlusion environment (<xref ref-type="bibr" rid="B22">Radcliffe et&#xa0;al., 2018</xref>); that is, an RGB (red, green, and blue) camera acquires images of the maize stems, identifies them through a trained model, and obtains their position information to plan the navigation path. A convolutional neural network was used to train a robot to recognize the characteristics of maize stalks at the early growth stage and was implemented on a machine vision-based inter-row information collection robot (<xref ref-type="bibr" rid="B9">Gu et&#xa0;al., 2020</xref>). Tang et&#xa0;al. reported the application and research progress of harvesting robots and vision technology in fruit picking (<xref ref-type="bibr" rid="B24">Tang et&#xa0;al., 2020</xref>). Machine vision technology was applied for the multi-target recognition of bananas and the automatic positioning of the inflorescence axis cutting point (<xref ref-type="bibr" rid="B31">Wu et&#xa0;al., 2021</xref>); in addition, an improved YOLOv4 (You Only Look Once, version 4) micromodel and binocular stereo vision technology were applied for fruit detection and location (<xref ref-type="bibr" rid="B29">Wang et&#xa0;al., 2022</xref>; <xref ref-type="bibr" rid="B26">Tang et&#xa0;al., 2023</xref>). Zhang et&#xa0;al. proposed an inter-row information recognition algorithm for an intelligent agricultural robot based on binocular vision, in which effective inter-row navigation information was extracted by fusing the edge contour and height information of crop rows in the image (<xref ref-type="bibr" rid="B41">Zhang et&#xa0;al., 2020</xref>). By setting a region of interest, Yang et&#xa0;al. 
used machine vision to accurately identify the crop lines between rows in the early growth stage of maize and extracted the navigation path of the plant protection robot in real time (<xref ref-type="bibr" rid="B37">Yang et&#xa0;al., 2022a</xref>). However, the inter-row environment in the middle and late stages of maize is a typical high-occlusion environment, with higher plant height and dense branches and leaves that seriously block light (<xref ref-type="bibr" rid="B17">Liu et&#xa0;al., 2016</xref>; <xref ref-type="bibr" rid="B32">Xie et&#xa0;al., 2019</xref>). When the ambient light intensity is weak, information loss will occur when using machine vision to obtain inter-row navigation information (<xref ref-type="bibr" rid="B4">Chen et&#xa0;al., 2011</xref>). Moreover, machine vision usually takes a single feature of maize as the basis for information acquisition; recognizing multiple features at the same time would greatly reduce the recognition speed and thus the real-time performance of agricultural robots. Considering the non-maize obstacles (such as soil, bricks, and branches) present in the middle and late stages of maize, it is therefore quite difficult to obtain all the inter-row information using only a single feature.</p>
<p>Since LiDAR (laser imaging, detection and ranging) can obtain accurate point cloud data of objects according to the echo detection principle (<xref ref-type="bibr" rid="B23">Reiser et&#xa0;al., 2018</xref>; <xref ref-type="bibr" rid="B30">Wang et&#xa0;al., 2018</xref>; <xref ref-type="bibr" rid="B12">Jafari Malekabadi et&#xa0;al., 2019</xref>) and is less affected by light (<xref ref-type="bibr" rid="B27">Wang et&#xa0;al., 2022a</xref>; <xref ref-type="bibr" rid="B28">Wang et&#xa0;al., 2022b</xref>), it can supplement the missing information caused by the use of machine vision (<xref ref-type="bibr" rid="B13">Jeong et&#xa0;al., 2018</xref>; <xref ref-type="bibr" rid="B1">Aguiar et&#xa0;al., 2021</xref>). In order to solve the issue of information loss when a vision sensor was used to obtain information, a method using LiDAR supplement vision was proposed (<xref ref-type="bibr" rid="B2">Bae et&#xa0;al., 2021</xref>), which pooled the strength of each sensor and made up for the shortcomings of using a single sensor. Through the complementary process between vision and LiDAR (<xref ref-type="bibr" rid="B19">Morales et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B20">Mutz et&#xa0;al., 2021</xref>), the performance of adaptive cruise control was significantly improved. In addition, a complementary method combining vision and LiDAR was developed in order to further improve the accuracy of unmanned aerial vehicle (UAV) navigation (<xref ref-type="bibr" rid="B39">Yu et&#xa0;al., 2021</xref>). Liu et&#xa0;al. proposed a new structure of LiDAR supplement vision in an end-to-end semantic segmentation network, which can effectively improve the performance of automatic driving (<xref ref-type="bibr" rid="B18">Liu et&#xa0;al., 2020</xref>). 
The above methods had good application effects in the field of autonomous driving (<xref ref-type="bibr" rid="B5">Chen et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B36">Yang et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B40">Zhang et&#xa0;al., 2021</xref>). Based on the above research, we believe that LiDAR supplement vision is an interesting and effective method for obtaining inter-row information in the middle and late stages of maize development.</p>
<p>Therefore, this paper proposed a method of using LiDAR point cloud data to supplement machine vision data for obtaining inter-row information in the middle and late stages of maize. We took the location of maize plants as the main navigation information and proposed an improved YOLOv5 (Im-YOLOv5) algorithm (<xref ref-type="bibr" rid="B15">Jubayer et&#xa0;al., 2021</xref>, p. 5) to identify maize plants and obtain the main navigation information. At the same time, we took the locations of stones, clods, and other obstacles as auxiliary navigation information, which were obtained through LiDAR. Through the mutual supplementation of vision and LiDAR, the accuracy of inter-row navigation information acquisition in the middle and late stages of maize can be improved. The proposed method provides a new and effective way of obtaining navigation information between rows in the middle and late stages of maize under the condition of equal-height occlusion.</p>
<p>The contributions of this article are summarized as follows:</p>
<list list-type="order">
<list-item>
<p>A method of inter-row information recognition with a LiDAR supplement camera is proposed.</p>
</list-item>
<list-item>
<p>An Im-YOLOv5 model with efficient channel attention (ECA) and lightweight backbone network is established.</p>
</list-item>
<list-item>
<p>Auxiliary navigation information acquisition using LiDAR can reduce the loss of information.</p>
</list-item>
<list-item>
<p>The proposed method was tested and analyzed using a data acquisition robot.</p>
</list-item>
</list>
</sec>
<sec id="s2" sec-type="materials|methods">
<title>2 Methods and materials</title>
<sec id="s2_1">
<title>2.1 Composition of the test platform</title>
<p>The experimental platform and data acquisition system are shown in <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref>.&#xa0;A personal computer (PC) was used as the upper computer to collect the LiDAR and camera signals. The LiDAR model was VLP-16, with a scanning distance of 100&#xa0;m, a horizontal scanning angle of 270&#xb0;, and a vertical scanning angle of &#xb1;15&#xb0;. The camera model was NPX-GS650, with a resolution of 640&#xa0;&#xd7;&#xa0;480 pixels and a frame rate of 790&#xa0;fps.</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>Data acquisition robot. <italic>PC</italic>, personal computer.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g001.tif"/>
</fig>
</sec>
<sec id="s2_2">
<title>2.2 Commercialization feasibility analysis</title>
<p>The data acquisition platform used in the test cost 490 RMB. The plant protection operation can be carried out by installing a pesticide applicator at a later stage, at a cost of about 100 RMB. The camera sensor cost about 100 RMB, and the LiDAR sensor about 5,000 RMB. Consequently, the cost of the VLP-16 LiDAR represented a key issue affecting the commercialization of this recognition system. Therefore, our recognition system was applied to small autonomous navigation plant protection robots: being relatively low-cost, such robots retain a price advantage over UAVs even with this relatively high-precision recognition system installed.</p>
</sec>
</sec>
<sec id="s3">
<title>3 Joint calibration of camera and LiDAR</title>
<p>In this paper, a monocular camera and VLP-16 LiDAR were used as the information fusion sensors. When the monocular camera and the LiDAR detect the same target, despite the range and angle information being the same, the detection results of the two sensors belong to different coordinate systems (<xref ref-type="bibr" rid="B3">Chen et&#xa0;al., 2021a</xref>). Therefore, in order to effectively realize the information supplementation of LiDAR to the camera, the coordinate system must be unified; that is, the detection results of the two sensors should be input into the same coordinate system and the relative pose between them should be calibrated at the same time so as to realize the data matching and correspondence between these two sensors.</p>
<p>It should be noted that the main task of the monocular camera calibration was to solve its extrinsic parameter matrix and intrinsic parameters. In this paper, the chessboard calibration method was used (<xref ref-type="bibr" rid="B35">Xu et&#xa0;al., 2022</xref>), with the chessboard size being 400&#xa0;mm&#xa0;&#xd7;&#xa0;550&#xa0;mm and the grid size being 50&#xa0;mm&#xa0;&#xd7;&#xa0;50&#xa0;mm. We randomly took 21 chessboard pictures at different positions. The camera calibration error was less than 0.35&#xa0;pixels and the overall mean error was 0.19&#xa0;pixels, which means that the error met the calibration accuracy requirement and that the calibration result has practical value (<xref ref-type="bibr" rid="B34">Xu et&#xa0;al., 2021</xref>). The internal parameters of the camera were as follows: focal length (<italic>f</italic>) = 25&#xa0;mm, radial distortion parameter (<italic>k</italic>
<sub>1</sub>) = 0.012&#xa0;mm, radial distortion parameter (<italic>k</italic>
<sub>2</sub>) = 0.009&#xa0;mm, tangential distortion parameter (<italic>p</italic>
<sub>1</sub>) = &#x2212;0.0838&#xa0;mm, tangential distortion parameter (<italic>p</italic>
<sub>2</sub>) = 0.1514&#xa0;mm, image center (<italic>u</italic>
<sub>0</sub>) = 972&#xa0;mm, image center (<italic>v</italic>
<sub>0</sub>)&#xa0;=&#xa0;1,296&#xa0;mm, normalized focal length (<italic>f<sub>x</sub>
</italic> = <italic>f/dx</italic>)&#xa0;=&#xa0;1,350.3&#xa0;mm, and normalized focal length (<italic>f<sub>y</sub>
</italic> = <italic>f/dy</italic>)&#xa0;=&#xa0;2,700.8&#xa0;mm. On the basis of camera calibration, we carried out the joint calibration of the camera and LiDAR. The calibration principle is shown in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2A</bold>
</xref>. By matching the corner information of the chessboard picture taken by the camera to the corner information of the chessboard point cloud data obtained by LiDAR, a rigid transformation matrix from the point cloud data to the image can be obtained. During calibration, the camera and LiDAR were fixed on the data acquisition robot platform developed by the research group. After the joint calibration, the relative positions of the camera and LiDAR were saved and fixed. The calibration error is shown in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2B</bold>
</xref>. As indicated in <xref ref-type="bibr" rid="B1">Aguiar et&#xa0;al. (2021)</xref>, the calibration error met the calibration accuracy requirement, and the calibration result showed practical value. Through joint calibration, the rotation matrix and translation vector that project the point cloud onto the image were obtained, as given in Equations (1) and (2).</p>
<disp-formula>
<label>(1)</label>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi mathvariant="bold">lidar</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.9998</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.0032</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.0179</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.0176</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.0807</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.9966</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.0047</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.9967</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.0806</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(2)</label>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>lidar</mml:mi>
</mml:mstyle>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.0468</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mn>0.1139</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr columnalign="center">
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.2667</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>Camera&#x2013;LiDAR (laser imaging, detection and ranging) joint calibration process. <bold>(A)</bold> Principle of joint calibration. <bold>(B)</bold> Joint calibration error. By matching the corner information of the chessboard picture taken by the camera to the corner information of the chessboard point cloud data obtained by LiDAR, the rigid transformation matrix from the point cloud data to the image can be obtained.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g002.tif"/>
</fig>
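As a concrete illustration of the projection just described, the rigid transform of Equations (1) and (2) and the pinhole intrinsics from the camera calibration can be applied to LiDAR points as follows. This is a minimal NumPy sketch written for this article: the helper name <monospace>project_to_image</monospace> is ours, the row/column layout assumed for Equation (1) is noted in the code, and lens distortion is ignored.

```python
import numpy as np

# Calibration results from the joint calibration above.
# NOTE: treating the three number triples of Equation (1) as matrix columns
# is a layout assumption made here; use R.T if the convention differs.
R = np.array([[0.9998,  0.0176,  0.0047],
              [-0.0032, -0.0807, -0.9967],
              [0.0179,  -0.9966,  0.0806]])
T = np.array([0.0468, 0.1139, -0.2667])

# Pinhole intrinsics from the camera calibration above (distortion ignored).
fx, fy = 1350.3, 2700.8
u0, v0 = 972.0, 1296.0

def project_to_image(points_lidar):
    """Project Nx3 LiDAR points into pixel coordinates (hypothetical helper)."""
    p_cam = points_lidar @ R.T + T      # rigid transform into the camera frame
    p = p_cam[p_cam[:, 2] > 0]          # keep points in front of the image plane
    u = fx * p[:, 0] / p[:, 2] + u0     # perspective projection
    v = fy * p[:, 1] / p[:, 2] + v0
    return np.stack([u, v], axis=1)

# Example: one LiDAR point two metres away (axis direction per the assumed mounting)
px = project_to_image(np.array([[0.0, -2.0, 0.0]]))
```

A real pipeline would additionally apply the distortion parameters before projection and discard pixels falling outside the image bounds.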
</sec>
<sec id="s4">
<title>4 Navigation information acquisition based on LiDAR supplement vision</title>
<p>As mentioned in Section 1, machine vision usually takes a single feature of the plant as the basis of recognition. In this paper, the maize stem about 10&#xa0;cm above the ground surface was taken as the machine vision recognition feature. It should be noted that taking the maize stem as the identification feature will cause a lack of information on other non-maize obstacles (such as stones and clods). In order to solve the issue of missing information when using machine vision to acquire navigation information, this paper proposed a method of inter-row navigation information acquisition in the middle and late stages of maize based on LiDAR supplement vision. The detailed principle is shown in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref>. The machine vision datasets were trained using the Im-YOLOv5 algorithm to identify the stem of the maize and, subsequently, to obtain the main navigation information. The point cloud data of the inter-row environment in the middle and late stages of maize were obtained using LiDAR to gather auxiliary navigation information. It should be noted that the method proposed in this paper obtained inter-row information through a LiDAR-assisted camera; therefore, spatial data fusion was used. After establishing the precise coordinate conversion relationships among the LiDAR coordinate system, the three-dimensional world coordinate system, the camera coordinate system, the image coordinate system, and the pixel coordinate system, the spatial position information of the obstacles in the point cloud data can be matched to the visual image.</p>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>Principle of navigation information acquisition based on LiDAR (laser imaging, detection, and ranging) supplement camera. The machine vision datasets were trained using the improved YOLOv5 (Im-YOLOv5) algorithm to identify the stem of the maize and then obtain the main navigation information, while LiDAR was used to obtain auxiliary navigation information.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g003.tif"/>
</fig>
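The supplementation step of the scheme above can be sketched as a simple merge of the two information streams: stem positions detected by the camera and obstacle positions projected from the LiDAR point cloud into the same image frame. Everything in this sketch (the <monospace>InterRowItem</monospace> structure, the <monospace>supplement</monospace> helper, and the coordinates) is a hypothetical illustration written for this article, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class InterRowItem:
    kind: str                       # "stem" (from the camera) or "obstacle" (from LiDAR)
    position: Tuple[float, float]   # pixel coordinates in the unified image frame

def supplement(stems, obstacles_px) -> List[InterRowItem]:
    """Merge main (vision) and auxiliary (LiDAR) navigation information."""
    items = [InterRowItem("stem", tuple(s)) for s in stems]
    items += [InterRowItem("obstacle", tuple(o)) for o in obstacles_px]
    return items

# One detected stem plus one LiDAR obstacle already projected to pixels
nav = supplement([(320.0, 410.0)], [(512.5, 430.2)])
```

The resulting list gives the path planner a single view of both maize plants and non-maize obstacles between the rows.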
<sec id="s4_1">
<title>4.1 Main navigation information acquisition with the improved YOLOv5</title>
<p>YOLO models offer real-time detection speed but require a powerful GPU (graphics processing unit) and a large amount of memory for training, limiting their use on most computers. The large size of the model after training can also increase the hardware requirements on mobile devices. Ideally, a detection model would meet the requirements of detection accuracy and real-time detection speed for maize stems without high hardware requirements. The YOLOv5 model is a lightweight version of YOLO: it has fewer layers and a faster detection speed, can be used on portable devices, and requires fewer GPU resources for training (<xref ref-type="bibr" rid="B26">Tang et&#xa0;al., 2023</xref>). Therefore, the goal of this work was to build on the YOLOv5 model and apply the improved model to the detection of maize stems. The main idea for improving YOLOv5 was to lighten its backbone network through MobileNetv2 and to introduce the ECANet attention mechanism to improve the recognition accuracy and robustness of the model.</p>
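For readers unfamiliar with ECANet, the channel attention it adds can be sketched as follows: global average pooling produces one descriptor per channel, a small 1D convolution runs across the channel descriptors, and a sigmoid re-weights each channel of the feature map. This NumPy sketch, written for this article, uses placeholder (untrained) weights and an illustrative kernel size.

```python
import numpy as np

def eca(feature_map, k=3):
    """Efficient channel attention (ECA) sketch for a (C, H, W) feature map."""
    c = feature_map.shape[0]
    y = feature_map.mean(axis=(1, 2))       # global average pooling -> (C,)
    w = np.ones(k) / k                      # placeholder 1D conv weights (untrained)
    y_pad = np.pad(y, k // 2, mode="edge")  # pad so the conv keeps length C
    conv = np.array([np.dot(y_pad[i:i + k], w) for i in range(c)])
    att = 1.0 / (1.0 + np.exp(-conv))       # sigmoid -> per-channel weights in (0, 1)
    return feature_map * att[:, None, None]

# Constant input makes the re-weighting easy to follow
out = eca(np.full((8, 4, 4), 0.5))
```

In Im-YOLOv5 the same mechanism is inserted as a trainable module; the point of the sketch is only the pooling, cross-channel convolution, and sigmoid scaling.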
<sec id="s4_1_1">
<title>4.1.1 Lightweight backbone network</title>
<p>This paper used MobileNetv2 (<xref ref-type="bibr" rid="B42">Zhou et&#xa0;al., 2020</xref>) to replace the backbone network of YOLOv5 for the effective extraction of maize stem features. In order to enhance the adaptability of the network to the task of recognizing maize stem features and to extract these features fully, a progressive classifier was designed in this paper to enhance the network&#x2019;s ability to recognize maize stems. The original MobileNetV2 network was primarily designed to handle more than 1,000 target classes on the ImageNet dataset, while this paper only targeted maize stems. Therefore, in order to better extract the characteristics of maize stems and improve the network&#x2019;s recognition of them, we redesigned the classifier of the network to include two convolution layers, one global pooling layer, and one output layer (a convolution layer).</p>
<p>The main task of the classifier was to efficiently convert the extracted maize stem features into specific classification results. As shown in <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref>, two convolution kernels with different scales were selected to replace the single convolution kernel in the original classifier in order to perform the compression and conversion of the feature map. The first convolution kernel, of size 1&#xa0;&#xd7;&#xa0;1, was mainly responsible for compressing the number of channels of the feature map. In order to avoid the loss of a large number of useful features caused by a large compression ratio, the second convolution was mainly used to compress the size of the feature map, which also avoided fluctuations in the subsequent global pooling over a large feature map. Compared with the original YOLOv5 network, the number of parameters of the MobileNetv2-based Im-YOLOv5 network decreased from 64,040,001 to 39,062,013, a reduction of about 39%.</p>
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>MobileNetv2 network structure.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g004.tif"/>
</fig>
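The redesigned classifier described above (a channel-compressing 1&#xa0;&#xd7;&#xa0;1 convolution, a size-compressing convolution, global pooling, and a convolutional output layer) can be sketched at the shape level as follows. The weights, layer widths, and two-class output used here are illustrative assumptions made for this article, not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, out_ch):
    """1x1 convolution: a per-pixel linear map over channels (random weights)."""
    w = rng.standard_normal((out_ch, x.shape[0])) * 0.1
    return np.tensordot(w, x, axes=([1], [0]))  # (out_ch, H, W)

def compress_size(x, stride=2):
    """Stride-2 spatial compression (2x2 block averaging as a stand-in)."""
    c, h, w = x.shape
    return x[:, :h - h % stride, :w - w % stride] \
        .reshape(c, h // stride, stride, w // stride, stride).mean(axis=(2, 4))

x = rng.standard_normal((320, 8, 8))     # feature map from the backbone (sizes illustrative)
x = conv1x1(x, 128)                      # first layer: channel-number compression
x = compress_size(x)                     # second layer: feature-map size compression
x = x.mean(axis=(1, 2), keepdims=True)   # global pooling -> (128, 1, 1)
logits = conv1x1(x, 2).ravel()           # output layer: e.g. stem vs. background
```

Splitting compression across two layers, as the text explains, keeps any single compression ratio moderate and stabilizes the subsequent global pooling.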
<p>At the same time, Im-YOLOv5 used CIOU_Loss [complete intersection over union (IOU) loss] to replace GIOU_Loss (generalized IOU loss) as the loss function of the bounding box and used the binary cross-entropy with logits loss function to calculate the loss of class probability and target score, defined as follows.</p>
<disp-formula>
<label>(3)</label>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>G</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>O</mml:mi>
<mml:mi>U</mml:mi>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>I</mml:mi>
<mml:mi>O</mml:mi>
<mml:mi>U</mml:mi>
</mml:mstyle>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>A</mml:mi>
<mml:msup>
<mml:mo>&#x222a;</mml:mo>
<mml:mo>&#x200b;</mml:mo>
</mml:msup>
<mml:mi>B</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(4)</label>
<mml:math display="block" id="M4">
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>I</mml:mi>
<mml:mi>O</mml:mi>
<mml:mi>U</mml:mi>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:msup>
<mml:mo>&#x2229;</mml:mo>
<mml:mo>&#x200b;</mml:mo>
</mml:msup>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:msup>
<mml:mo>&#x222a;</mml:mo>
<mml:mo>&#x200b;</mml:mo>
</mml:msup>
<mml:mi>B</mml:mi>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(5)</label>
<mml:math display="block" id="M5">
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>C</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>O</mml:mi>
<mml:mi>U</mml:mi>
</mml:mstyle>
<mml:mo>=</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>I</mml:mi>
<mml:mi>O</mml:mi>
<mml:mi>U</mml:mi>
</mml:mstyle>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mi>&#x3c1;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>b</mml:mi>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>&#x3bd;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>In Equations (3) and (4), <italic>A</italic> and <italic>B</italic> are the prediction box and the ground-truth box, respectively; IOU is the intersection-over-union ratio of the prediction box and the ground-truth box; and <italic>C</italic> is the minimum circumscribed rectangle of the prediction box and the target box. However, because Equations (3) and (4) consider only the overlap between the prediction box and the target box, they cannot describe the box regression problem well. When prediction boxes of the same size lie inside the target box, GIOU degenerates into IOU and cannot distinguish the position of the prediction box within the target box, resulting in false detections and missed detections. Equation (5) is the calculation formula of CIOU, where <italic>a</italic> = <italic>v</italic>/((1 &#x2013; IOU) + <italic>v</italic>) is an equilibrium parameter that does not participate in the gradient calculation; <italic>v</italic> = (4/&#x3c0;<sup>2</sup>)(arctan (<italic>W<sup>gt</sup>/H<sup>gt</sup></italic>) &#x2013; arctan (<italic>W/H</italic>))<sup>2</sup> is a parameter that measures the consistency of the aspect ratio; <italic>b</italic> is the prediction box; <italic>b<sup>gt</sup></italic> is the ground-truth box; <italic>&#x3c1;</italic> is the Euclidean distance between the box centers; and <italic>c</italic> is the diagonal length of the minimum bounding box. It can be seen from Equation (5) that CIOU comprehensively considers the overlapping area, center-point distance, and aspect ratio of the target and prediction boxes, remedying the shortcoming of the GIOU loss function and making the regression of the target box more stable, with faster convergence and higher accuracy.</p>
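<p>As a concrete illustration of Equations (3)&#x2013;(5), the following is a minimal Python sketch of the IOU and CIOU computations for axis-aligned boxes; the function names are ours, not from the paper's code.</p>

```python
import math

def iou(a, b):
    # Boxes as (x1, y1, x2, y2): intersection over union, Equation (4).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def ciou(a, b):
    # CIOU = IOU - rho^2(b, b_gt) / c^2 - a * v, Equation (5).
    i = iou(a, b)
    # Squared diagonal c^2 of the minimum enclosing box.
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    # Squared Euclidean distance rho^2 between the two box centers.
    rho2 = (((a[0] + a[2]) - (b[0] + b[2])) ** 2
            + ((a[1] + a[3]) - (b[1] + b[3])) ** 2) / 4.0
    # Aspect-ratio consistency v and equilibrium parameter a (alpha).
    wa, ha = a[2] - a[0], a[3] - a[1]
    wb, hb = b[2] - b[0], b[3] - b[1]
    v = 4.0 / math.pi ** 2 * (math.atan(wb / hb) - math.atan(wa / ha)) ** 2
    alpha = v / ((1.0 - i) + v + 1e-9)
    return i - rho2 / c2 - alpha * v
```

<p>For two identical boxes, all penalty terms vanish and CIOU equals 1.</p>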
</sec>
<sec id="s4_1_2">
<title>4.1.2 Introducing the attention mechanism</title>
<p>In order to improve the recognition accuracy and robustness of the algorithm when maize stems are numerous and mutually occluding, efficient channel attention (ECA) was introduced (<xref ref-type="bibr" rid="B33">Xue et&#xa0;al., 2022</xref>). It should be noted that, although introducing ECANet into convolutional neural networks has yielded clear performance improvements, ECANet considers only the local dependence between the current channel of the feature map and several adjacent channels, inevitably losing the global dependence between the current channel and distant channels. On the basis of ECANet, we added a new branch (shown in the dashed box in <xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5</bold>
</xref>) that applies channel-level global average pooling followed by a channel-shuffle (disruption) operation. This branch randomly rearranges the channel order of the pooled feature map, so channels that were far apart before shuffling may become adjacent. After obtaining the local dependencies between the current channel of the shuffled feature map and its new <italic>k</italic> adjacent channels, weighting the two branches yields richer interaction information between channels.</p>
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>ECANet channel attention.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g005.tif"/>
</fig>
<p>In this paper, suppose that the feature vector of the input feature after convolution is <italic>x &#x3f5; R<sup>W&#xd7;H&#xd7;C</sup>
</italic>, where <italic>W</italic>, <italic>H</italic>, and <italic>C</italic> respectively represent the width, height, and channel size of the feature vector. The global average pooling of the channel dimension can be expressed as:</p>
<disp-formula>
<label>(6)</label>
<mml:math display="block" id="M6">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>g</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Then, in ECANet, the feature vector inputs by the two branches can be expressed as:</p>
<disp-formula>
<label>(7)</label>
<mml:math display="block" id="M7">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>g</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>S</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(8)</label>
<mml:math display="block" id="M8">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mi>g</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>g</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>ys</italic> represents the vector produced by the branch that performs global average pooling followed by channel shuffling; <italic>yg</italic> represents the vector produced by the plain global-average-pooling branch; and <italic>S</italic> is the channel-shuffle (disruption) operation. Given that the feature vector without dimension reduction is <italic>y &#x3f5; R<sup>C</sup>
</italic>, the inter-channel weight calculation using the channel attention module can be expressed as:</p>
<disp-formula>
<label>(9)</label>
<mml:math display="block" id="M9">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>&#x3c3;</italic>(<italic>x</italic>) = 1/(1+<italic>e<sup>-x</sup>
</italic>) is the sigmoid activation function and <italic>W<sub>k</sub>
</italic> is the parameter matrix for calculating channel attention using ECANet.</p>
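<p>The two-branch attention of Equations (6)&#x2013;(9) can be sketched as follows. This is an illustrative NumPy approximation: the averaging kernel stands in for the learned 1D convolution weights <italic>W<sub>k</sub></italic>, and the equal weighting of the two branches is our assumption, not a detail given in the paper.</p>

```python
import numpy as np

def sigmoid(z):
    # Equation (9): sigma(x) = 1 / (1 + e^-x).
    return 1.0 / (1.0 + np.exp(-z))

def eca_with_shuffle(x, k=3, rng=None):
    # x: feature map of shape (W, H, C); returns channel weights of shape (C,).
    rng = np.random.default_rng(rng)
    C = x.shape[2]
    # Equations (6)/(8): channel-wise global average pooling, y_g = g(x).
    y_g = x.mean(axis=(0, 1))
    # Equation (7): shuffle channel order so distant channels become neighbors.
    perm = rng.permutation(C)
    y_s = y_g[perm]
    kernel = np.ones(k) / k          # stand-in for the learned 1D conv W_k
    pad = k // 2
    def local_mix(y):
        # Mix each channel with its k neighbors (circular padding).
        yp = np.pad(y, pad, mode="wrap")
        return np.convolve(yp, kernel, mode="valid")
    w_g = local_mix(y_g)
    w_s = np.empty(C)
    w_s[perm] = local_mix(y_s)       # undo the shuffle so weights align
    # Weight the two branches (equal weighting assumed here), then sigmoid.
    return sigmoid(0.5 * (w_g + w_s))
```

<p>For a constant feature map, every pooled value equals 1, so every channel weight equals &#x3c3;(1) regardless of the shuffle.</p>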
<p>We took MobileNetv2 (<xref ref-type="bibr" rid="B42">Zhou et&#xa0;al., 2020</xref>) as the backbone model, combined YOLOv5 with the SeNet and ECANet modules (<xref ref-type="bibr" rid="B10">Hassanin et&#xa0;al., 2022</xref>), and carried out maize stem recognition experiments. The test results are shown in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>. ECANet showed better performance compared to SeNet, indicating that ECANet can improve the performance of YOLOv5 at a lower computational cost. ECANet was also more competitive than SeNet, with lower model complexity.</p>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Comparison of the recognition performance (in percent) of the YOLOv5 model integrating different attention modules.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="left">Method</th>
<th valign="top" align="center">
<italic>P</italic> (%)
</th>
<th valign="top" align="center">
<italic>R</italic> (%)</th>
<th valign="top" align="center">FPS</th>
<th valign="top" align="center">F<sub>1</sub> (%)</th>
<th valign="top" align="center">mAP (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">MobileNetv2</td>
<td valign="top" align="center">89.5</td>
<td valign="top" align="center">94.1</td>
<td valign="top" align="center">34.2</td>
<td valign="top" align="center">93.7</td>
<td valign="top" align="center">91.37</td>
</tr>
<tr>
<td valign="top" align="left">MobileNetv2+SeNet</td>
<td valign="top" align="center">93.2</td>
<td valign="top" align="center">91.4</td>
<td valign="top" align="center">63.7</td>
<td valign="top" align="center">91.2</td>
<td valign="top" align="center">97.25</td>
</tr>
<tr>
<td valign="top" align="left">MobileNetv2+ECANet</td>
<td valign="top" align="center">96.7</td>
<td valign="top" align="center">82.3</td>
<td valign="top" align="center">79.6</td>
<td valign="top" align="center">86.3</td>
<td valign="top" align="center">96.98</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>P, precision; R, recall; F<sub>1</sub>, harmonic mean of precision and recall; FPS, frame rate; mAP, mean average precision.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>In this work, the ECANet attention mechanism was first placed in the enhanced feature extraction network, with attention added to the three effective feature layers extracted from the backbone network. To address information attenuation, the aliasing effect of cross-scale fusion, and the inherent channel-reduction defects of the feature pyramid network (FPN) in YOLOv5, we also added the ECANet attention mechanism to the FPN upsampling results in order to reduce information loss and optimize feature fusion at each layer. By introducing the ECANet attention mechanism, Im-YOLOv5 can better fit the relevant feature information between the target channels, suppress useless information, and focus the model on the maize stem category, improving its detection performance. The specific structure of the Im-YOLOv5 algorithm is shown in <xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6</bold>
</xref>.</p>
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>Improved YOLOv5 (Im-YOLOv5) architecture.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g006.tif"/>
</fig>
</sec>
</sec>
<sec id="s4_2">
<title>4.2 Auxiliary navigation information acquisition by LiDAR</title>
<p>Because of the obvious color and structural characteristics of maize stems, we trained the Im-YOLOv5 model to detect only maize stems when the main navigation information was obtained through machine vision. However, the actual non-maize obstacles were mainly soil blocks and stones, whose color and shape characteristics are relatively close to those of the ground, which would have greatly increased the difficulty of Im-YOLOv5 model training. At the same time, recognizing multiple features simultaneously by machine vision would also reduce the recognition speed to a certain extent. Under these conditions, it is necessary to obtain point cloud information using LiDAR to supplement machine vision.</p>
<sec id="s4_2_1">
<title>4.2.1 Determination of the effective point cloud range</title>
<p>Since the camera and LiDAR were fixed on the data acquisition robot platform, when the robot travels between rows during data acquisition, it is necessary to determine the effective data range of the LiDAR point cloud according to the shooting angle range of the camera, as shown in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7A</bold>
</xref>.</p>
<fig id="f7" position="float">
<label>Figure&#xa0;7</label>
<caption>
<p>Camera&#x2013;LiDAR (laser imaging, detection, and ranging) joint calibration process. <bold>(A)</bold> Effective data range. <italic>&#x3b8;<sub>e</sub>
</italic> is the camera shooting angle range, <italic>&#x3b8;<sub>i</sub>
</italic> is the scanning angle of LiDAR, and the overlapping area is the effective point cloud range. <bold>(B)</bold> Coordinate transformation. <italic>O<sub>w</sub> - X<sub>w</sub>Y<sub>w</sub>Z<sub>w</sub>
</italic>is the LiDAR coordinate system, <italic>O<sub>c</sub> &#x2013; X<sub>c</sub>Y<sub>c</sub>Z<sub>c</sub>
</italic>is the camera coordinate system, o - <italic>xy</italic>is the image coordinate system, and <italic>O<sub>uv</sub> &#x2013; uv</italic> is the pixel coordinate system. <bold>(C)</bold> Distortion error. <italic>dr</italic> and d&#x3c4; are the radial distortion and the tangential distortion of the camera, respectively.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g007.tif"/>
</fig>
<p>Note that, in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7A</bold>
</xref>, <italic>&#x3b8;<sub>e</sub>
</italic> is the camera shooting angle range, <italic>&#x3b8;<sub>i</sub></italic> is the scanning angle of LiDAR, and <italic>d</italic> is the width of the robot. Therefore, the range of the effective point cloud data collected by LiDAR is the sector area, where <italic>r</italic> is the radius of the sector with the angle of <italic>&#x3b8;<sub>e</sub>
</italic> and is defined as:</p>
<disp-formula>
<label>(10)</label>
<mml:math display="block" id="M10">
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mi>&#x3c0;</mml:mi>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>e</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
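<p>Equation (10) can be evaluated directly; a small sketch (the function name is ours), with all angles in radians:</p>

```python
import math

def effective_radius(d, theta_i, theta_e):
    # Equation (10): r = d / cos(pi/2 - theta_i/2 - theta_e), where d is the
    # robot width, theta_i the LiDAR scan angle, theta_e the camera angle.
    return d / math.cos(math.pi / 2 - theta_i / 2 - theta_e)
```

<p>For example, with a full 180&#xb0; scan angle (&#x3b8;<sub>i</sub> = &#x3c0;) and &#x3b8;<sub>e</sub> = 0, the radius reduces to the robot width <italic>d</italic>.</p>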
</sec>
<sec id="s4_2_2">
<title>4.2.2 Coordinate conversion of the auxiliary navigation information</title>
<p>Through the joint calibration of the camera and LiDAR in the above section, the camera external parameter matrix (<italic>R, T</italic>), the camera internal parameter, and the rigid conversion matrix (<italic>R<sub>lidar</sub>
</italic>, <italic>T<sub>lidar</sub>
</italic>), of the camera and LiDAR sensor information were obtained.</p>
<p>In order to supplement the main navigation information with the auxiliary navigation information, it is essential to establish a conversion model between sensors. Through the established transformation model, the points in the world coordinate system scanned by LiDAR were projected into the pixel coordinate system of the camera to realize the supplementation of the point cloud data to the visual information according to the pinhole camera model, as shown in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7B</bold>
</xref>. Note that, in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7B</bold>
</xref>, <italic>P</italic> is the point on the real object, <italic>p</italic> is the imaging point of <italic>P</italic> in the image, (<italic>x, y</italic>) are the coordinates of <italic>p</italic> in the image coordinate system, (<italic>u, v</italic>) are the coordinates of <italic>p</italic> in the pixel coordinate system, and <italic>f</italic> is the focal length of the camera, where <italic>f</italic> = ||<italic>o</italic> &#x2013; <italic>O</italic><sub>c</sub>|| (in millimeters). The corresponding relationship between a point <italic>P</italic>(<italic>X<sub>w</sub>, Y<sub>w</sub>, Z<sub>w</sub>
</italic>) in the real world obtained by LiDAR and the corresponding point <italic>p</italic>(<italic>u, v</italic>) in the camera pixel coordinate system can be expressed as:</p>
<disp-formula>
<label>(11)</label>
<mml:math display="block" id="M11">
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>u</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>v</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>x</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>u</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>v</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>0</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
</mml:mstyle>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msup>
<mml:mn>0</mml:mn>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
</mml:mstyle>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>Z</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mn>1</mml:mn>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
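<p>The projection of Equation (11) can be sketched as follows; the perspective division by the camera-frame depth, implicit in the homogeneous form above, is written out explicitly. The function name and matrix shapes are our assumptions.</p>

```python
import numpy as np

def project_lidar_point(p_w, K, R_lidar, T_lidar):
    # Rigid transform (R_lidar, T_lidar) from the LiDAR frame to the camera frame.
    p_c = R_lidar @ np.asarray(p_w, dtype=float) + T_lidar
    # Apply the intrinsic matrix K = [[fx, 0, u0], [0, fy, v0], [0, 0, 1]].
    uvw = K @ p_c
    # Perspective division yields the pixel coordinates (u, v).
    return uvw[0] / uvw[2], uvw[1] / uvw[2]
```

<p>A point on the optical axis projects onto the principal point (u<sub>0</sub>, v<sub>0</sub>).</p>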
<p>According to the principle of LiDAR scanning, the point cloud data obtained by LiDAR are in the form of polar coordinates. Therefore, the distance and angle information of the point cloud data under polar coordinates were converted into the three-dimensional coordinate point information under the LiDAR ontology coordinate system. The conversion formula was as follows:</p>
<disp-formula>
<label>(12)</label>
<mml:math display="block" id="M12">
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>&#x3c1;</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>s</mml:mi>
</mml:mstyle>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>s</mml:mi>
</mml:mstyle>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>Z</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>&#x3c1;</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>c</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>s</mml:mi>
</mml:mstyle>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mstyle>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>w</mml:mi>
</mml:mstyle>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>&#x3c1;</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mstyle>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>&#x3c1;</italic> is the distance between the scanning point and the LiDAR; <italic>&#x3b1;</italic> is the elevation angle of the scanning line at the scanning point, namely, the angle in the vertical direction; and <italic>&#x3b8;</italic> is the heading angle in the horizontal direction.</p>
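<p>Equation (12) maps each polar return to Cartesian coordinates; a direct sketch (function name ours), with angles in radians:</p>

```python
import math

def polar_to_cartesian(rho, alpha, theta):
    # Equation (12): rho is the range, alpha the vertical (elevation) angle,
    # theta the horizontal heading angle.
    x_w = rho * math.cos(alpha) * math.cos(theta)
    z_w = rho * math.cos(alpha) * math.sin(theta)
    y_w = rho * math.sin(alpha)
    return x_w, y_w, z_w
```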
<p>In order to eliminate the camera imaging distortion error caused by the larger deflection of light away from the lens center and the lens not being completely parallel to the image plane, as shown in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7C</bold>
</xref>, we corrected the distortion in Equation (11) with the following correction formulas (<xref ref-type="bibr" rid="B6">Chen et&#xa0;al., 2021b</xref>):</p>
<p>Radial distortion correction:</p>
<disp-formula>
<label>(13)</label>
<mml:math display="block" id="M13">
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>u</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>4</mml:mn>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>4</mml:mn>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Tangential distortion correction:</p>
<disp-formula>
<label>(14)</label>
<mml:math display="block" id="M14">
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>'</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>u</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mi>v</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi>u</mml:mi>
<mml:msup>
<mml:mo>'</mml:mo>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>v</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>'</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>v</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mi>u</mml:mi>
<mml:mo>'</mml:mo>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi>v</mml:mi>
<mml:msup>
<mml:mo>'</mml:mo>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>k</italic><sub>1</sub> and <italic>k</italic><sub>2</sub> are the radial correction parameters; <italic>p</italic><sub>1</sub> and <italic>p</italic><sub>2</sub> are the tangential correction parameters; <italic>u</italic>&#x2032; and <italic>v</italic>&#x2032; are the radially corrected pixel coordinates; and <italic>u</italic>&#x2032;&#x2032; and <italic>v</italic>&#x2032;&#x2032; are the tangentially corrected pixel coordinates.</p>
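<p>Equations (13) and (14) applied in sequence can be sketched as follows; here <italic>u</italic> and <italic>v</italic> are taken as coordinates relative to the principal point, and r&#xb2; is computed from the uncorrected coordinates, which is one common reading of the formulas above.</p>

```python
def correct_distortion(u, v, k1, k2, p1, p2):
    r2 = u * u + v * v
    # Equation (13): radial correction.
    u1 = u * (1 + k1 * r2 + k2 * r2 * r2)
    v1 = v * (1 + k1 * r2 + k2 * r2 * r2)
    # Equation (14): tangential correction applied to the radially
    # corrected coordinates.
    u2 = u1 + (2 * p1 * v1 + p2 * (r2 + 2 * u1 * u1))
    v2 = v1 + (2 * p2 * u1 + p1 * (r2 + 2 * v1 * v1))
    return u2, v2
```

<p>With all four coefficients zero, the correction reduces to the identity, as expected.</p>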
<p>The corresponding relationship between the point in the world coordinate system obtained by LiDAR and the camera pixel coordinate system is established through Equations (10)&#x2013;(14). According to the established coordinate transformation model, the LiDAR point cloud data can be converted to the image space for the purpose of supplementation between machine vision and LiDAR.</p>
</sec>
<sec id="s4_2_3">
<title>4.2.3 Feature recognition of point cloud based on PointNet</title>
<p>Because of the irregular format of point clouds, feature extraction from them is difficult, a problem addressed by the PointNet model (<xref ref-type="bibr" rid="B14">Jing et&#xa0;al., 2021</xref>). In this paper, the features of non-maize obstacles in the middle and late growth stages of maize were extracted through PointNet, with their location information taken as the output. Note that we also performed the following preprocessing before training the PointNet model. The principle is shown in <xref ref-type="fig" rid="f8">
<bold>Figure&#xa0;8</bold>
</xref>.</p>
<fig id="f8" position="float">
<label>Figure&#xa0;8</label>
<caption>
<p>The principle of auxiliary navigation.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g008.tif"/>
</fig>
<sec id="s4_2_3_1">
<title>4.2.3.1 Ground segmentation</title>
<p>In order to obtain auxiliary navigation information from the LiDAR point cloud data, the ground point cloud must be segmented first. In this work, the RANSAC (random sample consensus) algorithm was adopted to segment the collected point cloud data.</p>
<p>The unique plane can be determined by randomly selecting three non-collinear sample points (<italic>x<sub>a</sub>, x<sub>b</sub>, x<sub>c</sub>
</italic>) in the point cloud.</p>
<disp-formula>
<label>(15)</label>
<mml:math display="block" id="M15">
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xb7;</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(16)</label>
<mml:math display="block" id="M16">
<mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>b</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&#xd7;</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula>
<label>(17)</label>
<mml:math display="block" id="M17">
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#xb7;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>n<sub>i</sub></italic> is the normal vector of the plane model and <italic>d<sub>i</sub></italic> is the intercept (offset) term of the plane model. Then, the squared distance from any sample point <italic>x<sub>i</sub></italic> in the point cloud to the plane model is given by</p>
<disp-formula>
<label>(18)</label>
<mml:math display="block" id="M18">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#xb7;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Let the distance threshold be <italic>T</italic>. When <italic>r<sub>i</sub>&lt;T</italic>, the sample point <italic>x<sub>i</sub></italic> is an interior (inlier) point; otherwise, it is an exterior (outlier) point. Let <italic>N</italic> be the number of interior points, with</p>
<disp-formula>
<label>(19)</label>
<mml:math display="block" id="M19">
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>=</mml:mo>
<mml:mstyle mathvariant="bold" mathsize="normal">
<mml:mi>N</mml:mi>
<mml:mi>U</mml:mi>
<mml:mi>M</mml:mi>
</mml:mstyle>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&lt;</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Note that Equations (15)&#x2013;(19) describe a single calculation pass; the resulting <italic>N</italic> is not necessarily the maximum, so the calculation must be iterated. Let the number of iterations be <italic>k<sub>c</sub>
</italic>. When <italic>N</italic> reaches its maximum value, <italic>N</italic>
<sub>max</sub>, over the iterations, the plane model corresponding to <italic>n</italic>
<sub>best</sub> and <italic>d</italic>
<sub>best</sub> is the best-fitting ground.</p>
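<p>The iterative RANSAC procedure of Equations (15)&#x2013;(19) can be sketched as follows. This is a minimal illustration in Python/NumPy, not the authors' implementation; the function name, parameter defaults, and random sampling strategy are our assumptions.</p>

```python
import numpy as np

def ransac_ground_plane(points, t=1e-4, k_c=200, rng=None):
    """Fit a ground plane to an (m, 3) point cloud with RANSAC.

    Each iteration samples three points, forms the normal n by a cross
    product, sets d = -n . x_a (Eq. 17), computes the squared plane
    distance r_i of every point (Eq. 18), and counts the inliers with
    r_i < t (Eq. 19), keeping the best plane over k_c iterations.
    """
    rng = np.random.default_rng(rng)
    n_best, d_best, n_max = None, None, -1
    for _ in range(k_c):                       # k_c iterations
        a, b, c = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(b - a, c - a)             # plane normal from the sample
        norm = np.linalg.norm(n)
        if norm < 1e-9:                        # skip degenerate (collinear) samples
            continue
        n /= norm                              # unit normal, so distances are metric
        d = -np.dot(n, a)                      # plane offset (Eq. 17)
        r = (points @ n + d) ** 2              # squared distances (Eq. 18)
        n_inliers = int(np.sum(r < t))         # N = NUM(x_i), r_i < T (Eq. 19)
        if n_inliers > n_max:                  # keep n_best, d_best for N_max
            n_max, n_best, d_best = n_inliers, n, d
    return n_best, d_best, n_max
```

<p>On a synthetic flat patch with a few raised outlier points, the recovered normal is close to the vertical axis and the inlier count approaches the number of ground points.</p>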
</sec>
<sec id="s4_2_3_2">
<title>4.2.3.2 Removing noise points caused by maize leaves</title>
<p>LiDAR was mainly used to identify obstacles other than maize leaves. To reduce the difficulty of model training, the point cloud data of maize leaves were deleted. This step relies on analyzing the <italic>z</italic>-coordinate distribution of each point cloud. In general, obstacles such as soil clods and stones are less than 10&#xa0;cm high. Therefore, before training the model, we deleted the points with a <italic>z</italic>-coordinate greater than 10&#xa0;cm within the <italic>&#x3b8;<sub>e</sub>
</italic> range.</p>
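<p>The height-based leaf removal described above amounts to a simple threshold on the <italic>z</italic>-coordinate. A minimal sketch, assuming points are expressed in metres in a ground-aligned frame (the function name is ours):</p>

```python
import numpy as np

def remove_leaf_points(points, z_max=0.10):
    """Drop point-cloud returns higher than z_max metres.

    Soil clods and stones stay below about 10 cm, so returns above
    that height within the scanned sector are treated as maize leaves
    and discarded before model training.
    """
    points = np.asarray(points)
    return points[points[:, 2] <= z_max]
```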
</sec>
</sec>
</sec>
</sec>
<sec id="s5">
<title>5 Experiments and discussions</title>
<p>The focus of this paper was navigation information acquisition. Navigation information can be used for path planning to guide the robot to drive autonomously and as a basis for adjusting the robot's driving state, such as reducing the driving speed when rocks or large clods are detected. Here, we present the results of the information acquisition experiments.</p>
<sec id="s5_1">
<title>5.1 Main navigation information acquisition experiment</title>
<p>We verified the recognition performance of Im-YOLOv5 for the main navigation information from two aspects: model training and detection results. For comparison, we also provide the test results of YOLOv5 and Faster-RCNN (faster region-based convolutional network). The datasets used in the experiment were collected by the Anhui Intelligent Agricultural Machinery Equipment Engineering Laboratory. Note that, so that each model performed at its best on the datasets, we tuned the hyperparameters of each model separately. The initial hyperparameter settings of each algorithm are shown in <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>. The dataset, containing 3,000 images, was divided into training, test, and validation sets at an 8:1:1 ratio.</p>
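<p>The 8:1:1 split can be sketched as follows; this is an illustrative snippet, not the authors' code, and the seed and helper name are assumptions:</p>

```python
import random

def split_dataset(paths, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle image paths and split them 8:1:1 into train/test/validation."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)        # deterministic shuffle
    n_train = int(len(paths) * ratios[0])
    n_test = int(len(paths) * ratios[1])
    train = paths[:n_train]
    test = paths[n_train:n_train + n_test]
    val = paths[n_train + n_test:]            # remainder goes to validation
    return train, test, val
```

<p>For 3,000 images this yields 2,400 training, 300 test, and 300 validation images.</p>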
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Target detection hyperparameter setting.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="left">Parameter</th>
<th valign="top" align="center">Im-YOLOv5</th>
<th valign="top" align="center">YOLOv5</th>
<th valign="top" align="center">Faster-RCNN</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Backbone network</td>
<td valign="top" align="center">MobileNetv2</td>
<td valign="top" align="center">Backbone</td>
<td valign="top" align="center">Resnet50</td>
</tr>
<tr>
<td valign="top" align="left">Training size</td>
<td valign="top" align="center">640&#xa0;&#xd7;&#xa0;640</td>
<td valign="top" align="center">640&#xa0;&#xd7;&#xa0;640</td>
<td valign="top" align="center">640&#xa0;&#xd7;&#xa0;640</td>
</tr>
<tr>
<td valign="top" align="left">Batch size</td>
<td valign="top" align="center">16</td>
<td valign="top" align="center">16</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">No. of categories</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">Initial learning rate</td>
<td valign="top" align="center">1e&#x2212;2</td>
<td valign="top" align="center">1e&#x2212;2</td>
<td valign="top" align="center">1e&#x2212;4</td>
</tr>
<tr>
<td valign="top" align="left">No. of iterations</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">100</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Im-YOLOv5, improved You Only Look Once, version 5; Faster-RCNN, faster region-based convolutional network.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>The model training and validation loss rate curves are shown in <xref ref-type="fig" rid="f9">
<bold>Figure&#xa0;9</bold>
</xref>. The figure shows that the loss rate stabilizes as the number of iterations increases, finally converging to a fixed value, which indicates that training has converged. The tuned models showed good fitting and generalization ability on the maize stem datasets. Note that, because Im-YOLOv5 uses an improved loss function, its initial loss value of about 0.38 was the lowest among the three models and its convergence was the fastest.</p>
<fig id="f9" position="float">
<label>Figure&#xa0;9</label>
<caption>
<p>Model training and validation loss rate curves.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g009.tif"/>
</fig>
<p>The <italic>P</italic> (precision), <italic>R</italic> (recall), <italic>F</italic>
<sub>1</sub> (harmonic mean of precision and recall), FPS (frames per second), and mAP (mean average precision) values for Im-YOLOv5, YOLOv5, and Faster-RCNN are shown in <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>. The table shows that Im-YOLOv5 had the highest precision, followed by YOLOv5, while the precision of Faster-RCNN was low. With its lightweight backbone network, Im-YOLOv5 achieved the highest FPS and a greatly reduced model size. While meeting real-time requirements, it also had the fastest single-image detection speed and the best overall detection performance. Compared with YOLOv5, the FPS of Im-YOLOv5 increased by 17.91% and the model size decreased by 55.56%, while the mAP decreased by only 0.35%, improving detection performance and shortening the model inference time. From the datasets, we selected a number of inter-row images of maize in the middle and late stages for testing, as shown in <xref ref-type="fig" rid="f10">
<bold>Figure&#xa0;10</bold>
</xref>. For the same image, Im-YOLOv5 was able to identify most maize stems, even those partially occluded. The detection confidence of Im-YOLOv5 and YOLOv5 was high, whereas that of Faster-RCNN was relatively low.</p>
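<p>The scores in Table&#xa0;3 follow the standard definitions <italic>P</italic> = TP/(TP+FP), <italic>R</italic> = TP/(TP+FN), and <italic>F</italic><sub>1</sub> = 2<italic>PR</italic>/(<italic>P</italic>+<italic>R</italic>). A minimal sketch (function names are ours, not the authors'); plugging the Im-YOLOv5 row's <italic>P</italic> = 0.97 and <italic>R</italic> = 0.81 into the <italic>F</italic><sub>1</sub> formula gives about 0.88, consistent with the reported 88%.</p>

```python
def precision_recall_f1(tp, fp, fn):
    """P, R, and F1 from detection counts (illustrative sketch)."""
    p = tp / (tp + fp)   # precision: correct detections / all detections
    r = tp / (tp + fn)   # recall: correct detections / all ground-truth objects
    return p, r, 2 * p * r / (p + r)

def f1(p, r):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)
```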
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Model evaluation.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="left">Model</th>
<th valign="top" align="center">
<italic>P</italic> (%)</th>
<th valign="top" align="center">
<italic>R</italic> (%)</th>
<th valign="top" align="center">
<italic>F</italic>
<sub>1</sub> (%)</th>
<th valign="top" align="center">FPS</th>
<th valign="top" align="center">Model size (M)</th>
<th valign="top" align="center">mAP (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Im-YOLOv5</td>
<td valign="top" align="center">97</td>
<td valign="top" align="center">81</td>
<td valign="top" align="center">88</td>
<td valign="top" align="center">78</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">96.12</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv5</td>
<td valign="top" align="center">93</td>
<td valign="top" align="center">90</td>
<td valign="top" align="center">93</td>
<td valign="top" align="center">66</td>
<td valign="top" align="center">27</td>
<td valign="top" align="center">96.48</td>
</tr>
<tr>
<td valign="top" align="left">Faster-RCNN</td>
<td valign="top" align="center">76</td>
<td valign="top" align="center">92</td>
<td valign="top" align="center">82</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">108</td>
<td valign="top" align="center">90.52</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>P, precision; R, recall; F<sub>1</sub>, harmonic mean of precision and recall; FPS, frames per second; mAP, mean average precision.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="f10" position="float">
<label>Figure&#xa0;10</label>
<caption>
<p>Results of stem detection. <bold>(A)</bold> Improved You Only Look Once, version 5 algorithm (Im-YOLOv5). <bold>(B)</bold> YOLOv5. <bold>(C)</bold> Faster region-based convolutional network (Faster-RCNN).</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g010.tif"/>
</fig>
</sec>
<sec id="s5_2">
<title>5.2 Experiment on auxiliary navigation information supplementing the main navigation information</title>
<p>The experiments verified the practical feasibility of the proposed inter-row navigation information acquisition method, which supplements machine vision with LiDAR point cloud data in the middle and late stages of maize. Because the coronavirus outbreak at the time made large-scale field experiments difficult, an artificial maize plant model was used to set up a simulation test environment for verifying the feasibility of the designed method. <xref ref-type="fig" rid="f11">
<bold>Figure&#xa0;11A</bold>
</xref> shows the test environment using the maize plant model. Investigation of maize planting in Anhui Province revealed that the row spacing for maize plants is about 50&#x2013;80&#xa0;cm and that plant spacing is about 20&#x2013;40&#xa0;cm. Therefore, the row spacing in the maize plant model was set to 65&#xa0;cm and the plant spacing to 25&#xa0;cm. At the same time, a number of non-maize obstacles were also set in the experiments. For the purpose of data acquisition in this work, the data acquisition robot was developed by Anhui Intelligent Agricultural Machinery and Equipment Engineering Laboratory at Anhui Agricultural University.</p>
<fig id="f11" position="float">
<label>Figure&#xa0;11</label>
<caption>
<p>
<bold>(A)</bold> Test environment. <bold>(B)</bold> Only the improved You Only Look Once, version 5 algorithm (Im-YOLOv5). <bold>(C)</bold> Laser imaging, detection, and ranging (LiDAR) supplement vision.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-1024360-g011.tif"/>
</fig>
<p>During the experiments, the required main navigation information was the position information of maize plants, while the required auxiliary navigation information was the position information of the non-maize obstacles. We set up six maize plant models and three non-maize obstacles and randomly set the locations of the obstacles. Subsequently, we conducted 10 information acquisition experiments at distances of 1,000, 2,000, and 3,000&#xa0;mm from the data acquisition robot to the front row of the maize plant model. The test results are shown in <xref ref-type="fig" rid="f11">
<bold>Figures&#xa0;11B, C</bold>
</xref>.</p>
</sec>
<sec id="s5_3" sec-type="discussion">
<title>5.3 Discussion</title>
<p>Generally, visual navigation between rows in the middle and late stages of maize extracts maize characteristics and then fits a navigation path. If only the camera is used to obtain information based on maize characteristics in the recognition stage, information on non-maize obstacles between the rows is missed, as shown in <xref ref-type="fig" rid="f11">
<bold>Figures&#xa0;11B, C</bold>
</xref>. With the introduction of the Im-YOLOv5 stem recognition algorithm and sufficient training, maize stem recognition became highly accurate; however, with Im-YOLOv5 alone, the non-maize obstacle recognition rate was almost zero, which poses a serious safety risk for the actual operation of plant protection robots in the middle and late stages of maize.</p>
<p>Using LiDAR to obtain auxiliary navigation information to supplement the main navigation information from machine vision largely solves the issue of missing information and greatly improves the safety of the planned navigation path. However, owing to the limited resolution of the 16-line LiDAR and the error of the camera&#x2013;LiDAR joint calibration, recognition was not satisfactory when an obstacle was far away or too small. As the distance between the data acquisition robot and the maize plants increased, the number of detected maize plant models remained stable, meaning that identification of the main navigation information was also stable. However, the number of recognized non-maize obstacles showed a downward trend, indicating that the recognition accuracy of the auxiliary navigation information decreased with distance. In view of these issues, we will use higher-accuracy 32-line or 64-line LiDAR in future experiments.</p>
</sec>
</sec>
<sec id="s6" sec-type="conclusions">
<title>6 Conclusion</title>
<p>To solve the problem of missing information when machine vision alone is used for inter-row navigation in the middle and late stages of maize, this paper has proposed a method that supplements machine vision with LiDAR point cloud data to obtain more accurate inter-row information. By training the Im-YOLOv5 model on the machine vision datasets, the main navigation information was obtained by identifying maize plants between the rows in the middle and late stages. As a supplement to this main navigation information, LiDAR was used to identify other, non-crop obstacles as auxiliary navigation information. This not only improved the accuracy of information recognition but also provides technical support for planning a safe navigation path. Experimental results from the data acquisition robot equipped with a camera and a LiDAR sensor demonstrated the validity and the good inter-row recognition performance of the proposed method for the middle and late stages of maize. However, even as LiDAR accuracy improves, cost remains the key obstacle to commercializing this recognition system. We therefore expect the system to be applied in small autonomous plant protection robots: even with this relatively high-precision recognition system, the low cost of small plant protection robots gives them a price advantage over UAVs. The navigation information can be used for path planning to guide robots to drive autonomously and as a basis for adjusting the robot's driving state, such as reducing the driving speed when rocks or large clods are detected. In subsequent research, we will therefore focus on path planning between maize rows and control of the robot's driving state.</p>
</sec>
<sec id="s7" sec-type="data-availability">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s8" sec-type="author-contributions">
<title>Author contributions</title>
<p>ZL: Software, visualization, investigation, and writing&#x2014;original draft. DX and LL: Investigation. HW: Writing&#x2014;review and editing. LC: Conceptualization, methodology, writing&#x2014;review and editing. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="s9" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported in part by the National Natural Science Foundation of China under grant no. 52175212.</p>
</sec>
<sec id="s10" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s11" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aguiar</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Oliveira</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Pedrosa</surname> <given-names>E. F.</given-names>
</name>
<name>
<surname>Santos</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A camera to LiDAR calibration approach through the optimization of atomic transformations</article-title>. <source>Expert Syst. Appl.</source> <volume>176</volume>, <fpage>114894</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.eswa.2021.114894</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bae</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Shin</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Lim</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Choi</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Estimation of closest in-path vehicle (CIPV) by low-channel LiDAR and camera sensor fusion for autonomous vehicle</article-title>. <source>Sensors</source>. <volume>21</volume>, <elocation-id>3124</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/s21093124</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>a). <article-title>Calibration of stereo cameras with a marked-crossed fringe pattern</article-title>. <source>Opt. Lasers Eng.</source> <volume>147</volume>, <elocation-id>106733</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.optlaseng.2021.106733</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>K.-W.</given-names>
</name>
<name>
<surname>Lai</surname> <given-names>C.-C.</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>P.-J.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>C.-S.</given-names>
</name>
<name>
<surname>Hung</surname> <given-names>Y.-P.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Adaptive learning for target tracking and true linking discovering across multiple non-overlapping cameras</article-title>. <source>IEEE Trans. Multimedia</source> <volume>13</volume>, <fpage>625</fpage>&#x2013;<lpage>638</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TMM.2011.2131639</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Lateral stability control of four-Wheel-Drive electric vehicle based on coordinated control of torque distribution and ESP differential braking</article-title>. <source>Actuators</source> <volume>10</volume>, <elocation-id>135</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/act10060135</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Ran</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>W.</given-names>
</name>
<etal/>
</person-group>. (<year>2021</year>b). <article-title>Flexible calibration method of electronically focus-tunable lenses</article-title>. <source>IEEE Trans. Instrum. Meas</source> <volume>70</volume>, <fpage>5013210</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TIM.2021.3097412</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>He</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Q.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Performance analysis and test of a maize inter-row self-propelled thermal fogger chassis</article-title>. <source>Int. J. Agric. Biol. Eng.</source> <volume>11</volume>, <fpage>100</fpage>&#x2013;<lpage>107</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.25165/j.ijabe.20181105.3607</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gai</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Xiang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Tang</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Using a depth camera for crop row detection and mapping for under-canopy navigation of agricultural robotic vehicle</article-title>. <source>Comput. Electron. Agric.</source> <volume>188</volume>, <elocation-id>106301</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2021.106301</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Path tracking control of field information-collecting robot based on improved convolutional neural network algorithm</article-title>. <source>Sensors</source> <volume>20</volume>, <elocation-id>797</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/s20030797</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hassanin</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Anwar</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Radwan</surname> <given-names>I.</given-names>
</name>
<name>
<surname>Khan</surname> <given-names>F. S.</given-names>
</name>
<name>
<surname>Mian</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Visual attention methods in deep learning: An in-depth survey</article-title>. <source>arXiv</source>. doi: <pub-id pub-id-type="doi">10.48550/arXiv.2204.07756</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hiremath</surname> <given-names>S. A.</given-names>
</name>
<name>
<surname>van der Heijden</surname> <given-names>G. W. A. M.</given-names>
</name>
<name>
<surname>van Evert</surname> <given-names>F. K.</given-names>
</name>
<name>
<surname>Stein</surname> <given-names>A.</given-names>
</name>
<name>
<surname>ter Braak</surname> <given-names>C. J. F.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Laser range finder model for autonomous navigation of a robot in a maize field using a particle filter</article-title>. <source>Comput. Electron. Agric.</source> <volume>100</volume>, <fpage>41</fpage>&#x2013;<lpage>50</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2013.10.005</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jafari Malekabadi</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Khojastehpour</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Emadi</surname> <given-names>B.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Disparity map computation of tree using stereo vision system and effects of canopy shapes and foliage density</article-title>. <source>Comput. Electron. Agric.</source> <volume>156</volume>, <fpage>627</fpage>&#x2013;<lpage>644</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2018.12.022</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jeong</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Yoon</surname> <given-names>T. S.</given-names>
</name>
<name>
<surname>Park</surname> <given-names>J. B.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Multimodal sensor-based semantic 3D mapping for a Large-scale environment</article-title>. <source>Expert Syst. Appl.</source> <volume>105</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.eswa.2018.03.051</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jing</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Guan</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zang</surname> <given-names>Y.</given-names>
</name>
<etal/>
</person-group>. (<year>2021</year>). <article-title>Multispectral LiDAR point cloud classification using SE-PointNet plus</article-title>. <source>Remote Sens.</source> <volume>13</volume>, <elocation-id>2516</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs13132516</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jubayer</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Soeb</surname> <given-names>J. A.</given-names>
</name>
<name>
<surname>Mojumder</surname> <given-names>A. N.</given-names>
</name>
<name>
<surname>Paul</surname> <given-names>M. K.</given-names>
</name>
<name>
<surname>Barua</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Kayshar</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2021</year>). <article-title>Detection of mold on the food surface using YOLOv5</article-title>. <source>Curr. Res. Food Sci.</source> <volume>4</volume>, <fpage>724</fpage>&#x2013;<lpage>728</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.crfs.2021.10.003</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Dou</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Control of a path following caterpillar robot based on a sliding mode variable structure algorithm</article-title>. <source>Biosyst. Eng.</source> <volume>186</volume>, <fpage>293</fpage>&#x2013;<lpage>306</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.biosystemseng.2019.07.004</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Mei</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Niu</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chu</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>RBF-based monocular vision navigation for small vehicles in narrow space below maize canopy</article-title>. <source>Appl. Sci.-Basel</source> <volume>6</volume>, <elocation-id>182</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/app6060182</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Yao</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Jia</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Tang</surname> <given-names>Z.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Road segmentation with image-LiDAR data fusion in deep neural network</article-title>. <source>Multimed. Tools Appl.</source> <volume>79</volume>, <fpage>35503</fpage>&#x2013;<lpage>35518</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s11042-019-07870-0</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Morales</surname> <given-names>J.</given-names>
</name>
<name>
<surname>V&#xe1;zquez-Mart&#xed;n</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Mandow</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Morilla-Cabello</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Garc&#xed;a-Cerezo</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The UMA-SAR dataset: Multimodal data collection from a ground vehicle during outdoor disaster response training exercises</article-title>. <source>Int. J. Robotics Res.</source> <volume>40</volume>, <elocation-id>27836492110049</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1177/02783649211004959</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mutz</surname> <given-names>F. W.</given-names>
</name>
<name>
<surname>Oliveira-Santos</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Forechi</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Komati</surname> <given-names>K. S.</given-names>
</name>
<name>
<surname>Badue</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Fran&#xe7;a</surname> <given-names>F.</given-names>
</name>
<etal/>
</person-group>. (<year>2021</year>). <article-title>What is the best grid-map for self-driving cars localization? An evaluation under diverse types of illumination, traffic, and environment</article-title>. <source>Expert Syst. Appl.</source> <volume>179</volume>, <elocation-id>115077</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.eswa.2021.115077</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Patr&#xed;cio</surname> <given-names>D. I.</given-names>
</name>
<name>
<surname>Rieder</surname> <given-names>R.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Computer vision and artificial intelligence in precision agriculture for grain crops: A systematic review</article-title>. <source>Comput. Electron. Agric.</source> <volume>153</volume>, <fpage>69</fpage>&#x2013;<lpage>81</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2018.08.001</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Radcliffe</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Cox</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Bulanon</surname> <given-names>D. M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Machine vision for orchard navigation</article-title>. <source>Comput. Ind.</source> <volume>98</volume>, <fpage>165</fpage>&#x2013;<lpage>171</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compind.2018.03.008</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Reiser</surname> <given-names>D.</given-names>
</name>
<name>
<surname>V&#xe1;zquez-Arellano</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Paraforos</surname> <given-names>D. S.</given-names>
</name>
<name>
<surname>Garrido-Izard</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Griepentrog</surname> <given-names>H. W.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Iterative individual plant clustering in maize with assembled 2D LiDAR data</article-title>. <source>Comput. Ind.</source> <volume>99</volume>, <fpage>42</fpage>&#x2013;<lpage>52</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compind.2018.03.023</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Lian</surname> <given-names>G.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>Recognition and localization methods for vision-based fruit picking robots: A review</article-title>. <source>Front. Plant Sci.</source> <volume>11</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2020.00510</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Gong</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Hao</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Cui</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Evaluation of artificial intelligence models for actual crop evapotranspiration modeling in mulched and non-mulched maize croplands</article-title>. <source>Comput. Electron. Agric.</source> <volume>152</volume>, <fpage>375</fpage>&#x2013;<lpage>384</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2018.07.029</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Fruit detection and positioning technology for a Camellia oleifera C. Abel orchard based on improved YOLOv4-tiny model and binocular stereo vision</article-title>. <source>Expert Syst. Appl.</source> <volume>211</volume>, <elocation-id>118573</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.eswa.2022.118573</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Cai</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2022</year>a). <article-title>Motion-induced error reduction for phase-shifting profilometry with phase probability equalization</article-title>. <source>Opt. Lasers Eng.</source> <volume>156</volume>, <elocation-id>107088</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.optlaseng.2022.107088</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Cai</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2022</year>b). <article-title>Nonlinear correction for fringe projection profilometry with shifted-phase histogram equalization</article-title>. <source>IEEE Trans. Instrum. Meas.</source> <volume>71</volume>, <fpage>5005509</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TIM.2022.3145361</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Tang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A study on long-close distance coordination control strategy for litchi picking</article-title>. <source>Agronomy-Basel</source> <volume>12</volume>, <elocation-id>1520</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/agronomy12071520</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>X.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Maize plant phenotyping: Comparing 3D laser scanning, multi-view stereo reconstruction, and 3D digitizing estimates</article-title>. <source>Remote Sens.</source> <volume>11</volume>, <elocation-id>63</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs11010063</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Duan</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Ye</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Ai</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>Z.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Multi-target recognition of bananas and automatic positioning for the inflorescence axis cutting point</article-title>. <source>Front. Plant Sci.</source> <volume>12</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2021.705021</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xie</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Bao</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Tong</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Shi</surname> <given-names>B.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A self-calibrated photo-geometric depth camera</article-title>. <source>Visual Comput.</source> <volume>35</volume>, <fpage>99</fpage>&#x2013;<lpage>108</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s00371-018-1507-9</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xue</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>ECANet: Explicit cyclic attention-based network for video saliency prediction</article-title>. <source>Neurocomputing</source> <volume>468</volume>, <fpage>233</fpage>&#x2013;<lpage>244</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.neucom.2021.10.024</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Xiong</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>M.</given-names>
</name>
<etal/>
</person-group>. (<year>2021</year>). <article-title>LiDAR-camera calibration method based on ranging statistical characteristics and improved RANSAC algorithm</article-title>. <source>Robot. Auton. Syst.</source> <volume>141</volume>, <elocation-id>103776</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.robot.2021.103776</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Zhuge</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Guan</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Lin</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Gan</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>On-orbit calibration for spaceborne line array camera and LiDAR</article-title>. <source>Remote Sens.</source> <volume>14</volume>, <elocation-id>2949</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs14122949</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Bai</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Intelligent vehicle lateral control method based on feedforward</article-title>. <source>Actuators</source> <volume>10</volume>, <elocation-id>228</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/act10090228</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Wen</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>S.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>a). <article-title>An optimal goal point determination algorithm for automatic navigation of agricultural machinery: Improving the tracking accuracy of the pure pursuit algorithm</article-title>. <source>Comput. Electron. Agric.</source> <volume>194</volume>, <elocation-id>106760</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2022.106760</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Ouyang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Duan</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>b). <article-title>Visual navigation path extraction of orchard hard pavement based on scanning method and neural network</article-title>. <source>Comput. Electron. Agric.</source> <volume>197</volume>, <elocation-id>106964</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2022.106964</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Ma</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Tian</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Lu</surname> <given-names>X.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Registration and fusion of UAV LiDAR system sequence images and laser point clouds</article-title>. <source>J. Imaging Sci. Technol.</source> <volume>65</volume>, <elocation-id>10501</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.2352/J.ImagingSci.Technol.2021.65.1.010501</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Jia</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Gu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Multi-objective optimization of lubricant volume in an ELSD considering thermal effects</article-title>. <source>Int. J. Therm. Sci.</source> <volume>164</volume>, <elocation-id>106884</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.ijthermalsci.2021.106884</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Lv</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>An</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>An adaptive vision navigation algorithm in agricultural IoT system for smart agricultural robots</article-title>. <source>Comput. Mater. Continua</source> <volume>66</volume>, <fpage>1043</fpage>&#x2013;<lpage>1056</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.32604/cmc.2020.012517</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Fu</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Cui</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Real-time kiwifruit detection in orchard using deep learning on Android&#x2122; smartphones for yield estimation</article-title>. <source>Comput. Electron. Agric.</source> <volume>179</volume>, <elocation-id>105856</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2020.105856</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>