AUTHOR=Wang Fenghua, Tang Yuan, Gong Zaipeng, Jiang Jin, Chen Yu, Xu Qiang, Hu Peng, Zhu Hailong

TITLE=A lightweight Yunnan Xiaomila detection and pose estimation based on improved YOLOv8

JOURNAL=Frontiers in Plant Science

VOLUME=Volume 15 - 2024

YEAR=2024

URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1421381

DOI=10.3389/fpls.2024.1421381

ISSN=1664-462X

ABSTRACT=Yunnan Xiaomila is a pepper variety whose flowers and fruits mature simultaneously and several times a year. The fruits are small, contrast poorly with a complex background, and are therefore difficult to identify. To address Xiaomila detection against complex backgrounds, and to reduce the impact of the small color gradient between fruit and background and of unclear feature information, this paper proposes an improved model, PAE-YOLO, which integrates the EMA attention mechanism and DCNv3 deformable convolution into YOLOv8. These changes improve the model's feature extraction capability and inference speed for Xiaomila in complex environments while keeping the model lightweight. A depth camera is also used to estimate the pose of Xiaomila, and different occlusion situations are analyzed and optimized. In experiments, the model achieved a mean average precision (mAP) of 88.8%, 1.3% higher than the original model. Its F1 score reached 83.2, and its computational cost and model size were 7.6 GFLOPs and 5.7 MB, respectively. Among the several networks compared, it had the best F1 score, the smallest model weight, and the fewest giga floating-point operations (GFLOPs), as well as the lowest training loss and the fastest convergence. Pose estimation on 102 targets estimated the orientation correctly in more than 85% of cases, with an average error angle of 15.91°. Under occlusion, 86.3% of pose estimation error angles were below 40°, and the average error angle was 23.19°. These results suggest that the improved detection model can accurately identify Xiaomila fruits and estimate their pose well.