ORIGINAL RESEARCH article

Front. Agron.

Sec. Pest Management

Volume 7 - 2025 | doi: 10.3389/fagro.2025.1578412

This article is part of the Research Topic: Modeling, Remote Sensing, and Machine Learning in Pest Management

Research on Multi Class Pest Identification and Detection Based on Fusion Attention Mechanism with Mask-RCNN-CBAM

Provisionally accepted
Xingwang Wang1, Can Hu2, Xufeng Wang2, Hainie Zha3*, Xueyong Chen1*, Shanshan Yuan4, Jing Zhang5, Jianfeng Liao5
  • 1College of Mechanical and Electronic Engineering, Fujian Agriculture and Forestry University, Fuzhou, China
  • 2College of Mechanical and Electrical Engineering, Tarim University, Alar, China
  • 3University Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing Normal University, Anqing, Anhui Province, China
  • 4Quality and Safety Inspection and Testing Center for Agricultural Products, Qiemu County, China
  • 5Anhui Eagle Information Technology Co., Ltd, Anqing, Anhui Province, China

The final, formatted version of the article will be published soon.

With the advancement of agricultural modernization, pest identification and extraction have become increasingly important in crop pest control. To address the false positives and missed detections caused by complex backgrounds and dense pest aggregation, this paper proposes an enhanced Mask-RCNN model integrated with a Convolutional Block Attention Module (CBAM) for precise pest extraction in complex environments. Specifically, three components are introduced into the original Mask-RCNN architecture: the CBAM attention mechanism, a feature-enhanced pyramid network (FPN), and a dual-channel downsampling module. The CBAM attention mechanism allows the model to focus more effectively on the detailed features of pests while minimizing interference from background noise. The feature-enhanced FPN module fuses multi-scale feature maps, improving detection of small target pests. The dual-channel downsampling module improves information retention during feature propagation, further enhancing the model's overall performance. Experimental results show that the proposed model outperforms existing algorithms such as ResNet, Faster-RCNN, and the original Mask-RCNN in precision, recall, and F1 score, with improvements of 0.73%, 1.38%, and 2.67%, respectively. Moreover, the method significantly reduces false positives and missed detections while achieving a favorable trade-off between computational efficiency and model size, with a reduction of 1.39 MB in parameters compared with the original Mask-RCNN. Ablation studies further substantiate the contributions of the attention mechanism, the feature-enhanced FPN, and the dual-channel downsampling module to the performance gains.
Overall, the enhanced Mask-RCNN-CBAM model demonstrates superior performance in complex backgrounds and dense pest regions, exhibiting substantial practical value and promising application prospects.
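
For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall, and F1 score are conventionally computed from counts of true positives, false positives (false detections), and false negatives (missed detections). It is an illustrative sketch only: the function name `detection_metrics` and the counts used are hypothetical and are not taken from the article, and the authors' exact evaluation protocol (for example, how a detection is matched to a ground-truth pest) is not described in the abstract.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from detection counts.

    tp: detections that match a ground-truth pest (true positives)
    fp: detections with no matching pest (false positives)
    fn: ground-truth pests with no matching detection (missed detections)
    """
    # Guard against division by zero when there are no detections or no targets.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only; not results from the article.
metrics = detection_metrics(tp=90, fp=10, fn=30)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these hypothetical counts, precision is 0.9, recall is 0.75, and F1 is about 0.818. Fewer false positives raise precision and fewer missed detections raise recall, which is why reductions in both kinds of error show up as gains in all three scores.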

Keywords: Mask-RCNN-CBAM, attention mechanism, Feature Enhanced Pyramid Network, Dual Channel Downsampling, Pest extraction, deep learning

Received: 17 Feb 2025; Accepted: 18 Apr 2025.

Copyright: © 2025 Wang, Hu, Wang, Zha, Chen, Yuan, Zhang and Liao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Hainie Zha, University Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing Normal University, Anqing, Anhui Province, China
Xueyong Chen, College of Mechanical and Electronic Engineering, Fujian Agriculture and Forestry University, Fuzhou, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.