ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Sustainable and Intelligent Phytoprotection
Volume 16 - 2025 | doi: 10.3389/fpls.2025.1635310
This article is part of the Research Topic "Cutting-Edge Technologies Applications in Intelligent Phytoprotection: From Precision Weed and Pest Detection to Variable Fertilization Technologies"
MAVM-UNet: Multiscale Aggregated Vision MambaU-Net for Field Rice Pest Detection
Provisionally accepted
- 1Chengdu University of Technology, Chengdu, China
- 2Xijing University, Xi'an, China
- 3SIAS University, Xinzheng, China
Pests in rice fields not only reduce the yield and quality of rice but also cause serious ecological and environmental problems through the heavy reliance on pesticides used to control them. Because rice pests have irregular and changeable shapes, small sizes, and complex backgrounds, detecting them in the field is both an essential prerequisite for precise pest control and a challenging task. To address this, a multiscale aggregated vision MambaU-Net (MAVM-UNet) model for rice pest detection is constructed. The model consists of four main modules: Visual State Space (VSS), multiscale VSS (MSVSS), Channel-Aware VSS (CAVSS), and multiscale attention aggregation (MSAA). VSS serves as the basic module for capturing context information; MSVSS captures and aggregates fine-grained multiscale features of field rice pest images; CAVSS is added to the skip connections to select the critical channel representations of the encoder and decoder; and MSAA is added in the bottleneck layer to integrate the pest features from different layers of the encoder. Combining MSAA and CAVSS captures both low-level details and high-level semantics and dynamically adjusts the contributions of features at different scales: for example, the slender legs and antennae of pests rely on fine-grained features, while the large body of a pest relies on coarse-grained features. Extensive experiments on the rice pest image subset of the IP102 dataset show that MAVM-UNet outperforms the state-of-the-art models, achieving PA and MIoU of 82.07% and 81.48%, respectively. The proposed model provides important guidance for the monitoring and control of pests in rice fields.
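The abstract reports PA and MIoU without expanding them. Assuming they carry their usual segmentation meanings (pixel accuracy and mean intersection over union), the following minimal sketch shows how both can be computed from a per-pixel confusion matrix. The function names and the toy two-class masks are illustrative assumptions, not the authors' evaluation code.

```python
# Illustrative sketch only: assumes PA = pixel accuracy and
# MIoU = mean intersection over union, computed over flattened label masks.

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] = number of pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class matches the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def mean_iou(cm):
    """Average of per-class TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true c, predicted class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Hypothetical flattened 2x4 masks: 0 = background, 1 = pest.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(cm)                  # [[3, 1], [1, 3]]
print(pixel_accuracy(cm))  # 0.75
print(mean_iou(cm))        # 0.6
```

In this toy case six of eight pixels are correct (PA = 0.75), while each class has three true positives against one false positive and one false negative (IoU = 3/5 per class), which illustrates why MIoU typically penalizes boundary errors more heavily than PA.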
Keywords: Field rice pest detection, Visual State Space (VSS), Channel-Aware VSS (CAVSS), Vision MambaU-Net, Multiscale aggregated vision MambaU-Net (MAVM-UNet)
Received: 28 May 2025; Accepted: 15 Jul 2025.
Copyright: © 2025 Zhang, Zhang and Shang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Congqi Zhang, Chengdu University of Technology, Chengdu, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.