ORIGINAL RESEARCH article

Front. Robot. AI

Sec. Computational Intelligence in Robotics

Volume 12 - 2025 | doi: 10.3389/frobt.2025.1654074

This article is part of the Research Topic: Advanced Sensing, Learning and Control for Effective Human-Robot Interaction.

Enhancing Weed Detection through Knowledge Distillation and Attention Mechanism

Provisionally accepted
Ali EL ALAOUI* and Hajar MOUSANNIF
  • Universite Cadi Ayyad Faculte des Sciences Semlalia, Marrakesh, Morocco

The final, formatted version of the article will be published soon.

Weeds pose a significant challenge in agriculture by competing with crops for essential resources, leading to reduced yields. To address this problem, researchers have increasingly adopted advanced machine learning techniques. Vision Transformers (ViTs) have recently achieved remarkable success across a range of computer vision tasks, and their self-attention mechanism makes them attractive for weed classification, detection, and segmentation compared with traditional Convolutional Neural Networks (CNNs). However, deploying these models on agricultural robots is hindered by resource limitations. Key challenges include high training cost, the absence of inductive biases, the large volume of data required for training, model size, and runtime memory constraints.

This study proposes a knowledge distillation-based method for optimizing the ViT model. The approach compresses the ViT architecture while preserving its weed-detection performance. To facilitate training of the compact ViT student model and to impart the benefits of parameter sharing and local receptive fields, knowledge was distilled from ResNet-50, which serves as the teacher model. Experimental results demonstrate significant improvements in the student model, which achieves a mean Average Precision (mAP) of 83.47% at minimal computational expense, with only 5.7 million parameters.
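As a rough illustration of the teacher–student objective commonly used in such knowledge-distillation setups (the article's exact loss formulation is not reproduced here, so this is a sketch under standard assumptions, following the Hinton-style soft-target convention): a temperature-softened KL-divergence term pulls the student's logits toward the teacher's, while a conventional cross-entropy term anchors the student to the ground-truth labels. The function names, temperature `T`, and mixing weight `alpha` below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with optional temperature T."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of a soft KL(teacher || student) term at temperature T
    and a hard cross-entropy term against the ground-truth labels."""
    p_t = softmax(teacher_logits, T)  # softened teacher probabilities
    p_s = softmax(student_logits, T)  # softened student probabilities
    # KL divergence per sample, scaled by T^2 (standard soft-target convention)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    soft = (T ** 2) * kl.mean()
    # Hard cross-entropy with the true labels at temperature 1
    q_s = softmax(student_logits, 1.0)
    hard = -np.mean(np.log(q_s[np.arange(len(labels)), labels] + 1e-12))
    return alpha * soft + (1 - alpha) * hard
```

In a real training loop the `student_logits` would come from the compact ViT and the `teacher_logits` from the frozen ResNet-50, with the combined loss back-propagated through the student only.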

Keywords: deep learning, precision agriculture, vision transformer, weed detection, robotic weed control

Received: 25 Jun 2025; Accepted: 04 Aug 2025.

Copyright: © 2025 EL ALAOUI and MOUSANNIF. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Ali EL ALAOUI, Universite Cadi Ayyad Faculte des Sciences Semlalia, Marrakesh, Morocco

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.