- School of Mechanical Engineering, Jiangsu Ocean University, Lianyungang, China
Deep learning has become a transformative technology for modern weed detection, offering significant advantages over traditional machine vision in robustness, scalability, and recognition accuracy. This review provides a comprehensive synthesis of recent progress in deep learning-based weed detection, with a focus on three major model families: object detection, image segmentation, and image classification. For each category, representative architectures, key algorithmic features, and typical agricultural application scenarios are summarized and compared. The strengths and limitations of these approaches—particularly in terms of spatial localization, pixel-level delineation, computational efficiency, and model generalization—are critically analyzed. In addition, major challenges such as dataset scarcity, annotation cost, variability in weed morphology, and real-time deployment constraints are discussed, along with emerging solutions including crop-based indirect detection, semi-supervised learning, and model–actuator integration. This review highlights future opportunities toward scalable, data-efficient, and precision-integrated weed management, offering guidance for the development of next-generation intelligent weeding systems.
1 Introduction
Weeds remain one of the most damaging biotic constraints to global crop production, imposing substantial yield and quality losses through intense competition for light, water, nutrients, and space (Kaur et al., 2018). Their rapid reproduction, strong ecological adaptability, and ability to thrive under diverse environmental conditions often enable them to outcompete crops, threatening agricultural sustainability and food security. Studies have shown that uncontrolled weed infestations can significantly reduce crop yield, depending on species composition and management practices (Vilà et al., 2021). Moreover, the proliferation of herbicide-resistant weed biotypes and the increasing variability of weed populations across regions further complicate effective control (Shaner, 1995). Beyond direct yield impacts, weeds also increase production costs, hinder harvesting operations, and contribute to long-term ecological risks (Bairambekov et al., 2016; Wicks et al., 2017). Therefore, efficient and timely weed management is essential for stabilizing crop production and ensuring the long-term resilience of agricultural systems.
With the shift toward precision agriculture, weed control has progressively evolved from uniform field-scale treatment to spatially targeted and environmentally conscious management. Mechanical weeding in inter-row areas requires reliable visual perception to avoid damaging crop stems, while intra-row mechanical systems depend heavily on precise localization to remove weeds growing near crops (Balas et al., 2022). Similarly, site-specific herbicide application relies on accurate identification of weed presence, species, and distribution patterns to minimize chemical inputs while maintaining high control efficiency. Conventional broadcast spraying wastes substantial amounts of herbicide on non-target surfaces and contributes to ecological contamination, whereas precision spraying systems can significantly reduce chemical use when supported by accurate weed detection (Reddy and James, 2018). Consequently, robust weed detection has become a foundational requirement for intelligent weeding robots, autonomous tractors, and UAV-based prescription mapping (Xu et al., 2025; Huang et al., 2018). The increasing demand for high-resolution, real-time, and scalable weed perception underscores the necessity of advanced computational approaches capable of operating in heterogeneous and complex field environments.
The rapid development of deep learning, particularly convolutional neural networks (CNNs) and Transformer-based architectures, has fundamentally reshaped the landscape of computer vision. These models have achieved breakthroughs in medical imaging (Kim et al., 2019), autonomous driving (Grigorescu et al., 2020), remote sensing (Dou et al., 2021), industrial inspection (Glaeser et al., 2021), and robotics (Chen et al., 2020) by enabling hierarchical feature learning and delivering high robustness under substantial variation in lighting, texture, scale, and occlusion. In agriculture, deep learning has demonstrated remarkable success in tasks such as disease diagnosis, fruit detection, yield prediction, canopy segmentation, and crop phenotyping, outperforming traditional machine vision and classical machine learning techniques (Deng et al., 2021; Wang et al., 2022; Chu and Yu, 2020; Atkins et al., 2025). Its ability to automatically extract discriminative features from raw images without relying on hand-crafted descriptors makes it especially valuable in field environments characterized by non-uniform illumination, dynamic backgrounds, and biological diversity (Argunşah et al., 2025). These cross-domain achievements illustrate the strong potential of deep learning to serve as the core enabling technology for next-generation intelligent weed management systems.
Traditional weed detection methods primarily rely on manually engineered features derived from color indices (Rehman et al., 2018), texture descriptors (Pan et al., 2016), or spectral characteristics (Ajamian et al., 2021). Although effective in controlled environments, their performance deteriorates under variable illumination, soil clutter, occlusion by crop leaves, morphological variation in weeds, and changes in crop growth stages (Zhou et al., 2025). Deep learning overcomes these limitations by learning multi-scale, high-level semantic representations that capture complex spatial and contextual patterns, enabling more robust differentiation between crops and weeds. Object detection models can localize weeds with bounding boxes, segmentation networks can delineate weed shapes at the pixel level, and lightweight classification networks can distinguish weed species or growth stages with high accuracy (Rai and Sun, 2024). Moreover, deep learning models can be fine-tuned, transferred across regions, or combined with multispectral, hyperspectral, or depth imagery to enhance generalization (Liu et al., 2021). These advantages make deep learning particularly well suited for real-time weed perception in precision agriculture while providing a scalable framework adaptable to diverse sensing platforms and field conditions (Li et al., 2024b).
Given the rapid expansion of deep learning applications in weed detection and the increasing availability of field imagery from UAVs, ground robots, and intelligent implements, a comprehensive and updated review is needed to synthesize existing progress, identify limitations, and outline future opportunities. This review summarizes recent advances in deep learning-based weed detection with an emphasis on three major frameworks—object detection, image segmentation, and image classification. For each category, representative network architectures, application scenarios, and performance metrics are analyzed and compared. The advantages, limitations, and trade-offs among these approaches are discussed, providing guidance for selecting suitable models under different field conditions and computational constraints. Finally, the review outlines key challenges and future research directions toward scalable, robust, and integrated intelligent weed management. The remainder of this paper is organized as follows. Section 2 reviews deep learning-based methods for weed detection, including object detection, image segmentation, and image classification. Section 3 discusses the main challenges related to model design, dataset construction, and system deployment, and outlines potential solutions and research directions. Section 4 concludes the paper.
2 Deep learning-based methods for weed detection
Deep learning has fundamentally transformed computer vision by enabling neural networks to automatically learn hierarchical feature representations from raw image data, eliminating the handcrafted feature engineering required by traditional machine learning approaches (Lecun et al., 2015). In the context of weed detection, deep learning models have shown strong robustness against variations in illumination, occlusion, weed density, and background complexity, making them highly suitable for complex agricultural environments (Teimouri et al., 2022). A typical deep learning-based weed detection workflow consists of data collection, dataset annotation, image preprocessing, model training, inference, and evaluation (Figure 1) (Hasan et al., 2021). According to their learning objectives and output structures, deep learning methods for visual weed detection can be broadly grouped into three categories—object detection, image segmentation, and image classification. These model families form the algorithmic foundation for subsequent research and application in intelligent and precise weed management.
Figure 1. Overview of deep learning-based weed detection process (Hasan et al., 2021).
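To make the workflow in Figure 1 concrete, the following minimal sketch illustrates the training, inference, and evaluation stages for a generic classifier, assuming a PyTorch environment; the datasets, model, and hyperparameters are placeholders rather than a prescribed configuration, and the data collection, annotation, and preprocessing stages are assumed to have already produced the `train_set` and `test_set` objects.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_and_evaluate(model, train_set, test_set, epochs=10, device="cpu"):
    """Sketch of the Figure 1 workflow: preprocessing is assumed to happen
    inside the datasets; training, inference, and evaluation follow."""
    loader = DataLoader(train_set, batch_size=16, shuffle=True)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    model.to(device).train()
    for _ in range(epochs):                        # model training
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Inference and evaluation on held-out data
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in DataLoader(test_set, batch_size=16):
            preds = model(images.to(device)).argmax(dim=1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total                         # overall accuracy
```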
2.1 Categories of deep learning models
Deep learning models for visual perception are commonly instantiated as object detection, image segmentation, or image classification networks, each characterized by distinct architectures, learning objectives, and output representations. These model categories form the algorithmic foundation for a wide range of computer vision tasks in both natural and structured environments.
Object detection models aim to simultaneously localize and classify objects within an image by predicting bounding boxes and corresponding category labels. Two-stage detectors, such as Faster R-CNN (Ren et al., 2017), generate region proposals before refining them through dedicated classification and regression heads, providing high accuracy at the cost of computational complexity. In contrast, one-stage detectors, including SSD (Liu et al., 2016), YOLO (Redmon et al., 2015), and RetinaNet, eliminate the proposal step and directly perform dense regression on predefined anchor boxes or anchor-free keypoints, enabling significantly faster inference. Modern object detectors incorporate feature pyramid networks (FPN), multi-scale feature aggregation, attention mechanisms, and Transformer-based modules to enhance small-object detection and improve robustness across varying image resolutions (Lu et al., 2025; Zhao et al., 2023; Lin et al., 2022).
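As a concrete illustration, a pretrained two-stage detector of the kind described above can be instantiated and run in a few lines with torchvision (version 0.13 or later is assumed for the weights API); in practice the COCO-pretrained model would first be fine-tuned on annotated crop–weed imagery rather than used with generic weights.

```python
import torch
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights,
)

# COCO-pretrained two-stage detector; for weed detection it would be
# fine-tuned on an annotated crop/weed dataset before deployment.
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()

image = torch.rand(3, 640, 640)            # stand-in for a field image tensor
with torch.no_grad():
    output = model([image])[0]             # dict with 'boxes', 'labels', 'scores'

keep = output["scores"] > 0.5              # simple confidence filtering
print(output["boxes"][keep], output["labels"][keep])
```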
Image segmentation models focus on pixel-level classification by assigning a semantic label to each pixel in an image. Semantic segmentation architectures such as FCN and U-Net adopt encoder–decoder designs, where the encoder extracts progressively abstract features and the decoder reconstructs spatial detail through upsampling and skip connections (Ates et al., 2023; Agarwal et al., 2023). More advanced models like DeepLab series integrate atrous convolutions and multi-scale context aggregation (ASPP), while Transformer-based architectures leverage long-range dependencies for improved structural consistency (Sharma et al., 2025). Instance segmentation models, exemplified by Mask R-CNN, further distinguish individual object instances by combining detection heads with dedicated mask prediction branches (He et al., 2020). These models are widely used when fine-grained spatial structures or object shapes are critical.
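The encoder–decoder pattern with skip connections described above can be expressed compactly. The following is a deliberately reduced two-level U-Net-style sketch in PyTorch; the class name, channel widths, and three-class output (e.g., soil/crop/weed) are illustrative choices, not a published architecture.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    """Two-level encoder-decoder with a single skip connection."""
    def __init__(self, n_classes=3):               # e.g. soil / crop / weed
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)              # 64 = 32 upsampled + 32 skip
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        s1 = self.enc1(x)                           # full-resolution features
        b = self.enc2(self.pool(s1))                # downsampled bottleneck
        d = self.up(b)                              # upsample back
        d = self.dec1(torch.cat([d, s1], dim=1))    # skip connection
        return self.head(d)                         # per-pixel class logits

logits = TinyUNet()(torch.rand(1, 3, 128, 128))     # -> (1, 3, 128, 128)
```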
Image classification models seek to assign a single category label to an image or image patch. Classical convolutional neural networks such as AlexNet (Krizhevsky et al., 2017), VGG (Simonyan and Zisserman, 2014), GoogLeNet (Szegedy et al., 2014), and ResNet (He et al., 2015) pioneered deep hierarchical feature extraction, while later architectures—including DenseNet (Huang et al., 2016), MobileNet (Howard et al., 2017), EfficientNet (Tan and Le, 2019), and RegNet (Xu et al., 2021)—optimized parameter efficiency, representational capacity, and deployment feasibility. Recently, Vision Transformers (ViT) and hybrid CNN–Transformer models have demonstrated strong performance by modeling global spatial relationships through self-attention mechanisms (Dosovitskiy et al., 2020; Li et al., 2024a). Classification networks form the backbone of numerous visual recognition tasks and can be adapted for fine-grained species identification, hierarchical categorization, or lightweight edge deployment.
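A common adaptation route for such backbones is transfer learning: the pretrained feature extractor is frozen and only a new classification head is trained for the target weed classes. A minimal sketch with torchvision's ResNet-18 follows; the class count is a placeholder.

```python
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# ImageNet-pretrained backbone, re-headed for a hypothetical set of
# weed/crop classes; only the new head is trainable here.
num_classes = 5                                    # illustrative class count
model = resnet18(weights=ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                        # freeze pretrained features
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head
```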
2.2 Object detection methods for weed detection
In the context of weed detection, the primary goal of object detection models is to identify weed plants in images and accurately localize them through bounding boxes (Figure 2) (Darbyshire et al., 2023; Sun et al., 2024). Directly applying general object detection networks—such as YOLO and Faster R-CNN—has become one of the most widely adopted approaches in weed detection. For example, Deng et al. (2023) proposed a YOLOX-based weed detection method for paddy fields, in which CSPDarknet combined with FPN was used for multi-scale feature fusion. Among multiple compared models, YOLOX-tiny achieved the highest accuracy and lowest computational cost in dense small-object scenarios during the seedling stage, making it suitable for deployment on embedded agricultural devices for precision spraying. In addition, Darbyshire et al. (2023) conducted a comprehensive investigation on weed detection from the perspective of precision spraying. Their study evaluated detection accuracy and spraying performance using two independent datasets, multiple image resolutions, and several state-of-the-art object detection algorithms. To realistically represent precision spraying outcomes, a simplified spraying model was introduced, along with two key metrics—“weed coverage” and “sprayed area”—to jointly assess weed hit rate and herbicide use efficiency. The results demonstrated that, when using state-of-the-art visual detection methods, spraying only 30% of the field area could cover 93% of weeds, highlighting the potential of deep-learning-based precision spraying to substantially reduce herbicide consumption and minimize environmental impact.
Figure 2. Detected objects in sample images (Sun et al., 2024).
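The two spraying metrics reported by Darbyshire et al. (2023) can be approximated numerically. The sketch below is a simplified toy version, assuming spraying is triggered for every grid cell overlapped by a detection box and that a boolean ground-truth weed mask is available for scoring; it is not the authors' exact spraying model.

```python
import numpy as np

def spray_metrics(weed_mask, detected_boxes, cell=32):
    """Toy 'weed coverage' and 'sprayed area' on a spray-cell grid.
    weed_mask: boolean (H, W) ground-truth weed pixels.
    detected_boxes: integer (x0, y0, x1, y1) detection boxes."""
    h, w = weed_mask.shape
    sprayed = np.zeros_like(weed_mask, dtype=bool)
    for x0, y0, x1, y1 in detected_boxes:
        # snap each box outward to the spray-cell grid, then mark as sprayed
        x0, y0 = (x0 // cell) * cell, (y0 // cell) * cell
        x1 = min(w, -(-x1 // cell) * cell)          # ceil to cell boundary
        y1 = min(h, -(-y1 // cell) * cell)
        sprayed[y0:y1, x0:x1] = True
    weed_coverage = (weed_mask & sprayed).sum() / max(weed_mask.sum(), 1)
    sprayed_area = sprayed.mean()                   # fraction of field sprayed
    return weed_coverage, sprayed_area
```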
Based on the direct applicability of object detection for in-field weed monitoring, researchers have developed and optimized a variety of deep-learning-based weed detection models. Table 1 lists representative models, highlighting their network architectures, target objects, and performance metrics in typical agricultural scenarios.
Object detection allows direct identification of visible weeds in images and provides information on their location and abundance, although certain limitations remain in practical applications. The advantages of object detection networks lie in their ability to rapidly process large-scale field images and directly recover the spatial distribution of weeds, enabling automated monitoring. Deep networks can extract multi-scale features of weeds from complex field backgrounds, thereby improving detection accuracy. Furthermore, lightweight models such as YOLOX-tiny and MobileNet-SSD can achieve real-time inference on embedded devices, offering feasible deployment options for UAVs and field robots.
On the other hand, the limitations of employing object detection networks for weed detection are also significant. Detection performance is highly affected by variations in illumination, crop growth stages, weed density, and occlusion by crops, with small-sized weed plants particularly prone to being missed or misclassified, resulting in reduced model accuracy. Furthermore, the detectable weed species are limited to those present in the training dataset, leading to poor generalization across new regions or different seasonal conditions. In addition, achieving high-precision detection requires a large volume of annotated data, the production of which is costly and time-consuming, thereby constraining the widespread application of these models in practical precision agriculture.
2.3 Image segmentation methods for weed detection
Image segmentation is a fundamental task in machine vision and includes semantic segmentation and instance segmentation. Semantic segmentation assigns a class label to every pixel in an image, thereby delineating precise object regions; however, it does not distinguish between different instances of the same class. In contrast, instance segmentation performs both pixel-level classification and object localization, providing accurate object shapes while differentiating multiple instances within the same category (Nistor et al., 2020). Unlike object detection, which only predicts bounding boxes, image segmentation provides pixel-level information and can capture precise object contours (Minaee et al., 2021). Image segmentation is widely used in a variety of computer vision applications, including medical imaging (Hoorali et al., 2022), autonomous driving (Papadeas et al., 2021), and agricultural analysis of crops and weeds (Fawakherji et al., 2021). Figure 3 illustrates the segmentation network results for weed detection in paddy fields.
Figure 3. Image segmentation results for weed detection in paddy fields (Ma et al., 2019).
Compared with object detection, deep-learning-based weed segmentation provides pixel-level delineation of weed distributions and shapes, which is particularly beneficial when weeds grow in close proximity to crops or are partially occluded. This fine-grained representation offers more precise guidance for precision agriculture. Xu et al. (2023) proposed a weed detection method that integrates visible light color indices with an encoder–decoder-based instance segmentation model. The color indices enhanced the contrast between vegetation and soil, mitigating the effects of illumination variation and background interference. Meanwhile, the combination of ResNet101_v and DSASPP improved multi-scale semantic feature extraction and boundary segmentation. Field experiments demonstrated high accuracy (Acc = 0.905, IoU = 0.959), and the overall performance surpassed or matched that of Deeplabv3+, Deeplabv3, FCN, U-Net, FastFCN, Swin Transformer, and Vision Transformer, confirming its potential for application in precision weed control.
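Visible-light color indices of the kind used in this pipeline are straightforward to compute. The excess-green index (ExG = 2g - r - b on chromaticity-normalized channels) is one widely used example, sketched below; the specific indices and thresholds adopted by Xu et al. (2023) may differ.

```python
import numpy as np

def excess_green(rgb):
    """ExG = 2g - r - b on chromaticity-normalized channels, a common
    visible-light index that raises vegetation contrast against soil."""
    rgb = rgb.astype(np.float32)
    s = rgb.sum(axis=2, keepdims=True) + 1e-6       # avoid division by zero
    r, g, b = np.moveaxis(rgb / s, 2, 0)            # normalized channels
    return 2 * g - r - b

def vegetation_mask(rgb, thresh=0.1):
    """Binary vegetation map via a simple fixed threshold (illustrative)."""
    return excess_green(rgb) > thresh
```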
To illustrate the progress of deep learning-based weed segmentation, Table 2 lists representative segmentation models reported in recent years, highlighting the target of detection, the segmentation network employed, and performance metrics in typical agricultural scenarios. This facilitates a clear comparison of different models in accurately depicting weed distribution and shape.
Table 2 presents representative deep learning-based weed segmentation models reported in recent years, and the advantages and limitations of these approaches warrant further discussion. Deep learning-based segmentation methods can achieve pixel-level accuracy, enabling precise delineation of weed distribution and morphology, which is particularly important in complex field environments where weeds grow in close proximity to crops or are partially occluded. In addition, their multi-scale feature extraction capability allows these models to handle weeds of varying sizes, and the resulting segmentation maps can be directly applied to downstream tasks such as precision spraying, yield estimation, or weed density analysis, thereby enhancing the level of agricultural intelligence. Moreover, pixel-level segmentation reduces human judgment bias and improves detection reliability.
However, these methods face several challenges in practical applications. First, the high diversity and complex morphology of weeds make them heavily dependent on large-scale pixel-level annotated datasets, which are costly and labor-intensive to acquire. Second, achieving high-precision pixel-level segmentation often increases computational demands, posing greater challenges for real-time inference. Third, environmental factors such as illumination variation, shadows, and color similarity between weeds and crops can negatively affect segmentation accuracy, and the generalization ability of models across different regions, crop types, or seasons may be limited, often requiring fine-tuning or transfer learning. In summary, while deep learning-based weed segmentation provides highly precise information, trade-offs in terms of data requirements, computational cost, and generalization need to be carefully considered.
2.4 Image classification methods for weed detection
The advantage of using deep learning for image classification lies in its ability to automatically learn features from images through end-to-end training, enabling efficient and accurate automated image categorization. In recent years, machine learning, particularly deep learning, has been widely applied to various image classification tasks across fields such as medicine (Lei and Li, 2025; Xu et al., 2022), sports analysis (Wang et al., 2018), agriculture (Peng and Wang, 2021), and aerospace (Liu et al., 2020), significantly enhancing both model generalization and classification performance.
Based on the aforementioned advances, the application of deep learning-based classification models in weed detection has been gradually increasing. By automatically extracting discriminative features from crop and weed images, these models can classify vegetation at both the species and growth stage levels, enabling rapid and accurate weed identification in complex field environments. Such approaches reduce the reliance on manually designed features inherent in traditional machine learning and facilitate high-quality analysis in large-scale agricultural scenarios, providing strong support for precision weed management and site-specific herbicide application. For example, Almalky and Ahmed (2023) employed UAV-acquired datasets of four growth stages of Consolida regalis and applied deep learning models including YOLOv5, RetinaNet, and Faster R-CNN. These models achieved real-time weed detection and growth-stage classification, with YOLOv5-small reaching a maximum recall of 0.794 and RetinaNet (with a ResNet-101-FPN backbone) attaining an average precision of 87.457%, demonstrating high accuracy and precision.
However, because image classification assigns only a category label to an entire image and provides no information about where weeds occur within it, classification models cannot localize weeds on their own and must be paired with additional localization strategies. For example, Jin et al. (2022) proposed a grid-based approach in which each image is divided into sub-images (grid cells) and the presence of weeds in each cell is determined, thereby localizing weeds within the image. They also evaluated several convolutional neural network architectures, including DenseNet, EfficientNetV2, ResNet, RegNet, and VGGNet, for weed detection and classification, conducting (i) a multi-class classification task to distinguish turfgrass from multiple weed species and (ii) a binary classification task to differentiate weeds from turfgrass, thereby assessing the feasibility and effectiveness of different networks for accurate weed detection and classification in turfgrass. Table 3 lists other deep learning classification models applied to weed detection, along with representative examples of their applications.
Table 3. Representative deep-learning-based image classification models for in-field weed detection.
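The grid-based localization strategy of Jin et al. (2022) can be outlined as follows: the image is tiled into fixed-size cells, each cell is passed through a trained classifier, and cells predicted as weed provide coarse location cues. This is an illustrative sketch, assuming a PyTorch classifier and a hypothetical weed class index.

```python
import torch

def grid_weed_localization(image, classifier, cell=64, weed_class=1):
    """Tile a (C, H, W) image into cells, classify each cell, and return
    the (row, col) indices of cells predicted as weed."""
    _, h, w = image.shape
    weedy_cells = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            patch = image[:, r:r + cell, c:c + cell].unsqueeze(0)
            with torch.no_grad():
                pred = classifier(patch).argmax(dim=1).item()
            if pred == weed_class:
                weedy_cells.append((r // cell, c // cell))
    return weedy_cells   # coarse weed locations for a spraying controller
```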
Deep learning-based weed classification models can automatically learn discriminative features from images through end-to-end training, enabling rapid and high-accuracy weed identification in complex field environments without relying on handcrafted features (Saleh et al., 2024). Moreover, these classification models are capable of distinguishing weeds at different growth stages or of different species, providing valuable support for category-specific precision herbicide application, weed management decisions, and crop yield prediction. Lightweight classification models further allow real-time inference while maintaining high detection accuracy, making them suitable for deployment on UAVs or edge devices. However, these classification models also have inherent limitations. They can only determine the category of an image or sub-image and cannot directly provide precise spatial information on weed locations, which necessitates integration with additional detection or segmentation methods. In addition, high-accuracy deep models typically rely on large-scale annotated datasets, which are costly and time-consuming to acquire. Furthermore, the generalization ability of deep learning models may be limited across different fields, crop types, or varying lighting conditions.
2.5 Advantages and limitations of deep learning-based weed detection
Deep learning has enabled significant progress in weed detection by providing three complementary modeling paradigms—classification, object detection, and image segmentation—each offering different levels of spatial and semantic detail. Classification models focus on discriminating weed species or growth stages, object detection models provide bounding-box level spatial localization, and segmentation models deliver pixel-level delineation of weed morphology. Together, these models support a broad range of downstream tasks in precision agriculture, such as targeted herbicide application, weed density mapping, and vegetation monitoring. Their ability to extract hierarchical features directly from raw images enables strong robustness against illumination variation, occlusion, and background complexity in field environments.
Despite these advantages, each model type exhibits inherent limitations that affect its suitability for different application scenarios. Classification models cannot provide explicit spatial localization and therefore require additional mechanisms—such as grid partitioning or region proposals—to determine weed positions. Object detection models suffer from reduced accuracy in dense crop–weed canopies and often miss small weeds, particularly under occlusion or strong illumination variability. Segmentation models achieve the highest spatial precision but incur substantial annotation cost and computational burden, making real-time edge deployment challenging. Selecting an appropriate model thus requires balancing accuracy, spatial detail, computational constraints, and annotation requirements. Table 4 summarizes the major advantages and limitations of the three model categories.
3 Discussion
The fundamental limitation of deep learning-based weed detection does not lie in the network architectures themselves, but rather in the substantial cost and difficulty of constructing high-quality annotated datasets. Although object detection, image segmentation, and image classification models have demonstrated impressive recognition capabilities, all these approaches rely heavily on large quantities of accurately labeled images. In real agricultural environments, however, weeds exhibit extensive species diversity, strong morphological variability, and pronounced spatiotemporal changes across regions, seasons, and growth stages. Meanwhile, pixel-level segmentation requires dense annotations, which are extremely expensive and time-consuming to produce at scale. The long-tail distribution of weed species further aggravates the data imbalance problem, resulting in weak generalization when models encounter rare or unseen species in new environments. Consequently, the dependence on large, meticulously annotated datasets has become one of the major bottlenecks hindering the widespread deployment of deep learning models in real-world precision agriculture.
To alleviate the burden of data annotation, current research has explored two main strategies: indirect weed detection and semi-supervised learning. Indirect weed detection reduces annotation costs by shifting the recognition focus from weeds to crops; only crop images need to be labeled, and non-crop green regions are subsequently treated as weeds (Jin et al., 2022c). While this approach is practical for fields with a single crop species, its performance is constrained by several factors: vegetation extraction using traditional color-based thresholds is highly sensitive to illumination changes; background objects such as soil, straw, and shadows can easily be misclassified as weeds; and most importantly, this method cannot differentiate weed species—an essential requirement for species-specific herbicide application and resistance management. The second strategy, semi-supervised learning, leverages large quantities of unlabeled field images to augment training via pseudo-labels, consistency regularization, or teacher–student frameworks (Kong et al., 2024b). This approach substantially reduces annotation demands and enhances the model’s ability to generalize across regions and seasons. Nevertheless, semi-supervised methods remain susceptible to pseudo-label noise and may struggle with small weeds or heavy occlusion. Overall, indirect detection offers a low-cost entry point for practical deployment, whereas semi-supervised learning provides a promising pathway for long-term, scalable, and region-adaptive weed detection. Future work will likely integrate both strategies to reduce annotation cost while improving robustness in complex agricultural environments.
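As an illustration of the pseudo-labeling component of such semi-supervised pipelines, a trained teacher model can label unlabeled field images, and only high-confidence predictions are retained for retraining. The following is a minimal sketch; the confidence threshold is an illustrative choice, and published methods add consistency regularization and other safeguards against pseudo-label noise.

```python
import torch

def pseudo_label_batch(teacher, unlabeled_images, conf_thresh=0.9):
    """Keep only unlabeled samples whose teacher prediction is confident;
    the retained (image, pseudo-label) pairs augment the labeled set."""
    teacher.eval()
    with torch.no_grad():
        probs = torch.softmax(teacher(unlabeled_images), dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf > conf_thresh            # filter out low-confidence labels
    return unlabeled_images[keep], labels[keep]
```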
From an operational perspective, different deep learning models vary substantially in their compatibility with weeding actuators. Object detection and image segmentation models can directly provide spatial information—through bounding boxes or pixel-level masks—which enables seamless integration with sprayers, mechanical cutters, robotic arms, or other weeding mechanisms. For example, precision sprayers can trigger individual nozzles based on the center of a detection box, whereas robotic manipulators can plan motion trajectories according to the contours extracted from segmentation masks (Qi et al., 2023). In contrast, image classification networks output only categorical labels without location information, making them unsuitable for direct actuator control. Despite their advantages in inference speed, compactness, and ease of deployment, classification models alone can only determine the presence of weeds at the image level. To address this limitation, Jin et al. (2022) proposed a grid-based classification strategy in which field images are divided into multiple spatial cells, and each cell is independently classified to approximate weed locations. This approach retains the computational efficiency of classification networks while providing coarse spatial cues, making it suitable for real-time or low-power applications where only approximate localization is required.
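The mapping from a detection box to an actuator command can be very simple in the nozzle-triggering case described above. The sketch below assumes a uniform bank of nozzles spanning the camera's field of view, with each box center assigned to the nozzle covering its image strip; real systems additionally require the coordinate and timing corrections discussed next.

```python
def nozzles_to_trigger(boxes, image_width, n_nozzles=8):
    """Map each detection-box center to the nozzle covering that strip of
    the camera's field of view (uniform nozzle spacing assumed)."""
    strip = image_width / n_nozzles
    return sorted({
        min(int(((x0 + x1) / 2) // strip), n_nozzles - 1)
        for x0, y0, x1, y1 in boxes
    })

# e.g. two weeds near the left edge of a 1280-px-wide frame:
print(nozzles_to_trigger([(40, 300, 90, 360), (150, 500, 210, 580)], 1280))
# -> [0, 1]
```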
In practical deployment, the choice of deep learning model is further constrained by hardware capabilities, communication delays, platform dynamics, and actuator response characteristics. High-precision segmentation models often require substantial computational resources, making real-time inference on edge devices—such as UAVs, autonomous weeding robots, or tractor-mounted terminals—challenging. Lightweight detection or classification models are better suited for devices with limited compute budgets, although they may offer reduced recognition accuracy. Consequently, actual system design must balance accuracy, inference speed, power consumption, and hardware cost while also accounting for vehicle speed, nozzle response time, mechanical inertia, and allowable control latency. Moreover, integration between perception models and weeding actuators often requires intermediate modules for spatial coordinate transformation, temporal synchronization, and delay compensation—for instance, mapping image coordinates to spraying coordinates, predicting future weed positions based on vehicle motion, or using real-time depth maps to adjust operation height. With the ongoing advancement of embedded AI accelerators, model compression techniques, and edge computing, future intelligent weeding systems are expected to become more lightweight, modular, and real-time, enabling a complete pipeline that evolves from “recognition–localization–actuation” toward “prediction–planning–closed-loop control”.
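As a minimal example of such delay compensation, the forward travel accumulated during the total perception-to-actuation latency can be subtracted from the detected weed offset, assuming constant vehicle speed over the latency window; real systems would also account for acceleration and nozzle dynamics.

```python
def compensated_trigger_position(x_detect_m, v_mps, t_latency_s):
    """Remaining travel distance before firing: a weed seen x_detect meters
    ahead of the nozzle line is reached v * t_latency meters sooner once
    perception, communication, and valve delays have elapsed."""
    return x_detect_m - v_mps * t_latency_s

# Example: weed detected 0.50 m ahead, vehicle at 1.5 m/s, 120 ms total delay:
remaining = compensated_trigger_position(0.50, 1.5, 0.120)
print(f"{remaining:.2f} m")   # 0.32 m of travel left before the nozzle fires
```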
From the perspective of weed management strategies, future weeding approaches are expected to adopt a multi-strategy integration. In inter-row areas, where weed density is relatively low, automated mechanical weeding can be effectively applied. Conversely, in intra-row areas, mechanical weeding may damage crops; therefore, site-specific chemical weeding using robotic systems based on deep learning–driven weed localization is recommended. This approach not only protects crops but also reduces herbicide usage, minimizing potential harm to both crops and the environment. Furthermore, with the rapid development of low-altitude aerial technologies, future strategies may combine UAV-based image acquisition to determine weed locations, generate prescription maps, and guide ground-based weeding robots for precise spot treatment. Such an integrated approach is anticipated to provide a more efficient and intelligent solution for precision agriculture.
In summary, the future of weed management is expected to advance toward a diversified, intelligent, and precision-integrated approach. By combining mechanical, chemical, and deep learning-based intelligent localization techniques, optimal weed control strategies can be selected according to the spatial distribution of weeds and the growth status of crops in different field zones. This enables the efficient removal of weeds while minimizing damage to crops and the surrounding environment. Such multi-strategy, intelligent weed management approaches not only enhance agricultural productivity but also promote sustainable agricultural practices, supporting the development of eco-friendly and high-efficiency precision farming.
4 Conclusion
Deep learning has emerged as a powerful foundation for modern weed detection, enabling substantial improvements in robustness, accuracy, and automation compared with traditional machine vision. This review synthesized recent advances across three major model families—object detection, image segmentation, and image classification—and highlighted their respective strengths in spatial localization, pixel-level delineation, and lightweight deployment. Despite this progress, practical adoption remains constrained by the high cost of constructing diverse annotated datasets, limited cross-region generalization, and the computational demands of real-time inference on edge devices. Promising future directions include data-efficient learning strategies (e.g., semi-supervised and domain-adaptive methods), perception–actuation co-design for tighter integration with mechanical and chemical weeding systems, and multi-strategy intelligent weed management that combines deep learning, UAV sensing, precision spraying, and robotic execution. Collectively, these developments point toward a future in which deep learning enables more scalable, sustainable, and precision-integrated weed control across diverse agricultural environments.
Author contributions
HZ: Writing – original draft, Data curation, Methodology, Investigation, Conceptualization. YW: Formal analysis, Methodology, Visualization, Data curation, Conceptualization, Funding acquisition, Investigation, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This research was supported by the National Natural Science Foundation of China (No. 52275192) and the “Blue and blue project” of universities in Jiangsu Province.
Conflict of interest
The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Agarwal, M., Gupta, S. K., and Biswas, K. K. (2023). Development of A compressed fcn architecture for semantic segmentation using particle swarm optimization. Neural Computing And Appl. 35, 11833–11846. doi: 10.1007/s00521-023-08324-3
Ahmad, A., Saraswat, D., Aggarwal, V., Etienne, A., and Hancock, B. (2021). Performance of deep learning models for classifying and detecting common weeds in corn and soybean production systems. Comput. And Electron. In Agric. 184, 106081. doi: 10.1016/j.compag.2021.106081
Ajamian, C., Chang, H.-C., Tomkins, K., Farebrother, W., Heim, R., and Rahman, S. (2021). Identifying invasive weed species in alpine vegetation communities based on spectral profiles. Geomatics. 1, 177–191. doi: 10.3390/geomatics1020011
Alirezazadeh, P., Schirrmann, M., and Stolzenburg, F. (2024). A comparative analysis of deep learning methods for weed classification of high-resolution uav images. J. Of Plant Dis. And Prot. 131, 227–236. doi: 10.1007/s41348-023-00814-9
Almalky, A. M. and Ahmed, K. R. (2023). Deep learning for detecting and classifying the growth stages of Consolida regalis weeds on fields. Agronomy. 13, 934. doi: 10.3390/agronomy13030934
Argunşah, A.Ö, Erdil, E., and Ünay, D. (2025). “Applications of computer vision and machine learning in bioimaging,” in Bioimaging modalities in bioengineering. Eds. Cengiz, I. F., Oliveira, J. M., and Reis, R. L. (Springer Nature Switzerland, Cham).
Ates, G. C., Mohan, P., and Celik, E. (2023). Dual cross-attention for medical image segmentation. Eng. Appl. Of Artif. Intell. 126, 107139. doi: 10.1016/j.engappai.2023.107139
Atkins, K., Garzón-Martínez, G. A., Lloyd, A., Doonan, J. H., and Lu, C. (2025). Unlocking the power of ai for phenotyping fruit morphology in arabidopsis. Gigascience. 14, Giae123. doi: 10.1093/gigascience/giae123
Bairambekov, S. B., Sokolova, G. F., Gar’yanova, E. D., Dubrovin, N. K., and Sokolov, A. S. (2016). Harmfulness of weed plants in crops of vegetables and melons. Biosci. Biotechnol. Res. Asia. 13, 1929–1943. doi: 10.13005/bbra/2347
Balas, P., Makavana, J., Mohnot, P., Jhala, K., and Yadav, R. (2022). Inter and intra row weeders: A review. Curr. J. Of Appl. Sci. And Technol. 41, 1–9. doi: 10.9734/cjast/2022/v41i2831789
Chen, A. I., Balter, M. L., Maguire, T. J., and Yarmush, M. L. (2020). Deep learning robotic guidance for autonomous vascular access. Nat. Mach. Intell. 2, 104–115. doi: 10.1038/s42256-020-0148-7
Chu, Z. and Yu, J. (2020). An end-to-end model for rice yield prediction using deep learning fusion. Comput. And Electron. In Agric. 174, 105471. doi: 10.1016/j.compag.2020.105471
Darbyshire, M., Salazar-Gomez, A., Gao, J., Sklar, E. I., and Parsons, S. (2023). Towards practical object detection for weed spraying in precision agriculture. Front. In Plant Sci. 14. doi: 10.3389/fpls.2023.1183277
Das, M. and Bais, A. (2021). Deepveg: deep learning model for segmentation of weed, canola, and canola flea beetle damage. IEEE Access 9, 119367–119380. doi: 10.1109/ACCESS.2021.3108003
Deng, R., Tao, M., Xing, H., Yang, X., Liu, C., Liao, K., et al. (2021). Automatic diagnosis of rice diseases using deep learning. Front. In Plant Sci. 12. doi: 10.3389/fpls.2021.701038
Deng, X., Qi, L., Liu, Z., Liang, S., Gong, K., and Qiu, G. (2023). Weed target detection at seedling stage in paddy fields based on yolox. PloS One 18, E0294709. doi: 10.1371/journal.pone.0294709
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., et al. (2020). An image is worth 16x16 words: transformers for image recognition at scale. arXiv abs/2010.11929. doi: 10.48550/arXiv.2010.11929
Dou, P., Shen, H., Li, Z., Guan, X., and Huang, W. (2021). Remote sensing image classification using deep–shallow learning. IEEE J. Of Selected Topics In Appl. Earth Observations And Remote Sens. 14, 3070–3083. doi: 10.1109/JSTARS.2021.3062635
Etienne, A., Ahmad, A., Aggarwal, V., and Saraswat, D. (2021). Deep learning-based object detection system for identifying weeds using uas imagery. Remote Sens. 13, 5182. doi: 10.3390/rs13245182
Fawakherji, M., Potena, C., Pretto, A., Bloisi, D. D., and Nardi, D. (2021). Multi-spectral image synthesis for crop/weed segmentation in precision farming. Robotics And Autonomous Syst. 146, 103861. doi: 10.1016/j.robot.2021.103861
Genze, N., Ajekwe, R., Güreli, Z., Haselbeck, F., Grieb, M., and Grimm, D. G. (2022). Deep learning-based early weed segmentation using motion blurred uav images of sorghum fields. Comput. And Electron. In Agric. 202, 107388. doi: 10.1016/j.compag.2022.107388
Glaeser, A., Selvaraj, V., Lee, S., Hwang, Y., Lee, K., Lee, N., et al. (2021). Applications of deep learning for fault detection in industrial cold forging. Int. J. Of Production Res. 59, 4826–4835. doi: 10.1080/00207543.2021.1891318
Grigorescu, S., Trasnea, B., Cocias, T., and Macesanu, G. (2020). A survey of deep learning techniques for autonomous driving. J. Of Field Robotics 37, 362–386. doi: 10.1002/rob.21918
Hasan, A. S. M. M., Sohel, F., Diepeveen, D., Laga, H., and Jones, M. G. K. (2021). A survey of deep learning techniques for weed detection from images. Comput. And Electron. In Agric. 184, 106067. doi: 10.1016/j.compag.2021.106067
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2020). Mask R-cnn. IEEE Trans. On Pattern Anal. And Mach. Intell. 42, 386–397. doi: 10.1109/TPAMI.2018.2844175
He, K., Zhang, X., Ren, S., and Sun, J. (2015). “Deep residual learning for image recognition,” in 2016 Ieee Conference On Computer Vision And Pattern Recognition (Cvpr). (IEEE), 770–778.
Hoorali, F., Khosravi, H., and Moradi, B. (2022). Irunet for medical image segmentation. Expert Syst. With Appl. 191, 116399. doi: 10.1016/j.eswa.2021.116399
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv abs/1704.04861. doi: 10.48550/arXiv.1704.04861
Huang, H., Deng, J., Lan, Y., Yang, A., Deng, X., Wen, S., et al. (2018). Accurate weed mapping and prescription map generation based on fully convolutional networks using uav imagery. Sensors. 18, 3299. doi: 10.3390/s18103299
Huang, G., Liu, Z., and Weinberger, K. Q. (2016). “Densely connected convolutional networks,” in 2017 Ieee Conference On Computer Vision And Pattern Recognition (Cvpr). (IEEE), 2261–2269.
Jin, X., Bagavathiannan, M., Mccullough, P. E., Chen, Y., and Yu, J. (2022). A deep learning-based method for classification, detection, and localization of weeds in turfgrass. Pest Manage. Sci. 78, 4809–4821. doi: 10.1002/ps.7102
Jin, H., Han, K., Xia, H., Xu, B., and Jin, X. (2025). Detection of weeds in vegetables using image classification neural networks and image processing. Front. In Phys. 13. doi: 10.3389/fphy.2025.1496778
Jin, X., Sun, Y., Che, J., Bagavathiannan, M., Yu, J., and Chen, Y. (2022c). A novel deep learning-based method for detection of weeds in vegetables. Pest Manage. Sci. 78, 1861–1869. doi: 10.1002/ps.6804
Kaur, S., Kaur, R., and Chauhan, B. S. (2018). Understanding crop-weed-fertilizer-water interactions and their implications for weed management in agricultural systems. Crop Prot. 103, 65–72. doi: 10.1016/j.cropro.2017.09.011
Kim, M., Yun, J., Cho, Y., Shin, K., Jang, R., Bae, H.-J., et al. (2019). Deep learning in medical imaging. Neurospine 16, 657–668. doi: 10.14245/ns.1938396.198
Kong, X., Liu, T., Chen, X., Jin, X., Li, A., and Yu, J. (2024a). Efficient crop segmentation net and novel weed detection method. Eur. J. Of Agron. 161, 127367. doi: 10.1016/j.eja.2024.127367
Kong, X., Liu, T., Chen, X., Lian, P., Zhai, D., Li, A., et al. (2024b). Exploring the semi-supervised learning for weed detection in wheat. Crop Prot. 184, 106823. doi: 10.1016/j.cropro.2024.106823
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). Imagenet classification with deep convolutional neural networks. Commun. ACM 60, 84–90. doi: 10.1145/3065386
Lecun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444. doi: 10.1038/nature14539
Lei, L. and Li, W. (2025). Transformer-based multi-task model for lung tumor segmentation and classification in ct images. J. Of Radiat. Res. And Appl. Sci. 18, 101657. doi: 10.1016/j.jrras.2025.101657
Li, Z., Wang, D., Yan, Q., Zhao, M., Wu, X., and Liu, X. (2024b). Winter wheat weed detection based on deep learning models. Comput. And Electron. In Agric. 227, 109448. doi: 10.1016/j.compag.2024.109448
Li, G., Zhao, B., and Li, X. (2024a). Image harmonization with simple hybrid cnn-transformer network. Neural Networks 180, 106673. doi: 10.1016/j.neunet.2024.106673
Lin, J., Zhang, K., Yang, X., Cheng, X., and Li, C. (2022). Infrared dim and small target detection based on U-transformer. J. Of Visual Communication And Image Representation 89, 103684. doi: 10.1016/j.jvcir.2022.103684
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., et al. (2016). “Single shot multibox detector,” in Computer vision – eccv 2016. Eds. Leibe, B., Matas, J., Sebe, N., and Welling, M. (Springer International Publishing, Cham), 21–37.
Liu, Y., Han, Z., Chen, C., Ding, L., and Liu, Y. (2020). Eagle-eyed multitask cnns for aerial image retrieval and scene classification. IEEE Trans. On Geosci. And Remote Sens. 58, 6699–6721. doi: 10.1109/TGRS.2020.2979011
Liu, R., Ruichek, Y., and El Bagdouri, M. (2021). Multispectral background subtraction with deep learning. J. Of Visual Communication And Image Representation 80, 103267. doi: 10.1016/j.jvcir.2021.103267
Lu, Y., Cheng, C., Zhao, D., Zhu, D., and Gao, Q. (2025). Infrared small target detection algorithm based on nested fpn and interference suppression. Expert Syst. With Appl. 274, 127029. doi: 10.1016/j.eswa.2025.127029
Ma, X., Deng, X., Qi, L., Jiang, Y., Li, H., Wang, Y., et al. (2019). Fully convolutional network for rice seedling and weed image segmentation at the seedling stage in paddy fields. PloS One 14, E0215676. doi: 10.1371/journal.pone.0215676
Minaee, S., Boykov, Y., Porikli, F., Plaza, A., Kehtarnavaz, N., and Terzopoulos, D. (2021). Image segmentation using deep learning: A survey. IEEE Trans. On Pattern Anal. And Mach. Intell. 44, 3523–3542. doi: 10.1109/TPAMI.2021.3059968
Nistor, S. C., Ileni, T. A., and Dărăbant, A. S. (2020). Automatic development of deep learning architectures for image segmentation. Sustainability 12 (22), 9707. doi: 10.3390/su12229707
Pan, X., Cen, Y., Ma, Y., Yan, W., Gao, X., Liu, X., et al. (2016). Identification of gramineous grass seeds using gabor and locality preserving projections. Multimedia Tools And Appl. 75, 16551–16576. doi: 10.1007/s11042-016-3424-0
Papadeas, I., Tsochatzidis, L., Amanatiadis, A., and Pratikakis, I. (2021). Real-time semantic image segmentation with deep learning for autonomous driving: A survey. Appl. Sci. 11, 8802. doi: 10.3390/app11198802
Peng, Y. and Wang, Y. (2021). An industrial-grade solution for agricultural image classification tasks. Comput. And Electron. In Agric. 187, 106253. doi: 10.1016/j.compag.2021.106253
Qi, L., Gan, Z., Hua, Z., Du, D., Jiang, W., and Sun, Y. (2023). Cleaning of object surfaces based on deep learning: A method for generating manipulator trajectories using rgb-D semantic segmentation. Neural Computing And Appl. 35, 8677–8692. doi: 10.1007/s00521-022-07930-x
Rai, N. and Sun, X. (2024). Weedvision: A single-stage deep learning architecture to perform weed detection and segmentation using drone-acquired images. Comput. And Electron. In Agric. 219, 108792. doi: 10.1016/j.compag.2024.108792
Reddy, K. N. and James, R. R. (2018). Introduction to the symposium on precision agriculture and weed science. Weed Technol. 32, 1–1. doi: 10.1017/wet.2018.2
Redmon, J., Divvala, S. K., Girshick, R. B., and Farhadi, A. (2015). “You only look once: unified, real-time object detection,” in 2016 Ieee Conference On Computer Vision And Pattern Recognition (Cvpr). (IEEE), 779–788.
Rehman, T. U., Zaman, Q. U., Chang, Y. K., Schumann, A. W., Corscadden, K. W., and Esau, T. J. (2018). Optimising the parameters influencing performance and weed (Goldenrod) identification accuracy of colour co-occurrence matrices. Biosyst. Eng. 170, 85–95. doi: 10.1016/j.biosystemseng.2018.04.002
Ren, S., He, K., Girshick, R., and Sun, J. (2017). Faster R-cnn: towards real-time object detection with region proposal networks. IEEE Trans. On Pattern Anal. And Mach. Intell. 39, 1137–1149. doi: 10.1109/TPAMI.2016.2577031
Saleh, A., Olsen, A., Wood, J., Philippa, B., and Rahimi Azghadi, M. (2024). Weedclr: weed contrastive learning through visual representations with class-optimized loss in long-tailed datasets. Comput. And Electron. In Agric. 227, 109526. doi: 10.1016/j.compag.2024.109526
Shaner, D. L. (1995). Herbicide resistance: where are we? How did we get here? Where are we going? Weed Technol. 9, 850–856. doi: 10.1017/S0890037X00024325
Sharma, N., Gupta, S., Al-Yarimi, F. A. M., Ghadi, Y. Y., Bharany, S., Rehman, A. U., et al. (2025). Dba-deeplab: dual-backbone attention-enhanced deeplab V3+ Model for plant disease segmentation. Food Sci. Nutr. 13, E70668. doi: 10.1002/fsn3.70668
Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv abs/1409.1556. doi: 10.48550/arXiv.1409.1556
Sun, J., You, J., Li, F., Sun, J., Yang, M., Zhao, X., et al. (2024). Research on improvement strategies for A lightweight multi-object weed detection network based on yolov5. Crop Prot. 186, 106912. doi: 10.1016/j.cropro.2024.106912
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S. E., Anguelov, D., et al. (2014). Going deeper with convolutions. 2015 IEEE Conf. On Comput. Vision And Pattern Recognition (Cvpr). 1–9. doi: 10.1109/CVPR.2015.7298594
Tan, M. and Le, Q. V. (2019). Efficientnet: rethinking model scaling for convolutional neural networks. arXiv abs/1905.11946. doi: 10.48550/arXiv.1905.11946
Teimouri, N., Jørgensen, R. N., and Green, O. (2022). Novel assessment of region-based cnns for detecting monocot/dicot weeds in dense field environments. Agronomy. 12, 1167. doi: 10.3390/agronomy12051167
Vilà, M., Beaury, E. M., Blumenthal, D. M., Bradley, B. A., Early, R., Laginhas, B. B., et al. (2021). Understanding the combined impacts of weeds and climate change on crops. Environ. Res. Lett. 16, 034043. doi: 10.1088/1748-9326/abe14b
Wang, Q., Cheng, M., Xiao, X., Yuan, H., Zhu, J., Fan, C., et al. (2021). An image segmentation method based on deep learning for damage assessment of the invasive weed solanum rostratum dunal. Comput. And Electron. In Agric. 188, 106320. doi: 10.1016/j.compag.2021.106320
Wang, P., Jiang, A., Liu, X., Shang, J., and Zhang, L. (2018). Lstm-based eeg classification in motor imagery tasks. IEEE Trans. On Neural Syst. And Rehabil. Eng. 26, 2086–2095. doi: 10.1109/TNSRE.2018.2876129
Wang, X., Tang, J., and Whitty, M. (2022). Data-centric analysis of on-tree fruit detection: experiments with deep learning. Comput. And Electron. In Agric. 194, 106748. doi: 10.1016/j.compag.2022.106748
Wicks, G. A., Burnside, O. C., and Felton, W. L. (2017). “Mechanical weed management,” in Handbook of weed management systems. (London, UK: Routledge).
Xu, B., Fan, J., Chao, J., Arsenijevic, N., Werle, R., and Zhang, Z. (2023). Instance segmentation method for weed detection using uav imagery in soybean fields. Comput. And Electron. In Agric. 211, 107994. doi: 10.1016/j.compag.2023.107994
Xu, H., Li, T., Hou, X., Wu, H., Shi, G., Li, Y., et al. (2025). Key technologies and research progress of intelligent weeding robots. Weed Sci. 73, E25. doi: 10.1017/wsc.2024.95
Xu, J., Pan, Y., Pan, X., Hoi, S. C. H., Yi, Z., and Xu, Z. (2021). Regnet: self-regulated network for image classification. IEEE Trans. On Neural Networks And Learn. Syst. 34, 9562–9567. doi: 10.1109/TNNLS.2022.3158966
Xu, Y., Wen, G., Yang, P., Fan, B., Hu, Y., Luo, M., et al. (2022). Task-coupling elastic learning for physical sign-based medical image classification. IEEE J. Of Biomed. And Health Inf. 26, 626–637. doi: 10.1109/JBHI.2021.3106837
Xu, K., Yuen, P., Xie, Q., Zhu, Y., Cao, W., and Ni, J. (2024). Weedsnet: A dual attention network with rgb-D image for weed detection in natural wheat field. Precis. Agric. 25, 460–485. doi: 10.1007/s11119-023-10080-2
Yang, S., Lin, J., Cernava, T., Chen, X., and Zhang, X. (2025). Weeddetr: an efficient and accurate detection method for detecting small-target weeds in uav images. Weed Sci. 73, E84. doi: 10.1017/wsc.2025.10035
Yu, J., Sharpe, S. M., Schumann, A. W., and Boyd, N. S. (2019). Deep learning for image-based weed detection in turfgrass. Eur. J. Of Agron. 104, 78–84. doi: 10.1016/j.eja.2019.01.004
Zhao, S., Wen, Z., Qi, Q., Lam, K. M., and Shen, J. (2023). Learning fine-grained information with capsule-wise attention for salient object detection. IEEE Trans. On Multimedia. 1–14. doi: 10.1109/TMM.2023.3234436
Keywords: deep learning, image segmentation, image classification, object detection, weed detection
Citation: Zhao H and Wang Y (2026) Deep learning–based approaches for weed detection in crops. Front. Plant Sci. 16:1746406. doi: 10.3389/fpls.2025.1746406
Received: 14 November 2025; Revised: 05 December 2025; Accepted: 12 December 2025;
Published: 09 January 2026.
Edited by:
Xiaojun Jin, Nanjing Forestry University, China
Copyright © 2026 Zhao and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yan Wang, qqwangyan2006@163.com