Weed Detection in Perennial Ryegrass With Deep Learning Convolutional Neural Network

Precision herbicide application can substantially reduce herbicide input and weed control cost in turfgrass management systems. Intelligent spot-spraying systems predominantly rely on machine vision-based detectors for autonomous weed control. In this work, several deep convolutional neural networks (DCNN) were constructed for detection of dandelion (Taraxacum officinale Web.), ground ivy (Glechoma hederacea L.), and spotted spurge (Euphorbia maculata L.) growing in perennial ryegrass. When the networks were trained using a dataset containing a total of 15,486 negative images (perennial ryegrass with no target weeds) and 17,600 positive images (containing target weeds), VGGNet achieved high F1 scores (≥0.9278) and high recall values (≥0.9952) for detection of E. maculata, G. hederacea, and T. officinale growing in perennial ryegrass. The F1 scores of AlexNet ranged from 0.8437 to 0.9418 and were generally lower than those of VGGNet for detecting E. maculata, G. hederacea, and T. officinale. GoogLeNet was not effective at detecting these weed species, mainly because of its low precision values. DetectNet was effective and achieved high F1 scores (≥0.9843) in the testing datasets for detection of T. officinale growing in perennial ryegrass. Moreover, VGGNet had the highest Matthews correlation coefficient (MCC) values, while GoogLeNet had the lowest MCC values. Overall, the approach of training DCNN, particularly VGGNet and DetectNet, presents a clear path toward developing a machine vision-based decision system in smart sprayers for precision weed control in perennial ryegrass.


INTRODUCTION
Turfgrasses are the predominant vegetation cover in urban landscapes including golf courses, institutional and residential lawns, parks, roadsides, and sports fields (Milesi et al., 2005). Weed control is one of the most challenging issues in turfgrass management. Weeds compete with turfgrass for nutrients, sunlight, and water, and disrupt turfgrass aesthetics and functionality. Cultural practices, such as appropriate fertilization, irrigation, and mowing, can reduce weed infestation (Busey, 2003), but herbicides often provide the most effective weed control (McElroy and Martins, 2013). Conventional herbicide-based weed control relies on broadcast application, which sprays weed patches and pure turfgrass stands indiscriminately. Manual spot-spraying is time-consuming and expensive but is commonly practiced to reduce herbicide input and weed control cost.
Deep learning is a category of machine learning that allows a computer algorithm to learn and understand a dataset in terms of a hierarchy of concepts (Deng et al., 2009; LeCun et al., 2015). In recent years, deep learning has emerged as an effective approach in various scientific domains, including computer vision (LeCun et al., 2015; Kendall and Gal, 2017; Gu et al., 2018), natural language processing (Collobert and Weston, 2008; Collobert et al., 2011), and speech recognition (LeCun et al., 2015). Deep learning has proven to be a promising method in computer-assisted drug discovery and design (Gawehn et al., 2016), sentiment analysis and question answering (Ye et al., 2009; Bordes et al., 2014), predicting sequence specificities of DNA- and RNA-binding proteins (Alipanahi et al., 2015), and performing automatic brain tumor detection (Havaei et al., 2017).
Deep convolutional neural networks (DCNN) have an extraordinary ability to extract complex features from images (LeCun et al., 2015; Schmidhuber, 2015; Ghosal et al., 2018). They have been widely employed as a powerful tool to classify images and detect objects (LeCun et al., 2015). In the 2012 ImageNet competition, a DCNN effectively classified a 1000-class dataset containing approximately a million high-resolution images (Krizhevsky et al., 2012). In recent years, a growing number of companies such as Apple, Intel, Nvidia, and Tencent have utilized DCNN-based machine vision in facial recognition (Song et al., 2014; Parkhi et al., 2015), self-driving cars (LeCun et al., 2015; Jordan and Mitchell, 2015), real-time smart phone vision applications (Ronao and Cho, 2016), and target identification for robot grasping (Wang et al., 2016).
In agriculture, DCNN reliably detected various ecological crop stresses (Ghosal et al., 2018). For example, Mohanty et al. (2016) presented a DCNN that identified 14 crop species and 26 diseases with an overall accuracy of >99%. Moreover, DCNN-based machine vision can identify plants. For example, Grinblat et al. (2016) presented a DCNN that can reliably identify three legume species using plant vein morphological patterns, while dos Santos Ferreira et al. (2017) presented a DCNN that identified various broadleaf and grassy weeds in relation to soybean (Glycine max L. Merr.) and soil, with an overall accuracy of >99%. Recently, Teimouri et al. (2018) presented a method of automatically determining 18 weed species and their growth stages. Accurate image classification and object detection, along with fast image processing, are of paramount importance for real-time weed detection and precision herbicide application (Fennimore et al., 2016). Previous researchers noted that training a DCNN takes several hours with a high-performance graphics processing unit (GPU), while the image classification itself is fast (<1 second per image) (Mohanty et al., 2016; Chen et al., 2017; Sharpe et al., 2019; Yu et al., 2019a).
Dandelion (Taraxacum officinale Web.), ground ivy (Glechoma hederacea L.), and spotted spurge (Euphorbia maculata L.) are distributed throughout the continental United States (USDA-NRCS, 2018). These weed species are commonly found in various cool-season turfgrasses. Herbicides such as 2,4-D, dicamba, MCPP, triclopyr, and sulfentrazone are broadcast-applied in cool-season turfgrasses for postemergence (POST) control of various broadleaf weeds (McElroy and Martins, 2013; Reed et al., 2013; Johnston et al., 2016; Yu and McCullough, 2016a; Yu and McCullough, 2016b). Precision herbicide application using machine vision-based sprayers can substantially reduce herbicide input and weed control costs. The objective of this research was to examine the feasibility of using DCNN for detection of broadleaf weeds in perennial ryegrass.

Image acquisition
Images of E. maculata, G. hederacea, and T. officinale growing in perennial ryegrass, acquired at multiple golf courses and institutional lawns in Indianapolis, Indiana, United States (39.76° N, 86.15° W), were used in the training datasets. Images of E. maculata, G. hederacea, and T. officinale, acquired at multiple institutional lawns and golf courses in Carmel, Indiana, United States (39.97° N, 86.11° W), were used in testing dataset 1 (TD 1). Images of E. maculata and G. hederacea, acquired at multiple institutional lawns, roadsides, and parks in West Lafayette, Indiana, United States (40.42° N, 86.90° W), were used in testing dataset 2 (TD 2). Images of T. officinale taken at multiple institutional and residential lawns in Saskatoon, Saskatchewan, Canada (52.13° N, 106.67° W) were included in TD 2. The images acquired in Indiana and Saskatchewan were taken using a Sony® Cyber-Shot (SONY Corporation, Minato, Tokyo, Japan) and a Canon® EOS Rebel T6 digital camera (Canon Inc., Ohta-ku, Tokyo, Japan), respectively, at a resolution of 1920 × 1080 pixels. The camera heights were adjusted to a ground-sampling distance of 0.05 cm pixel⁻¹ during image acquisition. The images were acquired during the daytime from 9:00 AM to 5:00 PM and under various sunlight conditions including clear, partly cloudy, or cloudy days. The training and testing images were acquired at multiple times between August and September 2018.
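The reported resolution and ground-sampling distance together fix the ground area covered by each full-frame image. The following is a minimal sketch of that arithmetic; the function name `footprint_cm` is hypothetical and introduced only for illustration.

```python
# Ground footprint of one image, from its pixel dimensions and the
# ground-sampling distance (GSD) reported in the text.

def footprint_cm(width_px, height_px, gsd_cm_per_px):
    """Return (width, height) of the imaged ground area in centimetres."""
    return width_px * gsd_cm_per_px, height_px * gsd_cm_per_px

GSD_CM_PER_PX = 0.05        # ground-sampling distance, from the text
FULL_FRAME = (1920, 1080)   # acquisition resolution, from the text

width_cm, height_cm = footprint_cm(*FULL_FRAME, GSD_CM_PER_PX)
area_m2 = (width_cm / 100) * (height_cm / 100)
print(f"{width_cm:.0f} cm x {height_cm:.0f} cm = {area_m2:.4f} m^2")
```

Under these values each full frame covers roughly 96 cm × 54 cm of turf, or about half a square metre.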

Training and Testing
For training and testing image classification DCNN, images were cropped into 426 × 240 pixels using Irfanview (Version 5.50, Irfan Skiljan, Jajce, Bosnia). Images containing a single weed species were selected for training and testing image classification DCNN. The neural networks were trained using a training dataset containing either a single weed species (single-species neural network) or multiple weed species (multiple-species neural network). For training single-species neural networks, the E. maculata training dataset contained 6,180 negative images (images containing perennial ryegrass without target weeds) and 6,500 positive images (images containing perennial ryegrass infested with target weeds); the G. hederacea training dataset contained 4,470 negative and 4,600 positive images; and the T. officinale training dataset contained 4,836 negative and 6,500 positive images. These training datasets were used to train DCNN for detecting a single weed species growing in perennial ryegrass. For each weed species, a total of 630 negative and 630 positive images were used for each of the validation dataset (VD), TD 1, and TD 2.
The multiple-species neural networks were trained to evaluate the feasibility of using a single image classification DCNN to detect multiple weed species growing in perennial ryegrass. We trained AlexNet, GoogLeNet, and VGGNet using two training datasets (A and B). Training dataset A was a balanced dataset that contained 19,500 negative and 19,500 positive images (6,500 positive images per weed species). Training dataset B was an unbalanced dataset containing a total of 15,486 negative and 17,600 positive images (6,500 images for E. maculata; 4,600 images for G. hederacea; and 6,500 images for T. officinale). A total of 900 negative and 900 positive images (300 images for each weed species) were used for the VD. TD 1 and TD 2 for the multiple-species neural networks each contained 630 negative and 630 positive images.
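The totals reported for training dataset B coincide with the sums of the three single-species training sets, which suggests (although the text does not state it) that dataset B pools them. The sketch below checks that arithmetic and computes the resulting class balance; the dictionary layout and the function name `pooled_counts` are illustrative assumptions.

```python
# Per-species image counts of the single-species training sets, from the text.
SINGLE_SPECIES_COUNTS = {
    "E. maculata":   {"negative": 6180, "positive": 6500},
    "G. hederacea":  {"negative": 4470, "positive": 4600},
    "T. officinale": {"negative": 4836, "positive": 6500},
}

def pooled_counts(per_species):
    """Sum negative and positive image counts across all species."""
    negative = sum(c["negative"] for c in per_species.values())
    positive = sum(c["positive"] for c in per_species.values())
    return negative, positive

neg_b, pos_b = pooled_counts(SINGLE_SPECIES_COUNTS)
positive_share = pos_b / (neg_b + pos_b)

print(neg_b, pos_b)              # compare with the totals given for dataset B
print(round(positive_share, 3))  # fraction of positive images in the pool
```

The pooled set is only mildly unbalanced (slightly more than half of the images are positive), whereas dataset A is exactly balanced at 19,500 images per class.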
When training object detection DCNN, images were resized to 1280 × 720 pixels (720p) using Irfanview. A total of 810 images containing T. officinale growing in perennial ryegrass were used for training DetectNet. Training images were imported into custom software compiled using Lazarus (http://www.lazarus-ide.org/). Bounding boxes were drawn on the imported images to identify objects, and the program generated the corresponding text files used for DetectNet training. A total of 100 images containing T. officinale growing in perennial ryegrass were used in each of the VD, TD 1, and TD 2. For detection of T. officinale, the images of VD, TD 1, and TD 2 contained a total of 630, 446, and 157 individual weeds, respectively.
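The text does not specify the format of the label files produced by the custom annotation software. DetectNet training in DIGITS conventionally reads KITTI-style label files (one line per object, 15 space-separated fields), so the sketch below assumes that format. The function name, the class name "dandelion", and the box coordinates are hypothetical.

```python
def kitti_label_line(class_name, left, top, right, bottom):
    """Format one bounding box as a KITTI-style label line (15 fields).

    Only the class name and the 2D box are meaningful here. The other
    fields (truncation, occlusion, observation angle, 3D size, 3D
    location, yaw) are written as zeros because they do not apply to
    2D weed detection.
    """
    if not (left < right and top < bottom):
        raise ValueError("box must satisfy left < right and top < bottom")
    fields = [
        class_name, "0.0", "0", "0.0",            # type, truncated, occluded, alpha
        f"{left:.2f}", f"{top:.2f}",              # 2D box: left, top
        f"{right:.2f}", f"{bottom:.2f}",          # 2D box: right, bottom
        "0.0", "0.0", "0.0",                      # 3D height, width, length
        "0.0", "0.0", "0.0",                      # 3D location x, y, z
        "0.0",                                    # rotation_y
    ]
    return " ".join(fields)

# Hypothetical bounding box (in pixels) on a 1280 x 720 image.
boxes = [("dandelion", 412, 230, 655, 498)]
label_text = "\n".join(kitti_label_line(*box) for box in boxes)
print(label_text)
```

One such text file per image, sharing the image's base file name, is the usual layout for this format.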
Neural network training and testing were performed in the NVIDIA Deep Learning GPU Training System (DIGITS) (version 6.0.0, NVIDIA Corporation, Santa Clara, CA) using the Convolutional Architecture for Fast Feature Embedding (Caffe) (Jia et al., 2014). Networks were pre-trained using the ImageNet database (Deng et al., 2009) and KITTI dataset (Geiger et al., 2013). The following hyper-parameters were standardized to compare the results of all DCNN.
• Base learning rate: 0.03
• Batch accumulation: 5
• Batch size: 2
• Gamma: 0.95
• Learning rate policy: Exponential decay
• Solver type: AdaDelta
• Training epochs: 30
The results of validation and testing for all DCNN were arranged in binary confusion matrices of true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn) (Sokolova and Lapalme, 2009). In this context, tp represents the images containing target weeds that are correctly identified; tn represents the images containing turfgrass without target weeds that are correctly identified; fp represents the images without target weeds that are incorrectly identified as containing target weeds; and fn represents the images containing target weeds that are incorrectly identified as turfgrass without target weeds. We computed the precision (Equation 1), recall (Equation 2), F1 score (Equation 3), and Matthews correlation coefficient (MCC) (Equation 4) for each DCNN. The precision, recall, and F1 score values are unitless indices of predictive ability that range from 0 to 1; the higher the value, the better the predictive ability of the network. High precision indicates that few weed-free turfgrass images were misidentified as containing target weeds, while high recall indicates that few images containing target weeds were missed. MCC is a correlation coefficient between the observed and predicted binary classifications and ranges from -1 to +1. An MCC value of -1 represents total disagreement between observation and prediction, 0 indicates a prediction no better than random, and +1 represents a perfect prediction (Matthews, 1975).
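The four metrics described above can be computed directly from the confusion-matrix counts using their standard definitions (Sokolova and Lapalme, 2009; Matthews, 1975). The following is a minimal sketch; the function name, the convention of returning 0 when a denominator is zero, and the example counts are assumptions made for illustration and are not results from this study.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Compute precision, recall, F1 score, and MCC from confusion counts.

    A metric whose denominator is zero is reported as 0.0 (an assumed
    convention; other conventions exist).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical counts for a test set of 630 positive and 630 negative images.
metrics = binary_metrics(tp=620, tn=590, fp=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this hypothetical case recall is higher than precision: few weeds are missed, but some weed-free turf is flagged.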
Precision measures the ability of the neural network to accurately identify targets and was calculated using the following equation (Sokolova and Lapalme, 2009):
$$\mathrm{Precision} = \frac{tp}{tp + fp} \tag{1}$$
Recall measures the ability of the neural network to detect its target and was calculated using the following equation (Sokolova and Lapalme, 2009; Hoiem et al., 2012):
$$\mathrm{Recall} = \frac{tp}{tp + fn} \tag{2}$$
The F1 score is the harmonic mean of precision and recall (Sokolova and Lapalme, 2009), calculated by the following equation:
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$
MCC measures the quality of binary classifications, particularly the correlation between the actual class labels and predictions, and was determined by the following equation (Matthews, 1975):
$$\mathrm{MCC} = \frac{tp \times tn - fp \times fn}{\sqrt{(tp + fp)(tp + fn)(tn + fp)(tn + fn)}} \tag{4}$$
The capability of multiple-species neural networks to detect these weed species was further evaluated using TD 1 and TD 2 (each containing only a single weed species per dataset) (Table 3). AlexNet performed similarly whether trained using training dataset A or training dataset B. AlexNet demonstrated high recall (≥0.9619) but relatively low precision (≤0.7412) for detection of E. maculata and T. officinale in TD 1 and of T. officinale in TD 2, which reduced the F1 score values. The MCC value of GoogLeNet was the lowest, while the MCC value of VGGNet was the highest, among the multiple-species neural networks for both training datasets.
When the neural networks were trained for detecting multiple weed species, VGGNet exhibited excellent performance in detecting E. maculata and G. hederacea and achieved high F1 scores (≥0.9345) and recall values (≥0.9968) in TD 1 and TD 2. However, for detection of T. officinale, VGGNet trained using training dataset B exhibited considerably higher recall and F1 score values than when trained using training dataset A. GoogLeNet was ineffective at detecting these weed species, primarily due to its low precision values.
Both image classification and object detection DCNN can be used in the machine vision sub-system of smart sprayers, but each approach has pros and cons. Compared to object detection DCNN, training image classification DCNN takes less time because it does not involve drawing bounding boxes. DetectNet-based smart sprayers can target individual weeds as objects using narrow spray pattern nozzles, while this is less feasible with image classification DCNN. Except for T. officinale, the selected DCNN produced consistent results for weed detection, even across different geographical regions. When single-species neural networks were trained, AlexNet and VGGNet exhibited excellent precision and recall at detecting T. officinale in TD 1, but the recall values decreased in TD 2. The reduction in recall suggests that the network is more likely to misclassify target weeds as turfgrass. This is undesirable in field applications because weeds would be missed and left unsprayed, resulting in poor weed control. The cause is unknown, but we hypothesized that it was likely due to variations in the morphological characteristics of T. officinale between the training and testing images. Fortunately, this problem was overcome by training the multiple-species neural networks, particularly VGGNet.
DetectNet exhibited excellent performance at detecting T. officinale across different growth stages and densities (Figures 1A-E). Several images used for model testing in TD 1 had high weed densities. The bounding boxes generated by DetectNet failed to cover every single leaf of the target weeds when the testing images contained high weed densities (Figure 1E), which reduced the recall values. However, this is unlikely to be an issue in field applications because the great majority of weeds per image are detected. The few undetected leaves would likely fall into the spray zone if the herbicides are delivered using flat fan nozzles. In addition, several testing images in TD 2 contained smooth crabgrass [Digitaria ischaemum (Schreb.) Muhl]. In a few cases, DetectNet incorrectly detected smooth crabgrass as T. officinale (Figure 1B), which reduced precision. Increasing the number of training images containing crabgrass growing with T. officinale may remove this effect and increase the precision and overall accuracy. In previous research, DetectNet trained to detect Carolina geranium (Geranium carolinianum L.) in plastic-mulched strawberry crops was successfully desensitized to black medic (Medicago lupulina L.) leaves in a similar circumstance (Sharpe et al., 2018).
Table 3 notes: (a) Multiple-species neural networks were trained using the training dataset containing multiple weed species; TD 1 and TD 2 each contained 630 negative and 630 positive images. (b) Training dataset A contained a total of 19,500 negative and 19,500 positive images (6,500 images for each weed species). (c) Training dataset B contained a total of 15,486 negative and 17,600 positive images (6,500 images for E. maculata, 4,600 images for G. hederacea, and 6,500 images for T. officinale). Abbreviations: TD 1, testing dataset 1; TD 2, testing dataset 2; MCC, Matthews correlation coefficient.
In the present study, we evaluated the effectiveness of using a single DCNN to detect multiple weed species growing in perennial ryegrass. Simultaneous detection of multiple weed species is of paramount importance for precision herbicide application because weeds often grow in mixed stands and various postemergence herbicides, such as 2,4-D, carfentrazone, dicamba, and MCPP, are sprayed to provide broad-spectrum control of various broadleaf weeds (McElroy and Martins, 2013; Reed et al., 2013). For training purposes, E. maculata, G. hederacea, and T. officinale constituted a single category of objects to be discriminated from perennial ryegrass (Figure 2). The selected weed species exhibited tremendous differences in plant morphology, while perennial ryegrass was viewed under different turfgrass management regimes, mowing heights, and surface conditions (Figure 2). Meanwhile, weeds present in the training and testing images were at different growth stages, which added extra complexity for the machine learning algorithms. Surprisingly, VGGNet (trained using training dataset B) achieved high F1 scores (≥0.92), with high recall values (≥0.99), in the VD, TD 1, and TD 2. These results suggest that the simultaneous detection of multiple weed species with a single VGGNet is effective even when target weeds have distinct morphological structures and weed densities. AlexNet and VGGNet trained using training dataset A had higher precision and F1 scores but lower recall than when trained using training dataset B, in the VD and in most testing results in TD 1 and TD 2. Recall is a critical factor for precision herbicide application. High recall indicates that target weeds are more likely to be correctly identified, whereas low recall indicates that target weeds are more likely to be misidentified, leading to inadequate herbicide application and thus poor weed control. Regardless of the training dataset, GoogLeNet exhibited high recall but unacceptable precision. Low precision is undesirable because the resulting networks are more likely to misclassify turfgrass as target weeds, so that herbicide is sprayed where weeds do not occur.
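The practical meaning of recall and precision for a sprayer can be illustrated by converting them into counts of missed weeds and unnecessary sprays. The sketch below is illustrative only: the function name is hypothetical, the recall value is the lower bound reported for VGGNet, and the precision value of 0.87 is an assumed figure rather than a result from this study.

```python
def expected_errors(n_positive, recall, precision):
    """Translate recall and precision into error counts.

    n_positive: number of images that truly contain target weeds.
    Returns (missed, false_sprays): weed images left unsprayed (fn) and
    weed-free images sprayed unnecessarily (fp), as floats.
    """
    if not (0.0 <= recall <= 1.0 and 0.0 < precision <= 1.0):
        raise ValueError("recall must be in [0, 1] and precision in (0, 1]")
    tp = recall * n_positive              # weed images correctly flagged
    missed = n_positive - tp              # fn = positives that were not flagged
    false_sprays = tp * (1.0 - precision) / precision  # fp from precision = tp / (tp + fp)
    return missed, false_sprays

# 630 positive test images (from the text), recall at the reported lower
# bound of 0.9952, and an assumed precision of 0.87.
missed, false_sprays = expected_errors(630, recall=0.9952, precision=0.87)
print(f"missed weed images: {missed:.1f}")
print(f"unnecessary sprays: {false_sprays:.1f}")
```

With these inputs only about three weed images would go unsprayed, while the number of unnecessary sprays is far larger, which is why a high-recall network with modest precision still controls weeds well but saves less herbicide.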
Training dataset A is balanced and contained equal numbers of negative and positive images, whereas training dataset B is unbalanced and contained fewer training images than training dataset A. Because the multiple-species neural networks, particularly AlexNet and VGGNet, performed differently when trained using training datasets A and B, we hypothesized that (1) the ratio of negative to positive images in the training dataset may influence the performance of neural networks, (2) changing the number of positive images for each weed species in the training dataset may alter the performance for weed detection, and (3) a larger training dataset might be needed to further improve the precision and recall and enhance the overall accuracy. These hypotheses will be tested in further work. It should be noted that all training images were taken in a relatively small geographical area. While the models achieved high classification rates, a more diversified training dataset that represents different weed biotypes collected in different geographical regions is highly desirable. Moreover, pixel-wise semantic segmentation was noted to improve the accuracy of object detection with fewer training images, which warrants further evaluation. Expanding the neural networks to include a wider variety of weed species should be the next immediate step of this research.

CONCLUSION AND SUMMARY
This research demonstrated the feasibility of using DCNN for weed detection in perennial ryegrass. When the neural networks were trained using training datasets containing a single weed species, AlexNet and VGGNet performed similarly for detection of E. maculata and G. hederacea growing in perennial ryegrass. AlexNet and VGGNet had reduced recall values in TD 2 for detection of T. officinale, but this problem was overcome by training the multiple-species neural networks. VGGNet consistently exhibited the highest MCC values among the multiple-species neural networks for both training datasets. VGGNet trained using training dataset A exhibited higher precision and F1 score but lower recall than when trained using training dataset B. VGGNet (trained using training dataset B) achieved high F1 scores (≥0.9278) and high recall (≥0.9952), indicating that it is highly suitable for the automated detection of E. maculata, G. hederacea, and T. officinale growing in perennial ryegrass. GoogLeNet was not effective at detecting these weed species, primarily due to its unacceptable precision. DetectNet exhibited excellent F1 scores (≥0.9843) and recall values (≥0.9911) in TD 1 and TD 2, and thus is an effective DCNN for detection of T. officinale growing in perennial ryegrass. To further improve the accuracy of weed detection, other DCNN architectures, such as Single Shot Detection (SSD; Liu et al., 2016), You Only Look Once (YOLO; Redmon et al., 2016), and residual networks (He et al., 2016), may be investigated in the future.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.
AUTHOR CONTRIBUTIONS
JY, AS, and NB designed the experiment. JY, ZC, and SS acquired the images. JY trained DCNN models, analyzed the data, and drafted the manuscript.
ACKNOWLEDGMENTS
This research received no specific grant from any funding agency in the commercial, public, or not-for-profit sectors.