Revolutionizing agriculture with artificial intelligence: plant disease detection methods, applications, and their limitations

Accurate and rapid plant disease detection is critical for enhancing long-term agricultural yield. Disease infection poses the most significant challenge in crop production, potentially leading to economic losses. Viruses, fungi, bacteria, and other infectious organisms can affect numerous plant parts, including roots, stems, and leaves. Traditional techniques for plant disease detection are time-consuming, require expertise, and are resource-intensive. Therefore, automated leaf disease diagnosis using artificial intelligence (AI) combined with Internet of Things (IoT) sensors is considered for analysis and detection. This review examines diseases of four crops: tomato, chilli, potato, and cucumber. It also highlights the most prevalent diseases and infections in these four vegetables, along with their symptoms. The review details the predetermined steps for predicting plant diseases using AI: image acquisition, preprocessing, segmentation, feature selection, and classification. Machine learning (ML) and deep learning (DL) detection models are discussed. Existing ML- and DL-based studies on detecting diseases of these four crops are examined comprehensively, including the datasets used to evaluate them. We also provide a list of plant disease detection datasets. Finally, problems in applying ML and DL are identified and discussed, along with future research prospects that combine AI with IoT platforms such as smart drones for field-based disease detection and monitoring. This work will help other practitioners survey different plant disease detection strategies and the limits of present systems.


Introduction
Plant infections significantly impact both crop quality and quantity. Early prediction and recognition of these infections are vital to prevent crop damage and enhance yield. In India, agriculture contributes around 17% to the country's GDP (Agarwal et al., 2019). India ranks at the top for critical crops like tomatoes, potatoes, and pepper (Tm et al., 2018; Thapa and Subash, 2019; Zunjare et al., 2023). Various factors, including environmental conditions and cross-contamination, influence the emergence and spread of infections in agricultural areas (Kodama and Hata, 2018). A wide variety of crops is cultivated worldwide, and all of them are open to this kind of study. Pest infestations cause an annual decrease in crop productivity of 30-33% (Kumar et al., 2019). Fungal, viral, and bacterial organisms cause infectious diseases in plants. Due to the multitude of infections and contributing factors, agricultural practitioners need help shifting from one infection control strategy to another to mitigate the impact of these infections. This situation directly affects the quality and quantity of overall crop production.
In the current era of significant technological advancement, it is noteworthy that farmers continue to follow traditional practices for identifying disease in crops. Rather than depending on modern specialized tools, farmers persist in personally and visually examining crops to detect signs of disease (Ayoub Shaikh et al., 2022). Visual inspection and evaluation based solely on the farmer's expertise present several challenges and limitations. In the worst-case scenario, an undetected infection might cause the entire crop to decline, hurting yield. Certain agricultural diseases exhibit inconspicuous symptoms, which makes it hard to determine the appropriate course of action. In such situations, it can be difficult to reach a sound judgment about the nature of the disease and the intervention method. Therefore, advanced and comprehensive research becomes essential (Munjal et al., 2023).
To address the challenges mentioned above, which are prevalent in modern agricultural settings, computer-aided automated approaches such as ML and DL can be instrumental in facilitating precise, rapid, and early identification of diseases. The advantage of these technologies lies in their ability to provide fast and accurate outcomes through computerized detection and image processing techniques. Utilizing AI techniques in agriculture can reduce labor costs, decrease time inefficiencies, and enhance crop quality and overall yield. Appropriate management approaches can support disease control plans by using the earliest available data on the health condition of crops and the specific location of diseases.

Contribution
The following list summarizes the primary findings and contributions of this study:
• The classification of common diseases in vegetables such as tomato, chilli, potato, and cucumber is discussed.
• Predetermined steps for automated disease detection, along with various methodologies and algorithms, are explained.
• AI methodologies for plant disease identification presented in the literature are covered. ML- and DL-based models, which are designed to detect vegetable diseases in various plant species, are discussed in detail.
• AI-based plant disease classification is discussed, and the automated approaches for classifying diseases in each respective vegetable are provided.
• Finally, the challenges associated with applying AI models in disease detection are described in depth.

Organization of this study
Our study focuses mainly on the use of automated strategies to diagnose plant diseases. The study is organized into five sections. In Section 2, we focus on the background knowledge for automated plant disease detection and classification. Various predetermined steps are required to investigate and classify plant diseases. Detailed information on AI subsets such as ML and DL is also discussed in this section. A detailed examination of the common disease symptoms that could affect the vegetables is provided in Section 3. Section 3 also highlights AI-based disease detection by reviewing previous studies in the agricultural literature that classify vegetable diseases. After reviewing various frameworks in the literature, Section 4 discusses the challenges and unresolved issues related to the classification of selected vegetable plant leaf infections using AI. This section also provides future research directions with proposed solutions. We conclude the study in Section 5.

2 Background required for automated plant disease detection
Automated technologies to detect plant diseases are currently essential. They help prevent frequent crop diseases and the losses that follow from them. An automated disease detection system that uses AI follows predetermined steps. These steps include installing various sensors in the agricultural field to collect and record plant images. The collected images are then processed and segmented to be used as data for machine learning algorithms. The ML models then predict whether a leaf is healthy or diseased (Ayaz et al., 2019). The framework with predetermined steps to predict plant disease is presented in Figure 1.

Plant image acquisition
In this phase, relevant images of the object are captured and acquired so that classification can be performed using automated approaches. A picture is a collection of binary data, which can then be manipulated and analyzed on a computer. High-resolution digital cameras are used to capture images in this phase (Camargo and Smith, 2009). Smartphones have also proven useful, recording image samples in various supported formats such as jpg, png, tif, and more. After all the required images have been captured, they are sent to the image preprocessing stage to be adjusted before use. If the collected images do not fulfill the processing requirements, image-enhancing methods need to be employed (Basavaiah and Anthony, 2020).
For accurate disease classification, the image acquisition phase is crucial. The efficiency of the entire framework is highly dependent on the images acquired, because ML models are trained on these images (Camargo and Smith, 2009). The agricultural research literature offers plenty of well-known image datasets for various plant species. The datasets include healthy and unhealthy leaves, making it possible to examine and assess the effects of different diseases on plant health. Several vegetable plant infection-related datasets are available online, such as PlantVillage (Arya and Rajeev, 2019), New Plant Diseases (Wani et al., 2022), IPM Images, APS Images, Plant Doc (D. Singh et al., 2020), PLD (Rashid et al., 2021), and many more. The publicly available datasets of selected plant diseases are listed in Table 1.

Image preprocessing
Image preprocessing is an essential step that follows image acquisition. The captured images are affected by various factors such as noise, blur, low or high illumination, and unwanted background. Therefore, it is crucial to process this raw data so that the disease can be classified efficiently using automatic approaches. The raw data is converted into a specific format and cleaned up by removing any noise or distortion. In the next phase, images are passed to the step where the essential segmentation and feature extraction procedures are carried out.
Preprocessing allows researchers to maximize the efficiency of their computing resources and to maintain uniform image resolution relative to a set benchmark. Preprocessing approaches applied at this stage include standardization, image size regularization (scaling the image to specified dimensions), color scaling, distortion removal, and noise removal. In addition, the image is adjusted to fit a fixed color scale for best analysis and interpretation. Previous studies have shown that a white background can make images easier to interpret (Militante et al., 2019). A standard preprocessing methodology in agricultural research uses the hue, saturation, and value (HSV) color model, which closely mimics human color perception (Jadhav and Patil, 2016). To improve processing efficiency and accuracy, agricultural researchers frequently use masking and background removal techniques (Sannakki et al., 2013). Because it resembles the perceptual traits of human vision, conversion of a color image into the HSI (Hue, Saturation, Intensity) color space is also used. According to previously published research (Liu and Wang, 2021), the H component of the HSI representation is the most frequently used for further analysis. Low-pass filters are used to reduce high-frequency noise, whereas high-pass filters, through their negative weighting factors, emphasize regions with a sharp intensity gradient. This procedure highlights the most relevant features (Zhang et al., 2020). The Laplacian filter is a typical method used in agricultural research to improve the clarity of image outline structures. Using a Fast Fourier Transform method (Packa et al., 2015), the Fourier transform (FT) filter transforms the images into the spatial frequency domain. A noise-smoothing filter based on the sigma probability of the Gaussian distribution is a straightforward method with impressive results. The quality of plant disease images can also be improved using histogram equalization, a technique that changes the intensity distribution of images (Makandar and Bhagirathi, 2015). Segmenting the image of the infected leaf is crucial for achieving pinpoint accuracy in disease diagnosis.
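The preprocessing operations described above can be illustrated with a minimal sketch in plain Python. It is not taken from any of the cited studies: the function names, the border handling (edge replication), and the two 3x3 kernels are assumptions chosen for illustration. The sketch extracts the hue channel, applies a low-pass (mean) and a high-pass (Laplacian) filter, and performs histogram equalization on an 8-bit grayscale image stored as a list of rows.

```python
import colorsys

def hue_channel(rgb_image):
    """Return the hue of every pixel, in [0, 1), from 8-bit (r, g, b) triples."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] for (r, g, b) in row]
            for row in rgb_image]

def filter3x3(gray, kernel):
    """Apply a 3x3 kernel to a grayscale image; border pixels are replicated.
    Both kernels below are symmetric, so correlation and convolution coincide."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc
    return out

# Low-pass kernel: averaging suppresses high-frequency noise.
MEAN_KERNEL = [[1 / 9] * 3 for _ in range(3)]
# High-pass kernel: negative weights emphasize sharp intensity changes.
LAPLACIAN_KERNEL = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]

def equalize_histogram(gray, levels=256):
    """Histogram equalization for integer gray levels in [0, levels)."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to spread out
        return [row[:] for row in gray]
    scale = (levels - 1) / (n - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in gray]
```

In practice an image library would normally replace these loops; the pure-Python form is used here only to keep each step visible.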

Image segmentation
Segmentation is a fundamental technique used in agricultural science, wherein an image is divided into its components. The primary goal is to analyze each object in more detail and to extract useful features (Jafar et al., 2022). Unaffected and infected regions can be distinguished based on the retrieved features (Makandar and Bhagirathi, 2015). Segmenting the preprocessed images is therefore crucial for extracting the features needed to classify diseased leaves.
Traditional approaches, such as thresholding, edge detection, region-based segmentation, and clustering, rely on mathematical and image processing knowledge to segment the given images. Thresholding is one of the most effective segmentation approaches; it segments images based on pixel intensity values. It is widely used in applications such as classification, detection, and remote sensing. The three subtypes of thresholding segmentation are global, variable, and adaptive. Each category has its own methods for segmenting images; for example, global thresholding methods include mean, median, and Otsu thresholding (Makandar and Bhagirathi, 2015). Edge detection is a process in which an image is partitioned based on its edges, that is, the boundaries of objects in the image. The strengths and weaknesses of this approach are discussed in detail in Table 2. Well-known methods for edge detection include the Sobel operator, the Canny edge detector, and the Laplacian of Gaussian (LoG) filter.
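As an illustration of global thresholding, the following minimal sketch implements Otsu's method in plain Python: it scans every candidate threshold and keeps the one that maximizes the between-class variance of the two resulting pixel groups. The function names and the toy image are assumptions made for this example and do not come from the cited work.

```python
def otsu_threshold(gray, levels=256):
    """Return the threshold t that maximizes between-class variance.
    Pixels with value <= t form one class, the rest form the other."""
    hist = [0] * levels
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue           # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break              # upper class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

def binarize(gray, threshold):
    """Mark pixels above the threshold with 1 and all others with 0."""
    return [[1 if p > threshold else 0 for p in row] for row in gray]

# Toy image (hypothetical values): dark background row, bright lesion row.
toy = [[10, 12, 11], [200, 205, 198]]
mask = binarize(toy, otsu_threshold(toy))
```

Whether the bright or the dark class corresponds to diseased tissue depends on the color channel being thresholded, so the mask polarity may need to be inverted in a real pipeline.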
Region-based segmentation divides the image into multiple regions based on the similarity of pixels in terms of intensity value, color, and shape. Two well-known region-based segmentation methods are Region Growing and Region Splitting (Aubry et al., 2014). The two methods work in opposite directions: region growing starts from seed pixels and expands the region by adding similar neighboring pixels, whereas region splitting starts from the whole image and subdivides it. Clustering, another image segmentation approach, groups pixels together based on their similarity in texture, color, or other required features. K-means (Ell and Sangwine, 2007) and Fuzzy C-means (Camargo and Smith, 2009) are well-known clustering algorithms for image segmentation and are widely used in various applications. However, traditional approaches lack efficiency in handling complex images with fine details, as noted among the weaknesses in Table 2. In recent years, deep learning-based automatic segmentation approaches have outperformed traditional methods. Two well-known DL-based segmentation approaches are Semantic Segmentation and Instance Segmentation. Semantic segmentation assigns a category label to each pixel in an image, dividing the image into mutually exclusive sets, with each set identifying a meaningful region of the original image (Jafar et al., 2022). DL models such as the Convolutional Neural Network (CNN) improve segmentation accuracy. Instance segmentation extends semantic segmentation to handle more complex or challenging tasks. This approach predicts individual instances of object classes in images. Various techniques have been developed, each using well-known DL architectures such as RCNN, YOLO, Instance Cut, Deep Mask, and Tensor Mask. The advantages and drawbacks of semantic and instance segmentation are provided in Table 2.
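To make the region growing idea concrete, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows, a single seed pixel, 4-connectivity, and a fixed intensity tolerance measured against the seed value. All of these are illustrative choices, not details from the cited references.

```python
from collections import deque

def region_grow(gray, seed, tolerance):
    """Grow a 4-connected region from seed=(row, col).
    A neighbor joins the region when its intensity differs from the
    seed intensity by at most `tolerance`. Returns a boolean mask."""
    h, w = len(gray), len(gray[0])
    seed_value = gray[seed[0]][seed[1]]
    mask = [[False] * w for _ in range(h)]
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < h and 0 <= nx < w
            if inside and not mask[ny][nx]:
                if abs(gray[ny][nx] - seed_value) <= tolerance:
                    mask[ny][nx] = True
                    queue.append((ny, nx))
    return mask

# Toy image (hypothetical values): dark leaf tissue on the left,
# a bright lesion on the right.
toy = [
    [10, 11, 90],
    [12, 95, 92],
    [11, 10, 91],
]
leaf_region = region_grow(toy, seed=(0, 0), tolerance=5)
lesion_region = region_grow(toy, seed=(0, 2), tolerance=5)
```

Comparing against the seed value keeps the region from drifting; comparing against the running region mean is a common alternative when illumination varies gradually.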

Feature extraction
Feature extraction is the procedure of extracting features from raw data. The feature descriptors of the input image are its shape, color, and texture properties. Feature extraction plays an essential role in classification tasks. In the context of ML, feature engineering is a fundamental technique that transforms raw data into a set of meaningful and relevant features (Basavaiah and Anthony, 2020). The dataset is provided as input to this step to determine whether plants are healthy or not.
The basic features in an image include color, texture, morphology, and other related characteristics. When identifying the damaged spot on a leaf, morphological traits prove more effective than others (Yao et al., 2009; Khirade and Patil, 2015). Features such as color moments and Gabor texture are frequently used. Several methods are available for obtaining color characteristics, such as the color histogram (Sugimura et al., 2015), the color correlogram (Huang et al., 1997), the color moment (Rahhal et al., 2016), and others. Contrast, homogeneity, variance, and entropy are all potential texture descriptors. For plant disease identification problems, texture features have been found to yield more favorable outcomes (Kaur et al., 2019). Using the grey-level co-occurrence matrix (GLCM) method, one may determine an area's energy, entropy, contrast, homogeneity, moment of inertia, and other textural features (Mokhtar et al., 2015; Islam et al., 2017). Texture characteristics may also be separated using FT and wavelet packet decomposition (Kaur et al., 2019). Additional features such as Speeded-Up Robust Features (SURF), the Histogram of Oriented Gradients, and the Pyramid Histogram of Visual Words (PHOW) have shown greater effectiveness (Kaur et al., 2019).
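The GLCM-based texture features mentioned above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions: the image is already quantized to a small number of gray levels, the matrix is symmetric and normalized, and a single offset (one pixel to the right) is used. Here "energy" is the sum of squared matrix entries (the angular second moment); some libraries report its square root instead.

```python
import math

def glcm(gray, levels, dx=1, dy=0):
    """Normalized, symmetric grey-level co-occurrence matrix for offset (dx, dy)."""
    h, w = len(gray), len(gray[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                i, j = gray[y][x], gray[ny][nx]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[c / total for c in row] for row in counts]

def texture_features(p):
    """Contrast, energy, homogeneity, and entropy of a normalized GLCM."""
    contrast = energy = homogeneity = entropy = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += (i - j) ** 2 * v
            energy += v * v
            homogeneity += v / (1 + (i - j) ** 2)
            if v > 0:
                entropy -= v * math.log2(v)
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "entropy": entropy}

# Two toy 2-level textures (hypothetical): a flat patch and vertical stripes.
flat = texture_features(glcm([[0, 0], [0, 0]], levels=2))
stripes = texture_features(glcm([[0, 1], [0, 1]], levels=2))
```

The flat patch gives zero contrast and maximal energy, while the striped patch gives higher contrast and entropy, which is the kind of difference a classifier can exploit to separate smooth healthy tissue from textured lesions.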

Artificial intelligence
Artificial intelligence (AI) is becoming increasingly important in agricultural research, particularly in identifying and classifying plant diseases. Classification is the first stage of this process and involves separating data into classes. In this context, we are particularly interested in plant leaf detection and classification, specifically in differentiating between healthy and diseased examples. To perform this task, it is necessary to understand the classification and detection algorithms of ML and DL.

Machine learning algorithms
To understand AI's involvement in this domain, it is essential to realize that machine learning is a subset of AI. ML aims to allow computers to learn from experience (Verma, 2023). There are various subtypes within the ML domain, each suited to different learning scenarios. Supervised learning involves providing the system with input data and the corresponding target values. The goal is clear: to learn a relationship that allows the system to predict outputs based on inputs (Radivojević et al., 2020). In this setting, algorithms are trained on labeled data to classify leaves into plant disease groups. In contrast, unsupervised learning relies on a different strategy. In this case, the system is given data without explicit input-output specifications and aims to find hidden patterns or relationships in the data (Attri et al., 2023). Semi-supervised learning arises when some data is labeled, as in supervised learning, while some data is unlabeled, as in unsupervised learning (Engelen and Hoos, 2020).
Distinguishing between classification and regression tasks in ML is also crucial because they produce different types of output. Classification tasks seek qualitative results and organize inputs into classes. One use of classification is categorizing plant leaf diseases into distinct groups (Shoaib et al., 2022). In contrast, regression tasks deal with numerical results and try to estimate values based on input data. A wide variety of methods is available in supervised ML, each with advantages and limitations, which are presented in Table 3. Decision trees, random forests, k-nearest neighbors, support vector machines, artificial neural networks, naive Bayes, linear regression, and linear discriminant analysis are among the frequently used approaches (Linardatos et al., 2021).
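As a small illustration of supervised classification, the sketch below implements a k-nearest neighbors classifier and the accuracy metric in plain Python. The two-dimensional feature vectors and their labels are invented toy values standing in for extracted leaf features. They are not data from any cited study.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples
    (Euclidean distance)."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical training set: each vector could hold, for example,
# a texture contrast value and a mean hue value for one leaf image.
TRAIN_X = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
           [0.90, 0.80], [0.80, 0.90], [0.85, 0.85]]
TRAIN_Y = ["healthy", "healthy", "healthy",
           "diseased", "diseased", "diseased"]

test_x = [[0.2, 0.2], [0.8, 0.8]]
test_y = ["healthy", "diseased"]
predictions = [knn_predict(TRAIN_X, TRAIN_Y, q) for q in test_x]
score = accuracy(predictions, test_y)
```

The accuracy values quoted for the studies reviewed later are computed in this same way, as the share of test images assigned to the correct class.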

Deep learning models
Deep learning (DL) is a branch of AI and ML that has significantly impacted areas such as image classification, object recognition, and natural language processing (Sarker, 2021). DL employs neural networks for autonomous feature selection, eliminating the need for intensive manual feature engineering. It improves accuracy and generalizability in tasks such as image recognition and target identification by combining low-level information to build abstract, high-level features. The development of DL can be split into two eras: the first from 1943 to 1998, and the second from 2006 to the present (Lavecchia, 2019). In the first era, ground-breaking innovations were developed, including backpropagation, the chain rule, the Neocognitron, and architectures like LeNet for handwritten text recognition. Modern algorithms and architectures such as deep belief networks (DBN), autoencoders, CNNs, and their variants emerged during the second era (Figure 2). They can be used in various fields, including self-driving cars, healthcare, text recognition, earthquake prediction, marketing, finance, and picture recognition (Sengupta et al., 2020).
DL comprises a wide range of neural network architectures, each best suited to a different class of problems. Among the most well-known are the multilayer perceptron (MLP), backpropagation (BP), and deep neural networks (Naskath et al., 2023). While the original MLP was best suited for linear classification tasks, the BP method developed in the second iteration helped with nonlinear classification and learning challenges. The second phase of DL began in 2006, bringing solutions to the vanishing gradient problem. The Hinton team's success in the 2012 ImageNet competition with the DL model AlexNet heralded the ascendance of DL. Several empirical studies have shown that these structures perform better than alternatives. VGG-16 (Simonyan and Zisserman, 2015), GoogleNet (Jahandad et al., 2019), ResNet (Jafar and Lee, 2021), DenseNet (G. Huang et al., 2017), Genetic CNN (Xie and Alan, 2017), SqueezeNet (Iandola et al., 2016), LeNet (LeCun et al., 1999), Inception (LeCun et al., 1999), MobileNet (Howard et al., 2017), and Xception (Chollet, 2017) are a few examples.
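The CNN-style architectures listed above are built from a few repeated operations. The following minimal sketch shows one convolution, a ReLU activation, and a 2x2 max-pooling step in plain Python. It is a forward pass only, with a hand-picked edge-detecting kernel instead of learned weights, and the function names are assumptions made for illustration. As in most DL frameworks, the "convolution" is computed as a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, w - 1, 2)]
            for y in range(0, h - 1, 2)]

# Toy 5x5 image (hypothetical): dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

features = max_pool2x2(relu(conv2d_valid(image, kernel)))
```

In a trained network the kernel values are learned by backpropagation, and many such convolution, activation, and pooling stages are stacked before the final classification layers.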

AI-based automated vegetable disease detection and classification
Plant pathology divides plant diseases into biotic and abiotic diseases. Fungi, bacteria, insects, and viruses cause biotic diseases (Figure 3). Non-living causes like environmental nutritional deficits, chemical imbalances, metal toxicity, and physical traumas produce abiotic disorders (Husin et al., 2012). Plants can also show signs of abiotic diseases when exposed to unfavorable environmental conditions such as high temperatures, excessive moisture, inadequate light, a lack of essential nutrients, an acidic soil pH, or even greenhouse gases (Figure 3). Plant infections can be challenging to spot with the naked eye, making detection and classification an enormous problem (Liu and Wang, 2021). It is also important to remember that many plant diseases share symptoms. Because of their similarities in appearance, determining which plant disease is causing harm can be difficult. Signs that can be difficult to analyze and identify include irregular leaf development, distortion of leaf pigmentation, slowed growth, and reduced and weakened pods (Manavalan, 2021). Visible signs such as affected leaves help to identify the disease. Maximizing vegetable production and ensuring the agricultural sector's economic viability are important for maintaining a healthy ecosystem (Mitra, 2021).
The primary goal of this study is to determine the root causes of leaf diseases. Previous studies have consistently shown that the health of a plant's leaves is directly related to the strength of its immune system (Qiu et al., 2022). When a plant's leaves are healthy, the plant's immune system strengthens and becomes better able to tackle diseases that might appear in other parts of the plant. The [...] (Krithika and Veni, 2017). These diseases are quite dangerous because they can spread swiftly and cause much damage. This section presents a comprehensive overview of plant disease detection and classification frameworks that use cutting-edge techniques such as ML and DL. These frameworks have been extensively documented in the existing literature for the selected vegetables: tomato, chilli, potato, and cucumber.

Automated tomato disease detection
The tomato, scientifically known as Solanum lycopersicum, is an important agricultural crop cultivated throughout Asia for human use. Some of the most prominent nutrients in tomatoes include vitamin E, vitamin C, and beta-carotene. Tomatoes are also rich in potassium, a crucial mineral for health. Because of its popularity and nutritional value, this vegetable is grown worldwide. The tomato crop is vulnerable to several diseases brought on by bacterial infections, microbes, and pest infestations (Lal, 2021). The disease name, diseased image, and unique symptoms that damage specific tomato plant parts are highlighted in Table 4. Detailed explanations of previous studies on automatic prediction of tomato diseases are provided below.
Previous research (Francis and Deisy, 2019) proposed a CNN model to discriminate between healthy and diseased tomato and apple leaves. The proposed model comprises four convolutional layers, each followed by a pooling layer. The model also uses a sigmoid activation function and two fully connected dense layers. A total of 3663 image samples, all selected from the PlantVillage dataset, were used during training and testing. The system achieved an accuracy of 87%. Similarly, researchers (Basavaiah and Anthony, 2020) examined the use of various ML approaches to identify tomato plant disease. In this study, 200 images from 5 classes were used. Texture, color, and shape were used since they are well-known global feature descriptors. The authors used KNN, LR, DT, RF, SVM, and other algorithms for model training. The RF model outperformed the other ML algorithms in their analysis, with a 94% accuracy rate (Table 5).
The authors (K and Rao, 2019) use KNN and probabilistic neural networks (PNN) to detect and categorize different diseases affecting tomato leaves. The dataset comprises 600 image samples of healthy and diseased tomato leaves collected in the field. The model accurately identified Verticillium wilt, powdery mildew, leaf miners, Septoria leaf spot, and spider mites. The results demonstrated that the classification performance of the PNN model surpassed that of the KNN model, achieving an accuracy of 91.88%.
Feature extraction using the K-means method was performed in (Vadivel and Suguna, 2022). The BPNN method was then applied to the task of labeling diseased leaves. The model classified leaf diseases using augmented data with 10000 images from online sources. Seven different features, including contrast, correlation, energy, homogeneity, mean, standard deviation, and variance, were extracted from the dataset. Several models, such as BPNN, neural network, K-means clustering, and CNN, were used for training. The proposed optimized model achieved 99.4% classification accuracy (Table 5).
Another study (Chakravarthy and Raman, 2020) used DL to identify early blight disease in tomato leaves. The dataset included 4281 image samples collected from a trusted agriculture source. The authors offer a model to distinguish between healthy and early blight-affected tomato leaves. ResNet and Xception were fine-tuned for tomato plant leaf classification. After fine-tuning, the system could discriminate between healthy and early blight-infected tomato leaves with an accuracy of 99.95%.
In (Kumar and Vani, 2019), the authors analyze several CNN architectures trained to identify diseases in tomato leaves. The PlantVillage dataset is used for this analysis and consists of 14,903 images. There are a total of ten classes of tomato leaves in this dataset, including target spot, Septoria leaf spot, mosaic virus, leaf mold, bacterial spot, and healthy leaves. The research investigated four common transfer learning-based architectures: LeNet, Xception, ResNet50, and VGG16. Classification accuracies of 96.27%, 98.13%, 98.65%, and 99.25% were achieved when evaluating these architectures. Based on this evaluation, the VGG16 model outperformed its competitors.

[Figure: Abiotic plant stress and biotic plant diseases.]

[Table fragment: tomato diseases]
Leaf Mold. Causative agent: fungus. Symptoms: the older leaves have light greenish-to-yellow dots on the top surface (Zhang et al., 2020).
Tomato Gray Leaf Spot. Causative agent: fungus. Symptoms: circular, grayish-brown lesions with a yellow halo on leaves; severe infections can lead to defoliation (Liu and Wang, 2020b).

Automated Chilli disease detection
One of India's most important agricultural products is the chilli, a vegetable with a spicy flavor widely used in regional and international cuisines. The chilli pepper is known by several names, including Lanka and Mirchi. Many varieties can be used as seasonings, dyes, oils, and medicinal compounds. Approximately 45 different viruses are known to infect chilli plants. Only 24 are known to occur naturally; the rest may be introduced through inoculation or other means (Duranova et al., 2022). Various chilli diseases, such as down curl, gemini virus, cercospora, and leaf spot, are caused by bacterial, viral, and fungal agents. The disease name, diseased image, and unique symptoms that damage specific chilli plant parts are provided in Table 6. Detailed explanations of previous studies on automatic prediction of chilli diseases are provided below.
This study (Naik et al., 2022) examines the effectiveness of DL and ML techniques for classifying chilli leaf disease. Twelve pre-trained DL networks were employed, and the dataset features images of five critical diseases. Without augmentation, VGG19 had the highest accuracy (83.54%), whereas DarkNet53 performed exceptionally well with augmentation. A squeeze-and-excitation-based convolutional neural network (SECNN) model outperformed the rest, obtaining 98.63% accuracy without augmentation and 99.12% with augmentation (Table 6).
This research examines the prevalence of pests and diseases in growing chilli peppers, a vital vegetable crop worldwide. Automated [...] This study (Sachdeva et al., 2021) introduces a DCNN model with Bayesian learning to improve plant disease classification. Early disease diagnosis is critical for crop health. The study includes 20,639 PlantVillage images of healthy and diseased potato, tomato, and bell pepper plant samples. A Bayesian procedure is built into the structure of a residual network. The model reaches an accuracy of 98.9% without any overfitting issues.
This study presents a new data augmentation method that uses geometric modifications to expand a small dataset of healthy and diseased chilli leaves. A Convolutional Neural Network (CNN) and ResNet-18 were tested and compared using both the raw data and the augmented data. The results showed that the trained models were effective, with an average accuracy of 97% (Table 7). This research demonstrates the significance of data augmentation in improving the accuracy of DL models for assessing chilli health, which could increase agricultural output (Aminuddin et al., 2022).
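Geometric data augmentation of the kind described above can be sketched in a few lines of plain Python. The specific transformations used by Aminuddin et al. (2022) are not stated here, so the flips and the 90-degree rotation below are assumed, commonly used examples, and the function names are hypothetical. Images are represented as lists of rows.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]

def rotate90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def augment(image):
    """Return the original image plus three geometric variants.
    The class label of every variant stays the same as the original."""
    return [
        [row[:] for row in image],
        flip_horizontal(image),
        flip_vertical(image),
        rotate90_clockwise(image),
    ]

# Toy 2x2 image (hypothetical pixel values).
leaf = [[1, 2],
        [3, 4]]
variants = augment(leaf)
```

Because a diseased leaf remains diseased when mirrored or rotated, each variant can be added to the training set with its original label, which multiplies the size of a small dataset without new field images.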
The study (Mustafa et al., 2023) uses a dataset of 2475 images of bell pepper leaves to classify plant leaf diseases. The method applies an image enhancement technique that improves the effectiveness of the Convolutional Neural Network (CNN) model. The dataset is expanded to 20,000 images, which further improves the model's effectiveness. The optimized CNN model includes four preprocessing stages, including filter width variations, hyper-parameter optimization, max-pooling, and dropout layers, yielding promising results. The optimized CNN model, trained for 25 epochs, achieved an accuracy rate of 99.99% (Table 7).
The study (Karadağ et al., 2020) focuses on early recognition of plant diseases using advanced computerized diagnostic systems. The research uses light reflected from leaves to distinguish between healthy and fusarium-diseased peppers. The data includes four groups of pepper leaves: healthy, fusarium-diseased, infected with mycorrhizal fungus, and leaves with both. The process involves generating feature vectors and classifying them using machine learning algorithms such as ANN, NB, and KNN. The classification algorithms achieved success rates of 100% for KNN, 97.5% for ANN, and 90% for NB in distinguishing between diseased and healthy pepper plants.
In this paper (Kanaparthi and Ilango, 2023), DL methods are used to investigate training issues on a chilli leaf disease dataset. This research uses 160 images from a public repository on Kaggle to assess the efficacy of the Squeeze-Net architecture in identifying Geminivirus- and Mosaic-infected chilli leaves. Training accuracy varies from 50% to 100% depending on settings such as the CNN optimizer, maximum number of epochs, dropout probability, strides, dilation factor, and padding values. With the Adam and RMSprop optimizers trained for 40 and 35 epochs, respectively, the Squeeze-Net CNN architecture (Lin et al., 2019a) achieves 100% accuracy.
This study used chili crop images to diagnose two primary diseases, leaf spot and leaf curl, under real-world field conditions. YOLOv5 was used to identify diseases in chilli crops. The model predicted disease with an accuracy of 75.64% for the diseased cases in the test image dataset (KM et al, 2023).

Automated potato disease detection
The potato is the fourth-largest crop in global cultivation. However, it faces difficulties, especially with regard to disease susceptibility. The potato is one of the most widely affected crops in agriculture due to the prevalence of numerous diseases (Wani et al., 2022). These diseases, such as black scurf, common scab, black leg, and pink rot, are caused by different causative agents. The disease name, diseased image, and unique symptoms that damage specific potato plant parts are provided (Table 8). Efforts in the literature to identify and detect potato crop diseases automatically are highlighted below.
The authors of (Patil et al., 2017) compared three ML methods, RF, SVM, and ANN, for spotting blight disease in potato leaf images. These techniques were trained and tested using images from the PlantVillage dataset and from the University of Agricultural Sciences, India. The dataset consisted of 892 images depicting healthy leaves, leaves with early blight, and leaves with late blight. Applying the fuzzy c-means clustering technique to each image helped distinguish the healthy and diseased categories. The simulation showed that the ANN was the most accurate ML technique for detecting diseases, with 92% accuracy, followed by SVM at 84% and RF at 79% (Table 9).
A machine learning-based automated approach for classifying potato leaf diseases was introduced in a separate study (Suttapakti and Bunpeng, 2019). The system combines the maximum-minimum color difference (MCD) technique with a set of distinctive color attributes and texture features. Image samples were segmented using k-means clustering and categorized using Euclidean distance. Three hundred potato leaf images were obtained from the PlantVillage database. The authors' suggested approach integrates MCD and TTF (three texture characteristics). This method correctly classified late blight, early blight, and healthy potato leaf images with 91.67% accuracy.
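The final step of such a pipeline, categorizing a feature vector by Euclidean distance, can be sketched as a minimum-distance classifier. The two-dimensional feature values and the use of class means as prototypes are assumptions made for this example and are not taken from the cited study.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def class_means(samples):
    """Average the training vectors of each class into one prototype."""
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }

def classify(features, prototypes):
    """Assign the label whose prototype is closest to the feature vector."""
    return min(prototypes, key=lambda label: euclidean(features, prototypes[label]))

# Hypothetical (color, texture) features for three potato leaf classes.
training = {
    "healthy": [[0.1, 0.2], [0.2, 0.1]],
    "early_blight": [[0.6, 0.5], [0.7, 0.6]],
    "late_blight": [[0.9, 0.9], [0.8, 1.0]],
}
prototypes = class_means(training)
print(classify([0.62, 0.50], prototypes))
```

In practice the feature vector would hold the color and texture descriptors extracted from the segmented leaf region rather than two hand-picked numbers.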
In this study (Arshaghi et al, 2023), machine vision and AI are used to identify defects in agricultural goods such as potatoes. A CNN is employed to classify potato diseases. The classes are healthy, black scurf, common scab, black leg, and pink rot, and the dataset contains 5000 images of these classes. Compared to previous approaches, the accuracy of the suggested DL methodology was considerably higher, reaching 100% and 99% in various disease groups (Table 9).
TABLE 8 (fragment)
In damp conditions, reddish lesions develop erratically, with unusual white fleecy sporulation beneath the leaf (Nur et al., 2023).
Early Blight (Fungus): Infections with a golden border that develop concentric hoops (Afzaal et al., 2021).
In (Arya and Rajeev, 2019), the authors investigated the viability of using CNN and AlexNet architectures for disease detection in potato and mango leaves. The dataset consisted of 4004 potato photos obtained from the PlantVillage database. The training and validation sets comprised 3523 photos, while the testing set had 481 images. Based on the simulation and analysis of the models, the AlexNet architecture performed best, with an accuracy rate of 98.33%.
An enhanced DL method is presented in this article (Mahum et al., 2023) to detect various potato plant leaf diseases. The potato leaf diseases are categorized into five groups: healthy, late blight, early blight, leaf roll, and verticillium wilt. The model is based on a pre-trained Efficient DenseNet and employs a reweighted cross-entropy loss to handle unbalanced data. Its testing set accuracy of 97.2% is higher than that of competing models.
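To make the idea of a reweighted cross-entropy loss concrete, the sketch below scales the usual per-sample loss by a weight attached to the true class, so that errors on rare classes cost more. The weight values are hypothetical, and the exact reweighting scheme of the cited study is not reproduced here.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(logits, target, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -class_weights[target] * math.log(probs[target])

# Hypothetical weights: class 1 is rare, so its errors are penalized 3x.
weights = [1.0, 3.0]
loss_common = weighted_cross_entropy([0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0], 1, weights)
print(loss_common, loss_rare)
```

With equal logits both classes receive probability 0.5, so the only difference between the two losses is the class weight.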
In another study (Al-Amin et al, 2019), researchers used a DCNN to identify late and early blight in potato crops. Their research aimed to distinguish diseased potato leaves from healthy ones so that infections can be detected early. To train and test the model, 2250 image samples of potato leaves were used. The DCNN was developed using a custom-built architecture for identifying diseased potato leaves. The model achieves a maximum accuracy of 98.33%.
A study (Sharma et al., 2021) addresses sustainable intensification, aiming to boost output without negatively impacting the environment. The approach detects diseases in potato and rice crops. Various ML algorithms and a DL-based CNN are used to predict the diseases. The CNN outperformed all the ML classifiers (SVM, KNN, DT, and RF) and achieved accuracy rates of 99.58% for rice and 97.66% for potato leaf diseases.
The health of crops depends on the prompt diagnosis of plant diseases (Singh and Yogi, 2023). In this investigation (Singh and Yogi, 2023), DL with CNNs is applied to automate the diagnosis of diseases in potato leaves. The paper uses a dataset of 1700 images of potato leaves (600 for training and 300 for testing) to showcase the utility of CNNs in disease identification in intelligent farming. The citrus potato diseases are considered to be classified. The CNN model outperforms all other models in accuracy tests, reaching 99.62% (Table 9).

Automated cucumber disease detection
Cucumbers, a popular vegetable, belong to the Cucurbitaceae family of plants. The crop is well-known for its high water content, making it a refreshing and hydrating choice even during the hottest times. However, cucumber plants are susceptible to several diseases, such as anthracnose and angular leaf spot, which cause various leaf problems (Vishnoi et al, 2021). Cucumber plants are particularly susceptible to powdery mildew in their later stages of development. The disease name, diseased image, and unique symptoms that damage specific cucumber plant parts are provided (Table 10). Previous studies on automated cucumber crop disease detection are explained in detail below.
A study (Lin et al., 2019a) presents a CNN-based U-Net semantic segmentation approach for powdery mildew on cucumber leaves. Over twenty test samples, the model segments images of cucumber leaves damaged by powdery mildew with an average pixel accuracy of 96.08%, an intersection over union score of 72.11%, and a Dice accuracy of 83.45% (Table 8). The proposed method shows strong potential for pixel-level segmentation of powdery mildew in cucumber leaf diseases.
This paper presents a systematic approach to detecting and classifying diseases on cucumber leaves (Khan et al., 2020). The methodology is divided into five stages: image enhancement, segmentation of contaminated areas, deep feature extraction, feature selection, and disease classification. Images are first improved by amplifying local contrast, infected regions are segmented using the Sharif saliency-based (SHSB) method, and features are extracted using pre-trained models such as VGG-19 and VGG-M. Feature selection uses local entropy, standard deviation, and interquartile range, and a multiclass support vector machine performs the classification. The suggested methodology achieves a classification accuracy of 98.08%, indicating its potential as a reliable tool for identifying and classifying diseases.
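The three scores quoted for the U-Net segmentation study (pixel accuracy, intersection over union, and Dice) are all computed by comparing a predicted binary mask with a ground-truth mask pixel by pixel. The following sketch shows the standard definitions on a tiny hypothetical mask pair; the mask values are invented for illustration only.

```python
def segmentation_metrics(pred, truth):
    """Return (pixel accuracy, IoU, Dice) for two binary masks of equal shape."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # diseased pixel correctly marked
            elif p and not t:
                fp += 1      # healthy pixel wrongly marked as diseased
            elif t:
                fn += 1      # diseased pixel that was missed
            else:
                tn += 1      # healthy pixel correctly left unmarked
    pixel_accuracy = (tp + tn) / (tp + fp + fn + tn)
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return pixel_accuracy, iou, dice

# Hypothetical 2x3 masks: 1 = powdery mildew, 0 = background or healthy leaf.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
accuracy, iou, dice = segmentation_metrics(predicted, ground_truth)
print(accuracy, iou, dice)
```

Pixel accuracy counts the correctly labelled background as well, which is why it is usually much higher than IoU or Dice when the diseased area is small.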
This research introduces a Global Pooling Dilated CNN (GPDCNN) for plant disease identification (Zhang et al., 2019). The advantages of GPDCNN over conventional CNN and AlexNet models are an enlarged convolutional receptive field, restoration of spatial resolution through the addition of dilated convolutional layers without increasing the number of training parameters, and the combined use of dilated convolution and global pooling. Experimental evaluations on a dataset covering six common cucumber leaf diseases demonstrate the model's efficacy. The dataset was collected in the Yangling Agriculture Zone, China, and contains 600 images. The proposed GPDCNN achieved a 95.18% accuracy rate in cucumber disease recognition (Table 11).
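The claim that dilated convolutions enlarge the receptive field without adding parameters follows from simple arithmetic: a kernel of size k with dilation d spans d(k - 1) + 1 input positions but still has k weights per axis. The sketch below works this out for stacks of stride-1 layers; the layer configurations are hypothetical and are not those of GPDCNN.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input positions covered by one dilated kernel along an axis."""
    return dilation * (kernel_size - 1) + 1

def receptive_field(layers):
    """Receptive field of stacked stride-1 layers given (kernel, dilation) pairs."""
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field

def weights_per_filter(layers):
    """Number of weights per 2D filter and input channel; dilation has no effect."""
    return [kernel_size * kernel_size for kernel_size, _ in layers]

# Hypothetical three-layer stacks of 3x3 convolutions.
plain = [(3, 1), (3, 1), (3, 1)]
dilated = [(3, 1), (3, 2), (3, 4)]
print(receptive_field(plain), receptive_field(dilated))
print(weights_per_filter(plain) == weights_per_filter(dilated))
```

In this example the dilated stack sees a 15-pixel-wide window instead of 7 while using the same nine weights per filter in every layer.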
The proposed cucumber disease recognition method (Zhang et al., 2017) employs a three-step process involving K-means clustering, shape/color feature extraction, and sparse representation classification.
This research (Kianat et al., 2021) proposes a hybrid framework for disease classification in cucumbers that covers data augmentation, feature extraction, fusion, and selection over three stages. The number of features is reduced with Probability Distribution-Based Entropy (PDbE) before a fusion step, followed by feature selection with Manhattan Distance-Controlled Entropy (MDcE). Finally, classifiers are used to categorize the selected features. Multiple machine-learning classifiers were applied to over 900 images from six different classes. The quadratic SVM attained an accuracy rate of 93.50% on the selected set of features.
In this analysis (Zhang et al, 2020), AI is used to detect and categorize diseases affecting greenhouse plants, particularly those that affect the leaves of cucumbers. The dataset included powdery mildew, downy mildew, healthy leaves, and combinations of these diseases. The authors used the EfficientNet-B4-Ranger architecture to create a classification model with a 97% success rate. The model was determined to be the best option for this application.
This research introduces DUNet (Wang et al., 2021), a two-stage model that combines the benefits of DeepLabV3+ and U-Net for disease severity classification in cucumber leaf samples against diverse backgrounds. DeepLabV3+ separates the leaf from complex backgrounds, while U-Net identifies the disease spots on the leaf. The experimental results demonstrate the efficacy of this two-stage approach in segmenting leaves and disease spots against diverse backgrounds. The model segments leaves with an accuracy of 93.27%, identifies disease spots with a Dice coefficient of 0.6914, and classifies disease severity with an average accuracy of 92.85% (Table 11).
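One common way to turn two segmentation masks into a severity grade is to compute the fraction of leaf pixels covered by disease spots and compare it with fixed thresholds. The sketch below follows that idea; the ratio-based grading rule and the threshold values are assumptions made for illustration and are not stated in the text above.

```python
def lesion_ratio(leaf_mask, spot_mask):
    """Fraction of leaf pixels that are also marked as disease spots."""
    leaf_pixels = sum(1 for row in leaf_mask for value in row if value)
    if leaf_pixels == 0:
        raise ValueError("leaf mask contains no leaf pixels")
    spot_pixels = sum(
        1
        for leaf_row, spot_row in zip(leaf_mask, spot_mask)
        for leaf, spot in zip(leaf_row, spot_row)
        if leaf and spot  # spots outside the leaf are ignored
    )
    return spot_pixels / leaf_pixels

def severity_grade(ratio, thresholds=(0.05, 0.25, 0.50)):
    """Map a lesion ratio to grade 1 (mild) up to len(thresholds) + 1 (severe)."""
    for grade, limit in enumerate(thresholds, start=1):
        if ratio <= limit:
            return grade
    return len(thresholds) + 1

# Hypothetical stage-one (leaf) and stage-two (spot) masks.
leaf_mask = [[0, 1, 1, 1],
             [0, 1, 1, 1]]
spot_mask = [[1, 1, 0, 0],
             [0, 0, 1, 0]]
ratio = lesion_ratio(leaf_mask, spot_mask)
print(ratio, severity_grade(ratio))
```

Restricting the spot count to pixels inside the leaf mask is what makes the first stage useful: background regions that resemble lesions no longer inflate the severity estimate.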
4 Limitations of AI in disease detection along with future directions

Limitations
Previously, we detailed how AI applications are being used to improve agriculture, most notably in disease detection in vegetable plants. We investigated several automated frameworks and models proposed by researchers from across the world and described in the literature. It is clear that AI holds great promise in agriculture and, more specifically, in plant disease identification. However, the various issues that limit these models' ability to identify diseases need to be recognized and solved. In this part, we list the primary challenges that reduce the efficiency of automatic plant disease detection and classification.

Noise and background analysis
In agricultural research, captured images of diseased plants often contain unnecessary noise, backgrounds of various colors, and additional elements such as roots, grass, and soil. It is crucial to identify such factors and isolate them. Segmentation is a method used to isolate infected regions from the captured images. To facilitate real-time identification of plant diseases in the field, an automatic system must eliminate extraneous components within the image and isolate only the desired segment.
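A simple example of removing background before disease analysis is to threshold a vegetation index so that soil and other non-plant pixels are masked out. The sketch below uses the excess green index, (2G - R - B) / (R + G + B); the choice of this index, the threshold value, and the pixel colors are assumptions for illustration, and strongly discolored lesions may need a different criterion.

```python
def excess_green(pixel):
    """Normalized excess green index of an (R, G, B) pixel."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return 0.0  # a pure black pixel carries no color information
    return (2 * g - r - b) / total

def vegetation_mask(image, threshold=0.1):
    """Mark pixels as plant (1) or background (0) by thresholding the index."""
    return [
        [1 if excess_green(pixel) > threshold else 0 for pixel in row]
        for row in image
    ]

# Hypothetical 2x2 image: two leaf pixels, one soil pixel, one shadow pixel.
image = [[(40, 160, 50), (120, 100, 80)],
         [(30, 30, 30), (60, 140, 40)]]
print(vegetation_mask(image))
```

Only the pixels kept by the mask would be passed on to the later segmentation and classification stages.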

Factors influencing image acquisition
The current datasets primarily consist of images captured in controlled environments, often in laboratory settings. However, obtaining comparable images can be challenging due to varying factors such as light intensity, moisture levels, and other environmental variables. To achieve research objectives, it is crucial to obtain visual representations of the same leaf specimen from different perspectives, time intervals, and environmental settings. The selection of tools for image acquisition is essential in

Identification and isolation of disease symptoms
Digital image processing plays a crucial role in agricultural research, particularly in identifying and isolating similar symptoms of various diseases. Segmenting the symptoms of diseases that exhibit similar characteristics is vital for better performance. However, this task becomes challenging when numerous diseases share similar symptoms and environmental factors. Alternative segmentation methodologies must be explored to isolate the symptoms of vegetable diseases.

Impact of dataset size on model performance
In the literature, most authors train models on only a few thousand images, which highlights the need for more data on specific vegetable diseases. DL-based data augmentation addresses this by increasing the total number of training images. However, a covariate shift can arise from the disparity between the data used to train the model and the data on which the model is deployed. Using extensive datasets can improve model performance but also introduces computational burdens.

Data imbalance for various diseases
Automated detection approaches face challenges due to imbalanced class distributions in the training dataset. As discussed above, various vegetable diseases have limited data, and the classes are not uniformly represented. To prevent bias, it is vital to represent each disease with a similar number of vegetable samples, both infected and healthy, so that the dataset remains balanced for accurate analysis and prediction.
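When collecting more images of rare diseases is not feasible, a common mitigation is to weight each class inversely to its frequency during training. The sketch below computes such weights as N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the size of class c; the class counts are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by N / (K * n_c) so rare classes count more."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }

# Hypothetical imbalanced dataset of 100 potato leaf labels.
labels = ["healthy"] * 80 + ["early_blight"] * 15 + ["late_blight"] * 5
weights = balanced_class_weights(labels)
for label, weight in weights.items():
    print(label, round(weight, 3))
```

These weights can be passed to a weighted loss function so that each class contributes roughly equally to training, although weighting does not replace the extra visual variety that real additional samples provide.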

Multiple concurrent diseases
The assumption that each image contains only one disease does not always hold, as multiple diseases, nutritional deficiencies, and pests can coexist within the same image. This makes identifying and tracking a specific disease more challenging, and the manifestation of symptoms can also vary with geographic location. Therefore, it is crucial to consider these factors when analyzing images.
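One way to drop the single-disease assumption is to treat the task as multi-label classification: each condition gets its own independent sigmoid score, and every condition above a threshold is reported. The sketch below illustrates this decision rule; the class names, scores, and threshold are hypothetical.

```python
import math

def sigmoid(z):
    """Map a raw score to an independent probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, class_names, threshold=0.5):
    """Return every condition whose probability reaches the threshold."""
    return [
        name
        for name, z in zip(class_names, logits)
        if sigmoid(z) >= threshold
    ]

# Hypothetical raw scores produced by a model for one leaf image.
class_names = ["early_blight", "late_blight", "nutrient_deficiency"]
logits = [2.0, -1.5, 0.8]
print(predict_labels(logits, class_names))
```

Unlike a softmax output, which forces the probabilities to compete and sum to one, independent sigmoid outputs allow zero, one, or several conditions to be flagged for the same image.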

Disease with similar symptoms
Identifying diseases in agriculture is challenging because different diseases show similar symptoms and patterns. Researchers typically use the visible spectrum for investigations. Incorporating infrared spectral bands could help differentiate diseases, but it increases complexity and cost. Such methodologies may still be susceptible to errors, but they could reduce reliance on extensive datasets and lower the risk of errors in agricultural practice.

Future directions
Image processing and AI methodologies offer significant benefits in plant disease detection and classification, but they also have limitations. Image processing techniques can distinguish and separate afflicted segments within an image, but new methodologies are needed to handle noise and extraneous background elements. This manuscript acknowledges various computer vision methods and techniques that have emerged as a prominent area of research in the agricultural domain.
Real-time machine learning-based systems for disease identification are scarce in the agricultural domain. Investigating suitable chemical solutions and their optimal proportions for mitigating disease proliferation is crucial, as improper or inadequate formulations can negatively impact crop productivity and nutritional value. Farmers often combine chemicals without thorough assessment, leading to chemical reactions that pose significant environmental risks. Furthermore, nutrient deficiencies and water scarcity in plants can be detected from leaf images. There is a pressing demand for advanced, hybridized, automated systems capable of overcoming these challenges.
Early disease detection is pivotal in agricultural research, but there is a need for mobile-based applications and websites tailored to the needs of the general public. While the existing literature reports efficient and accurate disease identification models, rigorous testing and real-time implementation in mobile applications and web services are still needed. Drones, often considered expensive gadgets, have garnered significant attention in various fields, particularly agriculture. Developed nations utilize drones for diverse agricultural purposes, including crop health monitoring, weed control, and spraying. To address these challenges, we propose a generic framework that involves training AI models using plant disease datasets and utilizing transfer learning techniques for model validation. The trained models are then deployed to mobile applications or smart drones (Figure 4). Other platforms can capture plant leaf images in real time and perform the necessary processing to optimize performance. This approach enables both methods to identify plant diseases promptly and accurately and highlights the potential to integrate AI with IoT sensors.

Conclusion
Accurate identification and classification of plant diseases are crucial for successful crop cultivation. Manual detection presents challenges such as significant investment in resources, labor, and expertise, and the need to consider factors such as agricultural operations, disease classifications, and similar symptoms across different diseases. This affects crop productivity and quality. To address these issues, AI methodology can be employed for automated disease detection. AI methods can predict diseases through the analysis of plant foliage. To optimize their use, it is essential to identify relevant and practical models and understand the fundamental steps involved in automated detection. This comprehensive analysis explores various ML and DL models that enhance performance in diverse real-time agricultural contexts. Challenges in implementing machine learning models in automated plant disease detection systems have been recognized, as they impact performance. Strategies to enhance precision and overall efficacy include leveraging extensive datasets, selecting training images with diverse samples, and considering environmental conditions and lighting parameters. ML algorithms such as SVM and RF have shown remarkable efficacy in disease classification and identification, while CNNs have exhibited exceptional performance in DL. Although significant progress has been made in plant disease prediction through image-based methodologies, it is crucial to prioritize accuracy enhancement, real-time testing, and deployment. Exploring potential chemical and pesticide recommendations for identified diseases presents a promising avenue for agricultural research. The review presented herein would be beneficial not only to researchers and specialists in the field but also to pathologists and farmers seeking to predict plant diseases.

FIGURE 1
Plant disease prediction system with all important steps.
of convolutional neural networks (CNNs) (Arya and Rajeev, 2019). The development of DL architectures has impacted various fields, including plant disease diagnosis, image detection, segmentation, and classification. It is worth noting that several pre-trained deep neural network (DNN) models are already used within agricultural research. The cited Keras work describes that these models are deployed in agriculture to aid in prediction, feature extraction, and fine-tuning. The performance of CNNs is very sensitive to the complexity of their underlying architectures. Several well-known CNN architectures have been developed and studied for image classification.

TABLE 1
Datasets description for the selected vegetables.

TABLE 2
Image segmentation approaches with their advantages and drawbacks.

TABLE 3
ML supervised classification algorithms.

TABLE 5
Tomato vegetable classification using AI.

TABLE 4
The tomato crop diseases with their symptoms based on causative agents (bacteria, virus, and fungus).

TABLE 6
The chilli crop diseases with their symptoms based on causative agents (Bacteria, Virus, and Fungus).
Water-soaked lesions on leaves; black, circular spots with a yellow halo; leaf curling and wilting (Ahmad Loti et al., 2021).
Bacterial Streak (Bacteria): Long, brown streaks on leaves; lesions coalesce, causing dieback of plant parts (Dhasan et al., 2022).
Bacterial Blight (Bacteria): Wilting and yellowing of lower leaves; stunted growth; brown, slimy vascular tissue on leaves (Houetohossou et al., 2023).
To spot obvious signs of disease automatically, researchers examined 974 self-collected images of chilli leaves from Malaysia. They used three machine learning classifiers, an SVM, an RF, and an ANN, with features extracted by six classical and six DL-based methods. Combined with the SVM classifier, the DL strategies surpassed the conventional approaches with an accuracy rate of 92.10% (Ahmad Loti et al, 2021).

TABLE 7
Chilli vegetable classification using AI.

TABLE 8
The potato crop diseases with their symptoms based on causative agents (Bacteria, Virus, and Fungus).
Potato Ring Rot (Virus): Brown, necrotic ring-like lesions in the vascular system of the potato plant, leading to wilting and eventual plant death.

TABLE 9
Potato vegetable classification using AI.

TABLE 10
The cucumber crop diseases with their symptoms based on causative agents (bacteria, virus, and fungus).
The method of Zhang et al. (2017), which combines K-means clustering, shape/color feature extraction, and sparse representation (SR) classification, overcomes the limitation of treating features equally, achieving efficient computation and improved performance. Various cucumber diseases were classified, such as mildew, bacterial, and powdery mildew. Compared to four other methods, the SR classifier effectively recognizes seven major cucumber diseases, achieving an 85.7% overall recognition rate.

TABLE 11
Cucumber vegetable classification using AI.
's performance. Various factors, including the kind of sample-taking instrument, light intensity, time of day, and amount of moisture, impact the precision of forecasts. Therefore, it is crucial to integrate training and immediate implementation of the automated disease prediction model to tackle these issues efficiently.