
ORIGINAL RESEARCH article

Front. Plant Sci., 17 December 2025

Sec. Sustainable and Intelligent Phytoprotection

Volume 16 - 2025 | https://doi.org/10.3389/fpls.2025.1710188

This article is part of the Research Topic: Innovative Field Diagnostics for Real-Time Plant Pathogen Detection and Management.

Comprehensive AI framework for automated classification, detection, segmentation, and severity estimation of date palm diseases using vision-language models and generative AI

  • Department of Computer Engineering, College of Computer Sciences and Information Technology, King Faisal University, Al Ahsa, Saudi Arabia

The date palm (Phoenix dactylifera L.) is a vital crop in arid and semi-arid regions, contributing over $13 billion annually to the global economy. However, it faces significant yield losses due to pests, such as the red palm weevil, and diseases, including Bayoud and Black Scorch. Currently, expert visual inspection is the primary method of management, but it is time-consuming, subjective, and unsuitable for detecting large-scale or early-stage damage. Automated approaches based on classical machine learning offer limited improvements due to their lack of generalizability and sensitivity to environmental conditions. Recent deep learning methods, such as CNNs and Vision Transformers, have improved classification accuracy but treat classification, detection, segmentation, and severity estimation as separate problems. This paper proposes an integrated Reveal-Aware Hybrid Vision-Language and Transformer-based AI framework that combines GAN-based augmentation for synthetic image generation, CLIP for multimodal classification, PaliGemma-2 for text-based detection, Grounding DINO + SAM 2.1 for zero-shot segmentation, and a Vision Transformer regression model for severity prediction. This end-to-end explainable diagnostic pipeline achieved 98% classification accuracy, 95.8% precision, 91.3% recall, and 94.2% F1-score across two datasets: nine classes of infected date palm leaves and three classes of date palm diseases. The proposed framework demonstrated detection accuracy of 94-98%, high-quality segmentations, and reliable severity estimates. This integrated approach highlights the potential of combining AI, vision-language models, and transformers for scalable, accurate, and sustainable plant disease management.

1 Introduction

Date palm (Phoenix dactylifera L.) is one of the most important economic and cultural crops of arid and semi-arid regions, particularly in the Middle East, North Africa, and South Asia. This highly drought-tolerant species is a vital source of nutrition, income, and ecosystem stability for several million people in these regions. The global date fruit industry is valued at over $13 billion annually, supporting entire communities through cultivation, processing, and export-related markets. Unfortunately, this vital crop is losing productivity, and in some areas its very survival is threatened by a range of biotic stressors that increasingly affect cultivated date palms. The red palm weevil (Rhynchophorus ferrugineus) has been one of the most damaging of these stressors. This invasive beetle has spread across infested territories, devastating date palm groves, some of which have lost their standing palms entirely.

Every bit as detrimental as the red palm weevil are several fungal pathogens, notably Fusarium oxysporum f. sp. albedinis (the pathogen responsible for Bayoud disease) and Thielaviopsis punctulata, which causes Black Scorch disease (Ahmed and Ahmed, 2023). These pathogens spread rapidly through plantations, killing trees through advancing, systemic infections that involve wilting and necrotic tissue and ultimately lead to the tree’s death. Fungal infections are further complicated by how rapidly they can spread to adjacent trees through soil and water. The Bayoud pathogen alone has killed more than 12 million date palms in North Africa since it was first detected. The economic impact is overwhelming: estimated losses in affected areas exceed $800 million annually, driven by yield reductions of 40-70% in heavily infected areas, increased expenditure on pesticides, and government tree removal programs. Standard disease management relies on agricultural extension workers visually identifying disease symptoms, which is limited by human subjectivity and the time required to train experts. None of the available disease management approaches can determine when the disease is in the early stages of infection, when intervention would be most effective (Kamal et al., 2025). Even when extension workers do detect severe infections, they inspect individual date palms rather than monitoring the plantation as a whole; this is labor intensive and does not provide continuous surveillance across large plantation areas, so infections spread unchecked and are identified only after the palm has undergone permanent, irreversible damage (Nadeem et al., 2025). This limited coverage in detecting disease symptoms calls for technology-driven approaches that enable accurate and reliable diagnoses to be made rapidly and at scale. The ideal system is an automated diagnostic platform that integrates multiple diagnostic dimensions, including differentiating disease types, locating infected tissue, quantifying infection severity, and providing treatment recommendations, all in the field under real-world conditions. Such a platform would transform date palm disease management practices, protect crops, avert agricultural losses running into the billions, reduce excess pesticide use by coupling treatment to the identification and extent of infection, and support crop sustainability for future generations (Namoun et al., 2024). Historically, date palm disease management began with traditional methods that depended on the knowledge of trained agronomists and farmers. The date palm was evaluated by examining the leaves, trunks, or fruits to identify symptoms of illness, such as blemishes, wilting, or the presence of insects (Arafat et al., 2025).
Although they rely on a practical knowledge base, traditional management methods for date palm disease are slow, labor-intensive, and prone to human error. Furthermore, conventional methods are useful only once visual symptoms are present, by which point a large proportion of the crop is often already damaged and disease control measures have limited impact (Gibril et al., 2021).

The shift from manual inspection to automation marked progress in image-based detection of date palm diseases, with early developments using traditional machine learning (ML). These early efforts employed conventional computer vision algorithms to extract handcrafted features, including texture descriptors (e.g., Local Binary Patterns, Grey-Level Co-occurrence Matrices), color histograms, and geometric features (Hessane et al., 2025). These features were fed to classical classifiers (e.g., Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Random Forests (RF)) to distinguish between healthy and diseased palms (Hessane et al., 2023b). This approach represented an improvement over subjective human evaluation, but it had significant limitations. First, model performance depended on the quality and discriminating power of the engineered features, which for the most part did not adequately capture the range of visual patterns associated with early-stage infections of date palm diseases. Second, when tested in new environments, the quality of the derived features degraded significantly with changes in lighting or camera angle, further limiting their use for classification. Third, these models generalized poorly because of intra-class variability in disease symptoms and inter-class resemblance among diseases, which resulted in excessive false positives and missed detections. Moreover, the feature extraction process itself was computationally intensive, and domain knowledge was required to create high-quality feature sets for different diseases, rendering the systems inflexible and difficult to adapt to novel pathogens or changing environmental conditions. These drawbacks highlighted the need for more robust and adaptive approaches that could automatically learn discriminative features from data, leading to the emergence of deep learning techniques for agricultural disease detection. A major transition occurred when deep learning (DL), and most notably convolutional neural networks (CNNs), gained widespread use. CNNs introduced automated feature extraction that learns complex hierarchical representations from raw pixels (Safran et al., 2024). This shift made models much more robust and accurate. The research community has since adopted and deployed various CNN architectures, including pre-trained models such as VGG, ResNet, and InceptionV3, to classify date palm diseases with high accuracy (Sagheer et al., 2025). Detection architectures such as YOLO have also been used to localize infected areas with bounding boxes (Nadeem et al., 2025). Nevertheless, while these methods are highly effective on their individual tasks, classification, detection, and segmentation still require separate models, which is computationally and operationally expensive. Building on the success of CNNs, transformer-based models represent another potential path forward.
Transformer architectures were initially developed for natural language processing (NLP); Vision Transformers (ViT) and close relatives such as Swin Transformers apply self-attention to image data, allowing them to learn long-range dependencies among pixel regions and achieve both global and local understanding of an entire image (Sagheer et al., 2025). These advantages can be significant for recognizing complex, distributed, or atypical disease patterns that CNNs may overlook. However, these models are still fundamentally unimodal, vision-only models: they recognize only what is in the visual data and cannot incorporate or reason with text-based information, which is a rich source of knowledge about plants and is integral to agronomists’ diagnostic processes (Bouthaina et al., 2024). A notable gap in the existing literature is the absence of an integrated, multi-task framework that utilizes both visual and linguistic information to deliver a more comprehensive and explainable diagnosis. The current literature treats classification, detection, segmentation, and severity prediction as distinct tasks, most commonly solved in isolation with separate models, which fragments understanding and produces a disjointed diagnostic workflow that omits the extensive context provided by expert descriptions and annotations. Therefore, it is logical to develop a framework that can complete all of these tasks in a single system and produce human-interpretable results. This research fills this gap by proposing a new multifaceted AI framework that incorporates the cutting-edge capabilities of advanced Vision-Language Models (VLMs) and generative AI. Specifically, a comprehensive pipeline was created in which multiple state-of-the-art models execute a chain of tasks accurately and efficiently. Rather than a simple classifier, our framework is an end-to-end diagnostic and severity assessment solution. The principal innovations proposed are:

● Generative AI for Data Augmentation: I employ a GAN-based generative model to augment rare and underrepresented disease categories in the Infected Date Palm Leaves Dataset and the Palm Disease Dataset, aiming to achieve a more balanced and diverse representation.

● CLIP for Multimodal Classification: CLIP is a vision-language model that aligns images with natural language. I use CLIP to allow for prompt-based classification. CLIP enables strong classification performance, even in a few-shot or zero-shot manner, using textual descriptions of diseases.

● PaliGemma-2 for Language-Driven Detection: I leverage PaliGemma-2, a vision-language detection model that can ground objects based on text prompts, thereby facilitating comprehensible and flexible disease localization.

● Grounding DINO + SAM 2.1 for Prompt-Based Segmentation: I incorporate Grounding DINO to parse language queries and create region proposals, and Segment Anything Model (SAM 2.1) for instance segmentations without pixel-level annotations.

● ViT Regression and Segmentation Verification for Severity Estimation: I propose a vision transformer-based regression model for severity scoring, modified with a segmentation verification module to detect inconsistencies between predicted severity and visually segmented disease areas.

2 Literature review

The use of artificial intelligence to detect disease in date palm trees has come a long way from the early days of machine learning and deep learning. The first research in this area relied on image processing and conventional computer vision techniques, where disease detection depended on handcrafted features. For example, Al-Hiary et al. (2011) and Mostajer Kheirkhah and Asghari (2019) showed how texture analysis using Local Binary Patterns (LBP) and Gray-Level Co-occurrence Matrices (GLCM), together with color histograms and classifiers such as Support Vector Machines (SVM) and Random Forests, could extract image characteristics that informed date palm disease detection. These approaches achieved moderate success, with reported accuracies in the 75-85% range, but showed severe limitations when applied to different environments and disease presentations.

The authors (Reddy et al., 2024) explain that the drawbacks of conventional manual visual inspections include being labor-intensive, subjective, and slow to implement for larger plantations. Due to the human aspect, manual inspections may not be reproducible, and it could be challenging to identify subtle symptoms early enough for timely intervention. Recent advances in deep learning have revolutionized plant disease detection, with Convolutional Neural Networks (CNNs) yielding significantly better results than traditional methods. For example, a CNN system for detecting Red Palm Weevil infestations was developed (Nabil et al., 2025). This system reported achieving 93% accuracy by automatically learning and identifying discriminative features from images. On a similar note, a modified VGG16 architecture for palm disease classification was reported with an accuracy of 96% on a controlled dataset (Reddy et al., 2024). These methods rely on deep learning models (i.e., CNNs) to learn features automatically, eliminating or reducing the time and skill required to manually derive features in traditional methods, while achieving high detection efficiency.

Recent work has also incorporated object detection frameworks for disease detection. For example, Ramalingam et al. used YOLOv3 to detect five palm diseases with a mean average precision (mAP) of 0.89. However, object detection systems are limited by the need for large labelled datasets and do not utilize textual or contextual information. Vision transformers (Dosovitskiy et al., 2020) exhibit more efficient learning properties than previous systems, thanks to self-attention, but they have not yet been investigated for detecting palm diseases.

In addition to classical CNNs, state-of-the-art deep learning models, especially transformers, are emerging as a strong alternative. While applications of the Swin transformer specifically to date palm leaf disease classification are only beginning to appear, reports on other plant diseases illustrate that the Swin transformer is quite robust. One application of an efficient Swin transformer for general plant disease classification reported a precision of 80.14% and a recall of 76.27% on the PlantDoc dataset, a significant improvement over the plain Swin-T (Liu and Zhang, 2025). Additionally, a dual-track feature fusion model incorporating the Swin Transformer for grape leaf diseases produced an accuracy of 97.6%, with precision, recall, and F1-score all around 96.60% (Makhija, 2024). This suggests that the Swin transformer is capable of capturing complex features and providing high precision in plant disease classification.

In recent years, the computer vision community has adopted transformer-based architectures, building on the advances seen in natural language processing. Vision transformers (ViT) and related variants (e.g., Swin transformers) utilize self-attention mechanisms, which enable them to capture global dependencies in an image and thereby provide a more comprehensive view of disease patterns than the limited receptive fields of CNNs (Sagheer et al., 2025). Reported studies have shown that vision transformers achieve state-of-the-art results on plant disease classification and segmentation tasks, suggesting further promise for addressing some of the limitations of CNNs (Ahmed and Ahmed, 2023). Nevertheless, they still have fundamental limitations: they remain susceptible to bias, and they accept only image data, without the rich, descriptive text that carries real-world context, limiting their ability to provide comprehensive, expert-like assessments. A summary of the literature on palm date classification, detection, and segmentation systems is given in Table 1.

Table 1

Table 1. Literature review summary for palm date classification, detection, and segmentation systems.

3 Proposed methodology

The proposed method, shown in Figure 1, offers a robust and modular AI framework for fully automated analysis of date palm diseases. It incorporates state-of-the-art Vision-Language Models (VLMs) and generative AI techniques for disease classification, detection, segmentation, and prediction of disease severity in a single system. The framework draws on two datasets, the Infected Date Palm Leaves Dataset and the Palm Disease Dataset, and organizes the process sequentially such that the outputs from one stage serve as the input for the next. The initial phase addresses the scarcity of agricultural image data through a Generative Adversarial Network (GAN). The generator produces realistic disease images conditioned on the given images, and the discriminator learns to distinguish between real and generated images. Through this game-theoretic structure, the model generates realistic and diverse augmented images that enhance data richness and improve model generalizability for subsequent tasks. After the data augmentation step, the framework uses a fine-tuned CLIP (Contrastive Language-Image Pre-training) model for disease classification. CLIP employs a zero-shot learning strategy: it embeds descriptive text prompts of the disease type into the same embedding space as the image embedding and then computes the similarity between the two embeddings to classify the disease type. Hence, CLIP enables classification without extensive labeled data and without retraining or fine-tuning for each new disease type. Next, the PaliGemma-2 model is used to localize the disease. A real palm image is processed by the visual encoder (SigLIP), while the disease class identified in the previous step is processed by the language model (Gemma 2B) as a text-based query. The model uses the two inputs to detect and locate the disease, and outputs a bounding box indicating the portion of the palm leaf where the disease is present. The output bounding box serves as the input for the segmentation stage, which utilizes Grounding DINO and SAM 2.1. Grounding DINO is a general open-vocabulary detector that localizes disease regions in the image based on both the input text and the image. The previously detected disease bounding box is then used by the Segment Anything Model (SAM) version 2.1 to separate the diseased leaf area from the healthy tissue, producing fine-grained segmentation masks for the localized diseased areas. Finally, the framework predicts disease severity using two complementary measures. First, the segmented image is analyzed to determine the percentage of infected leaf area, which serves as a direct measure of severity. Second, the segmented image is analyzed by an optimized ViT regression model that predicts disease severity. The severity predictions from both methods are jointly evaluated for accuracy, and severity is represented in a histogram as mild, moderate, or severe. The final output provides date palm growers and agronomists with timely and reliable decision tools for disease management.

Figure 1
Flowchart illustrating a machine learning process for data augmentation, classification, segmentation, and severity prediction of date palm images. The top includes GAN architecture for generating images and CLIP for classification. The middle demonstrates PaliGemma-2 Mix for detection and image segmentation using Grounding DINO and SAM2. The bottom shows severity prediction with segmented image analysis and corresponding severity histograms classified as mild, moderate, or severe. Arrows indicate the progression between steps.

Figure 1. Hybrid VLM & transformer-based proposed methodology for classification, detection, segmentation and severity prediction.
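To make the data flow between the five stages concrete, the following minimal Python sketch shows how the stage outputs chain together. The function names and signatures are illustrative placeholders standing in for the concrete components described in Sections 4.1-4.4, not the authors' released code.

```python
# Minimal orchestration sketch of the pipeline in Figure 1.
# All stage functions below are hypothetical placeholders.

def classify_with_clip(image) -> str: ...              # Section 4.1: zero-shot CLIP classification
def detect_with_paligemma(image, label) -> tuple: ...  # Section 4.2: text-grounded bounding box
def segment_with_dino_sam(image, label, box): ...      # Section 4.3: pixel-level disease mask
def estimate_severity(image, mask) -> tuple: ...       # Section 4.4: ViT regression + area ratio

def diagnose(image):
    label = classify_with_clip(image)                  # disease class from text prompts
    box = detect_with_paligemma(image, label)          # (xmin, ymin, xmax, ymax)
    mask = segment_with_dino_sam(image, label, box)    # binary mask of infected tissue
    area_pct, severity = estimate_severity(image, mask)
    return {"disease": label, "box": box,
            "infected_area_pct": area_pct, "severity": severity}
```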

4 Materials and methods

This paper presents a curated image dataset designed to aid the detection and classification of the most significant date palm leaf diseases and nutritional deficiencies. The dataset covers eight common conditions: three physiological (potassium deficiency, manganese deficiency, and magnesium deficiency), four fungal (black scorch, leaf spots, fusarium wilt, and rachis blight), and one pest (Parlatoria blanchardi), in addition to healthy leaf samples that establish a baseline. A total of 608 raw images were collected over three months in the autumn and spring seasons from 10 farms in the Madinah region of Saudi Arabia, using smartphones and an SLR camera. Only infected and diseased leaves and leaflets were included; fruits, trunks, and root samples were excluded. After collection, images were filtered, cropped, augmented, and labeled to create a processed dataset of 3,089 images suitable for deep learning model training. The original images had resolutions ranging from 2778 × 1284 to 6000 × 4000 pixels and were standardized to 300 × 300 pixels. Images were captured at distances ranging from 15 cm to 100 cm, at various angles, and outdoors under different lighting conditions to enhance the model’s generalizability. The dataset therefore serves as a diverse and robust resource for developing an intelligent system to detect and manage palm diseases automatically. The full dataset is publicly available through the Mendeley Data Repository (Namoun et al., 2024), with sample images provided in Figure 2.

Figure 2
Nine images display various palm leaf disorders. The first row shows leaf discoloration due to potassium deficiency and manganese deficiency. The second row features black scorch with dark spots, leaf scorching, and wilted fronds. The third row includes rachis blight with a brown rachis, infection from a particular fungus, and physical damage to the leaf.

Figure 2. A sample of infected date palm leaves diseases.

The datasets utilized in this research are essential for training and validating the proposed multidimensional AI system. As can be seen in Table 2, the Date Palm Disease Dataset consists of three major classes: Healthy Leaves, which serve as a benchmark for disease-free conditions; Brown Spots Disease, whose images show different fungal- and bacterial-associated infections at various levels of severity; and White Scale Infection, which consists of photos of damage caused by the pest Parlatoria blanchardi. These conditions pose significant concerns for agriculture worldwide. Having such well-labeled, varied images (as shown in Figure 3) is essential for training generative AI models and robust disease detection. The successful implementation of the framework supports its broader application in food crop diagnostics.

Table 2

Table 2. Summary of date palm diseases.

Figure 3
Three side-by-side close-up images of a plant leaf. The left image shows brown spots labeled as leaf disorder. The middle image shows a leaf with white scales, also labeled as leaf disorder. The right image depicts a healthy leaf with no visible spots or scales.

Figure 3. A sample of date palm diseases.

GAN-based augmentation offers significant advantages over augmentation based on flipping or rotation because it generates completely new, realistic images of palm diseases. The GAN uses a generator–discriminator framework to create high-quality, diverse images that closely approximate the real samples from the Infected Date Palm Leaves and Palm Disease datasets. This process yields the Augmented Dataset, enriched with new patterns, textures, and contexts that enhance model robustness and generalizability while mitigating overfitting. Expanding the dataset to 2k–5k images per class provides an appropriate foundation for downstream tasks. Tables 3–5 compare the original datasets with the GAN-augmented datasets for the 3-class and 9-class cases and report the augmentation metrics.
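As a concrete illustration of the generator–discriminator game described above, the following PyTorch sketch shows a minimal DCGAN-style training step. The network sizes, the 64 × 64 working resolution, and the hyperparameters are illustrative assumptions rather than the exact configuration used to build the augmented datasets.

```python
# Minimal DCGAN-style training step for synthetic leaf-image augmentation.
import torch
import torch.nn as nn

latent_dim = 100

generator = nn.Sequential(                       # z (B, 100, 1, 1) -> image (B, 3, 64, 64)
    nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),
    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(True),
    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
)
discriminator = nn.Sequential(                   # image (B, 3, 64, 64) -> real/fake logit (B, 1)
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, True),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, True),
    nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, True),
    nn.Conv2d(256, 1, 8, 1, 0), nn.Flatten(),
)

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

def train_step(real_images):                     # real_images: (B, 3, 64, 64) scaled to [-1, 1]
    b = real_images.size(0)
    fake_images = generator(torch.randn(b, latent_dim, 1, 1))

    # Discriminator: push real images toward label 1 and generated images toward 0.
    d_loss = bce(discriminator(real_images), torch.ones(b, 1)) + \
             bce(discriminator(fake_images.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator label generated images as real.
    g_loss = bce(discriminator(fake_images), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```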

Table 3

Table 3. Augmented date palm disease dataset.

Table 4

Table 4. Augmented infected date palm leaves dataset.

Table 5

Table 5. Key augmentation metrics.

4.1 Multimodal vision language modelling for date palm analysis

As illustrated in Figure 4, CLIP (Contrastive Language-Image Pre-training) facilitates zero-shot disease classification as part of our AI framework. CLIP consists of dual encoders: a ViT for images and a Transformer for text. It maps both modalities into a shared embedding space, allowing dynamic class prediction from natural language prompts with no task-specific training. For example, the prompt “date palm leaf with Fusarium wilt lesions” is compared to images using cosine similarity to classify each image (Radford et al., 2021). CLIP utilizes natural language to support flexible class definitions, classification of unseen diseases, and intuitive supervision (e.g., “advanced Red Palm Weevil damage”). To mitigate domain gaps, we fine-tuned CLIP using agricultural corpora and enhanced our prompts with pathology-specific terms (e.g., “necrotic tissue with white waxy coatings” to identify Parlatoria blanchardi) (Monteiro et al., 2021). Ultimately, CLIP provides flexible and explainable classification that requires minimal labeled data and transfers easily to downstream whole-image tasks, such as detection and segmentation.

Figure 4
Flowchart illustrating the CLIP architecture for classification. It shows a text prompt “Image with date Palm Infected Leaves” processed by a CLIP model to create an embedding. Images of palm leaves are also processed to generate embeddings. Relevance is calculated, resulting in classified outputs showing infected leaves.

Figure 4. CLIP architecture for date palm disease classification.

4.1.1 Joint embedding space

Image encoder (ViT): $v = f_{\text{vision}}(I)$   (1)

Text encoder (Transformer): $t = f_{\text{text}}(T)$   (2)

where $v, t \in \mathbb{R}^{512}$ are L2-normalized embeddings.

4.1.2 Prompt engineering for disease

For $N$ disease classes, define semantic prompts:

$T_k = \text{``A photo of a date palm with } D_k\text{''}$   (3)

for $k = 1, 2, \ldots, N$, where $D_k$ denotes the disease term (e.g., “Fusarium wilt lesions”).

4.1.3 Similarity computation

The classification score for image $I$ is calculated via cosine similarity:

$s_k(I) = \langle v(I), t(T_k) \rangle = v^{\top} t_k$   (4)

with a temperature-scaled softmax:

$P(y = k \mid I) = \dfrac{e^{s_k(I)/\tau}}{\sum_{j=1}^{N} e^{s_j(I)/\tau}}$   (5)

where $\tau = 0.01$ by default.

4.1.4 Training free inference

The predicted class is simply:

$\hat{y} = \operatorname*{arg\,max}_{k} \; s_k(I)$   (6)
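The training-free inference described by Equations (1)–(6) can be sketched with the publicly available CLIP port in the Hugging Face transformers library. The checkpoint name, prompt wording, and image path below are illustrative assumptions; the paper additionally fine-tunes CLIP on agricultural data.

```python
# Zero-shot CLIP classification sketch following Equations (1)-(6).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

diseases = ["potassium deficiency", "manganese deficiency", "magnesium deficiency",
            "black scorch", "leaf spots", "Fusarium wilt lesions",
            "rachis blight", "Parlatoria blanchardi infestation", "healthy leaf"]
prompts = [f"A photo of a date palm with {d}" for d in diseases]   # Equation (3)

image = Image.open("leaf.jpg")                                      # illustrative path
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)                    # Equations (4)-(5)
pred = diseases[probs.argmax(dim=-1).item()]                        # Equation (6)
print(pred, probs.max().item())
```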

4.2 Disease detection using PaliGemma-2

The PaliGemma-2 Mix detection stage, shown in Figure 5, illustrates the next step of our framework, focused on accurately locating date palm diseases. PaliGemma-2 is a robust Vision-Language Model (VLM) architecture intended for object detection with multimodal grounding. While traditional object detection models accept only images, PaliGemma-2 accepts both visual data (images) and textual data to produce accurate bounding boxes around the disease (Feng et al., 2025). The overall architecture can be explained through the following key components and steps:

Figure 5
Flowchart titled “PaliGemma-2 Mix for Detection” showing the process for detecting date palm leaf diseases. It includes a Contrasting Vision Encoder, Linear Projection, and a Gemma 2B Language Model with a Transformer Decoder. The input is original date palm images, and the detected output highlights potential anomalies.

Figure 5. PaliGemma-2 mix architecture for date palm disease detection.

4.2.1 Visual and textual encoding

This process starts with two separate but interconnected inputs. A real date palm image is fed into the SigLIP-400M vision encoder, which processes the image to extract semantically rich, high-level visual features that are transformed into a compact vector representation (the visual embedding), $v \in \mathbb{R}^{d_v}$. At the same time, a text query containing the disease label classified in the previous stage (e.g., “brown spots disease”) is fed into the tokenizer, which converts the text into a sequence of tokens used to create a text embedding, $t \in \mathbb{R}^{d_t}$.

4.2.2 Linear projection and feature fusion

The visual and textual embeddings, $v$ and $t$, are then aligned and fused. A key component is a linear projection layer that projects the visual embedding into the same dimensional space as the text embedding, enabling the two modalities to interact. The fused embedding is then passed to the core of the model.

4.2.3 Gemma 2B language model (transformer decoder)

The multimodal features are processed by the Gemma 2B language model, a pre-trained transformer decoder that jointly reasons over image and language tokens. Its self-attention mechanism enables the decoder to attend to both the image and the text query simultaneously and to predict bounding box coordinates (Li et al., 2025). The output is a sequence of tokens encoding the bounding box coordinates, $(x_{\min}, y_{\min}, x_{\max}, y_{\max})$, for the detected disease. The final output prediction can be conceptually represented as a function $f$:

$f(\text{Image}, \text{Text}) = \text{Predicted Bounding Box} = (x_{\min},\, y_{\min},\, x_{\max},\, y_{\max})$   (7)

The combination of a highly capable vision model and a sophisticated language model allows our framework to perform highly accurate, context-aware disease detection. The detected output is an image with a bounding box placed precisely at the location of the disease, providing the critical information needed for targeted treatment of the affected palm and representing a distinct advance over simple classification.
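A minimal sketch of this text-grounded detection step, using the Hugging Face PaliGemma port, is shown below. The checkpoint ID, the "detect …" prompt format, and the `<locXXXX>` coordinate convention (four tokens binned to 0–1023 in the order y_min, x_min, y_max, x_max) are assumptions based on the public PaliGemma releases, not the authors' implementation.

```python
# Text-grounded bounding-box detection sketch with PaliGemma (Equation (7)).
import re
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma2-3b-mix-448"          # illustrative checkpoint choice
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("leaf.jpg")                     # illustrative path
prompt = "detect brown spots disease"              # class label taken from the CLIP stage

inputs = processor(text=prompt, images=image, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
decoded = processor.decode(out[0], skip_special_tokens=False)

# Parse "<locYYYY><locXXXX><locYYYY><locXXXX> label" into pixel coordinates.
locs = [int(v) for v in re.findall(r"<loc(\d{4})>", decoded)[:4]]
if len(locs) == 4:
    y0, x0, y1, x1 = [v / 1023 for v in locs]      # assumed 0-1023 normalization
    W, H = image.size
    box = (x0 * W, y0 * H, x1 * W, y1 * H)         # (xmin, ymin, xmax, ymax)
    print("Predicted bounding box:", box)
```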

4.3 Disease segmentation using grounding DINO and SAM-2.1

In this research, we utilize Grounding DINO together with the Segment Anything Model (SAM) in a two-step approach to achieve accurate, prompt-based instance segmentation of date palm disease symptoms, as shown in Figure 6. Grounding DINO performs zero-shot object detection, recognizing and localizing objects from natural language prompts alone without task-specific training. This is particularly useful for agricultural applications because disease symptoms vary widely and new disease categories may emerge in the field. In this study, I prompt the model with the inputs “K-Deficient,” “Mn-Deficient,” “Black Scorch,” “Leaf Spots,” “Fusarium Wilt,” “Rachis Blight,” “Parlatoria blanchardi,” and “Healthy” so that it draws bounding boxes around areas of interest on palm leaves (Singh et al., 2025). These bounding boxes are then input to the SAM 2.1 model, which converts them into pixel-level segmentation masks. SAM is extremely capable of taking bounding boxes and segmenting individual instances with precise accuracy, even against complex backgrounds or intertwined leaf matter. This sequential workflow, which we call “SAM Grounding,” enables rapid, label-free segmentation driven by textual input and visual context, bypassing the need for manual pixel-wise annotation. Combined, Grounding DINO and SAM provide our framework with language-grounded visual comprehension and fine-grained segmentation, offering a scalable and interpretable approach to annotating and analyzing large-scale agricultural disease datasets (Liu et al., 2024), including the Infected Date Palm Leaves Dataset and the Palm Disease Dataset. These two sequential steps form the core of our explainable segmentation module and enable downstream tasks, including severity estimation. The mathematical representation of Grounding DINO and SAM is formulated below:

Figure 6
Diagram illustrating the architecture of a segmentation model using Grounding DINO and SAM2. It shows a flowchart with components like “U-net,” “Multi-model decoder,” and “Feature Enhancer” indicating the process of transforming input text and images into segmented outputs. The right panel includes components like “Prompt Encoder,” “Patch Embedding,” and “Mask Decoder.” Arrows show the data flow between components, from input text and images to segmented images.

Figure 6. Grounding DINO + SAM 2.1 architecture for date palm disease segmentation.

Mask generation: given a bounding box $b$ and image features $F_I$, SAM generates a segmentation mask $M$:

$M = f_{\text{SAM}}(b, F_I)$   (8)

where $f_{\text{SAM}}$ is the function representing the SAM model.

Self-attention mechanism: SAM employs a self-attention mechanism to refine the segmentation mask:

$M = \sigma\!\left(W_M \cdot (Q_M K_M^{\top})\right)$   (9)

where $Q_M$ and $K_M$ are query and key embeddings derived from the image features, $W_M$ is a transformation matrix, and $\sigma$ is the sigmoid function producing pixel-wise probabilities.
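The two-step "SAM Grounding" workflow of Equations (8)–(9) can be sketched as follows with the Hugging Face ports. The checkpoint names and thresholds are illustrative assumptions, and the widely available `SamModel` port (original SAM) is used here as a stand-in for SAM 2.1.

```python
# Prompt-driven segmentation sketch: Grounding DINO proposes boxes from text,
# SAM converts each box into a pixel-level mask.
import torch
from PIL import Image
from transformers import (AutoProcessor, GroundingDinoForObjectDetection,
                          SamModel, SamProcessor)

image = Image.open("leaf.jpg")                          # illustrative path
query = "black scorch lesion."                          # lowercase, dot-terminated text prompt

dino_proc = AutoProcessor.from_pretrained("IDEA-Research/grounding-dino-base")
dino = GroundingDinoForObjectDetection.from_pretrained("IDEA-Research/grounding-dino-base")
inputs = dino_proc(images=image, text=query, return_tensors="pt")
with torch.no_grad():
    dino_out = dino(**inputs)
boxes = dino_proc.post_process_grounded_object_detection(
    dino_out, inputs.input_ids, box_threshold=0.3, text_threshold=0.25,
    target_sizes=[image.size[::-1]])[0]["boxes"]        # (xmin, ymin, xmax, ymax) per region

sam_proc = SamProcessor.from_pretrained("facebook/sam-vit-base")
sam = SamModel.from_pretrained("facebook/sam-vit-base")
sam_inputs = sam_proc(image, input_boxes=[[boxes[0].tolist()]], return_tensors="pt")
with torch.no_grad():
    sam_out = sam(**sam_inputs)                         # M = f_SAM(b, F_I), Equation (8)
masks = sam_proc.image_processor.post_process_masks(
    sam_out.pred_masks.cpu(), sam_inputs["original_sizes"].cpu(),
    sam_inputs["reshaped_input_sizes"].cpu())

infected_pct = 100.0 * masks[0][0, 0].float().mean().item()   # infected-area ratio for severity
print(f"Infected area: {infected_pct:.1f}%")
```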

4.4 Severity prediction using ViT and regression head

To achieve precise quantification of disease progression in infected date palm leaves, our framework utilizes a Vision Transformer (ViT) architecture with a regression head. This module predicts the severity level of the disease, either as a continuous value indicating the percentage of infected leaf area or categorized into three levels: “Mild,” “Moderate,” or “Severe.” This severity scoring enables the characterization of K-deficiency, Mn-deficiency, Black Scorch, Leaf Spots, Fusarium wilt, Rachis Blight, Parlatoria blanchardi, and Healthy samples (Zhao et al., 2025). The module receives as input refined leaf images that have been pre-segmented by the previous pipeline stages using Grounding DINO and SAM 2.1. The images are divided into fixed-size patches, which are linearly projected into a sequence of embedding vectors. To account for spatial relationships across the entire leaf surface and enable the model to learn spatial patterns and lesion distribution, positional encodings are added to the embedding sequence. The sequence is then passed through a multi-head self-attention transformer encoder to capture long-range dependencies and contextual interactions between regions of the leaf (Shahid et al., 2023). The primary advantage of the ViT architecture, global attention, enables single-image analysis of complex symptom distributions that can occur across the full surface of a leaf. Understanding these complex spatial relationships is crucial, as field pathogens often exhibit subtle, heterogeneous disease expression shaped by leaf geometry and lighting conditions (Slika et al., 2023). The learned high-level visual features are then processed by a regression head, typically constructed as a multi-layer perceptron (MLP), which maps the transformer output to a single numerical severity score. This provides interpretable and quantifiable decision support to agronomists, enabling them to prioritize agricultural interventions and monitor disease over time. Integrating this module with the larger vision-language pipeline enables our framework to support end-to-end, explainable analysis, effectively linking visual patterns with agronomic severity scoring (Lahlali et al., 2024).
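A minimal sketch of the ViT backbone with an MLP regression head is shown below using the timm library. The backbone choice, head width, loss, and the mild/moderate/severe cut-offs are illustrative assumptions rather than the paper's tuned configuration.

```python
# ViT regression head for severity scoring: outputs a single value in [0, 1],
# interpreted here as the fraction of infected leaf area.
import timm
import torch
import torch.nn as nn

class SeverityViT(nn.Module):
    def __init__(self):
        super().__init__()
        # num_classes=0 makes timm return pooled features instead of logits
        self.backbone = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=0)
        self.head = nn.Sequential(
            nn.Linear(self.backbone.num_features, 256), nn.GELU(),
            nn.Linear(256, 1), nn.Sigmoid(),           # severity as an infected-area fraction
        )

    def forward(self, x):                              # x: (B, 3, 224, 224) segmented leaf crops
        return self.head(self.backbone(x)).squeeze(-1)

def severity_class(score: float) -> str:
    # Illustrative thresholds for the mild / moderate / severe bins
    return "mild" if score < 0.2 else "moderate" if score < 0.5 else "severe"

model = SeverityViT()
criterion = nn.MSELoss()                               # regression target: true infected fraction
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(4, 3, 224, 224)                        # stand-in batch of segmented leaf images
y = torch.tensor([0.10, 0.30, 0.60, 0.05])             # stand-in ground-truth severity fractions
loss = criterion(model(x), y)
loss.backward(); optimizer.step()
print(loss.item(), severity_class(model(x)[0].item()))
```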

5 Evaluation metrics

To holistically assess and compare the performance of the models on the classification and segmentation tasks, several metrics are employed, including accuracy, precision, recall, F1-score, and IoU. These metrics ensure a comprehensive evaluation of the models’ performance and reliability (Zafar et al., 2024).

$ACC = \dfrac{TP + TN}{TP + TN + FP + FN}$   (10)

$SEN = \dfrac{TP}{TP + FN}$   (11)

$SPE = \dfrac{TN}{TN + FP}$   (12)

$RE = \dfrac{TP}{TP + FN}$   (13)

$PR = \dfrac{TP}{TP + FP}$   (14)

$F1\text{-}score = \dfrac{2 \cdot PR \cdot RE}{PR + RE}$   (15)

$IoU = \dfrac{TP}{TP + FP + FN}$   (16)
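Equations (10)–(16) translate directly into code. The following sketch computes all metrics from confusion-matrix counts; the example counts are purely illustrative and are not results from the paper.

```python
# Direct implementation of Equations (10)-(16) from confusion-matrix counts.
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    acc = (tp + tn) / (tp + tn + fp + fn)   # Eq. (10)
    sen = tp / (tp + fn)                    # Eq. (11), also recall, Eq. (13)
    spe = tn / (tn + fp)                    # Eq. (12)
    pr = tp / (tp + fp)                     # Eq. (14)
    f1 = 2 * pr * sen / (pr + sen)          # Eq. (15)
    iou = tp / (tp + fp + fn)               # Eq. (16)
    return {"ACC": acc, "SEN": sen, "SPE": spe, "RE": sen, "PR": pr, "F1": f1, "IoU": iou}

print(metrics(tp=90, tn=85, fp=5, fn=10))   # illustrative counts only
```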

6 Experimental analysis

6.1 Hybrid vision language and transformer-based model performance on infected date palm leaves dataset

Figure 7 presents qualitative results of CLIP-based zero-shot classification on nine test cases from the infected date palm leaves dataset. Each test case shows a leaf image together with the predicted condition (e.g., manganese deficiency, Fusarium wilt, leaf spots, Parlatoria, rachis blight) and the associated confidence score, which ranges from roughly 21% to 60% across the cases. These prompt-driven predictions provide the initial assessment of leaf health that is subsequently refined by the detection and segmentation stages.

Figure 7
A grid of nine test case images displaying various plant leaf conditions with predictions. Test Case 1 shows manganese deficiency with 43.59% confidence. Test Case 2 shows Fusarium wilt with 40.79%. Test Case 3 is leaf spots at 49.05%. Test Case 4 highlights Fusarium wilt at 60.17%. Test Case 5 is parlatoria at 42.56%. Test Case 6 shows Fusarium wilt with 55.34%. Test Case 7 displays rachis blight at 21.47%. Test Case 8 highlights Fusarium wilt at 47.96%. Test Case 9 shows manganese deficiency with 48.32%.

Figure 7. Evaluation of CLIP for palm date disease classification on the infected date palm leaves dataset.

The training and validation performance for the infected palm leaf classification task, measured by accuracy (left) and loss (right), is displayed over four epochs in Figure 8. Training accuracy improved from ~72% to 88%, and validation accuracy improved from ~73% to 90%, indicating effective learning. Likewise, the training loss decreased from 0.65 to 0.35, and the validation loss decreased from 0.64 to 0.28, both indicating a reduction in error over the training epochs. The model showed these gains without significant evidence of overfitting, illustrating good generalization and reliability across the metrics. Figure 9 shows four line graphs of Precision, Recall, F1-Score, and Accuracy across the training epochs for the infected date palm leaves classification model. Each graph displays the training performance (solid lines) and validation performance (dashed lines) per epoch. All metrics improve steadily: precision, recall, and F1-score rise from the low 70s to above 85–90%, and accuracy improves from 72% to 90% (training) and from 70% to 88% (validation). The close similarity between the training and validation curves indicates effective learning and good generalization, with little evidence of overfitting.

Figure 8
Two line graphs depict classification metrics over epochs for infected date palm leaves. The left graph shows accuracy, with training accuracy in green and validation accuracy in gray, both improving over four epochs. The right graph shows loss, with training loss in pink and validation loss in gray, both decreasing over four epochs.

Figure 8. Accuracy and loss over epochs (infected date palm leaves dataset).

Figure 9
Four line graphs display metrics for palm tree leaves disease classification across five epochs. The top left graph shows precision improving from 70% to 88%. The top right graph presents recall increasing from 72% to 90%. The bottom left graph depicts the F1-score rising from 72% to 90%. The bottom right graph illustrates accuracy climbing from 69% to 87%. Training and validation metrics are shown in color-coded solid and dashed lines, respectively.

Figure 9. Performance metrics (infected date palm leaves dataset).

In Figure 10, the performance of the PaliGemma-2 model is assessed across nine test cases covering different date palm leaf conditions, including nutrient deficiencies (potassium, manganese, and magnesium), diseases (black scorch, leaf spots, fusarium wilt, rachis blight, and Parlatoria blanchardi), and a healthy sample as a baseline. Each test case includes a leaf image, the predicted label, and the model’s confidence, which ranged from 94% to 98%. The model consistently and accurately classified each condition, demonstrating good generalization and the ability to differentiate between visually similar disorders. The findings demonstrate the feasibility of automated plant disease detection with accuracies of 90% or higher using the PaliGemma-2 architecture, which is particularly useful for timely crop management decisions in agricultural environments.

Figure 10
Nine images depict various conditions in plants. Each test case represents a specific condition indicated by labels: potassium deficiency, manganese deficiency, magnesium deficiency, black scorch, leaf spots, fusarium wilt, rachis blight, parlatoria blight, and a healthy sample. Probabilities of the conditions are shown as percentages on each image, ranging from ninety-four to ninety-eight percent. Each condition features distinct characteristics on leaves or stems, displayed in separate panels.

Figure 10. Evaluation of PaLiGemma2 for detection of infected date palm leaves dataset.

The training procedure of our model is presented in Figure 11, where the training phase ran for 30 epochs; we visualize the training loss and the Top-1 validation accuracy. The training loss decreased rapidly from approximately 1.2 to nearly zero, demonstrating fast convergence and effective optimization. The Top-1 validation accuracy increased sharply from approximately 75% to over 95% within the first few epochs and stabilized at nearly 97% thereafter. Although there were minor fluctuations, accuracy remained consistently high. These results indicate that the model learned quickly and generalized well, with no signs of overfitting or performance decline.

Figure 11
Two line graphs depict training metrics over epochs. The left graph shows training loss decreasing sharply from 1.2 to near zero over 25 epochs. The right graph illustrates validation accuracy increasing from 0.75 to approximately 0.95, stabilizing after 10 epochs.

Figure 11. Training loss/validation accuracy of paligemma2 for the detection of infected date palm leaves dataset.

This integrated framework consists of a multi-stage deep learning pipeline enabling automated analysis of diseased date palm leaves. It begins with a base dataset of 27,000 images used to train a GAN capable of creating high-quality, realistic augmentations beyond traditional techniques such as flips or rotations. These enriched images feed an end-to-end pipeline that starts with zero-shot classification using the CLIP model to provide an initial assessment of leaf health. Subsequently, the vision-language model PaliGemma-2 uses the classification result to detect and localize leaf diseases. In the final stage of the pipeline, Grounding DINO and SAM 2.1 are combined to produce pixel-level segmentation masks that precisely delineate the diseased regions for each condition, such as potassium deficiency or rachis blight; these segmentation masks are then used as input to a Vision Transformer (ViT) with a regression head for severity prediction, which supports subsequent agricultural decision-making. Figure 12 illustrates the zero-shot segmentation workflow, comprising four primary components. The process begins when a user provides a natural language prompt (for example, “black scorch”) with an input image. CLIP performs the initial classification based on the prompt, and the image is passed to PaliGemma-2 to locate the disease with bounding boxes. Grounding DINO refines the bounding boxes and provides them to SAM 2.1 for high-quality segmentation mask creation. The pixel-level output serves as ground truth data for segmentation validation and as the input for severity prediction.

Figure 12
Comparison of palm leaf conditions across nine categories: Potassium Deficiency, Manganese Deficiency, Magnesium Deficiency, Black Scorch, Leaf Spots, Fusarium Wilt, Rakab Blight, Parlatoria Blanchardi, and a Healthy Sample. Each category shows an input image, Grounding-DINO annotated image, and Grounded-SAM annotated image for identification and segmentation purposes. The images illustrate various deficiencies or disease symptoms with distinct annotations highlighting affected areas.

Figure 12. Evaluation of grounding-DINO and grounded SAM for palm date disease detection on the infected date palm leaves dataset.

Figure 13 illustrates the distribution of disease severity, categorized into mild, moderate, and severe cases, for the infected date palm leaves dataset. The Grounding DINO and Grounded-SAM outputs provide bounding box detection and pixel-level segmentation, respectively. For each severity class (Mild, Moderate, and Severe), a sigmoid curve represents the severity distribution, i.e., the probability that a sample falls within that severity level as a function of the calculated disease score; Figure 14 provides a complementary quantitative view of these distributions. This offers a precise method of assigning disease severity and documenting disease progression based on the underlying models.

Figure 13
Three rows display images of plants with different disease severities labeled as mild, moderate, and severe, under columns titled Grounding-DINO Annotated Image, Grounded-SAM Annotated Image, and Severity Distribution. Each severity level shows two plant photos with annotations and a corresponding sigmoid curve graph illustrating the probability distribution versus the disease score.

Figure 13. Evaluation of severity ratio distribution of mild, moderate and severe on infected date palm leaves dataset.

Figure 14
3D plot showing sigmoid curves for infected date palm leaves, with disease scores on the x-axis and probability on the y-axis. Curves for mild (light blue), moderate (orange), and severe (purple) conditions are depicted.

Figure 14. 3-D evaluation of severity ratio distribution of mild, moderate and severe on infected date palm leaves dataset.

Figure 14 depicts a 3D view of the probability distributions of the severity ratio for the infected date palm leaves dataset. The 3D plot displays a “Disease Score” calculated to map the probability that a leaf belongs to one of the three severity classes: Mild, Moderate, or Severe. The main advantage of the visualization was its ability to present a comprehensive view of how different the probability distributions are for each severity class and where they shift along the disease score axis. Therefore, this graphical representation is crucial for interpreting the final severity prediction output from the model, providing a clear and professional means to understand the probability of a condition occurring at every possible stage of progression.

The comparative results presented in Table 6 evaluate the performance of various state-of-the-art models for date palm disease classification using four key performance indicators: accuracy, precision, recall, and F1-score. Among the baseline methods, EfficientNetV2 produced well-rounded results, with an accuracy of 93.2%, precision of 90%, recall of 89.5%, and an F1-score of 88%, suggesting that its positive predictions have a reasonable chance of being the correct disease class. The Swin Transformer-based classifier achieved the highest baseline accuracy (96.5%) but performed less well in terms of precision (89.5%) and recall (87.0%), resulting in a lower F1-score of 87.5% (Kalpana et al., 2024). ConvNeXt, employing a transfer learning approach, generated balanced results but reached an overall accuracy of only 91.8%, with a precision of 91.1%, a recall of 90.3%, and an F1-score of 89.2% (Safran et al., 2024). The Hybrid CNN–ViT model achieved the lowest accuracy of the baselines at 89%, but still scored the highest F1-score (92.2%) (Yu et al., 2025), reflecting a better balance between precision and recall despite the weaker overall classification accuracy.

Table 6

Table 6. Assessment of various models’ performance on infected date palm leaves.

Observably, the hybrid framework outperformed all other models by a large margin across all metrics: accuracy (98%), precision (95.8%), recall (91.3%), and F1-score (94.2%). The model outperforms the others thanks to its multifaceted architecture, which combines several advanced components: GAN-based augmentation for realistic synthetic data, CLIP for vision-language alignment, PaliGemma-2 for multimodal reasoning, Grounding DINO for accurate detection, SAM 2.1 for high-resolution segmentation, a ViT for global feature extraction, and a regression head for severity prediction. This architecture distinguishes it from other models because it enables interpretable, end-to-end classification, segmentation, quantification, and severity assessment, providing a more comprehensive understanding of diseases.

Figure 15 presents a comparison of the models used for the classification of date palm diseases. The Proposed Hybrid Framework was superior to all others, achieving an accuracy of 98%, a precision of 95.8%, a recall of 91.3%, and an F1-score of 94.2%. Its integrated architecture produced the most accurate, balanced, and reliable results of all methods evaluated.

Figure 15
Bar chart titled “Performance Evaluation on Infected Date Palm Leaves” comparing five models: EfficientNetV2, Swin Transformer, ConvNeXt, Hybrid CNN-ViT, and a proposed hybrid framework. Each model is evaluated in terms of accuracy, precision, and F1 score. Bars indicate EfficientNetV2's accuracy at 93.2%, precision at 90%, and F1 score at 88%; Swin Transformer's accuracy at 96.5%, precision at 89.5%, and F1 score at 87.5%; ConvNeXt's accuracy at 91.8%, precision at 91.1%, and F1 score at 89.2%; Hybrid CNN-ViT's accuracy at 89%, precision at 88.2%, and F1 score at 86.5%; and the proposed framework's accuracy at 98%, precision at 95.8%, and F1 score at 94.2%.

Figure 15. Performance metric of infected date palm leaves dataset.

Figure 16 below shows the normalized confusion matrix for classifying date palm diseases and healthy samples. The diagonal represents the correct classification, with Parlatoria Blanchardi (60%), Black Scorch (58.3%), and Rachis Blight (54.8%) having the highest classification accuracies. Other significant misclassification issues included classifying Healthy samples as Magnesium Deficiency (44.4%) or Leaf spots (18.5%), as well as misclassifying Potassium Deficiency as Magnesium Deficiency (50.6%). Additionally, leaf spots were frequently confused with Black Scorch (55.6%). In conclusion, while the classification performed well for some diseases, considerable confusion remains, especially among diseases that visually appear similar, such as nutrient deficiencies and leaf spot-related diseases.

Figure 16
Confusion matrix showing normalized percentages for true versus predicted labels in a classification task. Categories include Black Scorch, Magnesium Deficiency, Fusarium Wilt, and others, with percentages ranging from zero to 60. A blue gradient indicates higher prediction accuracies.

Figure 16. Infected date palm leaves normalized confusion matrix.

6.2 Hybrid vision language and transformer-based model performance on date palm disease dataset

Figure 17 qualitatively illustrates the classification performance of our hybrid vision-language and transformer model on the date palm disease dataset. It presents a grid of nine palm leaf samples, each with the model’s predicted class and ground truth label. The images mostly show ‘brown spots’ and ‘healthy’ leaves. Some examples with a ground truth of ‘brown spots’ were classified accurately with high confidence, but Figure 17 also highlights notable misclassifications; for example, some samples with a true label of ‘brown spots’ were misclassified as ‘healthy’. Such visual analysis is essential for determining how well the model can distinguish subtle disease symptoms from healthy leaves, while also providing context for the quantitative metrics used to evaluate overall classification accuracy.

Figure 17
Nine test cases of plant disease detection are displayed, each with a close-up image of a plant stem segment. Each is accompanied by a prediction label, indicating either “Brown Spots,” “Healthy Sample,” or “White Scale,” along with a percentage reflecting prediction confidence. Test cases visually document varied conditions of plant stems under analysis.

Figure 17. Evaluation of CLIP for palm date disease classification on date palm disease dataset.

Figure 18 shows the model’s training and validation accuracy and loss over several epochs for date palm disease classification. Accuracy steadily improves, peaking between epochs 3.0 and 3.5, while loss consistently decreases, indicating effective learning and error minimization. Most performance gains occur early (epochs 1.0 to 4.0), with gradual convergence and no signs of overfitting. The close alignment of training and validation curves reflects a well-balanced dataset and an effective model architecture. Figure 19 presents the precision, recall, F1-score, and accuracy metrics for both the training and validation sets over four training epochs. All metrics show consistent upward trends: precision rose from ~73% to 81% (training) and 72% to 80% (validation); recall increased from ~78% to 90% (training) and 77% to 89% (validation); the F1-score improved from ~75% to 85% for both; and accuracy grew from ~76% to 84% (training) and 75% to 83% (validation). The parallel progression across all metrics indicates stable learning, effective generalization, and no signs of overfitting.

Figure 18
Two line graphs depict the performance of a model for date palm disease classification over four epochs. The left graph shows accuracy, with training accuracy increasing from 0.725 to 0.9, and validation accuracy from 0.7 to 0.85. The right graph shows loss, with training loss decreasing from 0.65 to 0.25, and validation loss from 0.65 to 0.33.

Figure 18. Accuracy and loss over epochs (date palm disease dataset).

Figure 19
Four graphs showing performance metrics over epochs for date palm disease classification. Top left: Precision improves from 70% to 89%. Top right: Recall increases from 72% to 90%. Bottom left: F1-Score rises from 71% to 89.5%. Bottom right: Accuracy climbs from 72% to 90%. Training and validation scores are shown in each graph.

Figure 19. Performance metrics (date palm disease dataset).

Figure 20 illustrates three scenarios depicting the prediction ability of PaLI-Gemma2 across three palm leaf conditions. In Test Case 1, the model detected Brown Spots with 98% confidence; in Test Case 2, it identified a healthy palm leaf with 95% confidence; and in Test Case 3, it detected White Scale with 95% confidence. These tests indicate a robust and reliable model that generalizes across various disease and pest conditions. Figure 21 provides a view of the model’s learning progress over 30 epochs. Training loss decreased rapidly from 1.2 to near zero in the early epochs, while validation accuracy increased sharply from 75% to above 95%, stabilizing between 96% and 97%, indicating strong generalization, stability, and no overfitting. Overall, PaLI-Gemma2 demonstrated reliability and high accuracy, making it a feasible candidate for use in agricultural diagnostics.
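A minimal sketch of how a PaLiGemma2 checkpoint can be queried for a single leaf image is given below; the model identifier, prompt wording, and output parsing are assumptions for illustration and may differ from the fine-tuned setup used in this study (some transformers versions also expect an explicit "<image>" placeholder in the prompt).

import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma2-3b-pt-224"  # assumed base checkpoint
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("leaf.jpg")  # hypothetical test image
prompt = "answer en What condition does this date palm leaf show?"  # assumed prompt

inputs = processor(images=image, text=prompt, return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=20)

# Decode only the newly generated tokens, skipping the prompt.
answer = processor.decode(generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(answer)  # e.g. "brown spots"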

Figure 20
Three images showing plant leaves with different conditions. First image: a leaf with brown spots labeled “Brown Spots 98%.” Second image: a healthy, green leaf labeled “Healthy 95%.” Third image: a leaf with white scale, labeled “White Scale 95%.

Figure 20. Evaluation of PaLiGemma2 for the detection of date palm disease dataset.

Figure 21
Line graphs show training loss and validation accuracy over 25 epochs. The left graph depicts training loss decreasing sharply from 1.2 to near 0 by epoch 10 and stabilizing. The right graph shows validation accuracy starting at 0.75, rising steeply to over 0.95 by epoch 5, and then remaining stable.

Figure 21. Training loss/validation accuracy of PaLiGemma2 for the detection of date palm disease dataset.

Figure 22 illustrates the step-by-step process of zero-shot segmentation using Grounding DINO and Grounded-SAM on the date palm disease dataset. The process consists of four steps: (1) the text query and input image are loaded; (2) Grounding DINO identifies all objects relevant to the text query (e.g., “Brown Spot” or “White Scale”) and generates bounding boxes around the regions to be segmented; (3) the bounding boxes produced by Grounding DINO are passed to SAM; and (4) SAM provides high-quality pixel-level segmentation masks, which are displayed on the Grounded-SAM annotated image. The figure illustrates a straightforward path from text-based queries to pixel-accurate maps for multiple diseases.
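A minimal sketch of this text-prompted detection-to-segmentation hand-off is shown below, using the Grounding DINO and original SAM checkpoints exposed through the Hugging Face transformers API for illustration (the paper pairs Grounding DINO with SAM 2.1); the checkpoint names, thresholds, and image path are assumptions, and the exact post-processing arguments may vary across library versions.

import torch
from PIL import Image
from transformers import (AutoModelForZeroShotObjectDetection, AutoProcessor,
                          SamModel, SamProcessor)

image = Image.open("leaf.jpg")  # hypothetical input image
text = "brown spot."           # text query: lowercase phrases terminated with "."

# Steps 1-2: Grounding DINO turns the text query into bounding boxes.
dino_id = "IDEA-Research/grounding-dino-tiny"
dino_processor = AutoProcessor.from_pretrained(dino_id)
dino = AutoModelForZeroShotObjectDetection.from_pretrained(dino_id)
dino_inputs = dino_processor(images=image, text=text, return_tensors="pt")
with torch.no_grad():
    dino_outputs = dino(**dino_inputs)
boxes = dino_processor.post_process_grounded_object_detection(
    dino_outputs, dino_inputs.input_ids,
    box_threshold=0.35, text_threshold=0.25,
    target_sizes=[image.size[::-1]])[0]["boxes"]

# Steps 3-4: the boxes prompt SAM, which returns pixel-level masks.
sam_id = "facebook/sam-vit-base"
sam_processor = SamProcessor.from_pretrained(sam_id)
sam = SamModel.from_pretrained(sam_id)
sam_inputs = sam_processor(image, input_boxes=[boxes.tolist()], return_tensors="pt")
with torch.no_grad():
    sam_outputs = sam(**sam_inputs)
masks = sam_processor.image_processor.post_process_masks(
    sam_outputs.pred_masks, sam_inputs["original_sizes"], sam_inputs["reshaped_input_sizes"])
# masks[0] holds one boolean mask per detected box, ready to overlay on the input image.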

Figure 22
Top row shows a plant stem with a brown spot. The annotated images from Grounding-DINO and Grounded-SAM highlight and outline the brown area. Middle row features a healthy plant stem with Grounding-DINO identifying it as healthy with a probability of 0.37, and Grounded-SAM outlines it. Bottom row displays a stem with white scale, annotated by Grounding-DINO and segmented by Grounded-SAM. Each row includes an input image, Grounding-DINO annotation, and Grounded-SAM annotation.

Figure 22. Evaluation of grounding-DINO and grounded SAM for segmentation on date palm disease dataset.

Figure 23 demonstrates the segmentation results together with the corresponding severity estimates. First, the input image is processed by Grounding DINO to obtain a bounding box (“Grounding-DINO Annotated Image”). Next, the bounding box is converted into a pixel-wise segmentation mask by SAM (“Grounded-SAM Annotated Image”). Lastly, the sigmoidal curves for Mild, Moderate, and Severe (“Severity Distribution”) display the probability that a leaf belongs to each severity class as a function of its disease score. This evaluation offers a clear, quantitative approach to classifying the progression of date palm diseases.
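One simple way to obtain the disease score and the sigmoid-shaped severity probabilities described above is sketched below; the score is taken as the fraction of leaf pixels covered by the predicted disease mask, and the thresholds and temperature are illustrative assumptions rather than the paper's calibrated values.

import numpy as np

def disease_score(disease_mask: np.ndarray, leaf_mask: np.ndarray) -> float:
    """Fraction of leaf pixels covered by the predicted disease mask (both masks are 0/1 arrays)."""
    return float(disease_mask[leaf_mask > 0].mean())

def severity_probs(score: float, t_mild: float = 0.2, t_severe: float = 0.5, tau: float = 0.05):
    """Two-threshold ordered model: returns P(Mild), P(Moderate), P(Severe), which sum to 1."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    p_at_least_moderate = sigmoid((score - t_mild) / tau)
    p_at_least_severe = sigmoid((score - t_severe) / tau)
    return {"Mild": 1.0 - p_at_least_moderate,
            "Moderate": p_at_least_moderate - p_at_least_severe,
            "Severe": p_at_least_severe}

print(severity_probs(0.35))  # a mid-range score falls mostly into the "Moderate" class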

Figure 23
Annotated images of plant stems with brown spots at different severity levels—mild, moderate, and severe—are shown alongside corresponding sigmoid function graphs. Each row displays two annotated images (Grounding-DINO and Grounded-SAM) and a graph illustrating the probability against the disease score for the mild (blue), moderate (orange), and severe (purple) classes.

Figure 23. Evaluation of the severity ratio distribution of mild, moderate, and severe on infected date palm leaves dataset.

Figure 24 presents a three-dimensional representation of the severity distribution produced by the model, mapping the disease score to the probability of a leaf belonging to each of the three severity classes (Mild, Moderate, and Severe). As the curves indicate, a higher disease score increases the probability of the “Severe” class, while the probabilities of “Mild” and “Moderate” decrease accordingly. The 3D representation provides a comprehensive visual summary of how the model separates its predictions across different levels of disease progression.
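For completeness, a minimal matplotlib sketch of such a 3-D view is given below; it reuses the illustrative thresholds from the previous sketch and simply offsets each severity curve along a categorical axis, so it reproduces the style of the figure rather than the exact curves learned by the model.

import numpy as np
import matplotlib.pyplot as plt

scores = np.linspace(0.0, 1.0, 200)
sigmoid = lambda x, t, tau=0.05: 1.0 / (1.0 + np.exp(-(x - t) / tau))
curves = {"Mild": 1.0 - sigmoid(scores, 0.2),
          "Moderate": sigmoid(scores, 0.2) - sigmoid(scores, 0.5),
          "Severe": sigmoid(scores, 0.5)}

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for depth, (name, probs) in enumerate(curves.items()):
    ax.plot(scores, np.full_like(scores, depth), probs, label=name)  # one curve per severity class
ax.set_xlabel("Disease score")
ax.set_ylabel("Severity class")
ax.set_zlabel("Probability")
ax.set_yticks(range(3))
ax.set_yticklabels(list(curves.keys()))
ax.legend()
plt.show()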

Figure 24
Three-dimensional plot showing sigmoid curves for date palm disease dataset with probability and disease score axes. Curves represent disease severity: mild (cyan), moderate (orange), and severe (purple).

Figure 24. 3-D evaluation of severity ratio distribution of mild, moderate and severe date palm disease dataset.

Table 7 presents a performance comparison of recent deep learning models for date palm disease classification, reporting accuracy, precision, recall, and F1-score. Among the baselines, the DenseNet121 + SE Attention model demonstrated the best overall performance, achieving 98.2% accuracy, 95.2% precision, 94.8% recall, and a 97.4% F1-score (Abdallah et al., 2025). This result illustrates the value of incorporating attention mechanisms for more discriminative feature extraction. MobileNetV2 (Hassan et al., 2025) applied to nine-class disease classification yielded an accuracy of 96.9% with a macro-averaged F1-score of approximately 97%; the high macro average indicates strong overall performance, although precision and recall were not reported. A UAV-based voting ensemble achieved macro-averaged scores of 97.1% accuracy, 94.9% precision, 96.7% recall, and 94.8% F1-score, illustrating the benefit of combining multiple models to stabilize and balance predictions. Transfer learning with ResNet and Inception-ResNet resulted in 96.9% accuracy, 93.5% precision, and a 94.7% F1-score, although recall was not reported, making class-level sensitivity harder to assess. In contrast to these baselines, the proposed hybrid framework achieved the best overall performance across the reported metrics, with 98.9% accuracy, 96.5% precision, 96.3% recall, and a 95.3% F1-score. Collectively, the improvements can be attributed to the multifaceted hybrid design: GAN-based data augmentation to improve generalization, CLIP for vision-language alignment, PaLiGemma2 for multimodal reasoning, Grounding DINO for precise object detection, SAM 2.1 for high-quality segmentation, a Vision Transformer (ViT) for global feature representation, and a regression head for severity classification. The generative augmentation and the multimodal capabilities of PaLiGemma2 enriched the learned feature space, the detection-segmentation pipeline ensured spatially precise localization, and the ViT backbone captured long-range dependencies, making the framework accurate and robust.

Table 7

Table 7. Assessment of various models’ performance on infected date palm leaves.

Figure 25 contrasts five models for date palm disease classification across four metrics. The proposed hybrid framework achieved the strongest results, with 98.9% accuracy, 96.5% precision, 96.3% recall, and a 95.3% F1-score, surpassing the other models in accuracy, precision, and recall. The closest competitor was DenseNet121 + SE Attention, which achieved 98.2% accuracy and the highest F1-score at 97.4%.

Figure 25
Bar chart comparing the performance metrics of various models on date palm diseases. Metrics represented include accuracy, precision, recall, and F1-score. Densenet121 + SE achieves the highest accuracy at 98.2 percent, while proposed hybrid has 98.9 percent accuracy. MobileNetV2 shows the lowest precision at 94.9 percent. ResNet & Inception has zero precision and F1-score. The proposed hybrid model exhibits high accuracy and precision.

Figure 25. Performance metric of date palm disease dataset.

Three test cases from the palm leaf dataset illustrate the predictive performance of the PaliGemma2 model across three different health conditions, alongside the normalized confusion matrix shown in Figure 26. In Test Case 1, the model predicted Brown Spots disease with 98% confidence, demonstrating high accuracy when consistent visual cues for a specific disease are present. In Test Case 2, the model correctly identified a healthy palm leaf with 95% confidence, demonstrating its ability to differentiate a normal leaf from an infected one. In Test Case 3, the model identified a White Scale infestation, again with 95% confidence. Together, these test cases demonstrate that the model can identify different types of diseases and infestations, is robust and accurate, and generalizes well across palm leaf health conditions. These examples also support the deployment of a predictive model such as PaliGemma2 as a reliable agricultural diagnostic tool in the field.

Figure 26
Confusion matrix showing normalized percentages for classification performance. Rows represent true labels: brown-spots, white-scale, healthy. Columns represent predicted labels: brown-spots, white-scale, healthy. Values indicate correct and incorrect predictions, with brown-spots at 77.6%, white-scale at 71.4%, and healthy at 93.9%. Color gradient from light to dark blue represents percentage magnitude.

Figure 26. Date palm disease normalized confusion matrix.

7 Conclusion and future direction

The presented work proposes a Hybrid Vision-Language and Transformer-based AI framework for comprehensive diagnosis of date palm diseases, comprising GAN-based data augmentation, CLIP-based multimodal classification, PaliGemma2 for text-based detection, Grounding DINO and SAM 2.1 for zero-shot segmentation, and a Vision Transformer regression model that estimates severity from the validated segmentations. The framework demonstrated superior performance on two domain-specific benchmarks: the nine-class Infected Date Palm Leaves Dataset and the three-class Date Palm Disease Dataset. Notably, it achieved an accuracy of 98%, a precision of 95.8%, a recall of 91.3%, and an F1-score of 94.2%, along with detection accuracy of 94-98%, high-quality segmentations, and reliable severity estimates. The systems approach combined generative AI for dataset balancing, multimodal learning for classification, and prompt-based detection and segmentation, resulting in an explainable end-to-end pipeline that overcomes the limitations of previous unimodal or task-isolated systems. By integrating classification, detection, segmentation, and severity prediction into a single workflow, the framework provides a scalable, accurate, and interpretable approach to agricultural disease management that can be applied under real-world field conditions.
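As an illustration of the final stage of this pipeline, the sketch below shows one plausible form of a Vision Transformer backbone with a regression head for severity estimation; the checkpoint name, head architecture, and the assumption of a continuous severity target in [0, 1] are illustrative choices rather than the exact model trained in this work.

import torch
import torch.nn as nn
from transformers import ViTModel

class ViTSeverityRegressor(nn.Module):
    def __init__(self, backbone: str = "google/vit-base-patch16-224-in21k"):
        super().__init__()
        self.vit = ViTModel.from_pretrained(backbone)
        self.head = nn.Sequential(
            nn.Linear(self.vit.config.hidden_size, 256),
            nn.GELU(),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # bound the predicted severity to [0, 1]
        )

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        cls_token = self.vit(pixel_values=pixel_values).last_hidden_state[:, 0]  # [CLS] embedding
        return self.head(cls_token).squeeze(-1)

model = ViTSeverityRegressor()
dummy_batch = torch.randn(2, 3, 224, 224)  # two illustrative 224x224 RGB crops
print(model(dummy_batch).shape)            # torch.Size([2])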

Looking forward, future work will focus on several extensions to broaden applicability and enhance robustness:

i. Scaling to Multi-Crop Systems: adapt the framework to other economically important crops and disease types to validate its ability to generalize.

ii. Integration with IoT and UAV Platforms: deploy the model on edge devices and drones to enable real-time, plantation-scale monitoring.

iii. Measuring Disease Progression over Time: incorporate time-series data to model the dynamics of disease spread and support early intervention strategies.

iv. Active Learning for Annotation Support: incorporate semi-supervised and active learning approaches to expand the labelled dataset more efficiently.

v. Improved Explainability: employ visual-textual reasoning modules to produce more intuitive, human-interpretable diagnostic reports for farmers and agronomists.

vi. Location- and Climate-Aware Detection: incorporate environmental and climatic features into the multimodal pipeline to improve disease detection under varying conditions.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: 1) Kaggle dataset: the date palm leaf disease dataset used in this study is available on Kaggle and can be accessed via the following link: https://www.kaggle.com/datasets/hadjerhamaidi/date-palm-data. This dataset contains images of date palm leaves categorized into three classes: healthy leaves, brown spot disease, and white scale infection. 2) Mendeley dataset: the second dataset, which includes images of date palm leaves with eight types of diseases, can be accessed via Mendeley Data: https://data.mendeley.com/datasets/g684ghfxvg/2. This dataset was collected from 10 date farms in Madinah, Saudi Arabia.
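For readers who wish to reproduce the experiments, a minimal sketch of loading either dataset after download is shown below; it assumes the images are arranged in one folder per class, and the local path and transform are placeholders rather than the preprocessing used in the study.

from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# e.g. data/date-palm-data/{healthy, brown_spots, white_scale}/...
dataset = datasets.ImageFolder("data/date-palm-data", transform=transform)
print(len(dataset), dataset.classes)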

Author contributions

AI: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia Grant No. KFU253882.

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abdallah, G. T., Mohamed, N., Lakhmissi, C., Ahmed, H., Giovanni, A., and Fabio, L. F. (2025). Voting-based classification approach for date palm health detection using UAV camera images: vision and learning. Drones 9, 534.

Ahmed, M., and Ahmed, A. (2023). Palm tree disease detection and classification using residual network and transfer learning of Inception ResNet. PLoS One 18, e0282250. doi: 10.1371/journal.pone.0282250

Al-Hiary, H., Bani-Ahmad, S., Reyalat, M., Braik, M., and Alrahamneh, Z. (2011). Fast and accurate detection and classification of plant diseases. Int. J. Comput. Appl. 17, 31–38. doi: 10.5120/2183-2754

Arafat, K. H., Hassan, M. H., Almetwaly, W. M., and Mohammed, W. H. (2025). Early detection of date palm diseases in the New Valley governorate using NDVI-based digital cartographic modeling. New Valley J. Agric. Sci. 5, 31–49. doi: 10.21608/nvjas.2025.365780.1314

Bouthaina, R., Khaled, R., Okba, K., and Abdelhak, M. (2024). “Palm tree diseases detection using deep learning: a short review,” in Proceedings of the 2024 8th International Conference on Image and Signal Processing and Their Applications (ISPA) (New York, USA: IEEE), 1–8.

Devi, R., Kumar, V., and Sivakumar, P. (2023). EfficientNetV2 model for plant disease classification and pest recognition. Comput. Syst. Sci. Eng. 45.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., et al. (2020). An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.

Feng, Y., Liu, Y., Yang, S., Cai, W., Zhang, J., Zhan, Q., et al. (2025). Vision-language model for object detection and segmentation: a review and evaluation. arXiv preprint arXiv:2504.09480.

Gibril, M. B. A., Shafri, H. Z. M., Shanableh, A., Al-Ruzouq, R., Wayayok, A., and Hashim, S. J. (2021). Deep convolutional neural network for large-scale date palm tree mapping from UAV-based images. Remote Sens. 13, 2787. doi: 10.3390/rs13142787

Hassan, E., Ghazalah, S. A., El-Rashidy, N., El-Hafeez, T. A., and Shams, M. Y. (2025). Sustainable deep vision systems for date fruit quality assessment using attention-enhanced deep learning models. Front. Plant Sci. 16, 1521508. doi: 10.3389/fpls.2025.1521508

Hessane, A., Boutahir, M. K., El Youssefi, A., Farhaoui, Y., and Aghoutane, B. (2023a). “Deep-PDSC: a deep learning-based model for a stage-wise classification of Parlatoria date scale disease,” in Advanced Technology for Smart Environment and Energy (Cham, Switzerland: Springer).

Hessane, A., El Youssefi, A., Farhaoui, Y., and Aghoutane, B. (2025). “Artificial intelligence-empowered date palm disease and pest management: current status, challenges, and future perspectives,” in Internet of Things and Big Data Analytics for a Green Environment (Chapman and Hall/CRC).

Hessane, A., El Youssefi, A., Farhaoui, Y., Aghoutane, B., and Amounas, F. (2023b). A machine learning based framework for a stage-wise classification of date palm white scale disease. Big Data Min. Analytics 6, 263–272. doi: 10.26599/BDMA.2022.9020022

Kalpana, P., Anandan, R., Hussien, A. G., Migdady, H., and Abualigah, L. (2024). Plant disease recognition using residual convolutional enlightened Swin transformer networks. Sci. Rep. 14, 8660. doi: 10.1038/s41598-024-56393-8

Kamal, S., Sharma, P., Gupta, P., Siddiqui, M. K., Singh, A., and Dutt, A. (2025). DVTXAI: a novel deep vision transformer with an explainable AI-based framework and its application in agriculture. J. Supercomputing 81, 280. doi: 10.1007/s11227-024-06494-y

Karthik, R., Vardhan, G. V., Khaitan, S., Harisankar, R., Menaka, R., Lingaswamy, S., et al. (2024). A dual-track feature fusion model utilizing group shuffle residual DeformNet and Swin transformer for the classification of grape leaf diseases. Sci. Rep. 14, 14510. doi: 10.1038/s41598-024-64072-x

Lahlali, R., Taoussi, M., Laasli, S.-E., Gachara, G., Ezzouggari, R., Belabess, Z., et al. (2024). Effects of climate change on plant pathogens and host-pathogen interactions. Crop Environ. 3, 159–170. doi: 10.1016/j.crope.2024.05.003

Li, R., Li, S., Kong, L., Yang, X., and Liang, J. (2025). Zero-shot 3D visual grounding from vision-language models. arXiv preprint arXiv:2505.22429.

Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., Yang, J., et al. (2024). “Grounding DINO: marrying DINO with grounded pre-training for open-set object detection,” in European Conference on Computer Vision (ECCV) (Cham, Switzerland: Springer), 38–55.

Liu, W., and Zhang, A. (2025). Plant disease detection algorithm based on efficient Swin transformer. Computers Materials Continua 82.

Makhija, A. (2024). “FlameViT: wildfire detection through vision transformers for enhanced satellite imagery analysis,” in Proceedings of the 2024 First International Conference on Data, Computation and Communication (ICDCC) (New York, USA: IEEE), 640–647.

Monteiro, A., Santos, S., and Gonçalves, P. (2021). Precision agriculture for crop and livestock farming—brief review. Animals 11, 2345. doi: 10.3390/ani11082345

Mostajer Kheirkhah, F., and Asghari, H. (2019). Plant leaf classification using GIST texture features. IET Comput. Vision 13, 369–375. doi: 10.1049/iet-cvi.2018.5028

Nabil, H. R., Abir, M. G. R., Khatun, M. M., Rayed, M. E., and Hamid, M. A. (2025). “Deep learning algorithms for detecting banana leaf spot diseases,” in Machine Vision in Plant Leaf Disease Detection for Sustainable Agriculture (Cham, Switzerland: Springer).

Nadeem, A., Ashraf, M., Mehmood, A., Rizwan, K., and Siddiqui, M. S. (2025). Dataset of date palm tree (Phoenix dactylifera L.) thermal images and their classification based on red palm weevil (Rhynchophorus ferrugineus) infestation. Front. Agron. 7, 1604188.

Namoun, A., Alkhodre, A. B., Abi Sen, A. A., Alsaawy, Y., and Almoamari, H. (2024). Dataset of infected date palm leaves for palm tree disease detection and classification. Data Brief 57, 110933. doi: 10.1016/j.dib.2024.110933

Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., et al. (2021). “Learning transferable visual models from natural language supervision,” in Proceedings of the International Conference on Machine Learning (ICML) (Baltimore, USA: PMLR), 8748–8763.

Reddy, D. Y., Reddy, A. C. K., Prakhyat, C. H., Vamsi, D., and Kumar, A. K. (2024). “Date palm white scale disease detection using convolutional neural networks (VGG16),” in Proceedings of the 2024 International Conference on Intelligent Systems for Cybersecurity (ISCS) (New York, USA: IEEE), 1–6.

Safran, M., Alrajhi, W., and Alfarhood, S. (2024). DPXception: a lightweight CNN for image-based date palm species classification. Front. Plant Sci. 14, 1281724. doi: 10.3389/fpls.2023.1281724

Sagheer, S. V. M., Pv, O., Ameer, P., Baqais, A., and Kalathil, S. (2025). A deep learning approach to classification of diseases in date palm leaves. Computers Materials Continua 84. doi: 10.32604/cmc.2025.063961

Shahid, M., Chen, S.-F., Hsu, Y.-L., Chen, Y.-Y., Chen, Y.-L., and Hua, K.-L. (2023). Forest fire segmentation via temporal transformer from aerial images. Forests 14, 563. doi: 10.3390/f14030563

Singh, R., Bidese, R., Dhakal, K., and Sornapudi, S. (2025). “Few-shot adaptation of Grounding DINO for agricultural domain,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 5332–5342.

Slika, B., Dornaika, F., Merdji, H., and Hammoudi, K. (2023). Vision transformer-based model for severity quantification of lung pneumonia using chest X-ray images. arXiv preprint arXiv:2308.11935.

Yu, S., Xie, L., and Dai, L. (2025). ST-CFI: Swin transformer with convolutional feature interactions for identifying plant diseases. Sci. Rep. 15, 25000. doi: 10.1038/s41598-025-08673-0

Zafar, W., Husnain, G., Iqbal, A., Alzahrani, A. S., Irfan, M. A., Ghadi, Y. Y., et al. (2024). Enhanced TumorNet: leveraging YOLOv8s and U-Net for superior brain tumor detection and segmentation utilizing MRI scans. Results Eng. 24, 102994. doi: 10.1016/j.rineng.2024.102994

Zhao, C., Li, C., Wang, X., Wu, X., Du, Y., Chai, H., et al. (2025). Plant disease segmentation networks for fast automatic severity estimation under natural field scenarios. Agriculture 15, 583. doi: 10.3390/agriculture15060583

Keywords: vision-language models (VLMs), generative adversarial networks (GANs), PaliGemma2, zero-shot segmentation, severity prediction, sustainable crop management

Citation: Iqbal A (2025) Comprehensive AI framework for automated classification, detection, segmentation, and severity estimation of date palm diseases using vision-language models and generative AI. Front. Plant Sci. 16:1710188. doi: 10.3389/fpls.2025.1710188

Received: 26 September 2025; Revised: 19 November 2025; Accepted: 28 November 2025;
Published: 17 December 2025.

Edited by:

Milan Kumar Lal, National Rice Research Institute (ICAR), India

Reviewed by:

Wei Lu, Nanjing Agricultural University, China
Sadia Kamal, University of Maryland, United States
Anamul Haque Sakib, Mainwins Inc, United States

Copyright © 2025 Iqbal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Abid Iqbal, aaiqbal@kfu.edu.sa
