ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Technical Advances in Plant Science
Volume 16 - 2025 | doi: 10.3389/fpls.2025.1522985
Plant Disease Classification in-the-wild using Vision Transformers and Mixture of Experts
Provisionally accepted
Sejong University, Seoul, Republic of Korea
Plant disease classification using deep learning techniques has shown promising results, especially when models are trained on high-quality images. However, these models often suffer a significant drop in accuracy when tested in real-world agricultural settings. In the wild, models encounter images that differ significantly from the training data in lighting conditions, capturing conditions, image resolution, and disease severity. This discrepancy between training images and in-the-wild conditions poses a major challenge for deploying these models in agricultural settings. In this paper, we present a novel approach to address this issue by combining a Vision Transformer backbone with a Mixture of Experts, in which multiple expert models are trained to specialize in different aspects of the input data and a gating mechanism selects the most relevant experts for each input. The Mixture of Experts allows the model to dynamically allocate specialized experts to different types of input data, improving performance on diverse datasets that contain a range of image capturing conditions and disease severities. Furthermore, the model incorporates entropy regularization and orthogonal regularization, which aim to enhance its robustness and generalization capabilities. Experimental results demonstrate that the proposed model achieved a 20% improvement in accuracy compared to the Vision Transformer (ViT). It also reached 68% accuracy on cross-domain datasets such as PlantVillage to PlantDoc, surpassing baseline models such as InceptionV3 and EfficientNet. This highlights the potential of our model for effective deployment in dynamic agricultural environments.
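The gating idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the top-k selection rule, the linear gate, the penalty form ||W Wᵀ − I||², and every function name below (`top_k_gate`, `moe_forward`, `gate_entropy`, `orthogonality_penalty`) are assumptions chosen for illustration, since the abstract does not specify these details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits, k):
    """Keep the k largest gate probabilities and renormalize them.

    Assumption: top-k routing; the abstract only says the gate
    "selects the most relevant experts".
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}


def moe_forward(features, experts, gate_weights, k=2):
    """Weighted sum of the outputs of the k selected experts.

    features:     backbone embedding (e.g. a ViT feature vector), list of floats
    experts:      list of callables mapping features -> list of floats
    gate_weights: one weight row per expert (hypothetical linear gate)
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in gate_weights]
    gates = top_k_gate(logits, k)
    output = None
    for idx, g in gates.items():
        expert_out = experts[idx](features)
        if output is None:
            output = [0.0] * len(expert_out)
        output = [o + g * e for o, e in zip(output, expert_out)]
    return output, gates


def gate_entropy(probs):
    """Shannon entropy of a gate distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def orthogonality_penalty(rows):
    """Squared Frobenius norm of (W W^T - I) for a weight matrix W given as rows.

    Assumption: this common form of orthogonal regularization; the
    abstract does not state which form the paper uses.
    """
    penalty = 0.0
    for i, a in enumerate(rows):
        for j, b in enumerate(rows):
            dot = sum(x * y for x, y in zip(a, b))
            target = 1.0 if i == j else 0.0
            penalty += (dot - target) ** 2
    return penalty
```

In a training loop, terms like `gate_entropy` and `orthogonality_penalty` would typically be added to the classification loss with small coefficients. The sign and weighting of the entropy term depend on whether the goal is sharper or more balanced routing, which the abstract leaves open.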
Keywords: deep learning, disease classification, computer vision, plant disease, Mixture of Experts (MoE), Vision Transformers
Received: 05 Nov 2024; Accepted: 13 May 2025.
Copyright: © 2025 Zafar, Muhammad and Han. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dongil Han, Sejong University, Seoul, Republic of Korea
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.