
ORIGINAL RESEARCH article

Front. Plant Sci.

Sec. Sustainable and Intelligent Phytoprotection

Volume 16 - 2025 | doi: 10.3389/fpls.2025.1586865

This article is part of the Research Topic: Machine Vision and Machine Learning for Plant Phenotyping and Precision Agriculture, Volume II

Sorghum Crops Classification and Segmentation using Shifted Window Transformer Neural Network and Localization based on YOLOv9-Path Aggregation Network (PANet)

Provisionally accepted
Javaria Amin¹, Rida Zahra², Alena Maryum³, Amber Sarwar¹, Amad Zafar⁴, Seong-Han Kim⁴*
  • 1Rawalpindi Women University, Rawalpindi, Punjab, Pakistan
  • 2University of Wah, Wah Cantonment, Punjab, Pakistan
  • 3National University of Technology (NUTECH), Islamabad, Islamabad, Pakistan
  • 4Sejong University, Seoul, Republic of Korea

The final, formatted version of the article will be published soon.

The world's population continues to grow, which requires prompt action to ensure food security. Sorghum, one of the top five cereals produced worldwide, is a dietary staple in many developing nations, so obtaining accurate crop information is crucial to raising cereal productivity. The number of crop heads, which occur in various branching configurations, can serve as an indicator for estimating sorghum yield. Computerized methods have been shown to collect this information automatically for various crops; however, applying them to sorghum is challenging because of variation in the color and shape of the heads. Therefore, a method is proposed based on three models for the classification, localization, and segmentation of sorghum. The proposed shifted window transformer (SWT) network has seven layers comprising patch embedding, two Swin transformer blocks, patch merging, global average pooling, and dense connections. The SWT is trained with the following selected hyperparameters: a patch size of (2, 2), a window size of 2, a learning rate of 1e-3, a batch size of 128, 40 epochs, a weight decay of 0.0001, a dropout of 0.03, 8 attention heads, an embedding dimension of 64, and an MLP size of 256. To localize the sorghum region, a YOLOv9-c model is trained from scratch for 100 epochs on the selected hyperparameters. Because lighting, illumination, and noise make sorghum images more complex, a transformer-based SegNet model is designed in which features are extracted using a pre-trained SegFormer-B0 model fine-tuned on ADE-512-512. This model is trained for 10 epochs using the Adam optimizer with a learning rate of 5e-5 and a cross-entropy loss; these hyperparameters were finalized after extensive experimentation to achieve more accurate segmentation of sorghum, which is a significant contribution of this work. The achieved outcomes are superior to those in other published works.
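To make the shifted-window mechanism concrete, the sketch below (NumPy, with illustrative names only, not the authors' code) partitions a toy feature map into the non-overlapping 2×2 windows implied by the abstract's window size of 2, and applies the cyclic shift used between consecutive blocks in a standard Swin transformer. Attention would then be computed independently within each window.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows of
    shape (window_size, window_size, C)."""
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    # Reorder so windows are contiguous, then flatten the window grid.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def cyclic_shift(x, shift):
    """Cyclically shift the feature map before partitioning -- the 'shifted'
    step that lets information cross window boundaries between blocks."""
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

# Toy 8x8 single-channel feature map with values 0..63 (row-major).
feat = np.arange(8 * 8).reshape(8, 8, 1)

wins = window_partition(feat, window_size=2)                  # 16 windows of 2x2
shifted_wins = window_partition(cyclic_shift(feat, 1), 2)     # shifted partition

print(wins.shape)                  # (16, 2, 2, 1)
print(wins[0, :, :, 0].tolist())   # [[0, 1], [8, 9]] -- top-left 2x2 block
```

With a window size of 2, each attention block only mixes four positions; alternating the plain and shifted partitions is what gives the network a growing effective receptive field without global attention cost.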

Keywords: Transformer Neural Network, Sorghum, YOLO-v9, Hyperparameters, Crops, Classification, Localization, Segmentation

Received: 27 Mar 2025; Accepted: 29 Aug 2025.

Copyright: © 2025 AMIN, Zahra, Maryum, Sarwar, Zafar and Kim. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Seong-Han Kim, Sejong University, Seoul, Republic of Korea

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.