AUTHOR=Amin Javaria, Zahra Rida, Maryum Alena, Sarwar Amber, Zafar Amad, Kim Seong-Han TITLE=Sorghum crops classification and segmentation using shifted window transformer neural network and localization based on (YOLO)v9-path aggregation network JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1586865 DOI=10.3389/fpls.2025.1586865 ISSN=1664-462X ABSTRACT=Introduction: The world’s population is increasing continuously, which requires prompt action to ensure food security. Sorghum, one of the top five cereals produced worldwide, is a dietary staple in many developing nations. Accurate crop information is therefore crucial to raising cereal productivity. The number of crop heads, which are arranged in various branching configurations, can be used as an indicator to estimate sorghum yield. For various crops, computerized methods have proved beneficial for collecting this information automatically. However, applying them to sorghum is challenging because of variations in the color and shape of sorghum. Methods: Therefore, a method based on three models is proposed for the classification, localization, and segmentation of sorghum. The proposed shifted window transformer (SWT) network has seven layers comprising patch embedding, two Swin Transformers, global average pooling, patch merging, and dense connections. It is trained with the following selected hyperparameters: patch size (2, 2), window size 2, learning rate 1e-3, batch size 128, 40 epochs, weight decay 0.0001, dropout 0.03, eight heads, embedding dimension 64, and MLP size 256. To localize the sorghum region, the YOLOv9-c model is trained from scratch on the selected hyperparameters for 100 epochs. Variations in light and illumination, together with noise, make the sorghum images more complex.
A transformer-based SegNet model is designed, in which features are extracted using a pre-trained SegFormer-B0 model fine-tuned on ADE20K at 512×512 resolution (ADE-512-512). The proposed model is trained from scratch for 10 epochs using the Adam optimizer with a learning rate of 5e-5 and the CrossEntropyLoss loss function. These hyperparameters were finalized after extensive experimentation to achieve more accurate segmentation of the sorghum, which is a significant contribution of this work. Results and Discussion: The achieved outcomes are superior to those reported in other published works.
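
The training settings listed in the Methods part of the abstract can be collected in one place, which makes the three configurations easier to compare. The following is only a minimal sketch: the class and field names (`SWTConfig`, `YoloConfig`, `SegConfig`, `mlp_size`, and so on) are hypothetical and do not come from the article. The two derived values, the per-head dimension and the number of tokens per window, are not stated in the abstract; they follow from the usual definitions of multi-head attention and square-window attention and are assumptions about how the stated numbers combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SWTConfig:
    """Hyperparameters stated in the abstract for the SWT classifier.

    Field names are illustrative, not taken from the authors' code.
    """
    patch_size: tuple = (2, 2)
    window_size: int = 2
    learning_rate: float = 1e-3
    batch_size: int = 128
    epochs: int = 40
    weight_decay: float = 1e-4
    dropout: float = 0.03
    num_heads: int = 8
    embed_dim: int = 64
    mlp_size: int = 256

    @property
    def head_dim(self) -> int:
        # Assumption: the embedding is split evenly across attention heads,
        # as in standard multi-head attention.
        if self.embed_dim % self.num_heads != 0:
            raise ValueError("embed_dim must be divisible by num_heads")
        return self.embed_dim // self.num_heads

    @property
    def tokens_per_window(self) -> int:
        # Assumption: a square window of side window_size holds
        # window_size ** 2 patch tokens.
        return self.window_size ** 2


@dataclass(frozen=True)
class YoloConfig:
    """Localization settings stated in the abstract."""
    variant: str = "YOLOv9-c"
    epochs: int = 100
    pretrained: bool = False  # the abstract says "trained from scratch"


@dataclass(frozen=True)
class SegConfig:
    """Segmentation settings stated in the abstract."""
    backbone: str = "SegFormer-B0"
    optimizer: str = "Adam"
    learning_rate: float = 5e-5
    epochs: int = 10
    loss: str = "CrossEntropyLoss"


if __name__ == "__main__":
    swt = SWTConfig()
    print("per-head dimension:", swt.head_dim)
    print("tokens per window:", swt.tokens_per_window)
```

Under these assumptions, each of the eight heads works on an 8-dimensional slice of the 64-dimensional embedding, and each attention window covers four patch tokens.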