YOLO-LitchiVar: A Lightweight and High-Precision Detection Model for Fine-Grained Litchi Variety Identification

Xu, Bing; Wu, Xianjun; Su, Xueping; Ke, Wende

doi:10.3389/fpls.2025.1677854

ORIGINAL RESEARCH article

Front. Plant Sci.

Sec. Sustainable and Intelligent Phytoprotection

This article is part of the Research TopicHighlights of 1st International Conference on Sustainable and Intelligent Phytoprotection (ICSIP 2025)View all 13 articles

YOLO-LitchiVar: A Lightweight and High-Precision Detection Model for Fine-Grained Litchi Variety Identification

Provisionally accepted

Bing Xu¹

Xianjun Wu¹

Xueping Su¹

Wende Ke^2*

¹Guangdong University of Petrochemical Technology, Maoming, China
²Southern University of Science and Technology, Shenzhen, China

The final, formatted version of the article will be published soon.

Litchi is a popular subtropical fruit with approximately100 known varieties worldwide. Traditional post-harvest litchi varietyclassification primarily relies on manual identification, which suffersfrom low efficiency, strong subjectivity, and a lack of standardizedsystems. This study constructs an image dataset comprising the 12 mostcommon litchi varieties found in commercial markets and proposes YOLO-LitchiVar, a lightweight and high-precision detection model that synergistically optimizes both computational efficiency and recognition accuracy for fine-grained litchi variety classification. The proposed model is built upon theYOLOv12 architecture and achieves significant performance improvementsthrough the synergistic optimization of three novel modules. First, we introduce the DSC3k2 module to address lightweight design requirements, employing a depthwise separable convolutional structure that decouples standard convolution into spatial filtering and channel fusion. This innovation significantly reduces the model's complexity, decreasing parameters from 2.57 million to 2.2 million (a 14.1% reduction) and computational cost from 6.5G to 5.6G FLOPs (a 13.8% reduction). Second, we develop the C2PSA cross-layer feature aggregation module to enhance feature representation through multi-scale feature alignment and fusion, specifically improving shallow microtexture characterization capability. This module effectively addresses the missed detection problem of Icy-Flesh Litchi due to the loss of micro-concave texture, increasing the recall rate from 0.492 to 0.706 (a 43.5% enhancement). Finally, we integrate an ECA attention mechanism to optimize discriminative performance, dynamically calibrating channel weights through adaptive kernel 1D convolution, and suppressing background noise (e.g., illumination variations) and features shared by similar varieties. This integration lowers the misclassification rate between Icy-Flesh Litchi and Osmanthus-Fragr Litchi from 0.462 to 0.34 (a 19.1% reduction). Experiments on a dataset of 11,998 multi-variety litchi images demonstrate that the YOLO-LitchiVar model achieves excellent comprehensive performance, with a mAP50-95 of 94.4%, which is 0.8% higher than the YOLOv12 baseline model. It also maintains lightweight advantages with a parameter count of 2.2 million and a computation volume of 5.6G FLOPs, making it suitable for mobile deployment. This study provides an efficient and effective solutionfor intelligent litchi variety identification with global applicability.

Keywords: litchi variety classification, Lightweight model, YOLOv12, cross-level feature fusion, multi-scale feature alignment

Received: 01 Aug 2025; Accepted: 05 Nov 2025.

Copyright: © 2025 Xu, Wu, Su and Ke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Wende Ke

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.