ORIGINAL RESEARCH article
Front. Plant Sci.
Sec. Technical Advances in Plant Science
This article is part of the Research TopicAdvances in Fruit-Growing Systems as a Key Factor of Successful Production: Volume IIView all 7 articles
Lightweight MSW-YOLOv8n-Seg: The Instance Segmentation of Maturity on Cherry Tomato with Improved YOLOv8n-Seg
Provisionally accepted- Shanxi Agricultural University, Jinzhong, China
Select one of your emails
You have multiple emails registered with Frontiers:
Notify me on publication
Please enter your email address:
If you already have an account, please login
You don't have a Frontiers account ? You can register here
Automatic and accurate segmentation of cherry tomato maturity in natural environment is the foundation for automatic picking. Lacking of significant differences in adjacent maturity and the problem of mutual occlusion between fruits usually affect the picking process. According to the changes in phenotypic characteristics of cherry tomato during its mature period and the Chinese national standard GH/T 1193-2021, a lightweight maturity instance segmentation method of cherry tomato with 5 levels, including green, turning, pink, lightred and red was proposed based on improved YOLOv8n-Seg model, named as MobileViTv3-SK-WIoU-YOLOv8n-Seg (MSW-YOLOv8n-Seg). In this model, MobileViTv3 was introduced into the original YOLOv8 model as backbone for feature extraction to reduce the parameters of the original model; selective kernel (SK) attention module was added to the neck part to improve the feature expression ability of the model; the complete intersection over union (CIoU) loss function in the original head part was replaced with wise intersection over union (WIoU), which can effectively filter low-quality samples and improve the stability and reliability of the model in complex scenes. The proposed model can better balance the relationship between segmentation speed, accuracy, and model computational complexity. The experimental results show that the bounding box precision, recall and mean average precision (mAP)@0.5 of the improved model on the test sets were 90.8%, 86.3% and 83.9% respectively, and the model size was 6.0 MB. Compared with YOLOv7-Mask, YOLOv8n-Seg, YOLOv9s-Seg, YOLO11n-Seg, Mask R-CNN (Mask region-based convolutional neural network) and Mask2Former, the bounding box precision increased by 9.6%, 5.2%, 5.7%, 12.3%, 13.3% and 5.0%, the recall increased by 7.8%, 7.4%, 8.8%, 13.1%, 13.9% and 0.1%, and the mAP@0.5 increased by 10.5%, 3.0%, 0.9%, 15.0%, 13.8% and 1.4% respectively. In terms of inference speed, the MSW-YOLOv8n-Seg has the highest inference speed, with FPS of up to 52.9 f·s-1 and latency of only 18.2ms, which demonstrates its real-time processing capability. The results show that the improved MSW-YOLOv8n-Seg model is optimal, and it suitable for instance segmentation scenarios with high real-time performance and can provide effective exploration for automated cherry tomato fruit picking.
Keywords: Cherry tomato, Instance segmentation, maturity, MobileViTv3, MSW-YOLOv8n-Seg, SK Attention, WIoU
Received: 24 Oct 2025; Accepted: 10 Dec 2025.
Copyright: © 2025 Miao and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Zhiwei Li
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
