Full-time sequence assessment of okra seedling vigor under salt stress based on leaf area and leaf growth rate estimation using the YOLOv11-HSECal instance segmentation model

Cao, Xiaowei; Li, Yifan; Zhang, Yaben; Zhong, Zhibo; Bai, Ruxiao; Yang, Peng; Pan, Feng; Fu, Xiuqing

doi:10.3389/fpls.2025.1625154

ORIGINAL RESEARCH article

Front. Plant Sci., 14 August 2025

Sec. Sustainable and Intelligent Phytoprotection

Volume 16 - 2025 | https://doi.org/10.3389/fpls.2025.1625154

Xiaowei Cao¹

Yifan Li¹

Yaben Zhang¹

Zhibo Zhong²

Ruxiao Bai²

Peng Yang²

Feng Pan³

Xiuqing Fu¹*

¹College of Engineering, Nanjing Agricultural University, Nanjing, China
²Institute of Farmland Water Conservancy and Soil-Fertilizer, Xinjiang Academy of Agricultural Reclamation Science, Shihezi, Xinjiang, China
³Institute of Mechanical Equipment, Xinjiang Academy of Agricultural Reclamation Science, Shihezi, Xinjiang, China

Introduction: With the growing severity of global salinization, assessing plant growth vitality under salt stress has become a critical aspect in agricultural research.

Methods: This paper proposes a method for calculating the leaf area and leaf growth rate of okra based on the YOLOv11-HSECal model, which is used to evaluate the vigor of okra at the seedling stage. A high-throughput, Full-Time Sequence Crop Germination Vigor Monitoring System was developed to automatically capture image data from seed germination through the seedling growth stage while maintaining stable temperature and lighting conditions. To address the limitations of the original YOLOv11-seg model, the YOLOv11-HSECal model incorporates the HGNetv2 backbone, Slim-Neck feature fusion, and the EMAttention mechanism.

Results: These improvements led to a 1.1% increase in mAP50, a 0.6% reduction in FLOPs, and a 14.1% decrease in model parameters. Additionally, Merge and Cal modules were integrated for calculating the leaf area and growth rate of okra seedlings. Finally, through salt stress experiments, we assessed the effects of varying NaCl concentrations (CK, 10 mmol/L, 20 mmol/L, 30 mmol/L, 40 mmol/L, 50 mmol/L, and 60 mmol/L) on the leaf area and growth rate of okra seedlings, verifying the inhibitory effects of salt stress on seedling vitality.

Discussion: The results demonstrate that the YOLOv11-HSECal model efficiently and accurately evaluates okra seedling growth vitality under salt stress in a full-time monitoring manner, offering significant potential for broader applications. This work provides a novel solution for full-time plant growth monitoring and vitality assessment in smart agriculture and offers valuable insights into the impact of salt stress on crop growth.

1 Introduction

With the rapid increase in the global population, the production of vegetables and grains has become a primary concern (Zhu et al., 2023). The vitality of plant seedlings is a key determinant of the speed and uniformity of plant emergence and early growth, directly influencing their ability to acquire and compete for essential resources (Mondo et al., 2013). The yield of crops is closely linked to seedling vitality: stronger vitality at the seedling stage correlates with higher yields under identical growth conditions (Podlaski and Chomontowski, 2020). Plants with higher vitality during the seedling stage exhibit better performance in resource acquisition (such as light, water, and nutrients) and enhanced competitiveness, particularly in competition with weeds. Additionally, high-vitality seeds tend to have larger leaf areas during the seedling stage compared to those of low-vitality seeds. For example, during the four-leaf and eight-leaf stages of maize, the leaf area index of high-vitality seeds is 37% and 16% greater than that of low-vitality seeds, respectively (Mondo et al., 2013). Assessing seedling vitality through leaf area and growth rate during this stage is an essential method for plant selection and breeding. Okra (Abelmoschus esculentus L.), an annual herbaceous plant belonging to the Malvaceae family, is a specialty vegetable with both high nutritional and economic value. Its tender pods are rich in proteins, dietary fiber, minerals, and various vitamins, with protein content surpassing that of most conventional fruits and vegetables, and offering a well-balanced amino acid profile. As an economic crop integrating edible, medicinal, and industrial functions, okra is characterized by a short cultivation cycle, strong environmental adaptability, and high yield per unit area. In addition to fresh consumption, it is extensively used in the production of value-added products such as frozen food, seed oil, and health supplements. 
Due to these advantages, okra has been widely cultivated across the globe (Elkhalifa et al., 2021). Salt stress is a major challenge in global agriculture, with approximately 21% of arable land in China affected by salinization (Wang et al., 2013). Soil salinization has emerged as one of the principal environmental stressors limiting plant productivity (Yu et al., 2019; Wang L. et al., 2022). Salt stress, particularly during the seedling stage, significantly reduces photosynthetic efficiency, impairs plant biomass accumulation, and inhibits leaf expansion (Zhang et al., 2018; Zhao et al., 2022). Studies have shown that salt stress leads to a reduction in leaf area and chlorophyll content in okra, causing overall growth retardation and potentially inducing oxidative stress (Abbas et al., 2017). Therefore, accurately assessing the leaf area and growth rate of okra seedlings under salt stress is essential for improving seed quality, enhancing yield, and achieving sustainable agricultural objectives.

Traditional methods for measuring leaf area primarily rely on manual techniques, such as the cardboard drawing method (Korva and Forbes, 1997), the weighing method (Castillo et al., 2014), volumetric accelerated leaf area measurement (Lee et al., 2022), and laser scanning (Berk et al., 2020). Although these methods provide high accuracy, most are destructive and involve cumbersome, time-consuming operations. To reduce manual intervention and enhance efficiency, researchers have developed various image-based, non-destructive leaf area estimation methods, including modified models based on leaf length and width (Leroy et al., 2007), segmentation algorithms based on color and shape features (Tu et al., 2021), and estimation methods utilizing leaf surface density constants (Tu et al., 2021). Meanwhile, An et al. (2016) developed an automated high-throughput phenotyping pipeline that utilizes a cost-effective imaging system combined with image processing algorithms to generate 2D orthomosaic projections. While these approaches improve measurement efficiency, they remain sensitive to factors such as background, lighting, and leaf structure (Castillo et al., 2014), which limits their accuracy. In recent years, advances in 3D reconstruction and imaging technologies have led researchers to explore methods such as binocular stereo vision (Gong et al., 2015), infrared thermal imaging (Zhang and Zhang, 2022), and LIDAR measurement systems (Hu et al., 2018) for leaf area measurement. These techniques offer non-contact, high-precision 3D reconstruction of leaves, addressing some of the limitations associated with occlusion and projection errors inherent in 2D methods. However, their high equipment costs, complex algorithms, challenges in achieving real-time measurements, and sensitivity to environmental changes continue to hinder their widespread adoption in conventional agricultural management (Hu et al., 2018). 
Therefore, there is an urgent need to develop a novel method for measuring leaf area and leaf growth rate that is high-precision, full-time-series, non-invasive, robust to environmental variations, and non-destructive. Such a method would better support modern crop growth monitoring and meet the practical demands of intelligent agriculture.

In recent years, deep learning-based instance segmentation techniques have made remarkable advances in agricultural image analysis, significantly enhancing the efficiency and accuracy of plant phenotypic feature extraction. Compared to traditional image processing and shallow machine learning methods, deep learning models exhibit superior feature learning capabilities and adaptability across diverse environments. Among them, the YOLO (You Only Look Once) series has been widely adopted in agricultural applications due to its end-to-end architecture and excellent balance between speed and accuracy. For instance, Chen et al. (2025) proposed the GE-YOLO model, which incorporates a Gold YOLO multi-scale fusion structure and an EMA attention mechanism, effectively improving weed detection performance in rice fields. Miao et al. (2025) introduced SerpensGate-YOLOv8, integrating DySnakeConv and STA modules to enhance the model’s perception of curved edges in plant disease regions. Similarly, Lv and Su (2024) developed YOLOv5-CBAM-C3TR by introducing a Transformer-based attention mechanism, significantly boosting inter-category segmentation accuracy for apple leaf disease detection. Li HK. et al. (2024) proposed the RSG-YOLOv8 model, which integrates CSPDenseNet and the BRA module for improved detection of extremely small targets during rice seed germination. Additionally, Zhang et al. (2024) developed the S-T-YOLOv5 model, combining the Swin Transformer architecture to achieve high-precision, highly adaptive pollen counting in multiple plant species such as alfalfa. Jiang et al. (2023) utilized the YOLOv8-Pea network to assess the drought tolerance of pea seeds, while Fu et al. (2022) evaluated salt tolerance during wheat seed germination using a YOLOv4-based model. Moreover, Wang ZP. et al. (2022) and Cui et al. (2023) applied optimized YOLOv5 and YOLOv4-Tiny models, respectively, to apple stem recognition and pinecone harvesting tasks. 
Despite YOLO’s strong performance in object detection tasks, its bounding-box-based mechanism cannot provide pixel-level contour information, limiting its effectiveness for applications such as precise leaf area estimation, where high boundary accuracy is critical. In natural field environments, plant leaves often exhibit complex structures, including overlap, occlusion, distortion, and non-rigid deformations. Relying solely on bounding box detection makes it challenging to achieve accurate single-leaf segmentation and area calculation. To overcome these limitations, researchers have increasingly adopted instance segmentation approaches for plant leaf recognition and area estimation. For example, Huang et al. (2024) proposed a Mask R-CNN model enhanced with local refinement mechanisms to achieve fine segmentation of Chinese cabbage leaves under complex greenhouse conditions. Lüling et al. (2023) combined Mask R-CNN with Structure-from-Motion (SfM) 3D reconstruction technology to enable non-contact measurement of fruit volume and leaf area across multiple cabbage growth stages. Lu et al. (2024) developed a maize growth organ recognition and annotation system by integrating YOLOv5 with the Segment Anything Model. Furthermore, the Hierarchical Plant Segmentation Framework (Roggiolani et al., 2023) enabled semantic segmentation without relying on point clouds, achieving joint modeling of plant and leaf instances and providing a decoupled multi-scale pathway for leaf area estimation. Schneider et al. (2024) employed YOLOv8 to detect and model different growth stages of chili peppers in hydroponic systems. 
Although previous studies have demonstrated significant improvements in segmentation accuracy using various YOLO-based models, they still suffer from several limitations, including complex network architectures, lack of inherent segmentation capabilities, large parameter sizes, low inference efficiency, and limited robustness under natural lighting conditions (Khanam and Hussain, 2024). Moreover, most existing approaches primarily focus on static images or a single growth stage, making them unsuitable for high-throughput monitoring of leaf area throughout the entire plant growth cycle. YOLOv11 (Wu et al., 2024), as the latest open-source iteration of the YOLO series, exhibits comprehensive improvements over earlier versions in terms of input image processing, computational load, edge device deployment, precision, and parameter efficiency. It addresses the shortcomings of earlier variants that primarily focused on irregular single-frame images and were constrained by environmental complexity. YOLOv11 is thus better suited for executing full-growth-cycle object detection and instance segmentation tasks with higher efficiency. In addition, conventional YOLO models tend to underperform in recognizing small features and irregular leaf contours. Therefore, further model improvements to enhance segmentation precision are necessary. When targeting only okra leaf detection and segmentation, the output merely provides the number of segmented masks in an image. To address this, the YOLO model needs to be extended with post-processing modules that refine the segmentation output and enable further image-level analysis. Furthermore, the model still exhibits limitations in terms of FLOPs and parameter count, imposing high requirements for computational resources, runtime, and battery life, factors that challenge its deployment on modern agricultural devices. 
Thus, reducing FLOPs and model parameters is essential to achieve further lightweighting and improve deployment efficiency in resource-constrained agricultural environments.

In response to the aforementioned challenges, this study designs a full-time-series crop germination vigor monitoring system and proposes an improved YOLOv11-HSECal model to achieve the following objectives:

1. Address the limitations of existing methods in complex environments, including insufficient accuracy, destructive data collection, and irregular phenotypic monitoring, by enabling non-contact and dynamic leaf data acquisition throughout the entire growth process of okra from seed germination to the seedling stage.

2. Meet the agricultural monitoring demands for lightweight, high-throughput, and high-precision deep learning algorithms, while ensuring large-scale deployment on resource-constrained devices.

3. Accurately assess and output the actual leaf area and leaf growth rate of okra under salt stress, enabling effective monitoring of okra seedling vigor. This provides a viable solution to the low accuracy of YOLO models in detecting small features and irregular contours, and to their inability to output beyond instance masks.

2 Materials and methods

2.1 Development of full time sequence monitoring system for plant initial growth

To enable full-time monitoring of okra seedling growth, we developed a high-throughput, full-time crop germination vigor monitoring system capable of maintaining optimal growth conditions for plants, as illustrated in Figure 1. This system supports multiple breeding methods, including soil cultivation and hydroponics, while providing a controlled environment with constant temperature and continuous lighting. Furthermore, it enables the automated and continuous monitoring of the entire developmental process from seed germination to seedling growth, referred to as “temporal monitoring”. The system is composed of five major components: (1) breeding environment control system, (2) machine vision-based high-throughput imaging and orbital image acquisition module, (3) seed growth and cultivation unit, (4) computer-aided control module, and (5) human-computer interaction interface.

Figure 1

Experimental equipment setup with labeled sections: A. Breeding environmental regulation system with temperature and lighting controls; B. Human-computer interface with monitoring screens; C. Machine vision and track module; D. Seed growth unit showing plant trays; E. Computer-aided control module with image processing and camera interfaces; F. Data acquisition and testing results showing plant growth stages.

Figure 1. Full-time sequence crop germination vigor monitoring system: (A) Breeding environmental regulation system; (B) Human-computer interface; (C) Machine vision and track-based module; (D) Seed growth and cultivation unit; (E) Computer-aided control module; (F) Data acquisition and testing result.

The breeding environment control system (Figure 1A) consists of an incubator measuring 1055 mm in length, 740 mm in width, and 1740 mm in height (manufacturer: Henan Greentech Electric Technology Co., Ltd.), with a total of five incubators deployed at the experimental site. This system primarily controls lighting and temperature. It integrates a dual-channel temperature control system featuring two precision heaters (operating range: 10–60°C; accuracy: ± 0.1°C) equipped with Tp-100 thermocouples and an embedded PTC thermodynamic circulation unit. Together, they form a dynamic temperature control network capable of intelligently adjusting heating power based on real-time data feedback. When the internal temperature falls below the set threshold, the PTC hot aerodynamic circulation system activates to raise the temperature; conversely, when the temperature exceeds the upper limit, the system automatically shuts down to maintain environmental stability. Temperature settings are located at the upper right corner of the incubator and can be specified using the adjustment button. The lighting system features a dual-sided and top-mounted fixed LED array with a non-dimmable intensity design, controlled via a switch on the incubator panel. It provides full-cycle lighting conditions to support the seedling growth of okra.

The machine vision-based high-throughput imaging and track-based image acquisition module (Figure 1C) was constructed using a two-dimensional precision motion platform, with an X-axis horizontal rail spacing of 800 mm and a Y-axis vertical rail spacing of 1000 mm. A camera was mounted on a stepper motor-driven rail, allowing free movement across the upper plane of the incubator and enabling image capture at six designated positions. The motion mechanism supports an adjustable speed range of 0–50 mm/s. The imaging unit employs a HIKVISION RGB industrial camera (model MV-CS200–10GC) equipped with a CMOS sensor and a fixed-focus lens with a focal length of 30 mm (model LD-23–0.18X145, manufacturer: Jiangsu Suzhou Youxin Zeda Co.). Together with a ring-shaped constant-illuminance light source (luminous intensity of 183.9 kLux and brightness of 46.341 kcd/m²), the system establishes a standardized imaging environment. Image acquisition operates in a fully automated trigger mode, featuring a closed-loop feedback mechanism for both position coordinates and speed parameters. During image capture, the camera is automatically triggered and transmits image data to a computer via an RJ-45 Gigabit Ethernet interface located at the rear of the incubator. This setup facilitates efficient image editing, dataset construction, and training of instance segmentation models.

The seed growth and cultivation unit (Figure 1D) consists of a platform constructed inside the incubator, featuring six fixed positions designed to hold cultivation boxes fabricated via 3D printing. The fixed imaging points for the camera are aligned with these six designated positions. The platform is positioned approximately 1100 mm below the top of the incubator, providing ample vertical space to support the healthy growth of okra seedlings during the seedling stage.

The computer-aided control module (Figure 1E) adopts a dual-software collaborative architecture. The camera debugging software is responsible for hardware parameter configuration (such as exposure and resolution), data transmission management, and abnormal event logging. The image processing software manages the entire workflow of data reception, storage, and analysis, establishing a complete data processing pipeline from raw image acquisition to feature extraction.

The human-computer interaction interface (Figure 1B) is developed based on the MCGSpro platform, integrating the Mitsubishi FX5U-32MT PLC and the MCGS-Top10s touchscreen. Multi-module data interaction is achieved through MODBUS/TCP and TCP/IP protocols. The main control interface features a dual-mode operation design: the automatic mode supports one-click initiation of the acquisition sequence and parameter presetting, while the manual mode provides fine-grained control options, including motion control (coordinate positioning and speed adjustment), light source management, and shooting interval settings. The real-time status monitoring panel consolidates functions such as equipment operation status display, track coordinate feedback, and limit status indication, ensuring both the safety and convenience of system operation.

2.2 Okra seedling dataset construction and image acquisition

To train the leaf area and leaf growth rate calculation model for okra seedlings, this study selected 336 red okra seeds (purchased from Shouguang Xinxinran Horticulture Co., Ltd., Shouguang, Shandong, China) that were morphologically intact and uniform in size to construct a seedling monitoring model. Following the germination pretreatment procedure illustrated in Figure 2a, a sheet of white filter paper (210 mm × 297 mm) was moistened with deionized water and laid flat on an alcohol-sterilized table. Fifteen okra seeds were evenly spaced 20 mm from the long edge of the paper, followed by another fifteen seeds placed 20 mm apart from the first row. Another sheet of filter paper, moistened halfway, was used to cover the seeds with its wet side, allowing the filter papers between the two rows to adhere to each other. The assembly was rolled by folding the long edge (20 mm) vertically three times and the short edge (40 mm) horizontally three times, secured with a rubber band, and placed into a paper cup containing deionized water. This setup, termed a “cultivation paper pack,” was incubated at 28°C for 24 hours to promote germination. In total, 18 cultivation paper packs were prepared. This process ensured the acquisition of seeds at similar germination states, facilitating subsequent vigor assessment during the seedling growth stage. Subsequently, seeds with similar germination states were neatly and uniformly arranged in a 4×4 grid within six soil-based cultivation trays. These trays were then placed at six fixed positions within the seed growth cultivation module, with an inter-seed spacing of 50 mm. All seeds were then transferred to an incubator maintained at 28 °C under constant temperature and illumination conditions for nine days of full-time seedling monitoring. The experimental timeline commenced when the seeds were placed into the incubator and concluded at the end of the nine-day cultivation period. 
Utilizing the full-time crop germination vigor monitoring system illustrated in Figure 1, images were automatically captured at 30-minute intervals (the interval is adjustable), yielding a total of 3,456 time-series images. The study employed a HIKVISION RGB industrial camera (model MV-CS200–10GC) equipped with a CMOS sensor and a fixed-focus lens with a 30 mm focal length (model LD-23–0.18X145, manufacturer: Jiangsu Suzhou Youxin Zeda Co.). During each capture cycle, the camera photographed six fixed positions. Additional imaging parameters are detailed in Table 1. Representative daily images are shown in Figure 2b. According to the criteria established by Martini (2024), the growth of seeds after soil emergence is classified as the seedling stage. Observations indicated that during the first 24–28 hours of the experiment, leaves had not yet emerged, making it difficult to assess seed vigor (Figure 2b). In the later stages, during the final 24 to 48 hours, substantial leaf overlap was observed (Figure 2b), complicating the accurate identification of individual plants. Consequently, 480 images from the initial 24 hours and 960 images from the final 48 hours were excluded, along with 293 images affected by environmental changes, exposure anomalies, or blurring. Ultimately, 1,723 valid images were retained to form the fundamental dataset for model training.
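
The image counts reported above can be checked with a few lines of arithmetic. The following is only a bookkeeping sketch; the dictionary keys are illustrative labels and are not names used by the authors.

```python
# Image counts as reported in the text.
total_captured = 3456

# Exclusion categories (key names are illustrative labels only).
excluded = {
    "initial_24_hours_no_leaves": 480,
    "final_48_hours_leaf_overlap": 960,
    "environment_exposure_or_blur": 293,
}

# Images remaining after all exclusions.
valid_images = total_captured - sum(excluded.values())
print(f"Valid images retained: {valid_images}")
```

The result matches the 1,723 valid images stated in the text.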

Figure 2

Diagram illustrating the procedure for germinating okra seeds. Part (a) shows steps including soaking seeds, arranging them on filter paper, and incubating. Part (b) depicts a series of images labeled 1d to 9d, capturing the growth over nine days, showing the progression from germination to seedling development.

Figure 2. (a) Pretreatment of the germination process; (b) Collected images of okra seeds at seedling stage.

Table 1

Table 1. Test parameters.

For dataset annotation, we employed the ISAT-SAM tool (Yateng, 2023) to perform semi-automatic instance segmentation of okra leaves. ISAT-SAM is a semi-automated image annotation tool that integrates Interactive Segmentation Annotation (ISAT) with Meta's Segment Anything Model (SAM; Kirillov et al., 2023). Based on the SAM framework, users can generate high-quality segmentation masks by simply clicking on the target area, significantly reducing the annotation workload. In this study, the annotation was specifically focused on okra leaves, with the label category set as “leaf.” The annotation process is illustrated in Figure 3a. The annotation files generated by ISAT-SAM are in JSON format, which does not directly meet the requirements for instance segmentation model training. Therefore, we utilized the data conversion tool provided by ISAT-SAM to extract key parameters such as image resolution, category identifiers, and bounding box coordinates, and normalized the pixel coordinates to generate a TXT file compatible with the YOLO training standard.
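
To make the coordinate normalization step concrete, the sketch below converts one annotated polygon from pixel coordinates into a single label line of the kind used for YOLO segmentation training (class index followed by x/y pairs scaled to the range 0 to 1). It is not the ISAT-SAM converter itself: the function name `polygon_to_yolo_seg`, the example polygon, and the image size are hypothetical, and the ISAT-SAM JSON schema is deliberately not modeled here.

```python
def polygon_to_yolo_seg(points, img_w, img_h, class_id=0):
    """Convert a pixel-space polygon to one YOLO-style segmentation label line.

    points   : list of (x, y) vertices in pixels (hypothetical input layout)
    img_w/h  : image width and height in pixels
    class_id : integer class index (0 is assumed here for the single "leaf" class)
    """
    if len(points) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = []
    for x, y in points:
        # Normalize to [0, 1] and clamp in case a vertex sits on the border.
        nx = min(max(x / img_w, 0.0), 1.0)
        ny = min(max(y / img_h, 0.0), 1.0)
        coords.extend([f"{nx:.6f}", f"{ny:.6f}"])
    return " ".join([str(class_id), *coords])


# Hypothetical triangle in a 400 x 200 pixel image.
example_line = polygon_to_yolo_seg([(100, 50), (200, 50), (150, 150)], 400, 200)
print(example_line)
```

Each annotated leaf would contribute one such line to the TXT file that shares its base name with the image.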

Figure 3

Diagram illustrating a multi-step process for leaf image analysis and data handling. (a) Shows a software interface for storing images, creating a label set, selecting a model, viewing the total number of labels, and data conversion. (b) Displays data annotation and augmentation steps with images in JSON, TXT, and JPG formats, including variations through rotation, hue, noise, blur, brightness, and grayscale. (c) Outlines a workflow from time-series collection, image preprocessing, data annotations, model training, instance segmentation, to area and growth rate calculation. A bright green leaf is also shown with a labeled arrow.

Figure 3. (a) ISAT-SAM software annotation process; (b) dataset division and dataset enhancement. (c) System workflow for time-series data collection, model training, and phenotypic traits calculation.

According to the occlusion levels defined by Paul (Paul et al., 2024b; Paul and Machavaram, 2025), this study considers two levels of occlusion: No Occlusion and Leaves Occlusion. The imaging was conducted under constant illumination conditions, with a light intensity of 183.9 kLux and a luminance of 46.341 kcd/m². This setup ensured uniform lighting across the entire dataset, facilitating consistent feature extraction for subsequent model training and evaluation. To improve the model’s resistance to overfitting and its generalization across different scenes, multi-dimensional data augmentation strategies were employed (Figure 3b), including rotation (within the range of -45° to +45°), hue adjustment (between -180° and +180°), brightness adjustment (within -90% to +90%), blur (up to 5.7 px), and noise (up to 8.26% of pixels). This augmentation process resulted in a total of 2,707 images. After splitting the dataset at a ratio of 70:20:10, a complete dataset comprising 1,895 training images, 541 validation images, and 271 test images was obtained (the partitioning process is illustrated in Figure 3b). This proportion ensures an appropriate and balanced distribution of samples across subsets, which is critical for building robust models and performing reliable evaluations. Such a division strategy is commonly employed in instance segmentation tasks (Hong et al., 2025) and seed germination object detection studies (Tian et al., 2025). By simulating variations in imaging conditions and viewing angles, the augmentation significantly enhances the model’s adaptability to complex experimental environments. Throughout all image processing and annotation conversion stages, the original resolution and spatial accuracy of the annotations were preserved to ensure the reliability of the object detection model training.
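
The subset sizes follow directly from the 70:20:10 ratio applied to 2,707 images. The sketch below reproduces them under one assumed rounding rule (round the training and validation counts, give the remainder to the test set) and shows a seeded shuffle-and-slice split; the helper names, the seed, and the placeholder file names are hypothetical, and the authors' actual partitioning tool is not described in the text.

```python
import random


def split_counts(n_total, ratios=(0.7, 0.2, 0.1)):
    """Return (train, val, test) sizes; the test set takes the remainder."""
    n_train = round(n_total * ratios[0])
    n_val = round(n_total * ratios[1])
    n_test = n_total - n_train - n_val
    return n_train, n_val, n_test


def split_items(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle a copy of `items` reproducibly and slice it into three subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train, n_val, _ = split_counts(len(shuffled), ratios)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


counts = split_counts(2707)
print(counts)

# Placeholder file names stand in for the augmented images.
train, val, test = split_items([f"img_{i:04d}.jpg" for i in range(2707)])
print(len(train), len(val), len(test))
```

Under this rounding rule the counts come out as 1,895, 541, and 271, in agreement with the figures reported above.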

To preliminarily validate the usability of the constructed dataset, we trained the YOLOv11 model using the complete dataset. The trained model was then employed to perform inference on randomly selected okra leaf images. As illustrated in Figure 1F, the model successfully generated object detection bounding boxes and corresponding segmentation mask outputs. These results demonstrate that the dataset possesses good usability and is suitable for supporting subsequent model training and performance evaluation.

Figure 3c illustrates the overall workflow of this study, including time-series data collection, image preprocessing, annotation, model training, instance segmentation, and calculation of leaf area and growth rate.

2.3 Okra seedling growth experiment under salt stress

To investigate the potential impact of salt stress on the vigor of okra seedlings, we designed a full-time monitoring experiment under soil culture conditions. In this experiment, sodium chloride (NaCl) solution was used to simulate a salt stress environment (Rewald et al., 2015), with six NaCl concentrations ranging from 10 to 60 mmol/L in 10 mmol/L increments. The concentration range was determined based on two key considerations. First, preliminary experiments indicated that the sensitivity threshold of okra seedlings to NaCl is approximately 10 mmol/L, with higher concentrations (≥20 mmol/L) significantly inhibiting leaf expansion. Second, Ullah and Jan (2024) reported that even under low-salinity conditions (50 mM NaCl), the seedling vigor of most okra cultivars was markedly reduced. Deionized water was used as the control group (CK) to compare differences in leaf area and leaf growth rate between the treatment groups and normal water conditions. To minimize experimental variability, two replicate experiments were conducted simultaneously, with a total of 21 culture cassettes. During the experiment, 250 mL of the corresponding solution was sprayed into each culture box every 12 hours. Table 1 presents the remaining experimental parameters. In this study, based on the full-time seedling stage leaf image data collected under salt stress conditions, a deep learning instance segmentation model was developed to systematically analyze the spatiotemporal dynamics of okra seedling growth vitality indices (leaf area and leaf growth rate) under different salt stress levels. The aim was to establish an intelligent evaluation system for okra seedling vitality using computer vision techniques.
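
For readers who want to relate the molar concentrations to weighed quantities, the sketch below converts each treatment level into grams of NaCl per liter and per 250 mL application. The text does not describe how the solutions were prepared, so this is only an illustrative conversion; the molar mass of NaCl (about 58.44 g/mol) is a standard value that is not given in the source, and the function name is hypothetical.

```python
# Standard molar mass of NaCl (Na 22.99 + Cl 35.45); not stated in the text.
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def nacl_grams(conc_mmol_per_l, volume_l=1.0):
    """Mass of NaCl (g) contained in `volume_l` liters at the given concentration."""
    moles = conc_mmol_per_l / 1000.0 * volume_l
    return moles * NACL_MOLAR_MASS_G_PER_MOL


# Treatment levels from the text: 10 to 60 mmol/L in 10 mmol/L steps (CK = 0).
concentrations = list(range(10, 70, 10))

for c in concentrations:
    per_liter = nacl_grams(c)
    per_spray = nacl_grams(c, volume_l=0.25)  # one 250 mL application
    print(f"{c} mmol/L: {per_liter:.4f} g/L, {per_spray:.4f} g per 250 mL")
```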

2.4 Calculation model of leaf area growth rate

2.4.1 Model training conditions and hyperparameter settings

The experiments were run on a 12th Gen Intel® Core™ i5-12500H processor (2.50 GHz) with the Windows 11 operating system and an NVIDIA GeForce RTX 3050 Ti graphics card. The deep learning framework utilized was PyTorch 2.6.0 (developed by Facebook Artificial Intelligence Research, FAIR), running in a virtual environment created via Anaconda3. Python 3.11 (developed by the Python Software Foundation, PSF) and CUDA version 12.6 (developed by NVIDIA) were employed for training the deep learning model on okra seedling process images. The remaining environment configurations are listed in Table 2.


Table 2. Model training environment.

During training, hyperparameters were carefully fine-tuned to optimize model performance, with the key parameters summarized in Table 3. These include the learning rate for convergence speed and stability, the number of warm-up epochs to prevent gradient oscillation due to an initially high learning rate, and weight decay to mitigate overfitting and regulate model complexity. These settings ensured that the YOLOv11-HSECal model was effectively trained for accurate okra instance segmentation. To ensure fairness and comparability across experiments, all models were trained from scratch without the use of pretrained weights.


Table 3. Model training hyperparameter settings.

The model was trained using the Stochastic Gradient Descent (SGD) optimizer (Ketkar, 2017), which is known for its stability and generalization capability, especially suitable for agricultural imaging scenarios involving high-resolution inputs and a limited number of object instances. The initial learning rate was set to 1×10^-2, and a warm-up strategy was adopted, linearly increasing the learning rate during the first 3 epochs to avoid early-stage training instability. A StepLR scheduler (Wen et al., 2023) was then applied to reduce the learning rate by a factor of 0.5 every 20 epochs, gradually decaying it to 1×10^-4 to ensure smooth convergence in the later stages.
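The schedule described above can be illustrated with a minimal sketch. The function name `learning_rate`, the exact shape of the linear warm-up, and counting decay steps from epoch 0 are assumptions made for illustration, not the authors' training code. With a factor of 0.5 every 20 epochs over 100 epochs, the rate reaches 6.25×10^-4 in the final epochs, so the 1×10^-4 value is treated here as a lower bound.

```python
def learning_rate(epoch, base_lr=1e-2, warmup_epochs=3,
                  step_size=20, gamma=0.5, min_lr=1e-4):
    """Learning rate for a 0-indexed epoch: linear warm-up, then step decay."""
    if epoch < warmup_epochs:
        # Assumed warm-up shape: linear ramp reaching base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Multiply by gamma every `step_size` epochs, never going below min_lr.
    return max(base_lr * gamma ** (epoch // step_size), min_lr)


# Learning rate for each of the 100 training epochs.
schedule = [learning_rate(epoch) for epoch in range(100)]
```

In an actual PyTorch run the same behavior would normally come from the framework's own scheduler classes rather than a hand-written function.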

To further suppress overfitting, a weight decay of 5×10^-4 was introduced as an L2 regularization term in the optimizer, which constrains the magnitude of model weights and improves generalization. This parameter was selected based on the official YOLO recommendations and refined through pre-experiments within the 1×10^-4 to 1×10^-3 range. Results indicated that setting the weight decay to 5×10^-4 achieves a good trade-off between convergence speed and detection accuracy.

2.4.2 Construction of YOLOv11-HSECal model

In the early seedling stage, when okra first emerges from the soil, the target leaves are relatively small and often exhibit irregular edge contours (boundaries composed of multiple curved and serrated segments rather than smooth lines or arcs), which substantially increases the complexity of the segmentation task. These characteristics pose significant challenges for the YOLOv11-seg model in accurately detecting and segmenting such fine-grained features. Although YOLOv11-seg has demonstrated strong performance in object detection and instance segmentation across various applications, it still presents certain limitations in segmentation accuracy (mAP), computational load (FLOPs), and model parameter size. To address these shortcomings, this study proposes a series of optimizations to the original model. Specifically, the YOLOv11-seg backbone was replaced with HGNetv2 from the RT-DETR framework (Zhao et al., 2024); the Neck component was substituted with the lightweight Slim-Neck (Li HL. et al., 2024); and the EMAttention mechanism (Ouyang et al., 2023) was introduced to enhance feature representation. Furthermore, two additional modules, Merge and Cal, were integrated to construct the proposed YOLOv11-HSECal model, as illustrated in Figure 4. These modifications improve the model’s segmentation capability for small and irregular targets, increase segmentation precision, and reduce both FLOPs and parameter count. The enhanced model also facilitates the accurate computation of leaf area and growth rate, thereby improving the model’s real-time performance and deployment potential on resource-constrained edge devices. The specific improvements are as follows:

1. The backbone of the YOLOv11-seg model was replaced with HGNetv2, the backbone architecture of the RT-DETR model, to address the original model’s limited ability to comprehend and process complex scenes. HGNetv2 is a convolutional backbone organized as a stem followed by hierarchical stages of HGBlock units (see Figure 4). Each block passes features through a sequence of convolution layers and aggregates their outputs, so that local detail and broader contextual information are fused across multiple scales. This hierarchical feature aggregation enhances the model’s capacity to handle intricate visual tasks. Integrating HGNetv2 into the YOLOv11-seg framework maintains segmentation accuracy while reducing model parameters and computational complexity (FLOPs) to a certain extent.

2. The Slim-Neck feature fusion module was introduced to replace the original neck component, aiming to optimize the feature fusion process and enhance segmentation accuracy for small targets and irregular leaf contours. Slim-Neck improves the efficiency of feature integration by incorporating the VoV-GSCSP module and GSConv (a hybrid convolution module), thereby reducing redundant computations. In this architecture, feature maps at various scales are first processed through the GSConv module. These processed maps are then fused with feature maps from other scales using bilinear interpolation for upsampling, followed by concatenation operations. The resulting fused maps are further refined through another pass of GSConv, followed by additional feature extraction and integration using the VoV-GSCSP module. This design enables more effective multi-scale feature representation, particularly enhancing the detection performance for small-scale objects and improving the model’s robustness and accuracy under complex visual conditions.

3. The EMAttention mechanism was introduced to replace the original C2PSA attention module in YOLOv11-seg, aiming to enhance the model’s object detection and segmentation performance in complex backgrounds and scenarios with overlapping objects. EMA (Efficient Multi-scale Attention) is a novel attention mechanism designed to improve feature representation while reducing computational overhead. It captures both short-range and long-range dependencies within feature maps through a multi-scale attention architecture. Unlike conventional attention mechanisms, EMA avoids dimensionality reduction, thereby preserving rich channel information and strengthening spatial feature aggregation. Additionally, EMA employs parallel sub-networks with 1×1 and 3×3 convolutional kernels to aggregate multi-scale information from feature groups. It also captures pixel-level pairwise relationships through cross-spatial learning, resulting in more refined spatial feature distributions and improved contextual understanding of images. Experimental results demonstrate that integrating EMA significantly enhances the detection and segmentation accuracy of the YOLOv11-seg model.

4. The Merge module was incorporated to address the issue of multiple masks resulting from the segmentation of numerous okra leaves within a single image. Following instance segmentation, each image yields multiple leaf masks, with the segmentation output containing several mask attributes. To integrate these, the Merge module iteratively accumulates the masks: each individual mask is converted to the uint8 format, resized to match the original image dimensions, and overlaid onto the original image to generate a unified merged mask containing all segmentation information. A thresholding operation is then applied to the merged mask to ensure that pixel values fall within the range [0, 255]. The final merged mask is saved as a high-resolution image file named mask.jpg (1800×1850 pixels). Concurrently, binary images and bounding boxes corresponding to each segmented leaf region are extracted, and the central coordinates (Cx, Cy) and pixel counts of individual masks are output. To further enhance the quality of the instance segmentation masks for okra leaves, a series of image processing techniques were applied, including Canny edge detection, B-spline interpolation smoothing, and morphological operations such as dilation and erosion (Unser et al., 1993; Ding and Goshtasby, 2001; Kang et al., 2016). The integration of these methods improves the precision and smoothness of the segmentation masks, thereby enhancing the accuracy of leaf area calculation and the reliability of leaf growth tracking over time.

5. The Cal module was introduced to compute leaf area and leaf growth rate based on the pixel data and mask images transmitted from the Merge module. This module is subdivided into Cal-area and Cal-speed, depending on the nature of the dataset. For individual images or datasets without temporal continuity, the Cal-area module calculates the leaf area for each sample. In contrast, for datasets representing full time-series growth sequences, the Cal-speed module is employed to compute the real-time growth rate of each leaf, while simultaneously outputting corresponding leaf area values. To enable accurate tracking of individual leaves across different time points, a cross-frame label tracing method was developed. Each leaf is assigned a unique identifier (ID), which is maintained throughout the sequence. In the initial frame, 16 leaves fixed within the image are arranged in a 4×4 grid. Based on the vertical coordinate (Cy) of the leaf centroid, the image is divided into four horizontal rows. Within each row, leaves are sorted by their horizontal centroid coordinate (Cx), and assigned IDs sequentially from left to right, ranging from 0 to 15. For subsequent frames, newly detected leaves are matched to previously identified ones by comparing their segmentation masks to those from the preceding frame. The Intersection over Union (IoU) metric (Cheng et al., 2021) is employed as the matching criterion. If the maximum IoU for a candidate leaf exceeds a predefined threshold (e.g., 0.5), it is deemed to represent the same leaf, and the corresponding ID is inherited. This method ensures continuous and reliable tracking of individual leaves across all frames, facilitating precise analysis of temporal growth patterns.
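The cross-frame label tracing described in item 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: masks are represented as sets of (row, column) pixel coordinates, rows of the grid are formed by sorting centroids by Cy and taking consecutive groups, and the function names `assign_initial_ids`, `mask_iou`, and `match_ids` are hypothetical.

```python
def assign_initial_ids(centroids, rows=4, cols=4):
    """Assign IDs row by row (top to bottom), left to right within each row.

    centroids: list of (cx, cy) tuples, one per leaf, rows * cols in total.
    Returns a list of IDs aligned with the input order.
    """
    # Sort leaf indices by vertical centroid coordinate to form rows (assumed grouping).
    order = sorted(range(len(centroids)), key=lambda i: centroids[i][1])
    ids = [None] * len(centroids)
    for r in range(rows):
        row = order[r * cols:(r + 1) * cols]
        row.sort(key=lambda i: centroids[i][0])  # left to right by Cx
        for c, index in enumerate(row):
            ids[index] = r * cols + c
    return ids


def mask_iou(mask_a, mask_b):
    """Intersection over Union of two masks given as sets of pixel coordinates."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def match_ids(previous_masks, new_masks, threshold=0.5):
    """Inherit IDs from the preceding frame when the best IoU exceeds the threshold.

    previous_masks: dict mapping leaf ID to mask; new_masks: list of masks.
    Returns one ID (or None when no match is found) per new mask.
    """
    matched = []
    for mask in new_masks:
        best_id, best_iou = None, 0.0
        for leaf_id, previous in previous_masks.items():
            iou = mask_iou(mask, previous)
            if iou > best_iou:
                best_id, best_iou = leaf_id, iou
        matched.append(best_id if best_iou > threshold else None)
    return matched
```

A leaf that returns `None` here has no sufficiently overlapping predecessor; how such cases are handled in the actual Cal module is not specified in the text.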


Flowchart illustrating the architecture and processes of a neural network. It includes separate sections labeled “Backbone,” “Neck,” “Head,” “Merge,” and “Cal.” Various components, such as convolutional layers, are labeled with terms like “Conv,” “GSConv,” “VoVGSCSP,” and “HGBlock.” The “Merge” and “Cal” sections depict processes related to image analysis, including “Gaussian blur,” “Canny Edge Detection,” and calculations for “Cal-speed” and “Cal-area.” Output images show cell-like structures in green.

Figure 4. Improved YOLOv11-HSECal model network structure.

The Merge and Cal modules operate only on the model’s output and are not involved in the training phase; therefore, they have a negligible impact on key model metrics such as segmentation accuracy (mAP), the number of model parameters, and FLOPs.
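As a post-processing step, the mask accumulation performed by the Merge module can be sketched in a few lines. The following is a simplified illustration under stated assumptions: masks are plain 2D lists with values in [0, 1] that have already been resized to the image dimensions, the threshold of 0.5 is assumed, and the function name `merge_masks` is hypothetical. The edge detection, smoothing, and morphological steps are omitted.

```python
def merge_masks(masks, threshold=0.5):
    """Merge per-leaf masks into one 0/255 mask and collect per-leaf statistics.

    masks: list of equally sized 2D lists with values in [0, 1].
    Returns the merged mask and, for each non-empty leaf mask, its centroid
    (cx, cy) and pixel count.
    """
    height, width = len(masks[0]), len(masks[0][0])
    merged = [[0] * width for _ in range(height)]
    stats = []
    for mask in masks:
        # Pixels belonging to this leaf after thresholding.
        pixels = [(x, y) for y in range(height) for x in range(width)
                  if mask[y][x] > threshold]
        for x, y in pixels:
            merged[y][x] = 255  # background stays 0, leaf pixels become 255
        if pixels:
            cx = sum(x for x, _ in pixels) / len(pixels)
            cy = sum(y for _, y in pixels) / len(pixels)
            stats.append({"cx": cx, "cy": cy, "pixels": len(pixels)})
    return merged, stats
```

At the 1800×1850 resolution used in this study, an array library would be far faster than nested lists; the pure-Python form is kept only to make the logic explicit.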

2.4.3 Okra seedling vigor evaluation index

The leaf area and leaf growth rate during the seedling stage of okra are key indicators directly reflecting the vigor of okra seeds and the growth status of seedlings (Al-Musawi and Al-Moussawi, 2020). The actual leaf area is calculated using the Cal-area module based on Equations 1, 2, with image resolution set at 1800 × 1850 pixels and an actual physical length of 250 mm for the image. The corresponding formulas are as follows:

$$ \mathrm{Pixel}_{\mathrm{Area}}\ (\mathrm{mm}^{2}) = \left(\frac{250}{\mathrm{Img}_{\mathrm{Width}}}\right)^{2} \tag{1} $$

$$ \mathrm{Leaf}_{\mathrm{Area}}\ (\mathrm{mm}^{2}) = \mathrm{Pixel}_{\mathrm{Area}} \times \mathrm{Sum}_{\mathrm{Pixel}} \tag{2} $$

where $\mathrm{Pixel}_{\mathrm{Area}}$ denotes the actual area represented by each pixel; $\mathrm{Img}_{\mathrm{Width}}$ refers to the total number of pixels along the image boundary; $\mathrm{Leaf}_{\mathrm{Area}}$ represents the actual area of each individual leaf; and $\mathrm{Sum}_{\mathrm{Pixel}}$ corresponds to the total number of pixels in the two-dimensional Boolean matrix obtained through segmentation (Benhacine et al., 2019).
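Equations 1 and 2 translate directly into code. The sketch below assumes that the 250 mm physical length corresponds to the 1800-pixel image dimension; the function names and the example pixel count are hypothetical.

```python
def pixel_area_mm2(img_width_px, physical_length_mm=250.0):
    """Equation 1: physical area represented by a single pixel, in mm^2."""
    return (physical_length_mm / img_width_px) ** 2


def leaf_area_mm2(pixel_count, img_width_px, physical_length_mm=250.0):
    """Equation 2: leaf area as per-pixel area times the number of mask pixels."""
    return pixel_area_mm2(img_width_px, physical_length_mm) * pixel_count


# Example: a mask of 10,000 pixels, assuming 1800 pixels span 250 mm.
example_area = leaf_area_mm2(pixel_count=10_000, img_width_px=1800)
```

Under these assumptions each pixel covers roughly 0.0193 mm², so a 10,000-pixel mask corresponds to about 193 mm².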

The full time-series dataset utilized in this study incorporates timestamps in the format “YYYY-MM-DD-HH-MM-SS”. The Cal-speed module extracts temporal information from these timestamps using regular expressions and computes the time interval (unit: hours) between successive frames. Once the leaf area for each dataset is obtained, the Cal-speed module calculates the real-time growth rate of okra leaves based on Equation 3, enabling the analysis of growth rate variations under different salt stress conditions. The formula is presented as follows:

$$ \mathrm{GrowthRate} = \frac{\mathrm{Area}_{\mathrm{current}} - \mathrm{Area}_{\mathrm{previous}}}{t_{i} - t_{i-1}} \tag{3} $$

where $\mathrm{GrowthRate}$ denotes the leaf growth rate, while $\mathrm{Area}_{\mathrm{current}}$ and $\mathrm{Area}_{\mathrm{previous}}$ represent the leaf areas in the current and previous frames, respectively. $t_{i}$ and $t_{i-1}$ represent the timestamps of the current and previous frames, respectively. Notably, the time interval is not limited to consecutive frames; it can also represent any user-defined time window, allowing for flexible analysis of okra growth rates across different temporal scales. Accordingly, we propose leaf area and leaf growth rate as quantitative indicators for evaluating the growth vigor of okra seedlings.
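The timestamp handling and Equation 3 can be sketched as below. The regular expression follows the “YYYY-MM-DD-HH-MM-SS” format stated above; the file names, function names, and area values in the example are hypothetical.

```python
import re
from datetime import datetime

# Matches timestamps of the form YYYY-MM-DD-HH-MM-SS anywhere in a file name.
TIMESTAMP = re.compile(r"(\d{4})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})")


def parse_timestamp(name):
    """Extract the acquisition time from a file name containing a timestamp."""
    match = TIMESTAMP.search(name)
    if match is None:
        raise ValueError(f"no timestamp found in {name!r}")
    return datetime(*(int(part) for part in match.groups()))


def growth_rate(area_previous, area_current, name_previous, name_current):
    """Equation 3: change in leaf area (mm^2) per hour between two frames."""
    delta = parse_timestamp(name_current) - parse_timestamp(name_previous)
    hours = delta.total_seconds() / 3600
    if hours <= 0:
        raise ValueError("the current frame must be later than the previous frame")
    return (area_current - area_previous) / hours


# Example with hypothetical file names two hours apart: 13.0 mm^2/h.
rate = growth_rate(100.0, 126.0,
                   "box1_2024-05-01-08-00-00.jpg",
                   "box1_2024-05-01-10-00-00.jpg")
```

Because the two frames are passed explicitly, the same function covers both consecutive frames and any wider user-defined time window.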

2.4.4 Evaluation indicators of okra leaf segmentation model

Instance segmentation models are typically evaluated using six key metrics: precision (P), recall (R), and mean average precision (mAP50 and mAP50–95) to assess model accuracy, along with the number of parameters (Params) and floating-point operations (FLOPs) to evaluate model complexity (Khanam and Hussain, 2024).

Average precision is employed to evaluate the accuracy of the model in recognizing and segmenting okra leaves. A higher IoU threshold in mAP requires a greater overlap between the predicted elements (bounding box and mask) and the ground truth target, thereby providing a more stringent assessment of the model’s capability to precisely localize objects. Conversely, a lower threshold emphasizes the model’s ability to determine the presence of a target, regardless of localization precision. The metrics are defined in Equations 4–8.

$$ P = \frac{TP}{TP + FP} \tag{4} $$

$$ R = \frac{TP}{TP + FN} \tag{5} $$

$$ AP = \int_{0}^{1} P(R)\, dR \tag{6} $$

$$ \mathrm{mAP}_{50} = \frac{1}{n_{c}} \sum_{j=1}^{n_{c}} AP_{j} \tag{7} $$

$$ \mathrm{mAP}_{50-95} = \mathrm{avg}(\mathrm{mAP}_{i}), \quad i = 50, 55, \dots, 95 \tag{8} $$

where TP denotes true positives, i.e., cases in which the leaf segmentation model correctly detects and segments actual instances of okra seedling leaves. FP represents false positives, referring to instances where the model incorrectly detects or segments non-existent okra seedling leaves (e.g., mistaking soil texture, culture box edges, or light-induced noise as leaf regions) or produces segmentation results that do not meet the required criteria. FN indicates false negatives, where the model fails to detect or segment real okra seedling leaves. In Equation 6, AP denotes the average precision obtained by integrating the area under the Precision-Recall curve. Here, P(R) represents the precision at a given recall level R, and dR is the integration variable corresponding to an infinitesimal change in recall. In Equation 7, n_c is the number of object classes over which AP is averaged, and in Equation 8, mAP_i is the mAP evaluated at an IoU threshold of i%. A higher average precision reflects better overall performance of the model in accurately detecting and segmenting okra leaves.
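The definitions in Equations 4–7 can be illustrated numerically. The sketch below approximates the integral in Equation 6 with the trapezoidal rule over a few sampled points of the Precision-Recall curve; this is an assumption made for simplicity, since evaluation toolkits usually apply an interpolated precision envelope. All function names and sample values are hypothetical.

```python
def precision(tp, fp):
    """Equation 4: fraction of predicted leaves that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Equation 5: fraction of real leaves that are found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(recalls, precisions):
    """Equation 6, approximated by trapezoidal integration of P over R."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area


def mean_average_precision(ap_per_class):
    """Equation 7: mean of the per-class AP values at a fixed IoU threshold."""
    return sum(ap_per_class) / len(ap_per_class)


# Example: three sampled points of a Precision-Recall curve give AP = 0.75.
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])
```

With a single object class, as in a leaf-only dataset, the mean in Equation 7 reduces to the AP of that class.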

The computational complexity and lightweight nature of the model are assessed using floating-point operations (FLOPs) and the number of parameters, as defined for a convolutional layer by Equations 9 and 10, respectively. In these equations, K² denotes the area of the convolutional kernel, H × W represents the height and width of the input feature map, C_in is the number of input channels, and C_out is the number of output channels.

$$ \mathrm{FLOPs} = 2 \times H \times W \left(C_{in} K^{2} + 1\right) C_{out} \tag{9} $$

$$ \mathrm{Params} = C_{in} \times K^{2} \times C_{out} \tag{10} $$

Parameters focus on measuring the complexity and storage cost of the model, affecting the training difficulty and the risk of overfitting. FLOPs measure the computational cost and operational efficiency of the model, and determine the hardware suitability and real-time performance. Therefore, on the premise of ensuring accuracy, the smaller the two, the more cost-effective the model (Wan et al., 2023).
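As a worked example of Equations 9 and 10, the sketch below evaluates both quantities for a single convolutional layer. The layer dimensions are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def conv_flops(height, width, c_in, kernel, c_out):
    """Equation 9: FLOPs of one convolutional layer on an H x W feature map."""
    return 2 * height * width * (c_in * kernel ** 2 + 1) * c_out


def conv_params(c_in, kernel, c_out):
    """Equation 10: number of weights in one convolutional layer (no bias term)."""
    return c_in * kernel ** 2 * c_out


# Hypothetical layer: 3x3 kernel, 3 input channels, 16 output channels, 32x32 map.
flops = conv_flops(height=32, width=32, c_in=3, kernel=3, c_out=16)   # 917504
params = conv_params(c_in=3, kernel=3, c_out=16)                      # 432
```

The FLOPs scale with the spatial size of the feature map while the parameter count does not, which is why the two metrics can move independently when the architecture changes.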

3 Results

3.1 Analysis of training loss for YOLOv11-HSECal

In neural networks, the term “loss” refers to a measure of prediction inaccuracy. The primary metrics used to indicate training error in instance segmentation include box loss, segmentation loss, classification loss, and Distribution Focal Loss (DFL), as illustrated in Figure 5. Box loss quantifies the algorithm’s ability to accurately localize object centers and the precision of predicted bounding boxes containing the objects. Segmentation loss measures the discrepancy between predicted pixel-level labels and ground truth masks. Classification loss evaluates the accuracy of the predicted object classes. Distribution Focal Loss (DFL) is specifically designed to mitigate class imbalance during network training, which occurs when certain classes are overrepresented (Paul and Machavaram, 2025).


Eight line graphs showing training and validation loss over 100 epochs. Each graph represents different loss types: box, segmentation, classification, and DFL. The loss values decrease significantly, with the line marked as “results” and a dotted “smooth” line for comparison.

Figure 5. Training loss curves during custom training of YOLOv11-HSECal model.

The training box loss exhibited a sharp decline around the 2nd epoch, followed by a steady decrease until approximately the 55th epoch, and stabilized near the 85th epoch with a minimum value of 0.266. The other three losses followed similar trajectories, reaching minimum values of 0.328, 0.194, and 0.824, respectively. The initial rapid decline in all losses is attributed to hyperparameter tuning, while the stabilization around the 85th epoch validates the decision to terminate training at the 100th epoch.

The validation loss curves demonstrated trends similar to the training loss but with greater fluctuations, indicating the model’s adaptability to the validation dataset. Validation box loss showed a sharp decrease during early epochs, followed by irregular fluctuations and eventually stabilized around the 78th epoch with a minimum of approximately 0.533. Validation segmentation loss dropped rapidly within the first four epochs, fluctuated intensely between the 8th and 63rd epochs, and stabilized near the 92nd epoch with a minimum value of about 1.919. Validation classification loss sharply declined in the first 10 epochs, oscillated around the 20th epoch, then gradually decreased before stabilizing after a sharp drop at the 91st epoch, with a minimum near 0.496. Validation DFL loss exhibited a similar pattern, with overall stabilization accompanied by fluctuations within a range and a minimum value of approximately 1.042. The fluctuations in validation loss reflect the model’s sensitivity to unseen data, yet the overall downward trend indicates improved generalization performance, consistent with the stable trend observed in training loss.

3.2 Longitudinal comparison of YOLOv11-HSECal with iterative versions of YOLO series instance segmentation models

To comprehensively validate the performance advantages of the proposed model in okra leaf detection and segmentation tasks, we selected several other YOLO-series instance segmentation models as benchmarks, including YOLOv6-seg, YOLOv8-seg, YOLOv8-seg-p6, YOLOv8-segANDCal, YOLOv10-seg, and YOLOv11-seg. Systematic experiments were conducted under unified configurations and hyperparameter settings. Figure 6a presents the segmentation results of these seven models under two occlusion levels during the okra seedling stage: No Occlusion and Leaves Occlusion. “No Occlusion” indicates that okra leaves are fully visible without overlap, whereas “Leaves Occlusion” refers to scenarios where leaves overlap and cause mutual occlusion. In the figure, yellow indicates duplicate segmentation, green represents erroneous segmentation, and red denotes incomplete segmentation. Although YOLOv8-seg and YOLOv8-seg-p6 both belong to the YOLOv8 family, they differ significantly in network architecture. YOLOv8-seg is constructed based on three backbone feature outputs corresponding to 8×, 16×, and 32× downsampled feature maps (denoted as P3/8, P4/16, and P5/32), offering a favorable balance of detection efficiency and model compactness. Conversely, YOLOv8-seg-p6 further incorporates a P6 layer (P6/64), a 64× downsampled deep feature map that enhances semantic representation and improves recognition of small targets, albeit with increased computational complexity and parameter count. Under the No Occlusion condition, models rarely produced erroneous segmentations, with YOLOv11-HSECal exhibiting only one instance of incomplete segmentation. Under Leaves Occlusion, segmentation accuracy declined markedly across models; however, YOLOv11-HSECal demonstrated superior robustness, with only one duplicate and one erroneous segmentation instance, and relatively fewer incomplete segmentations compared to other models. 
These results indicate that YOLOv11-HSECal achieves outstanding segmentation performance.


(a) Comparative analysis showing leaf segmentation results under “No occlusion” and “Leaves occlusion” conditions across various YOLO models: YOLOv6-seg, YOLOv8-seg, YOLOv8-seg-p6, YOLOv8-segANDCal, YOLOv10-seg, YOLOv11-seg, and YOLOv11-HSECal. Differences in detection and segmentation are highlighted with colored outlines. (b) Bar and line chart displays model complexity using FLOPs (G) and Params (M) compared with segmentation accuracy (mAPmask50% and mAPmask50-95%) for the same models.

Figure 6. (a) Comparative analysis of segmentation performance across models (b) Longitudinal comparative analysis of the performance indicators of the YOLO model.

Wu et al. (2024) utilized an improved YOLOv8-segANDCal model to estimate soybean root length, enhancing local feature extraction of soybean radicles via the SegNext_Attention mechanism. Nonetheless, its architecture primarily targets linear structures (e.g., roots), limiting adaptability to irregular leaf morphologies. We thus conducted a comparative analysis between YOLOv11-HSECal and YOLOv8-segANDCal. Performance evaluations on datasets featuring No Occlusion and Leaves Occlusion conditions (see Figure 6a) revealed that the custom-trained YOLOv11-HSECal outperforms YOLOv8-segANDCal in precision, recall, and F1 score (Table 4). Overall, YOLOv11-HSECal demonstrates superior segmentation capability compared to YOLOv8-segANDCal.


Table 4. Comparison of the performance of the two models at different occlusion of okra.

According to the experimental results shown in Figure 6b, the proposed model achieves superior lightweight performance while maintaining high segmentation accuracy. Specifically, the segmentation mask mAP50 and mAP50-95 reached 86.9% and 76.5%, respectively, while the FLOPs and parameter count were reduced to 9.3G and 2.4M. Here, G denotes the number of giga floating-point operations (GFLOPs), and M represents the number of million trainable parameters contained in the model. Compared to YOLOv11-seg, our model improved mAP50 by 1.1%, reduced FLOPs by 0.6%, and decreased the number of parameters by 14.1%. Although YOLOv6-seg and YOLOv8-seg-p6 exhibited slightly higher segmentation accuracy, their computational complexity increased significantly, with FLOPs and parameter counts nearly double those of our model.

These comparative experiments demonstrate that our model achieves the optimal balance between segmentation accuracy and model lightweight design. The average precision of the segmentation mask directly influences the accuracy of leaf area and growth rate calculations. Meanwhile, a high number of model parameters and FLOPs imposes substantial demands on hardware resources. In contrast, our model significantly reduces hardware requirements while maintaining a high segmentation accuracy after lightweight optimization. This makes it particularly well-suited for practical agricultural applications, where computational efficiency and accuracy are both critical at lab or field level.

3.3 Horizontal comparison between YOLOv11-HSECal and state-of-the-art instance segmentation models such as grounded SAM

Paul et al. (2024a) previously conducted comparative experiments between YOLOv9c-seg and the Grounded SAM model for instance segmentation of pepper pedicels. Building upon this, we aim to further evaluate the strengths and weaknesses of YOLO-based models in comparison to Grounded SAM for instance segmentation tasks. To validate the horizontal effectiveness of our proposed YOLOv11-HSECal model, we also conducted comparative analyses with other state-of-the-art but computationally intensive segmentation frameworks, including Mask2Former (R50-FPN) (Cheng et al., 2022), SOLOV2 (R50-FPN) (Wang et al., 2020), and Mask R-CNN (R50-FPN) (He et al., 2017). All models were trained and tested under identical experimental settings to ensure a fair comparison, as shown in Figure 7.


A bar chart compares model complexity and segment accuracy for five models: Mask R-CNN, SOLOV2, Mask2Former, Grounded SAM, and YOLOv11-HSECal. Complexity is shown in gray and blue bars representing FLOPs and parameters, respectively. A blue line and squares indicate mAP mask accuracy percentages for 50% and 50-95% intervals. Grounded SAM has the highest complexity and lowest accuracy, while YOLOv11-HSECal has the lowest complexity.

Figure 7. Side-by-side comparative analysis of other advanced model performance indicators.

Our results indicate that, although models such as Grounded SAM, Mask R-CNN (R50-FPN), SOLOV2 (R50-FPN), and Mask2Former (R50-FPN) have demonstrated solid performance in many previous segmentation tasks, they offer no significant accuracy advantage in our case. Moreover, they exhibit substantially higher model sizes and computational costs. These findings highlight the superiority of YOLOv11-HSECal in achieving competitive segmentation performance with greater computational efficiency.

3.4 Ablation experiment of YOLOv11-HSECal, a seedling leaf segmentation model of okra

To evaluate the effectiveness of the aforementioned improvements, a series of ablation experiments were conducted. Four model variants were assessed on the same validation dataset. As illustrated in Figure 8, replacing the backbone of YOLOv11-seg with HGNetv2 resulted in the YOLOv11-H model, which maintained comparable accuracy while achieving a 17% reduction in model parameters and a 9.6% decrease in FLOPs, thereby demonstrating the effectiveness of the lightweight design. Subsequent replacement of the Neck component with the Slim-Neck module in YOLOv11-H led to the YOLOv11-HS model. Although this increased the number of parameters by 6.03%, it yielded a 0.7% improvement in mAP50, with no additional increase in FLOPs. Building upon YOLOv11-HS, the introduction of the EMAttention mechanism further enhanced detection performance, increasing mAP50 by an additional 0.3%, while reducing parameters and FLOPs by 2.25% and 1.06%, respectively. These findings indicate that, compared to YOLOv11-seg, the proposed YOLOv11-HSECal model significantly reduces model complexity while achieving a 1.1% gain in mAP50. As shown in Figure 8, the YOLOv11-HSECal model demonstrates enhanced accuracy in okra leaf segmentation while maintaining computational efficiency, making it more suitable for deployment on resource-constrained hardware platforms.
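The stage-wise parameter changes reported above can be cross-checked by compounding them. The sketch below assumes that each percentage is expressed relative to the immediately preceding variant (YOLOv11-seg, then YOLOv11-H, then YOLOv11-HS); the function name `compound_change` is hypothetical.

```python
def compound_change(percent_changes):
    """Combine successive relative changes (in percent) into one overall change."""
    factor = 1.0
    for percent in percent_changes:
        factor *= 1 + percent / 100
    return (factor - 1) * 100


# Parameter changes per ablation stage: HGNetv2, Slim-Neck, EMAttention.
params_total = compound_change([-17.0, 6.03, -2.25])
```

Under this assumption the overall parameter change is roughly −14.0%, in line with the reported 14.1% reduction once rounding of the individual stage values is taken into account.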


Bar chart showing model complexity and segment accuracy for four models: YOLOv11-seg, YOLOv11-H, YOLOv11-HS, and YOLOv11-HSECal. Complexity is measured in FLOPs and Params, with a line graph for mAPmask50 and mAPmask50-95 percentages. Model accuracy increases slightly from YOLOv11-seg to YOLOv11-HSECal.

Figure 8. Comparison of YOLOv11-HSECal model ablation experiment performance indicators on the validation sets.

3.5 Accuracy evaluation of the model algorithm

In this study, the leaf area of okra at the seedling stage was used as an accuracy benchmark. To validate the reliability of this metric, a total of 380 images were randomly selected from those captured in the culture box, and a plant exhibiting relatively complete and consistent growth morphology was tracked. The leaf area of this target plant was calculated using both manual and algorithm-based methods. The algorithmic procedure involved three steps: first, instance segmentation was performed using the YOLOv11-HSECal model to generate the mask image; second, the Merge module was employed to refine and optimize the mask, yielding accurate pixel-level data; and finally, the Cal module was used to track the target plant across images by matching its unique ID number, thereby computing the algorithm-derived leaf area. Corresponding manual measurements were conducted based on the same ID. To assess the correlation, agreement, and error characteristics between the manual and algorithmic measurements, regression fitting plots, residual plots, and normal distribution plots were generated, as illustrated in Figure 9.

$$ r = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \cdot \mathrm{Var}(Y)}} \tag{11} $$

$$ R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}} \tag{12} $$

$$ \mathrm{Adjusted}\ R^{2} = 1 - \left(\frac{1 - R^{2}}{n - k - 1}\right)(n - 1) \tag{13} $$

$$ y = ax + b \tag{14} $$

$$ a = \frac{\sum_{i=1}^{n} (x_{i} - \bar{x})(y_{i} - \bar{y})}{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}} \tag{15} $$

$$ y = 1.0078x - 6.6564 \tag{16} $$


Scatter plot comparing predicted and actual leaf area calculations in square millimeters with a strong positive correlation. Insets show: (a) 3D plot of predicted vs. actual values, (c) residual plot, (d) histogram of residuals. Statistics include 380 data points, degrees of freedom 377, Pearson's r 0.9922, R² 0.9845, adjusted R² 0.9844.

Figure 9. Correlation analysis between algorithmic detection values and manual measurements: (a) fitted straight line; (b) manual and algorithmic count statistics; (c) scatter plot of residuals; and (d) normal distribution plot.

Here, Cov(X, Y) denotes the covariance between variables X and Y, while Var(X) and Var(Y) represent the variances of X and Y, respectively. In Equation 13, n is the number of samples and k is the number of explanatory variables. The parameter a corresponds to the slope of the fitted regression line, and b represents the intercept on the Y-axis.
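The statistics in Equations 11–15 can be computed with a few lines of standard-library code. The sketch below is illustrative only: the function names are hypothetical, the intercept is obtained from the usual least-squares relation b = ȳ − a·x̄ (not written out in the equations above), and the sample data are invented for the example rather than taken from the 380 measurements.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Equation 11; the 1/n factors of covariance and variance cancel out."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    return cov / sqrt(var_x * var_y)


def fit_line(x, y):
    """Equations 14 and 15: least-squares slope a and intercept b."""
    mx, my = mean(x), mean(y)
    a = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return a, my - a * mx


def r_squared(y, y_hat):
    """Equation 12: one minus residual sum of squares over total sum of squares."""
    my = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k=1):
    """Equation 13 for n samples and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Invented example: manual (x) versus algorithmic (y) leaf areas in mm^2.
manual = [50.0, 100.0, 150.0, 200.0]
predicted = [48.0, 101.0, 149.0, 203.0]
slope, intercept = fit_line(manual, predicted)
fitted = [slope * xi + intercept for xi in manual]
r2 = r_squared(predicted, fitted)
```

For a simple linear fit such as this one, the coefficient of determination equals the square of Pearson’s r, which provides a quick consistency check between Equations 11 and 12.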

Figure 9a presents the distribution of leaf area measurements obtained through manual annotation and algorithmic calculation. To further validate the consistency between model predictions and manual measurements, a regression fitting plot was generated with the manually calculated leaf area on the horizontal axis and the model-predicted leaf area on the vertical axis. The correlation analysis yielded a Pearson correlation coefficient of r = 0.9922 (as defined in Equation 11), indicating a very strong linear relationship between the two variables. The corresponding coefficient of determination was calculated as R² = 0.9845 (Equation 12), suggesting that approximately 98.45% of the variance in leaf area can be explained by the model’s predictions, demonstrating excellent fitting performance. Considering the effects of sample size and the number of independent variables, the adjusted coefficient of determination was further calculated as adjusted R² = 0.9844 (Equation 13), which remains at a high level. This adjusted metric accounts for the influence of the number of explanatory variables in the model and provides a more realistic reflection of its generalization ability. Collectively, these metrics confirm that the proposed algorithm achieves high accuracy and robustness in leaf area prediction tasks, supporting its practical value in plant phenotyping and quantitative analysis. Furthermore, a linear regression analysis was conducted using the least squares method to obtain the optimal fitting line by minimizing the sum of squared differences between the predicted and actual values. The resulting regression line has the form of Equation 14, with slope a = 1.0078 (Equation 15) and intercept b = −6.6564. The final fitted equation, presented in Equation 16, quantitatively describes the relationship between manual and algorithmic measurements. As observed in Figure 9a, the data points are symmetrically distributed around the fitted line, suggesting a good match. Figure 9b illustrates a three-dimensional spatial distribution of the manual and algorithmic results, providing a visual representation of the consistency between the two measurement approaches. Figure 9c displays the residual plot, showing the deviations of the predicted values from the observed ones. The residuals are distributed evenly on both sides of the zero line without evident patterns, indicating the absence of systematic error and further confirming the goodness of fit. Lastly, Figure 9d shows that the residuals follow a normal distribution, indicating that the prediction errors are both random and unbiased, thereby affirming the robustness and reliability of the model’s predictions.

In conclusion, the YOLOv11-HSECal model demonstrates high accuracy in estimating okra leaf area, effectively supporting the practical application demands of okra seedling leaf monitoring tasks.
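
The agreement statistics reported above (Pearson r, R², adjusted R², the least-squares line, and the residuals) can be recomputed from paired measurements. The following is a minimal Python sketch, assuming the manual and model-predicted leaf areas are available as two equal-length lists. The function name `fit_metrics`, the single-predictor default, and the sample values are illustrative assumptions, not the authors' code or data.

```python
import math


def fit_metrics(manual, predicted, n_predictors=1):
    """Return r, R^2, adjusted R^2, slope, intercept and residuals.

    manual    : manually measured leaf areas (x axis), in mm^2
    predicted : model-predicted leaf areas (y axis), in mm^2
    """
    n = len(manual)
    if n != len(predicted) or n <= n_predictors + 1:
        raise ValueError("need paired samples and n > n_predictors + 1")

    mean_x = sum(manual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in manual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(manual, predicted))

    # Pearson correlation coefficient.
    r = sxy / math.sqrt(sxx * syy)

    # Ordinary least-squares line y = slope * x + intercept.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals of the fitted line and the resulting R^2.
    residuals = [y - (slope * x + intercept) for x, y in zip(manual, predicted)]
    ss_res = sum(e ** 2 for e in residuals)
    r2 = 1.0 - ss_res / syy

    # Adjusted R^2 penalizes the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

    return {
        "r": r,
        "r2": r2,
        "adj_r2": adj_r2,
        "slope": slope,
        "intercept": intercept,
        "residuals": residuals,
    }


# Illustrative values only (not data from the study).
manual_mm2 = [100.0, 200.0, 300.0, 400.0, 500.0]
predicted_mm2 = [96.0, 193.0, 297.0, 399.0, 495.0]
metrics = fit_metrics(manual_mm2, predicted_mm2)
print({k: round(v, 4) for k, v in metrics.items() if k != "residuals"})
```

For a single-predictor least-squares fit, R² equals r², which is consistent with the reported values (0.9922² ≈ 0.9845). The residual list returned here is the quantity that a residual plot and a residual histogram would display.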

3.6 Analysis of the vigor of okra seeds and the growth status of okra seedlings under salt stress

Salinity is one of the major environmental factors adversely affecting plant growth and is known to significantly reduce crop yields. In this study, we conducted experiments at the seedling stage of okra under varying concentrations of NaCl solutions. A control group was established using deionized water, while treatment groups were irrigated with NaCl solutions at concentrations of 10 mmol/L, 20 mmol/L, 30 mmol/L, 40 mmol/L, 50 mmol/L, and 60 mmol/L. Growth images were collected at ten time points, and leaf area as well as leaf growth rate were calculated. The results are illustrated in Figures 10, 11. Due to inherent limitations in model precision, a very small number of negative values appeared in the calculated growth rates, which can be considered negligible in the overall analysis. Figures 12, 13 further illustrate the leaf growth patterns of okra seedlings under varying NaCl concentrations. As the concentration of NaCl increased, a clear downward trend was observed in both leaf area and leaf growth rate at the same developmental stage, indicating that salt stress progressively inhibited leaf expansion and growth vigor. To highlight the differences more clearly, image data from the third day of the seedling stage were analyzed. The average leaf area and real-time growth rate in the CK group were 325.175 mm² and 13.32 mm²/h, respectively. Under increasing NaCl concentrations, the average leaf area and average real-time growth rate were reduced to 241.79 mm², 138.75 mm², 87.66 mm², 75.28 mm², 66.37 mm², and 64.54 mm², and to 6.78 mm²/h, 5.63 mm²/h, 3.13 mm²/h, 2.73 mm²/h, 1.69 mm²/h, and 2.06 mm²/h, respectively.
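
To make the day-three comparison easier to read, the reported averages can be expressed as percent reductions relative to the CK group. The short Python sketch below uses only the values quoted in the paragraph above; the helper name `percent_reduction` and the rounding to one decimal place are illustrative choices.

```python
# Day-3 averages as reported in the text, ordered CK, 10, 20, ..., 60 mmol/L NaCl.
treatments = ["CK", "10", "20", "30", "40", "50", "60"]
area_mm2 = [325.175, 241.79, 138.75, 87.66, 75.28, 66.37, 64.54]
rate_mm2_per_h = [13.32, 6.78, 5.63, 3.13, 2.73, 1.69, 2.06]


def percent_reduction(values):
    """Percent reduction of each value relative to the first (control) entry."""
    control = values[0]
    return [round(100.0 * (control - v) / control, 1) for v in values]


area_reduction = percent_reduction(area_mm2)
rate_reduction = percent_reduction(rate_mm2_per_h)

for name, a, r in zip(treatments, area_reduction, rate_reduction):
    print(f"{name:>3}: leaf area -{a:.1f}%, growth rate -{r:.1f}%")
```

On these figures, the average leaf area is roughly 26% below CK at 10 mmol/L and roughly 80% below CK at 60 mmol/L, while the growth rate at 60 mmol/L (2.06 mm²/h) is slightly higher than at 50 mmol/L (1.69 mm²/h).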

Figure 10

A series of images showing plant seedlings grown under different concentrations of sodium chloride (NaCl) over various time intervals. Columns represent increasing concentrations from CK to 60 mmol/L NaCl, and rows show growth progression from 36 to 144 hours. Seedling growth appears to diminish with higher salt concentrations, especially from the 40 mmol/L NaCl column onward.

Figure 10. Okra seedling leaf area under different concentrations of NaCl solution.

Figure 11

Plant growth under different salt concentrations over time. Rows represent time intervals from 36 to 144 hours. Columns show NaCl concentrations from 10 to 60 millimoles per liter and a control. Seedling density and health decline with increasing salt levels.

Figure 11. Okra seedling leaf growth rate under different concentrations of NaCl solution.

Figure 12

3D plots show leaf area over time for each treatment. Each plot shows data on a grid with axes labeled “Time,” “ID,” and “Area,” with color gradients from blue to red indicating increasing area values. Labels indicate the treatments: CK and 10 mmol/L, 20 mmol/L, 30 mmol/L, 40 mmol/L, 50 mmol/L, and 60 mmol/L NaCl. A final plot shows an average area comparison across these treatments.

Figure 12. Changes over time in the leaf area of each leaf and in the average leaf area of 16 leaves in the culture box at the seedling stage under different concentrations of NaCl solution.

Figure 13

Multiple 3D bar charts display data on speed over time for different concentrations of NaCl, from CK to 60 mmol/L. Each chart shows variations in speed, using color gradients from blue (lower speeds) to red (higher speeds). The bottom-right chart summarizes the average speed across all concentrations.

Figure 13. Changes over time in the leaf growth rate of each leaf and in the average leaf growth rate of 16 leaves in the culture box at the seedling stage under different concentrations of NaCl solution.

Figures 12, 13 illustrate the leaf area and growth rate of okra seedlings under control conditions (CK) and at various concentrations of NaCl solution. Figure 12 presents the dynamic changes in the area of each individual leaf and the average area of 16 leaves during the seedling stage, across different treatment groups. The monitoring period spanned from 1 day and 12 hours to 8 days (a total of 156 hours), with images captured at 15-minute intervals, resulting in 624 images per group and thereby enabling continuous full-time monitoring of okra leaf area. As shown in Figure 12, although a few individual leaves at each concentration deviated from the overall trend, the growth trajectories of the majority of the 16 leaves at each NaCl concentration remained consistent. With increasing NaCl concentration, the slopes of the growth curves (represented by wall plots) gradually decreased, indicating a reduction in growth vigor. Based on the data from Figure 12, the average leaf areas of the 16 leaves at each concentration on the second day were 268.16 mm² (CK), 231.54 mm² (10 mmol/L), 132.47 mm² (20 mmol/L), 77.1 mm² (30 mmol/L), 69.34 mm² (40 mmol/L), 64.24 mm² (50 mmol/L), and 57.28 mm² (60 mmol/L), respectively. On the fourth day, the average areas increased to 574.8 mm², 553.1 mm², 530.57 mm², 384.79 mm², 373.38 mm², 337.03 mm², and 250.11 mm², respectively. By the seventh day, the values reached 1064.57 mm², 897.75 mm², 908.53 mm², 741.17 mm², 703 mm², 657.7 mm², and 626.62 mm², respectively. These results demonstrate that, across all treatments, the average leaf area of okra seedlings increased over time. However, higher NaCl concentrations generally resulted in smaller leaf areas than lower concentrations or the CK group at the same time points; the only reversal among the reported values occurred on the seventh day, when the 20 mmol/L group (908.53 mm²) slightly exceeded the 10 mmol/L group (897.75 mm²). Overall, increased salinity negatively affected the growth vigor of okra seedlings.
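
The acquisition arithmetic and the ordering of the reported averages can be checked with a few lines of Python. The sketch below uses only the numbers quoted in the paragraph above; the helper name `rank_reversals` is a hypothetical label introduced for illustration.

```python
# Imaging schedule stated in the text: from 1 day 12 h (36 h) to 8 days (192 h),
# one image every 15 minutes.
START_H, END_H, INTERVAL_MIN = 36, 192, 15
frames_per_group = (END_H - START_H) * 60 // INTERVAL_MIN

# Average leaf area (mm^2) of 16 leaves, ordered CK, 10, 20, ..., 60 mmol/L NaCl.
labels = ["CK", "10", "20", "30", "40", "50", "60"]
mean_area = {
    "day 2": [268.16, 231.54, 132.47, 77.1, 69.34, 64.24, 57.28],
    "day 4": [574.8, 553.1, 530.57, 384.79, 373.38, 337.03, 250.11],
    "day 7": [1064.57, 897.75, 908.53, 741.17, 703.0, 657.7, 626.62],
}


def rank_reversals(values, names):
    """Return (lower, higher) treatment pairs where the higher
    concentration shows the larger value, i.e. breaks the downward trend."""
    return [
        (names[i], names[i + 1])
        for i in range(len(values) - 1)
        if values[i + 1] > values[i]
    ]


reversals = {day: rank_reversals(vals, labels) for day, vals in mean_area.items()}
print(frames_per_group)
print(reversals)
```

The frame count reproduces the 624 images per group stated in the text, and the check flags a single reversal in the reported averages, between the 10 and 20 mmol/L groups on day seven.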

Figure 13 illustrates the variation in growth rate for each individual leaf and the average growth rate of 16 leaves at the okra seedling stage under CK and various NaCl solution concentrations. The observation period was consistent with that of Figure 12, spanning from 1 day and 12 hours to 8 days (totaling 156 hours), with images captured at 15-minute intervals, resulting in 624 images per treatment group. From the temporal trends in leaf growth rate under the seven treatments (CK and six NaCl concentrations) shown in Figure 13, it is evident that although certain leaves exhibited phases of rapid growth within a specific period, the overall growth rates tended to stabilize over time. The early-stage growth rates were slightly lower than those in the mid-to-late stages. Leaves within the same concentration group exhibited generally consistent growth patterns. However, with increasing NaCl concentration, the height of the wall plots progressively decreased, indicating a gradual decline in both growth rate and physiological vigor. Furthermore, Figure 13 shows that on the second day, the average growth rates of the 16 leaves under CK and NaCl treatments at 10, 20, 30, 40, 50, and 60 mmol/L were 14.22 mm²/h, 9.86 mm²/h, 9.07 mm²/h, 8.83 mm²/h, 7.05 mm²/h, 4.99 mm²/h, and 6.73 mm²/h, respectively. On the fourth day, the rates increased to 17.13 mm²/h, 18.17 mm²/h, 15.44 mm²/h, 10.3 mm²/h, 9.49 mm²/h, 7.79 mm²/h, and 9.04 mm²/h. By the seventh day, the corresponding growth rates were 25.18 mm²/h, 9.07 mm²/h, 13.94 mm²/h, 13.35 mm²/h, 1.17 mm²/h, 5.63 mm²/h, and 1.09 mm²/h. These findings suggest that okra seedlings exhibit relatively slow leaf growth during the early stages, with a noticeable increase in growth rate during the middle and late stages. The overall trend of leaf growth rate mirrors that of leaf area: at a given concentration, the growth rate is typically higher in the later stages than in the early phase. 
Although occasional anomalies were observed where high-concentration treatments yielded slightly higher rates than lower concentrations, the overall average growth rate consistently declined with increasing NaCl concentration. This confirms that the salt stress environment simulated by NaCl significantly inhibits both the growth capacity and developmental potential of okra seedlings.
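
The way small segmentation errors can produce occasional negative growth rates is easy to see in a finite-difference calculation. The Python sketch below assumes the growth rate is the change in leaf area divided by the elapsed time between frames; this is an assumption made for illustration, and the exact definition used by the Cal module may differ. The function name `growth_rates`, the `window` parameter, and the sample series are hypothetical.

```python
def growth_rates(areas_mm2, interval_h=0.25, window=1):
    """Finite-difference growth rate in mm^2/h.

    areas_mm2  : leaf area per frame, in acquisition order
    interval_h : time between consecutive frames (0.25 h = 15 min)
    window     : number of frame intervals spanned by each difference
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    span_h = window * interval_h
    return [
        (areas_mm2[i + window] - areas_mm2[i]) / span_h
        for i in range(len(areas_mm2) - window)
    ]


# Hypothetical area series with small segmentation noise (not study data).
areas = [100.0, 103.0, 102.5, 106.0, 109.5, 109.0, 113.0, 116.0, 119.5]

frame_rates = growth_rates(areas)             # 15-minute differences
hourly_rates = growth_rates(areas, window=4)  # 1-hour differences
negative_count = sum(1 for r in frame_rates if r < 0)

print(frame_rates)
print(hourly_rates)
print(negative_count)
```

In this made-up series, two of the frame-to-frame rates are negative even though the leaf is growing throughout, whereas differences taken over a one-hour window are all positive. Averaging over a longer window is one possible way to suppress such artifacts, at the cost of temporal resolution.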

4 Discussion

In this study, a full-time sequence evaluation method for assessing okra seedling vigor was developed, offering a valuable tool and reference for understanding leaf development during the seedling stage, optimizing seed treatment strategies, and supporting rapid breeding as well as precision growth management. Despite its effectiveness, the current system has several limitations. First, the light intensity within our full-time monitoring system for crop germination vigor is not yet adjustable, which restricts experimental flexibility under varying illumination conditions. To address this issue, we plan to incorporate an adjustable lighting module to facilitate data collection under different light intensities. Second, the system lacks an automatic irrigation function and currently relies on daily manual watering during the experiment. This increases the labor burden and poses challenges for maintaining consistent environmental conditions. To enhance system automation and environmental control, we intend to integrate an automatic irrigation module in future iterations. Finally, the current assessment primarily relies on leaf area and leaf growth rate as indicators of okra seedling vigor. While these metrics are valid, future work could explore the integration of additional morphological indicators, such as stem length and stem growth rate, derived from 3D stereoscopic imaging. This would enable a more comprehensive and multidimensional evaluation of seedling vigor.

5 Conclusion

To address the issues of large errors and low efficiency associated with traditional manual leaf area measurements, as well as the limitations of existing instance segmentation models—such as complex architecture, high parameter count, and poor robustness—this study aimed to achieve high-throughput, lightweight, and full-time monitoring of okra seedling vigor. To this end, the following research was conducted to explore a full-time seedling vigor evaluation approach for okra based on the YOLOv11-HSECal model:

1. We developed a full-time sequence crop germination vigor monitoring system capable of supporting automated and continuous monitoring of okra seedlings, encompassing dynamic data acquisition from seed germination through to seedling development. The system not only provides a stable environment with controlled light and temperature but also ensures data reliability and validity through high-throughput image acquisition and precise growth tracking. This establishes a robust foundation for assessing plant growth status. Utilizing this system, we successfully conducted a 9-day okra seedling experiment under salt stress conditions, comprising a control group (CK) and six different concentrations of NaCl solution. A total of 3,456 seedling images were collected. Following data annotation and augmentation, we constructed an image dataset capturing the growth dynamics of okra seedlings.

2. To address the task of leaf segmentation and growth evaluation, this study optimized the YOLOv11-seg model and proposed the YOLOv11-HSECal model. By integrating the HGNetv2 backbone network, the Slim-Neck feature fusion module, the EMAttention attention mechanism, and a combination of the Merge and Cal modules, the model significantly enhances segmentation accuracy, particularly for small targets and complex leaf edges, making it directly applicable for okra seedling vigor monitoring. The optimized YOLOv11-HSECal model achieves a mAP50 of 86.9%, with the number of parameters and FLOPs reduced to 2.4M and 9.3G, respectively. This not only ensures high segmentation accuracy but also substantially improves computational efficiency, thereby meeting the requirements for lightweight, high-throughput, and high-precision monitoring in agricultural applications.

3. This study innovatively introduced the leaf growth rate as a key indicator for evaluating the growth vitality of okra seedlings. By integrating both leaf area and leaf growth rate, we assessed the effects of CK and NaCl solutions at concentrations of 10, 20, 30, 40, 50, and 60 mmol/L on seedling viability. The results demonstrated that, at each concentration, the growth rate of okra leaves in the middle and late stages was consistently higher than in the early stage. Furthermore, with increasing NaCl concentration, both leaf area and growth rate significantly declined during the same growth period, confirming the inhibitory effect of salt stress on okra seedling development. By applying the YOLOv11-HSECal model, we effectively analyzed the temporal dynamics of okra leaf growth under different levels of salt stress, providing a novel approach for plant growth assessment in adverse environments.

Although the proposed full-time-series evaluation method for okra seedling vigor demonstrates promising applications in phenotypic monitoring, several limitations remain. For instance, the current platform lacks adjustable light intensity and requires manual irrigation, which compromises the consistency of environmental control and the degree of automation. Additionally, the evaluation indices are primarily based on leaf area and growth rate, without incorporating 3D structural characteristics. Future research will aim to integrate adjustable lighting and automated irrigation modules, as well as incorporate 3D phenotypic features, to enable more precise and comprehensive assessments of seedling vigor in crops.

Conclusion: This study presents a high-throughput, non-destructive, full-time, accurate, and efficient method for assessing the vigor of okra seedlings, offering a novel approach for dynamic plant growth evaluation. Additionally, it provides a practical and effective tool for monitoring plant development under salt stress conditions, thereby advancing the application and development of intelligent agricultural technologies.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

XC: Conceptualization, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing, Data curation, Project administration, Resources, Supervision, Validation. YL: Conceptualization, Formal Analysis, Funding acquisition, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. YZ: Formal Analysis, Methodology, Resources, Validation, Writing – original draft. ZZ: Writing – review & editing. RB: Writing – review & editing. PY: Writing – review & editing. FP: Writing – review & editing. XF: Formal Analysis, Project administration, Supervision, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by the Guiding Science and Technology Plan of the Xinjiang Production and Construction Corps (Grant number 2024ZD001), Alar Financial Science and Technology Plan Project of the First Division (Grant number 2024 NY02), Jiangsu Agriculture Science and Technology Innovation Fund (JASTIF) (Grant number CX(23)3619), Yazhou Bay Seed Lab in Hainan Province (Grant number B21HJ1005) and Jiangsu Province Seed Industry Revitalization Unveiled Project (Grant number JBGS(2021)007).

Acknowledgments

We are very grateful to XF for his guidance and every student involved in this study for their help and advice. Thanks again to Nanjing Agricultural University for building the experimental platform.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Abbreviations

P, Precision; R, Recall; AP, Average Precision; mAP, mean Average Precision; Params, parameters; FLOPs, floating point operations; IoU, Intersection over Union; YOLO, You Only Look Once; HGNetv2, Hierarchical Graph Neural Network v2; Slim-Neck, Slim Feature Aggregation Neck; B-Spline, Basis Spline; EMAttention, Expectation-Maximization Attention; VoV-GSCSP, Variety of View Grouped Spatial-Channel Split Pyramid; GSConv, Ghost-Shuffle Convolution; GNN, Graph Neural Network.

References

Abbas, T., Sattar, A., Ijaz, M., Aatif, M., Khalid, S., and Sher, A. (2017). Exogenous silicon application alleviates salt stress in okra. Horticult. Environ. Biotechnol. 58, 342–349. doi: 10.1007/s13580-017-0247-5

Al-Musawi, M. A. H. M. and Al-Moussawi, S. M. A. (2020). GA3 and zn impact on germination and seedling growth of acid lime. Ann. Biol. (Hissar). 36, 406–411.

An, N., Palmer, C. M., Baker, R. L., Markelz, R. C., Ta, J., Covington, M. F., et al. (2016). Plant high-throughput phenotyping using photogrammetry and imaging techniques to measure leaf length and rosette area. Comput. Electron. Agric. 127, 376–394. doi: 10.1016/j.compag.2016.04.002

Benhacine, F. Z., Atmani, B., and Abdelouhab, F. Z. (2019). Contribution to the association rules visualization for decision support: A combined use between boolean modeling and the colored 2D matrix. Int. J. Interact. Multimed. Artif. Intell. 5, 38–47. doi: 10.9781/ijimai.2018.09.002

Berk, P., Stajnko, D., Belsak, A., and Hocevar, M. (2020). Digital evaluation of leaf area of an individual tree canopy in the apple orchard using the LIDAR measurement system. Comput. Electron. Agric. 169. doi: 10.1016/j.compag.2019.105158

Castillo, O. S., Zaragoza, E. M., Alvarado, C. J., Barrera, M. G., and Dasgupta-Schubert, N. (2014). Using the conservative nature of fresh leaf surface density to measure foliar area. Int. Agrophys. 28, 413–421. doi: 10.2478/intag-2014-0032

Chen, Z. M., Chen, B. F., Huang, Y., and Zhou, Z. S. (2025). GE-YOLO for weed detection in rice paddy fields. Appl. Sciences-Basel 15, 2823. doi: 10.3390/app15052823

Cheng, B., Girshick, R., Dollar, P., Berg, A. C., and Kirillov, A. (2021). Boundary IoU: improving object-centric image segmentation evaluation. Arxiv. doi: 10.1109/CVPR46437.2021.01508

Cheng, B., Misra, I., Schwing, A. G., Kirillov, A., and Girdhar, R. (2022). “Masked-attention mask transformer for universal image segmentation,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 1290–1299.

Cui, M. D., Lou, Y. Y., Ge, Y. L., and Wang, K. Q. (2023). LES-YOLO: A lightweight pinecone detection algorithm based on improved YOLOv4-Tiny network. Comput. Electron. Agric. 205. doi: 10.1016/j.compag.2023.107613

Ding, L. J. and Goshtasby, A. (2001). On the Canny edge detector. Pattern Recognit. 34, 721–725. doi: 10.1016/S0031-3203(00)00023-6

Elkhalifa, A. E. O., Alshammari, E., Adnan, M., Alcantara, J. C., Awadelkareem, A. M., Eltoum, N. E., et al. (2021). Okra (Abelmoschus esculentus) as a potential dietary medicine with nutraceutical importance for sustainable health applications. Molecules 26, 696. doi: 10.3390/molecules26030696

Fu, X. Q., Han, B., Liu, S. Y., Zhou, J. Y., Zhang, H. W., Wang, H. B., et al. (2022). WSVAS: A YOLOv4-based phenotyping platform for automatically detecting the salt tolerance of wheat based on seed germination vigour. Front. Plant Sci. 13. doi: 10.3389/fpls.2022.1074360

Gong, L., Chen, R., Zhao, Y. S., and Liu, C. L. (2015). Model-based in-situ measurement of pakchoi leaf area. Int. J. Agric. Biol. Engineer. 8, 35–42. doi: 10.3965/j.ijabe.20150804.1442

He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017). “Mask r-cnn,” in Proceedings of the IEEE international conference on computer vision, 2961–2969.

Hong, L., Wang, X., Li, Y., and Wang, X. (2025). USIS16K: high-quality dataset for underwater salient instance segmentation. arXiv preprint arXiv:2506.19472.

Hu, R. H., Bournez, E., Cheng, S. Y., Jiang, H. L., Nerry, F., Landes, T., et al. (2018). Estimating the leaf area of an individual tree in urban areas using terrestrial laser scanner and path length distribution model. Isprs J. Photogramm. Remote Sens. 144, 357–368. doi: 10.1016/j.isprsjprs.2018.07.015

Huang, F., Li, Y. M., Liu, Z. X., Gong, L., and Liu, C. L. (2024). A method for calculating the leaf area of pak choi based on an improved mask R-CNN. Agriculture-Basel 14. doi: 10.3390/agriculture14010101

Jiang, H. Y., Hu, F., Fu, X. Q., Chen, C. R., Wang, C., Tian, L. X., et al. (2023). YOLOv8-Peas: a lightweight drought tolerance method for peas based on seed germination vigor. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1257947

Kang, R., Yang, K., Zhang, X. X., Wu, W., and Chen, K. J. (2016). Development of online detection and processing system for contaminants on chicken carcass surface. Appl. Eng. Agriculture. 32, 133–139. doi: 10.13031/aea.32.11200

Ketkar, N. (2017). “Stochastic gradient descent,” in Deep learning with Python: A hands-on introduction (Apress: Berkeley, CA, USA: Springer), 113–132.

Khanam, R. and Hussain, M. (2024). YOLOv11: an overview of the key architectural enhancements. Arxiv.

Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., et al. (2023). “Segment anything,” in Proceedings of the IEEE/CVF international conference on computer vision, 4015–4026.

Korva, J. T. and Forbes, G. A. (1997). A simple and low-cost method for leaf area measurement of detached leaves. Exp. Agriculture 33, 65–72. doi: 10.1017/S0014479797000173

Lee, S. H., Oh, M. M., and Kim, J. O. (2022). Plant leaf area estimation via image segmentation. In 2022 37th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC). IEEE. 996–998. doi: 10.1109/ITC-CSCC55581.2022.9894907

Leroy, C., Saint-André, L., and Auclair, D. (2007). Practical methods for non-destructive measurement of tree leaf area. Agroforestry Syst. 71, 99–108. doi: 10.1007/s10457-007-9077-2

Li, H. L., Li, J., Wei, H. B., Liu, Z., Zhan, Z. F., and Ren, Q. L. (2024). Slim-neck by GSConv: a lightweight-design for real-time detector architectures. J. Real-Time Image Process. 21. doi: 10.1007/s11554-024-01436-6

Li, H. K., Liu, L. B., Li, Q., Liao, J., Liu, L., Zhang, Y. J., et al. (2024). RSG-YOLOV8: Detection of rice seed germination rate based on enhanced YOLOv8 and multi-scale attention feature fusion. PLoS One 19. doi: 10.1371/journal.pone.0306436

Lu, C. H., Nnadozie, E., Camenzind, M. P., Hu, Y. C., and Yu, K. (2024). Maize plant detection using UAV-based RGB imaging and YOLOv5. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1274813

Lüling, N., Reiser, D., Straub, J., Stana, A., and Griepentrog, H. W. (2023). Fruit volume and leaf-area determination of cabbage by a neural-network-based instance segmentation for different growth stages. Sensors 23. doi: 10.3390/s23010129

Lv, M. and Su, W. H. (2024). YOLOV5-CBAM-C3TR: an optimized model based on transformer module and attention mechanism for apple leaf disease detection. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1323301

Martini, F. (2024). On the definition of tree seedlings. Plant Ecol. 225, 75–79. doi: 10.1007/s11258-023-01378-2

Miao, Y. Z., Meng, W., and Zhou, X. Y. (2025). SerpensGate-YOLOv8: an enhanced YOLOv8 model for accurate plant disease detection. Front. Plant Sci. 15. doi: 10.3389/FPLS.2024.1514832

Mondo, V. H. V., Cicero, S. M., Dourado-Neto, D., Pupim, T. L., and Dias, M. A. N. (2013). Seed vigor and initial growth of corn crop. J. Seed Sci. 35, 64–69. doi: 10.1590/S2317-15372013000100009

Ouyang, D., He, S., Zhan, J., Guo, H., Huang, Z., Luo, M., et al. (2023). Efficient multi-scale attention module with cross-spatial learning. Arxiv. doi: 10.1109/ICASSP49357.2023.10096516

Paul, A. and Machavaram, R. (2025). Advancing capsicum detection in night-time greenhouse environments using deep learning models: Comparative analysis and improved zero-shot detection through fusion with a single-shot detector. Franklin Open 10, 100243. doi: 10.1016/j.fraope.2025.100243

Paul, A. and Machavaram, R. (2024a). Advanced segmentation models for automated capsicum peduncle detection in night-time greenhouse environments. Syst. Sci. Control Eng. 12, 2437162. doi: 10.1080/21642583.2024.2437162

Paul, A., Machavaram, R., Kumar, D., and Nagar, H. (2024b). Smart solutions for capsicum Harvesting: Unleashing the power of YOLO for Detection, Segmentation, growth stage Classification, Counting, and real-time mobile identification. Comput. Electron. Agric. 219, 108832. doi: 10.1016/j.compag.2024.108832

Podlaski, S. and Chomontowski, C. (2020). Various methods of assessing sugar beet seed vigour and its impact on the germination process, field emergence and sugar yield. Sugar Tech. 22, 130–136. doi: 10.1007/s12355-019-00754-5

Rewald, B., Holzer, L., and Göransson, H. (2015). Arbuscular mycorrhiza inoculum reduces root respiration and improves biomass accumulation of salt-stressed Ulmus glabra seedlings. Urban Forestry Urban Greening 14, 432–437. doi: 10.1016/j.ufug.2015.04.011

Roggiolani, G., Sodano, M., Guadagnino, T., Magistri, F., Behley, J., Stachniss, C., et al. (2023). “Hierarchical approach for joint semantic, plant instance, and leaf instance segmentation in the agricultural domain,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), 9601–9607. doi: 10.1109/ICRA48891.2023.10160918

Schneider, F., Swiatek, J., and Jelali, M. (2024). Detection of growth stages of chilli plants in a hydroponic grower using machine vision and YOLOv8 deep learning algorithms. Sustainability 16. doi: 10.3390/su16156420

Tian, L., Fang, Z., Jiang, H., Liu, S., Zhang, H., Fu, X., et al. (2025). Evaluation of tomato seed full-time sequence germination vigor based on improved YOLOv8s. Comput. Electron. Agric. 230, 109871. doi: 10.1016/j.compag.2024.109871

Tu, L. F., Peng, Q., Li, C. S., and Zhang, A. Q. (2021). 2D in situ method for measuring plant leaf area with camera correction and background color calibration. Sci. Program. 2021. doi: 10.1155/2021/6650099

Ullah, H. and Jan, T. R. (2024). Germination test, seedling growth, and physiochemical traits are used to screen okra varieties for salt tolerance. Heliyon 10. doi: 10.1016/j.heliyon.2024.e34152

Unser, M., Aldroubi, A., and Eden, M. (1993). B-spline signal-processing.1. Theory. IEEE Trans. Signal Process. 41, 821–833. doi: 10.1109/78.193220

Wan, H., Zeng, X., Fan, Z., Zhang, S., and Zhang, K. (2023). U-DPnet: an ultralight convolutional neural network for the detection of apples in orchards. J. Real-Time Image Process. 20, 76. doi: 10.1007/s11554-023-01330-7

Wang, Z. P., Jin, L. Y., Wang, S., and Xu, H. R. (2022). Apple stem/calyx real-time recognition using YOLO-v5 algorithm for fruit automatic loading system. Postharvest Biol. Technol. 185. doi: 10.1016/j.postharvbio.2021.111808

Wang, X., Wang, J., Liu, H., Zou, D., and Zhao, H. (2013). Influence of natural saline-alkali stress on chlorophyll content and chloroplast ultrastructure of two contrasting rice (Oryza sativa L. japonica) cultivars. Australian Journal of Crop Science. 7, 289–292.

Wang, X., Zhang, R., Kong, T., Li, L., and Shen, C. (2020). SOLOv2: dynamic and fast instance segmentation. Adv. Neural Inf. Process. Syst. 33, 17721–17732.

Wang, L., Zuo, Q. S., Zheng, J. D., You, J. J., Yang, G., and Leng, S. H. (2022). Salt stress decreases seed yield and postpones growth process of canola (Brassica napus L.) by changing nitrogen and carbon characters. Sci. Rep. 12. doi: 10.1038/s41598-022-22815-8

Wen, X., Zeng, M., Chen, J., Maimaiti, M., and Liu, Q. (2023). Recognition of wheat leaf diseases using lightweight convolutional neural networks against complex backgrounds. Life 13, 2125. doi: 10.3390/life13112125

Wu, Y., Li, Z., Jiang, H., Li, Q., Qiao, J., Pan, F., et al. (2024). YOLOv8-segANDcal: segmentation, extraction, and calculation of soybean radicle features. Front. Plant Sci. 15, 1425100. doi: 10.3389/fpls.2024.1425100

Yateng, A. (2023). ISAT with segment anything: image segmentation annotation tool with segment anything. ISATwithsegmentanything.

Yu, C., Wu, Q., Sun, C., Tang, M., Sun, J., and Zhan, Y. (2019). The phosphoproteomic response of okra (Abelmoschus esculentus L.) seedlings to salt stress. Int. J. Mol. Sci. 20, 1262.

Zhang, C. J., Liu, T., Wang, J. X., Zhai, D. L., Chen, M., Gao, Y., et al. (2024). DeepPollenCount: a swin-transformer-YOLOv5-based deep learning method for pollen counting in various plant species. Aerobiologia 40, 425–436. doi: 10.1007/s10453-024-09828-8

Zhang, H. H., Xu, N., Wu, X. Y., Wang, J. R., Ma, S. L., Li, X., et al. (2018). Effects of four types of sodium salt stress on plant growth and photosynthetic apparatus in sorghum leaves. J. Plant Interact. 13, 506–513. doi: 10.1080/17429145.2018.1526978

Zhang, S. and Zhang, L. L. (2022). Using an IR camera to improve leaf area and temperature measurements: A new method for increasing the accuracy of photosynthesis-related parameters. Agric. For. Meteorol. 322. doi: 10.1016/j.agrformet.2022.109005

Zhao, Y., Lv, W., Xu, S., Wei, J., Wang, G., Dang, Q., et al. (2024). DETRs beat YOLOs on real-time object detection. Arxiv. doi: 10.1109/CVPR52733.2024.01605

Zhao, X., Wu, T., Guo, S., Hu, J., and Zhan, Y. (2022). Ectopic expression of AeNAC83, a NAC transcription factor from Abelmoschus esculentus, inhibits growth and confers tolerance to salt stress in Arabidopsis. Int. J. Mol. Sci. 23, 10182. doi: 10.3390/ijms231710182

Zhu, Y. Y., Wang, Z. W., and Zhu, X. H. (2023). New reflections on food security and land use strategies based on the evolution of Chinese dietary patterns. Land Use Policy 126. doi: 10.1016/j.landusepol.2022.106520

Keywords: YOLOv11-HSECal model, okra, salt stress, time-series, leaf area, leaf growth rate, plant vitality evaluation

Citation: Cao X, Li Y, Zhang Y, Zhong Z, Bai R, Yang P, Pan F and Fu X (2025) Full-time sequence assessment of okra seedling vigor under salt stress based on leaf area and leaf growth rate estimation using the YOLOv11-HSECal instance segmentation model. Front. Plant Sci. 16:1625154. doi: 10.3389/fpls.2025.1625154

Received: 09 May 2025; Accepted: 28 July 2025; Published: 14 August 2025.

Edited by:

Milind B. Ratnaparkhe, ICAR Indian Institute of Soybean Research, India

Reviewed by:

Ayan Paul, Indian Institute of Technology Kharagpur, India
Mukesh Kumar Vishal, National Research Centre on Seed Spices (ICAR), India

Copyright © 2025 Cao, Li, Zhang, Zhong, Bai, Yang, Pan and Fu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiuqing Fu, fuxiuqing@njau.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.