A Framework for Identification of Healthy Potted Seedlings in Automatic Transplanting System Using Computer Vision

Automatic transplanting of seedlings is of great significance to vegetable cultivation factories. Accurate and efficient identification of healthy seedlings is fundamental to automatic transplanting. This study proposed a computer vision-based framework for identifying healthy seedlings. Vegetable seedlings were planted in trays in the form of potted seedlings. Two color index operators were proposed for image preprocessing of potted seedlings. An optimal thresholding method based on the genetic algorithm, combined with the three-dimensional block-matching algorithm (BM3D), was developed to segment and denoise the image of potted seedlings. The leaf area of each potted seedling was measured by machine vision to detect its growth status and position. On this basis, a smart identification framework of healthy vegetable seedlings (SIHVS) was constructed to identify healthy potted seedlings. In a comparison on 273 potted seedling images, the identification accuracy of the proposed method was 94.33%, higher than the 89.37% obtained by the comparison method.


INTRODUCTION
Vegetables are rich in vitamins, minerals, and crude fiber, and are an indispensable food for humans. Vegetable production is important for maintaining a normal supply, and its efficiency directly affects vegetable yield. Introducing industrial production technology into vegetable factories can greatly improve the efficiency of vegetable production.
In the production process of a vegetable factory, vegetable seeds are usually cultured in pots after being washed and soaked. The matrix placed in the pot nourishes the growing seed. Potted seedlings are placed in trays for cultivation. Once the potted seedlings meet the standard for healthy seedlings, they are transplanted in preparation for subsequent vegetable production. Therefore, accurate and efficient transplanting of potted seedlings is an important part of vegetable production.
The automatic transplanting of healthy potted seedlings can not only ensure the integrity of seedling growth but also reduce the time spent on manual operation in the whole process of vegetable factory production. Many automatic transplanting systems have been developed for transplanting different kinds of potted seedlings (Han et al., 2018a,b; Rahul et al., 2019; Jin et al., 2020). The transplanting mechanism was designed and the theory of automatic transplanting was analyzed in the literature (Sun et al., 2017; Vivek et al., 2017). A mini-automatic transplanting machine was developed, and the performance of its control system was investigated by Yang et al. (2020). The above reports mainly examined transplanting speed and success rate in terms of the mechanical structure.
Vision-based automatic transplanting can distinguish healthy potted seedlings from unhealthy ones, which may improve the accuracy of transplanting potted seedlings. The visual identification of the growth state of potted seedlings plays a vital role in the automatic transplanting of healthy potted seedlings. Accurate visual identification of healthy potted seedlings is the first step in ensuring that seedlings are transplanted without damage. A robust identification algorithm for healthy potted seedlings allows a vision-based transplanting system to realize its advantages in vegetable factory production.
Machine vision technology has been applied in many fields (John, 2017; Mauro et al., 2018; Wang et al., 2018; Kim et al., 2020). For the detection of seedling growth status, an early method was reported by Ling and Ruzhitsky (1996), who measured the tomato seedling canopy with an adaptive threshold algorithm and the Otsu method. Lin et al. (2002) estimated the leaf area of seedlings from the projected contour image and proposed an image processing method based on the elliptic Hough transform to determine where seedling leaves overlap. Karimi (2009) developed a vision-based model for estimating seedling leaf area, which used a linear regression equation of the leaf length and width obtained by vision. Tong et al. (2013) combined the region center of cross-border leaves with an improved watershed segmentation method to measure the leaf area and then estimated the quality of vegetable seedlings from the leaf area. Feng et al. (2013) designed a linear structured light vision system to measure seedling surface information. In that system, one color image of the seedling line with linear structured light was used to measure the seedling height, and another color image of the seedling line without linear structured light was used to identify the size of the seedling leaf; a color index of 2G-R-B was used to distinguish the seedling leaf from the substrate, and the Otsu dynamic threshold was adopted to extract the leaf area. Ashraf et al. (2014) used machine vision to inspect and sort grafted seedlings. Hang et al. (2017) developed an automated corn seedling phenotyping platform based on a time-of-flight (TOF) camera and an industrial robot arm.
Their method used the TOF camera to obtain three-dimensional (3D) point cloud data of the seedlings and then used a 3D-to-2D projection and an x-axis pixel density distribution algorithm to segment and match the corn seedlings. Zhang et al. (2015) presented a comprehensive image processing flow for extracting the feature information of graft seedlings, which used discontinuous gray to segment the leaf area of the seedling, applied order statistic filtering to reduce random noise, and used the Otsu algorithm to segment the seedling image. Golbach et al. (2016) proposed a fast 3D reconstruction method for seedling phenotyping and measured seedling surface features with the developed computer vision technology. Feng et al. (2018) reported a method based on photometric stereo for measuring seedling leaf morphology. Yang et al. (2019) combined filtering and clustering segmentation algorithms to process the 3D point cloud data of the overhead view of seedlings, using the imaging principle of an RGB-D camera.
Different from the reported methods for detecting seedlings, this study proposed a new framework for identifying healthy seedlings based on the physical transplanting prototype developed by a research group using computer vision technology. The purpose of this study is to identify healthy seedlings quickly and accurately for an automatic seedling transplanting system.

System Framework
The physical prototype of automatic seedling transplanting is shown in Figure 1A and its specific structure is shown in Figure 1B, which consists of a conveyor unit, a visual detection unit, and a transplanting unit.
Seedlings are planted in trays in the form of pot seedlings. The conveyor unit is mainly responsible for the position movement of tray seedlings, which is composed of a conveyor belt, a limit device, and a frame. The conveyor belt is mounted on the frame and driven by a stepper motor. The visual detection unit is mounted above the conveyor belt. To ensure that the visual detection unit can effectively collect image information of potted seedlings, the conveyor belt is equipped with a guide rail and a guide wheel. When the tray passes through the guide rail area, the horizontal placement posture of the tray will be automatically corrected. The edge line of the tray and the edge line of the conveyor belt will be parallel after the correction. The end of the conveyor belt is equipped with a recovery box of potted seedlings, which is used for the removal and recovery of seedlings of poor quality.
The image of tray seedlings was obtained by a charge-coupled device (CCD) camera with a resolution of 1,280 × 960 (model MV-VD120SC, supplied by Micro-vision company, Xi'an, China). The distance between the camera and the conveyor belt is 700 mm. The CCD camera is connected to a personal computer (PC) through a USB 3.0 cable. The images of tray seedlings obtained by the CCD camera were stored on the PC, which has 8 GB RAM, an Intel Core i5-4590 CPU, and the Windows 7 operating system. The image processing software running on the PC is OpenCV 3.0 and Matlab 8.3. The transplanting unit, including a gantry and an end-effector, mainly carries out the sorting operation of potted seedlings. A manipulator moves on the gantry according to the seedling information, and an end-effector mounted at the end of the manipulator grasps the target seedling. The actions of the gantry and the manipulator are controlled by a programmable logic controller (PLC).
When the tray is transported to the visual detection system, the belt stops for 1 s and the visual detection unit is triggered to acquire the image information of tray seedlings as shown in Figure 2. The growth status of the seedlings is then detected visually; the seedling detection algorithm is specified in the following sections.
FIGURE 1 | System modeling. (A) Physical prototype; (B) System structure diagram: 1, conveyor belt; 2, limit device; 3, tray with seedlings; 4, lightbox; 5, compensatory light; 6, charge-coupled device (CCD) camera; 7, frame; 8, personal computer; 9, cross slide; 10, stepping motor; 11, manipulator mounting plate; 12, end-effector; 13, diffuse reflection laser sensor; 14, recovery box; 15, gantry; 16, programmable logic controller (PLC).

Seedling Image Preprocessing
The surface color of a potted seedling is an important indicator of its growth status; however, the acquired image contains noise due to various external interferences. The image of tray seedlings also contains several different objects, which increases the difficulty of identifying healthy seedlings.
The potted seedlings are green, which differs from the surface colors of the seedling matrix and the tray: the seedling matrix is brown-black and the tray is red. Therefore, image preprocessing is necessary, and a proper color index needs to be developed for the color image of the potted seedling in the RGB color space. Two color indexes are proposed as shown in Equations (1) and (2) to calculate the color differences of all pixels in the image, so that the green and red color components are enhanced and the remaining components are weakened in the seedling image.
where TG and TR are the green color component and the red color component after image preprocessing, respectively. R, G, and B are color components of the pixels in the RGB color space of the original image.
To investigate the surface information of seedlings more comprehensively, an index of the grayscale image preprocessing is proposed as shown in Equation (3).
where Y is the gray value of the pixel in the grayscale image and R, G, and B are color components of the pixels in the RGB color space of the original image. Thus, the image preprocessing of potted seedlings is finished to prepare for the subsequent image segmentation.
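Equations (1) to (3) are not reproduced in this text, so the following Python sketch only illustrates the kind of per-pixel color-difference operation described above. It assumes an excess-green index 2G - R - B (the index cited earlier from Feng et al., 2013), a mirrored excess-red index 2R - G - B, and the common luminance weights 0.299/0.587/0.114 for the gray value. All three formulas and the function name `color_indexes` are illustrative assumptions, not the indexes proposed in this study.

```python
import numpy as np

def color_indexes(rgb):
    """Return (TG, TR, Y) for an H x W x 3 uint8 RGB image.

    Assumed stand-ins for Equations (1)-(3), not the paper's own indexes:
      TG = 2G - R - B, TR = 2R - G - B, Y = 0.299R + 0.587G + 0.114B.
    """
    img = rgb.astype(np.float64)            # avoid uint8 overflow
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    tg = np.clip(2 * g - r - b, 0, 255)     # enhances green (seedling leaves)
    tr = np.clip(2 * r - g - b, 0, 255)     # enhances red (tray)
    y = 0.299 * r + 0.587 * g + 0.114 * b   # weighted grayscale value
    return tg.astype(np.uint8), tr.astype(np.uint8), np.round(y).astype(np.uint8)

# Three hypothetical pixels: leaf-like green, tray-like red, dark substrate.
demo = np.array([[[40, 160, 30], [200, 30, 30], [50, 40, 30]]], dtype=np.uint8)
tg, tr, y = color_indexes(demo)
```

With these assumed indexes, the green pixel responds strongly in TG and the red pixel in TR, while the dark substrate pixel stays near zero in both, which is the separation the preprocessing step aims for.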

Segmentation and Denoising of Seedling Image
After the image preprocessing, an image segmentation algorithm is proposed using an optimal thresholding method based on the genetic algorithm. The algorithm flow is shown in Figure 3.
• Encoding: the grayscale image is binary coded to produce a 16-bit binary number. The first eight bits represent the segmentation threshold T_1, and the remaining eight bits represent the segmentation threshold T_2. The number of seedlings affects the fitness calculation of each generation, so the initial generation quantity should be set reasonably. After extensive simulation, the number of pepper seedlings is set to 21, and the maximum number of generations is set to 100.
• Decoding: the binary array generated by the encoding is decoded and converted to a value between 0 and 255 for the fitness evaluation.
• Fitness function: Equation (4) is used as the fitness function, and linear scaling of the fitness function is applied. In Equation (4), P_i is the probability of occurrence of the i-th gray level. T_1 and T_2 are the segmentation thresholds, with T_1 < T_2, which divide the grayscale image into the categories C_0, C_1, and C_2. P_C0, P_C1, and P_C2 represent the probabilities of occurrence of C_0, C_1, and C_2, respectively.
• Selection: the roulette selection algorithm is applied.
• Crossover: crossover forms two new individuals. The crossover probability affects both how often crossover occurs and the speed of convergence, so it must be chosen carefully. The crossover probability is set to 0.6.
• Mutation: simple mutation is used, and the mutation probability is set to 0.03.
• Termination criteria: the algorithm stops when it reaches the predetermined number of generations or when the highest fitness value in the population is stable.
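The genetic search over two thresholds can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the fitness is taken to be the between-class variance of the three classes (Equation (4) is not reproduced here), the population size of 21 is an interpretation of the value given above, linear fitness scaling and the stability-based stopping rule are omitted, and all function names are hypothetical.

```python
import numpy as np

def fitness(hist_p, t1, t2):
    """Between-class variance of C0, C1, C2 split at t1 < t2 (assumed criterion)."""
    if t1 >= t2:
        return 0.0
    levels = np.arange(256)
    mu_total = np.sum(levels * hist_p)
    score = 0.0
    for lo, hi in ((0, t1), (t1, t2), (t2, 256)):
        p_c = hist_p[lo:hi].sum()                      # class probability
        if p_c > 0:
            mu_c = np.sum(levels[lo:hi] * hist_p[lo:hi]) / p_c
            score += p_c * (mu_c - mu_total) ** 2
    return float(score)

def decode(chrom):
    """Split a 16-bit chromosome into two 8-bit thresholds in 0..255."""
    weights = 2 ** np.arange(7, -1, -1)
    return int(chrom[:8] @ weights), int(chrom[8:] @ weights)

def ga_thresholds(gray, pop_size=21, generations=100,
                  p_cross=0.6, p_mut=0.03, seed=0):
    rng = np.random.default_rng(seed)
    hist_p = np.bincount(gray.ravel(), minlength=256) / gray.size
    pop = rng.integers(0, 2, size=(pop_size, 16))
    best, best_fit = pop[0].copy(), -1.0
    for _ in range(generations):
        fits = np.array([fitness(hist_p, *decode(c)) for c in pop])
        if fits.max() > best_fit:                      # remember the best individual
            best_fit, best = float(fits.max()), pop[fits.argmax()].copy()
        # Roulette selection: sample parents in proportion to fitness
        probs = (fits + 1e-12) / (fits + 1e-12).sum()
        pop = pop[rng.choice(pop_size, size=pop_size, p=probs)]
        # Single-point crossover on consecutive pairs forms two new individuals
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                cut = int(rng.integers(1, 16))
                tail = pop[i, cut:].copy()
                pop[i, cut:] = pop[i + 1, cut:]
                pop[i + 1, cut:] = tail
        # Simple mutation: flip each bit with probability p_mut
        flips = rng.random(pop.shape) < p_mut
        pop = np.where(flips, 1 - pop, pop)
    return decode(best), best_fit

# Synthetic three-level image standing in for a preprocessed tray image
demo = np.repeat(np.array([30, 120, 220], dtype=np.uint8), 400).reshape(30, 40)
(t1, t2), best_fit = ga_thresholds(demo)
```

Chromosomes that decode to T_1 >= T_2 receive zero fitness in this sketch, so roulette selection steers the population toward valid threshold pairs without a separate repair step.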
A binary map of tray seedlings is generated by using Equation (5).
where T_1* and T_2* represent the two thresholds obtained by the optimal threshold segmentation of the genetic algorithm, 1 represents a seedling pixel, and 0 represents a background pixel. Then, the three-dimensional block-matching algorithm (BM3D) is applied to denoise the tray seedling image.
Let I denote the noisy image and P any matching block into which it has been divided. The block size of P is set to K × K, and Q denotes a sliding window block in the search process. Once the block size is known, a block is identified by the pixel in its upper left corner. In the block-matching process, an appropriate step size h is determined first. The image is divided into blocks and searched from left to right and from top to bottom. The current block P is selected as the reference block, and the center point of P is taken as the reference point.
As shown in Equation (6), S(P) is the set of blocks similar to P, which forms a 3D matrix. τ_d is the distance threshold in the search process, and d is the distance between the matching blocks, as shown in Equation (7),
where X is the matrix value of the matching block. Finally, the blocks in the set are arranged in order of magnitude, and a 3D matrix of size K × K × S(P) is obtained. The denoising in the 3D transform domain can then be expressed by Equation (8), where N_3D denotes the 3D unitary transformation applied to T_S(P). γ is given by Equation (9), where λ_3D represents the threshold value of hard threshold filtering, and σ represents the Gaussian white noise parameter. When the filtering of the noisy image is completed, each block has a corresponding estimated value. N_P represents the number of non-zero coefficients in the matrix after filtering, and W_P represents the estimated value of the basic weight of the current block, as shown in Equation (10).
After calculating the basic estimation value of the 3D transform domain filtered by Equation (10), the final estimated weight can be obtained as shown in Equation (11).
It can be seen from Equation (11) that the larger the estimated weight is, the smaller the noise is in the real image. Finally, the real image is obtained as the weighted average of the overlapping block estimates.
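The grouping, hard-thresholding, and aggregation steps described above can be sketched in Python as follows. Because Equations (6) to (11) are not reproduced in this text, the sketch assumes the formulas of the standard BM3D hard-thresholding stage: block distance d = ||P - Q||^2 / K^2, zeroing of coefficients with magnitude below λ_3D·σ, and weight W_P = 1 / (σ^2 · N_P). An orthonormal 3D FFT stands in for the unitary transform, the default values of `tau_d` and `lam_3d` are typical choices rather than values from this study, and the function names are hypothetical.

```python
import numpy as np

def match_blocks(img, ref_yx, K=8, step=4, tau_d=2500.0):
    """Collect blocks Q whose distance to the reference block P is at most tau_d.

    Blocks are addressed by their upper-left pixel and scanned left to right,
    top to bottom with stride `step`. Assumed distance: ||P - Q||^2 / K^2.
    """
    y0, x0 = ref_yx
    P = img[y0:y0 + K, x0:x0 + K].astype(np.float64)
    found = []
    for y in range(0, img.shape[0] - K + 1, step):
        for x in range(0, img.shape[1] - K + 1, step):
            Q = img[y:y + K, x:x + K].astype(np.float64)
            d = np.sum((P - Q) ** 2) / K ** 2
            if d <= tau_d:
                found.append((d, (y, x), Q))
    found.sort(key=lambda item: item[0])            # most similar blocks first
    coords = [c for _, c, _ in found]
    group = np.stack([q for _, _, q in found])      # shape: |S(P)| x K x K
    return coords, group

def hard_threshold_group(group, sigma, lam_3d=2.7):
    """Filter a block group in a unitary 3D transform domain."""
    coeffs = np.fft.fftn(group, norm="ortho")       # orthonormal 3D FFT
    coeffs[np.abs(coeffs) < lam_3d * sigma] = 0     # hard thresholding
    n_p = int(np.count_nonzero(coeffs))             # retained coefficients N_P
    weight = 1.0 / (sigma ** 2 * n_p) if n_p >= 1 else 1.0
    filtered = np.real(np.fft.ifftn(coeffs, norm="ortho"))
    return filtered, weight

def aggregate(shape, K, estimates):
    """Weighted average of overlapping block estimates: sum(w * block) / sum(w)."""
    num, den = np.zeros(shape), np.zeros(shape)
    for (y, x), block, w in estimates:
        num[y:y + K, x:x + K] += w * block
        den[y:y + K, x:x + K] += w
    return np.divide(num, den, out=np.zeros(shape), where=den > 0)

# Constant test image: every block matches the reference block exactly.
img = np.full((16, 16), 100.0)
coords, group = match_blocks(img, (0, 0))
filtered, weight = hard_threshold_group(group, sigma=1.0)
restored = aggregate(img.shape, 8, [(c, b, weight) for c, b in zip(coords, filtered)])
```

A full denoiser would repeat the grouping for every reference block before aggregating; the sketch processes a single group to keep the three steps visible.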

Healthy Seedlings Identification
The leaf area "M" of the seedling is a visual representation of the seedling growth status. After the seedling image is segmented, the smart identification framework of healthy vegetable seedlings (SIHVS) calculates the leaf area of each seedling. The SIHVS algorithm counts the number of pixels of the seedling leaf, and the actual area of the seedling leaf is obtained from the leaf pixels by proportional conversion. The regionprops function in MATLAB is used to calculate the area of each white region in the binary image. Given the binary image bws, regionprops counts the number of pixels occupied by each white region and assigns the results to props. Then, the actual leaf area of each seedling is converted according to the relationship between the tray size and the pixels as shown in Equation (12).
M = K × P (12)
where M is the actual leaf area of the seedling, K is a proportionality factor, which is the ratio of the actual size of the tray to the number of pixels in the tray, and P is the number of pixels occupied by the leaves of the seedling. The ratio of the extracted leaf area of the potted seedling to the hole area of the potted seedling is selected as the leaf area threshold "F". The SIHVS algorithm uses the threshold "F" to identify healthy seedlings. Healthy seedlings, sub-healthy seedlings, inferior seedlings, and empty cells can be identified and classified based on the average, maximum, and minimum values of F, which are obtained from a large number of experiments. To confirm the growth state of a seedling, its value of F is first calculated. If it is greater than the average value of F, the seedling is identified as healthy. If it falls between the average value and the minimum value of F, the seedling is identified as sub-healthy. If it is smaller than the minimum value of F, the seedling is considered to be of poor quality. A cell is recognized as empty when F is below 0.015, a bound confirmed by calculating F for 100 empty cells. Thus, the healthy seedlings can be identified, and the details of the results are given in the Results section.
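The classification rule above can be sketched in Python as follows. The sketch assumes that the binary image is already cropped to the tray, that the tray is a regular grid of cells, and that the cell area in pixels stands in for the hole area; the proportional conversion is taken as M = K × P. The numeric bounds are the values reported in this study, while the function names and the toy image are hypothetical.

```python
import numpy as np

# Bounds reported in this study: average F, minimum F, empty-cell bound
F_AVG, F_MIN, F_EMPTY = 0.20, 0.16, 0.015

def cell_ratios(binary, rows, cols):
    """Ratio F of leaf pixels to cell pixels for every tray cell.

    Assumes `binary` (1 = seedling, 0 = background) is cropped to the tray
    and the tray is a regular rows x cols grid.
    """
    h, w = binary.shape
    ch, cw = h // rows, w // cols
    cells = binary[:rows * ch, :cols * cw].reshape(rows, ch, cols, cw)
    return cells.sum(axis=(1, 3)) / float(ch * cw)

def leaf_area(pixel_count, k):
    """Proportional conversion M = K * P, with K in area units per pixel."""
    return k * pixel_count

def classify(f):
    """Map a ratio F to one of the four growth states."""
    if f < F_EMPTY:
        return "empty"
    if f < F_MIN:
        return "poor"
    if f <= F_AVG:
        return "sub-healthy"
    return "healthy"

# Toy 2 x 2 tray with 10 x 10 pixel cells
binary = np.zeros((20, 20), dtype=np.uint8)
binary[0:3, 0:10] = 1      # 30 leaf pixels -> F = 0.30
binary[0:2, 10:19] = 1     # 18 leaf pixels -> F = 0.18
binary[10, 0:5] = 1        #  5 leaf pixels -> F = 0.05
binary[10, 10] = 1         #  1 leaf pixel  -> F = 0.01
ratios = cell_ratios(binary, rows=2, cols=2)
labels = [[classify(f) for f in row] for row in ratios]
```

Checking the empty-cell bound first keeps cells with only residual noise pixels from being counted as seedlings of poor quality.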

Segmentation and Denoising Results
The original pictures of two trays of pepper seedlings are shown in Figures 4A,B. The segmentation results of the original pepper seedlings are shown in Figures 5A,B based on the optimal threshold method. Denoising results are shown in Figures 6A,B using the BM3D.
To verify the effectiveness of the proposed method, different species of seedlings were used in the validation experiments. The original pictures of watermelon seedlings and cucumber seedlings are shown in Figures 7A,B, respectively. The segmentation results of the watermelon seedlings and cucumber seedlings obtained with the proposed method are shown in Figures 7C,D.

Threshold Calculation
Twenty groups of pepper seedlings including 2,100 seedlings were analyzed, and the measured threshold "F" of each group was calculated. The maximum, average, and minimum values of the pepper seedling thresholds of each group are recorded in Table 1. The average value, F_a, of the 20 groups was 0.20, the maximum value, F_max, was 0.25, and the minimum value, F_min, was 0.16.
A healthy seedling is labeled with a red "1," a sub-healthy seedling with a green "1," a seedling of poor quality with a blue "1," and an empty tray cell with a blue "0." The identification results of the original pepper seedling images are shown in Figures 8, 9, which are based on the proposed framework and on the method of Tong et al. (2013), respectively.

Coordinates Determination of Healthy Seedling
After the growth state of the seedlings is identified, the coordinates of seedlings in different growth states are output as shown in Figures 10, 11, which are based on the proposed framework and on the method of Tong et al. (2013), respectively. The origin of the coordinate system is the upper left corner of the whole tray image. The positive X-axis runs to the right along the upper edge of the image, and the positive Y-axis runs downward along the left edge of the image. With the proposed method, the center coordinates of sub-healthy seedlings are (1,5) and (11,5) in Figure 10A and (1,1), (3,1), (9,1), (9,3), (11,3), and (13,3) in Figure 10B; the coordinates of seedlings of poor quality are (5,1) in Figure 10A and (13,1) and (7,3) in Figure 10B; and the coordinates of empty tray cells are (1,5) in Figure 10B. The other coordinates in Figure 10 represent healthy seedlings. By comparison, with the method of Tong et al. (2013), the center coordinates of sub-healthy seedlings are (5,1) and (11,5) in Figure 11A and (1,1) and (9,1) in Figure 11B; the coordinates of seedlings of poor quality are (13,1) and (7,3) in Figure 11B; and the coordinates of empty tray cells are (1,5) in Figure 11B. The other coordinates in Figure 11 represent healthy seedlings.
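The reported cell centers are all odd integers, which is consistent with each tray cell spanning two coordinate units along both axes. Under that assumption, which is inferred from the listed coordinates rather than stated in the text, the mapping from a cell's grid position to its center can be sketched in Python as follows; the function names and the 3 × 3 layout are hypothetical.

```python
def cell_center(row, col):
    """Center (x, y) of the tray cell at zero-based (row, col).

    Assumes each cell spans two coordinate units, with the origin at the
    upper-left corner of the tray image, x to the right and y downward.
    """
    return (2 * col + 1, 2 * row + 1)

def coordinates_by_state(labels):
    """Group cell centers by growth-state label (labels: 2D list of strings)."""
    grouped = {}
    for r, row in enumerate(labels):
        for c, state in enumerate(row):
            grouped.setdefault(state, []).append(cell_center(r, c))
    return grouped

# Hypothetical 3 x 3 corner of a tray, not data from the figures
labels = [
    ["healthy", "healthy", "poor"],
    ["healthy", "healthy", "healthy"],
    ["sub-healthy", "healthy", "healthy"],
]
coords = coordinates_by_state(labels)
```

In this layout the seedling of poor quality lies at (5,1) and the sub-healthy seedling at (1,5), the same form as the coordinates listed above.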
A total of 273 pictures of three groups of seedlings were used to verify the potential of the proposed method. The images were randomly obtained under different seedling growth conditions. Group A has 75 healthy seedlings, 6 sub-healthy seedlings, 2 poor seedlings, and 1 empty tray cell; group B has 94 healthy seedlings, 6 sub-healthy seedlings, 3 poor seedlings, and 2 empty tray cells; and group C has 72 healthy seedlings, 6 sub-healthy seedlings, 4 poor seedlings, and 2 empty tray cells. The statistics are recorded in Tables 2, 3. The accuracy rates of healthy seedling identification with the proposed method are 96, 92.55, and 94.44% in groups A, B, and C, respectively. The corresponding rates with the comparison method are 92, 87.23, and 88.89%. The average accuracy rates obtained with the proposed method and the comparison method are 94.33 and 89.37%, respectively.
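The totals and averages quoted above can be checked with a few lines of Python. The sketch uses only the counts and rates reported in this section and assumes that the average accuracy is the unweighted mean of the three group rates, which reproduces the reported values.

```python
# Seedling counts per group, as reported above
groups = {
    "A": {"healthy": 75, "sub-healthy": 6, "poor": 2, "empty": 1},
    "B": {"healthy": 94, "sub-healthy": 6, "poor": 3, "empty": 2},
    "C": {"healthy": 72, "sub-healthy": 6, "poor": 4, "empty": 2},
}

# Accuracy rates (%) of healthy seedling identification per group
proposed = {"A": 96.0, "B": 92.55, "C": 94.44}
comparison = {"A": 92.0, "B": 87.23, "C": 88.89}

def mean_rate(rates):
    """Unweighted mean of the group accuracy rates, rounded to two decimals."""
    return round(sum(rates.values()) / len(rates), 2)

total_cells = sum(sum(counts.values()) for counts in groups.values())
avg_proposed = mean_rate(proposed)
avg_comparison = mean_rate(comparison)
```

The group sizes are 84, 105, and 84 cells, which sum to the 273 pictures used in the verification.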

DISCUSSION
Figures 5, 6 show that the proposed method removed the noise effectively and filled the holes on the leaf surface. This also implies that the proposed method is robust against varied illumination, because the experiment was not conducted under a structured light environment. The threshold "F" obtained from extensive experimental analysis could clearly distinguish seedlings of different quality. Although the "F" value of an empty tray cell should be 0 in theory, the experiments showed that the value for empty tray cells was below 0.015. By setting different thresholds "F," seedlings of different quality can be accurately distinguished, as shown in Figure 7. It can be seen that Figure 7 gives the correct identification results for all seedlings in Figure 4. The proposed method is superior to the comparison method in the identification of healthy seedlings and sub-healthy seedlings, which implies that the algorithm can segment and restore seedling leaves effectively. The SIHVS algorithm can use the threshold "F" to count the surface pixels of seedling leaves in detail. The average value F_a of 0.20, the maximum value F_max of 0.25, the minimum value F_min of 0.16, and the empty-cell bound of 0.015 distinguish the healthy seedlings, sub-healthy seedlings, seedlings of poor quality, and empty tray cells well, as confirmed in Tables 2, 3. The proposed method can give a good identification rate of healthy seedlings that is 94.33% and

CONCLUSION
This study proposed a computer vision-based framework for identifying healthy seedlings. Healthy seedlings, sub-healthy seedlings, seedlings of poor quality, and empty tray cells could be identified automatically. The BM3D-based method could effectively segment the image of the seedlings by filling the holes on the leaf surface. The SIHVS algorithm could identify the growth status of a seedling by counting the number of pixels of the seedling leaf. The threshold "F," the ratio of the leaf area of the seedling to the area of an empty tray cell, is an important index for distinguishing the seedling growth state. It was found that F_a of 0.20, F_max of 0.25, F_min of 0.16, and 0.015 were the boundary values for distinguishing healthy seedlings, sub-healthy seedlings, seedlings of poor quality, and empty tray cells. The proposed method could output the coordinates of healthy seedlings for grasping during transplanting. The identification accuracy rate of healthy seedlings was 94.33%, higher than the 89.37% obtained by the comparison method. These conclusions imply that the proposed method can identify healthy seedlings in the tray and thus prepare them well for transplanting.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.