A Promising Approach to Quantifying Pteropod Eggs Using Image Analysis and Machine Learning

A protocol to semi-automate egg counting in Southern Ocean shelled (thecosome) pteropods using image analysis software and machine learning algorithms was developed and tested for accuracy. Preserved thecosome pteropod (Limacina helicina antarctica) egg masses collected during two austral summer research voyages in East Antarctica were digitally photographed to develop a streamlined approach to enumerating eggs within egg masses using Fiji/ImageJ and its associated machine learning plugin, Trainable Weka Segmentation. Results from this semi-automated approach were then compared with manual counts of eggs dissected from egg masses under a stereomicroscope. A statistically significant correlation was observed between manual and semi-automated approaches (R² = 0.92, p < 0.05). There was no significant difference between manual and semi-automated protocols when egg counts were divided by the egg mass areas (mm²) (t(29.6) = 1.98, p = 0.06). However, the average time in minutes to conduct semi-automated counts (M = 7.4, SD = 1.2) was significantly less than that for the manual enumeration technique (M = 35.9, SD = 5.7; t(30) = 2.042, p < 0.05). This new approach is promising and, unlike manual enumeration, could allow specimens to remain intact for use in live culturing experiments. Despite some limitations that are discussed, this user-friendly and simple protocol can provide the basis for further development, including the addition of macro scripts to improve reproducibility and integration with other imaging platforms to enhance interoperability. Furthermore, egg counting using this technique may provide a relatively unexplored monitoring tool for better understanding the responses of a species highly sensitive to multiple stressors connected to climate change.


INTRODUCTION
It is now widely recognized that a multitude of concurrent biological, chemical and physical stressors caused by human activities are posing significant threats to global marine ecosystems and their components (IPCC, 2022). In polar regions, research has shown that the development and reproduction of many marine organisms, including zooplankton, are particularly vulnerable to warming and ocean acidification (Johnston et al., 2022). Some zooplanktonic groups, such as the gastropod molluscs known as thecosome (shelled) pteropods, are regarded as early responders to climate change (Bednaršek et al., 2016), as they produce fragile aragonite shells that are highly susceptible to dissolution linked to high CO₂ partial pressures (pCO₂) due to increasing ocean acidification (Riebesell et al., 2000; Orr et al., 2005; Kroeker et al., 2013). Whilst recent studies have shown a relatively higher capacity to withstand such effects than previously assumed, it is the early developmental stages of thecosome pteropods that are at greatest risk from changing ocean chemistry (Gardner et al., 2018). These risks will undoubtedly have wider consequences throughout marine ecosystems because, like many zooplankton taxa, thecosome pteropods provide a key energetic link between basal and higher trophic levels and are important contributors to the global export of carbon and carbonate to the deep sea through the flux of fast-sinking fecal pellets and of shells post-mortem (Manno et al., 2010; Manno et al., 2018).
Shell dissolution in thecosome pteropods has often been studied using the common species Limacina helicina from the Northern Hemisphere (Comeau et al., 2010; Lischka et al., 2011; Comeau et al., 2012b; Bednaršek et al., 2014), and its Southern Ocean congener, Limacina helicina antarctica (Manno et al., 2007; Seibel et al., 2012; Johnson and Hofmann, 2017; Gardner et al., 2018). Whilst one study by Bednaršek et al. (2012) revealed in situ shell dissolution of juvenile L. h. antarctica from the Scotia Sea, situated in the Atlantic sector of the Southern Ocean, most of the effects of climate change on the early life development of thecosome pteropods have been observed through laboratory-based manipulation experiments. Examined under predicted levels of ocean acidification and warming, incubated thecosome pteropods have shown a range of adverse responses, including degradation of, reduction in and/or lack of shell development (Lischka et al., 2011; Comeau et al., 2012a; Gardner et al., 2018), increased larval mortality (Lischka et al., 2011; Thabet et al., 2015; Gardner et al., 2018), and a decrease in the proportion of eggs developing to advanced embryogenetic stages (Manno et al., 2016). These responses are bound to have wider ecological and long-term ramifications related to population stability and recruitment.
Thecosome pteropods are holoplanktonic with unique life history strategies. Most species begin life as males until they reach a particular size (e.g., shell diameter of ~4 mm for L. helicina) then subsequently develop female organs and mature into females whilst their male organs are resorbed (Lalli and Wells, 1978; Lalli and Gilmer, 1989), which characterizes them as protandrous hermaphrodites. Females spawn tens of thousands of transparent eggs during their lifespan which are embedded into ribbons within gel matrix egg masses. Embryogenetic development occurs within these clutches, and hatching generally occurs at the trochophore larval stage (Lalli and Gilmer, 1989; Thabet et al., 2015; Wakabayashi, 2017). Optimal clutch size theory posits that mature females will spawn variable numbers of eggs to maximize the offspring fitness as it relates to resource availability, intraspecific competition, and mortality (Godfray et al., 1991). Different forms of parental care exist in marine gastropods, but for many species, females control the number of eggs contained within egg masses in an effort to manage their fecundity under changing conditions (Spight and Emlen, 1976; Perron, 1981).
Challenges related to estimating fecundity in thecosome pteropods can be attributed to the high number of microscopic eggs embedded within each egg mass. Manually counting them can be time-consuming, and using the abundance of mature adults is a relatively inaccurate alternative given the range of egg masses released by each adult pteropod. Manually counting thecosome eggs has previously involved dissecting the egg ribbons from the egg mass, which may introduce stress, particularly if eggs are being placed in live culture for subsequent observational studies (Manno et al., 2016). One study by Lalli and Wells (1978) used a conversion factor of 35 eggs mm⁻² for L. helicina egg masses collected from Eastern Canada, derived by estimating the number of eggs per area of egg mass measured; however, this average value was based on complete measurements taken from only five egg masses. These challenges may be minimized with the use of image analysis platforms.
Autonomous image analysis techniques have previously been tested in plankton research involving the counting and measuring of round objects in aqueous solution, including the use of images and on-board, large-volume samples (Gorsky et al., 1989; Colas et al., 2018). Several studies have employed software platforms to automatically enumerate microscopic eggs of invertebrates from images with high degrees of success (Collin, 2010; Rosati et al., 2015; da Silva Júnior et al., 2018). The purpose of this study is to develop and validate a workflow that combines image segmentation with a supervised machine learning algorithm to perform semi-automatic detection of thecosome pteropod eggs embedded within egg masses. The study aims to use this workflow to enumerate the eggs efficiently and accurately, and to compare it statistically with manual egg enumeration, which involves dissection under a stereomicroscope. Reliably predicting the number of eggs within thecosome pteropod egg masses through non-destructive imaging techniques can benefit the monitoring of marine ecosystems that are particularly prone to rapid chemical change.

MATERIALS AND WORKFLOW CONSTRUCTION

Study Area and Sampling
Plankton sampling was conducted in the East Antarctic region of the Southern Ocean during two separate research voyages (Figure 1). The first was aboard the RV Aurora Australis as part of the Kerguelen Axis (K-Axis) program (January-February 2016) within the southern extent of the Kerguelen Plateau. Sampling for K-Axis spanned a region from 62.7°E to 93.5°E, and 57.6°S to 65.2°S. The second was aboard the TRV Umitaka-maru as part of the 20th Kaiyodai Antarctic Research Expedition (KARE20) program (January 2017), which covered a repeat transect southward along the 110°E line of longitude.
Mesozooplankton samples from K-Axis were obtained using a Rectangular Midwater Trawl (RMT 1 + 8) net with a mouth area of 8 m² and a mesh size of 4.5 mm that tapered to a mesh size of 1.5 mm in the last 1.8 m of net [see Hosie et al. (2000) for more details]. Undamaged specimens collected with the RMT1 net, with a mesh size of 315 µm and a mouth area of 1 m², were measured for this study. Samples from KARE20 were obtained using an Ocean Research Institute (ORI) net with a mouth diameter of 160 cm and a mesh size of 500 µm [see Sakurai et al. (2018); Sakurai et al. (2020) for more details]. Both zooplankton collection methods sampled from a maximum depth of 200 m. All samples were preserved in 5% buffered formaldehyde and seawater solution and transported back to the Institute for Marine and Antarctic Studies in Hobart, Tasmania.
Pteropod egg masses (Figure 2) selected for this study were obtained from two sampling sites determined to have the highest number of intact egg masses, one from each voyage. The sampling site selected from K-Axis was located at 62.318°S and 91.531°E, and the site selected from KARE20 was located 453 nautical miles away at 63.491°S and 107.958°E (Figure 1).
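The separation between the two sampling sites can be cross-checked from their coordinates. The following is a minimal sketch using the haversine formula, assuming a spherical Earth with a mean radius of 6371 km; the function and variable names are illustrative and are not part of the original workflow.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumption: mean radius of a spherical Earth
KM_PER_NAUTICAL_MILE = 1.852  # definition of the international nautical mile


def great_circle_nmi(lat1, lon1, lat2, lon2):
    """Haversine distance in nautical miles.

    Inputs are decimal degrees; southern latitudes are negative.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    return distance_km / KM_PER_NAUTICAL_MILE


# Site coordinates as given in the text (south latitude expressed as negative)
k_axis_site = (-62.318, 91.531)
kare20_site = (-63.491, 107.958)

print(f"{great_circle_nmi(*k_axis_site, *kare20_site):.0f} nautical miles")
```

Under these assumptions the result should fall close to the separation quoted in the text; small differences would be expected from the choice of Earth model.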

Manual Counting, Image Capturing, Pre-Processing, Calibration and Threshold Setting
A workflow for pre-processing and segmenting images and enumerating eggs within thecosome pteropod egg masses is shown in Figure 3. Separated pteropod egg masses (n = 20) were rinsed in filtered seawater and transferred to glass petri dishes in preparation for imaging. Sharpened metal needles were used to gently remove any debris that might affect the segmentation process. Photographs of egg masses were taken with a Canon EOS 5D Mark II camera mounted on a Leica M165 C stereoscopic microscope using EOS Utility software (Canon USA), and the magnification was noted. To convert measurements from pixels to mm, photographs were also taken of a micrometer slide at the same magnification used for the egg mass images. A selection of images (n = 16) was chosen to include all variations of typically encountered characteristics (e.g., eggs, matrix, phytoplankton cells), and the egg ribbons from the egg masses featured in these images were then carefully dissected under the microscope using a sharpened needle. The eggs from each ribbon were then counted manually to provide ground-truth values for the counts estimated by the semi-automated image analysis.
Each digital image was opened in the Fiji/ImageJ software (RRID: SCR_002285) v. 2.3.1 (Schindelin et al., 2012), and a Wacom Intuos drawing tablet and pen (CTL-6100WL) were used for accurate digital drawing on images. To exclude material around the egg masses, the "Polygon selection" or "Freehand selection" tools from the toolbar were used to draw an outline around the perimeter of the egg masses (Figure 4A).
The "Clear outside" function, located under the Edit dropdown menu, was selected and each image was then saved. To calibrate measurements for each egg mass image, the micrometer slide images were opened first, and the "Straight line" tool from the toolbar was used to draw a line over the micrometer slide ruler. Using the "Analyze>Set Scale…" function, the length of this line in pixels was then linked to a known distance of 1 mm (spatial calibration value = 1224.06 pixels/mm at 3.2x magnification). A threshold cell size needed to be set due to both the large concentration of non-egg material (e.g., phytoplankton cells) and the overlapping of eggs within the egg masses. To estimate this, the areas (mm²) of a random subset of 20 eggs from four egg mass images were individually measured by drawing around the perimeter of each egg with the "Polygon selections" tool; from the values in the results table produced by the "Analyze>Measure" function, the average area was determined to be 10 mm². Egg mass lengths and areas were also determined in Fiji/ImageJ, using the "Straight line" and "Freehand selection" tools, respectively.
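The spatial calibration amounts to a simple unit conversion, which can be sketched as follows. This is an illustration only: the calibration value is the one reported above, while the function names and the example region size are hypothetical.

```python
PIXELS_PER_MM = 1224.06  # calibration reported in the text for 3.2x magnification


def px_to_mm(length_px, pixels_per_mm=PIXELS_PER_MM):
    """Convert a length measured in pixels to millimetres."""
    return length_px / pixels_per_mm


def px_area_to_mm2(area_px, pixels_per_mm=PIXELS_PER_MM):
    """Convert an area in square pixels to square millimetres."""
    return area_px / pixels_per_mm ** 2


def mm2_to_px_area(area_mm2, pixels_per_mm=PIXELS_PER_MM):
    """Convert an area in square millimetres to square pixels."""
    return area_mm2 * pixels_per_mm ** 2


# Hypothetical example: an outlined region covering 6,000,000 square pixels
print(f"{px_area_to_mm2(6_000_000):.2f} mm^2")
```

Because areas scale with the square of the calibration value, a size threshold expressed in physical units has to be converted with `mm2_to_px_area` rather than `px_to_mm` whenever particle sizes are filtered in pixel units.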

Segmentation and Egg Enumeration
For Fiji/ImageJ to perform semi-automated enumeration, the eggs need to be differentiated from the background within each image by a process known as segmentation. Built into Fiji/ImageJ is the Trainable Weka Segmentation (TWS; RRID: SCR_001214) plugin (http://imagej.net/Trainable_Weka_Segmentation), a tool that relies on machine learning and user-directed guidance to partition digital images into multiple segments, or classes, and subsequently perform automatic quantitative segmentation (Arganda-Carreras et al., 2017). Once both the image and TWS (version 3.3.2 was used in this study) were opened, classes were defined and renamed as "eggs" and "not eggs" (Figure 4B; Supplementary Figure 1).
Training feature settings will generally depend on the quality of the image. For images taken for this study, Gaussian blur, Hessian, Membrane projections, Sobel filter, Difference of Gaussians, Variance, and Structure were selected; Membrane thickness, Membrane patch size and Minimum sigma were kept at default settings, and Maximum sigma was changed to 32.0. The "Freehand" tool from the toolbar was used to mark the regions of each image belonging to each class. For accuracy, a minimum of 10 marks per class was made before classifier training began.
Training was repeated as needed, depending on the quality of the image. The images resulting from this segmentation workflow were saved and used in the final egg counting steps. The binary result images generated by the TWS plugin were opened in Fiji/ImageJ (Figure 4C). Many eggs appeared fused where they overlapped, and the watershed operator was applied to separate them. The minimum size threshold was set to the previously determined 10 mm² under Analyze>Analyze Particles… and egg counts were read from the results window (Figure 4D). A stopwatch was used to time both manual and semi-automated counting techniques.
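The final counting step can be illustrated with a small stand-alone sketch. It is not the Fiji/ImageJ implementation: it counts connected foreground regions in a binary mask and discards those below a minimum area, which is the general idea behind size-filtered particle counting. The 4-connectivity rule, the function name and the toy mask are assumptions made for illustration, and the watershed separation of touching eggs is not reproduced here.

```python
from collections import deque


def count_particles(mask, min_area_px):
    """Count 4-connected foreground regions with at least min_area_px pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill to measure the area of this region
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area_px:
                count += 1
    return count


# Toy binary mask: two 4-pixel "egg" blobs and one 1-pixel speck of debris
toy_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(count_particles(toy_mask, min_area_px=1))  # every region is counted
print(count_particles(toy_mask, min_area_px=4))  # the speck is filtered out
```

The sketch also shows why the size threshold matters: with no minimum area, small non-egg particles inflate the count.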

Statistical Analyses
All statistical analyses were performed using RStudio (RRID: SCR_000432) version 2021.09.0 (R Core Team, 2014). A linear regression was fitted to describe the relationship between manual counts and semi-automated counts, the latter obtained through Fiji/ImageJ. Two-sample t-tests were used to compare manual and semi-automated estimates of eggs per egg mass area (eggs mm⁻²) and the duration of each technique (minutes).
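For readers who wish to reproduce the same kind of analysis outside R, the two calculations can be sketched in plain Python. The sketch assumes an ordinary least squares regression and Welch's unequal-variance t-test; the latter is an inference from the fractional degrees of freedom reported in this study (Welch's test is also the default of R's `t.test`), not something the text states. All data values below are hypothetical and serve only to exercise the functions.

```python
import math
from statistics import mean, variance


def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept, plus R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical counts for five egg masses (not data from this study)
manual_counts = [210, 350, 480, 620, 790]
semi_automated_counts = [150, 290, 410, 560, 715]

slope, intercept, r_squared = linear_fit(manual_counts, semi_automated_counts)
t_stat, dof = welch_t(manual_counts, semi_automated_counts)
print(f"slope={slope:.3f}, intercept={intercept:.1f}, R^2={r_squared:.3f}")
print(f"t={t_stat:.2f}, df={dof:.1f}")
```

A p-value would additionally require the t distribution, which is not in the Python standard library; a statistics package would normally be used for that step.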

Comparing Egg Enumeration Methods
Lengths and areas of pteropod egg masses measured digitally in this study ranged from 2.9 mm to 12.2 mm and from 4.1 mm² to 21.4 mm², respectively (Table 1). The average number of eggs per unit of egg mass area was 48.4 eggs mm⁻² (± 9.0 SD) for the manual enumeration technique and 41.7 eggs mm⁻² (± 10.1 SD) for the semi-automated counting technique with Fiji/ImageJ.
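To show what these density values imply, the sketch below applies each reported average density, together with the conversion factor of Lalli and Wells (1978), to the smallest and largest egg mass areas measured here. It is a back-of-the-envelope illustration under the assumption of uniform egg density across an egg mass, which real egg masses may not satisfy; the function and dictionary names are hypothetical.

```python
# Average egg densities (eggs per mm^2) quoted in the text
DENSITIES = {
    "Lalli and Wells (1978)": 35.0,
    "manual, this study": 48.4,
    "semi-automated, this study": 41.7,
}


def estimate_eggs(area_mm2, density_per_mm2):
    """Area-based estimate: eggs = area x average density (uniformity assumed)."""
    return area_mm2 * density_per_mm2


for area in (4.1, 21.4):  # smallest and largest egg mass areas reported
    for label, density in DENSITIES.items():
        eggs = estimate_eggs(area, density)
        print(f"{area:5.1f} mm^2 | {label:<27} | about {eggs:.0f} eggs")
```

The spread between the three estimates for a single area gives a feel for how sensitive an area-based extrapolation is to the density value chosen.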
The comparison of egg counting techniques showed a statistically significant correlation between manual and semi-automated egg counts (R² = 0.92, p < 0.05; Figure 5). In all but two cases, the manual counting technique produced higher egg counts than those conducted digitally. When egg counts were divided by the egg mass areas (mm²), there was no significant difference between the manual and semi-automated methods [t(29.6) = 1.98, p = 0.06; Figure 5 inset plot]. The semi-automated counting technique averaged 7.4 minutes (± 1.2 SD) in duration, significantly less than manual egg counting, which averaged 35.9 minutes (± 5.7 SD) to complete [t(30) = 2.042, p < 0.05].

Limitations and Suggested Improvements
Before discussing the ecological implications associated with the egg counting outputs originating from the Fiji/ImageJ platform, it is critical to determine the reliability of these results. Many of the images depicted egg mass samples surrounded by non-egg particles, such as phytoplankton cells, that would likely also be counted by the software platform. The platform's inability to distinguish eggs from other materials, to identify egg and egg mass abnormalities, or to differentiate between eggs in close proximity are all limitations of this technique. Precision is enhanced through pre-analysis image preparation, which involves setting size threshold limits and drawing regions of interest (ROIs) that encompass high concentrations of the intended materials. Despite these limitations, the strong correlation and the absence of a statistically significant difference between manual and semi-automated techniques support the latter as a suitable basis for future studies that estimate fecundity.
Improvements that address these limitations should focus on the pre-processing of egg mass samples prior to imaging. One option is to stain the sample with an agent that allows eggs to be distinguished easily from non-egg materials by their color saturation, a step that is often used in medical imaging and histological studies. This was demonstrated by Malhan et al. (2018), who used various stains to distinguish, by color, elements of connective tissue including mineralized bone, cartilage, elastic fibers and muscles. Future studies will need to identify the constituents within and typically adjacent to thecosome eggs and egg masses (e.g., phytoplankton, marine snow) in order to select an appropriate stain that separates these constituents by color or another identifier in preparation for pixel-based segmentation. Pre-process staining may eliminate the early workflow steps that focus on manually selecting ROIs and size threshold limits from images, decrease machine learning supervision, and improve the overall speed and reproducibility of this methodology. Staining, though an extra step in the pre-processing stage, may effectively reduce the inclusion of background noise while also enabling fully automated batch processing of multiple images through scripts.
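How a stain could simplify segmentation can be sketched with a saturation threshold. This is a hypothetical illustration, not a tested protocol: it assumes that stained eggs would appear as strongly saturated pixels while unstained material remains near-grey, and the threshold value and pixel colors are invented for the example.

```python
import colorsys


def saturation_mask(pixels, min_saturation):
    """Flag pixels whose HSV saturation is at least min_saturation.

    pixels: rows of (r, g, b) tuples with channel values from 0 to 255.
    Returns rows of booleans with the same shape.
    """
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            _, saturation, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(saturation >= min_saturation)
        mask.append(mask_row)
    return mask


# Hypothetical 2x2 image: two strongly colored (stained) pixels, two greyish ones
image = [
    [(200, 30, 30), (120, 120, 120)],
    [(130, 125, 120), (180, 20, 60)],
]
print(saturation_mask(image, min_saturation=0.5))
```

A mask of this kind could replace or supplement the trained classifier output before particle counting, provided a stain with suitable selectivity can be identified.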
Other potential improvements to semi-automated egg enumeration involve advanced machine learning strategies, such as deep learning-based methods with a particular focus on the segmentation of nucleus-like shapes and overlapping objects of interest. Deep learning methods have become increasingly popular in recent years, with many applications in medical research (Hesamian et al., 2019). Deep learning image analysis tools have been developed to address these issues, including the Fiji plugin StarDist, which detects cells and nuclei in images as star-convex shaped objects (Weigert et al., 2020), and Cellpose, a segmentation algorithm designed to efficiently segment cells stained for a variety of markers (Stringer et al., 2021). Future studies would benefit from testing deep learning algorithms in zooplankton research.

Anticipated Results Using Semi-Automated Egg Enumeration
Measuring the efficiency of a newly constructed, semi-automated egg enumeration technique is difficult when the eggs are microscopic in size, numerous, and the gel matrices in which they are embedded have other particles present, which creates noise, and consequently, potential for error. Therefore, using data retrieved from manual egg enumeration to compare techniques can enable an appropriate assessment of the effectiveness of the semi-automated technique described here. Pteropod egg counts determined accurately and efficiently can then be used to model drivers of both spatial and temporal patterns of early life development and fecundity throughout a rapidly changing Southern Ocean. Egg count data may thus lead to fruitful gains in assembling monitoring programs used to forecast how spatial, ecological and environmental cues affect variability in egg production of a sentinel species in response to increased ocean acidification, deoxygenation and temperatures (Bednaršek et al., 2016; Manno et al., 2017).

DISCUSSION
Shelled pteropods perform essential ecological roles in polar regions and serve as sentinels of climate change, though much work is yet to be done to better understand how these changes affect their early life development. This study constructed a framework to perform automatic, albeit supervised, enumeration of microscopic eggs from thecosome pteropod egg masses using image analysis and machine learning algorithms. Prior to this study, egg counting from thecosome pteropod egg masses had been performed manually under stereomicroscope either through counting eggs along ribbons dissected from each clutch or through estimating the number of eggs over a known area (mm²) then extrapolating this value over the entire length of the egg mass. While the former is more accurate, this method is far more invasive and destructive, whereas the latter method does not account for high variability in egg density present along the length of the egg masses. The purpose of this study was to determine if a digital protocol for egg counting could be as accurate and efficient as the more invasive manual counting method, and the results here reveal no statistically significant difference between methods. There was a strong correlation found between semi-automatic and manual counting methods.
FIGURE 5 | Estimated counts of eggs within L. h. antarctica egg masses (n = 16). Linear regression from the correlation calculated between manual and semi-automated counts obtained through Fiji/ImageJ is y = 0.9787x - 55.9883, where y is the predicted number of eggs estimated through automation, and x is manual count variable; R² = 0.9217, p < 0.05. Dashed line is 1:1 reference. Inset plot: Results of L. h. antarctica egg counts mm⁻² conducted by automation and manually. Median values of egg counts per egg mass area are depicted by horizontal lines within the 50% interquartiles (boxes). Upper and lower vertical lines, or "whiskers", refer to maximum and minimum dependent values, respectively. No significant difference was observed between methods, p > 0.05.
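The regression reported in the Figure 5 caption can be applied directly, as in the sketch below. The coefficients are those given in the caption; the function names are hypothetical, and inverting the fitted line to recover a manual-equivalent count is an illustrative approximation that would only be reasonable within the range of counts observed in this study.

```python
SLOPE = 0.9787        # regression slope from the Figure 5 caption
INTERCEPT = -55.9883  # regression intercept from the Figure 5 caption


def predicted_semi_automated(manual_count):
    """Semi-automated count predicted by the fitted line for a given manual count."""
    return SLOPE * manual_count + INTERCEPT


def implied_manual(semi_automated_count):
    """Manual-equivalent count obtained by inverting the fitted line (approximation)."""
    return (semi_automated_count - INTERCEPT) / SLOPE


# Hypothetical example: an egg mass with 500 manually counted eggs
auto_estimate = predicted_semi_automated(500)
print(f"predicted semi-automated count: {auto_estimate:.0f}")
print(f"back-converted manual count:    {implied_manual(auto_estimate):.0f}")
```

The negative intercept together with a slope slightly below one is consistent with the observation that semi-automated counts tended to be lower than manual counts.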
There are very few published data focused on thecosome pteropod egg numbers and morphology, and how these attributes change over time and under future predicted climate scenarios, though spatio-temporal studies on manually enumerated eggs have been conducted in other marine gastropod species for decades (Berry, 1987;Mandal et al., 2010). However, very few studies have examined gastropod egg number variation along a gradient of environmental factors (Przeslawski, 2014). An accurate egg enumeration workflow has the potential to answer questions pertaining to early life responses of shelled pteropods to climate change, and the application of machine learning within these studies allows for the automation and simplified analyses of large-sized datasets. While only a few studies have counted thecosome eggs for various research purposes, at the time of this study, no other studies have closely analyzed different enumeration techniques for pteropod eggs nor developed an image processing technique incorporating supervised or unsupervised machine learning algorithms. This is the first study to develop and propose a framework to analyze thecosome pteropod eggs digitally using open-source image analysis software and machine learning algorithms.
Digital egg enumeration has advantages over manual counting. Namely, it does not damage or potentially stress the individual eggs, thus allowing the eggs and egg masses to be maintained in live cultures for further ontogenetic studies. Images can be captured while the live egg masses are held in petri dishes or well slides under a stereomicroscope, leaving the egg masses available for subsequent ontogenetic experiments. This can facilitate more research into uncertainties related to the early life development of species sensitive to ocean acidification and sea surface temperature change.
There is capacity for improving the detection accuracy and speed of operation of the workflow presented here. Firstly, the discovery of a stain that would easily differentiate eggs from non-egg materials would be a fruitful next step. This would require a deeper understanding of the constituents that make up organic and inorganic materials within and adjacent to the egg masses, and would result in more accurate segmentation by the TWS plugin. Secondly, there are additional Fiji/ImageJ-based plugins and tools that have shown promising results in pre- and post-processing for cell enumeration, including cell staining followed by the built-in Fiji/ImageJ Color Deconvolution plugin for color segmentation (Ruifrok and Johnston, 2001). Thirdly, this research included a single observer; an assessment of variability in results between multiple observers is recommended to test user-induced bias and to standardize steps, beginning with settings (e.g., magnification) and equipment (e.g., microscope and camera make and model) related to the acquisition of digital images. Finally, new macros with customizable parameters (based on egg roundness, area, diameter, etc.) could be developed to enable batch processing of multiple image files, rendering the process more automatic. The workflow described here can serve as a baseline for future development with new functionality.
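The kind of customizable shape filter that such macros might expose can be sketched as follows. This is an illustration rather than an existing macro: circularity is computed with the standard definition 4π × area / perimeter², and the parameter names, thresholds and example measurements are hypothetical.

```python
import math


def circularity(area, perimeter):
    """4 * pi * area / perimeter^2: 1.0 for a circle, lower for irregular shapes."""
    return 4 * math.pi * area / perimeter ** 2


def equivalent_diameter(area):
    """Diameter of the circle that has the same area as the particle."""
    return 2 * math.sqrt(area / math.pi)


def keep_particle(area, perimeter, min_area, max_area, min_circularity):
    """Accept a particle only if it is egg-sized and sufficiently round."""
    return (min_area <= area <= max_area
            and circularity(area, perimeter) >= min_circularity)


# Hypothetical measurements in pixel units: (area, perimeter)
particles = {
    "round, egg-sized": (math.pi * 25, 2 * math.pi * 5),  # circle of radius 5
    "elongated debris": (80.0, 84.0),                     # 2 x 40 rectangle
    "tiny speck": (3.0, 7.0),
}
for name, (area, perimeter) in particles.items():
    kept = keep_particle(area, perimeter, min_area=50, max_area=200,
                         min_circularity=0.7)
    print(f"{name}: circularity={circularity(area, perimeter):.2f}, kept={kept}")
```

Exposing the thresholds as parameters, rather than fixing them in the script, would allow the same filter to be reused across magnifications and imaging setups.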
In conclusion, the semi-automatic machine learning approach to analyzing pteropod egg mass images developed here is a promising, user-friendly, non-destructive, and highly practical methodology for enumerating eggs within their gel matrices. This study outlines a simple, stepwise workflow for accurate pixel-based segmentation of pteropod egg mass images using the image analysis software Fiji/ImageJ and its built-in TWS plugin. The effectiveness of this workflow was shown through a comparison with manual counting, which requires dissection of the egg ribbons embedded within the egg mass gel matrix under a stereomicroscope; the two methods were highly correlated.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
The author confirms all responsibility for study conception and design, sample and data collection, analyses and interpretation of results, and manuscript preparation.

FUNDING
This study was financially supported through a grant provided by the Holsworth Wildlife Research Endowment (grant #109804) of the Ecological Society of Australia, as well as funding from the Australian Government as part of the Antarctic Science Collaboration Initiative program (grant # ASCI000002). The Australian Antarctic Program Partnership is led by the University of Tasmania, and includes the Australian Antarctic Division, CSIRO Oceans and Atmosphere, Geoscience Australia, the Bureau of Meteorology, the Tasmanian State Government and Australia's Integrated Marine Observing System.

ACKNOWLEDGMENTS
This research was made possible through the assistance of those involved in plankton sampling, including the master, crew, scientists and technical support teams aboard both the RV Aurora Australis and TRV Umitaka-maru. I am grateful to the anonymous reviewers who provided constructive feedback that significantly improved the quality of this manuscript.