Assessing the viability of estimating baleen whale abundance from tourist vessels

Many populations of southern hemisphere baleen whales are recovering and are again becoming dominant consumers in the Southern Ocean. Key to understanding the present and future role of baleen whales in Southern Ocean ecosystems is determining their abundance on foraging grounds. Distance sampling is the standard method for estimating baleen whale abundance but has specific logistical requirements that are rarely met in the remote Southern Ocean. We explore the potential use of tourist vessel-based sampling as a cost-effective solution for conducting distance sampling surveys for baleen whales in the Southern Ocean. First, we used a dataset of tourist vessel locations from the southwest Atlantic sector of the Southern Ocean and published knowledge from Southern Ocean sighting surveys to determine the number of tourist vessel voyages required for robust abundance estimates. Second, we simulated the abundance and distributions of four baleen whale species for the study area and sampled them with both standardized line transect surveys and non-standardized tourist vessel-based surveys, then compared modeled abundance and distributions from each survey to the original simulation. For the southwest Atlantic, we show that 12-22 tourist vessel voyages are likely required to estimate abundance for humpback and fin whales, with relative estimates for blue, sei, Antarctic minke, and southern right whales. We also show that tourist vessel-based surveys outperformed standardized line transect surveys at reproducing simulated baleen whale abundances and distributions. These analyses suggest tourist vessel-based surveys are a viable method for estimating baleen whale abundance in remote regions.
For the southwest Atlantic, the relatively cost-effective nature of tourist vessel-based surveys and the regularity of tourist vessel voyages could allow for annual and intra-annual estimates of abundance, a fundamental improvement on current methods, and may capture spatiotemporal trends in baleen whale movements on foraging grounds. Comparative modeling of sampling methods provided insights into the behavior of generalized additive model-based abundance modeling, contributing to the development of detailed best-practice guidelines for these approaches. Through successful engagement with tourist company partners, this method has the potential to characterize abundance across a variety of marine species and spaces globally, and deliver high-quality scientific outcomes relevant to management organizations.


Introduction
Baleen whales are an important consumer in polar regions, migrating from winter breeding grounds in low latitudes to high latitudes to acquire most of their annual energy budget (Horton et al., 2022). Most populations of baleen whales are recovering from industrialized whaling during the mid-part of the 20th century (Baker and Clapham, 2004; Leaper and Miller, 2011; Tulloch et al., 2019) and are again dominating polar food webs (Ratnarajah et al., 2014; Ratnarajah et al., 2016; Savoca et al., 2021). However, we have a limited understanding of current population trends for many baleen whale species, particularly those which do not winter and/or migrate coastally, e.g., blue (Balaenoptera musculus), fin (Balaenoptera physalus), Antarctic minke (Balaenoptera bonaerensis, herein, minke) and sei (Balaenoptera borealis) whales for the southern hemisphere, for which the Southern Ocean foraging grounds may represent a single geographic location where the entire population can be surveyed simultaneously.
This study focuses on the southwest Atlantic sector of the Southern Ocean, which is a hotspot for human and biological activity in the Southern Ocean (Tin et al., 2009; Tin et al., 2014; Bender et al., 2016; Erbe et al., 2019; Palmowski, 2020). It is also one of the fastest-warming regions in the Southern Ocean, experiencing significant changes to sea-ice conditions (Carrasco et al., 2021; Lin et al., 2021), and the focus of the majority of the current krill fishing effort (Kawaguchi and Nicol, 2020). The current trigger limit for the catch of Antarctic krill (Euphausia superba) is based on a precautionary fishery approach managed by the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR), to avoid potentially negative effects on krill populations and krill-dependent predators. The status of the recovering baleen whale populations foraging in the southwest Atlantic sector is unclear (Erbe et al., 2019). This is despite baleen whales being potentially the greatest consumers of Antarctic krill in the Southern Ocean (Baines et al., 2021; Warwick-Evans et al., 2021), and despite the competitive relationships between baleen whales and other krill predators likely being as influential on the demographic signals monitored for krill fishing management as climate change and krill fishing itself (Ainley et al., 2006; Trivelpiece et al., 2011; McMahon et al., 2019). Annual monitoring of baleen whale abundance on foraging grounds, from which to estimate aggregate krill consumption by krill predators, will enhance CCAMLR's Ecosystem-based Management (EBM) of the krill fishery.
While there is a need for surveys of baleen whale populations on Southern Ocean foraging grounds, ship-based surveys have been restricted spatially (e.g., Santora et al., 2014; Johannessen et al., 2022), temporally (e.g., Hedley et al., 2001; Baines et al., 2021), or spatiotemporally (e.g., Herr et al., 2016; Herr et al., 2019; Bassoi et al., 2020). This is illustrated by the 19-year gap between broad-scale surveys of the southwest Atlantic sector of the Southern Ocean by CCAMLR (Hedley et al., 2001; Baines et al., 2021) and the cessation of the International Whaling Commission (IWC) IDCR/SOWER circumpolar surveys in the early 2000s (Branch and Butterworth, 2001a; Branch and Butterworth, 2001b). Determining ocean basin-scale abundance of baleen whales foraging in the Southern Ocean and inferring environmental and/or prey-mediated drivers of baleen whale densities in both time and space has therefore proved elusive (El-Gabbas et al., 2020).
Increased collaborations between the tourist industry and scientific endeavors in recent years (e.g., the polar collective) have provided the opportunity to better estimate baleen whale abundance (Johannessen et al., 2022). Before the COVID-19 pandemic, Antarctic tourism was undergoing year-on-year growth (Lynch et al., 2010;Bender et al., 2016;Palmowski, 2020), a trend that after the pandemic-driven hiatus is expected to continue (IAATO Report, 2020). Antarctic tourist ships thus represent platforms of opportunity for repeated and ongoing surveys of baleen whales in the Southern Ocean, though several challenges exist.
Ship-based line transect distance sampling is the common method for surveying air-breathing predators at sea (Buckland et al., 2001; Thomas et al., 2010). For baleen whales, this method involves a vessel passing through a study area, typically along an a priori-designed survey track, with trained observers logging animal sightings and recording their perpendicular distance to the transect line during defined periods of effort. Detection functions describe the relationship between the number of sightings and their perpendicular distance to the transect line, with sightings generally decreasing as the perpendicular distance from the transect line increases (Thomas et al., 2010). The ability to detect animals along a transect is influenced by several factors in addition to distance, such as sighting conditions, the height of the platform, whale species and behavior, and observer experience. Covariates can be added to account for this variation, an approach termed Multi-Covariate Distance Sampling (MCDS) (Marques and Buckland, 2003). Buckland et al. (2001) recommend 60-80 sightings per species to fit robust detection functions, which are then used for subsequent density estimation. Recent tourist vessel-based surveys of the Bransfield Strait suggest minimum sighting requirements are achievable for humpback whales (Megaptera novaeangliae) (Johannessen et al., 2022). Whether this is the case for other species across a wider study area is unclear.
Generally, scientific surveys follow a strict set of survey design principles when considering the placement of proposed transects, which ensure even spatial coverage and allow for both design- and model-based analysis (Strindberg and Buckland, 2004; Thomas et al., 2007; Williams and Thomas, 2009). However, in the case of tourist vessel-based surveys, transects are not placed randomly or evenly across the study area; rather, track locations are dictated by factors such as tourist landing sites, wildlife hotspots, weather, anchorages, and other ship traffic (Bender et al., 2016). Model-based approaches, such as Generalized Additive Model (GAM) based density surface modeling (DSM), are reasonably robust to such non-randomly placed surveys for the purposes of estimating abundance or distributions (Hedley and Buckland, 2004; Miller et al., 2013). DSM is a two-stage approach, modeling the incomplete detection of animals on the transect, as described by the detection function/s, combined with segment-level (portions of the transect) survey effort information (e.g., sighting conditions) and environmental covariates (e.g., sea surface temperature and water depth, or even space) using a GAM, or alternative methods (Miller et al., 2013). DSMs are most likely to accurately represent the underlying animal distribution when transects sample the range of explanatory covariates used in the model (Hedley et al., 1999; Cañadas and Hammond, 2008; Williams et al., 2006) and are spread reasonably evenly within the study area (Strindberg and Buckland, 2004; Thomas et al., 2007; Williams and Thomas, 2009). For tourist vessel-based surveys, it is unclear if the assumed bias in sampling will allow for good spatial coverage and/or adequate sampling of explanatory covariates to achieve robust results.
Herein, we aim to determine whether tourist ship-based surveys can yield data to support reliable estimates of baleen whale abundance on foraging grounds in the southwest Atlantic sector of the Southern Ocean. We used a dataset of tourist vessel locations collected before the Covid-19 pandemic to respond to the following four aims. First, we aimed to determine the amount of survey effort that is achievable per tourist vessel voyage, and second, use these results to determine the number of tourist vessel voyages needed to return enough sightings to support fitting MCDS detection functions for each baleen whale species. Our third aim was to test the validity of sampling with tourist vessel-based surveys by comparatively sampling simulated whale fields with non-standardized tourist vessel-based surveys against a previously completed standardized line transect survey. Finally, we determined at which point increasing the number of additional voyages no longer returns significant improvements in the accuracy and precision of abundance estimates.

The study region
The study area (~1.4 million km²) was a region of the southwest Atlantic sector of the Southern Ocean between 70˚W and 40˚W, south of 55˚S, including the waters surrounding the west Antarctic Peninsula and South Shetland and South Orkney Islands. The region contains CCAMLR management regions, including the entirety of Subarea 48.1, the western third of Subarea 48.2, and the region north of Subarea 48.1 between 57˚S and 55˚S, and is exclusive of the Weddell Sea (Subarea 48.5), which is rarely traversed by any vessels due to the dense ice conditions (Figure 1).

Estimating the survey effort (km) per tourist vessel-based survey
Automatic identification system (AIS) locations from tourist vessel voyages within the study area from November to March during the 2019/20 austral summer were used to analyze the potential for tourist vessels to make baleen whale observations. This dataset was a subset of the total voyages that the fleet of tourist vessels completed in the western Antarctic Peninsula region in the 2019/20 austral summer season (29 of the 45 tourist vessels and 209 of the total 318 voyages). As vessels report AIS locations at different time frequencies (typically ≈ 500 times/day), the dataset was regularized to hourly locations. To include only locations with appropriate sighting conditions for whale surveys, locations with low light conditions (zenith angle of greater than 85°) and average hourly speeds slower than 6 knots or faster than 16 knots were removed (Figure 1). There were two basic voyage plans: (i) voyages that transit from southern Patagonia via South Georgia (and sometimes the South Orkney Islands) before reaching the west Antarctic Peninsula (Type A), and (ii) those that transit directly to the west Antarctic Peninsula, occasionally via the Falkland Islands/Islas Malvinas (Type B). Most vessels complete a variety of voyage plans across a single season.
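The light and speed filters described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `HourlyFix` record, its field names, and the idea that a solar zenith angle has already been computed for each hourly location are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HourlyFix:
    """One regularized hourly AIS location (hypothetical record layout)."""
    voyage_id: str
    solar_zenith_deg: float  # assumed precomputed from time and position
    speed_knots: float       # average speed over the hour


def usable_for_survey(fix, max_zenith_deg=85.0, min_knots=6.0, max_knots=16.0):
    """Return True if the location passes both the light and the speed filter."""
    has_light = fix.solar_zenith_deg <= max_zenith_deg
    at_survey_speed = min_knots <= fix.speed_knots <= max_knots
    return has_light and at_survey_speed


# Toy data: one good location, one too dark, one too slow, one on the limits.
fixes = [
    HourlyFix("v01", 60.0, 10.5),
    HourlyFix("v01", 92.0, 11.0),
    HourlyFix("v01", 70.0, 3.0),
    HourlyFix("v01", 84.9, 16.0),
]
potential_effort = [f for f in fixes if usable_for_survey(f)]
```

Locations exactly at 6 or 16 knots are kept here because the text removes only those slower or faster than these limits.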
Additional potential survey effort is lost due to poor weather. The survey effort that remains is termed realized survey effort (i.e., the survey design minus the sections of track missed due to weather, poor light, etc.). Here, we estimated realized survey effort by correcting the potential survey effort (our 'survey design') for expected weather conditions. We then multiplied the estimated realized survey effort by the expected baleen whale sighting rates for the region to estimate the expected number of sightings achieved per voyage.
To estimate the hours of survey effort potentially lost to poor weather, we used weather data collected during the IWC's IDCR/SOWER surveys (Branch and Butterworth, 2001b). This dataset was used as it contained a routine hourly collection of weather observations between 0600 and 2000 local time. The proportion of time deemed good weather during the IDCR/SOWER surveys was binned spatially (10˚ longitude × 2.5˚ latitude). Good weather was defined as a sightability score (a measure of sighting conditions on a scale from 1 (poor) to 5 (excellent)) of 3 to 5. Potential survey effort was split into segments of at most 25 km (as is typical for surveys covering ocean basins) and binned spatially (10˚ longitude × 2.5˚ latitude). To simulate a dataset of realized survey effort, a random sample was drawn from this split potential survey effort of equivalent size to the proportion of good weather days in that spatial bin (Supplementary Material 1). This was used to estimate the number of kilometers of realized survey effort per voyage (Supplementary Material 2), creating a dataset of realized survey effort segments to sample the simulated whale fields (described below). Note that for spatial bins where fewer than 10 IDCR/SOWER survey-derived hourly weather observations were recorded (colored dark grey in Supplementary Material 2), the mean proportion of good weather hours was used.
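The weather correction described above can be sketched as follows. This is an illustration only: the function names, the `(lon, lat, length_km)` tuple layout, the equal-length splitting rule, and the rounding of the sample size are assumptions, since the text does not specify them.

```python
import math
import random


def split_track(length_km, max_segment_km=25.0):
    """Split a track into equal-length segments no longer than max_segment_km."""
    n_segments = max(1, math.ceil(length_km / max_segment_km))
    return [length_km / n_segments] * n_segments


def spatial_bin(lon, lat, lon_step=10.0, lat_step=2.5):
    """Index of the 10 x 2.5 degree bin that contains a segment midpoint."""
    return (math.floor(lon / lon_step), math.floor(lat / lat_step))


def sample_realized_effort(segments, good_weather_prop, fallback_prop, rng):
    """Keep a random subset of segments per bin, sized by its good-weather proportion.

    segments: list of (lon, lat, length_km) tuples (hypothetical layout).
    good_weather_prop: dict mapping bin index -> proportion of good weather.
    fallback_prop: proportion used for bins with too few weather observations.
    """
    by_bin = {}
    for seg in segments:
        by_bin.setdefault(spatial_bin(seg[0], seg[1]), []).append(seg)
    realized = []
    for bin_index, segs in by_bin.items():
        prop = good_weather_prop.get(bin_index, fallback_prop)
        realized.extend(rng.sample(segs, round(prop * len(segs))))
    return realized


# Toy example: ten 25 km segments in one bin with 40% good weather.
segments = [(-62.0, -61.0, 25.0)] * 10
props = {spatial_bin(-62.0, -61.0): 0.4}
realized = sample_realized_effort(segments, props, 0.5, random.Random(1))
realized_km = sum(seg[2] for seg in realized)
```

Summing `length_km` over the retained segments of one voyage gives that voyage's realized survey effort in kilometers.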

Estimating the number of sightings per tourist vessel-based survey
Many Southern Ocean whale species are recovering (Leaper and Miller, 2011; Tulloch et al., 2019), and as such sighting rates of equivalent surveys are expected to change commensurately. We define a sighting as an observation of a group of whales from 1 to n individuals (moving together, less than 5 body lengths apart, and only separating temporarily), and a sighting rate as the average number of sightings per distance covered (in this case, per km). We divided the number of sightings of each species by the total distance surveyed for each study. To estimate sighting rates for future summer sampling seasons (i.e., 2022/23), a compound interest rate formula was applied to estimate the annual rate of change in sighting rates for the period between past (2000) and recent studies (2019) and to project it into the future (2022/23 season). Past sighting rates were taken from surveys in the year 2000 (the IWC SOWER/CCAMLR krill survey of the year 2000 (Hedley et al., 2001), herein, CCAMLR 2000) and in the 2001/02 austral summer (tourist vessel-based survey (Williams, 2003; Williams et al., 2006)). Recent sighting rates were taken from surveys early in the year 2019 (a repeat of the CCAMLR 2000 survey (Baines et al., 2021), herein, CCAMLR 2019) and the 2018/19 and 2021/22 tourist vessel-based surveys (Johannessen et al., 2022). The spatial range of these studies varies, but all covered the study area at least partially. Sighting rates were calculated separately for the six most common baleen whale species (humpback, fin, Antarctic minke, southern right, sei, and blue) and large unidentified baleen whales (unids). The time period between past and recent surveys was 19 years. Error was not propagated across these estimates due to inconsistent calculation and reporting across the four studies included.
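The exact formula is not reproduced in this text, so the sketch below assumes the standard compound growth form, in which a constant annual rate r satisfies past_rate × (1 + r)^years = recent_rate. The function names, the example sighting rates, and the four-year projection horizon are hypothetical and are not taken from Table 3.

```python
def annual_rate_of_change(past_rate, recent_rate, years_between):
    """Constant annual rate r with past_rate * (1 + r) ** years_between == recent_rate."""
    return (recent_rate / past_rate) ** (1.0 / years_between) - 1.0


def project_rate(recent_rate, annual_rate, years_ahead):
    """Project a sighting rate forward assuming the same constant annual change."""
    return recent_rate * (1.0 + annual_rate) ** years_ahead


# Hypothetical sighting rates (sightings per km) for a past and a recent survey.
past, recent = 0.010, 0.061
r = annual_rate_of_change(past, recent, years_between=19)

# Assumed horizon of four years from the recent survey to the 2022/23 season.
projected = project_rate(recent, r, years_ahead=4)
```

With these illustrative inputs r comes out close to 10% per year, the same order as the increases reported for several species, but the inputs were chosen only to make the example readable.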
It is reasonable to assume that only small observation teams will be accommodated on tourist vessels, so we assumed a team of two observers per vessel. To maintain consistent observer effort and manage fatigue for a team of this size, the field of observation was restricted to a single observer focused from 0 to 90˚ on one side of the vessel. The second observer would be required to log information on the observational effort, environmental conditions, and animal sightings. Alternative uses of two observers were considered, but a 90˚ field of view was considered the best use of a small team of observers. This is because of the limited space allocated on the bridge, the distraction of bridge crew and radio interference due to the high volume of radio chatter between observers on both bridge wings, and occasional very high densities of whales potentially overwhelming observers (Herr et al., 2022). Scientific surveys generally have larger teams on dedicated platforms and can survey the full 180˚ in front of the vessel. To account for this difference in survey design, the final estimated sighting rates for the 2022/23 season were divided by two (where appropriate). Finally, this sighting rate was multiplied by our estimated realized survey effort per voyage to predict the number of sightings per voyage and therefore the voyages required to reach the 60-80 sightings per species for fitting detection functions (Buckland et al., 2001).
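The final step of this calculation can be sketched as below. The function names and the example sighting rate and effort are hypothetical; only the halving for a 90˚ field of view and the 60-80 sighting target come from the text.

```python
def sightings_per_voyage(sighting_rate_per_km, realized_effort_km, halve_for_quadrant=True):
    """Expected sightings on one voyage, halving the rate for a 90 degree field of view."""
    rate = sighting_rate_per_km / 2.0 if halve_for_quadrant else sighting_rate_per_km
    return rate * realized_effort_km


def voyages_required(per_voyage, min_sightings=60, max_sightings=80):
    """Unrounded number of voyages needed to reach the lower and upper sighting targets."""
    return (min_sightings / per_voyage, max_sightings / per_voyage)


# Hypothetical inputs: 0.1 sightings/km from a 180 degree survey, 500 km of realized effort.
per_voyage = sightings_per_voyage(0.1, 500.0)
low, high = voyages_required(per_voyage)
```

The function returns unrounded values because the text does not state how fractional voyages were rounded.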

Creating simulated whale fields
To understand the extent to which the spatial distribution of the tourist vessel-based survey may bias subsequent model-based abundance estimates, we simulated four whale fields (rasters of whale densities with paired abundances) and sampled them with our dataset of realized survey effort. The four simulated whale fields were created to represent four Southern Ocean foraging baleen whale species: humpback, fin, minke, and blue. Each was generated using rasters of a suite of satellite-derived environmental variables for January 2020 extracted using the R package 'raadtools' (Sumner, 2015). For each of the simulated whale fields, a series of environmental variable rasters (n = 3-4) were selected, with paired mean and standard deviation, chosen to reflect previously reported predictors of high densities of the respective baleen whale species in the Southern Ocean (Table 1). Values from a normal distribution (created from the above paired mean and standard deviation) replaced the values in the selected environmental variable rasters. For example, peak humpback whale densities are thought to occur around the 1˚ isotherm (Kasamatsu et al., 2000; Branch, 2011; Johannessen et al., 2022), so all values of 1˚ had the highest density of humpback whales, with decreasing densities towards higher or lower temperatures (SD: 1, range: 0-2, Table 1). In this example, sea surface temperature values outside this range (0-2) were assigned a value of 0. A random field was also created for the study area, using the R function 'sim.rf()' in package 'fields' (Furrer et al., 2009) to incorporate "noise" into the relationship between environmental variables and whale densities by "simulating a stationary Gaussian random field on a regular grid with unit marginal variance" (Furrer et al., 2009).
Each of these environmental variable-derived whale fields and the random field were scaled to values of between 0 and 1 (to ensure equal influence), and summed, before being rescaled to reflect probable study area-wide abundance. Simulated whale fields were plotted in Figures
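A toy version of this construction is sketched below for a single row of grid cells. It is not the authors' R workflow: the function names are hypothetical, the noise is independent per cell rather than a spatially correlated Gaussian random field, and the final rescaling assumes that per-cell values are made to sum to a chosen total abundance.

```python
import math
import random


def gaussian_response(value, mean, sd, lower, upper):
    """Relative density from a normal curve centred on `mean`; zero outside [lower, upper]."""
    if value < lower or value > upper:
        return 0.0
    return math.exp(-0.5 * ((value - mean) / sd) ** 2)


def rescale01(values):
    """Linearly rescale a layer to the range 0-1 so each layer has equal influence."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def simulate_field(layers, noise, total_abundance):
    """Sum the rescaled layers and noise, then rescale so cells sum to total_abundance."""
    scaled = [rescale01(layer) for layer in layers] + [rescale01(noise)]
    summed = [sum(cell) for cell in zip(*scaled)]
    total = sum(summed)
    return [total_abundance * s / total for s in summed]


# Humpback-like example: density peaks at 1 degree and is zero outside 0-2 degrees.
sst = [-1.0, 0.0, 1.0, 2.0, 3.0]
sst_layer = [gaussian_response(t, mean=1.0, sd=1.0, lower=0.0, upper=2.0) for t in sst]
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in sst]  # uncorrelated stand-in for the random field
field = simulate_field([sst_layer], noise, total_abundance=10000.0)
```

In the study each simulated field combined three or four environmental layers; a single layer is used here only to keep the example short.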

Sampling simulated whale fields
A total of 6346 sampling segments with a maximum transect length of 25 km were derived from 209 voyages. As it is practically unfeasible to sample from 209 voyages in a single summer season, we reduced this dataset to 30 randomly selected voyages, resulting in 974 sampling segments. We used the midpoints of these 974 sampling segments to directly sample our simulated whale density (the observation process was not simulated/recreated). The response variable, density, was modeled using GAMs with a Gaussian distribution and a log link, as a function of the environmental covariates used to create the simulated whale field and a spatial correlation term (a smooth function of easting and northing) (i.e., whale.density ~ s(x, y) + s(Enviro 1) + … + s(Enviro n), Table 1), using the R package 'mgcv' (Wood, 2015). No spatial autocorrelation was introduced into simulated whale fields other than that inherently present in environmental variables (where values of similar magnitude often occur in adjacent cells). The R functions predict.gam() and dsm_var_gam() (R package 'dsm' (Miller et al., 2021)) were used to estimate grid-scale densities and coefficient of variation (CV), and summed to represent modeled abundance. The CV associated with the modeled abundance estimate was calculated using the delta method (Ver Hoef, 2012). To provide a comparison to the results achieved with tourist vessel-based sampling, this process was repeated with the true set of realized survey effort achieved during a systematically designed scientific survey, CCAMLR 2019 (Baines et al., 2021) (herein, scientific vessel-based survey), again using a maximum segment length of 25 km, which resulted in 370 sampling segments.
Table 1 note: Sources for the tabulated values include Kasamatsu et al. (2000) and Branch (2007). All data were extracted via the R package 'raadtools' (Sumner et al., 2015); details can be found at https://github.com/AustralianAntarcticDivision/blueant#data-source-summary.

Tourist vessel-based resampling of simulated whale fields
To further our understanding of the relationship between an increasing number of tourist vessel-based surveys (voyages) and the performance of the model (i.e., modeled abundance and associated CV), we compared abundance estimates obtained by six through 30 voyages, each drawn randomly from the dataset. Each of these 25 samples (six-30 voyages) was redrawn randomly from the dataset 40 times, for a total of 1,000 samples. The GAM was fit (relevant to the simulated whale field, Table 1), and modeled abundance and associated CV were calculated. This process was repeated for each of the four simulated whale fields.
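The resampling design can be sketched as below: 25 sample sizes (6 to 30 voyages) with 40 random redraws each give 1,000 voyage subsets per simulated whale field. The function name and the choice to draw voyages without replacement within a subset are assumptions; fitting the GAM to each subset is not reproduced here.

```python
import random


def resampling_design(voyage_ids, min_voyages=6, max_voyages=30, replicates=40, seed=0):
    """Return a list of random voyage subsets, `replicates` per sample size."""
    rng = random.Random(seed)
    draws = []
    for n_voyages in range(min_voyages, max_voyages + 1):
        for _ in range(replicates):
            # Each subset draws distinct voyages from the full dataset.
            draws.append(rng.sample(voyage_ids, n_voyages))
    return draws


# The dataset described in the text contains 209 voyages.
draws = resampling_design(list(range(209)))
```

Each element of `draws` would then be used to subset the realized-effort segments before refitting the model and recording abundance and CV.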

Estimated length of survey effort per tourist vessel-based survey
Figure 2: Tourist vessel-based sampling of the humpback (A), fin (B), minke (C), and blue (D) simulated whale fields. Plotted left to right are: (i) the simulated whale field (lighter blues are the higher density of whales) and tourist vessel-based sampling (red lines), (ii) the equivalent modeled whale densities on the same color scale (noting dark grey regions are those where predictions are far greater than the original simulated whale field, see Table 4 for the value ranges), (iii) grid-scale coefficients of variation (CV) and, (iv) the true spatially explicit differences between simulated (i) and modeled (ii) whale fields (scale from greens (underestimate) to white (approximately equal) to browns (overestimated)) with tourist vessel-based sampling (grey lines). These plots show the model generally predicted the spatial patterns in the original simulated whale fields well but performed poorly in some regions (iv). Compare grid-scale CV (iii) with true differences (iv) for inconsistencies between model output and the actual differences between the simulated and modeled grid-scale whale densities.
After filtering out night and low light locations, the 29 vessels covered 279,473 km of potential survey effort with a mean vessel speed of 9.5 knots (range: 0-16, SD: 2.93, SE: ± 0.05) and a mean of 1,293 km per voyage (range: 583-3908, SD: 499, SE: ± 32.9). After allowing for potential weather effects on the sighting conditions, the 29 vessels provided 119,133 km of realized survey effort, or a mean of 551 km per voyage (range: 211-1515, SD: 201, SE: ± 13.7) (Supplementary Material 2). Assuming a strip width of 2 km from a 0 to 90˚ observation field, the 29 vessels included in this study could have surveyed a maximum of 15% of the 1.4 million km² study area. On average, each voyage would sample 0.078% of the study area (1102 km²), with 18 voyages required to cover a similar percentage to the broad-scale scientific CCAMLR 2000 and 2019 surveys (Table 2).
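The per-voyage coverage figures can be checked with simple arithmetic, sketched below. The constant and function names are hypothetical; the inputs (551 km mean realized effort, 2 km strip width, ~1.4 million km² study area) are taken from the text. Because the study area is only given approximately, the percentage computed here (about 0.079%) differs slightly from the 0.078% reported, and overlap between voyage tracks is ignored.

```python
STUDY_AREA_KM2 = 1.4e6    # approximate study area reported in the text
STRIP_WIDTH_KM = 2.0      # assumed strip width for a 0-90 degree observation field
MEAN_REALIZED_KM = 551.0  # mean realized survey effort per voyage


def percent_of_study_area(effort_km, strip_width_km=STRIP_WIDTH_KM, area_km2=STUDY_AREA_KM2):
    """Percentage of the study area covered by a strip of the given length and width."""
    return 100.0 * effort_km * strip_width_km / area_km2


per_voyage_km2 = MEAN_REALIZED_KM * STRIP_WIDTH_KM
per_voyage_pct = percent_of_study_area(MEAN_REALIZED_KM)

# Coverage accumulated over 18 voyages, ignoring any overlap between tracks.
eighteen_voyage_pct = 18 * per_voyage_pct
```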

Estimated sighting rates per tourist vessel-based survey
Expected sighting rates differed by three orders of magnitude among species (range: 0.0002-0.0974 whale sightings/km of survey effort) (Table 3). The annual rate of change in sighting rates over the 19 years was a ~10% increase for humpback, fin, sei, and blue whales (range: 0.050-0.156 whale sightings/km of survey effort), while for Antarctic minke and southern right whales, rates declined by ~5% (range: -0.092 to -0.012 whale sightings/km of survey effort) (Table 3). Sighting rates for large unidentified baleen whales (unids) remained relatively stable (Table 3). Humpback and fin whales are likely to dominate sightings, with approximately 26 and 6 sightings predicted per voyage respectively, and a total of 39 baleen whale sightings on average (Table 3). Two to three voyages were needed to estimate detection functions for humpback whales, and 10-13 for fin whales (Table 3). More than 30 voyages per season were required for the four other species (Table 3).
Figure 3: Scientific vessel-based sampling of the humpback (A), fin (B), minke (C), and blue (D) simulated whale fields. Plotted left to right are: (i) the simulated whale field (lighter blues are the higher density of whales) and scientific vessel-based sampling (red lines), (ii) the equivalent modeled whale densities on the same color scale (noting dark grey regions are those where predictions are far greater than the original simulated whale field, see Table 4 for the value ranges), (iii) grid-scale coefficients of variation (CV) and, (iv) the true spatially explicit differences between simulated (i) and modeled (ii) whale fields (scale from greens (underestimate) to white (approximately equal) to browns (overestimated)) with scientific vessel-based sampling (grey lines). These plots show the model generally predicted the spatial patterns in the original simulated whale fields well but performed poorly in some regions (iv). Compare grid-scale CV (iii) with true differences (iv) for inconsistencies between model output and the actual differences between the simulated and modeled grid-scale whale densities.
Table 2 note: CCAMLR 2000: 9740 km of realized survey effort (Hedley et al., 2001; Reilly et al., 2004). CCAMLR 2019: 7219 km of realized survey effort (Baines et al., 2021). Similar spatial coverage (as a percentage of the study area) to previous broad-scale baleen whale surveys is achieved with ≈18 tourist vessel-based surveys.
Table 3 caption: ... baleen whale surveys of the southwest Atlantic Ocean, and estimated sighting rates for the 2022/23 season. The number of baleen whale sightings is noted in parentheses, and the realized survey effort is listed below. Using the compound interest rate formula described above, estimates of the annual change in sighting rates between the past and recent surveys are calculated. This estimate of annual change is then projected forward to the 2022/23 season (assuming a constant change in abundance and therefore sighting rates), multiplied by our estimate of predicted realized survey effort per tourist vessel voyage (551 km), and halved due to a 90˚ observation quadrant, to estimate a predicted number of sightings per tourist vessel voyage (sightings per voyage). This estimate was then used to estimate the number of voyages needed for minimum data requirements (60-80 sightings) for multi-covariate distance sampling functions (Buckland et al., 2001) of each species. For humpback and fin whales this was between two and three, and 10 and 13 voyages respectively. Unid: large unidentified baleen whales. Previous studies: CCAMLR 2000, 9740 km of realized survey effort (Hedley et al., 2001; Reilly et al., 2004); tourist vessel-based study, 9981 km of realized survey effort (Williams, 2003; Williams et al., 2006). Recent studies: CCAMLR 2019, 7219 km of realized survey effort (Baines et al., 2021); tourist vessel-based study, combined 5375 km of realized survey effort (2018/19, 5 voyages) (Johannessen et al., 2022).

Comparative sampling of simulated whale fields
When using tourist vessel-based sampling, the spatial patterns in the simulated whale fields (Figures 2A-D(i)) were well reflected in modeled whale abundance (Figures 2A-D(ii)). However, these four models overestimated the simulated whale abundance by approximately 30% (overestimation range: 23-36%, Table 4). Regions where the 'true' simulated whale density differed from the modeled whale density (Figures 2A-D(iv)) were generally well highlighted by high grid-scale CV values (Figures 2A-D(iii)), with the possible exception of minke whales (compare Figure 2C(iii) with Figure 2C(iv)), which also had the greatest overestimation in total abundance (36%, Table 4). It appears that high densities close to the southern edges of the study region, which occurred in the humpback, minke, and blue simulated whale fields, caused problems for the model, particularly in the western Weddell Sea region (60˚W, Figure 2).
Scientific vessel-based sampling performed worse at recreating each of the four simulated whale fields, with a higher overestimation of modeled abundance and associated variance (Table 4). Again, models had difficulty in the southern reaches of the study area.
Regions where the 'true' simulated whale density differed from the modeled whale density (Figures 3A-D(i), (ii), (iii), and (iv)) were generally well highlighted by higher grid-scale CV values (compare Figures 3A-D(iii) with Figures 3A-D(iv)). Note the scales in these plots: dark grey regions in Figures 2 and 3A-D(ii) depict where values were overestimated beyond the plotted scale (consistent scales were used across Figures 2 and 3 to accurately depict values).
In both cases, sampling of the simulated fin whale field (Figure 2D and Figure 3D) fared better than the simulated humpback and minke whale fields (Figure 2A and C and Figure 3A and C), perhaps due to the greater relative densities of simulated whales in the northern portion of the study area for fin whales. Full model outputs for each GAM are presented in the supplementary material (Supplementary Material 3). Although this has not been tested here, an exploration of soap film smoothing (Wood et al., 2008), particularly along the convoluted southern edge of the study area, may reduce some of the overestimation by the models.

Resampling improved precision and accuracy with increased sample size
The resampling analysis showed that model precision (CV) and accuracy (abundance) improved with each additional sample added to the dataset, as shown by the reduced spread of values and the median of each sample group (six to 30 voyages) for both abundance and CV (Figure 4). The median CV within each sample was below 0.25 for all samples but tapered differently between samples. Based on the progressive improvement of precision (CV) and accuracy (abundance), we suggest a sample size of 18-22, 12-16, 17-22, and 13-18 for the humpback, fin, minke, and blue simulated whale fields respectively (grey boxes, Figure 4). For smaller sample sizes there remained substantial improvement in the CV, while for larger sample sizes the improvement was negligible, suggesting an overall sample size of 12-22 voyages. Almost all samples have CVs below 0.25 after ≈18 voyages (except for the humpback whale simulation). Compared to the CV values, abundance values were less variable, with improvements mimicking those of the CV values.

Table 4. The coefficient of variation (CV) associated with the abundance estimate and the true percentage difference between simulated and modeled abundances are listed in parentheses. All models overestimated abundance, with scientific vessel-based sampling performing notably worse than tourist vessel-based sampling.
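The logic of the resampling analysis (precision improving as voyages are added, then levelling off) can be illustrated with a deliberately simplified sketch. It does not reproduce the study's method, which refits an abundance model to each resampled set of voyages. Here the "estimate" is just the mean of a set of per-voyage values, and those values, the function names, and the use of the 0.25 CV bound as a selection rule are all illustrative assumptions.

```python
import random
import statistics


def cv_by_sample_size(per_voyage_values, sizes, n_reps=500, seed=0):
    """For each sample size, resample voyages with replacement n_reps times
    and return the CV (standard deviation / mean) of the resampled means."""
    rng = random.Random(seed)
    result = {}
    for n in sizes:
        means = [
            statistics.mean(rng.choices(per_voyage_values, k=n))
            for _ in range(n_reps)
        ]
        result[n] = statistics.stdev(means) / statistics.mean(means)
    return result


def first_size_below(cv_by_size, threshold=0.25):
    """Smallest sample size whose CV falls below the threshold, or None."""
    for n in sorted(cv_by_size):
        if cv_by_size[n] < threshold:
            return n
    return None


# Hypothetical per-voyage values (not data from the study).
voyage_values = [5, 40, 12, 80, 3, 25, 60, 9, 33, 18]
cvs = cv_by_sample_size(voyage_values, sizes=range(6, 31, 6))
print({n: round(cv, 2) for n, cv in cvs.items()})
print(first_size_below(cvs))
```

In this toy setting the CV shrinks roughly with the square root of the number of voyages, which is why gains are large at small sample sizes and negligible at larger ones. The same qualitative pattern underlies the suggested range of 12-22 voyages.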

Discussion
We demonstrate that tourist vessel-based distance sampling surveys have the potential to provide accurate abundance estimates for baleen whales in remote regions. We show that for the southwest Atlantic, 12-22 tourist voyages are likely required to provide an adequate number of sightings to estimate abundance for humpback and fin whales and enable relative approximations of abundance for several other species (as per Reilly et al. (2004); Baines et al. (2021)). We found that, on average, each tourist vessel-based baleen whale survey (one voyage) would achieve 551 km of realized survey effort, resulting in 39 baleen whale sightings. Surveys would be dominated by fin and humpback whale sightings. Two to three voyages for humpback whales and 10-13 voyages for fin whales were needed to meet model requirements for abundance estimates. Each tourist vessel-based survey would result in 1102 km² of ocean surveyed, or 0.078% of the study area, suggesting 18 voyages are required to provide similar spatial coverage to earlier science vessel-based baleen whale surveys in the region (1.42%, Table 2, Reilly et al., 2004; Baines et al., 2021). Tourist vessel-based sampling also performed better than scientific vessel-based sampling in estimating whale abundance of simulated whale fields, but still overestimated abundance by approximately 30%. Consequently, the resampling analysis suggested an appropriate sample size of between 12 and 22 voyages.
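The spatial coverage figures above follow from simple arithmetic, shown here as a short Python check using only the values quoted in the text. The variable names are illustrative. The implied study area and strip width are inferences from the reported numbers, not values stated by the authors.

```python
# Values quoted in the text.
effort_per_voyage_km = 551.0      # realized survey effort per voyage
area_per_voyage_km2 = 1102.0      # ocean surveyed per voyage
coverage_per_voyage_pct = 0.078   # share of study area per voyage
target_coverage_pct = 1.42        # coverage of earlier science vessel-based surveys

# Voyages needed to match the earlier surveys' coverage (about 18.2).
voyages_for_coverage = target_coverage_pct / coverage_per_voyage_pct

# Inferred quantities (assumptions derived from the figures above).
implied_study_area_km2 = area_per_voyage_km2 / (coverage_per_voyage_pct / 100.0)
implied_strip_width_km = area_per_voyage_km2 / effort_per_voyage_km

print(round(voyages_for_coverage, 1))
print(round(implied_study_area_km2))
print(implied_strip_width_km)
```

The ratio of the two coverage percentages gives about 18.2, consistent with the 18 voyages stated. The per-voyage area corresponds to an effective strip about 2 km wide along 551 km of track, and to a study area of roughly 1.4 million km².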

Potential for inter-and intra-seasonal abundance estimates
The relatively cost-effective nature of tourist vessel-based surveys will allow researchers to repeatedly survey vast regions of the ocean and provide, at minimum, annual estimates of regional abundance for several species of baleen whale. However, given that tourist vessels now operate in the Southern Ocean for more than five months each year, tourist vessel-based surveys should also allow intra-annual characterization of abundance, which may capture the spatiotemporal trends in baleen whale movement on foraging grounds (e.g., Johannessen et al., 2022), a fundamental improvement on the data currently collected. Currently, most Southern Ocean cetacean surveys aim to detect inter-decadal differences and provide a single abundance estimate from surveys conducted between late December and February (Reilly et al., 2004; Baines et al., 2021), when numbers of baleen whales are thought to be at their maximum (Laws, 1977). This will influence the species composition detected because species arrival on foraging grounds is staggered. Based on whaling records (noting the potential inaccuracies of these data, and the potential changes to baleen whale movement and clustering behavior since whaling), the largest species (blue and fin) arrived first, followed by humpbacks, then sei and southern right whales much later in the season (February/March) (Mackintosh, 1972). There is also thought to be an inshore and southerly shift of species later in the foraging season around the Antarctic Peninsula, following the contraction of the sea ice (February-June) (Johnston et al., 2012; Santora et al., 2014). Neither staggered arrival times nor within-season movement is well captured by current survey efforts, with a recent tourist vessel-based survey illustrating the potential for detecting these trends (Johannessen et al., 2022).

Figure 4. Resampling to detect improvement in precision and accuracy of abundance estimates. The relationship between sample size (number of voyages, x-axis) and improvements in accuracy (modeled abundance, grey boxplots, left y-axis) and precision (CV, red boxplots, right y-axis) are plotted for each of the four simulated whale fields (A-D, humpback, fin, minke, blue respectively). Median values are horizontal lines on boxplots. The upper bound on a useful CV value (0.25, dashed red line) and true simulated whale abundance (dashed black line) are plotted. Light grey boxes mark the sample size, before which CV improves substantially with increased sample size and after which there is negligible improvement in CV. Note these plots are zoomed to exclude some CV values (large red triangles), and the number of values not plotted is noted (black text inside large red triangles).

Reliability of modeled abundance estimates derived from tourist ship-based surveys
Tourist vessel-based surveys outperformed the tested scientific vessel-based surveys when sampling our simulated whale fields. Compared with tourist vessel-based sampling, the greater overestimation of modeled abundance by scientific vessel-based sampling might be because these surveys sample relatively little of the putative high whale density regions in the southern reaches of the survey area. This resulted in an exaggerated spatial smooth (compare contour plots of spatial smooths, Supplementary Material 3). Additionally, scientific vessel-based sampling had a narrower spread of sampled covariate values (compare smoothed covariate plots, noting the x-axis scales vary, Supplementary Material 3). These two constraints (limited sampling of the southern reaches and narrower spread of sampled covariates) likely contributed to the poorer performance of scientific vessel-based sampling and highlight the potential utility of tourist vessel-based sampling.
These simulated examples do not depict real-world whale distributions nor the observation process, but they do illustrate realistic sets of circumstances under which models can fail to accurately predict whale densities. Whale densities tend to peak close to the ice edge (Herr et al., 2019) and/or coastal regions for many species (Nowacek et al., 2011; Johnston et al., 2012), which is often the edge of the surveyable study area and potentially punctuated with complex coastal features, characteristics that were all present in the simulated whale fields presented here. Miller and Bravington (2017) describe similar potential real-world examples which can have an almost pathological series of characteristics, causing modeled estimates to deviate substantially from the underlying simulated whale fields. Those presented here contribute to the development of detailed guidelines of best practices for these approaches [as the IWC developed for design-based surveys (IWC, 2013; Hedley and Bravington, 2014)].

Practical challenges
There are several practical challenges in the use of tourist vessel-based sampling to estimate baleen whale abundance. The most significant of these is ensuring standardization of the data collection process across vessels and observers. The most consistent and replicable observation platform on tourist vessels is likely the bridge, although the field of view will be obstructed by the ship's superstructure, window frames, glass, etc. Outside areas on vessels are more problematic as they have uncontrolled tourist interactions with the observer (e.g., alerting observers to sightings) and are inconsistent across vessels. To ensure observers are conducting the observation process in as consistent a manner as possible, the development and implementation of pre-departure training, in-field training, and detailed field protocols in consultation with vessel crews will be required.
These will be passing mode surveys, with no consistent ability to close in on sightings to confirm species identification, group size, etc. Whale approach guidelines for Antarctic tour operators can be found at iaato.org.

Relevance to other regions
This study describes an opportunity for successful engagement with the tourism industry on broad-scale multiday voyages to deliver high-quality scientific outcomes that could benefit the management of living resources in the Southern Ocean. For example, many of the krill-dependent predator groups in the Southern Ocean are monitored as part of the CCAMLR Environmental Monitoring Program (CEMP). However, there remains a large gap in our understanding of baleen whales in this context, with broad-scale monitoring currently occurring only every two decades and no mechanism to include the impact of recovering baleen whale populations on krill stock. The method outlined here appears to be a possible solution for producing annual abundance estimates of baleen whales in the southwest Atlantic Ocean.
There is also potential for the use of tourist ships for surveys of cetaceans and other species (e.g., seals, penguins, and Pelecaniformes seabirds) in other parts of the world, particularly in the Arctic, as many of the same companies operate at both poles. Tourist vessel-based surveys are suited to regions with a relatively high volume of tourist traffic, multiple voyage plans, and low volumes of survey effort from dedicated scientific voyages (Kaschner et al., 2012). In the Arctic, the north Pacific and north Atlantic Ocean sectors appear to be suited (Compton et al., 2007); however, further site-specific investigation is needed to determine the viability of achieving robust abundance estimates. Here, the southwest Atlantic region of the Southern Ocean appears to be a viable location for tourist vessel-based surveys, while other regions, such as the Ross Sea and east Antarctica, may not be as suitable due to limited tourist vessel traffic.
There are many examples of coastal vessels being used as 'platforms of opportunity' (e.g., tourist, fishing, and cargo vessels and ferries) for collecting scientific data, e.g., marine mammal surveys (Williams et al., 2006; Kiszka et al., 2007; Hupman et al., 2015; Vinding et al., 2015; Viquerat and Herr, 2017), and we now have the statistical tools and computational power to model data from larger spatial areas, across large-scale oceanic regions from multiple tourist vessel platforms. There is potential for a conflict of interests between scientific investigation and vessel operators, e.g., violation of approach distances by tourist vessel operators for enhanced tourist experience (Bearzi, 2017). However, the success of science and management in resolving complex issues and potential conflicts of interest relies on a holistic approach, such as the use of social science as a tool for reducing negative wildlife interactions described in Filby et al. (2015), and the scientific innovation for reducing seabird bycatch outlined in Avery et al. (2017). Thus, engaging the tourist industry to collect scientific data relevant to management has the potential to result in positive outcomes for all stakeholders, in addition to fostering and incorporating local talent and engaging tourists in the process of science.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.