Origin and Persistence of Markings in a Long-Term Photo-Identification Dataset Reveal the Threat of Entanglement for Endangered Northern Bottlenose Whales (Hyperoodon ampullatus)

Photo-identification methods depend on markings that are stable over time. Using a large dataset of photographs taken over a 31-year period, we evaluate the reliability, rate of change and demographic trends in different mark types on northern bottlenose whales (Hyperoodon ampullatus) in the Endangered Scotian Shelf population, and assess the prevalence and severity of anthropogenically caused markings. Only fin notches and back indentations were stable over long timescales, leading to 48% of the overall population being assessed as reliably marked. Males and mature males were found to have higher incidence of most mark types compared to females and juveniles. The proportion of reliably marked individuals increased over time, a trend that should be accounted for in any temporal analysis of population size using mark-recapture methods. An overall increase in marked individuals may reflect the accumulation of scars on an aging population post whaling. Anthropogenic markings, including probable entanglement and propeller-vessel strike scars, occurred at a steady rate over the study period and were observed on 6.6% of the population. The annual gain rate for all injuries associated with anthropogenic interactions was over 5 times the annual potential biological removal (PBR) calculated for the endangered population. As entanglement incidents and propeller-vessel strike injuries are typically undetected in offshore areas, we provide the first minimum estimate of harmful human interactions for northern bottlenose whales. With low observer effort for fisheries across the Canadian Atlantic, photo-identification offers an important line of evidence of the risks faced by this Endangered whale population.


INTRODUCTION
Photo-identification methods are commonly used to identify individual cetaceans using markings of natural or anthropogenic origin, and long-term datasets have revealed valuable scientific information (Ballance, 2018). Critical to investigations of population size and trends (e.g., Wilson et al., 1999; Barlow et al., 2011), scales of residency and ranging behavior (e.g., Calambokidis et al., 2002; Fearnbach et al., 2014; Mahaffy et al., 2015; Gladilina et al., 2018), demography (e.g., Aschettino et al., 2012), social structure (e.g., Gero et al., 2008), and habitat use (O'Brien et al., 2020), photo-identification has been a particularly valuable tool in understanding cetaceans both as individuals and populations. While any distinctive natural markings may be used for individual identification over periods from days to weeks, understanding which markings are permanent or will remain stable over the lifetime of the individual is necessary for reliable long-term identification of individuals. Misidentification due to loss or gain of markings can result in a Type I error (a false positive, incorrectly identifying an animal as a known animal) or Type II error (a false negative, incorrectly identifying a known animal as an unknown or new animal). Long-term datasets require regular re-evaluation not only to avoid Type I and II errors, but also to ensure distinctive marks are reliable and do not change or are not lost over the study period (Wilson et al., 1999; Gowans and Whitehead, 2001; Frasier et al., 2009; Urian et al., 2014). Additionally, any trends in the proportion of reliably marked individuals over time could bias population size estimates and need to be incorporated into mark-recapture analyses.
Individual markings, such as distinctive scars or large wounds, can also be used to estimate the prevalence and source of disease or injuries that are natural or anthropogenic in nature, and assess whether there are potential differences within a population (e.g., by age or sex class), over time, or between populations in the rate of predatory or anthropogenic interactions (Chu and Nieukirk, 1988;Baird et al., 2014;Felix et al., 2018). Injuries due to interactions with fisheries (vessels and gear) are thought to be the most important management issue affecting cetaceans (Read, 2008;Moore, 2019). However, with low or no independent observer effort, poor reporting requirements for cetacean bycatch, and limited conclusive necropsies of stranded animals, injurious or fatal interactions of cetaceans with fisheries are especially difficult to quantify (Van Waerebeek et al., 2007;Williams et al., 2011;Hines et al., 2020). The information we do have for many data-poor species is currently limited to bycatch 'anecdotes' (Fisheries and Oceans Canada, 2009;Harris et al., 2013) and screening level risk assessments (e.g., Brown et al., 2013) using broad assumptions about life history, behavior and habitat. Photographic analyses of scars presumed to be due to interactions with vessels or gear offer valuable information on potential unaccounted sources of cryptic mortality and an opportunity to assess and monitor these anthropogenic impacts on wild populations (Kiszka et al., 2008;Leone et al., 2019;Ramp et al., 2021).
The Scotian Shelf population of northern bottlenose whales (Hyperoodon ampullatus) inhabits the deep waters off Nova Scotia and has been extensively studied using photo-identification methods (Gowans and Whitehead, 2001; Wimmer and Whitehead, 2004; O'Brien and Whitehead, 2013). O'Brien and Whitehead's (2013) study found the population was small (∼143 individuals), but stable. This population was designated as Endangered and listed under the Canadian Species at Risk Act (SARA) in 2006, with associated requirements for protection of critical habitat, ongoing monitoring and recovery measures, and an assessment of current threats (Fisheries and Oceans Canada, 2009). Since commercial whaling for the species ended in 1971, threats to species recovery now include acute injury and mortality from entanglement in fishing gear and ship strikes, as well as chronic threats from noise and ongoing oil and gas exploration (Fisheries and Oceans Canada, 2009; Whitehead and Hooker, 2012). The core habitat for the Scotian Shelf population is centered around the deep waters of the Gully submarine canyon, which was declared an Oceans Act Marine Protected Area (MPA) in 2004 (Figure 1). While the MPA includes a prohibition against fishing in Zone 1, there are no restrictions on fishing activities in the adjacent designated critical habitat areas of Shortland and Haldimand canyons (Fisheries and Oceans Canada, 2009; Figure 1).
The impact of acute mortality and injury due to interactions with fisheries on beaked whales, including the northern bottlenose whale, is highly uncertain (Hooker et al., 2019), but the risk has previously been described as low (Brown et al., 2013). Although they are a rarely seen offshore species, northern bottlenose whales are known to approach boats and interact with fisheries that occur in offshore areas (Mitchell, 1977; Fertl and Leatherwood, 1997; Oyarbide Cuervas-Mons, 2008; COSEWIC, 2011). However, we are aware of only a few reports of bycaught or gear-entangled individuals in the western North Atlantic over the last 30 years (N = 13, Table 1). Patterns of reported incidents are difficult to interpret as a reflection of temporal trends or risk for a number of reasons. Overall, there seem to be more incidents reported before 2010, and while some areas of the Scotian Shelf have seen a reduction in trawl fishing effort and a ban on drift gill-nets over this period, long-line fisheries in deep water areas have continued. Outside the relatively small area of the Gully MPA's Zone 1, long-line fisheries occur along the shelf edge, including in Zones 2 and 3 of the Gully MPA. Of the records of entangled beaked whales we found (Table 1), ∼46% were attributed to long-line gear, ∼23% to trawls, and the remainder to other or unknown fisheries (Whitehead et al., 1997; Garrison, 2003; Fisheries and Oceans Canada, 2016). Bycatch records of non-target species brought onboard vessels, which we include here as a source of data on entanglement, suffer from considerable bias in reporting.
Due to issues with spatial representativeness, low levels of observer coverage in the region over the last 30 years (ranging from 0 to 11% of all vessels), the likelihood that large whales break free rather than being brought on board, and variability in the species identification skills of observers, the low number of reports is not informative of the extent or likelihood of beaked whale entanglement incidents (Hooker et al., 1997). Finally, due to their remote habitat, there are few records of beaked whales stranding or washing ashore in Atlantic Canada, and with carcasses in degraded condition and limited resources for forensic investigations, it is typically difficult to attribute cause of death (Lucas and Hooker, 2000; Nemiroff et al., 2010; Benjamins et al., 2011). Despite increased focus on reducing the incidence of entanglement, bycatch, and vessel strikes for other at-risk whale species in Canada (e.g., North Atlantic Right Whale, Eubalaena glacialis, Davies and Brillant, 2019; Moore, 2019), there has been limited progress on improving our understanding of the unintended impact of fisheries on beaked whales. For marine mammals, bycatch, entanglement, and vessel strikes can have both lethal and sub-lethal effects, which, for animals that "survive," may include the associated fitness costs of infection, injury, energetic loss, inability to forage, and reduced reproductive potential (Visser, 1999; van der Hoop et al., 2016; Dolman and Brakes, 2018). While we know interactions with fisheries are fatal for beaked whale species in other areas (Carretta et al., 2008) and are contributing to dramatic declines of endangered marine mammal populations across the globe (Reeves et al., 2003; Turvey et al., 2007; Brownell et al., 2019; Moore, 2019), the impact of this threat on the Scotian Shelf population of northern bottlenose whale is unknown, despite over 30 years of research.
However, previous studies examining anthropogenic-caused injuries from scarring in cetaceans have provided insights on the prevalence of their interactions with fisheries (Kiszka et al., 2008;Felix et al., 2018;Leone et al., 2019).
Here we use a large dataset of high-quality identification photographs of northern bottlenose whale dorsal fins and melons from the Scotian Shelf over a 31-year period (1988-2019) to assess the proportion, rate of change, and sex-age class of individuals with natural and anthropogenically-caused markings. Investigating the trends and bias in markings in the population is necessary for robust population estimates, minimizing error rates in identification, identifying the minimum proportion of northern bottlenose whales that have survived an interaction with a fishery or vessel, and estimating the extent of this threat for this species of beaked whales. The objectives of this study were to (1) evaluate the reliability (gain and loss rates) of different distinctive mark types over the 30-year study period and calculate an error rate for misidentifications; (2) assess trends in distinctive mark types occurring in the population over time, before and after the implementation of the Gully MPA, and by sex-age class; and (3) identify the prevalence and severity of anthropogenically-caused scars in the population.

Data Collection
The photographic data used in this study were collected during summer field seasons on the Scotian Shelf edge from 1988 to 2019. Photographs were taken of the dorsal area of all northern bottlenose whales encountered, regardless of the presence or severity of markings. The melon (forehead) and both the left and right side of each whale were photographed when possible. Biopsies were collected opportunistically for genetic analysis over this same period using methods described in Feyrer et al. (2019).

Photo-Identification
Previous studies (Gowans and Whitehead, 2001;Wimmer and Whitehead, 2004;O'Brien and Whitehead, 2013) hand matched printed photographs; however, here we compiled digitized versions of previous hard copy catalogs and newer digital photographs using the photographic management software, Adobe Lightroom (Version 6.14; Adobe Inc, 2015) using an updated photo-ID protocol (Feyrer et al., 2020a) which is briefly summarized below. The associated metadata for each photograph (e.g., GPS location, quality rating, and keywords) and identification information (e.g., sex and ID number) were saved with each digital image and 'collections' were used to track all photographs for each ID. The left and right sides of dorsal fins were considered separately for initial identification and analysis, but when identifiable marks spanned both left and right sides (e.g., a distinctive notch), both ID sides were linked by a common number. Photographs were given a quality rating (Q) based on the angle, focus, visible proportion of dorsal fin, and exposure, similar to criteria used by O'Brien and Whitehead (2013). Poorest quality photographs, which met none or only one of the criteria, were given a rating of Q1, while highest quality photographs, which met all criteria, were rated Q4 (Figure 2). The highest quality dorsal fin photographs (left and right side) of each individual identified in each year were put into a type specimen collection. Iterative pairwise comparisons between all type photographs were made within and between years and each individual whale received a unique ID number. The number of IDs, resighting rates, and catalog years (number of years in the catalog) were summarized. During the digital compilation of the catalog, we conducted multiple reviews and validated all Q ratings and previously matched IDs, which allowed us to detect misidentifications and estimate an error rate in matching. 
Error was calculated as the number of incorrectly matched photographs divided by the number of all ID resights (total number of photographs of all IDs minus their first 'type' sighting photograph) as per Frasier et al. (2009).
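The error calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name and the example counts are hypothetical, and it assumes exactly one first 'type' photograph is subtracted per ID.

```python
def matching_error_rate(n_incorrect_matches: int, n_photos: int, n_ids: int) -> float:
    """Proportion of resight photographs that were incorrectly matched.

    Resights are all ID photographs minus one first 'type' sighting
    photograph per ID (an assumption about how the subtraction is applied).
    """
    n_resights = n_photos - n_ids
    if n_resights <= 0:
        raise ValueError("need at least one resight photograph")
    return n_incorrect_matches / n_resights


# Hypothetical counts for illustration only; these are not values from this study.
example_rate = matching_error_rate(n_incorrect_matches=30, n_photos=1200, n_ids=200)
print(f"matching error rate: {example_rate:.1%}")
```

Dividing by resights rather than by all photographs keeps the first sighting of each ID, which cannot be a mismatch, out of the denominator.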

Melon Age and Sex Analysis
Sex was determined using two methods: (1) genetic analysis of biopsied whales based on Einfeldt et al. (2019) and (2) photographic analysis of the relative "roundness" of melons (foreheads), with males having a square-shaped melon compared to females and juveniles (Gowans et al., 2000;Yeung, 2018). The protocol for sexing northern bottlenose whales using melons has been updated since Gowans et al. (2000) and is based on two classifications: Mature Male (MM) or Female-Juvenile (FJ) (Figure 3), omitting the previous third category of Subadult Male, due to poor agreement (Type I errors) with paired genetic analyses (Yeung, 2018). Using a separate catalog of melon photographs that were quality rated, sexed, and linked to high quality (>Q3) dorsal fin IDs, we were able to increase the proportion of individuals with sex-age class information based solely on genetic methods from 7 to 44% and the proportion of photos from 25 to 78%. The combined sex-age classes used in all analyses presented here are Male-Mature Male (MMM), which includes both genetic males and IDs with square mature-male melons and Female-Juvenile (FJ), which includes both genetic females and IDs with round FJ melons.

Mark Type Classification
Mark type keywords (Table 2) were given to all good quality photographs (>Q3) in each year, using the best photograph from each year as a guide. To consistently account for differences in the amount of the body visible in each photo, only markings on the dorsal fin or within one fin-width away from the base of the fin, known as the "dorsal skirt" (Figure 4), were considered. Markings could be assigned multiple keywords (e.g., entanglement and large body scar) using a modified version of the mark type classification of Gowans and Whitehead (2001). Anthropogenic markings, specifically those caused by injuries related to entanglement or propeller-vessel strikes, have not previously been described in northern bottlenose whales. However, observations of entangled or bycaught northern bottlenose whales (Table 1) indicate that these threats do occur at some level. The literature on cetacean entanglement and ship strikes provides a wealth of descriptions and well-documented images of multiple species with scars from entanglement or propeller-vessel strikes on tail flukes, peduncles, dorsal fins and backs that can be used as reference points for comparative analysis (e.g., Visser, 1999; Baird et al., 2014; Kügler and Orbach, 2014; George et al., 2017; Felix et al., 2018; Basran et al., 2019). We initially classified anthropogenic marks based on (1) the features of entanglement and propeller-vessel strike scars documented and described in other studies and (2) our analysis of scarring resulting from entanglement injuries observed on live beaked whales (Figure 6), gear marks on dead stranded northern bottlenose whales (unpublished data, Ledwell and Huntington, 2006), and a video of an entangled northern bottlenose whale recorded in the Gully (Whitehead Lab, 1999). IDs with anthropogenic marks were then reviewed by external experts with experience in large whale entanglement, beaked whales and gear used in the region's offshore fisheries.
Reviewers ranked images of each possible ID on a scale of 1-3 with 1 being low confidence and 3 being high confidence that marks were probable entanglement or vessel strike and only those IDs which reviewers agreed with high confidence were included in further assessment of anthropogenic marks.
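The reviewer screening step can be illustrated with a short Python sketch. It assumes that "agreed with high confidence" means every reviewer gave a score of 3; the text does not spell out the exact agreement rule, so that reading, along with the function name, IDs and scores, is hypothetical.

```python
HIGH_CONFIDENCE = 3  # top of the 1-3 reviewer confidence scale


def ids_with_consensus(scores_by_id: dict[str, list[int]]) -> list[str]:
    """Return the IDs that every reviewer scored as high confidence.

    Assumption: agreement means a unanimous score of 3. IDs with no
    scores at all are excluded.
    """
    return sorted(
        whale_id
        for whale_id, scores in scores_by_id.items()
        if scores and all(score == HIGH_CONFIDENCE for score in scores)
    )


# Hypothetical IDs and reviewer scores, for illustration only.
example_scores = {"ID_A": [3, 3, 3], "ID_B": [3, 2, 3], "ID_C": [1, 2, 1]}
retained = ids_with_consensus(example_scores)
print(retained)
```

A unanimous rule is the most conservative reading; a majority rule would retain more IDs and would raise the estimated prevalence accordingly.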
In our initial review of the patterns of tissue damage and scarring seen in the dorsal fin region, we screened the  dataset multiple times for possible anthropogenic markings. The identification of entanglement marks used in our assessment included a range of scars that can be caused by the rubbing or pressure of a rope or line as it is wrapped around the body, fin or tail stock of an animal, and these scars are typically characterized by the presentation of a curvilinear pattern of relatively consistent thickness and tapering ends (see Robbins and Mattila, 2001;George et al., 2017;Basran et al., 2019). However, during the process of entanglement, the haphazard wrapping, knots and criss-crossing of various types of fishing gear, can blend, abruptly break or change the direction of the linear pattern of scarring (Robbins, 2009; Figure 6). The weight, tension or restricted movement of entanglement can cause lines to become deeply embedded and result in deep spine indentations (Robbins, 2009), fin mutilations (Baird et al., 2014) or protruding scar tissue (see Figure 6). Examples of severe entanglement injuries on a beaked whale body shape were key references in our analysis and are presented in Figure 6. The individual in Figure 6A was a Sowerby's beaked whale with multiple curvilinear scars from an entanglement in a line forward of the dorsal fin; one wrap of the line appears to be still embedded in the animal's flesh, causing raised tissue and possible necrosis. The curvilinear scars are of consistent thickness until they taper where broken, likely caused by the raised spinal processes and inward curvature of the animal's poor body condition or shifting lines. A second Sowerby's beaked whale ( Figure 6B) has a rope tightly entangled around its body and pectoral fin, causing deep lacerations into the blubber layer. 
The location of the entanglement likely restricts movement, and while the rope has become embedded in the animal's flesh, where the line does not have contact with the skin there are again breaks in the scar pattern where the skin tissue is still intact. Adjacent to the embedded line there are nonlinear areas of abrasion, possibly due to the chafe from a previous positioning of the embedded line or a secondary line that was lost. The individual in Figure 6C is a live northern bottlenose whale that was entangled in the Davis Strait in 2003 with a longline buoy line wrapped around its beak. While the animal was calm and later released, the linear abrasions around the beak blend together, and blood is coming from the mouth near the wrap point of the line. Figure 6D is a male northern bottlenose whale photographed in the Gully in 1990 with multiple wrapping scars around its body, well forward of the dorsal fin and behind the melon. While no scarring is apparent in the region of the dorsal fin, and this individual is not included in the analysis of dorsal fins we present, the scars appear to blur together with varying thicknesses, angles, and severity, with some lines ending abruptly. The last reference we used was a video taken in 1999 of a northern bottlenose whale in the Gully with a monofilament line wrapped around its beak, possibly hooked in its jaw (Whitehead Lab, 1999). It appears to be resting with its head and beak at the surface, and both are clearly scarred with a thin wrapping diagonal white line going over the left side of the melon and across the blowhole; however, this scar line does not appear to continue on the right side of the animal (Whitehead Lab, 1999).
Injuries related to vessel strike incidents have been well characterized in large whales as resulting in: (1) blunt force trauma causing significant fractures, but potentially few other externally apparent injuries (Laist et al., 2001;Vanderlaan and Taggart, 2007); and (2) propeller wounds, which include deep slashes or indentations (Visser, 1999;Laist et al., 2001;Van Waerebeek et al., 2007), mutilated or chopped dorsal fins (Van Waerebeek et al., 2007), and parallel concave lacerations (George et al., 2017). However, severe entanglement can also result in fin mutilation or amputation, and it is not always possible to conclusively attribute propeller-vessel strike as the source of less severe injuries. As a result, all marks initially attributed to either probable entanglement (Figures 5A,E,F) or vessel-propeller strike injuries (Figures 5C,D,F) were combined into one category for anthropogenic scars (Moore et al., 2013). Based on the severity of injury, we also classified anthropogenic scars using a qualitative three-point scale with 1 being low severity and 3 being most severe (Supplementary Figure 1).

Mark Type Analyses
Mark types selected for analyses included notches, back indentations, large dorsal fin scars, patches, and anthropogenic scars as described above and in Table 2. These markings were selected as they are highly distinctive (Urian et al., 2014), most commonly used for inter-annual identification, and determining their prevalence, longevity and reliability has important implications for mark-recapture population analyses, as well as our understanding of potential threats to the population (Gowans and Whitehead, 2001;Fisheries and Oceans Canada, 2009;O'Brien and Whitehead, 2013; Table 2). For each mark type, we assessed all IDs that had at least one high quality photograph with the mark keyword. Mark type classifications were not mutually exclusive, as notches or back indents were in some cases also assessed as anthropogenic (see Figure 5), but all marks were analyzed separately.

Prevalence
The prevalence of the different mark types in the population was calculated separately for left and right IDs and averaged across data collection years. We used binomial generalized linear regression models (GLMs) to assess whether the proportion of marked individuals (right and left side catalogs calculated separately) had either (a) increased, (b) remained stable or (c) differed between the years occurring prior to or after the MPA. We determined the best fit trend based on the lowest AIC (Akaike's information criterion) score, with all models having scores of ΔAIC < 2 considered as demonstrating some support. The relationship between the proportion of marked IDs in MMM and FJ sex-age classes and the proportion of marked IDs where we have genetic information on molecular sex (XY males and XX females) was tested using linear regression. For each year where there were >10 IDs, we compared the difference between the proportion of MMM and FJ sex-age classes using paired t-tests. All data were normally distributed across years. Statistical analysis was completed in MATLAB (2019) and R (2019).
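To make the model comparison concrete, the sketch below compares two of the three candidate models, the stable model and the pre/post-MPA model, using ΔAIC. It is not the authors' MATLAB or R code. The yearly counts are hypothetical, and the increasing-trend model is left out because it needs iterative fitting, whereas these two binomial models have closed-form maximum-likelihood estimates. Years up to and including 2004 are treated as pre-MPA.

```python
import math


def binom_loglik(marked: int, total: int, p: float) -> float:
    """Binomial log-likelihood of `marked` out of `total` IDs; requires 0 < p < 1."""
    log_choose = (math.lgamma(total + 1) - math.lgamma(marked + 1)
                  - math.lgamma(total - marked + 1))
    return log_choose + marked * math.log(p) + (total - marked) * math.log(1 - p)


def aic(loglik: float, n_params: int) -> float:
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * loglik


def pooled_loglik(records):
    """Log-likelihood when all records share one proportion (the pooled MLE)."""
    p = sum(m for _, m, _ in records) / sum(n for _, _, n in records)
    return sum(binom_loglik(m, n, p) for _, m, n in records)


def delta_aic_stable_vs_mpa(records, mpa_year=2004):
    """records: list of (year, n_marked, n_ids). Returns the ΔAIC of each model."""
    pre = [r for r in records if r[0] <= mpa_year]
    post = [r for r in records if r[0] > mpa_year]
    scores = {
        "stable": aic(pooled_loglik(records), n_params=1),
        "mpa": aic(pooled_loglik(pre) + pooled_loglik(post), n_params=2),
    }
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical yearly counts (year, marked IDs, total IDs); not data from this study.
example_records = [(1990, 20, 50), (1997, 22, 50), (2010, 30, 50), (2016, 32, 50)]
delta = delta_aic_stable_vs_mpa(example_records)
supported = [name for name, d in delta.items() if d < 2]
print(delta, supported)
```

The best model always has ΔAIC = 0, and any other model with ΔAIC < 2 would be listed as having some support under the rule described above.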

Change
Annual rates of loss or gain were analyzed separately for each mark type for all IDs seen in multiple years. For each year that an ID was in the catalog, a change was recorded as either negative (a decrease in the visible mark size or number), positive (an increase in the visible mark size or number), or none (no change in the mark size or number). If an ID entered the catalog with a mark, nothing was recorded until a change occurred, and if gains (or losses) occurred they were counted once in the first year they were observed. The most recent photograph was always used to compare between subsequent years, and only the highest quality photographs (Q4) were used to analyze mark change for patches, large scars and anthropogenic marks, while analysis of back indentations and notches also used photographs of good-excellent quality (>Q3). The average rate of change was calculated separately for each mark type, summing total gains or losses and dividing by the total whale years for all reliably marked individuals in the catalog. Total whale years are defined as the number of years an individual appears in the catalog (i.e., year of last sighting − year of first sighting), and rates were calculated as per Auger-Méthé and Whitehead (2007):
(1) Rate of gain = total number of gains / total whale years
(2) Rate of loss = total number of losses / available whale years with marks
Gowans et al. (2001b) considered marks reliable for reidentification if they had a zero rate of loss in more than five individuals. Due to the larger scale of this analysis, here we define a mark as reliable if loss occurred less than once in a hundred whale years. Using our definition of reliability, the rate of change in status from unreliable to reliable was calculated for all IDs and years.
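The gain rate, loss rate and reliability threshold defined above can be sketched as follows. The record structure, field names and example histories are hypothetical; only the formulas and the threshold of one loss per hundred whale years (a loss rate below 0.01) come from the text.

```python
from dataclasses import dataclass


@dataclass
class MarkHistory:
    """Hypothetical per-ID summary for one mark type."""
    first_year: int
    last_year: int
    gains: int            # number of mark gains observed
    losses: int           # number of mark losses observed
    years_with_mark: int  # whale years during which the mark was present


def whale_years(history: MarkHistory) -> int:
    """Year of last sighting minus year of first sighting."""
    return history.last_year - history.first_year


def gain_rate(histories: list[MarkHistory]) -> float:
    """Total number of gains divided by total whale years."""
    return sum(h.gains for h in histories) / sum(whale_years(h) for h in histories)


def loss_rate(histories: list[MarkHistory]) -> float:
    """Total number of losses divided by available whale years with marks."""
    return sum(h.losses for h in histories) / sum(h.years_with_mark for h in histories)


def is_reliable(annual_loss_rate: float, threshold: float = 0.01) -> bool:
    """A mark is reliable if it is lost less than once per hundred whale years."""
    return annual_loss_rate < threshold


# Hypothetical histories, for illustration only.
example_histories = [
    MarkHistory(1990, 2000, gains=1, losses=0, years_with_mark=8),
    MarkHistory(1995, 2015, gains=2, losses=0, years_with_mark=20),
    MarkHistory(2001, 2011, gains=0, losses=0, years_with_mark=10),
]
print(gain_rate(example_histories), loss_rate(example_histories))
```

The two rates use different denominators: gains can occur in any whale year, whereas losses can only occur in years when a mark was present.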
To estimate the number of whales per year that acquire anthropogenic injuries, we multiplied the most recent published population estimate for the Scotian Shelf (∼143 individuals, O'Brien and Whitehead, 2013) by the annual gain rate calculated for probable entanglement and propeller vessel-strike scars.
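As a worked check of this scaling step, the snippet below multiplies the published population estimate by the annual gain rate for anthropogenic marks. Both inputs are taken from this paper (∼143 individuals; 1.2% per year); the function and constant names are illustrative.

```python
POPULATION_ESTIMATE = 143  # Scotian Shelf estimate (O'Brien and Whitehead, 2013)
ANNUAL_GAIN_RATE = 0.012   # 1.2% per year gain rate for anthropogenic marks


def whales_injured_per_year(population: float, annual_gain_rate: float) -> float:
    """Expected number of whales gaining an anthropogenic injury each year."""
    return population * annual_gain_rate


injured = whales_injured_per_year(POPULATION_ESTIMATE, ANNUAL_GAIN_RATE)
print(f"{injured:.2f} whales per year")
```

The product, 143 × 0.012 = 1.716, rounds to the ∼1.72 whales per year reported in the results.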

Photo-Identification Catalog
The Scotian Shelf northern bottlenose whale catalog contained 29,529 dorsal and 9,000 melon jpeg images from 280 days of fieldwork in 25 years between 1989 and 2019. The sample sizes for photographs and identifications for left and right sides are detailed by year in Supplementary Table 1. Quality rating was reviewed for consistency across years and, due to the effect of low-quality photographs on resighting rates (Urian et al., 2014), photographs < Q3 were not included in the analysis. The catalog contained 662 right side and 677 left side individuals, with an overall average discovery rate of 28 new identifications per year (but only 8.5 reliable IDs per year, Figure 7). Of all IDs, only 33% were seen in a subsequent year; however, for reliably marked whales, 60% of IDs were seen in more than 1 year, not including IDs first sighted in 2019, the last year in the catalog (Supplementary Figure 2). For individuals seen across multiple years, the average number of sighting years was 3.65 (SD = 2.35), with a maximum of 17 sighting years out of a possible 25 years of data collection. A small group of IDs (n = 15) had resights spanning 25-30 years of the 31-year study period. The error rate for ID matching in previous studies of the population (Gowans and Whitehead, 2001; Wimmer and Whitehead, 2004; O'Brien and Whitehead, 2013) was estimated to be 3.6%. Photo-identification errors that were detected during validation (N = 1025 photographs) were largely (78%) duplicates (i.e., Type II, false negatives), with only 22% misidentifications (Type I, false positive errors). All errors were noted and corrected. Of the 131 IDs with photographs affected by errors, nearly 15% (N = 19) had acquired a notch or back indent during the 31-year study period, significantly changing their appearance.

Table 2 (partial). Mark type definitions.
Propeller: Scar(s) consistent with a propeller strike; large or deep gashes, parallel or "corkscrew" scars (George et al., 2017).
Tooth rake: Two or more parallel linear scars consistent with teeth spacing of other odontocetes.
Slough skin: Light discoloration in irregular angular shapes from the peeling off of skin; changes rapidly (within days), not used for identification.
Clean: Having none of the mark types listed above.
Mark types in bold were considered distinctive for individual identification and analyzed for rates of mark change in this study. *Indicates reliable marks.

Mark Prevalence
In the catalog, 45.3% (SE 1.2%) of all individuals had a notch in their dorsal fin, patches were the next most common mark type with 17.4% (SE 1.4%) of IDs, and other marks occurred in less than 10% of IDs (Table 3). Approximately 35% of all IDs were "clean," having none of the distinctive mark types analyzed here (Table 2). The prevalence of each of the five mark types was similar whether melon or molecular sex classifications were used to identify sex (R² = 0.944, P = 0.001), suggesting that regardless of age, males were generally more marked than females (Supplementary Table 2). In paired t-tests for each year (df = 17), MMM were significantly more scarred (5-20%) than FJ in each mark type category, except for patches, where the proportion of FJ was 7% higher than MMM (p = 0.011, Table 3). An increasing trend in prevalence was well supported (ΔAIC < 2) for most marks, but for large fin scars and anthropogenic scars a stable trend was the best supported model (ΔAIC = 0). The effect of the MPA had some support in comparisons of candidate models for indent, large fin scar, and anthropogenic mark types; however, there was stronger support for stable or increasing trends (Table 4 and Figures 8A-F). Most mark types, with the exception of large fin scars, appeared more prevalent in the period after the 2004 designation of the Gully MPA (Table 3).

Rate of Mark Change
Notches had the highest average rate of gain (8.2% per year) but a very low rate of loss (0.2% per year). Marks with high gain and loss rates were patches (3.1% gain, 6.3% loss per year) and large fin scars (2.1% gain, 10.8% loss per year; Table 5). Back indents were found to have a low rate of mark gain (0.7% per year) and no mark loss (Table 5). The gain rate for anthropogenic marks was 1.2% per year, with a higher rate of mark loss (3.3% per year).

Reliability
Over the 31-year study period, only notches and back indents had low enough loss rates to be considered reliable, resulting in an average proportion of 0.479 (SE = 0.013) IDs that were reliably marked. Of the IDs seen in multiple years, 24 changed status to reliable during the study period, with an annual rate of change of 1.6%. Over time, the proportion of reliable individuals increased at 0.011 per yr (P = 0.02) (Table 4).

Table 3 notes. *Sex-age class was known for 64% of notches, 63% of back indents, 64% of dorsal scars, 56% of patches, and 92.5% of anthropogenic scars. **Indicates reliable marks. Presented as an overall percentage, ±standard error with the total number of marked right + left sides (n) over all years and for both sex-age classes. For marked IDs with sex-age class information*, the proportion of marked to unmarked Males-Mature Males (MMM) and marked to unmarked Female-Juveniles (FJ) in each year with >10 IDs were compared using a paired t-test. The proportion of marked IDs between pre MPA (1988-2004) ...

Anthropogenic Markings
Within the catalog, 6.6% of IDs (with photos >Q3) had one visible clear scar of probable anthropogenic origin (classified as either entanglement or propeller-vessel strike scars). Of the 54 IDs (left and right side combined), 43 IDs or ∼80% were seen in more than 1 year, allowing us to calculate the rate of mark gain and loss. With a population size of 143 individuals (O'Brien and Whitehead, 2013), the estimates of annual gain rate equate to ∼1.72 whales per year gaining injuries related to entanglement or propeller-vessel strikes. In a qualitative review of scar severity, we found the majority of anthropogenic scars (57%) were considered low-moderate severity (Levels 1-2) and 16% were considered severe injuries (Level 3) such as mutilations or amputations. Most of the scars initially classified as propeller-vessel strike scars were by definition moderate-high severity injuries (Levels 2-3); however, external reviewers indicated many of these scars could also have been caused by severe entanglement.

Challenging Assumptions and Testing Hypotheses With Long-Term Data Sets
Long-term field studies of cetacean populations, such as the Scotian Shelf northern bottlenose whales, have generated detailed photo-identification datasets which have become an important resource for species management. With a large and growing catalog of northern bottlenose whales, researchers have been able to answer questions and provide new data on population status, demographic differences, movement, social structure, and threats (Gowans et al., 2001b;Wimmer and Whitehead, 2004;O'Brien and Whitehead, 2013), vastly improving our understanding of the status of this enigmatic and difficult to study species. Over the last 31 years, researchers have identified hundreds of individual northern bottlenose whales, some of whom have been seen repeatedly in the study area from 1988 to 2019, suggesting they are close to the 37-year minimum estimate of life expectancy currently understood for the species (Christensen, 1973). These long-lived individuals represent the first generation of northern bottlenose whales to be born into the post-whaling period (i.e., after 1972, Whitehead and Hooker, 2012), living through a new era of industrial exploration and the implementation of the first offshore MPA in Canada. While the Scotian Shelf photo-identification dataset is critical for estimating the size of the Endangered population and understanding their status, ongoing monitoring of individuals can be used to improve our appreciation of northern bottlenose whale life expectancy, population age structure, ontogenetic development, and potential changes in patterns of site fidelity in the study area.
Our initial interest in looking at the occurrence of marks over time was to see whether we could detect a change in the proportion of marked individuals after the implementation of the Gully MPA in 2004. While the effect of time was not strong, there was a significant increase in the proportion of individuals with notches and patches over the entire study period. This increase in the proportion of marked individuals could reflect a post-whaling demographic shift in the age distribution toward older individuals, which tend to be more marked, as whaling in the 1960s removed a substantial proportion of the population (Whitehead and Hooker, 2012). As we are unclear on the etiology of notches and patches, their prevalence could also represent a novel pathogen or parasite, an increase in interactions with predators, anthropogenic activities, or even sex-biased migration between areas (Wilson et al., 2000;Hamilton and Marx, 2005;Bossley and Woolfall, 2014). Despite our optimistic hypothesis, it is not entirely surprising that the Gully MPA, which only restricts fishing and vessel traffic in a small deep-water area (475 km²) of Zone 1, has not had a measurable effect on the proportion of marked individuals in the population. Scotian Shelf northern bottlenose whales regularly travel outside the protected area of the Gully MPA and can be found in Shortland and Haldimand canyons where there are few restrictions on human activities (Figure 1, Wimmer and Whitehead, 2004). While little is known about migratory movements between the Scotian Shelf and other populations, the distribution of acoustic detections along the shelf edge (Feyrer, unpublished data) between the Gully and the foraging aggregation recently discovered off of Newfoundland suggests individuals may make longer distance movements from regions where fishing activities are less restricted.
Examining differences in the proportion of marked IDs between regions could shed light on geographic differences in the origin of certain mark types and improve our understanding of connectivity across the species range. Further study is required to understand the relative contribution and significance of these potential sources, and whether an overall increase in marks becomes a long-term trend in the Scotian Shelf population.
In modernizing the historically printed catalog to a digital database, we updated matching and quality rating protocols as per best practices recommended by Urian et al. (2014). Through this process we were able to detect and correct mistakes, and estimate the identification error rate for the catalog, which suggests it is low and in line with error rates found in other cetacean studies (3.09%, Frasier et al., 2009;3.38%, Stevick et al., 2001). Duplicate IDs represented the majority of errors, which is typical of protocols that require multiple reviewers to confirm a match as they can more easily screen against false positives (Urian et al., 2014). In our protocol we were able to detect duplicates by having a single technician dedicated to the time-consuming task of reviewing all previous matches. While having one consistent reviewer was useful for standardization across years, with ∼450,000 pairwise matches per side, this approach becomes unrealistic as the catalog continues to grow and individuals with knowledge of the IDs in the catalog leave the project. The ∼8.9% combined gain rate for reliable marks and a change in reliability status of 1.6% per year suggest that new identifications of reliable individuals should be carefully evaluated due to the risk of duplicates and that reliability trends, although small, should be incorporated into population size estimates and monitored on an ongoing basis in long-term datasets (Urian et al., 2014). In the future, automated identification software and further classification of individual distinctiveness, such as refining the definition of notches or unusual scars by size or uniqueness, could help minimize pairwise matching requirements, reduce errors, and increase confidence in population estimates (Hupman et al., 2018).
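The scaling problem behind the ∼450,000 pairwise matches can be illustrated with a short sketch. It assumes that every ID-side is compared once against every other, so that n IDs require n(n - 1)/2 comparisons; the text does not state how the figure was derived, so the catalog size printed below is only what that assumption implies, not a reported value. The function names are hypothetical.

```python
import math


def pairwise_comparisons(n_ids: int) -> int:
    """Number of unordered pairs among n_ids catalog entries: n(n - 1)/2."""
    return n_ids * (n_ids - 1) // 2


def ids_for_pairs(n_pairs: int) -> float:
    """Invert n(n - 1)/2 = n_pairs for n (positive root of the quadratic)."""
    return (1 + math.sqrt(1 + 8 * n_pairs)) / 2


# Catalog size implied by ~450,000 pairs per side (an assumption, see lead-in).
implied_ids = round(ids_for_pairs(450_000))
print(implied_ids)                            # 949
print(pairwise_comparisons(implied_ids))      # 449826

# Doubling the catalog roughly quadruples the matching workload.
print(pairwise_comparisons(2 * implied_ids))  # 1800253
```

Because the workload grows with the square of catalog size, pre-screening by mark type or automated matching reduces effort far more than adding reviewers.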

Demographic Differences
In this study, we found significant differences in the proportion of marked IDs between the MMM and FJ sex-age classes in every mark type category, similar to Gowans and Whitehead (2001) who found mature males had significantly more reliable marks than female-juveniles. A lack of repigmentation in scars has been hypothesized to serve a social signaling function in some cetaceans and, in beaked whales, males typically become more scarred with age, presumably due to male-male competition (MacLeod, 1998;Hartman et al., 2015). While we did not analyze the extent of tooth rake scars, northern bottlenose whales are different from other odontocetes and even other beaked whales, in that they only have two small teeth (<4 cm average total length), which only occur in mature males (>15 years) and barely extend beyond the gums at the front of the jaw (Christensen, 1973;Gol'din, 2014;Feyrer, unpublished data). Although male northern bottlenose whales have been described to engage in "head-butting" (Gowans and Rendell, 1999), due to a lack of dental weaponry, we think conspecific interactions are unlikely to cause deep or significant scarring in this species. Scar patterns of multiple parallel lines most likely originate from interactions with toothier species, such as dolphins or pilot whales (Globicephala melas). The higher proportion of MMM IDs with more severe reliable marks (notches and back indents) and anthropogenic scars may be linked to a higher risk tolerance in males (Altmann, 1958;Frid and Dill, 2002;Symons et al., 2014), resulting in additional interactions with predators, vessels or large debris. Although FJ have proportionally more patches than MMM, due to the temporary nature and variable size and shape of patches, there is much we don't understand about this mark type. Given the small effect size in this difference and the inclusion of juvenile males within the FJ sex-age class, there is still some uncertainty whether sex or age is most relevant.
Even within our large long-term dataset, there are few female IDs that can be classified as mature based on their sighting history, limiting our ability to separate the effect of sex and age for females.
While we did not attempt to assess how the age of individuals affects the proportion of injuries, it is possible that life history stage, which is poorly known within the population but may have shifted since whaling ceased, is potentially confounding the assessment of change over time. Generally, calves and juveniles are less marked than mature individuals due to exposure time, while older whales may be more experienced or large enough to survive interactions with predators and break free from fishing gear (George et al., 2017). A juvenile northern bottlenose whale observed by the Whitehead Lab (1999) with an active monofilament line entanglement around its beak was thought unlikely to survive, which suggests another potential bias in any estimates of the rate of anthropogenic interactions by age class. However, the relationship between mark type occurrence and life history stage has previously been used to assign age-classes to other cetaceans (Hartman et al., 2015) and is another area for research in this species.

Mark Change and Reliability
In cetacean species that do not present natural variation in pigmentation, fin, or fluke profile, individuals can only be reliably identified by the irregular occurrence and persistence of scars from parasites, disease, interactions with predators, conspecifics or anthropogenic activities. However, scar pigmentation and accumulation vary widely across cetacean species, with scars persisting for the lifetime of an animal (e.g., Risso's dolphins, Grampus griseus, Hartman et al., 2015) or fading within months to a few years (e.g., common bottlenose dolphins, Tursiops truncatus, MacLeod, 1998). As long-term photo-identification studies are a primary source of information on cetacean population status and trends, there needs to be a clear understanding of the reliability and rate of change in marks used to match individuals and scale population estimates (Frasier et al., 2009;Hupman et al., 2018). As mark loss violates the assumptions of mark-recapture analysis, only marks that have been analyzed for reliability at the scale of the period under consideration should be used. Here, the only scars that met the criteria for long-term reliability in northern bottlenose whales were fin notches and back indents, which persisted over multiple years with low to zero rates of mark loss. Although patches were considered "reliable" over the 9-year period analyzed by Gowans and Whitehead (2001), with additional years and repeat sampling events, we determined that this mark type may be distinctive, but is not stable due to high rates of loss.
Omitting patches as reliable marks reduces the proportion of IDs considered for population estimation from 66% (Gowans and Whitehead, 2001) to 49.8% (this study), but their inclusion may have inflated mortality rates of previous population estimates for this species, as individuals that lose marks may be lost from the record and considered (by the mark-recapture analyses) as probable mortalities (e.g., the estimated mortality was 11% in O'Brien and Whitehead, 2013). Combinations of distinct but non-reliable mark types are still useful for matching individuals within a season or between adjacent years; however, without distinctive mark types (35% of IDs were considered "clean"), repeat identification within the long-term dataset becomes unlikely and is a source of error. Fin shape classification, which has been looked at in other species (e.g., blue whales, Balaenoptera musculus, Gendron and Ugalde de la Cruz, 2012), may help further distinguish "clean" and other poorly marked individuals; however, fin shape can be distorted in lower quality photographs and relies heavily on photographs having a consistent angle to the body's position and roll for comparison.

All ID-sides with >1 year of high-quality (Q4) photographs with marks were analyzed for mark change. Rates of gain were estimated for each mark type by dividing the observed number of gains by the total number of whale years in the catalog. Rates of loss were estimated based on the observed number of losses per mark type, divided by the total number of years where whales were observed with the mark, which varied by mark type.
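The gain and loss rate definitions given above can be written out as a minimal sketch. The class, field and function names are hypothetical, and the counts in the example are placeholders chosen only to show the arithmetic; they are not values from this study.

```python
from dataclasses import dataclass


@dataclass
class MarkTally:
    """Counts for one mark type (all field names are hypothetical)."""
    gains: int         # observed number of gains of this mark type
    losses: int        # observed number of losses of this mark type
    whale_years: int   # total whale-years in the catalog
    marked_years: int  # years in which whales were observed carrying the mark


def gain_rate(t: MarkTally) -> float:
    """Annual gain rate: observed gains per whale-year in the catalog."""
    return t.gains / t.whale_years


def loss_rate(t: MarkTally) -> float:
    """Annual loss rate: observed losses per year a whale carried the mark."""
    return t.losses / t.marked_years


# Placeholder counts for illustration only; these are NOT values from the study.
example = MarkTally(gains=12, losses=1, whale_years=1000, marked_years=400)
print(f"gain rate: {gain_rate(example):.4f} per whale-year")
print(f"loss rate: {loss_rate(example):.4f} per marked year")
```

The two rates use different denominators because any catalogued whale can gain a mark in a given year, whereas only a whale already carrying the mark can lose it.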

Implications of Anthropogenic Interactions
In addition to unnatural levels of mortality, there are other population level impacts from the short-term stress of an entanglement or vessel strike incident, including long-term energetic costs that may reduce a survivor's reproductive output (van der Hoop et al., 2016). Baird et al. (2014) found female false killer whales (Pseudorca crassidens) were more likely to bear scars from interacting with fisheries, with potentially significant impacts to reproductive capacity and calf mortality. While we found the MMM age class of northern bottlenose whales were more likely to possess anthropogenic scars, we do not know the sex or age of all marked IDs. In addition, the relationship between scarring and mortality, which may favor the survival of larger or older animals, limits our understanding and interpretation of population level impacts. While the majority of probable entanglements left scars of low severity, we did not assess entanglement scarring in other areas (e.g., the beak or tail fluke), which have been observed in northern bottlenose whales and found to be more prevalent or serious than those of the dorsal fin area in other species (Whitehead et al., 1997;Whitehead Lab, 1999;Fisheries and Oceans Canada, 2009;Ramp et al., 2021). Although the mortality of vessel strike injuries in cetaceans is generally assumed to be quite high, blunt force trauma is harder to detect than mutilations (Laist et al., 2001;Vanderlaan and Taggart, 2007). However, mutilations related to severe entanglement or propeller-vessel strikes are hard to distinguish, leading us to combine our assessment of the scars we attribute to probable anthropogenic sources. Overall, we found 6.6% of the population had experienced probable entanglement or propeller-vessel strike scars in the region of their dorsal fin. Our analysis of mark rates over this period suggests that on average 1.72 whales per year suffer injuries related to probable entanglement and propeller-vessel strike combined.
This rate of anthropogenic interactions is of concern as it is over 5 times the potential biological removal (PBR) of 0.3 individuals per year estimated by Harris et al. (2013). Although PBR generally refers to removals due to mortality events, we use it here as a threshold for comparison because (1) many non-fatal anthropogenic injuries may eventually result in mortality, (2) injuries can have long-term impacts to the reproductive capacity of individuals, which would limit population growth, and (3) the rate combined with the risks associated with interactions suggests that there are an unknown number of individuals in the population that do not survive. Taken together, we think there is cause for concern as anthropogenic impacts are likely limiting individuals from contributing to population growth. We emphasize that our estimate represents a minimum of non-fatal anthropogenic interactions for this population, and we do not know the total number of anthropogenic encounters. The occurrence of anthropogenic markings on northern bottlenose whales is likely influenced by their curious nature, as they are known to inquisitively approach and follow vessels (Mitchell, 1977), interact with fisheries (Fertl and Leatherwood, 1997;Oyarbide Cuervas-Mons, 2008) and engage in group social behavior at the surface (Gowans et al., 2001b). Other studies have found propeller-vessel strike injuries are common in species that approach vessels and bow-ride (Van Waerebeek et al., 2007) or swim in the wash of the propellers (Visser, 1999). For common bottlenose dolphins in Ecuador, the prevalence of anthropogenic scarring in the population was ∼44% (Felix et al., 2018), in the Mayotte archipelago 15% of Indo-Pacific bottlenose dolphins (Tursiops aduncus) had anthropogenic scars (Kiszka et al., 2008), while 7.5% of false killer whales off Hawaii were found to have anthropogenic scarring (Baird et al., 2014).
Entanglement rates have also been found to increase due to particular kinds of cetacean social or foraging behavior, such as depredation in sperm whales (Physeter macrocephalus, Hamer et al., 2012) and open mouth filter feeding in North Atlantic right whales, where 85% of individuals bear entanglement scars (Moore, 2019). While we have observed and are aware of other accounts of northern bottlenose approaching fishing vessels, being hand fed by fishers and depredating trawl and longline fisheries in Newfoundland, Labrador and Baffin Bay (Oyarbide Cuervas-Mons, 2008;Fisheries and Oceans Canada, 2009;Johnson et al., 2020; Wayne Ledwell pers. com.), we are not aware of reports of these behaviors occurring on the Scotian Shelf, or any efforts to document the extent of these behaviors across their range. Additional research in this area would help us understand how the behaviors are spread among individuals, whether they are regionally or demographically isolated, and the prevalence of depredation behavior within the Scotian Shelf population.
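The comparison with PBR made earlier in this section is simple arithmetic and can be checked with the sketch below. The inputs are the figures quoted in the text (1.72 injured whales per year, a PBR of 0.3 individuals per year, and a population size of 143); the function and variable names are hypothetical, and the per-capita rate is back-calculated from those figures rather than a value reported here.

```python
def ratio_to_pbr(injured_per_year: float, pbr: float) -> float:
    """How many times the annual injury estimate exceeds PBR."""
    return injured_per_year / pbr


def implied_per_capita_rate(injured_per_year: float, population_size: int) -> float:
    """Annual injury rate per individual implied by the population-level estimate."""
    return injured_per_year / population_size


INJURED_PER_YEAR = 1.72  # whales per year gaining anthropogenic injuries (from the text)
PBR = 0.3                # individuals per year (Harris et al., 2013, as cited in the text)
POPULATION_SIZE = 143    # O'Brien and Whitehead (2013), as cited in the text

ratio = ratio_to_pbr(INJURED_PER_YEAR, PBR)
per_capita = implied_per_capita_rate(INJURED_PER_YEAR, POPULATION_SIZE)

print(f"{ratio:.1f}")       # 5.7
print(f"{per_capita:.3f}")  # 0.012
```

A ratio of about 5.7 is consistent with the statement that the injury rate is over 5 times PBR, and the implied per-capita rate is roughly 1.2% of individuals per year.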
Our classification of probable anthropogenic scars in high quality dorsal photographs limited our analysis to marks that could be recognized based on established literature from other species and expert opinion of entanglement or propeller-vessel strike injuries. This necessarily excluded individuals with unusual scar patterns or large fin notches without associated linear scars. There may be a broader range of possible entanglement injuries for beaked whales involved in offshore fisheries than those recognized from other more commonly observed species. While the trailing edge of dorsal fins can naturally degrade or become tattered over time (Wursig and Jefferson, 1990), entanglements are typically described as scarring on the leading edge of the fin (Azevedo et al., 2009;Baird et al., 2014;Kügler and Orbach, 2014;Felix et al., 2018). Baird et al. (2014) proposed that trailing edge fin scars could potentially occur if whales that became hooked in the mouth thrashed or twisted against the line to break free. Given the low probability of observing beaked whale entanglements, simulation of entanglement mechanics occurring with lines and gear associated with offshore fisheries (e.g., McLellan et al., 2015;Howle et al., 2018) could provide insight on the origin of other unusual scars. Additional analysis of melon and beak photographs, or aerial imagery of the entire body (e.g., Ramp et al., 2021), would provide another perspective on patterns observed here and comparative estimates of the rate of fisheries interactions.

CONCLUSION
The contrasting patterns of long-term site fidelity and single sightings, unmarked and distinct individuals, differences between sex-age classes, and observations of anthropogenic scarring found in this study suggest there is still much to be learned about northern bottlenose whale population structure, life history, and threats. While foundational work by Gowans and Whitehead (2001) provided photo-identification methods that have been used for northern bottlenose whales and other species, this study has highlighted that protocols and assumptions about sexing, mark reliability and identifications need to be continuously reviewed to ensure the analysis of larger datasets over longer time periods remains unbiased. Our study found that the prevalence of most mark types is higher for the male-mature male versus female-juvenile sex-age class, which corresponds with patterns found based on molecular sex, but still leaves some uncertainty on whether age or sex is driving these patterns. The increased prevalence of scars could be due to a higher risk tolerance in male-mature males and/or an increase in mark accrual with age. In contrast to our hypothesis on temporal trends, the proportion and rates of most mark types have increased or remained stable rather than decreasing over time. The reasons for increasing trends may be related to an aging population in the Gully.
Despite the implementation of the Gully MPA in 2004, northern bottlenose whales face ongoing threats and a risk of injury when they use habitat areas outside the spatial protections provided within the small area of the Gully's Zone 1, such as Shortland and Haldimand canyons.
The risk of interactions with vessels and fisheries for northern bottlenose whales has previously been assessed as lower than for inshore whale species, largely due to the reduced density of anthropogenic activities (Halpern et al., 2008;Brown et al., 2013). However, cryptic mortality will bias any estimate of observed anthropogenic injury rate downward, due to low detection rates for whales that do not survive entanglement or vessel strikes (Williams et al., 2011). While we observed some demographic differences in scarring in our dataset, it is also possible that some individuals (e.g., juveniles) suffer higher mortality from anthropogenic interactions and will be excluded from any assessment of scars found on live animals (Byard et al., 2012;George et al., 2017;Felix et al., 2018). Given the uncertainties, we emphasize that this first assessment only tells part of the story, that of non-lethal anthropogenic interactions, which have nonetheless caused a steady number of injuries over the last 30 years. Our estimate indicates the annual rate of injury from anthropogenic interactions is already exceeding the accepted PBR. Combined with new information on the species' slow reproductive rate (Feyrer et al., 2020b) and known life history impacts faced by survivors, entanglement and vessel strikes likely present ongoing and significant threats to the recovery of northern bottlenose whales.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

ETHICS STATEMENT
The animal study was reviewed and approved by Dalhousie University Committee on Laboratory Animals.

AUTHOR CONTRIBUTIONS
LF was responsible for project conception, funding acquisition, supervision, data collection, project administration, analysis, data visualization, writing, and editing. MS was responsible for data analysis, validation, visualization, writing, and editing. JY was responsible for data analysis, validation, visualization, and editing. CS was responsible for data analysis, validation, and editing. HW was responsible for funding acquisition, data collection, project administration, supervision, and editing. All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

FUNDING
This work was supported by grants and funding agreements with Species at Risk and Oceans Management branches in the Maritimes Region of Fisheries and Oceans Canada. In addition, LF and HW received funding from the Natural Sciences and Engineering Research Council of Canada and Killam Trusts. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

ACKNOWLEDGMENTS
We would like to acknowledge the contributions of three decades of volunteers and students that participated in the field and lab to collect and process the data used in this study. We thank Wayne Ledwell and Julie Huntington of the Whale Release and Strandings Group in Newfoundland, who reviewed and ranked photos of possible entanglement, providing much additional insight, the photograph of the whale in Figure 6C as well as some inspired whale poetry. Dr. Patrick Miller conducted fieldwork in the Gully in 2013 and Kristi O'Brien contributed photos of entangled Sowerby's from those trips, which were important to the analysis conducted here. We gratefully acknowledge Dr. Hilary Moors-Murphy for their efforts to digitize the catalog and the funding support of Species at Risk and Oceans Management branches in the Maritimes Region of Fisheries and Oceans Canada. We would also like to recognize that Dalhousie University is located in Mi'kma'ki, the ancestral and unceded territory of the Mi'kmaq people. Without this support our research would not have been possible.