Risk Assessment of Whale Entanglement and Vessel Strike Injuries From Case Narratives and Classification Trees

Entanglements and vessel strikes impact large whales worldwide. Post-event health status is often unknown because whales are seen once or over short spans that conceal long-term health declines. Well-studied populations with high site fidelity verified by photo-ID offer opportunity to confirm deaths, health declines and recoveries. We used known outcome entanglements and vessel strikes of right whales (Eubalaena glacialis) and humpback whales (Megaptera novaeangliae) to model probabilities of deaths, health declines and recoveries with Random Forest (RF) classification trees. Variables included presence or absence of phrases from case narratives (‘deep laceration’, ‘cyamid’, ‘healing’, ‘superficial’) and a categorical variable for vessel size. Health status post-entanglement was correctly classified in 95.7% of right whale and 93.6% of humpback whale cases (expected by chance=50%). Health status post-vessel strike was correctly classified in 91.4% of right whale and 88.6% of humpback whale cases. Important variables included cyamid presence, emaciation, discolored skin, constricting entanglements, gear-free resightings, superficial or healing lacerations, and vessel size. Cross-validated RF models were applied to unknown outcome cases to estimate the probability of deaths, health declines and recoveries. Total serious injuries (probability of death or health decline > 0.50) assigned by RF were nearly equal to current injury assessment methods applied by biologists for known outcomes. However, RF consistently predicted higher serious injury totals for unknown outcomes, suggesting that current assessment methods may underestimate risk for cases lacking details or long-term observations. Advantages of the RF method include: 1) risk models are based on known outcomes; 2) unknown outcomes are assigned post-event health status probabilities; and 3) identification of important predictor variables improves data collection standards.


INTRODUCTION
Mortality and serious injury (MSI) of large whales caused by entanglements and vessel strikes occurs worldwide and is a chronic problem in several populations (Clapham et al., 1999; Neilson et al., 2009; Conn and Silber, 2013; Robbins et al., 2015; Rockwood et al., 2017; van der Hoop et al., 2017). True impacts of chronic entanglements, in particular, are difficult to quantify, with some noting that large whale populations essentially have an 'entanglement life history stage' (van der Hoop et al., 2017). Scientists are challenged with assessing entanglement and vessel strike health impacts, but an inability to follow all cases through time means many outcomes are unknown, including undocumented deaths, or 'cryptic mortality' (Gilman et al., 2013; Pace et al., 2021). Several studies have quantified cryptic mortality, estimating that between 0% and 46% of available carcasses are detected, depending on species, habitat, and mortality source (Kraus et al., 2005; Williams et al., 2011; Gilman et al., 2013; Prado et al., 2013; Wells et al., 2015; Carretta et al., 2016; Rockwood et al., 2017; Harting et al., 2021; Pace et al., 2021). The ability to follow individual whales with entanglement or vessel strike injuries through time via photo identification of unique entangling gear, scars, lacerations, or natural markings provides the opportunity to model the outcomes of such interactions. However, such follow-up data can be biased, because more lethal injuries may have a lower chance of being observed later, especially if they occur far from shore (Pace et al., 2021; Williams et al., 2011). In the United States, classification of non-serious vs serious whale injuries from entanglements and vessel strikes is determined by National Oceanic and Atmospheric Administration (NOAA) biologists manually applying decision tree criteria to case narratives (NOAA, 2012a; NOAA, 2012b).
NOAA defines a serious injury as any injury that is "more likely than not to result in mortality, or any injury that presents a greater than 50 percent chance of death to a marine mammal" (NOAA, 2012b). Injury assessments are divided into multiple discrete categories, based on whether entangling gear is constricting vs loose, physical evidence of health declines (emaciation, heavy cyamid loads), presence of trailing gear, severity of vessel strike or entanglement lacerations, and vessel sizes and speeds involved in a strike relative to whale size (Angliss and DeMaster, 1998; Pettis et al., 2004; Andersen et al., 2008; NOAA, 2012a; NOAA, 2012b; Conn and Silber, 2013; Rolland et al., 2016). In practice, many entanglement and vessel strike cases represent whales seen only once with generic narratives such as 'whale seen dragging 4 buoys' or 'sailboat reported striking whale, vessel size and speed unknown'. The policy for assessing such 'data poor' or unknown outcome cases is to prorate the probability of injury or death based upon the fraction of known outcome entanglements and vessel strikes resulting in a death or health decline between 2004 and 2008 (NOAA, 2012a; NOAA, 2012b). One such example (a proration factor = 0.75) is applied to entanglements where the amount and configuration of entangling material is unknown and the final health status of the whale is unknown. The serious injury policy states: "of the 114 documented entanglement events with known outcomes from 2004-2008, 85 (75%) either resulted in the whales' deteriorating health or death, or would have resulted in the whales' death if not for intervention (40 were disentangled from constricting wraps)" (NOAA, 2012a; NOAA, 2012b).
Because some injury categories are based on small sample sizes and serious injuries are defined as any injury more likely than not to result in mortality, current injury categories also incorporate a binomial test to estimate the likelihood of observed mortality rates of a given category exceeding 50% (NOAA, 2012a;NOAA, 2012b). Current protocols utilize several discrete injury assignment values ranging from zero (a non-serious injury) to one (a serious injury), including prorated serious injury values of 0.14, 0.20, 0.36, 0.52, 0.56, and 0.75, which are counted against anthropogenic removal levels (Potential Biological Removal or PBR) under the Marine Mammal Protection Act (Wade, 1998).
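The proration arithmetic and the binomial reasoning described above can be illustrated with a short Python sketch. It uses the 85 of 114 known outcome entanglements quoted from the serious injury policy. The one-sided upper-tail test against a 50% rate is an assumption about the general form of such a test, not a reproduction of the NOAA procedure, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Counts quoted in the serious injury policy (2004-2008 known outcomes).
deaths_or_declines = 85
known_outcomes = 114

# Proration factor: observed fraction of deaths or health declines.
proration = deaths_or_declines / known_outcomes

# Assumed test form: how likely are >= 85 of 114 such outcomes
# if the true rate were only 50%?
p_value = binomial_upper_tail(deaths_or_declines, known_outcomes)

print(f"proration factor = {proration:.2f}")  # 0.75
print(f"P(X >= 85 | n = 114, p = 0.5) = {p_value:.2e}")
```

A very small tail probability indicates that the observed rate is unlikely to arise from a true rate of 50% or less, which is the sense in which a category is judged 'more likely than not' to result in mortality.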
Some whale populations have excellent mark-recapture sighting histories, facilitating long-term tracking of entanglement and vessel strike injuries that may take many months to manifest themselves as health declines or deaths (Moore and Van der Hoop, 2012; Robbins et al., 2015). In particular, researchers have monitored entangled North Atlantic right whales (Eubalaena glacialis) over long enough spans to estimate a 20% reduction in adult and juvenile survival compared to baseline survival rates of unaffected whales (Robbins et al., 2015). For entanglements where the amount and configuration of attached fishing gear is unknown, NOAA applies a similar probability (p=0.75) of a death or health decline, based on 85 of 114 known outcome entanglement cases resulting in deaths or health declines between 2004 and 2008 (NOAA, 2012a; NOAA, 2012b). NOAA is reviewing current large whale serious injury assessment procedures, using known outcome entanglement and vessel strike cases to model the probability of deaths, health declines, and recoveries based on case narratives. We describe an approach to improve and automate the large whale serious injury assessment process using variables developed from current serious injury procedures and Random Forest models.

METHODS
Our study includes 698 entanglement and vessel strike cases over a 19-year period (2000-2018) for North Atlantic right whales and North Atlantic and North Pacific humpback whales (Megaptera novaeangliae), including 370 known outcomes and 328 unknown outcomes (Table 1). Cases include published data from annual serious injury reports (Carretta et al., 2020; Henry et al., 2020), journal articles (Knowlton et al., 2012; Sharp et al., 2019) and NOAA marine mammal stock assessments (Hayes et al., 2019; Carretta et al., 2020). Entanglement cases include 125 right whale and 398 humpback whale records. Vessel strike cases include 69 right whale and 106 humpback whale records. Known outcome cases include two possible response variables: 'Dead.Decline' or 'Recovered', corresponding to deaths or health declines, and recoveries. Health declines include cases with evidence of heavy cyamid loads, skin sloughing or discoloration, necrosis, deep lacerations, rake marks, emaciation, loss of appendages or their functionality, and deformities (e.g. scoliosis) due to long-term entanglements (Pettis et al., 2004; Moore and Van der Hoop, 2012; van der Hoop et al., 2017). Health declines or deaths may occur over any observation period, while recoveries are defined as whales that are seen ≥1 year post-event in good health and body condition. To model entanglement outcomes, we excluded cases where human intervention to remove fishing gear potentially changed the long-term impact of an entanglement. Our study treats deaths and health declines as equal known outcomes, as both are counted as a 'mortality' or 'serious injury' (MSI), which is compared to the anthropogenic removal threshold (Potential Biological Removal; PBR) as defined under the Marine Mammal Protection Act (Wade, 1998). Cases where whales are never seen again post-injury are generally considered to be unknown outcomes, unless the individual whale is known, belongs to an intensively photographed population (i.e. North Atlantic right whale) and has not been seen in several years, after which it may be inferred to have died, depending on the individual case. However, attributing cause of death to such 'missing whales' over such a time period involves great uncertainty.
Right and humpback whale entanglement and vessel strike data were stratified to create separate species and injury type models. Case narratives from known outcomes were data mined for the presence or absence of words or phrases anticipated to be good predictors of the health status responses 'Dead.Decline' vs 'Recovered'. Similar language analyses have been used to identify and successfully predict cultural, gender, and racial biases in news stories and political speeches (Gabrielatos and Baker, 2008; Mastro et al., 2011; Dahllöf, 2012; Caliskan et al., 2017). For example, a heavy cyamid load on a whale is excellent evidence of a health decline and increased probability of death (Pettis et al., 2004; Pettis et al., 2017). Thus, presence or absence of the word 'cyamid' or phrase 'whale lice' (or other derivatives) in an injury narrative was recorded for each injury case. Additional evidence of health declines includes observations of emaciation, sloughing skin, rake marks, limited mobility, and deformities resulting from injuries; thus, multiple phrases characterizing a health decline are pooled into a single presence/absence variable called 'decline'. Similarly, the word 'constricting' is a better indicator of a severe entanglement than the word 'loose', so phrases that include 'constrict', 'embedded', 'impression', 'cutting into' or 'pinned' were pooled into a single presence/absence variable called 'constricting'. In some cases, the same word may be used to represent more than one variable, such as 'amputate', which is indicative of a severe injury that is related to a constricting entanglement. Thus, 'amputate' (or derivatives thereof such as 'cutting into') was included in the variables 'extensive.severe' and 'constricting'. For vessel strikes, phrases such as 'deep laceration' indicate a more severe injury than 'superficial laceration'.
Specific regions of a whale involved in injuries, such as the head, pectoral fins, and caudal peduncle are also included in our suite of variables, because they are related to a whale's ability to feed (head), change swimming direction (pectoral fins) or involve body areas where a constricting entanglement or vessel strike could involve major arteries that are near the skin surface (NOAA, 2012a; NOAA, 2012b). We also include a variable indicating whether a whale is identified as a calf or juvenile, as entanglements and/or vessel strikes may impact smaller individuals of a species more severely, and to recognize that a dependent calf of an injured mother also has reduced chances of survival (NOAA, 2012a; NOAA, 2012b). Such phrases inform current serious injury protocols and were therefore coded as presence/absence variables. Some vessel strike narratives include information on the size and speed of vessels involved, which can be predictive of injury severity (Kelley et al., 2020). Phrases and character strings such as 'ferry traveling @ 11 kts', 'navy ship', 'sailboat', 'fishing boat', 'ferry', '>10 kts', '< 10 kts', 'vessel much smaller/larger than whale', '>65 ft', '<65 ft' were used to distinguish between small/large vessels and slow/fast travel speeds to create categorical variables for vessel size (small, large, unknown) and speed (slow, fast, unknown). Phrases associated with only one of two known outcome classes ('Dead.Decline' or 'Recovered') and absent from unknown outcome cases were omitted from models, as these variables would result in model overfitting that isn't informative. Omitted examples include 'carcass', 'necropsy', 'hemorrhaging', and 'fracture'. Variables used in entanglement and vessel strike models are identified in Table 2. Identification and selection of potential variables was aided by reviewing current injury protocols for use of such phrases and plotting and examining their association with known health status responses (Figure 1).
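The coding of narratives into presence/absence variables described above can be sketched as follows. This is a minimal Python illustration, not the authors' code (the study used R). The phrase lists contain only phrases quoted in the text and are incomplete, and the function name and example narrative are hypothetical.

```python
import re

# Pooled phrase lists per variable. Only phrases quoted in the text are
# included, so these lists are illustrative rather than complete.
VARIABLE_PATTERNS = {
    "cyamid": [r"cyamid", r"whale lice"],
    "constricting": [r"constrict", r"embedded", r"impression",
                     r"cutting into", r"pinned"],
    "healing": [r"healing"],
    "gear.free": [r"gear[- ]free"],  # matches 'gear-free' and 'gear free'
}

def code_narrative(narrative: str) -> dict:
    """Return a 0/1 presence flag per variable for one case narrative."""
    text = narrative.lower()
    return {
        name: int(any(re.search(pattern, text) for pattern in patterns))
        for name, patterns in VARIABLE_PATTERNS.items()
    }

# Hypothetical narrative, written for this example only.
example = ("Whale resighted gear-free; line had been cutting into the "
           "peduncle, wounds healing.")
print(code_narrative(example))
# {'cyamid': 0, 'constricting': 1, 'healing': 1, 'gear.free': 1}
```

Pooling several phrases into one variable, as with 'constricting', keeps the number of predictors small relative to the number of known outcome cases.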

Random Forest Classification Models
TABLE 1 | Case numbers are divided among injury type and known and unknown outcomes. Random Forest (RF) classification trees were built from known outcome cases and used to classify the health status (Dead.Decline vs Recovered) of whales ≥1 yr post-event from case narratives. Resulting RF models were then applied to unknown outcome cases to estimate health status probabilities and assign cases as non-serious or serious injuries.
Random Forest (hereafter 'RF') models (Breiman et al., 1984; Breiman, 2001a; Breiman, 2001b) were created to classify the health outcomes of large whale entanglements and vessel strikes using the R programming language (R Core Team, 2020) and the packages randomForest, version 4.6-14 (Liaw and Wiener, 2002) and rfPermute version 2.5 (Archer, 2021). RF models are a recursive partitioning algorithm that uses random subsets of variables to split data into successive daughter nodes (see Supplementary Material). The variable that maximizes the purity of classes in daughter nodes is chosen for each split. Our response variable is a binary class of health status: 'Dead.Decline' or 'Recovered'. A perfect data split would fully separate 'Dead.Decline' and 'Recovered' cases, or maximize the Gini coefficient of data in subsequent nodes (Gini, 1921). Splits continue until terminal nodes contain a single response class. Each tree is constructed from a random subsample of the cases and many trees are grown, which prevents overfitting of data that can occur with single trees and produces robust predictive models when variables are informative (Breiman, 2001a; Breiman, 2001b). Cases omitted from the construction of each tree in the forest are called 'out-of-bag' (OOB) and are used for cross-validation. Each tree then predicts the status of cases that were OOB (either 'Dead.Decline' or 'Recovered'), based on the status of the training cases in the terminal nodes they are assigned to. The fraction of trees in a forest voting for a class for each case when it was OOB is used to estimate the error rate of the model. These errors are summarized as a confusion matrix for all response classes. The relative importance of predictors in classifications of the RF model is assessed by permuting each predictor, and the resulting decreases in classification accuracy are measured. Important variables will result in the largest decreases in classification accuracy, while unimportant variables result in negligible decreases. For each predictor, an importance 'score' is defined as the mean decrease in classification accuracy (number of additional cases misclassified) across all trees when it was permuted.
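The idea of choosing the split that maximizes class purity in daughter nodes can be illustrated with a small Python sketch. It assumes Gini impurity, a purity criterion commonly used in classification trees, where a purer node has lower impurity. It is not the internal code of the randomForest package, and the function names and toy labels are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 0 when the node holds a single class."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of two daughter nodes."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

# Toy node with four cases of each response class (hypothetical data).
parent = ["Dead.Decline"] * 4 + ["Recovered"] * 4

# A split that fully separates the classes gives the largest decrease.
perfect = impurity_decrease(parent, parent[:4], parent[4:])

# A split that leaves both daughters as mixed as the parent gives none.
mixed = impurity_decrease(parent, parent[:2] + parent[4:6],
                          parent[2:4] + parent[6:])

print(perfect, mixed)  # 0.5 0.0
```

At each node, a tree would evaluate candidate variables from a random subset and keep the split with the largest decrease.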
Random Forest injury models included 1000 individual trees, each constructed from equal numbers of 'Dead.Decline' and 'Recovered' cases (n=1/2 of smaller class size). This balanced data approach is analogous to a uniform prior distribution and is necessary to mitigate bootstrap oversampling of majority response classes, which otherwise results in poor classification accuracy for minority classes when response class frequencies are highly unbalanced (Chawla et al., 2004; Anaissi et al., 2013). We use this approach because we are equally concerned with correctly classifying 'Dead.Decline' and 'Recovered' response classes and our data are imbalanced with respect to these classes. To evaluate RF model performance, we compare OOB classification accuracy to that expected by chance, which is based on the respective sample sizes of each response class. Because each tree is constructed with an equal number of cases from each response class, the classification accuracy expected by chance = 50%.
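The balanced sampling scheme can be sketched in Python as follows. The draw size (half of the smaller class, taken from every class) follows the text. Sampling without replacement, the function name, and the class sizes are assumptions made for illustration.

```python
import random

def balanced_draw(labels, rng):
    """Return (in_bag, out_of_bag) case indices for a single tree.

    Each class contributes n = (smaller class size) // 2 cases.
    Sampling without replacement is an assumption of this sketch.
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    n_per_class = min(len(indices) for indices in by_class.values()) // 2

    in_bag = []
    for indices in by_class.values():
        in_bag.extend(rng.sample(indices, n_per_class))

    in_bag_set = set(in_bag)
    out_of_bag = [i for i in range(len(labels)) if i not in in_bag_set]
    return in_bag, out_of_bag

# Hypothetical imbalanced data: 30 'Dead.Decline' and 10 'Recovered' cases.
labels = ["Dead.Decline"] * 30 + ["Recovered"] * 10
in_bag, out_of_bag = balanced_draw(labels, random.Random(1))

print(len(in_bag), len(out_of_bag))  # 10 30
```

Because each tree sees the two classes in equal numbers, a tree that guessed at random would be correct about half the time for either class, which is the 50% chance baseline used for comparison.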

RF Model Application to Novel Data
Entanglements and vessel strikes with unknown outcomes were predicted using the RF models to yield probability assignments of 'Dead.Decline' or 'Recovered', based on the fraction of trees predicting each response. For example, an RF model of 1000 trees may yield 920 predictions of 'Dead.Decline' vs 80 predictions of 'Recovered' for a given injury case. The probability of 'Dead.Decline' for that case is p = 0.92 and the 'majority prediction' is 'Dead.Decline'. In this paper, 'classification' refers to RF assignments of health status for known outcome cases, vs 'predictions' that represent probability-based assignments of health status for unknown outcome cases.
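The vote-fraction calculation in the example above can be written as a short Python sketch. The function name and the list of case probabilities are hypothetical. The second part contrasts the count of majority predictions (probability > 0.50) with the sum of probabilities, which are different summaries of the same predictions.

```python
def vote_summary(votes):
    """Return (P(Dead.Decline), majority prediction) from per-tree votes."""
    p_dead_decline = votes.count("Dead.Decline") / len(votes)
    # A case counts as 'Dead.Decline' only when its probability exceeds 0.5.
    majority = "Dead.Decline" if p_dead_decline > 0.5 else "Recovered"
    return p_dead_decline, majority

# The worked example from the text: 920 of 1000 trees vote 'Dead.Decline'.
votes = ["Dead.Decline"] * 920 + ["Recovered"] * 80
p, majority = vote_summary(votes)
print(p, majority)  # 0.92 Dead.Decline

# Hypothetical probabilities for three unknown outcome cases.
case_probs = [0.92, 0.55, 0.40]
serious_count = sum(prob > 0.5 for prob in case_probs)  # majority predictions
prob_sum = sum(case_probs)                              # sum of probabilities
print(serious_count)  # 2
```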
For right and humpback whales combined, we compared RF classifications and probabilistic predictions of known and unknown outcomes with injury determinations made by biologists applying current injury protocols and MSI assignments to the same cases, to identify any notable differences in estimated numbers of deaths or serious injuries between methods. Our emphasis for comparison with MSI totals is RF majority predictions of Dead.Decline (total cases with Dead.Decline probabilities > 0.50, not the sum of predicted Dead.Decline probabilities), because the definition of a serious injury under the MMPA is one that results in a greater than 50% chance of mortality.
TABLE 2 | Variables and example phrases (surviving rows): wrap.no, "no wraps" (EN); wrap.multi, "multiple wraps", "several wraps" (EN). Examples for each variable are provided but do not include every phrase used, including derivatives of the same phrase, such as 'gear-free' vs 'gear free'.
FIGURE 1 | Occurrence of variables from entanglement and vessel strike narratives, stratified by known health status ≥ 1 year post-event. In most injury cases, vessel size and speed were unknown and effectively represented a variable with a constant value. Thus, the numbers of 'VsSzUnk' and 'VsSpdUnk' values are not shown to prevent y-axis distortion that would mask patterns evident for known vessel sizes and speeds.

RESULTS
The most important entanglement variables, based on the largest decreases in classification accuracy when individual variables were permuted, included 'healing', 'gear free', 'pectoral' and 'decline'. Entanglement variables resulting in the largest decreases in classification accuracy when permuted were 'healing' and 'gear free' for right and humpback whales, respectively. The most important vessel strike variables included 'VessSz', 'healing', 'head', 'calf.juv' and 'laceration.shallow' for both right and humpback whales. The vessel strike variable resulting in the largest decreases in classification accuracy when permuted was 'healing' for both species (Figures 2, 3).
Generally, RF predictions of health status agreed with MSI values assigned by biologists for known outcome entanglements and vessel strikes. In contrast, for unknown outcomes where biologists assigned MSI values of 0 (a non-serious injury), corresponding RF Dead.Decline probabilities were notably higher (Figures 4, 5). The sum of MSI values assigned to known outcome entanglements and vessel strikes by biologists using current injury protocols was similar to sums of RF majority predictions of deaths or health declines, with ∑MSI/RF majority prediction ratios close to unity (Table 4). In contrast, the sum of MSI assignments for unknown outcomes was lower than RF majority prediction totals for both injury sources. Differences in ∑MSI/RF majority prediction ratios were apparent between known and unknown outcomes (entanglements: p=0.22, odds ratio=1.24 and vessel strikes: p=0.008, odds ratio=2.72, Fisher's exact test) (Table 4 and Figures 4, 5). For right and humpback whales combined, the sum of RF majority predictions of deaths or health declines for known outcome entanglements (98) was equal to the sum of MSI assignments (98) (Table 4). In contrast, for unknown outcome entanglements, the sum of RF majority predictions of deaths or health declines (223) was 25% higher than MSI assignments (178.8) (Table 4). For known outcome vessel strikes, the sum of RF majority predictions of deaths or health declines (76) closely agreed with the sum of MSI assignments (80) (Table 4). In contrast, the sum of RF majority Dead.Decline predictions for unknown outcome vessel strikes (29) was much higher than the sum of MSI assignments (11.2), but was based on only 23 injury cases, excluding 7 cases where prorated vessel strike MSI assignments totaling 3.2 whales were made (Figure 5 and Table 4).
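The comparisons of totals reported above can be reproduced with simple arithmetic. The following Python sketch uses only the totals quoted in the text. The dictionary layout and function names are hypothetical, and the Fisher's exact tests are not reproduced because the underlying contingency tables are not given here.

```python
# Totals quoted in the text for right and humpback whales combined.
totals = {
    ("entanglement", "known"):    {"rf_majority": 98,  "msi_sum": 98.0},
    ("entanglement", "unknown"):  {"rf_majority": 223, "msi_sum": 178.8},
    ("vessel strike", "known"):   {"rf_majority": 76,  "msi_sum": 80.0},
    ("vessel strike", "unknown"): {"rf_majority": 29,  "msi_sum": 11.2},
}

def msi_to_rf_ratio(row):
    """Sum of MSI assignments divided by the RF majority prediction total."""
    return row["msi_sum"] / row["rf_majority"]

def percent_higher(row):
    """How much higher the RF total is than the MSI sum, in percent."""
    return 100 * (row["rf_majority"] / row["msi_sum"] - 1)

for (source, outcome), row in totals.items():
    print(f"{source:13s} {outcome:7s} ratio = {msi_to_rf_ratio(row):.2f}")

# The unknown outcome entanglement total is about 25% higher under RF.
print(round(percent_higher(totals[("entanglement", "unknown")])))  # 25
```

Ratios near 1 for known outcomes and well below 1 for unknown outcomes summarize the pattern described in the text.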
Observed cases (rows) and classified cases (columns) are shown, with classification accuracy and 95% confidence intervals for out-of-bag samples, correct classification rates expected by chance, given balanced sample sizes used in tree construction, and the binomial probability of observing the number of correct classifications for each class, assuming an expected 50% accuracy rate for each class.

DISCUSSION
Current large whale serious injury assessment uses a decision-tree set of rules applied to cases by biologists and assignment of non-serious vs serious injuries is based on observed ratios of deaths to total cases and expert opinion from 2004-2008 data (Andersen et al., 2008; NOAA, 2012b). In this study, we have automated this process by training a Random Forest classification model and using additional data. The results of our model suggest improved performance with some notable enhancements over the previous method. The clearest insights come from comparing known and unknown outcome MSI assignments using current injury protocols and Random Forest classification trees. Most notable was that MSI assignments made by biologists using current protocols were in excellent agreement with RF predictions of health status for known outcome cases. This isn't unexpected, given the high assignment accuracy rates for known outcomes. However, for unknown outcomes, RF probability assignments of deaths and health declines were consistently higher than assigned MSI values, suggesting that current injury protocols may underestimate true impacts of entanglements and vessel strikes for cases lacking longer observation periods needed to document serious injuries or gear burdens not visible during brief encounters.
Predicted entanglement serious injury probabilities from RF models were similar to MSI assignments from biologists for known outcomes (Figure 4). In 135 known outcome entanglements where an MSI = 0 was assigned, the median RF probability of Dead.Decline was 0.138 [95% CI = 0-0.604] and only 7/135 (5%) of these cases had RF Dead.Decline probabilities > 0.5 that would imply a serious injury. In contrast, where biologists assigned an MSI value of 0 to unknown outcome entanglements, RF probabilities of Dead.Decline were > 0.5 for 30/72 (42%) of the cases, implying that entanglement risk for unknown outcomes using current protocols is underestimated. Known outcome entanglements assigned an MSI = 1 (n=98) had a median RF Dead.Decline probability = 0.92 [95% CI = 0.36-1.00] and only 7/98 (7%) of cases were assigned a Dead.Decline probability < 0.5 (Figure 4 and Table 4). For unknown outcome entanglements where an MSI = 1 was assigned, the median RF probability of Dead.Decline was 0.873 [95% CI = 0.548-0.977] and 2/61 (3%) of cases were assigned a Dead.Decline probability < 0.5. For unknown outcome entanglements lacking detail on the amount and configuration of fishing gear or injury severity with assigned MSI values = 0.75, the median predicted RF probability of 'Dead.Decline' was 0.664 [95% CI = 0.345-0.941] for both species combined, with 23/157 (15%) cases assigned a 'Dead.Decline' probability < 0.5, implying a non-serious injury (Figure 4). Evaluating the robustness of the '0.75 MSI' entanglement proration factor used in current injury protocols is not straightforward, as it is derived from a sample of 114 known outcome entanglement cases from 2004-2008, where 85 resulted in deaths or health declines (NOAA, 2012a; NOAA, 2012b), while the current study uses twice as many cases (233 over the period 2000-2018) to generate predictive models. Current injury policy reflects entanglement and vessel strike conditions encountered by whales during the 2004-2008 period.
If there are changes in the severity of injuries incurred over time, for example, if rope strength involved in entanglements increases over time, then the severity of entanglements may be greater in more recent years. This is reflected in known-outcome cases and therefore, is included in RF models, but may mean that current injury policies could underestimate entanglement risk for unknown outcomes. Agreement between MSI assignments and RF predictions for known outcome entanglements reflects that the variables used in each method allow for robust injury assessment, but performance of the current injury protocol for unknown outcomes assigned MSI values = 0 appears to underestimate risk of death or health decline compared with RF predictions.
For vessel strikes, differences in MSI values assigned by biologists and RF predictions for unknown outcomes were apparent, particularly in cases where biologists assigned an MSI value of zero (Figure 5). The median RF probability of Dead.Decline for unknown outcome vessel strikes determined to be non-serious injuries with an MSI value = 0 was 0.57 [95% CI = 0.03-0.98] and 16/23 (69%) of these cases were assigned a 'Dead.Decline' probability > 0.5, implying that they were serious injuries (Figure 5 and Table 4). This suggests that the assigned risk of death or health decline for unknown outcome vessel strikes is underestimated with current protocols. Higher RF predictions of Dead.Decline for unknown outcome vessel strikes assigned an MSI value = 0 suggest that such 'non-serious injury' cases may lack longitudinal evidence of a health decline that may take many months to appear (Moore and Van der Hoop, 2012), or that evidence of a serious injury is not observed due to the brief encounter with the whale, as is typical with vessel strikes. For unknown outcome vessel strikes where biologists assigned an MSI value = 1, the corresponding RF probabilities of Dead.Decline were 0.869 [95% CI = 0.469-0.943] and 7/8 (88%) of these cases were assigned Dead.Decline probabilities > 0.5 (Figure 5, Table 4). Five cases involved observations of deep propeller lacerations, two cases involved fast vessels much larger than the whale (> 10 kts and >65 ft), and the last case was a dependent calf of a mother killed by a vessel strike, which under current protocols, are automatically assigned as serious injuries with an MSI value = 1 (NOAA, 2012a; NOAA, 2012b).
In entanglement RF models, the importance of the variable 'gear.free' was evident not only from its large contribution to OOB error rate reduction in known outcome humpback whale cases, but also in how it influences predictions for unknown outcomes. For humpback whale entanglements with unknown outcomes, 41/257 were positive for the variable 'gear.free'. Almost all gear free cases (37/41) had a majority prediction of 'Recovered', with a mean recovery probability = 0.791. In contrast, the mean recovery probability for 216 cases lacking the variable 'gear.free' was 0.315, with 194 cases predicted as 'Dead.Decline' and 22 as 'Recovered'. For right whale entanglements, unknown outcomes were limited to 33 cases and the most important variable was 'healing'. The mean recovery probability was 0.634 for 7 cases including reference to healing, compared to a mean recovery probability of 0.253 for 26 cases lacking reference to healing.
Automated data-mining of case narratives to identify and generate predictor variables poses challenges, including inadvertently identifying variable presence when reference to it actually indicates absence. Words such as 'blood' are expected to contain important clues about health status. However, we found the word 'blood' was as likely to be used to indicate an absence of blood ("no blood observed in water") as to note its presence ("small amount of blood observed in water"). This requires editing of narratives to make presence/absence declarations explicit or expanding search phrases to account for wider variation in language use. Ultimately, we found the number of cases using the word 'blood' was relatively small, its utility in predictive models was negligible, and its presence was largely associated with necropsy cases, thus we omitted it from models. Other challenges include recognition of shorthand vernacular phrases, such as the word 'prop' to reference propeller marks on whales. This necessitates data-mining for the use of 'prop', while ensuring that words like 'proper' are not mistaken for this variable. Application of the RF method to case narratives will require some combination of language standards to be implemented for narratives, in addition to careful editing of existing narratives to address confounding issues. Despite these challenges, RF model accuracy was much higher than expected by chance for both entanglement and vessel strike cases.
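The two pitfalls described above, negated mentions and shorthand that overlaps other words, can be illustrated with a Python sketch. The patterns are hypothetical and deliberately crude: the negation check only catches 'no' followed by at most two words before 'blood', and it is not the approach used in the study, which omitted 'blood' from the models.

```python
import re

# Word boundaries keep 'prop' from matching inside 'proper' or 'property'.
PROP = re.compile(r"\bprop(?:s|eller|ellers)?\b", re.IGNORECASE)

BLOOD = re.compile(r"\bblood\b", re.IGNORECASE)
# Crude negation heuristic: 'no blood', 'no visible blood', and similar.
NEGATED_BLOOD = re.compile(r"\bno\s+(?:\w+\s+){0,2}?blood\b", re.IGNORECASE)

def has_prop(narrative: str) -> bool:
    """True when the narrative refers to a propeller, including 'prop'."""
    return bool(PROP.search(narrative))

def has_blood(narrative: str) -> bool:
    """True when blood is mentioned and the mention is not negated."""
    return bool(BLOOD.search(narrative)) and not NEGATED_BLOOD.search(narrative)

print(has_prop("deep prop cuts along the back"))             # True
print(has_prop("proper documentation was not available"))    # False
print(has_blood("small amount of blood observed in water"))  # True
print(has_blood("no blood observed in water"))               # False
```

Heuristics like these reduce false positives but cannot replace explicit presence/absence wording in the narratives themselves.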

CONCLUSION
Our RF models include entanglements and vessel strikes of two species, but many large whale species are injured or killed from these sources. Our focus on right and humpback whales reflects that they are commonly involved in entanglement and vessel strike cases with long-term sighting histories, facilitated by ease of individual identification. In contrast, more pelagic species such as blue and fin whales are difficult to individually track over time, due to their offshore distribution and challenges in photographically identifying individuals. Although entanglement and vessel strike impacts may differ by species, there are insufficient data to generate species-specific models for all species. Models based on right and humpback whale observations likely have good utility for predicting the health status for other species, as 'severe' and 'superficial' injury narratives are likely to be equally informative across species. Current NOAA serious injury policy utilizes a unified decision-tree framework applied by biologists that includes multiple species. Our models represent a 'proof-of-concept' that injury narratives can be data-mined for variables used to predict health status with high accuracy. Such models may be tailored to multispecies assessments by pooling all species into one entanglement or vessel strike model, though species represented by only a few known outcome cases will contribute little to overall inference.

ETHICS STATEMENT
Ethical review and approval was not required for the animal study because whale injury cases are opportunistically observed. They are not experimental subjects.

AUTHOR CONTRIBUTIONS
JC conceptualized the study, conducted the analyses, and created the tables and figures. AH maintained data from Atlantic injury records that formed the foundation of the data format used in the study and assigned and/or reviewed serious injury determinations from both Pacific and Atlantic cases. JC and AH wrote the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.

FUNDING
NOAA Fisheries Northeast Fisheries Science Center supported AH. NOAA Fisheries Southwest Fisheries Science Center supported JC. Funding from the Southwest Fisheries Science Center was used for publication fees.