Using Ancient DNA Analysis and Radiocarbon Dating to Determine the Provenance of an Unusual Whaling Artifact

Natural history collections provide a critical temporal view of past biodiversity and are instrumental in the study of extinct populations. However, the value of historical specimens relies on correct species identification, collection date and collection locality. The Australian National Maritime Museum (ANMM) holds an unusual artifact – an electric lamp made from a dried whale penis – with unknown age, species-of-origin and collection locality. We used ancient DNA methods to generate a partial mitochondrial DNA (mtDNA) genome sequence to establish the identity and provenance of the whale, and accelerator mass spectrometry (AMS) radiocarbon dating to determine the approximate year of death. Mitochondrial DNA sequences from the 16S rRNA gene and the control region indicate that the specimen belonged to a sperm whale (Physeter macrocephalus) and a modern radiocarbon age suggests it was collected post-1950s. We were unable to determine the collection locality of the whale due to the very broad geographic distribution of its mtDNA haplotype. Our results suggest the specimen was possibly collected as a souvenir during post-war whaling, where nearly 30,000 male sperm whales were killed annually. This study supports and extends previous research that applies ancient DNA and radiocarbon dating techniques to enhance the value of natural history collections, by identifying the species-of-origin and age of historical specimens.


INTRODUCTION
Natural history museum collections provide a rich source of information for the study of evolution and systematics, and provide a critical temporal view of past biodiversity, adaptation and extinction (Allmon, 1994; Suarez and Tsutsui, 2004). However, the value of museum collections relies on correct species identification, collection date and collection locality (Hall, 1974; Goodwin et al., 2015). Missing or improper labeling and identification is common in specimens collected as trophies or curiosities, or where biological specimens were incorporated into objects such as clothing, furniture and tools. There are ways to recover sample identification and collection date using morphological characteristics (Eyualem and Blaxter, 2003) and radiocarbon dating (Cerling et al., 2016). However, collection locality is generally more difficult to establish. Ancient DNA analysis can be used to identify the species-of-origin and collection locality using phylogeographic analysis of genetic material recovered from the unknown specimen (Hartnup et al., 2011; Bi et al., 2013; Besnard et al., 2015; Thomas et al., 2017; Bastian et al., 2018). Combining specimen genetic data with more traditional sources of evidence, such as archives, specimen labels, and evidence from historical taxidermy, can help develop robust predictions of specimen provenance.
Whales have been exploited by humans for thousands of years for their valuable meat, oil and blubber (Berta et al., 2015). Commercial hunting commenced in the tenth century in Europe (Reeves, 2018) and fifth century in Japan (Kasuya, 2018), and spread worldwide in the nineteenth century. Whale hunting became industrialized in the 1860s with the invention of cannon-fired harpoons and steam-powered ships (Hjort, 1932; Reeves, 2018). During the twentieth century whale harvesting increased dramatically, especially for fin (Balaenoptera physalus), blue (Balaenoptera musculus), sei (Balaenoptera borealis), and minke (Balaenoptera spp.) whales. It is estimated that 2.9 million large whales were caught and killed during the twentieth century (Rocha et al., 2015). Some whale species have never been able to fully recover from this and require intense conservation to maintain stable population sizes (Jackson et al., 2016). Estimates show that the number of humpback whales (Megaptera novaeangliae) in the North Atlantic Ocean dwindled from 240,000 to 9,000 during the twentieth century (Roman and Palumbi, 2003). During this time whales were hunted primarily as a source of food and oil. However, baleen, the filter feeding system from baleen whales such as right whales (Eubalaena spp.) and humpback whales, was used in a variety of nineteenth century products, including carriage springs, umbrella ribs, corset stays and fishing rods (Lauffenburger, 1993). Whale bones and teeth were used for art (scrimshaw), tools and utensils (Dyer, 2018). These secondary uses of whale body parts have resulted in a large number of historical artifacts, kept and traded as antiques or stored and displayed in museums as part of social history exhibits associated with whaling (Eastop and Mcewing, 2005).
DNA recovered from historical artifacts made of baleen, whale bone and teeth can provide provenance and other contextual information that enhances an object's social and historical value (Solazzo et al., 2017). DNA from these artifacts can also aid in population genetic studies to increase our understanding of the effects of whaling, because they provide crucial temporal and geographic sampling (Baker and Clapham, 2004; Eastop and Mcewing, 2005; Sinding et al., 2012, 2016). Pichler et al. (2005) and Sinding et al. (2012) have shown that it is possible to extract sufficient DNA from whale baleen samples to be used for population genetic studies and detail methods to be used for minimally destructive analysis of baleen artifacts. Sinding et al. (2016) successfully developed an XY homolog PCR assay for molecular sexing of historical baleen whale artifacts, allowing for a deeper contextual understanding of the social aspect of whaling. The first example of sequencing mitochondrial genomes from eighteenth century baleen artifacts was reported by Eastop and Mcewing (2005). Mitochondrial DNA extracted from the baleen-stiffened stomacher of a deliberately concealed eighteenth century garment was found to be from an extinct, previously undescribed, North Atlantic lineage of right whales (Eubalaena glacialis) (Eastop, 2006). This result demonstrates how deeper knowledge of the history of museum specimens can inform understanding of current diversity.
The Australian National Maritime Museum (ANMM) holds an unusual whale artifact: an electric lamp made from the dried penis of an unidentified whale (Figure 1). The whale penis is mounted on a wooden plinth with a standard light bulb, cord and plug. It is a highly unusual example of the way whale products were souvenired, fetishized and converted into contemporary objects. The lamp, like scrimshaw, represents an important historical artifact that reflects the cultural and social aspects of the whaling industry and whalers themselves. The collection date, species-of-origin and collection locality of the lamp are unknown. We used a combination of radiocarbon dating and ancient DNA analysis to attempt species identification and to determine collection locality and date.

Sample
We obtained a piece (5 mm × 5 mm) of dried tissue from the whale lamp (accession number 00042380) in 2017 and sent it to the University of Adelaide for DNA testing.

Precautions Against Contamination
We controlled for contamination of the historic museum sample with contemporary DNA and previously amplified mitochondrial DNA (mtDNA) PCR products by conducting all pre-PCR work at dedicated ancient DNA facilities at the Australian Centre for Ancient DNA, University of Adelaide. No contemporary whale samples or DNA had ever been present in the pre-PCR laboratory. We conducted all DNA extraction, library preparation and PCR set-up in the pre-PCR laboratory physically separate from post-PCR laboratories and included the use of dead-air glove boxes fitted with internal UV lights, regular decontamination of all work areas and equipment with sodium hypochlorite, PPE including disposable laboratory gown, face mask, shoe covers and double-gloving and strict one-way movement of personnel (shower > freshly laundered clothes > pre-PCR laboratory > post-PCR laboratory).

DNA Extraction
We extracted DNA from a 3 mm × 3 mm piece of dried tissue. The tissue was rehydrated in 1 ml of 0.5 M EDTA for 2 h, finely minced with a sterile scalpel blade, and then digested and extracted using a Qiagen DNeasy Tissue Kit (Qiagen, Valencia, California, United States) as described by Austin et al. (2013), omitting the carrier RNA, and with a final elution volume of 80 µl. We included a negative extraction control during the DNA extraction procedure to monitor for contamination.

Library Preparation and High Throughput Sequencing (HTS)
We initially used hybridization enrichment and high throughput sequencing to attempt to generate a mitogenome sequence from the sample for species identification and to determine collection locality. Twenty microliters of DNA extract was converted to truncated Illumina libraries, with 7 bp dual internal barcodes, using the method described by Meyer and Kircher (2010). We included a library blank (no DNA) to monitor for contamination. A mtDNA genome hybridization enrichment was performed on 100 ng of the truncated library using biotinylated bison (Bison bison) mtDNA genome baits, following the method described by Richards et al. (2019). We have previously shown that bison mtDNA baits can be used to enrich mtDNA genomes from phylogenetically distant taxa including marsupials and birds (Richards et al., 2019). Because bison and whales (the suspected source of the artifact, based on morphology) are both members of the placental mammal superorder Cetartiodactyla, the bison mtDNA baits should successfully enrich whale mtDNA. The bison baits were made in-house following the methods of Richards et al. (2019).
Subsequently, and as a result of low mtDNA genome sequence recovery, we also produced a shotgun sequencing library from 150 ng of the truncated library by amplifying with primers that completed the Illumina adapter sequence as described by Richards et al. (2019). We used the shotgun sequencing data to investigate the cause of the poor mtDNA genome enrichment results. The shotgun library, enriched library and matched library blanks were sequenced on separate Illumina MiSeq runs using a 2 × 150 bp paired end kit at the Australian Genome Research Facility (AGRF, Melbourne).
In order to identify the species of whale we used BLAST searches (nucleotide database, Megablast, with default parameters) within Geneious 20.0.4 using the consensus mtDNA 16S rRNA gene sequence obtained from reads mapped to the sperm whale and blue whale mtDNA genomes. Based on BLAST searches and knowledge of cetacean phylogeny (Mcgowen et al., 2019), we aligned the consensus 16S rRNA sequence to representative mtDNA genomes from sperm whale (Physeter macrocephalus), pygmy sperm whale (Kogia breviceps), dwarf sperm whale (Kogia sima), killer whale (Orcinus orca), Northern right whale (Eubalaena glacialis), and blue whale (Balaenoptera musculus) to confirm species identity.
Based on the results of the initial BLAST search, we used the consensus mtDNA sequence from reads mapped to the most closely related species, in this case the sperm whale reference mitogenome, to try to identify the collection locality of the whale sample. Morin et al. (2018) identified 80 mitogenome haplotypes from 180 different sperm whales collected from the Pacific, Atlantic and Indian Oceans. We aligned the whale consensus mtDNA sequence to these 80 existing mitogenome sequences, trimmed all sites that contained an N or gap, and constructed a Minimum Spanning (Bandelt et al., 1999) haplotype network using PopART (Leigh and Bryant, 2015).
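The trimming step described above can be illustrated with a minimal Python sketch. It is not the Geneious or PopART workflow used in the study; the function names and the toy alignment are assumptions made purely for illustration. It drops every alignment column in which any sequence has an N or a gap, then counts the remaining differences between sequences, which is the quantity a haplotype network is built from.

```python
def trim_n_and_gap_columns(seqs):
    """Remove alignment columns where any sequence has 'N' or '-'."""
    length = len(seqs[0])
    keep = [i for i in range(length)
            if not any(s[i] in "N-" for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def count_differences(a, b):
    """Number of sites at which two equal-length aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy alignment (hypothetical sequences, not data from the study).
alignment = [
    "ACGTNACGTA",  # partial sample consensus with an uncalled base
    "ACGTAAC-TA",  # reference haplotype 1 with an alignment gap
    "ACGAAACGTA",  # reference haplotype 2
]

trimmed = trim_n_and_gap_columns(alignment)
diff_to_hap1 = count_differences(trimmed[0], trimmed[1])
diff_to_hap2 = count_differences(trimmed[0], trimmed[2])
```

Because columns are removed for all sequences at once, a partial consensus with many uncalled bases reduces the number of sites available for distinguishing reference haplotypes, so distinct haplotypes can collapse into one.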

Mitochondrial DNA PCR Amplification and Sanger Sequencing
Based on the HTS results, to further confirm species identification and to attempt to refine geographic collection locality we targeted an 89 bp (excluding primers) section of the mtDNA 16S ribosomal RNA gene and a 285 bp (excluding primers) section of the mtDNA control region, using standard PCR and Sanger sequencing. We selected the 16S rRNA region to confirm the hybridization enrichment results. The 285 bp segment of the control region was chosen as it contains 30 of the 31 phylogenetically informative variable sites identified in a 384 bp alignment of global sperm whale mtDNA control region sequences (Alexander et al., 2013) and 30 of 34 variable sites identified in a 619 bp alignment of global sperm whale mtDNA control region sequences by Alexander et al. (2016).
We designed two sets of PCR primers to amplify overlapping fragments of 190 bp and 199 bp for the control region, using an alignment of 41 sperm whale control region sequences from Alexander et al. (2016) (Fragment 1, forward primer: AGATAAATACAAACCCACAGTGCT, reverse primer: TAATACGAGCTTTCACTGATCG; Fragment 2, forward primer: ACACGCTATGTATAATAGTGCATTCAATT, reverse primer: GTTGCTGGTTTCACGCGGCA). The 16S primers were the generic mammal primers 16S6 and 16S7 from Poinar et al. (2001).
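As a quick check on the control region primers listed above, the following Python sketch reports basic properties (length, GC content, reverse complement) for each sequence. The dictionary keys and function names are illustrative assumptions; only the primer sequences themselves come from the text.

```python
# Control region primers as given in the text (any whitespace is stripped).
PRIMERS = {
    "frag1_forward": "AGATAAATACAAACCCACAGTGCT",
    "frag1_reverse": "TAATACGAGCTTTCACTGATCG",
    "frag2_forward": "ACACGCTATGTATAATAGTGCATTCAATT",
    "frag2_reverse": "GTTGCTGGTTTCACGCGGCA",
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def clean(seq):
    """Uppercase a primer and remove spaces left over from typesetting."""
    return seq.replace(" ", "").upper()


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(seq)))


summary = {
    name: {
        "length": len(clean(seq)),
        "gc": round(gc_content(seq), 2),
        "revcomp": reverse_complement(seq),
    }
    for name, seq in PRIMERS.items()
}

for name, props in summary.items():
    print(name, props["length"], props["gc"], props["revcomp"])
```

The reverse complement of each reverse primer is the form that would appear on the same strand as the forward primer in a control region alignment, which is convenient when checking primer positions by eye.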
We performed PCRs in 25 µl volumes containing 2 µl of DNA (3.82 ng/µl), 1× High Fidelity PCR Buffer (Invitrogen, California, United States), 250 µM of each dNTP, 400 nM of each primer (IDT), 2 mM MgSO4, 1 µg/µl RSA (Sigma) and 0.5 units of Platinum Taq DNA Polymerase High Fidelity (Invitrogen, California, United States). Thermocycling conditions were: denaturation at 94 °C for 2 min, followed by 50 cycles of 94 °C for 15 s, annealing at 55 °C for 15 s and extension at 68 °C for 30 s, with a final extension step at 68 °C for 10 min. We included a PCR no template control and the negative extraction control in each PCR attempt. Successful PCR amplifications were sent to AGRF (Adelaide) for bi-directional Sanger sequencing. Sequence chromatograms were edited and assembled using Geneious v11.0.4. We used a BLAST search (nucleotide database, Megablast, with default parameters) to confirm the identity of the 16S rRNA sequence and aligned the control region sequence to the 41 global control region haplotypes (384 bp) from Alexander et al. (2016). We trimmed the control region alignment to 285 bp and constructed a Minimum Spanning (Bandelt et al., 1999) haplotype network using PopART (Leigh and Bryant, 2015). DNA sequences are available on GenBank (accession numbers: MN563138 and MN548398).
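The quantities implied by this protocol can be worked out with simple arithmetic, sketched here in Python. The variable names are illustrative, and the cycling time counts only the programmed hold steps; instrument ramp times, which are not reported, are ignored.

```python
# Template DNA added to each 25 µl reaction (values from the text).
template_volume_ul = 2
template_conc_ng_per_ul = 3.82
template_ng = template_volume_ul * template_conc_ng_per_ul

# Programmed hold times in seconds (ramp times between steps not included).
initial_denaturation_s = 2 * 60
cycles = 50
per_cycle_s = 15 + 15 + 30  # denaturation + annealing + extension
final_extension_s = 10 * 60

total_hold_s = initial_denaturation_s + cycles * per_cycle_s + final_extension_s
total_hold_min = total_hold_s / 60

print(f"template per reaction: {template_ng:.2f} ng")
print(f"programmed hold time: {total_hold_min:.0f} min")
```

The high cycle number (50) is typical of protocols for degraded, low-copy templates, which is also why the no template and extraction controls in each PCR attempt matter.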

Carbon Dating
We sent a second sample of the whale specimen to the Rafter Radiocarbon Laboratory, GNS Science, New Zealand for radiocarbon dating (sample code R 41180/1) by accelerator mass spectrometry (AMS). A 295.5 mg sub-sample was pretreated by Soxhlet extraction with a standard series of solvents increasing in polarity (hexane, propanol and ethanol), aimed to remove lipids, waxes, resins and preservatives that may have been applied to the sample material. This was followed by repeated washes in dH2O and then heating to 85 °C in 0.1 M HCl, aimed to further remove any acid soluble, carbon-bearing compounds, including any atmospheric CO2 absorbed onto the sample surface. δ13C was measured by isotope-ratio mass spectrometry. The AMS date was calibrated to calendar years (BP) using the Post-bomb Marine calibration curve (Reimer et al., 2009), which accounts for radiocarbon fluctuations in the ocean post-World War II.

Frontiers in Ecology and Evolution | www.frontiersin.org

RESULTS AND DISCUSSION
DNA and radiocarbon dating results suggest that the whale lamp is modern (post-1950) and made from a sperm whale (Physeter macrocephalus). Phylogeographic analysis of the partial mitogenome sequence and the control region sequence was unable to identify a geographic origin for the whale.

Mitochondrial DNA Genome Enrichment -Species Identification and Geographic Provenancing
Mitochondrial DNA genome enrichment and HTS yielded 2,768,286 retained reads after quality filtering. Only 1414 (0.094%) and 847 (0.067%) unique reads mapped to the sperm whale and blue whale mtDNA reference genomes, respectively. We observed an elevated frequency of purines at the position immediately preceding the start of sequencing reads, as well as an increase of C-to-T and G-to-A substitutions at the start and end of sequencing reads, respectively (Supplementary Figure 1). These patterns are characteristic of degraded DNA (Jonsson et al., 2013). The library blank returned only 1772 retained reads, none of which mapped to the sperm and blue whale mtDNA reference genomes, indicating that the results obtained from the whale sample were not due to contamination during the laboratory process. We focused on the 16S rRNA gene for species identification of the sample. For reads that mapped to the sperm whale or blue whale reference mtDNA genomes we recovered an 898 and 796 bp contiguous consensus sequence, respectively, for the 16S rRNA gene. BLAST searches revealed that both consensus sequences were identical to the sperm whale 16S rRNA gene, and at least 6.2% divergent from all other whale sequences. Among six species of toothed and baleen whale there were 140 variable sites in the 898 bp of aligned mtDNA 16S rRNA sequence (Table 1). The sequence from the whale sample was identical to the sperm whale but 56-85 substitutions different from the pygmy and dwarf sperm whale, killer whale, Northern right whale and blue whale (Table 1). These results indicate that the sample came from a sperm whale. Despite our previous research (Richards et al., 2019) showing that bison mtDNA baits are effective across distantly related taxa, we were unable to recover a complete mtDNA genome from the sample.
Only 5111 bp of the sperm whale reference mtDNA genome was covered by three or more reads (average fragment length: 92 bp), so we were only able to call a consensus base for 31% of the mitogenome. The partial whale mitogenome was 1 bp different from sperm whale haplotype 47 (Morin et al., 2018, a single individual from the NW Pacific) and 1 bp different from a second sperm whale haplotype (representing 35 collapsed haplotypes from Morin et al., 2018) from 96 individuals collected in the Atlantic, Indian and Pacific Oceans (Figure 2A). Due to the very broad distribution of this haplotype, which includes >50% of the individuals from Morin et al.'s (2018) study, we were unable to identify the ocean from which the sperm whale was collected.
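The depth threshold described above (a consensus base is called only where three or more reads cover a site) can be sketched in Python as follows. This is an illustrative simplification, not the consensus algorithm of the software used in the study: the majority-rule base choice, the function name and the toy pileup are all assumptions.

```python
from collections import Counter


def call_consensus(pileup, min_depth=3):
    """Call one base per reference position from a list of observed bases.

    pileup: list of strings; each string holds the read bases covering one
            reference position. Positions with fewer than min_depth reads
            are reported as 'N' (no call).
    """
    consensus = []
    for bases in pileup:
        if len(bases) < min_depth:
            consensus.append("N")
        else:
            # Majority rule (an assumption made for this sketch).
            consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


# Toy pileup for five reference positions (hypothetical data).
toy_pileup = ["AAA", "CC", "GGGT", "", "TTTT"]
toy_consensus = call_consensus(toy_pileup)
called_fraction = sum(base != "N" for base in toy_consensus) / len(toy_consensus)
```

Applied across a whole mitogenome, the called fraction corresponds to the proportion of the reference for which a consensus base could be reported (31% in this study).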

Shotgun Sequencing -Low Endogenous DNA Content
We used the shotgun sequencing data to investigate the cause of the poor mtDNA genome results. Shotgun sequencing yielded 2,868,424 retained reads after quality filtering, with only 4909 (0.17%) unique reads mapping to the sperm whale reference genome. The endogenous mtDNA content was very low (0.0016%, 47 unique reads) compared to the endogenous nuclear content (0.17%, 4862 reads). The mtDNA enrichment, via hybridization, was 57-fold (from 0.0016 to 0.094%) which is within the range (2-828) of fold-enrichment previously observed across a range of samples, preservation conditions, taxa and baits used (Templeton et al., 2013; Mohandesan et al., 2017; Bover et al., 2019; Richards et al., 2019). Therefore, the poor mtDNA genome recovery from the sample appears to be related to its very low endogenous mtDNA content.
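The endogenous content and fold-enrichment figures follow directly from the read counts, as this short Python sketch shows. Variable names are illustrative; the shotgun read counts and the 0.094% post-enrichment value are taken from the text.

```python
# Shotgun library read counts reported in the text.
shotgun_retained_reads = 2_868_424
shotgun_mtdna_reads = 47
shotgun_nuclear_reads = 4862

# Endogenous content as a percentage of retained reads.
mtdna_pct_shotgun = 100 * shotgun_mtdna_reads / shotgun_retained_reads
nuclear_pct_shotgun = 100 * shotgun_nuclear_reads / shotgun_retained_reads

# Percentage of unique reads mapping to the sperm whale mitogenome
# after hybridization enrichment, as reported in the text.
mtdna_pct_enriched = 0.094

fold_enrichment = mtdna_pct_enriched / mtdna_pct_shotgun

print(f"endogenous mtDNA (shotgun): {mtdna_pct_shotgun:.4f}%")
print(f"endogenous nuclear (shotgun): {nuclear_pct_shotgun:.2f}%")
print(f"fold enrichment: {fold_enrichment:.0f}")
```

Even a 57-fold gain leaves fewer than one read in a thousand on target when the starting proportion is this low, which is consistent with the partial mitogenome recovered.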

PCR and Sanger Sequencing -Species Identification and Geographic Provenancing
We successfully amplified and sequenced 89 bp of the mtDNA 16S rRNA gene and 285 bp of the mtDNA control region using traditional PCR and Sanger sequencing. No amplicons were generated from the extraction or PCR negative controls. The 16S rRNA sequence was identical to the sequence obtained via hybridization enrichment and to the sperm whale reference mtDNA genome (KU891373), and at least seven substitutions different from all other available cetacean sequences (Table 1). This result corroborates the findings from the HTS data, confirming that the sample belongs to a sperm whale. Alexander et al. (2016) reported 30 variable sites in the 285 bp of control region sequence among 41 sperm whale mtDNA haplotypes sampled from the Atlantic, Indian and Pacific Oceans. Sperm whales, despite being migratory animals, show moderate mtDNA structure due to female philopatry: 33 of 39 control region haplotypes with known collection locality have only been found in one ocean basin, two haplotypes (J, BB) occur in two oceans and only four haplotypes (A, B, C, N) are found in all three oceans (Table 2 and Figure 2B). The 285 bp control region sequence from the whale lamp was identical to the most common sperm whale haplotype (haplotype A, Table 2 and Figure 2B), which has been found in 475 of 1546 sperm whales (31%) previously sampled across all three oceans by Alexander et al. (2013). Thus, as a consequence of the very broad distribution of haplotype A we are unable to identify the geographic origin of the whale sample to a specific ocean basin.
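The reasoning behind this conclusion can be expressed as a small lookup, sketched below in Python. A matching haplotype is geographically informative only if it has been recorded in a single ocean basin. The table is deliberately incomplete: haplotypes A, B, C and N are listed as occurring in all three oceans, as stated in the text, while `H_example` is a hypothetical single-ocean haplotype included only to show the contrasting case.

```python
ALL_OCEANS = frozenset({"Atlantic", "Indian", "Pacific"})

# Haplotypes A, B, C and N occur in all three oceans (from the text).
# "H_example" is hypothetical and stands in for a single-ocean haplotype.
haplotype_oceans = {
    "A": ALL_OCEANS,
    "B": ALL_OCEANS,
    "C": ALL_OCEANS,
    "N": ALL_OCEANS,
    "H_example": frozenset({"Pacific"}),
}


def infer_ocean(haplotype, table):
    """Return the ocean basin if the haplotype is known from exactly one."""
    oceans = table.get(haplotype, frozenset())
    if len(oceans) == 1:
        return next(iter(oceans))
    return None  # unknown haplotype, or found in more than one ocean


lamp_origin = infer_ocean("A", haplotype_oceans)

# Proportions reported in the text.
haplotype_a_share = 475 / 1546       # sampled whales carrying haplotype A
single_ocean_share = 33 / 39         # haplotypes restricted to one ocean
```

Under this logic most control region haplotypes would have pointed to a single ocean basin, but the lamp carries one of the four that cannot.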

AMS Radiocarbon Dating -Specimen Age
AMS radiocarbon dating of the whale sample yielded a modern date (14C fraction modern = 1.0109 ± 0.0026, δ13C = −13.5 ± 0.2). Radiocarbon in the ocean post-1950s is highly variable making calibration to calendar year (BP) difficult in the case of marine animals. Post-bomb marine calibration indicates a calendar age between 1962 and 1963, only 20 years before whaling was banned.
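To see why a fraction modern above 1 is reported as a "modern" date, the standard conversion from fraction modern to conventional radiocarbon age can be applied. This is a worked illustration using the Libby mean life of 8033 years, not a calculation reported by the dating laboratory:

```latex
t_{^{14}\mathrm{C}} = -8033 \,\ln\!\left(F^{14}\mathrm{C}\right)
                    = -8033 \,\ln(1.0109)
                    \approx -87\ \text{yr}
```

A negative conventional age means the sample holds more 14C than the pre-1950 reference standard, which is expected only for carbon fixed after atmospheric nuclear testing raised 14C levels.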

CONCLUSION
From two independent DNA analyses of the whale lamp we confirm that the specimen was collected from a sperm whale (Physeter macrocephalus) sometime after the 1950s. Using a partial mitogenome sequence and control region sequences we were unable to identify from which ocean the whale was sourced. Interestingly, while the HTS methods revealed low endogenous content of the sample, traditional PCR and Sanger sequencing yielded results in a fraction of the time and cost, demonstrating that these methods are still useful and appropriate for specific applications. Morin et al. (2018) described world-wide mitogenome phylogeography of sperm whales from 180 samples and identified 17 different haplotypes that have the same mtDNA control region sequence as the whale lamp. Therefore, additional phylogeographic resolution may be obtained if a complete mitogenome could be generated from the sample. However, this would require significant additional resources, given the very low endogenous content of the sample, and would require sequencing the mtDNA genome via multiple, short, overlapping PCR amplicons or by multiple rounds of hybridization enrichment and increased sequencing effort. An alternative option to resolve collection location of the sperm whale sample would be to investigate markers in the nuclear genome. Hancock-Hanser et al. (2013) propose a way of discovering nuclear loci for population-wide studies in cetaceans using cross-species capture. However, this approach is currently very challenging given the very low endogenous DNA content of the sample and, most importantly, the lack of suitable nuclear single nucleotide polymorphism reference population data for sperm whales.
Hunting of sperm whales increased after WWII and continued until 1988 when the International Whaling Commission introduced a moratorium (Whitehead, 2018). Post-war whaling killed up to 30,000 sperm whales a year (Whitehead, 2018) and the industry targeted large males, thus it is perhaps not surprising that the whale lamp, fitted with an electric light, was made from a male whale killed in the 1950s or 1960s.
Whales are an extremely important part of the marine ecosystem and were hunted for thousands of years. Increasing the number of whale samples associated with a collection location would allow an in-depth analysis of genetic and population structure, improving our understanding of these animals both past and present and helping to determine the full impact of whaling. As shown in Eastop and Mcewing (2005), understanding the context of whaling artifacts can provide an insight into historic whale populations that contemporary diversity does not represent.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in GenBank (accession numbers MN563138 and MN548398).

ETHICS STATEMENT
Ethical review and approval was not required for the animal study because only museum specimens were included in the analysis.

AUTHOR CONTRIBUTIONS
CM, RD, and JA contributed to the conception and design of the study. CM and JA performed the genetic laboratory work. CM and BL performed the bioinformatics analysis. CM wrote the first draft of the manuscript. CM, BL, RD, and JA wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.