Access to Marine Genetic Resources (MGR): Raising Awareness of Best-Practice Through a New Agreement for Biodiversity Beyond National Jurisdiction (BBNJ)

Better scientific knowledge of the poorly-known deep sea and areas beyond national jurisdiction (ABNJ) is key to its conservation, an urgent need in light of increasing environmental pressures. Access to marine genetic resources (MGR) for the biodiversity research community is essential to allow these environments to be better characterised. Negotiations have commenced under the auspices of the United Nations Convention on the Law of the Sea (UNCLOS) to develop a new treaty to further the conservation and sustainable use of marine biological diversity in ABNJ. It is timely to consider the relevant issues with the development of the treaty underway. Currently uncertainties surround the legal definition of MGR and scope of related benefit-sharing, against a background of regional and global governance gaps in ABNJ. These complications are mirrored in science, with recent major advances in the field of genomics, but variability in handling of the resulting increasing volumes of data. Here, we attempt to define the concept of MGR from a scientific perspective, review current practices for the generation of and access to MGR from ABNJ in the context of relevant regulations, and illustrate the utility of best-practice with a case study. We contribute recommendations with a view to strengthen best-practice in accessibility of MGR, including: funder recognition of the central importance of taxonomy/biodiversity research; support of museums/collections for long-term sample curation; open access to data; usage and further development of globally recognised data standards and platforms; publishing of datasets via open-access, quality controlled and standardised data systems and open access journals; commitment to best-practice workflows; a global registry of cruises; and lastly development of a clearing house to further centralised access to the above. We argue that commitment to best-practice would allow greater sharing of MGR for research and extensive secondary use including conservation and environmental monitoring, and provide an exemplar for access and benefit-sharing (ABS) to inform the biodiversity beyond national jurisdiction (BBNJ) process.

INTRODUCTION
Areas beyond national jurisdiction (ABNJ), the vast majority of which are deep sea, represent the largest environments on the planet, yet are the least understood (Ramirez-Llodra et al., 2010). The vast majority of the seafloor is unmapped at high resolution, and the deep sea is very poorly characterised compared to other marine ecosystems (Higgs and Attrill, 2015; Glover et al., 2018). Sampling these environments presents substantial technical challenges, and species discovery rates are significantly lower as a result (Webb et al., 2010; Higgs and Attrill, 2015). High proportions of species collected from these environments are new to science, with estimates varying from 35 to 95% (Poore et al., 2015). Taxonomy, the science of describing, naming and classifying biodiversity, underpins all biological research, and is therefore necessary before other research can take place (Glover et al., 2018). Streamlined access to deep-sea collections and data is critical to allow taxonomic and other biological research in these poorly known environments and enable their future management in the light of increasing environmental pressures.
The deep sea faces a multitude of environmental challenges, such as cumulative impacts of climate change (Levin, 2018; IPBES, 2019), and potential new ones from seabed mining, with recent reviews suggesting mining operations could result in net biodiversity losses (Niner et al., 2018). While some mechanisms exist for monitoring and environmental protection in ABNJ, including environmental impact assessments and area-based management tools such as marine protected areas, governance gaps are evident (Wright et al., 2018b), and no overarching framework exists for the allocation of marine protected areas in ABNJ (De Santo, 2018). In addition, there is a disconnect between regional and global governance in ABNJ (Gjerde et al., 2018). For example, the South Pacific Regional Fisheries Management Organisation (SPRFMO) recently revised bottom fishing rules to allow fishing to potentially continue even when encountering vulnerable marine ecosystems as assessed by observers, rather than implementing United Nations General Assembly recommendations. Overall, the current international ocean governance framework is poorly equipped to conserve and protect biodiversity beyond national jurisdiction (BBNJ) (Gjerde et al., 2018). Recognising these legal gaps, the United Nations General Assembly established a Preparatory Committee by resolution 69/292 for the development of an international legally binding treaty under the United Nations Convention on the Law of the Sea (UNCLOS) for 'the conservation and sustainable use of marine biological diversity of areas beyond national jurisdiction'. Further to these aims and in accordance with UN resolution 72/249, negotiations have commenced, and the third Intergovernmental Conference on BBNJ will be held at the UN in August 2019. At this conference, deliberations on the recently released draft treaty text will take place.
These negotiations, also known as the BBNJ process, are focussed on four main components: marine genetic resources (MGR), area-based management tools, capacity building and technology transfer, and environmental impact assessments (Tiller et al., 2019). Taxonomy provides a unifying element to all of these components: MGRs are in essence marine biodiversity, environmental impact assessments require knowledge of the species that live in a habitat being impacted or under assessment, and area-based management tools depend on knowledge of species connectivity to inform spatial planning, as well as knowledge of the biodiversity present in a given region. Taxonomy is also key to capacity development, as it is fundamental to all scientific research and therefore to building research capacity. While MGR are only one item of four in the negotiations, they are currently receiving the most attention. MGR and the sharing of benefits arising from their utilisation have long presented a challenge in the BBNJ process (Jorem and Tvedt, 2014; Tladi, 2015; Leary, 2019), key questions being the inclusion or exclusion of genetic or sequence data in the definition of MGR, and the scope of benefit-sharing, whether monetary or non-monetary.
Meanwhile, there are complexities in terms of current processes for the handling of marine samples and associated data in the scientific community. Biological research is becoming increasingly computationally intensive, and both the capacity of genomics and the resulting generation of data are ever expanding. The accessibility of samples and data has been improving in response to an increased need for access. However, data are not always fully open if published in restricted access papers or in formats which limit their use, and in some cases are not available at all (Bax et al., 2016). In the science community, data standards and sampling protocols are available (Wieczorek et al., 2012; Glover et al., 2015; Clark et al., 2016), but awareness and adoption are patchy. Poorly managed samples and data can hamper research progress through loss of knowledge and potentially lead to unsupported policy decisions. Further, commercial entities like the deep-sea mining and fisheries sectors often have a culture of withholding data (Wright et al., 2018a). While access and benefit-sharing (ABS) is happening now through current data and sample availability, there is room to improve. Best practice, or usage of standardised workflows and quality assurance/quality control measures which assure quality of output, can assist here.
There is a recognised need to build a repository of best-practices to support both biodiversity research and governance for BBNJ (Pearlman et al., 2017; Muller-Karger et al., 2018). Synergies across key organisations could be achieved here through best-practice marine sample and data collection.
To support marine biodiversity research and inform the BBNJ process, we discuss current practices in data and sample management from ABNJ, and outline recommendations to strengthen and raise awareness of best-practices. In section 'Conceptualising MGR', we conceptualise MGR samples and data from a scientific perspective, and discuss the BBNJ process in relation to these concepts. In section 'From Sample Collection to Data Repository: Current Practices and Regulations', we examine the current scientific research process from sample collection to data repository, highlighting current protocols and discussing challenges. In section 'Current Best-Practices for MGR Sample and Data Collection and Archiving', we describe current best-practices for MGR sample and data collection and archiving, and in section 'How Could Best-Practice Regarding MGR Samples and Data Be Further Strengthened in Terms of Access, Sharing, and Transparency' how best-practice could be further strengthened. We argue that commitment to a best-practice approach would increase sharing of samples and data from ABNJ, supporting marine scientific research and extensive secondary use including conservation, and therefore also provide a pathway for ABS in the BBNJ process.

CONCEPTUALISING MGR

What Are MGR Samples?
There is currently no internationally agreed legal definition of MGR, but a meaning for this term can be inferred from related definitions provided in the 1992 Convention on Biological Diversity (CBD), and the 2010 Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of the Benefits Arising from their Utilisation (Vierros et al., 2016; Harden-Davies, 2017). MGR therefore can be described as 'material from marine plants, algae, animals, and microbial or other organisms, and parts thereof containing functional units of heredity of actual or potential value (CBD, Article 2)'. The Nagoya Protocol also provides the definition of derivatives (not linked to the definition of genetic resources but to their utilisation), and includes any 'naturally occurring biochemical compound resulting from the genetic expression or metabolism of biological or genetic resources, even if it does not contain functional units of heredity', therefore also encompassing secondary metabolites, enzymes, and natural products. These processes are depicted in Figure 1.
In terms of utilisation and intent, biological samples used in marine scientific research are collected for a wide variety of scientific fields, including taxonomy, ecology, biogeography, conservation biology, and climate change research. Such research may generate samples containing MGR that may be of interest for bioprospecting, i.e., the development of commercially valuable products for pharmaceutical, cosmetic and/or other applications (Jaspars et al., 2016). To date, seven commercial products on the market have been derived from MGR, including one from a species found both in coastal regions and open water, i.e., ABNJ (Broggiato et al., 2018). Increasingly, environmental samples are the focus of research activities in marine environments, such as those collected for microbial metagenomic studies, which often represent mixed communities of Operational Taxonomic Units (OTUs) rather than species assemblages (Walls et al., 2014; Godoy-Lozano et al., 2018). Marine microbes are often the focus for bioprospecting activities, and represent the majority of patent applications from ABNJ (Blasiak et al., 2018).
Samples not collected for molecular work, but retained in a preserved state for later research, still contain MGR. These could include sea water, sea ice or sediment samples from ABNJ, collected for environmental chemistry or physical oceanography research. While most utilisation of MGR will come from live cultured, frozen or ethanol-preserved materials, sequencing technologies are advancing very rapidly, and extraction of DNA from formalin-fixed and ancient materials is now an established practice (Palero et al., 2010; Cook et al., 2015; Ruane and Austin, 2017). Therefore, MGR could also be obtained from samples neither originally collected nor preserved for molecular work. Derived samples are often generated from an MGR, such as a tissue subsample from a specimen, or extracted DNA. Any 'child' preparation or sample derived from a 'parent' specimen must retain the link to the original specimen, and can be considered an MGR sample. Specimen parts which may not contain genetic information (e.g., molluscan shells), but are crucial in providing key taxonomic information for a specimen, also need to stay associated with the MGR in question.
An MGR sample can therefore encompass a very wide range of sample types, from environmental samples of water, ice or sediment that (may) contain whole or partial organisms; through to whole organisms, e.g., single identified specimens, or mixed samples of specimens; to samples derived from any of these, such as extracted DNA or tissue preparations; preserved in such a way as to enable utilisation (defined in Article 2 of the Nagoya Protocol as 'conducting research and development on the genetic and/or biochemical composition of genetic resources'). Any of these sample types should, in scientific terms, be considered an MGR sample.
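The requirement that a derived ('child') sample keeps its link to the original ('parent') specimen can be pictured as a simple linked record structure. The following is only a minimal sketch under stated assumptions: the `Sample` class, the `lineage` function, the field names, and the identifiers are all hypothetical and do not correspond to any particular collection management system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one MGR sample (field names are illustrative)."""
    sample_id: str
    kind: str                        # e.g. "sediment core", "specimen", "DNA extract"
    parent_id: Optional[str] = None  # None marks an originally collected sample


def lineage(sample_id: str, registry: Dict[str, Sample]) -> List[str]:
    """Walk from a derived sample back to the originally collected sample."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = sample_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cyclic parent link at {current}")
        if current not in registry:
            raise KeyError(f"broken link: {current} is not registered")
        seen.add(current)
        chain.append(current)
        current = registry[current].parent_id
    return chain


# Illustrative identifiers only.
registry = {
    s.sample_id: s
    for s in [
        Sample("CORE-001", "sediment core"),
        Sample("SPEC-001", "specimen", parent_id="CORE-001"),
        Sample("TIS-001", "tissue subsample", parent_id="SPEC-001"),
        Sample("DNA-001", "DNA extract", parent_id="TIS-001"),
    ]
}

print(lineage("DNA-001", registry))
```

In this sketch a missing parent raises an error rather than being silently skipped, mirroring the point that a derived sample loses much of its value once the link to its source is broken.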

What Are MGR Data?
FIGURE 1 | DNA is the cell's master 'instruction manual'; segments of the DNA (genes) are read off (transcribed) to build a corresponding piece of RNA, which ultimately helps to create a protein. Proteins are the functional unit of the cell, used for cell repair or to make hormones, for example. By turning one chemical into another, proteins make secondary metabolites, or biochemical derivatives. Biomolecule data cannot be 'reverse engineered'; knowing a metabolite's formula or a protein's sequence is insufficient to predict the sequence of the underlying gene. Figure and caption reproduced with permission of Jeff Marlow (Marlow et al., 2019).

The legal meaning of MGR data also remains unclear, with no standard definition. The term 'digital sequence information' (DSI) on genetic resources was introduced in decisions CBD XIII/16 and the Nagoya Protocol NP-2/14. However, this term is not used by the scientific community, and its usage remains heatedly debated under the auspices of the CBD (Laird and Wynberg, 2018). In the BBNJ context, during the discussions of the Preparatory Committee, the terms 'in silico' and 'digital sequence information' have been used by different delegations. In simple scientific terms, however, MGR data is genetic or genomic sequence information obtained originally from a marine sample or MGR, as outlined above. It could encompass data from: mitochondrial or nuclear genomes for eukaryotic DNA, chromosomes and plasmids for prokaryotic DNA; protein structure, and/or secondary metabolites (Figure 1; Marlow et al., 2019). Raw genetic or sequence data are reads of a sequence: an arrangement of nucleotides on a strand of DNA or RNA (Consortium of European Taxonomic Facilities [CETAF], 2019). Raw sequence data files (direct output from a sequencing machine) may be specific to the sequencing technology or platform used, or in generic formats such as FASTA or FASTQ: text files with sequence data represented by single-letter codes, with FASTQ also containing data quality information.
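To make the FASTQ description above concrete, the sketch below reads records in the common four-line FASTQ layout (header, sequence, separator, quality string) and decodes the quality characters. It assumes the widely used Phred+33 encoding and single-line sequences; the read identifiers and sequences are invented for illustration, and `FastqRecord` and `parse_fastq` are hypothetical names rather than part of any library.

```python
from typing import Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str          # single-letter nucleotide codes
    qualities: List[int]   # one Phred quality score per base


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Parse four-line FASTQ records, assuming Phred+33 quality encoding."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Phred+33: quality score = ASCII code of the character minus 33.
        yield FastqRecord(header[1:], seq, [ord(c) - 33 for c in qual])


# Invented example data.
example = """@read1
ACGT
+
II5!
@read2
GGCA
+
IIII
"""

for record in parse_fastq(example):
    print(record.read_id, record.sequence, record.qualities)
```

A FASTA file, by contrast, carries only a header line and the sequence, with no per-base quality information.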
Processed sequence data, in contrast, have undergone various degrees of analysis (Laird and Wynberg, 2018). Sequence data are generally analysed using the Basic Local Alignment Search Tool (BLAST) algorithm against reference sequences in sequence databases (listed in Table 1). Analysis methods may also include scripts for bioinformatics analysis pipelines, and/or parameters used in software such as Geneious or Bowtie (Langmead, 2010; Kearse et al., 2012). These methods, together with interpretations, e.g., phylogenetic trees or population genetic inferences, are likely to be detailed in a resulting publication. Methodologies are sometimes incomplete, however, and lack references to the pipelines used for bioinformatic analysis. If version information for a reference dataset used in a metagenomics study is missing, for example, reproducibility is undermined (ten Hoopen et al., 2017). There are multiple points in the process of generation of MGR data where variability is introduced, so comprehensive methodology is critical to reproducibility (Escobar-Zepeda et al., 2018). Overall, MGR data could therefore represent a wide range of states: a raw mitochondrial DNA sequence, or multiple processed sequence alignments; reconstructed genomic fragments, to fully assembled reference genomes, and downstream analysis such as functional annotations and full identifications of putative genes.
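One way to picture the reproducibility point above is a small completeness check on the method details recorded alongside a processed dataset. This is a hedged sketch only: the list of required fields, the `missing_provenance` function, and all example values (tool name, versions, parameter names) are assumptions made for illustration, not an established metadata standard.

```python
from typing import Dict, List

# Hypothetical minimum set of method details to record with processed data.
REQUIRED_FIELDS = (
    "tool",
    "tool_version",
    "parameters",
    "reference_database",
    "reference_version",
)


def missing_provenance(record: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty in a record."""
    empty_values = (None, "", {}, [])
    return [f for f in REQUIRED_FIELDS if record.get(f) in empty_values]


# Placeholder values for illustration only.
complete = {
    "tool": "example_aligner",
    "tool_version": "1.0.0",
    "parameters": {"max_mismatches": 2},
    "reference_database": "example_reference_db",
    "reference_version": "release-42",
}
incomplete = {
    "tool": "example_aligner",
    "tool_version": "",
    "parameters": {"max_mismatches": 2},
    "reference_database": "example_reference_db",
}

print(missing_provenance(complete))
print(missing_provenance(incomplete))
```

A check of this kind could be run before data are deposited, so that a missing reference dataset version is caught while the information is still easy to recover.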
MGR data, whether in raw or processed form, cannot be viewed in isolation. It is critical that the data stay associated with its contextual information, i.e., the data describing all aspects of the MGR sample it was derived from. Without the integration of sequence and associated data for an MGR, the sequence data is of minimal scientific use, as it cannot be placed in its context, i.e., how it originated. Depending on the sample type, these data can include the current taxonomy/identification of the sample; physical location and preservation method of a sample (including sample preservation history); occurrence and sampling data (where, when and how the sample was collected); associated environmental data (e.g., oceanographic data); and derived sample information (e.g., extractions isolated from a parent sample). Associated laboratory data also need to be recorded with each sample, and could include DNA quality information, extraction technique, library preparation protocol, marker information and sequencing technology used (ten Hoopen et al., 2017). Associated data can also span a variety of types and formats, and include image files, video (e.g., remotely operated vehicle (ROV) footage), acoustic data (e.g., telemetry), and spatial data, such as bathymetry. It is important to ensure there are clear links between all these types of data. MGR data and the sample it is derived from are intrinsically linked, and must remain so to safeguard the scientific value of the MGR in question. If linkages are lost, a sample is at real risk of disposal. In summary, we consider MGR data to encompass all

BBNJ Background
The accessibility of MGR from sources in situ (on site), ex situ (samples in collections) and in silico (information in databases) is key to the functioning of the deep-sea research community, as outlined above, and pertinent to the ABS debate for BBNJ. Much of the debate on the legal status of MGR from ABNJ has been concerned with monetary benefits that could arise from any commercial utilisation. A few commentators have pointed out that these monetary benefits have gained a prominence out of scale to their likelihood and without any evidence of their importance to date, similar to the anticipated 'green-gold' of the Nagoya Protocol discussions that failed to eventuate (Leary, 2019). Recently, more attention has been paid to the merits of non-monetary benefits (Leary and Juniper, 2013; Broggiato et al., 2018; Morgera, 2018). The so-called 'non-monetary' benefits of participation in marine scientific research, access to research results, data and collections, and other elements of technology transfer and capacity building present elements of a solution to the MGR 'problem' (IISD, 2018). However, the majority of the focus remains on monetary benefits. This urgently needs redressing given the key importance of biodiversity research to conservation and benefit-sharing/building capacity. Given significant research infrastructure requirements, and the substantial expense of undertaking deep-sea investigations, difficulties in conducting such research are amplified for developing countries (DOSI, 2016; Harden-Davies, 2017).
Discussions on benefit-sharing measures for ABNJ have accordingly recognised the specific circumstances of both developing countries and Small Island Developing States (SIDS) (President's Aid to Negotiations, 2019). There is also a disconnect between availability of MGR (data or samples) in developed countries and its actual accessibility by developing ones: while MGR may be made available by scientific efforts, in reality utilisation is often not feasible for developing countries, where technical infrastructure, or even research capacity, is often reduced. For example, bioinformatics can be computationally intensive, representing a potential barrier in countries with poor IT infrastructure. As a result, capacity building and technology transfer discussions in the BBNJ negotiations are addressing how to improve access to MGR for developing countries.
Another key question in the BBNJ process is the inclusion of genetic sequence data in a definition of MGR. For the Nagoya Protocol, the potential inclusion of DSI within the definition of genetic resource is currently being debated, with concern that it could lead to restrictions on access to data that are currently open, hampering research (GGBN, 2017; International Chamber of Commerce [ICC], 2017). The scientific community has pointed to current barriers to taxonomic related biodiversity research arising from ABS regulations within national jurisdictions (Pethiyagoda, 2004; Kumar, 2018; Neumann et al., 2018; Prathapan et al., 2018). Similar concerns have been raised by several States that any future regulation of access to MGR in ABNJ could hinder marine scientific research (IISD, 2018). Yet, there is also concern that its exclusion could lead to biotechnology companies profiting from use of the 'global commons' without redistribution to those states with a reduced capacity to undertake such work themselves (Laird and Wynberg, 2018). Most commercial products derived from genetic resources, marine or otherwise, originate from publicly funded scientific research. There is a perceived trend of developing countries in favour of including the term and developed countries opposed, but Kumar (2018) points out that all researchers are united in concern about any potential restriction to access for these openly available data.
From a scientific perspective, MGR data and samples are intrinsically linked, but wide definitions have legal and policy implications. For the most part, biodiversity research/taxonomy does not utilise derivatives, such as metabolites (Figure 1), often a target for biotechnology applications. Similarly, functional annotations are generally not directly relevant for taxonomic research. For this reason, the Consortium of European Taxonomic Facilities (CETAF) has recently proposed a definition to replace DSI with NSD: Nucleotide Sequence Data, a more precise term specifically referring to raw sequences (Consortium of European Taxonomic Facilities [CETAF], 2019). More accurate legal terminology reflecting scientific realities would be of benefit, but the necessity of maintaining open access to MGR data must be emphasised.
The BBNJ negotiations are further complicated by additional uncertainties around definitions. For example, the water column beyond national jurisdiction is known as the 'deep sea' or 'open ocean' to those who conduct research there, but in legal terms it is called the 'High Seas' (UNCLOS, Article 87). Scientists generally use the term 'deep sea' to refer to any part of the ocean, pelagic or benthic, deeper than 200 m, without reference to national boundaries. In reality, marine scientific research often conducts sampling across such boundaries, even during a single expedition (Figure 2). The 'Area' is the term used to describe the seabed in ABNJ (defined in UNCLOS as 'the seabed and ocean floor and subsoil thereof, beyond the limits of national jurisdiction'). A legal dichotomy further confusing matters is that while the 'High Seas' are open, and largely unregulated, the mineral resources of the 'Area', or seafloor in ABNJ, are administered by the International Seabed Authority (ISA), established by UNCLOS in 1994.

FROM SAMPLE COLLECTION TO DATA REPOSITORY: CURRENT PRACTICES AND REGULATIONS

Sample and Data Collection
To illustrate how current practices and regulations can be streamlined, we will discuss how the process is already managed and regulated from the point of MGR collection, through to the subsequent research, to sample and data archiving and third party access.
The initiation point of this process is a cruise or expedition to collect an MGR sample. Cruises are generally undertaken by international partnerships of various research institutes, and the ships that conduct them are managed and owned largely by institutes and governments, but increasingly also by private industry and philanthropic individuals (León-Zayas et al., 2017). Both pre-cruise planning and post-cruise information logging mechanisms are available in many countries. In the United Kingdom, the Natural Environment Research Council (NERC) managed research vessel programme can be viewed in a jointly managed database with those of the Royal Netherlands Institute for Sea Research (NIOZ) and the Helmholtz Centre for Ocean Research Kiel (GEOMAR), which allows for comprehensive cruise planning. In some countries, research cruises are registered; for example, the Rolling Deck to Repository (R2R) database records data from all US-based academic vessels. Cruise reports usually contain relevant navigational, oceanographic and environmental sample data, and these are openly available on institutional websites in many countries. However, information on all research cruises taking place in ABNJ globally, on where the collected samples and data will eventually be deposited, and on who is responsible for them, is not currently housed anywhere centrally.
Regarding on-board sample collection, deep-sea sampling methodologies are complicated by unique technical challenges, and can be very challenging to standardise (Danovaro, 2009; Glover et al., 2015). As deep-sea sampling has been extensively reviewed in recent publications (Glover et al., 2015; Kopf et al., 2015; ten Hoopen et al., 2015; Clark et al., 2016), it will not be covered here. In most cases, at the demobilisation of a research expedition, the samples will be transferred from ship to numerous laboratories for the next phase of research to take place. It is standard practice for Principal Investigators of a research cruise to keep a record of which samples have been transferred where and for what purpose. As such, there is no current national or international standardisation of the process, and significant improvements could be made in the management of collections from cruise to laboratory.
Data collection on ship is also undertaken in a variety of ways. The interdisciplinary nature of marine scientific research often requires integration of a range of data types. This may be done on board, via databases linking imagery and bathymetry with sampling sites for example (Clark et al., 2016), but this is likely a best-case scenario. Data are collected in a range of formats, and while data standards are available they are not consistently employed. There is also often a lack of integration between species-occurrence data (where species occur in time and space), and associated oceanographic or environmental datasets. This is partly an artefact of data-recording practices, since occurrences can be recorded in a varied fashion and are subject to change over time with taxonomic revisions, whereas oceanographic data is typically generated in a final form on board ship via instrument readouts. Provision of data in 'real-time' is often proposed as a solution for benefit sharing, but this is not currently possible with biodiversity research, given that the time between collecting and identifying a specimen can be considerable. Sound collation of high quality data requires considerable time and effort, from this initial stage and throughout the data and sample lifecycle.

Sample Archiving and Related Challenges
When primary research work is completed, samples and associated data should be archived accordingly. Voucher specimens are central to taxonomy and reproducible science (Huber, 1998; Beaman and Cellinese, 2012; Swiss Academies of Arts and Sciences, 2019). Despite the importance of vouchering, there are very few specific regulations for long-term sample archiving and management. According to the 'best scientific practice' of the German Research Foundation, samples have to be stored and kept for only 10 years, and the oil and gas industry in the EU is only required to archive samples for 5 years post collection (Bennear, 2015), compromising the long-term availability of collections and endangering the repeatability of research. Such archival practices, however, may be a best-case scenario. While museums generally have robust practices for long-term sample curation for collections, universities often have no equivalent procedures. Increasingly, shorter-term projects not involving museums are funded, and MGR samples are retained in disparate university or government-funded laboratories. Samples collected as part of large programmes at great expense, and which likely include numerous undescribed taxa, are at real risk of disposal when a project is considered completed, or a researcher leaves the institute. There have been recent improvements to recognise the importance of archiving, with journals and databases requiring information on where vouchers are stored, although this is by no means established practice.
There are significant implications of storage of physical specimens, including substantial curation costs but also those for maintenance of collections buildings, their contents, and associated databases, alongside the key cost of staff time.
It is important to recognise that non-monetary benefits are subsidised in this way by museums and similar collections-based institutions. With the adoption by many countries of the Nagoya Protocol in 2010 (which came into force in 2014), several reports and papers address the need for standardisation in curation of collections that fall under its scope (although there are no specific rules on this in the Protocol itself). For example, the CETAF ABS Code of Conduct (Consortium of European Taxonomic Facilities [CETAF], 2018) provides a set of basic collection management principles to abide by.
In museums, recording of acquisitions, specifying both what materials have been acquired and the terms of acquisition, is an established practice. Larger natural history collections in general will require transfer of ownership to the museum at the point of acquisition, and Material Transfer Agreements (MTAs) to reflect this, in exchange for the absorption of long-term curation and data management costs. For smaller non-museum institutes, recording of the acquisition may be incomplete or even lacking. In the context of ABS, transfer of ownership is a potential challenge in the negotiation of an MTA. As such, the practice of title transfer is currently under debate (Sarah Long, personal communication). Distributed collections, where some collections are held by another institute and linked by a common database, could provide a potential solution here, for example as proposed by the recently established European Distributed Systems of Scientific Collections (DiSSCo) programme (Hobern et al., 2019). This would still require some institutions to take on long-term curatorial responsibility. The Nagoya Protocol has placed increased pressure on both museum and non-museum institutes to comply with these new regulations when working in areas within national jurisdiction (AWNJ). Institutes dealing largely with open-ocean and deep-sea ecosystems may be unaware of these regulations. There is also potential for confusion and unintentional non-compliance where cruises collect samples from both ABNJ and AWNJ (Figure 2).

Accessing Samples
Samples held in museums have long been available for research by external parties and the associated datasets are becoming increasingly accessible as the need and the technology to allow it have evolved. Access to deep-sea collections is currently mainly through contact with museums or biorepositories and larger institutes who conduct regular research cruises, and databases or publications. The Global Registry of Biodiversity Repositories, GRBio (Schindel et al., 2016), a collation of national and global registers of sample collections, has recently been incorporated into GBIF, the Global Biodiversity Information Facility (The Global Biodiversity Information Facility [GBIF], 2019). National registers may also exist, e.g., NatSCA in the United Kingdom,5 and specific sample collections are also displayed on institutional databases and in data aggregators including GBIF and OBIS (Ocean Biogeographic Information System) with holding institutions displayed alongside records. The Global Genome Biodiversity Network (GGBN) aggregates records of molecular/genetic collections housed in various institutes that are available for research6.

Data Archiving and Access, and Related Challenges
Marine genetic resources data are generally accessed through online data repositories and research publications. There are many databases relevant for access and publishing of MGR data (Table 1), with differing applications, focussing separately on sequence data or species occurrences for example. Key databases include the International Nucleotide Sequence Database Collaboration (INSDC) group, the World Register of Marine Species (WoRMS), and OBIS. Most databases listed provide additional functionality to their core remit as data repositories, and are therefore useful for both curating and sourcing data. WoRMS is a comprehensive checklist of marine species names, curated by around 300 taxonomists in accordance with best-practice (Horton et al., 2017; WoRMS Editorial Board, 2019). It also provides data on species traits as well as distributions (Vandepitte et al., 2018). The WoRMS taxon match tool for resolving names provides crucial quality control support for taxonomic data for the research community and for biodiversity platforms (Vandepitte et al., 2018). The OBIS platform has mapping functionality including the ability to search within ABNJ, and a deep-sea node (O'Hara et al., 2015). Research publications, while a useful source of information regarding locations of MGR samples and data, are not always openly available: many are retained behind a paywall, and access is therefore limited. Many governments and funders now require research to be either published in open-access journals, or included in an open-access repository to overcome this issue.
In terms of research outputs, practices of depositing genetic data in open access databases prior to publication are well-established in the scientific community, and required for peer-reviewed scientific journals. Funding bodies usually also require a data management plan7 with commitments to making data available through open access platforms, and open data policies following an embargo period of 2 or 3 years from collection are standard practice. There are legal requirements underpinning this, for example, the United Kingdom government requires public institutions to provide access to their data. This is a substantial contribution to benefit-sharing in the context of the BBNJ negotiations (Laird and Wynberg, 2018).
However, there are a number of issues with data available in genetic databases, a key one being the increasing number of sequences deposited without reference to formal scientific names, which has resulted in an explosion of 'dark taxa' (Page, 2016). This is a substantial problem resulting in the generation of additional 'taxonomic entities', with limited scientific meaning. Another issue is a lack of site and other core associated data connected to genetic data (Pope et al., 2015). While guidelines often recommend this contextual data is uploaded with sequences8, there is no obligation to provide any more data about the specimen than a mandatory specimen ID number. This has resulted in a proliferation of sequences deposited at genetic data repositories (e.g., GenBank) without sample collection information. This disconnect between sequence and contextual data is a significant problem (Pope et al., 2015; Gratton et al., 2016; Deck et al., 2017). This can lead to confusion about which sequence is linked to a type locality, for example. Current practice is improving following efforts of organisations like the Genomic Standards Consortium (GSC), GGBN (Droege et al., in preparation), and the marine microbial research community (Kopf et al., 2015). Data storage for genetic data from MGR research is another complicating factor. In light of research reproducibility, archiving raw and processed genetic data is important, but in practice, this could mean storing terabytes of data for just a single study.
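The disconnect between sequence and contextual data described above could, in principle, be caught before deposition with a simple completeness check. The following is a minimal sketch only: the field names are hypothetical (loosely modelled on Darwin Core terms), the record values are illustrative placeholders, and the list of required fields is an assumption rather than the requirement of any repository.

```python
# Hypothetical list of contextual fields a sequence record should carry.
# These names are illustrative assumptions, not a repository requirement.
REQUIRED_CONTEXT_FIELDS = (
    "specimen_id",
    "scientific_name",
    "decimal_latitude",
    "decimal_longitude",
    "depth_m",
    "collection_date",
)


def missing_context(record: dict) -> list[str]:
    """Return the required contextual fields that are absent or empty."""
    return [
        field
        for field in REQUIRED_CONTEXT_FIELDS
        if record.get(field) in (None, "")
    ]


# Illustrative record: a sequence deposited with a specimen ID and site
# coordinates but no formal name or collection date (placeholder values).
sequence_record = {
    "specimen_id": "EXAMPLE-0001",
    "scientific_name": "",
    "decimal_latitude": 12.5,
    "decimal_longitude": -116.6,
    "depth_m": 4100,
    "collection_date": None,
}

print(missing_context(sequence_record))
```

A check of this kind does not replace curation, but it would flag records at risk of becoming 'dark taxa' or of losing their link to a collection site.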
Considering occurrence data, there is a well-recognised data gap in OBIS for deep-sea species, resulting from the aforementioned substantial technical challenges and costs in accessing these environments (Appeltans and Webb, 2014; Glover et al., 2018). There are also issues arising from data quality where mis-identified species are uploaded to online repositories as valid species occurrences. This is partly due to limited taxonomic expertise arising from a shortage of adequately trained taxonomists. Another issue, particularly relevant for certain taxa such as marine microbes, is that the highest level of species description may be an OTU, but currently aggregators such as GBIF and OBIS are limited to species occurrences only. According to Bingham et al. (2017), four of the ten most connected databases in the biodiversity informatics space are primarily marine in focus (namely OBIS, WoRMS, FishBase and AquaMaps). A decentralised system of many connected nodes currently exists in the biodiversity informatics space, which provides resilience in the system (Bingham et al., 2017). While this also results in some overlap and duplication, improved links between initiatives will lead to greater interoperability (Costello et al., 2018; Kroh et al., 2019). In summary, access to, and therefore benefit-sharing of, MGR is happening now, but there is certainly scope to improve.
To demonstrate other deep-sea records currently available from ABNJ, we have extracted data from ABNJ at depths of 500 m and greater from OBIS, which gave 371,890 records of 10,437 species, observed between 1866 and 2018 (Figure 3).9 The data clearly show geographic biases, for example extensive sampling in the North East Atlantic. Taxonomic biases and gaps are also evident. The data can be downloaded and viewed at https://mapper.obis.org/?areaid=1&startdepth=500.
9 Data extracted from OBIS on 12 May 2019.

CURRENT BEST-PRACTICES FOR MGR SAMPLE AND DATA COLLECTION AND ARCHIVING

Data Standards
Use of data standards is critical to data sharing, by enabling database interoperability, simplifying downstream applications and allowing comparison of data across studies. Standards are therefore key to making data FAIR: Findable, Accessible, Interoperable and Reusable (Wilkinson et al., 2016). Key data standards and ontologies for biodiversity data are listed in Table 1. These standards have different aims, for example OBIS-ENV-DATA allows integration of environmental/oceanographic data and species occurrences (De Pooter et al., 2017), averting potential loss of crucial contextual information. Alongside data standards, persistent identifiers are critical to accurately resolving data records (Hobern et al., 2019). Identifiers come in a range of formats, such as DOIs (Digital Object Identifiers), and have varying attributes and levels of adoption, but the key factors are global uniqueness, resolvability, persistence, discoverability, and authority (Guralnick et al., 2018). Without these characteristics, communication and interoperability between databases is compromised (Güntsch et al., 2017), and the records are at risk of duplication or even loss. Therefore, awareness and usage of persistent identifiers by the research community is critical.
Networks, such as GGBN, CETAF and TDWG (the Taxonomic Databases Working Group), are key to adoption and development of ontologies and data standards. Development is generally harmonised so standards complement each other rather than overlap, for example the Biological Collections Ontology (BCO) reuses Darwin Core (DwC) terms (Walls et al., 2014), and the GGBN Data Standard was developed to complement existing standards such as DwC, and covers the requirements of tissue, DNA and environmental sample collections (Droege et al., 2014). GGBN has also developed a High Throughput Sequencing library data standard to be reviewed by the community. These activities are crucial not only to data interoperability but also streamlining sample sharing and curation processes (Benson et al., 2016; Nussbeck et al., 2016). These networks are also important to both raising awareness and further development of existing tools for use with data standards and databases (Table 1).

Standardisation Frameworks: EOVs and EBVs
Essential ocean variables and essential biodiversity variables (EOVs and EBVs) represent frameworks which aim to standardise monitoring of biodiversity at global scales (Wetzel et al., 2015; Constable et al., 2016; Proença et al., 2017; Miloslavich et al., 2018). EOVs/EBVs therefore provide a mechanism to apply scientific findings to conservation and environmental monitoring and policy objectives. For example, EOVs/EBVs aim to progress the UN Sustainable Development Goals (SDGs), including SDG14 (see footnote 10) (Anderson et al., 2017). As such these frameworks directly link science and policy, and further development of these links is being discussed (Geijzendorffer et al., 2016; Anderson et al., 2017; Weatherdon et al., 2017; Benson et al., 2018; Wetzel et al., 2018). Quality and timely data are critical to environmental monitoring (Benson et al., 2018). Therefore, accurate taxonomic data and usage of data standards are necessary for these frameworks (Kissling et al., 2018). However, there are often delays in incorporating scientific findings into policy. Potential issues of incomplete information in employment of EBVs have been addressed in recent reviews (Weatherdon et al., 2017; Muller-Karger et al., 2018). EOVs and EBVs address species distribution and abundance, but in a deep-sea context baseline biodiversity data are often lacking, which underscores the need for fundamental taxonomic research to be undertaken in these environments, particularly if these variables are shaping policy.

Case Study: Clarion Clipperton Zone ABYSSLINE Programme
While adoption and usage of data standards and platforms is key to data sharing, there is room for improvement in the awareness of existing data standards and platforms within the biodiversity research community. Compounding this are potential difficulties in navigating the array of tools and software available to assist with data handling. With these challenges in mind, we illustrate the utility of data standards with a case study from recently published datasets of MGR from ABNJ.
The ABYSSLINE (ABYSSal BaseLINE) programme coordinated by the University of Hawaii for the contractor UK Seabed Resources has a DNA taxonomy component that provides a case where best-practice has been attempted in taxonomic methods and data and sample management (Glover et al., 2015). One of the key aims of this research programme is to characterise abundance and diversity of abyssal fauna from the UK-1 exploration contract area in the eastern Clarion Clipperton Zone (CCZ) in the Central Pacific, a region undergoing intensive mineral exploration for polymetallic nodules. The programme aims to address the chronic lack of data from the region, despite several decades of cruises associated with mineral exploration activities (Amon et al., 2016; Glover et al., 2018).
Use of Darwin Core is central to the data workflow, where data are mapped to the terms and current taxonomic identifications and occurrence data are exported to a Darwin Core archive. Data and sample handling workflows are detailed in Glover et al. (2015). Following specimen identification (via morphology and molecular methods), species names are recorded and validated against WoRMS. Records are published on the Natural History Museum London data portal (Scott et al., 2019), with globally unique identifiers (GUIDs) linking the museum database to the specimen permanent record. Taxonomic data papers are published in the open access journals Biodiversity Data Journal and ZooKeys (Dahlgren et al., 2016; Glover et al., 2016; Wiklund et al., 2017; Wiklund et al., accepted). These journals are formatted to maximise interoperability: semantically enhanced with embedded links to databases such as GenBank. The datasets are also registered with GBIF, allocated a DOI and published on the OBIS deep-sea node. This allows tracking of use, citation and versioning of the data underlying the publications.
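To make the mapping step concrete, the sketch below shows one way in-house specimen fields might be mapped to Darwin Core terms and written out as an occurrence table. It is not the ABYSSLINE workflow itself, which is detailed in Glover et al. (2015): the in-house column names and the choice of `PreservedSpecimen` as the basis of record are assumptions made for illustration, and a full Darwin Core archive would additionally need a `meta.xml` descriptor and dataset metadata. The Darwin Core term names used (e.g., `occurrenceID`, `scientificName`, `decimalLatitude`) are standard terms.

```python
import csv
import io

# Mapping from hypothetical in-house column names to Darwin Core terms.
FIELD_TO_DWC = {
    "specimen_guid": "occurrenceID",
    "species": "scientificName",
    "collected_on": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "depth_m": "minimumDepthInMeters",
    "museum_code": "institutionCode",
    "catalogue_no": "catalogNumber",
}


def to_darwin_core(row: dict) -> dict:
    """Rename in-house fields to Darwin Core terms; absent fields stay empty."""
    record = {dwc: row.get(local, "") for local, dwc in FIELD_TO_DWC.items()}
    # Assumption for this sketch: every row describes a museum specimen.
    record["basisOfRecord"] = "PreservedSpecimen"
    return record


def write_occurrence_csv(rows: list[dict]) -> str:
    """Serialise rows as a Darwin Core style occurrence table (CSV text)."""
    fieldnames = list(FIELD_TO_DWC.values()) + ["basisOfRecord"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_darwin_core(row) for row in rows)
    return buffer.getvalue()


# Placeholder specimen row; identifiers and coordinates are illustrative only.
example_rows = [
    {"specimen_guid": "urn:uuid:00000000-0000-0000-0000-000000000001",
     "species": "Plenaster craigii", "lat": 12.5, "lon": -116.6},
]
print(write_occurrence_csv(example_rows))
```

Keeping the mapping in a single table makes it straightforward to review against the standard and to extend when further terms are needed.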
The ABYSSLINE collections include preparations from specimens that do not represent MGR but are derived from it, i.e., slides and scanning electron microscopy stubs, which need to stay in association with the specimen record as they provide additional taxonomic data. Here, relationships between parent specimens and derived samples or preparations (i.e., tissue and DNA vouchers) are captured using the GGBN data standard (Droege et al., 2016). The OBIS deep-sea node then converts the dataset in GGBN format to the OBIS-ENV-DATA data standard (De Pooter et al., 2017). Usage of these standards has allowed increased integration of genetic data with other specimen information. The macrofaunal tissue samples and DNA vouchers are housed in the Natural History Museum's Molecular Collection facility, and are available for research use, as displayed on the GGBN portal11.
The project findings, including the description of several species new to science, have been a major contribution to faunal records in the CCZ (Wiklund et al., in press; Glover et al., 2018). This new knowledge also includes discovery of a new genus and species of sponge Plenaster craigii (Lim et al., 2017), a potential indicator species in environmental impact assessments (Glover et al., 2018). Further, connectivity studies on this sponge have provided the first evidence-based recommendations for area-based management tools in the region (Glover et al., 2018). The CCZ has recently been proposed as a demonstration project for the Deep Ocean Observing Strategy (DOOS) (Levin et al., 2019). The findings described here would directly contribute to such a project, and to the emerging EOV of marine benthic invertebrate diversity. The application of best-practice using available data standards, facilitating interoperability of existing databases, provides a clear example for future work and a potential model for use in the BBNJ process.

HOW COULD BEST-PRACTICE REGARDING MGR SAMPLES AND DATA BE FURTHER STRENGTHENED IN TERMS OF ACCESS, SHARING, AND TRANSPARENCY?

Sample Archiving
Greater open access to samples is critical for the research community: it would facilitate collaboration between deep-sea investigators and expert taxonomists from understudied groups, and would be of great benefit to deep-sea research. For example, the German Centre for Marine Biodiversity Research (DZMB) tracks specimens and data obtained during German expeditions (including third party usage), and following species identification, sends vouchers to the natural history collections of Senckenberg (Brandt et al., 2018). In some cases, archiving a physical sample is not possible, for example, where a specimen is completely consumed during analysis and all that is left is the resulting data, not even a DNA voucher. This is sometimes the case for environmental DNA (eDNA) collections and molecular collections more widely. Where possible, Whole Genome Amplification could be undertaken on a subset of critical specimens to allow more sustainable usage. While there are issues with the technique, such as potential amplification bias of non-target DNA, recent methodologies address this (Dagnall et al., 2018). Destructive sampling illustrates the importance of considered decision-making for usage of highly valuable and difficult to collect material, particularly relevant in the context of deep-sea collections from ABNJ. Protocols for sustainable, responsible usage of material, and commitments to good data practice and comprehensive data archiving for those collections where some destructive sampling is unavoidable, are crucial. In other cases, video footage, such as seafloor imagery of putative morphotypes, may represent the only 'sample'; in these cases robust data annotation and archiving processes are critical as these data represent 'virtual' vouchers (Levin et al., 2019).

Data Publishing
Good data management practice is of benefit to the data creators, as well as the scientific community as a whole. Publishing of datasets independent of a publication is a key mechanism to not only increase data accessibility, but also encourage best-practice, as transparency will encourage good data management. Further, reuse of data multiplies its value. Datasets can be allocated a DOI, enabling citation and dynamic updates of data, as in the ABYSSLINE case study. This allows datasets to be cited and versioned, increases discoverability, and provides recognition for data creators outside of traditional publications (Michener, 2015; Wetzel et al., 2018). Where possible, an open license such as Creative Commons should be used for datasets to safeguard open access (Michener, 2015). Semantically linked publications in journals such as ZooKeys provide an existing framework to maximise accessibility, interoperability and traceability of data (Penev et al., 2010). High-quality metadata is essential to both data longevity and tracking provenance, and as such requires special attention. Metadata are often mischaracterised as the associated or contextual data of a sample, such as site information, but in fact are data that document and describe a dataset: data 'about' data (Stow et al., 2018). Good metadata allows interoperability, and enhances data and sample sharing, e.g., by recording the institute archiving the data. Comprehensive guidance on publishing data and metadata is available12 (Penev et al., 2017; Stow et al., 2018).

Further Developing Existing Data Systems
In terms of tools available for data management, gaps in functionality and potential for further development are evident. Some existing tools and software could benefit from improvement. GBIF's Integrated Publishing Toolkit (IPT13), an open source software tool used to publish and share biodiversity datasets, requires significant input from the researcher that would be better handled by systems. In contrast, OBIS works on a system of nodes to support researchers in publishing their data and metadata. The less onerous the system, the more likely data creators are to publish their datasets. Databases require curated, cleaned data and ideally, these processes should be automated as much as possible, via an application or software. While mechanisms to ensure overall data quality, as well as for feedback of data quality between data provider and platform, are currently in place (Vandepitte et al., 2014), they could be further expanded. GBIF and OBIS now provide data quality metrics. For example, the new OBIS portal provides a feedback function at the dataset level, and an annotation tool at the record level is in development. This would allow users to flag errors and OBIS could record the reason (e.g., species X does not occur in area Y), which would further contribute to data quality control.
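As an indication of how some of this cleaning could be automated before data reach a platform, the following sketch flags a few basic problems in an occurrence record. It is illustrative only: the checks and flag wording are assumptions and do not reproduce the quality control procedures of GBIF or OBIS. The keys follow Darwin Core term names.

```python
def quality_flags(record: dict) -> list[str]:
    """Return human-readable flags for basic problems in one occurrence record."""
    flags = []
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    depth = record.get("minimumDepthInMeters")

    # Coordinates must be present and within the valid geographic range.
    if lat is None or not -90 <= lat <= 90:
        flags.append("latitude missing or out of range")
    if lon is None or not -180 <= lon <= 180:
        flags.append("longitude missing or out of range")
    # Depth is optional here, but a negative value is treated as an error.
    if depth is not None and depth < 0:
        flags.append("negative depth")
    # A record without a name cannot be matched against a checklist.
    if not record.get("scientificName"):
        flags.append("no scientific name")
    return flags


# Placeholder record with a swapped-looking latitude and no name.
print(quality_flags({"decimalLatitude": 116.6, "decimalLongitude": 12.5}))
```

Checks of this kind are cheap to run at the point of data entry, leaving expert feedback (for example, on implausible species ranges) to the annotation mechanisms described above.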

Open Source
While various software and tools are available for annotating data, processing a full data lifecycle of a specimen generally requires moving between tools and systems rather than working within an integrated one. Technical innovations require an active community and efforts on open source code libraries. Open source accompanies open data, and future software developments should archive code (e.g., on code sharing sites like GitHub14), which is becoming commonplace in the biodiversity informatics community (Wilson et al., 2017). A good example is the work done by TDWG's data quality interest group15. Based on code libraries, data dashboards could be developed, such as those proposed in the Humboldt core initiative (Guralnick et al., 2018). These would allow users to annotate and visualise their data, similar to web-based data management platforms like BioVel, which integrates key aspects of a sample lifecycle (Hardisty et al., 2016). OBIS is a key platform in the context of MGR in ABNJ and the deep-sea OBIS node provides a platform for researchers to both access and publish deep-sea data. However, success depends on community adoption and involvement. OBIS provides access to its data through an Application Programming Interface (API), allowing machine-to-machine communication, and development of data products or applications without the need to physically store the data. These processes align with ABS as open data enhances collaboration, transparency and also potentially technology transfer via sharing of knowledge.
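As a small illustration of machine-to-machine access, the sketch below builds a query URL for occurrence records. The endpoint (`https://api.obis.org/v3/occurrence`) and the parameter names `areaid`, `startdepth` and `size` are assumptions based on the public OBIS API and should be checked against the current OBIS documentation; the values `areaid=1` and `startdepth=500` simply echo the mapper link cited earlier. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed base URL of the OBIS occurrence endpoint (verify before use).
OBIS_API = "https://api.obis.org/v3/occurrence"


def occurrence_query_url(area_id: int, start_depth: int, size: int = 100) -> str:
    """Build a query URL for occurrences in an area at or below a given depth."""
    params = {"areaid": area_id, "startdepth": start_depth, "size": size}
    return f"{OBIS_API}?{urlencode(params)}"


print(occurrence_query_url(area_id=1, start_depth=500))
```

A client could pass such a URL to any HTTP library and parse the response, so that data products are built on demand rather than from locally stored copies.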

Increased Integration Between Data Systems and Observation Frameworks
There are existing links between sequence-data platforms like GenBank and those housing occurrence data, like OBIS, or taxonomic and species-trait data, such as WoRMS (e.g., WoRMS provides links between individual species pages and records available in GenBank). However, visibility and reciprocal links could be improved and greater integration is needed (Muller-Karger et al., 2018; Kroh et al., 2019). This could be facilitated by greater visibility of associated sequence data in OBIS, and links to GenBank via accession numbers. Future adoption and usage of data standards, including making genomic standards such as MIxS interoperable with Darwin Core, will contribute here. There is scope to address this via integration of genetic and occurrence databases (such as GenBank and OBIS) and via new initiatives such as the Global Omics Observatory Network (GLOMICON) (see footnote 9). Together with the Genomic Observatories Network (GON), GLOMICON and OBIS are prototyping a molecular biodiversity data pipeline directly integrated with occurrence databases16. These efforts would be a significant contribution to streamlining access for MGR.
Essential biodiversity variables specifically address the monitoring of genetic diversity via the EBV class genetic composition, which has been identified as a data gap (Geijzendorffer et al., 2016). DNA sequencing and metabarcoding have recently been piloted as monitoring tools (Keeling et al., 2014; Goodwin et al., 2017), and eDNA metabarcoding pipelines for biomonitoring are being developed (Andruszkiewicz et al., 2017). Levin et al. (2019) suggest 'a large-scale metabarcoding, metagenomic, and transcriptomic census, as has been conducted for marine plankton'. This must be undertaken with commitment to baseline reference data, and generation of morphological and molecular reference data libraries. It is also important to note that, with so many undescribed species in the deep sea, metabarcoding for monitoring cannot be used as a replacement for species description. It is also important to recognise that since many deep-sea species are new to science they may only be identified to genus or higher taxonomic ranks, and will likely remain as OTUs. The characterisation of OTUs is critically important to provide workable ecological information, and must be very carefully managed to ensure comparability between surveys in the future.

Future Scenarios
Future scenarios may include an increased need for data and sample processing from in situ deep-sea observatories and increased use of methods such as machine learning for video and image data. Such datasets are particularly important in cases where collection of a physical voucher specimen is not feasible. MGR may be increasingly studied and monitored remotely by mechanisms such as robotic samplers with inbuilt sequencers (Kissling et al., 2018). Overall, as in laboratory technique development, there is scope for much innovation here. Future development should be integrated as far as possible with existing systems and platforms that have a global user community. A current key initiative is the General Bathymetric Chart of the Oceans (GEBCO) Seabed 2030 Project (Mayer et al., 2018), aiming to map the seafloor by 2030. Such efforts are critical to understanding biodiversity and characterising biogeography in the deep sea and present an opportunity to set standards in these previously un-mapped regions that are not available for the already mapped terrestrial areas. There are potential issues with making these data openly available, as vulnerable ecosystems such as seamounts, or rare and threatened species, may be further targeted. Potential solutions here could draw on experience in terrestrial systems, such as adding randomisation to coordinates of species occurrences. In terms of data access in developing countries, public-private partnerships may also be required to further develop IT infrastructures to address barriers to data accessibility. Alternatively, bioinformatics capacity could be leveraged by access to a system of distributed nodes to run analyses on, such as the PlanetLab model.17

The Role of Networks
Community involvement and networks are important to progressing data and sample accessibility in the deep-sea research community, and to building consensus on standardisation. Strong networks have also been identified as a key factor in sharing data across science and policy (Weatherdon et al., 2017). The deep-sea research community is already international and collaborative, with global networks such as INDEEP and the Deep-Ocean Stewardship Initiative (DOSI)18 (Mengerink et al., 2014). Other non-marine focussed networks are also relevant here. CETAF, while a European network, provides documentation on best-practice for taxonomic sample collections19. The GGBN network provides guidance on best-practice for managing genomic/molecular biodiversity collections and, as discussed earlier, provides established data standards and pipelines. GGBN supports best-practice through initiatives like the GGBN Document Library, representing a knowledge platform for the biodiversity research community20. Samples of all kinds, including environmental samples, are a core focus of GGBN, and its members' collections provide long-term storage for high quality DNA, tissue and environmental samples, key to the current context of how to facilitate access to MGR from ABNJ. Protocols for collection, preservation and analysis of environmental samples for genetic/genomic purposes are still very much in development; however, GGBN can play an important role in organising workshops on this topic. These activities have the potential to support the BBNJ process in finding solutions for sampling in ABNJ and accessibility of MGR.
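Returning to the coordinate randomisation suggested under Future Scenarios for sensitive occurrences, a minimal sketch of the idea is given below. The maximum offset of 0.1 degrees and the rounding to two decimal places are arbitrary assumptions for illustration, not a recommended or published protocol; in practice the degree of generalisation would need to be agreed by the community and documented in the dataset metadata.

```python
import random


def blur_coordinates(lat, lon, max_offset_deg=0.1, rng=None):
    """Return coordinates shifted by a random offset within +/- max_offset_deg."""
    if rng is None:
        rng = random.Random()
    new_lat = lat + rng.uniform(-max_offset_deg, max_offset_deg)
    new_lon = lon + rng.uniform(-max_offset_deg, max_offset_deg)
    # Clamp latitude to the poles and wrap longitude into [-180, 180).
    new_lat = max(-90.0, min(90.0, new_lat))
    new_lon = ((new_lon + 180.0) % 360.0) - 180.0
    # Coarse rounding avoids implying more precision than is published.
    return round(new_lat, 2), round(new_lon, 2)


# Placeholder coordinates; a seeded generator makes the example repeatable.
print(blur_coordinates(12.5, -116.6, rng=random.Random(0)))
```

The precise coordinates would remain with the holding institution and could be released on request for research or monitoring purposes.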

Capacity Building
Like ABS, capacity development is already happening, but there is room to improve in terms of strengthening cooperation and awareness of existing efforts and initiatives (Stephenson et al., 2017; Harden-Davies and Gjerde, 2019). The treaty can help to coordinate and raise awareness, highlighting the value of non-monetary benefits in building research capacity and in turn, the importance of capacity building to conservation. Networks can significantly progress capacity building, for example, through facilitating voluntary mentorship/pairing schemes, where individuals from developing and developed states in the scientific community are paired up. A number of such schemes are already in existence. The OceanTeacher Global Academy runs marine science and data literacy training workshops in developing countries, which both develop capacity and provide a framework whereby host countries can build their own training centres, making efforts sustainable. Another example is the POGO/SCOR21 fellowship programme, which enables scientists from developing countries to visit oceanographic centres for training. MGR collected during a project could also be used for capacity building. Agreements between countries and institutions involved could facilitate the exchange of students and early career researchers between countries and the sharing of samples and data, and lead to joint expeditions with shared distribution of the scientific material obtained. Enabling early career researchers to participate in research cruises has been identified as a key mechanism to support capacity development for deep-sea scientists.

Role of Museums and Repositories
One of the key questions is who will facilitate improved access to data and samples. As described here, those involved will include individual scientists, working in research institutions and sometimes linked either formally or informally by international networks. Collections and biorepositories such as museums are the main suppliers of MGR, and also lead in standardising procedures for curation and tracking collections (Harden-Davies, 2017; UN Ocean Decade Report, 2018). Under the Nagoya Protocol, collections such as museums are recognised as a key part of benefit-sharing. While a formalised role and recognition of collections could streamline management of MGR from ABNJ, the scientific community has warned that without additional financial resources to meet heightened administrative and other requirements, such regulations could actually hinder biodiversity research (Kumar, 2018; Neumann et al., 2018). This is a real possibility that must be avoided by the treaty at all costs. Currently, while museums and non-museum institutes support non-monetary benefits, they absorb substantial curation and data management costs in the process. It is key that governments are aware of the need for additional infrastructure and funds to support any requirements for sample curation and data management in the future treaty. The treaty itself could assist here in raising awareness of this need.

Learning Lessons From Nagoya
The negotiations leading to the Nagoya Protocol and its subsequent implementation at national levels provide substantive lessons for the BBNJ process. In particular, it is clear that problems remain with regard to the definitions of what constitutes MGR, MGR data and utilisation, and that these issues must be overcome for incorporation into the new treaty for ABNJ. One of the issues of the Nagoya Protocol negotiations was a lack of engagement of the scientific community in providing the understanding of these terms and explaining the related processes. As has been illustrated here, the management of MGR data is currently conducted in an open access arena, and any attempt to restrict the sharing of data from ABNJ will be highly problematic to the scientific community, and the funding bodies who support them. Regulation must not stifle innovation or impede research progress in any way. Input from the scientific community is therefore crucial to the BBNJ discussions (Vierros et al., 2016; Harden-Davies, 2018). This consultation process, drawing on lessons learnt from the Nagoya Protocol, is also central for capacity building discussions, from how to implement at the practical level (Vierros et al., 2016), to a broad scope on what is needed, with input from both North and South scientific communities (Gjerde et al., 2018). It will be key to road test any potential benefit-sharing measures, which could be achieved by building iterative stages into the BBNJ process. Science advances much faster than resulting legislation, one of the pitfalls currently facing the Nagoya Protocol (Kumar, 2018), so this approach would have the additional benefit of accounting for the rapid pace of development in science.

Recommendations (and How to Streamline Processes Through the BBNJ Treaty?)
The BBNJ treaty has significant capacity to benefit the science community, given its global scope and considering that the negotiations so far have proved largely constructive (Tiller et al., 2019) and have acknowledged the important role of science in any governance mechanism for ABNJ. Here, we provide recommendations which could contribute to treaty discussions, addressing how to strengthen best-practice to support access, sharing and transparency in relation to MGR.

Sample Archiving
Archiving samples and maintaining collections for the widest possible future use is crucial for the research community, supporting taxonomy and reproducibility of research and reducing the need for repeat collection. Long-term specimen storage must be planned prior to sample collection and included in funding proposals, with requirements from funding bodies for archiving of material in national collections. Because cruises are so expensive, the post-cruise proportion of the budget is often reduced as a result (Glover et al., 2018). Data management and sample archiving must therefore be fully costed into a grant application, and samples and data made fully available from national collections and data repositories. Journals and genetic databases could require archiving of specimens in national collections before publishing of data. Much of the onus is on funders, whether public or private, and on regulatory bodies to recognise the key importance of taxonomy and related archiving of voucher samples. Museums play a key role in ABS of MGR as well as supporting biodiversity research: the BBNJ agreement should support and recognise the importance of biorepositories and sample archiving.

Data Standards and Open-Access
In terms of data practices, the treaty with its global scope could support best-practice and open access by building consensus on standardisation and on usage of currently available global data standards. It could also recommend that data standards are embedded into protocols for sample and data collection, that all datasets are published in open-access journals and via the OBIS deep-sea node, and that sequence data are deposited with reference to voucher material and associated site and collection information. A potential workflow illustrating these principles from cruise to data publication is suggested in Figure 4. While commercial entities are generally less willing to openly share their data, in order to protect their intellectual property, there are examples of companies showing commitment to open source; for example, the Geneious bioinformatics software provides access to its API (Kearse et al., 2012). In terms of biotechnology applications of MGR, Blasiak et al. (2019) state that scientists should disclose the origin of an MGR when a patent is filed. This could be adopted as common practice, and companies committing to transparency here would build goodwill and trust. Site information is also paramount for ABS, to distinguish between AWNJ and ABNJ and therefore to determine which legal regime would be applicable to an MGR sample. In a data collection context, privately funded research expeditions should also take heed of best practices in data and sample collection, and provide access to their data. Similarly, sectors like deep-sea mining, oil and gas, and fisheries should make their data available for research and monitoring purposes as far as feasible, particularly samples and data from environmental impact assessments. Journals could increase requirements for the publishing of data and code (bioinformatics pipelines and workflows for analysis of genomic data).
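As a minimal sketch of the recommendation that sequence data be deposited with reference to voucher material and associated site and collection information, the following Python check reports which linking fields are absent from a record. The field names are Darwin Core terms; the choice of which terms to treat as required, the function name `missing_terms`, and every value in the example record are illustrative assumptions, not requirements of Darwin Core, OBIS, or the treaty text.

```python
# Illustrative only: the set of "required" terms is an assumption made for this
# sketch, not a rule taken from Darwin Core, OBIS, or the BBNJ negotiations.
REQUIRED_TERMS = (
    "occurrenceID",         # identifier for the occurrence record
    "scientificName",       # taxon name
    "eventDate",            # collection date
    "decimalLatitude",      # site information (needed to tell ABNJ from AWNJ)
    "decimalLongitude",
    "institutionCode",      # institution holding the voucher
    "catalogNumber",        # voucher specimen number in that collection
    "associatedSequences",  # link to the deposited sequence record
)


def missing_terms(record):
    """Return the required terms that are absent or empty in a record."""
    missing = []
    for term in REQUIRED_TERMS:
        value = record.get(term)
        if value is None or not str(value).strip():
            missing.append(term)
    return missing


# Hypothetical record; all values are placeholders invented for this sketch.
example = {
    "occurrenceID": "urn:example:occurrence:0001",
    "scientificName": "Examplea exampli",
    "eventDate": "2000-01-01",
    "decimalLatitude": 0.0,
    "decimalLongitude": 0.0,
    "institutionCode": "EXAMPLE",
    "catalogNumber": "EX-0001",
    "associatedSequences": "https://example.org/sequence/0001",
}

if __name__ == "__main__":
    print(missing_terms(example))
```

A check of this kind could sit at the data publication step of a workflow such as that in Figure 4, so that records lacking voucher or site information are flagged before deposition.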

Registration of Cruises
Open access to genetic data and samples is critical to the functioning of the deep-sea research community, and a key improvement to current management would be a global system holding information on research cruises that are taking place and collecting samples in ABNJ (and AWNJ). This would be of great benefit to the deep-sea research community (Rabone et al., 2019). It is recommended therefore that the registration of cruises, and a mechanism to manage this, should be encouraged by the treaty. This mechanism should also aim to take into account privately-funded cruises together with their respective collections and datasets. A global registry of cruises could also facilitate greater shared research planning to identify gaps and/or overlaps, as suggested by Oldham et al. (2014).

Clearing-House
A clearing-house mechanism has been proposed to facilitate the sharing of benefits from MGRs from ABNJ (President's Aid to Negotiations, 2019). How the clearing-house will be administered remains up for debate, but one possibility is through the International Oceanographic Data and Information Exchange (IODE) Programme of IOC-UNESCO (UN Ocean Decade Report, 2018). The clearing-house could also leverage emerging data systems such as the Ocean Data Information System (ODIS) and Ocean Best Practices. However, questions remain about the functions such a mechanism would need to include (President's Aid to Negotiations, 2019). A centralised repository/ABS clearing-house mechanism could streamline processes by providing documentation, guidance, and links to existing platforms and databases relevant to MGR (Figure 4). The recommendations outlined in our paper could be streamlined by such a mechanism, including provisions for a global registry of cruises. The clearing-house could also include information about which institutes hold collections, their availability, and how they could be accessed. The Obligatory Prior Electronic Notification (OPEN) system proposes archiving of samples and data through track and trace compliance (Broggiato et al., 2018). While the OPEN system could ensure comprehensive tracking of samples and data, it may be a considerable administrative burden to undertake for all MGR samples collected from ABNJ. A more flexible approach would be a commitment to a common set of principles, such as FAIR data and open access as described here. Alternatively, exemptions for data and samples collected for research purposes could be considered, so that systems track and trace commercial applications only.
A clearing-house could also help with unworked samples. Numerous unsorted faunal samples from ABNJ currently exist (in museums and research institutes globally) and these are invisible to other researchers, since they are neither mentioned in peer-reviewed publications nor included in datasets uploaded to open-access database repositories. These include unsorted bulk specimens, archived but not worked on further, generally owing to funding and time constraints. A mechanism to make such unworked samples visible and their availability known to other researchers globally (Harden-Davies, 2017) would be efficient and potentially reduce the need for new costly sample collections in ABNJ, thereby supporting ABS. These collections could be published on existing platforms such as OBIS, using existing functionality. There are potential implications here for the home institution, so availability of these samples and associated data would require additional resourcing. The clearing-house would in essence be a mechanism not to create a 'new' data system, but to strengthen and improve access to and awareness of existing systems and encourage best-practices. Having a central collation of available collections, relevant documentation and a repository of best-practice protocols would be of great benefit to the scientific community.

CONCLUSION
The inherent difficulties in undertaking work in deep-sea environments and the resulting legacy data-gaps for the deep sea mean that access to MGR from ABNJ is of crucial importance to the scientific community. Fundamental taxonomic research provides a unifying element to the BBNJ negotiations, as the four main components of the future treaty all depend on knowledge of the species that live in ABNJ. Support for taxonomy/biodiversity research can therefore help to achieve the goals of the treaty, allowing sound implementation of conservation and environmental management. ABS through access to samples and data is happening now, but there is room for improvement in current processes. Commitment to open access and best practice provides a pathway to achieve these improvements, for example through usage of robust and FAIR data and sample management pipelines, as discussed in our case study in the CCZ. We have outlined recommendations to streamline access to MGR from ABNJ. In particular we recommend the following: funders (both public and private) and regulators recognise the central importance of taxonomy, through support of museums/collections for long-term sample curation; open access to data (including that generated by private industry/companies); usage of globally recognised data standards; usage (and further development) of existing platforms; commitment to best-practice workflows; publishing of datasets via global online repositories and open access journals; increased integration of observation initiatives and biodiversity research; a global registry of cruises; and lastly, development of a clearing house to allow centralised access to the above. It is important that the treaty is flexible enough to adapt as science and technology evolve, and recognises both the realities of undertaking biological research and fundamental biological realities themselves. Ongoing consultation and communication between the science community and policy community will be key here. The future treaty has the potential to support science through improved awareness of best-practice, access to data and samples, and benefit-sharing, drawing on lessons from the Nagoya Protocol. Together with recent advancements in best-practice in the research community, existing platforms and data systems with robust data quality processes, current networks, and discussions at the recent Intergovernmental Conference at the UN, the elements are in place to build an agreement that can support science and society.

FIGURE 2 | (A) 3% of the datasets in OBIS hold data only from ABNJ, 26% hold data from both AWNJ and ABNJ, and 71% hold data only from AWNJ. (B) 4%, or more than 5000 species, occur exclusively in ABNJ, 16% occur in both AWNJ and ABNJ, and 80% occur only in AWNJ. Source: OBIS, 1 February 2019.

FIGURE 3 | Map of all OBIS records from ABNJ from depths of 500 m and greater: 371,890 records of 10,437 species in total, observed between 1866 and 2018 (data at 12 May 2019). The dataset can be downloaded and viewed at https://mapper.obis.org/?areaid=1&startdepth=500.

FIGURE 4 | Potential workflow for access to MGR from ABNJ, from cruise to data publication, including the scope of a potential clearing-house mechanism. Data standards are also shown; acronyms are as follows: DwC: Darwin Core, ABCD: Access to Biological Collection Data, GGBN ds: Global Genome Biodiversity Network data standard, ENVO: Environment Ontology, BCO: Biological Collections Ontology, OBIS-ENV-DATA: OBIS extendedMeasurementOrFact Darwin Core Extension. These data standards are also listed in Table 1.

TABLE 1 | Databases, data standards, and open-source data handling tools/software relevant to MGR data (if acronyms are included in the manuscript only, they are in the first table section, otherwise they are included with the relevant item in the table).
Acronyms: UNCLOS, UN Convention on Law of the Sea; UNESCO, United Nations Educational, Scientific and Cultural Organization.
Databases (columns: Databases, References, Link): DNA Data Bank of Japan (DDBJ).
Frontiers in Marine Science | www.frontiersin.org
information associated with or extracted from a physical MGR sample, specifically including any genetic sequence information, in both raw and processed form.