Bacterial and Archaeal Communities in Polymetallic Nodules, Sediments, and Bottom Waters of the Abyssal Clarion-Clipperton Zone: Emerging Patterns and Future Monitoring Considerations

Bacteria and archaea are key contributors to deep-sea biogeochemical cycles and food webs. The disruptions these microbial communities may experience during and following polymetallic nodule mining in the Clarion-Clipperton Zone (CCZ) of the North Pacific Ocean could therefore have broad ecological effects. Our goals in this synthesis are to characterize the current understanding of biodiversity and biogeography of bacteria and archaea in the CCZ and to identify gaps in the baseline data and sampling approaches, prior to the onset of mining in the region. This is part of a large effort to compile biogeographic patterns in the CCZ, and to assess the representivity of no-mining Areas of Particular Environmental Interest, across a range of taxa. Here, we review published studies and an additional new dataset focused on 16S ribosomal RNA (rRNA) gene amplicon characterization of abyssal bacterial and archaeal communities, particularly focused on spatial patterns. Deep-sea habitats (nodules, sediments, and bottom seawater) each hosted significantly different microbial communities. An east-vs.-west CCZ regional distinction was present in nodule communities, although the magnitude was small and likely not detectable without a high-resolution analysis. Within habitats, spatial variability was driven by differences in relative abundances of taxa, rather than by abundant taxon turnover. Our results further support observations that nodules in the CCZ have distinct archaeal communities from those in more productive surrounding regions, with higher relative abundances of presumed chemolithoautotrophic Nitrosopumilaceae suggesting possible trophic effects of nodule removal. Collectively, these results indicate that bacteria and archaea in the CCZ display previously undetected, subtle, regional-scale biogeography. 
However, the currently available microbial community surveys are spatially limited and suffer from sampling and analytical differences that frequently confound inter-comparison; making definitive management decisions from such a limited dataset could be problematic. We suggest a number of future research priorities and sampling recommendations that may help to alleviate dataset incompatibilities and to address challenges posed by rapidly advancing DNA sequencing technology for monitoring bacterial and archaeal biodiversity in the CCZ. Most critically, we advocate for selection of a standardized 16S rRNA gene amplification approach for use in the anticipated large-scale, contractor-driven biodiversity monitoring in the region.

INTRODUCTION
Deep-sea bacteria and archaea (hereafter referred to as microbes) are some of the most abundant lifeforms below the sunlit ocean (Whitman et al., 1998; Kallmeyer et al., 2012) and constitute the greatest standing stocks of biomass inhabiting the seafloor at depths of abyssal plains and below (Wei et al., 2010; Danovaro et al., 2015), in part because of the variety of habitats in which they can survive. In the abyssal plains of the Clarion-Clipperton Zone (CCZ), microbial habitats include polymetallic nodules as well as the more cosmopolitan habitats of sediments and the overlying water column. The CCZ, an approximately 6,000,000 km² region in international jurisdiction in the equatorial North Pacific (Wedding et al., 2013), contains the greatest observed concentrations of seabed nodule deposits (International Seabed Authority, 2010). Proposed polymetallic nodule mining operations, currently in the exploratory phase but expected to reach commercial feasibility within the coming decade (Hein et al., 2020), will therefore impact CCZ microbial habitats in a variety of ways, the most obvious being the physical removal of nodules from large swathes of the seafloor. Current mining technology will also necessitate removal of surface sediments and benthic boundary layer water along with nodules. This will result in a pronounced increase in sediment plumes from both direct seafloor disturbance and waste sediment disposal (Christiansen et al., 2020; Drazen et al., 2020), to which local biota, including microbes, are unlikely to be accustomed.
Microbes play critical and diverse roles in energy acquisition pathways and nutrient cycles in the abyss, including both chemoheterotrophic and chemoautotrophic metabolisms and substantial elemental transformations associated with nitrogen, phosphorus, sulfur, iron, and manganese cycling (Orcutt et al., 2011). There are therefore motivations beyond conservation of biodiversity to improve our understanding of the extant microbial communities in the abyssal CCZ prior to human-induced disturbance: in addition to disrupting critical ecological functions, mining activity may detrimentally impact societally relevant ecosystem services provided by microbes in these regions (Thurber et al., 2014; Orcutt et al., 2020). Further, similar to the situation with other deep-sea organisms, abyssal microbes are poorly known to science: the putative metabolic functions and phylogeny of major groups of deep-sea bacteria and archaea are still actively being described in the literature (Baker et al., 2021), e.g., the little-known candidate order Woeseiales (Hoffmann et al., 2020) and putative heterotrophic deep-sea Thaumarchaeota, in contrast with the canonical autotrophic ammonia oxidizers (Aylward and Santoro, 2020).
This combination of factors suggests that, despite the likely importance of microbial communities to the overall biogeochemical and ecological function of the abyssal plains of the CCZ, we are unable to fully describe their roles in this ecosystem. And while bacteria and archaea in general have much shorter generation times than macro-organisms, current evidence indicates that deep-sea microbes impacted by mining will not recover quickly. A recent return visit to the site of the DISCOL simulated small-scale mining experiment in the Peru Basin found that portions of the disturbed areas retained persistently lower CO₂ fixation rates, reduced hydrolytic extracellular enzyme activity, and reduced microbial cell abundances relative to nearby reference sites, 26 years after the sediments were initially disturbed. Deep-sea nodules themselves have been estimated to accrete at rates on the order of millimeters per 10⁶ years (Ku and Broecker, 1969; Hein et al., 2020); thus, their removal will functionally lead to the extirpation of any microbes that rely on them for habitat. It is therefore particularly important that the reserve areas set aside from mining impacts, such as the nine current Areas of Particular Environmental Interest (APEIs), which cover 24% of the CCZ planning region (Wedding et al., 2013; see Figure 1), are representative of the taxonomic structure and metabolic function of microbial communities of the surrounding habitats.
Because the energy used by abyssal microbes at sites such as the CCZ that are distant from hydrothermal features ultimately derives from inputs from the surface ocean, the biogeochemical gradients in the upper water column of the south-eastern Pacific raise the possibility of similar, related habitat gradients at the seafloor. These biogeochemical gradients include particulate organic carbon flux to the sediments, which decreases northward from the equatorial upwelling region into the stratified gyre proper, as well as from east to west (Christian et al., 2002; McQuaid et al., 2020; Washburn et al., 2021). East-west variability in particulate organic carbon flux has been posited to explain spatial patterns in species richness of nodule-associated xenophyophores and metazoans (Veillette et al., 2007). The southeastern corner of the CCZ encompasses the pronounced oxygen minimum zone of the eastern tropical North Pacific (Talley, 2007), which may further impact the quantity and composition of organic matter reaching the benthos. More directly, nodule abundance and sediment composition also vary within the CCZ, inherently affecting habitat quality and availability for benthic microbes (International Seabed Authority, 2010; McQuaid et al., 2020).
FIGURE 1 | Map of sampling sites for datasets included in this synthesis and designated exploration license and APEI areas within the CCZ. Sampling regions are labeled with unique cruise and/or location identifiers (see Table 1). Solid gray polygons represent both reserved and exploration areas, as designated by the International Seabed Authority; squares outlined in gray are APEIs. COMRA, China Ocean Mineral Resources R&D Association contract area (which is divided into east and west regions); UK1, UK Seabed Resources Ltd contract area, with two sampling sites (A and B); OMS, Ocean Mineral Singapore contract area. The three west COMRA sampling sites in the western CCZ share one location dot. For fine-scale sample locations for Abyssline (sampled from 30 × 30 km areas in UK1 and OMS; Lindh et al., 2017) and DeepCCZ (samples collected within 15 km of one another at each site) cruises, see Supplementary Table 1.
To better understand the current state of knowledge of bacterial and archaeal microbes in the CCZ, here we synthesize a new dataset from polymetallic nodules, sediment, and bottom waters of the CCZ with published datasets from the CCZ and other regions. Our goals in this synthesis are to characterize the current understanding of biodiversity and biogeography of bacteria and archaea in the CCZ and to identify gaps in the baseline data and sampling approaches, prior to the onset of mining in the region. We describe observed biogeographic patterns as they relate both to habitat type (that is, nodules vs. sediments vs. bottom waters) and to spatial distribution. In addition, we highlight how gaps in available data hinder prediction of how these habitats might respond to mining disturbances. We focus here (1) on bacteria and archaea and not on eukaryotic microbes such as the protists or the fungi, as the former two groups have historically been sampled independently from the latter, and (2) on the abyssal water column and benthic habitats, as this is where the majority of published microbial datasets from the CCZ have concentrated their efforts. This is not to suggest that the shallower pelagic environments are expected to be spared the impact of nodule mining; on the contrary, the release of waste sediments into the water column will likely have substantial and still poorly understood impacts on the structure and activity of midwater pelagic communities of all taxa, including the microbes. However, there are not at present sufficient, validly inter-comparable samples from the overlying water column of the CCZ to make a meaningful assessment of spatial heterogeneity.

Caveats in Assessing Microbial Biodiversity From DNA Sequencing Data
A substantial caveat limiting our ability to directly compare results across existing studies results from the rapid shifts in techniques used to characterize bacterial and archaeal communities over the past approximately 15 years. Marine bacteria and archaea are primarily identified through molecular techniques, in particular taxonomic profiling targeting various hypervariable regions of the 16S small subunit ribosomal RNA (rRNA) marker gene. The move from gene cloning and Sanger sequencing to next-generation sequencing (NGS) techniques such as Illumina sequencing (see review in Klindworth et al., 2013) has increased the number of sequences attainable per sample by several orders of magnitude, in turn increasing the detected richness and diversity of the microbial communities (e.g., Sogin et al., 2006), and this discrepancy only grows as technological advances continue to increase the sequencing depth per run and decrease the cost per read. Much of the early work on CCZ microbes was conducted with Sanger sequencing of cloned 16S rRNA gene amplicons, while many recent samples have been sequenced with NGS, complicating inter-cruise comparisons and detection of any potential temporal patterns in biodiversity.
Further, polymerase chain reaction (PCR) primers used to amplify 16S rRNA genes have changed substantially in recent years. Improvements in primer design have increased detection of abundant and cosmopolitan marine microbes such as the SAR11 clade in the water column (Apprill et al., 2015; Parada et al., 2016) and the Thaumarchaeota (Parada et al., 2016). Studies within the CCZ have also used PCR primers that target different combinations of 16S rRNA gene hypervariable regions, which can affect measures of richness and diversity in addition to the taxonomic composition of the resulting libraries (Klindworth et al., 2013; Wear et al., 2018; see also Supplementary Figures 1-3 for a direct example). These effects limit our ability to make valid, direct comparisons of taxonomic composition between studies conducted with different 16S rRNA gene primer sets. Thus, where necessary, we qualify results with relevant caveats and in some cases omit comparisons between datasets that would otherwise be of interest. We conclude with recommendations that could help to minimize these issues in the future.

Generation of New Dataset
To address the inter-comparability issues introduced by variability in primer choice, we re-sequenced selected benthic boundary layer water, sediment, and nodule genomic DNA samples from the Abyssline cruise program (see Table 1; Lindh et al., 2017; Shulse et al., 2017). We compared these samples with a new dataset collected as part of the DeepCCZ cruise to the western CCZ in spring 2018. DeepCCZ aimed to characterize biodiversity within the three westernmost APEIs. We report here a subset of samples focused on bottom water [nominally 5 m above bottom (MAB)], upper sediments (surface to 5 cm depth), and nodules; additional samples will be described elsewhere.
On the DeepCCZ cruise, water column samples were collected via Niskin bottles affixed to a sampling rosette, with an altimeter allowing sampling of water ∼5 MAB. Seawater (ca. 7.5-8 L) was prefiltered via gentle peristaltic pressure through 3 µm pore size, 25 mm diameter polycarbonate prefilters (Millipore Isopore) to remove macro-organisms and large particles, followed by 0.2 µm pore size, 25 mm diameter polyethersulfone filters (Supor) in series to concentrate microbes. Only results from 0.2 µm pore size filters are reported here. Filters were stored dry at −80 °C. Sampling bottles, tubing, and filter holders were rinsed in Milli-Q water between casts.
Sediments were collected via the ROV Lu'ukai using a push corer. Sediment was extruded and sliced into horizons; we report here the upper horizons, from the surface to 0.5 cm depth; 0.5-1 cm; and 1-5 cm. The upper two sections were transferred to sterile Whirl-Pak bags and frozen at −80 °C. For the larger 1-5 cm section, sterile plastic syringes with the tips cut off were used to subsample sediment from the center of the slices (∼20-30 mL) for freezing in Whirl-Pak bags. Core tubes were washed with dish soap and soaked in 10% bleach, followed by a Milli-Q water rinse, between ROV deployments. Sampling gear such as slicing plates was rinsed with Milli-Q water between cores and soaked in 10% bleach between dives.
Nodules were collected either from push cores or by targeted sampling with the manipulator arm of the ROV. Nodules were gently rinsed with 0.2-µm filtered seawater to remove attached sediments, measured and photographed, and frozen whole at −80 °C. DNA from water column filters was extracted using the DNeasy Plant Mini Kit as described in Shulse et al. (2017). Nodule and sediment DNA was extracted using the FastDNA Spin Kit for Soil (MP Biomedical) with modifications as described in Shulse et al. (2017). Sediment samples were homogenized in the Whirl-Pak bag before DNA was extracted from ∼0.5 g subsamples. Whole nodules were broken apart within a Whirl-Pak bag using a ceramic pestle and resulting particles were mixed; when nodules were large enough, DNA was extracted from replicate ∼0.5 g subsamples and then pooled (for detailed methods for DNA extraction through bioinformatics, see Supplementary Information). All extracted samples were concentrated with the Zymo Clean & Concentrator-5 kit.
On the Abyssline cruises, benthic sediment and nodule samples were retrieved using box corers and Megacorers. Sediments were collected from the surface to 5 cm depth horizon. Deep water samples (8 L) were collected whole on a 0.2 µm pore size filter without prefiltering. DNA was extracted using similar methods.
16S rRNA gene amplicons were generated using primers targeting the V4-V5 hypervariable regions: 515F-Y (5′-GTGYCAGCMGCCGCGGTAA-3′) and 926R (5′-CCGYCAATTYMTTTRAGTTT-3′) as recommended by Parada et al. (2016), with a multiplexing index on the forward primer following the design of the Earth Microbiome Project (Caporaso et al., 2012). Triplicate amplifications were combined and cleaned using an ENZA Cycle Pure Kit (Omega Bio-tek), then pooled into two libraries. Libraries were sequenced on an Illumina MiSeq using paired-end 250 v2 chemistry. Amplicon sequence variants (ASVs) were generated using DADA2 v1.14.1 (Callahan et al., 2016). ASVs are error-corrected, discrete DNA sequences which can differ from one another by as little as a single base pair, in contrast with operational taxonomic units (OTUs), which cluster closely related sequences based on a pre-defined percent nucleotide dissimilarity (Callahan et al., 2016). ASVs were classified to the Silva v138 database. All code is available at https://github.com/ekwear/DeepCCZreview. Individual samples were randomly subsampled to 6,900 sequences using the "rrarefy" function in the R package vegan (Oksanen et al., 2019). Singlet ASVs were removed (for final totals of 6,805-6,876 sequences per sample) and relative abundances were calculated. Weighted UniFrac distance was generated using the R package phyloseq (McMurdie and Holmes, 2013).
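The subsampling, singlet removal, and relative abundance steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used the vegan "rrarefy" function in R; the function names, the toy count table, and the reduced depth of 100 reads are all illustrative assumptions, as is the reading of "singlet" as an ASV whose total count across all samples is one.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample one sample's per-ASV read counts to `depth` reads without replacement."""
    # Expand the counts into one entry per read, labelled by ASV index.
    reads = [asv for asv, n in enumerate(counts) for _ in range(n)]
    if len(reads) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    drawn = Counter(rng.sample(reads, depth))
    return [drawn.get(asv, 0) for asv in range(len(counts))]


def drop_singletons(table):
    """Keep only ASVs observed more than once across the whole table.

    Assumption: a "singlet" is an ASV with a total count of one after
    subsampling; ASVs that dropped to zero are removed as well.
    """
    totals = [sum(column) for column in zip(*table)]
    keep = [j for j, total in enumerate(totals) if total > 1]
    return [[row[j] for j in keep] for row in table]


def relative_abundance(row):
    """Convert one sample's counts to proportions that sum to one."""
    total = sum(row)
    return [n / total for n in row]


# Hypothetical toy table: rows are samples, columns are ASVs.
raw_counts = [
    [120, 30, 1, 0, 49],
    [80, 0, 0, 15, 105],
    [60, 60, 0, 0, 130],
]
DEPTH = 100  # illustrative only; the study subsampled to 6,900 sequences
rng = random.Random(42)

rarefied = [rarefy(row, DEPTH, rng) for row in raw_counts]
filtered = drop_singletons(rarefied)
relative = [relative_abundance(row) for row in filtered]
```

Because singlets are removed after subsampling, per-sample totals can fall slightly below the subsampling depth, which is consistent with the 6,805-6,876 sequences per sample reported above.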
Non-metric multidimensional scaling ordinations and multivariate statistics were conducted, and diversity metrics were calculated, in Primer-E v6 (Clarke and Gorley, 2006). Analysis of Similarity (ANOSIM) tests were conducted using weighted UniFrac distances between communities that had been calculated based on ASV-level composition. Two outlier samples from AB02 were discarded based on ordination patterns. Differentially abundant ASVs were identified using the R package DESeq2 (Love et al., 2014). We considered as significantly differentially abundant those ASVs with a greater than three-fold difference between categories (that is, a log2-fold change between categories > 1.585) and a false discovery rate-adjusted p < 0.05. Throughout the text, the phrase "differential abundance" specifically refers to DESeq2 results that have met this significance threshold.
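As a worked illustration of the significance threshold above, a three-fold difference corresponds to log2(3) ≈ 1.585. The short Python sketch below applies that cutoff to a hypothetical results table; the ASV names and values are invented for illustration, and taking the absolute value of the fold change (so that enrichment in either category counts) is an assumption rather than something stated in the text.

```python
import math

LOG2FC_CUTOFF = math.log2(3)  # three-fold difference, approximately 1.585
ALPHA = 0.05                  # false discovery rate-adjusted p-value cutoff


def is_differentially_abundant(log2_fold_change, adjusted_p):
    """Apply the fold-change and adjusted p-value thresholds together.

    Assumption: the fold-change criterion is applied in both directions.
    """
    return abs(log2_fold_change) > LOG2FC_CUTOFF and adjusted_p < ALPHA


# Hypothetical rows: (ASV name, log2 fold change, adjusted p-value).
results = [
    ("ASV_a", 2.10, 0.001),   # large change, significant
    ("ASV_b", 1.20, 0.001),   # significant but below three-fold
    ("ASV_c", -1.90, 0.030),  # enriched in the other category
    ("ASV_d", 3.00, 0.200),   # large change, not significant
]

significant = [name for name, lfc, padj in results
               if is_differentially_abundant(lfc, padj)]
```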

Literature Survey and Dataset Reanalysis
We aggregated datasets where community-level, amplicon-based 16S rRNA gene sequences from the CCZ were publicly available, through a combination of literature surveys, public database reviews, and expert knowledge of cruises within the CCZ. Figure 1 and Table 1 highlight the locations of the datasets included in the analysis. We opted not to include microbial sequences from isolates or metagenomes in our spatial comparison of microbial community composition, as these approaches provide different types of information than 16S rRNA gene amplicon surveys; such studies from the CCZ are also numerically sparse and geographically clustered, limiting their utility for spatial analysis. A small number of published studies where raw sequence results were not available in a public database, or where the public database was missing metadata necessary for sample identification, were included in the discussion where relevant but not in the reanalysis. We also omitted those studies that sampled exclusively the upper water column.
To standardize taxonomic classifications to the greatest extent possible, we reanalyzed previously published sequences meeting the above qualifications (Table 1), which were aligned and classified to the Silva v138 database using mothur v1.42.3 (Schloss et al., 2009). Published datasets sequenced with the Illumina platform included Lindh et al. (2017) and Shulse et al. (2017). We have focused on resequenced samples rather than using original sequences from these two publications to avoid potential confounding effects of different primer sets (e.g., see Supplementary Figures 1-3).
Table 1 footnotes. (1) COMRA, China Ocean Mineral Resources R&D Association contract area (which is divided into east and west regions); UK1, UK Seabed Resources Ltd contract area, with two sampling sites (A and B); OMS, Ocean Mineral Singapore contract area. (2) The number of clones passing quality control in our bioinformatics pipeline. (3) Because sediment cores were sliced at different depth horizons on the DeepCCZ cruise than on the Abyssline cruises, these sediment samples include multiple three-sample depth profiles encompassing the upper 5 cm (see section "Materials and Methods"). (4) Nodules were collected from APEIs 1 and 4 but were not present in the sampled region of APEI 7. (5) As recorded for CTD casts.

Taxonomic Nomenclature
The phylogenetic relationships between, and taxonomic classifications of, marine microbes remain in flux. We have chosen herein to follow the nomenclature used in Silva v138, with the sole exception of the Thaumarchaeota, which are referred to as the Crenarchaeota in Silva v138.

Data Availability
Sequence files from DeepCCZ samples, and re-sequenced samples from the Abyssline cruises, are available from the Sequence Read Archive under project code PRJNA660809. Sample metadata are available in Supplementary Tables 1 and 2 and are archived on Zenodo; analyzed sequence results are also archived on Zenodo.

RESULTS AND DISCUSSION
Our survey of the community-level microbial literature focused in the CCZ yielded three clone library studies that met our criteria for inclusion, encompassing two nodule samples and seven sediment samples, and two published NGS studies from the Abyssline cruise program, from which we resequenced 75 samples (Table 1): 29 from nodules, 35 from sediment, and 11 from bottom waters. The DeepCCZ dataset added 69 new samples: 19 from nodules, 44 from sediment, and 6 from bottom waters. Altogether, these studies include samples from three exploration license areas and four APEIs, as well as some samples from outside currently allocated areas (Table 1 and Figure 1).

Habitat Specificity Is the Primary Driver of Microbial Community Composition
Previous studies from the CCZ (Wu et al., 2013; Shulse et al., 2017; Lindh et al., 2017) have noted the selective effect of habitat (that is, water column vs. sediment vs. nodule) in determining overall microbial community composition as well as measures of diversity and richness, regardless of methodologies used (PCR primers, sequencing technique, etc.). Our literature survey and new dataset add further support to the assessment that habitat is the primary control of microbial community composition in the CCZ (Figures 2A, 3 and Supplementary Figure 4). Overall community composition across studies was similar (Figure 3): where archaeal 16S rRNA genes were measured, ASVs in the family Nitrosopumilaceae were dominant members of all habitat types. An assortment of members of the classes Alphaproteobacteria and Gammaproteobacteria were consistently abundant, with the composition varying by habitat: the SAR11 order was very abundant in the near-seabed water samples, while the sediment and nodule samples had high relative abundances of the alphaproteobacterial order Kiloniellales and the gammaproteobacterial genus Woeseia. Most samples also contained common deep-sea phyla including the Acidobacteria, members of the Bacteroidota, and the Planctomycetota. Where there were clear differences between studies at a broad taxonomic level (Figure 3), they were between habitat types rather than between geographic locations.
Within our new dataset of DeepCCZ and re-sequenced Abyssline samples, ANOSIM tests indicated that all habitat types were significantly different from one another, even the sediment and nodule communities that were in some cases collected in physical contact with one another (Supplementary Table 3). While sediment and nodule communities shared a high proportion (>85%) of abundant ASVs (with abundant ASVs defined here as those with a cumulative relative abundance within two orders of magnitude of the maximum cumulative relative abundance within a habitat; see Supplementary Figure 5), only about 16% of abundant ASVs were present in both water samples and at least one benthic habitat (Supplementary Figure 5D). In addition to the differences in ASV presence, there were strong patterns in differential abundance as identified by DESeq2 analysis between habitats, with 2,490 ASVs differentially abundant between nodules and water samples; 2,947 between sediment and water samples; and 1,607 between nodules and sediment (Supplementary Figure 6 and Supplementary Table 4). Some of these differences at the ASV level reflect the clear ecological trends seen at broader phylogenetic levels (Figure 3 and Supplementary  Figure 4), such as significantly higher abundances of SAR11 and phylum Thermoplasmatota ASVs in water samples than in sediments and nodules.
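The "abundant ASV" definition used above (cumulative relative abundance within two orders of magnitude of the habitat maximum) can be sketched in Python as follows. The toy tables, ASV labels, and function names are hypothetical, and expressing the shared proportion as intersection over union is an assumption, since the denominator is not specified in the text.

```python
def abundant_asvs(rel_abund, asv_ids, orders=2):
    """Return ASVs whose cumulative relative abundance across a habitat's
    samples is within `orders` orders of magnitude of the habitat maximum."""
    cumulative = [sum(column) for column in zip(*rel_abund)]
    cutoff = max(cumulative) / 10 ** orders
    return {asv for asv, total in zip(asv_ids, cumulative) if total >= cutoff}


def shared_fraction(set_a, set_b):
    """Fraction of the combined abundant ASVs found in both sets
    (assumed denominator: the union of the two sets)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical relative abundance tables: rows are samples, columns are ASVs.
asv_ids = ["ASV1", "ASV2", "ASV3", "ASV4"]
sediment = [
    [0.50, 0.30, 0.199, 0.001],
    [0.60, 0.25, 0.149, 0.001],
]
nodule = [
    [0.40, 0.004, 0.396, 0.2],
    [0.50, 0.004, 0.296, 0.2],
]

sediment_abundant = abundant_asvs(sediment, asv_ids)
nodule_abundant = abundant_asvs(nodule, asv_ids)
overlap = shared_fraction(sediment_abundant, nodule_abundant)
```

In this toy case ASV4 falls below the sediment cutoff and ASV2 falls below the nodule cutoff, so two of the four abundant ASVs are shared.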
Notably, many clades, even at taxonomic levels as low as genera, contained distinct ASVs that were enriched in each habitat (Supplementary Figure 6). This pattern included very common groups across CCZ habitats, such as the family Nitrosopumilaceae, the phylum Planctomycetota, and the gammaproteobacterial genus Woeseia. This ASV-level habitat selection adds an important nuance to the bulk views of microbial community composition, such as in Figure 3 and Supplementary Figure 4: while a clade may be present and even dominant within multiple habitats in the CCZ, that does not necessarily mean the same members of that clade are present across all habitats. Tully and Heidelberg (2013) observed similar fine-scale habitat preferences within Thaumarchaea MG-1 OTUs collected from nodules and sediment from the South Pacific Gyre, with some OTUs shared across habitats and others limited to one habitat. Ultimately, this suggests that, to most correctly assess microbial diversity in the CCZ, we need to work at an ASV-level resolution, that is, at the highest possible resolution for amplicon data, able to identify taxa with marker genes differing by as little as a single base pair after application of an error-correction model (Callahan et al., 2017). While broad summaries such as Figure 3 may be the best possible approach for combining currently available datasets that were sampled with different techniques, the fine-scale differential abundances within clades seen in this ASV-level analysis indicate that such an approach may overlook ecologically important levels of biodiversity.
FIGURE 2 | [Beginning of caption, including panel (A), not recovered.] (B) Bottom seawater samples. In this and subsequent panels, point color and shape indicate sample source region. (C) Sediment samples. This ordination is more correctly viewed as a 3D plot; see Supplementary Video 1 for a 3D animation. The most notable difference in the 3D plot is that the two extremes of the Y-axis in the 2D ordination wrap around to approach each other, such that the samples from APEI 6 are not in reality as distinct from the other AB02 samples as they appear to be in this plot. (D) Nodules.

Case Study: Eastern vs. Western CCZ Sites
The DeepCCZ and re-sequenced Abyssline samples allow for a direct comparison of microbial communities from the three westernmost APEIs with those from mining license exploration areas and an APEI in the northeastern CCZ. Based on non-metric multidimensional scaling (NMS) ordinations, spatial differences were present, though less pronounced than differences between spatially co-occurring habitats (e.g., sediments vs. seawater). The magnitude of spatial effects depended on habitat type. Spatial differences were most clear in the sediments (Figure 2C and Supplementary Video 1), where the samples from the western APEIs fell into a south-to-north gradient. The sediments from APEI 6 in the eastern CCZ most closely resembled those of the northernmost western APEI 1, while the sediments from the Abyssline license sites formed a distinct cluster. The bottom waters showed less spatial separation, with weak partitioning along the east-west divide (Figure 2B), while nodules appeared to vary more by cruise than by geographic location as such (Figure 2D). These biogeographic patterns are consistent with the broad patterns seen in physicochemical variables across the CCZ: bottom waters are similar across the region, while multiple parameters expected to affect benthic organisms (e.g., modeled particulate organic carbon flux to the seafloor, water depth, and sediment composition) vary both from the eastern to western CCZ and latitudinally within the western CCZ (Washburn et al., 2021).
ANOSIM comparisons indicated small but significant differences at the community level between nodules collected from the eastern CCZ versus the western CCZ (if APEI 6 is included: R = 0.123, p = 0.008; if APEI 6 is excluded: R = 0.158, p = 0.003). The majority of pairwise comparisons between nodule communities from specific eastern and western sampling sites were likewise significantly different, while nodules from sites sampled on the same cruises did not differ at the community level (that is, APEI 4 vs. APEI 1 in the west and the UK1-B vs. OMS sites in the east; Supplementary Table 5). ANOSIM comparisons also suggested significant east-west differences in both bottom water and sediment samples (Supplementary Table 5). However, geographic distance in these comparisons is confounded with sampling differences between the DeepCCZ and Abyssline cruises, including in pre-filtering of water samples and sampling of different sediment depth horizons, and thus these observed differences may not be solely ecological in origin. Consistent with these relatively subtle spatial patterns, the majority of abundant ASVs (defined based on cumulative relative abundances, as described above) within each habitat type (84-94%) were shared between all three cruises (Supplementary Figure 5), and spatial differences originated in large part from shifts in relative abundances rather than from presence/absence dynamics. DESeq2 analysis of differential abundances of ASVs within nodules (the habitat type sampled comparably across all cruises in this dataset) from the eastern and western CCZ indicated there were 375 ASVs enriched in the nodules from the eastern sites and 204 in the nodules from the western sites, out of 17,510 total ASVs present across the nodule samples (Supplementary Figure 7 and Supplementary  Table 4). 
However, these patterns were not driven by ASVs with high relative abundances, as only six differentially abundant ASVs were present at mean relative abundances of 0.5% or greater within a region (Supplementary Table 6). As was the case in the differential abundance analysis between habitats, several common taxonomic groups had ASVs that were enriched in each region, including the order Nitrosopumilales, the phylum Gemmatimonadota, the genus Woeseia in the Gammaproteobacteria, and the genus Nitrospina, suggesting some members of these groups may have preferred habitat niches across the CCZ when examined at a fine taxonomic resolution.
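To make the ANOSIM R values reported above easier to interpret, the following is a minimal sketch of how the statistic is conventionally computed from a distance matrix and group labels. It is not the code used for this analysis; the function names, the toy distance matrix, and the permutation count are illustrative assumptions. R compares the mean rank of between-group distances with the mean rank of within-group distances, so values near 0 (such as the 0.123 and 0.158 reported for nodules) indicate only slight separation between groups, whereas a value of 1 indicates complete separation.

```python
import itertools
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a square distance matrix and one group label per sample."""
    pairs = list(itertools.combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    # Dividing by half the number of pairs scales R to the range [-1, 1].
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(dist, groups, permutations=999, seed=0):
    """Return (R, p), with p estimated by permuting the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: three "east" and three "west" samples, with small
# within-region distances (0.1) and larger between-region distances (0.5).
labels = ["east", "east", "east", "west", "west", "west"]
toy = [[0.0 if i == j else (0.1 if labels[i] == labels[j] else 0.5)
        for j in range(6)] for i in range(6)]
r_stat, p_value = anosim(toy, labels)  # R is 1.0 for this fully separated case
```

Because the statistic is rank-based, it is insensitive to the absolute size of the distances, which is why a significant but small R is compatible with the low absolute weighted UniFrac distances discussed later in the text.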
Diversity metrics varied between habitat types but not between geographic locations (Figure 4), with the exception of the benthic samples from APEI 6 (discussed below). Across cruises and stations, sediment samples had the highest observed ASV richness (mean of all samples: 1,774 ASVs), Pielou's evenness (0.91), and Shannon diversity (6.77) values; bottom water samples had the lowest values of all metrics (634; 0.82; and 5.30, respectively); and nodules were in between (1,166; 0.87; and 6.15, respectively). Although sampled volumes differed between habitats, these diversity metrics were calculated on a standardized number of sequences per sample (see section "Materials and Methods"). The bottom water samples from DeepCCZ had lower mean ASV richness and Shannon diversity than the bottom water samples from the Abyssline cruises, but as the DeepCCZ samples were subjected to a 3 µm pore-size pre-filtration step and the Abyssline samples were not, we do not have confidence that this difference is ecological rather than methodological in origin.
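For readers less familiar with these metrics, the sketch below shows one conventional way to subsample a sample to a fixed read depth and then compute observed richness, Shannon diversity (natural log), and Pielou's evenness. It is an illustration under stated assumptions rather than the pipeline used here: the function names and the toy counts are hypothetical, and the removal of singlet ASVs described in the Figure 4 caption is not reproduced.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads, without replacement, from ASV counts."""
    reads = [asv for asv, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


def diversity(counts):
    """Return (richness S, Shannon H' in nats, Pielou's J = H' / ln S)."""
    present = [n for n in counts.values() if n > 0]
    total = sum(present)
    richness = len(present)
    shannon = -sum((n / total) * math.log(n / total) for n in present)
    # J is undefined for a single taxon; 0.0 is returned here by convention.
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Hypothetical toy sample: four ASVs with unequal read counts.
toy_counts = {"ASV_1": 400, "ASV_2": 300, "ASV_3": 200, "ASV_4": 100}
subsampled = rarefy(toy_counts, depth=500)  # the new dataset used 6,900 reads
richness, shannon, evenness = diversity(subsampled)
```

As a rough consistency check, applying J = H'/ln(S) to the mean values reported above gives approximately 0.90 for sediments, 0.82 for bottom waters, and 0.87 for nodules, close to the reported evenness values; exact agreement is not expected because the mean of per-sample ratios is not the ratio of the means.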
An unexpected result of this analysis was the relatively pronounced intra-habitat difference in sediments and nodules between spatially adjacent but temporally separated (∼17 months) Abyssline license area sites. The differences between samples collected on AB01 and AB02 were, by several measures, roughly as large as the differences between samples collected across the approximately 1,500 km latitudinal gradient of the DeepCCZ cruise or the differences between the eastern and western CCZ sites (Figures 2, 5 and Supplementary Figure 8). Nodule communities from the UK1-A site sampled on AB01 were significantly different from those from both AB02 license area sites according to an ANOSIM analysis, and sediment communities differed between the UK1-A site and the AB02 OMS license site (Supplementary Table 5). The Abyssline cruises had overlapping personnel and protocols, making it unlikely that this difference is solely a sampling artifact. This raises the possibility that there could be temporal as well as spatial variability in benthic microbial communities in the CCZ, which would need to be considered in assessing the extent of mining-induced community shifts, particularly in light of the relatively subtle spatial differences we find in microbial communities across the CCZ. Despite the stereotypical view of the abyssal sea as a largely stable habitat, there is evidence this is not the case for some trophic groups; for example, abyssal sediment macrofaunal community composition and function were observed to correlate with seasonal to interannual variability in particulate organic carbon flux in a decade-long study (Ruhl et al., 2008). Alternatively, our findings may indicate that spatial variability within the CCZ can be non-linear over fairly short geographic distances.
Distinguishing between these effects will require more extensive datasets than are currently available, but such a distinction will be critical to designing sampling strategies that fully characterize the extant microbial biodiversity in the CCZ.

Case Study: APEI and Mining License Areas in the Eastern CCZ
Most studies of CCZ microbes have focused on either mining exploration license areas or on APEIs. An exception to this is Lindh et al. (2017), in which the authors sampled one box core and one rosette cast in an APEI (APEI 6) relatively close to the license areas that were the main focus of their work. However, given the large spatial extent of the CCZ, this still constituted an almost 900 km distance between the APEI sampling site and the closest license-area samples. Their sequencing of bacterial 16S rRNA gene amplicons indicated that community composition across all habitats (water column, sediment, and nodules) was significantly different between the APEI and corresponding samples from two nearby license areas.
Based on our spatially expanded analysis, it appears that the region of APEI 6 that Lindh et al. (2017) sampled may be an outlier relative to the CCZ in general, at least in sediments and nodules where we have enough samples for a confident assessment (see re-sequenced data in Figures 2B-D and Supplementary Video 1). Consistent with the observations of Lindh et al. (2017), the two nodules re-sequenced from APEI 6 had higher relative abundances of Actinobacteria (7.3 and 8.1%, vs. 0.4-6.1% in all other nodules) and Nitrospira (3.7 and 7.8%, vs. 0.1-2.8%) and lower relative abundances of Planctomycetota (3.1 and 3.2%, vs. 4.2-11.4%) than any other nodules included in the new dataset (Supplementary Figure 4C). The APEI 6 nodule samples also had low richness, evenness, and diversity metrics compared with those of the nearby license areas (Figure 4). These distinct benthic communities are perhaps not surprising given the location of APEI 6 on the far northeastern edge of the CCZ, and correspondingly we interpret this specific example as an illustration of the need for further comparisons between APEI and license area microbial communities. While our synthesis and new dataset both indicate that geographic variability within specific habitat types in the sampled portions of the CCZ is relatively low (that is, the absolute weighted UniFrac distances between samples are low [<0.3 between nodule samples and <0.2 between sediment samples; Figure 5] and the majority of ASVs are present within habitat types across the CCZ [Supplementary Figure 5]), the current APEIs are generally located on the periphery of the CCZ, in some cases spanning the fracture zones (Figure 1), and thus could be prone to conditions promoting outlier microbial communities.

FIGURE 4 | Richness, evenness, and diversity across habitat types and sampling locations from the new dataset. (A) Observed richness: number of observed ASVs (amplicon sequence variants) per sample, after each sample was randomly subsampled to 6,900 reads and singlet ASVs were removed. Within each habitat type, sampling locations are arranged from roughly northeast to southwest. Because of the much smaller number of water column samples, these were grouped by cruise; no nodules were collected in APEI 7. Lines indicate mean ± one standard deviation. (B) Evenness calculated as Pielou's J. (C) Shannon diversity, log base e.

Nodule Communities Within the CCZ vs. Bordering Regions
While the CCZ is a particular target for potential mining activity due to the abundance and elemental composition of polymetallic nodules found there, microbial communities have also been studied from nodules collected from the central South Pacific Gyre (Tully and Heidelberg, 2013); the Peru Basin, which is approximately 4,000 km to the southeast of the southeastern corner of the CCZ; the South China Sea (Zhang et al., 2015); and the western North Pacific Gyre (Jiang et al., 2020). The same issues with different sampling techniques that hinder direct comparisons between studies within the CCZ also arise in comparisons with bordering regions, yet several trends are clear despite these complications. Where both sediments and nodules have been examined, there are always pronounced overlaps in OTU or ASV presence between the two habitats, with habitat differences largely driven by differential relative abundances rather than unique organisms, consistent with the patterns we observed in the CCZ. Bacterial communities in nodules across sites have broadly similar composition when examined at the class to phylum levels, dominated by diverse assemblages of both Alphaproteobacteria and Gammaproteobacteria, in particular members of the Alteromonadales. A further suite of phyla was frequently present in nodules across sites in the general range of circa 1 to 10% relative abundances, including the Bacteroidota, Actinobacteria, Planctomycetota, and Verrucomicrobiota.

FIGURE 5 | Weighted UniFrac distances between comparable samples across spatial and temporal distances. AB01 and AB02 are Abyssline cruises 1 and 2, respectively; APEI 6 was omitted from AB02 for this analysis as it is spatially distant from the other sampled regions. Approximate distances and times between sample collections are listed between panels; cardinal directions are reported as traveling from the first to the second location. Colors are visual guides to distinguish categories of comparisons only. (A) Nodule samples; category labels as in (B). (B) Sediment samples. For sediments, DeepCCZ samples were only compared within depth horizons between APEIs and within sites, as depth profiles are a distinct aspect of spatial ecology, but all depths were compared with Abyssline samples due to different depths sampled.
However, differences in archaeal abundance are substantial enough to suggest that nodules in the CCZ, or perhaps more broadly within the meso- to oligotrophic central gyres of the North and South Pacific (Smith and Demopoulos, 2003), may constitute a microbial habitat distinct from that of nodules in more productive oceanic regions. In our CCZ dataset, we observed that members of the family Nitrosopumilaceae made up between 12 and 35% of nodule communities (mean 22%). Molari et al. (2020) reported lower relative abundances of Nitrosopumilaceae in nodules sampled from the Peru Basin compared with the CCZ results reported by Shulse et al. (2017); they observed total archaeal sequences constituting at most 7% of nodule communities, a difference the authors attribute to the higher particulate matter fluxes in the Peru Basin and subsequent differential selective pressure on the Nitrosopumilaceae from increased ammonium availability [although bacteria and archaea were amplified with distinct primers in Molari et al. (2020), which may additionally confound this comparison]. Zhang et al. (2015) also observed very low (0.01%) relative abundances of an OTU related to Nitrosopumilus in nodules collected from a dormant hydrothermal vent on a seamount in the South China Sea. While Zhang et al. (2015) used a primer set (515F and 806R) that is known to underrepresent the relative abundance of the Thaumarchaeota, the reported primer effects in direct comparisons have not been severe enough to explain, on their own, the discrepancy with the much higher relative abundances of Nitrosopumilaceae seen in nodules across the CCZ (Parada et al., 2016; see also Supplementary Figures 2, 3). In contrast, Tully and Heidelberg (2013) found a high relative abundance (in some cases, >50% of the community) and OTU-level diversity of Thaumarchaeota in nodules from the South Pacific Gyre, and Jiang et al. (2020) reported a high proportion of cloned archaeal 16S rRNA genes from the Thaumarchaeota, and particularly those closely related to Nitrosopumilus, in the western North Pacific Gyre. This variable relative abundance of Thaumarchaeota raises the possibility that polymetallic nodules underlying meso- to oligotrophic regions such as the CCZ may play more of a role in supplying their local community with chemoautotrophic production of organic matter, and thus nodule removal could have different trophic implications between regions.

CONCLUSION
Our results strongly support a number of previously posited and novel conclusions about microbial communities across the CCZ. First, habitat type is the strongest determinant of microbial community composition; bottom waters share relatively few high-resolution, ASV-level taxa with the benthic habitats, while sediments and nodules share many ASV-level taxa that differ in relative abundances. Second, we observe an east-vs.-west distinction in microbial community composition in the CCZ, although the magnitude of this effect, at least with current sample sizes, is relatively subtle and driven by changes in relative abundances rather than ASV-level taxon turnover. However, we also observed community-level differences between sites within the same claim area but sampled ∼17 months apart that were comparable in magnitude to differences between the eastern and western CCZ, suggesting either temporal variability or non-linear spatial shifts in community structure may be important. Either of these patterns would necessitate a specialized monitoring plan, accounting for seasonality or empirically determined spatial factors, to better characterize microbial biodiversity in the region. Finally, the observation that nodules in the CCZ harbor different microbial communities than nodules from more productive regions to the east and west of the CCZ, specifically higher relative abundances of putative chemoautotrophic organisms in CCZ nodules, suggests a possible effect of mining on trophic ecology in this meso- to oligotrophic region, although functional studies are required to confirm this. We can also unambiguously conclude that the currently available datasets do not represent a thorough spatial sampling of the microbial communities of the CCZ. In particular, the central and southeastern regions of the CCZ are poorly represented, and comparisons between proximate license areas and APEIs are extremely limited. We are aware of a small number of research cruises that have occurred but have not yet published microbial biodiversity data, which will improve but not fully alleviate these deficits.

Knowledge Gaps Concerning Microbial Biodiversity and Ecology in Nodule Regions
There remain several major gaps in our knowledge of CCZ microbes, in particular of those inhabiting polymetallic nodules, that we cannot address with current datasets. This includes the basic issue of whether reserve areas provide similar habitats to nearby exploration license areas. As noted above, available studies have largely sampled either APEIs or exploration areas, with scant paired comparisons between neighboring sites. Given the anticipated lengthy recovery time of sediment microbial communities after mining exposure, and the presumably much longer time before nodules could re-form to be colonized (Ku and Broecker, 1969; Hein et al., 2020), it is particularly important that these reserves of intact habitat be representative of the broader CCZ.
This synthesis has focused on habitat selection and basin-scale spatial distribution patterns, but we lack knowledge of the basic ecology of deep-sea nodule-associated bacteria and archaea at much finer scales as well. This lack of basic ecological understanding has monitoring implications, in that it may inhibit our ability to design sampling programs that are robust to local variability when the goal is detecting spatial effects (see, for example, the high within-site variability in weighted UniFrac distances among some samples in Figure 5). At this point, we cannot confidently answer basic questions, including: how do properties such as nodule size and metal composition affect microbial community composition? Is nodule density a factor in structuring small-scale microbial communities, as has been observed for megabenthic faunal composition (Simon-Lledó et al., 2019)? Does the frequent colonization of nodules by larger organisms including xenophyophores and sponges (e.g., Gooday et al., 2020) select for particular microbial communities, as has been observed with co-occurring epizoans in other marine habitats (e.g., on kelp blades: James et al., 2020)? Sediment-dwelling xenophyophores have been shown to contain enhanced and compositionally distinct bacterial fatty acids relative to surrounding sediments, suggesting some combination of bacterivory and/or bacterial growth on xenophyophore fecal pellets (Laureillard et al., 2004); both processes potentially could affect microbial community composition in a nodule to which a xenophyophore was attached. And like their shallow-water counterparts, deep-sea sponges host communities of bacteria and archaea that are distinct from those of the surrounding seawater, with deep-sea sponges being particularly favorable habitats for members of the Nitrosopumilaceae (e.g., Steinert et al., 2020).
Over the very long temporal duration of nodules (Ku and Broecker, 1969; Hein et al., 2020), attached macro-organisms would presumably episodically die; live vs. detrital epizoans would likely have different effects on microbes, and we speculate that changes in epizoan presence and health could potentially explain some nodule samples that have been observed to have outlier microbial communities, such as the three nodules from studies in the central Pacific that were each highly enriched in the chemoheterotrophic genus Colwellia (Tully and Heidelberg, 2013; Blöthe et al., 2015), which includes piezophilic members with chitin-degradation pathways (Peoples et al., 2020).
We also have an incomplete picture of how microbes use nodules as habitat. What are the proportions of microbes in surficial biofilms vs. shallow indentations vs. interior pore waters, and how do they differ? One study that compared the interior vs. exterior of a nodule found that microbial cell abundance was elevated at the nodule surface and decreased inward (Shiraishi et al., 2016). Studies that have compared interior and exterior bacterial and archaeal communities have seen differences in relative abundance and to a lesser extent in presence/absence of OTUs or higher clades between layers (Tully and Heidelberg, 2013; Shiraishi et al., 2016), while another observed higher OTU richness in layers closer to the exterior than in those closer to the nodule nucleus (Jiang et al., 2020). Other studies have found evidence for intra-nodule spatial heterogeneity, with differences in the microbial communities within the hydrogenetic (facing and formed by precipitation from the seawater) vs. diagenetic (facing the sediment and formed by precipitation from the pore waters) portions of individual nodules (Blöthe et al., 2015; Cho et al., 2018). The available nodule-associated microbial samples are insufficient to tease apart these potential factors, with the additional complication that different datasets have variable metadata that further reduce our ability to draw ecological conclusions.
Finally, what are the ecological and biogeochemical roles of bacteria and archaea inhabiting the benthos and near-seabed waters of the CCZ? While this information may be less important for designing sampling schemes, it could help us to better anticipate possible effects of mining on benthic community function. Rate measurements of key metabolic processes and molecular analyses of functional genes via metagenomics and metatranscriptomics would enable an accurate assessment of whether removing nodules and disturbing surface sediments will alter a key part of the abyssal elemental cycles in the CCZ. While rate measurements on deep-sea microbes are complicated by slow speeds and the need to maintain relevant hydrostatic pressures, with careful methodology such rates in the abyssal CCZ do appear to be quantifiable. For example, in situ tracer experiments in the eastern CCZ demonstrated that bacteria in nodule-containing sediments were active both in remineralization of added phytodetritus and in chemoautotrophic fixation of inorganic carbon, although the metabolic pathways underlying the latter process remain unknown (Sweetman et al., 2019).

Recommendations for Future Sampling
Beyond the limited number of extant microbial datasets from the CCZ, our ability to assess biodiversity and microbial ecosystem services is further hampered by the difficulties in directly comparing the samples that do exist. We make several recommendations for future microbial biodiversity work in the CCZ, contributing to the assessment guidelines recommended by the International Seabed Authority (2019), to facilitate future spatial and temporal inter-comparison efforts:

(1) Standardized sampling to the greatest extent practical, from field collection through DNA sequencing, will be key to enabling long-term monitoring of microbial diversity in response to mining activity, particularly in light of the very subtle spatial patterns within habitat types that we report in this analysis. International Seabed Authority guidelines already recommend next-generation (i.e., Illumina) amplicon-based sequencing. As the most pressing aspect of this standardization, we suggest adopting a single set of primers for consistent use, ideally one endorsed by a consortium, such as those validated by the Earth Microbiome Project (Walters et al., 2015). The re-analysis included in this manuscript demonstrates how extant sequence information can be reprocessed as necessary to be compatible with future changes in bioinformatics processing routines (e.g., the shift in identifying working amplicon sequences from OTUs to ASVs) and updated reference databases. However, if the physical regions of the 16S rRNA gene that constitute those sequences, and the biases introduced during their generation by factors such as PCR primers, are not inter-comparable, that cannot be corrected for in silico. Yet we recognize that a static option may not be possible if improved primers are introduced over time, which motivates the second recommendation.

(2) As full DNA sequencing standardization over timescales of decades will be impossible in the face of expected future technological improvements, we strongly encourage archiving of biological samples or extracted environmental DNA. This would enable the design of valid Before-After-Control-Impact studies (e.g., as are used to assess responses to the establishment of Marine Protected Areas or the effects of human-caused or natural perturbations; Osenberg and Schmitt, 1996) of the effects of mining processes on microbial communities, using the anticipated future sequencing approaches and instrumentation. Ideally there would be a central repository for such samples, but failing that, we encourage a strong commitment to long-term maintenance from individual research laboratories and contractors, as well as a willingness to share samples within the community when possible. Planned and coordinated archiving will likely be particularly important in the CCZ in light of the anticipated slow pace of post-disturbance community recovery and corresponding need for long-term monitoring (discussed in Jones et al., 2020).

(3) Collection of a standardized set of metadata for polymetallic nodules sampled for DNA will enable ecologically relevant comparisons and potentially help to identify those factors influencing microbial community composition. Beyond the basics of location and depth (both of the overlying water column and within the sediment), we encourage measures of nodule size and notes or photographs documenting attached organisms and unusual colors or features (e.g., see Supplementary Table 2). Additional information may not realistically be obtainable for all samples but is worth recording where feasible; for example, metal composition and nodule density could influence microbial communities associated with nodules and in nearby sediments. This list would likely expand or change as we better understand the ecology of nodule-inhabiting microbes.

(4) While a shift to non-targeted sequencing (i.e., shotgun metagenomics) could help to alleviate some of the primer-related inter-comparability issues and would have the additional benefit of providing functional information to address ecological questions, it is unlikely to be a panacea for biodiversity monitoring. This approach carries both technical and practical limitations. It too is affected by improvements in sequence yields and therefore changes in sequencing depth over timescales of decades. Functional gene analyses can be less precise for taxonomic identification than 16S rRNA genes, particularly in sparsely studied environments such as the deep sea. Finally, metagenomic analysis remains relatively expensive compared with 16S rRNA gene profiling and requires an arguably greater degree of specialist knowledge to interpret; it is highly unlikely to be accepted by commercial contractors as a component of routine monitoring activity in the near future. Metagenomics certainly has a place in furthering our understanding of the ecology of the CCZ in a research context, but for routine impact monitoring, we believe that the community should focus on optimizing amplicon-based approaches at this time.
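As one way to picture the standardized nodule metadata proposed in recommendation 3, the sketch below defines a simple record type. It is purely illustrative: the class name, field names, units, and example values are hypothetical and are not part of any existing standard or of the International Seabed Authority guidelines. The core fields mirror the basics named above (location, water depth, depth within the sediment), and the optional fields cover the measurements that may not be feasible for every sample.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NoduleSampleMetadata:
    """Hypothetical metadata record for a nodule sampled for DNA."""

    # Core fields: location and depth.
    sample_id: str
    latitude_deg: float
    longitude_deg: float
    water_depth_m: float
    sediment_depth_cm: float  # depth within the sediment (0 = at the surface)

    # Encouraged fields: nodule size and documentation of attached organisms.
    nodule_size_mm: Optional[float] = None
    attached_organisms: List[str] = field(default_factory=list)
    feature_notes: str = ""  # unusual colors or features
    photo_files: List[str] = field(default_factory=list)

    # Optional fields that may not be feasible for all samples (units assumed).
    metal_composition_wt_pct: Optional[Dict[str, float]] = None
    nodule_density_per_m2: Optional[float] = None

    def missing_fields(self) -> List[str]:
        """Names of fields left unset, to flag gaps before data submission."""
        return [name for name, value in vars(self).items() if value is None]


# Hypothetical example record; all values are invented for illustration.
record = NoduleSampleMetadata(
    sample_id="example-nodule-01",
    latitude_deg=12.0,
    longitude_deg=-117.0,
    water_depth_m=4200.0,
    sediment_depth_cm=0.0,
    nodule_size_mm=45.0,
    attached_organisms=["xenophyophore"],
)
gaps = record.missing_fields()
```

Keeping the optional measurements as explicit, nullable fields makes it possible to distinguish "not measured" from "measured as zero" when datasets from different cruises are later combined.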
We strongly encourage future sampling to better characterize the spatial and temporal biodiversity and activity of the microbial communities of the CCZ, to enable both more accurate management decisions and future detection of potential mining effects, particularly as our analysis here suggests that there are spatial differences in microbial communities within the CCZ. Careful planning of sampling techniques and target locations, to ensure inter-comparable datasets and alleviation of large areal gaps, will be key to obtaining an accurate understanding of microbial biodiversity in the CCZ prior to the onset of mining-related disturbances.

AUTHOR CONTRIBUTIONS
EW drafted the manuscript, based in part on a conference report authored by MC, EW, and BO with input from CRS. EW, MC, CNS, ML, and CRS collected and/or conducted the initial analysis of samples included in the analysis. CRS led the DeepCCZ and Abyssline cruises that collected the samples. EW conducted the bioinformatics analyses. All the authors revised and approved the manuscript.

FUNDING
The review portion of this work originated in, and expands upon, a section of the workshop report from the DeepCCZ Biodiversity Synthesis Workshop in October 2019, funded by the Gordon and Betty Moore Foundation, the Pew Charitable Trusts, and the International Seabed Authority. New data included here were funded by the Gordon and Betty Moore Foundation (Grant 5596 to MC and CRS). Funding for collection of Abyssline samples derived from contracts from the UK Seabed Resources, LTD. (UKSRL) to CRS and MC. UKSRL had no influence in the study design, data collection, data analyses, and data interpretation. BO acknowledges funding support from the Center for Dark Energy Biosphere Investigations (C-DEBI) funded by the US National Science Foundation (award OIA-0939564). This is C-DEBI contribution number 566. This is contribution 11328 from SOEST, University of Hawai'i at Mānoa.

ACKNOWLEDGMENTS
We thank Jason Smith, Travis Washburn, C. Rob Young, other members of the DeepCCZ working groups, two reviewers, and members of the Church lab for conversations and input that contributed to and improved this manuscript. We also thank the officers and crew of the R/V Kilo Moana, the members of the Hawaii Undersea Research Lab, and the DeepCCZ science party, in particular Jen Durden, Erica Goetze, Oliver Kersten, Kirsty McQuaid, Regan Drennan, Astrid Leitner, and Jeff Drazen.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.