<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2018.03165</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Identification of Microbial Dark Matter in Antarctic Environments</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Bowman</surname> <given-names>Jeff S.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/321111/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Scripps Institution of Oceanography, University of California, San Diego</institution>, <addr-line>La Jolla, CA</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Center for Microbiome Innovation, University of California, San Diego</institution>, <addr-line>La Jolla, CA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Anne D. Jungblut, Natural History Museum, United Kingdom</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Vincent Delafont, University of Poitiers, France; Charles K. Lee, University of Waikato, New Zealand</p></fn>
<corresp id="c001">&#x002A;Correspondence: Jeff S. Bowman, <email>jsbowman@ucsd.edu</email></corresp>
<fn fn-type="other" id="fn002"><p>This article was submitted to Extreme Microbiology, a section of the journal Frontiers in Microbiology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>19</day>
<month>12</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<year>2018</year>
</pub-date>
<volume>9</volume>
<elocation-id>3165</elocation-id>
<history>
<date date-type="received">
<day>01</day>
<month>09</month>
<year>2018</year>
</date>
<date date-type="accepted">
<day>06</day>
<month>12</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2018 Bowman.</copyright-statement>
<copyright-year>2018</copyright-year>
<copyright-holder>Bowman</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Numerous studies have applied molecular techniques to understand the diversity, evolution, and ecological function of Antarctic bacteria and archaea. One common technique is sequencing of the 16S rRNA gene, which produces a nearly quantitative profile of community membership. However, the utility of this and similar approaches is limited by what is known about the evolution, physiology, and ecology of surveyed taxa. When representative genomes are available in public databases some of this information can be gleaned from genomic studies, and automated pipelines exist to carry out this task. Here the paprica metabolic inference pipeline was used to assess how well Antarctic microbial communities are represented by the available completed genomes. The NCBI&#x2019;s Sequence Read Archive (SRA) was searched for Antarctic datasets that used one of the Illumina platforms to sequence the 16S rRNA gene. These data were quality controlled and denoised to identify unique reads, then analyzed with paprica to determine the degree of overlap with the closest phylogenetic neighbor with a completely sequenced genome. While some unique reads had perfect mapping to 16S rRNA genes from completed genomes, the mean percent overlap for all mapped reads was 86.6%. When samples were grouped by environment, some environments appeared more or less well represented by the available genomes. For the domain Bacteria, seawater was particularly poorly represented with a mean overlap of 80.2%, while for the domain Archaea glacial ice was particularly poorly represented with an overlap of only 48.0% for a single sample. These findings suggest that a considerable effort is needed to improve the representation of Antarctic microbes in genome sequence databases.</p>
</abstract>
<kwd-group>
<kwd>Antarctica</kwd>
<kwd>16S rRNA</kwd>
<kwd>glacier</kwd>
<kwd>sea ice</kwd>
<kwd>cryoconite</kwd>
<kwd>sediment</kwd>
<kwd>permafrost</kwd>
<kwd>snow</kwd>
</kwd-group>
<contract-num rid="cn002">1641019</contract-num>
<contract-sponsor id="cn001">Simons Foundation<named-content content-type="fundref-id">10.13039/100000893</named-content></contract-sponsor>
<contract-sponsor id="cn002">Office of Polar Programs<named-content content-type="fundref-id">10.13039/100000087</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="3"/>
<equation-count count="0"/>
<ref-count count="39"/>
<page-count count="11"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec><title>Introduction</title>
<p>The Antarctic continent represents a complex mosaic of microbial habitats. At the continental margin are highly productive coastal seas which transition sharply to the oligotrophic Southern Ocean. Tidewater glaciers and ice shelves bridge the gap between the terrestrial and marine environments, while &#x2013; along with terrestrial glaciers &#x2013; providing unique microbial habitats in cryoconite holes, melt ponds, and subglacial lakes and streams. The complex topography of Antarctica provides for polar deserts and meltwater lakes varying in salinity from nearly fresh to near saturation. This environmental complexity and the isolation of the Antarctic from other continents has inspired over 100 years of microbiological research (<xref ref-type="bibr" rid="B21">McLean, 1918</xref>). However, only in the last few decades have DNA sequencing and other molecular methods allowed for the genetic and phylogenetic characterization of single-celled members of the domains Eukarya, Bacteria, and Archaea.</p>
<p>Sequencing of the 16S rRNA gene has emerged as the <italic>de facto</italic> standard for determining the diversity of bacterial and archaeal communities. Although the maximum resolution of a diversity analysis by 16S rRNA gene sequencing is insufficient to identify many phylogenetically similar but genetically distinct strains, community structure derived from 16S rRNA gene sequencing does indicate the genetic structure of the community (<xref ref-type="bibr" rid="B5">Bowman and Ducklow, 2015</xref>). The efficacy of 16S rRNA gene sequencing studies has been improved in recent years by the stabilization of standard methods around the Illumina MiSeq sequencing platform, which provides high quality, high throughput sequencing of relatively short amplicons. Further aiding microbial diversity analysis are improved primers that broadly amplify across the domains Bacteria and Archaea (<xref ref-type="bibr" rid="B35">Walters et al., 2015</xref>), and new methods to denoise Illumina MiSeq data (<xref ref-type="bibr" rid="B9">Callahan et al., 2016</xref>; <xref ref-type="bibr" rid="B1">Amir et al., 2017</xref>). These methods allow microbial community structure to be resolved to the level of unique reads.</p>
<p>Despite the inaccessibility of much of the Antarctic continent, there have been numerous efforts to assess the taxonomic and genetic diversity of Antarctic microbial habitats. Scientific work following an initial assessment of microbial community structure in a given Antarctic environment may be limited, however, by the availability of completed genomes and model strains. These are necessary to fully understand the evolution, adaptation, and physiology of Antarctic microbes. Microbial clades that may be coarsely identified taxonomically, but for which little is known about their genetics, physiology, and ecological role, are considered &#x201C;microbial dark matter&#x201D; (<xref ref-type="bibr" rid="B19">Marcy et al., 2007</xref>) and are good targets for new studies and technological innovations.</p>
<p>To provide a status report on our understanding of Antarctic microbial diversity and the extent of microbial dark matter in different Antarctic environments, the available Illumina MiSeq studies were aggregated by environment and reanalyzed to the level of unique sequences. A phylogenetic placement approach (<xref ref-type="bibr" rid="B20">Matsen et al., 2010</xref>) was then applied to compare each sequence to the closest completed genome available in the public GenBank repository. The phylogenetic distance between environmental sequence reads and the closest completed genome provides an estimate of uncharacterized microbial diversity in these samples, and a novel view of the extent of microbial dark matter in Antarctic environments and the putative taxonomy of uncharacterized microbes.</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<p>Datasets were identified on the NCBI SRA by searching with the following syntax: Antarctica [All Fields] AND X metagenome [Organism], where X was an environment deemed relevant to Antarctica. These included &#x201C;aquatic,&#x201D; &#x201C;freshwater,&#x201D; &#x201C;glacier,&#x201D; &#x201C;hypersaline lake,&#x201D; &#x201C;ice,&#x201D; &#x201C;lake water,&#x201D; &#x201C;marine,&#x201D; &#x201C;marine sediment,&#x201D; &#x201C;metagenome,&#x201D; &#x201C;microbial mat,&#x201D; &#x201C;rock,&#x201D; &#x201C;salt lake,&#x201D; &#x201C;seawater,&#x201D; &#x201C;sediment,&#x201D; &#x201C;soil,&#x201D; &#x201C;soil crust,&#x201D; &#x201C;snow,&#x201D; and &#x201C;terrestrial.&#x201D; The goal was not an exhaustive inventory of all Antarctic datasets, but to capture nearly all of the relevant datasets available on the SRA. To check the completeness of this search, an additional Google Scholar search was carried out using the search terms &#x201C;Antarctica&#x201D; and &#x201C;Illumina.&#x201D; The first 1,000 hits were reviewed for any studies that were not captured in the SRA search. Studies that used repositories other than the SRA were not included.</p>
<p>Run tables were aggregated for all search results and filtered to include only amplicon studies that relied on the Illumina MiSeq or HiSeq platforms. Samples derived from host-associated environments were also excluded from further analysis. Runs that were obviously the result of amplification of genes other than the 16S rRNA gene (e.g., 18S rRNA genes and internal transcribed spacer regions) were removed, as were studies where the data were clearly not demultiplexed when they were uploaded to the SRA. The remaining runs were downloaded using the fastq-dump command from the NCBI&#x2019;s SRA Toolkit, with the --split-spot and --skip-technical flags. Here and elsewhere, GNU Parallel was used to parallelize operations (<xref ref-type="bibr" rid="B30">Tange, 2011</xref>).</p>
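<p>The download step described above can be sketched as follows. This is a minimal illustration, not the commands used in the study: the function names, the output directory, and the worker count are hypothetical, the accession in the usage note is a placeholder, and a Python thread pool stands in for GNU Parallel. Only the fastq-dump program and its --split-spot, --skip-technical, and --outdir options come from the SRA Toolkit.</p>
<preformat><![CDATA[
```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List


def build_fastq_dump_command(accession: str, outdir: str = "fastq") -> List[str]:
    """Assemble the fastq-dump call for one SRA run accession."""
    return [
        "fastq-dump",
        "--split-spot",       # split each spot into its individual reads
        "--skip-technical",   # drop technical reads, keep biological reads
        "--outdir", outdir,   # hypothetical output directory
        accession,
    ]


def download_runs(accessions: List[str], outdir: str = "fastq", workers: int = 4) -> None:
    """Download several runs concurrently (a stand-in for GNU Parallel)."""

    def fetch(accession: str) -> None:
        # check=True raises CalledProcessError if fastq-dump exits non-zero
        subprocess.run(build_fastq_dump_command(accession, outdir), check=True)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so that any download error is raised here
        list(pool.map(fetch, accessions))
```
]]></preformat>
<p>Under these assumptions, calling download_runs with a list of run accessions taken from the filtered run table (for example the placeholder "SRR0000000") would write one FASTQ file per run to the output directory, provided the SRA Toolkit is installed and on the search path.</p>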
<p>The consensus environment for each downloaded run was determined by evaluating the run metadata and, when available, any papers citing the run, study, or BioProject accession number. Because of ambiguity between freshwater lake sediments, hypersaline lake sediments, and marine sediments (e.g., samples associated with PRJNA387720), all sediments were classified as &#x201C;sediment.&#x201D;</p>
<p>Because not all of the data were derived from paired-end runs and many read pairs could not be merged, only the forward read was considered in this analysis. Quality control of the forward reads was carried out with the dada2 package (<xref ref-type="bibr" rid="B9">Callahan et al., 2016</xref>) in R (R <xref ref-type="bibr" rid="B11">Core Team, 2014</xref>). Reads were trimmed at the first position with a quality score below 20, reads with fewer than 75 bases were discarded, and the remaining reads were denoised. Unique (non-redundant) reads were evaluated to determine their taxonomic domain using the cmscan function in Infernal (<xref ref-type="bibr" rid="B24">Nawrocki and Eddy, 2013</xref>) against covariance models for the domains Eukarya, Archaea, and Bacteria downloaded from the Rfam database (<xref ref-type="bibr" rid="B23">Nawrocki et al., 2015</xref>). Each read was assigned to the domain for which it received the lowest <italic>E</italic>-value. Only reads identified as belonging to the Archaea or Bacteria were considered further.</p>
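<p>The trimming, length-filtering, and domain-assignment rules described above can be illustrated with a short sketch. It restates the rules as written in the text in plain Python and is not the dada2 or Infernal implementation. The function names, the dictionary of per-domain <italic>E</italic>-values, and the example read are hypothetical.</p>
<preformat><![CDATA[
```python
from typing import Dict, List, Optional

MIN_QUALITY = 20  # truncate at the first position scoring below this value
MIN_LENGTH = 75   # discard reads with fewer bases than this after trimming


def trim_read(seq: str, quals: List[int]) -> Optional[str]:
    """Truncate a read at the first low-quality position.

    Returns the trimmed sequence, or None when fewer than MIN_LENGTH
    bases remain and the read should be discarded.
    """
    cut = len(seq)
    for i, q in enumerate(quals):
        if q < MIN_QUALITY:
            cut = i
            break
    trimmed = seq[:cut]
    return trimmed if len(trimmed) >= MIN_LENGTH else None


def assign_domain(evalues: Dict[str, float]) -> Optional[str]:
    """Return the domain whose covariance model gave the lowest E-value.

    `evalues` maps a domain name to the E-value of its best hit; an empty
    mapping (no hit for any model) leaves the read unassigned.
    """
    if not evalues:
        return None
    return min(evalues, key=evalues.get)


# Hypothetical 100-base read whose quality drops below 20 at position 80
read = "ACGT" * 25
quals = [35] * 80 + [12] * 20
trimmed = trim_read(read, quals)  # keeps the first 80 bases
domain = assign_domain({"Bacteria": 1e-40, "Archaea": 1e-12, "Eukarya": 3e-3})
```
]]></preformat>
<p>In this sketch a read with no hit to any model is left unassigned, which corresponds to the reads reported as not assignable to a domain. Only reads assigned to Bacteria or Archaea would pass to the next step.</p>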
<p>The bacterial and archaeal reads associated with each run were analyzed using the paprica pipeline (<xref ref-type="bibr" rid="B5">Bowman and Ducklow, 2015</xref>). Paprica uses Infernal (<xref ref-type="bibr" rid="B24">Nawrocki and Eddy, 2013</xref>) and pplacer (<xref ref-type="bibr" rid="B20">Matsen et al., 2010</xref>) to place query reads on a phylogenetic tree constructed from full-length 16S rRNA genes extracted from completed genomes in GenBank. In this way, paprica makes a direct association between the query reads and the nearest phylogenetic neighbor with a completed genome. This association was used to make inferences about the degree of microbial dark matter present within different Antarctic environments. The paprica database used in this study was created on August 9, 2018, and thus includes completed genomes that were available in GenBank on that date. The -unique flag was used for all scripts to enable tracking of unique reads through the paprica pipeline.</p>
</sec>
<sec><title>Results</title>
<p>This study identified 1,810 valid SRA runs that passed the selection criteria (Supplementary Table <xref ref-type="supplementary-material" rid="SM1">S1</xref>). These runs were associated with 39 BioProjects and consisted of 246,417,512 total forward reads. Of these runs, 1,772 had reads associated with the domains Bacteria and Archaea. A total of 68,304,516 reads failed QC, while an additional 13,063,077 reads could not be assigned to a domain at the specified cutoff. After QC and domain assignment the analyzed runs contained 163,272,230 reads (133,618,531 Bacteria and 29,653,699 Archaea). Consensus environments for runs, pulled from the run metadata and publications, included cryoconite (<italic>n</italic> = 168), glacial ice (<italic>n</italic> = 9), lake (<italic>n</italic> = 328), lake ice (<italic>n</italic> = 9), rock (<italic>n</italic> = 0), seawater (<italic>n</italic> = 221), sea ice (<italic>n</italic> = 69), sediment (<italic>n</italic> = 288), subglacial lake (<italic>n</italic> = 21), snow (<italic>n</italic> = 13), and soil (<italic>n</italic> = 684) (Figure <xref ref-type="fig" rid="F1">1</xref> and Table <xref ref-type="table" rid="T1">1</xref>). Samples associated with some BioProjects could not be used because of archival errors. These included PRJNA386567 (<xref ref-type="bibr" rid="B14">Frisia et al., 2017</xref>), which lacked quality data, and PRJNA480849 and PRJNA396917 (no citations available), which were not demultiplexed.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Sample location by environment. Sample locations (where available in the metadata) are given according to the final consensus environment.</p></caption>
<graphic xlink:href="fmicb-09-03165-g001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Summary of BioProjects used in this study.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left">BioProject</th>
<th valign="top" align="left">Center name</th>
<th valign="top" align="left">Release date</th>
<th valign="top" align="left">Consensus environment</th>
<th valign="top" align="left">Citation</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">PRJEB11496</td>
<td valign="top" align="left">CENTRAL MICHIGAN UNIVERSITY</td>
<td valign="top" align="left">12/24/2015</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B18">Learman et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJEB11497</td>
<td valign="top" align="left">CENTRAL MICHIGAN UNIVERSITY</td>
<td valign="top" align="left">12/24/2015</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B18">Learman et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJEB11689</td>
<td valign="top" align="left">UNIVERSITY OF WARWICK</td>
<td valign="top" align="left">12/1/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJEB14880</td>
<td valign="top" align="left">UNIVERSITY OF CALIFORNIA SAN DIEGO MICROBIOME INIT</td>
<td valign="top" align="left">7/23/2016</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left">Beaupr&#x00E9; and O&#x2019;Dwyer, 2017</td>
</tr>
<tr>
<td valign="top" align="left">PRJEB20869</td>
<td valign="top" align="left">UNIVERSITY OF CALIFORNIA SAN DIEGO MICROBIOME INIT</td>
<td valign="top" align="left">8/25/2017</td>
<td valign="top" align="left">Lake</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJEB21441</td>
<td valign="top" align="left">UNIVERSITY OF CALIFORNIA SAN DIEGO MICROBIOME INIT</td>
<td valign="top" align="left">8/25/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJEB22851</td>
<td valign="top" align="left">EUROPEAN MOLECULAR BIOLOGY LABORATORY</td>
<td valign="top" align="left">12/2/2017</td>
<td valign="top" align="left">Lake</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B17">Kleinteich et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJEB23732</td>
<td valign="top" align="left">UNIVERSITY OF CAMBRIDGE</td>
<td valign="top" align="left">12/20/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJEB25155</td>
<td valign="top" align="left">UNIVERSITY OF NEUCHATEL</td>
<td valign="top" align="left">7/3/2018</td>
<td valign="top" align="left">Lake</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA244335</td>
<td valign="top" align="left">LOUISIANA STATE UNIVERSITY</td>
<td valign="top" align="left">7/31/2014</td>
<td valign="top" align="left">Subglacial lake</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B10">Christner et al., 2014</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA254078</td>
<td valign="top" align="left"/>
<td valign="top" align="left">6/15/2015</td>
<td valign="top" align="left">Seawater</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA278982</td>
<td valign="top" align="left">LOUISIANA STATE UNIVERSITY</td>
<td valign="top" align="left">7/23/2015</td>
<td valign="top" align="left">Subglacial lake</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B34">Vick-Majors et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA280421</td>
<td valign="top" align="left">UNIVERSIDAD MAYOR</td>
<td valign="top" align="left">4/18/2015</td>
<td valign="top" align="left">Seawater</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B22">Moreno-Pino et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA282540</td>
<td valign="top" align="left">LOUISIANA STATE UNIVERSITY</td>
<td valign="top" align="left">4/28/2016</td>
<td valign="top" align="left">Glacial ice</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA296701</td>
<td valign="top" align="left"/>
<td valign="top" align="left">10/8/2015</td>
<td valign="top" align="left">Cryoconite</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B37">Webster-Brown et al., 2015</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA304081</td>
<td valign="top" align="left"/>
<td valign="top" align="left">6/30/2016</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B29">Tahon et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA305344</td>
<td valign="top" align="left"/>
<td valign="top" align="left">12/13/2015</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B33">Tytgat et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA305852</td>
<td valign="top" align="left"/>
<td valign="top" align="left">12/19/2015</td>
<td valign="top" align="left">Lake</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B12">de Scally et al., 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA306790</td>
<td valign="top" align="left"/>
<td valign="top" align="left">1/11/2016</td>
<td valign="top" align="left">Sea ice</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B8">Bowman and Deming, 2016</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA315812</td>
<td valign="top" align="left"/>
<td valign="top" align="left">4/4/2016</td>
<td valign="top" align="left">Seawater</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B26">Rozema et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA317932</td>
<td valign="top" align="left">CSIRO</td>
<td valign="top" align="left">4/27/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B4">Bissett et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA320505</td>
<td valign="top" align="left">INSTITUTE OF BIOCHEMISTRY AND BIOPHYSICS POLISH A</td>
<td valign="top" align="left">5/10/2016</td>
<td valign="top" align="left">Cryoconite</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA344476</td>
<td valign="top" align="left"/>
<td valign="top" align="left">10/13/2016</td>
<td valign="top" align="left">Seawater</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B7">Bowman et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA355879</td>
<td valign="top" align="left"/>
<td valign="top" align="left">12/5/2017</td>
<td valign="top" align="left">Sea ice</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B13">Eronen-Rasimus et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA357685</td>
<td valign="top" align="left"/>
<td valign="top" align="left">12/24/2016</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B38">Yan et al., 2017</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA359740</td>
<td valign="top" align="left"/>
<td valign="top" align="left">1/8/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA386506</td>
<td valign="top" align="left"/>
<td valign="top" align="left">5/12/2017</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B3">Bendia et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA387720</td>
<td valign="top" align="left"/>
<td valign="top" align="left">10/6/2017</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA395496</td>
<td valign="top" align="left"/>
<td valign="top" align="left">7/24/2017</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA398047</td>
<td valign="top" align="left"/>
<td valign="top" align="left">10/4/2017</td>
<td valign="top" align="left">Seawater</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA401502</td>
<td valign="top" align="left"/>
<td valign="top" align="left">9/6/2017</td>
<td valign="top" align="left">Cryoconite</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA401941</td>
<td valign="top" align="left"/>
<td valign="top" align="left">9/7/2017</td>
<td valign="top" align="left">Cryoconite</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B28">Sommers et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA415906</td>
<td valign="top" align="left"/>
<td valign="top" align="left">2/18/2018</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B25">Rippin et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA433184</td>
<td valign="top" align="left"/>
<td valign="top" align="left">2/6/2018</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B39">Zhang et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA433310</td>
<td valign="top" align="left"/>
<td valign="top" align="left">2/7/2018</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B39">Zhang et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA433331</td>
<td valign="top" align="left"/>
<td valign="top" align="left">2/7/2018</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"><xref ref-type="bibr" rid="B39">Zhang et al., 2018</xref></td>
</tr>
<tr>
<td valign="top" align="left">PRJNA433699</td>
<td valign="top" align="left"/>
<td valign="top" align="left">2/9/2018</td>
<td valign="top" align="left">Marine sediment</td>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">PRJNA471123</td>
<td valign="top" align="left"/>
<td valign="top" align="left">5/12/2018</td>
<td valign="top" align="left">Soil</td>
<td valign="top" align="left"/>
</tr>
</tbody>
</table>
</table-wrap>
<p>The number of unique reads varied widely among environments and was strongly correlated with the number of samples for both archaea and bacteria (Figure <xref ref-type="fig" rid="F2">2</xref> and Table <xref ref-type="table" rid="T2">2</xref>). For bacteria, the number of samples associated with a given environment accounted for 77.0% of the variance in the number of unique sequences across environments (Pearson correlation, <italic>p</italic> = 0.0002). Some environments, however, had a higher number of unique reads than anticipated by this model. These included several environments with very few samples (snow, subglacial lake, and lake ice) as well as soil. For archaea, the number of samples accounted for 69.4% of the variance in the number of unique reads (Pearson correlation, <italic>p</italic> = 0.0017). As for bacteria, snow, subglacial lake, lake, and soil had a higher number of unique reads than predicted by sample number.</p>
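<p>For a relationship between two variables, the share of variance accounted for by a simple linear fit equals the square of the Pearson correlation coefficient. The sketch below computes both quantities. It assumes this is the sense in which the percentages above are reported. The function names and the three-point example are hypothetical and are not data from this study.</p>
<preformat><![CDATA[
```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_explained(x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of the variance in y accounted for by a linear fit on x."""
    return pearson_r(x, y) ** 2


# Hypothetical example: samples per environment vs. unique reads
n_samples = [1, 2, 3]
n_unique = [1, 3, 2]
r = pearson_r(n_samples, n_unique)             # 0.5
r2 = variance_explained(n_samples, n_unique)   # 0.25, i.e., 25% of the variance
```
]]></preformat>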
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Sample diversity for the domains Bacteria and Archaea. <bold>(A)</bold> Rarefaction curves for all consensus environments for bacteria given on a log&#x2013;log scale. <bold>(B)</bold> The number of unique reads identified in each consensus environment as a function of the number of samples; the line of best fit reflects a linear relationship (<italic>R</italic><sup>2</sup> = 0.78, <italic>p</italic> = 2 &#x00D7; 10<sup>-4</sup>). <bold>(C)</bold> Rarefaction curves for all consensus environments for archaea given on a log&#x2013;log scale. Note that no archaea were identified in lake ice or sea ice samples. <bold>(D)</bold> The number of unique reads identified in each consensus environment as a function of the number of samples; the line of best fit reflects a linear relationship (<italic>R</italic><sup>2</sup> = 0.69, <italic>p</italic> = 9 &#x00D7; 10<sup>-4</sup>).</p></caption>
<graphic xlink:href="fmicb-09-03165-g002.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Read data for each environment.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"/>
<th valign="top" align="center">Cryoconite</th>
<th valign="top" align="center">Glacial ice</th>
<th valign="top" align="center">Lake</th>
<th valign="top" align="center">Lake ice</th>
<th valign="top" align="center">Sea ice</th>
<th valign="top" align="center">Seawater</th>
<th valign="top" align="center">Sediment</th>
<th valign="top" align="center">Snow</th>
<th valign="top" align="center">Soil</th>
<th valign="top" align="center">Subglacial lake</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Number of samples</td>
<td valign="top" align="center">168</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">287</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">69</td>
<td valign="top" align="center">218</td>
<td valign="top" align="center">270</td>
<td valign="top" align="center">13</td>
<td valign="top" align="center">508</td>
<td valign="top" align="center">21</td>
</tr>
<tr>
<td valign="top" align="left">Number of final reads &#x00D7; 10<sup>6</sup></td>
<td valign="top" align="center">2.837392</td>
<td valign="top" align="center">1.403345</td>
<td valign="top" align="center">13.164907</td>
<td valign="top" align="center">1.151896</td>
<td valign="top" align="center">0.594581</td>
<td valign="top" align="center">47.326081</td>
<td valign="top" align="center">16.22292</td>
<td valign="top" align="center">4.721608</td>
<td valign="top" align="center">70.63853</td>
<td valign="top" align="center">5.210974</td>
</tr>
<tr>
<td valign="top" align="left">Number of final reads, Bacteria &#x00D7; 10<sup>6</sup></td>
<td valign="top" align="center">2.836965</td>
<td valign="top" align="center">1.40178</td>
<td valign="top" align="center">13.128534</td>
<td valign="top" align="center">1.151896</td>
<td valign="top" align="center">0.594535</td>
<td valign="top" align="center">47.317886</td>
<td valign="top" align="center">15.04291</td>
<td valign="top" align="center">4.720207</td>
<td valign="top" align="center">42.30966</td>
<td valign="top" align="center">5.114157</td>
</tr>
<tr>
<td valign="top" align="left">Number of final reads, Archaea &#x00D7; 10<sup>6</sup></td>
<td valign="top" align="center">0.000427</td>
<td valign="top" align="center">0.001565</td>
<td valign="top" align="center">0.036373</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.000046</td>
<td valign="top" align="center">0.008195</td>
<td valign="top" align="center">1.180008</td>
<td valign="top" align="center">0.001401</td>
<td valign="top" align="center">28.32887</td>
<td valign="top" align="center">0.096817</td>
</tr>
<tr>
<td valign="top" align="left">Number of normalized reads, Bacteria &#x00D7; 10<sup>6</sup></td>
<td valign="top" align="center">1.170712</td>
<td valign="top" align="center">0.409695</td>
<td valign="top" align="center">6.27037</td>
<td valign="top" align="center">0.52973</td>
<td valign="top" align="center">0.376019</td>
<td valign="top" align="center">25.230212</td>
<td valign="top" align="center">7.973523</td>
<td valign="top" align="center">1.67086</td>
<td valign="top" align="center">23.810962</td>
<td valign="top" align="center">2.789416</td>
</tr>
<tr>
<td valign="top" align="left">Number of normalized reads, Archaea &#x00D7; 10<sup>6</sup></td>
<td valign="top" align="center">0.00035</td>
<td valign="top" align="center">0.000917</td>
<td valign="top" align="center">0.030706</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.000038</td>
<td valign="top" align="center">0.007419</td>
<td valign="top" align="center">1.117945</td>
<td valign="top" align="center">0.0013</td>
<td valign="top" align="center">27.794788</td>
<td valign="top" align="center">0.093139</td>
</tr>
<tr>
<td valign="top" align="left">Number of unique reads, Bacteria</td>
<td valign="top" align="center">133351</td>
<td valign="top" align="center">2073</td>
<td valign="top" align="center">40711</td>
<td valign="top" align="center">598</td>
<td valign="top" align="center">7711</td>
<td valign="top" align="center">36955</td>
<td valign="top" align="center">152348</td>
<td valign="top" align="center">40405</td>
<td valign="top" align="center">514020</td>
<td valign="top" align="center">14826</td>
</tr>
<tr>
<td valign="top" align="left">Number of unique reads, Archaea</td>
<td valign="top" align="center">28</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">63</td>
<td valign="top" align="center">NA</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">30</td>
<td valign="top" align="center">6154</td>
<td valign="top" align="center">14</td>
<td valign="top" align="center">49669</td>
<td valign="top" align="center">254</td>
</tr>
<tr>
<td valign="top" align="left">Mean map ratio, Bacteria</td>
<td valign="top" align="center">0.88</td>
<td valign="top" align="center">0.91</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">0.91</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">0.80</td>
<td valign="top" align="center">0.90</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">0.90</td>
</tr>
<tr>
<td valign="top" align="left">Mean map ratio, Archaea</td>
<td valign="top" align="center">NA</td>
<td valign="top" align="center">0.48</td>
<td valign="top" align="center">0.93</td>
<td valign="top" align="center">NA</td>
<td valign="top" align="center">NA</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">0.83</td>
<td valign="top" align="center">0.85</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The &#x201C;map ratio&#x201D; variable, calculated by pplacer (<xref ref-type="bibr" rid="B20">Matsen et al., 2010</xref>) as the percent identity between the query read and reference sequence, was used as the primary indicator of how well 16S rRNA gene reads were represented by completed genomes in GenBank (i.e., the paprica database). The mean map ratio distribution by sample, limited to samples with more than 1,000 reads assigned to the bacteria or archaea, was used to identify environments that may have microbial communities more poorly represented by completed genomes of bacteria (Figure <xref ref-type="fig" rid="F3">3</xref>) and archaea (Figure <xref ref-type="fig" rid="F4">4</xref>). Mean map ratios ranged from 0.60 (SRR3455314, soil) to 0.97 (SRR6008356, cryoconite) for bacteria, and from 0.48 (SRR2006327, glacial ice) to 0.97 (SRR5535794, sediment) for archaea. Samples with more than 1,000 reads but a mean map ratio below 0.8 were flagged for further investigation. For bacteria these included samples from cryoconite (<italic>n</italic> = 7), lake (<italic>n</italic> = 16), lake ice (<italic>n</italic> = 1), seawater (<italic>n</italic> = 126), snow (<italic>n</italic> = 1), sediment (<italic>n</italic> = 3), and soil (<italic>n</italic> = 3) environments. For the archaea these included samples from glacial ice (<italic>n</italic> = 1), lake (<italic>n</italic> = 15), snow (<italic>n</italic> = 1), sediment (<italic>n</italic> = 75), and soil (<italic>n</italic> = 40).</p>
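<p>The sample-level screen described above can be sketched in a few lines of Python. This is only an illustration of the two stated criteria (more than 1,000 reads and a mean map ratio below 0.8), not the code used in the analysis. The record layout, the function name flag_samples, the read count given for SRR6008356, and the sample EXAMPLE_A are hypothetical; the thresholds, the mean map ratios of the two named samples, and the read count of SRR3455314 are the values reported in this section.</p>
<preformat>
```python
from collections import Counter

MIN_READS = 1000        # a sample needs more reads than this to be considered
MAP_RATIO_CUTOFF = 0.8  # mean map ratios below this value are flagged

# (sample id, environment, reads assigned to the domain, mean map ratio)
# The layout is an assumption made for this sketch.
samples = [
    ("SRR3455314", "soil", 1207, 0.60),        # values reported in the text
    ("SRR6008356", "cryoconite", 5000, 0.97),  # read count is hypothetical
    ("EXAMPLE_A", "seawater", 800, 0.55),      # hypothetical, too few reads
]


def flag_samples(records, min_reads=MIN_READS, cutoff=MAP_RATIO_CUTOFF):
    """Return records with enough reads and a mean map ratio below the cutoff."""
    return [
        record
        for record in records
        if record[2] > min_reads and cutoff > record[3]
    ]


flagged = flag_samples(samples)
flagged_ids = [record[0] for record in flagged]
# Tally flagged samples by environment, mirroring the per-environment counts.
flagged_by_environment = Counter(record[1] for record in flagged)
print(flagged_ids, dict(flagged_by_environment))
```
</preformat>
<p>In this toy input only the soil sample passes both criteria; the hypothetical seawater sample has a low mean map ratio but too few reads to be considered.</p>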
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Sample mean map ratios for the domain Bacteria. For each consensus environment the distribution of mean map ratios is given. Only samples with greater than 1,000 reads assigned to the domain Bacteria are shown in the distribution.</p></caption>
<graphic xlink:href="fmicb-09-03165-g003.tif"/>
</fig>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Sample mean map ratios for the domain Archaea. For each consensus environment the distribution of mean map ratios is given. Only samples with greater than 1,000 reads assigned to the domain Archaea are shown in the distribution. Due to the small number of samples with sufficient archaeal reads for glacial ice (<italic>n</italic> = 1), lake ice (<italic>n</italic> = 0), snow (<italic>n</italic> = 1), sea ice (<italic>n</italic> = 0), and seawater (<italic>n</italic> = 3), these environments are not shown.</p></caption>
<graphic xlink:href="fmicb-09-03165-g004.tif"/>
</fig>
<p>The relationship between the abundance of unique reads (across the entire dataset) and map ratio (Figure <xref ref-type="fig" rid="F5">5</xref> and Table <xref ref-type="table" rid="T3">3</xref>) was used to identify unique reads that were abundant in individual samples but poorly represented by completed genomes. Unique reads for which the map ratio was less than 0.075 &#x00D7; read abundance + 0.4 were flagged for further inspection. These parameters were selected arbitrarily, but they provide an objective criterion for subsetting a manageable number of reads. For bacteria 1,675 unique reads met this criterion, while 1,949 unique reads met this criterion for archaea. The 10 most abundant phylogenetic edges (branches or tips) within any environment associated with these unique reads were evaluated further to determine which sequenced genomes represent groups with considerable uncharacterized diversity (Table <xref ref-type="table" rid="T3">3</xref>). The most abundant low map ratio edge for domain Bacteria accounted for 3.9% of total seawater reads and placed with <italic>Sulfitobacter pseudonitzschiae</italic> SMR1, represented by GenBank genome GCF_002222635.1. Classification of a representative sequence using the Ribosomal Database Project (RDP) classifier (<xref ref-type="bibr" rid="B36">Wang et al., 2007</xref>) identified the read as belonging to the phylum Proteobacteria. The most abundant low map ratio edge for domain Archaea accounted for 1.2% of total soil reads and belonged to the family <italic>Haloarculaceae</italic>. RDP classified a representative read as domain Archaea.</p>
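<p>The linear criterion for flagging unique reads can be written out as a short Python sketch. The slope (0.075) and intercept (0.4) are the values given above; everything else is an assumption made for illustration. In particular, the text does not state the scale on which read abundance enters the expression, so the example below assumes log10-transformed counts, and the read names, counts, and map ratios are hypothetical.</p>
<preformat>
```python
import math

SLOPE = 0.075     # from the text
INTERCEPT = 0.4   # from the text


def is_flagged(map_ratio, abundance_value):
    """True when the map ratio falls below SLOPE * abundance + INTERCEPT.

    abundance_value must already be on the scale used for the criterion;
    that scale is an assumption of the caller (log10 counts in this sketch).
    """
    return SLOPE * abundance_value + INTERCEPT > map_ratio


# Hypothetical unique reads: (name, read count, map ratio)
reads = [
    ("read_a", 100000, 0.66),  # abundant, poorly represented
    ("read_b", 100000, 0.95),  # abundant, well represented
    ("read_c", 10, 0.45),      # rare, very low map ratio
]

flagged = [
    name
    for name, count, ratio in reads
    if is_flagged(ratio, math.log10(count))  # log10 scale is an assumption
]
print(flagged)
```
</preformat>
<p>Because the threshold rises with abundance, an abundant read is flagged at a higher map ratio than a rare read, which is the behavior needed to find reads that are both abundant and poorly represented.</p>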
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>The abundance of unique reads as a function of map ratio for <bold>(A)</bold> bacteria and <bold>(B)</bold> archaea. The abundance of unique reads was determined within each consensus environment (i.e., each unique read may be tallied more than once in different consensus environments). The distribution of data is displayed via a hexagonal density plot, with the color of the hexagons representing the density of the data.</p></caption>
<graphic xlink:href="fmicb-09-03165-g005.tif"/>
</fig>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Abundant phylogenetic edges with low map ratio values.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left">Edge</th>
<th valign="top" align="left">Unique reads</th>
<th valign="top" align="center">Abundance</th>
<th valign="top" align="center">Mean map ratio</th>
<th valign="top" align="left">Taxon<sup>1</sup></th>
<th valign="top" align="left">RDP (50%)<sup>2</sup></th>
<th valign="top" align="left">Predominant environment</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><bold>Bacteria</bold></td>
<td valign="top" align="left"/>
<td valign="top" align="center"/>
<td valign="top" align="center"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">222</td>
<td valign="top" align="left">166</td>
<td valign="top" align="center">979594</td>
<td valign="top" align="center">0.663945247</td>
<td valign="top" align="left">GCF_002222635.1 <italic>Sulfitobacter pseudonitzschiae</italic> SMR1</td>
<td valign="top" align="left">Phylum Proteobacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">10015</td>
<td valign="top" align="left">24</td>
<td valign="top" align="center">111525</td>
<td valign="top" align="center">0.517836875</td>
<td valign="top" align="left">FCB Group</td>
<td valign="top" align="left">Class Cyanobacteria</td>
<td valign="top" align="left">Soil, seawater</td>
</tr>
<tr>
<td valign="top" align="left">475</td>
<td valign="top" align="left">36</td>
<td valign="top" align="center">104435</td>
<td valign="top" align="center">0.629734278</td>
<td valign="top" align="left">Brucellaceae</td>
<td valign="top" align="left">Phylum Proteobacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">696</td>
<td valign="top" align="left">40</td>
<td valign="top" align="center">99713</td>
<td valign="top" align="center">0.5506075</td>
<td valign="top" align="left">GCF_000183665.1 Candidatus <italic>Liberibacter solanacearum</italic> CLso-ZC1</td>
<td valign="top" align="left">Domain Bacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">9917</td>
<td valign="top" align="left">2</td>
<td valign="top" align="center">94871</td>
<td valign="top" align="center">0.746377</td>
<td valign="top" align="left"><italic>Oscillatoriophycideae</italic></td>
<td valign="top" align="left">Chloroplast (genus <italic>Bacillariophyta</italic>)</td>
<td valign="top" align="left">Seawater, sediment</td>
</tr>
<tr>
<td valign="top" align="left">219</td>
<td valign="top" align="left">11</td>
<td valign="top" align="center">68825</td>
<td valign="top" align="center">0.669268636</td>
<td valign="top" align="left">GCF_002158905.1 <italic>Yoonia vestfoldensis</italic> SMR4r</td>
<td valign="top" align="left">Class Alphaproteobacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">53</td>
<td valign="top" align="left">23</td>
<td valign="top" align="center">67685</td>
<td valign="top" align="center">0.632538435</td>
<td valign="top" align="left">GCF_000296215.2 <italic>Bradyrhizobium</italic> sp. CCGE-LA001</td>
<td valign="top" align="left">Domain Bacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">960</td>
<td valign="top" align="left">81</td>
<td valign="top" align="center">57654</td>
<td valign="top" align="center">0.581226012</td>
<td valign="top" align="left">GCF_000815025.1 <italic>Coxiella</italic> endosymbiont of <italic>Amblyomma americanum</italic></td>
<td valign="top" align="left">Root</td>
<td valign="top" align="left">Snow</td>
</tr>
<tr>
<td valign="top" align="left">7058</td>
<td valign="top" align="left">11</td>
<td valign="top" align="center">50260</td>
<td valign="top" align="center">0.645450273</td>
<td valign="top" align="left">GCF_001399775.1 <italic>Thermus aquaticus</italic> Y51MC23</td>
<td valign="top" align="left">Domain Bacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left">875</td>
<td valign="top" align="left">20</td>
<td valign="top" align="center">39762</td>
<td valign="top" align="center">0.599264</td>
<td valign="top" align="left"><italic>Rhodospirillum</italic></td>
<td valign="top" align="left">Domain Bacteria</td>
<td valign="top" align="left">Seawater</td>
</tr>
<tr>
<td valign="top" align="left"><bold>Archaea</bold></td>
<td valign="top" align="left"/>
<td valign="top" align="center"/>
<td valign="top" align="center"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
<td valign="top" align="left"/>
</tr>
<tr>
<td valign="top" align="left">107</td>
<td valign="top" align="left">224</td>
<td valign="top" align="center">329322</td>
<td valign="top" align="center">0.488955781</td>
<td valign="top" align="left"><italic>Haloarculaceae</italic></td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">341</td>
<td valign="top" align="left">83</td>
<td valign="top" align="center">83108</td>
<td valign="top" align="center">0.556996494</td>
<td valign="top" align="left">GCF_002214165.1 Candidatus <italic>Microarchaeota archaeon</italic> Mia14</td>
<td valign="top" align="left">Root</td>
<td valign="top" align="left">Soil, sediment</td>
</tr>
<tr>
<td valign="top" align="left">127</td>
<td valign="top" align="left">48</td>
<td valign="top" align="center">76379</td>
<td valign="top" align="center">0.502742583</td>
<td valign="top" align="left">GCF_900079125.1 <italic>Methanoculleus bourgensis</italic></td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">347</td>
<td valign="top" align="left">44</td>
<td valign="top" align="center">69891</td>
<td valign="top" align="center">0.491039114</td>
<td valign="top" align="left">Candidatus <italic>Nitrosopumilus sediminis</italic> AR2</td>
<td valign="top" align="left">Genus <italic>Nitrososphaera</italic></td>
<td valign="top" align="left">Sediment, soil</td>
</tr>
<tr>
<td valign="top" align="left">340</td>
<td valign="top" align="left">7</td>
<td valign="top" align="center">53978</td>
<td valign="top" align="center">0.679012</td>
<td valign="top" align="left">Archaea</td>
<td valign="top" align="left">Root</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">431</td>
<td valign="top" align="left">50</td>
<td valign="top" align="center">50964</td>
<td valign="top" align="center">0.51836778</td>
<td valign="top" align="left">GCF_000317795.1 <italic>Caldisphaera lagunensis</italic> DSM 15908</td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">225</td>
<td valign="top" align="left">26</td>
<td valign="top" align="center">43723</td>
<td valign="top" align="center">0.463009192</td>
<td valign="top" align="left">Euryarchaeota</td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">126</td>
<td valign="top" align="left">14</td>
<td valign="top" align="center">35384</td>
<td valign="top" align="center">0.511628</td>
<td valign="top" align="left">GCF_000304355.2 <italic>Methanoculleus bourgensis</italic> MS2</td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">348</td>
<td valign="top" align="left">28</td>
<td valign="top" align="center">17068</td>
<td valign="top" align="center">0.504448786</td>
<td valign="top" align="left">GCF_002156965.1 Candidatus <italic>Nitrosomarinus catalina</italic> SPOT01</td>
<td valign="top" align="left">Genus <italic>Nitrososphaera</italic></td>
<td valign="top" align="left">Soil</td>
</tr>
<tr>
<td valign="top" align="left">364</td>
<td valign="top" align="left">31</td>
<td valign="top" align="center">15033</td>
<td valign="top" align="center">0.463741968</td>
<td valign="top" align="left">GCF_000018305.1 <italic>Caldivirga maquilingensis</italic> IC-167</td>
<td valign="top" align="left">Domain Archaea</td>
<td valign="top" align="left">Soil</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>Abundance values include only those reads with low map ratios, not all reads placed to the indicated edge. <sup><italic>1</italic></sup>Taxonomy of the phylogenetic edge on the paprica reference tree. <sup><italic>2</italic></sup>Classification of a representative read for that edge using the Ribosomal Database Project classifier (<xref ref-type="bibr" rid="B36">Wang et al., 2007</xref>) with a 50% cutoff</italic>.</attrib>
</table-wrap-foot>
</table-wrap>
<p>As seen in the distribution of abundant unique reads with low map ratios (Table <xref ref-type="table" rid="T3">3</xref>), the uncharacterized diversity for the bacteria was most pronounced in seawater. This is interesting given that the mean map ratio for seawater samples was not particularly low; individual samples with lower values were found in lake and soil environments (Figure <xref ref-type="fig" rid="F3">3</xref>). In seawater, low map ratios were balanced by abundant, well-characterized taxa with high map ratios, particularly <italic>Pseudoalteromonas spongiae</italic> UST010723-006 (map ratio = 0.98), Candidatus <italic>Pelagibacter ubique</italic> HTCC1062 (map ratio = 0.96), and <italic>Alteromonas stellipolaris</italic> LMG21856 (map ratio = 0.97). A precise classification of the low map ratio reads was not possible &#x2013; as is expected for microbial dark matter &#x2013; and in most cases the RDP classifier could not provide a classification below the level of domain. <italic>S. pseudonitzschiae</italic> SMR1, the edge in seawater samples that had the lowest mean map ratio, was relatively abundant and correctly classified by RDP as belonging to the Proteobacteria.</p>
<p>Although lake ice and soil environments generally had high mean map ratios for bacteria, some samples from these environments had unusually low values (Figure <xref ref-type="fig" rid="F3">3</xref>). For lake ice, sample ERR2204499 from BioProject <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJEB22851">PRJEB22851</ext-link> was a clear outlier with a mean map ratio of 0.71. Although this sample had enough reads associated with the domain Bacteria to be considered for the mean map ratio analysis (<italic>n</italic> = 2,225), the sample was of very low diversity with only seven unique reads identified. All of these unique reads had fairly high map ratios except for one abundant read that placed with <italic>Halomonas chromatireducens</italic> AGD 8-3. The soil sample with the lowest map ratio (SRR3455314, BioProject <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJNA317932">PRJNA317932</ext-link>) also had relatively few reads (<italic>n</italic> = 1,207) but was comparatively diverse, with 59 unique reads. The most abundant among these belonged to an unclassified Betaproteobacteria with a map ratio of 0.62, Verrucomicrobia with a map ratio of 0.52, and <italic>Blattabacterium punctulatus</italic> CPUpc with a map ratio of 0.60. Verrucomicrobia and Betaproteobacteria are common in soil environments, and <italic>Blattabacterium</italic> spp. are obligate endosymbionts (<xref ref-type="bibr" rid="B15">Gil et al., 2004</xref>), so reads placed to <italic>B. punctulatus</italic> CPUpc may have been associated with nematodes, tardigrades, or other metazoans common to Antarctic soils.</p>
<p>The sample with the lowest mean map ratio (0.48) for domain Archaea belonged to glacial ice (SRR2006327, BioProject <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJNA282540">PRJNA282540</ext-link>). Abundant unique reads in this sample with low map ratios mapped to <italic>Haloarcula marismortui</italic> ATCC 43049 represented by genome GCF_000011085.1 and <italic>Ferroplasma acidiphilum</italic> Y represented by genome GCF_002078355.1. Although the mean map ratios for soil samples were generally higher for domain Archaea, some soil samples had exceptionally low values. The soil sample with the lowest mean map ratio was ERR2012973 (BioProject <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="PRJEB21441">PRJEB21441</ext-link>), at 0.49. Abundant unique edges with low map ratios in this sample included <italic>Metallosphaera cuprina</italic> Ar-4 (map ratio = 0.49) represented by genome GCF_000204925.1 and <italic>Methanospirillum hungatei</italic> JF-1 represented by genome GCF_000013445.1 (map ratio = 0.43).</p>
</sec>
<sec><title>Discussion</title>
<p>The domains Bacteria and Archaea showed surprising differences in their relative abundance and in the number of unique sequences identified. Overall, bacteria were better sampled than archaea, though this does not necessarily reflect any greater ecological importance in many environments. Only recently have primers been designed to broadly amplify across both domains (<xref ref-type="bibr" rid="B35">Walters et al., 2015</xref>); prior to this, many studies focused on the domain Bacteria as a matter of expediency. Thus, while no archaeal reads were identified in lake ice and sea ice, and very few in glacial ice and snow, this does not mean that archaea were absent from those physical samples. Archaea were comparatively well sampled in sediment and soil &#x2013; environments that are known to host a considerable number of archaea &#x2013; but the number of unique reads associated with the archaea in these environments was nearly an order of magnitude less than the number associated with the domain Bacteria, despite a similar number of sampled reads. This may reflect an overall lower phylogenetic diversity among the archaea or an analysis artifact, with the available primers and covariance models insufficient to capture the true archaeal diversity. The lack of archaeal sequence data was particularly pronounced for seawater, where archaea are known to play a considerable role in the marine nitrogen cycle and in dark carbon fixation (<xref ref-type="bibr" rid="B32">Tolar et al., 2016</xref>).</p>
<p>A key distinction between the Bacteria and Archaea in this analysis was the impact of read normalization on read abundance. The paprica pipeline normalizes read abundance by dividing the number of reads placed to an edge on the phylogenetic reference tree by the anticipated 16S rRNA gene copy number for that position on the tree. Because many bacteria and archaea have multiple copies of the 16S rRNA gene, this can have a major impact on the estimated abundance of these clades. Across the soil samples, for example, 42.3 &#x00D7; 10<sup>6</sup> reads were associated with the domain Bacteria and 28.3 &#x00D7; 10<sup>6</sup> with Archaea. After normalization only 23.8 &#x00D7; 10<sup>6</sup> reads were associated with Bacteria (a 44% reduction), while 27.8 &#x00D7; 10<sup>6</sup> were associated with Archaea (a 2% reduction; Table <xref ref-type="table" rid="T1">1</xref>). Extrapolating these ratios to a hypothetical single sample suggests that the abundance of bacteria relative to archaea would be overestimated by a factor of nearly 2 if the data were not normalized.</p>
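<p>The arithmetic behind these percentages and the &#x201C;factor of nearly 2&#x201D; can be checked with a short Python sketch. The soil read totals are the rounded values quoted above; the function names and the single-edge example (600 reads, four gene copies) are hypothetical and serve only to illustrate division by 16S rRNA gene copy number as described for the paprica pipeline.</p>
<preformat>
```python
def normalize(reads_on_edge, copy_number):
    """Divide reads placed to an edge by its predicted 16S rRNA gene copy number."""
    return reads_on_edge / copy_number


def percent_reduction(before, after):
    """Percent decrease from before to after."""
    return 100.0 * (before - after) / before


# Hypothetical edge: 600 reads placed, four predicted gene copies.
example_normalized = normalize(600, 4)

# Soil totals quoted in the text, in units of 10^6 reads.
raw = {"Bacteria": 42.3, "Archaea": 28.3}
normalized = {"Bacteria": 23.8, "Archaea": 27.8}

reductions = {
    domain: percent_reduction(raw[domain], normalized[domain])
    for domain in raw
}

# Ratio of bacteria to archaea before and after normalization.
raw_ratio = raw["Bacteria"] / raw["Archaea"]
normalized_ratio = normalized["Bacteria"] / normalized["Archaea"]
overestimate = raw_ratio / normalized_ratio

print({domain: round(value) for domain, value in reductions.items()})
print(round(overestimate, 2))
```
</preformat>
<p>With these rounded totals the reductions come to roughly 44% and 2%, and the bacteria-to-archaea ratio before normalization is about 1.7 times the ratio after normalization.</p>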
<p>While lake and glacial ice had the highest mean map ratios for bacteria, and lake the highest for archaea, no Antarctic environment was well represented by the available completed genomes in GenBank. All of the investigated environments had some samples with comparatively low mean map ratios, and all samples had some number of unique reads with low map ratios. The abundance of an <italic>S. pseudonitzschiae</italic> SMR1 phylotype with a low map ratio in seawater samples indicates that even relatively well-sampled (ranked 4th out of 10 for number of samples per environment) environments contain considerable uncharacterized diversity. <italic>S. pseudonitzschiae</italic> was isolated from the marine diatom <italic>Pseudo-nitzschia multiseries</italic> (<xref ref-type="bibr" rid="B16">Hong et al., 2015</xref>), suggesting that phytoplankton blooms &#x2013; comparatively well-studied environments &#x2013; may host their own microbial dark matter. It is important to note the difference between phylogenetic dissimilarity and sequence identity for such well-characterized taxa as <italic>Sulfitobacter</italic>; an uncharacterized strain may be most closely related to (e.g.) sequenced <italic>Sulfitobacter</italic> but nonetheless share little sequence identity. Overall the dissimilarity between environmental sequence reads and 16S rRNA genes from completed genomes is not surprising given the paucity of completely sequenced genomes from Antarctica. Because data on isolation environment are not typically included with genome metadata it is difficult to determine how many complete genomes of Antarctic bacteria and archaea have been sequenced. However, <xref ref-type="bibr" rid="B6">Bowman (2017</xref>) recently identified only 32 completely sequenced psychrophile genomes, suggesting that bacteria and archaea from the perennially cold Antarctic are not well represented.</p>
<p>A great number of valuable studies were excluded from this analysis based on technical limitations, including the use of older sequencing technologies such as Roche 454, or poor sequence quality. The rate of technological innovation for high-throughput sequencing methods has been extreme since the first 454 sequencing study in 2006 (<xref ref-type="bibr" rid="B27">Sogin et al., 2006</xref>), and the current primers and Illumina MiSeq methodologies reflect a maturation of this technology (e.g., <xref ref-type="bibr" rid="B31">Thompson et al., 2017</xref>). Although widely adopted, these methodologies are not ubiquitous, and individual investigators must strive to adopt best practices for microbial ecology studies. Methodological errors are compounded by archival errors; several studies of interest could not be used because the data were not correctly uploaded to SRA. The most common error was a failure to demultiplex at the time of submission; without a map file identifying barcodes and sample-specific metadata these data are meaningless to the wider community. Journals and funding agencies should continue to require that sequence data and appropriate metadata be archived at the time of manuscript submission or at the completion of a project; however, the current checks are insufficient to ensure that data are discoverable and reusable.</p>
<p>Despite the vast size of the Antarctic continent, sampling for most environments was concentrated in just a few areas (Figure <xref ref-type="fig" rid="F1">1</xref>). The western Antarctic Peninsula and McMurdo regions were the most heavily sampled and accounted for nearly all terrestrial samples except for soil. Soil was sampled in several other locations in eastern Antarctica, namely Prydz Bay and the S&#x00F8;r Rondane Mountains. Sea ice was sampled exclusively in the Ross Sea region (including McMurdo Sound) and the Weddell Sea. How much microbial diversity remains undiscovered because of this bias is a difficult question to answer. Certainly within these more densely sampled sites there are habitats in space and time that are undersampled, or that have not been sampled at all. The implication of this is clear from the relationships in Figure <xref ref-type="fig" rid="F3">3</xref>; while individual samples within environments may be sampled to saturation, this does not necessarily mean that the total diversity of the environment is well sampled. Future investigations will need to continue to focus on better understanding the environmental drivers of diversity within more heavily sampled regions, while expanding to include new areas that have not been included in previous sampling efforts.</p>
</sec>
<sec><title>Data Availability Statement</title>
<p>All data used in this study are available from the NCBI SRA at the BioProjects listed in Table <xref ref-type="table" rid="T1">1</xref>. Additional information on each included sample is provided in a table in the Supplementary Information.</p>
</sec>
<sec><title>Author Contributions</title>
<p>JB conceived the study, carried out the analysis, and wrote the manuscript.</p>
</sec>
<sec><title>Conflict of Interest Statement</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This study was supported by NASA 80NSSC18M010S, a Simons Foundation Early Career Marine Microbial Investigator Fellowship, NSF-OPP 1641019, and NSF-OPP 1821911.</p>
</fn>
</fn-group>
<ack>
<p>I would like to thank the Antarctic microbial community at large for the collection and public sharing of these data.</p>
</ack>
<sec sec-type="supplementary material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmicb.2018.03165/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmicb.2018.03165/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.csv" id="SM1" mimetype="text/csv" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amir</surname> <given-names>A.</given-names></name> <name><surname>McDonald</surname> <given-names>D.</given-names></name> <name><surname>Navas-Molina</surname> <given-names>J. A.</given-names></name> <name><surname>Kopylova</surname> <given-names>E.</given-names></name> <name><surname>Morton</surname> <given-names>J. T.</given-names></name> <name><surname>Zech Xu</surname> <given-names>Z.</given-names></name><etal/></person-group> (<year>2017</year>). <source><italic>Deblur Rapidly Resolves Single-&#x2019;, American Society for Microbiology.</italic></source> Available at: <ext-link ext-link-type="uri" xlink:href="http://genomebiology.biomedcentral.com/articles/10.1186/gb-2012-13-9-r79">http://genomebiology.biomedcentral.com/articles/10.1186/gb-2012-13-9-r79</ext-link></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beaupr&#x00E9;</surname> <given-names>A. D.</given-names></name> <name><surname>O&#x2019;Dwyer</surname> <given-names>J. P.</given-names></name></person-group> (<year>2017</year>). <article-title>Widespread bursts of diversification in microbial phylogenies.</article-title> <source><italic>arXiv</italic></source> <pub-id pub-id-type="pmid">23056629</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bendia</surname> <given-names>A. G.</given-names></name> <name><surname>Signori</surname> <given-names>C. N.</given-names></name> <name><surname>Franco</surname> <given-names>D. C.</given-names></name> <name><surname>Duarte</surname> <given-names>R. T. D.</given-names></name> <name><surname>Bohannan</surname> <given-names>B. J. M.</given-names></name> <name><surname>Pellizari</surname> <given-names>V. H.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>A mosaic of geothermal and marine features shapes microbial community structure on deception Island Volcano, Antarctica.</article-title> <source><italic>Front. Microbiol.</italic></source> <volume>9</volume>:<issue>899</issue>. <pub-id pub-id-type="doi">10.3389/fmicb.2018.00899</pub-id> <pub-id pub-id-type="pmid">29867810</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bissett</surname> <given-names>A.</given-names></name> <name><surname>Fitzgerald</surname> <given-names>A.</given-names></name> <name><surname>Court</surname> <given-names>L.</given-names></name> <name><surname>Meintjes</surname> <given-names>T.</given-names></name> <name><surname>Mele</surname> <given-names>P. M.</given-names></name> <name><surname>Reith</surname> <given-names>F.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Erratum: introducing BASE: the Biomes of Australian Soil Environments soil microbial diversity database.</article-title> <source><italic>GigaScience</italic></source> <volume>6</volume>:<issue>1</issue>. <pub-id pub-id-type="doi">10.1093/gigascience/gix021</pub-id> <pub-id pub-id-type="pmid">30137319</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowman</surname> <given-names>J.</given-names></name> <name><surname>Ducklow</surname> <given-names>H.</given-names></name></person-group> (<year>2015</year>). <article-title>Microbial communities can be described by metabolic structure: a general framework and application to a seasonally variable, depth-stratified microbial community from the coastal West Antarctic Peninsula.</article-title> <source><italic>PLoS One</italic></source> <volume>10</volume>:<issue>e0135868</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0135868</pub-id> <pub-id pub-id-type="pmid">26285202</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowman</surname> <given-names>J. P.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x201C;Genomics of psychrophilic bacteria and archaea,&#x201D; in</article-title> <source><italic>Psychrophiles: From Biodiversity to Biotechnology</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Margesin</surname> <given-names>R.</given-names></name> <name><surname>Schinner</surname> <given-names>F.</given-names></name> <name><surname>Marx</surname> <given-names>J. C.</given-names></name> <name><surname>Gerday</surname> <given-names>C.</given-names></name></person-group> (New York, NY: Springer), <fpage>345</fpage>&#x2013;<lpage>387</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-57057-0_15</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowman</surname> <given-names>J. S.</given-names></name> <name><surname>Amaral-Zettler</surname> <given-names>L. A.</given-names></name> <name><surname>Rich</surname> <given-names>J. J.</given-names></name> <name><surname>Luria</surname> <given-names>M. C.</given-names></name> <name><surname>Ducklow</surname> <given-names>H. W.</given-names></name></person-group> (<year>2017</year>). <article-title>Bacterial community segmentation facilitates the prediction of ecosystem function along the coast of the western Antarctic Peninsula.</article-title> <source><italic>ISME J.</italic></source> <volume>11</volume> <fpage>1460</fpage>&#x2013;<lpage>1471</lpage>. <pub-id pub-id-type="doi">10.1038/ismej.2016.204</pub-id> <pub-id pub-id-type="pmid">28106879</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowman</surname> <given-names>J. S.</given-names></name> <name><surname>Deming</surname> <given-names>J. W.</given-names></name></person-group> (<year>2016</year>). <article-title>Wind-driven distribution of bacteria in coastal Antarctica: evidence from the Ross Sea region.</article-title> <source><italic>Polar Biol.</italic></source> <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>.</citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Callahan</surname> <given-names>B. J.</given-names></name> <name><surname>McMurdie</surname> <given-names>P. J.</given-names></name> <name><surname>Rosen</surname> <given-names>M. J.</given-names></name> <name><surname>Han</surname> <given-names>A. W.</given-names></name> <name><surname>Johnson</surname> <given-names>A. J.</given-names></name> <name><surname>Holmes</surname> <given-names>S. P.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>DADA2: high-resolution sample inference from Illumina amplicon data.</article-title> <source><italic>Nat. Methods</italic></source> <volume>13</volume> <fpage>581</fpage>&#x2013;<lpage>583</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.3869</pub-id> <pub-id pub-id-type="pmid">27214047</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Christner</surname> <given-names>B. C.</given-names></name> <name><surname>Priscu</surname> <given-names>J. C.</given-names></name> <name><surname>Achberger</surname> <given-names>A. M.</given-names></name> <name><surname>Barbante</surname> <given-names>C.</given-names></name> <name><surname>Carter</surname> <given-names>S. P.</given-names></name> <name><surname>Christianson</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>A microbial ecosystem beneath the West Antarctic ice sheet.</article-title> <source><italic>Nature</italic></source> <volume>512</volume> <fpage>310</fpage>&#x2013;<lpage>313</lpage>. <pub-id pub-id-type="doi">10.1038/nature13667</pub-id> <pub-id pub-id-type="pmid">25143114</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><collab>R Core Team</collab></person-group> (<year>2014</year>). <source><italic>R: A Language and Environment for Statistical Computing.</italic></source> <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>. Available at: <ext-link ext-link-type="uri" xlink:href="http://www.r-project.org/">http://www.r-project.org/</ext-link></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Scally</surname> <given-names>S. Z.</given-names></name> <name><surname>Makhalanyane</surname> <given-names>T.</given-names></name> <name><surname>Frossard</surname> <given-names>A.</given-names></name> <name><surname>Hogg</surname> <given-names>I. D.</given-names></name> <name><surname>Cowan</surname> <given-names>D. A.</given-names></name></person-group> (<year>2016</year>). <article-title>Antarctic microbial communities are functionally redundant, adapted and resistant to short term temperature perturbations.</article-title> <source><italic>Soil Biol. Biochem.</italic></source> <volume>103</volume> <fpage>160</fpage>&#x2013;<lpage>170</lpage>. <pub-id pub-id-type="doi">10.1016/j.soilbio.2016.08.013</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eronen-Rasimus</surname> <given-names>E.</given-names></name> <name><surname>Luhtanen</surname> <given-names>A. M.</given-names></name> <name><surname>Rintala</surname> <given-names>J. M.</given-names></name> <name><surname>Delille</surname> <given-names>B.</given-names></name> <name><surname>Dieckmann</surname> <given-names>G.</given-names></name> <name><surname>Karkman</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>An active bacterial community linked to high chl-a concentrations in Antarctic winter-pack ice and evidence for the development of an anaerobic sea-ice bacterial community.</article-title> <source><italic>ISME J.</italic></source> <volume>11</volume> <fpage>2345</fpage>&#x2013;<lpage>2355</lpage>. <pub-id pub-id-type="doi">10.1038/ismej.2017.96</pub-id> <pub-id pub-id-type="pmid">28708127</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Frisia</surname> <given-names>S.</given-names></name> <name><surname>Weyrich</surname> <given-names>L. S.</given-names></name> <name><surname>Hellstrom</surname> <given-names>J.</given-names></name> <name><surname>Borsato</surname> <given-names>A.</given-names></name> <name><surname>Golledge</surname> <given-names>N. R.</given-names></name> <name><surname>Anesio</surname> <given-names>A. M.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>The influence of Antarctic subglacial volcanism on the global iron cycle during the last glacial maximum.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>8</volume> <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1038/ncomms15425</pub-id> <pub-id pub-id-type="pmid">28598412</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gil</surname> <given-names>R.</given-names></name> <name><surname>Latorre</surname> <given-names>A.</given-names></name> <name><surname>Moya</surname> <given-names>A.</given-names></name></person-group> (<year>2004</year>). <article-title>Bacterial endosymbionts of insects: insights from comparative genomics.</article-title> <source><italic>Environ. Microbiol.</italic></source> <volume>6</volume> <fpage>1109</fpage>&#x2013;<lpage>1122</lpage>. <pub-id pub-id-type="doi">10.1111/j.1462-2920.2004.00691.x</pub-id> <pub-id pub-id-type="pmid">15479245</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hong</surname> <given-names>Z.</given-names></name> <name><surname>Lai</surname> <given-names>Q.</given-names></name> <name><surname>Luo</surname> <given-names>Q.</given-names></name> <name><surname>Jiang</surname> <given-names>S.</given-names></name> <name><surname>Zhu</surname> <given-names>R.</given-names></name> <name><surname>Liang</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Sulfitobacter pseudonitzschiae sp. nov., isolated from the toxic marine diatom Pseudo-nitzschia multiseries.</article-title> <source><italic>Int. J. Syst. Evol. Microbiol.</italic></source> <volume>65</volume> <fpage>95</fpage>&#x2013;<lpage>100</lpage>. <pub-id pub-id-type="doi">10.1099/ijs.0.064972-0</pub-id> <pub-id pub-id-type="pmid">25278561</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kleinteich</surname> <given-names>J.</given-names></name> <name><surname>Hildebrand</surname> <given-names>F.</given-names></name> <name><surname>Bahram</surname> <given-names>M.</given-names></name> <name><surname>Voigt</surname> <given-names>A. Y.</given-names></name> <name><surname>Wood</surname> <given-names>S. A.</given-names></name> <name><surname>Jungblut</surname> <given-names>A. D.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Pole-to-pole connections: similarities between Arctic and Antarctic microbiomes and their vulnerability to environmental change.</article-title> <source><italic>Front. Ecol. Evol.</italic></source> <volume>5</volume>:<issue>137</issue>. <pub-id pub-id-type="doi">10.3389/fevo.2017.00137</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Learman</surname> <given-names>D. R.</given-names></name> <name><surname>Henson</surname> <given-names>M. W.</given-names></name> <name><surname>Thrash</surname> <given-names>J. C.</given-names></name> <name><surname>Temperton</surname> <given-names>B.</given-names></name> <name><surname>Brannock</surname> <given-names>P. M.</given-names></name> <name><surname>Santos</surname> <given-names>S. R.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Biogeochemical and microbial variation across 5500 km of Antarctic surface sediment implicates organic matter as a driver of benthic community structure.</article-title> <source><italic>Front. Microbiol.</italic></source> <volume>7</volume>:<issue>284</issue>. <pub-id pub-id-type="doi">10.3389/fmicb.2016.00284</pub-id> <pub-id pub-id-type="pmid">27047451</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marcy</surname> <given-names>Y.</given-names></name> <name><surname>Ouverney</surname> <given-names>C.</given-names></name> <name><surname>Bik</surname> <given-names>E. M.</given-names></name> <name><surname>L&#x00F6;sekann</surname> <given-names>T.</given-names></name> <name><surname>Ivanova</surname> <given-names>N.</given-names></name> <name><surname>Martin</surname> <given-names>H. G.</given-names></name><etal/></person-group> (<year>2007</year>). <article-title>Dissecting biological &#x201C;dark matter&#x201D; with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>104</volume> <fpage>11889</fpage>&#x2013;<lpage>11894</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0704662104</pub-id> <pub-id pub-id-type="pmid">17620602</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Matsen</surname> <given-names>F. A.</given-names></name> <name><surname>Kodner</surname> <given-names>R. B.</given-names></name> <name><surname>Armbrust</surname> <given-names>E. V.</given-names></name></person-group> (<year>2010</year>). <article-title>Pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree.</article-title> <source><italic>BMC Bioinformatics</italic></source> <volume>11</volume>:<issue>538</issue>. <pub-id pub-id-type="doi">10.1186/1471-2105-11-538</pub-id> <pub-id pub-id-type="pmid">21034504</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McLean</surname> <given-names>A. L.</given-names></name></person-group> (<year>1918</year>). <article-title>Bacteria of ice and snow in Antarctica.</article-title> <source><italic>Nature</italic></source> <volume>102</volume> <fpage>35</fpage>&#x2013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1038/102035a0</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moreno-Pino</surname> <given-names>M.</given-names></name> <name><surname>De la Iglesia</surname> <given-names>R.</given-names></name> <name><surname>Valdivia</surname> <given-names>N.</given-names></name> <name><surname>Henr&#x00ED;quez-Castillo</surname> <given-names>C.</given-names></name> <name><surname>Gal&#x00E1;n</surname> <given-names>A.</given-names></name> <name><surname>D&#x00ED;ez</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Variation in coastal Antarctic microbial community composition at sub-mesoscale: spatial distance or environmental filtering?</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>92</volume>:<issue>fiw088</issue>. <pub-id pub-id-type="doi">10.1093/femsec/fiw088</pub-id> <pub-id pub-id-type="pmid">27127198</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nawrocki</surname> <given-names>E. P.</given-names></name> <name><surname>Burge</surname> <given-names>S. W.</given-names></name> <name><surname>Bateman</surname> <given-names>A.</given-names></name> <name><surname>Daub</surname> <given-names>J.</given-names></name> <name><surname>Eberhardt</surname> <given-names>R. Y.</given-names></name> <name><surname>Eddy</surname> <given-names>S. R.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Rfam 12.0: updates to the RNA families database.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>43</volume> <fpage>D130</fpage>&#x2013;<lpage>D137</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gku1063</pub-id> <pub-id pub-id-type="pmid">25392425</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nawrocki</surname> <given-names>E. P.</given-names></name> <name><surname>Eddy</surname> <given-names>S. R.</given-names></name></person-group> (<year>2013</year>). <article-title>Computational identification of functional RNA homologs in metagenomic data.</article-title> <source><italic>RNA Biol.</italic></source> <volume>10</volume> <fpage>1170</fpage>&#x2013;<lpage>1179</lpage>. <pub-id pub-id-type="doi">10.4161/rna.25038</pub-id> <pub-id pub-id-type="pmid">23722291</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rippin</surname> <given-names>M.</given-names></name> <name><surname>Lange</surname> <given-names>S.</given-names></name> <name><surname>Sausen</surname> <given-names>N.</given-names></name> <name><surname>Becker</surname> <given-names>B.</given-names></name></person-group> (<year>2018</year>). <article-title>Biodiversity of biological soil crusts from the Polar Regions revealed by metabarcoding.</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>94</volume>:<issue>fiy036</issue>. <pub-id pub-id-type="doi">10.1093/femsec/fiy036</pub-id> <pub-id pub-id-type="pmid">29514253</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rozema</surname> <given-names>P. D.</given-names></name> <name><surname>Kulk</surname> <given-names>G.</given-names></name> <name><surname>Veldhuis</surname> <given-names>M. P.</given-names></name> <name><surname>Buma</surname> <given-names>A. G. J.</given-names></name> <name><surname>Meredith</surname> <given-names>M. P.</given-names></name> <name><surname>van de Poll</surname> <given-names>W. H.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Assessing drivers of coastal primary production in northern Marguerite Bay, Antarctica.</article-title> <source><italic>Front. Mar. Sci.</italic></source> <volume>4</volume>:<issue>184</issue>. <pub-id pub-id-type="doi">10.3389/fmars.2017.00184</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sogin</surname> <given-names>M. L.</given-names></name> <name><surname>Morrison</surname> <given-names>H. G.</given-names></name> <name><surname>Huber</surname> <given-names>J. A.</given-names></name> <name><surname>Mark Welch</surname> <given-names>D.</given-names></name> <name><surname>Huse</surname> <given-names>S. M.</given-names></name> <name><surname>Neal</surname> <given-names>P. R.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>Microbial diversity in the deep sea and the underexplored rare biosphere.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>103</volume> <fpage>12115</fpage>&#x2013;<lpage>12120</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0605127103</pub-id> <pub-id pub-id-type="pmid">16880384</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sommers</surname> <given-names>P.</given-names></name> <name><surname>Darcy</surname> <given-names>J. L.</given-names></name> <name><surname>Gendron</surname> <given-names>E. M. S.</given-names></name> <name><surname>Stanish</surname> <given-names>L. F.</given-names></name> <name><surname>Bagshaw</surname> <given-names>E. A.</given-names></name> <name><surname>Porazinska</surname> <given-names>D. L.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Diversity patterns of microbial eukaryotes mirror those of bacteria in Antarctic cryoconite holes.</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>94</volume>:<issue>fix167</issue>. <pub-id pub-id-type="doi">10.1093/femsec/fix167</pub-id> <pub-id pub-id-type="pmid">29228256</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tahon</surname> <given-names>G.</given-names></name> <name><surname>Tytgat</surname> <given-names>B.</given-names></name> <name><surname>Willems</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>Diversity of phototrophic genes suggests multiple bacteria may be able to exploit sunlight in exposed soils from the S&#x00F8;r Rondane Mountains, East Antarctica.</article-title> <source><italic>Front. Microbiol.</italic></source> <volume>7</volume>:<issue>2026</issue>. <pub-id pub-id-type="doi">10.3389/fmicb.2016.02026</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tange</surname> <given-names>O.</given-names></name></person-group> (<year>2011</year>). <source><italic>GNU Parallel - The Command-Line Power Tool.</italic></source> <publisher-loc>Renton, WA</publisher-loc>: <publisher-name>The USENIX Magazine</publisher-name>, <fpage>42</fpage>&#x2013;<lpage>47</lpage>.</citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thompson</surname> <given-names>L. R.</given-names></name> <name><surname>Sanders</surname> <given-names>J. G.</given-names></name> <name><surname>McDonald</surname> <given-names>D.</given-names></name> <name><surname>Amir</surname> <given-names>A.</given-names></name> <name><surname>Ladau</surname> <given-names>J.</given-names></name> <name><surname>Locey</surname> <given-names>K. J.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>A communal catalogue reveals Earth&#x2019;s multiscale microbial diversity.</article-title> <source><italic>Nature</italic></source> <volume>551</volume> <fpage>457</fpage>&#x2013;<lpage>463</lpage>. <pub-id pub-id-type="doi">10.1038/nature24621</pub-id> <pub-id pub-id-type="pmid">29088705</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tolar</surname> <given-names>B. B.</given-names></name> <name><surname>Ross</surname> <given-names>M. J.</given-names></name> <name><surname>Wallsgrove</surname> <given-names>N. J.</given-names></name> <name><surname>Liu</surname> <given-names>Q.</given-names></name> <name><surname>Aluwihare</surname> <given-names>L. I.</given-names></name> <name><surname>Popp</surname> <given-names>B. N.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Contribution of ammonia oxidation to chemoautotrophy in Antarctic coastal waters.</article-title> <source><italic>ISME J.</italic></source> <volume>10</volume> <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1038/ismej.2016.61</pub-id> <pub-id pub-id-type="pmid">27187795</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tytgat</surname> <given-names>B.</given-names></name> <name><surname>Verleyen</surname> <given-names>E.</given-names></name> <name><surname>Sweetlove</surname> <given-names>M.</given-names></name> <name><surname>D&#x2019;hondt</surname> <given-names>S.</given-names></name> <name><surname>Clercx</surname> <given-names>P.</given-names></name> <name><surname>Van Ranst</surname> <given-names>E.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Bacterial community composition in relation to bedrock type and macrobiota in soils from the S&#x00F8;r Rondane Mountains, East Antarctica.</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>92</volume>:<issue>fiw126</issue>. <pub-id pub-id-type="doi">10.1093/femsec/fiw126</pub-id> <pub-id pub-id-type="pmid">27402710</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vick-Majors</surname> <given-names>T. J.</given-names></name> <name><surname>Achberger</surname> <given-names>A.</given-names></name> <name><surname>Santib&#x00E1;&#x00F1;ez</surname> <given-names>P.</given-names></name> <name><surname>Dore</surname> <given-names>J. E.</given-names></name> <name><surname>Hodson</surname> <given-names>T.</given-names></name> <name><surname>Michaud</surname> <given-names>A. B.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Biogeochemistry and microbial diversity in the marine cavity beneath the McMurdo Ice Shelf, Antarctica.</article-title> <source><italic>Limnol. Oceanogr.</italic></source> <volume>61</volume> <fpage>572</fpage>&#x2013;<lpage>586</lpage>. <pub-id pub-id-type="doi">10.1002/lno.10234</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walters</surname> <given-names>W.</given-names></name> <name><surname>Hyde</surname> <given-names>E. R.</given-names></name> <name><surname>Berg-Lyons</surname> <given-names>D.</given-names></name> <name><surname>Ackermann</surname> <given-names>G.</given-names></name> <name><surname>Humphrey</surname> <given-names>G.</given-names></name> <name><surname>Parada</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Improved bacterial 16S rRNA gene (V4 and V4-5) and fungal internal transcribed spacer marker gene primers for microbial community surveys.</article-title> <source><italic>mSystems</italic></source> <volume>1</volume>:<issue>e00009-15</issue>. <pub-id pub-id-type="doi">10.1128/mSystems.00009-15</pub-id> <pub-id pub-id-type="pmid">27822518</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Garrity</surname> <given-names>G. M.</given-names></name> <name><surname>Tiedje</surname> <given-names>J. M.</given-names></name> <name><surname>Cole</surname> <given-names>J. R.</given-names></name></person-group> (<year>2007</year>). <article-title>Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.</article-title> <source><italic>Appl. Environ. Microbiol.</italic></source> <volume>73</volume> <fpage>5261</fpage>&#x2013;<lpage>5267</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.00062-07</pub-id> <pub-id pub-id-type="pmid">17586664</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Webster-Brown</surname> <given-names>J. G.</given-names></name> <name><surname>Hawes</surname> <given-names>I.</given-names></name> <name><surname>Jungblut</surname> <given-names>A. D.</given-names></name> <name><surname>Wood</surname> <given-names>S. A.</given-names></name> <name><surname>Christenson</surname> <given-names>H. K.</given-names></name></person-group> (<year>2015</year>). <article-title>The effects of entombment on water chemistry and bacterial assemblages in closed cryoconite holes on Antarctic glaciers.</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>91</volume>:<issue>fiv144</issue>. <pub-id pub-id-type="doi">10.1093/femsec/fiv144</pub-id> <pub-id pub-id-type="pmid">26572547</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>W.</given-names></name> <name><surname>Ma</surname> <given-names>H.</given-names></name> <name><surname>Shi</surname> <given-names>G.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Sun</surname> <given-names>B.</given-names></name> <name><surname>Xiao</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Independent shifts of abundant and rare bacterial populations across East Antarctica glacial foreland.</article-title> <source><italic>Front. Microbiol.</italic></source> <volume>8</volume>:<issue>1534</issue>. <pub-id pub-id-type="doi">10.3389/fmicb.2017.01534</pub-id> <pub-id pub-id-type="pmid">28848537</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Lu</surname> <given-names>L.</given-names></name> <name><surname>Chang</surname> <given-names>X.</given-names></name> <name><surname>Jiang</surname> <given-names>F.</given-names></name> <name><surname>Gao</surname> <given-names>X. D.</given-names></name> <name><surname>Peng</surname> <given-names>F.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Small-scale soil microbial community heterogeneity linked to landforms on King George Island, maritime Antarctica.</article-title> <source><italic>bioRxiv</italic></source> <pub-id pub-id-type="doi">10.1101/310490</pub-id></citation></ref>
</ref-list>
</back>
</article>