<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Mar. Sci.</journal-id>
<journal-title>Frontiers in Marine Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Mar. Sci.</abbrev-journal-title>
<issn pub-type="epub">2296-7745</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmars.2019.00219</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Marine Science</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Estimation of 18S Gene Copy Number in Marine Eukaryotic Plankton Using a Next-Generation Sequencing Approach</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Gong</surname> <given-names>Weida</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/683700/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Marchetti</surname> <given-names>Adrian</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/42816/overview"/>
</contrib>
</contrib-group>
<aff><institution>Department of Marine Sciences, The University of North Carolina at Chapel Hill</institution>, <addr-line>Chapel Hill, NC</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Sandie M. Degnan, The University of Queensland, Australia</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Chih-Ching Chung, National Taiwan Ocean University, Taiwan; Dagmar Hajkova Leary, United States Naval Research Laboratory, United States</p></fn>
<corresp id="c001">&#x002A;Correspondence: Adrian Marchetti, <email>amarchetti@unc.edu</email></corresp>
<fn fn-type="other" id="fn002"><p>This article was submitted to Marine Molecular Biology and Ecology, a section of the journal Frontiers in Marine Science</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>04</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>6</volume>
<elocation-id>219</elocation-id>
<history>
<date date-type="received">
<day>11</day>
<month>02</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>04</month>
<year>2019</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2019 Gong and Marchetti.</copyright-statement>
<copyright-year>2019</copyright-year>
<copyright-holder>Gong and Marchetti</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>The small subunit 18S rRNA (18S) gene is the most commonly used marker for taxonomic identification in eukaryotes. However, protists may harbor substantial variation in their 18S gene copy number, which can lead to a rapid decline in concordance between 18S gene sequences and actual organismal abundances. Here we used a computational method to estimate 18S gene copy number in seven species of marine eukaryotic phytoplankton and found large interspecies and strain-level differences across and within the examined species. Our results emphasize that variations in 18S gene copy number need to be taken into consideration and that corrections can improve the accuracy of quantitative eukaryotic microbial community profiles.</p>
</abstract>
<kwd-group>
<kwd>18S rRNA gene</kwd>
<kwd>amplicon sequencing</kwd>
<kwd>plankton community composition</kwd>
<kwd>bioinformatics</kwd>
<kwd>microbial ecology</kwd>
</kwd-group>
<contract-sponsor id="cn001">National Aeronautics and Space Administration<named-content content-type="fundref-id">10.13039/100000104</named-content></contract-sponsor>
<counts>
<fig-count count="2"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="29"/>
<page-count count="5"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec><title>Introduction</title>
<p>With substantial reductions in DNA sequencing costs combined with higher sequence yields, amplicon sequencing has revolutionized our view of microbial ecology (<xref ref-type="bibr" rid="B24">Sogin et al., 2006</xref>). It produces a culture-independent molecular characterization of the microbial community composition, and application of amplicon sequencing has successfully discovered novel microbes and characterized microbial diversity from a wide range of environments (<xref ref-type="bibr" rid="B6">Caron et al., 2012</xref>).</p>
<p>Due to its high specificity and sequence conservation, the 18S rRNA gene has become the most commonly used marker to explore eukaryotic protist community structure in both aquatic and terrestrial environments (<xref ref-type="bibr" rid="B7">Countway et al., 2005</xref>; <xref ref-type="bibr" rid="B8">de Vargas et al., 2015</xref>). However, 18S gene sequencing has inherent drawbacks that are typical of high throughput sequencing studies, such as PCR chimeras and sequencing errors (<xref ref-type="bibr" rid="B6">Caron et al., 2012</xref>). Ongoing research has focused on fixing such issues to provide a more accurate taxonomic description. Primers have been continuously modified and multiple hyper-variable regions can be simultaneously sequenced to reduce primer mismatch (<xref ref-type="bibr" rid="B21">Parada et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Lin et al., 2017</xref>). Metagenomics has also been employed to preclude PCR-based bias (<xref ref-type="bibr" rid="B13">Eloe-Fadrosh et al., 2016</xref>).</p>
<p>Besides inherent technical issues, another source of bias in quantifying community composition with 18S gene sequences stems from variable copy numbers of ribosomal genes (<xref ref-type="bibr" rid="B7">Countway et al., 2005</xref>; <xref ref-type="bibr" rid="B6">Caron et al., 2012</xref>). Read counts of 18S genes are commonly used to estimate proportions of protists in amplicon sequencing analyses. However, the relative abundance of 18S gene copies in eukaryotic plankton collected from environmental samples can be attributed both to variation in the relative abundance of different organisms and to variation in genomic 18S copy number among those organisms (<xref ref-type="bibr" rid="B29">Zhu et al., 2005</xref>; <xref ref-type="bibr" rid="B14">Godhe et al., 2008</xref>). Phylogeny-based approaches have been developed to estimate ribosomal gene copy numbers for prokaryotes (<xref ref-type="bibr" rid="B15">Kembel et al., 2012</xref>; <xref ref-type="bibr" rid="B2">Angly et al., 2014</xref>); however, the accuracy of such estimates can be compromised for protists due to the limited number of genomes that have been sequenced. <xref ref-type="bibr" rid="B29">Zhu et al. (2005)</xref> developed a quantitative PCR-based approach to estimate 18S gene copy number for picoeukaryotes by normalizing total copies of the 18S gene in the sample to cell abundance, but the results are highly dependent on DNA extraction efficiency, primer specificity and cell enumeration, and the approach can be impractical for uncultured but prevalent phytoplankton species. Estimating 18S gene copy number remains an arduous task.</p>
<p>Due to the large number of sequences produced through high-throughput sequencing, a computational method that determines gene sequencing coverage has recently been developed and has become a promising way to estimate gene copy number variations (<xref ref-type="bibr" rid="B28">Zhao et al., 2013</xref>). It has been successfully applied to bacterial communities (<xref ref-type="bibr" rid="B22">Perisin et al., 2016</xref>); however, such an approach has yet to be applied to protist communities. By normalizing sequencing coverage of 18S genes to that of single copy genes, we quantified 18S gene copy numbers of selected marine eukaryotic phytoplankton species whose draft assemblies are available in NCBI&#x2019;s GenBank database. Our results provide 18S gene copy number estimates for multiple representative species from four common phytoplankton classes and reveal large interspecies and strain-level variation in 18S rRNA gene copy number across the different phytoplankton species.</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec><title>Bioinformatics Pipeline</title>
<p>Draft/closed genome assemblies and raw sequences for <italic>Emiliania huxleyi</italic>, <italic>Ostreococcus tauri</italic>, <italic>Phaeodactylum tricornutum</italic>, <italic>Symbiodinium kawagutii</italic>, <italic>Symbiodinium minutum</italic>, <italic>Thalassiosira oceanica</italic>, and <italic>Trebouxia</italic> sp. were obtained from the NCBI Short Read Archive (see <xref ref-type="supplementary-material" rid="SM5">Supplementary Table 1</xref> for NCBI accession numbers; <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 1</xref>). Raw sequences were trimmed with Trimmomatic v0.36 (<xref ref-type="bibr" rid="B4">Bolger et al., 2014</xref>) using a sliding window of 4 bases, with reads trimmed where the average quality score within the window fell below 20. Trimmed reads were quality checked with FastQC (<xref ref-type="bibr" rid="B1">Andrews, 2010</xref>). Bowtie2 v2.3.4.1 was used to map reads back to genome assemblies, and per-base sequencing depth was computed with the samtools v1.8 depth command (<xref ref-type="bibr" rid="B17">Li et al., 2009</xref>; <xref ref-type="bibr" rid="B16">Langmead and Salzberg, 2012</xref>).</p>
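<p>As a minimal illustration of how per-base depth can be summarized for a gene region (a sketch, not the code used in this study), the Python example below parses tab-separated text in the three-column layout written by the samtools depth command (contig, 1-based position, depth) and averages depth over an interval. The function names and the toy input are hypothetical, and the sketch assumes that positions absent from the input have zero depth.</p>
<preformat><![CDATA[
```python
from collections import defaultdict


def parse_depth(lines):
    """Parse depth text (contig, 1-based position, depth) into {contig: {pos: depth}}."""
    depth = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        contig, pos, cov = line.split("\t")[:3]
        depth[contig][int(pos)] = int(cov)
    return depth


def mean_region_depth(depth, contig, start, end):
    """Mean per-base depth over the 1-based inclusive interval [start, end].

    Assumption: positions missing from the input count as zero depth.
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not be smaller than start")
    per_contig = depth.get(contig, {})
    total = sum(per_contig.get(p, 0) for p in range(start, end + 1))
    return total / length


# Toy input (hypothetical values); position 4 is deliberately missing.
example = ["chr1\t1\t10", "chr1\t2\t12", "chr1\t3\t14", "chr1\t5\t20"]
depth_table = parse_depth(example)
v4_mean = mean_region_depth(depth_table, "chr1", 1, 5)
```
]]></preformat>
<p>The same helper can be called once for the 18S V4 interval and once for each single copy gene interval, which yields the per-gene averages used in the later normalization step.</p>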
</sec>
<sec><title>Identification of Single Copy Genes</title>
<p>Benchmarking Universal Single-Copy Orthologs (BUSCO) was used to assess genome assembly and annotation completeness. BUSCO has a set of phylogeny-specific single copy orthologs (<xref ref-type="bibr" rid="B25">Waterhouse et al., 2018</xref>). The protist dataset was phylogenetically close to eukaryotic phytoplankton and was thus selected as an initial set of reference single copy genes. A set of 83 eukaryotic single copy core genes described by <xref ref-type="bibr" rid="B9">Delmont (2018)</xref> was used as the reference single copy genes in this study (<xref ref-type="supplementary-material" rid="SM6">Supplementary Table 2</xref>). The reference single copy genes showed consistent read depth following a Poisson distribution with overdispersion (see example in <xref ref-type="supplementary-material" rid="SM2">Supplementary Figure 2</xref>), which further validated categorization as single copy genes in the examined phytoplankton genomes (<xref ref-type="bibr" rid="B5">Brynildsrud et al., 2015</xref>).</p>
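<p>The read-depth consistency check described above can be illustrated with a short Python sketch. It is not the procedure used in this study: the variance-to-mean ratio is used here only as a simple indicator of overdispersion relative to a Poisson expectation (for which the ratio is close to 1), and the 1.5-fold cutoff for flagging a gene, the gene names, and the depth values are all hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import mean, median, variance


def dispersion_index(depths):
    """Sample variance divided by mean; about 1 for Poisson data, above 1 if overdispersed."""
    m = mean(depths)
    if m == 0:
        raise ValueError("mean depth is zero")
    return variance(depths) / m


def flag_outliers(gene_depths, max_fold=1.5):
    """Names of genes whose depth differs from the median by more than max_fold (hypothetical cutoff)."""
    med = median(gene_depths.values())
    return sorted(
        gene
        for gene, d in gene_depths.items()
        if d > med * max_fold or d * max_fold < med
    )


# Hypothetical mean depths for five candidate single copy genes.
gene_depths = {"geneA": 48.0, "geneB": 52.0, "geneC": 50.0, "geneD": 105.0, "geneE": 47.0}
suspect = flag_outliers(gene_depths)
kept = [d for g, d in gene_depths.items() if g not in suspect]
```
]]></preformat>
<p>In this toy example geneD has roughly twice the median depth and is flagged, which is the pattern expected for a gene that is duplicated rather than single copy.</p>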
</sec>
<sec><title>GC% Correction and 18S Gene Copy Number Estimation</title>
<p>GC% in sequences can affect sequencing depth and consequently lead to biased copy number estimates (<xref ref-type="bibr" rid="B27">Yoon et al., 2009</xref>). Therefore, a linear model was fit between GC% and average per-base sequencing depth of single copy genes. If a significant correlation was detected (<italic>R</italic><sup>2</sup> &#x003E; 0.1, slope <italic>p</italic> &#x003C; 0.05), the read depth of single copy genes and 18S genes was corrected with the model parameters (<xref ref-type="bibr" rid="B22">Perisin et al., 2016</xref>; see example in <xref ref-type="supplementary-material" rid="SM3">Supplementary Figure 3</xref>). The 18S rDNA V4 region is the most commonly used hypervariable region for amplicon sequencing due to its high resolution and accuracy for phylogenetic placement and was therefore selected to estimate 18S gene copy number (<xref ref-type="bibr" rid="B11">Dunthorn et al., 2012</xref>, <xref ref-type="bibr" rid="B12">2014</xref>). The ratio of 18S gene read depth in the V4 region to the median of single copy gene read depth was then used to estimate 18S gene copy number.</p>
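<p>The GC% correction and depth ratio described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used in this study: the exact form of the correction is not given in the text, so the sketch rescales each depth by the ratio of the mean single copy gene depth to the depth predicted at that GC%; it applies the correction based on the <italic>R</italic><sup>2</sup> cutoff only and omits the slope significance test; and all function names and numbers are hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0:
        raise ValueError("GC% values are all identical")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def estimate_copy_number(scg_gc, scg_depth, v4_gc, v4_depth, r2_cutoff=0.1):
    """Ratio of (optionally GC-corrected) V4 depth to median single copy gene depth."""
    slope, intercept, r_squared = fit_line(scg_gc, scg_depth)
    baseline = mean(scg_depth)

    def correct(gc, depth):
        # Assumed correction: rescale by baseline / model-predicted depth at this GC%.
        if r_squared <= r2_cutoff:
            return depth  # weak GC dependence, leave depth unchanged
        predicted = intercept + slope * gc
        if predicted <= 0:
            raise ValueError("model predicts non-positive depth")
        return depth * baseline / predicted

    corrected_scg = [correct(g, d) for g, d in zip(scg_gc, scg_depth)]
    return correct(v4_gc, v4_depth) / median(corrected_scg)


# Hypothetical data: depth of single copy genes rises linearly with GC%.
copies = estimate_copy_number([40, 50, 60], [40, 50, 60], v4_gc=40, v4_depth=160)
```
]]></preformat>
<p>In the toy data the V4 region sits at low GC%, so its raw depth understates coverage; after rescaling, the ratio to the median single copy gene depth is 4 rather than the uncorrected 160/50 = 3.2.</p>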
</sec>
</sec>
<sec><title>Results and Discussion</title>
<p>Estimates of mean 18S V4-region gene copy numbers ranged from approximately 2 to 166 across the seven closed/draft phytoplankton genomes (<xref ref-type="fig" rid="F1">Figure 1</xref> and <xref ref-type="supplementary-material" rid="SM5">Supplementary Table 1</xref>). <italic>O. tauri</italic>, on the lower end, was estimated to have, on average, 3.4 copies of the 18S gene across 13 strains after GC% correction to account for possible sequencing bias. This is similar to the previously reported four copies of the ribosomal gene in strain RCC4221 (<xref ref-type="bibr" rid="B3">Blanc-Mathieu et al., 2014</xref>). GC% was shown to have a significant correlation with sequencing depth in <italic>O. tauri</italic> genome assemblies and could account for the observed difference from the four 18S gene copies reported by <xref ref-type="bibr" rid="B3">Blanc-Mathieu et al. (2014)</xref> (<xref ref-type="supplementary-material" rid="SM3">Supplementary Figure 3</xref>). On the higher end, our results suggest an average of 160 18S gene copies in the dinoflagellate <italic>S. kawagutii</italic>, which could be attributed to the large and repetitive genomes typical of dinoflagellates (<xref ref-type="bibr" rid="B18">Lin, 2011</xref>; <xref ref-type="bibr" rid="B26">Wisecaver and Hackett, 2011</xref>; <xref ref-type="bibr" rid="B23">Shoguchi et al., 2013</xref>; <xref ref-type="bibr" rid="B19">Lin et al., 2015</xref>). This difference of two orders of magnitude can result in highly biased characterization of phytoplankton composition, overestimating species with higher 18S gene copy numbers and underestimating those with lower copy numbers. For example, a simulated community with equal 18S gene sequence abundances from seven representative species seems to suggest that each species contributes equally to the community. However, upon 18S gene copy number correction, <italic>O. tauri</italic> and <italic>P. tricornutum</italic> actually dominate the community (<xref ref-type="fig" rid="F2">Figure 2</xref>). In addition, strain-level differences in 18S gene copy numbers were also detected, further highlighting the strong variation among eukaryotic phytoplankton, which may be a common characteristic among all protists. Estimates of 18S gene copy numbers in the coccolithophorid <italic>E. huxleyi</italic> ranged from 16 to 109 across 14 different strains collected at different sites across the globe (<xref ref-type="supplementary-material" rid="SM5">Supplementary Table 1</xref>). Strains from the English Channel have significantly higher 18S gene copy numbers than those from the neighboring Bergen Sea (<xref ref-type="supplementary-material" rid="SM4">Supplementary Figure 4</xref> and <xref ref-type="supplementary-material" rid="SM5">Supplementary Table 1</xref>). In contrast, 18S gene copy number was found to be more consistent among strains of the diatom <italic>P. tricornutum</italic>, which were also isolated from a wide range of locations. It is unclear at this time whether this degree of variation in 18S gene copy number is due to inherent variability within particular phytoplankton groups or a result of the geographical location where the isolates were obtained (<xref ref-type="fig" rid="F1">Figure 1</xref>). Further research is necessary to assess whether there are biogeographical patterns to these strain-level differences. The intraspecies geographic variation in 18S gene copy number adds another level of complexity to characterizing plankton community structure, and suggests that site-specific copy number estimations and corrections may be necessary for further compositional studies.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Geographic distribution of 18S gene copy number estimates for representative phytoplankton species with assembled/draft genomes plotted in relation to their isolation location. Symbols with more than one number represent locations in which multiple strains were sequenced. Inset table provides averages &#x00B1; standard deviation of all strains for each species examined.</p></caption>
<graphic xlink:href="fmars-06-00219-g001.tif"/>
</fig>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Composition of a simulated phytoplankton community with equal sequence abundance from seven phytoplankton species before and after 18S gene copy number correction. Correction is based on 18S gene copy number estimates from this study.</p></caption>
<graphic xlink:href="fmars-06-00219-g002.tif"/>
</fig>
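<p>The copy number correction applied to the simulated community can be expressed as a short Python sketch: read counts are divided by the per-species 18S gene copy number and the results are renormalized to proportions. This is an illustration rather than the code used to produce Figure 2. The function name and read counts are hypothetical, and only the two average copy numbers quoted in the text above (3.4 for <italic>O. tauri</italic> and 160 for <italic>S. kawagutii</italic>) are used.</p>
<preformat><![CDATA[
```python
def correct_composition(read_counts, copy_numbers):
    """Convert 18S read counts to copy-number-corrected relative abundances."""
    genome_equivalents = {}
    for species, reads in read_counts.items():
        copies = copy_numbers[species]
        if copies <= 0:
            raise ValueError(f"copy number for {species} must be positive")
        # Reads per 18S copy approximates the relative number of genomes sampled.
        genome_equivalents[species] = reads / copies
    total = sum(genome_equivalents.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {species: value / total for species, value in genome_equivalents.items()}


# Equal read counts (hypothetical) for two species with very different copy numbers.
reads = {"O. tauri": 1000, "S. kawagutii": 1000}
copy_numbers = {"O. tauri": 3.4, "S. kawagutii": 160.0}
corrected = correct_composition(reads, copy_numbers)
```
]]></preformat>
<p>Although both species contribute half of the reads in this two-species example, the corrected proportions assign nearly all of the community to <italic>O. tauri</italic>, the species with the lower copy number.</p>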
<p>Tremendous sequencing effort has been devoted to estimating ribosomal gene copy numbers in bacteria and archaea, but our knowledge of their eukaryotic protist counterparts has benefited much less from fast-developing sequencing technologies. Copy number variation can exert a strong influence on inferred protist community composition and lead to biased ecological inferences. Recent bioinformatics approaches, including the assembly of genomes from metagenomes, have the potential to provide new insights into 18S gene copy numbers for multiple species simultaneously and will dramatically increase the number of estimates for protists (<xref ref-type="bibr" rid="B10">Delmont et al., 2018</xref>). With slight modifications, the bioinformatic pipeline implemented in this study can be applied to environmental sequences to provide an estimate of 18S gene copy numbers for dominant species. Assembled contigs that can be accurately taxonomically and functionally annotated as single copy genes, along with 18S genes, can be used to generate an 18S gene copy number estimate. We propose that this pipeline be applied to metagenomic samples obtained from each location in which amplicon sequencing-based compositional analyses are routinely performed.</p>
<p>Our findings emphasize the need to incorporate 18S gene copy number variation in protist compositional studies and provide a promising means to measure it in eukaryotic plankton, although further research is warranted due to complex genomes and polyploidy, especially in Alveolates. We anticipate that continuing sequencing efforts with consistent sequencing platforms and guaranteed sequencing depth, along with emerging bioinformatics tools, will add more perspectives to 18S gene copy number estimation and correction and will result in a more accurate representation of eukaryotic community structure.</p>
</sec>
<sec><title>Author Contributions</title>
<p>WG designed the research and analyzed the data. WG and AM wrote the manuscript.</p>
</sec>
<sec><title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This research was supported by NASA OBB 2016 80NSSC17 K0552.</p>
</fn>
</fn-group>
<ack>
<p>We thank S. Gifford for valuable and constructive comments.</p>
</ack>
<sec sec-type="supplementary material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmars.2019.00219/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmars.2019.00219/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Image_1.TIF" id="SM1" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_2.TIF" id="SM2" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_3.TIF" id="SM3" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Image_4.TIF" id="SM4" mimetype="image/tiff" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_1.XLSX" id="SM5" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_2.XLSX" id="SM6" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="SM7" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andrews</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <source><italic>FastQC: A Quality Control Tool for High Throughput Sequence Data</italic></source>. Available at: <ext-link ext-link-type="uri" xlink:href="http://www.bioinformatics.babraham.ac.uk/projects/fastqc/">http://www.bioinformatics.babraham.ac.uk/projects/fastqc/</ext-link> (<comment>accessed April 19, 2018</comment>).</citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Angly</surname> <given-names>F. E.</given-names></name> <name><surname>Dennis</surname> <given-names>P. G.</given-names></name> <name><surname>Skarshewski</surname> <given-names>A.</given-names></name> <name><surname>Vanwonterghem</surname> <given-names>I.</given-names></name> <name><surname>Hugenholtz</surname> <given-names>P.</given-names></name> <name><surname>Tyson</surname> <given-names>G. W.</given-names></name></person-group> (<year>2014</year>). <article-title>CopyRighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction.</article-title> <source><italic>Microbiome</italic></source> <volume>2</volume>:<issue>11</issue>. <pub-id pub-id-type="doi">10.1186/2049-2618-2-11</pub-id> <pub-id pub-id-type="pmid">24708850</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blanc-Mathieu</surname> <given-names>R.</given-names></name> <name><surname>Verhelst</surname> <given-names>B.</given-names></name> <name><surname>Derelle</surname> <given-names>E.</given-names></name> <name><surname>Rombauts</surname> <given-names>S.</given-names></name> <name><surname>Bouget</surname> <given-names>F.-Y.</given-names></name> <name><surname>Carr&#x00E9;</surname> <given-names>I.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>An improved genome of the model marine alga Ostreococcus tauri unfolds by assessing Illumina de novo assemblies.</article-title> <source><italic>BMC Genomics</italic></source> <volume>15</volume>:<issue>1103</issue>. <pub-id pub-id-type="doi">10.1186/1471-2164-15-1103</pub-id> <pub-id pub-id-type="pmid">25494611</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bolger</surname> <given-names>A. M.</given-names></name> <name><surname>Lohse</surname> <given-names>M.</given-names></name> <name><surname>Usadel</surname> <given-names>B.</given-names></name></person-group> (<year>2014</year>). <article-title>Trimmomatic: a flexible trimmer for Illumina sequence data.</article-title> <source><italic>Bioinformatics</italic></source> <volume>30</volume> <fpage>2114</fpage>&#x2013;<lpage>2120</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btu170</pub-id> <pub-id pub-id-type="pmid">24695404</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brynildsrud</surname> <given-names>O.</given-names></name> <name><surname>Snipen</surname> <given-names>L.-G.</given-names></name> <name><surname>Bohlin</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>CNOGpro: detection and quantification of CNVs in prokaryotic whole-genome sequencing data.</article-title> <source><italic>Bioinformatics</italic></source> <volume>31</volume> <fpage>1708</fpage>&#x2013;<lpage>1715</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btv070</pub-id> <pub-id pub-id-type="pmid">25644268</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Caron</surname> <given-names>D. A.</given-names></name> <name><surname>Countway</surname> <given-names>P. D.</given-names></name> <name><surname>Jones</surname> <given-names>A. C.</given-names></name> <name><surname>Kim</surname> <given-names>D. Y.</given-names></name> <name><surname>Schnetzer</surname> <given-names>A.</given-names></name></person-group> (<year>2012</year>). <article-title>Marine protistan diversity.</article-title> <source><italic>Annu. Rev. Mar. Sci.</italic></source> <volume>4</volume> <fpage>467</fpage>&#x2013;<lpage>493</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-marine-120709-142802</pub-id> <pub-id pub-id-type="pmid">22457984</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Countway</surname> <given-names>P. D.</given-names></name> <name><surname>Gast</surname> <given-names>R. J.</given-names></name> <name><surname>Savai</surname> <given-names>P.</given-names></name> <name><surname>Caron</surname> <given-names>D. A.</given-names></name></person-group> (<year>2005</year>). <article-title>Protistan diversity estimates based on 18S rDNA from seawater incubations in the Western North Atlantic.</article-title> <source><italic>J. Eukaryot. Microbiol.</italic></source> <volume>52</volume> <fpage>95</fpage>&#x2013;<lpage>106</lpage>. <pub-id pub-id-type="doi">10.1111/j.1550-7408.2005.05202006.x</pub-id> <pub-id pub-id-type="pmid">15817114</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Vargas</surname> <given-names>C.</given-names></name> <name><surname>Audic</surname> <given-names>S.</given-names></name> <name><surname>Henry</surname> <given-names>N.</given-names></name> <name><surname>Decelle</surname> <given-names>J.</given-names></name> <name><surname>Mah&#x00E9;</surname> <given-names>F.</given-names></name> <name><surname>Logares</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Ocean plankton. Eukaryotic plankton diversity in the sunlit ocean.</article-title> <source><italic>Science</italic></source> <volume>348</volume>:<issue>1261605</issue>. <pub-id pub-id-type="doi">10.1126/science.1261605</pub-id> <pub-id pub-id-type="pmid">25999516</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Delmont</surname> <given-names>T.</given-names></name></person-group> (<year>2018</year>). <source><italic>Assessing the Completion of Eukaryotic Bins With anvi&#x2019;o. Meren Lab</italic>.</source> Available at: <ext-link ext-link-type="uri" xlink:href="http://merenlab.org/2018/05/05/eukaryotic-single-copy-core-genes/">http://merenlab.org/2018/05/05/eukaryotic-single-copy-core-genes/</ext-link> (<comment>accessed July 24, 2018</comment>).</citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Delmont</surname> <given-names>T. O.</given-names></name> <name><surname>Quince</surname> <given-names>C.</given-names></name> <name><surname>Shaiber</surname> <given-names>A.</given-names></name> <name><surname>Esen</surname> <given-names>&#x00D6;. C.</given-names></name> <name><surname>Lee</surname> <given-names>S. T.</given-names></name><name> <surname>Rapp&#x00E9;</surname> <given-names>M. S.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Nitrogen-fixing populations of Planctomycetes and <italic>Proteobacteria</italic> are abundant in surface ocean metagenomes.</article-title> <source><italic>Nat. Microbiol.</italic></source> <volume>3</volume> <fpage>804</fpage>&#x2013;<lpage>813</lpage>. <pub-id pub-id-type="doi">10.1038/s41564-018-0176-9</pub-id> <pub-id pub-id-type="pmid">29891866</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dunthorn</surname> <given-names>M.</given-names></name> <name><surname>Klier</surname> <given-names>J.</given-names></name> <name><surname>Bunge</surname> <given-names>J.</given-names></name> <name><surname>Stoeck</surname> <given-names>T.</given-names></name></person-group> (<year>2012</year>). <article-title>Comparing the hyper-variable V4 and V9 regions of the small subunit rDNA for assessment of ciliate environmental diversity.</article-title> <source><italic>J. Eukaryot. Microbiol.</italic></source> <volume>59</volume> <fpage>185</fpage>&#x2013;<lpage>187</lpage>. <pub-id pub-id-type="doi">10.1111/j.1550-7408.2011.00602.x</pub-id> <pub-id pub-id-type="pmid">22236102</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dunthorn</surname> <given-names>M.</given-names></name> <name><surname>Otto</surname> <given-names>J.</given-names></name> <name><surname>Berger</surname> <given-names>S. A.</given-names></name> <name><surname>Stamatakis</surname> <given-names>A.</given-names></name> <name><surname>Mah&#x00E9;</surname> <given-names>F.</given-names></name> <name><surname>Romac</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Placing environmental next-generation sequencing amplicons from microbial eukaryotes into a phylogenetic context.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>31</volume> <fpage>993</fpage>&#x2013;<lpage>1009</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msu055</pub-id> <pub-id pub-id-type="pmid">24473288</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eloe-Fadrosh</surname> <given-names>E. A.</given-names></name> <name><surname>Ivanova</surname> <given-names>N. N.</given-names></name> <name><surname>Woyke</surname> <given-names>T.</given-names></name> <name><surname>Kyrpides</surname> <given-names>N. C.</given-names></name></person-group> (<year>2016</year>). <article-title>Metagenomics uncovers gaps in amplicon-based detection of microbial diversity.</article-title> <source><italic>Nat. Microbiol.</italic></source> <volume>1</volume>:<issue>15032</issue>. <pub-id pub-id-type="doi">10.1038/nmicrobiol.2015.32</pub-id> <pub-id pub-id-type="pmid">27572438</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Godhe</surname> <given-names>A.</given-names></name> <name><surname>Asplund</surname> <given-names>M. E.</given-names></name> <name><surname>Harnstrom</surname> <given-names>K.</given-names></name> <name><surname>Saravanan</surname> <given-names>V.</given-names></name> <name><surname>Tyagi</surname> <given-names>A.</given-names></name> <name><surname>Karunasagar</surname> <given-names>I.</given-names></name></person-group> (<year>2008</year>). <article-title>Quantification of diatom and dinoflagellate biomasses in coastal marine seawater samples by real-time PCR.</article-title> <source><italic>Appl. Environ. Microbiol.</italic></source> <volume>74</volume> <fpage>7174</fpage>&#x2013;<lpage>7182</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.01298-08</pub-id> <pub-id pub-id-type="pmid">18849462</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kembel</surname> <given-names>S. W.</given-names></name> <name><surname>Wu</surname> <given-names>M.</given-names></name> <name><surname>Eisen</surname> <given-names>J. A.</given-names></name> <name><surname>Green</surname> <given-names>J. L.</given-names></name></person-group> (<year>2012</year>). <article-title>Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>8</volume>:<issue>e1002743</issue>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002743</pub-id> <pub-id pub-id-type="pmid">23133348</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Langmead</surname> <given-names>B.</given-names></name> <name><surname>Salzberg</surname> <given-names>S. L.</given-names></name></person-group> (<year>2012</year>). <article-title>Fast gapped-read alignment with Bowtie 2.</article-title> <source><italic>Nat. Methods</italic></source> <volume>9</volume> <fpage>357</fpage>&#x2013;<lpage>359</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1923</pub-id> <pub-id pub-id-type="pmid">22388286</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Handsaker</surname> <given-names>B.</given-names></name> <name><surname>Wysoker</surname> <given-names>A.</given-names></name> <name><surname>Fennell</surname> <given-names>T.</given-names></name> <name><surname>Ruan</surname> <given-names>J.</given-names></name> <name><surname>Homer</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>The sequence alignment/map format and SAMtools.</article-title> <source><italic>Bioinformatics</italic></source> <volume>25</volume> <fpage>2078</fpage>&#x2013;<lpage>2079</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp352</pub-id> <pub-id pub-id-type="pmid">19505943</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>S.</given-names></name></person-group> (<year>2011</year>). <article-title>Genomic understanding of dinoflagellates.</article-title> <source><italic>Res. Microbiol.</italic></source> <volume>162</volume> <fpage>551</fpage>&#x2013;<lpage>569</lpage>. <pub-id pub-id-type="doi">10.1016/j.resmic.2011.04.006</pub-id> <pub-id pub-id-type="pmid">21514379</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>S.</given-names></name> <name><surname>Cheng</surname> <given-names>S.</given-names></name> <name><surname>Song</surname> <given-names>B.</given-names></name> <name><surname>Zhong</surname> <given-names>X.</given-names></name> <name><surname>Lin</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>The Symbiodinium kawagutii genome illuminates dinoflagellate gene expression and coral symbiosis.</article-title> <source><italic>Science</italic></source> <volume>350</volume> <fpage>691</fpage>&#x2013;<lpage>694</lpage>. <pub-id pub-id-type="doi">10.1126/science.aad0408</pub-id> <pub-id pub-id-type="pmid">26542574</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>Y.</given-names></name> <name><surname>Cassar</surname> <given-names>N.</given-names></name> <name><surname>Marchetti</surname> <given-names>A.</given-names></name> <name><surname>Moreno</surname> <given-names>C.</given-names></name> <name><surname>Ducklow</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name></person-group> (<year>2017</year>). <article-title>Specific eukaryotic plankton are good predictors of net community production in the Western Antarctic Peninsula.</article-title> <source><italic>Sci. Rep.</italic></source> <volume>7</volume>:<issue>14845</issue>. <pub-id pub-id-type="doi">10.1038/s41598-017-14109-1</pub-id> <pub-id pub-id-type="pmid">29093494</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parada</surname> <given-names>A. E.</given-names></name> <name><surname>Needham</surname> <given-names>D. M.</given-names></name> <name><surname>Fuhrman</surname> <given-names>J. A.</given-names></name></person-group> (<year>2016</year>). <article-title>Every base matters: assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples.</article-title> <source><italic>Environ. Microbiol.</italic></source> <volume>18</volume> <fpage>1403</fpage>&#x2013;<lpage>1414</lpage>. <pub-id pub-id-type="doi">10.1111/1462-2920.13023</pub-id> <pub-id pub-id-type="pmid">26271760</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perisin</surname> <given-names>M.</given-names></name> <name><surname>Vetter</surname> <given-names>M.</given-names></name> <name><surname>Gilbert</surname> <given-names>J. A.</given-names></name> <name><surname>Bergelson</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>16Stimator: statistical estimation of ribosomal gene copy numbers from draft genome assemblies.</article-title> <source><italic>ISME J.</italic></source> <volume>10</volume> <fpage>1020</fpage>&#x2013;<lpage>1024</lpage>. <pub-id pub-id-type="doi">10.1038/ismej.2015.161</pub-id> <pub-id pub-id-type="pmid">26359911</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shoguchi</surname> <given-names>E.</given-names></name> <name><surname>Shinzato</surname> <given-names>C.</given-names></name> <name><surname>Kawashima</surname> <given-names>T.</given-names></name> <name><surname>Gyoja</surname> <given-names>F.</given-names></name> <name><surname>Mungpakdee</surname> <given-names>S.</given-names></name> <name><surname>Koyanagi</surname> <given-names>R.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Draft assembly of the Symbiodinium minutum nuclear genome reveals dinoflagellate gene structure.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>23</volume> <fpage>1399</fpage>&#x2013;<lpage>1408</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2013.05.062</pub-id> <pub-id pub-id-type="pmid">23850284</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sogin</surname> <given-names>M. L.</given-names></name> <name><surname>Morrison</surname> <given-names>H. G.</given-names></name> <name><surname>Huber</surname> <given-names>J. A.</given-names></name> <name><surname>Mark Welch</surname> <given-names>D.</given-names></name> <name><surname>Huse</surname> <given-names>S. M.</given-names></name> <name><surname>Neal</surname> <given-names>P. R.</given-names></name><etal/></person-group> (<year>2006</year>). <article-title>Microbial diversity in the deep sea and the underexplored &quot;rare biosphere&quot;.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>103</volume> <fpage>12115</fpage>&#x2013;<lpage>12120</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0605127103</pub-id> <pub-id pub-id-type="pmid">16880384</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Waterhouse</surname> <given-names>R. M.</given-names></name> <name><surname>Seppey</surname> <given-names>M.</given-names></name> <name><surname>Sim&#x00E3;o</surname> <given-names>F. A.</given-names></name> <name><surname>Manni</surname> <given-names>M.</given-names></name> <name><surname>Ioannidis</surname> <given-names>P.</given-names></name> <name><surname>Klioutchnikov</surname> <given-names>G.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>BUSCO applications from quality assessments to gene prediction and phylogenomics.</article-title> <source><italic>Mol. Biol. Evol.</italic></source> <volume>35</volume> <fpage>543</fpage>&#x2013;<lpage>548</lpage>. <pub-id pub-id-type="doi">10.1093/molbev/msx319</pub-id> <pub-id pub-id-type="pmid">29220515</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wisecaver</surname> <given-names>J. H.</given-names></name> <name><surname>Hackett</surname> <given-names>J. D.</given-names></name></person-group> (<year>2011</year>). <article-title>Dinoflagellate genome evolution.</article-title> <source><italic>Annu. Rev. Microbiol.</italic></source> <volume>65</volume> <fpage>369</fpage>&#x2013;<lpage>387</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-micro-090110-102841</pub-id> <pub-id pub-id-type="pmid">21682644</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yoon</surname> <given-names>S.</given-names></name> <name><surname>Xuan</surname> <given-names>Z.</given-names></name> <name><surname>Makarov</surname> <given-names>V.</given-names></name> <name><surname>Ye</surname> <given-names>K.</given-names></name> <name><surname>Sebat</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Sensitive and accurate detection of copy number variants using read depth of coverage.</article-title> <source><italic>Genome Res.</italic></source> <volume>19</volume> <fpage>1586</fpage>&#x2013;<lpage>1592</lpage>. <pub-id pub-id-type="doi">10.1101/gr.092981.109</pub-id> <pub-id pub-id-type="pmid">19657104</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>M.</given-names></name> <name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Jia</surname> <given-names>P.</given-names></name> <name><surname>Zhao</surname> <given-names>Z.</given-names></name></person-group> (<year>2013</year>). <article-title>Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives.</article-title> <source><italic>BMC Bioinformatics</italic></source> <volume>14</volume>(<supplement>Suppl. 11</supplement>):<fpage>S1</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-14-S11-S1</pub-id> <pub-id pub-id-type="pmid">24564169</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhu</surname> <given-names>F.</given-names></name> <name><surname>Massana</surname> <given-names>R.</given-names></name> <name><surname>Not</surname> <given-names>F.</given-names></name> <name><surname>Marie</surname> <given-names>D.</given-names></name> <name><surname>Vaulot</surname> <given-names>D.</given-names></name></person-group> (<year>2005</year>). <article-title>Mapping of picoeucaryotes in marine ecosystems with quantitative PCR of the 18S rRNA gene.</article-title> <source><italic>FEMS Microbiol. Ecol.</italic></source> <volume>52</volume> <fpage>79</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1016/j.femsec.2004.10.006</pub-id> <pub-id pub-id-type="pmid">16329895</pub-id></citation></ref>
</ref-list>
</back>
</article>