High temporal and spatial diversity in marine RNA viruses implies that they have an important role in mortality and structuring plankton communities

Viruses in the order Picornavirales infect eukaryotes, and are widely distributed in coastal waters. Amplicon deep-sequencing of the RNA dependent RNA polymerase (RdRp) revealed diverse and highly uneven communities of picorna-like viruses in the coastal waters of British Columbia (BC), Canada. Almost 300 000 pyrosequence reads revealed 145 operational taxonomic units (OTUs) based on 95% sequence similarity at the amino-acid level. Each sample had between 24 and 71 OTUs and there was little overlap among samples. Phylogenetic analysis revealed that some clades of OTUs were only found at one site; whereas, other clades included OTUs from all sites. Since most of these OTUs are likely from viruses that infect eukaryotic phytoplankton, and viral isolates infecting phytoplankton are strain-specific; each OTU probably arose from the lysis of a specific phytoplankton taxon. Moreover, the patchiness in OTU distribution, and the high turnover of viruses in the mixed layer, implies continuous infection and lysis by RNA viruses of a diverse array of eukaryotic phytoplankton taxa. Hence, these viruses are likely important elements structuring the phytoplankton community, and play a significant role in nutrient cycling and energy transfer.

Viruses infecting marine phytoplankton have genomes comprised of double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), double-stranded RNA (dsRNA), and single-stranded RNA (ssRNA) (as reviewed in Short, 2012). Their genomes and particle sizes range from very large dsDNA viruses in the Phycodnaviridae to very small ssDNA and ssRNA viruses belonging to the genus Bacilladnavirus and order Picornavirales, respectively. The order Picornavirales is comprised of positivesense, ssRNA viruses that infect eukaryotes (Le Gall et al., 2008), including ecologically important marine protists. These viruses are small (25-35 nm), icosahedral, and have a conserved genomic organization that includes a replication area comprised of a type III helicase, a 3C-like proteinase, and a type I RNA dependent RNA polymerase (Sanfaçon et al., 2009). Isolates in the Picornavirales that are pathogens of marine protists infect a wide diversity of hosts including the bloom-forming raphidophyte Heterosigma akashiwo (Tai et al., 2003) (viral family Marnaviridae), the thraustochytrid Aurantiochytrium sp. (Takao et al., 2006;Yokoyama and Honda, 2007) (proposed viral genus Labyrnavirus) and the cosmopolitan diatoms Rhizosolenia setigera (Nagasaki et al., 2004) and Chaetoceros socialis (Tomaru et al., 2009) (proposed viral genus Bacillarnavirus). Viruses in the Picornavirales appear to be common and widely distributed in coastal waters Culley and Steward, 2007).
Metagenomic and targeted gene studies are uncovering the diversity of marine RNA viruses. For example, phylogenetic analysis of RNA-dependent RNA polymerase (RdRp) sequences from seawater samples supports a monophyletic marine group within the Picornavirales (Culley et al., , 2014Culley and Steward, 2007;Tomaru et al., 2009) and several divergent clades within this marine group (Culley et al., , 2014Culley and Steward, 2007). Additionally, metagenomic analyses reveal that there are numerous sequences from aquatic RNA viruses that cannot be assigned to known taxa (Culley et al., 2006(Culley et al., , 2014Djikeng et al., 2009;Steward et al., 2012). Despite the high diversity of marine RNA viruses (Lang et al., 2009), the spatial and temporal distribution of different phylogenetic groups remains unreported, although there is evidence that the taxonomic structure of marine RNA viral communities is highly uneven. For example, in one sample from a metagenomic study from the coastal waters of British Columbia, 59% of the reads assembled into a single contig, while in a second sample 66% of the reads fell into four contigs, with most falling into two genotypes (Culley et al., 2006). However, with only a few 100 reads in total from the two samples, the coverage of the communities was low. Similarly, RNA viral metagenomic data from a freshwater lake (Djikeng et al., 2009) showed little identical sequence overlap among communities, although there was broad taxonomic similarity over time within a location.
Ecological questions about the distribution of marine viruses over time and space have been examined more extensively in bacteriophages, particularly those infecting cyanobacteria. For example, some data reveal no clear patterns of biogeography in cyanophage isolates locally (Clasen et al., 2013), regionally (Jameson et al., 2011) or more globally (Huang et al., 2010). Other data have shown patterns at a regional scale , although communities in basins that were connected were most similar and those that were separated by land or current boundaries were the least similar. Other data for marine bacteriophages have shown temporal variability (Chen et al., 2009;Wang et al., 2011;Chow and Fuhrman, 2012;Clasen et al., 2013;Marston et al., 2013). If the dynamics of marine bacteriophages and marine RNA viruses are similar, some RNA viral taxa will persist temporally and spatially, while other taxa will be detected sporadically. To test this hypothesis, we examined two samples, taken 5 months apart at the same location, and three samples taken within hours of each other, but 20 km apart in the same coastal basin.
We used high-throughput 454 pyrosequencing to obtain deep coverage of RdRp amplicon sequences and compare the richness of viruses in the Picornavirales among samples from the coastal waters of British Columbia, Canada. The results revealed a phylogenetically diverse and spatially variable community of viruses, suggesting that taxon-specific lytic events are important in shaping the phytoplankton community.  (Figure 1). The Strait of Georgia (SOG) is an estuarine-influenced basin that is on average 22 km across, 222 km long and 150 m deep. The upper 50 m of SOG is where most of the variability in physical and chemical parameters occurs. Jericho Pier (JP) is adjacent to the shoreline, in a well-mixed location with mixed semidiurnal tides.

SAMPLE COLLECTION
On 28 July 2010, a rosette equipped with a Seabird SBE 25 CTD (equipped with Seabird SBE 43 dissolved oxygen sensor, WET Labs WETStar fluorometer, WET Labs C-Star transmissometer, Biospherical Instruments QSP2200PD PAR sensor) and General Oceanics GO-FLO bottles was used to collect 12 L of water from five depths between 2 and 16 m at each of three stations in SOG. The five depths from each station were combined to obtain three integrated water samples. The depths were selected to attempt to encompass the viral diversity at each station, and consisted of a near-surface sample, a sample from the isothermal zone below the mixed layer, and three samples spanning the chlorophyll maximum.
At JP, 60 L of water was pumped from the 1-m depth on 10 July and 12 October 2010. Salinity and temperature at JP were measured using a YSI probe (Yellow Springs, Ohio, USA). For all samples, the water was filtered through 142-mm diameter, 1.2-µm nominal pore-size glass-fiber (GC50 Advantec MFS, Dublin, CA., USA) and 0.22-µm pore-size polyvinyldine (Millipore Bedford, MA, USA) filters. The viral size fraction in the filtrate was concentrated to ∼500 mL (viral concentrate) using tangential flow ultrafiltration using a 30 kDa MW prep-scale Spiral Wound TFF-6 cartridge (Millipore) (Suttle et al., 1991).
Phosphate, silicate and nitrate+nitrite concentrations were determined in duplicate 15-mL seawater samples filtered through 0.45-µm pore-size HA filters (Millipore) and stored at −20 • C until air-segmented continuous-flow analysis on a AutoAnalyzer 3 (Bran & Luebbe, Norderstedt Germany). Chlorophyll a (Chl a) was determined in triplicate by filtering 100 mL of seawater onto 0.45 µm pore-size HA filters (Millipore), and storing the filters in the dark at −20 • C until acetone extraction and then analyzed fluorometrically (Parsons et al., 1984). The average and standard error of the replicates was calculated for each sample.

NUCLEIC-ACID EXTRACTION AND PCR
The viral concentrate was filtered twice through 0.22-µm pore-size Durapore PVDF filters (Millipore) in a sterile Sterivex filter unit (Millipore). The filtrate, containing virus-sized particles, was pelleted by ultracentrifugation (Beckman-Coulter, Brea, California, USA) in a SW40 rotor at 108 000 g for 5 h at 12 • C. The pellet was resuspended overnight in 100 µL of supernatant at 4 • C. To digest free DNA, the pellets were incubated with 1U/µl DNAse with a final concentration 5 mM MgCl 2 for 3 h at room temperature. Nucleic acids were extracted using a Qiamp Viral Minelute spin kit (Qiagen, Hilden, Germany) according to the manufacturer's directions. To remove DNA, the extracted viral pellets were digested with DNase 1 (amplification grade) (Invitrogen, Carlsbad, California, USA) and the reaction was terminated by adding 2.5 mM EDTA (final concentration) and incubating for 10 min at 65 • C. Complementary DNA (cDNA) was generated using Superscript III reverse transcriptase (Invitrogen) with random hexamers (50 ng/µl) as per the manufacturer. PCR was performed using primer set MPL-2 for a targeted set of the marine picornavirus-like RdRp (Culley and Steward, 2007). Each reaction mixture (final volume, 50 µl) consisted of 50 ng of cDNA, 1× (final concentration) PCR buffer (Invitrogen), 2 mM MgCl 2 , 0.2 mM of each deoxynucleoside triphosphate (Bioline, London, UK), 1 µM of each primer, and 1 U Platinum Taq DNA polymerase. The reaction was run in a PCR Express thermocycler (Hybaid, Ashford, UK) with the following conditions: 94 • C for 75 s, followed by 40 cycles of denaturation at 94 • C for 45 s, annealing at 43 • C for 45 s, and extension at 72 • C for 60 s and a final extension step of 9 min at 72 • C. PCR products were cleaned using the Minelute PCR purification kit (Qiagen).

LIBRARY PREP AND PYROSEQUENCING
Libraries for each site were prepared for sequencing using NEBNext DNA Library Prep for 454 kit (New England Biolabs, Ipswich, Massachusetts, USA) following the manufacturer's directions, and using Ampure beads (Beckman-Coulter) for size selection and purification using a bead ratio of 0.8:1 beads:library. PCR amplicons were barcoded and sent for 454 Titanium pyrosequencing (Roche, Basel, Switzerland) at the Génome Québec Innovation Centre at the McGill University (Montreal, QC, Canada).

SEQUENCE ANALYSIS
Sequencing reads were quality trimmed using length settings between 100 and 600 bp, with a maximum of 3 primer mismatches for the specific primer, and denoised using the denoiser algorithm in QIIME (version 1.7) with default settings for Titanium data (Caporaso et al., 2010;Reeder and Knight, 2010). Sequences were checked for chimeras using UCHIME (Edgar, 2010) against a nucleotide database of RdRp sequences built using NCBI-BLAST+ by retrieving nucleotide sequences of the RdRp from Picornavirales viral isolates (accessed: 8 August 2012), and also using the denovo chimera check in UCHIME. The overlap of results from these two methods was defined as chimeric sequences, although none were found in this study. Non-chimeric sequences were queried using BLASTx (Altschul et al., 1990) with an e-value of 1e −3 against the database of RdRp viral isolates. All sequences with hits were retained and all sequences with no hits were then queried against the non-redundant (nr) Genbank database (Benson et al., 2007) using BLASTx with an e-value of 1e −5 . All the sequences identified as contaminants or as unknown with only 1 read were removed. Remaining sequences were translated to amino acids using FragGeneScan with the 454_10 training option (Rho et al., 2010). Sequences were grouped into operational taxonomic units (OTUs) using UCLUST at a range of similarities from 50 to 100% ( Figure S1) using the original seed sequences (centroids) as the output (Edgar, 2010). A similarity of 95% was chosen for this analysis for the following two main reasons: (1) When the NCBI conserved domain alignment for the RdRp region (all Picornavirales) was analyzed for percent similarity, the only sequences that displayed greater than 95% similarity in this region were strains of the same virus. (2) The sequences from the control libraries (consisting of 1 clone) clustered into 1 sequence at this percentage (see Supplemental methods). Thus clustering at 95% similarity was a way to use biological and sequence-based information to inform our choice of cut-off to collapse strain level variation and as a conservative approach to avoid variation that may be present because of the sequencing platform. Control sequences obtained by cloning and Sanger sequencing (see Supplemental Methods) were used to verify the sequence processing methodology. Raw and processed sequence data were deposited in the NCBI BioProject database ID: PRJNA267690.

STATISTICAL ANALYSIS
Generation of rarefaction curves by random resampling of OTU abundances was performed using the vegan package (Oksanen et al., 2013) in R (R Core Team, 2014). Relative abundances were normalized by randomly resampling 10 000 times using vegan (Oksanen et al., 2013), normalizing to the library with the lowest number of reads and then taking the median. Rank-abundance curves were generated with ggplot2 (Wickham, 2009) using the normalized OTU relative abundances. Scripts used in this project are available as part of QIIME and custom user scripts used to process the data are available here: http://github.com/jooolia/RdRp_454_amplicons_Jericho_ and_SOG; doi: 10.5281/zenodo.12509

ANALYSIS OF RdRp SEQUENCES
After quality filtering to remove homopolymers and contaminating reads, 300 180 reads were recovered from the 5 libraries of RdRp amplicons. At all sites the rarefaction curves plateaued indicating that the depth of sampling was adequate to assess the communities (Figure 3). From these reads, 265 unique OTUs (at 95% similarity) were identified, including 108 singletons. For further analysis OTUs were excluded that did not contain recognizable RdRp motifs (Koonin, 1991;Le Gall et al., 2008), generally did not align well with other RdRp sequences and those that were not present in any sample after normalization.
Using the above criteria there were 145 OTUs identified in all samples, and between 24 to 71 OTUs per site. The Jericho Pier samples had 116 OTUs of which only 10 (8.6%) were shared between sampling times (Figure 4). JP-S had the highest richness with 71 OTUs, of which 59 (83 %) were unique, while JP-F had the second highest richness (49 OTUs), of which 45 (92 %) were unique. The SOG sites together had 64 OTUs, none of which were shared among all sites. SOG-1 and SOG-2 had the lowest number of OTUs (24). SOG-1 had only three OTUs which were unique. However, 21 (33%) were shared between SOG-1 and SOG-4, and 6 (9%) between SOG-2 and SOG-4. The majority of OTUs (75%) from SOG-2 were unique, whereas most OTUs from the other SOG sites were shared with other sites (87% for SOG-1 and 63% for SOG-4).
Rank abundance curves of the viral OTUs showed that at each site most sequences were assigned to only a few OTUs (Figure 5).  JP-S had the highest richness but the shallowest slope of these curves, demonstrating more evenness in the abundance of OTUs than at the other sites. SOG-4 and JP-S had similar rank abundance curves that were much shallower that those of SOG-1 and SOG-2 ( Figure 5). The OTUs that were observed in more than 5 reads were placed in phylogenetic context using a maximum likelihood RA×ML tree (Stamatakis et al., 2008) with sequences from previous RdRp gene surveys and isolated viruses (Figure 6).

Frontiers in Microbiology
OTUs from this study fell within a well-supported clade that includes all the marine isolates belonging to the Picornavirales. Within this group there was a well-supported divide between OTUs grouping in the Marnaviridae clade and those grouping with sequences from viruses infecting diatoms and a thraustochytrid. The overall tree topology is not well supported, although there are a number of well-supported clades containing OTUs from this study and other environmental sequences.

www.frontiersin.org
December 2014 | Volume 5 | Article 703 | 5 FIGURE 3 | Rarefaction curves. Rarefaction analysis of RdRp amplicons based on Chao1 richness analysis of operational taxonomic units (OTUs) at 95% similarity. Rarefaction curves were resampled using number of reads recovered per library. Rarefaction curves plateau indicating adequate sequencing for these samples.
The Marnaviridae clade had the greatest number of OTUs (10) associated with it; whereas, very few OTUs (only OTUs 89, 107, 75, and 120) from this study were assigned to clades primarily from Hawaii (Culley and Steward, 2007;Culley et al., 2014). No clade contained OTUs from all sites. The Jericho Pier samples were the most phylogenetically diverse (Figure 6, Table 2), and contained OTUs (e.g., OTUs 6,7,35,31,84,47,14,4,34,39,44,23,8,20) that fell into clades that did not contain OTUs from any of the SOG samples. Some clades contained OTUs from both JP-S and JP-F samples; however, many OTUs within the clades were unique to one Jericho Pier sample. Phylogenetic diversity differed among samples, except for OTUs from the two SOG sites with deeper mixed layers, some of which were present in different clades resulting in similar phylogenetic diversity ( Table 2).
The Strait of Georgia (SOG) sites were sampled within hours of each other, and the water at each site was pooled from multiple depths above, below, and across the chlorophyll maximum. One of the most striking differences among sites was that SOG-1 and SOG -2 had mixed layer depths of 6 m; whereas SOG-4 had a mixed-layer depth of 2 m, and much higher richness and phylogenetic diversity. Sites SOG-1 and SOG-2 had the lowest phylogenetic diversity ( Table 2). All the OTUs found at SOG-1 (33,12,16,9,10,27,29) were within the Marnaviridae clade; similarly, all OTUs (5, 82, 2, 1) from SOG-2 were within one distantly related clade. In both cases OTUs from these clades occurred at SOG-4. SOG-2 did not have the high numbers of HaRNAV-related viruses that were found in all other samples.

FIGURE 4 | Euler diagrams of normalized RdRp OTUs. (A) Euler diagram of Jericho Pier samples. (B)
Euler diagram of SOG samples. The OTUs presented were from reads clustered at 95% similarity, comprise only OTUs that could be aligned to the NCBI CDD RdRp alignment, and contained the RdRp motif C. The diagrams were constructed using the venneuler() algorithm (Wilkinson, 2011). The size of the circles is approximately proportional to the number of OTUs recovered per site. The overlap in the diagram describes OTUs that were found at multiple sites and the non-overlapping areas describe OTUs that were unique to that site.

DISCUSSION
Pyrosequencing of RdRp gene fragments from coastal samples uncovered much greater genetic diversity than in previous gene surveys (Culley et al., , 2014Culley and Steward, 2007) and revealed many previously unknown taxonomic groups within the Picornavirales. As well, striking differences in the taxonomic richness among samples implies that these viruses infect a wide variety of eukaryotic plankton, but that the mortality imposed on some taxa is highly variable across space and time. Other taxonomic groups within the Picornavirales were more widespread, suggesting that infection of some planktonic taxa is more widespread and persistent. These results and their implications are discussed in detail below.

EXPANDING THE KNOWN DIVERSITY OF PICORNAVIRALES
The high depth of sequencing and limited diversity in each library (Figure 3) gives high confidence that the population structure of RdRp amplicons in each sample has been well characterized (Kemp and Aller, 2004). Although some sequences were closely related to those found in previous studies Culley and Steward, 2007) (Figure 6), many OTUs formed new clades. Many OTUs were related to Heterosigma akashiwo RNA virus (HaRNAV) that infects the toxic bloom-forming raphidophyte Heterosigma akashiwo (Tai et al., 2003). HaRNAV is the OTUs were clustered at 95% amino acid similarity and OTU relative abundances were normalized to the sample with the lowest number of reads.
type virus of the family Marnaviridae (Lang et al., 2004); it has a genome of about 9.1 kb and a high burst size as indicated by the large crystalline arrays of particles in the cytoplasm of infected cells (Tai et al., 2003). HaRNAV was isolated from coastal waters in British Columbia, Canada (Tai et al., 2003) and can remain infectious for many years in sediments (Lawrence and Suttle, 2004). Interestingly, HaRNAV was isolated from the same area as the present study, and appeared ancestral to many of the recovered sequences based on the phylogeny. For example, OTU 0 was most abundant (18,034 reads after rarefaction) and clustered closely with HaRNAV, although the sequence was only 76.4% similar at the amino-acid level. However, other OTUs in the cluster ranged between 55 and 79% similar to HaRNAV, which is low compared to amino-acid similarities of other RNA viruses within a family that usually have greater than 90% aa similarity (Ng et al., 2012).

DISTINCT COMMUNITIES OCCURRED IN DIFFERENT SEASONS AT THE SAME LOCATION
While only 8.6% of the OTUs from Jericho Pier were shared between dates, both samples had similar evenness, although the summer sample had greater richness ( Figure 5, Table 2). The small overlap in OTUs between sampling dates is not surprising given the very different conditions between July and October (Figure 1), and the dynamic nature of planktonic communities in response to environmental changes. At the same location, dsDNA viruses belonging to the Phycodnaviridae, which infect eukaryotic phytoplankton, varied seasonally based on fingerprint analyses of DNA polymerase gene fragments using denaturing gradient gel electrophoresis; however, some OTUs persisted for extended periods . Similarly, the composition of other aquatic viral communities has been shown to be dynamic although some OTUs persist (Djikeng et al., 2009;Rodriguez-Brito et al., 2010), and in some cases have repeatable seasonal patterns (Chow and Fuhrman, 2012;Clasen et al., 2013;Marston et al., 2013). With only single samples from summer and fall, inferences about dynamics cannot be made from our data.
One of the few taxonomic groups that occurred in both the summer (JP-S) and fall (JP-F) samples from Jericho Pier was related to HaRNAV (Figure 6). There was greater diversity of OTUs in this clade in JP-F, even though JP-S had higher richness and higher phylogenetic diversity overall. This is unlike bacterial and phytoplankton communities that tend to be more diverse in winter (Zingone et al., 2009;Ladau et al., 2013). However, the www.frontiersin.org December 2014 | Volume 5 | Article 703 | 7 FIGURE 6 | Phylogenetic tree with heatmap. Maximum likelihood (RA×ML) tree of RdRp OTUs that contained more than 5 reads and relevant Picornavirales sequences. Bootstrap values above 65% are labeled. Adjacent to the tips of the tree is a heatmap displaying the relative abundance of each OTU at 95% similarity by site. OTU relative abundances were normalized to the sample with the lowest number of reads.
RdRp primers target a specific subset of the viral community that might not reflect the overall taxonomic diversity. Based on genome organization and sequence identity, RNA viruses that infect diatoms have been assigned to the genus Bacillarnavirus, that includes Rhizosolenia setigera RNA virus (RsRNAV) (Nagasaki et al., 2004), Chaetoceros tenuissimus Meunier RNA virus (CtenRNAV) (Shirai et al., 2008) and Chaetoceros socialis f. radians RNA virus (CsfrRNAV) (Tomaru et al., 2009). In the JP-F sample, the most relatively abundant cluster grouped with RsRNAV that infects the marine diatom Rhizosolenia setigera (Nagasaki et al., 2004). This corresponded with the highest levels of nitrate + nitrite, which is often associated with high diatom abundances (Zingone et al., 2009); hence, these OTUs are likely associated with viruses infecting diatoms.

DISTINCT COMMUNITIES OCCURRED AT GEOGRAPHICALLY PROXIMATE SITES
Areas of higher habitat diversity, such as stratified water layers, generally have higher biological richness (Klopfer and MacArthur, 1960;Chesson, 2000), and this is consistent with Phylogenetic diversity (PD) is calculated as in Faith (1992). OTUs were the same as used in the construction of the phylogenetic tree and must have included 5 or more reads.
the much higher richness and phylogenetic diversity found at SOG-4, which was the most stratified site and included the most abundant OTUs from SOG-1 and SOG-2. Most OTUs from SOG-1 clustered in the Marnaviridae clade, while most SOG-2 OTUs clustered in a phylogenetically distant clade. Given that we have used very conservative clustering, and that dsDNA viruses infecting phytoplankton are strain specific and have phylogenies that are congruent with their hosts (Clasen and Suttle, 2009;Bellec et al., 2014), and that RNA viruses infecting diatoms and the dinoflagellate, Heterocapsa circularisquama (Nagasaki et al., 2005) are host-specific, it implies that closely related OTUs infect closely related taxa of phytoplankton. Hence, it suggests that the most abundant viruses at these three locations infect different species.
There are few clear patterns in the spatial distribution of viruses in marine waters where geographically distant sites are connected by currents and mixing. The best examples are for cyanophages. For instance, when looking at local variation in cyanophages isolated at sites in Southern New England, 72% of the viral OTUs were shared between at least 2 sites ; however, between Bermuda and Southern New England only 2 OTUs overlapped and they comprised only 0.6% of the isolates. Yet, clear patterns of cyanophage OTU distribution by depth occurred in areas adjacent to the SOG when assessed using community fingerprinting (Frederickson et al., 2003). The biggest differences with depth occurred in stratified water in which some OTUs were present at all depths, while others were only present at specific depths, even though the samples were collected only meters apart (Frederickson et al., 2003). These viruses infect cyanobacteria, as opposed to the picorna-like viruses, which likely infect protistan plankton. Nonetheless, the factors governing the distribution of cyanobacterial and protistan hosts are likely similar; hence, different OTUs would be expected to occur in environments with different vertical structure (stratification) of the water column.
Rank abundance curves showed that SOG-1 and SOG-2 were the least even communities (Figure 5). Overall, at most sites four to five viral OTUs were most abundant (Figure 5) similar to other reports for aquatic viral communities in which a few viruses dominate, but most of the diversity comes from rarer viruses (Angly et al., 2006;Suttle, 2007). Our targeted approach showed that the picornavirus-like virus communities at SOG and JP were dominated by only a few genotypes, supporting previous metagenomic results showing that the OTU distributions of RNA viruses in SOG and JP were highly uneven with little overlap between sites (Culley et al., 2006).

EACH OTU LIKELY REPRESENTS A SINGLE LYTIC EVENT
Given that the hosts of marine Picornavirales isolates are protists, and that protists are the most abundant eukaryotes in the sea, it is likely that the majority of OTUs recovered in this study are from viruses that infect these unicellular marine eukaryotes. These eukaryotic communities are highly dynamic and change throughout the year based on environmental and biological factors (Larsen et al., 2004). Since viral infection is usually host specific, the diversity in marine viral communities is a reflection of the underlying diversity of the marine eukaryotic hosts. Moreover, viral propagation is dependent on host encounter rates and is proportional to host-cell abundance (Murray and Jackson, 1992); hence the most abundant taxa will be most likely to encounter and propagate a viral infection, giving the opportunity for rarer species to increase in abundance and promoting diversity (Thingstad, 2000;Winter et al., 2010). Since our study was not over time it is difficult to evaluate whether these data support the Bank model (Breitbart and Rohwer, 2005), however, some taxa were found at one site, but not a similar nearby site, thus these taxa could be present at background levels at some sites and more abundant in others.
It is probable that the most abundant OTUs in these data are from recent lysis of host taxa. An error rate for replication of RNA viruses of about 1 bp mutation per generation (9000 bp genome × 0.0001 error rate per base pair = 1 bp; Holmes, 2009), and a lower-end burst-size estimate of 1000 particles for marine viruses in the Picornavirales that infect protists (Lang et al., 2009), would produce about 1000 different genomes from each lysed cell. For the amplified 500 bp RdRp gene fragment there is a 0.00056% chance of an error in 1 generation, assuming that mutations are distributed evenly in the genome (Sanjuan et al., 2010;Combe and Sanjuán, 2014). Consequently, even with the relatively high error rates of RNA replication, when grouped at 95% similarity at the amino acid level, all of the sequences from a lytic event should fall within a single OTU. The halflife for decay of viral infectivity and particles in the surface mixed layer is typically a few hours (Heldal and Bratbak, 1991;Suttle and Chen, 1992;Noble and Fuhrman, 1997;Bettarel et al., 2009); thus the recovered viral OTUs were likely from recent lytic events. Furthermore, considering the specificity of viruses infecting protists (Short, 2012), each OTU probably stems from viruses infecting a single host taxon. Thus, these data imply that infection of marine protists by viruses in the Picornavirales is not only pervasive, but likely involves a wide diversity of host taxa; hence, these viruses are likely important structuring elements for phytoplankton communities that influence nutrient cycling and energy flow.

AMPLICON DEEP SEQUENCING AS AN APPROACH FOR ESTIMATING VIRAL DIVERSITY
Amplicon deep sequencing is a sensitive and high-resolution approach for examining microbial community dynamics over time and space (Caporaso et al., 2011;Gobet et al., 2012;Gibbons et al., 2013). Careful quality trimming of sequences www.frontiersin.org December 2014 | Volume 5 | Article 703 | 9 and removal of singletons is essential for reliable results (Zhou et al., 2011) since errors in sequences will inflate estimates of diversity. With careful data processing and analysis, amplicon deep sequencing is as accurate for assessing community composition and diversity as cloning and Sanger sequencing (Amend et al., 2010), but with much greater depth of coverage of the community.
There are potential biases associated with reverse transcription with random hexamers (which can decrease yield and could inflate diversity) (Zhang and Byrne, 1999), template amplification by PCR (Lee et al., 2012) and with using highly degenerate primers that target a specific part of the community containing many different templates (Culley and Steward, 2007). A danger of the high cycle number can be diversity overestimates which can come from the increasing number of chimeric sequences produced with greater cycle number (Qiu et al., 2001). The sequences were processed with caution considering the high number of PCR cycles employed in this study. Chimera checking denovo was used to look for chimeric sequences originating from two higher abundance reads, and reference-based chimera checking was used a database of RdRps from isolated viruses to correct for this potential error. In addition, a conservative cut-off was used of only OTUs comprising more than 5 reads that aligned to the conserved domain alignment.
Read abundance of OTUs can be considered semi-quantitative and good for comparisons of richness and diversity among samples (but not for absolute counts of genes) (Amend et al., 2010;Pinto and Raskin, 2012;Ibarbalz et al., 2014). Moreover, by using control sequences obtained by cloning and Sanger sequencing alongside pyrosequenced libraries containing the same sequence (Supplemental Methods, Figure S2) we verified that amplicon deep sequencing and our sequence processing methodology recovered accurate environmental viral sequences and non-inflated estimates of richness like in studies for bacterial amplicons (Sogin et al., 2006;Huse et al., 2008;Kirchman et al., 2010;Caporaso et al., 2011) and clinical viral studies (Romano et al., 2013;Watson et al., 2013).

CONCLUSION
Amplicon deep sequencing of RdRp gene fragments using 454 pyrosequencing revealed the richness and population structure of marine Picornavirales in five coastal samples. The known diversity of viruses in this group was greatly increased with 145 OTUs that differed by at least 5% at the amino-acid level. There were between 24 and 71 OTUs in each sample, with distinct patterns of OTU distribution, richness and diversity among samples. There was little overlap between viral OTUs collected at the same site in summer and fall, and among samples collected 20 km apart on the same day. The high temporal and spatial diversity in RdRp sequences is consistent with viral communities that turnover rapidly, and episodic infection of a wide diversity of protistan hosts. The low overlap in OTUs and phylogenetic diversity among samples implies a dynamic landscape of viral infection and supports the idea that marine picorna-like viruses are important pathogens of marine protists that have an important role in structuring marine planktonic communities, and in nutrient cycling and energy transfer among trophic levels. Ultimately, further study is needed to disentangle the temporal and spatial drivers of these communities.