Geochemistry and Mixing Drive the Spatial Distribution of Free-Living Archaea and Bacteria in Yellowstone Lake

Yellowstone Lake, the largest subalpine lake in the United States, harbors great novelty and diversity of Bacteria and Archaea. Size-fractionated water samples (0.1–0.8, 0.8–3.0, and 3.0–20 μm) were collected from the surface photic zone, deep mixing zone, and vent fluids at different locations in the lake using a remotely operated vehicle (ROV). Quantification with real-time PCR indicated that Bacteria dominated the free-living microorganisms, with Bacteria/Archaea ratios ranging from 4037:1 (surface water) to 25:1 (vent water). Microbial population structures (both Bacteria and Archaea) were assessed using 454-FLX sequencing, with a total of 662,302 pyrosequencing reads for the V1 and V2 regions of 16S rRNA genes. Non-metric multidimensional scaling (NMDS) analyses indicated that strong spatial distribution patterns existed from surface to deep vents for free-living Archaea and Bacteria in the lake. Along with pH, major vent-associated geochemical constituents including CH₄, CO₂, H₂, DIC (dissolved inorganic carbon), DOC (dissolved organic carbon), SO₄²⁻, O₂, and metals were likely the major drivers of microbial population structures; however, mixing events occurring in the lake also affected the distribution patterns. Distinct Bacteria and Archaea were present among size fractions: the larger size fractions, which included particle-associated microbes (>3 μm), contained higher predicted operational taxonomic unit richness and microbial diversity (genus level) than the free-living fraction (<0.8 μm). Our study represents the first attempt at addressing the spatial distribution of Bacteria and Archaea in Yellowstone Lake, and our results highlight the variable contribution of Archaea and Bacteria to the hydrogeochemical-relevant metabolism of hydrogen, carbon, nitrogen, and sulfur.


INTRODUCTION
Microorganisms are the foundation of aquatic food webs, with heterotrophic microorganisms acting as fundamental consumers mediating organic matter mineralization, thus playing key roles in nutrient biogeochemical cycling (Carlsson and Caron, 2001; Salcher et al., 2010). Monitoring the distribution of these microorganisms and investigating the potential environmental drivers of this distribution are important to our understanding of the roles of these microbes and the integral functionality they contribute to aquatic ecosystems such as lakes. Depending on their morphology and geological features, larger lakes are often characterized by strong environmental gradients in physicochemical parameters including temperature, salinity, oxygen, and nitrogen. These environmental gradients result in niche separation and differentiation, leading to the structuring and distribution of distinct microorganisms at different layers within lakes. Cole et al. (1993) found larger and more abundant cells in anoxic hypolimnia than under oxic conditions. Later studies with molecular approaches confirmed this observation and led to the discovery of certain groups of microbes that preferentially inhabit deeper lake layers, such as fermenting bacteria, denitrifying bacteria, methylotrophs, autotrophic sulfur bacteria, and Bacteroidetes (Casamayor et al., 2000; Lehours et al., 2007; Salcher et al., 2008, 2010). In general, microbial distribution patterns likely respond to environmental gradients including oxygen (Cole et al., 1993; Lehours et al., 2007; Salcher et al., 2008, 2010), nutrients (Cole et al., 1993; Casamayor et al., 2000), and the geochemical characteristics of the lakes, such as salinity in hyper-saline environments (Baricz et al., 2014). However, because of the complex environmental scenarios in lakes, monitoring microbial distribution and clarifying its environmental drivers remain a challenge in aquatic microbial ecology.
Yellowstone Lake (YL), the largest (∼352 km²) sub-alpine high-altitude lake in North America, is a pristine, non-regulated body of water with a long (10-year) retention time (Benson, 1961; Morgan et al., 2007). The lake is critical to the function of the Yellowstone ecosystem (Schullery and Varley, 1995), and it contributes approximately 10% of the total geothermal flux in Yellowstone National Park (YNP; Balistrieri et al., 2007). Hundreds of lake floor vent features have been documented in Yellowstone Lake by employing bathymetric, seismic, and submersible remotely operated vehicle (ROV) equipment (Morgan et al., 2003, 2007; Balistrieri et al., 2007). These vents occur primarily in the northern and West Thumb regions of the lake, although a relatively minor vent area occurs in the proximity of Dot Island (Figure 1). By mixing with the lake water, the strong geochemical signatures of these vents provide numerous niches capable of supporting phylogenetically and functionally diverse microbial populations. Recently, considerably novel and diverse populations of Bacteria and Archaea were observed in Yellowstone Lake (Clingenpeel et al., 2013; Kan et al., 2011; Yang et al., 2011; Inskeep et al., 2015), demonstrating the complexity of the microbial foundations of the lake food web. These extensive surveys also documented the occurrence of novel bacterial/archaeal phylotypes previously known to occur only in marine environments (Clingenpeel et al., 2013; Kan et al., 2011). However, the spatial distribution patterns of these microorganisms in the lake and the potential environmental drivers have not been fully addressed.
In the current study, we examined size-fractionated (0.1, 0.8, and 3.0 µm nominal filtration) water samples from different locations in Yellowstone Lake, including surface water (3 and 10 m, photic zone), vent fluids, and mixing zones where vent waters mix with cold lake water. Both bacterial and archaeal community structures were assessed using 454 pyrosequencing of the V1 and V2 regions of 16S ribosomal RNA genes. Efforts summarized herein focused on the free-living fractions (<0.8 µm) of microbial communities, with additional characterization of microbial communities in the larger size fractions of two 10 m-depth, photic zone water samples. Geochemical profiling provided environmental context for the microbial community data, and relationships between microbial community structure and environmental variables were tested using multivariate statistics. This study represents the first investigation of the spatial distribution of Bacteria and Archaea in a lake with documented geothermal inputs.

Sampling Locations and Geochemical Analyses
Surface photic zone, mixing zone, and vent water samples were collected with an ROV from lake regions and sites identified in previous USGS surveys (e.g., Morgan et al., 1977, 2007). Specifically, the sampling sites were West Thumb Basin, Inflated Plain, Elliot's Crater, and Southeast Arm in Yellowstone Lake, YNP (Figure 1). Detailed descriptions of sampling sites, locations, and depths are listed in Table 1. Geochemical analyses of water samples were as previously described (Lovalvo et al., 2010; Clingenpeel et al., 2011; Inskeep et al., 2015). To investigate mixing effects from vents on overlying water columns, vertical chemistry profiles were sampled at two sites, Inflated Plain and Southeast Arm. In contrast to Inflated Plain, Southeast Arm is outside of the caldera and approximately 20 km from any major lake floor vent fields, and therefore experiences little impact from the lake's hydrothermal activity.

Water Sampling and Epifluorescence Microscopy
Remotely operated vehicle operation and microbial and geochemical sampling methods were as previously described (Lovalvo et al., 2010; Clingenpeel et al., 2011; Kan et al., 2011). For microbial biomass collection, 100-300 L of lake/vent water was pumped through a 20 µm prefilter into 50 L carboys on the boat deck. Subsamples (10 mL) of the filtrate passing through the 20 µm prefilter were fixed in 1% glutaraldehyde for epifluorescence microscopy (Chen et al., 2001; Kan et al., 2006). The carboys were sterilized either by autoclaving or by soaking with 10% bleach followed by rinsing with autoclaved deionized water prior to each use. Following the protocol described for the Global Ocean Sampling (GOS; Rusch et al., 2007), surface and vent water was size fractionated by serial filtration through 3.0, 0.8, and 0.1 µm membrane filters. Each filter was sealed in a sterile plastic bag and frozen at −20 °C for transport to the laboratory, where the filters were stored at −80 °C until DNA extraction.

DNA Extraction, Pyrosequencing, and Sequence Analyses
DNA extraction, 454 high-throughput pyrosequencing, and sequence analysis followed protocols we described previously (Kan et al., 2011). Briefly, genomic DNA was extracted from a quarter filter for each sample using a phenol-chloroform extraction procedure. Hexadecyltrimethyl ammonium bromide (CTAB; 1% w/v) and sodium chloride (0.14 M) were used to remove polysaccharides and residual proteins during the DNA extraction. The V1 and V2 regions of the bacterial and archaeal 16S rRNA genes were amplified with barcoded primers (Bacteria: 27F and 533R; Archaea: A2Fa and A571R; Baker et al., 2003). The PCR products were pooled at equimolar concentrations according to their relative amplicon abundance and pyrosequenced using the 454 GS FLX Titanium sequencer of 454 Life Sciences (Branford, CT; Margulies et al., 2005) at the J.C. Venter Institute sequencing center. Pyrosequencing reads for both Bacteria and Archaea were screened and clustered using the UPARSE pipeline (Edgar, 2013) following the recommendations on the UPARSE website (http://www.drive5.com/usearch/manual/uparse_pipeline.html). Briefly, once barcodes and primer sequences were removed, all reads were trimmed to a length of 300 bp. Based on the Q scores, reads with an average expected error greater than 1 were removed. For each sample, the reads were subsampled to 16,074 bacterial reads and 12,064 archaeal reads. The sequences were clustered into operational taxonomic units (OTUs) based on a 97% sequence similarity threshold. Chimeras were removed as an integral part of the UPARSE clustering method. All pyroread sequences are available in the GenBank SRA database under accession numbers SRX033214, SRX033251, and SRX033252.
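The quality-filtering and subsampling steps described above can be illustrated with a minimal sketch. It is not the UPARSE implementation: the function names are hypothetical, the expected-error threshold is interpreted here as the summed per-base error probability over the trimmed read (one common reading of an expected-error filter), and reads shorter than the trim length are assumed to be discarded.

```python
import random

def expected_errors(quals):
    """Expected number of base-call errors in a read.

    Each Phred score Q corresponds to an error probability of 10^(-Q/10);
    the expected error count is the sum of these probabilities.
    """
    return sum(10 ** (-q / 10) for q in quals)

def trim_and_filter(reads, length=300, max_ee=1.0):
    """Truncate reads to a fixed length and drop low-quality ones.

    `reads` is a list of (sequence, phred_scores) tuples.
    Assumption: reads shorter than `length` are discarded, and the
    threshold applies to the expected errors of the trimmed read.
    """
    kept = []
    for seq, quals in reads:
        if len(seq) < length:
            continue  # too short to trim to the common length
        seq, quals = seq[:length], quals[:length]
        if expected_errors(quals) <= max_ee:
            kept.append((seq, quals))
    return kept

def subsample(reads, depth, seed=0):
    """Randomly draw `depth` reads without replacement (even sampling depth).

    Raises ValueError if the sample holds fewer than `depth` reads.
    """
    rng = random.Random(seed)
    return rng.sample(reads, depth)
```

With the values reported here, the calls would use `length=300`, `max_ee=1.0`, and a `depth` of 16,074 (Bacteria) or 12,064 (Archaea); the fixed seed is only there to make the sketch reproducible.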

Real Time PCR (QPCR)
The relative abundances of total planktonic Bacteria and Archaea were analyzed by real-time qPCR, using the SYBR Green PCR kit (Qiagen) on an MJ Research (Bio-Rad) qPCR machine and following a previously described protocol (Einen et al., 2008). Briefly, bacterial and archaeal 16S rRNA genes were quantified using primers 338f-518r and 931f-m1100r, respectively. Triplicate 10-fold dilution series of genomic DNA from Shewanella oneidensis MR-1 and Halobacterium sp. were used to generate standard curves for Bacteria and Archaea, respectively, with biomass calculated based on estimated rRNA copy numbers for Bacteria (3.9) and Archaea (1.8) as described in Einen et al. (2008).
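As an illustration of the quantification logic, the sketch below fits a linear standard curve (threshold cycle against log10 gene copies), converts a sample's threshold cycle to gene copies, and divides by the assumed 16S rRNA copy number per genome (3.9 for Bacteria, 1.8 for Archaea, as stated above). The function names are hypothetical, and the linear Ct model is the standard textbook form rather than a detail taken from the protocol of Einen et al. (2008).

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate 16S rRNA gene copies."""
    return 10 ** ((ct - intercept) / slope)

def cells_from_copies(copies, copies_per_genome):
    """Convert gene copies to cell equivalents using an assumed
    average number of 16S rRNA gene copies per genome."""
    return copies / copies_per_genome

def bacteria_to_archaea_ratio(bac_copies, arc_copies,
                              bac_per_genome=3.9, arc_per_genome=1.8):
    """Ratio of bacterial to archaeal cell equivalents."""
    return (cells_from_copies(bac_copies, bac_per_genome)
            / cells_from_copies(arc_copies, arc_per_genome))
```

Because Bacteria and Archaea are amplified with different primer sets, each domain would need its own standard curve before the ratio is taken.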

Estimated Richness, Diversity Indices, and Multivariate Statistics
The richness estimator Chao1 was applied to estimate the number of missing species based on the numbers of singletons and doubletons (Chao and Shen, 2010). Alpha diversity (Shannon and Simpson indices) and beta diversity (Morisita-Horn index) were calculated using the program SPADE (Species Prediction and Diversity Estimation; Chao and Shen, 2010). The Bacteria and Archaea components of the community structure were analyzed separately by non-metric multidimensional scaling (NMDS), employing the MDS procedure in SAS/STAT Software (v 9.3; SAS Institute Inc., Cary, NC, USA). Input to NMDS was a normalized distance (dissimilarity) matrix of Bray-Curtis percent dissimilarity values based on sample relative abundances. For ease of interpretation, only the first two dimensions of an NMDS analysis were examined. In most cases, the scree plot (badness-of-fit criterion, or stress, plotted against the number of dimensions) suggested that the first two dimensions were sufficient to define the overall dimensionality of the input data matrix. Sample groupings suggested by the NMDS results were assessed via Multi-Response Permutation Procedures (MRPP; PC-ORD software v 4, MjM Software Design, Gleneden Beach, OR, USA) for statistically significant grouping at p ≤ 0.05. A stress value less than 0.1 indicated a good ordination with little risk of misinterpreting the distribution pattern (Clarke, 1993).
Variability in the original data explained by an NMDS dimension was assessed using an R² value from a regression of the individual dimension scores against the original data matrix distance values. To assess which specific taxonomic groups drove a particular NMDS result, and to examine the relationships between ancillary chemistry data and a given NMDS result, taxa relative abundances or environmental data values were correlated (Spearman rank) against the NMDS dimension scores. Significant (p ≤ 0.05 or 0.01) correlations indicate which bacterial/archaeal groups or environmental variables are driving differences in microbial community structure.
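The richness, diversity, and dissimilarity measures named above can be written out in a few lines. The following is a minimal sketch using textbook definitions, not the SPADE or SAS implementations; the function names are hypothetical, the Simpson index is given in its Gini-Simpson form (1 − Σp²), and Chao1 falls back to the bias-corrected form when no doubletons are present.

```python
import math

def chao1(counts):
    """Chao1 richness: observed OTUs plus an estimate of unseen OTUs
    derived from singletons (F1) and doubletons (F2)."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2  # bias-corrected form for F2 = 0

def relative(counts):
    """Convert OTU counts to relative abundances."""
    total = sum(counts)
    return [c / total for c in counts]

def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over OTUs with p > 0."""
    return -sum(p * math.log(p) for p in relative(counts) if p > 0)

def gini_simpson(counts):
    """Simpson diversity in the form 1 - sum(p^2)."""
    return 1 - sum(p * p for p in relative(counts))

def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity on relative abundances (0 = identical)."""
    pa, pb = relative(counts_a), relative(counts_b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / (sum(pa) + sum(pb))

def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity (1 = identical composition)."""
    a_tot, b_tot = sum(counts_a), sum(counts_b)
    da = sum(x * x for x in counts_a) / a_tot ** 2
    db = sum(y * y for y in counts_b) / b_tot ** 2
    cross = sum(x * y for x, y in zip(counts_a, counts_b))
    return 2 * cross / ((da + db) * a_tot * b_tot)
```

Both pairwise functions expect the two count vectors to list the same OTUs in the same order; a matrix of `bray_curtis` values over all sample pairs would be the input to an NMDS routine.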
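The correlation step can be sketched as follows: a Spearman rank correlation is the Pearson correlation of the ranked values, with tied values sharing their average rank. The function names are hypothetical, and the significance test (the p-values used as thresholds above) is deliberately left out of this sketch.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation between, e.g., NMDS axis scores (x)
    and an environmental variable or taxon relative abundance (y)."""
    return pearson(average_ranks(x), average_ranks(y))
```

In use, `x` would hold the scores of the samples on one NMDS dimension and `y` the matching pH values, gas concentrations, or phylum relative abundances for the same samples.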

Environmental Variables
Metadata for the geochemistry measurements are summarized in Table 2. Analytes with more than three values missing or below the detection limit, including S₂O₃²⁻, PO₄³⁻, and trace elements (Al, Mn, Fe, Ga, Se, Mo, Sb, Pb, V, etc.), were not included in any summary or subsequent analyses. Waters from the deep mixing zone (MIX-YL360, 370) and thermal vents (VNT-YL352, 359, and 369) were mildly acidic, ranging from pH 5.6 to 6.6, compared to surface/shallow water from Southeast Arm (SRF-YL354 and 355, pH 7.0-7.1, with no vent impacts). Vent waters represented hydrothermal conditions of high temperature, high concentrations of gases, and low O₂. Temperatures of surface waters ranged from 11.2 to 13 °C, while vent waters had markedly higher temperatures (48-65.5 °C). A generally increasing trend from surface to vent waters was also observed for NH₄⁺, CH₄, CO₂, H₂S, SO₄²⁻, and DIC (dissolved inorganic carbon), while O₂ decreased from 313 to 118 µM. All other environmental measurements, such as DOC (dissolved organic carbon), anions, cations, and trace elements, were generally similar, and no conspicuous trend was found (Table 2). Aqueous geochemistry, gases, and temperature were examined within the water column atop a particularly active vent in the Inflated Plain (within the caldera) and compared against a water column not associated with any known vent(s) in the Southeast Arm (outside the caldera; Figure 1). Interest was primarily in the vent water, the mixing zone (defined as the location directly above the vent where water temperature decreased below that of the vent water), and then depths of 3 and 10 m (Figure 2). Because of the high vent output, no obvious mixing zone of H₂ was observed in the lake (Figure 2), although H₂ was clearly locally enriched in vent emissions and in the water column overlying highly active vents or mixing zone samples containing vent fluids (Table 2, Figure 2).

Microbial Community Composition
A total of 519,779 high-quality 16S rRNA gene sequences were obtained from the lake water samples. Of these, 329,878 came from the primers for Bacteria, with an average of 23,563 reads per sample, and 189,901 from the primers for Archaea, averaging 14,483 reads per sample (except for a low yield from the 3 µm fraction of the Southeast Arm photic sample SRF-YL355; Table 1). Quantitative PCR results indicated that Archaea were a minority relative to Bacteria, with the Bacteria/Archaea ratio varying from ∼4,037 (surface water) to ∼25 (vent water; Table 1). In all samples, the Thaumarchaeota were the dominant archaeal group (Figure 3A). The Crenarchaeota made up a significant portion (20.7%) of the reads from the Inflated Plain vent and were also seen at low levels (<4%) in the waters above that vent. Low levels of Crenarchaeota were also seen in other vent waters (0.7-3.3%). The Euryarchaeota were found at significant levels only in the West Thumb cone vent, at 8.5% of that sample's reads, and at 2.5% of the reads from an Inflated Plain vent. In all other samples, the Euryarchaeota made up less than 1% of the archaeal reads. For the Bacteria, Actinobacteria (freshwater acI and acIV), Bacteroidetes (Chitinophagaceae), Cyanobacteria (Prochlorococcus-like), Alphaproteobacteria (Pelagibacter/SAR11-like), Betaproteobacteria (Burkholderiales and Methylophilales), and Verrucomicrobia (Puniceicoccaceae) were the major groups in the lake (Figure 3B). Although microbial diversity per se was not the focus of the current study, the results agreed well with a previous characterization of population structures for Bacteria, where detailed phylogeny was conducted based on comparison of high-quality 16S rRNA gene sequences with full-length 16S rRNA gene clones. Freshwater Actinobacteria were the dominant group in the lake, ranging from 44.3% (SRF-YL347) to 70.8% (MIX-YL370) of the total reads. Proteobacteria (mainly Alpha and Beta) were the second most abundant group in all the sampled waters, with abundances varying from 15.5% (VNT-YL369) to 40.8% (VNT-YL352). Cyanobacteria were most abundant in the 10 m depth photic zone samples, at 4.9% (SRF-YL355) and 9.2% (SRF-YL347) of the reads. Finally, two other groups were present at significant levels in all water samples: Bacteroidetes at an average abundance of 2.7% and Verrucomicrobia at an average abundance of 3.8% of the total reads. The sequences obtained for both the Bacteria and Archaea provided considerable resolution for identifying important phylotypes present in the water samples, and detailed taxonomic classifications are listed in Supplementary Tables S1 and S2.
FIGURE 2 | Impacts of lake floor vents on overlying water column. Profiles of selected variables from an Inflated Plain vent water column (filled symbols) vs. a Southeast Arm water column (open symbols) that was not vent influenced. Circles denote the vent-lake water-mixing zone (not available for H₂).
Frontiers in Microbiology | www.frontiersin.org

Community Composition In Different Size Fractionations
In addition to the 0.1-0.8 µm filter class, microbial communities in the 0.8-3.0 and 3.0-20 µm filtration classes from stations SRF-YL347 and SRF-YL355 (both 10 m photic zone water) were also characterized (Figure 4). The Crenarchaeota and Korarchaeota were more abundant in the larger size fractions (0.8-3.0 and 3.0-20 µm), whereas the smallest size fraction (0.1-0.8 µm) was almost entirely Thaumarchaeota (Figure 4A). Bacterial communities also showed size distribution patterns: free-living fractions contained more Acidobacteria, Actinobacteria, Alphaproteobacteria, and Betaproteobacteria, while the Aquificae, Bacteroidetes, Cyanobacteria, Deltaproteobacteria, and Planctomycetes were more abundant in the larger size fractions (Figure 4B). Estimated OTU richness (at 97%) and diversity indices showed that the larger size fractions contained higher diversity for both Archaea and Bacteria (Table 3). The beta diversity measure (Morisita-Horn index) confirmed that free-living Archaea and Bacteria were distinct from those in the larger size fractions.
Our epifluorescence microscopic observations confirmed that photosynthetic cells, including Cyanobacteria, were larger than most non-pigmented cells (Figures 6a vs. 6b, 6c vs. 6d). As the largest bacterial phylum, Proteobacteria accounted for the distribution of both the free-living and the larger size fractions; however, distinct subgroups and OTUs were identified as significant components: Alpha-, Beta-, and Gammaproteobacteria for the free-living fraction, and Beta-, Delta-, and Gammaproteobacteria and unknown Proteobacteria for the larger size fractions (Figure 5B).
Archaea and Bacteria exhibited a distribution pattern from 3 m depth (SRF-YL340, 354) to vent waters (VNT-YL352, 359, 369), with 10 m depth (SRF-YL347, 355) and mixing zone waters (MIX-YL360, 370) in between (Figure 7). Bacterial populations from surface waters and vent communities grouped separately, but the mixing zone waters did not form a distinct group. Instead, both archaeal and bacterial communities from mixing zones were more similar to the vent waters at the same locations (VNT-YL359 to MIX-YL360; VNT-YL369 to MIX-YL370; Figures 7B,D). Estimated richness and diversity measures (alpha and beta diversity indices) for free-living microbes indicated that vent waters contained a higher diversity of Archaea than surface and mixing zone water samples. In contrast, bacterial community structures were more similar to one another than archaeal ones across all the water samples, and no clear increasing or decreasing trend was observed with depth (Table 3).
Correlation analyses revealed that the distribution of the free-living microbial communities correlated well with the water geochemistry measured in this study (Figures 7B,D). Surface water (photic zone) archaeal communities were associated with increasing pH and DOC, while the deep vent water communities were associated with hydrogeothermal features including increasing concentrations of CH₄, CO₂, Na⁺, and As⁵⁻ (Figure 7B). Dissolved O₂ was identified as the significant driver for the surface water free-living bacterial community, which was distinguished from deep mixing zone/vent waters by increasing concentrations of CO₂, H₂, SO₄²⁻, F⁻, DIC, and Sr²⁺ (Figure 7D).

DISCUSSION
This study followed the sampling protocol of the GOS expedition (Rusch et al., 2007), with three size fractions: 0.1-0.8, 0.8-3.0, and 3.0-20 µm. We therefore had the opportunity to investigate microbial communities from different size fractions. Microscopic observations and cell enumeration (this study and Clingenpeel et al., 2011) suggested that the 0.1-0.8 µm fraction represented the free-living microorganisms, while fractions >3.0 µm were mostly particle associated. Our results (this study and Clingenpeel et al., 2011) confirmed that cell counts for particle-attached microbes were generally <20-50% of the total cell counts (Kirchman and Mitchell, 1982; Kirchman, 1983; Pernthaler et al., 1996). Larger size fractions that would likely include particle-associated microbes (>3 µm) contain larger and metabolically more active microbial groups (Stevenson, 1978; Gasol et al., 1995; Pernthaler et al., 1996). For instance, Cyanobacteria have been found to dominate the particle-associated fraction because of their size (Rosel et al., 2012), which is consistent with the epifluorescence microscopy images obtained in the current study (Figure 6). Further, 33 OTUs of Bacteroidetes were identified as associated with the larger size fractions (Figure 5B). These microbes have been considered crucial degraders of high molecular weight organic matter (Kirchman, 2002; Thomas et al., 2011). Another important group occurring more commonly within the larger size fractions was Planctomycetes, which has been found widely distributed in both freshwater and marine environments (reviewed by Lage and Bondoso, 2012 and references therein). Because of their well-known distinctive morphology, with cells forming rosettes connected by a non-cellular stalk (Fuerst, 1995), these microorganisms were easily collected by filters with larger pore sizes. The higher estimated richness and diversity for both Archaea and Bacteria in the larger size fractions (Table 3) indicate that greater richness and diversity are associated with organic particles in the water column or with multispecies aggregates, as observed by Rosel et al. (2012) and Crespo et al. (2013). Because the size fractionation approach applied here emphasized free-living organisms, this lake's true microbial diversity and activity are likely underestimated and represent an intriguing target for future studies.
FIGURE 6 | Epifluorescence microscopic images for Yellowstone Lake waters SRF-YL347 (a,b) and SRF-YL355 (c,d). (a,c), total microbial communities; (b,d), same field as (a,c) but with blue light excitation to show photosynthetic cells.
In large lakes such as Yellowstone Lake, environmental gradients influence the distribution of aquatic microorganisms (Figure 7). Previous studies demonstrated that oxygen gradients were associated with bacterial distributions (Cole et al., 1993; Lehours et al., 2007; Salcher et al., 2008, 2010), and similar patterns were observed in this study (Figure 7D), although the samples obtained in the current study did not derive from a stratified lake and the O₂ gradients almost certainly resulted from the lake bottom vents. In addition, lake microbial communities have also been shown to respond to other environmental parameters including nutrients (Cole et al., 1993; Casamayor et al., 2000) and salinity (Baricz et al., 2014).
FIGURE 7 | Distribution pattern (NMDS plots) of free-living Archaea (A,B) and Bacteria (C,D). Major phyla responsible for the distribution patterns of Archaea and Bacteria are shown in (A) and (C), and the correspondence of environmental variations is shown in (B) and (D). Abbreviations for Archaea: Cren, Crenarchaeota; Eury, Euryarchaeota; Thaum, Thaumarchaeota; Unkn, unknown/unclassified. Abbreviations for Bacteria: Acido, Acidobacteria; Actino, Actinobacteria; Bact, Bacteroidetes; Chloro, Chloroflexi; Gemmat, Gemmatimonadetes; Parcub, Parcubacteria (OD1); Proteo, Proteobacteria (α, β, γ-subdivision); Verruc, Verrucomicrobia; Unkn, unknown/unclassified. Only phyla significantly correlated with the scores for an NMDS axis (Archaea, p < 0.05; Bacteria, p < 0.01) are shown. MRPP analyses showed significantly distinct groupings (shadowed) for (A; p = 0.016) and (B; p = 0.00014). Arrow length represents the strength of significant correlations. The asterisks (* and **) in panels (C) and (D) are for ease of labeling and show the phyla associated with the respective group of arrows.
Hydrothermal vents on the bottom of Yellowstone Lake contribute significantly to highly localized temperature, pH, and geochemistry profiles (Table 2 and Inskeep et al., 2015), and, not unexpectedly, these complicated gradients shaped the distribution of microbial populations (Figures 7B,D). For instance, apparently thermophilic Archaea (e.g., Crenarchaeota-Thermoprotei) were found throughout the lake but were highly enriched in mixing zone and vent waters (Figure 7A, Supplementary Tables S1 and S2), suggesting significant geothermal impacts on these specific lake environments. As noted from Figure 7B, pH was one of the significant factors controlling the distribution of archaeal communities in the lake. While not significant at p = 0.05, temperature was correlated with both NMDS axes for Archaea and Bacteria (data not shown). Consistent with this observation, both pH and temperature were identified as controlling factors in selecting microorganisms and determining their distributions in a review of 15 typical freshwater microbial groups found in 15 diverse European lakes (Lindstrom et al., 2005).
Previous reports have shown that the availability of trace elements, major nutrients, and the distribution of major energy sources influenced the diversity and productivity of biological communities in Yellowstone Lake (Lovalvo et al., 2010; Clingenpeel et al., 2011, 2013; Kan et al., 2011; Yang et al., 2011; Inskeep et al., 2015). In the current study, multivariate statistics corroborate that hydrothermal vent-associated geochemistry was the major driver of the diversity and distribution of Archaea (e.g., CH₄, CO₂, pH, DOC, Na⁺, and As⁵⁻) and Bacteria (e.g., O₂, H₂, CO₂, SO₄²⁻, DIC, F⁻, and Sr²⁺) (Figures 7B,D). In comparison to measurements of general geothermal features (e.g., Langner et al., 2001; Macur et al., 2004; D'Imperio et al., 2008), emissions of hydrothermal vent gases including CO₂ and CH₄ from the lake vents were considerably higher than those from terrestrial hot springs in Yellowstone National Park (Table 2; Inskeep et al., 2015). The H₂ level was significant throughout the lake, and it was clearly locally enriched in vent emissions (Table 2; Clingenpeel et al., 2011). Since the H₂ growth threshold concentrations were either equivalent to or below nM levels (Conrad et al., 1983), we hypothesized that H₂ throughout the lake served as a significant energy source for broad microbial populations and activities, as noted for Yellowstone's hot springs (Spear et al., 2005; D'Imperio et al., 2008). As a result, methanogens (e.g., Methanomicrobia) were more commonly found in hydrothermal vents, and their distribution patterns indicated a close relationship to CH₄ and CO₂ concentrations (Figures 7A,B, Supplementary Tables S1 and S2). Further, putative nitrifying Archaea (Nitrosopumilales) were widely distributed in the lake, demonstrating their potential association with nitrifier-relevant concentrations of carbon (CO₂).
High temperature, H₂S, SO₄²⁻, H₂, and low O₂ in vent waters favored growth of thermophilic Crenarchaeota of the orders Desulfurococcales and Thermoproteales, in which numerous cultivated representatives have been shown to be capable of gaining energy by anaerobically respiring sulfur (Huber and Stetter, 2006; Macur et al., 2013). Our results showed that both Desulfurococcales and Thermoproteales were more enriched in deep vents VNT-YL359 (2.1 and 7.6%) and VNT-YL369 (2.1 and 0.2%).
Similar distribution patterns of free-living bacterial populations in the lake were also observed (Figures 7C,D). The population structures correlated positively with the relevant geochemical signatures such as H₂, CO₂, and SO₄²⁻. Substantial CH₄ and CO₂ levels translated to significant metabolism of reduced one-carbon compounds, as evidenced by the presence of methylotrophs (Methylocystis and Methylotenera; Supplementary Tables S1 and S2), which are commonly found in marine and freshwater environments (reviewed by Anthony, 1982; Hanson and Hanson, 1996; Chistoserdova et al., 2009). In addition, higher concentrations of H₂S and SO₄²⁻ along the oxygen gradient would potentially provide habitats suitable for sulfur-oxidizing and sulfate-reducing Bacteria. Recent efforts to characterize the microbial communities from several vent sites in Yellowstone Lake indicated that sulfur-oxidizing bacteria were important in sulfidic habitats (Yang et al., 2011), and our data documented the presence of similar sulfur-oxidizing groups including Aquificales (Sulfurihydrogenibium) and Proteobacteria (Thiovirga, Thiobacillus, Thiothrix, and Sulfuricurvum; Supplementary Tables S1 and S2). Further, although not abundant in the total community, sulfate-reducing Deltaproteobacteria (Desulfobacteraceae) were also present in vent and deep waters. Recent metagenomic and functional gene characterizations of vent waters and streamers in Yellowstone Lake have further supported the geochemical controls (elevated CO₂, S, H₂, CH₄, etc.) on microbial community structures in the deep thermal ecosystems (Inskeep et al., 2015). Although certain functional groups/pathways identified were more favorable and dominant in vents, we recognize that physiological inference based on the high-throughput sequences was limited because of: (1) the constrained taxonomic resolution of short read lengths; and (2) limited current knowledge of the metabolic capabilities of Bacteria and Archaea in general.
Nevertheless, our results indicate archaeal and bacterial distributions were strongly influenced by geothermal inputs of Yellowstone Lake.
Another possible explanation for the microbial distributions is mixing events, which would exert a pronounced influence on water bodies, especially those with longer water retention times. These mixing events would dilute emissions of hydrothermal vents with the overlying water column and thus affect the entire lake geochemistry and the distribution of microorganisms. For instance, the free-living microorganisms from the mixing zone water (>30 m depth) tended to be more similar to those of the deep vents at the same locations (VNT-YL359 and MIX-YL360; VNT-YL369 and MIX-YL370), except for the Archaea at stations VNT-YL369 and MIX-YL370 (Figures 7B,D). Clearly, vent emissions significantly influenced the mixing zone samples. We conclude that both archaeal and bacterial distributions were reliant on the vent-provided chemicals in the lake, but the microbial composition in deeper water was strongly affected by mixing events, as demonstrated by the water chemistry profiles in Inflated Plain (Figure 2). Wind-generated (and other mixing) currents would also disperse the geochemicals and microorganisms throughout the lake, as demonstrated by the H₂ profiles (Figure 2 and Table 2). In addition to vent emissions, other potential sources of lake H₂ include hydrothermal activity, nitrogenase activity, or eukaryotic algae (Hanson and Hanson, 1996; Melis and Happe, 2001). Nevertheless, microbial distribution correlated with hydrothermal activity and the associated geochemistry (CH₄, SO₄²⁻, H₂, CO₂, O₂, DOC, DIC, pH, and other trace metals), suggesting that these microorganisms were likely involved in cycling geochemicals associated with the vent emissions. Depending on the relative association with lake vents, the lake food chain is therefore not simple. In addition to phototrophy (as noted from the dominance of Cyanobacteria), chemolithotrophy involving hydrogen, carbon, and sulfur metabolism is likely another major energy platform in this lake.