The Role of Tetraether Lipid Composition in the Adaptation of Thermophilic Archaea to Acidity

Diether and tetraether lipids are fundamental components of the archaeal cell membrane. Archaea adjust the degree of tetraether lipid cyclization in order to maintain functional membranes and cellular homeostasis when confronted with pH and/or thermal stress. Thus, the ability to adjust tetraether lipid composition likely represents a critical phenotypic trait that enabled archaeal diversification into environments characterized by extremes in pH and/or temperature. Here we assess the relationship between geochemical variation, core- and polar-isoprenoid glycerol dibiphytanyl glycerol tetraether (C-iGDGT and P-iGDGT, respectively) lipid composition, and archaeal 16S rRNA gene diversity and abundance in 27 geothermal springs in Yellowstone National Park, Wyoming. The composition and abundance of C-iGDGT and P-iGDGT lipids recovered from geothermal ecosystems were distinct from surrounding soils, indicating that they are synthesized endogenously. With the exception of GDGT-0 (no cyclopentyl rings), the abundances of individual C-iGDGT and P-iGDGT lipids were significantly correlated. The abundance of a number of individual tetraether lipids varied positively with the relative abundance of individual 16S rRNA gene sequences, most notably crenarchaeol in both the core and polar GDGT fraction and sequences closely affiliated with Candidatus Nitrosocaldus yellowstonii. This finding supports the proposal that crenarchaeol is a biomarker for nitrifying archaea. Variation in the degree of cyclization of C- and P-iGDGT lipids recovered from geothermal mats and sediments could best be explained by variation in spring pH, with lipids from acidic environments tending to have, on average, more internal cyclic rings than those from higher pH ecosystems. Likewise, variation in the phylogenetic composition of archaeal 16S rRNA genes could best be explained by spring pH. In turn, the phylogenetic similarity of archaeal 16S rRNA genes was significantly correlated with the similarity in the composition of C- and P-iGDGT lipids. Taken together, these data suggest that the ability to adjust the composition of GDGT lipid membranes played a central role in the diversification of archaea into or out of environments characterized by extremes of low pH and high temperature.


INTRODUCTION
Archaea inhabit environments characterized by extremes of salt, temperature, and pH. The ecological dominance of archaea in these environments, including environments with characteristics (e.g., combined low pH and high temperature) that preclude bacterial colonization (Inskeep et al., 2010), has been suggested to result from the evolution of phenotypic traits that enable survival under conditions of chronic energy stress (Valentine, 2007). The lipid membrane is fundamental to energy generation and the maintenance of cellular homeostasis, suggesting that survival in these extreme environments requires highly specialized lipid membranes (van de Vossenberg et al., 1998;Macalady et al., 2004;Baker-Austin and Dopson, 2007). Archaea synthesize a variety of diether and tetraether linked membrane lipids (Yamauchi et al., 1993;Macalady et al., 2004;Valentine, 2007). The membrane lipids synthesized by Crenarchaeota, as well as by some Euryarchaeota, are composed of glycerol dibiphytanyl glycerol tetraethers (GDGTs) (De Rosa et al., 1980, 1986De Rosa and Gambacorta, 1988;Macalady et al., 2004). GDGTs consist of ether-linked C 40 polyisoprenoid chains with zero to as many as four cyclopentyl rings and zero or one cyclohexyl ring (i.e., crenarchaeol; Damsté et al., 2002) on each chain (Schouten et al., 2000(Schouten et al., , 2003. The monolayer arrangement and the ether-linked bonding are thought to confer enhanced thermal stability to the lipid membrane (Thompson et al., 1992). In addition, internal cyclopentyl rings are thought to enhance the thermal stability of the GDGT membrane through increased packing density (Gliozzi et al., 1983;Gabriel and Chong, 2000).
In the upper marine water column, marine sediments, and freshwater sediments, the average number of cyclopentyl rings per GDGT correlates with increasing surface water temperature (Schouten et al., , 2003Powers et al., 2004). In terrestrial systems, environmental parameters other than temperature also appear to influence GDGT composition. For example, the composition of GDGT lipids recovered from terrestrial geothermal springs such as those in Yellowstone National Park (YNP), Wyoming, USA exhibit varying correlations with pH and to a lesser extent temperature (Schouten et al., 2007;Pearson et al., 2008). In contrast, the composition of GDGT lipids recovered from terrestrial geothermal springs in the Great Basin (GB), Nevada, USA, which tend to not vary considerably in pH, were correlated with the concentration of bicarbonate but not temperature (Pearson et al., 2004). More recent studies targeting circumneutral to alkaline Tibetan hot springs, which also tend to not vary considerably in pH, indicated an overarching temperature effect on the distribution of some but not all GDGTs in these geothermal springs (He et al., 2012).
In addition to environmental studies, a survey of the composition of GDGT lipids in membranes of archaeal isolates cultivated at optimal pH and temperatures also suggests an important role for tetraether lipids in the survival of archaea at elevated temperature and acidic pH (Macalady et al., 2004). Experimentally, it was shown that the thermoacidophiles Thermoplasma acidophilum (Euryarchaeota), Sulfolobus solfataricus (Crenarchaeota), and Acidilobus sulfurireducens (Crenarchaeota) respond to increases in incubation temperature by increasing the number of cyclopentyl rings in their GDGT core lipids (De Rosa et al., 1980;Uda et al., 2001;Boyd et al., 2011b). In addition, A. sulfurireducens was shown to increase the number of cyclopentyl rings in GDGT core lipids in response to increasing cultivation medium acidity (Boyd et al., 2011b). Thus, the ability for archaea to adjust the composition of their GDGT lipid membranes is a fundamental phenotype that facilitates their survival in a range of habitats.
Targeted 16S rRNA gene and non-targeted metagenomic characterizations of YNP geothermal spring communities also indicate an important role for both pH and temperature in structuring the composition and function of archaeal assemblages (Meyer-Dombard et al., 2005;Spear et al., 2005;Inskeep et al., 2010). These results suggest that archaea have adapted phenotypes that enable ecological success and persistence in these environments. Considering the apparent role that GDGT lipid cyclization has in maintaining functional membranes across gradients of temperature and pH, these observations imply a link between GDGT lipid composition and the diversification of thermophilic archaea. Here, we examine this link by coordinating an assessment of core (C)and intact polar (P)-GDGT lipids, archaeal 16S rRNA gene abundance and diversity, and geochemistry in 27 geothermal springs in YNP. In addition, we quantify the distribution and abundance of archaeal ammonia monooxygenase genes (amoA) since crenarchaeol has been suggested to be a biomarker for nitrifying archaea (de la Torre et al., 2008;Pitcher et al., 2009). A phylogenetic framework is employed to evaluate the links between the diversification of archaea at the level of 16S rRNA gene evolution, Cand P-iGDGT lipid composition, and environmental characteristics. The results of our study demonstrate the utility of integrating molecular genetic approaches with lipid biogeochemistry in generating a better understanding of archaeal ecology and evolution in terrestrial geothermal environments.

SAMPLE COLLECTION AND CHEMICAL ANALYSES
Water, mat, and/or sediment samples were collected from a total of 27 sites from four geographically distinct areas in YNP in June through August of 2008: Norris Geyser Basin (samples E1 to E11, E31, E32), Heart Lake Geyser Basin (samples E12 to E22), Imperial Geyser (samples E23 to E30), and Nymph Lake (samples E33 to E39) ( Table 1). In addition a soil sample was collected from a point adjacent and up the hydrological gradient at each sampling site. Spring temperature and pH were determined using a temperature compensated model 59002-00 Cole-Parmer electrode. An alcohol thermometer was used to confirm temperature. Conductivity was measured using a YSI model 33 S-C-T meter (Yellow Springs Instrument Company, Inc., Yellow Springs, OH, USA). Conductivity values were standardized to a common temperature of 25.0˚C as previously described (Hamilton et al., 2011). Concentrations of ferrous iron (Fe 2+ ) and total sulfide (S 2− ) were quantified onsite with a Hach DR/2000 spectrophotometer (Hach Company, Loveland, CO, USA) and Hachferrozine pillows and sulfide reagents, respectively. For both Fe 2+ and S 2− determinations, water samples were filtered (0.22 µm) prior to addition of reagents. Water samples were filtered (0.22 µm) and frozen on site for use in determination of aqueous chemistry. Dissolved nitrate (NO − 3 ), nitrite (NO − 2 ), ammonia (NH + 4 ), chloride (Cl − ), and sulfate (SO 2+ 4 ) were determined using a model MT-3 segmented flow analyzer (SEALQuAAtro, West Sussex, England) calibrated daily with freshly prepared standards.

DNA EXTRACTION AND QUANTIFICATION
Sediments for molecular analyses were collected aseptically using a flame-sterilized spatula, placed in 1.5 mL centrifuge tubes, and immediately flash-frozen in a dry ice-ethanol slurry for transport to the lab, where they were kept at −80˚C for use in molecular analyses. DNA extraction and purification was carried out as previously described (Boyd et al., 2007b). DNA was extracted in duplicate from ∼250 mg of wet weight sediment and equal volumes of each duplicate extraction were pooled. Genomic DNA was quantified using the Qubit DNA Assay kit (Molecular Probes) and a Qubit 2.0 Fluorometer (Invitrogen).

PCR AMPLIFICATION
Sediment extracts were screened for archaeal 16S rRNA genes amplified using primers 344F (5 -ACGGGGYGCAGCAGGCGC GA-3 ) and 915R (5 -GTGCTCCCCCGCCAATTCCT-3 ) at an annealing temperature of 61˚C. Archaeal amoA genes were amplified as previously described, using an annealing temperature of 53˚C and the following primer set: Arch-amoAF (5 -STAATGGTCTGGCTTAGACG-3 ) and Arch-amoAR (5 -GCGGCCATCCATCTGTATGT-3 ) (Francis et al., 2005). For each set of primers, ∼10 ng of purified genomic DNA was subjected to PCR amplification in triplicate using the following cycling conditions: initial denaturation at 94˚C (4 min), followed by 30 cycles of denaturation at 94˚C (1 min), annealing at specified temperature (1 min), primer extension at 72˚C (1.5 min), followed by a final extension step at 72˚C for 20 min. Reactions contained 2 mM MgCl 2 (Invitrogen, Carlsbad, CA, USA), 200 µM each deoxynucleotide triphosphate (Eppendorf, Hamburg, Germany),  (Schloss et al., 2009). Raw libraries were trimmed, filtered for quality, length, and ambiguous base calls using Mothur. Unique sequences were aligned to the SILVA archaeal database and sequences that started or ended before empirically determined positions of the alignment were removed. The resulting unique sequences were pre-clustered to remove amplification and sequencing errors. Chimeras were detected using UCHIME (Edgar et al., 2011) and were removed. Operational taxonomic units (OTUs) were assigned at a sequence dissimilarity of 0.03 using the furthest-neighbor method. Sequences were classified using the Bayesian classifier and the RDP database and manually verified with BLASTn (Tables S3 and S4 in Supplementary Material). Sequences representing each unique OTU have been deposited in the NCBI Sequence Reads Archive under the accession number SRR648329. Representative sequences of each OTU are also presented in Table S8 in Supplementary Material.

PHYLOGENETIC ANALYSIS
The phylogenetic position of archaeal 16S rRNA genes was evaluated by approximate likelihood-ratio tests (Anisimova and Gascuel, 2006) as implemented in PhyML-aBayes (version 3.0.1) (Anisimova et al., 2011). The archaeal 16S rRNA gene phylogeny was rooted with 16S rRNA genes from Clostridium acetobutylicum ATCC 824 (AE001437) and Caldicellulosiruptor saccharolyticus DSM 8903 (CP000679). Phylogenies were constructed using the General Time Reversible (GTR) substitution model with a proportion of invariable sites and gamma-distributed rate variation as recommended by jModeltest (ver. 3.8) (Darriba et al., 2012). The consensus phylogram was rate-smoothed www.frontiersin.org using the multidimensional version of Rambaut's parameterization as implemented in PAUP (ver. 4.0) (Swofford, 2001). Ratesmoothing was performed according to the parameters identified using jModeltest. This included the identification of the best fit substitution model and specification of the gamma distribution of rate variation across sites, proportion of invariant sites, nucleobase frequencies, and rate matrix for each phylogram.

QUANTITATIVE PCR
Quantitative PCR was used to estimate the number of archaeal 16S rRNA and archaeal amoA genes in DNA extracts. Methods for qPCR generally followed those developed by Boyd et al. (2011a). Archaeal 16S rRNA and archaeal amoA gene clones, generated as described above, were used in generating standard curves for use in relating template copy number to threshold qPCR amplification signal. Two clones for each target gene were used to generate standard curves. The abundance of 16S rRNA and amoA gene clones used in generating the standard curves varied by less than a factor of 1.0 and thus were averaged for use in calculating the average template abundances and SD in template abundances from replicate qPCRs. For quantification of archaeal 16S rRNA genes, a standard curve was generated over five orders of magnitude from 5.6 × 10 2 to 3.1 × 10 7 copies of template per assay (R 2 = 0.995). For quantification of archaeal amoA genes, a standard curve was generated over five orders of magnitude from 9.6 × 10 2 to 2.2 × 10 7 copies per assay (R 2 = 0.993). The detection limit for both archaeal 16S and amoA was ∼15 and 10 copies per ng of extracted DNA, respectively. qPCR assays were performed in a Rotor-Gene 300 quantitative real-time PCR machine (Qiagen, Valencia, CA, USA) in 0.5 mL optically clear PCR tubes (Qiagen, Valencia, CA, USA) using a SsoFast™ EvaGreen Supermix qPCR Kit (Bio-Rad Laboratories, Hercules, CA, USA). qPCR assays were amended with molecular-grade bovine serum albumin (Roche, Indianapolis, IN, USA) to a final concentration of 0.4 mg mL −1 . qPCR cycling conditions were as follows: initial denaturation (95˚C for 10 min) followed by 40 cycles of denaturation (95˚C for 10 s), annealing (55˚C for archaeal 16S rRNA genes, or 53˚C for amoA genes, for 15 s), and extension (72˚C for 20 s). Specificity of the qPCR assays was verified by melt curve analysis. Negative control assays were performed in the absence of template DNA. Each assay was performed in triplicate and the reported template abundances are the average and SD of triplicate determinations for the two control plasmids.

TETRATHER LIPID DETECTION AND QUANTIFICATION
Freeze-dried sediments and mat material were homogenized with a mortar and pestle before lipid extraction. Approximately 5 g of dried solids were extracted following a modified Bligh and Dyer extraction procedure (White et al., 1979). Samples were sequentially extracted six times by sonication for 15 min using a mixture of MeOH, dichloromethane (DCM) and phosphate buffer (pH 7.4) (2:1:0.8, v/v/v; total volume of ∼50 mL). After each sonication, samples were centrifuged at 2500 rpm for 5 min, and were then subjected to the next round of extraction. All extractions were pooled and DCM (1 volume) and phosphate buffer (1 volume) were added to the combined extract (0.9 volume) to achieve phase separation. The bottom DCM phase was collected into a glass tube and dried under N 2 . A known amount of C 46 GDGT internal standard was added to all samples (Huguet et al., 2006). The total lipid extract was divided into apolar and polar fractions by sequential elution with hexane:DCM (9:1, v/v) and DCM:MeOH (1:1, v/v), respectively. One half of the polar fraction was hydrolyzed according to Wei et al. (2011) and the other half was analyzed directly. The samples were then re-dissolved in hexane:propanol (99:1, v/v), and filtered through 0.45 µm PTFE filters prior to injection on the HPLC-MS.

COMMUNITY ECOLOGY AND STATISTICAL ANALYSIS
Phylocom (ver. 4.1) (Webb et al., 2008) was used to construct a community phylogenetic distance matrix using Rao phylogenetic distances derived from the rate-smoothed archaeal 16S rRNA gene cladogram as previously described (Boyd et al., 2010). Euclidean distance matrices derived from the environmental variables listed in Table 1 were constructed using the base package within Frontiers in Microbiology | Terrestrial Microbiology R (version 2.10.1). Chemical measurements that were below the limit of detection (Table 1 and footnotes) were given the minimal detectable value for the purposes of statistical analysis. In addition, the relative abundance of C-and P-iGDGTs was used to construct Euclidean distance matrices describing the dissimilarity of the lipid profiles between samples. Model selection via Akaike Information Criteria adjusted for small sample size (AICc) and Mantel regressions (999 permutations) were used to examine the extent to which the dissimilarity matrices co-varied. Model Selection and Mantel regressions were performed using the R packages Ecodist (Goslee and Urban, 2007), pgirmess (ver. 1.4.3) 1 , vegan 2 , and labdsv 3 . We considered the model with the lowest AICc value to be the best and evaluated the relative plausibility of each model by examining differences between the AICc value for the best model and values for every other model (∆AICc) (Burnham and Anderson, 2002;Johnson and Omland, 2004). Models with ∆AICc < 2 were considered strongly supported by the data, and models with ∆AICc > 10 and/or Mantel significance (p) values > 0.05 can be considered to have essentially no support from the data. Principle coordinate ordination (PCO) of Rao 16S rRNA gene phylogenetic similarity was performed using the R package vegan (see text footnote 2). Pearson linear regression analysis was used to examine relationships between gene abundances, geochemistry, individual C-and P-iGDGTs, C-and P-iGDGT ring indices, and archaeal 16S OTUs. Regression analysis was performed using the program XL Stat (ver. 2008.7.03).

SAMPLE SITE DESCRIPTION
The temperature of the hot spring water where samples were collected ranged from 16.3 to 86.7˚C and pH from 2.06 to 9.57 ( Table 1). A number of the chemical analytes co-varied (Table S1 in Supplementary Material), with pH varying inversely with Fe 2+ (Pearson R = −0.58, p < 0.01) and NH + 4 (Pearson R = −0.67, p < 0.01) and positively with NO − 2 (Pearson R = 0.61, p < 0.01). Concentrations of NO − 2 and NH + 4 were inversely correlated (Pearson R = −0.51, p < 0.01).

SOURCE OF HOT SPRING C-AND P-iGDGT LIPIDS
C-and P-archaeol and iGDGTs were recovered from 25 of the 27 springs analyzed (exceptions are E13 and E23) (Tables 2 and 3) and their abundance and structural composition (C-and P-iGDGTs only) were compared to those from surrounding terrestrial soils (Table S2 in Supplementary Material) in order to establish the potential for exogenous input into the system. The abundance of C-archaeol and C-iGDGTs tended to be greater in hot spring sediments/mats than in surrounding soils; however, the abundance of these compounds in hot spring sediments/mats and soils was correlated (Pearson R 2 = 0.65 and 0.74, respectively) ( Figure 1A). In contrast, the abundance of P-archaeol and P-iGDGTs in hot spring sediments/mats and in surrounding soils were not correlated (Pearson R 2 = 0.00 and 0.02, respectively) ( Figure 1B). To compare the structural composition of C-and P-iGDGTs in hot spring sediments/mats with those obtained from surrounding soils, matrices describing the dissimilarity in the relative abundance of individual lipid structures (i.e., GDGT-0 to GDGT-8) were constructed and subjected to Mantel regression. Comparison of the compositions of C-iGDGT and P-iGDGTs obtained from both hot spring sediment/mat and surrounding soils revealed statistically insignificant relationships (Mantel R = 0.00 and 0.03, respectively; p-values = 0.79 and 0.98, respectively) (Figures 1C,D), indicating that the lipid structures identified in the hot spring sediment/mats are not likely to be the result of exogenous input.

GEOCHEMICAL CONTROLS ON C-iGDGT COMPOSITION
A matrix that describes the dissimilarity in the composition of C-iGDGTs could be best explained by variation in environmental pH (∆AICc = 0.00, Mantel R 2 = 0.36, p < 0.01) ( Table 4). Models describing the dissimilarity in the concentrations of NO − 2 (∆AICc = 95.9, Mantel R 2 = 0.14, p < 0.01) and NH + 4 (∆AICc = 132.5, Mantel R 2 = 0.04, p = 0.01) were also significant predictors of the dissimilarity in the composition of C-iGDGTs. However, given that NO − 2 and NH + 4 both co-vary with pH, it is unclear if these parameters represent individual controls on C-iGDGT composition. The RI, which describes the average number of rings per GDGT, was inversely correlated with pH (Pearson R = −0.82, p < 0.01) ( Figure 3A; Table S3 in Supplementary Material). The RI was not significantly correlated with temperature ( Figure 3B; Table S3 in Supplementary Material). As expected, the relative abundance of a number of individual C-iGDGTs with few rings (e.g., GDGT-0 to GDGT-3) exhibited significant and positive relationships with pH indicating they are more prevalent in alkaline environments. In contrast, the relative abundance of individual C-iGDGTs with a greater number of rings (e.g., GDGT-4 to GDGT-8) exhibited significant and inverse relationships with pH indicating their prevalence in acidic environments. The relative abundance of crenarchaeol and its isomer in the C-iGDGT fraction, both of which contain four cyclopentyl and one cyclohexyl rings, were positively correlated with pH.

GEOCHEMICAL CONTROLS ON P-iGDGT COMPOSITION
The abundance of total P-iGDGTs was positively correlated with pH (Pearson R = 0.46, p < 0.01) ( Table S4 in Supplementary Material). A matrix that describes the dissimilarity in the composition of P-iGDGTs could be best explained by variation in environmental pH (∆AICc = 0.00, Mantel R 2 = 0.17, p < 0.01) ( Table 4). The RI was inversely correlated with pH (Pearson R = −0.71, p < 0.01) ( Figure 3A; Table S4 in Supplementary Material). The P-iGDGT RI was not correlated with temperature ( Figure 3B and Table S4 in Supplementary Material). Similar to the C-iGDGT fraction, the relative abundance of a number of individual P-iGDGTs with fewer rings (e.g., GDGT-0 to GDGT-3) exhibited significant and positive relationships with pH indicating they are more prevalent in alkaline environments. In contrast, the relative abundance of individual P-iGDGTs with a greater number of rings (e.g., GDGT-4 to GDGT-8) exhibited a significant inverse relationship with pH indicating their prevalence in acidic environments. The relative abundance of crenarchaeol and its isomer in the P-iGDGT fraction, both of which contain four cyclopentyl and www.frontiersin.org  (Schouten et al., 2003;Pearson et al., 2008).
one cyclohexyl rings, were positively correlated with pH although neither relationship was statistically significant.

COMPARISON OF HOT SPRING C-AND P-ARCHAEAL LIPIDS (GDGTs AND ARCHAEOL)
The abundance of C-iGDGT and P-iGDGT varied in the springs analyzed (Tables 2 and 3, respectively) but were not significantly correlated to each other (Pearson R = 0.13, p = 0.51) ( Table 5).
Likewise, the amount of archaeol in the core and polar fractions varied in the springs analyzed (Tables 2 and 3, respectively) but were not significantly correlated to each other (Pearson R = −0.03, p = 0.88) ( Table 5). However, the ratio of the abundance of P-iGDGT to C-iGDGT (i.e., total P-iGDGT/total C-iGDGT) correlated inversely with pH (Pearson R = −0.47, p = 0.02) and positively with temperature (Pearson R = 0.47, p = 0.02) (data not shown), suggesting that a combination of high temperature and acidic pH may favor rapid turnover or degradation of C-iGDGT or enhanced synthesis of P-iGDGTs. Matrices describing the dissimilarity in the composition of C-and P-iGDGTs were significantly correlated (Mantel R = 0.73, p < 0.01) (Figure 2). As such, the relative abundance of individual C-and P-iGDGT lipids were generally significantly correlated (p < 0.05) with the sole exception being the relative abundance of GDGT-0 which was not significantly correlated in the C-and P-iGDGT fractions (Pearson R = 0.27, p = 0.17) ( Table 5).

ABUNDANCE OF ARCHAEAL 16S rRNA AND amoA GENES
Archaeal 16S rRNA genes were detected in 26 of the 27 springs analyzed (Table S5 in Supplementary Material). In geothermal mat/sediment extracts containing archaeal 16S rRNA genes, the abundance ranged from 1.4 × 10 3 to 1.6 × 10 8 templates/ng of DNA extracted. Variation in the abundance of archaeal 16S rRNA genes was not significantly correlated with any of the measured geochemical parameters and could best be described by variation in salinity (Pearson R = −0.31, p = 0.08) ( Table S6 in Supplementary Material). Archaeal amoA genes were detected in 16 of the 27 environments examined (Table S5 in Supplementary Material). In geothermal mat/sediment extracts containing archaeal amoA genes, the abundance ranged from 1.2 × 10 1 to 5.2 × 10 4 copies/ng of DNA extracted. The abundance of archaeal amoA was significantly and positively correlated with the concentration of NO − 2 (Pearson R = 0.45, p = 0.02) and NO − 3 (Pearson R = 0.57, p < 0.01) ( Table S6 in Supplementary Material).
The abundance of archaeal 16S rRNA genes was not significantly correlated with the absolute abundance (i.e., relative abundance of individual C-iGDGT lipids multiplied by the total Frontiers in Microbiology | Terrestrial Microbiology   (Schouten et al., 2003;Pearson et al., 2008).
C-iGDGT detected) of any individual C-iGDGT lipid, total C-iGDGT lipids, or C-archaeol (Table S7 in Supplementary Material). In contrast, the abundance of archaeal amoA genes was strongly correlated with the abundance of C-crenarchaeol (Pearson R = 0.34, p = 0.09) and significantly and positively correlated with the abundance of its isomer (Pearson R = 0.55, p < 0.01).
In addition, the abundance of archaeal amoA genes was significantly and positively correlated with the absolute abundance of C-iGDGT-0 and C-iGDGT-1 (Pearson R = 0.42 and 0.40, p = 0.03 and 0.04, respectively). The abundance of archaeal 16S rRNA genes was not significantly correlated with the abundance of total P-iGDGT lipids or P-archaeol, but was significantly correlated with the absolute abundance of P-iGDGT-5 (m/z = 1292) and P-iGDGT-6 (m/z = 1290) (Pearson R = 0.67 and 0.65, p < 0.01 and < 0.01, respectively) ( Table S8 in Supplementary Material). The abundance of archaeal amoA genes was significantly and positively correlated with abundance of total P-iGDGTs (Pearson R = 0.56, p < 0.01) and the absolute abundance of P-iGDGT-1, P-iGDGT-2, and P-iGDGT-3. Intriguingly, the abundance of archaeal amoA genes was also significantly correlated with the abundance of P-crenarchaeol and its isomer (Pearson R = 0.59 and 0.56, p < 0.01 and <0.01, respectively).

PHYLOGENETIC COMPOSITION OF ARCHAEAL 16S rRNA ASSEMBLAGES
Archaeal 16S rRNA gene pyrotags were obtained from 22 of the 27 environments examined. Barcoding reactions failed for five environments (E1, E5, E6, E13) where archaeal 16S rRNA genes were detected. A total of 15692 non-chimeric sequences belonging to 31 unique OTUs (defined at 97% sequence identities) were identified in the 22 geothermal environments where sequence was obtained (Figure 5). Phylogenetic reconstruction of representatives of each unique 16S rRNA gene OTU yielded a phylogeny with resolved nodes with high bootstrap support ( Figure S1 in Supplementary Material) for use in calculating Rao's among community phylogenetic dissimilarity. The among community archaeal 16S rRNA phylogenetic dissimilarity could best be explained by variation in environmental pH (∆AICc = 0.00, Mantel R 2 = 0.06, p < 0.01) ( Table 6). Variation in Fe 2+ concentration was also a significant predictor of among community archaeal 16S rRNA phylogenetic dissimilarity (∆AICc = 4.2, Mantel R 2 = 0.04, p = 0.02). It is likely that the relationship noted between the dissimilarity in Fe 2+ concentration and archaeal 16S rRNA gene phylogenetic dissimilarity is due to the covariance between pH and Fe 2+ concentration in the springs examined (Table S1 in Supplementary Material).

www.frontiersin.org FIGURE 1 | Relationship between the abundance of C-iGDGT and C-archaeol in 26 geothermal sediments/mats plotted as a function of the abundance of the C-iGDGT and C-archaeol in surrounding soils (A).
Relationship between the abundance of P-iGDGT and P-archaeol in 26 geothermal sediments/mats plotted as a function of the abundance of the P-iGDGT and P-archaeol in surrounding soils (B). Mantel regression between a matrix describing the dissimilarity in the composition of C-iGDGTs in 26 geothermal sediments/mats plotted as a function of the dissimilarity in the composition of C-iGDGTs in surrounding soils (C). Mantel regression between a matrix describing the dissimilarity in the composition of P-iGDGTs in 26 geothermal sediments/mats plotted as a function of the dissimilarity in the composition of P-iGDGTs in surrounding soils (D). Data in panel A and B are plotted on a log-log scale with a linear regression depicting the relationship. The data used to construct plots (A-D) is presented in Tables 2 and 3.
The relatively weak relationships noted above when modeling the total variance in archaeal 16S rRNA gene Rao phylogenetic distance as a function of geochemical distance prompted a PCO analysis of archaeal 16S rRNA gene Rao phylogenetic distance to further deconvolute the complexity in the dataset. PCO ordination revealed a significant role for pH in structuring the phylogenetic composition of archaeal assemblages (Figure 4A). Assemblages sampled from acidic (pH 2.0-4.0) and alkaline (pH > 8.0) tended to cluster in the PCO ordination as a function of axis 1. PCO axis 1 (68.1% of variance explained) was significantly correlated with pH (Pearson R = −0.69, p < 0.01) (Figure 4B; Table S9 in Supplementary Material). The strong relationship noted between PCO axis 1 and environmental pH is primarily driven by the seven low pH environments that cluster in the bottom right corner of the regression analysis. Thus, acidic environments tend to harbor assemblages that are phylogenetically distinct from alkaline environments. PCO axis 2 (15.8% of variance explained) was not significantly correlated with any of the measured geochemical parameters and was most strongly correlated with temperature (Pearson R = 0.26, p = 0.25) ( Table S9 in Supplementary Material).
When only considering environments where both archaeal 16S rRNA and tetraether lipids were recovered, PCO axis 1 was significantly correlated with the C-iGDGT and P-iGDGT ring indices (Pearson R = 0.54 and 0.49, p = 0.01 and 0.02, respectively) (Tables S10 and S11 in Supplementary Material), suggesting a relationship between the diversification of archaeal taxa and GDGT lipid composition. The relative abundance of a number of individual C-iGDGTs and P-iGDGTs co-varied with axes 1 and 2. The abundance of GDGT-1 and 2 in both the C-and P-iGDGT fraction were inversely correlated with PCO axis 1, consistent with the tendency for these compounds to be more abundant in alkaline environments. In contrast, the abundance of GDGT-5 and 6 in both the C-and P-iGDGT fractions were positively correlated with PCO axis 1 (Tables S10 and S11 in Supplementary Material), consistent with the tendency for these compounds to be more Frontiers in Microbiology | Terrestrial Microbiology Table 4 | Model ranking using ∆AICc and Mantel correlation coefficients (R 2 ) where C-iGDGT or P-iGDGT distance is the response variable.  abundant in acidic environments. In addition, The abundance of GDGT-4 in only the polar fraction was positively correlated with PCO axis 1. Although not significant, GDGTs with a greater number of cyclopentyl rings (e.g., GDGT-7 and 8) were inversely correlated with PCO axis 2 in both the C-and P-iGDGT fraction.

TAXONOMIC COMPOSITION OF ARCHAEAL 16S rRNA GENE ASSEMBLAGES
Archaeal 16S rRNA gene OTUs were uniquely distributed as a function of environmental geochemistry. The most abundant OTU (OTU6; 30.1% of total sequences generated; Table S12 in Supplementary Material) was significantly and positively correlated with the concentration of nitrite (Pearson R = 0.42, p = 0.04) ( Table  S13 in Supplementary Material). OTU6 shares 99% sequence identities to the nitrifying thaumarchaeote Candidatus Nitrosocaldus yellowstonii and was dominant in four springs with circumneutral to alkaline pH and that had temperatures of <70˚C (Figure 5; Table S12 in Supplementary Material). In addition, sequences with strong identity to Ca. N. yellowstonii were identified in a single acidic spring (E39; pH 4.95) with a temperature of 16.3˚C. The abundance of OTU6 was also significantly and positively correlated with the abundance of archaeal amoA genes (Pearson R = 0.54, p < 0.01) ( Table S13 in Supplementary Material), although this relationship was dependent by the high relative abundance of amoA in E12 (Table S5 in Supplementary Material). The second most abundant OTU (OTU51; 14.4% of total sequences generated) was identified in three circumneutral environments that ranged in temperature from 55.0 to 83.8˚C (Table  S13 in Supplementary Material). OTU51, which exhibited 84% sequence identity to Ca. N. yellowstonii (Table S12 in Supplementary Material), was not significantly correlated to any of the environmental parameters measured (Table S13 in Supplementary Material). Although not significant, the third most abundant OTU (OTU31; 10.1% of total sequences) was inversely correlated with pH (Pearson R = −0.36, p = 0.08) (

RELATIONSHIP BETWEEN 16S rRNA GENE TAXONOMIC COMPOSITION AND C-AND P-iGDGT COMPOSITION
A number of archaeal 16S rRNA gene OTUs co-varied with the abundances of C-and P-iGDGTs (Tables S14 and S15 in Supplementary Material respectively). For simplicity, correlations between OTU relative abundance and individual C-iGDGT lipid abundance will not be presented here given that they tended to be less significant than when the P-iGDGT fraction was considered. The relative abundance of OTU6, which is closely related www.frontiersin.org  to Ca. N. yellowstonii, was significantly correlated with the abundance of crenarchaeol (Pearson R = 0.56, p < 0.01). OTU51, which exhibited distant homology to Ca. N. yellowstonii was not significantly correlated with the abundance of crenarchaeol or any other individual P-iGDGTs. The abundance of OTU31, which is distantly affiliated with Ca. Nitrosphaera sp. EN123 and which was detected in acidic springs, was significantly correlated with the abundances of P-iGDGTs with seven or more rings.

DISCUSSION
The ability for microbial populations to radiate into a new ecological niche is dependent on physiological compatibility with the physical and chemical characteristics of the local environment (Kassen, 2009;Losos, 2010), due in large part to the additional energetic demands that physical and chemical stress impose on cells (Hoehler, 2007;Valentine, 2007). Cellular membranes and the electrochemical potential that is generated across them are critical components in the production of energy for use in biomolecular synthesis and maintenance (van de Vossenberg et al., 1998;Macalady et al., 2004;Baker-Austin and Dopson, 2007). Thus, maintaining a stable and functional lipid membrane is fundamental to energy transduction and the diversification of life into new ecological niches. Here, we quantitatively integrate biomolecular lipid, genetic, and environmental data in YNP geothermal environments in order to better define the role of geochemical gradients in driving the taxonomic and phenotypic diversification of archaea. Gradients in geothermal fluid pH were shown to influence the degree of C-and P-iGDGT cyclization (RI), with acidic fluids tending to select for lipids with more internal cyclopentyl rings. These observations are consistent with previous characterizations of C-iGDGT lipid composition in geothermal habitats, including those in YNP (Schouten et al., 2007;Pearson et al., 2008). As shown here, the degree of cyclization of P-iGDGTs is inversely correlated with pH in geothermal habitats. Given that P-iGDGTs are purported to better reflect living or intact cells when compared to C-iGDGTs (Sturt et al., 2004;Biddle et al., 2006;Schubotz et al., 2009), these results indicate that the active fraction of archaeal assemblages is responding to gradients in pH and this signature is preserved in the C-iGDGT fraction.
Experimental and molecular modeling studies indicate that tetraether lipids and the extent of internal tetraether lipid cyclization confer resistance to proton permeation. For example, studies of archaeal liposomes formed from tetraether lipids derived from the archaean S. acidocaldarius exhibit a much lower proton permeation over a temperature range of 20-80˚C when compared to bacterial acyl-ester lipids (Elferink et al., 1994). Molecular modeling of these liposomes suggest the decreased proton permeability relative to acyl membranes results from both the ether bonding and the bulky isoprenoid lipid core (Yamauchi et al., 1993;Komatsu and Chong, 1998;Mathai et al., 2001). In addition, hydrogen bonding between the sugar residues exposed at the face of the  Table S8 in Supplementary Material. Representative sequences for each OTU are also provided in Table S8 in Supplementary Material. Regression analysis of the top five PCO axes and geochemical variables is presented in Table S9 in Supplementary Material. Regression analysis of the top five PCO axes and C-and P-iGDGTs is presented in Tables S10 and S11 in Supplementary Material, respectively.
FIGURE 5 | Taxonomic composition of archaeal 16S rRNA genes. Taxa are defined as the genus with which they were most closely affiliated based on BLASTn analysis. The percent sequence identity of each OTU to the most closely related genus is indicated in parentheses. Sequences that represented less than 0.01% of the total sequences obtained in the present study were binned as "other." The taxonomy and environmental distribution of the genera, as depicted from top to bottom in the figure, corresponds to the OTUs, listed from top to bottom, in Table S12 in Supplementary Material. www.frontiersin.org lipid monolayer has been proposed to reduce proton permeability, although to a lesser extent than the rigid and dense packing of the isoprenoid core (Komatsu and Chong, 1998). Additional studies are clearly warranted to investigate the influence of GDGT lipid sugar head groups in geothermal environments, and their relation to taxonomic evolution. In separate molecular simulations of archaeal liposomes, it was found that tetraether lipid isoprenoid cyclization enhances the resistance to proton permeation through increased packing density (Gabriel and Chong, 2000). The importance of cyclopentyl rings in resistance to proton permeation, as suggested by the environmental data presented here, is supported by a survey of the GDGT lipid composition of archaeal isolates which indicates that acidophiles, when cultivated at their optimum pH and temperature, synthesize GDGT lipids containing a greater average number of cyclopentyl rings than neutrophiles (Macalady et al., 2004). Moreover, these findings are consistent with recent experimental data that indicate that the thermoacidophile A. sulfurireducens increase the degree of GDGT lipid cyclization in response to increasing acidity (Boyd et al., 2011b). Gradients in geothermal fluid pH select for distinct lineages of archaea, with acidic (pH < 4.0) environments tending to be dominated by sequences affiliated with Vulcanisaeta sp., Thermoplasma sp., Sulfolobus sp., and Thermoproteus sp. The recovery of sequences affiliated with these taxa is consistent with the thermoacidophilic nature of the type strains (Darland et al., 1970;Brock et al., 1972;Zillig et al., 1981;Itoh et al., 2002), many of which were originally isolated from YNP. In addition, a number of studies have previously identified sequences affiliated with these taxa in acidic geothermal springs in YNP (Inskeep et al., 2005;Meyer-Dombard et al., 2005;Spear et al., 2005;Kozubal et al., 2008;Reno et al., 2009;Macur et al., 2013). The lipid membranes of representatives of these lineages are known to be comprised primarily of tetraether lipids, suggesting this as a key to survival in high temperature and/or acidic environments (De Rosa and Gambacorta, 1988;Macalady et al., 2004).
PCO analysis indicates that the phylogenetic relationships among archaeal assemblages is driven primarily by pH gradients, with significant correlations noted between assemblage clustering and the C-and P-iGDGT ring indices. This form of data integration directly links the composition of tetraether lipids with the diversification of archaeal lineages. Other factors are invariably involved in driving the diversification of archaeal lineages such as differences in the energetic favorability of redox couples as a function of spring pH (Inskeep et al., 2005;Meyer-Dombard et al., 2005;Shock et al., 2010), functional adaptation of lineages to utilize those redox gradients (Inskeep et al., 2010), and the evolution of strategies aimed at detoxifying metals that tend to be more soluble and/or bioavailable under acidic conditions (e.g., mercury; Wang et al., 2011), among others. Nonetheless, the fundamental requirement for cells to maintain functional lipid membranes in order to preserve electrochemical gradients and cellular homeostasis (van de Vossenberg et al., 1998;Macalady et al., 2004;Baker-Austin and Dopson, 2007) is expected to represent a first order constraint on the ability for populations to diversify into new pH realms.
A number of C-and P-iGDGT lipids were found to co-vary with the abundance of 16S rRNA gene sequences affiliated with taxa thought to be involved in their synthesis, most notably crenarchaeol and Ca. N. yellowstonii. Ca. N. yellowstonii is an obligate nitrifying autotroph that has been demonstrated to synthesize crenarchaeol (de la Torre et al., 2008). Sequences closely affiliated with Ca. N. yellowstonii were recovered from several circumneutral to alkaline geothermal springs in YNP both in this study, and in previous studies (de la Torre et al., 2008). The abundance of sequences affiliated with Ca. N. yellowstonii, as well as the abundance of P-crenarchaeol, were inversely correlated with the concentration of NH + 4 and significantly and positively correlated with the concentration of NO − 2 , a product of nitrification. These data may suggest that populations of Ca. N. yellowstonii are involved in nitrification in the environments analyzed. In support of this conclusion, the abundance of 16S rRNA gene sequences closely affiliated with Ca. N. yellowstonii was significantly and positively correlated with the abundance of archaeal amoA sequences, the product of which is required to oxidize NH + 4 to the intermediate hydroxylamine. However, not all geothermal mat/sediment samples that exhibited elevated abundances of archaeal amoA sequences (>10 3 copies/ng DNA) yielded 16S rRNA gene sequences closely affiliated with Ca. N. yellowstonii (e.g., E3, E7, E16-E19). In some instances (e.g., E3, E7), this discrepancy might be explained by the detection of 16S rRNA gene sequences closely affiliated with the soil nitrifier Ca. Nitrososphaera sp. which also encodes for amoA (Tourna et al., 2011). Geothermal springs E3 and E7 are moderate temperature and acidic environments and while previous studies have shown evidence for nitrification activity in acidic geothermal springs (Reigstad et al., 2008), it is unclear if the activity could be attributed to populations of Ca. Nitrososphaera spp. The abundance of crenarchaeol in YNP geothermal environments in both the C-and P-iGDGT fractions was, however, closely correlated with the abundance of archaeal amoA, implying a link between nitrification and crenarchaeol production. In accordance, the abundance of crenarchaeol in the C-and P-iGDGT fractions was inversely associated with the concentration of NH + 4 and positively associated with the abundance of NO − 2 .
It has been suggested that the previous detection of crenarchaeol in geothermal springs in the GB, Nevada (Zhang et al., 2006), which host abundant populations of nitrifying archaea (Costa et al., 2009;Dodsworth et al., 2011), is the result of allochthonous input of lipid material from the weathering of nearby soils (Schouten et al., 2007). Our pairwise comparison of geothermal mat/sediment C-and P-iGDGT profiles with those from surrounding soils, which reveals no association between the two, strongly suggests that the GDGTs detected in geothermal springs is unlikely to be solely the result of allochthonous input from surrounding environments. This conclusion is supported by the recent demonstration of P-crenarchaeol production in two geothermal springs in the GB (Pitcher et al., 2009). However, given the results noted above, namely the lack of sequences affiliated with known nitrifiers in springs where crenarchaeol is detected, limited input of allochthonous material from surrounding environments cannot be excluded.

CONCLUSION
In summary, the results presented here quantitatively demonstrate links between environmental variation, the composition of archaeal GDGT lipids, and the distribution of archaeal lineages in geothermal environments. The detection of P-iGDGT lipids in ecosystems that range in pH from 2.06 to 9.57 and temperature from 16.3 to 86.7˚C indicates that archaea have diversified to inhabit a wide range of environments, implying a fundamental role for GDGT lipids in this process. The relationships presented here are strongly dependent on the inclusion of acidic ecosystems, suggesting that ability to adjust GDGT lipid composition is a critical phenotype that enabled the diversification of thermophilic archaea into, or out of, acidic habitats. The close association noted between the distribution of 16S rRNA genes closely affiliated with characterized nitrifying archaea, archaeal amoA gene sequences, and the presence of crenarchaeol support the hypothesis that this lipid is a biomarker for thaumarchaeotes involved in nitrification activity.