Metagenome reveals potential microbial degradation of hydrocarbon coupled with sulfate reduction in an oil-immersed chimney from Guaymas Basin

Deep-sea hydrothermal vent chimneys contain a high diversity of microorganisms, yet the metabolic activity and the ecological functions of the microbial communities remain largely unexplored. In this study, a metagenomic approach was applied to characterize the metabolic potential in a Guaymas hydrothermal vent chimney and to conduct comparative genomic analysis among a variety of environments with sequenced metagenomes. Complete clustering of functional gene categories with a comparative metagenomic approach showed that this Guaymas chimney metagenome clustered most closely with a chimney metagenome from Juan de Fuca. All chimney samples were enriched in genes involved in recombination and repair, chemotaxis, and flagellar assembly, highlighting their roles in coping with the fluctuating, extreme deep-sea environment. A high proportion of transposases was observed in all the metagenomes from deep-sea chimneys, supporting the previous hypothesis that horizontal gene transfer may be common in the deep-sea vent chimney biosphere. In the Guaymas chimney metagenome, thermophilic sulfate-reducing microorganisms, including bacteria and archaea, were predominant, and genes coding for the degradation of refractory organic compounds such as cellulose, lipid, and pullulan, as well as a few hydrocarbons including toluene, ethylbenzene, and o-xylene, were identified. Therefore, this oil-immersed chimney supported a thermophilic microbial community capable of oxidizing a range of hydrocarbons that served as electron donors for sulfate reduction under anaerobic conditions.


INTRODUCTION
Deep-sea hydrothermal vents, characterized by steep physicochemical gradients, harbor a wide range of microorganisms in different ecological niches, including the high-temperature chimney matrix (Reysenbach and Shock, 2002). Deep-sea hydrothermal vent chimneys are the product of hydrothermal circulation and alteration of seawater entrained through geothermally heated subseafloor basalt, and subsequent precipitation of metal sulfides when hot vent fluids emerge into cold seawater (Von Damm, 1990). The geochemical disequilibria within and surrounding chimneys provide rich energy sources for microorganisms, as various reduced chemicals (such as sulfur, methane, and H2) are utilized as potential electron donors (Jannasch and Mottl, 1985; Distel et al., 1988; Lam et al., 2004; Petersen et al., 2011). The structures of these microbial communities are shaped primarily by variation in hydrothermal fluid composition, in particular H2 concentrations (Flores et al., 2011). Thanks to advances in sequencing technologies, molecular microbial diversity studies of deep-sea hydrothermal environments have made significant progress in understanding the geographic distributions of these microbial communities (Teske et al., 2002; Huber et al., 2007, 2010; Brazelton et al., 2010; Dick and Tebo, 2010; Roussel et al., 2011).
Despite the increased knowledge of the microbial diversity of deep-sea hydrothermal vents, much less is known about the metabolic potential and ecological functions of these communities, especially considering that less than 1% of environmental microorganisms can be cultured under laboratory conditions (Amann et al., 1995). Metagenomic methods have provided unique opportunities to explore the features of microbial communities from diverse deep-sea hydrothermal vent environments. So far, metagenomes from two deep-sea hydrothermal vent chimneys have been published: one from a white carbonate chimney at Lost City with relatively low temperature and high pH (<90 °C, pH 9-11) (Brazelton and Baross, 2009), and one from a chimney sample collected from Juan de Fuca (Xie et al., 2011) characterized by high-temperature, low-pH fluids (>300 °C, pH 2-3). Both metagenomes were found to be enriched in transposases, implying that horizontal gene transfer may be a common feature of the hydrothermal vent chimney biosphere. Comparative metagenomic studies with more chimney samples from different deep-sea hydrothermal vents should be performed to reveal common features of chimney-derived microbial communities.
The Guaymas Basin (Gulf of California) is a unique hydrothermal vent site, where the emitted high-temperature fluids are influenced by the presence of a thick layer (100-500 m) of sediments (with 2-4% organic matter, OM). These sediments were formed by precipitation from the highly productive surface waters and by terrigenous input (Von Damm et al., 1985). The venting fluids are characterized by an increased pH (around 6.0) and a decreased maximum temperature of fluids emitted at the seafloor (270-325 °C) (Von Damm et al., 1985). Therefore, a chimney sample from the Guaymas site is an ideal candidate for comparative metagenomic analysis together with the two published ones. In addition, Guaymas Basin is unique in that, under high-temperature conditions, the OM in its rapidly accumulating sediments is pyrolyzed to petroleum-like products, such as aliphatic and aromatic hydrocarbons, short-chain fatty acids, ammonia, and methane (Bazylinski et al., 1988; Welhan, 1988; Martens, 1990; Kawka and Simoneit, 1994). Intensive methane-oxidizing and sulfate-reducing activities have been reported in the hydrothermally active sediments of the Guaymas Basin (Jørgensen et al., 1990, 1992; Elsgaard et al., 1994; Weber and Jørgensen, 2002; Kallmeyer and Boetius, 2004). Various hydrocarbon-degrading microorganisms that use sulfate as the electron acceptor have been isolated from Guaymas Basin sites and similar marine habitats (Rüter et al., 1994; Galushko et al., 1999; Musat and Widdel, 2008; Kleindienst et al., 2012). Nevertheless, the metabolic potential, in particular for hydrocarbon degradation, of the whole microbial community from Guaymas vent chimneys has never been investigated. Therefore, comprehensive studies on biodegradation, in particular under anaerobic conditions, remain to be conducted to characterize the structure and metabolic potential of microbial ecosystems capable of hydrocarbon biodegradation.
In this study, the metagenome of an oil-immersed chimney in Guaymas Basin was analyzed to demonstrate the metabolic potential and ecological functions of the resident microbial community. Additionally, questions related to the anaerobic biodegradation of hydrocarbons are addressed: Do the microorganisms from this chimney community have anaerobic hydrocarbon degradation activities? If so, what kinds of hydrocarbons could potentially be degraded? Which groups of microorganisms are responsible for carrying out the degradation? Which processes are coupled or closely related to the degradation for electron transfer? Are there any features shared among different chimney-derived samples?

DNA EXTRACTION AND SEQUENCING
The sample under investigation, 4558-6, represented the outer layer of a black-smoker chimney with a preliminary venting fluid temperature of 190 °C (measured above the chimney prior to sampling), and was collected in Guaymas Basin (27°0.9′N, 111°24.6′W, depth = 2013 m) by the HUV Alvin (supported by the R/V Atlantis) in November 2009. The chimney (Figure A1) was kept at −20 °C immediately after sample collection, stored on dry ice during transportation, and stored at −80 °C in the laboratory until further analysis. Genomic DNA was extracted from the outer sections of all collected samples, where the highest DNA quantity was found. Isolation of DNA was conducted as described in a previous study (Wang et al., 2009). Metagenome pyrosequencing was performed according to company protocol on the 454 Life Sciences GS FLX system with a practical limit of 400 bp. All the sequences were deposited in the MG-RAST server.

METAGENOMIC SEQUENCE ANALYSIS
Coding regions within the metagenome were predicted using FragGeneScan (Rho et al., 2010), and the predicted sequence features were then annotated (e-value <1e-5) against the M5NR protein database. 16S rRNA genes were predicted in webMGA (Wu et al., 2011) using two methods, HMM and BLASTN (e-value <1e-5). To analyze the taxonomic content, all predicted gene features were subjected to blastx (Altschul et al., 1997) searches against the NCBI non-redundant (NR) database (e-value <1e-5, word size = 3, multi-hit window size = 40, and low-complexity filter on) and visualized in MEGAN (Huson et al., 2007). Each predicted sequence feature in the metagenome was assigned to a taxon when at least 75% of the BLAST hits of that query were from that specific taxon. Sequences with matches to the eggNOG (Powell et al., 2012), COG (Tatusov et al., 2003), and KEGG (Ogata et al., 1999) databases were retrieved to build functional categories and reconstruct metabolic pathways.
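The 75% assignment rule described above can be illustrated with a minimal sketch. The function name `assign_taxon`, the gene identifiers, and the hit lists are hypothetical and serve only to show the thresholding logic; this is not the MEGAN implementation, and it assumes each BLAST hit has already been mapped to a single taxon label.

```python
from collections import Counter


def assign_taxon(hit_taxa, min_fraction=0.75):
    """Return the taxon supported by at least `min_fraction` of the hits.

    `hit_taxa` is a list with one taxon label per BLAST hit (hypothetical
    input format). Returns None when there are no hits or when no single
    taxon reaches the threshold.
    """
    if not hit_taxa:
        return None
    taxon, count = Counter(hit_taxa).most_common(1)[0]
    return taxon if count / len(hit_taxa) >= min_fraction else None


# Toy input: taxon labels of the hits for three predicted gene features.
hits = {
    "gene_1": ["Archaeoglobus"] * 8 + ["Thermococcus"] * 2,     # 80% agree
    "gene_2": ["Desulfobacterium"] * 5 + ["Thermococcus"] * 5,  # 50% agree
    "gene_3": [],                                               # no hits
}
assignments = {gene: assign_taxon(taxa) for gene, taxa in hits.items()}
print(assignments)
```

Only `gene_1` passes the threshold in this toy input; features with split or missing hits remain unassigned rather than being forced into a taxon.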

ANNOTATION OF SEQUENCES WITH DEGRADATION/METABOLIC ACTIVITIES
Sequences that were annotated as enzymes in the degradation of cellulose and fatty acids were extracted and subjected to manual examination. Annotated KEGG pathways in this metagenome were visualized in MEGAN to demonstrate the metabolic potential of the microbial community. Enzymes involved in the degradation of a few organic chemicals (such as benzene, toluene, ethylbenzene, and xylenes) were collected from the Biocatalysis/Biodegradation Database (BBD) of the University of Minnesota (Gao et al., 2010), an online service that lists known degradation pathways for hundreds of chemicals and contaminants. A sequence-similarity search of this chimney metagenome against the BBD was conducted to identify candidate genes involved in the biodegradation of certain hydrocarbons. Sequences of previously reported anaerobic hydrocarbon degradation genes (Callaghan et al., 2010), benzylsuccinate synthase (bss) and alkylsuccinate synthase (ass), were retrieved from GenBank (accession no.: DQ826035, DQ826036, AJ001848, AB066263, AY032676, and AF113168) and searched against our Guaymas metagenome.

CLUSTERING ANALYSIS OF FUNCTIONAL CATEGORIES
Clustering of functional categories (with KEGG annotation, e-value < 1e-5, min. identity of 30%, and min. alignment length of 15 a.a.) was conducted in MG-RAST (using Ward's method with the Canberra distance metric on normalized values) among metagenomes as following:  (Biddle et al., 2008). All metagenomes were stored in the MG-RAST database.
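To make the distance metric concrete, the sketch below computes Canberra distances between normalized functional-category profiles and reports the closest pair. The sample names and counts are invented for illustration, sum-to-one normalization is an assumption (the normalization used by MG-RAST may differ), and the Ward agglomeration step itself is not implemented here.

```python
from itertools import combinations


def normalize(counts):
    """Scale raw category counts so they sum to 1 (assumed normalization)."""
    total = sum(counts)
    return [c / total for c in counts]


def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|).

    Terms where both values are zero are skipped (treated as 0).
    """
    dist = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            dist += abs(a - b) / denom
    return dist


# Hypothetical hit counts for four functional categories in three samples.
profiles = {
    "chimney_A": [120, 80, 40, 10],
    "chimney_B": [110, 90, 35, 15],
    "sediment_C": [30, 40, 150, 60],
}
normalized = {name: normalize(counts) for name, counts in profiles.items()}

# Pairwise distances; the smallest one identifies the most similar samples.
distances = {
    (a, b): canberra(normalized[a], normalized[b])
    for a, b in combinations(profiles, 2)
}
closest = min(distances, key=distances.get)
print(closest, round(distances[closest], 3))
```

Because each term is divided by the sum of the two values, low-abundance categories contribute as much as abundant ones, which is one reason this metric is sensitive to differences in rare functional categories.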

SEQUENCING SUMMARY AND COVERAGE
As shown in Table 1, a total of 512,830 reads and 196,377,880 bp of sequence data were initially generated by 454 pyrosequencing. After removing low-quality reads and technical duplicates, the remaining 504,915 reads (with an average length of 383 bp) were assembled into 49,055 contigs (totaling 26,241,624 bp, with an average length of 543 bp). A further 187,308 singletons with an average length of 367 bp could not be assembled. From this assembly, 52,366 gene features were predicted, 37,372 of which (71.4%) had known annotations. A rarefaction analysis of the final assembly was conducted based on the taxonomic information retrieved from annotation results in MG-RAST (Figure 1), which indicated that a reasonable number of individual genomes were sampled and covered in the metagenome.
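For readers unfamiliar with rarefaction, the following sketch evaluates the classic analytical rarefaction formula, in which the expected number of taxa in a random subsample of n sequences is the sum over taxa of 1 − C(N − N_i, n)/C(N, n). The abundance vector is invented, and the formula is used here as an assumption for illustration; the source does not state how MG-RAST computes its curve.

```python
from math import comb


def expected_taxa(abundances, n):
    """Expected number of taxa observed in a random subsample of n sequences.

    `abundances` holds the number of sequences assigned to each taxon.
    """
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total sequence count")
    denom = comb(total, n)
    # comb(total - a, n) is 0 when n > total - a, i.e. the taxon must appear.
    return sum(1 - comb(total - a, n) / denom for a in abundances)


# Hypothetical community: five taxa with very uneven abundances.
abundances = [50, 30, 15, 4, 1]
curve = [(n, expected_taxa(abundances, n)) for n in range(0, 101, 20)]
for n, taxa in curve:
    print(f"{n:3d} sequences -> {taxa:.2f} expected taxa")
```

A curve that flattens as n approaches the full sample size suggests that additional sequencing would recover few new taxa, which is the sense in which coverage is judged from Figure 1.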

TAXONOMIC DIVERSITY BASED ON 16S rRNA GENE PREDICTION
Using webMGA (Wu et al., 2011), 70 and 90 16S rRNA sequences were predicted with HMM and BLASTN, respectively (Table 2). In both results, around three quarters (72.9 and 76.7%, respectively) of all identified 16S rRNA sequences were from bacteria. Deltaproteobacteria and Euryarchaeota dominated the bacteria and archaea, respectively.

FUNCTIONAL CATEGORY HITS DISTRIBUTION
Major KEGG functional categories are listed in Table 4, ordered by the number of unique hits assigned to each category. Similar to a previous study on a hydrothermal vent chimney (Xie et al., 2011), genes involved in Recombination and Repair were among the abundant categories (Table 4). In addition, transposases were found to be highly enriched in this sample (Table A3), and genes participating in Chemotaxis and Flagellar Assembly were all identified with high abundance in this metagenome.

DEGRADATION OF REFRACTORY OM AND PETROLEUM HYDROCARBONS
Sequences coding for the complete degradation pathways of cellulose and fatty acids, as well as key enzymes involved in the breakdown of lipid and pullulan, were all identified in this metagenome (Table A2). Many of these identified enzymes had best hits of thermophilic origin (marked with an asterisk in Table A2). Peptidases were found to be of low abundance in this metagenome (data not shown). For anaerobic hydrocarbon degradation, sequences coding for benzylsuccinate synthase (bss) and alkylsuccinate synthase (ass), key enzymes in the fumarate addition pathway for the anaerobic oxidation of hydrocarbons (Callaghan et al., 2010), were identified (Table 5). In particular, gene candidates involved in BTEX (benzene, toluene, ethylbenzene, and xylenes) degradation were identified by searching against the BBD of the University of Minnesota (Gao et al., 2010) (Table 5). Additionally, to show the metabolic potential of this Guaymas sample in the degradation and remediation of organic contaminants (He et al., 2010), genes involved in the degradation of aromatic carboxylic acids (benzoate, phenylpropionate, and phthalate), chlorinated aromatics (2- and 4-chlorobenzoate, 2,4,5-trichlorophenoxyacetic acid), heterocyclic aromatics (carbazole and dibenzothiophene), nitroaromatics (nitrobenzene and nitrophenol), as well as a few other hydrocarbons (cyclohexane and tetrahydrofuran) were searched for within the Guaymas metagenome. Notably, this chimney sample appeared to have the potential to degrade toluene, ethylbenzene, and o-xylene (Table 5), yet no such evidence was detected for benzene or the other compounds.

COMPARATIVE METAGENOMIC ANALYSIS
Clustering of KEGG functional gene categories was conducted among different environmental samples (Figure 3). Guaymas chimney sample 4558-6 (4510962.3) clustered most closely with the chimney sample from Juan de Fuca (4510965.3). Both of these metagenomes were almost depleted in the categories of RNA family and of folding, sorting and degradation, while they showed higher abundance in the categories of signaling molecules and interaction, and cell communication. These features might be closely related to the specific environmental conditions where these two chimney samples were collected. When compared to metagenomes from different environments, 4558-6 from the Guaymas hydrothermal vent chimney was the only one with enzymes for the biodegradation of toluene, ethylbenzene, and o-xylene (Table A3), highlighting its metabolic potential for degrading hydrocarbons in its native oil-immersed condition.

DISCUSSION
Advances in sequencing technologies have made microbial diversity studies easier and more accurate. However, biases can be introduced during the sequencing and analysis of environmental sequences. For instance, bias arises when multiple reads are randomly produced for a single DNA fragment. Such biases might result in an inaccurate representation of the fragments and lead to misleading conclusions. Therefore, strict quality control of the sequenced reads is required. In this study, reasonable sequencing coverage was reached (Figure 1), which allowed us to investigate the taxonomic diversity as well as the metabolic potential of this Guaymas chimney sample.
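One simple way to reduce the duplicate-read bias described above is to collapse reads that start with the same bases and keep a single representative. The sketch below is a hedged illustration of that idea only: the function name `dereplicate`, the default 50-base prefix, and the keep-the-longest rule are assumptions, not the procedure reported for this dataset.

```python
def dereplicate(reads, prefix_len=50):
    """Keep one read (the longest) per group sharing the same leading bases.

    `reads` is an iterable of (read_id, sequence) pairs. Reads whose first
    `prefix_len` bases are identical (case-insensitive) are treated as
    technical duplicates of the same DNA fragment (assumed heuristic).
    """
    best = {}
    for read_id, seq in reads:
        key = seq[:prefix_len].upper()
        if key not in best or len(seq) > len(best[key][1]):
            best[key] = (read_id, seq)
    return list(best.values())


# Toy reads; a short prefix is used so the example stays readable.
reads = [
    ("r1", "ACGTACGTAA"),
    ("r2", "ACGTACGTAACC"),  # same start as r1, but longer
    ("r3", "TTGTACGTAA"),    # different start, kept as its own group
]
unique_reads = dereplicate(reads, prefix_len=8)
print([read_id for read_id, _ in unique_reads])
```

Keeping the longest read per group retains the most sequence for assembly, although a quality-based choice of representative would be an equally reasonable design.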

DIVERSITY AT DIFFERENT TAXONOMIC LEVELS
Based on both 16S rRNA sequences and taxonomic classifications from BLAST hits, bacteria made up about 60-75% of the community and archaea about 25-30%. Bacteria were dominated by Proteobacteria, while Euryarchaeota were most abundant among the archaea. So far, sulfate-reducing bacteria and archaea with either high or low growth temperatures (30-90 °C) have been isolated from vent chimneys (Burggraf et al., 1990; Audiffrin et al., 2003; Moussard et al., 2004). Sulfate reduction may occur at temperatures up to 110 °C in hot sediments from the Guaymas hydrothermal field (Jørgensen et al., 1992). In this study, a high proportion (at least 21.2%) of sequences from this chimney metagenome potentially originated from sulfate-reducing prokaryotes (SRP) (Table 3), a large and extremely diverse physiological group of anaerobic microorganisms. Sulfate-reducing bacteria and archaea are capable of degrading a wide range of organic substrates (Widdel and Bak, 1992; Widdel and Rabus, 2001), including the petroleum-based products discussed in this study. The presence of large numbers of SRP in the oil-immersed chimney suggested that the microbial community had the potential for hydrocarbon biodegradation, which was likely to be coupled with sulfate reduction. In addition, most of the retrieved sequences were estimated to originate from thermophilic microorganisms such as Archaeoglobus, heterotrophic Thermococcales, and Thermodesulfobacteriaceae, reflecting the influence of high temperature on the structure of the microbial community.

METABOLIC POTENTIAL FOR HYDROCARBON DEGRADATION
Deep-sea environments are characterized by a lack of easily biodegradable OM; accordingly, genes related to the degradation of refractory OM have been extensively recovered from the metagenomes of deep oceans (Martin-Cuadrado et al., 2007). Moreover, bacteria isolated from the deep sea have been shown to be capable of degrading refractory OM such as chitin and cellulose (Hedges et al., 2000; Vezzi et al., 2005; Wang et al., 2008). Here, the potential of a chimney microbial community for refractory OM degradation was evaluated. Genes coding for chitin degradation were not found in the metagenome, while all the genes involved in the degradation of cellulose and fatty acids, as well as a few key enzymes in the breakdown of lipid and pullulan, were identified, which probably reflected in part the input of terrestrial OM circulating in this Guaymas vent field. As most of the identified enzymes had a presumably thermophilic origin (Table 5 and Table A2), and thermophilic microorganisms were predominant in the chimney sample (Table 3), it was likely that these enzymes had some degree of heat tolerance. Additionally, microorganisms of this Guaymas chimney were predicted to have the potential to degrade toluene, ethylbenzene, and o-xylene. Notably, anaerobic hydrocarbon-degrading microorganisms have been successfully enriched and most extensively studied with the benzene-toluene-ethylbenzene-xylenes (BTEX) group of petroleum hydrocarbons (Stockton et al., 2009). For example, strain EbS7, isolated in pure culture from the sediments of Guaymas Basin, was reported to carry out ethylbenzene-dependent sulfate reduction (Kniemeyer et al., 2003). This Guaymas chimney sample was oil-immersed, so it was not surprising to find genes involved in the degradation of petroleum hydrocarbons (Table 5). Our data further highlighted the potential of this microbial community to use the fumarate addition pathway for the degradation of aromatic and aliphatic hydrocarbons.
Identification of genes coding for hydrocarbon degradation would advance the characterization of the potential sources of electrons and energy, as well as the roles this chimney microbial community plays in its native environment. Notably, this Guaymas chimney metagenome appeared to be the only one (among all the samples included in the comparative metagenomic analysis) with metabolic potential for hydrocarbon biodegradation (Table A3). Enzymes activated by oxygen (that is, active under aerobic conditions) were not identified in our metagenome, consistent with the strictly anaerobic conditions where this chimney sample was collected. Metagenomic analysis suggested that the degradation of a variety of petroleum hydrocarbons by SRP might play an important role in the energy metabolism of this chimney microbial community.

COMPARISON AMONG DIFFERENT SAMPLES
At present, only three metagenomes from hydrothermal vent chimneys are available: one from Juan de Fuca (Xie et al., 2011), one from Lost City (Brazelton and Baross, 2009), and our chimney sample 4558-6 from Guaymas Basin. The whole-metagenome comparison (Figure 3) showed that Lost City was not clustered next to the other two chimney samples, which could be because the Lost City sample was collected from a white carbonate chimney (CC) rather than a black sulfide chimney (SC) like the other two samples. Considering that the dominant energy sources for SC and CC are significantly different (metal sulfides in SC and reduced volatiles such as hydrogen and methane in CC), the genomic differences between the two types of chimneys become clear, as different energy metabolisms are promoted accordingly. Moreover, the venting fluid of Lost City had a pH of 9-11 and temperatures lower than 90 °C, whereas that of the Juan de Fuca sample was acidic and hot (temperature around 310 °C); the vent fluid of our Guaymas chimney sample represented an intermediate temperature regime (190 °C). Differences in environmental factors (such as temperature, pH, and physicochemical gradients) between white and black chimneys could also have a significant influence on the distribution of functional gene categories, and as a result the three chimney samples were not clustered next to each other. On the other hand, when looking at specific gene categories, metagenomes from hydrothermal vent chimneys (Guaymas Basin, Juan de Fuca, and Lost City) contained a larger collection of genes involved in recombination and repair, chemotaxis, and flagellar assembly, as well as transposases. This could be a chimney-specific feature, reflecting adaptations of microorganisms and the microbial community to the extreme, fluctuating chemical and physical conditions that characterize deep-sea hydrothermal vent chimneys.