Is the Genetic Landscape of the Deep Subsurface Biosphere Affected by Viruses?

Viruses are powerful manipulators of microbial diversity, biogeochemistry, and evolution in the marine environment. Viruses can directly influence the genetic capabilities and the fitness of their hosts through the use of fitness factors and through horizontal gene transfer. However, the impact of viruses on microbial ecology and evolution is often overlooked in studies of the deep subsurface biosphere. Subsurface habitats connected to hydrothermal vent systems are characterized by constant fluid flux, dynamic environmental variability, and high microbial diversity. In such conditions, high adaptability would be an evolutionary asset, and the potential for frequent host–virus interactions would be high, increasing the likelihood that cellular hosts could acquire novel functions. Here, we review evidence supporting this hypothesis, including data indicating that microbial communities in subsurface hydrothermal fluids are exposed to a high rate of viral infection, as well as viral metagenomic data suggesting that the vent viral assemblage is particularly enriched in genes that facilitate horizontal gene transfer and host adaptability. Therefore, viruses are likely to play a crucial role in facilitating adaptability to the extreme conditions of these regions of the deep subsurface biosphere. We also discuss how these results might apply to other regions of the deep subsurface, where the nature of virus–host interactions would be altered, but possibly no less important, compared to more energetic hydrothermal systems.


INTRODUCTION
Viruses play a crucial role in marine biogeochemical cycles, microbial ecology, and evolution. Several recent reviews (Suttle, 2005; Rohwer and Thurber, 2009; Kristensen et al., 2010) have highlighted our current understanding of the viral impact on the marine environment. Generally, viruses influence the marine environment in three ways: first, through altering biogeochemical cycles by "shunting" the microbial loop through lysis of hosts (Suttle, 2005); second, by modifying the diversity and abundance of their hosts, particularly those that are most abundant, through what is dubbed "kill the winner" (Thingstad and Lignell, 1997); and third, by altering the genetic content of their hosts. Via the latter mechanism, viruses can fundamentally alter the course of evolution in their microbial hosts.
One means by which viruses can manipulate the genetic content of their hosts is by facilitating the process of horizontal gene transfer through transduction. This occurs when, in the process of virion synthesis, host genetic material is also incorporated into the viral genome. The newly synthesized viruses can then transfer the previous host's genetic material into a new host upon infection. It has been suggested that this could be an important mechanism for horizontal gene transfer in the marine environment: one study estimated that up to 10^14 transduction events per year occur in Tampa Bay Estuary alone (Jiang and Paul, 1998). Additionally, viral-like transducing particles known as gene transfer agents (GTAs) are increasingly recognized as an important mechanism for horizontal gene transfer: it has been estimated that GTA transduction rates are over one million times higher than previously reported viral transduction rates in the marine environment (McDaniel et al., 2010). GTAs and their potential impact on the deep subsurface biosphere will be discussed below.
Viruses may be particularly important in facilitating horizontal gene transfer between phylogenetically distinct lineages. Fully sequenced genomes of archaea and bacteria indicate that horizontal gene transfer does occur between lineages that are distantly related, including between domains. Overall, it is thought that interdomain horizontal gene transfer is a relatively common phenomenon, with approximately 3% of genes in bacterial genomes and 4-8% of genes in archaeal genomes involved in transfer events (Yutin and Koonin, 2009). The genome of Methanosarcina mazei, for example, contains many genes of possible bacterial origin including a bacterial chaperonin system (Deppenmeier et al., 2002). Thermophiles appear to have a particularly high percentage of transferred genes (Beiko et al., 2005; Yutin and Koonin, 2009), though it is unclear if this is due to a propensity for acquiring distantly related genes or to the close proximity of archaea and bacteria in high temperature habitats like vents. For example, up to 16.2% of the genes in Aquifex aeolicus had a best hit to archaeal species (Deckert et al., 1998; Yutin and Koonin, 2009), which is much higher than the average for bacteria. The genome of Thermotoga maritima, isolated from geothermally heated sediment, contains genes related to oxygen reduction that are likely to have been transferred through a single transfer event from a member of the Thermococcales (Nelson et al., 1999). Several Thermococcales strains have been found to contain mobile genetic elements or virus-like particles (Prieur et al., 2004), and thus there is a strong possibility that viruses are responsible for these transfer events. Viral transduction and GTAs are attractive as possible mechanisms for transfer events such as these because other mechanisms of genetic transfer do not seem capable of explaining the observed genomic similarities among distantly related species. Conjugation is a specialized process limited to particular lineages, and transformation often transfers only small amounts of genetic material, rather than multi-gene cassettes. While viral host range varies depending on viral type, some, especially GTAs, are thought to be capable of infecting distantly related hosts.
Viruses can manipulate the genetic content of their hosts not only by horizontal gene transfer, but also by expression of viral-encoded genes during the course of infection. The life cycle of lysogenic viruses involves a stage in which the virus integrates its own genome into the host genome, lying latent within the host genome for several generations until it is induced by an environmental stressor or other signal. Many integrated prophage have been found to encode what are termed fitness factors or lysogenic conversion genes: genes expressed by the prophage that can promote host fitness (Paul, 2008). These fitness factors can enhance host survivability in various environmental conditions. The cholera toxin genes in Vibrio cholerae, for example, are expressed by a filamentous bacteriophage that has integrated into the V. cholerae genome (Waldor and Mekalanos, 1996). Studies have shown that prophage genes are upregulated in response to changing environmental conditions (Smoot et al., 2001) or during biofilm formation (Whiteley et al., 2001). Even cryptic prophage, which have been integrated in the host genome for so long that they have decayed and are no longer active as phage, can carry genes that improve survivability during osmotic, oxidative, and acid stresses, and influence biofilm formation. Viruses infecting marine cyanobacteria, or cyanophage, have been found to carry genes for both photosystems I and II (Mann et al., 1993; Lindell et al., 2004, 2005; Millard et al., 2004; Sullivan et al., 2005; Sharon et al., 2009). These genes are expressed during viral infection, and are thought to enhance phage fitness by supplementing the host's photosynthetic machinery (Lindell et al., 2005).
While a large amount of work has been dedicated to understanding the viral impact on the marine environment, the implications for the deep subsurface biosphere have been barely explored. Table 1 summarizes current research on viral abundance, production, and diversity in the deep subsurface biosphere, as well as shallow sediments, methane hydrates, and the deep-water column to provide a basis for comparison.
As can be seen in Table 1, only a few studies have focused on deep subsurface viruses. Most research to date in the deep ocean has focused on viral production and abundance in the water column and surface sediments. None of these studies, to our knowledge, has focused on the genetic and evolutionary implications of viral infection on the deep subsurface biosphere. Yet the viral impact on the evolution of bacterial and archaeal hosts could have even more profound implications in the deep subsurface than in the upper water column. A primary reason for this is that it is generally accepted that lysogeny is a more important viral lifestyle under suboptimal conditions, when host or nutrient abundance is low (Paul, 2008). This has been demonstrated in laboratory conditions, in which lysogenic viruses have a competitive advantage over lytic viruses in nutrient-limited media (Levin and Lenski, 1983). In natural populations, lysogeny becomes a more common lifestyle during seasons or in regions where host abundance is low (Jiang and Paul, 1998; McDaniel et al., 2002; Williamson et al., 2002; Weinbauer, 2004). Therefore, lysogeny may be a favored lifestyle in the deep subsurface due to difficulties in finding a new host after newly synthesized virions are released. In deep sediments, the main difficulties are likely to be low cell abundance and immobility. In extreme environments, lysogeny may be favored because of reduced viability of viral particles outside the cell. As shown in Table 1, lysogeny appears to be a common lifestyle in sediments and hydrothermal vents. If lysogeny is indeed more common in the deep subsurface biosphere, this would increase the proportion of cells harboring prophage, potentially increasing the number of cells expressing fitness factors encoded by the prophage.

Table 1 | Summary of previous work on viral abundance, activity, and diversity in various environments of the deep subsurface biosphere, deep ocean, and sediments.

Environment | Work on viruses to date | Reference
Surface marine sediments | High viral production in benthic ecosystems: may be responsible for up to 80% of cell mortality, thus releasing large amounts of carbon through the "viral shunt." Viral diversity in sediments is fairly high, and showed a higher incidence of lysogenic than lytic phages | Danovaro et al. (2008), Middelboe et al. (2006, Siem-Jørgensen et al.
Deep-water column | Viral abundance generally tracks bacterial abundance, but the virus:cell ratio at depth varies. In some areas, the ratio increases with depth. Metagenomic work characterizing viral diversity found most viral sequences had matches to bacteriophages in the Podo-, Sipho-, and Myoviridae, with a few hits to eukaryotic sequences | Hara et al. (1996), Parada et al. (2007), Steward and Preston (2011)
The impact of viruses on the genetic landscape of their hosts may be particularly pronounced in diffuse flow fluids of hydrothermal systems. In these environments, high temperature hydrothermal fluid mixes with seawater both in the subsurface and above the seafloor, resulting in gradients of temperature, pH, and chemical and mineralogical composition (Baross and Hoffman, 1985). A schematic representation of the vent environment, and the accompanying gradients, is shown in Figure 1A. The constant fluid flux through these gradients enables potentially frequent contact between diverse microbial communities.
The microorganisms inhabiting diffuse flow fluids are tremendously diverse in terms of taxonomy, metabolism, and thermal regime. Studies of the population structures of archaea and bacteria in diffuse fluids have found thousands of phylotypes within both domains (Huber et al., 2007). The abundance of reduced compounds in vent fluids allows microbial metabolisms to take advantage of a wide range of energy sources, including hydrogen, reduced iron, sulfur, as well as organic compounds. There is, therefore, a wide range of potential hosts for viruses to infect in diffuse flow fluids.

Figure 1 | ... Edwards et al. (2011), Huber et al. (2003, and Baross and Hoffman (1985).
In the extreme and dynamic conditions of hydrothermal vents, genetic exchange could provide a fitness advantage to any organism able to acquire a novel function that is useful in a changing environment. Evidence already exists suggesting that vent habitats are conducive to genetic exchange. As mentioned above, genomic studies have shown that rates of horizontal gene transfer between thermophiles are higher than between other groups, including between domains. This may be due to the predominant physiology of vent organisms: microbial communities often form biofilms on the surfaces of or within vent structures (Jannasch and Wirsen, 1981; Schrenk et al., 2003). Subsurface sediments and crustal habitats also provide abundant surfaces for biofilm formation. The high cell density in a biofilm increases rates of host contact, thus facilitating higher rates of viral infection, and can also foster genetic exchange through transformation. A recent metagenomics study found that transposase sequences in biofilms in the Lost City hydrothermal field are 10 times more abundant than in metagenomes from other environments (Brazelton and Baross, 2009). If this result is characteristic of subsurface biofilms, then genetic exchange in the deep subsurface biosphere is likely to be of great importance.
It has been suggested that viruses may act as a reservoir of genes that can be used as a mechanism for adaptation by their cellular hosts (Goldenfeld and Woese, 2007), effectively expanding the "pan-genome" to include the viral assemblage. Here, we argue that viral-mediated genetic exchange is particularly important as a means to adapt to frequently changing conditions in the diffuse flow vent environment. We will review what is known about viruses in the vent environment, and examine what evidence exists that viruses play a role in modifying the genetic content (and therefore the fitness) of their cellular hosts. We conclude by discussing the larger implications for viruses in the deep subsurface biosphere in general.

VIRUSES AS A "GENETIC REPOSITORY" IN HYDROTHERMAL VENT SYSTEMS

EVIDENCE FOR VIRAL ACTIVITY IN DIFFUSE FLOW SYSTEMS
The few studies that have focused on viruses in diffuse flow vent environments have indicated that viruses play an important role in influencing microbial ecology in these ecosystems. One study quantified viral abundance in diffuse flow fluids, finding that on average there were approximately 10^7 viral-like particles (VLPs) per milliliter of fluid, or about 10 times as many viruses as cells, which is comparable to other marine ecosystems (Ortmann and Suttle, 2005). Another study, focusing on VLP counts at shallow hydrothermal vents, found that viruses were about five times more abundant than cells, and that VLP counts increased with distance from the vents (Manini et al., 2008). The authors suggested that one possible explanation for this trend was a change in viral lifestyle from lysogenic to lytic, resulting in an increase in the number of apparent viruses. Additionally, diffuse fluids from 9°North, a hydrothermal vent field on the East Pacific Rise, have been found to harbor a higher incidence of inducible prophage than nearby ambient seawater, indicating that lysogeny is a more common lifestyle for vent phage than those in other marine environments (Williamson et al., 2008). A viral metagenome at Hulk vent on the Juan de Fuca Ridge showed that vent viruses are relatively diverse and have the potential to infect hosts from a wide range of taxonomic groups and thermal regimes (Anderson et al., 2011).
Genome sequences of bacteria and archaea may also show evidence of extensive viral activity in diffuse flow hydrothermal vents. Within bacterial and archaeal genomes are regions called clustered regularly interspaced short palindromic repeats (CRISPRs), which are thought to facilitate the immune response to viral infection (Barrangou et al., 2007; Brouns et al., 2008; Sorek et al., 2008; van der Oost et al., 2009; Horvath and Barrangou, 2010; Labrie et al., 2010; Marraffini and Sontheimer, 2010). These loci consist of a series of short repeats, about 20-50 bp long, interspersed with spacer regions, each about 25-75 bp in length. These spacer regions are created to match a short sequence on an invading element, such as a virus or a plasmid. The newly synthesized spacers are then inserted between direct repeats on the CRISPR locus (Makarova et al., 2003; Bolotin et al., 2005; Haft et al., 2005; Mojica et al., 2005; Pourcel et al., 2005; Marraffini and Sontheimer, 2008; Hale et al., 2009). If a virus or plasmid invades a cell that possesses a CRISPR spacer matching a sequence on that invader, the CRISPR system will mobilize the immune response. This occurs through the formation of a Cascade complex from nearby Cas (CRISPR-associated) genes in conjunction with small RNAs derived from the CRISPR spacers, which recognize and bind target DNA (Jore et al., 2011). The invading nucleic acid is cleaved as a result (Garneau et al., 2010).
Each of these CRISPR regions therefore acts as a record of previous viral infection, with each spacer thought to represent at least one independent infection event in the history of that strain. Interestingly, it has been observed that thermophilic strains, on average, have a higher number of CRISPR loci in their genomes than mesophiles or psychrophiles (Makarova et al., 2003; Anderson et al., 2011; Figure 2). While there is no definitive explanation for the abundance of CRISPRs in thermophiles, this does indicate that viral infection plays an important role in the evolution and ecology of thermophilic microbial communities. Moreover, the CRISPR immune mechanism itself is unique among viral immunity systems in that it responds in a sequence-specific manner to the invasion of foreign genetic material, rather than through prevention of phage adsorption, blocking phage DNA entry, or random restriction modification. Thus, the abundance of CRISPRs in thermophiles indicates that at least the entry, if not the successful takeover, of foreign genetic material between different hosts is a relatively common phenomenon in high temperature environments. CRISPR loci can also be found in the genomes of bacteria and archaea isolated from other environments in the deep ocean and sediments (Table 2), though there is a clear correlation with temperature. This may serve as evidence that viruses play a particularly important role in the evolution of microbes in high temperature environments. This includes diffuse flow hydrothermal vents but could also include other regions in the subsurface with broad temperature gradients.
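To make the repeat-spacer layout of a CRISPR locus concrete, the following minimal Python sketch recovers the spacers from a toy locus. It assumes the repeat sequence is already known and is copied without mismatches; the function name `extract_spacers` and all sequences are hypothetical illustrations, and this is not how dedicated CRISPR-detection tools identify loci.

```python
def extract_spacers(locus: str, repeat: str) -> list[str]:
    """Return the sequences found between consecutive exact copies of `repeat`.

    Toy illustration only: assumes the repeat is known in advance and
    is copied without mismatches.
    """
    spacers = []
    pos = locus.find(repeat)
    while pos != -1:
        spacer_start = pos + len(repeat)
        next_pos = locus.find(repeat, spacer_start)
        if next_pos == -1:
            break  # no further repeat, so nothing closes another spacer
        spacers.append(locus[spacer_start:next_pos])
        pos = next_pos
    return spacers


# Hypothetical sequences: a 20 bp repeat and two spacers (25 and 27 bp).
REPEAT = "ATTGAAAGCCGTTCGAGGAC"
SPACER_1 = "TTACGGATCCATGCAAGTCTAGGCA"
SPACER_2 = "CCGTAGGCTTAACGTACGATTGACCTG"
LOCUS = "AAAA" + REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT + "TTTT"

if __name__ == "__main__":
    for i, spacer in enumerate(extract_spacers(LOCUS, REPEAT), start=1):
        print(f"spacer {i}: {spacer} ({len(spacer)} bp)")
```

Each recovered spacer corresponds, in the interpretation given above, to at least one past encounter with an invading element, so counting spacers gives a rough lower bound on recorded infection events for that locus.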

Frontiers in Microbiology | Extreme Microbiology
A further piece of evidence for the important role of viruses, one that is not restricted to high temperature organisms, is the presence of prophage in the genomes of sequenced archaea and bacteria. As mentioned above, lysogenic phage can provide supplementary metabolic functions while integrated into the genome as prophage. Table 2 lists the bacteria and archaea from diffuse flow fluids and other deep ocean habitats that have been found with integrated prophage in their genomes. Two of the listed bacterial isolates, one from the deep-water column and one from marine sediments, each possess seven prophage in their genomes. This represents a significant amount of genetic material. Considering the tendency of bacteria to select for faster reproduction rates and thus for smaller genomes (Carlile, 1982), the presence of seven prophage regions on the genome represents a potential fitness cost (through slower replication rates) that must be offset by a fitness benefit, such as the presence of fitness factors expressed by the prophage. Considering the prevalence of prophage in the genomes of isolates from diffuse flow fluids and deep sediments, it would not be surprising if prophage are present in most subsurface organisms.

EVIDENCE FOR VIRAL-MEDIATED HORIZONTAL GENE TRANSFER
Having established that viruses are abundant and likely to be active in diffuse flow hydrothermal fluids, we now turn to evidence that viruses mediate the exchange of genes between hosts in these regions. Here, we will do this primarily through analysis of a marine hydrothermal vent viral metagenome, or virome, collected from Hulk vent in the Main Endeavor Field on the Juan de Fuca Ridge, previously described in Anderson et al. (2011). We examine evidence for viral-mediated horizontal gene transfer in the vent environment first by searching the marine vent virome for genes related to lysogeny and gene insertion; and then by using comparative metagenomics to determine which genes are selected to be maintained in the viral gene pool.

GENES RELATED TO LYSOGENY AND GENE INSERTION
As mentioned above, a higher proportion of bacteria and archaea in diffuse flow hydrothermal vents appear to contain inducible prophage within their genomes than bacteria and archaea from other marine environments (Williamson et al., 2008), indicating that lysogeny is an important lifestyle for viruses in diffuse flow fluids. To test whether this finding is supported by publicly available metagenomic data, we searched for genes associated with lysogeny in the marine vent virome from Hulk and compared the number of matches with that of metagenomes from other environments. It is important to note that while most lysogenic phage are integrated into cellular genomes as prophage, the Hulk vent sample would have experienced substantially colder temperatures as well as decreased pressures prior to the filtration steps that removed cells and captured viral-sized particles. This environmental shock may have induced many of the prophage within the microbial community, resulting in their being captured in the viral size fraction. While this has not yet been demonstrated experimentally, it is an important consideration in the sampling of metagenomes from relatively extreme environments.

Table 2 note: CRISPR loci were identified with CRISPRFinder (Grissa et al., 2007a,b) and prophage were identified with ProphageFinder (Bose and Barber, 2006). Modified from a table in Orcutt et al. (2011).
To determine the overall abundance of genes related to lysogeny, we first created a database of lysogeny-associated proteins with Pfam seed sequences (Finn et al., 2010). These proteins include phage integrases, repressors, and antirepressors expressed during the prophage stage, regulatory proteins that trigger the switch between the lysogenic and lytic stages, and proteins involved in phage integration and excision. Using BLASTX, we queried this database with the marine vent virome, as well as with a set of other viral and cellular metagenomes from the MG-RAST database (Meyer et al., 2008).
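One way to summarize a search of this kind is the percentage of reads in each metagenome that have at least one significant match to the lysogeny database. The sketch below assumes the BLASTX results are available as tab-separated rows in the standard 12-column BLAST tabular layout (query ID in the first column, e-value in the eleventh). The function name, the e-value cutoff, and the example rows are illustrative assumptions, not the parameters of the original analysis.

```python
def percent_reads_with_hit(blast_lines, total_reads, max_evalue=1e-5):
    """Percentage of reads with at least one hit at or below `max_evalue`.

    `blast_lines` is an iterable of tab-separated rows in the 12-column
    BLAST tabular layout: column 1 is the query (read) ID and column 11
    is the e-value. The default cutoff is an arbitrary placeholder.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    reads_with_hit = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        read_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            reads_with_hit.add(read_id)  # count each read only once
    return 100.0 * len(reads_with_hit) / total_reads


# Hypothetical rows: read_1 hits two domains, read_2 has only a weak hit.
EXAMPLE_ROWS = [
    "read_1\tintegrase\t62.5\t40\t15\t0\t1\t120\t10\t49\t1e-12\t65.1",
    "read_1\trepressor\t55.0\t40\t18\t0\t1\t120\t5\t44\t3e-07\t48.9",
    "read_2\tintegrase\t30.0\t20\t14\t0\t4\t63\t80\t99\t0.5\t22.3",
]

if __name__ == "__main__":
    print(percent_reads_with_hit(EXAMPLE_ROWS, total_reads=4))
```

Counting each read once, rather than each hit, keeps a read that matches several lysogeny domains from inflating the percentage, which matters when comparing metagenomes of different sizes.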
Metagenomes with the highest percentages of sequencing reads matching lysogeny domains were cellular rather than viral (Table 3). These hits were most likely matches to prophage incorporated into cellular genomes. Two of the top three metagenomes were sampled from environments that are considered "extreme," namely, a highly acidic mine drainage and the high-pH, high temperature Lost City hydrothermal field. This supports the hypothesis stated earlier that lysogeny becomes a more common viral lifestyle in extreme environments because viruses have a higher chance of survival as prophage than as virions under harsh conditions. Also in the top three was the whale fall cellular metagenome. While it is unclear exactly why this metagenome had a relatively high percentage of lysogeny-related domains, one possibility is that since whale falls are relatively rare events, the organisms colonizing whale falls must endure extreme periods of relative starvation, followed by periods of plenty. Any viruses infecting these taxa would benefit from a lysogenic lifestyle in order to survive these extreme periods of starvation, during which cells would most likely either be dormant or replicating extremely slowly.
Fourth on the list was a cellular metagenome sampled from farm soil, in which lysogeny may be the favored lifestyle for viruses as a result of the difficulties in encountering a new host within the sediment matrix, an environment in which mobility is likely to be significantly impaired. However, sequences matching lysogeny domains were not particularly abundant in cellular metagenomes derived from deep Peru Margin sediments, though some depth horizons had a higher abundance than others (Table 3). At this point it is unclear whether this is due to differences in viral lifestyle or simply to a lower abundance of viruses in Peru Margin sediments.
In contrast, the sequences matching lysogeny domains in the Yellowstone hot springs and marine vent viromes (fifth and sixth on the list, respectively) are most likely derived from lysogenic viruses that entered the lytic stage due to an environmental stressor, possibly resulting from the sampling process. Interestingly, these two viromes, both sampled from high temperature environments, had the highest proportion of sequencing reads matching lysogeny domains of all the viromes analyzed here. In contrast, viromes from more temperate environments, such as the Bay of British Columbia or the Sargasso Sea, had a much lower abundance of lysogeny sequences. This parallels the relative abundance of lysogeny-related domains in extreme cellular metagenomes, and again may indicate that natural selection favors lysogenic viruses in extreme environments. The lysogeny-associated domains chosen for this analysis were selected from genes that are uniquely associated with prophage. However, this would exclude genes that serve other roles in both viruses and cells but may also be crucial in facilitating horizontal gene transfer. This includes genes required for integration of DNA into host or viral genomes, such as DNA ligases. Interestingly, 1.25% of the sequencing reads in the marine vent virome had a match to a DNA ligase, which is almost 10 times higher than any of the 17 other cellular or viral metagenomes analyzed here (Table 3). Moreover, these ligases were especially enriched in the subset of the marine vent virome considered more likely to be "viral", that is, those reads that were assembled into contigs with an average coverage of at least eight, or within contigs in which a majority of reads were categorized as either "unknown" or "viral" by MG-RAST (Anderson et al., 2011). Therefore, the abundance of ligases appears to be distinctly viral in character.
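The "more likely viral" criterion described above amounts to a simple predicate over contigs, sketched here in Python. The `Contig` class, the label strings, and the reading of "majority" as a strict majority are assumptions made for illustration; the actual filtering procedure is the one described in Anderson et al. (2011).

```python
from dataclasses import dataclass, field

VIRAL_LABELS = {"viral", "unknown"}  # hypothetical label strings
MIN_COVERAGE = 8.0  # "average coverage of at least eight"


@dataclass
class Contig:
    """Hypothetical container for one assembled contig."""
    name: str
    avg_coverage: float
    read_labels: list = field(default_factory=list)  # one label per read


def is_likely_viral(contig: Contig) -> bool:
    """True if coverage is >= 8 or most reads are labeled viral/unknown."""
    if contig.avg_coverage >= MIN_COVERAGE:
        return True
    if not contig.read_labels:
        return False
    n_viral = sum(label in VIRAL_LABELS for label in contig.read_labels)
    return n_viral > len(contig.read_labels) / 2  # strict majority (assumed)


if __name__ == "__main__":
    contigs = [
        Contig("high_cov", 9.0, ["bacterial"]),
        Contig("mostly_viral", 2.0, ["viral", "unknown", "bacterial"]),
        Contig("mostly_cellular", 2.0, ["bacterial", "archaeal", "viral"]),
    ]
    for c in contigs:
        print(c.name, is_likely_viral(c))
```

Either condition alone is enough to retain a contig, so a high-coverage contig is kept even when most of its reads carry a cellular annotation.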
Most of the potential ligase sequences in the marine vent virome had matches to NAD-dependent ligases rather than ATP-dependent ligases. ATP-dependent ligases are fairly widespread among eukaryotes, bacteria, and archaea, while NAD-dependent ligases are generally characteristic of bacteria, though they have been seen in some viruses, eukaryotes, and archaea (Doherty and Suh, 2000; Tomkinson et al., 2006). Furthermore, they appear to be characteristic of marine rather than terrestrial metagenomes, for unknown reasons. To determine whether the ligases in the marine vent virome were most closely related to ligases from a particular group of organisms, we constructed a phylogenetic tree of NAD-dependent DNA ligases including ligase-matching sequences from the marine vent virome, using a reference tree of previously characterized DNA ligases as a constraint (Figure 3). The reference viral ligases do not necessarily group together on this tree, suggesting there is not a viral ligase type that is necessarily distinct from bacterial, archaeal, or eukaryotic ligases. Similarly, the sequences from the marine vent virome are scattered across the tree, with some closely related to ligases found in viruses, and others more closely related to bacterial or archaeal ligases. Yutin and Koonin (2009) have previously highlighted the non-monophyletic nature of viral DNA ligases. This may attest to the fluid nature of viral genomes in that they frequently pick up genes from their hosts and may also be a further indication of the high diversity of the marine vent viral assemblage. While several vent virome sequences were grouped with DNA ligases found in viruses, the majority grouped most closely with a DNA ligase from Rickettsia felis, a Gram-negative bacterium in the Alphaproteobacteria group. R. felis is closely related to SAR11, one of the most common bacterial lineages in marine environments. While SAR11 sequences were not common in the marine vent virome, it is possible that ancestral viruses in the marine vent environment had acquired ligases from SAR11 or similar groups, which then became more abundant in the viral gene pool due to positive selection.
While it is unclear exactly what ecological role these ligases play, their high abundance suggests that they play a uniquely important role in the vent viral assemblage. DNA ligases repair double-stranded breaks in DNA by catalyzing the synthesis of phosphodiester bonds between 5′-phosphoryl and 3′-hydroxyl groups (Lehman, 1974). DNA ligases are thus important for DNA replication, recombination, and repair across all domains as well as viruses. However, the high abundance of DNA ligases in the vent virome is unusual and may reflect a specific role in the vent ecosystem, possibly in the integration of viral genomes into the host genome through recombination. It is therefore plausible that the abundance of these genes is indicative of prevalent horizontal gene transfer in these regions of the deep subsurface biosphere.
Finally, GTAs may act as an important mechanism for viral-mediated horizontal gene transfer. These phage-like particles were originally discovered in marine Rhodobacteria, but have since been discovered in various other bacterial and archaeal species, particularly in the marine environment (Lang and Beatty, 2000; Matson et al., 2005; Stanton, 2007; Biers et al., 2008; Leung et al., 2010). GTAs randomly incorporate segments of the host genome into a viral capsid, then transfer this to new hosts, including phylogenetically unrelated organisms, without resulting in lysis of the host cell. It has been suggested that GTAs are defective phage (Lang and Beatty, 2000; Matson et al., 2005; Stanton, 2007); if so, it would seem that GTAs have effectively lost their parasitic nature and have instead been usurped by the host for the purposes of gene exchange. While the only sequenced GTA genes are from the Rhodobacter capsulatus (Lang and Beatty, 2000) and Brachyspira hyodysenteriae (Matson et al., 2005) GTAs, the marine vent virome included six sequencing reads with matches to known GTA genes. While these are few, the scarcity of sequences from GTAs in public databases to date prohibits a larger-scale search for GTAs in viral metagenomes. Nevertheless, GTAs may play a significant role in transferring genes between hosts in the marine environment and appear to be present in diffuse flow fluids. In the dynamic, extreme conditions of the deep subsurface, selection may be particularly strong for cells that harbor GTAs as a mechanism for obtaining advantageous genes. It has been hypothesized that a large portion of marine viromes may consist of GTAs carrying poorly conserved bacterial genes (Kristensen et al., 2010) and may thus contribute a large portion of the poorly conserved "cloud" of genes in the viral gene pool.

Figure 3 | ... Metagenomic reads are colored red. All ligase protein sequences were obtained from NCBI (accession numbers listed). Trees were constructed in RAxML by incorporating metagenomic sequences into a constraint tree of reference sequences based on the phylogeny of Yutin and Koonin (2009). Trees imaged with TreeViewX.

ENRICHMENT FOR FUNCTIONAL GENES IN THE VENT VIROME
This "cloud" of genes, or genetic reservoir, consists of an overlapping pool of genes derived from both viruses and cells. Genes in the viral genetic reservoir are expressed in cellular hosts through either horizontal gene transfer or prophage insertion, and then maintained in the gene pool through positive selection. Thus, the genes maintained in the viral metagenome, in addition to genes for viral synthesis, packaging, and maintenance, are likely to consist of non-housekeeping genes that provide a selective advantage in the event of a shift in environmental conditions. This is expected to be particularly the case in the gradient-dominated, dynamic vent environment.
An initial example of this can be found through analysis of assembled contigs of the marine vent virome from Hulk (assembly and contig analysis described in Anderson et al., 2011). Among the lysogeny domains matching sequences in the marine vent virome was the XerD tyrosine recombinase protein, commonly associated with lysogenic phage and transposases. This gene was assembled onto a contig that also contains the universal stress protein UspA. UspA is known to be phosphorylated in response to stasis stress in Escherichia coli (Nyström and Neidhardt, 1994;Freestone et al., 1997), and thus is induced when growth conditions are not optimal. It would be advantageous to both host and prophage for the prophage-encoded protein to induce a response in the host when growth conditions are poor. As mentioned, prophage have been found to encode genes to enhance host survivability in stressful conditions, and this sequence may be one example of many such instances in the vent ecosystem.
While this is one example of a potential fitness factor encoded by a lysogenic phage, comparative metagenomics can give a broader picture of which functional genes are enriched in the viral fraction. To do this, we compared the relative enrichment of several gene categories in the marine vent virome and a cellular metagenome sampled from a sulfide chimney at Mothra hydrothermal field on the Juan de Fuca Ridge (Xie et al., 2011; Figure 4). Each metagenome was analyzed through the MG-RAST pipeline and annotated using the KEGG Orthology database (Meyer et al., 2008). For this analysis, as before, we used only the vent virome subset that we consider to be convincingly viral: that is, sequencing reads assembled into contigs with an average coverage of at least eight or composed of reads identified as "unknown" or "viral" by MG-RAST. The virome was relatively enriched in gene categories such as replication and repair, nucleotide metabolism, and translation, which are important for the synthesis of viral genetic material during the lytic stage. Interestingly, however, the virome was particularly enriched in genes related to energy metabolism, with nearly 12% of all identifiable sequencing reads matching this category in the virome, compared to 7% in the cellular metagenome. Genes in this category include those used for oxidative phosphorylation, photosynthesis, methane metabolism, carbon fixation, and sulfur and nitrogen metabolism. It seems plausible that these genes are maintained in the viral genetic reservoir through positive selection. For example, if a cell were flushed into a region in which its conventional electron donor was scarce, the acquisition of a gene allowing it to utilize an alternative electron donor, via either transduction or prophage expression, would increase the fitness of that cell. The relative enrichment of genes related to energy metabolism in the virome may be evidence of this type of selection.
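The comparison described above rests on a simple calculation: the share of annotated reads falling in each functional category, computed separately for the virome and the cellular metagenome. The following is a minimal sketch of that arithmetic, not the pipeline used in the analysis. The function names and the read counts are hypothetical; the counts were chosen only so that energy metabolism accounts for 12% of virome reads and 7% of cellular reads, as reported in the text.

```python
from typing import Dict


def category_fractions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-category read counts into fractions of all annotated reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no annotated reads")
    return {category: n / total for category, n in counts.items()}


def fraction_ratio(virome: Dict[str, int], cellular: Dict[str, int]) -> Dict[str, float]:
    """Ratio of virome fraction to cellular fraction for each shared category.

    A ratio above 1 means the category is relatively enriched in the virome.
    """
    v = category_fractions(virome)
    c = category_fractions(cellular)
    return {cat: v[cat] / c[cat] for cat in v if cat in c and c[cat] > 0}


# Hypothetical counts (illustration only, not data from the study).
virome_counts = {"energy metabolism": 120, "replication and repair": 200, "other": 680}
cellular_counts = {"energy metabolism": 70, "replication and repair": 100, "other": 830}

ratios = fraction_ratio(virome_counts, cellular_counts)
for category, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{category}: {ratio:.2f}x")
```

Because each dataset is normalized by its own total, the ratio is insensitive to differences in sequencing depth, though it remains sensitive to how many reads could be annotated at all in each dataset.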
Other sequences in the marine vent virome had matches to metabolic genes that have also been found in viruses from other environments. For example, 11 sequencing reads in the marine vent virome had close matches to PhoH, a phosphate starvation-inducible protein that has previously been found in Prochlorococcus and Synechococcus phage genomes (Sullivan et al., 2009). It has been hypothesized that PhoH, when expressed by phage, could aid the host in phosphorus scavenging during the infection stage. In hydrothermal systems, phosphate is removed from seawater through water-rock reactions in the deep subsurface, thus limiting phosphate availability to native microbial communities (Wheat et al., 1996). The presence of PhoH in the marine vent virome suggests that vent viruses may, in a manner similar to the cyanophage, assist in phosphate acquisition during the course of infection. Similarly, eight reads closely matching transaldolases, also found on cyanophage genomes, were found in the marine vent virome. It is thought that these transaldolases may play a role in metabolizing carbon substrates to assist in energy production and synthesis of important compounds, and thus may be yet another example of fitness factors expressed by phage (Sullivan et al., 2005).
This positive selection for genes that can increase host fitness would suggest that the genes found in the marine vent virome can give an indication of the unique environmental conditions from which it was sampled, such as limitations or extremes in factors such as nutrients, energy availability, temperature, or pH. For example, 301 sequencing reads in the marine vent virome were annotated by MG-RAST as uptake [NiFe]-hydrogenases, enzymes involved in the oxidation of hydrogen to produce energy (Vignais and Billoud, 2007). Hydrogen oxidation is known to be an important metabolic strategy in hydrothermal vent systems, where it provides a greater energy return than the oxidation of methane or sulfur (Amend and Shock, 2001), which are also common reduced compounds in vent environments. The [NiFe]-hydrogenases encoded in the marine vent virome could enable utilization of an important alternative energy source for viral hosts if transferred into their genome via transduction, or if expressed by prophage as a fitness factor, which could bolster host metabolism during viral infection.

[Figure caption, partial: "... (Meyer et al., 2008), and reads were annotated with the KEGG Orthology database, release 56."]
To present a larger-scale illustration of how the genetic profile of a virome might act as a signature of the environment from which it was sampled, we compared the relative enrichment of functional gene categories in the marine vent virome with a set of 42 other viromes, initially analyzed by Dinsdale et al. (2008; Figure 5). As in the previous analysis, only the convincingly viral subset of the marine vent virome from Hulk was used. In this analysis, genes were annotated with the SEED subsystems in MG-RAST to make a more direct comparison with the study by Dinsdale et al. (2008). The results show that the marine vent virome is particularly enriched in genes related to regulation and cell signaling (enriched by 300%) and RNA metabolism (enriched by 230%). The genes assigned to the "cell signaling" category include those related to biofilm formation and quorum sensing, regulation of virulence, and sensing environmental stimuli. For example, 118 sequencing reads in the marine vent virome have close matches to genes in the "regulation of virulence" category. Most of these matched the BarA-UvrY two-component system, a system regulating virulence in pathogenic E. coli (Herren et al., 2006). The BarA-UvrY system functions by sensing changes in environmental conditions to induce a metabolic switch, a function that would be useful in the rapidly changing conditions of the vent environment.
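The text does not state how the percent enrichment values were computed. As an illustration only, the sketch below assumes one common definition: the percent difference between the focal virome's fraction of reads in a category and the mean fraction for that category across the comparison viromes. The function name and all input values are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Sequence


def percent_enrichment(focal_fraction: float, reference_fractions: Sequence[float]) -> float:
    """Percent difference of a focal fraction relative to the mean reference fraction.

    Assumed definition (not confirmed by the source):
        100 * (focal - mean(reference)) / mean(reference)
    Positive values indicate enrichment in the focal virome.
    """
    if not reference_fractions:
        raise ValueError("need at least one reference virome")
    baseline = mean(reference_fractions)
    if baseline <= 0:
        raise ValueError("reference mean must be positive")
    return 100.0 * (focal_fraction - baseline) / baseline


# Hypothetical fractions of reads assigned to one category (illustration only).
vent_virome_fraction = 0.04
other_virome_fractions = [0.01, 0.01, 0.01]

enrichment = percent_enrichment(vent_virome_fraction, other_virome_fractions)
print(f"enrichment relative to reference mean: {enrichment:.0f}%")
```

Under this definition, an enrichment of 300% corresponds to a fraction four times the reference mean. A mean across viromes weights each virome equally regardless of its size, which is one reason other definitions (for example, a median baseline) might be preferred.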
We expect that the pool of genes undergoing positive selection for retention in the viral genetic reservoir would represent those genes that may be occasionally necessary, though not strictly required, in the environment from which the virome was sampled. Thus, we expect accessory genes such as those related to secondary metabolisms to be selected for maintenance in the virome, rather than housekeeping genes such as ribosomal proteins. In addition to representing the "cloud" of poorly conserved genes from both cellular and viral pangenomes (Kristensen et al., 2010), we expect positive selection to customize this "genetic reservoir" to match the needs of the hosts drawing from it.

EXTRAPOLATING THE VIRAL IMPACT FROM HYDROTHERMAL VENTS TO OTHER REGIMES OF THE DEEP SUBSURFACE BIOSPHERE
We have suggested here that viruses are likely to play an important role in modifying the genetic content of their hosts in diffuse flow fluids of hydrothermal systems. We now turn to a discussion of whether the characteristics of viruses in diffuse flow fluids are likely to apply to other regimes of the deep subsurface biosphere. The vent ecosystem has several unique attributes that distinguish it from other deep subsurface habitats, so extrapolation from the vent subsurface to the rest of the deep subsurface biosphere must be done cautiously. The two most distinctive attributes of the vent system are the extreme gradients in temperature, pH, redox state, chemical composition, and mineralogy, and the constant fluid flux between microenvironments. On the ridge axis, hot hydrothermal fluid mixing with cool, oxygenated seawater facilitates interaction among diverse cellular communities and their accompanying viral assemblages. For this reason, we have argued that this environment may be conducive to gene flow from one microenvironment to the next via viral transfer. However, the impact of viruses is likely to extend to other provinces of the deep subsurface biosphere, which we detail below.

CRUSTAL ENVIRONMENTS: UNSEDIMENTED RIDGE FLANKS, CRUSTAL OUTCROPS, SEAMOUNTS, ARC SYSTEMS
This fluid flux from one environment to the next is characteristic of many regimes in the deep subsurface biosphere. As sediments covering the seafloor tend to be relatively impermeable, fluid flux depends on outcropping of igneous crust. Fluids flux most vigorously at mid-ocean ridges, in the hydrothermal systems discussed above, and the degree of fluid flux decreases with distance from the ridge axis, as shown in Figure 1B. However, off-axis environments are not completely stagnant. On ridge flanks and recharge zones, which can extend hundreds of kilometers from the ridge axis, oxygenated seawater flows into the ocean floor, with residence times of days to years (Johnson et al., 2010). Some of this fluid emerges in discharge zones in the form of hydrothermal vents on ridge axes, while some fluids may circulate more locally. Off-axis, fluid flows through seamounts; and farther afield, fluids may circulate in a more restricted manner (Edwards et al., 2005). Overall, however, the volume of fluid flux is large, as at least 60% of the oceanic crust is hydrologically active, and the fluid-crust reservoir in which these processes might be expected to occur is approximately 10 times the size of the sedimentary reservoir. Fluid-flux-dominated environments, therefore, constitute the majority of the deep subsurface. This fluid flux facilitates transfer of chemicals throughout the crust, thus exposing microbial communities to varying conditions and also transporting microorganisms from one region to the next. The regions of the oceanic crust dominated by fluid flux are therefore analogous to the hydrothermal systems described above, though characterized by fewer extremes in temperature, pH, and redox conditions. By means of this flux, viruses or their accompanying hosts are transported between biomes, and can facilitate horizontal gene transfer between organisms native to drastically different environments.
Viral metagenomics has suggested that while local diversity of viral assemblages is high, global diversity may be low, suggesting that viruses frequently migrate between biomes (Breitbart and Rohwer, 2005). One study found that viruses from sediments, soils, and lakes were able to propagate in marine waters, suggesting that viruses have a broad enough host range that they can successfully move between biomes (Sano et al., 2004). Fluid flux between the seafloor-ocean interface, within sediments, and within cracks in the Earth's crust may act as a conduit for viruses or the microbes that bear them, thus potentially sharing genes among habitats. Moreover, if biofilms are hotspots for horizontal gene transfer and viral infection (as discussed above), we may expect that regions of the subsurface characterized by biofilm formation, which could include any habitat with surfaces available for attachment, will be exposed to higher rates of gene transfer than environments with lower cell density.

SEDIMENTED ENVIRONMENTS: ABYSSAL PLAINS, CONTINENTAL MARGINS
As shown in Figure 1B, some provinces of the deep subsurface biosphere, particularly those characterized by deep sedimentation, experience more restricted fluid flux and therefore have potentially limited contact between hosts of different environmental niches. Yet even in these regions, viruses may alter host fitness through lysogeny. As stated above, lysogeny appears to become a more predominant lifestyle in regions with suboptimal conditions, such as low nutrient abundance, low host replication rates, or low host contact rates. Some sedimented regions of the deep subsurface are characterized by particularly low cell abundance, such as in sediments within oligotrophic gyres (D'Hondt et al., 2009), or are exposed to extremely limited organic matter, nutrient, or free energy availability (Schrenk et al., 2010). Moreover, the lack of mobility in the sediment matrix may encourage a lysogenic lifestyle as well, as viruses may have difficulty contacting a new host, especially in regions with low cell abundance. Therefore, the lysogenic lifestyle is likely to be much more common among viruses in these regions, and the archaeal and bacterial inhabitants of these regions have an even higher likelihood of expressing fitness factors encoded by prophage. Initial studies of prophage in the deep subsurface biosphere seem to support this case (Engelhardt et al., 2011). These prophage may aid in host survivability. For example, in deeply buried marine sediments with limited organic carbon or other nutrients, viruses may carry genes to aid in scavenging these compounds or in providing secondary metabolisms to take advantage of alternative energy or nutrient sources. However, more work needs to be done on sequencing viral or cellular isolates from these regions to gain further support for this hypothesis.
One final viral influence that is likely to impact all provinces of the deep subsurface biosphere is the input of viruses from surface waters. Marine sediments receive large inputs of allochthonous material daily, much of it bearing particle-associated microbes that sink through the water column. If these microbes carry prophage or lytic viruses in the process of replicating, these viruses could potentially encounter a different host in their new sediment-bound habitat. These viruses, delivered from the upper water column, could then deliver genes from more pelagic habitats to the deep subsurface. One might even expect induction of prophage to occur more frequently in sinking microbes as they are exposed to increased pressure, which has been found to induce prophage in E. coli (Aertsen et al., 2004). In this sense, viruses may serve as a highway for gene exchange between the surface marine realm and the deep subsurface biosphere.

CONCLUSION
The presence of lysogeny-related genes as well as potentially advantageous cellular functional genes in the virome of a marine, diffuse flow hydrothermal vent bolsters the hypothesis that viruses modify the genetic landscape of the hydrothermal habitat, both through expression of prophage genes during lysogenic infection and through the process of transduction. Indeed, this habitat, as well as several others in the deep subsurface biosphere, is uniquely poised for fostering an almost symbiotic relationship between viruses and their hosts. As mentioned above, deep sea microbes have higher numbers of integrated prophage and transposable elements in their genomes (Vezzi et al., 2005; Ivars-Martinez et al., 2008), and lysogeny appears to be favored over lysis in the extreme regions of the marine realm (Paul, 2008). In regions of the deep subsurface biosphere with low host contact rates, low nutrient flux, or rapidly changing conditions, expression by latent prophage or viral transduction may play a crucial role in bolstering host fitness.
More work certainly remains to be done to determine the scope of the viral impact on the deep subsurface biosphere. Sequencing the genomes of archaeal and bacterial isolates from various regions of the deep subsurface can indicate what prophage elements are present. Viral metagenomics can indicate what types of genes are harbored by viruses, and therefore can indicate which genes may either be transferred into host genomes or expressed during the prophage stage. Thus far, the evidence suggests that viruses play a crucial role in enhancing the fitness of their hosts by modifying their genetic content. They may help their hosts adapt to the unique challenges of the various habitats of the deep subsurface biosphere, acting as a genetic repository for new, adaptive functions. Viruses are emerging as a profound evolutionary force whose impact we have yet to fully assess, particularly in the realms of the deep.