Biogeography of Persephonella in deep-sea hydrothermal vents of the Western Pacific

Deep-sea hydrothermal vent fields are areas on the seafloor with high biological productivity fueled by microbial chemosynthesis. Members of the Aquificales genus Persephonella are obligately chemosynthetic bacteria and appear to be key players in carbon, sulfur, and nitrogen cycles in high-temperature habitats at deep-sea vents. Although this group of bacteria has a cosmopolitan distribution in deep-sea hydrothermal ecosystems around the world, little is known about their population structure, such as intraspecific genomic diversity, distribution pattern, and phenotypic diversity. We developed a multi-locus sequence analysis (MLSA) scheme for their genomic characterization. Sequence variation was determined in five housekeeping genes and one functional gene of 36 Persephonella hydrogeniphila strains originating from the Okinawa Trough and the South Mariana Trough (SMT). Although the strains share >98.7% similarity in 16S rRNA gene sequences, MLSA revealed 35 different sequence types (STs), indicating extensive genomic diversity. A phylogenetic tree inferred from all concatenated gene sequences revealed clustering of the isolates according to geographic origin. In addition, the phenotypic clustering pattern inferred from whole-cell matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF/MS) analysis can be correlated with the MLSA clustering pattern. This study represents the first MLSA combined with phenotypic analysis indicative of allopatric speciation of deep-sea hydrothermal vent bacteria.


INTRODUCTION
Mixing of hydrothermal fluids and ambient seawater at the seafloor creates physically and chemically dynamic habitats for microorganisms. Vent fluid physicochemistry varies both spatially and temporally as a result of subsurface geological and geochemical processes (Edmond et al., 1979; Butterfield and Massoth, 1994; Butterfield et al., 2004). Diverse microorganisms, including both Archaea and Bacteria, have been isolated in pure culture from various hydrothermal fields. In addition, culture-independent studies revealed the dominance of yet-to-be cultured microorganisms in deep-sea hydrothermal environments (Haddad et al., 1995; Takai and Horikoshi, 1999; Reysenbach et al., 2000; Corre, 2001; Teske et al., 2002) and provided insight into the great heterogeneity of microbial communities between hydrothermal systems. This heterogeneity can be correlated with differences in the geological and chemical properties of different vents (Takai et al., 2004; Nakagawa et al., 2005a,b; Takai and Nakamura, 2011). On the other hand, some cosmopolitan genera are found in deep-sea hydrothermal systems, occurring not only in Mid-Ocean Ridge systems but also in Back-Arc Basin and Volcanic Arc systems (Nakagawa and Takai, 2008; Kaye et al., 2011). Members of the genus Persephonella, which belongs to the order Aquificales and comprises obligately sulfur- and/or hydrogen-oxidizing, chemoautotrophic, thermophilic bacteria, are widely distributed in deep-sea hydrothermal systems (Reysenbach et al., 2000; Takai et al., 2004; Nakagawa et al., 2005a,b; Ferrera et al., 2007; Takai et al., 2008). Although the widespread occurrence of this group suggests that it may play an important role, many questions remain about its physiology, metabolism, and ecology within the environment because of the difficulty of isolating these strains.
Some isolates have been characterized (Götz et al., 2002; Nakagawa et al., 2003), implying a role for them in carbon, sulfur, and nitrogen cycles in high-temperature habitats at deep-sea vents (Ferrera et al., 2007). However, little is known about the spatial or biogeographical pattern of Persephonella microdiversity and phenotypic heterogeneity.
Weak biogeographical signals in microbial communities are usually explained by the hypothesis of microbial cosmopolitanism formulated by Baas Becking (Wit and Bouvier, 2006). However, recent studies have explored the effects of dispersal limitation on microbial biogeography. As in macroorganisms, a negative correlation between genetic similarity and geographic distance, i.e., a distance-decay relationship, has been reported for cyanobacteria, sulfate-reducing bacteria, marine planktonic bacteria, and hyperthermophilic archaea (Papke et al., 2003; Whitaker et al., 2003; Vergin et al., 2007; Oakley et al., 2010). In addition, the biogeographical diversity pattern was reported in detail for members of the "deep-sea hydrothermal vent euryarchaeota 2" (Flores et al., 2012). Microbial biogeographical studies have usually been based solely on genetic data. Microbial biogeography was recently studied at the phenotypic level (Rosselló-Mora et al., 2008); however, the correlation between genetic and phenotypic patterns has not been explored. We investigated the spatial diversity pattern of the Persephonella population by the combined use of comparative genetic and phenotypic characterizations.

FIELD SITE AND SAMPLING
Samples, i.e., chimney structures, fluids, and sediments, were collected with R/V Natsushima and ROV Hyper-Dolphin or R/V Yokosuka and DSV Shinkai 6500 from the Okinawa Trough (OT) in 2007 and 2009, or the South Mariana Trough (SMT) in 2010 (Table 1). Vent fluids from the OT are characterized by high contents of methane and carbon dioxide (Kawagucci et al., 2011). Among the OT hydrothermal fields, this study focused on Iheya North and Hatoma Knoll (Figure 1). In the SMT, four vent sites were studied (Figure 1). The Archaean site is located on a ridge flank, about 2 km from the backarc-spreading axis. Its discharging fluids (Tmax = 318 °C) were acidic and depleted in Cl− (Cl− = 401 mM) (Ishibashi et al., 2006). The Pika site is located on an off-axis knoll, about 5 km from the axis. The fluid chemistry of the Pika site (Tmax = 330 °C) showed a brine-rich signature (Cl− = 600 mM) (Ishibashi et al., 2006). The Urashima site was newly discovered in 2010 and is located at the northern foot of the western peak of the same knoll as Pika. The Snail site is located on the active backarc-spreading axis. After retrieval on board, each chimney structure was immediately sectioned into exterior surface and interior parts, and slurried with 25 ml of sterilized seawater in the presence or absence of 0.05% (w/v) neutralized sodium sulfide in 100 ml glass bottles (Schott Glaswerke, Mainz, Germany). Bottles were then tightly sealed with butyl rubber caps under a gas phase of 100% N2 (0.2 MPa). Similarly, fluid, sediment, and biological samples were prepared anaerobically in 10 ml glass bottles. Samples were stored at 4 °C until use.

ENRICHMENT, ISOLATION, AND PHYLOGENETIC ANALYSIS
Serial dilution cultures were performed using MMJHS medium, which contains a mixture of electron donors and electron acceptors for hydrogen/sulfur-oxidizing chemoautotrophs, at 47, 55, and 70 °C. MMJHS medium included 1 g each of NaHCO3, Na2S2O3·5H2O, and NaNO3, 10 g of S0, and 10 ml of vitamin solution (Balch et al., 1979) per liter of MJ synthetic seawater [gas phase: 80% H2 + 20% CO2 (0.3 MPa)]. To obtain pure cultures, dilution-to-extinction was repeated at least twice (Baross, 1995). Purity was confirmed routinely by microscopic examination and by sequencing of the 16S rRNA gene using several PCR primers. Genomic DNA was extracted from isolates using the UltraClean Microbial DNA Isolation Kit (MoBio Laboratories, Inc., Solana Beach, CA, USA) following the manufacturer's protocol. The 16S rRNA gene of each isolate was amplified by PCR using LA Taq polymerase (TaKaRa Bio, Otsu, Japan) as described previously (Takai et al., 2001). The primers used were Eubac 27F and 1492R (Weisburg et al., 1991). The amplicons were sequenced in both directions by the dideoxynucleotide chain-termination method. Almost complete sequences of the 16S rRNA gene were assembled using Sequencher ver 4.8 (Gene Codes Corporation, Ann Arbor, MI, USA). To determine the phylogenetic positions of the isolates, the sequences were aligned using the Greengenes NAST alignment tool (DeSantis et al., 2006) and compiled using ARB software version 03.08.22 (Ludwig et al., 2004).
The levels of genetic variation within and between populations were calculated with Arlequin ver 3.5 (Excoffier and Lischer, 2010). FST values were estimated for groups of two or more strains and were tested for significance against 1000 randomized bootstrap resamplings. The average pairwise genetic distance and standard error of each population, based on 500 bootstrap resamplings, were estimated using MEGA ver 5.05. The Mantel test was performed with XLSTAT software (www.xlstat.com). Sequences obtained in this study have been deposited in DDBJ/EMBL/GenBank under Accession Nos. AB773894-AB774147.
The vent fluid components listed in Table 6 were analyzed as previously described (Toki et al., 2008). End-member fluid compositions were estimated by the conventional method, i.e., extrapolation to Mg = 0 of the linear relationship between the concentration of each species and Mg among the obtained samples (Von Damm et al., 1985).
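The Mg = 0 extrapolation can be illustrated with a short sketch. It assumes an ordinary least-squares line through the sampled fluids and reports its intercept as the end-member concentration. The function name and the sample values are hypothetical and are not data from this study.

```python
def endmember_concentration(mg_mM, conc_mM):
    """Fit conc = intercept + slope * Mg by least squares.

    Returns (intercept, slope); the intercept is the concentration
    extrapolated to Mg = 0, i.e., the estimated end-member value.
    """
    n = len(mg_mM)
    if n < 2 or n != len(conc_mM):
        raise ValueError("need at least two paired (Mg, concentration) samples")
    mean_x = sum(mg_mM) / n
    mean_y = sum(conc_mM) / n
    sxx = sum((x - mean_x) ** 2 for x in mg_mM)
    if sxx == 0:
        raise ValueError("Mg values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mg_mM, conc_mM))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical mixing series between seawater and a Cl-depleted vent fluid.
mg_samples = [50.0, 30.0, 10.0]     # mM Mg in each sample
cl_samples = [550.0, 490.0, 430.0]  # mM Cl in the same samples
cl_endmember, cl_slope = endmember_concentration(mg_samples, cl_samples)
print(f"End-member Cl = {cl_endmember:.1f} mM")
```

The approach relies on the assumption that the pure hydrothermal fluid contains essentially no Mg, so any Mg in a sample reflects entrained seawater.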

PREPARATION OF BACTERIAL SAMPLES FOR WHOLE-CELL MALDI-TOF/MS
Samples for whole-cell matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF/MS) were prepared as described in Hazen et al. (2009). Briefly, Persephonella strains were cultured in 3 ml of MMJHS medium at the temperatures at which they were isolated (Table 1). Following incubation, cells were washed once in 1 ml of 0.85% NaCl and twice in 1 ml of 50% ethanol at 4 °C. Cell pellets were weighed and resuspended in 1% trifluoroacetic acid (TFA) to yield a final concentration of 0.2 mg cells/μl of 1% TFA. Equal volumes of the TFA bacterial suspension and the MALDI-TOF/MS matrix solution (10 mg/ml sinapinic acid in 50% acetonitrile, 50% water, and 0.1% TFA) were mixed in a microcentrifuge tube, and 1.0 μl of this mixture was spotted in triplicate on a stainless steel MALDI-TOF/MS sample plate (corresponding to approximately 1.4 × 10^8 cells/spot). Samples were allowed to air dry before being loaded into the mass spectrometer.

MALDI-TOF/MS AND DATA PROCESSING
All mass spectra were acquired using a MALDI-TOF/MS spectrometer (4700 Proteomics Analyzer; Applied Biosystems, Foster City, CA, USA) in the linear and positive-ion modes. The laser (N2, 337 nm) intensity was set above the ion generation threshold. Mass spectra were recorded in the m/z range of 2000-14,000. The acceptance criteria, based on 1000 laser shots per spot, were signal intensities between 2000 and 55,000 counts and a signal/noise ratio of 10 or greater. Raw mass spectra from three spots were normalized using Data Explorer software (Applied Biosystems, Foster City, CA, USA) by baseline correction and combined to generate an averaged peak list. The peaks around 2000 m/z were excluded as noise.
The peaks were ranked according to their signal intensities, and the top 15 most intense peaks were chosen for further analysis. The relative intensity ratio was calculated for the 15 peaks. Squared distance was estimated based on the presence or absence of peaks by Ward's minimum variance method using MVSP software ver 3.21 (Kovach Computing Services, Wales, UK). The presence or absence of peaks was determined within a tolerance of 14 Da.
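As an illustration of the presence/absence scoring, the sketch below builds a binary peak matrix with a 14 Da tolerance and computes squared Euclidean distances between strains. The binning rule (greedy merging of sorted m/z values, then assignment of each peak to its nearest reference) is an assumption made for this example and is not necessarily the rule applied by MVSP. The peak lists are hypothetical, and Ward's clustering itself is left to a dedicated package.

```python
TOLERANCE_DA = 14.0  # tolerance stated in the text


def build_reference_peaks(peak_lists, tol=TOLERANCE_DA):
    """Merge all observed m/z values into a sorted list of reference peaks.

    A value starts a new reference peak when it lies more than `tol`
    above the current reference (greedy rule, assumed for this sketch).
    """
    refs = []
    for mz in sorted(mz for peaks in peak_lists.values() for mz in peaks):
        if not refs or mz - refs[-1] > tol:
            refs.append(mz)
    return refs


def presence_vector(peaks, refs, tol=TOLERANCE_DA):
    """Return a 0/1 vector marking which reference peaks a strain shows."""
    vector = [0] * len(refs)
    for mz in peaks:
        nearest = min(range(len(refs)), key=lambda i: abs(refs[i] - mz))
        if abs(refs[nearest] - mz) <= tol:
            vector[nearest] = 1
    return vector


def squared_distance(a, b):
    """Squared Euclidean distance between two presence/absence vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical peak lists (m/z); strain names are placeholders.
peak_lists = {
    "A": [9678.0, 5000.0],
    "B": [9682.0, 5010.0, 7000.0],
    "C": [9672.0, 6200.0],
}
refs = build_reference_peaks(peak_lists)
vectors = {name: presence_vector(p, refs) for name, p in peak_lists.items()}
for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(first, second, squared_distance(vectors[first], vectors[second]))
```

For binary vectors the squared distance equals the number of reference peaks present in one strain but not the other, which is the quantity a Ward-type clustering would then work from.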

ISOLATION OF Persephonella STRAINS
We investigated a total of 36 Persephonella strains originating from various hydrothermal samples from the OT (4 strains from Iheya North and 1 strain from Hatoma Knoll) and the SMT (16 strains from the Urashima site, 10 strains from the Archaean site, 3 strains from the Snail site, and 2 strains from the Pika site) (Figure 2 and Table 1). All 36 strains shared >98.7% 16S rRNA gene similarity with one another and with P. hydrogeniphila 29WT.

GENETIC DIVERSITY OF Persephonella POPULATION
We developed an MLSA scheme for the Persephonella population based on five housekeeping genes and one functional gene. The sequenced gene fragments ranged from 501 to 882 bp in length (Table 1), and nucleotide sequence similarity at the MLSA loci ranged from 94.6 to 96.3% (average 95.8%). We obtained concatenated sequences of 4254 bp and identified a total of 702 variable positions. Ratios of non-synonymous to synonymous substitutions (Ka/Ks) were much smaller than 1 for all loci (Table 3), indicating that the genes were subject to purifying selection and thus conform to the general requirements for MLSA loci (Maiden, 2006). This was statistically supported by the high values from the Z-test (Table 3).
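Counting variable positions and computing pairwise similarity can be sketched as follows, assuming the sequences are already aligned to equal length. The function names and the toy sequences are hypothetical. Gap characters, if present, are treated here like any other symbol, which is a simplification.

```python
def variable_positions(aligned_seqs):
    """Return 0-based alignment columns at which the sequences differ."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, column in enumerate(zip(*aligned_seqs))
            if len(set(column)) > 1]


def percent_identity(seq_a, seq_b):
    """Percentage of alignment columns at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy alignment of three six-base fragments (not real MLSA data).
toy_alignment = ["ATGCAT", "ATGTAT", "ACGCAT"]
sites = variable_positions(toy_alignment)
print("variable positions:", sites)
print("identity of first two sequences: "
      f"{percent_identity(toy_alignment[0], toy_alignment[1]):.1f}%")
```

Applied to a concatenated alignment, the length of the returned list corresponds to the number of variable positions reported for that alignment.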

POPULATION GENETIC STRUCTURE
Typing based on the sequences of six protein-coding gene fragments revealed 35 different STs among 36 isolates, indicating the high genetic diversity of the Persephonella population (Table 4). The number of different alleles per locus varied between 16 for metG and 24 for tkt. Strains MT-17 and MT-18 had identical sequences at all MLSA loci. These strains were isolated from the same chimney sample, but the slurries were prepared in the presence (used for strain MT-18) or absence (used for strain MT-17) of 0.05% (w/v) sodium sulfide. In other cases, the presence of sodium sulfide in slurries resulted in the isolation of strains classified into different STs (Table 1). The split graph obtained from the concatenated sequence data displayed a bushy network structure with complex parallelogram formation, indicative of extensive homologous recombination (Figure 3). The PHI test (Bruen et al., 2006) on the concatenated sequences also indicated past recombination events during the evolution of Persephonella (p < 0.05).
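The assignment of STs from allele profiles can be sketched as below: strains receive the same ST only when their allele numbers match at every locus. The function name, the allele numbers, and the strain label "strain-X" are hypothetical. Only the fact that MT-17 and MT-18 share a profile comes from the text.

```python
def assign_sequence_types(allele_profiles):
    """Map each strain to an ST number.

    `allele_profiles` maps strain name -> sequence of allele numbers,
    one per locus. STs are numbered in order of first appearance.
    """
    st_of_profile = {}
    st_of_strain = {}
    for strain, profile in allele_profiles.items():
        key = tuple(profile)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        st_of_strain[strain] = st_of_profile[key]
    return st_of_strain


# Hypothetical six-locus allele profiles.
profiles = {
    "MT-17": (1, 2, 3, 4, 5, 6),
    "MT-18": (1, 2, 3, 4, 5, 6),     # identical at all loci -> same ST
    "strain-X": (1, 2, 3, 4, 5, 7),  # differs at one locus -> new ST
}
sequence_types = assign_sequence_types(profiles)
print(sequence_types)
print("number of STs:", len(set(sequence_types.values())))
```

Because a single differing allele is enough to define a new ST, a population with frequent recombination or mutation at any locus will show nearly as many STs as strains.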

POPULATION DIFFERENCE BETWEEN THE OT AND THE SMT
An ML phylogenetic tree derived from the concatenated alignment of the six loci showed two different clades with high bootstrap support (Figure 4). The two clades corresponded to the two geographic regions, showing that the SMT strains share a common evolutionary history distinct from that of the OT strains. The FST value confirmed that the OT and SMT populations were significantly different (FST = 0.8711, p < 0.05) (Table 5).
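To give a feel for what an FST value close to 1 means, the sketch below uses a simple pairwise-difference estimator, FST = 1 − Hw/Hb, where Hw is the mean number of differences within populations and Hb the mean number between populations. This is a substitute chosen for brevity and is not the procedure implemented in Arlequin. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations, product


def n_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b))


def mean_within(population):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(population, 2))
    if not pairs:
        raise ValueError("each population needs at least two sequences")
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_1, pop_2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop_1, pop_2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop_1, pop_2):
    """Pairwise-difference estimator: FST = 1 - Hw / Hb."""
    h_within = (mean_within(pop_1) + mean_within(pop_2)) / 2
    h_between = mean_between(pop_1, pop_2)
    return 1 - h_within / h_between


# Toy aligned sequences for two populations (not real data).
population_a = ["AAAA", "AAAT"]
population_b = ["TTTT", "TTTA"]
print(f"FST = {pairwise_fst(population_a, population_b):.3f}")
```

Values approach 1 when nearly all of the sequence variation lies between the populations rather than within them.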

CORRELATION BETWEEN CHEMISTRY AND GENETIC DIVERSITY
Geochemical analysis revealed that different vent fluids had distinctive end-member chemical compositions (Table 6). Although the vent fluids from Archaean and Pika were respectively Cl−-depleted and -enriched in 2004 and 2005 (Ishibashi et al., 2006), no significant difference was found between them in this study. We assessed the relative contributions of environmental factors (such as pH and maximum temperature of vent fluids) and geographic distance to Persephonella genetic structure using the Mantel test. The pH of the SMT vent fluids (pH 3.0-3.5) was significantly lower than that of the OT vent fluids (pH 5.0-5.2) (Table 6). However, we found no significant correlation between genetic distance and the absolute difference in vent fluid pH and temperature (Mantel r = −0.28, p = 0.4). In contrast, a large, significant correlation coefficient (Mantel r = 0.993, p < 0.0001) was found in a Mantel test of all pairwise comparisons of the genetic and geographic distances between strains (Figure 5).
FIGURE 5 | Relationship between genetic and geographic distance (R2 = 0.98). π, number of base substitutions per site between sequences.

WHOLE-CELL MALDI-TOF/MS ANALYSIS
MALDI-TOF/MS fingerprinting of whole microbial cells was highly reproducible. A peak at m/z 9678 in the MALDI-TOF/MS spectra was detected in all strains regardless of their geographical origin (Figure 6). Other peaks were detected only in some strains and at relatively low intensities. Cluster analysis based on the presence or absence of peaks identified two clusters corresponding to the geographical regions of isolation (Figure 7). The two Persephonella trees, one generated from the whole-cell MALDI-TOF/MS data and the other an ML tree from the concatenated MLSA sequences, showed similar topologies (Figure 7).

DISCUSSION
Here we investigated the microdiversity and phenotypic heterogeneity of extremely thermophilic chemolithoautotrophic bacteria in deep-sea hydrothermal vents. Genetic and phenotypic differences corresponding to geographic origin were discovered by the combined use of MLSA and whole-cell MALDI-TOF/MS fingerprinting. The biogeography of hydrothermal vent-associated microbial communities has been well studied (Takai et al., 2004; Nakagawa et al., 2005a,b; Kato et al., 2010). Members of the genus Persephonella have been found in hydrothermal vent fields worldwide; however, their genetic and phenotypic heterogeneity was poorly understood.

GENETIC DIFFERENCE BETWEEN OT AND SMT POPULATIONS
We identified 35 STs among 36 Persephonella strains by MLSA based on 6 protein-coding genes, indicating the high genetic diversity of the Persephonella population. Although the same ST is rarely shared among Persephonella strains, every SMT strain shares an allele with other SMT strains, but not with OT strains, at one or more MLSA loci, suggesting that the OT and SMT populations are significantly different. Likewise, two OT strains, i.e., OT-1 and OT-4, have the same allele (no. 7) at napA (Table 4), although the number of OT strains obtained in this study is small. The split decomposition tree showed evidence of recombination (Figure 3), which might contribute to the increased number of STs. Previous studies showed that recombination generated large numbers of unique combinations of alleles in some archaea and bacteria (Suerbaum et al., 2001; Whitaker et al., 2005; Doroghazi and Buckley, 2010).

BIOGEOGRAPHY OF Persephonella
The phylogenetic analysis based on concatenated gene sequences separated the strains into two clusters according to their geographic origins (Figure 4). The FST value supported significant biogeographical isolation between the SMT and OT populations. These results indicate that the ubiquitous occurrence of Persephonella in deep-sea vents has not resulted from widespread contemporary dispersal but is an ancient historical legacy.
Microbial distribution appears not to be influenced by local environmental conditions alone (Martiny et al., 2006). In this study, we observed a clear correlation between the genetic distance and the geographic distance of isolates (Figure 5), as described for thermophilic archaea (Whitaker et al., 2003; Flores et al., 2012). In contrast, genetic distance showed no significant correlation with the difference in vent fluid pH and temperature. We cannot rule out the possibility that other factors not determined in this study, including grazing pressure and virus activity, may be correlated with the genetic differences among Persephonella. Recently, the H2 concentration in vent fluids was shown to have an impact on the formation of microbial community structures in deep-sea vents (Takai and Nakamura, 2011).

CORRELATION BETWEEN GENOTYPIC AND PHENOTYPIC HETEROGENEITY
Some peaks in the MALDI-TOF/MS spectra were shared among some Persephonella strains. Major peaks in whole-cell MALDI-TOF/MS analysis are considered to reflect ribosomal proteins (Fenselau and Plamen, 2001; Ryzhov and Fenselau, 2001) and are thus independent of growth conditions (Bernardo et al., 2002). There were also some minor peaks that were specific to SMT or OT strains. Like the concatenated nucleotide alignment of the MLSA loci, the MALDI-TOF/MS data clustered the strains into two distinct groups corresponding to the geographic regions (Figure 7), suggesting that protein expression in Persephonella is tuned to function optimally in the original habitat. The genotypic and phenotypic correlation found among Persephonella isolates indicates the occurrence of allopatric speciation.

CONCLUSION
By using both comparative genetic and phenotypic population characterizations, this study indicated for the first time that Persephonella populations are geographically distinct. Since members of Persephonella are extremely thermophilic chemoautotrophs endemic to deep-sea vents, considerable barriers to dispersal to spatially distinct niches should exist. Focal points raised by this study for future research include the effects of cold, oxic deep-sea conditions on the viability of deep-sea vent (hyper)thermophiles during dispersal, the biogeographical comparison with other ubiquitous thermophiles with different metabolic traits (e.g., heterotrophic fermenters and methanogens), and the comparison with moderate thermophiles or mesophiles with similar energy/carbon metabolisms (e.g., Epsilonproteobacteria) in deep-sea vents.