Colonization of plant substrates at hydrothermal vents and cold seeps in the northeast Atlantic and Mediterranean and occurrence of symbiont-related bacteria

Reducing conditions with elevated sulfide and methane concentrations in ecosystems such as hydrothermal vents, cold seeps or organic falls, are suitable for chemosynthetic primary production. Understanding processes driving bacterial diversity, colonization and dispersal is of prime importance for deep-sea microbial ecology. This study provides a detailed characterization of bacterial assemblages colonizing plant-derived substrates using a standardized approach over a geographic area spanning the North-East Atlantic and Mediterranean. Wood and alfalfa substrates in colonization devices were deployed for different periods at 8 deep-sea chemosynthesis-based sites in four distinct geographic areas. Pyrosequencing of a fragment of the 16S rRNA-encoding gene was used to describe bacterial communities. Colonization occurred within the first 14 days. The diversity was higher in samples deployed for more than 289 days. After 289 days, no relation was observed between community richness and deployment duration, suggesting that diversity may have reached saturation sometime in between. Communities in long-term deployments were different, and their composition was mainly influenced by the geographical location where devices were deployed. Numerous sequences related to horizontally-transmitted chemosynthetic symbionts of metazoans were identified. Their potential status as free-living forms of these symbionts was evaluated based on sequence similarity with demonstrated symbionts. Results suggest that some free-living forms of metazoan symbionts or their close relatives, such as Epsilonproteobacteria associated with the shrimp Rimicaris exoculata, are efficient colonizers of plant substrates at vents and seeps.


INTRODUCTION
Hydrothermal vents and other chemosynthesis-based ecosystems like cold seeps are deep-sea hotspots of primary production, which is ensured by chemoautotrophic prokaryotes, many of which live in symbiosis with various metazoans (Van Dover, 2000; Dubilier et al., 2008). The deep-sea ecosystem can also benefit from significant input of exogenous sources of carbon such as plant remains of different origins, like sunken wood (Wolff, 1979). Wood falls contain cellulose, hemicellulose and lignin, i.e., energy-rich polymers that can be degraded by microbial assemblages. This results in the production of reduced compounds such as hydrogen sulfide, dihydrogen and methane, providing a niche for thiotrophic and methanotrophic bacteria (Leschine, 1995; Palacios et al., 2006). Wood falls, but also whale skeletons and other large inputs of organic material, are quickly localized and efficiently colonized by opportunistic fauna (Turner, 1973; Bennett et al., 1994; Baco and Smith, 2003; Smith and Baco, 2003; Laurent et al., 2013). Reducing conditions with elevated sulfide concentrations around degrading organic falls also constitute a prime environment for metazoans with sulfide-oxidizing symbioses (Duperron et al., 2008; Lorion et al., 2009; reviewed in: Dubilier et al., 2008; Laurent et al., 2009; Treude et al., 2009). Seep and vent ecosystems are distributed worldwide but are often separated by large distances, and various authors have postulated that wood and whale falls could act as stepping stones between habitats (Vrijenhoek, 2010). Nevertheless, the most successful animal colonizers remain specialists for one substrate type, remarkable examples including Xylophaga spp. or Xyloredo spp. bivalves on wood (Turner, 1973; Distel and Roberts, 1997) and Osedax spp. polychaetes on bones (Rouse et al., 2004; Glover et al., 2005).
Despite the significance of sunken plant substrates as local organic enrichments, studies on the associated microbial assemblages remain scarce. The first attempt to investigate and compare free-living and attached prokaryotic assemblages from sunken wood at several sites and water depths was carried out by Palacios et al. (2009). The authors combined electron microscopy and molecular fingerprinting (capillary electrophoresis single-stranded conformation polymorphisms, CE-SSCP) to test the influence of depth of immersion, geographic location, deployment time, and wood type on the structure of microbial assemblages (Palacios et al., 2009). The phylogeny and diversity of Bacteria and Archaea associated with wood falls were recently investigated using clone libraries (Fagervold et al., 2012), and the authors demonstrated the occurrence of various free-living sulfate-reducing and sulfur-oxidizing bacteria (SRB and SOX, respectively) and methanogenic archaea. This microbial community changed with time of immersion because of changes in wood chemistry. The combination of ARISA fingerprinting and 454 pyrosequencing recently allowed Bienhold et al. (2013) to gain a more detailed overview of bacterial assemblages in a pine wood deployment lasting 1 year and in surrounding sediments at a cold seep site in the Eastern Mediterranean Sea. They established that wood-boring bivalves and cellulolytic and sulfate-reducing bacteria colonized the organic substrate first, and then attracted chemosymbiotic fauna (Bienhold et al., 2013). Finally, using pyrosequencing, Fagervold et al. (2013) demonstrated that the microbial diversity in oak cubes deployed at the Blanes Canyon (Western Mediterranean Sea) was significantly higher than in cubes deployed on the adjacent open slope.
To date, these high-throughput diversity studies include a single region, use only one substrate type, and are all based on different experimental designs, and as a consequence knowledge of deep-sea wood-associated microbial assemblages is still scarce.
Organic fall-associated and vent/seep faunas include many metazoans harboring chemosynthetic symbionts, and many of these symbionts are environmentally acquired anew at each generation (Gros et al., 1996; McFall-Ngai, 2000; Won et al., 2003; Nussbaumer et al., 2006; Duperron et al., 2008; Lorion et al., 2009; reviewed in: Dubilier et al., 2008; Bright and Bulgheresi, 2010; Vrijenhoek, 2010). This mode of symbiont transmission implies the existence of free-living forms of symbionts which may infect the appropriate host, and for which a symbiotic lifestyle may be facultative (Bright and Bulgheresi, 2010). Only a few studies have successfully documented the occurrence of such free-living symbionts in shallow marine waters (Lee and Ruby, 1992; Gros et al., 2003; Aida et al., 2008), and even fewer in the deep sea (Harmer et al., 2008). In which habitats and in what numbers these free-living forms exist thus remains an open question.
Since 2006, colonization devices (CHEMECOLIs) filled with wood cubes and alfalfa grass have been deployed for various periods of time at several chemosynthesis-based sites located in the Eastern Mediterranean, the Gulf of Cadiz, the Mid-Atlantic Ridge and the Haakon Mosby Mud Volcano (Gaudron et al., 2010;Cunha et al., 2013). In this study, we characterize the bacterial communities colonizing these organic substrates. The aims are to identify bacteria colonizing the substrates, to evaluate the effect of region, type of substrate, depth, duration of deployment, and temperature on the bacterial assemblages, and to screen for potential free-living forms or relatives of metazoan-associated chemosynthetic symbionts. For this, a 454 pyrosequencing-based approach on a 16S rRNA-encoding gene fragment is employed.

SAMPLING
Thirteen sets of standardized colonization devices (CHEMECOLI: CHEMosynthetic Ecosystem COlonization by Larval Invertebrates, described by Gaudron et al., 2010) were deployed at 8 hydrothermal vent and cold seep sites located in 4 distinct geographic areas (Figure 1): the Mid-Atlantic Ridge (MAR, 3 sites), the Gulf of Cadiz (GoC, 3 sites), the Eastern Mediterranean (EM, 1 site), and the Norwegian Sea (Haakon Mosby Mud Volcano, HMMV, 1 site). Each set consisted of two CHEMECOLIs, one filled with dried alfalfa grass (A) and the second with 2 cm pine wood cubes (W). Deployment sites were located at depths ranging from 354 to 2300 m. CHEMECOLIs were deployed a few meters away from any visible fluid escape and recovered in individual hermetic boxes filled with sterile water after periods of 10 days to 3 years. Operations were performed by various ROVs (Remotely Operated Vehicles) over the course of 13 cruises spanning 2006-2013 (Table 1). Immediately after recovery, pieces of alfalfa and wood cubes were randomly selected in a cold room, fixed in 96% ethanol and stored at 4 °C until DNA extraction. Substrates from a short-term deployment at Menez Gwen were also frozen in liquid nitrogen and stored at −80 °C in order to compare different sample fixation procedures (MAR-A-V-C, MAR-W-V-C). Non-deployed substrates were used as controls (Contr-A and Contr-W).

DNA EXTRACTION AND 454 PYROSEQUENCING
Samples were screened under a dissecting microscope to check for the absence of animal material (tissue, eggs, and larvae). Small pieces (2-3 g) of wood cubes were cut into thin slices and homogenized (3 × 30 s) in a metal jar with beads (12 mm diameter) using [...]. The Doyle and Doyle (1987) DNA extraction protocol was applied. About 200-300 mg of powder was mixed with 1 mL of preheated CTAB isolation buffer containing 2% hexadecyltrimethylammonium bromide, 1.4 M NaCl, 0.2% β-mercaptoethanol, 20 mM EDTA, 100 mM Tris-HCl, 1% PVP, and 0.2 mg mL−1 proteinase K. Samples were agitated (1 h, 60 °C) then mixed with one volume of chloroform-isoamyl alcohol (24:1). Nucleic acids from the upper aqueous phase were precipitated with isopropanol and centrifuged. Dry DNA pellets were re-suspended in sterile water. Concentration and quality were checked. DNA extraction was performed in duplicate, using another randomly selected wood cube or alfalfa grass sample from the same CHEMECOLI, resulting in two replicates (region-substrate-nb-1 and region-substrate-nb-2). Primers V5V6_F (CAAACAGGATTAGATACCCTG) and V5V6_R (CGTTRCGGGACTTAACCCAACA) (designed by GENOSCREEN) were used to amplify the V5-V6 hypervariable region corresponding to positions 770-1094 in the E. coli 16S rRNA-encoding gene. Short (∼300 bp) pyrotags were sequenced by 454 pyrosequencing (GsFLX-Roche Diagnostics, GENOSCREEN, France). A 10-bp molecular identifier (MID) tag was inserted between the GS-FLX adapter and the specific primer to facilitate further sequence binning. PCR products were purified and quantified by Picogreen (Invitrogen, USA) and equal amounts of amplicons were mixed prior to 454 pyrosequencing. Five-eighths of a plate was used to sequence 52 samples (including replicates), expecting a minimum of 7700 reads per sample. Sequences have been submitted to MG-RAST (http://metagenomics.anl.gov/linkin.cgi?project=9917) with MG-RAST IDs ranging from 4571074.3 to 4571125.3 (Table S1).

DATA ANALYSES
The resulting binary ".sff" files were extracted using Mothur (Schloss et al., 2009). Sequences shorter than 250 bp, longer than 350 bp, or containing Ns were eliminated from further analysis. Filtered sequences were sorted by their MID sequences into separate Fasta files. Typical 454 sequencing errors and PCR single-base errors were screened using the PyroNoise and SeqNoise modules of the AmpliconNoise software (Quince et al., 2011) with default parameters. Sequences with MIDs and primers removed were used to generate a Needleman-Wunsch distance matrix (function NDist) and clustered into operational taxonomic units (OTUs, function Fcluster) using the same software. The matrix of sequence abundances per OTU was generated with Python scripts using a 97% identity threshold for OTU definition (Dataset_1 in Table S1). The 10 best hits (maximal e-value = 1e−10) for each OTU were found using BLAST (Altschul et al., 1997) against the SSURef_NR99_115 Silva database (Quast et al., 2013), and added to the OTU abundance matrix.
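As an illustration of the length and ambiguity filter described above, the following minimal Python sketch keeps reads of 250 to 350 bp that contain no N. The function names and the (read_id, sequence) tuple layout are hypothetical; the study itself performed this step with Mothur.

```python
def passes_length_and_n_filter(seq, min_len=250, max_len=350):
    """Return True for reads of 250-350 bp (inclusive) without ambiguous bases (N)."""
    seq = seq.upper()
    return min_len <= len(seq) <= max_len and "N" not in seq


def filter_reads(reads):
    """Keep only the (read_id, sequence) pairs that pass the filter.

    `reads` is a hypothetical in-memory list of tuples, used here for illustration.
    """
    return [(rid, seq) for rid, seq in reads if passes_length_and_n_filter(seq)]
```

Both boundaries are treated as inclusive here, since the text only excludes reads shorter than 250 bp or longer than 350 bp.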

ANALYSES OF OTU ABUNDANCES AND SAMPLES COMPARISON
The microbial α-diversity indices (Shannon and Invsimpson), community richness indices (Chao1 and ACE) and rarefaction data were computed with Mothur on Dataset_1 (Schloss et al., 2009). The second dataset was prepared by eliminating all sequences present in fewer than 5 copies in the total number of reads per sample (rare OTUs eliminated) and was screened for symbiont-related sequences (Dataset_2 in Table S1). For further community abundance and statistical analyses, OTUs containing in total fewer than 40 sequences, which represents ∼1% of the total number of sequences in the smallest sample (4543), were eliminated, thus generating a reduced data matrix with abundant sequences (Dataset_3 in Table S1). All statistical analyses were performed using R (R Development Core Team, 2013). Non-Metric Multidimensional Scaling (NMDS) was performed using the "metaMDS" function in the MASS package. Average agglomerative clustering dendrograms were generated using the Unweighted Pair-Group Method with arithmetic Average (UPGMA) on a Bray-Curtis distance matrix. Abundance data were transformed using the Hellinger method (Legendre and Gallagher, 2001) and then used for transformation-based redundancy analyses (tb-RDA). Substrate type and region were used as factors, and water temperature, depth, and deployment time were used as variables in the constrained RDA, in order to estimate their contribution to the global variance. Significance was assessed using ANOVA permutation tests.
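The indices and transformations named above can be sketched in a few lines of Python. This is a minimal illustration using textbook formulas, not the Mothur or R code used in the study: the inverse Simpson index is taken here as 1 / sum(p_i^2) and Chao1 in its bias-corrected form, both of which are assumptions about the exact estimators, and all function names are hypothetical.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index, assumed here to be 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    observed = [c for c in counts if c > 0]
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))


def hellinger(counts):
    """Hellinger transformation of one sample: sqrt(count / sample total).

    Assumes the sample contains at least one read.
    """
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))
```

For example, two samples sharing no OTU have a Bray-Curtis dissimilarity of 1, and two identical samples a dissimilarity of 0.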

IDENTIFICATION OF OTUs RELATED TO METAZOAN-ASSOCIATED BACTERIAL SYMBIONTS
Dataset_2 was used to screen for symbiont-related sequences. Sequences from this dataset were blasted (with a 97% identity threshold) against a local database (Supplementary File) containing sequences of bacteria living in symbiosis with deep-sea fauna (Potential_symbionts in Table S1). These sequences were then blasted against a public database and their blast hit definitions were filtered by scanning for the "symbiont" keyword (Confirmed_symbionts in Table S1). In parallel, a phylogenetic analysis was computed in order to evaluate the phylogenetic relationship of sequences to known symbionts. For this, 16S rRNA sequences of documented symbiotic bacteria from vent and seep metazoans and several best blast hits from our OTUs were aligned with SINA Web Aligner (Pruesse et al., 2012) and truncated to the V5-V6 region. Symbiont-related OTUs from this study were added to this dataset, aligned with ClustalX (Larkin et al., 2007), and alignments were manually checked. Phylogenetic relationships among sequences were estimated from a 250-bp alignment with MEGA6 (Tamura et al., 2013) using distance methods and neighbor-joining. Bootstrap values were computed on 1000 replicates.
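The two-step screen described above (identity against the local symbiont database, then a keyword scan of public-database hit definitions) could look like the following Python sketch. The dictionaries, the function name and the inclusive 97% boundary are assumptions made for illustration; they do not reproduce the authors' scripts.

```python
def screen_symbiont_candidates(local_identity, public_hit_titles,
                               min_identity=97.0, keyword="symbiont"):
    """Return (potential, confirmed) lists of OTU ids.

    local_identity: dict mapping OTU id -> best percent identity against the
        local symbiont database (hypothetical input layout).
    public_hit_titles: dict mapping OTU id -> list of hit definitions from the
        public database, best hit first (hypothetical input layout).
    """
    # Step 1: OTUs reaching the identity threshold (boundary assumed inclusive).
    potential = sorted(otu for otu, ident in local_identity.items()
                       if ident >= min_identity)
    # Step 2: keep those with the keyword in at least one of the ten best hits.
    confirmed = [
        otu for otu in potential
        if any(keyword in title.lower()
               for title in public_hit_titles.get(otu, [])[:10])
    ]
    return potential, confirmed
```

The keyword match is case-insensitive here, which is a design choice rather than something stated in the methods.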

PREPARATION OF DATASETS
Bacterial diversity was investigated on 25 individual CHEMECOLIs filled with either pine wood cubes or alfalfa and deployed for periods of 10-1112 days in four areas in the North East Atlantic and Mediterranean. A total of 564925 V5-V6 reads was obtained. After length filtering, elimination of PCR and random sequencing errors, a raw dataset containing 364633 sequences distributed in 22721 OTUs (97% cut-off) was obtained. Screening for symbiont-related sequences was made on a reduced dataset including 332621 sequences representing 2641 OTUs. Community composition and comparisons were performed on a dataset containing only abundant sequences, overall 306716 sequences in 658 OTUs. On average, dataset sizes were reduced by 35, 41, and 46% of total read numbers and contained 7012, 6397, and 5898 sequences per sample (2 replicate samples per CHEMECOLI). Details on sequence abundances in each sample and dataset, and the mean number of reads per sample are summarized in Table S1.
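The reductions and per-sample means quoted above can be re-derived from the read totals. The short Python check below assumes that the per-sample means are taken over the 52 sequenced samples reported in the DNA extraction section; the variable and function names are illustrative only.

```python
TOTAL_READS = 564925   # V5-V6 reads obtained
N_SAMPLES = 52         # sequenced samples, including replicates (assumed denominator)
DATASETS = {"Dataset_1": 364633, "Dataset_2": 332621, "Dataset_3": 306716}


def reduction_percent(kept, total=TOTAL_READS):
    """Percentage of the initial reads removed, rounded to the nearest integer."""
    return round(100 * (1 - kept / total))


def mean_reads_per_sample(kept, n_samples=N_SAMPLES):
    """Mean number of reads per sample, rounded to the nearest integer."""
    return round(kept / n_samples)


summary = {name: (reduction_percent(n), mean_reads_per_sample(n))
           for name, n in DATASETS.items()}
```

With these inputs, `summary` reproduces the 35, 41, and 46% reductions and the 7012, 6397, and 5898 reads per sample given in the text.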

TAXONOMIC RICHNESS AND DIVERSITY
Rarefaction curves based on a similarity threshold of 97% were generated for each substrate type separately (Figure S1). Generally, alfalfa samples showed more diverse (up to 1.6-fold) bacterial communities than their wood equivalents. This was confirmed by bacterial community richness and diversity indices (Table 2). Among alfalfa samples, all long-term ones (>289 days of deployment) from Gulf of Cadiz cold seeps (GoC), the Eastern Mediterranean cold seep (EM) and the Lucky Strike (L-S) hydrothermal vent on the Mid-Atlantic Ridge (MAR) (GoC-A-I to GoC-A-IV, EM-A-I and MAR-A-III) presented more diverse bacterial communities than other samples (Table 2). In these samples, the Chao1 index exceeded 1500 and the Shannon index exceeded 4.5, values higher than those of wood samples. On any given set of CHEMECOLIs, the bacterial community was less diverse in wood than in alfalfa samples (Table 2).
Within wood samples, GoC-W-II, GoC-W-III and MAR-W-III had the highest values of all considered diversity indices (Table 2).
Replicate samples for both alfalfa and wood displayed comparable values for all analyzed diversity indices (Table 2), with two exceptions, namely the replicates of a GoC alfalfa sample (GoC-A-II-1 and GoC-A-II-2) and the replicates of a short-term wood sample from MAR (MAR-W-IV-1 and MAR-W-IV-2) (Table 3).
Although Shannon and Invsimpson diversity indices showed similar values for the two GoC-A-II replicates, replicates from MAR-W-IV displayed a 6-fold difference in their Invsimpson index (Table 2).

TAXONOMIC COMPOSITION OF BACTERIAL COMMUNITIES
At the division or sub-division level, 10 taxa together included more than 96% of all sequences present in Dataset_3 (Figure 2; Table S2). Proteobacteria was the most abundant division in this dataset and alone represented more than 85% of the total number of reads (Figure 2; Table S2). Within them, three taxa (Gamma-, Delta-, and Alphaproteobacteria) corresponded to 77% of the total number of amplicons (Table S2). Gammaproteobacteria were remarkably over-represented in short-term samples (MAR-A/W-IV; EM-W-II and MAR-A-V, Figure 2; Table S2). Epsilonproteobacteria were also highly abundant in short-term samples (MAR-A/W-IV and MAR-A/W-V), while not exceeding 2% of total sequence numbers in long-term samples. They were particularly dominant (98%) in the short-term wood sample from M-G on MAR. While almost absent from short-term samples, Delta- and Alphaproteobacteria represented a significant fraction of reads in all long-term samples. Other taxa represented less than 30% of reads per sample. For further community analyses at the OTU level, we focused on the most abundant OTUs, each of which represented more than 1% of the total number of sequences in the dataset (>3000 amplicons; Table S3). Fifteen OTUs within the Proteobacteria and 1 within the Bacteroidetes matched this criterion, representing 131712 sequences (almost 43% of the total sequences) (Figure 3; Table S3). All were related to marine bacteria. Four major OTUs (OTU_00201, OTU_11277, OTU_00411 and OTU_00023) were abundant in almost all long-term samples and together constituted 38 to 86% of the total number of sequences in each sample (Table S3). Communities from short-term experiments were different from the long-term ones, comprising 6 highly abundant OTUs that were rare to absent in long-term deployments (Figure 3; Table S3).
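The 1% abundance criterion and the share of reads it captures can be checked with a short Python sketch. The Dataset_3 total (306716 sequences) is taken from the dataset preparation section; the function name and the dictionary layout of per-OTU totals are hypothetical.

```python
DATASET_3_READS = 306716   # sequences in the abundant-OTU dataset
TOP_OTU_READS = 131712     # reads in the 16 most abundant OTUs


def abundant_otus(otu_totals, total_reads, fraction=0.01):
    """Return ids of OTUs whose read total exceeds `fraction` of all reads.

    otu_totals: dict mapping OTU id -> total reads across samples (hypothetical layout).
    """
    cutoff = fraction * total_reads
    return sorted(otu for otu, n in otu_totals.items() if n > cutoff)


threshold = 0.01 * DATASET_3_READS                 # 1% of the dataset, in reads
share = 100 * TOP_OTU_READS / DATASET_3_READS      # percent of reads in the top OTUs
```

The threshold works out to roughly 3067 reads, in line with the ">3000 amplicons" given in the text, and `share` rounds to 42.9%, i.e., "almost 43%".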

COMPARISON AMONG BACTERIAL COMMUNITIES
Out of the 24 paired replicates of samples, 21 were grouped together in the UPGMA-based cluster dendrogram based on the Bray-Curtis dissimilarity matrix (Figure 4). The exceptions were two GoC samples (GoC-W-I and GoC-W-IV) and one short-term MAR sample (MAR-W-IV), for which replicates were not the closest relatives. This indicates congruency within most replicates, except for three out of 24 samples. Samples that had been fixed following two different procedures (ethanol fixation for MAR-A/W-V-1/2 and deep freezing for MAR-A/W-V-C) were also on the same branches, indicating that the type of fixation had limited influence on the results. Some groups could be easily distinguished. Both substrates for long-term EM samples (EM-A/W-I) clustered together. Alfalfa substrates from GoC (GoC-A-I to GoC-A-IV) and L-S (MAR-A-III) grouped together, as did wood substrates from GoC (GoC-W-I to GoC-W-IV). A larger group included samples corresponding to both substrates from MAR and HMMV (MAR-A/W-I to MAR-A/W-III, HMMV-A/W-I, and HMMV-A/W-II). Finally, all short-term samples formed a clearly distant group (MAR-A/W-IV, MAR-A/W-V, and EM-A/W-II). Interestingly, CHEMECOLIs deployed at the same site for different periods of time in the long-term experiment clustered within the same larger groups but not directly with each other. These include the GoC Mercator deployments for 289 (GoC-A/W-I) and 731 days (GoC-A/W-II), the MAR Rainbow deployments for 328 and 414 days (MAR-A/W-I and MAR-A/W-II), and the HMMV deployments for 388 and 752 days (HMMV-A/W-I and HMMV-A/W-II). Similar groupings were found on the NMDS graph (Figure S2).
The influence of factors (substrate type, deployment area) and variables (water temperature, duration of deployment and depth) on the community composition was evaluated by constrained RDA (Figure 5 and Figure S4). On the two-dimensional graph, clear segregation of samples by region was visible, with samples clustering around the centroids representing their region of origin. The influence of the type of substrate was less marked, as centroids representing pine wood and alfalfa were not very distant from one another (Figure 5 and Figure S4). Variables were represented as three vectors with their lengths proportional to the variance explained. Samples from MAR were, for example, aligned parallel to the vector "DAYS," according to the duration of deployment: short-term deployments were in the upper left (Figure S4). The RDA overall explained 38% of the total variance. The deployment region accounted for 21% of the variance; days, substrate, and depth each accounted for between 4 and 5%; and water temperature was responsible for less than 4% of the total variance (Table S4).
The ANOVA test confirmed that all factors and variables were statistically significant, based on permutation tests (p < 0.001, 99 permutations; Table S4).

IDENTIFICATION AND DISTRIBUTION OF BACTERIA RELATED TO CHEMOSYNTHETIC SYMBIONTS
A total of 30607 reads within 34 OTUs (∼9.2% of Dataset_2) were identified as potential symbionts (>97% identity with sequences in the local database of symbionts), while 21612 reads, distributed in 16 OTUs, had the "symbiont" keyword in at least one of their ten best blast hit definitions (∼6.5% of Dataset_2). Putatively related symbionts were distributed within 9 potential metazoan host families. More than 22000 reads belonged to relatives of symbionts associated with Rimicaris shrimps, Idas mussels and Lyrodus shipworms. A phylogenetic tree was computed using representative sequences of potential symbiont OTUs, one best blast hit from the local database of symbionts and four best blast hits from public databases (Figure S3). Only the OTUs displaying the keyword "symbiont" within at least one of the ten best blast hits (public database) and above 97% sequence identity with a symbiont sequence were retained as potential candidates for free-living forms of symbionts (Table S5). We obtained 21612 sequences representing 16 OTUs related to symbionts (Table S5).
The Rimicaris symbiont-related OTUs were highly abundant in short-term deployments in MAR, very rare in EM and GoC, and absent everywhere else. Relatives of Idas Bacteroidetes and Symbiont G, Siboglinidae symbionts and those of Tubificidae were ubiquitous (GoC, EM, MAR, HMMV). Those of Lyrodus were present only in GoC and EM, and those of Thyasiridae bivalves only in GoC. The occurrence of symbiont-related OTUs was related to the host presence in the corresponding regions (Table 3). Shrimps and their symbionts co-occurred on the MAR (WoRMS Website), with only a few symbiont-related sequences in GoC and EM. Thyasirids (Rodrigues and Duperron, 2011) and their symbionts co-occurred in GoC, while symbionts were absent in MAR, HMMV and EM. Symbionts of Idas mussels and siboglinids were present in all regions, while the hosts were absent at HMMV and MAR (World Polychaeta Database Website), respectively. Symbionts of Tubificoides benedii were also ubiquitous, while those of teredinids could be detected mainly in GoC and EM. Metazoans of these two families occur in shallow water.

METHODOLOGY
All CHEMECOLI samples were analyzed in duplicate. Estimated diversity and richness indices of most duplicates were comparable, save for two exceptions in which Chao1 and ACE indices varied 5-9-fold between replicates, for a long-term sample from GoC and for a short-term one from MAR. These two exceptions emphasize the need for sample replication. The UPGMA dendrograms evidenced the very short distances between most replicated samples. Similar groupings were observed on the RDA graph (Figure 5), in which replicates were most often in very close vicinity to each other. Finally, high similarity was observed between samples stored in 96% ethanol and samples deep frozen at −80 °C. Ethanol treatment is thus an appropriate method for sample storage and conservation in view of the difficult conditions sometimes experienced on board oceanographic cruises and during shipment.

[Figure caption, partial] ... Table 1. Color code is the same as that used in Figure 1.

BACTERIAL COLONIZATION OF PINE WOOD AND ALFALFA SUBSTRATES IN DEEP-SEA CHEMOSYNTHESIS-BASED ECOSYSTEMS
A variety of devices have been designed to study bacterial colonization in deep-sea ecosystems such as hydrothermal vents, in most cases using relatively inert surfaces (examples in Reysenbach et al., 2000; López-García et al., 2003; Rassa et al., 2009). Wood substrates have been introduced more recently (Fagervold et al., 2012, 2013; Bienhold et al., 2013). In our study, bacterial communities colonizing alfalfa were usually more diverse than those colonizing wood samples (Figure S1; Table 2). This is not unexpected because wood has a high content of lignin, which consists of refractory hydrophobic polymers that are harder to degrade than cellulose, and because alfalfa offers a greater surface area per volume for bacterial colonization. The overall OTU diversity in short-term deployments was lower than in samples deployed for more than 289 days. After 289 days, no correlation was observed between community richness and deployment duration (data not shown), suggesting that diversity may have reached saturation sometime between 14 and 289 days. Short-term samples were usually recovered during distinct legs of the same cruise, while long-term samples were recovered during follow-up cruises, which occurred on a yearly basis, explaining the lack of intermediate deployment times. These would however be crucial to understand how fast saturation is reached.
The bacterial community in long-term samples included only 10 bacterial subdivisions (Figure 2; Table S2). Three of them (Alpha-, Delta-, and Gammaproteobacteria) were over-represented compared to the others, as shown before (Fagervold et al., 2012, 2013; Bienhold et al., 2013). Among the most dominant OTUs (Figure 3; Table S3) were members of the Oceanospirillales and Alteromonadales, facultative anaerobes that grow on cellobiose and were previously found on sunken wood (Fagervold et al., 2012). Within the Deltaproteobacteria, OTUs related to Myxococcales and Desulfobacterales, and two groups of SRBs detected by Bienhold et al. (2013) (OTU refs in Bienhold et al., 2013: Deltaproteobacteria_03, _24, _27, _50, _55, _283, _421, and _737), were identified. Finally, Bacteroidetes were found, which are often chemoorganotrophs specialized in degrading various biopolymers such as cellulose (Kirchman, 2002). The composition of short-term communities was different and, except for the wood sample from M-G, was dominated by Gammaproteobacteria. This is comparable with the 1-day deployment control sample in the study by Bienhold et al. (2013). Epsilonproteobacteria were over-represented in the short-term M-G wood sample owing to the high abundance of the Rimicaris exoculata ectosymbiont (Table S3), which is discussed below. The abundance of Epsilonproteobacteria is consistent with their dominance at seep and vent ecosystems and their documented quick colonization of newly formed habitats including wood (Reysenbach et al., 2000; López-García et al., 2003; Huber et al., 2010; Fagervold et al., 2012, 2013; Bienhold et al., 2013).

[Figure 5 caption, partial] ... were used as variables in the constrained RDA, generated in R using the "rda()" and "plot()" functions in the "vegan" package. Geographic areas are indicated by ellipses with color corresponding to region of origin. Color code follows that of Figures 1, 4. Sample IDs of individual points can be found in Figure S4, and environmental variables and factors are described in Table 1. The influence of each variable and factor and their significance were assessed using ANOVA permutation tests; details in Table S4.

www.frontiersin.org, February 2015 | Volume 6 | Article 162

MAJOR FACTORS INFLUENCING THE MICROBIAL COMMUNITY COMPOSITION
The RDA explains less than 40% of the total variance observed in the community composition. This means that other factors and variables, likely including environmental parameters not measured during sampling, play a major role. These are hard to obtain because only time-series acquired in the immediate vicinity of CHEMECOLIs and over the course of the experiment would make sense. Nevertheless, physico-chemical data could explain another significant fraction of the total variance (Lee et al., 2014). Among the factors considered here, the deployment area contributed 21% of the variance. This is evident from the UPGMA, NMDS and RDA graphs, which cluster samples according to their area of origin, with few exceptions (i.e., L-S sample MAR-A/W-III). EM samples are grouped, and GoC alfalfa and wood samples form two rather distinct groups. Interestingly, MAR and HMMV samples appear more mixed. This suggests that communities from MAR vent and HMMV cold seep deployments are less distinct, despite the great geographical distance and very different latitudes. This observation is reminiscent of a previous study in which deep ocean microbial communities from poles and low latitudes differed less than corresponding samples from the euphotic surface waters (Ghiglione et al., 2012). Other factors and variables each contribute 5% or less to the variance. The fact that substrate does not explain much variance is probably due to the similar composition of alfalfa and wood, both being plant-derived material. The consequence is that although alfalfa communities are more diverse, the most abundant OTUs tend to be shared. Alfalfa may host a greater number of rare OTUs. More caution is needed when looking at other variables. Duration of deployment, for example, only explains 5% of the total variance, yet short-term samples are remarkably less diverse and cluster very far from long-term samples in the various analyses.
This apparent lack of congruency between the results of the RDA and the other comparisons is most likely due to a sampling bias. Short-term samples (10-13 days) represented only 5 of the 25 deployed CHEMECOLIs, the others being deployed for 289-1112 days. Their quantitative contribution to the variance in the overall dataset is thus low, and only with as many short-term as long-term samples would we have a correct estimate of the percentage of variance explained by deployment duration. Long-term CHEMECOLIs deployed at the same site for different durations do not display the most similar communities, but their communities do not cluster very far from one another, suggesting a certain level of stability at this stage. With better coverage of intermediate times and a more balanced design, duration of deployment could become a major explanatory factor (Palacios et al., 2009; Fagervold et al., 2012). Although the temperature variable is well balanced, with 7 sites displaying typical deep-sea temperatures (2-4.6 °C) against six others ranging from 10 to 13 °C, it explains less than 4% of the total variance. Depths cover a broad bathymetric range (354 to 2300 m) and samples are distributed rather homogeneously, yet depth explains only 4% of the variance. Depth is usually not considered a major factor in the aphotic zone of the ocean (Fagervold et al., 2013; Zhu et al., 2013). Neither latitude nor water temperature showed significant correlations with estimated microbial diversities in the previous large-scale study by Ghiglione et al. (2012). Overall, the geographical area of deployment seems to be the most important factor influencing the microbial colonization in this study. Short-term samples differ from long-term ones in both their overall diversity and their community composition. Both substrates share similar dominant bacteria, and alfalfa displays greater diversity thanks to rare OTUs.
Communities become saturated in terms of both diversity and composition between 14 and 289 days after deployment. For longer periods, bacterial composition appears to be relatively stable, as samples from the same sites group together. Only marine microbes colonize both substrates in CHEMECOLIs; no typical contaminants were detected. The bacterial taxonomic composition identified in the colonization devices, compared with other studies (Fagervold et al., 2012, 2013; Bienhold et al., 2013), confirms that the degradation of organic matter starts within the first 2 weeks after immersion.

COLONIZATION OF SUBSTRATES BY SYMBIONT-RELATED BACTERIA
The sequence dataset was screened for sequences related to chemosynthetic symbionts of metazoans. Using a sequence similarity criterion (97%), we retained sequences closely related to documented symbiotic bacteria as reasonably supported candidates. Although the short read length hampers the quality of phylogenetic reconstructions, the combination of sequence similarity and phylogenetic relatedness appears to be a rather conservative approach. Almost 70% of symbiont-related sequences were found on pine wood (Table S1). The higher diversity within alfalfa samples may indeed "dilute" symbiont-related sequences among a larger number of other bacterial OTUs, making their detection less likely. Putative free-living forms of symbionts or symbiont-related bacteria from five host metazoan families were identified. The most abundant were related, and 99% identical, to Epsilonproteobacteria ectosymbionts of Rimicaris exoculata. In this species, bacteria quickly re-colonize the shrimp's gill chamber after each molt, estimated to occur every 10 days (Zbinden et al., 2004; Corbari et al., 2008). Our results are in good agreement with this constraint. Shrimp symbiont-related OTUs were indeed the most abundant in the short-term MAR samples (MAR-A/W-IV and MAR-A/W-V, 10-14 days), and absent from all long-term deployments. They thus seem to be efficient pioneer colonizers of available surfaces at vents. This calls into question the specificity of the symbiotic association. It may well be that the bacteria behave as opportunists, colonizing the new cuticle when it appears as they would colonize any surface. The molt cycle would favor iterative colonization by the same bacteria simply by renewing the available surface every few weeks.
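The similarity step of the screening described at the start of this section can be sketched as a simple filter. This is a minimal illustration that assumes the criterion means at least 97% identity to a documented symbiont. The `Hit` record, the function `retain_candidates`, and all identifiers and identity values below are invented placeholders, not data from this study, and the phylogenetic confirmation step is not covered.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    otu: str          # OTU identifier (placeholder)
    reference: str    # closest documented symbiont sequence (placeholder)
    identity: float   # percent identity to that reference


def retain_candidates(hits, threshold=97.0):
    """Keep OTUs whose identity to a documented symbiont meets the threshold."""
    return [h for h in hits if h.identity >= threshold]


# Invented example records, for illustration only.
hits = [
    Hit("OTU_a", "symbiont_ref_1", 99.0),
    Hit("OTU_b", "symbiont_ref_2", 97.4),
    Hit("OTU_c", "symbiont_ref_3", 93.2),
]
candidates = retain_candidates(hits)
print([h.otu for h in candidates])
```

In practice, the retained OTUs would still need to be placed on a phylogenetic tree, since similarity alone over a short read does not establish relatedness to a symbiont.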
Relatives of gill endosymbionts of the mytilid Idas sp. were also abundant, mostly the Symbiont-G group and the Bacteroidetes, two symbiont groups which were recently discovered in species from the eastern Mediterranean and the Gulf of Cadiz (Duperron et al., 2008;Rodrigues et al., 2013). The usual sulfur-oxidizers are much rarer, though a few were found. Finding free-living relatives of mussel symbionts was not unexpected, given that environmental acquisition of symbionts is supported in Idas modiolaeformis based on the absence of bacteria within gonads and gametes (Gaudron et al., 2012). In a recent study, Laming et al. (2014) showed that environmental symbiont acquisition took place at the early juvenile stage, after settlement.
The occurrence of Lyrodus and Osedax symbiont-related OTUs, with identities between 97.4 and almost 100%, was more unexpected in our samples, as shipworms live in shallow coastal waters (Borges et al., 2014) and Osedax is a bone-eating specialist (Rouse et al., 2004; Glover et al., 2005). Lyrodus gill endosymbionts have been proposed to produce cellulolytic enzymes that contribute to the host's ability to digest wood (Waterbury et al., 1983), and multiple bacterial phylotypes can co-occur in its gill bacteriocytes (Distel et al., 2002). Although the symbiont transmission mode of Lyrodus pedicellatus remains unknown, vertical transmission has been demonstrated for its relative Bankia setacea (Sipe et al., 2000). However, other wood-eating bivalves (Xylophaga spp. and Xyloredo spp.) were abundant in most samples and, although we carefully avoided processing any animal tissue in our analyses, we cannot rule out that some bacterial symbionts were released during processing. Given that Bienhold et al. (2013) suggested a close phylogenetic relationship between symbionts of shallow and deep-sea wood-boring bivalves, we cannot objectively conclude whether the identified OTUs are actually free-living forms of Lyrodus pedicellatus or Xylophaga spp. symbionts.
The presence of an OTU related to Tubificoides benedii symbionts is also surprising, as these oligochaete worms are found in eutrophic coastal sediments; at the same time, their ectosymbionts have been shown to belong to clades that consist almost exclusively of bacteria associated with invertebrates from deep-sea hydrothermal vents (Ruehland and Dubilier, 2010). The co-occurrence of thyasirid bivalves and their symbionts, represented in our study by only 5 sequences in GoC, may be explained by the lifestyle of these animals, which burrow deeply in sediments (Dufour and Felbeck, 2003).

SYMBIONT TRANSMISSION AND COLONIZATION
All symbiont-related OTUs corresponded to environmentally transmitted bacteria. The existence of free-living forms is thus expected, although their natural habitat is not documented (Gros et al., 1996; Harmer et al., 2008). Symbiont-related sequences were quite frequent in a few samples, and some were present at several sites. The abundance of symbionts on short-term wood samples indicates that they are probably efficient colonizers, although little is known about the mechanism of symbiont acquisition from the environment. The Idas sp. Symbiont-G-related OTUs, on the other hand, were found in moderate numbers but in almost all samples. Most of the symbionts detected shared the same distribution pattern as their hosts (Table 3). An exception was the shipworm symbiont in GoC and EM, whereas its host occurs only in coastal shallow waters (Borges et al., 2014); this may be due to its close relatedness to symbionts of Xylophaga spp., which are present. We also did not expect the occurrence of symbionts of Siboglinidae polychaetes on MAR, nor of those of Idas sp. in HMMV. The detection of unexpected symbiont-related phylotypes (Table 3; Table S5) fuels the debate on the connectivity between different deep-sea sites (Vrijenhoek, 2010). Moreover, it confirms that short- and long-term colonization devices and high-throughput sequencing are good tools to further investigate the habitat and densities of free-living relatives of the symbionts.
The next step will be to distinguish between free-living close relatives of symbionts and true free-living forms of symbionts, but this will be challenging. Symbiont-specific fluorescence in situ hybridization (FISH) probes may not discriminate properly between them, and the low abundance of free-living relatives may make their detection and quantification difficult on substrata that show a high level of background autofluorescence (data not shown). Finally, the analyzed hypervariable region of the 16S rRNA seems too short to be a good target for highly specific probes. Standardized devices are currently being deployed in other geographical regions and in other habitats. The protocol applied in our study could be substantially improved by the increasing length of pyrosequencing reads, which will make the identification of microbes more reliable. To better understand the influence of the environment on microbial communities, water chemistry should be monitored precisely in the proximity of the colonization devices.
Quest 4000 (MARUM, Bremen, Germany) and submersible Nautile (Ifremer, France) and also people on board from all cruises participating in the deployment and recovery of CHEMECOLIs. We thank Françoise Gaill who initiated the CHEMECO EuroDEEP ESF program. This research was supported by UPMC, ANR DeepOases, CHEMECO EuroDEEP ESF, GDR DIWOOD, ITN Symbiomics and in Portugal by European funds (COMPETE) and by national funds through the Portuguese Science Foundation (FCT-EURODEEP/0001/2007 and FCT-PEst-C/MAR/LA0017/2013). Kamil Szafranski was funded through a Ph.D. grant from the Marie Curie Actions Initial Training Network (ITN) SYMBIOMICS (contract number 264774).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fmicb.2015.00162/abstract
Figure S1 | Rarefaction analysis for the alfalfa grass (A) and pine wood (W) samples. The curves were generated for 97% levels of OTU using Mothur (Schloss et al., 2009). Sample IDs are described in Table 1.
Table S5 | The total number of sequences of symbiont-related OTUs identified by the "symbio" key word in at least one of their four best BLAST hit definitions or high similarity with a sequence from the symbiont database and confirmed by the phylogenetic proximity. Only OTUs displaying a symbiont sequence as their closest relative on the phylogenetic tree were retained. Sample IDs are described in Table 1.