Comparative Analysis of Secretomes from Ectomycorrhizal Fungi with an Emphasis on Small-Secreted Proteins

Fungi are major players in the carbon cycle in forest ecosystems due to the wide range of interactions they have with plants, either through soil degradation processes by litter decayers or through biotrophic interactions with pathogenic and ectomycorrhizal symbionts. Secretion of fungal proteins mediates these interactions by allowing the fungus to interact with its environment and/or host. Ectomycorrhizal (ECM) symbiosis independently appeared several times throughout evolution and involves approximately 80% of trees. Despite extensive physiological studies on ECM symbionts, little is known about the composition and specificities of their secretomes. In this study, we used a bioinformatics pipeline to predict and analyze the secretomes of 49 fungal species, including 11 ECM fungi, wood and soil decayers and pathogenic fungi, to tackle the following questions: (1) Are there differences between the secretomes of saprophytic and ECM fungi? (2) Are small-secreted proteins (SSPs) more abundant in biotrophic fungi than in saprophytic fungi? and (3) Are there SSPs shared between ECM, saprotrophic and pathogenic fungi? We showed that the number of predicted secreted proteins is similar in the surveyed species, independently of their lifestyle. The secretome of ECM fungi is characterized by a restricted number of secreted CAZymes, but their repertoires of secreted proteases and lipases are similar to those of saprotrophic fungi. Focusing on SSPs, we showed that the secretome of ECM fungi is enriched in SSPs compared with other species. Most of the SSPs are coded by orphan genes with no known PFAM domain or similarities to known sequences in databases. Finally, based on the clustering analysis, we identified shared and lifestyle-specific SSPs between saprotrophic and ECM fungi. The presence of SSPs is not limited to fungi interacting with living plants, as the genomes of saprotrophic fungi also code for numerous SSPs. ECM fungi share lifestyle-specific SSPs likely involved in symbiosis that are good candidates for further functional analyses.


INTRODUCTION
In forest ecosystems, tree roots are continuously in contact with beneficial, commensal and pathogenic soil microbes. These microbial communities, called the microbiome, are also responsible for nutrient (C, N, and P) recycling and exchanges and have an impact on soil fertility (Lakshmanan et al., 2014) and carbon sequestration (Schimel and Schaeffer, 2012). Consequently, the root microbiome drives forest health, productivity and sustainability (Wagg et al., 2014). Among those microorganisms, fungi stand out as key players that engage in a wide range of interactions with plants. They include wood decayers, saprotrophic soil decomposers, plant pathogens and mutualistic symbionts (Bonfante and Genre, 2010; Veneault-Fourrey and Martin, 2011). Ectomycorrhizal (ECM) symbioses appeared more than 180 Mya (Hibbett and Matheny, 2009), several times independently (Ryberg and Matheny, 2012; Kohler et al., 2015). ECM symbioses evolved from ecologically diverse decayer precursors (white- and brown-rot wood decayers, litter decayers) and radiated in parallel, following the origins of their host plant lineages. Polyphyletic evolution of the ECM lifestyle is marked not only by convergent losses of different components of the ancestral saprotrophic apparatus but also by rapid genetic turnover in symbiosis-induced genes. The ECM symbiosis is the most prominent mycorrhiza occurring in forest ecosystems. The tree supplies the ECM fungus with up to 30% of its photosynthesis-derived carbohydrates in return for up to 70% of its N and P needs, which are received from the ECM hyphal networks that extend deep within the soil (Nehls, 2008; Martin and Nehls, 2009). Thus, this mutualistic interaction relies on constant nutrient exchange between partners and contributes to better tree growth and health by improving mineral nutrition, strengthening plant defenses and directly contributing to the exclusion of competitive microbes (Wallander et al., 2001; Franklin et al., 2014).
Fungi release a wide range of proteins into the extracellular matrix to decay their substrates and to interact with their microbial, plant, or animal competitors and partners (Stergiopoulos and de Wit, 2009; Tian et al., 2009; Talbot et al., 2013; Essig et al., 2014). Fungal secretomes are composed of several protein categories, including proteases, lipases, Carbohydrate-Active enZymes (CAZymes), secreted proteins of unknown function and small-secreted proteins (SSPs) (Alfaro et al., 2014). These secreted proteins participate either in organic matter degradation, through hydrolytic enzymes such as CAZymes (Zhao et al., 2014), proteases or lipases, or in interactions with the host, through surface proteins like hydrophobins (Linder et al., 2005) or SSPs (Martin and Kamoun, 2011; van Ooij, 2011). ECM fungi have a reduced set of plant cell wall-degrading enzymes. Over the past decade, SSPs have emerged as the cornerstone of the molecular dialog with host plants, altering host metabolism and/or defense responses in plant-microbe interactions (Kloppholz et al., 2011; Giraldo and Valent, 2013; Rovenich et al., 2014; Lo Presti et al., 2015). They were gradually associated with the term effector, which is defined as "microbial or pest secreted molecules that alter host-cell processes or structures and generally promote the microbe lifestyle". SSPs appear to play a key role in the ECM symbiosis (Plett and Martin, 2015). The protein MiSSP7 from Laccaria bicolor is required to establish symbiosis. This MiSSP is targeted to the host-plant nuclei, where it interacts with the jasmonate co-receptor JAZ6, suppressing the plant defense reactions and allowing the development of the apoplastic Hartig net (Plett et al., 2011, 2014). However, despite their ecological importance, little is known about the secretome of ECM fungi, as most published analyses focused on the full genome repertoire of CAZymes, either secreted or not.
Only a few studies combined in silico prediction of the secretome with proteomic analysis of the secretome (Vincent et al., 2012; Doré et al., 2015). To investigate whether the various components of the fungal secretome (CAZymes, proteases, lipases, SSPs) differ between ECM, pathogenic and saprotrophic species, we predicted, annotated and compared the secretomes of 49 fungal species, including 11 recently sequenced ECM symbionts available via the JGI fungal genome portal MycoCosm (Grigoriev et al., 2014). Using this large set of predicted gene repertoires, we showed that the secretome size is not related to the fungal lifestyle. We also identified SSPs shared between ECM and saprophytic fungi, as well as lifestyle-specific SSPs.

Correlation between Secretome and Fungal Lifestyles
To determine whether secretomes and SSPs are related to biotrophic or saprotrophic lifestyles, we developed a pipeline to identify and compare the secretomes of 49 fungi, including 41 Basidiomycota, six Ascomycota, one Zygomycota, and one Chytridiomycota (Figure 1, Table 1). Predicted secreted proteins contain an N-terminal type II secretion signal peptide, no transmembrane domain and no sequences that retain them in organelles (mitochondria, plastids, ER, Golgi, etc.). Within the Basidiomycota, soil decayers and white-rot fungi display the largest secretomes. Secretome sizes ranged widely, from 155 to 1715 signal peptide-containing proteins (Figure 2). The total number of genes predicted in the fungal genomes analyzed ranged from 4000 to 25,000 and the proportion of secreted proteins from 3 to 10% of the total proteome. Most of the 49 secretomes analyzed (73.5%) contained between 500 and 1000 signal peptide (SP)-containing proteins (Figure 2). However, some secretomes fall outside this range, with a greater (12% of analyzed secretomes) or smaller (14% of analyzed secretomes) number of secreted proteins. Neither extreme is restricted to a single lifestyle, as three white-rot fungi (Galerina marginata, Sphaerobolus stellatus, and Auricularia subglabra), one litter decayer (Gymnopus luxurians) and one plant-biotrophic pathogen (Melampsora larici-populina) possess more than 1000 SP-containing proteins (Figure 2). Plant pathogenic fungi have the largest proportion of secreted proteins. The size of the secretome from pathogenic and white-rot fungi correlates with their proteome size, with a correlation coefficient r² of 0.97 and 0.84, respectively (Supplementary Image 1A). In contrast, the secretome size from brown-rot decayers and ECM fungi does not correlate with the proteome size, as they display a correlation coefficient of 0.57 and 0.30, respectively (Supplementary Image 1A). The fungi with the smallest secretomes are the ECM Tuber melanosporum, the mycoparasitic Tremella mesenterica, the litter decayer Phycomyces blakesleeanus, the plant biotrophic pathogen Ustilago maydis and the yeast Pichia stipitis.

FIGURE 1 | Pipeline used to identify and annotate fungal secretomes. Secreted proteins were predicted from the genomic sequences of 49 fungal genomes retrieved from the Joint Genome Institute (JGI, url: http://genome.jgi.doe.gov/Mycorrhizal_fungi/Mycorrhizal_fungi.info.html). Prediction uses combined characteristics: proteins with a signal peptide as detected by SignalP v4.1 and no transmembrane domain, or one overlapping the signal peptide, and no internal localization (no endoplasmic reticulum addressing "KDEL" motif, secretory pathway by TargetP v1.1 and extracellular by WoLF PSORT 0.2). Proteins were annotated according to motif scanning (PFAM, NLS, KR-rich region), homologies found in specialized databases (CAZy database, MEROPS for proteases and Lipase Engineering Database) and standard databanks (Uniprot, Swissprot, Mycocosm). SignalP 4.1 was set to "sensitive" mode; the other software was run with default parameters.
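The relationship between secretome size and proteome size reported in this section can be illustrated with a short calculation. The sketch below assumes that the reported r² is the square of a Pearson correlation coefficient, which the text does not state explicitly; the function name and the input values are hypothetical and are not taken from the study.

```python
def pearson_r_squared(xs, ys):
    """Return the squared Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov * cov / (var_x * var_y)


# Hypothetical proteome and secretome sizes for one lifestyle group
# (illustrative numbers only, not data from the paper).
proteome_sizes = [10000, 15000, 20000, 25000]
secretome_sizes = [500, 760, 990, 1260]
r2 = pearson_r_squared(proteome_sizes, secretome_sizes)
```

Computing the value separately for each lifestyle group, as done for Supplementary Image 1A, would only require calling the function once per group.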

Enrichment of Functional Categories
To identify ECM-specific proteins in secreted proteins with a known PFAM domain (if any), we performed an enrichment analysis on the protein repertoires from species belonging to five fungal lifestyles (white-rot and brown-rot decayers, litter decayers, ECM, and plant pathogens; Table 2). Most of the PFAM domains identified in secreted proteins are related to enzymatic activities [proteases, lipases, and glycosyl hydrolases (GH)]. None of the enriched categories of PFAM-containing secreted proteins is shared by all ECM fungi studied (Table 2).

Composition of Fungal Secretomes by CAZymes, Proteases, and Lipases
Our analysis then focused on four categories (i.e., proteases, lipases, CAZymes, and SSPs) known for their biological and ecological relevance in saprotrophic and ECM fungi. In this study, we defined SSPs as predicted secreted proteins smaller than 300 amino acids. Secretomes have been annotated according to similarities with available databases or a chosen cut-off (Figure 3). SSPs are the most abundant secreted proteins in all species analyzed, followed by CAZymes, proteases and finally lipases (Figure 3).

CAZymes
The average number of secreted PCW-degrading CAZymes of ECM fungi is significantly reduced when compared to that of the other fungal lifestyles, except plant pathogens (Supplementary Image 2, Supplementary Data sheet 1). A detailed analysis of the CAZyme families and their evolution is provided in Kohler et al. (2015).

Proteases
To assess the protease capability of the analyzed fungi (mainly found in forest ecosystems), we performed a BLASTP search against the MEROPS database. We considered families S08, S09, M36, S53, and A01 for endoproteases and families M28 and S28 for exoproteases, as they are ecologically relevant for ECM fungi. We then subdivided the proteases into exoproteases and endoproteases. Proteases represent approximately 7% of the secretome, ranging from 6 endo- or exoproteases for the litter decayer Agaricus bisporus to 78 proteases for the white-rot fungus A. subglabra. Both types of proteases are present in all studied species (Figure 4). Endoproteases are present in higher numbers than exoproteases independently of the lifestyle (Figure 4). No significant differences were found between biotrophic and saprotrophic species (Supplementary Image 3), indicating that ECM fungi have conserved the protease ability of their saprotrophic cousins.

Lipases
Lipases are the least represented secreted proteins, regardless of the fungal lifestyle. The secretome of the white-rot fungus S. stellatus contains the highest number of secreted lipases (44), whereas the ECM symbiont Pisolithus tinctorius and the yeast Pichia stipitis have only two secreted lipases. Genome-wide analysis of secreted lipases shows differences between the GX and GGGX lipase classes (Figure 5), the two main classes used by the Lipase Engineering Database to classify lipases. Among the 49 fungal secretomes, the GGGX class, which contains carboxylesterases and Candida rugosa lipase-like (CRL) proteins, is more represented than the GX class (Figure 5). Two fungi (the white rot Phycomyces blakesleeanus and the yeast Pichia stipitis) have no GGGX class lipases (Figure 5A). ECM fungi lack secreted carboxylesterases (CE) from the GGGX class, except for Piloderma croceum (Figure 5A), in contrast with the ericoid symbiont Oidiodendron maius, which presents a high number of both CE and CRL. Among GX class lipases, only two fungi (the ECM symbiont Suillus luteus and the white rot G. marginata) have lysophospholipases (Figure 5B). By contrast, thioesterases are found among all ECM fungi except Scleroderma citrinum (Figure 5B). ECM fungi display a low repertoire of filamentous fungal lipases compared with white rot fungi (Supplementary Image 4).

Presence of SSPs in ECM Secretomes
There is a positive correlation between the number of SSPs and the secretome size, with the largest secretomes having the highest number of SSPs (Supplementary Image 1B). ECM fungi display a significantly higher percentage of SSPs in their secretome than saprophytic species (Figure 6), whereas this percentage does not vary significantly between the other lifestyles. Species-specific SSPs (no sequence similarity within the set of compared fungi using BLASTP) are more abundant in plant pathogens, whereas they display similar levels in other lifestyles (Figure 6). Overall, ECM fungi are enriched in SSPs compared to the other fungal lifestyles, but do not display significantly higher numbers of species-specific SSPs.

Identification of Shared- and Lifestyle-specific SSPs
To identify conserved SSPs shared between the 28 saprotrophic (white rot, brown rot, and soil and litter decayers) and 14 mycorrhizal fungi (orchid, ericoid and ECM symbionts), we performed a clustering analysis based on sequence identity using CD-HIT software with an identity threshold set to 70% (Figure 7). The clustering analysis was performed on a total of 16,821 SSP sequences and generated a total of 14,284 clusters, of which 101 clusters contain SSPs from at least three different fungal species (Figure 7A). Only the latter clusters were kept for further analysis. Most of the defined clusters are species-specific (clusters with only one species) (Supplementary Data sheet 2). There is no cluster containing SSPs shared between all lifestyles, but seven SSP clusters are shared between ECM and saprotrophic fungi only (Figure 7B). PFAM annotations indicate that those proteins are related to fungal proteins of unknown function and uncharacterized domains (PF09435 and PF08520), thaumatin (PF00314), GH 25 (PF01183), cyclophilin (PF00160), and ADP-ribosylation family (PF00025) (Table 3). ECM fungi share most of their SSPs with brown rot (19 clusters), white rot (7 clusters), and litter decayers (3 clusters) (Figure 7B). Moreover, 17 clusters are specific to ECM fungi (Table 3). These clusters contain proteins with PFAM domains associated with ceratoplatanin (PF07249), Ser-Thr-rich glycosyl-phosphatidyl-inositol-anchored membrane proteins (PF10342), GH 25 (PF01183), CUE domains likely binding ubiquitin (PF02845), NADPH-binding protein (PF13460), thiol-oxidoreductases (PF07249), and Evr1/Alr family proteins likely involved in the maturation of Fe/S clusters (PF04777). Six clusters specific to ECM fungi contain SSPs without any PFAM domain, suggesting the presence of lifestyle-specific SSPs among ECM symbionts. Finally, SSPs specific to saprotrophic fungi include fungal hydrophobin (PF01185) and GH 61 (PF03443) sequences.
To explore the evolution of SSP-encoding genes, we performed a phylogenetic analysis on the cluster containing the highest number of SSPs from saprotrophic (white rot, brown rot, and litter decayer) and ectomycorrhizal fungi (Figure 7C). Two clades are highly supported by bootstraps. Clade I contains mainly Boletales and one Atheliales, whereas Clade II contains mainly Agaricales and Polyporales. These two clades are highly consistent with taxonomy and are thus not independent from it. Most interestingly, clade I is enriched in, but not exclusively composed of, ECM fungi, whereas clade II is enriched in, but not exclusively composed of, saprotrophic fungi. This analysis supports an independent diversification of SSPs. However, clade I contains, among seven ECM fungi, the brown-rot fungus Serpula lacrymans, and clade II contains, among 11 saprotrophic fungi, two ECM fungi: Hebeloma cylindrosporum and Amanita muscaria. This suggests that SSPs of ECM fungi may have evolved from their saprotrophic ancestors.

DISCUSSION
This comparative genomic study is based on in silico analysis and bioinformatics tools. The results should thus be interpreted with care, as they may be affected by both the quality of sequencing (Supplementary Data sheet 3) and the annotation tools used. For instance, the secretome sizes found for two white rot fungi, S. stellatus and A. subglabra, are much higher than those of any other fungal secretome analyzed in this study (1715 and 1682 predicted secreted proteins, respectively). Moreover, proteins can be secreted through unconventional secretion systems (Nickel and Rabouille, 2009); as these proteins do not contain a signal peptide, they are not predicted as secreted by our bioinformatics pipeline. For instance, proteomic analysis of the H. cylindrosporum exoproteome identified 228 secreted proteins not computationally predicted as secreted (Doré et al., 2015). On the other hand, a proteomic analysis of L. bicolor free-living mycelium revealed 815 secreted proteins using an SDS-PAGE shotgun method (Vincent et al., 2012), a number similar to that of the in silico predicted secreted proteins (854 proteins predicted as secreted). However, the two exoproteomes differ in their composition (Vincent et al., 2012). Vincent et al. also highlighted that the experimental approach used for the separation and identification of proteins has an impact on the detection of secreted proteins (Vincent et al., 2012). Therefore, a combination of computational and experimental approaches appears to be the most accurate strategy to study the composition of fungal secretomes.

Fungal Secretomes, Not a Matter of Size
In this study, we predicted, analyzed and compared the secretomes of 49 fungal species, including 11 ECM fungi. We focused our analysis on ECM secretomes, aiming to identify shared or lifestyle-specific features. A previous analysis of 33 microbial secretomes (mostly Ascomycota and Oomycota) showed that the number of predicted secreted proteins was related to the phylogenetic relationships between species rather than their ecological traits (i.e., lifestyles) (Krijger et al., 2014). Our study, mostly performed on Basidiomycota, confirms and extends those conclusions by showing that most of the analyzed secretomes contain 500-1000 proteins and that their size is not related to the species lifestyle. However, a recent proteomics survey of secreted proteins involved in lignocellulose degradation in Basidiomycota highlighted differences between fungi having different lifestyles. The authors suggested that the lifestyle shapes the composition, but not the size, of fungal secretomes (Alfaro et al., 2014). In the present study, we surveyed the secretome composition of ECM fungi and compared it to that of saprotrophic and pathogenic species.

ECM Fungi Have a Reduced Repertoire of Secreted PCW-degrading CAZymes
CAZymes are important both for organic matter degradation and for colonizing the host in pathogenic/symbiotic interactions, by facilitating the breakdown of plant cell wall components (van den Brink and de Vries, 2011; Zerillo et al., 2013; Brouwer et al., 2014). ECM fungal genomes have lost genes encoding CAZymes and lytic polysaccharide mono-oxygenases, similar to what has been observed in brown rot fungi, and likely due to their intercellular interactions with plants (Eastwood et al., 2011; Floudas et al., 2012). Our data refine these results by showing that ECM fungi secrete fewer PCW-active CAZymes than white rot fungi, although we did not find any statistically significant differences when we compared ECM fungi with brown rot fungi or other saprotrophs (i.e., soil and litter decayers or yeast). The remaining plant cell wall degradative capacities are likely involved both in the presymbiotic phase, where ECM fungi are not yet in symbiosis with a host, and during the first steps of the colonization process, where the ECM fungus needs to penetrate the plant root cortex.

ECM and Saprotrophic Fungi Have Similar Protease and Lipase Repertoires
In most temperate and boreal forest soils, organic matter is predominant. The role of ECM fungi in soil organic matter degradation and/or modification is known and requires decomposition abilities. However, the consequence of soil organic matter modification by ECM fungi, in particular for carbon storage, is under debate (Lindahl and Tunlid, 2015). Nitrogen is mainly found in an organic form, such as proteins, and carbon in complex carbohydrates. Recent studies using the ECM fungus Paxillus involutus suggest that glucose (corresponding to carbon provided by the plant) controls the assimilation of organic nitrogen by this ECM fungus (Rineau et al., 2012; Shah et al., 2013). Mobilization of nitrogen in organic matter requires the action of secreted proteases (Geisseler et al., 2011). Another recent study proposes that endoproteases from the A01, M36, and S53 families in combination with exoproteases (M28, S28, and S9 families) are likely required to degrade soil proteins. Additionally, subtilisins (S08 family) are dominant in the secretomes of saprotrophic fungi (Hu and Leger, 2004). Most mycoparasitic and pathogenic fungi display an elevated number of subtilisin-like serine proteases (Muszewska et al., 2011) that play a key role in these interactions (Bryant et al., 2009). In contrast, ECM fungi display fewer subtilisins (S08 family) in comparison with white rot fungi, endophytes, ericoid fungi, and pathogenic fungi. This reduced set of subtilisins appears specific to the ECM fungi (and to a lesser extent to several brown rot fungi taxonomically related to ECM fungi). The reduced number of subtilisins might be a way for ECM fungi to avoid eliciting plant defense mechanisms (Figueiredo et al., 2014).
In plant pathogenic fungi, secreted lipases are involved in propagule adhesion and plant tissue penetration to promote colonization (Voigt et al., 2005; Chu et al., 2008). They may also be used to facilitate nutrient absorption from the host or be involved in the inhibition of immunity-related callose formation (Blümke et al., 2014). In Magnaporthe oryzae, a lipase-like protein upregulated during plant penetration and biotrophic development is likely involved in both appressorium function and manipulation of cell-to-cell communication through plasmodesmata (Oliveira-Garcia E; 28th Fungal Genetics Conference, Pacific Grove, CA, USA). However, in mutualistic interactions, no functional studies have described the role of secreted lipases. Because levels of secreted lipases are low in each fungal lifestyle, it is not surprising that we do not see a reduction in secreted lipases similar to what has been observed with secreted CAZymes.

TABLE 3 | List of clusters containing only small-secreted proteins from ectomycorrhizal fungal species (ECM), from both ectomycorrhizal and saprotrophic fungal species, and only from saprotrophic fungal species (white rot, brown rot, litter decayers). PFAM analyses were performed with MotifFinder with an e-value cut-off of 1 × 10^-8, based on the CD-HIT clustering analysis with the identity threshold set to 70%.

ECM Fungi Share SSPs with Saprotrophic Fungi, But Also Display Symbiosis-specific SSPs
SSPs have been extensively studied in recent years for their involvement in host-pathogen interactions as effectors (Giraldo and Valent, 2013). However, SSPs have been found in most fungal species regardless of their lifestyle. Interestingly, several SSPs are secreted by free-living mycelium and are thus not specific to symbiotic tissues (Vincent et al., 2012; Doré et al., 2015). This suggests a role of SSPs in the biology of the extramatrical mycelium, which is the one interacting with rhizospheric microbes. These secreted proteins are likely involved in a variety of processes, including differentiation of fungal structures such as fruiting bodies (e.g., hydrophobins), cell-to-cell communication (Murphy et al., 2012), competition between fungi (Trejo-Hernández et al., 2014) and fungal-host interactions (Chisholm et al., 2006; Plett and Martin, 2015). Our genome-wide survey reflects the versatility of SSPs, with both shared and lifestyle-specific proteins. Among the 17 SSP clusters only found in ECM fungi, we found many proteins identified as ceratoplatanins. Ceratoplatanin is a fungal elicitor of plant defenses and is therefore considered to be involved in pathogen-associated molecular patterns (de Oliveira et al., 2011; Baccelli et al., 2014). Finding ECM-specific ceratoplatanins in mutualistic fungi suggests that they contribute to microbial-associated molecular patterns (MAMPs) related to polysaccharide recognition. Additionally, a recent study showed that the ceratoplatanin ELP1 from the soil fungus Trichoderma atroviride forms highly ordered monolayers at a hydrophobic surface/liquid interface and hybrid ordered layers when added to hydrophobins (Bonazza et al., 2015). Several of the PFAM domains found in ECM-specific sequences, such as Ser-Thr/GPI-rich anchored proteins and the lysozyme-like GH 25, are also related to cell wall remodeling and/or organization.
Interestingly, these PFAM domains are enriched in the secretomes of rust fungi, suggesting a possible role in the biotrophic way of life. Ser-Thr-GPI anchored proteins may be involved in the development of fruiting bodies (Frey et al., 2015) as well as in signaling when they interact with MAPK (Shen et al., 2015). Six clusters do not have any known PFAM domains, suggesting new families of ECM-specific SSPs. Finally, three SSP clusters are specific to saprotrophic fungi; they include fungal hydrophobin (PF01185) and copper-dependent lytic polysaccharide monooxygenases GH 61/AA9 (PF03443) (Levasseur et al., 2013). Orchid mycorrhizal symbionts do not share SSP clusters with ECM fungi. However, they share three SSP clusters with white rot fungi and a single SSP cluster with both white and brown rot fungi. This suggests that orchid symbionts are more similar to saprotrophic fungi and is consistent with a recent comparative genomic study (Martino, personal communication). Overall, ECM fungi share a large set of SSPs with brown rot fungi, white rot fungi and litter decayers, in that order, supporting the view of a continuum between ECM fungi, rot fungi and litter-decayer fungi (Riley et al., 2014). The phylogenetic analysis suggests that at least some SSPs from ectomycorrhizal fungi have evolved from SSPs found in saprotrophic ancestors. Knowing that MiSSP7 from L. bicolor is an effector protein required for ECM symbiosis establishment (Plett et al., 2014), one could wonder whether this SSP evolved from saprotrophic fungi. Finally, we identified a number of SSPs specific to the ECM fungi, which makes them good candidates for further functional analyses.
Altogether, these results support the concept of a continuum from saprophytic to ECM fungi, where ECM fungi share common secreted proteins with their saprophytic cousins but also contain ectomycorrhiza-specific SSPs likely involved in the fine-tuning of the mutualistic interaction established with their host(s). The present findings confirm and extend the large-scale genome analysis showing that the emergence of ECM symbiosis has been associated with a loss of plant cell wall-degrading enzymes and a rapid turnover of symbiosis-related genes. Overall, SSPs represent a significant part of the fungal secretomes analyzed, especially among ECM fungi. The fact that ECM fungi are enriched in SSPs compared with the other lifestyles might reflect the conservation of SSPs from saprotrophic ancestors and the expansion of symbiosis-specific SSPs dedicated to the molecular cross-talk between partners, the accommodation of hyphae in planta, and the establishment and functioning of the symbiosis (Garcia et al., 2015). This stresses the need for functional analysis of the candidate effectors to allow for more accuracy in future computational studies (Sperschneider et al., 2015).

Bioinformatics Pipeline and Functional Annotation of Fungal Secretomes
Prediction of secreted proteins was performed using a custom bioinformatic pipeline (Figure 1) assessing the following combined sequence characteristics: (a) signal peptide and transmembrane domains, and (b) protein subcellular localization. For (a), proteins were retained if a signal peptide was detected with SignalP, with D-cutoff values set to "sensitive" (version 4.1; option eukaryotic; Petersen et al., 2011), and if TMHMM, using default parameters (version 2.0; Melén et al., 2003), found either no transmembrane helix or a single helix overlapping the signal peptide. For (b), proteins were considered as secreted if subcellular localization was assigned as secretory pathway using TargetP with the -N option to exclude plants (version 1.1; Emanuelsson et al., 2000) and as extracellular with WolfPsort using the option "fungi" (version 0.2; Horton et al., 2007). To filter out proteins that permanently reside in the endoplasmic reticulum (ER) lumen, we scanned the proteins for the KDEL motif (Lys-Asp-Glu-Leu) in the C-terminal region (prosite accession "PS00014") with PS-SCAN (version 1.79). Annotation of the secreted proteins was completed by a BLASTP query comparing protein sequences against different resources and specialized databases (e-value cut-off of 10^-5, keeping the best hit) using the following databases: (1) CAZyme (http://www.cazy.org/), (2) MEROPS (http://merops.sanger.ac.uk/), and (3) Lipase Engineering Database (http://www.led.uni-stuttgart.de/), and the following general protein databases: (1) Uniprot Swissprot and (2) JGI Mycocosm. We also performed domain searches with the HMMER package (version 3.0, default parameters; Finn et al., 2011) for PFAM domains. To predict whether the secreted proteins targeted nuclei, we used PredictNLS (default parameters, version 1.0.20; https://rostlab.org/owiki/index.php/PredictNLS) to determine the presence of a nuclear localization signal. We also estimated the percentage of cysteine and the KR-rich regions of the secreted proteins.
We considered secretome proteins smaller than 300 amino acids as SSPs. Data mining and comparison and figure plotting have been performed using the R software (R Core Team, 2014, http://www.R-project.org/) and an in-house Python script.
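The decision rule described in this section can be summarized in a few lines of code. The following is only a minimal sketch of that logic, assuming that the output of each tool has already been parsed into one record per protein; the `Prediction` class, its field names and the helper functions are hypothetical and do not correspond to the in-house scripts or to any tool's output format.

```python
from dataclasses import dataclass

SSP_MAX_LENGTH = 300  # SSPs: secreted proteins smaller than 300 amino acids


@dataclass
class Prediction:
    """Parsed per-protein tool results (all field names are hypothetical)."""
    protein_id: str
    length: int                    # protein length in amino acids
    has_signal_peptide: bool       # SignalP call
    tm_helices: int                # number of helices reported by TMHMM
    tm_overlaps_signal: bool       # a single helix overlaps the signal peptide
    targetp_secretory: bool        # TargetP assigns the secretory pathway
    wolfpsort_extracellular: bool  # WoLF PSORT assigns extracellular
    has_kdel: bool                 # C-terminal ER-retention motif found


def is_secreted(p):
    """Apply the combined criteria: signal peptide, no retained TM helix,
    secretory/extracellular localization, and no KDEL motif."""
    tm_ok = p.tm_helices == 0 or (p.tm_helices == 1 and p.tm_overlaps_signal)
    return (p.has_signal_peptide and tm_ok and p.targetp_secretory
            and p.wolfpsort_extracellular and not p.has_kdel)


def is_ssp(p):
    """A small-secreted protein is a predicted secreted protein < 300 aa."""
    return is_secreted(p) and p.length < SSP_MAX_LENGTH


# Hypothetical records, for illustration only.
candidates = [
    Prediction("prot_A", 120, True, 0, False, True, True, False),
    Prediction("prot_B", 450, True, 1, True, True, True, False),
    Prediction("prot_C", 200, True, 0, False, True, True, True),
]
secretome = [p.protein_id for p in candidates if is_secreted(p)]
ssps = [p.protein_id for p in candidates if is_ssp(p)]
```

Keeping the secretion test and the size cut-off in separate functions makes it easy to change the 300 amino acid threshold without touching the rest of the filter.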

Clustering Analysis and Venn Diagram
The clustering analysis was performed based on sequence identity with CD-HIT software (Huang et al., 2010). The identity threshold was set to 70% and only clusters involving at least three different fungal species were taken into account. Protein sequences were retrieved with an in-house Python script. Clustering analysis data were used to generate a four-set Venn diagram.
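As an illustration of the cluster selection step, the sketch below reads CD-HIT cluster output (the `.clstr` text format) and keeps only clusters whose members come from at least three species. It is not the in-house script: the sequence identifiers, the assumption that the species code is the part of the identifier before the first `|`, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# In a .clstr file, member lines look like: "0<TAB>120aa, >seq_id... *"
MEMBER_RE = re.compile(r">(.+?)\.\.\.")


def parse_clstr(lines):
    """Map each cluster name to the list of sequence identifiers it contains."""
    clusters = defaultdict(list)
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]          # e.g. "Cluster 0"
        else:
            match = MEMBER_RE.search(line)
            if match and current is not None:
                clusters[current].append(match.group(1))
    return dict(clusters)


def species_of(seq_id):
    """Assumed naming convention: 'species|protein' (hypothetical)."""
    return seq_id.split("|", 1)[0]


def multi_species_clusters(clusters, min_species=3):
    """Keep clusters containing sequences from at least `min_species` species."""
    return {name: members for name, members in clusters.items()
            if len({species_of(s) for s in members}) >= min_species}


# Hypothetical cluster file content, for illustration only.
example = """>Cluster 0
0\t120aa, >Lacbi1|111... *
1\t118aa, >Paxin1|222... at 81.36%
2\t121aa, >Serla1|333... at 75.00%
>Cluster 1
0\t95aa, >Lacbi1|444... *
1\t95aa, >Lacbi1|555... at 99.00%
""".splitlines()

kept = multi_species_clusters(parse_clstr(example))
```

With this example input, only the first cluster is kept, since the second one contains two sequences from a single species.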
Pairwise t-tests and pairwise Wilcoxon tests, both with Holm correction, were performed using R (R Core Team, 2014, http://www.R-project.org/) with the p-value cut-off fixed at 0.01.
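For readers unfamiliar with the Holm correction, the adjustment applied to a set of pairwise p-values can be sketched in Python as below. The analysis itself was run in R; this stand-alone function is only an illustration of the step-down procedure, and its name is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted
```

A comparison is then called significant when its adjusted p-value falls below the 0.01 cut-off.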

AUTHOR CONTRIBUTIONS
FM and CV designed the study. CP performed the clustering and secretome analyses. EM designed the bioinformatics pipeline to identify the secretomes and performed the PFAM enrichment analysis. CP and CV wrote the manuscript. FM and CV edited the manuscript. All authors commented on the manuscript before submission and read and approved the final version.

ACKNOWLEDGMENTS
This material is based on work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, supported under contract no. DE-AC02-05CH11231. This work was supported by the French National Research Agency through the Laboratory of Excellence ARBRE (grant no. ANR-11-LBX-002-01), the Plant-Microbe Interactions Project, Genomic Science Program, of the U.S. Department of Energy, Office of Science, Biological and Environmental Research (grant no. DE-AC05-00OR22725), the Institut National de la Recherche Agronomique, the Région de Lorraine, the European Fund for Regional Development (funding for the Functional Genomics Facilities at Institut National de la Recherche Agronomique-Nancy), and the Université de Lorraine (Ph.D. scholarship to CP). We thank Dr. Igor Grigoriev (JGI) and the Mycorrhizal Genomics Initiative consortium principal investigators for access to the genome sequences before publication. We also thank Dr. Sylvain Raffaelle (INRA Toulouse, France) for helpful input regarding the enrichment analysis. We thank Joe Spatafora for helpful comments on phylogenetic analyses and the two reviewers for their valuable input on the manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb.2015.01278
Supplementary Data sheet 1 | Prevalence of secreted Plant Cell Wall Degradative CAZymes.
Supplementary Data sheet 2 | Raw output from CD-HIT clustering software.
Supplementary Data sheet 3 | Assessment of genome completeness using CEGMA score.
Supplementary Image 1 | Correlation between Proteome size, Secretome size, and number of Small-secreted proteins within different lifestyles. Proteome size and Secretome size are plotted with linear regression and adjusted R-squared (A) as well as Secretome size and number of SSPs (B). Confidence interval (95%) is shown in gray.
Supplementary Image 2 | Composition in secreted PCW-degrading CAZymes. The prevalence of 21 families of CAZymes active on plant material has been compared between 6 different fungal lifestyles including saprotrophic (white rot, brown rot, and other saprotrophs) and biotrophic (ectomycorrhizal fungi, orchid symbiont, and pathogen) fungi. Boxplots for each category and every lifestyle show the median (bold horizontal line), first and third quartiles (lower and upper limits of the box), and maximum and minimum values (straight vertical lines). Different letters indicate significant differences (p < 0.01) using pairwise comparisons with the Wilcoxon rank sum test and Holm correction of p-values.
Supplementary Image 3 | Prevalence of secreted exoproteases and endoproteases among lifestyles. Proteolytic ability is represented as a boxplot for each lifestyle, showing the median (bold horizontal line), first and third quartiles (lower and upper limits of the box), and maximum and minimum values (straight vertical lines).
Supplementary Image 4 | Prevalence of secreted GX and GGGW class lipases among lifestyles. Lipolytic ability is represented as a boxplot for each lifestyle, showing the median (bold horizontal line), first and third quartiles (lower and upper limits of the box), and maximum and minimum values (straight vertical lines).