Insight into Potential Probiotic Markers Predicted in Lactobacillus pentosus MP-10 Genome Sequence

Lactobacillus pentosus MP-10 is a potential probiotic lactic acid bacterium originally isolated from naturally fermented Aloreña green table olives. The entire genome sequence was annotated to analyze in silico the molecular mechanisms involved in the adaptation of L. pentosus MP-10 to the human gastrointestinal tract (GIT), such as carbohydrate metabolism (related to prebiotic utilization) and the proteins involved in bacteria–host interactions. We predicted an arsenal of genes coding for carbohydrate-modifying enzymes that act on oligo- and polysaccharides, such as glycoside hydrolases, glycosyl transferases, and isomerases, and other enzymes involved in complex carbohydrate metabolism, especially that of starch, raffinose, and levan. These enzymes represent key indicators of the bacterium's adaptation to the GIT environment, since they are involved in the metabolism and assimilation of complex carbohydrates not digested by human enzymes. We also detected key probiotic ligands (surface proteins, excreted or secreted proteins) involved in adhesion to host components such as mucus, epithelial cells or extracellular matrix, and plasma components; in addition, moonlighting (multifunctional) proteins were found that could be involved in adhesion to epithelial cells and/or extracellular matrix proteins and could also affect host immunomodulation. In silico analysis of the genome sequence of L. pentosus MP-10 is an important initial step in screening for genes encoding proteins that may provide probiotic features, and thus provides new routes for screening and studying this potentially probiotic bacterium.

INTRODUCTION
The Lactobacillus genus belongs to the lactic acid bacteria (LAB) group and currently comprises 222 species described in the List of Prokaryotic Names with Standing in Nomenclature (LPSN; February 2017). In this context, Lactobacillus represents a highly heterogeneous taxonomic group encompassing species with various physiological, biochemical and genetic characteristics that reflect their capacity to colonize many ecological niches and to respond to several environmental stresses (De Angelis and Gobbetti, 2004; Pot et al., 2014). Lactobacilli have been isolated from different sources [e.g., plants, foods, and the mucosal surfaces (i.e., the oral, gastrointestinal, and reproductive tracts) of mammalian hosts], and they have been widely used as starter cultures in food fermentations, owing to their history of safe use, and also as protective cultures because of their production of antimicrobial substances (e.g., bacteriocins, peroxide, and diacetyl, among others) (Leroy and de Vuyst, 1999; Heller, 2001; Hansen, 2002; Holzapfel, 2002; Giraffa et al., 2010; Franz et al., 2011; Garrigues et al., 2013). Thus, the Food and Drug Administration and the European Food Safety Authority certify some Lactobacillus species as Generally Recognized As Safe (GRAS) or as having a Qualified Presumption of Safety (QPS), respectively (Bernardeau et al., 2008). Furthermore, many Lactobacillus species represent main components of the global probiotic market: L. acidophilus, L. bulgaricus, L. plantarum, L. brevis, L. reuteri, L. johnsonii, L. casei, L. rhamnosus, and L. salivarius. Specifically, some L. pentosus strains have exerted probiotic effects such as the acceleration of IgA secretion in saliva and the enhancement of IgA production in the small intestine (Kotani et al., 2010; Izumo et al., 2011), and such strains have aroused great interest because of their plant origin (Pérez Montoro et al., 2016).
Generic mechanisms underlying probiotic effects can be linked to taxonomic groups (genus or species); however, specific mechanisms tend to be strain-specific (Hill et al., 2014). As such, whole genome sequencing (WGS) remains the best way to understand the genetic and metabolic potential of each species or strain and to reveal the plasticity of their phylogenetic relationships, metabolic pathways, adaptation, fitness and safety (Jolley and Maiden, 2010; Maiden et al., 2013).
Lactobacillus pentosus MP-10 is a potential probiotic LAB isolated from naturally fermented Aloreña green table olives (Abriouel et al., 2011) and has exhibited several probiotic capacities when tested in vitro, such as good growth and survival under simulated gastrointestinal conditions, the ability to auto-aggregate and to co-aggregate with pathogenic bacteria, adherence to intestinal and vaginal cell lines, antagonistic activity against pathogens, and fermentation of several prebiotics and lactose (Pérez Montoro et al., 2016). However, the putative health-promoting capacities of this strain may depend on genetic characteristics and the interactions within its ecological niche (O'Sullivan et al., 2009); for this reason, the whole-genome sequence obtained by Abriouel et al. (2016) and the subsequent annotation will improve our knowledge of the functionality of this strain, its adaptation to the human gastrointestinal tract (GIT) and its interaction with the host. As such, we carried out an in silico analysis of the carbohydrate metabolism of L. pentosus MP-10 and of the factors that affect its interaction with the host, with the aim of identifying genes as potential probiotic markers.
To highlight the molecular mechanisms involved in the adaptation of L. pentosus MP-10 to the human GIT, we focused the in silico analysis on carbohydrate metabolism related to prebiotic utilization and the proteins involved in host interactions, since the adaptation of probiotics is mainly represented by the enrichment of mucus-binding proteins and enzymes involved in breakdown of complex carbohydrates (Ventura et al., 2012).
In silico analysis has some limitations related to prediction accuracy, which in turn depends on the algorithm used and on the phenotype data available from experiments (Ng and Henikoff, 2006); to reduce the risk of incorrect predictions, all the annotations made in the present study were curated manually.

Carbohydrate Metabolism Related with Prebiotic Utilization
Nearly 8% of the identified genes in the L. pentosus MP-10 genome are involved in carbohydrate metabolism (279 of 3558 genes, about 7.8%), which is similar to the most-studied bifidobacterial genomes and 30% higher than in other GIT-resident bacteria (Ventura et al., 2009). The abundance of carbohydrate metabolism genes in L. pentosus MP-10 is important with respect to its possible adaptation to the microhabitats of the gastrointestinal environment and its interaction with the human host, and thus may enhance its survival, competitiveness and persistence.
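As a quick check of the proportion reported above, the following minimal sketch recomputes the share of carbohydrate metabolism genes from the two counts given in the text. The variable names are illustrative only.

```python
# Gene counts as reported in the text.
carbohydrate_genes = 279
total_genes = 3558

# Share of predicted genes assigned to carbohydrate metabolism.
share = carbohydrate_genes / total_genes
print(f"{share:.1%} of predicted genes")
```

The result comes out at roughly 7.8%, that is, close to 8% of the predicted genes.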
Lactobacillus pentosus MP-10 is a facultatively heterofermentative LAB, and its genome possesses genes for both the phosphoketolase (PK) and Embden-Meyerhof (EMP) pathways. Thus, it can potentially ferment carbohydrates mainly via the EMP, utilizing glucose and converting it to pyruvate and then to lactate (glycolysis). However, in the absence of six-carbon sugars (e.g., glucose), L. pentosus MP-10 would possibly ferment five-carbon carbohydrates such as xylose, xylulose, arabinose, or ribose via the PK pathway, as reported for other L. pentosus strains (Bustos et al., 2005). Analysis by BlastKOALA indicated that the EMP (complete pathway), the pentose phosphate pathway (PP) (both oxidative and non-oxidative branches complete), and the galactose degradation pathway (complete Leloir pathway) form the central core of carbohydrate metabolism in L. pentosus MP-10; however, the Entner-Doudoroff pathway (ED) appears incomplete.
Lactobacillus pentosus MP-10 has been shown in vitro to ferment a variety of carbohydrates such as glucose, galactose, fructose, lactose, saccharose, and lactulose (Pérez Montoro et al., 2016). In silico analysis of the annotated genome sequence of L. pentosus MP-10 also predicted its capacity to ferment several simple carbohydrates, including five-carbon and six-carbon sugars, such as mannose, inositol, ribose, arabinose, rhamnose, maltose, xylose, xylulose, and trehalose; furthermore, we also predicted its ability to use complex carbohydrates such as cellulose, xylan (hemicellulose), starch, raffinose, chitin, and levan (Figure 2). These carbohydrates can either be dietary compounds or carbon sources derived from the metabolism of the gastrointestinal microbiota (Korakli et al., 2002). Ultimately, 15 carbohydrate utilization pathways were predicted in the L. pentosus MP-10 genome sequence: glycolysis/gluconeogenesis, citrate cycle, PP pathway, pentose and glucuronate interconversions, fructose and mannose metabolism, galactose metabolism, ascorbate and aldarate metabolism, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, pyruvate metabolism, glyoxylate and dicarboxylate metabolism, propanoate metabolism, butanoate metabolism, C5-branched dibasic acid metabolism, and inositol phosphate metabolism. As such, the wide repertoire of enzymes involved in the fermentation of various carbohydrate substrates is reflected in the relatively large genome size, and is also corroborated by the large number of genes (77) for the phosphoenolpyruvate (PEP)-dependent sugar phosphotransferase system (PTS) and by the presence of specific genes or gene clusters involved in carbohydrate utilization by L. pentosus MP-10.
The possible adaptation and enrichment of L. pentosus MP-10 in the GIT could be predicted from the presence of genes encoding various carbohydrate-modifying enzymes able to modify oligo- and polysaccharides. These enzymes are produced by intestinal microbial communities and are required for the metabolism of plant- and host-derived carbohydrates (e.g., cellulose, xylan, and pectin), since mammals have evolved only limited abilities to hydrolyze complex polysaccharides for digestion (Cantarel et al., 2012). Many of these enzymes were predicted in the L. pentosus MP-10 genome and belong to several CAZy (Carbohydrate-Active Enzymes) families (Table 1): glycoside hydrolases or glycosylases (15 genes); hexosyl- (15 genes), pentosyl- (13 genes) and phospho-transferases (13 genes); and isomerases (24 genes).
Furthermore, sugar ABC transporters, carbohydrate esterases, glycosyl transferases, polysaccharide lyases, permeases, and PEP-PTS components required for the uptake and metabolism of plant- and host-derived carbohydrates were predicted in the L. pentosus MP-10 genome, as similarly reported for the probiotic Bifidobacterium (Kim et al., 2009). This arsenal of genes coding for carbohydrate-modifying enzymes predicted in the L. pentosus MP-10 genome could represent a key indicator of this bacterium's adaptation to the GIT environment, as these genes are involved in the metabolism and transport of carbohydrates that are not digestible by human enzymes. Glycosyl (hexosyl-, pentosyl-, and phospho-) transferases are involved in the biosynthesis of disaccharides, oligosaccharides and polysaccharides by transferring sugar moieties from an activated donor to a specific substrate (Lairson et al., 2008); the resulting glycoconjugates (as part of the glycome) play an important role in the establishment of environment- and host-specific interactions (Kay et al., 2010). Glycoside hydrolases are able to hydrolyze the glycosidic bond between two or more carbohydrates, and also between carbohydrate and non-carbohydrate moieties. The most common predicted genes found in L. pentosus MP-10 code for oligo-1,6-glucosidase, beta-galactosidase, alpha-L-rhamnosidase, and 6-phospho-beta-glucosidase, among others (from several GH families); these enzymes play a key role not only in carbohydrate hydrolysis but also, acting as retaining enzymes, in the synthesis of oligosaccharides that may be selectively used as prebiotics by L. pentosus MP-10 and other gastrointestinal probiotic bacteria (Table 1).
Regarding isomerases, we observed several carbohydrate isomerases involved in the glycolytic pathway; however, the presence of several copies of phosphoglycerate mutase may indicate that the gene products perform additional functions as moonlighting proteins (Candela et al., 2007).
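The enzyme classes discussed in this section correspond to prefixes in the Enzyme Commission (EC) numbering scheme. The sketch below groups a handful of enzymes by EC prefix to illustrate how such a tally can be produced from an annotation. It is only an illustration under stated assumptions: the EC numbers come from standard enzyme nomenclature rather than from Table 1, the `annotations` dictionary and the `classify_ec` helper are hypothetical, and the source does not state that the counts were derived in this way.

```python
from collections import Counter

# EC prefixes for the enzyme classes named in the text (assumed mapping
# from standard EC nomenclature, not taken from the study's tables).
EC_CLASSES = [
    (("2", "4", "1"), "hexosyltransferase"),
    (("2", "4", "2"), "pentosyltransferase"),
    (("2", "7"), "phosphotransferase"),
    (("3", "2"), "glycosylase"),
    (("5",), "isomerase"),
]


def classify_ec(ec_number):
    """Return the enzyme class whose EC prefix matches, else 'other'."""
    parts = tuple(ec_number.split("."))
    for prefix, label in EC_CLASSES:
        if parts[:len(prefix)] == prefix:
            return label
    return "other"


# Hypothetical mini-annotation: enzyme name -> EC number.
annotations = {
    "oligo-1,6-glucosidase": "3.2.1.10",
    "beta-galactosidase": "3.2.1.23",
    "alpha-L-rhamnosidase": "3.2.1.40",
    "6-phospho-beta-glucosidase": "3.2.1.86",
    "glycogen synthase": "2.4.1.21",
    "glucose-6-phosphate isomerase": "5.3.1.9",
}

counts = Counter(classify_ec(ec) for ec in annotations.values())
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

Matching on split EC fields rather than on raw string prefixes avoids confusing, for example, subclass 2.4.1 with a hypothetical 2.4.10.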

Complex Carbohydrate Metabolism
Lactobacillus pentosus MP-10 has the capacity to metabolize complex carbohydrates (e.g., starch, cellulose, galactan, xylan, pullulan, pectins, and gums). For example, glycogen metabolism plays an important role in the survival and fitness of LAB in their ecological niche by contributing to cellular processes such as carbohydrate metabolism, energy production, stress response, and cell-cell communication (Eydallin et al., 2007, 2010). The glycogen metabolism operon (glg) predicted in L. pentosus MP-10 spans a 9608-base chromosomal region and consists of the glgBCDAP-apu genes (XX999_00114 to XX999_00119), which are co-transcribed as a polycistronic mRNA (Table 2). The organization of the core genes (glgBCDAP) is identical to that of many bacteria, with the exception of two additional glycogen synthase genes exclusive to L. pentosus MP-10 (XX999_01233 and XX999_02081), which are homologous to genes of Bacillus subtilis 168 and Mycobacterium tuberculosis CDC 1551, respectively (Table 2). Furthermore, the genes amyB and pgcA, coding for alpha-amylase 2 and phosphoglucomutase, respectively, are located at a distance from the glg operon (Table 2 and Figure 2B). According to Goh and Klaenhammer (2014), the glycogen gene cluster organization might differ depending on the bacterial species and origin; in this study, the glycogen gene cluster is composed of the glgBCDAP-apu-amyB-pgcA genes and the other two glycogen synthase genes (XX999_01233 and XX999_02081). Glycogen metabolism is predicted as an additional trait in L. pentosus MP-10, as it may contribute to probiotic activities and to the retention of this bacterium in highly competitive and dynamic niches such as the gastrointestinal environment, as reported for the probiotic L. acidophilus (Goh and Klaenhammer, 2013). The presence of more than one glycogen synthase gene in L. pentosus MP-10 suggests the capacity of this bacterium to store carbohydrates in the form of glycogen.
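The locus-tag range quoted above for the glg operon can be expanded programmatically, which is convenient when pulling the corresponding records out of an annotation file. The following is a minimal sketch; the helper name `expand_locus_tags` is hypothetical, and pairing the tags with gene names in operon order is an assumption that should be checked against Table 2.

```python
def expand_locus_tags(first, last):
    """List every locus tag from first to last (inclusive), same prefix."""
    prefix, start = first.rsplit("_", 1)
    last_prefix, end = last.rsplit("_", 1)
    if prefix != last_prefix:
        raise ValueError("locus tags must share the same prefix")
    width = len(start)  # keep the zero-padding of the original tags
    return [f"{prefix}_{n:0{width}d}" for n in range(int(start), int(end) + 1)]


glg_tags = expand_locus_tags("XX999_00114", "XX999_00119")
glg_genes = ["glgB", "glgC", "glgD", "glgA", "glgP", "apu"]

# Assumed pairing: tags listed in the same order as the operon name.
for tag, gene in zip(glg_tags, glg_genes):
    print(tag, gene)
```

The range yields six tags, matching the six genes named in glgBCDAP-apu.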
Regarding other enzymes involved in complex carbohydrate degradation, we found genes coding for a protein similar to the chitooligosaccharide deacetylase of E. coli K12 and for a beta-hexosaminidase involved in the chitin degradation pathway as part of glycan degradation. Further, several genes coding for enzymes involved in the degradation of plant structural polysaccharides such as cellulose, β-glucan, and xylan were predicted in the L. pentosus MP-10 genome (Table 3). In this context, a gene coding for a protein similar to the cellulase/esterase CelE from Clostridium thermocellum ATCC 27405, a multifunctional enzyme involved in the degradation of plant cell wall polysaccharides, was identified in the L. pentosus MP-10 genome; such activity is necessary for cellulose and xylan digestion in both humans and animals (Table 3). Moreover, endo-1,4-beta-xylanase, acetylxylan esterase (three genes) and polysaccharide deacetylase, all involved in the xylan catabolic pathway, were predicted in the L. pentosus MP-10 genome sequence. An alpha-galactosidase coding gene was also detected in the L. pentosus MP-10 genome sequence and is involved in raffinose degradation (Table 3), which was previously shown in vitro by Pérez . Furthermore, L. pentosus MP-10 also had genes coding for cellulose synthase (two genes exclusive to L. pentosus MP-10 and two other genes) involved in cellulose synthesis (Table 3), which could lead to the accumulation of cellulose on the cell wall surface as an extracellular matrix for cell adhesion and biofilm formation to protect the bacteria. Cellulose production has been reported in lactic acid bacteria (Adetunji and Adegoke, 2007); however, no reports were found of cellulase production, although some Lactobacillus sp. genomes exhibit cellulase genes, such as L. delbrueckii subsp. bulgaricus CNCM I-1519 (UniProtKB-G6F519) and L. plantarum (UniProtKB-A0A1C9HK74). For probiotic bacteria such as E. coli Nissle 1917, cellulose production is required for adhesion of the bacteria to the gastrointestinal epithelial cell line HT-29 and to the mouse epithelium in vivo, and for enhanced cytokine production (Monteiro et al., 2009). Thus, the role of cellulose production in L. pentosus MP-10 must be investigated in depth.
Overall, the repertoire of enzyme-coding genes identified in the L. pentosus MP-10 genome highlights the attractiveness of this bacterium as a potential probiotic for humans and animals.

Molecular Mechanisms Involved in the Interaction with the Host
Probiotic lactobacilli can mimic the mechanisms used by pathogens in the colonization process; they can express cell surface proteins, such as key probiotic ligands, that interact with host receptors and induce signaling pathways in the host, resulting in several probiotic effects (Voltan et al., 2008). The identification and characterization of these proteins, which are often strain-specific and are involved in molecular communication or interaction with the host, are necessary to evaluate a priori the probiotic potential of Lactobacillus sp. candidates. Here, the possible interaction between L. pentosus MP-10 and the intestinal host cells, the target of most interactions with probiotics (Lebeer et al., 2010), may be bioinformatically predicted from the genome sequence. For example, several extracellular proteins (reviewed by Sánchez et al., 2008) were predicted in L. pentosus MP-10 to be involved in mucus adhesion: MucBP domain protein (encoded by two genes identified in this study), lipoprotein signal peptidase (lspA gene) and moonlighting proteins such as glutamine-binding periplasmic protein (glnH genes) and elongation factor Tu (tuf gene) (Table 4). The high genetic heterogeneity of MucBP proteins among Lactobacillus species (and strains) was reported by Mackenzie et al. (2010) for MUB and MUB-like proteins in L. reuteri. MucBP domain proteins are bacterial peptidoglycan-bound proteins, which are ligands or effector molecules contributing to specific properties such as adherence to the host, auto-aggregation and/or co-aggregation with pathogenic bacteria (Pérez Montoro et al., 2016), as reported by Mackenzie et al. (2010) for MUB in L. reuteri. However, this should be further investigated for L. pentosus MP-10 under different conditions. Adhesion to mucus has been attributed to other molecules such as the Lactobacillus surface protein A (LspA), reported as a mucus binding protein in L. salivarius UCC118 (van Pijkeren et al., 2006), which was also found in L. pentosus MP-10 (Table 4). Mucus binding proteins in L. pentosus MP-10 may have a dual role: (1) they may be involved in the adhesion of this bacterium to the host cells, thus reinforcing the protection of the mucosal barrier and the competitive exclusion of pathogens, and (2) they could also be implicated in the induction of mucin secretion by the host, as reported for other lactobacilli (Mack et al., 2003). These findings are corroborated by the fact that L. pentosus MP-10 was able to adhere to Caco-2 and HeLa 229 cell lines and also to co-aggregate with different pathogenic bacteria (Pérez Montoro et al., 2016).
Other proteins predicted to be involved in adhesion to epithelial cells or extracellular matrix include poly-beta-1,6-N-acetyl-D-glucosamine synthase, collagen binding protein, manganese ABC transporter substrate-binding lipoprotein precursor and moonlighting proteins such as elongation factor Tu, glyceraldehyde-3-phosphate dehydrogenase, 10 and 60 kDa chaperonins, enolase, 2 glutamine synthetase, and glucose-6-phosphate isomerase (Table 4). The poly-beta-1,6-N-acetyl-D-glucosamine synthase encoded by L. pentosus MP-10 was similar to that of E. coli K12 (33.89% identity), in which poly-beta-1,6-N-acetyl-D-glucosamine has been reported to be a surface polysaccharide involved in biofilm formation (Matthysse et al., 2008). However, the role of this protein in lactobacilli has not been determined. Furthermore, we predicted the presence of a collagen-binding protein specific to L. pentosus MP-10, which could be involved in its adhesion to epithelial cells or extracellular matrix proteins, as shown for other lactobacilli such as L. reuteri NCIB 11951 (Roos et al., 1996) and L. fermentum RC-14 (Heinemann et al., 2000). Thus, this protein could be of vital importance for effective colonization and also for competitive displacement of gut pathogens (Yadav et al., 2013).
On the other hand, the manganese ABC transporter substrate-binding lipoprotein precursor predicted in L. pentosus MP-10, similar to that of Streptococcus pneumoniae ATCC BAA-334 (51.96% identity), has been described as an important factor in pathogenesis and infection, since it acts as an adhesin involved in adherence to the extracellular matrix (Dintilhac et al., 1997). Furthermore, the manganese ABC transporter substrate-binding lipoprotein precursor has also been detected in different Lactobacillus sp., such as L. plantarum, L. rhamnosus, and L. delbrueckii among others, where it is involved in cell adhesion (UniProtKB).
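The identity values quoted above (33.89% and 51.96%) are percentages of matching positions in a pairwise protein alignment. The sketch below shows one common way of computing such a value, dividing identical columns by the total alignment length with gap columns counted in the denominator. This is an assumption about the convention: the source does not state which tool or definition produced its figures, and the two aligned sequences are invented toy strings rather than real proteins.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical residues.

    Both inputs must be the two rows of one pairwise alignment, so they
    have equal length and use '-' for gaps. Gap columns count toward the
    alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical sequences): 7 identical columns out of 9.
query = "MKT-AYIAK"
subject = "MKTQAYLAK"
print(f"{percent_identity(query, subject):.2f}% identity")
```

Other conventions exist (for example, dividing by the length of the shorter sequence), so identity values are only comparable when computed the same way.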
Moonlighting (multifunctional) proteins such as elongation factor Tu and the chaperonin GroEL have been implicated in adhesion to epithelial cells and/or extracellular matrix proteins and also in host immunomodulation in L. johnsonii NCC 533 (Granato et al., 2004; Bergonzelli et al., 2006; Sánchez et al., 2008), while α-enolase has been implicated in adhesion to epithelial cells and/or extracellular matrix proteins and also to plasma components in L. crispatus ST1 (Antikainen et al., 2007). Glyceraldehyde-3-phosphate dehydrogenase and phosphoglycerate mutase have been implicated in adhesion to plasma components in L. crispatus ST2 (Antikainen et al., 2007; Candela et al., 2007). Furthermore, Kainulainen et al. (2012) showed that glutamine synthetase and glucose-6-phosphate isomerase are also involved in adhesion to epithelial cells. However, the role of these moonlighting proteins in L. pentosus MP-10 has not yet been determined, and further mutational or proteomic studies are required for this purpose.

CONCLUSION
Lactobacillus pentosus MP-10 harbors in its genome several genes putatively involved in its adaptation to the human GIT, particularly those involved in carbohydrate metabolism related to prebiotic utilization and those encoding proteins involved in the interaction with host tissues. Enzymes involved in carbohydrate modification and complex-carbohydrate metabolism are highly represented in the L. pentosus MP-10 genome, which may enhance its survival, competitiveness, and persistence in a competitive GIT niche. Furthermore, we found genes encoding mucus-binding proteins involved in adhesion to mucus, epithelial cells or extracellular matrix, and plasma components, as well as moonlighting (multifunctional) proteins predicted to be involved in adhesion to epithelial cells and/or extracellular matrix proteins and in host immunomodulation. In conclusion, in silico analysis of the L. pentosus MP-10 genome sequence highlights the attractiveness of this bacterium as a potential probiotic for human and animal hosts, and offers opportunities for further investigation of novel routes for screening and studying these bacteria.

MATERIALS AND METHODS
Genomic DNA Sequence of L. pentosus MP-10
The complete genome sequence of L. pentosus MP-10, obtained by using PacBio RS II technology and deposited at the EMBL Nucleotide Sequence Database under accession numbers FLYG01000001 to FLYG01000006, was annotated as described by Abriouel et al. (in press). Briefly, the assembled genome sequences were annotated using the Prokka annotation pipeline, version 1.11 (Seemann, 2014), which predicted tRNA, rRNA, and mRNA genes and signal peptides in the sequences using Aragorn, RNAmmer, Prodigal, and SignalP, respectively (Laslett and Canback, 2004; Lagesen et al., 2007; Hyatt et al., 2010).
In Silico Analysis of Carbohydrate Metabolism in L. pentosus MP-10
The annotated genome sequence was used to detect the putative genes involved in carbohydrate metabolism, their products, and the associated GO terms. Furthermore, the carbohydrate metabolic pathways were reconstructed by using BlastKOALA (last updated March 4, 2016), part of the KEGG (Kyoto Encyclopedia of Genes and Genomes) tools for annotating genomes against the pathway database; here, we used the annotated genes predicted in the L. pentosus MP-10 genome as the input query.
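A pathway reconstruction of this kind starts from a table that assigns KEGG Orthology (K) numbers to the query genes. The following minimal sketch parses such a table and reports how many genes received an assignment. It assumes a two-column, tab-separated layout (gene identifier, then K number, with the second column absent for unassigned genes); that layout, the helper name `parse_ko_assignments`, and the gene identifiers are assumptions made for illustration and are not data from this study.

```python
def parse_ko_assignments(text):
    """Map each gene ID to its K number, or None when unassigned."""
    assignments = {}
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if not fields[0]:
            continue  # skip blank lines
        gene_id = fields[0]
        ko = fields[1] if len(fields) > 1 and fields[1] else None
        assignments[gene_id] = ko
    return assignments


# Hypothetical example table. The K numbers are real KEGG entries for
# glycolysis enzymes (K00845 glucokinase, K01810 glucose-6-phosphate
# isomerase, K00850 6-phosphofructokinase); the gene IDs are invented.
example = (
    "gene_0001\tK00845\n"
    "gene_0002\n"
    "gene_0003\tK01810\n"
    "gene_0004\tK00850\n"
)

assignments = parse_ko_assignments(example)
assigned = {g: k for g, k in assignments.items() if k is not None}
print(f"{len(assigned)} of {len(assignments)} genes assigned a K number")
```

The set of assigned K numbers is what a pathway-mapping step would then compare against the enzymes required by each pathway to judge whether it is complete.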

AUTHOR CONTRIBUTIONS
HA, NB, CK, and AG drafted the manuscript. HA, NB, BPM, CC-S, APP, NCG, SC-G, and ME-M analyzed the data. All authors discussed the results, commented on the manuscript, and approved the final version.