Whole-Genome Sequencing, Phylogenetic and Genomic Analysis of Lactiplantibacillus pentosus L33, a Potential Probiotic Strain Isolated From Fermented Sausages

Lactobacillus is a diverse genus that includes species of industrial and biomedical interest. Lactiplantibacillus pentosus, formerly known as Lactobacillus pentosus, is a recently reclassified species that contains strains isolated from diverse environmental niches, ranging from fermented products to mammalian gut microbiota. Importantly, several L. pentosus strains present health-promoting properties, such as immunomodulatory and antiproliferative activities, and are regarded as potential probiotic strains. In this study, we present the draft genome sequence of the potential probiotic strain L. pentosus L33, originally isolated from fermented sausages. Comprehensive bioinformatic analysis and whole-genome annotation were performed to highlight the genetic loci involved in host-microbe interactions and the probiotic phenotype. We found that this strain codes for bile salt hydrolases, adhesins and moonlighting proteins, and for Class IIb bacteriocin peptides lacking the GxxxG and GxxxG-like motifs that are crucial for their inhibitory activity. Its adhesion ability was also validated in vitro, on human cancer cells. Furthermore, L. pentosus L33 contains an exopolysaccharide (EPS) biosynthesis cluster, and it does not carry transferable antibiotic resistance genes. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and CAZymes analyses showed that L. pentosus L33 possesses biosynthetic pathways for seven amino acids, while it can degrade a wide array of carbohydrates. In parallel, the Clusters of Orthologous Groups (COGs) and KEGG profiles of L. pentosus L33 are similar to those of 26 L. pentosus strains, as well as of two well-documented L. plantarum probiotic strains. In conclusion, L. pentosus L33 exhibits good probiotic potential, although further studies are needed to elucidate the extent of its biological properties.


INTRODUCTION
Lactobacillus is a diverse genus that includes Gram-positive, facultatively anaerobic, non-spore-forming, hetero- or homofermentative bacteria that inhabit a broad range of nutrient-rich environmental niches (Duar et al., 2017). The species of this genus have been recently reclassified into 25 genera, based on shared ecological and metabolic properties (Zheng et al., 2020). Lactobacillus strains can be found as autochthonous or allochthonous, mainly in the mammalian gastrointestinal tract, fresh fruit and vegetable microbiota, as well as in fermented foodstuffs (Inglin et al., 2018). In this context, several strains are of great biotechnological interest due to their fermentation capacity and are being incorporated as starter cultures in a broad range of dairy and non-dairy products (Kok and Hutkins, 2018). Furthermore, specific strains are considered probiotic, meaning that they can confer health benefits to the host when consumed in adequate quantities (FAO/WHO, 2002). Regarding the proposed positive impact of probiotics on host health, preclinical and clinical studies have shown that they can exhibit antimicrobial (Yu et al., 2015), immunomodulatory (Chondrou et al., 2020), antioxidant (Wu et al., 2019), antiproliferative (Tiptiri-Kourpeti et al., 2016), and even psychobiotic activity (Tian et al., 2020). Today, probiotics are commercially available in supplements or in functional products, comprising a rapidly growing global market, currently worth more than $50 billion according to market reports. The commercialization of probiotic strains is strictly monitored. Indeed, several guidelines have been set in place for the characterization of novel probiotic strains by the World Health Organization (WHO), the Food and Agriculture Organization of the United Nations (FAO), and the European Food Safety Authority (EFSA) (FAO/WHO, 2002; EFSA, 2018). First, new isolates should be molecularly assigned to a specific taxonomic group.
EFSA also requires full genome sequencing and annotation of strains that are intended for biotechnological applications (EFSA, 2018). Importantly, probiotics must be safe for consumption; they should not exhibit hemolytic or virulence activity. Consequently, they should be characterized by either the Food and Drug Administration (FDA) or EFSA with the "Generally Recognized as Safe" (GRAS) or of "Qualified Presumption of Safety" status, respectively (Rodrigo-Torres et al., 2019). Furthermore, probiotic microorganisms should be able to tolerate the gastrointestinal tract conditions; be resistant to low pH, gastric enzymes, and bile acids and, also, adhere to and, at least transiently, colonize the gastrointestinal mucosa (Hill et al., 2014). The proposed health effects of new isolates should be thoroughly explored in vitro and in vivo, to finally be validated in the clinical setting (FAO/WHO, 2002). Mechanistic studies on host-probiotic interactions have flourished recently, with the advent of multi-omics technologies, facilitating a better understanding of their properties and biological functions (Kiousi et al., 2021).
The introduction of genomics in the microbiology field has restructured the characterization of novel Lactobacillus strains as probiotics. As next-generation sequencing platforms are becoming increasingly accessible, the taxonomic and functional characterization of new isolates can be performed with greater accuracy. One of the species that has been reclassified recently is Lactiplantibacillus pentosus, formerly known as Lactobacillus pentosus (Zheng et al., 2020). The bacteria of this species are mainly associated with environmental samples, such as fruit and vegetable microbiota, however, several strains harbor genes for mammalian host adaptation (Abriouel et al., 2017). Genome mining in L. pentosus strains and comparative genomic analysis with the closely related L. plantarum species have revealed functional characteristics involved in the probiotic phenotype, such as the presence of genes involved in stress response (Ye et al., 2020), metabolic capacity (Abriouel et al., 2017), adhesion on the intestinal mucosa and bacteriocin production (Maldonado-Barragán et al., 2011).
L. pentosus L33 is a lactic acid bacteria (LAB) strain with desirable probiotic properties, as previously demonstrated in a series of established in vitro tests (Pavli et al., 2016). The aim of this study was to further investigate the probiotic potential of the strain by characterizing the genetic basis of the probiotic phenotype. First, whole-genome sequencing was performed to reveal the genomic characteristics of the strain. Then, genome annotation and comparative genomic analysis with other L. pentosus, as well as L. plantarum, genome sequences were carried out to detect strain-specific genes and pinpoint genes of interest. More specifically, the presence of gene clusters involved in the biosynthesis of bacteriocins, adhesion proteins and exopolysaccharides was investigated. Lastly, KEGG pathway and CAZymes analyses were performed to evaluate the metabolic capabilities of L. pentosus L33.

MATERIALS AND METHODS
Bacterial Strain, Culture Conditions, and DNA Isolation
L. pentosus L33 was originally isolated from fermented sausages (Pavli et al., 2016), and was acquired by the Institute of Technology of Agricultural Products, Hellenic Agricultural Organization DIMITRA (Athens, Greece). It was maintained in de Man, Rogosa, and Sharpe (MRS) broth (Condalab, Madrid, Spain) at 37 °C for 16-18 h under anaerobic conditions, prior to DNA extraction. Bacterial cells were collected by centrifugation at 8,000 × g for 4 min. Total genomic DNA was extracted from the cell pellets using the NucleoSpin® Tissue kit (Macherey-Nagel, Düren, Germany), according to the manufacturer's instructions. DNA purity and quantity were confirmed spectrophotometrically at 260 nm using a NanoDrop® ND-1000 UV-Vis Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, United States).

Whole-Genome Sequencing and Genome Annotation
The genomic DNA of L. pentosus L33 was sequenced using Illumina NovaSeq6000 (2 × 151 paired ends) platform. A total of 8,806,648 paired-end reads were obtained. The quality of the reads was estimated using FASTQC (version 0.11.9) (Andrews, 2010), while low-quality reads were removed via Trimmomatic (version 0.39) (Bolger et al., 2014). De novo assembly process was executed with SPAdes (version 3.15.1) (Bankevich et al., 2012), selecting the "-careful" option to reduce mismatches and SSPACE_Standard (version 3.0) (Boetzer et al., 2011) with the parameter to filter out contigs with length below 500 base pairs.
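The contig length filter mentioned above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they applied the filter through their assembly and scaffolding tools); the function names `parse_fasta` and `filter_contigs` and the toy contigs are hypothetical and serve only to show what discarding contigs shorter than 500 bp amounts to.

```python
from typing import Iterable, Iterator, List, Tuple

FastaRecord = Tuple[str, str]  # (header, sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[FastaRecord]:
    """Yield (header, sequence) pairs from FASTA-formatted text lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(records: Iterable[FastaRecord],
                   min_length: int = 500) -> List[FastaRecord]:
    """Keep contigs whose length is at least min_length base pairs."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# Toy input (made-up contigs of 600, 499 and 500 bp, not real assembly data).
toy_lines = [">contig_1", "A" * 600, ">contig_2", "C" * 499,
             ">contig_3", "ACGT" * 125]
kept = filter_contigs(parse_fasta(toy_lines))
print([header for header, _ in kept])
```

With the default threshold, only the 499 bp toy contig is dropped; a contig of exactly 500 bp is retained because the stated criterion removes lengths below 500 bp.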
Genome annotation was carried out locally, using the Prokaryotic Genome Annotation Pipeline (PGAP) (Tatusova et al., 2016) algorithm with default parameters. The EggNOGmapper (version 2.0) tool from the online EggNOG database (version 5.0) (Huerta-Cepas et al., 2019) was used for functional classification of proteins into COGs. BlastKOALA (version 2.2) was utilized for Kyoto Encyclopedia of Genes and Genomes Orthology (KO) assignment and KEGG mapping of the predicted genes (Kanehisa et al., 2016). Carbohydrate-active enzymes (CAZymes) were searched against the CAZy database (Lombard et al., 2014). Clustered regularly interspaced short palindromic repeats (CRISPR) inside the assembly were evaluated using CRISPRDetect (version 2.4) (Biswas et al., 2016). PHAge Search Tool Enhanced Release (PHASTER) (Arndt et al., 2016) was utilized for identification and annotation of putative prophage sequences inside the bacterial assembly. Visualization of the genome assembly was performed with the Artemis tool (version 18.1.0) (Carver et al., 2012), while its metrics were calculated with the Quality Assessment Tool (QUAST) (version 5.2.0) (Gurevich et al., 2013).

Phylogenetic and Comparative Analysis
Average Nucleotide Identity (ANI) analysis was performed on the complete genome assembly, using a python module called Pyani (version 0.2.10) (Pritchard et al., 2016), to verify the taxonomic identity of L. pentosus L33. Pangenome analysis of the available L. pentosus strains (May 2021), was operated by Roary (version 3.13.0) (Page et al., 2015). The phylogenomic analysis, including 1,000 bootstrap replicates (Maximum Composite Likelihood model), was performed by MEGAX (version 10.1.8) (Kumar et al., 2018). The publicly available online EMBL tool called "Interactive Tree of Life" (iTol) (version 6.1.1) (Letunic and Bork, 2016) was used for phylogenetic tree construction. Strain-specific genes were determined via an in-house Python script.
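The in-house script used to determine strain-specific genes is not described in detail, so the following is only a hedged sketch of one way such a step could work. It assumes the pangenome output has already been loaded into a mapping from gene cluster to the set of strains carrying it; the function name `strain_specific_genes`, the cluster identifiers, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set


def strain_specific_genes(presence: Dict[str, Set[str]], target: str) -> List[str]:
    """Return gene clusters found in the target strain and in no other strain."""
    return sorted(gene for gene, strains in presence.items()
                  if strains == {target})


# Toy presence/absence data (illustrative only, not taken from the study).
toy_presence = {
    "cluster_0001": {"L33", "IG7", "BGM48"},
    "cluster_0002": {"L33"},
    "cluster_0003": {"IG7"},
    "cluster_0004": {"L33"},
}

unique = strain_specific_genes(toy_presence, "L33")
# Share of the target strain's clusters that are strain-specific.
carried = sum("L33" in strains for strains in toy_presence.values())
share = 100 * len(unique) / carried
print(unique, round(share, 2))
```

A cluster counts as strain-specific here only when the target is its sole carrier, which is a stricter criterion than, for example, a sequence-identity cutoff; the actual criterion used by the authors may differ.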

Detection of Genetic Elements Associated With Probiotic Characteristics
BAGEL (version 4) (de Jong et al., 2006) was employed for detection and visualization of gene clusters that are involved in bacteriocin biosynthesis. The presence of antibiotic resistance genes was verified by Resistance Gene Identifier (RGI) (version 5.1.1) (Jia et al., 2017). BLAST (basic local alignment search tool) was used for the search of genes that are involved in EPS production, bile salt hydrolysis and cell adhesion.

Quantitative Adhesion Assay
The assay was performed as described before, with minor modifications (Plessas et al., 2020). Briefly, human colon adenocarcinoma HT-29 cells were seeded in 24-well plates at a density of 40 × 10⁴ cells per well and incubated for 14 days to form a monolayer. The cells were maintained in Roswell Park Memorial Institute (RPMI)-1640 medium enriched with GlutaMAX™, 10% fetal bovine serum (FBS), 100 µg/mL streptomycin and 100 U/mL penicillin (Thermo Fisher Scientific, Waltham, MA, United States) and incubated at 37 °C, 5% CO₂ in a humidified atmosphere. Viable L. pentosus L33 or L. rhamnosus GG cells (10⁷ or 10⁸ CFU/mL) were added to each well. After 4 h of co-incubation at 37 °C, the cells were washed with PBS and lysed with 1% Triton X-100 (Sigma-Aldrich, Taufkirchen, Germany). The lysates were serially diluted in Ringer's solution (Lab M, Lancashire, United Kingdom), plated on 2% MRS agar, and incubated at 37 °C until the formation of visible colonies. Adhesion values were calculated with the formula: % Adhesion = (V_B / V_A) × 100, where V_A is the initial viable count of the bacteria tested and V_B is the viable count of bacteria attached to HT-29 cells. Viable counts were expressed as colony-forming units per milliliter (CFU/mL), determined using the formula: CFU/mL = (number of colonies × dilution factor)/volume of culture plate.
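The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the colony count, dilution factor, and plated volume are made-up numbers, not measurements from this study.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """CFU/mL = (number of colonies x dilution factor) / plated volume in mL."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def percent_adhesion(v_a: float, v_b: float) -> float:
    """% Adhesion = (V_B / V_A) x 100.

    v_a: initial viable count of the bacteria tested (CFU/mL).
    v_b: viable count of bacteria attached to the cells (CFU/mL).
    """
    if v_a <= 0:
        raise ValueError("initial viable count must be positive")
    return v_b / v_a * 100


# Hypothetical example: 50 colonies on a plate of a 10^4-fold dilution,
# 0.1 mL plated, against an inoculum of 10^8 CFU/mL.
attached = cfu_per_ml(colonies=50, dilution_factor=1e4, plated_volume_ml=0.1)
adhesion = percent_adhesion(v_a=1e8, v_b=attached)
print(f"{attached:.2e} CFU/mL attached, {adhesion:.1f}% adhesion")
```

In this made-up case the attached count is roughly 5 × 10⁶ CFU/mL, corresponding to about 5% adhesion.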

RESULTS
Genome Features
Whole-genome sequencing and comprehensive bioinformatic analysis were employed for the investigation of the genomic features of L. pentosus L33 (Table 1), ultimately leading to the construction of its genome map (Figure 1). The complete genome of L. pentosus L33 has a length of 3,923,201 bp with a GC content of 46.01%. Among the 3,630 predicted genes, 3,429 were found to be protein-coding sequences (CDS). Furthermore, 127 pseudogenes, 58 tRNAs, 6 rRNAs, and 5 ncRNAs were identified. The 58 tRNA encoding sequences correspond to all 20 amino acids (Supplementary Table 1). In addition, 3 clustered regularly interspaced short palindromic repeats (CRISPR) arrays (Supplementary Table 2), as well as 4 intact prophage regions (Supplementary Table 3) were recognized.
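For readers unfamiliar with how a GC content figure such as the one above is obtained, a minimal sketch follows. The assembly metrics in this study were computed with QUAST; the function `gc_content` below is a hypothetical stand-in that assumes ambiguous bases (such as N) are excluded from the denominator, which may not match every tool's convention.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy sequences (not from the L33 assembly).
print(gc_content("GGCCAATT"))
print(gc_content("ggccNNaatt"))
```

For a multi-contig draft assembly, the same calculation would be applied to the concatenation of all contigs rather than averaged per contig.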

Phylogenetic Analysis and Unique Genome Characteristics of Lactiplantibacillus pentosus L33
For the characterization of strain L33, sequencing of the V1-V3 region of the 16S rRNA gene, followed by multiplex PCR targeting the recA gene, was performed. Strain L33 was assigned to the species Lactobacillus pentosus (Pavli et al., 2016), currently known as Lactiplantibacillus pentosus (Zheng et al., 2020). A neighbor-joining phylogenetic tree, including 1,000 bootstrap replications, based on orthologous gene clusters, was built to reveal the exact phylogenetic position of L. pentosus L33 within the L. pentosus species (Figure 2). Moreover, we used 26 L. pentosus strains and 2 well-documented L. plantarum probiotic strains, L. plantarum WCFS1 (van den Nieuwboer et al., 2016) and L. plantarum 299v (Nordström et al., 2021), as well as Staphylococcus aureus NCTC8325 and Streptococcus pneumoniae NCTC11032, as controls (Supplementary Figure 1). The closest evolutionary relatives of L. pentosus L33 are L. pentosus IG7, which was isolated from the brine of natural Spanish-style green olive fermentation (Calero-Delgado et al., 2018), and L. pentosus BGM48, which originated from laboratory-scale Sicilian-style green olive fermentation (Golomb et al., 2013; Figure 2).
Furthermore, ANI analysis found that L. pentosus IG7 and L. pentosus BGM48 exhibit the greatest ANI scores relative to L. pentosus L33, with 99.3 and 98.8%, respectively. The full ANI matrix, including all genomes, is presented in Figure 3. Additionally, when compared with the 26 analyzed L. pentosus genomes, L. pentosus L33 has 243 genes (6.60%) that were found to be strain-specific. The proteins encoded by these unique genes were classified into COG functional categories (Figure 4). A total of 190 (78.18%) unique proteins were assigned to 18 COG functional categories. The majority (96 proteins) were categorized as "poorly characterized."

Functional Classification
We sought to perform in silico functional classification of L. pentosus L33 and applied various interconnected approaches to achieve a well-rounded categorization of its genes/CDSs. The COG database is a valuable tool for describing the functional characteristics of newly sequenced genomes, as well as for comparing microbial communities (Galperin et al., 2019). Moreover, KEGG analysis is used to examine the diversity, as well as the functionality, of the proteins. Therefore, we performed a comprehensive analysis and comparison of the COG and KEGG profiles for L. pentosus L33, 26 L. pentosus strains, L. plantarum WCFS1 and L. plantarum 299v. The vast majority (94.66%) of the CDSs of L. pentosus L33 were allocated to 20 COG functional categories (Figure 5). The category "Function Unknown" was the most abundant (21.1%), followed by "General Function Prediction only" (12.3%), "Transcription" (9.0%), "Replication, Recombination, and Repair" (6.2%), and "Carbohydrate transport and metabolism" (6.1%). Furthermore, comparison of the COG profile of L. pentosus L33 with the respective COG profiles of the 26 L. pentosus strains, L. plantarum WCFS1, and L. plantarum 299v highlighted its similarity with respect to the percentage of genes allocated to each of the COG functional categories (Figure 5 and Supplementary Table 4). This similarity is independent of the isolation source of the analyzed strains, since they have been derived from a variety of ecological niches, such as meat samples, olive brines, milk products, fermented vegetables, and the human intestine.
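The per-category percentages reported above are, in essence, a frequency table over COG assignments. The sketch below shows that tally under a simplifying assumption that each classified CDS carries exactly one COG category letter; real annotation output can assign several letters to one protein, and the function name `cog_profile` and the toy input are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def cog_profile(categories: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of classified CDSs falling in each COG category."""
    counts = Counter(categories)
    total = sum(counts.values())
    # most_common() orders categories from most to least abundant.
    return {cat: 100.0 * n / total for cat, n in counts.most_common()}


# Toy assignments: S = function unknown, K = transcription,
# G = carbohydrate transport and metabolism (illustrative only).
profile = cog_profile(["S", "S", "K", "G"])
print(profile)
```

Computing such a profile for each genome and comparing the resulting percentages category by category is one simple way to judge the similarity between strains described in the text.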
To further characterize the function of the CDSs of L. pentosus L33, we performed KEGG analysis. Approximately half of the L. pentosus L33 CDSs (52.10%) were assigned to 39 KEGG functional categories and 189 pathways. These pathways are mainly involved in the biosynthesis of secondary metabolites (ko: 01110; 180 genes), microbial metabolism in diverse environments (ko: 01120; 100 genes), and biosynthesis of amino acids (ko: 01230; 86 genes). As with the COG profiles, the number of genes assigned to each of the KEGG functional categories is similar among L. pentosus L33, the other 26 L. pentosus strains, L. plantarum WCFS1, and L. plantarum 299v (Figure 6 and Supplementary Table 5). In addition, five virulence factors were identified in L. pentosus L33, including a molecular chaperone (Hsp33), a translocase (YidC), two proteins of Mycobacterium tuberculosis with poorly defined function (Jag and YidD) (Yu et al., 2011), and a hemolysin III family protein. However, the functionality of the detected hemolysin remains questionable, since previous reports indicate that L. pentosus L33 does not exhibit hemolytic activity in vitro (Pavli et al., 2016).

Identification of Genes Implicated in the Probiotic Potential of Lactiplantibacillus pentosus L33
Finally, we performed comparative and comprehensive bioinformatic analysis to examine the L. pentosus L33 genome in depth and locate genes and/or regions conferring probiotic potential. The Prokaryotic Genome Annotation Pipeline (PGAP) predicted that L. pentosus L33 contains 4 genes related to bile salt resistance: two bile salt hydrolases and two enzymes that are members of the GCN5-related N-acetyltransferase (GNAT) family (Table 2). Furthermore, RGI showed that the resistome of L. pentosus L33 does not contain transferable antibiotic resistance genes. In addition, a gene cluster consisting of 18 genes, involved in EPS biosynthesis, was identified during genome annotation. This cluster has previously been described in the potential probiotic strain L. pentosus SLC13 and is also present in the probiotic strain L. plantarum WCFS1 (Huang et al., 2018). Comparison between the EPS gene clusters indicated that the genes carried by L. pentosus L33 are homologous to those of strain SLC13, which is a potent exopolysaccharide-producing strain (Figure 7; Huang et al., 2018). L. pentosus L33 also contains 3 mucus-binding domain-containing proteins and 2 proteins with fibronectin-binding domains along with NFACT domains (Table 3). Furthermore, 6 surface proteins carrying LPxTG cell wall anchor motifs were identified (Table 3). Moreover, moonlighting proteins with adhesin-like activity, namely elongation factor Tu, chaperonin GroEL, and co-chaperone GroES, are also present in the genome of L. pentosus L33 (Table 3). Notably, the adhesion capacity of L. pentosus L33 was validated in vitro utilizing HT-29 cells. Importantly, the strain exhibited similar adhesion capacity to that of the reference strain, L. rhamnosus GG (Figure 8).
Concerning the antimicrobial activity of the studied strain, L. pentosus L33 encodes a class IIb bacteriocin, which is homologous to plantaricin NC8 αβ (Bengtsson et al., 2020). Class IIb bacteriocins consist of two peptides, which mediate their action through the interaction of their GxxxG and GxxxG-like (SxxxS and GxxxS) motifs with the membrane of the target pathogen (Maldonado et al., 2003; Bengtsson et al., 2020). The peptides coded by this strain lack a GxxxG-like motif and, as a result, the functionality of the final product might be seriously affected (Supplementary Figure 2).
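Checking a peptide for the GxxxG, SxxxS, and GxxxS patterns named above can be done with a simple pattern search. The following sketch is not the method used in this study (the motif analysis here was based on sequence comparison, see Supplementary Figure 2); the function name `find_gxxxg_like` and the test peptides are hypothetical and are not plantaricin sequences.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping motifs are all reported;
# "." stands for any single residue (the "x" positions).
MOTIF_PATTERN = re.compile(r"(?=(G...G|S...S|G...S))")


def find_gxxxg_like(peptide: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched 5-mer) for each motif hit."""
    return [(m.start() + 1, m.group(1))
            for m in MOTIF_PATTERN.finditer(peptide.upper())]


# Made-up peptides for illustration only.
print(find_gxxxg_like("AAGAAAGAA"))   # contains one GxxxG
print(find_gxxxg_like("MSAAASKK"))    # contains one SxxxS
print(find_gxxxg_like("MKKLLAA"))     # contains none of the motifs
```

An empty result for both peptides of a two-peptide bacteriocin would be consistent with the absence of these motifs, although a pattern match alone says nothing about whether a motif is positioned to function.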

DISCUSSION
In this study, we present the draft genome sequence of Lactiplantibacillus pentosus L33, a strain isolated from traditional meat products (Pavli et al., 2016). The genome of this strain consists of a circular chromosome, with no plasmid sequences detected. The complete genomic length (3,923,201 bp) and GC content (46.01%) of L. pentosus L33 were found to be similar to those of other L. pentosus strains, such as the closely related L. pentosus IG7 (3,802,404 bp, GC content: 45.79%, Accession: GCA_002993395.1) (Calero-Delgado et al., 2018), and the potential probiotic strain L. pentosus MP-10 (3,698,214 bp, GC content: 46.00%, Accession: GCA_900092635.1) (Abriouel et al., 2017). The genomic size and GC content of strains could be indicative of their lifestyle and preferred environmental niche. Strains of the Lactobacillus sensu lato that are free-living or nomadic usually possess a larger genome with an approximate length of 3-4 Mb, whereas host-adapted strains have a drastically smaller genome due to gene loss (Duar et al., 2017). The genetic traits that can be affected by this event are clusters for amino acid synthesis, and genes involved in metabolism regulation (Zheng et al., 2015). In this study, we found that the genes of L. pentosus L33 are involved in the complete biosynthesis of seven amino acids (Supplementary Table 7 and Supplementary Figures 3-8) and encode part of the proteins required for the biosynthesis of the remaining 13 amino acids. Interestingly, comprising 100 genes, the "microbial metabolism in diverse environments" pathway (ko: 01120) was the second most common. The modules of this pathway include carbohydrate, methane, nitrogen, co-factor, and vitamin metabolism, among others. Concerning carbohydrate metabolism, CAZymes analysis showed that L. pentosus L33 does, indeed, code for enzymes involved in the synthesis and degradation of a broad array of simple and complex carbohydrates, such as glucose, galactose, mannose, trehalose, xylose, chitin, and cellulose. Additionally, it codes for galactose-, lactose-, starch-, and glycogen-binding modules that facilitate the catalytic activity of hydrolases (Boraston et al., 2004). These findings suggest that L. pentosus L33 may be able to inhabit a broad range of environmental niches.
FIGURE 7 | Comparison of length, position and direction of genes comprising the EPS biosynthesis cluster of L. pentosus L33 and L. pentosus SLC13. Protein identities between the two strains are also displayed. The red line indicates that gene number 9 in L. pentosus SLC13 is a pseudogene.
Concerning the functional properties of this strain, L. pentosus L33 has presented desirable attributes in a previous in vitro study, where a total of 48 Lactobacillus strains were assessed for their susceptibility to common antibiotics, hemolytic activity, tolerance to gastrointestinal conditions, and antimicrobial properties (Pavli et al., 2016). It was found that L. pentosus L33 exhibited good tolerance to bile salts, which was not accompanied by bile salt hydrolase activity. Interestingly, in the present study, we detected two coding sequences for bile salt hydrolases (Table 2); however, their functionality is questionable in light of the in vitro findings. Nevertheless, it is important to note that bile salt resistance is a complex phenotype that can be mediated by several mechanisms, such as bile-efflux systems and changes in EPS and S-layer protein production (Ruiz et al., 2013). This trait should be explored in greater depth in future studies.
In this study, we found that this strain does not carry transferable antibiotic resistance genes. In fact, with the exception of vancomycin, L. pentosus L33 was not able to survive treatments with common antibiotics (Pavli et al., 2016). Vancomycin resistance in Lactobacillus strains is considered to be intrinsic; therefore, it is no surprise that L. pentosus L33 presents resistance to vancomycin. Consequently, the demonstrated resistance does not raise any safety concerns, as there is no implication of horizontal gene transfer (Shao et al., 2015). The mode of action of this antibiotic involves its interaction with peptidoglycan precursors, leading to the inhibition of cell wall synthesis. More specifically, vancomycin binds to the D-alanine/D-alanine terminus of the muramyl pentapeptide and inhibits the polymerization of the peptidoglycan precursor. In this context, we found that L. pentosus L33 possesses a gene (VanX) encoding a D-Ala-D-Ala dipeptidase, which hydrolyzes D-alanine/D-alanine residues (Liu et al., 2009). Moreover, in several LAB species, the D-alanine residue located at the end of the pentapeptide is substituted by D-lactate or D-serine, thus blocking vancomycin binding (Delcour et al., 1999). KEGG analysis showed that L. pentosus L33 encodes a Ddl ligase, responsible for the D-alanine to D-lactate substitution in Lactobacilli (Tuyarum et al., 2021). Furthermore, there are reports indicating that many LAB genera exhibit intrinsic resistance to other antibiotics, such as bacitracin, kanamycin, teicoplanin, and quinolones (Imperial and Ibana, 2016). Additionally, it should be noted that transfer of the vancomycin resistance cluster from Enterobacteriaceae to commercial probiotic strains has been reported in vitro and in vivo, during transit in the murine gastrointestinal tract (Mater et al., 2008).
The cellular surface of Lactobacilli is decorated by a plethora of cell surface proteins that can interact with host receptors and give rise to a variety of probiotic effects (Teame et al., 2020). Indeed, probiotics can interact with the gastrointestinal mucosa of mammalian hosts utilizing pili, mucin- and fibronectin-binding proteins, as well as surface-layer (S-layer) proteins (Siciliano et al., 2019). These interactions are necessary for the transient attachment of ingested probiotics to the intestinal mucosa, while they can also facilitate important probiotic functions, including antimicrobial (Tytgat et al., 2016) and immunomodulatory activity (Monteagudo-Mera et al., 2019). In this study, we found that L. pentosus L33 carries mucus- and fibronectin-binding proteins (Table 3); however, it does not encode spaCBA pili, commonly found in other LAB strains, such as L. rhamnosus GG (Reunanen et al., 2012). The adhesins are covalently anchored to the peptidoglycan layer by a C-terminal Leu-Pro-any-Thr-Gly (LPxTG) motif, which is also used for their identification in silico (Siegel et al., 2017). Moreover, cytoplasmic proteins that participate in important housekeeping functions, such as carbohydrate metabolism, translation regulation and protein folding, can be found, anchorless, in the cellular envelope acting as adhesins (Celebioglu et al., 2017). These multifunctional proteins, also known as moonlighting proteins, have been identified in animals, plants, yeast and bacteria. L. pentosus L33 encodes some of these proteins: elongation factor Tu (EF-Tu), chaperonin GroEL, and co-chaperonin GroES. Previous reports have shown that L. plantarum and L. pentosus strains utilize EF-Tu (Choudhary et al., 2019) and GroEL (Calasso et al., 2013) for adhesion to the intestinal epithelium. The adhesion capacity of the strain was further validated in vitro. We showed that L. pentosus L33 can efficiently adhere to HT-29 cells, exhibiting similar behavior to L. rhamnosus GG, a reference strain whose capacity to attach to and colonize the gastrointestinal mucosa has been previously described (Chondrou et al., 2018; Pagnini et al., 2018). Further studies are required to evaluate this finding and elucidate its contribution to probiotic efficacy.
Furthermore, we report that the L. pentosus L33 genome includes five virulence factors. The hemolysin III family protein is very common among Lactiplantibacillus genomes, including those of the probiotic strains L. plantarum 299v and L. plantarum ST-III. These strains have an established safety profile, and they are widely used as probiotics (Chokesajjawatee et al., 2020). Heat shock protein 33 (Hsp33) is a redox-regulated molecular chaperone that binds to unfolded proteins and prevents protein aggregation (Winter et al., 2005). The YidC gene encodes a translocase that regulates respiratory metabolism in Mycobacterium tuberculosis (Thakur et al., 2016). YidD and Jag belong to the same gene cluster as YidC, but their function remains unclear (Yu et al., 2011). Nevertheless, the impact of these factors on the safety profile of L. pentosus L33 has to be further examined.
The probiotic character has also been linked to EPS biosynthesis, as it is well established that EPS play a key role in the dynamic interaction of bacteria with their environment (Angelin and Kavitha, 2020). EPS can be found loosely attached to the cell surface or excreted in the growth medium of the producer strain, while the yield of production can fluctuate based on growth conditions. In addition, the produced exopolysaccharides can vary in terms of monosaccharide constitution, charge, linkage, and existence of repeated sidechains (Caggianiello et al., 2016). EPS can facilitate niche adaptation, as they promote auto-aggregation (Aslim et al., 2007), attachment to abiotic or biotic surfaces and biofilm formation (Castro-Bravo et al., 2018). Furthermore, there are several physiological functions attributed to EPS such as anti-inflammatory, antioxidant, antiviral, and antiproliferative activity (Nguyen et al., 2020). Lastly, the production of EPS in high concentrations can alter the organoleptic characteristics of fermented products (Ale et al., 2020). In this study, we found an EPS biosynthesis cluster, homologous to that of L. pentosus SLC13, a LAB strain known for its capacity to produce high yields of EPS (Huang et al., 2018). In this context, the EPS fraction of L. pentosus L33, is currently being studied for its antimicrobial and antibiofilm potential.
Our analysis showed that L. pentosus L33 does not code for functional bacteriocins, due to the lack of motifs crucial for their inhibitory action. This finding agrees with previous in vitro studies, where no bacteriocin-like activity was detected (Pavli et al., 2016). However, probiotics can exert antimicrobial effects through various mechanisms, such as competition for nutrients, inhibition of pathogen adhesion (Walsham et al., 2016) and immune system stimulation (Tuo et al., 2018). Moreover, they can produce other inhibitory compounds, such as fatty acids, hydrogen peroxide, ethanol (Chen et al., 2019) and biosurfactants (Sharma and Saharan, 2016), or induce indirect antimicrobial effects by lowering the intestinal pH through the production of high amounts of lactic and acetic acids. Thus, ongoing studies focus on the potential of this strain to interfere with proliferation and biofilm formation of clinically relevant strains, such as Staphylococcus aureus, Salmonella enteritidis, and Escherichia coli, by mechanisms other than bacteriocin production.
In conclusion, these findings, in combination with previous in vitro work, support that L. pentosus L33 is a good probiotic candidate. This strain fulfills the main criteria for probiotic selection: tolerance to gastrointestinal tract conditions, susceptibility to common antibiotics and γ-hemolytic activity. In the present study, we introduced new traits that add to the characterization of L. pentosus L33 as a novel probiotic strain, namely the capacity to produce adhesins and exopolysaccharides. Whole-genome sequencing and comprehensive bioinformatic analysis facilitate targeted laboratory validation of traits of newly isolated strains, streamlining their characterization as probiotic. In this context, future studies will examine the in situ performance of the L. pentosus L33 strain as a starter/adjunct culture for the production of fermented dry meat products (as this strain was previously isolated from fermented sausages), to assess its effectiveness for application in sausage manufacturing. Additionally, future research will explore the L. pentosus L33-host interactome, and especially gut colonization mechanisms. Overall, L. pentosus L33 is of great interest as a potential probiotic strain, and forthcoming studies will further unravel its characteristics in vitro, in vivo, and in situ.

DATA AVAILABILITY STATEMENT
The datasets presented in this study have been submitted to DDBJ/ENA/GenBank under the accession number JAHKRU000000000. The version described in this manuscript is the JAHKRU010000000.

AUTHOR CONTRIBUTIONS
AP, NC, PK, and AG designed the study. OS, KT, DK, and MT performed genome analysis and participated in the writing of the manuscript. AP, CT, NC, PK, and AG contributed to editing and critical reviewing of the manuscript. CT and NC took charge of the resources. All authors have read and approved the final manuscript.