Novel Insights Into the Phylogeny and Biotechnological Potential of Weissella Species

In this study, the genomes of the Weissella (W.) beninensis, W. diestrammenae, W. fabalis, W. fabaria, W. ghanensis, and W. uvarum type strains were sequenced and analyzed. Moreover, the ability of these strains to metabolize 95 carbohydrates was investigated, and the genetic determinants of this capability were searched for within the sequenced genomes. 16S rRNA gene- and genome-based phylogeny of all the Weissella species described to date allowed a reassessment of the species groups of the genus. As a result, six distinct species groups within the genus, namely, the W. beninensis, W. kandleri, W. confusa, W. halotolerans, W. oryzae, and W. paramesenteroides species groups, could be described. Phenotypic analyses provided further knowledge about the ability of the W. beninensis, W. ghanensis, W. fabaria, W. fabalis, W. uvarum, and W. diestrammenae type strains to metabolize certain carbohydrates and confirmed the interspecific diversity of the analyzed strains. Moreover, in many cases, the clustering based on carbohydrate metabolism pathways overlapped with the phylogenomic species group clustering. The novel insights provided in our study improve the knowledge about the Weissella genus and allowed us to identify features that define the role of the analyzed type strains in fermentative processes and their biotechnological potential.


INTRODUCTION
Weissella is a genus belonging to the phylum Firmicutes, order Lactobacillales, and family Lactobacillaceae. Weissella species are non-spore-forming, catalase-negative, Gram-positive bacteria with either a coccoid or rod shape. They are widespread, having been isolated from various ecological niches such as soil (mainly Weissella soli) (Chen et al., 2005), sediments of coastal marsh and estuary (Sica et al., 2010), plants (Emerenini et al., 2014), and foods. Their presence in food matrices and many spontaneous fermentation processes of vegetables, dairy, and cereals (Wang et al., 2011; Zannini et al., 2013, 2018; Mun and Chang, 2020) reveals their role in these fermentative processes, in which they influence the quality, and especially the texture, of foods. In addition, they have been isolated from the gut, oral cavity, breast milk (Martín et al., 2007; Albesharat et al., 2011), and urogenital and gastrointestinal tracts of humans (Lee et al., 2012), as well as from many animals, such as giant pandas and rainbow trout, and from the skin, milk, and gastrointestinal tracts of vertebrates (Xiong et al., 2019; Mortezaei et al., 2020; Wang et al., 2020).
The presence of these species as common inhabitants of the human intestine (Lee et al., 2012) and animal feces (Cai et al., 1998;Beasley et al., 2006;Muñoz-Atienza et al., 2013), as well as their occurrence in a variety of food matrices, has suggested their potential use as probiotics (Teixeira et al., 2021). This potential probiotic activity is supported by observations that some Weissella species can overcome the gastric barrier (Wang et al., 2008) and produce antimicrobial compounds such as bacteriocins (Srionnual et al., 2007;Masuda et al., 2012). For example, W. cibaria has been reported to inhibit biofilm formation in vitro and the proliferation of the main bacterial pathogens in dental caries and upper respiratory infections (Yeu et al., 2021), and has been studied for its potential anti-inflammatory activity against lipopolysaccharide stimulation (Yu et al., 2019), while W. viridescens was proposed as a potential probiotic for the skin (Espinoza-Monje et al., 2021). Furthermore, potential probiotic features of this genus include the reduction in depressive-like behavior (Sandes et al., 2020), the influence on gut permeability and intestinal epithelial regeneration (Prado et al., 2020), and the antagonistic activity against common human pathogens (Afolayan et al., 2017;Choi et al., 2018;Yu et al., 2019).
The capacity of this genus to grow at a wide range of temperatures, water activity levels, and pH (Ricciardi et al., 2009) increases its promising biotechnological potential, mainly attributed to the ability to produce exopolysaccharides (EPS) (Fusco et al., 2015). This has already promoted their use as starter cultures in dairy industries and sourdough fermentation (Galle et al., 2010), especially in the preparation of gluten-free baked products (Li et al., 2021). W. cibaria (Galli et al., 2020) and W. confusa have been largely studied for their ability to produce high amounts of dextrans, which enhance the softness of fresh bread (Wolter et al., 2014), and also for improving the textural properties of gluten-free bread (Montemurro et al., 2021).
In this study, we sequenced the genomes of W. beninensis, W. diestrammenae, W. fabalis, W. fabaria, W. ghanensis, and W. uvarum type strains. These were the only Weissella species whose assemblies were not available at the time this study was performed. Based on our genomic analysis, we updated the taxonomic classification of Weissella species. We also phenotypically characterized the newly sequenced strains, evaluating their capability to metabolize different carbohydrates, while identifying the genetic determinants encoding these enzymatic activities.

Whole-Genome Sequencing
A single colony of each strain was inoculated in 10 mL of MRS (de Man, Rogosa, and Sharpe) broth (Oxoid, Italy) and incubated overnight at 30 °C. Two mL of each broth culture were washed in Tris-HCl (10 mM, pH 7.5) and resuspended in 500 µL of the same buffer. Genomic DNA was extracted using the peqGOLD bacterial DNA kit (Peqlab, Erlangen, Germany) according to the manufacturer's instructions. The integrity, purity, and quantity of DNA were assessed using agarose gel electrophoresis, a Nanodrop photometer (Peqlab), and a Qubit 3.0 fluorometer (Life Technologies). Sequencing libraries were prepared with the Illumina TruSeq Nano DNA LT Library Prep Kit (MiSeq v3-kit) (Illumina, San Diego, USA) according to the manufacturer's instructions and sequenced on the Illumina MiSeq platform using the 2 × 250 paired-end procedure. Reads were trimmed with NxTrim (V2) (O'Connell et al., 2015) and Trimmomatic (Bolger et al., 2014), and de novo assembly was performed using SPAdes version 3.10.1 (Bankevich et al., 2012). The whole-genome shotgun projects were deposited at DDBJ/ENA/GenBank under the accessions JAGMVS000000000 for W. beninensis LMG 25373ᵀ, JAGMVT000000000 for W. diestrammenae DSM 27840ᵀ, JAGMVU000000000 for W. fabalis LMG 26217ᵀ, JAGMVV000000000 for W. fabaria LMG 24289ᵀ, JAGMVW000000000 for W. ghanensis DSM 19935ᵀ, and JAGMVX000000000 for W. uvarum B18NM42ᵀ. The versions described in this study are JAGMVS010000000 for W. beninensis LMG 25373ᵀ, JAGMVT010000000 for W. diestrammenae DSM 27840ᵀ, JAGMVU010000000 for W. fabalis LMG 26217ᵀ, JAGMVV010000000 for W. fabaria LMG 24289ᵀ, JAGMVW010000000 for W. ghanensis DSM 19935ᵀ, and JAGMVX010000000 for W. uvarum B18NM42ᵀ.

Bioinformatic Methods
The completeness of the assemblies was assessed by identifying a set of essential genes, which are typically present in a single copy in almost all prokaryotic genomes, by using MiGA (Rodriguez-R et al., 2018) and estimating the percentage of their presence in the genome. The quality scores were then calculated as the completeness percentage minus five times the contamination percentage.
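The quality score defined above is simple arithmetic and can be sketched as follows. The function name and the input values are hypothetical and chosen only for illustration; they are not results from this study or part of MiGA.

```python
def quality_score(completeness_pct: float, contamination_pct: float) -> float:
    """Quality = completeness (%) minus five times contamination (%)."""
    return completeness_pct - 5.0 * contamination_pct


# Hypothetical input values, for illustration only.
example = quality_score(completeness_pct=98.0, contamination_pct=1.0)
print(example)  # 93.0
```

Because contamination is weighted five times, a small amount of contamination lowers the score more than the same amount of missing essential genes.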
Proteins were predicted by using the Prokaryotic Genome Annotation Pipeline (Tatusova et al., 2016) and the PROKKA pipeline (Seemann, 2014) implemented in the Galaxy platform (Galaxy Version 1.14.6 + galaxy0; Afgan et al., 2018). Functional classification, subsystem prediction, and metabolic reconstruction comparison were performed using the RAST server (Aziz et al., 2008). All the protein sequences used in this study were retrieved from GenBank (NCBI). The homology-based relationship of Weissella strains toward selected proteins was determined using the BLASTP algorithm on the NCBI site (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Gene models were manually determined, and clustering and orientation were subsequently deduced for the closely linked genes. Genes were also retrieved by keyword search within the UniProtID entry list obtained by functional annotation and then manually curated.

Phylogenetic Analysis
A comparative genomic analysis was performed using the strains and genomic assemblies listed in Supplementary Table S1. The 16S rRNA gene sequences were extracted from each genome by a BLASTn search and compared to the type strain sequences retrieved from available assemblies, whose accession numbers are indicated in Supplementary Table S1. An alignment was generated with MUSCLE 3.8.31 on the phylogeny.fr platform (http://www.phylogeny.fr/index.cgi), configured in "A la Carte" mode (Dereeper et al., 2008). The phylogenetic reconstruction was performed using the neighbor-joining method; the phylogenetic robustness was inferred by a bootstrapping procedure with 500 replications to obtain the confidence value for the aligned sequence dataset. The tree was graphically generated by iTOL version 5.5 (Letunic and Bork, 2019).
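The bootstrapping idea can be illustrated with a minimal sketch. It is not the phylogeny.fr pipeline and does not build a neighbor-joining tree. It only resamples alignment columns with replacement 500 times and reports how often one sequence is closer to a second than to a third, as a simplified proxy for the support of that grouping. The sequences, names, and functions below are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in cols) for name in names}


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def support_for_pair(alignment, a, b, outsider, replicates=500, seed=0):
    """Fraction of replicates in which a is closer to b than to outsider."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_columns(alignment, rng)
        if p_distance(rep[a], rep[b]) < p_distance(rep[a], rep[outsider]):
            hits += 1
    return hits / replicates


# Hypothetical toy alignment (not real 16S rRNA gene sequences).
toy = {"sp1": "ACGTACGTAC", "sp2": "ACGTACGTAT", "sp3": "TCGAACCTAG"}
support = support_for_pair(toy, "sp1", "sp2", "sp3")
print(f"bootstrap support: {support:.2f}")
```

In a real analysis, a tree is rebuilt from every replicate and the support of a node is the fraction of replicate trees that contain it.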
The genetic divergence was calculated using the ANI/AAI calculator (Goris et al., 2007;Rodriguez-R and Konstantinidis, 2016), which estimates the average nucleotide/amino acid identity (ANI/AAI) between genomic datasets using both best hits (one-way ANI) and reciprocal best hits (two-way ANI).
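The difference between the one-way and two-way estimates can be sketched as follows. This is a simplified illustration, not the implementation of the ANI/AAI calculator: the fragment identifiers, the identity values, and the choice to average the two identities of a reciprocal pair are all assumptions.

```python
def one_way_identity(best_hits):
    """Mean identity over all best hits (query -> (subject, identity))."""
    values = [identity for _, identity in best_hits.values()]
    return sum(values) / len(values)


def two_way_identity(hits_ab, hits_ba):
    """Mean identity over reciprocal best hits only."""
    values = []
    for query, (subject, identity_ab) in hits_ab.items():
        back = hits_ba.get(subject)
        # Keep the pair only if the subject's best hit points back to the query.
        if back is not None and back[0] == query:
            values.append((identity_ab + back[1]) / 2)
    return sum(values) / len(values) if values else float("nan")


# Hypothetical best hits between fragments of genome A and genome B.
hits_ab = {"a1": ("b1", 90.0), "a2": ("b2", 80.0), "a3": ("b2", 70.0)}
hits_ba = {"b1": ("a1", 90.0), "b2": ("a2", 82.0)}

print(one_way_identity(hits_ab))           # 80.0
print(two_way_identity(hits_ab, hits_ba))  # 85.5
```

Here fragment a3 contributes to the one-way value but is dropped from the two-way value, because b2 has a2 as its best hit.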
A genome-based phylogeny was reconstructed using the Phylogenetic Tree Building Service implemented in the PATRIC platform (www.patricbrc.org) with Bifidobacterium bifidum ATCC 29521ᵀ as an outgroup according to Fusco et al. (2015) and the maximum likelihood method RAxML with progressive refinement (Stamatakis, 2014). For each strain, the genomic sequences deposited in GenBank were used for the analysis (Supplementary Table S1).

Comparative Genomic Analysis
A comparative analysis of carbohydrate-active enzymes (CAZy) was performed by executing a CAZyme annotation using the dbCAN meta server (http://bcb.unl.edu/dbCAN2/blast.php). The database was searched for conserved domains of all CAZy families, following the protocol proposed by dbCAN (Yin et al., 2012). A heatmap was manually constructed and visualized using the heatmapper web server (www.heatmapper.ca; Babicki et al., 2016), with average linkage as the clustering method and Euclidean distance as the distance measure.
A comparative analysis of carbohydrate metabolism was performed by pathway reconstruction, annotating proteins with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Mapper (Kanehisa and Sato, 2020). A heatmap was again manually constructed based on the protein count in each pathway retrieved for individual genomes, and visualized by using the heatmapper web server (www.heatmapper.ca; Babicki et al., 2016) with average linkage as the clustering method and Euclidean distance as the distance measure.
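The clustering behind both heatmaps (average linkage on Euclidean distances) can be sketched in plain Python. This is an illustrative reimplementation of the general technique, not the heatmapper code, and the strain names and count profiles are hypothetical.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two count profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_distance(profiles, cluster_a, cluster_b):
    """Average linkage: mean of all pairwise distances between two clusters."""
    total = sum(euclidean(profiles[a], profiles[b])
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(profiles, *pair))
        merges.append((c1, c2, average_distance(profiles, c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical enzyme counts per strain in three categories.
profiles = {
    "strain_A": [10, 5, 0],
    "strain_B": [11, 5, 0],
    "strain_C": [2, 9, 4],
    "strain_D": [2, 7, 4],
}
for left, right, dist in average_linkage(profiles):
    print(left, right, round(dist, 2))
```

The order of the merges defines the dendrogram drawn next to the heatmap: strains with similar count profiles are joined first and at the shortest distances.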
A comparative analysis of SEED subsystems was performed by uploading individual genome sequences to the SEED Viewer Server (Overbeek et al., 2014). Functional roles of RAST annotated genes were assigned and grouped in subsystem feature categories.

Phenotypic Characterization
Phenotypic characterization was carried out as previously described by Fanelli et al. (2020), with some modifications. Biolog AN (Biolog, Inc., Hayward, CA, USA) plates were used to evaluate the consumption of 95 different carbon sources. Briefly, strains were grown in MRS broth (Oxoid, Italy) for 24 h. Cells were then harvested by centrifugation (10,000 rpm for 10 min) and washed twice with sterile phosphate buffer (50 mmol/L, pH 7.0). Thereafter, cells were resuspended in sterile physiological saline (0.9% w/v NaCl). Each well of the plates was inoculated with 100 µL of bacterial suspension, adjusted to 65% transmittance. Plates were incubated in an anaerobic jar at 35 °C for 24 h as recommended by the manufacturer. Positive reactions were automatically recorded using the MicroStation microplate reader (Biolog) with 590 nm and 750 nm wavelength filters.

Statistical Analysis
All analyses were carried out in triplicate. Metabolic fingerprints of Weissella strains as determined by the Biolog system were subjected to permutation analyses using PermutMatrix as described by Fanelli et al. (2020).

General Features of Weissella Genomes
The genomes of the six Weissella type strains were sequenced and assembled using the SPAdes software (version 3.10.1), obtaining a total of 76, 33, 39, 40, 106, and

Phylogenetic Analysis
The 16S rRNA gene sequence-based phylogeny is shown in Figure 1. According to the reconstructed tree, Weissella species can be clustered into six different species groups. The first group is constituted by W. thailandensis, W. bombi, W. paramesenteroides, W. hellenica, and W. sagaensis; a second group is formed by W. cibaria (W. kimchii) and W. confusa, which have a 16S rRNA gene sequence identity of 99.3%. The same percentage is shared by W. oryzae and W. muntiaci, which can be placed in a third species group. W. soli occurs close to these pairs in the phylogenetic tree and has the highest 16S rRNA gene sequence identity with W. muntiaci (
A genome-based phylogeny was inferred from RAxML analysis. The results are shown in Figure 2. Partially confirming the clustering obtained by 16S rRNA gene sequence phylogeny, six species groups could be identified and named as the W. beninensis, W. kandleri, W. oryzae, W. confusa, W. paramesenteroides, and W. halotolerans species groups. The W. beninensis group includes W. beninensis, W. cryptocerci, W. fabalis, W. fabaria, and W. ghanensis. The W. kandleri group comprises W. kandleri, W. diestrammenae, W. koreensis, W. soli, and W. coleopterorum. The W. oryzae group includes W. oryzae and W. muntiaci. The fourth species group, named the W. confusa species group, comprises W. confusa and W. cibaria. The fifth species group, designated as W. paramesenteroides by Fusco et al. (2015), includes W. paramesenteroides, W. hellenica, and W. thailandensis, with the addition of the recently described W. sagaensis (Li et al., 2020) and W. bombi (Praet et al., 2015). The sixth species group, already designated as W. halotolerans by Fusco et al. (2015), includes W. halotolerans, W. ceti, W. uvarum, W. minor, and W. viridescens.

Carbohydrate Metabolism Comparative Analysis
A comparative analysis of CAZy families present in the genomes of Weissella spp. is shown in Figure 3. The largest families are represented by the glycosyl transferases (GTs), which catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds, followed by the glycosyl hydrolases (GHs). Polysaccharide lyases (PLs) and carbohydrate esterase (CE) are poorly represented, with CE9 and CE1 being the most frequently occurring. The highest number of CAZymes, with 61 annotated proteins, was identified in the genome of the W. cibaria type strain, while the lowest number was found in the W. ceti type strain, for which only 13 CAZymes could be identified. The W. bombi and W. cibaria type strains have the highest numbers of GHs (28), while the W. muntiaci and W. cibaria type strains have the highest number of GTs (29). Figure 3 also shows the clustering of Weissella species obtained by analyzing the occurrence and distribution of CAZymes among species.
A detailed analysis of the enzymes involved in the carbohydrate metabolism of the six strains sequenced in this study is reported in Table 2 and Supplementary Table S4. Regarding the hydrolytic enzymes, 13 GH proteins were annotated in the W. ghanensis type strain, while 12 each were found in the W. beninensis and W. diestrammenae type strains. One FAD-binding protein with an auxiliary activity family 4 domain, which catalyzes the conversion of a wide range of phenolic compounds, was identified in the W. ghanensis type strain. Polysaccharide lyase enzymes were retrieved in the genomic sequences of the W. ghanensis, W. fabaria, and W. fabalis type strains. One carbohydrate-binding module (CBM34) was retrieved in the genomic sequence of the alpha-glycosidase (GH13_20) of the W. diestrammenae type strain. All strains harbor one N-acetylglucosamine 6-phosphate deacetylase belonging to carbohydrate esterase family 9 (CE9), while in the W. diestrammenae type strain, two additional CEs were annotated. Glycosyl transferases were the class with the highest count, with
A comparative analysis of carbohydrate metabolism pathways is depicted in Figure 4. The analysis shows that the highest count of enzymes within these pathways was retrieved in the categories "amino sugar and nucleotide sugar metabolism," "pyruvate metabolism," "glycolysis and gluconeogenesis," and "starch and glucose metabolism." The W. cryptocerci type strain has the highest enzyme count (172 in total), while the lowest occurred in the W. kandleri type strain (111).
A comparative analysis of the SEED subsystems is shown in Figure 5. The highest number of features was determined in the subsystem "protein metabolism," followed by "carbohydrates" and "nucleosides and nucleotides." The W. cibaria, W. cryptocerci, and W. jogaejeotgali type strains showed the highest total feature counts.

Substrate Consumption by Weissella Strains
Carbon source consumption was evaluated using Biolog AN plates, and the results are presented in Table 3. All Weissella strains used N-acetyl-D-glucosamine, D-fructose, α-D-glucose, D-mannose, palatinose, turanose, and maltotriose. Furthermore, according to the comparative analysis of carbohydrate metabolism pathways and the cluster analysis based on carbon source oxidation (Supplementary Figure S1), the cluster including the W. beninensis, W. fabalis, W. fabaria, and W. ghanensis type strains was characterized by the consumption of glycyl-L-methionine, α-ketobutyric acid, and pyruvic acid.
Only W. fabalis and W. fabaria consumed i-erythritol, D-trehalose, and D-malic acid (Table 3). Moreover, formic acid was used only by the W. fabaria type strain, while α-cyclodextrin, fumaric acid, and glycyl-L-glutamine were utilized by the W. fabalis type strain. Among all the strains tested, the W. beninensis type strain showed the widest consumption of carbon sources.

DISCUSSION
Based on the results of ANI, AAI, 16S rRNA gene sequence identity, and 16S rRNA gene- and genome-based phylogeny, all Weissella type strains investigated could be confirmed as representing individual and separate species. Based on our ANI/AAI results, we confirmed that W. jogaejeotgali, first described by Lee et al. (2015) as a novel species, could be classified as a later heterotypic synonym of W. thailandensis. Furthermore, we also confirmed that W. kimchii, described as a novel species by Choi et al. (2002), is a later heterotypic synonym of W. cibaria. Our data partially validate the taxonomic clusterings previously proposed by De Bruyne et al. (2010) and Fusco et al. (2015), which were essentially based on the 16S rRNA gene and pheS gene sequence phylogenies, with the latter showing a higher discriminatory power as a marker gene. The description of novel species (Lee et al., 2015; Praet et al., 2015; Heo et al., 2019; Li et al., 2020; Lin et al., 2020; Hyun et al., 2021; this study) and the availability of additional genomic sequences (refer to Supplementary Table S1) allowed us to perform a comprehensive genome-based phylogenetic analysis of all species in the genus Weissella. Our analysis also integrates genomic indexes, which are currently considered minimal standards for taxonomic classification. Therefore, based on our phylogenomic reconstruction, we were able to define six distinct species groups within the genus Weissella, i.e., the W. beninensis, W. kandleri, W. confusa, W. halotolerans, W. oryzae, and W. paramesenteroides species groups.
The first divergent line within the genus is constituted by the W. beninensis group, which includes W. beninensis, W. cryptocerci, W. fabalis, W. fabaria, and W. ghanensis. In this case, the clustering recently updated by Fusco et al. (2015) is confirmed by the genome-based phylogeny and by genomic analysis: the average ANI value among these species is 81.2%, with W. ghanensis and W. fabaria being the most closely related, with an ANI value of 85.63%. Species within this group share 96% of 16S rRNA gene sequence identity; the percentage rises to 99.5% among W. fabaria, W. fabalis, and W. ghanensis. The second species group, designated as the W. kandleri group, comprises W. kandleri, W. diestrammenae, W. koreensis, W. soli, and W. coleopterorum. The latter was described in 2021 (Hyun et al., 2021) and was therefore not represented in the previous classification. The 16S rRNA gene sequence identity among these species is 97.1% on average, while the ANI value is 82%, with W. coleopterorum and W. soli being the most closely related, sharing an ANI value of 83.91%. In contrast to the results previously presented by Fusco et al. (2015), W. oryzae was not included in this group, as the genomic analysis performed in this study placed it in a separate branch together with W. muntiaci. These two species share 99.32% 16S rRNA gene sequence identity, 80.15% ANI, and 75.64% AAI, and can be assigned to a third species group, namely the W. oryzae species group. The fourth species group, namely the W. confusa species group, comprises W. confusa and W. cibaria, confirming the clustering previously proposed by Fusco et al. (2015). Based on the results of this study, we added the recently described W. sagaensis (Li et al., 2020) and W. bombi (Praet et al., 2015) to the fifth species group, designated by Fusco et al. (2015) as W. paramesenteroides, which also includes W. paramesenteroides, W. hellenica, and W. thailandensis.
The average ANI value shared among these species is 82.4%, while 16S rRNA gene sequence identity is 98%. Notably, the W. sagaensis and W. hellenica 16S rRNA gene sequences are identical, while their shared ANI value is 89.7%, thus confirming the correct designation of W. sagaensis as a novel species. In accordance with Fusco et al. (2015), the sixth species group is the W. halotolerans group, which includes W. halotolerans, W. ceti, W. uvarum, W. minor, and W. viridescens. Within these species, the 16S rRNA gene sequence identity reaches 95.1%, with the maximum value of 99.14% occurring between W. minor and W. uvarum, while the shared ANI value for this group is 79.06%.
FIGURE 5 | Comparative analysis of SEED subsystem features in Weissella species. Genes annotated by RAST were assigned to functional categories and grouped into subsystems. Colored bars indicate the number of genes assigned to each category.
According to CAZy and KEGG pathway analyses, species within this genus harbor a comprehensive carbohydrate utilization system, including sugar uptake, transporter, and metabolism-related genes, which confers on them strong carbohydrate utilization capabilities, as shown by their different carbohydrate utilization profiles. The carbohydrate metabolism pathway clustering follows the phylogenomic species group clustering in many cases: this occurred for W. beninensis, W. fabalis, W. fabaria, and W. ghanensis; for W. minor and W. uvarum; for W. ceti and W. halotolerans; for W. bombi and W. sagaensis; and for W. thailandensis, W. paramesenteroides, and W. jogaejeotgali.
The W. fabalis, W. ghanensis, and W. uvarum type strains were found to harbor the genetic determinants for the synthesis of oligo-1,6-glucosidases, which hydrolyze the α-1,6 linkage in starch, glycogen, and the derived oligosaccharides to produce sugars with an α-configuration. The W. diestrammenae type strain was found to harbor a gene coding for a maltogenic amylase, an enzyme that favors the degradation of starch into maltose. Therefore, these Weissellas may play an important role in the fermentation of sourdoughs due to the metabolism of starch and amylopectin, which are composed of an α-1,4 glucose main chain and α-1,6 glucose side chains (van der Maarel et al., 2002; Wang et al., 2021).
All strains were able to use D-mannose. This sugar is a dextrorotatory aldohexose found in certain bacteria, fungi, and plants, and is rarely present in nature as a free monosaccharide. Nevertheless, it is a constituent of numerous simple and complex polysaccharides and is mostly found in nature as a component of mannan, hemicellulose, and cellulose in dietary fiber (Hu et al., 2016). For example, it constitutes the basic molecule of mannans, the reserve polysaccharides of some plant species (e.g., palm), or is associated with galactose (mannogalactans) to form gummy mucilages that protect the seeds of some plants (e.g., carob); these are widely used as stabilizers of food products such as ice cream and mayonnaise. The W. diestrammenae type strain was isolated from the gut of a camel cricket. The remaining 5 type strains sequenced in this study originated from vegetables such as cassava, cocoa, and grapes, all sources containing mannose and/or its (oligo)polymers. Due to the described enzymatic activities, these Weissellas could be employed as starters for the fermentation of such mannose- and oligomannose-containing vegetables. Their ability to metabolize mannose is corroborated by the fact that in every sequenced genome we retrieved (i) the manXa transporter, which can transfer phosphorus-containing groups to D-mannose, and (ii) the mannose-6-phosphate isomerase [EC:5.3.1.8] for the conversion of D-mannose-6-phosphate to β-D-fructose 6-phosphate. The mannose transportation operon manXYZ (Jeckelmann and Erni, 2020) was identified in all the sequenced strains. Except for W. diestrammenae, all the strains possess an scrK gene, coding for a fructokinase. It catalyzes the transfer of phosphate to D-fructose, which is then converted to α-D-glucose 6-phosphate by the glucose-6-phosphate isomerase. In the W. diestrammenae type strain, fructose utilization might proceed through its conversion to α-D-glucose, catalyzed by the xylose isomerase, which is harbored only by this strain (xylA gene, KAR27_02655). In the W. uvarum and W. fabaria type strains, the fruAB gene, coding for the D-fructose phosphotransferase, was identified. Only in the W. uvarum type strain can D-fructose enter glycolysis through the subsequent action of the fruK fructokinase and the triosephosphate isomerase, which converts the β-D-fructose 1,6-bisphosphate to glyceraldehyde 3-phosphate.
According to the Biolog data, W. beninensis is the only species able to metabolize D-galactose and raffinose. In fact, the galA gene was detected only in the genomic sequence of the type strain of this species, in addition to two copies of the β-galactosidase (KAK10_01750 and KAK10_05755). The galA codes for the α-galactosidase [EC:3.2.1.22] (KAK10_01740), the exoglycosidase that hydrolyzes α-1,6 galactoside linkages found in sugars, such as raffinose, melibiose, and stachyose, and branched polysaccharides such as galactomannans and galacto-glucomannans. The ability to metabolize nondigestible sugars such as oligosaccharides of the raffinose family and galactomannans is common for several probiotic bacteria (Zartl et al., 2018). The α-galactosidase enzyme is well explored in the food industry, where it is used for removing raffinose family oligosaccharides (RFOs) in soymilk and sugar crystallization processes, and for the improvement of animal feed quality and biomass processing (Bhatia et al., 2020).
Only W. beninensis can consume D-sucrose. This is due to the presence of (i) a PTS transporter subunit EIIC (KAK10_07455), which transports sucrose into the cell, converting it to sucrose-6-phosphate; (ii) a sucrose-6-phosphate hydrolase (KAK10_07450), which catalyzes the formation of fructose and glucose-6-phosphate; and (iii) a sucrose phosphorylase (KAK10_07625), which reconverts the sucrose-6-phosphate into sucrose. Moreover, a preliminary evaluation of EPS production (data not shown) indicates that W. beninensis can produce ropiness when incubated in MRS supplemented with 20 g/L of sucrose. W. beninensis was isolated from spontaneous fermentation of cassava, and it was previously characterized for acid production from D-fructose, D-galactose, D-glucose, lactose, maltose, D-mannose, melibiose, D-raffinose, sucrose, N-acetylglucosamine, and D-mannitol (Padonou et al., 2010). Cassava usually contains a low amount of free sugar, which differs according to the variety and the tissue age in the storage root (Junior and Campos, 2004). Its spontaneous fermentation is carried out by different species of yeasts and lactic acid bacteria (Padonou et al., 2009) and activates mutually stimulating interactions, as widely characterized in other products such as sourdough and kefir (De Vuyst and Neysens, 2005; Tofalo et al., 2020). The ability of W. beninensis to use numerous carbon sources may indicate its adaptation to this complex microbial and metabolic environment.
Trehalose, or α-D-glucopyranosyl-1,1-α-D-glucopyranoside, is a disaccharide made up of two α-D-glucose molecules joined by an α-1,1 glycosidic bond. It naturally occurs in bacteria, fungi, yeasts, algae, plants, and invertebrates, including insects, but it is absent in vertebrates. Its major dietary sources are mushrooms; in fact, it is contained in most edible fungi and is an important part of reconstituting dried shiitake mushrooms. For this reason, it is also referred to as mushroom sugar, while it is called seaweed sugar in China since trehalose is contained in marine plants such as "hijiki" seaweed. Weissellas that can metabolize this sugar may play a pivotal role in the fermentation and digestion of the above-mentioned (novel) foods. To be metabolized, trehalose enters the cells through the action of the trehalose PTS permease (EC: 2.7.1.201). Subsequently, the α,α-phosphotrehalase catalyzes the hydrolysis of α,α-trehalose 6-phosphate into D-glucose and D-glucose 6-phosphate. This enzyme was annotated in the genomic sequences of the W. fabalis, W. fabaria, and W. ghanensis type strains within a genomic cluster whose organization differs among these species (Figure 6); in the W. ghanensis type strain, the corresponding genes were annotated in two separate contigs. All 3 species share the same ecological niche of origin (the W. ghanensis type strain was isolated from Ghanaian cocoa fermentation; the W. fabaria type strain was isolated from fermented cocoa bean heaps; and the W. fabalis type strain was isolated from spontaneous cocoa bean fermentation).
Common to all 3, within the cluster, there are (1) a gene coding for a hypothetical protein with a GH (family 13) catalytic domain, (2) the treR gene (coding for the trehalose operon repressor), (3) the treC gene (coding for the α,α-phosphotrehalase), (4) the genes coding for TreB (the trehalose transporter with the two phosphotransferase system (PTS) subunits EIIC and IIA), (5) one flippase gene, (6) one gene coding for an EpsG family protein, (7) one for a sugar transferase, and (8) several genes coding for glycosyl transferases. In the W. fabaria type strain, this locus also comprises a sequence of 14 kbp downstream of treR, in which the rhamnose operon rbfABCD is annotated together with genes coding for (1) a dTMP kinase, (2) a LicD family protein, (3) a DUF1972 domain-containing protein, (4) a CpsD/CapB family tyrosine-protein kinase, (5) a capsular biosynthesis protein, (6) an LCP family protein, and (7) a tyrosine phosphatase. The trehalose operon has not yet been described in Weissella species. In addition to the treR and treB genes, the Lactococcus lactis operon includes the phosphomannomutase gene femB, the trehalose/maltose hydrolase gene trePP, and the β-phosphoglucomutase gene pgmB (Andersson et al., 2005) (Figure 6). The pgmB gene was instead annotated in W. diestrammenae (KAR27_07555), within a locus comprising genes coding for the transcriptional regulator LacI, an α-glycosidase, a glycoside hydrolase, a galactose mutarotase, and sugar transporters.
FIGURE 6 | Trehalose operon in Weissella species. Genomic organization of the putative trehalose operon in the Weissella species. Gene clustering is represented by the arrows superimposed on the black horizontal line. Genes and intergenic spaces are not drawn to scale. Lactococcus lactis subsp. lactis IL1403 (GenBank accession no. NC_002662): glmM, phosphoglucosamine mutase; treR, trehalose operon repressor; IIA, PTS glucose transporter subunit IIA; EIIc, PTS transporter subunit EIIC; pgmB, beta-phosphoglucomutase; α/βH, alpha/beta hydrolase; gH, glycosyl-hydrolase; polysaccharide biosynthesis C-terminal domain-containing protein; flip, flippase; gT, glycosyltransferase; ppT, polysaccharide pyruvyl transferase family protein; EpsG, EpsG family protein; sT, sugar transferase; rbfABCD, rhamnose operon; dTMPk, dTMP kinase; LicD, LicD family protein.
The maltose operon was first described in L. lactis by Andersson and Rådström (2002). Maltose is transported by an ATP-dependent permease system and is then degraded by the concerted action of a Pi-dependent maltose phosphorylase (MP) and the β-phosphoglucomutase (β-PGM), whose genes are located in two different operons. In the Weissella species analyzed in this study, pgmB occurred in the W. diestrammenae (KAR27_07555), W. beninensis (KAK10_00105), and W. uvarum (KAR63_02130) type strains. The mapA gene, which codes for the maltose phosphorylase [EC:2.4.1.8] that catalyzes the reaction between maltose and phosphate to form D-glucose and β-D-glucose-1-phosphate, was found in the W. uvarum (KAR63_02125), W. beninensis (KAK10_00115), and W. diestrammenae (KAR27_07545) type strains. In contrast to L. lactis, the malP and pgmB genes are located close to each other in the genomic sequences of the species in which they were annotated. In the W. uvarum type strain, this operon also contains genes coding for the glucose transport system, formed by the PTS transporter subunit EIIC and the PTS glucose transporter subunit IIA, and for a trehalose operon repressor. However, the sequences surrounding this genomic area differ from those of the trehalose operon described above in the other Weissella species, indicating a divergent evolutionary pattern. In the pgmB genomic locus of the W. diestrammenae type strain, there are also genes coding for a galactose mutarotase, a glycosyl hydrolase, the α-glycosidase (KAR27_07540), an MFS transporter, a LacI family DNA-binding transcriptional regulator, an ECF transporter, and an ABC transporter substrate-binding protein, as well as a permease gene. The pgmB gene codes for a β-phosphoglucomutase, which catalyzes the interconversion of D-glucose 1-phosphate (G1P) and D-glucose-6-phosphate (G6P), forming β-D-glucose 1,6-(bis)phosphate (β-G16P) as an intermediate. This is a catabolic step in both the maltose and trehalose pathways. Although pgmB has not yet been characterized in the Weissella species, this gene was reported as essential in the trehalose pathway of L. lactis (Andersson et al., 2001). The occurrence of this gene in the W. diestrammenae type strain, which, according to the carbon source consumption analysis, is not able to metabolize trehalose, suggests its involvement in maltose metabolism. This sugar is indeed consumed by this strain, most likely through the consecutive reactions catalyzed by the maltose phosphorylase MapA and PgmB. In the W. beninensis type strain, the putative maltose operon was annotated in a very short contig; therefore, it was not possible to reconstruct the surrounding organization of the putative operon, although genes coding for a galactose mutarotase, a glycosyl hydrolase, one MFS transporter, and two LacI family DNA-binding transcriptional regulators are annotated there. In the W. fabalis, W. uvarum, and W. ghanensis type strains, the ability to hydrolyze maltose can also be associated with the presence of the maltase-glucoamylase (EC: 3.2.1.20), which catalyzes maltose hydrolysis to produce two molecules of D-glucose. Furthermore, α-glucosidase genes were retrieved in the W. fabalis, W. ghanensis, and W. uvarum type strains. The substrates for this hydrolytic enzyme are maltooligosaccharides, phenyl α-maltoside, nigerose, soluble starch, amylose, amylopectin, and β-limit dextrins (Tomasik and Horton, 2012). 
In a recent study by Wangpaiboon et al. (2021), this enzyme was characterized in W. cibaria as acting on short-chain maltooligosaccharides. Genes encoding 6-phospho-β-glucosidase enzymes were annotated in the W. diestrammenae, W. fabalis, W. fabaria, and W. ghanensis type strains. This enzyme is involved in the hydrolysis of phosphorylated disaccharides and usually does not have hydrolytic activity toward nonphosphorylated substrates. The resulting glucose and glucose 6-phosphate are further metabolized by the glycolytic pathway. β-D-glucosidase activity is widespread among lactic acid bacteria and has been suggested to play a role in the interaction with the human host. Furthermore, it is relevant for food fermentation processes and is involved in the release of several plant secondary metabolites from their β-D-glucosylated precursors (Michlmayr and Kneifel, 2014).
In W. beninensis, only one GH1 gene was annotated; it lies in the putative ribose operon, which also comprises genes for sugar transporters and the rbsC and rbsD genes, coding for the ribose permease and the D-ribose pyranase, respectively.
Pectin lyases were identified in the genomic sequences of the W. fabalis, W. fabaria, and W. ghanensis type strains. These enzymes catalyze pectin degradation via eliminative cleavage of the α-(1,4) glycosidic linkages in homogalacturonan. The presence of these enzymes suggests that these species can degrade pectin, a capability that could be exploited in the food industry; pectin lyases are, in fact, usually employed in wine and juice prepress maceration, as well as in juice clarification methods (Mantovani et al., 2005; Kassara et al., 2019).
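The observation made above that the maltose phosphorylase and pgmB genes lie close to each other can be roughly checked from the locus tags quoted in the text. The following is a minimal sketch, not the method used in this study: it assumes that locus tags are numbered in steps of 5 along the annotated sequence (a common convention in NCBI PGAP annotations, not stated in the text), and the helper names `tag_number` and `genes_between` are hypothetical.

```python
import re

# Locus tags quoted in the text for the maltose phosphorylase (mapA)
# and beta-phosphoglucomutase (pgmB) genes of three type strains.
LOCI = {
    "W. uvarum": {"mapA": "KAR63_02125", "pgmB": "KAR63_02130"},
    "W. beninensis": {"mapA": "KAK10_00115", "pgmB": "KAK10_00105"},
    "W. diestrammenae": {"mapA": "KAR27_07545", "pgmB": "KAR27_07555"},
}

# ASSUMPTION: consecutive annotated features differ by 5 in their
# locus tag number; this is an illustrative convention, not a fact
# taken from the article.
TAG_STEP = 5


def tag_number(locus_tag: str) -> int:
    """Return the numeric suffix of a locus tag such as 'KAR63_02125'."""
    match = re.fullmatch(r"[A-Za-z0-9]+_(\d+)", locus_tag)
    if match is None:
        raise ValueError(f"unexpected locus tag format: {locus_tag!r}")
    return int(match.group(1))


def genes_between(tag_a: str, tag_b: str, step: int = TAG_STEP) -> int:
    """Estimate the number of annotated features between two locus tags."""
    if tag_a.split("_")[0] != tag_b.split("_")[0]:
        raise ValueError("locus tags come from different genomes")
    distance = abs(tag_number(tag_a) - tag_number(tag_b))
    return max(distance // step - 1, 0)


if __name__ == "__main__":
    for species, genes in LOCI.items():
        n = genes_between(genes["mapA"], genes["pgmB"])
        print(f"{species}: {n} annotated feature(s) between mapA and pgmB")
```

Under this assumption, the two genes would be direct neighbors in W. uvarum and separated by a single annotated feature in W. beninensis and W. diestrammenae. Locus tag numbering says nothing about contig boundaries or strand, so such an estimate only complements inspection of the annotated locus itself.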

CONCLUSION
In this study, for the first time, we sequenced and analyzed the genomes of the type strains of six Weissella species discovered in the past decade, whose assemblies were not yet available at the time this study was initiated. We also performed a comprehensive phylogenomic analysis of all the Weissella species described to date and reassessed the phylogenetic structure of the Weissella genus using 16S rRNA gene sequences and genome-based phylogeny. Updating the previous clustering proposed by other authors, we were able to identify six distinct species groups within the genus, namely, the W. beninensis, W. kandleri, W. confusa, W. halotolerans, W. oryzae, and W. paramesenteroides species groups.
Moreover, we investigated the capability of the six type strains to metabolize 95 carbohydrates, demonstrating the strong carbohydrate utilization capabilities of the sequenced strains. This ability was also confirmed by the identification of the genetic determinants of the enzymes involved in carbohydrate metabolism. The genomic and phenotypic analyses provided further knowledge about the ability of the W. beninensis, W. ghanensis, W. fabaria, W. fabalis, W. uvarum, and W. diestrammenae type strains to metabolize certain carbohydrates and allowed the detection of the corresponding genetic determinants. In fact, the studies reporting the discovery of these type strains (De Bruyne et al., 2008; Padonou et al., 2010; Oh et al., 2013; Snauwaert et al., 2013; Nisiotou et al., 2014) provided data about the utilization of a maximum of 19 carbohydrates, against the 95 carbohydrates tested herein using the Biolog system. Furthermore, the permutation analysis of the Biolog data confirmed the interspecific metabolic diversity of the analyzed type strains.
The increasing availability of the genomic sequences of the Weissella species will contribute to improving the knowledge about this genus and identifying the features defining its role in fermentative processes and its biotechnological potential.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
VF conceived the work and interpreted the data. G-SC performed the genomic sequencing. VF and FF organized and performed the bioinformatic work. DC checked the purity of the strains and prepared the working cultures for DNA extraction and phenotypic characterization. MM performed the phenotypic characterization. VF, FF, and MM wrote the manuscript. All authors contributed to the revision of the manuscript, read, and approved the submitted version.

ACKNOWLEDGMENTS
The authors acknowledge Sebastien Santini (CNRS/AMU IGS UMR7256) and the PACA Bioinfo Platform (supported by IBISA) for the availability and management of the phylogeny.fr website used to analyze phylogenetic relationships between 16S rRNA gene sequences. They acknowledge Dr. Anastasia Nisiotou for kindly providing the type strain of W. uvarum.