Molecular Analysis of Bacterial Isolates From Necrotic Wheat Leaf Lesions Caused by Xanthomonas translucens, and Description of Three Putative Novel Species, Sphingomonas albertensis sp. nov., Pseudomonas triticumensis sp. nov. and Pseudomonas foliumensis sp. nov.

Xanthomonas translucens is the etiological agent of wheat bacterial leaf streak (BLS) disease. Isolation of this pathogen is usually based on the Wilbrink's-boric acid-cephalexin semi-selective medium, which eliminates 90% of other bacteria, some of which might be novel species. In our study, a general purpose nutrient agar was used to isolate 49 bacterial strains, including X. translucens, from necrotic wheat leaf tissues. Maximum likelihood cluster analysis of 16S rRNA sequences grouped the strains into 10 distinct genera. Pseudomonas (32.7%) and Pantoea (28.6%) were the dominant genera, while Xanthomonas, Clavibacter and Curtobacterium each accounted for 8.2%. Erwinia and Sphingomonas had two strains each. BLAST and phylogenetic analyses of multilocus sequence analysis (MLSA) of specific housekeeping genes taxonomically assigned all the strains to validly described bacterial species, except three strains (10L4B, 12L4D and 32L3A) of Pseudomonas and two (23L3C and 15L3B) of Sphingomonas. Strains 10L4B and 12L4D had Pseudomonas caspiana as their closest known type strain, while strain 32L3A was closest to Pseudomonas asturiensis. Sphingomonas sp. strains 23L3C and 15L3B were closest to S. faeni based on MLSA. Our data on MLSA, whole genome-based cluster analysis, DNA-DNA hybridization and average nucleotide identity, matrix-assisted laser desorption/ionization-time-of-flight, chemotaxonomy and phenotype affirmed that these five strains constitute three novel lineages, which are taxonomically described in this study. We propose the names Sphingomonas albertensis sp. nov. (type strain 23L3CT = DOAB 1063T = CECT 30248T = LMG 32139T), Pseudomonas triticumensis sp. nov. (type strain 32L3AT = DOAB 1067T = CECT 30249T = LMG 32140T) and Pseudomonas foliumensis sp. nov. (type strain 10L4BT = DOAB 1069T = CECT 30250T = LMG 32142T).
Comparative genomics of these novel species, relative to their closest type strains, revealed unique repertoires of core secretion systems and secondary metabolites/antibiotics. Also, the detection of CRISPR-Cas systems in the genomes of these novel species suggests an acquired mechanism for resistance against foreign mobile genetic elements. The results presented here revealed a cohabitation, within the BLS lesions, of diverse bacterial species, including novel lineages.

Wheat BLS caused by X. translucens pv. undulosa is an important disease worldwide (Duveiller, 1990; Duveiller et al., 1997; Adhikari et al., 2012). In the last decade, the incidence of BLS in wheat has increased significantly in the major wheat-growing regions of the Midwestern United States (McMullen and Adhikari, 2011; Adhikari et al., 2012) and in the wheat belt of western Canada (https://www.grainews.ca/features/detect-and-avoid-bacterial-leaf-streak/#post-50393). The typical symptoms of BLS start with water-soaked streaks that subsequently become translucent necrotic leaf lesions (Smith et al., 1919; Duveiller et al., 1997; Sapkota et al., 2020), and under high disease pressure the entire wheat leaf might be severely affected. Bacterial ooze can be seen on the leaf surface under humid and warm weather conditions. Traditional methods of isolating the pathogen use Wilbrink's-boric acid-cephalexin (WBC) agar, a semi-selective medium, to eliminate other microbes (Duveiller, 1994). Duveiller (1994) reported that WBC eliminated over 90% of "saprophytic" bacteria from washes of wheat and triticale lots, facilitating the identification of the pale yellow X. translucens pv. undulosa. The main goal of the plant pathologist is to isolate the BLS causal agent for downstream studies, such as accurate identification by biochemical, morphological and genomic evaluation, enhancement of the biological collection, and verification of Koch's postulates. In so doing, however, other bacterial species are eliminated, some of which might be novel genotypes.
The epiphytic and endophytic phyllosphere is a habitat for millions of pathogenic and beneficial microorganisms that colonize leaves. Culture-dependent (Yashiro et al., 2011; Anguita-Maeso et al., 2019) and culture-independent methods (Grady et al., 2019; Singh et al., 2019; Regalado et al., 2020) have been used extensively to study the microbial diversity of the healthy phyllosphere. Until now, however, the phylogenetic positions of cultivable bacteria associated with diseased leaf lesions, besides the causal pathogens, have not been well studied.
The objectives of this study were to (i) isolate X. translucens pv. undulosa and other culturable bacteria associated with typical symptoms of BLS on wheat; (ii) characterize the diversity and taxonomic positions of the strains using 16S rRNA, multilocus sequence analysis (MLSA) and genome analysis; (iii) perform in silico analysis of orthologous genes, secretion systems and antimicrobial secondary metabolites of potential novel strains relative to their closest known/validly described bacterial species; and (iv) describe novel bacterial species. 16S rRNA sequence analysis is a reliable taxonomic marker for genus-level classification of bacteria (Gomila et al., 2015; Tambong, 2019), while MLSA, as single individual housekeeping genes or several housekeeping genes concatenated into pseudomolecules, has been routinely used for accurate and consistent phylogenetic resolution of strains at the species level (Konstantinidis and Tiedje, 2007; Mulet et al., 2010; Liu et al., 2012; Tambong et al., 2014; Gomila et al., 2015; Tambong, 2019). In addition, the advent of genome sequencing using next generation technologies and recent advances in bioinformatics tools have revolutionized bacterial classification, providing highly reliable, reproducible and cumulative datasets and databases (Thompson et al., 2013; Parks et al., 2018; Paul et al., 2019). Genome-based parameters (comparative genomics, in silico DNA-DNA hybridization and average nucleotide identity), matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and biochemical characterization were used to validate the MLSA results and the hypothesis that three unique lineages discovered in this study constitute putative novel species within two bacterial genera, Pseudomonas and Sphingomonas.

Sample Collection and Isolation of Bacteria
Wheat flag leaves exhibiting symptoms of bacterial streak were collected from four southern Alberta fields of spring wheat (CWRS-AAC Viewfield). Four leaves per field were surface sterilized for 1 min in 2% sodium hypochlorite and rinsed three times with sterile water. The isolation of the bacteria was done as previously reported (Malette et al., 2020). Briefly, a 0.25 cm² section of symptomatic leaf tissue was aseptically excised, bisecting at least one bacterial lesion, immersed in a 100 µl droplet of sterile water and incubated at room temperature for 5 min. The droplet was examined under a dissecting microscope to confirm bacteria oozing into a suspension. Using a 10-µl loop, the suspensions were streaked onto nutrient agar (NA) and plates were incubated at 25°C and examined at 24 and 48 h for bacterial colonies. Up to four morphologically unique colonies were picked from each plate onto fresh NA plates using a sterile needle and incubated for 48 h. A total of 42 cultures were sent to the Ottawa Molecular Bacteriology Laboratory of Agriculture and Agri-Food Canada for identification. After repeated streaking (Tambong et al., 2017; Tchagang et al., 2018), 49 single colonies were obtained and preserved at −80°C in LB medium with 30% glycerol (v/v) for long-term storage. Gram-reaction of the strains was determined using the 3% KOH assay (Ryu, 1940).

BLAST and Phylogenetic Analyses
The Basic Local Alignment Search Tool (BLAST; Altschul et al., 1990) program was used to query GenBank databases with the partial gene sequences obtained in this study for homology. The individual gene loci were separately aligned in SeqMan Pro ver 15.3.0 (DNASTAR) and the 5′ and 3′ ends of the DNA sequences were trimmed to the shortest sequence. Geneious 11.1.5 (Biomatters Ltd.) was used to concatenate corresponding gene sequences. The individual genes and concatenated pseudogene sequences were independently aligned using MUSCLE v3.8.1551 (Edgar, 2004), and modeltest-ng (Darriba et al., 2020) was used to identify the best-fit models of nucleotide substitution based on the Akaike information criterion (AIC). The best-fit model of each dataset is given in the phylogenetic tree captions. Maximum likelihood phylogenetic tree inference was performed using RAxML-NG (Kozlov et al., 2019) with 1,000 non-parametric bootstrap replicates to assess internal branch robustness (Felsenstein, 1985).
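As an illustration of the model-selection step, the sketch below ranks candidate substitution models by AIC, computed as AIC = 2k − 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood. It is not the modeltest-ng implementation; the function names, log-likelihoods and parameter counts are hypothetical placeholders and are not values from this study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_fit_model(candidates: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the candidate model with the lowest AIC."""
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical (log-likelihood, free-parameter count) pairs, illustration only.
CANDIDATES = {
    "JC": (-12500.0, 0),
    "HKY+G4": (-12100.0, 5),
    "GTR+I+G4": (-12050.0, 10),
}

if __name__ == "__main__":
    for name, (lnl, k) in CANDIDATES.items():
        print(f"{name}: AIC = {aic(lnl, k):.1f}")
    print("best-fit model:", best_fit_model(CANDIDATES))
```

The extra parameters of a richer model are only rewarded when the gain in log-likelihood outweighs the 2k penalty.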

De novo Genome Sequencing and Analysis
Bacterial sample preparations for de novo genome sequencing of 7 representative strains were performed as described previously (Malette et al., 2020). Libraries were constructed using a Nextera DNA Flex Prep Kit (Illumina) following the manufacturer's instructions. The draft genome sequences were determined by paired-end sequencing using Illumina NextSeq technology at the Molecular Technologies Laboratory, Ottawa Research and Development Centre, Ottawa, Canada. The quality of the paired-end reads, each 150 bp in length, was checked using FastQC version 0.11.3 (Andrews, 2010), and reads were trimmed if required. De novo assemblies were performed using ABySS version 1.5.2 (Simpson et al., 2009) or Unicycler v0.4.8 (Wick et al., 2017) as implemented in PATRIC (Wattam et al., 2014), and scaffolds < 300 bp in length were discarded.
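The length filter applied to the assemblies can be sketched as follows. This is a minimal, dependency-free illustration and not the script used in this study; the function names and the toy records are hypothetical, and only the 300 bp cut-off comes from the text.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record name: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequences may span several lines
    return records


def drop_short_scaffolds(records: Dict[str, str], min_len: int = 300) -> Dict[str, str]:
    """Keep scaffolds of at least min_len bp; shorter ones are discarded."""
    return {name: seq for name, seq in records.items() if len(seq) >= min_len}


if __name__ == "__main__":
    # Toy assembly (hypothetical scaffold names and sequences).
    toy = [">s1", "A" * 300, ">s2", "ACGT", ">s3", "A" * 200, "C" * 150]
    kept = drop_short_scaffolds(read_fasta(toy))
    print(sorted(kept))
```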
Closely related type strain genomes were determined via the Type (Strain) Genome Server (TyGS), which implements the MASH algorithm (Ondov et al., 2016), and corresponded to the closest bacterial species determined by MLSA. The Genome BLAST Distance Phylogeny (GBDP) approach was used, as previously described, to compute precise intergenomic distances, and genome-based phylogenetic relationships were inferred under the 'trimming' algorithm as recommended previously (Meier-Kolthoff et al., 2013). The resulting intergenomic distances were used to infer balanced minimum evolution trees with FastME 2.1.4, including SPR postprocessing (Lefort et al., 2015), with branch support from 100 pseudo-bootstrap replicates. Trees were rooted at the midpoint (Farris, 1972). Genome-based DNA-DNA hybridization (gDDH) values were calculated using the default parameters of GGDC 2.1 (Meier-Kolthoff et al., 2013) and compared against the recommended species-level cut-off threshold of 70%. Average nucleotide identity (ANI) values were computed with default parameters using the FastANI tool (Jain et al., 2018) to assess the taxonomic position of strains relative to the closest taxa at a species delineation cut-off threshold of 96%. FastANI splits a given query genome into non-overlapping fragments of a fixed size, which are then mapped to the reference genome using Mashmap (Jain et al., 2018).
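The fragment-and-map idea behind an ANI estimate can be sketched as below. This is a simplified illustration, not FastANI's code: the mapping step (Mashmap) is omitted, the function names are hypothetical, and the per-fragment identities passed to mean_identity are assumed to come from an external mapper. With step left unset, the step equals the fragment size and the fragments do not overlap.

```python
from typing import List, Optional


def fragment_genome(seq: str, size: int, step: Optional[int] = None) -> List[str]:
    """Split seq into full-length fragments of `size` bp.

    step defaults to size (non-overlapping fragments); a smaller step
    yields overlapping fragments. A trailing partial fragment is dropped.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    step = size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def mean_identity(identities: List[float]) -> float:
    """Average percent identity over the fragments that were mapped."""
    if not identities:
        raise ValueError("no mapped fragments")
    return sum(identities) / len(identities)


if __name__ == "__main__":
    print(fragment_genome("ACGTACGTAC", size=4))          # non-overlapping
    print(fragment_genome("ACGTACGTAC", size=4, step=2))  # overlapping
    print(mean_identity([98.0, 99.0, 100.0]))             # toy identities
```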
For gene content analysis, the assembled genome sequences were annotated using the NCBI Prokaryotic Genome Automatic Annotation Pipeline (PGAAP) and the output was used as input data. Identification and comparison of orthologs and orthologous gene network analysis were done using OrthoVenn2. The detection of bacterial secretion systems and cas3-typing (CRISPR-associated proteins) were done using, respectively, the TXSScan and cas_finder modules of MacSyFinder (Abby et al., 2014). In silico searches for secondary metabolites and antibiotic profiling were done against the antiSMASH database, bacterial version 4 (Blin et al., 2019).

MALDI-TOF MS, Fatty Acid Analysis and Phenotypic Fingerprinting
Matrix-assisted laser desorption/ionization-time-of-flight MS was used to profile whole cells according to the ethanol/formic acid extraction protocol recommended by Bruker Daltonics. Pseudomonas strains were cultured on tryptic soy agar and incubated at 28°C for 24 h, while Sphingomonas strains were grown at 20°C for 72 h on trypticase soy agar. Strains were grown, processed and profiled in duplicate. Spectral measurements, over a mass range of 2,000-20,000 m/z, were recorded using the Microflex L20 instrument (Bruker Daltonics, Germany) and data were analyzed using FlexAnalysis and BioTyper 3.1 Explorer (Bruker Daltonics). The spectra were obtained in linear, positive ion mode according to the manufacturer's recommended settings (Bruker Daltonik, Bremen, Germany). Each final spectrum resulted from the sum of the spectra generated at random positions, to a maximum of 240 shots per spectrum. Spectra were visualized and pairwise superimposed using mMass version 5.5.0 (Niedermeyer and Strohalm, 2012). Where required, log score values were generated from the main spectra, and clustering and principal component analysis (PCA) were done as previously described (Maier et al., 2006).
Whole cell fatty acids of the Pseudomonas strains were analyzed as previously reported (Tambong et al., 2017), while those of Sphingomonas sp. nov. strain 23L3C were analyzed as reported by Busse et al. (2003). The extraction and analysis of fatty acid methyl esters were done according to the protocol recommended by MIDI (Sherlock Microbial Identification System, version 6.2) (Sasser, 1990). Profiles were generated on an Agilent 7890B gas chromatograph and peaks were automatically identified using the MIDI TSBA 6 database.
Standard biochemical and bacteriological characterizations were performed on the potential novel strains as previously described (Tambong et al., 2017) using the GEN III MicroPlate™ (Biolog) to analyze 71 carbon sources and 23 chemical sensitivity assays, including pH, salt tolerance and antibiotics, as recommended by the manufacturer. In addition, API ZYM galleries (bioMérieux) were used to study the strains according to the manufacturer's protocol. The catalase assay was performed as previously reported (Tambong et al., 2017) and oxidase activity was tested using strips (Millipore-Sigma). Cell morphology was investigated using scanning (SEM) and transmission (TEM) electron microscopy as previously reported (Tambong et al., 2017). SEM images were captured using a Hitachi SU7000 (Hitachi, Tokyo, Japan) field emission scanning electron microscope, as recommended by the manufacturer. Biolog and API ZYM assays were performed in duplicate with similar results.

Genus-Level Identification Based on 16S rRNA
Thirty-nine (80%) of the 49 strains were Gram-negative bacteria while 10 strains were Gram-positive, suggesting that the former group is more abundant in infected wheat leaves showing typical symptoms of the bacterial leaf streak caused by X. translucens (Figure 1). All of the Gram-negative strains belong to the phylum Proteobacteria, while the 10 Gram-positive strains belong to the phylum Actinobacteria. The 49 strains were taxonomically affiliated to seven bacterial families. The families Erwiniaceae and Pseudomonadaceae encompassed 65% of the strains (Figure 1). The family Microbacteriaceae (9 out of 10 strains) was the more abundant of the two families within the Gram-positive group.
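The proportions quoted for the 49 isolates can be checked with a few lines of arithmetic. In the sketch below, the counts for Pantoea, Erwinia, Pseudomonas, Sphingomonas and the three single-strain genera are taken from the text; the count of four strains each for Xanthomonas, Clavibacter and Curtobacterium is an assumption inferred from the reported 8.2% of 49 strains. The helper name is hypothetical.

```python
# Strain counts per genus; values of 4 are inferred from the reported 8.2%.
GENUS_COUNTS = {
    "Pseudomonas": 16, "Pantoea": 14, "Xanthomonas": 4, "Clavibacter": 4,
    "Curtobacterium": 4, "Erwinia": 2, "Sphingomonas": 2, "Delftia": 1,
    "Sanguibacter": 1, "Pseudoclavibacter": 1,
}

GRAM_POSITIVE = {"Clavibacter", "Curtobacterium", "Sanguibacter", "Pseudoclavibacter"}


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)


if __name__ == "__main__":
    total = sum(GENUS_COUNTS.values())
    for genus, n in GENUS_COUNTS.items():
        print(f"{genus}: {n} strains ({percent(n, total)}%)")
    gram_neg = sum(n for g, n in GENUS_COUNTS.items() if g not in GRAM_POSITIVE)
    print("Gram-negative:", gram_neg, f"({percent(gram_neg, total)}%)")
    # Erwiniaceae = Pantoea + Erwinia; Pseudomonadaceae = Pseudomonas
    two_families = sum(GENUS_COUNTS[g] for g in ("Pantoea", "Erwinia", "Pseudomonas"))
    print("Erwiniaceae + Pseudomonadaceae:", f"{percent(two_families, total)}%")
```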
Forty-nine 16S rRNA sequences, 1200-1400 nt in length, were grouped phylogenetically by the maximum likelihood algorithm into 10 unique clusters constituting different bacterial genera (Figure 1). All clusters had at least two strains, with the exception of clusters V, VII and VIII, which had only one strain each (Figure 1). Comparative 16S rRNA sequence analysis using BLAST and publicly available GenBank databases taxonomically classified the strains of the 10 clusters into 10 valid bacterial genera. Fourteen of the 16 strains belonging to the family Erwiniaceae were classified as Pantoea (cluster I) while two strains were assigned to the genus Erwinia (cluster II) (Figure 1 and Supplementary Table 1). Strains of cluster III (Figure 1) belonged to the genus Pseudomonas and exhibited 16S rRNA sequence similarity values in the range of 98.9 to 99.9% (Supplementary Table 1) with type strains of species with validly published names, e.g., Pseudomonas caspiana FBF102 T (99.6% with strains 10L4B and 12L4D), Pseudomonas congelans DSM 14939 T (99.8% with strain 27L4A), and Pseudomonas asturiensis LPPA 221 T (99.5% with strain [...]
FIGURE 1 | 16S rRNA tree based on the maximum likelihood algorithm grouped the strains into 10 clusters corresponding to distinct and validly described bacterial genera. Based on Akaike's information criterion, the GTR + I + G4 substitution model was used. Bootstrap values >50% are shown at the branching points.
[...] Table 1 to Curtobacterium pusillum DSM 20527 T were taxonomically placed in cluster X (Figure 1). Based on 16S rRNA sequence homologies (> 98%), only one strain was identified for each of clusters V, VII and VIII (Figure 1), and the respective strains were classified as Delftia (Comamonadaceae), Sanguibacter (Sanguibacteriaceae) and Pseudoclavibacter (Microbacteriaceae) (Supplementary Table 1). These 3 clusters, with only one strain each, were not analyzed beyond 16S rRNA.
Species-level identification of Pseudomonas strains was achieved by BLAST analysis of rpoD, gyrB, and rpoB as previously indicated (Tambong et al., 2017) at a cut-off threshold of 97%. The 16 Pseudomonas strains were matched to 8 closest validly described species, with sequence similarities in the range of 95.0 to 100%, 94.0 to 99.0%, and 93.0 to 100% for rpoD, rpoB and gyrB, respectively (Table 1). Based on this scheme, 13 Pseudomonas strains were assigned to 6 validly described species (Table 1). For example, five strains (2L1B, 8aL3C, 18L2B, 27L4A, and 34L2A), with sequence homology values (98.1-98.5%) higher than the species cut-off threshold, were taxonomically assigned to Pseudomonas congelans (Table 1). Strain 32L3A, however, had its highest gene sequence homologies of 95.9% (rpoD), 93.2% (rpoB), and 96.6% (gyrB) with Pseudomonas asturiensis LPPA 221 T (Table 1), with all three genes showing percent homology values below the species cut-off threshold of 97% (Gomila et al., 2015; Mulet et al., 2010). Also, two strains (10L4B and 12L4D) had low gyrB, rpoB and rpoD sequence homology values of 95.1, 93.0, and 93.0%, respectively, with Pseudomonas caspiana FBF102 T, suggesting a potential novel genotype (Table 1). The two Sphingomonas strains recorded percent similarity values of 96.4, 96.4 and 92.6% for fusA, rpoB, and atpD genes, respectively, with strain MA-olki T, the type strain of Sphingomonas faeni. These low homology levels (<97%) suggest that these two strains (15L3B and 23L3C) could represent a putative novel species within the genus Sphingomonas.
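The 97% screening rule can be written down as a short check. The sketch below uses the per-gene similarity values quoted above for the putative novel strains; the function names and the "all genes below the cut-off" flagging rule are illustrative assumptions, and the entry labelled "hypothetical" is invented for contrast.

```python
from typing import Dict, List

MLSA_CUTOFF = 97.0  # percent similarity, species-level cut-off quoted in the text

# Percent similarity to the closest type strain, per housekeeping gene.
SIMILARITY = {
    "32L3A": {"rpoD": 95.9, "rpoB": 93.2, "gyrB": 96.6},
    "10L4B/12L4D": {"gyrB": 95.1, "rpoB": 93.0, "rpoD": 93.0},
    "15L3B/23L3C": {"fusA": 96.4, "rpoB": 96.4, "atpD": 92.6},
    "hypothetical": {"rpoD": 98.3, "rpoB": 98.1, "gyrB": 98.5},  # invented
}


def genes_below_cutoff(sim: Dict[str, float], cutoff: float = MLSA_CUTOFF) -> List[str]:
    """Genes whose similarity to the closest type strain is under the cut-off."""
    return sorted(gene for gene, value in sim.items() if value < cutoff)


def is_candidate_novel(sim: Dict[str, float], cutoff: float = MLSA_CUTOFF) -> bool:
    """Flag a strain when every analysed gene falls below the cut-off."""
    return all(value < cutoff for value in sim.values())


if __name__ == "__main__":
    for strain, sim in SIMILARITY.items():
        print(strain, genes_below_cutoff(sim), is_candidate_novel(sim))
```

A flag from this screen is only a prompt for further work; in this study it was followed by phylogenetic, genomic and phenotypic analyses.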

Species-Level Identification Based on BLAST and Phylogenetics Analyses of Housekeeping Genes
Phylogenetic analyses of individual and concatenated MLSA genes were used to further characterize the taxonomic positions of these five strains (10L4B, 12L4D, 32L3A, 23L3C, and 15L3B). Single-gene ML evolutionary trees showed these novel strains clustering separately from known validly described species, with 100% bootstrap support (data not shown). The ML trees (Figures 2A-C) derived from concatenated partial gene sequences specific for the respective bacterial genera (Pseudomonas, rpoB-gyrB-rpoD, 8321 nt; and Sphingomonas, atpD-fusA-rpoB, 6997 nt) clustered the strains similarly to the single-gene phylogenies. These analyses reaffirmed the uniqueness of these strains and suggest that they represent three putative novel species within the genera Pseudomonas and Sphingomonas.
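Building a concatenated pseudomolecule from per-locus alignments can be sketched as follows. This is an illustration only (concatenation in this study was done in Geneious); the function name and the toy sequences are hypothetical. Strains missing from any locus are left out, and each locus is checked for equal aligned length.

```python
from typing import Dict, List


def concatenate_loci(loci: Dict[str, Dict[str, str]], order: List[str]) -> Dict[str, str]:
    """Join aligned sequences locus by locus, in the given gene order.

    loci maps gene name -> {strain: aligned sequence}. Only strains present
    in every locus are kept.
    """
    for gene in order:
        lengths = {len(seq) for seq in loci[gene].values()}
        if len(lengths) > 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
    shared = set.intersection(*(set(loci[gene]) for gene in order))
    return {s: "".join(loci[gene][s] for gene in order) for s in sorted(shared)}


if __name__ == "__main__":
    # Toy alignments with hypothetical sequences.
    toy = {
        "rpoB": {"10L4B": "ACGT", "32L3A": "ACGA"},
        "gyrB": {"10L4B": "GG-T", "32L3A": "GGCT", "other": "GGCC"},
        "rpoD": {"10L4B": "TTA", "32L3A": "TTG"},
    }
    print(concatenate_loci(toy, ["rpoB", "gyrB", "rpoD"]))
```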

MALDI-TOF MS Data
Matrix-assisted laser desorption/ionization-time-of-flight MS is an emerging technology for quick and accurate identification of most bacterial strains by analyzing mass spectra of whole cell proteins. This technique was used to confirm that strains 32L3A, 10L4B, and 23L3C represent novel lineages relative to their corresponding closest known type strains. Pairwise comparative analysis of the spectra of the novel strains Pseudomonas sp. 32L3A or Sphingomonas sp. 23L3C and their corresponding closest known type strain revealed 8 or 7 peaks (m/z), indicated by asterisks, that could be used to differentiate these strains from P. asturiensis or S. faeni, respectively (Supplementary Figure 1). For the relationship between strain 10L4B and its closest type strain (P. caspiana), we used cluster analysis to generate a dendrogram and PCA (Figure 3). The novel Pseudomonas sp. strain 10L4B clustered distinctly at a distance level of about 250, with Pseudomonas caspiana CECT 9164 T as the closest neighbor (Figure 3A). Principal component analysis of MALDI-TOF profiles of strain 10L4B confirmed its distinct taxonomic status relative to P. caspiana, P. cichorii, P. congelans and P. syringae (Figure 3B). These MALDI-TOF results are consistent with the MLSA-based data, suggesting that these strains represent novel Pseudomonas and Sphingomonas species.

Genome-Based DNA-DNA Homology and Phylogenomics
To further validate strain identification, we sequenced and analyzed the genomes of 9 representative strains: 6 identified to the species level based on MLSA data and the 3 putative novel lineages. Supplementary Table 2 shows the basic statistics of the whole genome sequences obtained in this study. All the genome sequences were of good quality. Genome size ranged from 4.1 Mb (Sphingomonas sp. 23L3C) to 6.0 Mb (Pseudomonas sp. 10L4B) with a completeness of 94.7 to 100% (Supplementary Table 2). The gDDH and ANI (Jain et al., 2018) analyses, at the respective species cut-off thresholds of 70 and 96%, validated the species-level taxonomic positions of the 6 bacterial strains identified using housekeeping genes (Table 2). For example, the genome sequence of strain 7L3B had the highest gDDH and ANI values (> threshold values of 70 and 96%, respectively) of 91.5 and 99.0%, respectively, with the type strain of Pantoea allii LMG 24248 T (NTMH00000000), while strain 1L1A was confirmed to be a putative Pantoea agglomerans, exhibiting gDDH and ANI values of 88.6 and 98.7%, respectively, with the type strain, DSM 3493 T (FYAZ00000000) (Table 2). Also, the draft genome sequence of strain 17L2C had gDDH and ANI values greater than the respective threshold values with the type strain, Xanthomonas translucens pv. translucens DSM 18974 T (78.0 and 97.7%), and showed 94.9 and 99.7% with a reference strain, Xanthomonas translucens pv. undulosa Xtu 4699. This confirms that this strain belongs to pathovar undulosa, as indicated by the pathogenic reaction on wheat and barley (data not shown). In contrast, the gDDH and ANI values of the 3 putative novel bacterial genotypes were lower than the respective species cut-off threshold levels (Table 2). For example, strain 32L3A had the highest gDDH and ANI values of 58.8 and 80.9%, respectively, with the type strain of Pseudomonas asturiensis LMG 26898 T (FRDA00000000) (Table 2).
Also, strain 23L3C had only 34.4% (gDDH) and 88.8% (ANI) genome homology values with Sphingomonas faeni MA-olki T (QAYE00000000) (Table 2). The gDDH and ANI results were corroborated by whole genome-based phylogenetic analysis inferred using GBDP-derived distances of strains 10L4B, 32L3A, and 23L3C and their corresponding closest validly described bacterial type strains (Supplementary Figure 2). For example, Sphingomonas sp. strain 23L3C grouped separately from but close to Sphingomonas faeni MA-olki T, with a branch support of 100% pseudo-bootstrap values (Supplementary Figure 2). These whole genome sequence data strongly corroborated the other results and supported the hypothesis that these strains constitute putative novel genotypes within the genera Pseudomonas and Sphingomonas.
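The species delineation rule applied to the genome data can be summarized in a short sketch. The thresholds (gDDH 70%, ANI 96%) and the value pairs are those quoted in the text; the function name and the three-way labelling, including the "discordant" case, are illustrative assumptions rather than part of the GGDC or FastANI tools.

```python
from typing import Dict, Tuple

GDDH_CUTOFF = 70.0  # percent, species-level threshold used in this study
ANI_CUTOFF = 96.0   # percent, species-level threshold used in this study

# (strain, closest type strain) -> (gDDH %, ANI %), values quoted in the text.
PAIRS: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("7L3B", "Pantoea allii LMG 24248"): (91.5, 99.0),
    ("1L1A", "Pantoea agglomerans DSM 3493"): (88.6, 98.7),
    ("17L2C", "X. translucens pv. translucens DSM 18974"): (78.0, 97.7),
    ("32L3A", "Pseudomonas asturiensis LMG 26898"): (58.8, 80.9),
    ("23L3C", "Sphingomonas faeni MA-olki"): (34.4, 88.8),
}


def delineate(gddh: float, ani: float) -> str:
    """Label a genome pair by comparing both metrics with their cut-offs."""
    above_gddh = gddh >= GDDH_CUTOFF
    above_ani = ani >= ANI_CUTOFF
    if above_gddh and above_ani:
        return "same species"
    if not above_gddh and not above_ani:
        return "distinct species"
    return "discordant"  # the two metrics disagree; needs closer inspection


if __name__ == "__main__":
    for (strain, reference), (gddh, ani) in PAIRS.items():
        print(f"{strain} vs {reference}: {delineate(gddh, ani)}")
```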
FIGURE 2 | Maximum likelihood phylogenetic trees of concatenated housekeeping gene sequences of the novel genotypes (in bold) within the genera Pseudomonas (A,B) and Sphingomonas (C) and corresponding closest type strains. rpoB-gyrB-rpoD (8321 nt) and atpD-fusA-rpoB (6997 nt) were used for Pseudomonas and Sphingomonas, respectively. Trees were inferred using RAxML-NG (Kozlov et al., 2019) with TIM2 + I + G4 (A) and GTR + I + G4 (B,C) as the best substitution models (Darriba et al., 2020) based on Akaike's information criterion. Bootstrap values >50% are shown at the branching points. GenBank accession numbers of closest type strains are given in parentheses.

Orthologous Gene Analysis
To provide a better insight of the genomic make-up of the proposed novel species, we analyzed the orthologous genes of the novel lineages relative to their closest type strains of species with validly published names. The gene content comparisons of the potential novel species and their closest described species indicated distinct genetic make-up.
The two potential novel Pseudomonas strains and their corresponding closest validly described species (P. caspiana and P. asturiensis) were analyzed together. The four whole genome sequences had a total of 18,463 protein-coding sequences. The genes were grouped into 5,523 clusters consisting of 2,061 orthologous clusters with at least two species and 3,462 single-copy gene clusters. The four Pseudomonas genome sequences shared 3,493 orthologous protein clusters (Figure 4A) that are involved in biological processes, molecular function and cellular component. The novel strain 10L4B shared 682 orthologous protein family clusters with its closest known type strain, P. caspiana FBF102 T, while strain 32L3A shared 608 protein family clusters with P. asturiensis LMG 26894 T (Figure 4A). Strain 32L3A, however, shared only 12 protein family clusters with P. caspiana FBF102 T, while strain 10L4B and P. asturiensis LMG 26894 T uniquely shared only 36 protein family clusters. Of the 3,493 protein family clusters shared by the 4 Pseudomonas strains, cluster 1 had the highest protein count of 18 and is annotated as P:antibiotic biosynthetic process; IEA:UniProtKB-KW (GO:0017000; Swiss-Prot hit # P0C064). The gene network of cluster 1 shows the similarity levels of Pseudomonas sp. nov. strain 32L3A and its closest validly described species, P. asturiensis, which had 6 and 7 predicted protein sequences, respectively (Figure 4B). The other pair, Pseudomonas sp. nov. 10L4B and its closest known bacterial relative, P. caspiana FBF102 T, exhibited lower numbers of this protein, 2 and 3, respectively (Figure 4B). Protein similarity/homology was confirmed by phylogenetic analysis, which revealed 5 distinct groupings, three of which are unique to Pseudomonas sp. 32L3A and its closest known bacterial species, P. asturiensis LMG 26894 T (Figure 4C).
The two novel species, Pseudomonas sp. nov. 32L3A and Pseudomonas sp. nov. 10L4B, had 2 and 14 unique protein family clusters, respectively, that were not detected in the whole genome sequences of their respective closest type strains. The two clusters (clusters 5,279 and 5,280) of strain 32L3A had 4 proteins with no significant match to entries of the Swiss-Prot database. BLASTp analysis of the two proteins of cluster 5,279 indicates high homology to integrase, an enzyme required for integration of the phage into the host genome by site-specific recombination (Van Duyne, 2005). The cluster 5,280 proteins appear to be calcium-binding proteins with 87% homology to that of Pseudomonas corrugata strain C8A5, isolated from the rhizosphere of painted nettles (Plectranthus scutellarioides).
Seven of the 14 unique protein family clusters in the novel strain 10L4B were annotated as involved in biological processes such as ion transport (GO:0006811) and nitrogen and organic acid metabolic processes. The 7 other clusters consist of 27 proteins, with four protein sequences in cluster 3,534 showing low homology (42%) to a BapA prefix-like domain-containing protein (WP_039297424) of Cedecea neteri that has been implicated in biofilm formation and host colonization (Latasa et al., 2005). The two proteins of cluster 5,286 or 5,287 were 98 or 100% similar to the demethoxyubiquinone hydroxylase family protein (WP_095100952) or SMI1/KNR4 family protein (WP_095101549), respectively, in Pseudomonas sp. Irchel 3A5, an unclassified Pseudomonas strain. Protein family cluster 5,290 contains two proteins that are unique to the novel strain 10L4B. The two proteins in cluster 5,292 had about 77% homology to the suppressor of fused domain protein of Pseudomonas sp. Irchel 3A5 and Pseudomonas savastanoi. All the other clusters were annotated as hypothetical proteins and exhibited high sequence homologies (>95%) with Pseudomonas sp. Irchel 3A5.
Pairwise orthologous gene analysis showed that the novel Sphingomonas sp. strain 23L3C shares 2,778 protein family clusters with S. faeni MA-olki T, its taxonomically closest known type strain; the two genomes had 40 and 71 unique protein families, respectively (Figure 5A). The cluster 1 protein family, shared by both strains, had 8 protein sequences, with only one identified in the genome sequence of strain 23L3C. The protein similarity network (Figure 5B) and phylogenetic tree (Figure 5C) of cluster 1 suggest that the only protein identified in strain 23L3C is distinct. Protein sequences in cluster 1 matched Swiss-Prot # P20384, a putative transposon Tn552 DNA-invertase bin3 that facilitates horizontal transfer and recombination events related to multidrug resistance (Shore et al., 2010; Abriouel et al., 2015). The disproportionately high number (7) of these protein sequences suggests that Sphingomonas faeni MA-olki T could be prone to acquiring antimicrobial resistance genes from the environment. Cluster 3 of the proteins shared between the two strains is a family identified as multidrug efflux pump subunit AcrB. This cluster matched entry number P31224 in the Swiss-Prot database and seems to constitute part of the broad-spectrum AcrA-AcrB-AcrZ-TolC complex, which uses the proton motive force to export substrates, including antimicrobial compounds (Sennhauser et al., 2007; Hobbs et al., 2012).
Forty protein family clusters were identified only in the novel Sphingomonas sp. strain 23L3C and consisted of 89 coding sequences, ranging from 2 to 6 per cluster. Seventeen (42.5%) of the 40 family clusters could not be matched to entries in the curated Swiss-Prot database. Cluster 2 showed the highest number of proteins (6) and matched entry P25184 in the Swiss-Prot database with a gene ontology of P:siderophore transport, a receptor for specific transmembrane uptake of the ferric pseudobactin 358. Cluster 4 had four proteins that matched Swiss-Prot number D3Z7P3, a glutaminase isoform that has been implicated in the catabolism of glutamine and plays a role in acid-base homeostasis (Cassago et al., 2012; Li et al., 2016). A multidrug export protein (EmrA) grouped in cluster 8 corresponded to the P27303 protein of the Swiss-Prot database, a part of the EmrAB-TolC tripartite efflux system that confers resistance to compounds such as nalidixic acid and 2,4-dinitrophenol (Lomovskaya and Lewis, 1992; Borges-Walmsley et al., 2003).
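The shared and genome-specific cluster counts reported in this section are, in essence, set operations on cluster membership. The sketch below illustrates the idea on a toy table; it is not OrthoVenn2 code, and the function names, cluster identifiers and genome labels (A, B, C) are hypothetical.

```python
from typing import Dict, Iterable, Set

# Toy membership table: cluster id -> genomes contributing proteins to it.
MEMBERSHIP: Dict[str, Set[str]] = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B", "C"},
    "c3": {"A", "B"},
    "c4": {"A"},
    "c5": {"A"},
    "c6": {"C"},
}


def core_count(membership: Dict[str, Set[str]], genomes: Iterable[str]) -> int:
    """Clusters that contain every listed genome."""
    wanted = set(genomes)
    return sum(1 for members in membership.values() if wanted <= members)


def unique_count(membership: Dict[str, Set[str]], genome: str) -> int:
    """Clusters found in one genome only."""
    return sum(1 for members in membership.values() if members == {genome})


def shared_only_count(membership: Dict[str, Set[str]], genomes: Iterable[str]) -> int:
    """Clusters found in exactly the listed genomes and no others."""
    wanted = set(genomes)
    return sum(1 for members in membership.values() if members == wanted)


if __name__ == "__main__":
    print("core:", core_count(MEMBERSHIP, ["A", "B", "C"]))
    print("unique to A:", unique_count(MEMBERSHIP, "A"))
    print("A and B only:", shared_only_count(MEMBERSHIP, ["A", "B"]))
```

Each region of a Venn diagram corresponds to one exact membership pattern, which is what shared_only_count tallies.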

In silico Detection of Secondary Metabolites, and Secretion Systems
In silico secondary metabolite detection using the AntiSMASH tool found at least three gene clusters within the genome sequence of each strain (Table 3). The types of metabolites detected included NRPS, NAGGN, aryl polyene, bacteriocin, siderophores, T3PKS, lasso peptide and terpene (Table 3). Metabolites of the aryl polyene, siderophore and bacteriocin types were more prominent in strains of Pseudomonas while the T3PKS type metabolites were detected only in Sphingomonas strains (Table 3). All the Pseudomonas strains, with the exception of strain 10L4B, have a cluster that is similar to pyoverdine (Table 3). A cluster involved in the biosynthesis of cichofactin A/cichofactin B was detected in Pseudomonas sp. nov. 32L3A (100% similarity) but not in its closest known species, P. asturiensis LMG 26898 T . The taiwachelin biosynthetic cluster was detected in Pseudomonas sp. nov. 10L4B but was absent in P. caspiana FBF102 T (Table 3). The novel genotype Sphingomonas sp. strain 23L3C had an emulsan-like cluster of the lantidin type, which was not detected in the whole genome sequence of its corresponding closest known type strain, S. faeni MA-olki T (Table 3).
Bacterial secretion systems are essential for virulence, pathogenicity and general cell function. We detected non-flagellar and flagellar secretion systems, with differences between the analyzed strains (Figure 6). The content of the T1SS, T2SS and flagellar systems was comparable across the strains while the T3SS and T6SSii contents differed between Pseudomonas and Sphingomonas strains (Figure 6). Sphingomonas strains (23L3C and MA-olki T ) had very few genes annotated as T3SS and T6SSii, respectively (Figure 6). The novel strains 32L3A and 10L4B as well as the type strain of P. caspiana, FBF102 T , had very few protein coding sequences that were annotated as T4SS (Figure 6). Also, fewer components of the tight adherence (Tad) pilus were detected in the novel genotype 10L4B and its corresponding closest known type strain, P. caspiana, than in the other strains. Sphingomonas strains had fewer components of the Type IV pilus (T4P) than the other bacterial strains studied (Figure 6), while the counts for the new Pseudomonas strains and their corresponding closest type strains were similar. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR-Cas), a defense mechanism against foreign genetic elements in prokaryotes, were detected using the MacSyFinder tool. Five or six putative and mandatory CRISPR-Cas3 type I systems were detected, with variable sequence lengths of 445-829 bp or 194-810 bp, for the Pseudomonas or Sphingomonas strains, respectively.
Based on the GENIII microplate assay (Biolog), 52, 43, or 35 carbon sources out of the 77 substrates tested were readily utilized by strain 10L4B, 32L3A or 23L3C, respectively (Supplementary Table 4), and the phenotypic patterns differed from those of their closest type strains. The strains were rod-shaped and of variable sizes, with Sphingomonas sp. nov. strain 23L3C being the smallest (Figure 7). Multiple polar flagella were observed on the cells of Pseudomonas sp. nov. strain 32L3A while strain 10L4B had one visible polar flagellum; cells of Sphingomonas sp. nov. strain 23L3C did not show the presence of flagella (Figure 7).
Also, both novel Pseudomonas strains tolerated pH 6.0 (near-neutral pH) while Sphingomonas strain 23L3C preferred pH 5.0 (acidic condition) (Supplementary Table 4). Pseudomonas strains grew well at a salt concentration of 4% while the Sphingomonas strain did not seem to tolerate high salt concentrations (Supplementary Table 4). Both novel Pseudomonas strains were resistant to troleandomycin, rifamycin SV, vancomycin and lincomycin but sensitive to nalidixic acid and minocycline. Strain 23L3C is resistant to vancomycin and nalidixic acid, but sensitive to rifamycin SV, minocycline and lincomycin (Supplementary Table 4).

DISCUSSION
This study reports the analysis of cultivable bacterial strains isolated from a unique and unexploited niche, the wheat necrotic lesions caused by Xanthomonas translucens (the causal agent of bacterial leaf streak disease), and the identification of three novel and previously undescribed bacterial species.
Until now, research on the characteristic BLS lesions has focused only on the isolation of the target pathogen for downstream studies such as accurate identification, verification of Koch's postulates and population dynamics. To achieve this isolation objective, the Wilbrink's-boric acid-cephalexin semi-selective medium is generally employed since it eliminates 90% of saprophytic and epiphytic bacteria (Duveiller, 1994). It is clear why plant pathologists use this semi-selective medium, but in so doing a vast amount of bacterial biodiversity is not being documented. Several other semi-selective media are routinely used for the isolation of agriculturally important bacterial pathogens, e.g., Clavibacter michiganensis on tomato seeds (Ftayeh et al., 2011) and Pseudomonas syringae on kiwifruit (Miyoshi and Tachibana, 1994), thus eliminating other bacteria, some of which might be novel taxa of unknown ecological significance.
We used a general purpose medium, nutrient agar, and isolated forty-nine bacterial strains belonging to 10 known and validly described genera (Figure 1). The 10 bacterial genera were reliably identified to the genus level based on 16S rRNA sequence analysis, which revealed that Pseudomonas (32.7%) and Pantoea (28.6%) were the dominant genera. Consistent with previous reports (Mulet et al., 2010;Gomila et al., 2015;Tambong et al., 2017), 16S rRNA gene sequences showed low resolution at the intrageneric level. Housekeeping genes, such as leuS, atpD, rpoD, gyrB, fusA, rpoB, ppkA, and recA, in different combinations, have been used routinely to refine interspecific phylogenetic and taxonomic positions of species of Pseudomonas, Pantoea, Clavibacter, Sphingomonas and Curtobacterium (Konstantinidis and Tiedje, 2007;Mulet et al., 2010;Tambong et al., 2017). rpoD, rpoB and gyrB are key housekeeping genes for species-level identification of Pseudomonas; rpoD is reported to be the most discriminatory (Mulet et al., 2010;Gomila et al., 2015;Tambong et al., 2017) and is routinely used as the key gene for identification of species of Pseudomonas. leuS, rpoB and gyrB have been reported to provide accurate species delineation within the genus Pantoea (Deletoile et al., 2009;Tambong et al., 2014;Tambong, 2019). The leuS gene fragment used in this study is a reliable marker for identification and differentiation of members of the genus Pantoea and is highly congruent with genome-based approaches (Tambong et al., 2014;Rahimi-Khameneh et al., 2019;Tambong, 2019). BLAST and phylogenetic (single and concatenated genes) analyses of these housekeeping genes were employed in our study and resulted in accurate species-level identification of the majority (80%) of the strains. Strains of Pantoea were identified as belonging to either Pantoea agglomerans or Pantoea allii.
Pantoea agglomerans is a versatile bacterium isolated from a variety of sources including wheat (Lindh et al., 1991;Kobayashi et al., 2018;Cherif-Silini et al., 2019). It was, however, surprising to isolate from wheat two strains of Pantoea allii, a known potent pathogen of onion (Brady et al., 2011;Rahimi-Khameneh et al., 2019). The strains of P. allii isolated in this study were found to be pathogenic to onion but not wheat (data not shown). Taxonomic assignment of the 4 Xanthomonas strains to X. translucens was done using atpD and gyrB, housekeeping genes that have been routinely used for differentiation of species within this genus (Fischer-Le Saux et al., 2015;Ferreira et al., 2019). Based on rpoD, rpoB and gyrB (Mulet et al., 2010;Gomila et al., 2015), the Pseudomonas strains were accurately classified to six validly described species with the exception of three strains (32L3A, 12L4D, and 10L4B). Strain 32L3A had Pseudomonas asturiensis LMG 26898 T as its closest validly described type strain. Strains 10L4B and 12L4D were found to be identical based on high homology values (>99.0%) of their 16S rRNA, rpoD, rpoB, and gyrB nucleotide sequences and had Pseudomonas caspiana as the closest type strain. Also, the 2 strains (23L3C and 15L3B) of Sphingomonas were similar to each other but taxonomically differed from their closest type strain, Sphingomonas faeni MA-olki T . As such, the taxonomic positions of these three strains of Pseudomonas and two of Sphingomonas were further analyzed to determine whether they constitute novel genotypes. We used MALDI-TOF and genome-based analyses to verify the uniqueness of these bacterial strains.
FIGURE 6 | In silico detection of bacterial secretion systems and cas-3 sequences/genes within the genome sequences of the novel bacterial strains [Pseudomonas sp. nov. 10L4B (= DOAB 1069); Pseudomonas sp. nov. 32L3A (= DOAB 1067) and Sphingomonas sp. nov. 23L3C (= DOAB 1063)] and their respective corresponding closest validly described type strains, Pseudomonas caspiana, Pseudomonas asturiensis and Sphingomonas faeni. Secretion systems- and Cas3-typing-related proteins were detected using, respectively, the TXSSCan and Cas_finder modules of MacSyFinder [1]. T1SS, T2SS, T3SS, T4SS, T5SS, and T6SS represent type I, type II, type III, type IV, type V and type VI secretion systems, respectively. Tad, Tad pili; and T4P, Type IV pili. Cas3 represents CRISPR-Cas, Clustered Regularly Interspaced Short Palindromic Repeats.
Matrix-assisted laser desorption/ionization-time-of-flight MS analysis of the whole cell protein content based on spectra allowed for a straightforward separation of the novel strains from their respective closest validly described type strains. MALDI-TOF MS is an innovative tool for rapid and cost-effective bacterial identification (Bizzini and Greub, 2010;Camoez et al., 2016;Sauget et al., 2017) that has recently been integrated into clinical microbial identification workflows (Bizzini and Greub, 2010;Sauget et al., 2017). This phenotypic analysis is becoming key in rapid identification of novel bacterial species/genotypes (Moore and Rossello-Mora, 2011;Spitaels et al., 2014;Waleron et al., 2019;Lassalle et al., 2020). Even though MALDI-TOF analysis is a valuable screening tool, it might show low discriminative power in delineating closely related bacterial species, e.g., P. congelans and P. syringae. In such cases, the corresponding results should be interpreted with caution and other methods used to validate the identification.
Rapid advances in high-throughput sequencing technologies and the development of bioinformatics tools have enabled the analysis of whole genome sequences for accurate identification, classification and taxonomy/systematics of bacterial strains (Gomila et al., 2015;Tambong, 2019). dDDH and ANI analyses based on whole genome sequence data corroborated the hypothesis that strains 32L3A, 10L4B, and 23L3C constitute novel species within the genera Pseudomonas and Sphingomonas since the computed values were lower than the cut-off threshold levels for species delineation. Genome-based techniques are reliable alternatives to wet lab DNA-DNA hybridization (wDDH) in bacterial species and strain identification (Thompson et al., 2013;Gomila et al., 2015;Parks et al., 2018;Tambong, 2019). Genome-based techniques eliminate the inherent shortcomings of wDDH such as irreproducibility between laboratories and high error (Sneath, 1989;Stackebrandt, 2003).
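The threshold logic behind this delineation can be sketched in a few lines of Python. The cut-off values are assumptions, since this section does not state them: 70% for dDDH and 95% for ANI are the commonly cited species boundaries (ANI cut-offs of 95 to 96% are both used in the literature). The function name and the example values are hypothetical and are not results from this study.

```python
# Commonly cited species-delineation cut-offs (assumed; not stated in this section).
DDDH_SPECIES_CUTOFF = 70.0  # percent digital DNA-DNA hybridization
ANI_SPECIES_CUTOFF = 95.0   # percent average nucleotide identity


def below_species_thresholds(dddh_percent, ani_percent,
                             dddh_cutoff=DDDH_SPECIES_CUTOFF,
                             ani_cutoff=ANI_SPECIES_CUTOFF):
    """Return True when both values fall below the species cut-offs,
    i.e. the query genome and the type strain are likely different species."""
    return dddh_percent < dddh_cutoff and ani_percent < ani_cutoff


# Illustrative values only (hypothetical, not measurements from the study).
print(below_species_thresholds(40.0, 90.0))  # query vs. a distinct type strain
print(below_species_thresholds(85.0, 98.0))  # query vs. a conspecific strain
```

Requiring both metrics to fall below their cut-offs is a conservative choice; a pair that passes one test but not the other would need closer inspection.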
The availability of whole genome sequences allowed us to investigate the gene content of the novel species relative to their respective closest type strains. Analysis of several clusters of orthologous proteins showed similarities and differences. In Pseudomonas, for example, all the strains had protein sequences in cluster 1 but the number of proteins was significantly different (Figure 4). The novel strain 32L3A and its corresponding type strain, Pseudomonas asturiensis, had 13 of the 18 protein sequences in cluster 1 while the novel strain 10L4B and its corresponding type strain (Pseudomonas caspiana) only had 3 and 2 proteins, respectively. This protein cluster is involved in the pathway for biosynthesis of gramicidin S, an antimicrobial compound effective against some gram-positive and gram-negative bacteria and fungi (Gause and Brazhnikova, 1944;Kondejewski et al., 1996). The presence of this protein family cluster in all the Pseudomonas strains suggests a potential role in the ecological fitness of these bacteria, especially in nutrient-restrictive niches. Still to be investigated is whether the high number of genes of this protein cluster family in Pseudomonas sp. nov. strain 32L3A and its corresponding closest known type strain could translate to higher competitiveness/fitness in their unique niches.
In silico analysis of secondary metabolites using AntiSMASH detected the presence of a pyoverdine cluster in all the genome sequences of Pseudomonas strains with the exception of strain 10L4B (Table 3). The lack of a pyoverdine cluster in the whole genome sequence of strain 10L4B could explain why this novel strain, when grown on King's B medium, is non-fluorescent under ultraviolet light. Pyoverdine is a powerful iron-chelating and iron-transporting siderophore, and represents a ready marker for identification of some Pseudomonas (Meyer, 2000;Bodilis et al., 2009;Cezard et al., 2015). It is a yellow-green fluorescent pigment present in fluorescent Pseudomonas species (Meyer, 2000;Cezard et al., 2015).
Comparative orthologous gene analysis of the whole genome sequences of Sphingomonas sp. 23L3C and S. faeni MA-olki T identified 89 unique protein coding sequences in the former strain only. While about 42.5% of the 41 protein family clusters could not be matched to entries in the curated Swiss-Prot database, three of the annotated clusters provided insight into how strain 23L3C could be competitive in the environment. Cluster 2 was annotated as a receptor for specific transmembrane uptake of ferric pseudobactin 358. BLAST analysis suggests that this could be similar to the pupB gene, which encodes an inducible ferric-pseudobactin receptor of Pseudomonas putida WCS358 (Koster et al., 1993). This PupB receptor is reported to facilitate iron transport using two distinct heterologous siderophores, pseudobactin BN7 and pseudobactin BN8 (Koster et al., 1993). The presence of the ferric-pseudobactin receptor suggests that strain 23L3C could be competitive in iron-limiting environments. Also, protein family cluster 8 matched a multidrug export protein, EmrA (P27303), a part of the EmrAB-TolC tripartite efflux system that confers resistance to compounds such as nalidixic acid and 2,4-dinitrophenol (Lomovskaya and Lewis, 1992;Borges-Walmsley et al., 2003). This could explain why, in the antibiotic sensitivity tests carried out in this study, the growth of strain 23L3C was not inhibited by nalidixic acid.
Sixteen bacterial species, including three previously undescribed taxa, were isolated from necrotic wheat tissues infected by Xanthomonas translucens (Table 1). With the exception of the undescribed taxa, Pseudomonas lurida, Pseudomonas moraviensis, and Pseudomonas simiae, all the other isolated species have been reported to cause diseases on field crops including wheat, with distinct and characteristic symptoms. However, the only characteristic symptoms observed during sample collection were those that are specific to Xanthomonas translucens, the causal agent of the bacterial leaf streak disease of wheat. In preliminary pathogenicity tests, some of the other known plant pathogenic bacterial species did not induce a compatible reaction on wheat seedlings (data not shown). Also, the presence of some of these other bacterial species on plant leaves is not surprising. For example, Pseudomonas lurida (Behrendt et al., 2007) and P. congelans (Behrendt et al., 2003) were initially isolated from the phyllosphere of healthy grasses. These observations suggest that these other bacterial species are either saprophytes, epiphytes, or endophytes. Further work is required to elucidate the ecological role(s) of these other species.
As a taxonomic conclusion, this study isolated 49 bacterial strains from the BLS lesions on wheat using a general purpose medium, nutrient agar, instead of the frequently used Wilbrink's-boric acid-cephalexin semi-selective medium. All the strains were identified to the species level with the exception of these five strains: 32L3A, 10L4B, 12L4D, 15L3B and 23L3C. Polyphasic classification of these strains determined that the first three strains are members of the genus Pseudomonas while the latter two strains belong to Sphingomonas sensu stricto (Takeuchi et al., 2001). Analyses of the 16S rDNA, multilocus gene sequences, genome-based DNA-DNA hybridization and average nucleotide identity values, MALDI-TOF analysis and electron microscopy as well as biochemical/physiological traits separated these strains from their closest validly described Pseudomonas or Sphingomonas species. Based on the data from these genotypic and phenotypic analyses presented here, it is concluded that these isolates represent three novel species. We propose the names Pseudomonas triticumensis sp. nov. for strain 32L3A T ; Pseudomonas foliumensis sp. nov. for strains 10L4B T and 12L4D; and Sphingomonas albertensis sp. nov. for strains 23L3C T and 15L3B.
Description of Pseudomonas triticumensis sp. nov.
Description of Sphingomonas albertensis sp. nov.
The type strain, 23L3C T (= DOAB 1063 T = CECT 30248 T = LMG 32139 T ), was isolated from necrotic wheat leaf tissues naturally infected by Xanthomonas translucens in Alberta, Canada. The DNA G + C content of the type strain 23L3C T is 65.7%.

AUTHOR CONTRIBUTIONS
JT received funding and contributed to the conception, design, data collection, analysis and write-up. MH and GD collected the diseased leaf samples and did the isolation. RX contributed to MLSA gene sequencing. SG performed genome sequencing. DC and KH generated the electron microscopy data. All authors contributed to manuscript revision, read, and approved the submitted version.

FUNDING
This study is funded by Agriculture and Agri-Food Canada through projects J-002272, J-001577 and J-000409.

ACKNOWLEDGMENTS
We are grateful to the staff of the Molecular Technologies Laboratory, Ottawa Research and Development Centre, Ottawa, Canada (MTL-ORDC) for the Illumina NextSeq sequencing; and to the Alberta Agriculture and Forestry for providing the resources to survey wheat fields. Finally, the authors are thankful to all the technicians and summer students who worked on these projects.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2021.666689/full#supplementary-material
Supplementary Figure 1 | Comparative analysis of matrix-assisted laser desorption/ionization-time-of-flight mass spectrometric profiles of Pseudomonas sp. nov. 32L3A (= DOAB 1067) and Sphingomonas sp. nov. 23L3C (= DOAB 1063) and their corresponding closest phylogenetic neighbors. Asterisks denote sets of seven (m/z) or eight peaks (m/z) that could be used to differentiate the novel strains from the closest known species of the respective genera. MALDI-TOF profiles were visualized using mMass 5.5.0 (Niedermeyer and Strohalm, 2012).
Supplementary Figure 2 | Phylogenetic trees inferred from nucleotide distances derived from genome sequences of representative strains using the Genome BLAST Distance Phylogeny (GBDP) approach: (A) Pseudomonas sp. 10L4B; (B) Pseudomonas sp. 32L3A; and (C) Sphingomonas sp. 23L3C. The branch lengths are scaled in terms of GBDP distance formula d5. The numbers above branches are GBDP pseudo-bootstrap support values >50% from 100 replications, with an average branch support of 99.7%. Trees were inferred with FastME 2.1.6.1 (Lefort et al., 2015) and rooted at the midpoint (Farris, 1972).