Agromyces chromiiresistens sp. nov., Novosphingobium album sp. nov., Sphingobium arseniciresistens sp. nov., Sphingomonas pollutisoli sp. nov., and Salinibacterium metalliresistens sp. nov.: five new members of Microbacteriaceae and Sphingomonadaceae from polluted soil

Many unidentified microbes in polluted soil remain to be explored and named to benefit the study of microbial ecology. In this study, a taxonomic investigation was carried out on five bacterial strains that were isolated and cultivated from the soil of an abandoned coking plant polluted with polycyclic aromatic hydrocarbons (PAHs) and heavy metals. Phylogenetic analysis showed that they belonged to the phyla Proteobacteria and Actinobacteria, and their 16S rRNA gene sequence identities to any known and validly named bacterial species were lower than 98.5%, suggesting that they potentially represented new species. Using polyphasic taxonomic approaches, the five strains were classified as new species of the families Microbacteriaceae and Sphingomonadaceae. Genome sizes of the five strains ranged from 3.07 to 6.60 Mb, with overall DNA G+C contents of 63.57–71.22 mol%. The five strains had average nucleotide identity values of 72.38–87.38% and digital DNA-DNA hybridization values of 14.0–34.2% compared with their closely related type strains; all values were below the thresholds for species delineation, supporting these five strains as novel species. Based on the phylogenetic, phylogenomic, and phenotypic characterizations, the five novel species are proposed as Agromyces chromiiresistens (type strain H3Y2-19aT = CGMCC 1.61332T), Salinibacterium metalliresistens (type strain H3M29-4T = CGMCC 1.61335T), Novosphingobium album (type strain H3SJ31-1T = CGMCC 1.61329T), Sphingomonas pollutisoli (type strain H39-1-10T = CGMCC 1.61325T), and Sphingobium arseniciresistens (type strain H39-3-25T = CGMCC 1.61326T). Comparative genome analysis revealed that the species of the family Sphingomonadaceae represented by H39-1-10T, H39-3-25T, and H3SJ31-1T possessed more functional protein-coding genes for the degradation of aromatic pollutants than the species of the family Microbacteriaceae represented by H3Y2-19aT and H3M29-4T.
Furthermore, their capacities to resist heavy metals and metabolize aromatic compounds were investigated. The results indicated that strains H3Y2-19aT and H39-3-25T were robustly resistant to chromate (VI) and/or arsenite (III). Strains H39-1-10T and H39-3-25T grew on aromatic compounds, including naphthalene, as carbon sources even in the presence of chromate (VI) and arsenite (III). These features reflect their adaptation to the polluted soil environment.

Many soil microbes remain uncultivated (Daniel, 2005; Lok, 2015), and microbes dwelling in polluted soil may adapt through evolution for resistance to, or even assimilation of, chemical pollutants such as PAHs and heavy metals as energy or carbon sources (Gou et al., 2020; Thakur et al., 2022). We previously explored the microbial diversity of polluted sites of the coking plant and found that the microbial taxa responded differently to PAHs and heavy metals (Yang et al., 2022). This communication reports the isolation and the genotypic and phenotypic characterization of five bacterial strains, namely, H3Y2-19aT, H3M29-4T, H39-1-10T, H39-3-25T, and H3SJ31-1T, from the polluted soil samples. With polyphasic taxonomic approaches, these strains were identified and classified into five novel species belonging to the genera Agromyces and Salinibacterium of the family Microbacteriaceae and the genera Novosphingobium, Sphingobium, and Sphingomonas of the family Sphingomonadaceae.

Sample collection
Soil samples were collected from an abandoned coking plant that had operated for 60 years in Hangzhou City, Zhejiang Province, China. The sampling site is located near the center of the plant area, and a mass of black, sticky contaminants was observed in the soil section of the sampling site. Five different soil types from the upper 30 cm layer were mixed to obtain one composite sample. The samples were visibly black and smelled of crude oil. Hangzhou has a warm and humid subtropical monsoon climate with four distinct seasons. Sampling was conducted in December 2020, in winter, when the local temperature was 5-9 °C. After collection, the soil samples were immediately stored in sterile PE bags and transported to the laboratory. The soil samples were analyzed for PAHs and heavy metals in the laboratory using HPLC and ICP-OES, and the results showed that the soil was severely polluted by PAHs, with concentrations as high as 12,558.06 ± 611.19 mg/kg, and slightly polluted by heavy metals (Yang et al., 2022).

Culture media, isolation, and cultivation conditions
Soils were suspended in a sterile phosphate buffer solution and shaken for 2 h to prepare the cell suspension. The diluted cell suspensions were spread onto agar plates and cultured under different conditions. To make M1 medium, the concentrations of K2HPO4 and MgSO4·7H2O were kept the same as in R2A (Lee and Whang, 2020), while the other components were diluted fivefold. The final pH was adjusted to 7.2 by adding K2HPO4 or KH2PO4. The medium was autoclaved for 30 min at 105 °C.
To make M2 medium, 200 µl of a PAH matrix was spread on mineral salt medium (MSM) agar plates (Samsu et al., 2020). A mixed solution of benz(a)anthracene and benzo(a)pyrene dissolved in acetone (1 g/L each) was used as the PAH matrix. The matrix was sterilized through a 0.22-µm sterile filter. The acetone was allowed to evaporate before the cell suspension was spread. To make M3 medium, soil extract solution was added to M1 at a 10% dosage, and the medium was then autoclaved for 30 min at 105 °C. The soil extract solution was prepared by adding 5.0 g soil to 150 ml distilled water, shaking on a rotary shaker at 150 rpm for 30 min, and centrifuging to obtain the supernatant. The agar plates were incubated at room temperature (20-25 °C) or 30 °C. Colonies were picked as they gradually appeared and were re-streaked on M1 agar plates until pure strains were obtained.

Heavy metal resistance and aromatic compound metabolism
The five bacterial strains were cultured in R2A medium to prepare the seed cultures for the heavy metal resistance and aromatic compound metabolism experiments. The seed cultures were washed twice with sterile PBS. After sterilization, the R2A medium was supplemented with 0.5, 2.0, 5.0, or 10.0 mM sterilized sodium arsenite or 5, 20, 50, or 100 mg/L sterilized potassium dichromate. Each well of the 96-well microplates received 150 µl of heavy metal-containing R2A medium and was inoculated with 5 µl of washed cells. The plates were cultured at 30 °C for 13 days, and OD600 was recorded at intervals using a microplate reader. The maximum increase in OD600 over the initial value of each well was used to determine bacterial growth. In the aromatic compound metabolism experiments, 100 mg/L of 2,5-dihydroxybenzoic acid, 4-hydroxybenzoic acid, protocatechuic acid, salicylic acid, phthalic acid, benzoic acid, naphthalene, or phenanthrene was added to separate batches of MSM medium. Each well of the 96-well microplates was filled with 150 µl of MSM medium and inoculated with 5 µl of washed cells. The plates were cultured at 30 °C for 30 days, and OD600 was recorded at intervals using a microplate reader. All tests were carried out in duplicate.
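The growth criterion described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis script: the function names and the example readings are hypothetical, the grade boundaries follow the note to Table 3, and assigning a boundary value (for example exactly 0.2) to the higher grade is an assumption because the published ranges overlap at their edges.

```python
def max_od_increase(readings):
    """Largest rise in OD600 above the first (initial) reading of one well."""
    if not readings:
        raise ValueError("at least one OD600 reading is required")
    initial = readings[0]
    return max(value - initial for value in readings)


def growth_grade(delta):
    """Map an OD600 increase to the symbols used in the Table 3 note.

    Assumption: a value sitting exactly on a boundary gets the higher grade.
    """
    if delta < 0.1:
        return "-"
    if delta < 0.2:
        return "+"
    if delta < 0.3:
        return "++"
    if delta < 0.4:
        return "+++"
    return "++++"


# Hypothetical time series for one well (not data from the study).
well = [0.05, 0.12, 0.31, 0.28]
print(growth_grade(max_od_increase(well)))
```

Using the maximum increase rather than the final reading keeps a well from being scored as negative when the culture has already declined by the last time point.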
To examine the PAH-degrading ability of the isolates, phenanthrene (200 mg/L) or naphthalene (200 mg/L), with or without heavy metals, was added to 5 ml of MSM medium. The heavy metal was 0.5 mM sodium arsenite or 5 mg/L potassium dichromate. Bacterial cells were washed, inoculated into the medium at 2% (v/v), and incubated for 3 days for naphthalene degradation or 10 days for phenanthrene degradation. At the end of cultivation, the residual PAHs were extracted with dichloromethane and quantified by HPLC (Sakshi et al., 2022; Yang et al., 2022). Uninoculated tubes served as controls. The degradation percentage was calculated as the reduction in PAH content relative to the control. All degradation tests were carried out in triplicate.
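The degradation percentage defined above is a simple ratio against the uninoculated control. The sketch below shows one way to compute it from replicate measurements; the function name, the use of replicate means, and the example residual concentrations are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean


def degradation_percent(control_residuals, inoculated_residuals):
    """Percent PAH removed relative to the uninoculated control.

    Both arguments are replicate residual PAH concentrations (e.g. mg/L).
    Assumption: replicates are summarized by their arithmetic mean.
    """
    control = mean(control_residuals)
    treated = mean(inoculated_residuals)
    if control <= 0:
        raise ValueError("control residual must be positive")
    return (control - treated) / control * 100.0


# Hypothetical triplicates: control stays near the 200 mg/L dose,
# inoculated tubes retain about 40 mg/L.
print(degradation_percent([200, 200, 200], [38, 40, 42]))
```

Normalizing to the control rather than to the nominal dose accounts for abiotic losses such as volatilization, which matter for naphthalene in particular.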

Morphology observation and chemotaxonomic determinations
Colony morphology was observed after aerobic culture on R2A agar plates at 30 °C for 2-5 days. Single-cell morphology was observed by transmission electron microscopy (JEM-1400, JEOL). Carbon source metabolism was profiled using Biolog™ GEN III microplate systems according to the manufacturer's protocol; each plate contains 71 carbon source utilization assays and 23 chemical sensitivity assays. To allow meaningful comparisons, the cell biomass for fatty acid identification of each bacterial strain was cultured in the medium reported for its reference type strains. H39-3-25T was cultured in R2A medium. H3M29-4T was cultured in marine agar medium (Han et al., 2003). H39-1-10T was cultured in NA medium (Singh et al., 2015). H3SJ31-1T and H3Y2-19aT were cultured in TSA medium (Gupta et al., 2009). Cells were harvested during the exponential growth phase. Cellular fatty acids were then extracted, methylated according to the standard MIDI protocol (Sherlock Microbial Identification System, version 6.0), and analyzed with a gas chromatograph (HP 6890 Series GC System, Agilent) (Sasser, 1990). Polar lipids were separated by two-dimensional thin-layer chromatography on silica TLC plates (10 × 10 cm; Merck), with chloroform/methanol/water (65:25:4, by volume) and chloroform/methanol/acetic acid/water (80:12:15:4, by volume) as the first- and second-dimension solvents, respectively (Minnikin et al., 1984). After separation, the polar lipids were detected with the following spray reagents: 10% ethanolic molybdophosphoric acid for total lipids, 0.4% ninhydrin solution in butanol for aminolipids, 1.3% molybdenum blue spray reagent for phospholipids, and 0.5% α-naphthol sulfuric acid reagent for glycolipids.

16S rRNA gene sequencing and phylogenetic analysis
The complete 16S rRNA genes were amplified using the universal primers 27F (5′-AGAGTTTGATCCTGGCTCAG-3′) and 1492R (5 et al., 2012). The gene similarities to previously reported type strains were determined using the EzBioCloud server (Yoon et al., 2017). The 16S rRNA gene sequences of type strains were downloaded from the EzBioCloud server and aligned using CLUSTAL W (Thompson et al., 1994). Phylogenetic trees were constructed using MEGA 11 software with the neighbor-joining (NJ) method according to Kimura's two-parameter model (Kimura, 1980), the maximum-likelihood (ML) method based on the Tamura-Nei model (Tamura and Nei, 1993), and the maximum-parsimony (MP) algorithm based on the Subtree-Pruning-Regrafting (SPR) search method (Fitch, 1971). The statistical reliability of these trees was assessed by bootstrap analysis with 1,000 replications (Felsenstein, 1985).
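For readers unfamiliar with the distance that underlies the neighbor-joining trees, Kimura's two-parameter model corrects the observed proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The Python sketch below illustrates that formula only; it is not how MEGA is implemented, the function name is hypothetical, and skipping any column that contains a gap or ambiguous base is an assumption about gap handling.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment with a single transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

The correction matters little for closely related 16S rRNA sequences, where the distance stays close to the raw proportion of differing sites, but it grows as sequences diverge and multiple substitutions at the same site become likely.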

Genome sequencing and analysis
Genomic DNA was extracted using commercial TIANamp bacteria DNA kits and sequenced on an Illumina HiSeq X Ten platform. After quality control, the reads were assembled with multiple assemblers to obtain the best assembly, and the predicted genes were annotated using DIAMOND software against reference databases (including KEGG, COG, NR, SwissProt, and Pfam), as described in the Global Catalog of Type Strain (gcType) Platform Manual v2 (https://gctype.wdcm.org/manual.jsp#detail). The RAST annotation engine (http://rast.nmpdr.org/) was also used to annotate genes. To assess the genome-based phylogeny, whole-genome-based phylogenomic trees were constructed using the up-to-date bacterial core genes (UBCGs) set pipeline (www.ezbiocloud.net/tools/ubcg) (Na et al., 2018). Average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) values were calculated using the ANI calculator (https://www.ezbiocloud.net/tools/orthoani) and the Genome-to-Genome Distance Calculator 3.0 (GGDC; https://ggdc.dsmz.de/ggdc.php#), together with UPGMA (unweighted pair group method with arithmetic mean) dendrograms (Meier-Kolthoff et al., 2013; Lee et al., 2016). The DNA G+C contents were also determined by GGDC 3.0. Genome analysis by CheckM showed that none of the genomes was contaminated (Parks et al., 2015).
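The genomic G+C content reported here is the share of G and C among the unambiguous bases of an assembly. The study obtained the values from GGDC 3.0; the sketch below only illustrates the underlying arithmetic, and the function name, the decision to ignore ambiguous bases such as N, and the toy contigs are assumptions.

```python
def gc_content_mol_percent(contigs):
    """G+C content (mol%) over all contigs of a genome assembly.

    Assumption: ambiguous bases (N, etc.) are excluded from the denominator.
    """
    gc = at = 0
    for seq in contigs:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


# Toy contigs, not real genome data.
print(gc_content_mol_percent(["GGCC", "ATGC"]))
```

Counting over all contigs at once, instead of averaging per-contig percentages, weights each contig by its length, which is what a genome-wide value requires.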
For comparative genome and genomic synteny analyses, the nearest phylogenomic relatives of the five strains with available genome sequences were chosen, and their genomes were retrieved from the NCBI database. The coding sequences of the genomes were predicted using GeneMarkS and Glimmer 3.02 software (http://ccb.jhu.edu/software/glimmer/index.shtml). Orthologous gene clusters were compared using OrthoVenn2 software (https://orthovenn2.bioinfotoolkits.net/home). The genomic synteny analyses were carried out using the PROmer program of MUMmer and visualized in a dot plot (Kurtz et al., 2004). Syntenic blocks of DNA sequence were analyzed using Mauve software (Darling et al., 2004). The KEGG pathways were predicted through the KEGG automatic annotation server (KAAS), and the richness of functional genes based on KEGG pathway level 3 was visualized in a heatmap. The distribution of homologous gene clusters was displayed using the Bioplot module of Chiplot (https://www.chiplot.online/).

Culture preservation
All the assigned type strains in this study were deposited at the China General Microbiological Culture Collection Center (CGMCC). The accession numbers of these strains are provided in the species description section.
The genomes of the five strains were sequenced and annotated. The genome sizes and annotated gene numbers are provided in Tables 1 and 2. The genome annotations and the physiology of the strains, including carbon source assimilation, resistance to heavy metals and chemicals, and biodegradation of aromatic compounds, are described in the following paragraphs. The genomic DNA G+C contents of H3Y2-19aT, H3M29-4T, H39-1-10T, H39-3-25T, and H3SJ31-1T were 71.22, 69.07, 66.15, 63.57, and 66.16 mol%, respectively.
The overall results from the chemical sensitivity assays showed that strains H3M29-4T, H39-1-10T, and H3SJ31-1T were more sensitive to chemical agents than H3Y2-19aT and H39-3-25T. Of the 23 chemical agents, H3M29-4T was sensitive to 18 and tolerated pH 6.0, 1% NaCl, 1% sodium lactate, nalidixic acid, and potassium tellurite. It did not grow at pH lower than 5.0 or at sodium chloride concentrations higher than 4%. H3SJ31-1T was sensitive to 19 agents and tolerated pH 6.0, lincomycin, nalidixic acid, and potassium tellurite. It did not grow at pH lower than 5.0. H39-1-10T was sensitive to 22 agents. By contrast, H3Y2-19aT showed the strongest tolerance (resisting 11 chemical agents). According to the genome annotation, all five strains had genes for resistance to fluoroquinolones (Supplementary Table S1), and H3Y2-19aT, H39-3-25T, H3M29-4T, and H3SJ31-1T grew in the presence of nalidixic acid. Genes encoding β-lactamase (Bonomo, 2017) were annotated in the genomes of H39-1-10T, H39-3-25T, and H3SJ31-1T, but these strains did not show resistance to aztreonam experimentally. Two genes encoding multidrug resistance efflux pumps (Alibert et al., 2017) were annotated in H3Y2-19aT, which showed resistance to five antibiotics, namely, rifamycin SV, lincomycin, vancomycin, nalidixic acid, and aztreonam. Unexpectedly, nine genes for multidrug resistance efflux pumps were annotated in strain H39-1-10T, but it did not resist any of the antibiotics tested with the Biolog™ GEN III microplate.

Heavy metal resistance and aromatic compound metabolism
As the five strains were isolated from soil samples polluted with heavy metals and/or PAHs, we tested their abilities to resist heavy metals and degrade PAHs. The results showed that strains H39-3-25T, H39-1-10T, and H3M29-4T displayed arsenite resistance, and H39-3-25T and H3M29-4T displayed dichromate resistance (Table 3). Strain H3Y2-19aT was the most resistant and tolerated up to 10 mM arsenite and 100 mg/L potassium dichromate. Strain H3SJ31-1T did not show any growth in the presence of arsenite or dichromate. The genome annotations indicated that strain H39-1-10T possessed homologs of arsB (Kaur et al., 2011), chrB, and chrR (Ackerley et al., 2004; He et al., 2018), which play roles in arsenite and dichromate resistance, respectively (Supplementary Table S4). Surprisingly, we did not identify any homologs of known chromium resistance genes in the most tolerant strain, H3Y2-19aT. Agromyces has been detected as an abundant member of the rhizosphere microbial community and was positively related to Zn and Cd accumulation in plants growing in polluted soils (De Maria et al., 2011). The arsenite- and dichromate-resistant strain H3Y2-19aT, a potential new species of Agromyces, would expand the bioresources of this genus and could be beneficial for the phytoremediation of heavy metals.

Note to Table 3: For heavy metal resistance tests, the bacteria were cultured in R2A medium supplemented with 0.5, 2.0, 5.0, and 10.0 mM sodium arsenite or 5, 20, 50, and 100 mg/L potassium dichromate. For aromatic compound metabolism tests, the bacteria were cultured in MSM medium supplemented with 100 mg/L of each aromatic compound as the sole carbon source. Bacteria were generally regarded as showing growth when the OD600 value increased by at least 0.1. The increase in OD600 is represented as: -, <0.1; +, 0.1-0.2; ++, 0.2-0.3; +++, 0.3-0.4; ++++, ≥0.4.

Frontiers in Microbiology, frontiersin.org
Although they originated from polluted soil samples, strains H3Y2-19aT, H3SJ31-1T, and H3M29-4T could not use any of the tested aromatic compounds. Strains H39-1-10T and H39-3-25T showed observable growth with naphthalene or 2,5-dihydroxybenzoic, p-hydroxybenzoic, protocatechuic, salicylic, phthalic, or benzoic acid as the sole carbon source, and they were further tested for degradation of naphthalene in the presence of arsenite or dichromate (Figure 3). The results showed that H39-1-10T catabolized 80.2% of the naphthalene in the presence of 0.5 mM arsenite, and H39-3-25T eliminated more than 92% of the naphthalene whether or not heavy metals were added. According to the genome annotation (Supplementary Table S4), naphthalene 1,2-dioxygenase (Selifonov et al., 1996), the critical enzyme for the first catalytic step of PAH degradation, was annotated in both H39-1-10T and H39-3-25T. The genotypes of H39-1-10T and H39-3-25T thus agreed with their growth phenotype of utilizing naphthalene as a carbon source. H39-1-10T and H39-3-25T were later identified as species of Sphingomonas and Sphingobium, genera that are well known as pollutant degraders (Ghosal et al., 2016).

Comparative genome and genomic synteny analyses
Comparative genomic analyses were carried out to reveal the genomic synteny and diversity between the five strains and their relatives. The nearest phylogenomic type strain neighbors, namely, Agromyces italicus DSM 16388T to H3Y2-19aT, Salinibacterium hongtaonis 194T to H3M29-4T, Sphingomonas panacis DCY99T to H39-1-10T, Sphingobium psychrophilum AR-3-1T to H39-3-25T, and Novosphingobium aquimarinum M24A2MT to H3SJ31-1T, which were isolated from catacombs, feces of Tibetan antelopes, rhizosphere, Arctic soil, and seawater samples, respectively, were chosen for comparative genomic analysis. Through comparative analysis of the orthologous relationships of predicted protein-coding genes, we found remarkable gene overlaps among the genomes of the investigated strains (Figure 4A). At the protein sequence level, analysis using OrthoVenn2 revealed 2,667, 1,565, 3,298, 2,935, and 2,399 orthologous clusters shared by H3Y2-19aT, H3M29-4T, H39-1-10T, H39-3-25T, and H3SJ31-1T with their relatives, respectively. All five strains in this study possess more unique orthologous clusters than their relatives. Among the unique clusters of the five studied strains, annotated known genes account for 46.2%-75.0% (data not shown). Thus, many unknown protein-coding genes in their genomes remain to be explored. From the dot plot of the genomic synteny analysis based on protein sequences, we observed the syntenic regions shared by the studied strains and their respective relatives (Figure 4C). The overall results showed that H3Y2-19aT, H3M29-4T, and H39-1-10T shared more syntenic blocks (12, 13, and 9, respectively) with their respective relatives than H39-3-25T and H3SJ31-1T did. Although H39-3-25T has a much more analogous sequence with Spm. panacis DCY99T, no remarkable syntenic blocks were observed. These observations are also supported by the synteny analysis using Mauve software based on DNA sequences, as shown in Supplementary Figure S2. The results suggest that the level of genomic synteny of H3Y2-19aT, H3M29-4T, and H39-1-10T with their respective relatives is much higher than that of H39-3-25T and H3SJ31-1T. The fact that the majority of the genome regions of H39-3-25T and H3SJ31-1T were not in syntenic blocks suggests that dramatic gene rearrangements occurred in their genomes after they diverged from their most recent common ancestor. On the other hand, extensive syntenic DNA sequence orders still exist between them and their relatives. These syntenic and dislocated genes may be caused by selective aggregation under evolutionary pressures (Wan, 2019).
As the five strains were isolated from polluted soils, we further compared the distribution of functional genes related to the degradation of xenobiotic pollutants between them and their relatives based on the KEGG annotation. The richness of annotated functional genes is shown in Figure 4B. The overall results showed that the bacterial species from the family Sphingomonadaceae represented by H39-1-10T, H39-3-25T, and H3SJ31-1T possess many more functional protein-coding genes than the species from the family Microbacteriaceae represented by H3Y2-19aT and H3M29-4T, such as genes in the pathways ko00364, ko00623, ko00624, ko00625, ko00626, and ko00633 for the degradation of fluorobenzoate, toluene, PAHs, chloroalkane and chloroalkene, naphthalene, and nitrotoluene, respectively. These sphingomonads also have numerous genes encoding cytochrome P450 in the pathways ko00980 and ko00982 to deal with other xenobiotics and drugs. Previous studies showed that sphingomonad strains usually display high contaminant-degrading efficiency (Waigi et al., 2015), and they have thus been widely regarded as excellent decomposers of pollutants (Ghosal et al., 2016). The results of this study support this view at the molecular level. Among the three sphingomonads, the studied species of the genera Sphingomonas and Sphingobium have more genes than Novosphingobium in the pathways ko00540, ko00603, and ko00600 for lipopolysaccharide and glycosphingolipid biosynthesis and sphingolipid metabolism, which were probably responsible for producing surfactants such as rhamnolipid (Ma et al., 2018; Posada-Baquero et al., 2019) to accelerate the degradation of xenobiotics. Strain H39-1-10T has 15 genes in the ko00626 pathway responsible for naphthalene degradation. Strain H39-3-25T has more functional genes than the other strains in the pathways ko00621 (13), ko00625 (18), ko00626 (24), ko00980 (23), and ko00982 (23), for the degradation of dioxin, chloroalkane and chloroalkene, naphthalene, other xenobiotics, and drugs, respectively. It also has 18 functional genes for xylene degradation.
Moreover, to investigate the particular mechanism of pollutant degradation in these sphingomonads, specifically the pathway of naphthalene degradation, the functionally related gene clusters were drawn in Figure 4D. As can be observed, both H39-3-25T and H39-1-10T have an intact upstream metabolic pathway with the required genes nahA, nahB, nahC, nahD, nahE, and nahF, which progressively transform naphthalene to salicylate. The salicylate is then transformed to gentisate by nagG and nagH and afterward enters the tyrosine metabolism pathway to be mineralized or assimilated. Some genes of the central metabolic pathway of PAHs, such as maiA, nagM, nagK, and mhpD, are also located around the gene cluster. In contrast, their closest relatives Spm. panacis DCY99T and Spb. psychrophilum AR-3-1T both lack the intact metabolic pathway of naphthalene. Although Spb. psychrophilum AR-3-1T has some scattered genes such as nahB, nahC, nahD, and nahE, it lacks the most critical gene, nahA, which encodes the naphthalene 1,2-dioxygenase for the first oxidation step of naphthalene degradation (Selifonov et al., 1996). Strain H3SJ31-1T possesses only nahE and nahF in the cluster, which cannot complete the degradation of naphthalene on their own, and no upstream metabolic pathway gene clusters were found in its closest relative N. aquimarinum M24A2MT. The interpretations of the comparative genome analysis agree with the results of the phenotypic test of naphthalene degradation described above. Compared with their respective closest relatives, which were isolated from non-polluted samples, H39-1-10T and H39-3-25T have evolved many more genes to construct the intact pathway of naphthalene degradation, suggesting that genes for pollutant degradation were enriched in the heavily PAH-polluted soil environment.
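The reasoning in the paragraph above amounts to a completeness check of the upstream naphthalene pathway (nahA to nahF) against each genome's annotation. The Python sketch below restates that check; the dictionary holds only the genes explicitly named in the text, so for Spb. psychrophilum AR-3-1T, whose genes are listed with "such as", a gene missing from the set is not proof of absence (only nahA is stated to be lacking). The names `UPSTREAM_NAH_GENES`, `ANNOTATED`, and `missing_upstream_genes` are hypothetical.

```python
# Genes that convert naphthalene to salicylate, in pathway order.
UPSTREAM_NAH_GENES = ("nahA", "nahB", "nahC", "nahD", "nahE", "nahF")

# Only genes named in the text are recorded here (illustrative, not exhaustive).
ANNOTATED = {
    "H39-1-10T": {"nahA", "nahB", "nahC", "nahD", "nahE", "nahF"},
    "H39-3-25T": {"nahA", "nahB", "nahC", "nahD", "nahE", "nahF"},
    "H3SJ31-1T": {"nahE", "nahF"},
    "Spb. psychrophilum AR-3-1T": {"nahB", "nahC", "nahD", "nahE"},
}


def missing_upstream_genes(genes):
    """Return the upstream nah genes absent from a set of annotated genes."""
    return [gene for gene in UPSTREAM_NAH_GENES if gene not in genes]


for strain, genes in ANNOTATED.items():
    missing = missing_upstream_genes(genes)
    status = "intact" if not missing else "missing " + ", ".join(missing)
    print(f"{strain}: {status}")
```

A strain without nahA cannot start the pathway regardless of how many downstream genes it carries, which is why the text singles out that gene.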

Five novel bacterial taxa and their species descriptions
Based on the analysis of the 16S rRNA gene and genome sequences, we further studied the phylogenetic relationships of each bacterial isolate with its closely related and validly named bacterial taxa, and the phylogenetic trees are shown in Figure 5. UPGMA dendrograms based on the ANI scores of their genomes were generated and are shown in Figure 6. The dDDH values were calculated and are shown in Supplementary Tables S2 and S3. According to the chemotaxonomic characteristics and the results of the DNA-based analyses, we propose that H3Y2-19aT, H3M29-4T, H39-1-10T, H39-3-25T, and H3SJ31-1T represent novel species.
Based on the results of phylogenetic, phylogenomic, and phenotypic characterizations, we concluded that strain H3Y2-19aT represents a novel species affiliated to the genus Agromyces, and the name Agromyces chromiiresistens sp. nov. is proposed.
The type strain is H3Y2-19aT (=CGMCC 1.61332T), isolated from a polluted soil sample. The GenBank accession numbers for the 16S rRNA gene sequence and genome sequence of the type strain are OP493225 and JARFNL000000000, respectively.
The type strain is H3M29-4T (=CGMCC 1.61335T), isolated from a polluted soil sample. The GenBank accession numbers for the 16S rRNA gene sequence and genome sequence of the type strain are OP456335 and JARGEM000000000, respectively.
The type strain is H39-1-10T (=CGMCC 1.61325T), isolated from a polluted soil sample. The GenBank accession numbers for the 16S rRNA gene sequence and genome sequence of the type strain are OP493228 and JARFNJ000000000, respectively.
The type strain is H39-3-25T (=CGMCC 1.61326T), isolated from a polluted soil sample. The GenBank accession numbers for the 16S rRNA gene sequence and genome sequence of the type strain are OP493227 and JARFNK000000000, respectively.

Strain H3SJ31-1T
The phylogenetic trees revealed that H3SJ31-1T clustered with members of the genus Novosphingobium (Figures 5B1, B2). The closest relatives of H3SJ31-1T were N. mathurense SM117T (97.53% 16S rRNA gene identity) and N. soli CC-TPE-1T (97.60%). The ANI and dDDH values between H3SJ31-1T and N. mathurense SM117T were 76.91% and 20.7%, respectively (Table 2 and Figure 6E), both below the thresholds for differentiating two species. At the date of this description, 58 species of the genus Novosphingobium have been described. The major cellular fatty acids of H3SJ31-1T were summed feature 8 (C18:1 ω7c/C18:1 ω6c, 30.74%) and C16:0 (47.67%), and the C16:0 component distinguished this organism from other members of the genus Novosphingobium, as shown in Supplementary Table S3. H3SJ31-1T contained the polar lipids sphingoglycolipid, diphosphatidylglycerol, phosphatidylethanolamine, phosphatidyl dimethylethanolamine, phosphatidylglycerol, phosphatidylcholine, phosphatidyl monomethylethanolamine, and an unknown lipid (Supplementary Figure S1). The DNA G+C content was 66.16 mol%, which is within the range (62-67 mol%; Takeuchi et al., 2001) of the genus Novosphingobium (Supplementary Table S3). Based on the results of phylogenetic, phylogenomic, and phenotypic characterizations, we concluded that strain H3SJ31-1T represents a novel species affiliated to the genus Novosphingobium, and the name Novosphingobium album sp. nov. is proposed.
The type strain is H3SJ31-1T (=CGMCC 1.61329T), isolated from a polluted soil sample. The GenBank accession numbers for the 16S rRNA gene sequence and genome sequence of the type strain are OP493226 and JARESE000000000, respectively.
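The genome-based part of the species decision can be written as a comparison against cutoffs. The text does not state the threshold values, so the sketch below assumes the commonly cited cutoffs of about 95% ANI and 70% dDDH; these constants, like the function name, are assumptions rather than values taken from the study. The example inputs are the ANI and dDDH values reported above for H3SJ31-1T versus N. mathurense SM117T.

```python
# Assumed cutoffs (commonly cited in prokaryotic taxonomy, not stated in the text).
ANI_SPECIES_THRESHOLD = 95.0    # percent
DDDH_SPECIES_THRESHOLD = 70.0   # percent


def below_species_thresholds(ani, dddh):
    """True when both genome relatedness indices fall below the cutoffs,
    i.e. the two genomes are unlikely to belong to the same species."""
    return ani < ANI_SPECIES_THRESHOLD and dddh < DDDH_SPECIES_THRESHOLD


# H3SJ31-1T vs N. mathurense SM117T, values from the description above.
print(below_species_thresholds(76.91, 20.7))
```

Requiring both indices to fall below their cutoffs is the conservative choice; a pair that passes one test but not the other would need closer inspection.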

FIGURE Cellular morphology (transmission electron microscopy) of the five bacteria isolated from polluted soil. The name of each bacterium and the scale bars for cell size are shown in the pictures.

FIGURE 3 Degradation of naphthalene and phenanthrene by strains H39-1-10T and H39-3-25T (A, B) in the presence or absence of heavy metals. Nap, naphthalene; Phe, phenanthrene; Nap-As and Phe-As, medium for PAH degradation supplemented with 0.5 mM sodium arsenite; Nap-Cr and Phe-Cr, medium for PAH degradation supplemented with 5 mg/L potassium dichromate.

FIGURE 4 Analysis of orthologous genes and genomic synteny between the five strains and their nearest phylogenomic type strain neighbors. (A) The distribution of shared and unique orthologous gene clusters between the five queried strains (y-axis) and their respective reference strains (x-axis). The numbers on the axes indicate the unique orthologous clusters of each strain. The numbers in the circles indicate the orthologous clusters shared by the two compared strains. (B) The richness of functional genes related to the biodegradation and metabolism of xenobiotics based on the KEGG annotation at pathway level 3. The richness is illustrated by heatmap analysis. The color indicates the number of annotated genes. The pathway KO numbers and their functional descriptions are shown on the right side of the heatmap. (C) The genomic synteny between the five queried strains (x-axis) and their respective reference strains (y-axis). The syntenic regions are displayed by dot plot analysis using the PROmer program of MUMmer. Red dots represent forward synteny. Blue dots represent reverse synteny. The syntenic blocks are marked with a red triangle. (D) The functional gene clusters for metabolizing xenobiotics in strains H39-1-10T, H39-3-25T, and H3SJ31-1T and their respective reference strains. The genes that perform roles in one pathway are shown in the same color. The arrows point in the direction of gene expression.

FIGURE 5 Phylogenetic and phylogenomic trees of the five bacteria constructed on the basis of the 16S rRNA gene sequences and whole genomes using the neighbor-joining algorithm, showing the relationships of the five novel taxa to their closely related type bacterial strains. (A) The phylogenetic (A1) and phylogenomic (A2) trees of strains H3Y2-19aT, H3M29-4T, and their closely related species in the family Microbacteriaceae; the sequence of Brevibacterium linens DSM T (X ) was used as an out-group. (B) The phylogenetic (B1) and phylogenomic (B2) trees of strains H39-1-10T, H39-3-25T, H3SJ31-1T, and their closely related species in the family Sphingomonadaceae; the sequence of Brevundimonas diminuta ATCC T (GL ) was used as an out-group. GenBank accession numbers are given in parentheses. Bootstrap percentages (> %) based on 1,000 replicates are shown at the nodes. Phylogenetic trees based on the maximum-likelihood and maximum-parsimony methods with 1,000 bootstraps were also reconstructed (see Supplementary Figure S). The filled circles indicate the nodes supported by all three methods regardless of bootstrap percentages. Bar, . substitutions per nucleotide position for (A ); bar, . substitutions per nucleotide position for (B ); bar, . substitutions per nucleotide position for (A , B ).

FIGURE 6 UPGMA phylogenetic trees and ANI heat maps based on whole genomes. Each of the UPGMA phylogenetic trees and ANI heat maps displays the connections between a novel bacterial taxon and its close relatives: (A) H3Y2-19aT; (B) H3M29-4T; (C) H39-1-10T; (D) H39-3-25T; and (E) H3SJ31-1T. The novel taxon names proposed in this study are shown in red. GenBank accession numbers of the genomes are shown in parentheses.

TABLE 2 Phenotypic, chemotaxonomic, and genomic features of three novel species of Sphingomonadaceae, and differentiation from their closely related species.
…T; 2, Sphingomonas panacis DCY99T …

TABLE 3 Characteristics of heavy metal resistance and aromatic compound metabolism of the five novel species.