Comparative genomic analysis of Bacillus atrophaeus HAB-5 reveals genes associated with antimicrobial and plant growth-promoting activities

Bacillus atrophaeus HAB-5 is a plant growth-promoting rhizobacterium (PGPR) that exhibits several biotechnologically useful traits, such as enhancing plant growth, colonizing the rhizosphere, and engaging in biocontrol activities. In this study, we conducted whole-genome sequencing of B. atrophaeus HAB-5 using the single-molecule real-time (SMRT) sequencing platform by Pacific Biosciences (PacBio; United States). The genome consists of a circular chromosome with a total length of 4,083,597 bp and a G + C content of 44.21%. The comparative genomic analysis of B. atrophaeus HAB-5 with other strains, Bacillus amyloliquefaciens DSM7, B. atrophaeus SRCM101359, Bacillus velezensis FZB42, B. velezensis HAB-2, and Bacillus subtilis 168, revealed that these strains share 2,465 CDSs, while 599 CDSs are exclusive to the B. atrophaeus HAB-5 strain. Many gene clusters in the B. atrophaeus HAB-5 genome are associated with the production of antimicrobial lipopeptides and polypeptides. These gene clusters comprise three NRPS, two transAT-PKS, one terpene, one lanthipeptide, one T3PKS, one RiPP, and one thiopeptide cluster. In addition to the likely IAA-producing genes (trpA, trpB, trpC, trpD, trpE, trpS, ywkB, miaA, and nadE), there are probable genes that produce volatile chemicals (acoA, acoB, acoR, acuB, and acuC). Moreover, HAB-5 contained genes linked to iron transportation (fbpA, fetB, feuC, feuB, feuA, and fecD), sulfur metabolism (cysC, sat, cysK, cysS, and sulP), phosphorus solubilization (ispH, pstA, pstC, pstS, pstB, gltP, and phoH), and nitrogen fixation (nif3-like, gltP, gltX, glnR, glnA, nadR, nirB, nirD, nasD, narI, narH, narJ, and narK). In conclusion, this study provides a comprehensive genomic analysis of B. atrophaeus HAB-5, pinpointing the genes and genomic regions linked to the antimicrobial properties of the strain. These findings advance our knowledge of the genetic basis of the antimicrobial properties of B. atrophaeus and suggest that HAB-5 could be developed into commercial biopesticides and biofertilizers as an alternative strategy to increase agricultural output and manage a variety of plant diseases.


Introduction
Plant pathogenic fungi, bacteria, viruses, and viroids can reduce agricultural productivity and result in yield losses of up to 14% for various crops (Peng et al., 2021). The application of pesticides to combat plant diseases is an essential step in the production of agricultural products. Chemical pesticides have been used extensively to control plant diseases; without them, the production of fruits, vegetables, and grains would have dropped by 78%, 54%, and 32%, respectively. Nevertheless, excessive pesticide usage in agriculture pollutes the environment and has a negative impact on human health (Syed Ab Rahman et al., 2018; Tudi et al., 2021). Furthermore, pesticides can alter the composition of plant-associated microbial communities (Meena et al., 2020). Therefore, the development of ecofriendly pesticide alternatives is urgently required. Because they are safe and environmentally friendly, antagonistic bacteria have in recent years become a potent substitute for conventional pesticides in the management of crop diseases. Numerous strains of Bacillus species are being evaluated for use as biopesticides and have gained prominence as biocontrol agents for plant diseases. Bacillus strains are the most promising group of plant growth-promoting rhizobacteria (PGPR). They promote plant growth through nitrogen fixation, phosphate solubilization, and the production of phytohormones, antioxidant enzymes, and volatile organic compounds (VOCs), and they trigger induced systemic resistance (ISR) by producing different types of secondary metabolites that can potentially inhibit the growth of plant pathogens and control soilborne diseases (Ongena and Jacques, 2008; Chowdhury et al., 2015; García-Fraile et al., 2015; Fan et al., 2018; Chandran et al., 2020; Samaras et al., 2021).
Bacillus atrophaeus is a significant member of the plant growth-promoting rhizobacteria (PGPR), which are known for enhancing plant growth and development as well as controlling plant pathogenic fungi and bacteria. When applied to seedlings and harvested fruits, B. atrophaeus strains stimulate plant growth and improve the resistance of plants against insect pests and plant diseases such as powdery mildew and tomato gray mold (Reva et al., 2013; Zhang et al., 2013; Huang et al., 2015). Bacterial secondary metabolites are important not only for the producer cells but also for their host, on which they have a positive impact. These secondary metabolites have significant applications in agriculture and pharmaceuticals as bioactive compounds (Bornscheuer, 2016). Advances in whole-genome sequencing technology and genome mining tools have enabled the detection of putative antimicrobial gene clusters, allowing researchers to uncover the molecular basis of the versatile lifestyles of strains and to prioritize industrially important secondary metabolites at the genomic level. These specialized metabolites represent a possible way to improve crop yield and to control plant diseases through their antimicrobial activities (Chun et al., 2017, 2019; Wang et al., 2020; Iqbal et al., 2021). For example, genome mining of B. atrophaeus L193 has revealed a non-ribosomal peptide synthetase gene cluster involved in the production of surfactin, fengycin, bacillomycin, and iturin (Rodríguez et al., 2018).
Genome analysis of B. atrophaeus GQJK17 revealed eight gene clusters that produce antimicrobial secondary metabolites such as surfactin, fengycin, bacillaene, and bacillibactin (Ma et al., 2018). Surfactin, fengycin, bacillomycin, iturin, bacillaene, and bacillibactin have been reported to have antimicrobial properties (Wang et al., 2024). Bacillibactin has been reported to inhibit the growth and invasion of Phytophthora capsici and Fusarium oxysporum (Woo and Kim, 2008; Yu et al., 2011). Surfactin was reported to have broad-spectrum antibacterial activity and to significantly inhibit bacterial diseases such as Arabidopsis root infection caused by Pseudomonas syringae and tomato wilt caused by Ralstonia solanacearum (Bais et al., 2004; Xiong et al., 2015), and fengycin exhibited antifungal activity against a broad spectrum of filamentous fungi (Vanittanakom et al., 1986).
In this study, we selected the B. atrophaeus HAB-5 strain, which was isolated from the cotton rhizosphere in Xinjiang Province, China. The HAB-5 strain has potential as an antifungal and antiviral biological agent. Previous studies conducted in our laboratory have shown that HAB-5 can provide high control efficacy against 22 plant pathogenic fungi such as Alternaria alternata MfAa-1, Alternaria brassicicola MfAb-  (Rajaofera et al., 2020). It also prevents disease infection, protects tobacco seedlings from P. nicotianae, and exhibits an antiviral effect against tobacco mosaic virus (TMV) by activating the signaling of regulatory genes (NPR1), defense genes (PR-1a, PR-1b, and chia5), and hypersensitive response-related genes (Hsr 203j and Hin1; Rajaofera et al., 2019). Similarly, HAB-5 was found to be effective against C. gloeosporioides through the release of antimicrobial volatile compounds such as octadecane; hexadecanoic acid, methyl ester; and chloroacetic acid, tetradecyl ester (Rajaofera et al., 2018). In addition, the HAB-5 strain exhibited a remarkable ability to improve the growth of tobacco plants: inoculated plants showed significant increases in fresh shoot weight, dry shoot weight, fresh root weight, and dry root weight of 76.47%, 80.58%, 71.71%, and 82.10%, respectively, compared with the non-inoculated control plants (Rajaofera et al., 2019).

Materials and methods
Bacillus atrophaeus HAB-5 strain selection and genomic DNA extraction
HAB-5 was isolated from Xinjiang Province, China (Rajaofera et al., 2019) and preserved in the Key Laboratory of Green Prevention and Control of Tropical Plant Disease and Pest at Hainan University, Ministry of Education. HAB-5 was cultivated in Luria-Bertani (LB) agar medium at 37°C with shaking at 180 rpm and was used to extract genomic DNA. Genomic DNA was extracted from the bacterial culture using a commercial kit according to the manufacturer's instructions (Sigma Aldrich, St. Louis, MO, United States). NanoDrop and Qubit (Thermo Fisher Scientific, United States) were used for optical density measurement and quality control.

Genome sequences of Bacillus atrophaeus HAB-5
The complete genome of HAB-5 was sequenced by third-generation sequencing on the PacBio RS II sequencing platform (Pacific Biosciences). DNA samples were sheared into fragments and treated with Exo VII to remove single-stranded ends, and size selection was performed to retain longer fragments (>10 kb) for sequencing. Blunt-end reactions were performed, and the SMRTbell template was annealed for sequencing. The large insert libraries were sequenced through single-molecule real-time (SMRT) sequencing, and the cells were run on the Pacific Biosciences RS II system using P6-C6 chemistry. After 180 min of data collection, all reads were spliced into contigs and combined into scaffolds.

Genome assembly, gene function annotation, and genome component prediction of Bacillus atrophaeus HAB-5
The PacBio reads were assembled into contigs using the de novo hierarchical genome assembly process (HGAP) software in the single-molecule real-time (SMRT) sequencing portal with default parameters (Chin et al., 2013, 2016). The assembly results were then corrected based on NGS data using Pilon software (Walker et al., 2014). Finally, the gaps between contigs were filled by comparing the contigs assembled from PacBio using MUMmer software (Delcher et al., 1999). The genome sequences were annotated by the National Center for Biotechnology Information (NCBI) using the Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP).
Functional description of putative protein-encoding genes was performed using BLASTx, with an E-value of 1e-5. We used GenoVi software for circular genome representations. The Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology assignment and prediction of KEGG pathways were performed as described by Kanehisa et al. (2004) to identify the components of cellular processes (CP), environmental, genetic, human disease, metabolism, and organismal system pathways. The predicted genes were annotated against the COG (Cluster of Orthologous Groups) database in accordance with Tatusov et al. (2003), and gene ontology (GO) annotation followed Bard and Winter (2000). Genome components were predicted using Glimmer, which is based on Markov models. tRNAscan-SE was used for tRNA recognition (Lowe and Eddy, 1997), RNAmmer for rRNA, and other types of RNA were predicted by comparison with the Rfam database.

The estimation of the core and pan genomes
To categorize the core and pan genomes, HAB-5 was analyzed using the Pan-Genomes Analysis Pipeline (PGAP) to identify core orthologs among the strains (Zhao et al., 2012). The size of the core genome was determined as the number of common genes shared by all analyzed genomes, and the pan-genome size was defined as the sum of all gene families. Species-specific core orthologous genes and strain-specific unique genes were also examined in the HAB-5 genome sequences. The OrthoMCL tool was used to identify core-specific genes in the genomes (Li et al., 2003). The gene accumulation curve was produced with the R package ggplot2 from the Roary results.
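
The core and pan genome definitions given above can be illustrated with simple set operations. The following is only a minimal sketch, assuming that each genome has already been reduced to a set of gene-family identifiers; the strain names and family labels are hypothetical toy values, and the sketch does not reproduce the PGAP, OrthoMCL, or Roary pipelines.

```python
from typing import Dict, Set, Tuple


def core_and_pan(families_by_genome: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan): families shared by all genomes, and all families."""
    family_sets = list(families_by_genome.values())
    if not family_sets:
        return set(), set()
    core = set.intersection(*family_sets)  # present in every genome
    pan = set.union(*family_sets)          # present in at least one genome
    return core, pan


def strain_specific(families_by_genome: Dict[str, Set[str]], strain: str) -> Set[str]:
    """Return families found in `strain` and in no other genome."""
    others = set().union(
        *(fams for name, fams in families_by_genome.items() if name != strain)
    )
    return families_by_genome[strain] - others


# Hypothetical toy input: three strains, five gene families.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
core, pan = core_and_pan(toy)
unique_a = strain_specific(toy, "strain_A")
print(len(core), len(pan), sorted(unique_a))
```

In this toy case, the core genome is the single family shared by all three strains, and the pan genome is the union of all five families.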

Gene family clustering of Bacillus atrophaeus HAB-5 and collinearity analysis
Hcluster_sg software was used to carry out gene family clustering (Maqbool and Babri, 2007), and MUSCLE software was used to align the sequences of each clustered gene family (Edgar, 2004). A BLASTp E-value threshold of 1e-5 was set to ensure the quality of the comparisons. Genome-wide collinearity among strains HAB-5, DSM7, SRCM101359, FZB42, HAB-2, and 168 was determined using BLASTp, with an E-value of ≤1e-5 and an identity threshold of ≥85% at both nucleic acid and amino acid levels. For the analysis of genome synteny and collinearity, D-GENIES and C-Sibelia software were used. Visualization of the alignment of the synteny blocks was achieved using Circos (Krzywinski et al., 2009; Minkin et al., 2013; Cabanettes and Klopp, 2018).
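
The E-value and identity thresholds described above can be applied to BLAST results as a simple filter. The following is a minimal sketch, assuming the hits are available in the standard 12-column tabular BLAST output (query, subject, percent identity, ..., E-value, bit score); the sequence identifiers and hit values are hypothetical, and this is not the authors' actual script.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity (column 3 of tabular output)
    evalue: float    # E-value (column 11 of tabular output)


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated BLAST tabular lines, skipping blanks and comments."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), float(fields[10])))
    return hits


def filter_hits(hits: Iterable[Hit], max_evalue: float = 1e-5,
                min_identity: float = 85.0) -> List[Hit]:
    """Keep hits with E-value <= max_evalue and identity >= min_identity."""
    return [h for h in hits if h.evalue <= max_evalue and h.identity >= min_identity]


# Hypothetical example lines (identifiers and values are illustrative only).
example = [
    "geneA\tref1\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t550",
    "geneB\tref2\t72.0\t250\t70\t0\t1\t250\t1\t250\t1e-40\t160",
    "geneC\tref3\t90.0\t40\t4\t0\t1\t40\t1\t40\t0.5\t25",
]
kept = filter_hits(parse_hits(example))
print([h.query for h in kept])
```

Only the first hit passes both thresholds: the second fails the identity cutoff and the third fails the E-value cutoff.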

Phylogenetic trees and heat map synteny
All genomes used in this study were downloaded in FASTA format from the NCBI database to construct neighbor-joining phylogenetic trees based on 16S rRNA. Molecular Evolutionary Genetics Analysis (MEGA) was used to construct neighbor-joining phylogenetic trees (Tamura et al., 2011) with the p-distance model and 1,000 bootstraps. A subset of SNPs identified in all single-copy genes was analyzed. A phylogenetic tree of SNPs was constructed for HAB-5 against the reference genomes using TreeBeST (Nandi et al., 2010) with 1,000 bootstraps. The average nucleotide identity analysis was performed using JSpecies 1.2.1 (Richter et al., 2016). CIMminer was used to generate heat maps based on the average nucleotide identity values, and pairwise genome alignment for synteny was performed using Mauve version 2.4.0 (Darling et al., 2004).
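
The p-distance used for the neighbor-joining trees is the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition only; it assumes two pre-aligned sequences of equal length and skips any column containing a gap, which is a simplifying assumption here and not necessarily the gap treatment applied in MEGA for this study.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are ignored (assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments (hypothetical sequences).
d = p_distance("ACGTACGT", "ACGTACGA")
print(d)
```

For the toy pair, one of eight compared sites differs, giving a p-distance of 0.125.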

Genome mining analysis of secondary metabolite gene clusters
The web-based tool antiSMASH 7.0 (Antibiotics and Secondary Metabolites Analysis SHell) was used to predict the biosynthetic gene clusters of secondary metabolites in HAB-5 (Blin et al., 2023). antiSMASH is available at http://antismash.secondarymetabolites.org/. The assembled sequences were uploaded to antiSMASH, which reported both known and unknown clusters, identified similar clusters by genome comparison with detailed NRPS functional annotation, and generated the predicted chemical structure of the cluster products (Medema et al., 2011; Blin et al., 2013; Weber et al., 2015). The Roary pan genome pipeline was used to identify gene cluster homologies, and gene cluster synteny maps were produced using the R package genoPlotR (Guy et al., 2010; Page et al., 2015).

Genome sequences and features of Bacillus atrophaeus HAB-5
In the present study, complete genome sequencing of HAB-5 was performed using third-generation Pacific Biosciences (PacBio) single-molecule real-time (SMRT) sequencing technology. A total of 1,314 Mb of raw data were collected, and 1,182 Mb of data were assembled. The whole genome was distributed on a 4,083,597-bp circular chromosome with an average GC content of 44.2%. The strain contains 599 protein-coding gene CDSs, including 59 telomer restriction fragment, 38 minisatellite DNA, 2 microsatellite DNA, 82 tRNA, 8 rRNA, 29 sRNA, 1 prophage, and 1 CRISPR domain. The distribution of genes in the COG functional categories is presented in Figure 1A, and additional information about the genome statistics is presented in Table 1.
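
The GC content reported above is the percentage of G and C bases among the called bases of the chromosome. A minimal sketch of that calculation follows; the toy sequences are hypothetical, and the decision to exclude ambiguous bases such as N from the denominator is an assumption made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


# Toy sequence (hypothetical); a real run would read the assembled chromosome.
print(gc_content("GGCCAATT"))
```

Applied to the full 4,083,597-bp chromosome sequence, the same calculation would be expected to give the value reported in the text.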

Phylogenetics of Bacillus atrophaeus HAB-5
To determine the level of differences between strains HAB-5, DSM7, SRCM101359, 168, FZB42, and HAB-2, the 16S rRNA sequence of strain HAB-5 and other related sequences were obtained from NCBI for the construction of the phylogenetic tree using MEGA 5.0. In the 16S rRNA tree, HAB-5 was most closely related to SRCM101359 (Figure 2A). Additionally, a whole-genome-based phylogeny was constructed, which showed that the HAB-5 genome is closely linked to SRCM101359. In contrast, the other strains, 168, DSM7, FZB42, and HAB-2, fell on distinct branches and were more distantly related to HAB-5 (Figure 2B).

Dispensable gene heat map
The average nucleotide identity (ANI) was examined using the dispensable gene heat map. Strain HAB-5 was most closely related to strain SRCM101359, with which it had the highest ANI value, whereas 168, DSM7, FZB42, and HAB-2 showed the least similarity with HAB-5 (Figure 4). Together with a genome-to-genome distance calculator, average nucleotide identity has become a potent genome-based criterion for identifying species. It can reveal which genomes need to have their taxonomic and evolutionary positions altered or reclassified.

Structural distinction and collinearity analysis
We performed a collinearity analysis to compare the genomic similarities of HAB-5 with other strains (DSM7, SRCM101359, 168, FZB42, and HAB-2). The results showed that the HAB-5 genome demonstrated different synteny to 168 (Supplementary Figures S1A,B), followed by FZB42 (Supplementary Figures S1C,D), DSM7 (Supplementary Figures S1E,F), HAB-2 (Supplementary Figures S1G,H), and SRCM101359 (Supplementary Figures S1I,J). HAB-5 showed the highest levels of nucleotide and amino acid synteny with the SRCM101359 genome, indicating that these two strains are evolutionarily the closest and that their genomes are the most related.

Genetic basis for the plant growth-promoting activity of Bacillus atrophaeus HAB-5
Beneficial rhizobacteria influence plant growth by affecting nutrient uptake. Many genes associated with plant growth promotion were identified in HAB-5. IAA is a significant phytohormone that regulates plant cell growth and tissue differentiation; nine genes related to IAA biosynthesis were identified in strain HAB-5 (Supplementary Table S1). Iron, sulfur, phosphorus, and nitrogen are necessary for the growth and development of plants. The genes nif3-like, gltP, gltX, glnR, glnA, nadR, nirB, nirD, nasD, narI, narH, narJ, and narK were predicted to be involved in nitrogen metabolism and fixation (Supplementary Table S2). A number of other genes involved in iron transportation (fbpA, fetB, feuC, feuB, feuA, and fecD; Supplementary Table S3), phosphate solubilization (ispH, pstA, pstC, pstS, pstB, gltP, and phoH; Supplementary Table S4), and sulfur metabolism (cysC, sat, cysK, cysS, and sulP; Supplementary Table S5) were found in the HAB-5 genome. In the present research, we also identified two genes involved in acetoin utilization (acuB and acuC) and three genes of the acetoin dehydrogenase pathway (acoA, acoB, and acoR; Supplementary Table S6).

Discussion
Genome sequencing was performed to ascertain the molecular basis of the mechanisms underlying the promotion of plant growth and the biocontrol capabilities of B. atrophaeus HAB-5, and a comparative genome analysis was conducted with other Bacillus strains. The genomic analysis showed that 1,182 Mb of clean data were generated and that the HAB-5 genome contained a circular chromosome with a size of 4,083,597 bp and a GC content of 44.21%. Additionally, the genome did not contain any plasmids. A comparative analysis showed that the genome size of HAB-5 (4,083,597 bp) was larger than that of DSM7 (3,980,199 bp), FZB42 (3,918,596 bp), and HAB-2 (3,894,648 bp), but it was still smaller than that of 168 (4,215,606 bp) and SRCM101359 (4,180,819 bp). The GC content of HAB-5 (44.21%) was higher than that of 168 (43.1%) and SRCM101359 (43%), but it was lower than that of FZB42 (46.6%), HAB-2 (46.6%), and DSM7 (46.1%). To determine the relationship between HAB-5 and other strains, a phylogenetic tree based on the 16S rRNA gene sequence and a phylogenomic tree were constructed. The results showed that HAB-5 is closely related to SRCM101359, with which it showed the highest similarity.
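
The size and GC comparisons stated above can be checked with a few lines of code. This is only an illustrative sketch: the genome sizes and GC percentages are the values quoted in this section, while the function names are hypothetical and introduced here for convenience.

```python
# Genome size (bp) and GC content (%) as quoted in the comparison above.
GENOMES = {
    "HAB-5": (4_083_597, 44.21),
    "DSM7": (3_980_199, 46.1),
    "FZB42": (3_918_596, 46.6),
    "HAB-2": (3_894_648, 46.6),
    "168": (4_215_606, 43.1),
    "SRCM101359": (4_180_819, 43.0),
}


def rank_by_size(genomes):
    """Return strain names ordered from smallest to largest genome."""
    return sorted(genomes, key=lambda name: genomes[name][0])


def compare_to(genomes, reference):
    """Return {strain: (size difference in bp, GC difference in %)} vs. reference."""
    ref_size, ref_gc = genomes[reference]
    return {
        name: (size - ref_size, round(gc - ref_gc, 2))
        for name, (size, gc) in genomes.items()
        if name != reference
    }


ranking = rank_by_size(GENOMES)
differences = compare_to(GENOMES, "HAB-5")
smaller = sorted(name for name, (dsize, _) in differences.items() if dsize < 0)
higher_gc = sorted(name for name, (_, dgc) in differences.items() if dgc > 0)
print(ranking)
print(smaller, higher_gc)
```

The strains with smaller genomes than HAB-5 are the same three strains that have a higher GC content, which matches the comparison described in the text.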
In previous studies, many plant growth-promoting bacteria were analyzed at the whole-genome level to gain an in-depth understanding of PGP mechanisms in bacteria such as Pseudomonas aeruginosa B18 (Singh et al., 2021), B. velezensis HAB-2 (Xu et al., 2020), B. megaterium BM89 and B. subtilis BS87 (Chandra et al., 2021), K. variicola UC4115 (Guerrieri et al., 2021), and Streptomyces (Subramaniam et al., 2020). Numerous genes involved in biocontrol and growth-promoting activities in plants have been identified by whole-genome sequencing. The presence of antimicrobial genes was also revealed by an analysis of the B. atrophaeus genome for the presence of secondary metabolites. Bacillus atrophaeus strains are remarkably capable of producing secondary metabolites with antimicrobial activity, such as terpenes, polyketides, and non-ribosomally synthesized peptides (NRPs; Liu et al., 2012; Chan et al., 2013; Ma et al., 2018). In a previous study, genomic research revealed that B. atrophaeus L193 carries a cluster of genes known as non-ribosomal peptide synthetases. These genes include fenC, srfA-A, bmyB, and ituD, which are involved in the production of surfactin, fengycin, bacillomycin, and iturin (Rodríguez et al., 2018). Eight gene clusters that produce antimicrobial secondary metabolites, such as surfactin, bacillaene, fengycin, and bacillibactin, were found in the genome study of B. atrophaeus GQJK17 (Ma et al., 2018).
In the present study, whole-genome sequencing of HAB-5 identified the genes encoding novel antimicrobial peptides associated with its biocontrol properties. Eleven gene clusters were predicted in the genome of HAB-5: three NRPS (non-ribosomal peptide synthetase) clusters, two transAT-PKS clusters, two terpene clusters, one lanthipeptide cluster, one T3PKS cluster, one RiPP cluster, and one thiopeptide cluster. Together, these clusters are predicted to synthesize bacillaene, fengycin, bacillibactin, surfactin, and rhizocticin A. These secondary metabolites show antifungal and antibacterial activities against plant pathogens. According to Ongena and Jacques (2008), surfactin possesses antibacterial and antifungal properties, and fengycin and rhizocticin exhibit antifungal properties (Kugler et al., 1990). Bacillaene exhibits antimicrobial activity against many types of plant-harmful bacteria and fungi (Patel et al., 1995; Um et al., 2013; Müller et al., 2014). Two of the HAB-5 secondary metabolite gene clusters may produce terpenes; however, the products of four other clusters are as yet unknown. Terpenes are a large and diverse group of naturally occurring organic compounds that are present in bacteria, fungi, plants, and animals. They have a variety of medicinal uses and can be added to food and cosmetic products. They also have antifungal and anticarcinogenic characteristics (Zhao et al., 2016). Additionally, they play a significant role in defending numerous plant, animal, and microbe species against infections and insects and against abiotic and biotic stressors, as well as in transmitting messages to conspecifics and mutualists regarding the existence of food, partners, and adversaries (Gershenzon and Dudareva, 2007). Furthermore, the HAB-5 genome showed the capacity to produce bacillibactin, a type of siderophore that is characterized by short peptide molecules with functional groups and a side chain that can provide a set of ligands to coordinate ferric ions (Crosa and Walsh, 2002). Bacillibactin is a strong siderophore that increases the absorption of ferric ions from soil for plant growth and the secretion of volatile compounds (Nithyapriya et al., 2021). Furthermore, the bacillibactin gene cluster of strain HAB-5 also contains other genes that promote plant growth and are associated with useful substances such as butanone, protease, phytase, and phosphatase (Ma et al., 2018; Rajaofera et al., 2020). Among the metabolites, VOCs have gained great attention for their potential in the control of plant pathogens. It has been reported that strain HAB-5 produces a variety of VOCs that have strong antifungal effects, inhibiting the growth of C. gloeosporioides (Rajaofera et al., 2019). In addition, genes associated with the production of volatile compounds, such as acoA, acoB, acoR, acuB, and acuC, were detected in HAB-5. Acetoin, one of the active bacterial volatile compounds, has been reported to stimulate the induced systemic resistance (ISR) of plants (Zhang et al., 2015).

Conclusion
Our findings show that whole-genome sequencing of B. atrophaeus HAB-5 generated a 4,083,597-bp genome. A comparative genomic analysis of the HAB-5 strain with other Bacillus strains revealed its genome similarity to SRCM101359. Through genome mining, HAB-5 was found to harbor several gene clusters for antimicrobial secondary metabolites that contribute to its biocontrol activities, as well as multiple genes related to IAA phytohormone biosynthesis, iron transport, sulfur metabolism, phosphate solubilization, and nitrogen fixation. These results will contribute to in-depth research on plant growth promotion and biocontrol mechanisms.

Table fragment: ycxC, ycxD, sfp, yczE, yckI, yckJ, yciC, yx01, yckC, yckD, yckE, nin, hxlA, hxlB, hxlR, xy02, srfAA, srfAB, comS, srfAC, and srfAD; antibiotic; induction of ISR; fungi.

FIGURE 3 (A) Pan gene and core gene dilution curve, (B) conserved and specific gene counts (every ellipse represents a strain, and the numbers in the ellipses are specific genes; the white circle represents conserved genes among the six Bacillus strains), and (C) number of orthologs (unique genes, family number, unclustered genes, clustered genes, and gene number).

FIGURE 4 Dispensable gene heat map of the average nucleotide identity (ANI) values of the whole genome of the strain Bacillus atrophaeus HAB-5 and five other Bacillus strains.

TABLE 1
The general genome features of Bacillus atrophaeus HAB-5.

TABLE 2
The potential secondary metabolite gene clusters in B. atrophaeus HAB-5.