Introduction
Cyanobacteria comprise one of the oldest and most diverse phyla in the domain Bacteria and are recognized for their importance in the evolution of the biosphere. Members of this phylum are found in a wide variety of environments, reflecting their photosynthetic ability, adaptability to various environmental conditions, and diversified metabolism. Such characteristics make cyanobacteria one of the preferred targets for research on bioactive compounds and new enzymes (Schirrmeister et al., 2011; Dittmann et al., 2015).
Pantanalinema was described as a new genus of the Leptolyngbyaceae cyanobacterial family by a polyphasic approach, which included morphological characteristics, 16S rRNA gene phylogeny, 16S-23S ITS rRNA secondary structures, and physiological characteristics such as adaptability to pH variations (Vaz et al., 2015).
This genus has been described only in Brazilian biomes such as the Pantanal and the Amazon, the first isolates being found in a lake. These Pantanalinema isolates were characterized by their ability to grow over a wide pH range (pH 4 to 11) as well as to modify the culture medium pH around neutrality (pH 6 to 7.4). Due to these characteristics, it is thought that this genus can occupy a variety of ecological niches, such as alkaline or slightly acidic water bodies (Vaz et al., 2015; Genuário et al., 2017). Taxonomic classification of Pantanalinema isolates requires the use of molecular markers as this genus is morphologically very similar to the recently described genus Amazoninema, which, in turn, has comparable morphology to other genera of the Leptolyngbyaceae family (Genuário et al., 2018).
In this work, we report the genome sequence of a new Pantanalinema strain, named GBBB05, which was isolated from the Brazilian Cerrado biome. This is the first genome assembly for the Pantanalinema genus, which, along with the analyses provided here, is expected to enhance our understanding of this genus's metabolic potential.
Value of Data
Pantanalinema is a recently described genus of cyanobacteria. Here we describe the first genome of a strain in this genus. The high-quality draft genome assembled from an environmental culture shows the feasibility of this method, especially for cyanobacteria that are underrepresented in metagenomic samples and difficult to obtain as axenic cultures. The reported genome will allow further understanding of this species' biology.
Results
Total DNA isolated from a non-axenic unialgal culture of strain GBBB05 was sequenced using the MiSeq-Illumina platform. The sequencing generated 8,728,802 reads. After quality control, de novo assembly and binning of 5,745,014 quality-filtered PE reads allowed the recovery of eight bacterial genomes with completeness varying from 50 to 99.5% (Supplementary Table 1). Based on the initial taxonomic placement using GTDB-Tk (Chaumeil et al., 2020) and the module “classify_bins” from metaWRAP (Uritskiy et al., 2018) (Supplementary Table 2), one of the recovered genomes (GBBB05) belongs to the cyanobacteria (Leptolyngbyaceae family), while the other seven are from heterotrophic bacteria.
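As an illustration of how recovered bins can be sorted by quality, the following minimal sketch assigns a coarse tier from completeness and contamination alone. The thresholds follow the widely used MIMAG convention and are an assumption here, since the text does not state which cut-offs were applied. Only the GBBB05 values come from this report; the other two bins are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    name: str
    completeness: float   # percent, e.g. as reported by CheckM
    contamination: float  # percent, e.g. as reported by CheckM


def quality_tier(b: Bin) -> str:
    """Coarse tier from completeness and contamination only.

    Assumed thresholds (MIMAG-style): high > 90% complete and < 5%
    contaminated; medium >= 50% complete and < 10% contaminated.
    The full MIMAG high-quality definition also requires rRNA and
    tRNA genes, which this sketch does not check.
    """
    if b.completeness > 90.0 and b.contamination < 5.0:
        return "high"
    if b.completeness >= 50.0 and b.contamination < 10.0:
        return "medium"
    return "low"


bins = [
    Bin("GBBB05", 99.05, 0.4),             # values reported in the text
    Bin("hypothetical_bin_2", 50.0, 3.0),  # placeholder, not real data
    Bin("hypothetical_bin_3", 42.0, 12.0), # placeholder, not real data
]

tiers = {b.name: quality_tier(b) for b in bins}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

Under these assumed thresholds, the completeness and contamination reported for GBBB05 place it in the highest tier.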
The cyanobacterial genome of the GBBB05 strain was assembled into 94 contigs and has an estimated size of 7,181,771 bp and a GC content of 48.43% (Table 1). The genome showed high completeness (99.05%) and low contamination (0.4%). Selected features of the GBBB05 genome are presented in Table 1 and Supplementary Table 1.
Table 1
| Features | Chromosome |
|---|---|
| Strain | Pantanalinema GBBB05 |
| Number of contigs | 94 |
| L50 value | 17 |
| N50 value (bp) | 142,797 |
| Completeness | 99.05% |
| Contamination | 0.4% |
| Sequencing coverage | 18x |
| GC content | 48.43% |
| Estimated chromosome size (bp) | 7,181,771 |
| Protein-coding genes (CDS)¹ | 5,976 |
| rRNAs¹ | 6 |
| 5S rRNAs | 2 |
| 16S rRNAs | 2 |
| 23S rRNAs | 2 |
| tRNAs | 106 |
| ncRNAs | 4 |
| Pseudogenes | 152 |
| CRISPR²* | 16 sequences |
| CAS²* | 1 sequence |
| Phage³* | 1 |
Genome features of Pantanalinema GBBB05. Genome statistics were obtained through CheckM.
¹ Prokaryotic Genome Annotation Pipeline (PGAP).
² CRISPRCasFinder.
³ PHAST.
*Complete results from analysis with CRISPRCasFinder and PHAST are shown in Supplementary Tables 2 and 3.
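The contiguity and composition statistics in Table 1 (N50, L50, GC content) have simple definitions that can be sketched in a few lines. The example below is a minimal illustration on toy data, not the procedure used to produce the reported values; the function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total of
    contigs sorted from longest to shortest first reaches half of the
    assembly size; L50 is the number of contigs needed to get there.
    """
    ordered: List[int] = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable for non-empty input")


def gc_percent(contigs: Iterable[str]) -> float:
    """Percentage of G and C bases over all bases in the contigs."""
    gc = 0
    total = 0
    for seq in contigs:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        total += len(upper)
    if total == 0:
        raise ValueError("no bases found")
    return 100.0 * gc / total


# Toy data: six contigs with a total length of 290.
toy_lengths = [80, 70, 50, 40, 30, 20]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)                      # 70 2
print(gc_percent(["GGCC", "ATAT"]))  # 50.0
```

In the toy case, the two longest contigs (80 + 70 = 150) already cover half of the 290 total, so N50 is 70 and L50 is 2.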
Based on the morphology observed in culture, the GBBB05 strain could be assigned to either the Pantanalinema or the Amazoninema genus, neither of which currently has an available genome sequence. In silico DDH values with reference genomes from the Leptolyngbyaceae family ranged from 18.4 to 22.8% (Supplementary Table 5), well below the 70% threshold commonly used for species delineation. Nevertheless, phylogenomic analysis grouped GBBB05 with one representative genome of the genus Leptolyngbya with 100% bootstrap support (Figure 1A). Phylogenetic reconstruction using Bayesian inference of partial 16S rRNA sequences from Leptolyngbyaceae shows that GBBB05 belongs to the Pantanalinema genus (Figure 1C).
Figure 1
Pan-genome analysis revealed that the Pantanalinema sp. GBBB05 shares a set of 1,624 core genes (~28% CDSs) with the reference strains Leptolyngbya PCC 6306, Leptolyngbya JSC 12, and Geitlerinema PCC 7407 (Figure 1B and Supplementary Table 6).
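The notion of a core gene set can be made concrete with a small sketch: once genes have been grouped into clusters of orthologs, the core is the set of clusters with at least one member in every genome. The code below assumes such a presence map is already available and uses invented cluster names and toy data; it does not reproduce the clustering performed by BPGA.

```python
from typing import Dict, Set


def core_clusters(presence: Dict[str, Set[str]], genomes: Set[str]) -> Set[str]:
    """Return the IDs of clusters that contain every genome in `genomes`."""
    return {cid for cid, members in presence.items() if genomes <= members}


# Genome labels taken from the text; cluster IDs and memberships are
# hypothetical toy data.
GENOMES = {"GBBB05", "PCC_6306", "JSC_12", "PCC_7407"}
presence = {
    "cluster_A": {"GBBB05", "PCC_6306", "JSC_12", "PCC_7407"},
    "cluster_B": {"GBBB05", "PCC_6306"},
    "cluster_C": {"GBBB05"},
    "cluster_D": {"PCC_6306", "JSC_12", "PCC_7407", "GBBB05"},
}

core = core_clusters(presence, GENOMES)

# Share of GBBB05 clusters that belong to the core in this toy example.
gbbb05_clusters = [cid for cid, m in presence.items() if "GBBB05" in m]
core_fraction = len(core) / len(gbbb05_clusters)

print(sorted(core))   # ['cluster_A', 'cluster_D']
print(core_fraction)  # 0.5
```

Clusters found in only some genomes (accessory) or in a single genome (unique) fall out of the same presence map by changing the membership test.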
The functional prediction performed by the RAST annotation found about 25 functional categories (Figure 1D). Among all the categories, “Carbohydrates” with 176, “Amino Acids and Derivatives” with 169, “Cofactors, Vitamins, Prosthetic Groups, Pigments” with 143, and “Protein metabolism” with 135 genes were the largest groups. In the category “Secondary metabolites,” five genes related to plant alkaloids and plant hormones (auxins) were identified. Six genes related to metal (cadmium, cobalt, mercury, and zinc) resistance and two related to fluoroquinolone resistance were also found.
Analyses using antiSMASH resulted in the prediction of 17 biosynthetic gene clusters (BGCs), which included the following: 5 clusters for terpene production; 4 bacteriocin clusters; 1 cluster for betalactone biosynthesis; 1 cluster for resorcinol production; 1 mixed module of polyketide synthase (PKS) type 1 and non-ribosomal peptide synthetase (NRPS) related to nostophycin; and 6 clusters containing NRPS biosynthetic pathways (Supplementary Table 7).
Analysis in the NaPDoS server, using the default settings, predicted 11 CDSs related to metabolite production in C domain pathways and 4 CDSs related to metabolite production in KS domain pathways (Supplementary Tables 8, 9).
The PRISM4 analysis identified seven gene clusters: five non-ribosomal peptide clusters, one mixed module of polyketide synthase (PKS) type 1 and non-ribosomal peptide synthetase (NRPS), and one prochlorosin cluster (Supplementary Table 10).
Materials and Methods
Growth Conditions and Genomic DNA Isolation
The strain GBBB05 was isolated from a waterfall in the outside border of the Chapada das Mesas National Park, Carolina County, Maranhão State, Brazil (S07°02.6575 / W047°30.4508). Aerobic cultivation was performed in the BG-11 medium (Stanier and Cohen-Bazire, 1977) under the illumination of 3–15 μmol photons m⁻² s⁻¹ for 4 weeks at 28°C. Non-axenic unialgal culture was obtained by the spread plate method and serial dilutions, and each step was monitored with an optical microscope. Total DNA was extracted from 25 ml of stationary phase non-axenic unialgal culture using the PowerPlant kit (MoBio, California, USA) according to the manufacturer's instructions and stored at −20°C. DNA purity and concentration were evaluated by the absorbance at 260 and 280 nm on a NanoDrop ND-2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). DNA concentration was further quantified with a Quant-iT Picogreen dsDNA assay kit (Thermo Fisher Scientific, Waltham, MA, USA).
Genome Sequencing, Assembly, and Taxonomic Classification
The sequencing library was prepared with 30 ng of total DNA using the Illumina Nextera DNA library preparation kit (Illumina, Inc.) and sequenced on the MiSeq-Illumina platform (Illumina, Inc., San Diego, USA) using the MiSeq Reagent kit v2 (500-cycle format). Paired-end (PE) reads were quality-filtered and trimmed (PHRED quality score ≥ 30) with the Read_qc module from metaWRAP (Uritskiy et al., 2018). De novo assembly was performed with the metaWRAP Assembly module using metaSPAdes v. 3.13 (Nurk et al., 2017). Assembled contigs were subjected to three different binning rounds using CONCOCT (Alneberg et al., 2014), MaxBin2 (Wu et al., 2016), and MetaBAT2 (Kang et al., 2019). The results were compared using the Bin_refinement module, and the highest-quality bins were selected. The genome statistics were obtained through CheckM (Parks et al., 2015). Digital DNA–DNA hybridization (DDH) values were calculated using the Genome-To-Genome Distance Calculator (GGDC) server (Meier-Kolthoff et al., 2013, 2014). GTDB-Tk (Chaumeil et al., 2020) and Classify_bins from metaWRAP (Uritskiy et al., 2018) were used for genome classification (Supplementary Table 1). All parameters were kept at default values.
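To make the PHRED ≥ 30 criterion concrete: a Phred score Q corresponds to a base-call error probability of 10^(−Q/10), so Q30 means roughly one error in 1,000 bases. The sketch below decodes a FASTQ quality string and applies a simple mean-quality filter. It assumes the Phred+33 encoding typical of current Illumina output, and the mean-quality rule is an illustrative choice; it is not the trimming logic implemented in the metaWRAP Read_qc module.

```python
from typing import List


def phred_to_error(q: float) -> float:
    """Error probability for a Phred quality score: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def decode_phred33(quality: str) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(ch) - 33 for ch in quality]


def passes_mean_quality(quality: str, min_q: int = 30) -> bool:
    """Keep a read if its mean Phred score is at least `min_q`.

    This is a simplified, hypothetical filter for illustration only.
    """
    scores = decode_phred33(quality)
    if not scores:
        return False
    return sum(scores) / len(scores) >= min_q


print(phred_to_error(30))           # about 0.001
print(decode_phred33("I?5"))        # [40, 30, 20]
print(passes_mean_quality("IIII"))  # True  (mean Q40)
print(passes_mean_quality("5555"))  # False (mean Q20)
```

Production tools usually trim low-quality tails base by base rather than discarding reads on a mean score, which is one reason the read counts before and after filtering cannot be reproduced from a rule this simple.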
Phylogenetic Analysis
A phylogenomic tree of strain GBBB05 and closely related genomes was constructed using the Species Tree App present in KBase (Arkin et al., 2018). Further phylogenetic analysis was performed using the GBBB05 16S rRNA partial sequence retrieved from its genome assembly plus 89 partial 16S rRNA sequences from various strains of the Leptolyngbyaceae family and Gloeobacter genus, retrieved from the NCBI. The nucleotide substitution model used was SYM+G, determined by PAUP 4.0b10 (see Footnote 1) and MrModeltest 2 (Nylander, 2004). The tree was constructed using MrBayes 3.2.6 (Ronquist et al., 2012), running for 10⁷ generations and sampling every 100th generation. TRACER 1.6 (Rambaut et al., 2018) was used to check the performance of tree construction with all parameters at default settings.
Genome Annotation and Functional Analyses
The genome assembly was submitted to the Prokaryotic Genome Annotation Pipeline (PGAP) (Tatusova et al., 2016) at the National Center for Biotechnology Information (NCBI). The CRISPRCasFinder server (Couvin et al., 2018) was used to predict CRISPR sequences and CAS genes, and the PHAST web server (Zhou et al., 2011) was used to identify phage-related sequences (Table 1). Pan-genome analysis was performed using the Bacterial Pan Genome Analysis Tool (BPGA) (Chaudhari et al., 2016), with default parameters. To predict gene clusters related to secondary metabolite production, the online servers antiSMASH 5.0 (Blin et al., 2019) and PRISM4 (Skinnider et al., 2017) were used with all their options enabled. For the detection of C domains and KS domains, the NaPDoS online server was used with its default configuration (Ziemert et al., 2012). Prediction of functional categories was performed using RAST server annotation at default settings (Aziz et al., 2008; Overbeek et al., 2014).
Statements
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: NCBI [accession: PRJNA560270].
Author contributions
HD, AS, JS, KS, DF, and LD conceived and supervised this study. RS, MO, and PM collected and established the enriched cultures of Pantanalinema. LF, AB, IR, PS, RR, CM, and AL performed sequencing and bioinformatic analysis. LF, AB, HD, AS, JS, EG, DF, and LD wrote and revised the manuscript. All authors read and approved the final manuscript.
Acknowledgments
LF, IR, AS, and JS acknowledge the fellowship support received from CNPq. LD acknowledges all the support received from the Marie Curie Alumni Association. The authors thank Dr. Layla F. Martins from the Center for Advanced Technologies in Genomics, Instituto de Química/USP, for technical assistance on Illumina sequencing.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2021.639852/full#supplementary-material
Footnotes
1. Available at: http://phylosolutions.com/paup-test/.
References
Alneberg, J., Bjarnason, B. S., de Bruijn, I., Schirmer, M., Quick, J., Ijaz, U. Z., et al. (2014). Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146. doi: 10.1038/nmeth.3103
Arkin, A. P., Cottingham, R. W., Henry, C. S., Harris, N. L., Stevens, R. L., Maslov, S., et al. (2018). KBase: the United States department of energy systems biology knowledgebase. Nat. Biotechnol. 36, 566–569. doi: 10.1038/nbt.4163
Aziz, R. K., Bartels, D., Best, A., DeJongh, M., Disz, T., Edwards, R. A., et al. (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9:75. doi: 10.1186/1471-2164-9-75
Blin, K., Shaw, S., Steinke, K., Villebro, R., Ziemert, N., Lee, S. Y., et al. (2019). AntiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 47, W81–W87. doi: 10.1093/nar/gkz310
Chaudhari, N. M., Gupta, V. K., and Dutta, C. (2016). BPGA-an ultra-fast pan-genome analysis pipeline. Sci. Rep. 6, 1–10. doi: 10.1038/srep24373
Chaumeil, P. A., Mussig, A. J., Hugenholtz, P., and Parks, D. H. (2020). GTDB-Tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics 36, 1925–1927. doi: 10.1093/bioinformatics/btz848
Couvin, D., Bernheim, A., Toffano-Nioche, C., Touchon, M., Michalik, J., Néron, B., et al. (2018). CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins. Nucleic Acids Res. 46, W246–W251. doi: 10.1093/nar/gky425
Dittmann, E., Gugger, M., Sivonen, K., and Fewer, D. P. (2015). Natural product biosynthetic diversity and comparative genomics of the cyanobacteria. Trends Microbiol. 23, 642–652. doi: 10.1016/j.tim.2015.07.008
Genuário, D. B., De Souza, W. R., Monteiro, R. T. R., Sant'Anna, C. L., and Melo, I. S. (2018). Amazoninema gen. nov., (Synechococcales, Pseudanabaenaceae) a novel cyanobacteria genus from Brazilian Amazonian rivers. Int. J. Syst. Evol. Microbiol. 68, 2249–2257. doi: 10.1099/ijsem.0.002821
Genuário, D. B., Vaz, M. G. M. V., and Melo, I. S. de (2017). Phylogenetic insights into the diversity of homocytous cyanobacteria from Amazonian rivers. Mol. Phylogenet. Evol. 116, 120–135. doi: 10.1016/j.ympev.2017.08.010
Kang, D. D., Li, F., Kirton, E., Thomas, A., Egan, R., An, H., et al. (2019). MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ 2019, 1–13. doi: 10.7717/peerj.7359
Meier-Kolthoff, J. P., Auch, A. F., Klenk, H. P., and Göker, M. (2013). Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics 14:60. doi: 10.1186/1471-2105-14-60
Meier-Kolthoff, J. P., Klenk, H. P., and Göker, M. (2014). Taxonomic use of DNA G+C content and DNA-DNA hybridization in the genomic age. Int. J. Syst. Evol. Microbiol. 64, 352–356. doi: 10.1099/ijs.0.056994-0
Nurk, S., Meleshko, D., Korobeynikov, A., and Pevzner, P. A. (2017). MetaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834. doi: 10.1101/gr.213959.116
Nylander, J. (2004). MrModeltest V2. Program Distributed by the Author. Uppsala: Uppsala University.
Overbeek, R., Olson, R., Pusch, G. D., Olsen, G. J., Davis, J. J., Disz, T., et al. (2014). The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 42, 206–214. doi: 10.1093/nar/gkt1226
Parks, D. H., Imelfort, M., Skennerton, C. T., Hugenholtz, P., and Tyson, G. W. (2015). CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055. doi: 10.1101/gr.186072.114
Rambaut, A., Drummond, A. J., Xie, D., Baele, G., and Suchard, M. A. (2018). Posterior summarisation in Bayesian phylogenetics using Tracer 1.7. Syst. Biol. syy032. doi: 10.1093/sysbio/syy032
Ronquist, F., Teslenko, M., Van Der Mark, P., Ayres, D. L., Darling, A., Höhna, S., et al. (2012). Mrbayes 3.2: efficient bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539–542. doi: 10.1093/sysbio/sys029
Schirrmeister, B. E., Antonelli, A., and Bagheri, H. C. (2011). The origin of multicellularity in cyanobacteria. BMC Evol. Biol. 11:45. doi: 10.1186/1471-2148-11-45
Skinnider, M. A., Merwin, N. J., Johnston, C. W., and Magarvey, N. A. (2017). PRISM 3: Expanded prediction of natural product chemical structures from microbial genomes. Nucleic Acids Res. 45, W49–W54. doi: 10.1093/nar/gkx320
Stanier, R. Y., and Cohen-Bazire, G. (1977). Phototrophic prokaryotes: the cyanobacteria. Ann. Rev. Microbiol. 31, 225–274. doi: 10.1146/annurev.mi.31.100177.001301
Tatusova, T., Dicuccio, M., Badretdin, A., Chetvernin, V., Nawrocki, E. P., Zaslavsky, L., et al. (2016). NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res. 44, 6614–6624. doi: 10.1093/nar/gkw569
Uritskiy, G. V., Diruggiero, J., and Taylor, J. (2018). MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis. Microbiome 6, 1–13. doi: 10.1186/s40168-018-0541-1
Vaz, M. G. M. V., Genuário, D. B., Andreote, A. P. D., Malone, C. F. S., Sant'Anna, C. L., Barbiero, L., et al. (2015). Pantanalinema gen. nov. and Alkalinema gen. nov.: novel pseudanabaenacean genera (Cyanobacteria) isolated from saline–alkaline lakes. Int. J. Syst. Evol. Microbiol. 65, 298–308. doi: 10.1099/ijs.0.070110-0
Wu, Y. W., Simmons, B. A., and Singer, S. W. (2016). MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607. doi: 10.1093/bioinformatics/btv638
Zhou, Y., Liang, Y., Lynch, K. H., Dennis, J. J., and Wishart, D. S. (2011). PHAST: a fast phage search tool. Nucleic Acids Res. 39, 347–352. doi: 10.1093/nar/gkr485
Ziemert, N., Podell, S., Penn, K., Badger, J. H., Allen, E., and Jensen, P. R. (2012). The natural product domain seeker NaPDoS: a phylogeny based bioinformatic tool to classify secondary metabolite gene diversity. PLoS ONE 7:e34064. doi: 10.1371/journal.pone.0034064
Summary
Keywords
enriched metagenomics, genome mining, Leptolyngbyaceae, non-axenic, phylogeny
Citation
Ferreira LSS, Butarelli ACA, Sousa RC, Oliveira MA, Moraes PHG, Ribeiro IS, Sousa PFR, Dall'Agnol HMB, Lima ARJ, Gonçalves EC, Sivonen K, Fewer D, Riyuzo R, Piroupo CM, da Silva AM, Setubal JC and Dall'Agnol LT (2021) High-Quality Draft Genome Sequence of Pantanalinema sp. GBBB05, a Cyanobacterium From Cerrado Biome. Front. Ecol. Evol. 9:639852. doi: 10.3389/fevo.2021.639852
Received
10 December 2020
Accepted
05 May 2021
Published
17 June 2021
Volume
9 - 2021
Edited by
Sucheta Tripathy, Indian Institute of Chemical Biology (CSIR), India
Reviewed by
Shu Cheng, University of Arizona, United States; Laura Baxter, University of Warwick, United Kingdom
Copyright
© 2021 Ferreira, Butarelli, Sousa, Oliveira, Moraes, Ribeiro, Sousa, Dall'Agnol, Lima, Gonçalves, Sivonen, Fewer, Riyuzo, Piroupo, da Silva, Setubal and Dall'Agnol.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Leonardo Teixeira Dall'Agnol leonardo.td@ufma.br
This article was submitted to Phylogenetics, Phylogenomics, and Systematics, a section of the journal Frontiers in Ecology and Evolution
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.