High-Quality Draft Genome Sequence of Pantanalinema sp. GBBB05, a Cyanobacterium From Cerrado Biome

Citation: Ferreira LSdS, Butarelli ACdA, Sousa RdC, Oliveira MAd, Moraes PHG, Ribeiro IS, Sousa PFR, Dall’Agnol HMB, Lima ARJ, Gonçalves EC, Sivonen K, Fewer D, Riyuzo R, Piroupo CM, da Silva AM, Setubal JC and Dall’Agnol LT (2021) High-Quality Draft Genome Sequence of Pantanalinema sp. GBBB05, a Cyanobacterium From Cerrado Biome. Front. Ecol. Evol. 9:639852. doi: 10.3389/fevo.2021.639852


INTRODUCTION
Cyanobacteria comprise one of the oldest and most diverse phyla in the domain Bacteria and are recognized for their importance in the evolution of the biosphere. Members of this phylum are found in a wide variety of environments, reflecting their photosynthetic ability, adaptability to various environmental conditions, and diversified metabolism. Such characteristics make cyanobacteria one of the preferred targets for research on bioactive compounds and new enzymes (Schirrmeister et al., 2011; Dittmann et al., 2015).
Pantanalinema was described as a new genus of the Leptolyngbyaceae cyanobacterial family by a polyphasic approach, which included morphological characteristics, 16S rRNA gene phylogeny, 16S-23S ITS rRNA secondary structures, and physiological characteristics such as adaptability to pH variations (Vaz et al., 2015). This genus has been described only in Brazilian biomes such as the Pantanal and the Amazon, the first isolates being found in a lake. These Pantanalinema isolates were characterized by their ability to grow over a wide pH range (pH 4 to 11) as well as to modify the culture medium pH around neutrality (pH 6 to 7.4). Due to these characteristics, it is thought that this genus can occupy a variety of ecological niches, such as alkaline or slightly acidic water bodies (Vaz et al., 2015; Genuário et al., 2017). Taxonomic classification of Pantanalinema isolates requires the use of molecular markers as this genus is morphologically very similar to the recently described genus Amazoninema, which, in turn, has comparable morphology to other genera of the Leptolyngbyaceae family (Genuário et al., 2018).
In this work, we report the genome sequence of a new Pantanalinema strain, named GBBB05, which was isolated from the Brazilian Cerrado biome. This is the first genome assembly for the Pantanalinema genus, which, along with the analyses provided here, is expected to enhance our understanding of this genus's metabolic potential.

Value of Data
Pantanalinema is a recently described genus of cyanobacteria. Here we describe the first genome of a strain in this genus. The high-quality draft genome assembled from an environmental culture shows the feasibility of this method, especially for cyanobacteria samples that are underrepresented in metagenomic samples and difficult to obtain from axenic cultures. The reported genome will allow further understanding of this species' biology.

RESULTS
Total DNA isolated from a non-axenic unialgal culture of strain GBBB05 was sequenced using the MiSeq-Illumina platform. The sequencing generated 8,728,802 reads, and after quality control, de novo assembly and binning of 5,745,014 quality-filtered PE reads allowed the recovery of eight bacterial genomes with completeness varying from 50 to 99.5% (Supplementary Table 1). Based on the initial taxonomic placement using GTDB-Tk (Chaumeil et al., 2020) and the module "classify_bins" from Metawrap (Uritskiy et al., 2018) (Supplementary Table 2), one of the recovered genomes (GBBB05) belongs to the cyanobacteria (Leptolyngbyaceae family), while the other seven are from heterotrophic bacteria.
The cyanobacterial genome of the GBBB05 strain was assembled into 94 contigs and has an estimated size of 7,181,771 bp and a GC content of 48.43% (Table 1). The genome showed high completeness (99.05%) and low contamination (0.4%). Selected features of the GBBB05 genome are presented in Table 1 and Supplementary Table 1.
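As an illustration of how summary statistics such as contig count, assembly size, and GC content are derived from an assembly, the following minimal Python sketch computes them from a set of contig sequences. The toy sequences and the function name `assembly_stats` are hypothetical, and the handling of ambiguous bases is an assumption; this is not the pipeline used for GBBB05.

```python
def assembly_stats(contigs):
    """Return (number of contigs, total length in bp, GC percentage).

    `contigs` maps a contig name to its nucleotide sequence.
    Assumption: ambiguous bases (e.g. N) count toward the length
    but not toward the GC count.
    """
    total_len = 0
    gc = 0
    for seq in contigs.values():
        s = seq.upper()
        total_len += len(s)
        gc += s.count("G") + s.count("C")
    gc_pct = 100.0 * gc / total_len if total_len else 0.0
    return len(contigs), total_len, gc_pct


# Hypothetical toy contigs, not GBBB05 data
toy = {"contig_1": "ATGCGC", "contig_2": "ATAT"}
n, size, gc_pct = assembly_stats(toy)
print(f"{n} contigs, {size} bp, GC = {gc_pct:.2f}%")
```

In practice the dictionary would be filled by parsing the assembly FASTA file, and the three values would correspond to the contig number, genome size, and GC content reported for the assembly.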
Based on the morphology shown in the culture, the GBBB05 strain was assigned to either the Pantanalinema or the Amazoninema genus, neither of which currently has an available genome sequence. In silico DDH values with reference genomes from the Leptolyngbyaceae family ranged from 18.4 to 22.8% (Supplementary Table 5). Nevertheless, phylogenomic analysis grouped GBBB05 with one representative genome of the genus Leptolyngbya with 100% bootstrap support (Figure 1A). Phylogenetic reconstruction using Bayesian inference of partial 16S rRNA sequences from Leptolyngbyaceae shows that GBBB05 belongs to the Pantanalinema genus (Figure 1C).
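The species-level reading of the in silico DDH values can be sketched as a simple threshold check. The snippet below assumes the conventional 70% DDH cutoff for species delineation, which is not stated in this report, and uses only the two endpoints of the reported range; the function name `same_species` is hypothetical.

```python
SPECIES_DDH_CUTOFF = 70.0  # assumed conventional species boundary (% DDH)


def same_species(ddh_percent, cutoff=SPECIES_DDH_CUTOFF):
    """Return True if an in silico DDH estimate meets the species cutoff."""
    return ddh_percent >= cutoff


# Endpoints of the range reported against Leptolyngbyaceae references
reported_range = (18.4, 22.8)
for value in reported_range:
    print(f"dDDH {value}% -> same species: {same_species(value)}")
```

Under this assumption, both endpoints fall well below the cutoff, which is consistent with GBBB05 not being conspecific with any of the reference genomes compared.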
The functional prediction performed by the RAST annotation found about 25 functional categories (Figure 1D). Among all the categories, "Carbohydrates" with 176 genes, "Amino Acids and Derivatives" with 169, "Cofactors, Vitamins, Prosthetic Groups, Pigments" with 143, and "Protein metabolism" with 135 were the largest groups. In the category "Secondary metabolites," five genes related to plant alkaloids and plant hormones (auxins) were identified. Six genes related to metal (cadmium, cobalt, mercury, and zinc) resistance and two related to fluoroquinolone resistance were found. Analyses using antiSMASH resulted in the prediction of 17 biosynthetic gene clusters (BGCs), which included the following: 5 clusters for terpene production; 4 bacteriocin clusters; 1 cluster for betalactone biosynthesis; 1 cluster for resorcinol production; 1 mixed module of polyketide synthase (PKS) type 1 and non-ribosomal peptide synthetase (NRPS) related to nostophycin; and 6 clusters containing NRPS biosynthetic pathways (Supplementary Table 7).
Analysis in the NaPDoS server, using the default setting, predicted 11 CDSs related to metabolite production in domain C pathways and 4 CDSs related to metabolite production in KS domain pathways (Supplementary Tables 8, 9).
The PRISM4 analysis identified seven gene clusters: five non-ribosomal peptide clusters, one mixed module of polyketide synthase (PKS) type 1 and non-ribosomal peptide synthetase (NRPS), and one prochlorosin cluster (Supplementary Table 10).

Growth Conditions and Genomic DNA Isolation
The strain GBBB05 was isolated from a waterfall in the outside border of the Chapada das Mesas National Park, Carolina County, Maranhão State, Brazil (S07°02.6575 / W047°30.4508). Aerobic cultivation was performed in the BG-11 medium (Stanier and Cohen-Bazire, 1977) under the illumination of 3-15 µmol photons m⁻² s⁻¹ for 4 weeks at 28 °C. Nonaxenic unialgal culture was obtained by the spread plate

Phylogenetic Analysis
A phylogenomic tree of strain GBBB05 and closely related genomes was constructed using the Species Tree App present in KBase (Arkin et al., 2018). Further phylogenetic analysis was performed using the GBBB05 16S rRNA partial sequence retrieved from its genome assembly plus 89 partial 16S rRNA sequences from various strains of the Leptolyngbyaceae family and Gloeobacter genus, retrieved from the NCBI. The nucleotide substitution model used was SYM+G, determined by PAUP 4.0b10 (available at: http://phylosolutions.com/paup-test/) and MrModeltest 2 (Nylander, 2004). The tree was constructed using MrBayes 3.2.6 (Ronquist et al., 2012), running for 10⁷ generations and sampling every 100th generation. TRACER 1.6 (Rambaut et al., 2018) was used to check the performance of tree construction with all parameters at default settings.
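The Bayesian run described above can be sketched as a MrBayes command block. The Python helper below only assembles the text of such a block. It assumes that SYM+G corresponds to `nst=6 rates=gamma` with equal, fixed state frequencies, and it takes the number of generations and the sampling frequency from the text. All other MCMC settings (number of runs, number of chains, burn-in) are not reported here and are left at program defaults. The function name `mrbayes_block` is hypothetical, and this is not the authors' actual command file.

```python
def mrbayes_block(ngen=10**7, samplefreq=100):
    """Build the text of a MrBayes block for a SYM+G analysis (illustrative)."""
    lines = [
        "begin mrbayes;",
        # Six substitution rate classes with gamma-distributed rate variation
        "  lset nst=6 rates=gamma;",
        # Equal, fixed base frequencies turn the GTR-type matrix into SYM
        "  prset statefreqpr=fixed(equal);",
        # Chain length and sampling interval as stated in the text
        f"  mcmc ngen={ngen} samplefreq={samplefreq};",
        "end;",
    ]
    return "\n".join(lines)


block = mrbayes_block()
print(block)

# Number of sampling intervals per chain implied by these settings
print(10**7 // 100)
```

The resulting block would be appended to the NEXUS alignment file before running MrBayes; the sampled parameter files can then be inspected in TRACER to check convergence.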

Genome Annotation and Functional Analyses
The genome assembly was submitted to the Prokaryotic Genome Annotation Pipeline (PGAP) (Tatusova et al., 2016) at the National Center for Biotechnology Information (NCBI). The CRISPRCasFinder server (Couvin et al., 2018) was used to predict CRISPR sequences and cas genes, and the PHAST web server tool (Zhou et al., 2011) was used to identify the phage-related sequences (Table 1). Pan-genome analysis was performed using the Bacterial Pan Genome Analysis Tool (BPGA) (Chaudhari et al., 2016), with default parameters. To predict gene clusters related to secondary metabolite production, the online servers antiSMASH 5.0 (Blin et al., 2019) and PRISM4 (Skinnider et al., 2017) were used with all their options enabled. For the detection of C domains and KS domains, the NaPDoS online server pipeline was used with its default configuration (Ziemert et al., 2012). Prediction of functional categories was performed using RAST server annotation at default settings (Aziz et al., 2008; Overbeek et al., 2014).

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories.