Genome-Wide Identification and Characterization of GASA Gene Family in Nicotiana tabacum

The gibberellic acid stimulated Arabidopsis (GASA) gene family is critical for plant growth, development, and stress response. The GASA gene family has been studied in various plant species; however, the GASA gene family in tobacco (Nicotiana tabacum) has not been characterized in detail. In this study, we identified 18 GASA genes in the tobacco genome, which were distributed across 13 chromosomes. All the proteins contained a conserved GASA domain and 12 highly conserved cysteine residues at the C-terminus. Phylogenetic analysis divided the NtGASA genes into three well-conserved subfamilies. Synteny analysis suggested that tandem and segmental duplications played an important role in the expansion of the NtGASA gene family. Cis-element analysis showed that NtGASA genes might influence different phytohormone and stress responses. Tissue expression analysis revealed that NtGASA genes displayed unique or distinct expression patterns in different tissues, suggesting their potential roles in plant growth and development. We also found that the expression of NtGASA genes was mostly regulated by abscisic acid and gibberellic acid, signifying their roles in the two phytohormone signaling pathways. Overall, these findings improve our understanding of NtGASA genes and provide useful information for further studies on their molecular functions.


INTRODUCTION
The gibberellic acid stimulated Arabidopsis (GASA) gene family is widespread in monocotyledonous and dicotyledonous plant species (Nahirñak et al., 2012). It encodes a class of cysteine-rich peptides characterized by a signaling amino acid region at the N-terminus and a conserved domain with 12 cysteines at the C-terminus (Silverstein et al., 2007). Previous studies indicated that peptides with a mutated or missing GASA domain are non-functional (Sun et al., 2013).
The GAST1 gene was first identified in tomato and characterized as a gibberellic acid (GA)-deficient (gib1) mutant gene (Shi et al., 1992). Subsequently, many GASA homologs were identified in Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), wheat (Triticum aestivum), grapevine (Vitis vinifera L.), and tomato (Solanum lycopersicum) (Taylor and Scheuring, 1994; Aubert et al., 1998; Furukawa et al., 2006; Zhang et al., 2017; Ahmad et al., 2020). The GASA gene family plays important roles in plant growth and development. In Arabidopsis, AtGASA4 is involved in light signaling and promotes floral development, whereas overexpression of AtGASA5 delays flowering by downregulating the expression of LFY and FT and upregulating the expression of FLC (Zhang et al., 2009). In petunia, GASA genes are involved in floral transition and shoot elongation (Ben-Nissan et al., 2004).
Most GASA genes are involved in GA signaling pathways. In soybean (Glycine max), GmGASA32 is upregulated by GA and interacts with GmCDC25 to control plant height (Chen et al., 2021). In the Gerbera corolla, GEG, a GASA family member, is stimulated by the exogenous application of GA3 and regulates cell expansion (Kotilainen et al., 1999). In strawberry (Fragaria × ananassa), FaGAST genes are upregulated by the exogenous application of GA and affect fruit ripening (de la Fuente et al., 2006). In addition, the expression of GASA genes is increased by other phytohormones such as brassinosteroid (BR), salicylic acid (SA), abscisic acid (ABA), naphthalene acetic acid (NAA), and indole-3-acetic acid (IAA) (Mutasa-Göttgens and Hedden, 2009; Lee et al., 2015; Qu et al., 2016; Boonpa et al., 2018). In rice, OsGSR1, a GASA family member, influences the BR signaling networks by interacting with the BR synthetase DIM/DWF1. In Arabidopsis, AtGASA2, AtGASA5, and AtGASA14 are involved in ABA signaling and affect flower induction. AtGASA6 is an integrator of GA, ABA, and glucose signaling and controls seed germination and cell elongation (Zhang and Wang, 2008; Zhong et al., 2015). In apple (Malus domestica), the expression of MdGASA genes is upregulated by GA and ABA applications during the flowering stage (Fan et al., 2017). The GASA gene family is also involved in plant responses to abiotic and biotic stresses. In Arabidopsis, overexpression of AtGASA4 suppresses the accumulation of reactive oxygen species (ROS) and nitric oxide in wounded leaves (Rubinovich and Weiss, 2010). In transgenic Arabidopsis plants, overexpression of GASA4 from common beech (Fagus sylvatica) improves tolerance to salt, ROS, and heat stress (Alonso-Ramírez et al., 2009), whereas overexpression of GsGASA1 from soybean inhibits root growth at low temperatures and upregulates the expression of RGL2 and RGL3 (Li et al., 2011).
In tomato, Snakin-1 and Snakin-2, two GASA-like genes, are active in vitro against various bacteria (e.g., Clavibacter michiganensis subsp. sepedonicus) and fungi (e.g., Fusarium solani and Botrytis cinerea) by regulating the redox levels (Almasia et al., 2008; Balaji and Smart, 2012). In rubber (Hevea brasiliensis), HbGASA genes are upregulated upon inoculation with Colletotrichum gloeosporioides and are involved in innate immunity by regulating ROS accumulation. Therefore, the GASA gene family is involved in numerous physiological and biological processes, displaying complex and diverse functions.
Tobacco (Nicotiana tabacum L.) is widely cultivated and has been used as a model plant for biological research. GASA genes are important in plant growth and development; however, the tobacco GASA gene family has not been characterized previously. In this study, we identified the GASA gene family in the tobacco genome using bioinformatics methods, and characterized the gene structure, phylogenetic relationships, protein motifs, chromosomal locations, syntenic regions, cis-acting elements, and expression patterns of its members in different tissues. Our findings provide useful clues for further studies of the GASA gene family in tobacco.

Plant Materials and Growth Conditions
The cultivar K326 was used to analyze the expression of GASA genes in tobacco. Seeds were germinated in a nursery tray,
FIGURE 1 | Phylogenetic analysis of GASA proteins from Arabidopsis, rice, grapevine and tobacco. A total of 15 GASA proteins from Arabidopsis, 10 GASA proteins from rice, 14 GASA proteins from grapevine, and 18 NtGASA proteins from tobacco were used to generate the unrooted neighbor-joining (NJ) tree with 1,000 bootstrap replicates. The GASA proteins are classified into three subfamilies (marked as I, II, III), and distinguished by different colors: AtGASA labeled in green, OsGASA labeled in blue, VvGASA labeled in red, and NtGASA labeled in cyan.

Genome-Wide Identification of NtGASA Genes
For NtGASA identification, 15 GASA sequences were obtained from the Arabidopsis database (TAIR; http://www.arabidopsis.org) and used as queries for a BLAST search against the Solanaceae Genomics Network (https://solgenomics.net/). Subsequently, the Hidden Markov Model-based profile of the GASA domain (Pfam PF02704) was used to verify the presence of the complete GASA domain in NtGASA sequences. The non-redundant putative NtGASA sequences with a conserved GASA domain were used for further bioinformatics (phylogenetic relationships, chromosomal locations, cis-regulatory elements, etc.) and expression analyses.

Chromosomal Locations and Gene Duplications Analysis
To obtain the chromosomal locations of NtGASA genes, the DNA sequence of each gene was mapped using MG2C 2.0 (http://mg2c.iask.in/mg2c_v2.0/). Segmental and tandem duplicated gene pairs within the tobacco genome, as well as collinear gene pairs among the Arabidopsis, rice, grapevine, and tobacco genomes, were identified using MCScanX (Wang et al., 2012). The collinearity map was constructed using Circos (Krzywinski et al., 2009). The synonymous and non-synonymous substitution rates (Ks and Ka, respectively) were calculated using KaKs_Calculator 2.0.

Expression Analysis of NtGASA Genes
Plant samples were collected from the root, flower, leaf, stem, and axillary bud of tobacco at the flowering stage. Total RNA was isolated from
FIGURE 2 | Chromosomal distributions and gene duplication of NtGASA genes. Chromosome size is indicated by its relative length. Segmentally duplicated NtGASA genes are connected by green lines, and the red box shows two tandem duplicated gene pairs.
The NtGADPH gene was used as the internal control for data normalization, and the relative expression levels of selected genes were calculated using the 2^−ΔΔCt method (Schmittgen and Livak, 2008). The primers used for qRT-PCR are listed in Supplementary Table S3.
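The 2^−ΔΔCt calculation used for normalization can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function name, the Ct values, and the choice of calibrator sample are hypothetical, and the sketch assumes roughly equal amplification efficiency for the target and internal-control genes.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Return the fold change of a target gene by the 2^-ddCt method.

    ct_target / ct_reference: Ct values of the target gene and the
    internal-control gene in the sample of interest.
    ct_target_cal / ct_reference_cal: the same measurements in the
    calibrator sample, whose expression level is defined as 1.
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only (not data from this study).
fold_change = relative_expression(
    ct_target=[24.0, 24.5, 23.5],
    ct_reference=[18.0, 18.5, 17.5],
    ct_target_cal=[26.0, 26.5, 25.5],
    ct_reference_cal=[18.0, 18.0, 18.0],
)
print(f"Relative expression: {fold_change:.2f}")
```

A lower ΔCt in the sample than in the calibrator gives a fold change above one, and the calibrator compared with itself always gives exactly one.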

Prediction and Classification of Cis-Regulatory Elements
The 3 kb DNA sequence upstream of the start codon of NtGASA genes was examined for the presence of cis-regulatory elements. Cis-regulatory elements in the promoters of each NtGASA gene were analyzed using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) and classified according to their regulatory functions.

Physicochemical Properties and Localization of NtGASA
To identify the GASA genes in tobacco, we used 15 AtGASA sequences as queries for a BLAST search, and identified 18 putative NtGASA genes based on amino acid similarities. As shown in Table 1, the total and coding sequence lengths of NtGASA genes were 186 to 3,715 bp and 186 to 444 bp, respectively. The deduced NtGASA proteins varied from 61 to 147 amino acids with a molecular weight of 6.6-16.17 kDa, and the isoelectric point ranged from 6.66 to 9.75. In addition, the instability index for most of the proteins (77.8%) was more than 35. According to the grand average of hydropathicity (GRAVY), the NtGASA proteins were hydrophilic except for NtGASA3, NtGASA4, and NtGASA9. The amino acid content of NtGASA proteins was conserved: cysteine, lysine, and leucine were the predominant amino acid residues. Most NtGASA proteins were localized in the extracellular membrane, chloroplasts, and mitochondria. Detailed information about NtGASA physicochemical characteristics is presented in Table 2.
Frontiers in Genetics | www.frontiersin.org February 2022 | Volume 12 | Article 768942
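The GRAVY score and the amino acid composition mentioned above can be computed directly from a protein sequence. The sketch below is an illustration under stated assumptions, not the tool used in the study: it uses the standard Kyte-Doolittle hydropathy scale, the function names are hypothetical, and the example sequence is a toy peptide rather than a real NtGASA protein.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues.

    Characters that are not standard amino acids are ignored.
    A negative value indicates a hydrophilic protein.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def residue_fraction(sequence, residue):
    """Fraction of the sequence made up of the given residue."""
    sequence = sequence.upper()
    return sequence.count(residue.upper()) / len(sequence)


# Toy peptide, for illustration only (not an NtGASA sequence).
toy_peptide = "MKCKKLCPCKSKGECRKC"
score = gravy(toy_peptide)
print(f"GRAVY = {score:.3f} ({'hydrophilic' if score < 0 else 'hydrophobic'})")
print(f"Cysteine fraction = {residue_fraction(toy_peptide, 'C'):.2f}")
```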

Chromosomal Distributions and Synteny Analysis of NtGASA Genes
The locations of the NtGASA genes on the chromosomes of tobacco were further determined. Using a simplified physical map, we found that the 18 NtGASA genes were unevenly distributed across 13 chromosomes in the tobacco genome. Chromosomes (Chr.) 1, 4, 6, 8, and 21 contained two copies each, whereas Chr. 2, 10, 12, 14, 15, 16, 17, and 18 contained one copy each (Figure 2).
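Tallying the per-chromosome copy numbers listed above gives a quick consistency check on the totals. The snippet is only a bookkeeping sketch; the dictionary simply restates the chromosome numbers and copy counts given in the text, and the variable names are illustrative.

```python
# Copies of NtGASA genes per chromosome, as listed in the text.
copies_per_chromosome = {
    1: 2, 4: 2, 6: 2, 8: 2, 21: 2,           # two copies each
    2: 1, 10: 1, 12: 1, 14: 1, 15: 1,        # one copy each
    16: 1, 17: 1, 18: 1,
}

total_genes = sum(copies_per_chromosome.values())
n_chromosomes = len(copies_per_chromosome)

print(f"{total_genes} genes on {n_chromosomes} chromosomes")
```

The listed chromosomes account for 18 genes on 13 chromosomes.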
Tandem and segmental duplications play an important role in the expansion of gene families. Two genes (NtGASA10 and NtGASA11) were tandemly duplicated on Chr. 1. In addition, five pairs (NtGASA3/NtGASA4, NtGASA4/NtGASA5, NtGASA6/NtGASA7, NtGASA15/NtGASA16, and NtGASA17/NtGASA18) were segmentally duplicated (Figure 2). All tandem and segmental duplicates had Ka/Ks values less than 1 (Table 3), indicating that the six gene pairs evolved under the influence of purifying selection.
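The interpretation of Ka/Ks used here follows the usual convention: a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The sketch below only illustrates that classification step; the function name and the Ka and Ks values are hypothetical and are not taken from Table 3.

```python
def selection_mode(ka, ks, tolerance=1e-9):
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    ka: non-synonymous substitution rate.
    ks: synonymous substitution rate (must be positive).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical Ka and Ks values, for illustration only.
example_pairs = {
    "pair_A": (0.05, 0.25),
    "pair_B": (0.30, 0.20),
}
for name, (ka, ks) in example_pairs.items():
    print(f"{name}: Ka/Ks = {ka / ks:.2f}, {selection_mode(ka, ks)} selection")
```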

Analysis of Conserved Motifs and Gene Structure
To further explore the phylogenetic relationships among NtGASA genes, an unrooted tree was constructed using only the NtGASA genes. In concordance with the phylogenetic tree including the tobacco, Arabidopsis, grapevine, and rice GASA genes, this analysis also supported the classification of NtGASA genes into three subfamilies (Figure 4A). The number of conserved motifs in NtGASA proteins varied from three to six (Figure 4B). The highly conserved motifs 1, 2, and 3 were detected in all 18 NtGASA proteins, whereas motif 5 was only found in NtGASA6 and NtGASA7, and motif 8 was only found in NtGASA9 and NtGASA10.
Structural analysis revealed that the length, arrangement, and position of introns in NtGASA genes were relatively less conserved. For instance, subfamilies I and II contained one to three introns and subfamily III contained one intron, except for NtGASA8, which had only one exon and no intron (Figure 4C). Intron gain and loss is a frequent phenomenon during evolution and can increase the complexity of gene structures.
FIGURE 6 | Expression patterns of NtGASA genes in different tissues and organs. Transcript levels in flowers were set to one; for NtGASA9, the transcript level in the root was set to one. Each value represents the mean ± standard error of three biological replicates.
Previous findings showed that putative GASA proteins possess a highly conserved C-terminal domain containing 12 conserved cysteines (Aubert et al., 1998). Amino acid sequence comparison of AtGASA, OsGASA, VvGASA, and NtGASA revealed that all putative NtGASA proteins shared a conserved GASA domain, except for NtGASA17, in which the GASA domain was mutated by the insertion of several amino acids (Figure 5).
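A crude first-pass check for the 12-cysteine signature is to count cysteines near the C-terminus of each candidate protein. The sketch below is an assumption-laden illustration and does not replace the HMM-based domain verification or the alignment in Figure 5: the function names, the 60-residue window, and the toy sequence are all hypothetical, and a cysteine count alone cannot detect insertions that disrupt the spacing of the domain.

```python
def c_terminal_cysteines(sequence, window=60):
    """Count cysteine residues in the last `window` residues of a protein."""
    return sequence.upper()[-window:].count("C")


def has_twelve_cysteines(sequence, window=60, expected=12):
    """True if the C-terminal window contains exactly `expected` cysteines."""
    return c_terminal_cysteines(sequence, window) == expected


# Toy sequence: a cysteine-free N-terminal stretch followed by a
# cysteine-rich C-terminal region. Illustration only, not an NtGASA protein.
toy_n_terminus = "MAKSLLLFAVLLLSSA"
toy_c_terminus = "CGSKCAARCSKAGRKDRCLKYCGICCEKCHCVPSGTYGNKDECPCYRDLKNSKGKPKCP"
toy_protein = toy_n_terminus + toy_c_terminus

print(c_terminal_cysteines(toy_protein))
print(has_twelve_cysteines(toy_protein))
```

Candidates that fail such a check would still need inspection in a multiple sequence alignment, since the window size is arbitrary and domain length varies among family members.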

Tissue-Specific Expression Profiling of NtGASA Genes
The spatio-temporal expression analysis of genes can provide information about gene function. We performed qRT-PCR for expression profiling of the NtGASA genes in the root, flower, leaf, stem, and axillary bud. The expression profiling showed that most NtGASA genes had diverse expression patterns in different tissues. NtGASA3, NtGASA11, NtGASA17, and NtGASA18 were expressed relatively ubiquitously, whereas many NtGASA genes showed high expression in specific tissues. For example, NtGASA9 had the highest expression level in the stem, NtGASA7 in the leaf, NtGASA16 in the axillary bud, and NtGASA2, NtGASA5, NtGASA6, NtGASA10, NtGASA13, NtGASA14, and NtGASA15 in the flower. Notably, NtGASA12 had the lowest expression level in the stem. In general, most NtGASA genes were highly expressed in reproductive organs (i.e., flower) compared with vegetative parts (i.e., leaf and stem) (Figure 6).

Analysis of Cis-Elements in the Promoters of NtGASA Genes
The study of cis-elements can provide clues about the regulatory pathways of gene expression, so we analyzed the 3,000-bp upstream promoter sequences of NtGASA genes. The largest number of cis-elements observed across the NtGASA genes was associated with light responsiveness. In addition, cis-elements involved in phytohormone (i.e., ABA, GA, IAA, SA, and MeJA) and stress (i.e., low temperature) responses were also identified in the promoter sequences of NtGASA genes (Figure 7). The diversity in response elements indicated the regulatory roles of NtGASA genes in various physiological and biological processes.

DISCUSSION
GASA proteins influence various biological processes and signal transduction pathways, thereby playing critical roles in plant growth and development (Choi et al., 2017). Due to complexities in functional mechanisms, different members of the GASA gene family have identical or diverse functions during the vegetative and reproductive stages. In Arabidopsis, AtGASA5 is activated by ABA during seed dormancy, whereas AtGASA4 is expressed during germination (Zhang et al., 2009). In strawberry, FaGAST1 and FaGAST2 have distinct expression patterns and belong to different subfamilies, but they are both involved in similar physiological functions and synergistically affect the fruit cell size (Moyano-Cañete et al., 2013). The GASA gene family is found in many plant species, but little is known about the corresponding genes in tobacco. Here, we conducted a comprehensive genome-wide identification and expression profiling study of the GASA gene family in tobacco.
We identified 18 NtGASA genes in the tobacco genome, more than those previously found in Arabidopsis, rice, grapevine, potato, and soybean (Roxrud et al., 2007; Nahirñak et al., 2016; Ahmad et al., 2019; Muhammad et al., 2019; Ahmad et al., 2020). Based on phylogenetic analyses, the identified NtGASA genes were divided into three subfamilies, of which subfamily I contained the highest number of genes (Figure 1). Physicochemical analysis showed that all the identified NtGASA proteins had low molecular weight and were alkaline, except for NtGASA11 (Table 2), consistent with previously reported results in Arabidopsis, grapevine, and apple (Herzog et al., 1995; Berrocal-Lobo et al., 2002). In addition, cysteine was the predominant amino acid among NtGASA proteins, probably due to the 12 highly conserved cysteine residues at the C-terminus (Table 2; Figure 5).
We also found that motifs 1, 2, and 3 were highly conserved and present in all 18 NtGASA proteins, whereas motifs 5 and 8 were only present in NtGASA6/7 and NtGASA9/10, respectively (Figure 4B). Variation in conserved motifs suggested that NtGASA functions diversified during evolution. Indeed, NtGASA gene structure analysis revealed that the number of introns varied from zero to three (Figure 4C), indicating that gain and loss of introns occurred over time, which may have been caused by chromosomal rearrangements (Xu et al., 2012; Guo et al., 2013).
Tandem or segmental duplication, as well as whole-genome duplication, markedly affects the evolution of gene families (Vision et al., 2000; Paterson et al., 2010). Our results showed that both tandem and segmental duplications contributed to the evolution of NtGASA genes. We identified one pair of tandem duplicated NtGASA genes and five pairs of segmentally duplicated NtGASA genes throughout the genome (Table 3). These results corroborate previous findings that segmental duplications occur more frequently than tandem duplications. The collinearity analysis of GASA genes from Arabidopsis, rice, grapevine, and tobacco showed more collinear gene pairs between grapevine and tobacco than between the other species pairs (Figure 3), suggesting a closer evolutionary distance between the two plant species.
We further analyzed the expression profiles of NtGASA genes in different tissues and found a large variety of expression patterns. Several genes (e.g., NtGASA11 and NtGASA17) showed ubiquitous expression, whereas most NtGASA genes were upregulated only in specific tissues (e.g., NtGASA9 in the stem; NtGASA7 in the leaf; NtGASA16 in the axillary bud; and NtGASA2/5/6/10/13/14/15 in the flower) (Figure 6). Previous studies indicated that GASA genes contribute to the regulation of flower induction in various species such as Petunia hybrida, Gerbera hybrida, rice, and cotton (Ben-Nissan et al., 2004; Peng et al., 2010; Muhammad et al., 2019; Qiao et al., 2021). Here, 13 NtGASA genes showed high expression in the flower, suggesting that they might play important roles in floral development.
The promoter region of a gene is related to its function, and thus, the analysis of cis-elements assists in its functional characterization (Lescot et al., 2002). Our results showed that NtGASA genes contained various regulatory elements in their promoters, such as cis-acting regulatory elements essential for light, phytohormone, and stress responses (Figure 7), suggesting their involvement in multiple signaling pathways. GASA transcripts are responsive to phytohormones and share common phytohormone-related cis-elements. In the present study, we found that all NtGASA genes were regulated by multiple phytohormones, especially ABA and GA, except for NtGASA16, which was only induced by MeJA. In addition, NtGASA17 and NtGASA18 were downregulated by all applied phytohormones (ABA, GA, IAA, SA, or MeJA), indicating that unidentified cis-elements might regulate their expression (Figure 8). The complex expression patterns of NtGASA genes under phytohormone applications highlighted their potential integral roles in various physiological processes.

CONCLUSION
To our knowledge, this is the first report on the identification and characterization of GASA genes in tobacco. We identified 18 NtGASA genes and analyzed their physicochemical characteristics, phylogenetic relationships, gene structure, conserved motifs, chromosomal locations, synteny, and cis-elements in the promoters, which showed a clear evolutionary history for this family in tobacco. We also studied the expression patterns of NtGASA genes in various tissues and under different phytohormone applications. Overall, our results provide insights into the role of NtGASA genes in several physiological and biological pathways and lay a solid foundation for further exploring the underlying molecular and biochemical mechanisms.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
ZL and JG conceived and designed the study. GW, SW, and KC conducted the bioinformatics analysis. WP, YW, and QX assisted in data collection. ZL and XF wrote the paper. All authors read and approved the manuscript.

FUNDING
This work was supported by the Key funding of CNTC (No. 110202101003(JY-03) and No. 110201801030(JY-07)) and CTCCC (No. B20202NY1337). The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.