Comparative Transcriptome Reveals the Genes’ Adaption to Herkogamy of Lumnitzera littorea (Jack) Voigt

Lumnitzera littorea (Jack) Voigt is among the most endangered mangrove species in China. The morphology and evolution of L. littorea flowers have received substantial attention for their crucial reproductive functions. However, little is known about the genomic regulation of flower development in L. littorea. In this study, we characterized the morphology of two kinds of L. littorea flowers and performed comparative analyses of transcriptome profiles of the two different flowers. Morphological observation showed that some flowers have a column embedded in the petals while others produce a stretched flower style during petal unfolding in flowering. By using RNA-seq, we obtained 138,857 transcripts that were assembled into 82,833 unigenes with a mean length of 1055.48 bp. 82,834 and 34,997 unigenes were assigned to 52 gene ontology (GO) functional groups and 364 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. A total of 4,267 differentially expressed genes (DEGs), including 1,794 transcription factors (TFs), were identified between two types of flowers. These TFs are mainly involved in bHLH, B3, bZIP, MYB-related, and NAC family members. We further validated that 12 MADS-box genes, including 4 MIKC-type and 8 M-type TFs, were associated with the pollinate of L. littorea by herkogamy. Our current results provide valuable information for genetic analysis of L. littorea flowering and may be useful for illuminating its adaptive evolutionary mechanisms.


INTRODUCTION
(LC) species (Polidoro et al., 2010). In China, the wild plant number was only 359 in 2006, and rapidly declined to 9 in 2018. The narrow distribution of it was only in Sanya Tielu harbor and Lingshui Dadun village of Hainan Island. In 2018, all of the wild L. littorea growing in Lingshui Dadun village died (Fan and Chen, 2006;Zhang et al., 2018). During 13 years of field observation, no seedlings or young trees were observed due to high (76%) seed abortive rate . The protection of this species is facing a great challenge and the mangrove L. littorea is therefore is listed as a plant under state protection (category II) (Zhong et al., 2011;Zhang et al., 2013).
L. littorea has relatively low genetic diversity and gene flow in China (Su, 2004), possibly because of the limited number of wild individuals and distribution area . According to the pollen-ovule ratio (P/O) and hybridization index (OCI) analyses, L. littorea is classified as a typical cross-pollinated plant with red petals and erectly terminal inflorescence. Most of the L. littorea flowers can only be pollinated from the same tree or even from the same flower (Li et al., 2016;Zhang et al., 2016Zhang et al., , 2017. The pollen viability L. littorea in China was lower than 10% (Zhang et al., 2013. The heavy abortion of L. littorea seeds resulted from the high empty embryo rate . In woody perennials, there are several studies on the regulation of flowering , but the underlying molecular mechanisms of floral dynamics and breeding systems in the development of L. littorea remain poorly understood. MADS-box transcription factors play crucial roles in floral organ formation, embryo and reproductive development and flowering time control (Angenent, 1995;Moon et al., 2003;Arora et al., 2007;Irish, 2010;Mohanty and Joshi, 2018;Zhang et al., 2018). Extensive studies of Arabidopsis mutants show several genetic models of floral organ formation (Coen and Meyerowitz, 1991;Theissen et al., 2016). The ABCDE model involves five subgroups of the MADS family. AP1 (APETALA 1) and AP2 belong to A-class; AP3 and PI (PISITTALA) belong to B-class; AGAMOUS (AG) belongs to C-class; SEEDSTICK (STK) belongs to D-class; and SEP1 (SEPALLATA 1), SEP2, SEP3, and SEP4 belong to E-class (Bowman and Meyerowitz, 1991). The combinations of MADS-box proteins determine the tetrameric complexes (Theiβen and Saedler, 2001). For instance, class A+B+E genes control petal development, B+E+C genes determine stamen development, C+E specify carpels, and D+E are necessary for ovule development (Wellmer et al., 2014). A, B, or C proteins could constitute higher-order complexes with SEP proteins . The sep1/2/3/4 mutant displays indeterminate flowers composed of leaf-like organs and sepal development, indicating the role of SEP proteins in control flower development (Favaro et al., 2003). Importantly, the "ABCDE" model key genes are conservative in the control of petal and style development in Soybean, Impatiens and Marcgravia (Geuten et al., 2006;Litt and Kramer, 2010;Huang et al., 2014).
TF flowering locus C is a convergence point for environmental and endogenous pathways that regulate flowering time in Arabidopsis (Mateos et al., 2015). AGL27 mutants flower earlier in a dosage dependent manner while transgenic plants carrying AGL27 overexpression cassettes are delayed in flowering (Scortecci et al., 2001;Yun et al., 2011). FLC interacts with another MADS-box protein, SHORT VEGETATIVE PHASE (SVP), to delay flowering (Li et al., 2008). The function of the MADS-box gene has been verified in the flower development of many species, but it has not been reported in L. littorea.
Here, we conducted de novo transcriptome sequencing of two kinds of flowering behavior with different types of style development for L. littorea in order to investigate gene expression patterns associated with special style development morphology. To our knowledge, this is the first comprehensive transcriptomic study of flower development for L. littorea, providing important bioinformatic resources for the investigation of genes involved in flower development, and building a foundation for investigating the role of these genes and gene networks in the evolution of floral diversity across L. littorea.

Plant Material
The L. littorea trees live in Sanya Tielu Bay, Hainan, China (18 • 15 -18 • 17 N, 109 • 42 -109 • 44 E). Flowers with columns embedded in the petals (L-1) or with stretched styles (L-2) were collected from one florescence of one tree at 9 am on July 20, 2017. For each type, at least three flowers were selected. The materials were immersed into liquid nitrogen and stored at −80 • C for subsequent research. Three biological replicates were prepared for sequencing.

RNA Extraction and Deep Sequencing
RNA isolation was performed with TRIzol R Reagent (Invitrogen, United States) following the manufacturer's instructions. RNA samples with an RNA integrity number (RIN) >9.5 were used for purification and subsequent cDNA construction with the TruSeq RNA sample RNA prep kit (Illumina, United States). After synthesis of the first-strand cDNA, the second-strand cDNA was produced using buffer, dNTPs, RNase H, and DNA polymerase I. The double-strand cDNA was purified using the QIAquick PCR extraction kit (QIAGEN, Germany) and washed with EB buffer for end repair and single nucleotide adenine (A) addition. After PCR amplification for 15 cycles, the products were loaded onto flow cell channels at 12 pM for paired-end 150 bp × 2 sequencing with the Illumina HiSeq 4000 platform (Majorbio, Shanghai, China).

De novo Assembly and Analysis of Illumina Reads
Clean reads were obtained by (1) removing the adapters and reads without fragmentation; (2) cutting the low quality bases (quality score less than 20) at the 3 end of the sequence and then, if the quality of the residual sequence is still less than 10, removing the entire sequence, while sequences with a quality greater than 10 are retained; (3) removing reads that contain too many Ns (≥10%); and (4) removing reads less than 20 bp long after adapter discarding and quality control. Analysis tools: SeqPrep 1 and Sickle 2 . The de novo assembly was conducted with the Trinity software 3 (Grabherr et al., 2011). The raw data have been uploaded to NCBI SRA under accession numbers SRR6429108 to SRR6429113.

Transcriptome Annotation
BlastX was used to perform sequence alignments between the transcriptome and sequence data from the NR, String, SwissProt and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Alignments with E-values less than 1e −5 were chosen. NCBI_NR is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, the Protein Information Resource (PIR), the Protein Research Foundation (PRF), and the Protein Data Bank (PDB). Via GO (gene ontology) annotation, the database standardizes the biological terms of genes and gene products and unifies the definitions and descriptions of gene and protein functions (Grabherr et al., 2011). The Clusters of Orthologous Groups of proteins (COG) database 4 is an orthologous protein cluster database that depends on the phylogenetic relationships of complete protein sequences from 66 selected strains. Functional annotation, classification and protein evolution analysis can be performed by comparing sequences with the COG database . Pathway assignments were performed according to the KEGG 5 pathway database (Grabherr et al., 2011;Huang et al., 2015) with BlastX and an E-value threshold of 1e −5 .

Identification of Differentially Expressed Transcripts
EdgeR 6 was used for differential expression analysis. Gene read count data were calculated as the input of EdgeR or DESeq2. This analysis method is based on the negative binomial distribution model. The screening criteria of significant DEGs were as follows: FDR < 0.05 and |log2FC| > = 1 (Anders and Huber, 2010).

Annotation and Phylogenetic Analysis
To identify the TFs represented in the L. littorea transcriptomes, all DEGs were searched against the plant TF database PlantTFDB 4.0 (Jin et al., 2014) 7 . BlastN searches of the Phytozome database, using Arabidopsis genes as queries, were used to identify flower and floral organ development-related genes in L. littorea. The CDS sequences of all MADS-box genes from Arabidopsis, Hevea brasiliensis, and Fragaria ananassa were downloaded from GenBank. Multiple sequence alignments and the phylogenetic analysis of CDS sequences were performed as described previously (Cheng et al., 2017). An unrooted phylogenetic tree was created with the neighbor-joining method

Real-Time PCR Analysis
Total RNA was isolated from flowers with different flowering behavior. Three biological replicates were set. qRT-PCR assays were conducted using the ABI PRISM 7300 Sequence Detection System (Applied Biosystems) with SYBR Green PCR Master Mix (Applied Biosystems). The housekeeping gene ACTIN (c14139_g1) was used for normalization in each qRT-PCR run. The relative expression levels of target genes are presented as 2 − CT (Livak and Schmittgen, 2001). Primers used in this study are listed in Supplementary Table S1.

Floral Structure Morphogenesis of L. littorea Flowers
The L. littorea flowers are hermaphroditic with red, erect petals and a deep, curved calyx tube with abundant nectar (Figures 1A,B). The diameter of a single flower is approximately 6.70 ± 0.04 mm. The five petals per flower are 4.20 ± 0.43 mm long. The pistils and stamens are approximately the same length and as long as 8 mm. Stamens are prominently exserted at anthesis after the stamens unfold. During flower opening, two kinds of flowers can be found before the petals are uncovered. One kind retains stigma within the petals and was named as L-1 ( Figure 1C); the other kind (L-2) are herkogamy flowers and has columns stretched beyond the petals before flowering ( Figure 1D). In L-1, no stylar canal was found on the stigma ( Figure 1E). In L-2, there is an obvious stylar canal in the stigma (Figure 1F), the filaments are kept folded, and the anthers are kept intact.

RNA-Seq and de novo Assembly
To obtain an overview of the transcriptome profiles, L-1 and L-2 were sampled at different column development stages for Illumina deep sequencing. 66.83 (L-1) and 65.40 (L-2) million raw reads were yielded. A total of 138,857 transcripts with a GC percent of 38.91%, average length of 1665.35 and an N50 size of 3049, were obtained (Table 1). 82,833 unigenes with a GC percent of 38.59, an average length of 1055.48 and an N50 size of 2270, were produced.

Functional Annotation of Unigenes
The unigene sets obtained from the L. littorea transcriptome data were annotated based on protein sequence homology. All transcripts and unigenes produced were searched against the NCBI NR, SwissProt, String, KEGG and Pfam databases with an E-value threshold < 1e −5 . As a result, 138,857 transcripts and 82,833 unigenes were annotated (Supplementary Figure S1). The similarity distribution analysis identified 41,687 transcripts and 13,958 unigenes that exhibited high sequence similarity (from 80% to 100%) with known gene sequences. Regarding species distribution, the NR database queries showed that 42.6% of the L. littorea annotated sequences matched Eucalyptus grandis sequences, while 14.14, 13.52, and 8.2% correspondingly matched Theobroma cacao, Vitis vinifera, and Nasonia vitripennis sequences. Characteristics of the homology search of L. littorea unigenes against the NR database are shown in Supplementary Figure S2.
Based on the BLASTX results against the NR database, we assigned GO terms to the assembled unigenes to obtain GO functional annotations and categorizations. All of the unigenes were used to query the GO database in order to classify their predicted functions (Supplementary Figure S3 and Supplementary Table S2). In the "biological process" category (34,918 unigenes), macromolecule metabolic process (4,389) was the largest subcategory. In the "cell component" (31,040 unigenes) and "molecular function" (16,876 unigenes) categories, intracellular (4,429) and nucleotide binding (2,755) were the most abundant GO terms, respectively. The GO analysis indicated that a high number of unigenes were associated with the various biological processes and molecular functions in L. littorea floral tissues. The annotated sequences were further applied to a search against the clusters of orthologous groups of proteins (COG) and clusters of orthologous groups for eukaryotic complete genomes (KOG) databases for functional prediction and classification. As a result, each annotated unigenes was assigned 25 COG and 25 KOG terms. Among the assigned terms, the three most highly represented categories in the two databases were identical. (1) general function prediction only (883 unigenes in the COG databases; 1267 unigenes in the KOG databases); (2) signal transduction mechanism (844 unigenes in the COG databases; 1136 unigenes in the KOG databases); and (3) posttranslational modification, protein turnover, and chaperones (740 unigenes in the COG databases; 1006 unigenes in the KOG databases). The smallest group was "cell motility, " with 3 unigenes in the COG databases and 1 unigene in the KOG databases (Supplementary Figure S4).

Transcription Factors in DEGs Modulating the Herkogamy of L. littorea Flowers
Transcription factors (TFs) play key regulatory roles in floral development by binding to specific motifs in the promoters of target genes . Here, we showed that 41% (1,749/4,267) of DEGs between L-1 and L-2 belong to TFs (Supplementary Table S4). These TFs were classed into 54 categories with bHLH, B3, bZIP, MYB-related, NAC, C2H2, C3H, ERF, WRKY, and MYB being the most highly represented (Figure 2A). It is noted that the MADS, bZIP, bHLH, and MYB genes play key roles in the regulation of flower development and flowering time (Streisfeld et al., 2013;Wang et al., 2013;Rocheta et al., 2014).

Phylogenetic Analysis of MADS-Box Genes Associated With the Herkogamy of L. littorea Flowers
To investigate the evolutionary history and phylogenetic relationships of MADS-Box genes, 12, 14, 30, and 17 MADS-Box TFs were individually identified in L. littorea, Arabidopsis, rubber tree, and strawberry based on the PlantTFDB 4.0 database ( Figure 2B and Supplementary Table S5). A neighbor-joining phylogenetic tree was then generated by alignment of these MADS-box proteins. As shown in Figure 2B, these proteins were classified into seven groups, similar to the description in a previous report (Parenicova et al., 2003). In each subgroup, MADS proteins in L. littorea were more closely related to those in rubber tree, except in the Mα subfamily. Twelve Frontiers in Genetics | www.frontiersin.org differentially expressed MADS proteins between L-1 and L-2 appeared in each putative functional group. Of them, LliMADS3 and LliMADS11 are the most homologous with AT5G55690, and belong to the Mα subdivision of type I MADS-box genes. LliMADS12 and AthPHERES1 (PHE1) are on the same branch in the Mγ subdivision of type I MADS-box genes. LliMADS1 and LliMADS9 belong to the B class of the AP3 subfamily, and LliMADS10 belongs to the SEP subfamily.

RNA-Seq Expression Validation by Real-Time PCR
To confirm the gene expression patterns identified by RNA-Seq data, the transcript levels of twelve MADS-box genes together with six other DEGs were examined by qRT-PCR. All 18 selected DEGs were successfully amplified with single bands of the expected sizes. The expression of 10 genes were down regulated and that of 7 genes were up regulated (Supplementary Table S6), consistent with those of the RNA-Seq data (Figure 3). Therefore, the DEGs obtained from the assembled transcriptome were accurate and reliable.

DISCUSSION
Lumnitzera littorea is an endangered mangrove species in China . Herkogamy is found in almost 60% of L. littorea flowers, but approximately 40% of the flowers have a column embedded in the petals when the petals unfold during florescence. Almost all those flowers have empty seeds, which are speculated by the results of forced self-pollination. Thus, the breeding system of L. littorea is out-crossing with partial selfpollination . Out-crossing is obligate in unisexual flowers and selfing can occur in hermaphrodite flowers, but a self-compatible level can be strongly selected by herkogamy, i.e., the spatial separation of anthers and stigmas within a flower (Mertens et al., 2018). In addition, geitonogamous selfing is not prevented within or between inflorescences on a plant when flowers are at different sexual phases in L. littorea (Zhang and Wolfe, 2016;Zhang et al., 2017). The temporal separation of male and female phases is a common floral feature in hermaphrodite species (Rosas-Guerrero et al., 2017). In such a bad survival trend, only the herkogamy flower morphogenesis of L. littorea could hardly improve the natural reproduction rate. Therefore, flower morphogenesis suitable for geitonogamous selfing as L-1 may be a positive adaption for this survival condition, even though most of them failed in seed production.
In the present study, we analyzed the transcriptome profiles of flowers with columns embedded in the petals (L-1) and with stretched styles (L-2) and identified a total of 82,833 unigenes and 138,857 transcripts, respectively. Using the stringent criteria of both FDR < 0.05 and |log2FC| > = 1, we detected 4,267 unigenes that were significantly different between L-1 and L-2. These results imply a diverse and complex mechanism of column development gene expression in L. littorea.
Gene ontology functional enrichment revealed that a high number of genes were associated with various biological processes and molecular functions in L. littorea floral tissues.
KEGG pathway analysis showed that many DEGs were involved in secondary metabolite biosynthesis, including carbon metabolism, ribosome, protein processing in the endoplasmic reticulum and amino acid biosynthesis. Furthermore, plant oxidative phosphorylation that plays an important role in plant floral development (Liu et al., 2019) is also activated.
TFs play important roles in the regulation of flower development and flowering time . In this study, we identified 1,749 differentially expressed TFs with 54 diverse categories. The different expression of these TFs indicated their possible different roles in modulating the formation of herkogamy flowers in L. littorea. Of these TFs, MADS-box family genes function in flower development (Parenicova et al., 2003). MADS-box proteins are generally divided into types I and II. Type I is categorized into Mα, Mβ, Mγ, and Mδ clades and type II is classified into MIKC c and MIKC * (Mohanty and Joshi, 2018). Several type I MADS box TFs are shown to be involved in reproductive development in Arabidopsis (Luo et al., 2008). The empty embryo rate of 70% may be related to the upregulated expression of Mγ genes in the flowers of L. littorea in China. SVP has been identified to delay flowering by modulating the biosynthesis of gibberellin in Arabidopsis (Andrés et al., 2014) and Jatropha curcas (Hui et al., 2018). In the flowers of L. littorea, two downregulated SVP genes (LliMADS2 and LliMADS4), and one upregulated SVP genes (LliMADS7) affect dichogamy morphogenesis, which may also be caused by the modulating the biosynthesis of gibberellin in florescence. In Arabidopsis, the petal, stamen and carpel of sep1sep2sep3 mutants are switched to sepal (Ditta et al., 2004). LliMADS10 and AthSEP were found in the same branch covering SEPs ( Figure 2B). In the flowers of L. littorea, LliMADS10 was downregulated and may contribute to dichogamy morphogenesis. These genes should be paid more attention in further research.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/, SRP127706.

AUTHOR CONTRIBUTIONS
YZa and CZ designed the study and modified the manuscript. YZa, YC, YZo, JZ, and HB collected the samples and acquired the data. YZa and YC drafted the manuscript. CZ, JZ, and HB helped to draft the manuscript. All authors read and approved the final manuscript.

FUNDING
This study was funded by the National Natural Science Foundation of China (41776148 and 31360173) and the project of Zhejiang Provincial Natural Science Foundation (LY18C030001).