Combined Analysis of Volatile Terpenoid Metabolism and Transcriptome Reveals Transcription Factors Related to Terpene Synthase in Two Cultivars of Dendrobium officinale Flowers

Dendrobium officinale is a kind of traditional Chinese herbal medicine. Its flowers could be used as health care tea for its aroma flavor and medicinal value. Most recent studies demonstrated that terpenoids are the main components of the aromatic compounds in the flowers, but the biosynthesis of terpenoids is poorly understood in D. officinale. In the experiment, the flowers from two cultivars of D. officinale with different smells were collected. The transcriptome analysis and combined volatile terpenoids determination were performed to identify the genes related to the biosynthesis of the terpenoids. The results showed that the different products of volatile terpenoids are α-thujene, linalool, α-terpineol, α-phellandrene, γ-muurolene, α-patchoulene, and δ-elemene in two cultivar flowers. The transcriptome analysis detected 25,484 genes in the flowers. And 18,650 differentially expressed genes were identified between the two cultivars. Of these genes, 253 genes were mapped to the terpenoid metabolism pathway. Among these genes, 13 terpene synthase (TPS) genes may have correlations with AP2/ERF, WRKY, MYB, bHLH, and bZIP transcription factors by weighted gene co-expression network analysis (WGCNA). The transcription factors have regulatory effects on TPS genes. These results may provide ideas for the terpenoid biosynthesis and regulatory network of D. officinale flowers.


INTRODUCTION
Terpenoids are a class of highly diverse natural products. There are more than 80,000 known terpenoids, and at least half of them are synthesized by plants (Nagegowda and Gupta, 2020). Terpenoids have diverse biological functions in nature. It plays an essential role in the interaction and mediation of antagonism between organisms (Abbas et al., 2017). It can be used to attract pollinators, defend against ground and underground herbivores, and transmit signals between plants (Raguso, 2008;Ali et al., 2012;Abbas et al., 2017). Otherwise, it can be used as natural flavors and aroma compounds, which have beneficial effects on human health (Wagner and Elmadfa, 2003). Volatile terpenoids found in almost all plant organs are the main part of volatile compounds in flowers (Das et al., 2013). The common volatile terpenes in orchid flowers are limonene, pinene, myrcene, linalool, and ocimene (Gao et al., 2018;Baek et al., 2019;Robustelli Della Cuna et al., 2019).
Dendrobium officinale Kimura et Migo is a valuable Chinese traditional medicine for its beneficial effects, including antitumor, anti-angiogenesis, immune enhancement, anti-oxidation, and alleviating diabetes (Tang et al., 2017;Teixeira da Silva and Ng, 2017). Previous studies have identified 34 TPS genes in D. officinale. They were classified into four subfamilies (TPS-a, TPS-b, TPS-c, and TPS-e/f) . In the experiment, the volatile terpenoids in flowers were identified and quantitatively analyzed for two cultivars of Wanhu No.5 and Wanhu No.6 of D. officinale. By using transcriptome sequence and weighted gene co-expression network analysis (WGCNA), the terpene synthesis genes were identified. And the correlation between volatile terpenoids and differential gene expression levels was analyzed. The relationship between TFs AP2/ERF, WRKY, MYB, bHLH, and bZIP and terpenoid metabolism was explored.

Plant Material
Two cultivars of flowers, Wanhu No.5 and Wanhu No.6, came from the laboratory of Professor Cai Yongping of Anhui Agricultural University. The two cultivars of flowers were sampled at the flowering period in the full bloom period. A total of 12 flower samples were collected. There were three replicates for each of the two types of flowers, for terpenoid metabolic profiling and transcriptome analysis. To analyze the natural volatile compounds, the flowers of Wanhu No.5 and Wanhu No.6 were enclosed and sampled in an extraction bottle. Each cultivar of flowers was sealed into solid-phase microextraction (SPME) vials immediately for further analysis (Sun et al., 2016).
To investigate the spatiotemporal correlation between the TFs related to TPS and the emission of volatile terpene compounds, a range of samples, including two cultivars of flowers, were collected for RNA extraction. All samples were immediately frozen in liquid nitrogen and stored at -80 • C until required.
Gas Chromatography-Mass Spectrometry Analysis of Volatile Compounds in Flowers of Wanhu No.5 and Wanhu No.6 Headspace SPME was employed to collect the volatile compounds from flower tissues, which were absorbed by a 75 µm CAR/PDMS fiber (Sigma-Aldrich) for 2 h at 25 • C. Total trapped volatile compounds were subsequently thermally desorbed and transferred to an Agilent 5975-6890N gas chromatographymass spectrometry (GC-MS) apparatus (Agilent Technologies) equipped with HP5-MS quartz capillary column (250 µm diameter, 60 m length, and 0.25 µm film thickness). The instrument used for the gas chromatography-mass spectrometry analysis was an Agilent 7890B-7000B triple quadrupole gas-mass spectrometer. The carrier gas was helium with 1 ml/min of flow rate. The temperature of the electron ionization (EI) ion source is 230 • C, and the electron energy was 70 eV. The temperature of the quadrupole is 150 • C for 280 • C of interface temperature, and the mass scanning range was 50-400 amu. The temperature program was isothermal at 60 • C for 3 min, then increased at a rate of 5 • C min −1 to 300 • C, and was then mainteined at 300 • C for 5 min.
Through the NIST (National Institute of Standards and Technology) 2011 standard library, the volatile compounds detected during the experiment were identified and qualitatively analyzed. The obtained compounds were compared with the literature to obtain the determined volatile terpenoids.

Transcriptome Data Analysis
The Illumina NEBNext R Ultra TM RNA Library Prep Kit was used in the library construction. AMPure XP beads were used to screen 200 bp cDNA, and polymerase chain reaction (PCR) amplification was performed. AMPure XP beads were used to purify the PCR products again, and the library was finally obtained. RNA quality was evaluated on a NanoPhotometer R spectrophotometer (IMPLEN, München, Germany). RNA-Seq was performed by LC-bio (Hangzhou, China) on the Illumina HiSeq 4000 platform. Raw reads obtained from RNA-Seq were pre-processed; adapters were trimmed, and low-quality and shorter reads were removed. The Dendrobium genome was selected as a reference for Wanhu No.5 and Wanhu No.6. Clean reads were aligned to the Dendrobium genome, and the GenBank assembly accession was GCA_001605985.2 (latest). The transcriptome data could be obtained on the National Center for Biotechnology Information (NCBI); the BioProject accession number was PRJNA703321. Q20, Q30, and GC contents of the clean data were calculated. Then the fragments per kilobase million (FPKM) of each gene was calculated based on the length of the gene, and the reads mapped to that gene were calculated (Bray et al., 2016).
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding the advanced functions and utilities of biological systems, such as cells, organisms, and ecosystems, from molecular-level information, especially largescale molecular data sets generated by genome sequencing and other high-throughput databases (Kanehisa et al., 2017). We used clusterProfiler R software to analyze the statistical enrichment of differentially expressed genes (DEGs) in the KEGG pathway. To map the target genes to metabolic pathways, all sequences of DEGs were uploaded to the Mercator v.3.6 1 to generate a root map file, then it was imported to the Mapman software (V3.6.0 RC1) to obtain the map. In detail, the DEGs in the terpernoids biosynthesis pathway were displayed with the value of log2.Fold_change between Wanhu No.5 and Wanhu No.6.

Gene Co-expression Analysis
We performed WGCNA on all DEGs screened out in transcriptome sequencing (Langfelder and Horvath, 2008). We used the WGCNA package to run the FPKM expression of DEGs in the R software. It had a module with default settings, the power was 6, minModuleSize was 30, and the minimum height of the combined module was 0.25. The gene with the highest connectivity within the module was called the intra-mold hub gene. Networks were visualized by Cytoscape software v 3.6.1 (Shannon et al., 2003).

Quantitative Real-Time PCR (qRT-PCR) Assays
RNAprep pure plant kit (Biofit, Chengdu, China) was used to isolate total RNA from fresh Wanhu No.5 and Wanhu No.6    flowers (100 mg). According to the manufacturer's instructions, RNA (1 µg) was used to synthesize cDNA using PrimeScript TM RT kit with gDNA eraser (January, Perfect Real Time, Takara, Tokyo, Japan). Using QuantStudio 6 Flex real-time PCR system (Thermo Fisher, Waltham, MA, United States) and SYBR R PremixExTaq TM II (2x) (Japan, Takara), the gene expression level was detected by qRT-PCR. We used NCBI-BLAST online software 2 to design fluorescent quantitative primers for the key genes in the terpene synthesis pathway (Supplementary Table 1). The results are attached in Figure 11. The reaction steps are 50 • C 2 min, 95 • C 30 s, 95 • C 5 s, 60 • C 34 s, 40 cycles, and 72 • C for 10 min (Jin et al., 2013). The 2 − CT method was used to calculate the relative gene expression, and the experiment was repeated three times (Livak and Schmittgen, 2001).

Statistical Analysis
Average and standard derivations of chemicals were calculated using Microsoft Excel software. The results of GC-MS were drawn using Origin 6.0 (Origin Lab Corporation, United States). Prism 8 was used for some figures. R software was used to 2 https://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC= BlastHome calculate correlation factors. The phylogenetic analysis included the genomes of D. officinale. The protein sequences were aligned using the default parameters in MUSCLE 3 , and then the neighbor-joining tree was generated by guided analysis (1,000 repeats) using MEGA 7.0 software (Fan et al., 2020).

RNA-Seq and DEGs Analysis
In order to identify the candidate genes related to the volatile terpenoids biosynthesis, RNA-seq was performed using the flowers. As a result, a total of 25,484 genes were detected and mapped to the D. officinale genome. Of these genes, 5,240 genes were identified with differentially expressed levels between Wanhu No.5 and Wanhu No.6 ( Figure 4A). If compared with Wanhu No.5, 2,935 genes were upregulated, and 2,305 genes were downregulated in Wanhu No.6. And KEGG cluster analysis showed 253 unigenes involved in the metabolism of terpenoids and polyketides pathway ( Figure 4B).
Terpenoids are synthesized through the MVA and methylerythritol phosphate (MEP) pathways, which are independent but complementary to each other. The MEP pathway is mainly responsible for the biosynthesis of monoterpenoids, accounting for about 53% of the total terpenoids in flowers; sesquiterpenes are synthesized through the MVA pathway, accounting for about 28% of the total terpenoids in flowers. Figure 5 shows that TPS in D. officinale Wanhu No.5 was quite different from that in Wanhu No.6, and the number of upregulated genes was greater than that of downregulated genes. The squares in red frames marked with red arrows indicate the differentially expressed TPS genes between Wanhu No.6 and Wanhu No.5. The red squares reveal the upregulated genes in Wanhu No.5 compared with those in Wanhu No.6. The blue frames showed the downregulated genes in Wanhu No.5 in contrast with Wanhu No.6. In the MEP pathway, two genes, HMGR and HMGS, were expressed differently in Wanhu No.5 and Wanhu No.6, which was also the main reason for the different terpenes produced by the two flowers.

Analysis of Gene Correlation Network
In order to obtain the comprehensive transcriptome changes of two cultivars of D. officinale flowers, we established a weighted FIGURE 7 | Correlation coefficients between terpene synthase (TPS), module, and volatile compound. Each row represents a module, and each column represents a terpenoid. The color of each block at the row-column intersection indicates the correlation coefficient: red for high positive correlation and green for high negative correlation, with a scale shown on the right of the panel.
gene co-expression network to classify 18650 DEGs. Genes that were more closely related to each other would be gathered in the same module, and finally 10 modules would be obtained (Figure 6). The biggest module was the turquoise module, which contains 6,047 genes. However, the smallest module contained 6 genes. The expression patterns of the different modules of the two cultivars of flowers were also quite different from each other. For Wanhu No.5, the turquoise module's gene expression was higher, while for Wanhu No.6 the gene expression of the blue module was higher (Supplementary Figure 1).
To understand the relationship between terpenoids and genes, a relationship module diagram between genes and volatile terpenoids was analyzed. As was shown in Figure 7, the correlations between different modules and volatile terpenoids were relatively close, except for thujene, linalool, and terpineol, which were the different metabolites in Wanhu No.5 and Wanhu No.6. After screening, we found that the module contains 13 TPS genes, including 7 TPS genes in the turquoise module, 3 TPS genes in the blue module, 2 TPS genes in the green module, and 1 TPS gene in the yellow module. Different TPS genes might be closely related to the production of these volatile terpenoids through the relationship diagram.
To further clarify TPS genes's potential roles in the turquoise module and the blue module, we generated a phylogenetic tree by neighbor-joining method . Figure 8 showed that DoTPSs proteins were classified into four different clades, 14 in TPS-a, 16 in TPS-b, 3 in TPS-c, and 1 in TPS-e/f. The function of TPS-a family was to synthesize sesquiterpene synthase. TPS-b mainly synthesized monoterpene synthase and isoprene synthase. The function of TPS-c family was to synthesize the bifunctional class I/II (terephthaloyl diphosphate synthase/kaurene) involved in secondary metabolism, and the monofunctional class II included diterpene synthase (terephthaloyl diphosphate synthase enzyme) and diterpene synthase. TPS-e/f was a monofunctional class I diterpene synthase, diterpene synthase, sesquiterpene synthase, and monoterpene synthase involved in secondary metabolism (Alicandri et al., 2020). The phylogenetic tree showed that DoTPS in the blue module, including DoTPS04, DoTPS10, and DoTPS07, were located in TPS-a, TPS-b, and TPSc; DoTPS in the turquoise module, including DoTPS13, DoTPS03, DoTPS05, DoTPS06, DoTPS11, DoTPS15, and DoTPS18, were located in TPS-a and TPS-b.

TFs Regulating Terpenoid Synthases in Two D. officinale Cultivars
As listed in Supplementary Tables 3-8, the gene IDs and relative transcript levels of five TF families were obtained from the transcriptomes data of the two cultivars of D. officinale flowers. As is shown in Supplementary Figures 2-6, the expression patterns of TFs and DoTPSs in Wanhu No.5 were different from those in Wanhu No.6. The relative transcript levels were higher in Wanhu No.5. Also, the differences between Wanhu No.5 and Wanhu No.6 were significant in Supplementary Figures 3-8. And 10 terpenoid synthases and their read counts were collected in the transcriptome. When compared with Wanhu No.6, six TPS genes were upregulated with more than 2-fold (>2-fold), and eight TPS genes were downregulated (<-2-fold).

Verification of Gene Expression
In order to verify the transcriptome data, terpenoid synthesis pathway genes and related TFs were selected for real-time PCR analysis (Figure 11). The results showed that the expression patterns of the selected genes were consistent with the transcriptome data.

DISCUSSION
The flowers of D. officinale not only have a certain ornamental value but are also a kind of Chinese medicine that can be used to make tea with a certain anti-cancer effect (Lai, 2020). In Dendrobium chrysanthum, some terpenes including αphellandrene, α-pinene, α-thujene, L-β-myrcene, α-terpinene, O-cymene, D-limoene, β-ocimene, and carene were detected in the flowers. The volatile components of Dendrobium lohohense flowers are mainly esters, and the aroma composition of Dendrobium densiflorum is mainly alkanes. The volatile components of Dendrobium hancockii and D. officinale are mainly terpenes (Li et al., 2015;Lv et al., 2016). In the experiment, 18 and 20 volatile compounds were detected from Wanhu No.5 and Wanhu No.6 flowers, respectively. There were 9 volatile terpenoids compounds detected in Wanhu No.5, and 10 in Wanhu No.6. α-Pinene is the most abundant compound in the flowers of the two cultivars. The volatile terpenoids are quite different in the two cultivars of D. officinale. For example, α-thujene, linalool, and αterpineol were only detected in Wanhu No.5, while αphellandrene, γ-muurolene, α-patchoulene, and δ-elemene were only found in Wanhu No.6.
The comprehensive analysis of gene co-expression and terpenoid accumulation has recently provided new insights into the regulation of terpenoid metabolism (Tai et al., 2018). In order to understand the regulation of terpenoid biosynthesis in D. officinale, the expressed genes detected by RNA-Seq were connected by using the method of WGCNA, which provides a network of nodes (genes) and edges (connections). This method is mainly about obtaining connections in the network based on gene co-expression data. This strategy has been used to discover potential target genes and TFs in plants (Ferreira et al., 2016). We obtained 10 different expression modules after WGCNA analysis of the transcriptome data. The results can provide a way to build a network of mining potential target genes and TFs in plants. In D. officinale, with 34 TPS genes, 13 TPS genes were obtained in WGCNA analysis. The function of DoTPS10 has been verified; located in chloroplasts, DoTPS10 uniquely converted geranyl diphosphate to linalool in vitro .
In the present study, the top two modules enriched for TPS genes were selected for further analysis. Based on the WGCNA analysis, we obtained 13 TPS genes and 5 kinds of TFs that are related to the synthesis of TPS: AP2/ERF, bHLH, MYB, WRKY, and bZIP (Lu et al., 2013;Zhang et al., 2015;Pu et al., 2018;Majid et al., 2019). Through the correlation network diagram, we found that different TFs regulate different TPS differently. Transient expression of AaERF1 and AaERF2 can increase the transcription of amorpha-4,11-diene synthase (ADS) and CYP71AV1 and increase accumulation of artemisinin and artemisinic acid . The Arabidopsis MYC2 TF could bind to the promoter regions of the TPS21 and TPS11 that catalyzed sesquiterpenes' formation to activate their expression, thereby increasing the release of sesquiterpenes (Hong et al., 2012). A peltate glandular trichomes (PGT)specific R2R3-MYB gene, MsMYB, was identified in the RNA-Seq comparison data in spearmint. The analysis of the transgenic lines showed increased levels of monoterpenes. In contrast, the levels of MsMYB overexpression lines decreased (Reddy et al., 2017). In Catharanthus roseus, overexpression of Cr-WRKY1 could downregulate the expression levels of ORCA2/3, CrMYC2, and zinc-finger C. roseus transcription factors (ZCTs) to regulate the synthesis of monoterpenes (Suttipanta et al., 2011). AaAPK1 interacted with AabZIP1 in Artemisia annua, and AaAPK1 enhanced the transactivation activity of AabZIP1 on artemisinin biosynthesis genes through phosphorylation (Zhang et al., 2018). Regarding the TPS genes identified based on WGCNA, there are eight in the TPS-a family, four in the TPSb family, and one in the TPS-c family. TPS-a members are  Values shown are mean ± SE of three replicates. "*" indicates that the difference is significant, "**" indicates that the difference is very significant. related to sesquiterpene formation, and TPS-b is related to monoterpene biosynthesis (Alicandri et al., 2020). Therefore, these identified TPSs might play an important role in producing volatile terpenoids in D. officinale. As the last enzymatic step of the MVA and MEP pathways, TPS is responsible for the direct synthesis of terpenoids (Schilmiller et al., 2009). However, in most cases, the expression level of these TPS genes has no linear relationship with their product content. There are two main reasons for uncertainty (Degenhardt et al., 2009). Firstly, a considerable amount of TPS was a multi-product enzyme that can produce multiple volatiles from a single substrate. Secondly, the replication and evolution of the TPS family produced isozymes that express different functions in time and space (Takehiko et al., 2014). There were too few studies on the TPS gene of D. officinale, so more experiments are needed to test the functions of these 13 DoTPS, such as overexpression in Escherichia coli for in vitro enzymatic analysis and stable transformation for in vivo functional analysis (Zhou and Pichersky, 2020). The regulation between DoTPS and TFs will be detected in more experiments.

CONCLUSION
The transcriptome analysis in the two cultivars of D. officinale with differences in volatile terpenoid products was performed in order to mine the biosynthetic pathway related genes and regulatory mechanisms of the terpenoid metabolites of D. officinale. In the analysis of the two cultivars of D. officinale transcriptomes, the expression of upstream genes in the MVA and MEP pathways did not change much, and the TPS genes were quite different. Therefore, the diversity of terpenoids was caused by the differential expression of TPS. We obtained 10 gene modules from WGCNA. From the gene module, we screened 13 TPS genes and AP2/ERF, WRKY, MYB, bHLH, and bZIP TFs, analyzed the correlation between these TFs and TPS expression, and found that these TFs were displayed in the position of the correlation network. They played a role in regulating terpenoid metabolism. Future work should focus on the direct and indirect interactions between TPS and related TFs to clarify the functional network that controls terpene production. These results might provide ideas for the terpenoid biosynthesis and regulatory network of D. officinale flowers.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: National Center for Biotechnology Information (NCBI) BioProject, https://www. ncbi.nlm.nih.gov/bioproject/, PRJNA703321.

AUTHOR CONTRIBUTIONS
HF designed the study. NL wrote the manuscript. NL, YD, and ML performed the experiments. XS, LL, and LQ helped in data analysis and manuscript preparation. YC revised the manuscript. All authors contributed to the article and approved the submitted version.