Function of AP2/ERF Transcription Factors Involved in the Regulation of Specialized Metabolism in Ophiorrhiza pumila Revealed by Transcriptomics and Metabolomics

The hairy roots (HR) of Ophiorrhiza pumila produce camptothecin (CPT), a monoterpenoid indole alkaloid used as a precursor in the synthesis of chemotherapeutic drugs. O. pumila HR culture is considered as a promising alternative source of CPT, however, the knowledge about the biosynthetic pathway and regulatory mechanism is still limited. In this study, five genes that encode AP2/ERF transcription factors, namely OpERF1–OpERF5, were isolated from HR of O. pumila. Phylogenetic analysis of AP2/ERF protein sequences suggested the close evolutionary relationship of OpERF1 with stress-responsive ERF factors in Arabidopsis and of OpERF2 with ERF factors reported to regulate alkaloid production, such as ORCA3 in Catharanthus roseus, NIC2 locus ERF in tobacco, and JRE4 in tomato. We generated the transgenic HR lines of O. pumila, ERF1i and ERF2i, in which the expression of OpERF1 and OpERF2, respectively, was suppressed using RNA interference technique. The transcriptome and metabolome of these suppressed HR were analyzed for functional characterization of OpERF1 and OpERF2. Although significant changes were not observed in the metabolome, including CPT and related compounds, the suppression of OpERF2 resulted in reduced expression of genes in the 2-C-methyl-d-erythritol 4-phosphate and secologanin-strictosidine pathways, which supply a precursor, strictosidine, for CPT biosynthesis. Furthermore, while it was not conclusive for OpERF1, enrichment analysis of differentially expressed genes in the suppressed HR showed that the gene ontology terms for oxidation-reduction, presumably involved in secondary metabolite pathways, were enriched in the ERF2i downregulated gene set. These results suggest a positive role of OpERF2 in regulating specialized metabolism in O. pumila.


INTRODUCTION
Plants produce a broad array of secondary metabolites that serve special functions, such as protection from biotic and abiotic threats, attracting insects and animals for pollination and seed dispersal among others. A number of these phytochemicals have been widely used by humans for agricultural, medical, nutritional, and industrial purposes (Croteau et al., 2000;Krishna, 2003;Bakkali et al., 2008;Mithöfer and Boland, 2012;Jacobo-Velázquez et al., 2015). One of the commercially important classes of plant secondary metabolite is alkaloid, which comprises of organic compounds containing nitrogen atom in diverse structures. Different types of alkaloids exhibit a wide range of pharmaceutical activities that are beneficial for clinical treatments, for example, the antispasmodic activity of the tropane alkaloid hyoscyamine, antimalarial activity of the quinoline alkaloid quinine, and antitumor activity of the monoterpenoid indole alkaloids (MIA) vincristine and vinblastine (Amirkia and Heinrich, 2014).
Camptothecin (CPT) was first isolated from Camptotheca acuminata (Wall et al., 1966), and it is currently well recognized as an antitumor agent because of its unique mechanism of action through the inhibition of DNA topoisomerase I (Hsiang et al., 1985). Although, CPT belongs to the structural family of quinoline alkaloids, from the biosynthetic view point CPT constitutes a MIA, which is one of the largest groups of bioactive plant alkaloids. Interestingly, CPT is produced by several plant species that are taxonomically distant from C. acuminata, such as Nothapodytes foetida (Govindachari and Viswanathan, 1972) and Ophiorrhiza pumila (Aimi et al., 1989). Semi-synthetic CPT derivatives such as topotecan and irinotecan, in which low water solubility and potent toxicity of CPT were modified, were approved for the treatment of recurrent small-cell lung cancer (Garst, 2007), ovarian cancer (Coleman, 2002), and colorectal cancer (Cunningham et al., 2001). Furthermore, other CPT derivatives such as cositecan (BNP1350) (Munster and Daud, 2011) and belotecan (CKD-602) (Kim et al., 2010), are currently undergoing clinical trials. Growing demand for these drugs requires the increasing amount of naturally-derived CPT as a lead compound for synthesis. Global CPT supply from extracts of bark and seeds of C. acuminata and N. foetida  has raised the concern of environmental sustainability and CPT supply chain sufficiency and thus led to the necessity of the establishment of alternative sources (Sadre et al., 2016). Several studies have attempted to develop cell culture systems for CPT-producing plants but only a few could achieve high yield of CPT. For example, callus and hairy roots (HR) cultures of C. acuminata produced 0.1 and 0.236% dry weight of CPT, respectively (Wiedenfeld et al., 1997;. HR cultures of O. pumila accumulated 0.1% dry weight of CPT, which was at least 2.5 times higher than the level accumulated in the leaves (Saito et al., 2001). Considering a short proliferation period and successful establishment of tissue culture in an upscale bioreactor system , O. pumila HR is a potential source of in vitro CPT production, and it serves as a model for studying CPT biosynthesis. To date, biosynthetic genes that encode catalytic enzymes, tryptophan decarboxylase (TDC), geraniol 10-hydroxylase (G10H), secologanin synthase (SLS), and strictosidine synthase (STR), were isolated and characterized in O. pumila (Yamazaki et al., 2003a,b). The incorporation of labeled [1-13 C] glucose into HR of O. pumila indicated that the secologanin moiety of the CPT molecule is produced via 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway and tryptamine is from the shikimate pathway . The biosynthetic intermediates were searched using nontargeted metabolite profiling of the HR with the suppression of TDC and SLS . Using deep transcriptome analysis and untargeted metabolic profiling of O. pumila HR, the candidate genes in the specialized metabolism were profiled in comparison with those in the cell suspension culture (CSC) that doesn't produce CPT (Yamazaki et al., 2013). Recently, Rohani et al. (2016) reported the characterization of a suppressor, OpMYB1, involved in the regulation of specialized metabolism in O. pumila. Their study demonstrated that overexpression of OpMYB1 in HR negatively affected the transcript level of TDC and resulted in reduced accumulation of CPT. Furthermore, some other transcription factor genes were profiled as genes specifically expressed in HR. The HR culture of O. pumila is a robust system for studying CPT biosynthesis in plants. To utilize HR effectively as an alternative source for CPT, the ability to control and manipulate the system is indispensable. Information on the production pathway, including enzymes, intermediates and regulatory factors that influence CPT and other related metabolites production, is necessary.
The regulatory mechanism underlying MIA biosynthesis has been intensively studied in Catharanthus roseus. The jasmonate signaling pathway regulates the production of vinca alkaloids. Transcription factors CrMYC2, CrORCA2, CrORCA3, and CrWRKY1 were identified to be involved in this signal transduction (Menke et al., 1999;van der Fits andMemelink, 2000, 2001;Suttipanta et al., 2011;Zhang et al., 2011;Schluttenhofer et al., 2014). Among them, CrORCA2 and CrORCA3 are classified as APETHALA2/Ethylene Response Factor (AP2/ERF). AP2/ERF represents a superfamily of transcription factor that plays an important role in plant biological systems, such as growth and development (Chuck et al., 1998;Ohto et al., 2009), response to stimuli (Licausi et al., 2010;Mizoi et al., 2012;Zhao et al., 2012), and biosynthesis of primary and secondary metabolites (van der Fits andMemelink, 2000, 2001;Zhang et al., 2005;Shoji et al., 2010;Todd et al., 2010;Cárdenas et al., 2016;Thagun et al., 2016). On the basis of the number of conserved AP2/ERF domain and the existence of a family-specific B3 domain, AP2/ERFs are further classified into three families: AP2, ERF, and RAV. Among these families, the ERF family is the largest and is characterized by a single AP2/ERF domain. In 2006, Nakano et al. subcategorized the Arabidopsis ERF family into 12 groups on the basis of phylogenetic relationships, exon-introns, and protein motifs (Nakano et al., 2006). In their study, group I to IV correspond to dehydrationresponsive element binding-protein (DREB) factors, and group V-X, group VI-L, and group Xb-L correspond to ERF factors. This classification provides a perspective on the biological functions of each group, as demonstrated by several members of the group IX ERF from different plants that are involved in regulation of secondary metabolite biosynthesis. For example, CrORCA2 and CrORCA3 from C. roseus positively regulate the expression of genes in vinca alkaloid biosynthesis (Menke et al., 1999;van der Fits andMemelink, 2000, 2001). NbERF1 from Nicotiana benthamiana caused a reduction in nicotine alkaloids accumulation when the expression was suppressed (Todd et al., 2010). A cluster of ERFs in the NIC2 locus of Nicotiana tabacum control nicotine biosynthesis in tobacco (Shoji et al., 2010). Recently, the orthologs of GAME9/JRE4 were reported to modulate steroidal glycoalkaloids (SGA) in tomato and potato (Cárdenas et al., 2016;Thagun et al., 2016). AaERF1 and AaERF2 from Artemisia annua L. positively regulate artemisinin biosynthesis (Yu et al., 2012).
In this study, ERF family genes, OpERF1, OpERF2, OpERF3, OpERF4, and OpERF5, were isolated from CPT-producing HR of O. pumila. Their expression profiles and evolutional relationships with reported secondary metabolite-regulating AP2/ERFs and Arabidopsis AP2/ERFs were analyzed. For further characterization, RNA interference (RNAi) transgenic HR with suppressed OpERF1 (ERF1i) or OpERF2 (ERF2i) were generated. Using the transcriptome and metabolome approach, a positive role of OpERF2 in regulating the MEP and secologaninstrictosidine pathways, which produce a precursor for CPT biosynthesis, and the possible involvement of OpERF1 in stress response were revealed.

Plant Materials
O. pumila transgenic HR were maintained in the condition as described by Saito et al. (2001). Collected HR were immediately frozen in liquid nitrogen and homogenized using multibeads shocker (Yasui Kikai, Japan).

RNA Extraction and cDNA Synthesis
Total RNA was extracted from homogenized 3-week-old HR using RNeasy Plant Mini Kit (Qiagen, Germany) and treated with RNase-free DNase (Qiagen, Germany). First strand cDNA synthesis was carried out using Superscript R VILO TM Reverse Transcriptase (Invitrogen, USA) according to the manufacturer's instructions.

Isolation and Cloning of OpERFs
Isolation of differentially expressed genes between HR and CSC using PCR-select TM cDNA subtraction (Clontech, Japan) was performed according to previously described by Bunsupa et al. (2011). HR-specific AP2/ERF unigenes were selected from de novo transcriptome assembly (Yamazaki et al., 2013) using log 2 reads per kilobase of transcript per million mapped reads (RPKM) of HR/CSC ≥1 as a cut-off value. Fulllength coding sequences of OpERF1 (LC183910), OpERF2 (LC171328), OpERF3 (LC171329), OpERF4 (LC183911), and OpERF5 (LC183912) were obtained from 3 ′ to 5 ′ RACE using SMART TM RACE cDNA amplification kit (Clontech, Japan) according to the manufacturer's instructions. Amplified RACE fragments were subjected for DNA sequencing by Eurofins Genomics, Japan.

Quantitative Real-Time PCR (qRT-PCR)
cDNA synthesized from total RNA was used as an amplification template in qRT-PCR. PCR reaction was performed using Power SYBR R Green PCR Master Mix (Thermo Fisher Scientific, USA) on Applied Biosystems StepOnePlus TM Real-Time PCR System (Thermo Fisher Scientific, USA). Gene specific primers used in qRT-PCR are listed in Table S1. A housekeeping gene βtubulin was used for normalization. Relative expression fold change of the target genes in ERF1i and ERF2i vs. GUSi was calculated by comparative C T method. The significant differences compared to average of GUSi were determined by t-test (p < 0.05).

RNA Sequencing (RNA-seq) and Raw Data Processing
Total RNA extracted from each transgenic lines were subjected for cDNA library preparation and sequencing at Kazusa DNA Research Institute (Chiba, Japan) on Illumina HiSeq TM 2000 sequencer (Illumina Inc., USA). Raw read pre-processing, alignment of clean reads to a reference O. pumila de novo assembly (Rohani et al., 2016), transcript abundant estimation and expression level in fragment per kilobase exon per million mapped fragment (FPKM) were performed as described by Rai et al. (2016a). Raw read sequences of six RNAi HR lines were deposited in NCBI's Gene Expression Omnibus (Accession number GSE89674; Edgar et al., 2002).

Differential Gene Expression Analysis and Gene Ontology (GO) Enrichment Analysis
Expression fold change of genes in ERF1i and ERF2i relative to GUSi was calculated using DESeq2 (Love et al., 2014) by assigning log 2 FPKM ≥ 1 and false discovery rate <0.05 as a threshold. Upregulated and downregulated gene sets were used in GO analysis based on Fisher's exact test with a p-value cutoff 0.05, using O. pumila de novo assembly as a reference set as described by Rai et al. (2016b). Blast2GO version 3.0 (Biobam, Spain) was used for the assignment of GO term and visualization of the GO functional classification. Only GO categories passing the statistical criteria were chosen for further discussion.
O. pumila de novo transcriptome assemblies were subjected to tBLASTx alignment against transcripts from the MEP and secologanin-strictosidine pathways from C. roseus (Van Moerkercke et al., 2013;Miettinen et al., 2014) and the jasmonic acid biosynthesis pathway from Arabidopsis to identify their corresponding contigs in O. pumila (Table S2). Top hit contigs of each enzyme with p < 0.05, positive alignment ≥70%, and length >500 bp, were used in functional analysis.

Untargeted Metabolite Analysis Using LC-QTOF-MS
The homogenized HR were freeze dried using FDU-2200 freeze dryer (Tokyo Rikakikai, Japan) and metabolites were extracted with 50 µL of 80% MeOH containing 2.5 µM lidocaine and 10-camphor sulfonic acid per milligram dry weight using a mixer mill with zirconia beads for 7 min at 18 Hz at 4 • C. After centrifugation for 10 min, the supernatants were filtered using an HLB µElution plate (Waters). The extracts were analyzed using LC-QTOF-MS according to the condition described by Nakabayashi et al. (2014) with modifications as following; LC: gradient program, 0.5%B at 0 min, 5%B at 0.1 min, 80%B at 10 min, 99.5%B at 10.1 min, 99.5%B at 12.0 min, 0.5%B at 12.1 min, and 0.5%B at 15.0 min; flow rate, 0.3 ml/min at 0 min, 0.3 ml/min at 10 min, 0.4 ml/min at 10.1 min, 0.4 ml/min at 14.4 min, and 0.3 ml/min at 14.5 min; MS detection: mass range, m/z 50-1500; polarity, positive. MS/MS data was acquired in the ramp mode as the following analytical conditions: (1) MS: mass range, m/z 50-1500; scan duration, 0.1 s; inter-scan delay, 0.014 s; data acquisition, centroid mode; polarity, positive; and (2) MS/MS: mass range, m/z 50-1500; scan duration, 0.02 s; inter-scan delay, 0.014 s; data acquisition, centroid mode; polarity, positive collision energy, ramped from 10 to 50 V. In this mode, MS/MS spectra of the top 10 ions (>1000 counts) in an MS scan were automatically obtained. If the ion intensity was <1000, MS/MS data acquisition was not performed and moved to next top 10 ions. Data acquisition was performed using Progenesis CoMet (Nonlinear Dynamics, Durham, NC, USA). Peaks with intensity less than 2000 were treated as noise and not proceeded for MS/MS data acquisition. Peak normalization was conducted by comparison with internal standard (lidocaine). Chemical assignment was conducted using m/z values reported previously as a reference (Yamazaki et al., 2013) with tolerance 0.01 Da.

Isolation of Hairy Roots Specific AP2/ERFs
A previous study reported higher accumulation of CPT and putative intermediates in the HR culture of O. pumila, whereas no synthesis was observed in the CSC derived from HR (Saito et al., 2001). This opposite secondary metabolite profile for CPT and related intermediates implies that genes associated with the production of these compounds are active in HR and limited/not active in CSC. Therefore, using comparative expression analysis, we identified AP2/ERFs that are specially expressed in HR when compared with CSC. From the data derived by PCRselect cDNA subtraction (Rohani et al., 2016), six out of 353 HR-specific fragments were annotated as ERF-like proteins. Full-length cDNA cloning revealed that all these six fragments constitute a part of 903-bp cDNA sequence (designated as OpERF1). Next we evaluated a more comprehensive data set acquired by deep transcriptome analysis (Yamazaki et al., 2013) and identified 95 AP2/ERF annotated unigenes, with 31 unigenes being HR-specific ( Table S3). As expected, among these 31 HRspecific AP2/ERFs, a homolog of OpERF1, unigene27166_All was also included. We further reviewed the annotation list of these unigenes and could narrow down four unigenes with a high sequence similarity with the functionally characterized AP2/ERFs (Table S3). Unigene18253_All (designated as OpERF2) showed 68% sequence identity to CrORCA3, an AP2/ERF transcription factor of C. roseus that regulates the expression of key genes in TIA biosynthesis (van der Fits andMemelink, 2000, 2001), and 65% sequence identity to NbERF1, which is involved in nicotine alkaloids biosynthesis in N. benthamiana (Todd et al., 2010). Unigene26293_All (designated as OpERF3) showed 69% identity to Arabidopsis AtERF13, a member of the group IX ERF as CrORCA2 and CrORCA3. Unigene34283_All (designated as OpERF4) and Unigene37445_All (designated as OpERF5) showed 60 and 88% identity, respectively, to pathogen response-related NtERF5 (Fischer and Dröge-Laser, 2004). These annotations formed the basis for selection of these four unigene candidates for further characterization. Cloning for their full-length coding sequences was performed using 3 ′ and 5 ′ RACE and top hit results from BLAST search of full-length OpERF1 to OpERF5 against NCBI's protein database are listed in Table 1.

AP2/ERF Domain and Phylogenetic Analysis of OpERFs
A phylogenetic tree was constructed using the amino acid sequences of all five OpERFs identified in this study, together with known alkaloid-regulating AP2/ERFs, terpenoid-regulating AP2/ERFs, and Arabidopsis AP2/ERFs, to portray their evolutionary relationships (Figure 1). OpERF1 was clustered in the group VII ERF clade. Several members of the group VII ERF in Arabidopsis have been reported to play an important role in biotic and abiotic stress responses. For example, hypoxia responsive ERF (HRE) 1 and HRE2 respond to a low oxygen environment (Licausi et al., 2010), and RAP2.2 plays a role in Botrytis cinerea resistance and ethylene response (Zhao et al., 2012). OpERF2 was assigned to the clade that is composed of group IX ERFs, with the members previously reported to be involved in specialized metabolism. Interestingly, OpERF2 exhibited a closer relationship with C. roseus ORCA2 and ORCA3 that regulate MIA biosynthesis than with SlJRE4, NbERF1, NtERF189, and NtORC1, the ERFs from Solanaceae family that regulate SGA and nicotine alkaloids biosynthesis (van der Fits andMemelink, 2000, 2001;Shoji et al., 2010;Todd et al., 2010;Cárdenas et al., 2016;Thagun et al., 2016) or AaERF1 and AaERF2 from Asteraceae family that regulate terpenoid biosynthesis (Yu et al., 2012). OpERF4 and OpERF5 were classified into a separated cluster that exclusively consists of Arabidopsis group IXc ERFs. Several members of this group, such as ERF1 (AT3G23240.1), AtERF14, and AtERF15 act as strong transcription activators in defense response (Oñate-Sánchez and Singh, 2002;Zhang et al., 2015). OpERF3, interestingly, was designated to the group I ERF clade composed of DREB members. Two functionally characterized members of this group, RAP2.4 and AtERF53, are related to abiotic stress responses (Lin et al., 2008;Hsieh et al., 2013). Alignment of full-length deduced amino acid sequences of the OpERF1 to OpERF5 showed the ERF family-specific WLG motif in all OpERF sequences. Two characteristic amino acids among ERF members, alanine at position 14 and aspartic acid at position 19, were observed in the AP2/ERF domain of OpERF1, OpERF2, OpERF4, and OpERF5. On the other hand, valine and leucine, the characteristic amino acids among DREB members, were found at the respective positions in OpERF3 (Liu et al., 1998;Sakuma et al., 2002;Nakano et al., 2006; Figure S2A). This difference in the AP2/ERF domains within AP2/ERFs suggests that OpERF3 might have a distinct function. In the alignment of OpERF2, secondary metabolite-regulating ERF sequences and Arabidopsis ERF sequence, the serine substitutes at position 14 of AP2/ERF domain were found in Solanaceae ERFs. A serine-rich regions, which has been proposed to be involved in transactivation of AP2/ERF transcription factor (Riechmann and Meyerowitz, 1998), exist in C-terminal region of all sequences (Figure 2). The sequence alignment of the group VII ERFs clearly showed a characteristic N-terminal MCGGAI(I/L) motif (Tournier et al., 2003) in all sequences ( Figure S2B). Therefore, OpERF1 may possess a similar function as the members of this group. Taking into account that OpERF1 is dominantly expressed (RPKM 21.229) among the five isolated OpERFs (Table S3) with a potential role in biotic and abiotic stress responses and OpERF2 is the second highly expressed (RPKM 6.433; Table S3) with a close phylogenetic relationship with secondary metabolite-regulating AP2/ERFs, we selected these two AP2/ERFs for further functional characterization.

Generation of RNAi Transgenic Hairy Roots
Transgenic research has been performed by the scientists for the functional characterization of genes. By perturbing the expression of the target transcript by using gain and loss of function coupled with multiomics analysis, biological importance, and functionality of genes of interest could be understood (Reuben et al., 2013;Verma et al., 2015). For functional characterization, transgenic HR lines with suppressed expression of OpERF1 (ERF1i) and OpERF2 (ERF2i) were generated using RNAi technique, and the β-glucuronidase gene was used in the negative control lines (GUSi). Expression levels of target genes in each transformation event were investigated using qRT-PCR ( Figure S3). We obtained transgenic lines with strong suppression of target genes, with <0.02-fold expression of OpERF1 in ERF1i and <0.04-fold expression of OpERF2 in ERF2i when compared with the expression levels in GUSi. Two of the transgenic HR lines, each with the lowest expressions of OpERF1 (ERF1i-7 and ERF1i-19) and OpERF2 (ERF2i-2 and ERF2i-3), and two negative control GUSi lines (GUSi-9 and GUSi-10) were subjected to RNA-seq and untargeted metabolome analysis (Figure 3).

RNA-seq Based Transcriptome Profiling of Transgenic HR Lines
RNA-seq performed for the six transgenic HR lines generated more than 13 million paired end raw reads in each library, which provided more than 88 million paired end raw reads in total. Removal of adapter sequences, low-quality reads, and reads shorter than 50 bp yielded a total of over 84 million high-quality paired end clean reads ( Table 2). Clean reads from FIGURE 1 | Phylogenetic tree of OpERF1 to OpERF5, alkaloid-regulating AP2/ERFs, terpenoid-regulating AP2/ERFs, and Arabidopsis AP2/ERFs. OpERF1 to OpERF5 are marked with a red squares. Alkaloid-and terpenoid-regulating AP2/ERFs are marked with a green squares. The tree was generated based on the alignment of full-length amino acid sequences using neighbor-joining algorithm by MEGA6 software. The scale bar shows the evolutionary distances. ERF group numbers based on the classification by Nakano et al. (2006)  individual libraries were aligned with previously described and annotated O. pumila de novo transcriptome assembly (Rohani et al., 2016) using Bowtie 2.0 (Langmead et al., 2009), and transcript abundance in FPKM was estimated using the RSEM program (Li and Dewey, 2011). More than 88% of the clean reads from each library could be aligned to a reference de novo assembly. The full-length coding sequences of OpERF1 and OpERF2 cloned in this study can be aligned with assembled contigs c15284_g1_i1 and c17706_g2_i1, respectively, with a high percentage of sequence identity (more than 98%) and low E-value.
Principal component analysis (PCA), a powerful multivariate data analysis approach, assists visualization of the relationship between samples with high-dimensional data set (Ringnér, 2008). Therefore, we performed PCA of the transcriptome data to illustrate the overall expression pattern of the six transgenic HR lines. The PCA score plot showed apparent clustering of the RNAi lines with the same silenced gene and separation  of the RNAi lines with the different silenced genes across PC1. Although ERF1i-7 and ERF1i-19 were separated along PC2, they were in a different quartile from ERF2i and GUSi (Figure 4). This clustering and separation pattern suggested that, the transcriptome profile resulting from silencing of AP2/ERFs showed a specific perturbation response. To obtain more details of the transcripts in response to the silencing of OpERF1 and OpERF2, differential expression analysis of the suppressed OpERF lines and GUSi lines was performed using DESeq2 (Love et al., 2014). Overall, 291 of 907 upregulated contigs and 112  of 450 downregulated contigs were differentially expressed in both ERF1i-7 and ERF1i-19. A higher number of the perturbed transcripts were found in ERF2i lines; 1180 of 1591 upregulated contigs and 352 of 691 downregulated contigs were differentially expressed in both ERF2i-2 and ERF2i-3 ( Table 3). The higher number of affected transcripts in ERF2i than in ERF1i inferred that the expression of more genes was affected by the suppression of OpERF2 and that the overlap of perturbed transcripts was higher between ERF2i-2 and ERF2i-3 than between ERF1i-7 and ERF1i-19. This observation was also demonstrated by the position of the samples in the PCA score plot, in which ERF2i-2 and ERF2i-3 were grouped together but ERF1i-7 and ERF1i-19 were separated (Figure 4).

Effects of OpERF Suppression on the MEP and Secologanin-Strictosidine Pathways
Previous studies on the involvement of the group IX ERF factors in alkaloid biosynthesis in several plants increased our interest to evaluate whether OpERF2 has a similar function because it has a close phylogenetic relationship with secondary metaboliteregulating group IX ERFs. We investigated the expression levels of transcripts in the MEP pathway that synthesizes isopentenyl pyrophosphate and dimethylallyl pyrophosphate, which are primary metabolite substrates of TIA biosynthesis in plant, and secologanin-strictosidine pathway that produces strictosidine, a precursor of CPT  and other TIAs in O. pumila. Putative MEP and secologaninstrictosidine pathways transcripts in O. pumila were identified on the basis of sequence similarity with the corresponding reference genes from C. roseus (Table S2). In a heatmap representing the comparative degree of expression between RNAi lines, a trend of downregulation of MEP and secologanin-strictosidine pathways genes was observed when OpERF2 was suppressed; however this trend was not observed in ERF1i lines (Figure 5).
the annotated de novo transcriptome assembly of O. pumila. Consistent with the observation made using qRT-PCR and heatmap, GO terms associated with oxidation-reduction that are generally involved in secondary metabolite biosynthesis were enriched in the downregulated transcripts of the ERF2i lines. In the ERF2i upregulated transcript set, a broad range of enriched GO terms related to primary metabolism was enlisted (Figure 6).
To examine the effects of reduced OpERF expression on the metabolomic profile of transgenic HR lines, untargeted metabolome analysis of six RNAi lines (six biological replicates per line) was performed using LC-QTOF-MS. Chemical assignment to ion peaks was conducted based on reference m/z values as reported by Yamazaki et al. (2013), and metabolite levels were determined using peak intensity. Despite a trend of reduced expression of CPT precursor-producing enzyme genes in ERF2i, no drastic difference in the accumulation of CPT and its late-step intermediates was observed among the RNAi lines ( Figure S5). To further understand the differences in the metabolic profiles of the RNAi lines, PCA was conducted using the normalized intensities of all 3735 detected peaks. The PCA score plot showed clustering of ERF2i, clustering GUSi, and dissociation of ERF1i-7 and ERF1i-19 at a short distance. Proximity of the samples in the PCA score plot as well indicated no drastic difference in the metabolomic profiles ( Figure S6).
In this study, a trend of downregulation was observed throughout MIA precursor biosynthesis pathways, starting from the MEP pathway to the production of strictosidine in response to OpERF2 suppression. In C. roseus, the regulation of early MIA biosynthesis is reportedly being separated between two families of transcription factors. ERF transcription factors, ORCA2 and ORCA3, only act on late secologanin-strictosidine pathway (TDC, loganicacid-O-methyltransferase, SLS, and STR) and strictosidine-β-glucosidase (Menke et al., 1999;van der Fits andMemelink, 2000, 2001;Miettinen et al., 2014). On the other hand, the basic helix-loop-helix (bHLH) transcription factors, BIS1 and BIS2, specifically regulate the genes in MEP and iridoid pathways (Van Moerkercke et al., 2015. This demonstrates a diversity in transactivation patterns between homologs from difference species. Several studies have reported a basic helix-loop-helix transcription factor MYC2 acting upstream of group IX ERFs in response to the jasmonate signal to regulate synthesis of specialized metabolism. In C. roseus, MYC2 binds to the jasmonate-responsive element in the promoter region of ORCA3 and can induce ORCA3 expression, which then results in increased TIA levels (Zhang et al., 2011). In tobacco and tomato, MYC2 regulate nicotine alkaloids and cholesterol biosynthesis genes, respectively, by induction of group IX ERFs expression, acting synergistically with group IX ERFs, or binding directly to alkaloid biosynthesis genes (Shoji and Hashimoto, 2011;Cárdenas et al., 2016). We expected either no alteration of MYC2 expression or upregulation of MYC2 to compensate the function of OpERF2 in ERF2 suppressed lines. Surprisingly, the levels of putative MYC2 transcripts were slightly downregulated in ERF2i (log 2 FPKM fold change −0.4 to −0.6 in ERF2i when compared with GUSi). Since the available data are not substantial to make a conclusion, further study is required to confirm the relationship between ERF2 and MYC2 in O. pumila.
The inconsistency between enzyme expression and metabolite accumulation is not unexpected, as observed in the study conducted by Peebles et al. (2009). Transgenic HR of C. roseus that overexpressed ORCA3 failed to increase the accumulation of TIAs, despite induced expression of key enzyme genes in TIA biosynthesis. Moreover, a complex biosynthesis network for secondary metabolites may also avert the effects of suppressed OpERFs on the alteration of CPT and intermediates in HR. Therefore, besides alteration of transcription factor expression, other measures such as inhibition of metabolic flux to branching pathways, escalation of enzyme activity in rate-limiting steps, and augmentation of metabolite transportation and metabolic sink should also be considered for the successful enhancement of CPT production.

Jasmonic Acid-Related GO Annotation in the ERF1i Lines
As discussed in an earlier section that transcript levels of MEP and secologanin-strictosidine pathways were not affected in ERF1i, we examined enriched GO terms in ERF1i to understand the scope of categories affected by OpERF1 suppression. In the ERF1i upregulated transcript set, several stress response-related GO terms, including jasmonic acid response, were enriched ( Figure S7). Jasmonic acid is a phytohormone involved in various biological processes in plants, especially in wounding and defense responses. Therefore, we evaluated the expression profiles of enzyme genes in jasmonic acid biosynthesis by using Arabidopsis genes as a reference for the identification of putative genes in O. pumila (Table S2). A heatmap generated using FPKM of the corresponding contigs showed only moderately induced expression in ERF1i-19, suggesting no effect on jasmonic acid biosynthesis in response to the suppression of OpERF1 (Figure S8). Arabidopsis group VII ERFs have been confirmed to respond to a low-oxygen environment and biotic stress; hence, the enrichment of several stress-associated GO terms in ERF1i upregulated genes suggested that OpERF1 may be responsible for a similar function. Further studies on these stress responses are required to validate this hypothesis.

CONCLUSION
To the best of our knowledge, this is the first report on the functional characterization of AP2/ERF transcription factors associated with an important specialized metabolism in O. pumila. We isolated OpERF1 to OpERF5, five key AP2/ERF genes specially expressed in CPT-producing HR vs. non-CPTproducing CSC. On the basis of the transcriptome analysis and phylogenetic relationships with secondary metaboliteregulating AP2/ERFs, a dominantly expressed OpERF1 and alkaloid-regulating AP2/ERF homolog OpERF2 were selected for the functional characterization. We generated the transgenic HR lines with suppressed expression of OpERF1 or OpERF2 and used the transcriptome and metabolome to identify the specialized metabolic pathways affected by the altered OpERF1 and OpERF2 expressions. The phylogenetic relationship of OpERF1 with Arabidopsis group VII ERFs and enrichment of gene ontologies related to stress response in OpERF1-suppressed lines suggested its function during stress conditions. The positive role of OpERF2 in regulation of secondary metabolism was demonstrated in ERF2i lines. Suppression of OpERF2 resulted in the downregulation of genes in the MEP and secologaninstrictosidine pathways, which are early steps that supply a precursor for CPT biosynthesis in O. pumila; however, effects on metabolite accumulation were not observed. The finding of this study serve as an accentuating reference for a future study on the regulators of valuable alkaloid production in other medicinal plants and a model for the sustainable production of commercially important crops and herbs.  Open reading frame is represented by a white box and AP2/ERF domain is represented by a black box. Solid lines and dashed lines under OpERF sequences indicate the part of genes used as the target sequences in binary vector construction and the part of genes amplified in qRT-PCR, respectively. RB, right border; LB, left border; Km-r, kanamycin resistant gene; 35S-P, Cauliflower Mosaic Virus 35S promoter; NOS-T, nos terminator; Hyg-r, hygromycin resistant gene; bp, base pair. Figure S2 | Amino acid sequence alignment of (A) OpERF1 to OpERF5, (B) OpERF1, and group VII Arabidopsis AP2/ERFs Op, Ophiorrhiza pumila; At, Arabidopsis thaliana. The sequences were aligned using ClustalW in Bioedit software version 7.2.5. AP2/ERF domain is enclosed in a black box. N-terminal MCGGAI(I/L) motif is enclosed in a red box. Asterisks indicate the reserved amino acids at position 14 and 19 of AP2/ERF domain. Solid bar indicates WLG motif. Figure S3 | Expression levels of OpERF1 and OpERF2 in RNAi lines. Data represents means of 3 repeated experiments (n = 1) ± standard deviation. Figure S4 | Expression levels of MIA biosynthesis genes in RNAi lines. Expression fold change relative to GUSi determined using qRT-PCR is presented by bar graph. FPKM of enzyme's corresponding contig obtained from RNA-seq is presented by line graph. qRT-PCR data represents means of 4 biological replicates ± standard deviation and experiments were repeated twice. Asterisks indicate significant differences compared to average of GUSi (t-test, * p < 0.05, * * p < 0.01).

Figure S5 | Accumulation levels of CPT and intermediates in RNAi lines.
Data represents means of 6 biological replicates ± standard deviation. Asterisks indicate significant differences compared to average of GUSi (t-test, * p < 0.05, * * p < 0.01).