Triterpenoid Biosynthesis and Engineering in Plants

Triterpenoid saponins are a diverse group of natural products in plants and are considered defensive compounds against pathogenic microbes and herbivores. Because of their various beneficial properties for humans, saponins are used in a wide range of applications beyond medicine. Saponin biosynthesis involves three key classes of enzymes: oxidosqualene cyclases, which construct the basic triterpenoid skeletons; cytochrome P450 monooxygenases, which mediate oxidations; and uridine diphosphate-dependent glycosyltransferases, which catalyze glycosylations. The discovery of genes committed to saponin biosynthesis is important for the stable supply and biotechnological application of these compounds. Here, we review the genes identified to date in triterpenoid biosynthesis, summarize recent advances in the biotechnological production of useful plant terpenoids, and discuss the bioengineering of plant triterpenoids.

Introduction
Triterpenoids, including steroids, are a highly diverse group of natural products widely distributed in plants (Vincken et al., 2007). Plants often accumulate these compounds in their glycosylated form, saponins. Saponins comprise a hydrophobic triterpenoid aglycone called a sapogenin and one or more hydrophilic sugar moieties.
Biologically, plant saponins are considered defensive compounds against pathogenic microbes and herbivores (Osbourn, 1996; Kuzina et al., 2009; Szakiel et al., 2011). Saponins also have beneficial properties for humans. For example, Panax and Glycyrrhiza plants are well-known traditional herbal medicines containing the saponins ginsenosides and glycyrrhizin, respectively, which have various pharmacological effects (Shibata, 2000, 2001). Saponins have a variety of other applications as well. They foam when mixed with water, as indicated by the word sapo, meaning soap in Latin. In fact, Saponaria officinalis (common soapwort) and Quillaja saponaria (soapbark) have been used as soap. The saponins of Q. saponaria are also used as emulsifiers in cosmetics and foods. Furthermore, glycyrrhizin is used as a natural sweetener, with 150 times the sweetness of sugar.
In this article, we summarize the genes involved in triterpenoid biosynthesis identified to date and the recent advances in the biotechnological production of useful plant terpenoids; finally, we provide a perspective on the bioengineering of plant triterpenoids.

Triterpenoid biosynthesis
Terpenoids are built up from the C5 unit isopentenyl diphosphate (IPP). IPP is supplied by the cytosolic mevalonic acid (MVA) pathway and the plastidial methylerythritol phosphate (MEP) pathway. Triterpenoids and sesquiterpenoids are biosynthesized via the MVA pathway, whereas monoterpenoids, diterpenoids, and tetraterpenoids are biosynthesized via the MEP pathway. The first diversifying step in triterpenoid biosynthesis is the cyclization of 2,3-oxidosqualene catalyzed by oxidosqualene cyclase (OSC; Abe et al., 1993; Figure 1). In general, animals and fungi have only one OSC, lanosterol synthase (LAS), for sterol biosynthesis. However, higher plants have several OSCs not only for sterol biosynthesis, such as cycloartenol synthase (CAS) and LAS (Ohyama et al., 2009), but also for triterpenoid biosynthesis. The molecular diversity of OSCs enables more than 100 skeletal variations of triterpenoids in plants. To date, dozens of OSC genes from model plants, crops, and medicinal plants have been cloned and functionally characterized (reviewed in Kushiro and Ebizuka, 2010). For example, the Arabidopsis thaliana genome has 13 OSC genes, and the functional identification of these genes has been completed, at least by in vitro experiments. Most OSCs from eudicots are phylogenetically classified into several groups, and the reaction products differ from group to group. Site-directed mutagenesis and homology modeling of plant OSCs have been carried out to investigate the reaction mechanisms underlying their product variety (reviewed in Kushiro and Ebizuka, 2010). Among OSCs from various organisms, the structure of the human LAS protein has been elucidated (Thoma et al., 2004).
After an OSC constructs the basic triterpenoid skeleton, the skeleton is modified to a hydrophobic aglycone called a sapogenin. The first modification is oxidation catalyzed by cytochrome P450 monooxygenase (P450), and this step enables further modifications such as O-glycosylation. P450 species are highly diverse and catalyze several kinds of chemical reactions in secondary metabolism (Kahn and Durst, 2000).
Glycosylation is essential for saponin biosynthesis. It increases the water solubility and changes the biological activity of triterpenoids. Uridine diphosphate (UDP)-dependent glycosyltransferases (UGTs) recognize a wide range of natural products as acceptor molecules.
P450 species and UGTs belong to multigene families and are the key factors for the explosive diversification of other natural products in plants. Among the P450 species reported in saponin biosynthesis, the CYP families vary with respect to not only the carbon skeletons of the triterpenoid substrates but also the target positions of the reactions. The diversity of these enzymes makes identification of the genes for saponin biosynthesis difficult. The genes involved in triterpenoid biosynthesis identified in plants to date are presented in the sections below.
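As a compact restatement of the pathway assignments given in this section, the following minimal Python sketch tabulates the terpenoid classes by the IPP-supplying pathway. The pathway assignments (MVA or MEP) come from the text; the number of C5 units per class is standard textbook nomenclature that the text does not state explicitly, and the names `TerpenoidClass`, `TERPENOID_CLASSES`, and `classes_from` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

C5_UNIT_CARBONS = 5  # each isopentenyl diphosphate (IPP) unit contributes five carbons


@dataclass(frozen=True)
class TerpenoidClass:
    name: str
    c5_units: int    # assumption: standard nomenclature, not stated in the text
    ipp_source: str  # "MVA" (cytosolic) or "MEP" (plastidial), as stated in the text

    @property
    def carbons(self) -> int:
        """Carbon count of the unmodified skeleton."""
        return self.c5_units * C5_UNIT_CARBONS


TERPENOID_CLASSES = [
    TerpenoidClass("monoterpenoid", 2, "MEP"),
    TerpenoidClass("sesquiterpenoid", 3, "MVA"),
    TerpenoidClass("diterpenoid", 4, "MEP"),
    TerpenoidClass("triterpenoid", 6, "MVA"),
    TerpenoidClass("tetraterpenoid", 8, "MEP"),
]


def classes_from(pathway: str) -> list[str]:
    """Return the names of the terpenoid classes supplied by the given pathway."""
    return [c.name for c in TERPENOID_CLASSES if c.ipp_source == pathway]


if __name__ == "__main__":
    for pathway in ("MVA", "MEP"):
        print(pathway, "->", ", ".join(classes_from(pathway)))
```

The table makes visible why efforts to raise triterpenoid or sesquiterpenoid supply in a host usually target the same upstream (MVA) pathway.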

Triterpenoids in Arabidopsis thaliana
The model plant A. thaliana has a total of 13 OSCs, 246 P450 species (Werck-Reichhart et al., 2002), and 112 UGTs (Paquette et al., 2003). The protein encoded by At5g48010, an OSC, was identified as thalianol synthase. However, no tricyclic triterpenoid, including thalianol, had been reported in Brassicales at that time. Thalianol was later detected at about 0.4% of total sterols in the roots of A. thaliana, and CYP708A2 and CYP705A5 were identified as P450 species in thalianol metabolism (Field and Osbourn, 2008). Although the variety and content of saponins in A. thaliana have not been clarified in detail, Warnecke et al. (1997) reported one UGT for sterol glycosylation.
Figure 1 | Triterpenoid biosynthetic pathway. After the cyclization of 2,3-oxidosqualene catalyzed by OSC, a triterpenoid undergoes various modifications including P450-catalyzed oxidation and UGT-catalyzed glycosylation. Blue arrows, OSC-catalyzed steps; red arrows, P450-catalyzed steps; green arrows, additional modifications including UGT-catalyzed steps.

Soyasaponins in Glycine max
Although DDMP saponins and their derivatives have beneficial effects on human health, some group A saponins are unfavorable because of their astringent taste (Okubo et al., 1992). To reduce the astringent taste of soybean, transgenic soybean plants were generated in which the gene for β-amyrin synthase (bAS), an OSC, was suppressed by RNAi silencing. The sapogenol levels of the transgenic seeds were reduced to below 500 μg g⁻¹, or about 25% of the content in the wild type. Only CYP93E1, a C-24 hydroxylase, has been identified as an oxidase in soyasaponin biosynthesis. Soyasaponin βg, the main soyasaponin in G. max, is soyasapogenol B with three sugar molecules, glucuronic acid, galactose, and rhamnose, attached at the C-3 hydroxyl group. UGT73P2 and UGT91H4 attach the second and third sugars in the sugar chain, respectively (Shibuya et al., 2010). These UGTs were selected as G. max expressed sequence tags (ESTs) with homologous sequences in the EST database of Medicago truncatula, which also produces a soyasaponin βg intermediate called soyasaponin I (Huhman et al., 2005). The activity of the first UGT, UDP-glucuronic acid:soyasapogenol B-glucuronyl transferase, was detected in the microsomal fraction of G. max (Kurosawa et al., 2002). However, no gene encoding this UGT has been cloned. Group A saponins with a terminal acetylated sugar in the C-22 sugar chain accumulate only in seed hypocotyls (Shimoyamada et al., 1990). A gene controlling the terminal sugar variety was mapped on soybean chromosome 7 (Takada et al., 2010).

Saponins in Medicago truncatula
Medicago truncatula, a leguminous model plant, accumulates over 30 oleanane-type saponins (Huhman et al., 2005). A corresponding bAS cDNA was identified by EST database mining (Suzuki et al., 2002) and homology-based PCR (Iturbe-Ormaetxe et al., 2003). UGT73K1 and UGT71G1 were characterized as triterpenoid glycosyltransferases by integrated analysis of the transcriptome and metabolome of M. truncatula, but UGT71G1 preferred some flavonoids to triterpenoids as substrates in vitro (Achnine et al., 2005). Although the positions glycosylated by the two UGTs were not clarified in vitro, an in silico docking simulation of the UGT71G1 crystal structure with UDP-glucose and medicagenic acid suggested that UGT71G1 can transfer a glucose molecule to the hydroxyl group at C-3 (Shao et al., 2005). UGT73F3 was identified as a glucosyltransferase that forms an ester linkage at the C-28 carboxyl group of hederagenin by cluster analysis of transcription patterns and genetic loss-of-function analysis (Naoumkina et al., 2010).

Glycyrrhizin in licorice
Glycyrrhizin is an oleanane-type saponin present in the underground parts of licorice (Glycyrrhiza). For use as a medicinal herb, the Japanese Pharmacopoeia standard requires the root or stolon to contain 2.5% or more glycyrrhizin. The biosynthetic pathway of glycyrrhizin from β-amyrin involves hydroxylations at C-11 and C-30 and two glucuronyl transfer steps to the hydroxyl group at C-3. A bAS was identified from G. glabra (Hayashi et al., 2001). Further, we identified CYP88D6 as a β-amyrin C-11 oxidase (Seki et al., 2008). For CYP88D6 cloning, we first constructed an EST library of the underground parts (Sudo et al., 2009). On the basis of sequence similarities, we identified P450 genes and selected the candidate P450 gene expressed in glycyrrhizin-accumulating tissues. In addition, CYP93E3 was identified as a β-amyrin C-24 oxidase in the secondary metabolism of glycyrrhizin. Further investigation is under way in our group to identify other candidate genes, including another P450 responsible for C-30 oxygenation (Seki et al., submitted) and UGTs involved in the biosynthetic pathway of glycyrrhizin.

Avenacins in Avena strigosa
Avena spp. (oats) produce antimicrobial oleanane-type saponins called avenacins. Osbourn and coworkers generated saponin-deficient (sad) mutants of A. strigosa; cloned Sad1, encoding bAS, Sad2, encoding CYP51H10 β-amyrin oxidase, and Sad7, encoding a serine carboxypeptidase-like acyltransferase; and investigated sad3 and sad4 mutants accumulating a monodeglucosyl avenacin (Papadopoulou et al.; Haralampidis et al., 2001; Qi et al., 2006; Mylona et al., 2008; Mugford et al., 2009). Genetic analysis showed that five Sad loci, Sad2, Sad3, Sad6, Sad7, and Sad8, lie within 3.6 cM of the Sad1 locus, and that three Sad genes in particular, Sad1, Sad3, and Sad7, are clearly clustered in the genome (Qi et al., 2004). A cluster of such genes was also found in the A. thaliana genome for thalianol metabolism (Field and Osbourn, 2008). These clusters are of interest with respect to the evolutionary process of triterpenoid biosynthesis in plants (Osbourn, 2010).

Vaccarosides in Saponaria vaccaria
The seeds of S. vaccaria, used in traditional Chinese medicine, accumulate oleanane-type saponins called vaccarosides. The aglycone of vaccaroside B is gypsogenic acid, a C-23 and C-28 carboxylated β-amyrin. A bAS cDNA was cloned by homology-based PCR (Meesapyodsuk et al., 2007). UGT74M1, which has sequence similarity to other plant ester-forming glucosyltransferases, was cloned from the developing seed EST library and identified as a UDP-glucosyltransferase that glucosylates C-28 of gypsogenic acid in an ester linkage (Meesapyodsuk et al., 2007).

Ginsenosides in Panax ginseng
Panax ginseng is a famous medicinal plant in Asia. The main pharmacologically active compounds in ginseng are saponins called ginsenosides (Shibata, 2001). Major ginsenosides have a dammarane skeleton constructed by an OSC, dammarenediol-II synthase (PNA). Ginsenoside R0, a minor ginsenoside, is the only one derived from β-amyrin. Ebizuka and coworkers identified CAS, two bAS, LAS, and PNA cDNAs from hairy root cultures of P. ginseng in the ginsenoside-accumulating period (Kushiro et al., 1998a,b; Suzuki et al., 2006; Tansakul et al., 2006). Han et al. (2006) also identified PNA from flowers, which accumulate dammarane-type saponins. The genes encoding the P450 species and UGTs committed to ginsenoside biosynthesis should be identifiable from such resources in the near future.

Biotechnological production of useful plant terpenoids
Although almost all production of glycyrrhizin depends on the collection of wild licorice, its harvesting is restricted to prevent exhaustion and desertification in the main producing country, China. Similarly, ginseng requires 4-5 years of careful cultivation and prevention of injury by continuous cropping. Such problems occur not only for saponins but also for other natural plant products. To ensure their stable supply, environment-friendly and lower-cost alternatives such as biotechnological production are necessary. In the following, we describe recent advances in the biotechnological production of useful terpenoids.

Artemisinin
Artemisinin, a sesquiterpenoid originally sourced from Artemisia annua, is used in combination therapy for malaria. The cost of therapy is too high for people in low-income countries where malaria is prevalent, and total synthesis of artemisinin (Schmid and Hofheinz, 1983) is difficult to achieve at low cost. Semi-synthesis of artemisinin from artemisinic acid (Roth and Acton, 1989) derived by fermentation could be an alternative lower-cost supply method (White, 2008). Ro et al. (2006) genetically modified yeast to increase the productivity of a sesquiterpenoid biosynthesis starter, farnesyl pyrophosphate (FPP), and expressed cDNAs encoding an amorphadiene synthase (ADS), CYP71AV1, and a cytochrome P450 reductase (CPR) from A. annua in the yeast. The recombinant yeast produced a large amount of artemisinic acid (115 mg l⁻¹). Subsequent improvement in the fermentation process increased the artemisinic acid titer to 2.5 g l⁻¹ (Lenihan et al., 2008). On the other hand, an engineered Escherichia coli carrying an integrated yeast MVA pathway to supply a large amount of FPP (Martin et al., 2003), a CYP71AV1 modified at the N-terminus, and CPR produced 105 mg l⁻¹ of artemisinic acid (Chang et al., 2007). Further improvement in the MVA pathway achieved an amorphadiene titer of 27.4 g l⁻¹ (Tsuruta et al., 2009).

Taxol (paclitaxel)
Taxol (paclitaxel) is a diterpenoid used against numerous cancers. Originally, Taxol was isolated from the bark of the Pacific yew (Taxus brevifolia) at low content (Wani et al., 1971). The complex structure of this drug limits its commercial chemical synthesis (Holton et al., 1994a,b; Nicolaou et al., 1994). Therefore, semi-synthesis from more accessible biosynthetic intermediates such as 10-deacetylbaccatin III and production in Taxus plant cell cultures have been developed as alternative supply methods (Kingston, 2007), which still depend on plant sources. For further improvement in productivity and reduction in therapeutic cost, biosynthetic production has been attempted. Introduction of several biosynthetic enzyme genes for Taxol into yeast resulted in the production of only trace amounts of the first hydroxylated intermediate, taxadien-5α-ol (Dejong et al., 2006).
On the other hand, Ajikumar et al. (2010) presented an optimization approach termed multivariate modular pathway engineering for taxadiene production in E. coli. They divided the taxadiene-producing pathway into two modules at the isopentenyl pyrophosphate step and searched for the optimal balance of the expression strength of each module for taxadiene production. The optimization enabled over 1 g l⁻¹ of taxadiene production in E. coli, and subsequent expression of a chimeric protein from CYP725A4, a taxadien-5α-hydroxylase, and Taxus CPR resulted in 58 mg l⁻¹ of taxadien-5α-ol production. The authors indicated that re-optimization including the chimeric protein would improve taxadien-5α-ol productivity (Ajikumar et al., 2010).

Carotenoids
Carotenoids are well-known tetraterpenoid pigments in plants and microorganisms. They are used not only as natural colorants in food and feed but also in nutraceutical, cosmetic, and pharmaceutical products because of their antioxidant properties. In the human body, vitamin A is produced from some carotenoids, collectively called provitamin A. Deficiency of vitamin A causes blindness and mortality due to weakening of the immune system in children of the developing world. Therefore, carotenoids have received much attention as metabolic engineering targets (reviewed in Das et al., 2007; Misawa, 2010). The engineered host organisms range from microorganisms to plants. Here we give two examples of hosts: a plant (rice) and a microorganism (E. coli). "Golden rice" is one of the most successful metabolically engineered plants (Ye et al., 2000). To increase provitamin A intake from rice, a provitamin A biosynthetic pathway was constructed in the endosperm of rice. Although Golden rice with psy encoding phytoene synthase from daffodil and crtI encoding phytoene desaturase from the bacterium Erwinia uredovora produced provitamin A, the content was not enough to meet the recommended daily allowance for children, even in regions where rice is a staple food. Therefore, Golden rice 2 was developed (Paine et al., 2005). The replacement of psy from daffodil by psy from maize improved the content adequately (>30 μg g⁻¹).
Carotenoids are also popular targets for production in metabolically engineered microbes. Alper et al. (2005) combined systematic and combinatorial methods for identifying gene knockout targets in E. coli to increase the productivity of lycopene, a well-known carotenoid pigment in tomato. First, strain genotypes were systematically designed with gene knockouts reported to improve productivity. These strains then underwent deletions of unknown genes by a combinatorial transposon-based search. After the combination, the strain produced a high amount of lycopene (23 mg g⁻¹ dry cell weight). Recently, an engineered E. coli with the MVA pathway genes from Enterococcus faecalis and Streptococcus pneumoniae and β-carotene biosynthesis genes produced 465 mg l⁻¹ of β-carotene (Yoon et al., 2009). Furthermore, improvement in the culture conditions of the recombinant E. coli increased the β-carotene titer to 663 mg l⁻¹.

Perspective on the bioengineering of plant triterpenoids
Notably, almost all plant OSCs, which catalyze the first diversifying step in triterpenoid biosynthesis, have been heterologously expressed in yeast for functional identification (reviewed in Kushiro and Ebizuka, 2010), because OSCs are membrane-bound proteins and require the developed intracellular membrane systems of eukaryotes for their heterologous expression. Furthermore, yeast has a sterol biosynthetic pathway for producing membrane constituents, and this pathway could be converted to a useful triterpenoid-producing pathway.
Previously, we developed a recombinant yeast system for the mechanistic investigation of P450 species in glycyrrhizin biosynthesis (Seki et al., 2008). To supply β-amyrin endogenously as the CYP88D6 substrate, a bAS was constitutively expressed. After the accumulation of β-amyrin in the yeast, CYP88D6 was co-expressed with a CPR as the redox partner. The final yields of 11-oxo-β-amyrin and 11α-hydroxy-β-amyrin at 2 days of culture after induction were approximately 1.6 and 0.2 mg l⁻¹, respectively. Although we employed Lotus bAS and CPR at that time, G. uralensis bAS and CPR would be more suitable to improve the production. Genetic engineering of yeast sterol biosynthesis to enhance the availability of β-amyrin may also improve the production of 11-oxo-β-amyrin. Kirby et al. (2008) isolated a bAS from A. annua and achieved 6 mg l⁻¹ of β-amyrin production in yeast by expressing an N-terminal-truncated HMGR and restricting the expression of a native OSC, LAS. The large amount of squalene accumulated in the yeast indicates that the yeast could produce even more β-amyrin. In addition to the efforts to improve the common isoprenoid pathway, as already described, enhancing the catalytic properties of enzymes in triterpenoid biosynthesis should be effective. In Solanaceae, CYP71Ds are involved in sesquiterpenoid phytoalexin biosynthesis. Protein engineering of CYP71Ds based on sequence alignment analysis with phylogenetically related P450 species and homology modeling successfully enhanced the catalytic efficiencies of the enzymes (Takahashi et al., 2007). In Glycyrrhiza, not only glycyrrhizin-producing species but also non-producing species can produce other oleanane-type triterpenoid saponins (Hayashi et al., 2000). The saponin diversity in Glycyrrhiza spp. could be derived from variation in homologous P450 species and UGTs in saponin biosynthesis. Evaluating the differences in these enzymes would be useful for improving their activities.
Future challenges for the bioengineering of plant triterpenoids include elucidating the regulatory mechanisms of biosynthetic gene expression and the accumulation mechanisms of triterpenoids. Saponins frequently accumulate in specific tissues and organs. Glycyrrhizin and ginsenosides accumulate in the root xylem of licorice and ginseng, respectively (Shan et al., 2001; Fukuda et al., 2006). Genes for saponin biosynthesis are also expressed in specific tissues and organs. In Avena, Sad genes are expressed in the root epidermal cells that accumulate avenacin A-1 (Haralampidis et al., 2001; Qi et al., 2006; Mylona et al., 2008; Mugford et al., 2009). In addition, metabolomic and transcriptomic analyses showed good correlations between the expression of the biosynthetic genes and saponin accumulation. In fact, the recent successes in identifying saponin biosynthetic genes are based on such correlation analyses, as described above. These observations indicate that saponin production is most likely regulated at the transcriptional level, implying the existence of specific transcription factor(s) for saponin biosynthesis. The engineering of transcription factors is a promising way to modify the biosynthetic pathway, in addition to the introduction of multiple biosynthetic enzyme genes (Borevitz et al., 2000; Hirai et al., 2007; Gonzalez et al., 2008). Novel structural genes involved in the pathway can be discovered by analyzing lines that overexpress a transcription factor (Tohge et al., 2005; Luo et al., 2007). Furthermore, the control exerted by the transcription factor would be a useful reference for improving productivity by optimizing the expression levels of multiple pathway genes introduced into a heterologous host.
At the subcellular level, saponins accumulate in vacuoles (Kesselmeier and Urban, 1983; Mylona et al., 2008). However, OSCs, P450 species, and some UGTs for saponin biosynthesis are known to be microsomal enzymes (Hayashi et al., 1996; Kurosawa et al., 2002). These facts suggest the presence of a vacuolar saponin transporter. Such a transporter is also an engineering target for improving the accumulation of the target compounds. So far, as a plant terpenoid transporter, an ATP-binding cassette transporter, NpPDR1, involved in the secretion of an antifungal diterpenoid, sclareol, has been reported in tobacco (Jasiński et al., 2001). A homolog of human sterol carrier protein-2 in A. thaliana was identified as a lipid transfer protein (Edqvist et al., 2004). To date, however, no triterpenoid transporter has been identified in a plant. Omics analyses are also useful strategies for narrowing down transcription factor and transporter candidates in plant secondary metabolism (Goossens et al., 2003; Hirai et al., 2007; Morita et al., 2009; Sawada et al., 2009).
Although the number of identified genes has increased in the last decade, there is still no saponin biosynthetic pathway for which all genes encoding the proteins involved in the biosynthetic steps have been identified. In fact, only CYP88D6 has been identified in the glycyrrhizin biosynthetic pathway, which requires two P450 species and two UGTs. Recent transcriptomic and metabolomic approaches have accelerated the elucidation of plant secondary metabolism (Ziegler et al., 2006; Hirai et al., 2007; Yonekura-Sakakibara et al., 2007, 2008; Liscombe et al., 2009; Okazaki et al., 2009; Matsuda et al., 2010; Saito and Matsuda, 2010), and some saponin biosynthetic genes were identified by such strategies (Achnine et al., 2005; Seki et al., 2008; Naoumkina et al., 2010). Introducing current advanced DNA sequencing technology into the omics strategies should enhance gene discovery (Sun et al., 2010). In addition to the efforts to discover the proteins for the missing biosynthetic steps, understanding the regulatory mechanisms of biosynthetic gene expression and the accumulation mechanisms of triterpenoids in plant and microbial hosts should enable further promising applications for the production of useful triterpenoids.

Acknowledgments
This work was supported by the Program for Promotion of Basic and Applied Researches for Innovations in Bio-oriented Industry (BRAIN). The authors thank Dr. Hikaru Seki (Osaka University) for his comments on the manuscript.

References
Biotechnol. 23, 612-616.
Borevitz, J. O., Xia, Y., Blount, J., Dixon, R. A., and Lamb, C. (2000). Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell 12, 2383-2394.
Burrows, J. C., Price, K. R., and Fenwick, R. G. (1987). Soyasaponin IV, an additional monodesmosidic saponin isolated from soyabean. Phytochemistry 26, 1214-1215.
Chang, M. C. Y., Eachus, R. A., Trieu, W., Ro, D. K., and Keasling, J. D. (2007). Engineering Escherichia coli for production of functionalized terpenoids using plant P450s. Nat. Chem. Biol. 3, 274-277.
Chappell, J., Wolf, F., Proulx, J., Cuellar, R., and Saunders, C. (1995). Is the
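The candidate-gene strategy mentioned in this article, selecting genes whose expression correlates with saponin accumulation across tissues, can be illustrated with a minimal Python sketch. It is not the procedure of any cited study: the Pearson correlation is swapped in as a simple stand-in for the integrated transcriptome and metabolome analyses, the function names `pearson` and `rank_candidates` are hypothetical, and all gene names and numerical values are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_candidates(metabolite_levels, expression_profiles):
    """Rank genes by correlation of their expression with metabolite levels."""
    scores = {
        gene: pearson(profile, metabolite_levels)
        for gene, profile in expression_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data for four tissues (e.g. root, stolon, stem, leaf).
saponin_levels = [9.0, 7.5, 0.5, 0.2]
expression_profiles = {
    "candidate_P450_a": [8.0, 7.0, 1.0, 0.5],  # tracks saponin accumulation
    "candidate_P450_b": [1.0, 1.2, 6.0, 7.0],  # expressed where saponin is low
    "candidate_UGT_c": [3.0, 3.1, 2.9, 3.0],   # nearly uniform expression
}

ranked = rank_candidates(saponin_levels, expression_profiles)

if __name__ == "__main__":
    for gene, r in ranked:
        print(f"{gene}: r = {r:.2f}")
```

In practice such a ranking only narrows the candidate list; the function of each candidate still has to be confirmed experimentally, for example by heterologous expression in yeast.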