Detailed Phytochemical Analysis of High- and Low Artemisinin-Producing Chemotypes of Artemisia annua

Chemical derivatives of artemisinin, a sesquiterpene lactone produced by Artemisia annua, are the active ingredient in the most effective treatment for malaria. Comprehensive phytochemical analysis of two contrasting chemotypes of A. annua resulted in the characterization of over 80 natural products by NMR, more than 20 of which are novel and described here for the first time. Analysis of high- and low-artemisinin producing (HAP and LAP) chemotypes of A. annua confirmed the latter to have a low level of DBR2 (artemisinic aldehyde Δ11(13) reductase) gene expression. Here we show that the LAP chemotype accumulates high levels of artemisinic acid, arteannuin B, epi-deoxyarteannuin B and other amorpha-4,11-diene derived sesquiterpenes which are unsaturated at the 11,13-position. By contrast, the HAP chemotype is rich in sesquiterpenes saturated at the 11,13-position (dihydroartemisinic acid, artemisinin and dihydro-epi-deoxyarteannuin B), which is consistent with higher expression levels of DBR2, and also with the presence of a HAP-chemotype version of CYP71AV1 (amorpha-4,11-diene C-12 oxidase). Our results indicate that the conversion steps from artemisinic acid to arteannuin B, epi-deoxyarteannuin B and artemisitene in the LAP chemotype are non-enzymatic and parallel the non-enzymatic conversion of DHAA to artemisinin and dihydro-epi-deoxyarteannuin B in the HAP chemotype. Interestingly, artemisinic acid in the LAP chemotype preferentially converts to arteannuin B rather than to the endoperoxide bridge-containing artemisitene. In contrast, in the HAP chemotype, DHAA preferentially converts to artemisinin. Broader metabolomic and transcriptomic profiling revealed significantly different terpenoid profiles and related terpenoid gene expression in these two morphologically distinct chemotypes.

Based on the content of artemisinin and its precursors, two contrasting chemotypes of A. annua have been described: a low-artemisinin production (LAP) chemotype and a high-artemisinin production (HAP) chemotype (Wallaart et al., 2000). Both chemotypes contain artemisinin, but the HAP chemotype has a relatively high content of DHAA and artemisinin, whereas the LAP chemotype has a high content of AA and ArtB (Lommen et al., 2006; Arsenault et al., 2010; Larson et al., 2013). Recent studies have concluded that a major factor in determining the biochemical phenotype of HAPs and LAPs is the differential expression of DBR2, with low expression in LAP chemotypes correlating with a number of insertions/deletions in the DBR2 promoter sequence (Yang et al., 2015). We have recently shown that the overall pathway to artemisinin biosynthesis is under strict developmental control, with early steps in the pathway occurring in young leaves and later steps in mature leaves (Czechowski et al., 2016). In the present study, we have used both metabolomics and transcriptomics to investigate the developmental regulation of sesquiterpene biosynthesis in HAP and LAP chemotypes. Using a combination of NMR and UPLC-/GC-MS techniques, we have characterized a number of amorphane and cadinane sesquiterpenes in addition to other terpenes isolated from leaf glandular trichomes. We have also extended the transcript analysis in HAPs and LAPs beyond the genes encoding artemisinin-pathway enzymes. Our findings suggest profound differences in general terpenoid metabolism between HAP and LAP chemotypes that extend well beyond altered DBR2 expression and artemisinin content.

Plant Material
Artemis is an F1 hybrid variety of A. annua developed by Mediplant (Conthey, Switzerland), produced by crossing C4 and C1 parental material of East Asian origin (Delabays et al., 2001). Artemisinin content has been reported to reach 1.4% of the leaf dry weight when grown in the field, and its metabolite profile is typical for the HAP chemotype (Larson et al., 2013). NCV ("non-commercial variety"), an "open-pollinated" variety of European origin was also provided by Mediplant, and has the lowest reported artemisinin content from any A. annua germplasm in addition to a metabolite profile characteristic of the LAP chemotype (Larson et al., 2013). Plants were grown from seeds in glasshouse conditions as previously described (Graham et al., 2010).

Leaf Area Measurements
The leaf area of glasshouse-grown plants was measured by scanning for leaves 14-16 (counting from the apical meristem), followed by calculation of the leaf area using LAMINA software (Bylesjö et al., 2008).

Trichome Density Measurements
Trichome density was quantified on the abaxial surface of the terminal leaflets of leaves 14-16 (counting from the apical meristem). Trichomes were visualized using a Zeiss fluorescent dissecting microscope (fitted with a 470/40 nm excitation filter and a 525/50 nm emission filter). Images were recorded using AxioVision 4.7 software (Carl Zeiss Ltd., Herts., UK). Trichome number was counted manually across a 3 × 0.5 mm² leaflet sample area and the average (mean) trichome density was then calculated for the whole leaf.
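The density calculation described above can be sketched as follows. This is a minimal illustration, assuming that the sampled region totals 3 × 0.5 mm² = 1.5 mm² per leaflet and that the leaf value is the mean of the per-sample densities; the function names and the example counts are hypothetical and are not measured data.

```python
from statistics import mean

# Assumption: the sampled region totals 3 x 0.5 mm^2 = 1.5 mm^2 per leaflet.
SAMPLE_AREA_MM2 = 3 * 0.5


def trichome_density(count, area_mm2=SAMPLE_AREA_MM2):
    """Return trichomes per mm^2 for one manually counted sample area."""
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    return count / area_mm2


def mean_leaf_density(counts, area_mm2=SAMPLE_AREA_MM2):
    """Average the per-sample densities to give one value per leaf."""
    return mean(trichome_density(c, area_mm2) for c in counts)


# Hypothetical counts, for illustration only (not measured data).
example_counts = [30, 36, 42]
print(mean_leaf_density(example_counts))
```

Expressing counts per unit area in this way is what allows trichome density to be compared between varieties whose leaves differ in size.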

NMR Structural Data for Natural Compounds From Artemis and NCV
Leaf and stem material from Artemis (5 kg) was extracted in CHCl3 (20 L). The organic solvent was removed by rotary evaporation and a portion of the residual dark green aromatic plant extract (ca. 2.5% w/w) was "dry-loaded" onto a silica column for gradient column chromatography (see Table section Gradient Column Chromatography of the Artemis Variety of A. annua). Each of the fractions A-Y from gradient column chromatography of Artemis was then further purified by isocratic preparative normal-phase HPLC (*fractions B, D, I, O, and T were also subjected to a second round of isocratic column chromatography prior to prep. HPLC); individual metabolites were then characterized by NMR, as listed in Figure 1A and Supplemental Table 1 (1D- and 2D-NMR data for all metabolites are also given in Supplementary List 1). Selected fractions were analyzed by UPLC-APCI-high resolution MS to verify molecular weights and chemical formulae. Confirmed annotations were used to update m/z and retention time reference data, to enable reporting of these compounds from plant extracts by UPLC-MS.
Leaf and stem material from the NCV variety of A. annua (780 g) was extracted in CHCl3 (4 L). The organic solvent was then removed by rotary evaporation and the residual dark green aromatic plant extract (16.6 g; ca. 2% w/w) was dry-loaded onto a silica column for gradient column chromatography (see Table section). Each of the fractions A-N from gradient column chromatography of NCV was then further purified by isocratic preparative normal-phase HPLC; individual metabolites were then characterized by NMR, as listed in Figure 1B and Supplemental Table 1 (1D- and 2D-NMR data for all metabolites are also given in Supplementary List 2). Selected fractions were analyzed by UPLC-APCI-high resolution MS to verify molecular weights and chemical formulae. Confirmed annotations were used to update m/z and retention time reference data, to enable reporting of these compounds from plant extracts by UPLC-MS.

Metabolite Analysis by UPLC-MS and GC-MS
Metabolite analyses by UPLC- and GC-MS were performed as described previously (Czechowski et al., 2016). Fifteen plants from each of five genotype classes were grown from seeds in 4-inch pots under 16 h days for 12 weeks. Metabolite profiles were generated from 50 mg fresh weight (FW) pooled samples of leaves collected at two different developmental stages: leaves 1-5 (counted from the apical meristem), representing the juvenile stage; and leaves 11-13, representing the mature, expanded stage (Figure 3A). Fresh leaf samples were stored at −80 °C, pending analysis. In addition, dry leaf material was obtained from 14-week-old plants, cut just above the zone of senescing leaves, and dried for 14 days at 40 °C. Leaves were stripped from the plants, and leaf material was sieved through 5 mm mesh to remove small stems. Trichome-specific metabolites were extracted as described previously (Czechowski et al., 2016) with minor modifications. Briefly, 50 mg of fresh material was extracted by gentle shaking in 500 µL chloroform for 1 h. The supernatant was removed and the remaining plant material was fully dried in a centrifugal evaporator (GeneVac® EZ-2 plus, Genevac Ltd, Ipswich, UK). The weight of the extracted and dried material was recorded and used to quantify the abundance of specific compounds per unit of extracted dry weight. Dry leaf material (0.5 g) was ground to a fine powder using a TissueLyser II ball mill fitted with stainless steel grinding jars (Qiagen, Crawley, UK) operated at 25 Hz for 1 min. Ten mg sub-samples of dry leaf material were extracted in 9:1 (v/v) chloroform:ethanol with gentle shaking for 1 h and then analyzed as per fresh material.
For UPLC-MS analysis of sesquiterpenes, a diluted (1:5 (v/v) extract:ethanol) 2 µL aliquot was injected on an Acquity UPLC system (Waters, Elstree, UK) fitted with a Luna 50 × 2 mm 2.5 µm HST column (Phenomenex, Macclesfield, UK). Metabolites were eluted at 0.6 mL/min and 60 °C using a linear gradient from 60 to 100% A:B over 2.5 min, where A = 5% (v/v) aqueous MeOH and B = MeOH, with both A and B containing 0.1% (v/v) formic acid. Pseudomolecular [M+H]+ ions were detected using a Thermo Fisher LTQ-Orbitrap (ThermoFisher, Hemel Hempstead, UK) mass spectrometer fitted with an atmospheric pressure chemical ionization source operating in positive ionization mode under the control of Xcalibur 2.1 software. Data were acquired over the m/z range 100-1,000 in FTMS centroid mode with resolution set to 7,500 FWHM at m/z 400. Data extraction and analysis were performed using packages and custom scripts in R 3.2.2 (https://www.R-project.org/). XCMS (Smith et al., 2006) incorporating the centWave algorithm (Tautenhahn et al., 2008) was used for untargeted peak extraction. Deisotoping, fragment and adduct removal were performed using CAMERA (Kuhl et al., 2012). Artemisinin was quantified using the standard curve of the response ratio of artemisinin (Sigma, Poole, UK) to internal standard (β-artemether; Hallochem Pharmaceutical, Hong Kong) that was previously added to extracts and standards. Metabolites were identified with reference to authentic standards or NMR-resolved structures, and empirical mass formulae were calculated using the R package rcdk (Guha, 2007) within 10 ppm error and elemental constraints of: C = 1-100, H = 1-200, O = 0-20, N = 0-1. Peak concentrations were calculated using bracketed response curves, where standard curves were run every ∼30 samples. Metabolite concentrations were expressed as a proportion of the residual dry leaf material following extraction.
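The formula-assignment step (matching a measured [M+H]+ ion to an empirical formula within 10 ppm under the stated elemental constraints) can be illustrated with the short Python sketch below. It is not the rcdk implementation used in this study; the function names are hypothetical, and the only inputs are standard monoisotopic masses and the C/H/N/O limits given above.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass added on forming [M+H]+


def formula_mass(c, h, n, o):
    """Monoisotopic mass of the neutral formula CcHhNnOo."""
    return c * MASS["C"] + h * MASS["H"] + n * MASS["N"] + o * MASS["O"]


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def candidate_formulae(mz, tol_ppm=10.0):
    """List (formula, ppm) pairs whose [M+H]+ lies within tol_ppm of mz.

    Constraints follow the text: C = 1-100, H = 1-200, O = 0-20, N = 0-1.
    """
    hits = []
    for c in range(1, 101):
        for o in range(0, 21):
            for n in range(0, 2):
                # Solve for the nearest hydrogen count given the heavier atoms.
                rest = mz - PROTON - formula_mass(c, 0, n, o)
                h = round(rest / MASS["H"])
                if not 1 <= h <= 200:
                    continue
                theo = formula_mass(c, h, n, o) + PROTON
                err = ppm_error(mz, theo)
                if abs(err) <= tol_ppm:
                    name = f"C{c}H{h}" + (f"N{n}" if n else "") + (f"O{o}" if o else "")
                    hits.append((name, err))
    # Best (smallest absolute error) match first.
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Example: the theoretical [M+H]+ of artemisinin (C15H22O5).
mz_artemisinin = formula_mass(15, 22, 0, 5) + PROTON
hits = candidate_formulae(mz_artemisinin)
print(hits[0][0])
```

More than one formula can fall inside a 10 ppm window when only elemental ranges are constrained, which is one reason to confirm assignments against authentic standards or NMR-resolved structures, as described above.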
FIGURE 2 | Ten pairs of 11,13-dihydro/11,13-dehydro amorphanolides from the Artemis (left-hand side) and NCV (right-hand side) varieties of A. annua, characterized by the NMR approach. Numbering of compounds is consistent with Supplementary Lists 1, 2. Numbering of carbon atoms is shown. Novel compounds are indicated by an asterisk.
All the above results are consistent with a higher DBR2 activity in the HAP chemotype compared to the LAP chemotype (Yang et al., 2015). The relative abundances for 8 of these 10 "pairs" are also well matched between the Artemis and NCV varieties, suggesting a "shared" further metabolism for DHAA in Artemis and AA in NCV. The first exception is arteannuin B (ArtB, 60), which is abundant in NCV, whilst its analog, dihydroarteannuin B (14), is relatively low in Artemis (Supplemental Table 1). The second is artemisitene, the 11,13-dehydro analog of artemisinin (Acton and Klayman, 1985; Woerdenbag et al., 1994; Figure 1; Supplemental Table 1), which is a minor compound in NCV, while its "partner" artemisinin is the most abundant metabolite in Artemis (Supplemental Table 1). These observations suggest that while there are many parallels in the pathways that further transform DHAA (8) and AA (9) in the HAP and LAP chemotypes, there are also some significant differences.

Metabolomic and Gene Expression Studies Reveal Multiple Differences Between HAP and LAP Chemotypes
Using a leaf maturation time-series, we recently demonstrated that artemisinin levels increase gradually from juvenile to mature leaves and remain stable during the post-harvest drying process in Artemis HAP chemotype plants (Czechowski et al., 2016). Using a similar time-series (which included fresh leaves 1-5 (juvenile) and 11-13 (mature), counting from the apical meristem, plus oven-dried whole plant-stripped leaves (dry) from 12-week-old glasshouse-grown plants), we have now performed UPLC- and GC-MS based metabolite profiling of extracts from both HAP (Artemis) and LAP (NCV) chemotypes. We found that the pathway entry-point metabolite, amorpha-4,11-diene (A-4,11-D), is only detectable in juvenile leaves, and at approximately 2-fold higher concentration in Artemis as compared to NCV (Figure 3Ai; Supplemental Table 3). A much greater difference was seen for the enzymatically-produced artemisinin precursor, dihydroartemisinic acid (DHAA), which was present at a 24-fold higher concentration in juvenile Artemis leaves compared to NCV (Figure 3Aii; Supplemental Table 2). Artemisinic acid (AA), on the other hand, accumulated in NCV leaves at a 10-fold higher concentration than in Artemis (Figure 3Aiii; Supplemental Table 1). Interestingly, the levels of AA in the young leaves of the NCV variety are approximately twice the levels of DHAA in young leaves of Artemis (Figures 3Aii,iii; Supplemental Table 2). The levels of both DHAA and AA dropped sharply beyond the juvenile leaf stage in Artemis and NCV, respectively (Figures 3Aii,iii; Supplemental Table 2).
These changes in metabolite levels that occur during leaf maturation are mirrored by changes in steady state mRNA levels of genes encoding the enzymes involved in their biosynthesis, including: amorpha-4,11-diene synthase (AMS), amorpha-4,11-diene C-12 oxidase (CYP71AV1), artemisinic aldehyde Δ11(13) reductase (DBR2) and aldehyde dehydrogenase (ALDH1), which are expressed at levels two to three orders of magnitude higher in juvenile than in mature leaves (Figures 3Bi-iv).
Previous work has suggested that in vivo conversions beyond DHAA (8) (Czechowski et al., 2016) and in vitro conversions beyond AA (9) (Brown and Sy, 2007) are non-enzymatic. Consistent with this, we have found that mature leaves of NCV contain high levels of epi-deoxyarteannuin B (EDB, 13) and arteannuin B (ArtB, 60) (Figures 3Av,vii; Supplemental Table 2), while Artemis accumulates dihydro-epi-deoxyarteannuin B (DHEDB, 12) and artemisinin (22) (Figures 3Aiv,vi; Supplemental Table 2) at 20-30-fold higher levels than NCV. Both artemisinin (22) and arteannuin B (60) continue to accumulate in the post-harvest drying process in Artemis and NCV, respectively (Figures 3Avi,vii). Post-harvest accumulation of artemisinin has been reported before (Ferreira and Luthria, 2010) and it might be related to light-dependent conversion of DHAA. However, slightly different batch-specific environmental effects during drying might explain the difference between the artemisinin accumulation pattern shown in Figure 3Avi and that which was previously reported for the Artemis variety (Czechowski et al., 2016). Interestingly, the developmental pattern of DHEDB (12) accumulation in Artemis leaves is different to that of its 11,13-dehydro analog, EDB (13), in NCV leaves. DHEDB (12) follows the same accumulation pattern as artemisinin (22) in Artemis (Figures 3Aiv,vi), whereas EDB (13) is found predominantly in juvenile leaves of the NCV variety (Figure 3Av). We have found that production of the artemisinin 11,13-dehydro analog, artemisitene (66), in NCV parallels the accumulation of artemisinin (22) in Artemis (Supplemental Table 2), albeit at very much reduced levels. The levels of deoxyartemisinin (23), another product of non-enzymatic conversion of DHAA through the DHAA allylic hydroperoxide, increase during dry leaf storage, accumulating to 0.1% leaf dry weight (Supplemental Table 2), which is consistent with previous findings (Czechowski et al., 2016).
This process is paralleled by accumulation of deoxyartemisitene (67) (the 11,13-dehydro analog of deoxyartemisinin) in the NCV variety (Supplemental Table 2).
RT-qPCR analysis confirmed the expression level for DBR2 to be significantly repressed (8-fold lower) in the juvenile leaves of NCV compared to Artemis, which is consistent with previous findings (Yang et al., 2015). Interestingly, DBR2 transcript abundance had decreased to the same levels in mature leaves of both chemotypes (Figure 3Biii), highlighting the importance of developmental timing in regulating flux and partitioning of sesquiterpene metabolites. More surprisingly, ALDH1 expression is increased in juvenile leaves (2.4-fold) and further increased in mature leaves (40-fold) of NCV (Figure 3Biv) compared to Artemis. Thus it would appear that in addition to DBR2 being down-regulated in the NCV (LAP) chemotype, ALDH1 is upregulated at the transcriptional level. This could also account for the increase in flux into artemisinic acid and the arteannuin B branch of sesquiterpene metabolism. The major differences in metabolite levels and gene expression between Artemis and NCV varieties for the artemisinin biosynthetic pathway are summarized in Figure 3C.
NMR analysis revealed that metabolite differences between Artemis and NCV are not restricted to artemisinin-related sesquiterpenes. Monoterpenes also vary between the two chemotypes, with, for example, camphor being most abundant in Artemis while artemisia ketone is much more abundant in NCV (Supplemental Table 1). Unfortunately, NMR analysis could only provide approximate information about the relative abundance of the metabolites; therefore, the metabolite content of both chemotypes was also studied by GC- and UPLC-MS (Supplemental Tables 2, 3). We were able to detect 75 unique compounds in three leaf types by UPLC-MS, of which annotations were assigned to 30 compounds based on NMR-verified standards as described in the Materials and Methods. The majority of the known compounds were sesquiterpenes and flavonoids. GC-MS detected 202 unique compounds in juvenile and mature leaves, of which 33 had assigned annotations. The majority of known GC-MS-detected compounds were mono- and sesquiterpenes. Principal component analysis shows that the overall metabolite profile of NCV is strikingly different to that of Artemis; the difference between chemotypes is as large as the difference between the profiles of juvenile leaves and those of mature and/or dry leaves. In fact, UPLC- and GC-MS PCA plots show four distinct clusters (Figures 4A and B). Developmental differences are most apparent in juvenile leaf tissue, which shows the highest abundance of most of the terpenes described below (Figure 4, Supplemental Tables 2 and 3). Our findings that the metabolite profiles in Artemis and NCV young leaf tissues are considerably different to those of mature and dry leaves in both varieties are consistent with our previous findings (Czechowski et al., 2016).
There are a number of compounds specifically produced by NCV, mostly in low quantities (Supplemental Tables 2 and 3), which have known medicinal use including, for example, isofraxidin (39), which is five-fold more abundant in the juvenile leaves of NCV as compared to Artemis (Supplemental Table 2). Isofraxidin is a coumarin with anti-inflammatory (Niu et al., 2012) and anti-tumor activities (Yamazaki and Tokiwa, 2010). Artemisia ketone (42), an irregular monoterpene found in the essential oil from various A. annua varieties and displaying antifungal activities (Santomauro et al., 2016), is the most abundant volatile in the juvenile and mature leaves of NCV, but virtually absent in Artemis (Supplemental Table 3). The juvenile and mature leaves of Artemis accumulate velleral, a sesquiterpene dialdehyde with proposed antibacterial activities (Anke and Sterner, 1991), which is virtually absent in the NCV variety (Supplemental Table 3). GC-MS analysis further revealed that several major monoterpenes are also more abundant in juvenile and mature leaves of Artemis, including camphor (3.7-fold higher), camphene (3.4-fold higher), borneol (16-fold higher), α-pinene (4.6-fold higher) and 1,8-cineole (8-fold higher) (Supplemental Table 3). Some minor monoterpenes detected in the Artemis variety, such as α-myrcene, α-terpinene, chrysanthenone and α-copaene, are virtually absent in young and mature NCV leaves (Supplemental Table 3). A few striking differences were noted for the level of artemisinin-unrelated abundant sesquiterpenes, such as sabinene and cis-sabinene hydrate, which are 7.5- and 38-fold (respectively) more abundant in Artemis young leaves than in NCV (Supplemental Table 3). Germacrene A is a sesquiterpene common across the Asteraceae family for which it has been demonstrated that its downstream metabolism parallels the artemisinic acid biosynthetic pathway (Nguyen et al., 2010).
Germacrene A levels are 32-and 17-fold higher in NCV young and mature leaves (respectively) making it the most abundant volatile in mature and the second most abundant in young leaves of the NCV variety.
Visualization of the loadings from the multivariate analyses was used to identify the most influential compounds discriminating chemotypes. PC1 loading plots identified 18 compounds from UPLC- and 20 from GC-MS analysis (Supplementary Figure 1), which were used to create the heatmaps presented in Figure 5. The vast majority of the most influential compounds distinguishing between the two chemotypes from UPLC-MS analysis were the amorphane sesquiterpenes (Figure 5A). The mono- and sesquiterpenes mentioned above (together with some unknown compounds) were the most influential GC-MS-detectable metabolites distinguishing between the two chemotypes (Figure 5B).
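How PC1 loadings rank discriminating compounds can be illustrated with a small, dependency-free Python sketch. It is not the analysis pipeline used in this study: the abundance matrix is synthetic, the sample and compound names are placeholders, and power iteration is used here simply as one convenient way to obtain the first principal component.

```python
import math


def center(rows):
    """Subtract each column (compound) mean from the data."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix between compounds."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=200):
    """Power iteration for the leading eigenvector (the PC1 loadings)."""
    # Simple start vector; it must not be orthogonal to the true PC1.
    vec = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            raise ValueError("start vector lies in the null space of the covariance matrix")
        vec = [x / norm for x in nxt]
    return vec


# Synthetic abundances (rows: samples; columns: compounds), illustration only.
compounds = ["compound_A", "compound_B", "compound_C"]
samples = ["chemotype1_rep1", "chemotype1_rep2", "chemotype2_rep1", "chemotype2_rep2"]
data = [[10.0, 1.0, 5.0], [9.0, 2.0, 5.2], [1.0, 10.0, 5.1], [2.0, 9.0, 4.9]]

centered = center(data)
loadings = first_component(covariance(centered))
# Scores: projection of each centered sample onto PC1.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
# Compounds ranked by the magnitude of their PC1 loading.
ranked = sorted(zip(compounds, loadings), key=lambda cl: abs(cl[1]), reverse=True)
for name, weight in ranked:
    print(name, round(weight, 3))
```

In this toy matrix, the two compounds that differ between the groups receive large loadings of opposite sign, while the near-constant compound receives a loading close to zero; the same logic underlies selecting the most influential compounds from a PC1 loading plot.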

Morphological Difference Between Two Chemotypes of A. annua
In addition to having very distinct phytochemical compositions the F1 Artemis HAP chemotype and the open pollinated NCV LAP chemotype varieties also have very distinct morphological features (Figure 6). Most strikingly, NCV is much taller with longer internodes but produces smaller leaves than Artemis. The density of glandular secretory trichomes, the site of artemisinin synthesis, is similar for both varieties (Figure 6E), which is consistent with the main difference in artemisinin production being due to an alteration in metabolism rather than trichome density. A. annua varieties typically require short day length for flowering (Wetzstein et al., 2014), but we observed that NCV, unlike Artemis, can also flower under long days. However, the two chemotypes do cross-pollinate and produce viable progeny.

DISCUSSION
This manuscript presents the first detailed phytochemical comparison of high- (HAP) and low-artemisinin producing (LAP) chemotypes of A. annua.
Twenty-six of the 85 metabolites that have been characterized by NMR from the HAP and LAP varieties of A. annua in this study are novel as natural products (all are mono- and sesquiterpenes). Of these, 19 are amorphane sesquiterpenes, which is the most diverse and the most abundant subclass (Supplemental Table 1, Supplementary Lists 1 and 2). The majority of these amorphane sesquiterpenes are highly oxygenated with structures that would be consistent with further oxidative metabolism of DHAA (11,13-saturated, 8) in the HAP variety and AA (11,13-unsaturated, 9) in the LAP variety (Figures 1 and 2). UPLC- and GC-MS analysis of leaf developmental series also revealed amorphanes either saturated or unsaturated at the 11,13-position in the HAP and LAP chemotypes, respectively (Figure 3, Supplemental Table 2). This observation is consistent with the expression of the DBR2 gene, which encodes the enzyme responsible for reducing the 11,13-double bond of artemisinic aldehyde (the precursor for 11,13-dihydroamorphane/cadinane sesquiterpenes), being strongly down-regulated in juvenile leaves of NCV (Figure 3Biii). These findings are in complete agreement with the recent report on reduced levels of DBR2 in LAP compared with HAP chemotypes (Yang et al., 2015). In addition to altered expression of DBR2, we also found that expression of aldehyde dehydrogenase (ALDH1), which converts artemisinic and dihydroartemisinic aldehydes to their respective acids (Teoh et al., 2009), is significantly elevated in juvenile and mature leaves of NCV compared to Artemis. This may lead to an increased flux from A-4,11-D to AA in NCV when compared with flux from A-4,11-D to DHAA in Artemis, which is reflected by a significantly higher concentration of AA found in juvenile leaves of NCV when compared to the concentration of DHAA in young Artemis leaves (Figures 3Aii,iii).
The elevated flux from A-4,11-D to AA (9) might also explain the lower levels of A-4,11-D found in juvenile leaves of NCV when compared with Artemis (Figure 3Ai), as the expression of amorpha-4,11-diene synthase (AMS) is at a very similar level in both varieties (Figure 3Bi). We have also observed that the NCV (LAP) variety expresses a sequence variant of amorpha-4,11-diene C-12 oxidase (CYP71AV1) with a 7 amino acid N-terminal extension (Supplementary Figure 2). This LAP-chemotype associated sequence variant, upon transient expression in Nicotiana benthamiana in combination with the other artemisinin pathway genes, resulted in a qualitatively different product profile ("chemotype"); that is, a shift in the ratio between the unsaturated and saturated (dihydro) branches of the pathway (Ting et al., 2013). That result strongly suggests that the two distinct isoforms of CYP71AV1 are associated with the HAP- and LAP-branches of the artemisinin pathway in Artemisia annua (Figure 3C). A number of previous reports have described the existence of LAP- and HAP-chemotypes of A. annua arising from distinct geographical locations (Lommen et al., 2006; Arsenault et al., 2010; Larson et al., 2013). It would be interesting to establish if sequence variant forms of CYP71AV1 and differential expression of DBR2 are generally found between these other LAP- and HAP-chemotypes.
Recent attempts to constitutively overexpress DBR2 in transgenic A. annua resulted in doubling of the artemisinin concentration, which was also accompanied by a significant increase in DHAA and AA production (Yuan et al., 2015). Improvements in artemisinin concentration obtained in these experiments by Yuan et al. were significantly better than those achieved by constitutive co-expression of CYP71AV1 and CPR (Shen et al., 2012), where the LAP-sequence variant of CYP71AV1 was overexpressed in transgenic A. annua. Our results suggest that glandular trichome-targeted overexpression of DBR2, specifically in a background carrying the HAP-type variant of CYP71AV1, might be the more efficient route to improving artemisinin production in transgenic A. annua.
Although arteannuin B (ArtB) was almost entirely absent from young leaf tissue of the NCV variety, as leaves matured it accumulated to become the most abundant natural product (Figure 3Avii). This observation seemed to parallel both the accumulation of artemisinin in the mature tissues of Artemis that has been noted above (Figure 3Avi), as well as the recently described accumulation of arteannuin X in the mature leaves of the cyp71av1-1 mutant of A. annua (Czechowski et al., 2016). The accumulation of both artemisinin and arteannuin X is considered to be the result of non-enzymatic processes, in which the 4,5-double bond of a precursor sesquiterpene undergoes spontaneous autoxidation with molecular oxygen to produce a tertiary allylic hydroperoxide. The metabolic fate of this hydroperoxide is critically dependent on the identity of the precursor, and in particular on the functionality contained elsewhere in the molecule. Thus, in the case of Artemis, the precursor is DHAA, which presents a 12-carboxylic acid group (as well as saturation at the 11,13-position); whilst for the cyp71av1-1 mutant it is amorpha-4,11-diene (A-4,11-D), which presents an 11,13-double bond (Czechowski et al., 2016). Both in vivo and in vitro experiments indicate that this difference in functionality is the basis of why DHAA-OOH (the tertiary allylic hydroperoxide from DHAA) is converted to artemisinin, whereas A-4,11-D-OOH is converted to arteannuin X (Czechowski et al., 2016).
We therefore hypothesized that the conversion of artemisinic acid (AA) to arteannuin B (ArtB) in NCV may also be a non-enzymatic process, paralleling the conversion of DHAA into artemisinin in Artemis (Supplementary Figures 3A and B) and of amorpha-4,11-diene to arteannuin X in the cyp71av1-1 mutant (Czechowski et al., 2016). The tertiary allylic hydroperoxide from artemisinic acid (AA-OOH) differs from the two foregoing examples in that it incorporates both a 12-carboxylic acid group and unsaturation at the 11,13-position. In support of this hypothesis, when a sample of AA-OOH (produced by photosensitized oxygenation of AA and purified by HPLC) was left unattended for several weeks, it was indeed found to have been converted predominantly to ArtB (albeit at a rate that was significantly slower than for the conversion of DHAA-OOH to artemisinin). This unexpected transformation is most simply explained by attack of the 12-carboxylic acid group at the allylic position of the hydroperoxide, as is shown in Supplementary Figure 3A. Further studies will be required to explain why this (apparently) rather subtle modification to the 12-CO2H group (i.e., the introduction of 11,13-unsaturation in AA-OOH) has resulted in such a radically different pathway, as compared with DHAA-OOH.
The second most abundant product of AA-OOH conversion is epi-deoxyarteannuin B (EDB), which accumulates predominantly in young leaves of NCV. The EDB accumulation pattern is therefore different to that of DHEDB (the 11,13-saturated analog), whose concentration rises from top to mature and dry leaves in Artemis, broadly following the accumulation pattern of artemisinin. We have proposed that the spontaneous conversions of AA into EDB and DHAA into DHEDB progress via very similar molecular mechanisms (Supplementary Figures 3C and D). Interestingly, we have observed very little EDB arising from the spontaneous conversions of AA-OOH described above, which was predominantly converted to ArtB. It is known that a hydrophobic (lipophilic) environment promotes conversions of DHAA-OOH into artemisinin, whereas an aqueous, acidic medium promotes DHAA-OOH conversions to DHEDB (Brown and Sy, 2004). This may also explain the very minor conversion of AA-OOH into EDB, which was carried out in a hydrophobic environment (deuterated chloroform) that promoted AA-OOH conversions to ArtB. This highlights the parallels between artemisinin and arteannuin B biogenesis shown in Supplementary Figures 3A and B. It also suggests that in vivo conversions of AA-OOH to EDB require an aqueous intracellular environment, which might be expected to be present in young leaf trichomes, but less so in mature leaf trichomes where the sub-apical hydrophobic cavities are predominant (Ferreira and Janick, 1995), or upon cell dehydration (in dried leaf material).
Differences between the LAP and HAP chemotypes extended well beyond artemisinin-related sesquiterpenes to other classes of terpenes (Figures 4 and 5, Supplemental Tables 1-3). This divergence at the level of metabolism is not that surprising given that these chemotypes also exhibit significant differences in their morphology (Figure 6). Artemis is an F1 hybrid derived from HAP parents of East Asian origin (Delabays et al., 2001) while NCV is an open-pollinated variety of European origin (personal communication with Dr. Michael Schwerdtfeger, curator of Botanical Garden at the University of Göttingen, Germany). This is consistent with the general trend for the A. annua varieties of European and North American origin, which mostly represent the LAP chemotype, and the majority of East-Asian origin varieties, which represent the HAP chemotype (Wallaart et al., 2000). Details of the genetic divergence of these varieties remain a topic for further investigation that could reveal further insight into the sesquiterpene flux into different end products.

CONCLUSION
This first comparative phytochemical analysis of high- (HAP) and low-artemisinin producing (LAP) chemotypes of A. annua has resulted in the characterization of over 85 natural products by NMR, 26 of which have not previously been described in A. annua. We have also shown that the vast majority of amorphane sesquiterpenes are unsaturated at the 11,13-position in the LAP-chemotype, as opposed to the majority of them being saturated at the 11,13-position in the HAP-chemotype. This is explained by the existence of two sequence variants of CYP71AV1 in the two investigated chemotypes and differential expression of the key branching enzyme in the artemisinin pathway, namely artemisinic aldehyde Δ11(13) reductase (DBR2). By highlighting the main points of difference between HAP and LAP chemotypes, our findings will help inform strategies for the future improvement of artemisinin production in either A. annua or heterologous hosts.

AUTHOR CONTRIBUTIONS
TC planned and performed the experiments, analyzed the data, and wrote the manuscript. TL planned the UPLC-MS and GC-MS experiments, analyzed data and reviewed the manuscript. TMC planned and performed morphological plant analysis. DH performed UPLC-MS and GC-MS experiments. CW planned and performed extraction, purifications and NMR experiments and analyzed data. ME performed extraction, purifications and NMR experiments. GDB planned and performed NMR experiments, analyzed data, wrote and reviewed the manuscript. IAG planned and supervised the experiments and wrote the manuscript.