REVIEW article

Front. Plant Sci., 23 December 2021

Sec. Plant Biotechnology

Volume 12 - 2021 | https://doi.org/10.3389/fpls.2021.791219

Review on the Development and Applications of Medicinal Plant Genomes

  • 1. State Key Laboratory of Quality Research in Chinese Medicine, Faculty of Chinese Medicine, Macau University of Science and Technology, Taipa, Macao SAR, China

  • 2. Lushan Botanical Garden, Chinese Academy of Sciences, Jiujiang, China

  • 3. Joint Laboratory for Translational Cancer Research of Chinese Medicine, The Ministry of Education of the People’s Republic of China, Macau University of Science and Technology, Taipa, Macao SAR, China

Abstract

With the development of sequencing technology, research on medicinal plants is no longer limited to chemistry, pharmacology, and pharmacodynamics, but now extends to the genetic level. As next-generation sequencing has become affordable and long-read sequencing has become established, large medicinal plant genomes can be sequenced and assembled more easily. Although plant genomes have been reviewed several times, no review has yet given a systematic and comprehensive introduction to the development and application of the medicinal plant genomes reported so far. Here, we provide a historical perspective on the current situation of genomes in medicinal plant biology, highlight the use of rapidly developing sequencing technologies, and comprehensively summarize how genomes are applied to practical problems in medicinal plants, such as genomics-assisted herb breeding, the reconstruction of evolutionary history, herbal synthetic biology, and geoherbal research, all of which are important for the effective utilization, rational use, and sustainable protection of medicinal plants.

Introduction

Medicinal plants are, in the simplest definition, plants that can be used for medicinal purposes. More precisely, they are plants that have long been used and verified as traditional medicines, that have been found to have medicinal value in modern research, or that contain medicinal ingredients. They provide essential resources for human life, such as drugs, nourishment, condiments, and medicinal oils. They have also shed light on and contributed to the evolution of nature, animals, and humans. The foundation of all life is the genetic code. Therefore, access to the primary DNA sequence and knowledge of how genes are encoded within the genome have become basic resources in biology (Hamilton and Robin Buell, 2012). Genomic studies of medicinal plants aim to elucidate the molecular mechanisms by which these plants prevent human diseases, using the genetic information and regulatory networks of each species together with omics technologies, and thereby to explain their effects on the human body at the genome level. At present, genome sequencing in plants lags behind that in microorganisms and animals. Because genomic information is lacking, medicinal plant research is poorly connected to modern life sciences, and frontier life science technologies can hardly be applied to it. Over the years, research on medicinal plants has focused mainly on chemistry and pharmacology, and studies that uncover their underlying biology need to be strengthened.

Plant genome sequencing methods and strategies have changed radically in the past 5 years, and medicinal plant genome sequencing is no exception. Previous reviews summarized the status of sequenced plant genomes before 2012 (Hamilton and Robin Buell, 2012), the status of sequenced angiosperm genomes before 2018 (Chen et al., 2018), and the impact of third-generation genomic technologies on plant genome assembly (Jiao and Schneeberger, 2017). In addition, reviews published in Chinese proposed and introduced the Herb Genome Program (Chen et al., 2010) and the 1,000 genome projects of medicinal plants (Chen et al., 2019). As sequencing costs have fallen drastically and long-read sequencing has developed quickly in recent years, existing genomes will continue to be improved, and more and more large and complicated medicinal plant genomes are being reported. The prospects for revealing the secrets of medicinal plant biology are bright. However, no review yet covers the medicinal plant genomes released so far together with the development of sequencing strategies and applications.

In this manuscript, we conducted a systematic review of medicinal plant genome research and discuss the current status, sequencing technology development, and applications of medicinal plant genomes. This review provides a historical perspective on the current situation of genomes in medicinal plant biology and highlights the use of rapidly developing sequencing technologies in plant biology. Long-read sequencing technologies have eased, to some extent, the current limitations and challenges in medicinal plant genomics. Multiple omics methods are being integrated to make better use of medicinal plant genome data and to solve practical problems encountered in breeding and medicine. We also comprehensively summarize the applications of medicinal plant genomes, to promote the study of important questions in plant biology, such as genomics-assisted herb breeding, herbal synthetic biology, and geoherbal research, which are significant for securing the future of medicinal plants and their active compounds.

Literature Search Methods and Results

The systematic literature search followed the PRISMA guidelines (Moher et al., 2009). First, we searched the electronic databases PubMed (National Library of Medicine, United States), EMBASE (Elsevier, Netherlands), and Web of Science (Clarivate Analytics, United States) for articles published until June 4, 2021, using the term “medicinal plant genome.” We also searched the plaBiPD database (Forschungszentrum Jülich GmbH, Germany) and identified the medicinal plants among all plants with reported genomes. The electronic database search initially identified 5,064 articles: 1,678 from PubMed, 1,982 from EMBASE, and 1,404 from Web of Science. A further 173 articles came from the plaBiPD database, and 831 duplicates were removed. A total of 4,189 articles were excluded after scanning the titles and abstracts, for reasons such as being irrelevant or not being research studies. Fifty-nine articles were excluded after reading the full text because they were reviews, did not concern medicinal plants, did not report whole-genome sequencing, or did not mention medicinally related content. Finally, 158 articles were included in this systematic review. A flowchart of the article search and selection is shown in Figure 1. According to our statistics, these 158 articles reported at least 161 reference genomes belonging to 126 medicinal plants. Medicinal plant genomes have been published in a total of 40 journals; the journal names and article numbers are provided in Supplementary Table 1. Since 2010, articles about medicinal plant genomes have appeared almost every year, and since 2017 their number has increased significantly.
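The selection counts reported above can be checked with simple arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and not part of the original study, and the numbers are taken directly from the paragraph above.

```python
# Record counts as reported in the literature search paragraph.
database_hits = {"PubMed": 1678, "EMBASE": 1982, "Web of Science": 1404}
plabipd_hits = 173                # additional records from plaBiPD
duplicates_removed = 831
excluded_on_title_abstract = 4189
excluded_on_full_text = 59

# Records identified in the three electronic databases.
identified = sum(database_hits.values())

# Add the plaBiPD records, then remove duplicates.
after_dedup = identified + plabipd_hits - duplicates_removed

# Apply the two screening steps in order.
after_screening = after_dedup - excluded_on_title_abstract
included = after_screening - excluded_on_full_text

print(identified, after_dedup, after_screening, included)
```

Under these assumptions, the three database counts sum to the reported 5,064 records, and the final count matches the 158 included articles.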

FIGURE 1

General Introduction of Medicinal Plant Genomes Research

History and General Characteristics of Medicinal Plant Genome Research

Medicinal plant genomes are more complex than animal genomes, which hindered their sequencing, and the field entered a period of rapid development only from 2016. This may be due to the decline in sequencing price and the development of long-read sequencing technologies. The number of medicinal plant genome articles reported each year is shown in Figure 2A. In 2020, the number of published medicinal plant genomes reached 53. In 2021, 33 medicinal plant genome articles had been published by June 4, and the total for the year is projected to exceed 60. As more and more medicinal plant genomes have been revealed, several plants have had their genomes sequenced twice or more. Some of these repeated genomes were sequenced by different groups at the same time, some reflect improvements in assembly level and quality, and some are genomes of different varieties of the same species. Among the 53 medicinal plant genomes reported in 2020, 18 were repeat reports, accounting for 34%. This shows that sequencing technology is continuously developing and yielding more complete and more accurate genomes. Take Panax notoginseng (Chinese name: Sanqi) as an example: five versions of its genome have been reported. The first two versions, published in 2017, were sequenced with the next-generation sequencing (NGS) technology of the Illumina platform (Chen et al., 2017; Zhang D. et al., 2017), and the three recent versions, published in 2020 and 2021, were sequenced with the third-generation sequencing technologies of Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) (Fan G. et al., 2020; Jiang et al., 2021; Yang et al., 2021b). The latest two versions were assembled to the chromosome level, their assembled sequences were hundreds of times longer than those of the first two versions, and the accuracy and credibility of annotation were also greatly improved. Detailed statistics on the medicinal plant genomes are shown in Table 1.

FIGURE 2

TABLE 1

# | Name | Platform | Class | Esti-size (Mb) | Assem-size (Mb) | Repeat (%) | Contig N50 (kb unless stated) | Scaffold N50 (kb unless stated) | Gene (#) | References
1 | Acer truncatum | I, P | eudi | 739 | 633 | 61.8 | 773.2 kb | 46.36 Mb | 28438 | Ma Q. et al., 2020
2 | Akebia trifoliata subsp. australis | I, P | eudi | 670 | 682 | 71.2 | 6.2 Mb | 43.11 Mb | 25598 | Huang et al., 2021
3 | Allium sativum | I, P, O | mono | 16900 | 16243 | 91.3 | 194 kb | 1691 Mb | 57561 | Sun et al., 2020
4 | Aloe vera | I, O | mono | 16040 | 12930 | 78.7 | NA | 14.6 kb | 86177 | Jaiswal et al., 2021
5–1 | Andrographis paniculata | I, P | eudi | 280 | 269 | 53.3 | 388 kb | NA | 25428 | Sun et al., 2019
5–2 | Andrographis paniculata | I, P | eudi | 310 | 284 | 57.4 | 5.14 Mb | NA | 24015 | Liang et al., 2020
6–1 | Apium graveolens | I | eudi | 3180 | 2210 | 68.9 | 13.1 kb | 35.6 kb | 34277 | Li M.-Y. et al., 2020
6–2 | Apium graveolens | I, P | eudi | 3470 | 3330 | 87.1 | 790.6 kb | 289.78 Mb | 31326 | Song et al., 2021
7–1 | Aquilaria sinensis | I, O | eudi | 773 | 727 | 59.1 | 1.1 Mb | 88.78 Mb | 29203 | Ding et al., 2020
7–2 | Aquilaria sinensis | I | eudi | 784 | 784 | 61.2 | NA | 87.6 Mb | 35965 | Nong et al., 2020
8 | Aquilegia oxysepala var. kansuensis | I, P | eudi | 312 | 293 | 45.7 | 2.2 Mb | 40.9 Mb | 25571 | Xie et al., 2020
9 | Artemisia annua | I, P, R | eudi | 1740 | 1740 | 61.6 | 18.95 kb | 104.86 kb | 63226 | Shen et al., 2018
10 | Asparagus setaceus | I, O | mono | 720 | 710 | 64.4 | 1.36 Mb | 2.19 Mb | 28410 | Li S.-F. et al., 2020
11–1 | Averrhoa carambola | I, O | eudi | 358 | 335 | 61.3 | 4.22 Mb | 31.25 Mb | 25419 | Wu S. et al., 2020
11–2 | Averrhoa carambola | I | eudi | 475 | 471 | 68.2 | 44.84 kb | 2.76 Mb | 24726 | Fan Y. et al., 2020
12 | Azadirachta indica | I | eudi | 364 | 364 | 13.0 | 740 bp | 452 kb | 20169 | Krishnan et al., 2012
13 | Betula platyphylla | I, P | eudi | 430 | 430 | 43.0 | 751 kb | NA | 31253 | Chen S. et al., 2020
14 | Brassica oleracea | I, R, Sa | eudi | 630 | 540 | 38.8 | 26.8 kb | 1.46 Mb | 45758 | Liu S. et al., 2014
15 | Broussonetia papyrifera | I, P | eudi | 380 | 387 | 49.2 | 171.2 kb | 29.48 Mb | 30512 | Peng et al., 2019
16 | Calotropis gigantea | I | eudi | 225 | 157 | 28.3 | 48.6 kb | 806.0 kb | 18197 | Hoopes et al., 2018
17 | Camellia sinensis | I | eudi | 3000 | 3020 | 80.9 | 20.0 kb | 449.5 kb | 36951 | Xia et al., 2017
18 | Camptotheca acuminata | I | eudi | 503 | 403 | 35.6 | 108 kb | 1752 kb | 31825 | Zhao et al., 2017
19–1 | Cannabis sativa | I, R | eudi | 820 | 534 | NA | NA | 16.2 kb | 30000 | van Bakel et al., 2011
19–2 | Cannabis sativa | I, P | eudi | 843 | 808 | 74.8 | 513.6 kb | 83 Mb | 38828 | Gao et al., 2020
20–1 | Capsicum annuum | I | eudi | 3260 | 3349 | 80.9 | 55.4 kb | 1226.8 kb | 35336 | Qin et al., 2014
20–2 | Capsicum annuum | I | eudi | 3480 | 3060 | 76.4 | 30 kb | 2.47 Mb | 34903 | Kim S. et al., 2014
21 | Carthamus tinctorius | I, P | eudi | 1170 | 1060 | 60.1 | 21.23 Mb | 88.21 Mb | 33343 | Wu et al., 2021
22 | Catharanthus roseus | I | eudi | 738 | 523 | NA | NA | 26.2 kb | 33829 | Kellner et al., 2015
23 | Centella asiatica | I | eudi | 430 | 430 | 56.4 | NA | 15.7 Mb | 25226 | Pootakham et al., 2021a
24 | Cerasus humilis | I, P | eudi | 228 | 223 | 43.1 | 1.45 Mb | 26.23 Mb | 26821 | Wang et al., 2020
25 | Chimonanthus praecox | I, P | magno | 779 | 695 | 47.5 | 2.19 Mb | 65.35 Mb | 23591 | Shang et al., 2020
26 | Chimonanthus salicifolius | I, P | magno | 836 | 820 | 57.7 | 2.3 Mb | NA | 36651 | Lv Q. et al., 2020
27 | Chiococca alba | I | eudi | 567 | 558 | NA | NA | 2.35 Mb | 28707 | Lau et al., 2020
28 | Chrysanthemum nankingense | I, O | eudi | 3240 | 2530 | 69.6 | 130.7 kb | NA | 56870 | Song et al., 2018
29 | Cinnamomum kanehirae | I, P | magno | 824 | 731 | 48.0 | 0.9 Mb | 50.4 Mb | 27899 | Chaw et al., 2019
30 | Citrus medica | I | eudi | 407 | 405 | 43.8 | 46.5 kb | 367 kb | 32579 | Wang et al., 2017
31 | Citrus reticulata | I | eudi | 334 | 334 | 50.1 | 24.7 kb | 1.7 Mb | 28820 | Wang L. et al., 2018
32 | Coix aquatica | I, P | mono | 1680 | 1620 | 75.4 | 2.24 Mb | NA | 39629 | Guo et al., 2020
33–1 | Coix lacryma-jobi | I, P | mono | 1800 | 1730 | 77.7 | 3.19 Mb | 13.98 Mb | 44485 | Liu H. et al., 2020
33–2 | Coix lacryma-jobi | I, P | mono | 1560 | 1280 | 77.0 | NA | 594.3 kb | 39574 | Kang et al., 2020b
34 | Colocasia esculenta | I, P, O | mono | 2390 | 2405 | 88.4 | 400 kb | 159.4 Mb | 28695 | Yin et al., 2021
35–1 | Coptis chinensis | I, P | eudi | 1047 | 958 | 62.2 | 1.58 Mb | 4.53 Mb | 34109 | Chen D. et al., 2021
35–2 | Coptis chinensis | I, O | eudi | 1150 | 937 | 62.5 | 806.6 kb | NA | 41004 | Liu et al., 2021
36 | Coriandrum sativum | I, P | eudi | 2130 | 2119 | 80.6 | 604.1 kb | 160.99 Mb | 40747 | Song X. et al., 2020
37 | Cuscuta australis | P, I | eudi | 273 | 265 | 58.0 | 3.63 Mb | 5.95 Mb | 19671 | Sun et al., 2018
38 | Dalbergia odorifera | I, P | eudi | 653 | 638 | 54.2 | 5.92 Mb | 56.16 Mb | 30310 | Hong et al., 2020
39 | Datura stramonium | I, O | eudi | 2000 | 2100 | 61.0 | 13.1 kb | 164.1 kb | 52149 | Rajewski et al., 2021
40 | Daucus carota | I | eudi | 473 | 422 | 46.0 | 31.2 kb | 12.7 Mb | 32113 | Iorizzo et al., 2016
41 | Dendrobium catenatum | I | eudi | 1110 | 1010 | 78.1 | 33.1 kb | 391 kb | 28910 | Zhang G. Q. et al., 2016
42–1 | Dendrobium officinale | I, P | mono | 1270 | 1350 | 63.3 | 25.1 kb | 76.4 kb | 35567 | Yan et al., 2015
42–2 | Dendrobium officinale | I, P | mono | 1210 | 1230 | 64.4 | 1.44 Mb | 63.07 Mb | 27631 | Niu et al., 2021
43 | Dimocarpus longan | I | eudi | 445 | 472 | 52.9 | 26.0 kb | 566.6 kb | 31007 | Lin et al., 2017
44 | Dioscorea zingiberensis | I | mono | 851 | 800 | 42.8 | 1.08 kb | 1.96 kb | 27057 | Zhou et al., 2018
45 | Dracaena cambodiana | I | mono | 1120 | 1064 | 53.5 | 1.87 kb | 3.19 kb | 53700 | Ding et al., 2018
46 | Eleutherococcus senticosus | I, P | eudi | 1260 | 1300 | 73.6 | 309.4 kb | 50.79 Mb | 36372 | Yang et al., 2021a
47–1 | Erigeron breviscapus | I, P | eudi | 1520 | 1200 | 54.6 | 18.8 kb | 31.5 kb | 37504 | Yang et al., 2017
47–2 | Erigeron breviscapus | I, P | eudi | 1520 | 1430 | 67.4 | 140.95 kb | 156.82 Mb | 43514 | He et al., 2021
48 | Eriobotrya japonica | I, P | eudi | 803 | 761 | 85.9 | 3.98 Mb | 43.16 Mb | 43996 | Su et al., 2021
49–1 | Eucommia ulmoides | I, P | eudi | 1100 | 1180 | 61.2 | 17.06 kb | 1.03 Mb | 26723 | Wuyun et al., 2018
49–2 | Eucommia ulmoides | P | eudi | 1020 | 948 | 62.5 | 13.16 Mb | 53.15 Mb | 26001 | Li Y. et al., 2020
50 | Fagopyrum tataricum | I, P | eudi | 489 | 489 | 51.0 | 550.7 kb | NA | 33366 | Zhang L. et al., 2017
51 | Forsythia suspensa | I, O | eudi | 701 | 737 | 54.5 | 7.3 Mb | 7.3 Mb | 33062 | Li L.-F. et al., 2020
52 | Gardenia jasminoides | I, O | eudi | 551 | 535 | 62.2 | 1.0 Mb | 44 Mb | 35967 | Xu et al., 2020b
53–1 | Gastrodia elata | I | mono | 1180 | 1061 | 66.2 | 68.9 kb | 4.9 Mb | 18969 | Yuan Y. et al., 2018
53–2 | Gastrodia elata | I | mono | 1378 | 1120 | 69.8 | 110 kb | 1.64 Mb | 24484 | Chen S. et al., 2020
54 | Gelsemium elegans | I, O | eudi | 338 | 335 | 43.2 | 10.23 Mb | 40.47 Mb | 26768 | Liu Y. et al., 2020
55 | Gelsemium sempervirens | I | eudi | 219 | 244 | NA | NA | 411 kb | 22617 | Franke et al., 2019
56 | Ginkgo biloba | I | gymno | 11750 | 10610 | 76.6 | 48.2 kb | 1.36 Mb | 41840 | Guan et al., 2016
57 | Glycyrrhiza uralensis | I, P | eudi | 401 | 379 | 36.5 | 7.3 kb | 109.3 kb | 34445 | Mochida et al., 2017
58 | Hemerocallis citrina | I, P | mono | 3800 | 3770 | 78.9 | 2.09 Mb | NA | 54295 | Qing et al., 2021
59 | Hypericum perforatum | I, P | eudi | 400 | 373 | 46.9 | 1.41 Mb | 2.31 Mb | 29150 | Zhou et al., 2021
60 | Isatis indigotica | I, P | eudi | 305 | 294 | 53.3 | 1.18 Mb | 36.17 Mb | 30323 | Kang et al., 2020a
61 | Jacaranda mimosifolia | I, O | eudi | 739 | 707 | 56.8 | 16.77 Mb | 39.98 Mb | 30507 | Wang M. et al., 2021
62–1 | Juglans regia | I, P | eudi | 606 | 667 | 51.2 | 46.1 kb | 465.0 kb | 32498 | Martínez-García et al., 2016
62–2 | Juglans regia | O | eudi | 620 | 574 | 58.4 | 1.1 Mb | 37 Mb | 37554 | Marrano et al., 2020
63 | Lagenaria siceraria | I | eudi | 334 | 313 | 46.9 | 28.3 kb | 8.7 Mb | 18534 | Wu et al., 2017
64 | Lavandula angustifolia | I, P | eudi | 1095 | 895 | 58.3 | 1.22 Mb | 36.2 Mb | 65905 | Li et al., 2021
65 | Lepidium meyenii | I | eudi | 751 | 743 | 47.7 | 81.8 kb | 2.4 Mb | 96417 | Zhang J. et al., 2016
66 | Linum usitatissimum | I | eudi | 373 | 302 | 50.0 | 20.1 kb | 693.5 kb | 43384 | Wang et al., 2012
67 | Lithospermum erythrorhizon | I, O | eudi | 369 | 367 | 51.8 | 314.3 kb | NA | 27720 | Auber et al., 2020
68 | Litsea cubeba | I, P | magno | 1370 | 1326 | 55.5 | 607.3 | 1760.0 | 31329 | Chen Y.-C. et al., 2020
69 | Lonicera japonica | I, O | eudi | 887 | 843 | 58.2 | 2.1 Mb | 84.4 Mb | 33939 | Pu et al., 2020
70 | Luffa acutangula | I, P | eudi | 760 | 735 | 62.2 | NA | 786.1 kb | 32233 | Pootakham et al., 2021b
71–1 | Luffa cylindrica | I, P | eudi | 737 | 669 | 62.2 | 5 Mb | 53 Mb | 31661 | Zhang et al., 2020
71–2 | Luffa cylindrica | I, P | eudi | 720 | 656 | 63.8 | 8.8 Mb | 48.76 Mb | 25508 | Wu H. et al., 2020
71–3 | Luffa cylindrica | I, P | eudi | 773 | 690 | 56.8 | NA | 578.6 kb | 43828 | Pootakham et al., 2021b
72 | Macleaya cordata | I | eudi | 541 | 378 | 43.5 | 25.0 kb | 308.0 kb | 22328 | Liu et al., 2017
73 | Magnolia biondii | I, P | magno | 2240 | 2220 | 66.5 | 269.1 kb | 92.86 Mb | 47547 | Dong et al., 2021
74–1 | Medicago sativa / autotetraploid | I, P, O | eudi | 3150 | 2738 | 55.0 | 459.0 kb | NA | 164632 | Chen H. et al., 2020
74–2 | Medicago sativa Zhongmu No.1 / haploid | I, P | eudi | 800 | 816 | 57.0 | 3.92 Mb | NA | 49165 | Shen et al., 2020
74–3 | Medicago sativa spp. caerulea / diploid | I, O | eudi | 802 | 793 | 55.6 | 3.86 Mb | NA | 47202 | Li A. et al., 2020
75 | Mentha longifolia | I, P | eudi | 400 | 353 | NA | 4.5 kb | NA | 35597 | Vining et al., 2017
76 | Mesua ferrea | I | eudi | 685 | 614 | NA | 251.7 kb | 392.8 kb | 46540 | Patil et al., 2021
77 | Mitragyna speciosa | I | eudi | 1123 | 1123 | 44.2 | 70.4 kb | 1.02 Mb | 55746 | Brose et al., 2021
78–1 | Momordica charantia | I | eudi | 339 | 286 | 15.3 | NA | 1.1 Mb | 45859 | Urasaki et al., 2017
78–2 | Momordica charantia | I, P | eudi | 303 | 303 | 52.5 | 9.9 Mb | 25.37 Mb | NA | Matsumura et al., 2020
79 | Morinda officinalis | I, O | eudi | 485 | 485 | 58.0 | 4.2 Mb | 40.97 Mb | 27102 | Wang J. et al., 2021
80 | Moringa oleifera | I | eudi | 278 | 217 | 40.6 | 45.3 kb | 957.2 kb | 18451 | Chang et al., 2019
81 | Morus notabilis | I | eudi | 357 | 331 | 47.0 | 34.5 kb | 390.1 kb | 29338 | He et al., 2013
82 | Myrica rubra | I | eudi | 313 | 290 | 45.6 | 68.6 kb | 2164.2 kb | 26325 | Ren et al., 2019
83–1 | Nelumbo nucifera | I, R | eudi | 929 | 804 | 57.0 | 38.8 kb | 3.4 Mb | 26685 | Ming et al., 2013
83–2 | Nelumbo nucifera | I | eudi | 879 | 792 | 49.5 | 39.3 kb | 986.5 kb | 36385 | Wang et al., 2013
84–1 | Ocimum basilicum | I | eudi | 2360 | 2068 | 61.6 | 48.3 kb | 1.5 Mb | 78990 | Bornowski et al., 2020
84–2 | Ocimum basilicum | I | eudi | 2320 | 2130 | 76.0 | 45.7 kb | 19.3 Mb | 62067 | Gonda et al., 2020
85 | Ocimum tenuiflorum | I | eudi | 612 | 374 | 42.9 | 2.6 kb | 27.1 kb | 36768 | Upadhyay et al., 2015
86 | Ophiorrhiza pumila | I, P | eudi | 440 | 440 | 58.2 | 18.49 Mb | 40.06 Mb | 32389 | Rai et al., 2021
87 | Osmanthus fragrans | I, P | eudi | 741 | 727 | 49.4 | 1.59 Mb | NA | 45542 | Yang et al., 2018
88 | Paeonia suffruticosa | I, P | eudi | 15760 | 13790 | 80.2 | 49.9 kb | NA | 34854 | Lv S. et al., 2020
89–1 | Panax ginseng | I | eudi | 3500 | 3430 | 62.2 | 22.0 kb | 108.7 kb | 42006 | Xu et al., 2017
89–2 | Panax ginseng | I | eudi | 3600 | 2980 | 79.5 | 22.5 kb | 569.0 kb | 59352 | Kim et al., 2018
90–1 | Panax notoginseng | I | eudi | 2310 | 2390 | 75.9 | 16.0 kb | 96.0 kb | 36790 | Chen et al., 2017
90–2 | Panax notoginseng | I | eudi | 2002 | 1850 | 61.3 | 13.2 kb | 158.0 kb | 34369 | Zhang D. et al., 2017
90–3 | Panax notoginseng | O, P | eudi | 2310 | 2240 | 79.1 | 220.9 kb | NA | 39452 | Fan G. et al., 2020
90–4 | Panax notoginseng | I, P | eudi | 2380 | 2660 | 85.9 | 1.12 Mb | 216.47 Mb | 37606 | Jiang et al., 2021
90–5 | Panax notoginseng | I, P | eudi | 2310 | 2410 | 88.2 | 1.45 Mb | 196.33 Mb | 47870 | Yang et al., 2021b
91–1 | Papaver somniferum | I, O | eudi | 2870 | 2720 | 70.9 | 1.77 Mb | 2.04 Mb | 51213 | Guo et al., 2018
91–2 | Papaver somniferum | I | eudi | 3370 | 2620 | 65.8 | 86.0 kb | 6.86 Mb | 79668 | Pei et al., 2021
92–1 | Passiflora edulis | I, O | eudi | 1396 | 1341 | NA | 3.1 Mb | NA | 23171 | Xia et al., 2021
92–2 | Passiflora edulis | I, P | eudi | 1410 | 1280 | 86.3 | 70.2 kb | 126.4 Mb | 39309 | Ma et al., 2021
93 | Phytolacca americana | I | eudi | 1260 | 930 | NA | 35.2 kb | 42.5 kb | 29773 | Neller et al., 2019
94 | Piper nigrum | I, P | magno | 762 | 761 | 54.9 | NA | 29.8 Mb | 63466 | Hu et al., 2019
95 | Platycodon grandiflorus | I | eudi | 683 | 680 | 36.2 | 15 kb | 277.1 kb | 40017 | Kim et al., 2020
96 | Pogostemon cablin / diploid | I | eudi | 1576 | 1150 | 58.6 | 0.4 kb | 1.1 kb | 45020 | He et al., 2016
97 | Pogostemon cablin / octaploid | I | eudi | 2380 | 1916 | 43.7 | 34.7 kb | 699.0 kb | 110850 | He et al., 2018
98 | Polygonum cuspidatum | I | eudi | 2600 | 2560 | 71.5 | 2.8 kb | 3.2 kb | 55075 | Zhang Y. et al., 2019
99 | Poncirus trifoliata | I, P | eudi | 265 | 265 | 42.6 | 842.8 kb | 27.7 Mb | 25538 | Peng et al., 2020
100–1 | Punica granatum | I | eudi | 357 | 328 | 46.1 | 67.0 kb | 1.89 Mb | 29229 | Qin et al., 2017
100–2 | Punica granatum | I | eudi | 336 | 274 | 51.2 | 97.0 kb | 1.7 Mb | 30903 | Yuan Z. et al., 2018
100–3 | Punica granatum | I, P | eudi | 313 | 320 | 50.9 | 4.49 Mb | 39.96 Mb | 33594 | Luo et al., 2020
101 | Raphanus sativus | I | eudi | 529 | 402 | 26.7 | NA | 46.3 kb | 61572 | Kitashiba et al., 2014
102 | Rhodiola crenulata | I | eudi | 420 | 345 | 66.2 | 25.4 kb | 144.7 kb | 31517 | Fu et al., 2017
103 | Ricinus communis | Sa | eudi | 320 | 351 | 50.3 | 21.1 kb | 496.5 kb | 31237 | Chan et al., 2010
104–1 | Rosa chinensis | I, P | eudi | 560 | 560 | 67.9 | NA | 24 Mb | 36377 | Raymond et al., 2018
104–2 | Rosa chinensis | I, P | eudi | 568 | 512 | 63.2 | 3.4 Mb | NA | 39669 | Hibrand Saint-Oyant et al., 2018
105 | Rosa roxburghii | I | eudi | 481 | 409 | 47.6 | 1.5 kb | 3.6 kb | 22721 | Lu et al., 2016
106 | Rosmarinus officinalis | I | eudi | 1180 | 1014 | 54.7 | 21.8 kb | 368.7 kb | 51389 | Bornowski et al., 2020
107 | Salvia bowleyana | I, P | eudi | 462 | 462 | 58.7 | 1.18 Mb | 57.96 Mb | 44044 | Zheng et al., 2021
108–1 | Salvia miltiorrhiza | I, P, R | eudi | 615 | 538 | 54.4 | 12.4 kb | 51.0 kb | 30478 | Xu et al., 2016
108–2 | Salvia miltiorrhiza | I, P | eudi | 572 | 595 | 64.8 | 2.7 Mb | 69.8 Mb | 32483 | Song Z. et al., 2020
109 | Santalum album | I | eudi | 203 | 221 | 27.4 | 460.7 kb | 460.7 kb | 38119 | Mahesh et al., 2018
110–1 | Scutellaria baicalensis | I, P | eudi | 408 | 387 | 55.2 | 880.6 kb | 1.34 Mb | 28524 | Zhao Q. et al., 2019
110–2 | Scutellaria baicalensis | I, O | eudi | 442 | 377 | 55.2 | 2.1 Mb | NA | 33414 | Xu et al., 2020a
111 | Scutellaria barbata | I, P | eudi | 405 | 353 | 53.5 | 2.5 Mb | NA | 41697 | Xu et al., 2020a
112 | Selaginella tamariscina | I, P | lycopo | 301 | 301 | 60.6 | 201.2 kb | 407.7 kb | 27761 | Xu et al., 2018
113 | Senna tora | I, P | eudi | 547 | 526 | 53.9 | 4.03 Mb | 41.7 Mb | 45268 | Kang et al., 2020c
114–1 | Sesamum indicum | I | eudi | 357 | 274 | 28.5 | 52.2 kb | 2.1 Mb | 27148 | Wang et al., 2014
114–2 | Sesamum indicum | P | eudi | 337 | 292 | NA | 1.06 Mb | 20.5 Mb | 28406 | Li C. et al., 2020
115 | Sinapis alba | I | eudi | 553 | 459 | NA | 1.7 kb | NA | 34012 | Kumari et al., 2020
116–1 | Siraitia grosvenorii | I | eudi | 420 | 470 | NA | 34.2 kb | 101.1 kb | 43856 | Itkin et al., 2016
116–2 | Siraitia grosvenorii | I, P | eudi | 420 | 470 | 51.1 | 432.4 kb | NA | 30565 | Xia et al., 2018
117 | Spatholobus suberectus | I, P | eudi | 793 | 798 | 47.8 | 2.1 Mb | 86.99 Mb | 31634 | Qin et al., 2019
118 | Stevia rebaudiana | I, O | eudi | 1160 | 1416 | 80.1 | 616.9 kb | 106.55 Mb | 44143 | Xu et al., 2021
119 | Taxus wallichiana | I, O | gymno | 10600 | 10900 | 85.0 | 8.6 Mb | 987 Mb | 44008 | Cheng et al., 2021
120 | Toona sinensis | I, O | eudi | 559 | 596 | 64.6 | 1.5 Mb | 21.5 Mb | 34345 | Ji et al., 2021
121 | Trichopus zeylanicus | I, P | mono | 860 | 714 | 47.4 | 289.5 kb | 430.0 kb | 34452 | Chellappan et al., 2019
122 | Trichosanthes anguina | I, O | eudi | 1030 | 920 | 80.0 | 20.11 Mb | 82.12 Mb | 22874 | Ma L. et al., 2020
123 | Tripterygium wilfordii | P | eudi | 366 | 348 | 52.4 | 4.36 Mb | 13.52 Mb | 28321 | Tu et al., 2020
124–1 | Vernicia fordii | I | eudi | 1200 | 1176 | 58.7 | NA | 474.9 kb | 46829 | Cui et al., 2018
124–2 | Vernicia fordii | I, P | eudi | 1310 | 1120 | 73.3 | NA | 87.15 Mb | 28422 | Zhang L. et al., 2019
125 | Xanthoceras sorbifolium | I, P | eudi | 442 | 440 | 56.4 | 642.3 kb | 29.43 Mb | 21059 | Liang et al., 2019
126 | Ziziphus jujuba | I | eudi | 444 | 438 | 46.8 | 34.0 kb | 301.0 kb | 32808 | Liu M. J. et al., 2014

The statistical results of medicinal plant genome published journals.

#, number; Esti-size, estimated genome size; Assem-size, assembled genome size; mono, monocots; eudi, eudicots; magno, magnoliids; lycopo, lycopodiophyta; gymno, gymnosperm; Sa, Sanger; R, Roche/454; I, Illumina; P, PacBio; O, Oxford Nanopore; NA, not reported.

Research, Protection, and Utilization of Geoherbal Resources

With the widespread application of NGS technology, genome sequencing of medicinal plants has become more feasible because the cost and time required to complete a project have fallen greatly. The whole-genome sequence provides the basis for understanding the fundamental biology and biomedical functions of a species.

We compiled statistics on the medicinal plant genome articles published over the years to gain a basic understanding of the general characteristics of the reported genomes. The sizes and repeat ratios of the published medicinal plant genomes, together with their evolutionary relationships, are compared in Figure 2B. The genomes of five medicinal plants are much larger than the others: Allium sativum, Paeonia suffruticosa, Aloe vera, Taxus wallichiana, and Ginkgo biloba. The plants with sequenced genomes comprise 123 angiosperms (12 monocots, 105 eudicots, and 6 magnoliids), two gymnosperms, and one lycophyte (Lycopodiophyta). The simplified phylogeny of the major clades of sequenced medicinal plants is also shown in Figure 2B. Angiosperms account for the vast majority of sequenced medicinal plants, and eudicots make up the majority of the angiosperms. Genome size is positively correlated with the proportion of repetitive elements: the larger the genome, the larger the proportion of repetitive elements tends to be. Most genome sizes fall within 4 Gb, and most repeat ratios fall between 30 and 90%.
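The relationship between genome size and repeat content can be illustrated with a small calculation. The sketch below computes a Pearson correlation between log10 assembled genome size and repeat percentage for ten rows copied from Table 1. The selection of rows, the log transform, and the helper name `pearson` are our own illustrative choices rather than the analysis behind Figure 2B, and a hand-picked subset cannot stand in for the full data set.

```python
import math

# (species, assembled size in Mb, repeat %) copied from Table 1.
# The subset is hand-picked for illustration only.
rows = [
    ("Calotropis gigantea", 157, 28.3),
    ("Santalum album", 221, 27.4),
    ("Ricinus communis", 351, 50.3),
    ("Azadirachta indica", 364, 13.0),
    ("Artemisia annua", 1740, 61.6),
    ("Camellia sinensis", 3020, 80.9),
    ("Ginkgo biloba", 10610, 76.6),
    ("Taxus wallichiana", 10900, 85.0),
    ("Paeonia suffruticosa", 13790, 80.2),
    ("Allium sativum", 16243, 91.3),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Genome sizes span two orders of magnitude, so compare on a log scale.
log_sizes = [math.log10(size) for _, size, _ in rows]
repeats = [repeat for _, _, repeat in rows]

r = pearson(log_sizes, repeats)
print(f"Pearson r (log10 size vs. repeat %): {r:.2f}")
```

For this subset the coefficient is positive, in line with the trend described above. A full analysis would use every row of Table 1 that reports a repeat percentage.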

It has been said that plant genome reports are formulaic and lack biological significance: their descriptions mainly cover the assembly, protein-coding genes, repeats, evolutionary analysis, and some aspects of biology, usually with a focus on transcription factors and the biosynthesis pathways of active compounds (Michael and Jackson, 2013). Most of the published medicinal plant genomes have not yet been used to solve specific applied problems, such as discovering new medicinal mechanisms, breeding new resistant varieties, or explaining evolutionary events. Nevertheless, the assembled genomes provide a reliable data resource. Whenever genetic information is needed, the genome serves as a solid foundation and reference.

Implications and Hallmark of Medicinal Plant Genome

Medicinal plants are the main sources of medicine, and records of their medicinal use can be traced back almost 5,000 years in China, India, and Egypt (Moss and Yuan, 2006; Jamshidi-Kia et al., 2018). They are also precious resource libraries for many chemical drugs: currently, more than one-third of clinical medications are derived from plant extracts or their derivatives (Chen and Song, 2016). Sequencing and deciphering the genome gives us a better understanding of the biosynthesis and regulation of bioactive compounds. Artemisia annua, the plant source of artemisinin, is one of the most famous medicinal plants, and the discovery of artemisinin was recognized with the 2015 Nobel Prize in Physiology or Medicine (Su and Miller, 2015). A semi-synthetic system has been used to greatly improve the production of artemisinin (Paddon et al., 2013). The A. annua genome has further provided a comprehensive understanding of artemisinin biosynthesis and led to improvements in artemisinin production. Before the A. annua genome was available, studies manipulating artemisinin biosynthesis focused on either upstream (Nafis et al., 2011) or downstream (Yuan et al., 2015) genes of the artemisinin biosynthesis pathway. The combined analysis of A. annua genomic and associated transcriptomic data then suggested other efficient strategies to increase artemisinin production. One was to simultaneously enhance the expression of enzyme genes at different steps of the biosynthesis pathway, including upstream (HMGR), midstream (FPS), and downstream (DBR2) genes. The other was to overexpress transcription factors such as AaMYB2, which regulates the expression of ADS, CYP71AV1, DBR2, and ALDH1 in the artemisinin biosynthesis pathway. This significantly increased the content of artemisinin and dihydroartemisinic acid, providing new insight into increasing the supply of artemisinin from plant sources (Shen et al., 2018).

In addition to improving the content of active compounds, it is also necessary to secure the agronomic traits of medicinal plants and enhance their resistance to stresses. Genome sequencing can help identify the genes associated with agronomic and disease resistance traits, and these genes can then be targeted to breed new varieties of medicinal plants with highly effective ingredients, excellent agronomic features, and strong resistance. P. notoginseng, a well-known medicinal plant, is susceptible to a wide range of pathogens, so its cultivation faces several challenges (Ou et al., 2011). The chromosome-level genome of P. notoginseng, combined with a genome-wide association study of 240 cultivated individuals, identified 63 genes associated with dry root weight (including genes encoding cysteine/histidine-rich C1 domain proteins), 168 genes associated with stem thickness (including APC6, WRKY71, and RWA3), and 33 genes associated with disease resistance (including genes encoding LRR receptor-like serine/threonine-protein kinases) (Fan G. et al., 2020). These valuable resources provide new opportunities to harness the full economic and medicinal potential of P. notoginseng.

Moreover, some medicinal plants occupy important positions in evolution, and their genomes help us understand the evolutionary relationships of plants. Ginkgo biloba is a living fossil without living relatives and represents one of the four extant gymnosperm lineages (cycads, ginkgo, conifers, and gnetophytes). Its genome showed that LTR-RT insertions and two whole-genome duplication (WGD) events in its evolutionary history contributed to its large genome size and long introns. In angiosperms, chromosomal breakages and fusions, as well as uneven gene loss, may prevent continuous growth in genome size (Schnable et al., 2009). This mechanism for removing transposable elements (TEs) may be lacking in gymnosperms such as ginkgo, leading to their enormous genome sizes. The outstanding defense ability of ginkgo results from the remarkable duplication of resistance genes and the enrichment of relevant pathways. The ginkgo genome sheds light on the sequencing of large plant genomes and on the genetic and evolutionary processes of land plants (Guan et al., 2016).

Quality and Integrity Improvement of Medicinal Plant Genomes

The quality of the assembly directly affects the quality of the whole genome. Contig N50 and scaffold N50 are the primary indicators for evaluating genome assembly results. Generally, the longer the contig N50 and scaffold N50, the better the assembly. As shown in Table 1, in 2017 and before, most of the reported medicinal plant genomes used NGS technologies, such as Illumina and Roche/454, and the contig N50 ranged from a few kilobases to dozens of kilobases. In 2018, half of the published genomes used a combination of next- and third-generation sequencing technologies, such as Illumina + PacBio and Illumina + Oxford Nanopore. From 2019 onward, this combined strategy has been applied to the majority of the reported genomes. Figure 3 shows that the contig N50 began to lengthen in 2018 and then increased year by year. By 2020, the contig N50 had generally increased to between a few hundred kilobases and several megabases. The contig N50 values of the medicinal plant genomes published in 2020 and 2021 were similar, and the longest reached 21.23 Mb (Wu et al., 2021; Table 1). This shows that the popularization of third-generation sequencing has made research more convenient and has greatly improved the quality and integrity of genomes.
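For readers unfamiliar with the metric, N50 is conventionally defined as the length L such that contigs (or scaffolds) of length L or longer together contain at least half of the total assembled bases. The following minimal Python sketch implements that standard definition. The function name `n50` and the contig lengths are made up for illustration and are not taken from any genome in Table 1.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length

# Hypothetical fragmented assembly: many short contigs (lengths in kb).
fragmented = [80, 70, 50, 40, 30, 20]

# The same 290 kb joined into fewer, longer contigs.
contiguous = [150, 90, 50]

print(n50(fragmented))   # 70
print(n50(contiguous))   # 150
```

Both toy assemblies contain the same 290 kb, yet the more contiguous one has roughly twice the N50. This is why the shift from short-read to long-read sequencing shows up so clearly in this statistic.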

FIGURE 3

Sequencing Strategy Development

The development of sequencing strategies for medicinal plant genomes has passed through three stages: the germination stage, the development stage, and the expansion stage (Figure 4).

FIGURE 4

Germination Stage of Medicinal Plant Genome Sequencing

Genomics began in the early 1990s with the development of automated sequencing methods based on dideoxy chain termination with fluorescently labeled molecules, known as Sanger sequencing. The effectiveness of the Sanger platform for large eukaryotic genomes was first reported in 2000 for Drosophila melanogaster, ushering in a new era of genomics (Adams et al., 2000). The method was also applied in plant biology, first to sequence ESTs in Arabidopsis thaliana (Newman et al., 1994) and then to sequence the whole genomes of various plants, such as Oryza sativa (Yu et al., 2002), Populus trichocarpa (Tuskan et al., 2006), Carica papaya (Ming et al., 2008), and Brachypodium distachyon (International Brachypodium Initiative, 2010). However, gaps and errors remain in the assemblies of these genomes, so they are not completely “finished”: finishing requires inspection and experimental resolution of inconsistencies, which is time-consuming, difficult, and expensive (Hamilton and Robin Buell, 2012). For these reasons, and because of cost, in the germination stage of medicinal plant genome sequencing the Sanger method was used only to sequence the genomes of major economic crops that are also regarded as medicinal plants, such as Ricinus communis, to provide references and templates for subsequent research.

The Development Stage of Medicinal Plant Genome Sequencing

After 2011, NGS technology developed rapidly, became the mainstream sequencing platform, and emerged as the preferred technology for sequencing medicinal plant genomes. The most widely used NGS platforms are the Roche 454 and Illumina platforms.

The Roche 454 platform was the first commercially successful NGS system. It uses high-throughput pyrosequencing (Margulies et al., 2005): library fragments are clonally amplified by emulsion PCR, and the pyrophosphate released during nucleotide incorporation is then detected. In 2005, the read length of Roche 454 was only 100–150 bp, with 20 Mb of output per run (Mardis, 2008). In 2008, the 454 GS FLX Titanium system appeared, with a read length of up to 700 bp and 0.7 Gb of output per run within 24 h. In late 2009, Roche simplified library preparation and data processing and raised the output to 14 Gb per run (Liu et al., 2012). In 2012, the platform was upgraded to the FLX+, which could generate 1 million reads with a read length of up to 1,000 bp.

The Illumina platform is a high-throughput sequencing-by-synthesis technology that uses reversible dye terminators; it was developed by Solexa, which Illumina acquired in 2007 (Bentley et al., 2008). Unlike Roche/454, which relies on emulsion PCR, the Illumina platform amplifies templates by bridge PCR. Library DNA with fixed adaptors is denatured into single strands and attached to the flow cell, and bridge amplification then produces clusters of clonal DNA fragments. The amplified fragments are converted into single strands by a linearization enzyme (Mardis, 2008). Four kinds of fluorescently labeled nucleotides, each modified with a terminator, are then added to complement the template one base at a time: the signal is captured, the terminator and fluorescent dye are cleaved, and a new round of synthesis begins, repeating until the desired read length is reached. In late 2011, the paired-end mode of the Illumina HiSeq 2000 could generate more than 250 million reads per lane.

Because the HiSeq 2000 offers higher throughput, lower cost, and a wider range of applications than Roche/454, the Illumina platform became the mainstream choice for medicinal plant genome sequencing. It is widely used for expression profiling, de novo sequencing, and re-sequencing in plants, such as Thellungiella parvula (Dassanayake et al., 2011) and Arabidopsis thaliana (Cao et al., 2011). As more and more medicinal plant genomes were reported, medicinal plant genome sequencing entered the development stage, and many large medicinal plant genomes were successfully sequenced. However, plant genomes are also highly repetitive, which makes them difficult to assemble accurately with NGS technologies alone.

Expansion Stage of Medicinal Plant Genome Sequencing

The development of third-generation sequencing has overcome this problem of highly repetitive sequences. The most widely applied long-read platform is Single-Molecule Real-Time (SMRT) sequencing from Pacific Biosciences (PacBio). SMRT sequencing is run on cells that contain tiny wells called zero-mode waveguides (ZMWs). In each ZMW, a DNA polymerase/template complex is immobilized and synthesizes a new DNA strand (Jiao and Schneeberger, 2017). Each incorporation generates a light pulse that identifies the differently labeled nucleotides (Eid et al., 2009). PacBio systems can produce reads with an average size of about 20 kb and a maximum length of over 60 kb (Kim K. E. et al., 2014; Vanburen et al., 2015). Although the error rate of raw reads is up to 15%, self-correction with sufficient coverage (Chin et al., 2013) or correction with NGS data (Bashir et al., 2012; Koren et al., 2012) enables genome assemblies with an accuracy of over 99.999% through bioinformatic analysis alone (Chin et al., 2016). Another long-read platform was introduced by Oxford Nanopore Technologies (ONT), which provided access to its first sequencing system in 2014 (Quick et al., 2014; Deamer et al., 2016). Single DNA molecules are passed through nanopores, and individual nucleotides create characteristic disruptions of the ionic current through the pore, which reveal the nucleotide sequence. Read length and sequencing accuracy are similar to those of PacBio reads, and the longest reads can reach up to 200 kb. The first whole-genome assemblies using ONT data reached N50 values of several hundred kilobases for fungal genomes, and bacterial genomes could be fully assembled with a nucleotide accuracy of over 99% (Goodwin et al., 2015; Loman et al., 2015).
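The gap between a raw-read error rate of up to 15% and a consensus accuracy of over 99.999% is easier to appreciate on the Phred scale, Q = −10·log10(error rate), which is commonly used to report sequencing accuracy. The short Python sketch below performs this conversion; the Phred scale is added here for illustration and is not used in the studies cited above, and the function name `phred_quality` is hypothetical.

```python
import math


def phred_quality(accuracy):
    """Convert a per-base accuracy (0 < accuracy < 1) to a Phred-scaled quality."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    error_rate = 1.0 - accuracy
    return -10.0 * math.log10(error_rate)


raw_q = phred_quality(0.85)            # raw long reads with up to 15% error
consensus_q = phred_quality(0.99999)   # corrected consensus sequence
print(f"raw reads: Q{raw_q:.1f}, consensus: Q{consensus_q:.1f}")
```

On this scale, raw reads with 15% error correspond to roughly Q8, whereas a consensus accuracy of 99.999% corresponds to about Q50, that is, one expected error per 100,000 bases.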

The emergence of third-generation sequencing brought a great leap in read length and moved medicinal plant genome sequencing into a stage of rapid expansion. The strategy used in this stage combines second- and third-generation sequencing technologies, which provides long reads, high throughput, and a reasonable price at the same time. Because medicinal plant genomes are large and rich in repetitive elements, the most frequently used strategy combines high-coverage Illumina data with low-coverage PacBio SMRT or ONT data. Third-generation sequencing supplies the long reads that increase assembly accuracy and draft quality, but its price is relatively high, so the Illumina platform is used to guarantee sufficient sequencing data. This combination makes it possible to assemble large and complex medicinal plant genomes to the chromosome level. As a result, researchers can not only obtain draft genome information and mine target protein-coding genes, but also resolve genomes at the chromosome level to study evolution, the functions of gene clusters, the effects of repetitive elements, and so on.
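When planning such a hybrid strategy, the amount of data required from each platform follows from the relation depth = total sequenced bases / genome size. The Python sketch below illustrates the arithmetic. The 3 Gb genome size and the target depths of 100× (Illumina) and 30× (long reads) are hypothetical values chosen only for illustration; they are not recommendations from the studies reviewed here.

```python
def sequencing_depth(total_bases, genome_size):
    """Average depth (coverage) = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def bases_needed(target_depth, genome_size):
    """Number of sequenced bases required to reach a target average depth."""
    return target_depth * genome_size


GENOME_SIZE = 3_000_000_000                 # hypothetical 3 Gb genome
plan = {"illumina": 100, "long_read": 30}   # hypothetical target depths

for platform, depth in plan.items():
    gigabases = bases_needed(depth, GENOME_SIZE) / 1e9
    print(f"{platform}: {gigabases:.0f} Gb of data for {depth}x depth")
```

Under these assumptions, the short-read library would need about 300 Gb of data and the long-read library about 90 Gb, which shows why the more expensive long reads are usually sequenced at lower depth.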

Genomes of Species Have Been Repeatedly Sequenced

We found that both the number of sequenced medicinal plant genomes and the number of species sequenced more than once continue to increase. There are two main reasons. First, because the genomes of many medicinal plants have not yet been revealed, several teams often perform de novo sequencing of the same genome in parallel and publish their results at around the same time. Second, with the continuous development of sequencing technology, longer reads can be obtained, which allows more complete and accurate genomes to be assembled. Chromosome-level assembly is the current trend: the genome no longer provides only contigs or scaffolds, but chromosomes and the position of each gene on them.

There are 25 medicinal plants with two reported genomes, three medicinal plants with three reported genomes, and one plant with five reported genomes. Representative examples include Momordica charantia (bitter gourd), Salvia miltiorrhiza (Danshen), Punica granatum (pomegranate), Panax notoginseng (Sanqi), and Panax ginseng (Asian ginseng). Bitter gourd and Danshen each have two reported genome versions. For bitter gourd, the de novo draft assembly, together with basic annotation and evolutionary analysis, was completed in 2017 (Urasaki et al., 2017). In 2020, the Momordica charantia genome was assembled to the chromosome level using PacBio long-read sequencing, which allowed further investigation of the genomic changes under domestication (Matsumura et al., 2020). The genome of Salvia miltiorrhiza was likewise assembled to eight chromosomes; the assembled genome size increased from 538 to 594.75 Mb, and the proportion of repetitive elements increased from 54.44 to 64.84% (Xu et al., 2016; Song Z. et al., 2020). Punica granatum, a popular and nutritious fruit with medicinal properties, has three published genome versions (Qin et al., 2017; Yuan Z. et al., 2018; Luo et al., 2020). The third version is assembled to the chromosome level and is a high-quality genome of the soft-seeded pomegranate, which helps to clarify the genetic divergence between soft- and hard-seeded varieties and provides insights into the genetic diversity and population structure of pomegranates (Luo et al., 2020). Panax notoginseng is a well-known TCM whose genome has attracted considerable research interest, with a total of five versions reported. The three most recent versions are assembled to the chromosome level (Fan G. et al., 2020; Jiang et al., 2021; Yang et al., 2021b) and are more complete than the earlier assemblies (Chen et al., 2017; Zhang D. et al., 2017); they further reveal the biosynthesis pathways of ginsenosides and dencichine and provide a resource for further exploration of saponin biosynthesis, cultivation, and breeding of P. notoginseng. Panax ginseng, reputed to be the king of medicinal herbs, belongs to the same genus and also has two reported genome versions (Xu et al., 2017; Kim et al., 2018). Both genomes support functional and evolutionary analysis as well as the study of ginsenoside biosynthesis. Additionally, Kim et al. (2018) identified fatty acid desaturases that can increase freezing tolerance and chlorophyll a/b binding protein genes that enable efficient photosynthesis under low light. However, the reads used for both genomes are not long by current standards, and there is still room to improve the completeness and accuracy of the ginseng genome.

Application of Medicinal Plant Genomes

Genomics-Assisted Herb Breeding

Important functional genes in medicinal plants include those related to growth and development, disease resistance, important genetic traits, and germplasm characters. With genome annotation information, superior genes can be discovered, and genetic engineering methods can be used to break reproductive isolation and cultivate new varieties with excellent agronomic characters and a high content of active ingredients, laying the foundation for large-scale extraction of active ingredients and extensive clinical application. By combining transcriptome data with resequencing of individuals within or between species, large numbers of molecular markers can be identified rapidly and accurately, and genetic linkage studies between molecular markers and target traits can be accelerated. The relationships between the phenotypes and genotypes of medicinal plants are thus discovered quickly, which markedly improves breeding efficiency.

Genome sequencing of Scutellaria baicalensis (Huangqin) revealed that a specialized metabolic pathway for the synthesis of 4′-deoxyflavone bioactives evolved in the genus Scutellaria. The study found that the gene encoding a specific cinnamate coenzyme A ligase likely obtained its new function following recent mutations and that four genes encoding enzymes in the 4′-deoxyflavone pathway are present as tandem repeats in the Huangqin genome. Further analysis showed that gene duplications, segmental duplication, gene amplification, and point mutations coupled to gene neo- and subfunctionalization were involved in the evolution of 4′-deoxyflavone synthesis in Scutellaria. These results not only provide significant insight into the evolution of specific flavone biosynthetic pathways in the mint family Lamiaceae but also facilitate the development of tools for enhancing bioactive productivity through molecular breeding (Zhao Q. et al., 2019).

Evolution History Revealing

Whole-genome sequencing can not only elucidate the biosynthesis pathways of natural products but also give insight into their evolution. Evolution can reshape the whole genome through events such as WGD and whole-genome triplication (WGT), which help plants adapt to environmental change and explain many of their characters. We summarized the WGD and WGT events of representative species reported in medicinal plant genome articles (Figure 5), grouped into three types of plants: eudicots, monocots, and magnoliids.

FIGURE 5

Among the eudicots, we selected five representative branches. The representative medicinal plants of Araliaceae and Apiaceae cluster together. P. ginseng, P. notoginseng, and E. senticosus belong to Araliaceae; P. notoginseng is diploid, while P. ginseng and E. senticosus are tetraploid. Two rounds of WGD were discovered in these Araliaceae plants. The first round occurred around 29.6 Mya, and P. ginseng and E. senticosus both underwent a second round, at almost 2.2 Mya and 13 Mya, respectively. Additionally, these recent WGDs were found to contribute to the ability of P. ginseng to overwinter and of E. senticosus to adapt to cold environments, enabling them to live and spread broadly through cold areas (Kim et al., 2018; Jiang et al., 2021; Yang et al., 2021a). Both rounds of WGD occurred in Araliaceae after its divergence from Apiaceae, which may be one reason why the genomes of this family are larger than those of other medicinal plants. In D. carota and A. graveolens, which belong to Apiaceae, one shared WGD occurred about 43 Mya, and a more recent WGD, found only in A. graveolens, occurred approximately 1.9 Mya and contributed to the expansion of terpene synthase gene families (Song X. et al., 2020). The second branch includes six plants belonging to Lamiales: one shared WGD (almost 60.7 Mya) was identified in S. baicalensis, S. barbata, S. miltiorrhiza, and S. indicum, which might be responsible for chromosomal expansion and rearrangement (Xu et al., 2020a), and two rounds of WGD were found in S. splendens and L. angustifolia, which could have driven the expansion of gene families related to terpenoid biosynthesis (Li et al., 2021). P. cuspidatum experienced a lineage-specific WGD at 6.6 Mya, after its divergence from F. tataricum, and shares an ancient WGD with F. tataricum at 65 Mya (Zhang Y. et al., 2019). After this ancient WGD, the genome of F. tataricum underwent dramatic chromosomal rearrangements, resulting in very fragmented intra-genome collinear blocks (Zhang L. et al., 2017). A WGT event has also been reported in a medicinal plant genome article: T. wilfordii underwent a WGT approximately 21 Mya, which enabled it to better cope with and adapt to a markedly changed environment. Almost all of the duplicated triptolide biosynthesis genes were generated by this event, suggesting that it was important to the evolution of triptolide biosynthesis (Tu et al., 2020).

Among the monocots, A. sativum and H. citrina are the representatives. A. sativum has undergone two rounds of WGD, suggesting that WGD can be an important driving force for the proliferation of TEs and genome expansion in garlic (Sun et al., 2020). H. citrina experienced a recent WGD event at about 15.73 Mya, which was the main factor resulting in multiple copies of orthologous genes (Qing et al., 2021). Among the magnoliids, C. salicifolius and P. nigrum are the representatives. Two rounds of ancient WGD were inferred in the C. salicifolius genome: one was shared by Calycanthaceae at ∼87 Mya, after its divergence from Lauraceae, and the other dates back to approximately 142 Mya in the ancestor of Magnoliales and Laurales (Lv Q. et al., 2020). Meanwhile, the P. nigrum genome was inferred to have undergone a WGD event at ∼17.9 Mya, which brought genetic changes responsible for the particular biosynthesis of piperine (Hu et al., 2019).

Domestication Process Understanding

Domestication is a complex evolutionary process and one of the most important technological innovations in human history. Through it, humans altered the morphological and physiological traits of plants, distinguishing them from their wild ancestors and ultimately giving rise to current human cultures (Diamond, 2002; Hancock, 2005). Some domesticated plants are medicinal plants. Clues to the timing and geographical origins of domestication traits, as well as to the genes underlying the trait changes, can be found in genomic information (Purugganan and Fuller, 2009).

Coix is a widely cultivated grass crop with high nutritional and medicinal value that was domesticated as early as the Neolithic era. However, its genetic research and breeding were hampered by the lack of a sequenced genome. Two chromosome-level coix genomes were reported simultaneously, one for the elite cultivar Beijing (Liu H. et al., 2020) and one for the wild relative Coix aquatica Daheishan (Guo et al., 2020). Both studies found that hull thickness is an important domestication trait distinguishing cultivars from wild relatives, and that selection of the papery hull from the stony hull of wild progenitors was a key step in coix domestication. By combining resequencing and comparative analyses, several domestication-related loci or genes (such as loci in the ∼2 to 150 kb region upstream of ub3) and two major quantitative trait loci associated with hull thickness and color (Ccph1 and Ccph2) were identified as potential markers of domestication. These findings will greatly facilitate the molecular breeding of coix and provide a powerful reference for studying the domestication and evolution of medicinal plants.

Herbal Synthetic Biology

The active components of medicinal plants, with their complex and diverse structures, are the material basis of their medicinal effects and an important source for new drug discovery. However, medicinal plant materials often face a series of problems during development and utilization. The growth of many medicinal materials is strongly affected by environmental factors; some rare herbs grow slowly and are difficult to cultivate; most active ingredients are low in content, complex in chemical structure, and difficult to synthesize chemically; and traditional natural extraction or chemical synthesis cannot meet the needs of scientific research and new drug development. Synthetic biology offers an effective way to resolve these problems.

As high-throughput sequencing technologies for genome and transcriptome studies have developed rapidly, bioinformatics methods and functional genomics approaches can be used to screen and identify the enzyme-coding genes of specific secondary metabolite biosynthesis pathways from the original species of many medicinal plants. This will greatly accelerate the elucidation of these pathways and lay a solid foundation for herbal synthetic biology research.

The Tripterygium wilfordii genome is a typical example. The yield of triptolide extracted from T. wilfordii is extremely low, the plant cannot be grown on a large scale, and the current chemical synthesis route is limited to a yield of less than 1.64%. Metabolic engineering, realized through a synthetic biology strategy, is a more promising way to obtain more triptolide, but it requires elucidation of the triptolide biosynthesis pathway. To this end, the T. wilfordii genome was sequenced, and the cytochrome P450 TwCYP728B70 involved in triptolide biosynthesis was identified; accordingly, the triptolide content increased markedly in the CYP728B70 overexpression line (Tu et al., 2020). It is important to make full use of genomic resources to reveal the biosynthesis pathways of active compounds in medicinal plants and to use candidate genes from these pathways for heterologous bioproduction under a synthetic biology strategy.

Geoherbal Research, Protection, and Utilization of Resources

Geoherbs, which are shaped by genetic factors and influenced by environmental conditions, represent high-quality medicinal materials. Sequencing technology and data provide useful tools for elucidating the molecular mechanisms underlying geoherbs. For the same medicinal plant grown in different areas, epigenomic studies can clarify the genetic variation among production areas, especially the modifying effects of different environments on the epigenome of the medicinal material, using approaches such as DNA methylation analysis, small RNA sequencing, and chromatin immunoprecipitation. In addition, soil microorganisms are important factors in the growth environment of geoherbs. Metagenomic sequencing of the soil microbial community can provide a basis for revealing the interactions between soil microorganisms and the growth of medicinal plants.

Recently, 545 genomes of ginkgo trees sampled from 51 populations across the world were sequenced. The analysis identified three refugia in China and detected multiple cycles of population expansion and reduction, along with glacial admixture between relict populations in the southwestern and southern refugia; it also showed that ginkgo was introduced by humans multiple times from eastern China into different continents. This study provides insight into the evolutionary history of ginkgo and helps guide the protection and utilization of its valuable genomic resources (Zhao Y. P. et al., 2019).

Improving the Synthesis Efficiency of Bioactive Compounds Within Species

With the rapid progress of sequencing technology, more and more biosynthesis pathways of active ingredients from medicinal plants have been revealed. Early work relied on mining transcriptome data, whereas later work has relied on combined mining of genome and transcriptome data. Although transcriptome sequencing has so far held the major position in research on the biosynthesis pathways of medicinal ingredients, genome data can provide additional important information. For example, it can reveal how biosynthesis pathway genes evolved, which helps in synthesizing medicinally active secondary metabolites efficiently. In the opium poppy genome, a cluster of 15 genes was discovered. During its evolution, events such as gene duplication, rearrangement, and fusion led to the aggregation and co-expression of genes from the two metabolic pathways of noscapine and morphinan, forming a supergene cluster that synergistically synthesizes the medicinal ingredients of opium poppy (Guo et al., 2018). The opium poppy genome therefore helps to decipher how these secondary metabolites are synthesized. It is beneficial not only for developing molecular breeding tools and cultivating new varieties, but also for guiding the selective improvement of the production of alkaloids with different efficacies in future artificial synthesis.

This work also provides new ideas for the application of medicinal plant genomes. During evolution, gene duplication and neofunctionalization can generate gene clusters that may be related to specialized metabolites, a phenomenon already observed in several model plants, such as A. thaliana, Zea mays, and Solanum lycopersicum (Bharadwaj et al., 2021). For medicinal plants, the research strategy used for opium poppy (Guo et al., 2018) can serve as a reference for understanding how gene clusters related to medicinally active ingredients form and for improving their biosynthesis efficiency.

Comparative Genomic Analysis Among Different Species or Different Populations in the Same Species

The continuous emergence of high-quality genomes has made comparative genomic analysis increasingly widespread and in-depth, and it has become a powerful tool for researchers to investigate biological questions and explain biological phenomena (Nobrega and Pennacchio, 2004). Comparative genomics, which builds on genome mapping and sequencing technology, generally refers to the comparative analysis of the structural and functional gene regions of genomes among multiple species or among multiple individuals (populations) of one species. Specifically, it compares similarities and differences in structural characteristics, studies the contraction and expansion of gene families, estimates divergence times and evolutionary relationships, and analyzes the origin and evolution of new genes.
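As a deliberately simplified illustration of what comparing gene family sizes involves, the Python sketch below labels each gene family as expanded, contracted, or unchanged in one species relative to another, based only on copy numbers. The family names and counts are hypothetical, and the direct pairwise comparison is an assumption made for clarity: published analyses typically infer expansion and contraction along a phylogeny with statistical models rather than by comparing raw counts.

```python
def classify_families(counts_a, counts_b):
    """Label each gene family in species A relative to species B.

    counts_a and counts_b map a gene family name to its copy number;
    a family missing from one species is treated as having zero copies.
    """
    result = {}
    for family in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(family, 0)
        b = counts_b.get(family, 0)
        if a > b:
            result[family] = "expanded"
        elif a < b:
            result[family] = "contracted"
        else:
            result[family] = "unchanged"
    return result


# Hypothetical copy numbers, for illustration only.
species_a = {"TPS": 12, "CYP": 5, "UGT": 3}
species_b = {"TPS": 7, "CYP": 5, "MYB": 4}

for family, status in classify_families(species_a, species_b).items():
    print(family, status)
```

In this toy example, the TPS and UGT families would be reported as expanded in species A, MYB as contracted, and CYP as unchanged.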

A representative example of comparative genomics between different medicinal plant species is Scutellaria baicalensis and Scutellaria barbata. Comparative genomic analysis of the two showed that recent LTR insertions may have resulted in chromosomal rearrangement and expansion, and that tandem duplication of paralogs after speciation might have contributed to the divergent evolution of flavonoid biosynthesis gene families, providing a significant foundation for evolution and chemodiversity studies in the Lamiaceae (Xu et al., 2020a).

A representative example of comparative genomics among different populations of the same species is Forsythia suspensa. Genome-wide comparative analysis was conducted for 15 natural populations across its current distribution range. The results revealed that candidate genes associated with local adaptation were functionally correlated with heterogeneous environmental factors and supported the hypothesis that adaptive differentiation should be most evident in genes involved in signal crosstalk between different environmental variables, giving insight into the fundamental genetic mechanisms of local adaptation to climatic gradients in plant species (Li L.-F. et al., 2020).

Outlook and Challenges of Medicinal Plant Genome Sequencing

Medicinal plants have a long history of use and diverse methods of application. Related research has mainly focused on discovering their chemical basis and analyzing their pharmacodynamic effects, whereas the understanding of their genetic resources is relatively weak. Research on medicinal plant genomes should therefore make use of the latest technologies and achievements of genomics and integrate studies of the structural genome, functional genome, transcriptome, proteome, epigenome, metagenome, metabolome, synthetic biology, bioinformatics, and relevant databases. In this way, the essence of medicinal plants can be revealed, and the relationships among genetic resources, chemical quality, and drug efficacy can be recognized.

What concerns us most is the medicinal value of medicinal plants, which is reflected not only in the content of their medicinal ingredients but also in the stable quality of their medicinal materials. Medicinal plant genomes can now be annotated to obtain protein-coding genes, especially the biosynthesis genes of active ingredients, to analyze evolutionary history and domestication processes, and to discover genes that respond to environmental stresses and could help improve stress resistance. However, the full power of medicinal plant genomes has not yet been demonstrated, and their ability to solve difficulties in practical applications remains to be developed. How to use genome information to modify medicinal plants and obtain excellent varieties has not yet been worked out. Determining suitable model medicinal plants is of great significance to research on medicinal plant genomics and its practical applications. In terms of general biological characteristics, a model plant should usually have a short life cycle, many offspring, and a stable phenotype. In terms of genetic resources, its genome should be relatively small and easy to sequence, and genetic transformation should be relatively easy. In terms of medicinal characteristics, it should be suitable for research on the biosynthesis and production of secondary metabolites. The establishment and improvement of a suitable model medicinal plant platform will therefore greatly enhance the application value of medicinal plant genomes.

The assembly of plant genomes is challenging because of their high repetitiveness due to TEs, extreme genome sizes, and polyploid nature. With the emergence of long-read sequencing (Eid et al., 2009; Deamer et al., 2016) and long-range scaffolding methods such as optical mapping (Schwartz et al., 1993), chromosome conformation capture (Burton et al., 2013), and DNA dilution-based technologies (Amini et al., 2014; Zheng et al., 2016), medicinal plant genome sequencing has overcome the weaknesses of short-read assemblies, and chromosome-level assembly has become possible (Jiao and Schneeberger, 2017). Although entire chromosomes have been assembled for some medicinal plants, most medicinal plant assemblies still consist of long scaffolds or super-scaffolds. Moreover, although a large amount of sequencing data from medicinal plants is now available, how to explore and apply it effectively to extract deeper information remains a challenge.

Moreover, thanks to advances in sequencing technology and bioinformatics algorithms, at least one hundred medicinal plant genomes have been obtained. How to use them thoroughly and effectively has attracted the attention of many institutions and researchers. In recent years, several databases of medicinal plant genomes have been built, such as the Herbal Medicine Omics Database (Wang X. et al., 2018), the 1K Medicinal Plant Genome Database, and the Database of 10,000 Medicinal Plants. These databases summarize the medicinal plant genomes reported so far or aim to build a biological big data platform for medicinal plants that links omics data, active ingredients, disease information, and other information to promote their modernization. All of the above indicates that medicinal plant genome research has moved from the stage of exploring the unknown to the stage of big data association research. Because of the limitations of previous technologies and methods, the medicinal plant genome information disclosed so far is limited. If the available information is aggregated and shared through databases, it will be a huge treasure waiting to be unearthed and will improve the efficiency of research on medicinal plants.

Conclusion

Thanks to the invention of long-read sequencing technology, research on medicinal plant genomes has developed rapidly and is no longer limited by huge genome sizes and highly repetitive sequences. The number of genomes reported in the past 2 years has increased significantly, and genome quality has also greatly improved, with most of these genomes assembled to the chromosome level. Correspondingly, the sequencing strategies adopted have been continuously updated, and medicinal plant genomes are being used more and more widely to answer questions in scientific research and practical applications, including genomics-assisted herb breeding, revealing evolutionary history, understanding domestication, herbal synthetic biology, geoherbal research, and comparative genomic analysis. These applications are of great significance to the effective use and sustainable protection of medicinal plants and can improve research efficiency and promote their modern development.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Statements

Author contributions

Q-QC planned the manuscript outline, wrote the draft, and created the figures and tables. YO, Z-YT, C-CL, Y-YZ, and C-SC proofread the manuscript. HZ supervised the study and revised the manuscript. All authors contributed to the article and approved the submitted version.

Funding

This project was funded by the Science and Technology Development Fund, Macao SAR (Project Nos. 0001/2020/AKP, 0061/2019/AGJ, 0027/2017/AMJ, and 062/2017/A2) and the National Key Research and Development Program of China (Project No. 2017YFE0119900).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2021.791219/full#supplementary-material

References

  • 1

    Adams, M. D., Celniker, S. E., Holt, R. A., Evans, C. A., Gocayne, J. D., Amanatides, P. G., et al. (2000). The genome sequence of Drosophila melanogaster. Science 287, 2185–2195. doi: 10.1126/science.287.5461.2185

  • 2

    Amini, S., Pushkarev, D., Christiansen, L., Kostem, E., Royce, T., Turk, C., et al. (2014). Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing. Nat. Genet. 46, 1343–1349. doi: 10.1038/ng.3119

  • 3

    Auber, R. P., Suttiyut, T., McCoy, R. M., Ghaste, M., Crook, J. W., Pendleton, A. L., et al. (2020). Hybrid de novo genome assembly of red gromwell (Lithospermum erythrorhizon) reveals evolutionary insight into shikonin biosynthesis. Hortic. Res. 7:82. doi: 10.1038/s41438-020-0301-9

  • 4

    Bashir, A., Klammer, A. A., Robins, W. P., Chin, C., Webster, D., Paxinos, E., et al. (2012). A hybrid approach for the automated finishing of bacterial genomes. Nat. Biotechnol. 30, 701–707. doi: 10.1038/nbt.2288

  • 5

    Bentley, D. R., Balasubramanian, S., Swerdlow, H. P., Smith, G. P., Milton, J., Brown, C. G., et al. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59. doi: 10.1038/nature07517

  • 6

    Bharadwaj, R., Kumar, S. R., Sharma, A., and Sathishkumar, R. (2021). Plant metabolic gene clusters: evolution, organization, and their applications in synthetic biology. Front. Plant Sci. 12:697318. doi: 10.3389/fpls.2021.697318

  • 7

    Bornowski, N., Hamilton, J. P., Liao, P., Wood, J. C., Dudareva, N., and Buell, C. R. (2020). Genome sequencing of four culinary herbs reveals terpenoid genes underlying chemodiversity in the Nepetoideae. DNA Res. 27, 1–12. doi: 10.1093/dnares/dsaa016

  • 8

    Brose, J., Lau, K. H., Dang, T. T. T., Hamilton, J. P., do Vale Martins, L., Hamberger, B., et al. (2021). The Mitragyna speciosa (Kratom) genome: a resource for data-mining potent pharmaceuticals that impact human health. G3 (Bethesda) 11:jkab058. doi: 10.1093/g3journal/jkab058

  • 9

    Burton, J. N., Adey, A., Patwardhan, R. P., Qiu, R., Kitzman, J. O., and Shendure, J. (2013). Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125. doi: 10.1038/nbt.2727

  • 10

    Cao, J., Schneeberger, K., Ossowski, S., Günther, T., Bender, S., Fitz, J., et al. (2011). Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nat. Genet. 43, 956–963. doi: 10.1038/ng.911

  • 11

    Chan, A. P., Crabtree, J., Zhao, Q., Lorenzi, H., Orvis, J., Puiu, D., et al. (2010). Draft genome sequence of the oilseed species Ricinus communis. Nat. Biotechnol. 28, 951–956. doi: 10.1038/nbt.1674

  • 12

    Chang, Y., Liu, H., Liu, M., Liao, X., Sahu, S. K., Fu, Y., et al. (2019). The draft genomes of five agriculturally important African orphan crops. Gigascience 8, 1–16. doi: 10.1093/gigascience/giy152

  • 13

    Chaw, S.-M., Liu, Y.-C., Wu, Y.-W., Wang, H.-Y., Lin, C.-Y. I., Wu, C.-S., et al. (2019). Stout camphor tree genome fills gaps in understanding of flowering plant genome evolution. Nat. Plants 5, 63–73. doi: 10.1038/s41477-018-0337-0

  • 14

    Chellappan, B. V., Shidhi, P. R., Vijayan, S., Rajan, V. S., Sasi, A., Nair, A. S., et al. (2019). High quality draft genome of arogyapacha (Trichopus zeylanicus), an important medicinal plant endemic to Western Ghats of India. G3 9, 2395–2404. doi: 10.1534/g3.119.400164

  • 15

    Chen, D., Pan, Y., Wang, Y., Cui, Y.-Z., Zhang, Y.-J., Mo, R., et al. (2021). The chromosome-level reference genome of Coptis chinensis provides insights into genomic evolution and berberine biosynthesis. Hortic. Res. 8:121. doi: 10.1038/s41438-021-00559-2

  • 16

    Chen, F., Dong, W., Zhang, J., Guo, X., Chen, J., Wang, Z., et al. (2018). The sequenced angiosperm genomes and genome databases. Front. Plant Sci. 9:418. doi: 10.3389/fpls.2018.00418

  • 17

    Chen, H., Zeng, Y., Yang, Y., Huang, L., Tang, B., Zhang, H., et al. (2020). Allele-aware chromosome-level genome assembly and efficient transgene-free genome editing for the autotetraploid cultivated alfalfa. Nat. Commun. 11:2494. doi: 10.1038/s41467-020-16338-x

  • 18

    Chen, S. L., and Song, J.-Y. (2016). [Herbgenomics]. Zhongguo Zhong Yao Za Zhi 41, 3881–3889. doi: 10.4268/cjcmm20162101

  • 19

    Chen, S. L., Sun, Y. Z., Xu, J., Luo, H. M., Sun, C., He, L., et al. (2010). Strategies of the study on Herb genome program. Yaoxue Xuebao 45, 807–812.

  • 20

    Chen, S. L., Wu, W.-G., Wang, C.-X., Xiang, L., Shi, Y.-H., Zhang, D., et al. (2019). [Molecular genetics research of medicinal plants]. Zhongguo Zhong Yao Za Zhi 44, 2421–2432. doi: 10.19540/j.cnki.cjcmm.20190514.102

  • 21

    Chen, S., Wang, X., Wang, Y., Zhang, G., Song, W., Dong, X., et al. (2020). Improved de novo assembly of the achlorophyllous orchid Gastrodia elata. Front. Genet. 11:580568. doi: 10.3389/fgene.2020.580568

  • 22

    Chen, S., Wang, Y., Yu, L., Zheng, T., Wang, S., Yue, Z., et al. (2021). Genome sequence and evolution of Betula platyphylla. Hortic. Res. 8:37. doi: 10.1038/s41438-021-00481-7

  • 23

    Chen, W., Kui, L., Zhang, G., Zhu, S., Zhang, J., Wang, X., et al. (2017). Whole-genome sequencing and analysis of the Chinese herbal plant Panax notoginseng. Mol. Plant 10, 899–902. doi: 10.1016/j.molp.2017.02.010

  • 24

    Chen, Y.-C., Li, Z., Zhao, Y.-X., Gao, M., Wang, J.-Y., Liu, K.-W., et al. (2020). The Litsea genome and the evolution of the laurel family. Nat. Commun. 11:1675. doi: 10.1038/s41467-020-15493-5

  • 25

    Cheng, J., Wang, X., Liu, X., Zhu, X., Li, Z., Chu, H., et al. (2021). Chromosome-level genome of Himalayan yew provides insights into the origin and evolution of the paclitaxel biosynthetic pathway. Mol. Plant 14, 1199–1209. doi: 10.1016/j.molp.2021.04.015

  • 26

    Chin, C. S., Alexander, D. H., Marks, P., Klammer, A. A., Drake, J., Heiner, C., et al. (2013). Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563–569. doi: 10.1038/nmeth.2474

  • 27

    Chin, C.-S., Peluso, P., Sedlazeck, F. J., Nattestad, M., Concepcion, G. T., Clum, A., et al. (2016). Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054. doi: 10.1038/nmeth.4035

  • 28

    Cui, P., Lin, Q., Fang, D., Zhang, L., Li, R., Cheng, J., et al. (2018). Tung tree (Vernicia fordii, Hemsl.) genome and transcriptome sequencing reveals co-ordinate up-regulation of fatty acid β-oxidation and triacylglycerol biosynthesis pathways during eleostearic acid accumulation in seeds. Plant Cell Physiol. 59, 1990–2003. doi: 10.1093/pcp/pcy117

  • 29

    Dassanayake, M., Oh, D.-H., Haas, J. S., Hernandez, A., Hong, H., Ali, S., et al. (2011). The genome of the extremophile crucifer Thellungiella parvula. Nat. Genet. 43, 913–918. doi: 10.1038/ng.889

  • 30

    Deamer, D., Akeson, M., and Branton, D. (2016). Three decades of nanopore sequencing. Nat. Biotechnol. 34, 518–524. doi: 10.1038/nbt.3423

  • 31

    Diamond, J. (2002). Evolution, consequences and future of plant and animal domestication. Nature 418, 700–707. doi: 10.1038/nature01019

  • 32

    Ding, X., Mei, W., Huang, S., Wang, H., Zhu, J., Hu, W., et al. (2018). Genome survey sequencing for the characterization of genetic background of Dracaena cambodiana and its defense response during dragon’s blood formation. PLoS One 13:e0209258. doi: 10.1371/journal.pone.0209258

  • 33

    Ding, X., Mei, W., Lin, Q., Wang, H., Wang, J., Peng, S., et al. (2020). Genome sequence of the agarwood tree Aquilaria sinensis (Lour.) Spreng: the first chromosome-level draft genome in the Thymelaeceae family. Gigascience 9:giaa013. doi: 10.1093/gigascience/giaa013

  • 34

    Dong, S., Liu, M., Liu, Y., Chen, F., Yang, T., Chen, L., et al. (2021). The genome of Magnolia biondii Pamp. provides insights into the evolution of Magnoliales and biosynthesis of terpenoids. Hortic. Res. 8:38. doi: 10.1038/s41438-021-00471-9

  • 35

    Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., Otto, G., et al. (2009). Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138. doi: 10.1126/science.1162986

  • 36

    Fan, G., Liu, X., Sun, S., Shi, C., Du, X., Han, K., et al. (2020). The chromosome level genome and genome-wide association study for the agronomic traits of Panax notoginseng. iScience 23:101538. doi: 10.1016/j.isci.2020.101538

  • 37

    Fan, Y., Sahu, S. K., Yang, T., Mu, W., Wei, J., Cheng, L., et al. (2020). Dissecting the genome of star fruit (Averrhoa carambola L.). Hortic. Res. 7:94. doi: 10.1038/s41438-020-0306-4

  • 38

    Franke, J., Kim, J., Hamilton, J. P., Zhao, D., Pham, G. M., Wiegert-Rininger, K., et al. (2019). Gene discovery in Gelsemium highlights conserved gene clusters in monoterpene indole alkaloid biosynthesis. ChemBioChem 20, 83–87. doi: 10.1002/cbic.201800592

  • 39

    Fu, Y., Li, L., Hao, S., Guan, R., Fan, G., Shi, C., et al. (2017). Draft genome sequence of the Tibetan medicinal herb Rhodiola crenulata. Gigascience 6, 1–5. doi: 10.1093/gigascience/gix033

  • 40

    Gao, S., Wang, B., Xie, S., Xu, X., Zhang, J., Pei, L., et al. (2020). A high-quality reference genome of wild Cannabis sativa. Hortic. Res. 7:73. doi: 10.1038/s41438-020-0295-3

  • 41

    Gonda, I., Faigenboim, A., Adler, C., Milavski, R., Karp, M.-J., Shachter, A., et al. (2020). The genome sequence of tetraploid sweet basil, Ocimum basilicum L., provides tools for advanced genome editing and molecular breeding. DNA Res. 27, 1–10. doi: 10.1093/dnares/dsaa027

  • 42

    Goodwin, S., Gurtowski, J., Ethe-Sayers, S., Deshpande, P., Schatz, M. C., and McCombie, W. R. (2015). Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome. Genome Res. 25, 1750–1756. doi: 10.1101/gr.191395.115

  • 43

    Guan, R., Zhao, Y., Zhang, H., Fan, G., Liu, X., Zhou, W., et al. (2016). Draft genome of the living fossil Ginkgo biloba. Gigascience 5:49. doi: 10.1186/s13742-016-0154-1

  • 44

    Guo, C., Wang, Y., Yang, A., He, J., Xiao, C., Lv, S., et al. (2020). The Coix genome provides insights into Panicoideae evolution and papery hull domestication. Mol. Plant 13, 309–320. doi: 10.1016/j.molp.2019.11.008

  • 45

    Guo, L., Winzer, T., Yang, X., Li, Y., Ning, Z., He, Z., et al. (2018). The opium poppy genome and morphinan production. Science 362, 343–347. doi: 10.1126/science.aat4096

  • 46

    Hamilton, J. P., and Robin Buell, C. (2012). Advances in plant genome sequencing. Plant J. 70, 177–190. doi: 10.1111/j.1365-313X.2012.04894.x

  • 47

    Hancock, J. F. (2005). Contributions of domesticated plant studies to our understanding of plant evolution. Ann. Bot. 96, 953–963. doi: 10.1093/aob/mci259

  • 48

    He, N., Zhang, C., Qi, X., Zhao, S., Tao, Y., Yang, G., et al. (2013). Draft genome sequence of the mulberry tree Morus notabilis. Nat. Commun. 4:2445. doi: 10.1038/ncomms3445

  • 49

    He, S., Dong, X., Zhang, G., Fan, W., Duan, S., Shi, H., et al. (2021). High quality genome of Erigeron breviscapus provides a reference for herbal plants in Asteraceae. Mol. Ecol. Resour. 21, 153–169. doi: 10.1111/1755-0998.13257

  • 50

    He, Y., Peng, F., Deng, C., Xiong, L., Huang, Z., Zhang, R., et al. (2018). Building an octaploid genome and transcriptome of the medicinal plant Pogostemon cablin from Lamiales. Sci. Data 5:180274. doi: 10.1038/sdata.2018.274

  • 51

    He, Y., Xiao, H., Deng, C., Xiong, L., Nie, H., and Peng, C. (2016). Survey of the genome of Pogostemon cablin provides insights into its evolutionary history and sesquiterpenoid biosynthesis. Sci. Rep. 6:26405. doi: 10.1038/srep26405

  • 52

    Hibrand Saint-Oyant, L., Ruttink, T., Hamama, L., Kirov, I., Lakhwani, D., Zhou, N. N., et al. (2018). A high-quality genome sequence of Rosa chinensis to elucidate ornamental traits. Nat. Plants 4, 473–484. doi: 10.1038/s41477-018-0166-1

  • 53

    Hong, Z., Li, J., Liu, X., Lian, J., Zhang, N., Yang, Z., et al. (2020). The chromosome-level draft genome of Dalbergia odorifera. Gigascience 9, 1–8. doi: 10.1093/gigascience/giaa084

  • 54

    Hoopes, G. M., Hamilton, J. P., Kim, J., Zhao, D., Wiegert-Rininger, K., Crisovan, E., et al. (2018). Genome assembly and annotation of the medicinal plant Calotropis gigantea, a producer of anticancer and antimalarial cardenolides. G3 Genes Genomes Genetics 8, 385–391. doi: 10.1534/g3.117.300331

  • 55

    Hu, L., Xu, Z., Wang, M., Fan, R., Yuan, D., Wu, B., et al. (2019). The chromosome-scale reference genome of black pepper provides insight into piperine biosynthesis. Nat. Commun. 10:4702. doi: 10.1038/s41467-019-12607-6

  • 56

    Huang, H., Liang, J., Tan, Q., Ou, L., Li, X., Zhong, C., et al. (2021). Insights into triterpene synthesis and unsaturated fatty-acid accumulation provided by chromosomal-level genome analysis of Akebia trifoliata subsp. australis. Hortic. Res. 8:33. doi: 10.1038/s41438-020-00458-y

  • 57

    International Brachypodium Initiative (2010). Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763–768. doi: 10.1038/nature08747

  • 58

    Iorizzo, M., Ellison, S., Senalik, D., Zeng, P., Satapoomin, P., Huang, J., et al. (2016). A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution. Nat. Genet. 48, 657–666. doi: 10.1038/ng.3565

  • 59

    Itkin, M., Davidovich-Rikanati, R., Cohen, S., Portnoy, V., Doron-Faigenboim, A., Oren, E., et al. (2016). The biosynthetic pathway of the nonsugar, high-intensity sweetener mogroside V from Siraitia grosvenorii. Proc. Natl. Acad. Sci. U.S.A. 113, E7619–E7628. doi: 10.1073/pnas.1604828113

  • 60

    Jaiswal, S. K., Mahajan, S., Chakraborty, A., Kumar, S., and Sharma, V. K. (2021). The genome sequence of Aloe vera reveals adaptive evolution of drought tolerance mechanisms. iScience 24:102079. doi: 10.1016/j.isci.2021.102079

  • 61

    Jamshidi-Kia, F., Lorigooini, Z., and Amini-Khoei, H. (2018). Medicinal plants: past history and future perspective. J. Herbmed Pharmacol. 7, 1–7. doi: 10.15171/jhp.2018.01

  • 62

    Ji, Y., Xiu, Z., Chen, C., Wang, Y., Yang, J., Sui, J., et al. (2021). Long read sequencing of Toona sinensis (A. Juss) Roem: a chromosome-level reference genome for the family Meliaceae. Mol. Ecol. Resour. 21, 1243–1255. doi: 10.1111/1755-0998.13318

  • 63

    Jiang, Z., Tu, L., Yang, W., Zhang, Y., Hu, T., Ma, B., et al. (2021). The chromosome-level reference genome assembly for Panax notoginseng and insights into ginsenoside biosynthesis. Plant Commun. 2:100113. doi: 10.1016/j.xplc.2020.100113

  • 64

    Jiao, W.-B., and Schneeberger, K. (2017). The impact of third generation genomic technologies on plant genome assembly. Curr. Opin. Plant Biol. 36, 64–70. doi: 10.1016/j.pbi.2017.02.002

  • 65

    Kang, S.-H., Kim, B., Choi, B.-S., Lee, H. O., Kim, N.-H., Lee, S. J., et al. (2020b). Genome assembly and annotation of soft-shelled adlay (Coix lacryma-jobi variety ma-yuen), a cereal and medicinal crop in the Poaceae family. Front. Plant Sci. 11:630. doi: 10.3389/fpls.2020.00630

  • 66

    Kang, M., Wu, H., Yang, Q., Huang, L., Hu, Q., Ma, T., et al. (2020a). A chromosome-scale genome assembly of Isatis indigotica, an important medicinal plant used in traditional Chinese medicine. Hortic. Res. 7:18. doi: 10.1038/s41438-020-0240-5

  • 67

    Kang, S.-H., Pandey, R. P., Lee, C.-M., Sim, J.-S., Jeong, J.-T., Choi, B.-S., et al. (2020c). Genome-enabled discovery of anthraquinone biosynthesis in Senna tora. Nat. Commun. 11:5875. doi: 10.1038/s41467-020-19681-1

  • 68

    Kellner, F., Kim, J., Clavijo, B. J., Hamilton, J. P., Childs, K. L., Vaillancourt, B., et al. (2015). Genome-guided investigation of plant natural product biosynthesis. Plant J. 82, 680–692. doi: 10.1111/tpj.12827

  • 69

    Kim, J., Kang, S.-H., Park, S.-G., Yang, T.-J., Lee, Y., Kim, O. T., et al. (2020). Whole-genome, transcriptome, and methylome analyses provide insights into the evolution of platycoside biosynthesis in Platycodon grandiflorus, a medicinal plant. Hortic. Res. 7:112. doi: 10.1038/s41438-020-0329-x

  • 70

    Kim, K. E., Peluso, P., Babayan, P., Yeadon, P. J., Yu, C., Fisher, W. W., et al. (2014). Long-read, whole-genome shotgun sequence data for five model organisms. Sci. Data 1, 1–10. doi: 10.1038/sdata.2014.45

  • 71

    Kim, N.-H., Jayakodi, M., Lee, S.-C., Choi, B.-S., Jang, W., Lee, J., et al. (2018). Genome and evolution of the shade-requiring medicinal herb Panax ginseng. Plant Biotechnol. J. 16, 1904–1917. doi: 10.1111/pbi.12926

  • 72

    Kim, S., Park, M., Yeom, S.-I., Kim, Y.-M., Lee, J. M., Lee, H.-A., et al. (2014). Genome sequence of the hot pepper provides insights into the evolution of pungency in Capsicum species. Nat. Genet. 46, 270–278. doi: 10.1038/ng.2877

  • 73

    Kitashiba, H., Li, F., Hirakawa, H., Kawanabe, T., Zou, Z., Hasegawa, Y., et al. (2014). Draft sequences of the radish (Raphanus sativus L.) genome. DNA Res. 21, 481–490. doi: 10.1093/dnares/dsu014

  • 74

    Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J. T., Ganapathy, G., et al. (2012). Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat. Biotechnol. 30, 693–700. doi: 10.1038/nbt.2280

  • 75

    Krishnan, N. M., Pattnaik, S., Jain, P., Gaur, P., Choudhary, R., Vaidyanathan, S., et al. (2012). A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica. BMC Genomics 13:464. doi: 10.1186/1471-2164-13-464

  • 76

    Kumari, P., Singh, K. P., and Rai, P. K. (2020). Draft genome of multiple resistance donor plant Sinapis alba: an insight into SSRs, annotations and phylogenetics. PLoS One 15:e0231002. doi: 10.1371/journal.pone.0231002

  • 77

    Lau, K. H., Bhat, W. W., Hamilton, J. P., Wood, J. C., Vaillancourt, B., Wiegert-Rininger, K., et al. (2020). Genome assembly of Chiococca alba uncovers key enzymes involved in the biosynthesis of unusual terpenoids. DNA Res. 27, 1–12. doi: 10.1093/dnares/dsaa013

  • 78

    Li, A., Liu, A., Du, X., Chen, J.-Y., Yin, M., Hu, H.-Y., et al. (2020). A chromosome-scale genome assembly of a diploid alfalfa, the progenitor of autotetraploid alfalfa. Hortic. Res. 7:194. doi: 10.1038/s41438-020-00417-7

  • 79

    Li, C., Li, X., Liu, H., Wang, X., Li, W., Chen, M.-S., et al. (2020). Chromatin architectures are associated with response to dark treatment in the oil crop Sesamum indicum, based on a high-quality genome assembly. Plant Cell Physiol. 61, 978–987. doi: 10.1093/pcp/pcaa026

  • 80

    Li, J., Wang, Y., Dong, Y., Zhang, W., Wang, D., Bai, H., et al. (2021). The chromosome-based lavender genome provides new insights into Lamiaceae evolution and terpenoid biosynthesis. Hortic. Res. 8:53. doi: 10.1038/s41438-021-00490-6

  • 81

    Li, L.-F., Cushman, S. A., He, Y.-X., and Li, Y. (2020). Genome sequencing and population genomics modeling provide insights into the local adaptation of weeping forsythia. Hortic. Res. 7:130. doi: 10.1038/s41438-020-00352-7

  • 82

    Li, M.-Y., Feng, K., Hou, X.-L., Jiang, Q., Xu, Z.-S., Wang, G.-L., et al. (2020). The genome sequence of celery (Apium graveolens L.), an important leaf vegetable crop rich in apigenin in the Apiaceae family. Hortic. Res. 7:9. doi: 10.1038/s41438-019-0235-2

  • 83

    Li, S.-F., Wang, J., Dong, R., Zhu, H.-W., Lan, L.-N., Zhang, Y.-L., et al. (2020). Chromosome-level genome assembly, annotation and evolutionary analysis of the ornamental plant Asparagus setaceus. Hortic. Res. 7:48. doi: 10.1038/s41438-020-0271-y

  • 84

    Li, Y., Wei, H., Yang, J., Du, K., Li, J., Zhang, Y., et al. (2020). High-quality de novo assembly of the Eucommia ulmoides haploid genome provides new insights into evolution and rubber biosynthesis. Hortic. Res. 7:183. doi: 10.1038/s41438-020-00406-w

  • 85

    Liang, Q., Li, H., Li, S., Yuan, F., Sun, J., Duan, Q., et al. (2019). The genome assembly and annotation of yellowhorn (Xanthoceras sorbifolium Bunge). Gigascience 8, 1–15. doi: 10.1093/gigascience/giz071

  • 86

    Liang, Y., Chen, S., Wei, K., Yang, Z., Duan, S., Du, Y., et al. (2020). Chromosome level genome assembly of Andrographis paniculata. Front. Genet. 11:701. doi: 10.3389/fgene.2020.00701

  • 87

    Lin, Y., Min, J., Lai, R., Wu, Z., Chen, Y., Yu, L., et al. (2017). Genome-wide sequencing of longan (Dimocarpus longan Lour.) provides insights into molecular basis of its polyphenol-rich characteristics. Gigascience 6, 1–14. doi: 10.1093/gigascience/gix023

  • 88

    Liu, H., Shi, J., Cai, Z., Huang, Y., Lv, M., Du, H., et al. (2020). Evolution and domestication footprints uncovered from the genomes of Coix. Mol. Plant 13, 295–308. doi: 10.1016/j.molp.2019.11.009

  • 89

    Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., et al. (2012). Comparison of next-generation sequencing systems. J. Biomed. Biotechnol. 2012:251364. doi: 10.1155/2012/251364

  • 90

    Liu, M. J., Zhao, J., Cai, Q.-L., Liu, G.-C., Wang, J.-R., Zhao, Z.-H., et al. (2014). The complex jujube genome provides insights into fruit tree biology. Nat. Commun. 5:5315. doi: 10.1038/ncomms6315

  • 91

    Liu, S., Liu, Y., Yang, X., Tong, C., Edwards, D., Parkin, I. A. P., et al. (2014). The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat. Commun. 5:3930. doi: 10.1038/ncomms4930

  • 92

    Liu, X., Liu, Y., Huang, P., Ma, Y., Qing, Z., Tang, Q., et al. (2017). The genome of medicinal plant Macleaya cordata provides new insights into benzylisoquinoline alkaloids metabolism. Mol. Plant 10, 975–989. doi: 10.1016/j.molp.2017.05.007

  • 93

    Liu, Y., Tang, Q., Cheng, P., Zhu, M., Zhang, H., Liu, J., et al. (2020). Whole-genome sequencing and analysis of the Chinese herbal plant Gelsemium elegans. Acta Pharm. Sin. B 10, 374–382. doi: 10.1016/j.apsb.2019.08.004

  • 94

    Liu, Y., Wang, B., Shu, S., Li, Z., Song, C., Liu, D., et al. (2021). Analysis of the Coptis chinensis genome reveals the diversification of protoberberine-type alkaloids. Nat. Commun. 12:3276. doi: 10.1038/s41467-021-23611-0

  • 95

    Loman, N. J., Quick, J., and Simpson, J. T. (2015). A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat. Methods 12, 733–735. doi: 10.1038/nmeth.3444

  • 96

    Lu, M., An, H., and Li, L. (2016). Genome survey sequencing for the characterization of the genetic background of Rosa roxburghii Tratt and leaf ascorbate metabolism genes. PLoS One 11:e0147530. doi: 10.1371/journal.pone.0147530

  • 97

    Luo, X., Li, H., Wu, Z., Yao, W., Zhao, P., Cao, D., et al. (2020). The pomegranate (Punica granatum L.) draft genome dissects genetic divergence between soft- and hard-seeded cultivars. Plant Biotechnol. J. 18, 955–968. doi: 10.1111/pbi.13260

  • 98

    Lv, Q., Qiu, J., Liu, J., Li, Z., Zhang, W., Wang, Q., et al. (2020). The Chimonanthus salicifolius genome provides insight into magnoliid evolution and flavonoid biosynthesis. Plant J. 103, 1910–1923. doi: 10.1111/tpj.14874

  • 99

    Lv, S., Cheng, S., Wang, Z., Li, S., Jin, X., Lan, L., et al. (2020). Draft genome of the famous ornamental plant Paeonia suffruticosa. Ecol. Evol. 10, 4518–4530. doi: 10.1002/ece3.5965

  • 100

    Ma, D., Dong, S., Zhang, S., Wei, X., Xie, Q., Ding, Q., et al. (2021). Chromosome-level reference genome assembly provides insights into aroma biosynthesis in passion fruit (Passiflora edulis). Mol. Ecol. Resour. 21, 955–968. doi: 10.1111/1755-0998.13310

  • 101

    Ma, L., Wang, Q., Mu, J., Fu, A., Wen, C., Zhao, X., et al. (2020). The genome and transcriptome analysis of snake gourd provide insights into its evolution and fruit development and ripening. Hortic. Res. 7:199. doi: 10.1038/s41438-020-00423-9

  • 102

    Ma, Q., Sun, T., Li, S., Wen, J., Zhu, L., Yin, T., et al. (2020). The Acer truncatum genome provides insights into nervonic acid biosynthesis. Plant J. 104, 662–678. doi: 10.1111/tpj.14954

  • 103

    Mahesh, H. B., Subba, P., Advani, J., Shirke, M. D., Loganathan, R. M., Chandana, S. L., et al. (2018). Multi-omics driven assembly and annotation of the sandalwood (Santalum album) genome. Plant Physiol. 176, 2772–2788. doi: 10.1104/pp.17.01764

  • 104

    Mardis, E. R. (2008). The impact of next-generation sequencing technology on genetics. Trends Genet. 24, 133–141. doi: 10.1016/j.tig.2007.12.007

  • 105

    Margulies, M., Egholm, M., Altman, W. E., Attiya, S., Bader, J. S., Bemben, L. A., et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380. doi: 10.1038/nature03959

  • 106

    Marrano, A., Britton, M., Zaini, P. A., Zimin, A. V., Workman, R. E., Puiu, D., et al. (2020). High-quality chromosome-scale assembly of the walnut (Juglans regia L.) reference genome. Gigascience 9, 1–16. doi: 10.1093/gigascience/giaa050

  • 107

    Martínez-García, P. J., Crepeau, M. W., Puiu, D., Gonzalez-Ibeas, D., Whalen, J., Stevens, K. A., et al. (2016). The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols. Plant J. 87, 507–532. doi: 10.1111/tpj.13207

  • 108

    Matsumura, H., Hsiao, M.-C., Lin, Y.-P., Toyoda, A., Taniai, N., Tarora, K., et al. (2020). Long-read bitter gourd (Momordica charantia) genome and the genomic architecture of nonclassic domestication. Proc. Natl. Acad. Sci. U.S.A. 117, 14543–14551. doi: 10.1073/pnas.1921016117

  • 109

    Michael, T. P., and Jackson, S. (2013). The first 50 plant genomes. Plant Genome 6, 1–7. doi: 10.3835/plantgenome2013.03.0001in

  • 110

    Ming, R., Hou, S., Feng, Y., Yu, Q., Dionne-Laporte, A., Saw, J. H., et al. (2008). The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature 452, 991–996. doi: 10.1038/nature06856

  • 111

    Ming, R., VanBuren, R., Liu, Y., Yang, M., Han, Y., Li, L.-T., et al. (2013). Genome of the long-living sacred lotus (Nelumbo nucifera Gaertn.). Genome Biol. 14:R41. doi: 10.1186/gb-2013-14-5-r41

  • 112

    Mochida, K., Sakurai, T., Seki, H., Yoshida, T., Takahagi, K., Sawai, S., et al. (2017). Draft genome assembly and annotation of Glycyrrhiza uralensis, a medicinal legume. Plant J. 89, 181–194. doi: 10.1111/tpj.13385

  • 113

    Moher, D., Liberati, A., Tetzlaff, J., and Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 6:e1000097. doi: 10.1371/journal.pmed.1000097

  • 114

    Moss, J., and Yuan, C. S. (2006). Herbal medicines and perioperative care. Anesthesiology 105, 441–442. doi: 10.1097/00000542-200609000-00002

  • 115

    Nafis, T., Akmal, M., Ram, M., Alam, P., Ahlawat, S., Mohd, A., et al. (2011). Enhancement of artemisinin content by constitutive expression of the HMG-CoA reductase gene in high-yielding strain of Artemisia annua L. Plant Biotechnol. Rep. 5, 53–60. doi: 10.1007/s11816-010-0156-x

  • 116

    Neller, K. C. M., Diaz, C. A., Platts, A. E., and Hudak, K. A. (2019). De novo assembly of the pokeweed genome provides insight into pokeweed antiviral protein (PAP) gene expression. Front. Plant Sci. 10:1002. doi: 10.3389/fpls.2019.01002

  • 117

    Newman, T., de Bruijn, F. J., Green, P., Keegstra, K., Kende, H., McIntosh, L., et al. (1994). Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones. Plant Physiol. 106, 1241–1255. doi: 10.1104/pp.106.4.1241

  • 118

    Niu, Z., Zhu, F., Fan, Y., Li, C., Zhang, B., Zhu, S., et al. (2021). The chromosome-level reference genome assembly for Dendrobium officinale and its utility of functional genomics research and molecular breeding study. Acta Pharm. Sin. B 11, 2080–2092. doi: 10.1016/j.apsb.2021.01.019

  • 119

    Nobrega, M. A., and Pennacchio, L. A. (2004). Comparative genomic analysis as a tool for biological discovery. J. Physiol. 554, 31–39. doi: 10.1113/jphysiol.2003.050948

  • 120

    Nong, W., Law, S. T. S., Wong, A. Y. P., Baril, T., Swale, T., Chu, L. M., et al. (2020). Chromosomal-level reference genome of the incense tree Aquilaria sinensis. Mol. Ecol. Resour. 20, 971–979. doi: 10.1111/1755-0998.13154

  • 121

    Ou, X., Jin, H., Guo, L., Yang, Y., Cui, X., Xiao, Y., et al. (2011). [Status and prospective on nutritional physiology and fertilization of Panax notoginseng]. Zhongguo Zhong Yao Za Zhi 36, 2620–2624. doi: 10.4268/cjcmm20111904

  • 122

    Paddon, C. J., Westfall, P. J., Pitera, D. J., Benjamin, K., Fisher, K., McPhee, D., et al. (2013). High-level semi-synthetic production of the potent antimalarial artemisinin. Nature 496, 528–532. doi: 10.1038/nature12051

  • 123

    Patil, A. B., Shinde, S. S., Raghavendra, S., Satish, B. N., Kushalappa, C. G., and Vijay, N. (2021). The genome sequence of Mesua ferrea and comparative demographic histories of forest trees. Gene 769:145214. doi: 10.1016/j.gene.2020.145214

  • 124

    Pei, L., Wang, B., Ye, J., Hu, X., Fu, L., Li, K., et al. (2021). Genome and transcriptome of Papaver somniferum Chinese landrace CHM indicates that massive genome expansion contributes to high benzylisoquinoline alkaloid biosynthesis. Hortic. Res. 8:5. doi: 10.1038/s41438-020-00435-5

  • 125

    Peng, X., Liu, H., Chen, P., Tang, F., Hu, Y., Wang, F., et al. (2019). A chromosome-scale genome assembly of paper mulberry (Broussonetia papyrifera) provides new insights into its forage and papermaking usage. Mol. Plant 12, 661–677. doi: 10.1016/j.molp.2019.01.021

  • 126

    Peng, Z., Bredeson, J. V., Wu, G. A., Shu, S., Rawat, N., Du, D., et al. (2020). A chromosome-scale reference genome of trifoliate orange (Poncirus trifoliata) provides insights into disease resistance, cold tolerance and genome evolution in Citrus. Plant J. 104, 1215–1232. doi: 10.1111/tpj.14993

  • 127

    Pootakham, W., Naktang, C., Kongkachana, W., Sonthirod, C., Yoocha, T., Sangsrakru, D., et al. (2021a). De novo chromosome-level assembly of the Centella asiatica genome. Genomics 113, 2221–2228. doi: 10.1016/j.ygeno.2021.05.019

  • 128

    Pootakham, W., Sonthirod, C., Naktang, C., Nawae, W., Yoocha, T., Kongkachana, W., et al. (2021b). De novo assemblies of Luffa acutangula and Luffa cylindrica genomes reveal an expansion associated with substantial accumulation of transposable elements. Mol. Ecol. Resour. 21, 212–225. doi: 10.1111/1755-0998.13240

  • 129

    Pu, X., Li, Z., Tian, Y., Gao, R., Hao, L., Hu, Y., et al. (2020). The honeysuckle genome provides insight into the molecular mechanism of carotenoid metabolism underlying dynamic flower coloration. New Phytol. 227, 930–943. doi: 10.1111/nph.16552

  • 130

    Purugganan, M. D., and Fuller, D. Q. (2009). The nature of selection during plant domestication. Nature 457, 843–848. doi: 10.1038/nature07895

  • 131

    Qin, C., Yu, C., Shen, Y., Fang, X., Chen, L., Min, J., et al. (2014). Whole-genome sequencing of cultivated and wild peppers provides insights into Capsicum domestication and specialization. Proc. Natl. Acad. Sci. U.S.A. 111, 5135–5140. doi: 10.1073/pnas.1400975111

  • 132

    Qin, G., Xu, C., Ming, R., Tang, H., Guyot, R., Kramer, E. M., et al. (2017). The pomegranate (Punica granatum L.) genome and the genomics of punicalagin biosynthesis. Plant J. 91, 1108–1128. doi: 10.1111/tpj.13625

  • 133

    Qin, S., Wu, L., Wei, K., Liang, Y., Song, Z., Zhou, X., et al. (2019). A draft genome for Spatholobus suberectus. Sci. Data 6:113. doi: 10.1038/s41597-019-0110-x

  • 134

    Qing, Z., Liu, J., Yi, X., Liu, X., Hu, G., Lao, J., et al. (2021). The chromosome-level Hemerocallis citrina Borani genome provides new insights into the rutin biosynthesis and the lack of colchicine. Hortic. Res. 8:89. doi: 10.1038/s41438-021-00539-6

  • 135

    Quick, J., Quinlan, A. R., and Loman, N. J. (2014). A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer. Gigascience 3, 1–6. doi: 10.1186/2047-217X-3-22

  • 136

    Rai, A., Hirakawa, H., Nakabayashi, R., Kikuchi, S., Hayashi, K., Rai, M., et al. (2021). Chromosome-level genome assembly of Ophiorrhiza pumila reveals the evolution of camptothecin biosynthesis. Nat. Commun. 12:405. doi: 10.1038/s41467-020-20508-2

  • 137

    Rajewski, A., Carter-House, D., Stajich, J., and Litt, A. (2021). Datura genome reveals duplications of psychoactive alkaloid biosynthetic genes and high mutation rate following tissue culture. BMC Genomics 22:201. doi: 10.1186/s12864-021-07489-2

  • 138

    Raymond, O., Gouzy, J., Just, J., Badouin, H., Verdenaud, M., Lemainque, A., et al. (2018). The Rosa genome provides new insights into the domestication of modern roses. Nat. Genet. 50, 772–777. doi: 10.1038/s41588-018-0110-3

  • 139

    Ren, H., Yu, H., Zhang, S., Liang, S., Zheng, X., Zhang, S., et al. (2019). Genome sequencing provides insights into the evolution and antioxidant activity of Chinese bayberry. BMC Genomics 20:458. doi: 10.1186/s12864-019-5818-7

  • 140

    Schnable, P. S., Ware, D., Fulton, R. S., Stein, J. C., Wei, F., Pasternak, S., et al. (2009). The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115. doi: 10.1126/science.1178534

  • 141

    Schwartz, D. C., Li, X., Hernandez, L. I., Ramnarain, S. P., Huff, E. J., and Wang, Y. K. (1993). Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science 262, 110–114. doi: 10.1126/science.8211116

  • 142

    Shang, J., Tian, J., Cheng, H., Yan, Q., Li, L., Jamal, A., et al. (2020). The chromosome-level wintersweet (Chimonanthus praecox) genome provides insights into floral scent biosynthesis and flowering in winter. Genome Biol. 21:200. doi: 10.1186/s13059-020-02088-y

  • 143

    Shen, C., Du, H., Chen, Z., Lu, H., Zhu, F., Chen, H., et al. (2020). The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Mol. Plant 13, 1250–1261. doi: 10.1016/j.molp.2020.07.003

  • 144

    Shen, Q., Zhang, L., Liao, Z., Wang, S., Yan, T., Shi, P., et al. (2018). The genome of Artemisia annua provides insight into the evolution of Asteraceae family and artemisinin biosynthesis. Mol. Plant 11, 776–788. doi: 10.1016/j.molp.2018.03.015

  • 145

    Song, C., Liu, Y., Song, A., Dong, G., Zhao, H., Sun, W., et al. (2018). The Chrysanthemum nankingense genome provides insights into the evolution and diversification of Chrysanthemum flowers and medicinal traits. Mol. Plant 11, 1482–1491. doi: 10.1016/j.molp.2018.10.003

  • 146

    Song, X., Sun, P., Yuan, J., Gong, K., Li, N., Meng, F., et al. (2021). The celery genome sequence reveals sequential paleo-polyploidizations, karyotype evolution and resistance gene reduction in Apiales. Plant Biotechnol. J. 19, 731–744. doi: 10.1111/pbi.13499

  • 147

    Song, X., Wang, J., Li, N., Yu, J., Meng, F., Wei, C., et al. (2020). Deciphering the high-quality genome sequence of coriander that causes controversial feelings. Plant Biotechnol. J. 18, 1444–1456. doi: 10.1111/pbi.13310

  • 148

    Song, Z., Lin, C., Xing, P., Fen, Y., Jin, H., Zhou, C., et al. (2020). A high-quality reference genome sequence of Salvia miltiorrhiza provides insights into tanshinone synthesis in its red rhizomes. Plant Genome 13:e20041. doi: 10.1002/tpg2.20041

  • 149

    Su, W., Jing, Y., Lin, S., Yue, Z., Yang, X., Xu, J., et al. (2021). Polyploidy underlies co-option and diversification of biosynthetic triterpene pathways in the apple tribe. Proc. Natl. Acad. Sci. U.S.A. 118:e2101767118. doi: 10.1073/pnas.2101767118

  • 150

    Su, X. Z., Miller, L. H. (2015). The discovery of artemisinin and the Nobel Prize in Physiology or Medicine. Sci. China Life Sci. 58, 1175–1179. doi: 10.1007/s11427-015-4948-7

  • 151

    Sun, G., Xu, Y., Liu, H., Sun, T., Zhang, J., Hettenhausen, C., et al. (2018). Large-scale gene losses underlie the genome evolution of parasitic plant Cuscuta australis. Nat. Commun. 9:2683. doi: 10.1038/s41467-018-04721-8

  • 152

    Sun, W., Leng, L., Yin, Q., Xu, M., Huang, M., Xu, Z., et al. (2019). The genome of the medicinal plant Andrographis paniculata provides insight into the biosynthesis of the bioactive diterpenoid neoandrographolide. Plant J. 97, 841–857. doi: 10.1111/tpj.14162

  • 153

    Sun, X., Zhu, S., Li, N., Cheng, Y., Zhao, J., Qiao, X., et al. (2020). A chromosome-level genome assembly of garlic (Allium sativum) provides insights into genome evolution and allicin biosynthesis. Mol. Plant 13, 1328–1339. doi: 10.1016/j.molp.2020.07.019

  • 154

    Tu, L., Su, P., Zhang, Z., Gao, L., Wang, J., Hu, T., et al. (2020). Genome of Tripterygium wilfordii and identification of cytochrome P450 involved in triptolide biosynthesis. Nat. Commun. 11:971. doi: 10.1038/s41467-020-14776-1

  • 155

    Tuskan, G. A., Difazio, S., Jansson, S., Bohlmann, J., Grigoriev, I., Hellsten, U., et al. (2006). The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313, 1596–1604. doi: 10.1126/science.1128691

  • 156

    Upadhyay, A. K., Chacko, A. R., Gandhimathi, A., Ghosh, P., Harini, K., Joseph, A. P., et al. (2015). Genome sequencing of herb Tulsi (Ocimum tenuiflorum) unravels key genes behind its strong medicinal properties. BMC Plant Biol. 15:212. doi: 10.1186/s12870-015-0562-x

  • 157

    Urasaki, N., Takagi, H., Natsume, S., Uemura, A., Taniai, N., Miyagi, N., et al. (2017). Draft genome sequence of bitter gourd (Momordica charantia), a vegetable and medicinal plant in tropical and subtropical regions. DNA Res. 24, 51–58. doi: 10.1093/dnares/dsw047

  • 158

    van Bakel, H., Stout, J. M., Cote, A. G., Tallon, C. M., Sharpe, A. G., Hughes, T. R., et al. (2011). The draft genome and transcriptome of Cannabis sativa. Genome Biol. 12:R102. doi: 10.1186/gb-2011-12-10-r102

  • 159

    VanBuren, R., Bryant, D., Edger, P. P., Tang, H., Burgess, D., Challabathula, D., et al. (2015). Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum. Nature 527, 508–511. doi: 10.1038/nature15714

  • 160

    Vining, K. J., Johnson, S. R., Ahkami, A., Lange, I., Parrish, A. N., Trapp, S. C., et al. (2017). Draft genome sequence of Mentha longifolia and development of resources for mint cultivar improvement. Mol. Plant 10, 323–339. doi: 10.1016/j.molp.2016.10.018

  • 161

    Wang, J., Xu, S., Mei, Y., Cai, S., Gu, Y., Sun, M., et al. (2021). A high-quality genome assembly of Morinda officinalis, a famous native southern herb in the Lingnan region of southern China. Hortic. Res. 8:135. doi: 10.1038/s41438-021-00551-w

  • 162

    Wang, L., He, F., Huang, Y., He, J., Yang, S., Zeng, J., et al. (2018). Genome of wild mandarin and domestication history of mandarin. Mol. Plant 11, 1024–1037. doi: 10.1016/j.molp.2018.06.001

  • 163

    Wang, L., Yu, S., Tong, C., Zhao, Y., Liu, Y., Song, C., et al. (2014). Genome sequencing of the high oil crop sesame provides insight into oil biosynthesis. Genome Biol. 15:R39. doi: 10.1186/gb-2014-15-2-r39

  • 164

    Wang, M., Zhang, L., Wang, Z. (2021). Chromosomal-level reference genome of the neotropical tree Jacaranda mimosifolia D. Don. Genome Biol. Evol. 13:evab094. doi: 10.1093/gbe/evab094

  • 165

    Wang, P., Yi, S., Mu, X., Zhang, J., Du, J. (2020). Chromosome-level genome assembly of Cerasus humilis using PacBio and Hi-C technologies. Front. Genet. 11:956. doi: 10.3389/fgene.2020.00956

  • 166

    Wang, X., Xu, Y., Zhang, S., Cao, L., Huang, Y., Cheng, J., et al. (2017). Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction. Nat. Genet. 49, 765–772. doi: 10.1038/ng.3839

  • 167

    Wang, X., Zhang, J., He, S., Gao, Y., Ma, X., Gao, Y., et al. (2018). HMOD: an omics database for herbal medicine plants. Mol. Plant 11, 757–759. doi: 10.1016/j.molp.2018.03.002

  • 168

    Wang, Y., Fan, G., Liu, Y., Sun, F., Shi, C., Liu, X., et al. (2013). The sacred lotus genome provides insights into the evolution of flowering plants. Plant J. 76, 557–567. doi: 10.1111/tpj.12313

  • 169

    Wang, Z., Hobson, N., Galindo, L., Zhu, S., Shi, D., McDill, J., et al. (2012). The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads. Plant J. 72, 461–473. doi: 10.1111/j.1365-313X.2012.05093.x

  • 170

    Wu, H., Zhao, G., Gong, H., Li, J., Luo, C., He, X., et al. (2020). A high-quality sponge gourd (Luffa cylindrica) genome. Hortic. Res. 7:128. doi: 10.1038/s41438-020-00350-9

  • 171

    Wu, S., Shamimuzzaman, M., Sun, H., Salse, J., Sui, X., Wilder, A., et al. (2017). The bottle gourd genome provides insights into Cucurbitaceae evolution and facilitates mapping of a Papaya ring-spot virus resistance locus. Plant J. 92, 963–975. doi: 10.1111/tpj.13722

  • 172

    Wu, S., Sun, W., Xu, Z., Zhai, J., Li, X., Li, C., et al. (2020). The genome sequence of star fruit (Averrhoa carambola). Hortic. Res. 7:95. doi: 10.1038/s41438-020-0307-3

  • 173

    Wu, Z., Liu, H., Zhan, W., Yu, Z., Qin, E., Liu, S., et al. (2021). The chromosome-scale reference genome of safflower (Carthamus tinctorius) provides insights into linoleic acid and flavonoid biosynthesis. Plant Biotechnol. J. 19, 1725–1742. doi: 10.1111/pbi.13586

  • 174

    Wuyun, T., Wang, L., Liu, H., Wang, X., Zhang, L., Bennetzen, J. L., et al. (2018). The hardy rubber tree genome provides insights into the evolution of polyisoprene biosynthesis. Mol. Plant 11, 429–442. doi: 10.1016/j.molp.2017.11.014

  • 175

    Xia, E. H., Zhang, H.-B., Sheng, J., Li, K., Zhang, Q.-J., Kim, C., et al. (2017). The tea tree genome provides insights into tea flavor and independent evolution of caffeine biosynthesis. Mol. Plant 10, 866–877. doi: 10.1016/j.molp.2017.04.002

  • 176

    Xia, M., Han, X., He, H., Yu, R., Zhen, G., Jia, X., et al. (2018). Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo. Gigascience 7, 1–9. doi: 10.1093/gigascience/giy067

  • 177

    Xia, Z., Huang, D., Zhang, S., Wang, W., Ma, F., Wu, B., et al. (2021). Chromosome-scale genome assembly provides insights into the evolution and flavor synthesis of passion fruit (Passiflora edulis Sims). Hortic. Res. 8:14. doi: 10.1038/s41438-020-00455-1

  • 178

    Xie, J., Zhao, H., Li, K., Zhang, R., Jiang, Y., Wang, M., et al. (2020). A chromosome-scale reference genome of Aquilegia oxysepala var. kansuensis. Hortic. Res. 7:113. doi: 10.1038/s41438-020-0328-y

  • 179

    Xu, H., Song, J., Luo, H., Zhang, Y., Li, Q., Zhu, Y., et al. (2016). Analysis of the genome sequence of the medicinal plant Salvia miltiorrhiza. Mol. Plant 9, 949–952. doi: 10.1016/j.molp.2016.03.010

  • 180

    Xu, J., Chu, Y., Liao, B., Xiao, S., Yin, Q., Bai, R., et al. (2017). Panax ginseng genome examination for ginsenoside biosynthesis. Gigascience 6, 1–15. doi: 10.1093/gigascience/gix093

  • 181

    Xu, X., Yuan, H., Yu, X., Huang, S., Sun, Y., Zhang, T., et al. (2021). The chromosome-level Stevia genome provides insights into steviol glycoside biosynthesis. Hortic. Res. 8:129. doi: 10.1038/s41438-021-00565-4

  • 182

    Xu, Z., Pu, X., Gao, R., Demurtas, O. C., Fleck, S. J., Richter, M., et al. (2020b). Tandem gene duplications drive divergent evolution of caffeine and crocin biosynthetic pathways in plants. BMC Biol. 18:63. doi: 10.1186/s12915-020-00795-3

  • 183

    Xu, Z., Gao, R., Pu, X., Xu, R., Wang, J., Zheng, S., et al. (2020a). Comparative genome analysis of Scutellaria baicalensis and Scutellaria barbata reveals the evolution of active flavonoid biosynthesis. Genomics Proteomics Bioinformatics 18, 230–240. doi: 10.1016/j.gpb.2020.06.002

  • 184

    Xu, Z., Xin, T., Bartels, D., Li, Y., Gu, W., Yao, H., et al. (2018). Genome analysis of the ancient tracheophyte Selaginella tamariscina reveals evolutionary features relevant to the acquisition of desiccation tolerance. Mol. Plant 11, 983–994. doi: 10.1016/j.molp.2018.05.003

  • 185

    Yan, L., Wang, X., Liu, H., Tian, Y., Lian, J., Yang, R., et al. (2015). The genome of Dendrobium officinale illuminates the biology of the important traditional Chinese orchid herb. Mol. Plant 8, 922–934. doi: 10.1016/j.molp.2014.12.011

  • 186

    Yang, J., Zhang, G., Zhang, J., Liu, H., Chen, W., Wang, X., et al. (2017). Hybrid de novo genome assembly of the Chinese herbal fleabane Erigeron breviscapus. Gigascience 6, 1–7. doi: 10.1093/gigascience/gix028

  • 187

    Yang, X., Yue, Y., Li, H., Ding, W., Chen, G., Shi, T., et al. (2018). The chromosome-level quality genome provides insights into the evolution of the biosynthesis genes for aroma compounds of Osmanthus fragrans. Hortic. Res. 5:72. doi: 10.1038/s41438-018-0108-0

  • 188

    Yang, Z., Chen, S., Wang, S., Hu, Y., Zhang, G., Dong, Y., et al. (2021a). Chromosomal-scale genome assembly of Eleutherococcus senticosus provides insights into chromosome evolution in Araliaceae. Mol. Ecol. Resour. 21, 2204–2220. doi: 10.1111/1755-0998.13403

  • 189

    Yang, Z., Liu, G., Zhang, G., Yan, J., Dong, Y., Lu, Y., et al. (2021b). The chromosome-scale high-quality genome assembly of Panax notoginseng provides insight into dencichine biosynthesis. Plant Biotechnol. J. 19, 869–871. doi: 10.1111/pbi.13558

  • 190

    Yin, J., Jiang, L., Wang, L., Han, X., Guo, W., Li, C., et al. (2021). A high-quality genome of taro (Colocasia esculenta (L.) Schott), one of the world’s oldest crops. Mol. Ecol. Resour. 21, 68–77. doi: 10.1111/1755-0998.13239

  • 191

    Yu, J., Hu, S., Wang, J., Wong, G. K.-S., Li, S., Liu, B., et al. (2002). A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92. doi: 10.1126/science.1068037

  • 192

    Yuan, Y., Jin, X., Liu, J., Zhao, X., Zhou, J., Wang, X., et al. (2018). The Gastrodia elata genome provides insights into plant adaptation to heterotrophy. Nat. Commun. 9:1615. doi: 10.1038/s41467-018-03423-5

  • 193

    Yuan, Y., Liu, W., Zhang, Q., Xiang, L., Liu, X., Chen, M., et al. (2015). Overexpression of artemisinic aldehyde Δ11 (13) reductase gene-enhanced artemisinin and its relative metabolite biosynthesis in transgenic Artemisia annua L. Biotechnol. Appl. Biochem. 62, 17–23. doi: 10.1002/bab.1234

  • 194

    Yuan, Z., Fang, Y., Zhang, T., Fei, Z., Han, F., Liu, C., et al. (2018). The pomegranate (Punica granatum L.) genome provides insights into fruit quality and ovule developmental biology. Plant Biotechnol. J. 16, 1363–1374. doi: 10.1111/pbi.12875

  • 195

    Zhang, D., Li, W., Xia, E., Zhang, Q., Liu, Y., Zhang, Y., et al. (2017). The medicinal herb Panax notoginseng genome provides insights into ginsenoside biosynthesis and genome evolution. Mol. Plant 10, 903–907. doi: 10.1016/j.molp.2017.02.011

  • 196

    Zhang, G. Q., Xu, Q., Bian, C., Tsai, W.-C., Yeh, C.-M., Liu, K.-W., et al. (2016). The Dendrobium catenatum Lindl. genome sequence provides insights into polysaccharide synthase, floral development and adaptive evolution. Sci. Rep. 6:19029. doi: 10.1038/srep19029

  • 197

    Zhang, J., Tian, Y., Yan, L., Zhang, G., Wang, X., Zeng, Y., et al. (2016). Genome of plant Maca (Lepidium meyenii) illuminates genomic basis for high-altitude adaptation in the central Andes. Mol. Plant 9, 1066–1077. doi: 10.1016/j.molp.2016.04.016

  • 198

    Zhang, L., Li, X., Ma, B., Gao, Q., Du, H., Han, Y., et al. (2017). The tartary buckwheat genome provides insights into rutin biosynthesis and abiotic stress tolerance. Mol. Plant 10, 1224–1237. doi: 10.1016/j.molp.2017.08.013

  • 199

    Zhang, L., Liu, M., Long, H., Dong, W., Pasha, A., Esteban, E., et al. (2019). Tung tree (Vernicia fordii) genome provides a resource for understanding genome evolution and improved oil production. Genomics Proteomics Bioinformatics 17, 558–575. doi: 10.1016/j.gpb.2019.03.006

  • 200

    Zhang, T., Ren, X., Zhang, Z., Ming, Y., Yang, Z., Hu, J., et al. (2020). Long-read sequencing and de novo assembly of the Luffa cylindrica (L.) Roem. genome. Mol. Ecol. Resour. 20, 511–519. doi: 10.1111/1755-0998.13129

  • 201

    Zhang, Y., Zheng, L., Zheng, Y., Zhou, C., Huang, P., Xiao, X., et al. (2019). Assembly and annotation of a draft genome of the medicinal plant Polygonum cuspidatum. Front. Plant Sci. 10:1274. doi: 10.3389/fpls.2019.01274

  • 202

    Zhao, D., Hamilton, J. P., Pham, G. M., Crisovan, E., Wiegert-Rininger, K., Vaillancourt, B., et al. (2017). De novo genome assembly of Camptotheca acuminata, a natural source of the anti-cancer compound camptothecin. Gigascience 6, 1–7. doi: 10.1093/gigascience/gix065

  • 203

    Zhao, Q., Yang, J., Cui, M.-Y., Liu, J., Fang, Y., Yan, M., et al. (2019). The reference genome sequence of Scutellaria baicalensis provides insights into the evolution of wogonin biosynthesis. Mol. Plant 12, 935–950. doi: 10.1016/j.molp.2019.04.002

  • 204

    Zhao, Y. P., Fan, G., Yin, P. P., Sun, S., Li, N., Hong, X., et al. (2019). Resequencing 545 ginkgo genomes across the world reveals the evolutionary history of the living fossil. Nat. Commun. 10:4201. doi: 10.1038/s41467-019-12133-5

  • 205

    Zheng, G. X. Y., Lau, B. T., Schnall-Levin, M., Jarosz, M., Bell, J. M., Hindson, C. M., et al. (2016). Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nat. Biotechnol. 34, 303–311. doi: 10.1038/nbt.3432

  • 206

    Zheng, X., Chen, D., Chen, B., Liang, L., Huang, Z., Fan, W., et al. (2021). Insights into salvianolic acid B biosynthesis from chromosome-scale assembly of the Salvia bowleyana genome. J. Integr. Plant Biol. 63, 1309–1323. doi: 10.1111/jipb.13085

  • 207

    Zhou, W., Li, B., Li, L., Ma, W., Liu, Y., Feng, S., et al. (2018). Genome survey sequencing of Dioscorea zingiberensis. Genome 61, 567–574. doi: 10.1139/gen-2018-0011

  • 208

    Zhou, W., Wang, Y., Li, B., Petijová, L., Hu, S., Zhang, Q., et al. (2021). Whole-genome sequence data of Hypericum perforatum and functional characterization of melatonin biosynthesis by N-acetylserotonin O-methyltransferase. J. Pineal Res. 70:e12709. doi: 10.1111/jpi.12709

Keywords

medicinal plant, genome, sequencing, long-read sequencing technology, application

Citation

Cheng Q-Q, Ouyang Y, Tang Z-Y, Lao C-C, Zhang Y-Y, Cheng C-S and Zhou H (2021) Review on the Development and Applications of Medicinal Plant Genomes. Front. Plant Sci. 12:791219. doi: 10.3389/fpls.2021.791219

Received

08 October 2021

Accepted

23 November 2021

Published

23 December 2021

Volume

12 - 2021

Edited by

Qi Chen, Kunming University of Science and Technology, China

Reviewed by

Wei Gao, Capital Medical University, China; Enhua Xia, Anhui Agriculture University, China

*Correspondence: Hua Zhou,

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
