# ADVANCED TECHNOLOGIES FOR THE QUALITY CONTROL AND STANDARDIZATION OF PLANT BASED MEDICINES

EDITED BY: Jiang Xu and Caroline Howard

PUBLISHED IN: Frontiers in Pharmacology and Frontiers in Plant Science

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-798-0 DOI 10.3389/978-2-88963-798-0

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# ADVANCED TECHNOLOGIES FOR THE QUALITY CONTROL AND STANDARDIZATION OF PLANT BASED MEDICINES

Topic Editors:

Jiang Xu, China Academy of Chinese Medical Sciences, China

Caroline Howard, Medicines and Healthcare products Regulatory Agency, United Kingdom

Herbs and herbal products are of paramount importance for human health. To guarantee their safety and quality, standards and testing methods are needed. Pharmacopoeias contain quality control protocols that set the standards, which are then required by governments. The quality traits are many, including intrinsic variables of the medicinal plant, e.g., the levels of the active compounds and the absence of naturally occurring toxic compounds. On the other hand, many quality traits are related to agricultural conditions and practices, or to harvesting and post-harvest processing. With so many variables, quality control of the end product becomes extremely complex, time-consuming, and costly. To ensure the quality of medicinal plants for human consumption, quality management (the use of "good practices" at each step, from seed to final product) becomes crucial.

In general, quality control includes the inspection of the product's identity, purity, and content, based on its physical, chemical, or biological properties. To ensure the quality of herbal medications, criteria such as botanical quality, type of preparation, physical constants, adulteration, contaminants, chemical constituents, and pesticide residues should be examined. Meanwhile, authentication of herbs is needed to avoid adulteration or contamination with other plants, including toxic herbs such as Aristolochia species. Many of the methods are long standing, such as microscopy in combination with color reactions, but some 50 years ago chromatography developed as a major tool for both qualitative and quantitative analysis of herbal preparations. Current research focuses on improving these methods and on developing novel tools.

For instance, next-generation sequencing and mass spectrometry imaging are emerging as new technologies for the quality control of herbal medicines. With these technologies, quick testing of herbal products and of mixed herbal powder preparations, including testing for specific plant parts (botanical drugs), can be achieved. Also, novel chemical tools such as metabolomics and near-infrared (NIR) spectroscopy are being developed as powerful tools to identify constituents and to link them with activity by using chemometric tools such as multivariate analysis. Finally, progress in informatic tools such as machine learning helps to deal with the big data generated by sequencing or mass spectrometry. However, these new technologies, like all newly developed technologies, should be tested and perfected for a broader range of products.

Citation: Xu, J., Howard, C., eds. (2020). Advanced Technologies for the Quality Control and Standardization of Plant Based Medicines. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-798-0

# Table of Contents


Jing Zhang, Wen Xu, Peng Wang, Juan Huang, Jun-qi Bai, Zhi-hai Huang, Xu-sheng Liu and Xiao-hui Qiu


Xinlian Chen, Jianguo Zhou, Yingxian Cui, Yu Wang, Baozhong Duan and Hui Yao

*Detection of Cistanches Herba (*Rou Cong Rong*) Medicinal Products Using Species-Specific Nucleotide Signatures*

Xiao-yue Wang, Rong Xu, Jun Chen, Jing-yuan Song, Steven-G Newmaster, Jian-ping Han, Zheng Zhang and Shi-lin Chen

Wing Lam, Yongshen Ren, Fulan Guan, Zaoli Jiang, William Cheng, Chang-Hua Xu, Shwu-Huey Liu and Yung-Chi Cheng

*A Practical Quality Control Method for Saponins Without UV Absorption by UPLC-QDA*

Manjia Zhao, Yuntao Dai, Qi Li, Pengyue Li, Xue-Mei Qin and Shilin Chen

*Sequence-Specific Detection of* Aristolochia *DNA – A Simple Test for Contamination of Herbal Products*

Tiziana Sgamma, Eva Masiero, Purvi Mali, Maslinda Mahat and Adrian Slater

*A Comprehensive Comparative Study for the Authentication of the* Kadsura *Crude Drug*

Jiushi Liu, Xueping Wei, Xiaoyi Zhang, Yaodong Qi, Bengang Zhang, Haitao Liu and Peigen Xiao

*St. John's Wort (*Hypericum perforatum*) Products – How Variable is the Primary Material?*

Francesca Scotti, Katja Löbel, Anthony Booker and Michael Heinrich

*Selection of Reference Genes for Expression Analysis in Chinese Medicinal Herb* Huperzia serrata

Mengquan Yang, Shiwen Wu, Wenjing You, Amit Jaisi and Youli Xiao

Louise Isager Ahl, Narjes Al-Husseini, Sara Al-Helle, Dan Staerk, Olwen M. Grace, William G. T. Willats, Jozef Mravec, Bodil Jørgensen and Nina Rønsted

*Medicinal Plant Analysis: A Historical and Regional Discussion of Emergent Complex Techniques*

Martin Fitzgerald, Michael Heinrich and Anthony Booker

# Systems Pharmacology Based Strategy for Q-Markers Discovery of HuangQin Decoction to Attenuate Intestinal Damage

Xiao-min Dai<sup>1,2</sup>, Dong-ni Cui<sup>1,2</sup>, Jing Wang<sup>3</sup>, Wei Zhang<sup>4</sup>, Zun-jian Zhang<sup>1,2</sup>\* and Feng-guo Xu<sup>1,2</sup>\*

<sup>1</sup> Key Laboratory of Drug Quality Control and Pharmacovigilance, Ministry of Education, China Pharmaceutical University, Nanjing, China, <sup>2</sup> State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing, China, <sup>3</sup> College of Pharmacy, Shaanxi University of Chinese Medicine, Xianyang, China, <sup>4</sup> State Key Laboratory for Quality Research in Chinese Medicines, Macau University of Science and Technology, Taipa, Macau

#### Edited by:

Jiang Xu, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Jianbo Wan, University of Macau, Macau

Wenzhi Yang, Tianjin University of Traditional Chinese Medicine, China

#### \*Correspondence:

Feng-guo Xu, fengguoxu@gmail.com

Zun-jian Zhang, zunjianzhangcpu@hotmail.com

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 09 February 2018 Accepted: 02 March 2018 Published: 20 March 2018

#### Citation:

Dai X-m, Cui D-n, Wang J, Zhang W, Zhang Z-j and Xu F-g (2018) Systems Pharmacology Based Strategy for Q-Markers Discovery of HuangQin Decoction to Attenuate Intestinal Damage. Front. Pharmacol. 9:236. doi: 10.3389/fphar.2018.00236

The quality control research of traditional Chinese medicine (TCM) lags far behind the pace of its modernization and globalization. The concept of the quality marker (Q-marker) was therefore proposed recently to guide quality investigations of TCM. However, how to discover and validate Q-markers remains a challenge. In this paper, a systems pharmacology-based strategy is proposed to discover the Q-markers of HuangQin decoction (HQD) for attenuating intestinal damage. Using this strategy, nine measurable compounds, namely paeoniflorin, baicalin, scutellarein, liquiritigenin, norwogonin, baicalein, glycyrrhizic acid, wogonin, and oroxylin A, were screened out as potential markers. Reference standards of these nine compounds were pooled together as a components combination according to their corresponding concentrations in HQD. The bioactive equivalence between the components combination and HQD was validated using a wound healing test and an inflammatory factor determination experiment. The combined results indicated that the components combination is almost bioactively equivalent to HQD and could serve as its Q-markers. In conclusion, our study puts forward a promising strategy for Q-marker discovery.

Keywords: Q-marker, systems pharmacology, traditional Chinese medicine, HuangQin decoction, intestinal damage

## INTRODUCTION

Traditional Chinese medicine (TCM) plays a vital role in the prevention and treatment of diseases and is receiving more and more attention (Jiang et al., 2010). Owing to its highly complex chemical composition, TCM faces a major challenge in quality control research (Yang et al., 2017). In Chinese Pharmacopeia monographs, the quality standards of TCM are usually established based on the absolute quantitation of one or several specific chemical compounds. This approach can only ensure the consistency of the assigned chemical markers. It is often questionable whether these chemical markers are responsible for, and directly related to, the holistic efficacy of TCM. Many efforts have been made to advance quality control research of TCM (Tilton et al., 2010; Long et al., 2015; Wang F. et al., 2017), but the proposed strategies and methods are quite complex and not easy to follow. Thus, a standardized and commonly accepted strategy for TCM quality control research is needed. Recently, the concept of the quality marker (Q-marker) was proposed to standardize TCM quality research and to enhance quality consistency (Liu et al., 2016). However, how to discover and validate Q-markers is still a huge challenge.

Systems pharmacology is an emerging approach that integrates chemoinformatics, network pharmacology, and -omics data. It is a useful tool for gaining comprehensive insight into the therapeutic mechanisms of multi-compound herbs. The public availability of systems pharmacology platforms and other bioinformatics databases has put systems pharmacology-based TCM research strategies into practice (Ru et al., 2014). The approach has been successfully used to reveal the material basis and mechanism of Yin-Huang-Qing-Fei capsule (Yu et al., 2017) and rhubarb (Xiang et al., 2015) in the treatment of chronic bronchitis and renal interstitial fibrosis, respectively. Systems pharmacology provides a bridge linking TCM chemical constituents with their corresponding targets, which should facilitate Q-marker discovery. Therefore, we propose a systems pharmacology-based strategy (**Figure 1**) for Q-marker discovery. The feasibility of this strategy was tested by taking HuangQin decoction (HQD) as an example.

HuangQin decoction (HQD) is a basic formula listed in Treatise on Exogenous Febrile Disease written by Zhongjing Zhang. It has been widely used in China for more than 1800 years in the treatment of gastrointestinal (GI) ailments, including diarrhea, abdominal spasms, vomiting, and nausea (Wang X. et al., 2017). Recent studies have revealed that PHY906, a modified formulation derived from HQD, could ameliorate chemotherapy-induced GI toxicity and enhance the therapeutic efficacy of irinotecan, capecitabine, and other antitumor drugs (Lam et al., 2010; Saif et al., 2010; Wang et al., 2011). HQD consists of four medicinal herbs, i.e., Scutellaria baicalensis Georgi, Glycyrrhiza uralensis Fisch, Paeonia lactiflora Pall, and Ziziphus jujuba Mill. It is still unclear which ingredients of HQD are active in ameliorating intestinal damage, which significantly limits the establishment of quality standards.

In this paper, we tried to discover the Q-markers of HQD for treating diarrhea using a systems pharmacology-based strategy. After the active components of HQD and the corresponding therapeutic targets of diarrhea were collected, the component-target (C-T) network was constructed. An LC-IT-TOF/MS fingerprint was used to clarify which active compounds actually exist in the water decoction of HQD. The detectability of the selected potential markers was then tested, and the absolute concentrations of the measurable components were quantified using HPLC/UV. A bioactive equivalence experiment was performed to compare the efficacy of HQD and the combination of selected potential markers in alleviating intestinal damage.

## MATERIALS AND METHODS

### Chemicals and Reagents

Paeoniflorin, baicalin, scutellarein, liquiritigenin, baicalein, glycyrrhizic acid, wogonin, and oroxylin A were purchased from Chengdu Herbpurify Co., Ltd. (Sichuan, China). Glycyrrhizic acid ammonium salt and norwogonin were purchased from ChemFaces (China). All other reagents and solvents were of high performance liquid chromatography (HPLC) grade. Deionized water was purified using a Milli-Q system (Millipore, Bedford, MA, United States). Scutellaria baicalensis Georgi (Hebei Province), Glycyrrhiza uralensis Fisch (Inner Mongolia of China), Paeonia lactiflora Pall (Anhui Province), and Ziziphus jujuba Mill (Henan Province) were authenticated by Dr. Ehu Liu (State Key Laboratory of Natural Medicines, China Pharmaceutical University, China).

### Collection of Active Ingredients and Diarrhea Targets

The ingredients of the four constituent herbs in HQD, i.e., Scutellaria baicalensis Georgi, Glycyrrhiza uralensis Fisch, Paeonia lactiflora Pall, and Ziziphus jujuba Mill, were collected from the Traditional Chinese Medicines for Systems Pharmacology Database and Analysis Platform (TCMSP<sup>1</sup>), the Traditional Chinese Medicine integrative database (TCMID), TCM Database @ Taiwan, HIT, and wide-scale literature mining. Previous prediction studies have generally applied ADME screening, which involves a series of pharmacokinetic parameters such as oral bioavailability (OB), drug-likeness (DL), and blood–brain barrier (BBB) value (Shen et al., 2016). Because our study focused on intestinal damage, the active components might exert their therapeutic effect without being absorbed into the serum or brain. Thus, we set DL ≥ 0.05 as the only criterion, so as to retain as many active ingredients as possible.
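The screening rule above can be sketched in a few lines of Python. This is only an illustration of the DL ≥ 0.05 cutoff: the `Ingredient` record, its field names, and the three sample entries are hypothetical and are not taken from TCMSP or the supplementary tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ingredient:
    """Hypothetical record for one collected ingredient."""
    mol_id: str   # placeholder identifier, not a real TCMSP Mol ID
    herb: str
    dl: float     # drug-likeness score


DL_THRESHOLD = 0.05  # the only criterion stated in the text


def filter_by_drug_likeness(ingredients, threshold=DL_THRESHOLD):
    """Keep ingredients with DL >= threshold; OB and BBB are deliberately not used."""
    return [ing for ing in ingredients if ing.dl >= threshold]


# Made-up example values, for illustration only.
candidates = [
    Ingredient("MOL_A", "Scutellaria baicalensis Georgi", 0.21),
    Ingredient("MOL_B", "Paeonia lactiflora Pall", 0.03),
    Ingredient("MOL_C", "Glycyrrhiza uralensis Fisch", 0.05),
]
kept = filter_by_drug_likeness(candidates)
print([ing.mol_id for ing in kept])
```

Using a single permissive threshold keeps compounds that act locally in the gut, which an OB or BBB filter would discard.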

Targets of all effective components in HQD were collected from DrugBank, TCMSP, and STITCH. The targets closely related to diarrhea were obtained from PharmGKB<sup>2</sup>, the TTD database<sup>3</sup> (Yang et al., 2016), GAD (Genetic Association Database), and OMIM (Online Mendelian Inheritance in Man, updated to 2017). The UniProt database<sup>4</sup> was then employed to standardize the target-related genes and to restrict the targets to human proteins. The genes that are associated with diarrhea and can be targeted by HQD were kept. The compound-target (C-T) network was then generated, and its topological properties were analyzed with Cytoscape 3.4.0.

<sup>1</sup>http://lsp.nwu.edu.cn/tcmsp.php

<sup>2</sup>https://www.pharmgkb.org/
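The network construction described here (keep only targets that are both hit by HQD compounds and associated with diarrhea, then compute node degrees) can be sketched as follows. This is a minimal stand-in for the Cytoscape analysis, not the authors' workflow. The wogonin, oroxylin A, and paeoniflorin edges follow interactions mentioned in the Results; `compound X` and `OFF_TARGET_1` are hypothetical entries added to show the filtering step.

```python
from collections import defaultdict


def build_ct_network(compound_targets, disease_targets):
    """Return {compound: set of targets}, restricted to disease-related targets."""
    network = {}
    for compound, targets in compound_targets.items():
        shared = set(targets) & set(disease_targets)
        if shared:  # drop compounds with no disease-related target
            network[compound] = shared
    return network


def node_degrees(network):
    """Degree of every compound and target node in the bipartite network."""
    degrees = defaultdict(int)
    for compound, targets in network.items():
        degrees[compound] += len(targets)
        for target in targets:
            degrees[target] += 1
    return dict(degrees)


# Toy subset; "compound X" and "OFF_TARGET_1" are hypothetical.
compound_targets = {
    "wogonin": {"PTGS2", "NOS2", "TNF"},
    "oroxylin A": {"PTGS2", "NOS2"},
    "paeoniflorin": {"TNF", "OFF_TARGET_1"},
    "compound X": {"OFF_TARGET_1"},
}
diarrhea_targets = {"PTGS2", "NOS2", "TNF", "KDR"}

network = build_ct_network(compound_targets, diarrhea_targets)
degrees = node_degrees(network)
for node, degree in sorted(degrees.items(), key=lambda item: -item[1]):
    print(node, degree)
```

Targets with the highest degree in such a table correspond to the hub targets discussed in the Results.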

### Active Components Identification in HQD by LC/MS

LC-IT-TOF/MS was used to clarify which active compounds actually exist in the water decoction of HQD. One milliliter of the standard HQD decoction (Wang X. et al., 2017) was dissolved in a suitable amount of 80% (v/v) methanol by ultrasonication, with DMSO added to aid dissolution, and subsequently centrifuged (16000 rpm, 4 °C) for 10 min. The supernatants were filtered and analyzed on a ZORBAX SB-C18 rapid resolution HT column (2.1 mm × 100 mm, 1.8 µm) (Agilent Technologies). The mobile phase consisted of 0.1% formic acid (A) and methanol (B). The gradient elution began with 10% B, increased to 45% B in 12 min, further increased to 100% B in 16 min, was held for 6 min, and was brought back to 10% B in 1 min, followed by 10 min of re-equilibration. The mass spectrometry (MS) analysis was performed on an ultrafast LC-ion trap time-of-flight mass spectrometer with an electrospray ionization (ESI) interface (SHIMADZU, Japan). The parameters were as follows: ESI (±); nebulizing gas rate, 1.5 L/min; drying gas pressure, 100 kPa; detector voltage, 1.85 kV; interface voltage, −3.5 kV; CDL and heat block temperature, 200 °C; ion accumulation time, 30 ms. The mass range was set at m/z 100–1000. The components in HQD were identified by comparison with the reference standards available in our lab or with fragmentation patterns reported in the literature. The results were combined with those identified in PHY906 (Ye et al., 2007).

### Potential Markers Quantification in HQD by HPLC/UV

After the identified components of HQD were checked against the C-T network, only those common to both were retained as potential markers. The detectability of these selected potential markers was tested, and the absolute concentrations of the measurable components were quantified using HPLC/UV. After dilution and filtering, HQD was analyzed on an Agilent 1100 series HPLC system (Agilent, United States) using an Agilent Zorbax SB-C18 column (250 mm × 4.6 mm, 5 µm). The mobile phase consisted of 0.1% phosphoric acid in water (A) and acetonitrile (B). The gradient elution program was 19–21% B at 0–8 min, 21–35% B at 8–10 min, 35% B at 10–18 min, 35–40% B at 18–20 min, 40% B at 20–38 min, and 40–100% B at 38–43 min. The flow rate was kept at 1.0 ml/min at 30 °C. Different detection wavelengths were set for different compounds: 236 nm for paeoniflorin; 278 nm for baicalin, baicalein, wogonin, and liquiritigenin; 250 nm for glycyrrhizic acid ammonium salt; 270 nm for oroxylin A; 340 nm for scutellarein; and 280 nm for norwogonin. Reference standards of these compounds were pooled together as a components combination according to their corresponding concentrations in HQD.

<sup>3</sup>http://bidd.nus.edu.sg/group/cjttd/

<sup>4</sup>http://www.uniprot.org/
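For readers who want to reproduce the HPLC/UV method programmatically, the gradient and detection wavelengths above can be written as plain data. The sketch below assumes linear ramps between the stated breakpoints, which the text does not say explicitly; the names `GRADIENT`, `DETECTION_NM`, and `percent_b` are illustrative only.

```python
from bisect import bisect_right

# (time in min, % B) breakpoints taken from the gradient program in the text.
GRADIENT = [
    (0.0, 19.0), (8.0, 21.0), (10.0, 35.0), (18.0, 35.0),
    (20.0, 40.0), (38.0, 40.0), (43.0, 100.0),
]

# Detection wavelength (nm) per compound, as listed in the text.
DETECTION_NM = {
    "paeoniflorin": 236,
    "baicalin": 278, "baicalein": 278, "wogonin": 278, "liquiritigenin": 278,
    "glycyrrhizic acid ammonium salt": 250,
    "oroxylin A": 270,
    "scutellarein": 340,
    "norwogonin": 280,
}


def percent_b(t_min, program=GRADIENT):
    """Mobile phase B (%) at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in program]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(program) - 1:      # exactly at the final breakpoint
        return program[-1][1]
    (t0, b0), (t1, b1) = program[i], program[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 4, 9, 15, 40.5, 43):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```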

### Bioactive Equivalence Assessment Between Components Combination and HQD

#### Cell Culture

Lipopolysaccharide (LPS)-stimulated NCM460 damage (Bhattacharyya et al., 2008) and LPS-stimulated THP-1-derived macrophage inflammation (Perezperez et al., 1995) were used as two cell models to assess the bioactive equivalence between the components combination and HQD. NCM460 and THP-1 cells were obtained from the Model Animal Research Center of Nanjing University and the Stem Cell Bank of the Chinese Academy of Sciences, respectively. Cells were grown at 37 °C in a humidified atmosphere with 5% (v/v) CO<sub>2</sub>. NCM460 and THP-1 cells were cultured in Dulbecco's modified Eagle's medium (DMEM) (Boster Biological Technology Co., Ltd.) and Roswell Park Memorial Institute (RPMI) 1640 medium (Gibco-Thermo Fisher Scientific, United States), respectively, each containing 10% fetal bovine serum and 1% penicillin-streptomycin (Biological Industries, Israel).

#### Cell Migration Assay

Collective migration of epithelial cells is a fundamental physiological process that is an inherent part of embryonic morphogenesis, cancer, and wound healing, and it can be measured by the scratch assay (Das et al., 2015). In a colon epithelial monolayer (NCM460), opening of a free surface by scratch-wounding triggers collective movement of the surrounding cells to fill the gap. To assess the effect of the medicines on NCM460, we pretreated cells with HQD, the components combination, or baicalin (at concentrations corresponding to 400, 200, 100, and 50 µg/ml HQD) for 24 h. After pre-incubation, once 100% confluence was observed, scratch-wounding was performed. The supernatant was removed and the cells were washed with PBS three times to remove the damaged cells. The cells were then exposed to 1 µg/ml LPS (lipopolysaccharides from Escherichia coli O111:B4, Sigma) for 24 h, except for the vehicle control group. Images of the two migrating epithelial monolayers of NCM460 were captured with an inverted phase contrast microscope (Nikon Eclipse Ti-U) and used to calculate the % relative cell migration according to the following equation (Buranasukhon et al., 2017):

$$
\%\,\text{Relative migration} = \frac{\text{Area between cells}_{0\,\text{h}} - \text{Area between cells}_{24\,\text{h}}}{\text{Area between cells}_{0\,\text{h}}} \times 100
$$
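As a worked example of the relative migration formula, the short function below computes the percentage from the wound (cell-free) area at 0 h and 24 h. The function name and the area values are illustrative assumptions; any consistent area unit (pixels or µm²) gives the same result.

```python
def relative_migration(area_0h, area_24h):
    """Percent relative migration: (A_0h - A_24h) / A_0h * 100."""
    if area_0h <= 0:
        raise ValueError("area at 0 h must be positive")
    return (area_0h - area_24h) / area_0h * 100.0


# Made-up areas: a gap of 1000 units that shrinks to 250 units after 24 h.
print(relative_migration(1000.0, 250.0))  # 75.0
```

A fully closed gap gives 100%, and a gap that does not change gives 0%.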

#### TNF-α and PGE<sub>2</sub> Release

To assess the anti-inflammatory effects of HQD, the components combination, and baicalin, NCM460 cells and THP-1-derived macrophages were pretreated with the medicines for 12 h. Then, NCM460 cells were cultured in serum-free DMEM supplemented with 1 µg/ml LPS for 6 or 12 h, while THP-1-derived macrophages were cultured in serum-free RPMI 1640 supplemented with 1 µg/ml LPS (8 h for TNF-α, 21 h for PGE<sub>2</sub>) (Padilla et al., 2017). The accumulated TNF-α and PGE<sub>2</sub> in the culture medium were measured using commercial ELISA kits [Multisciences (Lianke) Biotech for TNF-α and MEIMIAN for PGE<sub>2</sub>] according to the manufacturers' instructions.

#### Statistical Analysis


All data are expressed as mean ± standard deviation (SD). Data were subjected to statistical analysis using GraphPad Prism 5.0 (GraphPad Software, San Diego, CA, United States). One-way analysis of variance (ANOVA) with Dunnett's post hoc test was carried out for statistical comparison. In all cases, P < 0.05 was considered statistically significant.
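The analysis itself was run in GraphPad Prism. As a rough illustration of what the summary statistics and the one-way ANOVA involve, the standard-library sketch below computes mean ± SD and the F statistic for made-up group data. It stops short of the P value and of Dunnett's post hoc comparisons, which would need a statistics package; all names and numbers are hypothetical.

```python
from statistics import mean, stdev


def mean_sd(values):
    """Mean and sample standard deviation of one group."""
    return mean(values), stdev(values)


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up readouts for three groups (e.g. control, model, treated).
control = [1.0, 2.0, 3.0]
model = [2.0, 3.0, 4.0]
treated = [5.0, 6.0, 7.0]

for name, group in (("control", control), ("model", model), ("treated", treated)):
    m, sd = mean_sd(group)
    print(f"{name}: {m:.2f} ± {sd:.2f}")

f_stat, df_b, df_w = one_way_anova_f([control, model, treated])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```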

## RESULTS

### Collection of HQD Ingredients and Diarrhea Targets

Considering that intestinal tissues and intestinal contents play important roles in the occurrence of diarrhea, we selected DL as the only criterion to filter active ingredients. The names and Mol IDs of 186 ingredients from Scutellaria baicalensis Georgi, 111 from Paeonia lactiflora Pall, 236 from Glycyrrhiza uralensis Fisch, and 226 from Ziziphus jujuba Mill are listed in **Supplementary Table S1**. The corresponding targets that these ingredients act on were screened out based on the TCMSP and STITCH databases (**Supplementary Table S2**). At the same time, 64 diarrhea-related proteins were found in PharmGKB, TTD, GAD, and OMIM (**Supplementary Table S3**). Only the 33 targets common to these two independent searches were kept; they interacted with 208 ingredients of HQD (**Supplementary Table S4**).

### Compound-Target Network Construction and Analysis

Small molecules typically exert their bioactive effects through interactions with protein targets. Thus, to identify the interactions between the 208 filtered compounds and the 33 diarrhea targets, a network was established. As shown in **Figure 2**, 430 compound-target interactions were generated. The node degree represents the connectedness of a node with other nodes and is a basic quantitative property of a network. The degrees of the compounds and targets are listed in **Supplementary Table S5**. Among the 33 targets, prostaglandin G/H synthase 2 (PTGS2, D = 199) has the highest degree, followed by nitric oxide synthase (NOS2, D = 97), vascular endothelial growth factor receptor 2 (KDR, D = 25), tumor necrosis factor (TNF, D = 20), and so on, indicating that they play a significant role in the network as hub targets. Wogonin, oroxylin A, and berberine could interact with PTGS2 and NOS2 simultaneously. Rutin, wogonin, baicalein, and paeoniflorin could interact with TNF. These results illustrate the "multi-component and multi-target" mechanism of HQD and its synergistic therapeutic effect on diarrhea.

### Active Components Identification and Quantification of HQD

Although the above results suggested that 208 ingredients have effects on diarrhea-related targets, this does not mean that all 208 components are detectable in HQD. The phytochemical components in the water decoction of the four constituent herbs were therefore identified by LC-IT-TOF/MS fingerprint in ESI positive and negative ion modes (**Supplementary Figure S1**). In total, 38 compounds in HQD were identified by comparison with the reference standards available in our lab or with fragmentation information in the literature, including 8 from Glycyrrhiza uralensis Fisch, 2 from Paeonia lactiflora Pall, and 28 from Scutellaria baicalensis Georgi (**Supplementary Table S6**). Combining these 38 compounds with those identified in PHY906 (Ye et al., 2007), we obtained 79 compounds. Eleven of them that matched the C-T network of diarrhea were kept as potential markers.

Quantitative determination results (**Supplementary Figure S2**) demonstrated that, except for chrysin and rutin, the contents of the remaining 9 potential markers in HQD were above the limit of quantitation (LOQ). The LOQs of chrysin and rutin by HPLC/UV were 66.15 and 45.14 ng, respectively. Therefore, paeoniflorin, baicalin, scutellarein, liquiritigenin, norwogonin, baicalein, glycyrrhizic acid, wogonin, and oroxylin A were screened out as potential markers. Reference standards of these 9 compounds were pooled together as a components combination according to their corresponding concentrations in HQD.
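The pooling step can be read as proportional scaling: each marker is added at the concentration it would have in HQD at a given dose. The sketch below assumes such simple proportionality, which the text implies but does not spell out. The baicalin ratio (12 µg/mL per 400 µg/mL HQD) is the one reported in the bioactive equivalence results; the second entry is a hypothetical placeholder, since the other measured contents are given only in the supplementary material.

```python
def combination_concentrations(hqd_dose_ug_ml, mass_fractions):
    """Concentration (µg/mL) of each marker matching a given HQD dose."""
    return {name: hqd_dose_ug_ml * fraction
            for name, fraction in mass_fractions.items()}


MASS_FRACTIONS = {
    "baicalin": 12.0 / 400.0,        # ratio reported in the text
    "placeholder marker": 0.005,     # hypothetical value, illustration only
}

# HQD dose levels used in the cell experiments (µg/mL).
for dose in (400, 200, 100, 50):
    levels = combination_concentrations(dose, MASS_FRACTIONS)
    print(dose, {name: round(conc, 2) for name, conc in levels.items()})
```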

### Bioactive Equivalence Assessment Between Components Combination and HQD

The results of the wound healing test and the inflammatory factor determination experiment indicated that HQD showed remarkable protective effects and that the components combination exerted the same or better effects.

Representative phase-contrast images of control group wound areas at 0 and 24 h after scratching are shown in **Figure 3A**. Quantitative results demonstrated that LPS stimulation resulted in significantly lower cell mobility of NCM460 than in the control group (P < 0.01). The components combination increased the cell mobility of LPS-stimulated NCM460 in a dose-dependent manner, and its effect was better than that of HQD at the same dose (**Figure 3B**). Baicalin (12 µg/mL), one of the most abundant compounds in HQD, showed some activity but could not achieve bioactive equivalence with HQD at the corresponding dose level (400 µg/mL). In addition, LPS stimulation resulted in a substantial increase of TNF-α secretion in NCM460, while preincubation with the components combination or HQD alleviated the LPS-induced increase of TNF-α. The results after 6 h of LPS stimulation suggested that 400 µg/mL components combination had an efficacy similar to that of 200 µg/mL HQD. After 12 h of LPS stimulation, only 200 µg/mL components combination exerted a notable anti-inflammatory effect, which demonstrated that the anti-inflammatory effect of the components combination is superior to that of HQD (**Figure 3C**). Baicalin exerted weak effects, and the results were consistent with the theory of superimposed effects in TCM.

To further investigate the anti-inflammatory action on macrophages, the effects of the components combination and HQD on TNF-α and PGE<sub>2</sub> production in LPS-activated THP-1 cells were determined. Differentiated THP-1 cells were obtained by 48 h of treatment with phorbol 12-myristate 13-acetate (PMA). LPS stimulation for 8 h increased TNF-α release, whereas preincubation with HQD or the components combination notably alleviated the elevation of TNF-α compared with the model group. At the optimum concentration of 200 µg/mL, the components combination showed an effect comparable to that of 100 or 50 µg/mL HQD. Interestingly, baicalin showed the best activity compared with the components combination and HQD, which could help to explain the monarch role of Scutellaria baicalensis Georgi in HQD (**Figure 4A**). LPS stimulation for 21 h significantly increased PGE<sub>2</sub> production, and pre-treatment with 200 µg/mL components combination or 100 µg/mL HQD alleviated the LPS-induced increase of PGE<sub>2</sub> to the same extent. Baicalin showed some activity, but it was inferior to HQD (**Figure 4B**).

## DISCUSSION

Traditional Chinese medicine shows advantages especially in the treatment of chronic diseases and is receiving more and more attention. However, the quality control problem of TCM is a major obstacle hindering its modernization and globalization. The concept of the Q-marker was therefore proposed recently to guide TCM quality investigations. The Q-marker of a TCM refers to a group of bioactive constituents that are closely associated with its therapeutic effects. The bottleneck in Q-marker-based quality standard investigation is how to screen out the chemical markers. Zhang et al. (2016a) tried to find the potential Q-markers of Corydalis Rhizoma based on biosynthesis, specificity, and pharmacodynamics experiments. Compounds that could be found in the brain tissues were regarded as exerting an analgesic effect. Non-targeted metabolomics and an artificial neural network were employed to explore the identity markers for five different parts of P. ginseng (Qiu et al., 2016), but the bioactivity of these markers was not taken into account. A triarchic theory of "property-effect-component" and multidiscipline-based strategies have been proposed to discover effect-associated markers. The key steps are to test the effect of the extract or single compounds on multiple models and to determine the pharmacokinetic parameters (Zhang et al., 2016b). This process is clearly time-consuming owing to the complex composition of herbs. Although significant progress has been made in Q-marker discovery, current studies have some drawbacks, and a standardized and commonly accepted strategy is still needed.

Figure 3 caption (fragment): 24 h wound healing was determined as % of 0 h. (C) NCM460 cells were pretreated with vehicle, HQD, the combination, or baicalin for 12 h; following 6 or 12 h of LPS stimulation, the accumulated TNF-α in the culture medium was measured using commercial ELISA kits. Results are expressed as mean ± SD of at least three independent experiments. <sup>##</sup>P < 0.01 versus control group; <sup>∗</sup>P < 0.05, <sup>∗∗</sup>P < 0.01, <sup>∗∗∗</sup>P < 0.001 versus model group (one-way analysis of variance with Dunnett's post hoc test).

Figure caption (fragment): <sup>∗</sup>P < 0.05, <sup>∗∗</sup>P < 0.01, <sup>∗∗∗</sup>P < 0.001 versus model group (one-way analysis of variance with Dunnett's post hoc test).

Therefore, in this paper we proposed a systems pharmacology-based Q-marker discovery strategy. This strategy, which integrates target prediction databases of Chinese medicine with disease databases, facilitates our understanding of effective components and was successfully applied to the study of HQD. As a result, nine compounds were filtered out as potential markers, which interacted with 10 diarrhea-related targets including PTGS2, NOS2, and TNF. Previous studies have revealed that PHY906, the modified formulation derived from HQD, exerted its effect on intestinal toxicity by inhibiting PTGS2, NOS2, and TNF (Lam et al., 2010). These results support the feasibility of our strategy to some extent.

Another major challenge in Q-marker investigation is how to validate whether the selected Q-markers are responsible for the holistic efficacy of a TCM. We therefore borrowed the concept of bioactive equivalent combinatorial compounds. Reference standards of the selected compounds were pooled as a component combination according to their corresponding concentrations in HQD. Irinotecan-induced NCM460 damage was chosen as a model to study the bioactive equivalence between the component combination and HQD. At the first stage, we used only the cell survival rate as the parameter. The result was disappointing: HQD showed no effect on cell survival rate (**Supplementary Figure S3**), which was inconsistent with the in vivo experiment (Wang X. et al., 2017). We speculated that cell survival rate was not a sensitive parameter and that a mechanism-based experiment should be designed. According to the C-T network of HQD in the treatment of diarrhea, wogonin, norwogonin, and oroxylin A could affect PTGS2 and NOS2 activity simultaneously, and liquiritigenin, baicalin, baicalein, and scutellarein were also associated with PTGS2. Previous studies have revealed that paeoniflorin, baicalin, baicalein, and wogonin could decrease production of tumor necrosis factor-α (TNF-α) (Kwak et al., 2014; Zhai and Guo, 2016). LPS could stimulate intestinal damage and increase the expression of inflammatory factors at the same time (Chen et al., 2001; Huang et al., 2007), despite having little influence on NCM460 cell survival rate (**Supplementary Figure S4**). Thus, LPS-stimulated NCM460 damage (Bhattacharyya et al., 2008) and LPS-stimulated THP-1-derived macrophage inflammation (Perezperez et al., 1995) were used as cell models to perform the bioactive equivalence assessment, using cell mobility and TNF-α and PGE<sub>2</sub> release as sensitive parameters.

#### CONCLUSION

The discovery and validation of Q-markers still face enormous challenges, even though the concept has been presented and great efforts have been made. In this study, a systems pharmacology-based strategy was proposed to discover the Q-markers of TCM. Compared with other approaches to establishing Q-markers, systems pharmacology helps to find effect-associated markers faster and takes full advantage of existing data. Using this strategy, nine compounds in HQD were screened out to compose a component combination. The component combination has been validated to be almost bioactively equivalent to the original decoction and could be deemed the Q-markers of HQD. Systems pharmacology is therefore a promising route to Q-marker discovery for ensuring the efficacy and batch-to-batch consistency of TCM. A limitation of this study is that the contribution of each component has not been clarified, which warrants further research.

#### AUTHOR CONTRIBUTIONS

X-mD carried out most of the studies, performed the statistical analysis, and wrote the manuscript. D-nC performed the composition identification experiment of HQD. JW and WZ provided professional advice. Z-jZ and F-gX designed the study and revised the manuscript. All authors gave approval to the final version.

#### FUNDING

This work was supported by the NSFC (Nos. 81773861 and 81302733), Macao Science and Technology Development Fund (FDCT, No. 006/2015/A1), the Program for Jiangsu Province Innovative Research Team, the Program for New Century Excellent Talents in University (No. NCET-13-1036), a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Open Project Program of Guangxi Key Laboratory of Traditional Chinese Medicine Quality Standards.

#### ACKNOWLEDGMENTS

The authors are grateful to Suyun Yu and Xu Wang from Nanjing University of Traditional Chinese Medicine for technical assistance in cell culture and network pharmacology, respectively.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2018.00236/full#supplementary-material

FIGURE S1 | Fingerprint chromatography of HQD.

FIGURE S2 | Quantitative determination results of potential Q-markers.

FIGURE S3 | Influence of CPT-11 and HQD on NCM460 cell survival rate.

FIGURE S4 | Influence of LPS and HQD on NCM460 cell survival rate.

TABLE S1 | Constituents of herbs in HQD.

TABLE S2 | All targets of constituents in HQD.

TABLE S3 | All 64 targets related to diarrhea.

TABLE S4 | Active constituents of herbs in HQD and their corresponding targets related to diarrhea.

TABLE S5 | The degree of compounds and targets.

TABLE S6 | The identified components of HQD in our lab.

## REFERENCES



application in nervous and mental diseases. Evid. Based Complement. Altern. Med. 2016:9146378. doi: 10.1155/2016/9146378


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Dai, Cui, Wang, Zhang, Zhang and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Chemical Analysis and Multi-Component Determination in Chinese Medicine Preparation Bupi Yishen Formula Using Ultra-High Performance Liquid Chromatography With Linear Ion Trap-Orbitrap Mass Spectrometry and Triple-Quadrupole Tandem Mass Spectrometry

#### Edited by:

*Jiang Xu, China Academy of Chinese Medical Sciences, China*

#### Reviewed by:

*Wei Zhang, Macau University of Science and Technology, Macau Xionghao Lin, Howard University, United States*

#### \*Correspondence:

*Zhi-hai Huang zhhuang7308@163.com Xu-sheng Liu liuxu801@126.com Xiao-hui Qiu qiuxiaohui@gzucm.edu.cn*

*†These authors have contributed equally to this work.*

#### Specialty section:

*This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology*

Received: *12 April 2018* Accepted: *14 May 2018* Published: *08 June 2018*

#### Citation:

*Zhang J, Xu W, Wang P, Huang J, Bai J, Huang Z, Liu X and Qiu X (2018) Chemical Analysis and Multi-Component Determination in Chinese Medicine Preparation Bupi Yishen Formula Using Ultra-High Performance Liquid Chromatography With Linear Ion Trap-Orbitrap Mass Spectrometry and Triple-Quadrupole Tandem Mass Spectrometry. Front. Pharmacol. 9:568. doi: 10.3389/fphar.2018.00568*

Jing Zhang†, Wen Xu†, Peng Wang, Juan Huang, Jun-qi Bai, Zhi-hai Huang\*, Xu-sheng Liu\* and Xiao-hui Qiu\*

*Guangdong Provincial Key Laboratory of Clinical Research on Traditional Chinese Medicine Syndrome, The Second Clinical Medical College of Guangzhou University of Chinese Medicine, Guangdong Provincial Hospital of Traditional Chinese Medicine, Guangzhou, China*

Bupi Yishen Formula (BYF), a Chinese medicine preparation, has been clinically applied in the treatment of chronic kidney disease and for delaying its progression. Nevertheless, the chemical components of BYF have yet to be fully clarified. Ultra-high performance liquid chromatography with linear ion trap-Orbitrap mass spectrometry (UHPLC-LTQ-Orbitrap-MS<sup>n</sup>) and triple-quadrupole tandem mass spectrometry (UHPLC-TQ-MS/MS) methods were developed for qualitative chemical profiling and quantitative multi-component analysis of BYF. Chromatographic separation was performed on a Phenomenex Kinetex C<sub>18</sub> column (2.1 mm i.d. × 100 mm, 1.7 µm) using gradient elution with water (A) and acetonitrile (B), both containing 0.1% formic acid. Eighty-six compounds, including flavones, saponins, phenolic acids, and other compounds, were identified or tentatively deduced by comparing their retention behaviors, accurate mass measurements, and characteristic fragment ions with those of reference substances or literature data. Among the herbal medicinal materials of the formula, Astragali Radix, Codonopsis Radix, Salviae Miltiorrhizae Radix Rhizoma, and Polygoni Multiflori Radix Praeparata contributed the bulk of the dissolved metabolites in the formula extract. In addition, seven analytes were simultaneously determined by a validated UHPLC-TQ-MS/MS method, which successfully quantified the major components in BYF. The study indicated that the established qualitative and quantitative methods are potent and dependable analytical tools for characterizing multiple constituents in complex prescription decoctions, and it provides a basis for the evaluation of bioactive components in BYF.

Keywords: Chinese medicine preparation, chemical analysis, multi-component determination, linear ion trap-Orbitrap, quality control

## INTRODUCTION

Chinese herbal medicine is the main form of clinical prevention and treatment in Traditional Chinese Medicine (TCM). It is composed of many different ingredients, and the organic combination of these ingredients differs from the simple addition of individual ingredients. The constituents that form the material basis of Chinese herbal medicine coordinate and interact with each other to achieve an integrated function. Unlike western medicine research, studies on Chinese herbal compound prescriptions emphasize the integrity of the complex prescription, which should not be separated from the intrinsic characteristics of TCM in pursuit of monomer compounds (Wang et al., 2005). The material basis of a single herb or prescription is a set of active substance groups. These groups are combined according to certain compatibility requirements, act on multiple targets, and thus have pleiotropic effects through multiple pathways (Xiong et al., 2015). Therefore, it is imperative to use modern advanced techniques to explain the material basis of Chinese herbal medicine and to elaborate the connotation of compatibility and its curative effect.

Bupi Yishen Formula (BYF) is a nine-herb combination preparation of TCM that possesses the basic characteristics of TCM formula compatibility. BYF is prepared from the extract mixture of nine herbs, namely Astragali Radix, Codonopsis Radix, Atractylodis Macrocephalae Rhizoma, Poria, Dioscoreae Rhizoma, Polygoni Multiflori Radix Praeparata, Cuscutae Semen, Coicis Semen, and Salviae Miltiorrhizae Radix Rhizoma (Liu et al., 2012). BYF is used clinically to treat chronic kidney disease and delay its progression, including postponing the symptoms of chronic renal failure, deferring early and mid-stage renal dysfunction, delaying the start of dialysis, and protecting residual renal function (Mao et al., 2015). Modern pharmacological studies revealed that the decoction could effectively delay the decline in glomerular filtration rate (GFR) in patients with stage 4 chronic kidney disease. Detecting and identifying the major components in BYF is therefore a prerequisite, and the key step, for disclosing the active constituents and how they produce their effects.

In recent years, reports on the global characterization of complicated ingredients in TCM prescriptions have continued to grow steadily, owing to the rapid development of hyphenated and hybrid mass spectrometry (MS) techniques. Several analytical methods have performed well in the analysis of unknown targets from TCM prescriptions, including LC-ESI/MS (Dou et al., 2009; Shaw et al., 2012), LC-TOF/MS (Sun et al., 2013), and LC/MS-IT-TOF (Hao et al., 2008; Liu et al., 2016).

Ultra-high performance liquid chromatography (UHPLC) has been utilized in many bioanalytical fields in recent years because of its rapid analysis and excellent separation (Simons et al., 2009; Ha et al., 2013). Equipped with a relatively short column operated at a low flow rate, UHPLC usually requires a remarkably shorter analysis time to achieve the same separation efficiency as HPLC. The hybrid LTQ-Orbitrap analytical platform, composed of an ion trap coupled with an Orbitrap mass analyzer, allows two scan types to be acquired at the same time. The Orbitrap provides higher mass accuracy (<3 ppm) and mass resolution than many other mass spectrometers, which makes it suitable for determining exact molecular formulas (Dunn et al., 2008; Tchoumtchoua et al., 2013). Moreover, multi-stage MS<sup>n</sup> spectra can be acquired in the ion trap by data-dependent scanning, which triggers fragment spectra of target ions and avoids duplication through dynamic exclusion settings, thereby minimizing total analysis time (Qiu et al., 2013). Thus, the LTQ-Orbitrap platform provides elemental compositions as well as multiple-stage mass data, allowing fast, sensitive, and reliable detection and facilitating the identification of unknown compounds. Constituents of BYF can be structurally classified by their carbon skeletons; compounds of each type should share a similar fragmentation pathway and hence generate common characteristic product ions. Mass spectral analysis for structural identification can therefore be facilitated by strategies built on these shared pathways. In our previous studies, the combination of UHPLC and LTQ-Orbitrap-MS<sup>n</sup> was successfully used to analyze multiple components in single herbal extracts (Xu et al., 2014; Wang et al., 2015; Zhang et al., 2015). In this study, we attempt to apply it to the detection and identification of the constituents of a TCM prescription, which contains hundreds of different chemical constituents.

The present work attempted to establish an expeditious UHPLC-LTQ-Orbitrap-MS<sup>n</sup> approach for the rapid separation and reliable identification of major constituents in BYF extract. Several strategies were used during the process, such as diagnostic fragment ion screening and fragment monitoring. In the decoction, eighty-six components altogether were identified or tentatively identified according to retention time and MS spectral data. In addition, a quantitative analysis approach was established by ultra-high performance liquid chromatography with triple-quadrupole mass spectrometry (UHPLC-TQ-MS/MS). Seven unequivocally identified representative compounds of relatively high content were selected as marker components to evaluate the quality of BYF. The UHPLC-LTQ-Orbitrap-MS<sup>n</sup> and UHPLC-TQ-MS/MS platforms proved to be potent tools for rapid qualitative and quantitative analysis of complicated constituents from natural resources, and the study facilitates the comprehensive quality control of BYF.

#### EXPERIMENTAL

#### Chemicals, Reagents, and Materials

Chemical references including calycosin-7-O-β-D-glucopyranoside, calycosin, formononetin, astragalin, salvianolic acid B, (E)-2,3,5,4′-tetrahydroxystilbene-2-O-glucopyranoside ((E)-THSG), Astragaloside I, Astragaloside II, Astragaloside III, Astragaloside IV, soyasaponin I, lobetyolin, and emodin were bought from Must (Chengdu, China). Rosmarinic acid, lithospermic acid, and formononetin-7-O-glucopyranoside were from Yuanye (Shanghai, China). Salvianolic acid A was purchased from Feiyu (Jiangsu, China). Isomucronulatol-7-O-glucoside and 9,10-dimethoxypterocarpan-3-O-β-D-glucopyranoside were isolated from Astragalus membranaceus and provided by Prof. Zhu Dayuan from Shanghai Institute of Materia Medica. (S)-THSG, emodin-8-O-β-D-glucoside, and physcion-8-O-β-D-glucoside were isolated from Polygonum multiflorum in our lab. The purity of each standard was determined by HPLC (≥95%) and their structures were confirmed by MS, <sup>1</sup>H-NMR, and <sup>13</sup>C-NMR. All references were dissolved in methanol at a concentration of 50.0 µg/mL.

HPLC-grade acetonitrile, methanol, and formic acid were from Sigma Aldrich (MO, USA). Ultra-pure water was prepared by a Milli-Q water system (Millipore, MA, USA). Other reagents and chemicals were of analytical grade.

Astragali Radix (No. 11050419, Neimenggu), Codonopsis Radix (No. 110653271, Gansu), Atractylodis Macrocephalae Rhizoma (No. 110600711, Zhejiang), Poria (No. 110506341, Hunan), Dioscoreae Rhizoma (No. 121001014, Henan), Polygoni Multiflori Radix Praeparata (No. 110400831, Henan), Cuscutae Semen (No. 110502581, Shandong), Coicis Semen (No. 110600371, Guizhou), and Salviae Miltiorrhizae Radix Rhizoma (No. 110601741, Anhui) were from Kangmei (Guangdong, China). They were authenticated by Dr. Huang Zhihai and the specimens were preserved in Guangdong Provincial Hospital of TCM. Two batches of BYF concentrated granule were produced at pilot scale by Peili Pharmaceutical Co., Ltd. (Nanning, China).

## Preparation of Calibration Standard Solutions

Standards of the seven compounds were accurately weighed and dissolved separately in methanol to prepare individual stock solutions. A mixed stock solution was obtained by combining the seven stock solutions, giving concentrations of 15.30 µg/mL for calycosin-7-O-Glc, 6.45 µg/mL for calycosin, 644.80 µg/mL for (E)-THSG, 6.03 µg/mL for astragalin, 15.60 µg/mL for rosmarinic acid, 8.10 µg/mL for salvianolic acid A, and 0.812 µg/mL for salvianolic acid B. Daidzein (50.0 µg/mL) was also prepared in methanol as the internal standard (IS) stock solution. To construct calibration curves, the mixed stock solution was serially diluted to 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the original concentration. In a 2 mL volumetric flask, 0.2 mL of each diluted solution and 100 µL of the IS solution were added, and the volume was made up to 2 mL with 18% aqueous methanol. The resulting solutions were stored at 4 °C in a refrigerator until use. All the solutions were filtered through 0.22 µm membranes before analysis.
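The dilution scheme above can be written out as a short calculation. The sketch below is illustrative only: it assumes that each of the seven diluted levels (1/2 to 1/128 of the mixed stock) is further diluted tenfold in the flask (0.2 mL made up to 2 mL), and the function and variable names are hypothetical rather than taken from the study.

```python
# Illustrative sketch of the calibration dilution scheme described in the text.
# Names are hypothetical; concentrations are the reported mixed-stock values.

STOCK_UG_PER_ML = {
    "calycosin-7-O-Glc": 15.30,
    "calycosin": 6.45,
    "(E)-THSG": 644.80,
    "astragalin": 6.03,
    "rosmarinic acid": 15.60,
    "salvianolic acid A": 8.10,
    "salvianolic acid B": 0.812,
}

DILUTION_FACTORS = [2, 4, 8, 16, 32, 64, 128]  # 1/2 ... 1/128 of the stock
ALIQUOT_ML = 0.2          # volume of each diluted solution added to the flask
FINAL_VOLUME_ML = 2.0     # volumetric flask size
IS_STOCK_UG_PER_ML = 50.0 # daidzein internal standard stock
IS_ALIQUOT_ML = 0.1       # 100 µL of IS stock per flask


def calibration_levels(stock_ug_per_ml):
    """Final in-flask concentrations (µg/mL) for the seven dilution levels."""
    flask_dilution = ALIQUOT_ML / FINAL_VOLUME_ML
    return [stock_ug_per_ml / f * flask_dilution for f in DILUTION_FACTORS]


def internal_standard_level():
    """Final in-flask concentration (µg/mL) of the internal standard."""
    return IS_STOCK_UG_PER_ML * IS_ALIQUOT_ML / FINAL_VOLUME_ML


if __name__ == "__main__":
    for name, stock in STOCK_UG_PER_ML.items():
        levels = ", ".join(f"{c:.5g}" for c in calibration_levels(stock))
        print(f"{name}: {levels} µg/mL")
    print(f"IS (daidzein): {internal_standard_level():.2f} µg/mL")
```

Under these assumptions the internal standard is present at the same concentration in every calibration level, which is what allows the analyte/IS peak area ratio to be used as the calibration response.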

#### Sample Preparation

(i) Extraction of crude drugs: A total of 125 g of dry pieces of the nine medicinal materials was mixed according to the prescription ratio, extracted three times with boiling water (1:10) for 45, 30, and 30 min, respectively, and filtered through gauze. The three filtrates were then combined and evaporated under vacuum at 56 °C to remove the solvent, yielding the BYF extract. The extract was transferred into a 250 mL volumetric flask and made up to volume with 10% methanol solution (final crude drug concentration of 0.5 g/mL). Solid-phase extraction (SPE) on a C-18 column (ProElut, 200 mg, 3 mL column volume), which had been conditioned with methanol (2 mL) and water (2 mL), was used for pretreatment. After 1.0 mL of BYF extract was loaded, the column was washed with 10% methanol (2 mL) and slowly eluted with 1.0 mL of 100% methanol. The dry pieces of each herb were processed by the same procedure to obtain the individual decoctions. All the sample solutions were passed through 0.22 µm membranes prior to analysis.

(ii) Pretreatment of decoction for quantitative study: The lyophilized powder of BYF decoction was produced by freeze-drying; 11.20 g of lyophilized powder was obtained from 100 mL of BYF decoction. Five hundred and sixty milligrams of BYF was filtered, a 0.1 mL portion of which was added with 10 µL of the IS solution and then diluted with methanol to 5 mL. The sample solutions were passed through 0.22 µm membranes.

(iii) Pretreatment of BYF concentrated granule: Concentrated granule (2.0 g) was accurately weighed, dissolved in 100 mL of 80% aqueous methanol, and then refluxed for 1 h. The extract was cooled to room temperature and weighed, and the weight lost was made up with 80% methanol. The solution was then treated in the same way as in (ii).

## UHPLC-LTQ-Orbitrap-MS<sup>n</sup> Conditions

Chromatographic separation was conducted on a Thermo Accela UHPLC system (San Jose, USA) comprising an autosampler, a quaternary pump, a diode-array detector (DAD), and a column compartment set to room temperature. A Phenomenex Kinetex C<sub>18</sub> column (2.1 mm i.d. × 100 mm, 1.7 µm) was used for sample separation. The mobile phase was a mixture of water (A) and acetonitrile (B), both containing 0.1% formic acid. The elution gradient was set as follows: 0–12 min (10–25% B), 12–25 min (25–32% B), 25–42 min (32–56% B), 42–51 min (56–95% B). The injection volume was 2 µL and the mobile phase flow rate was 200 µL/min.
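The gradient program can be expressed as a small lookup for the mobile phase composition at any time point. This is a minimal sketch under two assumptions that the text does not state explicitly: each segment is a linear ramp, and the third segment begins at 25 min, where the second one ends. The names `GRADIENT` and `percent_b` are hypothetical.

```python
# Minimal sketch: percentage of acetonitrile (B) at a given time,
# assuming linear ramps between the breakpoints of the qualitative gradient.

GRADIENT = [  # (time in min, % B)
    (0.0, 10.0),
    (12.0, 25.0),
    (25.0, 32.0),  # assumed start of the third segment
    (42.0, 56.0),
    (51.0, 95.0),
]


def percent_b(t_min):
    """Linearly interpolate % B at time t_min within the gradient program."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 6, 12, 25, 42, 51):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```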

For qualitative experiments, a Thermo Fisher Scientific LTQ-Orbitrap XL hybrid mass spectrometer (Bremen, Germany) was coupled to the LC instrument via an electrospray ionization (ESI) interface. The samples were analyzed in negative mode. The ESI parameters were set as follows: spray voltage, −3.5 kV; capillary temperature, 325 °C; tube lens voltage, −76 V; sheath gas and auxiliary gas, 45 and 6 units, respectively. The Orbitrap mass analyzer acquired full scans over m/z 120–1,200 at a resolution of 30,000 in centroid mode. In data-dependent MS<sup>n</sup> acquisition, the most intense ions were always selected for online MS<sup>2</sup>-MS<sup>3</sup> analysis by FT and MS<sup>4</sup>-MS<sup>5</sup> analysis by LTQ, and dynamic exclusion was applied during the process to prevent repetition. Dynamic exclusion parameters were set as follows: repeat count, 2; repeat duration, 0.35 min; exclusion duration, 1.0 min; exclusion mass width, 3 amu. The collision energy for collision-induced dissociation (CID) was set to 30% of maximum.

The number and types of expected atoms were fixed as follows for possible elemental composition of components: carbons ≤50, hydrogens ≤80, oxygens ≤30, nitrogens ≤2. The accuracy error threshold was set at 3 ppm. The software of Thermo Fisher Scientific Xcalibur 2.1 was applied for data analysis.
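The elemental constraints and the 3 ppm threshold can be illustrated with a short check. The sketch below is not the Xcalibur procedure; it simply computes the theoretical [M-H]<sup>−</sup> m/z of a candidate formula from standard monoisotopic masses and compares it with a measured value. The example uses salvianolic acid B (C36H30O16, measured m/z 717.14465), which is reported later in this article; the function names are hypothetical.

```python
# Illustrative check of a candidate formula against the stated limits
# (C <= 50, H <= 80, O <= 30, N <= 2) and the 3 ppm accuracy threshold.

MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858
LIMITS = {"C": 50, "H": 80, "O": 30, "N": 2}
PPM_THRESHOLD = 3.0


def within_limits(formula):
    """True if the formula uses only allowed elements within the atom limits."""
    return all(el in LIMITS and n <= LIMITS[el] for el, n in formula.items())


def deprotonated_mz(formula):
    """Theoretical m/z of [M-H]- for a neutral formula given as {element: count}."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in formula.items())
    return neutral - MONOISOTOPIC["H"] + ELECTRON


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    salvianolic_acid_b = {"C": 36, "H": 30, "O": 16}
    theoretical = deprotonated_mz(salvianolic_acid_b)
    error = ppm_error(717.14465, theoretical)
    print(f"theoretical [M-H]-: {theoretical:.5f}")
    print(f"mass error: {error:.2f} ppm")
    print("accepted:", within_limits(salvianolic_acid_b) and abs(error) <= PPM_THRESHOLD)
```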

#### UHPLC-TQ-MS/MS Analysis

An Accela™ UHPLC system and a Thermo Scientific TSQ Quantum Ultra triple-quadrupole spectrometer (San Jose, USA) fitted with an ESI probe were employed for quantitative analysis. The separation column, column temperature, and mobile phase were identical to those of the qualitative conditions, with a gradient elution of 18–39% B at 0–5 min, 39–65% B at 5–7 min, 65–95% B at 7–9 min at 10–12 min with a flow rate of 250 µL/min. The injection volume was set at 5 µL.

Multiple reaction monitoring (MRM) was used for MS data acquisition under the following conditions: capillary temperature, 400 °C; capillary voltage, 2.5 kV for negative mode and 3.0 kV for positive mode; sheath gas (N<sub>2</sub>) pressure, 40 psi; auxiliary gas pressure, 8 psi; dwell time, 100 ms. The detection parameters of the target compounds are summarized in **Table 1**. Peak areas of each analyte and the IS acquired in MRM mode were used to establish the calibration curves. Data were collected and analyzed with Thermo Xcalibur 2.1.0 software.

#### Validation of Quantitative Method

The linear calibration curves were established from the analyte/IS peak area ratio of each analyte. Diluted standard solutions were successively analyzed until signal-to-noise ratios (S/N) of 3:1 and 10:1 were reached, which defined the limit of detection (LOD) and limit of quantification (LOQ) of each target compound, respectively. The intra-day precision was evaluated by six determinations within 1 day, while the inter-day precision was assessed on 3 consecutive days. Repeatability was assessed with six independent sample solutions prepared by the identical procedure in section Sample Preparation, and variations were expressed as the relative standard deviation (RSD). One sample solution was tested at room temperature at different times within 24 h for stability evaluation. The recovery test was carried out by adding known amounts of mixed reference solution to sample solutions at three concentration levels.
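The validation metrics named above reduce to a few standard formulas. The sketch below shows one common way to compute them; the formulas are the usual textbook definitions (sample standard deviation for RSD, spiked recovery as found minus original over added), which are assumed here because the text does not spell them out. All numerical values in the usage section are made-up placeholders, not data from the study.

```python
# Illustrative helpers for method validation metrics (hypothetical names).
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def recovery_percent(found, original, spiked):
    """Spiked recovery (%): (amount found - original amount) / amount added."""
    return 100.0 * (found - original) / spiked


def meets_lod(signal_to_noise):
    """LOD criterion stated in the text: S/N of at least 3:1."""
    return signal_to_noise >= 3.0


def meets_loq(signal_to_noise):
    """LOQ criterion stated in the text: S/N of at least 10:1."""
    return signal_to_noise >= 10.0


if __name__ == "__main__":
    # Placeholder peak-area ratios for six replicate injections.
    replicates = [0.98, 1.01, 1.00, 0.99, 1.02, 1.00]
    print(f"intra-day RSD: {rsd_percent(replicates):.2f}%")
    # Placeholder amounts (same unit) for one spiking level.
    print(f"recovery: {recovery_percent(found=14.8, original=10.0, spiked=5.0):.1f}%")
```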

#### RESULTS AND DISCUSSION

### Optimization of Extraction Procedure and Analysis Conditions

Variable factors in the extraction procedure for BYF granule, including extraction solvent (water, 50, 80, and 100% methanol), method (reflux and sonication), solvent volume (30, 60, and 100 mL), and time (15, 30, 45, and 60 min), were optimized to extract the compounds efficiently. The optimized method was to extract the BYF granule with 100 mL of 80% methanol by refluxing for 1 h.

The UHPLC conditions, including the type of column, column temperature, mobile phase system, and flow rate, were optimized. The Phenomenex Kinetex C<sub>18</sub> column was selected based on our previous multi-constituent analyses. Different mobile phases were also tested (acetonitrile and methanol with added modifiers, including formic acid, acetic acid, and ammonium acetate). A combination of acetonitrile and water, both containing 0.1% (v/v) formic acid, was found to be not only compatible with MS analysis but also suitable for separating the compounds for qualitative analysis. Comparison of the total ion chromatograms (TICs) in negative and positive modes showed that the majority of components gave a more sensitive response in negative mode, so the MS<sup>n</sup> data were acquired in negative mode. The total ion chromatogram of BYF was acquired for structure confirmation (shown in **Figure 1**).

## Characterization of Constituents in BYF Extract

BYF extract was analyzed using the optimized UHPLC-LTQ-Orbitrap-MS<sup>n</sup> method. To elucidate the chemical components, known compounds were identified by comparison with the data of reference standards. Based on the MS<sup>n</sup> analysis of the authentic compounds, the characteristic fragmentation behaviors of each structural type sharing the same carbon skeleton were summarized, and the resulting rules were applied to the structural characterization of their derivatives. For other unknown compounds, the structures were tentatively identified according to their MS<sup>n</sup> spectra and previously published data. Eighty-six compounds in all were identified or tentatively identified (**Table 2**), including 15 flavones, 10 saponins, 12 phenolic acids, and other compounds. The nine herbs made markedly different chemical contributions to BYF. Specifically, the major constituents in BYF extract came from Astragali Radix (25 compounds), Codonopsis Radix (18 compounds), Salviae Miltiorrhizae (11 compounds), Cuscutae Semen (16 compounds), and Polygoni Multiflori Radix Praeparata (6 compounds).


#### Compounds From Astragali Radix

Isoflavones and saponins, the major bioactive compounds in Astragali Radix, have various effects such as tonic, immunostimulant, cardioprotective, diuretic, and hepatoprotective properties (Xu et al., 2006; Auyeung et al., 2009). In our work, a total of 20 compounds from Astragali Radix were characterized in BYF, including 12 isoflavones and 8 saponins. By comparison with the information from reference standards, calycosin-7-O-β-D-glycoside, ononin, calycosin, formononetin, isomucronulatol-7-O-β-D-glucoside, 9,10-diMP-3-O-glucoside, 9,10-di-methoxypterocarpan-3-O-β-D-glucopyranoside, Astragaloside I, Astragaloside II, Astragaloside III, Astragaloside IV, and soyasaponin I were identified. Based on the MS<sup>n</sup> analysis of these authentic compounds, the characteristic fragmentation behaviors of isoflavones and saponins were proposed in our previous study (Zhang et al., 2015) and were applied to the structure elucidation of their derivatives.

The MS<sup>2</sup> spectra of Compound 39 and Compound 55 exhibited characteristic product ions [M-C2H2O]<sup>−</sup> (m/z 475.1 and 429.1) and [M-glu-C2H2O]<sup>−</sup> (m/z 283.1 and 267.1), and their characteristic product ions yielded from the aglycone ion coincided with those of calycosin and formononetin. Based on the cleavage rules of loss of acetyl (42 Da) and acetylglucosyl (204 Da) groups, the two compounds were deduced as acetylglucoside of calycosin and formononetin.

Astragalosides in the BYF decoction were mainly built on the cycloastragenol aglycone, and their MS<sup>2</sup> spectra showed aglycone ions or ions originating from the neutral loss of different glycosyl moieties. Taking Compound 69 as an example, the [M-H]<sup>−</sup> ion appeared at m/z 783.45020 (C<sub>41</sub>H<sub>67</sub>O<sub>14</sub><sup>−</sup>); it readily lost its sugar units in the MS<sup>2</sup> spectrum and gave typical product ions at m/z 651, 621, and 489 from the loss of one xylose ([M-132 (xyl)]<sup>−</sup>), one glucose ([M-162 (glu)]<sup>−</sup>), and one xylose plus one glucose ([M-132 (xyl)-162 (glu)]<sup>−</sup>), respectively. In addition, one soyasaponin (Compound 78) of lower content from Astragali Radix was found in the BYF decoction.
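The neutral-loss arithmetic behind these assignments is simple enough to verify directly. The sketch below uses nominal masses and takes the glycosidic losses as 132 Da for a xylosyl residue and 162 Da for a glucosyl residue; the 162 Da value is an assumption chosen because it reproduces the reported product ion at m/z 621. The function name is hypothetical.

```python
# Nominal-mass sketch of the sugar neutral losses for Compound 69 ([M-H]- at m/z 783).

XYLOSYL_LOSS = 132   # C5H8O4, nominal mass
GLUCOSYL_LOSS = 162  # C6H10O5, nominal mass (assumed; reproduces m/z 621)


def sugar_loss_ions(precursor_mz):
    """Expected nominal product ions after loss of xylose, glucose, or both."""
    return {
        "-Xyl": precursor_mz - XYLOSYL_LOSS,
        "-Glc": precursor_mz - GLUCOSYL_LOSS,
        "-Xyl-Glc": precursor_mz - XYLOSYL_LOSS - GLUCOSYL_LOSS,
    }


if __name__ == "__main__":
    for label, mz in sugar_loss_ions(783).items():
        print(f"{label}: m/z {mz}")
```

The three computed values coincide with the product ions reported for Compound 69 (m/z 651, 621, and 489).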

#### Compounds From Codonopsis Radix

The identified compounds of Codonopsis Radix in BYF can be classified into four main classes, namely, phenylpropanoid glycosides, acetylene glycosides, hexyl (hexenol) glycosides (Lin et al., 2013). In (-)ESI-MS spectra of BYF decoction, apart from phenylpropanoid glycosides (including compounds 8, 12, 23, 37, and 40) existing in [M-H]<sup>−</sup> ion forms, others displayed as both [M+HCOO]<sup>−</sup> and [M-H]<sup>−</sup> ions.

By comparing the retention times and mass data with those of the references, Compounds 8 and 30 were unambiguously identified as tangshenoside I and lobetyolin, which are the representative phenylpropanoid glycoside and acetylene glycoside in Radix Codonopsis, respectively. Their MS<sup>n</sup> spectra and proposed fragmentation pathways are summarized in **Figures 2A–C**, **3**, **4**, respectively. The [M-H]<sup>−</sup> ion and the typical ions in the MS<sup>2</sup> spectra of Compound 19 (C<sub>26</sub>H<sub>37</sub>O<sub>13</sub><sup>−</sup>), namely, m/z 557.2 [M-H]<sup>−</sup>, 467.2 [M-C7H6]<sup>−</sup>, and 341.1 [M-C14H17O2]<sup>−</sup>, were all 162 Da greater than those of Compound 30, demonstrating that the two compounds cleave at identical sites. Compound 19 was therefore characterized as lobetyolinin by comparison with the literature (Kanji et al., 2003).

Compounds 37 and 40 were identified as diastereomers by the same deprotonated ions at m/z 823.2658 (C<sub>38</sub>H<sub>47</sub>O<sub>20</sub><sup>−</sup>) and the same product ions at m/z 497.2, m/z 453.2, and m/z 261.1, and they could be differentiated by their elution order. Their log P values calculated by Discovery Studio were 0.47 and 0.59. As cis-isomers with lower polarity could be eluted relatively later than trans-isomers, compounds 37 and 40 were identified as 6′′′-trans- and 6′′′-cis-p-coumaroyl-tangshenoside I.

#### TABLE 2 | Compounds detected and identified in BYF decoction.

*<sup>a</sup>[M+HCOO]<sup>−</sup>; <sup>b</sup>[M+H]<sup>+</sup>; <sup>c</sup>only detected in positive mode.*

*Cs, Cuscutae Semen; Cr, Codonopsis Radix; Pm, Polygoni Multiflori Radix Praeparata; Ar, Astragali Radix; Sm, Salviae Miltiorrhizae Radix Rhizoma; Co, Coicis Semen; Am, Atractylodis Macrocephalae Rhizoma; Dr, Dioscoreae Rhizoma; P, Poria; pen, pentoside; glup, glucopyranoside.*

Compounds 9, 13, and 17 exhibited [M+HCOO]<sup>−</sup> precursor ions at m/z 471.20740, 441.19666, and 309.15485, and their MS<sup>2</sup> and MS<sup>3</sup> spectra all yielded ions at m/z 263.1 (C<sub>12</sub>H<sub>23</sub>O<sub>6</sub><sup>−</sup>) and 161.1 (C<sub>6</sub>H<sub>9</sub>O<sub>5</sub><sup>−</sup>) as the base peak, respectively. It is inferred that Compounds 9 and 13 were, respectively, substituted by an additional glucopyranoside and pentoside compared to Compound 17. Compounds 9, 13, and 17 were tentatively characterized as hexyl β-sophoroside, hexyl-(pen)-glucopyranoside, and hexyl β-D-glucopyranoside.
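The inference that Compounds 9 and 13 carry an extra hexose and pentose relative to Compound 17 can be checked from the accurate-mass differences between the precursor ions. The following sketch is an illustration rather than the authors' workflow: it compares each difference with the monoisotopic masses of anhydro-hexose (C6H10O5) and anhydro-pentose (C5H8O4) residues, using an arbitrary tolerance of 0.002 Da that is assumed here for demonstration.

```python
# Illustrative matching of precursor mass differences to glycosyl residues.

MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

RESIDUES = {
    "hexose (C6H10O5)": {"C": 6, "H": 10, "O": 5},
    "pentose (C5H8O4)": {"C": 5, "H": 8, "O": 4},
}


def residue_mass(formula):
    """Monoisotopic mass of a residue given as {element: count}."""
    return sum(MONOISOTOPIC[el] * n for el, n in formula.items())


def match_residue(mz_larger, mz_smaller, tol_da=0.002):
    """Name of the residue whose mass matches the m/z difference, else None."""
    delta = mz_larger - mz_smaller
    for name, formula in RESIDUES.items():
        if abs(delta - residue_mass(formula)) <= tol_da:
            return name
    return None


if __name__ == "__main__":
    compound_17 = 309.15485  # [M+HCOO]- values reported in the text
    print("Compound 9 vs 17:", match_residue(471.20740, compound_17))
    print("Compound 13 vs 17:", match_residue(441.19666, compound_17))
```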

Isomers (Compounds 4 and 6) were found in the EIC of m/z 423. Both displayed an [M-H]<sup>−</sup> ion at m/z 423.185 76 (C18H31O11<sup>−</sup>), and their MS<sup>2</sup> spectra both exhibited ions at m/z 261.1 [M-H-C6H10O5]<sup>−</sup> and 161.1 (C6H9O5<sup>−</sup>). Based on the fragmentation information and related literature (Tsai and Lin, 2010), they were preliminarily identified as (S)-3-hexenyl-β-D-sophoroside and (E)-2-hexenyl-β-D-sophoroside. Their log P values calculated by Discovery Studio were −1.9 and −2.0; because the cis-isomer eluted relatively later, Compounds 4 and 6 were assigned as (S)-3-hexenyl-β-D-sophoroside and (E)-2-hexenyl-β-D-sophoroside, respectively.

TABLE 3 | Regression equations, linearity ranges, correlation coefficients, LOD, and LOQ data of the seven analytes.


*<sup>a</sup>LOD, limit of detection.*

*<sup>b</sup>LOQ, limit of quantification.*



### Compounds From Salviae Miltiorrhizae Radix Rhizoma

Salviae Miltiorrhizae Radix Rhizoma is mainly composed of hydrophilic salvianolic acids and lipophilic diterpenoid quinones (Wu et al., 2007). Because this study used water extraction, the major constituents from Salviae Miltiorrhizae Radix Rhizoma in BYF are primarily salvianolic acids. These compounds have high molecular weights and many homologs, which display similar ESI-MS<sup>n</sup> behaviors that complicate their differentiation.

Compound 42 displayed an [M-H]<sup>−</sup> ion at m/z 717.144 65, corresponding to the molecular formula C36H30O16. Its MS<sup>2</sup> spectrum gave diagnostic fragment ions at m/z 519.1 and 321.0, arising from the loss of one and two danshensu units, respectively. In the MS<sup>3</sup> spectrum, two distinctive ions at m/z 339.1 and 321.0 were observed, resulting from the neutral loss of one danshensu unit and from the McLafferty rearrangement, respectively. CID of the ion at m/z 321.0 further produced MS<sup>4</sup> and MS<sup>5</sup> spectra. The MS<sup>n</sup> spectra and the fragmentation pattern are shown in **Figures 2D–G**, **5**. As its retention time and fragment ions were identical to those of the reference compound, Compound 42 was identified as salvianolic acid B.
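The mass assignment for Compound 42 can be reproduced numerically. The sketch below is illustrative only and is not taken from the original study: the function names are hypothetical, the danshensu formula (C9H10O5) and the atomic masses are standard values, and the fragments are predicted simply by subtracting danshensu units from the observed precursor.

```python
import re

MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}  # monoisotopic masses (Da)
PROTON = 1.00727646                                     # mass of H+ (Da)

def mono_mass(formula: str) -> float:
    """Monoisotopic mass of a simple CHO formula."""
    return sum(MONO[el] * int(n or 1)
               for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula))

def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

observed = 717.14465                              # [M-H]- reported for Compound 42
theoretical = mono_mass("C36H30O16") - PROTON     # deprotonated salvianolic acid B
danshensu = mono_mass("C9H10O5")

frag_1 = observed - danshensu   # loss of one danshensu unit
frag_2 = frag_1 - danshensu     # loss of two danshensu units

print(f"mass error: {ppm_error(observed, theoretical):.2f} ppm")
print(f"predicted fragments: {frag_1:.1f}, {frag_2:.1f}")
```

With these inputs the mass error is within a few ppm, and the predicted fragments round to the m/z 519.1 and 321.0 ions described in the text.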

As for the fragmentation pathways of salvianolic acids, which have danshensu as their parent nucleus, losses of H2O and CO as well as successive losses of danshensu were observed in their MS spectra. Based on the molecular weights and the multi-stage information provided by MS<sup>n</sup>, combined with the literature (Hu et al., 2005; Liu et al., 2007; Zhu et al., 2010), 11 phenolic acids in the prescription were identified.

TABLE 5 | Recovery of the analytes.


#### Compounds From Other Component Herbs

Phenolic constituents, including stilbenes and anthraquinones, are regarded as the main active components of Polygoni Multiflori Radix Praeparata (Chen et al., 2012). Most of the stilbenes reported previously are of the 2,3,5,4′-tetrahydroxy substituted type (Liu et al., 2011). Compounds 7, 16, 50, 62, and 82 were unambiguously identified as (E)-THSG, (S)-THSG, emodin-8-O-β-D-glucoside, physcion-8-O-β-D-glucoside, and emodin, respectively, by comparing their t<sub>R</sub> values and mass information with those of the standards. The MS and MS<sup>2</sup> spectra of Compounds 7, 16, and 20 showed characteristic ions at m/z 405.118 29 and 243.065 55, corresponding to the elemental compositions C20H21O9<sup>−</sup> and C14H11O4<sup>−</sup>, which was consistent with our previous studies (Qiu et al., 2013). The fragmentation behaviors of the anthraquinones were also in accordance with previous results.

The components of Cuscutae Semen are mainly phenolic acids and flavonoids. For example, the MS<sup>2</sup> and MS<sup>3</sup> spectra of Compounds 1, 2, and 3 were in accordance with those of 3-CQA (caffeoylquinic acid), 5-CQA, and 4-CQA reported in the literature (Zhang et al., 2007). Although the fragmentation data of 5-CQA and 4-CQA were the same, the retention time of 5-CQA was shorter in reversed-phase chromatography. The elution order of the three isomers was consistent with that reported.

Compounds 67 and 68 both exhibited an [M-H]<sup>−</sup> ion at m/z 843.421 75 (C38H67O20<sup>−</sup>), and their MS<sup>2</sup> spectra showed identical ions; they were tentatively identified as cuscutic acid C and its isomer. Five isomers of acetyl-cuscutic acid C (Compounds 71, 72, 73, 74, and 76) were revealed by ion extraction at m/z 885 from the TIC. They showed identical precursor ions and MS<sup>2</sup> spectra, and the substitution position of the acetyl group remains to be studied.

#### Validation of Quantitative Method

Seven compounds that were unequivocally identified and present at relatively high content in both the decoction and the granule were selected as marker components to evaluate the quality of BYF. Because the traditional extraction method, water decoction, was applied, the major high-content components were mainly water-soluble and highly polar compounds. These compounds were observed within the first 25 min of the UHPLC-LTQ-Orbitrap-MS chromatograms, whereas less polar compounds of much lower content emerged between 25 and 50 min. However, the relatively high peak areas of the latter compounds in the mass spectra have no direct relationship to their actual content in the samples. The seven compounds selected for quantitative analysis were mainly phenolic and flavonoid compounds. Thus daidzein, a flavonoid, was chosen as the IS due

TABLE 6 | Contents (µg/g, *n* = 3) of the seven investigated compounds in the samples of BYF extract powder and BYF granule.


samples; 1, Calycosin-7-O-Glc; 2, (E)-THSG; 3, astragalin; 4, rosmarinic acid; 5, salvianolic acid A; 6, salvianolic acid B; 7, Calycosin; IS, Daidzein.

to its similarity to the analytes in structure and polarity, and because daidzein is neither present nor detected in BYF.

Good linearity, with coefficients of determination (R<sup>2</sup>) > 0.9994, was obtained for the seven compounds. The LOD and LOQ values are listed in **Table 3**. The intra- and inter-day variations (RSD) were within the ranges of 0.24–2.99% and 0.64–3.04% for the mixed standard solution, and 0.53–2.32% and 2.16–3.72% for the sample solution, respectively (**Table 4**). Analytes in the sample solution were stable for 24 h, with an RSD <3.38%. Recoveries of the analytes ranged from 95.86 to 104.04%, with RSDs from 1.12 to 4.02% (**Table 5**). The developed UHPLC-TQ-MS/MS method was therefore considered a sensitive, repeatable, and accurate tool for the quantitative analysis of the main compounds in BYF.
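The precision and recovery figures above are conventionally derived as follows. This is a minimal sketch under stated assumptions: the source does not give the formulas, so the usual definitions are used (RSD as sample standard deviation over mean, recovery as the spiked amount found relative to the amount added), and the function names and numerical inputs are hypothetical rather than data from the study.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0

def recovery_percent(found, original, spiked):
    """Spike recovery (%): (amount found - amount originally present) / amount added."""
    return (found - original) / spiked * 100.0

# Hypothetical replicate peak-area ratios (analyte / IS) from one day.
replicates = [98.0, 100.0, 102.0]
print(f"intra-day RSD: {rsd_percent(replicates):.2f}%")

# Hypothetical spiking experiment (all amounts in the same unit, e.g. µg).
print(f"recovery: {recovery_percent(found=150.0, original=100.0, spiked=50.0):.1f}%")
```

The same `rsd_percent` helper applies to intra-day, inter-day, and stability data; only the grouping of the replicate measurements changes.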

#### Application to Analysis of BYF Samples

The established UHPLC-TQ-MS/MS method was subsequently applied to the quantitative analysis of both the BYF decoction and its preparations. Two batches of BYF extract powder and three batches of BYF granule were analyzed using the developed method. MRM chromatograms of the seven main compounds in BYF are displayed in **Figure 6**. The contents of the investigated compounds are shown in **Table 6**.

As shown in **Table 6**, salvianolic acid B was the most abundant compound, and the contents of most investigated compounds were relatively low in the concentrated granule compared with the BYF decoction. This difference might result from the manufacturing procedures, namely the concentration, mixing, granulation, and drying processes. Meanwhile, the contents of rosmarinic acid and salvianolic acid A were much higher in the concentrated granule than in the BYF decoction. This variability could be explained by the degradation and oxidation of salvianolic acid B during manufacturing, which transforms it into other phenolic acids such as rosmarinic acid, salvianolic acid A, and lithospermic acid (Zheng and Qu, 2012).



### CONCLUSION

In this paper, the chemical constituents of BYF were systematically investigated by UHPLC-LTQ-Orbitrap-MS<sup>n</sup> and UHPLC-TQ-MS/MS methods, which provided comprehensive qualitative and quantitative information on the major components of BYF. Eighty-six compounds, including flavones, saponins, phenolic acids, and other compounds, were identified. The quantitative method was shown to have good linearity, accuracy, sensitivity, and repeatability. Although the bioactive components have not yet been determined, the present method provides a chemical basis for further pharmacokinetic studies and effective quality evaluation of BYF, which is important for its safe use and for understanding its mechanisms of action.

## AUTHOR CONTRIBUTIONS

JZ and WX performed the experiments, analyzed the data, and wrote the paper. XL, ZH, and XQ conceived and designed the experiment, contributed reagent, materials, analysis tools, and revised the manuscript. PW, JH, and JB provided constructive suggestions for this research. All authors gave approval to the final version.

## FUNDING

This work was supported by the Science and Technology Planning Project of Guangdong Province, China (2016A020226045, 2017A030313709, 2014A020221101, 2016A020226037, 2017B030314166), Pearl River S&T Nova Program of Guangzhou, China (201806010048), Special Subject of TCM Science and Technology Research of Guangdong Provincial Hospital of TCM, China (YN2016QJ07, YN2016QJ01) and Construction Project of TCM Hospital Preparation by Special Fund of Strong Province Construction in TCM, Guangdong, China (No. 6).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zhang, Xu, Wang, Huang, Bai, Huang, Liu and Qiu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Interactions Between Emodin and Efflux Transporters on Rat Enterocyte by a Validated Ussing Chamber Technique

Juan Huang<sup>1</sup>, Lan Guo<sup>1</sup>, Ruixiang Tan<sup>1</sup>, Meijin Wei<sup>1</sup>, Jing Zhang<sup>1</sup>, Ya Zhao<sup>1</sup>, Lu Gong<sup>1</sup>, Zhihai Huang<sup>1</sup> and Xiaohui Qiu<sup>1,2</sup>\*

<sup>1</sup> The Second Clinical College of Guangzhou University of Chinese Medicine, Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, China, <sup>2</sup> Guangdong Provincial Key Laboratory of Clinical Research on Traditional Chinese Medicine Syndrome, Guangzhou, China

#### Edited by:

Jiang Xu, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Qi Wang, Harbin Medical University, China Luigi Menghini, Università "G. d'Annunzio" di Chieti-Pescara, Italy

> \*Correspondence: Xiaohui Qiu qiuxiaohui@gzucm.edu.cn

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 16 March 2018 Accepted: 30 May 2018 Published: 22 June 2018

#### Citation:

Huang J, Guo L, Tan R, Wei M, Zhang J, Zhao Y, Gong L, Huang Z and Qiu X (2018) Interactions Between Emodin and Efflux Transporters on Rat Enterocyte by a Validated Ussing Chamber Technique. Front. Pharmacol. 9:646. doi: 10.3389/fphar.2018.00646

Emodin, a major active anthraquinone, frequently interacts with other drugs. As changes in intestinal efflux transporters are one of the essential reasons why drugs interact with each other, a validated Ussing chamber technique was established to detect the interactions between emodin and efflux transporters, including P-glycoprotein (P-gp), multidrug-resistant associated protein 2 (MRP2), and multidrug-resistant associated protein 3 (MRP3). Digoxin, pravastatin, and teniposide were selected as the test substrates of P-gp, MRP2, and MRP3, and verapamil, MK571, and benzbromarone as their specific inhibitors. The results showed that verapamil, MK571, and benzbromarone increased the absorption of digoxin, pravastatin, and teniposide and decreased their E<sub>r</sub> values, respectively. Verapamil (220 µM) significantly increased emodin absorption at 9.25 µM. In the presence of MK571 (186 µM), the P<sub>app</sub> values of emodin from M-S were significantly increased and the efflux ratio decreased. With emodin treatment (185, 370, and 740 µM), digoxin absorption was significantly decreased while teniposide absorption increased. These results indicated that emodin might be a substrate of P-gp and MRP2. In addition, it might be a P-gp inducer and an MRP3 inhibitor on enterocytes, which is reported here for the first time. These results will help to explain the drug–drug interaction mechanisms between emodin and other drugs and provide basic data for clinical combination therapy.

Keywords: emodin, P-gp, MRP2, MRP3, Ussing chamber technique

## INTRODUCTION

Emodin (1,3,8-trihydroxy-6-methylanthraquinone), a major active anthraquinone, is naturally present in several herbs that have been widely used in Oriental countries, such as Rheum officinale Baill., Polygonum multiflorum Thunb., and Polygonum cuspidatum Sieb. et Zucc. (Dong et al., 2016). In the past decades, emodin has been shown to have a wide spectrum of biological and pharmacological effects, such as hepatoprotective, antiviral, anti-diabetic, anti-bacterial, antiallergic, anti-osteoporotic, immunosuppressive, and neuroprotective activities (Dong et al., 2016; Monisha et al., 2016). Recent studies have placed emodin back into the limelight: it shows good prospects in anticancer treatment, with activity against several types of cancer cells, such as lung carcinoma, gastric carcinoma, pancreatic cancer, and breast cancer, and with apoptosis, antiangiogenesis, and anti-proliferation as possible mechanisms of action (Heo et al., 2010; Wei et al., 2013; Li et al., 2014; Cha et al., 2015). Mitoxantrone, a commonly used anticancer drug, is the prodrug of emodin (Riahi et al., 2008).

In recent years, drug–drug interactions between emodin and other drugs have attracted increasing attention. Di et al. (2015) showed that piperine significantly improved the in vivo bioavailability of emodin and inhibited its glucuronidation. Yu et al. (2017) demonstrated that 2,3,5,4'-tetrahydroxystilbene-2-β-D-glucoside could enhance emodin absorption in a Caco-2 cell culture model. Meanwhile, many studies have shown synergistic effects between emodin and other antineoplastic drugs (Ko et al., 2010; Wang et al., 2010; Tan et al., 2011; Guo et al., 2013). Changes in intestinal transporters affect the absorption characteristics of many drugs, which is one of the essential reasons why drugs interact with each other. Some researchers have shown that emodin is a substrate of P-glycoprotein (P-gp) and multidrug-resistant associated protein 2 (MRP2; Teng et al., 2007, 2012; Liu et al., 2012). However, there have been no reports on the direct effects of emodin on intestinal transporters.

The Ussing chamber technique provides a physiologically relevant system for studying the transepithelial transport of ions, drugs, and nutrients across various epithelial tissues. In this system, drugs can be applied at either the mucosal or the serosal side, so both the absorptive (M-S) and the secretory (S-M) direction can be characterized. Furthermore, the usefulness of Ussing chambers for intestinal transport studies has long been recognized, and many researchers regard the technique as a gold standard (Erik et al., 2016). Studies of the influence of P-gp on intestinal drug absorption using this technique have been reported (Ballent et al., 2014; Xiao et al., 2016). However, few reports have tested and verified whether the technique is suitable for P-gp related studies, and studies of other efflux transporters using this technique have rarely been reported.

In this study, a validated Ussing chamber technique was established, and interaction studies between emodin and three efflux transporters, P-gp, MRP2, and multidrug-resistant associated protein 3 (MRP3), were carried out. To verify whether this Ussing chamber technique is suitable for transepithelial transport studies related to efflux transporters, P-gp, MRP2, and MRP3 were employed as typical proteins. Specific substrates and inhibitors were selected to detect the activities and functions of P-gp, MRP2, and MRP3 under our experimental conditions. Subsequently, the interactions between emodin and these three efflux transporters were investigated. This is the first study of the direct influence of emodin on rat intestinal P-gp, MRP2, and MRP3 functions using this technique. It will help to explain the drug–drug interaction mechanisms between emodin and other drugs and provide basic data for clinical combination therapy.

## MATERIALS AND METHODS

## Chemicals and Reagents

Standards of digoxin and pravastatin were purchased from Chengdu Rui Fensi Biotechnology Co., Ltd. (Chengdu, China). Benzbromarone and teniposide were obtained from Guangzhou Feibo Biotechnology Co., Ltd. (Guangzhou, China). Verapamil, topotecan, etoposide, and MK571 were obtained from Dalian Mellon Biotechnology Co., Ltd. (Dalian, China). Ginsenoside Rg<sub>1</sub> and emodin were purchased from the National Institutes for Food and Drug Control (Beijing, China). Krebs' Ringer bicarbonate (KRB) buffer (pH 6.5) was composed of 114 mM NaCl, 10 mM glucose, 1.25 mM CaCl<sub>2</sub>·2H<sub>2</sub>O, 1.1 mM MgCl<sub>2</sub>·6H<sub>2</sub>O, 5.03 mM KCl, 0.30 mM NaH<sub>2</sub>PO<sub>4</sub>·H<sub>2</sub>O, 1.65 mM Na<sub>2</sub>HPO<sub>4</sub>, and 25 mM NaHCO<sub>3</sub> (Huang et al., 2016). Solvents were of HPLC grade, and other chemicals were of analytical grade.

## Animals

Specific pathogen-free male Sprague–Dawley rats weighing 220 ± 20 g were obtained from Guangdong Medical Laboratory Animal Center (Guangzhou, China). Before the experiments, the rats were kept in an environmentally controlled breeding room (temperature: 22 ± 1°C, humidity: 50–70%) and fed standard laboratory food and water for 1 week. The animals were fasted overnight with free access to water before experiments. All animal studies were approved by the Institutional Animal Ethics Committee of Guangdong Provincial Hospital of Chinese Medicine.

## Tissue Preparation

Four intestinal segments (duodenum, jejunum, ileum, and colon) were surgically removed from anesthetized rats and immediately washed with KRB solution. Each segment was then placed in ice-cold KRB buffer bubbled with O<sub>2</sub>:CO<sub>2</sub> (95:5). The segments were cut into 2 cm pieces, and the serosa was removed by blunt dissection. Peyer's patches were identified visually and excluded from the experiment. The stripped tissues were mounted in chambers with an exposed surface area of 0.49 cm<sup>2</sup>. The temperature of the chambers was maintained at 37 ± 0.5°C. Each half-cell of the chambers was filled with 5 mL of fresh KRB buffer (pH 6.5). Tissue viability was monitored continuously by the potential difference (PD), and tissues with a PD of less than 2 mV were discarded (Moazed and Hiebert, 2007; Heinen et al., 2013).

## Model Validation for Efflux Transporters' Studies

Digoxin, pravastatin, and teniposide were employed as the test substrates of P-gp, MRP2, and MRP3. Verapamil, MK571, and benzbromarone were inhibitors of these three efflux transporters (**Table 1**; Kool et al., 1999; Zhou et al., 2008; Ellis et al., 2013; Abbasi et al., 2016; Liu et al., 2017). The experiments were started after 30 min equilibration time, by changing

TABLE 1 | The substrates and inhibitors of P-gp, MRP2, and MRP3.


The SRM settings of the substrates and specific internal standards (ISs) were performed by LC-MS/MS system.

the KRB buffer on both sides; 5 mL of KRB solution containing the test compounds was added to the mucosal or serosal compartment, and an equal volume of fresh buffer was added to the other compartment. In the inhibition studies, the inhibitors verapamil, MK571, and benzbromarone were added to the mucosal side, respectively; 0.5 mL samples were withdrawn from the receiver compartment every 30 min and replaced with fresh KRB buffer. Bidirectional transport studies with the specific inhibitors were performed to assess whether this model is suitable for P-gp, MRP2, and MRP3 studies. Samples were stored at −80°C until analysis (Deusser et al., 2013).

#### Interaction Studies Between Emodin and Efflux Transporters

In this study, the effects of the three efflux transporters on the intestinal absorption of emodin were investigated, and the influence of emodin on intestinal P-gp, MRP2, and MRP3 functions was also examined. Solutions of emodin at three concentrations (low, middle, and high) were added to the mucosal or serosal compartment to measure the bidirectional transport characteristics. To determine whether, and which, efflux transporters are involved in the intestinal absorption of emodin, verapamil, MK571, and benzbromarone were added to the mucosal side, respectively. In the emodin impact experiments, digoxin, pravastatin, and teniposide were employed as the P-gp, MRP2, and MRP3 substrates. Changes in their bidirectional transport characteristics were measured before and after adding emodin to the mucosal side.

#### Preparation of Perfusate Samples

Frozen samples were thawed at room temperature and thoroughly vortexed; 10 µL of the appropriate IS solution was added to 200 µL of sample, and after vortex mixing for 30 s, 200 µL of methanol was added. The mixtures were then vortexed for 30 s and centrifuged at 15,000 rpm for 30 min. Finally, 5 µL of the supernatant was injected into the LC-MS/MS system.

#### Liquid Chromatographic and Mass Spectrometric Conditions

Quantification of the compounds was performed on a TSQ Quantum Ultra Triple Quadrupole LC-MS/MS system from Thermo Fisher Scientific. Chromatography was carried out on an Agilent Poroshell 120 SB-C18 column (2.7 µm, 2.1 mm × 100 mm) with a Phenomenex AFO-8497 C18 pre-column, operated at 30°C. Gradient elution programs adapted to the analytes were applied. The flow rate of the mobile phase was kept at 0.2 mL·min<sup>−1</sup>. Flow was directed to the ion spray interface. All measurements were carried out in negative ESI mode. The ion spray voltage was −2500 V. Vaporizer and capillary temperatures were set at 250 and 350°C,


Digoxin (25 µM), pravastatin (47 µM), and teniposide (60 µM) were employed as the test substrates of P-gp, MRP2, and MRP3. Verapamil (220 µM), MK571 (186 µM), and benzbromarone (470 µM) were inhibitors of P-gp, MRP2, and MRP3. Each column represents the mean with SD of five measurements (∗p < 0.05 and ∗∗p < 0.01, compared to the control according to independent samples t-test).

respectively. The autosampler temperature was set at 10°C. Sheath gas and aux gas were set at 30 and 15 Arb, respectively. The selected reaction monitoring (SRM) transitions were m/z 268.934→225.047 for emodin and m/z 779.147→475.447 for ginsenoside Rg<sub>1</sub> [internal standard (IS)]. Other SRM settings and the specific ISs are shown in **Table 1**. Peak integration and calibration were carried out using LC Quan 2.5.2 software from Thermo Fisher Scientific. All the data were within the acceptable limits of the guidelines for bioanalytical methods.

FIGURE 2 | The P<sub>app</sub> of emodin from the mucosal to the serosal side in different intestinal segments. Experiments were conducted at three concentrations (9.25, 18.5, and 37 µM). Each bar represents the mean with SD of five measurements (∗p < 0.05, statistically significant differences among different concentrations, according to a one-way ANOVA test).

### Data Analysis

The Q (accumulative quantity), P<sub>app</sub> (apparent permeability), E<sub>r</sub> (efflux ratio), P<sub>r</sub> (P<sub>app</sub> ratio), and E<sub>rr</sub> (E<sub>r</sub> ratio) across the excised rat intestinal segments in the Ussing chamber were calculated using the following equations (Li et al., 2012; Huang et al., 2016):

$$Q = 5C_n + 0.5 \sum_{i=1}^{n-1} C_i \tag{1}$$

$$P_{\text{app}} = \frac{dQ/dt}{A \cdot C_0} \tag{2}$$

$$E_r = \frac{P_{\text{app}}(\text{S-M})}{P_{\text{app}}(\text{M-S})} \tag{3}$$

$$P_r = \frac{P_{\text{app}}}{P_{\text{app}}(\text{control})} \tag{4}$$

$$E_{rr} = \frac{E_r}{E_r(\text{control})} \tag{5}$$

where C<sub>n</sub> (nM) is the concentration of the drug at time point n, Q (nM) is the accumulated absorption amount, the factors 5 and 0.5 correspond to the compartment volume and the sampling volume (mL), A is the exposed surface area of the intestine (0.49 cm<sup>2</sup>), dQ/dt (nM·s<sup>−1</sup>) is the rate of drug transport, and C<sub>0</sub> (nM) is the initial concentration of the test drug. Experiments were performed in batches, so the control group was set as one for each batch.
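The calculations in Eqs. (1)–(5) can be sketched in a few lines of code. This is an illustrative sketch rather than the authors' script: the function names are hypothetical, dQ/dt is estimated here as the least-squares slope of Q against time (the text does not state how the slope was obtained), and Q is treated as concentration (nM) multiplied by volume (mL), which gives P<sub>app</sub> in cm·s<sup>−1</sup>. The input concentrations are invented for demonstration.

```python
def cumulative_amount(conc_nM, chamber_mL=5.0, sample_mL=0.5):
    """Eq. (1): Q_n = V * C_n + v * sum(C_1..C_{n-1}) for every time point n."""
    q, withdrawn = [], 0.0
    for c in conc_nM:
        q.append(chamber_mL * c + sample_mL * withdrawn)
        withdrawn += c  # drug removed with earlier 0.5 mL samples
    return q

def slope(x, y):
    """Least-squares slope of y against x (assumed estimator of dQ/dt)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def papp(times_s, conc_nM, c0_nM, area_cm2=0.49):
    """Eq. (2): Papp = (dQ/dt) / (A * C0)."""
    return slope(times_s, cumulative_amount(conc_nM)) / (area_cm2 * c0_nM)

def ratio(value, reference):
    """Eqs. (3)-(5): Er, Pr, and Err are all simple ratios."""
    return value / reference

# Hypothetical receiver concentrations sampled every 30 min, donor at 25 µM.
times = [1800, 3600, 5400]
p_ms = papp(times, [10.0, 20.0, 30.0], c0_nM=25_000)   # mucosal to serosal
p_sm = papp(times, [40.0, 80.0, 120.0], c0_nM=25_000)  # serosal to mucosal
print(f"Papp(M-S) = {p_ms:.3e} cm/s, Er = {ratio(p_sm, p_ms):.2f}")
```

The same `ratio` helper yields P<sub>r</sub> and E<sub>rr</sub> when a treated value is divided by the control value from the same batch.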

#### Statistical Analyses

The data are presented as the mean ± SD for all experiments. The independent samples t-test was applied to compare the means between treatments. One-way ANOVA with LSD (equal variances assumed) or Dunnett's T3 (equal variances not assumed) multiple comparison (post hoc) tests was used to evaluate statistical differences. A p-value of less than 0.05 was considered statistically significant.

## RESULTS

### The PD Values for Rat Intestinal Segments

After 30 min of equilibration, the PD was 7.16 ± 1.59, 7.23 ± 2.18, 6.50 ± 0.75, and 4.12 ± 0.69 mV for duodenum, jejunum, ileum, and colon, respectively. The PD values were thus higher in the small intestinal segments than in the colon under these experimental conditions. In general, the PD decreased slightly over time when the chambers contained small amounts of drug (<200 µM).

#### Model Validation for Efflux Transporters' Studies

The P<sub>app</sub> and E<sub>r</sub> values of the test substrates (digoxin, pravastatin, and teniposide) are summarized in **Table 2** and **Figure 1**. The results showed that the inhibitors verapamil, MK571, and benzbromarone significantly increased the absorption of these substrates and inhibited their secretion, and thus decreased their E<sub>r</sub> values. In the P-gp validation study, the concentration of the inhibitor (verapamil) was set at 220 µM for the small intestinal segments and 440 µM for the colon. It was observed that verapamil could



Emodin was loaded onto the mucosal side at three concentrations (185, 370, and 740 µM). Each column represents the mean with SD of five measurements (∗p < 0.05 and ∗∗p < 0.01, compared to the control according to a one-way ANOVA with post hoc test).

significantly inhibit P-gp in the jejunum, ileum, and colon. However, no statistically significant difference was observed in the duodenum after adding the inhibitor, which may be related to the distribution of P-gp in rat enterocytes. In the MRP2 and MRP3 validation studies, the inhibitors MK571 (186 µM) and benzbromarone (470 µM) also displayed significant inhibitory effects in both the jejunum and the ileum. These results indicated that the Ussing chamber technique can be used to investigate the roles of P-gp, MRP2, and MRP3 in intestinal drug transport. Although the distribution characteristics of the efflux transporters differed, both jejunum and ileum were suitable for the study. Jejunum was finally chosen for further studies because it gave the most stable data during model validation.

#### The Absorption Characteristics of Emodin

The absorption characteristics of emodin at three concentrations (9.25, 18.5, and 37 µM) in different intestinal segments were investigated. The intestinal absorption of emodin displayed no regioselectivity, whereas the P<sub>app</sub> values from M-S were very low at the lowest concentration (**Figure 2**). This indicated that some efflux transporters may be involved in the intestinal transport of emodin. The E<sub>r</sub> values of emodin in jejunum were all greater than five at the three concentrations, further indicating that efflux transporters were involved in the intestinal absorption of emodin (**Figure 3**).

#### Interaction Studies Between Emodin and Efflux Transporters

Transport studies were performed in the presence of P-gp, MRP2, and MRP3 inhibitors (verapamil, MK571, and benzbromarone) to determine the effect of these efflux transporters on the transport of emodin. The data are summarized in **Figure 4**. Verapamil (220 µM) markedly increased emodin absorption and decreased the efflux ratio at 9.25 µM. In the presence of the MRP2 inhibitor MK571 (186 µM), the P<sub>app</sub> values of emodin from the mucosal to the serosal side were significantly increased and the efflux ratio decreased. However, no significant differences were observed in the presence of benzbromarone. These results indicated that emodin might be a substrate of P-gp and MRP2.

To investigate the effect of emodin on the functions of the efflux transporters P-gp, MRP2, and MRP3 in rat enterocytes, transport studies of digoxin, pravastatin, and teniposide were carried out in the presence of emodin. With emodin treatment, the P<sub>app</sub> values of digoxin from M-S significantly decreased while those from S-M increased. Thus, the E<sub>r</sub> values of digoxin were significantly higher after adding emodin. The data are summarized in **Table 3** and **Figure 5**. These results indicate that emodin could enhance P-gp function in rat jejunum. In the presence of emodin, the P<sub>app</sub> from S-M and the E<sub>r</sub> of the MRP3 substrate teniposide decreased markedly, indicating that emodin might be an MRP3 inhibitor. No significant differences were observed in the P<sub>app</sub> and E<sub>r</sub> values of pravastatin after emodin was added.

#### DISCUSSION

Emodin, a potential antineoplastic drug, has been shown to interact with many other drugs, but the mechanisms have not yet been elucidated. Our results showed that emodin was a substrate of P-gp and MRP2, which is consistent with the literature (Teng et al., 2012; Zhang et al., 2012). Furthermore, we found that emodin might be a P-gp inducer and an MRP3 inhibitor on enterocytes, which has not been reported before. However, verapamil only promoted the intestinal absorption of emodin at a low concentration; this is probably because emodin itself is a P-gp inducer. MRP3, an efflux transporter, is mainly distributed in the liver, intestine, and adrenal glands, while most research has focused on its function and expression in the liver. It is involved in the enterohepatic circulation of non-sulfated and sulfated bile salts such as glycocholates and taurocholates (Keppler, 2014, 2017). Our results showed that emodin might not be a substrate of MRP3 but could inhibit MRP3 function on enterocytes, which indicates that the role of MRP3 in the intestine should not be ignored. We consider that emodin might regulate MRP3 function by influencing upstream proteins or kinases. Furthermore, we believe that oral administration of emodin would affect bile transport to some extent.

In this study, we used the Ussing chamber technique, which is regarded as a gold standard for drug transport studies, to evaluate the interactions between emodin and efflux transporters. The use of living, intact intestinal tissue is more realistic than cell cultures and provides many advantages. The intestinal tissues are likely to express all the transporters and enzymes at the same "physiological" level of expression. Moreover, the data obtained with rat intestine can be directly correlated with in vivo experiments conducted in the same animal model (Mazzaferro et al., 2012; Sjögren et al., 2016). However, samples withdrawn from the receiver compartment are often at very low concentrations. With the development of LC-MS/MS technology, such small amounts of test compounds can be detected more easily. In addition, tissue preparation is technically challenging. The KRB solution should be filtered through a 0.22 µm filter, and the removed intestinal tissue should be kept in cold KRB solution with continuous gassing. Preparation must be done carefully on ice, because it takes time and carries a risk of tissue damage.

Although emodin has multiple pharmacological activities, its toxicity has attracted increasing attention in recent years. Emodin treatment is a double-edged sword: it showed a protective effect on alcoholic liver injury, whereas hepatotoxicity appeared with high doses and long-term administration (Dong et al., 2009; Wang et al., 2011; Liu et al., 2014). We suspect that the changeable role of emodin may be related to variation in the intestinal environment, including intestinal transporters, structures, and microorganisms. Therefore, further study is required to determine the changes in the in vivo environment and the disposition characteristics of emodin during long-term administration.

#### CONCLUSION

In the present study, we have shown that the Ussing chamber technique is suitable for P-gp-, MRP2-, and MRP3-related studies in rat enterocytes. Using this technique, we found that emodin might be a substrate of P-gp and MRP2, but not of MRP3. In addition, emodin decreased digoxin absorption and increased teniposide absorption in rat intestine, indicating that emodin might be a P-gp inducer and an MRP3 inhibitor.

#### AUTHOR CONTRIBUTIONS

JH and XQ designed the project. JH, LaG, RT, and MW performed the experiments. JH, LaG, RT, and XQ analyzed the data. JH, JZ, YZ, LuG, ZH, and XQ wrote the manuscript.

## FUNDING

This research was supported by the National Natural Science Foundation of China (No. 81373967) and Science and Technology Planning Project of Guangdong Province (No. 2017B030314166).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Huang, Guo, Tan, Wei, Zhang, Zhao, Gong, Huang and Qiu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Identification of Ligularia Herbs Using the Complete Chloroplast Genome as a Super-Barcode

Xinlian Chen<sup>1</sup>†, Jianguo Zhou<sup>1</sup>†, Yingxian Cui<sup>1</sup>, Yu Wang<sup>1</sup>, Baozhong Duan<sup>2</sup> and Hui Yao<sup>1</sup>\*

<sup>1</sup> Key Lab of Chinese Medicine Resources Conservation, State Administration of Traditional Chinese Medicine of the People's Republic of China, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China, <sup>2</sup> College of Pharmaceutical Science, Dali University, Dali, China

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Zhi Chao, Southern Medical University, China

Eszter Virag, University of Pannonia, Hungary

\*Correspondence: Hui Yao scauyaoh@sina.com

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 27 April 2018 Accepted: 08 June 2018 Published: 03 July 2018

#### Citation:

Chen X, Zhou J, Cui Y, Wang Y, Duan B and Yao H (2018) Identification of Ligularia Herbs Using the Complete Chloroplast Genome as a Super-Barcode. Front. Pharmacol. 9:695. doi: 10.3389/fphar.2018.00695

More than 30 Ligularia Cass. (Asteraceae) species have long been used in folk medicine in China. Neither morphological features nor common DNA regions are ideal for identifying Ligularia species. As some Ligularia species contain pyrrolizidine alkaloids, which are hazardous to human and animal health and are involved in metabolic toxification in the liver, it is important to find a better way to distinguish these species. Here, we report the complete chloroplast (CP) genomes of six Ligularia species, L. intermedia, L. jaluensis, L. mongolica, L. hodgsonii, L. veitchiana, and L. fischeri, obtained through high-throughput Illumina sequencing technology. These CP genomes showed a typical circular quadripartite structure, and their sizes range from 151,118 to 151,253 bp. The GC content of each CP genome is 37.5%. Every CP genome contains 134 genes, including 87 protein-coding genes, 37 tRNA genes, eight rRNA genes, and two pseudogenes (ycf1 and rps19). In the mVISTA comparison, no coding or non-coding region was found that could distinguish these six Ligularia species, but the maximum likelihood tree of the six Ligularia species and other related species showed that the whole CP genome can be used as a super-barcode to identify these six Ligularia species. This study provides invaluable data for species identification, allowing for future studies on phylogenetic evolution and safe medical applications of Ligularia.

Keywords: Ligularia Cass., chloroplast genome, identification, super-barcode, Illumina sequencing

#### INTRODUCTION

Ligularia Cass., belonging to the Senecioneae tribe of Asteraceae, comprises about 140 species of perennial herbs. These species are distributed in Asia and Europe, with a total of 123 species distributed in China, 89 of which are endemic (Liu and Illarionova, 1989). In China, Ligularia species are mainly distributed in mountainous areas in the southwest (Liu and Illarionova, 1989), and more than 30 Ligularia species have long been used in folk medicine (Wang, 2007). Their roots, stems, leaves, and flowers contain various chemical compounds, such as sesquiterpenes (Wang, 2007; Shimizu et al., 2014; Saito et al., 2017) and alkaloids (Asada et al., 1981; Feng, 2016). They are used as herbal medicines for the treatment of bronchitis, coughing, pulmonary tuberculosis, and hemoptysis. These herbal medicines are usually used as substitutes for Asteris Radix et Rhizoma, which originates from Aster tataricus L. and is recorded in the Chinese Pharmacopoeia (Lin and Liu, 1989; Chinese Pharmacopoeia Commission, 2015). Approximately 3% of flowering plants (as many as 6,000 species), including Ligularia species (Smith and Culvenor, 1981; Stegelmeier et al., 1999), contain pyrrolizidine alkaloids (PAs). Various Ligularia species have been reported to contain PAs, including L. japonica (Asada et al., 1981), L. wilsoniana (Xiong et al., 2016), L. duciformis, L. intermedia, L. hodgsonii, and L. veitchiana (Pu et al., 2004). PAs are phytoalexins that function in plant defense systems against herbivores, insects, and plant pathogens. However, they are harmful to human and animal health (Jank and Rath, 2017; Martinello et al., 2017), as they undergo metabolic toxification in the liver, leading to PA poisoning (Bull et al., 1968; Prakash et al., 1999). The German Federal Department of Health stated that the safe total daily dose of PAs is less than 1 µg, and doctors do not allow continuous administration of PA-containing drugs for more than 6 weeks. In addition, all PA-containing products are banned in Australia (Wiedenfeld and Edgar, 2011).

**Abbreviations:** IR, inverted repeat; LSC, large single-copy; ML, maximum likelihood; SSC, small single-copy; SSRs, simple sequence repeats.

Ligularia has been traditionally classified based on morphological structures, such as the arrangement of inflorescences, leaf shape, leaf veins, and phyllaries (Liu and Illarionova, 1989). Interspecific hybridization of Ligularia species is common and their morphological variation is complicated (Hanai et al., 2012; Yu et al., 2014; Saito et al., 2017), making it difficult to identify species correctly. Common DNA barcoding sequences (ITS, matK, psbA-trnH, and rbcL) are also not ideal for identifying Ligularia species (He and Pan, 2015). Recently, researchers have screened sequences from the whole chloroplast (CP) genomes of numerous plant taxa, such as Juglans L. plants and bamboo (Zhang et al., 2011; Hu et al., 2016), or have used the CP genome as a super-barcode to distinguish species (Xia et al., 2016). The CP genome is highly conserved in plants in terms of size, structure, and gene content (Tonti-Filippini et al., 2017), and the majority of the retained core genes are involved in the light reactions of photosynthesis or in functions related to transcription or translation (Sato et al., 1999). The CP genome is a circular DNA molecule that includes an SSC region, an LSC region, and two inverted-repeat (IRa and IRb) regions (Sato et al., 1999). Several CP genomes from Asteraceae have previously been reported, including those from Aster (Choi and Park, 2015), Ambrosia (Nagy et al., 2017), Carthamus (Lu et al., 2015), and Taraxacum (Salih et al., 2017). However, only one CP genome from Ligularia, that of L. fischeri, has previously been published (Lee et al., 2016). In this study, we report the CP genomes of six Ligularia species, L. intermedia, L. jaluensis, L. mongolica, L. hodgsonii, L. veitchiana, and L. fischeri, obtained through high-throughput Illumina sequencing technology. Our aim is to use the CP genome as a super-barcode for the identification of Ligularia species and to provide invaluable genetic information for future studies.

## MATERIALS AND METHODS

## Plant Materials and DNA Extraction

Fresh leaves of L. intermedia and L. fischeri were collected from Baishan City and Tonghua City, Jilin Province, China, respectively. Fresh leaves of L. jaluensis and L. mongolica were collected from Yanbian Korean Autonomous Prefecture, Jilin Province. These four species were identified by Prof. Junlin Yu from Tonghua Normal University, Jilin. Fresh leaves of L. hodgsonii and L. veitchiana were collected from Enshi Tujia and Miao Autonomous Prefecture, Hubei Province, and the Qinling Mountains, Shaanxi Province, respectively. These two samples were identified by Prof. Yulin Lin from the Institute of Medicinal Plant Development (IMPLAD), Chinese Academy of Medical Sciences (CAMS), and Peking Union Medical College (PUMC). The exact GPS coordinates for the collection locations of the six Ligularia species are listed in Supplementary Table S1. Voucher specimens were deposited in the herbarium at IMPLAD. Collected fresh leaves were stored in a −80 °C freezer until further use. DNA extraction was performed using a DNeasy Plant Mini Kit (Qiagen Co., Germany) following the manufacturer's protocol.

## Illumina Sequencing and Genome Assembly

Approximately 5–10 µg of high-quality DNA was used to build shotgun libraries with insert sizes of 500 bp, which were sequenced in accordance with the protocol for Illumina HiSeq X technology. The raw data of the six species were produced as 150 bp paired-end reads. The software Trimmomatic (Bolger et al., 2014) was employed to filter low-quality reads from the raw data. After quality filtering, the remaining clean reads were used to assemble the CP genome sequences. The CP sequences of all plants available from the National Center for Biotechnology Information (NCBI) were downloaded and used to create a reference database. Then, the clean reads were mapped to the database and the mapped reads were extracted on the basis of coverage and similarity. The extracted reads were assembled into contigs using SOAPdenovo2 (Luo et al., 2012). The scaffold of the CP genome was constructed using SSPACE (Boetzer et al., 2011), and the gaps were filled using GapFiller (Boetzer and Pirovano, 2012).
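To make the quality-filtering step more concrete, the following is a minimal Python sketch of sliding-window quality trimming, the general idea behind one of the trimming modes offered by tools such as Trimmomatic. It is not the Trimmomatic implementation, and the function name, window size, and thresholds are illustrative assumptions rather than settings reported in this study.

```python
def sliding_window_trim(seq, quals, window=4, min_mean_q=15, min_len=36):
    """Cut a read at the first window whose mean Phred quality is too low.

    seq   -- read bases as a string
    quals -- per-base Phred scores (list of ints, same length as seq)
    Returns (trimmed_seq, trimmed_quals), or None if the trimmed read is
    shorter than min_len. All thresholds are hypothetical defaults.
    """
    if len(seq) != len(quals):
        raise ValueError("seq and quals must have the same length")
    cut = len(seq)  # keep the whole read unless a bad window is found
    for start in range(0, max(len(seq) - window + 1, 0)):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            cut = start  # trim from the start of the first bad window
            break
    if cut < min_len:
        return None  # read too short after trimming; discard it
    return seq[:cut], quals[:cut]
```

A read with 40 good bases followed by 10 poor ones is shortened to 39 bases under these defaults, while a read that is poor throughout is discarded.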

#### Validation, Annotation, and Sequence Submission

The accuracy of the assembly of the four boundaries (SSC, LSC, IRa, and IRb regions) of the CP sequences was confirmed through PCR and Sanger sequencing using validated primers (Supplementary Table S2). The thermocycler conditions for the PCR were as follows: 94 °C for 5 min; 32 cycles of 94 °C for 30 s, 56 °C for 30 s, and 72 °C for 1.5 min; and 72 °C for 10 min. The online programs Dual Organellar GenoMe Annotator (DOGMA) (Wyman et al., 2004) and CPGAVAS (Liu et al., 2012) were used for the initial annotation of the CP genomes of the six species, followed by manual correction. The complete data from the study were submitted to NCBI under the BioProject ID PRJNA400300 and BioSample ID SAMN07562669. The assembled complete CP genome sequences of the six Ligularia species were submitted to NCBI GenBank with the accession numbers MF539929–MF539933 and MG729822.
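As a quick sanity check on the PCR program above, the programmed hold times can be summed. The short Python sketch below does only this arithmetic; the function name is hypothetical, and ramping time between temperatures is ignored, so the real run is somewhat longer.

```python
def pcr_hold_minutes(initial_min, cycle_steps_sec, cycles, final_min):
    """Sum the programmed hold times of a three-stage PCR program, in minutes."""
    cycling_min = cycles * sum(cycle_steps_sec) / 60
    return initial_min + cycling_min + final_min

# 94 °C for 5 min; 32 cycles of 30 s, 30 s, and 90 s; 72 °C for 10 min
total = pcr_hold_minutes(initial_min=5, cycle_steps_sec=[30, 30, 90], cycles=32, final_min=10)
print(total)  # 95.0
```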

## Genome Structure Analysis

TABLE 1 | Summary statistics for assembly of the six CP genomes of Ligularia species.

One or two asterisks after genes indicate that the gene contains one or two introns, respectively.

The software tRNAscan-SE (Schattner et al., 2005) and DOGMA (Wyman et al., 2004) were used to identify tRNA genes. Gene maps were generated using Organellar Genome DRAW v1.2 (Lohse et al., 2007) with default settings and were then checked manually. MEGA 6.0 was used to calculate the GC content (Tamura et al., 2013). REPuter (University of Bielefeld, Bielefeld, Germany) (Kurtz et al., 2001) was used to identify the size and location of repeat sequences in the CP genomes of the six Ligularia species. We used the MISA software (MISA-Microsatellite Identification Tool, 2017)<sup>1</sup> to detect SSRs, with the same parameter settings as those described in Li et al. (2013). All the repeat sequences were manually verified and redundant results were removed. The distribution of codon usage was studied using CodonW with the relative synonymous codon usage (RSCU) ratio (Sharp and Li, 1987). The online program Predictive RNA Editor for Plants suite (Mower, 2009), with a cutoff value of 0.8, was used to predict RNA editing sites in the six CP genomes of Ligularia species.

<sup>1</sup>http://pgrc.ipk-gatersleben.de/misa/
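The RSCU ratio used in the codon usage analysis is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following Python sketch illustrates that definition; it is not CodonW, the function name is hypothetical, and it assumes the standard amino acid assignments (which the plant plastid code shares) and in-frame coding sequences as input.

```python
from collections import Counter
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} for the codons of every amino acid observed."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU undefined
        expected = total / len(codons)  # equal usage of synonymous codons
        for c in codons:
            result[c] = counts[c] / expected
    return result
```

Under this definition, amino acids encoded by a single codon (methionine and tryptophan) always have an RSCU of 1, so they cannot show any codon preference.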

#### Phylogenetic Analysis

For identification purposes and to support further phylogenetic research on this genus, we used mVISTA (Thompson et al., 1994) to compare the six Ligularia species, with L. hodgsonii as the reference genome. MEGA 6.0 was used to construct the phylogenetic tree based on ML analysis, with Platycodon grandiflorus and Adenophora remotiflora included as outgroups. The details of the selected species (excluding the six Ligularia species) are presented in Supplementary Table S3.

#### RESULTS AND DISCUSSION

#### CP Genome Structure of Six Ligularia Species

The raw data from the six Ligularia species amount to 9.1 Gb for L. intermedia, 7.2 Gb for L. hodgsonii, 7.4 Gb for L. jaluensis, 6.4 Gb for L. mongolica, 6.3 Gb for L. veitchiana, and 6.2 Gb for L. fischeri. The sizes of the six CP genomes range from 151,118 bp for L. mongolica to 151,253 bp for L. veitchiana, which is similar to the sizes of other Asteraceae CP genomes (Liu et al., 2013; Salih et al., 2017; Wang et al., 2017; Zhang et al., 2017). The investigated genomes showed a typical circular quadripartite structure, including an SSC region and an LSC region, separated by two IR regions (**Figure 1**). The corresponding lengths of the four regions from the six species are similar: the SSC lengths range from 18,214 to 18,247 bp, the LSC lengths range from 83,244 to 83,330 bp, and the IR lengths range from 24,830 to 24,838 bp (**Table 1**). The size of the previously published L. fischeri CP genome is 151,133 bp, and it includes an SSC region (18,233 bp), an LSC region (83,238 bp), and two IR regions (24,831 bp each) (Lee et al., 2016). Our results showed that all six of the newly sequenced CP genomes have a GC content of 37.5%, which is lower than that of some Asteraceae species (Liu et al., 2013; Salih et al., 2017; Wang et al., 2017; Zhang et al., 2017). The GC content of each of the four homologous regions is the same across the six CP genomes. However, the distribution of the GC content among the regions is uneven. The GC content in the IR region is the highest (43.0%), followed by the LSC region (35.6%), and the region with the lowest GC content is the SSC region (30.7%). Our analysis showed that the high GC content in the IR region is attributable to the four rRNA genes (rrn16, rrn23, rrn4.5, and rrn5). The AT contents of the first, second, and third positions of protein-coding genes in the six CP genomes are 54.5–54.6%, 61.9–62.0%, and 70.1%, respectively. The higher AT content in the third position has also been observed in other plants (Yi and Kim, 2012; He et al., 2017; Zhou et al., 2017) and is usually used to distinguish DNA of CP, nuclear, and mitochondrial origin (Clegg et al., 1994).
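The composition statistics in this paragraph are simple to reproduce. The Python sketch below shows one way to compute GC content, AT content by codon position, and the total length of a quadripartite genome from its region lengths. The function names are hypothetical and the study itself used MEGA 6.0 for GC content, so this is an illustration of the calculations rather than the authors' procedure.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return 100 * (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def at_content_by_codon_position(cds_list):
    """AT percentage at codon positions 1, 2, and 3 over in-frame CDSs."""
    at = [0, 0, 0]
    total = [0, 0, 0]
    for cds in cds_list:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(usable):
            base = cds[i]
            if base in "ACGT":
                total[i % 3] += 1
                if base in "AT":
                    at[i % 3] += 1
    return [100 * a / t if t else 0.0 for a, t in zip(at, total)]


def genome_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir
```

For the previously published L. fischeri genome, `genome_length(83238, 18233, 24831)` gives 151,133 bp, which matches the reported total and confirms that the stated IR length applies to each of the two copies.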

Each of the six CP genomes contains 134 genes, including 87 protein-coding genes, 37 tRNA genes, eight rRNA genes, and two pseudogenes (ycf1 and rps19; **Table 2**). Seven protein-coding genes (ndhB, rpl2, rpl23, rps12, rps7, ycf15, and ycf2), seven tRNAs (trnA-UGC, trnI-CAU, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, and trnV-GAC), and all of the rRNAs (rrn16, rrn23, rrn4.5, and rrn5) are duplicated in the IR regions, which is similar to Artemisia annua (Shen et al., 2017) and Artemisia frigida (Liu et al., 2013). The CP genomes of the six Ligularia species contain a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, which is a unique feature of Asteraceae (Kim et al., 2005; Liu et al., 2013). The LSC region includes 62 protein-coding genes and 22 tRNA genes. The SSC region includes 11 protein-coding genes and one tRNA gene (trnL-UAG). None of the CP genomes of these six Ligularia species has an inverted SSC region, as is also the case in the CP genomes of A. frigida (Liu et al., 2013), Scutellaria baicalensis (Jiang et al., 2017), Carthamus tinctorius (Lu et al., 2015), and Juglans L. (Hu et al., 2016). In contrast, the SSC regions of Helianthus annuus, Lactuca sativa (Timme et al., 2007), and Aster spathulifolius (Choi and Park, 2015) are inverted. The functional ycf1 copy is located at the IRb-SSC boundary and the pseudogene ycf1 copy is located in the IRa region. The functional rps19 copy is on the boundary of the LSC and IRa regions and the pseudogene rps19 copy is in the IRb region. The coding regions, including protein-coding genes, tRNA genes, and rRNA genes, occupy 59.67–59.72% of the CP genomes of the six Ligularia species. Meanwhile, non-coding regions, including introns, pseudogenes, and intergenic spacers, occupy 40.28–40.33% of the CP genomes of the six Ligularia species.

#### Codon Usage and RNA Editing Sites

All protein-coding genes in the six Ligularia CP genomes are composed of 26,136–26,138 codons. The most and least frequent amino acids in the CP genomes of the six Ligularia species are leucine (10.8%) and cysteine (1.1%), respectively (**Figure 2**). This is similar to the CP genome of artichoke (Curci et al., 2015). However, the most frequent amino acid in A. frigida is isoleucine (Liu et al., 2013). The most and the least abundant amino acids in the Taraxacum obtusifrons and Taraxacum amplum CP genomes are serine and methionine (Salih et al., 2017), respectively. **Figure 2** shows that the RSCU value increases with the number of codons encoding a specific amino acid. Most amino acids show codon preferences, except for methionine and tryptophan. Potential RNA editing sites were predicted for 35 genes from the CP genomes of the six Ligularia species. Forty-eight RNA editing sites were identified. S-to-L amino acid changes appeared most frequently, while R-to-W and T-to-I changes occurred least. The RNA editing sites in each corresponding gene are at the same nucleotide positions in the six Ligularia species (Supplementary Table S4).

FIGURE 3 | Repeat sequences in six CP genomes. REPuter was used to identify repeat sequences with length ≥30 bp and sequence identity ≥90% in the CP genomes. F, P, R, and C indicate the repeat types F (forward), P (palindrome), R (reverse), and C (complement). Repeats with different lengths are indicated in different colors.

A total of 18 genes containing introns, including 12 protein-coding genes (atpF, clpP, ndhA, ndhB, petB, petD, rpl16, rpl2, rpoC1, rps12, rps16, and ycf3) and six tRNA genes (trnA-UGC, trnG-UCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC; Supplementary Table S5), were identified in this study. Nine protein-coding genes contain only one intron and three protein-coding genes (clpP, rps12, and ycf3) contain two introns. All six tRNAs contain only one intron. The trnK-UUU gene has the longest intron (2,556 bp), which contains matK. The clpP gene and ycf3 gene are both located in the LSC region. The rps12 gene is a special trans-splicing gene with the 5′ exon located in the LSC region and the 3′ exon located in the IR region. This condition exists in many species, such as A. frigida (Liu et al., 2013), artichoke (Curci et al., 2015), and Aster spathulifolius (Choi and Park, 2015).

## Long Repeats and SSRs in the CP Genomes From the Six Ligularia Species

Repeat sequences, which are related to plastome organization (Salih et al., 2017), are mostly distributed in intergenic regions and intron regions, and only a small fraction is present in genic regions. Four types of long repeats were observed in the CP genomes of the six Ligularia species, including forward, palindromic, reverse, and complement repeats (**Figure 3**). The length of the repeat unit ranged from 30 to 48 bp. Ligularia intermedia and L. jaluensis both had 19 forward and 20 palindromic repeats. Ligularia hodgsonii had the following repeats: 18 forward, 20 palindromic, and one reverse. Ligularia mongolica had 18 forward and 20 palindromic repeats. Ligularia veitchiana had 20 forward and 21 palindromic repeats. Ligularia fischeri had the following repeats: 19 forward, 19 palindromic, and one complement. The long repeat sequences observed in the CP genomes of the six Ligularia species, with L. hodgsonii as the reference, are presented in Supplementary Table S6.

Simple sequence repeats, also called microsatellites, exist widely in the genome, and the sequences consist of one to six nucleotide repeat units (Powell et al., 1995). Because of their polymorphism, SSRs are widely used in species identification, population genetics, and phylogenetic studies (Yang et al., 2011; Jiao et al., 2012; Xue et al., 2012). Four types of SSRs were found in the CP genomes from the six Ligularia species: mononucleotide (56.6–60.7%), dinucleotide (11.5–13.2%), trinucleotide (9.3–9.8%), and tetranucleotide (18.0–21.6%); the SSRs were mainly distributed in the non-coding regions of the LSC and SSC. Of all these SSRs, mononucleotide A/T SSRs are the most numerous, ranging from 29 in L. hodgsonii to 37 in L. veitchiana, which contributes to the A and T richness of the CP genomes. The next most common SSR motif is the dinucleotide AT/AT, with six dinucleotide SSRs in the CP genomes of L. hodgsonii and L. mongolica and seven in the other four CP genomes. All of the CP genomes from the six species have two trinucleotide AAG/CTT SSRs, one ATC/ATG trinucleotide SSR, and 11 tetranucleotide SSRs (**Table 3**). The CP genome of L. veitchiana has three AAT/ATT trinucleotide SSRs, while the other five species have only two.
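To illustrate what SSR detection involves, the following Python sketch scans a sequence for perfect tandem repeats of short motifs. It is not MISA, and the minimum repeat numbers are illustrative assumptions only; the study used the settings of Li et al. (2013), which are not restated here. The function names are hypothetical.

```python
# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif (e.g. 'AA')."""
    k = len(unit)
    return not any(k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR, 0-based start."""
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        i = 0
        while i + k <= len(seq):
            unit = seq[i:i + k]
            if set(unit) <= set("ACGT") and is_primitive(unit):
                n = 1
                # Count how many consecutive copies of the motif follow.
                while seq[i + n * k:i + (n + 1) * k] == unit:
                    n += 1
                if n >= min_repeats[k]:
                    hits.append((i, unit, n))
                    i += n * k  # jump past the repeat just recorded
                    continue
            i += 1
    return hits
```

Counting the hits by motif length then gives the proportions of mono-, di-, tri-, and tetranucleotide SSRs. The sketch reports motifs as they appear on the input strand, so A and T runs (or AAG and CTT repeats) would need to be pooled to match the motif classes used in the text.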

## Identification and Phylogenetic Analysis of Ligularia Species

The CP genomes from the six Ligularia species are highly similar. Among the few variations, non-coding regions exhibited higher levels of variability than the coding regions. The largest change in gene length occurred in pseudogene ycf1, with 5,097 bp in L. mongolica, 5,100 bp in L. hodgsonii and L. veitchiana, and 5,094 bp in the other three species. This difference led to a divergence in the length of the coding regions of the six species. The IR regions of the six CP genomes are conserved in terms of the number and order of the genes. Previous research has screened highly variable regions from CP genomes as potential DNA barcodes for authenticating species, such as Dioscorea (Ma et al., 2018) and Fritillaria species (Li et al., 2016).

Sequence homology was investigated by comparison with the reference CP genome from L. hodgsonii using the mVISTA software (**Figure 4**). Our results showed high similarity among all sequences. Differences were observed in the intergenic regions of matK-trnK and ndhG-ndhI (**Figure 4**). There was only one variable site in the matK-trnK region and five variable sites in the ndhG-ndhI region, which is not enough to distinguish among the six Ligularia species. Because the sequences, structure, and size of the CP genomes of Ligularia species are highly conserved, no obvious hypervariable region was found. Thus, the complete CP genomes were considered for distinguishing Ligularia species.
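Counting variable sites in a region, as done for matK-trnK and ndhG-ndhI, reduces to checking each column of a multiple sequence alignment for more than one distinct base. The Python sketch below shows this under the assumption that the sequences are already aligned to equal length; the function name and the treatment of gaps and N as missing data are illustrative choices, not the authors' stated procedure.

```python
def variable_sites(aligned_seqs, ignore_gaps=True):
    """Return 0-based alignment columns that contain more than one base."""
    if len({len(s) for s in aligned_seqs}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for pos, column in enumerate(zip(*aligned_seqs)):
        bases = {b.upper() for b in column}
        if ignore_gaps:
            bases -= {"-", "N"}  # treat gaps and unknown bases as missing data
        if len(bases) > 1:
            sites.append(pos)
    return sites
```

For example, `variable_sites(["ACGTAC", "ACGTAC", "ACCTAC", "ACGTA-"])` reports a single variable column. A region with only one to five such columns across six species cannot give each species a distinct sequence pattern with much margin, which is why the text turns to the whole genome.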

In addition to the six CP genomes sequenced in this study, 25 other CP genomes from Asteraceae were chosen to construct the phylogenetic tree, and P. grandiflorus and A. remotiflora (Campanulaceae) were included as outgroups (**Figure 5**). In the ML tree, we identified two main clades (clades A and B) excluding the outgroup species. The six Ligularia species formed a monophyletic group with strong support (100%). The support values in clade A were not less than 60%, and L. fischeri and L. jaluensis have a close relationship. Ligularia is most closely related to Lactuca sativa, Saussurea involucrata, Centaurea diffusa, and Carthamus tinctorius. The results showed that the CP genomes can be used to identify the six Ligularia species.

#### CONCLUSION

This study reported the CP genomes from six Ligularia species, and the structure and composition of the CP genomes are highly similar. Like most Asteraceae species, the CP genomes of the six Ligularia species have a small 3.4 kb inversion within a large 23 kb inversion in the LSC region. The ML tree showed that the CP genome can be used to identify the six Ligularia species and is expected to become a super-barcode for the identification of Ligularia species.

#### AUTHOR CONTRIBUTIONS

HY conceived the study and acquired the funding. XC, YW, and BD collected samples and conducted the experiment. JZ and YC performed the genome assembly and analysis on the data. XC and JZ wrote the manuscript. All authors have read and approved the final manuscript.

## FUNDING

This work was supported by CAMS Innovation Fund for Medical Sciences (CIFMS) (No. 2016-I2M-3-016) and Major Scientific and Technological Special Project for "Significant New Drugs Creation" (No. 2014ZX09304307001).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2018.00695/full#supplementary-material

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chen, Zhou, Cui, Wang, Duan and Yao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Detection of Cistanches Herba (Rou Cong Rong) Medicinal Products Using Species-Specific Nucleotide Signatures

Xiao-yue Wang<sup>1</sup>, Rong Xu<sup>1</sup>, Jun Chen<sup>1</sup>, Jing-yuan Song<sup>1</sup>, Steven-G Newmaster<sup>2</sup>, Jian-ping Han<sup>1</sup>\*, Zheng Zhang<sup>1</sup>\* and Shi-lin Chen<sup>3</sup>

*<sup>1</sup> Key Laboratory of Bioactive Substances and Resources Utilization of Chinese Herbal Medicine, Ministry of Education, Institute of Medicinal Plant Development, Chinese Academy of Medicinal Science and Peking Union Medicinal College, Beijing, China, <sup>2</sup> NHP Research Alliance, Biodiversity Institute of Ontario (BIO), University of Guelph, Guelph, ON, Canada, <sup>3</sup> Key Laboratory of Beijing for Identification and Safety Evaluation of Chinese Medicine, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China*

#### Edited by:

*Caroline Howard, Medicines and Healthcare Products Regulatory Agency, United Kingdom*

#### Reviewed by:

*Catherine Anne Kidner, University of Edinburgh, United Kingdom Anna Paola Casazza, Istituto di Biologia e Biotecnologia Agraria (IBBA), Italy*

#### \*Correspondence:

*Jian-ping Han jphan@implad.ac.cn Zheng Zhang zhangzheng321@126.com*

#### Specialty section:

*This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science*

> Received: *27 February 2018* Accepted: *23 October 2018* Published: *13 November 2018*

#### Citation:

*Wang X, Xu R, Chen J, Song J, Newmaster S-G, Han J, Zhang Z and Chen S (2018) Detection of Cistanches Herba (Rou Cong Rong) Medicinal Products Using Species-Specific Nucleotide Signatures. Front. Plant Sci. 9:1643. doi: 10.3389/fpls.2018.01643*

Cistanches Herba is a medicinal plant that has tonification properties and is commonly used in Asia. Owing to the imbalance between supply and demand, adulterants are frequently added for profit. However, there is no regulatory oversight because quality control tools are not sufficient for identifying heavily processed products. Thus, a novel molecular tool based on nucleotide signatures and species-specific primers was developed. The ITS2 regions from 251 Cistanches Herba and adulterant samples were sequenced. On the basis of SNP sites, four nucleotide signatures within 30–37 bp and six species-specific primers were developed, and they were validated by artificial experimental mixtures consisting of six different species and different ratios. This method was also applied to detect 66 Cistanches Herba products on the market, including extracts and Chinese patent medicines. The results demonstrated the utility of nucleotide signatures in identifying adulterants in mixtures. The market study revealed 36.4% adulteration: 19.7% involved adulteration with *Cynomorium songaricum* or *Cistanche sinensis*, and 16.7% involved substitution with *Cy. songaricum*, *Ci. sinensis,* or *Boschniakia rossica*. The results also revealed that *Cy. songaricum* was the most common adulterant in the market. Thus, we recommend the use of species-specific nucleotide signatures for regulating adulteration and verifying the quality assurance of medicinal product supply chains, especially for processed products whose DNA is degraded.

Keywords: Cistanches Herba, Chinese patent medicine, nucleotide signature, degraded DNA, medicine quality control

## INTRODUCTION

Cistanches Herba (Rou Cong Rong) is a well-known Pharmacopoeia-recorded medicine in Asia (Chinese Pharmacopoeia Commission, 2015; Japan Pharmacopeial Convention, 2016); this medicine is derived from the dried succulent stems of Cistanche deserticola Y. C. Ma or Cistanche tubulosa (Schenk) Wight. Cistanches Herba has been used for more than 3,000 years as a superior tonic, as it is not toxic and can be taken for long periods of time (Li et al., 2016). Furthermore, Cistanches Herba has been given the name "Desert Ginseng" because of its great medicinal value, especially in strengthening male sexual function (Zhang and Su, 2014; Gu et al., 2016). More than 100 Chinese patent medicines are recorded in the Chinese Pharmacopoeia (Chinese Pharmacopoeia Commission, 2015) and in other officially promulgated local standards (Wang et al., 2012). As the population of elderly individuals increases, there is considerable demand for Cistanches Herba and its medicinal products. However, raw material resources are becoming increasingly scarce. In fact, the two original species of Cistanches Herba have been added to the China Plant Red Data Book as state-protected wild plants (category II) (Fu, 1991). The medicinal materials on the market are mainly cultivated in northwestern China.

Owing to the considerable imbalance between the supply and demand of Cistanches Herba, many adulterants have entered the market; these adulterants are inconsistent with standards and can threaten drug security. The known adulterants include the dried succulent stems of Cynomorium songaricum Rupr. (Cynomorii Herba, Suo Yang in Chinese), Cistanche sinensis Beck, Orobanche coerulescens Stephan, and Boschniakia rossica (Cham. et Schlecht.) Fedtsch. et Flerov (Sun et al., 2012). These adulterants have morphological characteristics similar to those of Cistanches Herba, making traditional taxonomic identification difficult, particularly after the material is processed into medicinal products. Microscopic identification is not available because Cistanches Herba has no definitive or unique microscopic characteristics. Current analytical chemistry tools are not sufficient for detecting adulteration of Cistanches Herba because similar compounds also exist within the known adulterants Ci. salsa (Lei et al., 2001; Chen et al., 2007) and Ci. sinensis (Liu et al., 2013). Therefore, the development of a rapid molecular method for the authentication of Cistanches Herba and its products is urgently needed for proper quality control systems in Chinese patent medicine and other medicinal product industries.

Chen et al. first suggested internal transcribed spacer 2 (ITS2) as a universal barcode for medicinal plants (Chen et al., 2010). Sun et al. verified the ITS2 region as a preferable DNA barcode for identifying Cistanches Herba and its adulterants (Sun et al., 2012). However, ITS2 cannot be used to distinguish Chinese patent medicine with degraded DNA. Recently, an increasing number of studies have shown that the "mini barcode" is a useful method for amplifying degraded DNA (Hajibabaei et al., 2006; Meusnier et al., 2008; Dubey et al., 2011; Lo et al., 2015). However, nucleotide signatures are more appropriate than mini barcodes, as the former refer to one or more nucleotides that are unique to one species and can be effectively utilized by many molecular techniques, such as DNA probes, microfluidics and loop-mediated isothermal amplification (de Boer et al., 2015). Han et al. developed nucleotide signatures for Panax ginseng, Angelica sinensis and Lonicera japonica and successfully identified the associated Chinese patent medicines (Liu et al., 2016; Wang et al., 2016a; Gao et al., 2017).

Our research focused on developing nucleotide signatures for the identification of adulterants in functional products containing Cistanches Herba. Specifically, we (1) developed four nucleotide signatures and six specific primer pairs for differentiating authentic Cistanches Herba and known adulterants, (2) validated the four nucleotide signatures in experiments including mixtures of known taxonomic vouchers used to prepare Cistanches Herba products that contain both authentic and adulterated ingredients, and (3) performed a market survey of 66 Cistanches Herba products via their nucleotide signatures.

## MATERIALS AND METHODS

#### Sample Collection and Preparation

In total, 251 samples were collected from Inner Mongolia, Xinjiang, and Ningxia, among other areas; these samples included 214 Cistanches Herba and 37 adulterants and are detailed in **Supplementary Table S1**. Corresponding voucher samples were validated by taxonomists and deposited in the Herbarium of the Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Beijing, China. A total of 35 batches of powders, slices and extracts of Cistanches Herba were purchased from online stores and brick-and-mortar drugstores in Beijing and Chengdu (**Table 1**). In total, 31 batches of Chinese patent medicine containing Cistanches Herba were purchased from different drugstores (**Table 2**), and the declared compositions of different Chinese patent medicines are shown in **Supplementary Table S3**. The different morphological characteristics of different dose forms are shown in **Figure 1**.

Mixed samples: Powders of Ci. deserticola, Ci. tubulosa, Cy. songaricum, Ci. sinensis, B. rossica, and O. coerulescens were artificially mixed in different combinations (at a ratio of 1:1) before extraction. The details of these mixed samples are shown in the legend of **Figure 3**. In addition, the powders of four adulterants were mixed with the genuine Ci. deserticola at different weight ratios: 10:1, 50:1, 100:1, 200:1, 500:1, 1000:1, 2000:1, 5000:1, 10000:1, 15000:1, 20000:1, 25000:1, 30000:1, 40000:1, 50000:1 and 60000:1 (**Table 3**). The two genuine species were also mixed in the same proportions.
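To make the dilution series concrete, a weight ratio of N:1 corresponds to a minor-component mass fraction of 1/(N + 1). The following minimal sketch converts the ratios listed above into mass fractions. The function and variable names are illustrative assumptions and are not part of the study.

```python
from fractions import Fraction

# Weight ratios (major:minor, written as N:1) listed in the text for the
# artificial mixtures. Only the major part N is stored here.
RATIOS = [10, 50, 100, 200, 500, 1000, 2000, 5000, 10000,
          15000, 20000, 25000, 30000, 40000, 50000, 60000]


def minor_fraction(major_parts: int, minor_parts: int = 1) -> Fraction:
    """Mass fraction of the minor component in a major:minor weight mixture."""
    return Fraction(minor_parts, major_parts + minor_parts)


if __name__ == "__main__":
    for n in RATIOS:
        frac = minor_fraction(n)
        # Report the minor component as a percentage of the total mass.
        print(f"{n}:1 -> minor component = {float(frac) * 100:.4f}% of total mass")
```

Exact fractions are used so that very small proportions (for example, 1 part in 60,001) are not distorted by rounding before they are printed.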

Decoction: Slices of Ci. deserticola and Cy. songaricum were used to prepare the decoction. The slices (10 g) were boiled in 300 mL of double-distilled water for 30, 60, 90, 120, 150, 180, 210, and 240 min and then used for DNA extraction.

#### DNA Extraction, Polymerase Chain Reaction (PCR) Amplification and Sequencing

Specimens, decoction, and mixed samples: The samples (40–50 mg) were ground into fine powders via a Retsch MM400 laboratory mixer mill (Retsch Co., Germany) at a frequency of 30 Hz. The genomic DNA was subsequently extracted with a Plant Universal Genomic DNA Kit (Tiangen Biotech Beijing Co., China) according to the manufacturer's instructions. ITS2 was amplified by the universal primers 2F/3R (Chen et al., 2010).

Extract and Chinese patent medicines: Samples (40–50 mg) were collected into a tube and then ground via a Retsch MM400 laboratory mixer mill (Retsch Co.). Ten samples were collected in parallel per batch. The Chinese patent medicine powder was washed with 700 µL of prewash buffer [100 mM Tris-HCl, pH 8.0; 20 mM ethylenediaminetetraacetic acid (EDTA), pH 8.0; 700 mM NaCl; 2% polyvinylpyrrolidone (PVP)-40; and 0.4% β-mercaptoethanol] several times until the supernatant was clear and colorless, after which the mixture was centrifuged at 7500 × g for 5 min at room temperature. The precipitate was subsequently used to extract the genomic DNA via the Plant Universal Genomic DNA Kit (Tiangen Biotech Beijing Co.) according to the manufacturer's instructions. In the end, DNA from each batch was concentrated into one tube.

TABLE 1 | Sample information and identification results of the 35 powders, medicinal slices, and extract.

Six species-specific primer pairs (SYF1/SYR1, HMRCF/HMRCR, GHRCF/GHRCR, CCRF/CCRR, SCRF/SCRR, and LDF/LDR) were designed via Primer Premier 6.0 software (Premier Co., Canada) to amplify Cistanches Herba and its adulterants (the details are shown in **Table 4**). PCR was performed in a 50 µL-volume reaction containing 1 µL of KOD FX (Toyobo Co., Japan), 25 µL of 2 × PCR buffer, 10 µL of dNTPs (2 mM), 1.5 µL of each primer (10 µM), and 4 µL (∼50 ng) of DNA template; the remaining volume consisted of double-distilled water. The reactions were performed in a thermal cycler (Veriti™ 96-Well Thermal Cycler, Applied Biosystems Co., USA); the thermal programs are listed in **Table 4**. The PCR products were examined via 3% agarose gel electrophoresis and purified for bidirectional sequencing with an ABI 3730XL sequencer (Applied Biosystems Co.) in accordance with the Sanger sequencing method. Then, the sensitivity of the six primer pairs was tested via a quantitative real-time PCR (qRT-PCR) assay with a CFX96 Real-time System (Bio-rad Lab., USA). The cycle thresholds were automatically calculated by the system via the "PCR Baseline Subtracted Curve Fit" model. qRT-PCR was performed in a 15 µL-volume reaction containing 7.5 µL of SYBR® Premix Ex Taq™ (Tli RNaseH Plus) (Takara Bio Co., Japan), 1.0 µL of each primer (10 µM), and 1.0 µL of DNA template, with the remaining volume consisting of double-distilled water. The qRT-PCR was performed with three technical replicates, from which the standard error was calculated.
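The reaction recipes above leave the water volume implicit. As a small worked check, the sketch below computes the double-distilled water needed to reach the stated total volume and the final concentration of a stock reagent after dilution. It assumes that "each primer" means one forward and one reverse primer per reaction. The dictionary keys and function names are illustrative only.

```python
def water_volume(total_ul: float, components_ul: dict) -> float:
    """Volume of water (µL) needed to bring the listed components up to total_ul."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution (same unit as stock), from C1*V1 = C2*V2."""
    return stock * added_ul / total_ul


# Conventional PCR, 50 µL total (volumes taken from the text).
PCR_MIX = {
    "KOD FX": 1.0,
    "2x PCR buffer": 25.0,
    "dNTPs (2 mM)": 10.0,
    "forward primer (10 uM)": 1.5,   # assumption: one forward primer
    "reverse primer (10 uM)": 1.5,   # assumption: one reverse primer
    "DNA template": 4.0,
}

# qRT-PCR, 15 µL total (volumes taken from the text).
QPCR_MIX = {
    "SYBR Premix Ex Taq": 7.5,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "DNA template": 1.0,
}

if __name__ == "__main__":
    print("PCR water (uL):", water_volume(50.0, PCR_MIX))
    print("qRT-PCR water (uL):", water_volume(15.0, QPCR_MIX))
    print("PCR primer final conc. (uM):", final_concentration(10.0, 1.5, 50.0))
    print("PCR dNTP final conc. (mM):", final_concentration(2.0, 10.0, 50.0))
```

Under these assumptions the conventional PCR needs 7 µL of water and the qRT-PCR needs 4.5 µL.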

Sequence analysis: The sequences were edited and manually assembled via CodonCode Aligner 5.1.4 (CodonCode Co., USA).

ITS sequences from GenBank were annotated via the Hidden Markov model (HMM) to obtain the ITS2 sequences (Keller et al., 2009). The sequences were then aligned by MEGA 5.0 software via the "Muscle" alignment method (Edgar, 2004; Tamura et al., 2011).

## RESULTS

#### Development of Nucleotide Signatures and Species-Specific Primer Pairs for Cistanches Herba and Adulterants

The PCR amplification and sequencing success rates of the 251 samples were 100% when the primer pair 2F/3R was used. The aligned length of the Cy. songaricum ITS2 sequences was 229 bp (**Supplementary Figure S1**). Analysis of the sequences from the herbarium species and those of closely related species retrieved from GenBank (**Supplementary Table S2**) revealed two single nucleotide polymorphism (SNP) sites for Cy. songaricum. On the basis of the SNPs, one Cy. songaricum-specific 30 bp nucleotide signature (5′-caattatttg aggtgcattg taagaagcgt-3′) was developed (**Figure 2A**). Basic Local Alignment Search Tool (BLAST) results in NCBI demonstrated that this nucleotide signature was unique to Cy. songaricum (**Table 5**). A similar analysis of the sequences from closely related species revealed one to two SNPs in Ci. sinensis, B. rossica and O. coerulescens. On the basis of these SNPs, nucleotide signatures for the other three adulterants were developed in the same way: a 34 bp signature (5′-cgatggtctc ccgtgcgcga ggatgcacgg ccgg-3′) for Ci. sinensis, a 37 bp signature (5′-acactggcct cccgtgcgca acgacgtgcg gccggtc-3′) for B. rossica, and a 31 bp signature (5′-gtctgtcgtg tcggatggtg ttgcttgttg g-3′) for O. coerulescens (**Figures 2B–D**). BLAST analysis in NCBI also revealed that these nucleotide signatures were specific and not present in any other species (**Table 5**).
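As an illustration of how the four signatures could be used computationally, the sketch below searches a sequencing read for an exact match to each signature on either strand. This is a simplified assumption and not the authors' pipeline, which relied on PCR, Sanger sequencing and BLAST. The signature strings are copied from the text, and the function names are hypothetical.

```python
# Nucleotide signatures as printed in the text (spaces are removed below).
_RAW_SIGNATURES = {
    "Cynomorium songaricum": "caattatttg aggtgcattg taagaagcgt",
    "Cistanche sinensis": "cgatggtctc ccgtgcgcga ggatgcacgg ccgg",
    "Boschniakia rossica": "acactggcct cccgtgcgca acgacgtgcg gccggtc",
    "Orobanche coerulescens": "gtctgtcgtg tcggatggtg ttgcttgttg g",
}
SIGNATURES = {sp: s.replace(" ", "") for sp, s in _RAW_SIGNATURES.items()}

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (other symbols are left unchanged)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def detect_signatures(read: str) -> list:
    """Return the adulterant species whose signature occurs exactly in the read.

    Both strands are checked. Exact matching is an assumption made for
    simplicity; real reads may contain sequencing errors.
    """
    read = read.lower().replace(" ", "")
    hits = []
    for species, signature in SIGNATURES.items():
        if signature in read or reverse_complement(signature) in read:
            hits.append(species)
    return hits


if __name__ == "__main__":
    # Hypothetical read: a signature embedded in arbitrary flanking bases.
    demo = "ttttt" + SIGNATURES["Cistanche sinensis"] + "aaaaa"
    print(detect_signatures(demo))
```

Storing the signatures in the grouped form used in the text and stripping the spaces in code makes it easy to check them against the printed sequences and their stated lengths.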

Cistanches Herba and its adulterants could be amplified simultaneously from mixtures via the 2F/3R universal primer pair. Thus, we designed species-specific primers for nucleotide signature amplification by aligning the ITS2 sequences. Four specific primer pairs—SYF1/SYR1, CCRF/CCRR, SCRF/SCRR, and LDF/LDR—were designed to amplify the nucleotide signatures of Cy. songaricum, B. rossica, Ci. sinensis, and O. coerulescens, respectively (**Table 4**). The lengths of the amplicons were 123, 72, 131, and 71 bp, respectively.

In addition, a total of 214 ITS2 sequences from experimental Cistanches Herba materials were analyzed. Two short specific primer pairs (HMRCF/HMRCR for Ci. deserticola and GHRCF/GHRCR for Ci. tubulosa) were designed to amplify short regions from the degraded samples (**Table 4**). The lengths of the amplicons were 132 and 134 bp, respectively.

#### Validation of the Nucleotide Signature and Species-Specific Primer Method Based on Artificial Mixtures and Decoction

The amplification efficiencies of the new primer pairs were validated with the mixtures. PCR products were obtained via each primer pair for each targeted species, as shown in **Figure 3**. For instance, the fifth mixture sample is a mixture of Cistanches Herba and its four adulterants, and each species could be amplified with the primers SYF1/SYR1, SCRF/SCRR, CCRF/CCRR, LDF/LDR, HMRCF/HMRCR, and GHRCF/GHRCR. Moreover, the amplification regions were sequenced, and the nucleotide signatures were observed within the target sequences. Furthermore, to measure the sensitivity, mixtures of two of the six samples were created at different weight ratios, and the sample with the lower proportion was amplified by specific primer pairs via qRT-PCR. As shown in **Table 3**,

FIGURE 2 | DNA sequence alignment results and SNP sites of the four nucleotide signatures. (A) Alignment of *Cynomorium songaricum* nucleotide signature with its region location in ITS2; (B) Alignment of *Cistanche sinensis* nucleotide signature with its region location in ITS2; (C) Alignment of *Boschniakia rossica* nucleotide signature with its region location in ITS2; (D) Alignment of *Orobanche coerulescens* nucleotide signature with its region location in ITS2. The highlighted regions represent nucleotide signature regions and the marked bases represent the SNP sites of each nucleotide signature; the dots represent identical nucleotides.

*Ci. deserticola,* and *Ci. tubulosa* with the primers SYF1/SYR1, SCRF/SCRR, CCRF/CCRR, LDF/LDR, HMRCF/HMRCR, and GHRCF/GHRCR, respectively.


To verify whether the nucleotide signature method functions with processed materials, decoctions of Ci. deserticola and Cy. songaricum were prepared. The results showed that the short barcode from Ci. deserticola could be amplified, even after the samples were boiled for 210 min. In addition, the nucleotide signature of Cy. songaricum could be amplified after the samples were boiled for 150 min, while no PCR products were detected after the samples were boiled for 210 or 240 min (**Figure 4**). The sequencing results demonstrated that the short nucleotide signature was successfully obtained from the decoction.

#### Market Survey of Adulteration via Nucleotide Signature and Specific Primers

The above method was applied to detect Cistanches Herba products on the market. Thirty-five batches of Cistanches Herba slices, powders and extracts were amplified and sequenced using the six designed specific primer pairs (the agarose gel electrophoresis results are shown in **Supplementary Figure S2** and the sequences are listed in **Supplementary Data Sheet 1**). Analysis of the sequences via their nucleotide signatures revealed that five batches of slices were authentic. One slice batch and one powder batch were substituted with Cy. songaricum, and one extract was substituted with Ci. sinensis. The other six batches of powders were mixtures: five were adulterated with Cy. songaricum and Ci. sinensis, and one was adulterated with Cy. songaricum (**Table 1**). In addition, one slice batch was substituted with Salvia miltiorrhiza.

The Cistanches Herba in Chinese patent medicines is subjected to various processes that make authentication difficult. The ability of the six primer pairs to amplify the species-specific nucleotide signature regions from the Chinese patent medicines was tested (the agarose gel electrophoresis results are shown in **Supplementary Figure S2**). Most Chinese patent medicines contain components from several different species. For example, Shihu Yeguang pills (ZCY34) contain 25 ingredients, including Cistanches Herba (Rou Cong Rong), Dendrobii Caulis (Shi Hu), and Ginseng Radix et Rhizoma (Ren Shen). From these ingredients, Cistanches Herba could be amplified specifically via the primer pair GHRCF/GHRCR. Direct sequencing of the PCR products revealed very clean traces. However, a visible band could not be obtained with the primer pair HMRCF/HMRCR.

Another example is Kangguzhi Zengsheng pills (ZCY44). There are eight ingredients in addition to Cistanches Herba in this Chinese patent medicine, but no visible bands were obtained with either GHRCF/GHRCR or HMRCF/HMRCR, which meant that no Cistanches Herba was present. However, the adulterant region of Cy. songaricum was successfully amplified by the specific primer pair SYF1/SYR1. The sequencing results demonstrated that the short nucleotide signature of Cy. songaricum was successfully obtained.

TABLE 3 | qRT-PCR Cq values with standard deviation for mixtures at different sample weight ratios.


TABLE 5 | BLAST results of the four conserved nucleotide regions.


By searching for the nucleotide signatures developed in this study, we found that 15 of 31 Chinese patent medicines labeled as containing Cistanches Herba instead contained adulterants, including eight counterfeit ingredients and seven adulterants (**Table 2**). For example, six batches were replaced with Cy. songaricum, including three batches of Shihu Yeguang pills, one batch of Kangguzhi Zengsheng pills, one batch of Sanbao capsules, and one batch of Wenweishu particles.

Moreover, different batches from the same manufacturer produced somewhat different results. For example, among three batches from one manufacturer (ZCY16, ZCY69, and ZCY71), one batch comprised a mixture of Ci. deserticola and Cy. songaricum, one batch comprised a mixture of Ci. deserticola and Ci. tubulosa, and one batch contained only Ci. tubulosa. Two batches from another manufacturer (ZCY44 and ZCY70) also differed: one batch contained Cy. songaricum, whereas the other batch contained Ci. deserticola and Ci. tubulosa.

Cistanches Herba was detected in 23 of the 31 Chinese patent medicines tested (**Table 2**), and only 16 samples were authentic, i.e., without adulterants or counterfeit ingredients. Ci. tubulosa was detected in 16 batches of Chinese patent medicines, and Ci. deserticola was detected in 10 batches. O. coerulescens was not detected in any of the products (**Table 2**).
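The decision logic used throughout the market survey can be summarized as a simple rule: a product is authentic when only the genuine species are detected, adulterated when genuine species and adulterants are detected together, and counterfeit when only adulterants are detected. The sketch below encodes that rule, assuming each primer pair maps to the species given in the text and that a sequence-confirmed PCR product counts as a detection. The category labels and the function name are our own shorthand, not terms defined by a standard.

```python
# Primer pair -> target species, as described in the text.
PRIMER_TARGETS = {
    "HMRCF/HMRCR": "Cistanche deserticola",
    "GHRCF/GHRCR": "Cistanche tubulosa",
    "SYF1/SYR1": "Cynomorium songaricum",
    "SCRF/SCRR": "Cistanche sinensis",
    "CCRF/CCRR": "Boschniakia rossica",
    "LDF/LDR": "Orobanche coerulescens",
}

# The two source species of Cistanches Herba.
GENUINE = {"Cistanche deserticola", "Cistanche tubulosa"}


def classify(positive_primer_pairs) -> str:
    """Classify a product from the primer pairs that gave confirmed products."""
    detected = {PRIMER_TARGETS[p] for p in positive_primer_pairs}
    genuine = detected & GENUINE
    adulterants = detected - GENUINE
    if genuine and not adulterants:
        return "authentic"
    if genuine and adulterants:
        return "adulterated"
    if adulterants:
        return "counterfeit"
    return "undetermined"  # nothing amplified; no conclusion can be drawn


if __name__ == "__main__":
    # ZCY44 as described in the text: only SYF1/SYR1 gave a product.
    print("ZCY44:", classify(["SYF1/SYR1"]))
    # Hypothetical batch with a genuine species plus an adulterant.
    print("example mixture:", classify(["HMRCF/HMRCR", "SYF1/SYR1"]))
```

An "undetermined" outcome is kept separate because a failed amplification may reflect severe DNA degradation rather than the absence of plant material.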

## DISCUSSION

#### Necessity of Developing a New Method for Monitoring Commercially Available Medicinal Products Containing Cistanches Herba

Cistanches Herba is a tonic that is widely used in restorative Chinese patent medicines and other medicinal products. However, the quality control of Chinese patent medicines presents a great challenge due to the diversity and complexity of the ingredients. Due to the lack of regulatory oversight, there is considerable opportunity for product adulteration or counterfeiting. In addition, all products should be processed in accordance with the Pharmacopoeia or other standards; adulterating or counterfeiting is not permitted during processing. Thus, a variety of quality control methods have been established, such as multi-heart-cutting two-dimensional liquid chromatography (Yao et al., 2015), near-infrared reflectance spectroscopy (Zhang and Su, 2014; Zhang et al., 2015) and liquid chromatography-mass spectrometry (Wang et al., 2016b). However, the analytical chemistry methods currently in the Chinese Pharmacopoeia Commission (2015) cannot be used to authenticate all of the ingredients in Chinese patent medicines or to detect the presence of adulterant ingredients. Moreover, studies have shown that targeted metabolites in plants are altered during product processing, resulting in considerable variability in test results or complete failure of test methods (Ananingsih et al., 2013). Thus, molecular tools such as species-specific nucleotide signatures are poised to reinforce quality control systems against the risk of fraudulent product substitution and adulteration and inclusion of unlabeled ingredients.

Although ITS/ITS2 is considered a high-efficiency tool for the identification of herbal medicines, these sequences cannot be amplified from highly processed samples (Newmaster et al., 2013; de Boer et al., 2015). Wang et al. reported that ITS2 could not be amplified from Angelicae Sinensis Radix extract or decoctions boiled for more than 120 min (Wang et al., 2016a). According to traditional technologies and the Chinese Pharmacopoeia Commission (2015), Cistanches Herba is always highly processed to increase its medicinal efficacy; these processes include oven drying, salting and steaming with wine (Zou et al., 2017), which lead to DNA degradation. In addition, various excipients are added during processing, such as honey, starch, and dextrin. If these excipients are not removed completely, the purity of the DNA will be affected. For example, the following manufacturing process is used to generate Cistanches Herba-containing Sanbao capsules: "Boil the medicinal slices for 1.5 h twice, combine the decoctions and filter the mixture. Concentrate the filtrate to a relative density of 1.20∼1.25 (at 80 °C). Add other ground powders and combine them to obtain a homogeneous mixture. Next, dry the mixture at 60 °C, and then grind it into a fine powder." After such a production process, DNA extraction from Chinese patent medicines can be difficult, and long fragments might not be amplified from the degraded DNA, which would prevent the identification of adulterants. Thus, to ensure the quality and purity of the DNA, we added steps before the genomic DNA extraction, including washing with prewash buffer and eluting ten parallel tubes into one tube for each batch.

Although all the Chinese patent medicines used in the present study contained 6–25 ingredients, the primer pairs developed could specifically amplify the sequences of the adulterants in these Chinese patent medicines. Direct sequencing of the PCR products showed clean trace files. Thus, this nucleotide signature method is capable of identifying both authentic species ingredients and adulterants and should broaden the application of DNA-based molecular diagnostic tools for market supervision.

#### Nucleotide Signatures for the Effective Identification of Cistanches Herba Products

Molecular tools that utilize PCR technology are very promising for medicinal product authentication within quality control systems. The successful application of the primers for identifying DNA-degraded adulterants from Cistanches Herba suggests that a PCR-based detection method could be used widely. In the Chinese herbal medicine market, the price of authentic Cistanches Herba species ingredients is more than five times higher than the prices of its adulterants. Our results showed that Cy. songaricum is the most common adulterant of Cistanches Herba on the market, followed by Ci. sinensis. Cy. songaricum was added because these medicines share similar morphological characteristics. In addition, the chemical composition of Ci. sinensis is similar to that of Cistanches Herba. As quality control markers for Cistanches Herba extracts, echinacoside and acteoside can be inexpensively extracted from Ci. sinensis. Thus, some pharmaceutical factories use Ci. sinensis as a substitute in the production of Cistanches Herba extracts. Taken together, the results of this study indicate that there is considerable fraud in the market for medicinal products.

Adulteration in Chinese patent medicine is similar to that found in other countries. Similar levels of adulteration have been recorded in North America (Newmaster et al., 2013), Europe (Raclariu et al., 2017), and Asia (Cheng et al., 2014; Shanmughanandhan et al., 2016; Gao et al., 2017). In this study, the adulteration rate of Chinese patent medicines was approximately 48.4%, with only 16 of the 31 samples being authentic Cistanches Herba. In addition, we speculate that the different results obtained for products from the same manufacturer could be attributed to differences in the quality of the different batches of Chinese medicine materials. Therefore, to control the quality of Chinese patent medicines, the raw materials should be authenticated before being processed into products.
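The reported rate follows directly from the counts given in the Results section. A minimal arithmetic check is shown below. The counts are taken from the text, and the helper function name is illustrative.

```python
def percent(count: int, total: int) -> float:
    """Percentage of count in total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts reported for the Chinese patent medicines (Table 2 in the text).
TOTAL_CPM = 31
COUNTERFEIT = 8      # products containing only counterfeit ingredients
ADULTERATED = 7      # products containing adulterants alongside genuine material
AUTHENTIC = 16

NON_AUTHENTIC = COUNTERFEIT + ADULTERATED

if __name__ == "__main__":
    print("non-authentic products:", NON_AUTHENTIC)
    print("adulteration rate (%):", percent(NON_AUTHENTIC, TOTAL_CPM))
    print("authentic rate (%):", percent(AUTHENTIC, TOTAL_CPM))
```

The counterfeit and adulterated products together account for the 15 non-authentic samples, and 15 of 31 corresponds to the stated rate of approximately 48.4%.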

Adulteration of Cistanches Herba has traditionally been associated with issues of supply and demand of raw materials. Ci. deserticola and Ci. tubulosa are the two original plants currently used to formulate Cistanches Herba. However, Ci. deserticola is the only original species in traditional authentic Cistanches Herba listed in the Chinese Pharmacopoeia Commission (2000), in which Ci. tubulosa is identified as an adulterant. Owing to the shortage of Ci. deserticola resources, Ci. tubulosa has been listed as a supplement in the Chinese Pharmacopoeia since 2005 (Jiang and Tu, 2009). Until recently, the prices of these herbs have markedly differed; Ci. tubulosa has been much less expensive than Ci. deserticola because there is a much larger supply of the former. Here, our results showed that Ci. tubulosa is more widely used in commercially available Cistanches Herba products.

In conclusion, the nucleotide signatures and PCR-based methods developed in this study may serve as useful tools for the medicinal product industry to authenticate ingredients and detect adulterants in Cistanches Herba products. According to the sensitivity results, an adulterant could be detected via qRT-PCR even when its proportion was as low as one in ten thousand. This means that once an adulterant-specific nucleotide signature is detected in a Cistanches Herba-containing functional product, the corresponding species can be identified as an adulterant or counterfeit ingredient. The method thus provides a novel solution for detecting counterfeit ingredients or adulterated Cistanches Herba that was not previously available via the chemical detection methods in the Chinese Pharmacopoeia. In addition, this method could be used to validate an increasing range of medicines and to broaden the applications of DNA-based molecular diagnostic tools for market supervision.

#### AUTHOR CONTRIBUTIONS

JH conceived the study and participated in its design. XW, RX, and JC contributed samples and performed the experiments. XW analyzed the data. XW, JH, ZZ, S-GN, JS, and SC drafted the manuscript. All authors have read and approved the final manuscript.

#### FUNDING

This work was supported by the National Natural Science Foundation of China [grant number 81673552], the CAMS Innovation Fund for Medical Sciences [grant number 2016-I2M-3-016], and the United Fund Key Project of the National Natural Science Foundation of China [grant number U1403224].

#### ACKNOWLEDGMENTS

We would like to thank our colleagues who helped with the sample collection, identification, laboratory work and manuscript preparation, including Chaokui Sun, Dianyun Hou, and Piao Zhang.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01643/full#supplementary-material

Supplementary Figure S1 | The alignment result of ITS2 sequence from six species.

Supplementary Figure S2 | Agarose gel electrophoresis of all samples.

Supplementary Table S1 | Sampling information of Cistanches Herba and its adulterants.

Supplementary Table S2 | Sequence information of related species downloaded from GenBank.

Supplementary Table S3 | The declared compositions of different Chinese patent medicine samples.

Supplementary Data Sheet S1 | All the sequences obtained in this study.

#### REFERENCES


neurotransmitters from diminution in 6-hydroxydopamine lesion rats. J. Ethnopharmacol. 114, 285–289. doi: 10.1016/j.jep.2007.07.035


highthroughput sequencing: the story for Liuwei Dihuang Wan. Sci. Rep. 4:5147. doi: 10.1038/srep05147


Fu, L. (1991). China Plant Red Data Book. Part I. Beijing: Science Press, 502.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wang, Xu, Chen, Song, Newmaster, Han, Zhang and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Screening and Identification of Cardioprotective Compounds From Wenxin Keli by Activity Index Approach and in vivo Zebrafish Model

Hao Liu<sup>1</sup> , Xuechun Chen<sup>1</sup> , Xiaoping Zhao<sup>2</sup> \*, Buchang Zhao<sup>3</sup> , Ke Qian<sup>3</sup> , Yang Shi <sup>3</sup> , Mirko Baruscotti <sup>4</sup> \* and Yi Wang<sup>1</sup> \*

*<sup>1</sup> Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China, <sup>2</sup> School of Basic Medical Sciences, Zhejiang Chinese Medical University, Hangzhou, China, <sup>3</sup> Shandong Danhong Pharmaceutical Co., Ltd., Heze, China, <sup>4</sup> Department of Bioscienze, The PaceLab, University of Milano, Milan, Italy*

#### Edited by:

*Jiang Xu, China Academy of Chinese Medical Sciences, China*

#### Reviewed by:

*Xin Hui Tian, Shanghai University of Traditional Chinese Medicine, China Pu Jia, Northwest University, China*

#### \*Correspondence:

*Xiaoping Zhao zhaoxiaoping@zcmu.edu.cn Mirko Baruscotti mirko.baruscotti@unimi.it Yi Wang mysky@zju.edu.cn*

#### Specialty section:

*This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology*

Received: *31 August 2018* Accepted: *22 October 2018* Published: *13 November 2018*

#### Citation:

*Liu H, Chen X, Zhao X, Zhao B, Qian K, Shi Y, Baruscotti M and Wang Y (2018) Screening and Identification of Cardioprotective Compounds From Wenxin Keli by Activity Index Approach and in vivo Zebrafish Model. Front. Pharmacol. 9:1288. doi: 10.3389/fphar.2018.01288*

Wenxin Keli (WXKL) is a widely used Chinese botanical drug for the treatment of arrhythmia, which consists of four herbs and amber. In the present study, we analyzed the chemical composition of WXKL using liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) to tentatively identify 71 compounds. Through a typical separation procedure, the total extract of WXKL was divided into fractions for further bioassays. Cardiomyocytes and zebrafish larvae were used for assessment. An *in vivo* arrhythmia model in Cmlc2-GFP transgenic zebrafish was induced by terfenadine and exhibited an obvious reduction in heart rate and the occurrence of atrioventricular block. The dynamic beating of the heart was recorded with a fluorescence microscope and a sensitive camera to automatically recognize the heartbeat rhythm in zebrafish larvae. By integrating the chemical information of WXKL and the corresponding bioactivities of these fractions, the activity index (AI) of each identified compound was calculated to screen potential active compounds. The results showed that dozens of compounds, including ginsenoside Rg1, ginsenoside Re, notoginsenoside R1, lobetyolin, and lobetyolinin, contributed to the cardioprotective effects of WXKL. The anti-arrhythmic activities of five compounds were further validated in the larval model and in mature zebrafish by measuring the electrocardiogram (ECG). Our findings provide a successful example of the rapid discovery of bioactive compounds from traditional Chinese medicine (TCM) by an activity index-based approach coupled with an *in vivo* zebrafish model.

Keywords: Wenxin Keli, arrhythmia, zebrafish, cardioprotection, drug screen

## INTRODUCTION

Natural products have played important roles in healthcare systems throughout history and will continue to serve as a huge and invaluable resource for the discovery of drug candidates. Traditional Chinese Medicine (TCM), widely used in Eastern Asian countries, has been regarded as an important class of natural products for the therapy of various diseases (Wang et al., 2012). The discovery of bioactive constituents from TCM is the key step in the modernization of TCM. As a consequence, high-throughput methods with satisfactory sensitivity for identifying active compounds from complex mixtures of TCM are in great demand. In the past two decades, many efforts have been made toward the rapid screening of compounds from TCM. Artemisinin (qinghaosu), with its antimalarial effect, is a successful and impressive example of a gift from TCM (Tu, 2011).

Arrhythmia is the abnormal beating of the myocardium; it generally reflects a disorder of ion channels or cardiomyopathy and can be classified into disorders of impulse formation or conduction. Arrhythmia has an intricate pathogenesis, and most cardiovascular diseases, such as heart failure, are accompanied by arrhythmia. Some types of arrhythmia are capable of triggering cardiac arrest and sudden death. Unfortunately, most antiarrhythmic drugs lack specificity and have numerous adverse effects (Page and Roden, 2005). TCMs act on multiple targets and have synergistic effects that are beneficial in such complex diseases (Li et al., 2011). Wenxin Keli (WXKL) is one of the most widely used Chinese patent medicines for arrhythmia and heart failure and is the first Chinese-developed anti-arrhythmic medicine approved by the China Food and Drug Administration (CFDA) and by the Chinese Pharmacopeia (ChP). It consists of Codonopsis pilosula, Polygonatum sibiricum, Radix Notoginseng, Nardostachys jatamansi, and Succinum. Decades of research have shown that WXKL can suppress and prevent cardiac arrhythmias, including atrial and ventricular arrhythmias (Xing et al., 2013; He et al., 2016; Wang et al., 2016; Li et al., 2017), and inhibit multiple ion channels (Chen et al., 2013; Yang et al., 2017), especially through atrial-selective inhibition of the sodium-channel current (Burashnikov et al., 2012) and an effect on the late Na current (Xue et al., 2013).

Arrhythmia is difficult to model in vitro because multiple factors may contribute to it, including genetic predisposition, extrinsic injury, environmental exposures, and stochastic processes. Cell-based models cannot fully reveal the pathological process of arrhythmia, whereas screening in conventional animal models is costly. Zebrafish (Danio rerio) has been a model for biomedical research for decades and is well suited to phenotype-based screening in several respects. The generation time is about 3 months, and large numbers of zebrafish can be maintained at low cost in little space. Embryogenesis can be completed within 24 h post-fertilization (hpf), each pair of fish can produce more than a hundred eggs, and mating does not depend on season. More attractively, the embryos are transparent, and most organs, including the heart, liver, intestine, and kidney, develop within 96 hpf and can be clearly visualized (Barros et al., 2008). The larvae can be handled in well plates, survive in a small volume of fluid, and are permeable to small molecules (Zhang et al., 2003; Kari et al., 2007). These traits make it possible to establish a simple, high-throughput model for assaying the effects of drug candidates on internal organs in a live organism. Zebrafish are also amenable to genetic manipulation to simulate human disease (Asnani and Peterson, 2014). As a developmental and genetic model, zebrafish has been used for the discovery of anti-cancer compounds, chemical toxicity assessment, and other applications. The zebrafish heart is highly comparable to the human heart in structure, function, signaling pathways, and ion channels (Hu et al., 2000) and is particularly suitable for studying the cardiovascular system. Here, we used terfenadine, an antihistamine that is also a potent hERG blocker and QT prolonger (Dhillon et al., 2013) with reported pro-arrhythmic effects (Chaudhari et al., 2013), to disturb the heart rhythm of zebrafish.

In the present study, we used cell-based and zebrafish-based models in parallel to assess the cardioprotective and antiarrhythmic effects of WXKL and its separated fractions, and combined HRMS with chemometric analysis to identify bioactive compounds from WXKL. H9c2 cells damaged by H2O2 were used to evaluate the protective activity of the fractions. An in vivo arrhythmia model based on Cmlc2-GFP transgenic (Tg) zebrafish was applied, and the heart rate and rhythm of the larvae were measured to evaluate pharmacological effects. Cell viability and zebrafish heart rate recovery were converted into bioactivity coefficients and correlated with the compound composition of the WXKL fractions to calculate an activity index for every compound. The entire process is illustrated in **Figure 1**. Based on the in vitro and in vivo assessments and the activity index calculation, several compounds, including ginsenoside Rg1, ginsenoside Re, notoginsenoside R1, lobetyolin, and lobetyolinin, were selected for validation of their activity in larvae and adult fish.

## MATERIALS AND METHODS

### Materials and Reagents

Wenxin Keli was obtained from Shanxi Buchang Pharmaceutical Co., Ltd (Shanxi, China). Ginsenoside Rg1, ginsenoside Re, and notoginsenoside R1 were purchased from Winherb Medical Tech. Co., Ltd (Shanghai, China). Lobetyolin was obtained from Push Bio-Tech (Chengdu, China). Extract of codonopsis glycosides was obtained from Dasfbio (Nanjing, China).

HPLC-grade acetonitrile and methanol were purchased from Merck (Darmstadt, Germany). Formic acid (HPLC grade) was purchased from Roe Scientific (Newark, DE, USA). Ethanol was purchased from Zhejiang Changqing Chemical (Hangzhou, China). Deionized water was prepared with an Elga PURELAB flex system (ELGA LabWater, UK).

High-glucose Dulbecco's modified Eagle's medium, fetal bovine serum, trypsin-EDTA, and antibiotics (100 U/mL penicillin G and 100 µg/mL streptomycin) were obtained from Gibco BRL (Grand Island, NY, USA). Tert-butyl hydroperoxide solution, thiazolyl blue tetrazolium bromide, terfenadine, DMSO, N-phenylthiourea (PTU), and tricaine were acquired from Sigma-Aldrich (St. Louis, MO, USA).

#### Apparatus

Tecan infinite M1000 system (Tecan, Zurich, Switzerland). AB TripleTOF 5600plus System (AB SCIEX, Framingham, USA), coupled to a Waters ACQUITY UPLC™ system (Waters, MA, USA). Finnigan LCQ DecaXPplus mass spectrometer equipped with an ESI source (Thermo, MA, USA) coupled to Agilent 1100 liquid chromatography (Agilent, Waldbronn, Germany). Agilent 1200 preparative performance liquid chromatography (Agilent, Waldbronn, Germany). Leica DMI 3000B Fluorescence Inversion Microscope System (Leica Microsystems Inc., USA), Andor Zyla 5.5 sCMOS Cameras (Oxford Instruments plc, Tubney Woods, Abingdon, UK). IX-100F Zebrafish system (iWorx Systems, Inc., USA).

#### Characterization of Major Compounds of WXKL by LC–HRMS

The chemical composition of WXKL was characterized on an AB TripleTOF 5600plus System coupled to a Waters ACQUITY UPLC system. The MS conditions were as follows: scan range, m/z 100–2,000; negative ion mode, source voltage −4.5 kV and source temperature 550 °C; positive ion mode, source voltage +5.5 kV and source temperature 600 °C. The pressures of Gas 1 (air) and Gas 2 (air) were set to 50 psi, and the pressure of the curtain gas (N2) was set to 35 psi. The maximum allowed error was set to ±5 ppm. The declustering potential (DP) was 100 V and the collision energy (CE) was 10 V. For MS/MS acquisition, the parameters were almost the same except that the CE was set to 50 ± 20 V, the ion release delay (IRD) to 67, and the ion release width (IRW) to 25. The acquisition parameters for the Finnigan LCQ DecaXPplus mass spectrometer were as follows: nebulizing gas, high-purity nitrogen (N2); collision gas, high-purity helium (He); ion spray voltage, −3 kV; capillary temperature, 350 °C; capillary voltage, −15 V; mass range, m/z 100–1,500. Chromatographic separation was carried out on an Agilent Zorbax SB-C18 analytical column (4.6 × 250 mm I.D., 5 µm; Agilent Technologies, USA). The mobile phase consisted of water (A) and acetonitrile (B), both containing 0.05% v/v formic acid. The gradient program was as follows: 0–5 min, 10% B; 5–15 min, 10–25% B; 15–35 min, 25–35% B; 35–40 min, 35–40% B; 40–45 min, 40–70% B; 45–55 min, 70–95% B; 55–65 min, 95% B. The flow rate was 0.5 mL/min, the column temperature was 30 °C, and the injection volume was 20 µL.
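The ±5 ppm mass tolerance stated above can be illustrated with a short calculation. The following is a minimal sketch rather than the acquisition software's own logic; the function names and the m/z values are hypothetical and chosen only to show the arithmetic.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Return the signed mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical m/z values, not taken from Table 1.
theoretical = 800.0000
print(within_tolerance(800.0030, theoretical))  # error of about 3.75 ppm
print(within_tolerance(800.0050, theoretical))  # error of about 6.25 ppm
```

Under this assumption, a candidate formula would be retained only when the measured ion falls within the tolerance window of its theoretical m/z.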

#### Cell Culture and Anti-Oxidation Assays

H9c2 cells were obtained from the Cell Bank of the Chinese Academy of Sciences (Shanghai, China) and cultured in high-glucose Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum (FBS) and antibiotics (100 units/mL penicillin and 100 µg/mL streptomycin). The cultures were maintained at 37 °C in a humidified atmosphere of 5% CO2. The anti-oxidation activity of each fraction was determined by a tetrazolium-based colorimetric assay (MTT assay). Briefly, cells (5 × 10<sup>4</sup> cells/mL) were seeded in 96-well plates for 24 h and then treated with fractions of WXKL for another 24 h before exposure to 150 µM H2O2 in fresh medium for 3 h. The medium was then replaced with 100 µL of 0.5 mg/mL MTT in fresh medium for 4 h at 37 °C. Next, the medium was replaced with 100 µL DMSO and the plate was shaken for 10 min. Cell viability was determined by measuring the optical densities (ODs) of untreated cells (control), cells exposed to H2O2 (model), and cells pre-incubated with components (tested). The activities of the components were calculated using the following formulas: Survival rate (%) = OD of tested/OD of control × 100. Protection rate (%) = (OD of model – OD of tested)/(OD of model – OD of control) × 100.
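The two formulas above can be written directly as code. This is a minimal sketch, not the authors' analysis script; the function names and the OD readings are hypothetical, and both ratios are multiplied by 100 to express them as percentages.

```python
def survival_rate(od_tested: float, od_control: float) -> float:
    """Survival rate (%) = OD of tested / OD of control x 100."""
    if od_control == 0:
        raise ValueError("control OD must be non-zero")
    return od_tested / od_control * 100.0


def protection_rate(od_tested: float, od_model: float, od_control: float) -> float:
    """Protection rate (%) = (model - tested) / (model - control) x 100."""
    if od_model == od_control:
        raise ValueError("model and control ODs must differ")
    return (od_model - od_tested) / (od_model - od_control) * 100.0


# Hypothetical OD readings for one fraction (illustration only).
od_control, od_model, od_tested = 1.00, 0.50, 0.80
print(round(survival_rate(od_tested, od_control), 1))
print(round(protection_rate(od_tested, od_model, od_control), 1))
```

With these illustrative readings, the tested wells retain 80% of the control signal and recover 60% of the signal lost to H2O2. A protection rate of 0% corresponds to no improvement over the model group, and 100% to full recovery to the control level.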

#### Zebrafish Husbandry and Management

Heterozygous and homozygous transgenic Cmlc2-GFP zebrafish expressing green fluorescent protein (GFP) exclusively in the myocardium were provided by the Zebrafish Resource Center, Zhejiang University School of Medicine (Hangzhou, China) and maintained according to established standard procedures. Two parent zebrafish were placed separately in a mating box equipped with a separator to protect the eggs from being eaten. Spawning was induced in the morning, and embryos from each box were collected and rinsed with system fish water (0.3% Instant Ocean Salt in deionized water; final pH 6.9–7.2, conductivity 450–550 µS/cm, and hardness of about 90 mg/L NaHCO3). The embryos were kept in Petri dishes with system fish water and incubated at 28 °C. This study was approved by the Institutional Animal Care and Use Committee of the Laboratory Animal Center, Zhejiang University, and followed the relevant guidelines of the Laboratory Animal Center of Zhejiang University.

### Zebrafish Arrhythmia Model and Drug Incubation

At 24 hpf, fluorescent larvae were selected under a fluorescence microscope and their membranes were ruptured manually. The larvae were distributed into 24-well plates, 8–10 larvae per well, in system fish water supplemented with 0.2 mM N-phenylthiourea (PTU) and 6 nM methylene blue. Wells were assigned to Control, Model, and Treat groups. Terfenadine was stored as a 100 mM stock in DMSO and the WXKL fractions as 100 mg/mL stocks in DMSO; fish water was used to dilute the stocks to the appropriate concentrations. The model group received terfenadine only, and the treat groups received terfenadine together with the corresponding fraction. At 48 hpf, the previous medium was discarded, the fraction and terfenadine working solutions were added according to group, and each well was filled to 2 mL with fish water medium. The final concentration of terfenadine was 6 µM. The fractions were diluted to the appropriate concentration, mostly 50 µg/mL and in some cases 25, 12.5, or 6.25 µg/mL, depending on the toxicity observed in the cell assay.
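The dilution arithmetic implied by these stock and final concentrations follows from C1 × V1 = C2 × V2. The sketch below is illustrative only: the function name is hypothetical, and the text does not state the intermediate working-solution concentrations that were actually prepared.

```python
def stock_volume_ul(stock_conc: float, final_conc: float,
                    final_volume_ul: float) -> float:
    """Volume of stock (in µL) needed so that C1*V1 = C2*V2.

    Both concentrations must be given in the same unit.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc * final_volume_ul / stock_conc


WELL_VOLUME_UL = 2000.0  # each well was filled to 2 mL

# Terfenadine: 100 mM stock (100,000 µM) diluted to 6 µM.
terfenadine_ul = stock_volume_ul(100_000.0, 6.0, WELL_VOLUME_UL)

# WXKL fraction: 100 mg/mL stock (100,000 µg/mL) diluted to 50 µg/mL.
fraction_ul = stock_volume_ul(100_000.0, 50.0, WELL_VOLUME_UL)

print(terfenadine_ul, fraction_ul)
```

The equivalent stock volume per well is about 0.12 µL for terfenadine and 1 µL for a fraction at 50 µg/mL. The terfenadine volume is too small to pipette directly, which is consistent with the use of a working solution diluted in fish water.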

#### Heartbeat Recording

At 72 hpf, the beating zebrafish heart was recorded under fluorescence with a Leica DMI 3000B Fluorescence Inversion Microscope System (Leica). The readout speed of the sCMOS camera was set to 10 or 20 frames per second with 4 × 4 pixel binning, and an L5 filter cube (excitation wavelength 480 nm, emission wavelength 527 nm) was used. One hundred consecutive images were captured with a Zyla 5.5 sCMOS camera (Andor) and subsequently analyzed in Matlab. The area of the heart in each image was measured, and the change in area over time was taken to represent the heart rhythm. Heartbeats were also counted manually for accuracy. Ventricular beats were counted in all cases.
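The text does not describe how the Matlab analysis converted the area trace into a heart rate, so the following Python sketch shows one simple possibility under stated assumptions: a beat is counted each time the area rises through its mean, and the rate is estimated from the spacing of those crossings. The function names and the synthetic trace (a 3 Hz oscillation sampled at 20 frames per second for 100 frames) are hypothetical.

```python
import math
from statistics import mean


def upward_crossings(areas):
    """Frame indices where the area trace rises through its mean (one per beat)."""
    baseline = mean(areas)
    return [k for k in range(1, len(areas))
            if areas[k - 1] < baseline <= areas[k]]


def heart_rate_bpm(areas, fps):
    """Estimate beats per minute from the spacing of the mean crossings."""
    crossings = upward_crossings(areas)
    if len(crossings) < 2:
        raise ValueError("need at least two beats to estimate a rate")
    seconds = (crossings[-1] - crossings[0]) / fps
    return (len(crossings) - 1) / seconds * 60.0


# Synthetic trace standing in for measured heart areas (arbitrary units).
FPS = 20
synthetic_areas = [100 + 10 * math.sin(2 * math.pi * 3.0 * k / FPS + 0.5)
                   for k in range(100)]
print(round(heart_rate_bpm(synthetic_areas, FPS)))  # close to 180 beats/min
```

Because only 100 frames are available, the estimate is quantized by the frame interval. A real trace would probably also need smoothing before crossings are counted, and irregular rhythms would show up as uneven spacing between successive crossings.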

#### Calculation of the Activity Indexes

The recovery rate (R<sub>i</sub>) of each fraction was calculated using the following formula:

$$R_i = \frac{B_i - B_M}{B_C - B_M} \times 100\%$$

R<sub>i</sub>: normalized heart rate recovery rate of fraction i; B<sub>i</sub>: beats of larvae treated with fraction i; B<sub>M</sub>: beats of larvae treated with terfenadine only; B<sub>C</sub>: beats of larvae in the control group.

The peak area of each compound was normalized according to the following formula:

$$A_j = \frac{A_{i,j}}{\sum_{i=1}^{m} A_{i,j}}$$

A<sub>j</sub>: normalized peak area of compound j in fraction i; A<sub>i,j</sub>: peak area of compound j in fraction i; m: the number of fractions obtained from the whole extract.

The activity indexes of the compounds were given by the following formula:

$$AI_j = \sum_{i=1}^{m} (R_i \times A_j)$$

AI<sub>j</sub>: activity index of compound j, summed over the m fractions.
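The three formulas above can be combined into a short worked example. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the beat counts, and the peak areas are hypothetical, and a compound with zero total peak area is assigned an index of 0 by convention.

```python
def recovery_rate(beats_fraction, beats_model, beats_control):
    """R_i = (B_i - B_M) / (B_C - B_M) x 100 (percent)."""
    return (beats_fraction - beats_model) / (beats_control - beats_model) * 100.0


def activity_indexes(recovery, peak_areas):
    """AI_j = sum over fractions i of R_i times the normalized peak area.

    recovery   -- list of R_i, one per fraction
    peak_areas -- peak_areas[i][j] is the peak area of compound j in fraction i
    """
    if len(recovery) != len(peak_areas):
        raise ValueError("one recovery rate is required per fraction")
    n_compounds = len(peak_areas[0])
    indexes = []
    for j in range(n_compounds):
        total = sum(row[j] for row in peak_areas)
        if total == 0:
            indexes.append(0.0)  # assumption: undetected compound scores 0
            continue
        indexes.append(sum(r * row[j] / total
                           for r, row in zip(recovery, peak_areas)))
    return indexes


# Hypothetical data: three fractions, two compounds.
recovery = [80.0, 20.0, -10.0]
peak_areas = [[90.0, 10.0],
              [10.0, 30.0],
              [0.0, 60.0]]
print(recovery_rate(135, 90, 180))             # halfway between model and control
print(activity_indexes(recovery, peak_areas))  # approximately [74.0, 8.0]
```

In this illustration the first compound is concentrated in the fraction with the highest recovery rate and therefore receives the larger index, whereas the second compound is enriched in a fraction that lowered the heart rate and scores much lower.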

#### Zebrafish ECG Measurement

The ECG of zebrafish was measured with the IX-100F Zebrafish system (iWorx Systems, Inc., USA). Each fish was first anesthetized and positioned on its back on a fish-bed. Excess water was gently removed with paper, and care was taken that the fins did not cross the belly of the fish. The fish-bed with the fish, head to the right, was placed in the chamber and positioned under Ag/AgCl surface electrodes. The two electrodes were placed axially along the center-line of the fish's belly, with the forward electrode close to the gills. ECG was recorded with LabScribe v3 software (iWorx Systems, Inc., USA).

#### Statistical Analysis

The data are expressed as mean ± standard deviation (SD). Comparisons between groups were made by one-way analysis of variance (ANOVA). GraphPad Prism 7 software (GraphPad Software, USA) was used for statistical analysis. P < 0.05 was considered statistically significant.

## RESULTS AND DISCUSSION

#### The Chemical Composition of Wenxin Keli Extract

The main compounds of WXKL include sugars, glycosides, lignans, polyynes, saponins, and iridoid glycosides; detailed information is listed in **Table 1**. The negative ion mode base peak LC-MS chromatogram of WXKL is shown in **Figure 2**. We collected MS/MS information for 71 compounds and preliminarily identified 53 of them, including saponins, phenylpropanoids, polyacetylenes, triterpenoids, and others. Twenty-seven of these compounds derive from Notoginseng, including notoginsenoside R1, ginsenoside Rg1, ginsenoside Re, ginsenoside Ra3, ginsenoside Rb1, notoginsenoside R2, ginsenoside Rc, ginsenoside Rd, and others. Sixteen compounds derive from Codonopsis, including tangshenoside V, lobetyolinin, lobetyolin, atractylenolide III, gentisic acid β-D-glucoside, syringin, hexyl 6-O-β-D-glucopyranosyl-β-D-glucopyranoside, hexyl 2-O-β-D-glucopyranosyl-β-D-glucopyranoside, and others. In addition, 5 were identified from Polygonatum and 2 from Nard (**Table 1**).

#### Evaluating Cardioprotective Effect of Components by Zebrafish Arrhythmia Model

We first fractionated WXKL by preparative chromatography, and the resulting fractions were analyzed on the Finnigan LCQ DecaXPplus mass spectrometer. The mass spectra of the fractions are shown in the **Supplementary Material**.



#### TABLE 1 | Continued




*DS, Codonopsis pilosula; HJ, Polygonatum sibiricum; SQ, Radix Notoginseng; GS, Nardostachys jatamansi.*

Oxidative stress plays a key role in the pathogenesis of various diseases (Furukawa et al., 2004). The fractions were therefore evaluated for protective activity in H9c2 cells damaged by H2O2. The toxicity of all fractions was tested first; the safe concentration of most fractions was 50 µg/mL, and some were used at 25, 12.5, or 6.25 µg/mL to ensure no toxicity. The viability of cells treated with these fractions is shown in the **Supplementary Material**.

Zebrafish (D. rerio) is an ideal model for drug screening (Goldsmith, 2004). It has been widely applied as a high-throughput screening model and for cardiotoxicity risk assessment of drug candidates (Wen et al., 2012; Zhu et al., 2014). Here we used Cmlc2-GFP Tg zebrafish, a transgenic line that expresses GFP exclusively in the myocardium under the control of the cmlc2 (cardiac myosin light chain 2) promoter (Huang et al., 2003). Terfenadine has been reported to induce QT prolongation in zebrafish and guinea pig (Milan et al., 2006; Lu et al., 2012; Chaudhari et al., 2013), which is associated with ventricular tachyarrhythmia (Gowda et al., 2004). Terfenadine causes QT prolongation in adult zebrafish, and this effect has also been demonstrated in zebrafish embryos (Langheinrich et al., 2003).

As shown in **Figure 3**, terfenadine had little effect on the structure of the heart (**Figure 3A**) but clearly altered the rhythm of the heartbeat (**Figure 3B**). The rhythm was represented by the change in heart area analyzed in Matlab. The control groups showed a fast and regular rhythm (about 180 beats/min). In the model groups, the heart rate fell to 80–100 beats/min with irregular heartbeats after incubation with terfenadine for 24 h, and some larvae showed typical atrioventricular block (Peal et al., 2011), whereas co-incubation with WXKL stabilized the rhythm (**Figure 3B**, **Supplementary Videos**). WXKL and its fractions had varying effects on heart rate. Fraction 1, Fraction 2, and the earlier part of Fraction 3 had a beneficial effect on heart rate, while the remaining parts of Fraction 3 lowered the rate even further (**Figure 3C**).

#### Screening Active Compounds by Activity Indexes Calculation and Ranking

The activity index (AI) of each compound was calculated according to the mathematical formulae proposed in our previous study. Compounds with a positive activity index were assumed to be potentially active and to contribute, to some extent, to the activity of the whole formula (Wang et al., 2014). The relative intensities of the identified compounds in each fraction are presented as a heatmap (**Figure 4A**). After multiplication by the corresponding bioactivity coefficient (i.e., heart rate recovery rate) of each fraction, the heatmap was converted into a bioactivity map, in which red and gray represent beneficial and adverse effects, respectively. The calculated scores are shown as a histogram on the right (**Figure 4B**), and the detailed scores are listed in the **Supplementary Material**. We plotted the compounds by their effects on cardiomyocytes and on zebrafish heart rate (**Figure 4C**); compounds in the upper right region showed better activity.

### Validation of Active Compounds

According to the scores, and considering availability, ginsenoside Rg1, ginsenoside Re, notoginsenoside R1, lobetyolin, and lobetyolinin were selected for validation. Lobetyolinin was prepared and enriched in-house from commercial codonopsis glycosides. The toxicity of the compounds was assessed beforehand. Because the pure compounds were not potent enough to show activity under the original protocol, we increased the dose of terfenadine and shortened the incubation time to create an acute injury model with improved significance. After pre-treatment with the compounds (50 µM) for 24 h, the larvae were treated with 15 µM terfenadine for 2 h and their heartbeats were recorded under fluorescence. As a result, the heart rate of the larvae recovered to varying degrees (**Figure 5A**). Ginsenoside Rg1 and lobetyolinin exhibited better activities. In addition, the ECG of adult zebrafish treated with the compound was measured. The heart rate of normal zebrafish was around 100 beats/min; after treatment with 25 µM terfenadine for 1 h, the heart rate decreased and the rhythm became irregular (**Figure 5B**). Pre-treatment with lobetyolinin for 6 h restored the heart rate, and representative electrocardiograms are shown in **Figure 5C**.

## DISCUSSION

Few studies have described the chemical components of WXKL. Wang et al. established a database of the chemical components of the five herbs in WXKL for the prediction of active compounds (Wang et al., 2017). However, all the compounds were taken from herb-level databases rather than from the actual composition of the patent drug, and the chemical components probably change during the manufacturing process. We analyzed the WXKL extract directly by LC-HRMS at the outset, but the analytical method we established still has limitations. In particular, the compounds in Fraction 2 were not clearly separated and appeared to have a low mass spectrometric response, so this group of fractions had similar compositions. Few compounds were identified from Nard and Succinum, possibly because their main constituents are volatile oils, which are more appropriately analyzed by gas chromatography–mass spectrometry (GC–MS). Extracts of Nard have been reported to significantly block INa and Ito in rat ventricular myocytes (Liu et al., 2009). The active compounds we predicted, especially the top-ranked ones, are minor constituents for which reference standards were difficult to obtain for bioactivity assays, with the exception of ginsenoside Re, ginsenoside Rg1, and notoginsenoside R1; this limited further validation. We therefore attempted to isolate substances such as lobetyolinin from extracts of C. pilosula. It has been reported that ginsenoside Re has a negative effect on cardiac contractility and autorhythmicity (Peng et al., 2012), that ginsenoside Rg1 prolongs ventricular refractoriness and repolarization (Wu et al., 1995), and that notoginsenoside R1 has protective effects on the cardiovascular system (Li et al., 2014). There are few reports on the related activity of lobetyolin.

Drug induction is a common modeling approach; verapamil and terfenadine have been used to develop a zebrafish heart failure model (Zhu et al., 2018). QT prolongation is a typical characteristic of arrhythmia and can be induced by cisapride and astemizole as well as terfenadine (Langheinrich et al., 2003). Oral drug exposure may produce unstable effects, so drug treatment must be designed rationally. Chaudhari et al. administered terfenadine parenterally at different doses and recorded ECGs to assess drug-induced QTc prolongation in zebrafish. At doses above 1 mg/kg, pro-arrhythmic effects such as ventricular premature contractions, ventricular tachycardia, atrioventricular (AV) block, and torsade de pointes (TdP) were observed (Chaudhari et al., 2013). However, the cardiotoxicity of terfenadine may be associated not with QT prolongation and the occurrence of TdP, but with marked widening of the QRS complex and other cardiac arrhythmias (Hondeghem et al., 2011). Terfenadine has been reported to cause non-TdP-like VT/VF by slowing conduction via blockade of INa (Lu et al., 2012). In addition, transgenic zebrafish lines offer a way to avoid the variable results of drug induction. Several mutants that exhibit arrhythmias have been identified (Milan and Macrae, 2008), such as the bradycardic line slo mo, with variable degrees of sinoatrial or atrioventricular heart block (Baker et al., 1997). Other mutants with recessive lethal phenotypes include tremblor (Langenbacher et al., 2005), island beat (Rottbauer et al., 2001), and reggae (Hassel et al., 2008). From another perspective, Tg lines that express fluorescent proteins (e.g., Cmlc2-GFP) are beneficial for optical measurement.

ECG (electrocardiogram) abnormalities are the critical characteristic of arrhythmia. Several ECG measurement devices for zebrafish have been developed, most of which consist of an electrode or micropipette and an electrical filter (Milan et al., 2006; Chaudhari et al., 2013; Dhillon et al., 2013). Recording from 3 dpf larvae is also possible (Chi et al., 2008). However, invasive injury and anesthesia can harm the animal. High-throughput screening requires a fast and stable method of ECG measurement, but existing devices do not appear to be compatible with it. Computerized recognition of ECGs has become a well-established practice that assists in classifying long-term ECG recordings, which suggests that newer approaches such as machine learning can recognize and classify rhythm signals. For instance, automatic classification of single-lead ECG signals with deep learning (a form of representation learning) has been established (Singh et al., 2018), and a semi-supervised approach based on deep learning and active learning has been proposed for the classification of electrocardiogram signals (Sayantan et al., 2018). Although several algorithms have focused on automatically classifying heartbeats in ECGs, many existing techniques do not scale to large intra-class variations, and their robustness remains limited. We have acquired a large number of dynamic images of heartbeats under different conditions and are attempting to relate waveforms to defined characteristics in order to classify phenotypes. A rhythm classification method based on image processing would provide a non-invasive measurement of heart regulation.

In conclusion, we identified 71 compounds from the WXKL extract by LC-HRMS and, for the first time, used terfenadine-treated transgenic cmlc2-GFP zebrafish as an animal model for screening active compounds from WXKL. After the heartbeats affected by the fractions were recorded under a fluorescence microscope, a convenient image-processing procedure was applied to display the heart rhythm. We then integrated chemometric analysis with the in vivo bioactivity of the corresponding fractions and calculated the activity index of the identified compounds. Ginsenoside Rg1, ginsenoside Re, notoginsenoside R1, lobetyolin, and lobetyolinin were selected for validation. ECG measurement in adult zebrafish was also performed as a complement. Our results suggest that integrating bioassays with chemical analysis to calculate activity indexes improves the efficiency of discovering active compounds from TCM, and that this approach may be applicable to research on complex diseases.

## AUTHOR CONTRIBUTIONS

HL and XC designed and performed the experimental work. BZ, KQ, and YS provided the WXKL patent drug and related herb and extract. MB guided the theory of cardiac electrophysiology and pharmacology. All authors proofread the paper and provided feedback.

## FUNDING

This study was supported by the National Key Scientific and Technological Project of China (grant 2017ZX09301012), the National Natural Science Foundation of China (No. 81774151, No. 81822047), and the Fundamental Research Funds for the Central Universities (2016FZA7016).

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2018.01288/full#supplementary-material

Supplementary Table 1 | Bioactive-coefficient ranking of identified compounds.

Supplementary Figure 1 | LC-MS chromatograms of fractions of WXKL. Some fractions were insufficient for analysis and are not shown here.

Supplementary Figure 2 | Anti-oxidation activity of WXKL fractions. (A) Protection rate of all fractions of WXKL, *n* = 3. (B) Dose-response of selected fractions that have protective activity.

Supplementary Figure 3 | Dose effect of Terfenadine on heart rate of zebrafish. ∗∗*P* < 0.01 vs. control. *n* = 8.

Video 1 | Side view of heart of zebrafish larva in normal group.

Video 2 | Vertical view of heart of zebrafish larva in normal group.

#### REFERENCES


inhibition of late sodium current. Pacing Clin. Electrophysiol. 36, 732–740. doi: 10.1111/pace.12109


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Liu, Chen, Zhao, Zhao, Qian, Shi, Baruscotti and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mechanism Based Quality Control (MBQC) of Herbal Products: A Case Study YIV-906 (PHY906)

Wing Lam<sup>1</sup>, Yongshen Ren<sup>1</sup>, Fulan Guan<sup>1</sup>, Zaoli Jiang<sup>1</sup>, William Cheng<sup>1</sup>, Chang-Hua Xu<sup>1,2</sup>, Shwu-Huey Liu<sup>3</sup> and Yung-Chi Cheng<sup>1</sup>\*

<sup>1</sup> Department of Pharmacology, Yale University School of Medicine, New Haven, CT, United States, <sup>2</sup> College of Food Science and Technology, Shanghai Ocean University, Shanghai, China, <sup>3</sup> Yiviva Inc., New York, NY, United States

YIV-906 (PHY906), a four-herb Chinese medicine formulation, is inspired by an 1,800-year-old Chinese formulation called Huang Qin Tang, which is traditionally used to treat gastrointestinal (GI) symptoms. In animal studies, it enhanced the anti-tumor activity of different classes of anticancer agents and promoted faster recovery of damaged intestines following irinotecan or radiation treatment. Several clinical studies have shown that YIV-906 has the potential to increase the therapeutic index of cancer treatments (chemotherapy, radiation) by prolonging life and improving patient quality of life. Animal studies demonstrated that five clinical batches of YIV-906 had very similar in vivo activities (protection against body weight loss induced by CPT11 and enhancement of the antitumor activity of CPT11), while four batches of commercially made Huang Qin Tang (HQT) had no or lower in vivo activities. Two quality control platforms were used to correlate the biological activity of YIV-906 and HQT. Chemical profiles (based on analysis of 77 peak intensities) obtained from LC-MS could not be used to differentiate YIV-906 from commercial Huang Qin Tang. A mechanism based quality control (MBQC) platform, comprising 18 luciferase reporter cell lines and two enzymatic assays based on the mechanism of action of YIV-906, could be used to differentiate YIV-906 from commercial Huang Qin Tang. The MBQC results matched the in vivo activities of the products in combination with irinotecan. In conclusion, the quality control of an herbal product should depend on its pharmacological usage. For each specific usage, appropriate biological assays based on its mechanism of action should be developed for QC. The chemical fingerprint comparison approach has limitations unless irrelevant chemicals have been filtered out. Additionally, a similarity index is only useful when relevant information is used. An MBQC platform should also be applied to other herbal products.

Keywords: YIV-906, Chinese medicine, mechanism, quality control, chemical fingerprint, herbal products

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Xue Qiao, Peking University, China; Wei Song, Peking Union Medical College Hospital (CAMS), China

#### \*Correspondence:

Yung-Chi Cheng yccheng@yale.edu; yung-chi.cheng@yale.edu

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 04 September 2018 Accepted: 29 October 2018 Published: 19 November 2018

#### Citation:

Lam W, Ren Y, Guan F, Jiang Z, Cheng W, Xu C-H, Liu S-H and Cheng Y-C (2018) Mechanism Based Quality Control (MBQC) of Herbal Products: A Case Study YIV-906 (PHY906). Front. Pharmacol. 9:1324. doi: 10.3389/fphar.2018.01324

#### INTRODUCTION

Herbal medicine is the oldest form of medicine in human history. Herbs are the most important elements of traditional medicines from cultures around the world, including Traditional Chinese Medicine, Ayurveda, Unani, and Siddha. Many herbs are widely claimed to help with a variety of diseases or symptoms. In Asia and certain other countries, herbal medicines are used as mainstream medicine for treating diseases. However, in many western countries, most herbal products are still used as food supplements with low quality control standards. To promote herbal medicine as an accepted mainstream medicine in western countries for unmet needs, herbal products need to pass clinical trials with favorable and consistent clinical outcomes. So far the FDA has approved only two botanical drugs: Veregen® Ointment and Fulyzaq® (crofelemer). Veregen® Ointment, a green tea extract, was approved as a topical drug for genital warts in 2006. In 2012 the FDA approved Fulyzaq® (crofelemer), an extract of the latex of the South American tree Croton lechleri, for treating diarrhea in HIV patients (Yeo et al., 2013). Quality control for Fulyzaq® is relatively simple because it contains only purified oligomeric procyanidins and proanthocyanidins, which are polymers of (epi)catechin or (epi)gallocatechin.

Many herbal medicines are used as raw extracts containing many chemicals because purification may separate the active compounds from one another or cause loss of their biological activity. In Traditional Chinese Medicine (TCM), formulations with multiple herbs are commonly used. Because of the chemical complexity of herbal products, it is extremely difficult to reproduce an herbal product with the same biological activity over time without knowing all of its active ingredients.

We are currently developing YIV-906 (formerly PHY906, KD018) as an adjuvant for cancer therapies. YIV-906 is a standardized four-herb formula based on "Huang Qin Tang," a Chinese herbal formulation first described about 1,800 years ago for numerous gastrointestinal (GI) symptoms, including diarrhea, nausea, and vomiting. These symptoms are common side effects of chemotherapy. YIV-906 is composed of four herbs: Glycyrrhiza uralensis Fisch (**G**), Paeonia lactiflora Pall (**P**), Scutellaria baicalensis Georgi (**S**), and Ziziphus jujuba Mill (**Z**). YIV-906 was prepared using high-quality herbs selected by highly experienced herbalists and manufactured according to cGMP (current Good Manufacturing Practice) standards. Results from seven Phase I/II to II clinical trials with different batches of YIV-906 on 140 evaluable patients at Yale University and other institutions in the United States suggested that there was no YIV-906-related toxicity at the dosage used; a clear indication of decreased G3/4 diarrhea, nausea, and vomiting, and of improved quality of life, was observed in patients who received irinotecan, capecitabine, or chemo-radiation (Farrell and Kummar, 2003; Saif et al., 2010; Kummar et al., 2011; Saif et al., 2014). In preclinical studies, YIV-906 reduced CPT-11-induced intestinal inflammation by inhibiting NFκB, COX-2, and iNOS while promoting intestinal stem/progenitor cell repopulation by stimulating the Wnt signaling pathway (Lam et al., 2010). YIV-906 also decreased GI toxicity from irradiation (Rockwell et al., 2013). YIV-906 could selectively alter the bacterial population of the intestines depending on the treatment conditions (Lam et al., 2014). In tumors, YIV-906 was shown to enhance the anti-tumor activity of different classes of anti-cancer agents in vivo (Liu and Cheng, 2012). Detailed mechanism studies indicated that YIV-906 increased the anti-tumor activity of CPT-11 and Sorafenib by increasing apoptosis of tumor cells and promoting the polarization of macrophages to an M1-like type that assists tumor rejection in the tumor micro-environment (Wang et al., 2011; Lam et al., 2015). mRNA array results of colon 38 tumors following CPT-11+YIV-906 treatment suggested that YIV-906 could switch the immune status of the tumor from chronic to acute inflammation, associated with the up-regulation of IRF5, IFN, and JAK/STAT signaling (Wang et al., 2011). Overall, different batches of YIV-906 manufactured over a period of 15 years appeared to show similar biological activities in clinical and pre-clinical studies.

In this report, we compared different batches of YIV-906 with commercially made batches of Huang Qin Tang (HQT) for their effect on the activity of CPT11 in colon 38 tumor-bearing BDF1 mice. We also examined whether correlation analysis based on chemical profiles, the most common quality control method in the botanical industry, or correlation analysis based on the biological activities measured by a "mechanism based quality control" (MBQC) platform could differentiate clinical batches of YIV-906 from commercial batches of HQT and match their biological activity in vivo.

## MATERIALS AND METHODS

## In vivo Mouse Models

Murine Colon 38 cells (1–2 × 10<sup>6</sup> cells in 0.1 ml phosphate-buffered saline, PBS) were transplanted subcutaneously into 4- to 6-week-old female BDF1 mice (Charles River Laboratories). After 10–14 days, mice with tumor sizes of 150–300 mm<sup>3</sup> were selected. Unless otherwise indicated, treatment groups each consisted of five mice. Tumor size, body weight, and mortality of the mice were monitored daily. Tumor volume was estimated using the formula length × width<sup>2</sup> × π/6. PHY906 (clinical batches 6, 7, 8, 10, and 11, and batches F, 38, 39, and 40, which are commercial Huang Qin Tang) was given orally (p.o.) for 4 days [twice per day (b.i.d.), 500 mg/kg at approximately 10:00 am and 3:00 pm], while CPT-11 (360 mg/kg) was administered intraperitoneally (i.p.) on Day 1. On Day 1, PHY906 was given 30 min prior to CPT-11 administration. In the control groups, mice were administered a vehicle, either PBS for i.p. administration or water for oral administration. Data were analyzed by two-way ANOVA (GraphPad Prism 6). Statistical significance is denoted as ++ (P < 0.001), + (P < 0.05), and – (P > 0.05, not significant).
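The tumor volume formula and the 150–300 mm<sup>3</sup> selection window can be illustrated with a minimal Python sketch. The function names and the caliper readings are hypothetical and are not taken from the study.

```python
import math


def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Estimate tumor volume as length x width^2 x pi/6 (ellipsoid approximation)."""
    return length_mm * width_mm ** 2 * math.pi / 6


def in_selection_window(volume_mm3: float, low: float = 150.0, high: float = 300.0) -> bool:
    """Check whether a tumor falls in the 150-300 mm^3 range used to select mice."""
    return low <= volume_mm3 <= high


# Hypothetical caliper readings (mm), for illustration only.
volume = tumor_volume_mm3(length_mm=10.0, width_mm=6.0)
print(round(volume, 1), in_selection_window(volume))
```

A 10 mm × 6 mm tumor gives 60π mm<sup>3</sup>, which falls inside the selection window.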

## LC-MS Analysis for Chemical Profiles of the Metabolites of PHY906

A 10 µl aliquot of a 10 mg/ml herbal water extract of each sample was subjected to LC-MS analysis. Each sample was analyzed in six independent experiments. The LC-MS analysis was performed on an Agilent 1200 series HPLC coupled with an AB SCIEX 4000 QTRAP mass spectrometer. The separation was conducted on an Agilent Zorbax C18 HPLC column (5 µm, 4.6 × 250 mm). The mobile phase was acetonitrile (A) and water with 0.1% formic acid (B) with linear gradient elution: 0 min, 5% A; 10 min, 20% A; 20 min, 25% A; 40 min, 30% A; 45 min, 35% A; 55 min, 45% A; 60 min, 70% A; 62 min, 90% A; 67 min, 90% A; 68 min, 5% A; and 75 min, 5% A. The flow rate was 1.0 mL/min, the column temperature was set at 30 °C, and the detection wavelength was set at 230 nm. The mass spectrometer was operated in negative mode and equipped with an electrospray ionization (ESI) source. Source parameters were as follows: sheath gas (nitrogen) flow rate and auxiliary gas (nitrogen) flow rate, 60 and 20 arbitrary units, respectively; capillary temperature, 400 °C; heater temperature, 30 °C; spray voltage, −3.8 kV. The instrument was operated over m/z 120–1000 in full scan mode. Data from the mass spectrometer were acquired and processed using Analyst® 1.4.2 software; the peaks were compared and a clustering analysis was generated with MZmine software.
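As a reading aid, the linear gradient program can be written as a table of (time, % A) breakpoints with linear interpolation between them. The sketch below is an illustration only; the names `GRADIENT` and `percent_a` are hypothetical, and it does not represent instrument control software.

```python
# Breakpoints (time in min, % acetonitrile A) copied from the gradient program in the text.
GRADIENT = [
    (0, 5), (10, 20), (20, 25), (40, 30), (45, 35), (55, 45),
    (60, 70), (62, 90), (67, 90), (68, 5), (75, 5),
]


def percent_a(t_min: float) -> float:
    """Return % A at time t_min by linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-75 min gradient program")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole program")


# % B (water with 0.1% formic acid) is the remainder.
print(percent_a(30), 100 - percent_a(30))
```

For example, at 30 min the mobile phase is halfway between the 20 min (25% A) and 40 min (30% A) breakpoints, that is, 27.5% A.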

### Mechanism Based Quality Control Platform

Eighteen luciferase reporter cell lines for different signaling pathways were selected. Cells were seeded into half-area 96-well microplates at 20,000 cells/well in 40 µl medium and incubated overnight at 37 °C in a 5% CO<sub>2</sub> incubator. PHY906 water extract at concentrations ranging from 750 µg/ml down to 83 µg/ml was added to the cells, which were then returned to the 37 °C, 5% CO<sub>2</sub> incubator. After the medium was removed at 6 h, the cells were lysed with 10 µl of lysis buffer (Tris-HCl 25 mM at pH 7.8, DTT 2 mM, CDTA 2 mM, glycerol 10%, Triton X-100 1%), and 40 µl of luciferase reaction buffer (Tris-HCl 20 mM at pH 7.8, NaHCO<sub>3</sub> 1 mM, MgSO<sub>4</sub> 2.5 mM, DTT 10 mM, coenzyme A lithium 60 µM, potassium luciferin 225 µM, ATP 250 µM) was added before luminescence was read on a luminescence microplate reader. IC50 (concentration required to inhibit 50% of control) or EC50 (concentration required to achieve 50% of maximum activation) was determined from the dose-response curve. The IC50 or EC50 for each assay was determined from three independent experiments, each done in triplicate with five different doses. Methods for determining COX-2 activity and iNOS activity can be found in Lam et al. (2010).
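The text does not state how the IC50 was read from the dose-response curve. One simple possibility, shown here purely as an assumption, is log-linear interpolation between the two doses that bracket 50% of control. The function name, the three intermediate doses, and all response values below are hypothetical.

```python
import math


def ic50_log_interp(doses, responses, level=50.0):
    """Interpolate, on a log10 dose axis, the dose at which the response crosses `level`.

    `responses` are percentages of the vehicle control. Returns None when no
    pair of adjacent doses brackets `level`.
    """
    pairs = sorted(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(pairs, pairs[1:]):
        if r0 != r1 and (r0 - level) * (r1 - level) <= 0:
            frac = (r0 - level) / (r0 - r1)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None


# Five illustrative doses (ug/ml) spanning the stated 83-750 range; the
# intermediate doses and every response value are assumed, not measured.
doses = [83, 144, 250, 433, 750]
luminescence_pct = [90, 70, 55, 35, 15]
print(ic50_log_interp(doses, luminescence_pct))
```

With these placeholder values the estimate falls between 250 and 433 µg/ml. A four-parameter logistic fit would be a common alternative when more dose points are available.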

## Algorithm for Determining Correlation Coefficients

GraphPad Prism 6 software was used to determine the correlation coefficients. Each row of the input table represented a different gene or signaling pathway, and each column represented a different batch. Values of gene expression, IC50, or EC50 were entered. The "Column analyses" function of the software was selected for correlation analysis. The correlation between each pair of columns was computed assuming that the data were sampled from a Gaussian distribution, and Pearson coefficients were calculated.
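The pairwise column correlation described above can be reproduced outside Prism. The following sketch computes Pearson coefficients between batch columns in plain Python; the batch labels and values are hypothetical placeholders, not data from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(table):
    """Map each ordered pair of column names to its Pearson coefficient."""
    names = list(table)
    return {(a, b): pearson(table[a], table[b]) for a in names for b in names}


# Each list is one batch (column); each position is one assay (row).
# All values are hypothetical IC50/EC50 readings in ug/ml.
table = {
    "batch_A": [100, 200, 300, 400],
    "batch_B": [110, 190, 310, 390],
    "batch_C": [400, 150, 350, 120],
}
for pair, r in correlation_matrix(table).items():
    print(pair, round(r, 3))
```

In this toy table, `batch_A` and `batch_B` correlate strongly (R > 0.9), while `batch_C` does not.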

#### RESULTS

#### All YIV-906 Batches but Not Commercial Batches of Huang Qin Tang Enhance Antitumor Activity of CPT11 While Reducing the Body Weight Loss Caused by CPT11

We previously showed that YIV-906-10 could enhance the action of CPT11 against colon-38 tumor growth while reducing body weight loss caused by CPT11. Here, we compared five different clinical batches (6, 7, 8, 10, and 11) of YIV-906, which were manufactured separately over a span of 15 years, with commercial batches of Huang Qin Tang, HQT (F, 38, 39, 40), for their effects on the biological activities of CPT11 in colon-38 tumor-bearing BDF1 mice. Results indicated that YIV-906-10 and the other batches of YIV-906 enhanced the anti-tumor activity of CPT11 against colon-38 tumor growth (**Figures 1A,E**) while promoting body weight recovery following CPT11 treatment (**Figures 1B,E**). Commercial HQT (F, 38, 39, 40) had no or low in vivo activity in enhancing CPT11 action on colon-38 tumor growth (**Figures 1C,E**). Commercial HQT (F, 38, 39, 40) also had no activity in promoting body weight recovery following CPT11 treatment (**Figures 1D,E**). This result confirmed that YIV-906 batches manufactured up to 15 years apart could have very similar biological activities.

#### Chemical Profile and Correlation Analysis for YIV-906 Batches and Commercial Batches of Huang Qin Tang Did Not Match Their Biological Activities on CPT11

Peaks from the LC-MS profiles were selected when their signals were substantially higher (fivefold higher) than the background (roughly 5 × 10<sup>4</sup>). In total, 77 peaks, identified by their specific ion pairs in the LC-MS spectra, could be selected from either YIV-906 or HQT. A given peak among the 77 might or might not be common to YIV-906 and HQT (**Figure 2A**). The total integrated area of the 77 peaks was about 90% of the total integrated area of all peaks in the chemical profiles. The corresponding peaks of the different batches of YIV-906 or HQT were de-noised and aligned using MZmine software (**Figure 2A**).
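The selection rule amounts to a simple threshold: with a background of roughly 5 × 10<sup>4</sup>, a fivefold cutoff corresponds to a signal of about 2.5 × 10<sup>5</sup>. A minimal sketch follows, with hypothetical peak names and values, and with "fivefold higher" read as strictly greater than five times the background (an assumption).

```python
def select_peaks(peak_signals, background=5e4, fold=5.0):
    """Keep peaks whose signal exceeds `fold` times the background level."""
    threshold = background * fold
    return {name: s for name, s in peak_signals.items() if s > threshold}


def area_fraction(selected_areas, all_areas):
    """Fraction of the total integrated area accounted for by the selected peaks."""
    return sum(selected_areas) / sum(all_areas)


# Hypothetical peak signals, for illustration only.
signals = {"peak_1": 1.0e6, "peak_2": 2.0e5, "peak_3": 3.0e5}
print(sorted(select_peaks(signals)))
```

Here `peak_2` is dropped because 2.0 × 10<sup>5</sup> is below the 2.5 × 10<sup>5</sup> cutoff.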

When all 77 peaks were included in the Pearson correlation analysis for each pair of samples, we found that different pairs of YIV-906 batches demonstrated a strong positive similarity index (R from 0.9 to 0.99, red) (**Figure 2B**). However, based on their chemical profiles, HQT-F and HQT-39, which did not have any biological activity on CPT11 in animals, also showed a positive similarity to most batches of YIV-906. Therefore, using all detectable chemicals for quality control may lead to a false prediction.

#### Mechanism Based Quality Control Platform and Correlation Analysis for YIV-906 Batches and Commercial Batches of Huang Qin Tang Matched Their Biological Activities on CPT11

Based on our previous animal and cell culture experiments, we know that YIV-906 has a strong impact on inflammatory signaling by inhibiting NFκB, iNOS, COX-2, and IL6, and that it can promote tissue recovery by potentiating Wnt signaling. YIV-906 also contains flavonoids, which are known to affect hormone signaling and to have anti-oxidation properties. Therefore, we selected 18 luciferase reporter assays and two enzymatic assays relevant to the mechanism of action of YIV-906 as our MBQC platform. These assays may not cover all biological activities of YIV-906, but all of them can be used to explain its mechanism of action in alleviating side effects caused by chemotherapy.

We compared the signal transduction activity responses of clinical batches (6, 7, 8, 10, and 11) of YIV-906 and commercial batches (F, 38, 39, 40) of HQT across these assays (**Figure 3A**). Compared with YIV-906-10, YIV-906 batches 6, 7, 8, and 11 showed very similar activities in the different assays (**Figure 3A**). However, commercial HQT (F, 38, 39, 40) showed very low activities (with much larger EC50 or IC50) in certain assays when compared with YIV-906 (**Figure 3A**). Biological activities among the HQT batches were also very different (**Figure 3A**). When all the results from the assays of the MBQC platform were analyzed using Pearson correlation for each pair of samples, we found that the different batches (6, 7, 8, 10, and 11) of YIV-906 had high similarity (R > 0.9) (**Figure 3B**). As expected, the HQT (F, 38, 39, 40) batches had lower similarity (R < 0.9) to the YIV-906 (6, 7, 8, 10, and 11) batches (**Figure 3B**). Most importantly, the correlation analysis based on the MBQC platform fitted the results from animal experiments, in which all YIV-906 batches, but not HQT, showed biological activity on CPT11. This method could be simplified even further by using fewer bioassays.

## DISCUSSION

In this study, we reported that different batches of YIV-906 manufactured over a period of 15 years had very similar biological activity on CPT11 in animals, whereas commercial batches of Huang Qin Tang (HQT) did not. We developed an MBQC platform to differentiate YIV-906 from HQT and predict their biological activities in animals. Chemical profile analysis based on all detectable chemicals of YIV-906 or HQT could not be used to differentiate them and may lead to false predictions of their biological activities.

Here, we showed that clinical grade YIV-906 had better quality than commercial HQT because the different batches of YIV-906 were manufactured according to a cGMP protocol in which each manufacturing step followed a standard operating procedure. In addition, herbs for YIV-906 manufacturing were selected by highly experienced herbalists from defined sources and specific seasons. Other HQT may use different sources of herbs and may not follow a cGMP protocol. Some HQT also have a different ratio of the four herbs from that of YIV-906. Therefore, different batches of YIV-906 had better consistency in their biological activities than other HQT.

In 2000, we proposed "phytomics," which covers both chemical and biological fingerprints, to characterize herbal mixtures (Tilton et al., 2010). However, the relevance of the detected chemicals or biological responses to the pharmacological activity of herbal mixtures could be an issue. This report highlights that chemical profile analysis is not the ideal methodology for the quality control of herbal products.

Over the past 20 years, owing to the availability of many advanced analytical chemistry technologies, such as HPLC-MS, GC-MS, CE-MS, LC-NMR, NIR, NMR, and 2D-IR (Jiang et al., 2010; Song et al., 2013; Wang et al., 2013; Cheng et al., 2014), chemical profile analysis has become very popular as a quality control method in the botanical industry. It is widely believed that herbal products with similar detected chemical profiles should have similar biological activities. Even the herbal pharmacopeias published by some countries rely heavily on one or two so-called "key compound(s)" as the quality indicator for many herbs. However, quality control that depends on a specific chemical detection method can have many drawbacks. No single chemical detection method can cover 100% of the chemicals in a given herbal product: different chemical detection techniques have their own advantages and limitations for detecting certain classes of chemicals. Many herbs or herbal formulations have multiple active compounds. Without comprehensive knowledge of the biological activity of all chemicals in these herbal products, an analysis that includes irrelevant chemicals will mask the differences and interactions between different herbal preparations.

According to the Botanical Drug Development Guidance for Industry published by the FDA in 2016, the identification of every constituent of a botanical product is impossible and identification of the active constituents is not essential. Furthermore, the guidance suggests that biological assays should be developed before Phase 3 studies when the active constituents are not known or quantifiable. Therefore, future quality control for herbal products should focus more on the biological activity of the herbal product, based on its usage, than on its chemical profile. With enough scientific knowledge of the mechanism of action behind the different claims of a given herbal product, we can select relevant biological assays to establish a specific quality control platform for assessing the quality of the herbal product for a particular usage. The results from the in vitro quality control platform should be further validated using in vivo experiments.

In conclusion, the quality control of an herb should depend on its usage. Appropriate biological assays should be developed for the QC of each particular usage. Unless irrelevant chemicals have been filtered out, chemical fingerprint analysis alone has notable shortcomings. Additionally, a similarity index is useful only when it is built from relevant information; otherwise it is subject to bias. Unless all the active compounds of an herbal product have been identified, MBQC should take precedence over chemical profile analysis, and collaboration between academia and industry could help further develop a rigorous MBQC platform. Our novel approach to quality control or CMC could be applied to other herbal medicines in order to ensure their biological activity.

#### ETHICS STATEMENT

Animal experimental protocols were approved by Yale University Institutional Animal Care and Use Committee (IACUC). All animal experiments were carried out in accordance with an approved Yale University Institutional Animal Care and Use Committee (IACUC) protocol.

## AUTHOR CONTRIBUTIONS

WL did luciferase reporter assays, correlation analysis, and wrote the manuscript. YR did LC-MS analysis. FG did luciferase reporter assays and enzymatic assays. ZJ did animal experiments. WC did correlation analysis and data processing for LC-MS. S-HL provided YIV-906 and wrote the manuscript. C-HX helped set up LC-MS. Y-CC designed experiments and wrote the manuscript.

## FUNDING

This work was supported by grant (1PO1CA154295-01A1) from National Cancer Institute (NCI), NIH, United States. Y-CC is a fellow of National Foundation for Cancer Research (NFCR), United States.

#### REFERENCES


advanced pancreatic and other gastrointestinal malignancies. Phytomedicine 17, 161–169. doi: 10.1016/j.phymed.2009.12.016


**Conflict of Interest Statement:** Y-CC and S-HL are the co-inventors of YIV-906 for cancer treatment. Yale holds the IP position. S-HL is an employee of Yiviva, which licensed the IP of YIV-906 from Yale, and Yale is a cofounder of the company.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Lam, Ren, Guan, Jiang, Cheng, Xu, Liu and Cheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Practical Quality Control Method for Saponins Without UV Absorption by UPLC-QDA

#### Manjia Zhao<sup>1</sup> , Yuntao Dai<sup>1</sup> \*, Qi Li<sup>1</sup> , Pengyue Li<sup>1</sup> , Xue-Mei Qin<sup>2</sup> and Shilin Chen<sup>1</sup> \*

1 Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China, <sup>2</sup> Modern Research Center for Traditional Chinese Medicine, Shanxi University, Taiyuan, China

#### Edited by:

Caroline Howard, Medicines and Healthcare Products Regulatory Agency, United Kingdom

#### Reviewed by:

Jianping Chen, Shenzhen Traditional Chinese Medicine Hospital, China Rufeng Wang, Beijing University of Chinese Medicine, China

\*Correspondence:

Yuntao Dai ytdai@icmm.ac.cn; dai\_yuntao@live.cn Shilin Chen slchen@icmm.ac.cn

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 31 July 2018 Accepted: 09 November 2018 Published: 11 December 2018

#### Citation:

Zhao M, Dai Y, Li Q, Li P, Qin X-M and Chen S (2018) A Practical Quality Control Method for Saponins Without UV Absorption by UPLC-QDA. Front. Pharmacol. 9:1377. doi: 10.3389/fphar.2018.01377

Saponins are an important class of active ingredients. Analysis of saponin-containing herbal medicines is a major challenge for the quality control of medicinal herbs in companies. Taking the medicine Astragali radix (AR) as an example, the existing evaporative light scattering detection (ELSD) method for astragaloside IV (AG IV) has the disadvantages of time-consuming sample preparation and low sensitivity. Because ELSD is a universal detector, it yields an unusable fingerprint with huge signals from primary compounds and much smaller signals from saponins. The purpose of this study was to provide a practical and comprehensive method for the quality control of the astragalosides in AR. A simple sample preparation method with sonication extraction and ammonia hydrolyzation was established, which shortens the preparation time from around 2 days to less than 2 h. A UPLC-QDA method with the SIM mode was established for the quantification of AG IV in AR. Methanol extract was subjected to UPLC-QDA for fingerprinting analysis, and the common peaks were assigned simultaneously with the QDA. The results showed that with the newly established method, the preparation time for a set of samples was less than 90 min. The fingerprints can simultaneously detect both saponins and flavonoids in AR. This simple, rapid, and comprehensive UPLC-QDA method is suitable for quality assessment of AR and its products in companies, and also provides a reference for the quality control of other saponin ingredients without UV absorption.

#### Keywords: saponins, Astragalus membranaceus, astragaloside IV, fingerprint, UPLC-QDA

#### INTRODUCTION

Saponins are of great value in the development of new drugs or functional foods because of their wide distribution and various activities (Monschein et al., 2013). Some commonly used Chinese medicines, including ginseng (Panax ginseng C.A. Mey), notoginseng (Panax notoginseng), Astragali radix (Astragalus membranaceus), licorice (Glycyrrhiza), Dioscoreae rhizoma (Dioscorea opposita Thunb.), and Ophiopogonis radix (Ophiopogon japonicus), all contain saponins (Ministry of Public Health of the People's Republic of China, 2015). Therefore, the establishment of a simple and comprehensive quality control method is important for ensuring the quality of products containing saponins.

Because of the complexity of botanical ingredients, quantitative determination of index compounds (or active compounds) and holistic analysis of fingerprints are widely used for the quality control of herbal medicines (Liu et al., 2007). However, saponins have no UV absorption or only terminal absorption. When ultraviolet detection is used for saponins, 203 nm is often chosen as the detection wavelength (Qi et al., 2006). However, this method has weak sensitivity and low accuracy, and therefore its use has gradually declined. Evaporative light scattering detection (ELSD) is currently used more often as a general-purpose detection method for saponins. Although ELSD is more sensitive than UV detection for compounds with no UV absorption, it still has disadvantages such as insufficient sensitivity (Qi et al., 2008). High-performance liquid chromatography with mass spectrometry (HPLC-MS) offers better selectivity and higher sensitivity for the determination of astragalosides, but it is relatively expensive and cannot be widely applied. The QDA is a modular single quadrupole mass detector. It is small and inexpensive compared with a QTOF mass spectrometer and more sensitive for saponins than ELSD (Veryser et al., 2015; Yao et al., 2016). In this study, QDA was used to establish fingerprints and for the quantitative determination of astragalosides.

Astragalosides are pharmacologically important constituents of Astragali radix (AR) (Ministry of Public Health of the People's Republic of China, 2015), which is one of the best known and most widely used herbal medicines. It has been used for over 2000 years for its immunomodulating (Huang et al., 2007; Zhang et al., 2009), antioxidative (Sheih et al., 2011), and anti-inflammatory (Shon and Nam, 2003; Hoo et al., 2010) effects. At present, quality control methods for AR include the determination of astragaloside IV with HPLC-ELSD in the Chinese Pharmacopoeia (Ministry of Public Health of the People's Republic of China, 2015). In this method, sample preparation involves reflux extraction and liquid–liquid separation with n-butanol, which may take more than 2 days per sample. In addition, attempts were made to establish fingerprints for saponins by HPLC-ELSD for the overall quality control of AR (Ministry of Public Health of the People's Republic of China, 2015). However, the fingerprint of saponins was overwhelmed by very large peaks from primary components, and the peaks for the saponins were too small because of the universality of the HPLC-ELSD method and the high proportion of primary components. For these reasons, quality control of saponins is time-consuming and lacks specificity or integrity. Hence, a simple, economical, and valid quality control method for AR is urgently required.

Taking AR as an example, the purpose of this study was to establish a simple and integrated quality control method for saponins in order to meet the requirements of product quality supervision during production. The astragaloside content was determined by ultra-performance liquid chromatography coupled with a QDA detector (UPLC-QDA) in SIM mode, and the full scan mode was used to establish the fingerprint of the astragalosides and the main flavonoids in AR. The result is a simple, fast, and holistic quality control method for saponins from AR.

## MATERIALS AND METHODS

## Plants and Chemicals

Commercial samples of AR were collected from different places in China and authenticated as the dry roots of Astragalus membranaceus (Fisch.) Bge. var. mongholicus (Bge.) Hsiao using the DNA barcoding method. The mean content of AG IV in all samples met the requirements of the Chinese Pharmacopoeia (Ministry of Public Health of the People's Republic of China, 2015). A voucher specimen was deposited in the herbarium of the Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences.

Saponin reference compounds, including astragaloside IV (AG IV, S1), astragaloside III (AG III, S2), astragaloside II (AG II, S3), astragaloside I (AG I, S5), and isoastragaloside I (iAG I, S6), and the internal standard ginsenoside Rg1 (N1), were obtained from the National Institute for the Control of Pharmaceutical and Biological Products. Their purities, as determined by HPLC, were above 98%. The structures of these compounds are shown in **Figure 1**. HPLC-grade acetonitrile (Fisher Scientific, United States), Optima LC-MS grade formic acid (Fisher Scientific, Czechia), and pure water (Wahaha, China) were used in the mobile phase. Other reagents and chemicals were of analytical grade. All solvents and samples were filtered through 0.22 µm membrane filters (Jinteng, Tianjin, China) before injecting into the HPLC.

## Astragaloside IV (AG IV) Analysis

#### Reference Solution Preparation

Five saponin reference compounds and the internal standard ginsenoside Rg1 were accurately weighed and formulated into standard solutions of 1 mg/mL with methanol and stored at 4 °C for further use.

#### Sample Preparation

The dried roots of AR were milled to a homogeneous powder and then sieved through a No. 65 mesh. Each powder sample, accurately weighed (2 g), was placed in a 50 mL centrifuge tube and ultrasonicated (40 kHz, 500 W) with 30 mL methanol for 30 min. After centrifugation (about 3000 × g) for 5 min, the methanol solution was filtered. The residue was washed twice with 15 mL methanol, ultrasonicated (40 kHz, 500 W) for 5 min, centrifuged (about 3000 × g) for 5 min, and filtered. The filtrates were combined and evaporated with a rotary evaporator, and the residue was then redissolved in 10 mL of 10% (V/V) ammonia solution, shaken from time to time for 10 min, filtered through a membrane filter (0.22 µm), and injected into the HPLC.

#### UPLC-QDA Conditions

Chromatographic analysis was performed on a Waters ACQUITY H-Class UPLC® system, equipped with a quaternary solvent manager, sample manager, flow-through needle, high temperature column heater with active preheating, and QDA detector. Chromatographic separation was carried out at 35 °C on a BEH Shield RP18 column (2.1 mm × 100 mm, 1.7 µm) (Waters). The mobile phase consisted of acetonitrile containing 0.1% formic acid (A) and water containing 0.1% formic acid (B), using a gradient elution of 5–30% A at 0–3 min, 30–40% A at 3–5 min, 40–100% A at 5–15 min, and 100–5% A at 15–18 min. The sample volume injected was 2.5 µL, and the flow rate was 0.4 mL/min.

The conditions of the electrospray ionization (ESI) source were as follows: ESI in positive mode; capillary voltage, 800 V; fragmentor, 15 V; sampling frequency, 5 Hz; probe temperature, 500 °C. Ginsenoside Rg1 was detected in SIM 823.48 Da [M+Na]<sup>+</sup> mode at 0–5.5 min; AG IV and AG III were detected in SIM 808.00 Da [M+Na]<sup>+</sup> mode at 5.5–8 min; AG II and iAG II were detected in SIM 849.50 Da [M+Na]<sup>+</sup> mode at 0–8 min; AG I was detected in SIM 869.50 Da [M+H]<sup>+</sup> mode at 0–8 min; iAG I was detected in SIM 891.50 Da [M+Na]<sup>+</sup> mode at 0–8 min.

#### HPLC-ELSD Conditions

Quantitative analysis was performed using a 1200 Series HPLC (Agilent)–ELSD (Alltech 2000 ES). A YMC-Triart C18 column (250 mm × 4.6 mm I.D., S-5 µm) was used for the chromatographic separations. The mobile phase consisted of acetonitrile containing 0.1% formic acid (A) and water containing 0.1% formic acid (B), using a gradient elution of 5–10% A at 0–5 min, 10–32% A at 5–10 min, 32–45% A at 10–30 min, 45–95% A at 30–35 min, and 95–20% A at 35–40 min. The injection volume was 20 µL, and the flow rate was 1 mL/min. ELSD was performed with air as the carrier gas at a flow rate of 2.5 L/min, and the nebulizer temperature was set to 100 °C.

#### Method Validation

#### Calibration Curves, Limits of Detection (LOD) and Quantification (LOQ)

A methanol stock solution of AG IV was prepared and diluted to a series of concentrations (0.008, 0.009, 0.01, 0.06, 0.08, and 0.09 mg/mL) for the construction of calibration curves. The calibration curve was constructed using relative peak area (analyte/internal standard; Y axis) and the concentration of the standard (µg/mL; X axis). The diluted solution of the reference compound was further diluted with methanol to a series of concentrations to determine the LOD and LOQ. The LOD and LOQ under the present chromatographic conditions were determined at signal-to-noise (S/N) ratios of 3 and 10, respectively.
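An internal-standard calibration of this kind is commonly fitted by ordinary least squares, although the text does not state the fitting method, so the linear model below is an assumption. The concentrations are the six levels listed above, kept in mg/mL; the relative peak areas are noise-free placeholders generated from an arbitrary line, not measured values, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def concentration_from_ratio(ratio, slope, intercept):
    """Invert the calibration line to get concentration from a relative peak area."""
    return (ratio - intercept) / slope


conc_mg_per_ml = [0.008, 0.009, 0.01, 0.06, 0.08, 0.09]
# Placeholder relative peak areas (analyte / internal standard), NOT measured data.
ratios = [20.0 * c + 0.01 for c in conc_mg_per_ml]

slope, intercept, r_squared = fit_line(conc_mg_per_ml, ratios)
print(slope, intercept, r_squared)
print(concentration_from_ratio(1.01, slope, intercept))
```

A sample whose relative peak area is 1.01 would read back as 0.05 mg/mL on this placeholder curve.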

#### Precision, Repeatability, Stability, and Accuracy

Intra-day variations for six successive injections within 1 day were chosen to determine the precision of the developed method. Inter-day variations for three consecutive days were chosen to determine the precision of the developed method. To confirm the repeatability, six different working solutions from the same sample were prepared and analyzed. The sample stability test was determined with one sample during 1 day at 0, 0.5, 1, 2, 4, 8, 16, and 24 h. Over this period, the solution was stored at room temperature.

A recovery test was used to evaluate the accuracy of this method. For this, 1 mL of the 1 mg/mL AG IV standard solution prepared above was combined with 1 g of the sample, and the mix was extracted as described above in the "Sample Preparation" section. Recovery was calculated as the difference between the mass of AG IV in the mix (sample + standard) (M1) and the mass of AG IV in the 1-g sample alone (M2), divided by the mass of AG IV standard added (M3), as shown in Equation (1):

$$\text{Recovery (\%)} = \frac{M_1 - M_2}{M_3} \times 100\% \tag{1}$$
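Equation (1) translates directly into a one-line function. In the sketch below, M3 is 1 mg because 1 mL of a 1 mg/mL standard was added; the values used for M1 and M2 are hypothetical.

```python
def recovery_percent(m1_mg: float, m2_mg: float, m3_mg: float) -> float:
    """Recovery (%) = (M1 - M2) / M3 x 100, with all masses in the same unit."""
    return (m1_mg - m2_mg) / m3_mg * 100.0


# M3 = 1 mL x 1 mg/mL = 1 mg of added AG IV standard.
# M1 and M2 are hypothetical values chosen for illustration.
print(recovery_percent(m1_mg=1.95, m2_mg=1.00, m3_mg=1.00))
```

With these placeholder masses, 0.95 mg of the 1 mg spike is found, which corresponds to a recovery of about 95%.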

## Fingerprint Analysis

#### Sample Preparation

Each powder sample was accurately weighed (2 g), placed in a 50 mL centrifuge tube, and ultrasonicated (40 kHz, 500 W) with 30 mL methanol for 30 min. After centrifugation (about 3000 × g) for 20 min, the methanol solution was filtered through a membrane filter (0.22 µm) and then injected into the HPLC.

#### UPLC-QDA Conditions

Chromatographic separation was carried out at 30 °C on a Waters CORTECS T3 column (2.1 mm × 100 mm, 1.6 µm). The mobile phase consisted of acetonitrile containing 0.1% formic acid (A) and water containing 0.1% formic acid (B), with an elution gradient of 2–19% A at 0–2 min, 19–42% A at 2–11.5 min, 42–55% A at 11.5–15 min, 55–65% A at 15–16.5 min, 65–75% A at 16.5–18 min, 75–100% A at 18–22.5 min, and 100–2% A at 22.5–24 min. The injection volume was 3 µL, and the flow rate was 0.4 mL/min.

The conditions of the ESI source were as follows: ESI in positive mode; capillary voltage, 800 V; fragmentor, 20 V; sampling frequency, 10 Hz. The QDA was operated in full scan mode over a mass range of m/z 450–1200.

#### Method Validation

Precision was determined from the intra-day variation (six injections within 1 day). To confirm the repeatability, six different working solutions prepared from the same sample were analyzed. Sample stability was tested with one sample over 1 day, during which the solution was stored at room temperature. The software "Similarity Evaluation System for Chromatographic Fingerprint of TCM" was used to calculate the correlation coefficients, as well as the relative retention time (RRT) and relative peak area (RPA) of the common peaks. The correlation coefficients and the RSD% of the RRT and RPA of the common peaks were then used as evaluation criteria, which semi-quantitatively express the chemical properties in the chromatographic profiles of the samples.
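The two evaluation criteria can be illustrated with a short sketch. The similarity software's exact algorithm is not described here, so the Pearson correlation below is an assumption standing in for its "correlation coefficient"; the function names and the example peak areas are likewise hypothetical.

```python
import math
import statistics


def rsd_percent(values):
    """Relative standard deviation (%): sample SD divided by the mean, x 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def relative_values(values, reference):
    """Express retention times or peak areas relative to a reference peak."""
    return [v / reference for v in values]


def pearson_correlation(a, b):
    """Pearson correlation between two equal-length peak-area profiles."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den


# Hypothetical relative peak areas of one common peak over six injections.
rpa_injections = [1.02, 0.98, 1.01, 0.99, 1.03, 0.97]
print(f"RSD% of RPA: {rsd_percent(rpa_injections):.2f}")

# Hypothetical peak-area profiles of a sample and the mean chromatogram.
sample_profile = [12.0, 30.5, 8.1, 44.0, 5.2]
mean_profile = [11.5, 31.0, 8.0, 42.5, 5.5]
print(f"Correlation: {pearson_correlation(sample_profile, mean_profile):.3f}")
```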

#### RESULTS AND DISCUSSION


#### Optimization of UPLC Systems

According to the literature, acetonitrile-water with 0.1% formic acid was used as the mobile phase (Qi et al., 2009). Two columns were screened as the stationary phase for the determination of AG IV in the AR extracts. **Figure 2** shows the total ion chromatograms (TICs) in ESI (+) mode for the AR extracts separated on the different columns. AG IV and AG III were not separated on a CORTECS T3 column, whereas good separation was achieved with a BEH Shield RP18 column (**Figure 2**).

Gradient elution was used for the determination of AG IV, instead of the isocratic elution used in the Chinese Pharmacopoeia and most of the literature (Ministry of Public Health of the People's Republic of China, 2015). To avoid interference from other compounds, the gradient was set to start with 5–30% organic solvent for 3 min before the elution of the target compounds. The UV spectrum showed that most of the flavonoids eluted before the AG IV peak. The gradient was optimized so that most of the highly polar compounds eluted before AG IV, avoiding their impact on the determination of AG IV.

#### Optimization of Sample Preparation

In the Chinese Pharmacopoeia method for the determination of AG IV, sample preparation of AR includes a 4-h solid-liquid extraction, liquid–liquid separation with butanol (which takes more than 1 day), and column enrichment (Ministry of Public Health of the People's Republic of China, 2015). Preparing one sample therefore takes more than 2 days, which is not suitable for monitoring a large number of products in a commercial setting. In this study, sonication extraction was used instead of reflux extraction. The results showed no statistically significant difference between the sonication and reflux extraction methods (**Supplementary Table S1**).

After extraction, the Chinese Pharmacopoeia method uses liquid–liquid separation with butanol and column enrichment to separate and enrich astragalosides from the extracts (Ministry of Public Health of the People's Republic of China, 2015). This step was omitted from the sample preparation here and was instead performed in the subsequent UPLC analysis step, with a gradient wash starting at a high percentage of water, as described in the "Materials and Methods" section of this paper. This on-line elution with UPLC, in place of both solvent extraction and off-line column enrichment, saved a significant amount of time and also improved the accuracy of sample preparation.

TABLE 2 | Quantitative analytical results of astragaloside IV in AR samples.

TABLE 3 | On-line detected data for assigned compounds in Astragali Radix.

After extraction, ammonia solution was added to the extracts, and different amounts of ammonia were compared. The peak area of AG IV increased with the amount of ammonia solution and reached its highest values with 10 mL or more of 10% (V/V) ammonia solution; therefore, 10 mL of 10% (V/V) ammonia solution was used in this study.

One important step in the Chinese Pharmacopoeia sample preparation is reverse extraction with ammonia. It has been shown that the purpose of this step is to transform other saponins into AG IV (Chu et al., 2014). The chromatograms of the astragalosides in different ion channels were recorded before and after treatment with aqueous ammonia (**Figure 2**). They showed that the saponins detectable in the methanol extract of AR were AG IV (S1) and other astragalosides, including AG II (S3), iAG II (S4), AG I (S5), and iAG I (S6). After treatment with ammonia, AG II (S3), iAG II (S4), AG I (S5), and iAG I (S6) all disappeared and were transformed into AG IV (S1) (Chu et al., 2014). The results indicated that the other astragalosides, except AG III, could be converted into AG IV, and that the amount of AG IV detected was mainly the sum of AG I (S5), iAG I (S6), AG II (S3), and AG IV (S1).

#### Method Validation and Comparison With HPLC-ELSD

The calibration curve (Y = 29.215X + 0.2772) was constructed using the relative peak area for the Y axis and the concentration of the standard (µg/mL) for the X axis. The linearity of the analytical response was acceptable, with correlation coefficients higher than 0.99 and a linear dynamic range of about two orders of magnitude. The LOD, LOQ, precision, stability, repeatability, and accuracy of the established method for the determination of AG IV are summarized in **Table 1** and **Supplementary Table S2**. All results for precision, stability, repeatability, and accuracy indicated that this method was valid.
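To illustrate how such a calibration curve is used, the sketch below inverts Y = 29.215X + 0.2772 to estimate a concentration from a measured relative peak area. The slope and intercept are taken from the text; the function names and the example response are hypothetical, and the X units follow the axis definition given above.

```python
SLOPE = 29.215      # from the reported calibration curve
INTERCEPT = 0.2772  # from the reported calibration curve


def response_from_concentration(x):
    """Predicted relative peak area (Y) for a standard concentration X."""
    return SLOPE * x + INTERCEPT


def concentration_from_response(y):
    """Back-calculated concentration (X) from a measured relative peak area (Y)."""
    return (y - INTERCEPT) / SLOPE


# Hypothetical measured relative peak area, for illustration only.
measured_y = 1.50
estimated_x = concentration_from_response(measured_y)
print(f"Estimated concentration: {estimated_x:.4f} (units of the X axis)")
```

Back-calculated values are only meaningful inside the concentration range covered by the calibration standards.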

The performance of UPLC-QDA was compared with that of HPLC-ELSD. **Table 1** lists the performance indices of UPLC-QDA and HPLC-ELSD for the analysis of AG IV. The LOD and LOQ were 8 and 25 ng with UPLC-QDA, compared with 200 and 500 ng with ELSD, which means that the sensitivity was greatly improved by the use of UPLC-QDA instead of HPLC-ELSD. The higher sensitivity of UPLC-QDA was also observed in sample detection. **Figure 2** shows the chromatogram for the sample (d) with HPLC-ELSD and the chromatograms for the standard compound (e) and sample (f) with UPLC-QDA. An obvious peak for AG III was observed with the established method, but it was not obvious in the HPLC-ELSD chromatogram, which is attributed to the higher sensitivity of QDA. The linear range was also broader with UPLC-QDA than with ELSD, and the analysis time was notably shorter. These advantages indicate that UPLC-QDA can be successfully applied to the quantitative quality analysis of AR. The precision, stability, and repeatability of the established UPLC-QDA method were not as good as those of the HPLC-ELSD method, but they are acceptable for the determination of AG IV in AR.

The established UPLC-QDA method was investigated for the analysis of AR. Fifteen samples from different batches were analyzed and the analytical contents were summarized in **Table 2**. All the analyses were carried out and repeated three times, and the data were recorded and expressed as the mean AG IV content.

**Table 2** shows the means for the 15 batches detected with UPLC-QDA; the results demonstrate that the UPLC-QDA method can be successfully applied to the determination of AG IV in different AR samples.

#### Optimization of UPLC-QDA Conditions for Fingerprint

The elution gradient was optimized to achieve good separation for each peak in a short time. Several different gradients were tried and finally the gradient used in this study was selected, with good separation of each peak.

#### Validation for Fingerprint Method

The correlation coefficients and the RSD% of the RRT and RPA of common peaks were calculated (**Supplementary Tables S3–S5**). The correlation coefficients of precision were higher than 0.989 and the RSD% of RPA was lower than 5.00. The correlation coefficients of the repeatability test were higher than 0.992 and the RSD% of RPA was lower than 8.00. The correlation coefficients of the stability test were higher than 0.984 and the RSD% of RPA was lower than 6.10, indicating that the sample remained stable for 1 day. All tests for precision, repeatability, and stability indicated that this method was valid and applicable.

## Establishment of Fingerprint of Saponins in AR

In this study, 15 samples were analyzed by the newly developed method. The mean chromatogram and correlation coefficients of the samples were calculated by using the similarity software, and it was found that correlation coefficients of AR samples were higher than 0.920, which indicated that all the samples tested have high consistency in quality.

There were 13 "common peaks" in the chromatograms of the AR samples, which were assigned using the ion masses measured with QDA and confirmed with reference compounds. **Figure 3** shows the typical mass spectra for saponins. In the positive ion mode, six of the saponins generated typical [M+Na]<sup>+</sup> ions, with m/z values of 807.5 + 42n (n = 0 refers to AG IV, n = 1 refers to AG II/iAG II, n = 2 refers to AG I/iAG I). The six flavonoid peaks also generated typical [M+Na]<sup>+</sup> ions. The MS data for the six saponins and flavonoids in the positive ion mode are shown in **Table 3**. The RRT and RPA of the common peaks in the 15 samples were calculated, and the RPA data are shown in **Supplementary Table S6**.
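The 807.5 + 42n series can be written out as a simple lookup, which makes the peak assignment by mass explicit. This is only a sketch: the matching tolerance of 0.5 m/z units and the function names are assumptions introduced for illustration, not parameters reported in this study.

```python
BASE_MZ = 807.5   # [M+Na]+ of AG IV (n = 0), from the text
STEP_MZ = 42.0    # increment per unit of n, from the text

ASSIGNMENTS = {
    0: "AG IV",
    1: "AG II / iAG II",
    2: "AG I / iAG I",
}


def sodium_adduct_mz(n):
    """Expected [M+Na]+ m/z for a given n in the 807.5 + 42n series."""
    return BASE_MZ + STEP_MZ * n


def assign_saponin(observed_mz, tolerance=0.5):
    """Return the matching assignment, or None if no series member fits.

    The tolerance is an assumed value chosen for this example.
    """
    for n, name in ASSIGNMENTS.items():
        if abs(observed_mz - sodium_adduct_mz(n)) <= tolerance:
            return name
    return None


for n, name in ASSIGNMENTS.items():
    print(f"n = {n}: m/z {sodium_adduct_mz(n):.1f} ({name})")
```

Isomers such as AG II and iAG II share the same m/z, so retention time is still needed to tell them apart.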

The results show that the fingerprint method established in this study can simultaneously detect a variety of saponins and flavonoids. In addition, the identity of each compound can be directly established from its mass number.

### CONCLUSION

A simple and fast quantification method for AG IV and an overall fingerprint of the main components (astragalosides and flavonoids) in AR have been established with UPLC-QDA. The established method is feasible for the comprehensive quality evaluation of AR. UPLC-QDA exhibits advantages over ELSD in sensitivity, peak assignment, and the simultaneous detection of components with and without UV absorption in the fingerprint. The established methods provide a reference for the quality control of saponin ingredients without UV absorption. This study therefore provides suitable methods for the practical quality assessment of saponins in commercial settings.

## AUTHOR CONTRIBUTIONS

YD, QL, X-MQ, and SC designed the study. MZ did the experiments. YD and MZ wrote the manuscript. All authors gave approval to the final version.

## FUNDING

The authors are grateful for the financial support provided by National Science Foundation (81473340 and 81803734), Project of Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences (ZXKT17048), and Project of China Academy of Chinese Medical Sciences (ZXKT17009 and GH201701).

## ACKNOWLEDGMENTS

We would like to thank Waters Technologies (Shanghai) Ltd. for providing the UPLC-QDA for analysis and Editage (www.editage.cn) for English language editing.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2018.01377/full#supplementary-material

## REFERENCES

its anti-inflammatory activity. Nutr. Metab. 7:67. doi: 10.1186/1743-7075-7-67

J. Chromatogr. B. Analyt. Technol. Biomed. Life Sci. 846, 32–41. doi: 10.1016/j.jchromb.2006.08.002


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zhao, Dai, Li, Li, Qin and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sequence-Specific Detection of Aristolochia DNA – A Simple Test for Contamination of Herbal Products

Tiziana Sgamma<sup>1</sup>\*, Eva Masiero<sup>1</sup>, Purvi Mali<sup>1</sup>, Maslinda Mahat<sup>1,2</sup> and Adrian Slater<sup>1</sup>

<sup>1</sup> Faculty of Health & Life Sciences, Biomolecular Technology Group, De Montfort University, Leicester, United Kingdom, <sup>2</sup> Natural Product Testing Section, Toxic Compound Detection Unit, National Pharmaceutical Control Bureau, Jalan University, Selangor, Malaysia

Herbal medicines are used globally for their health benefits as an alternative therapy to modern medicines. The market for herbal products has increased rapidly over the last few decades, but this has in turn increased the opportunities for malpractices such as contamination or substitution of products with alternative plant species. In the 1990s, a series of severe renal disease cases were reported in Belgium associated with weight loss treatment, in which the active species Stephania tetrandra was found to be substituted with Aristolochia fangchi. A. fangchi contains toxic aristolochic acids, which have been linked to kidney failure, as well as cancers of the urinary tract. Because of these known toxicities, herbal medicines containing these compounds, or potentially contaminated by these plants, have been restricted or banned in some countries, but they are still available via the internet and in alternate formulations. In this study, a DNA-based method using quantitative real-time PCR (qPCR) was tested to detect and distinguish Aristolochia subg. Siphisia (Duch.) O.C.Schmidt species from a range of medicinal plants that could potentially be contaminated with Aristolochia material. Specific primers were designed to confirm that Aristolochia subg. Siphisia can be detected, even in small amounts, if it is present in the products, fulfilling the aim of offering a simpler, cheaper and faster solution than the chemical methods. A synthetic gBlock template containing the primer sequences was used as a reference standard to calibrate the qPCR assay and to estimate the copy number of a target gene per sample. Generic primers covering the conserved 5.8S rRNA coding region were used as an internal control to verify DNA quality and also as a reference gene for relative quantitation. To cope with potentially degraded DNA, all qPCR primer sets were designed to generate PCR products of under 100 bp, allowing detection and quantification of A. fangchi gBlock even when mixed with S. tetrandra gBlock in different ratios. All proportions of Aristolochia, from 100 to 2%, were detected. Using standards that associate a copy number with each starting quantity, the detection limit was calculated and set at about 50 copies.

Keywords: Aristolochia, Stephania tetrandra, DNA barcoding, herbal medicines, contamination, gBlock, quantitative real-time PCR

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Natalia Ivanova, University of Guelph, Canada Xiasheng Zheng, Guangzhou University of Chinese Medicine, China

> \*Correspondence: Tiziana Sgamma tiziana.sgamma@dmu.ac.uk

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 29 August 2018 Accepted: 26 November 2018 Published: 11 December 2018

#### Citation:

Sgamma T, Masiero E, Mali P, Mahat M and Slater A (2018) Sequence-Specific Detection of Aristolochia DNA – A Simple Test for Contamination of Herbal Products. Front. Plant Sci. 9:1828. doi: 10.3389/fpls.2018.01828

## INTRODUCTION


Herbal medicines are often perceived as "good" and "safe" because they are "natural," in contrast to "chemical" drugs. People tend to be more relaxed about using them and ask fewer questions of producers or practitioners. Unfortunately, many plants are in fact toxic and dangerous (Efferth and Kaina, 2011). In some cases, deliberate or accidental botanical misidentification of plants can also play a role in toxic reactions to herbal drugs.

In the early 1990s, Han Fang Ji (Stephania tetrandra) was incorrectly substituted with Guang Fang Ji (Aristolochia fangchi) in diet pills, probably because of their similar Chinese Pin Yin names (Vanherweghem et al., 1993; Nortier et al., 2000). Aristolochia manshuriensis (Guan Mu Tong) has also been reported to have been substituted for other Mu Tong herbal drugs which should have contained Akebia and Clematis (Zhu, 2002; Yang et al., 2007). More recently, the substitution of Solanum lyratum by Aristolochia mollissima in Baiying preparations has been detected by DNA barcoding (Li et al., 2012).

Although Aristolochia species are used in Traditional Chinese Medicine (TCM), they are also known to contain nephrotoxic and carcinogenic aristolochic acids (AA) (Nortier et al., 2000; International Agency for Research on Cancer [IARC], 2002). AA were classified as class I human carcinogens by the World Health Organization International Agency for Research on Cancer in 2002 (International Agency for Research on Cancer [IARC], 2002). Because of this, herbal mixtures containing Aristolochia, or plants that could be substituted with it because of similarities in their common names (i.e., Stephania, Akebia, Asarum, Cocculus, and Sinomenium), have been banned from the market (International Agency for Research on Cancer [IARC], 2002; Medsafe, 2003; Martena et al., 2007; Debelle et al., 2008; Abdullah et al., 2017). In spite of this, some of these species are still available in markets and via the internet, and the risk of contamination with Aristolochia plants is still high (Schaneberg and Khan, 2004; Abdullah et al., 2017). This is supported by the hundreds of cases of renal failure linked to potential contamination by Aristolochia species that have been reported over the past two decades (Nortier et al., 2000; Debelle et al., 2008; Michl et al., 2013; Jadot et al., 2017).

There is still a tangible need for the development of detection methods to avoid exposure to AA. A reasonable way to decrease this risk would be the systematic quality control of herbal preparations using reproducible and accurate analytical methods. In the case of Stephania pills, herbal drugs are consumed in the form of ground roots. Although there are morphological differences between the roots of the genera described as Fang Ji, they also share many similarities, which create the opportunity for misidentification and substitution, especially in powdered and macerated samples (Tankeu et al., 2016). For powdered samples, HPLC methods are used as they are considered to be more reliable (Joshi et al., 2008). Hyperspectral imaging studies that combine both chemical and physical properties have also been conducted on Fang Ji herbal medicines, but the accuracy has a 10% limit in terms of prediction of adulteration (Tankeu et al., 2016). These methods all have limitations, such as extensive sample preparation and sensitivity to physiological influences, intraspecific differences and storage conditions. Quality control techniques that provide a rapid, inexpensive and accurate discrimination between the Fang Ji herbal medicines are still needed.

The practicality of using DNA barcoding in industrial quality assurance procedures has recently been discussed (Sgamma et al., 2017; Raclariu et al., 2018). Despite controversy around using DNA barcoding for the authentication of herbal products, DNA-based methods such as quantitative real-time PCR (qPCR) are a valuable addition to the toolkit of industrial quality assurance, overcoming many of the limitations of standard DNA barcoding (Yang et al., 2018).

Different DNA-based methods for plant species identification and discrimination have attracted increased interest in recent years in many fields, such as commercially processed food ingredients, spices, honey and herbal medicines. A species-specific qPCR assay has been shown to discriminate Rhodiola rosea from non-rosea Rhodiola species (Sgamma et al., 2017). Species-specific qPCR assays with TaqMan probes have been successfully used to discriminate several plant species in Corsican honey, while DNA metabarcoding and high resolution melting (HRM) analysis have been used to characterize the floral composition of honey in order to investigate honey bee foraging (Laube et al., 2010; Hawkins et al., 2015; Soares et al., 2018). HRM has also been successfully used to differentiate seven selected Zingiberaceae plants (Osathanunkul et al., 2017). Duan et al. (2018) used barcoding coupled with HRM (Bar-HRM) to test the authenticity of Rhizoma species used in TCM as compared to their adulterants.

Focusing on the detection of Aristolochia species, a number of DNA-based methods, mostly targeting the matK and ITS2 regions, have proved promising in aiding species discrimination. Traditional DNA barcoding targeting the chloroplast DNA loci matK, rbcL and trnH-psbA showed different levels of polymorphism between the loci, with matK containing the most variation and being able to discriminate genuine herbal medicines from their Aristolochia adulterants (Li et al., 2014). Yang et al. (2014) validated the ITS2 region as another DNA barcode region to discriminate Aristolochia mollissima from other plants used as herbal medicines, including Menispermum dauricum, Sophora tonkinensis, Stephania tetrandra, and Cocculus orbiculatus. qPCR using TaqMan probes targeting the ITS2 region was also used to authenticate plant species from the Aristolochiaceae family and non-Aristolochiaceous substitutes and to divide them into groups, but without quantifying the contamination (Wu et al., 2015). Loop-mediated isothermal amplification (LAMP) targeting the ITS2 region has also proved effective in discriminating between Mu-tong (Akebia caulis) and its adulterant Guan-mu-tong (Aristolochia manshuriensis) within 60 min in pure and mixed samples (Wu et al., 2016). More recently, Dechbumroong et al. (2018) developed a low-cost and fast species-specific multiplex PCR assay to differentiate three Aristolochia species belonging to the subgenus Aristolochia (Aristolochia pierrei, Aristolochia tagala, and Aristolochia pothieri) present in Thailand and known as "Krai-Krue."

Here, DNA-based technology is proposed as a complementary approach to identify and quantify adulterant Aristolochia subg. Siphisia material in herbal formulations providing a reliable quality control for contamination of the plant material.

## MATERIALS AND METHODS

#### Plant Material and Total DNA Extraction

Fresh leaves or dry wood were provided by Dr Ben Gronier (De Montfort University, United Kingdom) and Prof Michael Heinrich (University College London, United Kingdom), respectively (**Table 1**). DNA was extracted from 100 mg of frozen material, previously ground to a fine powder in liquid nitrogen with a mortar and pestle, using the DNeasy Plant Mini Kit (Qiagen Inc., Germantown, MD, United States) following the manufacturer's guidelines.

#### DNA Samples

All genomic DNA (gDNA) samples were supplied pre-extracted from the Royal Botanic Gardens, Kew DNA Bank<sup>1</sup> (**Table 1**).

### gBlock Fragments

Four double-stranded, sequence-verified gene fragments, or gBlocks (**Table 1**), were ordered from Integrated DNA Technologies, BVBA (Leuven, Belgium). The gBlocks were designed to cover the 5.8S-ITS2 region within the nuclear ribosomal Internal Transcribed Spacer (nrITS) of the respective species (**Figure 1**). The GenBank accession numbers of the reference sequences are listed in **Table 1**. The gBlocks were resuspended in water at a concentration of 10 ng/µl. The copy number/µl of each gBlock was calculated by converting the concentration from ng/µl to copy number/µl using the formula provided in the IDT guidelines<sup>2</sup> (**Table 2**). After optimisation, the S<sup>−5</sup> dilution was used as working material.

<sup>1</sup>https://dnabank.science.kew.org/homepage.html
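The conversion from ng/µl to copy number/µl described above can be sketched as follows. The IDT formula itself is not reproduced in the text, so this example uses the widely used approximation of 650 g/mol per base pair for double-stranded DNA; that constant, the 300 bp fragment length, and the reading of S<sup>−n</sup> as a 10<sup>−n</sup> dilution of the stock are all assumptions made for illustration.

```python
AVOGADRO = 6.022e23          # molecules per mole
MASS_PER_BP = 650.0          # g/mol per base pair (common dsDNA approximation)


def copies_per_ul(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration (ng/µl) into copies per µl."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    moles_per_ul = grams_per_ul / (length_bp * MASS_PER_BP)
    return moles_per_ul * AVOGADRO


def dilute(copies, exponent):
    """Copies per µl after a 10**(-exponent) dilution (assumed meaning of S^-n)."""
    return copies * 10.0 ** (-exponent)


# 10 ng/µl stock as in the text; the 300 bp length is hypothetical.
stock = copies_per_ul(10.0, 300)
print(f"Stock: {stock:.3e} copies/µl")
print(f"Working dilution (assumed 10^-5): {dilute(stock, 5):.3e} copies/µl")
```

Because the result scales inversely with fragment length, the actual gBlock length from **Table 1** must be used for real calculations.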

#### Phylogenetic Analyses

Phylogenetic analyses were conducted using the MEGA6.06 software package. The evolutionary history was inferred with the Maximum Likelihood method based on the Tamura 3-parameter model (Tamura, 1992).

#### Primer Design

The NCBI database<sup>3</sup> was accessed to obtain the nrITS sequences of Aristolochia, Stephania, Cocculus, and Sinomenium.

Based on all the nrITS sequences obtained, generic and Aristolochia-specific primers were designed (**Table 3**). The generic primers were designed to target the conserved 5.8S region, while the specific primers were designed against the ITS2 region of selected problematic Aristolochia species (**Figure 1**). Primer specificity was determined using the Basic Local Alignment Search Tool (BLAST) software<sup>4</sup> and the NCBI database (**Supplementary Data Sheet S1**).

## Standard PCR and Sequencing (nrITS)

PCR was performed using 1 × MyTaq Red Mix (Bioline), 0.2 µM each of the forward (ITS1 TCCGTAGGTGAACCTGCGG) and reverse (ITS4 TCCTCCGCTTATTGATATGC) primers, and 1 µL of gDNA as template. Thermocycling conditions were optimized at 94 °C for 2 min, followed by 40 cycles of 94 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s, with a final extension step of 72 °C for 2 min. PCR products were run on 2% (w/v) agarose, 1 × TBE gels with 1 µL SYBR® Safe DNA Gel Stain (Invitrogen, Paisley, United Kingdom) at 100 V for 30 min and analyzed in a Gel Doc™ EZ Gel Documentation System (Bio-Rad, Oxford, United Kingdom). Products were submitted for sequence analysis to Macrogen<sup>5</sup> to verify the authenticity of the starting material.

## Quantitative Real-Time PCR (qPCR)

Each qPCR reaction contained 1 × Sensifast SYBR green Hi-Rox mix (Bioline), 0.5 µl of gDNA or gBlock, and 0.1 µM of each forward and reverse primer (**Table 3**), in a total volume of 10 µl made up with sterilized distilled water (SDW). qPCR was performed using three biological replicates with three technical replicates for each sample. After PCR amplification, all products were sequenced to confirm their identity. Aristolochia gBlock serial dilutions (from S<sup>−3</sup> to S<sup>−7</sup>) were run to generate the standard curve (**Supplementary Data Sheet S2**). Working dilution gBlocks S<sup>−5</sup>, gDNAs and mixes of Aristolochia and Stephania gBlocks S<sup>−5</sup> at different percentages and concentrations (**Table 4**) were used as templates. Water was run as a negative control for each test. A StepOnePlus™ Real-Time PCR thermocycler (Applied Biosystems) was used. Thermocycling conditions were optimized at 95 °C for 2 min, followed by 40 cycles of 95 °C for 5 s and 30 s at the primer-specific Ta (**Table 3**). The melting curve was obtained by melting the amplified template from 65 to 95 °C, increasing the temperature by 0.5 °C per cycle. Analyses were conducted according to MIQE guidelines (Bustin et al., 2009). DNA levels were expressed as a relative proportion of the total DNA by using the generic primers as the "reference gene," and compared to the control sample (Aristolochia working dilution gBlocks S<sup>−5</sup>) using the comparative (2<sup>−ΔΔCt</sup>) method (Livak and Schmittgen, 2001).

<sup>2</sup>https://eu.idtdna.com/pages/education/decoded/article/tips-for-working-withgblocks-gene-fragments

<sup>3</sup>http://www.ncbi.nlm.nih.gov

<sup>4</sup>http://blast.ncbi.nlm.nih.gov/Blast.cgi

<sup>5</sup>http://www.macrogen.com

on the Aristolochia fangchi ITS gBlock sequence.

TABLE 2 | Quantification gBlock Fragments and DNA copy number.

TABLE 3 | Aristolochia-specific and generic primers and annealing temperature (Ta) used in quantitative real-time PCR.
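The comparative Ct calculation cited above (Livak and Schmittgen, 2001) can be sketched in a few lines. Here the Siphisia-specific primers play the role of the target and the generic 5.8S primers the role of the reference; the Ct values in the example are hypothetical, and the method assumes that both primer pairs amplify with similar efficiency close to 100%.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_control, ct_ref_control):
    """Relative quantity by the comparative Ct (2^-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed for sample and control
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only.
rq = relative_quantity(
    ct_target_sample=20.0,   # specific primers, test mixture
    ct_ref_sample=18.0,      # generic primers, test mixture
    ct_target_control=18.0,  # specific primers, control gBlock
    ct_ref_control=18.0,     # generic primers, control gBlock
)
print(f"Relative quantity vs control: {rq:.2f}")
```

A value of 1 means the target makes up the same proportion of total DNA as in the control; values below 1 indicate a smaller proportion.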

#### Contamination Testing Using qPCR

A contamination test was performed in which the Aristolochia gBlock S<sup>−5</sup> working sample was mixed with the Stephania gBlock S<sup>−5</sup> working sample in different proportions (**Table 4**). Each mix was also diluted 1:10, 1:100, and 1:1000, and the DNA copy numbers were calculated (**Table 4**).
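The arithmetic behind such a contamination series can be sketched as follows. The total copy number used here is a hypothetical placeholder (the actual values are those listed in **Table 4**), and the threshold of about 50 copies is the detection limit quoted in the abstract of this article.

```python
DETECTION_LIMIT_COPIES = 50  # approximate limit stated in the abstract


def aristolochia_copies(total_copies, percent_aristolochia, dilution_factor=1):
    """Aristolochia template copies in a mix after an optional 1:N dilution."""
    return total_copies * percent_aristolochia / 100.0 / dilution_factor


def is_above_detection_limit(copies):
    """Compare a copy number with the approximate detection limit."""
    return copies >= DETECTION_LIMIT_COPIES


# Hypothetical total of 1,000,000 copies per reaction before dilution.
total = 1_000_000
for percent in (100, 50, 10, 2):
    for dilution in (1, 10, 100, 1000):
        copies = aristolochia_copies(total, percent, dilution)
        flag = "detectable" if is_above_detection_limit(copies) else "below limit"
        print(f"{percent:>3}% at 1:{dilution:<4} -> {copies:>9.0f} copies ({flag})")
```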

#### RESULTS

#### Amplification of the gDNA Templates With ITS Generic Primers

To test the quality of the gDNA samples, a standard PCR using ITS1 and ITS4 primers was performed. The expected ITS fragment was detected in most of the samples (**Figure 2**). A very faint band was detected in Aristolochia californica (**Figure 2**, lane 4) and no band was detected in the Aristolochia clematitis sample (**Figure 2**, lane 7). Identification of samples was confirmed by sequencing of the full ITS fragment.

#### Phylogenetic Analysis

Before designing Aristolochia primers the ITS2 regions of Aristolochia sequences present on NCBI GenBank database were aligned using the Clustal W MegAlign package of DNAStar (DNAStar Inc.). Evolutionary relationships of the genus members were inferred with the Maximum Likelihood method based on the Tamura 3-parameter model using the MEGA6.06 software package (**Figure 3**). The phylogenetic analysis showed two main clades. Species which have proved particularly problematic with regard to substitution and contamination, including Aristolochia fangchi, A. manshuriensis and A. mollissima were found to belong to Clade B. These two clades align with the two main Aristolochia subgenera (Aristolochia and Siphisia) supported by morphological and molecular studies (Ohi-Toma et al., 2006; Do et al., 2015; Wu et al., 2015), with the subgenus corresponding to Clade B correctly named as Aristolochia subg. Siphisia (Duch.) O.C.Schmidt (Ohi-Toma and Murata, 2016).

The clear separation between the two subgenera was apparent from examination of the multiple alignment of Aristolochia ITS2 sequences. The divergence between the sequences of the two subgenera was such that it proved difficult to design genus-specific primers that would amplify all members of the genus. This investigation therefore focused on the design of primers to detect the ITS2 sequences of some of the most problematic species, which belong to the subgenus Siphisia. These primers were designed to target regions in the ITS2 sequence that are very similar between all members of this subgenus, but differ from members of the subgenus Aristolochia. They can therefore be described as "Aristolochia subgenus Siphisia-specific" primers.

#### Primers Specificity Testing Using Quantitative Real-Time PCR (qPCR)

Generic and Aristolochia subgenus Siphisia-specific qPCR primers were designed on the 5.8S and ITS2 regions, respectively, within the nrITS sequence (**Figure 1**). Specificity of the developed qPCR reactions was evaluated in triplicate for each sample, including gDNA from various Aristolochia species and non-Aristolochiaceous genera including Stephania, Cocculus, Sinomenium, Asarum, Saussurea, Diploclisia, and Menispermum. Synthetic gBlocks designed to match the 5.8S and ITS2 regions of Aristolochia fangchi, Stephania tetrandra, Cocculus orbiculatus, and Sinomenium acutum were used as reference standards (**Figure 1**). The sensitivity of the triplex assay was determined using a serial dilution of Aristolochia gBlock DNA fragments representing the synthetic versions of the target genes at concentrations ranging from 0.1 to 1.00E-06 ng/µl per reaction. Linear regressions showed linear relationships (r<sup>2</sup> = 0.999 for all runs) between the quantities of gBlock templates and the cycle threshold (Ct) values across the tested concentration range. The real-time PCR efficiency was 90.1% and 91.8% for the generic internal control 5.8S and Aristolochia subgenus Siphisia-specific primers, respectively.
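The relationship between the standard curve and the reported efficiency can be illustrated with a short sketch. It fits Ct against log10 of the template quantity by ordinary least squares and then applies the standard relation E = 10<sup>−1/slope</sup> − 1. The text does not state how the efficiency was computed, so this relation is an assumption, and the dilution series and Ct values below are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def efficiency_percent(slope):
    """Amplification efficiency (%) from the slope of Ct vs log10(quantity)."""
    return (10.0 ** (-1.0 / slope) - 1.0) * 100.0


# Synthetic ten-fold dilution series (ng/µl) and Ct values, illustration only.
quantities = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
ct_values = [10.1, 13.6, 17.2, 20.7, 24.3, 27.8]

log_q = [math.log10(q) for q in quantities]
slope, intercept, r2 = fit_line(log_q, ct_values)
print(f"slope = {slope:.3f}, r^2 = {r2:.4f}, "
      f"efficiency = {efficiency_percent(slope):.1f}%")
```

A slope near −3.32 corresponds to 100% efficiency (a doubling per cycle); steeper slopes correspond to lower efficiencies.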



As shown in **Figure 4A**, a significant amplification signal was obtained in all samples when the generic primers were used. The Ct values in all gBlocks S−<sup>5</sup> samples and the neat mixtures (Aristolochia plus Stephania) were between 17.2 and 18.4; the Ct values of the mixture dilutions were on average 20.8, 24.2, and 27.5 for the 1:10, 1:100, and 1:1000 dilutions, respectively, showing an equivalent pattern when comparing DNA copy numbers. The Ct values of the genomic DNA samples were between 10.8 and 15.8. Primer specificity was assessed by melt curve analysis, with the results showing that just one peak was generated for all samples (**Figure 4B**). The size and uniformity of the product were confirmed visually by running the samples on agarose gel electrophoresis (**Figure 4C**). Interestingly, a product was visible in both the Aristolochia californica and Aristolochia clematitis samples, with Ct values of 14.7 and 14.2, respectively. These two samples showed either a very faint or no band, respectively, when PCR was performed to amplify the full-length nrITS fragment (**Figure 2**).

DNA copy numbers and Ct values obtained from qPCR using Aristolochia subgenus Siphisia-specific primers were used to compare specificity between target and non-target samples. The amplification plots for the Siphisia-specific primers showed a clear difference in Ct value (around 15 cycles) between the Aristolochia S−<sup>5</sup> gBlock dilution and the S−<sup>5</sup> dilution of non-target gBlocks (**Figure 5A**). The melting curve showed the presence of primer dimers and non-specific products, with Ct values of around 30 or greater, in gDNA non-target samples (**Figure 5B**). The presence of non-specific products in gDNA non-target samples was also confirmed visually by running the samples on agarose gel electrophoresis (**Figure 5C**). The expected product (88 bp) was visible in Aristolochia samples belonging to the subgenus Siphisia (Aristolochia californica and Aristolochia kaempferi), whereas it was not visible in Aristolochia samples belonging to the subgenus Aristolochia (Aristolochia baetica and Aristolochia clematitis) or in the non-Aristolochiaceous samples.
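To give a sense of scale for a Ct gap of about 15 cycles, the sketch below converts a Ct difference into an approximate fold difference in starting template, and conversely estimates the Ct shift expected for a given dilution. It assumes simple exponential amplification with a constant per-cycle efficiency; the function names are illustrative and the calculation is not taken from the study itself.

```python
import math


def fold_difference(delta_ct, efficiency=1.0):
    """Approximate fold difference in template implied by a Ct difference.

    efficiency is a fraction: 1.0 means the product doubles every cycle.
    """
    return (1.0 + efficiency) ** delta_ct


def ct_shift_for_dilution(dilution_factor, efficiency=1.0):
    """Number of extra cycles expected after diluting the template."""
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


# A gap of 15 cycles at perfect efficiency and at the 91.8% reported
# for the Siphisia-specific primers.
print(f"{fold_difference(15, 1.0):.0f}-fold at 100% efficiency")
print(f"{fold_difference(15, 0.918):.0f}-fold at 91.8% efficiency")

# Expected Ct shift for each ten-fold dilution step.
print(f"{ct_shift_for_dilution(10, 1.0):.2f} cycles per ten-fold dilution")
```

Under these assumptions a 15-cycle gap corresponds to a difference of four orders of magnitude or more in effective template, which is why the non-target signal can be treated as background.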

## Relative Quantitative Analysis of Aristolochia in Mixed Samples

To verify the feasibility of our method in the detection and quantitation of possible Aristolochia contamination in mixed samples, a series of gBlock admixtures containing different amounts of Stephania tetrandra (10, 50, 90, and 98%, respectively) and the model adulterant Aristolochia fangchi DNA were prepared (**Table 4**), starting from the gBlocks S−<sup>5</sup> dilution. Each of these mixtures was then diluted 1:10, 1:100, and 1:1000 to check detection limits (**Table 4**).

When comparing the relative proportions of Aristolochia DNA between the Aristolochia gBlock, the other species gBlocks and the mixtures representing the different contamination rates, consistent results were observed for all samples (**Figure 6**). The Aristolochia subgenus Siphisia-specific primers were able to detect Aristolochia DNA down to a 2% contamination level, while no amplification was detected in non-target gBlock samples. DNA copy numbers from the "100% contamination" sample (Aristolochia gBlock S−<sup>5</sup>) were used as the reference for relative DNA copy number calculation. **Figure 7** shows that the serial dilutions followed the same pattern. Combining these data with the melting curve results (**Figure 5B**), it is possible to set a safe detection limit of 2% for the 1:100 dilution, which corresponds to about 50 copies of Aristolochia fangchi nrITS DNA. In contrast, the 2% mixture in the 1:1000 dilution set gave a melting curve profile that indicated possible primer dimer formation. The detection limit is further supported by the detection of the correct amplicon in the 10% contamination mixture of the 1:1000 dilution set, which would correspond to about 25 copies of Aristolochia nrITS DNA. Although the most dilute set (1:1000) is at the limit of quantitative detection, it remains a useful qualitative indicator of the presence of Aristolochia at the lowest percentage mixtures, but it is not accurate enough to reliably quantify the amount of contamination.
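The two copy-number figures quoted above are mutually consistent, which can be checked with simple arithmetic. The sketch below assumes that target copies scale linearly with the Aristolochia fraction and inversely with the dilution factor. The value of 250,000 copies for the undiluted 100% sample is not stated in the text; it is back-calculated here from the "about 50 copies" figure and should be read as an inference, not a reported measurement.

```python
def target_copies(neat_copies, target_fraction, dilution_factor):
    """Expected target copies in an admixture after dilution."""
    return neat_copies * target_fraction / dilution_factor


# Assumption: back-calculated from "2% at 1:100 is about 50 copies",
# i.e. 50 / (0.02 / 100). Not a value reported in the study.
NEAT_COPIES = 250_000

for fraction, dilution in [(0.02, 100), (0.10, 1000), (0.02, 1000)]:
    copies = target_copies(NEAT_COPIES, fraction, dilution)
    print(f"{fraction:.0%} Aristolochia at 1:{dilution}: about {copies:.0f} copies")
```

On this reading, the 2% mixture in the 1:1000 set would contain only a handful of target copies, which fits the observation that its melting curve was dominated by possible primer dimers.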

#### Testing gDNA

To prove that the detection and quantification test is valid with gDNA, a set of Aristolochiaceous and non-Aristolochiaceous genomic DNA samples were tested. All samples were amplified by the generic primers (**Figure 4**), indicating the absence of plant secondary product PCR inhibitors in the samples. Most of the samples also showed the presence of the full-length nrITS fragment (**Figure 2**), which was then sequenced to confirm the species. Although Aristolochia californica and Aristolochia clematitis did not show a clear band for the full-length nrITS fragment (**Figure 2**), they both acted as templates for the generic primers (**Figure 4**). None of the non-target species gDNA appeared to be amplified by the Aristolochia Siphisia-specific primers (**Figure 5**), while the expected product was amplified in a range of target species in the Aristolochia subgenus Siphisia (**Figure 5**). The results were analyzed using the relative amplification method to determine the relative quantities of target species DNA compared to the amount of templates for the generic primers (**Figure 8**).
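The relative quantitation described here can be illustrated with a short calculation. This sketch assumes the widely used 2^(-ΔΔCt) formulation, with the 5.8S generic amplification as the reference and an Aristolochia gBlock as the calibrator; the text does not name the exact formula, and the Ct values below are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference (internal control) Ct."""
    return mean(target_cts) - mean(reference_cts)


def relative_quantity(sample_dct, calibrator_dct):
    """Relative amount of target versus the calibrator (2^-ddCt).

    Assumes both primer pairs amplify with close to 100% efficiency.
    """
    return 2.0 ** -(sample_dct - calibrator_dct)


# Hypothetical technical triplicates, for illustration only.
calibrator = delta_ct([18.1, 18.0, 18.2], [17.9, 18.0, 18.1])
mixture = delta_ct([23.6, 23.8, 23.7], [18.0, 18.1, 17.9])

print(f"relative quantity: {relative_quantity(mixture, calibrator):.3f}")
```

Because the reported efficiencies were close to, but not exactly, 100%, an efficiency-corrected model may be preferable where precise ratios matter.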

#### DISCUSSION

Aristolochic acid I and Aristolochic acid II have been identified as potent carcinogens and renal toxins (Arlt et al., 2002). All herbal formulations that contain any Aristolochia species have been classified as Group 1 carcinogens by the International Agency for Research on Cancer (IARC) (International Agency for Research on Cancer [IARC], 2002; Grollman et al., 2007). Despite this classification, it has been reported that products containing AA or suspected to contain AA are still in use and available on web sites (Gold and Slone, 2003; Nortier and Vanherweghem, 2007).

A reasonable way to detect the presence of Aristolochia contamination and decrease the risk associated with it would be the systematic quality control of herbal preparations using reproducible and accurate analytical methods. Currently, for industrial quality control, chemical and macro-morphology based analyses are conducted to identify the presence of Aristolochia species in herbal medicines (Kite et al., 2002; Lee et al., 2003; Sorenson and Sullivan, 2007; Joshi et al., 2008). These methods have limitations, as they may be affected by many factors including growth conditions, environmental factors and post-harvesting procedures (Zhang et al., 2012). DNA-based tests have emerged as a powerful, rapid, reliable, robust, and affordable identification system for authentication of medicinal plants and commercial herbal products that could be incorporated into industrial quality control processes (Sgamma et al., 2017).

Previous work has identified ITS2 as a suitable target to discriminate Aristolochia species and used 11 primer/probe combinations in a TaqMan qPCR assay to identify herbal material from the Aristolochiaceae family and divide it into groups, but without quantifying the contaminant (Wu et al., 2015). Each combination of primers and probes detected a different group containing some species from the Aristolochiaceae family; for instance, group A identified a large number of Aristolochia species but also many Asarum species because they shared sequence similarity (Wu et al., 2015).

In this study, a simpler method was developed for the identification and quantification of the Aristolochia subgenus Siphisia in pure or mixed samples using DNA-based techniques designed to overcome the limitations of other identification methods. This was achieved by designing a reliable qPCR test to detect and quantify the presence of very small amounts of Aristolochia DNA using an internal control and an Aristolochia subgenus Siphisia-specific set of primers. qPCR is a simple, fast and sensitive test that could be suited to industrial quality control testing (Sgamma et al., 2017).

One of the limitations of working with banned herbal products is sourcing the samples, and this study is no exception. Aristolochia fangchi plant material or gDNA was unavailable. Therefore, to overcome this issue, synthetic DNA, a gBlock, was designed based on the reference barcoding regions available in GenBank. The quality and quantity of many samples sourced through DNA banks is similarly a limitation. The amount of gDNA sample provided is usually on the order of a few µl. The DNA concentration is also never very high, possibly due to the poor quality of the original plant material. Therefore, having enough material for optimizations and replicates is often an issue. The gDNA samples used in this study were therefore checked by barcoding the nrITS region. The sequence results gave an indication of DNA quality and also proof of the authenticity of the samples. gBlocks were also sourced for the other plant species used in this study to overcome the problem related to the amount of DNA provided. Another reason for using gBlocks was to develop a quantification assay. Following the MIQE guidelines, when using qPCR for quantification rather than identification, it is necessary to generate a standard curve from known quantities of a target (Bustin et al., 2009).

Although quantification kits to be used as validated standards are commercially available, they become less reliable when working with non-human samples (Nielsen et al., 2006; Conte et al., 2018). Standard templates have been used from a range of sources, including cloned target sequences and PCR products, which require many steps that could potentially contaminate the laboratory and the standard itself. More recently, the use of synthetic gene fragments, such as gBlocks, as a standard has become an affordable, fast and reliable quantification strategy (Dhanasekaran et al., 2010; Conte et al., 2018).

FIGURE 7 | Quantitation of Aristolochia subgenus Siphisia DNA in admixtures. ΔCt values were calculated as the difference between the mean Ct value of the target Siphisia-specific amplification and the mean Ct value of the internal control 5.8S amplification. Aristolochia S−<sup>5</sup> gBlock neat and dilutions 1:10, 1:100, and 1:1000 were used as calibrator samples for the corresponding dilution mix. qPCR was performed using three biological replicates with three technical replicates for each sample. Error bars represent standard deviation.

In this study a gBlock has been used to create a standard curve to overcome the lack of available and reliable material, but also to prove that it is possible to create a sensitive and reliable assay which can estimate the copy number of a target gene per sample and could potentially be used in an industrial setting. Reconstitution of the lyophilized gBlock fragment provided over 2.14E+10 copies of the target. Dilution of the stock standard was done to create a sub-stock that was used to prepare the standard curve for the qPCR assay.
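The conversion from a mass of synthetic double-stranded DNA to a copy number follows from Avogadro's number and the fragment length. The sketch below uses the common approximation of 650 g/mol per base pair; the fragment length and mass are hypothetical placeholders, since the text does not state the values used to arrive at its copy-number figure.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
AVG_BP_MASS = 650.0           # g/mol per base pair of dsDNA (approximation)


def dsdna_copies(mass_ng, length_bp):
    """Approximate number of dsDNA molecules in a given mass."""
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (length_bp * AVG_BP_MASS)


def serial_dilution(stock_copies, fold=10, steps=6):
    """Copy numbers along a serial dilution of the stock."""
    return [stock_copies / fold ** i for i in range(1, steps + 1)]


# Hypothetical fragment: 10 ng of a 500 bp gBlock (placeholder values).
stock = dsdna_copies(mass_ng=10, length_bp=500)
print(f"stock: {stock:.2e} copies")
for step, copies in enumerate(serial_dilution(stock), start=1):
    print(f"dilution 10^-{step}: {copies:.2e} copies")
```

Working in copies rather than ng/µl makes the standard curve directly comparable between fragments of different lengths.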

Generic primers were designed to target the conserved 5.8S rRNA coding region to amplify any template DNA. These can be used as an internal control to verify DNA quality and also as a reference gene for relative quantitation of the specific target DNA region. This primer pair was designed to generate a PCR product of under 100 bp, which makes it suitable for qPCR and ideal when working with potentially degraded DNA (Sgamma et al., 2017). This "mini-barcode" region proved to be useful for two of our samples: the Aristolochia californica and Aristolochia clematitis gDNA samples did not give a clear amplicon for the nrITS fragment, but both served as templates for the generic 5.8S primers (**Figure 4**), indicating the presence of possibly degraded, but still detectable, DNA.

The ITS2 sequences for Aristolochia species available in GenBank demonstrated that the ITS2 region can be used to distinguish Aristolochiaceous species from their putative substitutes (non-Aristolochiaceae family) (Wu et al., 2015). In this study the Aristolochia species were separated into two clades using the ITS2 region. These two clades were recognized as corresponding to two subgenera previously reported, with Clade A corresponding to Aristolochia subgenus Aristolochia and Clade B to Aristolochia subgenus Siphisia (Ohi-Toma et al., 2006; Do et al., 2015; Ohi-Toma and Murata, 2016). Short "mini-barcode" regions within the ITS2 sequence were targeted for the design of Siphisia-specific primers because of the many reports of substitution of non-toxic plants with plants belonging to this subgenus, including Aristolochia fangchi, A. manshuriensis, A. kaempferi, A. mollissima, and A. versicolor (Debelle et al., 2008). Furthermore, it proved difficult to design Aristolochia subgenus Aristolochia-specific primers because the ITS2 sequences within this group are more diverse than those in the Siphisia subgenus. Therefore, in this study we chose to work on the subgenus that includes Fang Ji and Mu Tong, to target the worst known cases of contamination.

Although significantly high DNA copy numbers were present in all of the gBlock and genomic DNA samples, only the target species showed the presence of the Siphisia-specific "mini-barcode" regions.

Optimization of qPCR with Aristolochia Siphisia-specific primers allowed detection and quantification of this subgenus in mixed samples also containing Stephania tetrandra in different ratios. When Aristolochia DNA was mixed with Stephania at different rates, it was possible to detect it at a ratio of 2% Aristolochia to 98% Stephania. Using standards associating the copy number to each start quantity, this corresponded to about 50 copies. All proportions of Aristolochia, from 100 to 2%, were detected. The melting curve data provided confirmation that there was only one amplification product. Stephania, Sinomenium and Cocculus gBlocks or gDNA samples were not amplified by qPCR when using the Aristolochia subgenus Siphisia-specific primers. Although the amplification curves indicated a small amount of apparent amplification of gDNA samples, it was considered to be negligible because the Ct values were higher than the blank and the melting curves confirmed non-specific product or primer dimer formation. Therefore, it was proved that it is possible to differentiate Aristolochia subg. Siphisia from the other genera using a DNA-based strategy in pure or mixed samples. The assay developed in this study could be used by the manufacturers, importers and retailers of herbal products to conduct a preliminary safety test for all of their raw materials. After that stage, only samples that were positively identified as containing Aristolochia subg. Siphisia species would need to be further confirmed by chemical analysis. The health benefits of Cocculus orbiculatus, Sinomenium acutum, and Stephania tetrandra have been demonstrated scientifically (Zhao et al., 2012; Bhagya and Chandrashekar, 2018). This study describes a rapid, sensitive qPCR test for the detection of Aristolochia species in the subgenus Siphisia. The assay is designed for use by industrial and regulatory quality control laboratories for screening of herbal drugs for contamination by those Aristolochia plants that have most frequently been implicated in the toxicity of adulterated medicines. This study represents the first phase of assay development, in which the parameters have been optimized using pure components and gBlocks. The next phase will be to trial the assay using DNA extracted from herbal medicines to determine how robust the method is in the presence of PCR inhibitors and low quantities of poor-quality fragmented DNA. The qPCR primer sets were in fact designed to generate PCR products of under 100 bp to cope with potentially degraded DNA. The introduction of reliable contamination tests into the supply chains of medicinal plants that are currently banned because of the risk of Aristolochia contamination will enhance the quality assurance of the safety of these herbs for consumption. This should help to restore consumer confidence and could eventually lead to the previous bans imposed on these harmless plant species being revoked by the regulatory authorities.

### AUTHOR CONTRIBUTIONS

TS and AS conceived the project, designed and supervised the experimental strategy. TS, EM, and AS edited the paper. TS, PM, and MM performed the experiments. TS analyzed the data, and wrote most of the paper.

## ACKNOWLEDGMENTS

We greatly appreciate the contribution of Prof Michael Heinrich (University College London, United Kingdom) and Dr. Ben Gronier (De Montfort University, United Kingdom), for the provision of plant material. This project was supported by De Montfort University HEIF funds.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01828/full#supplementary-material

#### REFERENCES

quantitative real-time pcr experiments. Clin. Chem. 55, 611–622. doi: 10.1373/clinchem.2008.112797




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Sgamma, Masiero, Mali, Mahat and Slater. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Goji Who? Morphological and DNA Based Authentication of a "Superfood"

Sascha Wetters\*, Thomas Horn and Peter Nick

Molecular Cell Biology, Karlsruhe Institute of Technology, Karlsruhe, Germany

"Goji" (Lycium barbarum and Lycium chinense) is a generic name for medicinal plants with a long historical background in traditional Chinese medicine. With the emerging trend of "Superfoods" several years ago, Goji berries soon became an established product in European countries; they are not only the most popular product of traditional Chinese medicine outside of China but to this day one of the symbols of the entire "Superfood" trend. However, since Goji is an umbrella term for different plant species that are closely related, mislabeling and adulteration (unintentional or deliberate) are possible. We carefully verified the identity of Goji reference plant material based on morphological traits, mainly floral structures of several inflorescences of each individual, in order to create a robust background for the downstream applications that were used on those reference plants and additionally on commercial Goji products. We report morphological and molecular strategies for the differentiation of Lycium barbarum and Lycium chinense. The two Goji species vary significantly in seed size, with an almost double average seed area in Lycium chinense compared to Lycium barbarum. Differences could be traced on the molecular level as well; using the psbA-trnH barcoding marker, we detected a single nucleotide substitution that was used to develop an easy one-step differentiation tool based on ARMS (amplification refractory mutation system). Two diagnostic primers used in distinct multiplex PCRs yield a second diagnostic band in a subsequent gel electrophoresis for Lycium barbarum or Lycium chinense, respectively. Our ARMS approach is a strong but simple tool to trace either of the two Goji species. Both the morphological and the molecular analysis showed that all of the tested commercial Goji products contained fruits of the species Lycium barbarum var. barbarum, leading to the assumption that consumer protection is satisfactory.

Keywords: Goji, superfood, food diagnostics, Lycium barbarum, Lycium chinense, molecular authentication, ARMS

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Rosemary White, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

Hugo J. De Boer, University of Oslo, Norway

Daniël Duijsings, BaseClear B.V., Netherlands

> \*Correspondence: Sascha Wetters sascha.wetters@kit.edu

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 03 August 2018 Accepted: 30 November 2018 Published: 18 December 2018

#### Citation:

Wetters S, Horn T and Nick P (2018) Goji Who? Morphological and DNA Based Authentication of a "Superfood". Front. Plant Sci. 9:1859. doi: 10.3389/fpls.2018.01859

## INTRODUCTION

A rising concern for health and an aging society seem to be important driving forces for the boom of "Superfoods" in Europe, with a rapid sequence of new products entering a dynamic and still growing market. Germany ranks second (behind the United States) with respect to the import of products that are labeled as "Superfood" (Mintel, 2018). Every season new products are trending and advertised as "Superfoods." Current trends are, for instance, basil seeds, or even varieties of cabbage, Brassica oleracea L., that have been in use in Germany for more than 500 years (Baumann et al., 2001) but currently experience a revival as "Superfood," e.g., kale (Brassica oleracea var. acephala) (Šamec et al., 2018). Some of these "Superfoods," such as Chia seeds or Goji berries, have in the meantime turned into established products that can be obtained in almost every supermarket and even in discount stores. This hype of "Superfoods" is expected to continue, as can be seen in the sales figures, which for Chia doubled from 2015 to 2016 to a volume of around 23 million Euro in Germany alone (Statista, 2018a) and are still growing. Among the "Superfoods," Wolfberries, or "Goji," rank among the leading products. "Goji" is the generic name for different plant species from the genus Lycium, belonging to the Solanaceae family. "Goji" represents the most popular product of traditional Chinese medicine outside of China (Chinadaily, 2018). The Western vernacular name "Goji" derives from the Chinese term gou qi (Potterat, 2010). In traditional Chinese medicine different plants, mainly Lycium barbarum L. and Lycium chinense Mill., are described under the umbrella term gou qi and have both been used for more than 2,000 years (Burke et al., 2005). Though exotic, Goji berries were already used in Europe before the introduction of the Novel Food Regulation (1997) to such an extent that they did not fall under the Novel Food Legislation. However, this actually holds true only for the species Lycium barbarum. Several years ago, "Goji" berries were marketed as one of the first so-called "Superfoods" and praised in advertisements and newspapers for miraculous properties (Latimes, 2018).
Although it is some time since "Goji" berries became popular, they are still appreciated and bought for their perceived health effects, and continue to lead the charts for the most popular "Superfoods" (Statista, 2018b). In the news coverage of "Superfoods," too, "Goji" is inevitably used as an example and can almost be considered a symbol for the entire trend (SWR, 2018). While numerous reports refer to high levels of antioxidants, vitamins, minerals, or proteins that are claimed to account for the positive effects of such "Superfoods," it has to be kept in mind that numerous traditional food plants harbor comparable levels of these healthy ingredients, rendering the term "Superfood" rather ambiguous. The combination of media hype with vernacular nomenclature (that had been separated from its traditional context) conceals the fact that it is often unclear what plant is actually found in the package sold as "Goji." Therefore, it is important to focus on the actual plants hidden behind the term "Superfood Goji." The Flora of China distinguishes, among others, between Lycium barbarum (ning xia gou qi), Lycium chinense (gou qi), and Lycium ruthenicum Murray (hei guo gou qi, also called "Black Goji") (Efloras, 2018). These plants are thorny shrubs (common name boxthorn) with heights up to three meters, small inconspicuous purple flowers and orange to red fruits (black for Lycium ruthenicum) that reach up to two centimeters in length. Lycium barbarum and Lycium chinense in particular are very similar with respect to morphology and phylogenetic relationship. Moreover, they grow in the same habitat, are harvested at the same time (usually from August to October) and have both been used in traditional medicine all over East Asia for thousands of years (Potterat, 2010). Because of their important role in traditional Chinese medicine, the fruits of "Goji" have been intensively investigated with respect to their bioactive compounds (reviewed in Potterat, 2010; Amagase and Farnsworth, 2011; Masci et al., 2018), as well as for their anatomical and histochemical features (Konarska, 2018). While both species are marketed to a conspicuous extent, the Pharmacopoeia of the People's Republic of China lists only Lycium barbarum as the official drug (Zhonghua Renmin Gongheguo wei sheng bu yao dian wei yuan hui, 2000), and the Novel Food Catalogue of the EU likewise allows the use of only Lycium barbarum as food or food ingredient (Europa, 2018). Because of these legal constraints, providers of Goji berries claim that their commercial products offered outside of Asia exclusively contain fruits of Lycium barbarum (Potterat, 2010). However, not all commercial Goji products are properly labeled with the scientific name. Usually the generic name "Goji" is accompanied by additional Western vernacular names, such as Boxthorn, Wolfberry, Chinese Wolfberry, or Matrimony Vine, which does not really contribute to consumer safety (Blaschek et al., 1993). Considering how closely related these species are, this nomenclatural nonchalance, and in part ignorance, will easily lead to mislabeling. A simple and robust one-step differentiation method based on molecular markers derived from carefully determined reference plants is needed. Such a marker has to be robust enough to be applicable to processed samples, since "Goji" is often sold as powder, and even dry fruits are very hard to trace back to the originating plant(s).

DNA barcoding is an important identification tool that was first used for animal identification by amplification of the mitochondrial COI gene (Hebert et al., 2003). This approach has meanwhile been developed for plants as well. A number of different marker regions are commonly used, but there is still no universal region that fulfills all the desired criteria, namely: simple amplification of the respective region with one primer pair, an adequate size to perform bidirectional sequencing without additional primers, and maximal discrimination power between species (Hollingsworth et al., 2009). The plastid psbA-trnH spacer region, which is flanked by the highly conserved psbA and trnH genes, is one of the most frequently used DNA barcodes and is widely accepted as reliable because of its high variance, which can be used for discrimination down to the species level (Kress et al., 2005). The region has been especially valuable for phylogenetic and taxonomic studies of the genus Solanum from the Solanaceae family, where an estimated 96.5% (projected from a small sample range) of the species can be differentiated with this marker (Pang et al., 2012). While DNA barcoding requires sequencing, discrimination of two species with a known sequence can make use of sequence polymorphisms that can be translated into a specific pattern. Amplification of the rbcL barcode followed by restriction with appropriate enzymes has been used successfully to discriminate two Australian Myrtaceae that are commercialized under the same vernacular name "Lemon Myrtle" (Horn et al., 2012). Alternatively, such barcodes can be used to design a duplex PCR, where a diagnostic third primer located in the center of the amplicon will produce a diagnostic band in addition to the full-length barcode (Horn et al., 2014). To improve the discriminative power of this approach, the diagnostic primer is designed using the amplification refractory mutation system (ARMS) strategy, where even single base-pair substitutions can be detected by designing a 3'-end destabilized primer, which is added to the full-length primer pair. As a result, this duplex PCR yields a second, smaller amplicon that can be detected as a second band in the subsequent gel electrophoresis. Originally developed for the detection of mutations (Old, 1992), this method can be used to discriminate two closely related species using a single PCR, leading to a specific fingerprint that can be visualized by gel electrophoresis. In the past, this ARMS approach has been used successfully to discriminate closely related Lamiaceae species to safeguard against adulteration in commercial products (Horn et al., 2014), and to discover adulteration of Bamboo Tea by Chinese Carnations as a consequence of a false translation of a Chinese vernacular name into English (Horn and Häser, 2016). As a contribution to consumer safety and quality control, we report a method to validate the identity of "Goji" in commercial products. Our approach is based on carefully authenticated reference plants for different Lycium species that are validated by taxonomical characterization. This forms the basis for downstream morphological and molecular analysis of the Lycium barbarum and Lycium chinense accessions, development of a diagnostic assay, and application of this assay to clarify the identity of commercial products sold as "Superfood Goji."
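The logic of an ARMS diagnostic primer can be sketched in a few lines. The example below builds a forward primer whose 3'-terminal base matches the allele of one species and introduces one additional, deliberate mismatch near the 3' end to destabilize binding to the other allele. The template sequence, the position of the substitution, and the choice of the third base from the 3' end for the extra mismatch are all illustrative assumptions; they are not the psbA-trnH sequences or the primer design used in this study.

```python
# Substitution used for the deliberate mismatch (illustrative choice only).
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def arms_primer(template, snp_index, allele, length=20, mismatch_offset=2):
    """Build an allele-specific forward primer ending on the SNP.

    template        -- sense-strand sequence (5' to 3')
    snp_index       -- 0-based position of the diagnostic substitution
    allele          -- base the primer should match at its 3' end
    mismatch_offset -- distance of the extra mismatch from the 3' end
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    bases = list(template[start:snp_index + 1].upper())
    bases[-1] = allele.upper()                 # 3' end matches the target allele
    pos = len(bases) - 1 - mismatch_offset
    bases[pos] = MISMATCH[bases[pos]]          # deliberate destabilizing mismatch
    return "".join(bases)


# Invented template; the two "species" differ only at position 21 (C versus T).
template = "ACGTACGTACGTACGTACGTACGT"
primer_for_t_allele = arms_primer(template, snp_index=21, allele="T")
print(primer_for_t_allele)
```

With a primer of this kind, a template carrying the matching allele has a single internal mismatch and can still be extended, whereas the other allele has two mismatches close to the 3' end, which is expected to suppress extension and hence the diagnostic band.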

#### MATERIALS AND METHODS

#### Plant Specimens

A set of 15 accessions of Lycium was analyzed in this study (**Table 1**). The set included two specimens of Lycium barbarum, one of Asian origin and one commercially available in Germany, and seven specimens of Lycium chinense that were all cultivated and are maintained in the Botanic Garden of the Karlsruhe Institute of Technology. Fruits from Lycium ruthenicum were obtained from China. To gain insight into the phylogenetic relations of Lycium barbarum and Lycium chinense, four additional Lycium species were cultivated, including specimens from the Mediterranean (Lycium europaeum L.) and South American (Lycium chilense Bertero and Lycium ameghinoi Speg.) regions. In addition, the DNA of one South African specimen (Lycium oxycarpum Dunal) was included in the study. This set of 15 Lycium specimens was complemented by a set of 17 commercial "Goji" products that were obtained as dried fruits from different sources (**Table 2**), making 32 accessions in total.

#### Identification of Reference Specimens

The identity of specimens was verified by taxonomical identification using appropriate taxonomic keys (Flora of China (Efloras, 2018); Schmeil/Fitschen: The Flora of Germany and the neighboring countries, 95th edition (Schmeil and Fitschen, 2011)). In detail, we documented floral traits (i.e., pubescence of corolla, undulation of calyx, length of corolla tube) from 30 inflorescences of each of the L. barbarum and L. chinense accessions to eliminate errors in determination that might occur when looking only at a single flower. The observed characteristics were documented digitally (Stereolupe 420, Leica, Bensheim, Germany).

#### Seed Analysis

Fruits of the cultivated plants were harvested and the seeds phenotyped quantitatively. Thirty seeds of each Lycium barbarum, Lycium chinense and Lycium ruthenicum accession, as well as of each commercial "Goji" product, were excised from at least five different fruits, and digital images were recorded (Stereolupe 420, Leica, Bensheim, Germany). The digital images of the seeds were analyzed using the program SmartGrain (Tanabata et al., 2012). This software measures parameters such as seed area or length-to-width ratio once the digital image has been loaded and the scale bar calibrated with the "set scale" tool. SmartGrain detects the objects in the image automatically, or the seeds can be selected manually. Boxplots of the resulting data were generated with R Studio version 3.2.0.

#### DNA Barcoding

DNA from fresh leaves of reference plants (using 60 mg of starting material) and dried fruits of commercial products (using 120 mg of starting material) was isolated using the Invisorb® Spin Plant Mini Kit (Stratec Biomedical AG). The quality and quantity of the isolated DNA were evaluated by spectrophotometry (NanoDrop, Peqlab), and the DNA was diluted to 50 ng/µl to be used as template in PCR.
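The dilution to 50 ng/µl follows the usual relation C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic, not part of the published protocol; the function name, the stock reading of 200 ng/µl and the final volume of 40 µl are hypothetical values chosen only for illustration.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_volume_ul, diluent_volume_ul) using C1 * V1 = C2 * V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume_ul, final_volume_ul - stock_volume_ul


# Hypothetical spectrophotometer reading of 200 ng/µl, diluted to the
# 50 ng/µl working concentration stated in the text (40 µl final volume).
stock_ul, water_ul = dilution_volumes(200.0, 50.0, 40.0)
print(f"{stock_ul:.1f} µl DNA + {water_ul:.1f} µl water")
```

At 50 ng/µl, the 100–150 ng of template used per reaction corresponds to 2–3 µl of the diluted DNA.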

A 30 µl reaction volume containing 20.4 µl nuclease free water (Lonza, Biozym), onefold Thermopol Buffer (New England Biolabs), 1 mg/ml bovine serum albumin, 200 µM dNTPs (New England Biolabs), 0.2 µM of forward and reverse primer (see Primer list, **Table 3**), 100–150 ng DNA template and three units of Taq polymerase (New England Biolabs) was used to amplify the marker sequences.

Thermal cycler conditions for the amplification of the psbA-trnH intergenic spacer region included initial denaturation at 95°C for 2 min, followed by 33 cycles at 94°C for 1 min, 56°C for 30 s, and 68°C for 45 s, ending with a final extension at 68°C for 5 min.
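As a quick plausibility check when planning a run, the programmed hold times can be summed. This is only a sketch: the constant and function names are our own, and ramping between temperatures, which depends on the instrument, is deliberately ignored.

```python
# Hold times taken from the cycling conditions stated above (seconds).
INITIAL_DENATURATION_S = 2 * 60
CYCLE_STEPS_S = (60, 30, 45)  # denaturation, annealing, extension
CYCLES = 33
FINAL_EXTENSION_S = 5 * 60


def program_hold_time_s(initial_s, cycle_steps_s, cycles, final_s):
    """Sum of programmed hold times; temperature ramping is not included."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


total_s = program_hold_time_s(
    INITIAL_DENATURATION_S, CYCLE_STEPS_S, CYCLES, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 60:.2f} min")  # 4875 s = 81.25 min
```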

The PCR was subsequently evaluated by agarose gel electrophoresis using NEEO ultra-quality agarose (Carl Roth, Karlsruhe, Germany). DNA was visualized using SYBRsafe (Invitrogen, Thermo Fisher Scientific, Germany) and blue light excitation. The fragment size was determined using a 100 bp size standard (New England Biolabs). Amplified DNA was purified for subsequent sequencing using the MSB® Spin PCRapace kit (Stratec). Sequencing was outsourced to Macrogen Europe (Netherlands) or GATC (Germany).

The quality of the obtained sequences was examined with the program FinchTV Version 1.4.0<sup>1</sup>. To get a more robust result, the marker region was sequenced in both directions. The resulting two sequences were merged for each accession.
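The text does not specify how the two reads were merged. One simple way to picture the step is sketched below, under the assumptions that both reads have already been trimmed to the same span and that disagreeing base calls are marked as N for manual inspection of the chromatograms; the function names and the toy reads are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_reads(forward, reverse):
    """Merge a forward read with a reverse read covering the same span.

    Positions where both reads agree keep that base, an N in one read is
    filled from the other read, and conflicting calls become N.
    """
    forward = forward.upper()
    reverse_as_forward = reverse_complement(reverse).upper()
    if len(forward) != len(reverse_as_forward):
        raise ValueError("reads must be trimmed to the same span first")
    merged = []
    for f, r in zip(forward, reverse_as_forward):
        if f == r:
            merged.append(f)
        elif f == "N":
            merged.append(r)
        elif r == "N":
            merged.append(f)
        else:
            merged.append("N")
    return "".join(merged)


# Toy reads (not real Lycium data): the forward read has one uncalled base.
print(merge_reads("ATGCNTTA", "TAATGCAT"))  # ATGCATTA
```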

<sup>1</sup>https://digitalworldbiology.com/FinchTV

#### TABLE 1 | Lycium reference specimens, overview of identity and origin.


<sup>∗</sup>Received as Lycium barbarum, determined as Lycium chinense.

TABLE 2 | Commercial Goji products obtained from different sources.


#### Phylogenetic Analysis

For the sequence alignment and the phylogenetic analysis, the program MEGA7 (Version 7.0.14) with the integrated tree explorer was used (Kumar et al., 2016). The sequences were aligned using the Muscle algorithm of MEGA7 (Edgar, 2004). Alignments were trimmed to the first nucleotide downstream of the forward primer and the nucleotide preceding the reverse primer. With the same software, the evolutionary relationships were inferred using the neighbor-joining algorithm, with bootstrap values based on 1,000 replicates (Felsenstein, 1985; Saitou and Nei, 1987). The species Nolana werdermannii was chosen as an outgroup, since the genus Nolana is a sister taxon to the Lycieae (Levin and Miller, 2005). The psbA-trnH spacer region sequence for N. werdermannii (GenBank: FJ189604) was obtained from the NCBI database.
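The bootstrap of Felsenstein (1985) resamples the columns of the alignment with replacement and rebuilds a tree from every pseudo-alignment. The sketch below illustrates only the resampling step; it is not the MEGA7 implementation, the tree building itself is omitted, and the function name, taxon labels and toy sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns a pseudo-alignment with the same taxa and the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy alignment for illustration only (not real psbA-trnH data).
toy = {"taxon_A": "ACGTACGT", "taxon_B": "ACGTTCGT", "outgroup": "ACCTTCGA"}
rng = random.Random(1)  # fixed seed so that the run is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["taxon_A"]))
```

The bootstrap support of a branch is then the fraction of replicate trees in which that branch is recovered.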

#### ARMS Diagnostics

A single nucleotide difference in the psbA-trnH intergenic spacer sequences of L. barbarum and L. chinense was used to design a diagnostic primer that clearly discriminates these two closely related species in a one-step duplex-PCR protocol.

The primer was designed with the Primer3Plus webtool (Untergasser et al., 2012). A thymine in position 265 of the psbA-trnH multiple sequence alignment in Lycium barbarum is substituted by a guanine in the other Lycium species. Thymine was placed at the 3′-end of the diagnostic primer and an additional nucleotide was exchanged in the 3′-region, to prevent the binding of the primer to the other Lycium species. With this design, this diagnostic primer (LB\_265T\_fw) should only be able to bind to the Lycium barbarum accessions. Since the region surrounding this nucleotide substitution is very AT-rich, the primer length had to be increased to 31 nucleotides in order to reach an appropriate annealing temperature.

TABLE 3 | Primer list.

U, universal primers; d, diagnostic primer for ARMS. For the diagnostic primers the decisive nucleotide is underlined.

Based on the same strategy, a second diagnostic primer (LC\_265T\_fw) was designed that should only bind to the psbA-trnH spacer region template of Lycium chinense.

These diagnostic primers were then used in combination with the universal psbA and trnH primers (see primer list, **Table 3**) in distinct duplex-PCRs. Usage of either LB\_265T\_fw or LC\_265T\_fw ought to yield an additional second diagnostic band of 290 bp in only one of the species while in the other species only the full-length amplicon with a length of 546 base pairs would be visible after the gel electrophoresis. We tested this prediction and validated the predicted readout experimentally (**Figure 6**).
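The predicted readout of the two duplex-PCRs can be written down as a small lookup. This is a sketch of the logic only, restricted to the two target species; the function and constant names are our own, and the band sizes are those given in the text.

```python
FULL_LENGTH_BP = 546  # universal psbA-trnH amplicon, expected in every lane
DIAGNOSTIC_BP = 290   # additional band from the matching diagnostic primer

# Species that each diagnostic primer is designed to bind.
PRIMER_TARGET = {
    "LB_265T_fw": "Lycium barbarum",
    "LC_265T_fw": "Lycium chinense",
}


def expected_bands(template_species, diagnostic_primer):
    """Predicted band sizes (bp) for one duplex-PCR."""
    bands = [FULL_LENGTH_BP]
    if PRIMER_TARGET[diagnostic_primer] == template_species:
        bands.append(DIAGNOSTIC_BP)
    return bands


for species in ("Lycium barbarum", "Lycium chinense"):
    for primer in PRIMER_TARGET:
        print(species, primer, expected_bands(species, primer))
```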

#### RESULTS

#### Morphology of Floral Organs Clearly Delineates the Two Main "Goji" Species

To assess the validity of morphological traits that are used to delineate the main species behind "Goji," we assessed three morphological features that are used to discriminate Lycium barbarum and Lycium chinense (Flora of China; Efloras, 2018) across 30 inflorescences collected from the reference plants (**Figures 1**, **2**). An elongated corolla tube and glabrescence of corolla blades, the distinctive traits of Lycium barbarum (**Figure 1A**), were found consistently in all 30 investigated inflorescences collected from each of the two reference plants. In contrast, in all reference plants of Lycium chinense, all corolla blades were densely pubescent at the margin (**Figure 2A**), and the corolla tube was distinctly shorter than the corolla lobes (**Figure 1A**). Thus, both of these traits could be validated to differentiate Lycium barbarum and Lycium chinense specimens. In contrast, the third feature reported to discriminate the two species, the number of calyx lobes (**Figure 1B**), was not consistent over all flowers of Lycium chinense. Depending on the accession, between 6.7% (in accessions 5551 and 6815) and 23.3% (in accession 5550) of the Lycium chinense flowers had a two-lobed calyx, which is usually characteristic of Lycium barbarum. In fact, without exception, all flowers of our Lycium barbarum accessions had a two-lobed calyx. The Flora of China additionally lists different varieties of Lycium barbarum and Lycium chinense, respectively. However, all of the available reference plants were determined as either Lycium barbarum var. barbarum or Lycium chinense var. chinense.

#### Seed Size Can Be Used to Discriminate the Two "Goji" Species

Since "Goji" is traded as fruit, we searched for morphological traits that can be inspected in fruits and allow the two types of "Goji" berries to be discriminated. We noted that the seeds of Lycium barbarum var. barbarum and Lycium chinense var. chinense differ significantly in size (**Figure 2B**), and we quantified this trait by measuring cross-section areas (**Figure 3**). The two available accessions of Lycium barbarum var. barbarum showed cross sections that were about half of those seen in the seven investigated Lycium chinense var. chinense accessions (<2.5 mm<sup>2</sup> as compared to almost 5 mm<sup>2</sup>). Although the intraspecific variation in Lycium chinense var. chinense was more pronounced, even the accessions with smaller seeds had clearly bigger seeds than the Lycium barbarum var. barbarum plants. The single available accession of Lycium ruthenicum had small seeds that are comparable in size to those of the two Lycium barbarum var. barbarum accessions, but these fruits are easy to delineate due to their black pericarp. With an average boxplot median of 2.62 mm<sup>2</sup>, all the investigated commercial Goji products had seed sizes that were comparable to Lycium barbarum var. barbarum, with low variance between the different commercial samples; the lowest median was that of Goji product 2 (2.22 mm<sup>2</sup>), and the highest median was that of Goji product 19 (2.88 mm<sup>2</sup>), suggesting that these "Goji" products indeed contained Lycium barbarum.

#### psbA-trnH Spacer Phylogeny Clearly Distinguishes Lycium Species

To probe the phylogenetic relationship of the three different "Goji" species (Lycium barbarum var. barbarum, Lycium chinense var. chinense and Lycium ruthenicum) and the tested commercial products (abbreviated as Gp in **Figure 4**) in comparison to Lycium accessions from Europe and the New World, we used a 510-bp region of the psbA-trnH marker including the variable intergenic spacer (**Figure 4**). Although this universal psbA-trnH spacer region is one of the most variable DNA barcodes, we found only small differences between those three obviously closely related species. The only difference between Lycium barbarum var. barbarum and Lycium chinense var. chinense was one nucleotide substitution at site 265 with a thymine for Lycium barbarum var. barbarum and a guanine for Lycium chinense var. chinense and Lycium ruthenicum (**Figure 5A**). As already seen for seed size (**Figure 3**), the commercial "Goji" products all clustered with Lycium barbarum by sharing the informative thymine at position 265. The three Asian species are closely related to each other and clearly clustered separately from the two tested species from South America, as well as from the two species from Europe and South Africa (**Figure 4**). Lycium ruthenicum had an additional substitution from thymine to guanine at site 358 and an additional nine-nucleotide insert at site 450 that was present neither in Lycium barbarum nor in Lycium chinense. All accessions of L. barbarum var. barbarum and L. chinense var. chinense were identical within the respective species. The variations were exclusively interspecific. This single nucleotide substitution was used to design a strategy based on ARMS to discriminate Lycium barbarum from Lycium chinense.
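Finding such informative positions amounts to scanning the alignment column by column. The sketch below shows the idea on a toy alignment; the sequences and labels are hypothetical and much shorter than the real 510-bp region, and gaps are simply treated as a fifth character.

```python
def variable_sites(alignment):
    """Return {1-based position: {taxon: base}} for columns that differ.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    sites = {}
    for position, column in enumerate(zip(*seqs), start=1):
        if len(set(column)) > 1:
            sites[position] = dict(zip(names, column))
    return sites


# Toy alignment for illustration only (not the real spacer sequences).
toy = {
    "barbarum_like": "AATTTATA",
    "chinense_like": "AATGTATA",
    "ruthenicum_like": "AATGTAGA",
}
for position, bases in variable_sites(toy).items():
    print(position, bases)
```

In this toy case, position 4 separates the first sequence from the other two, in the same way that site 265 separates Lycium barbarum from Lycium chinense and Lycium ruthenicum in the real alignment.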

#### A One-Step Protocol Based on ARMS Allows Differentiation of the Two Main "Goji" Species

To develop an assay that allows an easy one-step discrimination of the two main "Goji" species in unprocessed commercial samples, we used the single nucleotide polymorphism at position 265 of the psbA-trnH spacer (**Figure 5A**) to apply an ARMS-based strategy. Two diagnostic primers (LB\_265T\_fw and LC\_265T\_fw) with complementary readouts were designed (**Figure 5B**).

The diagnostic LB\_265T\_fw primer yielded clear diagnostic bands with a fragment length of 290 base pairs for the two Lycium barbarum var. barbarum accessions and all of the Goji products (**Figure 6A**), while this diagnostic band was absent in Lycium chinense var. chinense and Lycium ruthenicum. Conversely, when the LC\_265T\_fw diagnostic primer was added in a duplex-PCR, the diagnostic 290-bp band was present in all the Lycium chinense var. chinense accessions and the Lycium ruthenicum sample, but absent in the Lycium barbarum var. barbarum reference plants and the commercial Goji products (**Figure 6B**).

#### DISCUSSION

In the current work, we have developed a one-step assay to discriminate two closely related, but distinct species of Lycium that are used in traditional Chinese medicine and are currently booming in Europe and the US as a so-called "Superfood" under the common vernacular name of "Goji berries." Based on validated plant material, we have established morphological and molecular traits that differentiate between both species, and we have developed a robust one-step PCR-based assay that can reliably identify any given unprocessed commercial product as either Lycium barbarum or Lycium chinense.

#### Identity Matters: Any Assay Is Only as Good as the Reference Material Is Reliable

At a time when the possibilities of plant molecular biology appear overwhelming, including high-throughput whole-genome sequencing, mapping of gene regulatory networks and phylogenomics, a rather ancient but nevertheless essential discipline of botany is experiencing a renaissance (Ledford, 2018). Actually looking at the plant that is the subject of a study and determining it morphologically is often considered a trivial task, but it is far from trivial. Surveys of misdeterminations and mislabeling in herbaria, botanical gardens, and germplasm collections show that between 10 and 20% of accessions are not what they are supposed to be (Goodwin et al., 2015).

Even for DNA barcoding, the correct determination of the reference plant material is a necessary precondition for the validity of the outcome. This requirement is even more important for medicinal plants or plants that are en vogue because of their promised health effects. The power of the determination key is crucial to provide a robust background for all subsequent experiments. Unfortunately, many keys use terms with a low degree of certainty, such as "about" or "usually," which can be interpreted differently by different taxonomists. This is also true for the otherwise high-quality Flora of China, which was the main key used in this study. For the discrimination of Lycium barbarum and Lycium chinense, terms such as "usually 2-lobed" can be found in this key. To avoid uncertainties in determining a plant species, several different morphological traits should be applied, if possible. The Flora of China lists three different traits to distinguish between the two gou qi species, Lycium barbarum and Lycium chinense.

Given the close relation of those two species, the common usage of both species in traditional Chinese medicine and the expectation of consumers to get the correct plants in the purchased products, correct determination is undoubtedly important. For the discrimination of closely related plant species, the evaluation of floral morphology traits should be central, since they are most likely linked with the reproductive isolation between these species. With the combination of the different floral traits (corolla pubescence, length of corolla tube and number of calyx lobes), the two widely used Goji species could be discriminated unambiguously for the available reference plants. However, the corolla traits were more robust and therefore more reliable than those of the calyx, since the features of the calyx varied in a considerable number of the investigated inflorescences of Lycium chinense. The importance of evaluating an appropriate number of flowers becomes obvious, because phenomena like phenotypic plasticity or changes due to environmental factors can influence the appearance of single plant organs. For instance, differences in the velocity of polar auxin transport, as they might occur depending on light quantity and quality, are expected to modulate the number of apices in the primordial whorl committed to form the calyx (Reinhardt et al., 2003). However, the possibility that the morphospecies Lycium chinense might comprise several genetically isolated cryptospecies should also be kept in mind.

#### FIGURE 3 | Continued

the commercial Goji products (white). The median of the Lycium barbarum var. barbarum reference plants is between 2 and 3 mm<sup>2</sup>, whereas the Lycium chinense var. chinense seeds are significantly bigger in area size with a median between 4 and 6 mm<sup>2</sup>. Lycium ruthenicum seeds are smaller again, with a median slightly bigger than 2 mm<sup>2</sup>. The commercial Goji products have boxplot medians ranging from 2 to 3 mm<sup>2</sup>, similar to Lycium barbarum and Lycium ruthenicum.

(black) are labeled with colored squares. The evaluated commercial Goji products are labeled with light gray squares. To put the three different "Goji" species into a context of the Lycium genus, species from South America (Lycium chilense and Lycium ameghinoi, yellow) and Europe (Lycium europaeum and Lycium oxycarpum, blue) were included in the phylogenetic analysis. The Lycium barbarum var. barbarum reference plants cluster with the Goji products, while Lycium chinense var. chinense (which is most closely related to Lycium barbarum) and Lycium ruthenicum have their own clusters.

The seed analysis revealed significant differences in the seed area of Lycium barbarum var. barbarum and Lycium chinense var. chinense specimens. Thus, besides the previously described differences in the reproductive organs, the two species can be distinguished by the size of their dispersal units. All the seeds of the tested commercial Goji products had boxplot medians similar to those of the two Lycium barbarum var. barbarum reference plants, from which we conclude that all the investigated commercial Goji products are actually fruits of the ningxia gou qi (Lycium barbarum). The boxplot median of the commercial products is comparable to that of Lycium ruthenicum as well; however, since the fruits of Lycium barbarum and Lycium chinense differ in size and especially in color from those of Lycium ruthenicum, substitution by this species is unlikely.

#### "Goji Berries" Are Lycium barbarum

The evaluated commercial Goji products of this study were obtained from many different companies and different geographic origins. Nevertheless, all of the investigated products seem to originate from Lycium barbarum var. barbarum plants. Asian companies claim to sell only the ningxia gou qi (Lycium barbarum) as "Goji" in commercial products outside of China. Our results suggest that consumer protection seems to be quite satisfactory in this respect, and that the "correct" products are sold on the German market. However, the pronounced difference in price for otherwise comparable "Goji" products indicates that consumer protection has to consider additional aspects of quality assessment. In this context, it should be noted that in China, "Goji" is marketed in different quality grades termed "super," "king," "special," and "Grade A" (EzineArticles, 2018), whereby the grade is linked with berry size. In other words, berries of Lycium chinense would be sold at a higher price. Thus, the fact that only the small Lycium barbarum berries were found in the commercial products sampled in Germany might have an explanation other than an efficient system of consumer protection: the smaller berries from Lycium barbarum are less attractive for consumers in China (except for the few knowledgeable professionals who know about their medicinal value) and therefore are preferentially exported to Europe and the United States, where they can still be sold at a high price.

While the commercial Goji products as well as the Lycium barbarum var. barbarum reference plants show a fairly homogeneous seed size, there is significant intraspecific variation in seed size among the accessions of Lycium chinense var. chinense. Since the measured seeds were chosen randomly and the sample size was sufficient, the likelihood that this variation is caused by sampling bias is rather low, which means that the differences in size must have genetic or developmental reasons. There are some reports that the chromosome number of Lycium chinense varies from 2n = 24 to 36 or even 48 (Tropicos, 2018), which would mean that cryptospecies exist within Lycium chinense. This might also be a factor in the variability in calyx lobing observed between the different accessions of this species. The enlargement of seed size in Lycium chinense might therefore be linked with allopolyploidy. However, for the closely related Asian Goji species the literature regarding karyotypes is surprisingly scarce.

During the past years, the phylogenetic relations within the genus Lycium and its biogeographic background have been studied intensively. One of the surprising outcomes was the paraphyletic nature of Lycium (Fukuda, 2001; Levin and Miller, 2005). Different barcoding marker regions were used and combined for the genus to which Goji belongs, ranging from the more common rbcL, matK, trnF–trnH, psbA-trnH to conserved ortholog sequences (COS) regions (Levin and Miller, 2009). For our morphologically validated species we chose the sequences of the highly variable chloroplast psbA-trnH spacer region to find small differences in sequence that can be utilized for a simple one-step test to discriminate Lycium barbarum and Lycium chinense. However, we also wanted to place our data in the context of the Lycium genus. We found three well separated clades corresponding to the three biogeographic regions. The East Asian "Goji" species were closely related, and formed a separate clade distinct from the other Old-World species and from the clade of New-World species of Lycium. Of course, for a biogeographic and in-depth phylogenetic approach, many independent regions should be evaluated for a large set of species to obtain a robust tree. However, as mentioned above, this has been done extensively, and our main goal was to find informative single-nucleotide polymorphisms that can be used to differentiate between the two "Goji" species Lycium barbarum and Lycium chinense by a one-step ARMS assay. The reliance on DNA-based methods is crucial, since the flowers of a commercial Goji product can obviously not be traced back, and some commercial Goji products are processed as powders as well.

#### Authentication by ARMS Provides an Innate Positive Control for DNA Quality and PCR Quality

Based on the single nucleotide substitution between Lycium barbarum and Lycium chinense, two diagnostic ARMS primers could be designed to trace either of the species. In both of the Lycium barbarum var. barbarum reference plants using the LB\_265T\_fw primer, and in all of the seven Lycium chinense var. chinense reference plants using the LC\_265T\_fw primer, the desired second diagnostic band could be amplified in distinct multiplex PCR approaches. The results of this molecular approach strengthen the data of the seed analysis, because for all commercial Goji products there was an additional diagnostic band when using the LB\_265T\_fw primer. Thus, all of the investigated commercial Goji products were authenticated morphologically and by molecular markers to be Lycium barbarum var. barbarum. The advantage of this ARMS approach is the implemented positive control of the universal psbA-trnH spacer region band in the gel. This control shows that the extracted DNA is actually of good quality and, more importantly, that the region of interest is present and the discrimination occurs exclusively because of the nucleotide substitution between the two species. Apart from the practical application of discriminating two closely related species of commercial importance, we would like to highlight the importance of morphology and taxonomy and the endangered art of looking at plants in order to really understand them.

While considerable effort has been invested into the evaluation of bioactive compounds of Goji (Chang and So, 2008), to our knowledge this is the first case where a robust one-step discrimination for these closely (from an evolutionary, cultural-historic and medicinal point of view) related "Goji" species has been developed. Previous studies have used RAPD fingerprinting and the downstream sequence characterized amplified region (SCAR) strategy (Zhang et al., 2001; Sze et al., 2008). A disadvantage of RAPD in authentication is its limited reproducibility and reliability, as fingerprint patterns are strongly dependent on sample quality and DNA integrity. Despite these drawbacks, RAPD is a rapid and cost-effective method that continues to be used today (Krishnan et al., 2017), especially for authentication approaches combined with downstream SCAR markers (Mei et al., 2017). RAPD-based SCAR markers yield reproducible binary results, and this method has been used successfully for "Goji" (Sze et al., 2008), ginseng (Panax ginseng C.A. Mey.) (Wang et al., 2001), saffron (Crocus sativus L.) (Torelli et al., 2014), and others. The validation by SCAR at first sight seems easier to interpret, because only a single band has to be assessed. However, missing bands after gel electrophoresis could also be caused by problems with DNA purity or integrity, or likewise by suboptimal conditions of the PCR. Both problems are avoided by the ARMS approach, because the full-length amplicon serves as an inbuilt positive control to calibrate presence or absence of the additional diagnostic band. ARMS has been successfully used for authentication of several TCM taxa, including Curcuma (Sasaki et al., 2002), Alisma (Li et al., 2007) and Rheum (Yang et al., 2004). The authors emphasize the efficiency and reproducibility of this method.

We realize that the ARMS application has its disadvantages as well: if it is applied to "Goji" products that are processed as powders, one would need to extend the one-step protocol and use both diagnostic primers to examine whether only Lycium barbarum, only Lycium chinense or both species are present in the respective sample. For strongly degraded DNA, as might derive from extraction of processed powder products, the amplification of the relatively long psbA-trnH spacer region might be difficult, meaning that the internal control that is the advantage of this ARMS approach might suffer in quality. However, most of the commercial "Goji" products are sold as unprocessed dried fruits, to which the presented method can easily be applied. The ARMS approach shown in this study is not limited to superfoods like "Goji," but is equally applicable to other fields of nutrition, and has the potential to be an important tool for consumer safety and quality control when it comes to discriminating "real" ingredients from surrogate species. The public availability of sequences for a large number of different plant taxa in NCBI GenBank makes it easy to design ARMS primers for prospective projects, and in many cases removes the need for initial sequencing to design new ARMS primers. However, we want to emphasize that only sequences from correctly identified, herbarium-vouchered material should be used for the highly sensitive ARMS method.
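The extended readout for powders, with one duplex-PCR per diagnostic primer, could be interpreted as sketched below. This was not tested in the present study; the function names, the result labels and the band-matching tolerance of 15 bp are hypothetical, and a missing full-length band is treated as a failed internal control.

```python
FULL_LENGTH_BP = 546  # internal positive control (universal primers)
DIAGNOSTIC_BP = 290   # band produced by a matching diagnostic primer


def _has_band(bands_bp, size_bp, tolerance_bp):
    """True if any observed band lies within the tolerance of size_bp."""
    return any(abs(band - size_bp) <= tolerance_bp for band in bands_bp)


def infer_composition(lb_bands_bp, lc_bands_bp, tolerance_bp=15):
    """Interpret band sizes from the LB_265T_fw and LC_265T_fw reactions.

    tolerance_bp is an assumed allowance for reading sizes off a gel.
    """
    for bands in (lb_bands_bp, lc_bands_bp):
        if not _has_band(bands, FULL_LENGTH_BP, tolerance_bp):
            return "inconclusive: control band missing"
    lb_positive = _has_band(lb_bands_bp, DIAGNOSTIC_BP, tolerance_bp)
    lc_positive = _has_band(lc_bands_bp, DIAGNOSTIC_BP, tolerance_bp)
    if lb_positive and lc_positive:
        return "mixture: T and G variants at site 265"
    if lb_positive:
        return "Lycium barbarum type only"
    if lc_positive:
        return "Lycium chinense type only"
    return "inconclusive: no diagnostic band"


print(infer_composition([546, 290], [546]))
print(infer_composition([550, 285], [540, 295]))
print(infer_composition([290], [546]))
```

Because Lycium ruthenicum shares the guanine at site 265, a positive LC\_265T\_fw reaction alone would not distinguish it from Lycium chinense.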

#### AUTHOR CONTRIBUTIONS

PN and TH contributed to the idea, topic, background information, and experiment planning. SW carried out the shown experiments under the lab supervision of TH. PN and SW wrote the manuscript.

#### ACKNOWLEDGMENTS

We acknowledge support by Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of Karlsruhe Institute of Technology. We also acknowledge the staff of the Botanical Garden of the KIT for their excellent support.

### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wetters, Horn and Nick. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Simultaneous Extraction and Determination of Compounds With Different Polarities From Platycladi Cacumen by AQ C18-Based Vortex-Homogenized Matrix Solid-Phase Dispersion With Ionic Liquid

#### Edited by:

Caroline Howard, Medicines and Healthcare Products Regulatory Agency, United Kingdom

#### Reviewed by:

Johanna Mahwahwatse Bapela, University of Pretoria, South Africa Yun K. Tam, Sinoveda Canada Inc., Canada

#### \*Correspondence:

Yan-xu Chang Tcmcyx@126.com; tcmcyx@tjutcm.edu.cn

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 23 July 2018 Accepted: 13 December 2018 Published: 09 January 2019

#### Citation:

Ding M, Li J, Zou S, Tang G, Gao X and Chang Y-x (2019) Simultaneous Extraction and Determination of Compounds With Different Polarities From Platycladi Cacumen by AQ C18-Based Vortex-Homogenized Matrix Solid-Phase Dispersion With Ionic Liquid. Front. Pharmacol. 9:1532. doi: 10.3389/fphar.2018.01532

Mingya Ding<sup>1,2</sup>, Jin Li<sup>1,2†</sup>, Shuhan Zou<sup>1,2</sup>, Ge Tang<sup>3</sup>, Xiumei Gao<sup>1,2</sup> and Yan-xu Chang<sup>1,2</sup>\*<sup>†</sup>

<sup>1</sup> Tianjin State Key Laboratory of Modern Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin, China, <sup>2</sup> Tianjin Key Laboratory of Phytochemistry and Pharmaceutical Analysis, Tianjin University of Traditional Chinese Medicine, Tianjin, China, <sup>3</sup> Department of Nephrology, The First Teaching Hospital, Tianjin University of Traditional Chinese Medicine, Tianjin, China

This study presented a rapid, simple and environmentally friendly method employing AQ C18-based vortex-homogenized matrix solid-phase dispersion with ionic liquid (AQ C18-IL-VHMSPD) for the extraction of compounds with different polarities from Platycladi Cacumen (PC) samples and their determination by ultra high-performance liquid chromatography with PDA detection. AQ C18 (aqua C18) and ionic liquid ([Bmim]BF4) were used as the adsorbent and green elution reagent in the vortex-homogenized MSPD procedure. The AQ C18-IL-VHMSPD conditions were optimized by studying several experimental parameters including the type of ionic liquid, the type of adsorbent, the ratio of sample to adsorbent, the concentration and volume of ionic liquid, grinding time and vortex time. The recoveries of the target compounds were in the range of 96.9–104% with relative standard deviation values no more than 2.8%. The limits of detection and limits of quantitation were in the range of 0.2–1.2 and 1.0–5.4 ng mL<sup>−1</sup>, respectively. Compared with the traditional ultrasonic-assisted extraction, the developed AQ C18-IL-VHMSPD method required less sample, reagent and time. It was concluded that the AQ C18-IL-VHMSPD method was a powerful method for the extraction and quantification of high polarity and low polarity compounds in traditional Chinese medicine samples.

Keywords: AQ C18, ionic liquid, Platycladi Cacumen, UHPLC, vortex-homogenized matrix solid-phase dispersion

## INTRODUCTION

Platycladi Cacumen (PC), namely Cebaiye, derived from the dry twigs and leaves of Platycladus orientalis (L.) Franco, is one of the most commonly used traditional Chinese medicines (TCMs). It has been applied in many TCM formulations for thousands of years. It was recorded that PC could cool the blood to stanch bleeding, dispel pathogenic wind, remove dampness and resolve phlegm (Chinese Pharmacopoeia Commission, 2015). Recently, it was confirmed that PC has antimicrobial, anti-tumor, anti-inflammatory and anti-oxidant activities and a neuroprotective effect (Hassanzadeh et al., 2001; Emami et al., 2011; Zaugg et al., 2011; Fan et al., 2012; Zhang et al., 2013). Flavonoids are regarded as the material basis for the efficacy. The main flavonoid glycosides consist of myricitrin, isoquercitrin, and quercitrin. In addition, there is a large amount of biflavonoids in PC. Among these biflavonoids, hinokiflavone and amentoflavone are the representative ones (Lu et al., 2006). These five flavonoids with different polarities have been reported to exert therapeutic effects on diseases in the previous literature (Ma and Wu, 2013). Thus, the complete extraction and precise analysis of the five compounds (**Figure 1**) with different polarities are particularly crucial for quality control and pharmacological investigations of herbal PC.

The conventional methods for the extraction of PC include ultrasonic-assisted extraction (UAE) with HPLC-UV or UPLC-DAD (Zhuang et al., 2017, 2018), microwave-based extraction with ultraviolet–visible detection, and heat reflux extraction with UPLC-DAD (Chen et al., 2007; Shan et al., 2018). However, these methods normally require a great deal of organic reagent, which causes environmental pollution. Besides, the long extraction time is another drawback. Matrix solid-phase dispersion (MSPD) is a comprehensive sample preparation method in which sample homogenization, disruption, extraction, fractionation and purification are performed simultaneously in one step, and which requires a shorter extraction time and less organic solvent (Barker et al., 1989; García-López et al., 2008). At present, many MSPD methods have been successfully applied to various samples (Liu et al., 2008; Vela-Soria et al., 2014). Recently, a vortex-homogenized MSPD (VHMSPD) technique was reported that not only retained the advantages of the traditional MSPD method but also overcame the drawback of analyte loss in complex procedures (Du et al., 2018a,b). However, no information about VHMSPD was available for the analysis of Platycladi Cacumen in the literature.

The adsorbent used in the MSPD procedure plays a pivotal role in improving the extraction efficiency of target analytes (Cao et al., 2016). Most applications of MSPD have used silica gel-based matrix materials or adsorptive matrix materials. PSA, NH2, CN, COOH, C8, C18 (end-capped), C18-N (non-end-capped) and AQ C18 are silica gel-based matrix materials, whereas Florisil is an adsorptive matrix material. Compared with C18, C18-N carries more silanol groups on its surface, which can provide additional polar interactions. In AQ C18, a certain proportion of polar functional groups is added to the surface of the nonpolar material, so that it not only has a high adsorption capacity for low-polarity target compounds but also a greatly increased adsorption capacity for high-polarity target compounds. Thus, AQ C18 could be used to extract both high-polarity and low-polarity target compounds from traditional Chinese medicines (TCMs). To our knowledge, the use of AQ C18 as an adsorbent for the extraction of TCMs by VHMSPD has not been reported.

Ionic liquids (ILs) are characterized by diverse combinations of organic cations with organic or inorganic anions. Recently, they have been widely used as green elution reagents for microextraction owing to their high thermal stability, negligible vapor pressure and good solubility for inorganic and organic compounds (Han et al., 2012). ILs have been reported to interact with analytes through various mechanisms, including π-π, electrostatic, hydrogen-bonding, ion-dipole and inclusion-complex interactions (Qiu et al., 2012). Therefore, green ILs were applied as eluents for extracting analytes from TCMs by the VHMSPD method.

In this study, an AQ C18-based vortex-homogenized matrix solid-phase dispersion technique with ionic liquids (AQ C18-IL-VHMSPD), coupled with UHPLC-PDA, was established for the first time for the simultaneous determination of compounds of different polarities in Platycladi Cacumen, including three high-polarity flavonoid glycosides (myricitrin, isoquercitrin and quercitrin) and two low-polarity biflavonoids (hinokiflavone and amentoflavone). The combination of AQ C18 and an ionic liquid was applied to extract natural compounds in the vortex-homogenized matrix solid-phase dispersion procedure. Parameters such as the type of adsorbent, the sample/adsorbent ratio, and the type and concentration of eluent were optimized in detail to obtain good extraction efficiency. Furthermore, the five compounds were also extracted by conventional ultrasonic-assisted extraction to evaluate the feasibility of the developed AQ C18-IL-VHMSPD method.

#### MATERIALS AND METHODS

## Chemicals and Reagents

Reference standards of myricitrin, isoquercitrin, quercitrin, amentoflavone and hinokiflavone were purchased from Chengdu Desite Bio-Technology Co., Ltd. (Chengdu, China). PSA (40–60 µm, 60 Å), NH2 (40–60 µm, 60 Å), CN (50 µm, 60 Å), COOH (50 µm, 60 Å), Florisil (60–100 µm, 80 Å), C8 (50 µm, 60 Å), C18 (50 µm, 60 Å), C18-N (50 µm, 60 Å) and AQ C18 (50 µm, 60 Å) were supplied by Agela Technologies. 1-Butyl-3-methylimidazolium tetrafluoroborate ([Bmim]BF4), 1-hexyl-3-methylimidazolium tetrafluoroborate ([Hmim]BF4), 1-octyl-3-methylimidazolium tetrafluoroborate ([Omim]BF4), 1-butyl-3-methylimidazolium hexafluorophosphate ([Bmim]PF6) and 1-octyl-3-methylimidazolium hexafluorophosphate ([Omim]PF6) were purchased from Shanghai Chengjie Chemical Co., Ltd. HPLC-grade acetonitrile and methanol were purchased from Dikma Technologies Inc., United States. HPLC-grade formic acid was purchased from Tedia Company, Inc. (Fairfield, OH, United States). Deionized water was purified with a Milli-Q Academic ultra-pure water system (Millipore, Milford, MA, United States). Other chemical reagents were of analytical grade. All solutions were filtered through a 0.22 µm filter membrane before UPLC analysis.

#### Plant Material

A total of 8 batches of Platycladi Cacumen and its processed products were collected from different regions of China and authenticated by Dr. Yan-xu Chang (Tianjin University of Traditional Chinese Medicine). All samples were dried at 60 °C for 24 h, pulverized using a pulverizer (Zhongcheng Pharmaceutical Machinery), and then passed through a 100-mesh sieve.

## Preparation of Standard Solutions

The standard stock solutions of myricitrin, isoquercitrin, quercitrin, hinokiflavone and amentoflavone were prepared separately in methanol. The stock concentrations were 2 mg mL<sup>−1</sup> for myricitrin and quercitrin and 1 mg mL<sup>−1</sup> for isoquercitrin, hinokiflavone and amentoflavone. Appropriate volumes of the stock solutions were diluted with methanol to obtain eight concentration levels for the calibration curves. The concentrations of myricitrin, isoquercitrin, quercitrin, hinokiflavone and amentoflavone were in the ranges of 0.4–100, 0.08–20, 0.8–200, 0.2–50, and 0.2–50 µg mL<sup>−1</sup>, respectively. The standard solutions were stored at 4 °C.
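A calibration curve built from such a standard series is usually an ordinary least-squares line of peak area against concentration. The following is a minimal sketch under stated assumptions: the eight concentration levels are hypothetical points inside the 0.4–100 µg mL<sup>−1</sup> range given for myricitrin, the peak areas are synthetic, and `fit_calibration` and `quantify` are illustrative names, not part of the authors' workflow.

```python
def fit_calibration(conc, area):
    """Least-squares fit of area = slope * conc + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, area))
    ss_tot = sum((y - mean_y) ** 2 for y in area)
    return slope, intercept, 1.0 - ss_res / ss_tot


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to get a concentration in µg/mL."""
    return (peak_area - intercept) / slope


# Hypothetical eight-level series (µg/mL) and synthetic, perfectly linear areas.
conc = [0.4, 1.0, 2.5, 5.0, 10.0, 25.0, 50.0, 100.0]
area = [1200.0 * c + 150.0 for c in conc]

slope, intercept, r2 = fit_calibration(conc, area)
unknown = quantify(1200.0 * 20.0 + 150.0, slope, intercept)
```

Real peak areas would scatter around the line, so R<sup>2</sup> would fall slightly below 1.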

## Ultra High-Performance Liquid Chromatography With PDA Detection (UHPLC-PDA) Analysis

The chromatographic analysis was performed on a Waters ACQUITY UPLC System (Waters Co., Milford, MA, United States) equipped with a photodiode array (PDA) detector. A workstation running Empower 2 software was used to collect and analyze the data. The separation was performed on an ACQUITY UPLC BEH C18 column (2.1 mm × 100 mm, 1.7 µm, Waters) at a flow rate of 0.3 mL min<sup>−1</sup>. The mobile phase consisted of water with 0.1% formic acid (eluent A) and acetonitrile (eluent B), delivered as a gradient: 0–2 min, 5–37% B; 2–9 min, 37–67% B; 9–10 min, 67–85% B; 10–13 min, 85–95% B; 13–15 min, 95–5% B, followed by a 6 min post-run. The column temperature was maintained at 30 °C and the injection volume was 1 µL. The detection wavelength was set at 340 nm. Under these chromatographic conditions, the peaks of the analytes in both samples and standard solutions were well separated (**Figure 2**).
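The gradient program above can be written as a table of (time, %B) breakpoints. The sketch below assumes linear ramps between breakpoints, which the text does not state explicitly, and `percent_b` is a hypothetical helper name.

```python
# (time in min, % eluent B) breakpoints taken from the gradient described above.
GRADIENT = [(0.0, 5.0), (2.0, 37.0), (9.0, 67.0), (10.0, 85.0), (13.0, 95.0), (15.0, 5.0)]


def percent_b(t_min, program=GRADIENT):
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if t_min <= program[0][0]:
        return program[0][1]
    if t_min >= program[-1][0]:
        return program[-1][1]
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Composition halfway through the 2-9 min segment.
mid_segment = percent_b(5.5)
```

The same table-driven form makes it easy to check that consecutive segments join without a jump in composition.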

## AQ C18-Based Vortex-Homogenized Matrix Solid-Phase Dispersion With Ionic Liquid Procedure

An aliquot of 25 mg of the previously crushed sample and 50 mg of adsorbent (AQ C18) were placed gradually into an agate mortar, and the mixture was ground with a pestle for 3 min. Once completely dispersed, the mixture was transferred into a 4 mL polypropylene tube. A 1.5 mL aliquot of elution reagent ([Bmim]BF4) was added and the tube was vortexed thoroughly for 45 s. Subsequently, the tube was centrifuged at 14000 rpm for 10 min. The supernatant was collected and 1 µL was injected into the UHPLC system for analysis. A schematic diagram of the AQ C18-IL-VHMSPD method is shown in **Figure 3**.

## Ultrasonic Extraction

According to the Chinese Pharmacopoeia 2015, the dried PC samples (0.500 g) were precisely weighed, introduced into a 100 mL Erlenmeyer flask, and mixed with 20 mL of methanol. The mixture was then extracted ultrasonically (40 kHz, 96% power) for 30 min, and the weight lost from the solution was made up with methanol. All extract solutions were filtered through a 0.22 µm filter membrane and 1 µL of the filtrate was injected into the UPLC-PDA system for further analysis.

FIGURE 2 | Ultra high-performance liquid chromatograms of a mixture of standard compounds (A) and Platycladi Cacumen samples (B). Peaks: 1, myricitrin; 2, isoquercitrin; 3, quercitrin; 4, amentoflavone; 5, hinokiflavone.

## Optimization of AQ C18-IL-VHMSPD Parameters

To obtain a good extraction efficiency of the target compounds, several experimental parameters including the type of adsorbent, ratio of sample to adsorbent, type and concentration of the eluting solvent, and grinding time were investigated. Each test was repeated in triplicate.

Several adsorbents were investigated, including PSA, NH2, CN, COOH, Florisil, C8, C18, C18-N and AQ C18. An aliquot of 25 mg of PC sample and 50 mg of adsorbent were transferred into an agate mortar and ground for 3 min. The eluent was 1.5 mL of 90 mM IL, and 3 min was chosen as the vortex time. Different types of ILs ([Bmim]BF4, [Hmim]BF4, [Omim]BF4, [Bmim]PF6 and [Omim]PF6) were compared. Then, four concentrations of [Bmim]BF4 (50, 70, 90, and 110 mM) and IL volumes of 0.5–2 mL were examined. The ratio of sample to adsorbent (1:0, 1:1, 1:2, and 1:3) was investigated while the other conditions remained unchanged. In addition, the vortex time (15, 30, 45, and 60 s) and grinding time (0, 1, 2, 3, and 4 min) were also tested.

The error bars represent RSD (n = 3).

#### RESULTS AND DISCUSSION

#### Optimization of the AQ C18-IL-VHMSPD Method

#### Type of Adsorbent

In developing an MSPD procedure, it is crucial to employ a suitable dispersing adsorbent. The adsorbent serves not only as a disruption and dispersion agent that breaks down the sample structure so that the target compounds can be extracted, but also as a purification agent that removes interfering matrix substances. In the present study, nine types of adsorbent were evaluated. As **Figure 4A** shows, the analytes were strongly retained when PSA, NH2, CN, COOH, Florisil and C8 were used as adsorbents, probably because the analytes bound to these adsorbents too tightly to be eluted effectively. C18-N and C18 gave equivalent extraction efficiencies, but neither was dramatically better than AQ C18. One probable explanation is that the abundant Si-O-Si and Si-OH groups formed hydrogen bonds between adsorbent and analytes. Additionally, the silica may supply further interaction forces, including hydrogen bonding and electrostatic interaction. Therefore, AQ C18 was chosen as the optimal adsorbent for the subsequent extraction procedure.

#### Type of Ionic Liquid


Choosing an appropriate elution solvent is important for obtaining good extraction efficiency for the target analytes. The elution solvent should have a polarity similar to that of the target analytes so that it can readily disrupt the interaction between the analytes and the adsorbent (García-Mayor et al., 2012). [Bmim]BF4, [Hmim]BF4, [Omim]BF4, [Bmim]PF6 and [Omim]PF6 are five common ionic liquids. To select the most appropriate one, these five ILs were evaluated on the basis of the extraction efficiency of the target analytes in the MSPD procedure. As shown in **Figure 4B**, the elution efficiency of the ILs was affected by the alkyl chain length of the cation and the type of anion. The peak areas of the five target compounds increased significantly from [Omim]BF4 to [Bmim]BF4. A likely mechanism is that, with the same BF4<sup>−</sup> anion, hydrogen bonding between the target analytes and the imidazolium ring weakens as the alkyl chain lengthens. Although [Bmim]BF4 and [Bmim]PF6 share the same cation, the highest peak areas of all analytes were observed with the BF4<sup>−</sup> anion, possibly because BF4<sup>−</sup> generates a stronger electrostatic interaction with the analytes. Overall, [Bmim]BF4 was chosen as the optimum elution solvent for the subsequent experiments.

#### Ratio of Sample to Adsorbent

The mass ratio of sample to adsorbent is a significant parameter affecting the extraction yield of the target analytes (Rodrigues et al., 2010; Xu et al., 2016). An appropriate ratio not only guarantees that the sample is completely homogenized and dispersed in the adsorbent, but also decreases the loss of sample during blending. **Figure 4C** shows that the peak areas of the five target compounds increased distinctly when the sample/adsorbent ratio changed from 1:0 to 1:2. A possible reason is that a larger amount of adsorbent produces stronger molecular interactions, such as hydrogen bonding and electrostatic forces, between analytes and adsorbent. However, a further increase in the proportion of adsorbent led to a slight decrease in extraction efficiency. The reason may be that the excess adsorbent interacted so strongly with the five compounds that the eluent could not elute them completely. Thus, a sample/adsorbent ratio of 1:2 was selected.

#### Concentration and Volume of Ionic Liquid

The concentration and volume of [Bmim]BF4 are also crucial parameters affecting the extraction yield of the target analytes in the elution process. The results (**Figure 4D**) showed that the peak areas of the target compounds increased gradually as the concentration rose from 50 to 90 mM. A possible reason is that the π-π, electrostatic and hydrogen-bond interactions between [Bmim]BF4 and the target analytes were stronger than the interaction between the analytes and the adsorbent. However, the extraction efficiency decreased slightly when the concentration of [Bmim]BF4 was increased further. This could be ascribed to the higher viscosity, which reduced the transfer of the target analytes from the adsorbent into the eluent. Thus, 90 mM [Bmim]BF4 was taken as the optimum concentration for further experiments.

To attain the highest extraction efficiency with the minimum volume of IL, volumes of 0.5–2 mL of 90 mM [Bmim]BF4 were investigated. **Figure 4E** shows that the contents of the five compounds increased as the volume rose from 0.5 to 1.5 mL. The reason could be that a larger volume of eluent interacts more strongly with the analytes. Nevertheless, the peak areas of the five compounds remained unchanged when the volume was increased further. A t-test was performed in Microsoft Excel (version 2010) to evaluate the difference between the two groups. The results showed no significant difference between 1.5 and 2 mL of 90 mM [Bmim]BF4. Consequently, 1.5 mL of [Bmim]BF4 was used in the following MSPD extraction procedure.
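The comparison between the 1.5 mL and 2 mL groups can be illustrated with a two-sample t-test. This is only a sketch: the text does not say which t-test variant was used, so an equal-variance (pooled) test is assumed, the peak areas are invented, and the critical value 2.776 is the standard two-tailed value for α = 0.05 with 4 degrees of freedom (n = 3 per group).

```python
from statistics import mean, variance

# Two-tailed critical t for alpha = 0.05 and 4 degrees of freedom (3 + 3 - 2).
T_CRIT_DF4 = 2.776


def pooled_t(a, b):
    """Student's t statistic for two independent samples with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5


# Hypothetical triplicate peak areas for the two eluent volumes.
area_1_5_ml = [1510.0, 1495.0, 1502.0]
area_2_0_ml = [1508.0, 1499.0, 1511.0]

t_stat = pooled_t(area_1_5_ml, area_2_0_ml)
significant = abs(t_stat) > T_CRIT_DF4  # False for these illustrative values
```

With these invented numbers |t| is well below the critical value, which corresponds to the "no significant difference" outcome described above.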

#### Grinding Time

The grinding time is a vital parameter in the VHMSPD method. To evaluate its effect, five grinding times (0, 1, 2, 3, and 4 min) were tested. As shown in **Figure 4F**, the peak areas of the five target analytes increased as the grinding time increased from 1 to 3 min. It is likely that longer grinding produced a stronger interaction between AQ C18 and the analytes, facilitating the transfer of the target analytes from the sample onto the adsorbent. However, the extraction efficiency after 3 min of grinding was approximately equal to that after 4 min. Thus, a grinding time of 3 min was chosen for further experiments.

#### Vortex Time

Previous studies have shown that vortex time is an important factor influencing extraction efficiency. As shown in **Figure 4G**, the peak areas of the five target compounds increased significantly as the vortex time increased from 15 to 30 s. However, increasing the vortex time to 60 s resulted in a slight decline in the peak areas. Thus, 45 s was chosen as the optimal vortex time. The final optimized AQ C18-IL-VHMSPD conditions were 25 mg of PC sample, 50 mg of AQ C18, a grinding time of 3 min, 1.5 mL of 90 mM [Bmim]BF4 as the elution solvent, and a vortex time of 45 s.

## Method Validation

#### Selectivity and Linearity

The calibration curves (n = 8) of the five analytes were obtained by plotting peak area (Y-axis) against concentration in µg mL<sup>−1</sup> (X-axis) over the range 0.08–200 µg mL<sup>−1</sup>. The correlation coefficients (R<sup>2</sup>) of all analytes were higher than 0.9997 (**Table 1**).

#### TABLE 2 | The results of precision and stability.

#### TABLE 3 | The results of recovery test (n = 6).

#### Limits of Detection and Quantification

The limit of detection (LOD) and limit of quantification (LOQ) were used to assess the sensitivity of the developed method. They were estimated as the analyte concentrations at which the signal-to-noise (S/N) ratio reached 3 and 10, respectively. The LODs of the five compounds ranged from 0.0002 to 0.0012 µg mL<sup>−1</sup>, while the LOQs ranged from 0.001 to 0.0054 µg mL<sup>−1</sup> (**Table 1**).
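One common way to estimate these limits from a single low-level injection is to scale the measured S/N to the target ratios of 3 and 10. The sketch below assumes that S/N is proportional to concentration near the limit, which is an approximation; the input values are hypothetical and `limits_from_sn` is an illustrative name.

```python
def limits_from_sn(conc_ug_per_ml, sn_ratio):
    """Estimate (LOD, LOQ) by scaling a measured S/N to 3 and 10.

    Assumes S/N is proportional to concentration near the limit.
    """
    conc_per_sn_unit = conc_ug_per_ml / sn_ratio
    return 3.0 * conc_per_sn_unit, 10.0 * conc_per_sn_unit


# Hypothetical injection: 0.004 µg/mL standard giving S/N = 40.
lod, loq = limits_from_sn(0.004, 40.0)
```

In practice the estimate is usually confirmed by injecting a standard at the calculated concentration and checking the observed S/N.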

#### Reproducibility

Repeatability was evaluated with six parallel AQ C18-IL-VHMSPD extracts of one PC sample. As summarized in **Table 1**, the relative standard deviations (RSDs) were all less than 3.7%, confirming that the developed method had good reproducibility.
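The RSD used throughout the validation is the sample standard deviation expressed as a percentage of the mean. A minimal sketch with six hypothetical replicate contents (mg g<sup>−1</sup>) follows; the values are invented for illustration only.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical contents (mg/g) from six parallel extracts of one sample.
replicates = [2.01, 1.98, 2.05, 2.00, 1.96, 2.03]
rsd = rsd_percent(replicates)
```

For these invented values the RSD is about 1.6%, which would fall inside the 3.7% bound reported above.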

#### Precision, Stability and Recovery

Instrumental precision was expressed as intra-day and inter-day precision, determined from the relative standard deviations (RSDs) of six replicates at three concentration levels for each compound. Intra-day and inter-day precision were tested within a single day and over 3 consecutive days, respectively. The validation results are presented in **Table 2**. The accuracies were within the ranges of 95.2–104.2% (RSD ≤ 3.3%) and 96.2–114.6% (RSD ≤ 3.1%) for intra-day and inter-day precision, respectively.

Stability was investigated by analyzing the accuracies at three concentration levels of the five compounds at room temperature over 24 h. The accuracies of the five compounds ranged from 95.8 to 114.8%, with RSDs of no more than 2.6%, showing that the analytes were stable from 0 to 24 h at room temperature.

#### TABLE 4 | Contents of the five flavonoids of Platycladi Cacumen samples from 8 batches (n = 6).

<sup>a</sup>Crude Platycladi Cacumen samples; <sup>b</sup>processed Platycladi Cacumen samples; <sup>∗</sup>the Platycladi Cacumen sample was extracted by the authoritative extraction method (Pharmacopoeia of China 2015).

TABLE 5 | Comparison of the AQ C18-IL-VHMSPD method with other methods in the determination of compounds in Platycladi Cacumen sample.

To verify the accuracy of the proposed method, recovery tests were performed by analyzing the spiked sample in triplicate. Unspiked samples and spiked samples were simultaneously extracted using the optimum AQ C18-IL-VHMSPD procedures. The results are listed in **Table 3**. The mean recoveries of 5 compounds were all in the range of 96.9–103.6% and the RSDs were all less than 2.8%, which demonstrated that the proposed AQ C18-IL-VHMSPD method was reliable and effective.
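Spike recovery is conventionally calculated as the amount found in the spiked sample minus the amount found in the unspiked sample, divided by the amount added. The sketch below uses that conventional formula with hypothetical amounts; the text does not give the exact expression the authors used.

```python
def recovery_percent(found_spiked, found_unspiked, amount_added):
    """Spike recovery (%) = (found in spiked - found in unspiked) / added * 100."""
    return 100.0 * (found_spiked - found_unspiked) / amount_added


# Hypothetical amounts in µg: 50.0 native, 50.0 added, 99.2 found after spiking.
recovery = recovery_percent(99.2, 50.0, 50.0)
```

A value close to 100% indicates that the extraction neither loses the analyte nor overestimates it.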

#### Application

The above results demonstrated that the proposed method is of practical value. The developed AQ C18-IL-VHMSPD method was therefore applied under the optimized conditions to analyze the target compounds of different polarities in six batches of crude Platycladi Cacumen and two batches of carbonized Platycladi Cacumen obtained from various producing areas. The contents of myricitrin, isoquercitrin, quercitrin, amentoflavone and hinokiflavone in crude PC were in the ranges of 1.64–2.10, 0.11–0.16, 3.41–4.13, 0.40–0.48, and 0.20–0.26 mg g<sup>−1</sup>, respectively (**Table 4**). The contents of myricitrin, isoquercitrin, quercitrin, amentoflavone and hinokiflavone in carbonized PC were in the ranges of 0.06–0.17, 0.00–0.01, 0.03–0.25, 0.02–0.06, and 0.02–0.04 mg g<sup>−1</sup>, respectively. The results clearly revealed that the contents of the five target compounds decreased remarkably after processing.
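Contents in mg g<sup>−1</sup> follow from the concentration measured in the extract, the eluent volume and the sample mass. The sketch below uses the 1.5 mL eluent and 25 mg sample from the optimized procedure as defaults and assumes that the extract is injected without further dilution; the concentration value is hypothetical.

```python
def content_mg_per_g(conc_ug_per_ml, extract_volume_ml=1.5, sample_mass_mg=25.0):
    """Convert an extract concentration to analyte content in the herb.

    µg/mL * mL = µg of analyte; µg per mg of sample equals mg per g.
    Assumes no dilution between extraction and injection.
    """
    return conc_ug_per_ml * extract_volume_ml / sample_mass_mg


# Hypothetical extract concentration of 30 µg/mL.
content = content_mg_per_g(30.0)
```

With these defaults, 30 µg mL<sup>−1</sup> in the extract corresponds to 1.8 mg g<sup>−1</sup> in the sample.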

To compare the proposed AQ C18-IL-VHMSPD method with the conventional ultrasonic-assisted extraction of the Pharmacopoeia of China 2015, the contents of the five target compounds in the same batches of PC were determined by both methods. No significant difference was found between the contents of the five components obtained by the two methods. The results indicated that the developed AQ C18-IL-VHMSPD method was almost as effective as the method of the Pharmacopoeia of China 2015 for extracting PC.

#### Comparison With Other Methods

To evaluate the performance of the proposed AQ C18-IL-VHMSPD method, it was compared with several other methods, including ultrasonic-assisted extraction and reflux extraction, in terms of sample amount, extraction solvent, solvent volume, extraction time and detection time. As summarized in **Table 5**, the developed AQ C18-IL-VHMSPD method has shorter extraction and detection times than the other methods. Except for method 2 in **Table 5** (deep eutectic solvent-based ultrasonic-assisted extraction) and the AQ C18-IL-VHMSPD method, all methods employed a large volume of organic solvent. Moreover, the proposed method required less extraction time and detection time than the deep eutectic solvent-based ultrasonic-assisted extraction method. Overall, these results indicated that the AQ C18-IL-VHMSPD method is a rapid, simple and environmentally friendly method for extracting high-polarity and low-polarity compounds from PC samples.

#### CONCLUSION

An environmentally friendly sample pretreatment method, AQ C18-based vortex-homogenized matrix solid-phase dispersion with ionic liquid, was successfully developed to extract and quantify target analytes of different polarity from Platycladi Cacumen by UHPLC-PDA. AQ C18 was employed as the adsorbent to improve the adsorption capacity for compounds of different polarity. The use of green ionic liquids reduced environmental pollution. Compared with other extraction methods (UAE and reflux extraction), the present method is rapid, time-saving and efficient. The proposed method could be used for the determination of compounds of different polarity in other traditional Chinese medicines.

### AUTHOR CONTRIBUTIONS

Y-xC, GT, XG, and JL designed the experiments. MD and SZ performed the experiments. MD wrote the manuscript.

## FUNDING

This research was supported by the National Natural Science Foundation of China (81374050 and 81703702) and the Special Program of Talents Development for Excellent Youth Scholars in Tianjin of China.

## REFERENCES

the determination of macrolide antibiotics in sheep's milk. Food Chem. 134, 553–558. doi: 10.1016/j.foodchem.2012.02.120

solid phase dispersion via ultrahigh performance liquid chromatography coupled with an ultraviolet detector and quadrupole time-of-flight tandem mass spectrometry. J. Chromatogr. A 1436, 64–72. doi: 10.1016/j.chroma.2016.01.046


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ding, Li, Zou, Tang, Gao and Chang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Comprehensive Comparative Study for the Authentication of the Kadsura Crude Drug

Jiushi Liu1,2, Xueping Wei1,3, Xiaoyi Zhang1,2, Yaodong Qi1,3, Bengang Zhang1,3 , Haitao Liu1,2 \* and Peigen Xiao1,2

1 Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China, <sup>2</sup> Key Laboratory of Bioactive Substances and Resources Utilization of Chinese Herbal Medicine (Peking Union Medical College), Ministry of Education, Beijing, China, <sup>3</sup> Engineering Research Center of Traditional Chinese Medicine Resources, Ministry of Education, Beijing, China

The stems and roots of Kadsura species have long been used as folk medicines in Traditional Chinese medicine (TCM). Among these species, K. coccinea, K. heteroclita and K. longipedunculata are the most widely distributed in south and southwest China. Owing to their similar appearance, the crude drugs are often confused by folk doctors and even by some pharmaceutical factories. To discriminate the crude drugs, haplotype analysis based on cpDNA markers and ITS was first employed in this study. Generic delimitation, interspecific relationships, and the identification of medicinal materials of K. longipedunculata and K. heteroclita remained unresolved by the existing molecular fragments. The source plants could be identified through the morphological characters of flower, fruit and leaf. However, in most situations collectors cannot observe these characters because reproductive organs are absent, and they lack experience with the minor differences and transitional variation in leaf morphology. Chemical characterization showed that chemometric analysis of chemical composition had higher resolution for discriminating the three Kadsura herbs. In conclusion, this integrative approach involving molecular phylogeny, morphology and chemical characterization could be applied to the authentication of Kadsura. Our study suggests the use of this comprehensive approach for accurate characterization of these closely related taxa as well as for identifying the source plants and confused herbs of TCM.

Keywords: Kadsura, morphology, molecular markers, chemical characterization, identification

#### Edited by:

Karl Tsim, Hong Kong University of Science and Technology, Hong Kong

#### Reviewed by:

Shuai Ji, Xuzhou Medical University, China; Jianping Chen, Shenzhen Traditional Chinese Medicine Hospital, China

> \*Correspondence: Haitao Liu htliu0718@126.com

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 25 June 2018 Accepted: 31 December 2018 Published: 22 January 2019

#### Citation:

Liu J, Wei X, Zhang X, Qi Y, Zhang B, Liu H and Xiao P (2019) A Comprehensive Comparative Study for the Authentication of the Kadsura Crude Drug. Front. Pharmacol. 9:1576. doi: 10.3389/fphar.2018.01576

#### INTRODUCTION

Traditional Chinese medicine (TCM) is widely accepted in the health care system and has made a significant contribution to the prevention and treatment of human diseases. This extensive use warrants safety measures, so TCM drug safety monitoring and quality control are becoming increasingly important for guaranteeing the safety and efficacy of TCM treatments (Chen et al., 1999). However, the use of substitute products and confused materials still aggravates the chaotic situation in clinical application. It is important to find a reliable way of distinguishing them from each other (Li et al., 2015).

Kadsura belongs to the economically and medicinally important family Schisandraceae, and eight species are mainly distributed in southwest and southeast China (Saunders, 1998; Wu et al., 2008). In China, the stems and roots of Kadsura species are commonly used as folk medicines, and five species are documented in the official Pharmacopoeia and folk records (GuangXi Zhuang Autonomous Region Health Department, 1992; Guangdong Food and Drug Administration, 2004; FuJian Food and Drug Administration, 2006; Chinese Pharmacopeia Commission, 2015). The stems of K. heteroclita, K. longipedunculata and K. coccinea are the most widely used in south China; they have a long history of medicinal application and are often confused in clinical practice. Numerous phytochemical and pharmacological studies have focused on their health benefits (Liu et al., 2014). The three crude drugs differ in clinical efficacy: K. heteroclita is used for the treatment of rheumatic arthralgia, K. longipedunculata for irregular menstruation, and K. coccinea for gastric and duodenal ulcers. Owing to their similar appearance, these crude drugs are often confused by folk doctors and even by some pharmaceutical factories. A reliable and accurate way of distinguishing the three Kadsura crude drugs is therefore urgently needed (Liu et al., 2012).

To identify these medicinal herbs by molecular sequences, Zhou et al. chose psbA-trnH for distinguishing eight species. Although they found a stable single nucleotide polymorphism (SNP), SNPs could not be further analyzed as a potential tool for distinguishing Kadsura crude drugs because too few samples of K. heteroclita were available (Zhou et al., 2016). The combination ITS + psbA-trnH + matK + rbcL has been proposed as the most suitable DNA barcode for discriminating the medicinal plants of Schisandra and Kadsura; nonetheless, species resolution was low among closely related species, and K. heteroclita and K. longipedunculata could not be discriminated by the four commonly used DNA barcodes (Zhang et al., 2015). Molecular identification of TCM is objective, more accurate, and easier to perform than traditional identification methods, and has been successfully applied to identify medicinal plants (Kress et al., 2005; Chen et al., 2010; de Boer et al., 2015). However, previous DNA barcoding studies have not effectively resolved the identification of the three Kadsura crude drugs. In this study, we attempt to identify the three crude drugs by haplotype analysis based on cpDNA and ITS markers.

UPLC-QTOF/MS, with its high selectivity and sensitivity, has been successfully applied to metabolite analysis and the identification of complex compounds in herbal materials (Song et al., 2013; Yao et al., 2013; Cubero-Leon et al., 2014). Few studies have investigated the chemical profiles of Kadsura species with regard to safety and efficacy, and a comparative analysis of the chemical composition of these Kadsura herbs is needed. Bioactivity-based characteristics are also good quality indicators, as they are pharmacologically relevant (Liu et al., 2017). Analyzing the differences in the main chemical components of the three medicinal materials could help ensure their rational, safe and effective clinical use.

Taking sample representativeness and experimental economy into account, we surveyed twenty populations across five provinces (Hunan, Guangxi, Guizhou, Chongqing and Sichuan) during the reproductive stage in summer and autumn from June 2016 to December 2017. The overall aim of this study was to explore the usefulness of an authentication approach for three crude drugs of the Kadsura complex using cpDNA and ITS markers, morphology, and UPLC-QTOF/MS chemical profiling. We aimed to compare the genetic polymorphism of haplotypes and analyze population differences among the taxa of the Kadsura species, and to distinguish chemotypes of the species complex by comparing their UPLC-QTOF/MS chemical profiles using chemometric data analysis.

Figure panels: K. coccinea, K. heteroclita, K. longipedunculata.

TABLE 1 | Samples of K. longipedunculata, K. coccinea, and K. heteroclita.

## MATERIALS AND METHODS


#### Plant Materials

The three species can be discriminated morphologically during the flowering or fruiting stages of their life cycles. To ensure sampling accuracy, we therefore surveyed multiple populations during the reproductive stage in summer and autumn between June 2016 and December 2017. The Kadsura samples were collected in the main distribution areas in China: Hunan, Guangxi, Guizhou, Chongqing and Sichuan provinces. In total, 52 samples of K. coccinea, K. heteroclita and K. longipedunculata were collected directly from the wild. The 52 leaf samples were dried with silica gel for DNA extraction and stored at 4 °C until use, and 18 stem samples were dried in the shade for UPLC-MS analysis (**Table 1**).

## DNA Extraction, PCR Amplification and Sequencing

Total genomic DNA was extracted from silica gel-dried leaves using the Plant Genomic DNA Kit (Tiangen Biotech, Beijing, China) following the manufacturer's instructions. Three cpDNA markers (matK, rbcL and psbA-trnH) and one nrDNA marker (ITS) were amplified separately for each individual using the primers and protocol of Guo et al. (2015). Sanger sequencing reactions were carried out using the DYEnamic ET Dye Terminator Cycle Sequencing Kit (Amersham Pharmacia Biotech), and the products were sequenced on an ABI 3730XL genetic analyzer (Applied Biosystems, CA, United States).

## Network Analysis of Haplotypes

The DNA sequences were aligned using Clustal X v.1.83 (Thompson et al., 1997) and manually adjusted in BioEdit v.7.0.9 (Hall, 1999). Voucher and GenBank accession numbers are listed in **Supplementary Table S1**. A network of the cpDNA haplotypes (chlorotypes) was constructed using NETWORK 5.0.0.1 (Bandelt et al., 1999), with a default parsimony connection limit of 95% and each insertion/deletion (indel) treated as a single mutation event.
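The haplotype assignment that precedes network construction can be illustrated with a short sketch. It is not the algorithm implemented in NETWORK; it only collapses identical aligned sequences into haplotypes and counts variable alignment columns. It assumes that the sequences are already aligned to equal length, and the sample names and toy sequences are hypothetical. Unlike the indel coding described above, each gap column is counted separately here.

```python
from collections import defaultdict


def collapse_haplotypes(aligned):
    """Group sample names by identical aligned sequence."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    groups = defaultdict(list)
    for name, seq in aligned.items():
        groups[seq.upper()].append(name)
    # Label haplotypes by decreasing frequency; ties are broken by sequence
    # so that the labelling is deterministic.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return {f"Hap{i + 1}": sorted(names) for i, (_, names) in enumerate(ordered)}


def count_variable_sites(aligned):
    """Count alignment columns that contain more than one character state."""
    seqs = [seq.upper() for seq in aligned.values()]
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


# Hypothetical toy alignment (not data from this study).
toy = {
    "sample_a": "ACGTAC",
    "sample_b": "ACGTAC",
    "sample_c": "ACGTTC",
    "sample_d": "AC-TTC",
}
haplotypes = collapse_haplotypes(toy)
n_variable = count_variable_sites(toy)
```

In this toy alignment the two identical sequences form the most frequent haplotype, and two columns are variable. A multi-base indel would need to be recoded as one character before counting if it is to be treated as a single event.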

## Sample Preparation and UPLC-QTOF/MS Conditions

HPLC-grade acetonitrile (Merck KGaA, Darmstadt, Germany) and formic acid (Fisher Scientific, NH, United States) were utilized for UPLC analysis. Pure water (18.2 MΩ·cm) for UPLC analysis was obtained from a Milli-Q system (Millipore, MA, United States). All other chemicals were of analytical grade.

Kadsura samples (0.5000 g, 65-mesh) were accurately weighed and extracted with 25 mL methanol by ultrasonication (35 kHz) for 30 min. After centrifugation at 10,000 × g for 10 min, the supernatant was stored at 4 °C and filtered through a 0.22 µm membrane prior to injection into the UPLC system.

A Thermo Scientific™ Dionex™ UltiMate™ 3000 Rapid Separation LC (RSLC) system was used for the UHPLC separations. Mobile phase A was water and mobile phase B was acetonitrile; both contained 0.1% formic acid. The optimized gradient was as follows: 0–3 min, 2–20% B; 3–4.5 min, 20–75% B; 4.5–6.5 min, 75–100% B; 6.5–15 min, 100% B; 15–15.5 min, 100–5% B; 15.5–17 min, 5% B. The column was an HSS T3 column (2.1 mm × 100 mm, 1.7 µm; Waters) operated at 45 °C. The flow rate was 300 µL/min and the injection volume was 2 µL.
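For readers who wish to check or replot the gradient program, the breakpoints listed above can be encoded as a small lookup. This is a minimal sketch that assumes each segment is linear between its stated start and end compositions; the function name `percent_b` is ours and does not correspond to any instrument software.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient in the text.
GRADIENT = [
    (0.0, 2.0),
    (3.0, 20.0),
    (4.5, 75.0),
    (6.5, 100.0),
    (15.0, 100.0),
    (15.5, 5.0),
    (17.0, 5.0),
]


def percent_b(t_min, program=GRADIENT):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable for a contiguous program")


midpoint_first_ramp = percent_b(1.5)   # halfway through the 2-20% B ramp
plateau = percent_b(10.0)              # within the 100% B hold
re_equilibration = percent_b(16.0)     # within the final 5% B hold
```

The same table can be reused to estimate solvent consumption: at 300 µL/min, a 17 min run uses 5.1 mL of mobile phase in total.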

A Thermo Scientific™ Q Exactive™ hybrid quadrupole Orbitrap mass spectrometer equipped with a HESI-II probe was employed. The HESI-II spray voltage was 3.7 kV in positive mode, the heated capillary temperature was 320 °C, the sheath gas pressure was 30 psi, the auxiliary gas setting was 10 psi, and the heated vaporizer temperature was 300 °C. Both the sheath gas and the auxiliary gas were nitrogen. The collision gas was argon at a pressure of 1.5 mTorr. The parameters of the full mass scan were as follows: a resolution of 70,000, an automatic gain control target under 1 × 10<sup>6</sup>, a maximum isolation time of 50 ms, and an m/z range of 150–1500. The Q Exactive calibration was customized to keep the mass tolerance within 5 ppm. The LC-MS system was controlled using Xcalibur 2.2 SP1.48 software (Thermo Fisher Scientific), and data were collected and processed with the same software.
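The 5 ppm mass tolerance mentioned above translates into an absolute m/z window that scales with the ion mass. The following sketch shows the arithmetic; the function names and the example m/z values are hypothetical and are not taken from the study data.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mz_window(theoretical_mz, tol_ppm=5.0):
    """Lower and upper m/z limits for a given ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=5.0):
    """True if the measured m/z lies within tol_ppm of the theoretical value."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical ion at m/z 400.0000: 5 ppm corresponds to +/- 0.0020 m/z units.
low, high = mz_window(400.0)
accepted = within_tolerance(400.0012, 400.0)   # about 3 ppm
rejected = within_tolerance(400.0040, 400.0)   # about 10 ppm
```

Because the window is proportional to m/z, the same 5 ppm tolerance is about ±0.00075 at m/z 150 and ±0.0075 at m/z 1500, the limits of the scan range used here.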

#### UPLC-QTOF/MS Data Analysis

UPLC-QTOF/MS data for the Kadsura samples were analyzed to identify potential discriminant variables. Peak finding, alignment and filtering of the ES raw data were carried out using Xcalibur 2.2 SP1.48 software (Thermo Fisher Scientific). The parameters used were as follows: retention time (tR) of 0.5–10.5 min, mass of 150–800 Da, retention time tolerance of 0.05 min, and mass tolerance of 0.02 Da. Three replicate samples collected from each geographic location were used (n = 3). A total of 3,114 variables were used to create the model. The resulting data were analyzed by heatmap analysis with MetaboAnalyst, a web-based tool for chemometric visualization (Deng et al., 2014). Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were applied to discriminate the three Kadsura species using EZinfo 2.0 software (Massart et al., 1998; Xia et al., 2012).
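The filtering and alignment parameters given above (retention time 0.5–10.5 min, mass 150–800 Da, tolerances of 0.05 min and 0.02 Da) can be expressed as two simple predicates. This is only an illustration of how such tolerances are applied, not the procedure implemented in Xcalibur; the `Peak` type, the function names and the example peaks are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    rt_min: float      # retention time in minutes
    mz: float          # mass-to-charge ratio
    intensity: float   # arbitrary units


def in_window(peak, rt_range=(0.5, 10.5), mz_range=(150.0, 800.0)):
    """Keep only peaks inside the retention-time and mass windows."""
    return (rt_range[0] <= peak.rt_min <= rt_range[1]
            and mz_range[0] <= peak.mz <= mz_range[1])


def same_feature(a, b, rt_tol=0.05, mz_tol=0.02):
    """Treat two peaks from different runs as one variable if both tolerances hold."""
    return abs(a.rt_min - b.rt_min) <= rt_tol and abs(a.mz - b.mz) <= mz_tol


# Hypothetical peaks from two runs (not study data).
run1_peak = Peak(rt_min=5.00, mz=401.200, intensity=1.0e5)
run2_peak = Peak(rt_min=5.03, mz=401.210, intensity=0.9e5)
shifted_peak = Peak(rt_min=5.10, mz=401.200, intensity=1.1e5)
late_peak = Peak(rt_min=11.00, mz=300.000, intensity=2.0e4)

aligned = same_feature(run1_peak, run2_peak)         # within both tolerances
not_aligned = same_feature(run1_peak, shifted_peak)  # retention time differs by 0.10 min
kept = [p for p in (run1_peak, run2_peak, shifted_peak, late_peak) if in_window(p)]
```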

#### RESULTS

#### Morphology

Morphological variation in Kadsura is relatively complex, and the species are closely related to one another. Nevertheless, there are some morphological differences among the three species. We examined the main morphological characters (male flower, fruit shape, stem and leaf; listed in **Table 2**) of K. coccinea, K. heteroclita and K. longipedunculata in specimens and natural populations, and established a morphological basis for identifying the three Kadsura species. The morphology of the male flower, fruit, stem and leaf is an important basis for identifying Kadsura (**Figure 1**). In most situations, however, collectors have no chance to observe these characters, because the flowering and fruiting periods are comparatively short or because these organs are lacking in some habitats and in young individuals. Minor differences and transitional variation in leaf morphology make identification difficult for inexperienced collectors. It is therefore important to find a reliable and accurate way of distinguishing the three crude drugs.

#### Haplotypes Network

The alignment of the three combined cpDNA fragments yielded 13 haplotypes (C1–C13) with 52 variable characters, and the ITS alignment yielded 29 haplotypes (H1–H29) with 49 variable characters. K. coccinea had eight private chloroplast haplotypes (**Figure 2A**; C1–C8), while K. longipedunculata and K. heteroclita shared the three main haplotypes (C9, C11, and C12); the two rare haplotypes C10 and C13 were fixed in K. heteroclita and K. longipedunculata, respectively. In the ITS haplotype network (**Figure 2B**), K. coccinea had four private haplotypes (H1–H4) that were quite different from the others. Seventeen haplotypes were fixed in K. longipedunculata and seven in K. heteroclita. Only one haplotype, H17, was shared by K. longipedunculata and K. heteroclita. H17 was one of the main ITS haplotypes and formed the center of the "star-like" haplotype network of K. longipedunculata and K. heteroclita. Both the cpDNA markers and ITS clearly distinguish K. coccinea from K. longipedunculata and K. heteroclita. However, K. longipedunculata and K. heteroclita are phylogenetically very closely related: their haplotype networks are star-like and they share the main haplotypes in both the cpDNA markers and ITS (**Supplementary Table S2**).

## UPLC-QTOF/MS Analysis

Twenty compounds were tentatively identified from the retention time (min), parent ion [M+H]<sup>+</sup>, MS/MS fragmentation pattern and calculated molecular formula of each peak, and by matching these data with those reported previously (**Table 3**). For example, in the low-energy spectrum schisandrin gave the protonated ion [M+H]<sup>+</sup> at m/z 433.1866. Further confirmation of schisandrin was provided by the high-energy function: a fragment at m/z 415.2115 was identified as [M-H2O+H]<sup>+</sup>, and a fragment at m/z 384.1932 was assigned to the further loss of a methoxy group, corresponding to [M-H2O-OCH3+H]<sup>+</sup>. The compound was identified as schisandrin on the basis of the parent ion and these characteristic fragment ions.

As depicted in **Figure 3**, (a) the heatmaps show that the chemical components of the three Kadsura species were very different, with the components of K. longipedunculata closest to those of K. heteroclita; and (b) among the identified compounds, kadangustin L, gomisin H, and ananolignan A were present at relatively high levels in K. coccinea but at low levels in K. heteroclita and K. longipedunculata, whereas kadangustin E, kadsumarin A, interiotherin A and kadoblongifolin B were present mainly in K. longipedunculata, followed by K. heteroclita, with only limited concentrations in K. coccinea. Previous studies have suggested that these compounds are found in high concentrations in K. heteroclita, which is supported by our results (Liu et al., 2014).

The two-component PCA model cumulatively accounted for 46.04% of the variation (PC1, 36.43%; PC2, 9.61%). The PCA score plot shows that the three species were very well separated. Group I was formed by K. coccinea, Group II consisted of K. heteroclita and Group III was formed by K. longipedunculata (**Figure 4**).
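The cumulative figure quoted above is simply the sum of the variance fractions of the first two components (36.43% + 9.61% = 46.04%). How such fractions arise can be sketched in plain Python. This is a didactic illustration using power iteration with deflation on a hypothetical toy matrix; it is not the algorithm used by EZinfo, and the start vector is an arbitrary choice that could fail for contrived inputs.

```python
import math


def center_columns(rows):
    """Subtract the column mean from every variable."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of column-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / (k + 1) for k in range(p)]  # arbitrary, non-symmetric start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


def explained_variance_ratios(rows, n_components=2):
    """Fraction of total variance carried by each of the first components."""
    cov = covariance(center_columns(rows))
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    ratios = []
    for _ in range(n_components):
        lam, v = leading_eigenpair(cov)
        ratios.append(lam / total)
        # Deflation: remove the component just found before the next search.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return ratios


# Hypothetical toy data: 4 samples x 3 variables (not study data).
toy = [[2.0, 0.0, 1.0], [-2.0, 0.0, 1.0], [0.0, 1.0, 1.0], [0.0, -1.0, 1.0]]
ratios = explained_variance_ratios(toy)
cumulative = sum(ratios)
```

For the toy matrix the first two components carry 80% and 20% of the variance. In the study, the first two components together carried less than half of the variance, so the score plot shows only part of the chemical variation among samples.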

The PLS-DA model performed well in classifying the three Kadsura species, and Group I was far from Groups II and III (**Figure 5A**). A total of six credible and significant markers that facilitate discrimination of these groups were determined from the S-plot of the PLS-DA, and their identities were tentatively assigned (**Figure 5B**). The components correlated with these six ions were tentatively identified as isomers of kadsumarin A, gomisin H, kadangustin L, interiotherin A, kadangustin E, and kadoblongifolin B. These marker compounds could be used to distinguish the three plant species: the ion intensities of kadsumarin A, kadoblongifolin B, and interiotherin A were higher in K. heteroclita and K. longipedunculata than in K. coccinea, whereas the markers gomisin H and kadangustin L were detected at higher levels in K. coccinea than in the other two species (**Figure 6** and **Supplementary Data Sheet S1**).

#### DISCUSSION

The three Kadsura species are widely distributed in tropical and subtropical evergreen forests south of the Yangtze River in China. Morphological differences among the three species allow them to be discriminated during the flowering or fruiting stages of the life cycle. To ensure accurate sampling, we therefore carried out a multi-population survey during the reproductive stage in summer and autumn, from June 2016 to December 2017. Taking sample representativeness and experimental economy into account, we adopted the following survey and sampling strategy. Eighty individuals were observed in twenty populations across five provinces (Hunan, Guangxi, Guizhou, Chongqing and Sichuan); 45 leaf samples were collected for the DNA barcoding experiment and 18 stem samples were used for metabolite analysis. The samples of each species included ten individuals (DNA test) and six individuals (chemical test) from different populations.

#### Morphology

As mentioned above, the three species are usually discriminated by the shape of the staminate flower torus, the size and shape of the fruits, the length of the fruit stalk and the leaf shape. For example, the staminate flower torus is conical in K. coccinea, elliptical in K. heteroclita and spherical in K. longipedunculata. Fruit size decreases in the order K. coccinea (6–10 cm) > K. heteroclita (2.5–4 cm) > K. longipedunculata (1–3.5 cm). These identifying characteristics are also recorded in FRPS and FOC (Academiae Sinicae Edita, 2004; Flora of China Edita, 2013). During our surveys of wild populations, these morphological characters were very valuable for discriminating the species. Despite the obvious differences between reproductive organs, in most situations collectors have no chance to observe these characters, because the flowering and fruiting periods are comparatively short or because these organs are lacking in some habitats and in young individuals. In addition, although leaf morphology can be observed throughout the growth period, minor differences and transitional variation between species make identification difficult for inexperienced collectors. Consequently, collection of the three crude drugs is uncertain, and they are often confused and used interchangeably in current research and clinical practice. A popular solution nowadays is to extract DNA fragments from dried materials and then conduct DNA barcoding or SNP analysis (Kress et al., 2005; Chen et al., 2010).

#### DNA Sequence

In our study, haplotype analysis based on cpDNA and ITS markers clearly distinguished K. coccinea from K. longipedunculata and K. heteroclita, but could not distinguish K. longipedunculata from K. heteroclita. Haplotype analysis is a molecular approach suited to the study of closely related species and of intraspecific genetic diversity. However, it showed no advantage in delimiting the boundary between K. longipedunculata and K. heteroclita.

K. longipedunculata and K. heteroclita shared the main haplotypes in both the cpDNA markers and ITS. The four suggested DNA barcodes for plants are matK, rbcL, psbA-trnH and ITS (CBOL Plant Working Group, 2009). Chloroplast DNA is characterized by evolutionary conservatism, matrilineal inheritance, and lack of recombination (Wolfe et al., 1987). However, complicated relationships such as potential hybridization, reticulate evolution and gene introgression may further intensify the difficulty of identifying closely related species of Kadsura. The existing DNA barcodes have not effectively resolved the problem of distinguishing K. longipedunculata from K. heteroclita.

#### Chemical Characteristics

The three herbs differ in their metabolite profiles, including lignans, as shown by chemometric analysis. Heatmap analysis, PCA and PLS-DA showed that the chemical constituents of the three medicinal materials differed significantly. We summarized the reported chemical constituents of Kadsura and found that many spirobenzofuranoid dibenzocyclooctadiene compounds have been found in K. heteroclita, tetrahydrofuran-type compounds have been found in K. longipedunculata, and 18 (13→12)-abeo-lanostane and nortriterpenoid compounds have been found only in K. coccinea (Liu et al., 2014). Differences in chemical constituents can influence efficacy and safety. Chemometric analysis can compensate for the limitations of molecular identification and was successfully applied here to identify the three Kadsura crude drugs.

In this study, DNA sequence analysis, a re-examination of morphology, and chemical characterization were applied to identify the three Kadsura crude drugs. The existing molecular fragments could not resolve the identification of medicinal materials from K. longipedunculata and K. heteroclita. Chemometric analysis of the chemical composition provided higher resolution for discriminating the three Kadsura crude drugs; it helps to differentiate the source of samples and to judge the consistency of the three Kadsura species, thereby compensating for the limitations of molecular identification. This paper presents a comprehensive analysis of three Kadsura crude drugs and provides a new research route for confused herbs that combines molecular phylogeny, morphology and chemical composition.

#### AUTHOR CONTRIBUTIONS

JL conducted the field survey, performed the experiments, and wrote the manuscript. XZ and XW assisted JL with the experiments. YQ and HL provided technical guidance and designed the experiments. BZ and PX improved the manuscript.

## FUNDING

The authors are grateful for the financial support provided by the National Natural Sciences Foundation of China (Nos. 81373913 and 81703650) and CAMS Initiative for Innovative Medicine (CAMS-I2M-1-010).

### ACKNOWLEDGMENTS

We thank the reviewers for carefully reviewing our manuscript and making many valuable suggestions.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2018.01576/full#supplementary-material

## REFERENCES



Zhou, H., Ma, S., Chen, B. B., Han, Z. Z., and Yao, H. (2016). Identification of spatholobi caulis, kadsurae caulis, and sargentodoxae caulis using the psbA-trnH barcode. World Sci. Technol. 18, 40–45.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Liu, Wei, Zhang, Qi, Zhang, Liu and Xiao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# St. John's Wort (Hypericum perforatum) Products – How Variable Is the Primary Material?

Francesca Scotti<sup>1</sup>, Katja Löbel<sup>1</sup>, Anthony Booker<sup>1,2</sup> and Michael Heinrich<sup>1</sup>\*

<sup>1</sup> Pharmacognosy and Phytotherapy Group, Pharmaceutical and Biological Chemistry, UCL School of Pharmacy, London, United Kingdom, <sup>2</sup> Division of Herbal and East Asian Medicine, Department of Life Sciences, University of Westminster, London, United Kingdom

Background: Saint John's wort (Hypericum perforatum L., HP) is commonly registered in Europe under the THR scheme (Traditional Herbal Registration) or licensed as a medicine. Nonetheless, unregulated medical products and food supplements, which are often of poor quality, are accessible through the internet. The species' natural distribution stretches through large regions of Europe to China, and four subspecies have been distinguished. When compared to the European Pharmacopoeia reference, the presence of additional compounds was linked to so-called Chinese HP.

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Adolfo Andrade-Cetto, National Autonomous University of Mexico, Mexico Liselotte Krenn, Universität Wien, Austria

> \*Correspondence: Michael Heinrich m.heinrich@ucl.ac.uk

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 16 July 2018 Accepted: 19 December 2018 Published: 24 January 2019

#### Citation:

Scotti F, Löbel K, Booker A and Heinrich M (2019) St. John's Wort (Hypericum perforatum) Products – How Variable Is the Primary Material? Front. Plant Sci. 9:1973. doi: 10.3389/fpls.2018.01973

Aim: In order to obtain an integrated picture of the entire chemoprofile, the chemical composition of HP materia prima was studied using a combination of techniques well-established in the relevant industries. The impact of phytogeographic factors on the materia prima can shed light on whether the variability of the final products is strongly influenced by these factors or whether it relates to poor processing, adulteration, or other factors linked to the processing of the material.

Methods: Eighty-six Hypericum samples (77 H. perforatum) were collected from 14 countries. Most were authenticated and harvested in the wild; others came as roughly ground material from commercial cultivations, markets and pharmacies. The samples were analyzed using HPTLC and <sup>1</sup>H-NMR-based principal component analysis (PCA).

Results and Discussion: Limited chemical variability was found. Nonetheless, the typical fingerprint of Chinese HP was observed in each specimen from China. Additional compounds were also detected in some samples collected in Spain. Rutin is not necessarily present in the crude material. The variability previously found in the marketed products can be ascribed only partially to the geographical origin of harvested material, but mainly to the plant part harvested, closely related to harvesting techniques, processing and probably time of harvest.

Conclusion: HP can be sourced in a consistent composition (and thus quality) from different geographical sources. However, chemical variability needs to be accounted for when evaluating what is considered authentic good material. Therefore, the processing and good practice are all stages of primary importance, calling for a better (self-)regulation and quality assurance along the value chain of an herbal medical product or botanical.

Keywords: avicularin/guaiaverin, Hypericum perforatum (Saint John's wort, SJW), materia prima, natural variability, quality control, subspecies boundaries, value chains

## INTRODUCTION

fpls-09-01973 January 22, 2019 Time: 17:17 # 2

Saint John's wort (Hypericum perforatum L. – HP, Hypericaceae) has been used traditionally across Europe for centuries, and it plays an important medical role in contemporary society. Its renowned ability to treat wounds is being investigated (Oztürk et al., 2007; Süntar et al., 2010, 2011) but, most importantly, it is now widely used as a prescription or over-the-counter medicine to treat minor to moderate depression (licensed medicines) or 'low mood' (registered products). Its activity was found to be comparable to that of antidepressants in mild to moderate depression (Apaydin et al., 2016). In general, it is a licensed drug in many European countries, and in the United Kingdom it is registered under the THR scheme. While considerable effort has gone into understanding the chemistry and pharmacology of commercially used materials, little attention has been paid to the biological and chemical complexity of the starting material and specifically to the diversity within the taxon H. perforatum (Dauncey et al., 2017). HP could be considered an umbrella term for different Hypericum taxa united by their use as herbal medicines, especially in 'mood disorders.'

The species used medicinally in more recent Western medicine is Hypericum perforatum. HP is native to Eurasia: it is found in Europe (excluding the extreme north), the "Levant and western Saudi Arabia to NW India (Uttar Pradesh), Transcaucasia, Turkmenistan to Altai, Angara-Sayan and NW Mongolia; China (W. Xinjiang and from Gansu east to Hebei, south to Jiangxi and west to Yunnan)" (Robson, 2002). It is also found in NW Africa, including the Canary Islands, Madeira and the Azores. It has been introduced to the American continent, where it can be found from Canada to Argentina, and to the Republic of Sudan (Jebel Marra), South Africa, Reunion, Australia, New Zealand, and Japan.

Robson (2002) proposed the distinction of four subspecies, based on minor morphological traits and with a well-defined geographical distribution of two of these subspecies, while the two Central/Western subspecies overlap through a large range of their territories (Dauncey et al., 2017).

Apomictic reproduction is known to give rise to a number of interfertile hybrids that are morphologically different but lie within a continuum, which renders taxonomic identification extremely difficult (Dickinson, 1998; Bicknell and Koltunow, 2004). Over time, this tendency has led to considerable morphological variability within H. perforatum. The state of current knowledge (Robson, 2002) is that these four subspecies possibly originated from a common ancestor (Western Siberia) which interbred with other Hypericum species and gave rise to morphologically distinct, but recurring and geographically restricted, hybrids that are now being recognized as subspecies. According to Robson (2002), ssp. songaricum and ssp. perforatum evolved first from the common ancestor. Subsequently, ssp. veronense and ssp. chinense, respectively, evolved. According to Robson, ssp. songaricum is the closest to the original ancestor. Ssp. chinense is seen as particularly distinct from its predecessor, and further away evolutionarily.

This situation raises two distinct but interrelated questions:


Previous investigations have revealed that the quality of food supplements (botanicals) is variable (Zheng and Navarro, 2015; Booker et al., 2016a,b; Barrella et al., 2017; Ruhsam and Hollingsworth, 2018), and studies of the chemical quality of HP products showed problems specific to this species (Frommenwiler et al., 2016; Booker et al., 2018), including strength and dosage inconsistencies and the presence of food dyes. A chemical pattern previously identified by Huck-Pezzei et al. (2013) and Frommenwiler et al. (2016) was initially labeled "Chinese HP" as it was almost only found in commercial products of Chinese origin. The Chinese material, and therefore ssp. chinense, has been thought to constitute a specific chemotype differing from the other subspecies. The previously found Chinese HP fingerprint (Frommenwiler et al., 2016; Booker et al., 2018) showcased three main features that differ from the EP and USP HPTLC standards and the raw herbal material analyzed in those studies:


As a consequence, it was thought expedient to analyze HP crude drug material collected across the world and evaluate the chemical profiles via nuclear magnetic resonance (NMR) spectroscopy and high performance thin layer chromatography (HPTLC) to verify whether this profile can only be found among Chinese specimens and whether other chemical variation exists in the naturally occurring crude drug material around the globe.

#### AIMS AND OBJECTIVES


This has been achieved by analyzing samples from different countries across the world and comparing the data with those provided by the previous study conducted on the finished products available on the market (Booker et al., 2018).

#### MATERIALS AND METHODS

All solvents were purchased from Merck KGaA, Fisher Scientific Ltd. and VWR International LLC, except for deuterated methanol, which was purchased from Cambridge Isotope Laboratories Inc.

#### Sample Collection


Eighty-six samples (for a detailed description see the **Supplementary Material**) were collected and dried at room temperature and voucher specimens for the unprocessed samples are deposited at UCL School of Pharmacy Herbarium.

Specimens were harvested in the wild or obtained from commercial cultivations, the flowering aerial parts were collected (unless differently stated in the **Supplementary Material**) and dried in the shade.

Commercial processed samples were purchased or donated in the form of roughly ground plant material.

#### Reference Standards

St. John's wort dry extract (European Pharmacopoeia, EP, Reference Standard 01131, code: Y0001050, batch: 2.0), avicularin and guaiaverin were obtained from Sigma-Aldrich Inc., while rutin (L10815B002) was obtained from Adooq Bioscience. Hypericin primary reference standard (batch HWI 01814-1) was purchased from HWI ANALITIK GmbH.

### <sup>1</sup>H-NMR Spectroscopy

Nuclear magnetic resonance analysis was carried out using a Bruker Avance spectrometer featuring a QNP multi-nuclear probe head with z-gradient/5 mm cryoprobe head operating at 500.13 MHz. Spectra were acquired at 298 K, using 64k data points, line broadening factor = 0.16 Hz, pulse width = 30°, relaxation delay d1 = 1 s. Each run was subjected to 256 scans. The acquired data were processed using TopSpin 3.2 software. Chemical shifts were calibrated to the tetramethylsilane (TMS) signal.

#### Sample Preparation for NMR

The dried material was ground finely using a blender. The solvent choice and the analytical methods followed the outline of the previous study on HP marketed products (Booker et al., 2018) for the purpose of comparability.

Fifty milligrams of powder was extracted in 1 mL deuterated methanol, vortexed for 20 s, sonicated for 5 min at room temperature and finally centrifuged for 5 min at 13,000 rpm. 0.6 mL of supernatant was sent for analysis.

Reference standard pure compounds were simply dissolved in methanol, at the concentration of 1 mg/mL and 0.6 mL was analyzed.

#### Principal Component Analysis of Data

<sup>1</sup>H-NMR signals were calibrated to the TMS peak. The acquired spectra were converted to ASCII files using AMIX 3.9.14. Using only positive intensities and no scaling, buckets of 0.04 ppm were created using the multivariate analysis software. The processed NMR data were transferred via Excel into SIMCA 14.0, the software used for the principal component analysis (PCA). Sample 22 was analyzed twice and, after different trials, it was established that no scaling in SIMCA gave the most reliable statistical model, judged by the proximity of the sample 22 repeats in the plot.
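The bucketing step described above can be sketched as follows. This is a minimal illustration rather than the AMIX implementation: it assumes that intensities within each 0.04 ppm bucket are summed, that only positive intensities are kept, and that the spectral range is 0–10 ppm. The range, the function name and the example points are assumptions made for the sketch.

```python
def bucket_spectrum(ppm, intensity, width=0.04, ppm_min=0.0, ppm_max=10.0):
    """Sum positive intensities into fixed-width chemical-shift buckets."""
    n_buckets = round((ppm_max - ppm_min) / width)
    buckets = [0.0] * n_buckets
    for shift, value in zip(ppm, intensity):
        if value <= 0.0 or not (ppm_min <= shift < ppm_max):
            continue  # discard non-positive points and points outside the range
        index = min(int((shift - ppm_min) / width), n_buckets - 1)
        buckets[index] += value
    return buckets


# Hypothetical spectrum points (chemical shift in ppm, intensity in arbitrary units).
shifts = [1.01, 1.02, 1.05, 7.30]
values = [10.0, 5.0, -3.0, 2.0]
bucketed = bucket_spectrum(shifts, values)
```

Each sample then contributes one row of bucket intensities to the data matrix used for PCA. With no scaling applied, buckets containing intense signals dominate the model, which is the behavior the authors selected on the basis of the sample 22 repeats.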

## HPTLC

HPTLC was performed using a CAMAG setup consisting of a Linomat 5 semi-automated sampler, automatic developing chamber 2 (ADC2), TLC plate heater III and TLC visualizer coupled to visionCATS 2.1 software. The HPTLC Silica gel 60 F<sub>254</sub> plates used as the stationary phase were purchased from Merck KGaA (Darmstadt, Germany).

#### Sample Preparation for HPTLC Analysis

Five hundred milligrams of powdered material was extracted with 5 mL of methanol, shaken on a rotary mixer for 20 s, sonicated for 10 min at 60 °C and filtered using a Millex® syringe filter unit (0.45 µm).

The references hypericin and quercetin were dissolved in methanol, and rutin in acetone, at a concentration of 1 mg/mL, and then sonicated for 10 min at 60 °C. Rutin had to be filtered through a Millex® syringe filter unit (0.45 µm) to remove any residual suspended particles prior to use. The EP standard was prepared in methanol at a concentration of 100 mg/mL.

#### HPTLC Analysis

The method used reflects the one published by the HPTLC association for the extraction and analysis of HP powdered drug (HPTLC, 2016). Each plate was visualized under white light and UV 254 nm prior to sample application, in order to later correct for the background. Samples and standards (2 µL each) were applied to the plates in bands of 8 mm. The plate was developed in the automatic developing chamber at 33% humidity, with 20 min saturation time, 10 min activation time and 5 min pre-drying. The mobile phase consisted of a freshly prepared mixture of ethyl acetate, dichloromethane, HPLC-grade water, formic acid and glacial acetic acid in the proportion 100:25:11:10:10 (v/v/v/v/v). After development, the plate was visualized under white light, UV 254 nm and UV 366 nm. Prior to derivatization, the plate was heated at 100 °C for 3 min and subsequently dipped, while still hot, first in NP reagent (1 g 2-aminoethyl diphenylborinate in 200 mL ethyl acetate) and then in PEG reagent (10 g polyethylene glycol 400 in 200 mL dichloromethane) for the detection of flavonoids. The plate was then visualized under white light and UV 366 nm.
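The mobile phase ratio of 100:25:11:10:10 sums to 156 parts, so the volume of each component for any batch size follows by proportion. The sketch below shows the calculation; the function name and the 78 mL batch size are hypothetical, and the results are nominal volumes to be measured separately before mixing, since volumes are not strictly additive.

```python
# Volume parts taken from the mobile phase composition in the text.
MOBILE_PHASE_PARTS = {
    "ethyl acetate": 100,
    "dichloromethane": 25,
    "water": 11,
    "formic acid": 10,
    "glacial acetic acid": 10,
}


def component_volumes(total_ml, parts=MOBILE_PHASE_PARTS):
    """Split a nominal total volume according to the volume ratio."""
    total_parts = sum(parts.values())
    return {name: total_ml * share / total_parts for name, share in parts.items()}


# Hypothetical batch size of 78 mL, i.e. half of the 156-part recipe.
volumes = component_volumes(78.0)
```

For a 78 mL batch this gives 50 mL ethyl acetate, 12.5 mL dichloromethane, 5.5 mL water, and 5 mL each of formic acid and glacial acetic acid.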

#### RESULTS AND DISCUSSION

With the intention to define the chemical profile of HP, the project embarked on the analysis of samples trying to identify the common as well as the variable chemical components of HPs from different geographical regions. Therefore, our collection of 77 HP samples from 14 different countries covered native Europe extensively (South England, Portugal, Spain, Germany, Switzerland, Italy, Bulgaria, Greece), Lebanon, Tajikistan, China and areas of introduction such as South America (Chile, Argentina) and Australia (**Figure 1**).

As a first step, the chemical composition of different sections of the aerial parts, the traditionally recommended drug, was analyzed. One single HP specimen from Southern England (nr 53) was cut into 4 parts (sample 53#1, 0–18 cm, lower; 53#2, 18–37 cm, lower intermediate; 53#3, 37–54 cm, upper intermediate; 53#4, 54–65 cm, flowering tops); in addition, samples containing only leaves (sample 53#5) and only flowers (sample 53#6) were taken from the same specimen.

HPTLC analysis showed, as expected, a variation in the chemical content between parts (**Figure 2**). Material derived from the lower section of the aerial parts consisted only of woody stems, and the methanolic solution obtained was light yellow. The chromatographic fingerprint showed very low levels of detectable components. Samples 53#2 and #3 both contained leaves and slimmer woody stems; the methanolic solution obtained was dark brown in color and the HPTLC fingerprint seemed perfectly acceptable for an HP product.

Sample #4 represented the flowering tops, the part to harvest based on the pharmacopeial requirements ("Whole or fragmented, dried flowering tops of Hypericum perforatum L., harvested during flowering time" BP 2018, Ph. Eur. 9.3 Update). In this case the methanolic solution is dark red and the fingerprint is similar to the previous two samples with the addition of a green band at Rf = 0.77 and slightly more concentrated bands of hypericin derivatives (red bands between Rf = 0.54 and Rf = 0.63). As expected, the sample exclusively made of leaves (53 #5) has exactly the same fingerprint of #2 and #3 but the methanolic solution is green in color. Finally, the flower sample, 53 #6, shows a fingerprint with a level of hypericins comparable to #4, the green band at Rf = 0.77, a faint yellow band right above said green band and a much fainter top elution band.

"Chinese HP" with its specific fingerprint characteristics could be adulterated with other species. Therefore, nine samples of other Hypericum species growing in China were collected and analyzed by HPTLC including H. ascyron (F6), H. acmosepalum (F7, 9), H. uralum (F8), H. densiflorum (F10), H. beanii (F11), H. patulum (F12), H. japonicum (F14), H. elodeoides (F15).

The HPTLC (**Figure 3**) and NMR (**Figure 11**) results clearly show that none of the fingerprints features the yellow band at Rf = 0.49 (present in the Chinese H. perforatum, **Figure 3**, track 2). The other Hypericum species' fingerprints (**Figure 3**, tracks 3–10) are very distinct from H. perforatum's. Except for H. elodeoides (**Figure 3**, track 10), they do not contain hypericins and it is unlikely that they could have been added, accidentally or on purpose to boost the products' specifications.

Principal component analysis of NMR data relative to the HP crude drug samples altogether shows a fairly homogeneous spread, without any starkly prominent difference (**Figure 4**). Nonetheless, the samples from China and those from the Mediterranean area form separate clusters. Western/Central European samples from Germany and England overlap both clusters. Analysis of the flavonoid-specific area 6–9 ppm failed to show any further difference (see **Supplementary Material**), instead showing an even more homogeneous distribution, with fewer similarities but without any type of clustering and a broader distribution.

HPTLC analysis of the samples highlighted a few main differences across the collection, namely the presence of the "Chinese HP" fingerprint, the separate presence of the extra yellow band (Rf = 0.49) in other samples, the presence/absence of rutin, low flavonoid concentrations and differences in the hypericins content.

intermediate part (37–54 cm); (4) flowering tops (54–65 cm); (5) leaves only; (6) flowers only.

Each of the samples acquired from China showed the "Chinese HP" fingerprint, with, most notably, an extra compound, represented by the yellow band with Rf = 0.49 and the missing yellow band at Rf = 0.18 (**Figure 5**). This seems to define a specific chemotype for specimens belonging to the postulated ssp. chinense (also described as geographically restricted to China).

Interestingly, a yellow band with Rf = 0.49 was also detected in 50% of the samples collected in Spain (8 out of 16, from two separate regions). In the latter cases though, the persistence of the yellow band with Rf = 0.18 indicates a fingerprint distinct from the Chinese one (**Figure 5**). The compound at Rf = 0.49 was otherwise not detected in any other sample of our collection.

Additionally, rutin is not necessarily found in the crude drug material, as 38% of the samples (27 out of the 71 samples with sufficient flavonoid concentration to read the rutin band) did not show the corresponding band; Chinese samples were always found to contain rutin, in different concentrations, while the majority (81%) of Spanish material does not contain it. On the other hand, all the marketed products analyzed by Booker et al. (2018) contain rutin.

HPTLC analysis highlighted the presence of lower concentrations of compounds in samples consisting of processed material (purchased or donated in the form of roughly chopped material). Samples 1–5, 22, 23, 24, 65, 66 and 86 were obtained from commercial sources (producers, pharmacies, markets), allegedly being simply roughly processed materia prima. Visual inspection revealed roughly chopped herbal material, making it difficult, if not impossible to determine the identity of the plant with the naked eye; in addition, most of them included a high amount of woody material (stems). In the HPTLC analysis samples 1–5 (purchased at herbal markets, in three different regions of China: Yunnan, Hebei, Shanxi) showed an extremely low content of the typical HP compounds at the concentration examined. Samples 22–24 (respectively, purchased as loose material in a pharmacy in Crete, Greece; in a pharmacy in Chile, and acquired through a manufacturer in Chile) and 65–66 (both samples acquired from a manufacturer's cultivation in Bulgaria) show better concentration but still among the lowest across the whole selection. This could be due to the apparent higher amount of woody material present in the mixtures.

This observation could be linked to the results obtained from the HPTLC analysis of the different sections of the aerial parts. The fainter fingerprint of the processed material could be due to a harvesting practice that cuts further down the stem, thus including a higher percentage of woody material. Given that the wood itself contains extremely low quantities of flavonoid compounds, this constitutes a natural bulking agent from the same plant. Whether this was done intentionally or due to a lack of knowledge cannot be ascertained in this study. Of note, for products regulated as botanicals/food supplements, this would not constitute an adulteration, but for herbal medicines it would, if the regulation follows, for example, the European Pharmacopoeia.

Alternatively, the fainter fingerprint could be ascribed to purchasing material from middlemen, implying that little information is available to the processors as to when the material was harvested and handled. Time and conditions of storage can lead to the degradation and oxidation of components, and therefore a lowering of their concentrations.

Based on our analysis, NMR-based PCA is unable to pick up on the composition differences detected via HPTLC. Moreover, the contribution of a single compound on the overall NMR spectra is minimal, especially when considering complex spectra such as those obtained from total plant extracts.

Hypericins, namely hypericin and pseudohypericin, are easily spotted in HPTLC plates treated with NP/PEG as two close red bands at Rf = 0.55–0.60. Their concentration varies across the collection of samples examined, ranging from thick brilliant to faint dark bands. However, no systematic correlations with regions of origin could be demonstrated. These differences can sometimes be associated with overall low flavonoid content (as in the case of commercial samples), but at times they do not directly correlate. As previously explained their lower content can be due to a lower proportion of flowers and leaves in the samples, the age of the material (often unknown in the case of commercial samples) but can as well be explained by time of the day/season when the material was collected. The failure to find a distinguishable marker peak for hypericin in the NMR spectra reinforces the idea that the NMR-PCA plot would not have taken into consideration the hypericin content differences.

#### Avicularin Versus Guaiaverin

As previously mentioned, samples of Chinese origin analyzed were found to have a peculiar fingerprint, characterized mainly by the presence of an extra compound at Rf = 0.49. It was initially identified, based on mass spectrometry, as avicularin, or quercetin-3-O-α-arabinofuranoside, which had previously been isolated from HP (Wei et al., 2009). However, another quercetin glycoside, guaiaverin (quercetin-3-O-α-L-arabinopyranoside), with the same molecular weight and the same fragmentation pattern as avicularin, was previously isolated from H. maculatum (Zheleva-Dimitrova et al., 2012), raising doubts about the identity of the yellow band at Rf = 0.49 (Booker et al., 2018).

Their molecular structures are similar (**Figure 6**) but their NMR spectra differ and distinct signals can be identified (**Figure 7**). Pure compounds were compared to identify the samples' NMR fingerprints. Peaks at δ (500 MHz, CD3OD) 5.47 (s) and 5.18 (d) ppm, found, respectively, in avicularin and guaiaverin, in an area of low signal crowding, provide a means for distinguishing between the two compounds and were chosen as marker signals. The singlet at 5.47 ppm is found in a representative sample of both Chinese and Spanish samples (No. 59 and 41, respectively), but is missing in another Spanish sample (**Figure 8**, sample 40, yellow) that did not show the extra band at Rf = 0.49. Therefore, the samples with the band at Rf = 0.49 are likely to contain avicularin. The doublet at 5.18 ppm seems to be present in all samples' spectra, but as it appears in an area of high signal crowding it did not seem appropriate to derive a clear conclusion solely based on NMR. Next, the possibility of both compounds being present (guaiaverin being present in a very low concentration) was investigated using HPTLC.

FIGURE 7 | NMR spectra (500 MHz) of avicularin (blue) and guaiaverin (red) in CD3OD, highlighting particularly useful diagnostic features at 7.52 ppm/5.46 ppm (avicularin) and 7.74 ppm/5.16 ppm (guaiaverin).

HPTLC analysis following the HP protocol showed a significant separation between avicularin and guaiaverin and based on this result the band at Rf = 0.49 represents avicularin (**Figure 9**) but not guaiaverin. Additionally, the band corresponding to guaiaverin (Rf = 0.30) was detected in both Chinese and Spanish samples, indicating the presence of a mixture of the two, with guaiaverin being present in a much lower concentration (**Figure 9**). Guaiaverin was detected also in other HP samples investigated (represented by German sample 51, in **Figure 9**).

#### Materia Prima Versus Finished Products

The NMR data obtained from all the 86 samples was plotted against the data collected by Booker et al. (2018) on marketed HP products, excluding products consisting of extracts and/or combination with other plants. The PCA score plot shows that a few marketed products fall far away from the central cluster (**Figure 10**). Due to the higher variability found among marketed products, the differences between the crude drug samples disappear. The different chemical fingerprints found in the materia prima represent the natural chemovariability (especially the presence of avicularin), which, however, is minimal compared to the differences found in the finished products. The natural variability cannot explain the variability of the marketed products. Therefore, the reasons behind it need to be found elsewhere. This demonstrates that the significant variation in the composition of unregulated products is very heavily influenced by the various processing techniques of the materia prima, i.e., the differences in the value chains of these products. This highlights the importance and necessity of a carefully managed and well controlled value chain from the primary material to the finished products.

#### Comparison of NMR-PCA and HPTLC

Overall, NMR-PCA was able to detect major differences between samples, but has not been useful to discern the much more limited differences between the samples of the materia prima. Of course, it is more affected by total composition than a single compound's variations. It is a useful method for identification of trends and differences between different species, as exemplified in **Figure 11**, and makes evaluation of the results obtained from large pools of samples easier as it provides a general overview.

HPTLC unveils specific chemical differences. The combination of these two methods has helped an all-round evaluation of the chemical profile and differences existing among the HP available in nature.

## CONCLUSION

This study demonstrates that the view of 'Chinese HP' containing some unique marker substances cannot be substantiated. The HPTLC profiles have highlighted how the Chinese samples and some of the Spanish samples both contain avicularin. At the same time the Chinese samples carry some extra differences that distinguish them from the Spanish avicularin-containing ones. According to the Hypericum monographs (Robson, 2002), the distribution of subspecies perforatum and veronense overlaps in Mediterranean Europe, with minor morphological differences serving as diagnostic markers. On the other hand, subspecies chinense is quite isolated geographically. As a consequence, it is possible that these detected anomalies, when compared to the EP standard, represent chemotypes characteristic for specific geographical regions. Our samples could not be clearly assigned to these subspecies. Moreover, this study demonstrates that rutin, though present in the EP standard and found in all the marketed products analyzed previously (Booker et al., 2018), is not necessarily found in the materia prima. The hypericins content was not always directly correlated to the overall flavonoid concentration. It was found to be low in commercial material, either due to higher content of woody material, or unknown age of the sample.

It is vital that there is a standard reference representative of good quality crude drug material taking into consideration the natural chemical variability. Pharmacopeial monographs should include a description of such variable characteristics. In the case of HP, this could either result in accepting H. perforatum ssp. chinense as a source of drug material if it complies with the other requirements or a new definition of what material is acceptable from a pharmacopeial perspective.

This study for the first time compares a large collection of crude material as a group and also with marketed products, establishing that in the case of HP the naturally occurring chemical differences are not responsible for the poor quality found in the finished commercial products. There is no way of establishing which chemotype has been traditionally used and substantiating that one chemotype is more appropriate than the others. Therefore, these natural differences should not be of major concern. However, in this study all samples were processed using a standard procedure, which is clearly not the case within industry, resulting in inevitable quality variations.

The results regarding the processed material, on the other hand, have highlighted how acquiring material that has been sourced along poorly managed value chains constitutes a concern that needs to be considered and resolved. In such cases the identity, provenance, collection practices, storage conditions and length of storage are unknown and could lead to poor quality material. This strengthens the importance of minimizing the role of middlemen, who lack the knowledge of how to ascertain good quality, operating between growers/collectors and manufacturers.

This study's findings show the importance of comprehensive investigation and knowledge about crude materials as the foundations for the delivery of quality herbal products on the market. Outreach activities need to target collectors, growers and producers to guarantee that the fundamental steps of cultivating, collecting or acquiring good/acceptable quality material are carried out correctly. If the crude material's natural variation is known, the final product's quality will be better defined and more predictable.

## AUTHOR CONTRIBUTIONS

FS, AB, and MH contributed to the conception and design of the study. FS gathered the samples and prepared voucher specimens. FS and KL analyzed the samples. FS analyzed the data and drafted the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

## FUNDING

FS's postdoctoral position was funded through a charitable donation by Dr. Willmar Schwabe GmbH and Co. KG, Karlsruhe, Germany, who has had no input into the experimental design and the interpretation of the data.

## ACKNOWLEDGMENTS

We would like to thank the Natural History Museum, London, in particular Dr. Norman Robson, for his insight and expertise and Mr. Jacek Wajer for his help and support. The collection of samples would have not been possible without the valuable collaboration of: Dan Zhao (School of Pharmacy, Guiyang University of Chinese Medicine, China), Ziwan Ning (School of Chinese Medicine, Hong Kong Baptist University, Hong Kong), Diego Rivera (Spain), Concepcion Obon De Castro (Spain), Alonso Verde (Spain), Jose Fajardo (Spain), Lixiang Zhai (Hong Kong), Carlos Echiburu Chau (Chile), Roberto Saavedra (Chile), Ivo Pischel (Germany), Sarah Edwards (England), Stephanie Miles (England), Hans Wohlmuth (Australia), Michael Keusgen (Germany), Fabrizio Zara (Italy), Silvia Soldatou (Greece), Zachary Bellman (England), Matthew Traver (England), Nicola Bell (England), Peter Field (England), Marco Leonti (Italy), Ana Maria Carvalho (Portugal), Debora Frommenwiler (Switzerland), and Xiaofei Zhang (China).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01973/full#supplementary-material

## REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Scotti, Löbel, Booker and Heinrich. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Selection of Reference Genes for Expression Analysis in Chinese Medicinal Herb Huperzia serrata

Mengquan Yang<sup>1,2†</sup>, Shiwen Wu<sup>1†</sup>, Wenjing You<sup>1,2†</sup>, Amit Jaisi<sup>1</sup> and Youli Xiao<sup>1,2,3</sup>\*

<sup>1</sup> CAS Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China, <sup>2</sup> University of Chinese Academy of Sciences, Beijing, China, <sup>3</sup> CAS-JIC Centre of Excellence in Plant and Microbial Sciences, Shanghai, China

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Mingqian Li, Zhejiang Institute of Traditional Chinese Medicine; Shu Shaohua, Huazhong Agricultural University, China

#### \*Correspondence:

Youli Xiao ylxiao@sibs.ac.cn †These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 28 August 2018 Accepted: 14 January 2019 Published: 01 February 2019

#### Citation:

Yang M, Wu S, You W, Jaisi A and Xiao Y (2019) Selection of Reference Genes for Expression Analysis in Chinese Medicinal Herb Huperzia serrata. Front. Pharmacol. 10:44. doi: 10.3389/fphar.2019.00044

Huperzine A (HupA) is a powerful and selective inhibitor of acetylcholinesterase. It has attracted widespread attention, endangering the ultimate plant sources in the Lycopodiaceae family. In this study, we used Huperzia serrata, a slow-growing vascular plant extensively used in Traditional Chinese medicine (TCM), as the model plant of the Lycopodiaceae family to develop and validate reference genes. We aim to use a gene expression platform to understand gene expression in different tissues and developmental stages of this medicinal herb. Eight candidate reference genes were selected based on RNA-seq data and evaluated with qRT-PCR. The expression of L/ODC and cytochrome P450 genes, known for their involvement in lycopodium alkaloid biosynthesis, was also studied to validate the selected reference genes. The most stable genes were TBP, GAPDH, and their combination (TBP + GAPDH). We report for the first time reference genes for different tissues of H. serrata, which will provide important insights into understanding their biological functions in comparison with other Lycopodiaceae plants and facilitate a good biopharming approach.

Keywords: Huperzia serrata, lycopodium alkaloids, reference gene, real-time quantitative PCR, gene expression

## INTRODUCTION

The Lycopodiaceae family comprises three main genera, namely, Huperzia, Phlegmariurus, and monotypic Phylloglossum. The morphological variability between Phlegmariurus and Huperzia has presented a taxonomic challenge. Interestingly, they possess similar chemical diversity, especially lycopodium alkaloids, such as huperzine A (HupA), a highly potent, selective, and reversible inhibitor of AChE (Zhao and Tang, 2002) and hence a lead candidate for Alzheimer's disease. HupA was initially isolated from the traditional Chinese medicine Qian Ceng Ta (Huperzia serrata). H. serrata is an economically important traditional Chinese herb that has been used extensively since the Tang Dynasty for treatment of contusions, strains, swellings, schizophrenia, myasthenia gravis, and organophosphate poisoning (Ma et al., 2007; Xu et al., 2017). In the United States, H. serrata is marketed as a memory-enhancing dietary supplement (Ma and Gang, 2004). However, the wide clinical investigation and application of HupA are hampered by its poor supply from natural resources and its uneconomical synthesis route (Benca, 2014). Moreover, extensive harvest for HupA has endangered H. serrata and other species in the Lycopodiaceae family. A synthetic biology approach offers an alternative potential source of HupA, but the inadequate understanding of its biosynthetic pathway restricts its production by metabolic engineering.

According to feeding experiments, HupA and other lycopodium alkaloids originate from lysine and/or ornithine, and the initially proposed pathway has lysine/ornithine decarboxylase (L/ODC) as the first enzyme (Ma and Gang, 2004). Bunsupa and coauthors reported that L/ODC can catalyze the first step in the biosynthesis pathway of lysine-derived alkaloids, quinolizidine, and lycopodium alkaloids (**Figure 1**; Bunsupa et al., 2012, 2016). Furthermore, we cloned six HsL/ODC genes from H. serrata by degenerate method and characterized the function of one HsL/ODC in vitro and in vivo (Xu et al., 2017). A comprehensive relative quantitative metabolomic analysis of these alkaloids in different tissues of H. serrata was also performed by our group (Wu et al., 2018). However, the genes involved in skeleton formation and modification remain unclear (**Figure 1**, blue color; Yang et al., 2017).

Gene expression patterns in different plant tissues and growth developmental stages provide important insights into understanding their biological functions (Bustin, 2000; Vandesompele et al., 2002; Kozera and Rapacz, 2013). Transcriptome analysis and data mining have helped identify differentially expressed genes and measure the relative levels of their transcripts. Quantitative real-time PCR (qRT-PCR) provides a rapid, efficient, accurate, and reproducible method to present the mRNA transcription level in different samples or tissues and to validate data obtained from other methods (Vandesompele et al., 2002; Kozera and Rapacz, 2013; Zhang et al., 2017). The selection and validation of reference genes are the first steps in any qRT-PCR gene expression studies. The most commonly used genes for normalization of gene expression in different plant species include housekeeping genes, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), β-actin, tubulin, and 18S (Radonic et al., 2004; Niu et al., 2015; Martins et al., 2017). However, the transcript expression level of such genes is not always stable, especially in samples of different developmental stages and tissues and those subjected to stresses, leading to erroneous results (Radonic et al., 2004; Petriccione et al., 2015; Pombo et al., 2017). Hence, screening and validating reference genes for normalization of the gene expression levels are pivotal.

In this study, actin, tubulin, 18S, TBP, GAPDH, HSP90, MUB, and SAM were selected as candidate reference genes based on global RNA-seq data. Their expression stabilities in the roots, stems, leaves, and sporangia of H. serrata in different developmental stages (2-, 3-, 4-, and 5-year old) were evaluated using the geNorm, NormFinder, and BestKeeper programs, the comparative ΔCq method, and comprehensive stability rankings obtained from RefFinder. The expression of targeted genes, namely, L/ODC and cytochrome P450s, which are potentially involved in HupA biosynthesis, was used to validate the selected reference genes. This study is the first report to evaluate the expression stability of the reference genes in H. serrata. Results will be particularly useful in the selection of structural genes involved in HupA biosynthesis and research of Lycopodiaceae plants.

## MATERIALS AND METHODS

## Plant Materials

Plants of different growth periods (2-, 3-, 4-, and 5-year old) were collected from Xiangxi, Hunan, China in January 2017, identified as H. serrata (Thunb.) Trevis<sup>1</sup>, and deposited at the Chinese herbarium with Barcode ID: 00019690<sup>2</sup>. The plants were carefully rinsed in running tap water, and soil was removed by hand. Root, stem, leaf, and sporangia were kept in collection tubes immediately after being separated from the plant, immersed in liquid nitrogen, and stored at −80°C until further use.

## RNA Isolation and cDNA Synthesis

Total RNA was extracted from four different tissues of H. serrata, namely, root, stem, leaf, and sporangia with TIANGEN RNAprep Pure Plant Kit [Tiangen Biotech (Beijing) Co., Ltd.] according to the kit instructions. DNase I was used to digest contaminating DNA. The purified total RNA was quantified using Nanodrop (Agilent 2100, Agilent Technologies, United States) and 1% agarose gel. cDNA was synthesized as previously reported (Yang et al., 2017).

## Reference Gene Selection and Primer Design

Eight reference genes (actin, tubulin, 18S, TBP, GAPDH, HSP90, MUB, and SAM) were selected as potential candidates. All homologs in H. serrata were gathered by BLAST search against the global RNA-seq data (Yang et al., 2017), and the candidate reference genes were selected for having similar fragments per kilobase per million (FPKM) values in the four tissues (**Figure 2**). The primers designed for the candidate reference genes are listed in **Table 1**. The primer specificities were verified by the presence of a single DNA band with the expected size in 1.0% agarose gel electrophoresis and the presence of a single peak in qRT-PCR melting curve assays (**Figure 4**).

## qRT-PCR Analysis

qRT-PCR amplification was performed as previously reported (Yang et al., 2017). Expression levels were recorded as cycle quantification (Cq). The PCR efficiency of each primer pair ($E = 10^{-1/\mathrm{slope}} - 1$) was determined from the slope of the amplification curve in the exponential phase, obtained by fourfold serial dilution of cDNA (Rutledge and Stewart, 2008).
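The efficiency formula can be illustrated with a minimal Python sketch. It assumes the slope is obtained by regressing Cq on the log10 of the relative template amount across the dilution series, which is a common way to apply this formula but is not spelled out in the text; the dilution and Cq values below are hypothetical and chosen only for illustration.

```python
import math


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pcr_efficiency(relative_amounts, cq_values):
    """E = 10^(-1/slope) - 1, with slope taken from Cq versus log10(amount)."""
    log_amounts = [math.log10(a) for a in relative_amounts]
    slope = linear_slope(log_amounts, cq_values)
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical fourfold dilution series (illustrative values, not from the study).
dilutions = [1, 1 / 4, 1 / 16, 1 / 64, 1 / 256]
cq = [18.0, 20.0, 22.0, 24.0, 26.0]  # Cq rises by 2 cycles per fourfold dilution

efficiency = pcr_efficiency(dilutions, cq)
print(f"Estimated efficiency: {efficiency:.3f}")
```

In this constructed series each fourfold dilution delays Cq by exactly two cycles, which corresponds to a doubling per cycle, so the estimate comes out at an efficiency of 1 (100%).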

## Gene Expression Stability Analysis

The expression stability of the eight candidate reference genes across all tissues was evaluated with four algorithms, namely, geNorm (Vandesompele et al., 2002), NormFinder (Andersen et al., 2004), BestKeeper (Pfaffl et al., 2004), and the ΔCq method. RefFinder (Xie et al., 2012), a web-based user-friendly comprehensive tool, was employed to generate the comprehensive ranking.
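As a rough illustration of the comparative delta-Cq (ΔCq) idea, the sketch below compares every pair of candidate genes, takes the standard deviation of their Cq difference across samples, and scores each gene by the mean of those standard deviations (lower meaning more stable). This is a simplified reading of the method, not the authors' implementation; the gene names and Cq values are hypothetical.

```python
from statistics import mean, stdev


def delta_cq_stability(cq_by_gene):
    """Mean SD of pairwise Cq differences for each gene (lower = more stable).

    cq_by_gene maps a gene name to its Cq values, one per sample,
    with samples in the same order for every gene.
    """
    scores = {}
    for gene, values in cq_by_gene.items():
        pairwise_sds = []
        for other, other_values in cq_by_gene.items():
            if other == gene:
                continue
            deltas = [a - b for a, b in zip(values, other_values)]
            pairwise_sds.append(stdev(deltas))
        scores[gene] = mean(pairwise_sds)
    return scores


# Hypothetical Cq values for three placeholder genes across four samples.
example_cq = {
    "geneA": [20.0, 21.0, 20.0, 21.0],
    "geneB": [25.0, 26.0, 25.0, 26.0],  # tracks geneA exactly
    "geneC": [18.0, 22.0, 19.0, 23.0],  # varies independently
}

scores = delta_cq_stability(example_cq)
ranking = sorted(scores, key=scores.get)  # most stable first
print(ranking, scores)
```

Two genes that rise and fall together across samples give a near-zero SD for their difference, so they score as stable even if their absolute Cq values are far apart.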

<sup>1</sup>http://www.theplantlist.org/tpl1.1/record/tro-26621719 <sup>2</sup>http://www.cvh.ac.cn/en/spm/HUST/00019690

## Validation of Identified Reference Genes

Previous studies have shown that L/ODC is the first key structural gene for precursor formation of HupA (Bunsupa et al., 2012, 2016; Xu et al., 2017). The homologues of the reported L/ODC (Unigene94988, Unigene94617) and four cytochrome P450 genes (CL9415.8, CL1143.2, Unigene1166, and Unigene25121), which were proposed to participate in the modification of the HupA skeleton, were also used to confirm the reliability of the selected reference genes by comparing the two most stable with the two least stable genes.

## RESULTS

## RNA-Seq-Assisted Selection of Candidate Reference Genes and Primer Design

The eight reference genes (actin, tubulin, 18S, TBP, GAPDH, HSP90, MUB, and SAM) with similar FPKMs in four tissues were gathered, and primers were designed (**Table 1**, **Figure 3**, and **Supplementary Table S1**).

## Expression Levels and Variations of Candidate Reference Genes

The PCR products of the candidate reference genes were verified by electrophoresis in 1.0% agarose gel, which showed only a single band. The presence of a single peak in qRT-PCR melting curve analysis for each of the eight sets of primers indicated high specificity (**Figure 4**). qRT-PCR was performed to determine the expression levels of each candidate reference gene, and the Cq values showed differential transcript levels in the samples examined, with low Cq values suggesting high transcript abundance. The mean Cq value of the eight candidate reference genes ranged from 13.57 to 31.72 (**Figure 5** and **Supplementary Table S1**). Across all sample sets, the mean Cq values showed a minimum of 15.35 and a maximum of 25.23 for the highest and lowest expression levels for 18S and TBP, respectively. The coefficient of variation (CV) of the Cq values was also calculated to evaluate the expression levels of candidate reference genes of the four tissues, where low values represent low variability or maximum stability. The CV values of the eight reference genes among all samples ranged from 6.91 to 14.60%. TBP was the least variable reference gene with a CV of 6.91% among the eight candidate reference genes studied, and HSP90 was the most variable with a CV of 14.60%. The stability ranking of the candidate reference genes on the basis of CV values, from least stable to most stable, is as follows: HSP90 < MUB < SAM < 18S < GAPDH < Actin < TBP (**Figure 5**).
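The CV-based comparison can be sketched as follows. The snippet assumes CV is the sample standard deviation of the Cq values divided by their mean, expressed as a percentage; the text does not state which standard deviation estimator was used, and the gene names and Cq values are hypothetical.

```python
from statistics import mean, stdev


def cq_cv_percent(cq_values):
    """Coefficient of variation of Cq values, in percent (sample SD / mean)."""
    return 100.0 * stdev(cq_values) / mean(cq_values)


# Hypothetical Cq values per placeholder gene (not data from the study).
cq_by_gene = {
    "geneA": [25.0, 25.5, 26.0],
    "geneB": [20.0, 22.0, 24.0],
}

cv_by_gene = {gene: cq_cv_percent(values) for gene, values in cq_by_gene.items()}
least_to_most_variable = sorted(cv_by_gene, key=cv_by_gene.get)
print(cv_by_gene, least_to_most_variable)
```

A lower CV places a gene earlier in `least_to_most_variable`, matching the reading that low CV indicates low variability.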

## Stability of the Reference Gene

Expression stabilities of the eight candidate reference genes were determined using ΔCq, geNorm, NormFinder, and BestKeeper, and their overall stabilities were ranked by RefFinder across all the tissue samples.

#### ΔCq Analysis

The eight candidate reference genes from the most to least stable expression, as calculated by the ΔCq method, are listed in **Table 2**. GAPDH and TBP were the most stable reference genes in the root and leaf. Actin and TBP were the most stable genes for the stem and sporangia, respectively. In sum, TBP, Actin, and GAPDH were the top three ideal reference genes.

#### geNorm Analysis

The stabilities of the eight candidate reference genes of H. serrata calculated using geNorm were ranked in the different tissues according to their M values, as shown in **Figure 6**. The lowest M value indicates the most stable reference gene, and the highest M value indicates the least stable one. Using the criteria of M < 0.5, TBP and GAPDH were stable reference genes in the four tissues of root, stem, leaf, and sporangia. When the stabilities from all the samples were combined, TBP and GAPDH were also determined to be the most stable reference genes. By contrast, HSP90 and MUB were two common unstable reference genes in all tissues and developmental stages.

The pairwise variation (V<sub>n</sub>/V<sub>n+1</sub>) between two sequential normalization factors NF<sub>n</sub> and NF<sub>n+1</sub> was calculated by the geNorm algorithm to determine the optimal number of reference genes for accurate normalization. The recommended cutoff value is 0.15, below which adding a further reference gene contributes negligibly to the normalization. The V3/4 values in the root and stem were less than 0.15 (**Figure 7**), which suggested that the top two reference genes were sufficient for accurate normalization. For the leaf, V5/6 was 0.126, which indicated that the top five reference genes (TBP, GAPDH, 18S, SAM, and actin) were needed for accurate normalization. For the sporangia, V3/4 was 0.148, which showed that three reference genes (actin, SAM, and TBP) were required.

The value V2/3 for total was 0.129, which indicated that the most stable genes, TBP and GAPDH, could be used as the reference genes for the normalization of gene expression in H. serrata.
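The pairwise variation described above can be sketched in a few lines of Python. The sketch follows the commonly described geNorm definition (normalization factor as the geometric mean of the relative quantities of the n best-ranked genes; variation as the standard deviation of the log2 ratio of successive normalization factors). It is an illustration under those assumptions, not the geNorm software itself; the amplification factor of 2, the gene names, and the Cq values are all hypothetical.

```python
import math
from statistics import stdev


def relative_quantities(cq_values, amplification=2.0):
    """Convert Cq values to quantities relative to the lowest Cq.

    Assumes a fixed amplification factor per cycle (2.0 = 100% efficiency).
    """
    min_cq = min(cq_values)
    return [amplification ** (min_cq - cq) for cq in cq_values]


def normalization_factors(quantities_by_gene, genes):
    """Per-sample geometric mean of the quantities of the given genes."""
    n_samples = len(quantities_by_gene[genes[0]])
    return [
        math.prod(quantities_by_gene[g][i] for g in genes) ** (1.0 / len(genes))
        for i in range(n_samples)
    ]


def pairwise_variation(quantities_by_gene, ranked_genes, n):
    """SD of log2(NF_n / NF_{n+1}) across samples for the n best-ranked genes."""
    nf_n = normalization_factors(quantities_by_gene, ranked_genes[:n])
    nf_next = normalization_factors(quantities_by_gene, ranked_genes[: n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])


# Hypothetical Cq values; genes are assumed to be already ranked by stability.
ranked = ["geneA", "geneB", "geneC"]
cq = {
    "geneA": [20.0, 21.0, 20.0, 21.0],
    "geneB": [25.0, 26.0, 25.0, 26.0],
    "geneC": [22.0, 22.0, 23.0, 23.0],
}
quantities = {gene: relative_quantities(values) for gene, values in cq.items()}

v_2_3 = pairwise_variation(quantities, ranked, 2)
two_genes_sufficient = v_2_3 < 0.15  # cutoff quoted in the text
print(f"V2/3 = {v_2_3:.3f}, two genes sufficient: {two_genes_sufficient}")
```

Under this reading, a V<sub>n</sub>/V<sub>n+1</sub> value below the cutoff means that the n best-ranked genes already give a normalization factor that barely changes when the next gene is added.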

#### NormFinder Analysis

As shown in **Table 3**, TBP and GAPDH were the most stable genes (lowest stability value) in the root, leaf, and total subsets calculated using NormFinder. For the stem and sporangia samples, actin and TBP were the most stable reference genes. When all samples were taken to determine the stability of reference genes, the two most stable genes were TBP and GAPDH. SAM and actin also had low stability values, which indicated that the two reference genes were also suitable for qRT-PCR normalization, although not the most stable candidates.

#### BestKeeper Analysis

BestKeeper determined the stabilities of the candidate reference genes on the basis of their standard deviation (SD). Genes with SD > 1 were considered unacceptable reference genes. The genes are listed from most to least stable in **Table 4**. Actin was the most stable gene in the root and total subsets, GAPDH was the most stable gene in the stem and leaf subsets, and 18S was the most stable gene in the sporangia. Only MUB and HSP90 were unstable genes.
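The SD-based screening described above can be approximated with a short sketch. As an assumption, it uses the ordinary sample standard deviation of the raw Cq values as a stand-in; the BestKeeper tool's own variability statistic may be computed differently, and the gene labels and Cq values below are hypothetical.

```python
from statistics import stdev

def sd_screen(cq_by_gene, sd_cutoff=1.0):
    """Rank genes by Cq standard deviation and flag those above the cutoff.

    cq_by_gene: dict mapping gene label -> list of Cq values.
    Returns (ranked_genes, unacceptable_genes, sd_per_gene).
    """
    sds = {gene: stdev(values) for gene, values in cq_by_gene.items()}
    ranked = sorted(sds, key=sds.get)  # most stable (lowest SD) first
    unacceptable = [gene for gene in ranked if sds[gene] > sd_cutoff]
    return ranked, unacceptable, sds

# Hypothetical toy data (not from the study)
toy = {
    "geneA": [20.0, 20.5, 21.0, 20.2],
    "geneB": [25.0, 28.0, 23.0, 26.0],
}
ranked, unacceptable, sds = sd_screen(toy)
```

Here geneB varies by several cycles between samples and would be flagged as unacceptable under the SD > 1 rule, while geneA would be retained.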

#### RefFinder Analysis

The rankings of the four algorithms were integrated by RefFinder to acquire reliable results for the expression stabilities of the eight candidate reference genes of H. serrata, and the results are shown in **Table 5**. The expression of GAPDH was ranked as the most stable in the root and leaf, and the expression of actin was ranked as the most stable in the stem and sporangia. The expression of TBP was ranked the most stable in total. By contrast, MUB and HSP90 were the two least stable reference genes in almost all tissues, as calculated by all five programs. Overall, the best reference genes for accurate transcript normalization in all of the samples were actin, GAPDH, and TBP, which had the lowest geometric mean of the ranking values.
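The integration step, in which genes are ordered by the geometric mean of their ranks across methods, can be sketched as below. The method names, gene labels, and rankings are hypothetical placeholders, and the sketch treats every method equally, which is an assumption rather than a description of RefFinder's internals.

```python
from math import prod

def geomean_rank(rankings):
    """Order genes by the geometric mean of their ranks across methods.

    rankings: dict mapping method name -> list of genes, most to least stable.
    Returns a list of (gene, geometric_mean_rank), best first.
    """
    genes = next(iter(rankings.values()))
    scores = {}
    for gene in genes:
        # Rank 1 is the most stable position within each method
        ranks = [order.index(gene) + 1 for order in rankings.values()]
        scores[gene] = prod(ranks) ** (1 / len(ranks))
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical rankings from four methods (not the study's results)
toy_rankings = {
    "method1": ["geneA", "geneB", "geneC"],
    "method2": ["geneB", "geneA", "geneC"],
    "method3": ["geneA", "geneC", "geneB"],
    "method4": ["geneA", "geneB", "geneC"],
}
overall = geomean_rank(toy_rankings)
```

A gene that is ranked first by most methods keeps a geometric mean close to 1 even if one method disagrees, which is why the combined ranking is less sensitive to a single algorithm.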

## Validation of the Identified Reference Genes

The expression levels of HupA biosynthesis-related genes, L/ODC (Unigene94617, Unigene94988), and cytochrome P450s (CL94158.8, CL11443.2, Unigene1166, and Unigene25121) were investigated using different reference genes in different tissues at different developmental stages to validate the selected candidate reference genes. Each of the two most stable reference genes (TBP and GAPDH), their combination (TBP + GAPDH), and the two least stable reference genes (HSP90 and MUB) were used as internal controls. When TBP alone, GAPDH alone, MUB alone, or the combination of TBP + GAPDH was used for normalization, the expression patterns were similar for all six validated genes. However, when the least stable gene HSP90 was used for normalization, the expression patterns showed some differences (**Figure 8**). Thus, RNA-seq-assisted selection of candidate reference genes was helpful.
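To make the effect of the internal control concrete, the sketch below normalizes a target gene against one or several reference genes. It assumes the common 2<sup>−ΔCq</sup> model with 100% amplification efficiency, which this section does not specify, and it averages the Cq values of multiple references (equivalent to a geometric mean of their quantities). All values are hypothetical.

```python
from statistics import mean

def relative_expression(cq_target, cq_refs):
    """Relative expression per sample as 2 ** -(Cq_target - mean Cq_reference).

    cq_target: list of target-gene Cq values, one per sample.
    cq_refs: list of per-reference-gene Cq lists in the same sample order.
    """
    result = []
    for s, cq_t in enumerate(cq_target):
        # Averaging Cq values corresponds to a geometric mean of quantities
        cq_ref = mean([ref[s] for ref in cq_refs])
        result.append(2 ** -(cq_t - cq_ref))
    return result

# Hypothetical Cq values for two samples (not from the study)
target = [24, 25]
single_ref = [[21, 21]]          # one reference gene
two_refs = [[20, 20], [22, 22]]  # combination of two reference genes

expr_single = relative_expression(target, single_ref)
expr_combined = relative_expression(target, two_refs)
```

If a reference gene is itself unstable across tissues, its Cq shifts enter the ΔCq directly and distort the apparent expression pattern of the target, which is the kind of distortion reported here for HSP90.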

## DISCUSSION

Standardization and quality assessment of traditional herbal formulations are of paramount importance for their modernization. However, a major bottleneck faced by the herbal industry remains the unavailability of rigid quality control profiles, primarily because of the complexity of the active medicinal compounds and the incomplete knowledge about them. H. serrata is a vulnerable, slow-growing plant that is extensively harvested by traditional medicinal practitioners. It contains many active compounds, especially HupA, whose content differs significantly among the organs, varieties, ages, and production areas of the herbal medicines (Ma et al., 2006). Hence, to address such variation in the quality of medicinal material, studies have been directed toward understanding the molecular regulatory mechanisms of secondary metabolism through transcriptomics or functional genomics approaches.

Modern plant research is often underpinned by genetic approaches in which transgenic lines are created to test gene functions in planta. The inability to genetically transform any lycophyte species, such as H. serrata, has been a challenge, and as such our understanding of Huperzia development lags significantly behind that of almost all other land plant lineages despite its traditional medicinal application. Recently, our group published the global RNA-seq of four different tissues, which assisted gene mining with regard to HupA biosynthesis (Yang et al., 2017). Elucidating the biosynthetic pathway is a prerequisite for heterologous production of targeted metabolites, which would limit the overexploitation of the natural habitat. To our knowledge, no suitable reference gene is available for this plant. It is important to select a suitable reference gene to study the different expression patterns in different varieties and tissues of medicinal plants. Here, we report the use of eight genes (actin, tubulin, 18S, TBP, GAPDH, HSP90, MUB, and SAM) to select and validate suitable reference genes for qRT-PCR normalization in different tissues and developmental growth stages.

qRT-PCR is one of the most commonly used technologies for transcript expression analysis owing to its sensitivity and reproducibility (Derveaux et al., 2010). Coexpression analysis is a useful method to screen the candidate structural genes involved in specialized metabolite biosynthesis (Saito et al., 2008). Normalization with stable reference genes is critical for obtaining accurate results from qRT-PCR data. Differential and coexpression analyses of structural genes based on qRT-PCR have been used successfully to screen a UDP-dependent glucuronosyltransferase that can catalyze the continuous two-step glucuronosylation of glycyrrhetinic acid (Xu et al., 2016). Hence, differential analysis coupled with coexpression analysis will be a useful method to screen specific genes in plants without genome information.

For H. serrata, the global RNA-seq data from four different tissues have been published and can be directly used for differential and coexpression analyses. However, no suitable reference gene has yet been selected for this plant, so qRT-PCR results may show inconsistent expression patterns across tissues or treatments. Here, eight reported reference genes (actin, tubulin, 18S, TBP, GAPDH, HSP90, MUB, and SAM) were selected and validated to identify suitable reference genes for qRT-PCR normalization in different tissues.

In the current study, four housekeeping genes (GAPDH, actin, tubulin, and 18S) and four other genes (TBP, SAM, HSP90, and MUB) were used as query genes for a BLAST search against the RNA-seq data (Yang et al., 2017) to find the homologous genes (Stanton et al., 2017). Genes with similar expression levels in four different tissues were selected as candidates, as previously reported (Tan et al., 2015). In this study, the traditional reference genes GAPDH, actin, tubulin, and 18S performed well in terms of the CV of their qRT-PCR Cq values (CV < 10%), in line with previous reports (Shivhare and Lata, 2016; Martins et al., 2017). The four most extensively used programs (ΔCq, geNorm, NormFinder, and BestKeeper) were used in this study for analyzing the stabilities of candidate reference genes to avoid selection of coregulated genes. The four programs showed a few differences in results; TBP and GAPDH were the two most stable reference genes, MUB and HSP90 were the least stable reference genes, and the others were moderately stable candidates with different rankings calculated by different programs.

TABLE 4 | Stability analysis of candidate reference genes, as assayed with BestKeeper software.

TABLE 5 | Expression stability of candidate reference genes, as assayed with RefFinder software.

<sup>∗</sup> Total: Pooled samples from all treatments.

Although TBP was the most stable candidate for all samples in our study, its expression level was very low, which was also observed in equine milk somatic cells and in Aedes aegypti (Cieslak et al., 2015; Dzaki et al., 2017). This low level was reflected in the Cq values, which varied from 21.34 to 31.05 in all experiments with the five cDNA dilutions; this indicated that TBP is a new plant species-dependent reference gene and suggests that proper validation is needed in each case. By contrast, GAPDH performed well for qRT-PCR normalization in different tissues of plants at different developmental stages, as calculated by most of the programs. Hence, GAPDH alone is suitable as a reference gene for qRT-PCR normalization across different tissues. The few differences among the reference gene rankings produced by the different programs were in agreement with earlier studies (Jain et al., 2006; Cruz et al., 2009; Jain, 2009; Qi et al., 2016). Different rankings of the reference genes by different programs were also observed in chrysanthemum (Gu et al., 2011; Wang et al., 2015); thus, all these programs must be combined to evaluate the candidates for each species.

We further performed qRT-PCR experiments to investigate the expression levels of L/ODC genes, which were previously characterized in H. serrata, Lycopodium clavatum, and Leguminosae (Bunsupa et al., 2012; Bunsupa et al., 2016; Xu et al., 2017), by using the two most stable reference genes (TBP and GAPDH) and the two least stable reference genes (MUB and HSP90) to evaluate the eight selected reference genes. According to the pairwise analysis by geNorm software, two reference genes were sufficient for the normalization; thus, the combination of TBP and GAPDH was also used to calculate the expression level of targeted genes. Regardless of which reference gene was used, the expression patterns of L/ODC (Unigene94988) were the same. For further validation, Unigene94617, a homologous gene of Unigene94988, and four cytochrome P450 genes (CL9415.8, CL1143.2, Unigene1166, and Unigene25121) proposed to participate in HupA biosynthesis (Yang et al., 2017) were also examined. All reference genes, with the exception of HSP90, yielded a similar expression pattern for all targeted genes. Hence, only HSP90 was unsuitable for qRT-PCR normalization in all tissues of H. serrata, which suggested that RNA-seq-assisted selection was a useful method for selecting suitable reference genes. Previous studies in Arabidopsis thaliana, Coffea arabica, Gossypium hirsutum, and Chrysanthemum showed that the novel reference genes exhibited better performance than traditional reference genes (Czechowski et al., 2005; Cruz et al., 2009; Artico et al., 2010; Qi et al., 2016). Taken together, we observed some inconsistency in the expression patterns of some HupA biosynthesis genes between RNA-seq and qRT-PCR (**Supplementary Table S1**). This is most likely due to differences in plant growth conditions (season and climate) at the time of collection, as this plant takes years to grow (Ma et al., 2006). Similar inconsistency was also observed in previous reports; in many cases, the gene expression levels quantified with different methods were dramatically different (Wang et al., 2006; Marioni et al., 2008; Qin et al., 2013; Rajkumar et al., 2015; Dapas et al., 2017). Owing to the lack of a successful in vitro propagation approach for the Lycopodiaceae family, it is important to design such functional genomics studies using controlled climatic conditions and/or an established in vitro platform. Our lab is currently exploring approaches for the in vitro propagation of endangered species of the Lycopodiaceae family. This study demonstrates the possible use of housekeeping genes as stable candidates for qRT-PCR normalization in plants belonging to the Lycopodiaceae family, especially H. serrata.

#### CONCLUSION

In this study, we proposed H. serrata as a model plant for functional genomics studies in the Lycopodiaceae family. The qRT-PCR reference gene normalization in tissues of H. serrata showed that TBP and GAPDH were the two most suitable reference genes. According to the pairwise variation analysis by the geNorm program, the combination of these two genes gave accurate qRT-PCR normalization in different tissues of H. serrata. The reference genes identified and validated here through RNA-seq data for qRT-PCR normalization will facilitate the establishment of a standardized qRT-PCR program for other closely related plants.

## AUTHOR CONTRIBUTIONS


YX conceived the research. YX and SW designed the experiments. WY, SW, and MY performed the experiments. MY and SW analyzed the data, wrote the manuscript, and coordinated its revision. YX and AJ revised the manuscript. All authors provided helpful discussions and approved the final version.

### REFERENCES


### FUNDING

This work was financially supported by Chinese Academy of Sciences (CAS) (Grant XDB27020203, 153D31KYSB20170121, and 153D31KYSB20160074) and CAS-JIC center of Excellence in Plant and Microbial Sciences (CEPAMS) funding.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphar.2019.00044/full#supplementary-material

TABLE S1 | The consistency of target genes expression and the expression data of qRT-PCR and RNA-seq.


the tomato-Pseudomonas pathosystem. Sci. Rep. 7:44905. doi: 10.1038/srep44905


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Yang, Wu, You, Jaisi and Xiao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# DNA Metabarcoding Authentication of Ayurvedic Herbal Products on the European Market Raises Concerns of Quality and Fidelity

Gopalakrishnan Saroja Seethapathy<sup>1,2</sup>, Ancuta-Cristina Raclariu-Manolica<sup>1,3</sup>, Jarl Andreas Anmarkrud<sup>1</sup>, Helle Wangensteen<sup>2</sup> and Hugo J. de Boer<sup>1</sup>\*

<sup>1</sup> Natural History Museum, University of Oslo, Oslo, Norway, <sup>2</sup> Department of Pharmaceutical Chemistry, School of Pharmacy, University of Oslo, Oslo, Norway, <sup>3</sup> Stejarul Research Centre for Biological Sciences, National Institute of Research and Development for Biological Sciences, Piatra Neamt, Romania

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Jianping Han, Institute of Medicinal Plant Development (CAMS), China

Zhigang Hu, Hubei University of Chinese Medicine, China

> \*Correspondence: Hugo J. de Boer hugo.deboer@nhm.uio.no

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 12 October 2018 Accepted: 17 January 2019 Published: 05 February 2019

#### Citation:

Seethapathy GS, Raclariu-Manolica A-C, Anmarkrud JA, Wangensteen H and de Boer HJ (2019) DNA Metabarcoding Authentication of Ayurvedic Herbal Products on the European Market Raises Concerns of Quality and Fidelity. Front. Plant Sci. 10:68. doi: 10.3389/fpls.2019.00068

Ayurveda is one of the oldest systems of medicine in the world, but the growing commercial interest in Ayurveda-based products has increased the incentive for adulteration and substitution within this herbal market. Fraudulent practices, such as the use of undeclared fillers and of other species of inferior quality, are driven both by the increased demand and by the insufficient supply capacity, especially of wild plant species. Developing novel strategies to exhaustively assess and monitor both the quality of raw materials and final marketed herbal products is a challenge in herbal pharmacovigilance. Seventy-nine Ayurvedic herbal products sold as tablets, capsules, powders, and extracts were randomly purchased via e-commerce and pharmacies across Europe, and DNA metabarcoding was used to assess the ability of this method to authenticate these products. Our analysis reveals that only two out of 12 single ingredient products contained only one species as labeled, eight out of 27 multiple ingredient products contained none of the species listed on the label, and the remaining 19 products contained 1 to 5 of the species listed on the label along with many other species not specified on the label. The fidelity for single ingredient products was 67%, the overall ingredient fidelity for multi ingredient products was 21%, and for all products 24%. The low level of fidelity raises concerns about the reliability of the products, and detection of threatened species raises further concerns about illegal plant trade. The study highlights the necessity for quality control of the marketed herbal products and shows that DNA metabarcoding is an effective analytical approach to authenticate complex multi ingredient herbal products. However, efforts are needed to standardize the protocols for DNA metabarcoding before this approach can be implemented as a routine analytical approach for plant identification and approved for use in regulated procedures.

Keywords: Ayurvedic herbal products, botanical authentication, DNA barcoding, herbal medicines, pharmacovigilance, quality control

## INTRODUCTION

fpls-10-00068 February 2, 2019 Time: 18:16 # 2

Ayurveda, or Ayurvedic medicine, is one of the oldest systems of traditional medicine (TM), with origins in India more than 3,000 years ago. Nowadays Ayurveda is popular and used worldwide in complementary and alternative healthcare and medical practices (CAM) (World Health Organization [WHO], 2013). Ayurvedic formulations are obtained using an average of 80% botanicals, 12% animals, and 8% minerals, and are used as raw materials and preparations such as extracts (Joshi et al., 2017). About 7,000 plant species are used for medicinal purposes in India, of which about 1,200 species have been reported to be actively traded (Goraya and Ved, 2017). The total commercial demand for herbal material in India in 2014 and 2015 was estimated to be in excess of 512,000 tons, with a market value of 1 billion USD (Goraya and Ved, 2017). India has more than 8,000 licensed manufacturing units for medicinal products, and the increasing level of consumption of herbal products exceeds the supply capacity for some plant species (Goraya and Ved, 2017). In order to ensure a level of uniformity of the therapeutic formulas and the ingredients used, the Ayurvedic Formulary and the Ayurvedic Pharmacopeia of India were published by the Government of India as legally binding documents describing the quality, purity, and strength of selected drugs that are manufactured, distributed and sold by the licensed manufacturers in India (Joshi et al., 2017).

Like many other TMs, Ayurvedic herbal medicines require quality assurances for their wider usage and acceptability in CAM practicing countries (World Health Organization [WHO], 2013). The growing demand for Ayurveda encourages an industry for mass production of herbal products, leading to the use of large quantities of plant raw material, mainly harvested from the wild flora (Valiathan, 2006; Goraya and Ved, 2017; Joshi et al., 2017). Many of the Indian medicinal plant species are in short supply due to the lack of cultivation, and several wild species are not available in sufficient quantities for commercial exploitation (Goraya and Ved, 2017). The intensive use of herbal products increases the incentive for adulteration and substitution in the medicinal plant trade (Newmaster et al., 2013). This awareness of content irregularities calls attention to the quality of the traded mass-produced herbal products, with direct impact on their efficacy and safety (Leonti and Casu, 2013). One of the pharmacognostic parameters to assure quality, safety and efficacy of a herbal medicine is the utilization of correctly identified medicinal plants as raw material (Evans, 2009). Several new strategies and appropriate standard methods have been proposed to exhaustively assess and monitor both the quality of raw materials and marketed herbal products (Barnes, 2003; De Boer et al., 2015). Standard methods routinely used to assess herbal material, preparations and products rely on morphological characters, microscopy, and chemical fingerprinting [i.e., thin-layer chromatography, high-performance liquid chromatography (HPLC), and gas chromatography (GC)] (De Boer et al., 2015; Parveen et al., 2016). These methods are quick and cost-effective techniques for primary qualitative analysis of raw material and derived herbal products. Alternatively, the use of more advanced methods for identification and quantification of chemical marker compounds is becoming popular [i.e., liquid chromatography (LC)-mass spectrometry (MS), GC-MS, and LC-nuclear magnetic resonance (NMR)], but these require expensive instrumentation (Jiang et al., 2010; Wang et al., 2017; Zhang et al., 2017; Raclariu et al., 2018a).

Various important issues influence the quality of Ayurvedic herbal products, and they need to be carefully taken into consideration when determining the analytical method of choice for quality control. The herbal products are usually complex mixtures of plant material and/or extracts and excipients, and are the result of manifold processing steps. Applying only standard analytical methods may pose serious challenges to the accuracy of herbal product quality control. Furthermore, adulteration by the deliberate use or admixture of substitutes and undeclared plant fillers, fraudulent adulteration by using fillers of botanical origin or plant materials of inferior quality (Zhang et al., 2012), and the addition of pharmaceuticals or other synthetic substances in order to reach an expected effect or a certain level of marker compounds (Calahan et al., 2016; Rocha et al., 2016) raise concerns about the quality and safety of the herbal products. The use of multiple plant species as the source of a botanical drug, as allowed in different pharmacopeias, as well as accidental substitutions, raises concerns ranging from simple misleading labeling to potentially serious adverse drug reactions (Ernst, 1998; Heubl, 2010; Gilbert, 2011) or poisoning due to toxic contaminants (Chan, 2003).

All the standard analytical approaches, including sensory and chemical inspection, may have a good resolution in quality control by detecting the quality and quantity of specific lead or phytochemical marker compounds. However, they are generally not applicable for identifying target plant species within a complex herbal product, and show a low ability to detect non-targeted plant ingredients in herbal products (De Boer et al., 2015). To overcome this limitation, DNA-based approaches have been proposed as useful analytical tools for the quality control of herbs and herbal products (Parveen et al., 2016). DNA barcoding is a cost-effective method for species-level identification based upon the use of short and standardized gene regions, known as 'barcodes' (Hebert et al., 2003). Several reviews have corroborated the diverse applicability of DNA barcoding in the field of medicinal plant research (Techen et al., 2014; De Boer et al., 2015). Initially used as an identification tool, DNA barcoding is now applied in the industrial quality assurance context to authenticate a wide range of herbal products (De Boer et al., 2015; Parveen et al., 2016; Sgamma et al., 2017).

The combination of High-Throughput Sequencing (HTS) and DNA barcoding, known as DNA metabarcoding, enables simultaneous high-throughput multi-taxa identification by using the extracellular and/or total DNA extracted from complex samples containing DNA of different origins (Taberlet et al., 2012). Several studies have utilized this approach in identifying and authenticating medicinal plants and derived herbal products. For example, Echinacea species, Hypericum perforatum, and Veronica officinalis were detected in 89, 68 and 15%, respectively, of the investigated herbal products (Raclariu et al., 2017a,b, 2018b). Similarly, Ivanova et al. (2016) found that 15 tested herbal supplements contained non-listed, non-filler plant DNA, and Cheng et al. (2014) showed that the quality of 27 tested herbal preparations was highly affected by the presence of contaminants. Coghlan et al. (2012) revealed the species composition of 15 highly processed traditional Chinese medicines using DNA metabarcoding, and showed that the products contained species included on CITES appendices I and II. A number of studies in India have surveyed herbal raw drug markets and tested the authenticity of the herbal drugs using DNA barcoding. These studies reported that 24% of raw drug samples of Phyllanthus amarus Schumach. & Thonn. were substituted with other phenotypically similar Phyllanthus species (Srirama et al., 2010). Similar substitutions were reported for other species, such as Sida cordifolia L. (76%) (Vassou et al., 2015), Cinnamomum verum J.Presl (70%) (Swetha et al., 2014), Myristica fragrans Houtt. (60%) (Swetha et al., 2017), Senna auriculata (L.) Roxb. (50%) (Seethapathy et al., 2015), Senna tora (L.) Roxb. (37%) (Seethapathy et al., 2015) and Senna alexandrina Mill. (8%) (Seethapathy et al., 2015). Furthermore, Vassou et al. (2016) reported that 21% of raw drugs in Indian herbal markets were unauthentic. Shanmughanandhan et al. (2016) found that 60% of 93 herbal products sold in the form of capsules and plant powders in local stores in India were adulterated. Studies that combined spectroscopic methods, such as NMR, with DNA barcoding or microscopy to authenticate herbal products, reported 80% adulteration in Saraca asoca (Urumarudappa et al., 2016), 80% in Berberis aristata (Srivastava and Rawat, 2013) and 22% in Piper nigrum (Parvathy et al., 2014). All these studies utilizing DNA barcoding and metabarcoding have highlighted the concerns over the quality and good labeling practices of herbal products (Coghlan et al., 2012; Ivanova et al., 2016; Raclariu et al., 2017a,b; Veldman et al., 2017).

The aim of this study was threefold. First, we aimed to test the composition and fidelity of Ayurvedic products marketed in Europe using DNA metabarcoding. Secondly, we aimed to analyze the presence of any red listed species listed on the product label and used as ingredients using DNA metabarcoding. Our final aim was to evaluate the ability of DNA metabarcoding to identify the presence of authentic species, any substitution and adulteration and/or presence of other off labeled plant species.

## MATERIALS AND METHODS

#### Sample Collection

Seventy-nine Ayurvedic herbal products sold as tablets (n = 30), capsules (n = 30), powders (n = 16), and extracts (n = 3) were purchased via e-commerce (n = 53) and pharmacies (n = 26), from Norway (n = 21), Romania (n = 26), and Sweden (n = 32). Based on the label information, 26 were single plant ingredient products, 39 contained between two and ten plant ingredients, and 14 products contained between 11 and 27 plant ingredients (**Supplementary Table S1**). The products contained a total of 159 plant species belonging to 132 genera and 60 families (**Supplementary Table S2**). It was also confirmed that nrITS sequences of all the 159 plant species labeled in the analyzed herbal products were available within the NCBI/GenBank database (**Supplementary Table S2**). The accepted binomial names and authors of the plant species used as ingredients were validated using The Plant List (2013). The Ayurvedic herbal products were imported into Norway for scientific analyses under Norwegian Medicines Agency license no. 16/04551–2. An overview of the products, including label information, but not the producer/importer name, lot number, expiration date or any other information that could lead to the identification of that specific product, can be found in **Supplementary Table S1**.

## DNA Extraction, Amplicon Generation, and High Throughput Sequencing

The 79 Ayurvedic herbal products were processed depending on their pharmaceutical formulation, in addition to an extraction blank per DNA extraction round. A small amount of each herbal product, about 200 mg, was homogenized using 3–5 zirconium grinding beads in a Mini-Beadbeater-1 (Biospec Products Inc., Bartlesville, Oklahoma, United States). The total DNA from each product was extracted from homogenized contents using CTAB extraction (Doyle and Doyle, 1987). The final elution volume was 100 µl. Extracted DNA was quantified using a Qubit 2.0 Fluorometer and Qubit dsDNA HS Assay Kit (Invitrogen, Carlsbad, California, United States). All amplicon libraries, defined as PCR amplified products from a study sample, were prepared in three replicates. For each replicate two nuclear ribosomal target sequences were amplified, the internal transcribed spacers nrITS1 and nrITS2, respectively. The fusion primers included the annealing motif from the Sun et al. (1994) plant-specific primer pairs 17SE and 5.8I1, and 5.8I2 and 26SE. The forward primers included the Ion Torrent A adapter, a 10 bp multiplex identifier tag following the IonXpress setup for Ion Torrent (Thermo Fisher Scientific, Carlsbad, California, United States). The reverse primer included the truncated P1 (trP1) tags in addition to the annealing motif. Expected amplicon sizes were 300–350 bp.

Polymerase chain reactions were carried out using DNA extracted from the herbal products in final reaction volumes of 25 µl including 0.5 µl of template DNA solution (ranging from 0.5 to 2 ng/µl), 1X Q5 reaction buffer (New England Biolabs Inc., United Kingdom), 0.6 µM of each primer (Biolegio B.V., Netherlands), 200 nM dNTPs, 5 U Q5 High-Fidelity DNA Polymerase (New England Biolabs Inc., United Kingdom) and 1X Q5 High GC enhancer. The PCR cycling protocol consisted of initial denaturation at 98°C for 30 s, followed by 35 cycles of denaturation at 98°C for 10 s, annealing at 56°C for nrITS1 or 71°C for nrITS2 for 30 s, and elongation at 72°C for 30 s, followed by a final elongation step at 72°C for 2 min. Three PCR negative controls of the extraction blanks were included per amplification to control for external and cross-sample contamination. After PCR, the amplicons were purified using Illustra Exostar (GE Healthcare, Chicago, Illinois, United States) in accordance with the manufacturer's protocols. The molarity of each amplicon library was measured using a qPCR-based assay (CFX96 Touch Real-Time PCR Detection System, Bio-Rad, Hercules, California, United States). Equimolar amounts of each amplicon library were merged and sequenced using an Ion Torrent Personal Genome Machine (Thermo Fisher Scientific) as described by Raclariu et al. (2017a).

#### Bioinformatics Analysis


The sequencing read data were analyzed and demultiplexed into FASTQ files, per sample, using Torrent Suite version 5.0.4 (LT), and each of the replicates was analyzed individually. FASTQ read files were processed using the HTS-barcodechecker pipeline available as a Galaxy pipeline at the Naturalis Biodiversity Center<sup>1</sup> (Lammers et al., 2014). Using the HTS pipeline, nrITS1 and nrITS2 primer sequences were used to demultiplex the sequencing reads per sample and to filter out reads that did not match any of the primers. PRINSEQ was used to determine filtering and trimming values based on read lengths and Phred read quality. All reads with a mean Phred quality score of less than 26 were filtered out, as well as reads with a length of less than 200 bp. The remaining reads were trimmed to a maximum length of 380 bp. CD-HIT-EST was used to cluster reads into molecular operational taxonomic units (MOTUs) defined by a sequence similarity of >99% and a minimum number of ten reads. The consensus sequences of non-singleton MOTUs were queried using BLAST against a reference nucleotide sequence database, with a maximum e-value of 0.05, a minimum hit length of 100 bp and sequence identity of >97%. The number of reads per MOTU, as well as the BLAST results per MOTU, were compiled using custom scripts from the HTS Barcode Checker pipeline (Lammers et al., 2014). The reference sequence database consisted of a local copy of the NCBI/GenBank nucleotide database that is refreshed monthly. These parameters were applied to each of the replicates. A species was considered and validated as being present within the product only if this was detected in at least 2 out of the 3 replicates.
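Two of the rules described above, the per-read quality and length filter and the two-out-of-three replicate criterion, can be illustrated with a short sketch. This is not the HTS Barcode Checker or PRINSEQ code; the function names are hypothetical, the order of filtering before trimming is an assumption, and the species labels are placeholders.

```python
from collections import Counter

def filter_and_trim(seq, quals, min_mean_q=26, min_len=200, max_len=380):
    """Return the trimmed (seq, quals) pair, or None if the read is discarded.

    seq: read sequence; quals: list of per-base Phred scores.
    """
    if len(seq) < min_len:
        return None  # shorter than 200 bp
    if sum(quals) / len(quals) < min_mean_q:
        return None  # mean Phred quality below 26
    return seq[:max_len], quals[:max_len]  # trim to at most 380 bp

def validated_species(replicate_hits, min_replicates=2):
    """Species detected in at least min_replicates of the replicates.

    replicate_hits: one collection of species names per replicate.
    """
    counts = Counter(sp for hits in replicate_hits for sp in set(hits))
    return {sp for sp, n in counts.items() if n >= min_replicates}

# Hypothetical reads and replicate results (placeholders, not study data)
kept = filter_and_trim("A" * 400, [30] * 400)
dropped_short = filter_and_trim("A" * 150, [30] * 150)
dropped_lowq = filter_and_trim("A" * 250, [20] * 250)
present = validated_species([{"species_X", "species_Y"}, {"species_X"}, {"species_Y", "species_Z"}])
```

In the toy replicate data, species_X and species_Y are each seen in two replicates and would be accepted, whereas species_Z appears only once and would be rejected as a possible artifact.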

#### Presence and Abundance of Species Across Samples

To assess species diversity within each sample, and to obtain insights into the dominant species within the Ayurvedic herbal products, the read abundances were normalized by dividing the number of reads for a MOTU by the total number of reads per sample. As a result, the read counts are transformed into a proportion of reads found per species within each sample (**Supplementary Tables S3, S4**). Furthermore, MOTUs detected in at least two out of the three replicates, for each sample, were categorized into expected-detected (MOTUs corresponding to species listed on the product label and detected in the analysis), expected-not detected (MOTUs corresponding to species listed on the product label but not detected in the analysis), and not expected-detected (MOTUs corresponding to species not listed on the product label but detected in the analysis) (**Supplementary Table S5**). The total occurrences of MOTUs per category of expected and detected were evaluated (**Supplementary Table S5**), and a matrix of correlation was generated using ClustVis (Metsalu and Vilo, 2015).

### RESULTS

## Fidelity of Ayurvedic Products

The genomic DNA extracts were highly variable in quantity and quality. Total DNA concentration for each of the 79 herbal products is provided in **Supplementary Table S6**. **Table 1** shows the average DNA yield for each of the investigated herbal product types. The results show that three samples labeled as containing only standardized extracts yielded an average of 0.5 ng/µl DNA, whereas tablets, capsules and powders yielded an average of 5.8, 9.6, and 44.7 ng/µl DNA, respectively. Out of 79 products used in the study, 10 tablets were also labeled to contain extracts in addition to crude plant material (#6, #12, #13, #14, #17, #18, #20, #21, #38, and #74). PCR amplification of the nrITS1 and nrITS2 regions was performed for all 79 samples, and amplicons were generated for all replicates for nrITS1 and nrITS2 (for samples and concentrations see **Supplementary Table S6**). The extraction blanks yielded no molecular operational taxonomic units (MOTUs) with nrITS1 and nrITS2 primers.

The sequencing success rate was 44% for ITS1 and 41% for ITS2 (**Supplementary Table S6**). Thirty-five products out of 79 (44%) yielded no MOTUs in any of the replicates either for nrITS1 or nrITS2 that fulfilled our quality criteria, and they were excluded from the results and the further discussion (#11, #20–22, #28, #29, #33, #35, #37–39, #41–51, #53, #54, #56, #57, #59, #62, #64, #65, #67, #71, #72, #76, and #78). These products consisted of 13 tablets, 11 capsules, and 11 powders (**Supplementary Table S6**). The products that yielded MOTUs were represented by 17 tablets, 19 capsules, 5 powders, and 3 extracts (**Supplementary Table S7**).

A total of 188 different plant species belonging to 154 genera and 65 families were identified from the retained MOTUs using BLAST. The separate analyses resulted in 131 plant species (110 genera and 53 families) for nrITS1, and 101 plant species (84 genera and 39 families) for nrITS2. The number of species detected per sample ranged from one to 42. After applying our quality selection criteria, where a species was considered and validated as being present within the product only if it was detected in at least 2 out of the 3 replicates, five additional products (#4, #15, #24, #25, and #26, comprising 2 tablets and 3 extracts) that failed to yield the same MOTU in any of the replicates were discarded. The remaining 39 products resulted in a total of 97 plant species belonging to 40 families (62 species for nrITS1, and 60 species for nrITS2). The species detected in all the replicates for both ITS1 and ITS2 were merged for each sample for further analyses (**Figure 1** and **Supplementary Tables S3, S7**).

**Figure 2** illustrates the fidelity of herbal products across product forms, countries, and methods of acquisition. In ten out of twelve single-ingredient products that were labeled as containing only one species, we detected multiple species (exceptions #5 and #52), of which six contained the species labeled on the product together with other species, whereas four products did not contain the species listed on the product label.

<sup>1</sup>http://145.136.240.164:8080/


A total of 159 plant species belonging to 132 genera and 60 families were specified on the labels of the 79 Ayurvedic herbal products used in this study. Assessing the source and availability of these plants, we found that 83 plant species are harvested solely from the wild, and 31 of these are under various threat levels, including critically endangered and protected species, such as Pterocarpus marsupium Roxb., Pterocarpus santalinus L.f., Santalum album L., and Saraca asoca (Roxb.) Willd. (**Figure 4** and **Supplementary Table S2**; Ved and Goraya, 2007; Envis Frlht, 2017; Goraya and Ved, 2017). The DNA metabarcoding analysis confirms the presence of four of these threatened species, i.e., Celastrus paniculatus, Glycyrrhiza glabra, Gymnema sylvestre, and Saraca asoca, whereas the remaining threatened species were not detected despite being included as labeled ingredients (**Figure 3** and **Supplementary Table S7**). The following species were found in over 20% of the products: Withania somnifera (L.) Dunal (39%), Tribulus terrestris L. (27%), Convolvulus prostratus Forssk. (23%), Coriandrum sativum L. (23%), Ipomoea parasitica (Kunth) G. Don (23%), Ocimum basilicum L. (23%) and Senna alexandrina Mill. (23%) (**Figure 3** and **Supplementary Table S3**). Seventeen species present in more than 10% of samples are listed in **Supplementary Table S3**.

TABLE 1 | Genomic DNA yield and amplicon concentrations per herbal product type. Product types: tablets, capsules, powders, and extracts.

### DISCUSSION

The British Pharmacopeia is one of the first to publish a specific methods section on DNA barcoding, and in the 2016 version it included a new methods appendix on "Deoxyribonucleic acid (DNA) based identification techniques for herbal drugs" to create a framework for compliance of DNA barcoding with regulatory requirements (British Pharmacopeia Commission, 2016; Sgamma et al., 2017). However, DNA barcoding and metabarcoding are not yet widely validated methods for use in the regulatory context of quality control. Several studies advocate their usefulness for herbal product authentication and pharmacovigilance, either as a standard method or as a complementary method (Ivanova et al., 2016; Raclariu et al., 2017a,b, 2018a; Sgamma et al., 2017). In this study, DNA metabarcoding was used as an analytical approach in Ayurvedic herbal product authentication.

A number of studies have shown that the quality of the extraction substrate influences amplification and sequencing success (Ivanova et al., 2016; Raclariu et al., 2017a, 2018b). In addition, the presence of DNA in the extraction substrates is influenced by degradation during the harvesting, drying, storage, and industrial processing of plant material (Novak et al., 2007). The success rate in generating raw sequence reads from the herbal products, and the number of products from which MOTUs could be identified after applying strict trimming and filtering quality criteria, reduced the number of samples yielding DNA metabarcoding results from 79 to 39. In this study, 44% of products did not yield MOTUs in any of the replicates either for nrITS1 or nrITS2. Also, in the herbal products labeled to contain only extracts, no plant DNA was detected. The undetected MOTUs in these products could be related to the methodological framework of DNA metabarcoding, such as the DNA extraction protocol, suitability of primer pair sequences, PCR amplification protocols for library preparation, sequencing platform, filtering and quality thresholds, chimera removal, and clustering thresholds (De Boer et al., 2017; Sgamma et al., 2017; Raclariu et al., 2018b). In addition, extraction of crude herbal drugs either in preprocessing or manufacturing can reduce the availability of plant DNA from those species, especially if material is extracted in boiling water or alcohol, and evaporated or dried at high temperatures.

Considerable incongruences were observed between the detected species and those listed on the label of the products. Similarly, Raclariu et al. (2017b) demonstrated the ability of DNA metabarcoding in detecting Hypericum species in complex herbal formulations, and revealed the incongruence between constituent species and those listed on the label in all products. Also, De Boer et al. (2017) performed DNA metabarcoding analyses on 55 commercial products based on orchids (salep) purchased in Iran, Turkey, Greece, and Germany, and concluded that there are significant differences in labeled and detected species. They also highlighted the applicability of DNA metabarcoding in targeted efforts for conservation of endangered orchid species. In our study, we detected a total of 97 species in 39 products that passed our quality criteria, and most of the identified species are likely ingredients of Ayurvedic herbal products. Detection of certain species is improbable given their distribution or unlikely use, and these include Achillea millefolium L., Anchusa italica Retz., Calluna vulgaris (L.) Hull, Damrongia cyanantha Triboun, Fraxinus albicans Buckley and Trigastrotheca molluginea F.Muell. The identification of these plant species may be explained by (i) amplified PCR chimeras; (ii) false-positive BLAST identifications due to incomplete or error-prone reference databases; or (iii) presence of pollen from wind pollinating species, and this confirms previously raised concerns about the hypersensitivity of DNA metabarcoding (De Boer et al., 2017).

Out of 97 species detected in the DNA metabarcoding analysis, 40 species are sourced from the wild, 38 species are cultivated, and 15 species are sourced from both the wild and cultivation. Similarly, among the 89 species that were not detected in the analysis, 62 species are mainly sourced from the wild, including endangered species such as Embelia ribes Burm.f., Pterocarpus marsupium Roxb., Pterocarpus santalinus L.f., Pueraria tuberosa (Willd.) DC., and Santalum album L. Understanding the discrepancies between the species detected using DNA metabarcoding and those listed on the label of the products requires careful consideration. In DNA metabarcoding analyses, the similarity clustering threshold (>97, >99, or 100%) has an impact on the number and size of assigned MOTUs (Raclariu et al., 2017a). In this study, we used a 99% clustering threshold, similar to previously published studies (Raclariu et al., 2017a; Veldman et al., 2017). Furthermore, to limit the impact of sequencing errors, which are known to affect the Ion Torrent sequencing platform (Salipante et al., 2014) and which could lead to the formation of false MOTUs, we used only the clusters that contained a minimum of 10 reads. In addition, by using three replicates for each sample and marker, we further reduced noise by accepting MOTUs only if present in more than one replicate. Furthermore, the strict filtering and trimming thresholds for base calling, length and quality, and the strict clustering criteria for MOTU formation, increase confidence in the results. As reported by previous studies (Ivanova et al., 2016; Raclariu et al., 2017b), the results related to the authentication of herbal products using DNA metabarcoding need to focus primarily on checking the presence of the labeled ingredients and contaminants.
The presence of non-listed species may be explained by various factors, including but not limited to the deliberate adulteration and unintentional substitution that may occur from the early stage of the supply chain of medicinal plants (i.e., cultivation, transport, and storage), to the manufacturing process and the commercialization of the final products. DNA metabarcoding is a highly sensitive method and even traces of DNA, e.g., contamination from grains of pollinating species or plant dust in the manufacturing process, can be detected and identified.

FIGURE 3 | Detection of species in Ayurvedic herbal products. Species (y-axis) are colored by relative abundance of normalized read numbers. Species are categorized in expected-detected and not expected-detected, based on the total number of occurrences, whereas the category expected-not detected is based on the number of times that the species is expected but not detected. Species are clustered by Euclidean distances. Ayurvedic samples (x-axis) are numbered with product code and grouped by product type.

The advantage of DNA metabarcoding is its ability to simultaneously identify total species diversity within complex multi-ingredient and processed mixtures. Importantly, DNA metabarcoding data are used for qualitative evaluation only, to determine the presence of taxa, and not for quantitative assessment of relative species abundance based on read numbers, as many variables considerably impact the obtained sequence read results (Staats et al., 2016). In the context of the quality control of herbal products, DNA metabarcoding provides neither quantitative nor qualitative information on the active metabolites in the raw plant material or the resulting preparation, and this narrows its applicability to identification and authentication procedures. Thus, if product safety control relies on threshold levels of specific marker compounds, or on the absence of toxins, allergens and admixed pharmaceuticals, then other methods may be more relevant than DNA-based composition analysis. On the other hand, if product fidelity is in question, or species substitution or adulteration is suspected, then DNA-based composition analysis offers better resolution.

The results of this study reveal that there is a need for a better quality control of herbal products. A novel analytical approach should eventually use a combination of innovative high throughput methods that complement the standard ones recommended today.

## CONCLUSION

Assessment of Ayurvedic herbal medicines using DNA metabarcoding provides insight into species diversity in these products and highlights a marked incongruence between species listed as ingredients on the product labels and those detected from DNA present in the samples. Detection of not-listed and not-expected species first and foremost suggests irregularities in the manufacturing process. The presence of foreign plant material could be due to accidental causes, such as contamination from insufficiently cleaned bags, containers, mills, conveyors, and other equipment, or co-occurrence of weeds in cultivation, pollen from wind-pollinated species or seeds from wind-dispersed species. However, foreign plant material could also result from fraud, i.e., substitution, adulteration and/or admixture of other species. Interpretation of incongruences should focus on the detected species in the products, and less on the failure to detect species, as there are many steps in manufacturing processes that could lead to degradation or loss of DNA beyond detectable limits, e.g., alcoholic extraction, decoction and drying of material at high temperatures. Our study showed that the investigated herbal products contained species not listed on the product labels, and this reveals a clear need for improved quality control. A novel analytical approach should eventually use a combination of advanced chemical methods and innovative high-throughput sequencing to complement the standard ones recommended today. The findings of our study show that DNA metabarcoding is a promising tool for quality evaluation of herbal products and pharmacovigilance, and a good candidate for effective use as a regulatory tool to authenticate complex herbal products. However, standardization of protocols is necessary before DNA metabarcoding can be implemented as a routine analytical approach and approved by competent authorities for use in a regulatory framework.

#### SUPPORTING INFORMATION

Ion Torrent sequencing data are deposited in Zenodo (doi: 10.5281/zenodo.2548681).

#### AUTHOR CONTRIBUTIONS

GS, ACRM, HW, and HdB conceived the experiment. GS collected the material and carried out the molecular lab work and analysis together with ACRM. JA carried out high-throughput sequencing together with GS. GS wrote the manuscript together with HdB. All authors contributed to and approved the final version of the manuscript.


#### FUNDING

GS was supported through the University of Oslo by the Quota Scheme of the Norwegian Centre for International Cooperation in Higher Education.

#### ACKNOWLEDGMENTS

The authors acknowledge the help received in the DNA laboratory and genetic analyses from Audun Schrøder-Nielsen and Birgitte Lisbeth Graae Thorbek. Data storage was provided by UNINETT Sigma2 (project no. NS9080K) – the Norwegian national infrastructure for high performance computing and data storage.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00068/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Seethapathy, Raclariu-Manolica, Anmarkrud, Wangensteen and de Boer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genome-Wide Identification and Characterization of Salvia miltiorrhiza Laccases Reveal Potential Targets for Salvianolic Acid B Biosynthesis

Qing Li<sup>1</sup>†, Jingxian Feng<sup>1</sup>†, Liang Chen<sup>1</sup>, Zhichao Xu<sup>2</sup>, Yingjie Zhu<sup>3</sup>, Yun Wang<sup>1</sup>, Ying Xiao<sup>1</sup>, Junfeng Chen<sup>1</sup>, Yangyun Zhou<sup>1</sup>, Hexin Tan<sup>4</sup>, Lei Zhang<sup>4,5</sup>\* and Wansheng Chen<sup>1</sup>\*

<sup>1</sup> Department of Pharmacy, Changzheng Hospital, Second Military Medical University, Shanghai, China, <sup>2</sup> Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China, <sup>3</sup> Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China, <sup>4</sup> Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai, China, <sup>5</sup> State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Hangzhou, China

#### Edited by:

Caroline Howard, Medicines and Healthcare Products Regulatory Agency, United Kingdom

#### Reviewed by:

Yang Chu, China Academy of Chinese Medical Sciences, China

Quanzi Li, Chinese Academy of Forestry, China

#### \*Correspondence:

Lei Zhang, zhanglei@smmu.edu.cn

Wansheng Chen, chenwansheng@smmu.edu.cn

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

Received: 31 July 2018; Accepted: 22 March 2019; Published: 05 April 2019

#### Citation:

Li Q, Feng J, Chen L, Xu Z, Zhu Y, Wang Y, Xiao Y, Chen J, Zhou Y, Tan H, Zhang L and Chen W (2019) Genome-Wide Identification and Characterization of Salvia miltiorrhiza Laccases Reveal Potential Targets for Salvianolic Acid B Biosynthesis. Front. Plant Sci. 10:435. doi: 10.3389/fpls.2019.00435

Laccases are widely distributed in the plant kingdom, where they catalyze the polymerization of lignin monolignols. Rosmarinic acid (RA) has a lignin monolignol-like structure and is converted into salvianolic acid B (SAB), which is a representative bioactive hydrophilic compound of the well-known medicinal plant Salvia miltiorrhiza and also the final compound of the phenolic acid metabolic pathway in the plant. However, the roles of laccases in the biosynthesis of SAB are poorly understood. This work systematically characterizes the S. miltiorrhiza laccase (SmLAC) gene family and identifies the SAB-specific candidates. In total, 29 laccase candidates (SmLAC1-SmLAC29) are found to contain three signature Cu-oxidase domains. They present relatively low sequence identity and diverse intron–exon patterns. The phylogenetic clustering of laccases from S. miltiorrhiza and ten other plants indicates that the 29 SmLACs can be divided into seven groups, suggesting potentially distinct functions. The existence of diverse cis regulatory elements in the SmLAC promoters suggests putative interactions with transcription factors. Seven SmLACs are found to be potential targets of miR397. Putative glycosylation sites and phosphorylation sites are identified in SmLAC amino acid sequences. Moreover, the expression profile of SmLACs in different organs and tissues reveals that five SmLACs (SmLAC7/8/20/27/28) are expressed preferentially in roots, adding to the evidence that they may be involved in the phenylpropanoid metabolic pathway. In addition, silencing of SmLAC7, SmLAC20 and SmLAC28, and overexpression of SmLAC7 and SmLAC20 in the hairy roots of S. miltiorrhiza result in variation of SAB accumulation, indicating that SmLAC7 and SmLAC20 play roles in SAB biosynthesis. The results of this study lay a foundation for further elucidation of laccase functions in S. miltiorrhiza, and add to the knowledge of SAB biosynthesis in S. miltiorrhiza.

#### Keywords: Salvia miltiorrhiza, laccase, genome-wide, bioinformatics, salvianolic acid B

**Abbreviations:** ABA, abscisic acid; GA, gibberellins; LAC, laccase; MeJA, methyl jasmonate; RA, rosmarinic acid; SA, salicylic acid; SAB, salvianolic acid B.

## INTRODUCTION

Laccase (p-diphenol:dioxygen oxidoreductase, EC 1.10.3.2), originally found in Rhus vernicifera, occurs widely in fungi, bacteria, insects and plants (Yoshida, 1883; Wang et al., 2015). As a multicopper glycoprotein oxidase, laccase (LAC) catalyzes the one-electron oxidation of a wide range of substrates, coupled with the reduction of oxygen to water (Mot and Silaghi-Dumitrescu, 2012). LACs typically contain three conserved Cu-oxidase sites, named Type 1 (T1), Type 2 (T2), and binuclear Type 3 (T3) Cu sites, respectively. When a substrate is bound and oxidized at T1, an electron is released and transferred to the T2/T3 trinuclear copper cluster (TNC), where molecular oxygen (O2) is reduced to water (H2O) (Jones and Solomon, 2015). Owing to their ability to oxidize a variety of substrates, such as phenols, aromatic amines and metal ions, LACs have potential for use in industrial processes (Forootanfar and Faramarzi, 2015).

In recent years, considerable progress has been made in the study of LACs in lignin biosynthesis in plants (Liang et al., 2006a; Berthet et al., 2011; Cesarino et al., 2013; Zhao et al., 2013). In Arabidopsis, through T-DNA insertional mutagenesis, Berthet et al. (2011) found that the xylem had collapsed and that soluble constituents were detected in both the laccase 4 (AtLAC4) and laccase 17 (AtLAC17) knockout mutants. By knocking down AtLAC4 and AtLAC17 along with AtLAC11 (laccase 11), Zhao et al. (2013) observed serious physiological changes in the living plants, such as growth inhibition, narrowed stems and lack of lignified vascular bundles, indicating that AtLAC11 may also be involved in lignin polymerization. In addition to their functions in lignin biosynthesis (Cai et al., 2006; Liang et al., 2006a,b; Berthet et al., 2011; Zhao et al., 2013), LACs may perform other roles, as some LACs are expressed in non-woody tissues and participate in the oxidation of flavonoids (Pourcel et al., 2007; Turlapati et al., 2011).

Salvia miltiorrhiza (Dan-Shen) is one of the most commonly used medicinal plants in traditional Chinese medicine for the treatment of cardiovascular and cerebrovascular diseases. Salvianolic acid B (SAB) is a representative bioactive hydrophilic compound in S. miltiorrhiza. According to the Chinese Pharmacopoeia (Chinese Pharmacopoeia Commission, 2015), it is also the quality control component of S. miltiorrhiza. Understanding the biosynthetic pathway of SAB will help improve the quality of S. miltiorrhiza through breeding or through quality control during the growth of the plant. It will also benefit metabolic engineering in S. miltiorrhiza, such as increasing the abundance of SAB in the plant for extraction. Based on our studies of the biosynthetic pathway of salvianolic acids (**Figure 1**), a pathway similar to that of lignin and flavonoids, we suspected that LACs in the plant might be candidates for catalyzing the conversion of rosmarinic acid (RA) to SAB (Di et al., 2013). In fact, five candidate LACs (SMil\_00009266, SMil\_00023004, SMil\_00000484, SMil\_00003461, and SMil\_00018228) were claimed to participate in the salvianolic acids pathway in S. miltiorrhiza (Xu Z. et al., 2016). However, SMil\_00000484, SMil\_00003461 and SMil\_00018228 are not members of the LAC family. SMil\_00000484 is a monocopper oxidase-like protein, while SMil\_00003461 and SMil\_00018228 are both L-ascorbate oxidase homologs.

In the S. miltiorrhiza genome data, 80 LACs were annotated (Xu H. et al., 2016; Xu Z. et al., 2016). However, confirmation is urgently required, since not only can LAC annotations easily be confused with those of other multi-copper oxidases and peroxidases, but the functions of LACs also vary among plants. Since there is no comprehensive analysis of the LAC multigene family in S. miltiorrhiza, the aim of this work is to characterize S. miltiorrhiza laccases (SmLACs), with the long-term goal of identifying bona fide LACs involved in SAB biosynthesis. For this purpose, we characterized the annotated LACs in S. miltiorrhiza through a genome-wide comprehensive analysis of the gene family, including analysis of the gene structures, protein domains and putative promoter cis regulatory elements. A phylogenetic tree was constructed using the neighbor-joining method. In addition, the expression patterns of SmLAC genes were evaluated and confirmed by quantitative real-time PCR. Silencing and overexpression of the candidate SmLACs in the hairy roots of S. miltiorrhiza were carried out to detect the variation of SAB. In summary, 29 LAC candidates were identified in the S. miltiorrhiza genome. All of them have conserved copper-binding domains but differ in gene structure, indicating a similar genetic origin but divergent biological functions. The potential mechanisms regulating SmLAC genes through transcription factors, miRNAs and phosphorylation are discussed. Five SmLACs (SmLAC7/8/20/27/28) are assumed to be involved in the SAB biosynthetic pathway. In addition, silencing of SmLAC7, SmLAC20 and SmLAC28, and overexpression of SmLAC7 and SmLAC20 in the hairy roots of S. miltiorrhiza resulted in variation of SAB accumulation.

## MATERIALS AND METHODS

## Genome-Wide Characterization of Laccase Genes in S. miltiorrhiza

Based on the annotation of the S. miltiorrhiza genome, all the peptide sequences containing Cu-oxidase domains were extracted together with their coding sequences. The peptide sequences were then verified by BLAST searches at NCBI<sup>1</sup> and checked against the Conserved Domain Database on the same website. Sequences with three typical Cu-oxidase domains were classified as LAC candidates after exclusion of L-ascorbate oxidase homologs and monocopper oxidase-like proteins.
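The selection rule described above can be sketched in a few lines of Python. This is an illustration rather than the procedure used in the study: the domain labels, the exclusion keywords, the record layout, and the gene identifiers are all assumptions, and in practice the domain calls would come from a conserved-domain search.

```python
# Labels assumed here for the three Cu-oxidase domains (Pfam-style names).
REQUIRED_DOMAINS = {"Cu-oxidase", "Cu-oxidase_2", "Cu-oxidase_3"}
# Keywords assumed to flag the two excluded multicopper oxidase families.
EXCLUDED_KEYWORDS = ("l-ascorbate oxidase", "monocopper oxidase")


def is_laccase_candidate(domains, description):
    """True if all three Cu-oxidase domains are present and the best-hit
    description does not point to an excluded protein family."""
    has_all_domains = REQUIRED_DOMAINS.issubset(domains)
    excluded = any(key in description.lower() for key in EXCLUDED_KEYWORDS)
    return has_all_domains and not excluded


def select_candidates(records):
    """records: iterable of (gene_id, domain list, best-hit description)."""
    return [gene_id for gene_id, domains, description in records
            if is_laccase_candidate(set(domains), description)]


# Hypothetical records, for illustration only.
records = [
    ("gene_A", ["Cu-oxidase_3", "Cu-oxidase", "Cu-oxidase_2"], "laccase-4-like"),
    ("gene_B", ["Cu-oxidase_3", "Cu-oxidase", "Cu-oxidase_2"], "L-ascorbate oxidase homolog"),
    ("gene_C", ["Cu-oxidase_3", "Cu-oxidase"], "laccase-like"),
]
print(select_candidates(records))  # ['gene_A']
```

Requiring both conditions reflects the text: domain composition alone does not separate laccases from other multicopper oxidases, so the similarity-search annotation is used as a second filter.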

The various physical and chemical characteristics of all the candidate SmLAC proteins were analyzed using the ProtParam tool<sup>2</sup>. Putative signal peptide cleavage sites were predicted by the SignalP 4.1 server<sup>3</sup> (Petersen et al., 2011). The WoLF PSORT server<sup>4</sup> and TargetP 1.1 server<sup>5</sup> were used to predict the subcellular localization of the mature SmLAC proteins. Potential glycosylation sites and phosphorylation sites were analyzed separately using the online NetNGlyc 1.0 Server<sup>6</sup>, YinOYang 1.2 server<sup>7</sup>, and NetPhos 2.0 Server<sup>8</sup>. Visualization of the intron-exon structure of SmLAC genes was conducted with the Gene Structure Display Server (GSDS 2.0)<sup>9</sup> based on each coding sequence and the corresponding genomic sequence. Sequence similarity of SmLAC proteins was obtained by alignment in ClustalX 2<sup>10</sup>, and the conserved sites were checked manually for their corresponding amino acid residues, which were shaded using GeneDoc software (Nicholas et al., 1997). Secondary structure predictions of SmLACs were performed with the Secondary Structure Prediction Method (SOPMA<sup>11</sup>) (Combet et al., 2000). All analyses were carried out with default settings.

<sup>1</sup>https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE\_TYPE=BlastSearch&LINK\_LOC=blasthome

<sup>2</sup>http://web.expasy.org/protparam

### Pairwise Distances, Phylogenetic and MEME Motif Analyses

The amino acid sequence alignments of SmLACs were performed using ClustalW implemented in the MEGA 5.05 software with p-distance settings. Phylogenetic and molecular evolutionary genetics analyses were performed using the Neighbor-Joining (NJ) method (Saitou and Nei, 1987) with pairwise deletion option in MEGA 5.05 and 1000 bootstrap replicates. The bootstrap values above 50% were added to the tree branches generated from the original dataset. Conserved motifs in the complete amino acid sequences of SmLACs were identified using MEME (Multiple EM for Motif Elicitation<sup>12</sup>) (Bailey et al., 2006) with the maximum number of motifs setting at 10.

<sup>4</sup>http://www.genscript.com/psort/wolf\_psort.html
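For readers unfamiliar with the p-distance and the pairwise deletion option mentioned above, the following Python sketch shows the calculation for two aligned sequences. It is an illustration only and not MEGA's implementation; the function names and the choice of `-` and `?` as gap or missing-data symbols are assumptions.

```python
GAP_CHARS = {"-", "?"}  # symbols treated here as gaps or missing data (assumed)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped for this pair only,
    which is the idea behind pairwise deletion.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {(m, n): p_distance(aligned[m], aligned[n]) for m in names for n in names}


# Toy alignment: five comparable sites, one difference (K vs R).
print(p_distance("MKT-LV", "MRTALV"))  # 0.2
```

A matrix of such distances is the input to the Neighbor-Joining algorithm; bootstrap support is then obtained by resampling alignment columns and rebuilding the tree.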

#### Prediction of miR397 Target Laccase Genes and in silico Analysis of SmLAC Promoter Sequences

The transcript sequences of the candidate SmLACs were uploaded to web-based psRNATarget server<sup>13</sup> for identification of potential targets corresponding to the ptr-miR397a and sslmiR397. Sequences with a cut-off score ≤ 5 were chosen as putative targets.

PlantCARE (Plant cis-Acting Regulatory Elements<sup>14</sup>) was used to search the promoter sequences of S. miltiorrhiza LACs for potential cis-acting regulatory elements. The identified elements were sorted according to their reported functions.

#### Methyl Jasmonate Treatment

Methyl jasmonate (MeJA) was dissolved in ethanol and added to the culture medium of the hairy roots at a final concentration of 0.1 mM for 0, 4, 8, 16, and 24 h. Three independent biological replicates were performed for each group.

#### Transcript Abundance of Laccases in Different Organs and Tissues of S. miltiorrhiza

To get insight into the transcript abundance of SmLAC genes, the Illumina and PacBio RNA-Seq data provided by Xu et al. (2015) was utilized. The RNA-Seq expression profile data were generated

<sup>3</sup>http://www.cbs.dtu.dk/services/SignalP/

<sup>5</sup>http://www.cbs.dtu.dk/services/TargetP/

<sup>6</sup>http://www.cbs.dtu.dk/services/NetNGlyc/

<sup>7</sup>http://www.cbs.dtu.dk/services/YinOYang/

<sup>8</sup>http://www.cbs.dtu.dk/services/NetPhos/

<sup>9</sup>http://gsds.cbi.pku.edu.cn/index.php

<sup>10</sup>http://www.ebi.ac.uk/Tools/clustalw2/

<sup>11</sup>http://npsa-pbil.ibcp.fr/cgi-bin/npsa\_automat.pl?page=/NPSA/npsa\_sopma. html

<sup>12</sup>http://meme-suite.org/tools/meme

<sup>13</sup>http://plantgrn.noble.org/psRNATarget/?function=3

<sup>14</sup>http://bioinformatics.psb.ugent.be/webtools/plantcare/html/

using different organs of mature S. miltiorrhiza at the blooming stage, including roots, stems, leaves and flowers. The periderm, phloem and xylem of roots were also included. At least three biological replicates were used for each organ and tissue. Finally, the heat map of SmLAC gene expression patterns was constructed from the log2-transformed and normalized expression data in MultiExperiment Viewer (MeV).

#### Plant Materials

fpls-10-00435 April 5, 2019 Time: 17:12 # 4

The S. miltiorrhiza plant was grown at the medicinal botanical garden of Second Military Medical University in Shanghai, China. It was identified by Professor Hanming Zhang. The fresh leaves, stems and roots of the plant were frozen in liquid nitrogen immediately after collection and then stored at −80 °C for subsequent use. At least three biological replicates of each organ were collected.

The hairy roots were derived by infecting 60-day-old S. miltiorrhiza leaves with Agrobacterium tumefaciens C58C1, and were maintained on 1/2 MS solid medium at 25 °C in the dark. They were harvested from the culture medium on the 60th day after being transferred into liquid medium, and as much as 0.2 g of harvested hairy roots was used for total RNA isolation. The remaining hairy roots were dried at 40 °C in an oven until a constant dry weight was reached.

### Preparation of RNA and cDNA

Total RNA of S. miltiorrhiza was isolated from stored roots, stems and leaves respectively using the TransZol Up Plus RNA Kit (TransBionovo Co., Ltd., Beijing, China). The integrity and quality of the RNA were confirmed by 1% agarose gels stained with ethidium bromide, and the RNA concentration was determined by a Nanodrop 2000 spectrophotometer (Thermo, Waltham, MA, United States). One µg of RNA for each sample was used for reverse transcription following the TransScript First-Strand cDNA Synthesis SuperMix operating procedures (TransBionovo Co., Ltd., Beijing, China).

### Quantitative Real-Time PCR

Quantitative real-time PCR was performed on a TaKaRa TP800 PCR system (TaKaRa, Japan) using TransStart Top Green qPCR SuperMix Kit (TransBionovo Co., Ltd., Beijing, China) according to the manufacturer's instructions with three technical replicates. Primer Express 3.0<sup>15</sup> was used to design the gene-specific primers (see **Supplementary Table 1**) for each SmLAC. Specificity of each primer pair was verified by 2% agarose gels and dissociation curve analysis. The transcript abundance of each SmLAC gene was normalized to SmACTIN as control and compared with roots as reference using the 2<sup>−ΔΔCt</sup> method.
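The 2<sup>−ΔΔCt</sup> calculation can be sketched in Python as follows. This is a minimal illustration rather than the authors' analysis script: the function name and all Ct values are hypothetical, and it assumes that technical replicates are averaged before normalization to the reference gene (here standing in for SmACTIN) and to the calibrator sample (here standing in for roots).

```python
from statistics import mean

def fold_change(target_ct, reference_ct, calib_target_ct, calib_reference_ct):
    """Relative expression by the 2^-ddCt method.

    Each argument is a list of technical-replicate Ct values:
    target/reference gene in the sample, then in the calibrator.
    """
    delta_sample = mean(target_ct) - mean(reference_ct)                  # dCt of the sample
    delta_calibrator = mean(calib_target_ct) - mean(calib_reference_ct)  # dCt of the calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)                     # 2^-(ddCt)

# Hypothetical Ct values (not taken from the article): a stem sample versus a root calibrator.
stem_vs_root = fold_change(
    target_ct=[24.0, 24.0, 24.0],
    reference_ct=[18.0, 18.0, 18.0],
    calib_target_ct=[26.0, 26.0, 26.0],
    calib_reference_ct=[18.0, 18.0, 18.0],
)
print(stem_vs_root)
```

A value above 1 means the target gene is more abundant in the sample than in the calibrator; a value below 1 means it is less abundant.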

## Construction of Recombinant Vectors

The candidate SmLAC cDNAs were each cloned from S. miltiorrhiza into the pEASY-Blunt Zero Cloning Vector (TransBionovo Co., Ltd., Beijing, China) according to the manufacturer's instructions. Then the fragments of cloned SmLACs without the termination codons were PCR amplified using Primer 1 F/R and

<sup>15</sup>http://primer3.ut.ee/

Primer 2 F/R (see **Supplementary Table 2**) respectively. They were cleaved with the corresponding restriction enzymes and cloned into pCambia-1300 or pPHB-flag vectors to yield the RNAi or overexpression vectors.

## Extraction and Analysis of Phenolic Acids

The dried hairy roots were ground into powder, and 1 mL of methanol:water (70:30, v/v) per 10 mg of powder was added to each sample. The mixture was sonicated three times for 30 min at an output amplitude of 50%. The solution was centrifuged at 13000 rpm for 10 min in an Eppendorf 5418R centrifuge, and each milliliter of supernatant was diluted to 5 mL with the same solvent. The extract solution was then filtered through a 0.2 µm organic membrane. Liquid chromatography-MS/MS (LC-MS/MS) was carried out to analyze the metabolite content using a triple-quadrupole mass spectrometer (Agilent G6410A). Separation was performed on a 2.1 × 50 mm, 2.5 µm C18 column (Waters). The mobile phase consisted of acetonitrile:water (60:40, v/v).

## RESULTS

## Genome-Wide Identification of Laccases in S. miltiorrhiza

To identify LAC genes in S. miltiorrhiza at the genome level, the genome database and the genome annotation information of S. miltiorrhiza were downloaded from http://www.ndctcm.org/shujukujieshao/2015-04-23/27.html (Xu H. et al., 2016). According to the annotation information, 80, 80 and 79 gene models were identified as containing the Cu-oxidase, Cu-oxidase\_2 and Cu-oxidase\_3 domains, respectively, which are the typical domains of LACs (see **Supplementary Table 3**).

After combining the genes containing a Cu-oxidase domain, 101 sequences were obtained. They were then searched against NCBI using BLAST and checked against the NCBI Conserved Domain Database. The results showed that 32 genes were non-laccases, 40 were partial laccases because of low-quality sequencing or assembly, and only 29 were full-length LACs (see **Supplementary Table 4**). SMil\_00000484, SMil\_00003461, and SMil\_00018228 all contain Cu-oxidase domains, but they are not LACs as mentioned in previous studies (Xu Z. et al., 2016). SMil\_00000484 is a monocopper oxidase-like protein, while SMil\_00003461 and SMil\_00018228 are both L-ascorbate oxidase homologs.
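The merging step amounts to taking the union of the gene models hit by each of the three domain searches, so that a gene carrying several domains is counted once. The sketch below illustrates this with hypothetical gene identifiers (the real lists are in the supplementary tables); the function name is also an assumption. The final lines only check that the category counts reported in the text add up to the 101 merged sequences.

```python
def merge_domain_hits(*hit_sets):
    """Return the sorted union of gene IDs found in any of the domain searches."""
    merged = set()
    for hits in hit_sets:
        merged |= set(hits)
    return sorted(merged)

# Hypothetical gene IDs standing in for the Cu-oxidase, Cu-oxidase_2 and Cu-oxidase_3 hits.
cu_oxidase = {"gene_A", "gene_B", "gene_C"}
cu_oxidase_2 = {"gene_B", "gene_C", "gene_D"}
cu_oxidase_3 = {"gene_C", "gene_E"}

candidates = merge_domain_hits(cu_oxidase, cu_oxidase_2, cu_oxidase_3)
print(candidates)  # each gene appears once, however many domains it carries

# Counts reported in the text: non-laccases, partial laccases, full-length LACs.
reported = {"non-laccase": 32, "partial": 40, "full-length": 29}
print(sum(reported.values()))
```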

The 29 full-length LACs of S. miltiorrhiza [arbitrarily named SmLAC1 to SmLAC29 (see **Supplementary Text 1**)] were subjected to bioinformatics analysis. Generally, SmLACs consist of 500–600 amino acids (aa), with most between 560 and 580 aa. The molecular masses of the 29 SmLAC proteins range from approximately 57.15 kDa (SmLAC15) to 67.70 kDa (SmLAC16), and the predicted theoretical isoelectric points (pI) range from 5.83 (SmLAC12) to 9.47 (SmLAC17). The signal peptide prediction showed that 23 of the SmLACs had a 20–30 aa signal peptide at the N-terminus, indicating that most SmLACs are

probably secreted proteins. This agrees with McCaig's finding that most plant LACs have a cleavable N-terminal signal peptide targeting them to the secretory pathway (McCaig et al., 2005). The subcellular prediction showed that most SmLAC proteins are localized in the secretory pathway and a few in mitochondria (SmLAC17 and SmLAC23) or the nucleus (SmLAC5), indicating that most SmLACs are extracellular proteins. In addition, variable N- or O-glycosylation sites and phosphorylation sites were predicted to be present in all SmLAC proteins, indicating potential post-translational modifications (**Table 1**).

### Gene Structure Analysis of S. miltiorrhiza Laccase Family

The gene structure of SmLACs was investigated using the Gene Structure Display Server (**Figure 2**). The number of exons in the 29 SmLACs varied from 4 to 9, indicating a diverse intron-exon pattern within SmLAC genes. In general, there were 17 genes containing 5 introns, 6 genes containing 6 introns, 3 genes containing 4 introns and 2 genes containing 3 introns. SmLAC16 contained 8 introns. It is noteworthy that all the genes had a phase 0 intron at the initiating end (except SmLAC16 and SmLAC17) and a phase 1 intron at the C-terminal end, which indicates that they are relatively conserved.
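As a quick consistency check on these figures, the intron counts can be tallied and converted to exon counts, since a gene with n introns has n + 1 exons. The dictionary below simply restates the numbers given in the text; the variable names are illustrative.

```python
# Number of introns -> number of SmLAC genes, as stated in the text.
genes_by_intron_count = {3: 2, 4: 3, 5: 17, 6: 6, 8: 1}

total_genes = sum(genes_by_intron_count.values())

# A gene with n introns has n + 1 exons.
exon_counts = sorted(introns + 1 for introns in genes_by_intron_count)

print(total_genes)                          # total number of genes accounted for
print(min(exon_counts), max(exon_counts))   # range of exon counts
```

The tally accounts for all 29 genes, and the resulting exon counts span 4 to 9, matching the range quoted for the family.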

The secondary structures of SmLACs were predicted on SOPMA. The results showed that random coil element was the main unit in SmLACs, followed by extended strand and α-helix (see **Supplementary Table 5**). The proportion of the α-helix structure in the 29 SmLACs ranges from 12.74% (SmLAC26) to 21.43% (SmLAC5), and the β-turn structure ratio from 8.66% (SmLAC18) to 12.63% (SmLAC25). The extended strand and random coil are from 26.52% (SmLAC15) to 32.23% (SmLAC18), and 39.27% (SmLAC10) to 46.90% (SmLAC26) respectively.

## Protein Sequence Similarity Analysis

Protein sequence similarity of the 29 SmLACs was first assessed by sequence comparison. According to the sequence alignment (**Figure 3**), the highly conserved residues that coordinate the 4 copper atoms were found in all SmLACs except SmLAC5, which lacks the T1 copper binding site (H-C-H) near the

TABLE 1 | Physical, chemical characterization and the prediction of signal peptide and protein location of 29 SmLACs.


C-terminus. It was believed that the axial ligand near the T1 copper binding site (H-C-H-X3-H-X3-G-[LMI(F)]) proximal to the C-terminus partially influenced LAC redox potential (Turlapati et al., 2011). Functions of SmLAC5 might be different and need further examination.


Pairwise sequence similarities among the 29 predicted peptide sequences of SmLACs ranged from a low of 39.7% (SmLAC10 vs. SmLAC16) to a high of 98.2% (SmLAC9 vs. SmLAC13) (see **Supplementary Table 6**). For most pairs, the identity percentage varied from 40 to 70%. SmLAC9 and SmLAC13 are examples of closely related proteins sharing amino acid identity greater than 98% that may represent within-species alleles.
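A pairwise identity percentage of this kind can be computed from two aligned sequences as sketched below. This is an illustrative definition only: it assumes that columns containing a gap in either sequence are excluded from the comparison, which may differ from the convention used by the alignment software behind the reported values, and the short peptide strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned fragments, not real SmLAC sequences.
seq_1 = "MKT-LLA"
seq_2 = "MRTALLA"
print(round(percent_identity(seq_1, seq_2), 1))
```

Applying such a function to every pair of aligned sequences yields a symmetric matrix like the one summarized in the supplementary table.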

### Phylogenetic Analysis and Conserved Motifs Identification

To obtain the evolutionary relationships among the 29 SmLACs and other LACs from 10 selected plants (Zea mays, Nicotiana tabacum, Populus trichocarpa, Pinus taeda, Gossypium arboreum, Glycine max, Acer pseudoplatanus, Liriodendron tulipifera, A. thaliana and Oryza sativa), a neighbor-joining tree was constructed in MEGA 5.05 with 1000 bootstrap replicates and pairwise deletion of gaps/missing data, and the sequences clustered into seven phylogenetic groups (**Figure 4**). Groups I and II each contained 3 SmLACs. Group III consisted of 4 SmLACs. There were 1, 2 and 1 SmLACs in groups IV, V, and VI, respectively. Group VII included about half of the total SmLACs (15 SmLACs).

It is commonly accepted that proteins sharing a high degree of sequence similarity usually have similar functions in their respective species. Although the functions of SmLACs are unknown, constructing a phylogenetic tree of SmLACs together with LACs from different plants can help to infer their functions and lay a foundation for future functional studies. As shown in **Figure 4**, SmLAC10 shares 100% similarity with AtLAC6 and SmLAC14 shares 99% similarity with AtLAC15, strongly suggesting that SmLAC10 and SmLAC14 may have functions similar to those of AtLAC6 and AtLAC15, respectively. AtLAC15 is known to be involved in lignin synthesis, seed germination and root elongation (Liang et al., 2006a), as well as in oxidizing epicatechin into oligomers and in the synthesis of flavonoids in the seed coat of A. thaliana (Pourcel et al., 2005); SmLAC14 may therefore have similar functions in the development of S. miltiorrhiza. It was reported that when AtLAC8 was knocked out, flowering was delayed and the number of leaves decreased (Cai et al., 2006). SmLAC25 and SmLAC29 are close to AtLAC8 in the phylogenetic tree, so they might play similar roles in controlling flowering and leaf growth in S. miltiorrhiza as AtLAC8

does in Arabidopsis. SmLAC4 and SmLAC24 are the closest homologs of AtLAC17 (up to 90% similarity), which is strongly expressed in Arabidopsis stems and participates in lignin synthesis (mainly in guaiacol radical accumulation) (Berthet et al., 2011), indicating that SmLAC4 and SmLAC24 may also participate in lignin synthesis in S. miltiorrhiza. SmLAC22 is close to AtLAC2, whose function is related to root elongation in Arabidopsis (Cai et al., 2006). Group VII contains 15 SmLACs but no members from other species, indicating that they might be species-specific and perform as yet unknown functions in S. miltiorrhiza.

To further analyze the sequence features of these 29 SmLAC proteins, a conserved motif search was conducted by MEME (**Figure 5**). The result suggested that most SmLAC proteins in the same group have similar motifs. Members in group IV, V, and VI contained 10 different types of conserved motifs. Eleven SmLACs in group VII held 10 kinds of motifs, while the rest four contained 9 motifs. Eight types of conserved

motifs were found in some members of group II and group III (SmLAC15/5/17).

#### Prediction of Diverse cis Regulatory Elements in SmLAC Promoters

Various numbers of putative cis-acting elements, including the core CAAT box and TATA box, were detected in the promoters of each S. miltiorrhiza LAC gene by PlantCARE (see **Supplementary Table 7**). All 29 SmLAC promoter sequences have many light-responsive elements, such as the G-box (Argüello Astorga and Herrera Estrella, 1998), revealing an essential role of SmLACs in plant morphogenesis. In addition, there are three types of representative DNA regulatory elements: stress-responsive elements responding to diverse abiotic (anaerobic induction, defense and stress, cold and dehydration, heat stress, low temperature, drought, and wound) and biotic (fungal elicitor) stresses; hormone-responsive elements involved in the response to various plant hormones, such as ABA, MeJA, GA, SA, auxin, and ethylene; and tissue-specific expression elements related to meristem-, endosperm-, seed- or shoot-specific activation and regulation. Moreover, two classes of MYB binding site elements (MBS I and MBS II), which are regulatory sites of flavonoid biosynthetic genes, were discovered in the promoters of five SmLAC genes (SmLAC3/9/10/13/28).

## Seven SmLACs Were Found to Be Potential Targets of miR397

It has been reported that ptr-miR397a is a negative regulator of LAC genes in Populus trichocarpa (Lu et al., 2013). Since the miR397 sequence of S. miltiorrhiza was not available, ptr-miR397a from P. trichocarpa was used to search the transcript sequences of the 29 candidate SmLACs, and 7 SmLACs (SmLAC8/24/4/27/5/25/29) were predicted to be potential ptr-miR397a targets (see **Supplementary Table 8**). Ssl-miR397 of Salvia sclarea, a congener of S. miltiorrhiza, was also used to search the 29 SmLAC transcript sequences, and the same seven SmLACs turned out to be potential ssl-miR397 targets. Thus, SmLAC8/24/4/27/5/25/29 may be negatively regulated by miR397 in S. miltiorrhiza.
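The idea behind complementarity-based target prediction can be illustrated with a deliberately simplified score. The sketch below is not the psRNATarget algorithm, whose scoring schema also handles gaps and position-dependent weights; it only assumes that a Watson-Crick pair costs 0, a G:U wobble pair costs 0.5 and any other pairing costs 1, and it reuses the cut-off of 5 stated in the methods. The function name and the short RNA strings are hypothetical toy sequences, not miR397 or SmLAC sequences.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
CUTOFF = 5.0  # cut-off score stated in the methods

def mismatch_score(mirna, target_site):
    """Simplified penalty for a miRNA paired with an equally long target site.

    Both sequences are written 5'->3'; the target is read in reverse so that
    the miRNA 5' end pairs with the 3' end of the site.
    """
    if len(mirna) != len(target_site):
        raise ValueError("this sketch only scores ungapped, equal-length duplexes")
    score = 0.0
    for pair in zip(mirna, reversed(target_site)):
        if pair in WATSON_CRICK:
            continue                                  # perfect pair, no penalty
        score += 0.5 if pair in WOBBLE else 1.0       # wobble is penalized less than a mismatch
    return score

toy_mirna = "UCAUUGAG"     # hypothetical 8-nt toy sequence
perfect_site = "CUCAAUGA"  # exact reverse complement of toy_mirna
imperfect_site = "CUUAAUGC"  # one G:U wobble and one mismatch

for site in (perfect_site, imperfect_site):
    score = mismatch_score(toy_mirna, site)
    print(site, score, "putative target" if score <= CUTOFF else "rejected")
```

Lower scores indicate closer complementarity, which is why candidates are kept only when the score does not exceed the cut-off.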

## Differential Expression Profiles of SmLACs in Different Organs and Tissues

The relative constitutive abundance of the 29 SmLACs was quantified in roots, stems, leaves and flowers through Illumina and PacBio sequencing technology (Xu et al., 2015). In addition, the abundance of SmLACs in different root tissues, including periderm, phloem and xylem, was also tested. The expression level of each SmLAC was estimated from RPKM (reads per kilobase of transcript per million mapped reads) values and presented in the heatmap in **Figure 6**. The results showed that the expression levels of the 29 SmLACs varied with organs. For instance, SmLAC7/8/20/27/28 were highly expressed in roots while SmLAC3/8/11/12/15/17/20/22/24/27/28 were highly expressed in stems. SmLAC9/11/12/20 were highly expressed in leaves, and SmLAC3/8/20/22/27 were highly expressed in flowers. The expression level of SmLAC20 was high in all four organs. SmLAC8 and SmLAC27 were highly expressed in roots, stems and flowers. However, the expression level of 5 SmLACs (SmLAC5/10/18/19/26) was very low in the four organs.

As for the three root tissues, 5 SmLACs (SmLAC7/20/25/26/28) displayed the highest transcript abundance across xylem, epidermis and periderm. SmLAC16 was most highly expressed in phloem, followed by periderm and xylem. SmLAC19 was more highly expressed in periderm than in phloem and xylem. SmLAC8/22/24/27 were highly expressed in xylem. These results indicate both functional conservation and diversity among SmLACs.
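For readers unfamiliar with the unit, RPKM scales a gene's read count by both transcript length and sequencing depth, and heat maps are usually drawn from log2-transformed values. The sketch below shows the standard RPKM formula; the read counts, gene length and library size are hypothetical, and the pseudocount of 1 added before the log2 transform is an assumption of this sketch rather than a detail taken from the article.

```python
import math

def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def log2_expression(value, pseudocount=1.0):
    """log2 transform with a pseudocount so that zero expression stays defined."""
    return math.log2(value + pseudocount)

# Hypothetical numbers: 500 reads on a 2,000-bp transcript in a library of 10 million mapped reads.
value = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
print(value)
print(round(log2_expression(value), 2))
```

Because of the length and depth scaling, values are comparable between genes of different sizes and between libraries of different sizes, which is what allows a single heat map across organs.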

## Confirmation of Five Highly Expressed SmLACs in Roots

Transcript abundance of a gene often correlates with its function. Considering that hydrophilic compounds such as SAB are more abundant in the roots of S. miltiorrhiza than in other organs, the five SmLACs highly expressed in roots (SmLAC7/8/20/27/28) may participate in salvianolic acid biosynthesis. To verify their expression levels, real-time PCR was performed (**Figure 7**). The results showed that the expression levels of SmLAC8, SmLAC27 and SmLAC28 were higher in stems than in roots and leaves. The expression level of SmLAC7 was highest in roots, followed by stems and leaves, consistent with its expression in the heatmap. SmLAC20 was highly expressed in leaves, and its expression in stems was much lower than in roots and leaves.

### Effects of Methyl Jasmonate on Expression of the Five Targeted SmLACs

Methyl jasmonate (MeJA) has been used in plant cell engineering to induce gene expression (Xiao et al., 2009). Previous studies have shown that genes in the metabolic pathway of S. miltiorrhiza can be significantly induced by MeJA, thereby increasing the content of SAB (Xiao et al., 2009). To obtain preliminary information about the effect of MeJA on SmLACs, the expression levels of the five SmLACs in hairy roots treated with MeJA for 0, 4, 8, 16, and 24 h were analyzed by real-time PCR using gene-specific primers (see **Supplementary Table 1**). The results (**Figure 8**) showed that SmLAC7, SmLAC20 and SmLAC28 were significantly induced by MeJA, with expression increasing more than 3-fold. SmLAC28 reached its peak at 4 h, while the maximum expression of SmLAC7 and SmLAC20 appeared at 8 h and 16 h, respectively. However, there was no significant upward trend in SmLAC8 or SmLAC27.

### Silencing of SmLAC7, SmLAC20, and SmLAC28 in Hairy Roots of S. miltiorrhiza

To explore the roles of the MeJA-responsive genes (SmLAC7, SmLAC20 and SmLAC28) in the phenolic acid biosynthetic pathway of S. miltiorrhiza, RNAi transgenic hairy roots were generated under the control of the CaMV35S promoter. Real-time PCR was performed to confirm the transcript levels of these genes (**Figure 9A**). Compared with the wild type (WT) line, the transcript levels of SmLAC7, SmLAC20, and SmLAC28 were all decreased in the RNAi lines (**Figure 9A**). Silencing also reduced the RA and SAB content in the SmLAC7- and SmLAC20-silenced lines (**Figure 9C**). The content of RA and SAB in the negative control (NC) line differed dramatically from that in the WT line. Since the NC line was induced by A. tumefaciens carrying the empty RNAi vector, the accumulation of compounds might have been affected. Therefore, the NC line was used as the control. Compared with NC, the average reduction of SAB was 87% in the SmLAC20 line, and 29.6% and 7.45% in the SmLAC7 and SmLAC28 lines, respectively (**Figure 9C**).

In addition, the transcript levels of the other two genes behaved differently in each RNAi line. In the SmLAC7-silenced line, the expression levels of SmLAC20 and SmLAC28 were higher than in WT (up to 4.18-fold and 1.78-fold, respectively), whereas in the SmLAC20- and SmLAC28-silenced lines, the expression of SmLAC7 was lower than in WT (about 0.6-fold and 0.16-fold of the WT level, respectively). However, the expression of SmLAC20 in the SmLAC28-silenced line did not differ obviously from that in WT, even though SmLAC28 expression in the SmLAC20-silenced line reached 1.97-fold of that in WT (**Figure 9B**).

To further investigate the putative impact of the transgenes on lignin biosynthesis, the vascular development of the hairy roots was inspected. Cross sections of hairy roots from both wild-type and RNAi samples were stained with Safranine O-Fast green FCF. The diameters of the transgenic cultures were much greater than those of the WT ones, as was the width of the xylem (**Figure 9E**). More significantly, the xylem cells in RNAi samples appeared larger and looser than those in WT samples. Holes were present in the middle of the RNAi samples, except in the SmLAC28-silenced line. Moreover, the biomass of the SmLAC-silenced hairy roots in shake-flask cultures showed an obvious decrease after 30 days of cultivation compared with the wild type (**Figure 9D**).
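The percentage reductions quoted relative to the NC line follow from a simple ratio of metabolite contents. The sketch below shows that arithmetic; the function name and the content values are hypothetical and were chosen only so that the result reproduces the 87% figure, since the underlying measurements are reported in the figure rather than in the text.

```python
def percent_reduction(control_content, line_content):
    """Reduction of a metabolite in a transgenic line, as a percentage of the control."""
    if control_content <= 0:
        raise ValueError("control content must be positive")
    return 100.0 * (control_content - line_content) / control_content

# Hypothetical SAB contents in arbitrary units (not measured values from the study).
nc_sab = 100.0
silenced_line_sab = 13.0
print(percent_reduction(nc_sab, silenced_line_sab))
```

A negative result from this function would indicate that the line accumulated more of the metabolite than the control.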

#### Overexpression of SmLAC7 and SmLAC20 in Hairy Roots of S. miltiorrhiza

SmLAC7, SmLAC20, and SmLAC28 were all overexpressed in S. miltiorrhiza hairy roots. However, hairy roots overexpressing SmLAC28 did not grow well and were excluded from this analysis. To explore the in vivo roles of SmLAC7 and SmLAC20 in the SAB biosynthesis pathway, their transcript levels were determined by real-time PCR and showed a striking increase in the transgenic lines (**Figure 10A**), as did the contents of RA and SAB (**Figure 10B**). The accumulation of SAB in the SmLAC20-overexpressing line was 5.45-fold higher than that in NC, and it was 5.61-fold higher in the SmLAC7-overexpressing line than in NC.

Unlike the LAC-silenced lines, the SmLAC7- and SmLAC20-overexpressing lines (**Figures 10C,D**) showed increased biomass as well as a larger vascular diameter compared with WT lines. Interestingly, the degree of lignification in the LAC-overexpressing hairy root lines was increased compared with the wild type. The xylem area in the SmLAC20-overexpressing lines turned out to be larger than that in the wild type (**Figure 10D**).

## DISCUSSION

As a multigene family, LACs exist widely in fungi, bacteria, plants and animals. Because of their special catalytic properties, functional variations have been found in fungi and bacteria. However, the functions of plant LACs are still poorly understood. Recent structural and functional genomics studies in higher plant model species such as Arabidopsis revealed 17 LACs that are involved in stress response and lignin biosynthesis (Turlapati et al., 2011). The availability of the whole genome sequence of S. miltiorrhiza facilitates a comprehensive characterization of LAC genes in this commonly used herb. Here, we identified and characterized 29 LAC candidates in

S. miltiorrhiza. They all exhibit the typical characteristics of three conserved Cu-oxidase domains, four signature sequences and 12 housed copper ligands.

Using a number of bioinformatics methods, we systematically analyzed all 29 SmLACs. Although they all share similar coding domain structures, their sequence similarities are relatively low and their intron–exon structures are quite different. Further differences in the number of amino acids and pI values imply potential functional divergence. In agreement with previous reports on Arabidopsis LACs (Turlapati et al., 2011), we found that S. miltiorrhiza LAC proteins are mostly located in the secretory pathway and contain glycosylation sites, which ensure protein stability, folding, and formation of the cell wall.

Based on the phylogenetic relationships between S. miltiorrhiza and ten other plants, the 29 SmLACs are distributed into seven groups. Group VII contains 15 LACs, all from S. miltiorrhiza, suggesting that these 15 SmLACs might be species-specific. SmLAC8 and SmLAC27 are closely homologous to AtLAC4, and both SmLAC4 and SmLAC24 are adjacent to AtLAC17 in the phylogenetic tree. Since the two AtLACs function in lignin synthesis (Berthet et al., 2011; Zhao et al., 2013), we anticipate that all four (SmLAC4/8/24/27) participate in the synthesis of lignin in S. miltiorrhiza. This could be supported by the potential involvement of MYB58 in regulating SmLAC4/8/24/27, since their promoters contain the MYB binding site (MBS). The MYB members MYB58 and MYB63 are known transcription activators in lignin biosynthesis, and MYB58 in particular is capable of activating AtLAC4 (Zhou et al., 2009).

SmLACs may be negatively regulated by miR397. In P. trichocarpa, overexpressed ptr-miR397a negatively regulates LAC genes and decreases lignin content (Lu et al., 2013). We compared the 29 SmLACs with ptr-miR397a and ssl-miR397 one by one and found that 7 SmLACs (SmLAC4/5/8/24/25/27/29) are predicted to bind miR397 tightly, suggesting a role for miR397 in regulating the expression of SmLACs.

Expression of the 29 LACs is tissue- and organ-dependent, as supported by the transcriptome sequencing analysis. We also observed that SAB mainly accumulates in the roots and that this accumulation is positively correlated with the high expression of SmLAC7/8/20/27/28; thus, we speculate that these LACs may be involved in the synthesis of lignin and salvianolic acid in roots. This is supported by the combination of an earlier report that the contents of SAB and RA are affected by MeJA (Xiao et al., 2009) and our current result that MeJA-responsive motifs appear in the promoters of more than half of the SmLACs (SmLAC1/3/4/5/6/7/8/9/12/13/14/15/19/20/21/23/24/25/27),

including the four highly expressed genes (SmLAC7/8/20/27) in roots. Indeed, MeJA significantly affected the expression of SmLAC7, SmLAC20, and SmLAC28 according to real-time PCR on MeJA-treated hairy roots of S. miltiorrhiza. Moreover, when the LAC in poplar is inhibited, its lignin component is unchanged but its phenolic metabolites are altered (Ranocha et al., 2002). This indicates that the genes highly expressed in roots likely participate in the biosynthesis of secondary metabolites in the phenylpropanoid pathway. Therefore, the three MeJA-affected genes that are highly expressed in roots were chosen for further study to clarify their roles in the biosynthesis of SAB.

As expected, the contents of SAB in both the SmLAC7- and SmLAC20-silenced lines were lower than in the wild type and negative control. Conversely, in the SmLAC-overexpressing transgenic hairy root lines, SAB content increased with the expression of LACs. These observations strongly support that both SmLAC7 and SmLAC20 participate in the synthesis of SAB, although the exact SAB biosynthesis mechanism remains to be revealed.

In short, we comprehensively characterized the LACs of S. miltiorrhiza and analyzed their molecular regulatory functions. The results provide solid groundwork for further exploring LACs in S. miltiorrhiza and other species. Our work adds to the knowledge of how SAB is formed and points to a promising future for metabolic regulation of S. miltiorrhiza in quality control.

## AUTHOR CONTRIBUTIONS

QL and JF conceived the study. QL participated in data mining, data analysis, and proofreading the manuscript. JF performed the RNAi and overexpression experiments and wrote the manuscript. LC carried out the qRT-PCR experiment. ZX and YZhu provided the genome and transcription information of S. miltiorrhiza. YW, YZhou, and HT prepared the figures. YX initiated the project. JC helped to analyze the data. LZ participated in the design of the study. WC helped to conceive the study and participated in its design and coordination. All authors read and approved the final manuscript.

## FUNDING

This work was financially supported by the National Natural Science Foundation of China (31770329, 81325024, 81603220, 81303160, and 81673529) and the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University (2018FR003 and ZY20180206).

## ACKNOWLEDGMENTS

We gratefully acknowledge Prof. Shilin Chen (China Academy of Chinese Medical Sciences), Prof. Jingyuan Song and Hongmei Luo (Peking Union Medical College) for kindly providing the genome data and transcription profiling of S. miltiorrhiza.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00435/full#supplementary-material

## REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer YC declared a shared affiliation, with no collaboration, with one of the authors, YiZ, to the handling Editor at the time of the review.

Copyright © 2019 Li, Feng, Chen, Xu, Zhu, Wang, Xiao, Chen, Zhou, Tan, Zhang and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Molecular Authentication of the Medicinal Species of Ligusticum (Ligustici Rhizoma et Radix, "Gao-ben") by Integrating Non-coding Internal Transcribed Spacer 2 (ITS2) and Its Secondary Structure

#### Zhen-wen Liu<sup>1</sup> , Yu-zhen Gao<sup>2</sup> and Jing Zhou<sup>2</sup> \*

<sup>1</sup> CAS Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China, <sup>2</sup> School of Pharmaceutical Sciences and Yunnan Key Laboratory of Pharmacology for Natural Products, Kunming Medical University, Kunming, China

#### Edited by:

Caroline Howard, Medicines and Healthcare Products Regulatory Agency, United Kingdom

#### Reviewed by:

Adrian Slater, De Montfort University, United Kingdom Natalia V. Ivanova, University of Guelph, Canada

\*Correspondence: Jing Zhou zhoujing\_apiaceae@163.com

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 04 August 2018 Accepted: 21 March 2019 Published: 09 April 2019

#### Citation:

Liu Z-w, Gao Y-z and Zhou J (2019) Molecular Authentication of the Medicinal Species of Ligusticum (Ligustici Rhizoma et Radix, "Gao-ben") by Integrating Non-coding Internal Transcribed Spacer 2 (ITS2) and Its Secondary Structure. Front. Plant Sci. 10:429. doi: 10.3389/fpls.2019.00429

Ligustici Rhizoma et Radix (LReR), an important Chinese medicine known as "Gao-ben," refers to Ligusticum sinense Oliv. or Ligusticum jeholense Nakai et Kitag. However, a number of other species are commonly sold as "Gao-ben" in the herbal medicine market, which may result in a series of quality control problems and inconsistent therapeutic effects. The "Gao-ben" is commonly sold sliced and dried, making traditional identification methods difficult. Here, the mini barcode ITS2 region was examined on 68 samples representing LReR and 7 potential adulterant or substitute species. The results showed 100% success rates of PCR and sequencing and the existence of a barcoding gap. The neighbor-joining (NJ) tree indicated that all the tested samples could be exactly identified. The ITS2 secondary structure revealed a clear difference between true "Gao-ben" and three adulterant species. We therefore recommend the use of ITS2 as a mini barcode for distinguishing between closely or distantly related plant species that may be used in Chinese medicine.

Keywords: Apiaceae, DNA barcoding, Ligustici Rhizoma et Radix, ITS2, secondary structures

## INTRODUCTION

Apiaceae, the 16th-largest flowering plant family, comprises more than 3,540 species in 446 genera (Mabberley, 1997). It is a well-known and economically important plant family in medicine, spices, vegetables and ornamental gardening. The roots and rhizomes of Ligusticum sinense Oliv. or Ligusticum jeholense Nakai et Kitag form a widely used traditional Chinese medicine, known as "Ligustici Rhizoma et Radix" (LReR), or "Gao-ben" in Chinese (Commission, 2015). It is commonly used to treat colds, trapped wind, headaches and rheumatic arthralgia (Commission, 2015). It has been reported to exhibit many beneficial properties such as analgesic, antipyretic, anti-inflammatory and anticonvulsive activities, plus antimicrobial and antioxidant

effects (Wang et al., 2011). As a result, this herb has attracted more and more attention in the medical field, and have been widely used in clinical therapies.

However, LReR is easily confused with other herbs, causing potential mistakes in treatment. Certain species that are closely related or morphologically similar to LReR are frequently used as local remedies in various regions due to geographical and historical factors. For instance, Meeboldia yunnanensis (H. Wolff) Constance & F. T. Pu, Ligusticum delavayi Franchet and Ligusticum pteridophyllum Franchet are often used in folk medicine in southwestern China, but their function and efficacy are not quite the same (Li et al., 2001). Additionally, Conioselinum vaginatum (Spreng.) Thell., Ligusticum tenuissimum (Nakai) Kitagawa, Sium suave Walter and Ligusticum acuminatum Franch. are also sold as "Gao-ben" in medicinal markets. Using morphology-based or chemical identification methods to identify Rhizoma et Radix herbs is difficult, especially when the plant is sliced and dried, as they often are for selling (Hon et al., 2003; Joshi et al., 2004; Xue et al., 2006; Mishra et al., 2016). Therefore, a simple, inexpensive and effective method for distinguishing between the above species is urgently needed.

DNA barcoding is a new taxonomic method that uses one or a few short, standard genomic DNA region(s) for rapid, reliable and effective species identification (Floyd et al., 2002; Hebert et al., 2003). For identifying traditional herbal medicines, however, the commonly used three-barcode system has not always been effective, because matK and rbcL are difficult to amplify, especially from powdered products where DNA degradation is very common. The internal transcribed spacer 2 (ITS2) region of nuclear ribosomal DNA might be a promising standardized region to barcode medicinal plants, due to its relatively short length, consistent performance in distinguishing closely related species, and ease of amplification with a single set of universal primers (Chen et al., 2010). In cells, ITS2 has conserved nucleotide motifs which play a role in forming a three-dimensional structure, and hence transformation into a functional complex before conversion to a mature rRNA (Hall, 1999; Lalev and Nazar, 1999). Compared with previous sequence alignments based only on nucleotide similarity, the ITS2 conserved nucleotide motifs permit multiple sequence alignments, from which a more homologous overall alignment can be generated (Kjer, 1995; Coleman and Mai, 1997). Additionally, the secondary structure is maintained by interactions between base pairs that are canonical (GC, AU), non-canonical stable (GU) and unstable (AC) (Elgavish et al., 2001; Leontis and Westhof, 2001). These paired and unpaired ITS2 structures provide extra molecular morphological features that can greatly improve taxonomic classification (Telford et al., 2005; Keller et al., 2010).

In this study, we use ITS2 sequence and secondary structure information to determine whether genuine "Gao-ben" can be distinguished from the most commonly used adulterants and substitutes. We also discuss the possibility that ITS2 might play a regulatory role in the herbal medicine market in the future.

## MATERIALS AND METHODS

## Plant Materials

A total of 43 samples belonging to eight species, covering the two original species of LReR (L. sinense and L. jeholense) and seven potential substitutes (L. tenuissimum, L. pteridophyllum, L. acuminatum, L. delavayi, S. suave, M. yunnanensis, and C. vaginatum), were collected in the field. Voucher specimens were deposited in the Kunming Institute of Botany, Chinese Academy of Sciences (KUN), and Chengdu Institute of Biology, Chinese Academy of Sciences (CDBI) (**Supplementary Table S1**). To determine possible infraspecific molecular variation, each species was represented by at least two individuals. Furthermore, 25 commercial "Gao-ben" products were purchased from different medicinal markets in China (**Supplementary Table S1**).

## Laboratory Protocols

Before DNA extraction, the surface of the medicinal materials was wiped with 75% ethanol, and the materials were then ground into powder with a grinder. Genomic DNA was extracted using the modified CTAB procedure of Doyle and Doyle (1987). The primers BEL-2 (5′-GATGCGGAGATTGGCCCCCCGTGC-3′) and BEL-3 (5′-GACGCTTCTCCAGACTACAAT-3′) were used to amplify the complete ITS2 region (Chiou et al., 2007). The PCR parameters were as follows: initial denaturation for 3 min at 94°C, followed by 36 cycles of denaturation (94°C, 45 s), annealing (55°C, 1 min) and extension (72°C, 2 min), and a final extension for 7 min at 72°C. Purified PCR products were sequenced in both directions with the primers used for PCR amplification on an ABI 3730 automated sequencer (Applied Biosystems, Foster City, CA, United States) at Sangon Biotech Corporation (Shanghai, China).

#### Data Analysis

Newly generated sequences were initially edited and assembled using SeqMan of the DNASTAR 5.01 software package (DNASTAR, Inc., Madison, United States). The ITS2 region was annotated using the Hidden Markov Model (HMM) (Keller et al., 2009) to delete the conserved 5.8S and 28S sections (Koetschan et al., 2012). ITS2 secondary structures of all investigated taxa were predicted by homology modeling in the ITS2 Database (identity matrix and 75% threshold for the helix transfer<sup>1</sup>) (Koetschan et al., 2012). This method usually results in several alternative folding patterns for the same ITS2 sequence. The folding pattern accepted as true corresponds to the secondary structure model of Mai and Coleman, and was well supported by compensatory base changes (CBCs) and hemi-compensatory base changes (hCBCs) revealed by comparisons among related taxa (Mai and Coleman, 1997). Sequences with homologous structure were automatically and synchronously aligned using 4SALE 1.7 (Seibel et al., 2006, 2008). Genetic distances were calculated according to the Kimura 2-parameter (K2P) model using MEGA 7.0 software (Kumar et al., 2016). A neighbor-joining (NJ) tree was constructed and bootstrap tests were performed using 1000 replicates to separate the sampled species via MEGA 7.0

<sup>1</sup>http://its2.bioapps.biozentrum.uni-wuerzburg.de/

(Kumar et al., 2016). CBCs are substitutions in two positions that retain pairing, i.e., G = C ↔ C = G, A = U ↔ U = A. The proposed ITS2 secondary structure was examined for CBCs with the CBCAnalyzer option (Wolf et al., 2005) implemented in 4SALE, whereas hCBCs (pair ↔ non-pair, i.e., G = C ↔ G = U) were observed manually.
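The distinction between CBCs and hCBCs can be illustrated with a minimal Python sketch. It is not the CBCAnalyzer algorithm; the function name `classify_pair_change` and the set of pairs treated as paired (Watson-Crick pairs plus the G-U wobble) are assumptions made for illustration, and an hCBC is taken here to be a change on one side of a pair that retains pairing, as in the G = C ↔ G = U example above.

```python
# Pairs treated as able to base-pair in this sketch (assumption):
# Watson-Crick pairs plus the G-U wobble pair.
PAIRING = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def classify_pair_change(pair_a, pair_b):
    """Classify the change between two homologous helix positions.

    pair_a and pair_b are (5' base, 3' base) tuples taken from the same
    aligned pairing position in two ITS2 secondary structures.
    """
    if pair_a == pair_b:
        return "identical"
    if pair_a not in PAIRING or pair_b not in PAIRING:
        # At least one of the two taxa has no pairing at this position.
        return "unpaired"
    # Count how many of the two positions differ while pairing is retained.
    changed = sum(x != y for x, y in zip(pair_a, pair_b))
    return "CBC" if changed == 2 else "hCBC"


if __name__ == "__main__":
    print(classify_pair_change(("G", "C"), ("C", "G")))  # both sides changed
    print(classify_pair_change(("G", "C"), ("G", "U")))  # one side changed
```

Applied to every aligned pairing position of two structures, such a classifier would give per-helix counts of the kind summarized in a CBC table.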

#### RESULTS

#### Amplification, Sequencing, and Sequences Characteristics

The success rate of the ITS2 PCR amplification and sequencing was 100% (**Table 1**). All high-quality sequences were submitted to GenBank (**Table 1**). The ITS2 sequence showed minor length variation across all samples, ranging from 220 bp (M. yunnanensis) to 226 bp (S. suave). The ITS2 sequences of the 15 L. sinense and L. jeholense individuals were 222 bp long, with an average GC content of 54.9%. The ITS2 sequence lengths of L. acuminatum, L. delavayi, L. pteridophyllum, L. tenuissimum, C. vaginatum, M. yunnanensis, and S. suave were 223, 223, 223, 223, 222, 220, and 226 bp, respectively. The corresponding average GC content of these adulterants varied from 53.2 to 57.9%. The alignment of 227 bp contained 84 variable sites, a rate of 37.0% (**Table 1**). Therefore, the ITS2 sequences for the sampled species were relatively variable.
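The summary statistics above can be reproduced from an alignment with a few lines of Python. The following is a minimal sketch, not the procedure used in the study: the function names are hypothetical, and treating a gap as a character state when counting variable sites is an assumption that may differ from the software settings actually applied.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def variable_sites(alignment):
    """Count alignment columns that contain more than one character state.

    Gaps ('-') are counted as a state here, which is an assumption.
    """
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return sum(len(set(column)) > 1 for column in zip(*alignment))


if __name__ == "__main__":
    toy_alignment = ["ACGT-", "ACGTA", "ATGTA"]  # toy data, not study sequences
    print(variable_sites(toy_alignment))
    print(round(gc_content("GGCCAT"), 1))
    # Rate reported in the text: 84 variable sites in 227 aligned positions.
    print(round(100 * 84 / 227, 1))
```

The last line checks the arithmetic behind the reported rate: 84/227 is approximately 0.370, i.e., 37.0%.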

#### Intra/Interspecific Distance, Barcoding Gap, and NJ Tree

All individuals of L. sinense and L. jeholense were identical for ITS2, sharing a single haplotype. According to the K2P model, the average interspecific distance between these and the adulterant species was 0.175, with the maximum interspecific distance being 0.337 from L. delavayi, and the minimum being 0.059 from C. vaginatum (**Table 2** and **Figure 1A**). The NJ tree showed that the 15 individuals identified as L. jeholense and L. sinense clustered together in a highly supported clade (bootstrap value = 99) that was separated from the commonly used substitutes and adulterants such as L. pteridophyllum, L. acuminatum, L. tenuissimum, C. vaginatum, S. suave, M. yunnanensis, and L. delavayi (**Figure 2A**). Interspecific variation was therefore high, and an obvious barcoding gap was noted (**Figure 1B**).
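For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below implements this formula and a simple barcoding-gap check in Python. It is an illustration only: MEGA's handling of gaps and ambiguous bases may differ from the pairwise deletion assumed here, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Columns with a gap or ambiguous base in either sequence are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def has_barcoding_gap(intraspecific, interspecific):
    """True if the smallest interspecific distance exceeds the largest intraspecific one."""
    return min(interspecific) > max(intraspecific, default=0.0)


if __name__ == "__main__":
    # Distances reported in the text: a single shared haplotype within LReR
    # (distance 0) and interspecific distances between 0.059 and 0.337.
    print(has_barcoding_gap([0.0], [0.059, 0.337]))
```

With an intraspecific distance of zero and a minimum interspecific distance of 0.059, the two distributions do not overlap, which is what the barcoding gap describes.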

Of the 25 commercial "Gao-ben" products, six samples were identified as LReR, accounting for 24%, while the remainder were identified as M. yunnanensis (12%) and C. vaginatum (64%) (**Figure 2B**).

TABLE 2 | Analysis of intra-specific variation and inter-specific divergence of the ITS2 sequences.

#### Analysis of the Secondary Structure

ITS2 secondary structures obtained for all species examined fold into the common core structure known for eukaryotes, made up of four helices, the third being the longest (**Supplementary Figures S1A–H**; Mai and Coleman, 1997; Coleman, 2003, 2007, 2015; Schultz et al., 2005). **Figure 3** visualizes a 51% consensus structure. Sequence motifs include a U-U mismatch in helix II, an A-rich conserved spacer between helices II and III, and a UGGU motif on the 5′ side of the apex of helix III (**Figure 3**). In comparable portions of the secondary structure, most CBCs and hCBCs observed between LReR and its adulterants are in helices III, I, and II, with a few in helix IV (**Table 3**). L. sinense, L. jeholense, L. tenuissimum, L. acuminatum, C. vaginatum, and L. pteridophyllum form a group without any CBC in conserved ITS2 regions (i.e., in helices II and III), from which the group of L. delavayi, M. yunnanensis, and S. suave may be distinguished by the presence of at least one CBC in these regions.

#### DISCUSSION

#### Identification Capability of ITS2 for LReR

In the Chinese Pharmacopoeia, only L. sinense and L. jeholense are listed as LReR because of their very similar chemical compositions and nearly identical use and efficacy (Commission, 2015). Morphologically, both species share many similar characters, e.g., ternate-2- or 3-pinnate blade, ultimate segment margins irregularly serrate, and long and reflexed styles. Geographically, L. jeholense is in northern China, while L. sinense occurs more widely, but does not overlap with L. jeholense (Sheh and Watson, 2005). Both our present ITS2 (**Table 2**) and unpublished data including psbA-trnH, matK, and rbcL show no interspecific variation between these two species. Therefore, we consider that L. sinense and L. jeholense are close relatives, or the latter represents a vicariant geographical element.

FIGURE 1 | (A) Genetic distances from LReR to its adulterants and substitutes. (B) Relative distribution of interspecific divergence between LReR and its adulterants and substitutes and intraspecific variation in the ITS2 region using K2P genetic distance.

According to the criterion proposed by Coleman (2003) and coworkers, presence/absence of even a single CBC in the conserved areas of helices II and III of ITS2 is associated with incompatibility/inability to hybridize, thus establishing the boundaries between biological species and populations. In contrast, hCBCs in the conserved parts as well as changes in the less conserved regions (e.g., in helices I and IV) do not correlate with interbreeding ability. LReR can be distinguished from L. delavayi, M. yunnanensis, and S. suave by at least one CBC in the conserved ITS2 regions (i.e., in helices II and III) (**Table 3** and **Supplementary Figures S1F–H**). M. yunnanensis and L. delavayi are widely used as "Huang Gao-ben" in Yunnan Province, whereas S. suave has also been sold as "Gao-ben" in medicinal markets. It is difficult to distinguish them from LReR when they are dried, sliced, and shredded, but our ITS2 analysis indicated that they are phylogenetically distant species (**Figures 1A**, **2A** and **Table 2**). Moreover, these three species can be distinguished from the true LReR by the presence of at least one CBC in the conserved ITS2 regions (i.e., in helices II and III) (**Table 3**). Results from epidermal analysis (Zhou and Liu, 2018), together with those from cytological evidence (Zhou et al., 2008) and molecular phylogenetics (Zhou et al., 2009), indicated that these are not close relatives of L. sinense. We therefore recommend that L. delavayi, M. yunnanensis, and S. suave be marketed under their original herbal medicinal names.

Other genetically close relatives of the LReR plant group include L. pteridophyllum, L. tenuissimum, L. acuminatum, and C. vaginatum (**Figures 1A**, **2A** and **Table 2**). L. pteridophyllum is another herbal medicine widely used in Yunnan under the name "Hei Gao-ben" and has been regarded as an adulterant of Peucedanum praeruptorum Dunn (Rao et al., 1995). In our analysis, all accessions of L. pteridophyllum formed a strongly supported clade, with a sequence divergence of 0.073 from LReR (**Figures 1A**, **2A** and **Table 2**). According to Ye et al. (2004), the chemical composition of L. pteridophyllum is similar to that of LReR, so considering it a regional substitute seems reasonable. L. acuminatum had a sequence divergence of 0.064 from LReR and is used as a regional substitute in western Sichuan (Sheh and Watson, 2005); further research is needed to determine whether it can be regarded as an effective "Gao-ben" substitute.

Conioselinum vaginatum, known as "Xinjiang Gao-ben," is found mainly in the Tian and Altai mountains of Xinjiang and western Junggar mountains in central Asia and western Siberia (Sheh and Watson, 2005), and is widely cultivated. Chemical analysis showed that C. vaginatum contains 16 compounds, including ligustilide, ferulic acid, and myristic ether, which are the same as in L. jeholense (Li, 2013); however, its pharmacological action remains controversial (Dai, 1988; Li et al., 2013). Given that the annual demand for LReR exceeds 3500 tons, more than natural production can supply (Ding, 2010), research on the pharmacological efficacy of C. vaginatum is urgently needed to determine whether it could be an effective substitute.

TABLE 3 | Occurrence/frequency of CBCs and hCBCs between LReR (Ligusticum jeholense/L. sinense) and its adulterants and substitutes.

Base pairs displaying CBCs are indicated in bold.

## Potential Application of ITS2 in the Authentication of Medicinal Materials

The trade in crude drugs has surged globally, generating annual revenues over US \$60 billion (Newmaster et al., 2013). There are strong financial incentives for dishonest merchants to use adulterants and substitutes intentionally, leading to treatments not working as advertised, and posing serious risks to the health of consumers (Newmaster et al., 2013). Accurate and rapid species authentication is the best way to combat this. Traditional methods usually require taxonomists, who are few in number, and moreover a fairly complete specimen, meaning they will not work on plant fragments as sold. The ITS2 barcode presented here solves this problem. The ITS2 region is easy to amplify and sequence, has a short length, and reveals high interspecific variation (Chiou et al., 2007; China Plant BOL Group et al., 2011). While the ITS2 nucleotide sequences evolve quickly, their secondary structures are maintained by certain conserved motifs (Hershkovitz and Zimmer, 1996), which are very useful for sequence alignment (Mai and Coleman, 1997), especially when possible species are spread across many families, as is the case for some Traditional Chinese Medicines (Zhang et al., 2015). Meanwhile, the secondary structures of ITS2 can provide additional molecular morphological characteristics for better species discrimination (Grajales et al., 2007; Gu et al., 2013). Much research has been carried out using ITS2 to regulate the herbal medicine market (Zhang et al., 2015, 2016; Zhao et al., 2015; Zhu et al., 2017), and the current paper shows its effectiveness for LReR or "Gao-ben." Twenty-five commercial "Gao-ben" samples fell into three clades, corresponding to L. sinense + L. jeholense, C. vaginatum, and M. yunnanensis (**Figure 2A**). Surprisingly, more than half of the samples were derived from C. vaginatum. As mentioned above, whether C. vaginatum can be substituted for genuine "Gao-ben" is still controversial. Although M. yunnanensis is mainly cultivated and used in Yunnan province under the name "Huang Gao-ben," its adulteration or substitution can cause confusion in identification and in therapeutic efficacy.

ITS2 has been proved to vary in sequence and secondary structure in a way that highly correlates with species taxonomy. Müller et al. (2007) compared the ITS2 secondary structure of 1373 species to their nearest relatives, and observed that in 93% of cases, if two taxa are different somewhere in their ITS2 by one CBC, they would be classified as different species. This criterion has been less commonly used for herbal medicine authentication. However, in our study, presence of a CBC distinguishes genuine "Gao-ben" from three other species, i.e., L. delavayi, M. yunnanensis, and S. suave, at least the last of which is sometimes sold as "Gao-ben." So, as a rapid, inexpensive, and informative DNA barcode, ITS2 could be widely used to regulate the herbal medicine market.

## CONCLUSION

Traditional Chinese medicine is vulnerable to the replacement of the correct and most effective species with others that may be closely or distantly related. ITS2 could be an ideal candidate marker for authentication, drawing on both divergence in primary sequence and variation in secondary structure. This method is suitable for the identification of raw medicinal materials, but it is unsuitable for the authentication of heavily processed materials in which DNA degradation frequently occurs. A promising direction for authenticating degraded material would be to combine next-generation sequencing (NGS)-based and species-specific PCR-based methods (such as nucleotide signatures). In that process, knowing the secondary structure of ITS2 can help to locate short motifs that are well conserved within species and thus to develop nucleotide signatures.

## AUTHOR CONTRIBUTIONS

Z-wL, Y-zG, and JZ collected the samples and carried out the experiments. Z-wL and JZ analyzed the data, conceived and designed the study, and wrote the manuscript. All authors have read and approved the final manuscript.

## FUNDING

This work was supported by the National Natural Science Foundation of China (No. 31460052) and the United Research Foundation of Yunnan Science and Technology Department-Kunming Medical University (No. 2015FB014).

## ACKNOWLEDGMENTS

We thank Richard I. Milne from The University of Edinburgh and David E. Boufford from Harvard University Herbaria for language polishing.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00429/full#supplementary-material

FIGURE S1 | Secondary structure of ITS2 in Ligustici Rhizoma et Radix (LReR) and its adulterants and substitutes. (A) LReR, (B) L. acuminatum, (C) L. pteridophyllum, (D) L. tenuissimum, (E) Conioselinum vaginatum, (F) L. delavayi, (G) Sium suave, and (H) Meeboldia yunnanensis. Red and blue arrows show the site of the compensatory base changes (CBCs) and hemi compensatory base changes (hCBCs) between LReR and its adulterants and substitutes, respectively.

TABLE S1 | Detailed information of samples used in this study.

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Liu, Gao and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Phylogenomic Approaches to DNA Barcoding of Herbal Medicines: Developing Clade-Specific Diagnostic Characters for Berberis

Marco Kreuzer<sup>1</sup> \*, Caroline Howard<sup>2</sup> , Bhaskar Adhikari<sup>3</sup> , Colin A. Pendry<sup>3</sup> and Julie A. Hawkins<sup>1</sup>

<sup>1</sup> School of Biological Sciences, University of Reading, Reading, United Kingdom, <sup>2</sup> BP-NIBSC Herbal Laboratory, National Institute for Biological Standards and Control, Potters Bar, United Kingdom, <sup>3</sup> Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom

#### Edited by:

Nunzio D'Agostino, University of Naples Federico II, Italy

#### Reviewed by:

Michael R. McKain, The University of Alabama, United States Salvatore Cozzolino, University of Naples Federico II, Italy

> \*Correspondence: Marco Kreuzer marco.c.kreuzer@gmail.com

#### Specialty section:

This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Plant Science

Received: 29 November 2018 Accepted: 18 April 2019 Published: 14 May 2019

#### Citation:

Kreuzer M, Howard C, Adhikari B, Pendry CA and Hawkins JA (2019) Phylogenomic Approaches to DNA Barcoding of Herbal Medicines: Developing Clade-Specific Diagnostic Characters for Berberis. Front. Plant Sci. 10:586. doi: 10.3389/fpls.2019.00586

DNA barcoding of herbal medicines has been mainly concerned with authentication of products in trade and has raised awareness of species substitution and adulteration. More recently DNA barcodes have been included in pharmacopoeias, providing tools for regulatory purposes. The commonly used DNA barcoding regions in plants often fail to resolve identification to species level. This can be especially challenging in evolutionarily complex groups where incipient or reticulate speciation is ongoing. In this study, we take a phylogenomic approach, analyzing whole plastid sequences from the evolutionarily complex genus Berberis in order to develop DNA barcodes for the medicinally important species Berberis aristata. The phylogeny reconstructed from an alignment of ∼160 kbp of chloroplast DNA for 57 species reveals that the pharmacopoeial species in question is polyphyletic, complicating development of a species-specific DNA barcode. Instead we propose a DNA barcode that is clade specific, using our phylogeny to define Operational Phylogenetic Units (OPUs). The plastid alignment is then reduced to small, informative DNA regions including nucleotides diagnostic for these OPUs. These DNA barcodes were tested on commercial samples, and shown to discriminate plants in trade and therefore to meet the requirement of a pharmacopoeial standard. The proposed method provides an innovative approach for inferring DNA barcodes for evolutionarily complex groups for regulatory purposes and quality control.

Keywords: DNA barcoding, next-generation sequencing, operational phylogenetic units, herbal medicines, Berberis, pharmacopoeia, pharmacopoeial standards, plastome

## INTRODUCTION

DNA barcoding has two major objectives: specimen identification, where an unknown sequence is matched to a sequence of a known species, and species discovery, which is equivalent to species delimitation and species description (DeSalle, 2006). DNA barcoding of herbal medicines is mainly concerned with authentication, the identification of specimens for quality assurance (Sgamma et al., 2017). In the last decade, DNA barcoding of herbal medicines has raised awareness of species substitution and adulteration, highlighting issues surrounding the quality of herbal medicines in the global market (Newmaster et al., 2013; Srirama et al., 2017). Regulation of herbal medicines is a pressing issue for regulatory agencies (Directive 2001/83/EC, 2001; Directive 2004/83/EC, 2004; Vlietinck et al., 2009). Published pharmacopoeial standards for authentication predominantly rely on chemical and anatomical methods (e.g., British Pharmacopoeia, 2016), but DNA barcoding offers new tools for regulatory purposes (de Boer et al., 2015) and DNA barcodes have recently been incorporated into the British Pharmacopoeia for the first time (British Pharmacopoeia Commission, 2017). Here we investigate opportunities and limitations of DNA barcoding using next-generation sequence data of an evolutionarily complex genus. The aim is to design new methodological approaches for producing DNA barcodes for regulatory purposes, pharmacovigilance and quality assurance.

To date, the British Pharmacopoeia has approved 6 annotated DNA barcodes for the individual identification of the following species: Anethum graveolens Sowa (ITS2); Glehnia littoralis (ITS2); Ocimum tenuiflorum (trnH-psbA); Myristica fragrans (trnH-psbA); Phellodendron amurense (trnH-psbA); and Phellodendron chinense (trnH-psbA). The British Pharmacopoeia Commission (2017) have also published guidelines for the use of these barcodes, guiding users through the extraction of DNA, amplification of barcode markers, sequencing and comparison to pharmacopoeial standards. This development of bespoke barcode markers for different species is an approach likely to continue since there is no single, universal DNA barcode for land plants (Hollingsworth et al., 2011). For taxonomic purposes, several propositions have been made (e.g., Kress et al., 2005; Chase et al., 2007; CBOL Plant Working Group et al., 2009). Following Hollingsworth et al. (2011), most studies use a combination of the plastid regions matK, rbcL, the intergenic spacer trnH-psbA and the nuclear ITS2. Advances in sequencing technology have encouraged the barcoding community to augment the standard barcoding approach (Kane et al., 2012; Vaughn et al., 2014; Coissac et al., 2016; Zhang et al., 2017). In the era of next-generation sequencing, some researchers have even argued for the use of whole plastid genomes as barcodes (Kane et al., 2012; Vaughn et al., 2014; Coissac et al., 2016; Zhang et al., 2017; Manzanilla et al., 2018). How whole plastid genomes might be best deployed for pharmacopoeial purposes has hardly been explored yet.

Methodological approaches for specimen identification using DNA barcodes commonly rely on either distance-based measures or phylogenetic methods (Austerlitz et al., 2009). The former are based on the assumption that intra- and interspecific variation do not overlap (e.g., Hebert et al., 2004), also referred to as the barcoding gap (Meyer and Paulay, 2005). Accurate specimen identification using distance-based approaches such as BLAST is highly dependent on a well-curated database in which all members of a group are ideally represented by several individuals (Meyer and Paulay, 2005). The drawbacks of using distance-based approaches are that there is no objective distance threshold criterion and that the nearest neighbor is not always the closest relative (Moritz and Cicero, 2004). Specimen identification using phylogenetic methods is based on membership of a query sequence to a specific clade (Casiraghi et al., 2010). One difficulty associated with using tree-based barcoding methods is that phylogenies inferred from the barcode sequence might not be resolved sufficiently for an individual to be allocated to a clade, and that clades may exhibit poor support, questioning the robustness of any phylogenetic hypothesis (Moritz and Cicero, 2004). The use of concatenated DNA sequences for species tree inference has been shown to produce more robust phylogenetic hypotheses (Rokas et al., 2003). However, phylogenetic methods of DNA barcoding are not suitable when the underlying system is not based on strictly hierarchical ancestor-descendant relationships, such as in nested structures (Goldstein and DeSalle, 2005).
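The distance-based approach, and its dependence on a chosen threshold, can be made concrete with a minimal Python sketch. This is not BLAST or any published barcoding tool: the uncorrected p-distance, the function names, the toy reference library and the threshold values are all assumptions introduced for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def identify(query, references, threshold):
    """Return the species of the nearest reference, or None if it is too distant.

    references maps a species name to a list of aligned reference sequences.
    The threshold is arbitrary, which is exactly the drawback noted in the text.
    """
    distance, species = min(
        (p_distance(query, ref), name)
        for name, refs in references.items()
        for ref in refs
    )
    return species if distance <= threshold else None


if __name__ == "__main__":
    library = {"species_1": ["ACGTACGT"], "species_2": ["ACGTTTTT"]}  # toy data
    print(identify("ACGTACGA", library, threshold=0.2))
    print(identify("ACGTACGA", library, threshold=0.1))
```

The same query is assigned to a species under one threshold and left unidentified under another, and a species missing from the library can never be returned, which is why database completeness matters for this class of methods.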

Whether specimens of different species can be differentiated depends on the choice of the DNA barcode and the reproductive isolation and evolutionary history of the species under investigation. Although relatively high success rates for the identification of genera have been reported when using common barcodes in plants, limited sequence variation is often the cause of the failure to distinguish between closely related species (Seberg and Petersen, 2009; Parmentier et al., 2013; Braukmann et al., 2017). One incentive for employing genomic approaches for barcoding is that broader genome coverage increases the variation in the barcoding data set (Coissac et al., 2016). However, closely related species may not exhibit a DNA barcoding gap even when the most variable regions are employed. In the case of incipient speciation where lineage sorting is incomplete, species are likely to be paraphyletic (Rieseberg and Brouillet, 1994; Fazekas et al., 2009). Furthermore, cytoplasmic genomes can have different evolutionary histories compared with nuclear genomes because of processes such as chloroplast capture (Rieseberg and Soltis, 1991), and specimens may group geographically rather than taxonomically (Acosta and Premoli, 2010). The success of DNA barcoding may therefore be limited in some plant groups because of their biology and evolutionary history (Percy et al., 2014).

The genus Berberis is a case in which DNA barcoding using only a few regions has had limited success (Roy et al., 2010). Similarly, a phylogeny of Berberis based on ndhF and ITS loci failed to resolve boundaries of several species (Adhikari et al., 2015). Berberis aristata is a medicinal plant that has been in traditional use in India for centuries and is nowadays traded throughout the world (Srirama et al., 2017). Local market studies suggest that several species are traded under the same vernacular name (Srivastava and Rawat, 2013), including B. aristata and B. asiatica. B. aristata is described in several pharmacopoeias (Ayurvedic Pharmacopoeia of India, 2001; British Pharmacopoeia, 2016). Chemical and anatomical tests are deficient, and conventional macro-morphological and microscopic examinations do not distinguish the traded materials (Chandra and Purohit, 1980; Srivastava et al., 2004); there is therefore a strong incentive to develop a DNA barcoding method for their identification.

The aim of this study is to investigate whole plastid sequences of the genus Berberis as a resource for barcode design, utilizing a whole plastid phylogeny of the species in order to better understand the difficulties of using barcoding for pharmacopoeial purposes. In light of the challenges of this complex group, we develop a method for identifying short, informative plastid barcode regions based on diagnostic nucleotides. These barcodes, which are informative of clade membership in a phylogenetic context, are tested on commercial samples, and their utility for regulatory purposes and quality control outlined.

### MATERIALS AND METHODS

#### Sampling

This study includes 85 specimens from 57 species (**Table 1**). The dataset includes sequences from two putative new species (named in this study as B\_newsppA and B\_newsppB) and one unidentified species (B\_spp).

#### Laboratory Work and DNA Sequencing

#### DNA Extraction

DNA was extracted using either the Qiagen DNeasy Plant Kit following the manufacturer's protocol or the CTAB method (Doyle and Doyle, 1987). The quality of the extractions was checked for the degree of degradation on 1 or 1.5% agarose gels. Furthermore, we performed PCR amplifications of the rbcL gene in different dilutions (1:1, 1:10 and 1:100), and finally we measured the DNA concentration on a Qubit® Fluorometer (Life Technologies, Carlsbad, CA, United States), using the dsDNA High Sensitivity kit. The concentrations after extraction ranged from 1.5 to 34.8 ng/µl.

#### Library Preparation and Sequencing

The library preparation for the shotgun sequencing was performed according to Meyer and Kircher (2010). The libraries were sequenced in two runs on a MiSeq® and a NextSeq®. Depending on their integrity, the DNA samples were sheared mechanically to a fragment size of approximately 400 bp using a Covaris® sonicator with a peak incident power of 75, a duty factor of 10%, and 200 cycles per burst. The duration of treatment was chosen according to the observed fragment size on agarose gels and ranged between 30 s (medium degradation) and 40 s (genomic DNA).

We followed the protocol for blunt-end repair, adapter ligation and adapter fill-in. After each of these steps, the DNA was cleaned up with AMPure® XP beads (Agencourt®). Before the indexing PCR, the DNA quantity was measured on a Qubit®. Depending on the concentration of adapter-ligated libraries, we aimed to use between 50 and 100 ng of DNA as input for the indexing PCR where possible. Higher concentrations may impair the PCR reaction. In order to avoid high duplication levels, a minimal number of PCR cycles was applied. Libraries with less than 40 ng of input were amplified with 16 PCR cycles. If more than 40 ng of library was used for the PCR, 12 cycles were applied. We used the index sequences ("barcodes") as suggested by the protocol. The final libraries were washed using AMPure® XP beads (Agencourt®). We then measured the concentration with a Qubit® and assessed the fragment size using a Bioanalyzer® (Agilent). The libraries were diluted to 10 mM and pooled together. The libraries were sequenced in two runs on either an Illumina MiSeq® using the MiSeq v2 reagent kit with the 250 bp paired-end option or a NextSeq® with the NextSeq 500 High Output kit performing 150 bp paired-end sequencing.
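The cycle-number rule described above can be written as a small decision function. This is only a sketch of the rule as stated in the text; the function name is hypothetical, and because the text does not say what was done at exactly 40 ng, treating that case as high input is an assumption.

```python
def indexing_pcr_cycles(library_ng):
    """Return the number of indexing-PCR cycles for a given library input (ng).

    Rule from the text: fewer than 40 ng -> 16 cycles, more than 40 ng -> 12 cycles.
    An input of exactly 40 ng is treated as the high-input case (assumption).
    """
    if library_ng <= 0:
        raise ValueError("library input must be positive")
    return 16 if library_ng < 40 else 12


if __name__ == "__main__":
    for amount in (15.0, 50.0, 100.0):
        print(amount, indexing_pcr_cycles(amount))
```

Keeping the cycle number low for well-concentrated libraries reflects the stated aim of limiting PCR duplicates.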

#### Bioinformatics Raw Read Processing and Quality Control

The adapters of the raw reads were removed either with the built-in Illumina software on the sequencers or using cutadapt v. 1.10 (Martin, 2011). Raw reads were trimmed using Trimmomatic v.0.33 (Bolger et al., 2014) with the options LEADING:3, TRAILING:3, SLIDINGWINDOW:4:20. Reads from Illumina NextSeq were discarded when shorter than 30 bp and from MiSeq when shorter than 50 bp. The read quality was checked with FastQC (Andrews, 2010).
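To make the effect of these trimming options concrete, the sketch below re-implements the three rules and the length cut-off in plain Python. It is a simplified illustration and not Trimmomatic itself, whose sliding-window step may differ in detail; the function names are hypothetical and quality values are assumed to be already decoded Phred scores.

```python
def trim_read(quals, leading=3, trailing=3, window=4, min_avg=20):
    """Return (start, end) of the retained slice of a read, given its Phred scores."""
    start, end = 0, len(quals)
    # LEADING: drop bases from the 5' end while their quality is below the threshold.
    while start < end and quals[start] < leading:
        start += 1
    # TRAILING: drop bases from the 3' end while their quality is below the threshold.
    while end > start and quals[end - 1] < trailing:
        end -= 1
    # SLIDINGWINDOW: scan 5'->3' and cut at the first window whose mean quality is too low.
    for i in range(start, end - window + 1):
        if sum(quals[i:i + window]) / window < min_avg:
            end = i
            break
    return start, end


def keep_read(quals, min_len):
    """Apply the trimming rules and the platform-specific minimum length."""
    start, end = trim_read(quals)
    return (end - start) >= min_len


example = [2, 30, 30, 30, 30, 30, 30, 10, 10, 10, 10, 2]
print(trim_read(example))      # retained slice of the example read
print(keep_read(example, 30))  # NextSeq cut-off of 30 bp
```

The minimum length is passed in explicitly because the two platforms used different cut-offs (30 bp for NextSeq, 50 bp for MiSeq).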

#### Reference Plastid Genome Reconstructions

The reference genome for B. aristata7 was reconstructed using a hybrid strategy of read mapping and de novo assembly. All reads were mapped to the reference plastid genome of Berberis bealei (Ma et al., 2013; GenBank reference KF176554), using the Geneious medium-low sensitivity "Map to Reference" function with five iterations. The resulting contig was then checked manually for low coverage and low pairwise identity regions. One read from each of these regions was extracted and all reads were then mapped against these individual reads as a new reference sequence using the same settings as above. The iterations led to an extension of the read to a contig (typically up to 2,500 bp). The consensus sequences were then mapped to the reference obtained from the first read mapping. This method allowed large indels in the B. aristata reference that were not detected by the read mapping algorithm to be identified. The built-in de novo algorithm in Geneious 7.1.7 was used for the de novo assembly of the plastid genome. We performed the assembly only with reads that matched to the reference sequence of B. bealei. The ten largest contigs, ranging in length from 1,132 to 29,132 bp, were then mapped to the B. aristata reference and checked for ambiguities. All reads were then mapped again to the new consensus sequence.

#### Plastid Genome Reconstructions and Alignment

We made our plastid genome reconstructions by mapping to a reference genome, having verified that the plastid genome of B. aristata, our reference, and that of a distantly related congener (B. bealei; Ma et al., 2013; GenBank reference KF176554) were structurally congruent. Reconstructions to a reference permitted a more rapid and cost-effective generation of high quality data than de novo assembly. The quality filtered paired-end reads were mapped to a reference genome of B. aristata7 with Burrows-Wheeler Alignment tool (BWA, ver. 0.7.12, Li and Durbin, 2009). The reference genome was indexed using option "bwa index." Read pairs that survived the quality check were mapped with default options of the command "bwa mem." The resulting SAM file was converted to BAM format with "samtools view" and sorted with "samtools sort" in SAMtools v. 1.2 (Li et al., 2009). Optical read duplicates were removed with Picard tools<sup>1</sup>. We used the single nucleotide

<sup>1</sup>http://broadinstitute.github.io/picard, last accessed June 30, 2017




Frontiers in Plant Science | www.frontiersin.org




polymorphism (SNP) calling workflow in GATK (McKenna et al., 2010; Van der Auwera et al., 2013). Regions that contain insertions and deletions are often badly aligned. Therefore, a local realignment process was applied with the command "–T IndelRealigner" in GATK. Variant calling was performed on the realigned BAM files with the "–T HaplotypeCaller" module with haploid settings ("-ploidy 1"). The output is a genomic variant call file (GVCF) that contains base call information for all sites of the markers. The variant calls were then exported with "–T GenotypeGVCFs" to the standard variant call format (VCF). SNP and indel variants were then filtered separately. The first SNP filter applied is quality by depth (QD), which can be considered as the quality of the variant call standardized by the depth of coverage. QD avoids inflation of the Phred quality score for the variant call caused by deep coverage. Variants that had a QD < 2 were filtered out as recommended by Van der Auwera et al. (2013). The FisherStrand (FS) quality filter is a Phred-scaled probability that strand bias exists at a specific site. Specifically, the score is a measure for whether an alternate allele was seen more or less often on either forward or reverse reads. The mapping quality (MQ) in GATK is calculated as the root mean square quality over all reads at a given site. Sites where variants had an MQ score < 40 were treated as missing data in order to avoid carry-over of reference-specific base pairs. The final sequence was reconstructed with the command "–T FastaAlternateReferenceMaker" in GATK. We checked our pipeline by visual comparison of the final plastid sequence with the BAM file for selected samples.
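The logic of the QD and MQ filters can be illustrated with a short Python sketch. This is not the GATK implementation; it only mimics the two thresholds stated above on a toy reference. The function names and the input layout are hypothetical. Two points are assumptions: that the MQ check takes precedence over the QD check, and that a call rejected on QD leaves the reference base in place. No FS threshold is applied because none is given in the text.

```python
def classify_site(qd, mq, qd_min=2.0, mq_min=40.0):
    """Return 'missing', 'filtered' or 'pass' for one SNP call."""
    if mq < mq_min:
        return "missing"   # treated as missing data, coded as N
    if qd < qd_min:
        return "filtered"  # call rejected; assumption: reference base is kept
    return "pass"


def apply_calls(reference, variants):
    """Build a sequence from a reference string and {pos: (alt, QD, MQ)} calls (0-based)."""
    seq = list(reference)
    for pos, (alt, qd, mq) in variants.items():
        status = classify_site(qd, mq)
        if status == "missing":
            seq[pos] = "N"
        elif status == "pass":
            seq[pos] = alt
    return "".join(seq)


calls = {1: ("T", 25.0, 60.0), 3: ("A", 1.5, 60.0), 6: ("C", 30.0, 20.0)}
print(apply_calls("ACGTACGT", calls))
```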

The plastids were aligned using the MAFFT v7.215 aligner (Katoh and Standley, 2013) with default options. The alignment of repetitive regions such as poly A sequences was not straightforward, therefore two alignment files were created: the first alignment was used for phylogenetic inference, and blocks where no unambiguous alignment could be constructed were removed. Furthermore, the inverted repeats were removed, since SNP calling on these repeats was difficult to address. Reads with polymorphisms in only one region will map to the other repeat as well. Random mapping to inverted repeat regions often results in apparently heterozygous read alignments, precluding unique assignments of SNPs to a specific inverted repeat. The second alignment was used for the barcoding analysis. Regions were masked (coded as "N") where no unambiguous alignment was possible.

#### Annotation of Plastid Sequence

The online platforms DOGMA (Wyman et al., 2004) and CpGAVAS (Liu et al., 2012) were used for the annotation of the genome of B. aristata7. The full genome sequences were imported into Apollo (Lee et al., 2009). The annotation of B. aristata was compared with the previously published annotation of B. bealei (Ma et al., 2013). Start and stop codons were checked manually. The annotation was visualized using OGdraw.

#### Universal Barcode Reconstruction

The sequences of matK, rbcL, and trnH-psbA of B. aristata were extracted from the annotated reference B. aristata7. The sequences were then aligned to the plastid genomes using BLAT (Kent, 2002). The output was parsed to produce a BED file, which denotes the start and end position of an alignment. The respective sequence was then extracted with the "getfasta" option in BEDTools (Quinlan and Hall, 2010).
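The coordinate handling in this step can be sketched as follows. The example assumes the standard BED convention of 0-based, half-open intervals and mimics what a "getfasta"-style extraction does on an in-memory sequence; it does not call BEDTools, and the function names and the sample identifier are hypothetical.

```python
def parse_bed_line(line):
    """Parse the first three BED columns: sequence name, start (0-based), end (exclusive)."""
    name, start, end = line.rstrip("\n").split("\t")[:3]
    return name, int(start), int(end)


def extract_region(genomes, bed_line):
    """Return the subsequence described by one BED record."""
    name, start, end = parse_bed_line(bed_line)
    return genomes[name][start:end]


genomes = {"sample1": "AAACCCGGGTTT"}  # toy plastid sequence
print(extract_region(genomes, "sample1\t3\t9\tmatK"))
```

Because BED intervals are half-open, the length of the extracted barcode is simply end minus start.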

A two-step pipeline was devised to reconstruct the ITS2 from shotgun sequencing data. Firstly, reads that map to the ITS2 reference were filtered and then a de novo assembly was performed using these reads. Filtering prior to de novo assembly reduces computation time substantially. The reference sequence of ITS2 (Berberis repens, BOLD accession: HIMS1138-12) was indexed with BWA (Li and Durbin, 2009) using the command "bwa index." Trimmed and filtered reads were mapped to the reference with "bwa mem." Mapped reads were then separated from unmapped reads with SAMtools (Li et al., 2009) "samtools view –b –F 4," resulting in a BAM file with only mapped reads. The mapped reads were then extracted to fastq format using Picard tools (see footnote 1) with the command "SamToFastq." The reads were then used for de novo assembly using SPAdes v3.7.0 (Bankevich et al., 2012) and the longest contig extracted.
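Two parts of this pipeline lend themselves to a brief illustration: the meaning of the "–F 4" filter (exclude records whose SAM flag has the "read unmapped" bit, value 4, set) and the selection of the longest contig. The Python sketch below works on plain text SAM lines and a dictionary of contigs; it does not call SAMtools or SPAdes, and the function names, read names and contig names are hypothetical.

```python
UNMAPPED = 0x4  # SAM flag bit: the read itself is unmapped


def is_mapped(flag: int) -> bool:
    """True when the 'read unmapped' bit is not set (the reads kept by -F 4)."""
    return (flag & UNMAPPED) == 0


def mapped_read_names(sam_lines):
    """Return names of mapped reads from SAM text lines, skipping header lines."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if is_mapped(int(fields[1])):
            names.append(fields[0])
    return names


def longest_contig(contigs):
    """Return (name, sequence) of the longest contig in a {name: sequence} dict."""
    return max(contigs.items(), key=lambda item: len(item[1]))


sam = [
    "@SQ\tSN:ITS2\tLN:220",
    "r1\t99\tITS2\t10\t60\t4M\t=\t50\t44\tACGT\tIIII",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t73\tITS2\t12\t60\t4M\t=\t12\t0\tACGT\tIIII",
]
print(mapped_read_names(sam))
print(longest_contig({"NODE_1": "ACGT" * 10, "NODE_2": "AC"})[0])
```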

#### Barcoding Analysis and Phylogenies

The phylogeny of the plastid alignment was estimated using RAxML v. 8.2.10 (Stamatakis, 2014). The best model of substitution was calculated under the Akaike Information Criterion in jModeltest2. The ML phylogeny was estimated with 1,000 bootstrap replicates under the GTRGAMMA + I substitution model using the online CIPRES portal (Miller et al., 2010). The whole alignment was considered as a single partition. Members of the compound-leaved Berberis were set as outgroup (B. nervosa, B. polyodonta and B. nevinii).

Potential novel Berberis-specific barcodes were explored by extracting SNP positions of the multiple sequence alignment of whole plastid genomes with the program SNP-sites (Page et al., 2016). The SNPs were summarized in 500 bp windows and their distribution plotted with Circos (Krzywinski et al., 2009). Potential barcodes were selected spanning regions where a 500 bp window had a sequence variability of >5%, and a maximum amount of missing/masked data <3%. The 500 bp regions were then compared to the annotated plastid genome and the barcodes were constructed to correspond with genomic regions, such as intergenic spacers that are flanked by conservative regions suitable for primer design. These Berberis specific barcodes derived from the whole plastid alignment were evaluated, along with the commonly used barcodes ITS2, rbcL, matK, and trnH-psbA.
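The window-based screening can be sketched in Python as shown below. This is an illustration rather than the SNP-sites or Circos workflow: it scans an in-memory alignment directly. The function names are hypothetical, and two details are assumptions: gaps and "?" are counted together with "N" as missing data, and the missing-data percentage is taken over all cells (sites times sequences) in a window.

```python
def window_stats(alignment, window=500):
    """Summarize variable sites and missing data per window of an alignment.

    alignment: list of aligned sequences of equal length.
    """
    length = len(alignment[0])
    stats = []
    for start in range(0, length, window):
        end = min(start + window, length)
        variable = 0
        missing = 0
        for col in range(start, end):
            column = [seq[col].upper() for seq in alignment]
            # Assumption: N, gaps and '?' all count as missing/masked data.
            missing += sum(base in "N-?" for base in column)
            observed = {base for base in column if base in "ACGT"}
            if len(observed) > 1:
                variable += 1
        n_sites = end - start
        stats.append({
            "start": start,
            "end": end,
            "pct_variable": 100.0 * variable / n_sites,
            "pct_missing": 100.0 * missing / (n_sites * len(alignment)),
        })
    return stats


def candidate_windows(stats, min_variable=5.0, max_missing=3.0):
    """Keep windows with >5% variable sites and <3% missing data."""
    return [s for s in stats
            if s["pct_variable"] > min_variable and s["pct_missing"] < max_missing]


toy = ["ACGTACGTACAAAAAAAAAA",
       "TCGTACGTACAAAAAAAAAA",
       "ACGTACGTACNNNNAAAAAA"]
for s in candidate_windows(window_stats(toy, window=10)):
    print(s["start"], s["end"], s["pct_variable"])
```

A 10 bp window is used only to keep the toy example readable; the analysis described above used 500 bp windows.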

TABLE 2 | Commercial samples analyzed in this study.


The samples Market1 and Market2 were purchased from the same company. The sample Market3 was purchased from India via the Internet.

The individual barcode regions were aligned using MAFFT v7.215 (Katoh and Standley, 2013) with default options and were then manually trimmed. A first step was to infer a maximum likelihood tree of the barcode with RAxML v.8.2.9 (Stamatakis, 2014) with 1,000 rapid bootstrap replicates ("–f a") under the GTRCAT model. The potential barcodes were sorted according to the percent variable sites, percent parsimony informative sites, recovery of B. aristata and B. asiatica groups and the recovery of groups present in the whole plastid phylogeny. The selected barcodes were concatenated and a maximum likelihood phylogeny was built with the same parameters as described above. Phylogenies of the selected barcodes were inferred under

asiatica group comprise a monophyletic group. Numbers above branches are bootstrap values between 51 and 99. Branches with support <50 were collapsed to polytomies, bootstrap values of 100 are not shown.

the GTRCAT model in RAxML v. 8.2.9 (Stamatakis, 2014). Additionally, haplotype networks were constructed with the function haploNet in the R package pegas (Paradis, 2010). Finally, the alignment of each selected barcode was reduced to SNP sites only and diagnostic polymorphisms were identified for each group in order to delimit a minimal barcode.
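The search for diagnostic polymorphisms can be illustrated with a minimal Python sketch. Here a position is called diagnostic for a group when all members of the group share one unambiguous base that occurs in no other group; this definition, the function name and the toy sample names are assumptions made for illustration and may not match the exact criteria used in the analysis. Positions are 0-based.

```python
def diagnostic_sites(alignment, groups):
    """Find group-specific bases.

    alignment: {sample: aligned sequence}; groups: {sample: group name}.
    Returns {group: [(position, base), ...]} with 0-based positions.
    """
    length = len(next(iter(alignment.values())))
    result = {group: [] for group in set(groups.values())}
    for pos in range(length):
        by_group = {}
        for sample, seq in alignment.items():
            by_group.setdefault(groups[sample], set()).add(seq[pos].upper())
        for group, bases in by_group.items():
            if len(bases) != 1:
                continue  # the group is polymorphic at this position
            (base,) = bases
            if base not in "ACGT":
                continue  # masked or gapped position
            others = set().union(*(b for g, b in by_group.items() if g != group))
            if base not in others:
                result[group].append((pos, base))
    return result


aln = {"ari1": "ACGTA", "ari2": "ACGTA",
       "asi1": "ATGTA", "asi2": "ATGTA",
       "mah1": "ACGCA"}
grp = {"ari1": "aristata", "ari2": "aristata",
       "asi1": "asiatica", "asi2": "asiatica",
       "mah1": "Mahonia"}
print(diagnostic_sites(aln, grp))
```

In this toy alignment the aristata group has no diagnostic site of its own, which mirrors the situation where a group can only be recognized by the absence of the variants that define the other groups.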

#### Test Data

The first test data consisted of three commercial samples, supposedly of B. aristata (**Table 2**). Sequences for the commercial samples were generated and the sequence data used to make identifications according to the diagnostic loci in **Table 4**.

## RESULTS

#### Whole Plastid Phylogeny

The whole plastid phylogeny is shown in **Figure 1**. Nine groups, eight of which are monophyletic, are identified and numbered 1 to 9. The aristata, asiatica and Mahonia clades (numbered 4, 5, and 9 in **Figure 1**) are of most importance in terms of authentication. The plastid phylogeny reveals that B. aristata is not monophyletic since B. jaeschkeana, B. karnaliensis and B. mucrifolia are nested amongst the specimens of this species in clade 4. The topology of the phylogeny is consistent with morphological and biogeographical characters, and with the topology based on nuclear sequence data (Kreuzer et al., in prep.). The annotated plastid sequence of B\_aristata7 is shown in **Supplementary Figure S1** and the corresponding sequence is found on Genbank with reference number MK714340.

## Identifying Informative Barcodes

The barcoding analysis aimed to find a set of informative nucleotides that are unique to clades of interest. The topology of the whole plastid genome phylogeny was used to determine evolutionarily meaningful groups, termed Operational Phylogenetic Units (OPUs). Barcodes were then constructed for identifying these OPUs, rather than individual species. A barcoding method based on diagnostic characters was preferred over distance or purely phylogenetic approaches, because of its ease of application to regulatory purposes and to provide an alternative approach in an evolutionarily complex

group. The density of SNPs in 500 bp windows along the whole plastid alignment is shown in **Figure 2**. The bins contained between 0 and 124 variable sites per 500 bp. The inspection of bins with >25 SNPs (5%) resulted in 21 potential barcode regions. Several of the highly variable bins fell into regions where the alignment was partly masked due to ambiguous alignment, leaving 13 bins for further inspection. Two neighboring bins were combined into a single potential barcode of 1,000 bp, and a set of four bins was combined into a 2,000 bp barcode. The barcode of 2,000 bp (SSC\_noncoding2) was further examined by partitioning the alignment into 50 bp windows and reducing the barcode size (SSC\_noncoding2, **Figure 3**). The trnH-psbA intergenic spacer was identified as one of the seven highly variable regions, and together with the matK, rbcL and ITS2 barcodes, selected because they are commonly used barcode regions, eleven barcode candidates were investigated (**Table 3**). None of the individual barcodes retrieved phylogenies with the same topology as the whole plastid phylogeny. Although the matK phylogeny is not well resolved overall, species from the aristata and asiatica groups were recovered. B. asiatica is monophyletic in the non-coding SSC\_noncoding2 phylogeny, but species from the aristata clade are separated into two groups. The percent variable sites varied between 2.2 in rbcL and 9.85 in the intergenic spacer ndhI-ndhG (**Table 3**) and the latter was chosen along with matK and SSC\_noncoding2 as barcodes for phylogenetic and haplotype analysis (**Figure 4**).

These three barcodes yielded 133 variable positions in total. Nine positions were sufficient to identify seven of the nine groups with clade-specific nucleotide variants. Groups 3 and 8 (**Figure 1**) share an identical barcode. The phylogeny of the concatenated barcodes matK, SSC\_noncoding2 and ndhI-ndhG is shown in **Figure 5**. The topology of the tree differs substantially from the total-evidence tree inferred from whole plastid sequences. However, four of the major clades are identified in both trees. Haplotype networks constructed for each of the separate data sets showed variation in the haplotype associated with the B. aristata clade (**Figure 4**). There was no haplotype unique to B. aristata: for the SSC\_noncoding2 region one of the B. aristata haplotypes is found also in B. karnaliensis; for the matK region there is also a haplotype shared between B. aristata and B. karnaliensis; for ndhI-ndhG there is a haplotype found in B. aristata, B. jaeschkeana, B. karnaliensis and B. mucrifolia. The lack of species-specific haplotypes even in these most variable regions underlines the necessity of a clade-based approach. However, for pharmacopoeial purposes the haplotype

TABLE 3 | Barcode selection resulting from investigating variability patterns across whole plastid alignment.


matK and rbcL were not identified as highly variable but were included in the study. Var = Variable sites; PIS = parsimony informative sites; "aristata recovered" and "asiatica recovered" indicate whether the clades were recovered in the respective phylogeny. The DNA barcodes that were selected are highlighted in bold font.

The positions are relative to the consensus of the multiple sequence alignments of each barcode. "SA clade" stands for South American clade. Bottom: Results of the test samples. Market1, Market2, and Market3 are commercial samples, and Mixture1 and Mixture2 are in silico mixtures. Numbers below multiple base calls represent the ratio of nucleotides in the mapping.

networks reveal separation of the B. aristata clade haplotypes and B. asiatica haplotypes.

#### Testing Barcodes

The minimal barcode consists of nine positions and includes barcodes unique to seven groups. No unique SNPs were identified for groups 3, 6, and 8. No individual barcode for groups 6 and 8 could be constructed (**Table 4**). The barcodes were evaluated with the test data set. The commercial samples Market1 and Market2 were identified as belonging to the Mahonia clade. The sample Market3 shared the barcode with B. asiatica samples.
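Assigning a test sample to a group on the basis of diagnostic positions amounts to a simple lookup, sketched below in Python. The positions, bases and group barcodes in the example are invented for illustration and are not the values in Table 4; the function name is hypothetical. A list is returned because, as noted above, groups can share a barcode, in which case more than one group matches.

```python
def assign_groups(sample_bases, barcodes):
    """Return all groups whose diagnostic bases are all matched by the sample.

    sample_bases: {position: base} observed in the test sample.
    barcodes: {group: {position: expected base}}.
    """
    return [group for group, expected in barcodes.items()
            if all(sample_bases.get(pos) == base for pos, base in expected.items())]


# Hypothetical barcodes at three positions (not the values from Table 4).
barcodes = {
    "asiatica": {10: "T", 55: "G", 90: "A"},
    "Mahonia": {10: "C", 55: "A", 90: "A"},
    "group3": {10: "C", 55: "G", 90: "C"},
    "group8": {10: "C", 55: "G", 90: "C"},  # identical to group3
}
print(assign_groups({10: "T", 55: "G", 90: "A"}, barcodes))
print(assign_groups({10: "C", 55: "G", 90: "C"}, barcodes))
```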

## DISCUSSION

DNA barcoding for quality assurance and pharmacovigilance has great potential and is likely to be implemented as a routine diagnostic method. In this study, we present an approach for barcoding of an evolutionarily complex group of species and demonstrate that these barcodes can identify the species in commercial samples. Our purpose was to provide a barcode for pharmacopoeial purposes that discriminates B. aristata and B. asiatica since these are the pharmacopoeial species and the main substitute, respectively. We present a solution for barcoding that meets regulatory needs.

With the emergence of new sequencing technologies, whole plastid sequencing has been proposed as an extension of the current barcoding concept (Coissac et al., 2016). It has been shown that whole plastid sequences increase phylogenetic resolution (Parks et al., 2009) and simultaneously increase the effectiveness of discriminating between species. In this study, we show how whole plastid next-generation sequencing can be used to investigate sequence variability patterns for the discovery of informative DNA barcodes. We confirm the difficulty of

barcoding Berberis species as suggested by Roy et al. (2010), even when whole plastid sequences are used for comparison. Although the sampling was limited, with only a few of the species represented with multiple samples, the low resolution of the plastid phylogeny at shallow phylogenetic levels and the presence of polyphyletic species (e.g., B. aristata) indicate evolutionary reasons for the failure of barcoding this genus to species level (Mutanen et al., 2016). DNA barcoding is challenging in groups where frequent hybridization occurs in conjunction with plastid capture or where lineage sorting has not yet been completed (Fazekas et al., 2009). A salient point arising from our study is that the pharmacopoeial species, B. aristata, is polyphyletic. One explanation for this finding is hybridization, a phenomenon documented in Berberis (Adhikari et al., 2012). Low resolution among the closely related species of Berberis, as reported in the whole plastid phylogeny, could point toward retention of ancestral polymorphism or incomplete lineage sorting (Naciri and Linder, 2015). Misidentification of B. jaeschkeana, B. karnaliensis and/or B. mucrifolia is unlikely, since these have been included in recent revisionary work (Adhikari et al., 2012). Polyphyletic species are likely to persist where they are morphologically robust entities, and

that were recovered in the whole plastid phylogeny (see Figure 1).

the development of methods for their identification, in this case for pharmacopoeia, benefits from understanding of their evolutionary history. The case of barcoding medicinal Berberis species provides an example of how barcoding for regulatory purposes in an evolutionarily complex group can be approached. Phylogenies can be essential for formulating adequate barcoding hypotheses; the whole plastid phylogeny reveals that at least three species are nested in the clade with the main species. The polyphyly of B. aristata indicates that universal barcodes are unlikely to delineate these species, and haplotype analysis shows this is the case for three of the most variable regions. Furthermore, several clades show low resolution at terminal branches. We have therefore adapted our classification scheme and defined meaningful OPUs that do not correspond to existing species limits. OPUs are the entities that can be discriminated by the barcodes put forward. The OPUs in this study are delimited using an integrative approach based on the interpretation of a whole plastid phylogeny, coupled with the detection of diagnostic nucleotides in relatively short barcodes for well-supported groups. These DNA barcodes can be targeted by PCR and Sanger sequencing and therefore offer a simple and fast identification test for regulatory purposes and quality control. Appropriate OPUs would be identified on a case-by-case basis for other evolutionarily complex groups for regulatory purposes. This is because for evolutionarily complex groups barcodes do not confirm species identity. The novelty of our approach lies

in using whole plastid phylogeny to identify short, easily amplified markers that incorporate clade-specific SNPs, and although we expect it to be more widely applicable it is only appropriate when the non-pharmacopoeial species belonging to the OPU are neither candidate adulterants nor substitute species, as is the case here.

The barcode presented in this study is based on diagnostic nucleotides for groups of species, referred to here as OPUs. Like the morphological classification of species, diagnostic methods provide a set of unique characters to assign specimens to species or species groups (Little and Stevenson, 2007). Diagnostic methods are particularly well-suited to pharmacopoeial purposes because a sequence generated from test material can be compared to a published sequence in a way that is comparable to other pharmacopoeial standards. The barcode we propose would require the user to amplify and sequence three regions, whereas the barcodes included in the British Pharmacopoeia to date are single regions (British Pharmacopoeia, 2016). We have limited the number of loci that would be part of the test to three because incorporating more loci would make the test more unwieldy for users. Limiting the number of regions necessarily reduces the number of informative sites. Identifying the most informative regions, as we do here, is therefore important. A deficiency of the diagnostic method is that further samples might show variation that is not present amongst the samples used for barcode design. However, there is scope to modify the published barcodes, perhaps by using the IUPAC nucleotide codes, if novel variants are reported.
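The suggestion of accommodating novel variants with IUPAC nucleotide codes can be illustrated with a short Python sketch. It uses the standard IUPAC ambiguity codes; the function names and the example sequences are hypothetical and are not taken from the published barcodes.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> ambiguity code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def matches_barcode(observed, barcode):
    """True if every observed base is permitted by the (possibly ambiguous) barcode."""
    if len(observed) != len(barcode):
        return False
    return all(o.upper() in IUPAC.get(b.upper(), "")
               for o, b in zip(observed, barcode))


def ambiguity_code(bases):
    """Return the IUPAC code covering a set of observed bases, e.g. 'AG' -> 'R'."""
    return CODE_FOR[frozenset(b.upper() for b in bases)]


print(matches_barcode("ACGT", "ACRT"))  # G is allowed by R (A or G)
print(matches_barcode("ACCT", "ACRT"))  # C is not allowed by R
print(ambiguity_code("AG"))
```

With such a scheme, a published diagnostic base could be relaxed to an ambiguity code once a new variant is reported within a group, provided the relaxed code still excludes the bases that characterize the other groups.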

The diagnostic method has been implemented in various analysis tools (Sarkar et al., 2008; Weitschek et al., 2013), mainly for specimen identification. Some of the algorithms use logic mining techniques (Bertolazzi et al., 2009). Logic mining for DNA barcoding refers to a two-step process, in which the barcode is first reduced to a set of very informative nucleotides and thereafter a logic mining method is applied to define a set of formulas for separating the species. More recent approaches, such as BLOG 2.0 (Weitschek et al., 2013), provide a diagnostic, character-based methodology for species identification that is based on supervised machine learning. Character-based approaches circumvent analytical issues such as the nearest-neighbor problem in distance-based methods (DeSalle et al., 2005). Although the in silico mixtures presented in this study were created from the samples that were used for producing the DNA barcode and are therefore not true test samples, the analysis demonstrates the utility of analyzing mixed samples based on diagnostic nucleotides when shotgun sequencing data is available.

We believe that the development of clade-specific DNA barcodes is the way forward when investigating evolutionarily complex species. The barcodes we present are readily understandable and easily applicable for large-scale and routine testing of samples using PCR and Sanger sequencing. DNA barcoding is beyond doubt a powerful method for specimen identification, but its implementation as a routine process for quality assurance (Sgamma et al., 2017) and pharmacovigilance (de Boer et al., 2015) will depend on the ease of application. Neither phylogenetic nor distance methods are appropriate, since they depend on large databases and sophisticated tools and lack objective criteria. For this reason, the British Pharmacopoeia (BP) approach is to present a sequence which samples must match for authentication. Pharmacopoeias ensure the safe use of pharmaceuticals by defining certain quality standards and DNA barcodes have recently been published in the BP for the first time (British Pharmacopoeia Commission, 2017). The question "does this sample correspond to the pharmacopoeial species?" is addressed by comparison to the pharmacopoeial sequence, since methods based on diagnostic nucleotides provide an easy and straightforward way to answer the question. Identifying such sequences for inclusion in a pharmacopoeia is the challenge addressed by this study. The whole plastid approach described here could become a model that can be applied to species that are difficult to resolve. Success depends on devising a sampling strategy that includes species that are closely related to the target species. Furthermore, the inclusion of distantly related, congeneric species increases the confidence in detected diagnostic nucleotide polymorphisms.

## AUTHOR CONTRIBUTIONS

JH, CH, CP, and MK contributed to the conception and design of the study. BA and CP provided samples and made taxonomic identifications. CH and MK conducted the laboratory work. MK performed the data analysis and wrote the first draft of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

## FUNDING

This work was conducted as part of the MedPlant ITN and received funding from the European Union's Seventh Framework Program for research, technological development and demonstration under grant agreement no. 606895.

## ACKNOWLEDGMENTS

We would like to acknowledge the herbal medicines research group, the NGS core facility at the National Institute for Biological Standards and Control (NIBSC) and Edward Mee for help in NGS sequencing. We also would like to thank the group of JH at the University of Reading for facilitating lab work and discussions of the manuscript. Julian Harber has contributed to this study by providing samples from his personal collection.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00586/full#supplementary-material

FIGURE S1 | Gene map of the plastid genome of Berberis aristata. Genes on the outside of the circle are transcribed clockwise and genes on the inside anti-clockwise. The dark gray histograms in the inner circle show the GC content.

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Kreuzer, Howard, Adhikari, Pendry and Hawkins. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Detection of Seasonal Variation in Aloe Polysaccharides Using Carbohydrate Detecting Microarrays

Louise Isager Ahl<sup>1</sup> \*, Narjes Al-Husseini<sup>1</sup> , Sara Al-Helle<sup>1</sup> , Dan Staerk<sup>2</sup> , Olwen M. Grace<sup>3</sup> , William G. T. Willats<sup>4</sup> , Jozef Mravec<sup>5</sup> , Bodil Jørgensen<sup>5</sup> and Nina Rønsted<sup>1</sup> \*

<sup>1</sup> Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark, <sup>2</sup> Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark, <sup>3</sup> Comparative Plant and Fungal Biology, Royal Botanic Gardens Kew, Richmond, United Kingdom, <sup>4</sup> School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom, <sup>5</sup> Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Jun-ichi Kadokawa, Kagoshima University, Japan Ruben F. Gonzalez-Laredo, Durango Institute of Technology, Mexico

> \*Correspondence: Louise Isager Ahl louise.ahl@snm.ku.dk Nina Rønsted nronsted@snm.ku.dk

#### Specialty section:

This article was submitted to Technical Advances in Plant Science, a section of the journal Frontiers in Plant Science

> Received: 30 November 2018 Accepted: 03 April 2019 Published: 14 May 2019

#### Citation:

Ahl LI, Al-Husseini N, Al-Helle S, Staerk D, Grace OM, Willats WGT, Mravec J, Jørgensen B and Rønsted N (2019) Detection of Seasonal Variation in Aloe Polysaccharides Using Carbohydrate Detecting Microarrays. Front. Plant Sci. 10:512. doi: 10.3389/fpls.2019.00512

Aloe vera gel is a globally popular natural product used for the treatment of skin conditions. Its useful properties are attributed to the presence of bioactive polysaccharides. Nearly 25% of the 600 species in the genus Aloe are used locally in traditional medicine, indicating that the bioactive components in Aloe vera may be common across the genus Aloe. The complexity of the polysaccharides has hindered development of relevant assays for authentication of Aloe products. Carbohydrate detecting microarrays have recently been suggested as a method for profiling Aloe polysaccharide composition. The aim of this study was to use carbohydrate detecting microarrays to investigate the seasonal variation in the polysaccharide composition of two medicinal and two non-medicinal Aloe species over the course of a year. Microscopy was used to explore where in the cells the bioactive polysaccharides are present and predict their functional role in the cell wall structure. The carbohydrate detecting microarray analyses showed distinctive differences in the polysaccharide composition between the different species, and carbohydrate detecting microarrays therefore have potential as a complementary screening method directly targeting the presence and composition of relevant polysaccharides. The results also show changes in the polysaccharide composition over the year within the investigated species, which may be of importance for commercial growing in optimizing harvest times to obtain higher yield of relevant polysaccharides.

Keywords: Aloe, authentication, carbohydrate detecting microarrays, plant cell walls, polysaccharides, seasonal variation, succulent tissue

#### INTRODUCTION

The succulent Aloe vera L. leaf tissue is a natural product used globally in a wide range of household commodities (Grace et al., 2015). By the end of 2016, Aloe vera leaf tissue had reached a revenue of US\$ 1.6 billion and it is estimated that the revenue will exceed US\$ 3.3 billion by 2026 (Future Market Insights, 2016). The succulent inner leaf tissue, the gel, is a polysaccharide rich matrix containing high amounts of mannan (polymannose), which enables the tissue to hold larger

From the Aloe leaves, two different medicinal products can be derived: the exudate and the gel. The often yellow and bitter exudate comes from aloitic cells in the outer leaf mesophyll (specialized cells associated with the vascular bundles that excrete a mixture of compounds used for medicinal purposes; Reynolds, 2004), and contains a range of compounds used as purgatives (Grace et al., 2009). The colorless polysaccharide-rich gel from the inner leaf is used topically for treatment of wounds, minor burns, and skin irritation or internally for a range of different applications (Grindlay and Reynolds, 1986; Reynolds and Dweck, 1999; Hamman, 2008; Grace et al., 2009). Due to the complexity of the polysaccharides, the composition and bioactivity of Aloe gel is not well understood, and there is a lack of useful methods for analysis and authentication (Bozzi et al., 2007; Grace and Rønsted, 2017).

The plant cell wall is an insoluble entity composed almost entirely of complex polysaccharides arranged in an intricate matrix (Cosgrove, 2005; Albersheim, 2011). The main noncellulosic polysaccharides in Aloe inner leaf mesophyll are hemicelluloses and pectins. Hemicelluloses cover a range of different polysaccharides, with xyloglucans usually being the principal ones (Albersheim, 2011; Pedersen et al., 2012). Another hemicellulose, mannan, and in particular its acetylated form, has been of particular interest in Aloe research, as it is considered the most likely bioactive component in the gels (Reynolds and Dweck, 1999; Talmadge et al., 2004; Simões et al., 2012). Plant cell wall polysaccharides are traditionally investigated indirectly using monosaccharide analyses (Albersheim, 2011; Grace et al., 2013), but the complete breakdown of the plant cell wall inevitably loses information about the tertiary structure and chemical construction of the polymers, which is why methods targeting polysaccharides, or at least oligosaccharides, have been highly sought after (Fangel et al., 2012; Krešimir et al., 2017).

The ability to analyze and distinguish between polysaccharide compositions in different plant tissues, between different batches, and between species is especially important in plants containing bioactive polysaccharides used for medicinal purposes, like the acetylated mannan of Aloe vera (Femenia et al., 1999; Ahl et al., 2018; Minjares-Fuentes et al., 2018). Mannan is not only a common plant cell wall polysaccharide, but it is also often found in tissues related to water storage (Stancato et al., 2001). The acetylated mannan (polymannose) from Aloe vera has been linked to induced tissue repair in humans (Reynolds and Dweck, 1999; Xing et al., 2014; Thunyakitpisal et al., 2017), whereas de-acetylation of mannan has been shown to result in a loss of bioactivity (Chokboribal et al., 2015).

Polysaccharide and phenolic compound contents are expected to vary with age of the plant, between batches, and with season and rainfall or water availability (Hu et al., 2003; Beppu et al., 2006; Cristiano et al., 2016). Harvesting and subsequent processing including drying of Aloe gel can also influence the content and composition of bioactive compounds including causing de-acetylation of mannan polymers (Minjares-Fuentes et al., 2016; Sriariyakul et al., 2016).

The efficacy and safety of herbal products can be compromised through accidental adulteration, misidentification and deliberate contamination, which can lead to lack of the desired effect at best, or to severe side effects due to the presence of toxic compounds in the worst case (Ernst, 2004; van Breemen and Farnsworth, 2008; Gilbert, 2011; Saslis-Lagoudakis et al., 2015). To ensure the efficacy and safety of herbal products, their qualitative and quantitative composition is regulated by international and national monographs such as the European Pharmacopeia by the European Directorate for the Quality of Medicines and Healthcare (EDQM, 2016), which presents a series of monographs for herbal products, including recommended tests for identification and quality of the plant species included in these products.

Two bulk Aloe herbal products are included in the European Pharmacopeia (EDQM, 2016), namely Aloe barbadensis (a synonym of the accepted name, Aloe vera L.), and Aloe capensis (a synonym of the accepted name, Aloe ferox Mill.), but both are based on the detection of hydroxyanthracene derivatives in the juice (exudate). A World Health Organization [WHO] (1999) monograph is available on Aloe vera gel recommending a chromatographic assay (t'Hart et al., 1989; World Health Organization [WHO], 1999), but no quantitative requirements for content have been proposed.

Considering the global use and appraisal of Aloe vera gel and its acclaimed beneficial effects, there is an urgent need to establish reliable and relevant authentication methods. In addition to ensuring the safety and efficacy of Aloe herbal products, an authentication method can also be used to assist in control of illegal harvesting and trade. All Aloe species except Aloe vera are prohibited from trade under the Convention on International Trade in Endangered Species as described in Appendix II (CITES, 2017).

Due to the complexity of the polysaccharides, no efficient standard method exists for either qualitative or quantitative authentication of polysaccharide composition in Aloe herbal products (Grace et al., 2013; Minjares-Fuentes et al., 2018). Full structural identification of polysaccharides can currently only be achieved through a complex combination of spectroscopic techniques (Simões et al., 2012; Shi et al., 2018). However, a number of indirect methods exist, such as <sup>1</sup>H-NMR spectroscopy, which can be used to verify the presence of specific structural groups, such as the acetyl groups of the acetylated mannan (Bozzi et al., 2007; Campestrini et al., 2013).

Structure–activity relationships suggest that monosaccharide composition and branching patterns play an important role in the bioactivity of plant polysaccharides (Paulsen and Barsett, 2005). As a proxy, the constituent monosaccharides have therefore been suggested as a tool for authenticating Aloe-based products (O'Brian et al., 2011; Minjares-Fuentes et al., 2018). Several analytical techniques are in use including colorimetric and spectrophotometric fingerprinting methods, and chromatographic methods, which can efficiently separate, identify, and quantify the monosaccharides (t'Hart et al., 1989; Eberendu et al., 2005; Nazeam et al., 2017; Zhang et al., 2018). However, little is known about the relationship between polysaccharide composition and therapeutic value of the leaf mesophyll in Aloe, and it is recommended that future authentication focus on developing methods targeting the polysaccharides (Grace et al., 2013).

Carbohydrate detecting microarrays (Moller et al., 2007) have been proposed as a possible method for qualitative comparison of polysaccharide composition between Aloe species and in Aloe herbal products (Ahl et al., 2018). The method is high-throughput, allowing the simultaneous investigation of numerous samples. However, carbohydrate microarrays are limited by the antibodies available and by the effectiveness of extraction and immobilization. In relation to authentication, the method is best used as a complementary screening tool prior to analyses such as <sup>1</sup>H-NMR spectroscopy, which provide more in-depth knowledge of the Aloe compounds present (Campestrini et al., 2013; Minjares-Fuentes et al., 2018). For the purpose of obtaining quantitative data, GC-MS profiling of monosaccharides also remains a useful method (Grace et al., 2013).

The aim of the present study was to use carbohydrate detecting microarrays to investigate the seasonal variation in the polysaccharide composition of two medicinal and two non-medicinal aloes over the course of a year. Microarray profiling was complemented by microscopy to understand where in the cells the bioactive polysaccharides are present.

## MATERIALS AND METHODS

### Plant Material

Four species were chosen for this study to represent medicinal or non-medicinal usage, but also based on their growth form, geographical distribution, and leaf size (**Table 1**). Aloe vera is a short-stemmed species growing in large clumps, and is probably native to the Arabian Peninsula (Grace et al., 2015). Aloe arborescens is a widespread species in the southern part of the African continent. The two medicinal aloes are very different in terms of habit, growth form and distribution, with A. arborescens growing up to 3 m in height compared to A. vera being a maximum of 1 m tall. Both non-medicinal species selected for this study are native to Madagascar, with A. decaryi being a narrow endemic growing in a pendulous or sprawling habit in thickets near sea level. Aloe vaombe, on the other hand, is a widespread tree growing up to 5 m tall at altitudes of 50–1200 m (Carter et al., 2011).

Plant material was sampled from the living collections of the Botanical Garden, Natural History Museum of Denmark, University of Copenhagen, Denmark, and vouchers are deposited in Herbarium C (**Table 1**).

Plants of the four species were mature (more than 20 years old) when sampled and were grown under glass in conditions mimicking the daylight changes and water availability of the region they come from (**Table 1**). Samples were collected in triplicate for each species once a month from June 26th, 2017 to June 25th, 2018. Seasonality is expressed as northern hemisphere spring (March–May), summer (June–August), autumn (September–November), and winter (December–February) for the greenhouse-grown material, although the natural habitat of most aloes is in the southern hemisphere (**Table 1**). Two different types of fresh samples were taken at each collection point. To reduce the risk of contamination with phenolic compounds, which can bind and lead to masking of epitopes in the carbohydrate detecting microarrays, only the inner leaf mesophyll was carefully collected for carbohydrate detecting microarray analysis. Sections including epidermis were collected for microscopy investigations.
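The season definitions above can be written as a simple month-to-season lookup. The following is a minimal sketch in Python, not part of the original analysis (which was carried out in Excel); the names `SEASON_BY_MONTH` and `season_of` are illustrative.

```python
from datetime import date

# Northern-hemisphere seasons as defined in the text:
# spring = March-May, summer = June-August,
# autumn = September-November, winter = December-February.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def season_of(sampling_date: date) -> str:
    """Return the northern-hemisphere season for a sampling date."""
    return SEASON_BY_MONTH[sampling_date.month]


# First and last sampling dates reported in the study.
print(season_of(date(2017, 6, 26)))
print(season_of(date(2018, 6, 25)))
```

Both the first and the last collection dates fall in the summer group under this definition.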

Ph. Eur. reference material for Aloe vera (product number 103504) and A. capensis (product number 203304) was obtained from Alfred Galke, Bad Grund, Germany, and was used as a standard (EDQM, 2016).

#### Microarray Profiling of Polysaccharides

Polysaccharide data was obtained by following the protocol described by Ahl et al. (2018), which in turn was modified from that of Moller et al. (2007) to accommodate the unique properties of the Aloe tissue (**Figure 1**). The succulent inner leaf mesophyll was collected in three biological replicates from the four selected Aloe species each month for a full year. The tissue was carefully excised from mature leaves and immediately placed in labeled Falcon tubes (Corning, New York, United States), which were then snap frozen in liquid nitrogen. The collected samples were kept at −20 °C for 24 h before they were freeze dried, weighed, and milled prior to extractions. Samples of approximately 5 mg were weighed to one decimal place from each biological replicate and placed in Corning 8-strip cluster tubes (Merck Life Science, Darmstadt, Germany). Samples were homogenized in a TissueLyser II (Gentec Biosciences, Columbia) using glass beads prior to extractions.

TABLE 1 | Biogeographical, morphological, and usage information for the selected Aloe species as found in the wild. Temperatures were adjusted to be within the listed ranges, and watering was decreased when conditions were more cloudy. Vouchers are deposited in Herbarium C, Natural History Museum of Denmark, University of Copenhagen, Denmark.

Extractions were carried out in a three-step sequential series, and for each sample the extractant volume was adjusted to the exact sample weight to reach a ratio of 10 mg sample to 300 µL extraction solvent. The extraction series is based on the work by Moller et al. (2007) and adjusted according to Ahl et al. (2018). The glass beads used for homogenization were kept in the tube to enhance the extraction of polysaccharides during the sequential steps. The following solvents were used: dH2O – targeting primarily soluble unbound or loosely bound polysaccharides including mannans, 50 mM CDTA (trans-1,2-diaminocyclohexane-N,N,N′,N′-tetraacetic acid monohydrate, pH 7.5, Merck Life Science, Darmstadt, Germany) – targeting primarily pectins and some hemicelluloses, and finally 4 mM NaOH – targeting primarily hemicelluloses. For all three extraction steps, samples were shaken in a TissueLyser at 27 s<sup>−1</sup> for 2 min before the speed was reduced to 6 s<sup>−1</sup> for 2 h. All extractions were carried out at room temperature. After each extraction, samples were centrifuged at 4000 RPM (Thermo Fisher Scientific, Waltham, MA, United States) for 10 min before the supernatant was carefully removed and transferred to a labeled 0.5 mL Eppendorf tube (Eppendorf, Hamburg, Germany). Subsequent extractions were carried out on the pellet, and extracts were kept at 4 °C during the subsequent extractions to minimize degradation.
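The weight-adjusted solvent volume follows directly from the stated ratio of 10 mg sample to 300 µL solvent. A minimal sketch of that arithmetic is given below; the function name and the example weights are hypothetical and chosen only to illustrate samples of approximately 5 mg.

```python
def extractant_volume_ul(sample_mg: float, ratio_ul_per_mg: float = 300.0 / 10.0) -> float:
    """Solvent volume (µL) for a sample weight (mg) at 300 µL per 10 mg."""
    if sample_mg <= 0:
        raise ValueError("sample weight must be positive")
    # Rounded to one decimal, matching the one-decimal weighing accuracy.
    return round(sample_mg * ratio_ul_per_mg, 1)


# Hypothetical weights close to the approximately 5 mg target.
for weight in (4.8, 5.0, 5.3):
    print(weight, "mg ->", extractant_volume_ul(weight), "µL")
```

A sample of exactly 5 mg therefore receives 150 µL of each extraction solvent.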

Once extractions were done for all samples, fourfold dilution series were made for each sample in a 384-well microtiter plate (Merck Life Science, Darmstadt, Germany). Dilutions were made using Arrayjet buffer (55.2% glycerol, 44% water, 0.8% Triton X-100). The 384-well microplates with the diluted extracts were centrifuged at 3000 RPM (Thermo Fisher Scientific, Waltham, MA, United States) for 10 min before they were printed on a 0.45 µm nitrocellulose membrane (Whatman, Maidstone, United Kingdom) using an Arrayjet Sprint (Arrayjet, Edinburgh, United Kingdom) piezoelectric robotic printer. For each sample, the dilution series was printed in four technical replicates on each microarray, to yield a total of 48 spots per plant specimen per harvest (16 spots per extraction step). The three biological replicates were extracted and printed on three separate days using the approach described above.
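The spot counts can be checked by enumerating the layout. The sketch below assumes four dilution steps per series (inferred from 16 spots per extraction step divided by four technical replicates) and assumes the series starts from the undiluted extract; neither detail is stated explicitly in the text.

```python
from itertools import product

EXTRACTION_STEPS = ("dH2O", "CDTA", "NaOH")  # solvents named in the text
TECHNICAL_REPLICATES = 4                     # stated in the text
DILUTION_STEPS = 4                           # inferred: 16 spots per step / 4 replicates
FOLD = 4                                     # fourfold dilution series

# Assumption: the series starts from the undiluted extract (factor 1).
dilution_factors = [FOLD ** i for i in range(DILUTION_STEPS)]

# One spot per (extraction step, dilution factor, technical replicate).
spots = list(product(EXTRACTION_STEPS,
                     dilution_factors,
                     range(1, TECHNICAL_REPLICATES + 1)))

spots_per_step = len(spots) // len(EXTRACTION_STEPS)
print(dilution_factors)  # [1, 4, 16, 64]
print(spots_per_step)    # 16
print(len(spots))        # 48
```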

Fifteen primary monoclonal antibodies were selected to cover as many different types of pectic and hemicellulosic polysaccharide epitopes as possible (**Figure 2**). The primary antibodies were paired with either alkaline phosphatase conjugated anti-rat or anti-mouse as secondary antibody (Merck Life Science, Darmstadt, Germany) depending on the origin of the primary antibody. The printed arrays from all three identical extraction rounds were developed, quantified and analyzed simultaneously following the procedures described by Ahl et al. (2018). The final tally of arrays developed for this study amounts to 47: one for each antibody and extraction round (15 × 3 = 45), plus two for negative controls of the secondary antibodies.

For the data analysis, averages were calculated using both the dilution series for each sample and the array triplicates (a total of 48 data points per sample). The full data set was visualized in a heatmap format with all antibodies and their binding shown in **Figure 2**. The highest mean value of the entire dataset was assigned the value of 100%, and the remainder of the data were adjusted accordingly and normalized with a 5% cut off (represented with a zero – "0"). All data analyses were carried out in Microsoft Excel for Mac, version 16.16.4 (181110), 2018.
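The normalization step can be sketched as follows. This is an illustration only: the original analysis was done in Excel, the signal values below are hypothetical, and treating values strictly below 5% as zero is an assumption about how the cut off was applied.

```python
def normalise(mean_signals: dict[str, float], cutoff_pct: float = 5.0) -> dict[str, float]:
    """Scale mean signals so the dataset maximum is 100% and zero out values under the cut off."""
    top = max(mean_signals.values())
    if top <= 0:
        raise ValueError("the highest mean signal must be positive")
    scaled = {}
    for name, value in mean_signals.items():
        pct = 100.0 * value / top
        # Assumption: values below the cut off are reported as 0.
        scaled[name] = pct if pct >= cutoff_pct else 0.0
    return scaled


# Hypothetical mean spot signals for three antibodies.
example = {"LM21": 1200.0, "JIM7": 300.0, "LM6": 30.0}
print(normalise(example))  # {'LM21': 100.0, 'JIM7': 25.0, 'LM6': 0.0}
```

Because the scaling uses the single highest mean in the whole dataset, values remain comparable across species, seasons, and antibodies within one heatmap.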

#### Microscopy

The microscopy work was done on samples from the summer collection in August 2017.

Sections from all four Aloe species were also stained with 1% Toluidine blue for 10 min, washed twice in distilled water and mounted on glass slides under a coverslip. Images were taken on an Olympus BX41 microscope with a mounted Olympus ColorView I camera (**Figure 3**).

Tissue pieces of approximately 3 mm in diameter were excised from the sampled material and fixed for 30 min in 4% formaldehyde prepared from paraformaldehyde in phosphate-buffered saline (PBS). Sections were washed twice in PBS, before they were dehydrated in a series of methanol:water solutions until reaching a final concentration of 100% methanol. The methanol was then substituted with a methanol:LR White resin mixture (1:1) for 8–10 h. Sections were then transferred to a pure LR resin overnight. The specimens were organized in gelatine capsules filled with pure LR resin. The final polymerization was performed overnight in a 60 °C oven. 1 µm-thick sections were made from each species using a Leica EM-UC7 ultramicrotome (Leica, Roskilde, Denmark) and glass knives, and subsequently adhered on Superfrost Slides (Thermo Scientific, Roskilde, Denmark) in a drop of water at 60 °C.

Immunolocalization of different polysaccharides in resin sections was performed following the procedure described by Mravec et al. (2017) using the antibodies BS-400-4 and LM21 (Pettolino et al., 2001; Marcus et al., 2010). In short, leaf sections were placed on individual glass microscope slides, and a hydrophobic circle was drawn around each section with a PAP pen (Merck Life Science, Darmstadt, Germany). Sections were then blocked with a 5% milk powder and PBS solution for 15 min, before probing with the monoclonal antibodies (**Figures 4**, **5**) for 1 h. Antibodies were diluted 1:10 in a 5% milk powder and PBS solution. Sections were then washed twice with 5% milk powder in PBS solution, before probing with secondary antibodies. The secondary antibodies used were either anti-rat or anti-mouse conjugated to Alexa Fluor 555 (Invitrogen, Roskilde, Denmark) at 1:300 dilution in 3% bovine serum albumin (BSA) in PBS. Leaf sections were then washed three times in PBS, and counterstained with Calcofluor White (Merck Life Science, Darmstadt, Germany) at 0.1 mg/ml concentration for 10 min. Finally, the leaf sections were washed one last time before being mounted in CitiFluor, an antifading reagent (Agar Scientific, Essex, United Kingdom).

The fluorescently labeled samples were scanned using a Leica SP5 confocal laser scanning microscope equipped with UV diode (405 nm), Ar (488 nm), and HeNe (543 nm) lasers at either 20X or 63X water objectives. Pictures were processed with GIMP2 software for color enhancement and contrast. Control samples were treated equally for comparison.

## RESULTS

#### Aloe Inner Leaf Mesophyll Structure and Localization of Polysaccharides

Distinctive morphological differences between the four different Aloe leaves investigated for this study were observed (**Figure 3**). The amount of inner leaf mesophyll varied from a thin layer in A. vaombe to a thicker many-celled layer in A. vera. The micrographic observation of toluidine blue stained resin sections showed that the basic anatomical structure is conserved between the Aloe species. Below the epidermis is the outer mesophyll, made of approximately 15 layers of round or slightly elongated parenchymatic cells 40–100 µm in diameter, followed by the inner mesophyll of enlarged water storage cells reaching 300 µm in diameter. These observations showed that the overall thickness of the Aloe leaves is largely determined by the thickness of the inner mesophyll.

Two mannan-specific antibodies (BS-400-4 and LM21) were used to investigate the localization of mannans in the mesophyll layers. The exact specificity of the mannan-binding antibody BS-400-4 is (1→4)-β-mannan/galacto-(1→4)-β-mannan (Pettolino et al., 2001). The probed micrographs (**Figures 4**, **5**) show that the majority of mannan is found in the cytosol inside the cells, and in very different amounts depending on the species. In A. vaombe the mannan seems to be almost entirely embedded in the wall. The mannan polymers do not seem to form a continuous entity, but rather appear as granules. In A. decaryi the mannan granules fill up almost the entire inside of the cells, whereas distinct vacuoles were visible within the mannan aggregates in A. vera and A. arborescens. Additionally, in A. arborescens mannan seems to be primarily present in the outer mesophyll and only present in thin bands around the edges in the inner mesophyll. The mannan-specific antibody LM21 binds β-(1→4) manno-oligosaccharides from DP2 to DP5, but it also displays a wider recognition including mannan, glucomannan and galactomannan polysaccharides (Marcus et al., 2010). Based on the histological micrographs (**Figure 5**) from the four Aloe species, there appears to be very little mannan present in the mesophyll when LM21 is used to detect it. Again, in A. vaombe, the mannan seems to be almost embedded in the wall, whereas in the remaining three species the distribution more closely resembles the binding pattern of BS-400-4, although at a lower concentration. In A. vera, A. arborescens, and A. decaryi the mannan recognized by LM21 also seems to be granulated rather than a dense sheet (**Figure 4**).

FIGURE 2 | All extractions are summarized per season. The Ph. Eur. material extractions (standards) are also summarized for each species. All tested antibodies are included. The included antibodies, their target epitopes, the origin of each antibody, and the references describing their binding are listed below the heatmap. DE, degree of esterification; RGI, rhamno-galacturonan; XG, xyloglucan; Neg. R, anti-rat; Neg. M, anti-mouse. The highest mean value of the entire dataset was assigned the value of 100%, and the remainder of the data were adjusted accordingly and normalized with a 5% cut off (represented with a zero – "0").

FIGURE 3 | Anatomy of four Aloe species. (A) Leaves and (B) cross-sections from left to right of Aloe vera, A. vaombe, A. arborescens and A. decaryi. (C) Cross-sections from left to right of Aloe vera, A. vaombe, A. arborescens and A. decaryi stained with Toluidine blue. Marked are ep, epidermis; olm, outer leaf mesophyll; v, vasculature; ilm, inner leaf mesophyll.

### Polysaccharide Profile Variation Between Species

Binding studies of 15 primary monoclonal antibodies representing different pectic and hemicellulosic polysaccharide epitopes show differences between the four Aloe species in polysaccharide compositions of their inner leaf mesophyll (**Figures 2**, **6**). As expected, the sequential extraction series resulted in the extraction of a mixture of polysaccharides. **Figure 2** presents a summary of the pooled serial extractions for each species and each antibody, in order to investigate if the total amounts of polysaccharides change over the course of a year, whereas detailed results for each of the three extraction solvents (H2O, CDTA, and NaOH) are shown in the **Supplementary Material**. In **Figure 6**, species-specific heatmaps including only the antibodies that recognized epitopes in the material are depicted along with graphs showing the changes in mannan epitopes over time.

For all species, mannan and xyloglucan epitopes were detected, although in various amounts. A. vaombe contained the highest amounts of mannan in the water extractions as detected by all three mannan-specific antibodies (LM21, BS-400-4, and CCRC-170), as seen in **Figure 6** (Pettolino et al., 2001; Marcus et al., 2010; Pattathil et al., 2012; Zhang et al., 2014). For all species, the signal from CCRC-170, which binds acetylated mannan, completely disappears in the spring and summer samples in the NaOH extraction, but reappears in lower concentrations during the fall and winter.

The most distinct differences between species are found in the pectin profiles, as has also been observed in other studies (Ahl et al., 2018). In relation to specific antibodies, only A. vaombe and A. arborescens show binding from JIM5, which targets low-methylated homogalacturonan – the backbone of the pectin polymer (Vandenbosch et al., 1989; Willats et al., 2000; Clausen et al., 2003). Highly methylated homogalacturonan, indicated by the binding of JIM7, is primarily released from the matrix in the CDTA extraction, but only in very low amounts from A. decaryi. Slightly more is released from A. arborescens and A. vaombe. The highest amount of the JIM7 epitope is released from A. vera, reaching almost double the amount when all seasons and extractions are combined (**Figure 2**; Vandenbosch et al., 1989; Willats et al., 2000; Clausen et al., 2003). Three different antibodies detect partially methylated homogalacturonan – LM18, LM19, and LM20 (Verhertbruggen et al., 2009). Despite being described as binding to the same type of epitope, the three antibodies show clear differences in their binding patterns. LM18 and LM19 bind in all selected species, but LM20 does not show any binding to A. decaryi and binds only very weakly to A. vaombe and A. arborescens, whereas it binds strongly to A. vera in the same pattern as JIM7. For LM18 and LM19, A. vera is the species with the lowest binding, with amounts hardly above the background cut-off. The three remaining species all show strong binding of both LM18 and LM19, with noticeable differences between the seasons. For all three species, relative amounts of the polysaccharides are almost doubled in the spring and summer periods compared to fall and winter.

In this study, two antibodies targeting galactan and arabinan epitopes on rhamno-galacturonan, a pectin side-chain, were included – LM5 and LM6 (Jones et al., 1997; Willats et al., 1998; Lee et al., 2005). LM6 only bound to A. vaombe in the CDTA extraction of the spring and summer periods, and only very weakly. Similarly, LM5 only showed binding in the CDTA extractions of A. vaombe and A. arborescens. Again, the binding showed low amounts of galactan, with the highest values found in the spring and summer periods. Whereas seasonal differences were clearly detectable in the different extraction steps (**Figure 6** and **Supplementary Material**), the combined heatmap (**Figure 2**) shows that even though changes do occur over a 12-month period, they are much subtler when considering the pooled extracts. The summer amounts are still the highest for almost all antibodies and all species. The differences between the species are still clearly visible even when extraction data is pooled (**Figure 2**).

## DISCUSSION

#### Organization of Mannans in the Succulent Tissue of Aloes

The microscopy work was done to determine the placement of the polysaccharides within the succulent tissue and to determine the differences and similarities between the four species – A. arborescens, A. decaryi, A. vaombe, and A. vera. Overall, the microscopy work corroborated the carbohydrate detecting microarray results both in terms of species differences and the localization of specific polymers recognized by the same set of antibodies as were used for the carbohydrate detecting microarrays. However, on the histological micrographs, A. vaombe appeared to contain a low amount of mannan based on the detection of LM21 and BS-400-4, and only in the cell wall, whereas the carbohydrate detecting microarray analysis of the comparable summer samples showed A. vaombe to be the investigated species containing the most of both epitopes. In the microscopy study the focus was on the bioactive polysaccharide mannan using the antibodies BS-400-4 and LM21. Both anti-mannan antibodies detected the polymers in all species, but in A. arborescens, the signal from both antibodies BS-400-4 and LM21 was more pronounced in the outer mesophyll cells than in the inner leaf mesophyll (**Figures 4**, **5**, respectively). This could indicate that the outer cell layers are used more for storage than the innermost cells are, assuming mannans function as storage polymers (Stancato et al., 2001). In A. vaombe the signals from the mannan recognizing antibodies were generally weaker, and the mannan appeared to be embedded in the cell wall with only very low amounts of the polysaccharide located in the cytosol. In terms of mannan amounts and distribution based on the histological micrographs, A. vera and A. decaryi seem to contain the highest amounts of mannan, suggesting that A. decaryi could potentially also be a source of medicinally relevant mannans. A. arborescens also contained larger amounts of mannan, but not throughout the mesophyll as did A. vera and A. decaryi. The extensive medicinal use of A. arborescens indicates that it is the quantity rather than the specific localization of the polysaccharides in the mesophyll that determines the medicinal quality of the mannan.

## Structural Function of Aloe Polysaccharides in the Cell Wall

A very general description of a plant cell wall is a scaffold of linear cellulose strands bound together by an array of hemicelluloses embedded in a pectin matrix (Cosgrove, 2005; Albersheim, 2011). The main non-cellulosic polysaccharides detected by carbohydrate detecting microarrays in Aloe inner leaf mesophyll are pectins and two kinds of hemicelluloses – mannans and xyloglucan. Whereas the acetylated mannan is interesting from a medicinal point of view, from a plant cell wall perspective xyloglucan is likely to be the primary polysaccharide binding the cellulose strands together, as the mannan appears in the histological micrographs as granules rather than as a flat sheet. A large concentration of mannan was released in the water extraction, which also suggests that these polymers are very loosely bound in the matrix, as tightly bound hemicelluloses would normally be expected to require NaOH for bulk release (Hansen et al., 2014). The release of xyloglucan in the NaOH extraction thus further supports the idea that this polysaccharide is more tightly bound in the cell wall, participating in the general scaffold together with cellulose. Neither xylan nor glucans were detected in the carbohydrate detecting microarray analyses (seen by the lack of detection by antibodies LM23, BS-400-2, and BS-400-3). Comparatively, xyloglucan contains more side-chains than xylan, which could have an effect on the cell wall structure (Meikle et al., 1991; Manabe et al., 2011; Pedersen et al., 2012; Torode et al., 2015). However, negative detection of a polysaccharide is not evidence of its absence, as it could be present below the level of detection of the method or not expressed in the studied material. The acetylation of the mannan might be an important factor in relation to the types of bonds formed between the mannan and the scaffold polysaccharides. The detected changes in pectin epitopes over the seasons and between the extractions support this idea of a highly flexible matrix.

## Composition and Variation of Aloe Polysaccharides

The expectation of finding highly acetylated mannans in at least the Aloe vera gel was supported in particular by the binding of CCRC-170, as this antibody has a known target epitope containing acetylations (Pattathil et al., 2012; Zhang et al., 2014). The antibody BS-400-4 bound most strongly to the Aloe extractions, indicating that the tissue contains high amounts of loosely bound (1→4)-β-D-mannan. We observed a complete lack of binding from the mannan-specific antibody LM22 compared to the binding seen for LM21. Both antibodies were shown to bind mannans by Marcus et al. (2010), although their study was focused on mannan-derived oligosaccharides from Amorphophallus konjac K. Koch and Ceratonia siliqua L., which may be structurally different from Aloe-derived mannans. In particular LM22 has been reported to bind strongly to galactomannan, whereas LM21 binds to glucomannan and the lack of binding of LM21 may therefore suggest glucomannan is not present in Aloe (Marcus et al., 2010). The acetylation of mannan has been a concern with regard to the mannan recognizing antibodies, as previous studies using the same set of antibodies failed to show any binding (Ahl et al., 2018). In the present study, however, the method has been optimized, especially in terms of sample wait-time, meaning the polysaccharides printed on the nitrocellulose were of a better quality. If samples sit for too long between extraction and printing, signals are likely to fade or even disappear, and a similar situation is expected if samples are frozen (personal observation).

Based on the carbohydrate detecting microarray results, seasonal variation was detected in the quantity of polysaccharides. The monthly variation was subtle, but when data were pooled according to season, distinct differences were seen. The changes were primarily seen in the binding patterns of pectin- and mannan-specific antibodies. Variation in cell wall composition is well known and reflects the flexibility of the cell wall (Albersheim, 2011). When all data for each species were pooled per season, as shown in **Figure 2**, the changes were not as obvious as when the three extraction steps were compared separately. There was a clear trend, however, of the plants having the highest amounts of polysaccharides in June, July, and August. The optimal harvest time for obtaining a higher yield of the sought-after polysaccharides might change from location to location. The ability to detect change in mannan content could therefore be of importance to the Aloe industry for planning of harvest time in plantations.
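As a minimal sketch of the seasonal pooling step described above, the following Python snippet averages monthly signal values per season. The month-to-season mapping (meteorological seasons, Northern Hemisphere), the function name `pool_by_season`, and the example values are assumptions made for illustration; they are not taken from the study's data or analysis scripts.

```python
from collections import defaultdict
from statistics import mean

# Assumed mapping: meteorological seasons, Northern Hemisphere.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def pool_by_season(monthly_signals):
    """Average monthly antibody signals (month number -> value) per season."""
    pooled = defaultdict(list)
    for month, signal in monthly_signals.items():
        pooled[SEASON_OF_MONTH[month]].append(signal)
    return {season: mean(values) for season, values in pooled.items()}


# Hypothetical relative signal intensities for one antibody and one species.
example = {1: 12.0, 4: 18.0, 6: 30.0, 7: 34.0, 8: 32.0, 10: 20.0}
seasonal_means = pool_by_season(example)
```

Pooling in this way smooths out subtle month-to-month differences, which is one reason seasonal contrasts can appear clearer than monthly ones.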

## Potential of Carbohydrate Detecting Microarrays for Authentication of Aloe Products

Acetylated mannan from Aloe vera has been linked to induced tissue repair in humans for decades (Reynolds and Dweck, 1999; Xing et al., 2014; Thunyakitpisal et al., 2017), but traditionally more than 25% of the species in the genus Aloe (about 150 taxa) have been used to treat a range of conditions (Grace et al., 2009). The primary aim of this study was to investigate whether carbohydrate detecting microarrays could be used as a complementary method to detect seasonal variation in the polysaccharide composition of the selected aloes, two medicinally used and two non-medicinally used. Based on our results, each species had a distinct polysaccharide profile, yet the two medicinally used species (A. vera and A. arborescens) were more similar to each other than they were to the two non-medicinally used species in terms of both pectin and, to some extent, mannans. Although it was possible to differentiate between the Aloe species based on carbohydrate detecting microarray analyses of the investigated samples, the profiles are not distinct enough to use carbohydrate detecting microarrays to discriminate unambiguously between individual species. Furthermore, the observed seasonal variation supports the view that carbohydrate detecting microarrays should not be used as a stand-alone means of analysis to authenticate an Aloe vera product. Additional replicate samples would also be needed to explore potential within-species variation. However, the similarities and differences between the polysaccharide compositions of the medicinally used species and the non-medicinally used ones may potentially be useful to identify the group of medicinal aloes.

As carbohydrate detecting microarrays are primarily a qualitative method, they cannot be used for quantitative authentication of products, but could be suitable to detect whether a product actually contains polysaccharides in a composition that could be related to Aloe vera or another medicinally used species. The most important antibodies to use for such a screen would be the mannan-specific ones from which we saw signals in this study, together with pectin-specific antibodies, which showed more differentiation between species. In terms of feasibility for authentication of Aloe polysaccharides, carbohydrate detecting microarrays are a high-throughput method capable of simultaneously analyzing 10–20 different samples. Apart from the investment in a microarrayer, the running costs include non-specialist lab equipment, nitrocellulose for printing, and a few selected antibodies. For the analysis of fewer samples, a commercial service for the analysis of freeze-dried and milled samples could possibly be set up with laboratories that have the setup in house. One additional concern is the availability of relevant standardized reference material for comparison. Aloes are rich in phenolic compounds, which can bind and lead to masking of epitopes, and it is therefore important to use only inner leaf mesophyll for the carbohydrate detecting microarray analysis. The commercial plant material of Aloe vera and A. ferox obtained here as Ph. Eur. reference standard is whole above-ground plant material, whereas the monographs specify either exudate or inner leaf mesophyll depending on intended use, and the whole plant extracts included here also differed in polysaccharide composition and content (**Figure 2** and **Supplementary Material**). Consequently, there is a need for a more specific definition of what should be considered standard Ph. Eur. reference material.

## CONCLUSION

The histological micrographs showed differences between species in terms of the amounts of mannan present in different parts of the aloe leaf tissue. The micrographs revealed that the polysaccharides serve as structural hemicellulose in the cell wall but can also function as a storage polysaccharide within the cytosol of Aloe species. In terms of quality control and standardization of plant-based medicines, carbohydrate detecting microarrays were able to detect differences between species and seasonal variation in the composition and abundance of polysaccharides, and relevant antibodies are available to screen for the acetylated mannans hypothesized to be responsible for the acclaimed bioactivities of Aloe gel. Carbohydrate detecting microarrays therefore have potential as a complementary screening method directly targeting the presence and composition of relevant polysaccharides. The observed seasonal variation may be of importance for commercial growing to optimize harvest times. In addition to seasonal variation, we would expect potential variation of polysaccharides according to the age and origin of the plants, as well as the impact of the local growing conditions and of storage conditions further down the production line. The carbohydrate detecting microarray method could thus be used to provide relevant information about variation of the polysaccharides in individual plantations, allowing optimization of yield.

## AUTHOR CONTRIBUTIONS

LA designed the study together with NR and OG, conducted the probing with the antibodies, supervised NA-H and SA-H together with DS and NR, and wrote the manuscript together with NR. LA, NA-H, and SA-H collected and prepared the samples. NA-H and SA-H conducted the carbohydrate detecting microarrays extractions and printing under guidance of LA. LA and BJ conducted the data analysis. JM conducted the microscopy imaging and interpreted the results together with LA. All authors contributed to the discussions and to the final version of the manuscript.

#### FUNDING

This research was supported by grants from the Villum Foundation, Planet Project #9283 to WW, BJ, and NR and #17489 to JM.

#### ACKNOWLEDGMENTS


We thank gardener Martin Årseth-Hansen, Botanical Garden, Natural History Museum of Denmark, for growing the plant material, Jeanett Hansen for help in the laboratory at PLEN, and Henriette L. Pedersen for discussion on carbohydrate detecting microarray methods and results.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00512/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ahl, Al-Husseini, Al-Helle, Staerk, Grace, Willats, Mravec, Jørgensen and Rønsted. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Medicinal Plant Analysis: A Historical and Regional Discussion of Emergent Complex Techniques

Martin Fitzgerald<sup>1</sup>, Michael Heinrich<sup>2</sup> and Anthony Booker<sup>1,2</sup>\*

<sup>1</sup> Herbal and East Asian Medicine, School of Life Sciences, College of Liberal Arts and Sciences, University of Westminster, London, United Kingdom, <sup>2</sup> Pharmacognosy and Phytotherapy, UCL School of Pharmacy, London, United Kingdom

The analysis of medicinal plants has a long history, especially with regard to assessing a plant's quality. The first techniques were organoleptic, using the physical senses of taste, smell, and appearance. Gradually, these led on to more advanced instrumental techniques. Though different countries have their own traditional medicines, China currently leads the way in terms of the number of publications focused on medicinal plant analysis and the number of inclusions in its Pharmacopoeia. The monographs contained within these publications give directions on the type of analysis that should be performed, and for manufacturers this typically means that they need access to increasingly advanced instrumentation. We have seen developments in many areas of analytical science, particularly the development of chromatographic and spectroscopic methods and the hyphenation of these techniques. The ability to process data using multivariate analysis software has opened the door to metabolomics, giving us greater capacity to understand the many variations of chemical compounds occurring within medicinal plants and allowing us to have greater certainty not only of the quality of the plants and medicines but also of their suitability for clinical research. Refinements in technology have resulted in the ability to analyze and categorize plants effectively and to detect contaminants and adulterants occurring at very low levels. However, advances in technology cannot provide us with all the answers we need in order to deliver high-quality herbal medicines, and the more traditional techniques of assessing quality remain as important today.

Keywords: herbal medicine, medicinal plant, analysis, quality, pharmacopoeia, complexity, advances

#### Edited by:

Jiang Xu, China Academy of Chinese Medical Sciences, China

#### Reviewed by:

Manoj Gajanan Kulkarni, University of KwaZulu-Natal, South Africa; Rainer Willi Bussmann, Saving Knowledge, Bolivia

\*Correspondence: Anthony Booker a.booker@westminster.ac.uk

#### Specialty section:

This article was submitted to Ethnopharmacology, a section of the journal Frontiers in Pharmacology

Received: 04 September 2018 Accepted: 14 November 2019 Published: 09 January 2020

#### Citation:

Fitzgerald M, Heinrich M and Booker A (2020) Medicinal Plant Analysis: A Historical and Regional Discussion of Emergent Complex Techniques. Front. Pharmacol. 10:1480. doi: 10.3389/fphar.2019.01480

## INTRODUCTION

Medicinal plants have been a resource for healing in local communities around the world for thousands of years. They remain of contemporary importance as a primary healthcare mode for approximately 85% of the world's population (Pešić, 2015), and as a resource for drug discovery, with 80% of all synthetic drugs deriving from them (Bauer and Brönstrup, 2014). Concurrently, the last few hundred years have seen a prolific rise in the introduction, development, and advancement of herbal substance analysis. Humans have been identifying and selecting medicinal plants and foods based on organoleptic assessment of suitability and quality for thousands of years, but it is only in the last seven decades, since the invention of basic analytical techniques such as paper chromatography, that rapid development has taken place from sight, touch, and smell to the use of sophisticated instrumentation. Though this mechanization of the senses has appeared relatively recently, conceptual expansion has historically been building throughout the scientific revolution, outwards toward the universe and inwards to a scale below what the human eye can resolve, leading to the development of some of the earliest analytical tools assisting the senses, the telescope and the microscope. From the initial discovery of new microscopic worlds, through structural, chemical, and atomic levels, the sensitivity and range of human perception have been extended and enhanced.

Rapid progress is especially evident considering that the concept of a laboratory was only formally established in Europe during the early 1600s. First an extension of philosophers', doctors', and scientists' workrooms, it became a space to study nature and gather empirical evidence (Wilson, 1997), where studies could be conducted at the analyst's convenience rather than at specific times when daylight or weather permitted. This was a small but important step towards more formalized analytical investigations.

In modern analysis, single techniques such as paper chromatography and, much earlier, colorimetry appeared first. These were followed by a greater range and wider application of such techniques until early hyphenations such as LC-UV emerged, culminating more recently in multiple combinations of multi-hyphenated instrumentation that exploit the analytical advantages inherent in each individual technique. The emergence of hyphenated analytical techniques is in many respects analogous to the organoleptic synthesis that occurs when selecting a medicinal plant: viewing, smelling, and tasting it combines different senses, increasing the points of reference (statistical degrees of freedom) and so improving the probability of correctly identifying the plant and assessing its quality. The emergence and application of these hyphenated techniques only became possible and useful as computer systems and data management tools developed, enabling rapid and selective synthesis of information from the large amount of instrumental and analytical data generated.

Probably the single greatest influence in recent times on the advancement of the analysis of herbal materials (and arguably analysis generally) is, though, how large amounts of data can be collected, assimilated, and used more meaningfully in human-readable forms. Similar to the historical advancements in combinatorial hyphenated instrumentation, combinatorial data processing techniques such as fingerprinting, metabolomic profiling, and pattern recognition algorithms have now emerged, further increasing analytical capabilities while reducing the operator time and expertise required. This trend has accelerated the advancement of analytical techniques and increased the pace and capability of the associated research. In this paper, we analyze publication trends and pharmacopoeial developments in order to better understand the role and progression of analytical techniques, from their initial discovery and development to more modern-day developments and applications, with a particular focus on China, an Asian country with both deep cultural and long-term historical roots in plant medicine.

#### PUBLICATION TRENDS

Increasing interest in medicinal plant research and analysis is reflected in the number of recent publications, with more than a three-fold increase from 4,686 publications during the year 2008 to 14,884 in 2018. Output published during the 8 years of the present decade alone outnumbered all those combined before 2000, since the included database records began in 1800 (Figure 1).
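The fold increase quoted above can be checked with a short calculation. This is only an illustrative sketch: the helper name `fold_increase` is hypothetical, and the two counts are the ones stated in the text.

```python
def fold_increase(start_count, end_count):
    """Return how many times larger end_count is than start_count."""
    return end_count / start_count


# Publication counts for 2008 and 2018 as stated in the text.
ratio = fold_increase(4686, 14884)
print(f"{ratio:.2f}-fold increase")
```

The ratio comes out slightly above three, consistent with the "more than three-fold" description.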

The largest proportion of publications cited in current databases over the last 10 years for medicinal plant analysis is in the disciplines of pharmacology and pharmacy (Figure 2), with plant sciences, biochemistry and molecular biology, and agriculture research following closely behind. Together these comprise almost 70% of the total publications.

#### REGIONAL TRENDS—LAST 10 YEARS

The majority (about 58%) of medicinal plant analysis publications in the last 10 years have collectively emerged from mainland China, India, USA, and South Korea (Figure 3). This may be an expression of the strong medicinal plant traditions in Asia in addition to the USA's dominant presence as an international user of herbal products (Hu et al., 2013). The major East Asian regions, in particular, China, Japan, South Korea, together with Taiwan, contribute more than half of the total citations (55%). This may be indicative of the rapid economic progress and technological capability of these countries. China is the major contributor, with a 15% increase in its dominance of research outputs in the last 10 years. This influence has also been seen in the effect of China's growing involvement in aiding the development of pharmacopoeias around the world and as a leader in the analysis of Chinese medicinal plants (Figure 3).

#### REGULATION AND A CHANGING ANALYTICAL LANDSCAPE

From a regulatory perspective, the pharmacopoeial requirements are the central reference point for the analysis of medicinal plants. Though many pharmacopoeias exist internationally, the most comprehensive of these relating to herbal medicinal materials is the Chinese Pharmacopoeia (ChP). The current ChP, introduced in 2015, is the 10th edition, presented in four volumes, and includes 5,608 drugs, a 10-fold increase from its first edition in 1953. More than half of the current monographs (2,598) relate to Chinese herbal medicine (CHM) specifically, including raw plants, slices, herbal mixtures, and oils. A noticeable inclusion in the current version compared with the previous version is the addition of 400 herbal mixtures (Qian et al., 2010).

The search terms "medicinal plant" OR "herbal medicine" AND "analysis" were chosen for the maximum returned records after exploring a list of similar topics and combinations of keywords such as "phytochemical analysis," "traditional medicine," and "herbal." The Web of Science Core Collection, KCI-Korean Journal Database, MEDLINE®, Russian Science Citation Index, and SciELO Citation Index databases were included in the search.

## PHARMACOPOEIA MONOGRAPHS— THEIR INFLUENCES AND CHALLENGES

Though more recently the ChP is playing an increasing role in influencing medicinal plant analysis, the development of the ChP has been heavily influenced by Western pharmacopoeias. Historically, the identification, preparation, and analysis of medicinal plants were based on classic texts such as the Shennong Bencao Jing (Shennong Materia Medica, 25–220 CE), where the category and quality of 365 plants and 113 prescriptions were assessed by taste. Organoleptic sensing of bitterness, sweetness, saltiness, and even neutral tastes was thought to indicate the function and application of the medicine. Arguably, the most influential Chinese pharmacy monograph is the Bencao Gangmu (Compendium of Materia Medica, 1368–1644 CE), containing 1,892 plant descriptions and 11,096 prescriptions sorted in 16 divisions and 60 orders, emphasizing appearance, taste, and odor as keys to authentication and quality.

However, the main precursor to the modern format of the current Chinese Pharmacopoeia was printed in the 1930s with 670 drugs. Even at this early stage, the then dominant powers such as Britain, Germany, America, and Japan found challenges in understanding and forming consensus for recognizing, categorizing, and assuring the quality of Chinese medical materials. At this time a difficulty emerged in securing materials for the more Western-styled "scientifically run" hospitals. Initially it was thought that, as Japan had adopted a translation of the German pharmacopoeia in 1886, the Chinese could follow suit using the British Pharmacopoeia, which in 1927 had been translated into Chinese as a joint effort by the London and British Chambers of Commerce. However, some differences in opinion between the four occupiers had to be resolved first.

Many of the technological demands necessary to produce and maintain the pharmacopoeial standards required by the Americans were beyond the technological capability of the Chinese at that time. America had only recently printed a Chinese translation of its United States Pharmacopeia (10th edition), published in 1926. The strict American standards for aconite, digitalis, adrenalin, and insulin were purported to be managed by new or foreign-trained pharmacists (Read, 1930). Preparations such as liniments found in the British and U.S. Pharmacopoeias were included in the Chinese version. Syrups such as those of codeine and glucose and tinctures of cannabis came from the British influence. Foreign residents in China found it difficult to ingest local food and stated an "extensive need for bowel remedies." Therefore, drugs of the time, albuminis, aspidium, and emetin, were included. Vaccines for diphtheria, tetanus, and smallpox were maintained through the instruction of the USP.

German chemists had already gained a reputation for the isolation of chemical compounds, many of which were used medicinally and were already included in the Japanese Pharmacopoeia, such as oxalic acid, pyrogallic acid, and bromine. Therefore, the existing German-Japanese analytical methods were generally utilized for these areas, which comprised about 25% of the new Chinese Pharmacopoeia, whereas more British- and American-derived analytical methods and preparations were included for vegetable- and animal-based materials.

Agreement over the correct translation and naming of chemical compounds also proved problematic, e.g. when attempting to resolve disagreement between German-Latin and Anglo-American descriptions such as "natrium chloratum" and "sodii chloridum." The shared Latin common language elements aided European and American common understanding; however, translation into Chinese was troublesome. A potentially easier route would have been to adopt the Japanese Pharmacopoeia names and descriptions, often possessing the same Asian (Hanzi) character as that in China, however, this was resisted due to the strong nationalistic sentiment at the time in mainland China (Read, 1930).

Though the Japanese favored direct foreign phonetic transliterated terms for drugs, about 60 original Chinese materia medica entries had persisted in the Japanese Pharmacopoeia including entries for camphor, ginger, aloes, cardamom, and star anise.

Difficulty in plant identification and common naming was not confined to Asia. During the early 1900s period of European and American political expansion, attempts were being made in Europe to catalogue multilingual terms for similar plants such as the publication of "the illustrated polyglot dictionary of plants names" in Latin, Arabic, Armenian, English, French, German, Italian, and Turkish languages (Bedevian, 1936), cataloguing 3,657 plants in eight languages.

#### CHRONOLOGY OF PHARMACOPOEIAL DEVELOPMENTS IN CHINA

#### 1900–1949

Medicinal plant publications during the early 1900s, before the formation of the People's Republic of China in 1949, were greatly influenced by the previous "age of exploration." Many scientific societies were set up by explorers, their peers, and investors as forums to communicate knowledge and acknowledge ownership of findings and discoveries (Fyfe and Moxham, 2016). The rise in fashion of the "gentleman scholar" engaging in academic pursuits supported the occupation of writing. During this time, many publications focused on the identification and classification of ethnic/indigenous medicinal plants, such as Aztec medicinal plants still in use in modern Mexico (Braubach, 1925; Heinrich et al., 2014), and those of the Algonquians from present-day Canada (Speck, 1917), Micronesians (St John, 1948), Babylonians and Assyrians (Jastrow, 1914), Native American Indian tribes (Castetter et al., 1935), Persia (Garrison, 1933), and India (Chopra, 1933). Publications in English describing the history and use of Chinese medicine in the context of Western orthodox medicine also appeared (Chan, 1939).

#### Post-1949

Periods of advancements in TCM research after 1949 to the present day have been described as occurring in three defined phases lasting about 20 years each. The first was 1950–1970, springing from the rapid development of TCM in universities, research, and hospitals in China during this time. The second phase took place during 1980–2000s, where we see the construction of legal, economic, and scientific networks. The third phase, from 2000 to date, is defined by a focus on elucidating the scientific basis and scientific clinical practice of TCM using cross-disciplinary and global collaborations (Xu et al., 2013).

## 1950–1969

#### Political Context

This period immediately followed the formation of the People's Republic of China and saw a rise in nationalism and political introspection. International relationships cooled and a closer connection with the Soviet Union was officially forged with the Sino-Soviet Treaty of Friendship, Alliance, and Mutual Assistance in 1950.

#### Regulatory and Pharmacopoeial Developments

This period saw the launch, in 1953, of the first edition of the People's Republic of China Pharmacopoeia (ChP) in Chinese. It contained 531 monographs and mainly retained the information of the precursor published in the 1930s, compiled from foreign influences. It guided both identification and quantification of synthetic drugs and medicines together in one issue. Some crude herbal materials were listed, but not in analytical detail. Internationally, post-World War II goodwill fostered a sense of cooperation and collaboration. This was also reflected in the release of the International Pharmacopoeia (Ph. Int.) by the World Health Organization in 1951, produced in two volumes. It contained 344 monographs and 84 tests, with an aim to provide a harmonized international reference for pharmacopoeial methods. The first European Pharmacopoeia (Ph. Eur.) was produced in 1967, with a more European focus, but combining many common elements of the long-existing British Pharmacopoeia and the United States Pharmacopeia.

#### Medicinal Plant Research and Analytical Development

Research publication output during the 1950s was varied, but the most cited publications concerned identification of plant species using electron microscopy (Watson, 1958), the use of plant tissue staining methods (Bergeron and Singer, 1958; Fernstrom, 1958), and use of plant extracts for colorimetric analysis (Holt and Withers, 1958; Lillie, 1958). Though originating in the 19th century, the analytical tradition of extraction, purification, and separation of chemical plant components, e.g., the alkaloids, became increasingly sophisticated during this period (Svoboda et al., 1959). Toxicity studies during this time were still basic, mainly exposing mice to plant extracts and using mortality rate counting, organ biopsy, and cell staining techniques, e.g., quercetin, podophyllotoxin, and podophyllin extract toxicity studies (Leiter et al., 1950) and liver lesions induced with pyrrolizidine alkaloid extracts (Schoental, 1959).

Chemical screening of plants for their medicinal effects in various chemical and clinical trials featured (Farnsworth, 1966), as did their use in derivatized forms for the treatment of nerve inflammation (Jancso et al., 1967) and in human metabolism studies (Pletscher, 1968). Studies into the potential use of medicinal plants in cancer treatments were encouraged by the first isolation of paclitaxel from the Pacific yew, Taxus brevifolia Nutt.

Older basic chromatographic techniques that were already in use remained common, e.g., paper chromatography applied to the analysis of common broom [Cytisus scoparius (L.) Link.] (Jaminet, 1959) and in medicinal plant quality control (Paris and Viejo, 1955). Separation of alkaloids, e.g., in Duboisia myoporoides R. Br. (Hills and Rodwell, 1951), remained a common interest, as did the analysis of other important metabolites, including scilliroside in red squill, Drimia maritima (L.) Stearn (Dybing et al., 1954). An investigation of Cannabis sativa L. for its antibacterial activity was also conducted during this timeframe (Krejci, 1958).

Much of the medicinal plant research of this period concerned the extraction and isolation of single compounds from plants. Basic colorimetric tests, UV-visible and infrared spectroscopy, and paper chromatography had previously supported this type of analysis. Spectroscopic techniques such as UV-Vis spectrometry with chart recorders had been in use since the 1920s (Hardy, 1938). These were being increasingly used for quantitative applications, such as in the analysis of glucoside in walnuts and monitoring the chemical composition of plants in relation to seasonal variations (Daglish, 1950).

However, the 1950–1970s was a golden period for the development of analytical technology, a time when mass spectrometry (MS), nuclear magnetic resonance (NMR) spectroscopy, and gas chromatography (GC) came of age. Mass spectrometry, which had been invented in the late 1800s and used in a more analytical form during the 1910s, now entered a more advanced era. It was during the period 1950–1970 that the ion trap technique was developed, for which Dehmelt and Paul later received a Nobel Prize. The Purcell and Bloch groups at Harvard and Stanford University, respectively, developed NMR techniques and in 1952 also received a Nobel Prize (in physics). In 1952, Archer John Porter Martin and Richard Synge also shared a Nobel Prize (in chemistry) for inventing partition chromatography, the basis of modern GC. Gas–liquid separations solved the problem of separating sugar-based molecules, which tended to bond with traditional stationary phases such as silica, and volatile compounds, such as volatile oils, which are lost through evaporation during collection, preparation, and analysis. GC was applied for the first time to resolve 17 difficult-to-separate plant glycosides from a broad range of chemical classes, including phenolic, coumarin, isocoumarin, isoflavone, anthraquinone, cyanogenic, isothiocyanate, and monoterpene (Furuya, 1965), 15 kinds of valerian sesquiterpenoids in valerianaceous plant oils (Furuya and Kojima, 1967), and the extraction and analysis of rose oil (Minkov and Trandafilov, 1969).

Publications included well-applied examples where visible, ultraviolet (UV), and infrared (IR) spectral data were combined to elucidate structural characteristics of plant constituents undergoing chemical degradation, e.g., the stereochemical discrimination of the lignan components paulownin and isopaulownin from Paulownia tomentosa Steud. (Takahashi and Nakagawa, 1966), the alkaloids of the Orchidaceae (Lüning et al., 1967), and terpenoids of Zanthoxylum rhetsa DC (Mathur et al., 1967).

MS was also used side-by-side with NMR, resulting in the structural elucidation of key metabolites, e.g., the characterization of the opium papaverrubine alkaloids and their N‐methyl derivatives in the genus Papaver (Brochmann-Hanssen et al., 1968), the analysis of three new coumestan derivatives from the root of licorice, Glycyrrhiza spp., (Shibata and Saitoh, 1968), and the isolation and purification of polyprenols from the leaves of Aesculus hippocastanum L. (horse chestnut) (Wellburn et al., 1967).

Up to this time, China had played a very marginal role in international research and development activities, a situation that was to change significantly in the following period.

#### 1970–1989

#### Political Context

The year 1971 saw China's introspection from the Mao era revert to more external international engagement, with the "People's Republic of China" (PRC) elected as a permanent member of the United Nations' General Assembly. This followed the American government's extension of political relations with the PRC after the Richard Nixon presidential visit that catalyzed an "Opening up to the West" phase in Chinese history. This opening began in 1978, orchestrated by the interim leader Deng Xiaoping, who initiated support for wide-sweeping economic reforms. On a local level this manifested as individuals within China being allowed to make personal economic decisions, with the tightly governed communes being dissolved. Rural markets were replaced by open markets, resulting in a dramatic increase in international trade, supporting Deng's wish to fund economic growth from foreign investment. In the context of medicine, China's ambition to look outward was highlighted over a decade earlier by a University College London anatomy professor, Derrick James, who visited China with a British delegation in 1954 and in his subsequent Lancet article outlined China's intention to introduce a more scientific, modernized TCM (James, 1955).

As international trade from China expanded, so did the trade in medicinal plants from Asia and, with it, the access of Chinese scientists to modern analytical instrumentation. Internally, by the mid-1980s, 25 Chinese medicine colleges had been formed in a reportedly scientific and modern style, and the number of TCM hospital beds had increased almost 30-fold to 2.5 million since the formation of the state in 1949 (Cai, 1988).

#### Regulatory and Pharmacopoeial Developments

The establishment in 1985 of the China State Administration of Traditional Chinese Medicine began the formal organization of TCM research and development nationally and internationally, sowing the seeds for the formal cooperative global links that would provide the backbone for the future of international Chinese medicinal plant research. China's motivation to secure international links was also manifest in the publication of the PRC's first dual Chinese and English language Pharmacopoeia, ChP, 4th edition in 1997, which began its new 5-year publication cycle trend.

#### Medicinal Plant Research and Analytical Developments

The newly fostered global R&D investment and cooperation of this period is reflected in a leap in the sophistication and complexity of the research published, with a shift from basic to more advanced biochemical investigations and greater emphasis on disease and diagnosis strategies, such as in cancer and infectious disease. The most widely cited articles of this time include advanced biomedical research on forskolin, from the roots of Plectranthus barbatus Andrews, as a diterpene activator in nucleotide metabolism. Even though basic biochemical equipment, colorimetric methods, and spectrometric enzymatic assays were used, a more complex understanding of plant metabolites is apparent (Seamon et al., 1981).

This is also evident in the investigation of lectins as cell recognition molecules and their involvement in a wide range of molecular processes and potential pathologies, e.g., in metabolic regulation and in viral and bacterial infection processes (Sharon and Lis, 1989). In addition to work on phytochelatins, the plant peptides that complex heavy metals (Grill et al., 1985; Grill et al., 1987), licorice was studied in greater depth using a conceptually new approach that assessed its mineralocorticoid activity and role in sodium retention (Stewart et al., 1987) and the radical scavenging properties of its flavonoids (Hatano et al., 1988).

Awareness that plants can play both causative and curative roles in cancer emerged, with a highly cited review of the potential causes of esophageal cancer in China. Particular concerns were linked to the effects of fungal growth and associated nitrosamines arising from poor storage conditions (Mingxin et al., 1980). This was a precursor to later studies on aflatoxins, which are now acknowledged as causing serious health problems linked to poor storage and processing. From a therapeutic perspective, the interest in antileukemic and anti-tumor agents, e.g., in Taxus brevifolia Nutt. stem bark, first investigated some decades before, continued and ultimately resulted in the introduction of a completely new therapeutic approach (Wani et al., 1971).

One of the landmark discoveries in medicinal plant history was reported to the West during this period: the antimalarial effect of artemisinin, derived from Artemisia annua L. (Klayman, 1985), for which the Chinese scientist Youyou Tu later received the Nobel Prize in Physiology or Medicine. It marked a conceptual shift in the approach to treating malaria, moving away from quinoline-based drugs, to which parasites were showing increasing resistance, and paving the way for the development of new classes of drugs, e.g., with potential in antiviral and anticancer treatment (Su and Miller, 2015).

#### 1990–2008

#### Political Context

This period in China was characterized largely by economic, political, and academic success, delivering on the earlier aspirations of Deng Xiaoping through focused planning and the tight administrative grip of three successive presidents (Chairpersons) and the state administration. An unusually high-performing economy, with sustained growth in gross domestic product (GDP) of more than 10%, created a stable base for China to join the World Trade Organization in 2001, marking its arrival on the world stage as a competent economic power and its transition to a market economy (Morrison, 2013). This, however, came with challenges to families and the environment.

On a local level, as the communes of the previous decades dissolved, a system of "household responsibility" was adopted: a kind of contract under which agricultural family holdings guaranteed a certain level of food (and herb) output (Ash, 1988). This ensured that levels of agricultural production were optimized for the land available. Because families were now allowed to sell their produce in an open market that mirrored the national economic trend, food and medicinal herbs began to take on more distinct financial attributes. At the same time, the mass migration of rural workers from countryside homes to rapidly developing industrialized cities, where locally produced food was insufficient, created widespread demand that had to be met by supply from elsewhere. This led to new value chains for food and medicinal plant products, along with potential motivation for the substitution or adulteration of these products.

#### Regulatory and Pharmacopoeial Developments

As industrialization occurred so too did environmental pollution, with the increased volume and concentration of raw materials and waste presenting greater potential for the pollution of medicinal plant material. The PRC at this stage had gone through a period of prolonged political stability. Economic policy became more flexible, and governance took on an increasingly regulatory role compared with the more rigid enforcement of previous periods. Regulation and safety testing of medical products saw further guidance through the production of four further editions of the ChP in both Chinese and English, culminating in the 8th edition in 2005, which listed 3,217 monographs, almost double the number in the 1990 edition.

This period saw China's confidence increase and extend to regulatory and guidance aspects, with the ChP undergoing its greatest leap in analytical sophistication and rate of change to date. The 1990 edition was a significant step in the acceptance and introduction of modern instrumental analytical techniques for standard herbal substance testing. Specific identification tests, mainly using thin layer chromatography (TLC), had been introduced from the 1985 edition; now chromatogram images of the crude and test samples were included and required for testing. Basic identification was expanded to require quantitation: high-performance liquid chromatography (HPLC) and GC were included for the first time, and TLC was extended to content analysis. Instrumental techniques replaced older ones, such as the spectrophotometric determination of berberine alkaloid content, which had been analyzed gravimetrically in previous editions. Quantification moved from measuring simpler marker components to more specific active compounds, such as anthraquinones from He Shou Wu, Polygonum multiflorum Thunb. [now Reynoutria multiflora (Thunb.) Moldenke]. The 2000 edition introduced assays for residues of organochlorine pesticides for Gan Cao, Glycyrrhiza uralensis Fisch. ex DC., and Huang Qi, Astragalus membranaceus Fisch. ex Bunge (Kwee, 2002). Another leap occurred in the 2005 edition with expanded acceptance of HPLC-MS, LC-MS-MS, DNA molecular markers, and chemical fingerprinting, setting the stage for 21st century pharmacopoeial trends and for the ChP as a central global influence on the analysis of medicinal plants.

#### Medicinal Plant Research and Analytical Developments

The investment in external academic relations from the "opening up" phase and the internal support for the TCM structures formed by the state initiatives of previous decades came to fruition in the publication output of this period, which increased six-fold compared with that of the previous equivalent 20-year period. Much of the output from this time demonstrated a refinement of thought around the effect of plant compounds on humans as a holistic system, rather than the more singular metabolic pathway thinking of previous years. It also shows a strong emphasis on obtaining large datasets, especially for known metabolites, and a wide exploration of claimed effects. Whole plant extracts and combinations of metabolites, rather than single metabolites, became a core theme, as did a medicinal plant's effect on longer term health and its role in preventative medicine. This ignited a resurgence of interest in the analysis of medicinal plants as a source of lead compounds for drug discovery.

The role of medicinal plants in coronary disease became topical during this phase, e.g., long-term studies in elderly people demonstrated a reduced risk of death with sustained flavonoid intake, attributed to inhibition of the oxidation of low-density lipoprotein (Hertog et al., 1993). More sophisticated quantitative analysis and differentiation appeared during this time, such as HPLC of mulberry leaves containing four varieties of flavonoids (including rutin and quercetin) and the study of their antioxidant properties (Zhishen et al., 1999). Understanding of the roles of flavonoids in coronary disease risk prevention and in cancer was advanced by their characterization and analysis in a wide range of fruits, seeds, oils, wines, and tea (Middleton et al., 2000). A greater awareness of the potency and efficacy of drugs and medicinal plants became evident, as in studies of the effect of fluorine on drug binding and potency (Purser et al., 2008). Cancer research also advanced by combining previous findings on receptor binding with improvements in DNA extraction, amplification, and cloning techniques. Resveratrol became a key area of interest for its chemoprotective effects (Jang et al., 1997).

Many of the most cited publications of these two decades were detailed reviews, which brought together the findings of previous research on individual plants.

## 21st Century

China's growing influence was marked in 2011, when the Chinese State Administration of TCM (SATCM) formed an official relationship with the European Directorate for the Quality of Medicines (EDQM) to share expertise and knowledge and to raise the standards of testing in China and Europe through cooperation. Areas of cooperation include the translation of historical TCM documents and information relating to the preparation, processing, and sourcing of products. Europe, seen as an aggregate, has an approximately 16% representation in the last decades' research output, higher than the USA. The European Pharmacopoeia (Ph Eur) manages CHMs by allowing their importation into countries that have signed up to the European Pharmacopoeia convention. Currently there are 43 CHMs included in the Ph Eur, 8th edition, 34 from the Ph Eur TCM Working Party, 21 of which have been included as full monographs (Wang and Franz, 2015). New Ph Eur CHM monographs are being developed based, in part, on the ChP. This was facilitated by a working party on TCM (Ph Eur WP), which was officially introduced in 2005. It included 38 member states with a delegation from the EU (a representative from DG Health & Food Safety and the European Medicines Agency). Additional observers comprise 27 countries/regions/organizations [which include 7 European countries, the Taiwan Food and Drug Administration (TFDA), and the World Health Organization (WHO)] (EDQM, 2017). The WHO, through participation in the Ph Eur, has additionally led efforts to develop a harmonized international pharmacopoeia (WHO, 2018).

The monographs for medicinal plants in Ph Eur have developed from standard western drug monographs with an emphasis on chemical and physical testing, while those in the ChP have formed from revisions of older traditional texts.

As pharmacopoeial monographs expand and develop, so too does the range and complexity of analytical methods and analytical hardware needed to meet the regulatory demands and expectations of quality.

These emerging research trends and pharmacopoeial directives have paved the way for the development of a broad range of analytical techniques, mainly centering around the use of liquid chromatography (LC), GC, MS, and established UV/visible spectrophotometric techniques.

We present a selection of these analytical techniques and give examples of their applications in the analysis of medicinal plants and medicinal plant products.

#### Analytical Hardware, Attested and Emerging Methods

#### High-Performance Liquid Chromatography

HPLC is one of the most developed and widely used analytical techniques. It is built on a historical knowledge base amassed from TLC and optical chemistry experience. Chromatographic separation in HPLC relies on principles similar to those of TLC/HPTLC, where the separation of components depends on their selective affinities for the stationary support and the liquid mobile phase.

Detection employs a photomultiplier system that, in its different iterations, can detect individual wavelengths of light, a range (spectrum), and/or multiple simultaneous wavelengths. Combining this detection with sample injectors in an enclosed, automated instrument system has significantly increased the precision and reproducibility of the chromatography when compared with older chromatographic methods. The widespread use of HPLC has made it more affordable for laboratories. It does not require a high level of operator skill, is robust and sensitive enough for low-level detection, and is used particularly for the quantification of components (active substances and adulterants).

HPLC applied to herbal products is well developed, and it has been successfully applied to the analysis of complex mixtures of similar compounds, both for the separation of individual compounds and for the differentiation of medicinal plant species. The high resolution of the technique has supported the concept of a characteristic "fingerprint" for medicinal plants and herbal products to aid identification and authentication. For example, Li et al. (2010) demonstrated differentiation of the same type of medicinal plant product from 40 different manufacturers, while simultaneously separating nine marker compounds (berberine, aloe-emodin, rhein, emodin, chrysophanol, baicalin, baicalein, wogonoside, and wogonin).
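The fingerprint idea can be illustrated with a simple similarity score between two chromatograms. The following is a minimal sketch, not the method used by Li et al. (2010): it assumes both traces are sampled on the same retention-time grid, uses cosine similarity as one common choice of score, and works on synthetic Gaussian peaks whose positions and heights are invented purely for illustration.

```python
import numpy as np


def fingerprint_similarity(reference, sample):
    """Cosine similarity between two chromatograms on the same time grid."""
    reference = np.asarray(reference, dtype=float)
    sample = np.asarray(sample, dtype=float)
    if reference.shape != sample.shape:
        raise ValueError("chromatograms must share the same retention-time grid")
    denom = np.linalg.norm(reference) * np.linalg.norm(sample)
    if denom == 0:
        raise ValueError("a chromatogram with no signal cannot be compared")
    return float(reference @ sample / denom)


def gaussian_peak(t, center, height, width=0.15):
    """Synthetic peak (illustrative only): retention time `center` in minutes."""
    return height * np.exp(-0.5 * ((t - center) / width) ** 2)


# Hypothetical traces: a 30 min run sampled every 0.01 min.
t = np.linspace(0, 30, 3001)
reference = sum(gaussian_peak(t, c, h) for c, h in [(5, 1.0), (12, 0.6), (20, 0.8)])
# Same marker peaks with slightly different amounts (e.g., another batch).
same_product = sum(gaussian_peak(t, c, h) for c, h in [(5, 0.9), (12, 0.7), (20, 0.75)])
# Peaks at different retention times (e.g., a different product).
different_product = sum(gaussian_peak(t, c, h) for c, h in [(8, 1.0), (15, 0.5), (25, 0.9)])

print("same product     :", round(fingerprint_similarity(reference, same_product), 3))
print("different product:", round(fingerprint_similarity(reference, different_product), 3))
```

In practice, real chromatograms would first need baseline correction and retention-time alignment; those steps are omitted here, and any acceptance threshold for the score would have to be set and validated by the laboratory.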

#### High-Performance Thin Layer Chromatography

HPTLC has become a common addition to the method section of new monographs, replacing the widely used TLC tests; it has been shown to be a reliable and reproducible method of analysis that provides essential information regarding the compositional quality of an herbal substance.

Some advantages of this technique include low cost and a relatively simple test method. It does not require advanced sample preparation methods or high levels of expertise. Sample amounts are relatively small, and it is a more sensitive technique compared with HPLC, well suited to detecting contaminants. However, its reproducibility depends on a variety of external factors and, although more sensitive than HPLC, it cannot adequately detect compounds at very low concentrations (parts per billion, ppb), where LC-MS (or HPTLC-MS) may be more suitable. HPTLC relies on the same principle as TLC and uses similar plates and mobile phases, although relatively small amounts of solvent are required compared with standard TLC. The process of applying the sample to plates (spotting) has been made more reproducible and precise by spraying the sample onto the plate to form a band rather than a spot. Retention factors for individual compounds are more reproducible because humidity is controlled during development. Derivatization of the plates is carried out mainly by machine, and the visualization is captured by modern camera systems connected to powerful software. The software allows further manipulation of images to optimize visualization in a way that would be very difficult chemically. Another advantage is that the HPTLC system can easily be linked to a scanning densitometer; this not only allows more precise quantitative work to be carried out but also allows the data to be exported for multivariate analysis. It is likely that more of the monographs with TLC requirements will be upgraded to HPTLC in the future.
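The retention factor mentioned above is the ratio of the distance travelled by a band to the distance travelled by the solvent front, both measured from the application line. A minimal sketch of that calculation follows; the function names, the example distances, and the matching tolerance of 0.02 are illustrative assumptions and are not taken from any monograph.

```python
def retention_factor(band_distance_mm, front_distance_mm):
    """Rf = band migration distance / solvent front distance (same origin)."""
    if front_distance_mm <= 0:
        raise ValueError("solvent front distance must be positive")
    if not 0 <= band_distance_mm <= front_distance_mm:
        raise ValueError("band must lie between the origin and the solvent front")
    return band_distance_mm / front_distance_mm


def matches_reference(rf_sample, rf_reference, tolerance=0.02):
    """True if two Rf values agree within a hypothetical tolerance."""
    return abs(rf_sample - rf_reference) <= tolerance


# Hypothetical plate: solvent front at 70 mm, sample band at 28 mm.
rf = retention_factor(28.0, 70.0)
print(f"Rf = {rf:.2f}")
print("matches reference band at Rf 0.41:", matches_reference(rf, 0.41))
```

Because Rf is a ratio, it is independent of plate length, which is why controlling development conditions such as humidity translates directly into more comparable values between runs.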

#### Gas Chromatography

In medicinal plant analysis, GC is mainly used for compounds with higher volatility, e.g., compounds found within essential oils, and for more volatile adulterants, e.g., pesticides. While single-column GC and its hyphenated derivatives have been used for many years, 1991 saw the introduction of 2D-GC or GC x GC, where the eluents of a standard separation are trapped and recirculated for another round of separation. This allows not only greater resolution and better separation but also the ability to purge undesired or interfering compounds so that more specific areas of the separation can be targeted (Liu and Philips, 1991). This led the way for multidimensional gas chromatography (MDGC) and for advances in the modules and valve systems that trap, control, and divert sample streams. These improvements extend to the thermal control and valve systems, allowing greater thermal flow and split streaming (Bahaghighat et al., 2019). One key problem with GC is the introduction of the sample into a gas stream. Historically, squeezing, boiling, and later distillation of herbal materials were used for the collection and production of volatile compounds such as oils. However, the inherent instability of volatile components, together with losses and poor recovery of these substances, presented difficulties. This situation has been overcome to some extent by advances in extraction techniques such as solvent-free microwave extraction, e.g., for citrus peel oils [Citrus sinensis (L.) Osbeck]. No solvents or water are necessary for high recoveries with this method, and it allows highly efficient, compatible sample introduction without the need for interfering solvents (Aboudaou et al., 2018). This sample extraction method, commonly known as headspace analysis for GC, has undergone many iterations (Gerhardt et al., 2018). It has now developed to the stage where it is increasingly used for bacterial and microorganism detection, such as in Commiphora species (Rubegeta et al., 2018).

Microextraction techniques are essential for the introduction of small sample volumes into the GC gas stream. Needle-based extraction techniques have the advantages of automation, ease of interfacing with other instruments, and compatibility with miniaturization. Advances in solid phase dynamic extraction (SPDE), in-tube extraction (ITEX), and needle trap extraction (NTE) have refined the use of these techniques for natural and herbal compounds (Kędziora-Koch and Wasiak, 2018), e.g., SPDE and ITEX for pesticide residues in dried herbs (Rutkowska et al., 2018), herbal mint aroma compounds in commercial wine (Picard et al., 2018), and volatiles in the Chinese herbal formula Baizhu Shaoyao San (Xu et al., 2018).

#### Supercritical Fluid Chromatography

Supercritical fluid chromatography (SFC) is another chromatographic technique, based on pressurized low-viscosity (supercritical) fluids, often carbon dioxide. Since its introduction by Klesper in 1962, it has made large advances, mainly due to improvements in its initially troublesome instrumentation (Desfontaine et al., 2015). Its main advantage over other techniques is its usefulness for separating the complex components characteristic of natural compounds. The SFC mobile phases and modifiers can be finely tuned across a wide range of polarities, from non-polar to polar, allowing a broad selection of separations (Gao et al., 2010). Early analysis of natural products with SFC came when it was first hyphenated with gas chromatography (King, 1990). Recently, it has been more fully developed to analyze a range of natural compounds in herbal substances, notably terpenes, phenolics, flavonoids, alkaloids, and saponins. This has been achieved through hyphenation to MS, diode array detectors, and ELSD (SFC-ELSD), in addition to the development of novel stationary phases such as cyanopropyl, pentafluorophenyl (PFP), and imidazolyl. An example is the separation of coumarins in Angelica dahurica (Hoffm.) Benth. & Hook.f. ex Franch. & Sav. roots and anthraquinones in rhubarb root (Pfeifer et al., 2016).

#### Near-Infrared Spectroscopy

Although commonly used within industry since the 1990s, near-infrared (NIR) spectroscopy was not the method of choice for medicinal plant analysis, mainly because overlapping peaks made interpretation of the data problematic; consequently, it never became established in the quality control laboratory in the same way as HPLC and TLC. However, with the addition of new computational software, NIR is re-emerging as an affordable and useful analytical technique for medicinal plants and has been particularly favored by Chinese companies in routine quality control analysis due to its ability both to rapidly differentiate between species and to provide quantitative information on metabolite content (Li et al., 2013; Zhang and Su, 2014).

As with HPTLC and NMR data, NIR data also provide an opportunity for multivariate analysis, and the technique appears capable of resolving very small variations in metabolite content. It is argued that more traditional TLC or HPLC techniques can be more subjective at the data interpretation stage and require a high degree of operator skill, and that NIR is more suitable for high-volume analysis in the routine quality control laboratory (Wang and Yu, 2015). However, this has partly been addressed by the introduction of fully automated systems for HPTLC analysis and the inclusion of scanning densitometry equipment, which reduce the need for operator interpretation. The main advantages of NIR appear to be the preservation of sample integrity, the little sample preparation needed, and the absence of solvents, and it has been shown to perform comparably to HPLC for species differentiation and quantification of metabolites (Chan et al., 2007).

Probably the main drawback of NIR compared with other methods, especially TLC, HPTLC, and LC-MS, is its sensitivity: some reports suggest that the technique may only be suitable for detecting compounds present at concentrations above 0.1% (Lau et al., 2009). Another consideration is that variation in NIR data depends on both the chemical and physical properties of the sample, with physical properties, e.g., particle size, having greater effects on the variation than chemical ones. Therefore, before multivariate analysis can take place, some pre-treatment of the spectral data is necessary, e.g., to reduce baseline noise and light scattering and consequently enhance any chemical variation in the sample set (Chen et al., 2008). Some advantages of NIR are certainly apparent, although it may not be appropriate for all situations and all types of samples. The technology has made a huge leap forward since its first introduction, and it now needs to establish itself more widely as a useful tool in the quality analysis of medicinal plants.
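The spectral pre-treatment step can be illustrated with standard normal variate (SNV) scaling, one widely used scatter correction. This is a sketch of one possible pre-treatment, not necessarily the one applied in the studies cited above; the synthetic absorption band and the offset and scale values standing in for particle-size effects are assumptions made for illustration.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) by its
    own mean and standard deviation, removing additive offsets and
    multiplicative scaling that differ from sample to sample."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    if np.any(std == 0):
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return (spectra - mean) / std


# Synthetic example: one chemical profile measured twice, where the second
# measurement has a different baseline offset and scatter-induced scaling.
wavelengths = np.linspace(1000, 2500, 301)  # nm, illustrative grid
band = np.exp(-0.5 * ((wavelengths - 1900) / 60.0) ** 2)
raw = np.vstack([1.0 * band + 0.10, 1.5 * band + 0.40])

corrected = snv(raw)
print("max difference before SNV:", float(np.abs(raw[0] - raw[1]).max()))
print("max difference after SNV :", float(np.abs(corrected[0] - corrected[1]).max()))
```

After scaling, the two spectra coincide, so any remaining differences in a real data set are more likely to reflect chemical rather than physical variation. Derivative filters and multiplicative scatter correction are common alternatives with the same aim.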

#### Hyphenated Techniques

Combinations of techniques with modern developments in metabolomic analysis and computational pattern recognition programs open up a wider scope of applications in medicinal plant analysis. Tandem combinations of analytical instrumentation, such as MS with HPLC, have proved a productive route to expanding analytical medicinal plant applications, not only in identification and fingerprinting but also in the further chemical characterization of individual compounds, e.g., Liu et al. (2011) characterized a spectrum of alkaloid components in the Chinese herb Ku Shen (Sophora flavescens Aiton). Further combinations and permutations of MS and NMR with HPTLC have been demonstrated, such as the detection of acetylcholinesterase inhibitors in galbanum in a search for natural product drug candidates (Hamid-Reza et al., 2013). HPTLC-MS has been shown for Ilex vomitoria Aiton, using a sampling probe following HPTLC combined with electrospray ion trap MS (Ford and Van Berkel, 2004), and for Hydrastis canadensis L., using HPTLC-MS with atmospheric pressure chemical ionization (Van Berkel et al., 2007).

Analytical combinations including ESI-IT-TOF/MS-HPLC-DAD-ESI-MS have been demonstrated for the analysis of coumarin patterns in Angelica polymorpha Maxim. roots (Liu et al., 2011), and multihyphenated techniques such as SPE-LC-MS/MS-ABI quadrupole trap have been used for the analysis of six major flavones in Scutellaria baicalensis Georgi (Fong et al., 2014) and 38 saponins in the roots of Helleborus niger L. by LC-ESI-IT-MS (Duckstein et al., 2014).

Merging the separation ability of HPTLC or HPLC with the analytical power of NMR and MS has significant benefits for analyzing complex samples in complex matrices such as blood, soil, and plants. However, each technique also has inherent disadvantages. MS is complex, expensive, and time-consuming and requires high analytical skill levels, so it may not be suitable for a general quality assurance laboratory. Though it is powerful, extensive method development and post-analysis data processing are required when it is applied to natural compounds with broad, complex compositions, in contrast to simpler synthesized pharmaceutical ingredients. Similarly, NMR is expensive and sensitive to variations in sample preparation and composition. It is not fully applicable to all natural compound samples, and the signals generated often overlap, making data analysis for individual compounds problematic. However, the relative speed, rich information output, and insight into the overall composition of medicinal plants from both MS and NMR far outweigh the disadvantages. These techniques allow the detection of compounds into the parts per billion range (MS) and a detailed fingerprint of metabolites across differing polarities (NMR), and so for research and for larger companies they are highly applicable analytical hardware.

#### METABOLOMICS

Pharmacopoeial methods focus on the authentication and quality of herbal materials; metabolomics, however, allows us to go a step beyond authentication and look in more detail at a broad range of secondary metabolites. Coupling analytical data to multivariate software allows us to develop statistical models that firstly differentiate between species and also give a better idea of the typical metabolite composition of a particular species. The advantage of this is that it can help to inform any laboratory test or clinical intervention. There has been great emphasis on making sure that any experiment or intervention uses plant material that is authenticated, with a herbarium specimen deposited. However, the requirements do not stipulate that a good representative of the species should be used. This is where metabolomics can provide essential information: by collecting a wide range of samples from different geographical locations, altitudes, and growing conditions, we can map their metabolite differences and highlight how diverse or how similar their metabolite composition is. When an experiment is performed, we have the choice to use a specimen that is typical, i.e., contains an average composition, or to look at compositions that are atypical, containing greater amounts of specific metabolites or even different metabolites. Moreover, if a particular experiment produces positive results and we want to reproduce the data, a metabolomic model allows us to choose specimens that have a similar composition.
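A minimal sketch of such a multivariate model is given below, using principal component analysis (PCA) on autoscaled data. It is an illustration under stated assumptions rather than a description of any specific metabolomics software: the metabolite intensities are randomly generated for two hypothetical species, and the helper `most_typical` is an invented name for picking the specimen closest to the average composition.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA via SVD on autoscaled data (mean-centred, unit variance per
    metabolite). Returns sample scores and explained-variance fractions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant metabolites unscaled
    scaled = centered / std
    U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]


def most_typical(scores):
    """Index of the sample closest to the centroid of the given scores."""
    centroid = scores.mean(axis=0)
    return int(np.argmin(np.linalg.norm(scores - centroid, axis=1)))


# Synthetic data (assumption): 6 specimens each of two species, 4 metabolites.
rng = np.random.default_rng(0)
species_a = rng.normal([10.0, 2.0, 5.0, 1.0], 0.3, size=(6, 4))
species_b = rng.normal([4.0, 8.0, 5.0, 3.0], 0.3, size=(6, 4))
X = np.vstack([species_a, species_b])

scores, explained = pca_scores(X)
print("PC1 scores, species A:", np.round(scores[:6, 0], 2))
print("PC1 scores, species B:", np.round(scores[6:, 0], 2))
print("explained variance (PC1, PC2):", np.round(explained, 2))
print("most typical species A specimen:", most_typical(scores[:6]))
```

With these synthetic data the two species fall on opposite sides of the first principal component, and the specimen nearest the species centroid is a natural candidate for a "typical" sample, while specimens far from it are the atypical ones discussed above.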

This approach has important economic implications, as a detailed understanding of metabolomic analysis allows us to inform industry on how to grow plants of the best composition and so helps to support the local livelihoods of farmers and primary processors in developing economies, e.g., Chachacoma (Senecio nutans Sch. Bip.) cultivation in the high-altitude regions of Chile, where metabolomics has helped to establish the best altitude for growing plants with the highest content of the anti-inflammatory acetophenone (Lopez et al., 2015).

This strategy also has applications in product development, where metabolomics can help to determine the quality of products based on their metabolite content, e.g., turmeric (Curcuma longa L.) products (Booker et al., 2014), and can help to provide evidence that leads to value addition for a product and greater confidence in its quality and safety.

#### NANOPARTICLES

Nanoparticles, 1–100 nm sized ions or organic/inorganic molecules, have proven to be important in the development of new analytical testing (Tao et al., 2018), occupying the analytical region between ionic dimensions and small molecules.

Recent developments in nanoparticle research have led to an increased focus on chemo-bio sensing, as DNA has become the most used biological molecule to functionalize nanoparticles. Nanoparticles offer many advantages for more consistent and specific testing, including a more reproducible and stable matrix for research and development, a more controllable and reliable basis for designing and conjugating functional molecules, and wide flexibility in the purification, selection, and modification of analytes. Nanoparticles have been used to create a biological bar code for the trace analysis of mycotoxins in Chinese herbs, using nanoparticles conjugated with DNA fragments and applied to Chinese medicinal plants such as Jue Ming Zi [Cassia seeds, Senna obtusifolia (L.) H.S.Irwin & Barneby], Yuan Zhi (Polygala tenuifolia Willd.), and Bai Zi Ren [Platycladus orientalis (L.) Franco] (Yu et al., 2018).

#### THE FUTURE

The next steps in analytical advancement, in combination with technological improvements, will most likely occur in the realm of artificial intelligence (AI). Neural networks have already shown promise in consumer electronics and online search engine optimization. Self-learning algorithms have been in development for decades, with great potential for the application of self-synthesizing, auto-creating, and auto-adapting algorithms, which can optimally recognize and synthesize analytical data into meaningful and useful patterns. Work that goes beyond what a single human mind could hope to achieve in lifetimes is now possible in seconds with current technology, and more so with future technology. This extends human potential not only in thinking and observation but also in prediction and design. It could potentially play a role in the self-design of analytical instrumentation and its modules and in the self-optimization of methods in real time, saving what would perhaps take an analyst weeks or months of work-hours to complete.

The greatest challenge with AI is its opacity and computational complexity. With self-learning systems already

#### REFERENCES

Aboudaou, M., Ferhat, M. A., Hazzit, M., Ariño, A., and Djenane, D. (2018). Solvent free-microwave green extraction of essential oil from orange peel (Citrus sinensis L.): effects on shelf life of flavored liquid whole eggs during storage under commercial retail conditions. [Preprint]. Available at: https:// www.preprints.org/manuscript/201801.0055/v12018010055 (Accessed August 15, 2018). doi: 10.20944/preprints201801.0055.v1

self-generating codes and pathways that would take decades for a single human to decode and understand, if ever possible. This presents a great challenge for use in reproducible, validated quality-driven, audit-trailed regulated orientated environments. This is where natural compounds such as herbal substances can play a significant role i.e. data from the same plants species with variable composition can help verify the input and outputs of complex analysis and recognition software. In AI-driven systems, natural substances are ideal candidates for testing the analytical attributes such as accuracy, precision, and robustness of whole AI-instrumentation systems.

#### CONCLUSIONS

As pharmacopoeial requirements continue to develop and instrumental technology advances, it is clear that we will be able to delve further and further into the chemical composition of medicinal plants and develop more advanced techniques for the detection and quantification of adulterants and contaminants. However, it should be considered that although these technological advances give us this opportunity, more traditional organoleptic analysis also provides us with essential sensory information regarding medicinal plant quality.

We have shown the emergence and historical importance of complex analytical techniques used in medicinal plant analysis. However, any analytical approach can provide only a partial perspective on complex multicomponent preparations. Future improvements in this area may therefore rely not entirely on developing ever more complex analytical techniques, but also on implementing best practice throughout all stages of the production and supply of herbal medicines.

#### AUTHOR CONTRIBUTIONS

AB wrote the sections on applications of metabolomics, NIR, parts of the introduction, and conclusions. MF wrote most of the sections on instrumentation, trends in publications, and history, and part of the introduction and conclusions. MH contributed to the methodological design of the study and assisted with the data analysis.

#### FUNDING

MF's scholarship is funded by Brion Research Group (Sun Ten Pharmaceutical Co) and Herbprime, UK.

Ash, R. F. (1988). The evolution of agricultural policy. China Quart. 116, 529–555. doi: 10.1017/S0305741000037887

Bahaghighat, H. D., Freye, C. E., and Synovec, R. E. (2018). Recent advances in modulator technology for comprehensive two dimensional gas chromatography. TrAC Trends In Anal. Chem. 113, 379–391. doi: 10.1016/j.trac.2018.04.016

Bauer, A., and Brönstrup, M. (2014). Industrial natural product chemistry for drug discovery and development. Natural Prod. Rep. 31 (1), 35–60. doi: 10.1039/ C3NP70058E


trap system. Rapid Commun. In Mass Spectrom. 18 (12), 1303–1309. doi: 10.1002/rcm.1486


derived from grapes. Science 275 (5297), 218–220. doi: 10.1126/ science.275.5297.218


Available at: https://sustainabledevelopment.un.org/content/documents/ 6544118\_Pesic\_Development%20of%20natural%20product%20drugs%20in% 20a%20%20sustainable%20manner.pdf. (Accessed August 15, 2018).


needle trap device with multivariate data analysis. R. Soc. Open Sci. 5 (6), 171987. doi: 10.1098/rsos.171987


Conflict of Interest: MF's scholarship is funded by Brion Research Group (Sun Ten Pharmaceutical Co) and Herbprime, UK.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Fitzgerald, Heinrich and Booker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.