Assessing the Identity of Commercial Herbs From a Cambodian Market Using DNA Barcoding

In Cambodia, medicinal plants are often used to treat various illnesses. However, the identities of many medicinal plants remain unknown. In this study, we collected from a domestic market 50 types of traditional Cambodian medicinal plants that could not be identified by their appearance. We used DNA barcoding, combined with a literature survey, to trace their identities. In the end, 33 were identified at the species level and 7 at the genus level. The ethnopharmacological information of the 33 plants identified at the species level was documented. DNA barcoding is useful for identifying medicinal plants for which no previous information is available.


INTRODUCTION
Cambodia is located in the Indo-China Peninsula, where it borders Thailand, Vietnam, and Lao PDR in Southeast Asia. Although it does not exceed 4% (181,035 km²) of the total area of Southeast Asia, Cambodia is well-known for its rich biodiversity, overlapping with four of the 25 "biodiversity hotspots" and maintaining rich natural resources and a unique ecosystem. It is estimated that the country has more than 3,000 vascular plant species (Chassagne et al., 2016). Approximately 1,200 medicinal plants are used to treat diseases (Xu, 2008; Walker, 2017). Traditional medicine plays an important role in the lives of most Cambodians. In the face of disease, 70-80% of Cambodians opt for traditional medicinal methods (Walker, 2017; Yao et al., 2017), with approximately 40-50% of the population using medicinal plants daily (Chassagne et al., 2017).
Traditional Cambodian medicine involves several cultural and regional traditions derived from Theravada Buddhism, Ayurveda, traditional Chinese medicine, and French pharmaceutical traditions (Chassagne et al., 2017). Among these, Chinese and Ayurvedic medicines are the two oldest and most comprehensive medical systems based on natural medicinal agents. Consequently, the importance of traditional medicinal plant research in Cambodia is relatively high. However, Cambodia still does not have a national ethnopharmacopoeia (World Health Organization [WHO], 2005). Furthermore, there are few curricula teaching traditional Cambodian medicine, and books offer inconsistent and confusing information (Richman et al., 2010). Due to climate change and agro-industrial development in Cambodia, the local ecological environment is threatened (Chassagne et al., 2016). At the same time, Western medicine is being promoted, and knowledge of medicinal plants is being lost (Xu, 2008). Therefore, the study of medicinal plants in this country is very important. The medicinal plant market is not only a place for the sale of natural therapies but also a place for people to exchange information on medicinal plants, which preserves the knowledge as the information is passed from one generation to the next (Lee et al., 2008; de Carvalho Nilo Bitu et al., 2015; Jin et al., 2018). Therefore, we chose to conduct our study at a traditional market in Phnom Penh, the Cambodian capital.
We performed this study in August 2016, December 2016, and November 2017 in Orussey Market, which is one of the largest traditional markets for Chinese merchants in Phnom Penh carrying a wide variety of medicinal plants.
The medicinal materials market that we surveyed represented only a small part of the entire Orussey Market (Figure 1). Only about 10 merchants were selling herbal medicines. The business model of the Cambodian medicinal plant market is mainly retail sales. Each store was small and independently run, with its own shop name. The medicinal plants were stored outside of the shops for customers to browse and purchase. The herbs had no fixed specifications, and they were derived from plants in the region. There were various types of herbs that included roots, stems, leaves, fruits, and whole plants. Most medicinal plants were previously dried, and a few fresh medicinal plants were formulated into medicines or single-flavored products. Due to the large number of Chinese customers, the shops also sold commonly used herbs in China, such as red dates, pepper, and Atractylodes macrocephalae rhizome (Atractylodes macrocephala Koidz.). The quality and specifications of the medicinal plants were not significantly different from store to store, and the price was similar across different shops. We collected samples from a total of 118 medicinal plants, of which 68 could be identified by morphology, whereas the remaining 50 species could not be morphologically distinguished. The main objective of this study was to clarify the original species of these medicinal plants.
Morphological, microscopic, and physicochemical methods are commonly used to identify medicinal plants, but they require experienced investigators who are knowledgeable in the field. A shortage of such investigators has led to difficulties in identifying unknown plants. With advances in science and technology, newer methods such as chromatography, spectroscopy, and X-ray diffraction have been used to study medicinal plants (Chen et al., 2012). Although most of these methods cannot by themselves identify starting material of unknown origin, they can provide indirect evidence for the authenticity of the material (Han et al., 2016).
The DNA barcoding technique is an effective tool that can identify unknown medicinal plants with no background information. It uses a short DNA sequence from a standard and agreed-upon position in the genome to identify the species rapidly and accurately (de Vere et al., 2015; Li et al., 2015). The experimental method is fast, standardized, and simple. It supports high experimental throughput and straightforward species identification (Chen et al., 2011). The common DNA barcodes for plants are rbcL, matK, ITS2, and psbA-trnH. However, matK is difficult to amplify with commonly used primers, so different taxonomic groups require different sets of primers (Hollingsworth, 2008). Furthermore, rbcL sequences evolve slowly, and this locus has by far the lowest divergence among plastid genes in flowering plants (Kress et al., 2005). Due to its modest discriminatory ability, it is not recommended for studies at the species level. Presently, psbA-trnH is the most widely used plastid barcode for species identification, as its universal primers can amplify nearly all angiosperms (Shaw et al., 2007). Internal transcribed spacer 2 (ITS2), a part of the nuclear DNA, is another ideal barcode because of its short length, easy amplification with a single primer pair, high sequencing efficiency, and high variation between species (Yu et al., 2017).
Internal transcribed spacer 2 and psbA-trnH represent the universal barcode for the reliable identification of medicinal plants (Chen et al., 2010; Yao et al., 2010). The ITS2 sequence can accurately identify Solani nigri (Solanum nigrum L.) and its sibling species (Chen et al., 2017), Scutellaria barbata D.Don and its adulterants (Guo et al., 2016), and Lycium barbarum L. and its adulterants, as well as the origin of various ginseng plants based on SNP barcodes. This technique can also be used to classify and identify medicinal plants from family Orchidaceae (Tang et al., 2017). Furthermore, the ITS2 region has been used to supervise the proportions and varieties of adulterant species (Yu et al., 2017). Presently, DNA barcoding is widely used, as it has been applied in the authentication and identification of small berry fruits (Wu et al., 2018), as well as in the study of Li minority medicine (Cui et al., 2019) and various animal species. The technique has also been used to identify Sida L. herbal products, traditional Dai medicines, and laxative-producing plants. Han et al. (2016) identified unknown herbal plants in common markets and reported the adulterant rate. In this study, we used the DNA barcoding technique to identify the 50 unknown medicinal plants obtained from the Orussey Market, one of the largest Cambodian traditional markets.

RESULTS AND DISCUSSION
In total, we collected samples from 118 medicinal plants, of which 68 could be morphologically identified; the remaining 50 were randomly selected for DNA barcoding analysis. According to Table 1, 42 of the 50 samples were derived from stems, bark, or vines; the other eight comprised two fruits, four roots, one leaf, and one rhizome. Based on their morphological appearance, it was difficult to confirm their identity. The ITS2 amplification success rate was 94% (47/50), and the sequencing success rate was 98% (46/47). According to the results of the ITS2 and psbA-trnH experiments, 33 plants were identified at the species level and seven at the genus level. Ten plants were unidentifiable (five because of amplification failure, sequencing failure, or no matched result, and five because of low maximum similarity). There were 27 medicinal plants with a maximum ITS2 similarity <97%, and amplification of the psbA-trnH region was attempted in these samples. The psbA-trnH amplification success rate was 52% (14/27), and the sequencing success rate was 100% (14/14). All the identification results (matched species, maximum similarity, length, and maximum score) are shown in Table 1. The maximum similarity range of ITS2 was 83-100%, and that of psbA-trnH was 89-100%. In this study, the ITS2 amplification of three samples and the psbA-trnH amplification of 13 samples were unsuccessful, which might have been due to impure or degraded DNA in the processed medicinal materials. This illustrates a limitation of DNA barcoding that arises from the instability of DNA. The result also reflects the limitations of Sanger sequencing, such as the nonspecific amplification of non-target DNA when the target DNA is degraded (Pawar et al., 2017). As described in the DNA Extraction section, we took measures to avoid these problems. If mixed sequencing signals were present, the PCR products would be cloned and single colonies sequenced to identify the target amplicons.
To overcome the limitations of Sanger sequencing, next-generation sequencing (NGS) can be employed to simultaneously detect plant and fungal DNA (Pawar et al., 2017). It is also possible that the samples were not suitable for ITS2 or psbA-trnH amplification. To tackle this issue, specific primers (Zhao et al., 2018) or other types of barcodes, such as minibarcodes (Song et al., 2017; Liu, 2018) or plastid superbarcodes (Krawczyk et al., 2018), can be used. However, DNA barcoding itself also has limitations. According to our data, seven medicinal species were identified only at the genus level, five exhibited low maximum similarity (<90%), and several could not be matched to any existing medicinal species, presumably due to the low species-level resolution of many plant genera and insufficient information in GenBank. Therefore, analyses with multiple genetic loci (e.g., single-nucleotide polymorphisms, SNPs) and other analytical methods, such as infrared spectroscopy and X-ray diffraction, must be employed to achieve high resolution for species differentiation (Chen et al., 2012).
Seventeen herbs could not be identified at the species level. Besides the limitations of DNA barcoding, one plausible explanation is incomplete database information. In this context, accurate species identification by botanists is key to improving the NCBI database for identifying unknown medicinal species.
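The counts reported in this section can be cross-checked with a short tally. The sketch below only restates the numbers given in the text; the variable and function names are ours and purely illustrative.

```python
# Counts as reported in the text; no new data are introduced here.
SAMPLES = 50
ITS2_AMPLIFIED, ITS2_SEQUENCED = 47, 46
PSBA_ATTEMPTED, PSBA_AMPLIFIED, PSBA_SEQUENCED = 27, 14, 14
SPECIES_LEVEL, GENUS_LEVEL, UNIDENTIFIED = 33, 7, 10


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


rates = {
    "ITS2 amplification": percent(ITS2_AMPLIFIED, SAMPLES),
    "ITS2 sequencing": percent(ITS2_SEQUENCED, ITS2_AMPLIFIED),
    "psbA-trnH amplification": percent(PSBA_AMPLIFIED, PSBA_ATTEMPTED),
    "psbA-trnH sequencing": percent(PSBA_SEQUENCED, PSBA_AMPLIFIED),
}

# Samples not resolved to species = genus-level + unidentifiable samples.
not_species_level = GENUS_LEVEL + UNIDENTIFIED

if __name__ == "__main__":
    for name, value in rates.items():
        print(f"{name}: {value}%")
    print("not identified at species level:", not_species_level)
```

The three outcome categories sum to the 50 barcoded samples, and the genus-level and unidentifiable samples together account for the 17 herbs not resolved to species.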
However, C. odorata and Chhke sreng are related to each other as indicated by several references, presumably owing to the mismatch of local and Latin names of some medicinal species or the adulteration and mis-authentication of C. odorata in Cambodian markets. Thus, DNA barcoding plays a key role in ensuring medicinal safety in Cambodia, and it would be more effective when combined with chemical information.
The ethnopharmacological information (family name, distribution, local name) of 33 plants is shown in Table 2. Among these, there were five Rubiaceae and eight Fabaceae plants. The pictures of five representative medicines are shown in Figure 2. Furthermore, legumes were the most cited plants in studies from Cambodia, Thailand, and Laos (Chassagne et al., 2017).
Importantly, 19 out of 33 medicinal plants were also used as Chinese medicines, and they were Nauclea officinalis, Ficus sagittata, Dalbergia pinnata, Flacourtia indica, Anacardium occidentale, Dalbergia oliveri, Cinnamomum bejolghota,   (Zhan, 2013). Therefore, studies are needed to further characterize these plants of medicinal value. According to the results shown in Table 2, there were 19 species distributed in Guangdong, 17 species in Guangxi, and 15 species in Yunnan. Therefore, the medicinal

Study Area and Materials
In Orussey Market, we interviewed a total of 10 medicine retailers. All sellers were briefed on the purpose and details of the investigation and were informed that the investigation could be terminated at any time as needed. As shown in Tables

DNA Extraction
Approximately 30 mg of each sample was ground for 2 min (40 Hz) using a high-throughput tissue grinder (Ningbo, China). Total genomic DNA was extracted using a plant genomic DNA extraction kit (Tiangen Biotech Co., China). Occasionally, alien DNA sequences from other species, such as fungi and algae, or mixed sequence signals were repeatedly detected if the primers were not specific. To prevent nonspecific PCR amplification, we washed the samples of medicinal materials with 75% alcohol to remove fungi and contaminating powder from other plants.

PCR Amplification and Sequencing
The primers used for amplification and sequencing were as follows: ITS2 (the second ITS) (forward, 5′-GCGATACTTGGTGTGAAT-3′; reverse, 5′-GACGCTTCTCCAGACTACAAT-3′) (Chen et al., 2010) and psbA-trnH intergenic spacer [forward, 5′-GTTATGCATGAACGTAATGCTC-3′ (Sang et al., 1997); reverse, 5′-CGCGCATGGTGGATTCACAATCC-3′ (Tate and Simpson, 2003)]. Primers were synthesized by Shanghai Shenggong Bioengineering Co., Ltd. The 25 µL PCR reaction contained 12.5 µL of 2× Taq PCR Mix, 1.0 µL each of the forward and reverse primers (2.5 µmol L⁻¹), 8.5 µL of double distilled water, and 2.0 µL of the template (genomic DNA < 0.1 ng). The PCR amplification procedure for ITS2 was as follows: denaturation at 94 °C for 5 min, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 56 °C for 30 s, and extension at 72 °C for 45 s. The PCR amplification procedure for psbA-trnH was as follows: denaturation at 95 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 1 min, and extension at 72 °C for 1 min. A final extension was performed at 72 °C for 10 min for both PCR amplification procedures. The PCR was conducted in a thermal cycler (model 2720; Thermo Fisher Scientific). Bidirectional sequencing of the PCR products was performed by Beijing Qingke New Industry Biotechnology Co., Ltd.
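As a quick consistency check on the protocol above, the following sketch sums the reaction components, derives the final primer concentration, and adds up the programmed hold times of the two cycling programs. All names are illustrative, and the time totals cover hold steps only; ramping between temperatures is instrument-dependent and is not included.

```python
# Reaction components in microlitres, as listed in the protocol text.
REACTION_UL = {
    "2x Taq PCR Mix": 12.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "double distilled water": 8.5,
    "template DNA": 2.0,
}
PRIMER_STOCK_UMOL_PER_L = 2.5  # primer concentration stated in the text

# Cycling programs: hold times in seconds (ramp times are not modelled).
ITS2_PROGRAM = {"initial_s": 5 * 60, "cycles": 40,
                "cycle_steps_s": (30, 30, 45), "final_s": 10 * 60}
PSBA_TRNH_PROGRAM = {"initial_s": 4 * 60, "cycles": 35,
                     "cycle_steps_s": (30, 60, 60), "final_s": 10 * 60}


def total_volume_ul(components: dict) -> float:
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def final_primer_conc(stock: float, primer_ul: float, total_ul: float) -> float:
    """Dilution: final concentration = stock * added volume / total volume."""
    return stock * primer_ul / total_ul


def hold_time_s(program: dict) -> int:
    """Initial denaturation + cycles * (steps per cycle) + final extension."""
    per_cycle = sum(program["cycle_steps_s"])
    return program["initial_s"] + program["cycles"] * per_cycle + program["final_s"]


if __name__ == "__main__":
    total = total_volume_ul(REACTION_UL)
    print("reaction volume (uL):", total)
    print("final primer conc (umol/L):",
          final_primer_conc(PRIMER_STOCK_UMOL_PER_L, 1.0, total))
    print("ITS2 hold time (min):", hold_time_s(ITS2_PROGRAM) / 60)
    print("psbA-trnH hold time (min):", hold_time_s(PSBA_TRNH_PROGRAM) / 60)
```

The listed components add up to the stated 25 µL, each primer ends up at 0.1 µmol L⁻¹ in the reaction, and the programmed hold times come to 85 min for ITS2 and 101.5 min for psbA-trnH.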

Data Analysis
CodonCode Aligner V7.0.1 (CodonCode Co., United States) was used to assemble and trim the contigs and to generate the ITS2 and psbA-trnH sequences. The sequences were submitted to the National Center for Biotechnology Information (NCBI) database to search for similar sequences that had been taxonomically validated in the published literature. To identify the species of each medicinal plant, each matched species was searched in the literature in descending order of similarity. The maximum score was used to determine whether the medicinal plant was distributed in Cambodia. If the matched species occurred in Cambodia and its similarity was ≥97% [97% was used as the DNA barcoding identification similarity threshold for medicinal plants (Chen et al., 2012; Gu et al., 2015)], it was accepted as the final identification. If the similarity was <97%, the psbA-trnH sequence was amplified, and final identifications with similarities ≥97% were determined in the same way as for ITS2. If the similarity was between 90 and 97%, the sample was identified at the genus level. In cases of inconsistent ITS2 and psbA-trnH results, we chose the one with the higher similarity.
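The decision rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the names `Hit`, `best_local_hit`, `level`, and `identify` are hypothetical, hits are assumed to carry a literature-derived flag for occurrence in Cambodia, 97% is treated as inclusive for species-level calls, and 90% is assumed to be the inclusive lower bound for genus-level calls (the text reports samples below 90% as having low similarity).

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

SPECIES_THRESHOLD = 97.0  # percent similarity threshold stated in the text
GENUS_THRESHOLD = 90.0    # assumed inclusive lower bound of the genus band


@dataclass
class Hit:
    """One database match (hypothetical structure for illustration)."""
    name: str
    similarity: float   # maximum similarity, in percent
    in_cambodia: bool   # literature check: is the taxon reported from Cambodia?


def best_local_hit(hits: Iterable[Hit]) -> Optional[Hit]:
    """Walk hits in descending similarity; return the first found in Cambodia."""
    for hit in sorted(hits, key=lambda h: h.similarity, reverse=True):
        if hit.in_cambodia:
            return hit
    return None


def level(hit: Optional[Hit]) -> str:
    """Map a hit's similarity to an identification level."""
    if hit is None or hit.similarity < GENUS_THRESHOLD:
        return "unidentified"
    if hit.similarity >= SPECIES_THRESHOLD:
        return "species"
    return "genus"


def identify(its2_hits: Iterable[Hit],
             psba_hits: Iterable[Hit] = ()) -> Tuple[Optional[str], str]:
    """ITS2 first; fall back to psbA-trnH when ITS2 is below the threshold."""
    its2 = best_local_hit(its2_hits)
    if level(its2) == "species":
        return its2.name, "species"
    psba = best_local_hit(psba_hits)
    candidates = [h for h in (its2, psba) if h is not None]
    if not candidates:
        return None, "unidentified"
    # When the two barcodes disagree, keep the result with higher similarity.
    best = max(candidates, key=lambda h: h.similarity)
    return best.name, level(best)
```

For example, with placeholder taxon names, `identify([Hit("Genus alpha", 93.0, True)], [Hit("Genus beta", 98.5, True)])` returns the psbA-trnH result at the species level, whereas an ITS2-only hit at 93% is reported at the genus level.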

Ethics
We plan to work with the National Center of Traditional Medicine (NCTM), Ministry of Health of Cambodia, to publish a handbook of medicinal plants in Cambodia, which will be made available to the Cambodian sellers we interviewed after its publication. Meanwhile, we are also collaborating with NCTM on the Sino-Cambodian International Exchange Project to promote educational and academic communication between China and Cambodia; this project will also provide Cambodia with technical and theoretical support in the identification and marker-assisted selection of medicinal plants.

CONCLUSION
Cambodia has a history of nearly 2,000 years, and for a long time, it has suffered from civil wars and wars of aggression. In light of the extremely poor conditions, the Cambodian people have relied on their own practices to identify medicinal plants to fight diseases. Although there is some information on various medicinal plants, there is no pharmacopoeia and no readily available body of medicinal plant literature, which has hindered the application of medicinal plants and the dissemination of results from investigators of other countries. Therefore, the current study of medicinal plants in Cambodia is incomplete, and there remain many gaps in knowledge. Furthermore, many plants have become endangered due to industrialization and environmental pollution (Chassagne et al., 2016). In light of this, the DNA barcoding technique can provide useful information on the species of various medicinal plants in Cambodia. This will not only preserve plant knowledge in Cambodia but also help develop an ethnopharmacopoeia and provide new insights for the development of new drugs in the future. At the same time, the use of DNA barcoding is one step toward improving the quality control of plants sold for medicinal use in Cambodia, and it will emphasize the importance of protecting endangered species. This approach can also be used in other countries or regions with less developed economies and limited research capacity. Through DNA barcoding, commonly used medicinal plants can be completely characterized.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.