Chemical Characteristics of Platycodon grandiflorum and its Mechanism in Lung Cancer Treatment

Objective: Ultra-performance liquid chromatography-quadrupole time-of-flight tandem mass spectrometry (UPLC-Q-TOF-MS/MS), network pharmacology and molecular docking were used to explore the potential molecular mechanism of Platycodon grandiflorum (PG) in the treatment of lung cancer (LC). Methods: UPLC-Q-TOF-MS/MS was used to analyze the ingredients of PG, and the potential LC targets were obtained from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP), GeneCards and other databases. The interaction network of the drug-disease targets was constructed using STRING 11.0. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were carried out in Metascape, and the “Drug-Ingredients-Targets-Pathways-Disease” (D-I-T-P-D) network was then constructed using Cytoscape v3.7.1. Finally, Discovery Studio 2016 (DS) was used to perform molecular docking. Results: Forty-seven compounds in PG, including triterpenoid saponins, steroidal saponins and flavonoids, were identified, and nine main bioactive components, including platycodin D, were screened. Data mining yielded 545 potential drug targets and 2,664 disease-related targets. Topological analysis revealed 20 core targets, including caspase 3 (CASP3) and prostaglandin-endoperoxide synthase 2 (PTGS2), and enrichment analysis suggested that the signaling pathways potentially involved in the treatment of LC included the MAPK signaling pathway and the PI3K-AKT signaling pathway. Molecular docking indicated that the ingredients bound well to the potential key targets. Conclusion: These results provide novel insight into the mechanism of action of PG against LC.


INTRODUCTION
Lung cancer (LC) is the malignant tumor with the highest morbidity and mortality worldwide, is more common in men than in women, and is known as "the king of cancer" (Shao and Zhang, 2020). At present, surgery, chemotherapy and radiotherapy are the main treatments for LC, but their side effects are numerous and unavoidable, and the clinical prognosis is not ideal. The development of traditional Chinese medicine (TCM) has included the discovery and use of anti-cancer drugs and has therefore attracted increasing attention. TCM adopts syndrome-specific treatments, including TCM combined with chemotherapy and other methods, with fewer toxic side effects. For this reason, TCM prescriptions have achieved good results in clinical practice (Bai et al., 2017; Wang and Li, 2019). Therefore, the research and development of new TCM therapies against LC would be of great significance.
Platycodon grandiflorum (PG) is a plant belonging to the family Campanulaceae, and its dried root is used in TCM to regulate the lung meridian. PG exerts the effects of smoothing the lung, dispelling phlegm and expelling pus, and is a main treatment for sore throat, vomiting due to a purulent carbuncle infection in the lung, hypochondriac pain in the chest and other syndromes (Chinese Pharmacopoeia Commission, 2020). The properties of PG were first recorded in the Shennong Materia Medica Classic. PG mainly grows in Northeast China, Central China and Guangdong, and its components differ between areas. Shandong is one of the authentic production areas of PG. The roots of PG from Shandong are longer and less bifurcated, with a high content of active components (Zhu et al., 2013). According to TCM, PG mainly acts on the lung and its related parts, with antitussive and expectorant effects and a good therapeutic effect on LC (Yim et al., 2016; Deng et al., 2020). However, its mechanism of action in the treatment of LC is not clear.
UPLC-Q-TOF-MS/MS is a high-throughput analytical technology that has developed rapidly over the past decade and is widely used in medicine, drug research and other fields (Jin et al., 2018; Ren et al., 2020). Network pharmacology is an approach based on systems biology that emphasizes the multi-pathway regulation of signaling pathways and is thus in agreement with the multi-component, multi-target characteristics of TCM (Zhang et al., 2019; Ye et al., 2020). Therefore, in this study, the components of PG roots cultivated in Shandong Province were analyzed, and the "Drug-Ingredients-Targets-Pathways-Disease" (D-I-T-P-D) network was then constructed according to the relevant principles and methods of network pharmacology to explore the potential molecular mechanism by which PG treats LC. Our aim was to explore the potential of PG as a source of new drugs and to provide a theoretical basis for its clinical application. The flow chart of the approach used in this study is shown in Figure 1.

Chemicals and Materials
Methanol, acetonitrile, and formic acid used for high performance liquid chromatography (HPLC) were purchased from ACS (Washington D.C., MD, United States). Methanol for herb extraction was purchased from Xilong Scientific Co., Ltd (Guangdong, China). Ultrapure water was obtained from a Milli-QB system (Bedford, MA, United States). PG root pieces (simply defined as PG pieces according to the Chinese Pharmacopoeia 2015 edition, which assumes that PG is the root) were purchased from Jiangxi Jiangzhong Herbal Pieces Co., Ltd (Jiangxi, China; batch number: 181024). The original PG medicinal material was purchased from Yiyuan, Shandong Province, and was identified as the dried root of Platycodon grandiflorum (Jacq.) A. DC. (Campanulaceae) by Professor Fu Xiaomei. PG decoction pieces were processed by Jiangxi Jiangzhong TCM Decoction Co., Ltd. according to the processing method of the Chinese Pharmacopoeia 2015 edition. Next, dried PG pieces were crushed into a 40-mesh powder and stored in the laboratory of the Jiangxi University of TCM.
FIGURE 1 | A comprehensive strategy diagram of the chemical ingredient analysis, target prediction, and network calculation used to investigate the mechanism of action of Platycodon grandiflorum on lung cancer.

Preparation of Standard and Sample Solutions
Ten milligrams of each reference compound (deapio-platycodin D, platycodin D, polygonatoside C1, adenosine, ferulic acid, apigenin, luteolin, chlorogenic acid, caffeic acid, kaempferol, robinin, lobetyolin, rutin, 3-O-β-D-glucopyranosyl platycodigenin and linoleic acid) was weighed and transferred into a 10 ml volumetric flask. Methanol was added to the volumetric mark, and the solution was shaken well, stored at 4°C and used as a stock solution. Then, an appropriate amount of stock solution was transferred into a 5 ml volumetric flask, and methanol was added to the volumetric mark. The solutions were filtered through a 0.22 μm microporous membrane to obtain the standard solutions.
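As a check on the arithmetic implied above, 10 mg made up to 10 ml gives a 1 mg/ml stock, and the volume of stock needed for a 5 ml working standard follows from C1·V1 = C2·V2. The sketch below illustrates this; the 0.05 mg/ml target concentration and the function names are assumptions made for illustration only, since the working concentrations are not stated here.

```python
def stock_concentration(mass_mg: float, flask_volume_ml: float) -> float:
    """Return the stock concentration in mg/ml."""
    if flask_volume_ml <= 0:
        raise ValueError("flask volume must be positive")
    return mass_mg / flask_volume_ml


def stock_volume_needed(c_stock: float, c_target: float, v_final_ml: float) -> float:
    """Solve C1 * V1 = C2 * V2 for V1 (ml of stock to transfer)."""
    if c_target > c_stock:
        raise ValueError("cannot dilute to a concentration above the stock")
    return c_target * v_final_ml / c_stock


# 10 mg of reference compound made up to 10 ml, as described above.
stock = stock_concentration(10.0, 10.0)  # mg/ml

# Hypothetical target (assumption): 0.05 mg/ml in the 5 ml volumetric flask.
v_stock = stock_volume_needed(stock, 0.05, 5.0)  # ml of stock to transfer
print(f"stock = {stock} mg/ml; transfer {v_stock} ml of stock")
```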
A total of 2.0 g PG powder was accurately weighed and transferred into a round bottom flask with 50 ml 50% methanol. The solution was mixed well, incubated for 0.5 h at room temperature, and ultrasonically treated for 30 min using an ultrasonic cleaning instrument (Jiangsu, China). The extracted solution was centrifuged at 14,000 rpm for 15 min at room temperature, and filtered using a 0.22 μm microporous membrane before qualitative analysis.
The Q-TOF-MS/MS parameters were set as follows: ion source gas 1 (GS1) and gas 2 (GS2) were both set at 50 psi, curtain gas (CUR) was set at 40 psi, ion spray voltage floating (ISVF) was set at 5500 V in the positive mode and 4500 V in the negative mode, ion source temperature (TEM) was set at 500°C, collision energy (CE) was set at 60 V, collision energy spread (CES) was set at 15 V, declustering potential (DP) was set at 100 V, and nitrogen was used as the nebulizer and auxiliary gas. Samples were analyzed in both positive and negative ionization modes with a scanning mass-to-charge (m/z) range from 100 to 1,250. Data were collected in the information-dependent acquisition (IDA) mode and analyzed with PeakView® 1.2 software (AB Sciex, Foster City, CA, United States).

Identification of the Ingredients
The chemical PG ingredients were collected from existing databases, such as SciFinder (

Protein-Protein Interaction (PPI) Network Construction
R software was used to intersect PG-related targets and LC-related targets, and the overlapping targets were uploaded into STRING 11.0 (https://string-db.org/). The species was set to "Homo sapiens" and the minimum interaction score was 0.4. The PPI network diagram was obtained and imported into Cytoscape v3.7.1, and CentiScape was used to calculate the degree centrality (DC) in order to filter the core targets of the PPI network.
Metascape (http://www.metascape.org/) is a gene annotation tool that integrates many authoritative databases such as GO, KEGG, UniProt and DrugBank. It supports pathway enrichment analysis and biological process annotation, and performs gene-related protein network analysis and drug analysis, providing comprehensive and detailed information on each gene. Metascape addresses the limitations of DAVID while retaining its advantages, and its data are frequently updated (the last update was August 14, 2019), which helps ensure the timeliness and credibility of the data.
The gene symbols of the core targets were submitted to Metascape with "Homo sapiens" selected for the enrichment analysis. The results were annotated and analyzed using the KEGG database (https://www.kegg.jp/) and PathwayBuilderTool_2.0 software to further explain the role of the core targets in gene function and signaling pathways.
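The target intersection and degree-centrality steps described in this section can be sketched in a few lines. The study used R, STRING and CentiScape for these steps; the Python below is only an illustrative stand-in, the gene lists and edges are toy placeholders rather than the study's data, and DC is taken here to be the raw number of distinct neighbours of a node.

```python
from collections import defaultdict


def overlapping_targets(drug_targets, disease_targets):
    """Return the targets shared by the drug and disease target lists."""
    return set(drug_targets) & set(disease_targets)


def degree_centrality(edges):
    """Count distinct neighbours per node in an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


# Toy placeholder inputs (assumptions, not the study's target lists).
drug_targets = ["AKT1", "TP53", "IL6", "PTGS2", "TOY_A"]
disease_targets = ["AKT1", "TP53", "IL6", "CASP3", "TOY_B"]
common = overlapping_targets(drug_targets, disease_targets)

# Toy interaction edges; a duplicate edge is included on purpose.
toy_edges = [("AKT1", "TP53"), ("AKT1", "IL6"), ("TP53", "AKT1"), ("PTGS2", "IL6")]
# Keep only interactions whose two endpoints are both common targets.
common_edges = [(a, b) for a, b in toy_edges if a in common and b in common]
dc = degree_centrality(common_edges)
print(sorted(common), dc)
```

Storing neighbours in sets means that duplicated or reversed edges are counted once.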

D-I-T-P-D Network Construction
Files describing the "drug-core ingredient," "core ingredient-core target," "pathway-core target," and "disease-core target" relationships were established, and the D-I-T-P-D network was constructed using Cytoscape v3.7.1 to explain the multi-effect synergistic mechanism of PG.
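One possible way to assemble the four relationship files into a single edge table is sketched below. The column names, function names and the tiny example mappings are assumptions for illustration; the ingredient-target pairs echo relationships mentioned in this paper, while the pathway membership shown is a placeholder. A table with source and target columns of this kind can be loaded through Cytoscape's network-from-table import.

```python
import csv
import io


def build_edges(drug, ingredient_targets, pathway_targets, disease):
    """Return (source, target, interaction) tuples for the four edge types."""
    edges = []
    core_targets = set()
    for ingredient, targets in ingredient_targets.items():
        edges.append((drug, ingredient, "drug-ingredient"))
        for target in targets:
            edges.append((ingredient, target, "ingredient-target"))
            core_targets.add(target)
    for pathway, targets in pathway_targets.items():
        for target in targets:
            edges.append((pathway, target, "pathway-target"))
            core_targets.add(target)
    for target in sorted(core_targets):
        edges.append((disease, target, "disease-target"))
    return edges


def edges_to_csv(edges):
    """Serialise the edge list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["source", "target", "interaction"])
    writer.writerows(edges)
    return buffer.getvalue()


# Illustrative placeholder mappings (assumptions, not the full network).
ingredient_targets = {"apigenin": ["CASP3", "AKT1"], "luteolin": ["PTGS2"]}
pathway_targets = {"MAPK signaling pathway": ["AKT1", "CASP3"]}

edges = build_edges("PG", ingredient_targets, pathway_targets, "LC")
nodes = {name for source, target, _ in edges for name in (source, target)}
table = edges_to_csv(edges)
print(len(nodes), "nodes,", len(edges), "edges")
```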

Molecular Docking Evaluation
Discovery Studio 2016 (DS) is a molecular modeling and simulation package widely used in drug design and optimization and in studies of protein structure and function (Zhang and Mao, 2018). The PDB entries of the core targets were identified in the UniProt database, and the X-ray crystal structures of the core targets were downloaded from the RCSB database (https://www.rcsb.org/). Component-target molecular docking was then performed with the LibDock module of DS.

Composition Analysis and Identification
The typical total ion chromatograms of the non-volatile components extracted from PG in the positive and negative ion modes (Figure 2) were analyzed with PeakView® 1.2, and the components were screened with "XIC Manager". The structural formula of each target compound was matched with the secondary fragment ions, and the compounds were further identified according to the matching degree and the rules of bond cleavage. Compounds with a matching degree greater than 80% and consistent with the rules of bond cleavage were retained. Based on the literature (Wang et al., 2017a; 2017b) and the mass spectral data of the reference standards, 47 chemical constituents were identified in the alcohol extract of PG (Table 1). Among them, deapio-platycodin D, platycodin D, polygonatoside C1, adenosine, ferulic acid, apigenin, luteolin, chlorogenic acid, caffeic acid, kaempferol, robinin, lobetyolin, rutin, 3-O-β-D-glucopyranosyl platycodigenin and linoleic acid were confirmed by comparison with the reference standards. Of these 47 constituents, 24 were triterpenoid saponins, three were steroidal saponins, six were flavonoids, three were phenolic acids, four were organic acids and seven were other components.

Triterpene Saponins
Triterpenoid saponins were the most numerous components and the main constituents identified in PG. The main approach to cleave triterpenoid saponins is a continuous intramolecular Figure 3.

Steroidal Saponins
Three steroidal saponins were identified, and their cleavage rules were similar to those of the triterpenoid saponins. The characteristic fragment ions were obtained by cleavage of their sugar chains. Using polygonatoside C1 as an example for the analysis, the molecular ion peak m/z 869.4540 was

Flavonoids
It is already known that flavonoids are main components of PG (Wang et al., 2017b; Deng et al., 2020). The components identified were flavonoids and flavonoid glycosides; the characteristic cleavage mode of the flavonoids was retro-Diels-Alder (RDA) cleavage, while the glycosides underwent glycosyl cleavage. Using luteolin as an example for the analysis, the molecular ion peak m/z 285.0405 was produced in the negative ion mode, and the retention time Figure 6.

Organic Acids
The organic acid components identified in this work were mainly long-chain carboxylic acids with multiple unsaturated bonds. The characteristic cleavage pathways involved McLafferty rearrangement, α-cleavage and neutral loss. Using sanleng acid as an example for the analysis, it generated a molecular ion peak m/

Screening of the Active Ingredients
The absorption-distribution-metabolism-excretion-toxicity (ADMET) properties of the chemical constituents were predicted with DS, and apigenin, caffeic acid, kaempferol, linoleic acid, methyl linolenic acid, and ferulic acid were selected. To select the active components of PG more comprehensively, some components that did not meet the DS screening criteria were also retained. For example, luteolin and robinin did not meet the DS screening criteria, but their TCMSP entries showed oral bioavailability greater than 30% and drug-likeness greater than 0.18, so they were considered active ingredients. Recent studies showed that platycodin D is one of the main components of PG and has a certain preventive and therapeutic effect on LC (Zhao et al., 2015; Zhang, 2016b; Deng et al., 2020); thus, platycodin D was also retained as an active ingredient. In total, nine ingredients were selected as the active components of PG, and their detailed characteristics are listed in Table 2.
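The TCMSP-based part of this screen (oral bioavailability greater than 30% and drug-likeness greater than 0.18) amounts to a simple two-threshold filter, sketched below. The compound names and property values are invented placeholders, not values taken from TCMSP or from Table 2, and the DS ADMET step is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    ob_percent: float      # oral bioavailability, in percent
    drug_likeness: float   # drug-likeness index


def passes_tcmsp_screen(compound, ob_min=30.0, dl_min=0.18):
    """Apply the 'greater than' thresholds described in the text."""
    return compound.ob_percent > ob_min and compound.drug_likeness > dl_min


# Placeholder compounds with made-up values (assumptions for illustration).
candidates = [
    Compound("compound_A", ob_percent=36.0, drug_likeness=0.25),
    Compound("compound_B", ob_percent=45.0, drug_likeness=0.10),
    Compound("compound_C", ob_percent=12.0, drug_likeness=0.40),
]

selected = [c.name for c in candidates if passes_tcmsp_screen(c)]
print(selected)
```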

PPI Network Analysis
A total of 545 targets of the nine selected active components in PG were obtained using the SwissTargetPrediction, PubChem, TCMSP and PharmMapper databases. A total of 2,664 LC-related targets were collected from the GeneCards and DisGeNET databases using "Lung cancer" as the keyword (Schedules 2, 3). The Venn diagram of drug-disease overlapping targets was obtained by using R to intersect PG-related targets and LC-related targets (Figure 8A), revealing 285 common targets (Schedule 4). The PPI network of the common targets was obtained by submitting the 285 drug-disease common targets to STRING, and the resulting network was imported into Cytoscape for visualization (Figure 8B). No interaction was found between the target S100P and the other targets, whereas the other 284 common targets were connected in the network. The average DC value calculated by CentiScape was 39.775, and 108 potential targets with DC values greater than 39.775 were selected. The higher the DC value, the more important the role of the target in the network. In this study, the targets with the top 20 DC values were selected as the core targets. In Figure 8B, the red dots represent the core targets, the light green dots represent the potential targets, the yellow dots represent the interactive targets, and the connections between nodes represent interactions between two proteins.
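The two-stage selection described above (targets with DC above the network mean, then the highest-ranked targets as core targets) can be expressed compactly. The sketch below uses placeholder node names and DC values rather than the study's network, and breaks ties alphabetically, which is an assumption since no tie-breaking rule is given.

```python
from statistics import mean


def select_targets(dc_values, top_n):
    """Return (mean DC, above-mean targets, top-N core targets)."""
    threshold = mean(dc_values.values())
    potential = {t: v for t, v in dc_values.items() if v > threshold}
    # Rank by descending DC; ties are broken alphabetically (assumption).
    ranked = sorted(dc_values, key=lambda t: (-dc_values[t], t))
    return threshold, potential, ranked[:top_n]


# Placeholder DC values (assumptions; the study reports a mean of 39.775).
toy_dc = {"T1": 9, "T2": 7, "T3": 3, "T4": 1}
threshold, potential, core = select_targets(toy_dc, top_n=1)
print(threshold, sorted(potential), core)
```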

GO and KEGG Pathway Enrichment Analysis
The pathway enrichment analysis of the 20 core targets was carried out using Metascape. The enrichment results were selected under the conditions of p < 0.01, minimum count 3, and enrichment factor > 1.5. A total of 849 GO biological functions and 116 KEGG enrichment items were obtained. The GO functions related to the treatment of LC included response to oxidative stress (GO:0006979), positive regulation of cell migration (GO:0030335), regulation of DNA-binding transcription factor activity (GO:0051090), and cytokine-mediated signaling pathway (GO:0019221) (Figure 9A), and were mostly related to apoptosis, oxidative stress and energy metabolism. The 20 core targets were closely related to pathways in cancer (hsa05200), the TNF signaling pathway (hsa04668), the MAPK signaling pathway (hsa04010) and the PI3K-AKT signaling pathway (hsa04151), and were related to diseases such as hepatitis B, colorectal cancer, non-small cell LC and small cell LC (Figure 9B). The first 20 representative signaling pathways are listed in Table 3 and might represent the key pathways in the treatment of LC. These results also suggest that PG can be used in the treatment of LC. The top 10 pathways were retrieved from the KEGG database and annotated using PathwayBuilderTool_2.0 software to integrate the potential pathways of PG in the treatment of LC (Figure 9C). This analysis offers a new research approach for addressing the limitations of current LC treatment.
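To make the three selection criteria concrete, the sketch below computes an enrichment factor and a p-value for one term and then applies the thresholds. It assumes a one-sided cumulative hypergeometric test and defines the enrichment factor as observed over expected overlap, which is a common convention for this kind of analysis but is not spelled out in the text; the gene counts are placeholders.

```python
from math import comb


def hypergeom_pvalue(n_background, n_term, n_list, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric distribution."""
    upper = min(n_list, n_term)
    total = comb(n_background, n_list)
    tail = sum(
        comb(n_term, i) * comb(n_background - n_term, n_list - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_factor(n_background, n_term, n_list, n_overlap):
    """Observed overlap divided by the overlap expected by chance."""
    expected = n_list * n_term / n_background
    return n_overlap / expected


def passes_filter(p_value, count, factor):
    """Thresholds from the text: p < 0.01, count >= 3, factor > 1.5."""
    return p_value < 0.01 and count >= 3 and factor > 1.5


# Placeholder counts (assumptions): 20 core targets, 8 of them in a term
# annotated to 300 of 20,000 background genes.
p = hypergeom_pvalue(20000, 300, 20, 8)
ef = enrichment_factor(20000, 300, 20, 8)
print(passes_filter(p, 8, ef))
```

Working with exact integer binomial coefficients and dividing only at the end avoids the rounding issues of multiplying many small probabilities.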

D-I-T-P-D Network Analysis
In order to explain the complex multi-ingredient, multi-target, multi-pathway mechanism of PG in the treatment of LC, this study selected the top 10 pathways and used Cytoscape v3.7.1 to construct a D-I-T-P-D network, as shown in Figure 10. In this network, the purple node represents the drug, the green nodes represent the core ingredients of PG, the yellow nodes represent the core targets of PG in the treatment of LC, the red nodes represent the pathways, the blue node represents the disease, and the edges represent the interactions among nodes. The network consists of 40 nodes (one drug, eight core components, 20 core targets, 10 pathways and one disease) and a total of 277 edges, showing that the targets of the core ingredients of PG are distributed across different pathways and that the ingredients act in coordination with each other in the body.

Survival Analysis
The relationship between the top five core targets in the PPI network and the prognosis of LC patients was analyzed using the Kaplan-Meier Plotter database. The results revealed that the overall survival of LC patients with high expression of GAPDH, AKT1, TP53, IL6 and MAPK3 was significantly lower than that of patients with low expression (p < 0.01) (Figure 11).
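For readers unfamiliar with the curves behind this comparison, the sketch below shows a minimal Kaplan-Meier estimator for a single group. It is not the Kaplan-Meier Plotter implementation, the follow-up times are toy values, and the significance test used to compare high- and low-expression groups is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored cases leave the risk set
    return curve


# Toy follow-up data (assumptions): the patient at time 2 is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(toy_curve)
```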

Molecular Docking Verification Results
The analysis of the D-I-T-P-D network showed that each core component interacted with multiple targets. Thus, DS was used to dock the core components with the core targets in order to verify these interactions. It is generally accepted that the higher the docking score of a ligand with its receptor, the greater the likelihood of interaction. The docking of luteolin with PTGS2 is shown in Figure 12A; binding might involve hydrogen bonds, van der Waals forces and other interactions. The docking scores were sorted and the highest scores were recorded (Schedule 5), and ImageGP was then used for visualization (Figure 12B). The redder the color, the better the affinity of the component for the target. The relationships between the above-mentioned core components and core targets were consistent with the docking scores. The cluster analysis revealed that GAPDH, VEGFA, EGFR, JUN, CCND1, MAPK8, FOS and CASP3 formed one group, while EP300, AKT1, HSP90AA1, MAPK3, TNF, SIRT1, ESR1, PTGS2, TP53, FN1, MAPK1 and IL6 formed another. According to the molecular docking, SIRT1, PTGS2, CASP3, VEGFA, MAPK8 and GAPDH were the targets with strong binding ability, and linoleic acid, methyl linolenate, luteolin, apigenin and kaempferol were the core components with strong binding ability. The docking scores of these five components with SIRT1 were all greater than 120, ranked in descending order as methyl linolenate, kaempferol, linoleic acid, luteolin and apigenin. This result suggests that these five components have good binding activity with SIRT1 and might play a key role in the treatment of LC with PG.

DISCUSSION
The etiology of LC is complex, and no definitive conclusion has been reached so far. It is related to smoking, air pollution, dietary factors, decreased immune function and genetic factors, while TCM holds that the main cause of LC is the lack of "vital qi", since the "deficiency of vital qi" is "evil". Professor Jia Yingjie claims that the "deficiency of vital qi and coexistence of toxin and blood stasis" is the key to the pathogenesis of LC. PG exerts the effects of smoothing the lung, dispelling phlegm and expelling pus, which is consistent with the pathogenesis of LC. However, the mechanism of PG in the treatment of LC is still unclear because of the multi-ingredient and multi-target characteristics of TCM. Therefore, it is imperative to analyze the ingredients of PG and explore its mechanism in the treatment of LC.
In this study, 47 non-volatile ingredients were identified in the alcohol extract of PG by UPLC-Q-TOF-MS/MS. According to the ADMET parameters and other basic properties of the ingredients, combined with results from the literature, nine main active ingredients, including platycodin D and apigenin, were selected. The results suggested that they play an important role by acting on 285 overlapping targets involved in the treatment of LC. The PPI network showed that the DC values of GAPDH, AKT1 and TP53 were greater than 175, and they were predicted to be the most relevant targets. The GO and KEGG enrichment analyses of the 20 core targets showed that most of the GO functional enrichment results were related to apoptosis, oxidative stress and energy metabolism. Our hypothesis was that the oxidative stress response might be the most important biological process regulated by PG in the treatment of LC. The results of the KEGG pathway enrichment analysis showed that the top 10 pathways included pathways in cancer, the TNF signaling pathway, the MAPK signaling pathway, the PI3K-AKT signaling pathway and the FoxO signaling pathway. Among them, the MAPK, PI3K-AKT, FoxO and HIF-1 signaling pathways were related to cell proliferation and apoptosis, oxidative stress and inflammation. The MAPK signaling pathway is one of the most important pathways regulated by PG in the treatment of LC. Related studies showed that the MAPK signaling pathway participates in the regulation of the cell cycle, apoptosis and proliferation of NSCLC cells, and inhibits the expression of P-glycoprotein (P-gp), multidrug resistance gene 1 (MDR1), TP53 and other proteins, thus inhibiting the growth of NSCLC cells, blocking the cell cycle and inducing apoptosis (Liao et al., 2020; Qiao et al., 2020; Wang et al., 2020).
The PI3K-AKT signaling pathway is considered the primary pathway for cancer cell survival, since it promotes tumor cell proliferation, metastasis and angiogenesis, and inhibits apoptosis. Abnormal activity of this pathway directly leads to the abnormal proliferation of cells (Ma et al., 2014; Sui et al., 2018). Therefore, the activation of the MAPK signaling pathway and the inhibition of the PI3K-AKT signaling pathway are crucial in the treatment of LC.
The D-I-T-P-D network showed that the same target could interact with many ingredients. For example, PTGS2 could bind ferulic acid, luteolin, linoleic acid, apigenin and sophorin, while CASP3 could bind methyl linolenic acid, apigenin, ferulic acid and robinin, indicating that multiple active ingredients in PG might act on the same target. In addition, our results showed that apigenin was related to CASP3, AKT1, FOS, JUN, MAPK8, TNF, VEGFA, and CCND1, and that linoleic acid was related to PTGS2, TP53, ESR1, MAPK1, MAPK8 and IL6, which was consistent with the results obtained by molecular docking and suggests that PG could act on multiple targets through the same active component. This result also illustrates the multi-ingredient, multi-target synergism of PG and provides a basis for understanding the mechanism by which PG treats LC. Based on the GO and KEGG enrichment analyses, our hypothesis was that the effect of PG in the treatment of LC is closely related to cell proliferation and apoptosis. Several core components of PG could interact with PTGS2 and CASP3, potentially promoting cancer cell apoptosis through specific signaling pathways such as the MAPK signaling pathway and the PI3K-AKT signaling pathway.
The survival analysis results showed that the top five core targets (GAPDH, AKT1, TP53, IL6, and MAPK3) in the PPI network were closely related to the survival of LC patients. The overall survival of LC patients with high expression of these genes was significantly lower than that of patients with low expression, suggesting that their overexpression is related to a poor prognosis in LC patients. Thus, they could be used as biomarkers to evaluate LC prognosis.
The results of molecular docking showed that apigenin, luteolin, linoleic acid, kaempferol and methyl linolenate might be the potential active ingredients. Some studies showed that apigenin has a cytotoxic effect on the cisplatin-resistant human LC cell line A549/DDP, effectively inhibiting its growth and reversing its drug resistance (Zhao et al., 2017; Mo et al., 2020). Zhou Liang et al. found that luteolin inhibits the metastasis and proliferation of LC through the down-regulation of the PI3K/AKT signaling pathway and the improvement of immune function (Zhou et al., 2017). In addition, Li Xiaolin et al. demonstrated that luteolin effectively blocks the A549/DDP cell cycle and promotes apoptosis, and that the TP53 protein is involved in this process (Li et al., 2009). Other researchers found that kaempferol weakens the invasion and migration ability of NSCLC A549 cells by inhibiting the expression of estrogen-related receptor α (ERRα), providing strong support for the study of the anti-cancer mechanism of kaempferol. However, Mouradian et al. found that linoleic acid increases the activity of the PI3K/AKT signaling pathway, promotes the proliferation of LC cells, and leads to tumor formation, with GAB1 as the main target of linoleic acid (Mouradian et al., 2014). Apigenin, luteolin, linoleic acid, kaempferol, and methyl linolenate are widely found in plants, but plants that contain all five components are rare. Besides PG, Panax notoginseng (Liu, 2019) and peony seed (Zhang, 2016a) also contain these five components. However, no reports are available so far regarding the anti-LC effects of Panax notoginseng and peony seed. Therefore, our analysis of the effect and mechanism of PG in the treatment of LC is of great significance.

CONCLUSION
PG is one of the most commonly used TCM herbs in clinical practice, since it has a significant therapeutic effect on lung-related diseases and has been used in China for thousands of years. In this study, the chemical constituents of PG were analyzed and identified. A total of nine active PG components and 285 overlapping targets of PG and LC were screened. The PPI network and the GO and KEGG enrichment analyses showed that the mechanism of PG in the treatment of LC might be related to its involvement in cancer cell apoptosis, inflammation and oxidative stress through the MAPK signaling pathway and the PI3K-AKT signaling pathway. The network pharmacology approach used in this study provides another strategy for a comprehensive understanding of the mechanism of PG in the treatment of LC.

DATA AVAILABILITY STATEMENT
All datasets presented in this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
YD integrated the data and wrote the manuscript; ML, YL, and KW performed the literature search; XY and YC completed the ingredient identification; HR and LX collected the prediction targets; HL and HZ completed the network analysis; ZZ improved the manuscript writing; JZ conceptualized and designed the experimental plan.