Uncovering the Complexity Mechanism of Different Formulas Treatment for Rheumatoid Arthritis Based on a Novel Network Pharmacology Model

Traditional Chinese medicine (TCM) with the characteristics of “multi-component-multi-target-multi-pathway” has obvious advantages in the prevention and treatment of complex diseases, especially in the aspects of “treating the same disease with different treatments”. However, there are still some problems such as unclear substance basis and molecular mechanism of the effectiveness of formula. Network pharmacology is a new strategy based on system biology and poly-pharmacology, which could observe the intervention of drugs on disease networks at systematical and comprehensive level, and especially suitable for study of complex TCM systems. Rheumatoid arthritis (RA) is a chronic inflammatory autoimmune disease, causing articular and extra articular dysfunctions among patients, it could lead to irreversible joint damage or disability if left untreated. TCM formulas, Danggui-Sini-decoction (DSD), Guizhi-Fuzi-decoction (GFD), and Huangqi-Guizhi-Wuwu-Decoction (HGWD), et al., have been found successful in controlling RA in clinical applications. Here, a network pharmacology-based approach was established. With this model, key gene network motif with significant (KNMS) of three formulas were predicted, and the molecular mechanism of different formula in the treatment of rheumatoid arthritis (RA) was inferred based on these KNMSs. The results show that the KNMSs predicted by the model kept a high consistency with the corresponding C-T network in coverage of RA pathogenic genes, coverage of functional pathways and cumulative contribution of key nodes, which confirmed the reliability and accuracy of our proposed KNMS prediction strategy. All validated KNMSs of each RA therapy-related formula were employed to decode the mechanisms of different formulas treat the same disease. Finally, the key components in KNMSs of each formula were evaluated by in vitro experiments. Our proposed KNMS prediction and validation strategy provides methodological reference for interpreting the optimization of core components group and inference of molecular mechanism of formula in the treatment of complex diseases in TCM.

Traditional Chinese medicine (TCM) with the characteristics of "multi-component-multitarget-multi-pathway" has obvious advantages in the prevention and treatment of complex diseases, especially in the aspects of "treating the same disease with different treatments". However, there are still some problems such as unclear substance basis and molecular mechanism of the effectiveness of formula. Network pharmacology is a new strategy based on system biology and poly-pharmacology, which could observe the intervention of drugs on disease networks at systematical and comprehensive level, and especially suitable for study of complex TCM systems. Rheumatoid arthritis (RA) is a chronic inflammatory autoimmune disease, causing articular and extra articular dysfunctions among patients, it could lead to irreversible joint damage or disability if left untreated. TCM formulas, Danggui-Sini-decoction (DSD), Guizhi-Fuzi-decoction (GFD), and Huangqi-Guizhi-Wuwu-Decoction (HGWD), et al., have been found successful in controlling RA in clinical applications. Here, a network pharmacology-based approach was established. With this model, key gene network motif with significant (KNMS) of three formulas were predicted, and the molecular mechanism of different formula in the treatment of rheumatoid arthritis (RA) was inferred based on these KNMSs. The results show that the KNMSs predicted by the model kept a high consistency with the corresponding C-T network in coverage of RA pathogenic genes, coverage of functional pathways and cumulative contribution of key nodes, which confirmed the reliability and accuracy of our proposed KNMS prediction strategy. All validated KNMSs of each RA therapy-related formula were employed to decode the mechanisms of different formulas treat the same disease. Finally, the key components in KNMSs of each formula were evaluated by in vitro experiments. Our proposed KNMS prediction and validation INTRODUCTION Rheumatoid Arthritis (RA) is a chronic systemic autoimmune disease with symmetric inflammation of aggressive multiple joints (Sodhi et al., 2015). As the most common inflammatory rheumatic disease, the prevalence of RA is about 0.5%-1.0% in the world (Saraux et al., 2006). The inflammatory cell infiltration of synovium, pannus formation, and the progressive destruction of articular cartilage and bone destruction are the main pathological properties of RA (Brzustewicz and Bryl, 2015).
The data from epidemiological investigations shows that about 90% of RA patients developed bone erosions within 2 years, eventually leading to joint deformities or even disability (Cecilia et al., 2013). Therefore, RA brings great impact on the quality of life of patients and also imposes a heavy burden on families and society.
Traditional Chinese medicine (TCM) has the advantages of definite curative effect, safety and few side effects in the treatment of rheumatoid arthritis and has attracted more and more attention in the prevention and treatment of rheumatoid arthritis. TCM usually treats RA and other complex diseases in the form of formulas, which has theoretical advantages and rich clinical experience. In the study of RA therapy-related formulas, increasing evidence confirmed that different formulas can treat RA, which coincide with the theoretical concept of "treating the same disease with different treatments" in TCM (Fu et al., 2014). Such as Danggui-Sini-decoction (DSD) (Bang et al., 2017), Guizhi-Fuzi-decoction (GFD) (Peng et al., 2013), and Huangqi-Guizhi Wuwu-Decoction (HGWD) (Wang et al., 2010) etc., have been found successful in controlling RA in TCM clinics. Previous pharmacological studies have shown that DSD exert positive effects and good anti-inflammatory function which might protect collagen-induced arthritis rats from bone and cartilage destruction (Cheng et al., 2017). It has been reported that GFD could substantially inhibit the activities of interleukin-6 and tumor necrosis factor-a in the serum of adjuvant-induced arthritis rats, as well as inhibit the formation of synovitis and pannus, and has obvious therapeutic effect on rheumatoid arthritis (He and Gu, 2008;Xia and Song, 2011). In addition, some pharmacological experimental studies have found that HGWD could promote the apoptosis of synovial cells in rheumatoid arthritis rats with abnormal hyperfunction , and reduce the degree of foot swelling in adjuvant arthritis rats, affect the arthritis index of rats, and play a role in treating rheumatoid arthritis (Shi et al., 2006).
Network pharmacology has been widely used in the research of treating the same diseases with different formulas. For example, Gao et al. used network pharmacology to decode the mechanisms of Xiaoyao powder and Kaixin powder in treating depression; Liu et al. clarified the molecular mechanism of Sini San and Suanzaoren Tang in treating insomnia based on network pharmacology, etc (Yao et al., 2018;Liu et al., 2019). With the indepth intersection of systems biology, poly-pharmacology, bioinformatics and other technologies, and the continuous improvement of the accuracy, reliability, and integrity of data resources, the research ideas and technical means of network pharmacology will be better applied to the mechanism research of formulas in TCM and provide more innovation in methodology for the molecular level research of TCM.
In this study, network pharmacology model was applied to analyze the key gene network motif with significant (KNMS) of different formulas in the treatment of RA. Coverage of RA pathogenic genes, coverage of functional pathways and cumulative contribution of key nodes were employed to evaluate the accuracy and reliability of KNMSs, and then the validated KNMSs were used to infer the common potential mechanism of different formulas in the treatment of RA. In summary, the proposed network pharmacology strategy aims to identify major mechanism and related pharmacological effects of different treatments in treating RA through specific KNMSs, which may offer a new network-based method for evaluating and selecting suitable treatment strategies of complex diseases in TCM.

Flowchart
This phenomenon that different formulas treat the same diseases is widely used in TCM clinical applications. However, there is lack of systematic method to decode the mechanisms of treat the same disease with different treatments. In this study, we designed a network pharmacology model to decode the common and specific potential mechanisms of 3 formulas in the treatment of RA, which may provide a methodological reference for different formulas treat the same disease. The workflow is illustrated in Figure 1 and described as follows: 1) the components of DSD, GFD and HGWD were collected from TCMSP, TCMID, and TCM@Taiwan; 2) ADME based methods were used to identify the main active components; 3) the main active components from three formulas and their predicted targets were used to construct the component-target (C-T) networks; 4) The KNMSs were detected from integrated C-T and target-target interaction networks; 5) the KNMSs were validated by the coverage of RA pathogenic genes, coverage of functional pathways and cumulative contribution of key nodes; 6) Finally, all validated KNMSs were employed to decode the underlying mechanism of different formulas treat the same disease.

Component Identification
All chemical components of Danggui-Sini-decoction (DSD), Guizhi-Fuzi-decoction (GFD), and Huangqi-Guizhi Wuwu-Decoction (HGWD) were collected from Traditional Chinese Medicine Systems Pharmacology (TCMSP) Database (Ru et al., 2014) (http://lsp.nwsuaf.edu.cn/tcmsp.php), Traditional Chinese Medicine integrated database (Xue et al., 2013) (TCMID, http:// www.megabionet.org/tcmid/), and TCM@Taiwan (Chen, 2012) (http://tcm.cmu.edu.tw/zh-tw). The chemical identification and concentration of the herbs in DSD, GFD, and HGWD were collected from the previous reports. All chemical structures were prepared and converted into canonical SMILES using Open Babel Toolkit (version 2.4.1). The targets of DSD, GFD, and HGWD were predicted by using Similarity Ensemble Approach SEA (Keiser et al., 2007) (http://sea.bkslab.org/) and Swiss Target Prediction (David et al., 2014) (http://www.swisstargetprediction.ch/). components release from the herbs into the systemic circulation and is an important indicator for evaluating the intrinsic quality of the component (Xu et al., 2012). Drug-like (DL) indicate the characteristics that an ideal drug should have and was a comprehensive reflection of the physical and chemical properties and structural characteristics exhibited by successful drugs (Tao et al., 2013). In this study, active components from DSD, GFD, and HGWD were mainly filtered by integrating Lipinski's rules, oral bioavailability (OB), and drug-likeness (DL). The detail of Lipinski's rules includes molecular weight lower than 500 Da, number of donor hydrogen bonds less than 5, number of acceptor hydrogen bonds less than 10, the logP lower than 5 and over -2, and meets only the criteria of 10 or fewer rotatable bonds. Besides, oral bioavailability (OB), and druglikeness (DL) also were employed to screen the active components. The components with OB values higher than 30% and DL values higher than 0.14 were retained for further investigation .

Networks Construction
The component-target (C-T) networks of three formulas were constructed by using Cytoscape software (Version 3.7.0) (Lopes et al., 2010). The topological parameters of networks were analyzed using Cytoscape plugin NetworkAnalyzer (Jong et al., 2003).

Detection of Key Gene Network Motif With Significant (KNMS)
The exploration of motif structures in networks is an important issue in many domains and disciplines. To find key gene network motifs with significant (KNMS) of 3 formulas in the treatment of RA, a mathematical algorithm was designed and described as follows: To take advantage of the motif structure of the network, m motif codebooks, and one index codebook are used to describe the random walker's movements within and between motifs, respectively. Motif codebook i has one codeword for each node a∈i and one exit codeword. The codeword lengths are derived from the frequencies at which the random walker visits each of the nodes in the motif, p a∈i , and exits the motif, q i↷ . We use p i↻ to denote the sum of these frequencies, the total use of codewords in motif i, and P i to denote the normalized probability distribution. Similarly, the index codebook has codewords for motif entries. The codeword lengths are derived from the set of frequencies at which the random walker enters each motif, q i↶ . We use q ↶ to denote the sum of these frequencies, the total use of codewords to move into motifs, and Q to denote the normalized probability distribution. We want to express average length of codewords from the index codebook and the motif codebooks weighted by their rates of use. Therefore, the map equation is Below we explain the terms of the map equation in detail and we provide examples with Huffman codes for illustration.
L(M) represents the per-step description length for motif partition M. That is, for motif partition M of n nodes into m motifs, the lower bound of the average length of the code describing a step of the random walker.
The rate at which the index codebook is used. The per-step use rate of the index codebook is given by the total probability that the random walker enters any of them motifs. This variable represents the proportion of all codes representing motif names in the codes. Where q i↶ is probability of jumping out of Motif i.
This variable represents the average byte length required to encode motif names. The frequency-weighted average length of codewords in the index codebook. The entropy of the relative rates to use the motif codebooks measures the smallest average codeword length that is theoretically possible. The heights of individual blocks under Index codebook correspond to the relative rates and the codeword lengths approximately correspond to the negative logarithm of the rates in base 2.
This variable represents the coding proportion of all nodes (including jump-out nodes) belonging to motif i in the coding. The rate at which the motif codebook i is used, which is given by the total probability that any node in the motif is visited, plus the probability that the random walker exits the motif and the exit codeword is used.
This variable represents the average byte length required to encode all nodes in motif i. The frequency-weighted average length of codewords in motif codebook i. The entropy of the relative rates at which the random walker exits motif i and visits each node in motif i measures the smallest average codeword length that is theoretically possible. The heights of individual blocks under motif codebooks correspond to the relative rates and the codeword lengths approximately correspond to the negative logarithm of the rates in base 2.

Contribution Coefficient Calculation
The contribution coefficient (CC) represents the network contribution of KNMSs in 3 formulas. R value was used to determine the importance of the components by the following mathematical model: where d c represents the degree of each component, which is calculated by Cytoscape. R is an indicator to evaluate the importance of the component.
Where n is the number of components from different KNMSs of DSD, GFD, and HGWD, respectively; m is the number of components from C-T network of DSD, GFD, and HGWD, respectively; Ri represents the indicator of each component in KNMSs of DSD, GFD, and HGWD, and Rj represents the indicator of each component in C-T network of DSD, GFD, and HGWD.

KEGG Pathway
To analyze the main function of the KNMSs, the pathway data were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Draghici et al., 2007) for KEGG pathway enrichment analyses. P-values were set at 0.05 as the cut-off criterion. The results of analysis were annotated by Pathview (Luo and Brouwer, 2013) in the R Bioconductor package (https://www.bioconductor.org/).

Cell Culture and Treatment
RAW264.7 cells were obtained from the cell bank of the Chinese Academy of Sciences (Shanghai, China). The cells were cultured in DMEM with 10% FBS, and incubated at 37°C under 5% CO 2 . When RAW264.7 cells reached 80% confluency, the cells were treated with isoliquiritigenin, isorhamnetin and quercetin for 2 h, then the cells were treated with LPS (1 mg/ml) for 24 h.

Cell Viability Assay
MTT assay was utilized to measure cell viability. RAW264.7 cells (6×10 4 per/well) were seeded in 96-well plates. After 24 h incubation, RAW264.7 cells were treated with 1, 5, 10, 20, 40, and 80 mM isoliquiritigenin, isorhamnetin and quercetin for 24 h. Ten ml of MTT were added to reach a final concentration of 0.5 mg/ml, and incubated for a further 4 h. The absorbance was measured at 570 nm with a microplate reader (BioTek, USA).

Measurement of NO
Griess reagent was utilized to detected the level of NO in the culture supernatant of RAW264.7 cells. After incubation with isoliquiritigenin, isorhamnetin and quercetin for 2 h and LPS (1 mg/ml) for 24 h, the culture supernatant was collected and mixed with Griess reagent for NO assay. The absorbance was measured at 540 nm using a microplate reader.

Statistical Analysis
To compare the importance of motifs in three formulas, SPSS22.0 was used for statistical analysis. One-way analysis of variance followed by a Dunnett post-hoc test was used to compare more than two groups. Obtained p-values were corrected by Benjamini-Hochberg false discovery rate (FDR). Results were considered as statistically significant if the p-value was <0.05.

Chemical Analysis
Chemical analysis plays important roles in the study of substances basis and mechanism of herbs in the formulas. By searching from the literature, we collected the information on specific chemical identification and concentration of the herbs in DSD, GFD and HGWD, respectively. The detail information was shown in Table 1 and Table S1. The results suggest that chemical components of herbs and the concentration of identified components provide an experiment-aided chemical space for search of active components. This will provide valuable reference for the further analysis.

Active Components in DSD, GFD, and HGWD
By a comprehensive search of the TCMSP, TCMID, and TCM@ Taiwan database, 812 components from seven herbs in DSD, 640 components from five herbs in GFD, and 459 components from five herbs in HGWD were obtained. A TCM formula usually contains large number of components, and ADME screening approaches are always used to select active components. After ADME screening, 124 active components in DSD, 120 active components in GFD, and 48 active components in HGWD were passed the combined filtering criteria which integrated by Lipinski's rule, OB, and DL ( Table 2). For further analysis of these active components, 31 common components in three formulas and 93, 89, and 17 unique components in DSD, GFD, and HGWD were found ( Figure 2). These results indicate that three formulas might exert roles in treating RA by affecting the common components and specific components.

C-T Network Construction
To facilitate analysis of the complex relationships between active components and their targets of three formulas, component-target networks were constructed by using Cytoscape (Figures S1-S3). The results revealed that the DSD network consisted of 124 active components, 846 target proteins, and 3758 interactions; the GFD network contained120 active components, 821 target proteins, and 3759 interactions; the HGWD network consisted of 48 active components, 612 target proteins, and 1373 interactions.
We further analyzed the topology parameters of these C-T networks using NetworkAnalyzer and found that the average degree of components and targets in DSD were 30.31 and 5.20; the average degree of components and targets in GFD were 31.33 and 5.36; the average degree of components and targets in HGWD were 28.6 and 2.43. These results indicate that there exist interactions between one component and multiple targets in three formulas, and also exist phenomenon that different components act on the same target, which is in line with the characteristics of multi-component and multi-target mediated synergistic effect of TCM, and also reflects the complexity of the mechanism of TCM.

KNMSs Predication
These C-T networks are complex and huge. How to quickly extract important information from these complex networks is the key step to decode underlying molecular mechanism. Here, we introduced the infomap algorithm in the network pharmacology model for the first time based on the random walk theory combined with Huffman-encoding. The algorithm performs to optimize the discovery of KNMSs in C-T network heuristically by using a reasonable global metric. 7, 10, and 10 KNMSs were predicted in DSD, GFD, and HGWD, respectively (p value < 0.05) (Figures 3-5). The detail information of network KNMSs were shown in Table S2.

KNMSs Validation
In order to validate whether predicted KNMSs in each formula can represent corresponding full C-T networks in treating RA. Three strategies were used to verify the accuracy and reliability and of KNMSs. The first strategy was used to see whether the number of RA pathogenic genes in KNMSs are close to the number of RA pathogenic genes in CT network. The coverage was defined as the percentage of the number of pathogenic genes in KNMSs to the number of pathogenic genes in C-T network. High coverage indicated that KNMSs could retain most formulatargeted RA pathogenic genes that included in the corresponding C-T network. The second strategy was designed to see whether the gene enrichment pathways in KNMSs covers the gene enrichment pathways in C-T network as much as possible. High coverage indicated that KNMSs could cover most genes enriched pathways of the corresponding C-T network. The third strategy was employed to calculate the percentage of cumulative contribution of important nodes in KNMSs to that of nodes in C-T network. High percentage means KNMSs can retain the

Validated the Number and Coverage of Pathogenic Genes in KNMSs
To assess whether the number of RA pathogenic genes in KNMSs are close to the number of RA pathogenic genes in corresponding C-T network. The known pathogenic genes of RA reported by published literature and databases were collected, and the pathogenic genes confirmed by more than 5 literatures were selected for further analysis (Table S3). We found that the C-T network of DSD, GFD, and HGWD contains 50, 52, and 39 pathogenic genes, respectively. While the KNMSs of DSD, GFD, and HGWD contains 39, 40, and 30 pathogenic genes. The number of pathogenic genes in KNMSs compared to that in C-T network of DSD, GFD, and HGWD reached 78%, 76.9%, and 76.9%, which confirmed that the predicted KNMSs with high coverage of pathogenic genes ( Figures 6A-C). These results demonstrate that KNMSs have a high coincidence degree with C-T network in the number and coverage of pathogenic genes, it also indicated that our proposed KNMS detection model can maximize the coincidence degree of pathogenic genes in the C-T network of formulas.

Validated the Genes Enriched Pathways in KNMSs
An additional metric for evaluating the importance of the inferred motifs is determined by their functional coherence, which can be accessed via their related genes enrichment pathways from KEGG (Kanehisa and Goto, 2000). Here, we used this method to detect whether KNMSs found in each formula can represent their full C-T networks at functional level. Our analysis shown that genes enriched pathways of KNMSs in DSD accounts for 85.8% of genes enriched pathways of the full C-T network in DSD; genes enriched pathways of KNMSs in GFD accounts for 86.6% genes enriched pathways of the full C-T network in GFD; genes enriched pathways of KNMSs in HGWD accounts for 81.9% genes enriched pathways of the full C-T network in HGWD ( Figures 7A, B). It was encouraged that the gene enriched pathways involved in KNMSs of 3 formulas are highly compatible with gene enriched pathways of their C-T networks. This result confirmed that KNMSs have a high coincidence degree with C-T network at the gene functional level and also suggested that our proposed KNMS detection model can maximize the retention of functional pathways in the formulas of TCM.

Validated the Cumulative Contribution of Important Nodes in KNMSs
The degree of nodes is a key topological parameter that characterizes the influence of nodes in a network (Lv et al., 2014). Here, a mathematical model was established to evaluate the importance of KNMSs in each formula based on the degree of nodes. According to the calculation results, each KNMS was assigned a CC value. The detailed information was shown in Figure 8 and Table S4. The sum of CC of 7, 10, and 10 KNMSs in each formula reached 80.44%, 79.88%, and 70.76% of that in C-T networks of DSD, GFD, and HGWD, respectively. The results confirmed that KNMSs have a high coincidence degree with C-T network on the topological structure and also indicated that our proposed KNMS detection model could maximize the coverage of important network topological structures compared with C-T network in each formula.

Potential Mechanisms Analysis of Different Formulas Treats the Same Disease
In order to reveal the potential mechanism of KNMSs in different formula for treating rheumatoid arthritis, pathway enrichment analysis of KNMS-related genes in each formula were performed. In the DSD, genes in total 7 KNMSs were enriched in 165 pathways, genes in two KNMSs, DSD1, and DSD2 were enriched in 158 pathways, accounting for 95.8% of that in 7 KNMSs. The arthritis-related signaling pathways corresponding to DSD1 and DSD2 were partially complementary, for example, genes in DSD1 were mainly enriched in JAK-STAT signaling pathway and AMPK signaling pathway, genes in DSD2 mainly enriched in NF-kappa B signaling pathway, p53 signaling pathway and Wnt signaling pathway. In GFD, we total got 10 KNMSs. Genes in these 10 KNMSs were enriched in 151 pathways. Four of 10 KNMSs, GFD1, GFD3, GFD4, and GFD5 related genes are enriched in 144 pathways, accounting for 95.4% of enrichment pathways in 10 KNMSs of GFD. Moreover, some of their corresponding arthritis-related signaling pathways are complementary. For example, GFD1 mainly includes TNF signaling pathway, IL-17 signaling pathway and AMPK signaling pathway, and GFD3 mainly includes Inflammatory mediator regulation of TRP channels and GnRH signaling pathway. In HGWD, genes in 10 predicted KNMSs were enriched in 110 pathways, genes in HGWD4 and HGWD5, HGWD6, and HGWD8 covered 102 pathways, accounting for 92.7% of all KNMSs gene enrichment pathways. Consistent with DSD and GFD results, some of their corresponding arthritisrelated signal pathways were also complementary. For example, HGWD4 mainly includes TNF signaling pathway, Hedgehog signaling pathway and IL-17 signaling pathway, HGWD5 mainly includes VEGF signaling pathway, NF-kappa B signaling pathway and mTOR signaling pathway, HGWD6 mainly includes cAMP signaling pathway and cGMP-PKG signaling pathway ( Figure 9, Table S5). These results show that KNMSs in different formulas have distinct roles and synergistic effects in the treatment of rheumatoid arthritis. In order to further explore the potential mechanism of the three formulas in treating RA, besides the difference analysis of each KNMS in different formulas, KEGG enrichment analysis of all KNMSs in each formula were also implemented and found that 3 formulas play the therapeutic effect on RA through the following five common pathways: Rap1 signaling pathway, cAMP signaling pathway, MAPK signaling pathway, EGFR Tyrosine Kinase Inhibitor Resistance, Calcium signaling pathway and Neuroactive ligand-receptor interaction. Except the common pathways, we found that the three formulas can play the role of treating RA through their specific pathways ( Figure 10). For example, DSD can play the role of treating RA by regulating VEGF signaling pathway. GFD can play a role in treating RA by regulating HIF-1 signaling pathway. HGWD can play a role in treating RA by regulating PI3K-Akt signaling pathway. These results indicate that 3 formulas can play the role of treating RA through different and common pathways, which may act as the essence of different formulas treat the same disease.
Through PubMed literature search, we found that among the common pathways, MAPK signaling pathway and cAMP signaling pathway have the most correlation records with rheumatoid arthritis. We selected MAPK signaling pathway  and cAMP signaling pathway which were reported closely related to inflammation to illustrate the mechanism of different formulas treat the same disease in detail. Firstly, a comprehensive inflammatory pathway was constructed by integrating the two pathways. And then, the genes in KNMSs of three formulas were mapped to the comprehensive inflammatory pathway (Figure 11). play therapeutic roles through targeting different genes in the comprehensive inflammatory pathway.
NO is a regulator of information transmission between cells and has the function of mediating cellular immune and inflammatory reactions. In order to further evaluate the results obtained by the network pharmacology model, the key components in KNMSs of each formula were selected for experimental validation. Isoliquiritigenin from motif 1 (DSD1) of DSD, isorhamnetin from motif 1 (GFD1) of GFD, and

DISCUSSION
The therapeutic effect of current synthetic agents in treating RA is not satisfactory and most of them have undesirable side effects. In China, some classical formulas have a long history of clinical application to treat RA and have shown significant curative effects. However, TCM formulas is a multi-component and multi-target agent from the molecular perspective (Corson and Crews, 2007;Bo et al., 2013). Based on the characteristics of complex components and unclear targets of TCM formula, the development of novel methods became an urgent issue needed to be solved. TCM network pharmacology emerging recently has become a flourishing field in TCM modern studies along with the rapid progress of bioinformatics (Guo et al., 2017;Gao et al., 2018;Wang et al., 2018). So, using the method, combined with the rich experience of TCM treatment, could hopefully decode the underlying mechanism of TCM formula in the treatment of complex disease with the characteristic of "multi-targets, multicomponent". Network pharmacology approach could help us search for putative active components and targets of herbs based on widely existing databases and shows the network of drugtargets by a visual way (Gao et al., 2016). Moreover, it abstracts the interaction between drugs and target genes into a network model and investigates the effects of drugs on biological networks from a holistic perspective. It can help us to further understand potential action mechanisms of TCM within the context of interactions at the system level. However, in the decode process of complex networks, there are still exist redundancies and noises in current network pharmacology study.
In order to solve this problem, we introduced the infomap algorithm based on huffman encoding and the random walk theory for the first time. The algorithm performs to optimize the discovery of motif in C-T network heuristically by using a reasonable global metric. The results of optimized KNMSs are used to analyze the mechanism of different formulas for the treatment of RA. During this process the contribution coefficient model was used to validate the predicted KNMSs, which confirm the accuracy and reliability of our proposed strategy.
In this study, 230 active components of three formulas were found in total after ADME screening, 31 of these components are common to the three formulas, and 93, 89, and 17 components are specific to each formula. It suggested that the three formulas play therapeutic effect on rheumatoid arthritis through both common and specific components. In order to analyze the key component groups and mechanisms of the three formulas in the treatment of rheumatoid arthritis, we used target prediction tools to predict the targets of active components in different formulas and construct C-T networks. The degree distribution in the C-T network shows that the same components could act on different targets, and different components could also act on the same targets, which fully reflects the multi-component and multitarget complexity of TCM in treating complex diseases.
In order to quickly extract important information from complex C-T networks, motif prediction and validation strategy were used to rapidly discover the KNMSs of different formulas in the treatment of RA by using multidimensional data. More and more evidences show that network motif is an effective method to extract functional units and find core elements in complex networks. Radicchi et al. has confirmed that network motif offers an effective and manageable approach for characterizing rapidly the main functional unit of disease progression (Radicchi et al., 2004). Yang has reported that identifying overlapping motifs is crucial for understanding the structure as well as the function of real-world networks (Yang and Leskovec, 2012). Cai et al. indicate that uncovering motif structures of a complex network can help us to understand how the network play functions (Cai et al., 2014). Utilizing the network motif prediction model, 7, 10, and 10 KNMSs (p<0.05) were predicted in DSD, GFD and HGWD, respectively. Coverage of RA pathogenic genes, coverage of functional pathways and cumulative contribution of key nodes were employed to evaluate the accuracy and reliability of KNMSs. The verification results show that KNMSs has a high coincidence degree with C-T network at the pathogenic genes, gene functional and topological structure level. It suggests that our proposed KNMS detection model can maximize the retention of functional pathways, the coverage of network topological structure and the coincidence degree of pathogenic genes in the formulas of TCM.
Through the analysis of KNMSs gene enrichment pathways in different formulas, we found that the percentage of gene enrichment pathways of different KNMSs is distinct compare to the gene enrichment pathways of all KNMSs in each formula. In DSD, gene enrichment pathways of DSD1, DSD2 account for more than 95% of gene enrichment pathways of 7 KNMSs. In GFD, the gene enrichment pathways of 4 KNMSs, GFD1, GFD3, GFD4, and GFD5 account for 95.4% of gene enrichment pathways of 10 KNMSs. In HGWD, the gene enrichment pathways of 4 KNMSs, HGWD4, HGWD5, HGWD6, and HGWD8 account for 92.7% of gene enrichment pathways of 10 KNMSs. These KNMSs in each formula play different roles by targeting on common and complementary inflammation-related signaling pathways. These complementary inflammatory signaling pathways include: DSD1 specifically related JAK-STAT signaling pathway and AMPK signaling pathway, DSD2 specifically related NF-kappa B signaling pathway, p53 signaling pathway and Wnt signaling pathway. GFD1 specifically related TNF signaling pathway, IL-17 signaling pathway, GFD3 specifically related Inflammatory mediator regulation of TRP channels and GnRH signaling pathway, HGWD4 specifically related TNF signaling pathway, Hedgehog signaling pathway, HGWD5 specifically related VEGF signaling pathway and mTOR signaling pathway, HGWD6 specifically related cAMP signaling pathway and cGMP-PKG signaling pathway. These results indicate that KNMSs in different formulas have distinct roles and synergistic effects in the treatment of rheumatoid arthritis.
In addition to the difference analysis of each KNMS in different formulas, KEGG enrichment analysis of all KNMSs in each formula were also implemented and revealed that 3 formulas exert the therapeutic effect of RA through common pathway, such as MAPK signaling pathway, cAMP signal pathway etc. or specific pathway, such as VEGF signaling pathway, HIF-1 signaling pathway, PI3K-Akt signaling pathway etc. Among them, MAPK signaling pathway plays an important role in the pathological process of RA (Schett and Zwerina, 2008). Its over-activation is closely related to inflammatory hyperplasia of synovial tissue and destruction of articular cartilage tissue. As an inducible transcription factor, MAPK regulates the expression of many genes and has been considered as a promising target for the treatment of RA (Rubbert-Roth, 2012). Studies have shown that collagen-induced arthritis rats administrated with MAPK signal transduction pathway inhibitor have significant differences in inhibiting synovitis, bone destruction and articular cartilage destruction compared with the group without signal pathway inhibitor (Adelheid et al., 2014). cAMP signal pathway is an important signal pathway for peripheral blood lymphocytes of RA patients. The study found that the cAMP level in peripheral blood lymphocytes (PBL) of RA patients increased, and its proliferation response was significantly lower than that of PBL in normal patients. It was also found that the abnormal activation of adenylate cyclase in RA patients was related to the low function of Gi protein (Dai and Wei, 2003). The formation of RA neovascularization depends on the expression of various angiogenic factors, especially VEGF and its receptor in RA (Mi-La et al., 2006). It has been confirmed that VEGF expression is upregulated in synovial macrophages and fibroblasts of RA patients, and VEGF expression is positively correlated with RA disease activity and joint destruction (Kanbe et al., 2015). The articular cavity of RA is anoxic microenvironment. Recent studies have shown that the increased expression of HIF-1 in synovium of RA joint is closely related to the occurrence and development of RA (Xu et al., 2010). PI3K-AKT signal pathway is an important intracellular signal transduction pathway, which is closely related to abnormal apoptosis of RA fibroblast-like synovial cells (RAFLS) (Smith and Walker, 2004). Inhibition of abnormally activated PI3K-AKT signaling pathway or expression of anti-apoptotic molecules can induce apoptosis in RAFLS and have therapeutic effect on RA (Liu and Pope, 2003).
Besides the function analysis, we also analysis and validated the key components in KNMSs of each formula. In DSD, the results suggested that the key component isoliquiritigenin from motif 1 (DSD1) exert effect on treatment of RA possibly through acting on MAPK signaling pathway. Studies have shown that isoliquiritigenin suppresses RANKL-induced osteoclastogenesis and inflammatory bone loss via RANK-TRAF6, MAPK, IkBa/NF-kB, and AP-1 signaling pathways (Zhu et al., 2012). In GFD, the results suggested that the key component isorhamnetin from motif 1 (GFD1) treats RA possibly through acting on TNF signaling pathway. Published reports confirmed that isorhamnetin play intervening roles in the development and progression of RA via anti-inflammatory and anti-oxidative activities. Previous studies have suggested that isorhamnetin attenuates collagen-induced arthritis via modulating the levels of cytokines TNF-a, IL-1b, and IL-6 etc. in the joint tissue homogenate of mice (Wang and Zhong, 2015). In HGWD, the results suggested that the key component quercetin from motif 5 (HGWD5) has therapeutic effect on RA possibly through acting on PI3K-Akt signaling pathway. This also verified by previous studies, which found that the mechanisms responsible for the quercetin-induced apoptosis of FLS from patients with RA are associated with the inhibition of PI3K/AKT pathway activation (Pan et al., 2016). Cellular experiments were applied to prove the reliability of the network pharmacology model through verifying the protective effects of key components in KNMSs of three formulas on the inflammation of mice RAW264.7 cells induced by LPS. In addition, in order to better evaluate the reliability of our proposed network pharmacology model, in vivo study will be conducted in our future research.
To summarize, a network pharmacology-based approach was established to extract core components group and decode the mechanisms of different formulas treat the same disease of TCM. Additionally, our proposed KNMS prediction and validation strategy provides methodological reference for optimization of core components group and interpretation of the molecular mechanism in the treatment of complex diseases using TCM.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

AUTHOR CONTRIBUTIONS
A-PL, D-GG, and LG provided the concept and designed the study. K-XW and YG conducted the analyses. K-XW and D-GG wrote the manuscript. K-XW, YG, CL, YL and B-YZ participated in data analysis. X-MQ, G-HD, and A-PL provided oversight. A-PL, D-GG, and LG contributed to revising and proof-reading the manuscript. All authors contributed to the article and approved the submitted version.