Treating Different Diseases With the Same Method—A Traditional Chinese Medicine Concept Analyzed for Its Biological Basis

Introduction: The fundamental theory of traditional Chinese medicine (TCM) implies that when different diseases share the same pathogenesis, the syndromes of these individual diseases will be the same. “Treating different diseases with the same method” is a TCM principle suggesting that when different diseases show similar pathological changes at different stages of their development, the same method of treatment can be applied. Our study analyzes the concept of “treating different diseases with the same method” from a molecular perspective, in order to clarify its biological basis and to objectively standardize future TCM syndrome research.
Objective: The TCM syndromes Qi deficiency and blood stasis have similar pathogenesis in relation to coronary heart disease (CHD) and stroke. We aim to use big data technology and complex network theory to mine the genes specifically relevant to these TCM syndromes, and to explore the correlation between the biological indicators of CHD and stroke from a scientific perspective.
Methods: We mined the relevant neuroendocrine-immune (NEI) genes by means of gene entity recognition, complex network construction, and network integration and decomposition, in order to categorize relevant syndrome terms and establish a digital dictionary of genes specifically related to individual diseases. We analyzed the biological basis of “treating different diseases with the same method” at the molecular level using the TCMIP v2.0 platform in order to categorize the TCM syndromes most relevant to CHD and stroke.
Results: We found that 46 genes were involved in the TCM syndromes of Qi deficiency and blood stasis in both CHD and stroke. These genes and their molecular mechanisms also appeared to be closely related to inflammatory response, apoptosis, and proliferation.
Conclusion: By using information extraction and complex network technology, we discovered the biological indicators of the TCM syndromes Qi deficiency and blood stasis of CHD and stroke. In the era of big data, our results can provide a new method for researchers of TCM syndrome differentiation, as well as an effective and specific methodology for the standardization of TCM.

Keywords: treating different diseases with the same method, traditional Chinese medicine, information extraction, complex network, TCMIP v2.0

PREFACE
"Treating different diseases with the same method" is an important part of the fundamental theory of TCM, embodying the spirit of "treatment by syndrome differentiation". A similar point of view can be seen in modern medicine, where it is academically accepted that certain disease mechanisms lead to various diseases. For example, insulin resistance exists in obesity, diabetes, dyslipidemia, and other metabolic diseases. Likewise, the inflammatory response mechanism can be found in infections, atherosclerosis, glomerulosclerosis, hypertension, and ischemia-reperfusion of the heart and brain. The knowledge that certain diseases share pathological mechanisms has a significant impact on the study of pathology in both modern medicine and TCM, and each acknowledges the importance of such shared mechanisms from its own perspective. Therefore, the concept of "treating different diseases with the same method" can be seen as a focal point for the integration and unification of TCM and modern medicine (Xuezhong et al., 2004).
In the era of big data, the acknowledgment of TCM theory is closely related to informatics. Standardizing and digitizing TCM diagnosis requires information technology and big data, and information technology is needed to make a breakthrough in the fundamental research of TCM syndrome differentiation and biology. Publishing research results is the common way for any scholar to have their research known and acknowledged, and TCM is no exception. To this day, a large number of research documents related to TCM are stored in WOS, PubMed, and other international medical databases. However, only a small share of these papers and the knowledge they contain can be retrieved by Artificial Intelligence (AI) and algorithms; the rest has to be searched and read individually by humans. It will be of great significance for the medical community to discover the relationship between TCM syndrome differentiation and modern molecular biology. We aim to examine and compare a vast number of medical journals and texts to explore "treating different diseases with the same method" from a micro-level biological perspective by using information technology (including machine learning, text mining, complex networks, etc.).
To begin our research on "treating different diseases with the same method", we chose diseases and syndromes that have a high incidence rate, require complex treatment, and have a major impact on the health of the general population. With the rapid development of the Chinese national economy, people's living conditions and lifestyles have changed dramatically, ultimately resulting in a shift in the most common diseases. The incidence rate of coronary heart disease (CHD) and stroke is increasing. According to the China Health and Family Planning Statistical Yearbook, released by the National Health and Family Planning Commission, stroke and CHD have become the leading causes of death among Chinese citizens. Additionally, a WHO survey points out that the incidence rate of stroke ranks first among diseases in China and is now trending toward younger parts of the population. How to effectively prevent and treat these two diseases has become a key question in the medical field.
Qi deficiency and blood stasis are important syndromes in TCM diagnosis and treatment. These two TCM syndromes are common in CHD, stroke, tumor, hypertension, cerebral infarction, and other multi-system diseases. To treat these diseases, TCM doctors use prescriptions that invigorate Qi and activate blood. For example, Buyang Huanwu Decoction is a commonly used formula for the treatment of the syndromes of Qi deficiency and blood stasis, which in turn has a good effect on CHD and stroke. Treating different diseases with the same method at their respective stages of development is the essence of "treating different diseases with the same method". In this paper, we chose the common mechanism of CHD and stroke as our focal point to discover the biological basis of "treating different diseases with the same method". This paper is divided into four chapters: Chapter 1 introduces research in China and abroad; Chapter 2 introduces the data and methods used in this study; Chapter 3 presents the study results; and Chapter 4 summarizes the research and gives suggestions for future research.

LITERATURE REVIEW
Information Extraction (IE) and Its Applications in the Biomedical Field
IE refers to the process of extracting, integrating and processing relevant information from existing data in order to store the specific information in a new data structure for future use and consultation (Bin, 2002). In IE, the most important task is to discover and extract the relationships between entities. At present, relationship extraction is generally performed with dictionaries, with pattern matching, or with machine learning. The dictionary based approach matches the words in the corpus with those in a professional dictionary in order to identify the words and/or their relationships. By matching the relevant literature against gene dictionaries related to Qi deficiency and blood stasis syndromes of CHD, Zhao (Xing et al., 2015) identified the genes most relevant to these syndromes. Dictionary based relationship extraction is simple and fast, but in the era of big data information is updated so quickly that dictionaries lag behind current knowledge. This method is therefore not suitable for scenarios where the underlying vocabulary changes on a regular basis.
Relation extraction by pattern matching relies on linguistic experts observing and analyzing entities in order to define specific rules that extract the relations between biomedical entities. For example, Ono et al. (2001) created an extraction model for protein-protein interaction (PPI) relationships based on regular expressions, shallow syntactic patterns, and pattern matching. Although methods based on pattern matching can extract the relationships between biomedical entities more accurately, it is difficult for them to identify relationships outside of the defined set of rules. Therefore, precision is often high while recall is low.
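The precision and recall trade-off of rule-based extraction can be illustrated with a minimal sketch. The single rule below, the verb list, and the example sentences are hypothetical and are not the rules of Ono et al. (2001); the sketch only assumes that entities look like capitalized gene or protein symbols.

```python
import re

# Hypothetical hand-written rule: "<A> <verb phrase> <B>", where A and B look
# like capitalized gene/protein symbols. Real systems use many such rules.
PATTERN = re.compile(
    r"\b([A-Z][A-Za-z0-9-]*)\s+"
    r"(interacts with|binds to|activates|inhibits)\s+"
    r"([A-Z][A-Za-z0-9-]*)\b"
)

def extract_relations(sentence):
    """Return (entity, relation, entity) triples matched by the rule."""
    return [(m.group(1), m.group(2), m.group(3)) for m in PATTERN.finditer(sentence)]

# A sentence that fits the rule is extracted exactly.
covered = "TNF activates NFKB1 in endothelial cells."
# The same relation phrased differently falls outside the rule and is missed.
missed = "NFKB1 activity rises after stimulation by TNF."

print(extract_relations(covered))
print(extract_relations(missed))
```

The second sentence expresses a related fact but returns nothing, which is the low-recall behavior described above.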
In machine learning based approaches, the relevant features and parameters are calculated from sample data and then used to build the extraction model. These approaches can be further divided into feature vector (eigenvector) based methods and kernel based methods. The feature vector based method is a supervised machine learning method commonly applied in biomedical entity relationship extraction. It converts the linguistic information of a relation into flat features and maps them into a high-dimensional vector space, so that natural language information is translated into vectors that a classifier can recognize. For example, using support vector machine (SVM) models, Xiao et al. (2005) extracted PPI relationships by applying a combination of lexical, syntactic, and semantic features together with maximum entropy classifiers. Although feature vector based relation extraction is flexible and performs well, the choice of features often directly determines the extent of relation extraction, and the quantity and quality of the labeled corpus usually determine its performance.
In kernel based methods, a kernel function is used to calculate the similarity of two candidate relationship instances in a latent vector space. For example, Bunescu and Mooney (2005a) proposed applying a kernel function method to extract protein relations, isolating PPI relationships by extracting the dependency paths between specific proteins. The kernel based method performs better because it has more flexibility to extract correlating information between entities. However, the definition of the kernel function has a direct impact on the result of relation extraction. Both families of machine learning based methods share a common problem: because they require a large quantity of labeled corpora, their quality deteriorates along with the quality of the labels.
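To make the notion of a kernel between two candidate relation instances concrete, the following is a minimal sketch of a normalized bag-of-words kernel. It is a deliberately simplified stand-in, not the dependency-path kernel of Bunescu and Mooney (2005a); the function name and the example sentences are assumptions introduced for illustration.

```python
import math
from collections import Counter

def bow_kernel(instance_a, instance_b):
    """Normalized linear kernel (cosine similarity) over word counts.

    Each instance is the text between/around a candidate entity pair.
    Returns a value in [0, 1]; 0.0 is returned if either instance is empty.
    """
    counts_a = Counter(instance_a.lower().split())
    counts_b = Counter(instance_b.lower().split())
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Two hypothetical candidate instances sharing most of their context words.
similar = bow_kernel("GENE1 directly binds to GENE2", "GENE1 binds to GENE2")
unrelated = bow_kernel("GENE1 directly binds to GENE2", "samples were stored overnight")
print(similar, unrelated)
```

A classifier such as an SVM would consume these pairwise similarities instead of explicit feature vectors, which is why the choice of kernel directly shapes what the model can learn.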

Use of Complex Network in Biomedical Research
Complex network theory is a method based on graph theory and complex system theory. Large numbers of nodes, complex network structures, and the dynamic spatiotemporal evolution of networks are the main characteristics of complex networks. They coincide with the complexity, diversity, and dynamic change that characterize data in informatics. Applying the theory and methods of complex networks to data analysis helps analysts pinpoint exactly which problems should be researched and improves the overall efficiency of their analytic tasks.
Complex network theory is not only applied to climate, politics, economy, society, military, and management, but is also widely used in biomedical research. At present, complex network research in the biomedical field mainly focuses on locating disease genes, recognizing disease-related sub-networks, network-based disease control, and drug target prediction. For example, Lim et al. (2006) used pathway information to discover pathogenic proteins. Deng et al. (2004) proposed a drug target prediction method based on network topology. Artzy-Randrup et al. (2004) constructed the PIN sub-network of Huntington's disease (HTT) and by these means discovered a new HTT pathogenic gene. Regarding TCM research, Li (Li et al., 2007) applied text mining to show that "disease" and "syndrome" have the possibility of a "disease syndrome combination" at the level of biological networks. Zhou et al. (Xuezhong et al., 2004) processed and integrated the database of TCM literature to find the gene related to kidney Yang deficiency, and thereby provided an effective way to understand the function of this specific gene and its significance for the TCM syndrome. From the perspective of "interaction network function", Li (Shao, 2007) explored and established a method for applying computational systems biology to TCM. Through the excavation and study of the TCM syndromes of "cold and heat", diseases, and prescriptions, he discovered that the biological network of somatic angiogenesis is closely related to the TCM concept of "collateral disease". Ding et al. (Fan et al., 2014) used network analysis to study Buyang Huanwu Decoction from three aspects (chemical composition, target tissue, and related diseases) to discover its potential in cancer treatment.
Since the 1970s, modern medicine has recognized the concept and interconnection of the neuro-endocrine-immune (NEI) systems. In 1973, Isaković and Janković (1973) first proposed the relationship between these three systems by injuring different parts of rat brains with bilateral symmetrical electrolysis and then observing the effects on lymphoid organs. The discovery and acknowledgment of NEI coincide with the holistic concept advocated by TCM for thousands of years and, to a certain extent, validate the theory of TCM, which has been developed through millennia of clinical practice. In 1990, Wang (Lan and Yusheng, 1990) first drew parallels between the NEI network and TCM theory. Afterward, more scholars researched the connection between TCM theory and the NEI network, such as the study of "cold and heat" syndromes (Ma et al., 2010) and the study of the therapeutic mechanism of acupuncture (Ding et al., 2013). The key to "treating different diseases with the same method" is that although the diseases are different, the syndromes can be the same. By using the NEI network to explore the correlation between the biological indicators of different diseases, we gain a deeper understanding of TCM, which supports its modernization and internationalization.
Through the use of information technology in biomedical science, TCM research has achieved good results, but we still see the need for improvements. The main problems are as follows: 1) Although there are many TCM experiments that use information technology, modern medicine, and molecular biology, most of them lack an effective combination of mathematical modeling with precise result mining. This in turn leaves the biological significance of TCM syndromes and diseases undiscovered. In other words, the research offers no clear indicators of the correlation between the macro-level and micro-level biological perspectives. 2) In current research on biological networks, the constructed networks are large and complex in scale. Due to the complexity of the network indicators, the results are difficult to analyze. There is no known method to compress networks and define the similarity between nodes once the networks are compressed. We propose to apply machine learning, information extraction and complex network technology to construct gene networks based on TCM syndromes, and to mine these networks by combining decomposition and combination. Our hypothesis is that by using these technologies to discover common genetic nodes through calculation, analysis and mining, it will be possible to discover the genes specifically related to the syndromes of Qi deficiency and blood stasis of CHD and stroke.
MeSH (Medical Subject Headings) terms were used to mine the specific NEI genes throughout the PubMed database and to construct the biological network. We also explored the mechanism of the concept "treating different diseases with the same method" from a micro-level biological perspective. This deepens the understanding of the theoretical system of TCM syndrome differentiation and promotes the study of TCM syndromes.

MATERIALS AND METHODS
This article is based on biomedical literature covering the two target syndromes, downloaded from PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi). PubMed is a vast database managed by the United States National Library of Medicine and the National Institutes of Health that provides a free catalog of biomedical papers worldwide. With the increasing internationalization of TCM research, the biomedical literature in PubMed contains a large number of articles related to traditional Chinese medicine (a total of 148,000 articles related to traditional Chinese medicine dated from 2010 to 2020 were found by searching the keywords "Traditional Chinese medicine", "herbal medicine", "aspiration" and "prescriptions"). These articles are valuable for TCM scientific research and innovation, TCM technological innovation, and the understanding of the current situation of TCM clinical practice in and outside of China. The free and easy access to vast biomedical literature related to TCM and modern medicine is why we chose PubMed as our data source. In order to precisely mine the genetic data of the two target syndromes and to improve the accuracy of the data mining, we proposed a data mining model based on the combination of "network decomposition" and "symptom combination" to mine the NEI genetic data in the PubMed database (Xing et al., 2015; Liu et al., 2019). We then combined the results of the two methods to draw a final conclusion. The mining methods are described below.

Network Decomposition
In order to reduce the complexity of network analysis, the network was divided into sub networks according to the network categories and the specific genes and their functions. Each sub network was individually analyzed and the results of each sub network were combined to find the genes related to the target syndromes. See Figure 1.

Data Collection
The terms of the target syndromes were selected according to the Guidance Principle of Clinical Studies of New Drug of Traditional Chinese Medicine (Xiao-yu, 2002). Two TCM experts then weighed the rationality of these terms according to their personal clinical experience to finalize the terms used in this paper. As the chosen TCM terms were all in Chinese and the PubMed database is in English, it was necessary to translate them into English. However, due to the specificity of TCM terminology, many translated terms could not be searched in PubMed. Therefore, it was necessary to convert these TCM terms into corresponding modern medicine terminology. In this paper, the SymMap database was used to finalize the translation of TCM terms into modern medicine terms. The SymMap database (Wu et al., 2018) was developed by Chen Jian-xin, Professor at Beijing University of Chinese Medicine. Through its classification of internal molecular mechanisms and external symptoms, the database charts 1,717 TCM symptoms and maps them to 961 modern medicine symptoms. For example, "red tongue" in TCM diagnosis is translated through the SymMap database into "Glossitis" in western medicine (Figure 2).
The key terms were entered into the PubMed database according to its search format. We searched the MeSH subject terms and downloaded the literature summaries. We then downloaded the documents and converted them into a unified format in order to generate the corpus intended for mining.
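As a rough illustration of how translated symptom terms could be assembled into a PubMed search string, the sketch below combines one disease term with several symptom terms using the PubMed `[MeSH Terms]` field tag. The exact queries used in this study are not given in the text, so the function names, the AND/OR structure, and the example terms other than "Glossitis" are assumptions.

```python
def mesh(term):
    """Wrap a term in quotes and restrict it to the MeSH Terms field."""
    return f'"{term}"[MeSH Terms]'

def build_query(disease, symptoms):
    """Combine one disease term with any of several symptom terms.

    Assumed structure: disease AND (symptom_1 OR symptom_2 OR ...).
    """
    if not symptoms:
        raise ValueError("at least one symptom term is required")
    symptom_clause = " OR ".join(mesh(s) for s in symptoms)
    return f"{mesh(disease)} AND ({symptom_clause})"

# Hypothetical example: "Glossitis" comes from the SymMap example above,
# the remaining terms are placeholders.
query = build_query("Coronary Disease", ["Glossitis", "Fatigue"])
print(query)
```

The resulting string can be pasted into the PubMed search box; whether a given term exists as a MeSH heading has to be checked separately.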

Gene Information Extraction
For the purposes of this paper, the NEI gene entities to be extracted from the literature were relatively well defined and stable. To improve the efficiency and accuracy of entity extraction, we therefore used a dictionary matching method. The downloaded documents were pre-processed and then divided into independent sentences, which were further segmented using Jieba, a Python word segmentation package. After this process, the separated words were matched against the keywords of the NEI gene dictionary. The NEI gene dictionary was downloaded from dbNEI (http://bioinfo.au.tsinghua.edu.cn/dbNEIweb/). A total of 1,435 NEI genes were downloaded and used for comparison (Figure 3).
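The dictionary matching step can be sketched as follows. This is a minimal illustration under stated assumptions: a regular-expression tokenizer stands in for Jieba so that the example has no external dependencies, the three-gene dictionary is a hypothetical stand-in for the 1,435-entry dbNEI dictionary, and the example abstract is invented.

```python
import re

# Hypothetical stand-in for the NEI gene dictionary downloaded from dbNEI.
NEI_GENES = {"TNF", "IL6", "NOS3"}

def split_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def genes_in_sentence(sentence, dictionary):
    """Return dictionary genes found in the sentence, in order of first mention."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)  # stand-in for Jieba segmentation
    found = []
    for token in tokens:
        if token in dictionary and token not in found:
            found.append(token)
    return found

def extract_genes(abstract, dictionary=NEI_GENES):
    """Map each sentence of an abstract to the dictionary genes it mentions."""
    return [(s, genes_in_sentence(s, dictionary)) for s in split_sentences(abstract)]

abstract = "TNF and IL6 were elevated in patients. NOS3 expression decreased. No change was observed."
for sentence, genes in extract_genes(abstract):
    print(genes, "<-", sentence)
```

Exact, case-sensitive matching keeps the sketch simple; a production dictionary matcher would also need to handle gene aliases and symbols that collide with ordinary words.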
Gene relationship extraction is used to identify the possible gene relationships in the literature (in this paper, it mainly refers to the gene relationships recognized in the abstracts). By analyzing and identifying the specific genes and their interrelations, we found the relationships between genes and understood their potential usage. Building on previous relationship extraction algorithms, we used the sparse multi-instance learning algorithm (SMIL) (Wu et al., 2018) to extract gene relationships. This approach can be applied to relationship extraction by assessing whether or not a specific group of genes interact, and whether or not these interactions involve identical genes.
In this study, each bag contained multiple relationship instances. If a relationship between two or more genes was observed, the bag was labeled positive. If no relationship between genes was observed, the bag was labeled negative. For example, two genes may be mentioned several times throughout an abstract, but generally speaking a specific relationship between them will only be mentioned once. Therefore, the information regarding the relationship between these two specific genes will be relatively sparse. We decided to use SMIL to extract the genetic relationships in order to avoid overlooking important information in the literature abstracts acquired from PubMed.
In addition to the data bags, a "knowledge base" and "file data" are needed to extract relationships using the SMIL algorithm. "File data" is a data source that contains the relations. The "knowledge base" replaces manual labeling of existing relationships. Therefore, the "knowledge base" must contain exactly the same kind of relationships as those examined for extraction. In bioinformatics there are many specialized databases describing the relationships between genes, so obtaining a domain knowledge base was easy. After a comparative analysis of these gene databases, we decided to build the knowledge base from the STRING database. The STRING database (http://string-db.org/) supports the analysis of protein-protein interactions and includes 5,090 organisms. It contains not only protein-protein interaction data known through validated experiments, but also predictions of unknown protein-protein interactions. We pasted all NEI genes into the STRING database through the "multiple proteins by names/identifiers" option and selected Homo sapiens as the species before beginning the search. The outcome of the protein relationship search was 823 nodes, 7,616 edges and the relationship values of each protein pair. We then downloaded the protein relationship data and used it to construct the knowledge base, which contains the relationships between all NEI genes. The method of constructing the instance bags is shown in Algorithm 1. When two genes appear in the same paper abstract, the whole text is traversed by the algorithm and the two genes are added to a specific bag. If the gene relationship in the bag exists in the STRING knowledge base, the bag is flagged as 1, otherwise as 0. We carried out this task until all the gene relationships in the knowledge base were mapped.
The algorithm used for word bag model generation follows Lamurias et al. (2017).
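As a hedged illustration of the bag construction described above (not the authors' listing), the sketch below groups abstracts by co-occurring gene pair and flags each bag by looking the pair up in a knowledge base. The data structures, the function name `build_bags`, and the example identifiers are assumptions; the knowledge base here is a hypothetical stand-in for the relations downloaded from STRING.

```python
from itertools import combinations

def build_bags(abstract_genes, knowledge_base):
    """Build one bag per co-occurring gene pair.

    abstract_genes: dict mapping an abstract id to the set of genes it mentions.
    knowledge_base: set of frozenset({gene_a, gene_b}) pairs with a known relation.
    Returns a dict: pair -> {"instances": [abstract ids], "label": 0 or 1}.
    """
    bags = {}
    for doc_id, genes in abstract_genes.items():
        # Every unordered pair of genes in the same abstract is one instance.
        for gene_a, gene_b in combinations(sorted(genes), 2):
            pair = frozenset((gene_a, gene_b))
            bag = bags.setdefault(pair, {"instances": [], "label": 0})
            bag["instances"].append(doc_id)
            # Distant supervision: the label comes from the knowledge base,
            # not from manual annotation of the abstract.
            bag["label"] = 1 if pair in knowledge_base else 0
    return bags

# Hypothetical inputs for illustration.
abstracts = {"doc1": {"TNF", "IL6"}, "doc2": {"TNF", "IL6", "NOS3"}}
known_relations = {frozenset(("TNF", "IL6"))}

bags = build_bags(abstracts, known_relations)
for pair, bag in bags.items():
    print(sorted(pair), bag)
```

A positive bag only says that at least one of its instances is expected to express the relation, which is the sparsity that multi-instance learning is meant to handle.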

Network Analysis
The adjacency matrix was constructed from the gene relationships mined by the SMIL method and then converted into a .net file recognized by Pajek. The file was imported into Pajek software to generate a visualized gene network map. Because of the complexity of the genetic network, we applied the method of "decomposition integration": the genetic network was divided into sub networks (community nodes), which were analyzed separately, and the results of each sub network were then recombined to analyze the results of the whole network. Referring to the results of Yi's research (YIDH, 2017), we calculated the community node centrality index, subnet weight index, standardized weight, and node genetic weight index. The formulas use the following symbols: CMI, comprehensive measurement indicator; CD, degree centrality; CB, betweenness centrality; CC, closeness centrality; CE, eigenvector centrality; SW, subnet weight; ZSW, standardized weight of SW; N, number of nodes in the subnet; ZJW, the correlation between the subnet function and the problem to be studied, as scored by experts; GW, gene weight.
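The conversion from an adjacency matrix to a Pajek .net file can be sketched as below. This is a minimal illustration that assumes an undirected, symmetric matrix and writes only the `*Vertices` and `*Edges` sections; the function name, gene labels, and matrix are hypothetical.

```python
def to_pajek(labels, matrix):
    """Serialize an undirected adjacency matrix to Pajek .net text.

    labels: list of node names; matrix: square list of lists with edge weights
    (0 means no edge). Only the upper triangle is read, so each undirected
    edge is written once. Pajek vertex numbers start at 1.
    """
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{name}"' for i, name in enumerate(labels, start=1)]
    lines.append("*Edges")
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if matrix[i][j]:
                lines.append(f"{i + 1} {j + 1} {matrix[i][j]}")
    return "\n".join(lines) + "\n"

# Hypothetical three-gene network: TNF-IL6 and IL6-NOS3 are connected.
labels = ["TNF", "IL6", "NOS3"]
matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
net_text = to_pajek(labels, matrix)
print(net_text)
```

Writing `net_text` to a file with the `.net` extension gives a network that Pajek can open for layout and community analysis.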

Combination Method
The combination method is an analytic method used to combine equations and set aside variables to discover a unified value. We assumed that multiple genes would correspond to each individual symptom (keyword), and that multiple symptoms would in turn correspond to one TCM syndrome. We investigated the most relevant genes (Figure 4) through the research concept of "gene-symptom-syndrome". The keywords of the target syndromes in modern medicine were treated independently. First, we discovered the genes related to each modern medicine symptom, and then matched the data again against the standard of TCM diagnosis. As a final step, we identified the specific genes that existed in every combination. These are the key genes and the result of the method.

Search for Symptom Specific Gene
The search for symptom-specific genes was done using GenCliP software. GenCliP is a literature mining tool that can be used to form lists of genes from keywords. These lists were extracted by the software from the related literature and then manually verified. GenCliP displays the specified genes and keywords mentioned in the literature for manual association verification (Huang et al., 2008). We entered all NEI genes in "Upload Gene List" and entered each target syndrome and its symptoms in the search box "Word Related Gene Search". This was done to explore the NEI genes related to each specific symptom, and to define the symptom and the accompanying gene as "Abstract". The results of these searches included two parts: "Gene" and "Hit". "Gene" refers to the name of the retrieved gene, and "Hit" refers to the number of documents in which the gene and the searched symptom appeared simultaneously. "Hit" can also be understood as the "weight of the gene".
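The "Hit" value described above can be illustrated with a small co-occurrence count. The sketch is not GenCliP's implementation; it only assumes that a hit is a document mentioning both the gene symbol and the symptom phrase, and the function name and example documents are hypothetical.

```python
import re
from collections import Counter

def count_hits(documents, genes, symptom):
    """Count, per gene, the documents that mention both the gene and the symptom.

    documents: iterable of abstract texts; genes: set of gene symbols;
    symptom: phrase matched case-insensitively as a substring.
    """
    hits = Counter()
    needle = symptom.lower()
    for text in documents:
        if needle not in text.lower():
            continue  # symptom absent, so no gene in this document gets a hit
        tokens = set(re.findall(r"[A-Za-z0-9-]+", text))
        for gene in genes & tokens:
            hits[gene] += 1  # each document counts at most once per gene
    return dict(hits)

# Hypothetical mini-corpus.
docs = [
    "TNF is elevated in chest pain.",
    "Chest pain patients showed TNF and IL6 changes.",
    "IL6 in stroke.",
]
print(count_hits(docs, {"TNF", "IL6", "NOS3"}, "chest pain"))
```

Genes with no co-mention, such as the placeholder NOS3 here, simply do not appear in the result, which mirrors symptoms or genes that return no association.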

Symptomatic Combination
In TCM, each disease and syndrome is composed of many symptoms, and these symptoms can be further divided into primary and secondary symptoms. Under the guidance of TCM experts, we combined the individual symptoms according to the TCM diagnostic standards to better reflect the essential characteristics of the diseases in this study. The TCM experts divided the target syndromes into three parts: main syndrome, Qi deficiency syndrome and blood stasis syndrome. According to the TCM diagnostic standard, the experts combined the symptoms of these three parts in different models to extract the gene information of each individual combination. We used this information to plot the intersection of the symptom combinations and to screen out the genes existing in each combination. Finally, combining the method of Li (Li et al., 2007) with the experience of TCM experts, we discovered the core genes of each individual target syndrome. We then compared the genes of the two target syndromes obtained by the "combination" and "decomposition" methods to extract the relevant genes.
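One way to read the symptom combination step is sketched below. It assumes, as an illustration only, that each combination takes one symptom from each of the three parts, that the gene set of a combination is the union of its symptoms' gene sets, and that core genes are those present in every combination. The symptom names, gene sets, and the function name `core_genes` are hypothetical.

```python
from itertools import product

def core_genes(main, qi_deficiency, blood_stasis, symptom_genes):
    """Return genes shared by every main + Qi deficiency + blood stasis combination.

    symptom_genes maps a symptom keyword to the set of genes found for it.
    Assumption: a combination's genes are the union over its three symptoms.
    """
    combination_sets = []
    for combo in product(main, qi_deficiency, blood_stasis):
        genes = set().union(*(symptom_genes.get(s, set()) for s in combo))
        combination_sets.append(genes)
    if not combination_sets:
        return set()
    return set.intersection(*combination_sets)

# Hypothetical symptom-to-gene lookup.
symptom_genes = {
    "chest pain": {"TNF", "IL6"},
    "fatigue": {"IL6", "NOS3"},
    "dyspnea": {"IL6"},
    "cyanosis": {"TNF", "IL6"},
}
core = core_genes(["chest pain"], ["fatigue", "dyspnea"], ["cyanosis"], symptom_genes)
print(sorted(core))
```

Under a stricter reading, the union could be replaced by an intersection within each combination; which reading applies depends on the diagnostic standard used by the experts.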

Results of "Decomposition Method"
We retrieved the relationships between TCM symptoms and modern medicine symptoms based on the SymMap database. Tables 1 and 2 show the English translations of the Chinese symptom keywords based on TCM terminology. We used these keywords for MeSH retrieval in the PubMed database. A total of 85,733 documents related to the syndromes of Qi deficiency and blood stasis of CHD and 96,038 documents related to the Qi deficiency and blood stasis syndromes of stroke were retrieved. We downloaded the documents (including author, title, keywords, abstract, and so forth) and used them to generate the document library in XML format. A total of 84 NEI genes relevant to Qi deficiency and blood stasis syndrome of CHD and 199 NEI genes relevant to Qi deficiency and blood stasis syndrome of stroke were extracted by the mining method described in Gene Information Extraction. Through the relationship recognition method described in Gene Information Extraction, 109 gene pairs relevant to Qi deficiency and blood stasis syndrome of CHD and 313 gene pairs relevant to Qi deficiency and blood stasis syndrome of stroke were identified. The adjacency matrix of gene relationships was constructed and converted into a .net file recognized by Pajek software to generate a visualized gene network diagram (Figure 5).
Four centrality indexes were calculated for the gene nodes of each network, and the comprehensive measurement value of the centrality index was then calculated according to Formula 1. As a final step, each of the two networks was individually divided into sub networks according to the community division algorithm of complex networks. The results showed that the genetic network of Qi deficiency and blood stasis syndromes of CHD was divided into 14 sub networks (left of Figure 6). Among them, there were three large communities with more than 10 nodes, four with 5-10 nodes, and seven small communities with fewer than 5 nodes. A total of 21 sub networks (right of Figure 6) were obtained from the division of the gene network of Qi deficiency and blood stasis syndromes of stroke. Among them, there were nine large communities with more than 10 nodes, three with 5-10 nodes, and nine small communities with fewer than 5 nodes. We then calculated the GW values of the target syndromes (Tables 3 and 4) according to Formula 4.
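The kind of bookkeeping involved in this step can be sketched in a few lines. The example computes degree centrality with its standard definition and groups sub networks by the size classes used above; connected components are used purely as a stand-in for the community division algorithm, which the text does not specify, and the network and function names are hypothetical. The CMI and GW formulas are not reproduced here.

```python
from collections import deque

def degree_centrality(adj):
    """Degree centrality: number of neighbors divided by (n - 1)."""
    n = len(adj)
    return {v: (len(nb) / (n - 1) if n > 1 else 0.0) for v, nb in adj.items()}

def components(adj):
    """Connected components via breadth-first search (stand-in for communities)."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, part = deque([start]), set()
        while queue:
            v = queue.popleft()
            part.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        parts.append(part)
    return parts

def size_class(part):
    """Size classes from the text: more than 10, 5-10, fewer than 5 nodes."""
    n = len(part)
    if n > 10:
        return "large"
    if n >= 5:
        return "medium"
    return "small"

# Hypothetical undirected network; every neighbor must also appear as a key.
adj = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}, "D": {"E"}, "E": {"D"}}
print(degree_centrality(adj))
print([(sorted(p), size_class(p)) for p in components(adj)])
```

Betweenness, closeness, and eigenvector centrality would be computed per node in the same way before being combined into a single index.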

Results of "Combination Method"
FIGURE 5 | Gene network of (left) Qi deficiency and blood stasis in coronary heart disease and (right) Qi deficiency and blood stasis in stroke.

TCM diseases and syndromes are composed of a variety of symptoms, which can be divided into primary and secondary symptoms. Under the guidance of TCM experts, we clarified which symptoms are allocated to our two target syndromes according to the TCM diagnostic criteria, and then combined all target symptoms. We first used GenCliP software to obtain the genes corresponding to each symptom and their respective keywords, following the method described in Combination Method (see Table 5).
Unfortunately, GenCliP returned no genes related to "tongue disorder", "insomnia" and "chest heaviness". These three symptoms could be searched in GenCliP, but since our aim was to analyze the combinations rather than individual symptoms, removing them from the symptom combinations would not affect our subsequent research.

Symptom Combination Results of Qi Deficiency and Blood Stasis Syndromes in CHD
Using the formula combination of "main symptoms" + "Qi deficiency" + "blood stasis" for Qi deficiency and blood stasis in CHD, 10 individual symptom combinations were obtained. Using the same formula combination for Qi deficiency and blood stasis in stroke, six different combinations of symptoms were obtained after deleting the keywords "chest heaviness", "insomnia", and "tongue disorder". After comparing these combinations and removing any data repeated between them, a total of 373 genes related to the syndrome of Qi deficiency and blood stasis of CHD and 804 genes related to the syndrome of Qi deficiency and blood stasis of stroke were obtained.
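Merging the per-combination results and removing repeats amounts to a set union. The following is a minimal sketch, assuming each symptom combination has already been mapped to a list of gene symbols; the combination labels and gene symbols are placeholders, not results of the study.

```python
def merge_gene_lists(genes_by_combination):
    """Return the deduplicated, sorted union of genes over all combinations."""
    merged = set()
    for genes in genes_by_combination.values():
        merged.update(genes)
    return sorted(merged)


# Placeholder input: combination label -> genes returned for that combination.
toy_results = {
    "combination_1": ["GENE_A", "GENE_B", "GENE_C"],
    "combination_2": ["GENE_B", "GENE_D"],
    "combination_3": ["GENE_A", "GENE_D", "GENE_E"],
}
unique_genes = merge_gene_lists(toy_results)
```

Sorting the merged set is only a convenience for reproducible output; the count of unique genes does not depend on it.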

Combined Results of "Decomposition" and "Combination"
The "decomposition method" yielded 84 genes related to the syndrome of Qi deficiency and blood stasis of CHD and 199 genes related to the syndrome of Qi deficiency and blood stasis of stroke; the "combination method" yielded 373 genes related to the syndrome of Qi deficiency and blood stasis of CHD and 804 genes related to the syndrome of Qi deficiency and blood stasis of stroke. Matching the results of the two methods gave a total of 84 genes related to Qi deficiency and blood stasis syndrome of CHD and 161 genes related to Qi deficiency and blood stasis syndrome of stroke.
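If "matching" is read as keeping only the genes found by both methods (an assumption, since the matching rule is not spelled out here), the step is a set intersection. The following is a minimal sketch with placeholder gene symbols that are not drawn from the study.

```python
def match_methods(decomposition_genes, combination_genes):
    """Keep only genes reported by both methods."""
    return set(decomposition_genes) & set(combination_genes)


# Placeholder gene symbols (hypothetical, for illustration only).
decomposition = {"GENE_A", "GENE_B", "GENE_C"}
combination = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
matched = match_methods(decomposition, combination)
```

Under this reading, the matched count can never exceed the smaller input, which agrees with the reported figures: 84 out of 84 and 373 for CHD, and 161 out of 199 and 804 for stroke.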
It can be seen from the above table that genes related to different diseases share certain common pathways: nine shared signaling pathways (AMPK, FoxO, ErbB, HIF-1, Jak-STAT, PI3K-Akt, Rap1, Ras, and T cell receptor) and 30 shared disease pathways (7 tumor-related pathways, 13 infectious disease-related pathways, 6 immune system disease-related pathways, and 4 endocrine disease pathways).

Genes Related to Traditional Chinese Materia Medica
We searched the "TCM target database" of TCMIP v2.0 in order to further study the relevance of traditional Chinese materia medica to the target syndromes. Using the reverse search function of TCMIP v2.0, we identified the medicinal components related to the 46 genes involved in our target syndromes. TCMIP v2.0 (Xu et al., 2019) is an intelligent data platform with a broad spectrum of embedded authoritative algorithms covering the physical and chemical properties of drugs, the evaluation of drug composition, and the prediction of drug targets. In addition, the system can analyze functions and pathways related to drug targets and disease targets. This allows for cross retrieval along the chain traditional Chinese materia medica → formula → ingredients → target gene → function/pathway → disease. The cross search function can uncover the relevance of syndromes and drugs related to specific diseases, as well as rate the medicines' compatibility with target molecular groups. As shown in Figure 8, a target gene is entered into the TCMIP v2.0 "TCM target database". For example, after entering "AR", the database shows the gene name, the Chinese materia medica compounds corresponding to that gene, and their corresponding prescriptions. With this search we found a total of 25 types of traditional Chinese materia medica related to the target syndromes: Arestemona root, pinellia tuber, danshen root, common rush, adhesive rehmannia root, garden burnet root, barbary wolfberry fruit, seaweed extract, puncturevine caltrop fruit, Indian Quassiawood twig and leaf, lotus seed, pyrola herb, European verbena herb, dwarf lilyturf tuber, hogfennel root, ginseng, mulberry twig, common yam rhizome, inula root, longstamen onion bulb, inula flower, yanhusuo, Yao Wang Cha, poppy shell, and milkwort root.
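The reverse search described above can be pictured as inverting a herb-to-target mapping. The following is a minimal sketch with a hypothetical in-memory mapping; it does not use or reproduce the TCMIP v2.0 interface, and all herb, ingredient, and gene names are placeholders.

```python
from collections import defaultdict


def build_reverse_index(herb_to_ingredients, ingredient_to_genes):
    """Invert herb -> ingredient -> gene links into gene -> set of herbs."""
    gene_to_herbs = defaultdict(set)
    for herb, ingredients in herb_to_ingredients.items():
        for ingredient in ingredients:
            for gene in ingredient_to_genes.get(ingredient, ()):
                gene_to_herbs[gene].add(herb)
    return gene_to_herbs


# Hypothetical toy data, for illustration only.
herb_to_ingredients = {
    "herb_1": ["ingredient_a", "ingredient_b"],
    "herb_2": ["ingredient_b", "ingredient_c"],
}
ingredient_to_genes = {
    "ingredient_a": ["GENE_X"],
    "ingredient_b": ["GENE_X", "GENE_Y"],
    "ingredient_c": ["GENE_Z"],
}
index = build_reverse_index(herb_to_ingredients, ingredient_to_genes)
herbs_for_gene_y = sorted(index["GENE_Y"])
```

Building the index once and then looking up each target gene keeps repeated queries cheap, which matters when the same lookup is run for a list of genes.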

DISCUSSION
According to TCM, CHD belongs to the category of chest arthralgia, palpitation, and subjective fatigue. It is located in the heart and blood vessels and is mostly caused by the invasion of external evils, internal injury of emotions, improper diet, stress, and deficiency of viscera. In the development of CHD, Qi deficiency is the main factor, with inseparable bonds to phlegm and blood stasis (Houde, 2017). Yilin Gaicuo states that "deficiency of vital energy (Qi) will lead to Qi not reaching the blood vessels, blood vessels without Qi will accumulate and form blood stasis". The etiology and pathogenesis of cardiovascular disease are mostly related to Qi deficiency and blood stasis (Jianqi and Yu, 2016), and a vast amount of TCM literature points out that Qi deficiency and blood stasis are the two main pathological changes causing CHD. Stroke is located in the brain and, according to TCM, is closely related to the heart, liver, spleen, and kidney. Wind, fire, phlegm, blood stasis, deficiency, and toxin are the main pathogenic factors of stroke (Zhe, 2010). Wang Qingren believed that the root of hemiplegia induced by stroke could be found in the deficiency of vital energy (Qi). Based on this theory, he composed the Buyang Huanwu Decoction for the treatment of Qi deficiency and blood stasis syndromes of stroke (Zhengtai et al., 2017).
Neurotrophic factors can induce, regulate and control the survival, growth, and migration of neurons as well as establish functional connections with other cells by regenerating axons as part of nerve regeneration (Lichtman, 1987). As an important neurotrophic factor, glial cell-derived neurotrophic factor (GDNF) can reduce the area of infarction during the acute stage of stroke (Gunther et al., 2005). The increased expression of brain-derived neurotrophic factor (BDNF) may be relevant to the recovery of neural function and plasticity after cerebral ischemia (Ergul et al., 2012).
By searching the KEGG database, we further confirmed the interrelations of the above-mentioned genes and their pathways. The PI3K-Akt signaling pathway (Figure 9) involves FoxO, ErbB, Ras, and other sub-modules, while the FoxO, ErbB, HIF-1, and Jak-STAT pathways also contain signals of the PI3K-Akt sub-module. The PI3K-Akt signaling pathway is known to promote endothelial regeneration, vasodilation, and platelet adhesion in cardiomyocytes, and hence to improve their survival rate and functionality (PAN et al., 2017). In addition, the PI3K-Akt signaling pathway is also involved in cerebral infarction and other diseases (Zhang Hong and Junjian, 2011).
Our analysis found that the traditional Chinese herbs related to the 46 specific genes include the Qi-replenishing herbs ginseng and yam; the blood-activating herb danshen root; the resuscitation herb milkwort root; and the phlegm-dissolving herbs pinellia tuber, hogfennel root, and inula flower. This indicates that Qi-replenishing, blood-activating, resuscitation, and phlegm-dissolving herbs are closely connected to Qi deficiency and blood stasis syndrome, a finding consistent with TCM theory and clinical practice. Some scholars (Mengna et al., 2017) hold that CHD is caused by Qi deficiency, Yang deficiency, blood deficiency, and Yin deficiency, with Qi deficiency as the common denominator. Qi is the driving force of blood circulation: when Qi flows, blood flows. Weak Qi prevents blood from flowing smoothly, which in turn leads to stagnation and pain. Qi deficiency, phlegm coagulation, and blood stasis constitute the key pathogenesis of CHD and stroke. In the treatment of these diseases, emphasis should be placed on Qi and blood circulation in order to tonify the body and resolve blood stasis. Modern pharmacological research shows that ginseng contains ginsenoside, ginseng polysaccharide, volatile oil, and other components, which can regulate heart function, blood vessels, blood pressure, and the central nervous system. Ginsenoside, the main component of ginseng, has been shown in vivo to protect cardiomyocytes, improve myocardial metabolism, and increase stroke volume (Wang et al., 2015). Ginsenoside Rb1 prevents lipopolysaccharide-induced cardiomyocyte inflammation by inhibiting the PI3K/Akt signaling pathway (Zhiyang et al., 2018). Tanshinone, Danshensu, and other water-soluble components in danshen root can improve myocardial metabolism, increase coronary blood flow, and reduce the incidence of myocardial ischemia and myocardial infarction.
Jiawei Danshen Yin (Ruoxia et al., 2020) can activate the PI3K/Akt signaling pathway through negative regulation of PTEN, which in turn inhibits apoptosis of myocardial cells during ischemia-reperfusion and improves their survival rate during ischemia and hypoxia. This leads to improvements in the heart and reduces the injury that cardiovascular disease causes to it. Pinellia tuber can regulate blood lipid metabolism, leading to improved hemodynamics, reduced blood viscosity, and inhibition of RBC aggregation (Wenyue et al., 2002).
This study preliminarily clarified that the pathological mechanism of Qi deficiency and blood stasis involves cell necrosis, apoptosis, and inflammatory responses, as well as the FoxO, ErbB, HIF-1, Jak-STAT, and PI3K-Akt signaling pathways. We aim to reveal the molecular mechanism through biological research to provide a scientific basis for the TCM theory of "treating different diseases with the same method".

SUMMARY AND PROSPECT
Using Qi deficiency and blood stasis as our fundamental concepts, we analyzed the extensive online literature and big data with the "decomposition" model, the "combination" model, "information extraction," and "complex network" methods, which ultimately led to a discussion of the biological basis of the TCM theory "treating different diseases with the same method". This study provides a bridge between TCM and modern medicine, and also demonstrates the use of information technology to study the TCM term "syndrome".
Using big data and complex networks to analyze the biological basis of the TCM concept "treating different diseases with the same method" is a new approach. Follow-up research will be needed to verify and validate the genes related to the two target syndromes as well as the practicality of our method. We plan to conduct more in-depth research on the relationship between traditional Chinese materia medica, genes, syndromes, and diseases by exploring their common molecular mechanism, in order to gain a deeper biological understanding of the TCM concept "treating different diseases with the same method".

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.