Gene Targets Network Analysis for the Revealing and Guidance of Molecular Driving Mechanism of Lung Cancer

The objective was to explore the functions of genes differentially expressed between lung cancer tissues and normal tissues, together with the interactions between the proteins they encode, in order to identify important genes closely related to lung cancer. A total of 120 samples from the GEO database (two groups: 60 lung cancer in situ specimens and 60 normal specimens) were taken as the research objects and subjected to signaling pathway analysis, biological function enrichment, and protein interaction analysis to reveal the molecular driving mechanism of lung cancer. Results: A total of 875 differentially expressed genes were obtained, including 291 up-regulated genes and 584 down-regulated genes. The up-regulated genes were mainly involved in biological processes such as protein metabolism, protein hydrolysis, mitosis, and cell division. The down-regulated genes were mainly involved in neutrophil chemotaxis, inflammatory response, immune response, and angiogenesis. The protein expression levels of both the high-expression and the low-expression genes were higher in patients than in the control group. The proteins corresponding to the high-expression genes were highly expressed in the patient group. The proteins corresponding to the low-expression genes were also expressed in the patient group, which showed that although their levels in patients were low, these genes were still target genes related to lung cancer. In conclusion, the molecular driving mechanism in lung cancer was mainly related to protein metabolism, proteolysis, mitosis, and cell division. TOP2A, CCNB1, CCNA2, CDK1, and TTK might be the critical target genes of lung cancer.


INTRODUCTION
Lung cancer is among the malignant tumors with the fastest-growing incidence (Masters et al., 2017). Lung cancer is mainly divided into small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), of which NSCLC accounts for 80% of all lung cancer cases (Shi et al., 2017). Currently, molecular-targeted drug therapy takes molecules that are highly expressed on the cancer cell membrane or within cancer cells as therapeutic targets. By blocking the growth, infiltration, and metastasis of tumor cells and inducing their apoptosis, it limits damage to normal cells and reduces the incidence of adverse drug reactions in patients (Shen et al., 2018; Xu et al., 2018). In the absence of biopsy, the blood samples of patients with lung cancer are the only source of information for analyzing clinically relevant genetic changes, including those in epidermal growth factor receptor (EGFR), Kirsten rat sarcoma viral oncogene (KRAS), v-raf murine sarcoma viral oncogene homolog B1 (BRAF), c-ros oncogene 1 (ROS1), and anaplastic lymphoma kinase (ALK) (Allan-Blitz et al., 2018). As new treatment options emerge, predictive detection of lung cancer has become a research hot spot in the medical field (Horimasu et al., 2017). The diagnosis of lung cancer mainly includes the identification and classification of malignant tumors, molecular tests, and immunohistochemical analysis. Complex diagnostic algorithms have evolved, which call for drugs tailored to individual patients and for investigation and diagnostic strategies based on individual tumors (Zhang et al., 2018). Some studies have reported that KRAS mutations may be targets for preventing and treating KRAS-mutant lung cancer and other tumor diseases (Krasnov et al., 2017).
Studies have shown that the molecular driving mechanisms of lung cancer in different tumor stages are also different, and NKTR may be the target of prevention and treatment of lung cancer diseases (Zhou et al., 2017). Some studies have used the CIBERSORT method to identify and quantify the number of different cells in a tumor sample by reference genes combined with machine learning. Such an approach solves one of the major problems in determining cell types to some extent by using the reference genes (Zins et al., 2018). CIBERSORT is used to estimate the abundance of member cell types in mixed cell population by using gene expression data. It is a tool of bioinformatics analysis method and has important application value in the field of molecular biology.
Bioinformatics uses computers to mine and analyze the large amounts of information held in biological databases, focuses on gene and proteomic analysis, and is widely used in the fields of molecular genetics and genomics. In the field of tumor research, bioinformatics combines suspicious tumor genes with known biological data through the biological network analysis of tumor-related pathways and biological processes, identifies tumor-related functional categories, and excavates tumor networks. It also predicts potential pathogenic proteins and plays an important role in the study of tumor pathogenesis, diagnosis, and treatment. As gene chip technology continues to develop, how to process and analyze the enormous volume of data and extract more useful information has become a hot topic. At present, gene chip technology is mainly used in research on tumor-related gene information, such as screening tumor-related genes, measuring tumor mutation genes, studying tumor gene expression profiles, and diagnosing tumor diseases. In this way, it can be used to explore the extent to which genetic, environmental, and pharmaceutical factors influence the expression of related genes during the occurrence and development of tumors.
The rapid development of high-throughput technologies, such as MeDip-seq, methylated microarrays, and RNA-seq, has provided technical support for the identification of biomarkers for a variety of diseases such as lung cancer, as well as opportunities for the availability of publicly available data sets. By selecting the gene expression dataset of lung cancer, this study innovatively explores the network of lung cancer target genes through gene expression analysis of different databases, thus exploring the molecular driving mechanism of lung cancer and providing reference for clinical molecular drug treatment and nursing guidance of lung cancer.

Data Resource and Processing
A total of 120 samples from the lung cancer mRNA dataset GSE19408 (two groups: 60 lung cancer in situ specimens and 60 normal specimens) were selected from the GEO (Gene Expression Omnibus) database. The open-source software R 3.4.2 was used to preprocess the sample data and perform the differential analysis.
First, the samples were downloaded and the CEL-format files were imported into R. The limma package was then used to compare gene expression between the lung cancer and normal samples. Differentially expressed genes were selected according to the FDR (false discovery rate) and FC (fold change, the ratio of gene expression between the two groups); a gene was retained only if the comparison between the two groups satisfied FDR < 0.01 and |log2 FC| ≥ 1.
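The selection rule above can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study: it assumes that a per-gene log2 fold change and raw p-value have already been computed, and it assumes that the FDR is the Benjamini-Hochberg adjustment. The names `GeneStat`, `benjamini_hochberg`, and `select_degs`, and the placeholder gene symbols, are hypothetical.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    symbol: str      # gene symbol (placeholder names in the example below)
    log2_fc: float   # log2 fold change, tumor vs. normal
    p_value: float   # raw p-value from the differential test


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(stats, fdr_cutoff=0.01, lfc_cutoff=1.0):
    """Split genes passing FDR < cutoff and |log2 FC| >= cutoff into up/down lists."""
    fdr = benjamini_hochberg([s.p_value for s in stats])
    up, down = [], []
    for s, q in zip(stats, fdr):
        if q < fdr_cutoff and abs(s.log2_fc) >= lfc_cutoff:
            (up if s.log2_fc > 0 else down).append(s.symbol)
    return up, down


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    toy = [
        GeneStat("GENE_A", 2.3, 1e-6),
        GeneStat("GENE_B", -1.8, 2e-5),
        GeneStat("GENE_C", 0.4, 1e-4),
        GeneStat("GENE_D", 1.5, 0.2),
    ]
    up_genes, down_genes = select_degs(toy)
    print("up-regulated:", up_genes)
    print("down-regulated:", down_genes)
```

Applying both thresholds together keeps genes whose change is statistically reliable and also large enough to matter; a gene such as `GENE_C` in the toy data passes the FDR filter but is dropped because its fold change is small.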

Signal Pathway Analysis and Biological Function Enrichment
Signaling pathway analysis and biological function enrichment of the screened differentially expressed NSCLC genes were performed using the Functional Annotation Chart tool of the DAVID platform. First, the differentially expressed genes were uploaded to DAVID as a list of gene symbols, with Homo sapiens selected as the species. GO (Gene Ontology) analysis and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis were then performed on both the up-regulated and down-regulated genes (Wang et al., 2020). After the results were obtained, the differentially expressed genes associated with statistically significant results (P ≤ 0.01) were selected.
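The idea behind this kind of enrichment analysis can be sketched as an over-representation test. The Python example below uses a plain hypergeometric upper-tail probability; DAVID itself reports an EASE score, a more conservative modification of the Fisher exact test, so this is an illustration of the principle and not a reimplementation of DAVID. The function names, the term names, and the toy gene sets are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    total = comb(N, n)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def enrich(deg, background, term_to_genes, p_cutoff=0.01):
    """Return (term, hit count, p-value) for terms over-represented in the DEG list."""
    background = set(background)
    deg = set(deg) & background
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        hits = deg & annotated
        if not hits:
            continue
        p = hypergeom_upper_tail(len(hits), len(deg), len(annotated), len(background))
        if p <= p_cutoff:
            results.append((term, len(hits), p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Toy universe of 100 placeholder genes; not real annotation data.
    universe = [f"g{i}" for i in range(100)]
    terms = {
        "TERM_1": universe[0:10],    # 10 annotated genes
        "TERM_2": universe[50:90],   # 40 annotated genes
    }
    deg_list = universe[0:5]         # all five DEGs fall inside TERM_1
    for term, n_hits, p in enrich(deg_list, universe, terms):
        print(term, n_hits, f"{p:.3g}")
```

The P ≤ 0.01 cut-off in `enrich` mirrors the threshold stated in the text. In practice a multiple-testing correction across terms is usually applied as well, which this sketch omits.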

Protein Interaction Analysis
Gene expression data can be used in gene regulatory network analysis to study the differential expression of genes and their targets, and thereby the processes that shape organisms, such as organ formation, embryo development, and disease pathogenesis. Comparing and further analyzing the relationship networks of different cell types or states makes it possible to identify the specific molecular features and functional modules that underlie state transitions. In order to identify key target genes related to lung cancer, this study established a protein interaction network model to explore the regulatory relationships of the differential genes at the protein level. The differentially expressed genes obtained from the DAVID platform were subjected to ID (identifier) conversion and entered into the STRING 9.1 (Search Tool for the Retrieval of Interacting Genes) database to build a protein-protein interaction network diagram of the proteins encoded by the differentially expressed genes. Proteins at the center of the protein-protein interaction network often play a relatively important role in the development of the disease. The selection criterion for the PPI (protein-protein interaction) network analysis was a combined score > 0.4 (medium confidence). The PPI data were then imported into the visualization tool Cytoscape, and an analysis plug-in was used to count the edges of each node in the network, giving the number of protein interactions (Degree). The analysis steps in Cytoscape were as follows: first, import the node attribute file via File -> Import -> Table -> File (node.txt) (note that Table, not Network, is chosen here); then set the format of a simple network diagram under Style; finally, export the file. The data can be exported as a network file, a table file, or a picture file. The picture file options include a variety of image formats as well as PDF, which can be selected in the toolbar.
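The Degree calculation described above amounts to counting, for each protein, the distinct interaction partners that remain after the score filter. A minimal Python sketch follows, assuming the interactions are available as (protein A, protein B, combined score) tuples with the score on a 0 to 1 scale; STRING download files typically store the score on a 0 to 1000 scale, in which case the threshold would be 400. The function names and the placeholder protein identifiers are hypothetical, and the sketch does not reproduce the Cytoscape plug-in itself.

```python
from collections import Counter


def node_degrees(edges, min_score=0.4):
    """Count distinct partners per protein, keeping edges with score > min_score."""
    seen = set()
    degree = Counter()
    for a, b, score in edges:
        if a == b or score <= min_score:
            continue  # skip self-loops and low-confidence edges
        pair = frozenset((a, b))
        if pair in seen:
            continue  # the same undirected edge may be listed in both directions
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return degree


def top_hubs(edges, n=5, min_score=0.4):
    """Return the n proteins with the highest degree as (protein, degree) pairs."""
    return node_degrees(edges, min_score).most_common(n)


if __name__ == "__main__":
    # Placeholder interactions for illustration; not data from the study.
    toy_edges = [
        ("P1", "P2", 0.90),
        ("P1", "P3", 0.70),
        ("P2", "P1", 0.90),   # duplicate of the first edge, reversed
        ("P1", "P4", 0.30),   # below the medium-confidence threshold
        ("P3", "P4", 0.41),
    ]
    for protein, deg in top_hubs(toy_edges, n=3):
        print(protein, deg)
```

Ranking proteins by degree in this way is what singles out hub candidates at the center of the network.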

Western Blotting Detection
(1) Total protein extraction: Cells were taken out; the culture medium was discarded, and the cells were washed with PBS. Then, 70 µL of cell lysate was added to each well. After 5 min, the cell suspension was transferred to an Eppendorf (EP) tube (TIANGEN Biochemical Technology (Beijing) Co., Ltd., China) and shaken once every 5 min for a total of 6 times. The cell suspension was then centrifuged at 4 °C and 1,000 rpm for 15 min. The supernatant was taken for bicinchoninic acid (BCA) protein quantitative determination, and the standard curve was drawn.
(2) Preparation of stacking gel and separation gel: The reagents (purchased from TIANGEN Biochemical Technology (Beijing) Co., Ltd., China) are summarized in Table 1.
Table 1 (fragment): Tetramethylethylenediamine (TEMED), 0.004 × 10^3 in both columns.
(3) Electrophoresis and image development: The glass plates were cleaned thoroughly with distilled water and ethanol, aligned, and clamped vertically on the gel-casting rack. Distilled water was added between the plates to a suitable level, and the device was left to stand for 8 min to check whether the plates leaked. A 10% separation gel was prepared according to the formula in Table 1. After mixing, 6 mL was added to the gap between the glass plates with a pipette; then, 3 mL of isopropanol was added slowly. At 37 °C, once a refraction line appeared between the isopropanol and the separation gel, the separation gel had solidified. Afterward, the isopropanol was poured out, and the device was washed with distilled water three times for later use. After the stacking gel was prepared, 3 mL was added between the glass plates, and the comb was inserted slowly to prevent bubbles.
After the stacking gel had solidified, the glass plate and the plastic replacement plate were clamped in the rack with electrodes; the device was then put into the electrophoresis tank, and the comb was pulled out. Next, 30 µL of the expressed protein supernatant was mixed evenly with 10 µL of 5× loading buffer and boiled for 10 min at 100 °C. Then, 40 µL of the sample was loaded into each well of the electrophoresis gel. Electrophoresis was run at 80 V until the bromophenol blue formed a straight line in the gel, and the voltage was then changed to 120 V. When the bromophenol blue reached the lower edge, the power supply was disconnected, and the membrane transfer was carried out. The membrane transfer process was as follows: the gel was soaked in transfer buffer for 10 min; six pieces of membrane and filter paper were cut to the size of the gel and soaked in transfer buffer for 10 min; the layers were stacked in the order sponge / 3 layers of filter paper / gel / membrane / 3 layers of filter paper / sponge, and bubbles were driven out with a test tube. The transfer tank was then placed in an ice bath, the sandwich was put in, transfer buffer was added, the electrodes were inserted, and the transfer was run at 100 V for 1 h.
After the membrane transfer was completed, the gel image processing system (Unverbindlicher Verkaufspreis, Germany) was used to analyze the target band's molecular weight and net optical density. The relative expression of target protein = target band gray value OD/internal reference gray value OD.
Frontiers in Genetics | www.frontiersin.org

Influence of Patients' Clinical Characteristics on Their Quality of Life
Figure 1 presented the basic clinical characteristics of the 60 patients. It suggested that patients in stage III-IV had more severe symptoms, including nausea, vomiting, insomnia, and peripheral neuropathy, than patients in stage I-II. Patients who had received more than three courses of chemotherapy had more severe nausea, vomiting, insomnia, and peripheral neuropathy than those who had received fewer than three. This indicated that the more courses of chemotherapy a patient received, the greater the side effects on the body. How to achieve the best therapeutic effect for cancer patients with the minimum number of chemotherapy courses is not only a difficult problem in anti-cancer treatment but also a key research direction.

Up-Regulated Gene Signal Analysis Network
The up-regulated gene COL11A1 was taken as an example; the types of its signal transduction molecules were counted (Figure 2). According to Figure 3, the inhibitory conduction signals in normal human tissues were lower than those in lung cancer tissues. In comparison, the activating conduction signals in lung cancer tissues were generally higher than those in normal tissues. This suggested that COL11A1 was involved in the molecular driving mechanism of lung cancer. Next, the types of transferred molecules of COL11A1 were analyzed (Figure 4).
As shown in Figure 4, the types of metastatic molecules of inhibitory COL11A1 were fewer in normal human tissues than in lung cancer tissues. Likewise, the types of metastatic molecules of activating COL11A1 were more numerous in lung cancer tissues than in normal tissues. This showed that COL11A1 was highly metastatic in lung cancer tissues.

PPI Analysis Results
Using the STRING tool, a network of 292 nodes and 1,425 interactions was obtained from the 291 up-regulated genes, and a network of 529 nodes and 1,624 interactions was obtained from the 584 down-regulated genes. After processing with the visualization software, the significant modules of the protein-protein interaction network shown in Figure 7 were obtained, and the genes at the center of the network were selected. The high-expression genes at the center of the network (see Figure 7A) included TOP2A (Degree = 62), CCNB1 (Degree = 57), CCNA2 (Degree = 54), CDK1 (Degree = 55), and TTK (Degree = 51), all of which had a large number of interactions. The low-expression genes at the center of the network (see Figure 7B) included IL6 (Degree = 89), IL1B (Degree = 60), CCL1 (Degree = 58), EDN1 (Degree = 53), and FGF2 (Degree = 51), which also had a large number of interactions. These high-expression and low-expression genes may be key target genes related to lung cancer.

Protein Expressions of High-Expressed and Low-Expressed Genes
Protein expressions of the high-expressed genes CCNB1 and TOP2A were illustrated in Figure 8. Afterward, the expression of the messenger RNA corresponding to the CCNB1 and TOP2A proteins was analyzed, and the results were shown in Figure 9.
As shown in Figures 8 and 9, the messenger RNA expression levels corresponding to the CCNB1 and TOP2A proteins were around 10 in normal humans, whereas they both exceeded 35 in the patient group, indicating that CCNB1 and TOP2A proteins were highly expressed in patients.
Protein expressions of the low-expressed genes IL6 and IL1B were illustrated in Figure 10. Then, the messenger RNA expression of IL6 and IL1B proteins was analyzed (Figure 11).
As shown in Figures 10, 11, protein expressions of the low-expressed genes IL6 and IL1B in patients were low. The messenger RNA expressions corresponding to IL6 and IL1B proteins in the control group were around 5, while they both exceeded 25 in the patient group. This suggested that even though IL6 and IL1B proteins were low-expressed in patients, they were still lung cancer-related target genes.

CONCLUSION
This study attempts to reveal the molecular driving mechanism of lung cancer through signal pathway analysis, biological function enrichment, protein interaction analysis, and gene target network analysis. A total of 875 differentially expressed genes were obtained by analyzing the samples. These genes are mainly involved in biological processes such as protein metabolism, protein hydrolysis, mitosis, and cell division. TOP2A, CCNB1, CCNA2, CDK1, and TTK may be the key target genes of lung cancer. Exploring the changes of various genes and pathways in the pathogenesis of lung cancer provides a reference for the molecular driving mechanism of lung cancer and a theoretical basis for molecular-targeted drug therapy and clinical nursing guidance of lung cancer. However, there are still some shortcomings. The number of up-regulated and down-regulated genes selected is limited, which cannot support analysis of the full molecular network. In later work, the number of up-regulated and down-regulated genes screened will be increased. Research on the molecular driving mechanism of lung cancer is still at a preliminary stage. In subsequent research, TOP2A, which has extensive interactions among the critical target genes identified by screening, will be screened for drug resistance to assist the development of its inhibitors.
FIGURE 10 | Protein expressions of IL6 and IL1B in patients with lung cancer.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found in the GEO (Gene Expression Omnibus) database under accession GSE19408.

AUTHOR CONTRIBUTIONS
RH: writing - original draft and conceptualization. XX: data curation and software. KZ: supervision and resources. YZ: formal analysis. CW: validation. GH: writing, review, editing, and methodology. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by the Natural Science Foundation of Zhejiang Province of China (LY21H160011).