Comprehensive Analysis of Key Genes and Regulatory Elements in Osteosarcoma Affected by Bone Matrix Mineral With Prognostic Values

Osteosarcoma is one of the most common types of bone sarcoma with a poor prognosis. However, genes involved in the mineral metabolism in the microenvironment of the bone affected by osteosarcoma are, to date, largely unknown. A public data series (GSE114237) was used to identify differentially expressed genes (DEGs) between osteosarcoma cells adhering to demineralized osseous surfaces and mineralized osseous surfaces. Functional enrichment analysis of DEGs and hub genes, protein-protein interaction network of DEGs and regulatory network (miRNA-mRNA network and transcription factor (TF)-mRNA network), survival analysis of hub genes was visualized. The prognostic hub genes were considered as candidate genes and their functional predictions were analyzed. A total of 207 DEGs were mainly enriched in extracellular space and thirteen hub genes were mainly enriched in the function of epithelial to mesenchymal transition. However, out of these, only one candidate gene was found to be suitable as a candidate gene. Besides that, 297 miRNAs and 349 TFs interacting with the hub genes were screened. In conclusion, the DEGs, hub genes, miRNAs and TFs screened out in this research could contribute to comprehend the latent mechanisms in osteosarcoma affected by matrix mineral and provide potential research molecular for further study.


INTRODUCTION
Osteosarcoma is one of the most common types of bone sarcoma, accounting for approximately 20% of newly diagnosed bone cancers (Klein and Siegal, 2006;Siegel et al., 2018). The prevalence of osteosarcoma is related to bone growth and is the most malignant tumors in adolescents (Mirabello et al., 2009). Unlike the other tissues, there is a lot of mineral in the microenvironment matrix Abbreviations: DEGs, differentially expressed genes GEO, gene expression omnibus GO, gene ontology KEGG, kyoto encyclopedia of genes and genomes PPI, protein-protein interaction SAGE, serial analysis of gene expression STRING, search tool for the retrieval of interacting genes. of bone tissue, which may play a potential role in the tumorigenesis of osteosarcoma. When mesenchymal stem cells (MSCs) were embedded in a calcified ceramic scaffold, tumoral bone formation was observed in the surroundings scaffold (Rubio et al., 2014). In vitro experiments showed that CA 2+ supplementation can provide bone mineralization (Flammini et al., 2016). Meanwhile, selenium-doped hydroxyapatite nanoparticles could induce apoptosis of tumor cells by an inherent caspase-dependent apoptosis pathway together with the generation of reactive oxygen .Osteosarcoma cell lines have been reported can act as osteoblasts when grown on calcified materials (Zhang et al., 2013). However, the mechanisms of the mineral in the microenvironment affected by osteosarcoma are, to date, still largely unknown.
Bioinformatics, as a new technology, has been widely developed in recent years and plays an important role in revealing the internal mechanism of tumor progression and carcinogenesis at the genome level. Till date, numerous studies have been published on different cancer types, and some researches have revealed the interesting information of the expression of key genes, microRNAs and co-expression modules in osteosarcoma Li et al., 2020) and drug resistance in osteosarcoma patients (Bhuvaneshwar et al., 2019).
Here we analyzed the microarray dataset obtained from osteosarcoma cells adhered to the demineralized osseous surfaces and mineralized osseous surfaces from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs). Subsequently, a protein-protein interaction (PPI) network was constructed to screen out hub genes in the DEGs. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs and the hub genes were used to reveal the underlying mechanisms. Subsequently, we constructed regulatory networks of the hub genes. Furthermore, the hub genes survival analysis was carried out to find some candidate genes and to predict the possible function of these genes. Taken together, a total of 207 DEGs, 297 miRNAs, 349 transcription factors, 13 hub genes, and 1 candidate gene were identified.

Identification of DEGs and Construction of PPI Network
A public series submitted by Wischmann et al. in 2018, GSE114237 (Wischmann et al., 2018, was downloaded from the GEO 1 (Barrett et al., 2013). The researchers had prepared 8 groups of porcine tibia slices; four groups were demineralized whereas the others were left in their native condition. Later, osteosarcoma cells (MG-63 cells) were seeded onto the osseous surfaces for 72 h. All the expression data were analyzed using the R language (version 3.5.1) BIOCONDUCTOR package, and the DEGs were screened out via the LIMMA package. The cutoff of statistical significance was p-values < 0.05 and an absolute value of fold change of gene expression greater than 1 http://www.ncbi.nlm.nih.gov/geo, RRID: SCR_005012 two. The STRING database (online Search Tool for the Retrieval of Interacting Genes, 2 ) (Szklarczyk et al., 2017) was used to predict and construct PPI network with a statistical significance of interaction scores of >0.4 (medium confidence score), and the network was visualized by Cytoscape software (version 3.7.0, RRID: SCR_003032; the National Resource for Network Biology [NRNB], Bethesda, MD, United States) (Smoot et al., 2011). Subsequently, hub genes were selected via MCODE (Molecular Complex Detection, RRID: SCR_015828) (Bandettini et al., 2012), the app of Cytoscape, employing the following criteria: MCODE score >5.5, degree cut-off = 2, score cut-off = 0.2, max depth = 100, and k-score = 2.

Functional and Pathway Enrichment
DAVID 3 (Huang et al., 2009) was used to perform the biological process (BP), cellular component (CC), molecular function (MF) analysis (Gene Ontology Consortium, 2008), and KEGG (Kanehisa and Goto, 2000) pathway enrichment analysis. The DEGs and hub genes were carried out separately with an indicated statistical significance of p-value < 0.05.

Regulatory Network of the Hub Genes
The miRDB 4 (Wong and Wang, 2015) online database was used to determine the potential miRNAs that may bind to the hub genes, which were filtered by a confidence score >60. Additionally, the AnimalTFDB 3.0 5 , was used to further determine the potential transcription factors (TF) that might interact with hub genes and regulate their expression, filtered with Q-values < 0.01. The miRNA-mRNA and TF-mRNA network were visualized via Cytoscape (Smoot et al., 2011).

Survival Analysis of the Hub Genes
All the hub genes were input into the PROGgeneV2 6 database (Goswami and Nakshatri, 2014) separately, and the overall survival plot (Kaplan Meier, KM plot) for each gene was created. According to the median gene expression, the data were divided into high and low expression groups. PROGgeneV2 evaluates the prognostic implications of genes using the SURVIVAL package of the R language for a hypothesis test via publicly available data. The hub genes were considered as candidate genes with a p-value less than 0.05 for further analysis.

Prediction of Functions of the Candidate Genes
The candidate genes were submitted to the GeneMANIA 7 (Warde-Farley et al., 2010) to illustrate the interactive functional association network among the candidate genes and their co-expression genes. The advanced statistical options used were as follows: max resultant genes = 20, max resultant attributes = 10, using the automatically selected weighing FIGURE 1 | Boxplot, volcano plot, protein-protein interaction (PPI) network, and the most significant module of the DEGs are shown. The boxplot demonstrated that the profiles of GSE114237 had a good consistency (A). The volcano plot exhibited the expression of genes between mineralized and demineralized groups with a cut-off criterion of p-value < 0.01 and absolute value logFC (fold change) >1; the red dots represent downregulated genes, the green dots represent upregulated genes, and the gray dots represent unchanged genes (B). The PPI network of the DEGs was constructed and visualized by Cytoscape (C); the round nodes represent upregulated proteins, the diamond nodes represent downregulated proteins, and the edges represent protein-protein associations. Besides, the yellow nodes represent the most significant module obtained from the PPI network (D); the width of the edges represents the coexpression value between the nodes.
method. Subsequently, the Anatomic Viewer of the online SAGE database (Serial Analysis of Gene Expression database, 8 ) (Boon et al., 2002) displayed the expression of candidate genes in normal and corresponding malignant human tissues. The related expression levels were ordered by different  Frontiers in Genetics | www.frontiersin.org The parameter selection of all these tools above is based on the developers' references to select the most appropriate parameters.

Identification of DEGs and Construction of PPI Network
The profiles of the public series from GEO had a good consistency shown in Figure 1A. In total, 207 DEGs were identified in the demineralized group; approximately 151 upregulated and 56 downregulated genes were identified ( Figure 1B). As we know, the p-value calculation includes the number of genes we input for the enrichment and the total number of genes set in a specific concentration; it means that if we analyze the upand down-regulated genes together, we will get a smaller and more significant p-value. In this way, some underestimating important significant pathways can be avoided. The PPI network constructed via Cytoscape of the DEGs as shown in Figure 1C, while the most significant module in Figure 1D with 13 nodes and 33 edges. These 13 genes were proposed as hub genes (12 upregulated genes and 1 downregulated gene). The detailed information is provided in Table 1. The network of hub genes may be an important regulatory mechanism in osteosarcoma affected by matrix mineral.

Functional and Pathway Enrichment
Results of the GO and KEGG pathway enrichment analyses of the DEGs are shown in Figure 2 (details are shown in Tables 2, 3). In the BP ontology (Figure 2A), cell adhesion (25 genes) and extracellular matrix organization (13 genes) were the most significantly enriched terms. In the CC ontology ( Figure 2B), the terms were related to extracellular space (34 genes) and plasma membrane (64 genes). While the terms in the MF ontology ( Figure 2C), were related to transforming growth factor-beta binding (4 genes) and protease binding (7 genes). Besides that, in the KEGG pathways (Figure 2D), the terms were revealed to be proteoglycans in cancer (11 genes) and pathways in cancer (14 genes). Particularly, GO and KEGG pathway enrichment analysis of the hub genes showed that they were mainly related to the positive regulation of epithelial to mesenchymal transition, extracellular matrix, type II transforming growth factor-beta receptor binding, and hypertrophic cardiomyopathy (Figure 3, also can be seen in Supplementary Tables 1, 2). These enrichment results reveal that transforming growth factor-beta binding function, especially in the extracellular matrix, should need more attention.

Survival Analysis of the Hub Genes
The overall survival plots could be obtained for only 11 hub genes out of 13 hub genes, as shown in Figure 5. However, most of them were not statistically significant. Unfortunately, there was no significant difference in the survival curve of COL1A1, CSF1, FSTL3, ITGAV, ITGB5, POSTN, PRSS23, TGFB1, TGFB3, and TGFBI, and no data result for AMTN and VCAN. Only one gene, namely TGFBR1, had a p-value < 0.05. High expression of TGFBR1 had better overall survival. Since TGFBR1 is one of the hub genes involved in the regulation mechanism in osteosarcoma affected by matrix mineral, and its differential expression will affect the prognosis and survival of patients. Therefore, TGFBR1 was considered as a candidate gene for further analyses.

Prediction of Functions of the Candidate Genes
An interactive functional association network constructed by GeneMANIA revealed correlations among genes for the candidate gene. The gene set enriched for TGFBR1 is responsible mainly for the pathways related to TGFB (transforming growth factor-beta) receptor binding and TGFB receptor signaling ( Figure 6A). In our study, the expression profiles of the TGFBR1 in human tissue are displayed using SAGE. As shown, TGFBR1 mRNA in thyroid, lung, pancreas, colon, and prostate cancer tissues displayed higher levels than in the matched normal tissues (Figure 6B). Subsequently, the target-disease association related to TGFBR1 in neoplasm and skeletal system disease could be determined and viewed by using the Open Targets Platform. Among the diseases, the top five diseases with a high score of TGFBR1-neoplasm association were cancer, multiple keratoacanthomata, Ferguson-Smith type carcinoma, glioma, and breast carcinoma ( Figure 7A). Furthermore, the top five diseases with a high score of TGFBR1-skeletal system disease association were craniosynostosis, connective tissue disease, Marfan syndrome, isolated brachycephaly, and isolated scaphocephaly ( Figure 7B). All these findings predict the function of TGFBR1 and can be helpful for further research.

DISCUSSION
The mineralization of the bone matrix plays a potentially important role in the process of bone formation. Osteosarcoma is an intraosseous malignancy that occurs predominantly in growing skeletal regions (Alfranca et al., 2015). The extracellular mineral is the main inorganic component of the bone matrix. Zhang et al. reported that osteosarcoma cell lines altered the expression of cytokines when grown on calcified materials (Zhang et al., 2013). Meanwhile, Kingsley et al. found that calcium is released into bone microenvironment through osteoclastic bone resorption, leading to a "vicious circle" by its participation in different mechanisms of Parathyroid    Hormone Related Protein (PTHRP) production (Kingsley et al., 2007). Meanwhile, Rubio et al. have demonstrated the role of microenvironment calcium matrix in the development of the osteosarcoma via embedding osteosarcoma tumor cells in calcium phosphate biomaterials (Rubio et al., 2014). In our study, a microarray dataset was analyzed to obtain DEGs from osteosarcoma cells adhering to the demineralized osseous surfaces against mineralized osseous surfaces. Altogether, 207 DEGs comprised 151 upregulated and 56 downregulated genes were identified. Subsequently, enrichment analyses of the DEGs were used to explore some other interactions. Previous studies have reported that extracellular space in the bone microenvironment is essential for osteosarcoma development Baglio et al., 2017;Cortini et al., 2017;Zheng et al., 2018). Moreover, studies have revealed that cell adhesion is a significant process of osteosarcoma, which involves a number of molecules (Gang et al., 2017;Hu et al., 2017;Villanueva et al., 2019). In other words, all these studies are consistent with the results obtained in our studies. In contrast, the functional analysis demonstrated the hub genes were mainly related to the positive regulation of epithelial to mesenchymal transition, extracellular matrix, type II transforming growth factor-beta receptor binding, and hypertrophic cardiomyopathy.
The hub genes were selected in the DEGs PPI network of the most significant module. Further survival analysis revealed TGFBR1 as a candidate gene because of its statistical significance. Meanwhile, some hub genes have been reported involved in the tumorigenesis and progression of osteosarcoma, such as ITGAV, POSTN, COL1A1, TGFB1, TGFB3, and ITGAV. These can be negatively regulated by miR-548c-3p, just like the target agents against integrin as reported in clinical trials, and can be potential tumor therapeutic targets (Luo et al., 2016). Moreover, high expression of POSTN (periostin, also known as osteoblastspecific factor two) in osteosarcoma is significantly associated with angiogenesis and is considered to be a promising prognostic factor in osteosarcoma patients (Hu et al., 2016). Similarly, COL1A1 also has a prognostic value in osteosarcoma patients (He et al., 2014). In osteosarcoma, TGFB signaling could interact with both microenvironment and tumor cells, exerting the characteristic effects of tumor precursors. Therefore, TGFBs could be considered as independent therapeutic targets for osteosarcoma (Lamora et al., 2016;Verrecchia and Redini, 2018;Xie et al., 2018).
In addition, a search of published literature revealed that some hub genes have been associated with tumors. VCAN is reported to be essential for migration and metastasis of breast cancer  and could be a potential prognostic biomarker for colon cancer (Chida et al., 2016). Moreover, knockdown of PRSS23 could inhibit EIF2 signaling, thereby suppressing the tumor growth of gastric cancer (Han et al., 2019). Furthermore, one report has demonstrated that CSF1 could promote melanoma resistance to PD1 checkpoint blockade (Neubert et al., 2018), and many reports have shown that CSF1R targeting may be beneficial by showing strong anti-tumor effects (Zhu et al., 2014;Obba et al., 2015;Kumar et al., 2017;Neubert et al., 2018). Similarly, TGFBI has tumor-promoting and tumor-protective FIGURE 6 | The functional association network (A) and expression profiles (B) of TGFBR1. In the functional association network, the edge-colors represent the bioinformatics method applied; the node-colors represent the biological functions of the set of enrichment genes. In the expression profiles, the left side represents normal tissues and the right side represents the matched cancer tissues. The related expression levels are based on analysis of the counts of SAGE tags, ordered by 10 colors. roles in certain gastrointestinal tract cancers (Han et al., 2015), and in prostate cancer (Al Shareef et al., 2018).
Particularly, the candidate gene TGFBR1 is reported to be involved in multiple biological functions in tumor. Inhibition of TGFBR1 can block the activation of fibroblasts mediated via IL1B/TGFB1 and reduce the secretion of pro-inflammatory cytokines in colorectal cancer, which would make the cancer cells more sensitive to chemotherapy. Moreover, TGFBR1 inhibitor could reduce the metastasis ability of tumor cells in vivo (Guillen Diaz-Maroto et al., 2019). It is also reported that in lung cancer TGFBR1 may be a target of tumor suppressor genes (Gao et al., 2019). However, there are few reports on the function of TGFBR1 in osteosarcoma. Upregulated TGFBR1 will induce the epithelialto-mesenchymal transition in osteosarcoma cell lines, which is regulated by the interferon consensus sequence-binding protein, but the specific mechanism is not fully elaborated (ICSBP) (Sung et al., 2019).
This bioinformatics study provides information on DEGs and regulatory elements involved in osteosarcoma affected by matrix mineral. However, the mechanism of these molecules are FIGURE 7 | The target-disease association related to TGFBR1 in neoplasm (A) and skeletal system disease (B) could be determined and viewed by the Open Targets Platform; the bubbles color and bubbles size represent the association score, and the top 10 diseases with a high score value are listed below. still unknown, and additional well-designed experiments and analyses are required. In addition, all the results from this study were got in silico. Further, in vivo and in vitro experiments are necessary to test the biological functions of the key genes and their regulatory elements.
In conclusion, we identified 207 DEGs, 13 hub genes, 297 miRNAs, 349 TFs, and 1 candidate gene, which may be involved in the processes of matrix mineralization that may provide a functional signal to osteosarcoma cells. The molecules that we found may be potential targets for future research on the microenvironment of osteosarcoma. Furthermore, our results may contribute to the identification of the biomarkers for matrix mineralization related to osteosarcoma.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study were generated at GSE114237 (Wischmann et al., 2018) in GEO. Derived data supporting the findings of this study are available from the corresponding author on reasonable request.

AUTHOR CONTRIBUTIONS
ML and XJ analyzed the data. HL provided the help of the R language. CY suggested online tools. SD and GW designed the project. SD selected the analyzed results and wrote the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We thank Lao Xinyuan, the CEO of helixlife, for his guidance and help in our scientific research work.