GPC2 Is a Potential Diagnostic, Immunological, and Prognostic Biomarker in Pan-Cancer

Background: Glypican 2 (GPC2), a member of the glypican (GPC) gene family, encodes a proteoglycan with a glycosylphosphatidylinositol anchor. It has shown growing significance in multiple cancers, such as neuroblastoma, malignant brain tumor, and small-cell lung cancer. However, no systematic pan-cancer analysis has been conducted to explore its role in diagnosis, prognosis, and immunological prediction.
Methods: Using datasets from The Cancer Genome Atlas (TCGA), Cancer Cell Line Encyclopedia (CCLE), Genotype-Tissue Expression Project (GTEx), cBioPortal, Human Protein Atlas (HPA), UALCAN, StarBase, and Comparative Toxicogenomics Database (CTD), we applied bioinformatics methods to explore the potential carcinogenic role of GPC2. We examined the correlation between GPC2 and prognosis, gene mutation, immune cell infiltration, and DNA methylation in different tumors, constructed the competing endogenous RNA (ceRNA) networks of GPC2, and explored the interaction of GPC2 with chemicals and genes.
Results: GPC2 was highly expressed in most cancers; the exception was pancreatic adenocarcinoma, in which its expression was quite low. GPC2 showed early diagnostic value in 16 kinds of tumors and was positively or negatively associated with prognosis depending on the tumor. GPC2 was also associated with most immune-infiltrating cells in pan-cancer, especially in thymoma. Moreover, the correlation with GPC2 expression varied depending on the type of immune-related genes. Additionally, GPC2 expression correlated with DNA methylation in 20 types of cancer.
Conclusion: Through pan-cancer analysis, we found, for the first time, that GPC2 might be useful in cancer detection. The expression level of GPC2 in a variety of tumors differs significantly from that in normal tissues. In addition, the behavior of GPC2 in tumorigenesis and tumor immunity supports this conjecture, and GPC2 shows high specificity and sensitivity in the detection of cancers. Therefore, GPC2 can be used as an auxiliary indicator for early tumor diagnosis and a prognostic marker for many types of tumors.


INTRODUCTION
Cancer brings immense suffering to individuals (1). From radiotherapy and chemotherapy to targeted therapy and immunotherapy, persistent efforts have deepened our understanding of the complex pathogenesis of tumors and raised the level of treatment (2). However, immunotherapy still requires further investigation in different cancers to be validated (3,4). Pan-cancer analysis examines genes across a wide variety of cancers and compares the differences and similarities in their expression (5). Thanks to public databases such as The Cancer Genome Atlas (TCGA), valuable factors can be mined for diagnosis, prognosis, and immunotherapy (6).
Glypican 2 (GPC2) is a protein-coding gene that expresses a cell surface proteoglycan bearing heparan sulfate (7). The glypican (GPC) family genes encode glypicans, which attach to the cell membrane by means of a glycosylphosphatidylinositol (GPI) anchor (8). Studies show that glypicans act as protein co-receptors, taking part in the signal transduction of wingless (Wnts), hedgehogs (Hhs), fibroblast growth factors (FGFs), and bone morphogenetic proteins (BMPs) (7). Six glypicans (GPC1-6) have been identified in mammals, and all of them have been proposed as cancer therapeutic targets because of their high expression in cancers (9). Their expression varies across tissues; GPC2 is mainly active in growing nervous tissues and thyroid cancer tissues (10-14). It participates in the growth and differentiation of neuronal axons (15). Increasing evidence has demonstrated the overexpression of GPC2 in neuroblastoma, a childhood cancer (9,16,17). In previous research, immunotherapy and targeted therapy have shown good therapeutic prospects in neuroblastoma and malignant brain tumors (16,18,19). One study identified immunotherapy targets in 12 pediatric cancers and analyzed GPC2 in 8 of them, including osteosarcoma and Ewing sarcoma, which suggests that GPC2 has a wide range of functions in childhood cancers (20). Some reports indicate that GPC2 is relatively silent in various normal adult tissues such as brain, heart, lung, and kidney (9,21). However, GPC2 expression was found to be upregulated in small-cell lung cancer and prostate cancer (17,22). Moreover, there is evidence that high expression of GPC2 may be associated with a favorable prognosis in early pancreatic duct adenocarcinoma after pancreaticoduodenectomy (23). In general, GPC2 affects protein transduction, cell proliferation and differentiation, and oncogenic signatures (7,23).
In view of the lack of pan-cancer studies and the inconsistencies in past research, we retrieved data from diverse resources, including TCGA, Cancer Cell Line Encyclopedia (CCLE), Genotype-Tissue Expression Project (GTEx), cBioPortal, and Human Protein Atlas (HPA).
After analyzing and comparing the expression of GPC2 across malignancies, we further examined immune infiltration levels, the co-expression of immune-related genes with GPC2, and DNA methylation across 33 types of cancer. We also investigated the competing endogenous RNA (ceRNA) networks of GPC2 and its interacting chemicals and genes. We found that GPC2 can serve as a diagnostic, prognostic, and immunological predictor across cancers. This study may broaden the thinking on the application of GPC2 in immunotherapy.

Data Preprocessing and Differential Expression Analysis
The mRNA expression profiles and corresponding clinical data of cancer samples and matched normal samples for 33 types of cancer were downloaded from TCGA (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga), comprising 11,315 samples in all. The differentially expressed genes (DEGs) between tumor tissues and adjacent tissues were identified in the TCGA cohorts using log2 transformation and t-tests, with a p-value <0.05. The intersection genes were screened from the cancer types with significant differential expression.
We downloaded gene expression data for 31 different tissues from GTEx (https://commonfund.nih.gov/GTEx). The CCLE database (https://sites.broadinstitute.org/ccle) is a large, public cancer genome database that includes information on thousands of cell lines, including methylation and gene expression profiles. We downloaded the data of cancer cell lines from 37 human tissues in CCLE and analyzed their GPC2 expression.
The downloaded data enabled us to evaluate the expression levels of GPC2 in 31 normal tissues and 33 tumor tissues and to compare the cancer samples with paired normal samples in 33 cancers. Log2 transformation and t-tests were performed on the expression data for these tumor types. A difference in expression between tumor and normal tissue samples was considered significant at a p-value < 0.05. R software (Version 4.0.2, https://www.R-project.org) was used for data analysis, and the "ggplot2" R package was used to draw the box plots.
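The comparison described above can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration under stated assumptions: the expression values are invented, the pseudocount of 1 in the log2 transform is an assumption, and Welch's unequal-variance form of the t statistic is assumed because the text does not say which t-test variant was applied. Converting the statistic to a p-value requires the t distribution, which a statistics library would normally provide.

```python
import math
from statistics import mean, variance


def log2_transform(values, pseudocount=1.0):
    """Return log2(x + pseudocount) for each expression value.

    The pseudocount (an assumption, not taken from the paper) avoids log2(0).
    """
    return [math.log2(v + pseudocount) for v in values]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical GPC2 expression values for tumor and normal samples.
tumor = log2_transform([30.0, 45.0, 52.0, 61.0, 38.0])
normal = log2_transform([7.0, 9.0, 12.0, 8.0, 10.0])
t_stat, dof = welch_t(tumor, normal)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive statistic here means higher expression in the tumor group; the significance call at p < 0.05 would follow from the t distribution with the computed degrees of freedom.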

Immunohistochemistry Staining of GPC2
HPA (https://www.proteinatlas.org/) is a human proteome atlas database containing information on the protein distribution of human tissues and cells. To analyze the differential expression of GPC2 at the protein level, we downloaded immunohistochemical images of 15 kinds of tumor tissues with their corresponding normal tissues from HPA. These included liver cancer, testis cancer, thyroid cancer, lymphoma, ovarian cancer, skin cancer, prostate cancer, breast cancer, stomach cancer, pancreatic cancer, cervical cancer, endometrial cancer, renal cancer, colorectal cancer, and lung cancer.

Analysis of the Diagnosis Value of GPC2
Tumor stage, a clinical phenotype recorded for each TCGA sample, was selected, and its association with GPC2 expression was analyzed and plotted with the "ggplot2" R package. "ggplot2" is a plotting package that separates the data from the plot and data-related plot elements from data-independent ones. To evaluate the diagnostic accuracy of GPC2, receiver operating characteristic (ROC) curve analysis based on sensitivity and specificity was conducted using the "pROC" package. The area under the curve (AUC) ranges from 1.0 (perfect diagnostic accuracy) to 0.5 (no diagnostic value) (24).
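To make the AUC concrete, the following Python sketch computes it as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, which is equivalent to the area under the empirical ROC curve. This is not the "pROC" implementation used in the study; the function name and the sample values are hypothetical, and the sketch assumes that higher expression indicates tumor.

```python
def roc_auc(tumor_scores, normal_scores):
    """AUC as the fraction of (tumor, normal) pairs ranked correctly.

    A pair counts 1 when the tumor value is higher, 0.5 when tied, 0 otherwise.
    """
    wins = 0.0
    for t in tumor_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_scores) * len(normal_scores))


# Hypothetical log2 expression values.
print(roc_auc([5.1, 6.3, 4.8, 7.0], [3.2, 4.9, 3.8]))
```

Perfectly separated groups give 1.0, and groups with identical distributions give a value near 0.5, matching the range quoted above.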

Analysis of the Relationships Between GPC2 and Prognosis
Survival data were also obtained from the TCGA samples. Overall survival (OS), disease-specific survival (DSS), and progression-free interval (PFI) were used as indicators to explore the relationship between GPC2 expression and patient prognosis. For the survival analyses, the Kaplan-Meier method and log-rank test were used in each cancer type. The R packages "survival" and "survminer" were used to draw the survival curves. Moreover, we used the R package "forestplot" to assess the relationship between GPC2 expression and survival in pan-cancer.
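As an illustration of the Kaplan-Meier method named above, the sketch below computes the product-limit survival estimate at each observed event time. It is a simplified Python stand-in for what the R "survival" package does, not the authors' code; the function name and the follow-up data are hypothetical, and the log-rank test is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations; events: 1 if the event occurred, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored subjects both leave the risk set
    return curve


# Hypothetical follow-up (months) for one expression group; 0 marks censoring.
print(kaplan_meier([5, 8, 12, 12, 20, 24], [1, 0, 1, 1, 0, 1]))
```

In a comparison of high and low GPC2 expression, one curve would be computed per group; how the groups are split is not stated in the text, so any cutoff used with this sketch would be an assumption.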

Relationship Between GPC2 Expression and Immunity
The relative scores for 24 immune cells in 33 cancers were calculated with CIBERSORT (https://cibersort.stanford.edu/), a metagene tool that can predict immune cell phenotypes. In addition, the correlations between GPC2 and each immune cell infiltration level were assessed and plotted with the R packages "ggplot2" and "ggpubr" ("ggplot2" is a flexible data visualization package, and "ggpubr" provides easy-to-use functions for creating and customizing "ggplot2"-based publication-ready plots).
Additionally, we analyzed the co-expression of GPC2 and immune-related genes, specifically genes encoding major histocompatibility complex (MHC), immune activation, immunosuppressive, chemokine, and chemokine receptor proteins. The results were visualized with the "reshape2" and "RColorBrewer" packages: "reshape2" converts data between wide and long formats, and "RColorBrewer" provides color palettes.
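The correlation step behind both the immune infiltration and the co-expression analyses can be sketched as follows. The text does not state which correlation coefficient was used, so Spearman's rank correlation is assumed here purely for illustration; the function names and the values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical GPC2 expression and infiltration scores for six samples.
gpc2 = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
t_cells = [0.10, 0.22, 0.08, 0.30, 0.15, 0.21]
print(round(spearman(gpc2, t_cells), 3))
```

Repeating this for every gene or cell type in every cancer yields the matrix of coefficients that a heat map would display.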

Correlation of GPC2 Expression With DNA Methylation
UALCAN (http://ualcan.path.uab.edu/) is an interactive web portal for in-depth analysis of TCGA gene expression data (25). In this study, UALCAN was used to investigate the promoter methylation level of GPC2 in cancers. cBioPortal (http://www.cbioportal.org/) is a platform that contains all tumor gene data in the TCGA database and provides researchers with multidimensional visual data. We selected data from 32 cancers, a total of 10,953 samples, and used cBioPortal for further analysis. The type and frequency of GPC2 gene mutation in all tumors were analyzed in "OncoPrint" and "CancerTypesSummary." "OncoPrint" shows the mutation, copy number, and expression of the target gene in all samples in the form of a heat map, and "CancerTypesSummary" shows the mutation rate of the target gene across cancers in the form of a bar chart.

Interaction of GPC2 With Chemicals and Genes
The Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) is a digital resource for investigating the molecular mechanisms by which chemicals influence health outcomes (27). We used this database to query the chemicals that interact with GPC2 and to identify genes that are highly similar to GPC2 in terms of shared interacting chemicals.
The GeneMANIA database (http://www.genemania.org) is a user-friendly website that can find functionally similar genes according to the given gene list based on a wealth of genomics and proteomics data (23). Through detection of similar gene functions in GeneMANIA, we identified genes whose expression patterns were similar to those of GPC2.

Differential Expression of GPC2 Between Tumor and Normal Tissue Samples
The GTEx datasets were used to analyze the expression levels of the GPC2 gene across different tissues under physiological conditions (Figure 1A). GPC2 expression levels were highest in testis (the differences from other tissues were statistically significant) but low in most other normal tissues. Figure 1B presents the relative GPC2 expression levels in various cancer cell lines from CCLE. The expression levels of GPC2 are generally increased in cancer cell lines from different tissue sources, which is consistent with the results from the TCGA database, and GPC2 is markedly expressed in cell lines from the peripheral nervous system.
Next, we ranked GPC2 expression levels in various cancers from lowest to highest (Figure 1C). GPC2 was expressed in all tumors, with the highest level in uterine carcinosarcoma (UCS) and the lowest in liver hepatocellular carcinoma (LIHC). We also compared GPC2 expression levels between cancer and paired normal samples in 33 cancers, based on TCGA data (Figure 1E). Excluding cancers for which no or very few normal samples were available, the expression of GPC2 differed significantly from that in normal tissue in 21 types of cancer. Among these, GPC2 levels were upregulated in bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), LIHC, lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), pheochromocytoma and paraganglioma (PCPG), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), stomach adenocarcinoma (STAD), thyroid carcinoma (THCA), uterine corpus endometrial carcinoma (UCEC), and glioblastoma multiforme (GBM). In contrast, GPC2 expression was lower in tumor than in normal tissue in pancreatic adenocarcinoma (PAAD). There was no significant difference in GPC2 levels between tumor and nontumor tissues in sarcoma (SARC), skin cutaneous melanoma (SKCM), and thymoma (THYM). In addition, a notable increase in GPC2 expression was observed in 16 types of cancer in paired tumor samples compared with the corresponding normal samples (Figure 1D). These results suggest that GPC2 expression is upregulated in various types of cancer, indicating that GPC2 may play a pivotal role in cancer diagnosis.
Furthermore, to assess the expression of GPC2 at the protein level, we retrieved immunohistochemical images from the HPA database. Figure 2 shows that the protein expression of GPC2 was significantly higher in 15 cancers than in the corresponding normal tissues.

Diagnosis Value of GPC2 Across Cancers
When examining the association with tumor stage, we found that GPC2 expression increased significantly in the early tumor stage in 16 types of cancer (Figure 3), including CHOL, LUSC, LUAD, KIRP, HNSC, LIHC, ESCA, KIRC, UCEC, BLCA, COAD, READ, STAD, PRAD, THCA, and BRCA, indicating that GPC2 may have important clinical value in the early diagnosis of these tumors. ROC curves were used to evaluate the diagnostic accuracy of GPC2. AUC ranges were taken to indicate high diagnostic accuracy (AUC: 1.0-0.9), relative diagnostic accuracy (AUC: 0.9-0.7), or low diagnostic accuracy (AUC: 0.7-0.5). Figure 4 shows that, according to the AUC of the ROC analysis, GPC2 has high diagnostic accuracy in 6 types of cancer, relative diagnostic accuracy in 16 types of cancer, and low diagnostic accuracy in 7 types of cancer. Notably, the AUC reached 1.0 in CHOL.
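The three accuracy bands above can be written as a small classification rule. The Python sketch below is illustrative only: the text gives the ranges but not which band owns the shared boundaries 0.9 and 0.7, so assigning each boundary to the higher band is an assumption, as are the function name and the example values.

```python
def auc_category(auc):
    """Map an AUC to the accuracy bands quoted in the text.

    Boundary values (0.9 and 0.7) are assigned to the higher band by assumption.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc >= 0.9:
        return "high"
    if auc >= 0.7:
        return "relative"
    if auc >= 0.5:
        return "low"
    return "below range"


# Hypothetical AUC values, one per cancer type.
for value in (1.0, 0.93, 0.81, 0.64):
    print(value, auc_category(value))
```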

Prognostic Significance of GPC2 Across Cancers
To investigate the association between GPC2 expression level and prognosis, we performed a survival association analysis for each cancer, focusing on OS, DSS, and PFI. On the one hand, Cox proportional hazards model analysis showed that the expression levels of GPC2 were associated with OS in COAD (p < 0.001), PAAD (p < 0.001), acute myeloid leukemia (LAML) (p < 0.001), adrenocortical carcinoma (ACC) (p < 0.001), SARC (p < 0.001), KIRC (p <

Relationship Between GPC2 Expression Level and Tumor Immune Cell Infiltration
The CIBERSORT results revealed that, for most types of cancer, the association between the level of immune cell infiltration and GPC2 expression was significant (Figure 8). In particular, GPC2 expression was positively related to infiltrating T cells, T helper cells, Tcm, Th17 cells, and Th2 cells in THYM. Moreover, a co-expression analysis was carried out in 33 tumors to detect the relationships between GPC2 expression and immune-related genes. The heat map (Figure 9) shows that almost all immune-related genes were co-expressed with GPC2 and that, except in LUSC and SARC, the majority of immune-related genes were positively correlated with GPC2 in all types of tumor (p < 0.05).

Correlation of GPC2 Expression With DNA Methylation
The UALCAN online tool allowed us to investigate promoter methylation levels of GPC2 in patient and normal groups across different cancers. The beta value indicates the level of DNA methylation, ranging from 0 (unmethylated) to 1 (fully methylated). Beta-value ranges were taken to indicate hypermethylation (beta value: 0.7-0.5) or hypomethylation (beta value: 0.3-0.25). Figure 10 shows that the promoter methylation levels of GPC2 were significantly higher in 12 tumor groups than in the normal groups.
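The beta-value ranges quoted above can be expressed as a simple rule, sketched here in Python for illustration. The ranges come from the text; treating their endpoints as inclusive and labeling everything outside them "unclassified" are assumptions, and the function name and example values are hypothetical.

```python
def methylation_call(beta):
    """Label a promoter beta value using the ranges quoted in the text.

    Endpoints are treated as inclusive (an assumption); values outside
    both ranges are left unclassified.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta value must lie between 0 and 1")
    if 0.5 <= beta <= 0.7:
        return "hypermethylated"
    if 0.25 <= beta <= 0.3:
        return "hypomethylated"
    return "unclassified"


# Hypothetical median promoter beta values.
for beta in (0.62, 0.27, 0.40):
    print(beta, methylation_call(beta))
```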
The mutation of the GPC2 gene in all tumor tissues was analyzed on the cBioPortal platform, using 10,953 patients from the TCGA database. Amplification accounted for the largest proportion of all mutation types; esophageal squamous cell carcinoma, esophagogastric adenocarcinoma, and CHOL had the highest occurrence rates, at 8.42%, 6.42%, and 5.56%, respectively (Figure 11).

Prediction of Target miRNAs and Construction of the Co-Expressed Network
It is well known that miRNAs can induce gene silencing and downregulate gene expression by binding mRNAs. The ceRNA network is built on the interactions among mRNAs, miRNAs, and their corresponding ncRNAs. These ncRNAs, including circRNAs and lncRNAs, are regarded as upstream molecules that can influence miRNA function by binding miRNA response elements, thereby upregulating gene expression (28). We acquired 22 target miRNAs of GPC2 from five databases. However, only 8 of these could be retrieved in StarBase to predict their circRNAs and lncRNAs. As a result, 121 target lncRNAs and 149 target circRNAs were obtained for the target miRNAs of GPC2. The ceRNA networks shown in Figure 12 were built from these prediction results and might provide a basis for research on potential drugs regulating GPC2.
The gene-gene interaction network for GPC2 and similar genes was constructed with GeneMANIA. The results showed that the 20 most frequently altered genes were closely correlated with GPC2, among which midkine (MDK) had the most significant correlation with GPC2. Moreover, the functional analysis suggested that GPC2 and its similar genes were prominently associated with the glycosaminoglycan metabolic process, aminoglycan metabolic process, and aminoglycan biosynthetic process (Figure 13).

DISCUSSION
Cancer has long been a focus of medical research. Data on 33 cancers from the TCGA and CCLE platforms were used to explore biomarkers suitable for broad-spectrum cancer diagnosis through differential gene expression analysis. In this pan-cancer analysis, GPC2 stood out from a number of genes because of its significant upregulation in many types of cancer. We demonstrated in several ways the significant difference in its expression between cancer and normal tissues and discussed its early detection value, regulatory pathways, associated genes, and compounds. GPC2 is a member of the glypican family. Until now, most studies of glypicans have concentrated on GPC3 and GPC1, each of which shows excellent diagnostic performance in specific cancer types. For example, GPC1, which belongs to the same subfamily as GPC2, has been reported to be a diagnostic biomarker and therapeutic target for pancreatic cancer and triggered a wave of interest in glypicans (29). GPC3 has also been reported to have high specificity in the diagnosis of hepatocellular carcinoma and can be used as a marker to distinguish hepatocellular carcinoma from other liver tumors (17,30). In a further study of GPC3, Tetsuya Nakatsura et al. found that it could also be used as an auxiliary indicator for the early diagnosis of melanoma (31). GPC2 was originally identified in rat brain at locus 7q22.1, encoding a 579-amino acid protein, but the mechanism of action has not been revealed (7,32). The UniProt (https://www.uniprot.org/) platform predicts that GPC2 has five hydrogen sulfur bond insertion sites, and it has been reported that the unique structure of GPC2 helps it bind to the Wnt signaling pathway, thus affecting the expression of MYCN Proto-Oncogene (MYCN) and regulating the proliferation of tumor cells (33).
FIGURE 9 | Co-expression of GPC2 and immune-related genes. *p < 0.05, **p < 0.01.
In our comprehensive analysis and screening of a large number of genes, GPC2 captured our attention because of its outstanding detection performance. Excluding cancers with no or too few normal tissue samples, our results showed significant differences in GPC2 expression between tumor and normal tissues in 21 forms of cancer. GPC2 expression was upregulated in 20 of them, including BLCA, BRCA, CESC, CHOL, COAD, ESCA, and HNSC, and downregulated in tumor relative to nontumor tissues in only one (PAAD).
Unfortunately, owing to the insufficient number of normal samples in the database, the expression differences of GPC2 in THYM, SKCM, and SARC were not statistically significant. For cancers such as LGG, UCS, TGCT, OV, LAML, DLBC, ACC, UVM, and MESO, the analysis could not be performed because normal samples were lacking. As datasets accumulate, these cancers deserve further exploration. For instance, recent studies have shown that GPC2 expression is low in normal pediatric tissues but elevated in optic neuroblastoma tissues; it has been selected as a chimeric antigen receptor T cell therapy target for optic neuroblastoma, and its therapeutic effect is attracting much attention (16).
In addition, GPC2 expression was significantly increased in 16 cancers in the paired-sample differential expression analysis. Immunohistochemical analysis confirmed higher levels of GPC2 protein in almost all cancers examined. Overall, these findings confirm that GPC2 expression is upregulated in a variety of cancers, suggesting that GPC2 holds promise for cancer diagnosis.
At present, early cancer detection is of great clinical significance. To advance early detection, we explored the differential expression of GPC2 in samples annotated with cancer staging information. The analysis showed an early elevation of GPC2 in 16 of the 17 cancers for which staging information and normal control samples were available. The AUC of the ROC curve also confirmed the strong performance of GPC2 in the diagnosis of multiple cancers. GPC2 showed high diagnostic accuracy in 6 forms of cancer.
Examining the relationship between GPC2 expression and the level of tumor immune cell infiltration, we found that the expression of GPC2 is mostly negatively correlated with the level of immune cell infiltration. GPC2 is believed to be involved in the transduction of the Wnt/β-catenin signaling pathway, which can regulate the differentiation and development of macrophages, B cells, and other immune cells and regulates the immune response in multiple ways (34-36). This may also be the mechanism by which GPC2 affects the number of immune cells. It suggests that GPC2 is a good indirect indicator of the occurrence of cancer in vivo and can play a supporting role in tumor diagnosis. Also, there is a significant positive correlation between GPC2 and immune-related genes.
The cBioPortal results show that GPC2 is mutated in most forms of tumors. Among them, the mutation frequency is highest in esophageal squamous cell carcinoma, esophagogastric adenocarcinoma, and CHOL, which suggests that the relationship between GPC2 gene mutation and digestive system tumors deserves attention.
In our study, an elevated methylation level of the GPC2 promoter and a high expression level of GPC2 appeared simultaneously, which is not uncommon in tumor tissues. Smith et al. discussed in detail several possible mechanisms by which promoter DNA hypermethylation leads to paradoxical gene activation, such as binding to transcription inhibitors, binding to remote control elements, or inducing alternative promoter activation (37). This indicates that gene expression is regulated by a more complex network mechanism (37,38). To characterize the upstream and downstream expression mechanisms of GPC2 in vivo more comprehensively, we constructed a ceRNA expression network containing ncRNAs, circRNAs, and lncRNAs. Based on these prediction results, we identified compounds that may regulate GPC2 expression and constructed the gene interaction network of the 20 genes most closely related to GPC2 through chemical association.
In summary, through pan-cancer analysis we found that GPC2 was widely differentially expressed between tumor and normal tissues and revealed the correlation between GPC2 expression and clinical prognosis. Our findings suggest that GPC2 has the potential to become an independent prognostic factor for many tumors and that the level of GPC2 expression may vary among tumor types. In the most recent study by Clevers et al., GPC2 was proposed as a therapeutic target for optic neuroblastoma (39). Silencing GPC2 inactivates Wnt/β-catenin signaling and reduces the expression of MYCN, a driver of optic neuroblastoma. The specific role of GPC2 in each tumor needs further study. Furthermore, the analyses of tumor immune cell infiltration and immune-related genes showed that the expression level of GPC2 was mostly positively correlated with the expression of immune-related genes. We also investigated GPC2 in terms of methylation level, immunohistochemistry, and mutation, which will help to further elucidate the mechanism of GPC2 in tumor development.

AUTHOR CONTRIBUTIONS
YF contributed to the conception and design of the study. GC, DqL, NZ, DyL, JZ, and HL drafted the manuscript. ZL, XL, QC, CZ, YL, Y-TC, and QR collected and analyzed the data. NW and YF revised the manuscript. All authors contributed to the manuscript revision and read and approved the submitted version.