Development of an Immune-Related Prognostic Index Associated With Glioblastoma

Background: Although the tumor microenvironment (TME) is known to influence the prognosis of glioblastoma (GBM), the underlying mechanisms are not clear. This study aims to identify hub genes in the TME that affect the prognosis of GBM. Methods: The transcriptome profiles of the central nervous systems of GBM patients were downloaded from The Cancer Genome Atlas (TCGA). The ESTIMATE scoring algorithm was used to calculate immune and stromal scores. The application of these scores in histology classification was tested. Univariate Cox regression analysis was conducted to identify genes with prognostic value. Subsequently, functional enrichment analysis and protein–protein interaction (PPI) network analysis were performed to reveal the pathways and biological functions associated with the genes. Next, these prognosis genes were validated in an independent GBM cohort from the Chinese Glioma Genome Atlas (CGGA). Finally, the efficacy of current antitumor drugs targeting these genes against glioma was evaluated. Results: Gene expression profiles and clinical data of 309 GBM samples were obtained from TCGA database. Higher immune and stromal scores were found to be significantly correlated with tissue type and poor overall survival (OS) (p = 0.15 and 0.77, respectively). Functional enrichment analysis identified 860 upregulated and 162 downregulated cross genes, which were mainly linked to immune response, inflammatory response, cell membrane, and receptor activity. Survival analysis identified 228 differentially expressed genes associated with the prognosis of GBM (p ≤ 0.05). A total of 48 hub genes were identified by the Cytoscape tool, and pathway enrichment analysis of the genes was performed using Database for Annotation, Visualization and Integrated Discovery (DAVID). The 228 genes were validated in an independent GBM cohort from the CGGA. In total, 10 genes were found to be significantly associated with prognosis of GBM. Finally, 14 antitumor drugs were identified by drug–gene interaction analysis. Conclusions: Here, 10 TME-related genes and 14 corresponding antitumor agents were found to be associated with the prognosis and OS of GBM.


INTRODUCTION
Glioblastoma (GBM) is a highly aggressive brain tumor with poor prognosis (1). The 1-and 5-year survival rates of GBM are estimated to be 35.7 and 4.7%, respectively, while its average overall survival (OS) time is 12-18 months (2). The Cancer Genome Atlas (TCGA) is a valuable resource used for the classification and discovery of large cancer gene expression datasets (3,4). GBM is classified into four subtypes: proneural, neural, classic, and mesenchymal (5). In 2016, the WHO updated the classification of GBM into three subtypes based on molecular and histological features: (1) isocitrate dehydrogenase (IDH) wild type, (2) IDH mutant, and (3) not otherwise specified (NOS) (unspecified) (6). Several genes and transcription factors have been shown to play a critical role in the occurrence, development, and evolution of GBM (7,8). Other scholars have demonstrated that the tumor microenvironment (TME) strongly influences gene expression in tumor tissues, thus having a notable impact on the clinical outcomes of cancers (9)(10)(11)(12)(13)(14). The TME refers to the environment around a tumor and is mainly composed of immune cells, endothelial cells, mesenchymal cells, inflammatory mediators, and stromal cells (15,16). In tumors, immune and stromal cells are the two main non-tumor constituents with diagnostic and prognostic potential.
Various tools have been developed to analyze the molecular composition of the TME in TCGA gene expression datasets (14,17). For example, the estimation of stromal and immune cells in malignant tumor tissues using expression data (ESTIMATE) scoring algorithm estimates the number of stroma and immune cells in tumors (14). This algorithm predicts non-tumor cell infiltration by analyzing the expression patterns of particular genes in immune and stromal cells and also calculates their scores. ESTIMATE has been applied to study various cancers, including breast (18), prostate (19), and colon cancer (20). However, the significance of immune and stromal scores in GBM has not been established.
Currently, GBM is difficult to treat because of several challenges, such as the blood-brain barrier, tumor heterogeneity, drug efflux pumps, and glioma stem cells (21,22). Therefore, it is important to find new druggable targets for the treatment of GBM.
Herein, we calculated the immune and stromal scores of GBM cohorts downloaded from TCGA database using the ESTIMATE algorithm (14). An array of TME-associated genes that predict poor outcomes in GBM patients were found. Finally, the Chinese Glioma Genome Atlas (CGGA) GBM cohort was used to validate the prognostic performance of the genes. Drugs targeting the genes were identified in the Dgidb database.

Database
The transcriptome profiles of GBM patients were obtained from TCGA database (https://tcga-data.nci.nih.gov/tcGA/). The gene expression data of GBM were determined using Agilent G4502A072 (Sep 08, 2017). Clinical data, including gender, tissue type, prognosis, and survival rates, were retrieved from TCGA database. Immune and stromal scores were calculated using the ESTIMATE algorithm (14). The gene expression profiles of GBM patients were downloaded from the CGGA database (http://www.cggA.org.cn/). The RNA sequencing of samples from diffuse gliomas was performed using the Agilent Whole Human Genome array. The survival and prognosis data were downloaded from the CGGA database.

Identification of Differentially Expressed Genes
The datasets were analyzed using the LIMMA package (23). Fold change ≥1.5 and adjusted p ≤ 0.05 were set as the cutoff thresholds for selection of differentially expressed genes (DEGs).

Construction of a Protein-Protein Interaction Network and Module Analysis
Protein-protein interaction (PPI) networks were constructed using STRING v11.0. The full gene list was uploaded to the database. Disconnected nodes were excluded from the resulting network (24). The PPI network was visualized using Cytoscape (version 2.8.1) (25).

Overall Survival Analysis
The association of the hub genes with the OS rate was determined using survival data of GBM patients obtained from TCGA-BRCA database.

Gene Ontology and Kyoto Encyclopedia of Genes and Genomes Pathway Analyses
Gene ontology (GO) is a tool used to predict the biological processes (BPs) of gene. The GO terms analysis included BPs, molecular function (MF), and cellular composition (CC) (26,27). Kyoto Encyclopedia of Genes and Genomes (KEGG) is an open-access resource for analyzing the biological pathways of genes. These tools were used to determine functions and pathways associated with the prognostic value genes (28). p ≤ 0.05 was considered statistically significant.

Drug-Gene Cross-Talk and Functional Analysis of Potential Genes
The Drug Gene Interaction Database (DGIdb) was used to determine potentially druggable targets for the mutated and altered genes (29).

Statistical Analysis
All statistical analyses were performed using R/BioConductor (version 3.6.3) with two-tailed p-values. For Cox regression Violin plot shows that there is significant association between GBM subtypes and the level of immune scores (n = 309, p < 0.001). (B) Distribution of stromal scores of GBM subtypes. Violin plot shows that there is significant association between GBM subtypes and the level of stromal scores (n = 309, p < 0.001). (C) GBM cases were divided into two groups based on their immune scores: as shown in the Kaplan-Meier survival curve, median survival of the low-score group is longer than high-score group; it is not statistically different as indicated by the log rank test; p-value is 0.15. (D) GBM cases were divided into two groups based on their stromal scores: the median survival of the low-score group is longer than the high-score group; similarly, it is not statistically different as indicated by the log rank test p = 0.77.  Frontiers in Neurology | www.frontiersin.org analyses, survival data that show the state of survival and the time of follow-up were considered the dependent variable, whereas continuously expressed values of the 10 RNAs were considered as the independent variables. Kaplan-Meier (KM) plots and receiver operating characteristic (ROC) curves for the prognosis model were constructed using R (survival, survminer, and ggplot2 packages). The Cox proportional hazards regression model was applied to determine OS, and log-rank test was used to compare differences in OS. p < 0.05 was considered statistically significant.

Immune and Stromal Scores Are Associated With Glioblastoma Subtypes and Survival Outcome
Gene expression profiles of 309 GBM cases initially diagnosed between 1989 and 2010, and their associated clinical data, were downloaded from TCGA dataset. Of the 309 GBM cases, 122 (39.5%) were female and 187 (60.5%) were male. Using the ESTIMATE algorithm, we calculated the stromal scores, which ranged from 1,430.1 to 3,920.55, and immune scores,   which ranged from −405.52 to 3,640.43 (Figures 1A,B). The mesenchymal cell subtype had the highest immune and stromal scores, indicating that immune and stromal scores can be used for subtype classification.
To determine the correlation between OS and the two scores, 309 GBM cases were divided into high-and low-score groups based on the immune and stromal scores. KM analysis revealed that patients with high immune scores had lower OS than those with low immune scores (442 vs. 394 days, log-rank test p = 0.15, Figure 1C). Notably, patients with high interstitial scores had shorter median OS (442 vs. 422 days, log-rank test p = 0.77, Figure 1D) than patients with low interstitial scores, although the difference was not significant.

Association of Gene Expression Profiles With Immune Scores and Stromal Scores in Glioblastoma
The data obtained from TCGA database were standardized using the LIMMA package, and the results were visualized on heatmaps and a volcano map (Figures 2A-D). The analysis revealed different gene expression profiles for high/low immunoassays and high/low stromal assays. Subgroup analysis based on immune scores revealed that 860 genes were upregulated and 162 genes were downregulated in the high-score group compared with the low-score group (fold change ≥1.5, p ≤ 0.05). Similarly, 611 genes were upregulated and 37 were downregulated in the high stromal score group. The overlap between genes revealed that 601 were upregulated and 36 downregulated (Figures 2E,F). Interestingly, the DEGs extracted from the immune score comparison were found to have covered most of the genes extracted from the comparison based on stromal scores. Therefore, these DEGs were further analyzed.
To predict the roles of the DEGs, GO function analysis was performed. For the immune score group, the DEGs were enriched in immune response in the biological component, inflammatory response in the cellular component, and plasma membrane and receptor activity in the MF component ( Figure 3A). KEGG pathway analysis identified 57 pathways associated with the DEGs. The top 5 enriched pathways were cytokine-cytokine receptor interaction pathways (59 DEGs), viral protein cross-talk with cytokine and cytokine receptor pathways (34 DEGs), hematopoietic cell lineage (31 DEGs), chemokine signaling pathways (31 DEGs), and tuberculosisrelated pathways (57 DEGs) (Figure 3B).

Impact of the Differentially Expressed Genes on Overall Survival
To determine whether the DEGs were associated with the OS, KM survival curves were designed based on TCGA dataset. Among the 860 DEGs upregulated in the high-immune scores group, 228 DEGs were significantly associated with poor OS (logrank test, p ≤ 0.05, Table 1). The selected genes are shown in Figure 4.

Construction of a Protein-Protein Interaction Network for the Genes With Prognostic Value
To analyze the interaction between identified DEGs, a PPI network was constructed using the STRING (http://string-db. org). A total of 228 genes were included in the PPI network, which contained co-genes, including 207 nodes, 1,481 edges, and a score > 0.25 (Figure 5A). With the use of MCODE analysis, the top 2 significant modules were selected for further analysis (Figures 5B,C, Table 2).

Functional Enrichment Analysis for Genes With Prognostic Value
Consistent with PPI network analysis, functional enrichment clustering of these genes revealed a strong association of the genes and immune response. A total of 115 GO terms of BP, 21 of CC, and 48 of MF were found to be significant (FDR ≤ 0.05,log FDR ≥ 1.301). The top GO terms associated with the genes were immune/inflammatory response, extracellular region, and carbohydrate binding ( Figure 6A). Moreover, KEGG analysis revealed that all pathways were associated with immune response (Figure 6B).

Validation of the Prognostic Value of Genes in the Chinese Glioma Genome Atlas Dataset
To understand whether the genes identified in TCGA database also affect the prognosis of other cases of GBM, we downloaded and analyzed gene expression data from 124 cases of GBM in an independent glioma database, CGGA. A total of 10 genes were validated (Figure 7) to be significantly associated with poor prognosis ( Table 3).

Drug-Gene Interaction and Functional Analysis of the Genes
Drug-gene interaction analysis was performed on the 10 validated genes assigned to the critical gene module 1. The DGIdb analysis identified 186 drugs that interacted with the TABLE 2 | Forty-eight genes in the two modules that obtained from TCGA database.   histamine receptor H1 (HRH1) gene, 22 drugs that interacted with Fc fragment of IgG receptor IIb (FCGR2B) gene, two drugs that interacted with solute carrier family 16 members 3 (SLC16A3), two drugs that interacted with CD163 (CD163 molecule), and three drugs that interacted with coagulation factor III, tissue factor (F3). Of the 215 drugs, 14 targeted FCGR2B and SLC16A3 and exhibited antineoplastic activity and anti-GBM activity ( Table 4).

DISCUSSION
Here, gene expression data from TCGA and survival data of 309 GBM cases were analyzed. A total of 228 genes were found to be differentially expressed between samples with high and low immune or stroma scores. Importantly, 10 genes were validated in GBM patients from CGGA, a separate GBM database (Figure 8). Finally, analysis of gene-drug database identified 14 drugs that interacted with the 10 genes.
Comparison between high and low immune scores groups identified 860 DEGs, many of which were found to be involved in the TME, as revealed by GO term analysis (Figure 3). This finding was consistent with previous reports on the function of immune cells and extracellular matrix (ECM) molecules in the TME of GBM (30)(31)(32)(33). Next, we examined the impact of the 860 genes on the OS of GBM patients. The results showed that 228 genes were associated with poor outcomes. Additionally, we constructed the top 2 PPI modules that were related to immune/inflammatory responses ( Figure 5). Finally, using an independent cohort of 124 GBM patients, the 10 TME-related genes were validated and found to be significantly correlated with the prognosis of GBM ( Table 3). Analysis of the DGIdb database identified 215 drugs that were associated with the genes. Among them, 14 drugs targeted FCGR2B and SLC16A3 genes and exhibited antineoplastic properties.
SLC16A3 regulates the secretion of lactic acid, which maintains pH and the Warburg effect (34). Multiple studies indicate that overexpression of SLC16A3 promotes the growth and proliferation of hypoxic cancer cells and is associated with poor cancer prognosis (35)(36)(37). FCGR2B belongs to the rhodopsin-like G-protein-coupled receptor family. It is expressed in multiple diseases, including systemic lupus erythematosus, non-small cell lung cancer, and IgA nephropathy liver hepatocellular carcinoma (38)(39)(40)(41)(42).
The expression of genes has been reported to influence the OS of GBM, based on findings from cancer cell line experiments, animal tumor models, or patient samples. However, due to the complexity of the GBM microenvironment, a more in-depth analysis should be performed using larger patient cohorts. The rapid development of whole-genome sequencing technology has led to the development of high-throughput, publicly available cancer databases, such as TCGA and CGGA. Other databases such as DGIdb provide a basis for large GBM data analysis (8,(43)(44)(45).
Previous studies have found that various tumor intrinsic genes influence various aspects of the TME (8). Here, we focused on gene features of the TME that influence the development, progression, and the OS of GBM patients. The findings of this study expand our understanding of the complex interaction between GBM and its TME.
In summary, this study found 10 TME-related genes that influence the OS of GBM patients. By using the DGIdb database, several drugs that interact with the genes in GBM were identified. Some of the identified genes can be considered as potential biomarkers of GBM. It would be interesting to find out whether a combination of these genes can better predict the survival rates of GBM patients. Further characterization of the identified drugs is advocated to reveal effective drugs for the treatment of GBM.