An Immune-Related Gene Pairs Signature for Predicting Survival in Glioblastoma

Background: Glioblastoma (GBM) is the frequently occurring and most aggressive form of brain tumors. In the study, we constructed an immune-related gene pairs (IRGPs) signature to predict overall survival (OS) in patients with GBM. Methods: We established IRGPs with immune-related gene (IRG) matrix from The Cancer Genome Atlas (TCGA) database (Training cohort). After screened by the univariate regression analysis and least absolute shrinkage and selection operator (LASSO) regression analysis, IRGPs were subjected to the multivariable Cox regression to develop an IRGP signature. Then, the predicting accuracy of the signature was assessed with the area under the receiver operating characteristic curve (AUC) and validated the result using the Chinese Glioma Genome Atlas (CGGA) database (Validation cohorts 1 and 2). Results: A 10-IRGP signature was established for predicting the OS of patients with GBM. The AUC for predicting 1-, 3-, and 5-year OS in Training cohort was 0.801, 0.901, and 0.964, respectively, in line with the AUC of Validation cohorts 1 and 2 [Validation cohort 1 (1 year: 0.763; 3 years: 0.786; and 5 years: 0.884); Validation cohort 2 (1 year: 0.745; 3 years: 0.989; and 5 years: 0.987)]. Moreover, survival analysis in three cohorts suggested that patients with low-risk GBM had better clinical outcomes than patients with high-risk GBM. The univariate and multivariable Cox regression demonstrated that the IRGPs signature was an independent prognostic factor. Conclusions: We developed a novel IRGPs signature for predicting OS in patients with GBM.


INTRODUCTION
Glioma is the most common and the most aggressive cancer of the central nervous system (1). Grade IV glioma: glioblastoma (GBM) is the major subtype of glioma, accounting for more than 48.6% of glioma cases in the USA (2). Although advanced therapies, like surgery, chemotherapy, targeted therapy, and so on, had been applied for GBM, the 5-year relative survival probability of GBM is only limited to 6.8% (3), and the median lifespan is only 19.2 months (4). Due to these discouraging survival data, the development of more accurate tumor-specific biomarkers of GBM for further constructing novel diagnostic signature and guiding clinical treatment is still a priority in GBM management.
There is emerging evidence showing that the immune system plays an important role in the tumorigenesis, development, and metastasis of cancers (5). Immune cells could identify and eradicate tumor cells by recognizing tumor-associated antigen, a key factor for immune surveillance (6). However, tumor cells could act on the immune system to attenuate the effects of immune surveillance and to promote self-development (7). Immunotherapy, a novel treatment method for tumors based on this theory, has been used for the therapy of many cancers (8). In recent years, two-phase III trials of immunotherapy (NCT02017717 and NCT02617589) have been initiated and raised new hopes for the GBM therapy. Nevertheless, owing to the blood-brain barrier, the central nervous system becomes an immunologically privileged situation, and immune escape challenges the effectiveness of immunotherapy in GBM (9,10). Therefore, the combination management to achieve more efficacious therapeutic benefits is required for complex immune evasion strategies of GBM.
Abnormal gene expression is a common phenomenon in malignant tumors, which may promote the progression of tumors (11). Recently, omics technology and bioinformatics analysis have created an opportunity to identify novel molecular biomarkers and understand potential mechanisms in many tumors. For example, Zhao et al. (12) identified a six-gene risk signature for the outcome prediction of GBM. Several studies have reported immune-related gene pairs (IRGPs) signature in ovarian cancer (13), liver cancer (14), and colorectal cancer (15) and showed well-predictive accuracy. However, up to now, there is no study reporting an IRGP signature in GBM.
In the last two decades, genetic analyses identified point mutation of the gene: isocitrate dehydrogenase (IDH) was an early and common event in gliomagenesis, especially in Grade II and Grade III gliomas (16,17). In recent years, GBM was defined as a diffuse astrocytic glioma without IDH-mutation (18,19). Moreover, IDH-mutation in patients with glioma predicts better overall survival (OS) and influences the infiltration of immune cells and immunity (20)(21)(22), which may bias the analysis in the study. Hence, in the present study, we did not enroll the patients with GBM with IDH-mutation. Herein, we collected the gene expression matrix from The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov) and immunerelated gene (IRG) set from the ImmPort database (https:// www.immport.org/home) to establish IRGPs and an IRGP signature in GBM with IDH-wildtype. Meanwhile, the signature was validated with data from the Chinese Glioma Genome Atlas (CGGA) database (http://www.cgga.org.cn). Then, we explored the relationship between the signature and clinical characteristics. Finally, we proved the predictive effectiveness and accuracy of this signature by comparing the area under the curve (AUC) of this signature with that of clinical indexes.

Data Source and Pre-processing
The gene expression profile and relevant clinical characteristics were downloaded from the databases of TCGA and CGGA. Patients with follow-up time <1 mouth or lacking complete OS would be removed. Three independent cohorts including 514 cases were gathered in the present study [160 cases in Training cohort (TCGA-GBM); 260 cases in Validation cohort 1 (CGGA-693); and 94 cases in Validation cohort 2 (CGGA-301)]. In addition, 472 IRGs were collected from the ImmPort database (https://www.immport.org/home). All raw data from CGGA were transformed with log 2 .

Establishment of an IRGP Signature
As described in the previous study (23), we performed the pairwise comparison in all samples. Score 1 represented that the expression level of first IRG was higher than that of second IRG in a specific IRGP, and score 0 represented the opposite. Additionally, an IRGP would be deserted if the score of which was 0 or 1 in over 80% of samples. Then, in the Training cohort, the univariate regression analysis with p-value <0.001 as the threshold was used to preliminary filtrate IRGPs. Later, the least absolute shrinkage and selection operator (LASSO) regression analysis with iteration = 1,000 and 10-fold cross-validations to prevent overfitting were used to screen up the remaining IRGPs. Ultimately, with p-value <0.05 as the cut-off criterion, we applied multivariate Cox regression analysis to determine top OS-related IRGPs, which would be used to develop an IRGP signature and immune risk-score formula for calculating risk score. Based on the risk cut-off score: the median value, patients were classified into two subgroups (high-and low-risk groups).

Evaluation and Validation of the IRGPs Signature
In the same way, the risk score of each case in Validation cohort 1 and Validation cohort 2 was also calculated. The receiver operating characteristic curve (ROC) was carried out to evaluate the predictive capacity of this IRGPs signature by calculating the AUC.

Difference of Tumor-Infiltrating Immune Cells Between Two Subgroups
As described previously (24), the CIBERSORT algorithm was utilized to assess the landscape of 22 tumor-infiltrating immune cells (TIICs) in the tumor microenvironment. Cases with p < 0.05 were selected for further research. The distribution of 22 TIICs between subgroups was also pictured.

Gene Set Enrichment Analysis
In the Training cohort, the gene set enrichment analysis (GSEA) was used to assess the potential biological pathways related to this IRGPs signature. False discovery rate (FDR) <0.01 was chosen as the cut-off criterion. Table 1 presents all clinical and pathological information of patients with GBM in three cohorts. For categorical data, the differences among different groups were compared with the chi-square test, whereas, for measurement data, the differences were compared with the t-test. The univariate and multivariate Cox regression analyses were used to determine independent prognosis-related factors. Survival difference was compared with the logrank test and shown with the Kaplan-Meier method. All statistical analyses were performed using R 3.6.3 (https://www.rproject.org).

Establishment of the IRGPs Signature
A total of 19,916 IRGPs with score of 0 or 1 in <80% were paired. With the univariate regression analysis, 48 IRGPs were filtered in the Training cohort. Then, the LASSO regression analysis screened out 24 IRGPs for further analysis ( Figure 1A). Finally, 10 IRGPs were identified and used to develop an IRGP signature ( Table 2 and Figure 1B). Risk score = (−0.4987 × Based on the cut-off risk score (1.089), patients were divided into two subgroups (high-and low-risk groups). The distribution of risk score and survival status of the IRGPs signature in Training cohort were shown in Supplementary Figure 1A. A nomogram combining risk score and clinical characteristics was also constructed for the prediction of 1-, 3-, and 5-year survival probability ( Figure 1C).

Evaluation and Validation of the IRGPs Signature
We calculated the AUC of this IRGPs signature in the Training cohort to assess the predictive accuracy and effectiveness of the IRGPs signature. The AUC of IRGPs signature in the Training cohort for 1-, 3-, and 5-year OS was 0.801, 0.901, and 0.964, respectively (Figure 2A), which were markedly higher than the AUCs of clinical indexes ( Supplementary Figures 2A-C). Also, we calculated the immune risk score of each patient in the Validation cohort 1 and Validation cohort 2 and divided them into high-and low-risk groups according to the cut-off risk score. Here, we constructed a signature with 10 IRGPs, which included 16 IRGs (BMP2, CHGB, FCGR2B, FGF12, IL10RA, LEFTY2, SLC11A1, IL13RA2, NOV, OXTR, GDF11, PRKCB, FGFR2, OSMR, MSTN, and TRIM22). Among 16 genes, the expressions of 9 genes (CHGB, FCGR2B, IL10RA, SLC11A1, IL13RA2, NOV, OXTR, OSMR, and LEFTY2) in the high-risk group was significantly upregulated, and the expressions of two genes (FGFR2 and MSTN) were opposite (Supplementary Figure 3).

Association Between the IRGPs Signature and Clinical Indexes
Clinicopathological data, including age, gender, tumor molecular subtype, KPS (Karnofsky), recurrent, radiotherapy, and chemotherapy, were collected. As shown in Figure 6A, the distribution of age (p = 0.047) and tumor subtype (p = 0.036) between high-and low-risk groups was significantly different. In  addition, the risk scores of patients aged ≥50 were significantly increased (p = 0.034, Figure 6B). An analogous phenomenon was observed in patients with mesenchymal GBM (p = 0.004, Figure 6F). Compared with patients without chemotherapy, patients treated with chemotherapy had lower risk scores ( Figure 6H). However, there was no difference in the risk scores between female patients and male patients (p = 0.790, Figure 6C), patients with KPS ≤70 and KPS >70 (p = 0.120, Figure 6D), primary tumor and recurrent tumor (p = 0.600, Figure 6E), and, patients treated with radiotherapy and without radiotherapy (p = 0.180, Figure 6G).

Gene Set Enrichment Analysis
The results of GSEA illustrated that a total of 22 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway-related gene sets were enriched. Among them, a few pathways were closely correlated with the immune system, such as "cytokine-cytokine receptor interaction, " "chemokine signaling pathway, " "Janus Kinase (JAK)-signal transducer and activator of transcription (STAT) signaling pathway, " "complement and coagulation cascades, " "Nucleotide-binding oligomerization domain (NOD)-like receptor signaling pathway, " and "intestinal immune network for IgA production" (Figure 8).

DISCUSSION
Glioblastoma is the frequently occurring and most aggressive malignancy in the central nervous system (25). The 5-year survival rate of patients with GBM after diagnosis is still limited to 6.8% (3). It is important to seek the tumor-specific markers and build risk stratification for the assessment of prognosis in GBM, which may facilitate the development of novel strategies for the diagnosis and therapy of GBM.
The immune system is a considerable influencing factor for the progression of cancers (5). Immunotherapy as a novel treatment method has revolutionized cancer therapy (8), which also provides new hope for the treatment of GBM. In this study, we downloaded the IRG expression matrix from the TCGA database to establish IRGPs and develop an IRGP signature for the prediction of clinical outcomes of patients with GBM. Based on the cut-off value, patients were segmented into two subgroups. The survival analysis showed that patients with high-risk scores had a more unfavorable prognosis. In addition, the univariable and multivariate Cox regression analysis in three cohorts demonstrated the IRGPs signature was an independent prognostic factor in patients with GBM. However, the HR of univariable and multivariate Cox regression analysis in the Validation cohort 2 was 19.023 and 21.331, respectively, substantially higher than that in the Training cohort and Validation cohort 1. It may be caused by the patients with lowproportion low-risk scores in the Validation cohort 2 (high-risk score:low-risk score = 71: 23).
Later, we estimated the predicted capacity of this IRGPs signature. The AUCs of the IRGPs signature for predicting OS probability in the Training cohort reached 0.800 at 1 year, 0.901 at 3 years, and 0.948 at 5 years, which were significantly superior to the AUCs based on the clinical characteristics, such as age, gender, chemotherapy, recurrent, and so on ( Supplementary Figures 2A-C). The AUCs for 1-, 3-, and 5year OS in the Validation cohorts 1 and 2 were similar to that in the Training cohort and were also obviously higher than that of clinical indexes (Supplementary Figures 2D-I). All data demonstrated that the IRGPs signature was suitable to predict 1-, 3-and 5-year OS in GBM. Recently, several studies explored prognostic biomarkers and evaluated the prognosis of GBM with immune-related features. For example, Zhang et al. (26) used a random forest algorithm to screen the tumor microenvironment (TME)-related genes and developed a TME score with five genes using the multivariate Cox regression analysis. The Kaplan-Meier plot showed a significant survival difference between patients with high-and low-TME scores. A study utilized the univariable Cox regression analysis and LASSO regression analysis to establish an immune-based prognostic scoring model for predicting survival probability in GBM with the AUC = 0.657 at 1 year, 0.667 at 3 years, and 0.667 at 5 years (27). Another study constructed a Foxp3-related immune prognostic signature in patients with GBM using the LASSO regression analysis (28). The AUC for predicting 1-and 3-year OS was 0.633 and 0.695, respectively. In 2018, Zhou et al. (29) performed the survival analysis to select prognostic immune-related-lncRNA and then carried out the multivariate Cox regression analysis to develop a six immune-related lncRNA prognostic risk model. The AUCs of the model for predicting 5-year OS was 0.842. All of those studies illustrated the wellpredictive ability of immune-related features. Different from previous studies, herein, we constructed IRGPs in GBM and applied the univariable Cox regression analysis and LASSO regression analysis to preliminary filtrate IRGPs. Finally, we executed the multivariate Cox regression analysis to further screen and develop IRGPs signature. Up to now, this is the first time using IRGPs, a novel immune-related feature, to build a signature for predicting OS probability in GBM. The AUCs of the signature were obviously superior to that in the study of Qin et al. (27) and Guo et al. (28) and slightly preferable to the AUC in the study of Zhou et al. (29), showing the IRGPs signature is robust and reliable.
The IRGPs signature consisted of 10 gene pairs, including 16 IRGs (BMP2, CHGB, FCGR2B, FGF12, IL10RA, LEFTY2,  SLC11A1, IL13RA2, NOV, OXTR, GDF11, PRKCB, FGFR2, OSMR, MSTN, and TRIM22). Between low-and high-risk groups, the expression of 11 genes (CHGB, FCGR2B, IL10RA, SLC11A1, IL13RA2, NOV, OXTR, OSMR, LEFTY2, FGFR2, and MSTN) was with statistical difference (Supplementary Figure 3). Fibroblast growth factor receptor 2 (FGFR2) and fibroblast growth factor 12 (FGF12) are the members of the fibroblast growth factor family, which is involved in the activation of the Kirsten Rat Sarcoma (RAS)-Mitogen-Activated Protein Kinase (MAPK) and the Threonine Kinase (AKT) and the PI3K-AKT pathway, and the cell proliferation, differentiation, and apoptosis of cancers (30). The studies demonstrated that they could influence tumor development via regulating macrophages, enhance PD-1 and PD-L1 expression, and promote inflammation (31)(32)(33)(34). Fc fragment of IgG receptor IIb (FCGR2B) encoded a receptor for the Fc region of immunoglobulin gamma complexes. It is involved in several regulatory functions, such as modulating B cells producing antibodies and the phagocytosis of immune complexes (35). Moreover, it could bring about the downmodulation of cell activation triggered through regulating antigen receptors on B cells and T cells and upregulation of IgG responses, which contribute to humoral immune responses (35,36). Protein kinase C beta (PRKCB) is a kind of serinespecific protein kinases, and it is involved in various cellular signaling processes, such as the activation of B cells and cell proliferation (37). Moreover, it also could mediate the differentiation of dendritic cells (37). Tripartite motif-containing 22 (TRIM22) and solute carrier family 11 member 1 (SLC11A1) participate in host resistance to the pathogen and interferonmediated antiviral effects (38)(39)(40). SLC11A1 also participates in the activation of macrophages to promote macrophage immune effector functions, such as inhibiting immune inflammation and altering the processing and presentation of antigens by enhancing the autoimmune T cell (39,40). TRIM22 could regulate macrophage autophagy (38). Interleukin 10 receptor subunit alpha (IL10RA), interleukin 13 receptor subunit alpha 2 (IL13RA2), and oncostatin M receptor (OSMR) are cytokine receptors and could mediate the immunosuppressive signal and the synthesis of proinflammatory cytokines. Previous studies showed IL10RA was highly related to the clinical outcomes of colorectal cancer and melanoma (41,42). IL13RA2 is overexpressed in over 75% of patients with GBM and related to prognosis and participates in the downregulation of immune response mediated by helper T cells (43,44). Immunization with IL13RA2 DNA vaccine could inhibit tumor growth and induces tumor immunity (45). OSMR could transduce signaling induced by oncostatin M, IL6, and IL31 and regulated immune cellmediated metabolism (46). Basic research showed that OSMR contributed to the adjusting of extracellular matrix process and local immune response in GBM, which was highly associated with the development of GBM (47). Chromogranin B (CHGB), oxytocin receptor (OXTR), and cellular communication network factor 3 (CCN3) also belong to cytokine and cytokine receptor, which play a vital role in the homeostasis of innate and adaptive immunity and remarkably affect anti-tumor immunity through activating cytotoxic T cells and cancer immunotherapy (48,49). Bone morphogenetic protein 2 (BMP2), myostatin (MSTN), growth differentiation factor 11 (GDF11), and left-right determination factor 2 (LEFTY2) belong to the transforming growth factor-beta superfamily, which has pleiotropic effects on cell proliferation, differentiation, invasion, metastasis, and apoptosis of cancer cells. It also influences inflammation, neurogenesis, and immune response via regulating immune cell differentiation and the activity of immune components (50)(51)(52).
Nowadays, emerging studies show treatment targeting the tumor microenvironment is an inspiring way to overcome therapeutic escape issues (53). Herein, we depicted the landscape of immune cells in the tumor microenvironment of GBM and found macrophages was the highest proportion immune cell, which was in line with previous findings (53,54). Furthermore, we identified that the proportion of macrophages M1 in the lowrisk group was significantly increased, whereas, macrophages M0 was opposite. After induced by lipopolysaccharide and interferon-γ stimulation, macrophages M0 may polarize into M1 that could promote GBM cell proliferation, invasion, and metastasis (55)(56)(57). Additionally, in the high-risk group, the fractions of Tregs were also significantly increased. The accumulation of Tregs in the tumor microenvironment functions as immunosuppression via inhibiting proliferation and degranulation of other immune cells (58). Increased proportion of Tregs was highly associated with tumor recurrence and decrease of survival and efficacy of immunotherapeutic strategies in GBM (59).
In this study, the IRGPs signature demonstrated an excellent predictive capability for patients with GBM. However, there are still several limitations that need to be improved. (1) As all cases were obtained from open databases, the possibility of selection bias cannot be excluded. (2) The IRGPs signature was developed based on RNA-seq data and microarray expression. It is costly and time-consuming. (3) Due to all samples were collected from the public database, the potential selection bias could not be eliminated, and some clinical information, such as treatments, molecular subtype, and so on, were missing in some samples, which may result in information bias. Meanwhile, the study was a retrospective study. Hence, validation of prospective samples, especially samples with complete clinical information, was still needed. (4) In the Training cohort, we did not obtain the clinical information: the status of IDH-mutation, which may present in enrolled patients with GBM, especially the patients with recurrent tumor. (5) No experimental studies were performed to testify the finding in this study. Thus, further basic experiments are warranted to validate the finding of this research.

CONCLUSIONS
In conclusion, in this study, we constructed a 10-IRGPs signature for predicting 1-, 3-, and 5-year OS in GBM. The IRGPs signature would be an effective and convenient means to predict the clinical outcome of GBM and provide a novel way for risk assessment.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study, this data can be found at: The Cancer Genome Atlas (https://portal.gdc. cancer.gov/).

AUTHOR CONTRIBUTIONS
SW: data collection, data analysis, and writing. XX: study design, data analysis, and revised the manuscript. All authors reviewed the manuscript.