Development and Validation of an Immune-Related Prognostic Signature in Cervical Cancer

Background Cervical cancer is the fourth most frequent gynecological malignancy across the world. Immunotherapies have proved to improve prognosis of cervical cancer. However, few studies on immune-related prognostic signature had been reported in cervical cancer. Methods Raw data and clinical information of cervical cancer samples were downloaded from TCGA and UCSC Xena website. Immunophenoscore of immune infiltration cells in cervical cancer samples was calculated through the ssGSEA method using GSVA package. WGCNA, Cox regression analysis, LASSO analysis, and GSEA analysis were performed to classify cervical cancer prognosis and explore the biological signaling pathway. Results There were eight immune infiltration cells associated with prognosis of cervical cancer. Through WGCNA, 153 genes from 402 immune-related genes were significantly correlated with prognosis of cervical cancer. A 15-gene signature demonstrated powerful predictive ability in prognosis of cervical cancer. GSEA analysis showed multiple signaling pathways containing Programmed cell death ligand-1 (PD-L1) expression and PD-1 checkpoint pathway differences between high-risk and low-risk groups. Furthermore, the 15-gene signature was associated with multiple immune cells and immune infiltration in tumor microenvironment. Conclusion The 15-gene signature is an effective potential prognostic classifier in the immunotherapies and surveillance of cervical cancer.


INTRODUCTION
According to estimates from GLOBOCAN 2018, cervical cancer was the fourth most common cancer among women, with approximately 570,000 new cases and 311,000 deaths (1). Cure rate for earliest stages is more than 90%, whereas locally advanced lesions treated with multimodality therapy can only achieve 65% cure rate in stage II lesions and 55% in stage III lesions (2). Therapies to improve survivorship are in desperate need for cervical cancer especially locally advanced disease.
With rapid development of precision medicines, novel therapeutic strategies, especially immunotherapies, have been proposed to significantly improve clinical outcomes of cervical cancer (3,4). Harnessing an antitumor immune response has been a fundamental strategy in cancer immunotherapy. A paradigm shift has appeared in cancer immunotherapy: from traditional immune enhancement with low-objective responses and frequent adverse events to more effective and less toxic reactions immune normalization (5,6).
Cancer immunotherapies, such as administration of the cytokine Interleukin-2 (IL-2), adoptive cell transfer, and the checkpoint modulators CTLA-4 and PD-1, have proved effective in clinical practice (6). Blockade of the checkpoint modulators cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) and programmed cell death protein-1 (PD-1) starts the field of immune normalization in immunotherapy. Upregulated PD-1 in tumor microenvironment inhibits an effector T cell antitumor immune response, and therapies blocking this pathway have proven effective against multiple tumor types (5). Anti-PD therapy perform antitumor immunity mainly through the following three principles:1) targeting a tumor-induced immune escape mechanism, 2) selectively modulating immunity in the tumor microenvironment, and 3) resetting immunity in the tumor microenvironment (5). It has been proved that patients with high mutation burden and burden of potential neoepitopes benefit more from immunological checkpoint blockade (7)(8)(9)(10)(11). Immunotherapy plays a dispensable role in management of cervical cancer. In KEYNOTE-158, pembrolizumab has been approved by the US Food and Drug Administration for use in advanced cervical cancer with progressive disease either during or after chemotherapy (12). The objective response rate was 26.3% with a disease control rate of 68.4% for immunotherapy in cervical cancer (4). However, only small proportion patients benefited from immunotherapy, and this proportion will hopefully increase with better patient selection and combination therapy. Hence, it is urgent to find potential biomarkers for prediction of response to checkpoint immunotherapy and the rationale for the use of checkpoint immunotherapy.
In the present study, we qualify immune cells infiltration in cervical cancer and analyze the correlation between immune cells and cancer prognosis. Hub genes regulating prognosis through immune infiltration in cervical cancer were identified by weighted gene co-expression network analysis (WGCNA) and least absolute shrinkage and selection operator (LASSO). It was suggested by infiltrated immune cells and pathway enrichment analysis that our immune-related signature was closely related to tumor prognosis and could predict response of immunotherapy. A robust immune-related prognostic signature based on transcriptomics in cervical cancer was constructed and validated.

Data Source and Processing
The gene expression profiles and clinical information of cervical cancer were downloaded from The Cancer Genome Atlas (TCGA) Genomic Data Commons Data Portal (https://portal. gdc.cancer.gov/). Patients with pathologically confirmed cervical cancer and complete information about transcriptomics overall survivals (OSs) were included in this study. Finally, a total of 304 primary cervical cancer samples, two metastatic cervical cancer samples, and three normal cervix samples from TCGA were analyzed in our study. Details on clinical information of the included samples were summarized in Table 1. A workflow of this study was indicated in Figure 1.

Infiltration of Immune Cells
Tumor-infiltrating immune cells can be quantified from RNA sequencing (RNA-seq) data of human tumors using bioinformatics approaches. Single-sample gene set enrichment analysis (ssGSEA) calculated and qualified the infiltration of immune cells through RNA-seq data. The infiltration of 28 immune cells was obtained. Immunophenoscore (normalized enrichment score) of each TCGA cervical cancer sample was calculated through ssGSEA. The ssGSEA ranks the genes by their absolute expression in a sample and computes enrichment score by integrating the differences between the empirical cumulative distribution functions of the gene ranks (13).

Survival Analysis
Univariate Cox regression analysis was performed to identify the association between cervical cancer survival and immunophenoscore of infiltrated immune cells. Forest plot was drawn to demonstrate the influence of infiltrated immune cells on survival. The best separation statistic was performed by use of "survminer" package that divides gene expression into high groups and low groups based on best separation. Next, Kaplan-Meier curve was made to further analyze relationship between survival and infiltration immune cells.

Construction of Weighted Gene Co-Expression Network Analysis
WGCNA is an algorithm for finding genetic interactions in a weighted manner. It is used to build a gene co-expression networks to mine network modules closely associated with clinical traits through systematic biological method (14). In this study, immunophenoscore of the infiltrated immune cells was regarded as target clinical traits. Genes expressing not available (NA) were removed. The top 25% genes with most median absolute deviation used as a robust measure of variability were selected for WGCNA analysis (15,16).
Using the WGCNA function adjacency, an adjacency matrix is constructed by computing the Pearson correlation between all pairs of genes in the selected sample. Genes were divided into different gene modules based on the dissimilarity measure. A hierarchical clustering tree was constructed with different branches of the tree representing different gene modules (15)(16)(17).
The WGCNA network was built. There were eight immune cells associated with cervical cancer survival. The magenta module was the module that most related to infiltration of immune cells. A total of 402 genes contained in magenta module were analyzed for survival using the standardized expression data FPKM. Finally, we found that 153 genes were associated with survival (p < 0.05) in magenta module.

Screening Hub Genes by LASSO and Survival Analysis
The 153 genes with the highest correlation associated with survival were picked out from WGCNA analysis. Then, hub genes were further screened from the 153 genes by the use of LASSO. Survival analysis of hub genes was performed using "survival" and "survminer" packages to verify whether differently expressed genes affected tumor prognosis. Data of cervical cancer from University of California, Santa Cruz (UCSC) Xena were downloaded and applied to compare the expression of hub genes in normal cervix and cervical cancer.

Construction of Prognostic Scoring Model
Patients with cervical cancer in TCGA were randomly divided into two groups by stage: the training group of 70% and the test group of 30% by the use of caret R package. Finally, a total of 199 patients were used as training group and 74 patients were regarded as test group. The selected key genes for support vector machine analysis were used to fit a LASSO Cox-proportional hazards (Cox-PH) model for selecting an optimal panel of predictive genes with penalized package of R (18). Optimal lambda value was computed through a 10 cross-validations. Next, Cox-PH coefficients and infiltration of immune cells levels of these selected genes were used to calculate prognostic score as follows: Risk Score = S (coef gene × immunophenoscore gene ), where Coef gene and immunophenoscore gene suggest Cox-PH coefficient and immunophenoscore level of a gene, respectively.

Validation of Prognostic Scoring Model
We calculated risk score of every patient in the training and test group by the model. We separated the training group and test group into a high-risk group and a low-risk group with the  median risk score as cutoff, respectively. Kaplan-Meier curve was applied to obtain and compare the OS time of two risk groups. ROC (operating characteristic curve) was performed to evaluate the predictive accuracy of the model.

Comparison of Eight Immune Cell Subtypes Between High-Risk and Low-Risk Groups
To explore the differences of immune cell subtypes between high-risk and low-risk groups, the eight immune cell subtypes associated with OS in cervical cancer were assessed in test and train cohorts. Mann-Whitney U-test was used to compare differences in immune cell subtypes in the high-risk and lowrisk groups.

Gene Set Enrichment Analysis (GSEA)
All 304 patients with primary cervical cancer were evaluated by prognostic scoring model and then divided into high-risk group and low-risk group based on the standard cutoff. The global gene expression was analyzed and displayed with volcanic maps using the limma package in R. GSEA was conducted respectively to search "all gene sets" enriched in the samples with high-risk group and low-risk group. The differentially expressed genes were enriched in PD-L1 pathway functional sets (19).

Quantify Immune Cell Infiltration and Survival Analysis
We used ssGSEA to quantify mRNA data for immune cell infiltration. Finally, 28 infiltrating immune cells were included. Immunophenoscores of 304 primary cervical cancer samples, two metastatic cervical cancer samples, and three normal cervix samples for 28 immune cells were calculated and demonstrated in Figure 2A. Univariate Cox regression analysis was performed to identify association between immune cells and OS of cervical cancer, and the results are shown in Figure 2B. It indicated that eight immune cells were correlated with OS significantly. Moreover, these eight immune cells (activated B cell, activated CD8 T cell, eosinophil, monocyte, activated CD4 T cell, effector memory CD8 T cell, immature B cell, and plasmacytoid dendritic cell) were all protective factors for OS. Kaplan-Meier curve validated that high expression of these eight immune cells were related to longer OS time suggested worse prognosis ( Figure 3).

The Weighted Gene Co-Expression Network Analysis Construction and Key Module Identification
The top 25% of the gene expression of variance was screened by the quartile of gene expression level, and 609 genes were screened out to construct co-expressed gene networks and the sample dendrogram and trait heatmap are constructed ( Figure 4A). In this study, the power of b = 8 was selected to ensure a scale-free network ( Figures 4B, C) (scale-free R 2 = 0.9, slope = −1.65).
After merging the modules with the high similarity of feature genes in the gene cluster dendrogram through a cutline (0.25) ( Figure 5A), nine modules were identified by the dynamic tree cut method. The clustering dendrograms of genes were shown in Figure 5B. A heatmap illustrating the correlation between immunophenoscores of infiltrated immune cells and key genes in the module was created ( Figure 5C).
It was obviously that magenta module was the most correlated with immune infiltration in the heatmap. Hence, magenta module was selected as the clinical significant module for further analysis. We found that the correlation coefficient between magenta module and 11 immune infiltration cells was ≥ 0.7, which suggested strong correlation. This indicates that the genes in this module are most relevant to tumor OS. All 402 genes in magenta module were analyzed for survival of cervical cancer. Through univariate Cox regression survival analysis, 153 genes found to be significantly correlated with prognosis of cervical cancer were selected for further analysis.

Screening the Key Genes and Survival Analysis
To develop a prognostic scoring model based on WGCNA, the 153 key genes were used to fit the LASSO Cox-PH model. Using parameter lambda (0.038) obtained upon performing 1000 crossvalidations, a combination of 15 genes was obtained. These 15 genes were as follows: LAG3, CD74, CCL22, CH25H, OLR1, MIAT, BATF, IKZF3, TRARG, ACSL6, C11orf21, GTSF1, APOL1, CD1C, and LINC00158 (Table 2 and Figure 6). Risk score of each sample in the training set was calculated. Then, the samples in the training cohort were divided into high-and lowrisk groups according to median risk score. As demonstrated in Figure 6, the high-risk group patients in the training cohort have shorter OS time than the low-risk group (p < 0.001, HR: 5.1, 95% CI: 3.4-7.5), with an AUC of 0.803 in 1 year, 0.809 in 3 years, and 0.800 in 5 years. Next, the 15-gene signature was then verified by the validation cohort from TCGA. Consistently, the high-risk group patients in the validation cohort had worse prognosis than the low-risk group (p < 0.001, HR: 1.3, 95% CI: 1.1-1.4), and the AUC was 0.845 in 1 year, 0.703 in 3 years, and 0.761 in 5 years ( Figure 7). All these suggested that the 15-gene signature associated with immune infiltration is able to predict prognosis in patients with cervical cancer.

Immune Cell Subtypes Between High-Risk and Low-Risk Groups
The expression levels of the 15 genes in test and validation cohorts are shown in Figure 8A. Different immune scores had differential OS in patients with cervical cancer. In both test cohorts and train cohorts, these eight immune cell subtypes (activated B cell, activated CD8 T cell, eosinophil, monocyte, activated CD4 T cell, effector memory CD8 T cell, immature B cell, and plasmacytoid dendritic cell) expressed differentially in high-risk and low-risk groups. (Figures 8B, C).

GSEA Analysis
All 304 primary cervical cancer samples were divided into highrisk group and low-risk group based on the 15-gene signature.
To identify potential function of the 15 key genes, GSEA was conducted respectively to search "all gene sets" enriched in all 304 samples. GSEA showed 33 significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway associated with risk score, including PD-L1 expression and PD-1 checkpoint pathway in cancer ( Figure 9 and Table 3). These results validated that the key genes in clinical significant module were mainly involved in the regulation of immune system. From GSEA, the 15-gene signature obviously participated in regulation of PD-1/PD-L1 pathway in cervical cancer.

DISCUSSION
Cervical cancer represents a major public health problem even in developed countries such as United States, with 14,000 new cases     (20). Increasing incidence of cervical adenocarcinoma and attenuation of earlier declines for cervical squamous cell carcinoma emphasize the importance of the need for improved therapeutic options to reduce the burden of cervical cancer (21). The most important risk factors affecting prognosis of cervical cancer are stage, status of the lymph nodes, tumor volume, depth of tumor invasion into the cervical stroma, and Lymphovascular space invasion (LVSI) (22). The 5-year OS decreased significantly with rising stages (23). A substantial percentage of patients with advanced cervical cancer will undergo recurrence and poor prognosis (2,24), and the recurrence rate fluctuates from 10% to 74% (25). Tumor stage was one of the most pivotal factors related to recurrence: 10% for stage IB, 17% for stage IIA, 23% for stage IIB, 42% for stage III, and 74% for stage IVA (26). Prognosis of metastatic and recurrent cervical cancer was extremely poor with a median survival time of 12 months (27). Therefore, improved therapeutic options in management of cervical cancer, especially locally advanced cervical cancer is in urgent need. High-risk subtypes of the human papilloma virus (HPV) are the cause of cervical cancer (28,29). Viral oncoproteins E6 and E7 leads to dysregulation of p53 and Hypoxia-inducible factor-1 (HIF-1) alpha,   thus affecting cell cycle proteins and VEGF expression (28)(29)(30)(31)(32). HPV is highly immunogenic and elicits immune responses in humans; thus, immune response might play important roles in carcinogenesis of cervical cancer. Incidence of cervical cancer was substantial declined in countries with high HPV vaccine coverage (33)(34)(35). Immunotherapy fights against tumor cells through activating endogenous immune response, which seems to switch on the new frontier of the anticancer treatment (36). Immunotherapy includes different approaches, such as active immunotherapy (vaccine), passive immunotherapy (adoptive cellular transfer, antibodies, and cytokines), and immunomodulation (cyclooxygenase 2 inhibitor). Discovery of immune checkpoint such as CTLA-4 and PD-1 plays an indispensable role in the development of cancer immunotherapy. It was surprising that immune checkpoint inhibitors anti-CTLA-4 and anti-PD-1 displayed enormous success in solid tumors (37). Similar to other solid tumors, novel immunotherapeutic approaches, such as immune checkpoint inhibitors, have shown encouraging results in cervical cancer (12). Implementing immunotherapeutic approaches earlier in advanced cervical cancer would seem to be most appropriate. However, the objective response rate with anti-PD-1/PD-L1 monotherapy is hovering at 20%. Moreover, immune-related toxicities and severe adverse effects can occur during PD-1/PD-L1 blockade therapy (38). PD-1/PD-L1 blockade therapy does not demonstrate efficacy in almost 80% of patients with cervical cancer, suggesting that the potential mechanism of PD-1/PD-L1 in immunotherapy remains to be further clarified. Thus, new immune checkpoint inhibitors or comprehensive understanding of specific mechanism underlying PD-1/PD-L1 regulation in carcinogenesis is in urgent need.
Immune infiltration of cervical cancer determines the immune activation of tumor microenvironment and is related with clinical outcome of patients. In this study, immunophenoscore of 28 infiltrated immune cells in TCGA cervical cancer was calculated. Univariate Cox regression analysis showed that activated B cell, activated CD8 T cell, eosinophil, monocyte, activated CD4 T cell, effector memory CD8 T cell, immature B cell, and plasmacytoid dendritic cell were strongly associated with OS of cervical cancer. WGCNA revealed that magenta module was the most relevant module to immune infiltration. In total, we identified 10 kinds of infiltrated immune cells with strong correlation with magenta  CD223) is the third clinically targeted inhibitory receptor (39). CD74 (invariant chain) plays a dispensable role in process of immune systems that it participates in antigen presentation, B-cell differentiation, and inflammatory signaling. CD74 has the potential to be a therapeutic target in cancer and autoimmune disease (40). The chemokine CCL22 promoted regulatory T cell communication with dendritic cells to control immunity and was associated with poor prognosis (41,42). CH25H produces 25-hydroxycholesterol, which inhibited tumor-derived extracellular vesicles uptake and correlated with prognosis in patients with melanoma (43). OLR1 (oxidized low-density lipoprotein (LDL) receptor 1) was a possible link between obesity, dyslipidemia, and cancer. OLR1 played carcinogenic role by activating Nuclear factor kappa B (NF-kB) pathway to promote proliferation and migration and to inhibit apoptosis and de novo lipogenesis (44). MIAT Long non-coding RNAs (lncRNA) was overexpressed in a number of malignancies and caused poor prognosis (45)(46)(47). Basic leucine zipper transcription factor, ATF-like (BATF) was an important transcription factor regulating differentiation of early effector CD8 + T cells (48) and was a prognostic indicator for patients with colon cancer (49). IKZF3 promoted growth of multiple tumors to cause poor prognosis (50). Immune infiltration played important roles in the survival of cervical cancer. In this study, we identified 15 immune infiltration associated genes in cervical cancer and built a prognostic signature. Immune scores depended on the expression of these 15 genes and were associated with the survival of cervical cancer. High immune scores meant good prognosis. GSEA analysis showed that the 15-gene prognostic signature was obviously associated with PD-L1 expression and PD-1 checkpoint pathway in cancer. The prognostic signature could provide basis for potential immunotherapy in the future. Similarly, other researchers also constructed prognostic signatures for cervical cancer based on immune-related genes (51,52). Compared with our study, although these studies use different types and different quantity of genes, all the signatures can well predict the prognosis of cervical cancer. However, the study has several limitations. First, no in vitro or in vivo molecular experiment was performed to verify our analysis. Second, our study was a retrospective study. Thus, prospective study is in need to validate the findings of our study in the future.

CONCLUSION
In conclusion, we successfully constructed a 15-gene prognostic signature with powerful predictive function. Differences in the OS of high-and low-risk groups are implicated in immune infiltration, tumor microenvironment, PD-L1 expression, and PD-1 checkpoint pathway. These findings revealed the underlying mechanism of immunotherapy and provided basis for cervical cancer pathogenesis and clinical treatment.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
RS is responsible for experimental design. CWJ and HB are responsible for instrument operation. JX is responsible for data analysis. LZ and CJJ are for providing overall ideas. All authors contributed to the article and approved the submitted version.