Development and Verification of an Immune-Related Gene Pairs Prognostic Signature in Hepatocellular Carcinoma

With the increasing prevalence of Hepatocellular carcinoma (HCC) and the poor prognosis of immunotherapy, reliable immune-related gene pairs (IRGPs) prognostic signature is required for personalized management and treatment of patients. Gene expression profiles and clinical information of HCC patients were obtained from the TCGA and ICGC databases. The IRGPs are constructed using immune-related genes (IRGs) with large variations. The least absolute shrinkage and selection operator (LASSO) regression analysis was used to construct IRGPs signature. The IRGPs signature was verified through the ICGC cohort. 1,309 IRGPs were constructed from 90 IRGs with high variability. We obtained 50 IRGPs that were significantly connected to the prognosis and constructed a signature that included 17 IRGPs. In the TCGA and ICGC cohorts, patients were divided into high and low-risk patients by the IRGPs signature. The overall survival time of low-risk patients is longer than that of high-risk patients. After adjustment for clinical and pathological factors, multivariate analysis showed that the IRGPs signature is an independent prognostic factor. The Receiver operating characteristic (ROC) curve confirmed the accuracy of the signature. Besides, gene set enrichment analysis (GSEA) revealed that the signature is related to immune biological processes, and the immune microenvironment status is distinct in different risk patients. The proposed IRGPs signature can effectively assess the overall survival of HCC, and provide the relationship between the signature and the reactivity of immune checkpoint therapy and the sensitivity of targeted drugs, thereby providing new ideas for the diagnosis and treatment of the disease.


INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors and the leading cause of cancer-related deaths, and its morbidity and mortality are increasing (Forner et al., 2018;Villanueva, 2019). According to the latest cancer statistics in 2019, approximately 42,030 people in the United States are diagnosed with liver cancer each year, and 31,780 people die from the disease (Siegel et al., 2019). Although breakthroughs in the diagnosis and treatment of HCC have been made in recent years, the prognostic outcome of patients remains poor (Yang et al., 2019). The current clinical application of immunotherapy in HCC has benefited some patients, but this approach has not effectively improved the prognosis, and the long-term survival rate is still very poor (Dong et al., 2020). Due to the complexity and heterogeneity of HCC, individualized decision-making plans are required in diagnosis and treatment. Therefore, it is necessary to identify the novel prognostic signature of HCC and use it to guide clinical treatment as an effective way to improve the prognosis of patients.
Understanding the characteristics of tumor immune cell infiltration can improve the responsiveness of immunotherapy and is of great help in understanding the mechanism of cancer occurrence (Ino et al., 2013;Binnewies et al., 2018). The landscape of immune cells in HCC mapped by single-cell sequencing and other evidence indicate that the tumor immune microenvironment plays an indispensable role in the progression of HCC (Nishida and Kudo, 2017;Zheng et al., 2017;Kurebayashi et al., 2018). Different immune cells infiltrating into HCC have different prognostic effects (Prieto et al., 2015;Cariani and Missale, 2019). It has been observed clinically that the increase of PD1 + CD8 + T cells in HCC is associated with poor clinical outcomes (Chew et al., 2017). Although numerous studies have found the importance of immunology in HCC, its molecular mechanism is still unclear. Current studies have shown that tumor immune-related markers have commendable effects on the diagnosis and treatment of cancer (Peng et al., 2019;Hong et al., 2020). There have been many studies on the prognostic value of identifying key genes to build models to predict the prognosis of HCC patients (Li et al., 2017a;Kaur et al., 2019;Liu et al., 2020;Ouyang et al., 2020). However, there is no in-depth survey on the clinical relevance and prognostic significance of IRGPs in HCC.
In this study, we used the HCC gene expression datasets of the TCGA and ICGC databases to develop individualized prognostic signature based on IRGPs. Then, we evaluated its ability to predict prognosis in HCC patients and its responsiveness to immune checkpoint therapy and targeted therapy.

Data Sources
The HCC level 3 RNA-seq data and clinical information were downloaded from the TCGA database, including normal tissues (n 50) and tumor tissues (n 374). 365 cases of tumor patients have survival time and survival status, of which 247 cases contain clinical information (age, gender, histologic grade, TNM stage, vascular invasion, and alpha fetoprotein). The RNA-seq data and clinical information of HCC were downloaded from the ICGC database, including normal tissues (n 202) and tumor tissues (n 243). 232 tumor patients have survival time, survival status, and clinical information (age, gender, and TNM stage). The list of IRGs was downloaded from the IMMPORT (Bhattacharya et al., 2018). Both the TCGA and ICGC data are publicly available. Therefore, this research does not require the approval of the local ethics committee.

Construction of an Individualized Prognostic Signature Based on IRGPs
In the TCGA and ICGC cohorts, the limma package (Ritchie et al., 2015) was used to identify the IRGs that are different between cancer tissues and normal tissues. The filtering criteria were | logFC | > 1, FDR < 0.01. We identified the common differential IRGs in the two cohorts. The protein-protein interaction network was used to demonstrate differential genes (Udhaya , and the Cytoscape software (v3.8.2) was used for visualization. Genes with the median absolute deviation greater than 1 in tumor samples whose expression levels of these common differential IRGs were considered as candidate genes. The expression levels of these candidate genes were compared in pairs to generate a score for each IRGP. If IRG 1 is less than IRG 2, the IRGP score is 1, otherwise, the IRGP score is 0 (Li et al., 2017b). In the TCGA cohort, univariate cox was used to select prognostic IRGPs to assess the association between each IRGP and the overall survival rate of the patient. Prognostic IRGPs with a p value of less than 0.001 were candidates for establishing an IRGPs signature. From these IRGPs, the R language was used to perform Lasso Cox proportional hazard regression to construct the risk score, and finally 17 IRGPs were used to define the IRGPs signature. To classify patients into highor low-risk patients, the optimal IRGPs signature cut-off value was determined by the ROC curve.

Verification of IRGPs Signature
In the TCGA and ICGC cohorts, the survival package and survminer package (https://CRAN.R-project.org/package survminer) were used to establish the survival curve of the high and low-risk patients through the Kaplan-Meier diagram, and the Log-rank test was used to analyze the difference in survival curves. Cox proportional hazards analysis was used for univariate and multivariate analysis. The survivalROC package (https://CRAN.R-project.org/package survivalROC) was used for ROC curve analysis to evaluate the predictive ability of the IRGPs signature.

Functional Annotation and Immune Cell Infiltration
To understand the underlying molecular mechanism of the IRGPs signature, we divided the TCGA cohort patients into high-and low-risk patients so that the software GSEA 4.1.0 (http://www.gsea-msigdb.org) can be used for gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations. When p < 0.05 and FDR < 0.25, the enriched gene set is considered to be statistically significant. The gsva package (Hänzelmann et al., 2013) was used to quantify immune cell infiltrations, and the correlation between high-and low-risk patients and 16 immune cell infiltrations and 13 immunerelated pathways was evaluated.
Explore the Relationship Between the IRGPs Signature and the Reactivity of Immune Checkpoint Therapy and the Sensitivity of Targeted Drugs Immune checkpoint blockers of immunotherapy have obvious effects in the treatment of many human cancers. Therefore, we explored the expression and correlation between high and lowrisk patients and the current common immune checkpoints. In addition, small molecule tyrosine kinase inhibitors, as a class of molecularly targeted drugs, have become one of the mainstream trends in current anti-liver cancer research. The pRRophetic package is used to analyze the sensitivity of tyrosine kinase inhibitors in high-and low-risk patients (Geeleher et al., 2014).

Construction and Definition of the IRGPs Signature
Principal component analysis (PCA) was first performed on the expression profiles of the two cohorts. The results showed that there were differences in the distribution of tumor tissue and normal tissue in the two data sets, which could be used for further analysis (Supplementary Figures S1A,C). 231 differential IRGs were identified in the TCGA cohort and 172 differential IRGs were identified in the ICGC cohort (Supplementary Figures  S1B,D). We obtained 120 shared differential IRGs and displayed them in the protein-protein interaction network ( Figure 1A). In order to obtain genes with high variability, we screened among 120 IRGs. The filter criterion was that the median absolute deviation of these genes in tumor samples is greater than 1. 90 FIGURE 1 | Identification of IRGs and determination of risk score. (A) 120 shared differential immunity related genes were displayed on the protein-protein interaction network. (B) Univariate Cox analysis obtained 50 IRGPs related to prognosis. (C) ROC curve of the risk score of the IRGPs signature in the TCGA cohort. The risk score was 0.210, which served as a threshold for classifying patients into the high and low-risk patients.
Frontiers in Molecular Biosciences | www.frontiersin.org October 2021 | Volume 8 | Article 715728 IRGs were finally got for constructing 1,309 pairs of IRGPs. The correlation between these IRGPs and overall survival was evaluated in the TCGA cohort, and 50 IRGPs related to prognosis were screened out, with p < 0.001 as the cut-off criterion ( Figure 1C). A signature composed of 17 IRGPs was constructed on the TCGA cohort through LASSO Cox proportional hazard regression. The IRGPs signature consists of 22 unique IRGs (Supplementary Table S1). Finally, the timedependent ROC curve was used to determine the optimal cutoff value of 0.210 for the risk score, classified patients as high-or lowrisk patients ( Figure 1B). At this point, the sensitivity was 74.1% and the specificity was 75.5%. In the TCGA cohort, the high-risk patients were found to be significantly correlated with histologic grade, Child-Pugh grade, alpha-fetoprotein, vascular invasion, and TNM stage ( Table 1). Overall survival in the TCGA cohort was worse in the high-risk patients than in the low-risk patients (Log-rank, p < 0.001) (Figure 2A). PCA analysis shows that the distribution patterns of patients in different risk patients are contrasting ( Figure 2B). The ROC curve assessed the prognostic ability of the prognostic signature, and the area under the curve (AUC) reaches 0.809 at 1 year, 0.757 at 2 years, and 0.712 at 3 years ( Figure 2C).

Validation of the Feasibility of the IRGPs Signature to Predict Survival
To determine whether the signature has prognostic value, the signature was applied to the ICGC cohort as independent external verification. Patients in the ICGC cohort were divided into high or low-risk patients based on the above risk score. The high-risk patients in the ICGC cohort were also correlated with the TNM stage ( Table 1). Also, the overall survival rate of the high-risk patients in the ICGC cohort is lower than in the low-risk patients (Log-rank, p < 0.001) ( Figure 2D). PCA analysis showed that the distribution pattern of the two groups of patients was distinct ( Figure 2E). In addition, the ROC curve showed 0.772 at 1 year, 0.744 at 2 years, and 0.759 at 3 years ( Figure 2F). These results are similar to those obtained in the TCGA cohort.

Validation of the IRGPs Signature as an Independent Prognostic Factor
In the TCGA cohort, univariate COX analysis showed that TNM stage (p 0.002), vascular invasion (p 0.035), and the IRGPs signature (p < 0.001) were significantly correlated with the prognosis of HCC.  Figure 3B). In the   Figures S2A,C). In addition, the IRGPs signature divided patients with early (I and II) and late (III and IV) HCC into different prognostic groups. Also, for patients with stage IandII disease, low-risk patients have a good prognosis in both the TCGA cohort (Log-rank, p < 0.001) and the ICGC cohort (Logrank, p < 0.001) (Supplementary Figures S2B,D). Similarly, for patients with advanced stages III and IV, low-risk patients also have a good prognosis in the TCGA cohort (Log-rank, p < 0.001) and ICGC cohort (Log-rank, p 0.01) (Supplementary Figures  S2B,D). Overall, the IRGPs signature seems to be able to independently assess the overall survival of HCC.

Functional Annotation and Immune Cell Infiltration Between High and Low-Risk Patients
GSEA results showed that some immune-related biological processes are involved in high-risk patients (Supplementary Figure S3A), such as activation of innate immune response, antigen processing and presentation of peptide antigen via MHC class I, positive regulation of activated T cell proliferation, regulation of type I interferon mediated signalling pathway. Interestingly, some immune-related KEGG pathways are enriched in high-risk patients (Supplementary Figures S3B), such as FC epsilon RI signalling pathway, MAPK signalling pathway, mTOR signalling pathway, NOD like receptor signalling pathway. These results indicate that immune-related biological processes may play an indispensable role in the development of HCC. We further explored the status of immune cells and immune-related functions in high-risk and low-risk populations. In the TCGA cohort, the high-risk patients were positively correlated with tumor-infiltrating immune cells (aDCs, Macrophages, TH2 cells, Treg), while negatively correlated with Mast cells, Neutrophils, and NK cells ( Figure 4A). In the ICGC cohort, except for Mast cells, which were not statistically significant, the results were consistent with the results in TCGA ( Figure 4C). In the TCGA cohort, the highrisk patients were positively correlated with MHC class I and negatively correlated with Type II IFN Response ( Figure 4B). In the ICGC cohort, the same result was obtained ( Figure 4D).

Reactivity of Immune Checkpoint Therapy and Sensitivity of Targeted Drugs
Immune checkpoint therapy has shown better results in the treatment of cancer, and it has made a major breakthrough in the field of HCC. Therefore, we investigated the expression of immune checkpoint markers in low and high-risk patients. In the TCGA and ICGC cohorts, immune checkpoint markers (CD276, HHLA2, TNFRSF18, TNFSF9, LGALS9) were expressed higher in high-risk patients, and there was a positive correlation between the IRGPs signature and these markers (|R| > 0.3, p < 0.05, Figures 5A-E). Tyrosine kinase inhibitors currently approved and in clinical trials have demonstrated efficacy in the treatment of hepatocellular carcinoma, which can be helpful in the treatment and management of patients with HCC. Hence, we evaluated the IC50 of each sample and observed that the IC50 of the eight small molecule tyrosine kinase inhibitors was significantly different between the two groups. The results showed that Axitinib, Motesanib, Dasatinib, Nilotinib, Erlotinib, Pazopanib, Lapatinib, and Saracatinib are more sensitive to low-risk patients (p < 0.001, Figures 6A-H). This may provide an accurate strategy for the treatment of HCC patients.

DISCUSSION
Immune infiltration plays an important role in cancer progression. With the exploration in the field of immunotherapy, it is of great help in the treatment of tumor.
There have been studies that have provided strong evidence for the treatment and diagnosis of some diseases through bioinformatics analysis Udhaya Kumar et al., 2020). Due to the high heterogeneity of HCC, some patients fail to achieve the expected curative effect on immunotherapy. Therefore, it is extremely crucial to determine the sensitivity of different patient subsets to treatment response. Individualized treatment for different patient subgroups will help to enhance the prognosis of patients (Gan et al., 2020;Zhou et al., 2020;. Novel signature related to tumor immune infiltration may be a sword for identifying new molecular targets and improving patient prognosis. In this study, we developed a prognostic signature based on 17 IRGPs in HCC and validated them in two independent data sets on different platforms. Our prognostic immune signature can further divide clinically defined patients [for example, early stage (I and II) and late stage (III and IV)] into subgroups with different survival outcomes. Univariate COX analysis showed that TNM stage and the IRGPs signature were significantly correlated with the prognosis of HCC. By multivariate COX analysis, the TNM stage and the prognostic signature can be used as independent prognostic factors. Therefore, our prognostic signature can be used as a personalized prognosis and diagnosis and treatment of HCC patients and can be easily translated into clinical practice. The IRGPs signature we constructed included 22 IRGs, and the results of GSEA indicated that some immune-related biological processes and signal pathways were enriched in high-risk patients. Such as, activation of innate immune response, MAPK signalling pathway and mTOR signalling pathway, etc. Innate immunity is a part of the HCC tumor microenvironment, which can suppress and promote cancer. For example, dendritic cells, neutrophils, and macrophages can promote the occurrence of HCC, while natural killer cells and natural killer T cells can inhibit the development of HCC (Ruf et al., 2021). The activation of MAPK signalling pathway is closely related to the development of tumors, and it is activated in about 50% of patients with early HCC and almost all patients with advanced HCC (Neuzillet et al., 2014). Excessive activation of mTOR promotes the development of tumors, and affects the immune regulation involved in the differentiation of immune cells, and plays an important role in tumor metabolism (Zou et al., 2020). There may be a link between aberrant activation of these pathways for the low overall survival of high-risk patients. We further studied the relationship between tumor immune infiltration and the IRGPs signature. In the high-risk patients of the TCGA and ICGC cohorts, aDCs, Macrophages, TH2 cells, and Treg infiltration increased. Studies have shown that the increase of regulatory DCs can promote the increase of Treg in liver cancer (Cheng et al., 2016). Treg and Macrophages, which are less infiltration in HCC, have a good prognosis (Zhu et al., 2008;Zhou et al., 2016). The increase of TH2 cells is associated with poor prognosis of HCC (Zhu et al., 2008;Zhou et al., 2010). In this study, in the TCGA and ICGC cohorts, the infiltration of Neutrophils and NK cells in high-risk patients was reduced, while the prognosis of the high-risk patients was worse than that of the low-risk patients. Studies have pointed out that neutrophils can inhibit the development of cancer (Souto et al., 2011). NK cells infiltrate less in advanced HCC and are associated with poor prognosis (Wu et al., 2013). These results are consistent with our research results. In the high-risk patients in the TCGA and ICGC cohorts, the type II IFN reactivity decreased, while the MHC class I activity increased. Type II IFN is mainly produced by activated NK cells (Perussia, 1991), and it plays an important role in regulating the tumor immune environment. In our study, the NK cell invasion and type II IFN reactivity in the high-risk patients were reduced, and the prognosis of the high-risk patients was poor, which was consistent with the above study results. The level of MHC class I is elevated in patients with advanced HCC, and it may negatively regulate innate immunity and adaptive immunity to cause tumor escape mechanisms to occur (Jinushi et al., 2005). According to the above findings, dysregulation of the immune microenvironment may account for the survival differences between the IRGPs signature subgroups.
Among the immune checkpoint markers, CD276, HHLA2, TNFRSF18, TNFSF9, and LGALS9 were highly expressed in high-risk patients, indicating differences in responsiveness to these immune checkpoint treatments among patients grouped by this signature. CD276 plays an important role in innate immunity and T cell-mediated adaptive immunity, which is highly expressed in HCC and other cancers and is associated with poor patient prognosis. And it has great potential for immunotherapy (Picarda et al., 2016). HHLA2 can inhibit the function of CD4 and CD8 T cells, and blocking HHLA2 can enhance the proliferation and activation of T cells, which is helpful for cancer immunotherapy (Zhao et al., 2013;Rieder et al., 2021). TNFRSF18, also known as glucocorticoid-induced TNFR-related protein (GITR), is a costimulatory receptor in malignant tumors. Agonistic targeting of GITR can enhance the anti-tumor response of TIL derived from HCC patients (van Beek et al., 2019). TNFSF9, also known as 4-1BBL, is expressed on active T cells and antigen cells. 4-1BBL targeted immunotherapy has shown anti-tumor effects in the treatment of HCC (Xiao et al., 2007;Li et al., 2011). The high expression of LGALS9 is associated with the poor prognosis of many human cancers. LGALS9 preferentially kills T cells rather than cancer cells may assist cancer immune escape (Yang et al., 2021).
Several tyrosine kinase inhibitors have been used as first or second-line agents in the treatment of liver cancer, but one challenge is the lack of reliable biomarkers to identify patients who benefit from these treatments (Qin et al., 2019). Interestingly, we found that low-risk patients are more sensitive to Axitinib, Motesanib, Dasatinib, Nilotinib, Erlotinib, Pazopanib, Lapatinib, and Saracatinib. These findings provide effective treatment strategies for patients stratified by the IRGPs signature.
Although we used two independent data sets to rigorously verify the signature we proposed, our research has certain limitations. First of all, our study only carried out retrospective research and lacked prospective research. Second, our study only includes immune-related genes, and important prognostic genes in HCC may have been deleted. Finally, the relationship between the IRGPs signature and immune infiltration will be verified in follow-up studies.
In conclusion, the proposed IRGPs signature can accurately predict the prognosis of HCC patients and guide clinicians to make specific treatment decisions. It also provides the relationship between this signature and the responsiveness of immune checkpoints and targeted drugs. These results will be beneficial to the effective treatment of HCC patients. Prospective studies are required to further verify its accuracy.

AUTHOR CONTRIBUTIONS
SM and YM: conceptualization. XF and SM: software and methodology. XF and YM: writing-original draft preparation. WW: writing-review and editing. All authors have read and agreed to the published version of the article.