Systematic Construction and Validation of a Prognostic Model for Hepatocellular Carcinoma Based on Immune-Related Genes

Hepatocellular carcinoma (HCC), a highly aggressive tumor, has high incidence and mortality rates. Recently, immunotherapies have been shown to be a promising treatment in HCC. The results of either the CheckMate-040 or IMbrave 150 trials demonstrate the importance of immunotherapy in the systemic treatment of liver cancer. Thus, in this study, we tried to establish a reliable prognostic model for liver cancer based on immune-related genes (IRGs) and to provide a new insight for immunotherapy of HCC. In this study, we used four datasets that incorporated 851 HCC samples, including 340 samples with complete clinical information from the cancer genome atlas (TCGA) database, to establish an effective model for predicting the prognosis of HCC patients based on the differential expression of IRGs and validated the prognostic model using the data from International Cancer Genome Consortium (ICGC). The top 6 characteristic IRGs identified by protein-protein interaction (PPI) network analysis, MMP9, FOS, CAT, ESR1, ANGPTL3, and KLKB1, were selected for further study. In addition, we assessed the correlations of the six characteristic IRGs with the tumor immune microenvironment, clinical stage, and sensitivity to anti-cancer drugs. We also explored whether the differential expression of the characteristic IRGs was specific to HCC or present in pan-cancer. The expression levels of the six characteristic IRGs were significantly different between most tumor tissues and adjacent normal tissues. In addition, these characteristic IRGs showed a strong association with immune cell infiltration in HCC patients. We found that MMP9 and ESR1 were independent prognostic factors for HCC, while CAT, ESR1, and KLKB1 were associated with the clinical stage. We collected HCC paraffin sections from 24 patients from Xijing hospital to identify the differential expression of the five genes (MMP9, ESR1, CAT, FOS, and KLKB1). Finally, the results of decision curve analysis (DCA) and nomogram revealed that our models provided a prognostic benefit for most HCC patients and the predicted overall survival (OS) was consistent with the actual OS. In conclusion, we systemically constructed a novel prognostic model that provides new insights into HCC.


INTRODUCTION
Hepatocellular carcinoma (HCC), a common primary malignant tumor, accounts for almost 85% of all liver cancers (Sung et al., 2021). The worldwide incidence and mortality rates of HCC rank sixth and third, and in 2020 approximately 906,000 new cases of HCC and 830,000 related deaths were estimated to occur (Sung et al., 2021). The incidence and mortality rates of HCC depend on the patient's race, region, age, sex, and risk factors related to tumor progression (Kulik and El-Serag, 2019;Sung et al., 2021). Current treatment options of HCC such as surgery, chemoradiotherapy, liver transplantation, and radiofrequency ablation will benefit a very small percentage of patients, and a median overall survival (OS) of untreated HCC patients was <9 months (Giannini et al., 2015;Wang et al., 2018). With the development of immunotherapy, the OS of patients with unresectable HCC were prolonged. Previous studies found that the prognosis of patients with HCC is associated with their immune microenvironment and clinical pathological features (Ayuso et al., 2018;Zhang et al., 2019). Notably, there is no effective prognostic model for HCC that systematically assesses immune-related genes (IRGs) together with clinicopathological features.
Application of immunotherapy regimens in patients with HCC has led to encouraging results in terms of both safety and efficacy, and immunological mechanisms have been demonstrated to be playing a key role in the epigenetic mechanism, pathogenesis and development of HCC (Keenan et al., 2019;Jayant et al., 2020). Recently, a multicenter, randomized and phase III study which is called IMbrave 150 evaluated combined atezolizumab and bevacizumab treatment vs. sorafenib in patients with unresectable HCC, met its coprimary endpoint and both progression free survival (PFS) and OS were improved (Finn et al., 2020). However, the HCC tumor microenvironment (TME), which includes antigen-presenting cells (APCs), myeloid-derived suppressive cells (MDSCs), and neutrophils, is complex, thereby resulting in immunotherapy resistance in HCC patients. Due to the complexity and immune suppression mechanisms of the liver immune microenvironment, immune checkpoint inhibitors (ICPI) treatment has limited benefits for a small number of patients with HCC. A high-throughput study of patient tissue samples revealed that 25% of the samples exhibited a transcriptomic hallmark of proinflammatory responses associated with adaptive or exhausted immunity (Sia et al., 2017). The immune response is improved by immunotherapies targeting coinhibitory receptors such as PD-1 and CTLA4 which inhibit the immunosuppressive mechanisms in several tumors (Wolchok, 2015;Chikuma, 2017). However, a substantial amount of work is need to make immunotherapies successful for patients with HCC. In addition, the clinical relevance and expression profiles of IRGs in HCC have not been explored systematically.
Previous studies have explored the immunotherapeutic effects on most cancers, and the relationship between immune cell infiltration and HCC prognosis has been the focus of a few studies (Zhou et al., 2019). The expression of 7 IRGs was detected in 374 HCC samples. Nevertheless, the conclusions are not very reliable due to a lack of prognostic model correction and to the inclusion of few clinical variables . Studies have shown that vascular normalization and antitumor immune responses are promoted by the inhibition of PD-1 in HCC (Garris et al., 2018;Sharpe and Pauken, 2018). However, these studies included small sample sizes, and the correlation between HCC patient prognosis and immunotherapy needs to be further assessed. Past studies have paid little attention to the sensitivities of IRGs to anti-cancer drugs and a systematic analysis of the associations of IRGs with the prognosis of HCC patients is urgently needed.
In this research, we utilized a multigene expression cohort of 851 HCC samples to develop and validate an individualized IRG set based on HCC prognostic signatures. The correlations of the characteristic IRGs with immune infiltrating cells, clinical stage, clinical prognosis, and drug sensitivity were analyzed, with the aim of providing a sufficient amount of data to improve the prognosis and immunotherapeutic responses of HCC patients. In addition, we performed a comprehensive analysis that incorporated clinical characteristics to identify potential prognostic biomarkers and molecular targets for HCC.

Ethics Statement
Twenty-four paraffin sections of HCC tissue which were used to perform immunohistochemistry (IHC) were obtained at the Xijing Hospital. This study was approved by the Xijing Hospital's Ethics Committee and was conducted in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments.

Data Source and Preprocessing
The RNA-sequencing data and expression levels of microRNAs (miRNAs) in HCC patient samples were downloaded from The Cancer Genome Atlas (TCGA), 1 and the dataset included 340 HCC samples that included clinical information. The fragments per kilobase million (FPKM) values were transformed to transcripts per kilobase million (TPM) values for further study. The clinical features of the patients, such as their clinical stage, sex, and age, were obtained from the UCSC Xena website. 2 The gene expression data for different tissues and cell lines were downloaded from the TCGA and Cancer Cell Line Encyclopedia (CCLE) databases.
Hepatocellular carcinoma gene chip data (GSE14520, GSE101685, and GSE36376) were downloaded from Gene Expression Omnibus (GEO) datasets, and all data were from Homo sapiens (Roessler et al., 2010). GSE14520 was based on the GPL571 (HG-U133A_2) Affymetrix Human Genome U133A 2.0 Array and the GPL3921 (HT_HG-U133A) Affymetrix HT Human Genome U133A Array platform and contained information on 247 tumor samples and 241 normal liver samples. GSE101685 was based on the GPL570 (HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array platform and contained information on 24 HCC samples and 8 normal liver samples. GSE36376 was based on the GPL10558 Illumina HumanHT-12 V4.0 expression beadchip platform and included information on 240 HCC samples and 193 normal liver samples. Tumor-associated IRGs were downloaded from the ImmPort database. 3 All data used in this study were freely available online.

Identification of Immune-Related Genes in Hepatocellular Carcinoma
Principal component analysis (PCA) was conducted to assess the fundamental differences between HCC tissues and adjacent normal tissues (Luo et al., 2014). The analysis of differentially expressed genes (DEGs) was performed using the "DESeq2" package, with a corrected P < 0.05 and a | Log fold change(FC)| ≥ 1.0 serving as the thresholds (Love et al., 2014). The differential analysis results are presented as heatmaps and volcano plots. The heatmaps were produced by using the pheatmap package and the volcano plots were generated with R software. The Venn diagram web-tool was adopted to identify the common DEGs, and IRGs were associated with HCC among the common DEGs and tumor-related IRGs from the ImmPort database were identified.

Enrichment Analysis of Immune-Related Genes
Both Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed by using the "clusterprofiler" package (Yu et al., 2012). GO terms were identified with a strict cutoff of p < 0.01 and a false discovery rate (FDR) of <0.05. KEGG signaling pathways were visualized in the form of a network map drawn with Cytoscape 3.7.2 software (Shannon et al., 2003).

Identification of Characteristic Immune-Related Genes
Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), 5 a free online tool, was used to identify and predict interactions between proteins or genes. A PPI network of IRGs was constructed with a cutoff standard of a combined score >0.4 (Szklarczyk et al., 2019). The PPI network map of the IRGs was visualized with Cytoscape software. CytoHubba, a plug-in of Cytoscape, was used to explore essential nodes in the network. The six characteristic IRGs with the best scores were identified by the maximal clique centrality (MCC) method and selected for further study (Chin et al., 2014).
The miRNA-mRNA interaction data were downloaded from miRTarBase, 6 miRDB, 7 and TargetScan, 8 and the three miRNA databases were then integrated by using NetworkAnalyst. 9 Subsequently, a regulatory network was constructed via correlation analysis of the mRNA-miRNAs (Lewis et al., 2003(Lewis et al., , 2005Chen and Wang, 2020;Huang et al., 2020). The construction of the network was performed by Cytoscape (v3.7.2) to validate the reliability of the characteristic IRGs at the mRNA expression level.

Correlations of the Expression Levels of the Characteristic Immune-Related Genes With Anti-cancer Drug Sensitivity and Clinical Stage
The CellMiner database 10 which is a web resource allows the rapid retrieval of data related to the impact of 20,503 chemical compounds, including 102 drugs, on 22,379 gene transcripts and 360 miRNA (Reinhold et al., 2012). Gene expression data and half maximal growth inhibition (GI50) values for different anti-cancer drugs were downloaded to analyze the correlations between the characteristic IRGs and anti-cancer drug sensitivity. Moreover, we grouped the patients into four groups according to the clinical stage, namely stage I, stage II, stage III, and stage IV; to compare the differences in the expression levels of the characteristic IRGs among the groups.
Correlations of the Characteristic Immune-Related Genes With the Infiltration of Immune Cells and Immune Checkpoint Genes in Hepatocellular Carcinoma Patients TIMER, a comprehensive resource, that includes 10,897 samples across 39 cancer types, was used to systematically analyze of immune infiltrates (Li et al., 2016. 11 We explored the correlations of the expression levels of the characteristic IRGs with the abundance of infiltrating immune cells via gene modules. The left-most panel shows the gene expression levels according to tumor purity. We also used the GEPIA website (Gene Expression Profiling Interactive Analysis) 12 to analyze the correlation of the two prognostic IRGs (MMP9 and ESR1) with the immune checkpoint genes .

Construction and Validation of the Clinical Prognostic Model
We aimed to construct a prognostic HCC model based on the characteristic IRGs together with clinicopathological features. Cox regression analysis was adopted to determine the risk score of patients in terms of OS. A nomogram for survival prediction was then constructed with the TCGA cohort. Harrell's concordance index (C-index) was measured to quantify the differentiation performance of the constructed nomogram. A calibration curve was used to evaluate the performance of the nomogram, and to compare the predicted values of the nomogram with the observed actual survival rates. Decision curve analysis (DCA) was performed to evaluate the clinical efficacy of the prognostic model (Van Calster et al., 2018). And the data which included 229 HCC samples from the International Cancer Genome Consortium (ICGC) database were used for the external validation of prognostic models.

Immunohistochemistry
Human liver tissues were obtained from the Xijing Hospital and the tumor sections were incubated with commercial rabbit polyclonal antibodies against MMP9 (1:400 dilution, Servicebio Biotechnology), ESR1 (1:1,000 dilution, Servicebio Biotechnology), KLKB1 (1:50 dilution, Abcam Biotechnology), CAT (1:800 dilution, Abcam Biotechnology), and FOS (1:400 dilution, Servicebio Biotechnology) overnight at 4 • C. Then, the sections were treated with an appropriated secondary conjugated to horseradish peroxidase (HRP) at room temperature for 1 h, and the bound antibodies were visualized with 3, 3diaminobenzidine (DAB). We used a scoring method for the semi-quantitative analysis of IHC images. The tissue sections were scored by degree of staining and range of positivity by three individuals under a light microscope, and were repeated three times independently, the mean values were taken, and the final scores were summed and compared.

Statistical Analysis
The log-rank test was used for Kaplan-Meier survival analysis, to assess the differences in survival. The survival ROC package and pROC package were applied to construct the receiver operating characteristic (ROC) curve and to calculate the area under the curve (AUC) (Robin et al., 2011;Rizvi et al., 2019). To calculate the correlation coefficients among characteristic IRGs, Pearson correlation analysis was performed. Cox regression analysis was used to identify the independent prognostic factors. The p-value threshold for inclusion of multivariate cox regression analysis was 0.2 (Kang et al., 2013). The chi-square test was used to compare and analyze the significance of differences between the two sets of categorical variables. To compare two sets of continuous variables, independent Student's t-tests were conducted, and the differences between the non-normally distributed variables were analyzed by the Wilcoxon rank sum test. In this study, R v4.0.2 was used for all statistical analyses (*p < 0.05; **p < 0.01; ***p < 0.001).

Identification of 20 Immune-Related Genes in Hepatocellular Carcinoma Patients
The flow chart for this study is included in the Supplementary  Figure 1. To analyze the key genes associated with HCC, we first performed PCA of the four datasets, revealing significant differences between HCC tissues and normal tissues (Supplementary Figure 2).
Subsequently, we analyzed the DEGs in HCC samples from the four datasets separately. In the TCGA HCC dataset, 3,328 genes were significantly upregulated and 1,241 genes were significantly downregulated (Figures 1A,B); in the GSE14520 dataset, 274 genes were significantly upregulated, and 488 genes were significantly downregulated (Figures 1C,D); in the GSE101685 dataset, 538 genes were significantly upregulated, and 843 genes were significantly downregulated (Figures 1E,F); and in the GSE36376 dataset, 89 genes were significantly upregulated and 360 genes were significantly downregulated (Figures 1G,H).
The Venn diagram revealed 206 common DEGs among the four datasets (Supplementary Figure 3A). In total, we downloaded 1,811 tumor-associated IRGs from the ImmPort database. A Venn diagram revealed 20 differentially expressed IRGs among the 206 DEGs (Supplementary Figure 3B).

Functional Enrichment Analysis of the 20 Immune-Related Genes
We performed GO functional enrichment analysis of the 20 IRGs revealing that the IRGs were closely associated with biological processes such as the response to metal ions, the response to cadmium ions, the growth hormone receptor complex, collagen trimers, and peptide binding (Figures 2A-D). The detailed GO analysis results are available in Supplementary  Table 1. KEGG functional analysis results suggested that the 20 IRGs mainly affect the IL-17, endocrine resistance, TNF, and estrogen signaling pathways (Figures 2E,F). The IL-17 and endocrine resistance signaling pathways are demonstrated in detail (Figures 2G,H).
Moreover, the results of GSEA in the TCGA database showed that the enrichment of the ribosome, the complement and coagulation cascade, and the peroxisome proliferatorsactivated receptor (PPAR) signaling pathways were significant (Supplementary Figure 4). The specific enrichments of the related pathways are shown in Supplementary Table 2.
To assess the variations in different pathways in patients with HCC, GSVA was performed to analyze the enrichment scores of patients with HCC in each of the four datasets, and the relevant signaling pathways with significant differential expression are shown as heat maps (Supplementary Figure 5). The GSVA results were consistent with the GSEA results.

Identification of the Characteristic Immune-Related Genes
The 20 differentially expressed IRGs were added to the STRING database to construct the PPI network thereby identify characteristic IRGs that are closely Frontiers in Cell and Developmental Biology | www.frontiersin.org   Figure 6A). MMP9, FOS, CAT, ESR1, ANGPTL3, and KLKB1 were identified as characteristic IRGs based on the MCC algorithm and were selected for further analysis (Supplementary Figure 6B). When we got the characteristic IRGs, we did the GSEA for each characteristic IRG (Supplementary Figure 7). An interaction network including differential IRGs and miRNAs was constructed (Supplementary Figure 8A). Moreover, we selected differentially expressed transcription factors related to the expression of the 20 IRGs, with the thresholds of a Pearson's coefficient > 0.4 and a p-value < 0.01, and imported the highly expressed and correlated transcription factors into Cytoscape to construct the transcription factor regulatory network (Supplementary Figure 8B).

related to HCC (Supplementary
Analysis at the mRNA and protein levels, revealed that the characteristic IRGs play key roles in the development of HCC. In addition, IHC analysis was performed to assess the MMP9, ESR1, CAT, FOS, and KLKB1 expression in HCC patients and the expression level of characteristic IRGs (MMP9, ESR1, CAT, FOS, and KLKB1) of IHC were scored blindly and compare to each other using a semi-quantitative analysis method (Figure 3). The MMP9 protein was highly expressed in HCC tissues while the ESR1, CAT, FOS, and KLKB1 proteins were expressed at lower levels in HCC tissues than in adjacent normal tissues. The results of the IHC analyses of patient samples were consistent with the results of the differential expression analysis of the TCGA cohort.

Pan-Cancer Analysis of Characteristic Immune Gene
We analyzed the differential expression of 6 characteristic IRGs in 33 tumors from the TCGA database. In the vast majority of tumors, the expression levels of the characteristic IRGs were significantly different, compared to adjacent normal tissues (Supplementary Figure 9). Moreover, the mRNA expression levels of the characteristic IRGs in 1,457 tumor cell lines were determined using the CCLE database (Supplementary Figure 10). The results showed that the six characteristic IRGs may act key roles in most cancers not only in HCC. It provided potential targets for the treatment of cancers.

Associations of the Characteristic Immune-Related Genes With the Immune Microenvironment and Drug Sensitivity
To assess the abilities of the expression levels of the six characteristic IRGs obtained by PPI network analysis to discriminate between HCC tissues and adjacent normal tissues in the TCGA dataset, ROC curve analysis was conducted. All of the characteristic IRGs were able to separate HCC tissues from adjacent normal tissues (Supplementary Figure 11).
We further analyzed the effects of these characteristic IRGs on the immune microenvironment in HCC patients (Figure 4). The six characteristic IRGs showed strong correlations with B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells (DCs), however, the correlations of FOS with B cells, CAT with neutrophils, and KLKB1 with CD8+ T cells were not statistically significant (P < 0.05).
Moreover, we further analyzed the associations of these characteristic IRGs with sensitivities to various anti-cancer chemotherapeutic drugs using the CellMiner database (Supplementary Figure 12). A significant positive correlation was observed between the expression of the ESR1 gene and sensitivity to fulvestrant, raloxifene, etc., and between the expression of MMP9 and sensitivity to rebimastat, etc., detailed information is shown in Table 1. This study may provide new insight for the clinical treatment of HCC.

Clinical Correlation Analysis of the Characteristic Immune-Related Genes
Differential expression analysis of the TCGA database showed that among the six characteristic IRGs, the MMP9 gene was significantly expressed in tumor tissues compared with adjacent normal tissues ( Figure 5A). In contrast, the remaining five genes were expressed at lower levels in tumor tissues (Figures 5B-F). In addition, the effects of the six IRGs on the prognosis of HCC patients were further studied. High MMP9 gene expression, and low ESR1 gene expression, were closely associated with poor prognosis in HCC patients (log-rank P < 0.001; Figures 6A,D), whereas the expression levels of the other genes had no significant effects on the prognosis of HCC patients (Figures 6B,C,E,F). Moreover, the expression levels of the CAT, ESR1, and KLKB1 genes were significantly correlated with the HCC stage (P < 0.05), while MMP9 was not significantly correlated with the tumor stage (Figure 7).
In the multivariate Cox regression analysis, p < 0.2 on univariate analysis were included in multivariable model and we included gender, stage, AFP, hepatitis, and other clinically relevant factors which were important for prognosis, and the expression of MMP9 and ESR1 to determine whether MMP9, ESR1 was an independent factor for prognosis of HCC. The results revealed that the MMP9 and ESR1 genes were independent prognostic predictors for HCC patients (P = 0.011; P = 0.015). While MMP9 expression was identified as a risk factor, ESR1 expression exerted a protective effect on HCC patients (Figure 8). Then we analyzed the correlation of MMP9, ESR1 with PD-1, PD-L1, and CTLA-4 which were called the immune checkpoint genes. We found that the expression of MMP9 has a positive and significant correlation with PD-1 and CTLA-4. While the expression of ESR1 has a negative and significant correlation with PD-L1 and CTLA-4 (Supplementary Figure 13).
Next, we incorporated the expression levels of two genes (MMP9 and ESR1) and clinicopathological features into the model (Figures 9A,B). A nomogram was then constructed to predict the OS of patients with HCC (Figures 9C,E). The C-index was used to calculate the discriminatory power of the nomogram, which showed a certain degree of discrimination [MMP9: 0.654 (0.601-0.706); ESR1: 0.658 (0.605-0.710)]. Moreover, the calibration curves showed good agreement between the 1-, 2-, and 3-year OS estimates from the prediction of the nomogram and the actual OS rates of the patients (Figures 9D,F). In addition, the DCA results showed a prognostic benefit for approximately 15-95% of patients as determined by the model (Figures 9G,H).
We also used the ICGC-JP dataset as an external validation to measure the accuracy and stability of the prognostic model (Supplementary Figure 14).

DISCUSSION
Hepatocellular carcinoma, an aggressive cancer, has a poor longterm prognosis, and ranks as the third leading cause of cancerrelated death. In 2017, Food and Drug Administration (FDA) approved the nivolumab for the treatment of HCC patients who have been previously treated with sorafenib, immunotherapy has become a standard of care treatment option for HCC (El-Khoueiry et al., 2017). With the development of immunotherapy, the OS of HCC patients, especially those with advanced liver cancer, has been prolonged. It suggests that immune research is promising in HCC. However, the relationship between IRGs and HCC remains unknown. In this study, we aimed to systematically analyze the relationship between IRGs and clinical stage, and prognosis in HCC patients, and then establish a prognostic model and validate using external dataset. In addition, we first investigated the associations of these IRGs with immune cell infiltration and drug sensitivity in patients with HCC. Compared to other studies of immune-related prognostic models, this study incorporates a larger study population and includes more characteristic IRGs. Besides, our prognostic model is analyzed by multi-omics level and validated by external dataset from ICGC to ensure the credible results and has high AUC values.
The Venn diagrams of DEGs from the four datasets, and the subsequent comparison of commonly altered genes with 1,811 tumor immune genes, revealed a total of 20 IRGs. KEGG pathway analysis showed that IRGs were mainly enriched in the IL-17, endocrine resistance, TNF, and estrogen signaling pathways in HCC patients. IL-17, a cytokine that is involved in the immune response, binds to the IL-17 receptor on the cell membrane. A study published in the Journal of Hepatology in 2020 identified IL-17a as a potential therapeutic target for HCC (Ma et al., 2020). Insulin resistance, TNF and estrogen promote the progression of HCC, and these factors are attracting increased attention from researchers. Inhibitors of G-proteincoupled estrogen receptor (GPER1) that have been characterized could be used to treat or prevent liver cancer (Park et al., 2010;Chaturantabut et al., 2019;Jindal et al., 2019). GSEA revealed that IRGs, in HCC patients, were mainly enriched in the ribosome, complement and coagulation cascade, and PPAR signaling pathways. A previous study reported that many lncRNAs, that bind to the ribosomal protein S6 in cancer cells, promote HCC progression by regulating cell proliferation and migration, and their levels are correlated with poor prognosis in HCC patients (Pang et al., 2020). The complement system has many functions in vivo, such as functioning as a proteolytic cascade in serum, mediating innate immunity, and providing extra protection against pathogens. Regardless of the factors driving HCC, the liver immune microenvironment is the main factor associated with chronic inflammation, the progression of liver fibrosis, and cirrhosis. The complement cascade, which is related to these immunological mechanisms, is central to the network and tightly regulates humoral and cellular responses to external stimuli (Malik et al., 2020). PPARs, nuclear hormone receptors, could be activated by fatty acids and their derivatives and PPAR agonists are beneficial for the treatment of non-alcoholic fatty liver disease (NAFLD), liver fibrosis, and HCC (Cao et al., 2009). Recent, studies demonstrated that PPARs participate in the progression of gut microbiota inhabitation and adaptation which is related to the pathogenesis of HCC (Yu et al., 2020). These signaling pathways were shown to be associated with HCC progression in our analysis and were consistent with those reported in published literature.
We incorporated 20 IRGs into the PPI network and obtained 6 characteristic IRGs, namely, MMP9, FOS, CAT, ESR1, ANGPTL3, and KLKB1, by applying the MCC algorithm. MMP9 plays key roles in local extracellular matrix proteolysis and in leukocyte migration. A previous study showed that HCC progression was inhibited by 7-methoxy-1-tetralone, possibly via the modulation of proliferation-and migration-related mediators, including MMP9 (Wen et al., 2020). The FOS gene encodes leucine zipper proteins that, dimerize with proteins of the JUN family, to form the AP-1, transcription factor complex. FOS proteins have been associated with the regulation of cell proliferation, differentiation, and transformation. FOS gene expression has been associated with apoptotic cell death and liver carcinogenesis is related to FOS-dependent inflammation (Bakiri et al., 2017). ESR1, also known as NR3A1, is one of the two main types of estrogen receptors. A previous study demonstrated that the expression of ESR1 gene is decreased by >90%, in almost 50% of HCC patients (Hishida et al., 2013). Decreased expression of this gene was significantly related to a high liver damage score, pathological invasion of the intrahepatic portal vein, tumor size and hepatitis B virus infection, showed that ESR1 gene is a candidate protective gene in HCC (Hishida et al., 2013). ANGPTL3 encodes secreted proteins that were expressed predominantly in the liver and plays a role in angiogenesis. ANGPTL3 expression and serum levels could act as novel biomarkers in the diagnosis of chronic hepatitis and HCC, and ANGPTL3 expression could be useful for discriminating HCC from chronic hepatitis in patients (El-Shal et al., 2017). In addition, we showed for the first time that CAT, and KLKB1 may be potential therapeutic targets in HCC patients. The differential expression of MMP9, ESR1, CAT, FOS, KLKB1 in HCC patients were verified by IHC.
In addition, the HCC microenvironment includes parenchymal cells, which are complex immune-related cells; however, the success of immune checkpoint suppression in solid tumors highlights the key role of the TME in tumor progression. TIMER was utilized to explore the effects of six characteristic IRGs on the infiltration of immune cells, revealing that MMP9, ESR1, and ANGPTL3 were closely associated with B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and DCs. All  six characteristic IRGs showed strong associations with immune cells, but the associations of the FOS gene with B cells, the CAT gene with neutrophils, and the KLKB1 gene with CD8 cells were not significant. Thus, we characterized the associations of these IRGs with the levels of immune cell infiltration in HCC patients, thereby providing a new reference for immunotherapy. However, immune cells such as T cells and DCs have different subtypes and further studies needed to confirm the detailed mechanisms of IRGs and different immune cell subtypes.
Notably, ROC curve analysis was performed to confirm the expression of the six characteristic IRGs that sufficiently distinguished HCC tissues from adjacent normal tissues. We analyzed the differential expression of the six characteristic IRGs in HCC patient tissues; MMP9 was shown to be expressed at significantly higher levels in HCC tissues than in normal tissues, while the other five genes were expressed at significantly lower levels in HCC tissues than in adjacent normal tissues. Further analysis of the prognostic impacts of the six characteristic IRGs on HCC patients revealed significant correlations of high MMP9 gene expression and low ESR1 gene expression with a poor prognosis in HCC patients. Recent studies revealed the inhibition of MMP9 which is regulated by α1-ACT, could suppress liver cancer development (Zhu et al., 2017). Previous studies on ESR1 focused on breast cancer, and breast cancer patients highly expressing the ESR1 gene were shown to have a better prognosis; however, recent studies showed that ESR1 is also associated with breast cancer liver metastasis (De Santo et al., 2019;Tian et al., 2021). We further found that MMP9 and ESR1 were independent predictors of HCC patient prognosis by performing multifactorial Cox regression analysis and we found that there are strong and significant correlation between MMP9 and ESR1 with CTLA-4. In addition, we first revealed that the gene expression levels of CAT, ESR1, and KLKB1 were significantly associated with the HCC clinical stage. For the first time, we also analyzed the associations of IRGs with therapeutic sensitivity to anti-cancer drugs. The expression of the ESR1 gene had a significant positive correlation with sensitivity to fulvestrant, raloxifene, etc. While the expression of the MMP9 gene had a significant positive correlation with sensitivity to rebimastat. Previous studies showed that fulvestrant inhibited the proliferation of HepG2 cell, via the ERα and non-canonical Wnt pathways, and indicated that it may be a promising therapeutic for HCC (Wang et al., 2014). Another study demonstrated that the effective delivery of raloxifene had proapoptotic and cytotoxic effects on HCC cell lines (Almutairi et al., 2019). However, studies on HCC and rebimastat have not been reported.
This study does have some limitations. First, subgroup analysis of IRGs, together with more clinical characteristics of HCC patients should be performed to comprehensively identify the factors and effects influencing HCC prognosis. Second, 851 liver cancer patient tissues were included in the analysis, but complete clinical information was available for only 340 samples. In future studies, cross-validation with internal datasets and increased sample sizes from other databases are needed. Third, in this study, we analyzed the association between IRGs with the sensitivity to different anti-cancer drugs, but the association between IRGs with the sensitivity to immunotherapeutic drugs need to pay attention. Finally, because obtaining fresh HCC samples was difficult, we used the paraffin-embedded sections from only HCC patients to validate our findings by IHC, and more data obtained by additional techniques such as qPCR are needed.

CONCLUSION
In summary, the 20 identified DEGs were mainly enriched for numerous immune-related GO terms and KEGG pathways, and the top six IRGs (MMP9, FOS, CAT, ESR1, ANGPTL3, and KLKB1) were selected for further study. We then analyzed the effect of the six characteristic IRGs on the HCC immune microenvironment and found positive correlations between their expression levels and the infiltration of a variety of immune cells. Two characteristic IRGs (MMP9 and ESR1) were significantly associated with the prognosis of HCC, and multifactorial Cox regression analysis showed that they were independent prognostic factors. We are the first to demonstrate that characteristic IRGs (CAT, ESR1, and KLKB1) are significantly associated with the clinical stage of HCC. Moreover, for the first time, we investigated the associations of characteristic IRGs with anti-cancer drug sensitivity, and found that fulvestrant, raloxifene, SR16157, and rebimastat have potential therapeutic effects in HCC. Besides, we found a significant correlation between MMP9, ESR1, and the immune checkpoint gene CTLA4, suggesting that it may be a potential molecular target. Most importantly, we successfully established two prognostic models for HCC based on independent prognostic IRGs (MMP9, ESR1) that provided prognostic benefit to approximately 15-95% of HCC patients and the predicted OS rates were in good agreement with the actual observed rates. In addition, we validated the prognostic models using the data from ICGC database.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Xijing Hospital's Ethics Committee. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
YH, JBW, and XMZ conceived and designed the experiments. XPD, YNH, FFY, and GYX collected the data. JHY, SYT, MZ, YSL, and SYM analyzed the data. JHY, MZ, and SYT drafted the manuscript. All authors critically reviewed the manuscript.