Comprehensive analysis and immune landscape of chemokines- and chemokine receptors-based signature in hepatocellular carcinoma

Background Despite encouraging results from immunotherapy combined with targeted therapy for hepatocellular carcinoma (HCC), the prognosis remains poor. Chemokines and their receptors are an essential component in the development of HCC, but their significance in HCC have not yet been fully elucidated. We aimed to establish chemokine-related prognostic signature and investigate the association between the genes and tumor immune microenvironment (TIME). Methods 342 HCC patients have screened from the TCGA cohort. A prognostic signature was developed using least absolute shrinkage and selection operator regression and Cox proportional risk regression analysis. External validation was performed using the LIHC-JP cohort deployed from the ICGC database. Single-cell RNA sequencing (scRNA-seq) data from the GEO database. Two nomograms were developed to estimate the outcome of HCC patients. RT-qPCR was used to validate the differences in the expression of genes contained in the signature. Results The prognostic signature containing two chemokines-(CCL14, CCL20) and one chemokine receptor-(CCR3) was successfully established. The HCC patients were stratified into high- and low-risk groups according to their median risk scores. We found that patients in the low-risk group had better outcomes than those in the high-risk group. The results of univariate and multivariate Cox regression analyses suggested that this prognostic signature could be considered an independent risk factor for the outcome of HCC patients. We discovered significant differences in the infiltration of various immune cell subtypes, tumor mutation burden, biological pathways, the expression of immune activation or suppression genes, and the sensitivity of different groups to chemotherapy agents and small molecule-targeted drugs in the high- and low-risk groups. Subsequently, single-cell analysis results showed that the higher expression of CCL20 was associated with HCC metastasis. The RT-qPCR results demonstrated remarkable discrepancies in the expression of CCL14, CCL20, and CCR3 between HCC and its paired adjacent non-tumor tissues. Conclusion In this study, a novel prognostic biomarker explored in depth the association between the prognostic model and TIME was developed and verified. These results may be applied in the future to improve the efficacy of immunotherapy or targeted therapy for HCC.


Introduction
Hepatocellular carcinoma (HCC) is the third leading contributor to mortality from cancer globally, with a five-year overall survival rate of about 18% and 830,000 deaths from the disease each year (1). Despite substantial advances in local and systemic therapy for intermediate to advanced HCC, in particular, immunotherapy and molecular targeted therapy, most patients may not respond or develop resistance to the drugs and eventually died of the disease (2). Therefore, there is still an emergency to explore more effective systemic therapies, as well as predictable biomarkers, to enable personalized and cost-effective treatment stratification.
Chemokines are a class of 8-to 12-kDa secreted proteins that can be classified into four categories XC, CC, CXC, and CX3C (3). Their main functions include the regulation of target cell migration (chemotaxis), adhesive properties, cell development, cellular localization, and cell-cell interfaces which serve an instrumental part in intracellular homeostasis, and pathological processes, in particular tumorigenesis (4). Chemokines contribute to tumor immunity in a variety of aspects, as they are involved in the localization and migration patterns of immune cells in the lymphoid tissue and tumor immune microenvironment (TIME) and directly shape the immune response (5). In general, tumorassociated mesenchymal and cancer cells could unleash a diverse array of chemokines, resulting in the recruitment and the activation of various cell types, which in turn mediate the equilibrium between anti-tumor and pro-tumor reactions (6). Several chemokine receptor antagonists have shown antitumor effectiveness in preclinical studies in various cancers, including hepatocellular carcinoma (7,8). Recent studies have also shown that the combination of chemokine antagonists may boost the therapeutic efficacy of immune checkpoint inhibitors (ICIs) or molecularly targeted therapies in patients with HCC (9,10). However, the mechanisms of chemokines in TIME are only just beginning to be discovered (5). In addition, studies of chemokines or chemokine receptors in HCC remain scarce, particularly in the prediction of survival in HCC patients and the relationship between the proportion of various immune cell subtypes in TIME and the expression of chemokines-related genes (CRGs).
In the present study, we first established and substantiated a kind of HCC predictive risk signature based on chemokine and chemokine receptor family genes. Nomograms were then developed and evaluated to precisely and conveniently forecast the outcome of HCC patients. Subsequently, the association between CRGs and immune infiltrating cells were analyzed at the transcriptomic and genomic mutational levels, respectively. Finally, we verified the differential expression of CRGs in HCC tissues and adjacent nontumor tissue using real-time quantitative polymerase chain reaction (RT-qPCR).

Data acquisition and processing
Download mRNA expression profiles and pathological features of HCC patients from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) databases. 342 HCC patients were screened from the TCGA-LIHC cohort, excluding patients with missing expression data and those with survival times of less than 30 days (n=35). 231 HCC patients were obtained from the ICGC-LIHC-JP cohort as an external validation dataset. Table 1 summarizes the detailed clinical profiles of the HCC patients enrolled in this study. Gene expression data of normal liver tissue were downloaded from the GTEx portal. scRNA-seq data for GSE149614 were obtained from the Gene Expression Omnibus (GEO) database, including 10 cases of primary tumor (PT), 2 cases of portal vein tumor thrombus (PVTT), 1 case of metastatic lymph node (MLN) and 8 cases of non-tumor liver tissue (NTL). The single-cell data were filtered by setting each gene to be expressed in a minimum of 3 cells, with each cell expressing at least 250 genes, resulting in 71915 cells. The percentage of mitochondria and rRNA was calculated using the "PercentageFeatureSet" function and ensuring that each cell expressed >100 and less than 8000 genes, that the mitochondrial content was less than 10% and that the UMI of each cell was at least >100, resulting in 67101 cells. We then normalized the data for each of the 21 samples by log-normalization. Variable features were identified by finding highly variable genes (based on variance stabilization transformation using the "FindVariableFeatures" function. All genes were scaled using the "ScaleData" function. Then we removed batch effects between samples and integrated the data by using the "FindIntegrationAnchors" and "IntegrateData" functions. Subsequently, we performed PCA downscaling to find anchor points by "RunPCA" function, selecting a dim of 40, and clustering the cells by using the "FindNeighbors" and "FindClusters" functions (setting Resolution=3) to obtain a total of 51 subgroups. Meanwhile, we used the "RunTSNE" function to perform T-distributed Stochastic Neighbour Embedding (tSNE) downscaling analysis on 67101 cells. Cells were annotated through published literature, the cell-marker website, and the "SingleR" package. Malignant cells were predicted using the "copycat" package. The genes used for the final annotation are shown in Figure S1). The detailed workflow diagram and corresponding analysis for the study were illustrated in Figure 1.

Development of prognostic risk signature
A total of 66 CRGs, including 47 chemokine ligands and 19 different chemokine receptors, were included in this study based on previous literature reports (Table S1) (11). In the TCGA dataset, we used the R project "limma" package to screen differentially expressed CRGs between HCC and adjacent non-tumor tissue with thresholds set to false discovery rate (FDR)< 0.05 and |log2 fold-change (FC)| > 1. CRGs related to overall survival (OS) in HCC patients were screened using univariate Cox regression analysis with a screening criterion of P< 0.05. The 342 HCC patients were subsequently randomized in a 3:1 ratio into the training and testing group using the R project "caret" package. The least absolute shrinkage and selection operator (LASSO) analysis with one standard error (SE) and 1000-fold cross-validation to filter the most significant markers in the training dataset using the "glmnet" and "survival" R packages. Given the simplicity and repeatability of the model, a backward stepwise Cox proportional risk regression model was developed using multivariate Cox analysis. Determine the model with the lowest AIC value as the final model. The risk score for each HCC patient in the three cohorts was determined by the following formula: Assessment and confirmation of the prognostic risk signature and nomogram All HCC patients in the three cohorts were allocated into the high-and low-risk groups according to the median risk score. Kaplan-Meier analysis was utilized to assess and compare the differences in survival outcomes between high-and low-risk groups of HCC patients in the training, test and ICGC datasets, respectively. The nomogram was constructed using the 'rms' and 'regplot' R packages, combining tumor staging and risk scoring. Calibration and time-related receiver operating characteristic (time ROC) curves were used to assess the accuracy and discrimination of the risk model and the nomogram using the "time ROC" and "rms" R packages.

The landscape of immune and gene mutation
The Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT) algorithm was implied for the quantitative assessment of the transcriptomic data and Flow chart of screening for chemokines-and chemokine receptors-based signature in hepatocellular carcinoma.
subsequently translated into the abundance of 22 types of immune and stromal cells (12). The single sample GSEA (ssGSEA) approach was applied to evaluate the inflammatory infiltration profiles and immune functions. Furthermore, we evaluated various immune cell infiltration using different algorithms with the Tumor Immune Estimation Resource (TIMER) database (http://timer.compgenomics.org/) (13,14). The Somatic Copy Number Alterations (SCNA) module of the TIMER database was adopted to investigate the association between specific genes and immune infiltration. Gene somatic mutation data were downloaded from Genomic Data Commons (GDC) database. The "oncoplot" function in the "maftools" package was performed to plot the respective mutations in the high-and low-risk groups and estimated the significantly different distribution of mutated genes by the "mafCompare" function in the "maftools" package.

Drug sensitivity assessment and enrichment analysis
The semi-inhibitory concentration (IC50) values of well-known chemotherapeutic and targeted therapeutic agents were predicted by the "oncoPredict" R package to compare their efficacy in the different risk groups (15). Hallmark gene set and KEGG analysis were conducted using Gene Set Enrichment Analysis (GSEA) software 4.1.0. Gene ontology (GO) analysis was performed employing the R package "clusterProfiler" for identifying the functional differences between the two groups (16,17). Furthermore, the gene sets of "c2.cp.kegg.v7.4.symbols.gmt" were downloaded from the MSigDB database to run GSVA enrichment analysis using the "GSVA" R package (18).

RT-qPCR and proteomics for further demonstration
A total of 19 paired HCC and adjacent non-tumor tissue specimens were collected from the First Medical Centre of the Chinese PLA General Hospital. All specimens were quickly placed in liquid nitrogen (-196˚C) for preservation after excision. RNA extraction was performed using TRIzol reagent (Invitrogen, California, USA) and complementary DNA (cDNA) was constructed using ReverTra Ace qPCR RT Kit (Toyobo, Japan). Similarly, the detection and amplification of the prognostic genes were conducted using SYBR ® Green Realtime PCR Master Mix (Toyobo, Japan) in the ABI Step One Plus Real-Time PCR system (Applied Biosystems) while b-Actin was kept as an endogenous control. Each sample was repeated three times. The primer sequences are given in Table S2. Differences in gene expression between HCC and its paired adjacent non-tumor tissues were compared using the 2 −DDCt approach. In addition, in the CPTAC dataset (Clinical Proteomics), we analyzed the protein expression levels of CCL14 and CCL20 in HCC using the UALCAN portal.

Statistical analysis
The Kruskal-Wallis test or the Mann-Whitney U test was used for comparisons of continuous variables. Categorical variables were indicated as frequencies and compared using either the chi-square test or Fisher's exact test. Survival curves were created using the Kaplan-Meier method and compared by applying a log-rank test. Independent risk factors associated with OS were identified using univariate and multivariate Cox regression analysis. P values< 0.05 were statistically significant and all tests were two-tailed. Data analysis and processing using R version 4.1.0, SPSS 26.0 and GraphPad Prism 8.

Results
The construction of risk signature on chemokine-related genes A total of 19 CRGs were sieved in the TCGA database that was differentially expressed between HCC and adjacent non-tumor tissues ( Figures S2A, B). Four of the 19 differentially expressed genes were notably related to overall survival time in HCC patients ( Figure S2C). In the training dataset, we selected the candidate with the lowest AIC value as the final risk model ( Figure S2D). Ultimately, the risk score for each HCC patient was obtained by the following equation: Risk score = CCR3 * 0:501 + CCL20 * 0:089 − CCL14 * 0:84 Assessment and validation of prognostic risk signature HCC patients were classified into two groups (high risk and low risk), depending on median risk values in the training, testing and ICGC dataset, respectively. There were no apparent discrepancies in clinical characteristics between high-and low-risk groups in the TCGA cohort, while the tumor stage and risk score were significantly correlated in the ICGC cohort (Table 2). In the three cohorts, risk scores decreased with the increased expression of CCL14 and increased when the expression of CCR3 and CCL20 increased (Figures 2A-C). Mortality in HCC patients tends to increase with higher risk score in the training ( Figure 2D

Construction and assessment of the nomogram
Combining risk score and tumor staging, we constructed nomograms in the TCGA ( Figure 3A) and ICGC ( Figure 3B) datasets, respectively. This can be applied as a clinically relevant quantitative method to enable clinicians to predict the survival outcome of HCC patients more accurately. The results of calibration plots indicated that the performance of the nomograms was comparable to the ideal model ( Figures 3C, D). The time-dependent ROC of the two cohorts also demonstrated the high accuracy and predictive power of the nomograms ( Figures 3E, F).
Immune cell infiltration in the TCGA dataset based on risk score groupings Given the intense relationship between CRGs and immune cells or immune function, we depicted and compared the discrepancies in the proportion of immune cells in TIME and immune function profiles between high-and low-risk groups using diverse algorithms. The results of CIBERSORT method discovered that B cell (naïve and memory), natural killer (NK) cells (resting and activated), monocytes, macrophages (M1, M2), and resting mast cells accounted for a substantially greater proportion of TIME in low-risk HCC than in high-risk group (P< 0.05). Activated memory CD4 + T cells, T-follicular helper (Tfh) cells, regulatory T cells (Tregs), M0 macrophages, and resting dendritic cells accounted for a significantly higher proportion of TIME in the high-risk HCC than in low-risk patients (P< 0.05) ( Figure 4A). An overview of the 22 subtypes of tumor-infiltrating immune cells in the two groups is presented in Figure S5A. In addition, we calculated the Spearman correlation coefficients between each gene in the signature and 22 types of immune cells. The expression of CCL14 correlated negatively with M0 macrophages, Tregs, and memory B cells and positively with monocytes, M2 macrophages, NK cells and naive B cells (P< 0.05) ( Figure S5B). The expression of CCL20 ( Figure S5C) and CCR3 ( Figure S5D) was negatively correlated with NK cells, monocytes and M2 macrophages and positively correlated with Tregs, CD4 + memory T cells and M0 macrophages (P< 0.05). The correlation between risk score and the proportion of immune cell subtypes in TIME calculated by different algorithms were illustrated in Figure S5E. The single sample GSEA (ssGSEA) algorithm obtained similar findings for immune cell infiltration as the CIBERSORT algorithm. Furthermore, there were also considerable differences in immune functions between the two groups, with significant activation of immune checkpoints, human leukocyte antigen (HLA), pro-inflammation, parainflammation, T cell co-stimulation, and CC motif chemokine receptor (CCR) in the high-risk group ( Figure 4B). In addition, we found that the expression of HLA genes ( Figure 4C), immune checkpoint genes ( Figure 4D), and T-cell stimulation genes ( Figure 4E) was significantly higher in the high-risk group.
scRNA analysis of the CCL cellular pathway network in PT, PVTT and MLN To systematically assess the role of the CCL signaling pathway in the immune microenvironment and tumor progression of HCC, we analyzed scRNA-seq data. The results of the data pre-processing in detail are shown in Figures S6A-C. A total of 67101 cells were identified in four relevant tissue types: non-tumor liver (NTL), primary tumor (PT), portal vein tumor thrombus (PVTT) and metastatic lymph node (MLN) ( Figure 5A). These cell clusters were then marked into 12 cell types based on specific genetic markers ( Figure 5B). The proportion of the 12 cell types in each sample was shown in Figure 5C. We subsequently found that the expression of CCL20 was higher in CD4+ T cells, CD8+ T cells, endothelial cells, macrophages, malignant cells, mast cells, mature B cells, and myeloid cells in PVTT and MLN than in other cell and tissue types ( Figure 5D). We investigated the overall profile (Figures 6A-C) and the CCL signaling pathway (Figures 6D-F) of differences in possible incoming or outgoing signaling pathways between different types of tissues. We found that in the CCL signaling pathway of PT or PVTT and MLN, the main cells for incoming and outgoing signals were CD4+ T cells and macrophages, respectively. We subsequently studied the differences in cell-cell communication networks in the CCL signaling pathway between different types of tissues by calculating the communication probabilities and obtained concordant results (Figures 6G-I). Finally, we examined the expression of CCL signaling pathway-related genes in different cell subpopulations in three tissue types (Figures 6J-L). Interestingly, we observed that the expression of CCL20 dominated intercellular communication in MLN tissues and that CCL20 was highly expressed in macrophages, malignant cells and myeloid cells.

Genetic variation analysis in the TCGA dataset
The results of the gene mutation analysis showed notable differences between the high-risk and low-risk groups of HCC patients, and we ranked the top 20 mutated genes (Figures 7A,  B). However, we did not identify the mutations associated with CRGs. We also observed a higher incidence of mutations in common genes among the high-risk group, compared to the lowrisk group of HCC patients. Furthermore, we identified a much higher frequency of mutations in TP53, ARID1A, DNAH10, and C10orf90 genes associated with tumorigenesis and progression in the high-risk group than in the low-risk group ( Figure 7C). Subsequently, we initially investigated the type and frequency of alterations among chemokine family genes in HCC samples ( Figure  S7A), as well as the specific location of these genes in the chromosomes ( Figure S7B). The relationship between the level of tumor immune infiltration and CNAs was further assessed using the TIMER database. The results suggested that changes in CCL14 and CCL20 somatic copy number alterations were closely associated with the level of immune cell infiltration ( Figure S8).

Risk score-based strategies for the treatment of HCC
Using the "oncoPredict" method, we calculated IC50 values for each HCC patient receiving chemotherapy or small molecule targeted therapy to predict their sensitivity when given different treatments. By comparison of the differences in IC50 values between different groups in the TCGA dataset, we identified that HCC patients in the high-risk group were more sensitive to treatment with Lapatinib, 5-Fluorouracil, Dasatinib, and Gefitinib (P< 0.001), compared to patients in the low-risk group (Figures 8A-D). Patients in the low-risk group were more sensitive to treatment with Entospletinib, Leflunomide, Gemcitabine, Sorafenib, and Axitinib, (P< 0.001) compared to patients in the high-risk group (Figures 8E-I).
Functional analysis, RT-qPCR, and proteomics verification GSEA and GO functional analysis were carried out to delineate the differential gene expression profiles and function deviations between two risk groups, while GSVA analysis was conducted to elucidate the correlation between biological activity pathways and risk score. GSEA results of the hallmark genes set indicated that remarkable enrichment of immune-related pathways in the highrisk group compared to the low-risk group, such as IL6-JAK-STAT3, G2M-checkpoint, and P53 pathway ( Figure 9A). On the other hand, the low-risk group was significantly enriched in metabolic function pathways compared to the high-risk group, such as b-alanine metabolism, fatty acid metabolism, and tyrosine metabolism ( Figure 9B). All results of the GO functional analysis are shown in Figure 9C and Table  complex, the external side of the plasma membrane, and the collagen-containing extracellular matrix. The molecular Function (MF) primarily included antigen binding and immunoglobulin receptor binding ( Figure 9D). A total of 121 enriched pathways were recognized by GSVA analysis among the two groups. The lowrisk group primarily enriched in metabolism-associated pathways, while the high-risk group was primarily enriched in immunerelated pathways ( Figure 9E and Table S4). Given the paucity of adjacent non-tumor tissue samples in the TCGA database, we increased adjacent non-tumor tissue samples through the GTEx database and found that CCL14 ( Figure 10A) was significantly highly expressed in adjacent non-tumor tissue, while CCL20 ( Figure 10B) and CCR3 ( Figure 10C) were significantly less expressed when compared with HCC tissue.
For the further investigation of the expression of the three genes (CCR3, CCL14, CCL20) in HCC tissues, RT-qPCR was applied to the detection of mRNA expression of the three genes between HCC tissues and adjacent non-tumor tissues. We affirmed the expression of genes in the signature and obtained consistent results ( Figures 10D-F and Table S5). As for the expression of protein levels, the CPTAC database results showed that the expression of CCL14 in HCC tissues was significantly lower than that in normal tissues ( Figure 10G), while the CCL20 expression level was significantly higher than that in normal tissues ( Figure 10H).

Discussion
Sophisticated genetic mutations and cellular dysfunction drive the formulation of hepatocellular carcinoma (HCC) with extensive heterogeneity. Compared to wide-spectrum molecular drive therapies that apply to specific patient populations for specific cancer types, current immunotherapy and targeted therapies for advanced HCC are "one-size-fits-all", without detailed patient stratification, which limits the efficacy of systemic treatment and fueling the unsatisfied medical needs of HCC (19). The development of a novel, reliable prognostic scoring system is urgent to improve the risk stratification of HCC patients and predict the efficacy of systemic therapy (20). The cancer cell and tumor microenvironment could modulate the expression and function of chemokine-related genes, which in turn shapes the type and consequence of the immune response to malignant cells and cancer systemic therapy (5). However, no studies are focusing on the relationship between CRGs and the prognosis of HCC patients or the tumor immune microenvironment.
In this study, we have established a prognostic signature based on chemokines and chemokine receptors. This signature showed strong predictive power for HCC patients with an external dataset. Notably, we have also developed and evaluated nomograms, which have good accuracy and reliability to facilitate clinical application. The comprehensive multi-omics analysis could detect and discover not only the meticulous genetic modifications and mutations in HCC but also the precise composition of the different cell types in the TIME and their interactions with hepatocellular carcinoma cells (21,22). Therefore, we investigated the association between the chemokine-based signature and the various subtypes of immune cells in the TIME employing different algorithms. We further demonstrated that chemokines play an integral part in the shaping of different types of TIME, which have significant implications for tumor progression and treatment outcome in the TCGA cohort. We identified two chemokine ligands and one chemokine receptor, CCL14, CCL20 and CCR3, which are uniquely advantageous in predicting the prognosis of HCC patients. It has been shown that the overexpression of CCL14 inhibits cancer cell proliferation and promotes apoptosis and that its low expression is associated with poor prognosis in HCC patients (23). MMP-21 promotes macrophage recruitment by increasing CCL-14 levels and M2 macrophage polarization by increasing CSF-1 and FGF-1 expressions, thereby regulating the immune microenvironment and metastasis of HCC (24). The CCL20-CCR6 axis has been reported to be associated with a variety of cancers, including HCC, colorectal cancer, breast cancer, pancreatic cancer, cervical cancer and renal cancer (25). The CCL20-CCR6 axis facilitates cancer progression directly by potentiating the migration and proliferation of cancer cells, and indirectly by reshaping the tumor microenvironment through immune cell manipulation (25). Consistent with our single-cell analysis results, CCL20 gene expression was significantly higher in metastatic HCC tissues (PVTT and MLN). Moreover, the CCL20 gene dominates intercellular communication in MLN and was highly expressed in macrophages, malignant cells and myeloid cells. There is also a study using online databases that have found significantly higher expression of CCL20 in HCC tissues compared to the adjacent nontumor tissues (26). Another retrospective clinical study revealed that the expression of CCL20 was significantly associated with tumor recurrence and survival outcomes in HCC patients (27). These results were consistent with our study. In addition, the role of CCL20 in the TIME is manifested mainly through the recruitment of Th17 cells, immature DCs, Tregs and tumor-associated macrophages (TAMs) (5). No studies of CCR3 in HCC have been reported. However, studies in other types of cancer have shown that CCR3 plays a major role in promoting tumor progression and is strongly associated with poor prognosis of patients, with its role in the TIME exhibited through the recruitment of TAMs to facilitate tumor progression (5,(28)(29)(30). In addition, it has been found that inhibition of CCR3 in cancer cell lines induces polyploid giant cell formation and b-catenin stabilization through the PI3K/Akt/GSK-3b signaling pathway, a process associated with EMT, as a result of CCR3 inhibition, transformed cells acquired enhanced mobility and proliferation (31). Non-polarized (M0) macrophages are derived from monocytes following colony-stimulating factor 1 (CSF-1) induction. M0 macrophages can be differentiated into M1 and M2 macrophages when stimulated by different cytokines. M1 macrophages exhibited cytotoxic properties on tumors by secreting cytokines such as IL-2 and TNF-a. In contrast, M2 macrophages have the potential to promote tumor progression (32). Furthermore, the increased infiltration of TAMs in TIME was associated with worse patient outcomes based on TCGA analysis of HCC (33). The phenotypes of dendritic cells were complex, with either subtype that promotes CD8+ T cell activation or subtypes that contribute to T cell dysfunction (33). Tfh cells were known to promote anti-tumor immune responses but suffered strong immunosuppression due to the high expression of PD-1 (34). NK cells are a subtype of immune cells with anti-tumor properties and serve an indispensable part in the immune surveillance and eradication of tumors. However, in the environment of tumor or chronic infection, NK cells manifest a state of exhaustion similar to that of T cell exhaustion, with depressed effector function and altered phenotype (35). Tregs are a subset of lymphocytes with highly immunosuppressive properties that suppress tumor-infiltrating cytotoxic T lymphocytes and accumulate commonly in HCC (36). The increased infiltration of Tregs in TIME is firmly associated with the aggressiveness and progression of HCC, as well as poor patient prognosis (37). In the present research, macrophages, Tregs, dendritic cells, and Tfh cells accounted for significantly higher proportions of TIME in the highrisk group compared to the low-risk group. At the same time, NK cells, monocytes, and M1 macrophages comprised a substantially more proportion of TIME in the low-risk group. The findings could provide an explanation for the unfavorable outcomes of HCC patients in the high-risk group. In addition, we observed remarkable activation of immune-related functions such as HLA, T cell co-stimulation and immune checkpoints in the high-risk group of HCC patients. The activation of immune functions in these three groups was then confirmed at the level of gene expression. These results corresponded to a higher frequency TMB in the high-risk group of HCC patients. Since meaningful mutations occurring in tumors are tailored into neoantigens which are delivered to CD8+ T cells via major histocompatibility complex (MHC) proteins. The increased TMB results in larger numbers of neoantigens, increasing the opportunity for recognition by CD8+ T cells (38).
We also stratified HCC patients in the TCGA dataset according to risk score profile and assessed the sensitive properties of patients in different groups to specific drugs. Sorafenib was the first small molecule-targeted drug to be demonstrated effectiveness in the systemic therapy of advanced HCC and remains one of the firstline of treatments for advanced HCC (39). Our prediction results indicated lower IC50 values in the low-risk group of HCC patients treated with sorafenib. This suggested that HCC patients in the high-risk group were not as sensitive to sorafenib as those in the low-risk group. In addition, we identified six small moleculetargeted drugs and three chemotherapeutic drugs with meaningful differences in IC50 values in the different risk groups, which shed light on potential chemotherapies and targeted therapy for HCC. GSEA, GO, and GSVA analysis provides mechanistic insights into the discrepancy in the infiltration of various immune cell subtypes of TIME in different groups. GSEA profiling indicated that immune function-related signaling pathways such as Notch, MAPK, P53, and Jak-Stat were notably abundant in the high-risk group of the TCGA cohort. These pathways serve as critical Further affirmation of the expression of genes in the signature. Analysis of (A) CCL14, (B) CCL20, and (C) CCR3 expression differences between HCC and normal tissues by combining TCGA and Genotype-Tissue Expression (GTEx) databases. Further demonstration of significant differences in the expression of (D) CCL14, (E) CCL20, and (F) CCR3 between HCC and normal tissues by RT-qPCR. Differential expression of (G) CCL14 and (H) CCL20 in normal and tumor tissues analyzed by UALCAN.
components in the development of HCC and targeted therapy (40)(41)(42)(43). Correspondingly, metabolism-related signaling pathways were substantially concentrated in the low-risk group, while interactions between nutrients, metabolites and immune cells played a principal role in immune editing and tumor escape. Metabolic reprogramming, for example, is the backbone of T cell differentiation and activation, as the transition from quiescence to activation, proliferation, differentiation and infiltration bears a heavy energy burden (44, 45). We obtained consistent results from GO and GSVA analyses, with considerable discrepancy in the enrichment of immune-and metabolism-associated signaling pathways in the two groups of HCC patients. All of the above results provide sufficient evidence for the immune or inflammatory activeness profile of chemokine-associated signaling in TIME for patients with HCC, and firmly substantiate the underlying mechanism by which the signature predicts the prognosis of HCC patients. Although the chemokine-related signature is a valid independent prognostic factor and was strongly associated with TIME in patients with HCC, some limitations should still be recognized. Firstly, all cohorts used for analysis are sourced from the public database and the conclusions require additional external data validation. Second, the genes in the signature need to be further validated in cellular and animal models for understanding the functions and mechanisms involved in TIME and tumor development.
In conclusion, a novel signature was successfully developed and substantiated to forecast the prognostic outcome of patients with HCC. The relationship and potential mechanisms involved in the chemokines-related signature and the TIME in HCC were preliminarily explored. This study presents a promising landscape for clinical research in the application or targeting of chemokines in monotherapy or combination therapies.

Ethics statement
The studies involving human participants were reviewed and approved by The ethics committee of the Chinese PLA General Hospital (Approval No. S2018-111-01). The patients/participants provided their written informed consent to participate in this study.

Author contributions
ZZ, MM, FW, and ZPZ collected, analyzed and interpreted the data. FW, YZ, MM, JS, LC, and XW provide technical support and assistance with experiments. SL, PX, and ZZ conceived and designed the study. SL and ZZ drafted the manuscript. SL, PX, and ZPZ supervised research and manuscripts. All authors have read and approved the final version of the manuscript.
respectively. (D, E) Concordance index analysis for the training group, testing group, and external validation group respectively.

SUPPLEMENTARY FIGURE 5
Analysis of immune cell infiltration about risk score and prognostic genes in the TCGA database. (A) Overview of 22 immune cell infiltrations in the highand low-risk groups. Correlation analysis of (B) CCL14, (C) CCL20, and (D) CCR3 with immune cells. (E) The landscape of immune cells infiltration between high-and low-risk groups by different algorithms.

SUPPLEMENTARY FIGURE 6
The results of scRNA-seq data reprocessing.