Decoding the Immune Microenvironment of Clear Cell Renal Cell Carcinoma by Single-Cell Profiling to Aid Immunotherapy

Clear cell renal cell carcinoma (ccRCC) is the most common subtype of kidney cancer, and it is the major cause of kidney cancer death. Understanding tumor immune microenvironments (TMEs) is critical in cancer immunotherapies. Here, we studied the immune characterization at single-cell resolution by integrating public data of ccRCC across different tissue types, and comparing the transcriptome features and tumor TME differences in tumors, normal adjacent tissue, and peripheral blood. A total of 16 different types of cell components of ccRCC were identified. We revealed that there is an overall increase in T-cell and myeloid populations in tumor-infiltrated immune cells compared to normal renal tissue, and the B-cell population in the tumor showed a sharp decrease, which indicates that the cells in tumor tissue undergo strong immune stress. In addition, the cell–cell communication analysis revealed specific or conserved signals in different tissue types, which may aid to uncover the distinct immune response. By combining and analyzing publicly available ccRCC bulk RNA-seq datasets, 10 genes were identified as marker genes in specific cell types, which were significantly associated with poor prognosis. Of note, UBE2C, which may be a good indicator of tumor proliferation, is positively associated with reductions in overall survival and highly associated with tumor grade. Our integrated analysis provides single-cell transcriptomic profiling of ccRCC and their TME, and it unmasked new correlations between gene expression, survival outcomes, and immune cell-type components, enabling us to dissect the dynamic variables in the tumor development process. This resource provides deeper insight into the transcriptome features and immune response of ccRCC and will be helpful in kidney cancer immunotherapy.


HIGHLIGHTS INTRODUCTION
Clear cell renal carcinoma (ccRCC) is the most common and lethal form of renal cell carcinoma (RCC) and is responsible for more than 75% of RCC cases (1). It is a malignant tumor with multiple molecular features and a poor prognosis (2). Due to the lack of typical clinical symptoms, it is difficult to diagnose ccRCC and approximately 35% of patients had developed metastasis at the time of diagnosis (3). Studies have demonstrated that ccRCC is among the most immune and vascularly infiltrated cancer types (4). Hence, understanding tumor immune microenvironments (TMEs) is critical for identifying immune modifiers of cancer progression and developing cancer immunotherapies. For example, immune checkpoint blockade therapy and combination regimens have significantly increased survival in patients with ccRCC (5). The infiltrating CD4+ T cells can regulate the proliferation of RCC by modulating the TGFb1/YBX1/HIF2a signals (6). However, major challenges remain, including lack of reliable predictive biomarkers and identification of more immunotherapeutic targets.
Recent applications of single-cell RNA sequencing (scRNA-seq) in dissecting TME have brought important insights into the biology of tumor-infiltrating immune cells, including their heterogeneity, dynamics, and potential roles in the disease progression and response to immune checkpoint inhibitors and other immunotherapies (7)(8)(9)(10)(11). The tumor immunology field has focused heavily on local immune responses in the TME, yet immunity is coordinated across the tissue. For example, many myeloid cells are frequently replenished from hematopoietic precursors in the bone marrow (12), and critical T-cell priming events typically occur in lymphoid tissues (13). The localized antitumor immune response cannot exist without continuous communication with the periphery. Among these non-cancer cells, the tumor-infiltrating immune cells (TIICs) exert a central role in pro-and anti-tumorigenic processes; moreover, they have been found to be closely correlated with the clinical outcome and response to immunotherapy (6). Previous single-cell analyses of renal cell cancers were mostly focused on solid tumor to study mechanisms of intratumorally and intertumoral heterogeneity (14), tumor microenvironment immune subtypes for classification (15), as well as distinct immune characteristic between tumor and peripheral blood or normal renal tissue (16,17). However, the conserved or specific immune response in ccRCC across the peripheral blood mononuclear cells (PBMCs) and adjacent normal tissue in addition to within the tumor also need to be dissected. Therefore, a comprehensive understanding of ccRCC holds the promise to improve personalized treatment strategies.
In this study, we integrated publicly available single-cell RNA-seq data and comprehensively analyzed the immune characterization, as well as dynamic changes in cell subtype composition and intercellular interactions across tumor tissue. Our analyses provide insight into the immune lineages in ccRCC tumors, adjacent tissue, and PBMC. Specifically activated cellular signals in tumor tissue revealed the potential relevance to tumor progression or inflammation, and this may provide vital evidence for dissecting tumor immune response mechanism. In addition, by combining with The Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) RNA-seq transcriptional profile and clinical data, we focused on establishing an understanding of the associations among the TME, biomarkers, and clinical outcomes. Finally, 10 unique markers are identified to be associated with patient prognosis. Notably, UBE2C, which acts as one of the critical biomarkers specifically expressed in CD8+ T_3 cells, is involved in tumor progression and has essential prognostic value. This resource provides deeper insights into ccRCC biology that will be helpful in advancing kidney cancer diagnosis and therapy.

Single-Cell Transcriptional Landscape of ccRCC Tumor, Peripheral Blood, and Adjacent Normal Tissue
In order to elucidate the TME of human ccRCC, we downloaded public available scRNA-seq data to dissect the transcriptome heterogeneity from tumors and matched peripheral blood from three treatment-naive ccRCC patients. In parallel, the scRNA-seq data from three other individuals derived from the renal tumor, adjacent normal tissue, and peripheral blood were downloaded for integrative analyses, aiming to facilitate the identification and assessment of ccRCC-specific differences. The data collection and quality control (QC) criteria are described in Methods. All high-quality cells were integrated into an un-batched and comparable dataset and subjected to principal components analysis (PCA) after correction for reading depth and mitochondrial read counts. Using graph-based uniform manifold approximation and projection (UMAP), we identified 16 clusters across 75,173 cells.
A total of 31,093 cells originated from the tumor, 14,788 cells were normal adjacent tissue-derived, and 29,292 cells were obtained from peripheral blood ( Figures 1A, B). We cataloged cells into 16 distinct cell lineages, including myeloid (CD14 and CD16 cells, cDC2, cDC1, and pDC cells), T cells (CD4+ T cells,  Treg, CD8+ T_1, CD8+ T_2, CD8+ T_3 cells, and NKT cells), B cells (B, plasmablast cells), NK cells, HSPCs, and platelets as the common cell types ( Figure 1C). Seurat (18) cell reference datasets, SingleR package (19), and known markers in the CellMarker database (20) were together used for this cell-type annotation. It is obvious to see that the distribution of the cells from different tissue types of ccRCC is similar, but the abundance of certain subclusters is different, while the original study reported that the cells derived from renal tumors, matched peripheral blood, and healthy normal kidneys were enriched in distinct clusters. This illustrated that the immune response in pathology is different from the healthy condition. The relative proportion of cell types comprised by tissue type is illustrated in Figure 1D. The top five markers of the main cell lineages were visualized as a heatmap ( Figure 1E).

Cell-Type Differences and Hallmark Signatures in ccRCC
To be more comprehensive and intuitive in observing changes in cell composition in different groupings, cell components were visualized in the form of a bar plot in the form of cell types and samples (Figures 2A, B). The cell types contained in different samples were almost identical but the abundance of cells in each cell type was varied. Among the three groups, the abundance of CD14 cell clusters is the most of all the cell clusters, which indicates a vital role of CD14 in tumor immune response. We observed a great decrease of CD4+ T cells and B cells within tumors and adjacent normal kidneys relative to peripheral blood ( Figure 2A). This is similar to the findings of the original study by Borcherding et al. They observed a decreased CD4+ T cells and B cells within healthy normal kidneys or tumors relative to peripheral blood (17). In addition, we also noticed an increase in three subtypes of CD8+ T cells (CD8+ T_1, CD8+ T_2, and CD8+ T_3), and NKT cells in tumors relative to adjacent normal tissue and peripheral blood. The increasing tendency is similar to the decreasing tendency of CD4+ T cells and B cells. In the original study, the authors also revealed that the trends of increased CD8+ and decreased CD4+ T cells were similar, which is performed by immunohistochemistry on paired normal and tumor tissue. cDC1 and cDC2 cell abundance shows a clear decreased tendency from tumor versus adjacent normal tissue and peripheral blood derived from ccRCC patient samples, indicating the activation of adaptive immune responses. The relative abundance of NK cells, CD16 cells, and plasmablast cells in the tumor was comparable to that in peripheral blood but higher than that in adjacent normal tissue. Notably, HSPC cells were significantly enriched in tumor comparison with adjacent normal tissue, and the abundance of HSPC in peripheral blood is much lower. Furthermore, we notice that platelet cells are almost only in the tumor and adjacent normal tissues, and the platelet cells were highly enriched in tumor tissues ( Figure 2B); this suggested an important function of platelets in TME and the central component during the development of tumors. The above result revealed that the tumor microenvironment of ccRCC is highly heterogeneous. A deep understanding of the tumor microenvironment, especially the characteristics of tumor-infiltrating immune cells, is crucial for exploring the key regulatory molecules of tumor development.
Next, we focused on the transcriptomic features of each major cell type. Specific marker genes of each cell cluster were obtained according to their gene profiles, picking the marker gene with the best specific indicative effect ( Figure 2C). To explore the existence of differential expressions of these marker genes in different groups, a violin pagination map was made based on the sample source ( Figure 2D). Most of these markers' relative expression in tumors was comparable to that in peripheral blood and adjacent normal tissue; the differences in these gene expression patterns were only exhibited in distinct subtypes. The following three genes are obviously different in their expression: (1) S100A8 (S100 Calcium Binding Protein A8), which plays a prominent role in the regulation of inflammatory processes and immune response. It was highly enriched in CD14 cells derived from peripheral blood. The unique high expression level of S100A8 in the CD14 cell cluster indicates that the CD14 cells may play pro-inflammatory and anti-tumor roles in ccRCC. (2) JCHAIN (Joining Chain of Multimeric IgA and IgM), which is almost only detected in tumor-derived plasmablast cells. One of the JCHAIN-related pathways is cell surface interactions at the vascular wall; it indicated that the role of plasmablast is closely related to tumor angiogenesis. (3) The third gene was KRT19 (Keratin 19), whose related pathways are embryonic and induced pluripotent stem cell differentiation pathways and lineage-specific markers. In this study, we found that it was detected in platelet cells derived from tumor and adjacent tissue. These marker genes emphasized that the transcriptional features of various immune cells derived from different tissue types are specificity, further demonstrating the immune response heterogeneous in the tumor, matched adjacent normal tissue, and peripheral blood.

CD8+ T_3 Cluster Associated With the Cell Cycling Process
Clustering of cells revealed three distinct CD8+ T subtypes with relative transcriptional specific features, and CD8+ T_3 subtypes highly exhibited cell cycle association characteristics that were mainly in S or G2M phases ( Figure 3A). To further understand the characteristics of the tissue-specific distribution of CD8+ T_3 subclusters, CD8+ T_3 cells were sub-clustered into five distinct clusters ( Figure 3B). The top 5 marker genes in each subgroup were selected to visualize the specific gene expression pattern ( Figures 3C, D). The expression of these marker genes in different tissues showed high heterogeneity. Tissue-infiltrating CD8+ T cells (both tumor and adjacent normal tissue) comprised the majority of C1 and C4. Clusters C2 and C3 comprised tumor and peripheral blood cells, and cluster C0 cells were only derived from renal tumor cells. Going from left to right across the x-axis of the UMAP, there is a change in tissuespecific contribution starting from peripheral blood, tumor, and adjacent normal tissue to the tumor, which may represent the process of tissue infiltration CD8+ T3 cells. Previous studies revealed that the proliferation of CD8+ T cells is an important surrogate marker of the antitumor immune response (17,21). Furthermore, we found that the C0 subcluster has high expression of inhibitory checkpoints, including LAG3, CTLA4, GZMK, and PDCD1. Since these molecules are markers of T-cell exhaustion, these data indicated that C0 subclusters were exhausted CD8+ T cells. At present, PD-1/PD-L1 and CTLA4 are the most popular targets of immunotherapy. Since the expression of LAG3, GZMK, FXYD2, and TRBV20-1 was higher than that of PD-1 in the exhausted T-cell subcluster, our data revealed that LGA3, GZMK, FXYD2, and TRBV20-1 may serve as potential targets for ccRCC immunotherapy. In order to assess possible functional differences based on these subclusters in CD8+ T3 cells, we performed gene set variation analysis (GSVA) ( Figure 3E). As expected, based on the process features of tissue infiltration of CD8+ T3 cells ( Figure 3D), PANCREAS_BETA_CELLS, COAGULATION, and ANGIOGENESIS pathways were upregulated in C4 and C1 clusters ( Figure 3E), and the APICAL_SURFACE signal pathway and the BILE_ACID_METABOLISM pathway are upregulated in C2 and C3 clusters ( Figure 3E), respectively. Cluster C0 showed upregulated activity of the KRAS_SIGNALING_DN pathway ( Figure 3E). These specific pathways in different subsets reflected the characteristics of tumor immune infiltration and response processes. For example, the C2 and C3 clusters exhibited obviously metabolic re-program characteristics, while C1 and C4 demonstrated a strong immune stress response of immune cells no matter which tissue they were derived from in tumor patients, and cluster C1 indicated that the expression of immune checkpoint inhibitors may be associated with the KRAS pathway.
In order to examine gene expression patterns across distinct subclusters in CD8+ T_3 cells, we utilized monocle2 (22) to build branched structures among subclusters, inferring the developmental trajectory of CD8+ T_3 clusters ( Figure 3F). We identified one major curve with the origin in C4; cluster C0, which represented an exhausted T-cell cluster, was inferred as the end state of the differentiation trajectory; C1 and C2 cells were located between these two end states. Afterwards, genes were selected based on biological functions or immune response features to prove the inferring, and the result is consistent with the inferred trajectory ( Figure 3G). For example, LAG3, which is an immune checkpoint inhibitor that represented exhausted T cells, showed high expression at the late pseudotime. In order to ensure the accuracy performance of inferring trajectory, we also performed trajectory inference by the dyno package (23), which is a benchmark of trajectory inferring methods for cellular ordering, topology, scalability, and usability. The top three good performances inferring results of trajectory show high consistency with the predicted results of monocle2 (24) (Supplementary Figures 1A-3A). Lastly, the top 50 differentially expressed genes (DEGs) were selected to visualize the features of gene expression in developmental trajectory in a heatmap ( Figure 3H). The expression pattern of these genes ordered by pseudotime was examined by the top three methods with good performance (Supplementary Figures 1B-3B).

Hallmark Signatures and Metabolism Disturbance of the CD8+ T_2 Cluster
The previous observation showed that the majority of cells in the CD8+ T_2 subcluster were derived from tumor tissue ( Figure 2B), indicating that it is the most infiltrating CD8+ T-cell subset in tumor tissue. Hence, we next focused on the transcriptomic features of the CD8+ T_2 subcluster. Across CD8+ T_2 cells, sub-clustering found six distinct clusters that were labeled as C0 to C5 ( Figures 4A,  B), and the expression patterns of specific marker genes in distinct subclusters were further examined ( Figures 4B, C). Results exhibited tissue-specific distribution, with the majority of tumorinfiltrating CD8+ T cells in C0, C1, C4, and C5. In contrast, both C2 and C3 clusters were composed of renal tumors, in matched adjacent normal parenchyma, and peripheral blood. The expression patterns of marker genes of different tissues were visualized as violin plots ( Figure 4C). Marker genes in tumorinfiltrating CD8+ T cells could be mainly classified into three types: (1) T-cell receptor-associated genes, such as TRDV1, TRGV2, TRGV8, and TRBV20-1; (2) chemokines and cytokines, such as CXCL13, XCL1, and XCL2; and (3) stressful stimuli genes, such as BAG3, HSPA6, and HPPA1A. The above genes played a key role in tumor growth or inflammation. Notably, in the C3 cluster, marker genes like COX1, COX2, COX3, ATP6, and ND3 were only detected in adjacent normal tissue, and these genes play important roles in the mitochondrial oxidative respiratory chain. The finding indicated that mitochondrial oxidative stress is strongly linked to immune responses in tumor progression, and it may contribute to exploring the different immune responses in the tumor, adjacent normal tissue, and PBMC.
In order to assess possible functional differences based on these subclusters, we performed the KEGG pathway and GSVA analysis ( Figures 4D, E). The KEGG enrichment results show that these marker genes in the CD8+ T_2 cluster are essential in interfering with antigen presentation, apoptosis, and host immune system response ( Figure 4D). The MITOTIC SPINDLE, ADIPOG ENESIS, ANDROGEN_RESPONSE, ESTROGEN RESPONSE, and SPERMATOGENESIS pathways all upregulated in CD8+ T_2 subclusters from C0 to C5 ( Figure 4E). We noticed that the activities in certain pathways exhibited two patterns: in C2 and C3 subclusters, the pathway activities are downregulated, while in C0, C1, C4, and C5, the activities of pathways are upregulated. As the cells in C0, C1, C4, and C5 are all tumor-infiltrating, we performed gene set analysis on the genes in C0, C1, C4, and C5 compared with genes in clusters C2 and C3. The results revealed that the I N TE RF ER ON _ G AMMA_ R ESP ON S E p at h way , th e IL2_STAT5_SIGNALING pathway, the TNFA SIGNALING VIA NKFB pathway, the UV_RESPONSE_UP, IL6_JAK_ STAT3_SIGNALING pathway, and the HYPOXIA pathway were upregulated in tumor-infiltrating cells ( Figure 4F). The six hallmark signals not only produce more parsimonious but equivalent results, avoiding the problem of gene set redundancy and overrepresentation altogether, but also provide candidate pathways that play a vital role in ccRCC tumor tissue.

Relationship Between Prognostic Features and Cell Type-Specific Marker Genes in ccRCC
To assess the clinical significance of expression of a given gene set, we utilized RSEM normalized log2 bulk RNA-seq expression data available from TCGA through the Firehose pipeline hosted by the Broad Institute (25). We utilized clinical and outcomes data available through the TCGA project website. From these resources, we filtered our analysis by patients who underwent testing for a renal tumor or in matched adjacent tissue. We then scaled and centered the log2 bulk RNA-seq data for the patients, and used this dataset for gene set enrichment analysis.
To determine if these transcriptional differences led to functional differences in tumor response, we investigated whether gene signatures were associated with prognostic values. Using the TCGA dataset for ccRCC, a total of 899 DEGs between tumor and adjacent normal tissue were identified, 281 survival-associated DEGs were obtained after using univariate Cox proportional hazard regression to evaluate the association between the expression of DEGs and patient overall survival (OS). In addition, by overlapping prognostic-associated DEGs with the top-ranked marker genes in 16 major subclusters at single-cell resolution, 10 cell type-specific marker genes that were associated with prognosis were obtained. The transcriptional features of the 10 markers, including expression patterns in tumor or in matched adjacent normal tissue, as well as the prognostic signatures, are visualized in Figure 5A.
Furthermore, we compared these gene expression patterns at single-cell resolution ( Figure 5B), and the result showed that the expression of these genes in different immune cells is totally distinct. For example, ZNF683 is mainly expressed in NKT cells, while BATF, TNFRSF18, and CTLA4 are mainly expressed in Treg cells. The UBE2C gene is highly enriched in the CD8+ T_3 cluster (which is associated with cell cycle progress), and LILRA4 is mainly enriched in pDC cells. We utilized clinical data from TCGA to explore whether the expression of these genes is  Figure 5D). We also observed that the UBE2C gene expression was associated with increasing histological grades ( Figure 5E), indicating that it may be a potential novel target for tumor progression prediction. Lastly, we compare the gene expression of UBE2C across TCGA cancers ( Figure 5F), results revealed that the gene expression is significantly different between tumor and normal. This indicates that the UBE2C may be a good biomarker for patients on clinical application.

Cell-Cell Communication Diversity in Distinct Tissues in ccRCC
To better understand global communications among cells in tumor progression, accurate representation of cell-cell signaling links and global analyses of those links were required. We integrated cell-cell communications using the CellChat (26) R package across all cell types in ccRCC ( Figure 6). As there were no cells detected in platelet cell groups derived from PBMC, cell-cell communication analysis only focused on the other 15 major cell types. We first examined the overall patterns of communication across all cell populations, and statistical analysis of the strength of cell interactions was performed ( Figure 6A). Results revealed that the cell-cell interaction strength in tumor tissue is obviously higher than that in PBMC and adjacent normal tissue. Next, we compared the signal information flow for each signaling pathway between PBMC and tumor tissue, or adjacent normal and tumor tissue ( Figures 6B, C). The information flow for a given signaling pathway is defined by the sum of communication probability among all pairs of cell groups in the inferred network. We found that some pathways, including BAFF, IL1, IL16, FLT3, TNF, and ANNEXIN, maintain a similar flow between the tissue conditions (black in Figures 6B, C). We interpret that these pathways are equally important in the tumor progression or immune response in both tissues. In contrast, other pathways prominently change their information flow at PBMC or normal as compared to tumor tissue: (i) turn off (BAG), (ii) decrease (such as BAFF, FLT3, and IL1), (iii) turn on (such as GAS and LIGHT), or (iv) increase (such as MIF, PARs, ANNEXIN, and IL16).
Moreover, we studied the detailed interaction strength across all cell types in different groups ( Figures 6D, E). Results once again demonstrated that the communication strength of cells in distinct tissues is significantly altered. The cell-cell interaction strength in tumor tissue experienced an obvious increase in CD8+ T_2 cells, CD8+ T_3 cells, NKT cells, and Treg as compared to PBMC ( Figure 6D). When we compare the interaction strength between normal adjacent tissue to tumor tissue, we noticed that the strength in tumors almost increased. However, the signal information sent from plasmablast to other cells was decreased and the strength of interaction from platelet cells to other cell types was increased in tumors as compared to normal adjacent tissue ( Figure 6E). Hence, these results revealed that cell-cell communication is highly associated with pathophysiological characteristics of the tumor microenvironment.

Discovering Major Signaling Changes in Response to ccRCC
Intercellular connections are an important pathway for cell-cell crosstalk. Such crosstalk in cells is critical for informing diverse cellular decisions, including decisions to activate the cell cycle or programmed cell death, undergo migration, or differentiate along the lineage. In humans, cell-cell crosstalk is mediated through ligand-receptor signaling pathways or secretion/uptake of exosome-transmitting information across the surrounding intercellular environment. Hence, we studied the detailed changes in the outgoing signaling across all significant pathways using pattern recognition analysis. There are 25 and 30 signaling pathways with differential strength in tumors compared to PBMC or normal tissue, separately ( Figures 7A, B), including MIF, GALECTIN, FLT3, IL16, CCL, and MK pathways. We found that the accumulated effect strength of platelet cells obviously increased in tumor tissue compared with PBMC or normal tissue. Notably, the SPP1 signal is turned on in tumor tissue, and the accumulated effect strength of it in the tumor is obviously higher than other pathways. This suggests that SPP1 is an important pathway that may be associated with tumor progression.
Next, we inferred intercellular communication networks for the tumor as compared to PBMC or adjacent normal tissue separately. Six pathways were specifically active in tumors compared to PBMC, including known inflammatory signals MIF, TNF, and IL16, suggesting that these pathways might critically contribute to tumor progression. Specific to MIF signaling in tumor tissue, CellChat identified ligand MIF and its multi-subunit receptor CD74+CXCR4 as the most significant signaling, contributing to the communication from CD4+ T cells to CD8+ T2 and CD8+ T3 cells (Figures 7C, D). This is in agreement with a reported experiment finding. That study reported that CD74 and CXCR4 are upregulated in renal cells in diseased kidneys and MIF activation of CD74 in kidney cells promotes an inflammatory response (27). Ligand TNF and its receptor TNFRSF1B were found to act as major signaling from CD4+ T cells to CD8+ T2/T3 cells, and the ability of the TNF-TNFRSF1B pair to kill tumor cells in vitro has been reported before (28). Ligand CD70 and its receptor CD27 were also found to be active in tumors, in particular, for the signaling from Treg to CD8+ T2 and CD8+ T3 (Figures 7C, D). This reveals the dynamic changes in the levels and patterns of ligand-receptor expression in different tissue or tumor progression conditions. By computing the Euclidean distance between any pair of the shared signaling pathways, we observed a large distance for signaling pathways like IL16, ANNEXIN, GALECTIN, BTLA, TNF, and MIF, suggesting that these pathways exhibit significantly different communication network architectures in tumors compared to PBMC or adjacent normal tissue ( Figures 7E, F). The signaling pathways also show relatively small distances in tumors as compared to normal adjacent tissue, including CD40, APRIL, and FLT3. This indicates communication network architectures for these overlapping pathways in both tumor and normal tissues. A closer look at the IL16 pathway ( Figures 7G, H) shows its high signaling redundancy (i.e., multiple signaling sources) and high target promiscuity. The latter finding indicated that certain pathways have highly conserved signaling architecture (i.e., high degree of redundancy), which is largely independent of the specific cellular composition of the tissue. Taken together, the results provided insights into the heterogeneity of cell-cell interactions from different locations within the tumor, and it may reveal how various tissue microenvironments influence which cell-cell interactions occur.

DISCUSSION
With the improvement of understanding of how immunotherapies work, the phenotypic and functional profile of immune cells in the TME is now well known to influence prognosis and disease outcome. The latest advances in immunotherapy are completely changing the pattern of clinical immuno-oncology and are significantly improving the survival rate of patients with various cancers (29). Although the single-cell expression profile of ccRCC was previously reported (17), the original study mainly focuses on the distinct immune characteristic of tumor and peripheral blood or normal renal tissue. In this study, we integrated publicly available single-cell RNA-seq data and comprehensively analyzed the immune characterization. Compared with adjacent normal tissue and PBMC, CD8+ T2 and CD8+ T3 cells were significantly tumorinfiltrating cell clusters, which were cell cycle or T-cell exhaustionassociated cells, indicating well-functioning immunosurveillance in tumor tissue of ccRCC (30)(31)(32). Consistent with this finding, enrichment of CD8+ T (CD8+ T1, CD8+ T2, and CD8+ T3) cells and DCs (cDC1 and cDC2) in tumor tissue conferred enhanced immune activation and recruitment of antitumor effector cells (32,33), while the abundance of B cells was fewer in tumor tissue compared to adjacent normal tissue and PBMC. B cell- mediated antibody production can lead to the killing of tumor cells through the complement cascade activation, phagocytosis by macrophages, and activation of the tumor-killing activity of NK (34). The decreased population of B cells may suggest an increased pro-tumoral activity. The different abundance of specific cell types in distinct tissues revealed the immune response association and heterogeneity in the tumor, adjacent normal tissue, and PBMC. The results will facilitate the understanding of intratumor heterogeneity and the immune microenvironment complexity in the ccRCC. As critical players in tumor immunity, CD8+ T cells can directly recognize and kill cancer cells upon recognition of neoantigens. In addition, it also responds to various cues to tune their developmental lineages and states. In this study, we identified three subclusters of CD8+ T cells. We revealed that CD8+ T_3 cluster cells, which mainly stay on cell cycle G2/M and S1 phases, showed exhaustion characterization in ccRCC. T-cell exhaustion was previously characterized as a loss of proliferative capacity and high levels of PD-1 and LAG3 expression in human tumors (28,35,36). We identified several markers specifically expressed in the C0 cluster that was tumor-infiltrated, such as checkpoint inhibitor (LAG3), cytotoxicity-associated genes (GZMK), and oncogenic driver gene (FXYD2), further indicating that these CD8+ T cells were exhausted. This finding is consistent with the original study (17). In addition, we noticed that the expression of LAG3 and GZMK is higher than that of immune inhibitors, such as PD-L1, TIM3, and CTLA4, in the CD8+ T_3 cluster; thus, LAG3 or GZMK may be a better target for tumor immunotherapy in ccRCC.
A total of 10 candidate biomarkers were obtained by combining analysis with TCGA-KIRC bulk RNA-seq data. Some of them have been mentioned in hematological malignancies or solid tumors and affect tumor progression (35,(37)(38)(39)(40)(41). Previous studies may report that some of them were associated with prognostic signature (36,(42)(43)(44), but they all focus on gene expression level or epigenetic modification. In this study, we identified several cell type-specific expressed biomarkers, including CCL5, BATF, and CTLA4 at a single-cell resolution, whose good clinical prognostic signature may be highly associated with specific immune cell types, like CD8+ T cells and Tregs. Lin et al. (45) reported that CCL5 was highly expressed in breast cancer lymph node metastasis and that CCR5-CCL5 interaction promotes cancer cell migration under hypoxic conditions. Zhang et al. (46) reported that CCL5 deficiency delayed tumor growth and metastasis via facilitating CD8+ T cells to accumulate into the tumor site. NNMT was identified as one clear cell renal cell carcinoma (ccRCC)-associated gene (47), and it induces the proliferation and invasion of squamous cell carcinoma cells (48); Tang et al. (47) demonstrated the crucial role of NNMT in the promotion of cellular invasion in ccRCC cell lines. Therefore, the high expression of NNMT in ccRCC was linked to poor prognosis, further suggesting that it may be a potential biomarker for worse prognosis. We also found that CTLA4 was upregulated in CD8+ T_2 and Treg cells in ccRCC tissues and closely related to the disease progression as well as a poor prognosis. This is consistent with the previous study result that Tregs can inhibit the activation of C D 8 + T c e l l s t h r o u g h C T L A 4 , t r i g g e r i n g t u m o r immunosuppression (49). In addition, we found that CTLA4 was markedly correlated with multiple immune checkpoints, which suggested that ccRCC patients with high expressed CTLA4 may benefit more from immune checkpoint blockade combined therapy. Notably, we identified UBE2C, the most important biomarker that is specifically expressed in exhausted CD8+ T cells, as being associated with clinical factors including TNM stage, gender, and pathological stage. Higher UBE2C expression predicted shorter OS and progression-free survival. This is consistent with the previous study that UBE2C is an important gene in ccRCC and is essential to the proliferation and migration of ccRCC (50). Strikingly, we noticed that the expression of UBE2C in CD8+ T_3 cluster cells is almost identical, regardless of origin (tumor tissue, adjacent normal tissue, or PBMC in ccRCC). Hence, we concluded that UBE2C may be a critical factor for predicting the prognosis of ccRCC patients and that the detected expression level of UBE2C from PBMC may contribute to predicting tumor progression and aid immunotherapy in ccRCC.

CONCLUSION
In conclusion, this study comprehensively compares the transcriptomic signature in different regions of ccRCC, and it unmasked the conserved and specifically activated signals among tumor tissue, adjacent normal tissue, and PBMC in ccRCC. The biomarkers identified uncovered the correlations between gene expression, survival outcomes, and immune cell-type components, aiding in the development of more effective immunotherapy strategies for ccRCC.

GEO Dataset Acquirement
All samples were obtained from the National Center for Biotechnology Information GEO dataset. GSE121636 (17) performed single-cell sequencing on the peripheral blood and tumor-infiltrating cells of three patients with renal clear cell carcinoma. GSE139555 (51) sequenced the peripheral blood, adjacent tissues, and tumor tissues of three patients who were sick of renal cell carcinoma. We divided all patients into three groups according to their origin tissue (PBMC: peripheral blood; T: renal tumor; N: normal adjacent tissue).

Single-Cell Data Processing
Raw data were converted into a Seurat object by the R package Seurat (v 3.1.2) (18). Cells whose percentage of ribosomes or percentage of mitochondria are less than 15 or single cells with less than 500 genes detected were considered low-quality cells and were removed. In order to eliminate potential doublets, single cells with over 4,000 genes detected were also filtered out. Finally, 75,173 single cells remained, and they were applied in downstream analyses.
After quality control, the Seurat object was normalized by the function SCTransform of the Seurat package. Since samples from six patients were processed and sequenced in batches, their origin tissue was used to remove the potential batch effect. In this progress, the top 2,000 variable genes were used to create potential anchors with the FindIntegrationAnchors function of Seurat. Subsequently, the Harmony function was used to integrate data and merge a new matrix with 2,000 features, in which potential batch effect was regressed out.
To reduce the dimensionality of the scRNA-Seq dataset, PCA was performed on an integrated data matrix. With the elbowplot function of Seurat, the top 30 PCs were used to perform the downstream analysis. The main cell clusters were identified with the FindClusters function offered by Seurat, with resolution set as default (res = 0.8), and then they were visualized with 2D UMAP plots (52). Conventional markers described were used to categorize every cell into a known biological cell type. Firstly, all cells were clustered into twenty-four major clusters and further clustered into sixteen clusters in a previous annotation by Azimuth, SingleR (19), and CellMarker database (20). All details regarding the Seurat analyses performed in this work can be found in the website tutorial (https://satijalab.org/seurat/v3.0/ pbmc3k_tutorial.html).

Cell-Cell Communication Analysis
The CellChat v1.1.3 software (26) was used to infer cell-cell communication based on ligand-receptor interaction with default parameters. For each ligand-receptor pair, only the secreted signaling interaction category was considered for downstream analysis. We filtered out the cell-cell communication if there are fewer than 15 cells in certain cell groups. The statistical significance of communication probability values was assessed by a permutation test. p < 0.05 was considered statistically significant.

Marker Gene Selection Is Specific to Clusters
For pairwise comparisons between clusters, we manually calculated the log2 fold change (log 2 FC) between each cluster using the Seurat FindMarkers function. Genes were required to be expressed in >10% of cells with each of the sixteen cell clusters. Genes were selected as marker genes based on the statistical threshold [log2FC > 0.25, adjusted p-value (Bonferroni) < 0.01]. The top 5 genes of each cluster were visualized with heatmap plots.

Data Collection and Code Availability Statement
The datasets analyzed during the current study are available in the TCGA, TCGA-KIRC, and GEO repository (GSE121636 and GSE139555). The data and code are available in GitHub (https:// github.com/Wang-biolab/scRNA-ccRCC/). Transcriptome RNAsequencing data of KIRC were downloaded by the R package RTCGA.rnaseq. There were 534 cases of KIRC tissues and 72 cases of normal tissues. The clinical information and demographic data were also obtained by the R package RTCGA.clinical.

Differentially Expressed Genes
DEGs between KIRC tissues and normal tissues were preliminarily screened via the R software limma package and edgeR package. DEGs were determined with the following cutoff value: false discovery rate (FDR) = 0.01, log 2 |fold change| = 3.

Survival Analysis
The OS was involved as the endpoint and index for the prognostic outcome. For DEGs, the survival R package was applied to screen the survival-associated DEGs by univariate Cox analysis, with a p-value < 0.01 based on the log-rank test. Kaplan-Meier curves were generated to illustrate the relationship between patients' OS and gene expression levels of DEGs.

Pathway and Functional Annotation
A gene functional enrichment analysis was performed based on the marker genes in each cell cluster using KEGG pathway enrichment analysis. These DEGs were loaded into cluster profiles for the GO and KEGG pathway enrichment analysis (53). Pathways for which the adjusted p-value was less than 0.05 were considered significantly enriched.
Gene set enrichment analysis (GSEA) algorithm was implemented to evaluate the relative activation status between gene expression within clusters and sets of genes of known biological significance. Gene sets used for analysis were derived from the C7 (immunological gene sets) database available through the MSigDB Collections at the Broad Institute for all gene set analysis. Only gene sets with a false discovery rate (FDR) p-value less than 0.05 and nominal p-values less than 0.05 were considered significantly enriched.
GSVA was performed on the 50 hallmark pathways annotated in the molecular signature database (54), which was exported using the GSEA Base package (version 1.40.1). The GSVA package (version 1.26.0) was applied with default settings to assign pathway activity estimates to individual cells.

Single-Cell Trajectory Analysis
R package (monocle2) (24) was applied to conduct cellular trajectory analysis with the assumption that one-dimensional "time" can describe the high-dimensional expression values, the so-called pseudotime analysis of single cells. Genes for ordering cells were selected if they expressed more than 1% of the cells, their mean expression value was larger than 0.3, and dispersion empirical value was >1. Based on the "DDRTree" method, the data were reduced to two-dimensional, and then the cells were ordered along the trajectory.

DATA AVAILABILITY STATEMENT
The datasets analyzed during the current study are available in the TCGA, accession numbers TCGA-KIRC, GEO repository (GSE121636 and GSE139555). collected the data. JX and JL analyzed the data and performed the computations. ZZ, PB, and KWX wrote the manuscript. All authors contributed to the article and approved the submitted version.