The Construction and Analysis of ceRNA Network and Immune Infiltration in Kidney Renal Clear Cell Carcinoma

Background: Kidney renal clear cell carcinoma (KIRC) has the highest invasion, mortality and metastasis of the renal cell carcinomas and seriously affects patient’s quality of life. However, the composition of the immune microenvironment and regulatory mechanisms at transcriptomic level such as ceRNA of KIRC are still unclear. Methods: We constructed a ceRNA network associated with KIRC by analyzing the long non-coding RNA (lncRNA), miRNA and mRNA expression data of 506 tumor tissue samples and 71 normal adjacent tissue samples downloaded from The Cancer Genome Atlas (TCGA) database. In addition, we estimated the proportion of 22 immune cell types in these samples through “The Cell Type Identification by Estimating Relative Subsets of RNA Transcripts.” Based on the ceRNA network and immune cells screened by univariate Cox analysis and Lasso regression, two nomograms were constructed to predict the prognosis of patients with KIRC. Receiver operating characteristic curves (ROC) and calibration curves were employed to assess the discrimination and accuracy of the nomograms. Consequently, co-expression analysis was carried out to explore the relationship between each prognostic gene in a Cox proportional hazards regression model of ceRNA and each survival-related immune cell in a Cox proportional hazards regression model of immune cell types to reveal the potential regulatory mechanism. Results: We established a ceRNA network consisting of 12 lncRNAs, 25 miRNAs and 136 mRNAs. Two nomograms containing seven prognostic genes and two immune cells, respectively, were successfully constructed. Both ROC [area under curves (AUCs) of 1, 3, and 5-year survival in the nomogram based on ceRNA network: 0.779, 0.747, and 0.772; AUCs of 1, 3, and 5-year survivals in nomogram based on immune cells: 0.603, 0.642, and 0.607] and calibration curves indicated good accuracy and clinical application value of both models. Through co-correlation analysis between ceRNA and immune cells, we found both LINC00894 and KIAA1324 were positively correlated with follicular helper T (Tfh) cells and negatively correlated with resting mast cells. Conclusion: Based on the ceRNA network and tumor-infiltrating immune cells, we constructed two nomograms to predict the survival of KIRC patients and demonstrated their value in improving the personalized management of KIRC.

Background: Kidney renal clear cell carcinoma (KIRC) has the highest invasion, mortality and metastasis of the renal cell carcinomas and seriously affects patient's quality of life. However, the composition of the immune microenvironment and regulatory mechanisms at transcriptomic level such as ceRNA of KIRC are still unclear.
Methods: We constructed a ceRNA network associated with KIRC by analyzing the long non-coding RNA (lncRNA), miRNA and mRNA expression data of 506 tumor tissue samples and 71 normal adjacent tissue samples downloaded from The Cancer Genome Atlas (TCGA) database. In addition, we estimated the proportion of 22 immune cell types in these samples through "The Cell Type Identification by Estimating Relative Subsets of RNA Transcripts." Based on the ceRNA network and immune cells screened by univariate Cox analysis and Lasso regression, two nomograms were constructed to predict the prognosis of patients with KIRC. Receiver operating characteristic curves (ROC) and calibration curves were employed to assess the discrimination and accuracy of the nomograms. Consequently, co-expression analysis was carried out to explore the relationship between each prognostic gene in a Cox proportional hazards regression model of ceRNA and each survival-related immune cell in a Cox proportional hazards regression model of immune cell types to reveal the potential regulatory mechanism.
Results: We established a ceRNA network consisting of 12 lncRNAs, 25 miRNAs and 136 mRNAs. Two nomograms containing seven prognostic genes and two immune cells, respectively, were successfully constructed. Both ROC [area under curves (AUCs) of 1, 3, and 5-year survival in the nomogram based on ceRNA network: 0.779, 0.747, and 0.772; AUCs of 1, 3, and 5-year survivals in nomogram based on immune cells: 0.603, 0.642, and 0.607] and calibration curves indicated good accuracy and clinical application value of both models. Through co-correlation analysis between ceRNA and immune cells, we found both LINC00894 and KIAA1324 were positively correlated with follicular helper T (Tfh) cells and negatively correlated with resting mast cells.

INTRODUCTION
Renal cell carcinoma is a malignant kidney tumor that the American Cancer Society reports was diagnosed in 65,340 patients, and caused 14,970 deaths, in the United States during 2018 (Siegel et al., 2018). It is a serious health problem worldwide, and the economic burden of the disease has steadily increased during the last century (Hsieh et al., 2017). Kidney renal clear cell carcinoma (KIRC) is the subtype of renal cell carcinoma with the highest invasiveness, mortality and metastasis rates (Xiao et al., 2020); it is typically screened and diagnosed through computed tomography (CT), magnetic resonance imaging (MRI), and pathological section testing in the clinic (Morrissey et al., 2015). Several difficulties continue to make KIRC a challenge to treat, including its stealth in early stages, a lack of effective biomarkers, radiation risks and the high cost of diagnostic imaging required. Underlying biomarkers and molecular mechanisms for the prevention, diagnosis and prognosis of KIRC require further studies.
Long non-coding RNA (lncRNA) is a type of RNA that contains more than 200 nucleotides but does not have the function of encoding proteins. Due to their potential for exploring novel biomarkers in diseases and elucidating mechanisms of biological processes, they have gradually achieved attention recently (Nagano and Fraser, 2011;Quinn and Chang, 2016). Current research reveals that lncRNA plays an important role in the genesis and development of tumors through interactions with competing endogenous RNA (ceRNA) or by acting as microRNA sponges (Tay et al., 2014). Tumorinfiltrating immune cells are a major component of the tumor microenvironment and affect the clinical outcomes and pathological staging of multiple tumor types (Gao et al., 2020;Hong et al., 2020). Some studies suggest lncRNA can affect tumor development and tumor immune cell microenvironment through the ceRNA network (Huang et al., 2018;Cao et al., 2019;Xu et al., 2019). Therefore, exploring interactions between the ceRNA network and various immune cells in the development of KIRC is essential.
In this study, we construct a ceRNA network related to the occurrence and development of KIRC using transcriptome and clinical data from The Cancer Genome Atlas (TCGA) 1 to explore potential molecular mechanisms. In addition, we used "The Cell Type Identification by Estimating Relative Subsets of RNA Transcripts algorithm" (CIBERSORT) 2 (Newman et al., 2015) algorithm to assess differences in the composition of immune cells within tumor tissues and normal tissues. Furthermore, predictive nomograms were constructed based on the ceRNA network and on significant immune cells. Finally, we evaluated the relationship between tumor-infiltrating immune cells and the ceRNA network to identify the underlying regulatory mechanisms.

Data Source and Selection
A flowchart illustrating the research process of our study is provided in Figure 1. Data on lncRNA, miRNA, and mRNA expression of KIRC patients were downloaded from TCGA database (see text foot note 1) through the GDC data transfer tool. Relevant clinical information including age, gender, survival data (vital status, days to death, and days to last follow-up), grade and TNM stage of tumor tissues was also obtained for further analysis. We excluded samples with incomplete survival or pathological staging data, as well as duplicate samples. Only samples with both mRNA and miRNA expression profiles were included in this study. Correspondingly, the adjacent and paired normal tissues collected from the included patients was set as the control group.

Differential Expression Analysis
Based on the HTSeq-Counts data of included samples, we used the "DESeq2" package (Love et al., 2014) 3 of Bioconductor to screen for differentially expressed lncRNAs (DElncRNAs), miRNAs (DEmiRNAs) and mRNAs (DEmRNAs) between KIRC tissue and normal adjacent tissue. The log fold change (FC) criterion for differential expression was set as |log 2 (FC)| > 1 and the false discovery rate (FDR) at < 0.05. For the differential expression results of each type of RNA, volcano plots and heatmaps were generated through R packages "ggpolt2" and "pheatmap." Construction of lncRNA-miRNA-mRNA ceRNA Network and Survival Analysis Using the differential expression results, we predicted the miRNAs-lncRNAs interactions and the miRNAs-mRNAs interactions based on the starBase v2.0 database (Yang et al., 2011;Li et al., 2014), 4 which contained the miRNA-target interactions overlapped with CLIP-Seq data. We conducted hypergeometric tests and correlation analysis to filter the predicted interactions with FDR < 0.05 as the screening criteria. In addition, the lncRNA-miRNA-mRNA regulatory network of KIRC was visualized using Cytoscape v.3.8.1 5 (Shannon et al., 2003). Besides, we conducted a Kaplan-Meier (K-M) survival curve analysis to assess the prognostic value of genes in the KIRC ceRNA network.

ceRNA Network: Cox Proportional Hazards Regression Model
Firstly, significant variables were screened for integration into the initial Cox model using univariate Cox analysis. Next, a least absolute shrinkage and selection operator (Lasso) regression was employed to further evaluate the fitness of multifaceted models and filter variables. Finally, we generated a nomogram of from multivariate Cox proportional hazards regression models to predict the prognosis of KIRC patients. To confirm the discrimination and precision of the nomogram, receiver operating characteristic curves (ROC) and calibration curves were employed. In addition, we calculated the risk score of each patient and then divided patients into high and low risk groups based on the median to conduct a K-M survival curve analysis.

Tumor-Infiltrating Immune Cells: CIBERSORT Estimation and Survival Analysis
To analyze differences between the two groups at the level of immune cells, and explore the connection between key biomarkers in the ceRNA network and immune cells, we estimated the proportion of 22 kinds of immune cell types in 506 tumor tissues and 71 normal adjacent tissues by CIBERSORT (Newman et al., 2015), an algorithm for characterizing cell composition of complex tissues from their gene expression profiles. Only samples with CIBERSORT output of P < 0.05 were included in further analysis. After screening samples, we searched for immune cells with significant differences between cancer tissues and normal adjacent tissues, as indicated by a Wilcoxon rank-sum test. A K-M survival curve analysis was performed to evaluate the relationship between the content of different immune cells and the overall survival of KIRC patients.

Tumor-Infiltrating Immune Cells: Cox Proportional Hazards Regression Model
Immune cells with significant effects in univariate Cox analysis were integrated into the initial Cox model and the Lasso regression again used to evaluate the fitness of multifaceted models and filter variables further. As above, we constructed a nomogram based on the multivariate model of immune cells, and used ROC and calibration curves to confirm the discrimination and precision of the nomogram. Similarly, we conducted a K-M survival curve analysis by dividing patients into high and low risk groups according to the median of risk score. Finally, a Pearson correlation analysis was used to explore the relationship between genes and immune cells.

Differential Expression in lncRNAs, miRNAs, and mRNAs
After screening the samples of TCGA-KIRC, we constructed an experimental group and a control group containing of 506 tumor tissues and 71 normal adjacent tissues, respectively. Compared to the control group, we identified 357 DElncRNAs (287 upregulated and 70 down-regulated), 132 DEmiRNAs (61 upregulated and 71 down-regulated) and 3092 DEmRNAs (2,032 up-regulated and 1,060 down-regulated) using the cutoffs of the |log 2 (FC)| > 1 and FDR < 0.05 (Figures 2A-G).

ceRNA Network: Cox Proportional Hazards Regression Model
Univariate Cox analysis applied to filter genes in the ceRNA network retained 97 genes for incorporation into the initial model (Supplementary Table 2). Lasso regression analysis indicated that 15 genes were statistically significant in this model. Subsequently, a Cox proportional hazards regression model with seven genes was established through further screening of key genes based on the Akaike Information Criterion (AIC). Finally, we constructed a nomogram to predict the 1, 3, and 5-year overall survival probability of KIRC patients. ROC [area under curves (AUCs)] for the model gave 1, 3, and 5-year survivals of 0.779, 0.747, and 0.772, respectively, which reflected the discrimination and precision of the nomogram with calibration curves (Figure 4). Overall survival (based on K-M curves) of the high-risk group is lower than that of the low-risk group (Supplementary Figure 1). Survival curves of the genes in the model are also provided in Supplementary Figure 1.

Tumor-Infiltrating Immune Cells: CIBERSORT Estimation and Survival Analysis
The analysis by CIBERSORT identified five normal tissue samples and 206 tumor tissue samples for inclusion in subsequent analysis (P < 0.05) and the proportions of 22 immune cells in each sample are displayed in the histogram and heat map of Figure 5. A Wilcoxon rank-sum test showed that naive B cells (P < 0.001), plasma cells (P < 0.001), CD8 T cells (P = 0.014), CD4 naive T cells (P < 0.001), CD4 memory resting T cells (P = 0.013), regulatory (Tregs) T cells (P = 0.015), gamma delta T cells (P = 0.021), resting dendritic cells (P = 0.016), and resting mast cells (P = 0.048) all varied between the two tissue sample types ( Figure 5C). Based on the results of K-M survival curve analysis, five immune cells, including memory B cells, plasma cells, follicular helper T (Tfh) cells, T regulatory (Tregs) cells, and resting mast cells were related to overall survival of KIRC patients (Supplementary Figure 2).The clinical correlation of 22 immune cells is displayed in Supplementary Figure 3.

Tumor-Infiltrating Immune Cells: Cox Proportional Hazards Regression Model
After filtering based on univariate Cox analyses (Supplementary Table 3), two of 22 immune cells (Tfh and resting mast cells) were incorporated into the initial regression model. Lasso regression analysis and AIC were then employed for further optimization of the model, and both cells were kept in the final Cox proportional hazards regression model. A nomogram was developed to predict the 1, 3, and 5-year overall survival probability of KIRC patients based on the model. Both the ROC (AUCs of 1, 3, and 5-year survivals: 0.603, 0.642, and 0.607) and the calibration curves suggested that the nomogram had good accuracy (Figure 6). The K-M survival curves suggested that significant differences existed in the overall survival of the high versus low-risk group (Supplementary Figure 4).

Co-expression Analysis of Key Genes and Immune Cells
Pearson correlation analysis was used to explore the coexpression pattern between 22 types of immune cells (Figure 7A). Similarly, the co-expression relationship between the key biomarkers and immune cells of the two models is illustrated in Figure 7B. Tfh cells are positively correlated with LINC00894 ( Figure 7C, R = 0.36, P < 0.001) and KIAA1324 ( Figure 7D, R = 0.39, P < 0.001), while resting mast cells are negatively correlated with LINC00894 ( Figure 7E, R = −0.32, P < 0.001) and KIAA1324 (Figure 7F, R = −0.30, P < 0.001).

Validation of the Correlation Between Key Genes and Immune Cells
Multiple databases were searched to verify the correlation between the key genes (LINC00894 and KIAA1324) and immune cells (Tfh cells and resting mast cells) identified. B cell lymphoma 6 (BCL6), CXCL13, CD10 (MME), ICOS, and PD-1 (PDCD1) were the most common surface markers of Tfh cells in the CellMarker database (Supplementary Figure 5). In the Oncomine TM platform, we explored the expression of KIAA1324 and Tfh markers among various cancers. KIAA1324, BCL6, CXCL13, ICOS, and PDCD1 were highly expressed and MME was downregulated in KIRC compared to normal kidney tissue (Supplementary Figure 6); median rank and P-values are listed in Supplementary Table 4.
Using the gene expression profiling interactive analysis (GEPIA), KIAA1324 is positively correlated with CXCL13, ICOS,   Figure 7), and LINC00894 is positively correlated with BCL6, ICOS, and PDCD1 in KIRC (Supplementary Figure 8). Similarly, the results of the LinkedOmics database showed that KIAA1324 was positively correlated with BCL6, CXCL13, ICOS, and PDCD1 in KIRC. In addition, the expression of KIAA1324 was significantly different   among tumor stages and related to tumor purity in KIRC (Supplementary Figure 9).

DISCUSSION
Kidney renal clear cell carcinoma is one of the most common and malignant subtypes of kidney tumors (Xiao et al., 2020). However, because of limitations in early detection and a lack of sensitive biomarkers, patients with KIRC typically have a poor prognosis.
Recently, ceRNA was identified as playing a potentially important role in the molecular regulatory function of tumorigenesis and growth, and in affecting differences in immune-infiltrating cells within multiple tumors. In this study, we investigated the role and connection of the ceRNA network and immune infiltration in KIRC to explore key prognostic biomarkers and potential regulatory mechanisms. We established a ceRNA network consisting of 12 lncRNAs, 25 miRNAs, and 136 mRNAs and compared the composition differences of immune cells related to KIRC. Based on the results, two prediction nomograms consisting of seven genes and two immune cells, respectively, were constructed to evaluate overall survival. A correlation analysis of the key factors in both nomograms indicated that both LINC00894 (lncRNA) and KIAA1324 (protein-coding RNA) were positively correlated with Tfh cells and negatively correlated with resting mast cells. Consequently, we infer that ceRNA regulatory function, in which both LINC00894 and KIAA1324 participate, and Tfh cells and resting mast cells may play a crucial role in the development and treatment of KIRC.
LINC00894 is lncRNA derived from a locus on the X chromosome that is differentially expressed in various cancers . Similar to our work, Liang (Liu et al., 2018) reported that LINC00894 was a tumor-special lncRNA in KIRC that was upregulated in tumor tissues and correlated with overall survival time. Epithelial-to-mesenchymal transition (EMT) regulating miRNAs regulate drug resistance and cancer progression and metastasis (Park et al., 2008;Ward et al., 2013). LINC00894 can inhibit the TGF-β2/ZEB1 signaling pathway by acting as the sponge of EMT-regulating miR-200, or can be upregulated by ERα activation and positively regulate the expression of miR-200a-3p and miR-200b-3p, which results in induction of the TGF-β2/ZEB1 signaling pathway to reduce the occurrence of drug resistance in breast cancers Farhan et al., 2019;Huang et al., 2020). A large number of miRNAs are dysregulated in the development and treatment of KIRC, and affect tumor proliferation, metastasis and treatment resistance, etc. through degrading target mRNA (Qu et al., 2016;Zhai et al., 2018;Incorvaia et al., 2020). In our study, hsa-miR-130b-3p is the only miRNA in the model we have constructed related to patient prognosis. It showed high abundance in the serum of patients with Wilms tumor, and reflected valuable diagnostic potentials with an AUC of 0.94 (Ludwig et al., 2015). In addition, its upregulation in hepatocellular carcinoma leads to a decrease in the expression of HOXA5, which in turn promotes tumor growth and angiogenesis, is associated with a poor prognosis (Liao et al., 2020). In the ceRNA network we constructed related to KIRC, a large number of lncRNA/miRNA/mRNA axes were revealed, and further experiments are needed to verify their function and mechanism.
KIAA1324, also known as EIG121, encodes a 1013 amino acid transmembrane protein that shows high sequence conservation among different species. In type I endometrial cancer (estrogenrelated), the expression of KIAA1324 is upregulated, but it is downregulated in type II endometrial cancer (not estrogen-related), which is more malignant than type I (Deng et al., 2005). KIAA1324 may be a novel suppressor, being epigenetically downregulated in gastric cancer (Kang et al., 2015); in addition, its high expression in endometrial and pancreatic carcinoma predicts improved survival (Oh et al., 2006;Westin et al., 2009). However, increased expression of KIAA1324 in ovarian cancer is associated with a poor prognosis (Schlumbrecht et al., 2011). These studies indicate that KIAA1324 may exhibit different biological functions in various cancers. Furthermore, recent mechanistic studies reveal that KIAA1324 is localized to endosome-lysosome compartments and associated with autophagy, which helps protect cells from cell death by upregulating the autophagy pathway under unfavorable conditions, such as starvation or chemotherapy (Deng et al., 2010). Therefore, we infer that overexpressed KIAA1324 helps cancer cells proliferate and resist chemotherapy in KIRC.
In this study, we explored five among 22 types of immune cells related to patient prognosis, and screened out the Tfh cells and resting mast cells to construct a Cox proportional hazards regression model, which evaluation of its AUC suggests has clear clinical value. Gou et al. (Zhu et al., 2019) also explored the relationship between immune cells and renal cell carcinoma, and similar to our research, found Tfh cells were associated with poor prognosis and resting mast cells were positively associated with long term survival.
Under the action of myeloid professional antigen presenting cells (APCs), naive CD4 + T cells could differentiate into Tfh cells and non-Tfh effector cells (such as Th1, Th2, or Th17 cells) (Crotty, 2011). Therefore, Tfh cells are a specialized subset of CD4 + T cells that help B cells respond to antigens (Crotty, 2019), which is, in turn, depends on the expression of transcriptional factor B cell lymphoma 6 (BCL6) (Johnston et al., 2009). In the past decade, several studies have revealed a potential connection between Tfh cells and infiltrated tumors. Amé- Thomas et al. (2012) demonstrated that Tfh cells protect follicular lymphoma malignant B cells from spontaneous and rituximab-induced apoptosis by upregulating CD40L and IL-4, which might lead to a worse prognosis. In non-small cell lung cancer (NSCLC) tumor tissues, the proportion of Tfh cells is increased, suggesting that Tfh cells might play an important role in estimating the poor survival of NSCLC patients (Guo et al., 2017). In addition, Tfh cell abundance is known to improve prognosis of patients with breast cancer or colorectal cancer (Bindea et al., 2013;Gu-Trantien et al., 2013), which indicates the value of Tfh cells against tumors through immune responses.
Mast cells, both resting and activated, are large granulated innate immune cells found predominantly in sites between the host and its external environment, such as skin, respiratory mucosa, and the gastrointestinal tract (Frossi et al., 2017). The activation of mast cells is related to various stimuli received by numerous receptors on the cell surface, such as pathogens, neuropeptides, cytokines, growth factors, toxins, basic compounds, complement, immune complexes, etc. (da Silva et al., 2014). Recent studies have shown that the association between mast cells and cancer is two-sided, including both involvement in, and protection against, tumor progression. Xu et al.  constructed a prognosis-associated microRNA-mast cell network in lung adenocarcinoma, and found resting mast cell numbers were closely related to better overall survival and disease-free survival (DFS), while activated mast cells were related to poor survival. miR-30a and miR-550a are also involved in the regulation of immune response by acting as the promoter and suppressor of resting mast cells, respectively. In contrast, mast cells in hepatocellular carcinoma are mostly resting, inactivation of mast cells can promote immune escape, which is beneficial to tumor growth (Rohr-Udilova et al., 2018).
Our research inevitably has some limitations that should be recognized. First, our research data was obtained from public databases. The limited data means that the clinical pathological parameters are incomplete, which can lead to potential errors and biases. Secondly, we have not taken the heterogeneity of the immune microenvironment associated with the location of immune infiltration into consideration. In other words, the accuracy and applicability of the prediction models will be affected by the heterogeneity of tissue subtypes. In addition, the data series used in this study are all from Western countries, so applying the research conclusions to Asian countries has certain bias and may be inappropriate. As well, the lack of cell markers for resting mast cells leads to incomplete validation in our study. Lastly, this study is not a biological mechanism research, as it lacks experiments on the interaction mechanisms between ceRNA and immune cells. Despite limitations of the study, we did combine tumor infiltrating immune cells with a ceRNA network for analysis in KIRC, which not only constructed two normograms with clinical predictive value, but also identified a correlation between two key prognostic molecules and two prognosis-related immune cells, providing novel directions for future research.

CONCLUSION
Based on tumor-infiltrating immune cells and ceRNA networks, we constructed two nomograms to predict survival of KIRC patients. The high AUC values we obtained reflect the utility of this approach, which provides clinicians with more comprehensive clinical information through the models to improve individual management of KIRC patients. In addition, our study suggests that LINC00894, KIAA1324, Tfh cells and resting mast cells are significantly related to the prognosis of KIRC patients and we infer that LINC00894 and KIAA1324 might interact with Tfh cells and resting mast cells to affect the progression of KIRC. This study also provides new ideas for the pathogenesis and clinical treatment of KIRC.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
ZQ and NL conceived the idea, designed the study, and contributed to the revision of the manuscript draft. LD conducted the data analysis and visualization. PW performed the data interpretation and literature collection. LD and NL wrote the manuscript. All authors read and approved the final manuscript.