A Novel Nine Apoptosis-Related Genes Signature Predicting Overall Survival for Kidney Renal Clear Cell Carcinoma and its Associations with Immune Infiltration

Background: This study was designed to establish a sensitive prognostic model based on apoptosis-related genes to predict overall survival (OS) in patients with clear cell renal cell carcinoma (ccRCC). Methods: Obtaining the expression of apoptosis-related genes and associated clinical parameters from online datasets (The Cancer Genome Atlas, TCGA), their biological function analyses were performed through differently expressed genes. By means of LASSO, unadjusted and adjusted Cox regression analyses, this predictive signature was constructed and validated by internal and external databases (both TCGA and ArrayExpress). Results: A total of nine apoptosis-related genes (SLC27A2, TNFAIP2, IFI44, CSF2, IL4, MDK, DOCK8, WNT5A, APP) were ultimately screened as associated hub genes and utilized to construct a prognosis model. Then our constructed riskScore model significantly passed the validation in both the internal and external datasets of OS (all p < 0.05) and verified their expressions by qRT-PCR. Moreover, we conducted the Receiver Operating Characteristic (ROC), finding the area under the ROC curves (AUCs) were all above 0.70 which indicated that riskScore was a stable independent prognostic factor (p < 0.05). Furthermore, prognostic nomograms were established to figure out the relationship between 1-, 3- and 5-year OS and individual parameters for ccRCC patients. Additionally, survival analyses indicated that our riskScore worked well in predicting OS in subgroups of age, gender, grade, stage, T, M, N0, White (all p < 0.05), except for African, Asian and N1 (p > 0.05). We also explored its association with immune infiltration and applied cMap database to seek out highly correlated small molecule drugs. Conclusion: Our study successfully constructed a prognostic model containing nine hub apoptosis-related genes for ccRCC, helping clinicians predict patients’ OS and making the prognostic assessment more standardized. Future prospective studies are required to validate our findings.


INTRODUCTION
Renal cancer, one of the most familiar malignant urological cancers in the world, attributes to 73,820 new cases and 14,770 deaths every year in the United States (Siegel et al., 2019). Therein, renal cell cancer (RCC) consists of 90% of renal carcinoma and it has been reported that almost 400,000 cases were diagnosed in 2018 (Bray et al., 2018). The most common type is clear cell renal cell carcinoma (ccRCC) which causes a low risk of cancer-related death when diagnosed (Joseph et al., 2014). However, almost one third of these newly diagnosed patients had metastases and a poor prognosis (Dagher et al., 2017). Although targeted therapeutics and immune checkpoint inhibitors have changed the landscape of treatment for ccRCC, most patients still did not experience significant clinical benefits (Orlandella et al., 2020). Accordingly, it was critical to discover more effective methods for early screening and diagnosis of ccRCC by further understanding its molecular mechanism and biological process. Thus, we can improve the treatment effect and life quality of these ccRCC patients.
Programmed cell death is a highly conserved and strictly regulated cellular program. Apoptosis occurs in various physiological conditions and pathological states (Lockshin and Zakeri, 2007;Fuchs and Steller, 2011). If excessive or insufficient apoptosis, or occurrence at the wrong time or place, it may lead to pathological development, such as stroke and cancer (Barinaga, 1998;Li et al., 2010). Increasing apoptosis and decreasing cell death could increase the occurrence of cancer and promote the progress of cancer (Fulda, 2009). As previously reported, there were many apoptosis-related genes, including the caspase family, the tumor suppressor gene P53, the oncogene C-myc, the IAPs, FLIPs and the Bcl-2 family (Packham and Cleveland, 1995;Bellamy, 1997;Mohamed et al., 2017;Edlich, 2018;Humphreys et al., 2018). Furthermore, they are involved in abnormal gene expression or genomic changes, leading to changes in apoptosis pathways that enable cancer cells to escape apoptosis (Hanahan and Weinberg, 2000). For instance, PAR4 can induce apoptosis of cancer cells without damaging normal cells (El-Guendy and Rangnekar, 2003). Survivin expression was increased in gastric cancer, and RUNX3 promoted apoptosis by inhibiting survivin expression (Liu et al., 2014). Notch signaling played important roles in the occurrence of tumor and therapeutic resistance of NSCLC cells via the promotion of proliferation or the inhibition of apoptosis (Zou et al., 2018).
In past decades, numerous diagnostic biomarkers associated with ccRCC have been found (Gong et al., 2006;Ricketts et al., 2012;Lu et al., 2016;Yadav et al., 2017), but it could rarely conduct accurate and sensitive early-stage detection, thus making the ccRCC patients lose the chance to be cured early. Moreover, the implementation of cancer precision medicine requires biomarkers or signatures for predicting prognosis and therapeutic benefits. Hence, there remains an urgent need for improving the performance of biomarkers or signatures. In this study, we concentrated our work on constructing a prognostic signature related to apoptosis-related genes to calculate risk level and predict survival time in ccRCC patients. We validated the prognostic model via internal and external aspects. Our results were anticipated for creating novel standard of diagnosis and developing therapeutic targets, helping clinician predict ccRCC patients' OS more effectively and accurately.

Data Acquisition and Processing
The RNA-sequencing datasets and related clinical data of 72 normal renal tissue samples and 539 ccRCC samples were acquired from The Cancer Genome Atlas database (TCGA, https://portal.gdc.cancer.gov). The census of human apoptosisrelated genes was obtained from GSEA (https://www.gseamsigdb.org/gsea/index.jsp) and these gene sets are displayed in the Supplementary Table S1. To ensure a unified standard, the RNA-seq profiles were transformed using the formula log2 (x+1) and normalized. To obtain the differently-expressed genes in the ccRCC tissue, we utilized "limma" package on account of false discovery rate (FDR) < 0.05 and |log2 fold change (FC)| ≥ 1. Besides, we downloaded 99 samples from ArrayExpress datasets (E-MTAB-1980, https://www.ebi.ac.uk/arrayexpress/) as the external validation.

GO and KEGG Functional Enrichment Analyses
Based on differently-expressed apoptosis-related genes, the gene ontology (GO) enrichment involving molecular function (MF), cellular components (CC), and biologic process (BP) was conducted and the kyoto encyclopedia of genes and genomes (KEGG) pathway analysis was also performed. GO and KEGG pathway analysis were conducted to annotate genes and identify biological attributes for differentially expressed apoptosis-related genes. Both FDRs and p values were significant when they were less than 0.05.

PPI Network Construction and Module Screening
The differently-expressed apoptosis-related genes were uploaded to the STRING database (http://www.string-db.org/) to further acquire protein-protein interaction relationship. Then we performed the next analysis by Cytoscape 3.7.0 software to construct the PPI network. The key genes and modules were selected in PPI network with the plug-in of Molecular Complex Detection (MCODE) on the condition of both MCODE scores and the number of nodes are more than 5 (Szklarczyk et al., 2019). All p ≤ 0.05 were considered as statistical significance.

Prognostic Model Construction and Analyses
We firstly performed survival R package to conduct unadjusted Cox regression analysis on all apoptosis-related genes. Then the least absolute shrinkage and selection operator (LASSO) was executed to prevent the occurrence of overfitting and to Frontiers in Molecular Biosciences | www.frontiersin.org March 2021 | Volume 8 | Article 567730 further obtain the candidate genes. Afterwards, based on the previously acquired candidate genes, adjusted Cox proportional hazards regression was utilized to establish a calculation formula. Subsequently, a riskScore algorithm was constructed by above mentioned methods and displayed as follows: Risk score n i 1 expi*βi.
Therein, β means the coefficient value, and exp equals to the expression level.

External and Internal Verification of the Prognostic Model
To evaluate the prognostic ability of our established model, we divided ccRCC patients into high-and low-risk groups based on the median risk score. A log-rank test was conducted to compare these two groups. In addition, we implemented a time-dependent ROC analysis and calculated the area under the ROC curve (AUC) by applying the SurvivalROC package to ccRCC patients. ArrayExpress datasets was set as the external validation. As for internal validation, the whole TCGA database was randomly divided into two groups for verification. In order to predict the likelihood of OS, we implemented the nomogram and calibration plots with rms R package. p < 0.05 was counted as statistical significance.

Construction and Verification of the Prognostic Nomogram
According to prognostic clinical parameters and our established riskScore model, we constructed a nomogram to predict the possibilities of 1-, 3-and 5-year OS for ccRCC patients. ROC curves and their associated AUCs were applied to evaluate the sensitivity and accuracy of the prognostic nomogram. Further, the same methods were applied to the external ArrayExpress datasets to verify the outcomes.

Validation of Apoptosis-Related Genes in ccRCC Tissues by Quantitative Real-Time PCR
To extract total RNA from the target tissue samples, after adding liquid nitrogen in a mortar to fully grind, we increased 1 ml Trizol reagent (Life technology, Grand Island, NY, United States) directly to lyse at room temperature for 15 min on a shaker. To evaluate expressing levels of mRNA, the first-strand cDNA was conducted using parts of total RNA applying the RevertAid First Strand cDNA Synthesis Kit (thermo scientific, Lithuania).

Estimation of Tumor-Infiltrating Immune Cells
We applied the "limma" package of R software to conduct analyses including the expression of TIICs in every sample and the inner relationship of these TIICs in ccRCC, showing by the heatmap and matrix. Moreover, the R package was performed to find whether there are high correlations with the risk model by applying the samples with high-and low-groups. After normalizing gene expression data with standard annotation files, we uploaded these data to the CIBERSORT web portal (Newman et al., 2015). Then we conducted the algorithm with LM22 gene signature and 1,000 permutations. The threshold above was set as the p-value less than 0.05.

Identification of Potential Small Molecule Drugs
To efficiently seek out molecule drugs highly related to ccRCC, we applied a gene expression profiles database named connectivity map (cMap) (Lamb et al., 2006). Then we uploaded the differently expressed apoptosis-related genes to it and bioactive chemicals and potential connections were further explored. Scores of connectivity were set from −1 to 1 to evaluate the degree of closeness in the compound correlated to the query signature. Ultimately the positive score means the promotion of the drug, while negative scores could be considered as the function of depression by a drug. And the set threshold was p < 0.05, n ≥ 4 and |mean| > 0.4.

Statistical Analysis
Statistical analysis was completed with the R software 3.6.3. To accomplish the comparison between low-and high-risk groups, Kaplan-Meier was conducted by applying the log-rank test. By means of LASSO, unadjusted and adjusted Cox regression analyses, we calculated the regression coefficient and the hazard ratio. The time-dependent Receiver Operating Characteristic (ROC) analysis and the area under the ROC curve (AUC) were respectively performed to assess the accuracy of the model and the nomogram. And we applied harrell's index of concordance (C-index) to further evaluate the ability of the nomogram. p values < 0.05 were adhered in the whole statistical analyses.

Identification of Differently Expressed Apoptosis-Related Genes in ccRCC Patients
In this study, systematic analyses were conducted by several advanced computer technological methods to detect key roles and investigate prognostic values of apoptosis-related genes in ccRCC. The procedures of this study were illustrated in Supplementary Figure S1. The dataset of renal clear cell carcinoma was acquired from TCGA containing 539 tumor samples and 72 normal renal tissue samples. Then we conducted R software packages to cope with the data, thus finding the differently-expressed apoptosis-related genes. A total of 2,411 apoptosis-related genes were included in the above analysis, and 127 apoptosis-related genes met the accepted standard (FDR < 0.05, |log2 (FC)| > 1), consisting of 88 up-regulated and 39 down-regulated apoptosis-related genes. To show the differential expression of apoptosis-related genes, heat map and volcano plot were utilized respectively ( Figures 1A,B).

GO and KEGG Pathway Enrichment Analysis of the Differently Expressed Apoptosis-Related Genes
Our results indicated that differently expressed apoptosisrelated genes were significantly enriched in the biological processes (BP) related to regulation of lymphocyte activation, positive regulation of leukocyte activation, positive regulation of cell activation, adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains, neuron death, and neuron apoptotic processes. According to molecular function (MF), the differently expressed apoptosisrelated genes were especially enriched in immunoglobulin receptor binding, antigen binding, scavenger receptor activity, cargo receptor activity, cysteine-type endopeptidase regulator activity involved in apoptotic process, and receptor ligand activity. We also found that differently-expressed apoptosis-related genes were enriched in the cellular FIGURE 3 | Construction of a prognostic model (riskScore) by means of LASSO and adjusted Cox regression analysis; (A,B) LASSO coefficients profiles of the prognostic apoptosis-related genes; The partial likelihood deviance plot presented the minimum number corresponds to the covariates. utilized for adjusted Cox regression analysis; (C) The forest plot of differently expressed apoptosis-related genes were identified to be significantly related to OS of ccRCC patients by adjusted Cox regression analysis. Green points represent negative correlations, whereas red points represent positive correlations.
Frontiers in Molecular Biosciences | www.frontiersin.org March 2021 | Volume 8 | Article 567730 6 component (CC) associated with all the components of the external plasma membrane that the genes were expressed in, immunoglobulin complex, blood microparticle, and membrane raft ( Figure 1C; Supplementary Table S2). Additionally, results of KEGG pathway analysis indicated that apoptosis-related genes were mainly enriched in Cytokine-cytokine receptor interaction, Allograft rejection, Epstein-Barr virus infection, Primary immunodeficiency, PI3K-Akt signaling pathway, Th1 and Th2 cell differentiation, NF-kappa B signaling pathway and Th17 cell differentiation ( Figure 1D, Supplementary Table S3).

Protein-Protein Interaction Network Construction and Key Modules Selecting
To further figure out the roles of differently-expressed apoptosisrelated genes in ccRCC, we constructed a PPI network from the STRING database, consisting of 559 nodes and 6,096 edges ( Figure 2A). To seek out possible key modules involved in the co-expression network, we utilized the MODE tool and finally screened the top 3 significant modules, by means of cytoscape software with the data from the STRING database ( Figures 2B-D).

The External and Internal Validation of Prognosis-Related Genetic Risk Score Model
To assess the predictive ability and accuracy of the signature, we then conducted a survival analysis and performed verification in both TCGA and ArrayExpress databases. According to the median risk score of our established model, we divided the ccRCC patients into low-and high-risk subgroups. Results of the TCGA dataset indicated that patients in the low-risk subgroups had longer OS, compared with those in the high-risk subgroups ( Figure 4A). To further evaluate the prognostic sensitivity of these nine apoptosis-related genes biomarker, we performed a time-dependent ROC analysis, thus finding that the AUCs of 1-year, 3-years and 5-year risk score model were 0.783, 0.745, 0.767, respectively (Figures 4B-D and Table 2), indicating a moderate diagnostic performance. Survival status of patients, expression heat map, and risk score of the signature were also shown in Figure 4I, finding that the higher the risk scores, the higher the patients in highrisk subgroups, and the higher the numbers of dead persons. Heatmap of these nine apoptosis-related genes between low-and high-risk subgroups was also displayed in this figure.
As for external verification, we applied the same formula to the ArrayExpress datasets (E-MTAB-1980) to evaluate whether the predictive model had similar prognostic value and accuracy in other ccRCC patient cohorts. By means of conducting survival analysis, we found that patients with low-risk scores also had a higher OS than those with high-risk scores in the ArrayExpress cohorts ( Figure 4E). ROC curves and their associated AUCs were also performed and we found that the AUCs of 1-year, 3-years and 5-years risk score model were 0.687, 0.660, 0.681, separately (Figures 4F-H and Table 2). Figure 4J indicated that the higher the risk scores, the higher the patients in high-risk subgroups, and the higher the numbers of dead persons. Heatmap of these nine apoptosis-related genes between low-and high-risk subgroups was also displayed in this figure.
To conduct the internal verification, a total of 539 ccRCC samples in TCGA dataset were randomly divided into two groups including the internal validation dataset 1 (test 1) and the internal validation dataset 2 (test 2). Then we conducted the whole procedure of the internal verification of these two groups separately as the same as what we had performed for the external verification. Survival analysis, ROC curves, risk score distribution, survival status, and heatmap of these nine apoptosisrelated genes in low-and high-risk groups were displayed in Figure 5, finding similar results as external verification. ROC curves and their associated AUCs for internal verification were also performed and these AUCs also indicated a moderate accuracy, sensitivity and specificity ( Table 2).

Construction of a Nomogram Based on our Established Signature and Clinical Characteristics
In order to help clinicians to develop diagnostic decisionmaking for ccRCC patients, we combined the nine genes signature and six clinical characteristics (age, gender, grade, T, M and N) to establish a nomogram according to TCGA datasets ( Figure 7A). Points were distributed to individual parameters by applying the point scale in the nomogram, FIGURE 5 | Internal verification of nine apoptosis-related genes signature; (A) Survival curves of OS from high-and low-risk groups of high-and low-risk divided by riskScore in the internal validation dataset 1 (test 1); (B-D) 1-, 3-and 5-year ROC in the internal validation dataset 1 (test 1); (E) Survival curves of OS from high-and lowrisk groups of high-and low-risk divided by riskScore in the internal validation dataset 2 (test 2); (F-H) 1-, 3-and 5-year ROC in the internal validation dataset 2 (test 2); (I,J) The distribution of risk scores for samples, the survival status of patients and the expression heatmap of nine differently expressed genes the internal validation dataset 1 (test 1) and in the internal validation dataset 2 (test 2).
Frontiers in Molecular Biosciences | www.frontiersin.org March 2021 | Volume 8 | Article 567730 and we worked out the total points by summing the points of all factors. We could figure out the survival rates for ccRCC patients at 1, 3, and 5 years, helping the prognostic method for ccRCC patients more quantitative. Its 1-, 3-, 5-year AUC values in the TCGA were 0.846, 0.804, 0.769, with C-index 0.786, thus having good performance in predicting OS (Table 4 and Supplementary Figures S3A-C). Its calibration plots indicated that the predictive values were excellent ( Figure 7C). Similarly, we also created another nomogram of the ArrayExpress datasets as the verification ( Figure 7B) and its 1-, 3-, 5-years AUC values in the TCGA were 0.925, 0.933, 0.897, with C-index 0.866 (Table 4 and Supplementary Figures  S3D-F). Calibration plots of the ArrayExpress dataset showed excellent results too ( Figure 7D).

Relationships Between RiskScore, These Nine Elected Genes and Clinical Characteristics
To study associations between our established riskScore and clinical characteristics, our results indicated that riskScore were significantly related to grade (p 0.004), tumor stage (p 0.005), T stage (p 0.007) (Supplementary Figure S4). Moreover, clinical correlation analyses between these 9 prognostic apoptosis-related genes, our established riskScore and clinical features such as age, race, gender, grade, stage, T,  M, N were also investigated. Therein, p values <0.05 indicated a significant correlation (Supplementary Figure S5).

Prognostic Value of Nine Apoptosis-Related Genes and Clinicopathological Parameters Stratified by our Established riskScore for OS
Survival analyses of these nine critical apoptosis-related genes (APP, CSF2, DOCK8, IFI44, IL4, MDK, SLC27A2, TNFAIP2 and WNT5A) were conducted with all p < 0.05 ( Figures 8A-I), except for survival analysis of CSF2 with p value of 0.054, indicating that patients with high expression of APP, DOCK8, SLC27A2 could have a higher chance of better survival performance, while the IFI44, IL4, MDK, TNFAIP2 and WNT5A being the opposite. Moreover, qRT-PCR verified seven apoptosis-related genes (APP, DOCK8, IFI44, MDK, SLC27A2, TNFAIP2, WNT5A) with at least 2 fold changes by means of 2 −ΔΔCt method, finding that APP, SLC27A2 and WNT5A had a low expression, while DOCK8, IFI44, MDK, TNFAIP2 had a high expression in ccRCC tumor tissues ( Figure 8J).

Tumor-Infiltrating Immune Cells Between High-and Low-Risk Patients with ccRCC
As showed in the heatmap, we could observe that the different composition of TIICs in each ccRCC patients ( Figure 10A). Also the inner relationship of these TIICs of ccRCC based on the TCGA database was exhibited in the matrix, showing that CD8 T cells was positively correlated with T cells follicular helper, while negatively correlated with M2 macrophages and CD4 T cells memory resting ( Figure 10B). Moreover, eight TIICs (Dendritic cells resting, Macrophages M0, Macrophages M1, Mast cells resting, Monocytes, Plasma cells, T cells follicular helper and T cells regulatory) were infiltrated out that represents high correlations between high-and low-risk patients with ccRCC (all p < 0.05; Figures 10C-J). As showed in Figure 10K, it detailed the associations of 21 TIICs in highand low-risk patients with ccRCC. Therein, eight out of these 21 TIICs had significant correlations (all p < 0.05).

Identification of Related Small Molecule Drugs
We applied cMap database to seek out potential small molecule drugs highly correlated to the uploaded differently expressed apoptosis-related genes of ccRCC. Then we obtained the most potential small molecule drugs based on these genes. As showed in Table 5, the small molecule drug named nocodazole was positively related to ccRCC, indicating the potential mechanism of promoting this disease. Furthermore, eight small molecule drugs were negatively correlated with ccRCC containing atractyloside, captopril, coralyne, epitiostanol, clebopride, SR-95531, amiprilose, and harpagoside, indicating the potential repression on the development of this disease.

DISCUSSION
Dysregulation of apoptosis-related genes has been reported in a variety of malignant tumors (Huerta et al., 2006;Xu et al., 2009;Cao and Tait, 2018;Yang et al., 2018). However, a few of the apoptosis related genes have studied deeply and reported being associated with both the occurrence of and the inhibition of the progress of renal cancers (Guo et al., 2020;Li et al., 2020). In this study, we combined the RNA-sequencing data and clinical data of ccRCC from TCGA database, thus finding 127 differently-expressed apoptosis-related genes between ccRCC and normal tissue. After the comprehensive investigation to related molecular function and biological action pathway, we constructed and visualized the PPI network of these apoptosis-related genes. We also conducted the LASSO, unadjusted and adjusted Cox regression analyses to identify the hub candidate genes. Moreover, we constructed the risk model to conduct the prediction of prognosis based on nine hub apoptosis-related genes. The evaluation of the model was performed with survival analyses and ROC analyses of these nine hub apoptosis-related genes in both the TCGA and ArrayExpress databases. These findings may make a contribution to the development of accurate and sensitive biomarkers for the diagnosis and prognosis to the ccRCC patients.
GO enrichment analysis suggested that these genes are mainly enriched in regulation of lymphocyte activation, positive regulation of leukocyte activation, positive regulation of cell activation, neuron death, neuron apoptotic process, immunoglobulin receptor binding, antigen binding, scavenger receptor activity, cargo receptor activity, receptor ligand activity, immunoglobulin complex, blood microparticle, and membrane raft. The role of apoptosis in human diseases, especially cancer, has been increasingly recognized in many studies (Yu et al., 2015;Pai et al., 2017;Fathi et al., 2018). As for apoptosis-related genes, they often played critical roles in tumorigenesis and involved in the control of cancer progression (Fulda, 2015;Cao and Tait, 2018). Several apoptosis-related genes, had been reported to affect renal cell activity and cell proliferation (Woo et al., 2018). The degree of miR-182 was markedly reduced in renal cancer tissue, while mTOR was upregulated (Fu et al., 2018). In regards to ccRCC, down-regulated XIAP expression could increase the apoptosis-related sensitivity of RCC cells . In prostate cancer, FOXQ1 regulated the expression of BCL11A/ MDM2 to promote cell proliferation (Zhang et al., 2016).
Additionally, results of KEGG pathway analysis indicated that apoptosis-related genes were mainly enriched in Cytokine-cytokine receptor interaction, Allograft rejection, Epstein-Barr virus infection, Primary immunodeficiency, PI3K-Akt signaling pathway and Th1 and Th2 cell differentiation. As previously reported, apoptotic factors expressed by apoptotic-related genes can combine with cytochrome C released into the cytoplasm to form a polymer within the abundance of dATP, and induce apoptosis by binding the relevant gene caspase-9 to form apoptotic corpuscle (Shi, 2002).
Fas, a member of the tumor necrosis factor receptor superfamily, expressed by related factors is a transmembrane protein, can initiate apoptosis signal transduction and induce apoptosis when combined with FasL (Curtin and Cotter, 2003). Therefore, the development of pathways exploration worked well, and other biological mechanisms needed to be further studied. Further, we constructed the PPI network and applied the MODE plug-in to identify three vital modules of these differently-expressed apoptosis-related genes.
By means of LASSO, unadjusted and adjusted Cox regression analyses, we acquired nine hub genes (SLC27A2, TNFAIP2, IFI44, CSF2, IL4, MDK, DOCK, WNT5A, APP). Previous studies indicated that these genes were closely related to the occurrence and progress of carcinoma (Hoon et al., 1991;Cho and Kim, 2017;Chen et al., 2018). For example, TNFα-induced protein 2 (TNFAIP2) played a functional role in cell proliferation, angiogenesis, migration and invasion (Jia et al., 2018). Furthermore, interferon-induced protein 44-like (IFI44L) gene served as a type of interferon-stimulated gene (ISG), and we found that consumption of IFI44L activated Met/Src signaling pathway to promote migration, invasion and metastasis (Huang et al., 2018). Additionally, it had been shown that adverse clinical outcomes were caused through overexpression of CSF2 in urothelial carcinoma and epithelial carcinoma of the bladder (Lee et al., 2016). Moreover, Midkine had differently expression at high levels in malignant tumors and acted as a critical substance to promote the growth and metastasis of cancers (Filippou et al., 2020). As for Interleukin-4 (IL4), Hoon, D. S. et al. demonstrated that IL4 alone could modulate the expression of tumor-associated of renal cell carcinomas (Hoon et al., 1991).
We then established the prognostic model based on the nine hub genes. According to the model, the curves based on the two groups indicated that the patients in the low-risk subgroup were with high OS in the TCGA cohort, the external validation dataset (ArrayExpress) and the internal validation dataset (dataset 1; dataset 2). The timedependent ROC curve and the AUCs of the model demonstrated that these selected genes had a moderate performance when predicting OS at 1-, 3-and 5-year. In regard to the results of univariate and multivariate Cox regression analysis, the riskScore could serve as an independent factor of OS. Among the validation of the four cohorts, the p values of the riskScores were all <0.05, except for the ArrayExpress dataset which may be due to an inadequate sample. Therefore, the high sensitivity and accuracy of the nine genes signature in predicting OS for ccRCC patients was acknowledged. Additionally, we constructed a nomogram consisting of several clinical factors and the risk score to predict the 1-, 3-and 5-year OS, finding satisfactory outcomes in the external and internal datasets. We also conducted further validation of these nine apoptosisrelated genes in renal tissues by a series of qRT-PCR, thus offering strong evidence to the above conclusion. Furthermore, we explored its association with immune infiltration and applied cMap database to seek out potential small molecule drugs highly correlated to the differently expressed apoptosis-related genes in ccRCC. Our analysis is undoubtedly helpful, but application in future research is warranted.
The strength of this article was that it was the first time we explored the roles of apoptosis-related genes in ccRCC. Moreover, our prognostic model was not only based on the TCGA database, but also verified by the data from the external database ArrayExpress and two internal validation datasets, making the results more accurate. Further, our results remained stable internally and externally, making the verification more sufficient and the experimental results more persuasive. However, this study still had some limitations. First, our study was retrospective in nature, requiring the use of prospective studies to validate the findings. Second, when detecting the independent factors, we found that p-value of ArrayExpress database 0.652 (p > 0.05) in the multivariate Cox proportional hazards regression, potentially attributing to the inadequacy of the samples from the ArrayExpress datasets. Finally, the establishment of our prognostic model and the evaluation of independent factors mainly focused on the survival time. The selection of clinical treatment plans, treatment information (precise treatment sequence, with or without comorbidities, etc), a prolonged course of disease, as well as the early and late quality of life was not involved.

CONCLUSION
By means of external and internal validation, our study successfully constructed a prognostic model containing nine hub apoptosis-related genes (SLC27A2, TNFAIP2, IFI44, CSF2, IL4, MDK, DOCK, WNT5A, APP). Moreover, our established signature could serve as a tool to help clinicians predict patients' OS, allowing for a more standardized prognostic assessment. Future prospective studies are required to validate the efficiency and accuracy of our work to improve the diagnosis and management of ccRCC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article. R and Perl scripts involved in this article were displayed in Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Research Ethics Committees of Affiliated Hospital of Nantong University. The patients/ participants provided their written informed consent to participate in this study.