The Clinical Significance of DNA Damage Repair Signatures in Clear Cell Renal Cell Carcinoma

DNA damage repair plays an important role in cancer’s initiation and progression, and in therapeutic resistance. The prognostic potential of damage repair indicators was studied in the case of clear cell renal cell carcinoma (ccRCC). Gene expression profiles of the disease were downloaded from cancer genome databases and gene ontology was applied to the DNA repair-related genes. Twenty-six differentially expressed DNA repair genes were identified, and regression analysis was used to identify those with prognostic potential and to construct a risk model. The model accurately predicted patient outcomes and distinguished among patients with different expression levels of immune evasion genes. The data indicate that DNA repair genes can be valuable for predicting the progression of clear cell renal cell carcinoma and the clinical benefits of immunotherapy.


INTRODUCTION
Renal cell carcinoma (RCC) is a lethal cancer of the urinary system, accounting for 2-3% of all malignant cancers in adults (Siegel et al., 2019). Although surgical and systemic therapies for clear cell renal cell carcinoma (ccRCC), the most common subtype of renal carcinoma, have improved survival, the outcome in patients with advanced, metastatic disease is still poor (Garje et al., 2020). Risk-stratifying the patients based on their clinical characteristics and then individualizing treatment according to the risk level could be helpful to improve the outcome. However, the current tumor stage system is insufficient to predict ccRCC prognosis effectively (Suh et al., 2020). Therefore, it is necessary to explore new biomarkers for prognostic prediction in patients with ccRCC.
The advances in genomics and bioinformatics in recent years have enabled the discovery of novel targets and biomarkers. Many biomarkers including lncRNA, miRNA, and mRNA have been identified for making diagnosis and predicting prognosis as well as guiding treatment choices Carril-Ajuria et al., 2019). For instance, some immune-associated signatures have been employed to evaluate the tumor microenvironment (TME) infiltration characterization, revealing a linkage between the TME and clinical features (Zeng et al., 2019). Moreover, the signatures such as hypoxia, N6-methyladenosine (m6A) mRNA modification, autophagy and metabolism have been used for prognosis prediction (Feng et al., 2020;Hu et al., 2020;Jiang p. et al., 2020;Liu et al., 2020).
The role of DNA damage repair (DDR) in neoplasia and tumor progression has been extensively studied (Mauri et al., 2020). Defective DDR can lead to accumulated DNA lesions and genome instability, which contribute to tumorigenesis. It has been reported that germline mutations in exonuclease 5 can impair DNA repair ability and cause androgen-related prostate cancer (Ali et al., 2020). However, during cancer progression, DNA repair may be related to sensitivity to anticancer drugs such as poly ADP-ribose polymerase (PARP) inhibitors or to radiation. MAP kinase-ERK kinase5 (MEK5) can promote the phosphorylation of DNA-PK in response to ionizing radiation (Broustas et al., 2020). High TTK protein kinase (TTK) expression in breast cancer is associated with efficient repair through homologous recombination and low radiation sensitivity (Chandler et al., 2020). These earlier studies point to the importance of exploring the different roles of DNA repair genes in cancer.
As an aggressive tumor, RCC is characterized by high genomic mutation levels (Alexandrov et al., 2013;Cancer Genome Atlas Research Network, 2013;Ged et al., 2020). Many studies reported the prognostic and biological significance of the cancer-driven genetic alterations including von Hippel-Lindau, TP53 as well as PTEN mutations in RCC Huang et al., 2020). However, the role of DNA repair genes in maintaining genome stability in RCC is rarely reported.
In this study, we collected and analyzed data from the International Cancer Gene Consortium, the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) to determine which DNA repair genes are prognostic for patients with ccRCC, developed a prediction model based on the expression of DDR-associated genes, and explored genes and pathways associated with the gene signatures.

Data
In total, 546 DDR-associated genes were analyzed in terms of gene ontology (GO). The clinical data and RNA sequences were downloaded from the TCGA portal, which contains 514 tumor and 72 normal samples. Pathological clinical features for each patient are presented in Table 1. The other three profiles were collected from the GEO (GSE17818 and GSE53757) and ICGC databases.

Screening for Differentially Expressed Genes
The original data were organized and analyzed with the help of R software (4.0.1). Genes that showed a log 2 fold change >1 and a false discovery rate <0.01 between tumor and normal tissues were considered to be differentially expressed. Venn diagrams were used to determine the interaction of the differentially expressed genes (DEGs) of the four datasets. The differentially expressed DDR genes between tumor and normal tissues was identified using the "Linear Models for Microarray Data (LIMMA)" package with a cut-off criterion of p < 0.05.

Modeling and Assessment of Its Prognostic Ability
The patients in the TCGA database were allocated randomly into one of two groups: a training group and a validation group. The training group was used to construct a clinical prognostic model, and the validation group was used to evaluate the model's stability. First, univariate Cox regression analysis was used to extract potential DNA repair genes significantly related to patient prognosis. Last absolute shrinkage and selection operator (LASSO) regression was then used to prevent overfitting of the model. Finally, a risk score formula was generated by combining the gene expression levels weighted by the regression coefficients derived from the multivariate Cox regression analysis. According to the risk scores calculated using the formula, we set the median risk scores as cut-off values and allocated the patients in both training and validation groups into a high-risk group and a low-risk group. The survival rates of the two groups were then compared using Kaplan-Meier analysis. The predicting sensitivity of the selected gene sets was assessed by plotting the receiver operating characteristics (ROC) curve. An area under the curve (AUC) >0.60 was taken as indicating moderate accuracy, and an AUC >0.75 was regarded as highly accurate for predictions. The prognostic values of the risk scores were assessed by the univariate and multivariate analysis through R software.
Additionally, the correlation between the risk scores and clinicopathological features in ccRCC patients were assessed based on the data from TCGA. Then we integrated our risk scores into current staging system and evaluated the utility in stratifying the risk levels.

Functional and Pathway Enrichment Analysis
We identified the different genes between high risk and low risk groups, and conducted GO pathway analysis with the clusterProfiler R packages to functionally annotate the DEGs in different groups.

Statistical Analyses
The t-test was used for statistical comparison. Long-rank test was performed to compare the overall survival rates between different groups. The statistical analyses were completed using R software, and results were considered statistically significant when the p value was ≤0.05.

Differentially Expressed Genes
A total of 7359 genes in the TCGA data, 2862 in the GEO data and 4681 in the ICGA data were found to be differentially expressed between the ccRCC and normal renal samples. There were 951 DEGs common to the gene lists (Figure 1), and 26 of them were found to be closely related to DDR ( Table 2).
The Kaplan-Meier survival analysis showed that patients with high risk scores exhibited worse overall survival rates than those with low risk scores (Figures 3A,B). The ROC analysis generated an AUC of 0.791 for the training group and 0.773 for the test group, indicating good prediction performance (Figures 3C,D).

Risk Scores and Clinical Characteristics
The expression of prognostic genes in the two groups is presented in Figure 5A. On average, the risk scores of patients differently classified by pathological stage and WHO grade were significantly different (Figures 5A,B). The ROC curve showed that the proposed model's predictions of overall survival were similar to those using pathological stage (Figures 5C,D). The patients in high risk group had a significantly worse prognosis than those in low risk group even though they had similar clinicopathologic characteristics, in similar stage ( Figure 5E) or grade ( Figure 5F). Univariate analyses identified age, pathological stage, grade and risk score as significant predictors of OS in both the training and test groups (Figures 6A,B). Multivariate analysis confirmed that only risk score and pathological stage were related to OS independently (Figures 6C,D). Moreover, the hazard ratio of the risk is higher than other clinicopathologic characteristics, which indicating that the prognostic value of risk score based on the DDR associated genes might be better than clinical stages and grades.

Risk Score and Immune Evasion
A link between DDR and immune escape has been reported (Sato et al., 2019), so immunity-related gene expression was evaluated in patients with different risk scores. Patients in the high-risk group had higher expression of PD-1, LAG-3, CTLA-4, and TIGIT than those in the low-risk group (Figures 7A-H), indicating that patients with high risk scores might benefit from the use of combination therapies that integrate immunotherapy with chemotherapy, radiotherapy, and targeted therapy.

Enrichment in Patients With Different Risk Scores
To identify a mechanism potentially contributing to tumor progression, 4190 genes differentially expressed between the high-risk and low-risk groups ( Figure 8A) were studied. GO enrichment analysis showed that the DEGs are mainly related to immune processes such as neutrophil degranulation, neutrophilmediated immunity, neutrophil activation, and in metastasis pathways including cell adhesion molecule binding, cell-substrate adherent junctions, and focal adhesion ( Figure 8B). Those relate the prognostic model mechanistically to tumor progression in ccRCC patients.

DISCUSSION
Clear cell renal cell carcinoma is highly heterogeneous, and its prognosis can vary widely even for patients with similar clinicopathologic characteristics and treatment options, suggesting that current classifications are insufficient for assessing outcomes and risk stratification. Hence, more research is needed to identify novel biomarkers and risk factors in patients with ccRCC. The current study was performed as a pilot trial not only to identify the potential biomarkers associated with prognosis but also to explore new hypotheses for further studies.
Prediction models have for years been explored to guide individual treatment. It has been reported that models based on the expression of tumor genes would allow prediction of patients' response to fluorouracil and gemcitabine (Clayton et al., 2020). Jiang et al. (2019;Jiang Y. Z. et al., 2020) classified triple-negative breast cancers into different subtypes on the basis of genomic features and evaluating the clinical benefit of subtyping-based targeted therapy for triple-negative breast cancers. But the individual treatment based on molecular subtyping in ccRCC is rare.
The DDR-associated pathway is associated with the carcinogenesis of ccRCC, and the expression of some DNA repair factors is correlated with patient prognosis (Na et al., 2019). This study was designed to determine the impact of DNA repair genes on the progression of ccRCC and patient prognosis. Twenty-six DNA repair genes were identified and 14 of them were considered as related to the overall survival of ccRCC patients, and risk scoring based on those genes was shown to be a useful independent prognostic technique. Patients identified as high-risk through the scoring tended to have late stage disease and high tumor grades. RAD51-associated protein 1 (RAD51AP1) interacts with RAD51, which plays an important role in RAD51-related homologous recombination . A recent study found that RAD51AP1 is highly expressed in non-small cell lung cancer patients and facilitates invasion and metastasis by inducing epithelial-mesenchymal transitions (Wu et al., 2019). ISG15 is related to chemosensitivity in pancreatic cancer (Ina et al., 2010). As a known regulator of the WNT pathway, SFRP2 can activate or suppress canonical and non-canonical WNT pathways in different tissues. Sun reports that DNA damage can induce the production of SFRP2, which enhances WNT16B and β-catenin activity to result in therapy resistance (Sun Y. et al., 2016). Recent studies have found that Schlafen 1 (SLFN1) produces cells that are highly sensitive to DNAdamaging drugs by suppressing checkpoint maintenance and repair through homologous recombination (Mu et al., 2016). Small cell lung cancer patients with high SLFN1 expression might benefit from PARP inhibition (Murai et al., 2016;Lok et al., 2017).
The immunotherapy blocking the interaction between PD-1 and PD-L1 in ccRCC patients has seen advances in recent years, which provides a new treatment option (Motzer et al., 2015). However, the benefit from this therapy is highly variable (Motzer et al., 2020). Although there are some markers developed to predict immunotherapy response, their specificity remains controversial.
DNA damage repair is associated with immune activation in different cancers. Chatzinikolaou et al. (2014) has reviewed the direct links between innate immune signaling and DNA damage. And recent work has revealed that pharmacological inhibition of the DNA damage response proteins CHK1 and PARP increases the levels of tumor-infiltrating T-lymphocytes. With anti-PD-L1 therapy it has been shown to work synergistically in modes of small cell lung carcinoma through the STING/TBK1/IRF3 innate immune pathway (Sen et al., 2019). A group led by Sato has shown that genotoxic stress such as irradiation or PARP inhibition can upregulate the expression of PD-L1 through the ATM-ATR/CHK1 pathway (Sato et al., 2017). Jiao et al. (2017) found that PARP inhibitors can upregulate PD-L1 and promote immune suppression. Garsed et al. (2018) reported that DDR pathway mutations are associated with immune cell infiltration and activation. The association of mutations in DNA repair genes and immune regulatory genes with bladder cancer has also been documented (Vidotto et al., 2019). Moreover, it has been reported that alterations in DDR genes which cause loss of function are frequent in metastatic ccRCC, which may certainly affect the effectiveness of immunotherapy (Ged et al., 2020). We therefore performed this bioinformatics analysis to explore the potential relationship between DDR and immune evasion. Our study found that ccRCC patients with greater risk had high expression of immune evasion genes. As it has been reported that antibodies against immune evasion genes could restore responses of tumor-associated T cells to tumor related antigens (Dong et al., 2002), and higher PD-L1 expression on tumor cells and/or immune cells was shown to be associated with better efficacy of anti-PD1/PD-L1 immunotherapies (Dong et al., 2002;Topalian et al., 2012), we speculated that patients with high risk might benefit from immunotherapy. The GO of the DDR genes identified many immunityand metastasis-related pathways. As the major portion of the leukocytes in peripheral blood, neutrophils have been associated with carcinogenesis and cancer development (Galdiero et al., 2018;Mollinedo, 2019). Many studies have shown that neutrophils in the TME are related with poor prognosis (Shen et al., 2014;Zhou et al., 2016;       Jungnickel et al., 2017). Moreover, the neutrophil-to-lymphocyte ratio has been identified as a risk factor with different tumors (Templeton et al., 2014;Barker et al., 2020). Neutrophils in the TME release many inflammatory factors that contribute to tumor proliferation, metastasis and immune suppression. A group led by Coffelt found that tumor-associated neutrophils (TANs) induced by IL-17 can inactivate cytotoxic T lymphocytes and promote metastasis (Coffelt et al., 2015). A previous study found that the interaction between TANs and tumor cells led to tumor shedding, which promoted tumor spreading (Wislez et al., 2007). Moreover, TANs recruit several immunoregulatory cells which inhibit anti-tumor immunity and induce T cell apoptosis by releasing TNF-α (Tecchio et al., 2013;Powell and Huttenlocher, 2016;Michaeli et al., 2017). MMP2, MMP9, VEGF, arginase, and elastase from neutrophil degranulation can also contribute to tumor progression (Caruso et al., 2010;Mishalian et al., 2013;Deryugina et al., 2014). This study has revealed that DNA repair genes are involved in immune and metastasis signaling, uncovering their effects on the initiation and development of ccRCC.
Many prognostic models have been proposed based on immunity, autophagy and glycolysis genes, and their prognostic value in different types of cancer has been evaluated Wan et al., 2019a,b;Zhang et al., 2019). However, the prognostic utility of DNA damage genes in cancer remains controversial. This study has shown that DDR-associated signatures correlate with poor prognosis among ccRCC patients, and the ROC curves show the model's potential utility in predicting the survival of ccRCC patients. Moreover, integrating our risk scores into current staging system lead to a more precise predictive model to further stratify patients with distinct prognosis, as shown by the survival curves plotted in this study.
It should be noted that there are also some shortcomings in our study. First, only a few Asians were included in the data used, which may lead to selection bias. Secondly, this is a bioinformatics analysis based on public databases, and experimental as well as clinical studies are required to validate these findings.
In summary, our study identified DDR signatures that could predict prognosis in ccRCC patients. The current findings show that risk scores from our model can further improve the current clinical staging system and provide more accurate prediction on outcomes. In addition, our model can predict the expression of immune evasion proteins in ccRCC patients, which might predict their response to immunotherapy. Further studies are required to validate our findings.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: The datasets generated for this study can be found in The Cancer Genome Atlas database (https://portal.gdc. cancer.gov/), international cancer genome consortium (https:// icgc.org/), and GEO (https://www.ncbi.nlm.nih.gov/geo).

AUTHOR CONTRIBUTIONS
EG, CW, and JM conceived the work. EG and WZ conducted the data analysis. EG and LZ wrote the manuscript. GH revised and proofread the manuscript. All authors revised the manuscript and allowed the submitted version.