Establishment and external verification of an oxidative stress-related gene signature to predict clinical outcomes and therapeutic responses of colorectal cancer

Objective: Accumulated evidence highlights the biological significance of oxidative stress in tumorigenicity and progression of colorectal cancer (CRC). Our study aimed to establish a reliable oxidative stress-related signature to predict patients’ clinical outcomes and therapeutic responses. Methods: Transcriptome profiles and clinical features of CRC patients were retrospectively analyzed from public datasets. LASSO analysis was used to construct an oxidative stress-related signature to predict overall survival, disease-free survival, disease-specific survival, and progression-free survival. Additionally, antitumor immunity, drug sensitivity, signaling pathways, and molecular subtypes were analyzed between different risk subsets through TIP, CIBERSORT, oncoPredict, etc. approaches. The genes in the signature were experimentally verified in the human colorectal mucosal cell line (FHC) along with CRC cell lines (SW-480 and HCT-116) through RT-qPCR or Western blot. Results: An oxidative stress-related signature was established, composed of ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CDKN2A, CRYAB, NGFR, and UCN. The signature displayed an excellent capacity for survival prediction and was linked to worse clinicopathological features. Moreover, the signature correlated with antitumor immunity, drug sensitivity, and CRC-related pathways. Among molecular subtypes, the CSC subtype had the highest risk score. Experiments demonstrated that CDKN2A and UCN were up-regulated and ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CRYAB, and NGFR were down-regulated in CRC than normal cells. In H2O2-induced CRC cells, their expression was notably altered. Conclusion: Altogether, our findings constructed an oxidative stress-related signature that can predict survival outcomes and therapeutic response in CRC patients, thus potentially assisting prognosis prediction and adjuvant therapy decisions.


Introduction
Modulation of redox homeostasis is essential for maintaining normal cellular function and ensuring cell survival. Tumor cells are characterized by high levels of oxidative stress that is a state of imbalance between oxidation and antioxidation (Donohoe et al., 2019). Accumulated evidence suggests that oxidative stress exhibits dual roles in tumor progression (Yang and Chen, 2021). Reactive oxygen species (ROS) exhibits antitumor effects by heightening tumor cell apoptosis, necrosis, and ferroptosis and strengthening the immune surveillance capacity of immune cells (Gorrini et al., 2013). Instead, ROS promotes tumor progression via triggering DNA damage and genomic changes, activating proliferation-and epithelial-mesenchymal transition-related pathways, and remodeling the tumor microenvironment for tumor invasion and metastases (Falone et al., 2019).
Colorectal cancer (CRC) remains the third most diagnosed cancer (10.0%), and the second leading cause of cancer death (9.4%) worldwide, according to GLOBOCAN 2020 estimates (Sung et al., 2021). Approximately 50% of patients die from tumor metastases . Currently, systemic treatment options comprise adjuvant and neoadjuvant chemotherapy, and therapeutic antibodies directed against growth factor receptors (Berlin et al., 2022). Nevertheless, 30-40% of patients relapse despite treatment. A reasonable and effective signature for prognostic assessment of CRC patients is required. Oxidative stress can induce genetic instability and alter cellular processes, leading to CRC . In a large CRC patient cohort, higher reactive oxygen metabolites exhibit a strong association with more undesirable survival outcomes (Boakye et al., 2020). Cancer cells adapt to chemotherapyinduced oxidative stress using rapidly elevated cellular antioxidant programs, and adaptation of oxidative defense results in therapeutic resistance, a primary barrier to successful cancer treatment (Čipak Gašparović et al., 2021). For instance, SIRT3mediated SOD2 and PGC-1α trigger chemoresistance in CRC cells (Paku et al., 2021). Moreover, up-regulated NOX-2 and Nrf-2 facilitate 5-fluorouracil resistance of CRC cells (Waghela et al., 2021). Given the crucial roles of oxidative stress in the progression and therapeutic resistance of CRC, this study attempted to construct a reliable oxidative stress-related signature to predict patients' clinical outcomes and therapeutic responses.

Materials and methods CRC datasets
Transcriptome profiling (RNA-seq) of colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) was performed, and normal tissue samples were extracted from The Cancer Genome Atlas (TCGA) via the Genomic Data Commons (GDC). The raw counts were standardized to count-per-million (CPM) using the edgeR package (Robinson et al., 2010). The threshold was set to 1 to retain genes greater than 1 in 2 or more samples. The copy number variation (CNV) data (masked copy number segment) and somatic mutation data (Varscan2) of CRC samples were downloaded from TCGA. Microarrays of CRC patients in GSE12945 (Staub et al., 2009), GSE39582 (Marisa et al., 2013), and GSE103479 (Allen et al., 2018) were acquired from the Gene Expression Omnibus (GEO). Microarray data were corrected for background and normalized through the robust multichip average (RMA) method. Missing data were imputed through the K-nearest neighbor method.

Identification of differentially expressed oxidative stress-related genes
Differentially expressed genes between CRC and normal tissues were screened based on the criteria of |log2fold-change|≥1 and adjusted p ≤ 0.05 utilizing the edgeR package. Adjusted p was calculated through the Bonferroni and Hochberg method. In total, 1,399 oxidative stress-related genes were extracted from the GeneCards according to relevance score≥7 (Supplementary Table S1). Afterward, differentially expressed oxidative stress-related genes were intersected. risk score was computed by the expression of candidate genes along with their coefficients. TCGA CRC samples were randomly assigned to the training set along with the testing set at 1:1 ratio (Liu et al., 2020). In each set, the median risk score was set as the cut-off value of low-and high-risk subsets.

Survival analysis
Kaplan-Meier curves along with the log-rank test were conducted on oxidative stress-relevant gene signature and patients' overall survival (OS), disease-free survival (DFS), disease-specific survival (DSS), and progression-free survival (PFS) based on the clinical data. Uni-and multivariate Cox regression models were established on the gene signature, and clinical parameters and OS with the survival package. Through the survival-ROC package, receiver operator characteristic curves (ROCs) were drawn, followed by the area under the curve (AUC) value.

Quantification of immune cell infiltration
Immune cell infiltrations were estimated across CRC tissues through Cell Type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT), a deconvolution approach proposed by Newman et al. (2015). The LM22 gene set was set as the reference set. This analysis was repeated 1,000 times, with p < 0.05 as the filtering condition.

Cancer immunity cycle
The cancer immunity cycle containing seven steps reflects the antitumor immunity as previously described (Chen and Mellman, 2013). The enrichment score of these steps was quantified via the TIP approach (Xu et al., 2018).

Analysis of CNV and mutation data
On the basis of the recurrently altered regions derived from the Genomic Identification of Significant Targets in Cancer (GISTIC 2.0) algorithm (Mermel et al., 2011), significant focal regions of gain and loss were identified and scored (G-score). The parameter thresholds were set as gain or loss length>0.1 and p < 0.05. Somatic mutation data were analyzed with the maftools package (version 2.6.0) (Mayakonda et al., 2018).

Drug sensitivity analysis
Drug Sensitivity data were acquired from the Genomics of Drug Sensitivity in Cancer (GDSC) database (www. cancerRxgene.org) (Yang et al., 2013). IC50 values were estimated with the oncoPredict package (Maeser et al., 2021).

Gene set enrichment analysis
GSEA was carried out through the Java platform (Subramanian et al., 2005). Gene sets of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were obtained from the Molecular Signatures Database (Liberzon et al., 2015). Terms with FDR<0.05 after 1,000 permutations were significantly enriched.

Cell culture and treatment
Human colorectal mucosal cell lines (FHC) and CRC cell lines  were maintained in DMEM with 10% fetal bovine serum, 100 U/ml penicillin, and 100 μg/ml streptomycin in a 37°C humidified incubator with 5% CO 2 . To induce oxidative stress, the cells were administrated H 2 O 2 in the medium, which was changed daily.

Statistical analysis
Statistical analysis was generated through R 3.6.1. Statistical difference between groups was computed with unpaired Student's t-test, Wilcoxon test, Kruskal-Wallis test, or one-way analysis of variance. Two-tailed p < 0.05 was set as statistical difference.

Development of an oxidative stressrelated gene signature for CRC
In total, there were 1,918 up-regulated genes and 2,081 down-regulated genes in 638 CRC versus 51 normal tissues ( Figures 1A-C). The detailed information is listed in Supplementary Table S2. From the GeneCards, we extracted 1,399 oxidative stress-related genes. After taking the intersection, 387 differentially expressed oxidative stress-related genes were finally identified ( Figure 1D). Among them, 53 genes were significantly correlated with CRC prognosis (Supplementary  Table S3). Afterward, candidate genes with regression coefficient≠0 were used for constructing an oxidative stressrelated gene signature using the LASSO algorithm ( Figures  1E,F). The risk score was computed according to (-0 For CRC prognosis, ACOX1, CPT2, NAT2, NRG1, and PPARGC1A were protective factors, and CDKN2A, CRYAB, NGFR, and UCN were risk factors ( Figure 1G). Figure 1H visualizes the expression of the aforementioned genes across CRC samples.
The oxidative stress-related gene signature accurately predicts CRC prognosis TCGA patients (N = 597) were randomly allocated into the training set (N = 298) and testing set (N = 299) at 1:1 ratio. Table 1 lists the patients' clinicopathological characteristics in the total, training along with testing sets. According to the median value, CRC cases were allocated into the high-or low-risk subsets (Figure 2A), with relatively more dead and recurred/progressed cases in the high-risk subset ( Figures 2B,C). The OS outcomes of the high-risk subset were significantly decreased in comparison to those of the low-risk subset in the training set ( Figure 2D) and the testing set ( Figure 2E) along with the total set ( Figure 2F). ROCs under 4-, 5-, and 6-year OS of the training set ( Figure 2G), the testing set ( Figure 2H) along with the total set ( Figure 2I) demonstrated the excellent performance of the oxidative stressrelated gene signature in predicting CRC prognosis.
We also measured the expression of two master regulators of oxidative stress (NRF2 and KEAP1). Compared with normal tissues, up-regulated KEAP1 and down-regulated NRF2 were found in CRC tissues at the transcriptional level ( Figure 2J), indicating the enhanced oxidative stress during CRC development. Additionally, we observed the difference in NRF2 and KEAP1 between high-and low-risk subsets. As shown in Figure 2K, the high-risk subset presented higher KEAP1 expression and lower NRF2 expression in comparison to the low-risk subset, demonstrating the heterogeneity in oxidative stress between high-and low-risk CRC patients.
The oxidative stress-related gene signature correlates to clinical characteristics of CRC Distribution of the risk score derived from the oxidative stress-related gene signature was analyzed across different clinical characteristics. With the increasing TNM and pathological stage, the risk score was dramatically higher (Figures 3A-D). Additionally, the risk score was positively correlated to the lymph node ( Figure 3E). Compared with microsatellite-stable (MSS), microsatellite unstable-low (MSI-L) had a significantly higher risk score ( Figure 3F). Overall, the oxidative stress-related gene signature was correlated to a more serious pathological status.
The oxidative stress-related gene signature acts as an independent prognostic factor of CRC patients Uni-and multivariate Cox regression models demonstrated that the risk score acted as an independent risk factor of CRC survival (Figures 3G,H).

External verification of the oxidative stress-related gene signature
To prove the robustness of the oxidative stress-related gene signature, this study included three independent cohorts. The same formula was used for computing the risk score. Both in the GSE103479 and GSE39582 cohorts, the high-risk subset exhibited worse OS than the low-risk subset, with relatively high AUCs at 4-, 5-and 6-year survival ( Figures 5A-D). As the N stage ( Figures 5E,F), M stage ( Figure 5G), and pathological stage ( Figures 5H,I) worsened, the risk score gradually increased.
The aforementioned data demonstrated that the signature exhibited excellent robustness on distinct platforms.
The oxidative stress-related gene signature correlates to antitumor immunity of CRC Through the CIBERSORT approach, we estimated the immune cell infiltration across CRC specimens (Figures 6A,B). Activated dendritic cells, activated mast cells, monocytes, neutrophils, resting NK cells, plasma cells, activated memory T-cell CD4, and resting memory T-cell CD4 were significantly lower in the high-risk subset than those in the low-risk subset (Figures 6C,D). Meanwhile, M0 macrophages, activated NK cells, T-cell CD8, T-cell follicular helper, and T-cell regulatory (Tregs) exhibited elevated infiltration in the high-risk subset. The  LGALS3, and PVR) exhibited down-regulation in the highrisk subset ( Figure 6E). Additionally, immunomodulators (IL6R, ICOS, CCR3, CCL20, CCR6, CXCL6, TNFSF13, TNFRSF17, CXCL3, CCL11, CXCL2, CXCL1, HHLA2, and CCL28) were down-regulated in the high-risk subset ( Figure 6F). Meanwhile, higher CX3CL1 and TNFSF14 expressions were found in the high-risk subset than in the low-risk subset. High activity of priming and activation, recruitment of

CD4 T cells, dendritic cells, T cells, and Th1 cells; infiltration of immune cells into tumors; and recognition of cancer cells by
T cells were found in high-risk subset compared to the low-risk subset ( Figure 6G). In contrast, B-cell recruitment, eosinophil recruitment, MDSC recruitment, neutrophil recruitment, Th2 cell recruitment, and Treg cell recruitment showed lower activity in high-risk subset than the low-risk subset. Overall, the oxidative stress-related gene signature was correlated to antitumor immunity of CRC. Frontiers in Pharmacology frontiersin.org

Difference in CNV and mutation between high-and low-risk subsets
For the CNV data, we used GISTIC 2.0 to determine 24 amplified fragments and 44 deleted fragments in the high-risk subset (Figures 7A,B). Meanwhile, 28 amplified fragments and 40 deleted fragments were identified in the low-risk subset (Figures 7C-G). Compared with the high-risk subset, higher mutated frequencies of APC, FAT4, and OBSCN occurred in the low-risk subset ( Figures 7H-K). In contrast, TP53, TTN, KRAS, SYNE1, MUC16, PIK3CA, and RYR2 had higher mutated frequencies in the high-risk subset. Additionally, mutated TP53 was significantly correlated to high-risk CRC ( Figure 7L).

Experimental verification of the genes within the oxidative stress-relevant gene signature
The genes in the signature were verified in the human colorectal mucosal cell line (FHC) along with CRC cell lines (SW-480 and HCT-116) through RT-qPCR or Western blot. CDKN2A and UCN were up-regulated and ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CRYAB, and NGFR were downregulated in CRC than normal cells (Figures 9A-C). Next, we

Discussion
Oxidative stress-related signatures have been established in acute myeloid leukemia (Dong et al., 2021), melanoma , clear-cell renal cell carcinoma (Wu Y. et al., 2021a), gastric cancer (Wu Z. et al., 2021b), head and neck squamous cell carcinoma (Wang and Zhou, 2021), glioma , and bladder cancer (Zhang et al., 2022). Alterations in redox status accompanied by increased production of ROS have been implicated in CRC (Lee et al., 2021). Nevertheless, so far, no oxidative stress-related model has been proposed for CRC. Considering the fact that oxidative stress is a complex process involving different genes, in the present study, we proposed an oxidative stress-related gene signature composed of ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CDKN2A, CRYAB, NGFR, and UCN to predict CRC patients' clinical outcomes with the LASSO approach.
Reliable markers in predicting immunotherapeutic responses of CRC patients are still insufficient in clinical practice (Chen L. et al., 2021b). Dual suppression of endoplasmic reticulum stress and oxidation stress may manipulate macrophage polarization following hypoxia to enhance immunotherapeutic sensitivity (Jiang et al., 2021). SENP7 can sense oxidative stress to maintain metabolic fitness  Frontiers in Pharmacology frontiersin.org 14 and antitumor effects of CD8 + T cells . Moreover, altered tumor metabolism via CD4 + T cells results in TNF-α-dependent intensified oxidative stress and tumor cell deaths (Habtetsion et al., 2018). The oxidative stress-related gene signature was correlated with the antitumor immunity of CRC, indicating that this signature might enable prediction of the immunotherapeutic response (Liu et al., 2020). Genomic alterations and CNVs were compared between high-and low-risk subsets. Particularly, the high-risk subset was remarkably linked to more aggressive molecular alteration: mutated TP53 that triggers enhanced proliferative capacity via consuming oxygen and producing abnormal vasculature during the early stage of cancer development. There were remarkable differences in drug sensitivity between high-and low-risk subsets. Additionally, the risk score was linked to CRC-related pathways, such as mitotic G2 DNA damage checkpoint, microRNAs in cancer, and Hippo signaling pathway.
Our experimental studies demonstrated that CDKN2A and UCN were up-regulated and ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CRYAB, and NGFR were downregulated in CRC cells (SW-480 and HCT-116) compared with human colorectal mucosal cells (FHC). In H 2 O 2 -induced CRC cells, their expression was remarkably altered. Butyrateinduced colonocyte differentiation determines CDKN2A as a prognostic biomarker of CRC recurrence (Dasgupta et al., 2019). Patients who have tumor chromosomal CDKN2A deletion are prone to immunotherapeutic resistance (Horn et al., 2018). ACOX1 may attenuate the migration and invasion of CRC cells (Sun et al., 2017). Down-regulated CPT2 induces stemness and oxaliplatin resistance in CRC through the ROS/Wnt/β-catenin-triggered glycolytic metabolism . Additionally, its downregulation heightens proliferation and weakens apoptosis via p53 signaling in CRC (Liu et al., 2022). NAT2 downregulation is also found in CRC, which correlates to CRC patients' metastasis and survival . CRYAB correlates to clinical outcomes and immunocyte infiltrations in CRC . NGFR improves the chemosensitivity of CRC cells by strengthening the apoptotic and autophagic effects of 5-fluorouracil by activating S100A9 (Chen H. et al., 2021a). Combining previous evidence, the genes in the oxidative stress-related signature play essential roles in CRC progression.
Our analysis is a retrospective study, resulting in unavoidable limitations. As many datasets as possible were included, so sampling bias from tumor heterogeneity and different platforms can only be decreased, but not completely removed. Although we experimentally validated the genes from the oxidative stress-relevant gene signature, more experimental studies are needed for clarifying the functional significance of oxidative stress in CRC.

Conclusion
In summary, this study proposed an oxidative stress-related signature composed of ACOX1, CPT2, NAT2, NRG1, PPARGC1A, CDKN2A, CRYAB, NGFR, and UCN to predict clinical outcomes and therapeutic responses of CRC patients, which provided valuable information for understanding the functional roles of oxidative stress in CRC development, assisting prognosis prediction and guiding adjuvant therapy (especially small molecular compounds and immunotherapy), thereby facilitating precision oncology of CRC.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.