Expression and Prognosis Analysis of SUMOylation Regulators in Oral Squamous Cell Carcinoma Based on High-Throughput Sequencing

Introduction Oral squamous cell carcinoma (OSCC) originates from oral mucosal epithelial cells, accounting for more than 90% of oral cancers. The relationship between the expression and prognostic role of SUMOylation regulators in OSCC is rarely studied. Materials and methods The expression and survival data of OSCC were derived from TCGA and GEO databases. Wilcoxon test was used to determine the differential expression of the SUMOylation regulators. A prognostic model based on SUMOylation regulator-related genes was constructed by Cox regression. Gene set enrichment analysis was applied to predict the potential biological functions that the genes might be involved in. Results RANBP2 and SENP6 had the highest SNV frequency. Eleven genes including PIAS3, RANBP2, USPL1, SENP6, SENP2, SENP5, SAE1, UBA2, PIAS4, UBE2I, and SENP3 were highly expressed in OSCC. The prognostic model based on nine SUMOylation-regulated genes (TRIM37, UFM1, FUBP1, CCNT1, FXR1, HMG20A, RANBP3, SPATA5, and DDX23) had a strong ability to predict the prognosis of OSCC. Conclusion This study might provide targets for prognostic evaluation and targeted therapy of patients with OSCC.


INTRODUCTION
Oral squamous cell carcinoma (OSCC) originates from oral mucosal epithelial cells, accounting for more than 90% of oral cancers (Roy et al., 2019). Its 5-year survival rate is only about 56 -60% (Neville and Day, 2002;Petersen, 2003). Tobacco, alcohol, and poor oral hygiene are the main risk factors for OSCC (Moore et al., 2000;Joseph and D'Souza, 2012). The treatment of OSCC includes surgery, radiotherapy, and chemotherapy (Chinn and Myers, 2015). Based on the research of the molecular mechanism and the development of targeted drugs, precision medicine provides a new method to rescue tumor progression (Harsha et al., 2020).
SUMOylation is a post-translational modification that can reversibly connect SUMO (small ubiquitin-like modifier) to lysine residues in the substrates (Celen and Sahin, 2020;Kroonen and Vertegaal, 2020). The process of SUMOylation is catalyzed by the cascade reaction of E1 activating enzyme (SAE1 and UBA2), E2 conjugating enzyme (UBE2I), and E3 ligase (PIAS1, PIAS2, PIAS3, PIAS4, and RANBP2). Moreover, deSUMOylation is the opposite process of SUMOylation, which releases SUMO molecules. This process is catalyzed by SUMO proteases (SENP1, SENP2, SENP3, SENP5, SENP6, SENP7, and USPL1). SUMOylation is involved in cell localization, protein activity regulation, and protein stability changes, which regulates cell division, signal transfer, and cell metabolism (Dhingra and Zhao, 2019;Chang and Yeh, 2020;Roy and Sadanandom, 2021). It has been reported that the expression of SENP5 is elevated in OSCC and is related to the degree of tumor differentiation (Ding et al., 2008). However, the relationship between the expression and prognostic role of SUMOylation regulators in OSCC is rarely studied.
This study analyzed the expression of SUMOylation regulators in OSCC based on high-throughput sequencing and then constructed a prognostic model based on the SUMOylation regulator-related genes. This work might provide directions for the study of the mechanism of SUMOylation in OSCC.

Data Acquisition
The expression and survival data of OSCC were derived from TCGA and GEO databases (TCGA: 1 ; GEO: 2 , GSE41613). Cases with overall survival time or total follow-up time less than 30 days and incomplete survival data were excluded. Finally, 323 cases of OSCC in TCGA, 32 cases of normal tissue, and 96 cases of OSCC in GEO were selected for follow-up analysis. In addition, single nucleotide variant (SNV) data were obtained from the TCGA database.

SNV and Expression of SUMOylation Regulators in OSCC
Single nucleotide variant data of OSCC were analyzed by "maftools" package of R language. After extracting the expression of the SUMOylation regulators from the TCGA database, Wilcoxon test was used to determine the differential expression of the SUMOylation regulators. In addition, we analyzed the expression differences of SUMOylation regulators among different stages and grades.

The Prognostic Model of OSCC
The Pearson correlation coefficient > 0.6 was used as selection criteria to screen the SUMOylation regulator-related genes. By using R language "caret" package, the TCGA dataset of OSCC was divided into a training group and the internal validation group at a ratio of 7:3. Meanwhile, the GEO dataset was used as the external validation group. In the TCGA training group, prognostic genes of OSCC were assessed by univariate Cox regression. Furthermore, we constructed the prognostic model by multivariate Cox regression. The risk score of every sample was calculated accordingly. Kaplan-Meier survival curve was applied

Gene Set Enrichment Analysis
In this study, "clusterProfiler" package of R language was applied to perform GO and KEGG enrichment analyses on SUMOylation regulator-related genes to predict the potential biological functions that might be involved in.

Statistical Analysis
The analyses and drawing of this work were all based on R language (4.0.2). When P was less than 0.05, the results were considered to be statistically different.

SNV and Differential Expression of SUMOylation Regulators
Among the 15 SUMOylation regulators, missense mutation, SNP, and C > T were the most common types of SNV (Figures 1A-C).
Compared with normal tissues, 11 genes including PIAS3, RANBP2, USPL1, SENP6, SENP2, SENP5, SAE1, UBA2, PIAS4, UBE2I, and SENP3 were highly expressed in OSCC. These heatmap and histogram of these differentially expressed SUMOylation regulators are shown in Figures 2A,B. Among these 11 genes, SAE1 and UBE2I had differential expressions with different stages (Figures 3A-K). Moreover, SENP1, USPL1, SENP2, SENP5, SAE1, UBA2, and UBE2I had differential expressions with different tumor grades (Figures 4A-K). Subgroup analyses showed that the SUMOylation status might be various with different stages and grades. In the following analyses, this study was to analyze the prognosis of the overall OSCC patients regardless of different subgroups, which might be more generally applicable to this disease.

GSEA
GO enrichment analysis showed that these SUMOylation regulator-related genes might be related to DNA replication, RNA splicing, and histone binding pathways ( Figure 5A). KEGG analysis enriched for these genes was linked to pathways such as Spliceosome, cell cycle, RNA transport, and DNA replication ( Figure 5B).

Prognosis Model of OSCC
We identified a total of 683 SUMOylation regulator-related genes. Using univariate Cox regression in the TCGA training group, 37 genes were found to be associated with  score of each sample are shown in Figures 6A,B. The survival curve of the TCGA training group showed that cases with high-risk scores had worse overall survival status than cases with low-risk scores (P < 0.001, Figure 6C). The ROC curve showed that the 3-year and 5-year AUC were 0.720 and 0.779, respectively ( Figure 6D).
In order to verify the prognostic ability of this model, we also drew the survival and ROC curves in the TCGA validation group and GEO validation group. In both TCGA validation group and GEO validation group, the overall survival of cases with high-risk scores were worse than that of cases with lowrisk scores (P < 0.001, Figures 7A-C). The AUC values of the TCGA verification group were 0.618 (for 3 years) and 0.603 (for 5 years), respectively, as shown in Figure 7B. The AUC values of the GEO validation group were 0.610 (for 3 years) and 0.660 (for 5 years), respectively, as shown in Figure 7D. A nomogram was drawn according to the risk model as shown in Figure 7E.

DISCUSSION
Small ubiquitin-like modifier molecules in human beings include SUMO-1, SUMO-2, SUMO-3, and SUMO-4, and their sequences are commonly conserved (Yang et al., 2017). Through the tertiary catalysis of E1 activating enzyme, E2 conjugating enzyme, and E3 ligase, the SUMO molecules were efficiently bound to the lysine residues of substrates (Flotho and Melchior, 2013). Conversely, SUMO proteases can dissociate the SUMO molecules from the substrates instead of degrading the SUMO molecules (Kerscher et al., 2006). Most SUMOylation substrates exist in the nucleus. The substrates are in the rapid circulation of SUMO conjugation and deconjugation (Hay, 2005), but this dynamic process has an important regulatory effect on tumor cell proliferation and genome stability (Eifler and Vertegaal, 2015).
p53 is one famous tumor suppressor. Studies have shown that after p53 is covalently conjugated by SUMO-1, it cannot Frontiers in Genetics | www.frontiersin.org bind to DNA and promote p53-dependent transcription. The SUMOylation of p53 at the K386 prevents p53 acetylation induced by p300. However, p53 that has already been acetylated can still bind to DNA without being interfered by SUMOylation (Wu and Chiang, 2009). MDM2, an E3 ligase, can induce ubiquitination and degradation of p53 and transfer it from nucleus to cytoplasm (Carter et al., 2007). SKI can enhance the SUMOylation of MDM2, thereby upregulating the activity of MDM2 and reducing the level of p53 (Ding et al., 2012). In addition, β-catenin, the key protein of the WNT pathway, is associated with poor prognosis. In multiple myeloma, SUMOylation has been reported to inhibit ubiquitinproteasomal mediated degradation of β-catenin, thereby upregulating the Wnt/β-catenin pathway and promoting cell proliferation (Huang et al., 2015). These studies show that studies of SUMOylation are of great significance for tumor pathogenesis and targeted therapy. However, current research on the molecular mechanism of SUMOylation in OSCC is quite limited. By analyzing the differential expression of SUMOylation regulators in OSCC between tumor tissues and normal tissues, we found that the 11 genes including PIAS3, RANBP2, USPL1, SENP6, SENP2, SENP5, SAE1, UBA2, PIAS4, UBE2I, and SENP3 were highly expressed. Further subgroup analyses indicated that SAE1 and UBE2I had differential expressions with different stages, while SENP1, USPL1, SENP2, SENP5, SAE1, UBA2, and UBE2I had differential expressions with different tumor grades. The differential expressions of SUMOylation regulators indicated that SUMOylation might play an important role in the pathogenesis of OSCC. These differentially expressed SUMOylation regulators have been reported in some kinds of tumors. For instance, the expression of SAE1 increases with the level of glioma. SAE1 can increase the SUMOylation of Akt and promote the progression of glioma (Yang et al., 2019). UBE2I, the unique E2 conjugating enzyme, is highly expressed in osteosarcoma, and silencing UBE2I can induce the dissociation of Cx43 and SUMO-1, thus inhibiting the proliferation and migration of osteosarcoma cells (Zhang et al., 2018). In addition, UBA2 can promote the progression of colorectal cancer and is expected to become a potential therapeutic target for colorectal cancer (Cheng et al., 2018).
This study showed that SUMOylation regulator-related genes in OSCC were enriched in biological functions such as DNA replication, RNA splicing, cell cycle, RNA transport, and DNA replication. There are finite reports that SUMOylation regulators participate in mediating functions in OSCC. Research has shown that SENP3 is highly expressed in OSCC and may be related to cellular oxidative stress (Sun et al., 2013). SENP5 is mainly expressed in the nucleus, and compared with adjacent tissues, the expression in OSCC is increased (Ding et al., 2008). So far, the expression of more SUMOylation regulators and their involvement in OSCC pathogenesis are still unclear.
In this study, a prognostic model of OSCC was constructed based on nine SUMOylation regulator-related genes including TRIM37, UFM1, FUBP1, CCNT1, FXR1, HMG20A, RANBP3, SPATA5, and DDX23. The prognostic model for OSCC performed well in classifying patients with different overall survival. It might be useful for tumor prognosis evaluation and targeted therapy. Furthermore, valproic acid can reduce the expression levels of SUMO-1 and SUMO-2 in OSCC. It can increase the distribution of cell in G1 phase and reduced distribution of S phase, resulting in anti-tumor effects. However, the specific target of valproic acid needs to be further studied (Sang et al., 2016).
The main limitation of this study was that the main work was based on sequencing analyses. More basic in vitro and in vivo experiments to explore molecular mechanisms need to be further carried out.

CONCLUSION
Compared with normal tissues, 11 genes including PIAS3, RANBP2, USPL1, SENP6, SENP2, SENP5, SAE1, UBA2, PIAS4, UBE2I, and SENP3 were highly expressed in OSCC. It is suggested that the SUMOylation regulator-related genes might be related to biological functions such as DNA replication, RNA splicing, cell cycle, RNA transport, and DNA replication. The prognostic model based on nine SUMOylation-regulated genes (TRIM37, UFM1, FUBP1, CCNT1, FXR1, HMG20A, RANBP3, SPATA5, and DDX23) had a strong ability to predict the prognosis of OSCC. This study might provide targets for prognostic evaluation and targeted therapy of patients with OSCC.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
XL conducted data analysis and manuscript draft. YM conducted experiment design and final inspection. Both authors contributed to the article and approved the submitted version.

FUNDING
The study was supported by the 345 Talent Project.