ORIGINAL RESEARCH article
Aberrant RNA Splicing Events Driven by Mutations of RNA-Binding Proteins as Indicators for Skin Cutaneous Melanoma Prognosis
- 1Department of Clinical Pharmacology, Hunan Key Laboratory of Pharmacogenetics, Key Laboratory of Biological Nanotechnology of National Health Commission, and National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- 2Engineering Research Center for Applied Technology of Pharmacogenomics of Ministry of Education, Institute of Clinical Pharmacology, Central South University, Changsha, China
The worldwide incidence of skin cutaneous melanoma (SKCM) is increasing at a more rapid rate than other tumors. Aberrant alternative splicing (AS) is found to be common in cancer; however, how this process contributes to cancer prognosis still remains largely unknown. Mutations in RNA-binding proteins (RBPs) may trigger great changes in the splicing process. In this study, we comprehensively analyzed DNA and RNA sequencing data and clinical information of SKCM patients, together with widespread changes in splicing patterns induced by RBP mutations. We screened mRNA expression-related and prognosis-related mutations in RBPs and investigated the potential affections of RBP mutations on splicing patterns. Mutations in 853 RBPs were demonstrated to be correlated with splicing aberrations (p < 0.01). Functional enrichment analysis revealed that these alternative splicing events (ASEs) may participate in tumor progress by regulating the modification process, cell-cycle checkpoint, metabolic pathways, MAPK signaling, PI3K-Akt signaling, and other important pathways in cancer. We also constructed a prediction model based on overall survival-related AS events (OS-ASEs) affected by RBP mutations, which exhibited a good predict efficiency with the area under the curve of 0.989. Our work highlights the importance of RBP mutations in splicing alterations and provides effective biomarkers for prediction of prognosis of SKCM.
Skin cutaneous melanoma (SKCM) is the deadliest type of skin cancer, comprising <5% of skin cancers but accounting for the majority of skin malignancy-induced deaths (1, 2). SKCM patient survival depends largely on early detection and diagnosis. Assessment and stratification of patients ahead of clinical therapy aid in identifying individuals with high risk of recurrence or poor survival outcome and may help to inform clinical decisions or potential targeted therapy strategies.
Both genetic and epigenetic alterations that interfere with normal cell physiological function are the fundamental reason of tumorigenesis (3–7). Alterations in transcriptome provide one of the biggest possibilities of proteome and molecular diversity of human cells. Alternative splicing (AS) is a posttranscriptional regulatory mechanism that transcribes a single form of pre-mRNA into multiple mature RNA molecules with different sequences and structures, which exist in almost 95% genes in human genome and provide large potential for multiple pathologies including cancer (8–11). Splicing is highly altered in tumor (12, 13). AS aberrations may lead to transcriptome variations in some cancer-related genes and in turn impact tumor progression as well as therapy resistance (14–17). These findings result in a growing interest in studying the role of aberrant AS in tumor progression.
ASEs have exhibited great potential to act as a biomarker for prognosis prediction in several tumor types. Therefore, it would be desirable to have a better understanding of the AS events (ASEs) in SKCM, which would provide potential for the development of novel clinical biomarkers as well as assist in more effective therapeutic decisions (13, 18). Importantly, AS is regulated by a number of RNA-binding proteins (RBPs), mutations of which may trigger widespread splicing aberrations in downstream spliced genes (19–21). RBPs are a category of proteins with unique RNA-binding domains and are usually dysregulated in tumors, which allow them to regulate a large array of transcript processes (22–26).
It is now clear that either mutations or expression changes in RBPs along with corresponding AS aberrations potentially have great value in exploring tumorigenesis and progression. Mutations or expression changes in splicing regulatory factors have been described as oncogenic drivers (27, 28) and would contribute to affecting oncogenic processes (29–31). However, recent studies focused on SKCM have several limitations: they only investigate one or a few genes, they lack effective biomarker identification, they are only reliable to a specific patient specimen, and there is no comprehensive investigation on how splicing regulatory factor aberrations contribute to SKCM development (32–34). There still exists a demand of systematic investigation on how mutations or expression patterns of RBPs along with related AS changes contribute to SKCM survival outcomes. Here, we systematically investigated the DNA and RNA sequencing data as well as clinical information of SKCM samples obtained from the Cancer Genome Atlas (TCGA) dataset, combined with ASE information from TCGA SpliceSeq data, so as to comprehensively describe the complex network of alterations in RBPs and their global effects in ASEs.
Materials and Methods
Alternative splicing data of 103 SKCM patients were downloaded from TCGA SpliceSeq (https://bioinformatics.mdanderson.org/TCGASpliceSeq), a resource for investigating AS patterns in 33 types of human tumor (35). The Percent Spliced In (PSI) value, which ranges from zero to one, was commonly used to quantify ASEs. ASEs with PSI values >75% were selected in this study. Missing values were supplied using “impute” package in R (36–38). ASEs with PSI average < 0.05 or standard deviation < 0.01 were removed. ASE was named as the combination of gene symbol, splicing type, and splicing ID number. DNA-seq data, RNA-seq data, and clinical information were obtained from the TCGA data portal (https://portal.gdc.cancer.gov/repository). Samples with follow-up time of <90 days were removed from the following study. Information of 1350 RBPs was obtained from a previous study (Supplementary Table 1) (39).
Alternative Splicing Events
After processing the alternative splicing data according to the method described above, a total of 26,919 ASEs in 103 SKCM samples were included in our study. There are seven types of alternative splicing events (ASE), including 2,332 alternate acceptor site events (AA), 2,056 alternate donor site events (AD), 5,126 alternate promoter events (AP), 5,096 alternate terminator events (AT), 1,0304 exon skip events (ES), 103 mutually exclusive exons events (ME), and 1,902 retained intron events (RI). The Percent Spliced In (PSI) value, a common value for quantifying ASEs, ranged from zero to one.
Samples with both RBP mutation data and AS information were selected in the study. The frequencies of somatic mutations were calculated for each RBPs, and the correlation between mutations and PSI value of ASEs was tested using the Wilcoxon–Mann–Whitney test (p < 0.01). The waterfall plots were built using the “GenomeInfoDbData” and “GenVisR” package in R. The balloon plot was visualized using the “ggpubr” and “ggplot2” package. The Upset plot was constructed with the “UpSetR” package in R.
RBP Mutations and Differential Expression
Differential expression was analyzed between samples with or without RBP mutations. The Wilcoxon–Mann–Whitney test was applied for estimating the correlations between RBP mutations and mRNA expression alterations, and RBPs were considered differentially expressed if p-value < 0.05. The association between RBPs with mRNA expression-related mutations and PSI value of ASEs was evaluated using Pearson correlation analysis; outcomes with |Cor|>0.5 and p < 0.05 were regarded as significant correlation. The bubble plots were visualized using “ggplot2” package in R.
RBP Mutations and Prognosis Analysis
The association between RBP mutations and prognosis of SKCM were investigated using Kaplan–Meier survival estimate analysis. A total of 177 samples with both clinical information and DNA sequencing data were selected in study. The analyses were realized using “survival” package in R, and significant correlations were identified with p < 0.05.
Protein–Protein Interaction Network and Functional Enrichment Analysis
The spliced genes of most significant ASEs (p < 1.00E-05) were input to the String database (https://string-db.org/), and the highest confidence (0.9) was chosen for confirming the interactions. The protein–protein interaction (PPI) network was visualized using Cytoscape. Gene Ontology (GO) analysis was investigated using the “clusterProfiler” package in R, and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis was performed in KOBAs (40), with a criterion of adjusted P-value of < 0.05. The results are shown in bar plots.
Identification of OS-ASEs
Survival information of SKCM patients was obtained from TCGA. Univariable Cox regression analysis was performed to evaluate the prognostic value of every ASE, and events with p-value < 0.05 were identified as OS-ASEs. The Upset plot was constructed to visualize the quantitative analysis of the intersections of seven types of ASEs. A volcano plot showed the overview of survival-related and unrelated ASEs. The most significant OS-ASEs of each type of ASEs are displayed in bubble plots described above. Pearson correlation analysis was used to test the correlation between RBP expression and PSIs of OS-ASEs; the correlation network was built based on the outcomes with |Cor|>0.6 and p < 0.05.
Construction of Prediction Model Based on OS-ASEs
OS-ASEs were firstly screened by the least absolute shrinkage and selection operator (LASSO) regression analysis, and then multivariate Cox regression analysis was performed to screen the significant prognostic OS-ASEs and build the prediction model. The area under the curve (AUC) value under the receiver operator characteristic (ROC) curve was calculated using the “survivalROC” package in R, which was used to evaluate the prognosis predictive efficiencies. Samples were then divided into high-risk and low-risk groups according to the median risk score calculated by our prediction model. Kaplan–Meier survival analyses were used for evaluating the prognostic efficiency of different prognostic indexes (PIs) in high- and low-risk patients. The analysis was proceeded with the “survival” package, and survival curves were plotted using the “survminer” package in R. Heatmap was visualized using the “pheatmap” package. Univariate Cox regression analysis and multivariate Cox regression analysis were applied to evaluate the prognosis value of risk factors including risk score quartiles, age, gender, tumor stage, and TMN stage in overall survival of SKCM patients. Factors with HR>1 or <1 were identified as high- and low-risk factors (p < 0.05), respectively.
Splicing Alterations Driven by RBP Mutations in SKCM
The overall analysis process is presented in Figure 1. We first systematically investigated the splicing process alterations induced by RBP mutations. DNA sequence data of 467 SKCM patients were downloaded from TCGA, while ASE information of 103 SKCM samples was obtained from TCGA SpliceSeq data. The mutant situations of 1,350 known and predicted RBPs were studied in 467 SKCM samples, 452 of which were found to carry mutations in RBPs. Mutant situations of the top 50 RBPs with the highest mutation frequency were exhibited using a waterfall plot (Figure 2A). The mutant information was then combined with the Percent Spliced In (PSI) value of ASEs and applied to the Wilcoxon–Mann–Whitney test. As a result, mutations in 853 RBPs were found to be obviously correlated with 19748 different ASEs (Supplementary Table 2), including 1,655 AAs in 1,334 genes, 1,478 ADs in 1,161 genes, 3,681 APs in 2,280 genes, 3,665 ATs in 2,206 genes, 7,802 ESs in 4,178 genes, 85 MEs in 84 genes, and 1,382 RIs in 1,042 genes (Figures 2B,C). The top 50 RBPs with the highest mutation frequency and the corresponding ASEs of each splicing pattern are shown in Figure 2D. The above results indicated that RBP mutations were significantly correlated with extensive AS alterations in SKCM.
Figure 1. Flow diagram of the research methodology. RBP, RNA-binding protein; ASE, alternative splicing event; SF, splicing factor; PPI, protein–protein interaction.
Figure 2. RBP mutations triggered splicing event alterations. (A) Among the 853 RBPs whose mutations showed significant correlations with aberrant ASEs (Wilcoxon test P-values < 0.01). The mutation situations of the top 50 RBPs with the highest mutant frequency were exhibited. The y-axis represents each RBP and sorted by mutation frequency. The x-axis stands for different SKCM samples. The color indicates different mutation types. The bar plot above represents the mutation frequency with synonymous (red) and non-synonymous (blue) translational effects for each SKCM patient. (B) Upset plot of each splicing pattern affected by RBP mutations. The black bar on the left side indicates the event number of specific splicing type, and the red dots on the right side represent the intersections of ASEs. (C) Distribution of seven types of ASEs and their corresponding genes. The x-axis represents each kind of ASEs (AA, AD, AP, AT, ES, ME, RI), and the y-axis represents the number of ASEs (red) or the corresponding spliced genes (blue). (D) Overview of splicing events per RBP with the highest mutation frequency. The color intensity indicates the occurrence frequency of each splicing pattern, and the above histogram represented the counts of ASEs affected by corresponding RBPs.
RBP Mutation-Induced mRNA Expression Alterations
We also investigated whether the aberrant splicing process driven by RBP mutations occurred as a consequence of mRNA expression alterations of RBPs. To screen out mutations in RBPs that altered the mRNA expression level, we combined DNA and RNA sequencing data downloaded from TCGA and analyzed. As a result, a total of 91 RBPs exhibited a significant differential expression between wild and mutant genotypes (Supplementary Table 3). In more details, RBPs with upregulated and downregulated mRNA expression levels in mutant samples compared to wild type were exhibited, respectively (Figure 3A). The overview of mutation situations of 91 RBPs in 292 SKCM samples were visualized (Supplementary Figure 1).
Figure 3. RBP mutations induced mRNA expression alterations. (A) A total of 91 RBPs exhibit significant different mRNA expression levels induced by mutations (Wilcoxon test P-values < 0.05). The left and right panels indicate the up- and downregulation of the mRNA expression level of tested RBPs. The circle size corresponds to the mutation frequency of SKCM samples. (B) Upset plot showed splicing alterations induced by mRNA expression-associated RBP mutations. The black bar on the left represents the counts of different splicing patterns, while the red dots and lines on the right indicate the intersections of ASEs. (C) Bar graph represents the number of each splicing type as well as corresponding spliced genes. (D) Distribution of splicing events per RBP with mRNA expression-affecting mutations. The color intensity indicates the occurrence counts of each type of ASEs.
Then, we excavated the splicing alterations that may occur as a consequence of expression-associated RBP mutations described above. The association between 91 RBP and PSI values of ASEs was tested in 103 SKCM samples (Supplementary Table 4). Significant associations were found in 1728 ASEs |Cor|>0.5, p < 0.05), including 74 AAs in 71 genes, 66 ADs in 64 genes, 422 APs in 270 genes, 728 ATs in 470 genes, 225 ESs in 204 genes, 4 MEs in 4 genes, and 209 RIs in 185 genes (Figures 3B,C). This result indicated that a single gene can undergo multiple types of ASEs. Besides, among seven types of ASEs, the incidence of AT was most frequently followed by AP, while the rarest type was ME events. The detailed regulatory network of each RBPs on different splicing pattern is exhibited in Figure 3D.
Collectively, the above results indicated that RBP genotypes are frequently altered in SKCM tumors and may contribute to affecting the mRNA expression, which was consistent with previous researches focused on other human tumors (39). The expression-associated RBP mutations discovered in this study and their affecting ASEs provide new information about RBP mutations with a potential role in physiological function regulation of SKCM.
RBP Mutations Associate With Survival Outcome of SKCM Patients
To further define whether mutations in RBPs could influence the survival outcomes of SKCM patients, the DNA sequencing data of 1,350 RBPs and clinical information of 177 SKCM patients with RBP mutations were combined and analyzed. By taking the Kaplan–Meier test, 43 RBPs were demonstrated to carry prognosis-associated mutations (p < 0.05) (Supplementary Table 5). The clinical information and mutant situations of 43 RBPs are exhibited in Figure 4A. Among the 43 significant RBPs, mutations in 5 RBPs were regarded as high-risk factors (green dots), and for the rest of the 38 RBPs, the mutant type showed a lower-risk grade (yellow dots) (Figure 4B). In summary, mutations in the above 43 RBPs were found to have a significant correlation with the overall survival of SKCM patients, which indicated a promising efficiency of RBP mutations in prognosis prediction.
Figure 4. RBP mutations correlated with SKCM prognosis. (A) A total of 43 RBPs were demonstrated to carry prognosis-associated mutations (Kruskal–Wallis H test P-values < 0.05). The y-axis represents each RBP and sorted by mutation frequency. The x-axis stands for different SKCM samples. The color indicates different mutation type. The bar plot above represents the mutation frequency with synonymous (red) and non-synonymous (blue) translational effects of each sample. (B) High-risk factors (right panel) and low-risk factors (left panel) of prognosis-related RBP mutations. The circle size stands for count of samples with mutations.
Functional Enrichment Analysis and Protein–Protein Interaction (PPI) Network of ASEs
We then tried to investigate the potential mechanism following the alterations of relevant spliced genes as well as their encoding proteins. There were 19,748 ASEs in 7,776 parent genes that showed a close correlation with RBP mutations (Supplementary Table 2, p < 0.01). The protein–protein interaction (PPI) network was investigated among the spliced genes of most significant ASEs (p < 1.00E-05). The network was built using the String database and visualized in Cytoscape (Figure 5A). To further reveal the molecular process and biological function mediated by the spliced genes, GO and KEGG enrichment analyses were applied to reveal their enriching pathways. GO analysis revealed the significant enrichment in the process of “covalent chromatin modification,” “histone modification,” “peptidyl-lysine modification,” “cell cycle checkpoint,” “DNA metabolic,” etc. (Figure 5B). As for the KEGG terms, several significant enriched pathways were closely related to tumor progress, like “metabolic pathways,” “pathways in cancer,” “MAPK signaling pathway,” and “PI3K-Akt signaling pathway” (Figure 5C). In summary, the above enrichment results revealed that these ASEs may have an essential role in the biological process and tumorigenesis of SKCM patients, by regulating the modification process, cell-cycle checkpoint, metabolic, and other cancer-related pathways.
Figure 5. Construction of the protein–protein interaction network and functional enrichment analysis. (A) Protein–protein interaction network of spliced genes affected by RBP mutations. Red dots represent each spliced gene. Line thickness represents the strength of interaction. (B–E) GO and KEGG analysis outcomes of RBPs and spliced genes. The top 20 most significant enriched pathways were exhibited in a bar plot. (B,C) Enriched pathways in GO analysis, (B) KEGG analysis, and (C) spliced genes regulated by RBPs. (D,E) GO, (D) KEGG, and (E) enrichment analysis outcomes of RBPs.
In addition, we also made a comparison between the enrichment outcomes of RBPs and spliced genes regulated by these RBPs. The top 20 most significant enriched pathways of GO and KEGG analysis were listed. RBPs are mostly enriched in RNA splicing or other RNA-related process (Figures 5D,E), which is quite different from the spliced genes described above. The separate GO analysis results (cellular component, CC; biological process, BP; molecular function, MF) of spliced genes and RBPs were offered in Supplementary Figures 2, 3. These results indicated that RBPs and their corresponding spliced genes exhibited obvious differences in biological function and regulatory network.
Identification of OS-ASEs and Construction of the Correlation Network
Univariable COX regression analysis was then performed to investigate all the overall survival-related ASEs (OS-ASEs) affected by RBP mutations (Supplementary Table 6). According to the volcano plot, a total of 919 OS-ASEs in 757 parent genes were identified as prognosis-associated ASEs (p < 0.05) (Figure 6A), which included 75 AAs in 74 genes, 86 ADs in 80 genes, 212 APs in 163 genes, 176 ATs in 122 genes, 320 ESs in 301 genes, 2 MEs in 2 genes, and 48 RIs in 47 genes (Figures 6B,C). ES was the most common splicing type associated with overall survival of SKCM patients (Figure 6C). The most significant OS-ASEs in each type of splicing patterns are shown in bubble plots (Figures 6D–J). We also constructed the splicing-regulatory network of these OS-ASEs and corresponding RBPs. The correlation network was visualized in Cytoscape. The positively regulatory (red lines, red dots) and negatively regulatory (blue lines, blue dots) relations were exhibited, respectively (Figure 7).
Figure 6. Overview of OS-ASEs affected by mRNA expression-related RBP mutations. (A) The volcano plot of all the ASEs. The red and blue dots represent OS-related and insignificant ASEs, respectively. (B) Upset plot of each type of OS-ASEs in SKCM. The black bar on the left side indicates the number of specific types of ASEs, while the red dots on the right side stand for the splicing intersections. (C) Distribution of seven types of ASEs and their corresponding genes. The x-axis represents each kind of ASEs (AA, AD, AP, AT, ES, RI), and the y-axis represents the number of ASEs (red) as well as the corresponding genes (blue). (D–J) Bubble plot of the most significant OS-ASEs in SKCM. (D) AA; (E) AD; (F) AP; (G) AT; (H) ES; (I) ME; (J) RI. The x-axis represents the z-score of each type of ASEs, while the y-axis stands for the OS-ASEs.
Figure 7. Correlation network of RBPs and OS-ASEs. The yellow dots represent RBPs. The blue and red dots indicate OS-ASEs that have been negatively regulated or positively regulated by the corresponding RBPs. The blue and red lines represent the existence of negative or positive regulation effects.
Prediction Model Establishment Based on All Types of OS-ASEs
LASSO regression analysis and multivariate Cox regression analysis were then performed to construct a prognostic model based on all types of OS-ASEs that are influenced by RBP mutations, among which 10 OS-ASEs were selected as candidate predict factors (Figures 8A,B). As shown in Table 1, MTMR14-63114-ES, BATF2-16723-AP, and EXOC6-12541-AP were identified as a high-risk factor (HR>1), while the rest 7 ASEs indicated good prognosis (HR<1) (Table 1). The prediction efficiency was evaluated by building the ROC curve, and the AUC value was 0.989, indicating an obviously credible prediction efficiency (Figure 8C). Then, the risk score of each SKCM patient was calculated based on the prediction model and the medium cutoff value was applied to divide patients into high-risk and low-risk groups (Figures 8D,E). Kaplan–Meier survival analyses indicated that SKCM patients in the low-risk groups exhibited a significant better survival compared with those in the high-risk group (p = 6.843E-08) (Figure 8F). The heatmap exhibited the expression's situation of OS-ASEs in the final prognostic model (Figure 8G).
Figure 8. Prognostic model based on all types of OS-ASEs. (A) Cross-validation for tuning parameter selection in the proportional hazard model. (B) LASSO regression analysis for screening coefficients in all types of OS-ASEs. (C) The ROC curves for evaluating the efficiency of the prognostic model. (D) The risk curve of 93 SKCM patients matched with intact follow-up data. (E) The scatter plots of SKCM samples. The red and green plots represent alive and death endpoints, respectively. (F) Kaplan–Meier overall survival curves based on all types of OS-ASEs. The numbers of patients in the high-risk and low-risk groups at different survival times are listed at the bottom panel, respectively. (G) The heatmap of 10 OS-ASEs selected by LASSO regression analysis.
We also constructed a prediction model based on RBP mutations to determine the prognosis value of RBPs themselves instead of their affections on ASEs (Supplementary Figure 4). Mutations in 13 RBPs were selected in this model (Supplementary Figures 4A,B; Supplementary Table 7), and the AUC value under the ROC curve was 0.527 (Supplementary Figure 4C). This result indicated that mutations in RBPs affect SKCM progression and outcome mostly through regulating downstream ASEs instead of themselves.
Cox Regression Analysis of Prognosis-Related Clinical Features
As there were several other risk factors that may contribute to the prognosis of SKCM patients, we next performed univariate and multivariate Cox regression analyses to further evaluate the prognosis value of risk scores calculated by our prediction model and other risk factors including age, gender, tumor stage, and TMN stage. According to the univariate Cox regression analysis result, both tumor stage (HR=2.801, P = 0.019, 95% CI: 1.187–6.612), M stage (HR = 10.173, P = 0.003, 95% CI: 2.178–47.517), and risk score quartiles based on ASEs (HR=13.637, P = 1.517E-04, 95% CI: 3.529–52.695) are regarded as high-risk factors in terms of SKCM prognosis (Figure 9A). As for the multivariate Cox regression analysis, risk score quartiles based on ASEs is the only significant high-risk factors to SKCM patients' outcome (HR=15.288, P = 1.303E-04, 95% CI: 3.781–61.810) (Figure 9B). This result further corroborated that the risk score calculated by our predict model is an independent risk factor of SKCM patients' prognosis.
Figure 9. Cox regression analysis of prognosis-related clinical features and OS-ASEs. (A) Univariate Cox regression analysis and (B) multivariate Cox regression analysis.
SKCM is one of the deadliest types of cancer with an increasingly annual morbidity and mortality rate (41, 42). Major etiological factors contributed to SKCM development including UV radiation, skin pigmentation reduction, nevus density increase, immunosuppression, family history, and genetic susceptibility (43, 44). Though previous studies have identified several biomarkers, efficient diagnostic and prognostic indicators are still lacking (45, 46).
In recent years, remarkable advances have been made in prognostic biomarker identification in SKCM. Except the conventional tissue-based markers such as Breslow thickness (47, 48), ulceration (47, 49, 50), mitotic rate (47, 51), tumor-infiltrating lymphocytes (52, 53), lymphatic and vascular invasion (51, 54), a great amount of research effort has been made to explore molecular biomarkers with efficient prognostic implication. Candidates including S100B (55, 56), ki-67 (57), metallothioneins (MTs) (58–60), and lactate dehydrogenase (LDH) were repeatedly demonstrated to associate with poor clinical outcomes of SKCM patients. Genetic biomarkers such as specific mutations in BRAF and NRAS were also considered to have prognostic implication in SKCM (61–63). Besides, considering the limitations in sensitivity and specificity of individual biomarkers, multi-marker arrays also gained extensive attention and several biomarker panels have been proposed (64–66). It is clear that the discovery of predict biomarkers of SKCM has been evaluated with positive results; however, challenges still remained. Conflicting results are wildly reported, and few of these biomarkers have been proven to be clinically useful or only reliable to a particular group of SKCM patients (67–70). Therefore, currently there still exists strong demand of SKCM prognostic biomarkers that provide guidance to SKCM patient management.
Almost all the multi-exon genes undergo alternative splicing (71, 72). Alternative splicing refers to a biological process that transforms a single pre-mRNA to multiple splice isoforms and finally leads to different or antagonistic functional as well as structural characteristics of protein products or results in different phenotypes due to alternating expression levels of the spliced genes (73, 74). Currently, an increasing number of evidences have been proposed to regard the alternative splicing as an effective indicator of carcinogenic processes (9, 29, 39, 75–79). Zhang et al. revealed that U2AF2 can enhance melanoma migration and cancer aggressiveness by facilitating CD44v8-10 alternative splicing, an isoform switch required for tumor prognostic and metastases (80). Besides, Liu et al. found that JMJD6 can regulate the alternative splicing of a critical component of the MAPK signal pathway, PAK1, and thereby promoting melanoma carcinogenesis (33). However, although some specific correlations between ASEs and melanoma have been identified, the exploration and prognostic signatures were developed only based on splicing events (32). The understanding and exploration of alternative splicing in SKCM are still far from enough.
ASEs were mostly regulated by a series of RBPs, whose mutations may contribute to many of AS alterations observed in cancer (39). In the past few years, increasing researches have revealed the significant role of RBP mutations in prognosis prediction in multiple cancer types (22, 39, 77, 81–83), while their role in SKCM prognosis has not been discussed yet. To determine the underlying mechanism, we deeply investigated the potential regulatory associations between RBP mutations and ASEs, so as to provide insight into the phenotype changes of ASEs in SKCM and contribute to precision treatment. We integrated the DNA-seq and RNA-seq data as well as clinical information obtained from TCGA, and ASE information from TCGA SpliceSeq. We then systematically investigated RBP mutations that correlated with mRNA expression, overall survival, and ASEs, as well as the overall situation, interaction network, and enriched pathways of related spliced genes. We also constructed predicted models based on different types of OS-ASEs, which offered us a potential intervention target for clinical SKCM treatment.
In our study, 19,748 ASEs were identified to associate with RBP mutations. Functional analysis of spliced genes revealed the enrichment in several important biological processes like modification process, cell-cycle checkpoint, DNA metabolism, MAPK signaling, and PI3K-Akt signaling. The correlation network between splicing factors and OS-ASEs was also constructed to highlight the genetic mechanism. In addition, in order to construct a prediction model and evaluate the prediction value of ASEs, LASSO regression analysis was applied to screen the prediction factors and AUC value under the ROC curves were calculated to evaluate the predict efficiency. The final prediction model based on all types of ASEs consists of 10 spliced events, including MTMR14-63114-ES, BATF2-16723-AP, ME1-76870-ES, URI1-48867-ES, ITGB1BP1-52620-ES, DNM2-47584-AA, EXOC6-12541-AP, OSCP1-1775-AP, ING2-71271-AP, and ZNF83-51481-AT. The risk scores were calculated for each SKCM patient by taking LASSO and multivariate Cox regression analysis of PSI values of OS-ASEs, and the subsequent risk stratification by risk scores enabled the satisfactory differentiation of SKCM patients with different survival outcomes. It is noteworthy that this prediction model based on all types of OS-ASEs exhibits an admirable prediction ability, with the AUC value of 0.989, which is much more significant than the previous study (32).
According to our research, we revealed a novel network between RBPs and alternative splicing changes in SKCM that provide sufficient resources of information to understand the molecular mechanism of, and potentially reverse, the tumorigenesis and survival outcomes. One core implication of our research for SKCM prognostic and clinical studies is to expand the functional effects of genetic and epigenetic alterations into changes included in the AS process of genes involved in tumor-related pathways. The prognostic model constructed in our study possessed a high performance for risk stratification in SKCM, which was promising in predicting survival outcome and helping to apply more appropriate treatment regimens to different SKCM patients. Besides, given the high prevalence of splicing defects in tumor, several small molecule modulators targeting RNA processing have been explored and exhibited promising therapeutic effects in tumor treatment (84). Our study also provides potential targets for SKCM treatment. Certainly, more researches concerning the molecular mechanism of significant RBP mutations on AS regulation are needed in the future. Functional investigations and biological experiments focused on the significance of ASEs, and how these aberrant procedures affect critical cancer pathways may provide novel therapeutic targets for SKCM treatment and improve survival outcomes.
In conclusion, our research described a comprehensive landscape of aberrant alternative splicing and its regulation in SKCM. We focused on the executors of splicing procedure, as well as systematic analysis mutation and gene expression patterns of 1,350 RBP genes. We also constructed a network to exhibit the potential correlation between RBPs and relevant ASEs. Furthermore, we identify enrichment of tumor-critical pathways among the spliced genes. The prediction model constructed in this study enabled the satisfactory differentiation of overall survival and provided a better guidance for clinical decisions and prognosis prediction of SKCM patients.
Data Availability Statement
The datasets generated for this study can be found in the TCGA and TCGA SpliceSeq.
Conceptualization: CM, P-YS, and Z-QL. Methodology: CM, P-YS, and XL. Investigation: CM and P-YS. Writing—original draft: CM. Writing—review and editing: WZ and H-HZ. Funding acquisition: Z-QL. Resources: CM, P-YS, and XL. Supervision: Z-QL. All authors contributed to the article and approved the submitted version.
This work was supported by the National Key Research and Development Program of China (2016YFC1306900), National Natural Science Foundation of China (81874327), and Key Research and Development Program of Hunan Province (2019SK2251).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors thank every contributor for their help in this research work.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2020.568469/full#supplementary-material
Supplementary Figure 1. Waterfall plot of 91 RBPs with mRNA expression-affected mutations.
Supplementary Figure 2. GO analysis of spliced genes regulated by RBP mutations. (A) Cellular component; (B) biological process; (C) molecular function. The x-axis stands for the gene ratio (the proportion of genes participated in each pathway) and the y-axis represents the name of pathways.
Supplementary Figure 3. GO analysis of 1350 RBPs. (A) Cellular component; (B) biological process; (C) molecular function. The x-axis stands for the gene ratio and the y-axis indicates the pathway names.
Supplementary Figure 4. Prognostic model based on 1350 RBPs. (A) LASSO regression path plot for building predict model and calculating coefficients; (B) LASSO coefficient profiles based on all types of OS-ASEs; (C) construction of ROC curves to evaluate the predict efficiency.
Supplementary Table 1. Gene list of 1350 known and predict RNA binding proteins (RBPs).
Supplementary Table 2. RBPs mutations induced splicing alterations. The left table lists all the splicing events affect by RBP mutations (Splicing_event) as well as their corresponding RBPs (RBP). It also provides the corresponding p-value (p-value) of each splicing event. The right table lists the occurrence frequency of different splicing patterns and their parent genes.
Supplementary Table 3. Information of 91 RBPs selected by Wilcoxon Mann-Whitney test. The table includes the list of 91 RBPs (RBP), corresponding p-value (p.value) and the percentage of mutant samples (Mut_ratio). It also provides the sample size (Mut_count/Wild_count) and the median of mRNA expression (Wild_med/Mut_med) of wild and mutant samples, respectively.
Supplementary Table 4. Corresponding splicing events regulated by 91 RBPs. In the left table, a total of 1,728 splicing events in 1,268 parent genes were identified using Pearson correlation test. The information of each RBPs and their related splicing events were listed, and the Pearson correlation coefficient (Cor) and p.value were also provided in this table. “Positive” and “negative” in column Reg. indicated “positive regulation and “negative regulation,” respectively. The occurrence frequency of each types of splicing event and their parent genes were listed in the right table.
Supplementary Table 5. Kaplan-Meier test result of 45 significant RBPs. RBP mutations that potentially affect the survival outcomes of SKCM patients were selected by Kaplan-Meier test. The p-value, hazard ratio (HR), 95% confidence interval (95% CI) of each RBP were listed.
Supplementary Table 6. Survival-related splicing events selected by Lasso regression. Among the 19748 splicing events affected by RBP mutations, a total of 919 splicing events were identified to be correlated with the overall survival of SKCM patients using Lasso regression. The Z_score (Cor), hazard ratio (HR), 95% confidence interval (95%CI_L and 95%CI_H, respectively) and p-value were listed in the left table. The occurrence frequency of each types of splicing event and their parent genes were listed in the right table.
Supplementary Table 7. Construction of predict model based on RBP mutations. RBP mutations were selected by Lasso regression. A total of 13 RBPs were selected into the predict model using multivariate Cox regression analysis. The coefficient (Coef), hazard ratio (HR), 95% confidence interval (95%CI_L and 95%CI_H, respectively) and p-value were listed.
1. Guy GP, Jr, Machlin SR, Ekwueme DU, Yabroff KR. Prevalence and costs of skin cancer treatment in the U.S., 2002-2006 and 2007-2011. Am J Prev Med. (2015) 48:183–7. doi: 10.1016/j.amepre.2014.08.036
3. Zhang Q, Ding S, Zhang H, Long H, Wu H, Zhao M, et al. Increased Set1 binding at the promoter induces aberrant epigenetic alterations and up-regulates cyclic adenosine 5'-monophosphate response element modulator alpha in systemic lupus erythematosus. Clin Epigenetics. (2016) 8:126. doi: 10.1186/s13148-016-0294-2
4. Cao AY, Huang J, Hu Z, Li WF, Ma ZL, Tang LL, et al. The prevalence of PALB2 germline mutations in BRCA1/BRCA2 negative Chinese women with early onset breast cancer or affected relatives. Breast Cancer Res Treat. (2009) 114:457–62. doi: 10.1007/s10549-008-0036-z
5. Cao AY, Huang J, Hu Z, Li WF, Ma ZL, Tang LL, et al. Mutation analysis of BRIP1/BACH1 in BRCA1/BRCA2 negative Chinese women with early onset breast cancer or affected relatives. Breast Cancer Res Treat. (2009) 115:51–5. doi: 10.1007/s10549-008-0052-z
11. Ma L, Gao X, Luo J, Huang L, Teng Y, Horvitz HR. The Caenorhabditis elegans gene mfap-1 encodes a nuclear protein that affects alternative splicing. PLoS Genet. (2012) 8:e1002827. doi: 10.1371/journal.pgen.1002827
12. Urbanski LM, Leclair N, Anczukow O. Alternative-splicing defects in cancer: Splicing regulators and their downstream targets, guiding the way to novel cancer therapeutics. Wiley Interdiscip Rev RNA. (2018) 9:e1476. doi: 10.1002/wrna.1476
16. Liu T, Sun H, Zhu D, Dong X, Liu F, Liang X, et al. TRA2A promoted paclitaxel resistance and tumor progression in triple-negative breast cancers via regulating alternative splicing. Mol Cancer Ther. (2017) 16:1377–88. doi: 10.1158/1535-7163.MCT-17-0026
17. Chen Y, Huang Q, Liu W, Zhu Q, Cui CP, Xu L, et al. Mutually exclusive acetylation and ubiquitylation of the splicing factor SRSF5 control tumor growth. Nat Commun. (2018) 9:2464. doi: 10.1038/s41467-018-04815-3
20. Brooks AN, Choi PS, de Waal L, Sharifnia T, Imielinski M, Saksena G, et al. A pan-cancer analysis of transcriptome changes associated with somatic mutations in U2AF1 reveals commonly altered splicing events. PLoS ONE. (2014) 9:e87361. doi: 10.1371/journal.pone.0087361
22. Wang E, Lu SX, Pastore A, Chen X, Imig J, Chun-Wei Lee S, et al. Targeting an RNA-binding protein network in acute myeloid Leukemia. Cancer Cell. (2019) 35:369–84 e7. doi: 10.1016/j.ccell.2019.01.010
29. Zong FY, Fu X, Wei WJ, Luo YG, Heiner M, Cao LJ, et al. The RNA-binding protein QKI suppresses cancer-associated aberrant splicing. PLoS Genet. (2014) 10:e1004289. doi: 10.1371/journal.pgen.1004289
31. Zhang J, Lieu YK, Ali AM, Penson A, Reggio KS, Rabadan R, et al. Disease-associated mutation in SRSF2 misregulates splicing by altering RNA-binding affinities. Proc Natl Acad Sci USA. (2015) 112:E4726–34. doi: 10.1073/pnas.1514105112
33. Liu X, Si W, Liu X, He L, Ren J, Yang Z, et al. JMJD6 promotes melanoma carcinogenesis through regulation of the alternative splicing of PAK1, a key MAPK signaling component. Mol Cancer. (2017) 16:175. doi: 10.1186/s12943-017-0744-2
34. Alsafadi S, Houy A, Battistella A, Popova T, Wassef M, Henry E, et al. Cancer-associated SF3B1 mutations affect alternative splicing by promoting alternative branchpoint usage. Nat Commun. (2016) 7:10615. doi: 10.1038/ncomms10615
37. Rong MH, Zhu ZH, Guan Y, Li MW, Zheng JS, Huang YQ, et al. Identification of prognostic splicing factors and exploration of their potential regulatory mechanisms in pancreatic adenocarcinoma. PeerJ. (2020) 8:e8380. doi: 10.7717/peerj.8380
38. Meng T, Huang R, Zeng Z, Huang Z, Yin H, Jiao C, et al. Identification of prognostic and metastatic alternative splicing signatures in kidney renal clear cell carcinoma. Front Bioeng Biotechnol. (2019) 7:270. doi: 10.3389/fbioe.2019.00270
39. Sebestyen E, Singh B, Minana B, Pages A, Mateo F, Pujana MA, et al. Large-scale analysis of genome and transcriptome alterations in multiple tumors unveils novel cancer-relevant splicing networks. Genome Res. (2016) 26:732–44. doi: 10.1101/gr.199935.115
40. Xie C, Mao X, Huang J, Ding Y, Wu J, Dong S, et al. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases. Nucleic Acids Res. (2011) 39:W316–22. doi: 10.1093/nar/gkr483
43. de Assis LVM, Moraes MN, Castrucci AML. The molecular clock in the skin, its functionality, and how it is disrupted in cutaneous melanoma: a new pharmacological target? Cell Mol Life Sci. (2019) 76:3801–26. doi: 10.1007/s00018-019-03183-5
47. Balch CM, Gershenwald JE, Soong SJ, Thompson JF, Atkins MB, Byrd DR, et al. Final version of 2009 AJCC melanoma staging and classification. J Clin Oncol. (2009) 27:6199–206. doi: 10.1200/JCO.2009.23.4799
50. Aviles-Izquierdo JA, Lazaro-Ochaita P. Histological ulceration as a prognostic factor in cutaneous melanoma: a study of 423 cases in Spain. Clin Transl Oncol. (2012) 14:237–40. doi: 10.1007/s12094-012-0790-6
51. Namikawa K, Aung PP, Gershenwald JE, Milton DR, Prieto VG. Clinical impact of ulceration width, lymphovascular invasion, microscopic satellitosis, perineural invasion, and mitotic rate in patients undergoing sentinel lymph node biopsy for cutaneous melanoma: a retrospective observational study at a comprehensive cancer center. Cancer Med. (2018) 7:583–93. doi: 10.1002/cam4.1320
52. Fortes C, Mastroeni S, Mannooranparampil TJ, Passarelli F, Zappala A, Annessi G, et al. Tumor-infiltrating lymphocytes predict cutaneous melanoma survival. Melanoma Res. (2015) 25:306–11. doi: 10.1097/CMR.0000000000000164
53. Burton AL, Roach BA, Mays MP, Chen AF, Ginter BA, Vierling AM, et al. Prognostic significance of tumor infiltrating lymphocytes in melanoma. Am Surg. (2011) 77:188–92. doi: 10.1177/000313481107700219
54. Petersson F, Diwan AH, Ivan D, Gershenwald JE, Johnson MM, Harrell R, et al. Immunohistochemical detection of lymphovascular invasion with D2-40 in melanoma correlates with sentinel lymph node status, metastasis and survival. J Cutan Pathol. (2009) 36:1157–63. doi: 10.1111/j.1600-0560.2008.01242.x
55. Wevers KP, Kruijff S, Speijers MJ, Bastiaannet E, Muller Kobold AC, Hoekstra HJ. S-100B: a stronger prognostic biomarker than LDH in stage IIIB-C melanoma. Ann Surg Oncol. (2013) 20:2772–9. doi: 10.1245/s10434-013-2949-y
56. Weide B, Elsasser M, Buttner P, Pflugfelder A, Leiter U, Eigentler TK, et al. Serum markers lactate dehydrogenase and S100B predict independently disease outcome in melanoma patients with distant metastasis. Br J Cancer. (2012) 107:422–8. doi: 10.1038/bjc.2012.306
57. Ostmeier H, Fuchs B, Otto F, Mawick R, Lippold A, Krieg V, et al. Prognostic immunohistochemical markers of primary human melanomas. Br J Dermatol. (2001) 145:203–9. doi: 10.1046/j.1365-2133.2001.04335.x
58. Zelger B, Hittmair A, Schir M, Ofner C, Ofner D, Fritsch PO, et al. Immunohistochemically demonstrated metallothionein expression in malignant melanoma. Histopathology. (1993) 23:257–63. doi: 10.1111/j.1365-2559.1993.tb01198.x
59. Weinlich G, Eisendle K, Hassler E, Baltaci M, Fritsch PO, Zelger B. Metallothionein - overexpression as a highly significant prognostic factor in melanoma: a prospective study on 1270 patients. Br J Cancer. (2006) 94:835–41. doi: 10.1038/sj.bjc.6603028
60. Emri E, Egervari K, Varvolgyi T, Rozsa D, Miko E, Dezso B, et al. Correlation among metallothionein expression, intratumoural macrophage infiltration and the risk of metastasis in human cutaneous malignant melanoma. J Eur Acad Dermatol Venereol. (2013) 27:e320–7. doi: 10.1111/j.1468-3083.2012.04653.x
61. Safaee Ardekani G, Jafarnejad SM, Tan L, Saeedi A, Li G. The prognostic value of BRAF mutation in colorectal cancer and melanoma: a systematic review and meta-analysis. PLoS ONE. (2012) 7:e47054. doi: 10.1371/journal.pone.0047054
62. Moreau S, Saiag P, Aegerter P, Bosset D, Longvert C, Helias-Rodzewicz Z, et al. Prognostic value of BRAF(V(6)(0)(0)) mutations in melanoma patients after resection of metastatic lymph nodes. Ann Surg Oncol. (2012) 19:4314–21. doi: 10.1245/s10434-012-2457-5
64. Kashani-Sabet M, Nosrati M, Miller JR 3rd, Sagebiel RW, Leong SPL, Lesniak A, et al. Prospective validation of molecular prognostic markers in cutaneous melanoma: a correlative analysis of E1690. Clin Cancer Res. (2017) 23:6888–92. doi: 10.1158/1078-0432.CCR-17-1317
65. Kashani-Sabet M, Venna S, Nosrati M, Rangel J, Sucker A, Egberts F, et al. A multimarker prognostic assay for primary cutaneous melanoma. Clin Cancer Res. (2009) 15:6987–92. doi: 10.1158/1078-0432.CCR-09-1777
66. Gould Rothberg BE, Berger AJ, Molinaro AM, Subtil A, Krauthammer MO, Camp RL, et al. Melanoma prognostic model using tissue microarrays and genetic algorithms. J Clin Oncol. (2009) 27:5772–80. doi: 10.1200/JCO.2009.22.8239
68. Egberts F, Kotthoff EM, Gerdes S, Egberts JH, Weichenthal M, Hauschild A. Comparative study of YKL-40, S-100B and LDH as monitoring tools for Stage IV melanoma. Eur J Cancer. (2012) 48:695–702. doi: 10.1016/j.ejca.2011.08.007
69. Tas F, Erturk K. Tumor Infiltrating Lymphocytes (TILs) may be only an independent predictor of nodal involvement but not for recurrence and survival in cutaneous melanoma patients. Cancer Invest. (2017) 35:501–5. doi: 10.1080/07357907.2017.1351984
70. Storr SJ, Safuan S, Mitra A, Elliott F, Walker C, Vasko MJ, et al. Objective assessment of blood and lymphatic vessel invasion and association with macrophage infiltration in cutaneous melanoma. Mod Pathol. (2012) 25:493–504. doi: 10.1038/modpathol.2011.182
72. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. (2008) 40:1413–5. doi: 10.1038/ng.259
73. Zhang Y, Yan L, Zeng J, Zhou H, Liu H, Yu G, et al. Pan-cancer analysis of clinical relevance of alternative splicing events in 31 human cancers. Oncogene. (2019) 38:6678–95. doi: 10.1038/s41388-019-0910-7
77. Sveen A, Kilpinen S, Ruusulehto A, Lothe RA, Skotheim RI. Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes. Oncogene. (2016) 35:2413–27. doi: 10.1038/onc.2015.318
79. Adamia S, Haibe-Kains B, Pilarski PM, Bar-Natan M, Pevzner S, Avet-Loiseau H, et al. A genome-wide aberrant RNA splicing in patients with acute myeloid leukemia identifies novel potential disease markers and therapeutic targets. Clin Cancer Res. (2014) 20:1135–45. doi: 10.1158/1078-0432.CCR-13-0956
80. Zhang P, Feng S, Liu G, Wang H, Fu A, Zhu H, et al. CD82 suppresses CD44 alternative splicing-dependent melanoma metastasis by mediating U2AF2 ubiquitination and degradation. Oncogene. (2016) 35:5056–69. doi: 10.1038/onc.2016.67
81. Shingai N, Harada Y, Iizuka H, Ogata Y, Doki N, Ohashi K, et al. Impact of splicing factor mutations on clinical features in patients with myelodysplastic syndromes. Int J Hematol. (2018) 108:598–606. doi: 10.1007/s12185-018-2551-y
82. Hou HA, Liu CY, Kuo YY, Chou WC, Tsai CH, Lin CC, et al. Splicing factor mutations predict poor prognosis in patients with de novo acute myeloid leukemia. Oncotarget. (2016) 7:9084–101. doi: 10.18632/oncotarget.7000
83. Arbab Jafari P, Ayatollahi H, Sadeghi R, Sheikhi M, Asghari A. Prognostic significance of SRSF2 mutations in myelodysplastic syndromes and chronic myelomonocytic leukemia: a meta-analysis. Hematology. (2018) 23:778–84. doi: 10.1080/10245332.2018.1471794
Keywords: skin cutaneous melanoma, RNA-binding protein, mutation, alternative splicing, prognosis
Citation: Mei C, Song P-Y, Zhang W, Zhou H-H, Li X and Liu Z-Q (2020) Aberrant RNA Splicing Events Driven by Mutations of RNA-Binding Proteins as Indicators for Skin Cutaneous Melanoma Prognosis. Front. Oncol. 10:568469. doi: 10.3389/fonc.2020.568469
Received: 01 June 2020; Accepted: 14 August 2020;
Published: 15 October 2020.
Edited by:Suzie Chen, Rutgers, The State University of New Jersey, United States
Reviewed by:Kathleen Marie Kokolus, University at Buffalo, United States
Hao Chang, University of Wisconsin-Madison, United States
Copyright © 2020 Mei, Song, Zhang, Zhou, Li and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
†These authors have contributed equally to this work