- 1Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- 2Department of Computational Physics, Institute of Modern Physics of Chinese Academy of Sciences, Lanzhou, China
- 3Department of Gynaecology and Obstetrics, Changsha Central Hospital, Changsha, China
- 4The First Department of Operation, Hunan Provincial People’s Hospital, Changsha, China
Clinicopathological characteristics alone are not enough to predict the survival of patients with cervical squamous cell carcinoma (CESC) due to clinical heterogeneity. In recent years, many genes and non-coding RNAs have been shown to be oncogenes or tumor-suppressors in CESC cells. This study aimed to develop a comprehensive transcriptomic signature for CESC patient prognosis. Univariate, multivariate, and Least Absolute Shrinkage and Selection Operator penalized Cox regression were used to identify prognostic signatures for CESC patients from transcriptomic data of The Cancer Genome Atlas. A normalized prognostic index (NPI) was formulated as a synthetical index for CESC prognosis. Time-dependent receiver operating characteristic curve analysis was used to compare prognostic signatures. A prognostic transcriptomic signature was identified, including 1 microRNA, 1 long non-coding RNA, and 6 messenger RNAs. Decreased survival was associated with CESC patients being in the high-risk group stratified by NPI. The NPI was an independent predictor for CESC patient prognosis and it outperformed the known clinicopathological characteristics, microRNA-only signature, gene-only signature, and previously identified microRNA and gene signatures. Function and pathway enrichment analysis revealed that the identified prognostic RNAs were mainly involved in angiogenesis. In conclusion, we proposed a transcriptomic signature for CESC prognosis and it may be useful for effective clinical risk management of CESC patients. Moreover, RNAs in the transcriptomic signature provided clues for downstream experimental validation and mechanism exploration.
Introduction
Cervical cancer (CC) is still the fourth most common cancer in women (Ferlay et al., 2015). Despite developed countries are low epidemic areas of CC by virtue of easier accessibility of routine screening test and human papillomavirus (HPV) vaccination, CC is still the second leading cause of cancer death among women aged 20–39 years in the United States in 2015 (Siegel et al., 2018). At present, clinical stage is the leading predictive characteristic for CC prognosis, although useful, significant variability is observed and the 5 years survival rate is still poor for women with advanced CC (30–40% for stage III and 15% for stage IV). Theoretically, clinicopathological characteristics are macroscopic emergence of molecules (e.g., genes, proteins) and CC patients with homogeneous clinical status may have completely diverse molecular patterns. Therefore, identification of robust and accurate molecular biomarkers for CC patient prognosis is valuable and in urgent need.
By comprehensively characterizing various molecules (DNA-level, RNA-level, protein-level) in 100s of CC samples, The Cancer Genome Atlas (TCGA) has provided a comprehensive way to understand CC (The Cancer Genome Atlas Research Network et al., 2017). Enormous multiple omics data make the discovery of potential biomarkers for CC diagnosis, treatment and prognosis possible. Several studies have investigated the molecular signatures for CC prognosis based on the expression of CC genome. Hu et al. (2010) profiled 96 cancer-related microRNAs (miRNAs) in 102 CC samples and firstly proposed a two-miRNA expression signature for predicting the overall survival (OS) of CC patients. How et al. (2015) measured the miRNA omics of CC samples by miRNA arrays and proposed a prognostic nine-miRNA expression signature in their training set. However, the prognostic value of the nine-miRNA expression signature could not be validated in an independent cohort (How et al., 2015). Liu et al. (2016), Liang et al. (2017), Ma et al. (2018), and Ying et al. (2018) proposed a seven-miRNA expression signature, a three-miRNA expression signature, a three-miRNA expression signature, and a 2 two-miRNA expression signature for CC prognosis based on TCGA miRNA sequencing data, respectively. Huang et al. (2012) profiled 1440 human tumor related gene transcripts using custom oligonucleotide microarrays in 100 CC samples and identified a prognostic seven-gene expression signature. Based on TCGA gene sequencing data, Li et al. (2017b, 2018) proposed a two-histone family gene signature and further proposed another independent gene signature to predict the OS of CC patients.
However, some limitations should be noticed: (1) Previous studies focused on single omics independently, and there lacks a whole transcriptomic analysis which may provide more comprehensive and robust discovery (Hu et al., 2010; Huang et al., 2012; How et al., 2015; Liu et al., 2016; Li et al., 2017b, 2018; Liang et al., 2017; Ma et al., 2018; Ying et al., 2018). (2) Prognostic miRNA signatures identified based on the same data source without cross-references are very different (Liu et al., 2016; Liang et al., 2017; Ma et al., 2018; Ying et al., 2018). (3) For prognostic miRNAs, previous studies did not distinguish miRNA isoforms (3p-arm or 5p-arm). Thus, it is unclear which isoform should be further investigated by experiment (Hu et al., 2010; How et al., 2015; Liu et al., 2016; Liang et al., 2017; Ma et al., 2018; Ying et al., 2018). (4) Pathologically, CC includes cervical squamous cell carcinoma (CESC) and cervical adenocarcinoma (CADC). Because there are significant differences in prognosis between CESC and CADC (Jung et al., 2017), it is not appropriate to mix them for identification of prognostic biomarkers. (5) Semi-parametric survival analysis method such as Cox regression analysis is loosely used without checking the proportional hazards (PH) assumption (Hu et al., 2010; Huang et al., 2012; How et al., 2015; Liu et al., 2016; Li et al., 2017b, 2018; Liang et al., 2017; Ma et al., 2018; Ying et al., 2018). (6) To identify prognostic signatures from high-dimensional omics data, classical multivariate Cox regression analysis (MCA) is usually impeded by the “curse of dimensionality” (i.e., low sample size and large number of variables), which leads to over-fitting and unstable estimation of regression coefficients.
To address these limitations, we submit the transcriptomic data of CESC patients to a Least Absolute Shrinkage and Selection Operator (LASSO) penalized MCA (Simon et al., 2011) to identify a transcriptomic signature for CESC prognosis.
Materials and Methods
Data Acquisition
Level 1 clinical data, level 3 transcriptomic sequencing data, and the corresponding metadata of CCs were retrieved and downloaded from TCGA1 repository in January 2018. Search strategies can be obtained in Section I of the Supplementary Material. CC patients were included in this study by the following criteria: (1) CC patients diagnosed as CESC; (2) CESC patients with at least 60 days follow-up; (3) CESC patients have both clinical data, gene sequencing data, and miRNA isoform sequencing data; (4) CESC patients do not have prior other malignancies; and (5) CESC patients do not receive any preoperative neoadjuvant therapy.
Data Preprocessing
Clinical eXtemsible Markup Language files were parsed by R “XML” package and the R code can be achieved in Section II of the Supplementary Material. Details on sequencing data preprocessing can be also available in Section II of the Supplementary Material. Hierarchical clustering was used to cluster samples to detect sample outliers and guided principal component analysis (gPCA) (Reese et al., 2013) was adopted to evaluate batch effects of the sequencing data.
Identification of Prognostic Demographic and Clinicopathological Characteristics
Kaplan–Meier (KM) survival analysis with log-rank test was applied to evaluate the prognostic effects of age at diagnosis, clinical stage, menopause status, ethnicity, birth control pill usage, tobacco usage, and lymphovascular invasion for CESC. Furthermore, MCA with demographic and clinicopathological characteristics as covariates was adopted to evaluate their independence for CESC prognosis.
Univariate Survival Analysis of RNAs
The miRNA isoform sequencing data only include miRNAs while the gene sequencing data include both messenger RNAs (mRNAs) and long non-coding RNAs (lncRNAs). For convenience of description, we termed mRNA and lncRNA as gene and further termed miRNA, mRNA and lncRNA as RNA.
Associations between OS and RNA expression profiles were preliminarily evaluated by univariate Cox regression analysis (UCA). The proportional hazards (PH) assumption was tested by Schoenfeld residual (Grambsch and Therneau, 1994) and a unified multiple testing (Strimmer, 2008) was applied for tail area-based false discovery rate (FDR) estimation. RNAs with FDRs < 0.1 and PH assumption test P values >0.1 (Kleinbaum and Klein, 2012) were considered to be preliminarily associated with OS of CESC patients. Furthermore, RNAs with hazard ratios (HRs) > 1 were defined to be risky for CESC prognosis, and those with HRs < 1 were considered as protective.
Multivariate Analysis of Preliminarily Survival Associated RNAs
For miRNAs, stepwise MCA was applied to preliminarily survival associated miRNAs to construct an independent miRNA-only expression signature for CESC prognosis. The Bayesian information criterion (BIC) (Schwarz, 1978) was adopted for model selection.
Because the number of preliminarily survival associated genes was comparable to the number of samples, a LASSO penalized MCA with 10-fold cross validation and 1000 iterations was adopted to select genes by penalizing low regression coefficients exactly to zero. To alleviate the local minimum problem, we repeated the LASSO penalized MCA 10 times with different initializations and the model that achieved the minimal partial likelihood was adopted. Furthermore, stepwise MCA was applied to genes selected by the LASSO penalized MCA to construct an independent gene-only expression signature for CESC prognosis.
Finally, a transcriptomic signature for CESC prognosis was constructed by stepwise MCA with gene-only signature and miRNA-only signature as covariates.
Risk Score
A normalized prognosis index (NPI) defined as the standard form of a linear combination of the observed values weighted by the regression coefficients was adopted as a synthetical index for CESC prognosis. Specifically,
where PI is a prognostic index vector and the jth element of PI is the prognostic index of the jth patient, i.e.
βi is the regression coefficient of the ith variable (in this context, the ith gene/miRNA); Gij is the observed value of the ith variable in the jth sample (in this context, the expression of the ith gene/miRNA in the jth sample). For miRNA-only signature, gene-only signature, the integrated RNA signature (i.e., transcriptomic signature), and the previously identified prognostic signatures, we termed the corresponding NPI as miRNA-NPI, gene-NPI, RNA-NPI, and pre-NPI, respectively.
Model Evaluation and Comparison
CESC patients were stratified into a high-risk group (NPI > 0) or a low-risk group (NPI < 0) based on NPI. OS between the high-risk group and the low-risk group was compared by KM survival analysis. MCA was used to evaluate the independence of various NPIs and clinical factors. The abilities of various NPIs to predict CESC patient survival outcome were assessed and compared by calculating the area under the curve (AUC) of the time-dependent receiver operating characteristic (ROC) at 3, 5, and 10 years, respectively.
Gene Ontology and Pathway Enrichment
Unlike other mRNA target prediction software just based on sequence alignment, miRTarBase provided experimentally validated mRNA targets of miRNA. Both strongly and weakly validated mRNA targets of the prognostic miRNA were obtained from miRTarBase (version 7.0) (Chou et al., 2018). Metascapae (Tripathi et al., 2015)2 was adopted for gene ontology and pathway enrichment of the prognostic mRNAs and targets of the prognostic miRNA.
Statistical Analysis Tools
P-value less than 0.05 or adjusted P-value less than 0.1 was considered to be significant. All analyses were performed by R software. Non-parametric survival analysis, semi-parametric survival analysis, and PH assumption test were performed by survival and survminer packages. LASSO penalized MCA was conducted by textitglmnet package. Multiple test correction was conducted by fdrtool package. Time-dependent ROC analysis was conducted by timeROC package.
Results
Available Data
TCGA CC dataset included 307 CC patients who had generated 312 samples for miRNA sequencing (including 307 primary CC samples, 2 metastatic CC samples, and 3 normal samples) and 309 samples for gene sequencing (including 304 primary CC samples, 2 metastatic CC samples, and 3 normal samples). Due to small number of metastatic and normal CC samples, we only analyzed the primary CC samples. Based on the inclusion criteria and low-expressed RNA filtering (Supplementary Material Section III), 214 CESC samples covering 401 miRNAs and 13631 genes (mRNAs and lncRNAs) were retained. Hierarchical clustering showed that there existed a miRNA sample outlier and seven gene sample outliers. After removing sample outliers and scaling the expressions of miRNAs and genes to zero sample mean and standard deviation, 206 primary CESC samples were included for identification of prognostic signatures. Batch effect analysis showed that there was no obvious separation on the first two guided principal components for both miRNA isoform sequencing data and gene sequencing data (Figures 1A,C) with permutation test P-values of 0.598 and 0.947 (Figures 1B,D), respectively. These results indicated that there was no significant batch effect in the sequencing data.
 
  Figure 1. Batch effect evaluation. Scatter plots of the first two guided principal components (A,C) and permutation tests (B,D) for miRNA isofrom sequencing data and gene sequencing data.
Prognostic Demographic and Clinicopathological Characteristics
Of the 206 patients in TCGA-CESC cohort, 55 patients were deceased and 151 were alive at the time of last follow-up. The median OS time was 3097 days (95% CI: 2859–NA days). Demographic and clinicopathological characteristics for TCGA-CESC cohort are summarized in Table 1. Age at initial diagnosis (HR = 1.81, 95% CI: 0.92–3.58), clinical stage (HR = 2.14, 95% CI: 1.12–4.09), tobacco usage (HR = 2.36, 95% CI: 1.24–4.48), and lymphovascular invasion (HR = 13.70, 95% CI: 5.64–33.31) were negatively associated with OS of CESC patients revealed by KM survival analysis (Table 1 and Supplementary Figure S1). MCA revealed that only lymphovascular invasion was an independent predictor for CESC prognosis (Table 1). However, information of lymphovascular invasion was not available in more than half of the CESC patients (n = 106). Considering age at initial diagnosis, clinical stage, and tobacco usage as covariates (i.e., without lymphovascular invasion), MCA revealed that only clinical stage was an independent prognostic clinicopathological characteristic (Table 1).
 
  Table 1. Patient characteristics, KM survival analysis, and MCA of demographic and clinicopathological features.
miRNA-Only Expression Signature for CESC Prognosis
Univariate Cox regression analysis (UCA) after FDR correction (Supplementary Figure S2) revealed that 9 miRNAs were significantly associated with OS of CESC patients. Spearman correlation analysis showed that these prognostic miRNAs were correlated with each other in the upper left block (including hsa-miR-296-5p, hsa-miR-335-3p, hsa-miR-365a-3p, hsa-miR-365b-3p, and hsa-miR-584-5p) and the lower right block (including hsa-miR-3607-3p, hsa-miR-502-3p, hsa-miR-532-5p, and hsa-miR-532-3p) (Figure 2A). To obtain non-redundant miRNAs to predict the OS of CESC patient, stepwise MCA was performed in each block. Hsa-miR-335-3p and hsa-miR-365b-3p were selected by BIC to represent the upper left block and hsa-miR-532-5p was selected to represent the lower right block. Finally, considering hsa-miR-335-3p, hsa-miR-365b-3p, and hsa-miR-532-5p as covariates, MCA reveled that hsa-miR-335-3p and hsa-miR-532-5p were independent prognostic miRNAs and hsa-miR-365b-3p was discarded due to the correlations between blocks (Figure 2A). Furthermore, PH assumption test revealed that hsa-miR-335-3p did not satisfy the PH assumption in the final stepwise Cox model (Supplementary Figure S3). Thus, we obtained hsa-miR-532-5p as an independent miRNA expression signature for CESC prognosis (Table 2).
 
  Figure 2. Pairwise correlation and cross-validation. (A) The diagonal, upper triangular, and lower triangular of the correlation plot is the histogram, scatter plot, and correlation coefficient and significance, respectively. ∗P < 0.05, ∗∗P < 0.01, ∗∗∗P < 0.001. (B) The left vertical line shows where the cross-validation error curve hits its minimum, the right vertical line shows the most regularized model with cross-validation error within 1 standard deviation of the minimum, and the numbers at the top of the Figure indicate the number of the nonzero coefficients. The optimal model is chosen where the cross-validation error curve hits its minimum (left vertical line).
Gene-Only Expression Signature for CESC Prognosis
UCA after FDR correction (Supplementary Figure S4) revealed that 218 genes were significantly associated with OS of CESC patients. Because the number of genes was larger than the number of samples, LASSO penalized MCA revealed that 38 genes were with non-zero regression coefficients (Figure 2B). Stepwise MCA further revealed that 1 lncRNA 7 mRNAs were optimal to construct an independent gene expression signature for CESC prognosis (Table 2).
Transcriptomic Signature for CESC Prognosis
Considering the identified 1 miRNA, 1 lncRNA, and 7 mRNAs as covariates, MCA revealed that hsa-miR-532-5p, lncRNA DLEU1, RBM38, CXCL2, ZIC2, MTMR11, EGLN1, and TPST1 were independent predictors for CESC prognosis (Table 2). NPIs for the miRNA-only signature, the gene-only signature, and the transcriptomic signature were calculated, respectively. Stratification based on RNA-NPI (Figure 3A), gene-NPI (Figure 3B), and miRNA-NPI (Figure 3C) showed that CESC patients in the high-risk group had significantly shorter OS than those in the low-risk group Furthermore, as continuous variables, the miRNA-NPI (HR = 3.32, 95% CI: 1.86–5.93, P-value = 4.81E-05), the gene-NPI (HR = 7.89, 95% CI: 5.08-12.25, P-value < 2.00E-16), and the RNA-NPI (HR = 13.99, 95% CI: 7.88–24.83, P-value < 2.00E-16) were significant predictors for OS of CESC patients revealed by UCA. However, MCA revealed that neither miRNA-NPI (Supplementary Table S1) nor gene-NPI (Supplementary Table S2) was an independent predictor for the OS of CESC patients when considering RNA-NPI as covariate.
 
  Figure 3. Kaplan-Meier plots for the transcriptomic signature (A), the gene-only signature (B), and the miRNA-only signature (C).
To exclude potential effects of mutations on CESC prognosis, we also investigated the mutational patterns of the genes in the identified transcriptomic signature using OncoPrinter. Mutations of ZIC2, MTMR11, EGLN1, and TPST1 were found in two, five, one, and two of total 206 CESC samples, and none of these mutations were found in 197 CESC samples. MCA further showed that the identified RNAs were independent predictors for CESC prognosis in the non-mutated samples (Supplementary Table S3). Thus, the prognostic roles of the identified RNAs were merely caused by their expressions.
Signature Evaluation and Comparison
Time-dependent ROC analysis showed that the transcriptomic signature (Figure 4A; AUC = 0.893, 0.900, and 0.975 at 3, 5, and 10 years, respectively) had slight better predictive ability for CESC survival than the gene-only signature (Figure 4B; AUC = 0.886, 0.899, and 0.888 at 3, 5, and 10 years, respectively) but better than the miRNA-only signature (Figure 4C; AUC = 0.704, 0.705, 0.783 at 3, 5, and 10 years, respectively).
We also compared the miRNA and gene expression signatures proposed by previous studies (Hu et al., 2010; Huang et al., 2012; How et al., 2015; Liu et al., 2016; Liang et al., 2017; Li et al., 2017b, 2018; Ma et al., 2018; Ying et al., 2018) (Supplementary Table S4) with our transcriptomic signature with respect to independence and predictive ability for CESC prognosis. As shown in Supplementary Table S4, previously proposed miRNA and gene signatures were rarely overlapped across studies except for hsa-miR-378c which was identified by both Liu et al. (2016) and Ma et al. (2018) Specifically, hsa-miR-500a-5p, hsa-miR-500a-3p, hsa-miR-500b-5p, hsa-miR-142-3p, hsa-miR-3607-3p, hsa-miR-502-3p, and hsa-miR-145-5p, RFC4, HIST1H2BD, HIST1H2BJ, and MCM5 could be validated in our study. MCA of these verifiable miRNA and genes found that 1 gene and 3 miRNAs were independent (Supplementary Table S5). We also calculated the NPI for each CESC patient based on the obtained regression model and termed it as pre-NPI. MCA showed that both RNA-NPI and pre-NPI were independent predictors for CESC prognosis (Supplementary Table S6). Finally, time-dependent ROC analysis for pre-NPI (Figure 4D; AUC = 0.869, 0.857, and 0.748 at 3, 5, and 10 years, respectively) demonstrated that the pre-NPI was inferior to the RNA-NPI for prediction of CESC patient survival.
Moreover, MCA revealed that clinical stage, RNA-NPI, and pre-NPI were independent predictors for CESC prognosis (Table 3). However, when considering lymphovascular invasion as covariate, RNA-NPI was the only independent prognostic index (Table 3). These results demonstrated that the transcrip-tomic signature was a better predictor for CESC prognosis compared with clinicopathological characteristics and previous proposed signatures.
Angiogenesis Related Functions and Pathways
Sixty-four genes were validated as targets of hsa-miR-532-5p. Gene ontology and pathway enrichment analyses indicated that targets of hsa-miR-532-5p and the prognostic mRNAs in the transcriptomic signature were mainly associated with angiogenesis (Figure 5). Angiogenesis is a main hallmark of tumor progression and may be an independent prognostic factor in CC (Bremer et al., 1996). In this study, EGLN1 (Egl-9 family hypoxia inducible factor 1; also known as PHD2) is a risky gene (HR = 1.79, 95% CI: 1.45–2.21) and it is closely related with angiogenesis by regulating the stability of HIF1 in non-CESC cancers (Chan and Giaccia, 2010; Lu et al., 2013). Targets of hsa-miR-532-5p such as runt-related transcription factor-3 (Peng et al., 2006; Kim et al., 2016), insulin-like growth factor-binding protein-5 (Rho et al., 2008; Lee et al., 2016), and von Hippel-Lindau (Kong et al., 2011, 2014; Lu et al., 2013) were proved to be associated with angiogenesis in many types of cancer. These results prompted that further experiments aimed to explore functional mechanisms of the transcriptomic signature could be focused on angiogenesis related pathways.
Discussion
In this study, a novel transcriptomic signature for CESC patient prognosis was identified. Our proposed transcriptomic signature includes two non-coding RNAs (hsa-miR-532-5p and lncRNA DLEU1) and six mRNAs (RBM38, CXCL2, ZIC2, MTMR11, EGLN1, and TPST1). It is natural to wonder if there are any mRNA targets of the non-coding RNAs in the transcriptomic signature. Interestingly, CXCL2 was reported to be a direct target of hsa-miR-532-5p in hepatocellular carcinoma and this miRNA-gene interaction inhibited hepatocellular carcinoma cell proliferation and metastasis (Song et al., 2015). However, correlation analysis revealed that no significant correlation between hsa-miR-532-5p and CXCL2 was observed. Forty-six genes were predicted to be targets of lncRNA-DLEU1 by starBase v2.0 (Li et al., 2014), but none of them were in the transcriptomic signature. For identification of prognostic biomarkers, it is expected to construct signatures that include as many independent biomarkers as possible. Due to the independence among the biomarkers in the transcriptomic signature, it is hard to find possible biological interactions among them. To find possible mechanisms for the independent RNAs in the transcriptomic signature, it is wise to explore possible correlations between the independent RNAs and the remaining RNAs of CESC transcriptome. Thus, the transcriptomic signature could just provide the initial molecules rather than complete biological mechanisms for further experimental exploration.
For the protective RNAs in the transcriptomic signature, hsa-miR-532-5p was shown to be involved in many cancers either as a tumor suppressor or an oncogenic-miRNA (Song et al., 2015; Zhang J. et al., 2018). However, the role of hsa-miR-532-5p in CESC remains unknown, and our results prompted that it may exert a tumor suppressor role in CESC due to its positive correlation with OS of CESC patients. RNA-binding protein 38 (RBM38) was originally recognized as an oncogene and it was frequently found to be amplified in prostate, ovarian and colorectal cancer, chronic lymphocytic leukemia, colon carcinoma, esophageal cancer, dog lymphomas and breast cancer (Ding et al., 2015). But recently, more and more studies suggested that RBM38 might act as a tumor suppressor (Feldstein et al., 2012; Xue et al., 2014). Ding et al. found that the association between the expression of RBM38 and cancer prognosis varied from cancers and databases (Ding et al., 2015). These studies suggested that the function of RBM38 might be multidimensional in cancers. However, no study was performed to investigate the possible roles of RBM38 in CESC, and our analysis prompted to assume that RBM38 may be a tumor suppressor in CESC. Zic family member 2 (ZIC2) was shown to be oncogenic in many cancers such as ovarian cancer (Marchini et al., 2012) and hepatocellular carcinoma (Lu et al., 2017). In cervical cancer, ZIC2 was rarely investigated. Chan et al. (2011) demonstrated that ZIC2 was up-regulated in CC cell lines and the up-regulation of ZIC2 may enhance the activity of the Hedgehog signaling pathway through nuclear retention of Gli1. Although ZIC2 may be a risk factor that was deduced from the results of Chan et al., the prognostic role of ZIC2 in CESC patients was not investigated. Our analysis showed that ZIC2 may be a protective factor, and further studies are needed to elaborate on the disputable roles of ZIC2 in CESC.
Among the risky RNAs in the transcriptomic signature, Tyrosylprotein sulfotransferase 1 (TPST1) is an enzyme responsible for catalysis of tyrosine sulfation. Previous studies revealed that TPST1 could sulfate the tyrosine of C-X-C motif chemokine receptor (CXCR4) (Seibert et al., 2008; Xu et al., 2013) and the tyrosine sulfation might contribute to nasopharyngeal carcinoma metastasis (Xu et al., 2013). However, expressions, functions, and mechanisms of TPST1 in CESC are not clear. Consistent with these previous studies, our analysis revealed that TPST1 was harmful for CESC prognosis. C-X-C motif chemokine ligand 2 (CXCL2) was demonstrated to be up-regulated in many types of cancer such as chronic lymphocytic leukemia and bladder cancer. The up-regulation of CXCL2 could enhance the cell survival of lymphocytic leukemia (Burgess et al., 2012) and it was correlated with poor prognosis of bladder cancer (Zhang et al., 2016). Recently, Zhang et al. revealed that AKP1 could promote angiogenesis and tumor growth by up-regulating CXCL1, CXCL2, and CXCL8 in CC cells (Zhang W. et al., 2018). Our results revealed that CXCL2 was a risk factor for CESC prognosis, which is in line with these previous experimental studies in non-CESC cancers. Egl-9 family hypoxia inducible factor 1 (EGLN1) was a key cellular oxygen sensor, which played important roles in tumor angiogenesis (Chan and Giaccia, 2010) and tumor metastasis (Kuchnio et al., 2015). Both tumor-promoting and suppressive roles of EGLN1 have been reported in different types of cancer (Chan and Giaccia, 2010). Although EGLN1 was shown to be low-expressed in advanced CC (Roszak et al., 2011; Kuchnio et al., 2015), in our analysis, the expression of EGLN1 was a risk factor for CESC patient survival. Thus, further functional mechanisms of EGLN1 in CESC cells should be carried out to illustrate the worse prognostic effect of EGLN1. Myotubularin related protein 11 (MTMR11) was rarely reported in cancer. The lncRNA DLEU1 played multiple roles in different cancers. Previous studies revealed that DLEU1 could promote progression of ovarian carcinoma (Wang et al., 2017) and gastric cancer (Li et al., 2017a), while Balas and Johnson (2018). showed that DLEU1 may be a tumor suppressor. There are versatile forms of interaction for lncRNA. DLEU1 could exert its functions by binding to proteins (Li et al., 2017a; Liu et al., 2018) or miRNAs (Wang et al., 2017). Moreover, the up-regulated DLEU1 was shown to be associated with the survival of gastric cancer by promoting proliferation of gastric cancer cells (Li et al., 2017a). However, the potential roles of DLEU1 in CESC remain unclear, and our analysis revealed that DLEU1 may also exert as a tumor suppressor in CESC.
Some limitations of the current study should be noticed. (1) For PH assumption test, it is difficult to estimate the type II error (i.e., the false negative rate). Thus, it is hard to choose a threshold of PH assumption test P-value in multiple testing. (2) The transcriptomic signature was identified based on TCGA data-mining and further independent validations and mechanism explorations are in need. (3) Complete mechanisms could not be revealed by the transcriptomic signature itself. However, the transcriptomic signature may be of potential applications for clinical management of CESC patients. Specifically, we can measure the expressions of the transcriptomic signature in CESC patients and calculate the NPIs for the patients based on the measured expression values. Furthermore, CESC patients can be stratified as high-risk and low-risk based on their NPIs. Finally, for high-risk patients, more aggressive therapies such as high-dose chemoradiotherapy may be given. Moreover, hsa-miR-532-5p, RBM38, EGLN1, and DLEU1 were demonstrated as both oncogenes and tumor-suppressors in non-CESC cancers and their roles in CESC were unclear. This study provided experimental directions for these novel genes and miRNA.
Conclusion
We have identified a novel transcriptomic signature for CESC prognosis, including 1 miRNA, 1 lncRNA and 6 mRNAs. The transcriptomic signature was more comprehensive and predictive than the miRNA-only, the gene-only, and the previously identified signatures for CESC prognosis.
Author Contributions
JX conceived and designed the analysis. JX, SG, YS, LG, and ZB performed the analysis. JX wrote the paper.
Funding
This work was supported by the Postdoctoral Science Foundation of Central South University.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank The Cancer Genome Altas project made the genomic data of CESC available. All data obtained from TCGA keep to the rules for usage and publication of TCGA. We wish to thank the editors and reviewers for helpful comments. In addition, we also thank Zheng Yu from University of Washington for English advice.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00696/full#supplementary-material
FIGURE S1 | Kaplan–Meier plots for age at diagnosis (A), tobacco usage (B), clinical stage (C), and lymphovascular invasion (D).
FIGURE S2 | The first row shows the densities, the second the distribution function and the last row the local and tail area-based false discovery rates of miRNAs.
FIGURE S3 | Schoenfeld residual plots for hsa-miR-335-3p and hsa-miR-532-5p.
FIGURE S4 | The first row shows the densities, the second the distribution function and the last row the local and tail area-based false discovery rates of genes.
TABLE S1 | MCA of RNA-NPI and miRNA-NPI.
TABLE S2 | MCA of RNA-NPI and gene-NPI.
TABLE S3 | MCA of the transcriptomic signature in non-mutated CESC samples.
TABLE S4 | UCA of previously identified prognostic genes and miRNAs.
TABLE S5 | MCA of previously indentified prognostic genes and miRNAs.
TABLE S6 | MCA of RNA-NPI and pre-NPI.
METHODS | Details on data retrieval and preprocessing.
Footnotes
References
Balas, M. M., and Johnson, A. M. (2018). Exploring the mechanisms behind long noncoding RNAs and cancer. Non Coding RNA Res. 3, 108–117. doi: 10.1016/j.ncrna.2018.03.001
Bremer, G. L., Tiebosch, A. T., van der Putten, H. W., Schouten, H. J., de Haan, J., and Arends, J. W. (1996). Tumor angiogenesis: an independent prognostic parameter in cervical cancer. Am. J. Obstetr. Gynecol. 174, 126–131. doi: 10.1016/S0002-9378(96)70384-8
Burgess, M., Cheung, C., Chambers, L., Ravindranath, K., Minhas, G., Knop, L., et al. (2012). CCL2 and CXCL2 enhance survival of primary chronic lymphocytic leukemia cells in vitro. Leukemia Lymphoma 53, 1988–1998. doi: 10.3109/10428194.2012.672735
Cancer Genome Atlas Research Network, Albert Einstein College of Medicine, Analytical Biological Services, Barretos Cancer Hospital, Baylor College of Medicine, Beckman Research Institute of City of Hope, et al. (2017). Integrated genomic and molecular characterization of cervical cancer. Nature 543, 378–384. doi: 10.1038/nature21386
Chan, D. A., and Giaccia, A. J. (2010). PHD2 in tumour angiogenesis. Br. J. Cancer 103, 1–5. doi: 10.1038/sj.bjc.6605682
Chan, D. W., Liu, V. W., Leung, L. Y., Yao, K. M., Chan, K. K., Cheung, A. N., et al. (2011). Zic2 synergistically enhances Hedgehog signaling through nuclear retention of Gli1 in cervical cancer cells. J. Pathol. 225, 525–534. doi: 10.1002/path.2901
Chou, C. H., Shrestha, S., Yang, C. D., Chang, N. W., Lin, Y. L., Liao, K. W., et al. (2018). miRTarBase update 2018: a resource for experimentally validated microRNA-target interactions. Nucleic Acids Res. 46, D296–D302. doi: 10.1093/nar/gkx1067
Ding, Z., Yang, H. W., Xia, T. S., Wang, B., and Ding, Q. (2015). Integrative genomic analyses of the RNA-binding protein, RNPC1, and its potential role in cancer prediction. Int. J. Mol. Med. 36, 473–484. doi: 10.3892/ijmm.2015.2237
Feldstein, O., Benhamo, R., Bashari, D., Efroni, S., and Ginsberg, D. (2012). RBM38 is a direct transcriptional target of E2F1 that limits E2F1-induced proliferation. Mol. Cancer Res. 10, 1169–1177. doi: 10.1158/1541-7786.MCR-12-0331
Ferlay, J., Soerjomataram, I., Dikshit, R., Eser, S., Mathers, C., Rebelo, M., et al. (2015). Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int. J. Cancer 136, E359–E386. doi: 10.1002/ijc.29210
Grambsch, P. M., and Therneau, T. M. (1994). Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81, 515–526. doi: 10.2307/2337123
How, C., Pintilie, M., Bruce, J. P., Hui, A. B., Clarke, B. A., Wong, P., et al. (2015). Developing a prognostic micro-RNA signature for human cervical carcinoma. PLoS One 10:e0123946. doi: 10.1371/journal.pone.0123946
Hu, X., Schwarz, J. K., Lewis, J. S., Huettner, P. C., Rader, J. S., Deasy, J. O., et al. (2010). A MicroRNA expression signature for cervical cancer prognosis. Cancer Res. 70, 1441–1148. doi: 10.1158/0008-5472.CAN-09-3289
Huang, L., Zheng, M., Zhou, Q. M., Zhang, M. Y., Yu, Y. H., Yun, J. P., et al. (2012). Identification of a 7-gene signature that predicts relapse and survival for early stage patients with cervical carcinoma. Med. Oncol. 29, 2911–2918. doi: 10.1007/s12032-012-0166-3
Jung, E. J., Byun, J. M., Kim, Y. N., Lee, K. B., Sung, M. S., Kim, K. T., et al. (2017). Cervical adenocarcinoma has a poorer prognosis and a higher propensity for distant recurrence than squamous cell carcinoma. Int. J. Gynecol. Cancer 27, 1228–1236. doi: 10.1097/IGC.0000000000001009
Kim, B. R., Kang, M. H., Kim, J. L., Na, Y. J., Park, S. H., Lee, S. I., et al. (2016). RUNX3 inhibits the metastasis and angiogenesis of colorectal cancer. Oncol. Rep. 36, 2601–2608. doi: 10.3892/or.2016.5086
Kleinbaum, D. G., and Klein, M. (2012). Survival Analysis. New York, NY: Springer. doi: 10.1007/978-1-4419-6646-9
Kong, W., He, L., Richards, E. J., Challa, S., Xu, C. X., Permuth-Wey, J., et al. (2014). Upregulation of miRNA-155 promotes tumour angiogenesis by targeting VHL and is associated with poor prognosis and triple-negative breast cancer. Oncogene 33, 679–689. doi: 10.1038/onc.2012.636
Kong, W., He, L., Tan, L., Guo, J., Coppola, D., Coppola, M., et al. (2011). Abstract 3986: microRNA-155 regulates angiogenesis by targeting Von Hippel-Lindau (VHL) tumor suppressor. Cancer Res. 71, 3986–3986. doi: 10.1158/1538-7445.AM2011-3986
Kuchnio, A., Dewerchin, M., and Carmeliet, P. (2015). The PHD2 oxygen sensor paves the way to metastasis. Oncotarget 6, 35149–35150. doi: 10.18632/oncotarget.6216
Lee, J. W., Hwang, J. R., Cho, Y. J., Lee, Y. Y., Choi, C. H., Bae, D. S., et al. (2016). Abstract 3384: IGFBP5-derived peptide as a novel angiogenesis inhibitor for treatment of ovarian cancer. Cancer Res. 76, 3384–3384. doi: 10.1158/1538-7445.AM2016-3384
Li, J. H., Liu, S., Zhou, H., Qu, L. H., and Yang, J. H. (2014). starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 42, D92–D97. doi: 10.1093/nar/gkt1248
Li, X., Tian, R., Gao, H., Yan, F., Ying, L., Yang, Y., et al. (2018). Identification of significant gene signatures and prognostic biomarkers for patients with cervical cancer by integrated bioinformatic methods. Technol. Cancer Res. Treat. 17:153303381876745. doi: 10.1177/1533033818767455
Li, X., Li, Z., Liu, Z., Xiao, J., Yu, S., and Song, Y. (2017a). Long non-coding RNA DLEU1 predicts poor prognosis of gastric cancer and contributes to cell proliferation by epigenetically suppressing KLF2. Cancer Gene Ther. 25, 58–67. doi: 10.1038/s41417-017-0007-9
Li, X., Tian, R., Gao, H., Yang, Y., Williams, B. R. G., Gantier, M. P., et al. (2017b). Identification of a histone family gene signature for predicting the prognosis of cervical cancer patients. Sci. Rep. 7:16495. doi: 10.1038/s41598-017-16472-5
Liang, B., Li, Y., and Wang, T. (2017). A three miRNAs signature predicts survival in cervical cancer using bioinformatics analysis. Sci. Rep. 7:5624. doi: 10.1038/s41598-017-06032-2
Liu, B., Ding, J. F., Luo, J., Lu, L., Yang, F., Tan, X. D., et al. (2016). Seven protective miRNA signatures for prognosis of cervical cancer. Oncotarget 7, 56690–56698. doi: 10.18632/oncotarget.10678
Liu, T., Han, Z., Li, H., Zhu, Y., Sun, Z., and Zhu, A. (2018). LncRNA DLEU1 contributes to colorectal cancer progression via activation of KPNA3. Mol. Cancer 17:118. doi: 10.1186/s12943-018-0873-2
Lu, N., Hui, H., Yang, H., Zhao, K., Chen, Y., You, Q. D., et al. (2013). Gambogic acid inhibits angiogenesis through inhibiting PHD2-VHL-HIF-1α pathway. Eur. J. Pharm. Sci. 49, 220–226. doi: 10.1016/j.ejps.2013.02.018
Lu, S. X., Zhang, C. Z., Luo, R. Z., Wang, C. H., Liu, L. L., Fu, J., et al. (2017). Zic2 promotes tumor growth and metastasis via PAK4 in hepatocellular carcinoma. Cancer Lett. 402, 71–80. doi: 10.1016/j.canlet.2017.05.018
Ma, C., Zhang, W., Wu, Q., Liu, Y., Wang, C., Lao, G., et al. (2018). Identification of a microRNA signature associated with survivability in cervical squamous cell carcinoma. PLoS One 13:e0193625. doi: 10.1371/journal.pone.0193625
Marchini, S., Poynor, E., Barakat, R. R., Clivio, L., Cinquini, M., Fruscio, R., et al. (2012). The zinc finger gene ZIC2 has features of an oncogene and its over- expression correlates strongly with the clinical course of epithelial ovarian cancer. Clin. Cancer Res. 18, 4313–4324. doi: 10.1158/1078-0432.CCR-12-0037
Peng, Z., Wei, D., Wang, L., Tang, H., Zhang, J., Le, X., et al. (2006). RUNX3 inhibits the expression of vascular endothelial growth factor and reduces the angiogenesis, growth, and metastasis of human gastric cancer. Clin. Cancer Res. 12, 6386–6394. doi: 10.1158/1078-0432.CCR-05-2359
Reese, S. E., Archer, K. J., Therneau, T. M., Atkinson, E. J., Vachon, C. M., de Andrade, M., et al. (2013). A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis. Bioinformatics 29, 2877–2883. doi: 10.1093/bioinformatics/btt480
Rho, S. B., Dong, S. M., Kang, S., Seo, S. S., Yoo, C. W., Lee, D. O., et al. (2008). Insulin-like growth factor-binding protein-5 (IGFBP-5) acts as a tumor suppressor by inhibiting angiogenesis. Carcinogenesis 29, 2106–2111. doi: 10.1093/carcin/bgn206
Roszak, A., Kedzia, W., Malkowska-Walczak, B., Pawlik, P., Kedzia, H., Luczak, M., et al. (2011). Reduced expression of PHD2 prolyl hydroxylase, gene in primary advanced uterine cervical carcinoma. Biomed. Pharmacother. 65, 298–302. doi: 10.1016/j.biopha.2011.03.005
Schwarz, G. (1978). Estimating the dimension of a model. Ann. Stat. 6, 15–18. doi: 10.1214/aos/1176344136
Seibert, C., Veldkamp, C. T., Peterson, F. C., Chait, B. T., Volkman, B. F., and Sakmar, T. P. (2008). Sequential tyrosine sulfation of CXCR4 by tyrosylprotein sulfotransferases. Biochemistry 47, 11251–11262. doi: 10.1021/bi800965m
Siegel, R. L., Miller, K. D., and Jemal, A. (2018). Cancer statistics, 2018. CA A Cancer J. Clin. 68, 7–30. doi: 10.3322/caac.21442
Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2011). Regularization paths for Cox’s proportional hazards model via coordinate descent. J. Stat. Softw. 39, 1–13. doi: 10.18637/jss.v039.i05
Song, X., Wang, Z., Jin, Y., Wang, Y., and Duan, W. (2015). Loss of mir-532-5p in vitro promotes cell proliferation and metastasis by influencing CXCL2 expression in HCC. Am. J. Transl. Res. 7, 2254–2261.
Strimmer, K. (2008). A unified approach to false discovery rate estimation. BMC Bioinformatics 9:303. doi: 10.1186/1471-2105-9-303
Tripathi, S., Pohl, M. O., Zhou, Y., Rodriguez-Frandsen, A., Wang, G., Stein, D. A., et al. (2015). Meta- and orthogonal integration of influenza “OMICs” data defines a role for ubr4 in virus budding. Cell Host Microbe 18, 723–735. doi: 10.1016/j.chom.2015.11.002
Wang, L. L., Sun, K. X., Wu, D. D., Xiu, Y. L., Chen, X., Chen, S., et al. (2017). DLEU1 contributes to ovarian carcinoma tumourigenesis and development by interacting with miR-490-3p and altering CDK1 expression. J. Cell. Mol. Med. 21, 3055–3065. doi: 10.1111/jcmm.13217
Xu, J., Deng, X., Tang, M., Li, L., Yang, L., Zhong, L., et al. (2013). Tyrosylprotein sulfotransferase-1 and tyrosine sulfation of chemokine receptor 4 are induced by epstein-barr virus encoded latent membrane protein 1 and associated with the metastatic potential of human nasopharyngeal carcinoma. PLoS One 8:e56114. doi: 10.1371/journal.pone.0056114
Xue, J. Q., Xia, T. S., Liang, X. Q., Zhou, W., Cheng, L., Shi, L., et al. (2014). RNA-binding protein RNPC1: acting as a tumor suppressor in breast cancer. BMC Cancer 14:322. doi: 10.1186/1471-2407-14-322
Ying, Z., Wang, K. X., Hui, X., and Hong, Y. (2018). Integrative miRNA analysis identifies hsa-miR-3154, hsa-miR-7-3, and hsa-miR-600 as potential prognostic biomarker for cervical cancer. J. Cell. Biochem. 119, 1558–1566. doi: 10.1002/jcb.26315
Zhang, H., Ye, Y. L., Li, M. X., Ye, S. B., Huang, W. R., Cai, T. T., et al. (2016). CXCL2/MIF-CXCR2 signaling promotes the recruitment of myeloid-derived suppressor cells and is correlated with prognosis in bladder cancer. Oncogene 36, 2095–2104. doi: 10.1038/onc.2016.367
Zhang, J., Zhou, W., Liu, Y., Liu, T., Li, C., and Wang, L. (2018). Oncogenic role of microRNA-532-5p in human colorectal cancer via targeting of the 5’UTR of RUNX3. Oncol. Lett. 15, 7215–7220. doi: 10.3892/ol.2018.8217
Keywords: transcriptome, signature, prognosis, CESC, angiogenesis
Citation: Xiong J, Guo S, Bing Z, Su Y and Guo L (2019) A Comprehensive RNA Expression Signature for Cervical Squamous Cell Carcinoma Prognosis. Front. Genet. 9:696. doi: 10.3389/fgene.2018.00696
Received: 19 September 2018; Accepted: 12 December 2018;
Published: 04 January 2019.
Edited by:
Juan Caballero, Universidad Autónoma de Querétaro, MexicoReviewed by:
Himanshi Bhatia, Washington University in St. Louis, United StatesHeriberto Prado-Garcia, Instituto Nacional de Enfermedades Respiratorias, Mexico
Copyright © 2019 Xiong, Guo, Bing, Su and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jie Xiong, eGlvbmdqaWU4NkAxMjYuY29t
†These authors have contributed equally to this work
 Shengyu Guo1†
Shengyu Guo1† 
   
   
  