Clinicopathological Implication of Long Non-Coding RNAs SOX2 Overlapping Transcript and Its Potential Target Gene Network in Various Cancers

Background SOX2 overlapping transcript (SOX2-OT) produces alternatively spliced long non-coding RNAs (lncRNA). Previous studies of the prognostic role of SOX2-OT expression met with conflicting results. The aim of this study was to properly consider the prognostic role of SOX2-OT expression in several cancers. In addition, the regulative mechanism of SOX2-OT is explored. Methods PubMed, EMBASE, and Cochrane Library and The Cancer Genome Atlas (TCGA) database were comprehensively explored to recover pertinent studies. We conducted an extensive inquiry to verify the implication of SOX2-OT expression in cancer patients by conducting a meta-analysis of 13 selected studies. Thirty-two TCGA databases were used to analyze the connection between SOX2-OT expression and both the overall survival (OS) and clinicopathological characteristics of cancer patients using R and STATA 13.0. Trial sequential analysis (TSA) was adopted in order to compute the studies’ power. Results Thirteen studies involving 1172 cancer patients and 32 TCGA cancer types involving 9676 cancer patients were eventually selected. Elevated SOX2-OT expression was significantly related to shorter OS (HR = 2.026, 95% CI: 1.691–2.428, P < 0.0001) and disease-free survival (DFS) (HR = 2.554, 95% CI: 1.261–5.174, P = 0.0092) in cancer patients. Meanwhile, TSA substantiated adequate power to demonstrate the relationship between SOX2-OT expression and OS. The cancer patients with elevated SOX2-OT expression were more likely to have advanced clinical stage (RR = 1.468, 95% CI: 1.106–1.949, P = 0.0079), earlier lymphatic metastasis (P = 0.0005), earlier distant metastasis (P < 0.0001), greater tumor size (P < 0.0001), and more extreme tumor invasion (P < 0.0001) compared to those with low SOX2-OT expression. Meta-regression and subgroup analysis revealed that follow-up time, sample type, and tumor type could significantly contribute to heterogeneity for survival outcomes. 
Follow-up time could significantly explain heterogeneity for tumor, node, metastasis (TNM) stage. Furthermore, up to 500 validated target genes were identified, and the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses demonstrated that the validated targets of SOX2-OT were substantially enriched in cell adhesion, mRNA binding, and mRNA surveillance pathways. Conclusions Elevated expression of SOX2-OT predicted poor OS and DFS. Overexpression of SOX2-OT was correlated with more advanced tumor stage, earlier lymphatic metastasis, earlier distant metastasis, larger tumor size, and deeper tumor invasion. SOX2-OT-mediated cell adhesion, mRNA binding, or mRNA surveillance could be intrinsic mechanisms for invasion and metastasis.

The evidence above showed that SOX2-OT is involved in tumor progression. Moreover, an earlier meta-analysis study published in 2018 had revealed that the overexpression of SOX2-OT was significantly correlated with the overall survival (OS), clinical stage, lymph node metastasis, distant metastasis, and tumor differentiation of cancers (Song et al., 2018). However, the sample size of the study was restricted, and the relationship between SOX2-OT and other clinicopathological characteristics was not explored (Song et al., 2018). As described below, we have conducted a more comprehensive trial sequential analysis (TSA) on the applicable literature and searched The Cancer Genome Atlas (TCGA) database to study the prognostic value of SOX2-OT in patients with several types of cancer. We additionally explored the potential target genes of SOX2-OT through gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses, and the potential mechanisms of SOX2-OT in tumor progression are also discussed.

Search Strategy
Studies on the prognostic roles of SOX2-OT in cancer patients that were published as of October 1st, 2019 were extracted from the electronic databases PubMed, EMBASE, and Cochrane Library using the terms (1) "SOX2-OT" OR "NCRNA00043" OR "SOX2OT" OR "SOX2 overlapping transcript" OR "SRY-box transcription factor 2 overlapping transcript" AND (2) "tumor OR cancer OR carcinoma OR neoplasm OR metastasis". The search strategies are illustrated in Supplementary Table 1. The search and selection of articles for the study were conducted as described previously (Sun et al., 2019).

Inclusion and Exclusion Criteria
Studies included in this analysis met the following requirements: (1) definitive diagnosis or histopathological confirmation for patients with cancer; (2) SOX2-OT expression was measured by quantitative real-time polymerase chain reaction (qRT-PCR); (3) the hazard ratios (HRs) and their 95% confidence intervals (CIs) for survival parameters based on SOX2-OT expression levels were directly available or could be calculated indirectly; and (4) the most representative and accurate studies were selected to avoid unnecessary cohort overlap. Studies that satisfied the above inclusion requirements were excluded if they had any of the following features: (1) duplicated articles or data; (2) nonhuman studies; (3) review articles or letters; (4) articles in non-English languages.

Quality Assessment of Included Studies
The quality of the included studies was assessed using the Newcastle-Ottawa Scale (NOS), with scores ≥ 6 considered high quality. A "star system" was applied for case-control studies (Supplementary Table 2).

Data Extraction
The following information was extracted from each study: (1) first author; (2) publication year; (3) nationality, sample size, tumor type, and clinicopathological characteristics of the patient population; (4) the assay method and cut-off value of SOX2-OT expression levels; (5) HRs of SOX2-OT expression for OS and disease-free survival (DFS). If the HRs for OS and DFS were calculated by both univariate and multivariate analyses, the latter were our first choice because they were adjusted for confounding factors. If a study did not report HRs, we estimated HRs and their corresponding 95% CIs using the procedures described by Parmar et al. (1998) and Tierney et al. (2007). The data of Kaplan-Meier curves were extracted with Engauge Digitizer software (version 9.8, http://markummitchell.github.io/engauge-digitizer). This process was repeated three times to decrease variability. Discrepancies were resolved through discussion and review of the extraction until consensus was reached on a final list of factors targeted by each study.
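When a study reports an HR together with its 95% CI, the standard error of the log HR needed for pooling can be recovered from the width of the interval on the log scale. The snippet below is a minimal sketch of that single step, assuming a symmetric normal-approximation CI; the function name `log_hr_and_se` is hypothetical, and the sketch does not cover the reconstruction from Kaplan-Meier curves described by Parmar et al. (1998) and Tierney et al. (2007).

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.959964):
    """Return (ln HR, standard error of ln HR) from an HR and its 95% CI.

    Assumes the CI was built as exp(ln HR +/- z * SE), so the SE is the
    log-scale width of the interval divided by 2 * z.
    """
    if not (0 < ci_low <= hr <= ci_high):
        raise ValueError("expected 0 < ci_low <= hr <= ci_high")
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se

# Sample input: the pooled OS estimate reported in this article.
log_hr, se = log_hr_and_se(2.026, 1.691, 2.428)
print(f"ln(HR) = {log_hr:.4f}, SE = {se:.4f}")
```

The same log HR and SE pair is the input that inverse-variance pooling expects for each study.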

Statistical Analysis
All the HRs and their 95% CIs were integrated to evaluate the association between SOX2-OT expression and prognosis. If the pooled HR was < 1 and its 95% CI did not cross the line of no effect (HR = 1) in the forest plot, elevated expression of SOX2-OT was taken to predict a good OS; conversely, a pooled HR > 1 with a 95% CI that did not cross this line indicated a poor OS. The heterogeneity of the pooled results was examined via Cochran's Q test and Higgins' I-squared. If P ≥ 0.1 and I² ≤ 25%, we disregarded the influence of heterogeneity and pooled the overall result using a fixed effects model; otherwise, the random effects model was employed. Potential publication bias was assessed by a funnel plot and Egger's test (Stuck et al., 1998) conducted using the "metafor" and "meta" packages of R.
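The pooling rule described above can be illustrated with a short inverse-variance calculation. This is a sketch in plain Python, not the "metafor" or "meta" code used for the analysis: the function name `pool_log_hrs`, the DerSimonian-Laird estimator for the random effects model, and the toy inputs are all assumptions made for illustration. For brevity the model choice here uses only the I² threshold; the P value of the Q test would come from a chi-square distribution with k - 1 degrees of freedom.

```python
import math

def pool_log_hrs(log_hrs, ses, i2_threshold=25.0, z=1.959964):
    """Pool log hazard ratios by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w

    # Cochran's Q and Higgins' I^2 (expressed in percent).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_re = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, log_hrs)) / sum_re

    # Fixed effects when heterogeneity is low, random effects otherwise.
    if i2 <= i2_threshold:
        model, est, se = "fixed", fixed, math.sqrt(1.0 / sum_w)
    else:
        model, est, se = "random", random, math.sqrt(1.0 / sum_re)
    return {
        "model": model,
        "hr": math.exp(est),
        "ci_low": math.exp(est - z * se),
        "ci_high": math.exp(est + z * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical inputs (three imaginary studies), not data from this article.
example = pool_log_hrs([0.60, 0.75, 0.90], [0.20, 0.25, 0.30])
print(example["model"], round(example["hr"], 3), round(example["i2"], 1))
```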

Identification of Eligible Studies
The identification of eligible studies is summarized in Figure 1. We screened 122 articles for eligibility and identified 13 eligible studies. These eligible articles were published between 2014 and 2018 and included a total of 1172 participants who represented eight cancer types (Table 1). Most articles chose the mean or the median as the cut-off value. Eight studies that used multivariate analysis of OS were included in the meta-analysis (Hou et al., 2014; Shi and Teng, 2015; Zhang et al., 2016; Zou et al., 2016; Wang et al., 2017a; Li et al., 2018a; Li et al., 2018b; Xie et al., 2018); the adjusted variables of the multivariate analysis are presented in Table 2. The other three studies provided survival curves (Sun et al., 2018; Wei et al., 2018).
We performed subgroup analyses of the association between SOX2-OT expression and OS using 11 studies. The results showed a significant association between SOX2-OT expression and OS when the data were integrated from the eight studies in which OS was assessed with multivariate analysis (HR = 2.052, 95% CI: 1.661-2.536, P < 0.0001, I² = 0%) (Table 4). Furthermore, a significant relationship was revealed in the subgroup analyses for OS based on sample size (P < 0.0001), tumor type (P < 0.05), sample type (P < 0.05), and cut-off value (P < 0.01).
Eight studies employed Cox multivariate analysis to assess the prognostic value of lncRNA SOX2-OT expression in cancer patients (Hou et al., 2014; Shi and Teng, 2015; Zhang et al., 2016; Zou et al., 2016; Wang et al., 2017a; Li et al., 2018a; Li et al., 2018b; Xie et al., 2018). An in-depth subgroup analysis was therefore carried out to clarify the influence of the adjusted variables in the multivariate analysis (Table 5). Subgroup analysis stratified by independent prognostic factors, such as clinical stage (P < 0.0001), lymph node metastasis (P < 0.0001), tumor differentiation (P < 0.0001), tumor size (P < 0.01), vascular invasion (P < 0.001), tumor depth (P < 0.001), distant metastasis (P < 0.0001), postoperative recurrence (P < 0.05), and smoking status (P < 0.05) (Table 5), demonstrated a significant relationship between lncRNA SOX2-OT expression and OS.
To examine the robustness of the OS result, TSA was applied to the meta-analysis with trial sequential monitoring boundaries that assumed a 15% relative risk reduction (RRR). The cumulative Z-curve crossed the trial sequential monitoring boundary for benefit, indicating that sufficient evidence exists for a 15% RRR when SOX2-OT expression is low (Figure 2B).
Publication bias in the association between SOX2-OT expression and prognosis was indicated by Egger's test (P < 0.05) (Figure 4A). No distinct bias in the correlation between SOX2-OT expression and clinicopathological characteristics was found across the included studies on the basis of funnel plots and the P values of Egger's test (Figures 4B-I).
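Egger's test, as used above, regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The following is a minimal sketch of that regression using ordinary least squares. The function name `egger_regression` and the toy numbers are hypothetical, and the sketch stops at the t statistic, which would be compared against a t distribution with n - 2 degrees of freedom to obtain the P value.

```python
import math

def egger_regression(effects, ses):
    """Egger's regression: standardized effect on precision, OLS fit.

    Returns the intercept, its standard error, the t statistic, and
    the residual degrees of freedom (n - 2).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are required")
    x = [1.0 / s for s in ses]                    # precision
    z = [y / s for y, s in zip(effects, ses)]     # standardized effect
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and the standard error of the intercept.
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    resid_var = sse / (n - 2)
    se_int = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "se": se_int, "t": t_stat, "df": n - 2}

# Hypothetical log HRs and standard errors, not data from this article.
toy = egger_regression([1.5, 1.0, 3.5 / 3.0, 1.0], [1.0, 0.5, 1.0 / 3.0, 0.25])
print(round(toy["intercept"], 3), round(toy["t"], 3), toy["df"])
```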

Meta-Regression and Stratified Analysis
To investigate the possible sources of heterogeneity, we performed subgroup analyses of the original articles based on various factors. Table 6 displays the outcomes of a meta-regression that examined the source of the high heterogeneity for TNM stage. The follow-up time, sample type, and tumor type could significantly explain heterogeneity for survival outcomes in the post-hoc analysis (Table 6, Figure 5A). On the basis of the meta-regression results, we carried out a subgroup analysis stratified by follow-up time, sample type, and tumor type (Figures 5B-D). This subgroup analysis showed significantly lower heterogeneity in the group with follow-up above 60 months, the tissue group, and the cholangiocarcinoma group, which suggested that the relationship between high SOX2-OT expression and TNM stage is more robust in these groups. Meta-regression (Supplementary Table 3) and stratified analysis (Supplementary Table 4) did not identify any of the potential factors as a source of heterogeneity for the other clinical parameters.

Functional Analysis of SOX2-OT Related Genes in Human Tumors
To systematically analyze the underlying gene regulatory mechanisms of SOX2-OT, a total of 500 target genes were identified with Multi Experiment Matrix (MEM) (Supplementary Figure 5), and GO and KEGG analyses were performed. The validated target genes of SOX2-OT were enriched in GO terms including cell adhesion, cell adhesion molecule (CAM) binding, mRNA binding, mRNA splicing via spliceosome, and MAPK cascade (Figure 7A). These GO terms were considered the most specific and useful for describing the concrete function of SOX2-OT. The visualization network is shown in Figure 7B. Furthermore, KEGG enrichment analysis indicated that SOX2-OT may play a critical role in cancers via several pathways including CAMs, retrograde endocannabinoid signaling, circadian entrainment, cAMP signaling pathway, and mRNA surveillance pathway (Figure 7C). These KEGG terms were considered the most specific and useful for describing the concrete pathways of SOX2-OT. The visualization network is presented in Figure 7D.
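GO and KEGG enrichment of a target gene list is commonly scored with a one-sided hypergeometric (over-representation) test, which asks how likely it is to see at least the observed number of target genes in a term by chance. The article does not state which test or tool was used, so the sketch below is only an assumption about the general technique; the function name `enrichment_p_value` and all gene counts are hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_in_term, n_targets, n_overlap):
    """One-sided hypergeometric P value for over-representation.

    n_background: genes in the background set
    n_in_term:    background genes annotated to the GO/KEGG term
    n_targets:    genes in the target list
    n_overlap:    target genes annotated to the term
    Returns P(X >= n_overlap) when n_targets genes are drawn at random.
    """
    upper = min(n_in_term, n_targets)
    if not (0 <= n_overlap <= upper):
        raise ValueError("n_overlap is outside the possible range")
    # Sum exact integer counts first, then divide once to limit rounding.
    favourable = sum(
        comb(n_in_term, i) * comb(n_background - n_in_term, n_targets - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_targets)

# Hypothetical counts: 500 targets, 20000 background genes,
# 400 genes in the term, 25 of them among the targets.
p = enrichment_p_value(20000, 400, 500, 25)
print(f"P = {p:.3e}")
```

In practice the P values of all tested terms would also be corrected for multiple testing before any term is called enriched.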

DISCUSSION
Several studies have indicated that high expression of SOX2-OT is significantly related to the prognosis and clinicopathological outcomes in cancers (Hou et al., 2014; Shi and Teng, 2015; Iranpour et al., 2016; Zhang et al., 2016; Zou et al., 2016; Wang et al., 2017a; Wang et al., 2017b; Han et al., 2018; Li et al., 2018a; Li et al., 2018b; Sun et al., 2018; Wei et al., 2018; Xie et al., 2018). The crucial role that SOX2-OT may play in the progression of many cancers has been further outlined in reviews (Shahryari et al., 2015; Castro-Oropeza et al., 2018). A meta-analysis by Jing et al. proposed that the overexpression of SOX2-OT indicated higher TNM stage and a worse OS in cancer patients, but failed to predict distant metastasis and lymph node metastasis in Chinese cancer patients (Jing et al., 2017). Moreover, other studies since 2014 have investigated the relationship between SOX2-OT and the prognosis of cancer patients (Hou et al., 2014; Shi and Teng, 2015; Iranpour et al., 2016; Zhang et al., 2016; Zou et al., 2016; Wang et al., 2017a; Wang et al., 2017b; Han et al., 2018; Li et al., 2018a; Li et al., 2018b; Sun et al., 2018; Wei et al., 2018; Xie et al., 2018). The present study was performed to obtain a more definite conclusion and to assess the potential mechanisms of SOX2-OT effects by integrating the outcomes of published studies and TCGA survival data and by running GO and KEGG analyses.
The present meta-analysis, which combined 1172 patients from 13 eligible studies with 9676 patients from TCGA, thoroughly investigated the correlations between elevated expression of SOX2-OT and prognosis as well as clinicopathological outcomes in cancer patients. The NOS was applied to evaluate the quality of all the selected studies, and Egger's test and Begg's test were used to examine publication bias. If the P value of Egger's test was less than 0.05, we also checked the reliability of the results by TSA. Our results indicated that elevated expression of SOX2-OT was significantly related to worse prognosis, with a pooled HR of 2.026 (95% CI: 1.691-2.428) for OS and 2.554 (95% CI: 1.261-5.174) for DFS. Regarding the clinicopathological characteristics of patients with cancers, our research suggested that high SOX2-OT expression was significantly associated with the invasion of cancers, as revealed by the tumor stage (RR = 1.468, 95% CI: 1.106-1.949), lymphatic metastasis (RR = 1.554, 95% CI: 1.211-1.994), distant metastasis (RR = 3.054, 95% CI: 1.866-4.999), tumor size (RR = 1.264, 95% CI: 1.019-1.566), and depth of tumor invasion (RR = 1.552, 95% CI: 1.274-1.890), but could not predict histological differentiation, age, or gender.
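For readers less familiar with the risk ratios (RRs) quoted above, each study-level RR compares the proportion of patients with a given feature (for example, advanced stage) between the high and low SOX2-OT groups. The sketch below shows the usual calculation with a log-scale 95% CI; the function name `risk_ratio` and the 2x2 counts are hypothetical and are not taken from any included study.

```python
import math

def risk_ratio(events_high, total_high, events_low, total_low, z=1.959964):
    """Risk ratio of an event in the high vs. low expression group.

    Returns (RR, CI lower bound, CI upper bound) using the standard
    error of ln(RR): sqrt(1/a - 1/n1 + 1/c - 1/n2).
    """
    if min(events_high, events_low) <= 0:
        raise ValueError("both groups need at least one event")
    rr = (events_high / total_high) / (events_low / total_low)
    se = math.sqrt(
        1.0 / events_high - 1.0 / total_high
        + 1.0 / events_low - 1.0 / total_low
    )
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high

# Hypothetical counts: 30 of 50 high-expression patients and
# 15 of 50 low-expression patients have advanced-stage disease.
rr, ci_low, ci_high = risk_ratio(30, 50, 15, 50)
print(f"RR = {rr:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```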
According to our findings, SOX2-OT shows the potential to be used as a marker for progression and prognosis. A subgroup analysis indicated that elevated SOX2-OT expression was substantially associated with OS in sarcoma (SARC) and gastric cancer (STAD) patients, according to both the publications and the TCGA datasets. As for pancreatic cancer (PAAD), bile duct cancer (CHOL), lung adenocarcinoma (LUAD), and lung squamous cell carcinoma (LUSC), SOX2-OT overexpression was correlated with a poor prognosis in the publications. However, in the TCGA datasets, SOX2-OT was associated with a good prognosis although the results were not statistically significant; the corresponding HR values were 0.89 (95% CI: 0.591-1.339, P = 0.574), 0.918 (95% CI: 0.364-2.319, P = 0.856), 0.738 (95% CI: 0.552-0.988, P = 0.04), and 0.79 (95% CI: 0.603-1.035, P = 0.085), respectively. High expression of SOX2-OT in liver cancer (LIHC) in the TCGA datasets was correlated with an unfavorable prognosis (HR = 1.467, 95% CI: 0.845-2.548, P = 0.24) although the results were not statistically significant, which was consistent with the publications (Shi and Teng, 2015; Sun et al., 2018) (Tables 4 and 7). Kaplan-Meier analysis initially suggested that SOX2-OT overexpression was associated with a poor OS in adrenocortical cancer (ACC), cervical cancer (CESC), (Table 7 and Supplementary Figures 3 and 4). Sampling error and publication bias may explain the inconsistent results between the literature studies and the TCGA datasets.
Heterogeneity appeared in the clinicopathological analyses of tumor stage, lymphatic metastasis, and tumor size (P < 0.1). Since heterogeneity may affect the results of a meta-analysis, it was handled cautiously with a random effects model to reduce its effect on the pooled results. Publication bias was prominent in studies with OS data (P < 0.05), as shown by Egger's test, Begg's test, and funnel plots. Nevertheless, the TSA suggested that the results of our study were statistically stable.
Recently, studies on the function of SOX2-OT in cancer have multiplied, and cumulative evidence indicates that SOX2-OT can affect various biological behaviors of numerous tumors. Li et al. pointed out that SOX2-OT competitively binds to the miR-200 family to regulate the expression of SOX2, and that SOX2-OT promotes epithelial-mesenchymal transition (EMT) and stem cell-like properties by regulating SOX2 expression, thereby promoting invasion and metastasis of pancreatic duct adenocarcinoma (Li et al., 2018a). Qu et al. proposed that SOX2-OT was highly expressed in gastric cancer cells, where it promoted the expression of AKT2 by targeting miR-194-5p, thus elevating cell proliferation and metastasis (Qu and Cao, 2018). Finally, Wei et al. discovered that the upregulation of lncRNA SOX2-OT by transcription factor IRF4 promotes cell proliferation and metastasis in cholangiocarcinoma via upregulating SOX2, and activates the PI3K/AKT signaling pathway via suppressing the nuclear transcription of PTEN.
The exact gene regulatory mechanisms of SOX2-OT remain poorly understood. Therefore, we identified the validated target genes of SOX2-OT through the MEM platform and performed a comprehensive target gene network analysis. The GO and KEGG pathway analyses together revealed that some CAMs and pathways may be regulated by SOX2-OT. SOX2-OT appears to play a critical role in cancers via different pathways, including mRNA binding and mRNA splicing, similar to the post-transcriptional regulating functions of other lncRNAs. These findings suggest that elevated SOX2-OT expression is associated with the processes of tumor invasion and metastasis, consistent with the results of our meta-analysis.
Our study is consistent with the most recent study by Song et al., in which lncRNA SOX2-OT overexpression was significantly correlated with worse OS and more advanced clinical stages of solid tumors based on 943 cases from 10 studies, all involving Asian patients (Song et al., 2018). Consistently, an analysis of 481 patients from five studies by Jing et al. showed that high SOX2-OT expression predicted poor OS and more advanced tumor progression, but failed to predict distant metastasis and lymph node metastasis in Chinese cancer patients (Jing et al., 2017). Herein, we have performed a more comprehensive study on the clinicopathological significance of SOX2-OT expression in cancer patients. First, we included 13 eligible articles involving 1172 cancer patients and 32 TCGA cancer datasets involving 9676 cancer patients to investigate a total of 10,848 participants in our study. Second, we investigated both the clinicopathological and prognostic significance of SOX2-OT expression based on comprehensive clinical data and performed a series of subgroup analyses based on prognostic types, adjusted variables in the multivariate analysis of OS, sample sizes, cancer types, sample types, cut-off values, analysis models, and clinicopathological characteristics. These stratifications increase our understanding of the clinicopathological significance of SOX2-OT expression in cancers. Third, TSA on the applicable literature was used to investigate the reliability and conclusiveness of the available evidence for the prognostic significance of SOX2-OT expression. Fourth, the prognostic value was validated using TCGA datasets, and the potential functions were explored using GO and KEGG. This study has some limitations. In the meta-analysis, the different cut-off values and sample types of the selected articles contributed to publication bias. In addition, where direct results of survival analysis were unavailable, extracting the survival data from Kaplan-Meier curves might have introduced a divergence in HR values. Consequently, in-depth study is required to investigate the clinical value and prognostic significance of SOX2-OT in cancers.
To increase the sample size, we used TCGA datasets for further analysis and validation, but only the results for gastric cancer and sarcoma were consistent with those based on the publications. To clarify the mechanism by which SOX2-OT is involved in gastric cancer and sarcoma, further molecular biology experiments are warranted to explore other possible signaling pathways or target molecules.
In conclusion, our report shows that elevated SOX2-OT expression was significantly related to the progression of invasion and metastasis in cancers, implying shorter OS and DFS, a more advanced TNM stage, higher rates of lymphatic and distant metastasis, larger tumor size, and deeper invasion. We also concluded that SOX2-OT plays a crucial role via a few pathways. Considering the limitations, further studies are necessary to better define the functions of SOX2-OT in cancers.