A Combined Long Noncoding RNA Signature as a Candidate Prognostic Biomarker for Ovarian Cancer

Aims: Dysregulated long noncoding RNAs (lncRNAs) that contribute to ovarian cancer (OC) development may serve as prognostic biomarkers. We aimed to identify a lncRNA signature that could serve as a prognostic biomarker for OC. Methods: Univariate Cox regression was performed on the lncRNA expression dataset from the TCGA cohort, and 246 genes significantly associated with survival were retained for model building. A random forest survival model was then fitted, and the 6 genes selected with the highest frequency were used to develop the model. The selected genes were entered into a multivariate Cox regression model, from which a risk score was calculated for prognostic prediction. We also used CCK-8, EdU, and colony formation assays to validate the function of these lncRNAs in OC cells. Results: The 6-lncRNA combined signature was associated with OC prognosis. Systematic analysis showed that lncRNA-associated genes were enriched in oncogenic signalling pathways. Five of the 6 lncRNAs affected OC cell proliferation. Conclusion: We established a 6-lncRNA combined signature for OC prognosis, which may serve as a prognostic biomarker for OC after further validation.


INTRODUCTION
Ovarian cancer (OC) is a major cause of gynaecologic cancer death in women worldwide (1). Each year, 238,700 patients are newly diagnosed with OC, and 151,190 patients die of OC worldwide (2). The overall mortality of OC remains high owing to the high recurrence rate and lack of early detection (3). Thus, it is crucial to explore more reliable prognostic factors for early detection.
Long noncoding RNAs (lncRNAs) are a class of noncoding RNAs longer than 200 nucleotides. Several lncRNAs have been confirmed to be deregulated and to act as critical regulators in OC (4). Overexpression of lncRNA HOTTIP promoted the growth and metastasis of OC via the miR-615-3p/SMARCE1 axis (5). LncRNA HOTAIR maintained higher OC stemness through the miR-206/TBX3 pathway (6). Recently, several lncRNAs have been identified as prognostic biomarkers for OC patients. Overexpression of lncRNA PTPRG-AS1 predicted a poor prognosis in OC patients (7). An elevated level of lncRNA ROR was correlated with poor clinical outcome in OC patients (8). However, such single-gene features appear to generalize poorly to new datasets (9). In this work, we identified a multi-lncRNA combined signature based on a prognostic model for OC patients and investigated the role of the signature in OC proliferation.

Cell Lines and Reagents
The human OC cell lines Hey and SKOV3 and the lentiviral packaging cell line 293T were obtained from the American Type Culture Collection (ATCC). RPMI 1640 medium (Sigma-Aldrich) was used to culture the Hey and SKOV3 cell lines, and DMEM was used to culture 293T cells. Both media were supplemented with 10% foetal bovine serum (FBS) and 1% penicillin/streptomycin. All cells were incubated in a humidified atmosphere at 37°C with 5% CO2.

Plasmid Construction and Lentivirus Infection
According to the manufacturer's protocol, two small hairpin RNAs (shRNAs) targeting each lncRNA were inserted into the lentiviral vector pLKO.1-puro to silence lncRNA expression. Supplementary Table 1 shows the shRNA sequences used in this study. We used pLKO.1-puro scrambled shRNA as a control. In the presence of polybrene, Hey and SKOV3 cell lines were infected with the virus. Then, the medium containing puromycin was used to select Hey and SKOV3 stable clones. RT-PCR was used to confirm the knockdown efficiency of these shRNAs in Hey and SKOV3 cell lines.

Reverse Transcription and Quantitative Real-Time PCR
TRIzol reagent (Invitrogen) was used to extract total RNA according to the manufacturer's instructions. A NanoDrop 2000 (Thermo Fisher Scientific) was applied to measure the quantity and purity of the total RNA. The PrimeScript RT Master Mix kit (TAKARA) was used to synthesize cDNA. Real-time PCR was performed using FastStart Universal SYBR Green Master Mix (Rox) (Roche) in an ABI PRISM 7900 sequence detector (Applied Biosystems, Carlsbad, CA), and GAPDH was applied as an internal control. Relative mRNA levels were computed according to the Ct values relative to GAPDH. Supplementary Table 2 shows the primers used in this study.
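The normalisation to GAPDH described above can be illustrated with a short calculation. This is a minimal sketch that assumes the common 2^-ΔΔCt method with roughly 100% amplification efficiency; the text only states that Ct values were taken relative to GAPDH, so the exact formula, the function name and the Ct values below are illustrative assumptions.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target transcript versus a control sample.

    Assumes the 2^-ddCt method (an assumption, not stated in the text):
    each sample is first normalised to the reference gene (e.g. GAPDH),
    then the knockdown sample is compared with the scrambled control.
    """
    delta_ct_sample = ct_target - ct_ref              # dCt in the knockdown sample
    delta_ct_control = ct_target_ctrl - ct_ref_ctrl   # dCt in the control sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target appears two cycles later after
# knockdown, while the reference gene is unchanged.
fold = relative_expression(ct_target=27.0, ct_ref=18.0,
                           ct_target_ctrl=25.0, ct_ref_ctrl=18.0)
print(f"relative expression vs control: {fold:.2f}")
```

A value below 1 indicates reduced expression relative to the scrambled-shRNA control.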

Cellular Growth Assay
In total, 1000 SKOV3 cells or 1000 Hey cells were seeded per well in 96-well plates. Cell growth was examined with Cell Counting Kit-8 (CCK-8) (Beyotime, China). Briefly, the culture medium was removed at 24, 48, 96, and 120 h, and 10 µl of CCK-8 in 100 µl of medium was added to each well. Then, the cells were incubated at 37°C for 2 h, and the absorbance was measured at 450 nm. These experiments were repeated 3 times.

Colony Formation Assay
A total of 800 SKOV3 cells or 500 Hey cells were seeded in 6-well plates for plate colony formation. Each cell type was cultured at 37°C in a 5% CO2 atmosphere for 2 weeks. Then, 4% PFA (paraformaldehyde) was used to fix the colonies for 10 min, and the colonies were stained with crystal violet solution for 20 min. After washing three times with PBS, the number of colonies was counted. The assays were repeated 3 times.

EdU Cell Proliferation Assay
SKOV3 cells or Hey cells were seeded in 12-well plates and cultured for 24 h before EdU (C0078S, Beyotime, Shanghai, China) treatment. EdU was added to the medium, and the cells were incubated at 37°C for 2 h. The cells were fixed with 4% PFA for 20 min and permeabilized with 0.3% Triton X-100 for 15 min. The cells were then incubated with Click reaction buffer for 30 min for EdU staining. Hoechst 33342 (5 mg/mL) was used to stain the cells at 37°C for 10 min, followed by three washes with PBS. Images were captured using a fluorescence microscope.

Dataset Acquisition and Preprocessing
The TCGA lncRNA dataset was downloaded from the TANRIC database (https://bioinformatics.mdanderson.org/publicsoftware/tanric/). Clinical information and mRNA expression datasets of the corresponding patients were retrieved from UCSC Xena (http://xena.ucsc.edu/public-hubs/). The lncRNA and mRNA expression datasets were first log2 scaled and then z-score transformed for further analysis.
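The two preprocessing steps above (log2 scaling followed by z-score transformation) can be sketched for a single gene as follows. This is an illustration only: the pseudocount of 1 and the use of the sample standard deviation are assumptions, since the text does not specify either, and the expression values are made up.

```python
import math
import statistics


def log2_zscore(values, pseudocount=1.0):
    """Log2-scale one gene's expression values, then z-score them across samples.

    The pseudocount (an assumption) avoids log2(0) for unexpressed genes.
    """
    logged = [math.log2(v + pseudocount) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation (assumption)
    if sd == 0:
        # A gene with constant expression carries no information.
        return [0.0 for _ in logged]
    return [(x - mean) / sd for x in logged]


# Hypothetical expression values of one lncRNA across four samples.
expr = [0.0, 1.0, 3.0, 7.0]
z = log2_zscore(expr)
print([round(v, 3) for v in z])
```

After this transformation each gene has mean 0 and unit variance across samples, so Cox coefficients for different genes are on a comparable scale.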

Gene Selection and Model Development
The lncRNA expression dataset of the TCGA cohort was analysed by univariate Cox regression. The 246 genes significantly related to survival (P < 0.05) were used for model building. A random forest survival model was run with 100 iterations and 1000 trees per iteration to optimize the panel. The six genes selected with the highest frequency were used for model development. The selected genes were entered into a multivariate Cox regression model. The risk score was calculated using the following formula, given here in the conventional linear-predictor form of a Cox model:
$$\text{Risk score} = \sum_{i=1}^{6} \beta_i \, x_i$$
where x_i is the expression value of the i-th selected lncRNA and \beta_i is its coefficient in the multivariate Cox regression model.

Statistical Analysis
R (version 4.0.2) was used to analyse the data in this study. The R package "survival" was used for univariate Cox regression, survival analysis, and multivariate regression. The R package "randomForestSRC" was used for the random forest survival model and the risk score. ROC curves and AUC values were calculated with the R package "ROCR". Correlations were assessed with Pearson correlation coefficients, and gene enrichment analysis was performed using Metascape (http://metascape.org/gp/index.html). A two-sided P value < 0.05 was considered significant.
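The frequency-based selection described under Gene Selection and Model Development can be sketched as below. The sketch assumes that each random forest iteration yields a list of selected genes; how the original analysis derived those per-iteration lists with randomForestSRC is not described, so the input lists, the gene names and the function name are hypothetical.

```python
from collections import Counter


def top_by_frequency(selected_per_iteration, k=6):
    """Return the k genes selected most often across iterations.

    Each element of `selected_per_iteration` is the (hypothetical) list of
    genes chosen in one random-forest iteration. Ties are broken
    alphabetically so that the result is deterministic.
    """
    counts = Counter(
        gene for genes in selected_per_iteration for gene in set(genes)
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]


# Three toy iterations instead of the 100 used in the study.
iterations = [
    ["lncA", "lncB", "lncC"],
    ["lncA", "lncC", "lncD"],
    ["lncA", "lncB", "lncC"],
]
panel = top_by_frequency(iterations, k=2)
print(panel)
```

Counting across repeated runs favours genes whose selection is stable rather than dependent on a single random split.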

Feature Selection and Prognostic Value of lncRNAs in the TCGA Cohort
To identify lncRNAs related to OC prognosis, we first performed univariate Cox regression on the TCGA dataset (n = 411). Genes significantly related to overall survival were retained for further analysis, and a random survival forest model was built on these genes for prognosis prediction (Methods). Finally, the six genes with the highest prediction accuracy were selected to optimize the model, and a multivariate Cox proportional hazards model was constructed to calculate the risk score for each sample (Methods). The patients were separated into low-risk and high-risk groups by a median split of the risk scores, and the high-risk group showed significantly worse overall survival (Figure 1A) and progression-free survival (Figure 1B). Interestingly, four lncRNAs were positively correlated with the risk score (AL121820.1, AC006262.3, LINC02115, and AL138831.2), whereas the other two were negatively correlated with it (LINC01984-201 and AL713998.1) (Figure 1C). Compared with clinical factors, including age, tumour stage, lymphatic invasion, grade, and tumour size, the risk score showed much higher area under the curve (AUC) values for the prognostic prediction of OC patients (Figure 1D), suggesting that the risk score could serve as a new prognostic biomarker in clinical application.
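The risk-score, median-split and AUC steps described above can be illustrated with a small example. This is a sketch under stated assumptions: the risk score is taken to be the linear predictor of the Cox model, the coefficients and expression values are invented, and the AUC is computed on a binary event status, which ignores censoring and follow-up time.

```python
import statistics


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the panel genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def median_split(scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = statistics.median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


def auc(scores, events):
    """Rank-based AUC: probability that a sample with the event scores
    higher than a sample without it (ties count as 0.5)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients for a two-gene panel (the study used six lncRNAs).
coef = {"lnc1": 0.5, "lnc2": -0.3}
samples = {
    "P1": {"lnc1": 1.0, "lnc2": 0.0},
    "P2": {"lnc1": -1.0, "lnc2": 1.0},
    "P3": {"lnc1": 0.0, "lnc2": -1.0},
    "P4": {"lnc1": 2.0, "lnc2": 1.0},
}
scores = {s: risk_score(x, coef) for s, x in samples.items()}
groups = median_split(scores)
events = [1, 0, 0, 1]  # hypothetical event status for P1..P4
print(groups, auc(list(scores.values()), events))
```

A positive coefficient means that higher expression raises the risk score, and a negative coefficient lowers it, which corresponds to the positive and negative correlations with the risk score reported for the six lncRNAs.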

Target Genes of the lncRNAs Constituting the Risk Score
To identify genes targeted by the six lncRNAs with prognostic value in OC, we next calculated Pearson correlation coefficients between the expression levels of coding genes and the six lncRNAs. NEK9, PCNX, and ZFYVE26 were positively correlated with AL121820.1, whereas TIMM10, GPX1, and NDUFAF3 were negatively correlated with it, suggesting bidirectional regulation of target genes by AL121820.1 (Figure 2A). In contrast, genes showing high correlation coefficients (R > 0.4) with the other five lncRNAs were mostly positively correlated with their corresponding lncRNAs (Figures 2B-F). Specifically, CRLF3, which encodes a cytokine receptor-like factor that may negatively regulate cell cycle progression, was positively correlated with LINC01984-201 (10) (Figure 2B). Twenty-three genes were positively correlated with AC006262.3 (Figure 2C), and MYO15B and ZNF169 were positively correlated with LINC02115 (Figure 2D). Except for GRB14, which is associated with the immune response IL-23 signalling pathway and with cell surface interactions at the vascular wall, the other 13 genes were positively correlated with AL713998.1 (Figure 2E). The zinc finger proteins ZNF135 and ZNF311 were both positively correlated with AL138831.2 (Figure 2F).
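The correlation screen described above can be sketched as follows. The expression vectors and gene names are invented, and filtering on the absolute value of R (so that negatively correlated genes are also kept) is an assumption made for illustration; the text reports a threshold of R > 0.4.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlated_genes(lncrna_expr, mrna_table, threshold=0.4):
    """Return {gene: r} for coding genes with |r| above the threshold."""
    hits = {}
    for gene, values in mrna_table.items():
        r = pearson(lncrna_expr, values)
        if abs(r) > threshold:
            hits[gene] = r
    return hits


# Hypothetical z-scored expression across four samples.
lnc = [1.0, 2.0, 3.0, 4.0]
mrna = {
    "geneUp": [2.0, 4.0, 6.0, 8.0],     # rises with the lncRNA
    "geneDown": [4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
    "geneFlat": [1.0, -1.0, -1.0, 1.0], # no linear relationship
}
hits = correlated_genes(lnc, mrna)
print({g: round(r, 2) for g, r in hits.items()})
```

Co-expression of this kind identifies candidate targets only; it does not by itself establish that the lncRNA regulates the correlated gene.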
To systematically explore the roles of the lncRNA-associated genes, we further performed functional enrichment analysis with Metascape (11) on the mRNAs that were significantly correlated (P < 0.05) with the corresponding lncRNAs. The AL138831.2-correlated genes were significantly enriched in DNA repair, peptidyl-lysine modification, and negative regulation of I-κB kinase/NF-κB signalling pathways (Figures 3A, B). In addition, MCODE complexes of ZNF proteins were also enriched (Figure 3C), suggesting that AL138831.2 may interact with ZNF proteins in OC. Considering that high expression of AL138831.2 was associated with poor prognosis in OC, AL138831.2 might also contribute to OC development through these oncogenic signalling pathways. The AC006262.3-associated genes were significantly enriched in the regulation of cellular response to growth factor stimulus, the regulation of cell adhesion, the response to leukaemia inhibitory factor, and the MAPK signalling pathway (Figures 4A-C).
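The idea behind the enrichment analysis above can be illustrated with a one-sided hypergeometric test, which over-representation tools commonly use. This is a generic sketch and not a reproduction of Metascape's pipeline: the function name, the gene counts and the absence of multiple-testing correction are all simplifying assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, in_term, drawn, overlap):
    """P(X >= overlap) for X ~ Hypergeometric.

    population: number of background genes
    in_term:    background genes annotated to the pathway or term
    drawn:      size of the lncRNA-correlated gene list
    overlap:    genes in the list that are annotated to the term
    """
    upper = min(in_term, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(in_term, i) * comb(population - in_term, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 in the term, a list of 4 genes,
# 2 of which fall in the term.
p = hypergeom_enrichment_p(population=20, in_term=5, drawn=4, overlap=2)
print(f"enrichment p-value: {p:.4f}")
```

In a real analysis the p-values for all tested terms would also be adjusted for multiple testing before a term is called enriched.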

LncRNAs Reflected the Cell Motility Status
To validate the function of the six identified lncRNAs in OC development, we silenced the expression of each of these lncRNAs in the OC cell lines Hey and SKOV3 using shRNAs. The expression of the lncRNAs, as detected by q-PCR, was markedly decreased in the OC cells (Supplementary Figure 1). The EdU assay was used to study the effects of the lncRNAs on cell proliferation. Silencing AC006262.3, LINC02115, and AL138831.2 promoted the proliferation of Hey and SKOV3 cells compared with the control cells (Figures 5A, C and 6A, C). Consistent with the proliferation assay, silencing of AC006262.3, LINC02115, and AL138831.2 increased the colony number, and silencing of LINC01984-201 and AL713998.1 decreased the colony number in both Hey and SKOV3 cells compared with the control cells (Figures 5B, D and 6B, D). As shown in Figures 5E and 6E, the CCK-8 assay indicated that silencing AC006262.3, LINC02115, and AL138831.2 in Hey and SKOV3 cells promoted cell growth, whereas knockdown of LINC01984-201 and AL713998.1 restrained cell growth compared with the control cells. However, AL121820.1 had no obvious effect on cell proliferation.

DISCUSSION
This study carried out a multistep analysis of a lncRNA signature in OC. Based on the lncRNA expression dataset of the TCGA cohort, we used univariate Cox regression and a random forest survival model to establish a prognostic 6-lncRNA signature in OC. Functional analysis showed that tumour-related processes were significantly enriched. In addition, CCK-8, EdU proliferation, and colony formation assays were performed to validate the function of these lncRNAs. Thus, the 6-lncRNA combined signature is a robust model that could serve as a novel biomarker for OC prognosis.
As a common gynaecologic malignancy, OC is a leading cause of gynaecologic cancer death in women worldwide (12). OC is a histologically heterogeneous cancer caused by variations in genetic and environmental factors (13). An increasing body of evidence suggests that dysregulated lncRNAs play an important role in the progression of OC (14). As shown in Figure 2, lncRNAs could regulate the expression of coding genes in OC. The mechanisms by which lncRNAs regulate the expression of the selected mRNAs may include regulation of RNA stability (15), mRNA translation (16), chromatin modification (17), transcription activation (17,18), and transcription interference (19). The abnormal expression of lncRNAs could promote the proliferation, migration, invasion, and metastasis of OC by regulating mRNA expression and thereby contribute to the poor prognosis of OC (20). AC006262.3-related genes such as VASP (21), RUNX1 (22), and TRIM29 (23) (Figure 2C) have been confirmed to contribute to the progression of OC. Thus, aberrantly expressed lncRNAs may serve as new prognostic biomarkers in OC.
With the rapid development of high-throughput technologies such as gene chips and RNA sequencing, gene expression analysis has become a powerful tool for screening molecular biomarkers for tumour prognosis prediction (24). Recently, several studies demonstrated that combinations of biomarkers are more robust than a single biomarker (25). Our study analysed the available RNA-seq data of OC from the TCGA database and established a robust 6-lncRNA signature that could serve as a prognostic factor. As shown in Figure 3, the AL138831.2-related genes were indeed enriched in tumour-related networks, including "DNA repair", "negative regulation of NF-κB signalling", "glycogen metabolic process" and "ZNF proteins", indicating that the activation of these pathways may contribute to the higher mortality risk of patients with high risk scores. ZNF proteins are strongly implicated in the development of several tumours, including OC (26). Through the ZNF proteins, AL138831.2 might function in OC cell metastasis and serve as a prognosis-associated biomarker in patients with OC. Moreover, the Metascape and MCODE analyses in Figure 4 showed that the mRNAs positively co-expressed with AC006262.3 participate in "wound healing", "cell adhesion" and the "WNT signalling pathway", which are considered to be correlated with OC metastasis. Aberrant activation and upregulation of the WNT signalling pathway could contribute to the progression of OC and lead to poor prognosis (27). This suggests that AC006262.3 might influence the prognosis of patients with OC by regulating the WNT signalling pathway. These results support the view that lncRNAs could act as new biomarkers of OC and provide evidence for the clinical application of these 6 lncRNAs. Furthermore, because a recent study analysed the role of lncRNAs in BRCA-mutant ovarian cancer (28), we examined BRCA mutation information in 294 samples of our study. We found that (1) the BRCA mutation rate was very low (Supplementary Figure 2A) and (2) BRCA mutation did not affect the overall survival probability of ovarian cancer patients (P = 0.506, Supplementary Figure 2B). BRCA mutation therefore had no effect on the integrated results, and it appears to be unrelated to the lncRNAs screened in this study.
Since the risk score was calculated from the expression of candidate genes, the effectiveness of the model relies on the function of these genes. However, no studies have so far reported the biological functions of these 6 lncRNAs. We therefore carried out a preliminary exploration of their biological functions. As shown in Figures 5 and 6, silencing AC006262.3, LINC02115, and AL138831.2 promoted cell growth, whereas knockdown of LINC01984-201 and AL713998.1 restrained cell growth. However, the role of AL121820.1 in cell proliferation was not clear. Collectively, these results indicate that the genes selected in the model are functionally active in proliferation, which may explain their ability to predict survival in OC. There are some limitations in our study. (1) Our data are insufficient to demonstrate the pathophysiological functions of these lncRNAs; how they take part in cancer development needs to be validated in future work. (2) It is unclear why AL121820.1 did not affect the proliferation of OC cells. (3) The underlying mechanisms of these lncRNAs in OC need to be further investigated.

CONCLUSION
In conclusion, our study identified a 6-lncRNA combined signature associated with the prognosis of OC. Systematic analysis found that lncRNA-associated genes were enriched in oncogenic signalling pathways. Furthermore, five out of six lncRNAs were found to regulate the proliferation of OC cells.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
Ethical approval has been obtained from the Fudan University Shanghai Cancer Center Ethics Committee.

AUTHOR CONTRIBUTIONS
HL, SW and QY mainly took charge of researching, organizing data and writing the manuscript; YL polished the language in the review process; JY and LX used illustration software to arrange data; GY designed this study. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by grants from the National Natural Science Foundation of China (81372797 for GY).