Prediction of Clinical Outcome in Endometrial Carcinoma Based on a 3-lncRNA Signature

Endometrial carcinoma (EC) is one of the most common gynecological cancers, with increasing incidence and mortality in recent years. Given the heterogeneity of tumors and the complexity of lncRNAs, a panel of lncRNA biomarkers might be more precise and stable for prognosis than a single lncRNA. In the present study, we developed a new lncRNA model to predict the prognosis of patients with EC. EC-associated differentially expressed long noncoding RNAs (lncRNAs) were identified from The Cancer Genome Atlas (TCGA). Univariate Cox regression and the least absolute shrinkage and selection operator (LASSO) model were used to find 8 independent prognostic lncRNAs in EC patients. Furthermore, by multivariate Cox regression analysis, the risk score of the 3-lncRNA signature for overall survival (OS) was defined as CTD-2377D24.6 expression × 0.206 + RP4-616B8.5 × 0.341 + RP11-389G6.3 × 0.343. According to the median cutoff value of this prognostic signature, the EC samples were divided into two groups, a high-risk set (3-lncRNAs at high levels) and a low-risk set (3-lncRNAs at low levels), and the Kaplan-Meier survival curves demonstrated that the low-risk set had a higher survival rate than the high-risk set. In addition, the 3-lncRNA signature was closely linked with histological subtype (p = 0.0001), advanced clinical stage (p = 0.011), and clinical grade (p < 0.0001) in EC patients. In our clinical samples, qRT-PCR and in situ hybridization also confirmed that RP4-616B8.5, RP11-389G6.3, and CTD-2377D24.6 levels were increased in tumor tissues. Intriguingly, the combined p-value of the 3 lncRNAs was lower than that of each individual lncRNA, indicating that the 3-lncRNA signature distinguished EC tissue from paracancerous tissue better than any single lncRNA. Functional analysis revealed that cortactin might be involved in the mechanism of the 3-lncRNA signature. These findings provide the first hint that a panel of lncRNAs may play a critical role in the initiation and metastasis of EC, indicating a new signature for early diagnosis and therapeutic strategy of uterine corpus endometrial carcinoma.


INTRODUCTION
Globally, the incidence of uterine corpus endometrial carcinoma (UCEC) increased by 1.3% per year from 2007 to 2016, in part due to continued declines in the fertility rate as well as increased obesity (Siegel et al., 2020). In China, the incidence of EC has also been increasing since 2014, and EC ranks second among female reproductive malignancies on account of increased risk factors such as diabetes and obesity (Chen et al., 2019). Although EC generally has a good prognosis, with a 5-year overall survival (OS) of 74-91%, patients with advanced or metastatic EC still have a poor prognosis due to tumor metastasis and poor differentiation (Piulats et al., 2017). Histological classification and the International Federation of Gynecology and Obstetrics (FIGO) staging system are the traditional treatment guides and prognostic indicators (Pecorelli, 2009; Morice et al., 2016). However, distinct molecular characteristics have been demonstrated in cancers of the same stage and histology (Murali et al., 2014; Yang et al., 2016). With the development of precision medicine, therapeutic approaches based on molecular profiling have become available. In 2021, to improve outcomes of EC patients, the National Comprehensive Cancer Network (NCCN) recommended molecular classification for selecting appropriate treatment regimens. Four molecular subgroups were defined in 2013 based on the integrated genomic data of 373 endometrial carcinomas (Levine et al., 2013). Nevertheless, the integrated classification has had limited application due to its high expense and complex procedures. Therefore, identifying an efficient prognostic and diagnostic signature to guide clinical practice for EC is urgent.
Noncoding RNAs were initially regarded as mere transcriptional noise because they are not translated into proteins. However, numerous noncoding RNAs have been shown to have specific functions in cellular processes and to be dysregulated in human pathologies. Long noncoding RNAs (lncRNAs) are a class of noncoding transcripts more than 200 nucleotides in length. Compelling studies have reported that lncRNAs are associated with various human diseases, including cancer, through their wide participation in biological processes (Schmitt and Chang, 2016; Peng et al., 2017; Yang et al., 2019). Meanwhile, accumulating evidence supports the potential of lncRNAs as cancer biomarkers (Lim et al., 2019; Xie et al., 2020) and their prognostic value (He et al., 2014; Sun et al., 2021). For example, Liu et al. systematically discussed the EC-related lncRNAs and their roles in different cancer hallmarks, including tumor growth, metastasis, maintenance of cancer stem cells, and chemoresistance (Liu H et al., 2019). Some biomarkers for EC have already been identified using gene expression profile data. However, these models are limited to a specific stage or grade of EC. For example, one study identified a prognostic model for patients with early-stage EC using reverse-phase protein arrays (Yang et al., 2016). Others found prognostic value in immune-, metabolism-, or autophagy-related coding and noncoding RNAs for EC (Ouyang et al., 2019; Gao et al., 2020; Li and Wan, 2020; Wang et al., 2021). However, given the heterogeneity of EC and the complexity of lncRNAs, a panel of lncRNA biomarkers might be more precise and stable for predicting prognosis than a single lncRNA. Therefore, it is timely to investigate new lncRNA biomarkers by combining The Cancer Genome Atlas (TCGA) data with UCEC-specific data.
In the present study, we obtained the lncRNA expression profiles and clinical information of UCEC patients from the datasets of the TCGA project. By bioinformatic approaches, a potential 3-lncRNA signature was identified in EC, and the association between the signature and clinical characteristics was confirmed. Furthermore, clinical samples were used to demonstrate that the 3-lncRNA signature performs much better than the 3 individual lncRNAs, providing a new signature for early diagnosis and therapeutic strategy of EC.

Identification of Differentially Expressed Long Noncoding RNAs Associated with Uterine Corpus Endometrial Carcinoma from The Cancer Genome Atlas
We obtained lncRNA expression profiles of 548 UCEC tissues and 35 normal tissues from TCGA datasets to screen for differentially expressed lncRNAs (DElncRs). To obtain reliable and stable results, the lncRNA expression data were analyzed separately with the "DEseq," "edgeR," and "limma" R packages (Figures 1A-C), and the intersection of the three result sets was taken. Among the lncRNAs in the intersection, a set of 233 lncRNAs, including 93 upregulated and 140 downregulated, was abundantly expressed in all the UCEC samples (Figures 1D-F, Supplementary Table S1). These results suggested a role for differentially expressed lncRNAs in the initiation and progression of uterine corpus endometrial carcinoma.
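The intersection step described above can be illustrated with a minimal Python sketch. The lncRNA identifiers and the three result sets are hypothetical placeholders rather than the study's data, and the authors' own analysis was carried out in R.

```python
# Hypothetical sets of lncRNAs called differentially expressed by each tool.
deseq_hits = {"LNC_A", "LNC_B", "LNC_C", "LNC_D"}
edger_hits = {"LNC_B", "LNC_C", "LNC_D", "LNC_E"}
limma_hits = {"LNC_C", "LNC_D", "LNC_F"}


def consensus_hits(*hit_sets):
    """Return the identifiers reported by every method, sorted by name."""
    if not hit_sets:
        return []
    return sorted(set.intersection(*map(set, hit_sets)))


consensus = consensus_hits(deseq_hits, edger_hits, limma_hits)
print(consensus)
```

Keeping only the calls shared by all three methods trades sensitivity for robustness: a lncRNA flagged by a single tool is dropped.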

Validation of Prognostic Long Noncoding RNA Signature
Subsequently, univariate Cox regression analysis was conducted to estimate the prognostic relationship between DElncRs and the OS of EC patients, and 31 prognostic lncRNAs were obtained with p < 0.05 (Figure 2A). Furthermore, to minimize prediction errors, 9 lncRNAs were screened out using the LASSO regression method. Kaplan-Meier survival curves were used to further analyze the relationship between the 9 lncRNAs and the OS of EC patients. Ultimately, 8 lncRNAs were identified as being related to OS (Figures 2B-I).
Multivariable Cox regression analysis revealed the hazard ratios of the 8 lncRNAs for OS in endometrial carcinoma (Figure 2J). The area under the ROC curve (AUC) for OS was 0.71 (Figure 2K). These results implied that the 8-lncRNA model could efficiently identify the prognostic risk of EC.

Assessment of Prognostic Risk in Uterine Corpus Endometrial Carcinoma Patients Using a 3-Long Noncoding RNA Model
... 616B8.5 = 1.407, and RP11-389G6.3 = 1.409) with the lowest p-values (p < 0.1) were picked out for further investigation (Figure 3A). Based on the coefficients of the 3 prognostic lncRNAs from multivariate Cox regression analysis (Liu Y et al., 2019; Jiang et al., 2021), the risk score of the 3-lncRNA signature for OS was defined as CTD-2377D24.6 expression × 0.206 + RP4-616B8.5 × 0.341 + RP11-389G6.3 × 0.343. According to the median cutoff value of this prognostic signature, patients were divided into low-risk and high-risk sets. The survival results demonstrated that the low-risk set had a higher survival rate than the high-risk set (p < 0.0001, Figure 3B). To assess how well the 3 lncRNAs predict the overall survival of UCEC patients, AUC analysis was performed to compare the 3-lncRNA signature with each individual lncRNA. The results showed that the 3-lncRNA signature performed better than each individual lncRNA and each combination of two lncRNAs (Supplementary Figure S1, S2).
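The survival comparison between the two risk sets rests on the Kaplan-Meier estimator, which can be sketched in a few lines of Python. The follow-up times and event indicators below are hypothetical and serve only to show the calculation; the sketch does not include the log-rank test behind the reported p-value.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in months (illustration only).
low_risk = kaplan_meier([12, 24, 36, 48, 60], [0, 1, 0, 0, 0])
high_risk = kaplan_meier([6, 12, 18, 24, 30], [1, 1, 0, 1, 0])
print(low_risk)
print(high_risk)
```

In this toy example the high-risk curve drops at three time points while the low-risk curve drops once, which is the qualitative pattern the text describes for Figure 3B.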

Correlation Between the 3-Long Noncoding RNA Signature and Clinical Characteristics of The Cancer Genome Atlas-Uterine Corpus Endometrial Carcinoma
To better understand the prognostic value of the 3-lncRNA signature, we further evaluated the relationships between the 3-lncRNA signature and traditional clinical characteristics. According to the median expression of CTD-2377D24.6, RP4-616B8.5, and RP11-389G6.3 and the median 3-lncRNA signature risk score, the UCEC samples were divided into two sets. Pearson chi-square or Fisher's exact tests revealed that the 3-lncRNA signature was closely linked with histological subtype (p < 0.0001), advanced clinical stage (p = 0.011), and clinical grade (p < 0.0001) (Table 1). Compared to the low-risk set, the high-risk set tended to be serous adenocarcinoma (SAC), a histopathological type with worse differentiation and distant metastasis. These results demonstrated that 3-lncRNA

Expressions of 3 Long Noncoding RNAs in Paracancerous and Tumor Tissues of Uterine Corpus Endometrial Carcinoma Patients
In addition, we validated the expression of the 3 lncRNAs in 30 paired paracancerous and tumor tissues from UCEC patients. First, the transcript abundances of RP4-616B8.5, RP11-389G6.3, and CTD-2377D24.6 were evaluated by qRT-PCR. The expression of RP11-389G6.3 and CTD-2377D24.6 was significantly higher in tumor tissues (p = 0.023 and p = 0.002, respectively), while the expression of RP4-616B8.5 did not differ significantly between tumor and paracancerous tissues (p = 0.087) (Figure 4A). The combined p-value of the 3 lncRNAs was 0.027 using Hotelling's T² test (F = 3.56) (Figure 4B).
Furthermore, an in situ hybridization assay was used to confirm the expression of the lncRNAs in paracancerous and tumor tissues of UCEC patients (Figure 4C). The staining scores of RP4-616B8.5, RP11-389G6.3, and CTD-2377D24.6 in EC tissues were significantly higher than those in paracancerous tissues, with p-values of 0.042, 0.005, and 0.011, respectively (Figure 4D), and the combined p-value of the 3 lncRNAs was 0.0002 using Fisher's method (χ² = 25.84) (Figure 4E). These results indicated that the 3-lncRNA signature performed better than the 3 individual lncRNAs for EC diagnosis.
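Fisher's method for combining p-values can be reproduced with a short Python sketch. It assumes the three tests are independent, which the method requires, and uses the three in situ hybridization p-values quoted above as input. Because the statistic has an even number of degrees of freedom, the chi-square tail probability has a closed form and no statistics library is needed.

```python
import math


def fisher_combined(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (chi_square_statistic, combined_p). For k p-values the
    statistic follows a chi-square distribution with 2k degrees of
    freedom, whose upper tail has a closed form.
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return stat, tail


stat, combined_p = fisher_combined([0.042, 0.005, 0.011])
print(round(stat, 2), round(combined_p, 4))
```

With the rounded p-values from the text, the statistic comes out at about 25.96 and the combined p-value at about 0.0002, close to the reported χ² = 25.84; the small gap in the statistic is presumably due to rounding of the published p-values.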

Functional Analysis of 3-Long Noncoding RNA Signature in Uterine Corpus Endometrial Carcinoma
To explore the potential roles of the 3-lncRNA signature in UCEC, differentially expressed mRNAs (DeRNAs) between the high-risk (3-lncRNAs at high levels) and low-risk (3-lncRNAs at low levels) groups were identified (Supplementary Table S2), and KEGG and GO analyses were conducted. Functional enrichment analysis revealed that these DeRNAs were significantly enriched in 5 KEGG pathways, including carcinogenesis, drug metabolism and resistance, fluid shear stress, and steroid hormone biosynthesis (Figure 5A), as well as 20 GO terms in biological processes, 10 GO terms in cellular components, and 10 GO terms in molecular functions (Figure 5B), indicating that drug metabolism, chemical carcinogenesis, and cell motility-related pathways might be involved. Given the importance of cortactin for invadopodia formation, cancer cell migration, and metastasis (Ji et al., 2020), we examined the mRNA expression levels and location of cortactin by qRT-PCR and immunohistochemical staining. The results showed that two cortactin-encoding genes, CTTN and HCLS1, were markedly increased in tumor tissues (Figures 5C,D). Interestingly, immunohistochemical staining revealed that cortactin was expressed in gland duct cells but not in supporting cells (Figure 5I). In addition, the most strongly differentially expressed mRNAs between the high-risk (3-lncRNA signature at high levels) and low-risk (3-lncRNA signature at low levels) groups were also examined. DNAH5, LTF, and Ezrin were significantly increased in tumor tissues (p < 0.05, Figures 5E-G), and WNT7A displayed a slight increase (p = 0.0669, Figure 5H). These results indicated that cortactin might be associated with the function of the 3-lncRNA signature.

DISCUSSION
A growing body of literature has demonstrated that dysregulated lncRNAs are involved in various diseases, including cancers (Evans et al., 2016). lncRNAs might be promising biomarkers for cancer diagnosis, treatment, and prognosis prediction. Due to the heterogeneity of tumors, a panel of lncRNAs is more precise than a single lncRNA. In the present study, using Cox and LASSO regression, a 3-lncRNA signature was identified for predicting the OS of EC patients. According to the median cutoff value of the 3-lncRNA model, we demonstrated that the high-risk set displayed poorer survival and a higher clinical stage and clinical grade, and tended to be serous adenocarcinoma, a histopathological type with worse differentiation and distant metastasis. In our clinical samples, qRT-PCR and in situ hybridization also confirmed that the levels of the 3 lncRNAs, RP4-616B8.5, RP11-389G6.3, and CTD-2377D24.6, were higher in EC tissues than in paracancerous tissues. These findings provide an important hint that the 3-lncRNA signature has potential for EC diagnosis and prognosis.
In the present study, we verified by qRT-PCR and in situ hybridization assays that the expression of RP4-616B8.5, RP11-389G6.3, and CTD-2377D24.6 was higher in EC tissues than in paracancerous tissues. According to the median cutoff value of the 3-lncRNA signature, patients were divided into low-risk and high-risk sets, and DeRNAs were identified. KEGG and GO analyses found that drug metabolism, chemical carcinogenesis, and cell motility-related pathways were enriched, indicating the potential roles of a panel of lncRNAs in the initiation, metastasis, and chemoresistance of EC. Many published reports suggest that cell motility at an early stage in cancer correlates with metastasis (Lambert et al., 2017). In particular, the importance of cortactin for invadopodia formation, cancer cell migration, and metastasis has been proven (Schnoor et al., 2018; Ji et al., 2020). However, the link between lncRNAs and cortactin in endometrial carcinoma remains unclear.
Here, we demonstrated that cortactin was markedly increased in UCEC tumor tissues and was expressed especially in gland duct cells. In addition, the most strongly differentially expressed mRNAs between the high-risk and low-risk groups, such as DNAH5, LTF, and Ezrin, were significantly increased in tumors. These findings indicate that more comprehensive studies of the molecular mechanism of the 3 lncRNAs are still needed.
Traditional therapeutic strategies and risk stratification for EC patients are based on clinical and histological characteristics. However, the conventional classification does not adequately depict tumor biology owing to the high heterogeneity of EC. Recently, molecular or genomic classification has drawn attention as a promising approach to predict cancer prognosis. Levine et al. assessed the genome, transcriptome, and proteome of 373 endometrial carcinomas. Based on integrated genomic data, the tumors were classified into four subgroups: POLE ultramutated, microsatellite instability hypermutated, copy-number low, and copy-number high. Subsequently, another molecular classification for EC termed "ProMisE" was identified (Talhouk et al., 2015). A similar integrated risk profile was established by the TransPORTEC international consortium (Stelloo et al., 2015; Stelloo et al., 2016). However, compared with genome sequencing, our 3-lncRNA signature is more suitable for clinical diagnosis and classification due to its higher stability and lower cost. By bioinformatic approaches and verification in clinical samples, we demonstrated that the 3-lncRNA signature might be a reliable prognostic biomarker.
Frontiers in Cell and Developmental Biology | www.frontiersin.org | February 2022 | Volume 9 | Article 814456
However, there are several limitations in our study. First, the 3-lncRNA signature was constructed from the TCGA-UCEC datasets, in which most patients were Caucasian, so its prognostic value in other populations needs to be validated. Second, we detected the individual differences of the 3 lncRNAs between paracancerous tissues and UCEC tissues in our clinical samples, but the prognostic value of the signature was not analyzed due to insufficient prognostic data. Third, the 3 lncRNAs have rarely been reported, and their potential functions are unclear. Although functional enrichment analysis based on the DeRNAs between the high- and low-risk sets was performed, the potential mechanisms should be further investigated experimentally.
In conclusion, we identified a potential 3-lncRNA signature that could accurately predict outcomes for UCEC patients. Meanwhile, we found that the 3-lncRNA signature was closely associated with clinical characteristics. Furthermore, we validated the differential expression of the 3 lncRNAs in our clinical samples, indicating that the panel of 3 lncRNAs exhibited better performance for EC diagnosis. These findings provide the first hint that the set of lncRNAs may play a critical role in the initiation and metastasis of EC, indicating a new signature for early diagnosis and therapeutic strategy of uterine corpus endometrial carcinoma.

Construction and Assessment of Long Noncoding RNA-Based Prognostic Signature
First, univariate Cox regression and the least absolute shrinkage and selection operator (LASSO) model were used to find the independent prognostic lncRNAs. The LASSO method was performed with the "glmnet" package in R. Subsequently, multivariate Cox regression analysis was used to construct the prognostic signature. The risk score formula was constructed as follows: gene 1 × b1 + gene 2 × b2 + gene 3 × b3 + ··· + gene n × bn, in which gene represents the expression level of each gene and b represents the respective coefficient. The risk score of the prognostic signature was then calculated as follows: CTD-2377D24.6 expression × 0.206 + RP4-616B8.5 × 0.341 + RP11-389G6.3 × 0.343. According to the median risk score, the TCGA-UCEC patients were divided into a high-risk set and a low-risk set. To evaluate the prognostic signature of lncRNAs, Kaplan-Meier and time-dependent receiver operating characteristic (ROC) curve analyses were performed.
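The risk score and the median split can be written out as a small Python sketch. The three coefficients are the ones given in the text; the patient identifiers and expression values are hypothetical, and assigning a score exactly equal to the median to the low-risk set is an assumption made here for illustration.

```python
from statistics import median

# Coefficients reported in the text (multivariate Cox regression).
COEFFICIENTS = {
    "CTD-2377D24.6": 0.206,
    "RP4-616B8.5": 0.341,
    "RP11-389G6.3": 0.343,
}


def risk_score(expression):
    """Weighted sum of the expression values of the three lncRNAs."""
    return sum(COEFFICIENTS[g] * expression[g] for g in COEFFICIENTS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values for four patients (illustration only).
patients = {
    "P1": {"CTD-2377D24.6": 1.0, "RP4-616B8.5": 2.0, "RP11-389G6.3": 1.0},
    "P2": {"CTD-2377D24.6": 4.0, "RP4-616B8.5": 3.0, "RP11-389G6.3": 5.0},
    "P3": {"CTD-2377D24.6": 0.5, "RP4-616B8.5": 0.5, "RP11-389G6.3": 0.5},
    "P4": {"CTD-2377D24.6": 3.0, "RP4-616B8.5": 4.0, "RP11-389G6.3": 2.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

Because all three coefficients are positive, higher expression of any of the three lncRNAs raises the score, which matches the description of the high-risk set as having the 3 lncRNAs at high levels.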

Functional Enrichment Analysis
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were performed with the "clusterProfiler" R package to identify the function of the lncRNA-based signature (Yu et al., 2012). Functional categories with both a p-value and a false discovery rate (FDR) < 0.05 were considered significant.
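The filtering rule can be illustrated with a Python sketch of an FDR adjustment. The text does not state which adjustment was used, so the Benjamini-Hochberg procedure below is an assumption, and the enrichment p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p-values for five terms (illustration only).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 and q < 0.05 for p, q in zip(raw, adj)]
print(adj)
print(significant)
```

In this toy input, two terms with raw p-values below 0.05 are no longer significant after adjustment, which is why requiring both thresholds is stricter than filtering on the raw p-value alone.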

RNA Extraction and Quantitative Real-Time Polymerase Chain Reaction
Total RNA was extracted using TRIzol (Life Technologies, NY, United States). The concentration and integrity of RNA were verified by a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, United States). Afterward, the total RNA was reverse-transcribed into cDNA using the PrimeScript RT reagent kit (Takara, Dalian, China). The expression of the 3 lncRNAs was measured by qRT-PCR using the Hiff qPCR SYBR Green Master Mix (Yeasen Biotech Co., Shanghai, China) on the QuantStudio 6 system (Applied Biosystems, Waltham, MA). The primers used are listed in Table 2.

Ethics Statement
Publicly available TCGA datasets were analyzed in this study, and approval from a local Ethics Committee was not necessary for these data. For human subjects, all procedures were carried out according to the Declaration of Helsinki and institutional guidelines and were approved by the Ethics Committee at the First Affiliated Hospital of Soochow University.

In situ Hybridization Assay
Paraffin-embedded UCEC and adjacent normal tissues were stained to detect lncRNA expression. The lncRNA probes were designed and produced by SimaifuBio (Suzhou, Jiangsu, China). The probe sequences are presented in Table 3. In brief, sections were deparaffinized, digested, and blocked with 3% methanol-H2O2; prehybridization solution was then applied, and the sections were incubated for 1 h at 37°C. After the excess liquid was absorbed, hybridization solution containing the indicated lncRNA probes was added, and the sections were incubated at 42°C overnight. The next day, after washing, blocking solution was applied, and the sections were incubated for 30 min at room temperature. After that, digoxigenin-labeled peroxidase antibody was added, and the sections were incubated for 40 min at 37°C. DAB was then added for color development, and the positive signal appeared brown-yellow. Hematoxylin staining solution was used to stain the nuclei. CaseViewer 2.2.1 (3DHISTECH Ltd.) and Image Pro Plus 6 were used for image capture and analysis, respectively.

Statistical Analysis
All expression profiles and clinical information were obtained from TCGA using R software. All statistical analyses were carried out using SPSS 23.0 (SPSS, Chicago, IL, United States) or R software. For continuous variables, Student's t-test was used to compare the difference between two groups. For categorical variables, the χ² test was used to compare the differences among groups. Fisher's method and Hotelling's T² test were used to combine p-values. p < 0.05 was considered statistically significant.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The Ethics Committee at the First Affiliated Hospital of Soochow University. The patients/participants provided their written informed consent to participate in this study.