Identification of Epithelial–Mesenchymal Transition-Related Prognostic lncRNAs Biomarkers Associated With Melanoma Microenvironment

Melanoma is the most common cancer of the skin, associated with a worse prognosis and distant metastasis. Epithelial–mesenchymal transition (EMT) is a reversible cellular biological process that plays significant roles in diverse tumor functions, and it is modulated by specific genes and transcription factors. The relevance of EMT-related lncRNAs in melanoma has not been determined. Therefore, RNA expression data and clinical features were collected from the TCGA database (N = 447). Melanoma samples were randomly assigned into the training (315) and testing sets (132). An EMT-related lncRNA signature was constructed via comprehensive analyses of lncRNA expression level and corresponding clinical data. The Kaplan-Meier analysis showed significant differences in overall survival in patients with melanoma in the low and high-risk groups in two sets. Receiver operating characteristic (ROC) curves were used to measure the performance of the model. Cox regression analysis indicated that the risk score was an independent prognostic factor in two sets. Besides, a nomogram was constructed based on the independent variables. Gene Set Enrichment Analysis (GSEA) was applied to evaluate the potential biological functions in the two risk groups. Furthermore, the melanoma microenvironment was evaluated using ESTIMATE and CIBERSORT algorithms in the risk groups. This study indicates that EMT-related lncRNAs can function as potential independent prognostic biomarkers for melanoma survival.


INTRODUCTION
Melanoma is the fifth and seventh most common cancer in males and females, respectively (Wu et al., 2020). Currently, with a growth rate of 3-5% per year, there are an estimated 20,000 new cases of melanoma each year (Li and Chen, 2013). Numerous treatment methods such as surgery, chemotherapy, radiotherapy, immunotherapy, and targeted therapy are used in the treatment of localized tumors. Surgery remains the main treatment option for melanoma with 90% cure rates.
However, for advanced tumors, the mortality rate of melanoma patients is high (Mishra et al., 2018). The recent development of immunotherapy has greatly improved the treatment of melanoma. However, the prognosis of patients with melanoma remains poor, mainly due to resistance to chemotherapy and/or radiotherapy, aggressive clinical behavior, and higher metastatic potential, resulting in high mortality rates (Bommareddy et al., 2017). Thus, there is an urgent need to identify novel biomarkers or therapeutic targets for predicting melanoma progression, prognosis, and treatment effects.
Epithelial-mesenchymal transition (EMT) is a cellular biological process by which epithelial cells lose their cell-cell adhesion and polarity features and gain the migratory and invasive features to become mesenchymal cells (Pastushenko and Blanpain, 2019;Bakir et al., 2020). Studies have shown that EMT plays diverse roles in tumor functions, including tumorigenesis, tumor progression, metastasis, migration, tumor stemness, intravasation, and chemotherapy resistance (Brabletz, 2012;De Craene and Berx, 2013;Nieto et al., 2016). As an EMT-related gene, RhoC is a member of the Ras-homologous (Rho) GTPase family, has been reported to be associated with worse disease-free survival and overall survival in melanoma patients (Boone et al., 2009). Besides, the overexpression of coxsackie and adenovirus receptor (CXADR) is not only associated with worse survival of melanoma patients but also enhances tumor progression via EMT (Wenandy et al., 2008). Overexpression of CXADR seems to resistant breast cancer cells to TGFβ-induced EMT and its expression associates with overall survival (Nilchian et al., 2019). Besides, EMT may promote tumor growth and metastasis not only through a direct reordering of cancer cells but also through reprogramming of the immune response in the tumor microenvironment (Lou et al., 2016). Thus, the EMT-related genes/lncRNAs may act as potential targets for future melanoma treatment. However, the relationship between the expression of EMT-related lncRNA, their prognostic value, and the pathological characteristics of melanoma has not been investigated.
In this study, we constructed a prognostic model using lncRNA expression data and relevant clinical data of melanoma obtained from the TCGA database. To build a prognostic model, we relied on the expression of EMT-related lncRNA, and we analyzed the correlation between the subgroups and the tumor microenvironment.

Public Data Sources
FPKM lncRNA expression data and relevant clinical of melanoma patients were downloaded from The Cancer Genome Atlas (TCGA). The merge script by Perl and Ensembl database were used to combine RNA-seq results into a matrix of gene symbols. The melanoma dataset contained 471 melanoma tissues and 1 adjacent normal tissue. To enhance the quality analysis, cases without complete survival data as well as those whose overall survival (OS) was less than 30 days were excluded from the study. A total of 447 participants were included in the analysis.

Correlation Analysis
A total of 200 EMT-related genes were downloaded from the MSigDB v7.2 (Molecular Signature Database). Pearson's correlation analysis was applied to determine the relationship between expressed EMT-related genes and the expression of lncRNAs in melanoma patient samples (correlation coefficient > 0.35, p < 0.001). We then randomly divided the melanoma samples into the training and testing cohorts using the R package "caret." The expression levels of EMT-related lncRNAs in the training set were further used to construct the prognostic model.

Construction and Validation of the Prognostic Signature
Univariate Cox regression analysis was performed using the "survival" package in R to determine the EMT-related lncRNAs associated with the overall survival of melanoma patients in the training set (p < 0.01). LASSO regression analysis was performed using the "glmnet" package in R to determine the most significant lncRNAs. We built a risk signature through multivariate Cox regression using the "survminer" package in R. The formula used was: risk score = i coefficient (lncRNAi) × expression (lncRNAi). Thus, the samples in the training and testing sets were classified as low vs. high groups based on the median risk score. The Kaplan-Meier survival curves were drawn using the "survival" package. Moreover, timedependent receiver-operating characteristic (ROC) was used to examine the prognostic value of the risk signature using the "survivalROC" package.

Evaluation of the Independent Prognostic Value of the Risk Score
Univariate and multivariate Cox regression analyses were performed based on the risk score and available clinical data in the training and testing sets to verify whether the risk score is an independent prognostic predictor.

Construction and Verification of the Nomogram
The nomogram was constructed from the 1-, 3-, and 5-year overall survival data of melanoma patients and the independent prognostic factors using the "rms" R package. The 3-, 5year calibration plots were used to evaluate the accuracy of the nomogram.

GSEA Enrichment Analysis
GSEA (Gene Set Enrichment Analysis) was employed to identify the biological functions and pathways associated with the high and low-risk groups. FDR < 5% and P < 0.05 were considered to be statistically significant.

Evaluation of Tumor Microenvironment
ESTIMATE was designed to calculate the immune and stromal scores within the tumor microenvironment based on the expression of immune and stromal cells (Yoshihara et al., 2013). ESTIMATE algorithm was used to calculate the immune score, stromal score, ESTIMATE score, and tumor purity and also analyze the association between the risk groups and tumor microenvironment. Kaplan-Meier method was applied to determine the relationship between the scores and overall survival of melanoma patients.

Evaluation of Immune Cells Infiltration
To further explore the difference between immune cell infiltration in the low and high-risk groups, the CIBERSORT algorithm was used to determine the fraction of 22 immune cells from gene expression data in melanoma cases (Newman et al., 2015). Samples with a CIBERSORT output of P-value less than 0.05 were used for further analyses. We used the Wilcoxon rank-sum test to identify the fraction of immune cells that had remarkable differences in the proportion between the two risk groups (high vs. low). Kaplan-Meier method was used to determine the relationship between the 22 infiltrating immune cells and OS of melanoma patients. Besides, Pearson correlation coefficient was applied to evaluate the correlation between risk score and immune cells.

Statistical Analysis
All statistical data were analyzed by R (version 4.0.2) and Perl. Chi-square or fisher test to analyze categorical variables, whereas Wilcoxon test was used to analyze the continuous data. The difference in survival was computed utilizing the K-M and the log-rank test methods. P < 0.05 was considered statistically significant.

EMT-Associated lncRNA
A total of 200 EMT-related genes were obtained from the MSigDB database (M340) (Supplementary Table 1). We distinguished EMT-related lncRNAs via correlation analysis. Based on the criteria, we identified 667 EMT-related lncRNAs. The expression Frontiers in Cell and Developmental Biology | www.frontiersin.org level of 667 EMT-related lncRNAs and the clinical information of the 447 melanoma samples were used for further analysis. The 447 melanoma patients were randomly assigned into the training (315) and testing sets (132) using the "caret" package.

Construction and Verification of the EMT-Related lncRNA Signature
Here, 315 melanoma samples with overall survival of over 1 month, and 667 EMT-related lncRNAs from TCGA-SCKM were used to construct the prognostic model. Firstly, univariate Cox regression analysis revealed that a total of 49 EMT-related lncRNAs had a significant association with overall survival (Supplementary Table 2) in the training set. Then, the LASSO regression analysis of the 49 EMT-related lncRNAs identified 16 EMT-related lncRNAs which were highly associated with OS ( Figures 1A,B). Finally, from these 16 EMT-related lncRNAs, 8 EMT-related lncRNAs were determined utilizing the multivariate Cox regression analysis and employed to establish an eight EMT-related lncRNAs signature model, and among them, three lncRNAs (LINC02560, AC009495.1, and AC124804.1) were associated with increased risk with HRs > 1, while the other five lncRNAs (AC083799.1, AC036108.3, HLA-DQB1-AS1, SPRY4-AS1, and TRG-AS1) were protective lncRNAs with HRs < 1 (Supplementary Table 3 and Figure 1C). The risk score was established using the following formula: Risk To determine the potential prognostic value of the EMTrelated lncRNA signature in predicting the OS of melanoma patients, we divided the patients into two risk groups (high vs. low) based on the median risk score. The Kaplan-Meier method was used to evaluate the difference in OS between the two risk groups. As shown in Figures 2A,B, patients in the low-risk group had better survival compared with patients in the high-risk group in both the training and testing cohorts. The 5-year survival rate in the low-risk group was 72.32%, while that in the high-risk group was 40.11%. ROC curves were used to demonstrate the precision of the eight EMT-related lncRNAs in predicting the 1-, 3-, and 5-year OS of melanoma patients. The 1-, 3-, and 5-year survival AUC values were 0.771, 0.686, and 0.746 in the training set, and 0.682, 0.634, and 0.663 in the testing set ( Figures 2C,D). Additionally, Figures 3A-F displays the heatmap and status of the training and testing set showing the melanoma samples in the high-risk group associated with more deaths, and the expression of LINC02560, AC009495.1, and AC124804.1. However, the lowrisk group highly expressed AC083799.1, AC036108.3, HLA-DQB1-AS1, SPRY4-AS1, and TRG-AS1.

Performance Comparison of EMTLncSig With Other lncRNA Signatures
We compared the prediction value of the EMT-related lncRNA signature with two published lncRNA signatures: one lncRNA signature derived from Chen's study and another lncRNA signature derived from Yang's study, and both studies utilized the same data from the TCGA database. The AUC value at 3 years for the EMT-related lncRNA signature was 0.686, which was higher than that reported by ChenlncSig (0.665) and YanglncSig (0.496) ( Figure 2E). This illustrates the good prognostic performance of the EMT-related lncRNA signature in predicting survival compared with the two previously reported signatures.

Identification of Independent Prognostic Factors
We conducted univariate and multivariate Cox regression analysis to determine if the risk score can be considered as an independent prognostic predictor of overall survival in melanoma patients. The univariate Cox regression analysis demonstrated that the risk score was an independent variable for forecasting the prognosis of melanoma patients in the training (P < 0.001, HR = 1.326, 95% CI: 1.199-1.467 and P < 0.001, Figure 4A) and testing (P < 0.001, HR = 1.326, 95% CI: 1.199-1.467 and P < 0.001, HR: 2.432, 95% CI: 1.537-3.847, Figure 4C) sets. The multivariate Cox regression analysis also illustrated that the risk score was an independent prognostic factor for melanoma patients in both training (P < 0.001, HR = 1.215, 95% CI: 1.086-1.358, Figure 4B) and testing (P = 0.011, HR = 1.934, 95% CI: 1.163-3.217, Figure 4D) sets. Besides, to investigate the prognostic value of the signature in melanoma patients classified by clinical factors, we divided the samples into different groups according to age, gender, stage, T, N, and M. As displayed in Figure 5, the OS of melanoma patients in the high-risk group was significantly poorer than that of the low-risk group (P < 0.05). Both outcomes showed that EMT-related lncRNA signature had the potential to predict the overall survival of melanoma patients.

Construction and Validation of the Nomogram
To predict the 1-, 3-, and 5-years overall survival of the patients, we established the nomogram using the risk score and other independent clinical variables in the training set ( Figure 6A). The calibration curve displayed good performance for predicting the 3-and 5-years OS in melanoma patient samples (Figures 6B,C).

Identification of GSEA-Derived EMT-Related lncRNA Signature
We used GSEA to analyze the different pathways that were significantly enriched between the two risk groups (low vs. high). Samples in the high-risk group were enriched in glyoxylate and dicarboxylate metabolism, while samples in the lowrisk group were enriched in the toll-like receptor signaling pathway, chemokine signaling pathway, natural killer cellmediated cytotoxicity, cytokine-cytokine receptor interaction, and antigen processing and presentation (Figure 7).

The Relationship Between Tumor Microenvironment and Risk Groups
As displayed in Figures 8A,B, patients in the low-risk group had a significantly higher stromal score, immune score, and ESTIMATE score; and a remarkably lower tumor purity score in the whole (training and testing) set. Moreover, patients with high ESTIMATE scores and/or immune scores were associated with a higher survival rate, while those with higher tumor purity had a lower survival rate (Figures 8C-E).

Characteristics of Immune Cells Infiltration in Melanoma
The profiling of infiltrating immune cells in melanoma samples was performed using the CIBERSORT algorithm. Figures 9A,B show that Macrophages M0, CD8 T cells, naive B cells, resting memory CD4 T cells and Macrophages M2 were highly expressed   Frontiers in Cell and Developmental Biology | www.frontiersin.org in melanoma. Wilcoxon rank-sum test indicated that there were notable differences in B cells memory (P = 0.002), activated memory CD4 T cells (p < 0.001), T follicular helper cells (p = 0.014), resting NK cells (p = 0.009), Macrophages M0 (P = 0.010), Macrophages M1 (P < 0.001), Macrophages M2 (P = 0.008), and resting Mast cells (P = 0.012) between the two risk groups (high vs. low) in the whole (training and testing) set ( Figure 9C). The memory B cells, activated memory CD4 T cells, T follicular helper cells and Macrophages M1were higher expressed in the low-risk group, while the resting NK cells, Macrophages M0, Macrophages M2 and resting Mast cells were higher expressed in the high-risk group (Figure 9C). Kaplan-Meier curves demonstrated that patients with high fractions of Macrophages M1, activated memory CD4 T cells, and CD8 T cells were associated with higher survival rates, while those with higher fractions of activated Mast cells are on the contrary (p < 0.05, Figures 9D-G). Furthermore, we explored the correlation between risk score and immune cells.

DISCUSSION
Melanoma is one of the most aggressive cancers, associated with a worse prognosis and distant metastasis. It is important to identify new potential markers to improve the prognosis and treatment of melanomas. Numerous studies have shown that EMT plays a role in tumorigenesis, migration, metastasis, chemotherapy resistance, etc. (Mittal, 2018;Saitoh, 2018). The majority of studies have concentrated on the influence of EMT in tumor progression, resistance, and metastasis, and only a few studies have explored the prognostic value of EMT-related lncRNAs/genes in cancers, particularly in melanoma.
Here, we aimed to construct and validate a new EMT-related lncRNAs prognostic signature for melanomas. The AUC values calculated from the ROC curves in the training and testing sets showed that the prognostic signature had a strong ability for differentiating melanomas. Besides, the K-M curve, score plot, plot of survival status showed a significant relationship between the risk score and clinical features which supported the robustness of the prognostic value of our signature. Moreover, the risk score and relevant clinical features were found to be important independent prognostic factors. A nomogram was constructed to predict the prognosis of melanomas and may help to individualize treatment in melanoma patients. EMTrelated lncRNA signature holds a significant prognostic value and offers a theoretical basis for EMT-related targeted treatment. Besides, GSEA results showed that most of the biological pathways were enriched in the low-risk group and that EMT plays a more regulatory role in the low-risk group. We further evaluated the melanoma samples according to their tumor microenvironment in the two risk groups using ESTIMATE and CIBERSORT algorithms.
Furthermore, among eight lncRNAs, LINC02560, AC009495.1, and AC124804.1 were unfavorable lncRNA, while AC083799.1, AC036108.3, HLA-DQB1-AS1, SPRY4-AS1, and TRG-AS1 were protective lncRNA. The majority of lncRNAs have been explored in different types of cancers. For example, LINC02560 has been demonstrated to be a potential biomarker and significantly associated with the overall survival of tongue squamous cell carcinoma of patients (Zhou et al., 2019). HLA-DQB1-AS1 acts as a potential biomarker and guides future therapy in lung adenocarcinoma (Jin et al., 2020). PRY4-AS1, which is highly expressed in tumor tissue of Laryngeal Squamous Cell Carcinoma, is an important biomarker in Laryngeal Squamous Cell Carcinoma (Gong et al., 2020). AC124804.1 shows a remarkable prognostic value in lung adenocarcinoma, where it predicts the overall survival and is associated with tumorigenesis and progression (Yu and Zhang, 2020). TRG-AS1 is high expressed in glioblastoma tissues and cells, is associated with poor prognosis, and stimulates glioblastoma cell proliferation by suppressing the expression of miR-877-5p to dysregulate the expression of SUZ12 (Xie et al., 2019). Overexpression of TRG-AS1 is associated with advanced TNM stage, lymph node metastasis, and shorter overall survival, which promotes cell proliferation, invasion, and migration of TSCC (tongue squamous cell carcinoma) by mediating the miR543/YAP1 axis . Besides, TRG-AS1 is overexpressed in hepatocellular carcinoma (HCC) cells, where it enhances HCC cell proliferation, migration, EMT, and invasion by sponging miR4500 to regulate the expression of BACH1 . Currently, the prognostic value and potential mechanism of three lncRNAs (AC083799.1, AC036108.3, and AC009495.1) remain unclear. Even though the prognostic performance of the eight identified lncRNAs in this study is excellent, future research exploring their potential molecular mechanism in melanoma is needed.
Previous studies have demonstrated that EMT is highly related to the tumor microenvironment (Lou et al., 2016;Jiang and Zhan, 2020), and the infiltrating immune cells are associated with patients' survival (Jiang et al., 2018). The stromal score, immune score, ESTIMATE score, and tumor purity of melanoma samples were measured using the ESTIMATE algorithm. We revealed that the risk score was negatively correlated with the stromal score, immune score, ESTIMATE score, but positively correlated with tumor purity. Besides, high immune and ESTIMATE scores have a prognostic significance, and that the immune cell infiltration of the tumor microenvironment may have a favorable effect on the patients' prognosis.
Furthermore, CIBERSORT was used to assess the proportions of the infiltrating immune cells. We found that activated memory CD4 memory T cells, memory B cells, T follicular helper cells, and Macrophages M1 were remarkably higher in the low-risk group while resting NK cells, Macrophages M2, Macrophages M0, and resting Mast cells were higher in the high-risk group. Besides, a high fraction of Macrophages M1, CD8 T cells, and activated   memory CD4 T cells were associated with better prognosis, while a high proportion of activated Mast cells were associated with a worse prognosis. Professional phagocytes, macrophages are specialized to eliminate cellular debris and dead or dying cells and are essential parts of the tumor microenvironment (Brown et al., 2017;Heymann et al., 2019). The presence of macrophages is regarded as a negative prognostic factor in most tumor types (Ruffell et al., 2012). In this study, the fraction of macrophages M0 in the high-risk group was high. We assumed that M0 macrophages played a significant role in the development of melanoma after the transformation of monocytes. The antitumorigenic M1 phenotype promotes the immune response via the regulation of dead or dying cells and infectious agents (Biswas and Mantovani, 2010). Previous studies reveal that high infiltration of M1 macrophages is associated with a high survival rate in many tumor types (Jackute et al., 2018;Zhang et al., 2019). M2 macrophages enhance tumor invasiveness, metastasis, and angiogenesis and are related to the development of an immunosuppressive microenvironment (Brown et al., 2017;Biswas and Mantovani, 2010). Previous studies show that T follicular helper (Tfh) cells emerge in tumor-draining lymph nodes where they produce IL4 to reduce antitumor immunity and cause myeloid cells to differentiate into M2 macrophages (Shirota et al., 2017). Memory B cells play an important role in antibody-mediated responses to self-and nonself-antigens and may help in antitumor immunity. Similarly, patients in high-risk squamous cell carcinoma are reported to have a high proportion of resting mast cells and unfavorable overall survival (Che et al., 2020). The proportion of resting NK cells in the high-risk group is higher in non-small cell lung cancer . CD8 T cells play an important role in tumor inhibition, through the release of cytotoxic molecules such as perforin and granzymes. Besides, CD8 T cells also produce IFNγ that enhance the expression of MHC class I antigens in tumor cells, making them better targets for CD8 T cells (Tsukumo and Yasutomo, 2018). Moreover, previous studies reveal that activated memory CD4 T cells have the least proportion in clinical stage IV of bladder cancer and are associated with beneficial prognosis, which may directly kill tumors or activate the body's immune response to destroy the tumor cells (Li W. et al., 2020).
This study has some limitations that need to be addressed. First, this was a retrospective study that relied on public data sources. Second, the amount of data available from public databases is still limited and has not been validated using other databases. Besides, even though the studies utilized the same cohort, there may be exist the possibility of overfitting when compared the prediction value of the EMT-related lncRNA signature with other two published lncRNA signatures. Therefore, it is important to confirm the findings through multicenter studies and actual experiments before the EMTrelated lncRNA signature can be clinically used.

CONCLUSION
In this study, we establish and validate an EMT-related lncRNA signature targeting EMT-related genes in melanoma, which can be clinically used to predict individualized prognosis and treatment in melanoma patients. Moreover, the EMT-related lncRNA signature is validated as an independent prognostic indicator for melanoma. This signature offers a novel tool for future clinical applications to melanoma treatment.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

AUTHOR CONTRIBUTIONS
BX designed the research study. BX, LL, ZC, and YZ performed the literature search and statistical analysis. BX, AL, PW, and CX interpreted the data and drafted the manuscript. BX, HL, and TX critically revised the manuscript. All authors read and approved the final manuscript.