Identification of Long Non-coding and Messenger RNAs Differentially Expressed Between Primary and Metastatic Melanoma

Purpose: Melanoma is the most aggressive and life-threatening cutaneous cancer. To explore new treatment strategies, it is essential to identify the mechanisms underlying melanoma tumorigenesis and metastasis. Methods: In the current study, we demonstrated altered expression of long non-coding RNA (lncRNA) and messenger RNA (mRNA) in melanoma using data from the Cancer Genome Atlas (TCGA) database. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, and protein–protein interaction (PPI) analyses were conducted. We also constructed a functional lncRNA-mRNA regulatory network and Kaplan-Meier analysis. Results: We identified 246 differentially expressed (DE) lncRNAs and 856 DEmRNAs. A total of 184 DElncRNAs and 428 DEmRNAs were upregulated in metastatic melanoma, while all others were downregulated. Additionally, we investigated the co-expression pattern of 363 genes, among which 26 upregulated lncRNAs, 9 down- regulated lncRNAs, 49 upregulated mRNAs and 151 downregulated mRNAs were identified as being co-expressed with others. Survival analysis suggested high levels of 14 lncRNAs and 10 mRNAs may significantly increase or decrease overall survival. These differentially expressed genes are also potentially prognostic in melanoma. Conclusion: Our findings observe potential roles for lncRNAs and mRNAs during melanoma progression and provide candidate biomarkers for further studies.


INTRODUCTION
Cutaneous melanoma is the most aggressive type of skin cancer, and melanoma metastasis accounts for the vast majority of patient deaths from skin cancer. For early stage melanoma, surgical excision is used as the standard approach to remove localized lesions and results in a good prognosis. However, the prognosis of patients with metastatic melanoma is extremely poor. The 5-year survival rate of metastatic melanoma is only 5-10%, while that of non-metastatic melanoma is estimated to be approximately 90%. Although immunotherapy and molecularly targeted drugs are emerging as attractive new therapeutic approaches (Reddy et al., 2016), their utility is restricted to only a portion of patients who often acquire drug resistance, and the therapeutic options against late-stage melanoma remain very limited. A comprehensive understanding of the genetic and epigenetic underpinnings of melanoma initiation and progression would facilitate the identification of new biomarkers and therapeutic targets for metastatic melanoma.
Recently, actively transcribed lncRNAs have been demonstrated to be involved in complex cancer genome regulatory networks rather than representing merely extraneous transcription as originally believed. lncRNAs are defined as endogenous cellular RNA transcripts longer than 200 nucleotides without protein-coding capacity. Their structures resemble mRNAs, but they exhibit lower levels of expression and more tissue-specific expression patterns than proteincoding genes (Cabili et al., 2011). lncRNAs play extensive functions encompassing biological functions including biological evolution, cell proliferation, morphogenesis, cell cycle, immune and inflammatory response, and restrainting miRNA, mRNA, and proteins from combining with their intended targets (Cathcart et al., 2015;Chen et al., 2017;Yu et al., 2018). The expression of lncRNAs varies significantly (Yan et al., 2015) in different tissues and stages of development.
To date, the roles of most lncRNAs have not been elucidated in spite of their relatively huge representation in genomes. It is well known that co-expressed genes are more likely to be co-regulated and functionally related (Stuart et al., 2003). Thus, identifying co-expressed protein-coding genes may reveal functions to associated lncRNAs through use of publicly available genomewide datasets and computational methods (Signal et al., 2016).
In recent years, TCGA 1 has generated comprehensive, multidimensional maps of the crucial genomic changes in numerous cancers, which help to provide an understanding of how such changes interact to drive diseases. Weighted correlation network analysis (WGCNA), a powerful correlation network method, is used to reconstruct gene co-expression network relationships between differentially expressed lncRNAs (DElncRNAs) and differentially expressed mRNAs (DEmRNAs). Gene Ontology Enrichment Analysis Software Toolkit (GOEAST), PANTHER, and the database for annotation, visualization, and integrated discovery (DAVID) (Huang da et al., 2009) combined with the Search Tool for the Retrieval of Interacting Genes (STRING) database 2 are commonly used to explore the biological functions of the genomic changes. This method has been successfully applied in various biological contexts to identify candidate biomarkers or therapeutic targets, e.g., in cancer, mouse genetics, yeast genetics, and the analysis of brain imaging data. In the present study, we used a computational strategy to perform a systematic study of lncRNAs significantly altered in the transition from primary to metastatic melanoma, and constructed a functional lncRNA-mRNA regulatory network aiming to contribute to a comprehensive understanding of the differentially expressed genes (DEGs) in melanoma and their function. We identified novel melanoma-associated lncRNAs and predicted their potential pathological and biological roles in cancer development and gene regulation. This work may provide 1 https://cancergenome.nih.gov/ 2 https://string-db.org/ some powerful evidence to guide subsequent experimental studies on the altered genes in melanoma as candidate targets for treatment development or as biomarkers.

Differentially Expressed lncRNA and mRNA Analyses
Transcriptome sequencing data by RNA-seq were downloaded from the publically available TCGA database, including 103 primary melanoma frozen tissue and 368 metastatic melanoma tissue datasets. All samples were collected from individuals of different races and genders. This dataset comprised of called gene counts for 57,073 mRNAs and non-coding (nc)RNAs. To obtain genome-wide lncRNA and mRNA expression profiles, normalized expression data were subsequently analyzed for DElncRNAs and mRNAs using Bioconductor package (limma, version 3.4.0) in R software (version 3.4.0) with default parameters (Ritchie et al., 2015). A total of 246 DElncRNAs and 856 DEmRNAs remained in the differential expression analysis using "edgeR" in R. Statistical significance was defined as |log 2 foldchange| >2 and P-value < 0.01. The FDR method by the Benjamini and Hochberg method was applied to correct for multiple testings. Hierarchical clustering, shown in the format of a heatmap and a volcano plot, of the DElncRNA and DEmRNA expression profiles was performed by using "RColorBrewer, " "pheatmap, " and "plot" in R language.

Co-expression Network Construction of the lncRNA-mRNA Regulatory Network
A DElncRNA-DEmRNA co-expression network was generated to explore the key roles of DElncRNAs and DEmRNAs in the progression of melanoma. As a systems biology method, gene co-expression network analysis was performed by the WGCNA package (Langfelder and Horvath, 2008) to construct a co-expression network between the onco-lncRNAs and their target mRNAs, based on the signed Pearson Correlation Coefficient between their normalized expression levels as provided by Cuffnorm (Trapnell et al., 2012). As to achieve a scale-free topology, we set β = 7 and modified the pairwise correlation into an adjacency matrix of connection strengths through a soft-thresholding approach. Only edges with weight above a threshold of 0.8 are displayed. In the co-expression network, nodes represented DEGs, and the correlation of gene expression pattern was defined as connectivity degree among genes (Ghazalpour et al., 2006). The network was visualized using Cytoscape 3.5.1.

Bioinformatics and PPI Network Analysis
Gene Ontology analysis is a functional analysis for exploring DEmRNAs with GO categories. The predicted target genes above were input into David 3 , GOEAST 4 and PANTHER 5 , which utilized GO to identify coexisting biological process, cellular component, and molecular function represented in the gene profile. Furthermore, we also used the DAVID database to analyze the potential functions of these genes in the pathways (P < 0.05). The results were in accordance with those from KEGG 6 .
Protein-protein interaction information of DEmRNAs was acquired from the STRING database 7 . Edges with weight above a threshold of 0.4 are displayed. Hub genes were found by using "barplot" in R language.

Survival Rate Analysis
Furthermore, we investigated whether the co-expressed DElncRNAs-DEmRNAs were correlated with potential good or poor overall survival (OS). Clinical data from TCGA with survival time were collected. The gene expression was labeled as high or low using a dichotomy method (Tian et al., 2017). OS rate analysis was performed by using "hash" and "survival" in R to examine the association between the gene expression levels and the overall lifetime (Gentleman et al., 2006). Differences in the OS between the two groups were estimated and compared by the Kaplan-Meier method with a log-rank test. P-values less than 0.05 were considered significantly different.

Receiver Operating Characteristic (ROC) Analyses
In order to evaluate the diagnostic value of candidate DElncRNAs, ROC analyses were performed using the "survivalROC" package in R language. AUC under a binomial exact confidence interval was calculated to generate the ROC curve. Univariate Cox and multivariate Cox analyses were evaluated separately.

Identification of DElncRNAs and DEmRNAs
A gene expression dataset generated by RNA sequencing (RNA-Seq) was downloaded from TCGA. The dataset included the expression levels detected for 32,068 different probes on 103 primary melanoma frozen tissues and 368 metastatic melanoma tissues. Among these, 246 DElncRNAs and 856 DEmRNAs were identified (log2fold change > 2, FDR < 0.01) by an analysis in R language. We noted 184 DElncRNAs and 428 DEmRNAs were upregulated in metastatic melanoma, with 62 DElncRNAs and 428 DEmRNAs being downregulated, which were described as a volcano plot (Figures 1A,B).
Hierarchical clustering of the DElncRNA and DEmRNA expression profiles was performed by using "pheatmap" in R language. Heatmaps of the differential expression of the genes allowed for easy discrimination between primary and metastatic melanoma (Figures 1C,D). The top 20 significantly upregulated DElncRNAs are shown in Table 1 and downregulated DElncRNAs in Table 2.
The flow diagram of our research design aiming to identify DEGs is shown in Figure 2. We created a gene coexpression network to screen out lncRNAs targeting mRNAs and hub genes among the co-expressed DEGs. Then we performed GO and KEGG analysis on DEGs, followed by analysis of the PPI network. Finally, we verified the prognostic value of DEG expression for patients with melanoma using the comprehensive survival analysis platform Kaplan-Meier plotter. We conducted univariate Cox regression analyses to determine lifetime-predicted values of single genes. We performed ROC curve analyses to obtain the predicted values of multivariate DEGs.

Co-expression of lncRNAs and mRNAs in Melanoma
A DElncRNA-DEmRNA co-expression network determined by WGCNA was used to reveal the roles and functional mechanisms of lncRNAs and their targeted mRNAs in melanoma (Terai et al., 2016). DElncRNAs and their significantly related mRNAs were imported to draw the network using Cytoscape (version 3.4.3). The co-expression network was composed of 235 network nodes and 762 connection edges, among which 26 upregulated lncRNAs, 9 downregulated lncRNAs, 49 upregulated mRNAs, and 151 downregulated mRNAs were identified to have interregulated relationships (Figure 3). Within the network, several single lncRNAs were co-expressed with multiple mRNAs such as AL51224.1 and AC012313.4, whereas individual mRNAs were targeted by multiple lncRNAs such as AC010266.2, C7orf71, and AC010255.1. Additionally, there were a few co-expression relationships between several lncRNAs or mRNAs, which indicated a highly complex regulatory relationship between lncRNAs and mRNAs in melanoma.

GO Term Enrichment Analysis
Gene Ontology term enrichment analysis results varied according to GO classification and the expression change of DEmRNAs. GO term enrichment of 363 dysregulated DEGs in the co-expression network was also found among three databases David, GOEAST, GO, and PANTHER. With respect to biological process, the DEmRNAs were significantly enriched in "keratinocyte differentiation, " "keratinization, " "peptide crosslinking, " and "epidermis development." For cellular component, the DEmRNAs were significantly enriched in "extracellular space, " "extracellular region, " "keratin filament, " "desmosome, " "cornified envelope, " "intermediate filament cytoskeleton, " and "intermediate filament." With regard to molecular function, the DEmRNAs were significantly enriched in "steroid hydroxylase activity, " "structural molecule activity, " "structural constituent of epidermis, " and "heme binding." Additional details regarding GO enrichment analysis results are shown in Table 3.

Cox Regression Analysis and ROC Curve Analysis
To assess the prognostic value of the 246 candidate DElncRNAs and 856 DEmRNAs generated from TCGA database, we conducted Cox regression analysis and time-dependent ROC curve analyses along with calculations of the area under the curve (AUC). In univariate Cox analysis, 486 DEGs were found to have independent significance in predicting clinical outcome (P < 0.05). Among these, 243 DEGs were identified as having great significance (P < 0.001). The top 20 significant predicted-value DEGs are shown in Table 5. NCCRP1, FAM83C, A2ML1, GGT6, TRIM29, RHCG, SPRR2F, PKP1, RHOV, FETUB, KRT17, and CALML3 were identified as independent risk predictors while AL365361.1, AC243960.1, PLA2G2D, TIMD4, PTPRC, DNAJC5B, GPR174, and THEMIS were considered to be positive prognostic factors, which were in accordance with results from the survival rate analysis. We further evaluated the potential prognostic value of multivariates among the top 20 DEGs in melanoma using ROC curves. THEMIS, RHCG, PLA2G2D, NCCRP1, AL365361.1, and AC243960.1 were chosen as differences in the multivariate survival curve between high risk and low risk were significant (P = 7.02e-14) ( Table 6). The AUC for 3 years was 0.72 ( Figure 6A) and that for 5 years was 0.762 ( Figure 6B).

DISCUSSION
Metastasis is a crucial factor in the determination of prognosis and survival of melanoma patients. Thus, elucidating the molecular mechanisms underlying melanoma metastasis is urgently needed. lncRNAs are now known to have crucial regulatory functions related to chromatin remodeling and gene expression in the progression of cancers (Wapinski and Chang, 2011). In the current study, a total of 246 DElncRNAs and 856 DEmRNAs were identified. We found 184 DElncRNAs were up-regulated whereas the remainder were down regulated. Furthermore, we investigated the co-expression pattern among 26 up-regulated lncRNAs,9 down-regulated lncRNAs, 49 upregulated mRNAs and 151 down-regulated mRNAs. "Steroid hormone biosynthesis" and "Linoleic acid metabolism" were the most significant associated pathways according to bioinformatics analysis. In addition, 57 lncRNAs and 36 mRNAs may exhibit significant effects on patient OS rate. Finally, multivariate DEGs in melanoma may have prognostic prediction ability for this disorder.
Emerging evidence indicates that lncRNAs play a vital role in various biological processes such as cell differentiation, immune response, and tumorigenesis (Chen et al., 2018). However, the functions of lncRNAs are not well illustrated (Mercer et al., 2009) and a comprehensive database that provides functions of lncRNAs by experimental researches is still lacking. To further characterize lncRNAs, a "guilt by association" strategy is commonly used to construct a co-expression network of lncRNAs and mRNAs (Liao et al., 2011). In the present study, we used TCGA datasets, which contain evaluated data with high quality and credibility. The transcriptional data was generated by RNA-Seq, which is a revolutionary technology based on next-generation sequencing that is considered the most popular method for studying complete transcriptomes in more detail and with more accurate measurements than microarray and serial analysis of gene expression (SAGE) (Fatica and Bozzoni, 2014). Furthermore, we employed stricter statistical thresholds; fold change > 2 and FDR (q-values) < 0.01 were considered significant to obtain DElncRNAs and DEmRNAs. We systematically investigated DElncRNA-DEmRNA co-expression in a series of statistical models using stringent weight above a threshold of 0.8. This approach identified some new DElncRNAs that showed particularly strong co-expression pairs in our results. For example, AC068985.1 to GABRA6, AC093766.1 and AC007953.2 to GCG, MIR202HG to surrounding mRNAs were predicted to exhibit co-expression relationships (weight > 0.99). In addition, some DElncRNAs, such as AL365361.1, AC243960.1, LINC02397, and LINC02422. AL118505.1, LINC01527, AL512274.1, LINC01215, AC012313.4, FAM41C, SSTR5-AS1, were significant both in co-expression network and predicting survival rate, which may show powerful evidence for the metastasis of melanoma. Notably, the majority of the identified lncRNAs were novel, as only a few lncRNAs have been recorded and characterized in literature to date. To our knowledge, our study is the first to identify novel lncRNAs and mRNAs between primary and metastatic melanoma.
Several previous studies have shown that some lncRNAs involved in the pathogenesis of cutaneous melanoma exert their function through different pathways and that they interact with various molecular targets. Sprouty4-intronic transcript 1 (SPRY4-IT1 or SPRIGHTLY), a cytoplasmically enriched lncRNA derived from an intron of the SPRY4 gene, was the first lncRNA  -36  LOR, LCE3A, S100A7, LCE3D, SPRR2G, TP63, LCE1B, LCE1A, SPRR2E,  CERS3, CDSN, SPRR2A, TGM1, TGM3, LCE2D, IVL, LCE2C, FOXN1, LCE2B characterized in melanoma in 2011. Khaitan et al. (2011) found that SPRY4-IT1 modulates apoptosis and promotes melanoma proliferation, invasion, and migration in melanoma cell lines. Subsequent studies showed that SPRY4-IT1 can function as an anti-apoptotic factor by binding LPIN2 and can regulate lipid metabolism by avoiding cellular lipotoxicity (Mazar et al., 2014). Patients with melanoma and low levels of SPRY4-IT1 have a longer OS than those with high levels of this lncRNA FIGURE 4 | The PPI network constructed by the website STRING (https://string-db.org/). A total of 322 proteins are presented with color nodes; interactions are represented with edges. Only edges with weight (w) above a threshold of 0.4 are displayed. (Liu et al., 2016). Another study identified the MIR31HG lncRNA gene as showing among the greatest expression changes upon BRAF V600E expression. MIR31HG expression inversely correlated with p16INK4A, and its overexpression may represent a frequent driver event of melanoma progression (Montes et al., 2015). Flockhart et al. (2012) identified BRAFactivated non-coding RNA (BANCR) by analyzing BRAF V600E-mutant human primary melanoma specimens. RNAidependent knockdown of BANCR suppressed melanoma cell migration through upregulation of the chemokine CXCL11. It has also been suggested that melanoma proliferation is promoted by BANCR-mediated activation of the ERK1/2 and JNK MAPK pathways (Li et al., 2014). In turn, GAS5 has heretofore been the only lncRNA reported to act as a tumor suppressor FIGURE 5 | Kaplan-Meier analysis for overall survival rate of patients from TCGA. Log-rank test was performed to evaluate the survival differences between the two curves (p < 0.001). Overall survival rate for 14 lncRNAs and 10 mRNAs were identified significant.
The most significant pathway identified by KEGG was steroid hormone biosynthesis. Steroid hormones and their various receptor isoforms, such as androgens and estrogens, play an important role in activating and inhibiting the MAPK and PI3K pathways (de Giorgi et al., 2009(de Giorgi et al., , 2011Mitchell et al., 2014). Notably, the incidence of melanoma increased in men at twice the rate that it has been climbing in women since 1975 (NCI, 2012). It may help explain these phenomena to understand the mechanisms by which estrogen and androgen affect benign and malignant melanocytes. In the genomic pathway, unliganded ERα and ARα in the cell cytoplasm can bind estrogen and androgen, respectively, which leads to dimerization of each receptor. ERα and ARα dimers then traffic to the nucleus and bind to protein co-activators such as CREB to alter proto-oncogene transcription and melanoma proliferation.
In the non-genomic pathway, the liganded ERα and ARα activate RAS, which activates BRAF, which activates MEK, which activates ERK, also known as MAPK, which can phosphorylate and therefore activate CREB in the genomic pathway or can directly enhance proto-oncogene transcription and melanoma proliferation (Mitkov et al., 2015). There are several pathways involved in lipid metabolism including linoleic acid metabolism, arachidonic acid metabolism, and alpha-linolenic acid metabolism. Balogh et al. (2010) observed that cell membrane stress achieved either by heat or benzyl alcohol resulted in pronounced and highly specific alterations in lipid metabolism. A phospholipase C-diacylglycerol lipase-monoacylglycerol lipase pathway was identified in mouse melanoma B16 cells and significantly contributed to the production of several lipid mediators upon stress including arachidonic acid. Schmitt et al. (2015) found that PRDX6 is strongly expressed in most melanoma cells and its expression levels are maintained in a posttranscriptional manner, particularly by EGFR-dependent signaling, while arachidonic acid acts as a key effector of PRDX6-dependent proliferation and inducer of Src family kinase activation. These results subsequently support the biological importance of the emerging evidence of lipid signaling in melanoma.
Another important signaling pathway in our enrichment results is cytochrome P450. Numerous research groups have shown that individuals with polymorphisms that cause defective CYP2D6 are at increased risk for melanoma (Strange et al., 1999). Other genes such as CYP3A4, CYP2C9, ADH4, ADH1A, and CYP1A2 enriched in the cytochrome P450 pathway are Frontiers in Genetics | www.frontiersin.org observed as metabolically active procarcinogens to genotoxic intermediates, and an association has been found between CYP enzyme activity and the risk to develop several forms of cancer. Amann et al. (2015) found that the retinoic acid receptor (RAR) regulated enzyme CYP26a1 showed a significantly lower expression in the lecithin retinol acyltransferase (LRAT)overexpressing murine melanoma B16F10 cell line. The absence of LRAT-catalyzed retinol esterification is important for mediating retinoid sensitivity in murine melanoma cells. The potentially complicated mechanisms of the association between cytochrome P450 and retinol metabolism deserve further studies.
It is noteworthy that a single lncRNA was co-expressed with multiple mRNAs and that multiple lncRNAs could be co-expressed with individual mRNAs. Recent studies have shown that most lncRNAs in humans are produced from divergently transcribed protein-coding genes and that the divergent lncRNA/mRNA pairs exhibit coordinated varies in transcription (Sigova et al., 2013;Cabili et al., 2015), representing cis-acting co-expression of lncRNAs and neighboring mRNAs. LncRNAs may also involve in gene expression by targeting distant (trans-acting) coding genes (Cesana et al., 2011;Chu et al., 2011). We identified both neighboring and distant-acting lncRNA and mRNA co-expression. These phenomena suggest highly complex regulatory relationships between lncRNAs and mRNAs in differential co-expression networks.
This study still has limitations. First, the study power is limited by the lopsided size of the samples. Further researches with lopsided larger sample sizes are warranted to validate the reported findings. However, we took the use of multiple kinds of tests into consideration and applied a strict significance threshold to prevent false positive results. Also, we integrated the expression levels of mRNAs and lncRNAs into co-expression profiles to study their characteristics in primary and metastatic melanoma. Experimental validation of the individual roles of lncRNAs is needed. Furthermore, we classified melanoma into primary and metastatic groups because of limited information, although it is likely that melanoma-associated lncRNAs and lncRNA-mRNA co-expression regulation are specific to more detailed clinical phases as well as histological and molecular subtypes. It is essential to perform subgroup analyses by melanoma subtypes in future studies. Finally, the associations of gene expression levels between lncRNAs and mRNAs such as analyzed in this study may not represent the only possible mechanism by which lncRNAs may function in gene regulation; other mechanisms such as promoter demethylation, chromatin remodeling, microRNA silencing, or serving as molecular scaffolds have been reported (Cesana et al., 2011;Chu et al., 2011).

CONCLUSION
Our work reported the novel dysregulated lncRNAs involved in melanoma determined by using bioinformatics analysis, basing on the large scale of clinical data about lncRNAs between primary and metastatic melanoma. These results might serve as the foundation for establishing future diagnostic or prognostic biomarkers and illuminating potential molecular mechanisms in melanoma. Future studies should be conducted to confirm our findings and to elucidate the detailed tumorigenesis mechanisms of the specific identified lncRNAs.

AUTHOR CONTRIBUTIONS
LY designed the study, carried out analyses, generated figures, and wrote the manuscript. ZG and SW contributed to statistical assessment, writing, and review. LS designed the study, contributed to writing and review. RT and PL contributed to writing and review.

FUNDING
This research was funded by the Foundation for Scientific Initiative Plan of Southern Medical University (QD2018N021).