Pan-Cancer Methylated Dysregulation of Long Non-coding RNAs Reveals Epigenetic Biomarkers

Different cancer types not only have common characteristics but also have their own characteristics respectively. The mechanism of these specific and common characteristics is still unclear. Pan-cancer analysis can help understand the similarities and differences among cancer types by systematically describing different patterns in cancers and identifying cancer-specific and cancer-common molecular biomarkers. While long non-coding RNAs (lncRNAs) are key cancer modulators, there is still a lack of pan-cancer analysis for lncRNA methylation dysregulation. In this study, we integrated lncRNA methylation, lncRNA expression and mRNA expression data to illuminate specific and common lncRNA methylation patterns in 23 cancer types. Then, we screened aberrantly methylated lncRNAs that negatively regulated lncRNA expression and mapped them to the ceRNA relationship for further validation. 29 lncRNAs were identified as diagnostic biomarkers for their corresponding cancer types, with lncRNA AC027601 was identified as a new KIRC-associated biomarker, and lncRNA ACTA2-AS1 was regarded as a carcinogenic factor of KIRP. Two lncRNAs HOXA-AS2 and AC007228 were identified as pan-cancer biomarkers. In general, the cancer-specific and cancer-common lncRNA biomarkers identified in this study may aid in cancer diagnosis and treatment.


INTRODUCTION
Cancer is a general term that refers to malignant tumors, and its world-wide incidence and mortality have been high for many years. In 2020, there were 19.29 million new cancer cases and 9.96 million cancer deaths world-wide (Sung et al., 2021). There is no cure for cancer at the moment. Not only do different types of cancer have common biological characteristics such as abnormal cell differentiation and proliferation, lack of growth control, invasion and metastasis, but they also have many specifically biological characteristics, respectively. Therefore, it is necessary to conduct pancancer research on a variety of cancer types to ascertain the similarities and differences in molecular characteristics across different cancers.
DNA methylation is an important epigenetic modification that plays an important role in many physiological processes (Martin-Subero, 2011;Li et al., 2013;Neri et al., 2017), and its aberrant behavior can result in gene instability, proto-oncogene activation and tumor suppressor gene inactivation Hou et al., 2021). Aberrant DNA methylation occurs in almost all cancers, where unmethylated promoters become methylated or methylated sequences lose their methylation. Focusing on the pan-cancer analysis of DNA methylation can inform research and therapy. Saghafinia et al. (Saghafinia et al., 2018) described an algorithmic strategy for identifying pan-cancer-related DNA methylation alteration affecting gene expression. Numerous DNA methylation dysregulation were discovered to be associated with patient prognosis and therapeutic response. Methylated research in cancer patients has been shown to improve and maintain the efficiency of cancer treatment (Pauken et al., 2016;Sen et al., 2016;Marwitz et al., 2017).
Long non-coding RNAs (lncRNA) are non-coding RNAs with a length of more than 200 nucleotides. In comparison to proteincoding genes, they have a high degree of tissue specificity in their expression (Cabili et al., 2011). LncRNAs play an important role in the occurrence and development of cancer and many other complex diseases . LncRNAs are associated with practically every major cancer type and contribute to all ten hallmarks of cancer (Huarte, 2015;Bartonicek et al., 2016;Esposito et al., 2019;Bao et al., 2020). Additionally, lncRNAs play an important role in immune regulation (Chen et al., 2017;Zhang et al., 2021b). Lin et al. (Lin and Yang, 2018) explored the mechanisms by which lncRNAs regulated cellular responses to extracellular signals and their clinical potential as diagnostic indicators, stratification markers, and therapeutic targets for combinatorial treatments. Zhang et al.  identified lncRNA MT1JP as a ceRNA for the tumor suppressor FBXW7 in gastric cancer by demonstrating competitive binding with MiR-92A-3p. Xu et al.  discovered that lncRNA SNHG6 regulated the expression of the oncogene EZH2 in colorectal cancer via the ceRNA sponge-associated with MiR-26a/b and MiR-214. Chen et al.  discovered that the lncRNA PVT1 promoted tumor development in gallbladder cancer by regulating the miR-143/HK2 axis. Wang et al. (Wang et al., 2017) identified the lncRNA HOXD-AS1 as a ceRNA that regulated SOX4 and promoted liver cancer metastasis. The above studies established the role of lncRNAs in corresponding cancer types, and identified the cancerassociated lncRNAs for each cancer type . Pan-cancer analysis of lncRNAs can help in identifying the similarities and differences between distinct cancer types and identifying potential therapeutic targets for cancer treatment.
Numerous pan-cancer studies have been carried out on lncRNAs. Li et al. (Yongsheng Li et al., 2020) identified multiple pan-cancer immune-associated lncRNAs as potential oncogenic biomarkers. Martens-Uzunova et al. (Martens-Uzunova et al., 2014) summarized the role of lncRNAs in the diagnosis and treatment of urinary tumors, and concluded that lncRNAs could be used as new biomarkers for prostate cancer, kidney cancer and bladder cancer. Zhang et al. (Zhang et al., 2021) identified clinically distinct tumor subtypes by characterizing pancancer lncRNA modifiers of the immune microenvironment. Bao et al. (Bao et al., 2021) proposed a framework for identifying lncRNA signatures associated with pan-cancer prognosis.
Previous studies have established a correlation between lncRNAs and epigenetic regulation (Spizzo et al., 2012;Xu et al., 2018), suggesting that they regulates chromatin state and epigenetic inheritance (Tsai et al., 2010). Lu et al. (Lu et al., 2020) found that DNA methylation-mediated lncRNA activation improved temozolomide resistance in glioblastoma, implying that SNHG12 could be a therapeutic target for overcoming temozolomide tolerance.
Detecting the dynamic pattern of lncRNA methylation during cancer development across pan-cancer may help highlight epigenetic changes and aid in cancer diagnosis and treatment. Yang et al. (Yang et al., 2021) presented a novel integrative analysis framework, termed MeLncTRN for integrating data on gene expression, copy number variation, methylation and lncRNA expression. They identified epigenetically-driven lncRNA-gene regulation circuits across 18 cancer types. Wei et al. (Wei et al., 2019) constructed a systematic biological framework to evaluate the co-methylation events between two lincRNAs in nine cancer types. The lincRNA prognostic signatures were identified to significantly correlate with overall survival in cancers. Wang et al. (Wang et al., 2018) characterized the epigenetic landscape of genes encoding lncRNAs associated with pan-cancer and identified EPIC1 as an oncogenic lncRNA. Xu et al. (Xu et al., 2021) constructed networks of lncRNAassociated dysregulated ceRNA across eight cancer types. They screened nine pan-cancer epigenetically related lncRNAs.
However, no research has been conducted to systematically compare methylation changes of lncRNAs in pan-cancer to identify the specific methylation-related lncRNAs. In this study, we used pan-cancer lncRNA methylation data from the TCGA to examine the lncRNA methylation patterns of 23 cancer types and identified differentially methylated lncRNAs (DMlncs). Subsequently, we examined differentially methylated lncRNAs from different cancer types to identify cancer-specific and cancercommon differentially methylated lncRNAs. Further, combining lncRNA expression data with survival data, lncRNAs with a negative correlation between methylation and expression dysregulation were found as diagnostic biomarkers for each cancer. Finally, the lncRNAs were mapped into the ceRNA network to establish ceRNA relationships with mRNAs confirming their important roles in cancer.

Data
Pan-cancer DNA methylation data for lncRNAs from Infinium 450k arrays were downloaded from the TCGA database. The cancer types with normal samples were selected, and a total of 7,634 tumor samples and 746 normal samples from 23 cancer types were retained. Table 1 shows the number of tumor and normal samples, as well as the number of lncRNAs. The methylation status of each probe in each sample was measured using the β-value (Aryee et al., 2014). The β-value denoted the ratio of methylation intensity of the probe to total intensity, with a range of 0 (low methylation) to 1 (high methylation). The probes with β-values greater than 0 in more than 50% of the samples were retained in the methylation profile for each cancer, and the missing values were filled with the average of all non-zero values on the probes. The average β-value of the promoter region was used to determine the methylation level of each lncRNA. Expression data of lncRNAs and mRNAs for 13 cancer types were downloaded from the TANRIC and the TCGA databases, respectively (Li et al., 2015). The number of tumor and normal samples and the number of lncRNAs and mRNAs are shown in Supplementary Tables S1, S2. Each lncRNA and mRNA expression value was defined as its reads per kilobase per million mapped reads (RPKM) (Mortazavi et al., 2008). Subsequently, we transformed the expression data by log2 (RPKM+1), reserved the lncRNAs and mRNAs with expression values in more than 70% of the samples, and filled their missing values using the average expression values of these RNAs in the samples.
The expression data of mRNA and lncRNA was downloaded from different databases, so we got the human gene annotation files from the GENCODE database (https://www.gencodegenes. org/) to obtain the corresponding relations of ENSG IDs and gene symbols. Then, using Entrez IDs as the main reference, different versions of human gene names (Entrez IDs, gene symbols and ENSG IDs) were converted to standard human gene names.

Identification of DMlncs
The DMlncs for each cancer were first screened using the following formula Δβ: Where, β t and β n denoted the average level of methylation in tumor and normal samples, respectively. Δβ was the subtraction difference in average methylation levels between tumor and normal samples. lncRNAs with Δβ>0.1 were selected as candidates for DMlncs.
Additionally, the "limma" package (Ritchie et al., 2015) in R language was used to measure the degree of difference between tumor and normal samples. The lncRNAs with |log 2 (fold change)|>1 (PCAWG Transcriptome Core Group et al., 2020) and FDR<0.05 were identified as DMlncs.

Identification of Differentially Expressed lncRNAs
The "limma" package was used to calculate differential expression between tumor and normal samples for lncRNA expression data. We took the lncRNAs with |log 2 (fold change)|>1 and FDR<0.05 as differentially expressed lncRNAs.

Functional Enrichment Analysis
The GREAT software (McLean et al., 2010) was used to conduct functional enrichment analysis on lncRNAs. We took the lncRNA BED data as input. The lncRNA BED information included the chromosome, start site and end site extracted from GENCODE database. Gene ontology (GO) functions of the output results were selected for subsequent analysis.

Recognition of ceRNAs
LncRNA-miRNA and mRNA-miRNA targeted relationships were downloaded from the ENCORI platform (Li et al., 2014). For each pair of lncRNA and mRNA, the intersection of their target miRNAs should be more than two (Zhang et al., 2021a). The intersections of their miRNA lists were subjected to the hypergeometric test, and the lncRNA-mRNA pairs with a p-value less than 0.05 were considered. A total of 4,739,668 pairs were obtained.
Subsequently, the Pearson correlation coefficient between lncRNA expression and mRNA expression was calculated. LncRNA-mRNA pairs with Pearson correlation coefficient >0.3 and p-value<0.05 were selected as ceRNAs. Supplementary Table  S3 shows the number of lncRNA-mRNA pairs, lncRNAs and mRNAs in each cancer.

Survival Analysis
Survival analysis of patients was carried out using the "survival" package in R language, in which the maxstat model was used to evaluate the best cut-off point to divide high-risk and low-risk groups. Kaplan-Meier curves were then drawn to depict the survival of patients of high-risk and low-risk groups.

Immunological Score
Three scores were used to assess the immunological effect of lncRNAs: Major Histocompatibility Complex (MHC), Cytolytic Activity (CYT) and Cytotoxic T Lymphocyte (CTL). The MHC score of each sample was calculated as: Where i denoted the sample, exp n represented the expression of gene n, and n was one of the nine genes (HLA-A, PSMB9, HLA-B, PSMB8, HLA-C, B2M, TAP2, NLRC5, and TAP1). These nine genes had a strong correlation and were the core gene set of MHC-I (Rooney et al., 2015;Lauss et al., 2017). The CYT score of each sample was calculated as follows: In which, i represented the sample, exp GZMA and exp PRF1 represented the expression of GZMA and PRF1, respectively. These two genes were key factors of cytolysis and were upregulated in activated CD8+T cells and strongly responded to CTLA4 and PDCD1 immunotherapy (Narayanan et al., 2018).
The CTL score of each sample was calculated as follows: In which, i represented the sample, exp GZMA , exp PRF1 and exp GZMB represented the expression of GZMA, PRF1, and GZMB, respectively. These three genes were important factors to measure T cell toxicity and immune cell effector function (Basu et al., 2016).

The Workflow of DMlncs Identification
In this study, the identification flow of methylation-related lncRNA biomarkers for pan-cancer is shown in Figure 1. We conducted a study on a total of 23 cancer types. Firstly, DMlncs between tumor and normal samples were obtained using lncRNA methylation data. Specific and common lncRNAs were identified for pan-cancer. Subsequently, lncRNA methylation data and lncRNA expression data were integrated to identify DMlncs whose methylation changes were negatively correlated with expression changes. Following that, prognostic lncRNAs were screened and mapped into the ceRNA network. The ceRNA network was constructed by combining lncRNA and mRNA expression data. Finally, pan-cancer lncRNA biomarkers were identified by analyzing the lncRNAs of the ceRNA network.

Identification of Pan-Cancer DMlncs
To identify DMlncs, we downloaded cancer methylation profiles of 23 cancer types from the TCGA, including both tumor and normal samples. The data of 7,634 tumor samples and 746 normal samples was downloaded. The proportion of tumor samples for each cancer is shown in Figure 2A. BRCA had the largest number of samples, which was nearly double that of other cancer types, while CHOL had the lowest number of samples. A total of 7,542 tumor samples had corresponding clinical data of the patients. As shown in Figure 2B, the age, sex, and survival status of patients were analyzed. The male to female ratio of the patients was 1:1, and the age was concentrated among the elderly, which was consistent with the law of the general onset age of cancer. The majority of patients survived following treatment. Figure 2C shows the mortality rate of tumor patients, three-quarters of the patients survived after surgery. GBM had the highest mortality rate, followed by CHOL. PRAD had the lowest mortality rate.
After standardizing the data, DMlncs for each cancer were identified. The number of DMlncs is shown in Table 2. A total of 2,286 DMlncs were obtained. The majority of the cancer types had a large number of DMlncs, and only a few cancer types had a small number of DMlncs.
Additionally, DMlncs were divided into up-methylated and down-methylated groups. Up-methylated lncRNAs were those whose methylation level was elevated in cancer samples compared with normal samples, while down-methylated lncRNAs were the inverse. Table 2 shows the number of patients in each of the two groups. There were overlaps in the genes of different cancers. In total, there were 1,229 DMlncs in the up-methylated group and 1,654 DMlncs in the down-methylated group. Among them, 597 lncRNAs were up-methylated in some cancers but down-methylated in some other cancers, demonstrating uneven regulation tendencies across different cancers. As shown in Figure 3, the proportion of up-methylated and down-methylated lncRNAs varied between cancer types. There were 13 types of cancer had a higher number of down-methylated lncRNAs and ten types of cancer had a higher number of up-methylated lncRNAs. The first few cancer types of most DMlncs had obvious higher number of down-methylated lncRNAs and most of the other cancer types had more up-methylated lncRNAs. Certain cancer types contained only a single type

Cancer-Specific lncRNA Biomarkers
As a complex disease, cancer has a high heterogeneity and distinct pathogenesis. In this study, we searched for specific DMlncs for  each cancer. Figure 4A shows the proportion of cancer-specific lncRNAs, and only a fraction of the DMlncs were cancer-specific. LIHC had the most specifically down-methylated lncRNAs, whereas PRAD had the most specifically up-methylated lncRNAs. The proportion of both specifically up and down lncRNAs in PCPG was about 42%. SARC, STAD, and THCA had DMlncs less than ten, preventing them from being compared with other cancers in terms of lncRNA proportion. Except for SARC, STAD, and THCA, PCPG had the highest proportion of specific lncRNAs. STAD had no up-methylated lncRNAs, while all its seven down-methylated lncRNAs were specific. SARC possessed a single down-methylated lncRNA, and it was specific. Two of the three DMlncs of THYM were specific. Supplementary Table S4 shows the specific DMlncs of each cancer.
Subsequently, we performed functional enrichment analysis of specifically up-methylated and down-methylated lncRNAs in each cancer. For each cancer, the functions enriched by specifically up-methylated and down-methylated lncRNAs were analyzed separately, and the number of functions enriched by each cancer is shown on the right side of Figure 4B. There were significant differences in the number of enriched functions for the cancers. LIHC, COAD, and PAAD enriched in more than 200 functions, while the enriched FIGURE 4 | The cancer-specific lncRNAs with differential methylation (A)The number and proportion of cancer-specific lncRNAs. Green represents up-methylated lncRNAs, blue represents down-methylated lncRNAs, yellow represents specifically up-methylated lncRNAs, and red represents specifically down-methylated lncRNAs (B) The functions of specific lncRNAs in each cancer type. Red represents the functions of up-methylated lncRNAs, while blue represents the functions of downmethylated lncRNAs.
Frontiers in Cell and Developmental Biology | www.frontiersin.org May 2022 | Volume 10 | Article 882698 functions of THYM, STAD, and HNSC were less than five. There also a difference between the number of functions of upmethylated and down-methylated lncRNAs for each cancer. The specifically down-methylated lncRNAs of COAD were enriched in the majority of functions (261), whereas upmethylated lncRNAs of COAD were enriched in only a few functions (14), and LIHC demonstrated a similar pattern. The specifically up-methylated lncRNAs of ESCA were found to be enriched in a variety of functions (194), but the down-methylated lncRNAs were enriched in no functions. The details of enriched functions are shown in Supplementary Table S5.
Among the functions, we selected the most significant function for each group and displayed them on the left side of Figure 4B. The function names, enrichment p-values and other information are shown in Table 3. The lncRNAs of most cancer types were enriched in the "regulation" or "response" functions. Both the up-methylated group of CHOL and the downmethylated group of LIHC were enriched in "depyrimidine". The specifically up-methylated lncRNAs of GBM were enriched in "methylation", the specifically down-methylated lncRNAs of COAD were enriched in "dimethylation", and the specifically upmethylated lncRNAs of BRCA were enriched in "epigenetic".
These results established the important role of lncRNAs in the epigenetic process.
In this study, we hypothesized that up-methylated lncRNAs would exhibit decreased expression and vice versa. That is, there is a negative correlation between changes in methylation and expression levels. In order to screen out negative correlated lncRNAs in DMlncs, we used lncRNA expression data to identify differentially expressed lncRNAs in various cancers.
LncRNA expression data of 13 cancer types were downloaded, and the number of differentially expressed lncRNAs in each cancer is shown in Supplementary Table S6. We identified 2,887 over-expressed lncRNAs and 2,375 low-expressed lncRNAs in different cancers. The total number of these lncRNAs was 4,155, and several genes were overlapped between two groups and displayed conflicting regulation patterns across different cancers. The numerical distribution of differentially expressed lncRNAs in various cancers is shown in Figure 5A. Although the number of over-expressed and lowexpressed lncRNAs was similar in the majority of cancer types, there were significantly more low-expressed lncRNAs in BRCA and THCA, and significantly more over-expressed lncRNAs in LIHC and STAD. Subsequently, negatively correlated lncRNAs (NClncs) were screened from differentially expressed lncRNAs. The number of NClncs in each cancer is shown in Figure 5A. NClncs were divided into two types: up-methylated-low-expressed lncRNAs (UMLElncs) and down-methylated-over-expressed lncRNAs (DMOElncs) based on the changes in expression and methylation levels. The two types of lncRNAs in each cancer are shown in Figure 5B-N. As seen from the figure, the number of NClncs was proportional to the number of differentially expressed lncRNAs. However, the proportion of differential lncRNAs was not balanced in several cancers. For example, STAD and THCA had many differentially expressed lncRNAs, but a small number of DMlncs, implying a small number of NClncs. The proportion of UMLElncs and DMOElncs was different in each cancer. For example, both types of lncRNAs for LUSC were abundant. HNSC and LIHC had significantly more DMOElncs, while BRCA and PRAD had significantly more UMLElncs.

FIGURE 5 | The negatively correlated lncRNAs for cancers (A) Quantitative analysis of lncRNAs that are differently expressed and negatively correlated (B-N)
Distribution and names of negatively correlated lncRNAs for cancers. Red and blue dots represent differentially expressed lncRNAs. In which, blue represents downexpressed lncRNAs, and red represents up-expressed lncRNAs. Green and yellow boxes represent the names of negatively correlated lncRNAs. Where green indicates lncRNAs with up-methylated and low-expressed, and yellow represents lncRNAs with down-methylated and over-expressed.

Cancer
Negatively correlated lncRNAs Additionally, survival-correlated lncRNAs were identified by analyzing the CSNClncs of each cancer. As shown in Table 5, 29 lncRNAs were identified to associated with survival in ten cancers. These lncRNAs may be used as specifically diagnostic markers for corresponding cancers. They not only exhibited synergistic alterations in expression and methylation, but were also closely associated with the survival of cancer patients.

Cancer
Count lncRNAs FIGURE 6 | The negatively correlated lncRNAs which specific to each cancer in the ceRNA network (A)The ceRNAs of BLCA-specific negatively correlated lncRNAs (B) The ceRNAs of KIRC-specific negatively correlated lncRNAs (C) The ceRNAs of KIRP-specific negatively correlated lncRNAs (D-I) Kaplan-Meier curves for the lncRNAs associated with each cancer types. The red line represents the group with a high level of expression, while the blue line represents the group with a low level of expression. Additionally, the "+" on the lines represents patients who were lost to follow-up. At this point, the number of patients decreases but the overall survival rate remains stable.
Frontiers in Cell and Developmental Biology | www.frontiersin.org May 2022 | Volume 10 | Article 882698 9 The main function of lncRNAs was to combine miRNAs competing with mRNAs, thereby increasing mRNA expression. As the methylation of lncRNAs increased, their expression decreased, which indirectly led to the decrease in mRNA expression, and vice versa. Therefore, we screened the ceRNAs of DMlncs. First, the lncRNA-miRNA and mRNA-miRNA regulatory relationships were integrated. A lncRNA and a mRNA sharing more than two miRNAs were considered to have a ceRNA relationship. The correlation between lncRNA and mRNA expression was calculated for each cancer to confirm the ceRNA relationship. We maintained the ceRNA relationships that showed a positive correlation between lncRNA and mRNA expression. Each survival-related CSNClnc was put into the ceRNA network to search for associated mRNAs. The visualized results of ceRNAs for BLCA, KIRC and KIRP are shown in Figures 6A-C. BLCA had two lncRNAs mapped into the ceRNA network (AC006116 and AC073046). Both lncRNAs formed a ceRNA relationship with the mRNA ATXN7L3B and formed ceRNA relationships with some other mRNAs, respectively. ATXN7L3B have been confirmed to be associated with an increased risk of colorectal cancer (Leberfarb et al., 2020), and cytoplasmic ATXN7L3B interfered with the nuclear functions of the SAGA deubiquitinase module (Li et al., 2016). The SAGA complex was composed of two enzymatic modules, which house histone acetyltransferase (HAT) and deubiquitinase (DUB) activities. The DUB module was important for normal embryonic development (Glinsky, 2006;Lin et al., 2012), and alterations in the expression or structure of component proteins were linked to cancer (Lan et al., 2015). Therefore, AC006116 and AC073046, as its ceRNAs, could regulate the expression of ATXN7L3B and were also closely related to cancer. The survival correlation of AC006116 and AC073046 in BLCA is shown in Figures 6D,E. Both lncRNAs were significantly associated with the survival of BLCA patients.
Three lncRNAs in KIRC were mapped into the ceRNA network, among which SNHG12 and AC027601 shared more than 30 mRNAs, while EPB41L4A-AS1 did not share any mRNAs with the other two lncRNAs. The survival analysis results for the three lncRNAs in KIRC are shown in Figures 6F-H. The high expression of SNHG12 and AC027601 both showed a worse prognosis. However, the low expression group of EPB41L4A-AS1 showed a worse prognosis. Therefore, we inferred that in KIRC, lncRNAs SNHG12 and AC027601 had a carcinogenic effect, whereas EPB41L4A-AS1 had a tumor-suppressive effect, which explains why it shared no mRNAs with the other two lncRNAs. Numerous studies have established that SNHG12 is associated with cancers , and could be used as a potential therapeutic target and biomarker for human cancers (Tamang et al., 2019). DNA-methylation-mediated activation of SNHG12 promoted temozolomide resistance in glioblastoma . SNHG12 promoted tumor progression and sunitinib resistance by upregulating CDCA3 in renal cell carcinoma . EPB41L4A-AS1 was a repressor of the Warburg effect and played an important role in the metabolic reprogramming of cancer (Liao et al., 2019), and EPB41L4A-AS1 has been identified as a potential biomarker in non-small cell lung cancer . At present, AC027601 has been identified as a survival signature in renal clear cell carcinoma (Qi-Dong et al., 2020), but has not been reported in other cancers. Given that the three lncRNAs were simultaneously identified as KIRCrelated lncRNAs in this study, we believed that AC027601 should be closely associated with the occurrence and development of KIRC, and it is a newly identified cancer-related lncRNA.
Only ACTA2-AS1 was mapped into the ceRNA network in KIRP, where it established a ceRNA relationship with PPP1R12B. PPP1R12B has been shown to inhibit tumor growth and metastasis by regulating Grb2/PI3K/Akt signaling in colorectal cancer (Ding et al., 2019). ACTA2-AS1 plays an important role in a variety of cancers, for example, ACTA2-AS1 is significantly associated with overall survival in ovarian cancer patients (Li and Zhan, 2019). ACTA2-AS1 plays different roles in different cancers. ACTA2-AS1 knockdown promotes liver cancer cell proliferation, migration and invasion (Zhou and Lv, 2019), while ACTA2-AS1 suppresses lung adenocarcinoma progression (Ying et al., 2020), implying an inhibitory effect on the two cancers. However, ACTA2-AS1 promotes cervical cancer progression (Luo et al., 2020), suggesting its carcinogenic role in cancer. Figure 6I shows the survival analysis result for ACTA2-AS1 in KIRP. We believed that ACTA2-AS1 had a carcinogenic effect in KIRP.

Common lncRNA Biomarkers in Cancers
All cancer types exhibited infinite proliferation, transformation and ease of metastasis. Therefore, we sought to identify DMlncs common to different cancers to help understand the mechanisms underlying the occurrence of common features in cancers. First, the intersection of DMlncs were searched in cancers, and the results are shown in Figure 7. The upper triangle and lower triangle reflected the intersection of up-methylated lncRNAs and the intersection of down-methylated lncRNAs, respectively. The findings were consistent with the hypothesis that the larger the lncRNA set, the greater the overlap with other cancers. The intersections of DMlncs of SARC, STAD, and THYM with other cancers were small. The down-methylated lncRNAs of GBM and SKCM had small intersections with other cancers, whereas the up-methylated lncRNAs of THCA had small intersections with other cancers. There were amount of upmethylated lncRNAs in PCPG (62), but the overlaps with other cancers were small. Among the down-methylated lncRNAs, the intersection of KIRC and KIRP was the largest of KIRP, but only ranked 10th of the KIRC. Among the upmethylated lncRNAs, the intersection of KIRC and KIRP was the largest of KIRC, while was the second largest of KIRP.
Subsequently, we extracted DMlncs which were common in various cancers. The findings indicated that there were 19 common DMlncs in more than 15 cancers (RP4-792G4, RP5-855F14, OTX2-AS1, RP11-52L5, CYP1B1-AS1, RP11-175E9, RP11-552E20, HCCAT3, RP11-718O11, HOXA-AS2, RP3-326L13, AC007228, RP11-297B11, CTC-523E23, LINC01010, RP11-227D2, EVX1-AS, AC018730, and RP11-465L10). Fourteen of them were up-methylated in all the cancers, indicating their carcinogenic potential, whereas three lncRNAs were down-methylated in the majority of cancers and may act as Frontiers in Cell and Developmental Biology | www.frontiersin.org May 2022 | Volume 10 | Article 882698 potential tumor suppressors ( Figure 8A). Figure 8B shows the comparison of methylation levels of the 19 lncRNAs in tumor and normal samples. It is intuitive to conclude that there were significant differences in lncRNAs methylation levels between tumor and normal samples. Except for LINC01010, RP11-552E20, and RP5-855F14, the lncRNAs had a higher methylation level in tumor samples. Among the 19 lncRNAs, OXT2-AS1 was shown to be significantly down-methylated in lung squamous cell carcinoma and was closely associated with poor prognosis of cancer (Zheng et al., 2021). CYP1B1-AS1 has been confirmed to play an important role in triple-negative breast cancer, lung adenocarcinoma and acute myeloid leukemia, and was associated with the prognosis of these cancers (Cheng et al., 2021;Ren et al., 2021;Vishnubalaji and Alajez, 2021). Abnormal methylation and low expression of CTC-523E23 led to poor prognosis in patients with lung squamous cell carcinoma (Rui Li et al., 2020). Inhibition of LINC01010 may promote the migration and invasion of lung cancer cells (Cao et al., 2020) and help in the prediction of neuroblastoma prognosis (Gao et al., 2020). EVX1-AS is closely associated with the prognosis of colon cancer and has been predicted to be potentially associated with the development of multiple cancers by LncRNADisease V2.0 Gao et al., 2021). The findings revealed that the abnormalities of lncRNAs played an important role in the occurrence and development of cancer, and the unconfirmed lncRNAs could serve as entry points for future research. Additionally, numerous lncRNAs were identified as abnormal in lung cancer, indicating that lung cancer may be influenced by a variety of pathogenic factors.
Then, we investigated the functions of the 19 lncRNAs and performed functional enrichment analysis on these lncRNAs using the GREAT software. Figure 8C shows that these lncRNAs were enriched in processes required for organisms such as metabolism and biosynthesis, as well as those closely associated with the occurrence and development of cancer, such as gene expression and transcription.
To further validate the cancer-common lncRNAs identified in this study, significantly survival-related lncRNAs were mapped to the ceRNA network, and only mRNAs shared by more than four types of cancers were selected for further study. Finally, a subnetwork comprising 33 mRNAs and three lncRNAs (AC007228, CYP1B1-AS1, and HOXA-AS2) was identified in ten cancers. The ceRNA distribution of the 33 mRNAs in cancers is shown in Figure 9A. Although these mRNAs were shared by multiple cancers, they generally formed ceRNA relationships with a greater number of lncRNAs and had higher correlation coefficients in KIRP. HOXA3 had ceRNA lncRNAs in ten cancers and showed high correlation coefficients in BRCA, CESC, HNSC and LUSC. Figure 9B shows the ceRNA subnetwork, in which HOXA-AS2 and AC007228 were shared by ten cancers, and the two lncRNAs shared some ceRNA relationships with some mRNAs (DMTF1, HOXB3, NOD1, RABL2A, TRIOBP, ZNF443, and ZNF789) but formed ceRNA relationships with some other mRNAs, respectively (AC007228: ZNF10, ZNF211, ZNF229, ZNF471, ZNF583, ZNF614, ZNF649, ZNF763, ZNF793, and ZNF879; HOXA-AS2: DHRS3 and HOXA3).
The genomic locations of HOXA-AS2 and AC007228 were checked. As shown in Figure 10A, the genomic locations of lncRNAs belonging to the AC007228 family and mRNAs belonging to the ZNF family were extremely similar. As shown in Figure 10B, HOXA-AS2  and its ceRNA HOXA3 were both located in the same genomic region. Their sequences were similar, and the ceRNA relationships were generated by the lncRNAs' cis-regulatory interactions with mRNAs.
The major histocompatibility complex (MHC) region was one of the regions with the highest gene density and polymorphism. High-throughput sequencing and other technologies confirmed the role of MHC in disease and showed that MHC was associated with cancer and neurological diseases in addition to infection and autoimmune diseases. MHC is involved in antigen recognition during the immune response and is capable of inducing immune cells to participate in immune response (Trowsdale and Knight, 2013). Studies have also reported that immune cytolytic activity (CYT) is positively correlated with the presence of inhibitory receptors (PDCD1, PDL1, CTLA4, LAG3, TIM3, and IDO1), and the presence of CYT is more responsive to immune checkpoint inhibition, suggesting that it can be used as a key marker for immune checkpoint therapy (Narayanan et al., 2018;Wang et al., 2019). Immune control of tumor lesions requires local antigen recognition, activation and amplification of tumor-specific cytotoxic T lymphocytes (CTL). The activated CTL infiltrate the tumor microenvironment and scan the tumor tissue, where they directly interact with the target cells, inducing tumor cell apoptosis and atrophy (Basu et al., 2016). First, effector T lymphocytes are required to migrate to the tumor foci, a process referred to as immune cell infiltration. Following that, they have to make physical contact with the tumor  Frontiers in Cell and Developmental Biology | www.frontiersin.org May 2022 | Volume 10 | Article 882698 13 cells and scan their MHC. Finally, by releasing perforin or fas/ fasl to bind target cells, CTL activates and induces apoptosis (Weigelin et al., 2011). Therefore, we assess the immunological effects of HOXA-AS2 and AC007228 using MHC, CYT, and CTL scores.
To improve the evaluation of the effect of lncRNAs on immunity, the R package "ConsensusClusterPlus" (Wilkerson and Hayes, 2010) was used to cluster samples of each cancer based on their expression profiles of HOXA-AS2 and AC007228. We varied the parameter k from two to six, and then selected the optimal subtypes for subsequent immune score evaluation. The score comparison shown in Figure 10C demonstrates that all three scores were consistent across cancer subtypes, indicating that the identified key lncRNAs may aid in predicting the immunological status of cancer patients and provide a basis for tumor treatment. Subsequently, the Wilcoxon rank-sum test was used to compare the immunity scores of different subtypes, and significant differences were observed in the immunity scores of different subtypes in KIRC, LIHC, LUAD, and PRAD.
Finally, we assessed the tumor mutation burden (TMB) among subtypes. TMB was a novel biological target for which therapeutic impact may be predicted. Previous research has demonstrated that the more somatic mutations a cancer patient possesses, the more likely it is that new antigens are produced. Antigen peptides could be loaded onto the MHC and displayed on the cell surface, aiding in their recognition by T cells . Therefore, cancer patients with high TMB levels responded better to immune checkpoint blockade therapy. As shown in Figure 10D, the TMB of KIRC, KIRP, LIHC, LUAD, and UCEC corresponded to the immunity score. Subtypes of key lncRNAs played an important role in the immune effect in KIRC, LIHC, and LUAD.

DISCUSSION
LncRNAs have been implicated in the occurrence and development of cancer. This study aimed to identify the specific and common lncRNAs with aberrant methylation in pan-cancer. After searching for DMlncs in a variety of cancers, the pan-cancer results were compared. Subsequently, data on lncRNA expression, lncRNA methylation and mRNA expression was integrated to identify lncRNAs with a negative correlation between methylation and expression changes, and survival analysis was performed to further verify the results. Following that, survival-related lncRNAs were mapped to the ceRNA network, and pan-cancer biomarkers were identified by examining the connection characteristics of the network. Finally, the immune effect of the lncRNAs was verified.
In this study, DMlncs for 23 cancers were acquired, and cancer-specific lncRNAs and cancer-common lncRNAs were identified. The NClncs were further screened for ten cancers, and the correlation between the lncRNAs and mRNAs, as well as the association with survival were verified. Cancer-specific lncRNAs may be used as diagnostic biomarkers for corresponding cancers. In clinical application, these lncRNAs could apply to make detection kits of corresponding cancers. Common lncRNAs in pan-cancers could be used to understand the mechanism underlying common features in cancers. These lncRNAs can be used for the development of targeting drugs for the remission of general symptoms and treatment of cancer.
This study yielded significant results. Not only did we validate several previously known cancer-associated lncRNAs, but we also identified new cancer-related lncRNAs. AC027601 was identified as a novel KIRC-associated lncRNA, and ACTA2-AS1 was discovered to be carcinogenic in KIRP. Additionally, two lncRNAs, HOXA-AS2, and AC007228 were identified as pancancer lncRNAs.
However, there are some limitations to this study. Because the number of DMlncs for SARC, STAD, and THYM was less than ten, the more systematic comparisons for these cancers were impossible. The number of normal samples with methylation data for these three cancers was less than ten. The sample proportion was skewed when differences were calculated, resulting in less statistically significant results. Additionally, the DNA methylation data only included 450k arrays and did not cover the entire genome, which might have contributed to the study's insufficiency outcomes. We only identified the epigenetically dysregulated lncRNAs based on DNA methylation, although N6-methyladenosine (m6A) as the RNA posttranscriptional modification has been shown to influence the function of RNAs as well. We did not evaluate the relationship between m6A and lncRNA due to the lack of data.
In the future, more complete data sets on cancers may be collected to allow for more rigorous comparisons. We may use copy number data to analyze the change in copy number of lncRNA and conduct a more comprehensive investigation of the change and function of lncRNA in cancer. Additionally, other types of omics data may be integrated, and factors affecting lncRNAs and gene expression could be evaluated more comprehensively bringing the research process closer to the way molecules interact in the human body. With the development of sequencing techniques, additional methylation data sets such as HM850K, whole-genome bisulfite sequencing (WGBS), and reduced representation bisulfite sequencing (RRBS), as well as other types of methylation data sets, will be used to analyze the function of DNA methylation for lncRNAs.
In general, this study screened cancer-related lncRNA biomarkers based on their methylation alterations and their competing mRNAs. This study considered multiple omics data more comprehensively and used more stringent screening criteria, which effectively eliminated of data deviation errors. The lncRNA biomarkers identified in this study may aid in the investigation of cancer mechanisms.