Genome-Wide Methylation Profiling of lncRNAs Reveals a Novel Progression-Related and Prognostic Marker for Colorectal Cancer

Sporadic colorectal cancer (CRC) develops principally through the adenoma-carcinoma sequence. Previous studies revealed that DNA methylation alterations play a significant role in colorectal neoplastic transformation. On the other hand, long noncoding RNAs (lncRNAs) have been identified to be associated with some critical tumorigenic processes of CRC. Accumulating evidence indicates more intricate regulatory relationships between DNA methylation and lncRNAs in CRC. Nevertheless, the methylation alterations of lncRNAs at different stages of colorectal carcinogenesis based on a genome-wide scale remain elusive. Therefore, in this study, we first used an Illumina MethylationEPIC BeadChip (850K array) to identify the methylation status of lncRNAs in 12 pairs of colorectal cancerous and adjacent normal tissues from cohort I, followed by cross-validation with The Cancer Genome Atlas (TCGA) database and the Gene Expression Omnibus (GEO) database. Then, the abnormal hypermethylation of candidate genes in colorectal lesions was successfully confirmed by MassARRAY EpiTYPER in cohort II including 48 CRC patients, and cohort III including 286 CRC patients, 81 advanced adenoma (AA) patients and 81 nonadvanced adenoma (NAA) patients. DLX6-AS1 hypermethylation was detected at all stages of colorectal neoplasms and occurred as early as the NAA stage during colorectal neoplastic progression. The methylation levels were significantly higher in the comparisons of CRC vs. NAA (P < 0.001) and AA vs. NAA (P = 0.004). Moreover, the hypermethylation of DLX6-AS1 promoter was also found in cell-free DNA samples collected from CRC patients as compared to healthy controls (P adj = 0.003). Multivariate Cox proportional hazards regression analysis revealed DLX6-AS1 promoter hypermethylation was independently associated with poorer disease-specific survival (HR = 2.52, 95% CI: 1.35-4.69, P = 0.004) and overall survival (HR = 1.64, 95% CI: 1.02-2.64, P = 0.042) in CRC patients. 
Finally, a nomogram was constructed and verified by a calibration curve to predict the survival probability of individual CRC patients (C-index: 0.789). Our findings indicate DLX6-AS1 hypermethylation might be an early event during colorectal carcinogenesis and has the potential to be a novel biomarker for CRC progression and prognosis.

INTRODUCTION
Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of cancer-related death worldwide, with an estimated 1.9 million new cases and 935,000 deaths in 2020 (1). The majority of CRC cases are sporadic and develop principally through the adenoma-carcinoma sequence (2). It is well established that the gradual accumulation of multiple genetic and epigenetic changes plays a key role in the initiation and progression of colorectal carcinogenesis (3). In addition to conventional genetic variants, the regulatory contribution of epigenetic alterations has also been identified as a causative factor during cancer initiation and progression.
To date, aberrant DNA methylation, primarily in the form of hypermethylated or hypomethylated CpG dinucleotides within the genome, is one of the most extensively studied epigenetic alterations in human cancer (4). In particular, hypermethylation of gene promoter regions, which is frequently characterized by transcriptional silencing, remains the most dominant phenomenon during cancer development (5). Many studies have reported DNA methylation changes in cancer-related genes in CRC using genome-wide-based approaches or candidate gene strategies (6)(7)(8). Notably, these aberrant methylation alterations occur more frequently at the early stages of neoplastic progression (6). Indeed, hierarchical hypermethylation patterns of CRC-related suppressor genes, such as SFRP2, SEPT9 and MPPED2, have been observed throughout the progression stages of colorectal carcinogenesis (9)(10)(11). Taken together, these findings indicate that abnormal changes in DNA methylation might be hallmarks of CRC initiation and progression. DNA hypermethylation might be one of the first detectable neoplastic alterations associated with carcinogenesis.
Long noncoding RNAs (lncRNAs), once regarded as useless transcripts, have been proven to be important regulators involved in biological, developmental, and pathological processes (13,14). Remarkably, accumulating evidence supports more intricate regulatory relationships between DNA methylation and lncRNAs (15,16). For instance, by performing an integrated analysis of epigenome and transcriptome data, Miller-Delaney et al. revealed that differential methylation might play an important role in the transcriptional regulation of lncRNAs in human temporal lobe epilepsy (17). He et al. identified 18 lncRNAs involved in methylation modifications that contributed to the tumorigenesis and development of glioma (16). Nevertheless, methylation studies of lncRNAs in CRC have largely been based on candidate gene strategies (18,19). LncRNA methylation biomarkers of CRC identified on a genome-wide scale remain elusive.
Therefore, in this study, we first used an Illumina MethylationEPIC BeadChip (850K array) to identify the methylation status of lncRNAs in CRC. Then, we performed a technical validation of six candidate genes with MassARRAY EpiTYPER in CRC, followed by a comprehensive study to analyze the DLX6-AS1 methylation pattern at different stages of colorectal neoplasms, from nonadvanced adenoma (NAA) to advanced adenoma (AA) to colorectal carcinoma. Furthermore, we evaluated the DLX6-AS1 methylation levels in peripheral blood leukocyte DNA and analyzed their consistency with local lesions from the same patient. The methylation status of the DLX6-AS1 promoter in cell-free DNA (cfDNA) of CRC patients was also evaluated. In addition, we performed survival analysis to clarify the prognostic role of DLX6-AS1 methylation in CRC. A nomogram was established to predict the survival probability of CRC patients.

Study Design and Participants
A flowchart for this study is shown in Figure 1. Briefly, this study was carried out in three cohorts. First, a genome-wide methylation scan by 850K array on cancerous and paired normal tissues from 12 CRC patients in cohort I was performed, followed by cross-validation using DNA methylation data from the TCGA database (https://cancergenome.nih.gov) and the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The DNA methylation data from the TCGA and the GEO were generated using an Illumina HumanMethylation450 BeadChip (450K array) in 438 CRC tissue samples (393 tumor, 45 normal) and 208 CRC tissue samples (104 tumor, 104 normal), respectively. An overview of the external datasets used in this study is shown in Supplementary Table S1. Then, 48 pairs of CRC tissue samples from cohort II were tested. Additionally, the methylation levels of DLX6-AS1 were further validated in cohort III, which consisted of 286 CRC patients, 81 AA patients and 81 NAA patients. The characteristics of the participants in each cohort subjected to the tissue-based methylation analysis are shown in Table 1.
To evaluate the DNA methylation levels in peripheral blood, we randomly sampled 60 CRC patients and 60 adenoma patients with complete tissue-based DNA methylation data from cohort II and cohort III, and 60 healthy controls from a population-based cohort. The DNA methylation status of the same region as measured in tissue samples was tested in each sample of peripheral blood leukocyte DNA. The characteristics of the participants subjected to the peripheral blood-based methylation analysis are shown in Supplementary Table S2.
To evaluate the DNA methylation levels in cfDNA, the DNA methylation data generated by the 850K array in 7 cfDNA samples (3 CRCs, 4 healthy controls) were obtained from the GEO database (Supplementary Table S1).
To evaluate the influence of DLX6-AS1 methylation on survival, CRC patients with successfully measured DNA methylation data in our cohort II and cohort III were pooled together, and CRC patients with both available methylation data and survival information from the TCGA database were used as an external validation.
CRC patients from Shaoxing People's Hospital were enrolled between January 2015 and July 2018. Participants with AA or NAA and healthy controls were selected from a population-based cohort in Jiashan County that has been ongoing since 1989 and has been described previously (11). All participants were ethnic Han Chinese from Zhejiang Province and were pathologically confirmed, with no familial adenomatous polyposis (FAP), no previous history of CRC and no preoperative anticancer treatment. For each participant, histologically confirmed tissue samples, including a colorectal lesion (carcinoma or adenoma) and an adjacent normal mucosa sample, and peripheral blood samples were obtained. The adjacent normal mucosa was collected from the colonic mucosa 5 cm distal from the main neoplasm. Adenomas were classified as AA (any adenoma ≥ 1 cm, high-grade dysplasia, or with tubulovillous or villous histology) and NAA (adenomas < 1 cm without advanced histology) according to current guidelines (20). The TNM staging classification for CRC was determined according to the 7th edition of the American Joint Committee on Cancer (AJCC) cancer staging manual (21).
The study protocol was approved by the Medical Ethics Committee of Zhejiang University School of Medicine. Before basic information and sample collection, written informed consent was obtained from all recruited participants.

DNA Extraction and Bisulfite Modification
Genomic DNA from fresh-frozen samples and peripheral blood leukocytes was isolated using a DNA Tissue Kit (TianLong Biotech, Xi'an, China) and a RelaxGene Blood DNA System (TianGen Biotech, Beijing, China), respectively. Bisulfite treatment was conducted on genomic DNA (500 ng) using the EZ Methylation Gold Kit (Zymo Research, Irvine, CA, USA). All procedures were conducted in accordance with the manufacturer's instructions.

Illumina Methylation Assay
Genome-wide DNA methylation profiling was performed using the 850K array in 12 pairs of cancerous and adjacent normal tissues according to the manufacturer's instructions, as described in a previous study (11). In this study, the raw array data were processed using the ChAMP package in R software to derive the methylation level, which was expressed as beta values (fraction methylation values between 0 and 1). We focused mainly on probes located in the promoter region of lncRNAs, which was defined as 1500 bp upstream and downstream from the transcription start site (TSS). The lncRNA annotation file was obtained from LNCipedia (https://hg19.lncipedia.org/) and the mapping procedure was conducted using bedtools (22). Probes were selected on the basis of showing a difference in methylation of ≥ 0.20 and an adjusted P value (Benjamini-Hochberg method) < 0.05. To cross-validate the results based on our samples, the eligible methylation data in TCGA and GEO were obtained and analyzed. The detailed data processing procedures are provided in the Supplementary Methods. Because the 850K array has larger coverage than the 450K array, probes new to the 850K array were cross-validated using the average beta value of the promoter regions of the target genes in the 450K array.
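The probe selection rule described above can be illustrated with a minimal sketch. It is written in Python for illustration only; the analyses in this study were run in R with ChAMP, and this is not that pipeline. The function names, the probe identifiers and the raw P values are hypothetical, and the raw P values are assumed to have been computed beforehand.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_probes(probes, min_delta=0.20, alpha=0.05):
    """Keep probes with |beta difference| >= min_delta and adjusted P < alpha.

    probes: list of dicts with keys 'id', 'beta_tumor', 'beta_normal', 'p'
    (hypothetical input layout, mean beta per group and a raw P value).
    """
    adjusted = benjamini_hochberg([probe["p"] for probe in probes])
    selected = []
    for probe, p_adj in zip(probes, adjusted):
        delta = probe["beta_tumor"] - probe["beta_normal"]
        if abs(delta) >= min_delta and p_adj < alpha:
            direction = "hyper" if delta > 0 else "hypo"
            selected.append((probe["id"], round(delta, 3), direction))
    return selected
```

The direction label mirrors the hypermethylated/hypomethylated split reported for the selected CpG sites; the thresholds are exposed as parameters so the 0.20 and 0.05 cutoffs stated in the text are explicit.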

Sequenom MassARRAY EpiTYPER Assay
The methylation levels of particular CpG sites located in the promoter region of candidate genes were verified using MassARRAY EpiTYPER (Sequenom, San Diego, CA). The schematic representation of each candidate gene is provided in the UCSC browser (http://genome.ucsc.edu). The primers were designed using EpiDesigner (http://epidesigner.com, Supplementary Table S3). The analyzed sequences are shown in Supplementary Figures S1-6. In some cases, fragments resulting from the T-cleavage reaction may contain small groups of adjacent CpG sites and are therefore referred to as "CpG units". CpG sites that were outside of the mass spectrometry analytical window (low or high mass) were filtered out. The mass spectra were collected on a MassARRAY Compact MALDI-TOF system (Sequenom, BioMiao Biological Technology, Beijing, China), and the methylation proportions of individual units on the spectra were generated by EpiTYPER software (Sequenom, San Diego, CA). Methylation levels ranging from 0 (completely nonmethylated) to 1 (fully methylated) are presented. For each gene, CpG units with missing values in more than 20% of the samples were removed, as were samples with missing values in more than 20% of the CpG units. The average methylation value of all CpG units was calculated as a representation of the region-specific gene methylation level.
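The quality-control and averaging step can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the function name and input layout are hypothetical, and it assumes that CpG units are filtered first and samples second, an order the text does not specify.

```python
def filter_and_average(matrix, max_missing=0.20):
    """Apply the 20% missingness filters, then average the retained CpG units.

    matrix: dict mapping sample ID -> list of CpG-unit methylation values,
    with None marking a missing measurement (hypothetical layout).
    Returns a dict mapping each retained sample ID to its mean methylation.
    """
    samples = list(matrix)
    n_units = len(next(iter(matrix.values())))
    # Step 1: drop CpG units missing in more than max_missing of the samples.
    kept_units = [
        j for j in range(n_units)
        if sum(matrix[s][j] is None for s in samples) / len(samples) <= max_missing
    ]
    result = {}
    for s in samples:
        values = [matrix[s][j] for j in kept_units]
        missing = sum(v is None for v in values)
        # Step 2: drop samples missing more than max_missing of retained units.
        if not values or missing / len(values) > max_missing:
            continue
        observed = [v for v in values if v is not None]
        result[s] = sum(observed) / len(observed)
    return result
```

Applying the two filters in the opposite order can retain a slightly different set of units and samples, so the order is worth fixing explicitly in any reimplementation.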

Statistical Analysis
Statistical analyses were performed in R software (version 3.6.2, R Foundation for Statistical Computing, Vienna, Austria). Continuous variables are presented as the mean and standard deviation (SD), and categorical variables are presented as the frequency.
A paired Student's t test was used to assess the differences in DNA methylation levels between colorectal lesion tissues and paired normal tissues. Analysis of variance (ANOVA) followed by Bonferroni's posttest was used to examine significant differences between different groups. Pearson correlation analyses were used to evaluate the consistency of DLX6-AS1 methylation levels between peripheral blood and local lesions of the same patients with CRC or adenoma. The performance of the mean methylation level of candidate genes in distinguishing colorectal lesion tissues from their adjacent normal tissues was tested by receiver operating characteristic (ROC) curve analysis, and the area under the curve (AUC), sensitivity, and specificity were calculated. In the survival analysis, we adopted the best Youden index based on the time-dependent ROC curve as an optimal cutoff to dichotomize the study patients into high-risk and low-risk groups. Survival curves were estimated using the Kaplan-Meier method and compared by the log-rank test. Hazard ratios (HRs) and 95% confidence intervals (95% CIs) were calculated by univariate and multivariate Cox proportional hazards regression analyses. The multivariate analysis was adjusted for age, sex and TNM stage. A nomogram was established to predict the 1-, 2-, 3- and 4-year survival for CRC patients. Harrell's concordance index (C-index) was measured to quantify the discrimination ability of the nomogram, while the calibration curves were used to evaluate whether the predicted survival probabilities were consistent with those observed. All analyses were two-sided, with a P value < 0.05 regarded as statistically significant.
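The Youden index cutoff can be illustrated with a simplified sketch. The study used a time-dependent ROC curve; the Python sketch below instead assumes a fixed binary outcome (1 = event, 0 = no event), which is a deliberate simplification, and the function name and inputs are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, J) maximizing J = sens + spec - 1.

    scores: methylation levels; labels: 1 for an event, 0 otherwise.
    A patient is classified as high-risk when score >= cutoff.
    Ties in J are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best
```

In a time-dependent setting the same maximization would be carried out on sensitivity and specificity estimated at a chosen time point, accounting for censoring.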

Discovery of Differentially Methylated lncRNAs From Genome-Wide Profiling
By DNA methylation profiling, a total of 185 differentially methylated CpG sites mapping to the promoter of lncRNAs (all with P adj < 1 × 10^-5 and β difference > 0.20) were identified by the 850K array generated from 12 pairs of colorectal cancerous and adjacent normal tissues, followed by cross-validation using DNA methylation data generated by the 450K array in CRCs from the TCGA database (tumor=393, normal=45) and GEO database (tumor=104, normal=104), respectively (Supplementary Table S4). Among them, 95.14% (176/185) of the identified CpG sites were significantly hypermethylated and 4.86% (9/185) were significantly hypomethylated. The methylation levels of each differentially methylated CpG site are shown by heat maps (Figure 2). From this list, we selected the six top-ranking sites (cg24014202 in DLX6-AS1, cg18323466 in lnc-DPH5-1, cg08430489 in lnc-PRSS2-6, cg17722675 in lnc-RPS12-6, cg00159100 in lnc-SFRP4-2, cg27442308 in SOX21-AS1) as candidate biomarkers for the subsequent technical confirmation analysis (Figure 3).

Elucidation of the Aberrant DLX6-AS1 Methylation Pattern During Colorectal Neoplastic Progression
To elucidate the DLX6-AS1 methylation pattern during colorectal neoplastic progression, the methylation status was assessed in colorectal lesion tissues and adjacent normal tissues from 286 CRCs, 81 AAs and 81 NAAs in cohort III with MassARRAY EpiTYPER. Among them, 433 histologically confirmed colorectal lesion tissues (283 CRCs, 76 AAs and 74 NAAs) and 441 adjacent normal tissues (284 CRCs, 80 AAs and 77 NAAs) were successfully measured. DLX6-AS1 hypermethylation was detected at all stages of colorectal neoplasms, even as early as the NAA stage. In the comparisons between groups (Table 2), DLX6-AS1 promoter methylation differed significantly for CRC vs. NAA (P < 0.001) and AA vs. NAA (P = 0.004) but not for CRC vs. AA (P = 1.000).
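A reported P value of exactly 1.000 is what a Bonferroni-type adjustment produces when the adjusted value is capped at 1. The short Python sketch below shows this capping; the raw P values in the usage note are hypothetical and are not taken from this study, which reports only the adjusted values.

```python
def bonferroni(p_values):
    """Multiply each raw P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

For example, with three pairwise comparisons, hypothetical raw P values of 0.0001, 0.0013 and 0.60 would be adjusted to 0.0003, 0.0039 and 1.0, respectively.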

Evaluation of DLX6-AS1 Methylation Levels in Peripheral Blood and Their Consistency With Local Colorectal Lesions
To evaluate the potential of DLX6-AS1 methylation as a noninvasive biomarker for the diagnosis of colorectal neoplasms, DLX6-AS1 methylation levels were measured in the peripheral blood leukocyte DNA of 60 CRC patients, 60 adenoma patients and 60 healthy controls. However, there were no significant differences in peripheral blood-based DLX6-AS1 methylation levels in multiple comparisons between CRC patients, adenoma patients and healthy controls (Supplementary Table S5). Even though some CpG units, such as CpG_2.3, reached a statistically significant level (P = 0.017), the methylation levels did not differ much across the different groups. When evaluating the consistency between peripheral blood and local lesions from the same patients (Supplementary Table S6), the Pearson correlation analysis showed poor correlations between matched peripheral blood and local lesions (P = 0.362 for CRCs and 0.893 for adenomas, respectively, in average methylation levels).
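The consistency analysis rests on the Pearson correlation between matched blood and tissue measurements. The following Python sketch (hypothetical function names, not the authors' R code) computes the coefficient and the t statistic on which its P value is based.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

The two-sided P value follows from comparing the t statistic with a t distribution on n - 2 degrees of freedom; a large P value, as reported here, means that a correlation of zero between blood and tissue methylation cannot be rejected.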

DLX6-AS1 Methylation in Cell-Free DNA Samples From Colorectal Cancer
To identify the methylation status of the DLX6-AS1 promoter in cfDNA of CRC patients, we analyzed the methylation data [...] rates than those with a low methylation status (P = 0.017, Figure 6A).

Construction of a Nomogram Model to Predict the Survival
We further built a nomogram that included the methylation status of DLX6-AS1 and clinical factors (age, gender, and TNM stage). The nomogram served as an individualized prognostic tool to predict the probability of 1-, 2-, 3-, and 4-year disease-specific survival (DSS) for CRC patients (Figure 7A). The C-index of the nomogram for predicting the DSS of CRC patients was 0.789 (95% CI: 0.681-0.897), and calibration curves for the 1-, 2-, 3-, and 4-year survival probability demonstrated good agreement between the prediction and actual observation (Figures 7B-E). Similar results were observed in the TCGA dataset (Supplementary Figure S11).
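Harrell's C-index is the proportion of comparable patient pairs in which the patient with the higher predicted risk experiences the event first. A minimal Python sketch is given below; the function name is hypothetical, tied event times are ignored for simplicity, and this is not the implementation used in the study.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up times; events: 1 = event observed, 0 = censored;
    risk_scores: higher values mean a worse predicted prognosis.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which places the reported C-index of 0.789 in context.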

DISCUSSION
In this study, we performed comprehensive DNA methylation profiling of lncRNAs in CRC and identified a novel methylated lncRNA, DLX6-AS1, as a promising biomarker. We validated the hypermethylation of DLX6-AS1 in CRC and further showed that the hypermethylation was present from the NAA stage onward across the multiple steps of the adenoma-carcinoma sequence. Further comparisons revealed that DLX6-AS1 methylation was able to differentiate CRC from NAA and AA from NAA. Moreover, DLX6-AS1 promoter hypermethylation was also identified in cfDNA of CRC patients as compared to healthy controls. Finally, survival analysis demonstrated DLX6-AS1 hypermethylation as an independent predictor of poorer DSS and OS for CRC patients, and nomograms were constructed to predict the survival probability of individual CRC patients.
Most sporadic CRCs develop from dysplastic adenomas over a long time (2). This provides a desirable opportunity to detect CRC at an early curable stage and to screen for potentially premalignant lesions (23). Aberrant DNA promoter methylation has previously been revealed to be an early event in CRC development (24). For example, by conducting a series of genome-wide DNA methylation assays among 20 normal and pre-CRC samples, including 18 low-grade adenomas and 22 high-grade adenomas, Fan et al. found that the methylation alterations detected in low-grade adenoma were maintained or increased in high-grade adenoma and cancer (25). Several studies on DNA methylation biomarkers tested in fecal (26,27) and blood (28,29) samples indicated the potential of epigenetic biomarkers for early CRC diagnosis. The present study showed that DLX6-AS1 hypermethylation was detectable from the NAA stage onward during colorectal neoplastic progression, suggesting that this epigenetic change is a candidate driver of tumor progression. Thus, DLX6-AS1 hypermethylation might be a promising biomarker for the early detection and risk assessment of CRC.
It should be kept in mind that adenomas of different histological types differ in the risk of colorectal neoplastic progression (30). Based on a prospective cohort study, Click et al. revealed that patients with AA carried a higher risk of developing CRC than patients with NAA (31). To date, few molecular markers that define colorectal adenomas at high risk of progressing to CRC are available (32). At the epigenetic level, Semaan et al. identified varied differences in SEPT9 and SHOX2 methylation levels among CRC, AA and NAA tissues (10). The present study revealed significant differences in DLX6-AS1 methylation levels between CRC vs. NAA and AA vs. NAA. However, no significant differences in methylation levels were identified between AA and CRC, indicating that the biological processes inherent to CRC might be more active in AA than in NAA. These epigenetic features might be used to help characterize patients at a high risk for malignancy in the future.
Growing efforts have been made to identify noninvasive biomarkers for the early detection of CRC (33)(34)(35). Based on peripheral blood, Heiss et al. (36) reported the leukocyte DNA methylation of KIAA1549L and the leukocyte DNA methylation of BCL2 as potential biomarkers for early CRC diagnosis. In the present study, we did not find significant differences in peripheral blood-based DLX6-AS1 methylation levels between CRC patients, adenoma patients and healthy controls. Thus, the potential of this methylation marker in peripheral blood for early diagnosis requires further investigation. In fact, it remains controversial whether the DNA methylation alterations in peripheral blood are actually a response of the hematopoietic systems to tumor development (37). Another point of controversy to mention is whether the DNA methylation status measured in peripheral blood leukocytes could reflect the methylation status of local tumor lesions (38). To address this controversy, we then compared the methylation levels of DLX6-AS1 between matched peripheral blood and local lesions. However, the lack of a correlation between them in the present study provides little evidence for the tissue origin of leukocyte methylation. These results indicate a distinct tissue-specific pattern of DNA methylation in CRC. As cfDNA is tumor derived and carries cancer-specific genetic and epigenetic aberrations (28,39), we then observed the DLX6-AS1 hypermethylation in the cfDNA samples from CRC patients as compared to healthy controls. Altogether, the methylation changes identified in our study might suggest a potential target for the study of cfDNA methylation for early cancer detection and tissue-of-origin mapping for metastases.
In clinical practice, CRC patient prognosis relies mostly on pathological staging according to the TNM system (40,41). However, there are considerable variations in survival among individuals with the same staging (42), underlining the need for additional prognostic and predictive molecular markers. Here, we identified that DLX6-AS1 methylation was associated with CRC-specific survival. Importantly, the identified methylation signature was independent of classical prognostic risk factors and could therefore be of added value when implemented in the clinic. DLX6-AS1 was reported to participate in tumor progression (45), indicating its potential roles in cancer prognosis. Our study showed DLX6-AS1 methylation to be associated with CRC-specific survival for the first time. In addition, the nomogram was generated to predict the survival probability of individual CRC patients, and the calibration plots indicated that the predicted survival was consistent with the observed survival. The findings from this study indicate the potential importance of DNA methylation in CRC prognosis and provide clues to help improve clinical decision-making precision in the future.
We are aware of several limitations of this study. First, a direct explanation for the associations between DNA methylation and gene expression was limited, since we are currently unable to measure the matched DLX6-AS1 expression levels. Second, although hypermethylation of DLX6-AS1 was observed in cfDNA samples by the 850K array in the GEO database, further studies are needed that take into consideration the low proportion of circulating tumor DNA in cfDNA and the currently very limited sample size. Third, as the follow-up in our study was relatively short, studies with longer clinical surveillance are warranted to bolster the reliability of the identified potential prognostic methylation biomarker. Last, although we found that the aberrant methylation of DLX6-AS1 might serve as a potential biomarker for CRC progression and prognosis, external validation with larger and more diverse study populations is still required to further confirm the clinical value of DLX6-AS1 methylation in CRC.
In summary, based on a systematic evaluation of the DNA methylation pattern of lncRNAs in CRCs by genome-wide methylation profiling, the current study is the first to identify that the promoter region of DLX6-AS1 was hypermethylated in CRC and its premalignant lesions. We additionally revealed that hypermethylation was independently associated with poorer DSS and OS in CRC patients. Thus, DLX6-AS1 hypermethylation might occur at an early stage during colorectal carcinogenesis and has the potential to be a biomarker for the progression and prognosis of CRC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Zhejiang University School of Medicine. The patients/participants provided their written informed consent to participate in this study.