
CORRECTION article

Front. Mol. Biosci.

Sec. Molecular Diagnostics and Therapeutics

Volume 12 - 2025 | doi: 10.3389/fmolb.2025.1658908

Development of a prognostic model for early-stage gastric cancer-related DNA methylation-driven genes and analysis of immune landscape

Provisionally accepted
Chen Su1, Zeyang Lin1, Zhijian Ye1, Jing Liang2, Rong Yu1*, Zheng Wan3*, Jingjing Hou1,3*
  • 1Zhongshan Hospital Xiamen University, Xiamen, China
  • 2Sun Yat-Sen University, Guangzhou, China
  • 3Xiamen University, Xiamen, China

The final, formatted version of the article will be published soon.

Background and Aims: This study aimed to develop a prognostic model based on DNA methylation-driven genes for patients with early-stage gastric cancer and to examine immune infiltration and function across varying risk levels.

Methods: We analyzed data from stage I/II gastric cancer patients in The Cancer Genome Atlas, which included clinical details, mRNA expression profiles, and level 3 DNA methylation array data. Using the empirical Bayes method of the limma package, we identified differentially expressed genes (DEGs), and the MethylMix package facilitated the identification of DNA methylation-driven genes (DMGs). Univariate Cox regression and LASSO (least absolute shrinkage and selection operator) analyses were used to pinpoint critical genes. A risk score prediction model was formulated using the two genes with the highest and lowest hazard ratios (HRs). Model performance was evaluated within the initial cohort and verified in the GSE84437 cohort; a nomogram was also constructed based on these genes. We further examined 50 methylation sites associated with three CpG islands in C1orf35 and 14 methylation sites linked to one CpG island in FAAH. The CIBERSORT package was employed to identify immune cell clusters in the prediction model.

Results: A total of 176 DNA methylation-driven genes were refined down to a four-gene signature (ZC3H12A was hypermethylated; GATA3, C1orf35, and FAAH were hypomethylated), which exhibited a significant correlation with overall survival (OS), as evidenced by p-values below 0.05 following univariate Cox regression and LASSO analysis. Specifically, for the risk score prediction model, C1orf35, which had the highest hazard ratio (HR = 2.035, p = 0.028), and FAAH, with the lowest hazard ratio (HR = 0.656, p = 0.012), were selected. The Kaplan-Meier analysis demonstrated distinct survival outcomes between the high-risk and low-risk score groups.
The model's predictive accuracy was confirmed with an area under the curve (AUC) of 0.611 for 3-year survival and 0.564 for 5-year survival. Notably, the hypomethylation of the three CpG islands in C1orf35 and the single CpG island in FAAH was significantly different in stage

Gastric cancer is globally recognized as the sixth most prevalent type of cancer and ranks as the seventh leading cause of cancer-related mortality, as reported by the World Health Organization (WHO) (https://gco.iarc.who.int/ Data version: Globocan 2022 (version 1.1), 08.02.2024). The primary treatment strategy for this malignancy is surgery. The survival rates of patients undergoing potentially curative surgery vary significantly, influenced by factors such as the stage of the cancer at diagnosis and the quality of the surgical procedure (Huang et al., 2022; Orman and Cayci, 2019). Despite successful R0 resections, which indicate no residual microscopic disease, some patients may experience recurrence due to previously undetected micro-metastases. To address this, gastrectomy followed by adjuvant chemotherapy is frequently used to reduce the risk of recurrence. Prior research highlights the significant role of adjuvant chemotherapy in enhancing outcomes for patients with treatable advanced gastric cancer (Sakuramoto et al., 2007; Nakajima et al., 2010). While many patients with stage I or II gastric cancer experience positive outcomes following either endoscopic or traditional surgical interventions, the prognosis for others remains poor (Miyahara et al., 2022). Recent studies have focused on identifying clinicopathological factors associated with overall survival (OS) in the early stages of gastric cancer (Ikoma et al., 2016; Bausys et al., 2018; Aoyama et al., 2014; Saito et al., 2016), yet the genetic markers predictive of prognosis at these stages are still not well defined.
The identification of such predictive markers is essential for improving survival forecasts for patients with early-stage gastric cancer.

Alterations in DNA methylation, including increased methylation of tumor-suppressor genes (Manel, 2002) and decreased methylation of oncogenes (Feinberg and Vogelstein, 1983), play a pivotal role in the pathogenesis of various cancers, including gastric cancer (Tahara and Arisawa, 2015; Calcagno et al., 2013). Genes such as SFRP2, THBS1, and UCHL1, which exhibit aberrant methylation patterns, could be crucial in determining the prognosis of gastric cancer (Yan et al., 2021; Hu et al., 2021; Wang et al., 2015). The tumor microenvironment (TME) is a complex network, and earlier research has suggested that tumor-infiltrating immune cells (TIICs) significantly influence the initiation, progression, and clinical outcomes of cancer. Additionally, the responses of the innate and adaptive immune systems are critical determinants of the efficacy of immunotherapies (Steidl et al., 2010; Guo et al., 2021; Seager et al., 2017). Recent studies have elucidated the interaction between DNA methylation and tumor immunity, revealing that DNA methylation regulation-related genes (DMRegs) have potential effects on immune cell infiltration, the TME, and the efficacy of immunotherapy in hepatocellular carcinoma (HCC) patients. High scores in DMRegs, characterized by the predominance of TP53 wild-type mutations, elevated expression of PD-1 and CTLA-4, and marked immune activation, correlate with a poor prognosis (Song et al., 2022). Other studies also confirmed the significant role of DNA methylation in influencing tumor immunity, although its comprehensive impact on TME formation and immune activation remains to be fully elucidated (Yuan et al., 2022; Suarez-Alvarez et al., 2012). These findings underscore the close relationship between DNA methylation and immune regulation, meriting further investigation.
However, the specific effects of DNA methylation on the prognosis of early gastric cancer and the functionality of TIICs remain ambiguous. Consequently, further research into DNA methylation could offer valuable insights into this field.

This study aims to identify genes regulated by DNA methylation and explore the relationship between these DMGs and TIICs, which may serve as prognostic indicators for patients with stage I or II gastric cancer. This research could significantly enhance our understanding of the characteristics of tumor microenvironment cell infiltration and inform treatment strategies. Our findings suggest that methylation modifications of several genes are intricately linked to the early development of gastric cancer.

We procured level 3 DNA methylation and mRNA expression datasets, along with corresponding clinical data for stage I/II gastric cancer, from TCGA (Atlas, 2014). This included mRNA expression data for 164 tumor and 32 normal tissues and DNA methylation data for 189 tumor and 27 normal tissues (Zhu et al., 2014). Clinical details such as age, gender, and stage were also collected (Table 1). The methylation and mRNA expression data were generated using the Illumina Infinium Human Methylation450 BeadChip and Illumina GA_RNASeq V2.1.0.0 platforms, respectively (Illumina, Inc., San Diego, CA, United States). Employing the limma package in R (Smyth et al., 2005), we identified DEGs between tumor and normal gastric tissues. Expression fold-change (FC) was calculated, and DEGs were selected based on a significance threshold of p < 0.05 and |log2FC| ≥ 0.585. To identify differentially hyper- and hypomethylated genes, we used the MethylMix package in R (Gevaert, 2015; Cedoz et al., 2018). Genes were classified as DMGs if they satisfied the following criteria: p < 0.05, |log2FC| ≥ 1, and cor ≤ -0.3.
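The DEG and DMG selection thresholds stated above can be illustrated with a minimal Python sketch. It is not the authors' limma or MethylMix pipeline: the `GeneStat` record, the helper names, and the toy values are hypothetical, and the p-values, fold-changes, and correlations are assumed to have been computed upstream. Only the cutoffs themselves are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneStat:
    """Hypothetical per-gene summary produced by an upstream analysis."""
    gene: str
    log2fc: float                 # log2 fold-change, tumor vs. normal
    p_value: float
    cor: Optional[float] = None   # expression-methylation correlation


def is_deg(g: GeneStat, p_cut: float = 0.05, lfc_cut: float = 0.585) -> bool:
    # DEG rule from the text: p < 0.05 and |log2FC| >= 0.585
    return g.p_value < p_cut and abs(g.log2fc) >= lfc_cut


def is_dmg(g: GeneStat, p_cut: float = 0.05, lfc_cut: float = 1.0,
           cor_cut: float = -0.3) -> bool:
    # DMG rule from the text: p < 0.05, |log2FC| >= 1 and cor <= -0.3
    return (g.cor is not None and g.p_value < p_cut
            and abs(g.log2fc) >= lfc_cut and g.cor <= cor_cut)


# Toy values for illustration only (not data from the study).
genes = [
    GeneStat("GENE_A", 1.2, 0.001, -0.45),
    GeneStat("GENE_B", 0.3, 0.010, -0.50),
    GeneStat("GENE_C", -0.9, 0.020, -0.60),
    GeneStat("GENE_D", 2.0, 0.200, -0.70),
    GeneStat("GENE_E", -1.5, 0.030, -0.10),
]

degs = [g.gene for g in genes if is_deg(g)]
dmgs = [g.gene for g in genes if is_dmg(g)]
print("DEGs:", degs)
print("DMGs:", dmgs)
```

A |log2FC| cutoff of 0.585 corresponds to roughly a 1.5-fold change, since log2(1.5) ≈ 0.585.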
The MethylMix analysis was conducted in three stages. First, we overlaid cancer DNA methylation data with the corresponding gene expression data to pinpoint methylation changes affecting gene expression; only genes satisfying the correlation filtering criteria were retained for further examination. Next, we fitted a beta mixture model to describe the methylation patterns across an extensive cohort of patients, which removes the need for a preset threshold. Finally, we employed the Wilcoxon rank-sum test to compare DNA methylation levels between gastric cancer and normal tissues.

Using the survival package in R, we carried out a univariate Cox regression analysis. This analysis focused on DMGs correlated with patient outcomes, calculating hazard ratios and their confidence intervals. We set the significance level at a p-value less than 0.05. We used LASSO analysis to explore how the expression of genes influenced by DNA methylation correlates with prognosis. This method identified key DNA methylation-driven genes that are significantly linked to prognosis, enhancing the model's accuracy and reducing the likelihood of overfitting.

From the univariate Cox regression and LASSO analyses, we selected the two methylation-driven genes with the highest and lowest hazard ratios to develop a risk score prediction model. This model used a linear combination of gene expression levels, each weighted by coefficients derived from multivariate Cox regression analysis. Using this model, we split gastric cancer patients into high-risk and low-risk categories according to an optimal risk score threshold. The risk score for each patient was calculated as:

Risk score = (Expression of methylation-driven gene 1 × Coefficient of methylation-driven gene 1) + (Expression of methylation-driven gene 2 × Coefficient of methylation-driven gene 2)

We assessed survival differences between the high-risk and low-risk groups using Kaplan-Meier survival plots.
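As an illustration of the risk score formula above, the following Python sketch computes the two-gene linear score and assigns patients to risk groups. The coefficient values, expression values, patient identifiers, and the cutoff are all hypothetical placeholders; the text does not report the fitted multivariate Cox coefficients here, so nothing below should be read as the authors' model parameters.

```python
from typing import Dict

# Hypothetical coefficients (placeholders, NOT the study's fitted values).
# The signs only mirror the reported direction of effect:
# C1orf35 had HR > 1 and FAAH had HR < 1.
COEF: Dict[str, float] = {"C1orf35": 0.5, "FAAH": -0.25}


def risk_score(expr: Dict[str, float], coef: Dict[str, float] = COEF) -> float:
    """Linear combination: sum of expression x coefficient over model genes."""
    return sum(expr[gene] * weight for gene, weight in coef.items())


def stratify(scores: Dict[str, float], cutoff: float) -> Dict[str, str]:
    """Label each patient 'high' if the score exceeds the cutoff, else 'low'."""
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy expression values for two hypothetical patients.
patients = {
    "P1": {"C1orf35": 2.0, "FAAH": 1.0},
    "P2": {"C1orf35": 0.5, "FAAH": 2.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores, cutoff=0.0)  # hypothetical cutoff
print(scores)
print(groups)
```

In practice the cutoff would be the optimal threshold mentioned in the text, chosen on the training cohort and then applied unchanged to the validation cohort.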
The GSE84437 dataset from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) served to validate the prognostic model. To evaluate the model's predictive performance, we also conducted a time-dependent ROC analysis. We built a nomogram for predicting the 1-, 3-, and 5-year survival outcomes of stage I/II GC patients using the two methylation-driven genes. The calibration and discrimination of the nomogram were evaluated through a bootstrap approach involving 1,000 resamples.

The research methodologies were approved by the Institutional Medical Ethics Committee at Xiamen University. We collected clinical samples after obtaining informed consent from the patients, in compliance with the Declaration of Helsinki (1975).

Following the manufacturer's protocol, genomic DNA was extracted from paraffin-embedded tumor and control tissue samples from stage I and II patients using a QIAGEN kit (Hilden, Germany). The extracted DNA was then quantified and adjusted to a working concentration of 20 ng/μL. CpG islands in the proximal promoter regions of the C1orf35 and FAAH genes were selected according to the following criteria: (1) a length of at least 200 base pairs; (2) a GC content of at least 50%; and (3) an observed/expected CpG dinucleotide ratio of 0.60 or above. A total of 50 CpG methylation sites across three islands of the C1orf35 gene and 14 CpG methylation sites across one island of the FAAH gene were sequenced.

For bisulfite conversion, 400 ng of genomic DNA was treated with the EZ DNA Methylation™-GOLD Kit (ZYMO RESEARCH, CA, United States). Samples that did not achieve a bisulfite conversion rate of at least 98% were excluded from the study. PCR products designed to target specific CpG sites were separated by agarose gel electrophoresis. Next, we used the QIAquick Gel Extraction Kit (QIAGEN, Hilden, Germany) to purify these products.
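The three CpG island selection criteria listed above (length, GC content, observed/expected CpG ratio) can be checked with a short Python sketch. The text does not state how the observed/expected ratio was computed; the code assumes the commonly used definition (number of CpG × sequence length) / (number of C × number of G). The function names and test sequences are hypothetical.

```python
from typing import Tuple


def cpg_island_stats(seq: str) -> Tuple[int, float, float]:
    """Return (length, GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Assumed definition: expected CpG count = (#C * #G) / length
    expected = (c_count * g_count) / n if n else 0.0
    obs_exp = cpg_count / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def meets_island_criteria(seq: str, min_len: int = 200,
                          min_gc: float = 0.50, min_obs_exp: float = 0.60) -> bool:
    """Apply the three thresholds quoted in the text."""
    n, gc_fraction, obs_exp = cpg_island_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Synthetic sequences for illustration only.
cpg_rich = "CG" * 100            # 200 bp, all G/C, many CpGs
at_rich = "AT" * 100             # 200 bp, no G/C at all
gc_no_cpg = "C" * 100 + "G" * 100  # high GC content but a single CpG

for name, s in [("cpg_rich", cpg_rich), ("at_rich", at_rich), ("gc_no_cpg", gc_no_cpg)]:
    print(name, meets_island_criteria(s))
```

The third sequence shows why the ratio criterion is needed in addition to GC content: a region can be GC-rich while containing almost no CpG dinucleotides.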
Methylation analysis was then performed using the Illumina HiSeq/MiSeq 2000 systems according to the manufacturer's guidelines. For further details on the CpG sites analyzed, refer to Table 2.

We used the CIBERSORT package to assess the distribution of 22 immune cell types in the high-risk and low-risk groups, enabling us to calculate the enrichment score for every immune-related term in each group (Newman et al., 2015). First, we quantified and evaluated the relative abundance of different immune cell types between tumor and normal samples within the high-risk and low-risk groups to compare and predict immune cell infiltration in each group; second, we evaluated the distribution of immune cells between the high-risk and low-risk groups. The Pearson correlation coefficient was used to test whether the relationship between tumor and normal samples was significant in each of the two groups (significance threshold: p < 0.05, |cor| ≥ 0.2). A Venn diagram was used to identify the key genes highly expressed only in the high-risk group. The "clusterProfiler" R package was used to perform Gene Ontology enrichment analyses to elucidate the biological functions of genes associated only with the high-risk group (Lu et al., 2008).

We performed GSEA to clarify the biological roles of the previously identified methylation-regulated genes. The threshold for significance was set at p < 0.05, and a minimum enriched gene count of 15 was required. Analyses were conducted with IBM SPSS Statistics 27 using chi-square tests, and results were considered statistically significant when the p-value was below 0.05. A methodological flowchart is presented in Figure 1.

We analyzed mRNA expression data from the TCGA database and identified 5,304 DEGs, with 4,732 being upregulated and 572 downregulated. We identified 176 genes influenced by DNA methylation, comprising 38 hypermethylated and 138 hypomethylated genes.
These genes demonstrated a correlation coefficient below -0.3 with DEGs and significant associations (adjusted p-value < 0.05), prompting further detailed analyses. A univariate Cox regression analysis was conducted to assess the prognostic impact of these methylation-altered genes. Genes such as GATA3, C1orf35, CMTM3, and BEX4, which exhibited hazard ratios greater than 1, were deemed independent risk factors. Conversely, genes including FAAH, POF1B, ZC3H12A, BCL2L15, and MUC13, with hazard ratios less than 1, were considered protective (Figure 2A). A heatmap of these DNA methylation-driven genes is shown in Figure 2F.

Identifying critical genes using LASSO analyses and building a risk score prediction model based on genes influenced by DNA methylation.

The nine selected DNA methylation-driven genes underwent 1,000 iterations of LASSO regression to further narrow the selection. Four genes were identified using LASSO. Cross-validation determined the optimal tuning parameter λ, minimizing the error rate (Figure 2B). The expression and methylation levels of these genes are displayed in Figure 2C. Based on their hazard ratios, with C1orf35 having the highest (HR = 2.035, P = 0.028) and FAAH the lowest (HR = 0.656, P = 0.012), and on significant Kaplan-Meier survival analyses (P < 0.05), these two genes were selected to construct a risk score prediction model based on DNA methylation-driven gene activity. The expression of FAAH and C1orf35 was negatively correlated with the methylation of these two genes (Figure 2D). The mixture models of FAAH and C1orf35 are shown in Figure 2E. Subsequent survival analysis and ROC curve assessments were performed on this model (Kamarudin et al., 2017). The Kaplan-Meier plots revealed a significantly shorter OS for the group with elevated risk scores (P = 0.005) (Figure 3E), with AUC values of 0.564 for 5-year survival and 0.611 for 3-year survival (Figure 3E).
The effectiveness of the predictive signature was evaluated using the TCGA dataset and further confirmed in the GSE84437 dataset. In the high-risk group, OS was notably reduced (P = 0.019), as indicated by the Kaplan-Meier analysis (Figure 3F). The AUC values were 0.569 for 3-year and 0.593 for 5-year survival (Figure 3F). Our nomogram, incorporating a DNA methylation-based gene signature, demonstrated broad applicability for both long- and short-term patient follow-ups, as depicted in Figure 3G. The calibration curves closely matched the predicted probabilities of OS at 1-, 3-, and 5-year intervals for gastric cancer patients (Figure 3H).

To assess the methylation levels in early-stage gastric cancer, tissue samples from 34 patient pairs were examined. We calculated the methylation level at each CpG site as the proportion of methylated cytosines among the total cytosines examined. A total of 50 methylation sites associated with three CpG islands in C1orf35 and 14 methylation sites associated with one CpG island in FAAH were analyzed (Figures 4A-D). The heatmap indicates variance in methylation levels across samples. Notably, compared with normal tissues, methylation was lower in one island of C1orf35 (C1orf35_1: p = 0.0292), whereas the difference in the single island of FAAH was not statistically significant (p = 0.8543), as shown in Figure 4E.

First, we quantified and evaluated the relative abundance of different immune cell types between tumor and normal samples in the high-risk and low-risk groups. In the high-risk group, naive B cells, resting CD4 memory T cells, activated CD4 memory T cells, M0 macrophages, activated dendritic cells, and resting mast cells showed statistically significant differences (Figure 5A), and resting CD4 memory T cells, activated CD4 memory T cells, monocytes, M0 macrophages, M1 macrophages, activated dendritic cells, and resting mast cells also showed statistically significant differences (Figure 5B).
Second, the distributions of resting CD4 memory T cells, monocytes, M1 macrophages, resting mast cells, and eosinophils differed significantly between the high-risk and low-risk groups (Figure 5C). The Pearson correlation analysis showed that resting CD4 memory T cells and resting NK cells were associated with the risk score in the high-risk group (Figure 5D), and that activated NK cells were positively correlated with the risk score in the low-risk group (Figure 5E). Finally, only resting NK cells were positive in both the immune cell infiltration scores and the correlation analysis. We obtained the key genes highly expressed only in the high-risk group for further analysis (Figure 5F). Figure 5G displays the Gene Ontology (GO) analysis results, which show enrichment of these genes, including B cell marker genes, in processes such as digestion, immunoglobulin production, organic hydroxy compound catabolic process, production of molecular mediator of immune response, and regulation of hemostasis.

We employed gene set enrichment analysis (GSEA) to identify key signaling pathways associated with the risk score model in both groups. Pathways with a false discovery rate (FDR) below 0.05 and an enrichment score (ES) above 0.5 were considered significant. In the high-risk group, the most enriched pathways included "COMPLEMENT AND COAGULATION CASCADES," "MELANOMA," "SYSTEMIC LUPUS ERYTHEMATOSUS," "TIGHT JUNCTION," and "WNT SIGNALING PATHWAY" (risk score <0.314). Conversely, the low-risk group showed enrichment in the pathways "STARCH AND SUCROSE METABOLISM," "RETINOL METABOLISM," "METABOLISM OF XENOBIOTICS BY CYTOCHROME P450," "DRUG METABOLISM OTHER ENZYMES," and "DRUG METABOLISM CYTOCHROME P450" (Figure 5H). These findings point to potential molecular mechanisms driving tumor progression in GC.

In recent years, a significant body of research has focused on identifying clinicopathological factors associated with overall survival in stage I/II gastric cancer.
However, only a few studies have investigated genetic prognostic markers (Ikoma et al., 2016; Bausys et al., 2018; Aoyama et al., 2014; Saito et al., 2016). Our study aimed to develop a risk prediction model based on genes affected by DNA methylation in patients with stage I/II gastric cancer. This model seeks to identify patients who may benefit from more aggressive treatment strategies, including adjuvant chemotherapy.

Our initial analyses integrated microarray studies and bioinformatics techniques to identify DMGs in stage I/II gastric cancer. Notable genes identified with differential methylation included ZC3H12A, GATA3, C1orf35, and FAAH. ZC3H12A, also known as MCPIP1, is an RNase that acts as a novel suppressor of microRNA activity and biogenesis (Tahara et al., 2013). GATA3, a zinc-finger pioneer transcription factor, plays critical roles in gene regulation by binding to nucleosomal DNA and facilitating chromatin remodeling (Qiang et al., 2023). The chromosome 1 open reading frame 35 (C1orf35) gene (Melton et al., 2019) and fatty acid amide hydrolase (FAAH), responsible for the degradation of anandamide into arachidonic acid and ethanolamine, are also implicated in significant cellular pathways (Melton et al., 2019). Abnormal methylation of these genes can lead to various diseases, including cancer. Therefore, we conducted a thorough analysis and clinical validation of FAAH and C1orf35 as key components of our prognostic model.

Our risk score prediction model, based on univariate Cox regression and LASSO analyses, identified FAAH and C1orf35 as critical genes. Methylation sequencing experiments demonstrated significant hypomethylation in the promoters of these genes among the patients studied. This methylation pattern was confirmed in 34 pairs of early-stage gastric cancer patients.
The Kaplan-Meier analysis validated the effectiveness of this prognostic model in predicting outcomes for stage I/II gastric cancer patients, and time-dependent ROC analysis further confirmed the model's prognostic relevance.

A notable observation from our study was the prognostic significance associated with the methylation status of C1orf35 and FAAH. C1orf35, identified in multiple myeloma cell lines, acts as an oncogene promoting the G1-to-S cell cycle transition by modulating c-MYC expression. Its oncogenic activity may be inhibited by targeting c-MYC (Luo et al., 2020). Further studies have suggested a potential role for C1orf35 in liver cancer (Meier et al., 2021). Consistent with these findings, upregulated and hypomethylated C1orf35 was associated with poor prognosis in stage I/II gastric cancer patients.

FAAH, associated with cellular membranes, facilitates the hydrolysis of anandamide, a ligand for cannabinoid receptors. In the gastrointestinal tract, inhibition of FAAH reduces intestinal motility (Capasso et al., 2005) and exhibits anti-inflammatory effects in vivo (Massa et al., 2004; D'Argenio et al., 2006). Previous research has indicated that endocannabinoids may inhibit the development of precancerous lesions in mouse colons (Izzo et al., 2008) and reduce the proliferation of colorectal carcinoma cells in vitro. In our study, hypomethylated and highly expressed FAAH correlated with a favorable prognosis, echoing findings in pancreatic cancer (Michalski et al., 2008). This contrasts with the findings of other studies in which elevated FAAH expression was linked to poor outcomes in various cancers (Fogli et al., 2006; Carracedo et al., 2006).

Resting CD4 memory T cells were less abundant in tumor tissues and in the high-risk group in gastric cancer specimens. Pearson correlation analysis showed a negative correlation between these T cells and tumor tissues in the high-risk group.
Cancer immunology and immunotherapy are driving forces of research and development in oncology, and previous studies have indicated the importance of CD4+ T cells, which play an essential role in the immune system by coordinating both adaptive and innate responses (Künzli and Masopust, 2023). Moreover, they have now been recognized as anti-tumor effector cells in their own right (Speiser et al., 2023). Recent research has shown that CD4+ T cells, particularly CD4+ memory T cells, are crucial for immunotherapy-induced tumor regression (Nguyen et al., 2019), which is consistent with the findings of our study.

Our comprehensive analyses of DNA methylation array data, mRNA expression data, and related clinical details sourced from the TCGA database have revealed potential predictive biomarkers for early-stage gastric cancer prognosis. These findings could improve treatment accuracy and enhance overall survival in early-stage gastric cancer. However, our study's limitations include potential selection bias due to its retrospective design and small sample size. There is a scarcity of data to inform treatment choices for this particular group, necessitating further research with a larger cohort. We plan to establish a database for stage I/II gastric cancer at our center to expand our research sample size.

In conclusion, this study has pinpointed key predictive elements that are vital for forecasting the outcomes of early-stage gastric cancer. Central genes such as C1orf35 and FAAH have demonstrated both predictive and prognostic significance as methylation-based biomarkers, setting the stage for accurate diagnosis and targeted treatment of gastric cancer.

Keywords: gastric cancer, FAAH, C1orf35, DNA methylation-driven gene, prognosis

Received: 03 Jul 2025; Accepted: 19 Aug 2025.

Copyright: © 2025 Su, Lin, Ye, Liang, Yu, Wan and Hou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Rong Yu, Zhongshan Hospital Xiamen University, Xiamen, China
Zheng Wan, Xiamen University, Xiamen, China
Jingjing Hou, Xiamen University, Xiamen, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.