ORIGINAL RESEARCH article

Front. Cell Dev. Biol.

Sec. Cancer Cell Biology

Volume 13 - 2025 | doi: 10.3389/fcell.2025.1704237

This article is part of the Research Topic: Artificial Intelligence in Multi-omics: Advancing Tumor Metastasis Prediction and Mechanism Analysis

MASE-GC: A Multi-Omics Autoencoder and Stacking Ensemble Framework for Gastric Cancer Classification

Provisionally accepted
Di Liu, Zhongguang Che, Guannan Xu, Ye Huang*
  • First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China

The final, formatted version of the article will be published soon.

Background: Gastric cancer (GC) is one of the most common malignant tumors and remains a leading cause of cancer-related mortality worldwide. Accurate classification of GC is critical for improving diagnosis, prognosis, and personalized treatment. Recent advances in high-throughput sequencing have enabled the generation of large-scale multi-omics data, offering new opportunities for precise disease stratification. However, existing studies often rely on single-omics approaches or single-model frameworks, which fail to capture the full complexity of tumor biology and suffer from limited sensitivity, specificity, and generalizability.

Methods: We propose MASE-GC (Multi-Omics Autoencoder and Stacking Ensemble for Gastric Cancer), a novel computational framework that integrates exon expression, mRNA expression, miRNA expression, and DNA methylation profiles. MASE-GC employs modality-specific autoencoders to extract compact latent features from heterogeneous omics layers and combines them through weighted fusion. The integrated features are then classified using a stacking ensemble of five base learners (Support Vector Machine, Random Forest, Decision Tree, AdaBoost, and Convolutional Neural Network), followed by an XGBoost meta-classifier. A robust preprocessing pipeline, including feature filtering, normalization, and SMOTE–Tomek balancing, is incorporated to address noise, high dimensionality, and class imbalance.

Results: Comprehensive experiments on the TCGA-STAD cohort demonstrated that MASE-GC achieved superior classification performance compared with single-omics and baseline methods, reaching an accuracy of 0.981, precision of 0.9845, recall of 0.992, F1-score of 0.9883, and specificity of 0.824. Ablation studies confirmed the complementary contributions of autoencoders and ensemble components, with CNN and Random Forest providing the largest performance gains. Furthermore, independent validation on external cohorts (GSE62254, GSE15459, GSE84437, and ICGC) confirmed the robustness and generalizability of MASE-GC, with accuracy consistently above 0.958 and F1-scores exceeding 0.969.

Conclusion: MASE-GC advances computational oncology by offering an effective and generalizable framework for GC classification. By integrating multi-omics fusion, ensemble learning, and robust preprocessing, the proposed model improves both sensitivity and specificity, reduces false positives, and demonstrates strong potential for clinical translation in precision diagnostics and treatment planning.
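
As a reading aid for the TCGA-STAD figures reported in the Results, the sketch below recomputes the F1-score from the stated precision and recall using the standard harmonic-mean definition. It is not the authors' evaluation code; the function name `f1_from_precision_recall` and the `reported` dictionary are illustrative assumptions, and only the three numeric values come from the abstract.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both quantities are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for the TCGA-STAD cohort.
# The dictionary itself is a hypothetical container for this sketch.
reported = {"precision": 0.9845, "recall": 0.992, "f1": 0.9883}

f1 = f1_from_precision_recall(reported["precision"], reported["recall"])
difference = abs(f1 - reported["f1"])

print(f"recomputed F1 = {f1:.4f}")
print(f"reported F1   = {reported['f1']:.4f}")
print(f"difference    = {difference:.4f}")
```

The recomputed value agrees with the reported F1-score to within the rounding of the published precision and recall, so the three figures appear mutually consistent.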

Keywords: gastric cancer, multi-omics, autoencoder, ensemble learning, XGBoost, oncology

Received: 12 Sep 2025; Accepted: 08 Oct 2025.

Copyright: © 2025 Liu, Che, Xu and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Ye Huang, huangye202507@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.