ORIGINAL RESEARCH article
Front. Bioinform.
Sec. Integrative Bioinformatics
This article is part of the Research Topic: Clinical prediction models in cancer through bioinformatics
Predicting GD2 expression across cancer types by the integration of pathway topology and transcriptome data
Provisionally accepted
- 1Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Centre, Johannes Gutenberg University Mainz, Mainz, Germany
- 2Institute for Quantitative and Computational Biosciences, Johannes Gutenberg-University Mainz, Mainz, Germany
- 3University Cancer Center, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- 4Research Center for Immunotherapy, Johannes Gutenberg-University Mainz, Mainz, Germany
- 5Department of Pediatric Hematology/Oncology/Hemostaseology, Center for Pediatric and Adolescent Medicine, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- 6Lipid Pathobiochemistry, Deutsches Krebsforschungszentrum, Heidelberg, Germany
- 7German Cancer Consortium (DKTK), Site Frankfurt/Mainz, Germany, German Cancer Research Center (DKFZ), Heidelberg, Germany
Background: The disialoganglioside GD2 is a key cancer therapy target due to its overexpression in several cancers and limited presence in normal tissues. However, experimental assessment of GD2 expression is technically challenging and not routinely available. We developed a computational framework that integrates reaction activity derived from transcriptomic data with the glycosphingolipid biosynthesis pathway to predict GD2 expression.

Methods: We computed Reaction Activity Scores from transcriptomic data and used them to weight the reactions of a glycosphingolipid metabolic network, refining the edge weights with topology-based transition probabilities to account for enzyme promiscuity. Cumulative activities of GD2-promoting and GD2-mitigating reactions served as features in a Support Vector Machine (SVM) to model GD2-associated differences between neuroblastoma and normal tissue. SVM decision values were used as a continuous proxy for GD2 expression. We validated the predicted GD2 scores across independent datasets by comparing them with literature-reported values and by flow-cytometric confirmation of a model-predicted high-GD2 tumor. Copy-number alteration (CNA) data were integrated to identify candidate genomic biomarkers of GD2-positive samples.

Results: Our SVM-based GD2 score achieved a balanced accuracy of 0.80 with a linear kernel, which was selected for its interpretability and lower risk of overfitting while matching the accuracy of more complex kernels. The model transferred reliably across six independent RNA-seq datasets and reproduced known GD2 expression patterns, outperforming a two-gene signature in capturing subtype-specific heterogeneity and avoiding overestimation in normal brain tissue. Pan-cancer analyses revealed heterogeneous GD2 expression in several cancer subtypes. Notably, we experimentally confirmed high GD2 expression in clear cell sarcoma of the kidney, consistent with model predictions. CNA analysis implicated B4GALNT1 amplification as a GD2-promoting factor in dedifferentiated liposarcoma. To facilitate adoption of our approach, we developed GD2Viz, an R package with an interactive Shiny application for score computation, visualization, and analysis of user data.

Conclusion: Our computational framework provides a robust, interpretable, biologically grounded predictor of GD2 expression, offering greater consistency and clinical interpretability than existing gene-based signatures. Importantly, with over 20 GD2-directed trials ongoing, our approach may help prioritize tumor entities with high GD2 levels, delineate candidate patient subgroups, and generate testable hypotheses in underexplored cancers, thereby supporting patient stratification and eligibility screening for clinical trials.
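The scoring idea described in the Methods can be illustrated with a minimal Python sketch. It is not the GD2Viz implementation. The expression values, the six-reaction network excerpt, the rule of summing isoenzyme expression, the normalization used for the transition probabilities, and the SVM weights and bias are all assumptions chosen for illustration only.

```python
from collections import defaultdict

# Toy expression values for one sample (hypothetical numbers, arbitrary units).
expression = {
    "ST3GAL5": 5.0,
    "ST8SIA1": 8.0,
    "ST8SIA5": 1.0,
    "B4GALNT1": 6.0,
    "B3GALT4": 2.0,
}

# Small excerpt of ganglioside biosynthesis: (substrate, product, enzyme genes).
# B4GALNT1 appears in two reactions, which illustrates enzyme promiscuity.
reactions = [
    ("LacCer", "GM3", ["ST3GAL5"]),
    ("GM3", "GD3", ["ST8SIA1"]),
    ("GM3", "GM2", ["B4GALNT1"]),
    ("GD3", "GD2", ["B4GALNT1"]),
    ("GD3", "GT3", ["ST8SIA1", "ST8SIA5"]),
    ("GD2", "GD1b", ["B3GALT4"]),
]


def reaction_activity(genes, expr):
    """Assumed rule: isoenzymes are alternatives, so their expression is summed."""
    return sum(expr.get(gene, 0.0) for gene in genes)


def transition_probabilities(rxns, scores):
    """Divide each score by the total activity leaving the same substrate."""
    outgoing = defaultdict(float)
    for (substrate, _product, _genes), score in zip(rxns, scores):
        outgoing[substrate] += score
    return [
        score / outgoing[substrate] if outgoing[substrate] > 0 else 0.0
        for (substrate, _product, _genes), score in zip(rxns, scores)
    ]


# Step 1: one activity score per reaction.
ras = [reaction_activity(genes, expression) for _s, _p, genes in reactions]

# Step 2: refine edge weights with the (assumed) transition probabilities.
probs = transition_probabilities(reactions, ras)
adjusted = [score * prob for score, prob in zip(ras, probs)]

# Step 3: cumulative activity of reactions that produce or consume GD2.
promoting = sum(
    w for (_s, product, _g), w in zip(reactions, adjusted) if product == "GD2"
)
mitigating = sum(
    w for (substrate, _p, _g), w in zip(reactions, adjusted) if substrate == "GD2"
)

# Step 4: linear decision value as a continuous score.
# Hypothetical parameters; a trained SVM would supply the weights and bias.
weights = (1.0, -1.0)
bias = 0.0
gd2_score = weights[0] * promoting + weights[1] * mitigating + bias
```

With a linear kernel the decision value is a weighted sum of the two features plus a bias, so the contribution of the promoting and mitigating activities to the score can be read directly from the weights.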
Keywords: GD2 Prediction, ganglioside, Support vector machine, Transcriptome Analysis, cancer subtypes, metabolic network, Reaction Activity Score, biomarker
Received: 15 Sep 2025; Accepted: 06 Nov 2025.
Copyright: © 2025 Ustjanzew, Marini, Wagner, Wingerter, Sandhoff, Faber and Paret. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Arsenij Ustjanzew, arsenij.ustjanzew@gmail.com
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
