
ORIGINAL RESEARCH article

Front. Immunol.

Sec. Cancer Immunity and Immunotherapy

Volume 16 - 2025 | doi: 10.3389/fimmu.2025.1681396

This article is part of the Research Topic: Community Series in Methods in Cancer Immunity and Immunotherapy: Volume II

NeoTImmuML: A Machine Learning-Based Prediction Model for Human Tumor Neoantigen Immunogenicity

Provisionally accepted
Yan Shao, Shuguang Ge, Ruizhe Dong, Wei Ji, Chaoran Qin, Pengbo Wen*
  • School of Medicine Information and Engineering, Xuzhou Medical University, Xuzhou, China

The final, formatted version of the article will be published soon.

Tumor neoantigens, owing to their high specificity and immunogenicity, have become key targets for personalized cancer immunotherapies such as mRNA vaccines and T cell therapies. However, identifying neoantigens and evaluating their immunogenicity remain challenging: both tasks often rely on time-consuming experimental validation, which greatly limits the efficiency of vaccine development. To address this problem, we made two contributions. First, we upgraded the TumorAgDB database to TumorAgDB2.0 by integrating publicly available neoantigen data published in the past two years. Second, we developed NeoTImmuML, a weighted ensemble machine learning model for predicting neoantigen immunogenicity. Using data from TumorAgDB2.0, we calculated the physicochemical properties of peptides and then systematically evaluated eight commonly used machine learning algorithms through five-fold cross-validation. LightGBM, XGBoost, and Random Forest performed best, and these three models were combined into a weighted ensemble to build NeoTImmuML. The model showed strong generalization ability on both internal and external test sets. SHapley Additive exPlanations (SHAP) feature importance analysis revealed that peptide hydrophobicity and length are key factors influencing immunogenicity prediction. NeoTImmuML is now integrated into the TumorAgDB2.0 platform. Overall, TumorAgDB2.0 provides a comprehensive data resource for neoantigen research, and NeoTImmuML offers an efficient and interpretable tool for predicting neoantigen immunogenicity. Together, they support the design of personalized neoantigen vaccines and the development of cancer immunotherapy strategies.
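To make two of the steps in the abstract concrete, the following is a minimal Python sketch of (1) deriving simple physicochemical features from a peptide sequence and (2) combining per-model probabilities by weighted averaging. It is not the NeoTImmuML implementation. The function names `peptide_features` and `weighted_ensemble` are hypothetical, the Kyte-Doolittle hydropathy scale is an assumed choice (the abstract does not say which hydrophobicity scale was used), and the example probabilities and weights are invented purely for illustration.

```python
# Illustrative sketch only; not the authors' code.

# Kyte-Doolittle hydropathy values (a standard published scale). Choosing this
# particular scale is an assumption; the abstract does not name one.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(peptide):
    """Return the length and mean hydropathy of a peptide (hypothetical helper)."""
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("peptide must not be empty")
    unknown = sorted(set(seq) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"non-standard residues: {unknown}")
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
    return {"length": len(seq), "mean_hydropathy": mean_hydropathy}


def weighted_ensemble(probabilities, weights):
    """Weighted average of per-model probabilities: sum(w_i * p_i) / sum(w_i)."""
    if set(probabilities) != set(weights):
        raise ValueError("probabilities and weights must name the same models")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(weights[name] * probabilities[name] for name in probabilities) / total


# Hypothetical per-model probabilities for one peptide, and hypothetical weights.
example_probabilities = {"lightgbm": 0.8, "xgboost": 0.6, "random_forest": 0.4}
example_weights = {"lightgbm": 0.5, "xgboost": 0.3, "random_forest": 0.2}

print(peptide_features("SIINFEKL"))
print(weighted_ensemble(example_probabilities, example_weights))
```

In a real pipeline the probabilities would come from fitted classifiers, and the weights would typically be chosen from cross-validation performance. How NeoTImmuML sets its weights is not stated in the abstract.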

Keywords: tumor neoantigens, immunogenicity, machine learning, ensemble model, database, SHAP

Received: 07 Aug 2025; Accepted: 06 Oct 2025.

Copyright: © 2025 Shao, Ge, Dong, Ji, Qin and Wen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Pengbo Wen, wen_pengbo@xzhmu.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.