ORIGINAL RESEARCH article

Front. Immunol.

Sec. Cancer Immunity and Immunotherapy

Volume 16 - 2025 | doi: 10.3389/fimmu.2025.1640075

This article is part of the Research Topic: Advances in Surgical Techniques and ML/DL-based Prognostic Biomarkers for Surgical and Adjuvant Therapies of Hepatobiliary and Pancreatic Cancers

Development of a machine learning model to predict overall survival for large hepatocellular carcinoma at BCLC stage A or B after curative hepatectomy

Provisionally accepted
Tai-Xin Yang1,2, Jia-Yong Su1,2, Min-Jun Li1,2, Shen Shuang1,2, Yu Wang2, Huan-Nan Wei2, Ming-Jian Huang2, Qing-Man Qin2, You-Yin Ran2, Yao-Ting Huang2, Jin-Yan Huang2, Jie Zhang1,2*, Bang-De Xiang1,3,4*, Wen-Feng Gong1*
  • 1Guangxi Medical University Cancer Hospital, Nanning, China
  • 2Guangxi Medical University, Nanning, China
  • 3Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumors (Guangxi Medical University), Ministry of Education, Nanning, China
  • 4Guangxi Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumors, Nanning, China

The final, formatted version of the article will be published soon.

Introduction: Patients with large hepatocellular carcinoma (LHCC) have a poor prognosis even after curative hepatectomy. This study aimed to develop and validate an interpretable machine learning (ML) model to predict overall survival (OS) in these patients.

Methods: This study included 2,565 patients with hepatocellular carcinoma (HCC) who underwent curative hepatectomy between January 2014 and December 2021. The patients with LHCC were randomly assigned in a 7:3 ratio to a training group (n=1,069) or a validation group (n=457). Independent risk factors for OS were identified using multivariable Cox regression. Eight ML models were developed and compared. The interpretability of the best-performing model was assessed using Shapley Additive Explanations (SHAP).

Results: Patients with LHCC had considerably worse OS than patients with small HCC (SHCC) (hazard ratio [HR]: 1.810; 95% confidence interval [CI]: 1.585-2.068). Among the eight ML models, the gradient boosting machine (GBM) model performed best. In the validation group, the GBM model achieved area under the receiver operating characteristic curve (AUC) values of 0.742, 0.744, and 0.750 for 1-, 3-, and 5-year OS, respectively. These results were comparable with or superior to those of established postoperative predictive models. The GBM model stratified patients with LHCC into distinct prognostic groups, and a web-based calculator was developed to generate risk scores. Notably, the GBM model showed higher predictive accuracy in patients with a high neutrophil-lymphocyte ratio (C-index: 0.819).

Conclusions: The GBM-based model showed potential for predicting prognosis in patients with LHCC after curative hepatectomy. This interpretable model may assist in personalized risk assessment and in tailoring postoperative management strategies.
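The abstract reports a concordance index (C-index) of 0.819 in one subgroup. For readers unfamiliar with the metric, the following is a minimal sketch of Harrell's C-index in plain Python. The abstract does not state which software or which C-index variant the authors used, so this is an illustration under assumptions rather than their method: the function name `harrell_c_index` is hypothetical, the data are synthetic, and pairs with tied follow-up times are simply skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index for right-censored survival data.

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption of this sketch: skip tied times. A pair is also not
        # comparable when the shorter follow-up ended in censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Synthetic toy data (not from the study).
toy_times = [5, 10, 15, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.5, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.75
```

In the toy data, 8 pairs are comparable and 6 of them are ranked correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.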

Keywords: gradient boosting machine, hepatectomy, large hepatocellular carcinoma, overall survival, SHAP

Received: 03 Jun 2025; Accepted: 30 Sep 2025.

Copyright: © 2025 Yang, Su, Li, Shuang, Wang, Wei, Huang, Qin, Ran, Huang, Huang, Zhang, Xiang and Gong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Jie Zhang, zhangjie1@gxmu.edu.cn
Bang-De Xiang, xiangbangde@gxmu.edu.cn
Wen-Feng Gong, gwf0771@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.