ORIGINAL RESEARCH article

Front. Oncol.

Sec. Gastrointestinal Cancers: Hepato Pancreatic Biliary Cancers

Volume 15 - 2025 | doi: 10.3389/fonc.2025.1613462

This article is part of the Research Topic: Advances in Surgical Techniques and ML/DL-based Prognostic Biomarkers for Surgical and Adjuvant Therapies of Hepatobiliary and Pancreatic Cancers

Pre-operative T-Stage Discrimination in Gallbladder Cancer Using Machine Learning and DeepSeek-R1

Provisionally accepted
Chae Joongwon1, Wang Zhenyu1, Wu Duanpo2, Lian Zhang3, Tuzikov Alexander4, Madiyevich Talat Magrupov5, Xu Min6, Yu Dongmei6*, Peiwu Qin1*
  • 1Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  • 2Hangzhou Dianzi University, Hangzhou, Zhejiang Province, China
  • 3The First Hospital of Hebei Medical University, Shijiazhuang, Hebei Province, China
  • 4National Academy of Sciences of Belarus (NASB), Minsk, City of Minsk, Belarus
  • 5Tashkent State Technical University, Tashkent, Tashkent, Uzbekistan
  • 6Wenzhou Medical University, Wenzhou, Zhejiang Province, China

The final, formatted version of the article will be published soon.

Background: Gallbladder cancer (GBC) frequently presents with non-specific early symptoms, which delays diagnosis. This study (i) assessed whether routine blood biomarkers can distinguish early T stages via machine learning and (ii) compared the T-stage discrimination performance of a large language model (DeepSeek-R1) when supplied with (a) radiology-report text alone versus (b) radiology-report text plus blood-biomarker values. Methods: We retrospectively analysed 232 patients with pathologically confirmed GBC treated at Lishui Central Hospital between 2023 and 2024 (T1, n = 51; T2, n = 181). Seven blood variables were used to train random forest, support vector machine (SVC), XGBoost, and LightGBM models: neutrophil-to-lymphocyte ratio (NLR), monocyte-to-lymphocyte ratio (MLR), platelet-to-lymphocyte ratio (PLR), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA19-9), carbohydrate antigen 125 (CA125), and alpha-fetoprotein (AFP). The Synthetic Minority Over-sampling Technique (SMOTE) was applied only to the training folds in one setting and omitted in another. Model performance was evaluated on an independent test set (n = 47) by the area under the receiver-operating-characteristic curve (AUROC), with a 95% confidence interval (CI) estimated from 1,000 bootstrap samples; cross-validation (CV) accuracy served as a supplementary metric.
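The evaluation metric described above, AUROC with a bootstrap confidence interval, can be illustrated with a minimal sketch. It is not the authors' code. It assumes a percentile bootstrap over the test-set cases, a pairwise (Mann-Whitney) AUROC with ties counted as one half, and resamples containing only one class being redrawn; the abstract specifies none of these details. The function names `auroc` and `bootstrap_auroc_ci` are hypothetical, and the demo labels and scores are synthetic, not study data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties contribute 0.5 (Mann-Whitney formulation). Labels are 0 or 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap.

    Assumption: cases are resampled with replacement, and resamples that
    contain a single class are discarded and redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined without both classes
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auroc(labels, scores), lower, upper


if __name__ == "__main__":
    # Synthetic illustration only: 47 made-up cases with made-up scores.
    demo_rng = random.Random(42)
    demo_labels = [0] * 12 + [1] * 35
    demo_scores = [demo_rng.gauss(0.4 if y == 0 else 0.6, 0.2) for y in demo_labels]
    point, lower, upper = bootstrap_auroc_ci(demo_labels, demo_scores)
    print(f"AUROC {point:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With a test set of about 47 cases, the interval from such a procedure tends to be wide, which is one reason to report it alongside the point estimate.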

Keywords: gallbladder cancer, GBC, machine learning, large language model, DeepSeek-R1, staging, biomarker, radiology report

Received: 17 Apr 2025; Accepted: 07 Jul 2025.

Copyright: © 2025 JOONGWON, Zhenyu, Duanpo, Zhang, Alexander, Magrupov, min, Dongmei and Qin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Yu Dongmei, Wenzhou Medical University, Wenzhou, 325035, Zhejiang Province, China
Peiwu Qin, Shenzhen International Graduate School, Tsinghua University, Shenzhen, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.