
ORIGINAL RESEARCH article

Front. Oncol.

Sec. Gastrointestinal Cancers: Colorectal Cancer

This article is part of the Research Topic: Harnessing Artificial Intelligence for Cancer Prevention, Early Diagnosis, and Personalized Treatment

An Interpretable Machine Learning Model for Predicting Distant Organ Metastasis After Radical Resection of Colorectal Cancer

Provisionally accepted
Weibin Lin 1, Weixiang Ni 2, Junwei Zhou 3, Weixuan Hong 4, Junwei Fang 4, Meiping Wang 4, Chunhong Xiao 4*, Guoliang Huang 4*
  • 1 Xiamen University, Xiamen, China
  • 2 Fujian Medical University, Fuzhou, China
  • 3 Fujian University of Traditional Chinese Medicine, Fuzhou, China
  • 4 900th Hospital of the People's Liberation Army Joint Logistic Support Force, Fuzhou, China

The final, formatted version of the article will be published soon.

ABSTRACT Objective: Distant organ metastasis remains the primary factor affecting long-term survival following radical surgery for colorectal cancer (CRC). This study aimed to develop and validate an interpretable machine learning (ML) model to predict the 5-year cumulative risk of distant metastasis after radical CRC surgery. Methods: A retrospective observational cohort study was conducted using clinical and follow-up data from 341 CRC patients who underwent radical surgery. The cohort was randomly divided into a training set (n = 239) and a validation set (n = 102) at a 7:3 ratio. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression to identify variables associated with the 5-year cumulative occurrence of metastasis. Prediction models were constructed using seven algorithms. Model performance was evaluated with the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration plots, and decision curve analysis. The SHapley Additive exPlanations (SHAP) method was applied to improve model interpretability. Results: LASSO regression with 10-fold cross-validation selected 11 key features for model development. Among the models tested, the support vector machine (SVM) model performed best, achieving a Brier score of 0.144 and an AUC of 0.865 in the validation set. Calibration plots and decision curve analysis indicated that the SVM model was well calibrated and clinically applicable. SHAP dependence plots and force plots explained the model's 5-year risk predictions at both the feature level and the individual-patient level. Conclusion: This study established an accurate and interpretable ML model that predicts the 5-year cumulative risk of distant organ metastasis after radical CRC surgery. External validation is still needed to confirm its clinical utility.
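As an illustration only, the 7:3 split and several of the evaluation metrics named in the abstract (AUC, Brier score, accuracy, sensitivity, specificity, F1 score, and the net benefit used in decision curve analysis) can be sketched in plain Python. This is not the authors' code: the function names are hypothetical, and the toy labels and predicted probabilities are invented for demonstration. Only the cohort size (341) and the 7:3 ratio come from the abstract.

```python
import random


def split_70_30(n_patients, seed=0):
    """Shuffle patient indices and split them 7:3 (training : validation)."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_train = round(n_patients * 0.7)
    return indices[:n_train], indices[n_train:]


def auc(y_true, y_prob):
    """AUC as the chance that a random positive case is ranked above a
    random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics at one risk threshold (assumed to be below 1),
    plus the net benefit plotted in decision curve analysis.
    Assumes both outcome classes are present."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, z in zip(y_true, pred) if y == 1 and z == 1)
    fp = sum(1 for y, z in zip(y_true, pred) if y == 0 and z == 1)
    fn = sum(1 for y, z in zip(y_true, pred) if y == 1 and z == 0)
    tn = sum(1 for y, z in zip(y_true, pred) if y == 0 and z == 0)
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "net_benefit": tp / n - (fp / n) * threshold / (1 - threshold),
    }


# Cohort size from the abstract: 341 patients split 7:3.
train_idx, valid_idx = split_70_30(341)
print(len(train_idx), len(valid_idx))

# Toy data (invented): 1 = distant metastasis within 5 years, 0 = none.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_metrics = threshold_metrics(y_true, y_prob, threshold=0.5)
print(round(toy_auc, 3), round(toy_brier, 3), toy_metrics)
```

A lower Brier score indicates that predicted risks lie closer to the observed outcomes, whereas a higher AUC indicates better discrimination, which is why the abstract reports both for the validation set.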

Keywords: colorectal cancer, distant metastasis, explainable machine learning, prediction model, radical resection

Received: 10 Dec 2025; Accepted: 09 Feb 2026.

Copyright: © 2026 Lin, Ni, Zhou, Hong, Fang, Wang, Xiao and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Chunhong Xiao
Guoliang Huang

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.