ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 8 - 2025 | doi: 10.3389/frai.2025.1650762
This article is part of the Research Topic: Advances in Artificial Intelligence for Early Cancer Detection and Precision Oncology
Automated Machine Learning for Predicting Liver Metastasis in Patients with Pancreatic Cancer: A SEER-Based Analysis
Provisionally accepted- Chongqing Nanchuan District People's Hospital, Chongqing, China
Background. Pancreatic cancer is a common malignant tumor with a high propensity for liver metastasis (LIM), which significantly worsens patient prognosis. This study aimed to predict the risk of LIM in patients with pancreatic cancer using automated machine learning (AutoML) algorithms, to assist clinicians in developing treatment strategies.
Methods. Patients diagnosed with pancreatic cancer between 2010 and 2015 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. The dataset was randomly divided into training, validation, and testing sets at a ratio of 6:2:2. Predictive models were constructed using both the Least Absolute Shrinkage and Selection Operator (LASSO) and AutoML methods. A nomogram was developed based on multivariate logistic regression (LR) analysis. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve, calibration curves, and decision curve analysis (DCA). Additionally, the effectiveness and interpretability of the AutoML models were assessed using DCA, feature importance, SHapley Additive exPlanations (SHAP) plots, and Local Interpretable Model-Agnostic Explanations (LIME).
Results. A total of 15,215 pancreatic cancer patients with complete baseline information were included, with 9,227 in the training set, 3,023 in the validation set, and 2,965 in the testing set. The incidence of LIM was 17.0%, 17.2%, and 17.6% in the training, validation, and testing sets, respectively. Compared with the traditional LR model, the AutoML models showed superior performance. Among them, the Gradient Boosting Machine (GBM) model achieved the best results, with areas under the ROC curve of 0.908, 0.885, and 0.892 in the training, validation, and testing sets, respectively.
Conclusion. The AutoML model using the GBM algorithm achieved the best performance in predicting liver metastasis in pancreatic cancer patients (testing-set area under the ROC curve of 0.892) and shows substantial potential for clinical application.
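To make two steps of the abstract concrete, the following is a minimal Python sketch of a random 6:2:2 split and of the area under the ROC curve. It is not the authors' pipeline: the abstract does not name their software, and the function names `split_622` and `roc_auc` are hypothetical helpers introduced here for illustration. The reported set sizes (9,227 / 3,023 / 2,965) only approximate an exact 6:2:2 partition, so the authors' splitting procedure presumably differed in detail.

```python
import random


def split_622(n_rows, seed=0):
    """Shuffle row indices and split them 60/20/20 (hypothetical helper).

    Integer arithmetic avoids floating-point rounding in the set sizes;
    any remainder rows go to the test set.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = n_rows * 6 // 10
    n_val = n_rows * 2 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 0/1 outcomes (1 = liver metastasis in this setting).
    scores: predicted risks; tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    train, val, test = split_622(15215)
    print(len(train), len(val), len(test))  # 9129 3043 3043
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.908, 0.885, and 0.892 should be read.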
Keywords: artificial intelligence, SEER database, automated machine learning, liver metastasis, pancreatic cancer, predictive models
Received: 20 Jun 2025; Accepted: 08 Aug 2025.
Copyright: © 2025 Zhang, Min, Meng, Wang, Du and Ding. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Yun Du, Chongqing Nanchuan District People's Hospital, Chongqing, China
Jiewen Ding, Chongqing Nanchuan District People's Hospital, Chongqing, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.