
ORIGINAL RESEARCH article

Front. Med. Technol.

Sec. Medtech Data Analytics

Explainable Multi-Modal Machine Learning for Predicting Occult Pulmonary Metastases in Differentiated Thyroid Cancer: A SHAP-Based Approach Prior to Radioactive Iodine Scans

Provisionally accepted
Yuqi Su1, Yuhuang Cai2, 水 靳2, Xuemei Ye2, Jaesik Jeong3, Ye Yuan1*, Heqing Yi2
  • 1Wenzhou Medical University, Wenzhou, China
  • 2Zhejiang Cancer Hospital, Hangzhou, China
  • 3Chonnam National University, Buk-gu, Republic of Korea

The final, formatted version of the article will be published soon.

Background: Patients with differentiated thyroid cancer (DTC) may have occult lung metastases before iodine-131 (131I) treatment. Identifying these metastases before 131I treatment is clinically valuable for accurate staging and for planning 131I treatment. This study builds statistical models on clinical data with machine learning algorithms to predict lung metastasis before 131I treatment.

Methods: Patients were selected from Zhejiang Cancer Hospital. Data came from two groups of DTC patients treated with 131I. The experimental group consisted of 55 patients who showed no lung metastases on CT but tested positive on 131I whole-body scan (131I-WBS). The control group included 316 patients who tested negative for metastases on CT, ultrasound, and 131I-WBS. Six machine learning algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN), were used to build prediction models. AUC, sensitivity, accuracy, precision, specificity, and F1 score were used to compare model performance. Finally, the SHAP algorithm was used to explain the importance ranking of the features.

Results: A total of 371 thyroid cancer patients were included: 55 with occult lung metastasis and 316 controls. The data were divided into a training set and a testing set in a 7:3 ratio. Eleven variables were selected by multivariate Cox regression and analyzed: gender, age, T stage, N stage, tumor size, degree of invasion, number of lymph node metastases, thyroid-stimulating hormone (TSH), thyroglobulin (Tg), thyroglobulin antibodies (TgAb), and administered activity. The best model, LR, achieved an accuracy of 0.91, a recall (sensitivity) of 0.64, a precision of 0.92, an F1-score of 0.70, an area under the curve (AUC) of 0.93, and a specificity of 0.96.

Conclusion: The logistic regression (LR) model showed the best performance in predicting occult lung metastases in thyroid cancer patients before 131I-WBS. Lymph node metastases and thyroglobulin had the greatest impact on the prediction.
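For readers less familiar with the evaluation measures named in the abstract, the following is a minimal Python sketch of how they are conventionally defined for a binary classifier. It is not the authors' code. The names `ConfusionCounts`, `classification_metrics`, and `auc_from_scores` are hypothetical, and every count and score below is invented for illustration and is not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts; 'positive' means occult lung metastasis."""
    tp: int  # metastasis present, predicted present
    fp: int  # metastasis absent, predicted present
    tn: int  # metastasis absent, predicted absent
    fn: int  # metastasis present, predicted absent


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Threshold-dependent metrics computed from confusion-matrix counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auc_from_scores(positive_scores, negative_scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case; ties count as one half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return _ratio(wins, len(positive_scores) * len(negative_scores))


# Hypothetical counts and scores for illustration only; NOT study data.
example_counts = ConfusionCounts(tp=8, fp=2, tn=88, fn=2)
metrics = classification_metrics(example_counts)
example_auc = auc_from_scores([0.9, 0.6], [0.7, 0.2, 0.1])

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
print(f"auc: {example_auc:.2f}")
```

With a small positive class, as in the 55 versus 316 split reported here, accuracy and specificity can stay high even when sensitivity is modest, which is one reason to report all of these measures together rather than accuracy alone.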

Keywords: thyroid cancer, machine learning, prediction model, lung metastases, 131I treatment

Received: 22 Aug 2025; Accepted: 13 Nov 2025.

Copyright: © 2025 Su, Cai, 靳, Ye, Jeong, Yuan and Yi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Ye Yuan, yuanye017@126.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.