ORIGINAL RESEARCH article

Front. Endocrinol., 15 December 2025

Sec. Thyroid Endocrinology

Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1697233

Prediction of 131I uptake in lung metastases of differentiated thyroid cancer using deep learning

Hongjun Song1†, Manman Fei2†, Haoyi Tao2, Zhongling Qiu1, Chentian Shen1, Xiaoyue Chen3, Qiong Luo4, Huajun She2, Qian Wang5, Lichi Zhang2*‡, Quanyong Luo1*‡
  • 1Department of Nuclear Medicine, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
  • 2School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  • 3Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  • 4Department of Nuclear Medicine, Shanghai Tenth People’s Hospital, Tongji University School of Medicine, Shanghai, China
  • 5School of Biomedical Engineering, ShanghaiTech University, Shanghai, China

Objective: An accurate assessment of 131I accumulation capacity in lung metastases of differentiated thyroid cancer (DTC) is pivotal for guiding radioiodine therapy and avoiding ineffective 131I administration. This study aimed to develop a deep convolutional neural network (DCNN) model to predict 131I uptake in lung metastases of DTC before radioiodine therapy.

Methods: In this retrospective, multicenter, population-based cohort study, we collected chest CT image datasets for DTC patients with lung metastases from three hospitals in China. Pulmonary metastases were classified into two categories based on the post therapeutic 131I whole-body scan: 131I-avid (positive 131I uptake) and non-131I-avid (negative 131I uptake). For DCNN model development, patients were assigned to the primary dataset (140 patients with 131I-avid, 121 with non-131I-avid). For model validation, patients were assigned to the internal validation dataset (36 patients with 131I-avid, 23 with non-131I-avid), external validation dataset 1 (25 patients with 131I-avid, 18 with non-131I-avid), and external validation dataset 2 (23 patients with 131I-avid, 18 with non-131I-avid). Using these datasets, we assessed the performance of our model, ResNeSt50, and compared it with two models: Inception V3 and ResNet50.

Results: Compared to Inception V3 and ResNet50, our model, ResNeSt50, demonstrated the highest prediction performance in the internal (area under the curve [AUC] = 0.722, 95% confidence interval [CI] = 0.716–0.725), external validation dataset 1 (AUC = 0.720, 95% CI = 0.691–0.749), and external validation dataset 2 (AUC = 0.731, 95% CI = 0.713–0.748).

Conclusion: We developed a simple and robust DCNN model for predicting the 131I uptake in lung metastases of DTC before radioiodine therapy, which can provide improved screening for patients who may benefit from 131I therapy.

Trial registration: Chinese Clinical Trial Registry (ChiCTR), ChiCTR1800018047. Registered on 28 August 2018.

Introduction

Lungs are the most frequent site of distant metastases from thyroid cancer of follicular cell origin, with an incidence rate of 2% to 20% (1). The 10-year survival rate for patients with such metastases ranges from 25% to 85% (2–4). According to the posttherapeutic radioactive iodine whole-body scan (131I-WBS), pulmonary metastases are classified as 131I-avid (positive 131I uptake) and non-131I-avid (negative 131I uptake). The term “131I-avid” refers to lesions that demonstrated higher radiopharmaceutical uptake than the physiological background of the mediastinum. The non-131I-avid category comprises lesions with no visible uptake on the posttherapeutic whole-body scan (Rx-WBS) or, in cases of multiple pulmonary lesions, those with uptake evident in less than 10% of them. Radioiodine therapy remains the mainstay of treatment for 131I-avid lung metastases of differentiated thyroid cancer (DTC). However, in approximately 30% of cases, there is no obvious benefit, as DTC patients with pulmonary metastases may lose 131I accumulation capacity (radioactive iodine-refractory [RAIR]) (4). Generally, the 10-year survival rate of patients with RAIR-DTC is reported to be < 10%, and the prognosis is poor (5, 6); for these patients, treatment with tyrosine kinase inhibitors (TKI) should be considered. Patients showing 131I accumulation should be treated with radioiodine therapy, whereas in the case of radioactive iodine-refractory patients, early identification of this subgroup is crucial to avoid unnecessary 131I treatment and, importantly, to prevent delays in other potentially effective therapies. However, the current lack of an effective, noninvasive method to predict 131I uptake in lung metastases before therapy can lead to potentially “blind” and ineffective treatments for some patients. Methods that may help predict 131I accumulation capacity in DTC lung metastases are urgently needed to facilitate individualized treatment.

Computerized tomography (CT) scans accurately detect and diagnose lung metastases in DTC but are ineffective in identifying 131I-avid lung metastases. Artificial intelligence (AI), particularly deep learning algorithms, has achieved remarkable success and has already been successfully applied in the field of imaging diagnosis, classification, and prognosis owing to their advantages of being fast, accurate, and reproducible (7). AI models can discover high-order features and patterns in medical images that human experts cannot handle and can automatically perform quantitative assessments. Many models have even achieved performance comparable to human decision-making in recent applications (8, 9). Deep convolutional neural networks (DCNNs) are particularly recognized for their high performance in image recognition and their ability to accomplish complex visual recognition tasks (10). Deep learning has achieved state-of-the-art performance in automatic and accurate pulmonary nodule detection from CT scans (11). For example, DCNN can achieve an F-score of 85.5% for the classification of lung patterns on CT scans (12). Additional successes include lymph node metastasis prediction from primary breast cancer (13), diagnosis of thyroid cancer (14), and classification of renal cell carcinoma (15).

In this study, based on CT images, we aim to classify patients as 131I-avid and non-131I-avid using deep learning. Experiments validated the effectiveness of our classification model, with an area under the curve (AUC) of 0.722 (95% confidence interval [CI] = 0.716–0.725) in the internal validation dataset, 0.720 (95% CI = 0.691–0.749) in external validation dataset 1, and 0.731 (95% CI = 0.713–0.748) in external validation dataset 2.

Materials and methods

Study participants

This retrospective multicohort study was approved by the institutional review board (IRB) of Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China, and was undertaken according to the principles of the Declaration of Helsinki. Informed consent was waived by the IRB because of the retrospective nature of the study (Approval Number: 2018-KY-048 (K)). We accessed the medical record database from January 2017 to June 2022 to identify patients who had lung metastases, underwent total thyroidectomy for DTC, and received at least one 131I therapy.

Diagnosis, classification, and treatment of DTC lung metastases

The diagnostic criteria for lung metastases of DTC were based on a previous study (4). A patient was considered to have pulmonary metastases if they met any of the following criteria: (i) confirmation of lung metastatic lesions through pathological examination of a biopsy specimen, (ii) presence of localized or diffused pulmonary 131I uptake on 131I-WBS in combination with pulmonary nodules on chest CT and elevated serum thyroglobulin (Tg) levels, or (iii) absence of pulmonary 131I uptake on 131I-WBS in combination with pulmonary nodules on chest CT and elevated serum Tg levels.

According to posttherapy (Rx-WBS) scanning after radioiodine therapy, lung metastases from thyroid cancer were classified as 131I-avid (positive 131I uptake) or non-131I-avid (negative 131I uptake). Lung 131I uptake was defined as “positive” when the lesion uptake was higher than the background physiological mediastinum uptake. Lesions without visible 131I lung uptake on Rx-WBS or with uptake in < 10% of multiple pulmonary lesions were defined as the “negative” group (16, 17). Regarding interrater discordance, two physicians initially interpreted the scans independently. In cases of disagreement, they reevaluated the scans together to reach a consensus.
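The labeling rule described above can be written as a short function. This is a minimal sketch, not the authors' code: the function name `classify_patient` and its two count arguments are hypothetical, and treating a patient with exactly 10% of lesions showing uptake as 131I-avid is an assumption, since the text only specifies "< 10%".

```python
def classify_patient(n_lesions: int, n_lesions_with_uptake: int) -> str:
    """Label one patient from Rx-WBS findings (illustrative sketch).

    n_lesions_with_uptake counts pulmonary lesions whose 131I uptake is
    higher than the physiological mediastinal background.
    """
    if n_lesions <= 0:
        raise ValueError("patient must have at least one pulmonary lesion")
    if not 0 <= n_lesions_with_uptake <= n_lesions:
        raise ValueError("uptake count must lie between 0 and n_lesions")

    # No visible uptake at all: negative group.
    if n_lesions_with_uptake == 0:
        return "non-131I-avid"
    # Multiple lesions with uptake in fewer than 10% of them: negative group.
    if n_lesions > 1 and n_lesions_with_uptake / n_lesions < 0.10:
        return "non-131I-avid"
    return "131I-avid"
```

For example, a patient with 20 pulmonary lesions of which one shows uptake (5%) falls in the negative group under this rule.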

Patients were instructed to follow a low-iodine diet and discontinue thyroxine medication for 3–4 weeks prior to 131I treatment. The 131I dose ranged from 3.7 to 7.4GBq (100–200 mCi) for each treatment. Posttherapy scanning (known as Rx-WBS) was performed 3 days after 131I treatment. Interpretation of the WBS images was conducted by two experienced nuclear medicine physicians.

DCNN model construction

All patients underwent chest CT to examine metastatic lesions without the use of contrast media, so as not to affect subsequent radioiodine therapy. CT images were downloaded from the database in DICOM format using a picture archiving and communication system (PACS). Equipment manufactured by Philips (Amsterdam, Netherlands) and GE Healthcare (Chicago, USA) was used to generate chest CT images. All CT images were reconstructed using a medium-sharp reconstruction algorithm with a thickness of 1 mm. The objective of our research was to classify the 131I uptake capacity of DTC pulmonary metastatic nodules, which mainly consisted of two major steps: First, we used a pulmonary nodule detection network to extract pulmonary nodules from patient lung CT images, and then cropped the detected pulmonary nodules into fixed sizes (64 × 64). Second, the detected pulmonary nodules were fed into a deep learning classification model for further lesion differentiation.
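The cropping step can be illustrated as follows. This is a sketch under stated assumptions: the text gives only the 64 × 64 patch size, so the function name `crop_patch`, the choice to center the patch on the detected nodule, and the padding value of -1000 (approximately air in Hounsfield units) for patches that extend past the slice border are all illustrative choices.

```python
def crop_patch(ct_slice, center_row, center_col, size=64, pad_value=-1000.0):
    """Cut a size x size patch centered on a detected nodule.

    ct_slice is a 2D list of intensity values. Pixels that would fall
    outside the slice are filled with pad_value (an assumption).
    """
    half = size // 2
    n_rows, n_cols = len(ct_slice), len(ct_slice[0])
    patch = []
    for r in range(center_row - half, center_row + half):
        row = []
        for c in range(center_col - half, center_col + half):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(ct_slice[r][c] if inside else pad_value)
        patch.append(row)
    return patch
```

With this indexing, the nodule center lands at position (32, 32) of the returned patch.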

Pulmonary metastatic nodule detection network

For automatic and accurate pulmonary nodule detection from lung CT scans, we developed a novel method to ensure robustness in nodule detection and classification. This method is based on Faster-RCNN (18) with the feature pyramid network (FPN) (19) as the main architecture, and pulmonary nodules of size 64 × 64 were extracted. The network is divided into three modules: feature extraction (backbone + neck), region proposal network (RPN head), and proposal box re-regression and classification (region-of-interest [RoI] head). We propose a weighted patch sampling method to sample false-positive candidates and extract coarse segmentation of anatomical structures from CT. Inspired by related works on false-positive reduction (20, 21), we propose a parallel multitask RoI head for anatomical structure-aware nodule classification, which has three subheads to classify nodules or nonnodules using both nodule features and context features. The model mimics the diagnosis of a case by more than one doctor and forces the three subheads to learn from different nodule features while generating the same results. The final model benefits from both the anatomical structures and the parallel multitask RoI head. As with other methods, free-response receiver operating characteristic (FROC) analysis is employed to measure the performance of the model.

Pulmonary metastatic nodule classification network

Image classification is a key task in computer vision, and CNNs have achieved state-of-the-art performance in this area. We adopted ResNeSt, a ResNet variant that incorporates a split-attention block to enhance feature representations, aiming for improved performance in image classification tasks. Specifically, the classification model consists of one stem stage and four block stages. The stem stage comprises three convolutions and a maximum pooling layer, with a stride of 2 for both the initial convolution and the pooling. The four block stages contain three, four, six, and three blocks, respectively. The first block of each stage may include a feature dimensionality increase or downsampling. ResNeSt introduces the split-attention mechanism within its block design. The block structure divides the input into two groups, each further divided into four splits with a basic feature width of 40. Subsequently, features from the different splits and groups are fused.
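The fusion step of split attention within one group can be sketched as a softmax taken across the splits, channel by channel, followed by a weighted sum. The sketch below is a simplification and not the authors' implementation: each split is reduced to one value per channel (real feature maps are spatial), and the attention logits are passed in directly, whereas in ResNeSt they come from global pooling followed by small dense layers. The name `fuse_splits` is hypothetical.

```python
import math


def fuse_splits(splits, logits):
    """Fuse the splits of one group with per-channel softmax attention.

    splits[r][c]: feature value of split r, channel c (simplified to scalars).
    logits[r][c]: attention logit for split r, channel c (supplied by caller).
    Returns one fused value per channel.
    """
    n_splits, n_channels = len(splits), len(splits[0])
    fused = []
    for c in range(n_channels):
        column = [logits[r][c] for r in range(n_splits)]
        peak = max(column)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in column]
        total = sum(exps)
        # Softmax weights across splits, then the weighted sum of the splits.
        fused.append(sum(e / total * splits[r][c] for r, e in enumerate(exps)))
    return fused
```

With equal logits the fused output is simply the mean of the splits; unequal logits let the block emphasize some splits over others on a per-channel basis.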

Model training and testing

As is well known, complex models such as CNNs may experience overfitting, which results in suboptimal performance on data not included in the training phase. Therefore, it is necessary to properly prepare the data to evaluate the model’s performance. Based on the patient’s treatment timeline, the dataset was divided into a primary set (from January 2017 to January 2021) and an internal independent validation set (from February 2021 to June 2022). Fivefold cross-validation was applied to evaluate the classifier’s performance on the primary set (80% of the data for training and 20% for validation). To verify the model’s generalization ability, performance was further evaluated using external validation datasets 1 and 2. The performance of our deep learning model for predicting 131I uptake in lung metastases of DTC was compared with two other classical architectures, Inception V3 (22) and ResNet-50 (23). Receiver operating characteristic curves (ROC) were plotted, and the area under the curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were used to assess the model’s performance.
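The data partitioning described above can be sketched in a few lines. This is an illustration under assumptions: the record layout (a dictionary with a `treated` date) and the function names are hypothetical, and splitting folds at the patient level, so that all images from one patient stay in the same fold, is an assumption the text does not state explicitly.

```python
import random
from datetime import date

# Last day of the primary-set window (January 2017 to January 2021).
PRIMARY_END = date(2021, 1, 31)


def temporal_split(patients):
    """Split patient records by treatment date into primary and internal sets."""
    primary = [p for p in patients if p["treated"] <= PRIMARY_END]
    internal = [p for p in patients if p["treated"] > PRIMARY_END]
    return primary, internal


def five_fold(patient_ids, k=5, seed=0):
    """Yield (train_ids, val_ids) pairs for k-fold cross-validation."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        val = folds[i]
        train = [pid for j, fold in enumerate(folds) if j != i for pid in fold]
        yield train, val
```

With 261 patients and k = 5, each validation fold holds 52 or 53 patients, which corresponds to the roughly 80%/20% split reported.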

Implementation and training strategy

The model was implemented in PyTorch on two Nvidia Tesla P100 GPUs. During training, each model was trained for 1,000 epochs using Adam optimization with a momentum of 0.9. The batch size was limited by GPU memory; we set it to 128 (float16) on an Nvidia GTX 1080Ti GPU. A weight decay of 0.0001 was applied. The initial learning rate was 0.0001 and was reduced to 0.00001 after half of the total number of epochs. A linear warm-up learning rate schedule was also applied during the first epoch.
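The learning rate schedule described above can be expressed as a plain function of the training step. This is a sketch, not the authors' training code: the function name `learning_rate`, the per-step (rather than per-epoch) warm-up granularity, and the choice to start the warm-up from `base_lr / steps_per_epoch` are assumptions.

```python
def learning_rate(step, steps_per_epoch, total_epochs=1000,
                  base_lr=1e-4, final_lr=1e-5):
    """Learning rate at a given global step (0-indexed).

    Linear warm-up over the first epoch, then base_lr until half of the
    epochs have elapsed, then final_lr for the remainder.
    """
    warmup_steps = steps_per_epoch
    if step < warmup_steps:
        # Ramp linearly so the last warm-up step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    epoch = step // steps_per_epoch
    return base_lr if epoch < total_epochs // 2 else final_lr
```

In a PyTorch training loop, a function of this kind could be applied by assigning its return value to each parameter group's `lr` entry before every optimizer step.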

Statistical analysis

We demonstrated the diagnostic ability of the deep learning model to discriminate DTC lung metastases that are 131I-avid from those that are non-131I-avid using independent internal and external validation datasets. Accuracy, sensitivity, specificity, and positive and negative predictive values with 95% CIs were reported for our model (ResNeSt50) and two classical DCNN architectures (Inception V3 and ResNet-50). Statistical analyses were performed using R (version 3.6) with the pROC package (version 1.12.1) and GraphPad Prism 7.0.
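For reference, the reported metrics follow directly from a confusion matrix, and the empirical AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The sketch below uses plain Python rather than the R and pROC tooling named above; the function names are hypothetical, and confidence intervals are not computed.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def empirical_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct ranking (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

Here the positive class corresponds to 131I-avid lesions and the scores to the model's predicted probability of avidity.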

Results

Patient cohorts

Between January 2017 and January 2021, 302 patients with DTC lung metastases were included in the primary cohort at our institution. Following a quality control evaluation, 41 patients were subsequently excluded for the following reasons: (i) patients with negative chest CT but positive 131I uptake on 131I-WBS, defined as “fine miliaric” (n = 11); (ii) patients with diffuse miliary pulmonary metastases (n = 13); and (iii) patients with low-quality chest CT images (n = 17). Ultimately, the primary set consisted of 20,175 images from 261 individuals: 140 patients with 131I-avid DTC lung metastases (10,220 images) and 121 with non-131I-avid (9,955 images). We conducted a fivefold cross-validation experiment on the primary set to evaluate the performance of our model. Between February 2021 and June 2022, independent validation datasets were obtained: 59 individuals for the internal validation dataset, 43 for the external validation dataset 1, and 41 for the external validation dataset 2. Baseline characteristics of the primary set and three independent validation sets are shown in Table 1. These datasets serve as the foundation for predicting 131I uptake in DTC lung metastases before radioiodine therapy. Figure 1 illustrates the study process in a flowchart.

Table 1

Table 1. Demographic data for 404 DTC lung metastases patients.

Figure 1
Flowchart depicting the process for detecting and managing DTC lung metastases. It includes images from CT scans of Patient A and Patient B, showing lung metastases. After diagnosing DTC lung metastases, a path leads to 131I treatment with pre- and post-treatment imaging, identifying 131I-avid and non-131I-avid metastases. Another path involves lung lesion detection using Frangi-UNet and a deep learning model for classification and 131I uptake prediction. Components include CT images, a convolution layer, max pooling, and output.

Figure 1. Study flowchart of procedures in the development and evaluation of deep learning models for automatically predicting 131I uptake in lung metastases of DTC before radioiodine therapy. CNN, convolutional neural network.

Performance of deep learning models

Pulmonary metastatic nodule detection

In recent years, there have been significant advancements in deep convolutional neural networks for the detection of pulmonary nodules. In this study, we proposed a parallel multitask RoI head method to enhance the robustness of nodule detection. Figure 2 shows the overall framework of our pulmonary nodule detection method. A total of 15,507 lung nodules were extracted from the 131I-avid group and 13,891 nodules from the non-131I-avid group, as summarized in Table 1. These pulmonary metastatic nodules were further classified based on their size: less than 5 mm (447 nodules), 5–10 mm (22,779 nodules), 10–20 mm (5,616 nodules), and larger than 20 mm (538 nodules). The distribution of pulmonary nodule sizes is shown in Figure 3a, with the majority of nodules falling within the 5–10 mm range.

Figure 2
Flowchart illustrating the study and validation process for predicting iodine-131 uptake in DTC lung metastases. From a database of 302 patients (2017-2021), 261 with lung metastases enrolled; exclusions due to specific conditions are noted. The primary set includes 20175 images, split into iodine-avid and non-avid groups, with further pathological confirmations. A DCNN model was used with five-fold cross-validation. Validation from 2021-2022 involved 59 patients with 4530 images and a similar process.

Figure 2. The proposed framework of a parallel multitask RoI head. A whole CT volume is fed into the preprocess module to extract coarse anatomical structures and weights. The input image patch is then sampled via the weights. The backbone and FPN extract multiscale feature maps and feed them into the RPN head. Next, the RPN head proposes RoIs to the parallel multitask RoI head. For each proposed nodule RoI (nRoI), we use a time-scaled nRoI to generate a context RoI (cRoI). The context branch focuses only on the context category classification. For each branch, three parallel subheads perform the same task. Finally, the RoI head classifies whether the proposal is a nodule based on the features of the nodule and context.

Figure 3
Panel A shows a histogram comparing the size distribution of two groups: “avid” in blue and “non-avid” in orange, with peaks around 7.5. Panel B presents a sensitivity plot with several lines representing different models, showing sensitivity against the average number of false positives per scan.

Figure 3. (a) Distribution of pulmonary nodule size. (b) FROC curves of different lung nodule sizes. FROC, free-response receiver operating characteristic.

The performance of pulmonary nodule detection was evaluated using FROC curves, which plot sensitivity against false positives (FPs) per scan (FPs/scan). FPs/scan represents the ratio of the number of false positives to the total number of cases considered for evaluation. Figure 3b shows the FROC curves for pulmonary nodules of different sizes: 0–5 mm, 5–10 mm, and > 10 mm, denoted by the red, green, and blue curves, respectively. The dashed lines above and below each curve indicate the upper and lower bounds of the bootstrap. Detection performance for 0–5 mm pulmonary metastasis nodules was the weakest, with a recall rate of approximately 35% at an FP/scan of 0.25. For nodules larger than 5 mm, a recall rate of 95% was achieved at an FP/scan of 0.25. This performance disparity is likely attributable to false positives arising from factors such as small blood vessels, lung walls, or rib humps.
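One operating point on an FROC curve can be computed by sweeping the detection threshold from high to low and recording the sensitivity reached before the false-positive budget is exceeded. The sketch below is a simplified illustration, not the evaluation code used in the study: the function name is hypothetical, each true-positive detection is assumed to match a distinct nodule, and tied scores are not treated specially.

```python
def sensitivity_at_fp_rate(detections, n_nodules, n_scans, max_fp_per_scan):
    """Highest sensitivity achievable with FPs/scan <= max_fp_per_scan.

    detections: (score, hits_nodule) pairs pooled over all scans, where
    hits_nodule is True when the candidate matches a ground-truth nodule.
    """
    tp = fp = 0
    best = 0.0
    # Lowering the threshold admits candidates in order of decreasing score.
    for _score, hits_nodule in sorted(detections, key=lambda d: d[0], reverse=True):
        if hits_nodule:
            tp += 1
        else:
            fp += 1
        if fp / n_scans > max_fp_per_scan:
            break  # budget exceeded, stop lowering the threshold
        best = tp / n_nodules
    return best
```

Calling this function with `max_fp_per_scan=0.25` on the detections for one size stratum would give the kind of recall figure quoted above for that stratum.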

Pulmonary metastatic nodule classification

In our study, we utilized a deep learning model called ResNeSt50 as the primary tool for discriminating between positive and negative 131I uptake in DTC lung metastatic nodules using chest CT images. A detailed view of the split-attention unit is shown in Figure 4. The model demonstrated promising predictive performance in fivefold cross-validation. During this stage, the average AUC for the ResNeSt50 model was 0.770 (95% CI = 0.767–0.775), indicating good discrimination ability. The model accuracy was 71.0% (95% CI = 70.9%–71.0%), sensitivity was 75.7% (95% CI = 70.7%–80.7%), specificity was 65.9% (95% CI = 60.2%–71.6%), positive predictive value (PPV) was 70.3% (95% CI = 65.1%–75.4%), and negative predictive value (NPV) was 71.8% (95% CI = 66.2%–77.5%). ResNeSt50 also demonstrated good performance in the independent validation datasets, with AUC values of 0.722 (95% CI = 0.716–0.725) for the internal independent validation dataset, 0.720 (95% CI = 0.691–0.749) for the Shanghai RuiJin external validation dataset 1, and 0.731 (95% CI = 0.713–0.748) for the Shanghai Tenth People’s Hospital external validation dataset 2. Compared with two other classical architectures, ResNet50 and Inception V3, ResNeSt50 achieved the best performance, as shown in Table 2. Figure 5 illustrates the ROC curves for ResNeSt50 in the independent validation datasets. Overall, ResNeSt50 showed promising predictive performance in discriminating positive and negative 131I uptake in DTC lung metastatic nodules based on chest CT images. It outperformed other classical architectures in terms of accuracy and demonstrated good discriminative ability in both cross-validation and independent validation sets.

Figure 4
Flowchart of a neural network structure with sections labeled Deep Stem, ResNeSt Bottleneck, and Split-Attention. The Deep Stem includes convolutional and max pooling layers. The ResNeSt Bottleneck features convolutional layers with SplitAttn. Split-Attention consists of two groups with convolutional layers, global pooling, and r-Softmax, merging into a final layer. Each section is color-coded and progresses from input to output.

Figure 4. Detailed view of the split-attention unit for ResNeSt50. The deep learning classification model is based on the ResNet50 framework and includes one stem stage and four block stages. The stem stage includes three convolutions and a maximum pooling layer, with a stride of 2 for both the first convolution and the pooling.

Table 2

Table 2. Performance of three CNN models in the cross-validation and independent validation sets.

Figure 5
Five receiver operating characteristic (ROC) curves labeled A to E, each illustrating true positive rate versus false positive rate. A has an AUC of 0.75, B has 0.77, C 0.78, D 0.79, and E 0.75. Each curve is compared to the diagonal line of no discrimination.

Figure 5. Performance of the deep convolutional neural network model to discriminate 131I-avid (positive 131I uptake) from non-131I-avid (negative 131I uptake) DTC lung metastatic nodules. Receiver operating characteristic curves of the cross-validation dataset (average) (a), the internal validation dataset (b), the external validation dataset 1 (c), and the external validation dataset 2 (d).

To enhance interpretability, we applied Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the regions of CT images that contributed most to the model’s predictions. As shown in Figure 6, the Grad-CAM heatmaps highlight that the model primarily focuses on nodule margins and heterogeneous internal areas—features consistent with radiologists’ diagnostic reasoning and known high-risk imaging characteristics. These visualizations provide intuitive evidence that the model’s decision-making process aligns with clinically meaningful regions rather than irrelevant background areas.
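The core Grad-CAM computation is compact: each feature map of the chosen convolutional layer is weighted by the spatial mean of the gradient of the class score with respect to that map, the weighted maps are summed, and negative values are clipped. The sketch below works on nested lists and assumes the activations and gradients have already been extracted from the network (for example, with framework hooks); the function name and the final scaling to [0, 1] are illustrative choices, not details taken from the study.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    activations[k][i][j]: feature map k at spatial position (i, j).
    gradients[k][i][j]: gradient of the class score w.r.t. that activation.
    Returns a 2D heatmap scaled to [0, 1] (all zeros if nothing is positive).
    """
    n_rows, n_cols = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (n_rows * n_cols) for g in gradients]
    # Weighted sum over channels, followed by ReLU.
    cam = [
        [max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
         for j in range(n_cols)]
        for i in range(n_rows)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam
```

The resulting low-resolution map is typically upsampled to the input patch size and overlaid on the CT image as a color heatmap.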

Figure 6
Two rows of four CT images labeled A through D. The top row shows A: a blurry grayscale image; B: a heatmap in blue, yellow, and red; C: a clearer grayscale image; D: a heatmap similar to B. The bottom row mirrors the same layout and styles.

Figure 6. (A, C) Original images; (B, D) corresponding Grad-CAM heatmaps.

Discussion

We constructed a model that automates the interpretation of CT images to distinguish 131I-avid (positive 131I uptake) from non-131I-avid (negative 131I uptake) DTC lung metastases prior to radioiodine therapy in a real-world setting. As a pilot study, we first created a pipeline that includes anatomical structure-aware pulmonary nodule detection via a parallel multitask RoI head to identify lung nodules from CT images. This pipeline is designed based on an actual clinical scenario and represents a novel attempt to incorporate domain expert knowledge of medical imaging into deep learning frameworks. Moreover, it achieved high sensitivity: for lung nodules larger than 5 mm, the recall rate reached 95% at 0.25 FP/scan. Second, we developed a prediction model to discriminate 131I-avid from non-131I-avid DTC lung metastatic nodules. The ResNeSt50 model achieved good performance in the independent validation datasets, with AUC values of 0.722 (95% CI = 0.716–0.725) for the internal validation dataset, 0.720 (95% CI = 0.691–0.749) for external validation dataset 1, and 0.731 (95% CI = 0.713–0.748) for external validation dataset 2. These results demonstrate that this algorithm is a promising approach and may facilitate the application of deep learning techniques in the precise treatment of thyroid cancer. Figure 7 shows a schematic diagram of our study.

Figure 7
Diagram showing a neural network architecture for CT image analysis. The upper section describes stages: Backbone + FPN processes a CT slice, followed by RPN Head, and then RoI Head for classification and regression. The bottom section details feature processing, with RPN input separating into context and nodule branches, applying convolution, cosine distance, and producing outputs for nodule classification and location.

Figure 7. Schematic diagram of this study. The left part of the diagram shows the traditional method (posttherapeutic 131I whole-body scan) for diagnosing 131I uptake in lung metastases of DTC before radioiodine therapy, while the right part shows the artificial intelligence method proposed in this study.

The management of patients with DTC lung metastases is a clinical challenge, especially for the first radioiodine treatment, as the decision to administer high-dose 131I (a “blind” therapy) is problematic. We generally assess whether the patient’s lesions have iodine avidity based on the post therapeutic 131I-WBS, which means the patient has already received a blind high dose of 131I therapy. However, approximately 30% of patients with DTC pulmonary metastases lose 131I accumulation capacity (4), and these cases are considered RAI-refractory DTC (RAIR-DTC). In principle, patients with metastatic disease showing 131I accumulation should be treated with RAI therapy. In the case of RAIR-DTC, early determination is important due to the following factors: (i) unnecessary high radiation exposure; (ii) economic burden caused by ineffective 131I treatment; (iii) discomfort caused by discontinuation of thyroxine for 3–4 weeks (hypothyroidism) prior to RAI therapy; (iv) prolonged TSH stimulation, which may stimulate tumor growth (24); and (v) delays in other potentially more effective therapies, such as TKI administration. Therefore, identifying methods to predict 131I uptake in lung metastases of DTC before radioiodine therapy is crucial to selecting appropriate treatment strategies.

Based on the American Thyroid Association Guidelines, 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT), 131I-WBS, CT, and Tg are recommended in combination for identifying RAIR-DTC (25), but none of these tests is simple, convenient, and effective. 18F-FDG PET/CT is an important tool in evaluating patients with “thyroglobulin elevated but negative iodine scintigraphy” syndrome (TENIS syndrome) and in identifying RAIR-DTC (26). An inverse relationship between 131I and 18F-FDG uptake (“flip-flop” phenomenon) has been described for RAIR-DTC, in which 131I avidity is lost while glucose metabolism increases. However, there are several drawbacks: (1) 18F-FDG PET/CT is valid only in patients with elevated Tg but negative 131I-WBS, that is, patients who have already received one or more ineffective RAI therapies; (2) metastases may develop from different tumor clones with mixed patterns of 18F-FDG or 131I accumulation (27); and (3) the cost is high and the method is unconventional. Several iodine isotopes, such as 131I, 123I, and 124I, play important roles in nuclear medicine, both for diagnostic purposes and therapy. Research shows that the diagnostic 131I (Dx) scan has a low yield in changing clinical management (28, 29). 123I and 124I are excellent diagnostic agents for whole-body scanning in patients with thyroid carcinoma and are comparable to posttreatment 131I scans. However, this remains a controversial topic. Lammers et al. (30) showed poor sensitivity of 124I PET/CT in detecting posttherapy 131I-positive metastases. Gauri et al. (31) and Freudenberg et al. (32) both demonstrated that a negative 124I PET scan has a low predictive value for a negative post-131I therapy scan and should not be used to exclude the option of blind 131I therapy. Moreover, limited availability and high cost hinder their routine application.

Therefore, predicting 131I uptake in DTC lung metastases before radioiodine therapy is crucial but challenging. To date, AI research in this field remains largely unexplored.

In this pilot study, our work was mainly divided into two parts: detection of pulmonary metastatic nodules from lung CT scans and prediction of 131I uptake. To ensure robustness in lung nodule detection, we proposed an effective parallel multitask RoI head to generate nodule RoI (nRoI) and context RoI (cRoI) feature maps. This approach more closely resembles clinical diagnosis and represents a novel attempt to integrate domain expert knowledge from medical images into a deep learning method. Our study shows that the distribution of pulmonary nodule size is mainly clustered in the range of 5–10 mm, and the recall rate reached 95% at 0.25 FP/scan. Thus, we assert that the parallel multitask RoI head facilitates the detection of nodules from various features, and the framework demonstrates high sensitivity. Furthermore, we extracted and utilized pulmonary nodules in our prediction network and constructed a classification model based on the ResNeSt backbone (33). Specifically, we use 4s2x40d as the ResNeSt setting in this study, as it provides better training and inference speed and requires less memory. Experiments showed that ResNeSt50 achieved encouraging predictive performance in discriminating 131I-avid from non-131I-avid DTC lung metastatic nodules in chest CT images. Table 2 also shows that our proposed ResNeSt outperforms two classical architectures, Inception V3 and ResNet50, with AUC values of 0.722 (95% CI = 0.716–0.725) for the internal validation dataset, 0.720 (95% CI = 0.691–0.749) for the external validation dataset 1, and 0.731 (95% CI = 0.713–0.748) for the external validation dataset 2. The model maintained stability and generalizability across both internal and external validation datasets.

Note that, in contrast to the performance of DCNN models in other types of tumors, such as lymph node metastasis prediction from primary breast cancer (13), our model’s predictive capability is not optimal. This is mainly due to the following factors: (1) There is currently no established method for discriminating 131I-avid from non-131I-avid DTC lung metastatic nodules, which represents a knowledge gap. Our study is preliminary exploratory research, and the performance of the DCNN model appears comparable to that of 18F-FDG PET/CT or diagnostic 131I scans (26–28). The principal strength of the DCNN model is that it is automated, accessible, and freely available. (2) In our study, the data used were limited to patients’ chest CT. In addition to imaging information, many factors may affect 131I accumulation, such as gene mutations and age (35). However, the number of patients who had undergone gene mutation testing was limited. Future studies will include more information, such as gene mutations, which is expected to further improve the performance of the model.

The successful clinical implementation of our model may serve as a catalyst for improved patient care and resource utilization. By precisely selecting patients, it could enhance therapeutic efficiency, minimize unnecessary side effects, and allocate resources more effectively. This approach paves the way for earlier, more tailored interventions with the potential to improve long-term survival. However, translating such AI models from research to clinical practice is not without challenges. Future work must overcome significant technical and logistical hurdles, including data security, integration with institutional PACS/RIS, and the development of user-friendly interfaces. Furthermore, obtaining regulatory approval as a medical device is a critical and necessary step. Successful navigation of these challenges is a prerequisite for meaningful clinical implementation of our model. Moreover, it is important to note that this model is designed as a clinical decision-support aid, and its outputs should be interpreted within the broader clinical context by healthcare professionals. Final decisions must involve physician judgment, incorporating other relevant information and considering patient-specific factors, as upholding patient autonomy and the physician–patient relationship are key ethical considerations for AI in medicine.

Our research has several limitations. First, this is a retrospective study, and further improvement with larger, prospective studies is needed before clinical application. Second, the study included only chest CT data without considering other relevant factors, such as genetic mutations. In future work, to improve its performance, we intend to incorporate more clinical information into the artificial intelligence system. Accurate diagnoses and successful predictions can guide clinical decision-making, improve patient outcomes, and reduce the cost of cancer management. Third, the use of CT images from multiple hospitals and scanners introduced inherent variations in imaging parameters, which, despite strict quality control, may affect model performance.
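As an illustration of one common way to reduce intensity differences between scanners before a CT volume is passed to a network, the sketch below clips Hounsfield unit (HU) values to a fixed window and rescales them to [0, 1]. This is a generic preprocessing step and not necessarily the one used in this study; the function name `window_and_scale` and the window limits (-1000 to 400 HU) are assumptions chosen for illustration. It addresses intensity range only and does not correct for differences in slice thickness or reconstruction kernel.

```python
def window_and_scale(hu_values, lower=-1000.0, upper=400.0):
    """Clip HU values to [lower, upper] and rescale linearly to [0, 1].

    The default window is an assumed lung-oriented range, not a value
    taken from the study.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    return [(min(max(v, lower), upper) - lower) / span for v in hu_values]


# Hypothetical voxel values (in HU) from a single CT slice.
example = window_and_scale([-2000.0, -1000.0, -300.0, 400.0, 3000.0])
```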

In conclusion, we demonstrated that a deep learning algorithm based on CT images could potentially serve as a new tool for predicting 131I uptake in lung metastases of thyroid cancer before radioiodine therapy. As a pilot study, the performance of the DCNN model is encouraging. With further validation in a larger population and model calibration, our convolutional neural network-based model has strong potential to serve as an important decision-support tool in clinical applications.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding authors.

Ethics statement

The studies involving humans were approved by the institutional review board (IRB) of Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

HoS: Conceptualization, Validation, Methodology, Investigation, Writing – original draft, Data curation. MF: Project administration, Supervision, Writing – original draft, Software, Resources. HT: Conceptualization, Funding acquisition, Writing – review & editing. ZQ: Conceptualization, Funding acquisition, Writing – review & editing. CS: Conceptualization, Funding acquisition, Writing – review & editing. XC: Conceptualization, Funding acquisition, Writing – review & editing. QL: Conceptualization, Funding acquisition, Writing – review & editing. HuS: Conceptualization, Funding acquisition, Writing – review & editing. QW: Conceptualization, Funding acquisition, Writing – review & editing. LZ: Conceptualization, Funding acquisition, Writing – review & editing. QYL: Conceptualization, Funding acquisition, Writing – review & editing.

Funding

The author(s) declared financial support was received for this work and/or its publication. This work was supported by the Fundamental Research Funds for the Central Universities (project number BYG2024QNB23); Shanghai Jiao Tong University scientific research project, Grant/Award Number: YG2019ZDA09; National Natural Science Foundation of China, Grant/Award Number: 81974271; and Shanghai Key Clinical Specialty of Medical Imaging, Grant/Award Number: shslczdzk03203.

Conflict of interest

The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Abbreviations

DTC, differentiated thyroid cancer; DCNN, deep convolutional neural network; WBS, whole-body scan; Tg, thyroglobulin; AUC, area under the ROC curve; CI, confidence interval; CNN, convolutional neural network; ROC, receiver operating characteristic.

References

1. Lin JD, Chao TC, Chou SC, and Hsueh C. Papillary thyroid carcinomas with lung metastases. Thyroid. (2004) 14:1091–6. doi: 10.1089/thy.2004.14.1091

2. Cho SW, Choi HS, Yeom GJ, Lim JA, Moon JH, Park DJ, et al. Long-term prognosis of differentiated thyroid cancer with lung metastasis in Korea and its prognostic factors. Thyroid. (2014) 24:277–86. doi: 10.1089/thy.2012.0654

3. Mihailovic J, Stefanovic L, Malesevic M, and Markoski B. The importance of age over radioiodine avidity as a prognostic factor in differentiated thyroid carcinoma with distant metastases. Thyroid. (2009) 19:227–32. doi: 10.1089/thy.2008.0186

4. Song HJ, Qiu ZL, Shen CT, Wei WJ, and Luo QY. Pulmonary metastases in differentiated thyroid cancer: efficacy of radioiodine therapy and prognostic factors. Eur J Endocrinol. (2015) 173:399–408. doi: 10.1530/EJE-15-0296

5. Durante C, Haddy N, Baudin E, Leboulleux S, Hartl D, Travagli JP, et al. Long-term outcome of 444 patients with distant metastases from papillary and follicular thyroid carcinoma: benefits and limits of radioiodine therapy. J Clin Endocrinol Metab. (2006) 91:2892–9. doi: 10.1210/jc.2005-2838

6. Schlumberger M, Tahara M, Wirth LJ, Robinson B, Brose MS, Elisei R, et al. Lenvatinib versus placebo in radioiodine-refractory thyroid cancer. New Engl J Med. (2015) 372:621–30. doi: 10.1056/NEJMoa1406470

7. Hosny A, Parmar C, Quackenbush J, Schwartz LH, and Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer. (2018) 18:500–10. doi: 10.1038/s41568-018-0016-5

8. LeCun Y, Bengio Y, and Hinton G. Deep learning. Nature. (2015) 521:436–44. doi: 10.1038/nature14539

9. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. (2016) 316:2402–10. doi: 10.1001/jama.2016.17216

10. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Imag Anal. (2017) 42:60–88. doi: 10.1016/j.media.2017.07.005

11. Ziyad SR, Radha V, and Vayyapuri T. Overview of computer aided detection and computer aided diagnosis systems for lung nodule detection in computed tomography. Curr Med Imaging Rev. (2020) 16:16–26. doi: 10.2174/1573405615666190206153321

12. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, and Mougiakakou S. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imag. (2016) 35:1207–16. doi: 10.1109/TMI.2016.2535865

13. Zhou LQ, Wu XL, Huang SY, Wu GG, Ye HR, Wei Q, et al. Lymph node metastasis prediction from primary breast cancer US images using deep learning. Radiology. (2020) 294:19–28. doi: 10.1148/radiol.2019190372

14. Li X, Zhang S, Zhang Q, Wei X, Pan Y, Zhao J, et al. Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: a retrospective, multicohort, diagnostic study. Lancet Oncol. (2019) 20:193–201. doi: 10.1016/S1470-2045(18)30762-9

15. Marostica E, Barber R, Denize T, Kohane IS, Signoretti S, Golden JA, et al. Development of a histopathology informatics pipeline for classification and prediction of clinical outcomes in subtypes of renal cell carcinoma. Clin Cancer Res. (2021) 27:2868–78. doi: 10.1158/1078-0432.CCR-20-4119

16. Okamoto S, Shiga T, Uchiyama Y, Manabe O, Kobayashi K, Yoshinaga K, et al. Lung uptake on I-131 therapy and short-term outcome in patients with lung metastasis from differentiated thyroid cancer. Ann Nucl Med. (2014) 28:81–7. doi: 10.1007/s12149-013-0781-x

17. Sabra MM, Ghossein R, and Tuttle RM. Time course and predictors of structural disease progression in pulmonary metastases arising from follicular cell-derived thyroid cancer. Thyroid. (2016) 26:518–24. doi: 10.1089/thy.2015.0395

18. Ren S, He K, Girshick R, and Sun J. Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. (2017) 39:1137–49. doi: 10.1109/TPAMI.2016.2577031

19. Lin TY, Dollar P, Girshick R, He K, Hariharan B, and Belongie S. Feature pyramid networks for object detection. Proc IEEE Conf Comput Vision Pattern Recog. (2017) pp:2117–25. doi: 10.1109/CVPR.2017.106

20. Dou Q, Chen H, Yu L, Qin J, and Heng PA. Multilevel contextual 3-d CNNs for false positive reduction in pulmonary nodule detection. IEEE Trans Biomed Eng. (2016) 64:1558–67. doi: 10.1109/TBME.2016.2613502

21. Kim BC, Yoon JS, Choi JS, and Suk HI. Multi-scale gradual integration CNN for false positive reduction in pulmonary nodule detection. Neural Netw. (2019) 115:1–10. doi: 10.1016/j.neunet.2019.03.003

22. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, and Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Computer Vision Foundation, Las Vegas, NV (2016). p. 2818–26.

23. He K, Zhang X, Ren S, and Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). Computer Vision Foundation, Las Vegas, NV (2016). p. 770–8.

24. Phan HT, Jager PL, Paans AM, Plukker JT, Sturkenboom MG, Sluiter WJ, et al. The diagnostic value of 124I-PET in patients with differentiated thyroid cancer. Eur J Nucl Med Mol Imag. (2008) 35:958–65. doi: 10.1007/s00259-007-0660-6

25. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association Guidelines Task Force on thyroid nodules and differentiated thyroid cancer. Thyroid. (2016) 26:1–133. doi: 10.1089/thy.2015.0020

26. Lodi Rizzini E, Repaci A, Tabacchi E, Zanoni L, Vicennati V, Cavicchi O, et al. Impact of 18F-FDG PET/CT on clinical management of suspected radio-iodine refractory differentiated thyroid cancer (RAI-R-DTC). Diagn (Basel). (2021) 11:1430. doi: 10.3390/diagnostics11081430

27. Riemann B, Uhrhan K, Dietlein M, Schmidt D, Kuwert T, Dorn R, et al. Diagnostic value and therapeutic impact of (18)F-FDG-PET/CT in differentiated thyroid cancer. Results of a German multicentre study. Nuklearmedizin. (2013) 52:1–6. doi: 10.3413/nukmed-0489-12-03

28. van der Boom T, Zandee WT, Dekkers CCJ, van der Horst-Schrivers ANA, Jansen L, Kruijff S, et al. The value of pre-ablative I-131 scan for clinical management in patients with differentiated thyroid carcinoma. Front Endocrinol (Lausanne). (2021) 12:655676. doi: 10.3389/fendo.2021.655676

29. Chen MK, Yasrebi M, Samii J, Staib LH, Doddamane I, and Cheng DW. The utility of I-123 pretherapy scan in I-131 radioiodine therapy for thyroid cancer. Thyroid. (2012) 22:304–9. doi: 10.1089/thy.2011.0203

30. Lammers GK, Esser JP, Pasker PC, Sanson-van Praag ME, and de Klerk JM. Can I-124 PET/CT predict pathological uptake of therapeutic dosages of radioiodine (I-131) in differentiated thyroid carcinoma? Adv Mol Imaging. (2014) 4:27–34. doi: 10.4236/ami.2014.43004

31. Khorjekar GR, Van Nostrand D, Garcia C, O’Neil J, Moreau S, Atkins FB, et al. Do negative 124I pretherapy positron emission tomography scans in patients with elevated serum thyroglobulin levels predict negative 131I posttherapy scans? Thyroid. (2014) 24:1394–9. doi: 10.1089/thy.2013.0713

32. Freudenberg LS, Jentzen W, Muller SP, and Bockisch A. Disseminated iodine-avid lung metastases in differentiated thyroid cancer: a challenge to 124-I PET. Eur J Nucl Med Mol Imag. (2008) 35:502–8. doi: 10.1007/s00259-007-0601-4

33. Zhang H, Wu C, Zhongyue Z, Zhu Y, Zhang Z, Lin H, et al. ResNeSt: split-attention networks. Amazon, University of California, Davis. New Orleans, LA, USA: IEEE. (2020).

Keywords: thyroid cancer, lung metastases, radioiodine therapy, deep learning, CNN - convolutional neural network

Citation: Song H, Fei M, Tao H, Qiu Z, Shen C, Chen X, Luo Q, She H, Wang Q, Zhang L and Luo Q (2025) Prediction of 131I uptake in lung metastases of differentiated thyroid cancer using deep learning. Front. Endocrinol. 16:1697233. doi: 10.3389/fendo.2025.1697233

Received: 02 September 2025; Accepted: 28 November 2025; Revised: 09 November 2025;
Published: 15 December 2025.

Edited by:

Qi Zhang, Yale University, United States

Reviewed by:

Barbara Maria Jarzab, Maria Skłodowska-Curie National Research Institute of Oncology, Poland
Mohd Amir, Aligarh Muslim University, India
Sijia Zhang, Heidelberg University, Germany
Ximing Ran, Emory University, United States

Copyright © 2025 Song, Fei, Tao, Qiu, Shen, Chen, Luo, She, Wang, Zhang and Luo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lichi Zhang, lichizhang@sjtu.edu.cn; Quanyong Luo, luoqy@sjtu.edu.cn

†These authors have contributed equally to this work and share first authorship

‡These authors have contributed equally to this work and share senior authorship
