Construction and validation of a predictive model for lymph node metastasis in patients with papillary thyroid carcinoma

Hao, Yanhong; Zhang, Yanjing; Su, Yuan; Liu, Liping

doi:10.3389/fendo.2025.1551108

ORIGINAL RESEARCH article

Front. Endocrinol., 09 June 2025

Sec. Cancer Endocrinology

Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1551108

This article is part of the Research Topic: Clinical prediction models in cancer through bioinformatics

Yanhong Hao¹

Yanjing Zhang²

Yuan Su³

Liping Liu²*

¹Department of Ultrasound, The First Hospital of Shanxi Medical University, Taiyuan, China
²Department of Interventional Ultrasound, The First Hospital of Shanxi Medical University, Taiyuan, China
³Department of Imaging Medicine, Shanxi Medical University, Taiyuan, China

Objective: To study the occurrence of lymph node metastasis in patients with papillary thyroid carcinoma (PTC) and construct a predictive model to assess its predictive performance.

Methods: We retrospectively analyzed the data of 432 patients with PTC. The least absolute shrinkage and selection operator (LASSO) was used to select features, and multiple logistic regression was used to analyze the predictive factors. Multiple machine learning (ML) classification models were compared to identify the optimal model, and Shapley additive exPlanations (SHAP) was used for personalized risk assessment. A total of 125 patients from Changzhi Heping Hospital were included in an external validation set to evaluate the generalizability of our model.

Results: Predictors of central lymph node metastasis (CLNM) included age, sex, maximum nodule diameter, margin, morphology, number of nodules, relationship between the nodule and the thyroid capsule, and coarse calcification. A logistic classification model was identified as the optimal model, with a test set area under the curve (AUC) value of 0.798. The validation results using external data were consistent, demonstrating the stability and generalizability of our model.

Conclusion: We established a logistic model using the SHAP method, which provides evidence for the ability of the SHAP method to predict lymph node metastasis and serves as a basis for personalized healthcare.

1 Background

Papillary thyroid carcinoma (PTC) is the most common histological subtype of thyroid cancer, and its incidence among women in Asian countries has increased significantly since 2000 (1). PTC typically follows an indolent clinical course with a favorable prognosis; however, patients with lymph node metastasis have an elevated risk of local recurrence. As the N stage increases, the likelihood of distant metastasis significantly increases, leading to a poorer prognosis (2). Studies have confirmed that total thyroidectomy provides a low local recurrence benefit; therefore, preventive central lymph node dissection is not necessary unless lymph node metastasis is detected before surgery (3). Ultrasound is an easy and cost-effective method for evaluating lymph node metastasis in PTC, but its sensitivity for detecting central lymph node involvement is low due to anatomical limitations and the frequent absence of abnormalities on preoperative imaging (4). Therefore, early identification of risk factors and construction of prediction models are highly important for improving the early prediction of lymph node metastasis.

Machine learning (ML) is a new and powerful tool in the field of medicine, particularly in personalized medicine and computer-aided diagnosis (5). In this study, various ML classification models were used to create predictive models. We gathered and analyzed clinical data from patients with PTC to understand the factors influencing lymph node metastasis and guide surgical treatment. Because ML models can be difficult to interpret, the Shapley additive exPlanations (SHAP) tool was used to visually interpret the risk factors influencing patient predictions (5). SHAP not only quantifies individual probabilities of clinical events, but also integrates biological and clinical models, thereby contributing to the advancement of personalized medicine. Therefore, we aimed to develop a more accurate prediction model for lymph node metastasis in patients with PTC based on clinical data.

2 Materials and methods

We conducted a retrospective study of 432 patients with surgically and pathologically confirmed papillary thyroid carcinoma (PTC) treated at our hospital between January 2020 and October 2021. Concurrently, 125 patients from Changzhi Heping Hospital were used as an external validation set. Based on the pathological findings, all patients were divided into positive and negative lymph node metastasis groups. The inclusion criteria were as follows: 1) surgically confirmed PTC with cervical lymph node pathology records, 2) thyroid ultrasonography recorded within 2 weeks before surgery, and 3) laboratory examination of thyroid serum markers within 2 weeks before surgery. The exclusion criteria were as follows: 1) incomplete clinical data, and 2) unclear ultrasound images or incomplete data for analysis. The study was conducted in accordance with the principles of the Declaration of Helsinki.

2.1 Methods

2.1.1 US equipment and US characteristics

Ultrasonographic examination was performed using Canon i800 color Doppler ultrasound equipment with a probe frequency of 18 MHz. The ultrasonographic features included the location, maximum diameter, echogenicity, margin, morphology, relationship of the nodule to the thyroid capsule (distant from the thyroid capsule (≥2 mm), in contact with the thyroid capsule (<2 mm), or invasion or penetration of the thyroid capsule), aspect ratio, nature of the nodule, intranodular calcifications (≤2 mm defined as microcalcification and >2 mm defined as macrocalcification), and the number of nodules. Irregular margins included irregular, ill-defined, nodular, and lobular features. All ultrasound images were assessed by a sonographer with more than five years of experience in ultrasound examination, and the ultrasound features were analyzed by a senior doctor.
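The categorical definitions above can be encoded before modeling. The following is a minimal sketch of one possible encoding, using only the 2 mm thresholds stated in the text; the function names and category labels are hypothetical and are not taken from the study's code.

```python
def capsule_relationship(distance_mm, invades_capsule=False):
    """Categorize the nodule-to-capsule relationship with the 2 mm cut-off from the text."""
    if invades_capsule:
        return "invasion"  # invasion or penetration of the thyroid capsule
    return "distant" if distance_mm >= 2.0 else "contact"


def calcification_type(diameter_mm):
    """Label a calcification by its diameter (<=2 mm micro, >2 mm macro)."""
    return "microcalcification" if diameter_mm <= 2.0 else "macrocalcification"


print(capsule_relationship(1.5))   # a nodule 1.5 mm from the capsule
print(calcification_type(2.5))     # a 2.5 mm calcification
```

Keeping the thresholds in one place makes it easier to check that every record was categorized with the same rule.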

2.1.2 Clinical and serological indicators of thyroid function

Age, sex, and thyroid serological parameters, including free triiodothyronine (FT3), free thyroxine (FT4), thyroid-stimulating hormone (TSH), thyroglobulin (Tg), antithyroglobulin antibody (TgAb), and anti-thyroid peroxidase antibody (TPOAb), were collected from the patients within 2 weeks before the operation. The TPOAb level was considered normal if it was within the normal range and high if it was above the normal range. Hashimoto’s thyroiditis (HT) was defined based on a positive preoperative TPOAb level together with characteristic changes on the ultrasound image.

2.1.3 BRAF^V600E gene mutation detection

Thyroid nodules with the most apparent signs of malignancy, as assessed by conventional ultrasound, were selected for ultrasound-guided fine needle aspiration, with four punctures per lesion to ensure an adequate tissue sample volume. Genetic and cytological samples were placed in specimen tubes and liquid-based vials, respectively, and DNA was extracted using commercial kits. BRAF^V600E mutation was detected using a real-time fluorescence quantitative polymerase chain reaction amplification-based kit.

2.1.4 Surgical management in study protocol

All enrolled patients underwent standardized thyroidectomy with nodal dissection guided by preoperative ultrasound risk stratification: unilateral lobectomy for localized tumors (T1-2) versus total thyroidectomy for multifocal/aggressive variants (6), complemented by prophylactic cervical lymph node dissection and therapeutic lateral neck dissection for ultrasound-suspected nodes. Therapeutic lateral neck dissection encompassing levels II-V was systematically performed when intraoperative frozen sections confirmed metastasis in ultrasonography-suspected lymph nodes, adhering to the compartment-oriented dissection principles outlined in the 2015 ATA guidelines.

2.1.5 Establishment and evaluation of predictive models

Patients were randomly divided into training and testing groups in an 8:2 ratio, and characteristic factors were selected within the training set. The least absolute shrinkage and selection operator (LASSO) was employed for variable selection; by shrinking the variable coefficients, it helps mitigate overfitting and addresses severe multicollinearity. Multiple machine learning (ML) classification models were used for a comprehensive analysis, the importance of each indicator in the training group was compared, and the testing group was evaluated using the different models. The optimal model was then evaluated and validated. Shapley additive exPlanations (SHAP) was used to visualize the overall model and the interpretation of individual samples. The specific steps were as follows.

(1) Data division: Using the random number method in SPSS, patients with PTC were randomly divided into a training set and a test set at a ratio of 8:2, with 348 cases in the training set and 84 cases in the test set.

(2) Screening of characteristic factors: First, LASSO regression analysis was performed using R software (glmnet 4.1.8) for variable selection and complexity adjustment. Then, based on the results of the LASSO regression analysis, multifactor logistic regression analysis was conducted using SPSS to identify characteristic factors with statistical significance (p<0.05).

(3) Comprehensive analysis of multiple classification models: Python was used to analyze multiple classification models, including logistic regression, light gradient boosting machine (LightGBM), random forest, adaptive boosting (AdaBoost), decision tree classification, gradient boosting decision tree (GBDT) classification, multilayer perceptron (MLP), support vector machine (SVM), and Gaussian naive Bayes (GNB) methods. Repeated sampling validation was performed using R, with a validation set ratio of 0.3, 10 validation iterations, and a random seed of 42 for model training and validation. The aforementioned parameterized models (with 10 repetitions of sampling) were trained and tested. The importance of indicators in the training and test sets was analyzed across different models, and the optimal model was selected. Python (sklearn 0.22.1) was used to plot the receiver operating characteristic (ROC) curves and calculate the area under the curve (AUC), and R software (rmda 1.6) was used to perform decision curve analysis (DCA). Python (sklearn 0.22.1) was employed to generate the calibration curve for assessing the predictive ability of the model, and to conduct a comprehensive evaluation of the predictive model to verify its usefulness in decision support and broader simulation modeling. Python (sklearn 0.22.1) was also used to plot the precision–recall (PR) curve, which is widely used to evaluate model performance.

(4) Training, validation, and testing of the optimal model: Ten-fold cross-validation was performed on the training set, and the model was evaluated on the test set. Python (sklearn 0.22.1) was used to plot learning curves to assess the model fit and stability of the training and validation sets.

(5) SHAP-based interpretation: Python (shap 0.39.0) was used to analyze model importance and feature contributions. The contribution of each feature to the prediction results was calculated to explain the model’s outputs. Additionally, the SHAP values were generated for individual samples, and predictive performance was evaluated.

Data from Changzhi Heping Hospital were employed to externally validate the optimal model. Linear predictors (LPs) were first calculated for the external validation dataset. The model was then evaluated against the external dataset, and both ROC and calibration curves were generated to assess the generalization capability of our model.
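The linear predictor used in external validation can be illustrated with a short sketch. For a logistic model, the LP is the intercept plus the weighted sum of the predictors, and the predicted probability is the logistic transform of the LP. The intercept, coefficients, and feature names below are hypothetical placeholders chosen for illustration; they are not the fitted values from this study.

```python
import math

# Hypothetical intercept and coefficients (illustration only, not the study's estimates)
INTERCEPT = -1.0
COEFFICIENTS = {"male_sex": 0.8, "age_lt_45": 0.6, "nodule_size_cm": 1.2}


def linear_predictor(patient):
    """LP = intercept + sum of coefficient * feature value."""
    return INTERCEPT + sum(beta * patient[name] for name, beta in COEFFICIENTS.items())


def predicted_probability(patient):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient)))


# A hypothetical external-validation patient
patient = {"male_sex": 1, "age_lt_45": 1, "nodule_size_cm": 1.0}
lp = linear_predictor(patient)        # -1.0 + 0.8 + 0.6 + 1.2
prob = predicted_probability(patient)
print(round(lp, 2), round(prob, 3))
```

Because the coefficients are frozen from the training data, applying them unchanged to a new cohort tests the model itself rather than a refitted version of it.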

2.2 Statistical analysis

SPSS 23.0, Python (version 3.4.3), R (version 3.6.1), and Free Statistics version 2.1 (http://www.clinicalscientists.cn/freestatistics/, Beijing, China) were used to analyze the data. All clinical data and ultrasound features in the training and test groups were compared at baseline. Because continuous variables did not follow a normal distribution, they were expressed as median [IQR], while categorical variables were expressed as n (%). Continuous variables were compared between groups using the Mann-Whitney U test, and categorical variables were compared using the chi-square test. p<0.05 was considered statistically significant.
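As a worked illustration of the categorical comparison, the sketch below computes the Pearson chi-square statistic for a 2×2 table without continuity correction. The counts are hypothetical, and the study itself used SPSS rather than this code. For one degree of freedom, the p-value equals erfc(sqrt(χ²/2)), so no statistics library is needed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p_value


# Hypothetical counts: rows = training/test group, columns = feature present/absent
chi2, p_value = chi_square_2x2(30, 20, 20, 30)
print(round(chi2, 3), round(p_value, 4))
```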

3 Results

3.1 Comparison of baseline data

A total of 432 patients were randomly divided into training and testing groups of 348 and 84 patients, respectively, using the random number method (8:2 test ratio). There was no significant difference in the baseline data between the two groups (p>0.05) (Table 1). In the subgroups of CLNM and LCLNM, only the size of the nodules demonstrated a significant difference (as shown in Supplementary Table S1).

Table 1. Comparison of baseline characteristics between the two groups.

3.2 Risk factor screening for lymph node metastasis in patients with papillary thyroid carcinoma

LASSO regression analysis was performed on the aforementioned independent variables, with cervical lymph node metastasis as the dependent variable, to establish a LASSO regression model (Figure 1). The selected λ value was 0.033 (Figure 1B), which reduced the number of independent variables from 21 to 11: nodule size, age, sex, FT4, margin, adjacent capsule, capsular invasion, number, coarse calcifications, microcalcifications, and nodule shape. To further control for the influence of confounding factors, a multivariate logistic regression analysis was conducted on the 11 independent variables. Only nodule size, age, sex, margin, adjacent capsule, capsular invasion, number, coarse calcifications, and nodule shape were identified as significant risk factors (Table 2). Receiver operating characteristic (ROC) curves were used to assess the predictive value of age and nodule size for lymph node metastasis. The area under the curve (AUC) values for patient age and maximum nodule diameter were 0.615 and 0.725, respectively. The optimal cutoff values were 45.0 years for age and 0.75 cm for the nodule diameter (Table 3).

Figure 1. Feature selection using LASSO regression analysis. (A) Coefficient profile plot. (B) LASSO regression cross-validation curve, marking the λ with the minimum mean squared error (lambda = 0.011) and the λ within one standard error of the minimum (lambda = 0.033).
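The two λ values in the cross-validation panel can be read as the usual pair reported by glmnet: the λ at the minimum cross-validated error, and the largest λ whose error lies within one standard error of that minimum. That reading is an assumption here, not something the text states outright. The sketch below shows how the two values are picked from a cross-validation curve; the curve itself is hypothetical and was constructed only so that the result reproduces 0.011 and 0.033.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation error curve."""
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    # Largest lambda (strongest shrinkage) whose error stays within one SE of the minimum
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)
    return lambdas[best], lambda_1se


# Hypothetical cross-validation curve (not the study's data)
lambdas = [0.003, 0.011, 0.020, 0.033, 0.050]
cv_mean = [0.190, 0.180, 0.183, 0.187, 0.200]
cv_se = [0.008, 0.008, 0.008, 0.008, 0.008]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Choosing the one-standard-error value trades a small loss in cross-validated error for a sparser model, which matches the aim of reducing the candidate variables.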

Table 2. Multivariate logistic regression for cervical lymph node metastasis.

Table 3. Diagnostic performance of age and nodule diameter.
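The text does not state how the optimal cutoff values were chosen. A common choice is the Youden index, J = sensitivity + specificity − 1, and the sketch below assumes that criterion. The diameters and labels are hypothetical and do not reproduce the 0.75 cm cutoff reported here.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A case is predicted positive when its value is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical nodule diameters (cm) and metastasis labels (1 = metastasis)
diameters = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.5]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

cutoff, sensitivity, specificity = youden_cutoff(diameters, labels)
print(cutoff, sensitivity, specificity)
```

For a predictor where lower values indicate higher risk, such as age in this study, the values would be negated before calling the function.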

3.3 Comprehensive analysis of the classified multi-model

The predictive performance of various machine learning models was assessed using the AUC. The results indicated that in the training cohort, Random Forest, LightGBM, and GBDT performed the best, whereas in the testing cohort, Logistic Regression was optimal (Figures 2a, b). The clinical applicability of different models was further evaluated using DCA (Figure 2c), calibration curves (Figure 2d), and PR curves. Calibration curves demonstrated that the Logistic Regression model exhibited higher predictive accuracy. In the PR analysis, the logistic model performed well in the training set and showed the best performance in the test set, where it achieved the highest AP value (Figures 2e, f). A comprehensive analysis indicated that the logistic model could be considered optimal.

Figure 2. Comprehensive analysis of ML (Machine Learning) models. (a) Training set ROC and AUC, and (b) test set ROC and AUC. Papillary thyroid carcinoma patients were sampled 10 times in an 8:2 ratio. (c) Test set DCA, where the black dashed line represents the assumption that all patients have cervical lymph node metastasis, and the red dashed line and thin black line represent the assumption that no patients have cervical lymph node metastasis. The remaining solid lines represent different models. (d) Test set calibration curves, with the horizontal axis representing the average predicted probability and the vertical axis representing the actual probability of the event. The dashed diagonal line is the reference line, and other smooth solid lines are the fitting lines for different models. The closer the fitting line is to the reference line and the smaller the value in parentheses, the more accurate the model’s predicted values. (e) Training set PR curve and AP, and (f) test set PR curve and AP, where the y-axis is precision and the x-axis is recall. If the PR curve of one model is completely covered by the PR curve of another model, it can be concluded that the latter is superior to the former. The higher the AP value, the better the model performance. Different colors in the figure represent the corresponding models.
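The quantity plotted on the vertical axis of a decision curve is the net benefit at a given threshold probability p_t: NB = TP/n − (FP/n) × p_t/(1 − p_t). The sketch below computes it for a model and for the "treat all" reference strategy. The predicted probabilities and labels are hypothetical; the study used the R package rmda rather than this code.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted probability is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is assumed to have metastasis."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predictions and outcomes
probabilities = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

nb_model = net_benefit(probabilities, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(round(nb_model, 4), round(nb_all, 4))
```

The "treat none" strategy has a net benefit of zero at every threshold, so a model is useful over the range of thresholds where its curve lies above both reference lines.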

3.4 Construction and evaluation of the optimum model

Logistic regression with 10-fold cross-validation was performed on the training set, and the results were verified using the test set. The mean AUCs of the training, validation, and testing groups were 0.888, 0.866, and 0.798, respectively (Figures 3A–C). The AUCs of the three groups eventually stabilized at approximately 0.85, and the model predictions were accurate. Because the AUC of the test set was lower than that of the validation set by less than 10%, the model fitting was considered successful. The learning curve indicated that the training and validation sets had a strong fit and high stability (Figure 3D). These outcomes suggest that the logistic regression model can be used for classification modeling tasks in the dataset. In different subgroups, the model exhibited excellent discrimination, with AUC values of 0.879 for CLNM and 0.781 for LCLNM (Supplementary Table S2; Supplementary Figure S1). The model achieved an AUC of 0.876 in the external validation dataset, confirming its robustness and generalizability (Supplementary Figure S2).
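The AUC comparison can be made concrete with a short sketch. The AUC equals the proportion of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The relative gap between two AUCs is computed here as a fraction of the larger one, which is one plausible reading of the 10% criterion and is an assumption. The scores and labels are hypothetical; the two AUC values passed to `relative_gap` are those reported for the validation and test sets.

```python
def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def relative_gap(auc_reference, auc_other):
    """Drop from the reference AUC, expressed as a fraction of the reference."""
    return (auc_reference - auc_other) / auc_reference


# Hypothetical scores and outcomes
example_auc = auc([0.9, 0.8, 0.6, 0.4, 0.3, 0.1], [1, 1, 0, 1, 0, 0])

# Validation and test AUCs as reported in the text
gap = relative_gap(0.866, 0.798)
print(round(example_auc, 3), round(gap, 3))
```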

Figure 3. Training, validation, and testing of the logistic regression model. (A) Training set ROC and AUC, and (B) validation set ROC and AUC. Training and cross-validation were conducted on 10% of PTC patients, with different colored solid lines representing different outcomes. (C) Test set ROC and AUC. (D) Learning curve. The red dashed line represents the training set, and the blue dashed line represents the validation set.

3.5 SHAP to model visual explanations

Figure 4A employs SHAP to interpret the role of the nine variables in our model in predicting the status of lymph node metastasis in PTC. Each dot represents the attribution of a feature to the outcome for one patient; red dots represent high feature values, and blue dots represent low feature values. The occurrence of CLNM increased with capsule invasion, age < 45 years, the presence of multiple nodules, nodule size, and male sex. Figure 4B shows the ranking of the nine risk factors by mean absolute SHAP value, with the SHAP values on the x-axis indicating the importance of each feature to the predictive model. Moreover, we provide a typical CLNM+ case to illustrate the interpretability of the model. The SHAP predictive score was 0.96. The contribution of each feature is shown; red stripes indicate features that increase the predicted risk, and blue stripes indicate features that decrease it (Figure 4C).

Figure 4. SHAP model interpretation. (A) Feature attributes in SHAP. Each row represents a feature, and the horizontal axis is the SHAP value. Red dots indicate higher feature values, and blue dots indicate lower feature values. (B) Feature importance ranking as shown by SHAP. The plot describes the importance of each covariate in the development of the final predictive model. (C) A specific case demonstration.
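For a linear model such as logistic regression, SHAP values on the log-odds scale have a closed form when the features are treated as independent: the contribution of feature i is its coefficient multiplied by the difference between the patient's value and the background mean. The sketch below illustrates that property without the shap package. The coefficients, intercept, and patients are hypothetical and are not taken from the fitted model.

```python
def linear_shap(intercept, weights, background, x):
    """Return (base_value, contributions) on the log-odds scale for a linear model.

    Assumes independent features, so phi_i = w_i * (x_i - mean_i).
    """
    n = len(background)
    means = {name: sum(row[name] for row in background) / n for name in weights}
    base_value = intercept + sum(weights[name] * means[name] for name in weights)
    contributions = {name: weights[name] * (x[name] - means[name]) for name in weights}
    return base_value, contributions


# Hypothetical model and background data (illustration only)
intercept = -2.0
weights = {"capsule_invasion": 1.5, "male_sex": 0.8, "nodule_size_cm": 1.2}
background = [
    {"capsule_invasion": 0, "male_sex": 0, "nodule_size_cm": 0.5},
    {"capsule_invasion": 1, "male_sex": 0, "nodule_size_cm": 1.0},
    {"capsule_invasion": 0, "male_sex": 1, "nodule_size_cm": 0.6},
    {"capsule_invasion": 1, "male_sex": 1, "nodule_size_cm": 1.5},
]
patient = background[3]

base_value, contributions = linear_shap(intercept, weights, background, patient)
log_odds = intercept + sum(weights[name] * patient[name] for name in weights)
print(round(base_value, 2), {k: round(v, 2) for k, v in contributions.items()})
```

The contributions add up to the difference between the patient's log-odds and the base value, which is the additivity property that the single-case plot visualizes.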

4 Discussion

The incidence rate of PTC has increased in recent years, particularly for tumors with a maximum diameter of <1 cm (7). Traditional surgery involves prophylactic central compartment lymph node dissection; however, unnecessary central compartment lymph node dissection can lead to recurrent laryngeal nerve injury. Moreover, the surgical scope for the lateral cervical lymph nodes is larger, which can result in more complications and reduce the quality of life of patients after surgery (8). In this study, the rate of cervical lymph node metastasis was as high as 46%. Therefore, effective preoperative prediction of lymph node metastasis can help guide the development of clinical surgical protocols.

Research on ML has always been a key focus in the medical field. We employed several ML models and found that the logistic model generally outperforms the others, based on analyses of the AUC, DCA, PR curves, and calibration curves. However, it is difficult for clinicians to explain ML models more accurately and intuitively. Therefore, we developed a regression model using the SHAP method. The advantage of the SHAP is its ability to provide a fair, transparent, and comprehensible method for quantifying the specific contributions of each feature to a model’s predictions, thereby enhancing the interpretability and credibility of the model.

In this study, the SHAP values indicated that capsule invasion, nodule multiplicity, and nodule size are important predictors of cervical lymph node metastasis (CLNM), which is consistent with previous findings (4, 9). Consequently, it is advisable to consider the increased risk of lymph node metastasis in patients with multiple PTCs invading the capsule. In addition, clinical features such as male sex, age < 45 years, and a maximum diameter >1.0 cm have been identified as high-risk factors for lymph node metastasis (10–12). This study also identified age and sex as independent risk factors for CLNM. Furthermore, we calculated the cut-off value of thyroid nodule size for predicting CLNM. Notably, we found it to be 0.75 cm, which differs from the previously considered 1 cm threshold. This result indicates that the occurrence of CLNM should be carefully considered in patients who are male, <45 years, and have a nodule diameter greater than 0.75 cm, to prevent inadequate treatment. An irregular nodule shape is a sign of malignancy: extensive interstitial fibrosis within the thyroid cancer nodule causes the papillary structures to pull on each other, leading to irregular forms. Malignant nodules exhibit infiltrative growth, often resulting in unclear boundaries. Therefore, when a nodule is close to or breaks through the capsule, even without an irregular shape, lymph node dissection should be considered in clinical practice.

Studies have shown that serum markers can predict the biological behavior of lymph node metastases and PTC (13). Hu et al. (14) confirmed that FT4 and TSH levels are positively associated with PTC. Serological studies on autoimmune antibodies have revealed that the incidence of HT has increased significantly in patients with thyroid cancer in recent years (15, 16); however, different researchers have reached conflicting conclusions regarding the relationship between HT background and the occurrence of lymph node metastases (17–19). Our study revealed that an HT background is not a risk factor for lymph node metastasis in patients with thyroid cancer, which is inconsistent with the findings of Li, suggesting that the correlation between high TgAb and TPOAb levels and the occurrence of lymph node metastasis in patients with thyroid cancer needs to be further explored. An autoimmune response in HT can lead to increased TSH levels, which may promote the growth and invasion of PTC (20, 21). However, no correlation was found between lymph node metastasis and TSH levels in this study; this question underlies the controversy among doctors of different clinical specialties regarding TSH suppression in the treatment of intermediate-risk PTC with HT. Accordingly, additional high-quality studies are warranted to inform and optimize the clinical management of thyroid cancer. Although some scholars have suggested that the BRAF^V600E mutation is a risk factor for lymph node metastasis (22), most studies do not support this finding, which is the current mainstream conclusion (23). Our study revealed statistically significant differences in BRAF^V600E mutation among various lymph node metastasis subgroups; however, this characteristic did not emerge as an independent risk factor for CLNM. Nevertheless, BRAF^V600E mutation detection has important adjunctive diagnostic value in distinguishing benign from malignant thyroid nodules (24).

Our study had several limitations. First, this was a retrospective study, and selection bias was unavoidable. Second, the number of patients with PTC included in this study was relatively small. Thus, although high consistency in the reproducibility analysis of the training and test groups was achieved, there may be some unavoidable errors due to the uncertainty of data segmentation. In the future, we will include more cases for further verification. The interpretation of ultrasound image features largely depends on the operator’s scanning habits and the diagnostician’s experience. Therefore, the presence of subjective factors may have affected the final data.

5 Conclusion

In summary, we constructed a predictive model based on the ML model and found that the logistic model performed better in this study. Our model exhibits robust predictive performance across various subgroups of lymph node metastasis, highlighting its potential utility in clinical settings. In addition, we performed personalized risk assessment for cervical lymph node metastasis in patients with PTC. This effective computer-assisted method can further help clinicians and patients to identify the occurrence of lymph node metastasis and provides a foundation for developing strategies to guide surgical decisions and enhance patient prognosis prior to clinical surgery.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by the Ethics Committee of the First Hospital of Shanxi Medical University. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

YH: Software, Writing – original draft. YZ: Writing – original draft. YS: Writing – original draft. LL: Funding acquisition, Methodology, Project administration, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. Key Research and Development Project of Shanxi Province (201903D321190).

Acknowledgments

This thesis would never have materialized without help and support from many parties. First, I would like to express my sincere gratitude to my counterparts at Shanxi Medical University, China, Professor Chen Yaodong and Gao Feng, who gave me a great deal of useful and constructive advice on my thesis. With their professional and academic knowledge, they taught me how to do research and how to revise the thesis. Whenever I sent them an e-mail concerning my thesis, they replied quickly. They put a great deal of time into reading and correcting my thesis. Only under their guidance and encouragement could I finish this thesis. I am also indebted to all my colleagues in our research group. As a clinical worker myself, I learned much about scientific research methods from them, which is helpful to my job. Finally, I want to thank my husband and other family members and relatives, who expressed their concern and support when I pursued my studies.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2025.1551108/full#supplementary-material

Supplementary Figure 1 | The diagnostic efficacy of optimal model across different subgroups. (A) The ROC curve of the optimal model in predicting the subgroup with CLNM. (B) The ROC curve of the optimal model in predicting the subgroup with LCLNM.

Supplementary Figure 2 | Model performance on the external validation set. (A) ROC curve in the training set, (B) ROC curve in the external validation set, and (C) calibration curve.

References

1. Miranda-Filho A, Lortet-Tieulent J, Bray F, Cao B, Franceschi S, Vaccarella S, et al. Thyroid cancer incidence trends by histology in 25 countries: a population-based study. Lancet Diabetes Endocrinol. (2021) 9:225–34. doi: 10.1016/S2213-8587(21)00027-9

2. Wang W, Ding Y, Jiang W, and Li X. Can cervical lymph node metastasis increase the risk of distant metastasis in papillary thyroid carcinoma? Front Endocrinol (Lausanne). (2022) 13:917794. doi: 10.3389/fendo.2022.917794

3. Zhao F, Wang P, Yu C, Song X, Wang H, Fang J, et al. A LASSO-based model to predict central lymph node metastasis in preoperative patients with cN0 papillary thyroid cancer. Front Oncol. (2023) 13:1034047. doi: 10.3389/fonc.2023.1034047

4. Chen SP, Jiang X, Zheng WW, and Luo YL. Correlation between sonographic features and central neck lymph node metastasis in solitary solid papillary thyroid microcarcinoma with a taller-than-wide shape. Diagn (Basel). (2023) 13:949. doi: 10.3390/diagnostics13050949

5. Lei T, Guo J, Wang P, Zhang Z, Niu S, Zhang Q, et al. Establishment and validation of predictive model of tophus in gout patients. J Clin Med. (2023) 12:1755. doi: 10.3390/jcm12051755

6. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American thyroid association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the american thyroid association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid. (2016) 26:1–133. doi: 10.1089/thy.2015.0020

7. Gao X, Luo W, He L, Cheng J, and Yang L. Predictors and a prediction model for central cervical lymph node metastasis in papillary thyroid carcinoma (cN0). Front Endocrinol (Lausanne). (2021) 12:789310. doi: 10.3389/fendo.2021.789310

8. Yu J, Deng Y, Liu T, Zhou J, Jia X, Xiao T, et al. Lymph node metastasis prediction of papillary thyroid carcinoma based on transfer learning radiomics. Nat Commun. (2020) 11:4807. doi: 10.1038/s41467-020-18497-3

9. Mattingly AS, Noel JE, and Orloff LA. A closer look at “taller-than-wide” thyroid nodules: examining dimension ratio to predict malignancy. Otolaryngol Head Neck Surg. (2022) 167:236–41. doi: 10.1177/01945998211051310

10. Lin P, Liang F, Ruan J, Han P, Liao J, Chen R, et al. A preoperative nomogram for the prediction of high-volume central lymph node metastasis in papillary thyroid carcinoma. Front Endocrinol (Lausanne). (2021) 12:753678. doi: 10.3389/fendo.2021.753678

11. Ulisse S, Baldini E, Lauro A, Pironi D, Tripodi D, Lori E, et al. Papillary thyroid cancer prognosis: an evolving field. Cancers (Basel). (2021) 13:5567. doi: 10.3390/cancers13215567

12. Feng Y, Min Y, Chen H, Xiang K, Wang X, and Yin G. Construction and validation of a nomogram for predicting cervical lymph node metastasis in classic papillary thyroid carcinoma. J Endocrinol Invest. (2021) 44:2203–11. doi: 10.1007/s40618-021-01524-5

13. Zhao L, Zhou T, Zhang W, Wu F, Jiang K, Lin B, et al. Blood immune indexes can predict lateral lymph node metastasis of thyroid papillary carcinoma. Front Endocrinol (Lausanne). (2022) 13:995630. doi: 10.3389/fendo.2022.995630

14. Hu MJ, Zhang C, Liang L, Wang SY, Zheng XC, Zhang Q, et al. Fasting serum glucose, thyroid-stimulating hormone, and thyroid hormones and risk of papillary thyroid cancer: A case-control study. Head Neck. (2019) 41:2277–84. doi: 10.1002/hed.25691

15. Ferrari SM, Fallahi P, Elia G, Ragusa F, Ruffilli I, Paparo SR, et al. Thyroid autoimmune disorders and cancer. Semin Cancer Biol. (2020) 64:135–46. doi: 10.1016/j.semcancer.2019.05.019

16. McLeod DSA, Bedno SA, Cooper DS, Hutfless SM, Ippolito S, Jordan SJ, et al. Pre-existing thyroid autoimmunity and risk of papillary thyroid cancer: A nested case-control study of US active-duty personnel. J Clin Oncol. (2022) 40:2578–87. doi: 10.1200/JCO.21.02618

17. Tan L, Ji J, Sharen G, Liu Y, and Lv K. Related factor analysis for predicting large-volume central cervical lymph node metastasis in papillary thyroid carcinoma. Front Endocrinol (Lausanne). (2022) 13:935559. doi: 10.3389/fendo.2022.935559

18. Ni Y, Wang T, Wang X, Tian Y, Wei W, and Liu Q. Clinical features of multifocal papillary thyroid carcinoma and risk factors of cervical metastatic lymph nodes. Zhejiang Da Xue Xue Bao Yi Xue Ban. (2022) 51:225–32. doi: 10.3724/zdxbyxb-2021-0389

19. Li X, Zhang H, Zhou Y, and Cheng R. Risk factors for central lymph node metastasis in the cervical region in papillary thyroid carcinoma: a retrospective study. World J Surg Oncol. (2021) 19:138. doi: 10.1186/s12957-021-02247-w

20. Demircioglu ZG, Demircioglu MK, Aygun N, Akgun IE, Unlu MT, Kostek M, et al. Relationship between thyroid-stimulating hormone level and aggressive pathological features of papillary thyroid cancer. Sisli Etfal Hastan Tip Bul. (2022) 56:126–31. doi: 10.14744/SEMB.2022.14554

21. Liu Y, Lv H, Zhang S, Shi B, and Sun Y. The impact of coexistent Hashimoto’s thyroiditis on central compartment lymph node metastasis in papillary thyroid carcinoma. Front Endocrinol (Lausanne). (2021) 12:772071. doi: 10.3389/fendo.2021.772071

22. Zhang X, Zhang X, Du W, Dai L, Luo R, Fang Q, et al. Fine needle biopsy versus core needle biopsy combined with/without thyroglobulin or BRAF 600E mutation assessment for detecting cervical nodal metastasis of papillary thyroid carcinoma. Front Endocrinol (Lausanne). (2021) 12:663720. doi: 10.3389/fendo.2021.663720

23. Volpi EM, Ramirez-Ortega MC, and Carrillo JF. Editorial: Recent advances in papillary thyroid carcinoma: Progression, treatment and survival predictors. Front Endocrinol (Lausanne). (2023) 14:1163309. doi: 10.3389/fendo.2023.1163309

24. Du J, Han R, Chen C, Ma X, Shen Y, Chen J, et al. Diagnostic efficacy of ultrasound, cytology, and BRAF(V600E) mutation analysis and their combined use in thyroid nodule screening for papillary thyroid microcarcinoma. Front Oncol. (2021) 11:746776. doi: 10.3389/fonc.2021.746776

Keywords: papillary thyroid cancer, lymphonodi cervicales metastasis, machine learning, prediction model, SHAP

Citation: Hao Y, Zhang Y, Su Y and Liu L (2025) Construction and validation of a predictive model for lymph node metastasis in patients with papillary thyroid carcinoma. Front. Endocrinol. 16:1551108. doi: 10.3389/fendo.2025.1551108

Received: 24 December 2024; Accepted: 26 May 2025; Published: 09 June 2025.

Edited by:

Wenlin Yang, University of Florida, United States

Reviewed by:

Hanlin Zhu, Hangzhou Ninth People’s Hospital, China
Ruocen Song, University of Florida, United States

Copyright © 2025 Hao, Zhang, Su and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Liping Liu, liuliping1600@sina.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
