Application progress of artificial intelligence in managing thyroid disease

Lu, Qing; Wu, Yu; Chang, Jing; Zhang, Li; Lv, Qing; Sun, Hui

doi:10.3389/fendo.2025.1578455

REVIEW article

Front. Endocrinol., 17 June 2025

Sec. Thyroid Endocrinology

Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1578455

Qing Lu¹†

Yu Wu¹†

Jing Chang¹

Li Zhang¹*

Qing Lv¹*

Hui Sun²*

¹Department of Ultrasound, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
²Department of Endocrinology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China

Artificial intelligence (AI) has been used to study thyroid diseases since the 1990s. Previously, it mainly concentrated on the diagnosis of thyroid function and distinguishing benign from malignant thyroid nodules. With the rapid development of machine and deep learning, AI has been widely used in multiple areas of thyroid disease management, including image analysis, pathological diagnosis, personalized treatment, patient monitoring, and follow-up. This review systematically examines the evolution of AI applications in thyroid disease management since the 1990s, with a focus on diagnostic innovations, therapeutic personalization, and emerging challenges in clinical implementation. AI not only reduces the subjectivity associated with ultrasound examinations but also enhances the differentiation rate of benign and malignant thyroid nodules, thereby reducing the frequency of unnecessary fine-needle aspirations. AI synthesizes multimodal data, such as ultrasound, electronic health records, and wearable sensors, for continuous health monitoring. This integration facilitates the early detection of subclinical recurrence risk, particularly in patients who have undergone thyroidectomy. Despite the broad prospects of AI applications, challenges related to data privacy, model interpretability, and clinical applicability remain. This review critically evaluates studies across the ultrasound, CT/MRI, and histopathology domains, while addressing barriers to clinical translation, such as data heterogeneity and ethical concerns.

1 Introduction

The origins of artificial intelligence (AI) can be traced back to the 1950s, when researchers first sought to simulate human thought and decision-making processes (1). With the rapid advancement of computer technology, AI applications have expanded, notably in medical image analysis, where AI has been integrated into computer-aided diagnosis systems to detect and evaluate abnormal structures (2). In the context of thyroid disease research, which began in the 1990s, early AI applications primarily focused on assessing thyroid function (3) and analyzing ultrasound images to assist clinicians in differentiating between benign and malignant nodules (4).

Thyroid nodules are common in the general population. Approximately 20% of individuals have palpable thyroid nodules on physical examination, and up to 50% present with nodules on imaging. However, only 5% to 15% of these cases are malignant (5). Fine-needle aspiration (FNA) biopsy is the gold standard for the preoperative diagnosis of thyroid cancer. With current diagnostic methods, 20%–30% of thyroid nodules are cytologically indeterminate, and the false-negative rate is 3% to 5%, depending on cytological interpretation and nodule characteristics (6, 7). The major clinical challenge in the management of thyroid nodules therefore remains the diagnosis of thyroid cancer. A large multicenter correlation study found a 34% malignancy rate for FNAs with indeterminate cytology (8). Moreover, the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) risk stratification system is relatively complex to apply in clinical practice and has limited diagnostic specificity (44%–67.3%) (9). Clinicians require additional tools to reduce overdiagnosis and avoid unnecessary surgeries.

In the 21st century, rapid advancements in machine and deep learning have created transformative opportunities for AI applications in thyroid disease management. The latest deep-learning algorithms have markedly enhanced image-processing capabilities, allowing AI to analyze complex ultrasound images with greater accuracy and thereby improve diagnostic sensitivity and specificity (10). For instance, studies indicate that AI-assisted ultrasound diagnostic systems can achieve accuracy rates exceeding 90% for identifying thyroid nodules, significantly surpassing traditional diagnostic methods (11). Compared with ACR TI-RADS, AI combined with radiomics can reduce the rate of unnecessary FNA biopsies from 30.0% to 4.5% in the validation dataset and from 37.7% to 4.7% in the test dataset (9). AI systems can identify subtle changes in cellular morphology and tissue structure (12, 13), improving the diagnostic accuracy of FNA biopsies (14, 15). In a comparison between AI and human experts, the AI model exceeded the average expert cytopathologist by more than two standard deviations (accuracy 99.71% vs. 88.91%, sensitivity 99.81% vs. 87.26%, and specificity 99.61% vs. 90.58%) (16).
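
The accuracy, sensitivity, and specificity figures quoted above all derive from a 2×2 table that compares a test result with the reference diagnosis. The following is a minimal Python sketch of those definitions. The class name and the counts are hypothetical and are not taken from any cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary test compared with a reference diagnosis."""
    tp: int  # malignant nodules called malignant
    fp: int  # benign nodules called malignant
    fn: int  # malignant nodules called benign
    tn: int  # benign nodules called benign

    def sensitivity(self) -> float:
        # Fraction of malignant nodules that the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of benign nodules that the test correctly clears.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Fraction of all nodules classified correctly.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Hypothetical counts chosen only for illustration.
example = ConfusionMatrix(tp=90, fp=20, fn=10, tn=80)
print(f"sensitivity={example.sensitivity():.2f}")
print(f"specificity={example.specificity():.2f}")
print(f"accuracy={example.accuracy():.2f}")
```

Because accuracy pools both classes, it depends on the benign-to-malignant mix of the cohort, which is one reason sensitivity and specificity are usually reported alongside it.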

Currently, AI is extensively applied to various aspects of thyroid disease management, including image analysis (17–20), pathological diagnosis (12, 14, 16, 21, 22), personalized treatment (23, 24), and patient monitoring and follow-up (25, 26) (Figure 1). By leveraging historical case data, AI systems offer considerable advantages in standardized diagnoses, risk assessments, personalized treatments, and patient follow-ups, ultimately providing more accessible and tailored healthcare services.

Figure 1. AI is extensively applied across various aspects of thyroid disease management, including image analysis, pathological diagnosis, personalized treatment, and patient monitoring and follow-up.

In summary, advancements in AI for thyroid disease management exemplify the deep integration of medicine and computer science, presenting significant opportunities to advance personalized healthcare. This study aims to review the current progress of AI applications in thyroid disease and explore future directions for its development.

2 Methods

2.1 Search strategy and inclusion criteria

This review followed the PRISMA guidelines. Databases (PubMed, Scopus, and Web of Science) were searched (2019–2025) using the following keywords: ‘ultrasonography,’ ‘ultrasonics,’ ‘artificial intelligence,’ ‘intelligent learning,’ ‘thyroid nodule,’ ‘thyroid cancer,’ ‘pathology,’ ‘personalized treatment,’ ‘CT,’ and ‘MRI.’ Inclusion criteria: (1) Clinical human studies; (2) validation in ≥50 patients; (3) performance metrics reported. Exclusion criteria: (1) Animal or phantom studies; (2) technical reports without clinical validation. From 1,837 records, 30 studies met the criteria after screening (see the PRISMA flowchart, Figure 2).
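
The eligibility rules above can be expressed as a simple filter. The sketch below is only an illustration of how such screening logic might be encoded; the `Record` fields and the sample records are hypothetical and do not reproduce the authors' screening procedure or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Screening fields for one retrieved record (field names are hypothetical)."""
    title: str
    is_human_clinical: bool        # inclusion (1): clinical human study
    n_patients: int                # inclusion (2): size of the validation cohort
    reports_metrics: bool          # inclusion (3): performance metrics reported
    is_animal_or_phantom: bool     # exclusion (1)
    has_clinical_validation: bool  # exclusion (2) applies when this is False


MIN_PATIENTS = 50  # threshold stated in the inclusion criteria


def is_eligible(record: Record) -> bool:
    """Apply the exclusion criteria first, then all three inclusion criteria."""
    if record.is_animal_or_phantom or not record.has_clinical_validation:
        return False
    return (
        record.is_human_clinical
        and record.n_patients >= MIN_PATIENTS
        and record.reports_metrics
    )


# Made-up records used only to exercise the filter.
records = [
    Record("Study A", True, 120, True, False, True),
    Record("Study B", True, 35, True, False, True),    # too few patients
    Record("Study C", False, 0, True, True, False),    # phantom study
    Record("Study D", True, 400, False, False, True),  # no metrics reported
]
included = [r.title for r in records if is_eligible(r)]
print(included)
```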

Figure 2. PRISMA flowchart of the review.

2.2 Reproducibility

The reproducibility analysis revealed critical gaps: 90% (27/30) of the studies used proprietary datasets with restricted access, and 90% (27/30) did not disclose their preprocessing code. A striking example of this “reproducibility crisis” is the ThyNet model of Peng et al. (27), which achieved 89.1% accuracy in the original publication. However, independent replication attempts by Gild et al. demonstrated a performance decline to 64% (28). Standardized image storage protocols and preprocessing environments are urgently required to enhance reproducibility.
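
One low-cost way to document a preprocessing environment is to publish a manifest alongside a model. The sketch below shows one possible shape for such a manifest, using only the Python standard library. The manifest layout, the file name, and the preprocessing settings are assumptions made for illustration, not a protocol proposed by the studies reviewed here.

```python
import hashlib
import json
import platform
import sys


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw image bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(images: dict[str, bytes], preprocessing: dict) -> dict:
    """Record interpreter, platform, preprocessing settings, and image checksums."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "preprocessing": preprocessing,
        "images": {name: sha256_of(blob) for name, blob in sorted(images.items())},
    }


# Placeholder bytes stand in for a real image file; the settings are hypothetical.
manifest = build_manifest(
    images={"case_001.png": b"placeholder bytes"},
    preprocessing={"resize": [224, 224], "normalize": "zero_mean_unit_var"},
)
print(json.dumps(manifest, indent=2))
```

A replication team can recompute the checksums to confirm that it is working with byte-identical inputs before comparing model performance.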

3 Results

3.1 Diagnostic applications

3.1.1 Imaging analysis (Ultrasound/CT/MRI)

3.1.1.1 Evolution of AI in ultrasound technology and clinical applications

3.1.1.1.1 Early exploration of traditional AI algorithms

AI research on the thyroid dates back to 1993, when Sharpe et al. used artificial neural networks for in vitro thyroid function diagnosis (29). Early studies focused on constructing machine learning models based on ultrasound features manually extracted by radiologists, such as nodule morphology and echogenicity. For example, the thyroid ultrasound computer-aided diagnosis system developed by Choi et al. demonstrated a sensitivity comparable to that of experienced radiologists but exhibited lower specificity (56%) and overall accuracy (17). Similarly, the S-Detect system achieved 95% sensitivity for thyroid cancer diagnosis; however, its insufficient specificity (56%) highlights the risk of overdiagnosis (18). Although these technologies improve diagnostic consistency, they remain reliant on manual annotation or feature extraction, introducing subjectivity, operational complexity, and potential increases in interpretation time, false-positive rates, and false-negative rates (30).

3.1.1.1.2 Revolution in autonomous feature extraction via deep learning

Deep learning, through multilayered neural networks, enables automatic extraction of high-dimensional image features, overcoming the limitations of traditional methods. Examples include the AIBx system developed by Swan et al., which was integrated with TI-RADS classification and significantly reduced the risk of missed malignant nodules (single-center study, 413 nodules; AIBx and TI-RADS false-negative rates: 22% vs. 6%, with no malignant nodules overlooked when both methods concurred on benign classification) (19); the AI-Thyroid model designed by Eun et al., which improved diagnostic accuracy and interobserver consistency, particularly enhancing junior physicians’ performance (AUROC increased from 0.854 to 0.945, sensitivity from 84.2% to 92.7%, specificity from 72.9% to 86.6%; P < .001 for all) (20); and the ThyNet system proposed by Peng et al., which integrated ultrasound images and video data from 23 hospitals (18,049 images) to optimize positive/negative predictive values and reduce unnecessary FNAs (FNAs decreased from 61.9% to 35.2%, while missed malignancies declined from 18.9% to 17.0%) (27). However, most studies rely entirely on hospital-confirmed histopathological data and lack representation of screening populations. Differences in disease prevalence across cohorts may distort the PPV/NPV metrics and compromise generalizability. Additionally, the exclusion of nondiagnostic scans and unresolved multinodular correlations from retrospective datasets introduces methodological bias. Large-scale screening validation remains critical to address these translational gaps.
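
The dependence of PPV and NPV on prevalence follows directly from Bayes' rule. The sketch below holds sensitivity and specificity fixed at the 95% and 56% reported for S-Detect in Section 3.1.1.1.1 and varies only the prevalence. The three prevalence values are assumptions chosen for illustration and are not cohort data from the cited studies.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 0.95, 0.56  # figures quoted for S-Detect in the text

# Assumed prevalences: screening-like, upper end of the 5% to 15% range,
# and a hypothetical surgically enriched hospital cohort.
for prevalence in (0.05, 0.15, 0.50):
    print(
        f"prevalence={prevalence:.2f}  "
        f"PPV={ppv(SENS, SPEC, prevalence):.3f}  "
        f"NPV={npv(SENS, SPEC, prevalence):.3f}"
    )
```

Under these assumptions the same classifier has a PPV of roughly 10% at 5% prevalence and roughly 68% at 50% prevalence, which is why predictive values measured in hospital cohorts should not be carried over to screening populations without adjustment.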

Deep learning not only achieves benign–malignant nodule classification (AUC: 0.90) (31) but also synergizes with radiomics to extract quantitative features (including shape, texture, and intensity) for refined clinical decision-making. Examples include metastasis prediction: Yu et al.’s radiomics model predicted lymph node metastasis in thyroid cancer with an AUC of 0.90 (n = 1,013) (31); genomic and prognostic analysis: ultrasound features correlated with tumor phenotypes or genetic mutations (n = 96) (32), while multimodal models localized primary cancer sites in metastatic lymph nodes (n = 280) (33); and treatment optimization: radiomics-clinical integrated models reduced unnecessary central lymph node dissections (34, 35) and assessed disease-free survival (36). Most AI validations depend on single-center retrospective data and lack large-scale, multicenter prospective validations.
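
An AUC such as the 0.90 quoted above can be read as the probability that a randomly chosen malignant case receives a higher model score than a randomly chosen benign case. The sketch below computes the AUC directly from that pairwise definition. The scores are hypothetical and the function is an illustration, not the evaluation code of any cited study.

```python
def auc(scores_pos: list[float], scores_neg: list[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs for four malignant and four benign nodules.
malignant_scores = [0.9, 0.8, 0.6, 0.4]
benign_scores = [0.7, 0.3, 0.2, 0.1]
print(auc(malignant_scores, benign_scores))  # 14 of 16 pairs ranked correctly
```

The double loop is quadratic in the number of cases, which is acceptable for a demonstration; rank-based formulas give the same value more efficiently on large cohorts.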

3.1.1.1.3 Clinical value of AI-TI-RADS

A retrospective analysis of 2,061 thyroid nodules (sampled via FNA or surgery) was used to develop the AI-TI-RADS classification model. Compared to the conventional ACR TI-RADS, AI-TI-RADS demonstrated superior specificity (70.2% vs. 49.2%) and biopsy avoidance rates (42.3%), while maintaining comparable sensitivity (82.2% vs. 86.7%) (37). This disparity underscores the need to balance sensitivity and specificity based on clinical scenarios (37).
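
One simple way to summarize the trade-off between sensitivity and specificity is Youden's J statistic, J = sensitivity + specificity − 1. The cited study is not stated to have used this index; it is introduced here only as an illustrative summary of the figures reported above.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J: 0 for an uninformative test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the text (37).
reported = {
    "AI-TI-RADS": (0.822, 0.702),
    "ACR TI-RADS": (0.867, 0.492),
}
for name, (sens, spec) in reported.items():
    print(f"{name}: J = {youden_index(sens, spec):.3f}")
```

With these figures, J is about 0.52 for AI-TI-RADS and about 0.36 for ACR TI-RADS. The index weights missed cancers and unnecessary biopsies equally, so a clinical setting that penalizes missed malignancies more heavily may still favor the more sensitive system.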

3.1.1.2 AI advancements in CT and MRI

Although ultrasound remains the primary imaging modality for thyroid disorders, CT is indispensable in complex cases, such as the assessment of tumor invasiveness. The AI system developed by Wang et al. predicted preoperative cervical lymph node metastasis in thyroid cancer using CT images and outperformed senior radiologists in sensitivity and accuracy. When combined with radiologists, AI further enhances diagnostic efficacy, demonstrating its utility in surgical planning (38). MRI, with its high soft tissue resolution, offers unique advantages for assessing extrathyroidal extension. A radiomics study (n = 132) identified 16 key features from multiparametric MRI data, constructing a predictive model for extrathyroidal extension with an AUC of 0.87 (39). However, these studies involved moderate sample sizes, necessitating larger cohorts to improve predictive efficiency. Additionally, deep learning-based segmentation of thyroid lesions on CT or MRI remains unexplored in the literature. Tumors <0.5 cm in diameter were excluded because of unreliable identification and segmentation on CT or MRI images.

3.1.2 Pathology support

The earliest applications of AI in the pathological analysis of thyroid diseases date back to the 1990s, when AI was primarily used for basic image recognition and classification. Researchers began exploring computer-assisted techniques for analyzing pathological slides; however, the limitations of the technology restricted its application (40).

In the 21st century, the rapid development of deep learning has significantly advanced the application of AI in pathological analysis. The introduction of convolutional neural networks has enabled AI to effectively process and analyze high-resolution pathological images. Research during this period has focused on automated tumor detection and classification, particularly in the diagnosis of thyroid cancer, where AI systems can identify subtle changes in cellular morphology and tissue structure (12, 13). Guan et al. established a pathology-validated dataset from 279 cytological images of thyroid nodules and used it to train and test the VGG-16 and Inception-v3 deep convolutional neural network (DCNN) models. They found that the VGG-16 model showed significant potential to enhance the diagnosis of papillary thyroid carcinoma (PTC) in cytological images. In fragment images, the contours, perimeter, area, and average pixel intensity of PTC cells were all greater than those of benign nodules (12). FNA biopsy remains the gold standard for the preoperative diagnosis of malignant tumors. However, approximately 10%–30% of thyroid nodules yield inconclusive results, with 10%–40% of those cases subsequently confirmed to be malignant (41). Zhao et al. found that the DCNN ResNeSt achieved high sensitivity in diagnosing malignancies in these ambiguous atypical nodules. The ResNeSt model achieved an accuracy of 92.49% (160/173) on fragment images and 84.78% (39/46) in distinguishing PTC from benign nodules in ambiguous cases. The sensitivity and specificity of the ResNeSt model were 95.79% and 88.46%, respectively. Malignant nodules exhibited larger and more deeply stained nuclei than benign nodules (14).
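
Accuracies computed from small samples, such as 39/46, carry considerable statistical uncertainty. The sketch below reproduces the two reported percentages from their counts and attaches a 95% Wilson score interval to each. The interval is an addition made here for illustration and was not reported in the cited study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1.0 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts as reported in the text (14).
for label, correct, total in (("fragment images", 160, 173), ("ambiguous cases", 39, 46)):
    low, high = wilson_interval(correct, total)
    print(f"{label}: {100 * correct / total:.2f}% (95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

For the 46 ambiguous cases the interval spans roughly 72% to 92%, which illustrates why larger validation cohorts are needed before such figures are used for clinical decisions.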

The development of AI-assisted algorithms using digital cytology images has been significantly impeded by technical challenges and a shortage of optimized scanners for cytology specimens (42). In the study by Guan et al., all three false-positive fragment cases showed large nuclei with high mean pixel color information similar to that of malignant cells. However, cytopathologists considered these images representative of typical benign nodules. The authors suggested that the DCNN based its diagnosis on nucleus size and staining intensity rather than shape. Future studies should focus on training the networks to differentiate between cellular and nuclear morphologies (12). Additionally, current DCNN models require sufficient sample sizes; smaller datasets risk overfitting. Rare thyroid cancer histopathologies, such as follicular thyroid cancer (FTC) and Hürthle cell carcinoma, remain difficult to diagnose. Wai-Kin Chan et al. found that the accuracy of convolutional neural networks in identifying FTC was only 63.6%–72.7% and in identifying Hürthle cell carcinoma, only 60%–66.7%. These limitations were largely due to the small number of cases in the database, a consequence of the low incidence and prevalence of these cancers. However, the performance of retrained convolutional neural networks was significantly better than that of the participating physicians (43).

Since 2016, the application of AI has gradually evolved toward the integration of multimodal data. Researchers have begun to explore combinations of pathological images with clinical data and genomic information to construct comprehensive models (44). This trend extends the capabilities of AI beyond image analysis, enabling support for predicting genetic information, assessing patient prognosis, and developing personalized treatment plans. For example, PTC, particularly its aggressive subtype, is often associated with BRAF p.V600E mutations and RET fusions (45, 46). Rossi et al. examined 72 FNA cytology specimens from patients diagnosed with PTC and found that 47 of the patients with mutations exhibited distinct morphological features. This study demonstrated that the BRAF p.V600E mutation could be predicted in cytological samples based on specific morphological characteristics (21). AI technology has the potential to predict whether patients with PTC harbor BRAF p.V600E mutations by analyzing and identifying the morphological features of cells (47). Nishikaw et al. generated a morphological analysis dataset using deep learning, constructed 72 whole-slide images, and extracted six types of nuclear features. This study successfully established a predictive model for identifying RET fusions, achieving an AUC of 0.801 (22). Additionally, integrative multiomics analyses (such as combining spatial proteomics, genomics, immunohistochemistry, and metabolomics), together with AI and machine learning methods, can reveal complex relations and interactions among various molecular components, providing a more comprehensive biological landscape for pathological thyroid diagnosis and addressing current diagnostic challenges (48).
Matrix-assisted laser desorption/ionization mass spectrometry imaging and desorption electrospray ionization mass spectrometry imaging enhance the diagnostic performance of FNA by effectively distinguishing between benign and malignant cell regions, serving as supplementary tools for diagnosing uncertain characteristics of thyroid nodules (15, 49) (Table 1).

Table 1. AI studies for thyroid disease diagnosis.

3.2 Therapeutic applications

3.2.1 Surgical decision-making

Radiomic models can analyze risk stratification, predict the invasiveness of thyroid cancer and lymph node metastasis, and guide surgical decisions regarding preventive lymph node dissection (50). One study used mind maps and iterative decision trees to develop a guideline-based clinical decision support system for routine surgical practice. The concordance between clinical decision support system recommendations and actual treatment decisions in real-world clinical settings was 78.9% (51).

3.2.2 Targeted therapy guidance

Initially, the concept of personalized treatment relied primarily on clinical experience and pathological analysis and lacked data-driven approaches. Advancements in AI have facilitated a gradual shift toward data-driven personalized treatment. With the development of genomics and bioinformatics, researchers have begun using AI to analyze patient genetic information to predict disease risk and treatment responses (52). Early studies focused on the genetic mutation analysis of patients with thyroid cancer to identify biomarkers associated with treatment sensitivity (21, 53). The ResNet152-based DTLR model demonstrated significant value in identifying BRAF p.V600E mutations in patients with PTC using ultrasound images (54). Combination therapy with dabrafenib and trametinib is currently the standard treatment for patients with the BRAF p.V600E mutation. Machine learning approaches have contributed to the identification of biological pathways involved in cancer drug responses. For example, machine learning methods identified Rac1/cytoskeleton signal transduction as the most significant driver of resistance to BRAF inhibitors (55). AI-assisted virtual screening identified Kir5.1 as a druggable target through the molecular docking of 200,000 compounds and yielded 10 potent compounds that interact with Kir5.1 (24) (Table 2).

Table 2. AI studies for therapeutic applications.

3.3 Prognostic monitoring

AI not only holds significant promise in the diagnosis and personalized treatment of patients with thyroid diseases but also plays an increasingly important role in patient monitoring and follow-up, particularly in remote monitoring, prognostic assessment, and risk management.

3.3.1 Recurrence prediction

With the maturation of deep and machine learning algorithms, the application of AI in the personalized treatment of thyroid diseases has continued to expand. Researchers have begun integrating clinical data, imaging features, and biomarkers to develop complex predictive models. These models not only assist physicians in formulating individualized treatment plans but also evaluate patients’ responses to various therapies. Zhang et al. integrated radiomic features, mutated genes, and clinical characteristics to construct a nomogram model. The study found that this model significantly enhanced the predictive efficacy of radiomic features for lymph node metastasis, improving accuracy from 71.5% to 87.0% (23).

In the context of prognostic assessment and risk management, AI aids in analyzing long-term data to evaluate the risk of recurrence, particularly during the follow-up of patients with thyroid cancer. Timely interventions can be initiated by the early detection of abnormal signals. For instance, one study analyzed the prognostic significance of clinical and pathological factors in 1,040 patients with PTC, including the number of metastatic lymph nodes and lymph node ratio. Researchers attempted to construct a disease recurrence prediction model using machine learning techniques and compared the accuracy of five machine learning models. The decision tree model exhibited the highest accuracy at 95%, while the combination of Light Gradient Boosting Machine and stacking models showed an accuracy of 93% (25). In another study involving 554 patients with PTC, researchers used radiomic features in combination with significant clinical and pathological characteristics to construct a nomogram. The results demonstrated that the combined nomogram showed strong concordance with actual recurrence events and yielded a net benefit superior to that of traditional clinical models across most thresholds (26).
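
The "net benefit" used to compare the nomogram with clinical models comes from decision curve analysis, where net benefit at a threshold probability p_t is TP/N − (FP/N) × p_t/(1 − p_t). The sketch below implements that standard definition. The counts are hypothetical and are not results from the cited study.

```python
def net_benefit(true_pos: int, false_pos: int, n: int, threshold: float) -> float:
    """Net benefit of acting on model-positive patients at a threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight for false positives
    return true_pos / n - (false_pos / n) * odds


# Hypothetical cohort of 100 patients, 20 of whom recur.
N = 100
THRESHOLD = 0.20

model = net_benefit(true_pos=15, false_pos=10, n=N, threshold=THRESHOLD)
treat_all = net_benefit(true_pos=20, false_pos=80, n=N, threshold=THRESHOLD)
treat_none = 0.0  # no true or false positives by definition

print(f"model={model:.3f}  treat_all={treat_all:.3f}  treat_none={treat_none:.3f}")
```

Repeating the calculation over a range of thresholds produces the decision curve; a model is clinically useful at a given threshold only if its net benefit exceeds both the treat-all and treat-none strategies.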

3.3.2 Remote monitoring

With ongoing technological advancements, the application of AI in remote monitoring has steadily increased. Using smartphones and wearable devices, patients’ physiological parameters and symptoms can be collected and transmitted to healthcare teams in real time (56, 57). AI systems can analyze these data to promptly identify potential complications and recurrence risks, thereby providing physicians with real-time decision-making support. This form of remote monitoring not only enhances patients’ self-management capabilities but also reduces the need for frequent clinic visits (Table 3).

Table 3. AI studies for prognostic monitoring.

4 Conclusion and outlook

AI has demonstrated significant potential in the detection and follow-up of patients with thyroid diseases, particularly in imaging analysis, prediction of invasiveness and metastasis, and prognostic assessment. Through deep learning and machine learning techniques, AI has not only improved the accuracy of differentiating between benign and malignant thyroid nodules but also integrated multiple data sources to monitor patient health and identify potential risks in a timely manner. Despite the promising prospects of AI in thyroid disease management, critical challenges persist regarding data privacy, model interpretability, and clinical applicability. This study had three fundamental limitations:

First, the generalization capacity of AI models is profoundly affected by dataset homogeneity. Existing studies predominantly relied on single-center, hospital-based cohorts (28 of 30 studies, 93%), which differ in thyroid cancer prevalence compared with the general population, thereby compromising external validity. Notably, 83% of the models (25 of 30) were trained on Asian datasets, raising concerns about their efficacy across diverse ethnic and geographic populations. Furthermore, the inadequate representation of pathological subtypes—with 90% of studies focusing on classical PTC—has resulted in diagnostic inequity for patients with FTC and other rare subtypes. This limitation contributes to degraded algorithmic performance across institutions, imaging devices, and multiethnic cohorts.

Second, the “black-box” nature of AI models remains a critical barrier to clinical adoption. Although interpretability tools, such as SHapley Additive exPlanations and Local Interpretable Model-agnostic Explanations (58–60), have been partially implemented, current systems fail to transparently elucidate decision-making pathways—particularly the relative contributions of key morphological features, such as microcalcifications versus vascular patterns. This opacity complicates the clinical validation of misdiagnoses, including the erroneous classification of Hashimoto’s thyroiditis as malignancy (15).
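
The idea behind SHapley Additive exPlanations can be shown with a toy example that computes exact Shapley values by enumerating feature orderings. The scoring function, its weights, and the feature names below are hypothetical, and the code does not use the `shap` library, which relies on approximations to make this computation tractable for real models.

```python
from itertools import permutations
from typing import Callable


def shapley_values(
    features: list[str], value: Callable[[frozenset], float]
) -> dict[str, float]:
    """Average each feature's marginal contribution over all feature orderings."""
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        present: set[str] = set()
        for name in order:
            before = value(frozenset(present))
            present.add(name)
            totals[name] += value(frozenset(present)) - before
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(present: frozenset) -> float:
    """Hypothetical malignancy score with an interaction between two features."""
    score = 0.10  # baseline when neither feature is observed
    if "microcalcifications" in present:
        score += 0.40
    if "vascular_pattern" in present:
        score += 0.10
    if {"microcalcifications", "vascular_pattern"} <= present:
        score += 0.10  # interaction term, split evenly by the Shapley rule
    return score


contributions = shapley_values(["microcalcifications", "vascular_pattern"], toy_risk)
print(contributions)
```

The contributions sum to the difference between the full score and the baseline, which is the additivity property that makes such attributions auditable. Enumeration grows factorially with the number of features, so it is practical only for toy cases like this one.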

Third, two systemic disconnects hinder real-world application. Algorithmic development remains poorly integrated with clinical workflows, as exemplified by models such as those developed by Peng et al. (27), which lack compatibility with Picture Archiving and Communication Systems. Concurrently, the absence of ethical and legal frameworks that address liability attribution for AI misdiagnoses and informed consent for predictive genomic models creates regulatory ambiguities (61).

Future research should prioritize three directions. Cross-modal data-fusion architectures must integrate ultrasound, pathomics, and multiomics data to develop interpretable multitask learning frameworks. Algorithmic improvements are urgently required to enhance predictive fairness in heterogeneous thyroid nodule populations. The seamless integration of AI tools into clinical workflows necessitates the establishment of rapid implementation pipelines. Additionally, prospective randomized controlled trials are imperative to quantify the real-world impact of AI systems on healthcare costs (such as reductions in FNA rates) and patient outcomes, including 5-year survival rates. Addressing these priorities will bridge the gap between AI innovation and equitable, ethically grounded clinical practice.

Author contributions

QLu: Conceptualization, Writing – original draft, Writing – review & editing, Methodology, Formal Analysis. YW: Writing – original draft, Writing – review & editing, Conceptualization, Methodology, Formal Analysis. JC: Conceptualization, Investigation, Writing – review & editing. LZ: Writing – review & editing, Supervision, Project administration. QLv: Methodology, Writing – review & editing, Supervision. HS: Funding acquisition, Project administration, Writing – review & editing, Supervision.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was funded by the Wuhan Knowledge Innovation Project (Grant No. 2023020201010162; Principal Investigator: Hui Sun) and the Technology Innovation Project of Hubei Province (Grant No. 2023BCB131; Principal Investigator: Hui Sun).

Acknowledgments

The authors are grateful to all the people who participated in the study, Huazhong University of Science and Technology.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Abbreviations

ACR TI-RADS, American College of Radiology Thyroid Imaging Reporting and Data System; AI, Artificial intelligence; AUC, Area under the curve; CAD, Computer-aided diagnosis; DCNN, Deep convolutional neural network; DTLR, Deep transfer learning radiomics; ETE, Extrathyroidal extension; FNA, Fine-needle aspiration; FTC, Follicular thyroid cancer; HCC, Hürthle cell carcinoma; MALDI-MSI, Matrix-assisted laser desorption/ionization mass spectrometry imaging; NPV, Negative predictive value; PPV, Positive predictive value; PTC, Papillary thyroid carcinoma; TLR, Transfer learning-based radiomics; BRAF, B-Raf proto-oncogene, serine/threonine kinase.

References

1. Ramesh AN, Kambhampati C, Monson JR, and Drew PJ. Artificial intelligence in medicine. Ann R Coll Surg Engl. (2004) 86:334–8. doi: 10.1308/147870804290

2. Esteva A, Chou K, Yeung S, Naik N, Madani A, Mottaghi A, et al. Deep learning-enabled medical computer vision. NPJ Digit Med. (2021) 4:5. doi: 10.1038/s41746-020-00376-2

3. Saarinen K, Nykänen P, Irjala K, Viikari J, and Välimäki M. Design and development of the thyroid system. Comput Methods Programs Biomed. (1991) 34:211–8. doi: 10.1016/0169-2607(91)90045-u

4. Karakitsos P, Cochand-Priollet B, Guillausseau PJ, and Pouliakis A. Potential of the back propagation neural network in the morphologic examination of thyroid lesions. Anal Quant Cytol Histol. (1996) 18:494–500.

5. Holt EH. Current evaluation of thyroid nodules. Med Clin North Am. (2021) 105:1017–31. doi: 10.1016/j.mcna.2021.06.006

6. Liu T, Yang F, Qiao J, and Mao M. Deciphering the progression of fine-needle aspiration: A bibliometric analysis of thyroid nodule research. Medicine (Baltimore). (2024) 103:e38059. doi: 10.1097/md.0000000000038059

7. Roy M, Ghander C, Bigorgne C, Brière M, Deniziaut G, Ansart F, et al. An update on management of cytologically indeterminate thyroid nodules. Ann Endocrinol (Paris). (2025) 86:101711. doi: 10.1016/j.ando.2025.101711

8. Wang CC, Friedman L, Kennedy GC, Wang H, Kebebew E, Steward DL, et al. A large multicenter correlation study of thyroid nodule cytopathology and histopathology. Thyroid. (2011) 21:243–51. doi: 10.1089/thy.2010.0243

9. Zhao CK, Ren TT, Yin YF, Shi H, Wang HX, Zhou BY, et al. A comparative analysis of two machine learning-based diagnostic patterns with thyroid imaging reporting and data system for thyroid nodules: diagnostic performance and unnecessary biopsy rate. Thyroid. (2021) 31:470–81. doi: 10.1089/thy.2020.0305

10. Seo JK, Kim YJ, Kim KG, Shin I, Shin JH, and Kwak JY. Differentiation of the follicular neoplasm on the gray-scale US by image selection subsampling along with the marginal outline using convolutional neural network. BioMed Res Int. (2017) 2017:3098293. doi: 10.1155/2017/3098293

PubMed Abstract | Crossref Full Text | Google Scholar

11. Chi J, Walia E, Babyn P, Wang J, Groot G, and Eramian M. Thyroid nodule classification in ultrasound images by fine-tuning deep convolutional neural network. J Digit Imaging. (2017) 30:477–86. doi: 10.1007/s10278-017-9997-y

PubMed Abstract | Crossref Full Text | Google Scholar

12. Guan Q, Wang Y, Ping B, Li D, Du J, Qin Y, et al. Deep convolutional neural network vgg-16 model for differential diagnosing of papillary thyroid carcinomas in cytological images: A pilot study. J Cancer. (2019) 10:4876–82. doi: 10.7150/jca.28769

PubMed Abstract | Crossref Full Text | Google Scholar

13. Cochand-Priollet B, Koutroumbas K, Megalopoulou TM, Pouliakis A, Sivolapenko G, and Karakitsos P. Discriminating benign from Malignant thyroid lesions using artificial intelligence and statistical selection of morphometric features. Oncol Rep. (2006) 15 Spec no:1023–6. doi: 10.3892/or.15.4.1023

PubMed Abstract | Crossref Full Text | Google Scholar

14. Zhao D, Luo M, Zeng M, Yang Z, Guan Q, Wan X, et al. Deep convolutional neural network model resnest for discrimination of papillary thyroid carcinomas and benign nodules in thyroid nodules diagnosed as atypia of undetermined significance. Gland Surg. (2024) 13:619–29. doi: 10.21037/gs-23-486

PubMed Abstract | Crossref Full Text | Google Scholar

15. Capitoli G, Piga I, L’Imperio V, Clerici F, Leni D, Garancini M, et al. Cytomolecular classification of thyroid nodules using fine-needle washes aspiration biopsies. Int J Mol Sci. (2022) 23:4156. doi: 10.3390/ijms23084156

PubMed Abstract | Crossref Full Text | Google Scholar

16. Lee Y, Alam MR, Park H, Yim K, Seo KJ, Hwang G, et al. Improved diagnostic accuracy of thyroid fine-needle aspiration cytology with artificial intelligence technology. Thyroid. (2024) 34:723–34. doi: 10.1089/thy.2023.0384

PubMed Abstract | Crossref Full Text | Google Scholar

17. Choi YJ, Baek JH, Park HS, Shim WH, Kim TY, Shong YK, et al. A computer-aided diagnosis system using artificial intelligence for the diagnosis and characterization of thyroid nodules on ultrasound: initial clinical assessment. Thyroid. (2017) 27:546–52. doi: 10.1089/thy.2016.0372

PubMed Abstract | Crossref Full Text | Google Scholar

18. Li Y, Liu Y, Xiao J, Yan L, Yang Z, Li X, et al. Clinical value of artificial intelligence in thyroid ultrasound: A prospective study from the real world. Eur Radiol. (2023) 33:4513–23. doi: 10.1007/s00330-022-09378-y

PubMed Abstract | Crossref Full Text | Google Scholar

19. Swan KZ, Thomas J, Nielsen VE, Jespersen ML, and Bonnema SJ. External validation of aibx, an artificial intelligence model for risk stratification, in thyroid nodules. Eur Thyroid J. (2022) 11:e210129. doi: 10.1530/etj-21-0129

PubMed Abstract | Crossref Full Text | Google Scholar

20. Ha EJ, Lee JH, Lee DH, Moon J, Lee H, Kim YN, et al. Artificial intelligence model assisting thyroid nodule diagnosis and management: A multicenter diagnostic study. J Clin Endocrinol Metab. (2024) 109:527–35. doi: 10.1210/clinem/dgad503

PubMed Abstract | Crossref Full Text | Google Scholar

21. Rossi ED, Bizzarro T, Martini M, Capodimonti S, Fadda G, Larocca LM, et al. Morphological parameters able to predict braf(V600e) -mutated Malignancies on thyroid fine-needle aspiration cytology: our institutional experience. Cancer Cytopathol. (2014) 122:883–91. doi: 10.1002/cncy.21475

PubMed Abstract | Crossref Full Text | Google Scholar

22. Nishikawa T, Matsuzaki I, Takahashi A, Ryuta I, Musangile FY, Sagan K, et al. Artificial intelligence detected the relationship between nuclear morphological features and molecular abnormalities of papillary thyroid carcinoma. Endocr Pathol. (2024) 35:40–50. doi: 10.1007/s12022-023-09796-8

PubMed Abstract | Crossref Full Text | Google Scholar

23. Zhang R, Hu L, Cheng Y, Chang L, Dong L, Han L, et al. Targeted sequencing of DNA/rna combined with radiomics predicts lymph node metastasis of papillary thyroid carcinoma. Cancer Imaging. (2024) 24:75. doi: 10.1186/s40644-024-00719-2

PubMed Abstract | Crossref Full Text | Google Scholar

24. Yang X, Wu Y, Xu S, Li H, Peng C, Cui X, et al. Targeting the inward rectifier potassium channel 5.1 in thyroid cancer: artificial intelligence-facilitated molecular docking for drug discovery. BMC Endocr Disord. (2023) 23:113. doi: 10.1186/s12902-023-01360-z

PubMed Abstract | Crossref Full Text | Google Scholar

25. Park YM and Lee BJ. Machine learning-based prediction model using clinico-pathologic factors for papillary thyroid carcinoma recurrence. Sci Rep. (2021) 11:4948. doi: 10.1038/s41598-021-84504-2

PubMed Abstract | Crossref Full Text | Google Scholar

26. Zhou B, Liu J, Yang Y, Ye X, Liu Y, Mao M, et al. Ultrasound-based nomogram to predict the recurrence in papillary thyroid carcinoma using machine learning. BMC Cancer. (2024) 24:810. doi: 10.1186/s12885-024-12546-6

PubMed Abstract | Crossref Full Text | Google Scholar

27. Peng S, Liu Y, Lv W, Liu L, Zhou Q, Yang H, et al. Deep learning-based artificial intelligence model to assist thyroid nodule diagnosis and management: A multicentre diagnostic study. Lancet Digit Health. (2021) 3:e250–e9. doi: 10.1016/s2589-7500(21)00041-8

PubMed Abstract | Crossref Full Text | Google Scholar

28. Gild ML, Chan M, Gajera J, Lurie B, Gandomkar Z, and Clifton-Bligh RJ. Risk stratification of indeterminate thyroid nodules using ultrasound and machine learning algorithms. Clin Endocrinol (Oxf). (2022) 96:646–52. doi: 10.1111/cen.14612

PubMed Abstract | Crossref Full Text | Google Scholar

29. Sharpe PK, Solberg HE, Rootwelt K, and Yearworth M. Artificial neural networks in diagnosis of thyroid function from in vitro laboratory tests. Clin Chem. (1993) 39:2248–53. doi: 10.1093/clinchem/39.11.2248

PubMed Abstract | Crossref Full Text | Google Scholar

30. Tong WJ, Wu SH, Cheng MQ, Huang H, Liang JY, Li CQ, et al. Integration of artificial intelligence decision aids to reduce workload and enhance efficiency in thyroid nodule management. JAMA Netw Open. (2023) 6:e2313674. doi: 10.1001/jamanetworkopen.2023.13674

PubMed Abstract | Crossref Full Text | Google Scholar

31. Yu J, Deng Y, Liu T, Zhou J, Jia X, Xiao T, et al. Lymph node metastasis prediction of papillary thyroid carcinoma based on transfer learning radiomics. Nat Commun. (2020) 11:4807. doi: 10.1038/s41467-020-18497-3

PubMed Abstract | Crossref Full Text | Google Scholar

32. Kwon MR, Shin JH, Park H, Cho H, Hahn SY, and Park KW. Radiomics study of thyroid ultrasound for predicting braf mutation in papillary thyroid carcinoma: preliminary results. AJNR Am J Neuroradiol. (2020) 41:700–5. doi: 10.3174/ajnr.A6505

PubMed Abstract | Crossref Full Text | Google Scholar

33. Zhu Y, Meng Z, Wu H, Fan X, Lv W, Tian J, et al. Deep Learning Radiomics of Multimodal Ultrasound for Classifying Metastatic cervical lymphadenopathy into Primary Cancer Sites: A Feasibility Study. Ultraschall Med. (2024) 45:305–15. doi: 10.1055/a-2161-9369

PubMed Abstract | Crossref Full Text | Google Scholar

34. Gao Y, Wang W, Yang Y, Xu Z, Lin Y, Lang T, et al. An integrated model incorporating deep learning, hand-crafted radiomics and clinical and us features to diagnose central lymph node metastasis in patients with papillary thyroid cancer. BMC Cancer. (2024) 24:69. doi: 10.1186/s12885-024-11838-1

PubMed Abstract | Crossref Full Text | Google Scholar

35. Lv X, Lu JJ, Song SM, Hou YR, Hu YJ, Yan Y, et al. Prediction of lymph node metastasis in patients with papillary thyroid cancer based on radiomics analysis and intraoperative frozen section analysis: A retrospective study. Clin Otolaryngol. (2024) 49:462–74. doi: 10.1111/coa.14162

PubMed Abstract | Crossref Full Text | Google Scholar

36. Park VY, Han K, Lee E, Kim EK, Moon HJ, Yoon JH, et al. Association between radiomics signature and disease-free survival in conventional papillary thyroid carcinoma. Sci Rep. (2019) 9:4501. doi: 10.1038/s41598-018-37748-4

PubMed Abstract | Crossref Full Text | Google Scholar

37. Liu Y, Li X, Yan C, Liu L, Liao Y, Zeng H, et al. Comparison of diagnostic accuracy and utility of artificial intelligence-optimized acr ti-rads and original acr ti-rads: A multi-center validation study based on 2061 thyroid nodules. Eur Radiol. (2022) 32:7733–42. doi: 10.1007/s00330-022-08827-y

PubMed Abstract | Crossref Full Text | Google Scholar

38. Wang C, Yu P, Zhang H, Han X, Song Z, Zheng G, et al. Artificial intelligence-based prediction of cervical lymph node metastasis in papillary thyroid cancer with ct. Eur Radiol. (2023) 33:6828–40. doi: 10.1007/s00330-023-09700-2

PubMed Abstract | Crossref Full Text | Google Scholar

39. Wei R, Wang H, Wang L, Hu W, Sun X, Dai Z, et al. Radiomics based on multiparametric mri for extrathyroidal extension feature prediction in papillary thyroid cancer. BMC Med Imaging. (2021) 21:20. doi: 10.1186/s12880-021-00553-z

PubMed Abstract | Crossref Full Text | Google Scholar

40. Nafe R and Choritz H. Introduction of a neuronal network as a tool for diagnostic analysis and classification based on experimental pathologic data. Exp Toxicol Pathol. (1992) 44:17–24. doi: 10.1016/s0940-2993(11)80132-6

PubMed Abstract | Crossref Full Text | Google Scholar

41. Hier J, Avior G, Pusztaszeri M, Krasner JR, Alyouha N, Forest VI, et al. Molecular testing for cytologically suspicious and Malignant (Bethesda V and vi) thyroid nodules to optimize the extent of surgical intervention: A retrospective chart review. J Otolaryngol Head Neck Surg. (2021) 50:29. doi: 10.1186/s40463-021-00500-6

PubMed Abstract | Crossref Full Text | Google Scholar

42. Wong CM, Kezlarian BE, and Lin O. Current status of machine learning in thyroid cytopathology. J Pathol Inform. (2023) 14:100309. doi: 10.1016/j.jpi.2023.100309

PubMed Abstract | Crossref Full Text | Google Scholar

43. Chan WK, Sun JH, Liou MJ, Li YR, Chou WY, Liu FH, et al. Using deep convolutional neural networks for enhanced ultrasonographic image diagnosis of differentiated thyroid cancer. Biomedicines. (2021) 9:1771. doi: 10.3390/biomedicines9121771

PubMed Abstract | Crossref Full Text | Google Scholar

44. Radulović M, Li X, Djuričić GJ, Milovanović J, Todorović Raković N, Vujasinović T, et al. Bridging histopathology and radiomics toward prognosis of metastasis in early breast cancer. Microsc Microanal. (2024) 30:751–8. doi: 10.1093/mam/ozae057

PubMed Abstract | Crossref Full Text | Google Scholar

45. Xing M, Alzahrani AS, Carson KA, Shong YK, Kim TY, Viola D, et al. Association between braf V600e mutation and recurrence of papillary thyroid cancer. J Clin Oncol. (2015) 33:42–50. doi: 10.1200/jco.2014.56.8253

PubMed Abstract | Crossref Full Text | Google Scholar

46. Jeon MJ, Chun SM, Kim D, Kwon H, Jang EK, Kim TY, et al. Genomic alterations of anaplastic thyroid carcinoma detected by targeted massive parallel sequencing in a braf(V600e) mutation-prevalent area. Thyroid. (2016) 26:683–90. doi: 10.1089/thy.2015.0506

PubMed Abstract | Crossref Full Text | Google Scholar

47. Wang CW, Muzakky H, Lee YC, Lin YJ, and Chao TK. Annotation-free deep learning-based prediction of thyroid molecular cancer biomarker braf (V600e) from cytological slides. Int J Mol Sci. (2023) 24:2521. doi: 10.3390/ijms24032521

PubMed Abstract | Crossref Full Text | Google Scholar

48. Piga I, L’Imperio V, Capitoli G, Denti V, Smith A, Magni F, et al. Paving the path toward multi-omics approaches in the diagnostic challenges faced in thyroid pathology. Expert Rev Proteomics. (2023) 20:419–37. doi: 10.1080/14789450.2023.2288222

PubMed Abstract | Crossref Full Text | Google Scholar

49. DeHoog RJ, Zhang J, Alore E, Lin JQ, Yu W, Woody S, et al. Preoperative metabolic classification of thyroid nodules using mass spectrometry imaging of fine-needle aspiration biopsies. Proc Natl Acad Sci U.S.A. (2019) 116:21401–8. doi: 10.1073/pnas.1911333116

PubMed Abstract | Crossref Full Text | Google Scholar

50. Fan F, Li F, Wang Y, Dai Z, Lin Y, Liao L, et al. Integration of ultrasound-based radiomics with clinical features for predicting cervical lymph node metastasis in postoperative patients with differentiated thyroid carcinoma. Endocrine. (2024) 84:999–1012. doi: 10.1007/s12020-023-03644-9

PubMed Abstract | Crossref Full Text | Google Scholar

51. Yu HW, Hussain M, Afzal M, Ali T, Choi JY, Han HS, et al. Use of mind maps and iterative decision trees to develop a guideline-based clinical decision support system for routine surgical practice: case study in thyroid nodules. J Am Med Inform Assoc. (2019) 26:524–36. doi: 10.1093/jamia/ocz001

PubMed Abstract | Crossref Full Text | Google Scholar

52. Shipp MA, Ross KN, Tamayo P, Weng AP, Kutok JL, Aguiar RC, et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med. (2002) 8:68–74. doi: 10.1038/nm0102-68

PubMed Abstract | Crossref Full Text | Google Scholar

53. Liu Y, Zhang J, Li S, Chen W, Wu R, Hao Z, et al. Prediction of tnfrsf9 expression and molecular pathological features in thyroid cancer using machine learning to construct pathomics models. Endocrine. (2024) 86:324–32. doi: 10.1007/s12020-024-03862-9

PubMed Abstract | Crossref Full Text | Google Scholar

54. Wu F, Lin X, Chen Y, Ge M, Pan T, Shi J, et al. Breaking barriers: noninvasive ai model for braf(V600e) mutation identification. Int J Comput Assist Radiol Surg. (2025) 20:935–47. doi: 10.1007/s11548-024-03290-0

PubMed Abstract | Crossref Full Text | Google Scholar

55. Zhu EY and Dupuy AJ. Machine learning approach informs biology of cancer drug response. BMC Bioinf. (2022) 23:184. doi: 10.1186/s12859-022-04720-z

PubMed Abstract | Crossref Full Text | Google Scholar

56. Chen J, Liu J, Chen W, Shang D, Zhang Q, Li Y, et al. Skin-conformable flexible and stretchable ultrasound transducer for wearable imaging. IEEE Trans Ultrason Ferroelectr Freq Control. (2024) 71:811–20. doi: 10.1109/tuffc.2024.3352655

PubMed Abstract | Crossref Full Text | Google Scholar

57. Kim KH, Lee J, Ahn CH, Yu HW, Choi JY, Lee HY, et al. Association between thyroid function and heart rate monitored by wearable devices in patients with hypothyroidism. Endocrinol Metab (Seoul). (2021) 36:1121–30. doi: 10.3803/EnM.2021.1216

PubMed Abstract | Crossref Full Text | Google Scholar

58. Li Z, Nie W, Liu Q, Lin M, Li X, Zhang J, et al. A prognostic model for thermal ablation of benign thyroid nodules based on interpretable machine learning. Front Endocrinol (Lausanne). (2024) 15:1433192. doi: 10.3389/fendo.2024.1433192

PubMed Abstract | Crossref Full Text | Google Scholar

59. Wang L, Wang C, Deng X, Li Y, Zhou W, Huang Y, et al. Multimodal ultrasound radiomic technology for diagnosing benign and Malignant thyroid nodules of ti-rads 4-5: A multicenter study. Sensors (Basel). (2024) 24:6203. doi: 10.3390/s24196203

PubMed Abstract | Crossref Full Text | Google Scholar

60. Zou Y, Shi Y, Sun F, Liu J, Guo Y, Zhang H, et al. Extreme gradient boosting model to assess risk of central cervical lymph node metastasis in patients with papillary thyroid carcinoma: individual prediction using shapley additive explanations. Comput Methods Programs BioMed. (2022) 225:107038. doi: 10.1016/j.cmpb.2022.107038

PubMed Abstract | Crossref Full Text | Google Scholar

61. Sedano R, Solitano V, Vuyyuru SK, Yuan Y, Hanžel J, Ma C, et al. Artificial intelligence to revolutionize ibd clinical trials: A comprehensive review. Therap Adv Gastroenterol. (2025) 18:17562848251321915. doi: 10.1177/17562848251321915

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: artificial intelligence, deep learning, thyroid nodule, ultrasonography, radiomics, pathology

Citation: Lu Q, Wu Y, Chang J, Zhang L, Lv Q and Sun H (2025) Application progress of artificial intelligence in managing thyroid disease. Front. Endocrinol. 16:1578455. doi: 10.3389/fendo.2025.1578455

Received: 17 February 2025; Accepted: 30 May 2025; Published: 17 June 2025.

Edited by:

Sandeep Kumar Mishra, Yale University, United States

Reviewed by:

Erika Abelleira, Hospital de Clínicas José de San Martín, Argentina
Leandros Stefanopoulos, Northwestern University, United States
Huang Bin, Zhejiang Hospital, China
Yapeng Wang, Macao Polytechnic University, China

Copyright © 2025 Lu, Wu, Chang, Zhang, Lv and Sun. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Li Zhang, zli429@hust.edu.cn; Qing Lv, lvqing1987@hust.edu.cn; Hui Sun, sunny68@hust.edu.cn

^†These authors have contributed equally to this work and share first authorship
