Optimizing C-TIRADS for sub-centimeter thyroid nodules using machine learning–derived feature importance

Guo, Dongming; Lin, Zhihui; Wang, Jiajia; Liao, Xianying; Huang, Haiqing; Zhai, Yuxia; Chen, Zhe

doi:10.3389/fendo.2025.1668347

ORIGINAL RESEARCH article

Front. Endocrinol., 26 September 2025

Sec. Thyroid Endocrinology

Volume 16 - 2025 | https://doi.org/10.3389/fendo.2025.1668347

Dongming Guo ¹, Zhihui Lin ¹, Jiajia Wang ¹, Xianying Liao ¹, Haiqing Huang ², Yuxia Zhai ³, Zhe Chen ¹*

1. Department of Interventional Ultrasound, Cancer Hospital of Shantou University Medical College, Shantou, China
2. Department of Ultrasound, Cancer Hospital of Shantou University Medical College, Shantou, China
3. Department of Ultrasound, Second Affiliated Hospital of Shantou University Medical College, Shantou, China

A correction has been applied to this article in:

Correction: Optimizing C-TIRADS for sub-centimeter thyroid nodules using machine learning–derived feature importance

Abstract

Background:

This study aimed to optimize the diagnostic performance of the Chinese Thyroid Imaging Reporting and Data System (C-TIRADS) for sub-centimeter thyroid nodules by incorporating machine learning–derived feature importance.

Methods:

This retrospective study included 741 patients in a primary cohort and 421 patients in an external validation cohort. SHapley Additive exPlanations (SHAP) were used to quantify the diagnostic contribution of six ultrasound features based on an XGBoost model. A modified C-TIRADS scoring system was developed by assigning greater weight to the most contributive feature while retaining original weights for other features. Diagnostic performance was evaluated using the area under the receiver operating characteristic curve (AUC), net reclassification improvement (NRI), and decision curve analysis (DCA).

Results:

SHAP analysis identified vertical orientation as the most predictive feature for malignancy in sub-centimeter nodules. The modified scoring system significantly improved diagnostic performance in both the primary (AUC: 0.911 vs. 0.898, P < 0.001) and validation cohorts (AUC: 0.931 vs. 0.899, P < 0.001). NRI analysis further showed a substantial improvement in risk classification, with NRI values of 0.406 in the primary cohort and 0.471 in the validation cohort (both P < 0.001). DCA demonstrated greater net clinical benefit across wider threshold ranges in both cohorts. Additionally, malignancy rates exhibited a more rational stepwise increase from C-TIRADS 4A to 5, indicating improved risk stratification.

Conclusion:

The SHAP-guided modified C-TIRADS scoring system enhances diagnostic accuracy and risk stratification for sub-centimeter thyroid nodules and may facilitate improved clinical decision-making in this challenging subset.

1 Introduction

Thyroid nodules are common findings in the general population and represent one of the most prevalent endocrine disorders. With the widespread use of high-resolution ultrasonography, the detection rate of thyroid nodules has increased significantly (1). Approximately 25% to 68% of the global population harbors thyroid nodules, the majority of which are benign (2–4). However, 5% to 15% of these nodules are malignant (5, 6). Therefore, accurate risk stratification is crucial to avoid unnecessary invasive surgeries and missed diagnoses of thyroid cancer. This is particularly important for sub-centimeter nodules (≤10 mm in maximum diameter), where the limited spatial resolution of ultrasound often results in ambiguous sonographic features and ongoing controversy regarding the indications for fine-needle aspiration (7, 8).

The Chinese Thyroid Imaging Reporting and Data System (C-TIRADS) is a structured ultrasound-based risk stratification tool that is widely adopted in China (9). It assigns risk scores based on a combination of suspicious sonographic features, such as composition, echogenicity, shape, margin, and calcification, to guide clinical management. While C-TIRADS has demonstrated good diagnostic performance overall, emerging evidence (10–12) suggests that the diagnostic utility of positive ultrasound features may differ depending on nodule size. This applies particularly to thyroid nodules ≤ 10 mm, for which all positive features have low diagnostic efficacy. These findings raise the possibility that a size-adjusted scoring strategy, which accounts for the differential predictive value of specific ultrasound features, may enhance the diagnostic accuracy of C-TIRADS. However, this concept remains insufficiently investigated in the current literature.

In this study, we aim to evaluate the relative diagnostic contributions of individual C-TIRADS ultrasound features for sub-centimeter thyroid nodules using a machine learning-based predictive model. We further propose a modified C-TIRADS scoring system that adjusts feature weights for nodules ≤ 10 mm, and assess whether this revision improves diagnostic efficacy compared with the original C-TIRADS system.

2 Methods

2.1 Study population and nodule selection

This retrospective study consisted of a primary cohort and an external validation cohort. The primary cohort comprised 741 patients from the Second Affiliated Hospital of Shantou University Medical College between January 2019 and December 2024, while the validation cohort included 421 patients from the Cancer Hospital of Shantou University Medical College between June 2020 and May 2025. All patients underwent either ultrasound-guided fine-needle aspiration or surgical resection, with a definitive pathological diagnosis. Some patients had multiple benign nodules or multifocal papillary thyroid carcinoma; to ensure accurate correspondence between the target nodule and the pathological diagnosis, only one representative nodule (the largest or most suspicious) per patient was included in the analysis. All included nodules were evaluated using grayscale ultrasound prior to pathological confirmation.

All included cases had complete demographic information, sonographic reports containing all required ultrasound features for C-TIRADS scoring, and definitive diagnostic outcomes based on fine-needle aspiration biopsy or surgical pathology. The inclusion criteria were as follows: (1) availability of high-quality preoperative ultrasound images; (2) complete documentation of C-TIRADS-related ultrasound features; (3) a definitive diagnosis based on cytology or histopathology. The exclusion criteria were as follows: (1) nodules presenting with purely cystic composition or a classic spongiform appearance; (2) incomplete imaging or clinical data; (3) poor-quality ultrasound images not permitting accurate feature assessment or target nodule identification; (4) nodules with ambiguous or indeterminate diagnostic outcomes; (5) history of previous neck radiation or any antitumor treatment prior to thyroid nodule diagnosis.

2.2 Ultrasound feature extraction and definitions

All preoperative ultrasound examinations were performed using high-frequency linear array probes on two main ultrasound systems: the Mindray Resona 7 (Mindray, Shenzhen, China) and the GE LOGIQ E9 (GE Healthcare, Chicago, IL, USA). All nodules were re-evaluated using the stored ultrasound images. Two sonographers with 5–10 years of experience in thyroid ultrasound independently assessed image quality and, when the retained images were clear and standardized, performed feature extraction according to the 2020 C-TIRADS criteria (9). In cases of disagreement, a third senior sonographer with more than 20 years of experience adjudicated to reach consensus. Structured data were then generated based on this standardized process, and these data were used for subsequent model training.

To ensure standardization, reproducibility, and comparability across cases, we restricted the analysis to the six core ultrasound features explicitly defined by the 2020 C-TIRADS. The evaluated features included composition, echogenicity, shape, margin, calcification, and artifacts. Nodule composition was categorized as solid (entirely or nearly entirely composed of soft tissue), predominantly solid (solid portion >50% of the nodule volume), and predominantly cystic (cystic or fluid-filled portion >50% of the nodule volume). Echogenicity was classified as iso/hyperechoic (echogenicity equal to or greater than the surrounding thyroid parenchyma), hypoechoic (lower echogenicity than the thyroid parenchyma but higher than or equal to the strap muscles), and markedly hypoechoic (echogenicity lower than the adjacent neck strap muscles). Shape was assessed by the vertical orientation, defined as an anteroposterior diameter greater than the transverse diameter on transverse imaging. Margin characteristics were recorded as ill-defined or irregular when the nodule boundaries appeared blurred, spiculated, or uneven. Calcifications were further subtyped as microcalcifications (punctate echogenic foci ≤1 mm without acoustic shadowing), macrocalcifications (coarse calcifications >1 mm with posterior shadowing), or peripheral calcifications (calcifications located along the rim of the nodule). Extrathyroidal extension was defined as the disruption of the thyroid capsule, capsular bulging, or direct invasion of surrounding structures. Comet-tail artifact was defined as a short, bright, tapering reverberation artifact extending posteriorly from echogenic spots within the nodule.

2.3 C-TIRADS scoring and risk stratification

According to the 2020 C-TIRADS guideline, each ultrasound feature is assigned a score: one point is given for each suspicious feature, including solid composition, markedly hypoechoic echogenicity, vertical orientation, ill-defined or irregular margin or extrathyroidal extension, and microcalcification. A comet-tail artifact, when not accompanied by microcalcification, is considered a benign feature and is assigned −1 point. The total score is then used to stratify the nodule into one of six C-TIRADS categories: C-TIRADS 2 (−1 point), C-TIRADS 3 (0 points), C-TIRADS 4A (1 point), C-TIRADS 4B (2 points), C-TIRADS 4C (3–4 points) and C-TIRADS 5 (≥5 points).
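The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the class name `NoduleFeatures`, its field names, and the functions `original_score` and `category` are hypothetical, and only the point values and cut-offs are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoduleFeatures:
    """Binary C-TIRADS features for one nodule (field names are illustrative)."""
    solid: bool = False
    markedly_hypoechoic: bool = False
    vertical_orientation: bool = False
    irregular_margin_or_ete: bool = False  # ill-defined/irregular margin or extrathyroidal extension
    microcalcification: bool = False
    comet_tail: bool = False


def original_score(f: NoduleFeatures) -> int:
    """One point per suspicious feature; -1 for a comet-tail artifact without microcalcification."""
    score = sum([
        f.solid,
        f.markedly_hypoechoic,
        f.vertical_orientation,
        f.irregular_margin_or_ete,
        f.microcalcification,
    ])
    if f.comet_tail and not f.microcalcification:
        score -= 1
    return score


def category(score: int) -> str:
    """Map a total score to the C-TIRADS category using the cut-offs stated in the text."""
    if score <= -1:
        return "2"
    if score == 0:
        return "3"
    if score == 1:
        return "4A"
    if score == 2:
        return "4B"
    if score <= 4:
        return "4C"
    return "5"


nodule = NoduleFeatures(solid=True, vertical_orientation=True)
print(original_score(nodule), category(original_score(nodule)))
```

For a solid nodule with vertical orientation and no other suspicious feature, this sketch yields 2 points and category 4B.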

2.4 Machine learning and SHAP analysis

An eXtreme Gradient Boosting (XGBoost) model was applied to comprehensively assess the diagnostic contribution of six key ultrasound features derived from the C-TIRADS guideline in the primary cohort: vertical orientation, solid composition, markedly hypoechoic echogenicity, microcalcification, ill-defined/irregular margin or extrathyroidal extension, and comet-tail artifact (counted only when not coexisting with microcalcifications). SHapley Additive exPlanations (SHAP) values were calculated using the ExactExplainer algorithm to quantify the contribution of each feature to the model’s prediction.
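To make the idea behind exact Shapley attribution concrete, the following sketch enumerates all feature coalitions for a small toy model. It does not reproduce the XGBoost model or the SHAP library used in this study: the value function `toy_value`, its coefficients, and the three-feature setup are hypothetical, and "absent" features are simply switched off, whereas SHAP averages the model output over background data.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_value(present):
    """Hypothetical model output (log-odds scale) given the set of present features."""
    out = 0.0
    if "vertical" in present:
        out += 2.0
    if "margin" in present:
        out += 1.0
    if "solid" in present:
        out += 0.8
    if "vertical" in present and "margin" in present:
        out += 0.5  # interaction term, shared equally between the two features
    return out


phi = shapley_values(toy_value, ["vertical", "margin", "solid"])
print(phi)
```

In this toy setting the attributions sum to the difference between the output with all features present and with none, and the interaction term is split evenly between the two interacting features. Ranking features by mean absolute attribution across nodules is the kind of summary reported in a SHAP bar plot.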

2.5 Modified scoring system construction and validation

Based on the SHAP-derived feature importance in the primary cohort, a modified C-TIRADS scoring system was proposed by increasing the weight of the most contributive feature. In the modified scoring system, this top-contributing feature was assigned a weight of 2 points, while the remaining features retained their original weights.
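The reweighting can be illustrated as a change to a weight table. The sketch below is an assumption-laden illustration rather than the authors' implementation: the dictionary keys and the function `weighted_score` are hypothetical names, and vertical orientation is used as the up-weighted feature because it is the one identified in the Results. The text does not restate the category cut-offs for the modified score, so no category mapping is shown here.

```python
# Weights per suspicious feature; names are illustrative only.
ORIGINAL_WEIGHTS = {
    "solid": 1,
    "markedly_hypoechoic": 1,
    "vertical_orientation": 1,
    "irregular_margin_or_ete": 1,
    "microcalcification": 1,
}

# Modified system: only the top-contributing feature is raised to 2 points.
MODIFIED_WEIGHTS = {**ORIGINAL_WEIGHTS, "vertical_orientation": 2}


def weighted_score(features: dict, weights: dict) -> int:
    """Sum the weights of the present features; comet-tail without microcalcification scores -1."""
    score = sum(w for name, w in weights.items() if features.get(name, False))
    if features.get("comet_tail", False) and not features.get("microcalcification", False):
        score -= 1
    return score


nodule = {"solid": True, "vertical_orientation": True}
print(weighted_score(nodule, ORIGINAL_WEIGHTS), weighted_score(nodule, MODIFIED_WEIGHTS))
```

A solid nodule with vertical orientation scores 2 under the original weights and 3 under the modified weights, while a nodule without vertical orientation is scored identically by both systems.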

To compare the diagnostic performance between the original and modified C-TIRADS scoring systems, receiver operating characteristic (ROC) curve analysis was performed, and the area under the ROC curve (AUC) was calculated for each system. The statistical significance of AUC differences was assessed using the DeLong test. Reclassification performance between scoring systems was compared using the net reclassification improvement (NRI), and 95% confidence intervals (CIs) and corresponding P-values for the NRI were calculated using 1,000 bootstrap iterations. Decision curve analysis (DCA) was conducted to quantify the net clinical benefit of each scoring model across a range of threshold probabilities.
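As a rough illustration of two of the metrics named above, the sketch below computes the AUC of an ordinal score through its rank (Mann–Whitney) interpretation and a categorical NRI from paired category assignments. The function names, the integer coding of categories, and the six-nodule example are hypothetical; the DeLong test and the bootstrap confidence intervals used in the study are not implemented here.

```python
def auc(scores, labels):
    """AUC as the probability that a malignant nodule outscores a benign one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def categorical_nri(old_cat, new_cat, labels):
    """NRI = (net upward moves among events) + (net downward moves among non-events)."""
    n_event = sum(labels)
    n_nonevent = len(labels) - n_event
    net_event = 0
    net_nonevent = 0
    for old, new, y in zip(old_cat, new_cat, labels):
        move = (new > old) - (new < old)  # +1 up, -1 down, 0 unchanged
        if y == 1:
            net_event += move      # upward moves are correct for malignant nodules
        else:
            net_nonevent -= move   # downward moves are correct for benign nodules
    return net_event / n_event + net_nonevent / n_nonevent


# Hypothetical data: 1 = malignant, 0 = benign; categories coded as ordered integers.
labels = [1, 1, 1, 0, 0, 0]
old = [2, 2, 3, 1, 2, 2]
new = [3, 3, 3, 1, 2, 3]
print(round(auc(old, labels), 3), round(auc(new, labels), 3), round(categorical_nri(old, new, labels), 3))
```

In this toy example, two of three malignant nodules move up and one of three benign nodules also moves up, so the NRI is 2/3 − 1/3 = 1/3. This shows how upward reclassification of benign nodules offsets the gain among malignant ones.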

Additionally, the modified scoring system was externally validated in the validation cohort using the same analytic procedures described above.

2.6 Statistical analysis

Continuous variables were summarized as mean ± standard deviation or median with interquartile range depending on distribution. Categorical variables were presented as frequencies and percentages. Group comparisons of categorical variables were performed using the chi-square test or Fisher’s exact test as appropriate. Comparisons of continuous variables were conducted using the independent samples t-test or Mann–Whitney U test.

All analyses were performed using Python (version 3.12.2) and R (version 4.4.3). A two-sided P-value < 0.05 was considered statistically significant.

3 Results

3.1 Patient characteristics

A flowchart illustrating the patient selection and exclusion process is presented in Figure 1. The baseline characteristics of the two cohorts are summarized in Table 1. No significant differences were observed in age or sex distribution between the primary and validation cohorts. Most ultrasound features showed no statistically significant differences between the two cohorts, except for margin characteristics, solid composition, markedly hypoechoic echogenicity, and peripheral calcifications. The overall malignancy rates were comparable between the two cohorts. Table 2 presents the distribution of C-TIRADS ultrasound features among the malignant cases.

Figure 1

Flowchart depicting patient enrollment for primary and validation cohorts. The primary cohort initially had 815 patients, excluding 74 due to poor-quality images (28), uncertain diagnosis (34), and incomplete data (12). It enrolled 741 patients, with 376 malignant and 365 benign nodules. The validation cohort initially had 487 patients, excluding 66 due to poor-quality images (16), uncertain diagnosis (31), and incomplete data (19), enrolling 421 patients, with 212 malignant and 209 benign nodules. — Flowchart of patient enrollment and selection in the primary and validation cohorts.

Table 1

Characteristics	Primary cohort n=741 (%)	Validation cohort n=421 (%)	P
Sex			0.518
Male	132 (17.8)	68 (16.2)
Female	609 (82.2)	353 (83.8)
Age (years)	48 (39-56)	47 (34-59)	0.164
Orientation			0.236
Parallel	458 (61.8)	245 (58.2)
Vertical	283 (38.2)	176 (41.8)
Margin
Circumscribed	530 (71.5)	255 (60.6)	< 0.001
Ill-defined or irregular	180 (24.3)	158 (37.5)	< 0.001
Extrathyroidal extension	60 (8.1)	15 (3.6)	0.003
Composition
Solid	634 (85.6)	341 (81.0)	0.046
Predominantly solid	89 (12.0)	64 (15.2)	0.126
Predominantly cystic	18 (2.4)	16 (3.8)	0.206
Echogenicity
Iso/hyperechoic	179 (24.1)	123 (29.2)	0.061
Hypoechoic	500 (67.5)	283 (67.2)	0.948
Markedly hypoechoic	62 (8.4)	15 (3.6)	0.001
Echogenic foci
Microcalcifications	222 (30.0)	126 (29.9)	0.991
Macrocalcifications	117 (15.8)	61 (14.5)	0.611
Peripheral calcifications	33 (4.5)	9 (2.1)	0.049
Comet-tail artifacts	20 (2.7)	15 (3.6)	0.475
Outcome			0.903
Benign	365 (49.3)	209 (49.6)
Malignant	376 (50.7)	212 (50.4)

Baseline characteristics of patients in the primary and validation cohorts.

Table 2

Characteristics	Primary cohort n=376 (%)	Validation cohort n=212 (%)	P
Sex			0.915
Male	58 (15.4)	32 (15.1)
Female	318 (84.6)	180 (84.9)
Age (years)	48 (39-55)	44 (31.25-56.75)	0.008
Orientation			0.009
Parallel	118 (31.4)	45 (21.2)
Vertical	258 (68.6)	167 (78.8)
Margin
Circumscribed	194 (51.6)	82 (38.7)	0.003
Ill-defined or irregular	151 (40.2)	122 (57.5)	< 0.001
Extrathyroidal extension	60 (16.0)	15 (7.1)	0.002
Composition
Solid	372 (98.9)	205 (96.7)	0.064
Predominantly solid	4 (1.1)	7 (3.3)	0.064
Predominantly cystic	0 (0.0)	0 (0.0)	–
Echogenicity
Iso/hyperechoic	12 (3.2)	7 (3.3)	0.942
Hypoechoic	315 (83.8)	190 (89.6)	0.064
Markedly hypoechoic	49 (13.0)	15 (7.1)	0.027
Echogenic foci
Microcalcifications	175 (46.5)	95 (44.8)	0.730
Macrocalcifications	64 (17.0)	31 (14.6)	0.485
Peripheral calcifications	15 (4.0)	1 (0.5)	0.014
Comet-tail artifacts	0 (0.0)	0 (0.0)	–

Distribution of ultrasound features among malignant thyroid nodules.

3.2 Feature contribution and modified C-TIRADS scoring system

SHAP analysis based on the XGBoost model revealed that vertical orientation was the most influential feature contributing to malignancy prediction in sub-centimeter nodules, followed by ill-defined/irregular margin or extrathyroidal extension, and solid composition (Figure 2). These findings informed the development of a modified C-TIRADS scoring system, in which vertical orientation was assigned 2 points. All other features retained their original scoring weights according to the 2020 C-TIRADS guideline.

Figure 2

Bar chart showing mean absolute SHAP values for different features. Vertical orientation leads with 1.623, followed by ill-defined/irregular margin or extrathyroidal extension at 0.886. Solid is 0.858, microcalcification is 0.544, markedly hypoechoic is 0.103, and comet-tail artifact is 0.041. — SHAP summary plot showing the relative contribution of ultrasound features to malignancy prediction in sub-centimeter nodules using XGBoost in the primary cohort.

3.3 Diagnostic performance comparison

ROC curve analysis demonstrated that the modified C-TIRADS scoring system had significantly better diagnostic performance compared with the original system in the primary cohort (Figure 3), with the AUC increasing from 0.898 to 0.911 (P < 0.001).

Figure 3

Receiver Operating Characteristic (ROC) curve comparing two models. The solid blue line represents the original score with an AUC of 0.898, while the dashed orange line represents the modified score with an AUC of 0.911. The diagonal gray line indicates random performance. P-value is less than 0.001. Sensitivity is plotted against 1-Specificity. — Receiver operating characteristic curves comparing diagnostic performance of the original and modified C-TIRADS scoring systems in the primary cohort.

3.4 Net reclassification and risk migration

NRI analysis showed a significant enhancement in risk classification with the modified scoring system. In the primary cohort, NRI was 0.406 (95% CI: 0.349–0.462, P < 0.001). Heatmaps in Figure 4 illustrate the distributional changes between original and modified scoring categories for both benign and malignant nodules. The modified scoring system substantially increased the upward reclassification of malignant nodules, especially in C-TIRADS category 4B (60.7%) and category 5 (45.7%), whereas a modest proportion of benign nodules were also misclassified into the higher-risk C-TIRADS 4B (22.0%) and C-TIRADS 4C (11.8%) categories.

Figure 4

Two heatmaps compare original and modified risk levels for nodules. Heatmap A shows malignant nodules with risk levels C-TR 4A, 4B, 4C, and 5. Heatmap B displays similar comparison for benign nodules, with risk levels C-TR 2, 3, 4A, 4B, 4C, and 5. Both heatmaps use color scales to indicate value intensity. — Heatmaps showing reclassification of benign and malignant nodules across C-TIRADS risk categories between the original and modified scoring systems in the primary cohort. **(A)** malignant nodules; **(B)** benign nodules.

Table 3 summarizes the malignancy rates across C-TIRADS categories defined by the original and modified C-TIRADS scoring systems. The modified system provided a clearer stratification trend, with a significantly higher malignancy rate in C-TIRADS 5 and lower rates in C-TIRADS 4A–4C categories compared with the original model (P < 0.001).

Table 3

Risk levels	Original scoring (n, %)	Modified scoring (n, %)	P
Primary cohort
C-TIRADS 4A	40 (17.7%)	39 (17.5%)	< 0.001
C-TIRADS 4B	122 (67.4%)	49 (50.5%)
C-TIRADS 4C	208 (92.4%)	187 (87.0%)
C-TIRADS 5	6 (85.7%)	101 (97.1%)
Validation cohort
C-TIRADS 4A	22 (19.3%)	15 (14.3%)	< 0.001
C-TIRADS 4B	60 (56.6%)	30 (41.1%)
C-TIRADS 4C	122 (93.1%)	95 (88%)
C-TIRADS 5	8 (100%)	72 (98.6%)

Malignancy rates across TIRADS risk categories defined by original and modified C-TIRADS scoring systems in the primary and validation cohorts.

3.5 Clinical benefit evaluation

As depicted in Figure 5, DCA demonstrated that the modified C-TIRADS scoring system offered greater net clinical benefit than the original system across threshold probabilities ranging from 52% to 92% in the primary cohort.
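The quantity plotted in a decision curve can be written down directly. The sketch below uses the standard net benefit definition, TP/n − (FP/n) × pt/(1 − pt), and divides by prevalence to obtain the standardized form shown on the figure axis. The function names and the eight-nodule example are hypothetical, and the predicted probabilities are assumed inputs; how the study converted C-TIRADS scores to probabilities is not described in the text.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of intervening when predicted probability >= threshold (0 < threshold < 1)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def standardized_net_benefit(probs, labels, threshold):
    """Net benefit divided by prevalence, so a perfect model scores 1."""
    prevalence = sum(labels) / len(labels)
    return net_benefit(probs, labels, threshold) / prevalence


# Hypothetical predicted malignancy probabilities and outcomes (1 = malignant).
probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.2, 0.1, 0.3]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model_nb = net_benefit(probs, labels, 0.5)
treat_all_nb = net_benefit([1.0] * len(labels), labels, 0.5)  # "All" reference strategy
print(model_nb, treat_all_nb, standardized_net_benefit(probs, labels, 0.5))
```

At a threshold of 0.5 the toy model has 3 true positives and 1 false positive among 8 nodules, giving a net benefit of 0.25, whereas the "All" strategy gives 0. Repeating the calculation over a grid of thresholds for each scoring system produces curves of the kind compared in Figure 5.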

Figure 5

A decision curve analysis graph displays standardized net benefit versus threshold probability. It compares four curves: Original C-TIRADS (blue), Modified C-TIRADS (orange), All (gray), and None (black). Both C-TIRADS models show higher net benefit across most threshold probabilities compared to “All” and “None”. The modified C-TIRADS offers greater net clinical benefit than the original C-TIRADS across a broad range of threshold probabilities. — Decision curve analysis comparing net clinical benefit of the original and modified C-TIRADS scoring systems in the primary cohort.

3.6 Validation of the modified scoring system

In the validation cohort, as shown in Figure 6, the modified C-TIRADS scoring system demonstrated superior diagnostic performance compared to the original version, with the AUC increasing from 0.899 to 0.931 (P < 0.001). NRI analysis indicated a significant improvement in risk stratification, with an NRI of 0.471 (95% CI: 0.400–0.542, P < 0.001), further supporting the enhanced discriminatory capacity of the modified model. Additionally, DCA (Figure 6) showed that the modified model yielded higher net clinical benefit across a broader threshold probability range (15% to 95%) compared with the original system.

Figure 6

Panel A shows a ROC curve comparing Original and Modified C-TIRADS scores, with the Modified score achieving a higher AUC of 0.931. Panel B is a decision curve indicating standardized net benefits for the same methods, with Modified C-TIRADS outperforming the Original in most scenarios. — Receiver operating characteristic curves and decision curve analysis comparing the diagnostic performance and clinical net benefit of the original and modified C-TIRADS scoring systems in the validation cohort. **(A)** receiver operating characteristic curves; **(B)** decision curve analysis.

Moreover, the modified scoring system exhibited improved risk stratification, characterized by a more rational stepwise increase in malignancy rates across C-TIRADS 4A to 5 and a more appropriate allocation of malignant nodules to higher categories (P < 0.001, Table 3). The validation results corroborated the findings from the primary cohort, lending further support to the effectiveness of the modified C-TIRADS system.

4 Discussion

In this study, we developed and validated a size-specific modification of the C-TIRADS scoring system for sub-centimeter thyroid nodules, using SHAP-informed feature weighting derived from a machine learning model. Our findings demonstrated that the modified scoring system, which assigned greater weight to vertical orientation, the most predictive feature in small nodules, achieved superior diagnostic performance, improved malignancy risk stratification, and enhanced clinical utility compared with the original C-TIRADS guideline. These results were consistently observed in both the primary and external validation cohorts, supporting the robustness and generalizability of the modified system.

Although most international guidelines, including those from the American Thyroid Association, recommend active surveillance or conservative management for nodules ≤1 cm, a growing body of evidence (13, 14) suggests that a small but significant proportion of cases harbor aggressive features, such as extrathyroidal extension, lymph node metastasis, and BRAF mutations, even at an early stage. A meta-analysis reported that the overall incidence of central lymph node metastases in papillary thyroid microcarcinoma patients was 33% (15). Improving diagnostic performance for sub-centimeter thyroid nodules is therefore clinically important, yet remains challenging. Due to their small size and frequently ambiguous sonographic appearance, these nodules are more likely to be underdiagnosed or misclassified by conventional scoring systems. This diagnostic uncertainty may lead to missed malignancies or delayed interventions in patients with clinically significant microcarcinomas. Hence, a size-tailored and more accurate risk stratification approach, such as the SHAP-informed modified C-TIRADS proposed in this study, is needed to better identify microcarcinomas and thereby support more personalized and effective clinical decision-making.

In our SHAP analysis, vertical orientation emerged as the most important predictor of malignancy among the six C-TIRADS features in sub-centimeter nodules, consistent with previous findings (12). Thyroid microcarcinomas more frequently exhibit a “taller-than-wide” configuration, in which the anteroposterior dimension exceeds the transverse dimension. The majority of microcarcinomas tend to arise in a subcapsular or peripheral location, which predisposes the tumor to invade outward, following the path of least resistance, into the adjacent soft tissue structures rather than expanding laterally within the confined parenchyma of the thyroid (16). Additionally, the tall cell variant of papillary thyroid carcinoma, defined by tumor cells at least twice as tall as they are wide, is associated with a more aggressive phenotype (17), which may contribute to vertical orientation. Moreover, the hobnail variant of thyroid carcinoma is a distinctive pattern in which tumor cells show loss of cellular polarity and cohesiveness (18). In these tumors, cells lose their normal orientation and cell-to-cell adhesion is diminished, which may result in a vertical growth behavior. These morphological and histopathological patterns provide a biological rationale for assigning increased weight to vertical orientation in the modified scoring system.

Furthermore, the modified scoring system exhibited superior risk stratification, with progressively increasing malignancy rates across C-TIRADS categories 4A to 5 and a more appropriate allocation of malignant nodules to higher-risk categories. This modified model improves the evaluation and stratification of high-risk sub-centimeter nodules and is helpful for clinical decision-making, especially regarding whether to perform fine-needle aspiration or adopt active surveillance.

It is noteworthy that recent studies have explored the potential value of incorporating additional ultrasound features or clinical parameters into risk stratification models. For example, one study (19) demonstrated that combining TIRADS scoring systems with thyroid-stimulating hormone could improve the sensitivity of predicting differentiated thyroid carcinoma, and other studies (20–22) have shown that vascularity, elastography, and contrast-enhanced ultrasound may further enhance the diagnostic accuracy of TIRADS. However, C-TIRADS has been widely adopted in China due to its simplicity and reasonable diagnostic performance. To maintain its ease of use, our study focused solely on optimizing the six intrinsic features defined by C-TIRADS, without incorporating additional factors, although integrating further ultrasound features or clinical parameters may further improve the diagnostic performance of risk stratification models. In future studies, integrating additional features and developing user-friendly tools such as nomograms or online calculators may further facilitate the clinical application of the modified C-TIRADS system. Additionally, our findings suggest that the modified C-TIRADS system may have potential implications for refining fine-needle aspiration decision-making in sub-centimeter nodules. This warrants further investigation in future research.

Despite these promising results, several limitations should be acknowledged. First, this was a retrospective study and may be subject to inherent selection bias. Second, although the external validation cohort supported the modified model, both cohorts were derived from tertiary medical centers within the same city. This geographic and institutional homogeneity may limit the generalizability of our findings to other populations and practice settings. Therefore, our results should be interpreted as preliminary evidence, and further multicenter validation is warranted to confirm the robustness and applicability of the modified C-TIRADS system. Third, this study specifically focused on the six ultrasound features explicitly defined in the 2020 C-TIRADS guideline, since the primary objective was to optimize the C-TIRADS scoring system itself. Thus, additional features such as vascularity, elastography, contrast-enhanced ultrasound, thyroid-stimulating hormone, or BRAF mutation were not included. Nevertheless, incorporating these features may further enhance diagnostic accuracy and should be explored in future prospective studies with standardized protocols. Fourth, information on overall thyroid size, multinodular goiter, Hashimoto thyroiditis, and Graves disease was not systematically collected in this retrospective study. Therefore, we were unable to evaluate the potential impact of these conditions on the association between vertical orientation and malignancy. Finally, although SHAP analysis provides model explainability, prospective validation remains necessary for widespread implementation.

5 Conclusions

In conclusion, this study proposed a SHAP-guided modification of the C-TIRADS scoring system tailored for sub-centimeter thyroid nodules. By assigning greater weight to vertical orientation, the modified scoring system achieved better diagnostic performance and more accurate risk stratification for sub-centimeter nodules, facilitating the identification of high-risk sub-centimeter nodules in clinical practice.

Statements

Data availability statement

The data analyzed in this study is subject to the following licenses/restrictions: The datasets of this study are available from the corresponding author upon reasonable request. Requests to access these datasets should be directed to Zhe Chen, zchensdzl@163.com.

Ethics statement

The study involving human participants was approved by the Ethics Committee of Cancer Hospital of Shantou University Medical College (2024055) and the Ethics Committee of Second Affiliated Hospital of Shantou University Medical College (2024-63). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

DG: Conceptualization, Methodology, Validation, Writing – original draft, Writing – review & editing. ZL: Data curation, Investigation, Methodology, Writing – review & editing. JW: Data curation, Investigation, Methodology, Writing – review & editing. XL: Writing – review & editing, Data curation. HH: Supervision, Validation, Writing – review & editing. YZ: Supervision, Validation, Writing – review & editing. ZC: Conceptualization, Methodology, Project administration, Writing – original draft.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by the Science and Technology Planning Project of Shantou (No.240416216497338 and No.240507196498846).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Durante C, Hegedus L, Czarniecka A, Paschke R, Russ G, Schmitt F, et al. European Thyroid Association Clinical Practice Guidelines for thyroid nodule management. Eur Thyroid J. (2023) 12(5). doi: 10.1530/ETJ-23-0067
2. Grani G, Sponziello M, Filetti S, Durante C. Thyroid nodules: diagnosis and management. Nat Rev Endocrinol. (2024) 20:715–28. doi: 10.1038/s41574-024-01025-4
3. Xu L, Zeng F, Wang Y, Bai Y, Shan X, Kong L. Prevalence and associated metabolic factors for thyroid nodules: a cross-sectional study in Southwest of China with more than 120 thousand populations. BMC Endocr Disord. (2021) 21:175. doi: 10.1186/s12902-021-00842-2
4. Guth S, Theune U, Aberle J, Galach A, Bamberger CM. Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur J Clin Invest. (2009) 39:699–706. doi: 10.1111/j.1365-2362.2009.02162.x
5. Grani G, Sponziello M, Pecce V, Ramundo V, Durante C. Contemporary thyroid nodule evaluation and management. J Clin Endocrinol Metab. (2020) 105:2869–83. doi: 10.1210/clinem/dgaa322
6. Nambron R, Rosenthal R, Bahl D. Diagnosis and evaluation of thyroid nodules-the clinician’s perspective. Radiol Clin North Am. (2020) 58:1009–18. doi: 10.1016/j.rcl.2020.07.007
7. Alexander EK, Cibas ES. Diagnosis of thyroid nodules. Lancet Diabetes Endocrinol. (2022) 10:533–9. doi: 10.1016/S2213-8587(22)00101-2
8. Alexander EK, Doherty GM, Barletta JA. Management of thyroid nodules. Lancet Diabetes Endocrinol. (2022) 10:540–8. doi: 10.1016/S2213-8587(22)00139-5
9. Zhou J, Song Y, Zhan W, Wei X, Zhang S, Zhang R, et al. Thyroid imaging reporting and data system (TIRADS) for ultrasound features of nodules: multicentric retrospective study in China. Endocrine. (2021) 72:157–70. doi: 10.1007/s12020-020-02442-x
10. Si CF, Yu J, Cui YY, Huang YJ, Cui KF, Fu C. Comparison of diagnostic performance of the current score-based ultrasound risk stratification systems according to thyroid nodule size. Quant Imaging Med Surg. (2024) 14:9234–45. doi: 10.21037/qims-24-282
11. Qu C, Li HJ, Gao Q, Zhang JC, Li WM. Alteration trend and overlap analysis of positive features in different-sized benign and malignant thyroid nodules: based on Chinese thyroid imaging reporting and data system. Int J Gen Med. (2024) 17:1887–95. doi: 10.2147/IJGM.S461076
12. Zhou Y, Li WM, Fan XF, Huang YL, Gao Q. Comparing diagnostic efficacy of C-TIRADS positive features on different sizes of thyroid nodules. Int J Gen Med. (2023) 16:3483–90. doi: 10.2147/IJGM.S416403
13. Wang Z, Ji X, Zhang H, Sun W. Clinical and molecular features of progressive papillary thyroid microcarcinoma. Int J Surg. (2024) 110:2313–22. doi: 10.1097/JS9.0000000000001117
14. Sutherland R, Tsang V, Clifton-Bligh RJ, Gild ML. Papillary thyroid microcarcinoma: Is active surveillance always enough? Clin Endocrinol (Oxf). (2021) 95:811–7. doi: 10.1111/cen.14529
15. Liu LS, Liang J, Li JH, Liu X, Jiang L, Long JX, et al. The incidence and risk factors for central lymph node metastasis in cN0 papillary thyroid microcarcinoma: a meta-analysis. Eur Arch Otorhinolaryngol. (2017) 274:1327–38. doi: 10.1007/s00405-016-4302-0
16. Tallini G, De Leo A, Repaci A, de Biase D, Bacchi Reggiani ML, Di Nanni D, et al. Does the site of origin of the microcarcinoma with respect to the thyroid surface matter? A multicenter pathologic and clinical study for risk stratification. Cancers. (2020) 12:246. doi: 10.3390/cancers12010246
17. Nath MC, Erickson LA. Aggressive variants of papillary thyroid carcinoma: hobnail, tall cell, columnar, and solid. Adv Anat Pathol. (2018) 25:172–9. doi: 10.1097/PAP.0000000000000184
18. Asioli S, Erickson LA, Righi A, Lloyd RV. Papillary thyroid carcinoma with hobnail features: histopathologic criteria to predict aggressive behavior. Hum Pathol. (2013) 44:320–8. doi: 10.1016/j.humpath.2012.06.003
19. Trimboli P, Curti M, Colombo A, Scappaticcio L, Leoncini A. Combining TSH measurement with TIRADS assessment to further improve the detection of thyroid cancers. Minerva Endocrinol (Torino). (2024) 49:125–31. doi: 10.23736/S2724-6507.24.04207-6
20. Ma G, Chen L, Wang Y, Luo Z, Zeng Y, Wang X, et al. Application of microvascular ultrasound-assisted thyroid imaging report and data system in thyroid nodule risk stratification. Insights Imaging. (2024) 15:230. doi: 10.1186/s13244-024-01806-5
21. Borlea A, Borcan F, Sporea I, Dehelean CA, Negrea R, Cotoi L, et al. TI-RADS diagnostic performance: which algorithm is superior and how elastography and 4D vascularity improve the malignancy risk assessment. Diagnostics (Basel). (2020) 10:180. doi: 10.3390/diagnostics10040180
22. Borlea A, Moisa-Luca L, Popescu A, Bende F, Stoian D. Combining CEUS and ultrasound parameters in thyroid nodule and cancer diagnosis: a TIRADS-based evaluation. Front Endocrinol (Lausanne). (2024) 15:1417449. doi: 10.3389/fendo.2024.1417449

Keywords

sub-centimeter thyroid nodules, C-TIRADS, machine learning, SHAP, ultrasound, risk stratification, microcarcinoma

Citation

Guo D, Lin Z, Wang J, Liao X, Huang H, Zhai Y and Chen Z (2025) Optimizing C-TIRADS for sub-centimeter thyroid nodules using machine learning–derived feature importance. Front. Endocrinol. 16:1668347. doi: 10.3389/fendo.2025.1668347

Received

17 July 2025

Accepted

08 September 2025

Published

26 September 2025

Volume

16 - 2025

Edited by

Vincent Habouzit, Centre Hospitalier Universitaire (CHU) de Saint-Étienne, France

Reviewed by

Giovanni Vitale, Italian Auxological Institute (IRCCS), Italy

Sabine Weidner, University Hospital of Berne, Inselspital, Switzerland

Cihan Atar, Osmaniye State Hospital, Türkiye

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zhe Chen, zchensdzl@163.com
