Three-dimensional ultrasound-based radiomics nomogram for the prediction of extrathyroidal extension features in papillary thyroid cancer

Purpose: To develop and validate a three-dimensional ultrasound (3D US) radiomics nomogram for the preoperative prediction of extrathyroidal extension (ETE) in papillary thyroid cancer (PTC). Methods: This retrospective study included 168 patients with surgically proven PTC (non-ETE, n = 90; ETE, n = 78) who were divided into training (n = 117) and validation (n = 51) cohorts by a random stratified sampling strategy. The regions of interest (ROIs) were obtained manually from 3D US images. A large number of radiomic features were automatically extracted. Finally, a nomogram was built, incorporating the radiomics scores and selected clinical predictors. Receiver operating characteristic (ROC) curve analysis was performed to validate the capability of the nomogram on both the training and validation sets. The nomogram models were compared with conventional US models. The DeLong test was adopted to compare different ROC curves. Results: The area under the receiver operating characteristic curve (AUC) of the radiologist was 0.67 [95% confidence interval (CI), 0.580–0.757] in the training cohort and 0.62 (95% CI, 0.467–0.746) in the validation cohort. Sixteen features from 3D US images were used to build the radiomics signature. The radiomics nomogram, which incorporated the radiomics signature, tumor location, and tumor size, showed good calibration and discrimination in the training cohort (AUC, 0.810; 95% CI, 0.727–0.876) and the validation cohort (AUC, 0.798; 95% CI, 0.662–0.897). The results suggested that the diagnostic efficiency of the 3D US-based radiomics nomogram was better than that of the radiologist and that it had favorable discriminative performance with a higher AUC (DeLong test: p < 0.05). Conclusions: The 3D US-based radiomics signature nomogram, a noninvasive preoperative prediction method that incorporates tumor location and tumor size, showed advantages over radiologist-reported ETE status for PTC.


Introduction
Papillary thyroid cancer (PTC) is the most common endocrine malignant tumor, and its incidence is increasing worldwide (1). This increase is partly attributable to the popularization of routine physical examination and the improvement of high-frequency ultrasound (US) (2). Most PTC patients have a good prognosis; more than 90% of them survive for more than 10 years (3). Although PTC has a favorable prognosis, some cases show aggressive clinical features, such as lymph node and distant metastases, and a poorer prognosis (4). Extrathyroidal extension (ETE) has long been considered an independent predictor of poor prognosis in PTC patients (5). ETE is also regarded as an important risk factor associated with recurrence and metastasis, and it has an important impact on staging and the choice of operation (6). Recurrence and mortality after surgery are higher in patients with ETE. The 15-year survival rate among PTC patients with ETE was significantly worse than that in patients without ETE (7, 8). Traditional surgical options for PTC include total and subtotal thyroidectomy, according to the National Comprehensive Cancer Network (NCCN) Guidelines for Thyroid Carcinoma (Third Edition, 2018); total thyroidectomy is the best method to treat PTC patients with ETE (7). Although both surgical procedures have no significant effect on distant metastasis and cancer-specific mortality rates, subtotal thyroidectomy can retain some functionality of the thyroid gland and prevent injuries of the parathyroid and the contralateral recurrent laryngeal nerve (9). Therefore, predicting ETE preoperatively is critical for clinicians to choose the surgical approach.
Currently, pathological biopsy is the only gold standard for the diagnosis of ETE (10). Imaging methods such as magnetic resonance imaging (MRI), computerized tomography (CT), single-photon emission computed tomography (SPECT), and US are commonly used for the diagnosis of ETE. MRI offers high soft-tissue and spatial resolution. In a previous study, Wei et al. (11) used multiparametric MRI for preoperative assessment of ETE in 132 cases of PTC, with areas under the receiver operating characteristic curve (AUCs) of 0.96 and 0.87 in the training and testing sets, respectively. However, MRI is relatively expensive and time-consuming; it is also not suitable for patients with implants or claustrophobia (12). CT has a certain advantage in evaluating the relationships between the lesions and surrounding tissue, but it carries a potential risk of radiation exposure. SPECT can clearly show the anatomic location of the tumor. Moreover, a study suggested that SPECT also exhibits promising advantages including higher sensitivity (50%) and specificity (100%) (13). Although several studies have suggested the value of the nuclear medicine method, it could not be widely used clinically in thyroid carcinoma for reasons of limited availability, radiation exposure, and cost-effectiveness. At present, US examination has become the most frequently used imaging method in PTC patients because of a series of advantages, such as being inexpensive, noninvasive, and free of ionizing radiation. Kwak et al. (14) reported that the sensitivity of US examination was 65.2% and the specificity was 81.8% when more than 25% of the thyroid nodule was in contact with the adjacent capsule. For >50% contact between the tumor and capsule, Lee et al. (15) reported that US findings of capsule disruption had a better AUC (0.674 vs. 0.638) in predicting ETE than CT in 377 PTC patients, while CT combined with US imaging achieved the best diagnostic accuracy for detecting ETE, with an AUC of 0.744.
With the development of US technology, the emergence of three-dimensional (3D) US provides more possibilities in choosing the imaging method for thyroid disease. By scanning the target organ with a single sweep of a US beam, it can easily provide images in multiple slices and planes from the stored data. Therefore, 3D imaging can provide significantly more information about lesions than traditional two-dimensional (2D) imaging. According to several previous research reports, 3D US examination may help to overcome the limitations of 2D US in various organs (16, 17). Kim et al. (18) reported that compared to 2D thyroid US, 3D US had higher sensitivity for predicting ETE (66.7% vs. 46.4%, p = 0.03). Consequently, 3D US may have diagnostic potential in predicting ETE status in PTC patients.
Radiomics is an emerging field in medical research, especially in oncology. Radiomics analysis was first reported by Lambin et al. (19) in 2012. By extracting and analyzing a large number of quantitative features from medical images, radiomics can improve the ability of disease diagnosis and prediction (20, 21). Studies have shown that image feature-based radiomics extraction has objective characteristics and great value in predicting clinical outcomes (19). Radiomics analysis has been applied in various diseases, such as cervical cancer, breast cancer, prostate cancer, lung cancer, rectal cancer, and musculoskeletal tumors (22-27). In the thyroid gland, Wang et al. (28) reported that 2D US radiomics can effectively evaluate whether ETE occurs in papillary thyroid carcinoma, with an AUC of 0.824. At present, however, there are no reports applying 3D US radiomics to evaluate ETE in PTC.
Therefore, the purpose of this study was to construct a 3D US radiomics nomogram for predicting ETE, to help clinicians select the most appropriate treatment strategy for PTC patients.

Patient data
This retrospective study was approved by the institutional review board of our institution. Informed consent was obtained from participants before the examination. Between November 2020 and October 2021, a total of 168 individuals who underwent preoperative 2D and 3D thyroid US at our institute were included in this study. All patients underwent subtotal or total thyroidectomy within 1 week after US examination. PTC was confirmed by pathology postoperatively. Exclusion criteria were as follows: 1) the clinical information of the patients was incomplete; 2) the patients had previously undergone thyroidectomy; 3) the entire lesion was not covered by the scan; 4) the images had poor quality; 5) the patients had undergone only 2D or only 3D routine preoperative US examination; 6) the maximum diameter of the primary tumor was <5 mm.

US examination and US-reported ETE status
Figure 1 shows the study procedure. Before total or subtotal thyroidectomy, all patients underwent 2D US with a 12-5 MHz linear array transducer and 3D US with a 13-5 MHz dedicated volume transducer (Philips IU Elite Ultrasound System). The patients were placed in a supine position with a pillow underneath the neck, with the neck stretched sufficiently to expose its anterior region. All images were obtained at identical instrument settings for depth, focus position, and gain. Clinical data, such as age, sex, and body mass index (BMI), were obtained during the US examination.
After completing the 2D thyroid examination, all subjects underwent 3D examination by two experienced radiologists (8 and 10 years of experience in thyroid US) who were blinded to the data sample identity. For 3D US, a volume box size was chosen to cover the tumor lesion. The sectorial and mechanical transducer with a scanning angle of 30° was used for automatic acquisition of tilt-series and 3D images. The two radiologists received a training course consisting of 20 unregistered cases to familiarize themselves with 3D scanning before this study. During the 2D and 3D US examination, they recorded the size of the lesion, nodule position (upper pole, middle pole, or lower pole), primary site (left lobe, right lobe, or isthmus), nodule border (clear or fuzzy), internal echo pattern (nonuniform or uniform), and nodule location (unilateral or bilateral). They independently reviewed the US imaging features of every patient and recorded a final diagnosis. In case of disagreement, additional reading sessions were held until a consensus was reached. Based on the American Joint Committee on Cancer (AJCC) guidelines (29-32), ETE can be diagnosed when one of the following criteria is present: 1) >25% of the primary tumor perimeter is in contact with the thyroid capsule; 2) the glandular tissue between the lesion and the thyroid capsule disappears; 3) the primary tumor exceeds the thyroid capsule and extends to the surrounding structures, such as the larynx, recurrent laryngeal nerve, trachea, vasculature, strap muscles, or esophagus.

Tumor segmentation and radiomic feature extraction
The two abovementioned radiologists were informed about the tumor location confirmed by operation and were blinded to other pathologic results and clinical information. The regions of interest (ROIs) of the 3D images were drawn manually, layer by layer, by the radiologists using ITK-SNAP (version 3.8.0, http://www.itksnap.org). Each final tumor region was then defined as the overlapping region of the two ROIs drawn independently by the two radiologists. In case of disagreement, additional reading sessions were held until a consensus was reached. Figure 2 shows a typical case with 2D and 3D US images and the ROIs. Texture analysis was performed on the acquired US images of the 168 patients. All feature extraction was performed using the PyRadiomics package in Python. Subsequently, a total of 1,693 features were extracted for each patient, including First Order Statistics (19 features), Shape-based (3D) (16 features), Shape-based (2D) (10 features), Gray-Level Co-Occurrence Matrix (24 features), Gray-Level Run Length Matrix (16 features), Gray-Level Size Zone Matrix (16 features), Neighboring Gray Tone Difference Matrix (5 features), and Gray-Level Features Matrix (14 features).
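The consensus rule described above (the final ROI is the region where the two readers' ROIs overlap) can be sketched in a few lines. This is a minimal illustration, not the study's code: masks are represented as sets of hypothetical (slice, row, column) voxel indices, and the Dice coefficient is added here only as an assumed, commonly used way to quantify reader agreement; the paper does not state that it was computed.

```python
def consensus_roi(mask_a, mask_b):
    """Return the voxels that both readers included (set intersection)."""
    return mask_a & mask_b


def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two voxel sets; 1.0 means identical masks."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree trivially
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


# Hypothetical toy masks: each voxel is a (slice, row, column) index.
reader_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
reader_2 = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}

final_roi = consensus_roi(reader_1, reader_2)
agreement = dice_coefficient(reader_1, reader_2)
```

Using the intersection keeps only voxels both readers agree on, which makes the final ROI conservative at the tumor boundary.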

Feature selection and radiomics signature building
All data were divided into training and validation sets at a ratio of 7:3 according to a random stratified sampling strategy. First, Levene tests were performed to verify variance homogeneity. To compare the two groups, we used the independent-sample t-test or Mann-Whitney U test to identify significant features (p < 0.05). After that, the least absolute shrinkage and selection operator (LASSO) regression method with 10-fold cross-validation was applied to select the most useful predictive ETE status-related features from the training cohort. LASSO is a machine learning regression analysis technique; it not only can reduce model overfitting and improve prediction results but also is regarded as a promising method to select significant features through regularization and variable selection (33). A formula was generated using a linear combination of the selected features weighted by their respective LASSO coefficients, and the formula was then used to calculate a risk score (defined as the radiomics score or radiomics signature). The radiomics signature was then combined with clinical predictors to build a nomogram.
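The radiomics score described above is a weighted sum of the selected features. The following minimal sketch shows that calculation; the feature names, coefficients, and intercept are hypothetical placeholders, not the 16 features or weights reported in this study.

```python
def radiomics_score(features, coefficients, intercept=0.0):
    """Linear combination of selected features weighted by LASSO coefficients.

    Only features that appear in `coefficients` (i.e. those with non-zero
    LASSO weights) contribute; any other extracted feature is ignored.
    """
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


# Hypothetical example: two selected features out of three extracted ones.
example_coefficients = {"feature_a": 0.5, "feature_b": -0.25}
example_features = {"feature_a": 2.0, "feature_b": 4.0, "feature_c": 9.0}

example_score = radiomics_score(example_features, example_coefficients, intercept=0.1)
```

In practice the features would normally be standardized with the training-cohort mean and standard deviation before the weights are applied, which is an assumption here because the paper does not describe its scaling step.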

Development of the ultrasound radiomics nomogram
The chi-square test (categorical variables) and Student's t-test (continuous variables) were adopted to identify the association between the clinical risk factors and ETE. In this study, clinical predictors including age, sex, BMI, lesion size, nodule location, nodule position, nodule border, internal echo pattern, and radiological ETE diagnosis were entered into a multivariate logistic regression model. To assess the incremental value of the radiomics signature for prediction of ETE in PTCs, the radiomics model was built by combining the radiomics signature with clinical predictors with p < 0.05. Next, the model was converted into a radiomics nomogram to help clinicians predict ETE visually in PTC patients. We calculated the AUC for predictive analysis. Then, the DeLong test was performed to compare the differences between the receiver operating characteristic (ROC) curves of the nomogram and the radiologists.
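For readers unfamiliar with the DeLong test, the sketch below implements the standard paired version for two scores measured on the same patients, using only the Python standard library. It is an illustration under stated assumptions, not the R routine used in this study: labels are coded 1 for ETE and 0 for non-ETE, and the function and variable names are hypothetical.

```python
import math


def _placement_values(scores, labels):
    """Return per-positive and per-negative structural components of the AUC."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    def psi(p, q):
        return 1.0 if p > q else 0.5 if p == q else 0.0

    v10 = [sum(psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / len(pos) for q in neg]
    return v10, v01


def _covariance(a, b):
    """Sample covariance with an (n - 1) denominator."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a, scores_b, labels):
    """Paired DeLong test; returns (auc_a, auc_b, z, two_sided_p)."""
    v10_a, v01_a = _placement_values(scores_a, labels)
    v10_b, v01_b = _placement_values(scores_b, labels)
    auc_a = sum(v10_a) / len(v10_a)
    auc_b = sum(v10_b) / len(v10_b)
    # Variance of the AUC difference from positive and negative components.
    var_pos = _covariance(v10_a, v10_a) + _covariance(v10_b, v10_b) - 2 * _covariance(v10_a, v10_b)
    var_neg = _covariance(v01_a, v01_a) + _covariance(v01_b, v01_b) - 2 * _covariance(v01_a, v01_b)
    variance = var_pos / len(v10_a) + var_neg / len(v01_a)
    if variance <= 0:
        return auc_a, auc_b, 0.0, 1.0  # identical curves: no evidence of a difference
    z = (auc_a - auc_b) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value
```

A call such as `delong_test(nomogram_probabilities, radiologist_calls, ete_labels)` would return both AUCs together with the z statistic and p-value; those three argument names are placeholders for data that is not part of this paper's text.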

Assessment of nomogram performance
The 3D US-based radiomics nomogram incorporating selected clinical predictors was developed on the training cohort and then tested in the validation cohort. The calibration curve and Hosmer-Lemeshow (H-L) test were used to assess the calibration of the radiomics nomogram. The discriminative performance of the radiomics nomogram was evaluated by using Harrell's concordance index (C-index).
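To make the calibration check concrete, the following sketch computes the Hosmer-Lemeshow statistic by grouping patients on predicted probability and comparing observed with expected event counts. It is a simplified illustration, not the SPSS procedure used in this study: the equal-size grouping rule and the function name are assumptions, and the p-value step is left out (it would come from a chi-square distribution with the returned degrees of freedom).

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, n_groups=10):
    """Return (statistic, degrees_of_freedom) for the Hosmer-Lemeshow test.

    Patients are sorted by predicted probability and cut into `n_groups`
    groups of (nearly) equal size. Outcomes are coded 1 = event, 0 = no event.
    """
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i])
    n = len(order)
    statistic = 0.0
    for g in range(n_groups):
        group = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        observed = sum(outcomes[i] for i in group)
        expected = sum(probabilities[i] for i in group)
        mean_p = expected / len(group)
        denominator = len(group) * mean_p * (1 - mean_p)
        if denominator > 0:
            statistic += (observed - expected) ** 2 / denominator
    return statistic, n_groups - 2
```

A small statistic (and hence a large p-value) indicates that predicted and observed event rates agree within each risk group, which is how the H-L results reported for the nomogram should be read.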

Histopathologic analysis
Two experienced pathologists with 9 and 12 years of experience, respectively, evaluated the histopathology of the tumor specimens. Paraffin-embedded sections of the PTC specimens were stained with hematoxylin and eosin (H&E). The pathologists evaluated the ETE features according to the guidelines published by the American Thyroid Association (ATA) (34). Then, the patients were divided into two groups: ETE and non-ETE groups.

Statistical analysis
All statistical tests in this study were conducted using R software (version 4.0.3, https://www.r-project.org). Statistical analysis of clinical data, multivariate logistic regression, and the H-L test were performed using SPSS software (version 22.0, SPSS Inc.). The corresponding 95% confidence interval (CI) was used to describe the correlation results. Normally distributed measurement data were expressed as mean ± standard deviation (SD). Other values were reported as median and interquartile range (IQR). The independent-sample t-test was adopted for normally distributed measurement data; otherwise, the Mann-Whitney U test was used. Count data were expressed as frequency (percentage) and compared by the chi-square test or Fisher exact test. The factors associated with ETE in PTC patients were analyzed by multivariate logistic regression (forward stepwise method). Next, the goodness of fit of the logistic regression models was assessed by the H-L test.
ROC analysis was employed to quantify the discriminative capability of the nomogram by comparing the nomogram-predicted versus the observed ETE probability. A two-sided p < 0.05 was considered statistically significant.
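As a brief illustration of how an ROC curve and its AUC are obtained from predicted scores, the sketch below sweeps the decision threshold from high to low and integrates the resulting curve with the trapezoidal rule. The function names and the toy data are hypothetical; labels are assumed to be coded 1 for ETE and 0 for non-ETE.

```python
def roc_curve_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) points of the ROC curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all cases tied at this threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy data: three ETE cases and three non-ETE cases.
toy_scores = [0.9, 0.2, 0.6, 0.5, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_trapezoid(roc_curve_points(toy_scores, toy_labels))
```

Handling tied scores as a single step keeps the trapezoidal area equal to the rank-based (Mann-Whitney) definition of the AUC.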

Clinical characteristics
A total of 168 PTC patients aged 41.96 ± 0.881 years (range, 20-72 years) were enrolled in this research. Based on pathologic results, 90 patients (41.71 ± 12.73 years old) and 78 patients (42.24 ± 11.22 years old) were assigned to the non-ETE and ETE groups, respectively. There was no significant difference in tumor size, nodule location, nodule position, nodule border, or internal echo pattern between the ETE group and the non-ETE group (all p > 0.05). According to the diagnostic criteria for ETE, 10 patients had thyroid capsule contact of approximately >25% of the primary tumor perimeter, 47 patients showed disappearance of the glandular tissue between the lesion and the thyroid capsule, and 21 patients had a primary tumor that exceeded the thyroid capsule and extended to surrounding structures. The 168 patients were divided into a training group (n = 117) and a validation group (n = 51) by stratified sampling. Table 1 shows the clinical data of the 168 patients.
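The stratified 7:3 split can be sketched as follows. This is an illustration, not the authors' procedure: the random seed, the rounding rule (rounding each class down), and the function name are assumptions. Under those assumptions, 90 non-ETE and 78 ETE cases yield 63 + 54 = 117 training cases and 27 + 24 = 51 validation cases, which matches the cohort sizes reported above.

```python
import random


def stratified_split(labels, train_parts=7, total_parts=10, seed=0):
    """Split indices into training/validation sets, preserving class balance.

    Each class is shuffled separately, and the first
    floor(n_class * train_parts / total_parts) indices go to training.
    """
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        n_train = len(members) * train_parts // total_parts  # integer arithmetic
        train_idx.extend(members[:n_train])
        val_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(val_idx)


# 0 = non-ETE (n = 90), 1 = ETE (n = 78), as in the study population.
cohort_labels = [0] * 90 + [1] * 78
train_indices, validation_indices = stratified_split(cohort_labels)
```

Integer arithmetic is used for the per-class counts because floating-point products such as 90 * 0.7 can fall just below the intended whole number.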

Radiomics score
A total of 1,693 features were extracted from the original 3D US images in the training cohort. LASSO regression with L1 regularization was then used to select the optimal radiomic features. The model complexity depends on the regularization parameter lambda (λ). According to 10-fold cross-validation, the models had the lowest mean squared error (MSE) for the 3D image features when λ was 0.037. After LASSO regression analysis, 691 radiomic features were reduced to 16 potential predictors. As Figure 3 shows, the 16 features were included in the radiomics score formula. Table 2 shows the 16 best radiomic features in the 3D US images.
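The way L1 regularization drives weak coefficients to exactly zero can be illustrated with a small coordinate-descent LASSO written in plain Python. This is a teaching sketch under stated assumptions, not the software used in this study: it minimizes a squared-error objective (suggested by the MSE criterion mentioned above), assumes centered feature columns with no intercept, and uses an arbitrary λ of 0.1 on toy data rather than the reported value of 0.037.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; exactly 0.0 when |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates.

    X is a list of rows; columns are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            scale = 0.0  # mean squared value of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                scale += X[i][j] ** 2
            rho /= n
            scale /= n
            beta[j] = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
    return beta


# Toy data: y depends strongly on feature 0 and only weakly on feature 1.
toy_X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
toy_y = [2.05, -1.95, 1.95, -2.05]
toy_beta = lasso_coordinate_descent(toy_X, toy_y, lam=0.1)
selected_features = [j for j, b in enumerate(toy_beta) if b != 0.0]
```

In this toy case the weak feature is dropped and only the strong one is retained, which mirrors how a long feature list is reduced to a short set of predictors with non-zero coefficients.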

Development and performance of the prediction model
In this study, the AUC of the radiologists was 0.67 (95% CI, 0.580-0.757) in the training cohort and 0.62 (95% CI, 0.467-0.746) in the validation cohort. Then, according to the forward logistic regression (LR) method, bilateral tumor (p = 0.004) and tumor size (p = 0.005) were identified as independent predictive factors for ETE. A radiomics nomogram was then built with these predictive factors (Figure 4). The prediction model was constructed as follows: Linear predictor = -27.722 - 1.153 × X1 - 1.148 × X2 + 57.033 × X3, where X1 is tumor location, X2 is tumor size, and X3 is the radiomics signature.
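The linear predictor above can be turned into a predicted probability with the logistic function, as is standard for a logistic regression model. The sketch below uses the coefficients exactly as reported; how X1 (tumor location) and X2 (tumor size) are numerically coded is not stated in the text, so the inputs in the example are hypothetical and serve only to show the arithmetic.

```python
import math

# Coefficients as reported in the prediction model.
INTERCEPT = -27.722
COEF_LOCATION = -1.153  # X1: tumor location (coding not stated in the text)
COEF_SIZE = -1.148      # X2: tumor size (coding not stated in the text)
COEF_RADSCORE = 57.033  # X3: radiomics signature


def linear_predictor(x1, x2, x3):
    """Linear predictor of the nomogram's logistic regression model."""
    return INTERCEPT + COEF_LOCATION * x1 + COEF_SIZE * x2 + COEF_RADSCORE * x3


def predicted_probability(x1, x2, x3):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(x1, x2, x3)))


# Hypothetical inputs, chosen only to illustrate the calculation.
example_lp = linear_predictor(0, 0, 0.5)
example_probability = predicted_probability(0, 0, 0.5)
```

Because the coefficient of the radiomics signature is large, small changes in X3 move the predicted probability substantially, so the signature should be computed on the same scale as in the training cohort.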
In the training cohort, the radiomics nomogram showed good discrimination with an AUC of 0.810 (95% CI, 0.727-0.876), which was significantly higher than that of the radiologists (DeLong test, p = 0.0136). In the validation set, it also showed better discrimination, with an AUC of 0.798 (95% CI, 0.662-0.897; DeLong test, p = 0.0296). The ROC curves of the two models for both the training and validation sets are presented in Figure 5. The calibration curve and the H-L test showed good calibration in the training cohort (Figure 6A, p = 0.828) and the validation cohort (Figure 6B, p = 0.071). The C-index of the radiomics nomogram was 0.831. These results indicate that the radiomics nomogram performed well in differentiating ETE from non-ETE and may help in the clinical decision-making process.

Discussion
Due to the prevalence of coronavirus disease 2019 (COVID-19), delayed investigations and treatment may further lead to an increase in the incidence and mortality of thyroid carcinoma (35). As the most common pathological type of differentiated thyroid carcinoma (DTC), PTC has a low degree of malignancy and a high cure rate if detected and treated in a timely manner. The presence of ETE is considered highly important in PTC patients, and it is included in almost all prognostic scoring systems as a staging variable. The reported incidence of ETE in PTC ranges between 5% and 45%, and in our patient population, the incidence of ETE was 46% (78/168), which does not differ much from the reported range (36). Moreover, PTC patients with ETE have a poor prognosis and a high risk of recurrence in the near future (37). One study found that patients with a history of ETE display a poorer therapeutic effect following 131I ablation (38). Because the thyroid capsule is closely adjacent to surrounding tissues such as the recurrent laryngeal nerve, patients who are not diagnosed correctly before surgery often need reoperation, which increases the pain and economic burden of PTC patients. Taken together, it is very important to identify the presence of ETE before thyroid surgery. At present, pathological diagnosis is the only gold standard for ETE diagnosis. Reliable and accurate preoperative imaging examination of ETE will help in accurate diagnosis and selection of the optimum therapeutic strategy. Although fine-needle aspiration biopsy (FNAB) is the most widely used method for the preoperative assessment of thyroid nodules, it is an invasive procedure, and its samples cannot always reach satisfactory diagnostic power. In addition, its diagnostic capacity for ETE in PTC patients is limited (39). Noninvasive imaging examinations such as ultrasonic imaging, MRI, CT, and positron emission tomography not only play an increasingly important role in the diagnosis of ETE but also can be useful to guide treatment and patient follow-up. Among them, US is the sole fundamental imaging modality for the assessment of thyroid nodules. US examination is noninvasive, fast, and reliable and can help to enhance the early detection of pathologies (40-42). However, these traditional imaging examinations depend on the experience and subjective judgments of the radiologists, which limits the ability to make an objective decision.
Therefore, a noninvasive, simple, and quantitative method is needed to help predict ETE clinically. Radiomics is a hot topic; it allows the quantitative extraction of high-throughput features from radiographic images, with the advantage of objectivity. Numerous studies have confirmed the usefulness of 3D US in clinical settings. Building on the transverse and longitudinal views of 2D US, 3D US technology improves the visualization of target lesions by adding a new dimension, the coronal view. In a study involving 85 PTC patients, Kim et al. (18) reported that 3D US not only had higher sensitivity (66.7% vs. 46.4%, p = 0.03) but also showed better agreement (κ 0.53 vs. 0.37) than 2D US for predicting ETE, and that 3D thyroid US saved scanning time compared with 2D US. At present, however, there are no reports applying 3D US radiomics to evaluate ETE in PTC, which is the main novelty of this study. Kim et al. (43) found that a tumor size ≥1 cm was an independent prognostic factor for predicting ETE in PTC. In our study, using the forward LR method, we found that a tumor size ≥1 cm was associated with ETE in PTC patients, which is in line with the above findings. We also found that the location of the nodules (unilateral or bilateral) was another clinical predictor. Then, we built a radiomics nomogram based on the clinical risk factors, tumor size and tumor location, to predict ETE. As the results showed, the AUC of the radiomics nomogram in diagnosing ETE was higher than that of the radiologists in both the training cohort and the validation cohort. In addition, the C-index and calibration curve also showed good consistency in the primary group and validation group. Hence, the 3D US-based radiomics signature nomogram, a noninvasive preoperative prediction method that incorporates tumor location and tumor size, showed advantages over radiologist-reported ETE status for PTC and could serve as a visualization tool to help clinicians choose a suitable surgical modality.

FIGURE 3 Radiomic feature selection using the least absolute shrinkage and selection operator (LASSO) regression model. The LASSO regression with 10-fold cross-validation (A) was used to reduce the dimension of the grouping characteristics (B). Finally, 16 radiomic features with non-zero coefficients were selected.

FIGURE 4 The radiomics nomogram was developed by incorporating the radiomics scores and selected clinical predictors. The radiomics nomogram incorporated tumor location (unilateral or bilateral), tumor size, and the radiomics signature.

FIGURE 5 Performance of the nomogram. (A) ROC curves of US-reported ETE status and radiomics nomogram for predicting ETE in the validation cohort and in the primary cohort (B). nomo, nomogram; US, ultrasound; ROC, receiver operating characteristic; ETE, extrathyroidal extension.

Inevitably, our present study has several limitations. First, this was a retrospective, single-center study lacking external validation. Therefore, these results are not necessarily applicable to all patients with PTC preoperatively; multicenter studies are warranted to further validate the results. Moreover, to improve the diagnostic capability, the algorithm and models will be further optimized to improve the accuracy of external validation. Second, the radiomic features might not be sufficient because we only used grayscale US images to build the radiomics nomogram. In future research, we will add radiomic characteristics of multimodal US, such as elastography and contrast-enhanced US images, to the nomogram. Finally, the ROI segmentation was performed manually, which might be affected by the radiologists' subjective bias. This may be resolved by semiautomatic or automatic segmentation.
Our 3D US-based radiomics nomogram, which combines the radiomics signature with the clinical predictors tumor location and tumor size, shows favorable predictive accuracy for preoperative ETE in patients with PTC. This nomogram is a promising tool to improve diagnostic accuracy.

FIGURE 1 Flowchart of patient selection in this study. PTC, papillary thyroid cancer; ETE, extrathyroidal extension.

FIGURE 2 Schematic diagram for region of interest (ROI) delineation on ultrasound images. (A) The original two-dimensional image acquired during ultrasound (US) examination; (B) the ROI delineated in the largest section of the tumor based on the original two-dimensional ultrasound image. (C-E) Transverse, coronal, and sagittal planes, respectively; (F-H) the corresponding three-dimensional ROIs.
FIGURE 6 Calibration curves for the radiomics nomogram in the training (A) and validation (B) sets. The calibration curve and the Hosmer-Lemeshow test showed good calibration in the training cohort (p = 0.828) and in the validation cohort (p = 0.071).

TABLE 1
Basic clinical data for our group.

TABLE 2
The 16 best radiomic features in 3D US image.