Application of ultrasound elastography and radiomic for predicting central cervical lymph node metastasis in papillary thyroid microcarcinoma

Objective: This study aimed to combine ultrasound (US) elastography (USE) and radiomics to predict central cervical lymph node metastasis (CLNM) in patients with papillary thyroid microcarcinoma (PTMC). Methods: A total of 204 patients with 204 thyroid nodules who had confirmed PTMC and were treated in our hospital were enrolled and randomly assigned to the training set (n = 142) or the validation set (n = 62). Clinical and US features together with USE (gender, shape, echogenic foci, thyroid imaging reporting and data system (TIRADS) category, and elasticity score) and a radiomic signature were used to build three models. A nomogram was plotted for the combined model, and decision curve analysis was applied to assess its clinical usefulness. Results: The combined model (USE and radiomics) showed the best diagnostic performance in both the training set (AUC = 0.868) and the validation set (AUC = 0.857), outperforming the other models. Conclusion: The combined model based on USE and radiomics showed superior performance in predicting CLNM in patients with PTMC, compensating for the low specificity of conventional US in detecting CLNM.


Introduction
Papillary thyroid microcarcinoma (PTMC), defined as a papillary thyroid carcinoma (PTC) of 1 cm or less in size, has been diagnosed with increasing frequency as a result of the widespread use of high-resolution ultrasonography (US) and fine-needle aspiration biopsy (FNAB) (1-3). Despite the excellent prognosis of PTMC, central lymph node metastasis (CLNM) is common among patients with PTMC, with a reported occurrence as high as 64.1%, and it implies poorer overall survival (4, 5). According to the current guideline (1), PTMCs with aggressive features, such as clinical node metastasis, distant metastasis, or symptoms of invasion of the recurrent laryngeal nerve or trachea, should undergo therapeutic central compartment lymph node dissection (CLND) during initial thyroidectomy. Low-risk PTMCs, such as clinically node-negative (cN0) PTMCs that show no clinical evidence of CLNM on US or other preoperative imaging modalities, have other options besides immediate surgery, including active surveillance and thyroidectomy without CLND (6). However, the preoperative detection rate of CLNM remains quite low, whereas the rate of CLNM proved by histopathological examination is as high as 31% to 60.9% in PTMCs (7, 8). This makes accurate preoperative evaluation of lymph node status particularly important. On the one hand, CLND may bring unnecessary complications and economic burdens in pathologically lymph node-negative PTMC (9). On the other hand, thyroidectomy alone may lead to disease recurrence, and CLND is more difficult in a second surgery (10). In the era of precision medicine, it remains controversial whether CLND should be performed in PTMCs, which makes it crucial to identify predictors of CLNM and to evaluate lymph node status accurately before surgery.
Ultrasound elastography (USE), an imaging technology sensitive to tissue stiffness that was first described in the 1990s, is a more objective tool for evaluating tissue hardness than clinical palpation. USE can be classified into strain elastography (SE) and shear wave elastography (SWE) according to the physical quantity measured (11). A recent study reported that SWE was of great value in differentiating metastatic from benign lymph nodes in PTCs (12). When SWE was used to evaluate the stiffness of PTC nodules, higher elasticity values were associated with pathologic central or lateral lymph node (LN) metastasis, improving the sensitivity for predicting CLNM from 28% to 45% compared with gray-scale US alone (13). Based on the different patterns of the SE image, a scoring system, the elasticity score (ES), graded from 1 to 4, has been used to describe tissue stiffness (14). SE has been reported to be useful for predicting extrathyroidal extension (15, 16), which should be helpful in the diagnosis of PTMC and in evaluating possible recurrence. SE has also proved to be an effective addition to US for predicting malignancy and characterizing thyroid nodules (17). However, to our knowledge, few studies have evaluated the application of ES in predicting CLNM of PTMC.
Radiomics is emerging as a promising tool that quantitatively extracts high-throughput features and converts medical images into mineable data related to pathology, biomarkers, and genomics, improving diagnostic, predictive, and prognostic accuracy when applied in clinical decision support systems (18, 19). In recent years, radiomics has been widely used in tumor research with satisfactory results. A previous study showed that a CT-based radiomic signature performed well in predicting lymph node metastasis in PTCs and that radiomic features from the enhancement phase played a leading role (20). Several studies have shown that ultrasound-based radiomic analysis has great value for noninvasively predicting cervical lymph node metastasis in PTCs (18, 21). We hypothesized that radiomics may provide more preoperative information about PTMCs and could help identify patients with high-risk PTMC and support a more suitable treatment plan.
In this study, we aimed to analyze the performance of USE and radiomics in predicting CLNM of PTMC and to visualize the contribution of the risk factors in order to facilitate the design of an optimal treatment strategy.

Patients
This retrospective study was approved by the Ethics Committee (2022-SR-512) and performed in accordance with the Declaration of Helsinki. Because of the retrospective design, the requirement for informed consent was waived. Patients with confirmed PTMC who were treated in our hospital from November 2021 to October 2022 were included in this study. The inclusion criteria were as follows: (1) histopathologically confirmed PTMC; (2) primary thyroid surgery; (3) pathology of the central cervical lymph node confirmed by surgery or fine-needle aspiration biopsy (FNAB); and (4) preoperative ultrasound examination performed 1 week before surgery, with complete clinical characteristics and ultrasound images available. The exclusion criteria were as follows: (1) preoperative interventional therapy; (2) incomplete clinical data; and (3) ultrasound images that did not meet the requirements for radiomic analysis. Finally, 204 patients with a total of 204 thyroid nodules were enrolled and randomly assigned to the training set (n = 142) and the validation set (n = 62) at a 7:3 ratio. In each set, the patients were divided into two groups based on the pathological results of CLND or FNAB. Clinical variables including gender, age, and BRAF V600E mutation status (wild type/mutant type) were recorded.
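The random 7:3 assignment described above can be illustrated with a minimal sketch. The text does not state how the randomization was implemented, so the function name `split_patients`, the fixed seed, and the use of sequential patient IDs are all assumptions made for illustration only.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and validation lists.

    Hypothetical helper: the seed is fixed only so the sketch is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    # Round down the training size: 204 * 0.7 = 142.8 -> 142 patients.
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Assumed sequential IDs 1..204 standing in for the enrolled patients.
train_ids, valid_ids = split_patients(range(1, 205))
print(len(train_ids), len(valid_ids))
```

Rounding the training size down reproduces the reported set sizes of 142 and 62 for 204 patients.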

US image acquisition
All patients underwent a preoperative US examination of the thyroid nodule on a Samsung XR80A ultrasound machine with a 3- to 12-MHz linear probe. The US features of each thyroid nodule were observed and recorded by two US physicians with more than 10 years of experience in thyroid US examination. The features included capsular contact (positive/negative), aspect ratio (<1/=1/>1), shape (regular/irregular), echogenicity (very hypoechoic/hypoechoic/isoechoic or hyperechoic), echogenic foci (none/punctate echogenic foci/macrocalcifications), blood flow (none/hypervascular/mild or moderate), TIRADS category (3/4a/4b/4c/5) according to the thyroid imaging reporting and data system developed by Kwak et al. (22), and ES. The ES was graded from 1 to 4, corresponding to elasticity from soft to hard. When the two physicians disagreed, a third, senior US physician reviewed the features and made the final decision. The US image chosen for radiomic analysis had to meet the following requirements: (1) it showed as many malignant features of the thyroid nodule as possible in the transverse section; (2) it was stored in digital imaging and communications in medicine (DICOM) format without any marks; and (3) it was captured with the same gain, depth, and frequency settings.

Region-of-interest segmentation and radiomic feature extraction
After being exported from the ultrasound instrument in DICOM format, the US images of the maximum cross-sectional area were segmented by an ultrasound expert (more than 5 years of experience) using open-source software (ITK-SNAP 3.8.0; http://www.itksnap.org) to generate a region of interest (ROI) containing the thyroid nodule. A total of 464 radiomic features were then extracted from the US images, consisting of 9 shape features, 90 first-order features, and 365 texture features.

Radiomic feature selection and signature calculation
The reproducibility of radiomic feature extraction was evaluated using inter- and intra-operator intraclass correlation coefficients (ICCs). Three weeks after the initial radiomic feature extraction, the same ultrasound expert randomly selected 30 lesions from the training set and drew the ROIs again, and another radiologist with more than 5 years of experience repeated the work independently. An independent-samples t-test was used to evaluate the inter- and intra-operator differences. An ICC > 0.75 was considered to indicate good agreement.
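The agreement check can be sketched as follows. The text does not specify which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), as defined by Shrout and Fleiss. The function name `icc_2_1` and the toy feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per lesion, one column per operator/session.
    """
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of operators or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-lesion mean square
    msc = ss_cols / (k - 1)              # between-operator mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy values of one feature measured twice on five lesions (illustrative only).
feature_values = [[0.61, 0.63], [0.42, 0.40], [0.88, 0.85], [0.30, 0.33], [0.75, 0.74]]
icc = icc_2_1(feature_values)
print(round(icc, 3), "good agreement" if icc > 0.75 else "poor agreement")
```

Under this assumption, a feature would be kept for further analysis only if its ICC exceeds the 0.75 threshold stated above.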
The least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation was used to select candidate radiomic features for calculating a radscore. The radscore was calculated according to the following formula:

$$\text{Radscore} = b_0 + \sum_{i=1}^{n} b_i c_i$$

where $b_0$ is the constant term in the regression, $b_i$ is the logistic regression coefficient of the $i$-th selected feature, $c_i$ is the value of that feature, and $n$ is the number of selected features.
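As a minimal sketch, the radscore defined here is a linear combination of the selected feature values weighted by their regression coefficients, plus the constant term. The function name `radscore` and all numeric values below are hypothetical; the actual coefficients of this study are not given in the text.

```python
def radscore(intercept, coefficients, feature_values):
    """Return intercept + sum(b_i * c_i) over the selected radiomic features."""
    if len(coefficients) != len(feature_values):
        raise ValueError("one coefficient is required per selected feature")
    return intercept + sum(b * c for b, c in zip(coefficients, feature_values))


# Hypothetical constant term, coefficients, and feature values for one patient.
example_score = radscore(0.5, [1.0, -2.0], [3.0, 1.0])
print(example_score)
```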

Construction of three different models and model performance assessment
Univariate analysis was used to assess the association of clinical variables and US characteristics with CLNM. Model 1 was constructed in the training set from the clinical and US features with p < 0.05 plus ES. Model 2 was constructed from the optimal radiomic features selected by LASSO logistic regression in the training set. Model 3 was based on the combination of the selected clinical and US characteristics, ES, and radscore.
The three established models were validated in the validation set. Receiver operating characteristic (ROC) curves were used to evaluate the discriminative ability of the three models by calculating the area under the curve (AUC), sensitivity, and specificity.
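The three measures named above can be sketched in plain Python. This is an illustration, not the R code used in the study: the AUC is computed as the probability that a randomly chosen CLNM-positive case receives a higher score than a randomly chosen negative case, and the function names, toy labels, scores, and the 0.5 cutoff are assumptions.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 1 = CLNM present, 0 = CLNM absent; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auc(labels, scores))
print(sensitivity_specificity(labels, scores, 0.5))
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination over all cutoffs.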

Visualization of model 3
For the sake of precision medicine, a nomogram, which is convenient for clinical decision-making, was plotted. Decision curve analysis (DCA) was employed to determine clinical usefulness by quantifying the net benefit in both the training set and the validation set. Calibration curves were used to evaluate the calibration of the nomogram in both sets.
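The net benefit quantified by DCA can be sketched as follows, using the standard definition: at a threshold probability pt, net benefit = TP/N - (FP/N) x pt/(1 - pt). The function names and toy data are assumptions for illustration; the study itself performed this analysis in R.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = CLNM present; probs are predicted probabilities from a model.
dca_labels = [1, 1, 0, 0]
dca_probs = [0.9, 0.6, 0.7, 0.1]
for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(dca_labels, dca_probs, pt), net_benefit_treat_all(dca_labels, pt))
```

A decision curve plots these values over a range of thresholds; a model is clinically useful where its curve lies above both the treat-all reference and the treat-none reference (net benefit of zero).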

Statistical analysis
Statistical analyses, including the univariate analysis and the binary logistic regression used to build the models, were conducted with SPSS software (version 26.0). The chi-square (χ²) test or Fisher's exact test was used to compare categorical variables. Continuous variables were compared with the independent-samples t-test or the Mann-Whitney U test, depending on whether they were normally distributed. R software (version 3.6.1, http://www.r-project.org) was used for radiomic feature analysis, radscore construction, and model evaluation. A two-sided p < 0.05 was considered statistically significant.

Patient characteristics
The study flowchart is shown in Figure 1. A total of 142 patients with 142 thyroid nodules were enrolled in the training set, with a mean age of 42.25 ± 10.93 years (range, 24-73 years), including 41 men and 101 women. In addition, 62 patients were enrolled in the validation set, with a mean age of 41.61 ± 11.15 years (range, 25-70 years), including 19 men and 43 women. Lesions smaller than 1 cm accounted for 72.5% (148/204), and lesions equal to 1 cm accounted for 27.5% (56/204). Central cervical lymph node status was confirmed by surgery in 94.1% (192/204) of patients and by FNAB in 5.9% (12/204). The clinical and US characteristics of the training set and the validation set are summarized in Table 1. There was no significant difference between the two sets in clinical or US features.

Model 1: clinical features and US features
In the training set, the univariate analysis in Table 1 showed that four variables were related to CLNM. Model 1, constructed by combining these four significant features with ES, showed good performance with an AUC of 0.835 (95% CI, 0.768-0.902) (Figure 2). The sensitivity and specificity were 88% and 68.7%, respectively.

Model 3: comprehensive model
Based on the features with statistical significance, ES, and radscore, a comprehensive model was built, showing improved diagnostic efficiency with an AUC of 0.868 (95% CI, 0.811-0.925) (Figure 2). The sensitivity and specificity were 80% and 82.1%, respectively.

Performance of the nomogram (visualization of model 3)
The comprehensive model (model 3) was presented as a nomogram (Figure 3). The calibration curves of the nomogram were favorable in both the training and validation sets (Figure 4). As shown in Figure 5, model 3 provided a better net benefit than model 1 and model 2 for predicting CLNM in patients with PTMC across all threshold probabilities.

Discussion
The guidelines of the American Thyroid Association suggest that active surveillance rather than surgery can be considered for low-risk PTMC. However, some PTMCs still progress during surveillance. As a result, debate remains on the value of CLND for PTMC (1). In clinical practice, all possibilities should be taken into account when considering which of the two management options, observation or surgery, is more beneficial for patients with PTMC. Thus, careful assessment of the thyroid nodule and determination of risk factors related to CLNM could guide patients with PTMC toward appropriate management.
In our study, 102 of the 204 patients with PTMC (50%) were pathologically diagnosed with CLNM, which is consistent with the incidence of 31%-60.9% reported in previous studies (7, 8). According to our results, male gender is closely related to CLNM in PTMCs, which is consistent with numerous studies. Gui et al. previously reported that male gender was an independent risk factor for CLNM in PTMC (23). Wen et al. found that male gender increased the risk of CLNM about 2.92-fold in cN0 PTMC (9). Unhealthy behaviors of men, such as smoking and alcohol consumption, are likely to contribute to this result (24, 25). These studies suggest that more frequent follow-up surveillance or more aggressive treatment may be considered for men with PTMC, even when there is no evidence of CLNM.
Current studies have shown that, among ultrasound features, only microcalcification and irregular shape were able to predict CLNM in thyroid cancer (26-28). Our results support these findings: microcalcification and irregular shape were associated with CLNM, with p-values of 0.023 and <0.001, respectively. Microcalcifications, which are mainly caused by psammoma bodies with a diameter of approximately 10 µm to 100 µm, were frequently seen in PTMCs. Microcalcifications are deposits of calcium salts caused by the proliferation of blood vessels and fibers, which reflects the rapid growth of cancer cells (29, 30). Thus, the cervical lymph nodes should be assessed more carefully if microcalcifications are found in thyroid nodules on US. A previous study pointed out that irregular tumor shape is a risk factor for multifocality and bilaterality, which are common features of PTMC and are related to disease recurrence and poorer prognosis (31). Our study found that irregular shape may affect the outcome of patients with PTMC. As Kaliszewski et al. suggested, cases without clinically evident LNM but with an irregular shape should also be treated as high-risk or symptomatic PTMCs (32). Our findings lead us to agree with this view. According to Kwak-TIRADS (22), suspicious malignant features include solid component, hypoechogenicity, marked hypoechogenicity, microlobulated or irregular margins, microcalcifications, and taller-than-wide shape. As the number of suspicious US features increases, the probability of malignancy increases. Our study demonstrated that the higher the risk level of a thyroid nodule, the greater the possibility of CLNM in a patient with PTMC.
USE has been applied to predict CLNM in thyroid carcinoma in recent years, and satisfactory results have been achieved. Wang et al. used SWE to calculate elasticity parameters of the thyroid nodule, including Emin, Emean, and Emax, and concluded that Emax ≥ 48.4 was an independent risk predictor for CLNM in PTC (33). Woo et al. also found that quantitative SWE could predict pathologic prognostic factors of LN metastasis of PTC (12). These SWE results indicated that the stiffness of the thyroid nodule is positively correlated with CLNM in PTC, which led us to expect that SE would have the same effect in predicting CLNM in PTMC. Disappointingly, the final visualization of model 3 (the nomogram) revealed that ES contributes only a small proportion to the prediction of CLNM. Xu et al. found that conventional SE was not helpful in predicting CLNM (34), consistent with the result that a higher ES was not associated with CLNM (16). Compared with the quantitative data of SWE, ES relies mostly on subjective evaluation, which may introduce errors and lead to the present result. In addition, the depth of the lesion affects the score of the elastic image. Ultrasound beams are usually focused at a depth of around 3 cm to 5 cm, so that the area of maximum radiation force energy is 4 cm to 4.5 cm from the transducer, and the energy gradually diminishes as the beam travels through the medium (35). If ultrasound beams reach a point beyond the reference range, their intensity is too weak to generate an adequate acoustic radiation force (36). This may result in a low signal-to-noise ratio, affecting the accuracy of ES.
As a machine learning approach, radiomics requires a large amount of data and consistent image data between the training and validation sets. By extracting a large number of features that cannot be identified by the naked eye, more information about the tumor is mined (37, 38). In our study, the finally chosen features included one shape feature, one first-order feature, and four texture features, reflecting the heterogeneity of the thyroid nodule comprehensively. The original_shape2D_Elongation feature shows the relationship between the two largest principal components in the ROI shape.
Although the results of this study were promising, it had several limitations. First, it was a retrospective study conducted in one institution; thus, selection bias may exist. In the future, prospective research will be undertaken for a more accurate assessment of the prediction of CLNM in PTMCs. Second, this study lacks external validation data; we aim to carry out multicenter research and increase the sample size to better evaluate the clinical use of model 3. Finally, our radiomic features were extracted only from conventional US images. We will attempt to extract features from images of different US modes, including USE and contrast-enhanced US, to obtain more information about PTMC.

Conclusion
The combined model based on USE and radiomics showed superior performance in predicting CLNM in patients with PTMC. A nomogram based on the combined model is a useful tool for clinicians to make individualized treatment strategies. Decision curve analysis (DCA) was used to evaluate the clinical value of the nomogram.
The AUC was significantly higher in the comprehensive model than in model 1 and model 2 in both the training set and the validation set (training set: model 3 versus model 1, p = 0.041; model 3 versus model 2, p < 0.001; validation set: model 3 versus model 1, p = 0.018; model 3 versus model 2, p = 0.009). Specificity was improved when USE and radiomics were combined.

(A) ROCs of the training set. (B) ROCs of the validation set.

(A) Calibration curves of the nomogram in the training set. (B) Calibration curves of the nomogram in the validation set.

FIGURE 3 A nomogram, the visualization of the comprehensive model 3.
Original_firstorder_Kurtosis measures the peakedness of the distribution of values in the image ROI. Wavelet.HL_glszm_GrayLevelNonUniformity and wavelet.HH_glszm_GrayLevelNonUniformity measure the variability of gray-level intensity values in the image. Wavelet.HL_ngtdm_Contrast is a measure of the spatial intensity change and is also dependent on the overall gray-level dynamic range. Wavelet.LL_gldm_LargeDependenceLowGrayLevelEmphasis measures the joint distribution of large dependence with lower gray-level values. Although model 2 did not achieve better discrimination than model 1, an incremental improvement was obtained by incorporating the radiomic features into the clinical features. Model 3 performed better than model 1 and model 2 in both the training set and the validation set, with significantly improved specificity, compensating for the deficiency of ultrasound alone. The highlight of our study is the construction of a nomogram, a visualization of model 3 that facilitates clinical decisions. Physicians can derive the probability of CLNM for a patient with PTMC by calculating a score for each risk factor.

TABLE 1
Univariate analysis of clinical and US features of cN0 PTMCs.

TABLE 1 Continued
* P-value comparing the two dataset cohorts.

TABLE 2
The variables A to F represent the six selected radiomic features.