Improved cancer risk stratification of isoechoic thyroid nodules to reduce unnecessary biopsies using quantitative ultrasound

Objective: Gray-scale ultrasound (US) is the standard of care for evaluating thyroid nodules (TNs). However, its performance is better for identifying hypoechoic malignant TNs (such as classic papillary thyroid cancer) than isoechoic malignant TNs. Quantitative ultrasound (QUS) uses information from the raw ultrasonic radiofrequency (RF) echo signal to assess properties of tissue microarchitecture. The purpose of this study is to determine whether QUS can improve the cancer risk stratification of isoechoic TNs. Methods: Patients scheduled for TN fine needle biopsy (FNB) were recruited from the Thyroid Health Clinic at Boston Medical Center. B-mode US and RF data (to generate QUS parameters) were collected in 274 TNs (163 isoechoic, 111 hypoechoic). A linear combination of QUS parameters (CQP) was trained and tested for isoechoic [CQP(i)] and hypoechoic [CQP(h)] TNs separately and compared with the performance of conventional B-mode US risk stratification systems. Results: CQP(i) produced an ROC AUC value of 0.937 +/- 0.043 compared to a value of 0.717 +/- 0.145 (p > 0.05) for the American College of Radiology Thyroid Imaging, Reporting and Data System (ACR TI-RADS) and 0.589 +/- 0.173 (p > 0.05) for the American Thyroid Association (ATA) risk stratification system. In this study, CQP(i) avoided unnecessary FNBs in 73% of isoechoic TNs compared to 55.8% and 11.7% when using the ACR TI-RADS and ATA classification systems, respectively. Conclusion: These data suggest that a unique QUS-based classifier may be superior to conventional US stratification systems for evaluating isoechoic TNs for cancer and should be explored further in larger studies.


Introduction
A long-standing concern in the management of thyroid nodules (TNs) is that gray-scale ultrasound (US) is ineffective at stratifying isoechoic TNs as malignant or benign. The American Thyroid Association (ATA) TN classification system and the American College of Radiology Thyroid Imaging, Reporting and Data System (ACR TI-RADS) use high-risk US features including hypoechogenicity, irregular margins and microcalcification to assign a risk level for malignancy (1,2). The high-risk features identified in these systems are, however, more specific for the classic papillary thyroid cancer subtype. Isoechoic TNs are very common and are more likely to undergo fine needle biopsy (FNB) due to their larger size (3). While a majority of isoechoic TNs are benign, some malignant TNs (follicular thyroid cancer, follicular variant of papillary thyroid cancer and 20% of classic papillary thyroid cancer) demonstrate an isoechoic appearance on US (4,5). Under the current ACR TI-RADS classification system, these isoechoic cancers would not be biopsied, and would be missed entirely, if they were partially cystic in appearance. The ATA classification system classifies isoechoic TNs as low suspicion and recommends FNB for a size greater than 1.5 cm regardless of other high-risk features such as hyperechoic foci or invasive margins. Follicular cancer and cancers that behave similarly carry a higher risk of distant metastatic disease than papillary thyroid cancer, making it important that these TNs undergo FNB appropriately (6). At the same time, the considerable health care cost and the patient and provider anxiety associated with invasive procedures (i.e., FNB, surgery) for benign disease highlight the need to avoid unnecessary FNBs in benign TNs. Therefore, an imaging technique that uniquely allows analysis of isoechoic TNs to reduce unnecessary invasive FNBs in benign TNs without missing cancer will improve the quality of medical care provided to patients with TNs.
Quantitative US (QUS) is an imaging method that uses data from raw ultrasonic radiofrequency (RF) echo signals to assess properties of tissue microarchitecture (7-11). Most of the information contained in RF data is discarded in the B-mode gray-scale US imaging that is typically used in clinical care. QUS generates numerical parameters that are a function of the underlying microstructure of the interrogated tissue (e.g., effective scatterer size and effective acoustic concentration) (8,9). Our group has previously demonstrated the use of this clinically novel US technique in the risk stratification of TNs (12). The area under the receiver operating characteristic (ROC) curve of a linear QUS-based classifier (combination of QUS parameters, or CQP) was 0.857 +/- 0.033, and statistically the same as that of the ACR TI-RADS and ATA risk classification systems for discriminating between malignant and benign TNs (p = 0.327 and p = 0.041, respectively), but without the limiting factor of clinician inexperience in thyroid sonography. This CQP classifier also demonstrated a 44 to 66% reduction in unnecessary FNBs, which exceeded the reduction achieved using the ACR TI-RADS and ATA risk classification systems, with a negative predictive value of 97 to 100%. We now report the outcomes of a preliminary study in which different QUS-based classifiers were created for isoechoic and hypoechoic TNs to determine whether cancer risk stratification improves.

Materials and methods
The study was performed following institutional review board approval. Details regarding subject recruitment, data collection and RF data processing have been outlined in a prior publication (12). Briefly, patients with one or more TNs who were either undergoing an FNB or had a prior FNB were recruited from the Thyroid Health Clinic at Boston Medical Center. A GE LOGIQ-E9 US scanner (GE Healthcare, Chicago, IL) was used to acquire the RF data from which QUS parameters were computed using the reference phantom method (13). RF data capture is available natively on the LOGIQ-E9 and therefore no modification of the instrument was necessary. A software key provided by the manufacturer had to be entered once to activate RF data capture. TNs with a significant cystic area or macrocalcification anterior to the region of interest were not included in the analysis due to interference with US wave propagation. Non-invasive follicular thyroid neoplasms with papillary-like nuclear features (NIFTPs) were not included due to small numbers in the data set. Investigators who are experienced ultrasonographers reviewed gray-scale US images from the picture archiving and communication system (PACS) and determined the echogenicity of TNs. TNs that were designated as isoechoic or hyperechoic were categorized as isoechoic for this study. TNs that were designated as mildly hypoechoic or very hypoechoic were categorized as hypoechoic. A combination of cytology, molecular testing using the ThyroSeq genomic classifier (v2 or v3) (CBLPath, Inc., Rye Brook, NY) and surgical pathology was used to classify TNs as benign or cancer. A TN was categorized as benign if it had benign cytology (Bethesda II), or indeterminate cytology (Bethesda III or IV) without any high-risk molecular test result, or if surgical pathology did not show any evidence of malignancy. A TN was classified as cancer if found to have Bethesda VI cytology or if surgical pathology demonstrated malignancy. In one subject, a TN with high-risk US features was categorized as malignant based on the presence of a suspicious cervical node that was positive for metastatic thyroid cancer on FNB. Methods similar to those described in prior publications and in our prior study were used for RF data processing and QUS parameter estimation (7,12,14). A combination of QUS parameters was tested using a Fisher linear discriminant approach and classification performance was assessed using ROC curves. Statistical analysis for the study was performed using the MATLAB statistical toolbox (The MathWorks, Inc., Natick, MA).
An optimal linear combination of QUS parameters (CQP) was derived individually for isoechoic TNs [CQP(i)] and hypoechoic TNs [CQP(h)]. The performance of these classifiers was compared to that of a classifier trained using TNs irrespective of echogenicity [CQP(c)] and also to the currently used gray-scale TN classification systems, ACR TI-RADS and the ATA classification system.
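The Fisher linear discriminant and ROC AUC steps described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' MATLAB code: the synthetic three-feature data, the function names `fisher_weights` and `roc_auc`, and the in-sample evaluation are all assumptions made for the example. The sign of the score is arbitrary here (the malignant class scores higher), whereas the study selects a TN for FNB when its CQP value is at or below a threshold.

```python
import numpy as np

def fisher_weights(X, y):
    """Fisher linear discriminant direction w = Sw^-1 (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    return np.linalg.solve(Sw, mu1 - mu0)

def roc_auc(scores, y):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Synthetic stand-in for per-nodule QUS parameters (hypothetical values).
rng = np.random.default_rng(0)
benign = rng.normal([0.0, 0.0, 0.0], 1.0, size=(150, 3))
malignant = rng.normal([1.5, -1.0, 0.5], 1.0, size=(20, 3))
X = np.vstack([benign, malignant])
y = np.array([0] * 150 + [1] * 20)

w = fisher_weights(X, y)   # coefficients of the linear combination
scores = X @ w             # one combined score per nodule
auc = roc_auc(scores, y)   # in-sample AUC of the combined score
print(w, auc)
```

With few malignant cases, an in-sample AUC like this one is optimistic, so a held-out or cross-validated estimate would be the more cautious choice in practice.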

Results
A total of 274 TNs were included in the final analysis. Of these, 163 were categorized as isoechoic (158 benign and 5 cancer) and 111 as hypoechoic (86 benign and 25 cancer) [Table 1]. The prevalence of malignancy in TNs categorized as isoechoic was 3.1% and in those categorized as hypoechoic was 22.5%.
(A) Performance of CQP(i) compared to gray-scale US: The optimal linear combination of QUS parameters (Nakagami shape parameter, intercept, effective scatterer size, and acoustic concentration) for isoechoic TNs, CQP(i), was determined [21.1875 x Avg_NakShapeParam + 2.6668 x Avg_Intercept - 0.89063 x Avg_EffectiveScattererSize - 2.42]. Using the CQP(i) threshold of -61.341 for FNB (i.e., a TN is chosen for FNB if its CQP(i) value is equal to or less than the threshold), 119 of 163 (73%) TNs were excluded from FNB with a missed malignancy rate of zero among isoechoic TNs (i.e., all malignant TNs would be selected for FNB). With ACR TI-RADS, FNB would not have been recommended in 91 (55.8%) TNs and one malignant TN would have been missed. With the ATA risk stratification system, FNB could be avoided in 19 (11.7%) TNs, with no missed malignant TN. The reduction in FNB for the ATA system is low in this study because the patient population from which subjects were recruited had TNs for which FNB was recommended clinically based primarily on the ATA risk stratification system.
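The decision rule above (FNB when the score is at or below a threshold, with no malignant TN missed) can be illustrated with a short Python sketch. The source does not state how the threshold of -61.341 was chosen, so the selection rule used here (the largest score among malignant TNs) is an assumption, as are the function names and the ten synthetic scores. The last lines simply recompute the reported percentages from the reported counts.

```python
import numpy as np

def zero_miss_threshold(cqp, malignant):
    """Largest score among malignant TNs, so that 'cqp <= threshold' flags all of them."""
    return float(np.max(cqp[malignant]))

def fnb_summary(cqp, malignant, threshold):
    """Return (FNBs avoided, malignant TNs missed, fraction of FNBs avoided)."""
    biopsy = cqp <= threshold              # study rule: FNB if score <= threshold
    avoided = int(np.sum(~biopsy))
    missed = int(np.sum(malignant & ~biopsy))
    return avoided, missed, avoided / cqp.size

# Hypothetical scores for ten nodules; two are malignant.
cqp = np.array([-70.0, -65.2, -62.0, -61.5, -60.9, -58.3, -55.0, -50.1, -48.7, -40.2])
malignant = np.array([True, False, True, False, False, False, False, False, False, False])

thr = zero_miss_threshold(cqp, malignant)
avoided, missed, frac = fnb_summary(cqp, malignant, thr)
print(thr, avoided, missed, frac)          # -62.0 7 0 0.7

# Percentages reported for the 163 isoechoic TNs, recomputed from the counts.
reported = {name: round(100 * n / 163, 1)
            for name, n in {"CQP(i)": 119, "ACR TI-RADS": 91, "ATA": 19}.items()}
print(reported)                            # {'CQP(i)': 73.0, 'ACR TI-RADS': 55.8, 'ATA': 11.7}
```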
When comparing the TNs for which FNB was not recommended between CQP(i) and ACR TI-RADS, there was an overlap of 67 TNs in this category. However, 52 TNs for which FNB was not recommended per CQP(i) met criteria for FNB per ACR TI-RADS. Conversely, 23 TNs for which FNB was not recommended per ACR TI-RADS met criteria per the CQP(i) threshold.
The malignant TN missed by the ACR TI-RADS classification system was a 5 cm solid cystic nodule with isoechoic echotexture, with a smooth margin and without any echogenic foci. The surgical pathology of this TN demonstrated a follicular variant of papillary thyroid cancer without lymphovascular invasion, extrathyroidal extension or metastatic nodes. Given the size of this TN, it is likely that many clinicians would have chosen to perform an FNB for the TN even if they used the ACR TI-RADS system in their clinical practice. When the performance of each TN classification method was recalculated after removing this nodule, CQP(i) produced an ROC AUC of 0.929 +/- 0.053 [95% CI 0.825-1.000] compared to 0.854 +/- 0.095 [95% CI 0.668-1.000] for ACR TI-RADS and 0.729 +/- 0.149 [95% CI 0.406-1.000] for the ATA risk stratification system.
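The source does not say how the confidence intervals were computed. As an assumption, the sketch below uses a normal approximation (AUC +/- 1.96 x standard error, clipped to the range 0 to 1). It reproduces the reported intervals for CQP(i) and ACR TI-RADS, but it does not reproduce the reported ATA lower bound exactly, so it should be read as illustrative only. The function name `normal_ci` is hypothetical.

```python
def normal_ci(auc, se, z=1.96):
    """Normal-approximation 95% CI for an AUC, clipped to [0, 1] (assumed method)."""
    lower = max(0.0, auc - z * se)
    upper = min(1.0, auc + z * se)
    return round(lower, 3), round(upper, 3)

cqp_i_ci = normal_ci(0.929, 0.053)   # reported: 0.825-1.000
acr_ci = normal_ci(0.854, 0.095)     # reported: 0.668-1.000
print(cqp_i_ci, acr_ci)              # (0.825, 1.0) (0.668, 1.0)
```

The clipping at 1.0 reflects how wide these intervals are with only a handful of malignant isoechoic TNs.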
When comparing the TNs for which FNB was not recommended between CQP(h) and ACR TI-RADS, there was an overlap of 8 TNs in this category. There were 16 TNs for which FNB was not recommended per CQP(h) that met criteria for FNB per ACR TI-RADS, and there were 11 TNs for which FNB was not recommended per ACR TI-RADS that met criteria per the CQP(h) threshold.
(C) Performance of CQP(i) and CQP(h) compared to a common classifier, CQP(c): The performance of an optimal linear classifier trained on TNs irrespective of echogenicity, CQP(c) [7.086 x Avg_NakShapeParam - 0.8791 x Avg_SpectralSlope + 0.1900 x Avg_AcousticConcentration - 0.6343 x Std_Intercept + 0.1049 x Std_SpectralSlope], was compared to the echogenicity-specific classifiers CQP(i) and CQP(h) [Table 3]. Using a biopsy threshold of 6.638, the number of TNs excluded from FNB without missing any cancer was 91 (33.2%) when CQP(c) was applied to all TNs. Specifically, for isoechoic TNs, 74 (45.4%) FNBs could be avoided, which is fewer than when CQP(i) was used. When CQP(c) was applied to hypoechoic TNs, 17 (15.3%) FNBs could be avoided.

Discussion
TNs are a histologically heterogeneous group that includes benign hyperplasia and adenomas, differentiated (papillary and follicular), poorly differentiated, anaplastic and medullary thyroid cancer, thyroid lymphoma and metastatic disease to the thyroid gland (15). All of these subtypes vary in their histological appearance. For example, classic papillary thyroid cancer is characterized by papillary arrangements with a vascular core and psammoma calcification, while in follicular thyroid cancer sheets of follicular cells with reduced amounts of colloid are seen with hallmark vascular or capsular invasion. Certain gray-scale US features, such as echogenicity, reflect these differences in TN architecture. However, while these features are effective in identifying classic papillary thyroid cancers that are hypoechoic and have punctate echogenic foci, they are less effective for the other subtypes.
It is also important to note that the field is shifting from an emphasis on using TN US to diagnose cancer to a recognition that the management of TNs should also prioritize avoiding unnecessary invasive procedures. Up to 90 to 95% of TNs are benign (16,17). The cost incurred from FNB, molecular testing and surgery for benign TNs is exorbitant (18,19). In addition, these procedures add to patient and provider anxiety and increase the risk of post-surgical complications including nerve injury and hypocalcemia. These complications can be reduced, but not eliminated, through measures such as having surgery performed by a high-volume thyroid surgeon and using neuromonitoring devices during surgery.
Creating separate QUS-based linear classifiers for isoechoic and hypoechoic TNs improved TN risk stratification, specifically for isoechoic TNs, compared to applying a single classifier to all TNs. The ROC AUC of CQP(i) was greater than that of the ACR TI-RADS and ATA classification systems, but the difference was not statistically significant in the setting of inadequate TN numbers. However, CQP(i) reduces unnecessary FNBs by 73% in isoechoic TNs compared to 55.8% by ACR TI-RADS. Figure 1 demonstrates two TNs that were both classified as isoechoic. One TN was isoechoic, solid and taller-than-wide and was biopsied based on the TI-RADS classification system because the size was >1.5 cm, but was benign by cytology. The second TN was isoechoic and partially cystic and would not have been biopsied based on the TI-RADS classification system. It was found to be a follicular variant of papillary thyroid cancer. These two nodules have very different QUS-based CQP(i) values: under the CQP(i) threshold, biopsy would not have been recommended for the benign isoechoic, solid nodule but would have been recommended for the partially cystic isoechoic papillary thyroid cancer. In addition, CQP(i) and CQP(h) together can reduce unnecessary FNBs by 52.2% in all TNs without missing a malignant TN. This is an improvement over the reduction in unnecessary FNBs when a single QUS-based classifier is applied to all TNs (33.2%). The relevance of these findings is further highlighted by the fact that 60% of the TNs in the study were isoechoic and 97% of the isoechoic TNs were benign. Of note, while a majority of classic papillary thyroid cancers were hypoechoic, 17% in the study were classified as isoechoic. Follicular thyroid cancers, which are often isoechoic, were categorized as hypoechoic in the current study.
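The combined reduction quoted above can be checked against the counts given in the Results. In the Python sketch below, the number of hypoechoic TNs excluded from FNB by CQP(h) is not stated directly in the text; it is inferred, as an assumption, from the reported overlap with ACR TI-RADS (8) plus the TNs excluded by CQP(h) alone (16). The helper `pct` and the variable names are illustrative.

```python
# Counts taken from the Results text.
total_tns = 274
avoided_cqp_i = 119        # isoechoic TNs excluded from FNB by CQP(i)
avoided_cqp_h = 8 + 16     # assumption: overlap with ACR TI-RADS + excluded by CQP(h) only
avoided_cqp_c = 91         # single classifier CQP(c) applied to all TNs

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

combined = pct(avoided_cqp_i + avoided_cqp_h, total_tns)
common = pct(avoided_cqp_c, total_tns)
print(combined, common)    # 52.2 33.2
```

Under that inferred count, the echogenicity-specific classifiers together exclude 143 of 274 TNs from FNB, compared with 91 of 274 for the common classifier.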
The data demonstrated differences in how the QUS-based classifier and gray-scale US categorized TNs. Of the TNs that did not meet criteria for FNB by the CQP(i) classifier, 43.7% were recommended for FNB by ACR TI-RADS. Conversely, 25.3% of TNs that did not meet criteria for FNB by ACR TI-RADS were recommended for FNB by the CQP(i) classifier. Interestingly, the performance of CQP(h), ACR TI-RADS and the ATA risk stratification system for hypoechoic TNs was similar. Published literature has demonstrated that the ROC AUC of gray-scale US in various practice settings ranges from 0.76 to 0.88 (20-23). QUS is not prone to the operator and machine variability seen with gray-scale US, and it can potentially be a useful tool to improve the performance of a less experienced ultrasonographer assessing TNs.
Similar to our prior study, we did not include TNs with a final surgical pathology of NIFTP due to the low numbers in this preliminary data set. Our institution historically has a low prevalence of NIFTP, which represents 2.3% of all papillary neoplasia (24). In addition, the prevalence of malignancy in isoechoic TNs is low, which limits the interpretation of results in our current study. These two concerns can be better addressed in future studies with a larger number of subjects. TNs with significant macrocalcification or cystic areas anterior to the region of interest in the TN were also excluded because these structures block or alter the propagation of the US RF signal, preventing QUS analysis. The authors recognize that while TNs were separated into iso- and hypoechoic, these groups are still heterogeneous in their pathology. While echogenicity was chosen in this study to categorize TNs, the use of other gray-scale US features in combination with QUS should be explored in future studies. In addition, hypoechoic echogenicity can be further categorized as mildly hypoechoic and very or markedly hypoechoic, the latter associated with a higher risk for malignancy (25-27). This needs to be taken into consideration when planning future studies. In this preliminary analysis, the categorization of TNs based on echogenicity was done manually by the investigators, which is subject to inter-observer and machine variability. In the future, an objective method for determining echogenicity using either QUS or other techniques should be explored.
For many years we have adhered to a tradition of treating all TNs the same while imaging. Our study is an attempt to apply an algorithmic approach to TN imaging. Our preliminary results are promising and build a compelling case for exploring TN imaging with the heterogeneity of TN histology in mind.

TABLE 1
TN characteristics and categorization.

TABLE 2
Comparison of results of individual QUS-based classifiers for isoechoic and hypoechoic TNs compared to ACR TI-RADS and ATA risk stratification system.

TABLE 3
Comparison of performance of CQP(c), CQP(i) and CQP(h) in isoechoic and hypoechoic TNs.