Artificial Intelligence for Personalized Medicine in Thyroid Cancer: Current Status and Future Perspectives

Thyroid cancers (TC) have increasingly been detected following advances in diagnostic methods. Risk stratification guided by refined information is therefore a crucial step toward personalized medicine. The diagnosis of TC relies mainly on image analysis, but visual examination reveals limited information and does not enable comprehensive analysis. Artificial intelligence (AI) is a technology that extracts and quantifies key image information by simulating complex human functions. This latent, precise information helps stratify TC by risk and drives tailored management from a population-based to an individual-based approach. In this review, we start with several challenges to personalized care in TC, namely inconsistent rating ability of ultrasound physicians, uncertainty in cytopathological diagnosis, difficulty in discriminating follicular neoplasms, and inaccurate prognostication. We then summarize advances in the use of AI to extract and analyze morphological, textural, and molecular features that reveal the ground truth of TC. Combining these features with AI technology should make individualized medical strategies possible.


INTRODUCTION
Thyroid cancers (TC) have risen in incidence over the past decades, with indolent TC accounting for the majority (1-3). For advanced TC (1,2) and aggressive papillary thyroid carcinomas (PTC) (4), incidence and mortality rates are also steadily increasing, which makes more effective management strategies imperative. In the era of personalized medicine, precise and efficient risk stratification is important before, during, and after treatment in order to choose and adjust its type and intensity. The foremost step is to discover key information that reveals the biological behavior of TC. TC contains abundant anatomical structures (texture, internal architecture, and spatial distribution) and molecular components (gene variation, protein expression, etc.). So far, the diagnosis of TC relies mainly on image analysis (e.g., ultrasound images, cell smears, and tissue sections), but information obtained by the naked eye alone hardly enables a comprehensive analysis of the tumors (5). Primary human cell cultures from surgical biopsies and from fine-needle aspiration (FNA) samples, which reflect individual patients and their disease features, can foster targeted therapies (6). However, several challenges still hinder a breakthrough in personalized treatment, such as inconsistent rating ability of ultrasound (US) physicians (7), uncertainty in cytopathological diagnosis (8), difficulty in discriminating follicular neoplasms (9,10), and inaccurate prognostication.
Artificial intelligence (AI) is a set of technologies combined to mimic human functions (Figure 1). In some tasks, it matches or exceeds human perception (11,12). AI can process various kinds of omics information in parallel and can identify and model complicated nonlinear relationships in images (13,14). Several studies have demonstrated that AI classifiers are comparable to radiologists in the qualitative analysis of thyroid nodules (TN) (15-18). Furthermore, AI can extract and quantify key image information, so that image diagnosis shifts from a subjective qualitative task to an objective quantitative analysis. This more detailed and precise information supports refined risk stratification and moves tailored management from a population-based to an individual-based approach.
In this review, we aimed to summarize the use of AI for extracting and analyzing morphological, textural, and molecular features to reveal detailed information and personalize therapies for TC patients (Figure 2).

APPLICATIONS OF AI IN THE US DIAGNOSIS OF TN
Several typical ultrasound features of TN imply an increased risk of malignancy, such as solid composition, hypoechogenicity, irregular margin, microcalcification, and taller-than-wide shape. However, these features can neither confirm nor exclude the diagnosis of TC (19). Interobserver agreement across centers in assessing them is poor (7). Thyroid Imaging Reporting and Data Systems (TI-RADS) are highly valuable as risk stratification systems for PTC, but less so for follicular thyroid carcinoma (FTC), medullary thyroid carcinoma (MTC), and other malignancies (20). AI models appear to be a promising tool for better understanding TN through quantitative analysis of typical US features and the introduction of texture features.

Performance of Typical US Features
Wildman-Tobriner et al. (21) developed an AI TI-RADS based on the American College of Radiology (ACR) TI-RADS. This system optimized the evaluation task by reassigning values for eight ultrasound features, highlighting the status of hypoechogenicity or marked hypoechogenicity. The novel AI TI-RADS had better accuracy than ACR TI-RADS when used by inexperienced radiologists (55% vs. 48%) and by experts (65% vs. 47%). As in other studies, ACR TI-RADS-based classifiers had higher sensitivity and slightly lower specificity (21-24). Wu et al. (25) evaluated quantitative echoic indexes for detecting malignant TN, which showed higher accuracy than typical ultrasound hypoechogenicity (>60% vs. 54.01%). We summarized the outcomes of the ultrasound features employed by AI for classification in Table 1 and found that the most widely used features were shape, margin, echogenicity, calcification, composition, and size. In other words, these discriminative features seem to be the focus of what AI models learn (31,37). In particular, Choi et al. (30) demonstrated several new calcification features associated with TN malignancy, including a shorter calcification distance ratio, smaller amounts of calcification, and dimmer calcification. Chen et al. (28) quantified the malignant risk of TN through a calcification index. These new features boosted diagnostic accuracy by combining qualitative and quantitative methods (30,38). Current AI classifiers focus on the dichotomy between benign and malignant TN, and some, such as the S-Detect series, have already become commercially available (32,34). Furthermore, they are expected to predict additional aspects of tumor biological behavior, such as lymph node metastasis (39,40) and pathological subtypes (41).
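The accuracy, sensitivity, and specificity figures compared in this section all derive from the same four counts, obtained by checking a classifier's calls against pathology. As a minimal sketch (the class name, field names, and the cohort counts are illustrative assumptions, not data from the cited studies), the relationship can be written in Python as:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Classifier calls versus pathology, with malignant as the positive class."""
    true_pos: int   # malignant nodules called malignant
    false_neg: int  # malignant nodules called benign
    true_neg: int   # benign nodules called benign
    false_pos: int  # benign nodules called malignant


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of malignant nodules that the classifier flags."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of benign nodules that the classifier clears."""
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all nodules that are classified correctly."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Hypothetical cohort: 50 malignant and 100 benign nodules.
example = ConfusionCounts(true_pos=45, false_neg=5, true_neg=70, false_pos=30)
```

With these made-up counts, sensitivity is 0.90 and specificity is 0.70, while accuracy (115/150, about 0.77) falls between the two because benign nodules dominate the cohort. A highly sensitive classifier can therefore still show modest overall accuracy when its specificity is lower.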

Performance of Texture Features
A meta-analysis suggested that a taller-than-wide shape, which reflects variation in the spatial extent and growth orientation of TN, is the feature most suggestive of malignancy (42). Texture features characterize spatial distribution and surface orientation numerically (43). Texture analysis is therefore a powerful complement that can help radiologists understand TN in depth and reach a correct diagnosis. Raghavendra et al. (44) integrated spatial and fractal texture features and selected two features that achieved an excellent area under the curve in diagnostic practice (94.45%). Prochazka et al. (45) used AI to extract texture features from US images independently of the direction of the US probe and achieved high accuracy (94.64%). Yu et al. (46) transformed two US features, irregular shape and long/short-axis ratio, into numerical form: perimeter²/area and the angle between the long axis and the horizontal axis. Combined with 65 texture features, these new features showed excellent sensitivity and specificity (100% and 87.88%, respectively). Collectively, AI models can integrate typical ultrasonic and texture features, and this fusion might sharply reduce differences in judgment among US professionals.
Despite the mounting advantages of AI models in optimizing and even creating workflows, several factors hold back their adoption in real-world practice. The three main factors are as follows: (i) poor availability of large, high-quality datasets to guarantee robustness (17); (ii) lack of explainability of conclusions from black-box algorithms, which undermines trust between physicians and patients (47,48); and (iii) the financial burden of specific equipment and research costs (48).
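Returning to the two shape descriptors attributed to Yu et al. (46), the sketch below shows one way such quantities could be computed from a nodule outline. It is not the authors' implementation: the outline is assumed to be a simple polygon of (x, y) points, the long axis is assumed to be the segment joining the two most distant outline points, and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(outline: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(outline)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_perimeter(outline: Sequence[Point]) -> float:
    """Sum of edge lengths, closing the outline back to its first point."""
    n = len(outline)
    return sum(math.dist(outline[i], outline[(i + 1) % n]) for i in range(n))


def perimeter_squared_over_area(outline: Sequence[Point]) -> float:
    """Scale-free irregularity measure; a circle gives the minimum, 4*pi."""
    perimeter = polygon_perimeter(outline)
    return perimeter * perimeter / polygon_area(outline)


def long_axis_angle_deg(outline: Sequence[Point]) -> float:
    """Angle in [0, 90] degrees between the long axis and the horizontal.

    Assumption: the long axis joins the two most distant outline points.
    """
    best_dist, dx, dy = -1.0, 0.0, 0.0
    for i in range(len(outline)):
        for j in range(i + 1, len(outline)):
            d = math.dist(outline[i], outline[j])
            if d > best_dist:
                best_dist = d
                dx = outline[j][0] - outline[i][0]
                dy = outline[j][1] - outline[i][1]
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    return min(angle, 180.0 - angle)  # fold so that axis direction is ignored


# Hypothetical outlines: a wide diamond and a taller-than-wide diamond.
wide = [(-3.0, 0.0), (0.0, 1.0), (3.0, 0.0), (0.0, -1.0)]
tall = [(-1.0, 0.0), (0.0, 3.0), (1.0, 0.0), (0.0, -3.0)]
```

Both diamonds have the same perimeter²/area because that ratio ignores orientation, but their long-axis angles differ (near 0 degrees for `wide`, near 90 degrees for `tall`). Under these assumptions, the angle captures the taller-than-wide property while the ratio captures outline irregularity.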

APPLICATIONS OF AI IN CYTOPATHOLOGICAL EVALUATION FROM FNA
FNA is a primary preoperative examination for evaluating TN. Its reporting system, the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), is a state-of-the-art, category-based method that supports clinicians' decision-making. TBSRTC includes six diagnostic categories based on the estimated risk of malignancy (ROM) (Table 3).

Performance of Morphological Features
PTC, the most common TC (>80%), arises from abnormal growth of thyroid epithelial cells (28,38). In recent years, AI models using quantitative morphological features have sought to improve the recognition of follicular lesions (55-57). [...] (70) segmented whole lesions of follicular neoplasms; as a result, classification accuracy improved significantly, to 96%. This underscored the importance of internal information and supported the reliability of the study by Savala et al. (57). Similarly, the diagnosis of MTC and anaplastic thyroid carcinoma (ATC) depends on histology (71,72), yet to our knowledge no studies have so far applied AI to their ultrasound or cytopathological diagnosis.

Performance of Biomolecules
For patients with indeterminate thyroid nodules (ITN), repeat FNA or lobectomy might be performed because management guidelines are more flexible (8,73). Fortunately, molecular tests provide a noninvasive and accurate option for reducing clinical and health-related uncertainty (8,67). Each genome contains as much information as 100,000 photographs (74). Next-generation sequencing (NGS) can analyze multiple genes in parallel at high speed in a single run, producing billions of molecular fragments (74,75). It is a crucial component of big data because of the large volume of data, the velocity of the sequencing methods, and the veracity of the output. Traditional information systems are less capable of analyzing such large and complex datasets (76,77). AI, as a big data algorithm, can integrate multi-omic data across different learning tasks and automatically detect or classify high-level features (77). Several genetic classifiers have shown their strengths in TN, such as the Afirma gene expression classifier (GEC) (58), gene sequence classifier (GSC) (59), gene mutation-based classifier (ThyroSeq) (60,78), and microRNA-based classifier (RosettaGX Reveal) (61,79). The GEC involved 167 genes and displayed high sensitivity (92%) [...] (Table 3). However, whether these classifiers can consolidate and complement each other remains unclear, and the precise application strategy requires further investigation. Multi-gene analysis can enhance diagnostic performance, but it may be limited by the deletion or reduced expression of key genes. Of note, the thyroglobulin level has been considered a predictor of postoperative disease progression (67). Therefore, key proteins might provide added information for personalized therapy. Recent research has confirmed that proteins are more stable than RNA in clinical tissues (80). Sun et al. (62) developed an artificial neural network (ANN) classifier based on 14 proteins for TN classification. This model achieved accuracies of 90.62% and 87.53% in multicenter retrospective and prospective samples, respectively (Table 3). Some molecular alterations, such as BRAF mutations (81), are diagnostic of cancer, but most other alterations (82,83) occur in both benign and malignant lesions. Therefore, assessing the risk of malignancy by molecular testing should take into account the prior cytological appearance.
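The dependence of a molecular result on the prior cytological appearance can be made concrete with Bayes' rule. The following is a minimal sketch rather than a validated clinical calculator: the function name is hypothetical, the 92% sensitivity is the GEC figure quoted above, and the specificity and the two pre-test risks are illustrative assumptions that do not come from the text.

```python
def post_test_probability(pre_test: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Probability of malignancy after a test result, via Bayes' rule."""
    if not 0.0 < pre_test < 1.0:
        raise ValueError("pre_test must lie strictly between 0 and 1")
    if positive:
        rate_if_malignant = sensitivity        # P(test+ | malignant)
        rate_if_benign = 1.0 - specificity     # P(test+ | benign)
    else:
        rate_if_malignant = 1.0 - sensitivity  # P(test- | malignant)
        rate_if_benign = specificity           # P(test- | benign)
    numerator = rate_if_malignant * pre_test
    return numerator / (numerator + rate_if_benign * (1.0 - pre_test))


SENSITIVITY = 0.92  # GEC sensitivity quoted in the text
SPECIFICITY = 0.50  # illustrative assumption, not taken from the text

for pre_test in (0.15, 0.30):  # hypothetical cytology-based risks of malignancy
    residual = post_test_probability(pre_test, SENSITIVITY, SPECIFICITY,
                                     positive=False)
    print(f"pre-test {pre_test:.2f} -> after a negative test {residual:.3f}")
```

Under these assumed values, the same negative result leaves a residual risk of roughly 3% when the pre-test risk is 15% and roughly 6% when it is 30%. The test result alone therefore does not determine the final risk; the cytological category that sets the pre-test probability matters as well.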

APPLICATIONS OF AI IN HISTOPATHOLOGICAL ANALYSIS
Building on the evidence obtained by US and FNA examination, tumor information from resected specimens, such as tumor size, pathologic type, and degree of malignancy, is important for pathologists to diagnose TC. Molecular patterns in the tumor microenvironment, such as cytokines, chemokines, and adipocytokines, interconnect the units of the immune-inflammatory response (e.g., macrophages, neutrophils, lymphocytes) and the tumor nest (e.g., epithelial cancer cells, fibroblasts, endothelial cells) (84). The more detailed the information pathologists provide, the more precise the treatment strategies physicians can adopt. The combination of AI, morphology, and molecular markers is expected to provide more information for TC management at the level of the individual patient. [...] (Table 3). However, these models require further validation because of the complex heterogeneity of tumors, which was also evident in a recent study that used a deep learning model to classify TC, normal tissue, nodular goiter, and adenomas (85).
Morphologically, the follicular variant of PTC (FV-PTC) combines the nuclear features of typical PTC with an entirely or almost entirely follicular growth pattern. FV-PTC includes two major subtypes: the encapsulated variant (EFV-PTC) and the non-encapsulated or infiltrative variant (IFV-PTC) (86). The former generally harbors RAS mutations, like follicular tumors, whereas the latter often presents with extrathyroidal extension (ETE), lymphatic metastasis, and BRAF mutations, like classical PTC (cPTC) (87). EFV-PTC in turn can be invasive or noninvasive, and the noninvasive encapsulated tumor has been reclassified from a carcinoma to a borderline tumor, noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) (86). To some extent, invasive EFV-PTC behaves more aggressively, like follicular carcinoma (FC), whereas NIFTP shows indolent clinical behavior, like follicular adenoma (FA) (87). It is believed that invasive EFV-PTC might develop from NIFTP (88). Borrelli et al. (89) revealed significant differences in miRNA expression among FA, NIFTP, and IFV-PTC. In particular, just two miRNAs (miR-10a-5p and miR-320e) were sufficient to differentiate NIFTP from IFV-PTC. In another study by Selvaggi (90), no multinucleated giant cells (MGCs) were observed in 20 NIFTP cases, whereas the number of MGCs ranged from 1 to 4 in 88% of the FV-PTC cases (both IFV-PTC and invasive EFV-PTC). Using computer-assisted quantitative analysis to classify FV-PTC, Chain et al. (91) demonstrated that the nuclear area (mean, 54.8 μm²) and elongation of NIFTP were smaller than those of PTC (mean, 77.2 μm²), and Hsieh et al. (92) reported that PD-L1 expression was lower in NIFTP than in invasive EFV-PTC. These quantitative morphological characteristics and definite molecular alterations contribute to FV-PTC classification.
As the definition of FV-PTC indicates, papillary and variable follicular structures coexist so commonly in cancer nests that we favor recognizing more transitional or intermediate categories between cPTC and FV-PTC. The clearer the learning exemplars, the easier it is for an AI model to learn, because it receives fewer error signals (13). For greater efficiency, it is essential to classify the training set accurately and refine the output target.

Performance of Genetic Parameters
The American Thyroid Association risk stratification system and the American Joint Committee on Cancer TNM staging system are used to guide postoperative treatment and predict post-treatment outcomes. They incorporate several parameters, including age, ETE, anatomic location, number and size of metastatic lymph nodes, aggressive variants, vascular invasion, and distant metastasis. Nonetheless, these systems do not routinely recommend genetic testing to guide individual management (67,93). Zhao et al. (65) selected 10 gene variant pathways involved in inflammatory and immune responses to determine the risk level of TC patients. Based on these pathways, patients were divided into high-risk and low-risk groups, and survival time in the low-risk group was significantly better than in the high-risk group. Ruiz et al. (66) demonstrated that a 25-gene panel related to molecular pathways, cell structure, and function was an independent prognostic factor for lymphatic metastasis and disease-free survival (Table 3). Further evidence is still needed to establish the value of this genetic information for TN triage and for predicting biological behavior. As AI and genetic testing technologies advance, combining traditional clinicopathological parameters with genetic markers might yield more precise therapeutic implications.

CONCLUSION
Personalized medicine in TC still faces several challenges, such as inconsistent rating ability of US physicians, uncertainty in cytopathological diagnosis, difficulty in discriminating follicular lesions, and inaccurate prognostication. The application of AI has improved the efficiency and accuracy of diagnosis and treatment in other tumors (94-96). A growing amount of medical information can be extracted and analyzed through AI technology. This review has outlined how ultrasonic and pathological testing can address these dilemmas through morphological, textural, and molecular features. As more key parameters of the tumor and its microenvironment are explored, the AI-aided combination of morphological and molecular features will pave the way for TC management at the individual level.

ACKNOWLEDGMENTS
Sincere thanks to the teachers, classmates, and editors who worked together on this research.