Prospective study and proposal of an outcome predictive nomogram in a consecutive prospective series of differentiated thyroid cancer based on the new ATA risk categories and TNM

Introduction: The personalized management of differentiated thyroid cancer (DTC) is currently based on the postoperative TNM staging system and the ATA risk stratification system (RSS), updated in 2018 and 2015, respectively. Purpose: We aimed to evaluate the impact of the last two editions of the TNM and the ATA RSS on the prediction of persistent/recurrent disease in a large series of DTC patients. Patients and methods: Our prospective study included 451 patients who had undergone thyroidectomy for DTC. We classified the patients according to the TNM (both VIII and VII editions) and stratified them according to the ATA RSS (both 2015 and 2009). We then evaluated the response to the initial therapy after 12-18 months according to the ATA "ongoing" risk stratification, and analyzed the variables associated with persistent/recurrent disease by multivariate analysis. Results: The performance of the last two ATA RSSs was not significantly different. Staging patients according to the VIII or VII TNM edition, we found significant differences only in the distribution of patients with structural disease classified in stages III and IV. At multivariate analysis, only T status and N status were independently associated with persistent/recurrent disease. Overall, the ATA RSSs and the TNM editions showed low predictive power for persistent/recurrent disease (Harrell's C-index). Conclusions: In our series of DTC patients, the new ATA RSS and the VIII TNM staging provided no additional benefit compared with the previous editions. Moreover, the VIII TNM staging system may underestimate disease severity in patients with large and numerous lymph node metastases at diagnosis.


Introduction
Thyroid cancer (TC) is the most common endocrine malignancy, accounting for about 90% of endocrine cancers. TC has been the most rapidly increasing cancer in the United States, where its incidence rose by 211% between 1975 and 2013; however, due in part to the adoption of more conservative diagnostic criteria, the incidence rate declined by 2.5% per year from 2014 to 2018 (1). Mortality from TC is very low; the death rate increased slightly from 2008 to 2017 (0.6% per year) despite earlier diagnosis and better treatment, but appears to have been fairly stable in recent years (1). The prognosis of differentiated thyroid cancer (DTC) is excellent, with a 5-year survival rate of 99-100% for localized, 98% for regional and 53% for metastatic disease (2).
In 2015 the American Thyroid Association (ATA) guidelines for DTC management (3) introduced a new risk stratification system, which included some supplementary prognostic factors such as lymph node characteristics (number, size and extranodal extension), mutational status, and the number of foci of vascular invasion. However, as stated in the ATA guidelines, "the incremental benefit of adding these specific prognostic variables to the 2009 initial risk stratification system has not been established", and the added value of this modification has not yet been validated.
The TNM classification was also updated in 2018 to better predict DTC survival (4). As a result of these changes, the eighth edition downstages a significant percentage of patients, to reflect more accurately their low disease-specific risk of death. Currently, the appropriate clinical and therapeutic management of TC requires post-surgical TNM staging to predict survival, and an assessment of the risk of persistent/recurrent disease according to the ATA risk stratification system. During follow-up this risk is regularly re-evaluated according to the "ongoing risk stratification" (3), based on the measurement of thyroglobulin (Tg) and thyroglobulin antibodies (TgAb), neck ultrasound, the post-131I whole-body scan (WBS) and other imaging as required.
As expected, a more personalized and accurate assessment of the risk of persistent or recurrent disease and death from TC has a significant impact on the initial treatment decision (extension of thyroid and/or lymph node surgery, need for radioactive iodine ablation/therapy, need for TSH suppressive therapy) and appropriate management strategies during short and long-term follow-up.
In the present study we prospectively evaluated, in a consecutive series of 451 DTC patients, the variables associated with persistent/recurrent disease (biochemical and structural) and the predictive power of the different ATA risk categories and TNM staging systems. Our findings may help establish a tailored treatment strategy based on tumor and patient characteristics.

Patients and methods
We studied a consecutive series of 451 DTC patients who underwent thyroidectomy, with or without lymph node dissection, from October 2017 to February 2020 and were followed up at our thyroid outpatient clinic. The median follow-up was 20.5 (IQR 14.7-27.4) months.
We staged the tumors according to the 7th and 8th TNM editions, deriving T (the greatest size of the primary tumor) and N (regional lymph node metastases) from histology. We assigned N0 when all removed lymph nodes were negative or when there was no radiological or clinical evidence of lymph node metastasis, N1a when only the central compartment (levels VI-VII) was involved and N1b when latero-cervical nodes (levels I-V) were positive. M (distant metastases) was evaluated according to the post-surgical 131I whole-body scan (WBS) and/or other imaging evidence.
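The N-status rule described above can be written as a small function. This is only an illustrative sketch: the function name `n_status` and the representation of neck levels as Roman-numeral strings are assumptions made for the example, not part of the study protocol.

```python
from typing import Iterable

# Neck levels as described in the text (string encoding is an assumption).
CENTRAL_LEVELS = {"VI", "VII"}                    # central compartment
LATERAL_LEVELS = {"I", "II", "III", "IV", "V"}    # latero-cervical compartment


def n_status(positive_levels: Iterable[str]) -> str:
    """Return 'N0', 'N1a' or 'N1b' from the neck levels with metastatic nodes."""
    levels = {level.strip().upper() for level in positive_levels}
    unknown = levels - CENTRAL_LEVELS - LATERAL_LEVELS
    if unknown:
        raise ValueError(f"unknown neck level(s): {sorted(unknown)}")
    if levels & LATERAL_LEVELS:
        # Any latero-cervical involvement is N1b, with or without central nodes
        # (this also covers skip metastases).
        return "N1b"
    if levels & CENTRAL_LEVELS:
        return "N1a"
    return "N0"


if __name__ == "__main__":
    for example in ([], ["VI"], ["III"], ["VI", "IV"]):
        print(example, "->", n_status(example))
```

Checking the lateral compartment first means that lateral involvement without central nodes is still classified as N1b.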
Disease status was evaluated in all patients through neck ultrasound and Tg and TgAb determination after surgery and periodically during the follow-up (every 3-6 or 12 months).
At 12-18 months after the first evaluation, the response to initial therapy (surgery with or without radioiodine treatment) was assessed with neck ultrasound and serum Tg and TgAb measurements, either basal or TSH-stimulated (after L-thyroxine [LT4] withdrawal or rhTSH injection) in radioiodine-treated patients. According to their response to the initial therapy (ongoing risk stratification), patients were re-classified as having an excellent response (ER), indeterminate response (IR), biochemical incomplete response (BIR) or structural incomplete response (SIR).
Subsequent follow-up was modulated based on the initial risk evaluation and the first treatment response.
The level of TSH suppression was set according to the ATA guidelines: substitutive LT4 therapy in low-risk patients without evidence of disease; mild TSH suppression (0.1-0.4 mU/L) in intermediate-risk patients or in those with biochemical disease; complete suppression (< 0.1 mU/L) in high-risk patients and/or in those with structural disease.
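As a compact restatement of the rule above, the following sketch maps the initial risk class and the disease status to a TSH target. The function name `tsh_target`, the string labels, and the precedence (the stricter target wins when risk class and disease status disagree) are assumptions made for illustration only.

```python
RISK_CLASSES = {"low", "intermediate", "high"}
DISEASE_STATES = {"none", "biochemical", "structural"}


def tsh_target(ata_risk: str, disease: str = "none") -> str:
    """Return the TSH target for a given ATA risk class and disease status."""
    if ata_risk not in RISK_CLASSES:
        raise ValueError(f"unknown ATA risk class: {ata_risk!r}")
    if disease not in DISEASE_STATES:
        raise ValueError(f"unknown disease status: {disease!r}")
    # Assumption: the stricter criterion takes precedence.
    if ata_risk == "high" or disease == "structural":
        return "complete suppression (TSH < 0.1 mU/L)"
    if ata_risk == "intermediate" or disease == "biochemical":
        return "mild suppression (TSH 0.1-0.4 mU/L)"
    return "substitutive LT4 therapy (no suppression)"


if __name__ == "__main__":
    print(tsh_target("low"))
    print(tsh_target("low", "biochemical"))
    print(tsh_target("intermediate", "structural"))
```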
All patients with evidence of persistent or recurrent disease underwent additional morphological examinations (computed tomography [CT], magnetic resonance imaging, bone scan, positron emission tomography). If patients were not cured, further treatments (such as radioiodine therapy, additional surgery or other therapies) were carried out.

Statistical analysis
Categorical variables were expressed as frequencies and percentages; quantitative normally distributed ones as mean ± standard deviation (SD) and non-normally distributed ones as median with interquartile range (IQR). The normality was verified through the Kolmogorov-Smirnov test.
The Chi-square test with Yates's correction or Fisher's exact test was used to analyze the categorical variables. Multivariate analysis was carried out through logistic regression, including only the variables significantly associated with recurrent/persistent disease at univariate analysis.
The predictive power of the different ATA risk categories and TNM staging systems at short-term re-evaluation was assessed by the Harrell's C concordance index (C-index).
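For a binary outcome assessed at a fixed time point, with no censoring, the concordance index reduces to the proportion of (diseased, disease-free) patient pairs in which the diseased patient has the higher risk score, with ties counted as one half. The sketch below illustrates that simplified case; the function name `c_index_binary` and the toy data are hypothetical, and this is not the Stata procedure used in the study.

```python
from itertools import product
from typing import Sequence


def c_index_binary(scores: Sequence[float], events: Sequence[int]) -> float:
    """Concordance index for a binary outcome (ties in score count 0.5).

    scores: risk score per patient, e.g. TNM stage coded 1-4
    events: 1 if persistent/recurrent disease, 0 otherwise
    """
    with_event = [s for s, e in zip(scores, events) if e == 1]
    without_event = [s for s, e in zip(scores, events) if e == 0]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the event")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(with_event, without_event)
    )
    return concordant / (len(with_event) * len(without_event))


if __name__ == "__main__":
    # Hypothetical toy data: stage (1-4) and disease status for 7 patients.
    stages = [1, 1, 1, 2, 2, 3, 4]
    disease = [0, 0, 1, 0, 1, 1, 1]
    print(f"C-index: {c_index_binary(stages, disease):.3f}")
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which is why C-index values close to 0.5 are read as low predictive power.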
A nomogram was built from the parameters that were significantly associated with the risk of recurrent/persistent disease at multivariate logistic regression analysis, yielding a risk score for predicting the probability of persistent/recurrent disease.
A p-value <0.05 was considered statistically significant for all analyses. Data analysis was performed using the Stata software version 16.

Results
Clinical and histopathological characteristics are shown in Table 1. Most patients were females with a F/M ratio of 2.6/1.0. Median age at diagnosis was 47.5 yrs (IQR 36.9-57.8). Histotype was papillary in almost all cases (96%). 14.2% of patients had an aggressive PTC variant (diffuse sclerosing, tall cell, insular or columnar).
Eleven patients (2.4%) had distant metastases at diagnosis (10 with lung metastases and 1 with lung and bone metastases).
Minimal extrathyroid extension was present in 21.5% and multifocality in 39.0% of patients.
Most patients fell into the T1 and T2 categories, especially with the new TNM staging system; 26.8% of patients had central (N1a) and 13.7% latero-cervical (N1b) lymph node metastases.
Two hundred seventy-seven (61.4%) patients were treated with 131I at different activities: 85 patients with 30 mCi, 186 with 100 mCi, 3 with 50 mCi, 1 with 70 mCi, 1 with 150 mCi and 1 with 200 mCi. Of these, 156 patients were treated after LT4 withdrawal and 121 after rhTSH administration. Six (1.3%) patients required further neck surgery and 6 (1.3%) patients underwent a second 131I treatment about 12 months after the first.

TNM staging system and risk stratification system
TNM staging (VII and VIII editions) and risk stratification classifications are shown in Table 2.
With the VII TNM staging system 60% of tumors were T1a or T1b, whereas with the VIII TNM more than three quarters of the patients (77.4%) fell into this category. This difference depends mainly on the removal of minimal extrathyroid extension from the T3 classification.
Regarding lymph node status, with the VII TNM staging 12.0% of patients were N0 and 47.5% Nx, whereas with the VIII TNM staging almost 60% of cases were N0a or N0b (268, 59.4%), owing to the removal of Nx in the new TNM system. The percentages in the N1a and N1b classes were unchanged.
Six patients presented latero-cervical metastases without involvement of the central compartment (skip metastases).
Patients were also staged with the VII and VIII TNM staging systems (Figure 1): most patients fell into stage I with both classifications (71.9% and 86.5%, respectively). With the VIII TNM edition there was a significant downstaging across all categories (about 30%), mostly from stages III and IVA into stages I and II.
At initial evaluation, patients were subdivided into three different risk categories with few variations in percentage due mainly to the lymph node number and size categorization using 2009 or 2015 ATA risk stratification: 42.8% and 45.5% low, 54.8%

Response to initial therapy, 12-18 months after initial treatment, in all patients
After initial treatment, 63.9% of patients presented with excellent response. However, 35 patients (not ablated) had basal Tg between 0.2 and 1 ng/mL, stable and compatible with small thyroid remnant. After 12-18 months from initial treatment, just over a third of patients (36.1%) were not cured.
In particular, 82 patients presented an indeterminate response (68 patients had indeterminate Tg or TgAb and 14 patients had non-specific findings at neck ultrasound or CT scan). Five patients had a biochemical incomplete response and 76 patients a structural incomplete response (52 had lymph node metastases, almost all small in number and size; 10 lung metastases; 1 bone metastases only; 5 lung and bone metastases; 5 lung and LN metastases; 2 lung, bone and local disease).

Response to initial therapy according to ATA risk categories
As expected, the percentage of patients with an excellent response decreased across risk categories, being lower in intermediate- and especially high-risk patients with both ATA risk stratification systems (Figure 2).
In low-risk patients, no significant difference was found between the two classifications: approximately 72% had an ER, 19% IR, 1% BIR and 7.8% SIR.

Structural disease after 12-18 months after initial treatment according to VII vs VIII TNM staging
As expected, the percentage of patients with structural disease increased with stage, particularly with the VIII TNM edition (Figure 3).
Comparing the two TNM editions, a similar percentage of structural disease was observed in stages I and II, whereas a significant difference was found in stages III and IV (p=0.004 and p=0.04, respectively).

Predictors of persistent/recurrent disease at 12-18 months after first treatment
Risk factors of persistent disease (either morphologic or biochemical) at last disease assessment are presented in Table 3.
At univariate analysis, the factors associated with the presence of disease were: T status, the presence of lymph node metastases in either the central or the lateral compartment (N1a and N1b status), the presence of more than five lymph node metastases, intermediate or high ATA risk, and radioiodine treatment.
At multivariate analysis, both T status and lateral lymph node metastasis independently predicted persistent/recurrent disease (with a higher OR for N1b: 3.07, p<0.001).
Considering only structural incomplete response (SIR), the factors associated with persistent disease at univariate analysis were, in addition to the risk factors listed above, male gender and multifocality. At multivariate analysis, T status and lateral lymph node metastasis were independent predictors of disease (Table 4).
FIGURE 1 Graphic representation of patients' distribution into the four stages with the VII vs VIII TNM staging systems.
FIGURE 2 Response to initial therapy, 12-18 months after initial treatment, according to ATA risk categories.
Since the multivariate analysis showed that lymph node metastases have a significant impact on the prediction of persistent/recurrent disease, LN characteristics (location and number) were investigated in detail.
As expected, not only the presence of positive LN (N1a + N1b vs N0/Nx), but also the number of nodal metastases and their location (N1b vs N1a) were relevant risk factors for persistent/recurrent disease at 12-18 months after first treatment.
Analyzing the number of nodal metastases with a cut-off of 5 positive nodes, the likelihood of persistent disease was higher for patients with > 5 than for those with ≤ 5 metastatic lymph nodes or negative lymph nodes (64.6%, 40.8% and 28.7%, respectively; p<0.001) (Table 3).
As for lymph node location (N1a or N1b), the frequency of disease at 12-18 months after first treatment progressively increased from 28.7% in N0a/N0b to 40.5% in N1a and to 59.7% in N1b (OR 1.69 and 3.67, p=0.02 and p<0.001, respectively) (Table 3).
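The unadjusted odds ratios quoted above can be recomputed from event counts. In the sketch below, the N1a (49/121) and N1b (37/62) counts follow from the subgroup figures given in the text, whereas 77/268 for N0a/N0b is back-calculated from 28.7% of 268 patients and should be read as an assumption. The function name `odds_ratio` and the Woolf (log-scale) confidence interval are illustrative choices, not necessarily the authors' method.

```python
import math
from typing import Tuple


def odds_ratio(events_group: int, n_group: int,
               events_ref: int, n_ref: int,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio vs a reference group, with a Woolf 95% CI."""
    a, b = events_group, n_group - events_group   # group: disease / no disease
    c, d = events_ref, n_ref - events_ref         # reference: disease / no disease
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


if __name__ == "__main__":
    N0 = (77, 268)   # assumption: back-calculated from 28.7% of 268
    for label, (events, n) in {"N1a": (49, 121), "N1b": (37, 62)}.items():
        est, lo, hi = odds_ratio(events, n, *N0)
        print(f"{label} vs N0: OR {est:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under these counts the point estimates round to the reported values of 1.69 and 3.67.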
Analyzing the number and location of nodal metastases, for 121 N1a patients, the probability of persistent disease was higher when more than 5 lymph nodes were involved (7/13 cases, 53.8%) compared with ≤5 lymph nodes involved (42/108, 38.9%); this difference was not statistically significant (p=0.30).
For the 62 N1b patients, the probability of persistent disease was considerably higher when more than 5 lymph nodes were involved (24/35, 68.6%) than when ≤ 5 lymph nodes were involved (13/27, 48.1%); this difference was also not statistically significant (p=0.10). The VII and VIII TNM staging systems also showed similar, low predictive power (C-index = 0.560 vs 0.570, respectively).
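The two subgroup comparisons above (> 5 vs ≤ 5 involved nodes within N1a and within N1b) are 2x2 tables and can be checked with a Pearson chi-square test. The sketch below uses only the standard library; the function name `chi2_2x2` is hypothetical. With the counts given in the text, the uncorrected statistic gives p-values close to those reported, while the Yates-corrected version is more conservative; which variant was applied to these particular comparisons is not stated, so this is a plausibility check rather than a reproduction of the original analysis.

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int, yates: bool = False) -> Tuple[float, float]:
    """Pearson chi-square test (1 df) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With yates=True the continuity
    correction is applied to |ad - bc|.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    statistic = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Rows: > 5 nodes, <= 5 nodes; columns: persistent disease, no disease.
    tables = {"N1a": (7, 6, 42, 66), "N1b": (24, 11, 13, 14)}
    for label, cells in tables.items():
        _, p_plain = chi2_2x2(*cells)
        _, p_yates = chi2_2x2(*cells, yates=True)
        print(f"{label}: p = {p_plain:.2f} (uncorrected), {p_yates:.2f} (Yates)")
```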

Development of a nomogram to predict persistent/recurrent disease
A nomogram incorporating all the significant parameters was constructed from the multivariate logistic model reported in Table 3. For each parameter we obtained the corresponding prognostic points, as shown in Figure 4. The point values of all predictor variables are summed to give a total score. This value is located on the total score axis, and a vertical line drawn from that point to the probability axis indicates the patient's probability of persistent/recurrent disease at re-evaluation after the first therapy.
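The point-and-sum logic of a logistic-regression nomogram can be sketched as follows. Only the odds ratio of 3.07 for N1b is taken from the text; the other coefficients, the intercept, and the function names are hypothetical placeholders, so the numbers produced here do not correspond to Figure 4.

```python
import math
from typing import Dict


def nomogram_points(coefficients: Dict[str, float]) -> Dict[str, float]:
    """Rescale logistic coefficients so that the largest |beta| equals 100 points."""
    max_beta = max(abs(beta) for beta in coefficients.values())
    return {name: 100 * beta / max_beta for name, beta in coefficients.items()}


def predicted_probability(total_points: float, intercept: float, max_beta: float) -> float:
    """Convert a total score back to a probability via the logistic function."""
    linear_predictor = intercept + total_points * max_beta / 100
    return 1 / (1 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Hypothetical coefficients (log odds ratios), except N1b = ln(3.07).
    coefficients = {"T2": 0.6, "T3": 0.9, "N1a": 0.5, "N1b": math.log(3.07)}
    intercept = -1.2  # hypothetical
    points = nomogram_points(coefficients)
    max_beta = max(abs(beta) for beta in coefficients.values())

    # Example patient: T2 tumor with lateral nodal metastases.
    total = points["T2"] + points["N1b"]
    probability = predicted_probability(total, intercept, max_beta)
    print({name: round(value, 1) for name, value in points.items()})
    print(f"total points: {total:.1f}, predicted probability: {probability:.2f}")
```

Because the points are a linear rescaling of the coefficients, summing points and reading the probability axis is equivalent to evaluating the logistic model directly.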

Discussion
The prognosis of DTC is generally excellent, but in some patients tumors are aggressive and have a poor outcome. The TC death rate increased slightly from 2008 to 2017 (0.6% per year) but appears to have stabilized in recent years. It is not known whether the increasing incidence of PTC is real or due to the overdiagnosis of indolent PTCs.
Finding the variables that identify these aggressive tumors is a major issue in deciding the appropriate management.
The American Thyroid Association (3) introduced a new risk stratification system with additional prognostic variables. Moreover, to better predict DTC survival, the TNM classification was also updated in 2018. The 8th edition (TNM-8) downstaged a significant number of patients to reflect more accurately their low risk of dying, but it may underestimate the risk of persistent/recurrent disease and death in some patients, because all young patients without distant metastases fall into stage I.
The revised risk stratification and TNM staging have significant consequences for the initial therapeutic decision and the subsequent follow-up management.
Concerning the risk stratification, different data have been published. In Steinschneider et al. data (7) approximately 70% of
Regarding the staging, a large proportion of patients (30-40%) were downstaged in the 8th edition compared with the 7th, mostly because of the increase of the age cut-off to 55 years, the downclassification of T3 disease, and the overall downstaging of lymph node metastases (4, 7-9), with a minimal impact on the expected 10-year disease-specific survival despite the large proportion of patients shifted to stages I and II. Kim et al. (4) found that 41% of patients were downstaged and, inevitably, more patients with recurrences or deaths were found in the lower stages: 17% of the patients downstaged from stage III to II had recurrent disease, and 25% and 13.6% died in the groups downstaged from stage IV to III and from stage IV to II, respectively.
Our data also show substantial downstaging (about 30%), mostly from stages III and IVA to stages I, II and III. In particular, downstaging involved 82.6% of stage II patients (into stage I), 100% of stage III patients (into stages I and II) and 100% of stage IV patients (into stages I, II and III).
Regarding the response to initial therapy (dynamic or ongoing risk stratification), a paper published 10 years ago by Tuttle et al. (10) found an ER in 86% of low-risk, 57% of intermediate-risk and 14% of high-risk patients; BIR in 11% of low-risk, 22% of intermediate-risk and 14% of high-risk patients; and SIR in 3% of low-risk, 21% of intermediate-risk and 72% of high-risk patients.
Another paper evaluating 441 patients (7) showed that the proportion of intermediate/high-risk patients in stages I-II increased considerably with TNM-8 and that patients downstaged to stage II with TNM-8 had more lymph node metastases, more surgeries, more disease persistence and a (non-significantly) increased disease-specific mortality compared with TNM-7. The authors found similar rates of persistent and recurrent disease in stage I with both TNM editions, but higher rates in stages II (p = 0.05) and III (p = 0.03) with TNM-8 than with TNM-7. They concluded that the new TNM provides a more accurate system for assessing mortality and disease persistence, but that disease severity, mainly in stage II patients or in the 45-55-year-old group, should not be underestimated as a result of the substantial downstaging of some particular groups of patients.
In the present study, after initial treatment, 63.9% of patients presented an ER and 36.1% were not cured; of the latter, half presented an IR, slightly fewer a SIR and a few a BIR.
Our data differ from previous data mainly in the lower percentage of ER in low-risk patients (71.7%), owing to a higher number of patients with IR (19.5% of low-risk, 17.3% of intermediate-risk and 13.3% of high-risk patients) and a much lower percentage of patients with BIR (1% in the low-risk, 1.3% in the intermediate-risk and 0% in the high-risk group); the SIR rates are similar to those of the other paper (7.8% in the low-risk, 21.6% in the intermediate-risk and 66.7% in the high-risk group).
Moreover, in our patients the rates of structural disease in stages I and II were similar in both editions, whereas they were significantly higher in stages III (p = 0.004) and IV (p = 0.04) with the VIII than with the VII edition.
Another issue still to be validated is the timing of the re-evaluation. Based on our data, restaging at 12-18 months may be too early, as many patients with an indeterminate response could later convert to an excellent response (for example, those with TgAb still positive but stable or declining). This hypothesis will be evaluated by extending the follow-up.
Concerning the risk factors, most reports show that age, gender, aggressive variants, tumor size and lymph node metastases are the most important predictors of outcome (3).
In a 2020 paper, Shin et al. (11) found that at multivariate analysis tumor size > 4 cm, multifocality and nodal factors were the independent predictors of recurrence-free survival.
In our data only tumor size and lymph node metastases independently predicted short-term outcome, whereas the other risk factors were not statistically significant. In fact, tumor size was an independent risk factor from T2 up to T4a, both for recurrent/persistent disease (biochemical and structural) and for structural disease alone.
In a retrospective analysis of 574 patients with PTC, Tran et al. (12) found that tumor size predicted recurrence-free survival on multivariate analysis, whereas Pellegriti et al. (13) showed that tumor size (≤1.0 cm versus 1.1-1.5 cm) was not predictive of recurrence. Nguyen et al. (14) showed that, in the SEER database, the 10-year relative survival rates for tumors sized 1.5 cm or larger and for tumors smaller than 1.5 cm were 95.4% and 99.8%, respectively.
Regarding lymph node metastases, their clinical relevance in PTC has been a matter of debate for decades (15,16). To date, the impact of LN metastases at diagnosis on the risk of recurrence is well documented in many papers, including a recent study by our group (17,18). For many years, however, only the presence of neck lymph node metastases, without other specified variables, was evaluated as a prognostic factor in PTC.
The impact of metastatic neck LN on PTC risk stratification was better defined in the 2015 ATA guidelines, in which additional information, such as the number, size or extranodal extension of metastatic lymph nodes, was included in the evaluation. To date, these additional characteristics have not been validated and their relevance in defining the recurrence risk has yet to be quantified.
FIGURE 4 Significant parameters at multivariate analysis with the corresponding prognostic points, and nomogram for the prediction of persistent/recurrent disease on the basis of clinical and histological characteristics.
In our series the positivity of lymph node metastases, the number (≤5 or >5) and the location (N1a and N1b) are effective predictors of the outcome of the patients at 12-18 months after the first treatment, both for recurrent/persistent disease (biochemical and structural) and for only structural disease.
In the ATA guidelines, data on the location of metastases (N1a or N1b) were judged insufficient to include this information among the clinicopathologic variables used to estimate the risk in PTC (3) (recommendation #48, [B20], paragraph 1, line 16).
In a recent publication by our group (19), N1b status worsened the prognosis and may be related to the appearance of distant metastases, which are considered the best surrogate index for cancer-specific death.
N1b status could be a marker of more aggressive or more advanced disease at diagnosis in PTC patients, and may be associated with other markers of cancer aggressiveness (such as molecular alterations).
Concerning minimal extrathyroid extension (mETE), its role in disease-specific survival (DSS) and overall survival (OS) has been evaluated by several studies, but it is still considered a controversial prognostic factor. Some authors (19-22) showed similar clinical outcomes. However, Castagna et al. (23) showed a poorer outcome, in terms of persistent/recurrent structural disease and tumor-related death, in patients with mETE vs tumors >1.5 cm without extrathyroid extension (11.8% vs. 5.1%), concluding that only small mETE cancers should be considered at low risk.
Recently, an expansion of TNM-8 (Telescoping) has been published to test the subcategories according to mETE, in order to obtain a better estimate of the prognosis and to plan the follow-up management. In the next few years, data on the impact of mETE for each tumor class will become available.
Comparing disease-specific survival (DSS) between the VII and VIII TNM editions, Tam et al. (20) concluded that DSS is similar in both editions, although with the updated TNM-8 the 10-year DSS appears more appropriately distributed across stages. For stages I-IV, with TNM-7 the 10-year DSS ranged from 100% to 82.6% (p < 0.001) and the 10-year OS from 95.8% to 59.7%, while with TNM-8 they ranged from 99.8% to 71.9% (p < 0.001) and from 94.3% to 34.6%, respectively.
In contrast, Chung et al. (21), analyzing a large series of 3,176 DTC patients, and Jeon et al. (22), investigating the ability of TNM-8 to predict DSS compared with TNM-7 in 1,613 DTC patients, showed that TNM-8 has a higher power to differentiate patients in each stage and also to predict DSS.
Therefore, although TNM-8 seems to improve the assignment of patients at high risk of dying from DTC to more advanced stages of disease, it may also lead to the erroneous belief that the disease is less aggressive. Nearly 50% of cancer-related deaths involve patients in stages I-II with TNM-8, compared with none with TNM-7 (7). Since thyroid cancer has a very low mortality, in some patients the risk of death is not always related to the risk of recurrence.
Recently, several studies showed better predictive ability for the new TNM (6,24). More accurate survival prediction is suggested when TNM-8 is applied, owing to the downstaging of a significant number of patients (about 30%).
In our data, since the median follow-up was 20.5 months, mortality was not assessed; an analysis of the presence of distant metastases, a good surrogate for predicting mortality, was carried out instead. Harrell's C-index, used to evaluate the power to predict disease after a short follow-up, showed no difference between the VII and VIII TNM editions (both for biochemical and structural disease together and for structural disease alone).
Lastly, cancer nomograms are prediction tools that assess risk on the basis of specific patient and tumor characteristics and predict the likely outcomes of different therapies. By integrating different prognostic variables, a nomogram can generate an individual numerical probability of a clinical event, which is useful to improve disease prognostication and therefore to personalize follow-up.
Several nomograms for predicting the risk of death from thyroid cancer have recently been proposed in the literature (23,25). In 2016 Lang et al. (24) validated a nomogram for PTC patients with excellent discrimination and accuracy in predicting 10-year disease-specific death and recurrence. In 2017 two nomograms were proposed (26), one for regional recurrence-free survival and one for distant recurrence-free survival, with a C-index of 0.72 and 0.83, respectively. The nomogram derived from our data could also be useful to plan an individualized follow-up for each patient, based on the score obtained from the risk calculation. The limitation of this system is the lack of external validation.
In conclusion, although the new TNM-8 seems to discriminate disease-specific death better than the previous TNM edition, in some patients (such as those with N1b disease at diagnosis, mainly if the metastases are numerous and large) it could underestimate the severity of disease because of the significant downstaging, and lead to a non-negligible treatment burden. In fact, while in TNM-7 N1b patients were included in advanced risk categories, TNM-8 does not discriminate the death rate according to lymph node location; it downstages some patients (particularly older patients and/or those with N1b disease), decreasing the discriminating ability for the few patients who have a negative outcome despite being categorized as stage II. Therefore, careful follow-up is needed for downstaged patients.
Further prospective studies are needed to better define the real effectiveness of the 2015 ATA risk stratification system and the VIII TNM staging system.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by Comitato Etico Catania 2, A.O. Garibaldi, Piazza Santa Maria del Gesù 5, 95124 Catania. The patients/participants provided their written informed consent to participate in this study.