Impact of Gross Strap Muscle Invasion on Outcome of Differentiated Thyroid Cancer: Systematic Review and Meta-Analysis

Background: Gross strap muscle invasion (gSMI) in patients with differentiated thyroid cancer (DTC) was classified in the high-risk-of-recurrence group in the 2015 American Thyroid Association guidelines. However, controversy persists because several studies have suggested that gSMI has little effect on disease outcome. Herein, a systematic review and meta-analysis was conducted to investigate the impact of gSMI on the outcome of DTC. Methods: A systematic search of electronic databases (PubMed, EMBASE, Cochrane Library, and MEDLINE) for studies published until February 2020 was performed. Case-control studies and randomized controlled trials that studied the impact of gSMI on the outcome of DTC were included. Results: Six studies (all retrospective) involving 13,639 patients met the final inclusion criteria. Compared with no extrathyroidal extension (ETE), gSMI was associated with an increased risk of recurrence (P = 0.0004; OR 1.46; 95% CI: 1.18–1.80) and lymph node metastasis (LNM) (P < 0.00001; OR 4.19; 95% CI: 2.53–6.96). For mortality (P = 0.34; OR 1.47; 95% CI: 0.67–3.25), 10-year disease-specific survival (P = 0.80; OR 0.91; 95% CI: 0.44–1.88), and distant metastasis (DM) (P = 0.21; OR 2.94; 95% CI: 0.54–15.93), there was no significant difference between the gSMI and no ETE groups. In contrast with maximal ETE (extension of the primary tumor to the trachea, esophagus, recurrent laryngeal nerve, larynx, subcutaneous soft tissue, skin, internal jugular vein, or carotid artery), gSMI was associated with a decreased risk of recurrence (P < 0.0001; OR 0.58; 95% CI: 0.44–0.76), mortality (P = 0.0003; OR 0.20; 95% CI: 0.08–0.48), LNM (P = 0.0003; OR 0.64; 95% CI: 0.50–0.81), and DM (P = 0.0009; OR 0.28; 95% CI: 0.13–0.59). Conclusions: DTC patients with gSMI had a higher risk of recurrence and LNM than those without ETE. However, in contrast with maximal ETE, a much better prognosis was observed in DTC patients with only gSMI.


INTRODUCTION
Extrathyroidal extension (ETE), which is defined as tumor spread outside of the thyroid gland and into the surrounding tissues, occurs in up to 30% of patients with differentiated thyroid cancer (DTC) (1,2). Minimal ETE (mETE), detectable only on histological examination, was not regarded as a negative predictor for either survival or disease recurrence (3–5). Accordingly, mETE was removed from the T3 definition in the 8th edition of the American Joint Committee on Cancer (AJCC) classification, so that it no longer affects either T category or overall stage (6). In contrast, gross ETE is believed to be an important risk factor for recurrence and mortality (7,8). Thus, DTC patients with gross ETE are classified as T3b or T4 in the AJCC system (9). Moreover, the 2015 American Thyroid Association (ATA) guidelines grouped tumors with gross ETE in the high risk of recurrence category, with a nearly 20% risk of structural recurrence (10). Therefore, gross ETE is regarded as an absolute indication for total thyroidectomy and the administration of post-operative radioactive iodine.
In addition to the degree of gross ETE, the site of gross tumor invasion also plays an important role in disease-specific survival (DSS) and disease-free survival (DFS). Recently, several studies reported that gross strap muscle invasion (gSMI) had little effect on DSS and DFS, which differed from the findings of previous studies (8,11). Increasing evidence suggests that DTC patients with only gSMI have the same DFS as those with microscopic ETE (11). In our previous study, we also found that only four of 30 (13.3%) BRAF-mutated papillary thyroid microcarcinoma patients with gSMI were diagnosed with recurrence (12). Accordingly, Shaha (13) suggested that a detailed distinction of gross ETE should be performed. Patients with anterior ETE involving the strap muscle had a relatively good prognosis compared with those with posterior gross ETE to the recurrent laryngeal nerve, trachea or esophagus (8). A possible reason is that gSMI can be easily resected with negative margins (13).
In light of the conflicting data on the recurrence risk and mortality conferred by gSMI, we performed a systematic review and meta-analysis to assess the impact of gSMI on the outcomes of DTC patients.

MATERIALS AND METHODS
This meta-analysis was conducted in accordance with the Cochrane Handbook for Systematic Reviews of Interventions guidelines (14). There was no funding received for this study.

Inclusion and Exclusion Criteria
The studies returned from the search were checked according to the following inclusion criteria: (1) patients were more than 18 years old; (2) pathologically proven DTC patients who underwent surgery; (3) complete clinical data and follow-up information; and (4) DTC patients with SMI.
The exclusion criteria were as follows: (1) patients with <12 months of follow-up; (2) those with incomplete medical records; (3) those with medullary thyroid carcinoma and undifferentiated carcinoma; and (4) publications in the form of letters to the editor, abstracts and meeting posters.

Data Extraction and Risk of Bias Assessment
L. Zhang and J. Liu screened the search results for relevance to the review. Two reviewers (L. Zhang and S. Xue) independently assessed the titles and abstracts of the remaining records for relevance according to the protocol criteria. Then, they read the full text of the studies in detail. Any disagreements were resolved by consulting a third reviewer (J. Li). L. Zhang and J. Liu assessed the risk of bias of each included study using the Newcastle-Ottawa Scale (15).

Statistical Analysis
Review Manager (RevMan) 5.3 software was used for the analysis. We calculated odds ratios (ORs) with 95% confidence intervals (CIs) for dichotomous data. We assessed the heterogeneity across studies using the Q-test and the I² statistic. P < 0.1 and I² > 50% indicated significant heterogeneity (16). If there was significant heterogeneity, we used a random-effects model; otherwise, we used a fixed-effects model. We conducted sensitivity analysis by excluding one study at a time to test its influence on the pooled effects. The source of heterogeneity was also explored by subgroup analyses of operation type and histopathological subtype based on available information. A p-value <0.05 was considered statistically significant.
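The pooling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the RevMan computation: it assumes generic inverse-variance weighting and a DerSimonian-Laird random-effects estimate, whereas RevMan may apply a different weighting (for example Mantel-Haenszel), so results need not match the reported values. The model is chosen from I² alone; the Q-test p-value is omitted to keep the sketch dependency-free. The function names and the 2x2 counts are hypothetical and are not data from the included studies.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for one 2x2 table without zero cells.

    a, b: events / non-events in the gSMI group
    c, d: events / non-events in the comparison group
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pool_odds_ratios(tables, i2_threshold=50.0, z=1.96):
    """Pool 2x2 tables; switch to random effects when I^2 exceeds the threshold."""
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]
    w = [1 / vi for vi in v]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran's Q and the I^2 statistic (in percent)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    use_random = i2 > i2_threshold
    if use_random:
        w = [1 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return {
        "model": "random" if use_random else "fixed",
        "or": math.exp(est),
        "ci_low": math.exp(est - z * se),
        "ci_high": math.exp(est + z * se),
        "q": q,
        "i2": i2,
    }

# Hypothetical counts for illustration only (not taken from the included studies).
hypothetical_tables = [(30, 120, 40, 360), (12, 88, 25, 475), (45, 155, 60, 540)]
result = pool_odds_ratios(hypothetical_tables)
print(result)
```

The returned dictionary reports which model was selected, so the effect of the I² threshold on the pooled estimate can be inspected directly.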

Literature Search
We initially identified a total of 219 studies. Fifty-two duplicate studies and another 118 studies were excluded after reviewing the titles and abstracts. After scrutiny of the full texts of the remaining 49 articles, six studies were finally included in this meta-analysis, all of which were retrospective studies (8,11,17–20). Figure 1 shows the study selection process.
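The selection counts above can be checked with simple arithmetic. The short sketch below uses only the figures stated in the text; the number of articles excluded at the full-text stage is not reported explicitly and is derived here as the difference.

```python
# Counts as stated in the text
identified = 219
duplicates = 52
excluded_on_title_abstract = 118
included = 6

# Articles remaining for full-text review
full_text_reviewed = identified - duplicates - excluded_on_title_abstract

# Derived value: articles excluded after full-text review
excluded_on_full_text = full_text_reviewed - included

print(full_text_reviewed, excluded_on_full_text)
```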

Study Characteristics and Quality
In this meta-analysis, a total of 13,639 patients were included, and the characteristics of the included studies are presented in Table 1. Maximal ETE means extension of the primary tumor to the trachea, esophagus, recurrent laryngeal nerve, larynx, subcutaneous soft tissue, skin, internal jugular vein, or carotid artery. No ETE means no extrathyroidal extension. The quality assessment of the included studies by the Newcastle-Ottawa Scale is presented in Table 2. All studies used hospital controls, who were accessed by the same method as gross ETE into the strap muscle (gETE st+ group). Multivariate analysis was conducted in all the studies. The scores of all the studies were over 5; thus, the quality of the selected studies was generally high.

Overall Mortality
Three studies compared cancer-related mortality between the gSMI and no ETE groups among 4,699 patients with DTC (no ETE in 4,002 subjects, gETE st+ in 697 subjects). Two studies compared cancer-related mortality between the gSMI and maximal ETE groups. The mortality of patients with gSMI was not increased compared with that of no ETE patients (P = 0.34; OR, 1.47; 95% CI: 0.67-3.25) (Figure 3A). Compared with maximal ETE, gSMI was associated with decreased mortality (P = 0.0003; OR, 0.20; 95% CI: 0.08-0.48) (Figure 3B).
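As a worked check of how a reported OR, its 95% CI and its P value relate, the sketch below recovers the standard error on the log scale from the CI and computes a two-sided P value. It assumes a Wald-type interval that is symmetric on the log-odds scale; the function name is hypothetical, and the inputs are the rounded values reported above for gSMI versus no ETE, so small rounding differences are expected.

```python
import math

def p_value_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Two-sided P value implied by an OR and its 95% CI (Wald approximation)."""
    # Standard error of log(OR), recovered from the width of the CI
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution
    return z, math.erfc(abs(z) / math.sqrt(2))

# Reported mortality comparison, gSMI vs. no ETE: OR 1.47 (95% CI 0.67-3.25)
z_mortality, p_mortality = p_value_from_or_ci(1.47, 0.67, 3.25)
print(round(z_mortality, 2), round(p_mortality, 2))
```

Under these assumptions the implied P value agrees with the reported P = 0.34 to two decimal places.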

10-Year Disease-Specific Survival
Three studies analyzed the impact of gSMI and no ETE on 10-year DSS among 3,981 patients with DTC. There was no significant difference between the no ETE and gSMI groups (P = 0.80; OR, 0.91; 95% CI: 0.44-1.88) with no heterogeneity (I² = 0%) (Figure 4).

Distant Metastases
Four studies assessed the impact of gSMI and no ETE on distant metastases (DM). There was no significant difference in the DM ratio between the gSMI and no ETE groups (P = 0.21), with significant heterogeneity (I² = 87%) (Figure 6A). Moreover, only two studies investigated the impact of gSMI and maximal ETE on DM. Compared with maximal ETE, gSMI was associated with decreased DM (P = 0.0009; OR, 0.28; 95% CI: 0.13-0.59) with low heterogeneity (I² = 19%) (Figure 6B).

Sensitivity Analysis and Subgroup Analysis
For the comparisons with significant heterogeneity, we conducted sensitivity analysis. The leave-one-out analyses of LNM (compared with no ETE) and DM (compared with no ETE) did not identify any single study that accounted for the substantial heterogeneity (data not shown). Furthermore, the degree of LNM is highly dependent on the type of lymph node dissection (LND). The heterogeneity could derive from the fact that the published data generally do not allow differentiation between nodal involvement in the central and in the lateral compartment. Indeed, central lymph node dissection (CLND) may be selectively performed, whereas the lateral neck is usually treated only with therapeutic intent. Prophylactic LND will identify many microscopic LNM, while therapeutic LND is only performed for patients with clinical metastatic lymph nodes. All and some patients underwent prophylactic LND in the Li and Park studies, respectively. Therapeutic LND was performed in the studies by Danilovic and Amit. In the therapeutic LND subgroup, patients with gSMI had increased LNM compared with patients without ETE (P < 0.00001; OR, 6.94; 95% CI: 4.40-10.95; I² = 0%) (Figure 7).
Patients with follicular thyroid carcinoma are more likely to present with DM than those with papillary thyroid carcinoma (PTC). We believe the significant heterogeneity of the DM analysis is mainly attributable to histopathological type. In the DTC subgroup, there was still no significant difference in DM between the gSMI and no ETE groups (P = 0.45), with low heterogeneity (I² = 25%) (Figure 8A). However, in the PTC subgroup, we found that gSMI increased DM significantly compared with no ETE (P < 0.0001; OR, 12.35; 95% CI: 5.20-29.29; I² = 0%) (Figure 8B).
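The leave-one-out procedure used in the sensitivity analysis can be illustrated as follows. This is a sketch under stated assumptions rather than the RevMan workflow: each study is represented by a log odds ratio and its standard error, pooling uses fixed-effect inverse-variance weights, and the study labels and values are hypothetical, chosen so that one outlying study drives the heterogeneity.

```python
import math

def pooled_fixed(studies):
    """Fixed-effect pooled OR and I^2 (percent) from (name, log_or, se) tuples."""
    w = [1 / se ** 2 for _, _, se in studies]
    y = [log_or for _, log_or, _ in studies]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - est) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(est), i2

def leave_one_out(studies):
    """Re-pool the data with each study omitted in turn."""
    results = {}
    for i, (name, _, _) in enumerate(studies):
        remaining = studies[:i] + studies[i + 1:]
        results[name] = pooled_fixed(remaining)
    return results

# Hypothetical studies for illustration only: study D is an outlier.
hypothetical_studies = [
    ("A", 0.5, 0.2),
    ("B", 0.5, 0.2),
    ("C", 0.5, 0.2),
    ("D", 2.5, 0.2),
]
loo = leave_one_out(hypothetical_studies)
for omitted, (pooled_or, i2) in loo.items():
    print(f"omit {omitted}: OR = {pooled_or:.2f}, I2 = {i2:.0f}%")
```

If omitting one study makes I² fall sharply, that study is a likely source of heterogeneity; when no omission changes I² appreciably, as reported for the LNM and DM comparisons, subgroup analysis is the natural next step.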

DISCUSSION
Increasing evidence has shown that the site of gross tumor invasion also plays an important role in the recurrence and mortality of DTC patients (21,22). Some researchers believed that gSMI had a relatively good prognosis compared with gross ETE to the recurrent laryngeal nerve, trachea or esophagus, which differed from the findings of previous studies (11,20,23). It is still controversial whether DTC with only gSMI should be downgraded to a lower tumor stage and recurrence risk category. To the best of our knowledge, this is the first meta-analysis to assess the impact of gSMI on outcomes in DTC patients. Compared with patients with no ETE, patients with gSMI had an increased risk of recurrence and LNM. For mortality, 10-year DSS and DM, there were no significant differences between the gSMI and no ETE groups. In contrast with those with maximal ETE, patients with gSMI had a decreased risk of recurrence, mortality, LNM, and DM.
According to ATA guidelines, tumors with gross ETE are categorized into the high-risk group because of the more than 20% structural recurrence rate (10). In our study, the LRR rate of the gSMI group ranged from 5 to 25.9%. The relatively low LRR rates were mainly attributed to the exclusion of some patients at high risk of recurrence in these studies (11,19,20). In the Danilovic and Li studies, which included all kinds of DTC cases, the LRR rates of gSMI were 24.6 and 25.9% (17,18). These data were consistent with the ATA guidelines. Based on the site of tumor invasion, gross ETE can be further divided into three subgroups: invasion only to perithyroidal soft tissue, invasion only to strap muscle and invasion beyond the strap muscles (recurrent laryngeal nerve, trachea, esophagus, skin, or subcutaneous tissues). Some authors have speculated that patients with anterior gETE (i.e., strap muscle) have a relatively favorable prognosis compared to those with posterior gETE (i.e., recurrent nerve, trachea, esophagus) (24). We also found that DTC patients with gETE beyond the strap muscle suffered a much higher LRR than the other two groups. A possible reason is that gSMI can be easily resected with negative margins (13). In the future, it may be reasonable to categorize gETE beyond the strap muscle into an extremely high-risk group in a new recurrence risk stratification system, although further high-quality evidence is needed. The eighth edition of the AJCC/TNM cancer staging system for DTC was published in 2016 (6). It made a substantial change with regard to the T3 category definition. Because mETE, which is identified only on histological examination, carried much less prognostic importance, the new AJCC/TNM system removed mETE in determining the T category (25). Moreover, T3b was defined as a tumor of any size with gSMI.
The 8th edition made clear distinctions between disease with no ETE (T1, T2, T3a), gETE only to the strap muscle (T3b) and gETE beyond the strap muscle (T4) (26). In our study, we found that there was no significant difference between the no ETE and gSMI groups in 10-year DSS. Besides ETE, age, LNM, and DM also play important roles in the AJCC system for survival prediction. These factors may differ between the "no ETE" and "gross strap muscle invasion" groups, which may explain the similar 10-year DSS between the no ETE and gSMI groups if these factors are adjusted for in a multivariate analysis.
Usually, the T stage of tumors is associated with LNM and DM. The invasiveness of a tumor reflects its severity and differentiation (27). Aggressive tumors are often accompanied by more LNM and early DM (28). Compared with patients with no ETE, patients with gSMI presented with more LNM. In contrast with gSMI, maximal ETE was considered an independent risk factor for LNM and DM. This finding in our study suggests that the degree of ETE carries considerable prognostic significance for DTC (7).
High heterogeneity (I² > 50%) was found in the analyses of LNM (compared with no ETE) and DM (compared with no ETE). After the removal of each study in turn, similar results were obtained, and the heterogeneity did not change significantly. Subgroup analysis was therefore performed to explore the source of heterogeneity. In the therapeutic LND subgroup, gSMI increased LNM in comparison with no ETE. This finding suggested that clinical LNM was more frequent in patients with gSMI. In the PTC subgroup, we found that gSMI increased DM significantly compared with no ETE. Histopathological type may be related to the high heterogeneity in the analysis of DM.

STRENGTHS AND WEAKNESSES
By performing a meta-analysis with populations from different studies, this is the first study to assess the impact of gSMI on outcomes in DTC patients in a larger study sample and to adjust the results for the presence of some confounding factors. High heterogeneity was found in the analysis of LNM (compared with no ETE) and DM (compared with no ETE) and was compensated by subgroup analysis. The results of LNM (compared with no ETE) and DM (compared with no ETE) should be interpreted with caution because of the limited number of enrolled articles, and further study is needed to confirm the corresponding results.
This meta-analysis has some potential limitations. First, the treatment strategies for DTC patients were different among the enrolled studies. These treatment disparities, such as thyroidectomy, lymph node dissection, radioiodine ablation, or follow-up, might contribute to different patient outcomes. Second, the limited number of studies hindered the implementation of meta-regression analysis and publication bias assessment. The results of the subgroup analysis should be interpreted with caution because of the small number of studies, although heterogeneity was eliminated by subgroup analysis. Third, the retrospective and non-randomized nature of all studies included in the analysis might be considered a source of bias. This provided associative, not causal, evidence, and mandates caution when interpreting these results. In future studies, randomized controlled trials with a higher methodological quality are needed to improve the quality of evidence.

CONCLUSION
Patients with gSMI had a higher risk of recurrence and LNM than those without ETE. However, in contrast with maximal ETE, a much better prognosis was observed in DTC patients with only gSMI.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/supplementary material.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.