Liver Transplantation Versus Liver Resection for Stage I and II Hepatocellular Carcinoma: Results of an Instrumental Variable Analysis

Background This study aimed to compare the long-term outcomes of liver transplantation (LT) and liver resection (LR) among patients with stage I and II hepatocellular carcinoma (HCC).
Methods Data were retrieved from the SEER 18 registry for 2004 to 2015. We included 1,765 and 1,746 cases with stage I–II (AJCC, 7th) HCC in the multivariable analyses and instrumental variable (IV) analyses, respectively. Propensity score matching (PSM) was further carried out to ensure comparability. The propensity score to receive LT was adjusted by stabilized inverse probability of treatment weighting (IPTW) and standardized mortality ratio weighting (SMRW) methods. In addition, IV analysis was performed to adjust for both measured and unmeasured confounding factors.
Results We identified 1,000 (56.7%) and 765 (43.3%) patients treated with LR and LT, respectively. In the multivariable adjusted cohort, after adjusting for potential confounders, LT offered significant prognostic advantages over LR in overall survival (OS, P < 0.001) and disease-specific survival (DSS, P < 0.001). The instrumental variable in this study was the LT rate in each Health Service Area (HSA). Results from the IV analysis showed that cases treated with LT had significantly longer OS (P = 0.001) and DSS (P < 0.001). In IV analysis stratified by clinicopathologic variables, the treatment effect of LT vs. LR on OS was consistent across all subgroups. For DSS, the stratified IV analyses showed that LT had better DSS across all subgroups, except for similar results in older patients (interaction P value = 0.039) and non-White patients (interaction P value = 0.041). In the propensity-matched cohort, patients with LT still had better OS (P < 0.001) and DSS (P < 0.001) than cases who underwent LR. In both the IPTW and SMRW cohorts, patients who underwent LT had better OS (both P values < 0.001) and DSS (both P values < 0.001).
Conclusions LT provided a survival benefit for cases with stage I–II HCC. These results indicate that if the LT rate were to increase in the future, average long-term survival may also increase. However, for some special populations such as elderly patients, LT should be selected cautiously, because outcomes after LT and LR were similar.

INTRODUCTION
Liver cancer is the second most frequent cause of cancer death worldwide (1). Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer globally (2). Liver resection (LR) is recommended as first-line treatment in HCC patients without liver cirrhosis (1). In contrast, for HCC cases with cirrhosis, indications for LR are generally based on a comprehensive evaluation of tumor burden, liver function, extent of resection, expected remnant liver volume, comorbid conditions, and performance status (3,4). Besides LR, liver transplantation (LT) is also an excellent radical therapy for HCC, because it eliminates both the tumor and the underlying liver cirrhosis. LT is a first-line therapeutic option for tumors meeting the Milan criteria but unsuitable for resection (1). Beyond these recommendations, LT can also be used to achieve radical cure in early-stage HCC patients with compensated liver function in some situations (e.g., when a donor liver is available) (5-7).
For cases with early-stage HCC who are candidates for both LT and LR, there is no consensus in the current literature on the eligibility criteria for either procedure (5,6,8-11). Recent studies comparing LT with LR have demonstrated superior survival outcomes of LT in patients with early-stage HCC (6,12). However, owing to the significant heterogeneity among the patients included in these retrospective studies, which modality provides better long-term results remains controversial. The aim of the present study was to compare the long-term outcomes of LT and LR in cases with early-stage (stages I and II) HCC. To this end, we used instrumental variable (IV) analyses. IV analysis is a statistical method that serves as an alternative to random assignment to treatment and addresses confounding owing to both known and unknown factors (13,14).
Figure 1. Patients with early-stage (stage I and II; AJCC, 7th) HCC matching the specified eligibility criteria were included in the multivariable analyses (n = 1765) and IV analyses (n = 1746), respectively. The codes in the SEER database for HCC treatment included: LR: 20-25, 30, 36, 37, 50, 51, and 52; LT: 61.

Statistical Analysis
Overall survival (OS) was defined as the interval from the date of diagnosis to the date of death from any cause, and disease-specific survival (DSS) was defined as the time until death attributed to HCC. Continuous variables were presented as mean ± SD (compared by t-test or Kruskal-Wallis H test) and categorical variables as number (%) (compared by chi-square test or Fisher's exact test). Linear trends in the percentage of patients receiving each type of treatment were evaluated by the Cochran-Armitage trend test.
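As an illustration of the trend test named above, the following is a minimal sketch of a two-sided Cochran-Armitage test using only the Python standard library. The function name, the equally spaced default scores, and the yearly counts are hypothetical and are not taken from the study data.

```python
import math

def cochran_armitage_trend(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions.

    events[i] / totals[i] is the proportion in ordered group i
    (for example, the share of patients receiving LT in year i).
    """
    k = len(events)
    if scores is None:
        scores = list(range(k))  # assumption: equally spaced scores
    n = sum(totals)
    p = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted deviation of observed from expected event counts
    t_stat = sum(s * (r - m * p) for s, r, m in zip(scores, events, totals))
    # Variance of t_stat under the null hypothesis
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p * (1 - p) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value

# Hypothetical counts: treated patients per period out of 100 each
z, p_value = cochran_armitage_trend([30, 25, 20, 15], [100, 100, 100, 100])
print(round(z, 3), round(p_value, 4))
```

A negative z indicates a proportion that falls as the group score increases.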
Survival curves were estimated using the Kaplan-Meier method, and differences in survival between the two groups were compared via the log-rank test. Multivariable Cox models were used to adjust for available confounding factors. Interaction tests were used to examine the influence of each stratification variable on the relation between surgical modality and patient prognosis.
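For readers less familiar with the Kaplan-Meier method, a minimal product-limit sketch follows. It is an illustration in plain Python with hypothetical follow-up times, not the code used in the study, and it omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (e.g., months)
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve

# Hypothetical follow-up data: five patients, two of them censored
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
print(curve)
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve.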
Propensity score-matched (PSM) analysis was based on the following factors: race, sex, age, year of diagnosis, tumor size, fibrosis score (Ishak; FS), and alpha-fetoprotein (AFP). Cases were matched 1:1 by nearest-neighbor matching on the estimated propensity score, with a preset caliper of 0.02. Univariable Cox regression was used to compare the survival outcomes of LR vs. LT in the cohort after PSM selection.
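The matching rule described above can be sketched as a greedy 1:1 nearest-neighbor match without replacement. This is one common way to implement caliper matching and is an assumption here; the study does not state the matching order or whether replacement was allowed, and the patient identifiers and scores below are hypothetical.

```python
def match_nearest_with_caliper(ps_treated, ps_control, caliper=0.02):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    ps_treated, ps_control: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, control_id) pairs whose scores differ
    by no more than the caliper.
    """
    available = dict(ps_control)
    pairs = []
    # Process treated patients in ascending order of propensity score
    for t_id, t_ps in sorted(ps_treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs

# Hypothetical scores: T3 has no control within the caliper and stays unmatched
treated = {"T1": 0.30, "T2": 0.55, "T3": 0.90}
control = {"C1": 0.31, "C2": 0.54, "C3": 0.60, "C4": 0.10}
print(match_nearest_with_caliper(treated, control))
```

Treated patients without a control inside the caliper are dropped, which is why a matched cohort is usually smaller than the original one.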
In addition, the propensity score (PS) to receive LT was adjusted by standardized mortality ratio weighting (SMRW) and stabilized inverse probability of treatment weighting (IPTW). The IPTW assigned weights of 1/PS for patients receiving LT and 1/(1-PS) for patients undergoing LR. The SMRW assigned a weight of 1 for LT patients and a weight of PS/(1-PS) for cases with LR. OS and DSS of LT vs. LR were then compared (univariable Cox regression) in the PS-adjusted pseudopopulations created by these two weighting procedures.
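The weights described above can be written out as a short sketch. The first and third functions follow the formulas in the text. The stabilized variant, which rescales each weight by the marginal probability of the treatment actually received, is the usual textbook form and is shown as an assumption, since the text does not spell out the stabilization step. All function names are illustrative.

```python
def iptw_weight(treated, ps):
    """Inverse probability of treatment weight as given in the text."""
    return 1.0 / ps if treated else 1.0 / (1.0 - ps)

def stabilized_iptw_weight(treated, ps, p_treated):
    """Assumed stabilized form: numerator is the marginal treatment probability."""
    return p_treated / ps if treated else (1.0 - p_treated) / (1.0 - ps)

def smrw_weight(treated, ps):
    """SMR weight: LT patients keep weight 1, LR patients get PS / (1 - PS)."""
    return 1.0 if treated else ps / (1.0 - ps)

# Hypothetical patient with a 25% estimated probability of receiving LT,
# in a cohort where 43.3% of patients received LT
ps, p_lt = 0.25, 0.433
for treated in (True, False):
    print(treated,
          round(iptw_weight(treated, ps), 3),
          round(stabilized_iptw_weight(treated, ps, p_lt), 3),
          round(smrw_weight(treated, ps), 3))
```

IPTW reweights both groups toward the whole cohort, whereas SMRW reweights the LR group to resemble the LT group, so the two target different populations.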
In this study, the LT rate in different Health Service Areas (HSAs) was used as the instrumental variable. The IV approach depends on the assumptions that the LT rate is strongly related to the choice of treatment (cases in HSAs with higher LT rates usually had a higher chance of receiving LT), and that the IV is not associated with patient survival except through its correlation with treatment (15). In addition, the IV must be unrelated to unmeasured risk factors affecting the outcome. Cases from HSAs with fewer than 10 cases were excluded, because the LT rates could not be determined accurately in those HSAs (16). To assess the validity of the HSA LT rate as an IV, we verified that the LT rate in an HSA was significantly associated with the likelihood of treatment assignment (an F statistic exceeding 10 suggests a strong instrument) but was not associated with OS in the multivariable regression analysis. In addition, covariate balance was examined across quintiles of the HSA LT rate. We used a two-stage residual inclusion (2SRI) method in the instrumental variable analysis (17).
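To make the instrument-strength check and the first stage of 2SRI concrete, here is a deliberately simplified sketch. It regresses the treatment indicator on the instrument alone with ordinary least squares (a linear probability model without covariates), reports the F statistic for the instrument, and returns the first-stage residuals. The first-stage model actually used in the study is not specified in the text, so this model choice, the function name, and the toy data are all assumptions.

```python
def first_stage(instrument, treated):
    """OLS of treatment (0/1) on a single instrument.

    Returns (slope, F statistic, residuals). In 2SRI the residuals are
    added as an extra covariate to the second-stage outcome model.
    """
    n = len(instrument)
    mz = sum(instrument) / n
    md = sum(treated) / n
    szz = sum((z - mz) ** 2 for z in instrument)
    szd = sum((z - mz) * (d - md) for z, d in zip(instrument, treated))
    slope = szd / szz
    intercept = md - slope * mz
    residuals = [d - (intercept + slope * z) for z, d in zip(instrument, treated)]
    sse = sum(r * r for r in residuals)  # residual sum of squares
    ssr = slope * szd                    # regression sum of squares
    f_stat = ssr / (sse / (n - 2))       # F with 1 and n - 2 degrees of freedom
    return slope, f_stat, residuals

# Toy data: HSA-level LT rate attached to each patient, and treatment received
z = [0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5]
d = [0, 0, 0, 1, 1, 1, 1, 0]
slope, f_stat, residuals = first_stage(z, d)
print(slope, f_stat)
```

With this tiny sample the F statistic falls well below the threshold of 10 mentioned above, which is the situation in which an instrument would be considered weak.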
It is important to note that, rather than estimating the average treatment effect for a group of cases (as in a randomized trial), the IV analysis focuses on the treatment effect among those whose selection of therapy is affected by the instrumental variable (18). Because LT rates in HSAs were used as the IV, our results are generalizable only to cases whose treatment assignment was influenced by the LT rates in different HSAs. In summary, this study analyzed the treatment effect among marginal patients. The marginal patients are those with early-stage HCC who would receive LT in HSAs with higher LT rates but not in HSAs with lower LT rates (18,19), because the treatment method (LT or LR) for cases with an uncertain or borderline need for LT could be influenced by experience and preferences in different areas. A P value < 0.05 was considered statistically significant. Statistical analysis was carried out in R 3.6.3.

Demographics
Among 6653 patients treated surgically for stage I and II HCC, we identified 1000 (56.7%) and 765 (43.3%) patients treated with LR or LT, respectively. Figure 2 shows the number and incidence of the 6653 cases with stage I-II HCC (AJCC 7th) treated with LT or LR between 2004 and 2015. The incidence of LT decreased over time (P < 0.001), while the incidence of LR increased over time (P < 0.001). The general patient characteristics are shown in Table 1. The mean age of patients with LT and LR was 57.1 and 62.6 years, respectively. Cases undergoing LT were younger, more often male and White, and more often had stage II disease. In patients who underwent LT, tumors were more likely to measure < 3 cm (65.8%), and more cases had a cirrhotic liver (88.9%). Among cases with LR, more had a non-cirrhotic liver (FS was between 0-4 in 53.5% of cases), and 35% had tumors larger than 5 cm.

Multivariable Cox Regression
The current study included a total of 1765 cases with the data needed for survival analysis. The mean DSS for cases with LT or LR was 124.0 and 87.4 months, respectively. The mean OS for cases receiving LT or LR was 106.6 and 77.8 months, respectively. In survival analysis, cases undergoing LT showed longer OS (P < 0.001) and DSS (P < 0.001) than cases receiving LR (Figures 3A, C).

Instrumental Variable Analyses
All cases were divided into quintiles based on the proportion of patients within each HSA undergoing LT (Supplementary Table 1). The average LT rate ranged from 3% (quintile 1) to 8% (quintile 5) among different HSAs. The F statistic was 104.8 (P < 0.001), which confirmed the strength of this instrument. In addition, there was no significant relationship between the IV and OS in a standard Cox regression analysis (HR 1.12, 95% CI 0.94-1.34, P = 0.198). In summary, these observations indicated that the LT rate in HSAs could be used as a valid instrumental variable.
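The construction of the instrument can be sketched in two steps: compute the LT rate per HSA, dropping HSAs with fewer than 10 cases, and then rank patients by their HSA's rate and split them into five groups. The record layout, function names, and the tie handling are assumptions made for illustration only.

```python
from collections import defaultdict

def hsa_lt_rates(records, min_cases=10):
    """records: iterable of (hsa_id, received_lt) with received_lt in {0, 1}.

    Returns {hsa_id: LT rate} for HSAs with at least min_cases patients.
    """
    counts = defaultdict(lambda: [0, 0])  # hsa_id -> [patients, LT patients]
    for hsa, lt in records:
        counts[hsa][0] += 1
        counts[hsa][1] += int(lt)
    return {h: lt / n for h, (n, lt) in counts.items() if n >= min_cases}

def assign_quintiles(patient_rates):
    """Rank patients by their HSA's LT rate and split into groups 1..5.

    Ties are broken by input order, so patients from one HSA can straddle
    a quintile boundary in this simplified version.
    """
    n = len(patient_rates)
    order = sorted(range(n), key=lambda i: patient_rates[i])
    quintile = [0] * n
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // n + 1
    return quintile

# Hypothetical records: HSA "A" has 12 patients, HSA "B" only 5 (excluded)
records = [("A", 1)] * 3 + [("A", 0)] * 9 + [("B", 1)] * 2 + [("B", 0)] * 3
print(hsa_lt_rates(records))
print(assign_quintiles([0.1, 0.5, 0.3, 0.2, 0.4, 0.6, 0.05, 0.35, 0.45, 0.55]))
```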
Finally, results of the IV analysis were consistent with those observed in the traditional regression analyses. The IV estimates showed that patients receiving LT had significantly better DSS (HR 0.29, 95% CI 0.16-0.55, P < 0.001) and OS (HR 0.47, 95% CI 0.29-0.75, P = 0.001) after adjusting for both measured and unmeasured confounders (Table 2).
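As a reading aid, the reported hazard ratios and confidence intervals can be checked for internal consistency on the log scale. The sketch below assumes a Wald-type 95% interval that is symmetric around log(HR), which is the usual construction for Cox models but is not stated in the text; the function name is illustrative.

```python
import math

def hr_summary(hr, lower, upper, z_crit=1.96):
    """Recover the standard error, z statistic, and two-sided P value
    of log(HR) from a reported hazard ratio and its 95% CI."""
    log_hr = math.log(hr)
    # A symmetric Wald interval spans 2 * z_crit standard errors on the log scale
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return se, z, p

# IV estimates reported in the text for OS and DSS
for label, (hr, lo, hi) in {"OS": (0.47, 0.29, 0.75),
                            "DSS": (0.29, 0.16, 0.55)}.items():
    se, z, p = hr_summary(hr, lo, hi)
    print(label, round(se, 3), round(z, 2), p)
```

Under this assumption both intervals give P values below 0.01, in line with the reported significance levels up to rounding of the published HR and CI.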

Stratified Analyses
Based on multivariable Cox analyses, Figure 4 shows the relation between surgical modality and patient prognosis stratified by clinical parameters. In subgroup analyses, the beneficial effect of LT vs. LR on overall survival was consistent in all subgroups, except for a similar outcome in the non-cirrhotic subgroup (HR 0.72, 95% CI 0.40-1.29, interaction P value = 0.017) (Figure 4A). The superior survival benefit of LT vs. LR on DSS was consistent across all subgroups, with the exception of a similar outcome in the subgroup aged > 70 years (HR 0.40, 95% CI 0.08-2.03, interaction P value = 0.038) (Figure 4B).

Results in Propensity Score Matched Cohort
As presented in Supplementary Table 2, in the matched cohort, most of the prognostic variables were well-balanced. After PSM, cases receiving LT showed better DSS and OS (both P values < 0.001) compared to patients undergoing LR (Figures 3B, D). In the PSM cohort, the univariable analysis demonstrated that

DISCUSSION
In the present study, we aimed to explore the independent role of surgical modality (LT vs. LR) in long-term survival for cases with curable stage I and II HCC. Both conventional multivariable regression analyses and the propensity score weighting methods indicated that cases after LT had better DSS and OS than cases after LR. Additionally, when accounting for both known and unknown confounders by IV analyses, LT still showed a significant survival benefit compared to LR, although the adjusted coefficients increased (that is, the estimated survival benefit decreased). In stratified IV analyses, we found that non-White patients and patients aged ≥60 years undergoing LT had similar DSS compared to patients after LR. The number of studies comparing the effectiveness of LT vs. LR has increased in the past decade (5,6,8,11,12,20). However, the majority of studies comparing LT and LR for HCC were single-institutional, descriptive, or retrospective comparisons. Conventional observational studies have used multivariable regression analysis and propensity score methods to evaluate associations between surgical modality and patient prognosis. However, these analyses cannot adjust for unmeasured confounders (15). In contrast, IV analysis allows for an unbiased estimation of the treatment effect in cases whose treatment option varies with the instrumental variable. Instrumental variable analysis is a quasi-experimental, econometric approach that uses naturally existing variation to produce pseudorandomization. Results from IV analyses have been found to be closer to those from randomized controlled trials (RCTs) (15). IV analysis estimates the treatment effect in marginal patients rather than the average treatment effect of LT (13,18); thus, it does not need to define the specific clinical characteristics of the population.
Instead, it was based on the precondition that cases resided randomly around hospitals and that some cases were treated differently in distinct hospitals. The Milan criteria are the benchmark for selecting cases with HCC for LT and the reference for comparison with other criteria (1). Among patients with stage I and II disease, some had HCC beyond the Milan criteria (e.g., tumor diameter >5 cm). In subgroup analyses, we found that patients with tumors of 5-7 cm undergoing LT still had better OS than those after LR, which was consistent with some expanded criteria such as the Up-to-seven criteria (21) and the Hangzhou criteria (22). Notably, in stratified analyses, patients aged >60 years after LT were found to have a long-term prognosis similar to that of patients after LR, possibly because older patients have more medical comorbidities and poorer performance status. Chen et al. showed that the risk of death increased with age at transplantation, especially in dialysis patients (23). Sharma et al. showed that cases aged 70 years and older had significantly higher mortality following LT (24). These observations, along with our results, should make surgeons aware of the need for better risk stratification in elderly LT candidates. In addition, in IV analyses, we found that non-White patients did not gain a better survival benefit after LT, which may be caused by differences in environmental, cultural, social, and genetic factors between White and non-White patients.
Admittedly, the current study had several limitations. First, some clinicopathologic data, including preoperative liver function, comorbidities, performance status, postoperative morbidities, and postoperative treatments, were not available in the SEER registry; thus, we could not evaluate the impact of these factors on patient survival in multivariable analyses. Second, the observations of this study should be interpreted cautiously, given that a number of cases were excluded from our main analysis owing to unavailable covariates in the SEER registry. Finally, even though IV analysis is a useful practical alternative to RCTs, its validity depends on the population studied. IV analyses evaluate the effect only in marginal patients, that is, HCC cases with uncertain indications for LT; patients who would always or never receive LT are not among the marginal cases.
Despite the increasing incidence of HCC diagnosed at an earlier stage, the LT rate decreased in the most recent era. By integrating multivariable analysis, the PSM method, and instrumental variable analysis, our results indicated that LT provided a survival benefit for marginal cases with stage I-II HCC. These results suggest that if LT rates were to increase in the future, average survival time may also increase. However, for elderly patients, LT should be selected cautiously, because outcomes after LT and LR were similar.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
WL proposed the study. WL, HW, and YZ performed the research and wrote the first draft. WL collected and analyzed the data. HX revised this manuscript and validated the statistical methods of this study. YZ is the guarantor. All authors contributed to the design and interpretation of the study and to further drafts, and have read and approved the final version to be published.