Clinical Benefits and Safety of Gemtuzumab Ozogamicin in Treating Acute Myeloid Leukemia in Various Subgroups: An Updated Systematic Review, Meta-Analysis, and Network Meta-Analysis

Background: Previous trials have provided evidence on the overall effects of gemtuzumab ozogamicin (GO), an anti-CD33 humanized antibody, in treating acute myeloid leukemia (AML). In this updated systematic review, meta-analysis, and network meta-analysis (NMA), we aimed to comprehensively explore the clinical benefits and safety of GO in various subtypes of AML.
Methods: PubMed, Embase, Cochrane, and Chinese databases were searched for randomized controlled trials (RCTs) and retrospective cohort studies that compared the clinical efficacy and toxicity of GO with non-GO groups in AML. Random-effects models were used to calculate pooled effect sizes and 95% confidence intervals (CIs). Relative risk (RR) was used to estimate complete remission (CR), early death, and toxicity. Hazard ratio (HR) was used to evaluate survival.
Results: Fifteen RCTs and 15 retrospective cohort studies were identified (GO: 4,768; Control: 6,466). GO tended to improve CR (RR 0.95, p = 0.084), followed by significantly improved survival (overall survival: HR 0.86, p = 0.003; event-free survival: HR 0.86, p = 0.015; relapse-free survival: HR 0.83, p = 0.001; cumulative incidence of relapse: HR 0.82, p < 0.001). GO benefits in CR and survival were evident in favorable- and intermediate-risk karyotypes (p ≤ 0.023). GO advantages were also associated with nucleophosmin 1 mutations (p ≤ 0.04), wild-type FMS-like tyrosine kinase 3 internal tandem duplication gene (p ≤ 0.03), age of <70 years (p < 0.05), de novo AML (p ≤ 0.017), and CD33(+) (p ≤ 0.021). Both adding GO to induction therapy (p ≤ 0.011) and a lower (<6 mg/m2) dose of GO (p ≤ 0.03) enhanced survival. Prognosis of combined regimens with GO was heterogeneous in both meta-analysis and NMA, with several combination strategies showing improved prognosis. Additionally, GO was related to an increased risk of early death at a higher dose (≥6 mg/m2) (RR 2.01, p = 0.005), hepatic-related adverse effects (RR 1.29, p = 0.02), and a tendency toward higher risk of hepatic veno-occlusive disease or sinusoidal obstruction syndrome (RR 1.56, p = 0.072).
Conclusions: These data indicated therapeutic benefits and safety of GO in AML, especially in some subtypes, for which further head-to-head RCTs are warranted.
Systematic Review Registration: [PROSPERO: https://www.crd.york.ac.uk/prospero/], identifier [CRD42020158540].


INTRODUCTION
Acute myeloid leukemia (AML) is a heterogeneous hematological malignancy characterized by the accumulation of myeloid progenitor cells, leading to poor prognosis (1). High-risk factors such as age, cytogenetics, and genetics play a crucial role in predicting prognosis and guiding treatment recommendations (2).
The conventional induction chemotherapy for AML combines an anthracycline with cytarabine (Ara-C), such as daunorubicin plus Ara-C (DA) (3). However, these combinations are associated with high toxicity (including thrombocytopenia, neutropenia, and anemia) and modest rates of complete remission (CR) (53%-58%), particularly in elderly cohorts (4). Owing to the shortcomings of standard chemotherapy, immunotherapeutic strategies, such as antibodies against tumor antigens, might be promising in treating AML and have been proven to be highly effective in other hematological malignancies (5).
In AML, CD33 is frequently and specifically expressed on the surface of more than 90% of myelocytic and myelomonocytic precursor cells, such as blasts, rather than on hematopoietic stem cells (HSCs) or outside the hematological system. Gemtuzumab ozogamicin (GO) is a humanized antibody-drug conjugate composed of a monoclonal antibody targeting CD33, covalently linked to a semisynthetic derivative of calicheamicin. Binding of GO to CD33 on AML blasts is followed by internalization of the GO-CD33 complex and intracellular toxin release, leading to DNA damage and cell death (6). Because it targets CD33, this conjugate is predicted to harm AML cells with higher specificity while sparing normal HSCs and organs. Therefore, CD33 expression status might affect the therapeutic efficacy of GO. Initially approved by the US Food and Drug Administration (FDA) for treating relapsed AML, GO was subsequently voluntarily withdrawn due to excessive toxicity at higher doses (≥6 mg/m2) (7). However, later randomized controlled trials (RCTs), such as AML-15 (8), AML-16 (9), and ALFA-0701 (10), demonstrated that a lower dose of GO (3-5 mg/m2) plus DA improved survival. In addition to DA, GO added to other regimens, such as Ara-C monotherapy, FLAG (fludarabine, Ara-C, and granulocyte colony-stimulating factor), ADE (daunorubicin, Ara-C, and etoposide), and MICE (mitoxantrone, etoposide, and Ara-C) (8,11-13), resulted in different treatment efficacies. Besides CD33 status, GO dose, and combination strategy, GO effects might also be affected by other clinical factors, including age stratification, gender, mutations [such as mutated nucleophosmin 1 (NPM1) and FMS-like tyrosine kinase-3 internal tandem duplication (FLT3-ITD)], de novo or secondary AML (sAML), cytogenetic risk, and treatment stage (9,10,14-16).
However, until now, no published study has comprehensively evaluated the therapeutic effectiveness of GO in all the subgroups mentioned above. Therefore, we conducted this meta-analysis to evaluate GO in diverse patient populations and clarify the target cohort. We also performed a network meta-analysis (NMA) to compare GO effects between various combined therapies in RCTs.

MATERIALS AND METHODS
This study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (17) (Supplementary Table 1) and was registered with PROSPERO (CRD42020158540).

Search Strategy and Study Selection
consultation with a third reviewer (LY). After selecting candidate studies, full articles were checked to identify final eligible studies.

Assessment of Bias Risk and Study Quality
The methodologic quality of studies was independently assessed by two authors (QX and SH) using the Newcastle-Ottawa Scale (NOS) (18) for cohort studies and the Cochrane Risk of Bias Tool (19) for RCTs. Any discrepancy was resolved by discussion. Publication bias was assessed with funnel plots and with Begg's (20) and Egger's (21) tests in Stata 15.1. A p-value <0.05 indicated the presence of publication bias.
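As an illustration of the regression behind Egger's test, the following is a minimal Python sketch, not the Stata 15.1 routine used in this study. It regresses the standardized effect (effect divided by its standard error) on precision (the reciprocal of the standard error) and reports the intercept with its t statistic. The function name `egger_regression` and the input layout are hypothetical. The p-value, which would come from a t distribution with k - 2 degrees of freedom, is left to a statistics package.

```python
import math

def egger_regression(effects, std_errors):
    """Regress standardized effects (effect / SE) on precision (1 / SE).

    Returns the intercept, its standard error, the t statistic, and the
    degrees of freedom (k - 2). An intercept far from zero suggests
    funnel plot asymmetry.
    """
    k = len(effects)
    if k != len(std_errors) or k < 3:
        raise ValueError("need at least three studies with matching lengths")
    x = [1.0 / se for se in std_errors]                  # precision
    z = [y / se for y, se in zip(effects, std_errors)]   # standardized effect
    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("identical standard errors; regression is undefined")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    # Residual variance of the ordinary least-squares fit
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    resid_var = sse / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": k - 2}
```

The inputs would typically be log RR or log HR values with their standard errors, one pair per study.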

Data Extraction
Clinical information was independently extracted from candidate studies by two authors (QX and SH). Any disagreement was settled by discussion or consultation with a third author (LY). The extracted data were composed of study characteristics (Supplementary Table 2) and prognostic information.
Prognostic endpoints included CR, overall survival (OS), event-free survival (EFS), relapse-free survival (RFS), and cumulative incidence of relapse (CIR), defined by the revised International Working Group criteria (22), without requiring peripheral count recovery for CR. Relative risk (RR) and hazard ratio (HR) were used to estimate CR and survival outcomes, respectively. Data were preferentially extracted from multivariate analyses; otherwise, RR and HR were obtained from univariate analyses, Kaplan-Meier survival curves, or numeric reports, as described by Tierney et al. (23).
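When a study reports only an HR with its 95% CI, one commonly used conversion recovers the log HR and its standard error from the width of the CI on the log scale. The Python sketch below illustrates that single conversion; it is an assumption that this is the relevant case, and it does not cover reconstruction from Kaplan-Meier curves. The function name and the example values are hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.959964):
    """Return (log HR, standard error) from a reported HR and its 95% CI.

    Assumes the CI is symmetric on the log scale, so its log-width
    equals 2 * z * SE.
    """
    if min(hr, ci_low, ci_high) <= 0:
        raise ValueError("HR and CI bounds must be positive")
    if ci_low >= ci_high:
        raise ValueError("lower CI bound must be below the upper bound")
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return log_hr, se

# Hypothetical trial reporting HR 0.80 (95% CI 0.64 to 1.00)
log_hr, se = log_hr_and_se(0.80, 0.64, 1.00)
```

The resulting pair can be fed directly into an inverse variance pooling step.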

Statistical Analysis
The pooled RR and 95% confidence intervals (95% CIs) for CR were produced with the Mantel-Haenszel method, and the pooled HRs with 95% CIs for OS, EFS, RFS, and CIR were calculated with the inverse variance method (24). All analyses were completed with Stata 15.1 software using random-effects models to account for heterogeneity between studies. A pooled RR or HR <1.00 indicated an effect favoring GO treatment. Results were considered statistically significant when the 95% CI excluded 1.00 or the p-value was <0.05. The χ2-based Q statistic was used to estimate heterogeneity among studies. I2 values of <30%, 30%-50%, 50%-75%, and >75% indicated low, moderate, substantial, and considerable heterogeneity, respectively (25). A p-value ≥0.10 meant no or slight heterogeneity, whereas a p-value <0.10 indicated significant heterogeneity, which was addressed by sensitivity and subgroup analyses to identify the source.
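The pooling and heterogeneity measures described above can be sketched in Python as follows. This is an illustrative sketch, not the Stata code used here: it assumes the DerSimonian-Laird estimator for the between-study variance, which the text does not specify, and the function names are hypothetical. The handling of I2 values falling exactly on 30%, 50%, or 75% is also an assumption.

```python
import math

def pool_random_effects(log_effects, std_errors):
    """Inverse variance random-effects pooling (DerSimonian-Laird, assumed)."""
    k = len(log_effects)
    if k != len(std_errors) or k < 2:
        raise ValueError("need at least two studies with matching lengths")
    w = [1.0 / se ** 2 for se in std_errors]            # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0     # between-study variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"log_effect": pooled, "se": se_pooled, "tau2": tau2, "q": q, "i2": i2}

def classify_i2(i2):
    """Map I2 (percent) to the heterogeneity labels used in the text."""
    if i2 < 30:
        return "low"
    if i2 <= 50:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"
```

Exponentiating `log_effect` and `log_effect ± 1.96 * se` gives the pooled HR or RR with its 95% CI.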
Bayesian NMA of the RCTs was performed with R 4.0.2 software using a random-effects model via the "gemtc" and "rjags" packages. HRs or RRs were calculated with the non-GO group as the baseline and displayed in forest plots, in which RR and HR with 95% credible intervals (95% CrIs) described the extent of effects on CR and survival, respectively. To estimate relative HR and RR, a Markov chain Monte Carlo simulation was run with 10,000 adaptations and 100,000 iterations for each of three automatically generated Markov chains. For each iteration, regimens were ranked based on the estimated log HR or log RR. After all simulations were complete, the probability that each therapy was best was determined as the proportion of simulations in which that treatment ranked best. The results from the Bayesian NMA were compared with data from pairwise meta-analyses to estimate inconsistency using the node splitting method (26). If no closed loop was present in the network evidence plot, inconsistency analysis could not be executed.
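The "probability of being best" calculation can be illustrated with a short Python sketch. It is not the "gemtc" implementation; it only shows the counting step applied to posterior draws, assuming that a lower log HR or log RR is better and that the baseline is included as a treatment with draws fixed at 0. The function name, treatment labels, and draws are hypothetical.

```python
from collections import Counter

def probability_best(draws):
    """Share of iterations in which each treatment has the lowest log effect.

    `draws` maps a treatment name to its list of posterior draws (log HR or
    log RR versus a common baseline); all lists must have the same length.
    Ties go to the treatment listed first.
    """
    names = list(draws)
    if not names:
        raise ValueError("no treatments supplied")
    n = len(draws[names[0]])
    if n == 0 or any(len(draws[t]) != n for t in names):
        raise ValueError("all treatments need the same non-zero number of draws")
    wins = Counter()
    for i in range(n):
        # Rank within this iteration; the lowest value is ranked best
        best = min(names, key=lambda t: draws[t][i])
        wins[best] += 1
    return {t: wins[t] / n for t in names}

# Hypothetical draws for two regimens and the baseline
example = {
    "regimen A": [-0.3, -0.1, 0.2, -0.4],
    "regimen B": [-0.2, -0.2, 0.1, -0.5],
    "baseline": [0.0, 0.0, 0.0, 0.0],
}
probs = probability_best(example)
```

With many thousands of draws per chain, the same counting yields the rank probabilities reported for each regimen.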
All analyses were based on published data. Neither ethical approval nor patient consent was required.

Study Characteristics
A total of 1,170 references were retrieved from the database searches, of which 214 duplicates were initially removed. Of the remaining 956 records, 783 were excluded because they did not fulfill the predefined inclusion criteria. The remaining 173 reports were retrieved for detailed full-text assessment. Finally, 30 studies were comprehensively analyzed. Supplementary Figure 1 shows the flow diagram of the study selection. Fifteen RCTs and 15 retrospective cohort studies were eventually included in this study. The quality assessment of RCTs is shown in Supplementary Figure 2. For survival endpoints, we considered bias unlikely, since death and relapse are not susceptible to patient, physician, or outcome assessor bias. The NOS scores for retrospective cohort studies are listed in Supplementary Table 3.

Pooled Prognosis of Gemtuzumab Ozogamicin
All analyses of CR, OS, EFS, RFS, and CIR are summarized in Supplementary Tables 4-8, respectively, including results before and after sensitivity analyses as well as subgroup analyses.

Network Meta-Analysis for Various Combined Regimens of Gemtuzumab Ozogamicin
Supplementary Figures 3A-D display the network evidence plots comparing CR, OS, RFS, and CIR between various combined regimens of GO; no head-to-head trials were available in any analysis. Therefore, the summarized data between interventions were derived from either direct or indirect evidence, but not from both, and the inconsistency between direct and indirect comparisons could not be estimated.
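Whether a network of comparisons contains a closed loop can be checked mechanically. The Python sketch below treats each pairwise comparison as an undirected edge and uses a union-find structure to detect a cycle. It is a simplified illustration: the function name and treatment labels are hypothetical, and loops formed solely by multi-arm trials are not treated specially.

```python
def has_closed_loop(comparisons):
    """Return True if the undirected comparison network contains a cycle.

    `comparisons` is an iterable of (treatment_a, treatment_b) pairs.
    Repeated trials of the same comparison count as a single edge.
    """
    edges = {frozenset(pair) for pair in comparisons if pair[0] != pair[1]}
    parent = {}

    def find(node):
        # Locate the representative of the node's component
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    for edge in edges:
        a, b = tuple(edge)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True          # endpoints already connected: loop closed
        parent[root_a] = root_b
    return False

# Hypothetical star network: every regimen compared only with the control
star = [("control", "GO+DA"), ("control", "GO+FLAG"), ("control", "GO alone")]
```

A star-shaped network, in which every regimen is compared only with a common control, has no loop, which matches the situation in which inconsistency cannot be assessed.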

Toxicity
In total, early death (defined as induction death or 30-day mortality) and 21 types of toxic effects were analyzed (Supplementary

DISCUSSION
Due to the limited clinical efficacy of standard chemotherapy for AML, innovative molecular-targeted therapies, such as GO, have been applied. To date, 15 retrospective cohort studies and 15 RCTs comparing therapeutic effects between GO and other regimens have been published, so integrating all available data is essential for assessing this drug. This is the largest systematic review and meta-analysis to evaluate the overall treatment evidence regarding GO in AML. GO tended to improve CR, probably resulting in improved survival and reduced relapse. Survival benefits of GO were evident in favorable- and intermediate-risk karyotypes. Improved prognosis with GO was found in NPM1(+) cohorts and FLT3-ITD(-) patients. The OS benefit of GO was limited in patients aged ≥70 years, whereas CIR was reduced regardless of age. Survival benefits were also observed in CD33(+) rather than CD33(-) patients and in de novo AML rather than sAML, but the effect of gender remained unclear. Furthermore, adding GO to induction treatment, rather than to consolidation alone, might produce better survival. The data also showed more benefit of GO in some survival outcomes at a lower dose (<6 mg/m2) than at ≥6 mg/m2. Survival outcomes of the various combined regimens with GO were heterogeneous, showing improved OS and CIR with GO+DA, increased OS with GO monotherapy, and longer RFS with GO+FLAG. The NMA also showed inconsistent probabilities of achieving better survival among the different combined regimens. Additionally, GO was related to an increased risk of early death at a higher dose (≥6 mg/m2), hepatic-related adverse effects, and a tendency toward higher risk of VOD/SOS. GO was also associated with a slightly higher risk of bleeding.
Our data did not show a significant improvement in CR with GO, although survival was improved. These data were consistent with previous meta-analyses (16,49,50); however, our study comprised more RCTs and also considered retrospective cohort studies, leading to more reliable results based on a large cohort (N = 11,234). The marginal benefit of GO on CR might be explained by the lower dose intensity of chemotherapy in the GO arms compared with controls (13,28,29,38,46). However, our data showed increased CR with GO in the FLT3-ITD(-) subgroup, in which all studies (8,9,39) used the same backbone chemotherapy in the GO and control arms. After excluding differences in chemotherapy intensity between the GO and non-GO arms, FLT3-ITD mutation might unfavorably affect response to GO.
Figure legend (fragment): The diamonds represent the overall summary RR and HR estimates with 95% CI. GO, gemtuzumab ozogamicin; CR, complete remission; OS, overall survival; RFS, relapse-free survival; CIR, cumulative incidence of relapse; RR, relative risk; HR, hazard ratio; 95% CI, 95% confidence interval; DA, daunorubicin plus cytarabine; Ara-C, cytarabine; FLAG, fludarabine, Ara-C, and granulocyte colony-stimulating factor.
Besides, the survival results were heterogeneous, which might be explained by differences among subgroups. Cytogenetic risk might affect GO benefit, especially in favorable- and intermediate-risk karyotypes. This finding, consistent with preceding meta-analyses (16,49,50), indicated that GO benefit might be limited to favorable- and/or intermediate-risk cytogenetic groups, but this needs to be further evaluated in RCTs. Another adverse prognostic factor in AML is FLT3-ITD mutation (8), which also affected the therapeutic effectiveness of GO: the benefit of GO was observed only in FLT3-ITD(-) patients, resulting from better CR. As for NPM1 mutation, which favors better survival (14), increased RFS was found in the GO arm of the NPM1(+) cohort, whereas OS and CIR were improved in the GO arm regardless of NPM1 mutational status, overall showing benefits of GO in the NPM1(+) group.
Additionally, since GO is a CD33-targeting antibody (6), GO contributed to more survival benefits in CD33(+) AML in our study. Furthermore, enhanced OS and RFS with GO were restricted to cohorts aged ≥60 years, but better OS was observed in subjects aged <70 years in a larger cohort, and CIR was not affected by the 60-year threshold, indicating overall better survival in patients aged <70 years. Besides, survival benefits of GO were found in de novo AML but not in sAML, another high-risk factor associated with treatment resistance (38).
Furthermore, our study showed the strongest evidence of survival benefit when GO was administered in induction regimens rather than only in the consolidation stage. A possible explanation is that GO acts as an effective adjunct to induction treatment of AML, and early GO treatment may prevent relapse and prolong survival. Consequently, optimization of induction regimens in future trials warrants the highest attention. Additionally, this study showed that a GO dose of <6 mg/m2 favored better survival and lower relapse, whereas no survival advantage was seen at a dose of ≥6 mg/m2.
Consistently, five RCTs (8-10, 14, 35) prescribing a GO dose of <6 mg/m2 did not show a difference in early death between GO and controls, whereas some RCTs (13,38) with a dose of ≥6 mg/m2 reported higher early death rates with GO. This suggests that lower doses, perhaps <6 mg/m2, might be safer in this setting and associated with lower toxicity.
Besides, we analyzed the therapeutic effects of combined regimens with GO not only in meta-analysis but also in NMA. Overall, the GO alone, GO+FLAG, and GO+DA subgroups supported better prognosis in meta-analysis, whereas the NMA indicated that GO alone had the highest probability of improved CR and OS, GO+Ara-C had the highest probability of improved RFS, and GO+ICE+ATRA had the highest probability of decreased CIR. Apart from the survival benefit of GO alone with BSC as control, the other survival benefits came from various combined chemotherapies, which probably indicates an advantage of adding GO to chemotherapy; however, which combination is best has not yet been identified and should be further explored in RCTs.
Finally, it was not surprising that our meta-analysis showed an increased risk of early death at a higher dose (≥6 mg/m2), hepatic-related adverse effects, and VOD/SOS in the GO group, as previously shown in other meta-analyses (16,49,50). Besides, the GO group was associated with a slightly higher risk of bleeding, which can be detected and treated promptly in the clinic.
This meta-analysis had several advantages. First, we performed the largest meta-analysis to provide the most up-to-date evidence on GO in AML, including all RCTs and retrospective cohort studies with available data. Second, the included high-quality studies ensured the reliability of this meta-analysis. Third, we performed comprehensive subgroup analyses, such as by mutations, de novo AML/sAML, and combined regimens, that were not reported in published meta-analyses, and identified several subsets of patients who would benefit most from this drug, which, of course, need to be further evaluated in RCTs. Fourth, a comprehensive NMA was conducted to explore the best combined regimen with GO, which was not done in other meta-analyses. However, like most meta-analyses, our analysis was based on published summary estimates rather than individual patient data. Consequently, merged survival curves could not be produced to explore patient-level factors, particularly in the subgroups identified in this study as likely to benefit from GO [e.g., patients aged <70 years, cases with low- and intermediate-risk karyotypes, FLT3-ITD(-) cohorts, NPM1(+) patients, de novo AML with positive expression of CD33, and patients receiving GO combined with DA or FLAG].
In conclusion, our study showed that GO could improve prognosis in AML patients, especially in those aged <70 years, with de novo AML, with positive expression of CD33, with NPM1 mutation, without FLT3-ITD mutation, and with low-/intermediate-risk karyotypes. A lower dose of GO (<6 mg/m2) and the use of GO in the induction stage rather than only in consolidation therapy might lead to less early death, better survival, and lower relapse. Combining GO with other chemotherapies probably favored better prognosis compared with chemotherapy alone. Further studies of the subgroups above are warranted, and more head-to-head RCTs are needed to directly identify the best combination regimen with GO.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.