Clinical Benefits and Safety of FMS-Like Tyrosine Kinase 3 Inhibitors in Various Treatment Stages of Acute Myeloid Leukemia: A Systematic Review, Meta-Analysis, and Network Meta-Analysis

Background: Given the controversial roles of FMS-like tyrosine kinase 3 inhibitors (FLT3i) in the various treatment stages of acute myeloid leukemia (AML), this study was designed to assess their clinical benefit and safety and to explore which FLT3i works most effectively.
Methods: A systematic review, meta-analysis and network meta-analysis (NMA) were conducted by searching PubMed, Embase, the Cochrane Library, and Chinese databases. We included studies that compared therapeutic effects between FLT3i and non-FLT3i groups in AML, particularly in FLT3(+) patients, or that reported the efficacy of allogeneic hematopoietic stem cell transplantation (allo-HSCT) in FLT3(+) AML. Relative risk (RR) with 95% confidence intervals (CI) was used to estimate complete remission (CR), early death and toxicity. Hazard ratio (HR) was used to assess overall survival (OS), event-free survival (EFS), relapse-free survival (RFS) and cumulative incidence of relapse (CIR).
Results: After all criteria were applied, 39 studies were analyzed. Compared with the non-FLT3i arm, FLT3i achieved better CR in untreated AML (RR 0.88, p = 0.04) and in refractory and relapsed FLT3(+) AML (rrAML) (RR 0.61, p < 0.01), together with improved survival (untreated AML: OS, HR 0.76; EFS, HR 0.67; RFS, HR 0.72; all p < 0.01; FLT3(+) rrAML: OS, HR 0.60, p < 0.01; RFS, HR 0.40, p = 0.01). In addition, allo-HSCT improved survival in FLT3(+) AML (OS, HR 0.53; EFS, HR 0.50; RFS, HR 0.57; CIR, HR 0.26; all p < 0.01), and survival was further prolonged by FLT3i administered after allo-HSCT (OS, HR 0.45; RFS, HR 0.34; CIR, HR 0.32; all p < 0.01). FLT3i also consistently improved OS (p < 0.05) regardless of FLT3-ITD ratio when compared with the non-FLT3i group. FLT3i were associated with a significantly increased risk of thrombocytopenia, neutropenia, anemia, skin- and cardiac-related adverse effects, increased alanine aminotransferase, and cough and dyspnea (p < 0.05). In NMA, gilteritinib showed the highest probability of improved prognosis.
Conclusions: FLT3i safely improved prognosis in the induction/reinduction stage of FLT3(+) AML and, as maintenance therapy, further boosted the survival benefit of allo-HSCT, suggesting a better prognosis if FLT3i is given both before and after allo-HSCT. In NMA, gilteritinib potentially achieved the best prognosis, which should be confirmed in direct trials.


INTRODUCTION
Acute myeloid leukemia (AML) is a heterogeneous hematologic malignancy characterized by a maturation block and accumulation of myeloid progenitor cells (1). FMS-like tyrosine kinase 3 (FLT3) mutations are among the most prevalent genetic aberrations in AML and are detected in approximately one-third of patients (2). About three-quarters of FLT3(+) patients carry an internal tandem duplication (FLT3-ITD) of 3 to more than 100 amino acids in the juxtamembrane region, and a FLT3 point mutation in the tyrosine kinase domain (TKD) is found in approximately 8% of newly diagnosed AML (3). FLT3 mutations induce constitutive activation of the FLT3 receptor and trigger downstream pathways, resulting in leukemic cell proliferation, impaired differentiation, and resistance to apoptosis (4). FLT3-ITD(+) patients typically have a poor prognosis, marked by frequent resistance to induction chemotherapy, frequent relapse, decreased response to salvage therapy, and shorter survival. However, the influence of TKD mutations on prognosis remains controversial (2).
FLT3 inhibitors (FLT3i) are tyrosine kinase inhibitors with FLT3 inhibitory activity that may be used specifically to treat FLT3(+) AML, with a likely improvement in prognosis. Current clinical studies indicate that AML patients benefit from FLT3i (3, 5-7). The FLT3i most frequently evaluated in phase 2 or 3 randomized controlled trials (RCT) and retrospective studies are sorafenib, lestaurtinib, midostaurin, quizartinib and gilteritinib. Sorafenib, lestaurtinib and midostaurin are multi-kinase inhibitors. Sorafenib is the most common FLT3i for AML, with high activity against ITD mutations rather than wild-type FLT3 and TKD mutations (8). Lestaurtinib is an indolocarbazole inhibitor with equal inhibitory activity against FLT3-ITD and -TKD mutations (9,10). Midostaurin is also a multi-targeted indolocarbazole, with equal activity against mutated FLT3-ITD and -TKD (3). Quizartinib has potent activity against mutated FLT3-ITD and wild-type FLT3 but no intrinsic activity against TKD mutations (5,9). Quizartinib can also moderately inhibit KIT (5). Gilteritinib is a highly selective inhibitor of the FLT3 and AXL receptor tyrosine kinases, with anti-leukemic activity against ITD and TKD mutations (6,9) but weak activity against KIT (6).
These FLT3i are used in various stages, including induction with or without consolidation therapy in newly diagnosed AML, maintenance treatment after allogeneic hematopoietic stem cell transplantation (allo-HSCT), and salvage therapy in refractory and relapsed AML (rrAML). To date, no study has comprehensively explored the role of FLT3i across AML treatment stages or which FLT3i is likely to work best. We therefore conducted a meta-analysis to clarify the clinical benefit and safety of FLT3i. Data from allo-HSCT in FLT3(+) patients were also summarized to examine the effects of FLT3i as maintenance therapy after allo-HSCT. We further performed network meta-analyses (NMA) and ranked the prognostic effects of the various FLT3i based on phase 2 and 3 RCT to identify the most effective FLT3i.

METHODS
This study was conducted according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Supplementary Table 1) (11) and was registered at PROSPERO (CRD42020158077).

Search Strategy and Study Selection
A literature search was conducted in PubMed, Embase, the Cochrane Library, China National Knowledge Infrastructure, and Wanfang from inception until September 30th, 2020, using the keywords "FMS-like tyrosine kinase 3", "FLT3", "acute myeloid leukemia", "AML", "hematopoietic stem cell transplant", "HSCT", "sorafenib", "lestaurtinib", "midostaurin", "quizartinib", and "gilteritinib". Reports were included if they were: (i) published in English or Chinese, (ii) retrospective cohort studies or RCT reporting the therapeutic effects of FLT3i on AML, especially in FLT3(+) patients, or studies reporting the prognostic effects of allo-HSCT on FLT3(+) AML, and (iii) designed with two or more arms comparing the prognostic influence of FLT3i or allo-HSCT with controls. Studies were excluded if they: (i) reported unavailable or insufficient data, (ii) were reviews, case reports, editorials or letters, (iii) had overlapping cohorts or (iv) were single-arm studies.
Study selection was conducted in two stages. Initially, abstracts and titles of potentially relevant literature were independently screened by QX and SH according to the inclusion and exclusion criteria. Both reviewers then evaluated the candidate articles and decided on their inclusion. Any discrepancy was settled through discussion or, if required, consultation with a third reviewer (LY). After the candidate studies were selected, full texts were checked to identify the final eligible ones.

Quality Assessment and Publication Bias Investigation
The methodologic quality of the primary studies was assessed separately by two reviewers (QX and SH), using the Newcastle-Ottawa Scale (NOS) (12) for retrospective cohort studies and the Cochrane Risk of Bias Tool (13) for prospective RCT. Any disparity was resolved through discussion. Publication bias was investigated with funnel plots as well as Begg's (14) and Egger's (15) tests. A P-value < 0.05 indicated the presence of publication bias.

Data Collection
Clinical information from the included studies was extracted independently by two authors (QX and SH), and any disagreement was settled by discussion or consultation with the third author (LY). The extracted data comprised the first author, study characteristics, and patients' baseline and prognostic information. The endpoints included overall survival (OS), event-free survival (EFS), relapse-free survival (RFS), cumulative incidence of relapse (CIR), early death (defined as induction death or 30-day mortality), adverse events and complete remission (CR), defined by the revised International Working Group Criteria (16), without requirement of peripheral count recovery for CR. Hazard ratio (HR) was used to assess survival, and relative risk (RR) was used to evaluate CR, early death and adverse events. Data were preferentially extracted from multivariate analyses. In studies without multivariate data, RR and HR were extracted from univariate analyses or calculated from Kaplan-Meier survival curves or numeric reports using the methods provided by Tierney et al. (17).
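When a primary study reports a ratio measure only with its 95% CI, the log effect and its standard error needed for pooling can be recovered from the interval. The sketch below shows this standard conversion, assuming the interval is symmetric on the log scale. The function name and the example numbers are hypothetical and are not taken from any included study, and the text does not state which conversion was applied to which study.

```python
import math


def log_effect_and_se(effect, ci_low, ci_high, z=1.959964):
    """Return (log effect, standard error) for a ratio measure (HR or RR).

    Assumes the reported interval is a 95% CI that is symmetric on the
    log scale, so its width on that scale equals 2 * z * SE.
    """
    if not (0 < ci_low <= effect <= ci_high):
        raise ValueError("expected 0 < ci_low <= effect <= ci_high")
    log_effect = math.log(effect)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_effect, se


# Illustrative numbers only (hypothetical study, not from this review).
log_hr, se = log_effect_and_se(0.50, 0.25, 1.00)
print(f"log HR = {log_hr:.3f}, SE = {se:.3f}")
```

The resulting pair (log effect, SE) is the input that inverse-variance pooling expects.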

Statistical Analysis
The pooled HR and 95% confidence intervals (95% CI) for survival were calculated with the inverse variance method, and the pooled RR and 95% CI for CR, early death and adverse events were calculated with the Mantel-Haenszel method (18). Analyses were conducted with Stata 15.1 using random-effect models to account for heterogeneity between studies. A pooled RR or HR < 1.00 indicated a better effect in favor of FLT3i or allo-HSCT. A result was considered statistically significant when the 95% CI did not cover 1.00 and the p-value was < 0.05. The χ²-based Q statistic was used to assess heterogeneity among studies. Low, moderate, substantial and considerable heterogeneity corresponded to I² < 30%, 30%-50%, 50%-75% and > 75%, respectively. A P-value ≥ 0.10 implied no or slight heterogeneity, whereas P < 0.10 indicated significant heterogeneity (19). When the P-value for heterogeneity was < 0.10, sensitivity and subgroup analyses were conducted to identify the source of the heterogeneity.
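The pooling step described above can be illustrated with a minimal sketch of inverse-variance random-effects pooling on the log scale, together with Cochran's Q and I². The between-study variance is estimated here with the DerSimonian-Laird formula, which is an assumption: the text states only that random-effect models were used in Stata. Function names, the handling of the I² boundaries, and the input numbers are likewise hypothetical.

```python
import math


def pool_random_effects(log_effects, ses, z=1.959964):
    """Pool log HR/RR values with inverse-variance weights.

    tau^2 uses the DerSimonian-Laird estimator (an assumption; the
    source does not name the estimator).
    """
    if len(log_effects) != len(ses) or len(ses) < 2:
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(ses) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "estimate": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


def heterogeneity_label(i2):
    """Map I^2 (percent) to the categories used in the text.

    Assigning exactly 50% to "substantial" is an assumption, because the
    stated ranges share their boundaries.
    """
    if i2 < 30:
        return "low"
    if i2 < 50:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"


# Hypothetical log HRs and standard errors, for illustration only.
result = pool_random_effects([0.0, 1.0], [0.5, 0.5])
print(result["estimate"], result["ci"], heterogeneity_label(result["i2"]))
```

The p-value for Q is omitted because it requires a χ² distribution function, which the Python standard library does not provide.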
Bayesian NMA of RCT was performed in R 4.0.2 with a random-effects model using the "gemtc" and "rjags" packages. HR or risk ratios (RR) were calculated with the non-FLT3i group as the baseline and displayed in forest plots, where RR and HR with 95% credible intervals (95% CrI) described the size of the effect on CR and survival, respectively. A 95% CrI that did not cover 1.00 implied statistical significance. To estimate relative HR and RR, a Markov Chain Monte Carlo simulation was run with 10000 adaptations and 100000 iterations for each of the three automatically generated Markov chains. For each iteration, therapies were ranked according to the estimated log HR or log RR; the probability that each treatment was best was then calculated as the proportion of iterations in which that treatment ranked best. The data from Bayesian NMA were compared with data from pairwise meta-analyses to assess inconsistency using the node-splitting method (20). Significant inconsistency was indicated if node-splitting analysis showed a P-value < 0.05. If no closed loop was present in the network evidence plot, inconsistency analysis could not be performed. The network evidence plots were drawn with Stata 15.1.
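The ranking step can be made concrete with a short sketch. It does not reproduce the "gemtc" workflow; it only shows, in plain Python, how a probability of being best follows from posterior draws once they exist. The function name, the treatment labels and the draws are hypothetical, and each value is assumed to be a log HR (or log RR) versus a common baseline, with lower values being better.

```python
def probability_best(draws):
    """Return, per treatment, the share of iterations in which it ranked best.

    `draws` maps a treatment name to a list of sampled log HR/RR values
    versus a common baseline; all lists must have the same length.
    The lowest value in an iteration is treated as the best rank.
    """
    names = list(draws)
    if not names:
        raise ValueError("no treatments supplied")
    n_iter = len(draws[names[0]])
    if n_iter == 0 or any(len(draws[name]) != n_iter for name in names):
        raise ValueError("all treatments need the same, non-zero number of draws")
    wins = {name: 0 for name in names}
    for i in range(n_iter):
        best = min(names, key=lambda name: draws[name][i])
        wins[best] += 1
    return {name: wins[name] / n_iter for name in names}


# Hypothetical draws for illustration; the baseline is fixed at log HR = 0.
example = {
    "control": [0.0, 0.0, 0.0, 0.0],
    "drug_a": [-0.5, -0.1, 0.2, -0.3],
    "drug_b": [-0.2, -0.4, 0.1, -0.6],
}
print(probability_best(example))
```

With real output, the lists would hold the pooled iterations from all chains after adaptation.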
All analyses were based on published data; therefore, no ethical approval and patient consent were required.

RESULTS
Characteristics of Included Studies
The study selection is shown in Figure 1. A total of 38 articles, comprising 39 studies, were eventually included. The 4th study covered the cohort of the 5th but focused on a different disease status, while the 35th and 36th studies contained the same population but involved different treatments. The study characteristics and quality assessment are listed in Supplementary Table 2.
In total, 6859 AML patients were included. The studies were conducted in FLT3(+) AML, except for Roellig et al.  Table 2), covering 464 newly diagnosed AML patients who received sorafenib plus chemotherapy as the induction and consolidation regimen, regardless of FLT3 mutational status. Three studies comprised 380 newly diagnosed FLT3(+) AML patients given sorafenib-related therapies as the induction strategy, with or without consolidation and maintenance therapy following allo-HSCT. Six studies (n = 951) focused only on sorafenib maintenance therapy after allo-HSCT, and two studies (n = 235) used sorafenib as part of salvage therapy following relapse. Midostaurin was used for induction, consolidation and maintenance treatment in CR in four studies (n = 1604), and one article (n = 60) reported results of midostaurin-related maintenance therapy after allo-HSCT. Of the lestaurtinib studies, two (n = 500) focused on the induction stage in untreated FLT3(+) AML and one (n = 224) involved salvage therapy in FLT3(+) rrAML. In addition, the clinical efficacy of gilteritinib (n = 371) and quizartinib (n = 367) was evaluated in RCT for rrAML, and two studies (n = 178) combined several FLT3i. Finally, fourteen studies (n = 1797) compared prognosis between allo-HSCT and non-HSCT in FLT3(+) AML.
Twenty-eight retrospective cohort studies and eleven RCT were included. The eleven RCT comprised four for sorafenib, two for midostaurin, one for gilteritinib, one for quizartinib, and three for lestaurtinib. Supplementary Figures 1, 2 show the risk of bias in the RCT quality assessment. For survival endpoints, we considered bias unlikely because relapse and death are endpoints that are not susceptible to patient, physician, or outcome assessor bias. The NOS scores of the retrospective studies are listed in Supplementary Table 3.

Harm
In total, early death and 21 types of toxic effects were analyzed (Supplementary

Sensitivity Analyses and Publication Bias
Sensitivity analyses were conducted if high heterogeneity (P < 0.10) existed. The heterogeneity sources, listed in Supplementary Table 4, were related to age, various FLT3i, allo-HSCT, cytogenetics, and genetics. Supplementary Table 8   analyses of OS and RFS. Consequently, for EFS and CR, the summarized data between interventions were derived from either indirect or direct evidence but not from both, so the inconsistency between direct and indirect comparisons could not be estimated.
Across all endpoints included in NMA, gilteritinib showed a tendency toward the best prognosis compared with standard chemotherapy, particularly in CR (RR 0.45, 95% CrI 0.20-1.10) and OS (HR 0.64, 95% CrI 0.39-1.00) (Supplementary Figures 11E-H), and also favored the highest probability of  Figure 13). The heterogeneity arose from age: the AML patients in Roellig et al. (7) were aged 60 years or younger, whereas those in Serve et al. (21) were aged 60 years or older.

DISCUSSION
To date, there has been no consensus on the role of FLT3i in the various AML treatment stages, especially for FLT3(+) AML, or on which FLT3i might be best; this study investigated both questions. FLT3i possibly supported better CR in induction and salvage therapy, which in turn contributed to the improved survival seen with FLT3i. Allo-HSCT improved survival in FLT3(+) AML, and FLT3i as maintenance therapy after allo-HSCT might further enhance the survival benefit gained from allo-HSCT. FLT3i also improved OS regardless of FLT3-ITD ratio stratification when compared with the non-FLT3i group. Additionally, FLT3i consistently showed a significantly increased risk of thrombocytopenia, neutropenia, anemia, skin-related adverse effects, increased alanine aminotransferase, cardiac-related adverse events, cough and dyspnea, but were not associated with higher early death or an increased risk of GVHD. NMA showed that gilteritinib probably had the highest probability of a better prognosis, which should be confirmed in more direct head-to-head RCT.
The summarized data in the induction stage favored a prognostic benefit of FLT3i in AML, especially for sorafenib and midostaurin, consistent with a preceding RCT. Roellig et al. (7), one of the two largest RCT of sorafenib, reported improved EFS and RFS with sorafenib in patients aged ≤ 60 years. A possible mechanism underlying these data, even in wild-type FLT3, was the antileukemic activity of sorafenib through inhibition of other kinases such as RAF (24), KIT, platelet-derived growth factor receptors, and vascular endothelial growth factor receptors (7). In contrast, Serve et al. (21), the other large RCT of sorafenib, showed higher early mortality and increased toxicity without improved antileukemic efficacy of sorafenib in patients ≥ 60 years. A plausible explanation was the lower tolerability of sorafenib in elderly patients, as well as overexpressed multidrug-resistant phenotypes and, probably, more epigenetic changes in elderly cohorts (25), which offset the targeting effect of sorafenib on FLT3 and contributed to the heterogeneity of the summarized EFS in the induction stage in our study. For midostaurin, a large RCT (RATIFY) (3) reported that among FLT3(+) patients aged 18-59 years, midostaurin plus chemotherapy prolonged survival regardless of FLT3-TKD and of the level of mutant FLT3-ITD ratio, probably as a result of sufficient exposure to this inhibitor. Given the benefit among patients with a low FLT3-ITD allelic burden and a large disease burden, the benefit of this multitargeted kinase inhibitor might lie beyond its ability to inhibit FLT3, such as inhibition of KIT in FLT3(-) AML (3). Finally, Knapper et al. (10) reported that lestaurtinib plus chemotherapy did not improve prognosis in younger patients with FLT3(+) AML, possibly because the rising level of FLT3 ligand induced by chemotherapy could interfere with the activity of FLT3 inhibitors, including lestaurtinib (10).
Next, allo-HSCT possibly also enhanced survival in FLT3(+) AML. However, several variables greatly affect the effectiveness of allo-HSCT, including disease status (CR or not), FLT3 variables (allelic burden and co-mutations), and the use of FLT3i before and/or after allo-HSCT. Unfortunately, no RCT has assessed the most suitable post-remission treatment in FLT3(+) AML, considering the diverse combinations (26). We therefore summarized the maintenance effects of FLT3i following allo-HSCT, since early relapse frequently occurs in FLT3(+) AML even after allo-HSCT (30%-59%) (27). Indeed, increased survival was observed with FLT3i as maintenance therapy after allo-HSCT, especially for sorafenib. Similarly, one recent phase 3 RCT (28) showed enhanced survival and high tolerance of sorafenib, supporting the safety and utility of sorafenib after allo-HSCT. Moreover, in a large retrospective study including 144 FLT3-ITD(+) patients undergoing allo-HSCT (22), CIR and OS were most favorable when sorafenib was administered both before and after allo-HSCT, compared with either alone and with the non-sorafenib group. Based on these results, further clinical trials should directly compare the four arms above. In addition, sorafenib, midostaurin and gilteritinib are currently being evaluated as maintenance after allo-HSCT in FLT3-ITD(+) AML. In the RADIUS phase 2 RCT (29), midostaurin showed a slight tendency toward better OS and CIR. Gilteritinib is being prospectively evaluated in a phase 3 RCT (NCT02997202). Overall, for the foreseeable future, FLT3(+) AML patients may still benefit from sorafenib after allo-HSCT because other FLT3i are less widely used (26).
Because of the high incidence of refractory and relapsed disease in FLT3(+) AML (6) and the limited knowledge of how to treat FLT3(+) rrAML, we also explored whether therapeutic efficacy was consistent or different across the various FLT3i in such patients. Except for lestaurtinib, the combined OS was consistently improved by sorafenib, gilteritinib, and quizartinib, showing the potential of FLT3i in treating FLT3(+) rrAML. A phase 3 RCT randomized 371 FLT3(+) rrAML patients to either gilteritinib or salvage chemotherapy (6) and showed increased survival as well as a decreased frequency of adverse events with gilteritinib. Results from a similar phase 3 RCT (QuANTUM-R) comparing quizartinib with salvage chemotherapy in 367 FLT3(+) rrAML patients also confirmed that quizartinib improved prognosis (p < 0.05). Thus far, the promising efficacy of gilteritinib and quizartinib in FLT3(+) rrAML has been observed in these two large RCT. However, evidence is lacking for adding the two FLT3i to induction, consolidation, and maintenance therapy after allo-HSCT. Several phase 2 or 3 RCT are ongoing (Gilteritinib: NCT02927262, NCT02997202, and NCT02752035; Quizartinib: NCT02668653). Additionally, two retrospective studies of sorafenib, from Bazarbachi et al. (30) and Xuan et al. (31), also revealed enhanced OS when sorafenib was compared with salvage chemotherapy in FLT3(+) rrAML, which should be further confirmed in RCT. Finally, midostaurin might be specifically effective in untreated AML rather than rrAML, since in in vitro studies midostaurin had broader activity and might achieve greater clinical utility in newly diagnosed AML, in which blasts tend to be less addicted to FLT3-mediated signaling than in rrAML (32).
We also explored the effectiveness of FLT3i in FLT3-ITD(+) AML stratified by ITD allelic ratio. FLT3i consistently achieved significantly improved OS in both the high- and low-ratio groups across sorafenib, quizartinib, midostaurin and even lestaurtinib, further suggesting the benefit of FLT3i in FLT3-ITD(+) AML. However, no data were available to assess the clinical efficacy of gilteritinib in patients with different levels of FLT3-ITD ratio, which needs to be explored in further studies.
For adverse events, FLT3i were significantly associated with an increased risk of thrombocytopenia, neutropenia and anemia regardless of grade, a high risk of skin- and cardiac-related adverse effects, especially for grades ≥ 3, all-grade increased alanine aminotransferase, and a high risk of all-grade cough and dyspnea, as well as a tendency toward increased risk of all-grade gastrointestinal-related toxicities. However, these adverse events were generally manageable with treatment interruptions or dose reductions (2). Moreover, FLT3i were not significantly associated with higher early death or a higher risk of GVHD, supporting the safety of FLT3i.
Finally, because several FLT3i co-exist, it is also meaningful to explore which inhibitor might be best. We therefore performed an NMA based on all RCT to address this question. The results showed that gilteritinib probably tended to have the highest probability of improving prognosis compared with other FLT3i, standard of care and standard chemotherapy. The RCT of gilteritinib for rrAML treatment also showed improved survival, as mentioned above (6). However, these results and the conclusions drawn from them were relatively limited, given the rather small improvement obtained with each FLT3i through the indirect comparison. More direct head-to-head RCT with large cohorts are required to explore the differences in clinical efficacy between these FLT3i.
Overall, our study is, to date, the largest and most comprehensive meta-analysis of the various FLT3i in AML, containing all RCT and retrospective cohort studies. In addition to the function of FLT3i in the induction stage in untreated AML, the role of allo-HSCT in FLT3(+) AML and the therapeutic efficacy of FLT3i as maintenance therapy after allo-HSCT and as salvage regimens in rrAML were summarized and explored. The results showed that combining FLT3i, especially sorafenib, into the treatment before and after allo-HSCT might be more beneficial in improving prognosis, which should be further explored in RCT. Our study also made a preliminary attempt to use NMA to compare the various FLT3i, but with the limitations mentioned above. There were also several other limitations in this study. Of the 39 studies, 28 were retrospective, making it difficult to precisely control selection, attrition, information and confounding bias. In particular, owing to the scarcity of RCT, only limited results and conclusions could be obtained from NMA. In addition, some data were extracted from Kaplan-Meier survival curves and numeric reports, probably resulting in slight deviations from the true values. Finally, this study did not perform subgroup analyses based on age, mutations co-existing with FLT3 mutations, or cytogenetic stratification because of limited relevant data in the primary studies. We will therefore continue updating this study to explore suitable FLT3i treatment in AML patients.

CONCLUSIONS
In this work, FLT3i improved prognosis in the induction stage of newly diagnosed FLT3(+) AML and in salvage therapy of FLT3(+) rrAML, and further enhanced the survival advantage of allo-HSCT when used as maintenance therapy. This probably indicates that a better prognosis could be achieved if FLT3i is added to treatment both before and after allo-HSCT. FLT3i also probably improved OS regardless of FLT3-ITD ratio. Additionally, FLT3i were significantly linked to an increased risk of all-grade thrombocytopenia, neutropenia and anemia, all-grade skin- and cardiac-related adverse effects, especially for grade ≥ 3, all-grade increased alanine aminotransferase, and all-grade cough and dyspnea, with a tendency toward increased risk of all-grade gastrointestinal-related toxicities, but were not related to a higher incidence of early death or GVHD compared with the non-FLT3i group. In NMA, gilteritinib potentially achieved the best prognosis, which should be confirmed in direct head-to-head RCT.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.