Measurable Residual Disease Detected by Multiparameter Flow Cytometry and Sequencing Improves Prediction of Relapse and Survival in Acute Myeloid Leukemia

The clinically ideal time point and optimal approach for the assessment of measurable residual disease (MRD) in patients with acute myeloid leukemia (AML) remain unsettled. We investigated the clinical value of multiparameter flow cytometry-based MRD (MFC MRD) after induction (n = 492) and after two cycles of consolidation (n = 421). The latter time point proved to be the superior indicator, with independent prognostic significance for both relapse-free survival (RFS: HR = 3.635, 95% CI: 2.433–5.431, P <0.001) and overall survival (OS: HR = 3.511, 95% CI: 2.191–5.626, P <0.001). Furthermore, several representative molecular MRD markers were compared with the MFC MRD. Both approaches had prognostic value in patients with NPM1 mutations and in those with FLT3, C-KIT, or N-RAS mutations, which are involved in kinase-related signaling pathways, and the combination of both techniques further refined the risk stratification. The detection of RUNX1–RUNX1T1 fusion transcripts achieved a considerable net reclassification improvement in predicting the prognosis. Conversely, for patients with biallelic CEBPA or DNMT3A mutations, only the MFC method is recommended, because tracking mutant transcripts discriminated prognosis poorly. In conclusion, this study demonstrated that the MFC MRD after two consolidation cycles independently predicted clinical outcomes, and that the integration of MFC and molecular MRD should depend on the type of AML-related genetic lesion.


INTRODUCTION
Acute myeloid leukemia (AML) is a group of hematological malignancies characterized by high heterogeneity in clinical manifestation, genetic abnormalities, and prognosis (1). Treatment with cytotoxic drugs or molecular targeted therapies can achieve a high complete remission (CR) rate; however, a substantial proportion of AML patients will relapse due to residual leukemic cells below the detection threshold of traditional morphologic methods. To achieve long-term remission and survival, post-remission therapies should be administered based not only on pre-treatment parameters including age, cytogenetics, and a limited set of molecular genetic markers, but also on several post-treatment factors, among which the detection of residual disease is of utmost importance (2,3).
Measurable residual disease (MRD; also named minimal residual disease) detected in AML patients with hematological complete remission after treatment has been suggested as a powerful prognostic indicator (4,5). In general clinical practice, MRD is reliably monitored using the two most common methods, multiparameter flow cytometry-based MRD (MFC MRD) and polymerase chain reaction (PCR)-based MRD (Gene MRD), quantifying MRD in AML by virtue of either immunophenotype or molecular abnormalities of leukemic cells. However, until now, issues on when and how to apply the evaluation of MRD in daily clinical practice are still controversial.
Numerous studies have highlighted the prognostic impact of MFC MRD assessment at diverse time points in AML patients, such as post-induction (PI), post-consolidation (PC), and before and after hematopoietic stem cell transplantation (HSCT) (6-9), but the clinically ideal time point for MRD assessment remains inconclusive. In addition, the two common approaches differ in sensitivity and applicability. The MFC MRD can be monitored in the majority of AML patients with rapid turnaround time, but an established threshold with considerably high sensitivity and reproducibility is still an unmet clinical need (7,10). Conversely, although the PCR technique is highly sensitive, its application is limited to the fraction of AML patients who harbor specific genetic aberrations suitable for MRD detection, including RUNX1-RUNX1T1, CBFB-MYH11, and PML-RARA fusions, and NPM1 mutations (5,11-16). Notably, next-generation sequencing (NGS) has introduced novel molecular markers for MRD assessment, and reliable criteria for their routine application in the clinical setting are under exploration. Consequently, how to compare and integrate current procedures for MRD monitoring is of great clinical significance.
In this study, we evaluated the prognostic value of the MFC MRD after induction and two consolidation cycles in AML to identify the optimal time point for MRD measurement. Furthermore, the tracking of molecular MRD on a series of AML-related gene alterations was compared with the MFC MRD, highlighting the necessity of combining both MFC and molecular techniques to establish an integrated methodology for MRD monitoring.

Patients and Treatment
From January 2011 to June 2018, a total of 833 consecutive newly diagnosed AML (non-M3) patients treated in Ruijin Hospital were enrolled. Of these, the majority of patients diagnosed after 2015 participated in one of three phase II/III clinical trials, which were registered at the Chinese Clinical Trial Registry (www.chictr.org.cn: ChiCTR-OPC-15006085; ChiCTR-OIC-16007764; ChiCTR-OIN-16008955).
Young patients (<60 years) were given standard intensive "3 + 7" IA/DA-based regimens as initial induction, which contained idarubicin (10-12 mg/m²) or daunorubicin (45-60 mg/m²) on days 1-3 and cytarabine (100-200 mg/m²) on days 1-7. When CR was achieved, four cycles of high-dose cytarabine (2 g/m²) were delivered as consolidation. Elderly patients (≥60 years) were evaluated for fitness by the treating physician. Fit patients received the same induction regimens as young patients, but consolidation was reduced to two cycles of high-dose cytarabine. Unfit patients received either low-dose "3 + 7" regimens, demethylating agents, or other less intensive therapies at the discretion of the physician. Treatment protocols of the three clinical trials are provided in Supplementary Methods.

Multiparameter Flow Cytometry
Bone marrow aspirate samples were obtained at diagnosis and before the first and third cycles of consolidation chemotherapy, and were processed through the standard procedure of our institution (17). The MFC MRD was monitored by 10-color flow cytometry using monoclonal antibodies against 21 antigens, including stem cell and progenitor markers (CD34, CD38, CD45, CD117, CD123, and HLA-DR), myelomonocytic markers (CD13, CD11b, CD14, CD15, CD33, MPO, and CD64), and lymphoid lineage markers (CD2, cyCD3, CD4, CD7, CD19, cyCD79a, TdT, and CD56). Identical antibody-fluorochrome combinations at diagnosis and during the follow-up period were used for tracking established leukemia-associated immunophenotypes (LAIPs) and newly emerging aberrant immunophenotypes. For statistical analyses, the "LAIP-based different-from-normal approach" was applied. Detailed definitions concerning MFC are provided in Supplementary Methods.

Molecular Events
Molecular alterations of AML in this study were selected according to the established laboratory-developed tests (LDTs) at Shanghai Institute of Hematology, based on our previous work in a large AML cohort, in which gene mutations and fusions showing significant prognostic value were subsequently tested in daily clinical routine (18). Genetic alterations including FLT3-ITD/TKD, KMT2A-PTD, NPM1, N-RAS, C-KIT, CEBPA, DNMT3A, IDH1, and IDH2 were detected by PCR and Sanger sequencing. RUNX1-RUNX1T1, CBFB-MYH11, and KMT2A-related fusion genes were detected by reverse transcription (RT)-PCR as previously reported (19). The level of RUNX1-RUNX1T1 transcripts was measured by quantitative real-time PCR (qPCR), and a >3-log reduction compared with the baseline level at diagnosis was defined as molecular MRD negativity according to the published literature (14,20).
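The >3-log reduction criterion for RUNX1-RUNX1T1 can be illustrated with a minimal sketch. The function names are hypothetical, the inputs are assumed to be normalized transcript levels (for example, a ratio to a control gene), and treating an undetectable follow-up level as an unbounded reduction is an assumption of this sketch rather than a rule stated in the study.

```python
import math


def log_reduction(baseline_level: float, followup_level: float) -> float:
    """Return the log10 fold reduction of a transcript level relative to baseline.

    Both levels are assumed to be normalized transcript quantities measured
    with the same assay (an assumption of this sketch).
    """
    if baseline_level <= 0:
        raise ValueError("baseline level must be positive")
    if followup_level <= 0:
        # Assumption: an undetectable transcript counts as an unbounded reduction.
        return math.inf
    return math.log10(baseline_level / followup_level)


def is_molecular_mrd_negative(baseline_level: float,
                              followup_level: float,
                              min_log_reduction: float = 3.0) -> bool:
    """Apply the '>3-log reduction from diagnosis' definition of negativity."""
    return log_reduction(baseline_level, followup_level) > min_log_reduction


# Hypothetical values: a 4-log drop is negative, a 2-log drop is not.
print(is_molecular_mrd_negative(100.0, 0.01))  # True
print(is_molecular_mrd_negative(100.0, 1.0))   # False
```

The comparison is strict, so a reduction of exactly 3 logs would still be classified as MRD positive under this reading of the criterion.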

Statistical Analyses
Kaplan-Meier and hazard ratio analyses were used to estimate and compare relapse-free survival (RFS) and overall survival (OS). The Cox proportional hazards regression model was applied for the multivariate analysis of independent factors for RFS and OS. To investigate the prognostic accuracy of MRD status by the MFC and molecular methods, the net reclassification improvement (NRI) (21) was used to measure the net gain in risk reclassification between different techniques for MRD monitoring. All statistical procedures were carried out using R (version 4.0.0) and SPSS (version 26.0).
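To make the NRI concrete, the sketch below computes the standard categorical form for two binary risk classifications (for example, MRD positive versus negative by an old and a new method): the net proportion of patients with an event who move to the higher-risk category plus the net proportion of patients without an event who move to the lower-risk category. This is a simplified illustration that assumes every patient has a known event status at the chosen horizon and ignores censoring; it is not necessarily the exact procedure used in this study, and the function name and toy data are hypothetical.

```python
def net_reclassification_improvement(old_high_risk, new_high_risk, event):
    """Categorical NRI for two binary risk classifications.

    old_high_risk, new_high_risk: sequences of bool (True = classified high risk)
    event: sequence of bool (True = event occurred within the horizon)
    Assumption of this sketch: no censoring before the horizon.
    """
    if not (len(old_high_risk) == len(new_high_risk) == len(event)):
        raise ValueError("all inputs must have the same length")

    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0
    for old, new, had_event in zip(old_high_risk, new_high_risk, event):
        moved_up = new and not old      # reclassified to higher risk
        moved_down = old and not new    # reclassified to lower risk
        if had_event:
            n_event += 1
            up_event += moved_up
            down_event += moved_down
        else:
            n_nonevent += 1
            up_nonevent += moved_up
            down_nonevent += moved_down

    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")

    # Correct movement is upward for events and downward for non-events.
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Hypothetical toy data for six patients.
old = [False, False, True, True, False, True]
new = [True, False, True, False, False, False]
evt = [True, True, True, False, False, False]
nri = net_reclassification_improvement(old, new, evt)
```

A positive value indicates that the new classification moves patients in the correct direction more often than in the wrong one; a value of zero indicates no net gain.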

Characteristics of Patients and Their Associations With MFC MRD Status
The patient flow diagram is depicted in Figure S1. Among the 833 AML patients, 639 (76.7%) achieved CR after induction chemotherapy, of whom 587 (91.9%) had specific LAIPs at diagnosis that were suitable for MRD monitoring by flow cytometry. The genetic alterations of AML patients stratified by LAIPs are shown in Table S1, and the most frequent LAIPs are summarized in Table S2. A higher frequency of RUNX1-RUNX1T1 (P = 0.031) and biallelic CEBPA (BiCEBPA, P = 0.001) mutations, but a lower frequency of CBFB-MYH11 (P = 0.009), was observed in patients who had LAIPs at diagnosis. MRD analysis by MFC was available in 492 patients after induction, of whom 24 chose HSCT as consolidation, 41 relapsed, and six died. Consequently, 421 patients who were treated with chemotherapy only remained in CR and received MRD monitoring after two cycles of consolidation therapy. Among the cut-off levels compared for distinguishing MRD+ from MRD- patients (0, 0.01, 0.035, 0.1, and 1%), the cut-off of 0.1% proved to be the most relevant to prognosis, displaying significant disparities in both RFS and OS (Figure 1, Figure S2). Therefore, a level of ≥0.1% was considered MRD positive in this study. As shown in Table 1, when one to two cycles of induction chemotherapy were completed, 329 (66.9%) of the 492 patients were classified into the MFC MRD negative group, termed MFC PI-. Moreover, 340 (81%) of the 421 patients achieved MFC MRD negativity after two cycles of consolidation chemotherapy, termed MFC PC-. There were no significant differences in age, gender, peripheral blood count, and BM blasts between MFC- and MFC+ patients at either time point. Patients in the MFC PI+ group were less likely to carry RUNX1-RUNX1T1 (P = 0.003), and those with MFC PC+ were more likely to harbor biallelic CEBPA (BiCEBPA) mutations at diagnosis (P = 0.003).
In cytogenetic risk stratification, favorable cytogenetic abnormalities were less common in patients with MFC PI+ (P = 0.002), while a higher frequency of unfavorable risk was seen in the MFC PI+ group (P = 0.003). Besides, patients who required two induction cycles to attain CR were more likely to have MFC PI+ (P = 0.013). The follow-up of all patients ended in April 2020, with a median follow-up time of 45 (range 1-108) months.
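As a small worked check of the figures above, the following sketch applies the ≥0.1% cut-off to an MRD value and recomputes the reported proportions and the size of the post-consolidation cohort from the counts given in the text. The helper names are hypothetical; only the counts and the cut-off come from the study.

```python
MRD_CUTOFF_PERCENT = 0.1  # MFC MRD >= 0.1% is considered positive in this study


def mfc_mrd_status(mrd_percent: float) -> str:
    """Classify an MFC MRD level (in percent) using the study cut-off."""
    return "positive" if mrd_percent >= MRD_CUTOFF_PERCENT else "negative"


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the text.
cr_rate = percent(639, 833)        # CR after induction
laip_rate = percent(587, 639)      # CR patients with trackable LAIPs
pi_negative = percent(329, 492)    # MFC PI- after induction
pc_negative = percent(340, 421)    # MFC PC- (reported rounded as 81%)

# Patients monitored after two consolidation cycles: those with MRD data
# after induction minus HSCT (24), relapse (41), and death (6).
monitored_after_consolidation = 492 - 24 - 41 - 6

print(cr_rate, laip_rate, pi_negative, pc_negative)  # 76.7 91.9 66.9 80.8
print(monitored_after_consolidation)                 # 421
print(mfc_mrd_status(0.09), mfc_mrd_status(0.1))     # negative positive
```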

Prognostic Significance of MFC MRD at Different Time Points
Patients with MFC PI- (median MRD, 0; range, 0-0.09%) had significantly more favorable RFS and OS than those with MFC PI+ (median MRD, 0.43%; range, 0.1-11.4%) (Figures 1A, B and Table S3). Similarly, patients in the MFC PC- group (median MRD, 0; range, 0-0.09%) had a better prognosis than those whose MFC MRD status was positive (median MRD, 0.27%; range, 0.1-4.81%) (Figures 1C, D and Table S3). Of note, the MFC PC status seemed to discriminate both short- and long-term survival better than the MFC PI status.
The dynamics of MFC MRD status after induction and the second cycle of consolidation therapy were evaluated jointly in the 421 patients who underwent MFC MRD monitoring at both time points. Patients were stratified into four groups based on the MFC MRD status at the two checkpoints. As shown in Figures 1E, F, there was no difference in the distribution of RFS and OS between MFC PI-/MFC PC- and MFC PI+/MFC PC- patients (P = 0.787), or between patients with MFC PI+/MFC PC+ and MFC PI-/MFC PC+ (P = 0.408), while both of the latter groups had an inferior prognosis compared to those with MFC PC-. Intriguingly, the prognostic impact of the MFC PC MRD status was significant in both MFC PI- and MFC PI+ patients (Figures 1E, F and Table S3).
The prognostic value of the MFC PC MRD status was also observed in young and old AML patients, respectively (Figures S3, S4), and in the ELN low and intermediate cytogenetic risk groups, respectively (Figures S5, S6), while it was not significant in the high-risk group (Figure S7).
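The four-group stratification by MFC MRD status at the two checkpoints can be sketched as a simple labeling step. The function names and the toy patient list are hypothetical and serve only to illustrate the grouping; no group sizes from the study are implied.

```python
from collections import Counter


def mrd_dynamics_group(pi_positive: bool, pc_positive: bool) -> str:
    """Label a patient by MFC MRD status after induction (PI) and
    after two consolidation cycles (PC)."""
    pi_label = "PI+" if pi_positive else "PI-"
    pc_label = "PC+" if pc_positive else "PC-"
    return f"{pi_label}/{pc_label}"


def count_dynamics_groups(patients):
    """Count patients per group; each patient is a (pi_positive, pc_positive) pair."""
    return Counter(mrd_dynamics_group(pi, pc) for pi, pc in patients)


# Hypothetical toy cohort of five patients.
toy_cohort = [(False, False), (True, False), (True, True), (False, True), (False, False)]
group_counts = count_dynamics_groups(toy_cohort)
print(group_counts["PI-/PC-"])  # 2
```

Because outcomes in the text separate by the PC status regardless of the PI status, the two "PC-" groups and the two "PC+" groups could each be pooled for analysis.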

Comparison of MRD Assessment by Different Detection Modalities
The clinical utility of molecular MRD and MFC MRD in diverse types of genetic abnormalities was compared in patients with MRD monitoring by both methods after two consolidation cycles. A series of gene markers, either fusions or mutations, that occurred at a certain incidence were selected and are described below according to gene type and function.
Firstly, the two approaches were compared in 50 patients who harbored the RUNX1-RUNX1T1 fusion gene at diagnosis. Both MFC MRD status (median RFS, 33 vs 5 months, P = 0.008; median OS, not reached [NR] vs 12 months, P <0.001) (Figures 2A, B) and molecular MRD status (median RFS, NR vs 11 months, P = 0.003; median OS, NR, P = 0.012) (Figures 2C, D) could distinguish patients with a relatively favorable outcome from those with an increased risk of relapse and mortality, while the presence of both molecular and MFC MRD indicated the worst prognosis (Figures 2E, F). Of note, the NRI of molecular MRD in 2-year RFS and OS was 21.9 and 15.5%, respectively, compared with MFC MRD (Figure 3), and more importantly, there was no improvement when the two methods were combined.
The second panel of genes was involved in activated signaling pathways, and 77 patients with FLT3, C-KIT, or N-RAS mutations at diagnosis were evaluated. Both MFC MRD status (median, 50 vs 10 months, P = 0.017) (Figure 4A) and molecular MRD status (median, 50 vs 7 months, P <0.001) (Figure 4C) had a significant prognostic impact on RFS. Gene MRD positivity conferred a significantly worse OS (median, NR vs 12 months, P <0.001, Figure 4D), while the presence of MFC MRD showed only a non-significant trend toward inferior OS (median, NR vs 43 months, P = 0.101, Figure 4B). Patients with MFC-/Gene- had longer RFS and OS than those in either the MFC+/Gene- or the MFC-/Gene+ group (P <0.001, P = 0.002 for RFS; P <0.001, P = 0.029 for OS, respectively, Figures 4E, F). The NRI showed positive gains in reclassification when combining MFC and Gene MRD, with improvements of 11.7 and 23.2% in the prediction of 2-year RFS and OS, respectively, compared with MFC MRD status alone, and of 12.1 and 11%, respectively, compared with the molecular MRD method alone (Figure 3).
[...] P = 0.010) and OS (median OS, NR vs 8 months, P = 0.090) compared with patients who had a clearance of NPM1 mutations (Figures 5C, D). Patients with MFC-/Gene- obtained significantly longer RFS and OS than those with either MFC or Gene MRD positivity (all P <0.05, Figures 5E, F). The integration of both assays yielded 7.4 and 11.4% NRI for 2-year RFS compared to the MRD status evaluated by MFC and NPM1 mutations, respectively, and 14.4% for 2-year OS compared to Gene MRD (Figure 3).
Patients with BiCEBPA mutations were reported to be sensitive to standard chemotherapy. Consistently, there were only six patients in the MFC MRD+ group and one patient in the Gene MRD+ group after two cycles of consolidation. MFC MRD positivity was associated with a worse RFS (P = 0.013, Figure S8A) but did not impact OS (P = 0.745, Figure S8B). The only patient with Gene MRD+ experienced long-term survival; therefore, no significant differences were observed in RFS and OS between the Gene MRD groups (Figures S8C, D).
For 25 patients carrying mutations in DNMT3A, the elimination of MFC MRD was significantly associated with longer RFS (P = 0.011) and OS (P = 0.049) (Figures S9A, B), while no significant differences were observed in RFS (P = 0.902) and OS (P = 0.596) between patients with detectable DNMT3A mutations and those whose molecular MRD became negative (Figures S9C, D).

DISCUSSION
There is now mounting evidence that the identification of residual disease is of paramount importance in refining risk reclassification and informing therapeutic intervention for AML patients after the achievement of morphological remission (16,22,23). However, no consensus has been reached on the ideal time point and the optimal methodology for MRD evaluation, highlighting the need to establish standardized analysis and reporting procedures so as to improve the accuracy and reproducibility of MRD monitoring.
Our results indicated that the MFC MRD status after two consolidation cycles had a greater impact on subsequent relapse and inferior outcome than that measured after induction. Notably, the same conclusion could be drawn when restricting the analysis to different patient subgroups (young vs. elderly patients, and ELN cytogenetic low- or intermediate-risk patients). For high-risk patients, however, the MFC MRD status at both time points failed to forecast the prognosis, which merits further exploration considering that adverse molecular markers, e.g., TP53, ASXL1, and RUNX1, were not included in our study. Our results are consistent with previous studies that recommended MRD tracking after consolidation (9,13,24), although others favored the post-induction time point (25). It should be mentioned that the conflicting interpretations of the prognostic value of MRD across studies may be attributed to the number of induction and consolidation courses completed at the time of MFC MRD monitoring, and to the modality and intensity of induction regimens. As reported by Minetto et al., for fludarabine plus high-dose cytarabine-based induction, an earlier time point of MRD assessment may provide the most significant information on outcome (26).
The dynamics of sequential MRD monitoring demonstrated that patients who had detectable MRD after induction but became MRD negative after the second consolidation showed the same prognosis as those with negative MRD at both time points. In contrast, the initial clearance of MRD did not guarantee persistent remission and long-term survival, as exemplified by the dismal prognosis of patients whose MRD was eliminated early but subsequently converted to positive after consolidation. More importantly, the achievement of MFC MRD negativity after two consolidation cycles was an independent predictor for both RFS and OS, emphasizing the need to introduce new therapeutic modalities such as HSCT and targeted therapies to eradicate residual malignant cells when MFC MRD is positive at this checkpoint.
In addition to abnormal immunophenotypes detected by MFC, MRD could be reliably measured through genetic aberrations expressed in leukemic cells. We compared the Gene MRD and MFC MRD after the second consolidation in several molecular groups representative of different genetic etiology and biological function. Gene fusions involving transcription factors such as RUNX1-RUNX1T1 represent a specific subtype of AML. Although significantly different prognostic groups could be distinguished by both MFC and molecular MRD, the latter methodology was superior in terms of the NRI. Indeed, a less than 3-log reduction in the RUNX1-RUNX1T1 transcript levels has been shown to be an independent adverse prognostic factor (23,27).
Mutations in kinase-related signaling pathways, including FLT3, C-KIT, or N-RAS mutations, usually occur at a later stage and are more likely to be eliminated by cytotoxic chemotherapy. The clearance of MRD confirmed by either the MFC or the sequencing-based approach conferred a favorable clinical outcome, and the combination of both methods showed greater discriminative ability. Similar results could be observed in patients with NPM1 mutations. Although only two patients harbored NPM1 mutations post consolidation, both displayed a dismal prognosis, which was in concordance with the widely appreciated role of NPM1 mutations in MRD testing (28,29). Remarkably, the MFC MRD can provide complementary prognostic value.
Biallelic CEBPA mutations have been recognized as a favorable prognostic marker of AML (18,30,31). However, since only one patient was in the Gene MRD+ group after consolidation, the tracking of molecular MRD showed limited predictive power, as reported in prior studies (32,33). Likewise, the continuous presence of mutant transcripts in the epigenetic modifier gene DNMT3A did not exert any adverse impact on prognosis. Consistently, recent studies regarded DNMT3A, TET2, and ASXL1 (DTA) mutations as age-related clonal hematopoiesis, and their persistence after remission was unable to forecast an increased risk of relapse (3,34).
Despite the high performance of MRD monitoring by MFC and PCR-based assays, a proportion of AML patients lack a traceable MRD marker. So far, routine clinical MRD tracking has covered only a small proportion of typical genetic anomalies in AML (5,35). The overexpression of WT1 can be observed in more than 80% of AML patients and may serve as an alternative PCR-based MRD test, since the quantification of WT1 expression after treatment has been shown to have significant prognostic value (36). In addition, the integration of WT1-based MRD and MFC MRD may improve the prediction of outcome in AML, although limited sensitivity and specificity to some extent hamper the wide application of MRD monitoring based on WT1 expression (37-39). Encouragingly, NGS technology holds great potential for widely applicable MRD tracking, as nearly all AML patients harbor at least one mutation at diagnosis. Growing evidence supports the prognostic value of NGS-based MRD, either in the CR or pre-transplant stage, which can provide additional information on changes of variant allele frequency as well as clonal evolution during the follow-up period (3,40-42). However, these NGS-based studies often integrated dozens to hundreds of genetic abnormalities without uniform design, technical, and reporting standards, which may ignore the heterogeneity of molecular anomalies and their utility for MRD tracking in AML (40). It is noteworthy that the predictive value of molecular MRD varies across categories of mutations, as shown in this study, and the optimal threshold of NGS-based MRD needs to be explored in a flexible and genotype-oriented way.
In summary, our study indicated that a positive MFC MRD after two consolidation cycles was an independent risk factor, and the comparison of molecular and MFC MRD in patients with different types of recurrent mutations lends support to the clinical implementation of NGS-based MRD assessment. Owing to technological limitations, a few germline mutations could affect the interpretation of molecular MRD. Hence, how to integrate various detection methodologies and establish standard-of-care guidelines for MRD testing warrants further refinement in large prospective studies.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ruijin Hospital Ethics Committee Shanghai Jiao Tong University School of Medicine. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.