Decoupling Lineage-Associated Genes in Acute Myeloid Leukemia Reveals Inflammatory and Metabolic Signatures Associated With Outcomes

Acute myeloid leukemia (AML) is a heterogeneous disease with variable responses to therapy. Cytogenetic and genomic features are used to classify AML patients into prognostic and treatment groups. However, these molecular characteristics harbor significant patient-to-patient variability and do not fully account for AML heterogeneity. RNA-based classifications have also been applied in AML as an alternative approach, but transcriptomic grouping is strongly associated with AML morphologic lineages. We used a training cohort of newly diagnosed AML patients and conducted unsupervised RNA-based classification after excluding lineage-associated genes. We identified three AML patient groups that have distinct biological pathways associated with outcomes. Enrichment of inflammatory pathways and downregulation of HOX pathways were associated with improved outcomes, and this was validated in 2 independent cohorts. We also identified a group of AML patients who harbored high metabolic and mTOR pathway activity, and this was associated with worse clinical outcomes. Using a comprehensive reverse phase protein array, we identified higher mTOR protein expression in the highly metabolic group. We also identified a positive correlation between degree of resistance to venetoclax and mTOR activation in myeloid and lymphoid cell lines. Our approach of integrating RNA, protein, and genomic data uncovered lineage-independent AML patient groups that share biologic mechanisms and can inform outcomes independent of commonly used clinical and demographic variables; these groups could be used to guide therapeutic strategies.

Acute myeloid leukemia (AML) is a heterogeneous disease with variable responses to therapy. Cytogenetic and genomic features are used to classify AML patients into prognostic and treatment groups. However, these molecular characteristics harbor significant patient-to-patient variability and do not fully account for AML heterogeneity. RNA-based classifications have also been applied in AML as an alternative approach, but transcriptomic grouping is strongly associated with AML morphologic lineages. We used a training cohort of newly diagnosed AML patients and conducted unsupervised RNAbased classification after excluding lineage-associated genes. We identified three AML patient groups that have distinct biological pathways associated with outcomes. Enrichment of inflammatory pathways and downregulation of HOX pathways were associated with improved outcomes, and this was validated in 2 independent cohorts. We also identified a group of AML patients who harbored high metabolic and mTOR pathway activity, and this was associated with worse clinical outcomes. Using a comprehensive reverse phase protein array, we identified higher mTOR protein expression in the highly metabolic group. We also identified a positive correlation between degree of resistance to venetoclax and mTOR activation in myeloid and lymphoid cell lines. Our approach of integrating RNA, protein, and genomic data INTRODUCTION Acute myeloid leukemia (AML) is a clinically and morphologically heterogeneous disease with significant variability in treatment responses and outcomes (1)(2)(3). Although almost 60-70% of AML patients achieve remission with standard anthracycline (idarubicin or daunorubicin) and cytarabine-based induction chemotherapy, almost 50% of these patients eventually experience relapse within 1 year of diagnosis (2)(3)(4). Revealing the underlying biologic processes that contribute to AML heterogeneity and drive outcomes may guide therapeutic strategies.
The French American British (FAB) classification was traditionally used to categorize AML into 8 different morphologic subtypes (M0 to M7) that reflected lineage commitment (2,(5)(6)(7). With the advent of cytogenetic and genomic assessments, the European Leukemia Network (ELN) recommendations were widely adopted as it proposed a risk stratification for patients that based on cytogenetics and genomics (2,8,9). However, cytogenetic and molecular alterations do not fully account for the heterogeneity of AML because not all patients harbor ELN-pre-defined aberrations (2,10,11). Also, there is considerable patient-to-patient variability in response to treatment and clinical outcomes within genomic and ELN subgroups (11). Therefore, there is a need to uncover underlying biologic pathways that are underrepresented in genomic and cytogenetic profiling of AML and may inform outcomes.
To fill this gap, researchers have identified several transcriptomic signatures associated with AML clinical outcomes (10,(12)(13)(14)(15)(16)(17). However, RNA-based profiling revealed that this method of grouping AML patients was highly associated with FAB classifications, i.e., related to AML morphology and lineages (15,16,18). Yet, the FAB-associated clustering was not accounted for in previous transcriptome-based studies, suggesting that the morphology and lineage of AML were driving patient grouping. Furthermore, although mutations were associated with some transcriptomic-based clustering, there was significant overlap for these mutations in multiple clusters (15,16). We hypothesized that by decoupling the lineage-related genes from the transcriptomic profiles of AML, we could unmask biologically relevant pathways that are inherent to AML independent of cell of origin and that could inform clinical outcomes. Furthermore, such an approach could identify biologic pathways associated with clusterspecific mutations.
In the current study, we decoupled FAB-associated genes to decipher lineage-independent biologic pathways in 81 newly diagnosed and previously untreated AML patients. We identified distinct biologic AML patient groups and assessed the outcome of patients according to their group membership. To provide further molecular orthogonal characterization of defined groups, we applied a reverse phase protein array (RPPA) in all patient samples and extended panel DNA sequencing in 73 of 81 patients (90%). Using this approach, we identified inflammatory and metabolic pathways associated with outcomes and validated our findings in 2 independent AML cohorts. The findings from this work demonstrate that RNAbased classification could reveal important potentially targetable biologic pathways.

Patient Population
A total of 81 newly diagnosed AML patients evaluated at The University of Texas MD Anderson Cancer Center were included in the current study. All patients had bone marrow samples collected and analyzed prior to treatment initiation. Patients provided written informed consent that was approved by the MD Anderson Institutional Review Board. The study was conducted in accordance with the Declaration of Helsinki.

Differential Expression and Pathway Analysis
In our cohort and TCGA cohort, gene-level read counts were used to perform differential expression analysis using DESeq2 (3). The T-statistic from the differential expression analysis was used to perform gene set enrichment analysis (GSEA) using the Bioconductor package gage (4), and significantly dysregulated pathways were identified at q < 0.1. For the Valk et al. validation cohort, we used single-sample GSEA (ssGSEA) (5,6) implemented in the Bioconductor package GSVA (7) to convert microarray gene expression into pathway activity scores. Pathway activity scores were compared between groups using the Wilcox Rank Sum test, and p-values were corrected for multiple hypothesis testing using false discovery rate (FDR). Significantly dysregulated pathways were identified at q < 0.05. The Hallmark pathways (8) were used in GSEA and ssGSEA.

Unsupervised Clustering Prior to Excluding FAB-Associated Genes
The R package pheatmap was used to generate heatmaps and dendrograms for samples from a matrix of variably expressed genes. Euclidian distance and complete clustering were used to perform hierarchical clustering of the data. We used a dynamic tree cutting algorithm implemented in the cutreeDynamic() function from the WGCNA package (9) to identify optimal clustering of patients. The strength of association between cluster membership and FAB status (M1/M2 and M4/M5) was tested using Fisher's test. Clustering analysis in our dataset was performed using an increasing number of top variably expressed genes (1000, 1500, 2000, 2500, and 3000 genes). Clustering analysis in TCGA cohort was performed as above using the top 1000 variably expressed genes, including only samples with corresponding FAB status (M1/M2 and M4/M5).

Unsupervised Clustering to Identify Patient Clusters Independent of FAB Status
To identify genes with expression associated with FAB status, we excluded genes associated with FAB and lineage commitment. Specifically, for each gene, a p-value was obtained for each of the FAB groups M2, M4, and M5 relative to M1 from the regression analysis. p-values for each of the groups (M2, M4, and M5) were corrected for multiple testing using FDR. A gene's expression was considered associated with FAB status if expression of the gene was significantly different in at least one of the FAB groups M2, M4, and M5 relative to M1 (q < 0.05). A total of 4743 genes were found to have expression associated with FAB status. Enrichment of these genes in cell type-specific signatures was quantified using Enricher (10,11). FAB-associated genes were excluded, and the top variably expressed genes (variance > 5, 735 genes) were used to identify clusters (as described above). Three distinct expression clusters were identified using this approach, Group_1, Group_2 and Group_3 ( Figure 1A)

Survival Analysis
Survival analysis of patient clusters identified from our expression data ( Figures 1D-F Figure 4 was performed using the survminer R package. p-values for the KM-plots were computed using logrank test implemented in the function surv_pvalue(). All KMplots in the study were plotted using the function ggsurplot().

HOX Gene Survival Analysis
Activity of HOXA and HOXB gene clusters were scored in all datasets using ssGSEA (5,6) implemented in GSVA (7). In each cohort, for each HOX gene cluster, the samples were split into 2 groups (the top and bottom 50 th percentile) based on the activity scores obtained from ssGSEA. Survival differences between the groups were quantified as described above.

Estimation and Comparison of Metabolic Activity Between Patient Groups
Pathway activity scores were calculated using 91 gene sets, including 85 KEGG (12) metabolic pathways and 5 literaturecurated gene sets: glucose deprivation, glycolysis, hypoxia, mTOR, and oxidative phosphorylation. The pathway activity score was calculated using ssGSEA using GSVA (7). Differential activity of pathways among clusters was identified using the Wilcoxon rank-sum test based on each cell's pathway activity scores. p-values were adjusted using the Benjamini-Hochberg method, and the threshold of significance was set to q < 0.05.

Foundation Medicine Assay
Samples were submitted to a Clinical Laboratory Improvement Amendments-certified, New York State-accredited, and College of American Pathologists-accredited laboratory (Foundation Medicine, Cambridge, MA) for next-generation sequencingbased genomic profiling. Samples were processed in the protocol defined by hematologic cancers as previously described (13). Briefly, after DNA and RNA extraction from bone marrow aspirate, adaptor-ligated DNA underwent hybrid capture for all coding exons of 465 cancer-related genes. cDNA libraries prepared from RNA underwent hybrid capture for 265 genes known to be rearranged in cancer. Captured libraries were sequenced to a median exon coverage depth of >500× (DNA) or~3M unique reads (RNA) using Illumina sequencing, and resultant sequences were analyzed for base substitutions, small insertions and deletions (indels), copy number alterations (focal amplifications and homozygous deletions), and gene fusions/ rearrangements, as previously described (13,14). Frequent germline variants from the 1000 Genomes Project (dbSNP142) were removed. To maximize mutation-detection accuracy (sensitivity and specificity) in impure clinical specimens, the test was previously optimized and validated to detect base substitutions at ≥5% mutant allele frequency, indels at ≥10% mutant allele frequency with ≥99% accuracy, and fusions occurring within baited introns/exons with >99% sensitivity (14). Known confirmed somatic alterations deposited in the Catalog of Somatic Mutations in Cancer (COSMIC v62) were called at allele frequencies ≥1% (15). Patients did not provide consent for raw data release. Therefore, associated raw sequence data is not shared. However, variants from a subset of the samples used in this analysis (>18,000) have been deposited in the Genomic Data Commons (accession #phs001179).

Mutational Analysis
Mutation data were binarized to indicate the presence or absence of a mutation. Genes mutated in less than 10% of the samples were excluded from the analysis. Fisher's test was used to quantify the association between the presence of a gene mutation and cluster membership. p-values were corrected using FDR and mutations significantly associated with a cluster were identified at q < 0.1. Oncoplot was generated using R package maftools (16). Gene Mutations that showed significant association with FLT3 mutation were identified using Fisher's test, p-values were corrected for multiple testing using FDR and significant associations were identified at q < 0.1. An odds ratio (OR) > 1 indicates co-occurrence and OR < 1 indicates mutual exclusivity. <-2, respectively, at q < 0.05. The activity of these gene sets was scored in each of the validation cohorts using ssGSEA, and each sample was then assigned a score indicating the difference between the activity of the upregulated and downregulated gene sets. In each of the validation cohorts, samples were stratified into 2 groups indicating more Group 2 like (top 50 th percentile) and less Group 2 like (bottom 50 th percentile). These 2 groups were then used to perform survival analysis as described above. Differential expression between the groups was determined and pathway analysis performed as described above.

Differential Expression of Proteins
RPPA data used in the study were previously published and generated by our group for the cohort of 81 patients (19). To identify differentially expressed proteins between 2 groups, we computed the difference in mean expression of each protein with a p-value using the Wilcoxon rank-sum test. p-values were corrected using FDR. Upregulated and downregulated proteins were identified as difference in mean >75th percentile and <25th percentile, respectively, at q < 0.1.

Cell Line Molecular and Drug Response Data
Cancer cell line drug response data were obtained from Rees et al. (20). The response of each cell to a drug was quantified as the area under the drug response curve (AUC). High AUC indicated poor response and low AUC indicated better response. Protein expression from RPPA for these cell lines was obtained from Cancer Cell Line Encyclopedia (21) using the DepMap portal (https://depmap.org/portal/download/). Correlations between expression data and drug response were computed using Spearman correlation.

Unsupervised Clustering to Identify Prognostic Clusters Independent of FAB and ELN Classification
Unsupervised clustering of the 81 newly diagnosed AML patients based on the top 1000 variably expressed genes initially revealed two distinct patient clusters. The clustering was highly associated with FAB morphologic classification (Fisher p = 9.2e -5 , Supplementary Figure 2A). The FAB-associated clustering of patients persisted even when more genes were added to the unsupervised clustering, suggesting a significant impact of lineage and morphology on transcriptomic-based clustering (Supplementary Figure 2B). To assess whether this observation was also relevant in other AML cohorts, we conducted unsupervised clustering of expression profiles in M1/M2 and M4/M5 patients from TCGA AML cohort (18). Indeed, unsupervised clustering of TCGA AML cohort revealed similar high dependency on FAB classification (Fisher p = 2.24e -10 , Supplementary Figure 2C). These findings suggested that the genes associated with lineage morphology in AML were contributing to AML transcriptomic-based clustering, and hence could be masking AML subgroups that share similar underlying biology but different lineages.
To address this concern, we used linear regression models and identified genes whose expression profiles were associated with FAB classification (q < 0.05; see Methods). Using enrichment analysis in cell lineage and morphology signatures (10, 11), we found that these genes were highly enriched for myeloid and monocytic lineage differentiation (Supplementary Figure 3A). To decouple lineage-associated genes from AML patient clustering and to identify biologically similar AML patients independent of lineage, we excluded the lineage-associated genes and re-clustered AML patients based on the expression of top variable genes (735 genes, variance > 5). This approach led us to identify 3 distinct patient clusters (hereafter referred to as group 1, group 2, and group 3) that clustered independently of FAB classification (Fisher p = 0.251; Figure 1A and Supplementary Figure 2B). The clinical and demographic characteristics of these 3 clusters are summarized in Table 1. Briefly, after correction for multiple hypothesis testing, there were no significant differences in distribution for FAB classification, ELN classification, sex, antecedent hematologic disease, or treatment group. Group 1 patients were the oldest (mean age 68.9 ± 9.9 years), followed by group 2 patients (64.4 ± 15.1 years) and group 3 patients (57.3 ± 15.6 years; p = 0.030, q = 0.073). These findings suggested that the patient grouping was inherently driven by the transcriptomic signatures independent of lineage or clinical characteristics.

Outcomes of AML Patient Groups
We next evaluated whether the 3 AML groups had distinct clinical outcomes. Univariate Cox survival analysis indicated that sample clustering was associated with differential overall survival, event-free survival, and remission duration (p < 0.05). To control for other confounding clinical factors, we first used univariate survival analysis to identify clinical variables associated with survival (p < 0.05; Supplementary Table 1) and then built a multivariable Cox survival model with these variables. Survival trends observed in the clusters in univariate analysis were re-captured after controlling for other confounding variables associated with survival (see Methods). Group 2 was characterized by improved overall survival (median 55.86 weeks; p = 0.037), event-free survival (median 55.85 weeks; p = 0.006), and remission duration (median 111.71 weeks; p = 0.03 relative to group 1, whereas no significant difference was observed between group 1 and group 3 ( Figures 1D-F).

Inflammatory and Immune Pathways Enriched in Group 2 Patients
To explore transcriptomic signatures that were associated with improved outcomes in group 2, we conducted differential gene expression profiling comparing group 2 with groups 1 and 3, revealing 70 upregulated genes and 322 downregulated genes (q < 0.05, absolute log 2 fold change > 2; Figure 2A and Supplementary Table 1). GSEA of hallmarks pathways demonstrated significant activation of immune signaling in group 2 compared with groups 1 and 3 ( Figure 2B). To determine whether the signal was confounded by a single group, we next compared group 2 with group 1 and 3 each separately. Indeed, we saw that patients in group 2 consistently had activation of immune and inflammatory pathways, including interferon-alpha and interferon-gamma, compared with each of the other groups, suggesting that intrinsic immune activation in group 2 was associated with improved outcomes (Figures 2C, D). HOXA and HOXB gene clusters were significantly downregulated in group 2 compared with groups 1 and 3 (Supplementary Table 1). Furthermore, lower expression of HOXA and HOXB gene clusters was associated with better outcomes across all patients in our cohort (Supplementary Figure 4A).

Validation of Immune Signatures in Independent Cohorts
To validate the finding that immune signatures were associated with improved outcomes in AML, we used 2 independent AML cohorts (17,18) with available transcriptomic and clinical data (Supplementary Table 1; see Methods). We then compared outcomes based on median scores derived from ssGSEA (7) from genes differentially expressed in group 2 relative to groups 1 and 3 (see Methods). Higher-scoring patients (i.e., more similar to group 2) had improved survival in both validation cohorts ( Figures 3A-C). Differential pathway activity analysis between these groups revealed activation of immune-associated pathways, consistent with observations in group 2 in our cohort and further validating our finding that immune activity was the main differential factor in outcomes ( Figures 3D, E). Similarly, patients with lower HOXA and HOXB gene scores had improved outcomes (Supplementary Figures 4B, C). These data indicate that activation of immune-associated pathways and suppression of HOX genes in AML are associated with improved outcomes in patients.

Pairwise GSEA Comparisons Revealing Metabolic Signatures in Group 3
To further characterize the biologic pathways that distinguished patient groups, we conducted pairwise GSEA between individual groups of patients. Group 3 patients had significant activation of metabolic activity compared with group 1 and with group 2 patients ( Figures 4A, B). Although patients in group 3 and group 1 had similarly worse outcomes, activity of metabolic pathways was significantly higher in group 3 patients, especially for oxidative phosphorylation and fatty acid metabolism ( Figure 4B), suggesting that metabolism was a distinguishing feature between these groups. Furthermore, activity in the mTOR pathway, a major regulator of cancer metabolism (23), was significantly higher in group 3 than in group 1.
To further characterize metabolic activity in group 3, we compared the metabolic pathway activity scores between group 3 and groups 2 and 1. Relative to group 2, group 3 showed significant activation of energy metabolism pathways such as glycolysis, TCA cycle, biosynthesis of unsaturated fatty acids, and gluconeogenesis. Relative to group 1, group 3 was characterized by activation of oxidative phosphorylation, lipid metabolism (ether lipid metabolism, steroid hormone biosynthesis), and pyrimidine metabolism. Group 3 also showed activation of galactose metabolism and linoleic acid metabolism relative to both group 1 and group 2. These findings indicate that patients in group 3 may be characterized by augmented activation of pathways involved in energy production and metabolism ( Figure 4C).

Proteomic Assessment to Distinguish Group 3 and Group 1
All 81 AML patients had previously reported RPPA profiling (19) at the same time point of RNA and genomic sequencing. We therefore used this orthogonal molecular platform to delineate protein-based molecular pathways that could differentiate these 3 AML patient groups (Supplementary Figure 5). Group 2 had downregulation (q = 0.057 and difference in mean = -0.832) of only CTNNB1 when compared with group 1 (Supplementary Figure 5A) and downregulation of MTOR and MTOR.pS2448 compared with group 3 (Supplementary Figure 5B). These findings suggested that unlike RNA, RPPA was not able to delineate many proteomic differences between group 2 and groups 1 and 3, most likely owing to the smaller number of genes assayed. We next evaluated differences in RPPA signatures between group 1 and group 3 patients who had similar outcomes, compared with group 2 patients. We identified 28 upregulated proteins and 19 downregulated proteins (q < 0.1, see Methods) in group 3 relative to group 1 ( Figure 5A). MTOR.pS2448, which signals for activation of both mTOR and PI3K pathways (23)(24)(25), was over-expressed in group 3. This was consistent with the mTOR upregulation in the group 3 transcriptomic signature and suggested an active PI3K-AKT-mTOR signaling axis in group 3 patients. In addition, we found over-expression of proteins in the MAPK signaling cascade (MAP2K1_2pS217_211, MAPK9), apoptosis (BAX, CASP8, MCL1, BAK1, BAD.pS155), and BRAF in group 3 compared with group 1 ( Figure 5A). (CASP3.cI175) suggested inhibition of apoptosis in group 3 patients ( Figure 5A), consistent with the higher absolute blast count observed in group 3 ( Table 1).
MCL1 overexpression is associated with venetoclax resistance and can be seen in FLT3-mutated AML (26). We therefore checked for a correlation between mTOR and MCL1 expression. Indeed, MTOR.pS2448 expression was positively correlated with MCL1 expression in RPPA across all patients ( Figure 5B and Supplementary Figure 5B). This is significant because resistance to venetoclax can be mediated via MCL1 (26). We therefore evaluated whether resistance to venetoclax in myeloid and lymphoid cell lines is also associated with mTOR overexpression. Phosphorylated S6 p235-236 and p240-244, which are surrogate markers for mTOR activation, were positively correlated with venetoclax AUC (r = 0.36, p = 0.001 and r = 0.32, p = 0.003, respectively), suggesting that mTOR activation was associated with resistance to venetoclax (Figures 5C, D).

DISCUSSION
Clinical outcomes of AML patients are largely determined by patient characteristics such as age, performance status, and the underlying cause of the AML (27). ELN classification categorizes AML patients on the basis of cytogenetic and mutational profiles (22,28). However, almost one-third of AML patients lack prognostic genomic features (29). Also, one-third of AML patients have survival outcomes that deviate more than 20% from their ELN risk category (29). Therefore, identifying orthogonal molecular approaches contributing to AML heterogeneity independent of clinical and genomic features may reveal biologic processes impacting outcomes and identify novel therapeutic strategies.
In previous studies using RNA profiling to classify AML patients (17,30), gene expression clustering was strongly correlated with mutational and cytogenetic profiles, as well as lineage and morphologic groups as classified by FAB. Similarly, we identified cluster correlation with FAB classification in TCGA and in our cohort. We therefore hypothesized that by excluding the expression profiles of genes associated with lineage and morphologic characteristics in AML, we can potentially uncover AML patient groups that share biologic pathways independent of morphology and lineage. In the current study, we undertook a comprehensive and unique approach to decouple lineage-related genes, combined with RPPA and targeted mutation analysis, and we identified immune and metabolic signatures that contributed to AML heterogeneity and impacted outcomes.
Our analysis identified a group of AML patients (group 2) who had significantly improved overall survival, event-free survival, and remission duration. This patient group was characterized by increased frequency of GATA2 mutations and an inflammatory and immune phenotype indicated by enrichment for interferon- alpha and interferon-gamma, tumor necrosis factor-alpha, and interleukin-6/JAK/STAT3 signaling pathways. Interestingly, germline deficiencies in GATA2 leads to myeloid malignancies with an immunodeficient phenotype (31). However, the exact mechanism by which GATA2 mutations could confer a remodeled immunologic phenotype in AML remains unclear and warrants further investigation. Supported by previous studies demonstrating distinct immune cell activity among AML patients with different outcomes (32). Our findings suggested that the intrinsic inflammatory and immune microenvironment in AML was associated with better outcomes and responses to therapy. Recent work demonstrated the complex immunologic landscape of hematologic malignancies with a subset of AML patients having a distinctively high NK/T cell cytotoxic activity {Dufva:2020bg}. Further, recent work demonstrated that an immune-infiltrated signal was associated with improved outcomes in AML patients but not associated with ELN (33). However, in our results, which were validated in 2 independent cohorts, patients could be grouped by shared biologic characteristics independent of ELN classification or clinical variables such as age.
AML patients in group 2 also had significant downregulation of HOX genes, which corresponded with improved outcomes, consistent with previous studies (30,34,35). Inflammation and cytokine production in a canine model was associated with reduced HOXA gene expression (36), and restoring HOX gene expression may oppose inflammation (37) or hamper innate immunity by inhibiting granulopoiesis (38). These studies, although not conducted in a leukemia or cancer model, suggested that inflammation and HOX genes may be co-regulated, but the exact mechanism linking these two pathways is still unclear.
Our analysis also revealed 2 distinct patient groups (groups 1 and 3) that had similarly worse outcomes compared with group 2 but distinct underlying biology. Our orthogonal RPPA and genomic analysis coupled with transcriptomic profiling revealed that these 2 groups can be distinguished by increased metabolic activity and overexpression of mTOR and MCL1 proteins in group 3. However, only group 3 had FLT3 enrichment, contrary to previous transcriptomic studies (17), demonstrating that multiple transcriptional clusters may harbor FLT3 mutations. Therefore, our approach of decoupling the lineage-associated genes generated a better representation of the transcriptomic profile associated with FLT3 mutations. This is also consistent with the proliferative phenotype conferred by FLT3 mutations in AML (39). FLT3 activates downstream mTOR signaling (40,41), and this signaling is involved in metabolic reprogramming (42). Inhibiting mTOR can also lead to inhibition of MCL1, but the exact mechanism is not clear, although it is thought to involve AKT-dependent regulation of MCL1 (43,44). mTOR inhibition also has antitumor activity in AML (45)(46)(47), and our data suggest that mTOR activation is associated with venetoclax resistance. The finding is of high importance because it suggests an alternative therapeutic target to overcome venetoclax resistance (26). Furthermore, mTOR inhibition could be a surrogate for inhibiting MCL1, especially given that direct MCL1 inhibitors have cardiac and gastrointestinal toxicities that have hampered their recent clinical development.
Our dataset comprised 81 samples from patients mostly treated with intensive chemotherapeutic regimens (80% with cytarabinebased regimens). Given the relatively small sample size, it is likely that we missed subtle transcriptomic and proteomic perturbations that might be biologically important. Furthermore, we used targeted sequencing of AML-associated genes to study DNA lesions in the cohort. Although this approach allowed us to study important AML-associated mutations in these patients, it precluded analysis of the full spectrum of mutations in these patients or the associated mutational processes, although most if not all of the myeloid mutations can be captured by this assay. Outcomes in patients with FLT3 mutations (primarily group 3) would have been improved had FLT3 inhibitors been used. However, at the time of sample collection and AML diagnosis, none of the FLT3 inhibitors were approved or under investigation in a trial. Nevertheless, our study, which combined RPPA, genomic profiling, and transcriptomic profiling with extensive and long clinical follow-up data, provided a unique clinical dataset for further interrogation. Our approach to decouple morphology from lineageassociated genes in AML revealed distinct groups of AML patients that share biologic pathways independent of ELN classification, antecedent hematologic disorders, or other clinical and molecular variables that are known to impact outcomes. We also used orthogonal RPPA analysis to differentiate patients with similarly worse outcomes in groups 1 and 3, revealing an mTOR-associated metabolic profile that can be potentially targeted. Our findings demonstrate that employing alternative classifications for AML patients can provide insight into AML heterogeneity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. This data can be found at: https://www.ncbi.nlm.nih. gov/geo/query/acc.cgi?acc=GSE165656.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by IRB approval MD Anderson. The patients/ participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
HA, VM, and RW analyzed the data. YH and SL conducted the metabolic profiling. FW and JZ conducted DNA analysis. YQ and CH prepared samples. AQ analyzed RPPA. AQ and CB conducted sequencing. ND, MK, and AF contributed to clinical and integrative analysis. KC, LW, SK, and HA supervised the study. All authors contributed to the article and approved the submitted version.

FUNDING
The work was funded in part by a T32 National Institutes of Health fellowship to HA. This project has been made possible in part by grant RP180248 to KC from Cancer Prevention & Research Institute of Texas and grant U01CA247760 to KC. RNA sequencing and Foundation Medicine assay sequencing were partially done via funding from Genentech to SK.  Figure 1A for various numbers of top variably expressed genes and their association with FAB status, inferred using the Fisher test. The analysis was also performed for clusters corresponding to Figure 1A (corrected_clusters). The barplot is the -log 10 p values obtained from the Fisher test. (C) Transcriptome clustering analysis in The Cancer Genome Atlas (TCGA) acute myeloid leukemia cohort using the top 1000 variably expressed genes. Consistent with observations in our data (A, B), identified clusters showed a strong association with FAB status (Fisher p = 2.24e-10). are associated with FLT3 mutations (q < 0.1), identified using Fisher's test followed by correction of p-value by FDR.