Aggrephagy-related patterns in tumor microenvironment, prognosis, and immunotherapy for acute myeloid leukemia: a comprehensive single-cell RNA sequencing analysis

Acute myeloid leukemia (AML) is a complex mixed entity composed of malignant tumor cells, immune cells and stromal cells, with intra-tumor and inter-tumor heterogeneity. Single-cell RNA sequencing enables a comprehensive study of the highly complex tumor microenvironment, which is conducive to exploring the evolutionary trajectory of tumor cells. Herein, we carried out comprehensive analyses of aggrephagy-related cell clusters based on single-cell sequencing for patients with acute myeloid leukemia. A total of 11 specific cell types (T, NK, CMP, Myeloid, GMP, MEP, Promono, Plasma, HSC, B, and Erythroid cells) using t-SNE dimension reduction analysis. Several aggrephagy-related genes were highly expressed in the 11 specific cell types. Using Monocle analysis and NMF clustering analysis, six aggrephagy-related CD8+ T clusters, six aggrephagy-related NK clusters, and six aggrephagy-related Mac clusters were identified. We also evaluated the ligand-receptor links and Cell–cell communication using CellChat package and CellChatDB database. Furthermore, the transcription factors (TFs) of aggrephagy-mediated cell clusters for AML were assessed through pySCENIC package. Prognostic analysis of the aggrephagy-related cell clusters based on R package revealed the differences in prognosis of aggrephagy-mediated cell clusters. Immunotherapy of the aggrephagy-related cell clusters was investigated using TIDE algorithm and public immunotherapy cohorts. Our study revealed the significance of aggrephagy-related patterns in tumor microenvironment, prognosis, and immunotherapy for AML.


Introduction
Leukemia is a malignant clonal disease originating from hematopoietic stem cells (1). The affected cells have uncontrolled proliferation, impaired differentiation, and blocked apoptosis, so the affected cells are stuck in different stages of cell development (2). The incidence and mortality rate of leukemia are both high. The report showed that in 2018 alone, there were 437000 new cases of leukemia and 309000 new deaths from leukemia worldwide (3).
Leukemia can be classified as acute (4) or chronic (5) according to its course. Leukemias can be divided into myeloid leukemia and lymphocytic leukemia according to the cells involved (6). Acute myeloid leukemia (AML), the most common leukemia in adults, is a highly heterogeneous disease (7). French-American-British (FAB) defined eight subtypes (M0 to M7) based on the morphological and cytological characteristics of leukemia cells (8). According to genetics, morphology, immunophenotype and clinical manifestations, World Health Organization (WHO) classified leukemia into six main types and more than 20 subtypes (9). In addition, the prognosis of AML can be divided into good, moderate and poor groups based on cytogenetic characteristics (1,6), but the prognosis of different patients in each group is still very different, indicating that the gene expression pattern of leukemia is very complex.
Tumor microenvironment (TME) is the internal environment that tumor cells depend on for survival and development. Besides tumor cells, it also contains many non-malignant cells and some soluble factors, which play an important role in promoting tumor occurrence, progression and immune escape (10). Tumor microenvironment mainly includes immune microenvironment, including myeloid-derived suppressor cells (MDSCs), tumorassociated macrophages (TAMs), tumor-associated neutrophils (TANs), dendritic cell (DC), T cell, B cell, and Natural Killer (NK) cell, and non-immune microenvironment, including cancerassociated fibroblasts (CAFs), extracellular matrix, mesenchymal stem cells, and various secreted factors (11)(12)(13)(14). Therefore, tumor is a complex mixed entity composed of malignant tumor cells, immune cells and stromal cells, with intra-tumor and inter-tumor heterogeneity. Since bulk tissue is composed of various cells, its sequencing cannot reveal the function or cell state of a specific cell population (15). Therefore, the detection of genome, transcriptome, epigenome and proteome at the cellular level can overcome the limitations of the traditional bulk level and conduct more detailed analysis at the cellular and molecular level (15). Single-cell RNA sequencing (scRNA-Seq) enables non-targeted quantification of transcripts in a single cell. Single-cell RNA sequencing enables a comprehensive study of the highly complex tumor microenvironment, which is conducive to exploring the evolutionary trajectory of tumor cells, the complex interactions between tumor cells and tumor microenvironment, and the spatiotemporal functional relationships between different cell population types (16,17). Bioinformation analysis can identify new cell types, identify rare cell populations, and construct cell status and phylogenetic maps through computational methods such as highdimensional data reduction, unsupervised clustering, phylogenetic modeling, locus inference, RNA rate analysis, lineage tracing, and ligand-receptor interaction mapping (16,17).
Autophagy is an important feedback process of cells under pressure. Autophagy realizes self-digestion and catabolism by phagocytic organelles and degradation of cell contents, so as to maintain the homeostasis balance of cells (18,19). Autophagy plays an important role in maintaining vital activities and immune function and is closely related to tumors and other diseases. The common types of autophagy include macroautophagy, microautophagy and chaperonemediated autophagy (20). Aggrephagy is a kind of selective autophagy, which is the only way to clear protein aggregates. Once the function of molecular chaperone and ubiquitin proteasome is limited or the clearance efficiency of misfolded proteins is lower than the production rate, protein aggregates will be formed, and the aggrephagy needs to be activated to degrade them (21).
In this study, the relationship between aggrephagy-related genes and cell subsets of TME (such as T cells, Natural Killer cell, and Myeloid cells) for AML was investigated using data of single-cell RNAsequencing (scRNA-seq) from GSE116256. After Nonnegative Matrix Factorization analysis, the characteristics of the aggrephagy-mediated cell clusters in Pseudotime trajectory, cell-cell communication, ligandreceptor links, and immunotherapy were investigated.

Materials and methods
Downloading and preprocessing for data of acute myeloid leukemia The samples source with single-cell RNA-sequencing (scRNAseq, GSE116256) and expression profiles (GSE63270 and GSE12417) were downloaded from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) of The National Center for Biotechnology Information (NCBI) (22). We enrolled three normal samples and ten patients with acute myeloid leukemia from GSE116256 for analysis of scRNA-seq (23)(24)(25). There were 104 normal and acute myeloid leukemia (42 populations and 62 leukemic populations) included in GSE63270 dataset (26). GSE12417 dataset contained the analysis of 79 samples of bone marrow or peripheral blood mononuclear cells from adult patients with untreated acute myeloid leukemia (27,28). In addition, the expression profiles and clinical information were acquired from TCGA-LAML cohort, including 151 patients with acute myeloid leukemia (29, 30).

Dimensionality reduction and annotation of single cell for acute myeloid leukemia
First, the data of single cell was filtered by setting each gene to be expressed in at least three cells, and each cell to express at least 500 genes, resulting in 9891 cells. We calculated the percentage of mitochondria and Ribosomal RNA (rRNA) through the PercentageFeatureSet function of Seurat package (31). The number of genes expressed in each single cell was greater than 100 and less than 5000, and we ensured the percentage of mitochondria was less than 20%. Furthermore, the Unique Molecular Identifier (UMI) of the single cell was at least greater than 100, resulting in 9886 cells. Subsequently, we used the method of log-normalization to standardize the single-cell data from each of the 13 samples. The highly variable features were identified by FindVariableFeatures function (32) based on variance stabilization transformation (VST). The genes were then scaled by using the ScaleData function for all genes. We utilized RunPCA function for PCA dimension reduction to find anchors. The FindNeighbors function with dim=15 and FindClusters function with Resolution=0.1 was used to luster cells. Ulteriorly, the RunTSNE function was used to conduct t-SNE (T-Distribution Stochastic Neighbour Embedding) dimension reduction analysis and the RunUMAP function was used to conduct UMAP (Uniform Manifold Approximation and Projection) reduction analysis. The marker genes for single cell were supplied by SingleR package (33) and the classical marker from the published literature (25).

Pseudotime trajectory analysis for the aggrephagy-mediated cell clusters
Monocle R package was applied for the data of single cell to explore the correlation of aggrephagy-related genes and pseudotime trajectories (34). The graphs for the pseudotime trajectories of specific cell with aggrephagy-related genes were plotted using the function from Monocle R package, such as plot_pseudotime_heatmap and so on.
Nonnegative matrix factorization of aggrephagy-related genes in single cell for acute myeloid leukemia Based on the expression matrix of the scRNA-seq, dimension reduction analysis of aggrephagy-related genes in each cell clusters were conducted employing NMF (Nonnegative Matrix Factorization) R package (35,36), thus displaying the effect of aggrephagy-related genes in single cell for acute myeloid leukemia.
Identifying the marker genes of single cell for acute myeloid leukemia FindAllMarkers function was applied to identify he marker genes of single cell for acute myeloid leukemia (31). The aggrephagy-mediated cell clusters were identified based on differentially expressed genes (DEGs) with log Fold Change (logFC) and aggrephagy-related genes. The NK cell subtypes were summarized from the published literature of Huan Liu et al (37).

Analysis of transcription factors for aggrephagy-mediated cell clusters
SCENIC was a tool for simultaneously reconstructing gene regulatory networks and identifying stable cell states from singlecell RNA-seq data (38). The gene regulatory network was inferred based on co-expression and DNA motif analysis, and then network activity was analyzed in each cell to identify cell status (38). We carried out analysis of transcription factors (TFs) for aggrephagymediated cell clusters for acute myeloid leukemia through pySCENIC package (39-41). RcisTarget R package and two genemotif rankings (hg19-tss-centered-10 kb and hg19-500 bpupstream) was used to identify binding motifs of TFs in the gene list for acute myeloid leukemia (42,43). The threshold value for the TFs was set as Benjamini-Hochberg false discovery rate (BH-FDR) <0.05.

Cell-cell communication analysis among cell subsets for acute myeloid leukemia
The signaling inputs and outputs among the cell types and aggrephagy-mediated cell clusters were assessed by applying CellChat package (44) and CellChatDB database (45). The netVisual_circle function was utilized for evaluating the strength of cell-cell communication networks among cell subsets (44, 45). In addition, the ligand-receptor interactions among the specific cell subsets were estimated via the netVisual_bubble function (44, 45).

Prognostic analysis of the aggrephagyrelated cell clusters for acute myeloid leukemia
Based on the data of scRNA, Gene Set Variation Analysis (GSVA) was applied to compute the signature scores involved in aggrephagy for public database (46). We carried out Cox proportional hazard regression to evaluate the prognosis for the aggrephagy-related cell clusters (47). The Kaplan-Meier curves was plotted through the survminer R package.

Immunotherapy analysis of the aggrephagy-related cell clusters for acute myeloid leukemia
We used TIDE (Tumor Immune Dysfunction and Exclusion) algorithm to analyze the immune checkpoint blockade immunotherapeutic for the aggrephagy-related cell clusters (48). We also reviewed the published literature to validate the prognostic and therapeutic effects of each cell subtype using real-world immunotherapy cohorts (49-60).

Statistical analysis
The continuous or category variables were compared using Student's t-test, Wilcoxon rank sum test, Kruskal-Wallis's test, or Chi-square test. The log-rank test was used for survival analyses.

Dimensionality reduction and annotation of single cell for acute myeloid leukemia
We carried out dimensionality reduction and annotation of single cell for acute myeloid leukemia as described in the materials and methods section. We ensured that the number of genes expressed in each single cell was greater than 100 and less than 5000, the percentage of mitochondria was less than 20%, and the Unique Molecular Identifier (UMI) of the single cell was at least greater than 100, resulting in 9886 cells. Supplementary Figures S1A-B was the statistical diagram of cell filtration, which could be seen to meet all thresholds set above (Supplementary Figures S1A, B). The highly variable features were identified via FindVariableFeatures function based on VST, and the top ten highly variable genes among the single cell were marked out in the volcano plot, including IGLL5, HBB, JCHAIN, HBG1, HBG2, HBD, HBA2, CLC, CA1, and HBA1 (Supplementary Figure S1C). Ulteriorly, PCA analysis was carried out on the highly variable genes, we used the Elbow algorithm to carry out the Standard Deviation based on the highly variable genes (Supplementary Figure S1D). The RunTSNE function was used to conduct t-SNE dimension reduction analysis and the RunUMAP function was used to conduct UMAP reduction analysis, thus identifying a total of 18 cell subsets (Supplementary Figure S1E). Afterwards, the marker genes were used to annotate the specific cell types, thus identifying 11 specific cell types, including T cell, Natural  Figure 1A). We plotted the correlation network for the number of interactions among the 11 specific cell types ( Figure 1B). Figure 1C visually showed the proportion of different cell types in each sample. Finally, we created the heat map to show the expression of the aggrephagy-related genes in different cell types ( Figure 1D). We could see that there were several aggrephagy-related genes that were highly expressed in the 11 specific cell types, including HSP90AA1, RPS27A, UBA52, UBB, UBC, and VIM ( Figure 1D). We displayed the global view of the expression pattern for marker genes gained as described in the methods section, reflecting the dynamic features of each cell subsets (Supplementary Figure S2).

Pseudotime analysis for aggrephagymediated NK cells
There was a total of 1067 cells in the NK cell type. Using UMAP reduction analysis, the 1067 cells in the NK cell type could be clustered into eleven cell clusters (Supplementary Figure S4). Based on the aggrephagy-related genes, six clusters were identified using Monocle analysis ( Figure 3A

Discussion
Leukemia is a kind of hematologic malignant disease with hematopoietic stem cell clonal proliferation. Clonal leukemia cells proliferate and accumulate in bone marrow and other normal hematopoietic tissues, inhibit hematopoietic function, and penetrate into other non-hematopoietic tissues and organs through blood circulation, resulting in organ failure and poor prognosis. The clinical manifestations of AML include anemia, bleeding, infection fever and other symptoms. AML is a common type of leukemia, accounting for 80% of acute leukemia, with a high incidence in children (61). Patients with AML tend to die within one year of diagnosis, with a high mortality rate (62). The pathogenesis of AML is complex and diverse, including chemical substances, radioactive substances, genetic factors, gene mutations, abnormal signaling pathways, epigenetic regulation, leukemia microenvironment or immune imbalance. Autophagy is a catabolic process of intracellular substances mediated by lysosome, which has a bidirectional effect in AML. Autophagy can remove abnormal organelles, reduce the accumulation of harmful substances, and effectively prevent cell cancer. However, autophagy can also enable AML cells to obtain various substances and energy, which can help malignant cells to fight against the lack of nutrition and energy caused by their own high metabolism, and promote the growth and proliferation of AML cells. The autophagy levels in different stages of AML were different. How to regulate the A B FIGURE 6 Immunotherapy analysis of the aggrephagy-related cell clusters for acute myeloid leukemia based on TIDE algorithm. (A) Comparison for the response status of immune checkpoint blockade therapy for patients with AML among the aggrephagy-related cell clusters. (B) Comparison for the OR rates of AML patients in GSE12417 and TCGA-LAML cohorts among the aggrephagy-related cell clusters. *P < 0.05; **P < 0.01; ****P < 0.0001. progression of AML, remove AML cells and improve the therapeutic effect by regulating autophagy level is the focus of AML prevention and treatment.
The survival and apoptosis of immune cells, the expression of immunomodulators and the change of tumor microenvironment (TME) all affect the occurrence and development of AML (25). Immune cells monitor abnormal cells in the body and play an immune effect to eliminate them (63). For example, nature killer (NK) cells recognize and kill tumor cells by mediating cytotoxic effects (64). Tumor cells can evade immune recognition and attack by modifying their own surface antigens and changing the microenvironment around tumor tissue, that is, immune escape of tumor. The occurrence of AML is also closely related to immune escape. By changing the activity of immune cells or regulating the expression of immune molecules, the function of immune cells is affected, thus achieving immune escape of AML cells (65). It can effectively treat AML by inhibiting the cell immune microenvironment and enhancing the immune response (66). To elucidate the relationship between the occurrence of AML and the immune response is of great significance for the development of immunotherapy in patients with AML. In this study, we identified 11 specific cell types, including T cell, Natural Killer (NK) cell, Common Myeloid Progenitor (CMP) cell, Myeloid cell, Granulocyte Monocyte Progenitor (GMP) cell, Megakaryocyte Erythroid Progenitor (MEP) MEP, Promonocyte (Promono) cell, Plasma cell, Hematopoietic Stem Cell (HSC) cell, B cell, and Erythroid cell for AML. Four subgroups of T cells were obtained by re-clustering annotation using t-SNE dimension reduction analysis, including CD8+ T cell, CD4+ T cell, natural killer (NK) cells, and Regulatory T (Treg) cells. NK cell type could be clustered into eleven cell clusters. As for Myeloid cells, three subgroups of Myeloid cells were obtained by re-clustering annotation using UMAP reduction analysis, including Mono (monocytes) cell, Macrophages (MAC) cell, and Dendritic cell (DC) cell. Our study identified some specific cell subtypes of AML, which will provide some reference value for exploring the TME of AML.
Transcription factors (TFs) are involved in the formation of transcription initiation complexes that affect transcription processes and thus downstream gene expression (67). AML contains many abnormal genes, some of which directly affect the expression of TFs, and some indirectly affect the combination of transcription factors and regulatory regions to play a role (68). In addition, some TFs play a role in stem cell maintenance, differentiation and maturation of hematopoietic stem progenitor cells, and abnormal expression of these TFs can lead to hematopoietic malignant transformation. Herein, we found that TFs of LGALS9, TIGIT, BTLA, and CTLA4 were upregulated in the TUBA1B+CD8 + T−C1, TFs of IL10 and CD160 were upregulated in the DYNC1H1+CD8 + T−C2, ADORA2A was upregulated in the UBE2V1+CD8 + T−C3, BTN3A1, BTN3A2, CD274, CD247, SLAMF7, LAG3, and PDCD1 were upregulated in the UBE2N+CD8 + T−C4, BTN2A2 and LAIR1 were upregulated in the Non−Aggre−CD8 + T−C6. AML is a highly heterogeneous and aggressive hematological malignancy resulting from clonal expansion of malignant hematopoietic progenitor cells in the bone marrow. Its incidence increases with age and its prognosis is poor. A variety of cytogenetic and molecular genetic abnormalities affect signaling pathways, transcription, and epigenetic regulators that induce AML. Studies have shown that various recurrent gene mutations can directly affect the expression of TFs or indirectly change the binding of TFs to regulatory regions, resulting in abnormalities of transcriptional regulatory networks (TRNs), leading to a large number of cloning and proliferation of myeloid precursor cells and stagnating in different stages of hematopoietic differentiation. The fine regulation of TFs such as TIGIT (69), BTLA (70), CTLA4 (71), IL10 (72), CD274 (73) is crucial in hematopoietic regulation and cell fate determination. Abnormal expression of these TFs can interfere with normal hematopoietic differentiation and cause the occurrence of AML. Our study provided new insights into the regulatory mechanisms of related TFs in cell subtypes of AML.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Author contributions
Our study revealed the significance of aggrephagy-related patterns in tumor microenvironment, prognosis, and immunotherapy for AML. All authors contributed to the article and approved the submitted version.

Funding
This study was supported by grants from Zhejiang Traditional Chinese Medicine Science and Technology Project (ZYJ23JS03), Natural Science Foundation of Zhejiang Province (LGF22H080009) and Guiding Science and Technology Project of Quzhou (2022014).

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

SUPPLEMENTARY FIGURE 1
Dimensionality reduction of single cell for acute myeloid leukemia. (A, B) The sequencing depth and the number of genes for single cell from three normal samples and ten patients with acute myeloid leukemia. (C) Detection of the highly variable genes across the cells in volcano plot, the top 10 genes were marked out. (D) PCA plot of scRNA-seq samples from 13 samples and the Standard Deviation of 1-20 PCs using ElbowPlot algorithm. (E) t-SNE and UMAP dimension reduction analysis identifying a total of 18 cell subsets.

SUPPLEMENTARY FIGURE 2
Dot plot showing the average and percentage expression of well-defined marker genes in different cell subsets. The color represented the average expression level of the marker genes. The diameter of the dots denoted the fractional expression.

SUPPLEMENTARY FIGURE 3
UMAP reduction analysis and cell-cell communication analysis for aggrephagy-mediated T cells.

SUPPLEMENTARY FIGURE 4
UMAP reduction analysis and cell-cell communication analysis for aggrephagy-mediated NK cells.

SUPPLEMENTARY FIGURE 5
UMAP reduction analysis and cell-cell communication analysis for aggrephagy-mediated Myeloid cells.

SUPPLEMENTARY FIGURE 6
Immunotherapy analysis of the aggrephagy-related cell clusters for acute myeloid leukemia using TIDE algorithm.