Identification of Potential Signatures and Their Functions for Acute Lymphoblastic Leukemia: A Study Based on the Cancer Genome Atlas

Objective Acute lymphoblastic leukemia (ALL) is a malignant disease most commonly diagnosed in adolescents and young adults. This study aimed to explore potential signatures and their functions for ALL. Methods Differentially expressed mRNAs (DEmRNAs) and differentially expressed long non-coding RNAs (DElncRNAs) were identified for ALL from The Cancer Genome Atlas (TCGA) and normal control from Genotype-Tissue Expression (GTEx). DElncRNA–microRNA (miRNA) and miRNA–DEmRNA pairs were predicted using online databases. Then, a competing endogenous RNA (ceRNA) network was constructed. Functional enrichment analysis of DEmRNAs in the ceRNA network was performed. Protein–protein interaction (PPI) network was then constructed. Hub genes were identified. DElncRNAs in the ceRNA network were validated using Real-time qPCR. Results A total of 2,903 up- and 3,228 downregulated mRNAs and 469 up- and 286 downregulated lncRNAs were identified for ALL. A ceRNA network was constructed for ALL, consisting of 845 lncRNA-miRNA and 395 miRNA–mRNA pairs. These DEmRNAs in the ceRNA network were mainly enriched in ALL-related biological processes and pathways. Ten hub genes were identified, including SMAD3, SMAD7, SMAD5, ZFYVE9, FKBP1A, FZD6, FZD7, LRP6, WNT1, and SFRP1. According to Real-time qPCR, eight lncRNAs including ATP11A-AS1, ITPK1-AS1, ANO1-AS2, CRNDE, MALAT1, CACNA1C-IT3, PWRN1, and WT1-AS were significantly upregulated in ALL bone marrow samples compared to normal samples. Conclusion Our results showed the lncRNA expression profiles and constructed ceRNA network in ALL. Furthermore, eight lncRNAs including ATP11A-AS1, ITPK1-AS1, ANO1-AS2, CRNDE, MALAT1, CACNA1C-IT3, PWRN1, and WT1-AS were identified. These results could provide a novel insight into the study of ALL.


INTRODUCTION
Acute lymphoblastic leukemia (ALL) is a malignant disease most commonly diagnosed in adolescents and young adults, especially in patients younger than 15 years. Despite significant improvements in the management of ALL, the long-term survival rate of ALL patients, especially adult patients, remains low (Jabbour et al., 2018;Richard-Carpentier et al., 2019). Therefore, it is of importance to understand the pathogenesis of ALL and identify novel diagnostic biomarkers and therapeutic targets for ALL.
LncRNA is a type of RNA longer than 200 nucleotides. Dysregulated lncRNA as tumor suppressor genes or oncogenes plays a key role in a variety of biological processes, such as cell proliferation, apoptosis, migration, and invasion. Increasing studies are focusing on the role and mechanism of lncRNA in the occurrence and development of ALL (Trimarchi et al., 2014;Arthur et al., 2017). For instance, lncRNA CASC15 could regulate SOX4 expression in RUNX1-translocated leukemia (Fernando et al., 2017). LncRNA HOTAIR is closely associated with acute leukemia patients' poor prognosis . LncRNA HOXA-AS2 induces glucocorticoid resistance by promoting ALL cell proliferation and inhibiting apoptosis (Zhao et al., 2019). Despite the fact that many studies have shown the diagnostic and prognostic values of lncRNAs in ALL, it is still required to further understand their regulatory mechanism. It has been widely accepted that lncRNAs indirectly regulate gene expression through targeted miRNAs (about 20 nucleotides) at the transcriptional or post-transcriptional level. Many miRNAs have been found to play a functional regulatory role in the development of ALL, such as miRNA-126 (Nucera et al., 2016), miRNA-155 (El-Khazragy et al., 2019), and miR-141-3p (Zhou et al., 2019). Yet, the regulatory interactions between lncRNAs and miRNAs in ALL require to be clarified.
The development of transcriptome analysis and RNA sequencing technology is increasing the possibility of identifying lncRNAs that may be involved in the pathogenesis of ALL. Moreover, further studies on the function of abnormally expressed lncRNAs may help understand the pathogenesis of ALL and provide important insights for the treatment of ALL. In this study, we comprehensively analyzed DElncRNAs and DEmRNAs in bone marrow samples of ALL. A ceRNA network was constructed for ALL on the basis of DElncRNA-miRNA and miRNA-DEmRNA pairs. DEmRNAs in the ceRNA network were significantly associated with ALL-related biological processes and pathways. Among DElncRNAs in the ceRNA network, eight lncRNAs including ATP11A-AS1, ITPK1-AS1, ANO1-AS2, CRNDE, MALAT1, CACNA1C-IT3, PWRN1, and WT1-AS were validated by Real-time qPCR, which could become potential diagnostic and therapeutic targets of ALL.

ALL Data Acquisition and Differential Expression Analysis
LncRNA and mRNA RNA-seq data of 494 bone marrows with ALL (hematopoietic and reticuloendothelial systems) were retrieved from TCGA repository 1 , which were derived from the IlluminaHiSeq RNA-Seq platform. All the data from three phases together, including 12 cases of phase 1, 468 cases of phase 2, and 14 cases of phase 3 were enrolled in the study. There were 321 (64.98%) males, 172 (34.82%) females, and 1 unknown (0.02%). The age distribution of the ALL group is as follows: 403 cases of 0-14 years old and 91 cases of ≥ 14 years old. All data of normal tissue samples were obtained from 407 whole blood in the Genotype-Tissue Expression (GTEx) database 2 . There were 265 (65.11%) males and 142 (34.89%) females in the control group. The age distribution of the control group is as follows: 34 cases of 20-29 years old, 34 cases of 30-39 years old, 72 cases of 40-49 years old, 130 cases of 50-59 years old, 132 cases of 60-69 years old, and 5 cases of 70-79 years old. Complete description of the multiple ethnicity groups, the biospecimen procurement methods, and sample fixation was provided in the GTEx official annotation. Differential expression analyses between ALL samples and normal samples were carried out using the EdgeR package in R (Robinson et al., 2010). The obtained p-values were corrected by false discovery rate (FDR). mRNAs and lncRNAs with adjusted p < 0.05 and | log 2fold change (FC)| ≥ 2 were considered as DEmRNAs and DElncRNAs. Volcano plots and heatmaps were generated using the ggplot2 and packages in R, respectively.

Functional Enrichment Analyses of DEmRNAs in the ceRNA Network
Gene Ontology (GO) analysis of DEmRNAs in the ceRNA network was carried out using Database for Annotation, Visualization, and Integrated Discovery (DAVID) (Dennis et al., 2003), including biological process (BP), cellular

Gene symbol
Primer sequence (5 -3 ) component (CC), and molecular function (MF). Moreover, Kyoto Encyclopedia of Genes and Genomes (KEGG) was analyzed using the clusterProfiler in R (Yu et al., 2012). Furthermore, the KEGG results were visualized using the Cytoscape plug-in ClueGO. p < 0.05 was set as the cutoff value.

PPI Network
The interactions between proteins were predicted using the Search Tool for the Retrieval of Interacting Genes (STRING) database 7 (minimum required interaction score > 0.4) (Szklarczyk et al., 2019). Furthermore, PPI networks were embodied using the Cytoscape v3.5.0 software. In addition, we used Molecular Complex Detection (MCODE) plugin to identify the hub genes in the PPI network. The criteria were set as follows: MCODE scores > 3 and number of nodes > 4. The top 10 hub genes were identified using the ranking method of degree.

Real-Time qPCR
Bone marrow samples were isolated from 25 ALL patients and 15 healthy participants and red blood cells were removed. Total RNA was extracted from bone marrow samples and then was stored at −80°C. Extracted samples were lysed using 1 ml of Trizol and placed for 5 min on ice. RNA concentration and purity were determined using a NanoDrop UV spectrophotometer. GAPDH was used as a control. Relative expression levels of lncRNAs were calculated using 2 − CT method. Differences between the two groups were analyzed using Student's t-test. p-value < 0.05 was considered statistically significant.

Identification of DElncRNAs and DEmRNAs for ALL
The workflow of this study is shown in Figure 1. According to adjusted p-value < 0.05 and | log2 fold change (FC)| ≥ 2, 2903 up− and 3228 downregulated mRNAs were identified for ALL, as shown in the volcano plot (Figure 2A and  Figure 2C) and DElncRNAs ( Figure 2D) between ALL bone marrow samples and normal samples.

Construction of ceRNA Network for ALL
The miRNAs that targeted DEmRNAs were predicted using TargetScan, miRDB, and miRTarBase databases. After integration of prediction results from the three databases, 297 DEmRNAs were intersected and identified for the construction of ceRNA network (Figure 3). Furthermore, DElncRNA-miRNA relationships were predicted using miRcode database. By comprehensively analyzing DElncRNA-miRNA and miRNA-DEmRNA pairs, a ceRNA network was constructed for ALL (Figure 4)

Functional Enrichment Analysis of DEmRNAs in the ceRNA Network
As depicted in heatmaps, there were obvious differences in the expression patterns of all DEmRNAs in the ceRNA network between ALL bone marrow samples and normal samples ( Figure 5A). Bubble diagrams showed the top 40 GO enrichment analysis results enriched by DEmRNAs in the ceRNA network ( Figure 5B). We found that these mRNAs were mainly enriched in ALL-related biological processes such as transcription, programmed cell death, apoptosis, cell cycle, proliferation, and so on. Figure 6 depicted the relationships between DEmRNAs and enriched biological processes including morphogenesis of an epithelium, kidney epithelium development, ureteric bud development, mesonephric epithelium development, and mesonephric tubule development. As for KEGG pathway Frontiers in Genetics | www.frontiersin.org FIGURE 4 | A ceRNA network construction for acute lymphoblastic leukemia. Blue rhombus represents lncRNAs; green circle represents miRNAs and red triangle represents mRNAs.
Frontiers in Genetics | www.frontiersin.org  Frontiers in Genetics | www.frontiersin.org enrichment analysis results, these DEmRNAs were mainly enriched in pathways in cancer, cell cycle, small cell lung cancer, p53 signaling pathway, Wnt signaling pathway, pentose phosphate pathway, and non-small cell lung cancer (Figures 7A,B).

Identification of Hub Genes in the PPI Network
The DEmRNAs in the ceRNA network were imported into STRING database. Then, a PPI network was constructed for ALL ( Figure 8A). Two PPI subnetworks were then constructed ( Figures 8B,C). Ten hub genes were identified for ALL, including SMAD3, SMAD7, SMAD5, ZFYVE9, FKBP1A, FZD6, FZD7, LRP6, WNT1, and SFRP1.

Correlation Between Hub Genes and DElncRNAs
Correlation analysis between hub genes and DElncRNAs was performed by corrplot package. The significant correlations between DElncRNAs and hub genes are shown in Figure 9 and Supplementary Material 5. There was strong correlation between WT1-AS and FZD7 (r = 0.751907203; p < 0.0001).  Furthermore, PWRN1 and SMAD3 were significantly correlated (r = 0.521493415 and p = 4.32E−08).

DISCUSSION
In this study, we constructed a ceRNA network for ALL based on DElncRNA-miRNA and miRNA-DEmRNA relationships. Among all DElncRNAs in the ceRNA network, eight lncRNAs were validated in ALL bone marrow samples using Real-time qPCR. These lncRNAs might become potential biomarkers for ALL.
To explore potential functions of DEmRNAs in the ceRNA network, we performed functional enrichment analysis. We found that these mRNAs were mainly enriched in ALL-related biological processes such as transcription (Gocho and Yang, 2019), programmed cell death (Hass et al., 2016), apoptosis, cell cycle , and proliferation . The DEmRNAs in these biological processes could modulate the development of ALL. Furthermore, these DEmRNAs were significantly associated with pathways in cancer, cell cycle, p53 signaling pathway, Wnt signaling pathway, and pentose phosphate pathway. It has been widely accepted that the p53 signaling pathway is a promising drug target in ALL (Trino et al., 2016). In particular, alterations of the tumor suppressor gene TP53 were frequently found in pediatric ALL (Demir et al., 2020). As for the Wnt signaling pathway, it was significantly correlated with the pathogenesis of ALL (Montano et al., 2018). Recent findings reported that inhibiting Wnt/β catenin could reverse multidrug resistance in children ALL . Moreover, the pathway is regulated by many factors. For example, miR-181a-5p could promote ALL cell proliferation via targeting the Wnt pathway (Lyu et al., 2017). Our results indicated that the DEmRNAs in the ceRNA network could be involved in the pathogenesis of ALL.
We constructed a PPI network for B-ALL on the basis of DEmRNAs in the ceRNA network. Ten hub genes were identified for ALL, including SMAD3, SMAD7, SMAD5, ZFYVE9, FKBP1A, FZD6, FZD7, LRP6, WNT1, and SFRP1. Among them, the loss of the Smad3 protein has been identified as a key feature of acute T-cell lymphoblastic leukemia (Wolfraim et al., 2004). Smad7 is a promising therapeutic target for B-cell ALL (Guo et al., 2018). Furthermore, microRNA-181a might regulate its expression for pediatric ALL (Nabhan et al., 2017). Wnt  signaling pathway can enhance hematopoietic cell proliferation (Doubravska et al., 2008). It could mediate growth and prognosis of B-cell progenitor ALL, which could be a potential treatment strategy in ALL (Khan et al., 2007;Mochmann et al., 2011). In the pathway, FZD6, FZD7, LRP6, and WNT1 were marker proteins. LRP6 has been reported to be a candidate tumor suppressor gene in pre-B ALL (Montpetit et al., 2004). Furthermore, low expression of SFRP1 was significantly associated with clinical outcomes of patients with Philadelphia-positive ALL (Martin et al., 2008).
Consistently with differential expression analysis results, eight lncRNAs including ATP11A-AS1, ITPK1-AS1, ANO1-AS2, CRNDE, MALAT1, CACNA1C-IT3, PWRN1, and WT1-AS were significantly upregulated in ALL bone marrow, indicating that these abnormally expressed lncRNAs could be involved in the development of ALL. Among them, CRNDE was upregulated in the bone marrow of B-cell precursor acute lymphoblastic leukemia (BCP-ALL) patients and BCP-ALL cell lines (NALM-6 and RS4;11). Functionally, CRNDE upregulated CREB expression by suppressing miR-345-5p, thus promoting cell proliferation and reducing cell apoptosis in BCP-ALL (Wang W. et al., 2020). A large amount of research has reported that aberrantly expressed MALAT1 was involved in a variety of cancers, such as breast cancer metastasis (Kim et al., 2018), colon cancer (Wu et al., 2018), and non-small cell lung cancer . Abnormally expressed is in significant association with poor prognosis in childhood ALL (Pouyanrad et al., 2019). Furthermore, miR-125b in combination with miR-99a and/or miR-100 could inhibit the expression of MALAT1 in vincristine-resistant children ALL cells (Moqadam et al., 2013). PWRN1 was significantly underexpressed in gastric cancer tissues and cells (Chen et al., 2018). Overexpressed PWRN1 could inhibit the proliferation and metastasis of gastric cancer cells and tumor growth. Furthermore, PWRN1 may regulate miR-425-5p expression by acting as its sponge in gastric cancer cells. ITPK1-AS1 expression could predict gastric cancer patients' survival (Hu et al., 2019). WT1-AS has been characterized as a tumor-suppressive lncRNA in several cancers including cervical squamous cell carcinoma , gastric cancer (Du et al., 2016), papillary thyroid carcinoma (Le et al., 2020), non-small cell lung cancer cell (Jiang et al., 2020), and hepatocellular carcinoma (Lv et al., 2015). Besides, WT1-AS can regulate WT1 on oxidative stress injury and apoptosis of neurons in Alzheimer's disease via inhibition of the miR-375/SIX4 axis (Wang Q. et al., 2020). However, other lncRNAs have not been reported yet. According to our results, these lncRNAs deserve more research on ALL.
However, there are several limitations in this study. First, since there was no normal control of ALL in TCGA database, data of 407 whole blood in the GTEx database were obtained as control. Given that ALL primarily affects younger individuals, the age distribution of the control group is not ideally matched with the ALL from TCGA database, which may cause confounding. Second, the sample size of this study is small, and larger clinical samples should be used to verify these lncRNAs. In addition, this study lacks functional experiments. In future research, we will further the function and clinical value of these lncRNAs in ALL.

CONCLUSION
In our study, a ceRNA network was constructed for ALL. Among all DElncRNAs in the ceRNA network, eight lncRNAs were validated in ALL bone marrow samples using Real-time qPCR, which might provide a novel insight into the further study of ALL.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Research Ethics Committee of the First Affiliated Hospital of Zhengzhou University (2019- . The study was conducted in accordance with the Declaration of Helsinki. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
WW designed the project, proposed the research concept, and wrote the manuscript. WW, CL, and FWa constructed the bioinformatic analysis, constructed the graphic images and data charts, and performed the statistical processing. CW, FWu, XL, and SG performed the experiments. All authors read and approved the manuscript and agreed to be accountable for all aspects of the research in ensuring that the accuracy and integrity of any part of the work are appropriately investigated and resolved.

FUNDING
This study was supported by grants from the National Natural Science Foundation of China (Grant No. 81700138 to WW) and the Medical Science and Technique Foundation of Health Commission of Henan Provincial (Grant No. SBGJ202003041 to WW).

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.