Novel Non-Coding Transcript in NR4A3 Locus, LncNR4A3, Regulates RNA Processing Machinery Proteins and NR4A3 Expression

NR4A3 is a key tumor suppressor in myeloid malignancy: mice lacking both NR4A1 and its family member NR4A3 rapidly develop lethal acute myeloid leukemia (AML). We identified a long non-coding transcript in the NR4A3 locus and pursued its characterization and the study of its role in leukemogenesis. We characterized this novel long non-coding transcript as a sense polyadenylated transcript. Bone marrow cells from AML patients expressed significantly reduced levels of lncNR4A3 compared to healthy controls (controls = 15; MDS = 20, p = 0.05; AML = 21, p < 0.01). Expression of NR4A3, as previously reported, was also significantly reduced in AML. Interestingly, the expression of the coding and non-coding transcripts was significantly correlated (Pearson R = 0.3771, P < 0.01). Transient over-expression of lncNR4A3 by nucleofection led to an increase in the RNA and protein levels of NR4A3, reduced proliferation in the myeloid cell lines K-562 and KG1 (n = 3 and 2, respectively; p < 0.05), and reduced colony formation capacity in primary leukemic cells. A mass spectrometry-based quantitative proteomics approach was used to identify proteins dysregulated after lncNR4A3 over-expression in K-562. Enrichment analysis showed that the altered proteins are biologically connected (n = 4, p < 0.001) and functionally associated with RNA binding, transcription elongation, and splicing. We validated the most significant results by western blot (WB). We showed that this novel transcript, lncNR4A3, regulates NR4A3, and we hypothesize that this regulatory mechanism is mediated by modulation of the RNA processing machinery.

The role of NR4A1 and NR4A3 in myeloid malignancy is particularly relevant. NR4A1/NR4A3 knock-out mice rapidly develop lethal acute myeloid leukemia (AML) (8,9) and reduced dosage of these genes, in mice, leads to a phenotype that recapitulates myelodysplastic syndrome (MDS), a hematologic disorder with increased susceptibility to AML. Additionally, leukemic blasts from AML patients have reduced expression of NR4A1/NR4A3 genes (9). This evidence strongly supports the hypothesis that the loss of tumor suppressors NR4A1/3 is a key initiating step in leukemic transformation. Strategies aiming to block the inactivation of these transcription factors would hold great potential in the treatment of AML and MDS. However, the mechanisms that lead to their inactivation remain elusive.
Long non-coding RNAs are increasingly recognized as master regulators of cellular function in health and disease (10-14). These non-coding transcripts are involved in virtually all steps of genetic regulation. Among other functions, they recruit chromatin modifying proteins (15) and transcription factors (16), hijack the splicing (17) and translation machinery (11), and sequester miRNAs (18). Previous work by our group identified a long non-coding RNA in the NR4A3 locus expressed in hematopoietic stem cells from myelodysplastic syndrome patients (19). Given the important role of the NR4A3 gene in myeloid malignancy, we pursued the functional characterization of this transcript.
A long non-coding RNA encoded in the NR4A3 locus is an interesting candidate for exploring cis regulation of NR4A3.
Here, we characterized this hitherto unknown transcript, evaluated its expression in patient samples, functionally studied its role in NR4A3 locus regulation, and used a mass spectrometry-based proteomics approach to identify the targets of lncNR4A3.

Patients
Samples from patients with previously untreated AML and MDS, diagnosed by World Health Organization (WHO) criteria, were used in this study. Diagnosis was confirmed by cytologic examination of blood and bone marrow (patient characteristics shown in Table 1). Mononuclear cells were isolated by Ficoll-Hypaque separation of total bone marrow (BM). Samples from MDS patients (12 males, 8 females; median age, 74 years; range, 31-86 years) and AML patients (12 males, 9 females; median age, 59 years; range, 22-88 years) were collected at the time of diagnosis, and BM mononuclear cells of 15 controls (12 males, 3 females; median age, 30 years; range, 15-47 years) were obtained from bone marrow donors. French-American-British (FAB) classification of the patients is presented in Table 1. All patients were diagnosed between 2009 and 2014 at the Hematology and Transfusion Medicine Center, University of Campinas. Bone marrow mononuclear cells for the nucleofection experiments were obtained from BM aspirates of two AML patients. The mononuclear cell fraction was separated as described above. One patient had more than 80% CD34+ cells, and these cells were used directly for the experiments; for the other patient, CD34+ cells were separated using the Indirect CD34 MicroBead Kit (Miltenyi Biotec GmbH, Germany). All participants gave written informed consent to the study; procedures were approved by the University Ethics Committee (number CEP1209/2011), and all methods were in accordance with the relevant guidelines and regulations.

Proteomic Analysis
Protein was extracted as described above and concentrations were determined by Bradford protein quantification assay. A total of 50 µg of protein was run on a sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel and underwent reduction, alkylation, and in-gel digestion with trypsin (details in Supplementary Methods). Peptides were separated by C18 (100 µm × 100 mm) RP-nanoUPLC (nanoACQUITY, Waters) coupled with a Q-Tof Premier Mass Spectrometer (Waters) with a nanoelectrospray source at a flow rate of 0.6 µl/min. For protein quantification, data were analyzed with Scaffold Q+ (version 4.4.3; Proteome Software, Inc., Portland, OR, USA) with the false discovery rate set below 1%. Gene ontology enrichment was carried out using STRING software v10.5. Details of the analysis are given in Supplementary Methods.

Statistical Analysis
Statistical analysis of the data was performed using R version 3.5.1 and GraphPad Prism software. Patient data were analyzed by Student's t-test (two-way), and statistical significance between two groups (controls vs. MDS, and controls vs. AML) is shown in the graphs. To measure the correlation between lncNR4A3 and NR4A3, the Pearson coefficient was calculated; Pearson r and P-value are shown in the corresponding figure. For the functional experiments, the significance of differences between two groups (empty vector and lncNR4A3) was estimated with Student's t-test, and differences were considered statistically significant at the level of P < 0.05. Figures were plotted using the R programming language ("ggplot2," "cowplot," and "gridExtra" packages). To screen for differentially expressed proteins in the proteomic analysis, we used an in-house program developed in R 3.5.1 to apply the built-in ANOVA function to all proteins detected; proteins with more than three missing spectrometry readings were excluded from the analysis.
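The protein screening step described above can be illustrated with a minimal sketch. It is written in Python rather than R, uses only the standard library, and is not the authors' in-house program: the data layout, the function names (`f_statistic`, `screen`), and the toy readings are all hypothetical. It computes only the one-way ANOVA F statistic per protein after applying the missing-reading filter; obtaining P-values would additionally require the F distribution.

```python
from statistics import mean

MAX_MISSING = 3  # from the text: more than three missing readings -> excluded


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numeric readings."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        raise ValueError("zero within-group variance; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def screen(proteins):
    """Map protein name -> F, skipping proteins with too many missing readings.

    `proteins` uses a hypothetical layout: {name: {condition: [reading or None]}}.
    """
    results = {}
    for name, by_condition in proteins.items():
        readings = [v for vals in by_condition.values() for v in vals]
        if sum(v is None for v in readings) > MAX_MISSING:
            continue  # excluded from the analysis
        groups = [[v for v in vals if v is not None]
                  for vals in by_condition.values()]
        if len(groups) < 2 or any(len(g) < 2 for g in groups):
            continue  # assumption: need at least two readings per condition
        results[name] = f_statistic(groups)
    return results


# Made-up readings purely for illustration (not data from the study).
toy = {
    "protein_A": {"empty_vector": [1.0, 2.0, 3.0], "lncNR4A3": [4.0, 5.0, 6.0]},
    "protein_B": {"empty_vector": [None, None, 1.0], "lncNR4A3": [None, None, 2.0]},
}
print(screen(toy))
```

In this toy input, protein_B has four missing readings and is therefore dropped before any test is run, which mirrors the exclusion rule stated in the text.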

RESULTS

lncNR4A3 Characterization
To characterize lncNR4A3, we started from a partial sequence identified by Nakaya et al. and deposited in a dataset of partially intronic non-coding RNAs (21). Additionally, we searched the UCSC database for expressed sequence tags (ESTs) in this region and identified several human ESTs (Accession: BG539866, BG546553, BG570616, BF105874, BE502919, BE219816, AW204232; see Supplementary Figure 1). Since most PCR approaches are not direction-sensitive, we used strand-specific PCR to identify the orientation of the transcript. This assay is based on cDNA synthesis using specific primers complementary to the positive and negative DNA strands near the expected ends of the transcript. It revealed that lncNR4A3 is transcribed in the same orientation as NR4A3 and is hence considered a sense transcript (Figure 1B). To characterize the complete sequence of lncNR4A3, we used primers near the putative 5′ and 3′ ends to perform rapid amplification of cDNA ends (RACE). Because the 3′ end of this transcript was difficult to amplify, we suspected that it lacked a poly A tail. We used oligo dT beads to enrich RNAs with poly A tails and collected the RNA not bound to the beads as RNA depleted of polyadenylated transcripts. Unexpectedly, lncNR4A3 amplified only in the poly A-enriched fraction of cDNA from K-562, suggesting that lncNR4A3 is polyadenylated (see Supplementary Figure 2). After several attempts with different methods (see Supplementary Methods), the full lncNR4A3 sequence (deposited under NCBI GenBank accession number MK510719) was identified by RACE. This sequence also matches the alignment of ESTs found in the region, supporting that MK510719 is the full sequence of the transcript (see Supplementary Figure 1). LncNR4A3 was characterized as a polyadenylated, 1,214 bp transcript overlapping the second intron and an alternative exon of the NR4A3 gene (Figure 1A).

Expression of lncNR4A3 and NR4A3 Is Suppressed in Acute Myeloid Leukemia
The novel non-coding transcript, lncNR4A3, was significantly reduced in cells from the bone marrow of AML patients (controls = 15, MDS = 20, AML = 21; see Figure 1D). As previously reported (9), NR4A3 is also reduced in bone marrow cells from MDS and AML patients (Figure 1C). Expression of lncNR4A3 and expression of NR4A3 were positively correlated (Pearson R = 0.3771, P < 0.01, Figure 1E), consistent with the involvement of one in the expression of the other.
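As a rough plausibility check of the reported correlation, the sketch below computes a Pearson coefficient from paired values and converts a coefficient into its t statistic, t = r·sqrt((n − 2)/(1 − r²)). It assumes that all 56 samples (15 controls, 20 MDS, 21 AML) contributed a paired measurement, which the text does not state explicitly; the function names are illustrative and no patient data are reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson coefficient r."""
    return r * sqrt((n - 2) / (1 - r * r))


# Assumption: every control, MDS, and AML sample has both measurements.
N_SAMPLES = 15 + 20 + 21
t = t_from_r(0.3771, N_SAMPLES)
print(round(t, 2))
```

Under that assumption the statistic is close to 3 with 54 degrees of freedom, which exceeds the two-tailed critical value of roughly 2.67 at the 0.01 level and is therefore in line with the reported P < 0.01.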
We also characterized the expression of lncNR4A3 in several myeloid and lymphoid cell lines, CD34+ hematopoietic stem cells (HSCs), and the non-hematopoietic cell lines HS5 and HeLa (see Supplementary Figure 4). LncNR4A3 was detected in all cell types evaluated, but expression in myeloid malignant cell lines was extremely low compared to CD34+ HSCs (a normal hematopoietic progenitor cell population from umbilical cord blood) and HS5 (a stromal cell line).

LncNR4A3 Over-Expression Modulates the Expression of RNA-Binding Proteins and Particularly Members of the hnRNP Family
We investigated the global effects of lncNR4A3 over-expression on the proteome of lncNR4A3 over-expressing K-562 cells compared to controls. Whole-cell lysates were processed and analyzed by Q-Tof mass spectrometry as described in Methods. Over 400 proteins were identified (see complete list in Supplementary Data File). Enrichment analysis using STRING software showed that the altered proteins are biologically connected (n = 4, p < 0.001; see protein-protein interaction network, Figure 2A) and functionally associated with RNA binding and processing and with cellular components including the spliceosome (Figure 2B).
Pathway analysis rendered similar results: KEGG pointed to protein processing in endoplasmic reticulum and to spliceosome as the second most significantly associated pathway, "Reactome" gave metabolism of RNA as the most associated metabolic pathway, and "Local String network cluster" pointed to messenger RNA (mRNA) splicing as an associated pathway. The association of these pathways supports a role of lncNR4A3 in RNA processing. In this analysis, 22 proteins were significantly up-regulated and 19 down-regulated (p < 0.05). Among the regulated proteins there are four members of the heterogeneous nuclear ribonucleoprotein (hnRNP) family, three of them up-regulated (Table 2, Figure 2C). The most significantly enriched protein in the lncNR4A3-K-562 cells was heterogeneous nuclear ribonucleoprotein K (hnRNPK; p-value = 0.00058). We validated the upregulation of hnRNPK in K-562 cells and AML patient cells; however, no effect was detected in the KG1 cell line (see Supplementary Figure 9). hnRNPs are involved in RNA translocation and processing and are also considered splicing switches (22,23). Upregulation of poly(ADP-ribose) polymerase 1 (PARP1) was also validated; while this protein is best known for its role in DNA damage repair, it is also involved in alternative splicing and RNA elongation (24,25). Among the up-regulated proteins, western blotting confirmed increased expression of splicing factor 3B subunit 2 (SF3B2), a component of the spliceosome machinery that promotes splicing. Remarkably, splicing defects are key in leukemogenesis, and mutations in the genes encoding the splicing machinery proteins are common in hematologic malignancy (26,27). Lamin B1 was also among the proteins upregulated by lncNR4A3 over-expression; this protein is an important structural component of the nucleus, and evidence shows that it is crucial for RNA synthesis and proliferation (28,29).
More interestingly, expression of NR4A3 is known to be regulated by RNA splicing and elongation (30).

Artificial Re-Expression of lncNR4A3 Leads to Increased Expression of NR4A3 Messenger RNA and NR4A3 Protein Level and Reduction of Proliferation in Myeloid Cell Lines K-562, KG1, and Primary Leukemic Cells
The endogenous expression of both NR4A3 (data not shown) and lncNR4A3 in myeloid cell lines is almost completely abrogated (see Supplementary Figure 4); therefore, we sought to determine whether the inactivation of NR4A3 could be reversed by artificial re-expression of lncNR4A3. Transient over-expression of lncNR4A3 led to a more than 2-fold increase in NR4A3 mRNA (Figure 3A for K-562 and Supplementary Figure 8 for KG1) and, more importantly, this regulatory effect translated into an increased NR4A3 protein level (Figure 3D). Levels of lncNR4A3 and NR4A3 mRNA were evaluated by qRT-PCR. Efficiency of the over-expression was also verified by qRT-PCR, which showed that all nucleofections rendered more than 500-fold over-expression of lncNR4A3 compared to controls (see Supplementary Figure 7).
Consistent with this reactivation of NR4A3, the proliferation of K-562 and KG1 cells was reduced, as shown by the CCK-8 cell viability assay (Figure 3B). The effect of lncNR4A3 over-expression was also evaluated in primary cells from two acute myeloid leukemia patients. Expression of NR4A3 mRNA was increased in both samples after lncNR4A3 over-expression (38% and 68% increase in mRNA level). Due to sample availability, only one patient sample was analyzed by western blot, which confirmed upregulation of NR4A3 (Figure 3C). Nucleofection-mediated transient over-expression of lncNR4A3 in these cells caused an important reduction in their colony formation capacity (15-day methylcellulose colony-forming unit [CFU] assay) when compared to empty-vector nucleofected cells (Figure 3D). The CFU assay shows that over-expression of lncNR4A3 compromised the proliferation capacity of these cells, as evidenced by the reduced number and size of colonies in lncNR4A3-nucleofected cells after 15 days in semi-solid culture (control-1: 7 clusters, 5 colonies; lncNR4A3-1: 5 clusters, 0 colonies; control-2: 19 clusters, 13 colonies; lncNR4A3-2: 2 clusters, 1 colony); see Figure 3D and Supplementary Figures 8 and 9.

DISCUSSION
Despite advances in therapeutic interventions, acute myeloid leukemia (AML) patients have limited treatment options and mortality remains high. Myelodysplastic syndromes (MDS) are a group of pre-leukemic conditions, associated with aging, characterized by inefficient hematopoiesis and accumulation of genetic lesions (31). A diversity of genetic abnormalities is associated with leukemogenesis, and targeting a mechanism common to myeloid malignancy is a challenge for the development of efficient therapies. The suppression of NR4A nuclear receptors is a common feature of AML irrespective of subtype and cytogenetics, and the loss of these nuclear receptors leads to rapid development of AML in mice. All these characteristics make the reactivation of NR4A1/3 an appealing treatment approach. However, the mechanisms that regulate NR4A1/3 repression during myeloid malignization are not well understood. Heterogeneous nuclear ribonucleoproteins (hnRNPs) comprise a family of multifunctional RNA binding proteins involved in different levels of transcriptional and post-transcriptional regulation, including pre-mRNA processing, mRNA stability, and translation (32-36). Several members of this family have been associated with RNA elongation and splicing (32,34,36). Moreover, hnRNPK has been characterized as a tumor suppressor in myeloid malignancy (37). This protein family, as well as other proteins identified in this study as targets of lncNR4A3, is associated with RNA synthesis, elongation, and splicing (25,27,28,32,34).
Here, we identified a novel sense non-coding transcript in the NR4A3 locus. The expression of this transcript is abrogated in myeloid malignancy patients and in myeloid malignant cell lines. Finally, the artificial re-expression of lncNR4A3 caused the reactivation of NR4A3 in leukemic cells and the modulation of several proteins associated with RNA processing in K-562.
Although we embarked on the characterization of lncNR4A3 under the hypothesis of a cis-regulatory role of lncNR4A3 upon NR4A3, our results provide evidence of a broader reach of this transcript in the regulation of a set of proteins involved in RNA processing. Unfortunately, proteomics approaches are not as comprehensive as transcriptomic ones and, due to sensitivity limitations of the technique and equipment, we cannot infer that these proteins are directly regulated or the only ones regulated by lncNR4A3. We also cannot conclude whether the dysregulated proteins are a cause or a consequence of NR4A3 dysregulation. Despite the numerous reported cases of local regulation by long non-coding RNAs (12,38-41), results from global transcription analysis reveal that long non-coding RNAs act as global expression regulators, modulating not only one but many distant loci (15,38,41-43). Moreover, there are several cases of lncRNAs acting through cis- and trans-mechanisms simultaneously (38,40,41). Deeper analysis of several cases of reportedly cis-acting lncRNAs revealed a wider set of genome-wide targets, of which the locally regulated gene was merely a part or an indirect product of a trans-mechanism (38,40,43). All these are possible scenarios for the regulatory landscape of the NR4A3 locus. In this study we have pointed to a completely new player in the regulation of NR4A3 and to the association of this lncRNA with the RNA processing machinery. However, future work is necessary to identify the direct targets of this transcript and the precise regulatory mechanisms of this important locus in the light of these new findings.
Another technical limitation of the study of lncNR4A3 is that primer design cannot exclude immature forms of NR4A3 as contaminants in the qRT-PCR quantification. We performed an amplification using primers spanning exon 3 and the adjacent intron to quantify the influence of non-processed NR4A3 RNA (Supplementary Figure). We were able to detect the immature NR4A3 RNA (exon-intron primers); however, it was close to the limit of reliable detection and considerably less abundant (~3 cycles) than lncNR4A3. Despite these limitations, the strong functional association between the up-regulated proteins suggests a role of lncNR4A3 in RNA processing. We reviewed the literature to understand the nature of the mechanisms that regulate NR4A3 expression. Although some reports suggested that epigenetic modification is involved in the silencing of NR4A3 in some cancer models (1,44), solid evidence from a recent study demonstrated that the abrogation of NR4A3 expression in myeloid malignization was mediated by the blockade of transcriptional processing rather than by epigenetic silencing (30). That study showed that the NR4A3 promoter region of AML blasts lacks common epigenetic markers of repression compared to normal cells, and that its repression during malignization depends on RNA processing defects (30). Here, we present evidence of the role of this novel long non-coding RNA, lncNR4A3, in NR4A3 regulation and in the modulation of a set of RNA processing-related proteins. This study focused on the effect of this new regulatory transcript in myeloid malignancy; however, NR4A3 plays important roles in lymphopoiesis (9) and its suppression is associated with lymphomagenesis (6). The reactivation of NR4A3 by lncNR4A3 could have a similar tumor suppressor effect in the context of lymphoma and other malignancies.
Our results suggest that the re-expression of lncNR4A3 is able to revert, to some extent, the loss of NR4A3 in leukemic cells. This enhanced synthesis of NR4A3 leads to the expected reduction in cell viability. Although it is likely that lncNR4A3 is a fine tuner of expression for several targets, far beyond NR4A3, we present evidence of a tumor suppressor role of this transcript in myeloid leukemia through the regulation of NR4A3.

DATA AVAILABILITY STATEMENT
Quantitative proteomics data are available in the Supplementary Material, and raw spectra for the proteomic analysis are available upon request. The novel lncNR4A3 sequence was deposited in the NCBI database under the accession number MK510719.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Campinas State University Ethics Committee (number CEP1209/2011). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AC designed and performed the research, analyzed data, and wrote the paper. FN contributed to performing the research. AD contributed to performing the research. KF contributed to performing the research. SO-S contributed to the design of the research, data analysis, and interpretation. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) and carried out at the Hematology and Transfusion Medicine Centre, UNICAMP.