Original Research ARTICLE
Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data
- 1Center for Computational Science, University of Miami Miller School of Medicine, Miami, FL, USA
- 2Human Genetics and Genomic Graduate Program, University of Miami Miller School of Medicine, Miami, FL, USA
- 3Department of Medicine, University of Miami Miller School of Medicine, Miami, FL, USA
Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders, such as Parkinson, Down syndrome, Huntington disease, and Alzheimer. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches.
Neuropathic pain (NP) is caused by intense damage brought to nervous system and has a differentiated origin (1). The cause could be either injury affecting the somatosensory nervous system or the damages to either the peripheral nervous system or the central nervous system (CNS) (2). NP is categorized into peripheral and central. The central NP is found implicated in spinal cord injury and in a few stroke cases. For instance, a significant presence of NP was detected in patients with spinal cord injury that experienced chronic pain (3). Peripheral nerve injuries cause damage to both the nerve and the adjacent connective tissue, but the injured nerves can regenerate in the peripheral nervous system by the activation of the intrinsic growth capacity of neurons (4). However, injured nerves generally fail to regenerate in the CNS (5). Various studies on sciatic nerve (SN) injury have been reported with reference to the rat model (6, 7), which is considered as the reference in our work too. Consisting of mixed populations of motor and sensory axons, SN is commonly addressed in nerve regeneration studies. The sensory neurons extending into SN are located in the L4–L6, lumbar vertebrae fourth through sixth dorsal root ganglion (DRG).
Due to the relevance of measuring the expression levels of both DRG and SN tissues, an opportunity is currently offered by many available microarray data, through which cross-reference studies and comparative evaluations can be performed. Meta-analyses have been conducted in many independent experiments to detect genes regulated by chronic pain states (8). With a focus limited to mRNA transcripts, little or no relevance was assigned to non-coding RNAs (ncRNAs) (9). A couple of exceptions were due to microRNA profiling based on SN, and to lncRNAs with temporal profile monitored after DRG injury, respectively (6, 9). This knowledge gap has motivated our work. An opportunity was offered by the reuse of data from open access resources, with the aim of overcoming the paucity of evidences due to limited experiments, high costs, and technical difficulties.
Peripheral inflammation and nerve injury cause changes in the expression of some miRNAs (7, 9, 10). Other ncRNAs might be considered relevant too, as they may regulate genomic expression to an extent yet largely unknown (11). Most studies have been based on deep sequencing and associated bioinformatic analyses lacking clear reference to functions, but reporting on target genes. Naturally enough, knockout studies may advance knowledge, but human transcripts are problematic to examine in vivo (12). Despite problems with cis-specific functions, mechanisms explaining regulatory functions have been shown especially with reference to epigenetic modulation, protein scaffolding, miRNA sequestration, competitive inhibition, etc. These processes include the ability to modulate target gene transcription (13).
The functional significance of such regulators has been discussed in physiology, development, and also various disease processes (14–16). However, up to now, relatively few ncRNAs have been studied. For instance, the expression of some antisense transcripts is linked to the activity of neighbor genes (17), but the regulation of gene expression by antisense transcripts is not acting as an isolated but rather as part of integrated mechanisms to achieve complex regulatory effects (18). Additional consideration deserves the complexity of lncRNAs in relation with the architecture of RNA-binding proteins, due to the presence of multiple RNA-binding domains that may recognize distinct “targets” (19). There is currently little evidence for direct interaction between lncRNAs and DNA. RNA–DNA hybrids or triplex structures can allow single strands of RNA to interact with DNA duplexes by base–pair interactions. These direct RNA–DNA interactions could efficiently and selectively target RNA signals to genomic loci (20).
Antisense biotypes were identified in the mammalian nervous system, including in chronic pain-related regions (21). This holds for the Kcna2 antisense RNA in DRG, for instance, which is overexpressed after peripheral nerve injury (22). In NP regulation, the relationships between protein-coding genes and ncRNA expression are also important for the critical role that the former play, that of target genes. Our study looks at such associations, focusing on the analysis of differential expression in both regulators and targets.
We recently reported work with preliminary analysis of the dataset GSE30165 (23). The rationale was to identify mRNA isoforms and ncRNAs underlying NP, using only the top 250 differentially expressed biotypes. In the present work, starting from ncRNAs classification and curation in relation to NP, we provided further analysis, i.e., contextual annotations for lncRNAs, suggesting a strategy to strengthen inference about the potential functional role of ncRNAs. In particular, since overexpression of ncRNAs acts toward reduction of the expression of potential targets, our evidences suggest to pay attention to putative targets potentially associated with regulatory and signaling NP mechanisms, also studied in relation with brain diseases.
Materials and Methods
The raw microarray data were downloaded from the Gene Expression Omnibus database1 (GEO), platform Agilent-014879: Whole Rat Genome Microarray 4 × 44 K G4131F. Gene expression signatures for 30 samples were generated, including L4-6 DRG and proximal SN tissues (0.5 cm) at day 0, day 1, day 4, day 7, and day 14 after SN resection. This dataset included three samples in each tissue, DRG, and SN, and the expression levels were measured for all samples at five time points.
The differential expression was computed with GeneSpring (GX V 12.6). The time baseline was assigned the same way in both studies, with DRG and SN. Given the start day, 0 day = b, further measurements that were taken at successive intervals were redefined as: I1 = 1 day − b, I2 = 4 day − b, I3 = 7 day − b, I4 = 14 day − b. Sample groups thus appear according to time point in each tissue, for a total of five groups. Significance was assigned when differences in expression values between the groups had fold change (FC) >1.5, and t-test p-values <0.05, then adjusted for multiple testing by the false discovery rate (FDR) or Benjamini and Hochberg method (24). The probe sets and genes correspondence was established by FDR cutoff q < 0.05, and this is a standard way to bypass the problem of probes mapping to multiple genes. The outcome of this process (see also Table S8 in Supplementary Material) was the gene set called DEGs, retained for downstream analysis.
Of course, the stringency of these criteria influences the relative abundance or paucity of evidences that allow downstream analysis and thus inference; however, the goal of capturing biotypes other than genes, say ncRNA, suggests considering that relatively low expression values should be expected to characterize them.
Pathway analysis was obtained using the tool GeneGo Metacore™ (Thomson Reuters Corporation, New York, NY, USA). The DEGs identified by GeneSpring analysis were uploaded to the GeneGo website, and core analysis was applied to obtain the list of activated pathways, using 0.05 as a threshold for p-values. While we rely on established tools for protein-coding gene annotations, i.e., DEGs enriching for pathway terms, we need to work at a more empirical level for the ncRNAs, as explained below.
The differentially expressed detections were annotated with both protein-coding and non-coding biotypes, as provided by Ensembl rel. 77 (25), Rattus_norvegicus.Rnor_6.0.79.gtf, and Mus_musculus.GRCm38.76.gtf. In the non-coding category, annotations regarding lncRNAs were provided for pseudogenes, lincRNAs, and asRNA – considered of major interest in our work. Due to the categories appearing under the label pseudogene, we aimed to have one single group, and thus the processed pseudogenes were consolidated into the final pseudogene group. Below, we explain the annotation strategy for these three major categories. Lastly, we indicate the use of targets to build protein–protein interaction networks (PINs).
The contextual analysis of lincRNAs with respect to the DEG targets was established on the basis of the physical proximal distance on the chromosome. The mode of action of lncRNAs is generally classified into cis and trans regulation [see among other references, Ling et al. (26)]. This depends on whether the lncRNA regulates neighboring genes, i.e., genes on the same chromosomal regions where they are located, or instead distant genes, i.e., on other chromosomes. Notably, lncRNAs interact directly or indirectly with genomic DNA elements, often through proteins that perform specific biological functions, and also through other neighbor lncRNAs in a coordinated way. Note that lncRNAs may target proteins to exert their trans effects, but in such case to know what factors determine the RNA–protein interaction requires further study, here not pursued.
Locations of DE lincRNAs were obtained from Rattus_norvegicus.Rnor_6.0.79.gtf and Mus_musculus.GRCm38.76.gtf, i.e., gene transfer format files downloaded from Ensembl release 77. Two text files, one with lincRNA genomic locations (start and end positions) and another with protein-coding genomic locations (start and end positions), allowed lincRNA neighbors to be annotated. Genes at both left and right sides of starting and ending positions of lincRNAs, and within ±3 Mbps regions, were considered as putative targets. Note that this interval is arbitrary, and there is not a universally accepted definition of such range. In an attempt to explore neighbors of the lncRNA locus that confidently allow putative targets to be identified, we assigned priority to targets located at the closest possible distance from the locus of interest (see Results).
They are among the most important categories of lncRNAs and known to regulate protein-coding genes on the opposite strand. Using information on the genomic coordinates of the genes and their coding potential obtained from the GTF files, we looked into the specific protein-coding genes that were on the opposite strand to DE asRNA in both tissues. Two text files were created for rat (i) asRNA file: a text file describing gene_id, chromosome, start, end, strand, gene_name obtained from Rattus_norvegicus.Rnor_6.0.79.gtf and (ii) GTF file: a text file that has all the genomic information but not restricted to asRNAs. These two text files were parsed through basic Perl scripts and for different steps to obtain the asRNA targets. Basically, this involved checking whether the chromosome numbers are the same in both asRNA and GTF text files and if the strands are opposite, then the ordering of the start position of genes in the asRNA file compared to the end genomic location, and also of the start position of genes in GTF file compared to the end genomic location. Finally, only the gene names from the GTF file on the opposite strand to the corresponding gene in the asRNA file were retained. Then, the same exact protocol was followed to obtain asRNA targets using the mouse reference, Mus_musculus.GRCm38.76.gtf.
Pseudogene–Parental Gene Annotations
The differentially expressed pseudogenes obtained for both tissues were annotated with respect to their parental protein-coding genes. The pseudogene sequences were mapped against the protein-coding gene sequences using BLAST (27), and only the unique hits were assigned to parental gene–pseudogene associations. Fasta sequences for protein-coding genes and pseudogenes were downloaded from the rat reference, Rattus_norvegicus.Rnor_6.0.cdna.all.fa. These two sets of sequences were mapped against each other using BLAST and the parameter “-max_target_seqs:1” to detect only those protein-coding genes that have a high-level of sequence homology to the pseudogenes used as query. Finally, the output obtained from BLAST was analyzed to overcome the multiple associations, i.e., when a pseudogene aligns to multiple protein-coding genes. The BLAST output was filtered using the “sort –u” option, which sorts the file and pipes the output that contains only unique hits. Therefore, only unique hits were considered as our final list for the downstream analysis. Similar steps were followed to extract parental genes from the mouse reference Mus_musculus.GRCm38.cdna.all.fa as well, as reported in parenthesis in Table 1. Further control of the selected parental genes aimed to identify DE in either neuronal tissue.
Table 1. Classification of bioentities (source: Ensembl) – combined gene annotations for differentially expressed bioentities at different time points for the two tissues (SN, DRG).
Target-Driven Protein–Protein Interaction Networks
The parental protein genes found differentially expressed in either of the neuronal tissues were used to generate the PIN using STRING2 (V. 9.1). This is a known state-of-the-art database reporting both experimental and predicted protein–protein interactions. This network was functional to our downstream contextual analysis, i.e., the contextualization of the identified targets with respect to biological processes and pathways.
Rat Model and Mouse Reference
We used Mus_musculus.GRCm38.cdna.all.fa and Rattus_norvegicus.Rnor_6.0.cdna.all.fa from Ensembl (rel 77). The mouse was used as a reference only.
Probe and Gene Annotation
The microarray dataset was profiled for the DEGs in both neuronal tissues – DRG and SN – at different time points (see Table 1). In addition, annotated differentially expressed probes and ncRNAs are uniquely obtained using the mouse reference (see Materials and Methods for details and bold numbers in parentheses reported in Table 1).
We detected 32 unique lincRNAs in SN and 8 unique lincRNAs in DRG. Then, we detected 31 unique antisense in SN and 12 unique antisense in DRG. Such antisense detections in SN and in DRG were not significantly associated with targets; hence, they were no longer explored. We additionally identified 452 unique pseudogenes in SN and 56 unique pseudogenes in DRG, subsequently investigated for their protein-coding targets.
Pathway analysis was conducted with the tool GeneGo, delivering evidences displayed in Figures 1A,B and 2A,B and summarized in the top-10 annotated terms for both tissues, DRG and SN. The enriching DEG sets appear in supplemental files (see Tables S6 and S7 in Supplementary Material for complete lists of annotated terms). Also, fractions of genes enriching pathway terms are reported inside the plots, while the listed terms are sorted by FDR-corrected enrichment values (from best, at the top).
Figure 1. (A) Summary of top-10 pathways in SN_I1 (left) and SN_I2 (right). Percentage inside the pies represents enriched genes. FDR-corrected enrichment values are reported with listed pathway terms. (B) Summary of top-10 pathways in SN_I3 (left) and SN_I4 (right). Percentage inside the pies represents enriched genes. FDR-corrected enrichment values are reported with listed pathway terms.
Figure 2. (A) Summary of top-10 pathways in DRG_I1 (left) and DRG_I2 (right). Percentage inside the pies represents enriched genes. FDR-corrected enrichment values are reported with listed pathway terms. (B) Summary of top-10 pathways in DRG_I3 (left) and DRG_I4 (right). Percentage inside the pies represents enriched genes. FDR-corrected enrichment values are reported with listed pathway terms.
We can observe among the terms in Figure 1A the one annotated as neurophysiological process_Kappa-type opioid receptor in transmission of nerve impulses, especially involved in DRG_I2. The relevance of this term is associated to CREM (activators), AP-1, and P/Q-type calcium channel alpha-1A subunit. The latter gene is a voltage-dependent calcium channel that mediates the entry of calcium ions into excitable cells and is involved in various calcium-dependent processes, like muscle contraction, neurotransmitter release, and gene expression. It is also primarily expressed in neuronal tissue, and it was shown that mutations to this gene could potentially cause two neurological disorders (29).
We also noted recurrence of cell adhesion_Integrin inside-out signaling in T cells at SN_I2, SN_I3, and SN_4, but not at SN_I1. This may indicate a possible role for the integrin-regulated cell adhesion in tissue morphogenesis and wound healing, together with the regulation of cell growth and differentiation, whereas the temporal pattern points to recurrence after injury.
Another interesting pathway is immune_response_CCL2_signaling, which is involved mainly in SN and refers to chemokines previously shown to be upregulated in injured DRG neurons under different NP models (30–33). We found enrichment of immune response_NFAT in SN_I2, SN_I4 and signal transduction_cAMP signaling in DRG_I4 by calcineurin A (catalytic), which is a calcium and calmodulin dependent serine/threonine protein phosphatase. This gene is relevant for CNS functions, such as neurite extension, synaptic plasticity, learning, and memory (34, 35).
Most pathways in SN show gene enrichment for ERK1/2, a key extracellular-signal-regulated protein kinase. The abnormal hyper-phosphorylation of tau, a gene that encodes the microtubule-associated protein tau, promotes microtubule stability and functions as a linker protein between axonal microtubules and neural plasma membrane components. Its mutation in Alzheimer’s disease (AD) has been shown to involve the ERK of the mitogen-activated protein (MAP) kinase family. Both the intracellular and regional distribution of the active forms of both MEK1/2 and ERK1/2, and the accumulation of p-MEK1/2 and p-ERK1/2 are known, the latter found in cases with stages I–III neurofibrillary degeneration (36).
Contextual Analysis of lincRNAs and Detection of the Neighboring DEGs
We detected the differentially expressed ncRNAs across DRG and SN. Further classification was operated into three lncRNA categories, namely pseudogenes, lincRNAs, and asRNAs. LincRNA sequences categorized into different types on the basis of their genomic location with respect to protein-coding genes have been explored in very few studies (9, 22). The lincRNAs, detected by using both mouse and rat references, were manually curated to verify whether they could reveal potential candidate therapeutic targets for NP. In particular, potential association with the neighboring DEG targets was selectively investigated, i.e., within ±3 Mbps.
Table 2 shows the list of DEGs with respect to lincRNAs in both tissues and the corresponding annotations. The directional location of putative targets with respect to the genomic location of the lincRNAs is also reported. Analysis of lincRNAs and their neighboring DEGs revealed about 26 potential protein-coding targets in SN and 4 in DRG. In particular, we enabled a search for context-related information for each candidate bioentity, and after further manual curation of these 30 targets, we found examples, such as RGS22, RGS18 (regulators of G-protein signaling), and STIM2 (stromal interaction molecule 2), among the candidate DEGs in SN (indicated by *), with the potential to be targets of lincRNAs.
In particular, STIM2 plays an important role in ischemia-induced neuronal damage, and its absence in knockout mice was shown to interrupt blood flow in the brain, thus decreasing the neuronal damage caused by ischemia (2). The neuroprotective influence of STIM2-deficiency following an ischemic incidence suggests that the inhibitors of the STIM2 function might have potential therapeutic effect as neuroprotective agents to treat ischemic injury and other neurodegenerative disorders showing altered Ca2+ homeostasis. This basically suggests the role of STIM2 in hippocampus-dependent spatial memory, synaptic transmission, and plasticity. Since NP is likely a result of long-term plastic changes along somatosensory pathways, we hypothesize that STIM2 might play a role in NP-related synaptic transmission.
In addition to STIM2, we also detected a few genes from the family of RGS (regulators of G-protein signaling), namely, RGS18 and RGS22, as putative targets for lincRNAs. RGS proteins share a conserved domain of ~120 amino acids and are responsible for accelerating GTPase activity on the G-protein alpha subunit. They also affect physiological regulation of G-protein-mediated cell signaling in many tissues and organs and have been considered as potential drug targets in various nervous system diseases (37, 38). A family of RGS genes expressed in spinal cord may be involved in the regulation of GPCR signaling and adaptive changes of the nervous system, accompanying insensitivity to morphine observed in NP models (39). Based on the expression of RGS proteins, novel targets for small molecule inhibitors might provide specific treatment for various pathophysiological conditions (40).
For instance, RGS18 acts as a negative regulator of the osteoclastogenesis (41) and also plays an essential role in the regulation of megakaryocyte differentiation and chemotaxis (42). Finally, RGS proteins accelerate the deactivation of G proteins to reduce the GPCS (G-protein-coupled receptors) signaling, while few have effector function and thus transmit signals. The range of functions of RGS proteins along with their dynamically regulated distribution in brain makes them putative targets for therapeutic use (43, 44). While there are no current studies that show the relevance of RGS18 and RGS22 to NP, other components of RGS proteins have been shown to contribute to NP etiology (45, 46).
Annotation Pseudogenes and the Parent Protein Coding
The contextual analysis of all the pseudogenes detected in both the neuronal tissues was performed. These pseudogenes sequences were mapped to the protein-coding genes to identify their associated parental genes, keeping only the top unique hits for downstream analysis. We further found the differentially expressed genes in either or both of the tissues. The detected pseudogenes and their corresponding parental protein-coding genes were listed in DRG (Table S2 in Supplementary Material) and in SN (Table S3 in Supplementary Material), both at the four different time points. These lists of entities were then used to draw STRING networks (Figure 3A), in an attempt to find the protein–protein interactions involved with pseudogene targets. The rat reference was used as a knowledge base, and a quite stringent confidence level of 0.7 was chosen to control for false positives. The interactions were generated by a database of known and predicted protein–protein interactions, following the STRING structure.
Figure 3. (A) Protein–protein interactions for pseudogene targets (black box) obtained using rat reference at confidence level 0.7. Red circle are the networks studied. Dotted lines indicate association between pseudogene and parental genes (the former appear superimposed, being not annotated in STRING). (B) Protein–protein interactions for pseudogene targets (black boxes) obtained using mouse reference at confidence level 0.7. Red circle are the networks studied. Dotted lines indicate association between pseudogene and parental genes (the former appear superimposed, being not annotated in STRING).
Figure 3B shows the protein–protein interactions for pseudogene targets (black boxes) obtained using mouse as the reference. Dotted lines indicate associations between pseudogene and protein-coding targets, which have been detected, but for which there is no interaction. In addition, we generated the protein–protein interactions for pseudogene targets obtained using mouse and rat references (at the same confidence level of 0.7, as before) for DRG (Figure S1 in Supplementary Material) and SN (Figure S2 in Supplementary Material). The parental protein-coding targets were further manually curated to match possible transcription factors, kinases, or receptors (shown in Table S1 in Supplementary Material). Protein-coding targets were scrutinized to confirm their role in various neurodegenerative processes. Table 3 reports network-driven identifications of parental genes appearing as paths in Figures 3A,B. Heatmaps were generated for pseudogenes and their DEG targets (Figures 4A,B, for SN and DRG, respectively).
Figure 4. (A) Expression levels for pseudogenes and the corresponding parental protein-coding targets that are differentially expressed in SN at four different time points. (B) Expression levels for pseudogenes and the corresponding parental protein-coding targets that are differentially expressed in DRG at four different time points.
Overall, we found that the expression of pseudogenes and their respective targets are more prominent 14 days after injury in DRG, whereas in SN, they are highly expressed at 1, 4, and 7 days after injury. Tables S4 and S5 in Supplementary Material contain the list of pseudogenes and their corresponding parental protein-coding genes in DRG and SN that are differentially expressed in DRG and SN at four different time points, respectively. We found about 12 DEGs in DRG and 134 DEGs in SN to be considered as potential targets for pseudogenes. In particular, we identified a number of olfactory receptors as targets of pseudogenes, which connect according to the network path Olr322-Olr522-Olr490-Olr951-Olr1565-Olr1174 in the rat network and are mostly expressed in SN, with the exception of the path Olr322-Olr522-Olr1565-Olr1174 expressed in DRG as well (Figure S1 in Supplementary Material). In general, these receptors have been reported to be important components for the early diagnosis of neurodegenerative diseases (56–63).
We then identified HSPD1 (Table 3, second row), a heat shock 60-kDA protein1, which encodes a member of the chaperonin family. Studies have shown increased frequency of the pathogenic variant of the HSPD1 single-nucleotide variant in a subgroup of sudden infant death syndrome (SIDS) patients (64). Yet, another study showed that the mutations in HSPD1, the gene encoding Hsp60, are related to two human inherited diseases of the nervous system, spastic paraplegia and MitCHAP60 disease. The study shows the significance of Hsp60 chaperone in mitochondrial function and its relation to the formation of the respiratory chain complex in neuronal tissues (47). We also detected HNRNPA1 in mouse network (Table 3, third row). This is a heterogeneous nuclear ribonucleoprotein A1. The levels of HNRNPA1 were found dysregulated in patients with AD, and the associated genotype is likely a risk factor for frontotemporal lobar degeneration (FTLD) among the male populations (49). This network also includes NAP1l1, which plays a significant role in the cortical neuronal differentiation. NAP1 is the nucleosome assembly protein 1, which is normally used for in vitro nucleosome assembly reactions under physiological ionic condition (65). Other studies have shown that NAP1 regulates actin nucleation by forming a complex with WAVE regulatory protein and is selectively expressed in the developing cortex (66). The cytoskeletal rearrangements in the cortical plate play an important role in the cortical neuronal differentiation (51). Yet, another study reveals that the depletion of NAP1 could result in reduced branching of motor axons in the neuronal development (67). Thus, spotting out NAP1 is a significant finding in this study, due to its potential involvement in neurodegeneration.
Naturally enough, we cannot claim any direct association between target genes and both physiological and pathological conditions highlighted in the studies reported in Table 3. Nevertheless, we stress the following two points. (a) The proteins assigned to network paths may or may not refer to significance levels under differentiated models and experimental conditions, but those identified here are indeed obtained from mapped DEGs. (b) A network context is ideal to identify hotspots (motifs, paths, etc.) whose relevance comes from superior robustness of interactions compared to simple correlation observed at gene expression level, which in turn makes inference more reliable.
In recent years, the relevance of ncRNAs has exponentially increased, supported in part by technological advances and in part by the need to overcome the current knowledge limitations in genomics. Two major trends have been observed. On the one hand, the computational community has produced an enormous volume of evidences with a focus on new biotype classifications and characterizations examined in the context of large-scale studies. Undoubtedly, ENSEMBL3 represents a main source of genome annotation and interpretation currently available. As prediction remains the most important factor determining genome annotation, the way the available evidence is consolidated and curated differentiates ENSEMBL say, from other annotation systems [see an interesting reading on biotype conflicts, Zhu et al. (68)]. This is one of the current limitations in the analyses we have proposed here, as the lack of harmonization between annotations of different DB resources makes part of the evidences disagree or contradict each other. However, in many cases and especially for the vast majority of ncRNAs, predictive evidences may be destined to change, following discoveries and validations. This is in line with highly focused studies delivered by experimentalists with the aim to provide small-scale validated evidences for identifying functions of ncRNAs. Notably, such research domains, computational and experimental, have offered almost no contact points, due to different scopes, approaches, and desired impacts.
We aim to contribute at establishing bridges between the two domains, an effort requiring to computational scientists the implementation of sophisticated tools to parse the current heterogeneity of evidences in support of novel or alternative interpretation especially in relation to the role of ncRNAs. Although microarray targeted to spinal nerve ligations have been performed for real-time evaluation of mRNA transcripts (69, 70), no attempts have been made to study the expression levels of ncRNAs, and specifically for NP studies on spinal cord injury. Here, our idea was to develop a computational pipeline (sketched in Figure 5) to detect and categorize ncRNAs while annotating their putative targets. This way, we emphasize a methodological direction complementing the search and the profiling of protein-coding genes. When these are analyzed with pathway annotations, only a few pathway terms emerged with relevance for our specific problem, while the majority did not convey significant evidence.
Figure 5. Methodological pipeline. The graph has been rescaled to fit the data label (452) for pseudogenes in SN. The upward red arrows pointing to 452 indicate that rescaling has been implemented to fit all the data together. The displayed tables are included in the supplementary material.
In particular, we underline the role of the neurophysiological process_Kappa-type opioid receptor, which is involved in the transmission of nerve impulses and in neurotransmitter release. Such limitations were bypassed by providing further important annotations through protein-coding genes with an assigned role of putative targets of lincRNAs and pseudogenes. Protein interaction networks revealed then very useful to establish connectivity patterns between such target genes. It is important to note that such patterns establish dependence relationships between nodes through links, which reflect an underlying metric. Unlike miRNAs, other ncRNAs are not yet supported by sufficient knowledge to allow complete annotations and systems representation according to networks. For this reason, we focused on building network configurations centered on ncRNA targets, such that inference can be conducted on their direct or indirect interactors.
Previously, it was shown how the differentially expressed genes regulated the nine classes of biological processes, but, here, we did an extensive repurposing of the same microarray data to do pathway annotations for the DEGs. As a disease model, the focus on NP was based on the simple premise that very little is known about the role of ncRNA in such process. It is known that nerve injury causes gene expression changes in some isolated miRNAs and in asRNA in the neuronal tissue. It is also known that these changes could be responsible for the injury-induced alterations of few pain-associated genes, causing neuronal excitability and pain hypersensitivity (22, 71). However, no knowledge is currently available on lncRNAs and their associated protein-coding gene targets. Similar knowledge gap applies also to pseudogenes and their associated parental genes.
We have provided evidence on novel players with a possible role in the regulation of NP development and maintenance, and whose specific functions would surely require further validation steps by experimentalists. As computational scientists, we investigated ncRNA expression profiles and their potential targets by two different approaches. First, we searched for the detected DE lincRNAs in coarsely defined proximal genomic regions to locate potential targets. We found, for instance, three examples of targets – RGS22, RGS18, and STIM2 – for which annotations are quite interesting because of the functional information available for them, hence the attribute context-rich assigned to these examples compared to other examples of candidates for which similar information is unavailable. We do not exclude, of course, that the mechanisms underlying lincRNA functions and their targets might not be centered on proximity. We only claim that evidence appears from targets found at proximal locations, indicating the need to explore further the causes of such influences. While the previous findings were limited to antisense, here we found targets for lincRNAs and pseudogenes, highlighting the fact that the regulation of these targets by ncRNAs needs further exploration in the NP domain.
Second, we analyzed DE pseudogenes and their parental protein-coding targets. We showed that some parental genes can be connected when mapped onto networks, in particular, PINs in our case. Interestingly, we reported various network-driven identifications of these aggregated parental genes linked to studies centered on various neurodegeneration and neurogenesis processes. We also found olfactory receptors representing a densely connected functional module, and further literature curation linked them to various neurodegeneration states like Parkinson, Down syndrome, Huntington diseases, and Alzheimer. This, by no means, indicates that direct association exists with one or more of such diseases, also because of the variety of reference models reported in the studies.
As a methodological note, microarray technology is not comprehensive, and we may expect that by replicating our experimental setting with RNA-Sequence, superior evidences could be found to complement our current results obtained. Although RNA-Sequence is undoubtedly a preferable strategy to detect novel lncRNAs, our goal in the study was limited to the identification of known lncRNAs, and microarray revealed informative for such purpose. We expect that more and novel ncRNAs would emerge from using RNA-Seq. More importantly, we want to emphasize that we have reused the microarray data to identify the significant differentially expressed bioentities and to study their potential relevance for NP. Our results suggest that novel unforeseen findings could be latent in microarray analyses, requiring that biologically meaningful interpretations from such repurposing of microarray probes may represent a strategy to consolidate previously acquired knowledge. Recent reviews stressing such points are supporting our work (72, 73).
In conclusion, given the known premises that lncRNA regulation mechanisms are far from being clarified, and this includes the elucidation of direct/indirect and in case, intermediary mechanisms, it appears that targeting proteins allow to exert their cis or trans effects, but the factors determining, for instance, RNA-protein interaction are not yet well-defined.
Our study brings initial but prominent evidence of the impact exerted by some ncRNAs on NP. Despite the limitations of the approach, such as results telling about putative targets and cross-referencing across studies on different models, an advantage of our methodology is that can be generalized to any model for which even just microarray data have been obtained. To our knowledge, these represent still the major fraction available to computational scientists, and even if the ratio with RNA-Seq-driven studies is destined to revert, we are confident that ncRNA evidences can be retrieved from both experimental sources.
Finally, although most of the recent computational tools have tried to examine the expression and the possible regulation functions of ncRNAs, it is plausible to expect that direct NP implications need future targeted studies aimed to validate the precise functional role of ncRNAs and their potential targets, even if the timeline for such milestone remains highly dependent on both model and type of cells under study, thus hard to predict.
HR: performed analyses, computations, coding, and graphics. EC: performed networks and graphics. HR, NT, and EC: designed the methodological approach, wrote the manuscript, read, and approved the final manuscript.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors thank two anonymous reviewers for constructive remarks.
The Supplementary Material for this article can be found online at https://www.frontiersin.org/article/10.3389/fneur.2016.00168
Figure S1. Protein–protein interactions for pseudogene targets for DRG.
Figure S2. Protein–protein interactions for pseudogene targets for SN.
Table S1. Parent protein-coding targets classification of transcription factor, kinase, and receptors.
Table S2. Pseudogenes and parental protein-coding genes in DRG.
Table S3. Pseudogenes and parental protein-coding genes in SN.
Table S4. Differentially expressed pseudogenes and parental protein-coding genes in DRG.
Table S5. Differentially expressed pseudogenes and parental protein-coding genes in SN.
Table S6. Enriched terms and full details for pathway analysis in SN.
Table S7. Enriched terms and full details for pathway analysis in DRG.
Table S8. Probe sets and genes.
Table S9. Cross-references for Table 2 entries.
2. Berna-Erro A, Braun A, Kraft R, Kleinschnitz C, Schuhmann MK, Stegner D, et al. STIM2 regulates capacitive Ca2+ entry in neurons and plays a key role in hypoxic neuronal cell death. Sci Signal (2009) 2(93):ra67. doi:10.1126/scisignal.2000522
5. Fenrich K, Gordon T. Canadian association of neuroscience review: axonal regeneration in the peripheral and central nervous systems – current issues and advances. Can J Neurol Sci (2004) 31(2):142–56. doi:10.1017/S0317167100053798
6. Yu B, Zhou S, Wang Y, Ding G, Ding F, Gu X. Profile of microRNAs following rat sciatic nerve injury by deep sequencing: implication for mechanisms of nerve regeneration. PLoS One (2011) 6(9):e24612. doi:10.1371/journal.pone.0024612
7. Li S, Yu B, Wang S, Gu Y, Yao D, Wang Y, et al. Identification and functional analysis of novel micro-RNAs in rat dorsal root ganglia after sciatic nerve resection. J Neurosci Res (2012) 90(4):791–801. doi:10.1002/jnr.22814
8. Li S, Liu Q, Wang Y, Gu Y, Liu D, Wang C, et al. Differential gene expression profiling and biological process analysis in proximal nerve segments after sciatic nerve transection. PLoS One (2013) 8(2):e57000. doi:10.1371/journal.pone.0057000
9. Yu B, Zhou S, Hu W, Qian T, Gao R, Ding G, et al. Altered long noncoding RNA expressions in dorsal root ganglion after rat sciatic nerve injury. Neurosci Lett (2013) 534:117–22. doi:10.1016/j.neulet.2012.12.014
10. Zhou S, Shen D, Wang Y, Gong L, Tang X, Yu B, et al. microRNA-222 targeting PTEN promotes neurite outgrowth from adult dorsal root ganglion neurons following sciatic nerve transection. PLoS One (2012) 7(9):e44768. doi:10.1371/journal.pone.0044768
12. Johnsson P, Ackley A, Vidarsdottir L, Lui WO, Corcoran M, Grandér D, et al. A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells. Nat Struct Mol Biol (2013) 20(4):440–6. doi:10.1038/nsmb.2516
18. Xu Z, Wei W, Gagneur J, Clauder-Münster S, Smolik M, Huber W, et al. Antisense expression increases gene expression variability and locus interdependency. Mol Syst Biol (2011) 7:468. doi:10.1038/msb.2011.1
19. Wang X, Song X, Glass CK, Rosenfeld MG. The long arm of long noncoding RNAs: roles as sensors regulating gene transcriptional programs. Cold Spring Harb Perspect Biol (2011) 3(1):a003756. doi:10.1101/cshperspect.a003756
22. Zhao X, Tang Z, Zhang H, Atianjoh FE, Zhao JY, Liang L, et al. A long noncoding RNA contributes to neuropathic pain by silencing Kcna2 in primary afferent neurons. Nat Neurosci (2013) 16(8):1024–31. doi:10.1038/nn.3438
29. D’Onofrio M, Ambrosini A, Di Mambro A, Arisi I, Santorelli FM, Grieco GS, et al. The interplay of two single nucleotide polymorphisms in the CACNA1A gene may contribute to migraine susceptibility. Neurosci Lett (2009) 453(1):12–5. doi:10.1016/j.neulet.2009.01.081
30. Tanaka T, Minami M, Nakagawa T, Satoh M. Enhanced production of monocyte chemoattractant protein-1 in the dorsal root ganglia in a rat model of neuropathic pain: possible involvement in the development of neuropathic pain. Neurosci Res (2004) 48(4):463–9. doi:10.1016/j.neures.2004.01.004
31. White FA, Sun J, Waters SM, Ma C, Ren D, Ripsch M, et al. Excitatory monocyte chemoattractant protein-1 signaling is up-regulated in sensory neurons after chronic compression of the dorsal root ganglion. Proc Natl Acad Sci U S A (2005) 102(39):14092–7. doi:10.1073/pnas.0503496102
32. Jeon SM, Lee KM, Cho HJ. Expression of monocyte chemoattractant protein-1 in rat dorsal root ganglia and spinal cord in experimental models of neuropathic pain. Brain Res (2009) 1251:103–11. doi:10.1016/j.brainres.2008.11.046
33. Thacker MA, Clark AK, Bishop T, Grist J, Yip PK, Moon LD, et al. CCL2 is a key mediator of microglia activation in neuropathic pain states. Eur J Pain (2009) 13(3):263–72. doi:10.1016/j.ejpain.2008.04.017
35. Zeng H, Chattarji S, Barbarosie M, Rondi-Reig L, Philpot BD, Miyakawa T, et al. Forebrain-specific calcineurin knockout selectively impairs bidirectional synaptic plasticity and working/episodic-like memory. Cell (2001) 107(5):617–29. doi:10.1016/S0092-8674(01)00585-2
36. Pei JJ, Braak H, An WL, Winblad B, Cowburn RF, Iqbal K, et al. Up-regulation of mitogen-activated protein kinases ERK1/2 and MEK1/2 is associated with the progression of neurofibrillary degeneration in Alzheimer’s disease. Brain Res Mol Brain Res (2002) 109(1–2):45–55. doi:10.1016/S0169-328X(02)00488-6
39. Garnier M, Zaratin PF, Ficalora G, Valente M, Fontanella L, Rhee MH, et al. Up-regulation of regulator of G protein signaling 4 expression in a model of neuropathic pain and insensitivity to morphine. J Pharmacol Exp Ther (2003) 304(3):1299–306. doi:10.1124/jpet.102.043471
40. Roman DL, Traynor JR. Regulators of G protein signaling (RGS) proteins as drug targets: modulating G-protein-coupled receptor (GPCR) signal transduction. J Med Chem (2011) 54(21):7433–40. doi:10.1021/jm101572n
41. Iwai K, Koike M, Ohshima S, Miyatake K, Uchiyama Y, Saeki Y, et al. RGS18 acts as a negative regulator of osteoclastogenesis by modulating the acid-sensing OGR1/NFAT signaling pathway. J Bone Miner Res (2007) 22(10):1612–20. doi:10.1359/jbmr.070612
42. Yowe D, Weich N, Prabhudas M, Poisson L, Errada P, Kapeller R, et al. RGS18 is a myeloerythroid lineage-specific regulator of G-protein-signalling molecule highly expressed in megakaryocytes. Biochem J (2001) 359(Pt 1):109–18. doi:10.1042/bj3590109
43. Heximer SP, Srinivasa SP, Bernstein LS, Bernard JL, Linder ME, Hepler JR, et al. G protein selectivity is a determinant of RGS2 function. J Biol Chem (1999) 274(48):34253–9. doi:10.1074/jbc.274.48.34253
44. Popov SG, Krishna UM, Falck JR, Wilkie TM. Ca2+/calmodulin reverses phosphatidylinositol 3,4,5-trisphosphate-dependent inhibition of regulators of G protein-signaling GTPase-activating protein activity. J Biol Chem (2000) 275(25):18962–8. doi:10.1074/jbc.M001128200
45. Costigan M, Samad TA, Allchorne A, Lanoue C, Tate S, Woolf CJ. High basal expression and injury-induced down regulation of two regulator of G-protein signaling transcripts, RGS3 and RGS4 in primary sensory neurons. Mol Cell Neurosci (2003) 24(1):106–16. doi:10.1016/S1044-7431(03)00135-0
46. Gu Z, Jiang Q, Yan Z. RGS4 modulates serotonin signaling in prefrontal cortex and links to serotonin dysfunction in a rat model of schizophrenia. Mol Pharmacol (2007) 71(4):1030–9. doi:10.1124/mol.106.032490
47. Magnoni R, Palmfeldt J, Christensen JH, Sand M, Maltecca F, Corydon TJ, et al. Late onset motoneuron disorder caused by mitochondrial Hsp60 chaperone deficiency in mice. Neurobiol Dis (2013) 54:12–23. doi:10.1016/j.nbd.2013.02.012
48. Tanaka K, Eskin A, Chareyre F, Jessen WJ, Manent J, Niwa-Kawakita M, et al. Therapeutic potential of HSP90 inhibition for neurofibromatosis type 2. Clin Cancer Res (2013) 19(14):3856–70. doi:10.1158/1078-0432.CCR-12-3167
49. Villa C, Fenoglio C, De Riz M, Clerici F, Marcone A, Benussi L, et al. Role of hnRNP-A1 and miR-590-3p in neuronal death: genetics and expression analysis in patients with Alzheimer disease and frontotemporal lobar degeneration. Rejuvenation Res (2011) 14(3):275–81. doi:10.1089/rej.2010.1123
51. Yokota Y, Ring C, Cheung R, Pevny L, Anton ES. Nap1-regulated neuronal cytoskeletal dynamics is essential for the final differentiation of neurons in cerebral cortex. Neuron (2007) 54(3):429–45. doi:10.1016/j.neuron.2007.04.016
52. Cheng J, Fan YH, Xu X, Zhang H, Dou J, Tang Y, et al. A small-molecule inhibitor of UBE2N induces neuroblastoma cell death via activation of p53 and JNK pathways. Cell Death Dis (2014) 5:e1079. doi:10.1038/cddis.2014.54
53. McCamphill PK, Farah CA, Anadolu MN, Hoque S, Sossin WS. Bidirectional regulation of eEF2 phosphorylation controls synaptic plasticity by decoding neuronal activity patterns. J Neurosci (2015) 35(10):4403–17. doi:10.1523/JNEUROSCI.2376-14.2015
54. Wang CC, Kadota M, Nishigaki R, Kazuki Y, Shirayoshi Y, Rogers MS, et al. Molecular hierarchy in neurons differentiated from mouse ES cells containing a single human chromosome 21. Biochem Biophys Res Commun (2004) 314(2):335–50. doi:10.1016/j.bbrc.2003.12.091
55. Martin S, Zekri L, Metz A, Maurice T, Chebli K, Vignes M, et al. Deficiency of G3BP1, the stress granules assembly factor, results in abnormal synaptic plasticity and calcium homeostasis in neurons. J Neurochem (2013) 125(2):175–84. doi:10.1111/jnc.12189
57. Wenning GK, Shephard B, Hawkes C, Petruckevitch A, Lees A, Quinn N. Olfactory function in atypical parkinsonian syndromes. Acta Neurol Scand (1995) 91(4):247–50. doi:10.1111/j.1600-0404.1995.tb06998.x
59. Khan NL, Katzenschlager R, Watt H, Bhatia KP, Wood NW, Quinn N, et al. Olfaction differentiates parkin disease from early-onset parkinsonism and Parkinson disease. Neurology (2004) 62(7):1224–6. doi:10.1212/01.WNL.0000118281.66802.81
62. Ansoleaga B, Garcia-Esparcia P, Llorens F, Moreno J, Aso E, Ferrer I. Dysregulation of brain olfactory and taste receptors in AD, PSP and CJD, and AD-related model. Neuroscience (2013) 248:369–82. doi:10.1016/j.neuroscience.2013.06.034
63. Garcia-Esparcia P, Schluter A, Carmona M, Moreno J, Ansoleaga B, Torrejon-Escribano B, et al. Functional genomics reveals dysregulation of cortical olfactory receptors in Parkinson disease: novel putative chemoreceptors in the human brain. J Neuropathol Exp Neurol (2013) 72(6):524–39. doi:10.1097/NEN.0b013e318294fd76
66. Chen B, Brinkmann K, Chen Z, Pak CW, Liao Y, Shi S, et al. The WAVE regulatory complex links diverse receptors to the actin cytoskeleton. Cell (2014) 156(1–2):195–207. doi:10.1016/j.cell.2013.11.048
67. Biswas S, Emond MR, Duy PQ, Hao le T, Beattie CE, Jontes JD. Protocadherin-18b interacts with Nap1 to control motor axon growth and arborization in zebrafish. Mol Biol Cell (2014) 25(5):633–42. doi:10.1091/mbc.E13-08-0475
68. Zhu Y, Richardson JE, Hale P, Baldarelli RM, Reed JD, Recia JM, et al. A unified gene catalog for the laboratory mouse reference genome. Mamm Genome (2015) 26(7):295–304. doi:10.1007/s00335-015-9571-1
70. Wang H, Sun H, Della Penna K, Benz RJ, Xu J, Gerhold DL, et al. Chronic neuropathic pain is accompanied by global changes in gene expression and shares pathobiology with neurodegenerative diseases. Neuroscience (2002) 114(3):529–46. doi:10.1016/S0306-4522(02)00341-X
71. Sakai A, Saitow F, Miyake N, Miyake K, Shimada T, Suzuki H. miR-7a alleviates the maintenance of neuropathic pain through regulation of neuronal excitability. Brain (2013) 136(Pt 9):2738–50. doi:10.1093/brain/awt191
Keywords: neuropathic pain, microarray data reuse, differential expression, time-course profiling, pathway analysis, non-coding RNAs, networks
Citation: Raju HB, Tsinoremas NF and Capobianco E (2016) Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data. Front. Neurol. 7:168. doi: 10.3389/fneur.2016.00168
Received: 22 June 2016; Accepted: 20 September 2016;
Published: 18 October 2016
Edited by:Michael F. Miles, Virginia Commonwealth University, USA
Reviewed by:Alessandro Laganà, Icahn School of Medicine at Mount Sinai, USA
Sean P. Farris, Waggoner Center for Alcohol and Addiction Research, USA
Copyright: © 2016 Raju, Tsinoremas and Capobianco. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Enrico Capobianco, firstname.lastname@example.org