Original Research ARTICLE
Co-expression Network Approach Reveals Functional Similarities among Diseases Affecting Human Skeletal Muscle
- 1Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States
- 2Departments Cellular and Molecular Medicine, Computer Science and Engineering, University of California, San Diego, La Jolla, CA, United States
Diseases affecting skeletal muscle exhibit considerable heterogeneity in intensity, etiology, phenotypic manifestation and gene expression. Systems biology approaches using network theory, allows for a holistic understanding of functional similarities amongst diseases. Here we propose a co-expression based, network theoretic approach to extract functional similarities from 20 heterogeneous diseases comprising of dystrophinopathies, inflammatory myopathies, neuromuscular, and muscle metabolic diseases. Utilizing this framework we identified seven closely associated disease clusters with 20 disease pairs exhibiting significant correlation (p < 0.05). Mapping the diseases onto a human protein-protein interaction network enabled the inference of a common program of regulation underlying more than half the muscle diseases considered here and referred to as the “protein signature.” Enrichment analysis of 17 protein modules identified as part of this signature revealed a statistically non-random dysregulation of muscle bioenergetic pathways and calcium homeostasis. Further, analysis of mechanistic similarities of less explored significant disease associations [such as between amyotrophic lateral sclerosis (ALS) and cerebral palsy (CP)] using a proposed “functional module” framework revealed adaptation of the calcium signaling machinery. Integrating drug-gene information into the quantitative framework highlighted the presence of therapeutic opportunities through drug repurposing for diseases affecting the skeletal muscle.
Human skeletal muscle is a versatile tissue, with structure and function governed by complex interactions between its sensing, signaling, force transduction, metabolic and basic cellular processing machinery (transcription and translation) (Kierszenbaum and Tres, 2015). Precisely coordinated activity between each of its components is essential for muscle health and normal functioning of associated motor activity. A disruption to any component within this complex system of interactions leads to disorders of the muscle, typically characterized by muscle fiber loss, reduced motor output and possibly death. Epidemiological, clinical, and physiological studies have contributed immensely to our understanding of pathogenesis and manifestation of individual muscle diseases, revealing similarities amongst them (Jones and Round, 1990; Askanas and Engel, 2008).
Recent advancements in genomic technologies have enabled newer opportunities for understanding mechanisms that are common and distinct across muscle pathologies. High-throughput measurements, multiscale phenotypic data and integrative analysis are beginning to provide increasingly comprehensive understanding of muscle dynamics, specifically for biomarker discovery (Laaksonen et al., 2006; Dewey et al., 2011; Azuaje et al., 2012; Gupta et al., 2014). However, very few studies have developed and implemented techniques for extracting similarities underlying muscle diseases, on a much broader scale. For instance, Blandin et al. (2013) utilized the yeast-two hybrid (Y2H) methodology combined with co-expression networks to generate a muscular LGMD-centered interaction network (LGMD-Limb girdle muscular dystrophy) identifying a total of 1,018 proteins connected by 1,492 direct binary interactions enriched for cytoskeletal and extracellular matrix protein interactions.
In this study, we propose a quantitative framework to assess relationships between 20 diseases affecting the muscle which, based on their pathological/clinical presentation, were categorized into dystrophinopathies, inflammatory myopathies, neuromuscular and metabolic diseases of the muscle (Table 1, see Methods) (Engel and Franzini-Armstrong, 2004). Briefly, dystrophinopathies include Emery-Dreifuss muscular dystrophies (EDMD), limb girdle muscular dystrophies (LGMDs), Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD). While EDMDs and LGMDs are caused by mutations in muscle structural genes, DMD and BMD are caused by frame shift and in-frame mutations respectively, of the DMD gene. Progressive weakening and wasting of skeletal muscle characterize all these diseases. The inflammatory myopathies considered include polymyositis (PM), dermatomyositis (DM), juvenile dermatomyositis (JDM), inclusion body myositis (IBM) and hereditary inclusion body myositis (HIBM). These myopathies are caused primarily by infiltration of immune cells and are characterized by chronic inflammation and weakening of muscle. Metabolic diseases of the muscle including the mitochondrial myopathies [progressive external opthalmoplegia (PEO) and mitochondrial encephalomyopathy, lactic acidosis and stroke-like episodes (MELAS)], acute quadriplegic myopathy (AQM) and chronic fatigue syndrome (CFS) all exhibit impaired metabolism, preferential loss of thick filaments and altered excitability of muscle. While the mitochondrial myopathies-PEO and MELAS are caused by mutations in mitochondrial DNA (MT-TL1 gene), AQM and CFS are idiopathic. The neuromuscular diseases-amyotrophic lateral sclerosis (ALS), spastic paraplegia (SP) and cerebral palsy (CP) all affect muscle secondary to neurodegeneration and are characterized by spasticity and progressive weakening of muscle.
Despite observable physiological similarities in diseases within each group, diseases exhibit considerable heterogeneity in intensity, etiology, phenotypic manifestation, and gene expression. For instance, muscle in both inclusion body myopathies (HIBM and IBM, Table 1) exhibit chronic inflammation with visible vacuoles, however, IBM is mostly idiopathic, while mutations in GNE and MYH2 cause HIBM (Tomé and Fardeau, 1998). Systems biology, in particular network theory, facilitates an understanding of the heterogeneity and uniqueness underlying diseases mechanisms by integrating multiscale data (transcript, protein, and drug). Here we utilized a co-expression based scoring scheme to generate a network and elucidate mechanisms underlying significant disease-relevant modules. We further mapped them onto a “functional module” framework within muscle and onto a human protein interaction network, to infer a common program of regulation underlying majority of the diseases. Further, incorporating drug data into the quantitative framework allowed for identifying avenues for drug repurposing in treating diseases of the muscle.
Materials and Methods
Data Acquisition and Processing
The list of diseases available under the Medical Subject Headings (MeSH) terms “neuromuscular,” “musculoskeletal,” and “muscular” diseases was used as a guideline for identifying muscular diseases of interest (Lipscomb, 2000). All available (RNAseq + microarray platform) information from GEO (Barrett et al., 2013) was downloaded and surveyed for maximum coverage of muscle diseases in the MeSH headings identified above. A single platform GPL96 (Affymetrix HG-U133A) offered the highest coverage of muscle diseases surveyed. Choosing studies from one platform alone (GPL96) limited possible noise arising from platform differences. Additionally, studies with non-muscle sample tissue, and less than two-samples/condition were eliminated. Filtering the data for accuracy, and experimental context using our constraints resulted in microarrays from 19 human diseases. In addition to these 19 diseases, samples from “CP” (not available under the MeSH term categories considered above) was also included resulting in 20 diseases for analysis. CP is a movement disorder characterized by contractures of the muscle with its primary insult on the nervous system. Gene expression data for studies with. CEL files (disease and control) are normalized using RMA (Robust Multi-array Average). Studies with series matrix files were downloaded as is. ComBat cross-array normalization is utilized for diseases with more than one associated GSE (e.g., LGMD2A, DMD, and JDM, Table 1), to remove study artifacts (Johnson et al., 2007). Multiple probes were accounted for using the “collapseRows” function of WGCNA library in R (Langfelder and Horvath, 2008). The reduced and processed data sets subsequently included z-transformed expression values of 12,789 genes across 20 diseases.
Identifying Disease Similarity
Cyber-T's (Kayala and Baldi, 2012) regularized t-test performed on z-transformed expression values was used as a measure of a gene's differential activity. The associated T-statistic provided insight into the difference in mean expression of a gene across conditions and was referred to as the differential gene activity (DGA) score for each disease state (Table S0). Partial pearson correlation of DGA scores across diseases, quantified disease similarity based on expression profiles. Use of partial correlation have been shown to be effective in factoring out dependencies such as variation in tissue types and experimental conditions (Suthram et al., 2010). The “pcor” function available through the “ppcor” package in R (Kim, 2015) was utilized to calculate all possible pairwise partial correlations between each pair of samples (here diseases) while eliminating the effect of all other samples. We identified disease clusters using hierarchical (complete-linkage) clustering of partial correlation.
Disease-Gene Based Disease Overlap
A comprehensive list of genetic factors affecting the 20 muscle diseases from various sources such as OMIM (Amberger et al., 2009), PheGenI (Ramos et al., 2014), ClinVar (Landrum et al., 2014), and DisGeNET (Piñero et al., 2015) (henceforth referred to as the disease-gene list) was downloaded and curated. A hypergeometric model, with a null that disease-genes were randomly drawn from the space of all genes was used to ascertain the statistical overlap between diseases based on the disease-gene list. The function “phyper,” available through base R packages was utilized to calculate the hypergeometric p-values.
Muscle “Functional Modules” and Functional Module Activity Score
We developed a framework of “functional modules” (FM) within muscle that represented significant units required for normal muscle activity (Table 2). Each of the 23 FMs were represented by a group of manually curated list of biomarkers that belonged to a broader functional pathway (family) within muscle (Table S1). This list expanded on an existing framework for muscle functional families (Mukund et al., 2014). Several biomarkers within each FM were multi-functional and were placed in FMs that were most relevant to skeletal muscle. The functional module activity (FMA) score associated with each functional module i in disease k, was calculated as mean DGA score of its component genes. This score reflected the overall state of a module in a particular disease (Table S2). Specifically, a negative FMA values reflected a downregulation of the module genes in disease with respect to controls and the converse for positive FMA values. Significance testing at p < 0.05 identified FMs associated with each disease state.
Human Protein Interaction Network and Protein Module Activity Score
The human protein-protein interaction network (PPIN) was extracted from the STRING database (v9.1), containing both direct (physical) and indirect (functionally derived) interactions (Franceschini et al., 2013). Limiting the interactions to a combined score cutoff of >0.85 allowed us to account for strong interactions, with sufficient experimental evidence and resulted in a total of 1,48,030 unique interactions among 10,341 proteins. The PPIN was clustered using MCL (Markov Cluster) algorithm a fast, scalable and unsupervised cluster algorithm for networks based on simulation of stochastic flow. The MCL algorithm finds cluster structure in graphs by iteratively computing the probabilities of random walks through the graph (markov matrices) using an alternation of two operators called expansion and inflation. Expansion is the power of a stochastic matrix using the normal matrix product (i.e., matrix squaring) and inflation is the Hadamard power of a matrix (taking powers entrywise), followed by a scaling step. A detailed description and comparison are provided in Enright et al. (2002) and Brohee and Van Helden (2006). Clustering the PPIN using MCL clustering resulted in 1,025 protein modules (filtered for minimum module size >2) with sizes ranging between 3 and 256 genes/proteins. We considered 764/1,025 modules (6,215/8,581 proteins overlapping with our list of 12,789 genes) with at least three genes/module for further analysis (Table S3). Analogous to Suthram et al. (2010) the protein module activity (PMA) score was calculated for each protein module i in a disease k as the mean of DGA scores for its component genes. In the end, we obtained a vector of PMAik for each disease, representing the activity level of a given protein module in each disease state (Table S4). Significance testing identified protein modules with a p < 0.05 associated with each disease. We defined the threshold for module expression as upper 50th percentile of |mean PMA| scores from significant modules across diseases.
A background distribution of disease correlations expected at random was generated to assess the significance of observed correlations. Disease and control sample labels were shuffled prior to computing DGA scores and disease similarity (through partial correlations). The whole process was repeated 100 times to create a background distribution of disease correlations. This was utilized to determine a permutation-based p-value (number of the times the permuted statistic exceeded the observed statistic). The background distributions for PMA and FMA scores were similarly generated.
Network Visualization and Functional Enrichment
All network visualization was performed using Cytoscape software (Shannon et al., 2003). Enrichment was identified using Gene Ontology's -Biological Process category available via ClueGO-a Cystoscape plugin (Bindea et al., 2009) and DAVID v6.8 (Dennis et al., 2003). Venn diagram was created using the “VennDiagram” package available through R/Bioconductor (Chen and Boutros, 2011).
We utilized the drug gene interaction database (DGIdb) (Wagner et al., 2015) to identify a list of expert-curated proteins/genes that serve as druggable targets. A list of currently approved drugs from the FDA (or at least one jurisdiction) was downloaded from Drugbank (Law et al., 2014), while drugs treating the disease clusters were obtained from Medscape (Medscape, 2017)1 and UpToDate® (UpToDate, 2017)2
Clustering Muscle Diseases
Expression data for diseases affecting muscle (Table 1) was processed (see Methods) resulting in a set of 12,789 genes across the 20 diseases and utilized in our analysis. The change in gene expression for each disease state (with respect to controls) was quantified as the associated T-statistic of z-normalized values (see Methods, Figure 1A, Figure S1) and referred to as differential gene activity (DGA) score. Hierarchical clustering of diseases based on partial correlation of DGA scores resulted in seven disease clusters (Figure 1B, Table 1). Twenty (~10%) of the 190 possible disease pairs were identified as being significant (p < 0.05) via significance testing (see Methods, Figure 1C). Few well-characterized intra-cluster associations marginally missed the significance threshold and were not captured, for instance the association between LGMD2B and DMD, BMD and DMD. We ascertained if the observed significant disease correlations (based on DGA scores) shared known genetic associations. We compiled a list of genes associated with our disease set from publicly available databases (see Methods). Pairs of disease were considered to significantly share disease-genes if the hypergeometric p-value of overlap was < 0.05. We identified that 26/190 possible disease pairs shared a significant genetic basis, and 6/26 interactions overlapped with our 20 significant DGA based associations (Table 3A), one-sided Fisher's exact test p-value of 0.036 (Table 3B).
Figure 1. Extracting significant disease similarities from 20 diseases affecting muscle. (A) shows the workflow involved in calculating the differential gene activity (DGA) score and hierarchical clustering of the scores to extract disease clusters based on DGA. (B) shows the hierarchical clustering dendrogram (method- complete) of disease correlations. Tree cut height (red line) corresponds to a p-value of 0.05 and disease clusters identified below this line were identified to be significantly correlated. (C) This network represent the 190 possible associations between the 20 diseases. Edges highlighted in red indicate the associations identified as being highly significant through permutation testing. The various node colors indicate the clusters the diseases belong to as identified through hierarchical clustering. The nodes colored in gray were not clustered.
Identifying Protein Modules Underlying Muscle Diseases
Protein modules identified within a human-PPIN through modularity detection algorithms (Enright et al., 2002; Brohee and Van Helden, 2006) represented a group of strongly interacting proteins with putative functional associations. We utilized PPIN to examine if there was a common program of regulation underlying different pathologies.
We generated a catalog of 1,025 protein modules by querying a large-scale human PPIN available through STRING (Franceschini et al., 2013) (see Methods) We identified that 764/1,025 modules (containing at least 3 nodes/module), overlapped our list of 12,789 genes, with module sizes varying from 3 to 256 proteins. We defined a protein module activity (PMA) score calculated for each protein module identified per disease as the mean of component protein DGA scores. The common underlying program of regulation or “signature” protein modules were identified utilizing a 2-fold approach-first, all modules with absolute PMA values significantly higher than random (p < 0.05) in more than half the diseases (n > 10) were extracted (35 modules passed this selection criterion). Next, using the threshold for module expression (see Methods), 17/35 modules were identified as the underlying protein signature (Figure 2, Table S5). The complete list of signature modules identified and their top three enrichment terms are provided in Table S6.
Figure 2. Combined functional enrichment of protein signature underlying diseases affecting the muscle. This figure provides a graphical representation of the top enrichment terms identified in the 17 signature protein modules (combined). The signature modules represent a set of modules that were identified as underlying a majority of the diseases considered in this study Size of each section of the pie is proportional to the number of genes identified in each category.
A “Functional Module” Framework to Identify Muscle-Specific Mechanistic Changes
Utilizing the human PPIN allowed us to infer a common program of dysregulation underlying muscle diseases. We next sought to elucidate mechanistic similarities between significant disease pairs, in a more skeletal muscle-specific context. We developed a framework of FM that represented significant units required for normal muscle activity (Table 2, see Methods). The set of 23 “FM” captured key biomarkers associated with major processes in skeletal muscle. We calculated a functional module activity (FMA) score that reflected the collective behavior of genes in each functional module, for a given disease state (Table S2). We also computed an associated p-value for each functional module via permutation testing.
The FM framework was utilized to assess common functional mechanisms underlying 20 significant disease-pairs (with p < 0.05). Table 4 presents a subset of significant disease-pairs, which shared four or more significant FM between them. For instance, JDM and DMD had 15/23 FM overrepresented (p < 0.05). These included modules associated with atrophy, inflammation, ECM and cytoskeleton, all members of the excitation contraction coupling family (FM IDs-3,4,5), members of contraction (7,8), mitochondrial energy metabolism (11,12,13,14,15), inflammation, and fiber type maintenance (19 and 21).
Drug Targets Over-Representation in Disease-Associated Protein Modules
We ascertained if druggable targets were over-represented in the common protein signature modules to support the hypothesis that drugs targeting common targets can treat a variety of muscle diseases. We identified 54/156 proteins in the signature modules belonged to at least one druggable category, with at least one interaction as categorized in the drug gene interaction database (Wagner et al., 2015). We identified 41 of these proteins as targets for 81 approved drugs used in treating a variety of diseases and provided possible avenues for exploration of therapeutic options (Figure 3, top panel; Table S7).
Figure 3. Representative set of the protein signature modules underlying diseases affecting the muscle. Top panel (i–iv) represents 4/17 protein signature modules identified. The green nodes represent proteins that contain at least one interaction as defined in the Drug-gene interaction database (DGIdb). Number shown in the nodes represent the number of approved drugs targeting the proteins. Lower panel (i–iv) represents functional enrichment (top group terms) identified for the corresponding modules panel (i–iv) in the top panel, using ClueGO (see Methods). All 17 modules identified are presented in Figure S2.
We next hypothesized that protein modules exclusively regulated in each disease cluster might be enriched for drug targets. For instance, we identified protein modules associated with 3 disease clusters (DM/JDM, IBM/PM and DMD/BMD/LGMD) using an approach similar to section Identifying Protein Modules Underlying Muscle Diseases. We identified −20 approved drugs to be targeting proteins regulated in modules unique to the IBM/PM cluster, 26 drugs targeting proteins unique to the DMD/BMD/LGMD cluster and 203 in the DM/JDM cluster (Table S8).
Clustering Diseases Based on Differential Gene Activity Identifies Both Well and Less Characterized Disease Associations
The seven disease clusters identified included several well-characterized clusters such as muscular dystrophies (BMD, DMD, LGMD2I, and LGMD2A), mitochondrial disorders (MELAS and PEO); and a few less characterized clusters such as Sarcopenia, EDMD, LGMD2B and FSHD. Fisher's exact test was performed to capture the significance of overlap between disease pairs identified using DGA scores and those with shared genetic basis (using disease-gene lists). The Fisher's exact p-value of 0.036 implied that the genetic disease similarity was significantly captured by DGA based correlations in this study. A relatively low, albeit, significant p-value of fisher's exact test reflected the possibility that certain disease associations did not pass our significance threshold or were not captured using disease-gene lists. This could also be attributed to certain diseases being better studied than others. For example, genes associated with ALS (895 genes) where far greater than those associated with AQM (26 genes).
Deficient Bioenergetics and Calcium Homeostasis–a Common Program of Dysregulation Underlying Diseases Affecting the Muscle
Enrichment analysis of 17 signature modules, identified from the protein interaction network indicated an over-representation of modules associated with immune response (e.g., module 14), mitochondrial function (e.g., modules 41, 168, 271, 355, and 340), mitochondrial structure (e.g., 537), metabolism (e.g., 426), calcium homeostasis in muscle (e.g., 334), and the extracellular matrix (e.g., 153, 416) (Figure 2, Table S3).
Early mitochondrial research in muscle disorders have suggested that a widespread occurrence of mitochondrial anomalies did not necessarily imply a primary deficiency in efficacy of mitochondrial function (muscle meeting its energy requirements) (Stadhouders and Sengers, 1987). However, more recent research has repeatedly suggested deficient bioenergetics underlie the pathology of several muscular and neuromuscular diseases in mammalian models (Wallace, 2000, 2013; Pieczenik and Neustadt, 2007; Ramadasan-Nair et al., 2014). Pathology of neuromuscular diseases such as ALS also exhibits mitochondrial dysfunction as a major event in its progression (Dupuis and Loeffler, 2009; Cozzolino and Carrì, 2012). Reduced efficiency in the action of the tricarboxylic acid (TCA) cycle has been also assessed in diseased muscle associated with inflammatory myopathies (Coley et al., 2012), dystrophy (Even et al., 1994), and mitochondrial diseases such as MELAS, PEO (Wallace, 1999). Mounting evidence has suggested that the pathological muscle wasting observed in dystrophies (e.g., DMD) might be due to reduced ATP availability required for maintenance of Ca2+ homeostasis and fiber regeneration (Timpani et al., 2015). Bioenergetic pathway enzymes have recently shown to be relevant biomarkers of muscular and neuromuscular disease progression (Santacatterina et al., 2015).
Ca2+ homeostasis in muscle largely determines its contraction and relaxation properties. This is tightly regulated by the Ca2+ signaling apparatus within muscle comprising of the ryanodine receptors, sarcoplasmic endoplasmic reticulum calcium pumps (SERCA), troponin complex, calsequestrin; in addition to Ca2+ binding proteins such as parvalbumin, sarcolipin, phospholamban and calpains. We observed a strong dysregulation of several of these proteins- ATP2A1 (SERCA pump), sarcolipin (SLN, which inhibits SERCA) and calsequestrin [CASQ, restrains Ca2+ to the sarcoplasmic reticulum (SR)], in ALS and DMD (Wang et al., 2012; Mukund and Subramaniam, 2015). Likewise, regulation of ASPH (regulator of ryanodine receptors) and SLN have also been observed in muscle from diseases such as CP (Smith et al., 2011). Figure 3 provides a representative sample of 4/23 signature modules and their enrichment.
Although existing research on several muscle diseases (such as ALS, DMD, BMD, and CP) has shown varying extents of mitochondrial dysfunction and calcium dysregulation in their pathomechanism, our approach points to widespread, statistically non-random dysregulation of mitochondrial function and calcium homeostasis associated with most muscle diseases including relatively less characterized diseases such as AQM and CFS. Further, the absence of modules associated with structural sarcomeric proteins (myosins, z–disc proteins, dystroglycan) at our significance threshold emphasizes the vital role of muscle bioenergetics, calcium signaling and homeostasis pathways in the pathogenesis of diseases affecting muscle.
Muscle Specific Mechanistic Changes Underlie Disease Pairs
The functional module framework utilized, allowed us to capture common functional mechanisms underlying disease pairs. Table 4 represented a subset of significant disease-pairs, which shared four or more significant FM between them (e.g., JDM and DMD having 15/23 FM overrepresented; p < 0.05). DMD and JDM represent myopathies, where the primary insult is on the skeletal muscle however, JDM is a systemic autoimmune vasculopathy characterized by weakness of proximal muscles and skin rashes with its histopathology showing evidence for necrosis, fiber size variation, and a muscle degeneration/regeneration phenotype (Peloro et al., 2001). JDM shares many pathologic similarities with muscle of children affected by DMD. A comparison of the expression profiles of children with DMD and JDM have revealed similarities in gene cascades involving muscular atrophy, deficits in mitochondrial metabolism and contraction, along with upregulation of extracellular matrix and cytoskeletal cascades (Tezak et al., 2002) consistent with the functional overlap observed in our study (see Results).
Finding relevant FM consistent with the current understanding of similarities between JDM and DMD further justified the efficacy of the adopted approach in identifying FM affected in more than one diseases state, in a context specific manner. To further elucidate disease associations much less explored, we focused on two diseases ALS and CP and their overlapping FM and associated FMA scores.
Calcium Dysregulation in Patients with ALS and CP
ALS and CP, both represent neurological diseases with their primary insult on upper and/or lower motor neurons. While ALS is a neurologically progressive disease, CP is not, with both disorders exhibiting progressive musculoskeletal weakness and increased spasticity. While ALS muscle is additionally characterized by denervation atrophy and spasticity, there is distinctive shortening and subsequent weakness of CP muscle (Graham and Selber, 2003; Kiernan et al., 2011). We identified 6 FM as being significantly dysregulated in both ALS and CP (Table 4). The associated FMA scores reflected the state of FM in disease, specifically a negative FMA scores reflected a general downregulation of genes associated with the functional module in the particular disease.
We observed that 5/6 FM identified above were similarly regulated in both ALS and CP (Table 5), mainly associated with mitochondrial metabolism (FM IDs 12–15). There is abundant evidence in literature for mitochondrial dysfunction particularly electron transport chain dysregulation and its role in ALS (Borthwick et al., 1999; Crugnola et al., 2010) in neurons. Our results indicated similar programs of mitochondrial dysregulation to be associated with muscle in patients with ALS and CP. Though no detailed studies in muscle exist to corroborate mitochondrial dysfunction in CP, Smith et al. (Smith et al., 2009, 2011) also show a general downregulation of mitochondrial transcripts. Comparison of the expression values for genes associated with these FM showed an R2 of 0.9.
Cellular dysregulation of Ca2+ dysregulation in ALS within affected neurons is well-characterized (Grosskreutz et al., 2010); likewise, dramatic Ca2+ dysregulation within muscle from CP patients has also been suggested to occur (Smith et al., 2009, 2011). Interestingly, the FMA scores indicated a differential regulation of calcium dynamics/homeostasis in muscle from ALS and CP (Table 5). A few notable differences identified in the differential regulation of ALS and CP FM were as follows- ATP2A1 and ATP2A2, two-muscle specific, energy demanding fast fiber SERCA pumps (sarco (endo) plasmic reticulum Ca2+ ATPase) were very strongly downregulated in ALS, suggesting a reduced efficacy in sequestering Ca2+ to the SR from the cytosol (Periasamy and Kalyanasundaram, 2007). The reduced need for regulation of SERCA pumps was reflected in the downregulation of its two strong regulators- SLN and PLN. Upregulation of ryanodine receptors (RYR3) further emphasized leakage of SR Ca2+ into the cytosol (Perez et al., 2005). Activation of non-skeletal muscle isoforms pointed to a shift in fiber composition toward a slower/more mixed phenotype in ALS.
In CP, increased Ca2+ was inferred from the massive upregulation of PVALB, which selectively binds to free Ca2+ to reduce free intracellular Ca2+ (subsequently, bringing about muscle relaxation). Though no significant changes were observed with respect the SERCA pumps or ryanodine receptors, FKBP1A and PDE4D that prevent channel leaking were significantly downregulated and PLN that controls the Ca2+ intake by the SERCA pumps was significantly upregulated in CP. On the other hand, upregulation of ASPH, TRDN, and CASQ1 indicated that muscle was actively trying to sequester intracellular Ca2+ to the stores. Figure 4 represents associated fold changes for select genes from the calcium homeostasis functional module.
Figure 4. The Ca2+ homeostasis associated functional module in ALS and CP. This figure captures the calcium dysregulation mechanisms (and difference between ALS and CP) via the fold changes associated with select genes of the Ca2+ homeostasis functional module. The left half indicates the fold change associated with ALS while the right half indicates the fold change associated with CP.
Taken together this indicates increased cytosolic Ca2+ in both diseases, however, in ALS- the Ca2+ homeostasis machinery associated with SR appears to be severely challenged by the disease with increased leakage of Ca2+ from the intracellular stores and a constrained uptake of Ca2+ back into SR. In contrast, CP displays a use-dependent decrease in capacity of the SR albeit muscle's efforts to actively recover its Ca2+ stores. This dramatic adaptation in both ALS and CP muscle might additionally lead to altered muscle contractile properties and mitochondrial functions.
Exploring Opportunities for Drug-Repositioning
Current advancements in understanding muscular/ neuromuscular disease pathophysiology have allowed for drastic improvements in drug therapy, however, several of the diseases discussed here are as yet untreatable with high rates of morbidity and mortality with limited therapeutic options. Gene therapy and precision medicine are yet to be realized in their full potential for several of the diseases considered here (Nightingale et al., 2016; Dalakas, 2017). Given this shortage of drug/therapeutic availability for muscle diseases, we aimed to identify if drug-repurposing opportunities could be inferred from our quantitative framework across the 20 diseases. We identified 81 approved drugs to be targeting proteins contained within the signature modules (Table S7) warranting further analysis.
Further, several of the available treatments for muscle diseases currently only offer symptomatic relief, for instance, no specific therapeutic treatments exist for dystrophies such as DMD, BMD LGMD, or EDMD, with patients requiring aggressive supportive care. These patients are only often treated for associated conditions of the heart and lung. Likewise for neurological diseases which are multi-symptomatic such as ALS and CP, patients are mostly provided symptomatic relief with a multitude of agents such as antiparkinsonian, anticonvulsant, antidopaminergic, antispasticity, anti-sialorrhea, or antidepressants. This prompted us to further explore if there were drugs uniquely shared by disease clusters that could provide opportunities for drug repositioning within disease clusters.
Our results from three associated disease clusters (DM/JDM, IBM/PM and DMD/BMD/LGMD clusters) supported this hypothesis (Figure 5, Table S8), for example, several drugs such as diltiazem, cyclophosphamide, cyclosporine—prescribed for management of DM and JDM were identified associated within the DM/JDM cluster (“UpToDate” 2017)3. We also observed in the cluster presence of Ruxolitinib- a JAK inhibitor developed to treat neoplastic diseases but suggested as investigational therapy for DM (Hornung et al., 2014). Interestingly, Sirolimus a FDA approved drug for prophylaxis against organ rejection, is currently suggested for symptomatic relief in patients with DM (Nadiminti and Arbiser, 2005). Sirolimus appears to also target proteins associated in the DMD/BMD/LGMD clusters suggesting possible therapeutic opportunities in different muscle disease categories for Sirolimus (and possibly other shared drugs identified in Figure 5). These results further emphasized opportunities and a need for exploring repurposing of therapeutics in diseases affecting the muscle.
Figure 5. Drug disease network for 3 disease clusters. (A) shows the number of protein modules associated with each disease cluster considered e.g., 13 protein modules were shared among all clusters, 14 modules were uniquely regulated in the DMD/BMD/LGMD cluster, 30 in the IBM/PM cluster and 42 in the DM/JDM cluster. (B) represents the approved drugs (Table S8) associated with the protein modules uniquely regulated in each disease cluster. Nodes in yellow are drugs currently utilized for treatment in the diseases associated with the cluster. Sirolimus and Ruxolitinib, investigational therapeutics currently used in DM/JDM were also identified within the DM_JDM cluster.
Our study demonstrated the value of an integrated approach in revealing disease relationships and highlighted opportunities for therapeutic advancements in treating muscle diseases. Clustering of the co-expression network based on the differential gene activity-scoring scheme allowed us to identify disease clusters, not based on clinical or pathological similarity, but on the similarity of the expression profiles. A “FM” framework of 23 modules was developed to provide a muscle-context specific view of the mechanistic similarities. Integrating this with our data allowed for understanding less explored disease associations such as ALS and CP.
Incorporating protein information with the diseases similarity network allowed for identification of a “common signature”—a set of pathways underlying a majority of the diseases considered here. The common signature included pathways contributing to deficient bioenergetics and calcium dysregulation within affected muscle. An observed overrepresentation of druggable targets within these signature modules, in addition to subset of drugs uniquely associated with three disease clusters (DM/JDM cluster, IBM/PM cluster and DMD/BMD/LGMD cluster) further allowed us to recognize possible avenues for drug repurposing in treating diseases of the muscle.
KM and SS conceived the study, reviewed and edited the manuscript. KM designed the study, performed the analysis and prepared the manuscript.
Supported by National Institutes of Health Grants, U01 CA200147, U01 CA198941, U19 AI090023, R01 HL106579, R01HL108735, R01 HD084633, and National Science Foundation Grant CCF-0939370.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2017.00980/full#supplementary-material
Askanas, V., and Engel, W. K. (2008). Inclusion-body myositis: muscle-fiber molecular pathology and possible pathogenic significance of its similarity to Alzheimer's and Parkinson's disease brains. Acta Neuropathol. 116, 583–595. doi: 10.1007/s00401-008-0449-0
Azuaje, F. J., Dewey, F. E., Brutsaert, D. L., Devaux, Y., Ashley, E. A., and Wagner, D. R. (2012). Systems-based approaches to cardiovascular biomarker discovery. Circ. Cardiovasc. Genet. 5, 360–367. doi: 10.1161/CIRCGENETICS.112.962977
Barrett, T., Wilhite, S. E., Ledoux, P., Evangelista, C., Kim, I. F., Tomashevsky, M., et al. (2013). NCBI GEO: archive for Functional genomics data sets—update. Nucleic Acids Res. 41, D991–D995. doi: 10.1093/nar/gks1193
Bindea, G., Mlecnik, B., Hackl, H., Charoentong, P., Tosolini, M., Kirilovsky, A., et al. (2009). ClueGO: a cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25, 1091–1093. doi: 10.1093/bioinformatics/btp101
Blandin, G., Marchand, S., Charton, K., Danièle, N., Gicquel, E., Boucheteil, J-B., et al. (2013). A Human skeletal muscle interactome centered on proteins involved in muscular dystrophies: LGMD interactome. Skelet Muscle 3:3. doi: 10.1186/2044-5040-3-3
Borthwick, G. M., Johnson, M. A., Ince, P. G., Shaw, P. J., and Turnbull, D. M. (1999). Mitochondrial enzyme activity in amyotrophic lateral sclerosis: implications for the role of mitochondria in neuronal cell death. Ann. Neurol. 46, 787–790. doi: 10.1002/1531-8249(199911)46:5<787::AID-ANA17>3.0.CO;2-8
Coley, W., Rayavarapu, S., Pandey, G. S., Sabina, R. L., Van der Meulen, J. H., Ampong, B., et al. (2012). The molecular basis of skeletal muscle weakness in a mouse model of inflammatory myopathy. Arthritis Rheum. 64, 3750–3759. doi: 10.1002/art.34625
Crugnola, V., Lamperti, C., Lucchini, V., Ronchi, D., Peverelli, L., Prelle, A., et al. (2010). Mitochondrial respiratory chain dysfunction in muscle from patients with amyotrophic lateral sclerosis. Arch. Neurol. 67, 849–854. doi: 10.1001/archneurol.2010.128
Dalakas, M. C. (2017). Gene therapy for duchenne muscular dystrophy: balancing good science, marginal efficacy, high emotions and excessive cost. Ther. Adv. Neurol. Disord. 10, 293–296. doi: 10.1177/1756285617717155
Dennis, G. Jr., Sherman, B. T., Hosack, D. A., Yang, J., Gao, W., Lane, H. C., et al. (2003). DAVID: database for annotation, visualization, and integrated discovery. Genome Biol. 4:P3. doi: 10.1186/gb-2003-4-5-p3
Dewey, F. E., Perez, M. V., Wheeler, M. T., Watt, C., Spin, J., Langfelder, P., et al. (2011). Gene coexpression network topology of cardiac development, hypertrophy, and failure. Circ. Cardiovas. Genet. 4, 26–35. doi: 10.1161/CIRCGENETICS.110.941757
Dupuis, L., and Loeffler, J.-P. (2009). Neuromuscular junction destruction during amyotrophic lateral sclerosis: insights from transgenic models. Curr. Opin. Pharmacol. 9, 341–346. doi: 10.1016/j.coph.2009.03.007
Engel, A., and Franzini-Armstrong, C. (2004). Myology: Basic and Clinical. McGraw-Hill, Medical Pub. Division. Available online: http://books.google.com/books?id=zP5vQgAACAAJ.
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., et al. (2013). STRING v9. 1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41, D808–D815. doi: 10.1093/nar/gks1094
Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., et al. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5:R80. doi: 10.1186/gb-2004-5-10-r80
Hornung, T., Janzen, V., Heidgen, F.-J., Wolf, D., Bieber, T., and Wenzel, J. (2014). Remission of recalcitrant dermatomyositis treated with ruxolitinib. N. Engl. J. Med. 371, 2537–2538. doi: 10.1056/NEJMc1412997
Laaksonen, R., Katajamaa, M., Paiva, H., Sysi-Aho, M., Saarinen, L., Junni, P., et al. (2006). A systems biology strategy reveals biological pathways and plasma biomarker candidates for potentially toxic statin-induced changes in muscle. PLoS ONE 1:e97. doi: 10.1371/journal.pone.0000097
Landrum, M. J., Lee, J. M., Riley, G. R., Jang, W., Rubinstein, W. S., Church, D. M., et al. (2014). ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985. doi: 10.1093/nar/gkt1113
Mukund, K., Mathewson, M., Minamoto, V., Ward, S. R., Subramaniam, S., and Lieber, R. L. (2014). Systems analysis of transcriptional data provides insights into muscle's biological response to botulinum toxin. Muscle Nerve 50, 744–758. doi: 10.1002/mus.24211
Mukund, K., and Subramaniam, S. (2015). Dysregulated mechanisms underlying duchenne muscular dystrophy from co-expression network preservation analysis. BMC Res. Notes 8:182. doi: 10.1186/s13104-015-1141-9
Peloro, T. M., Miller, O. F., Hahn, T. F., and Newman, E. D. (2001). Juvenile dermatomyositis: a retrospective review of a 30-year experience. J. Am. Acad. Dermatol. 45, 28–34. doi: 10.1067/mjd.2001.113686
Perez, C. F., López, J. R., and Allen, P. D. (2005). Expression levels of RyR1 and RyR3 control resting free Ca2+ in skeletal muscle. Am. J. Physiol. Cell Physiol. 288, C640–C649. doi: 10.1152/ajpcell.00407.2004
Piñero, J., Queralt-Rosinach, N., Bravo, À., Deu-Pons, J., Bauer-Mehren, A., Baron, M., et al. (2015). DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database 2015:bav028. doi: 10.1093/database/bav028
Ramadasan-Nair, R., Gayathri, N., Mishra, S., Sunitha, B. B., Mythri, R., Bharath, M., et al. (2014). Mitochondrial alterations and oxidative stress in an acute transient mouse model of muscle degeneration implications for muscular dystrophy and related muscle pathologies. J. Biol. Chem. 289, 485–509. doi: 10.1074/jbc.M113.493270
Ramos, E. M., Hoffman, D., Junkins, H. A., Maglott, D., Phan, L., Sherry, S. T., et al. (2014). Phenotype–Genotype Integrator (PheGenI): synthesizing Genome-Wide Association Study (GWAS) data with existing genomic resources. Eur. J. Hum. Genet. 22, 144–147. doi: 10.1038/ejhg.2013.96
R Core Team (2015). R: A Language and Environment for Statistical Computing. Available onilne at: https://www.R-project.org (Accessed Aug 14, 2015).
Santacatterina, F., Chamorro, M., de Arenas, C. N., Navarro, C., Martín, M. A., Cuezva, J. M., et al. (2015). Quantitative analysis of proteins of metabolism by reverse phase protein microarrays identifies potential biomarkers of rare neuromuscular diseases. J. Transl. Med. 13:65. doi: 10.1186/s12967-015-0424-1
Shannon, P., Markiel, A., Ozier, O. S., Baliga, N., Wang, T. J., Ideker, T., et al. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498. doi: 10.1101/gr.1239303
Smith, L. R., Lee, K. S., Ward, S. R., Chambers, H. G., and Lieber, R. L. (2011). Hamstring contractures in children with spastic cerebral palsy result from a stiffer extracellular matrix and increased in vivo sarcomere length. J. Physiol. 589, 2625–2639. doi: 10.1113/jphysiol.2010.203364
Smith, L. R., Pontén, E., Hedström, Y., Ward, S. R., Chambers, H. G., Subramaniam, S., et al. (2009). Novel transcriptional profile in wrist muscles from cerebral palsy patients. BMC Med. Genomics 2:44. doi: 10.1186/1755-8794-2-44
Suthram, S., Dudley, J. T., Chiang, A. P., Chen, R., Hastie, T. J., and Butte, A. J. (2010). Network-based elucidation of human disease similarities reveals common functional modules enriched for pluripotent drug targets. PLoS Comput. Biol. 6:e1000662. doi: 10.1371/journal.pcbi.1000662
Tezak, Z., Hoffman, E. P., Lutz, J. L., Fedczyna, T. O., Stephan, D., Bremer, E. G., et al. (2002). Gene Expression Profiling In DQA1* 0501+ children with untreated dermatomyositis: a novel model of pathogenesis. J. Immunol. 168, 4154–4163. doi: 10.4049/jimmunol.168.8.4154
Timpani, C. A., Hayes, A., and Rybalka, E. (2015). Revisiting the dystrophin-ATP connection: how half a century of research still implicates mitochondrial dysfunction in duchenne muscular dystrophy aetiology. Med. Hypotheses 85, 1021–1033. doi: 10.1016/j.mehy.2015.08.015
Wagner, A. H., Coffman, A. C., Ainscough, B. J., Spies, N. C., Skidmore, Z. L., Campbell, K. M., et al. (2015). DGIdb 2.0: mining clinically relevant drug–gene interactions. Nucleic Acids Res. 44, D1036–D1044. doi: 10.1093/nar/gkv1165
Keywords: co-expression networks, functional module framework, bioenergetics, calcium signaling, skeletal muscle physiology, neuromuscular disease, drug repurposing, human protein interaction network
Citation: Mukund K and Subramaniam S (2017) Co-expression Network Approach Reveals Functional Similarities among Diseases Affecting Human Skeletal Muscle. Front. Physiol. 8:980. doi: 10.3389/fphys.2017.00980
Received: 06 September 2017; Accepted: 16 November 2017;
Published: 01 December 2017.
Edited by:Joseph L. Greenstein, Johns Hopkins University, United States
Reviewed by:Christine W. Duarte, University of Alabama at Birmingham, United States
Michael F. Miles, Virginia Commonwealth University, United States
James E. Koltes, Iowa State University, United States
Copyright © 2017 Mukund and Subramaniam. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Shankar Subramaniam, firstname.lastname@example.org