Shared Gene Expression Between Multiple Sclerosis and Ischemic Stroke

Patients with multiple sclerosis (MS) appear to have an increased risk of ischemic stroke (IS). Although MS and IS have very different phenotypes, gene-based and pathway-based analyses of large-scale genome-wide association studies (GWAS) have increasingly enhanced our understanding of these two diseases. Whether there are common molecular mechanisms connecting MS and IS is still unclear. Here, we describe the outcome of gene-based tests and pathway-based analyses of GWAS datasets that explored potential gene expression links between MS and IS. After identifying significant gene sets for MS and IS individually, we performed pathway-based analysis in four biological pathway databases (KEGG, PANTHER, REACTOME, and WikiPathways) and GO categories. We discovered that there were 9 shared pathways between MS and IS in KEGG, 2 in PANTHER, 14 in REACTOME, 1 in WikiPathways, and 194 in GO annotations (p < 0.05). These results provide an improved understanding of possible shared mechanisms and treatment strategies for MS and IS. They also provide some basis for further studies of how these two diseases are linked at the molecular level.


INTRODUCTION
Multiple sclerosis (MS) is a neurodegenerative, demyelinating disease of the nervous system that respects few demographic boundaries. It has an autoimmune basis, which leads to widespread nervous system tissue lesions and dysfunction, resulting in communication breakdown between neurons (Filippi and Rocca, 2005). As MS research has progressed, it has become clearer that environmental and genetic factors underlie the etiology of MS. The cooperation of these two factors in the etiology raises the question of whether one is more important than the other in posing risk, and whether co-morbidities elevate risk.
It is widely held that stroke is the second leading cause of mortality (GBD 2015 DALYs and HALE Collaborators, 2016; GBD 2015 Mortality and Causes of Death Collaborators, 2016). Stroke can result in damage to various brain areas, causing patients to suffer physically, mentally, and/or emotionally (Roger et al., 2011). There are two major categories of stroke: ischemic stroke (IS) and intracerebral hemorrhage (ICH) (Khan et al., 2013). One study found that 70-85% of all strokes are IS (Khan et al., 2013). Early studies of stroke genetics identified several key variants, including chromosome 9p21.3, NOTCH3, and COL4A1 (Cole and Meschia, 2011). This suggests that some as-yet-unknown risk factors for stroke may have a genetic origin.
Recent genome-wide association studies (GWAS) of MS and IS have revealed the respective genetic characteristics of these two diseases. Various major histocompatibility complex (MHC) variants (Moutsianas et al., 2015) and 110 non-MHC variants are related to MS susceptibility (International Multiple Sclerosis Genetics et al., 2013). In recent years, researchers found that variants in SLC9A9 and NR1H3 are associated with the risk of MS (Zhang et al., 2018). Moreover, experts have focused research on network-based analyses of genome and protein pathways using GWAS datasets, especially those related to immune pathways (Baranzini et al., 2009). The International MS Genetics Consortium (IMSGC) has obtained enrichment results in the gene ontology (GO) and KEGG databases with two large-scale MS-GWAS datasets; two examples are apoptosis in GO and the JAK-STAT signaling pathway in KEGG (International Multiple Sclerosis Genetics, 2013). Liu et al. analyzed shared genetic pathways from different MS-GWAS datasets (Liu et al., 2017). In the 1000G dataset of IS, ABO, HDAC9, PITX2, and ZFHX3 were found to be significant (Malik et al., 2016). In further GWAS research, 22 new significant loci were detected in a meta-analysis of stroke and its subtypes across multiple ancestries (Malik et al., 2018).
Some have noted that the risk of IS is increased for MS patients. For example, one cohort study showed that after adjusting for confounding variables, there was still an increased risk of stroke occurrence in an MS cohort compared to a control cohort (Tseng et al., 2015). In vascular diseases and autoimmune diseases, like MS, pathogenic factors such as endothelial dysfunction, atherosclerosis formation, antiphospholipid antibody, and even smoking can contribute to decreased physical activity (Marrie et al., 2015). In MS, that decreased physical activity increases the risk for IS (Marrie et al., 2015). As our understanding of the immune-inflammatory response in stroke becomes more comprehensive, the link between IS and MS and the immune system becomes more apparent.
We hypothesize that identifying pathways shared by IS and MS will provide novel entry points for understanding the relationship between the two diseases. Existing GWAS datasets strongly support exploring the links between MS and IS through SNP-, gene-, and pathway-level analyses. Here, we conducted a gene-based test of IS (10,307 IS cases and 19,326 controls) and MS (9,772 MS cases and 17,376 controls) GWAS datasets, followed by a pathway-based analysis. We found that MS and IS share 9 pathways in KEGG, 2 in PANTHER, 15 in REACTOME, 1 in WikiPathways, and 194 in GO annotations. In short, we believe that these new results may represent significant steps toward defining the genetic mechanism underlying the association of IS with MS.
Abbreviations: CNS, central nervous system; DAMPs, danger-associated molecular patterns; EAE, experimental autoimmune encephalomyelitis; FDR, false discovery rate; GO, gene ontology; GSEA, Gene Set Enrichment Analysis; GWAS, genome-wide association study; IMSGC, International MS Genetics Consortium; IS, ischemic stroke; KEGG, Kyoto Encyclopedia of Genes and Genomes; MS, multiple sclerosis; NTA, Network Topology-based Analysis; ORA, Over-Representation Analysis; PAMPs, pathogen-associated molecular patterns; PRRs, pattern-recognition receptors; QC, quality control; SNP, single nucleotide polymorphism; TLR, Toll-like receptor; TSLP, thymic stromal lymphopoietin; VEGAS2, Versatile Gene-based Association Study-2 version 2; WTCCC2, Wellcome Trust Case Control Consortium 2.

MATERIALS AND METHODS
Samples
We used a large-scale MS-GWAS dataset from IMSGC, which was derived from the Wellcome Trust Case Control Consortium 2 (WTCCC2) project (International Multiple Sclerosis Genetics Consortium et al., 2011). This dataset comprises 9,772 MS cases and 17,376 controls of European descent, with data collected by 23 research groups working in 15 different countries. After quality control (such as Bayesian clustering and principal components analyses in sample QC, and automated clustering and a Beta-binomial model in SNP QC), 464,357 autosomal SNPs were available for genetic analysis (International Multiple Sclerosis Genetics Consortium et al., 2011).
For IS analyses, we obtained the IS dataset derived from the 1000G GWAS summary results of the METASTROKE collaboration (Malik et al., 2016). In the discovery phase, researchers gathered 12 case-control GWAS comprising 10,307 IS cases and 19,326 controls of Caucasian background. After quality control using logistic regression analysis (Traylor et al., 2012), meta-analysis resulted in 8.3 million SNPs. In the replication phase, the SNPs with p < 1.00E-05 were tested in independent samples that included 13,435 cases and 29,269 controls of Caucasian descent and 2,385 cases and 5,193 controls of South Asian descent. Finally, the results obtained from the two phases were subjected to a final meta-analysis. The data available for our analysis were the summary results from the discovery phase (Malik et al., 2016).

Gene-Based Test for MS and IS GWAS Datasets
We uploaded the SNP data for MS and IS into the online version of VEGAS2 (Versatile Gene-based Association Study software). This approach offers a more flexible way to assess individual SNPs and conduct gene-based testing (Mishra and Macgregor, 2015). VEGAS2 uses the hg19 gene annotation list from the University of California Santa Cruz (UCSC) Table Browser and 1000 Genomes data to simulate SNP correlations across the autosomes and chromosome X (Mishra and Macgregor, 2015). With this software, users have five options for defining the gene boundaries used to assign SNPs: 0kbloc, 10kbloc, 20kbloc, 50kbloc, and 0kbldbin (Mishra and Macgregor, 2015). First, the p-values of the n SNPs are converted to upper-tail χ² statistics with one degree of freedom (df) and then summed to give a gene-based test statistic, which, if the SNPs are in linkage equilibrium, has a χ² distribution with n df under the null hypothesis (Mishra and Macgregor, 2015). More detailed information about this process is found in Mishra and Macgregor (2015). From the 1000G dataset, we selected the European sub-population (all Europeans). For the gene boundary, we selected the option "SNPs within a gene adding SNPs outside of the gene with r² > 0.8 with SNPs within the gene."
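The statistic described above can be illustrated with a short sketch. It is not the VEGAS2 implementation: the function names are hypothetical, and the empirical p-value assumes independent SNPs (linkage equilibrium), whereas VEGAS2 simulates correlated SNPs from reference-panel LD. It only shows how SNP p-values map to 1-df χ² statistics and how their sum is compared against a null distribution.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used to invert the chi-square(1) tail


def p_to_chi2_1df(p):
    """Convert a SNP p-value (0 < p <= 1) to an upper-tail chi-square statistic with 1 df.

    If Z ~ N(0, 1) then Z**2 ~ chi-square(1), so P(chi2 > x) = p
    is solved by x = (Phi^{-1}(p / 2)) ** 2.
    """
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def gene_statistic(snp_pvalues):
    """Sum of the per-SNP 1-df chi-square statistics for one gene."""
    return sum(p_to_chi2_1df(p) for p in snp_pvalues)


def empirical_gene_pvalue(snp_pvalues, n_sim=10000, seed=0):
    """Monte Carlo gene-level p-value.

    ASSUMPTION: SNPs are independent, so the null statistic is a sum of
    n squared standard normals (chi-square with n df).
    """
    rng = random.Random(seed)
    observed = gene_statistic(snp_pvalues)
    n = len(snp_pvalues)
    exceed = 0
    for _ in range(n_sim):
        null_stat = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        if null_stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)
```

For example, `p_to_chi2_1df(0.05)` is approximately 3.84, the familiar 5% critical value of the 1-df χ² distribution. When SNPs are in linkage disequilibrium the independence assumption no longer holds, which is why a simulation that respects the SNP correlation structure is needed in practice.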

Pathway-Based Analysis for MS and IS Expression Datasets
We conducted pathway-based analysis using the WebGestalt database (Wang et al., 2017). Of the three well-established and complementary methods (ORA, NTA, and GSEA) supported by WebGestalt, we chose Over-Representation Analysis (ORA) for our enrichment analysis. ORA is a hypergeometric technique used to identify overrepresentation of genes of interest in related pathways. In addition to GO (Ashburner et al., 2000) and KEGG datasets (Kanehisa et al., 2004), we also selected the most up-to-date datasets (WebGestalt 2017) from WikiPathways (Pico et al., 2008), REACTOME (Fabregat et al., 2016), and PANTHER (Mi et al., 2016). Further, we used common and non-redundant GO annotations so that more specific functions could be recognized. The p-value for observing more than J disease-related genes in a pathway can be calculated using the following formula:
$$P = 1 - \sum_{i=0}^{J} \frac{\binom{S}{i}\binom{N-S}{m-i}}{\binom{N}{m}}$$
where m is the overall number of genes of interest associated with one given disease, N is the number of all reference genes, and S is the number of genes in the pathway. We chose pathways that contained 20-300 genes in order to avoid testing exceedingly narrow or broad pathways. The minimum number of genes per category was 5. The false discovery rate (FDR) method was used to correct for multiple testing (Wang et al., 2017). Most importantly, we searched for shared biological pathways that are relevant to both MS and IS and that have an adjusted p < 0.05.
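The over-representation test and the multiple-testing correction described above can be sketched as follows. This is a minimal illustration, not WebGestalt's code: the function names are hypothetical, the symbols N, S, m, and J follow the definitions in the text, and the Benjamini-Hochberg procedure is assumed as the FDR method because the text does not name a specific one.

```python
from math import comb


def ora_pvalue(N, S, m, J, inclusive=False):
    """Hypergeometric tail probability for pathway over-representation.

    N: number of reference genes
    S: number of genes in the pathway
    m: number of disease-associated genes of interest
    J: number of disease-associated genes observed in the pathway

    By default returns P(X > J), the literal reading of "over J genes";
    with inclusive=True returns P(X >= J).
    """
    start = J if inclusive else J + 1
    total = comb(N, m)
    tail = sum(comb(S, i) * comb(N - S, m - i)
               for i in range(start, min(S, m) + 1))
    return tail / total


def pathway_is_eligible(size, min_size=20, max_size=300):
    """Size filter from the text: keep pathways with 20-300 genes."""
    return min_size <= size <= max_size


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (ASSUMED FDR method), in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda k: pvalues[k])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

As a toy check with N = 10, S = 4, m = 3, and J = 1, the probability of more than one pathway gene among the three genes of interest is (36 + 4)/120 = 1/3. The `inclusive` flag is exposed because many enrichment tools count the observed overlap itself in the tail, which gives a slightly larger p-value.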

RESULTS
Gene-Based Test of MS and IS GWAS Datasets
All 9,541,572 IS SNPs and 464,357 MS SNPs were used in this study. We sorted through 21,913 IS gene sets and 14,811 MS gene sets using the web-based version of VEGAS2 (Figure 1). After filtering the data for gene sets with a p-value of < 0.05, we obtained 1,290 IS gene sets and 1,353 MS gene sets (VEGAS2 analysis conducted 26 May 2017). The detailed results are listed in Supplementary Tables 1, 2.

Pathway-Based Analysis of MS-GWAS Dataset
We performed pathway analysis for the 1,353 MS genes. We identified 66 significant KEGG pathways, 7 significant PANTHER pathways, 127 significant REACTOME pathways, and 58 significant WikiPathways pathways (p < 0.05) (Supplementary Table 3). In GO, we identified 719 biological-process pathways, 170 cellular-component pathways, and 152 molecular-function pathways that had a p-value of < 0.05 (Supplementary Table 5).
FIGURE 1 | Flow diagram of the three-phase analysis design. In phase I, we performed a gene-based test using the MS dataset from IMSGC and the IS GWAS dataset from METASTROKE. Gene sets identified with p < 0.05 were carried forward to the next phase. In phase II, we carried out a pathway-based analysis for the two diseases. In phase III, shared and significant pathways with p < 0.05 were identified.

Pathway-Based Analysis of IS-GWAS Dataset
After uploading the 1,290 IS genes into the WebGestalt database, we identified 22 significant KEGG pathways, 4 PANTHER pathways, 70 REACTOME pathways, and 13 WikiPathways pathways with a p-value of < 0.05 (Supplementary Table 4). In GO, we identified 332 biological-process pathways, 147 cellular-component pathways, and 101 molecular-function pathways (p < 0.05) (Supplementary Table 6).
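Once the significant pathways for each disease are available, the final phase of the design reduces to a set intersection. The sketch below is illustrative only: the function name, the pathway identifiers (`PW_A` to `PW_D`), and the p-values are hypothetical placeholders, not results from this study.

```python
def shared_pathways(ms_results, is_results, alpha=0.05):
    """Return pathway IDs significant (adjusted p < alpha) in both diseases.

    ms_results / is_results: dicts mapping pathway ID -> adjusted p-value.
    """
    ms_sig = {pid for pid, p in ms_results.items() if p < alpha}
    is_sig = {pid for pid, p in is_results.items() if p < alpha}
    return sorted(ms_sig & is_sig)


# Hypothetical placeholder inputs, not values from the paper.
ms_example = {"PW_A": 0.001, "PW_B": 0.010, "PW_C": 0.020, "PW_D": 0.300}
is_example = {"PW_A": 0.004, "PW_B": 0.030, "PW_C": 0.200, "PW_D": 0.010}

print(shared_pathways(ms_example, is_example))
```

In this toy input only `PW_A` and `PW_B` pass the threshold in both dictionaries. Applying the same function separately to each database (KEGG, PANTHER, REACTOME, WikiPathways, GO) would give one shared list per database.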

DISCUSSION
What accounts for the increased risk of IS in patients with MS (Tseng et al., 2015)? Here, we described, through a pathway-based analysis of IS and MS GWAS datasets, gene expression links between IS and MS. We discovered 9 shared pathways between MS and IS in KEGG, 2 in PANTHER, 15 in REACTOME, 1 in WikiPathways, and 194 in GO annotations. These results provide an improved understanding of possible shared mechanisms and treatment strategies for IS and MS.
To date, MS and IS GWAS research has identified some vital SNPs, gene sets, and pathways for the respective diseases. The most well-known associated gene is CD40, a member of the TNF-receptor superfamily. Several studies have reported that CD40 and related SNPs are closely associated with MS susceptibility. Using multiple statistical analysis methods, Sokolova et al. confirmed that SNP rs6074022 located in CD40 was related to a higher risk of MS development (Sokolova et al., 2013). Other SNPs of CD40 (rs1883832 C/T, rs13040307 C/T, rs752118 C/T, and rs3765459 G/A; and rs1883832 C/T polymorphism and its TCCA haplotype) were suggested to be associated with IS susceptibility at a significant test threshold (Chen et al., 2016). After adjusting for relevant covariates, one clinical study reported a higher risk of IS in the MS cohort compared to the control cohort (Tseng et al., 2015). This was corroborated by findings that both young and older MS patients are at an increased risk for IS (Jadidi et al., 2013).
To convince ourselves that there is indeed a clear but subtle link between MS and IS, we focused our analysis on genes that these two diseases might share. In phase I, we identified 1,290 IS and 1,353 MS significant genes using the IS-GWAS and MS-GWAS datasets. In phase II, we conducted enrichment analysis via GO, KEGG, PANTHER, WikiPathways, and REACTOME. In phase III, we selected significant pathways shared between the two diseases using the datasets of each disease. As we determined associations between MS and IS at the gene expression level, which was the most important goal of our research, we detected more risk pathways, such as the natural killer cell-mediated cytotoxicity pathway in the immune system. So far, treatments common to MS and IS mainly target neuroinflammation, reflecting some similar pathomechanisms (Paterno and Chillon, 2018). Natalizumab (Elkins et al., 2017), minocycline (Schabitz et al., 2008), and fingolimod (Fu et al., 2014), three approved agents for MS, have already been applied in clinical trials. In recent years, more immunomodulatory drugs approved for MS, for instance glatiramer acetate (Poittevin et al., 2013) and DMF (Lin et al., 2016), have attracted interest as potential therapies for IS. Our findings offer innovative insights into pathway analysis, which will be crucial for deciphering the pathogenesis underlying the MS-IS relationship and for instituting appropriate therapeutic regimes.
Frontiers in Genetics | www.frontiersin.org 6 February 2019 | Volume 9 | Article 598

Shared KEGG Pathways
The association between the pathophysiology of IS and MS and how it is related to the immune system is becoming more and more clear. In the context of pathways shared by IS and MS, this association motivated us to focus our attention on key pathways in the immune system and the nervous system. One significant shared pathway identified from our analyses is the natural killer (NK) cell-mediated cytotoxicity pathway (hsa04650). NK cell-mediated cytotoxicity is one of the main characteristics of NK cells. NK cells release cytotoxic granules onto the target cell's surface, causing effector proteins to penetrate the cell membrane and induce programmed cell death.
Over the past decades, both in clinical trials and animal experiments, NK cell dysfunction has been shown to be strongly associated with the immunopathogenesis of MS and certain patients' responses to certain treatments (Morandi et al., 2008). There are two major functional subtypes of NK cells, CD56dim CD16hi and CD56bright, the latter of which may play a key role in MS and IS immunopathology. CD56bright NK cells, which are normally weakly cytotoxic, can acquire cytotoxic properties and produce cytokines if they are stimulated (Melsen et al., 2016), as, for example, with certain therapeutic agents. This increased NK activity correlates with responses to immunotherapies. For example, patients treated with daclizumab or IFNβ produce more CD56bright NK cells (Bielekova et al., 2006; Saraste et al., 2007). In addition, NK cells are more cytotoxic toward autologous activated T cells in samples from patients treated with daclizumab than in those from untreated patients (Jiang et al., 2011).
The role of NK cells in the pathophysiology of MS and IS was further elucidated in another study. Jiang and colleagues demonstrated that acetylcholine-producing NK cells reduce CNS damage in an animal model of MS. The natural killer cell-mediated cytotoxicity pathway (hsa04650) has also been shown to be significant by pathway analysis of GWAS datasets (Giacalone et al., 2015). After acute stroke, ischemic neurons release fractalkine to recruit lymphocytes, including NK cells, to the injured areas (Gan et al., 2014). There, NK cells can induce neuronal death by secreting cytokines and glutamate; this is one inflammatory mechanism that can lead to tissue damage (Gan et al., 2014). A meta-analysis of 12 different GWAS covering all types of stroke (IS and its subtypes) identified nearly 100 different pathways associated with each type of stroke (Bonferroni corrected p < 0.05); however, only the NK cell signaling pathway was significantly shared by all stroke and IS subtypes (Malik et al., 2016).
Another pathway shared by MS and IS is the Toll-like receptor signaling pathway (hsa04620). The Toll-like receptor (TLR) family is a well-known class of proteins that are prototype pattern-recognition receptors (PRRs) capable of recognizing pathogen-associated molecular patterns (PAMPs), which are signature motifs possessed by certain pathogenic microorganisms; and danger-associated molecular patterns (DAMPs), which are host molecules that initiate an inflammatory response to damaged tissues or lesions (Kawasaki and Kawai, 2014). The Toll-like receptor signaling pathway (hsa04620) is complicated, as it can activate many important signaling molecules such as nuclear factor-κB (NF-κB) transcription factors, mitogen-activated protein kinases (MAPKs), and p3 (Kawasaki and Kawai, 2014). TLR signaling pathways extensively influence the immune system. Thus, it is not surprising that many diseases have multiple links with these pathways. For example, TLR4 aggravates inflammation in experimental autoimmune encephalomyelitis (EAE) model (Reynolds et al., 2012). TLR2 expression on oligodendrocytes is enhanced in MS lesions but not on oligodendrocytes in normal areas (Hanafy and Sloane, 2011).
TLR2 can also promote immune responses through Th17 cells (Reynolds et al., 2010). In the CNS, a given TLR can stimulate diverse signaling pathways in different neural cells. After a stroke, DAMPs activate TLR2 and TLR4 in microglia to increase the production of pro-inflammatory cytokines (Caso et al., 2007; Lehnardt et al., 2007). In ischemic neurons, TLR2 and TLR4 activate the downstream elements JNK and AP-1 to initiate proapoptotic activity (Tang et al., 2007). In one clinical trial, researchers discovered that TLR7 and TLR8 are related to poor outcome in IS (Brea et al., 2011). Similarly, in a mouse stroke model, TLR3 and TLR9 did not confer protection to neural cells during middle cerebral artery occlusion (Hyakkoku et al., 2010).
Yet another pathway significantly shared by MS and IS is the Th1 and Th2 cell differentiation pathway (hsa04658). Through this pathway, naïve CD4+ T cells respond to cues from antigen-presenting cells (APCs), causing them to differentiate into Th1 and Th2 cells, which are two major subtypes of effector CD4+ T cells. The differentiation of Th1 and Th2 cells depends on the signals they receive. Th1 cells are triggered by IL-12 and secrete IFN-γ and IL-2, whereas Th2 cells are triggered by IL-4 and IL-2, and secrete IL-4, IL-5, IL-9, IL-10, IL-13, and IL-25 (Zhu and Paul, 2008). In clinical trials, IL-17 and IFN-γ were shown to exacerbate the symptoms of MS patients; this provided evidence that Th1 and Th17 cells can have an impact on diseases through cytokines (Panitch et al., 1987; Havrdova et al., 2016). Th1 and Th2 cells exert opposite effects on infarct lesions in mice undergoing middle cerebral artery occlusion. In these mice, Th2 deficiency increased infarct size by enhancing recruitment of macrophages and neutrophils to the infarct, whereas Th1 deficiency decreased infarct size by retarding recruitment of macrophages and neutrophils to the infarct (Gu et al., 2012). Moreover, stroke initially causes a decrease in immune cells in the penumbra, leading to a shift from Th1 to Th2 cytokines, which is associated with stroke-induced immunosuppression (Prass et al., 2003).
The neurotrophin signaling pathway (hsa04722) is another pathway shared by MS and IS. It influences the differentiation and survival of neural cells. Nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), neurotrophin 3 (NT-3), and neurotrophin 4 (NT-4) are major members of the neurotrophin family. The Trk family of tyrosine kinase receptors and p75 neurotrophin receptors (p75NTRs) are two major neurotrophin receptors (Reichardt, 2006). BDNF/TrkB signaling and TrkB-FL/TrkB-T1 balance are two targets for stroke therapies (Vidaurre et al., 2012). Ciliary neurotrophic factor (CNTF) has been reported to have a neuroprotective effect in the cortex of MS patients (Dutta et al., 2007). Glial p75NTRs are increased during plaque formation in MS (Dowling et al., 1999).
MS and IS are diseases that involve both the immune system and CNS. Future studies should direct more attention to analyzing shared and significant pathways of the immune system and nervous system. Doing so will advance understanding of the shared mechanisms underlying the pathogenesis of these two diseases. Moreover, new information gleaned from important pathways identified in MS will suggest meaningful targets in IS to focus on and vice versa.

Shared Gene Ontology Enrichment Analysis
We used annotations from the GO project to identify any significant relationships in biological functions encoded by shared genes at the molecular, cellular, and tissue levels. We identified 194 annotations in total, spread among three functional categories: biological process, cellular component, and molecular function. This allowed us to study the function of the genome at a higher level, for example, to estimate which part of the genome is shared between diseases with regard to signal transduction, metabolism, synthesis, or copy number. We further narrowed down the annotations of shared genes and identified 7 in biological process, 10 in cellular component, and 6 in molecular function. Enrichment analysis identified cell-cell adhesion via plasma-membrane adhesion molecules (GO:0098742) as one of the most significant GO functional categories shared by IS and MS. Adhesion molecules of lymphocytes are significantly elevated in IS patients (Tsai et al., 2009). In MS, adhesion molecules on immune cells play a role in disease progression (Dhib-Jalbut, 2002). Targeting these adhesion molecules with glatiramer acetate has been shown to reduce the pro-migratory components in MS (Sellner et al., 2013).

Limitations
Despite the novel findings, this research has some limitations. First, the GWAS datasets we chose were representative, but replication in multiple datasets should be conducted to improve the validity of the results. Second, because data on IS subtypes were not available, we did not analyze the associations between MS and the etiological subtypes of IS, even though risk variants of IS are increasingly being linked to specific subtypes. Finally, although we used reliable statistical methods to identify significant shared pathways after filtering for significant genes in each disease, there is a chance that commonalities may be found between two diseases that are unlikely to share mechanisms. Further experiments are needed to explore and validate our results.

CONCLUSIONS
We report on the significant pathways and GO annotations shared by MS and IS, with the goal of understanding more about their biological functions and relationships. By analyzing the pathways of the immune system and nervous system, we can verify that links between MS and IS exist and infer gene expression level correlations. Leveraging information about where biological functions overlap, we believe that a multidisciplinary approach will advance studies on the pathophysiological mechanisms in and treatments for both MS and IS.

AUTHOR CONTRIBUTIONS
HL and JH conceived and designed the study for MS and IS. HL administered the analyses and wrote the manuscript. LC and XM were responsible for manuscript revision. PC and WL provided analysis support. All authors gave approval for the final version for submission.