Identification of Shared Molecular Signatures Indicate the Susceptibility of Endometriosis to Multiple Sclerosis

Women with endometriosis (EMS) appear to be at a higher risk of developing other autoimmune diseases predominantly multiple sclerosis (MS). Though EMS and MS are evidently diverse in their phenotype, they are linked by a common autoimmune condition or immunodeficiency which could play a role in the expansion of endometriosis and possibly increase the risk of developing MS in women with EMS. However, the common molecular links connecting EMS with MS are still unclear. We conducted a meta-analysis of microarray experiments focused on EMS and MS with their respective controls. The GEO2R web application discovered a total of 711 and 1516 genes that are differentially expressed across the experimental conditions in EMS and MS, respectively with 129 shared DEGs between them. The functional enrichment analysis of DEGs predicts the shared gene expression signatures as well as the overlapping biological processes likely to infer the co-occurrence of EMS with MS. Network based meta-analysis unveiled six interaction networks/crosstalks through overlapping edges between commonly dysregulated pathways of EMS and MS. The PTPN1, ERBB3, and CDH1 were observed to be the highly ranked hub genes connected with disease-related genes of both EMS and MS. Androgen receptor (AR) and nuclear factor-kB p65 (RelA) were observed to be the most enriched transcription factor in the upstream of shared down-regulated and up-regulated genes, respectively. The two disease sample sets compared through crosstalk interactions between shared pathways revealed commonly up- and down-regulated expressions of 10 immunomodulatory proteins as probable linkers between EMS and MS. This study pinpoints the number of shared genes, pathways, protein kinases, and upstream regulators that may help in the development of biomarkers for diagnosis of MS and endometriosis at the same time through improved understanding of shared molecular signatures and crosstalk.


INTRODUCTION
Endometriosis (EMS) is an estrogen-dependent inflammatory disorder which affects approximately 5-10% of women in the reproductive age worldwide (Bulun, 2009). The endometrial tissue which normally is present inside the uterus is displaced outside in patients suffering from EMS resulting in pelvic pain and infertility (Capobianco and Rovere-Querini, 2013). Immunological factors are known to contribute significantly to the pathogenesis and pathophysiology of endometriosis (Berkkanoglu and Arici, 2003;Podgaec et al., 2007Podgaec et al., , 2010Fairbanks et al., 2009;Nielsen et al., 2011;Capobianco and Rovere-Querini, 2013). The prime regulators of the innate immune response are macrophages which come into play in case of injury, damage and infection. Macrophages possess functionally diverse contrasting roles, as on one hand, they play a protective role through differentiation and regeneration of cells while on the other hand they stimulate the immune response leading to destruction of infected cells (Vogel et al., 2013). Pro-inflammatory cytokine (interferonγ) activated macrophages are known to have an essential role in the onset and progression of endometriosis. The macrophages misinterpret the displaced ectopic endometrial tissue as an injury and hence, instead of removing the endometrial cells, they activate pathways that repair and enhance their survival leading to sustained endometrial tissue (Podgaec et al., 2010;Capobianco and Rovere-Querini, 2013). It has been reported that the women suffering from EMS are more prone to acquire other inflammatory autoimmune disorders especially multiple sclerosis (MS) (Nielsen et al., 2011;Mormile and Vittori, 2014). Multiple sclerosis (MS) is a chronic neuroinflammatory autoimmune disease of the central nervous system associated with neurodegeneration (Hickey, 1999;Compston and Coles, 2002;Szczucinski and Losy, 2007). Like endometriosis, macrophages have been observed to be directly associated in the pathogenesis of MS (Oreja-Guevara et al., 2012;Vogel et al., 2013). Additionally, T-helper 1 (Th1)/Thelper 2 (Th2) imbalance has been associated with both EMS and MS wherein the pro-inflammatory Th1 profile dominates over the Th2 anti-inflammatory response. This is similar to other autoimmune diseases where the immune system launches an attack on its own cells and tissues (Trapp et al., 1998;Peterson et al., 2001;Diestel et al., 2003;Aktas et al., 2005). Among the increased Th1 cytokines, it has been reported that Interferonγ (IFN-γ) is strongly associated with the pathomechanisms of MS (Oreja-Guevara et al., 2012;Vogel et al., 2013). Though the association between these two heterogeneous diseases EMS and MS, is not clearly recognized, it may be attributable to differential gene expression and sharing of common dysregulated pathogenic pathways involved in the development of both diseases. A 'crosstalk' event between two pathways, thus, elucidates how one or more components of one pathway affect another through interactions with shared components, protein-protein interactions and transcriptional regulations (Lu et al., 2007;Guo and Wang, 2009;Housden and Perrimon, 2014). Therefore, an examination of possible crosstalks and shared components among common dysregulated pathways together with associated genes in both endometriosis and MS may be able to assist in the understanding of the disease mechanism.
Over the last two decades, a meta-analysis approach has been well exploited to uncover the shared molecular signatures between related diseases by integrating the publicly accessible microarray datasets (Silva et al., 2007;Higgs et al., 2012;Tuller et al., 2013;Jha et al., 2016). Recent studies have focused on identifying crosstalk among dysregulated pathways using expression profiles of genes from control vs. disease samples (Zhang et al., 2013;Niu et al., 2014Niu et al., , 2015Chen et al., 2015). In the present study, we aimed to identify commonly dysregulated genes and pathways which probably co-occur in both EMS and MS to elucidate the relationship between the two diseases. We performed here a meta-analysis using gene expression data from microarray experiments of EMS and MS with their respective controls to predict the Differentially Expressed Genes (DEGs) involved in the respective diseases (Arasappan et al., 2011;Taminau et al., 2014). Widely used enrichment analysis methods such as Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Protein-Protein Interactions (PPI) were adopted for the prediction of dysregulated pathways and subsequent possible crosstalk between EMS and MS. The findings from this study increase our understanding of the molecular mechanisms affecting both EMS and MS. Moreover, it brings forth the commonly shared genes, molecules and pathways co-existing in both EMS and MS which may be further explored as newer therapeutic targets.

Data Acquisition
Widely accessible gene expression datasets related to endometriosis (EMS) and MS were obtained from the Gene Expression Omnibus (GEO) database of NCBI (http://www. ncbi.nlm.nih.gov/geo/; Barrett and Edgar, 2006; Barrett et al., 2013). The keywords "endometriosis" and "multiple sclerosis" with "homo sapiens" or "human" were employed to mine the dataset. Studies evaluated on Affymetrix human gene expression dataset (irrespective of platform) containing samples from both normal and diseased tissue (more or less equally distributed) of women were taken. It was also ensured that these studies included only tissue samples that were not cultured in vitro. Similarly, tissue samples treated with any drugs before extraction were also excluded. The expression profiles of both primary and secondary cell cultures were also not considered for this analysis. Overall 14 datasets (7 each for EMS and MS) which met these criteria were selected from published studies and downloaded from GEO database. The expression datasets of EMS combined tissue samples from 7 GEO profiles, i.e., GSE11691, GSE25628, GSE51981, GSE6364, GSE7305, GSE7307, and GSE7846 were selected. Likewise, expression datasets of MS involved tissue samples from 7 GEO profiles, i.e., GSE16461, GSE21942, GSE26484, GSE38010, GSE41848, GSE41849, and GSE41890. These samples were further separated into groups according to tissue source. Datasets were not subjected to any additional normalization, as all the data obtained had already been processed/normalized and were cross-comparable. The related information regarding the dataset pertaining to the

Data Preprocessing and Mining of DEGs
We used GEO2R web tool (http://www.ncbi.nlm.nih.gov/ geo/geo2r/; Barrett et al., 2013) to compare two or more groups of samples in a GEO profile (given in Table 1) in order to identify genes that are differentially expressed across the diverse experimental conditions (Wu et al., 2016;Mou et al., 2017;Sun et al., 2017). GEO2R performs comparisons on original submitter-supplied processed/normalized data tables using the GEOquery and limma (Linear Models for Microarray Analysis) R packages (Smyth, 2004). The adjusted p-values (adj. P) using Benjamini and Hochberg (1995) false discovery rate (FDR) method by default were used to correct for the occurrence of false positive results. The threshold value for identifying DEGs was set as FDR ≤ 0.05 and logFC ≥ 1.5. All gene probes across the microarray datasets were converted to a common Entrez ID using the "Database for Annotation, Visualization, and Integrated Discovery" (DAVID) v6.7 tool (https://david.ncifcrf.gov/ conversion.jsp; Huang da et al., 2008Huang da et al., , 2009). The probes not associated with known genes were not included. The differentially expressed genes (DEGs) from individual studies were selected using a combination of p-value and fold change and the results were combined by taking the union of all individual studies. When multiple probes referred to the same gene, the expression values obtained from these probes were minimized to a single value by averaging the expression value (when all genes with the same direction of expression) or discarded (when genes had diverse direction of expression). The resulting DEGs represent the entire gene set of all studies.

Functional and Pathway Enrichment Analysis of DEGs
KEGG and GO (Gene Ontology) analysis were completed separately using the DAVID tool. DAVID uses Fisher's exact test (Fisher, 1922) to predict enriched pathways by geneset enrichment analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) database. False Discovery Rate (FDR) adjusted p ≤ 0.05 was selected for significantly over-represented pathways. Significant pathway results were ranked according to the p-value.
In this study, GO terms from the biological process ontology were analyzed. The GO terms with ≥5 numbers of genes and adjusted p ≤ 0.05 were considered significantly enriched GO terms. The shared biological process ontology (GO-BP) was calculated based on overlapping GO-Ids between EMS and MS. The number of genes enclosed by sharing GO terms was used to predict the significant pathways and ensuing crosstalk between these pathways. Cytoscape plugin BiNGO (Maere et al., 2005) and FunRich V3 (Pathan et al., 2015) stand alone software was separately used for functional enrichment analysis of DEGs.

PPI Network-Based Enrichment Analysis
To expose the interactive associations among the DEGs at the protein level, genes obtained from the EMS and MS were mapped on protein-protein interaction (PPI) data using NetworkAnalyst tool (http://www.networkanalyst.ca; Xia et al., 2015). The network building was limited to include only the original seed proteins by picking the zero order interactions to evade "hairball effect." NetworkAnalyst integrates comprehensive PPI data from published literature with experimental information available across different PPI databases. These databases like IntAct (Orchard et al., 2014), MINT (Licata et al., 2012), DIP (Salwinski et al., 2004), BIND (Isserlin et al., 2011), andBioGRID (Chatr-Aryamontri et al., 2015) are integrated in InnateDB (Breuer et al., 2013). Topological properties (such as betweenness centrality and degree distribution) of the constructed PPI network were calculated by NetworkAnalyzer in Cytoscape (Shannon et al., 2003). The degree distribution of all nodes in the network may help to explain whether a network is scale-free or not. The betweenness centrality is defined as the number of shortest paths in the graph that pass through the node divided by the total number of shortest paths. The nodes with a high betweenness centrality are lying on the communication paths and can control the information flow. The densely connected group of proteins referred as modules in a given network was predicted using the "module explorer" panel of NetworkAnalyst that used a random walk based approach for module detection. The significant p-value of a given module was calculated using Wilcoxon rank-sum test (Haynes, 2013). The ranking of the identified modules was based on the number of encompassed seed proteins. The enriched pathways of DEGs in significant modules (≥10 DEGs) were analyzed with a threshold of p ≤ 0.05 using DAVID.

Transcription Factor and Protein Kinase Associated with DEGs
Upstream regulators and protein kinases associated with DEGs were recognized by submitting the list of shared DEGs to Expression2Kinases (X2K) web interface (https://amp.pharm. mssm.edu/X2K/; Chen et al., 2012). X2K identifies the enriched transcription factors (TFs) from the upstream of the shared DEGs using a ChEA database (Chen et al., 2012). Genes2Networks (G2N) module of X2K connects TFs with PPI to yield transcriptional complexes related to these gene signatures. Protein Kinases responsible for TF complex formation and functional regulation were recognized through the Kinase Enrichment Analysis (KEA) module of X2K. Top 10 most enriched TFs and kinases were ranked based on the combined (p-value and z-score) score.

Crosstalk Analysis of Biological Pathways
Pathway crosstalk was defined as those pathways that had overlapping genes and edges. In this study, two pathways were considered to crosstalk if they comprised at least 5 DEGs with adjusted p ≤ 0.05 and at least 1 overlapping gene and edge. If the number of overlapped DEGs in the pathways was more than 3, the two pathways were considered to have a strong interaction. This criterion ensures that each of the pathways and its crosstalk pairs were statistically significant and contained biologically meaningful number of genes. The crosstalk between two pathways was explored by calculating the Jaccard Coefficient (JC)/Jaccard Index (JI) (Jaccard, 1912;Levandowsky and Winter, 1971) and Overlap Coefficient (OC)/Szymkiewicz-Simpson coefficient values (Vijaymeena and Kavitha, 2016). The pairs of the pathway were subsequently ranked by taking the average of the two measurements as the score defining pathway crosstalk. The JC/JI or J (A, B) is represented by the formula Here, in two given pathways, A and B, the overlapping number of genes is the intersection of path A and path B, and the union of the two paths represents the sum of genes. Pathways with JC ≥ 0.01 and sharing at least one DEG were considered.
The Overlap Coefficient (OC) is represented by the formula OC was used to determine the fraction of genes that were overlapped across pathways. The larger the OC, the higher is the similarity in gene information between two pathways. The degree of overlap was defined based on OC-values. OC = 1 represented a complete overlap, OC ≫ 50% indicated high overlap, 20% ≪ OC <50% moderate overlap, OC < 20% low overlap and OC = 0 was taken as no overlap among the two pathways.

Identifying Disease-Associated DEGs from EMS and MS
The meta-analysis approach (Arasappan et al., 2011;Taminau et al., 2014) for the integrative analysis of multiple gene expression profiles ( Figure 1A) was adopted in this study. Primary screening of the GEO profiles offered 292 and 304 samples in EMS and MS respectively, irrespective of the location of tissue sampling ( Table 1). R/Bioconductor limma package (Smyth, 2004) was used to predict the DEGs between disease and control samples in each dataset and the identified DEGs were merged by taking the union of all individual studies. The genes which are expressed in these studies in the same direction for both diseases were averaged and retained, whereas genes which were observed in opposite directions were discarded. The final dataset represents the entire gene set of all studies (Table S1). The probe annotated as antisens RNA, miRNA, chromosomes, hypothetical, loci, non-coding RNAs, non-functional proteins, non-protein coding genes, ORF, pseudogenes, and uncharacterized genes were not considered. As a result, a total of 711 (corresponding to 796 probes) and 1,516 (corresponding to 1,833 probes) DEGs were obtained in EMS and MS patients, respectively (Table S2). The candidate genes were ranked according to their differential expression in response to disease samples as compared to control samples ( Table 2). Out of the 711 EMS-associated genes, 254 genes were up-regulated and 457 genes were down-regulated. In contrast, 779 genes were up-regulated and 737 genes were down-regulated in MS. The common genes shared by both EMS and MS were observed to be 129 genes (Figure 2A). The greatest fold differential expression observed was a 4.67 fold upregulation of the RBMS3 gene ( binding motif, single stranded interacting protein 3) and a 4.68 fold down-regulation of the SCD gene (stearoyl-CoA desaturase/delta-9-desaturase) in MS as compared with EMS. The acquired DEGs from this study were mapped to the validated disease genes of endometriosis and MS offered in Online Mendelian Inheritance in Man (OMIM) (Amberger et al., 2014) and DisGeNET (Piñero et al., 2015) human genetic disorder databases. This analysis confirmed that 58 and 109 validated genes ( Figure 2B, Tables S3, S4) present in these two databases had also been identified as EMS and MSlinked DEGs, respectively in our datasets. This revealed that the identified DEGs were appropriate to signify the two disease conditions.

Identifying Over-Represented Pathways and Biological Process
DAVID has the capability to independently execute Gene ontology (GO) and pathway enrichment analyses. Hence, GO and pathway (KEGG) enrichment analyses of identified DEGs were separately performed using DAVID. Initially, a complete set of up-regulated and down-regulated genes from EMS and MS were mapped to the terms in the KEGG database (Kanehisa et al., 2016(Kanehisa et al., , 2017. The statistical cut-off criterion of p ≤ 0.05 acknowledged 17 over-represented pathways commonly altered in both EMS and MS. These dysregulated pathways were found to be enriched with 26 overlapping genes in both diseases ( Figure S1 and Table S5). The top five enriched pathways based on shared genes were cell adhesion molecules (hsa04514, 7 DEGs), calcium signaling pathway (hsa04020, 7 DEGs), focal adhesion (hsa04510, 4 DEGs), tight junction (hsa04530, 4 DEGs) and dilated cardiomyopathy (hsa05414, 3 DEGs). The shared pathways enriched with most of the DEGs were associated with inflammatory/immune responses. Similarly, biological processes common to both diseases were identified by mapping up-regulated and down-regulated genes to various GO categories in the GO database. To minimize false-positives, a GO biological process was considered significantly enriched if it contained a minimum number of five genes with p < 0.05. The DEGs from EMS and MS were clustered into 136 and 215 functional groups, respectively (Tables S6A,B). Twenty eight over-represented GO terms of biological processes were found to be affected commonly in both EMS and MS (Table S6C).  Cytoscape plugin BiNGO was used to represent significantly overrepresented GO terms in an enrichment network (Figure 3). The top five shared biological process GO terms were cell adhesion (GO:0007155, 17 DEGs), biological adhesion (GO:0022610, 16 DEGs), neuron differentiation (GO:0030182, 11 DEGs), regulation of apoptosis (GO:0042981, 11 DEGs), and cell morphogenesis (GO:0000902, 10 DEGs). To further explore the biological process of the shared DEGs, a separate enrichment analysis using stand-alone FunRich V3 software was completed. As a result, signal transduction (42.2% DEGs), cell communication (39.5% DEGs), cell growth and/or maintenance (13.6% DEGs), apoptosis (3.3% DEGs) and regulation of immune response (0.6% DEGs) related GO terms were observed to be significantly overrepresented in both diseases (Figure 4). These results also revealed that most of the shared biological processes enriched with maximum number of genes were related to inflammation/immune response. Genes associated with enriched GO terms were then mapped onto the corresponding KEGG pathway. The analysis yielded 30 significant pathways associated with biological processes (386 GO terms) of commonly altered DEGs in both diseases (Table S7A, Figure S2). The top five enriched pathways based on shared DEGs of biological processes were focal adhesion (hsa04510, 19 DEGs), pathways in cancer (hsa5200, 18 DEGs), ECM-receptor interaction (hsa04512, 17 DEGs), ErbB signaling pathway (hsa04012, 16 DEGs), and neurotrophin signaling pathway (hsa04722, 16 DEGs). The dysregulated pathways mainly affected by shared biological processes were associated with the inflammatory/immune responses.

Identifying Hub Genes in Protein Interaction Networks
Protein-protein interactions (PPI) network was constructed by mapping abnormally expressed genes from EMS and MS with PPI data to predict biological significant modules comprising shared genes/proteins and interactions which were likely to play key roles in linking EMS and MS. A zero-order interaction network comprising seed nodes only with interconnected edges was constructed to generated interactions for integrated DEGs. The resulting PPI network scattered in 1-20 subnetwork including one large network with highest nodes and edges. We investigated the network features of large PPI subnetwork using NetworkAnalyst and Cytoscape plug-in "NetworkAnalyzer" that accumulated as an undirected graph (edges have no direction), where the proteins/genes were denoted with "nodes" and the interaction between any two proteins/genes was denoted by "edge." The results disclosed the associations of 1,029 nodes (49.05% of DEGs) and 2136 edges in the network ( Figure S3). We calculated the degree of the genes in the PPI network and found 372 (36.15%) with a degree of one and 657 (63.85%) with a degree greater than one. Out of 657 nodes, a total of 85 nodes were observed with ≥10 connections with other nodes. We observed "betweenness" with a range of 2.5 to 212999.72 for large number of nodes (626; 60.84%) in the constructed network. These observations suggested that hub (high degree or number of connections it has to other nodes) and bottleneck (high betweenness or number of shortest paths going through the node) proteins which are likely to be essential proteins were abundant in the constructed network (Yu et al., 2007;Raman, 2010 (Table  S8A). As hubs contributed in a number of interactions and clutch the network together (Jeong et al., 2001), they are more likely to be master regulators of signaling and transcription. Therefore, the hubs can prove to be helpful as therapeutic targets or biomarkers.
FIGURE 3 | Enrichment network of shared DEGs based on biological processes. Significantly overrepresented biological processes ontology terms were analyzed and visualized by Cytoscape plugin BiNGO. The structure of GO is described in terms of a graph, where each GO term is a node, and the relationships between the terms are edges indicating parent-to-child relationships. The size of a node is proportional to the number of targets in the GO category. The color denotes enrichment significance-the deeper the color on a color scale, the higher the enrichment significance.

Identifying Disease Module through Interaction Networks
The constructed PPI network was evaluated for module detection, which contains a group of proteins that execute similar functions. Thirty four highly connected independent modules were observed. We observed nine overlapping modules with more than 10 nodes (p ≤ 0.05) (Table S8B), signifying possible interaction, shared by the two diseases (Samanta and Liang, 2003;Menche et al., 2015;Caldera et al., 2017). The predicted modules were connected to neighboring modules and ranged in size from 10 to 206 genes. The distribution of highly connected hub nodes (degree ≥ 10) in modules gave highest hub nodes of 9 genes (SUMO1, CTNNB1, HIST1H4E, LRRK2, APC, EPAS1, ACTB, CDH1, and PSEN1) in module 0. The next hub nodes of 7 genes (CUL4A, HNRNPM, DHX9, SRRM2, H2AFX, TP53BP1, and SRSF1) occurred in module 1. An evaluation of the modules by KEGG database exposed a number of dysregulated pathways with an FDR ≤ 0.05. Modules 0, 1, 2, and 6 emerged with shared interactions through 25 commonly dysregulated pathways (Figures 5A-C, Table 3, Table S9). An examination at the pathway level of PPI connecting EMS and MS yielded significant crosstalk through nine overlapping genes and 44 overlapping edges (Figures 6A-F). We also found several genes/proteins were involved in multiple pathways indicating that these PPIs might link the crosstalk pathways together.

Upstream Regulator Analysis of Up-and Down-Regulated DEGs
The current study identifies the enriched upstream transcription factor (TF), intermediate protein and associated protein kinase to understand the mechanisms of shared up-regulated and downregulated genes in both EMS and MS. The X2K approach revealed that POU3F2, BACH1 and 3 were top TFs binding to upregulated genes (Figure 7A), whereas SOX11, AR and TRIM28 were the leading TFs in down-regulated genes ( Figure 7B, Table 4). The comparison of TFs of both upregulated and downregulated genes resulted in the overall divergence of TFs except for AHR and TRIM28. These TFs were found to be connected with 411 and 199 intermediate proteins for up-regulated and down-regulated genes, respectively. These intermediate proteins might be facilitating the TFs to be active in the cells. The study also disclosed the enriched protein kinases such as MAPK1, CSNK1D, and PRKD3 for up-regulated genes (Figure 7A), and MAPK14, ABL2, and INSR for down-regulated genes ( Figure 7B) had connections with a large number of intermediate proteins and TFs ( Table 5). The comparison of predicted kinases of up-regulated and down-regulated genes revealed that overall kinases were dissimilar barring ABL2 from up-regulated genes. Androgen receptor (AR) and nuclear factor-kB p65 (RelA) were observed to be a hub protein of down-regulated genes (via 48 direct and 3 indirect interactions) and up-regulated genes (via 80 direct and 3 indirect interactions), respectively. Therefore, the candidate TFs and their downstream target genes could play vital roles in the progression of autoimmune disease. These can be further explored as potential biomarkers for the diagnosis or treatment target.

DISCUSSION
A number of studies have reported that women suffering from EMS are more prone to develop MS (Alviggi et al., 2006;Nielsen et al., 2011;Mormile and Vittori, 2014;Moghadasi and Salehizadeh, 2017). However, not much data is available The network modules show a high degree of clustering of proteins involved in the identical disease. The hub genes with high degree and high betweenness were denoted with red color. The EMS, MS, and shared DEGs in the PPI network modules were denoted by the color yellow, green and blue, respectively. (C) The bar diagram represents the distribution of hub genes with high degree i.e., hub genes ≥ 10 nodes. The hub gene APP (amyloid beta precursor protein) was found to be connected with the highest number of neighbor's node i.e., 81 in the network.        to explain the immunological or defense mechanisms shared by these two autoimmune diseases. Similarly, the unique and shared molecular links accountable for failure or breakdown of self-tolerance, which lead to the development of MS in woman with EMS are also unclear. The present work has explored the publicly available microarray data for EMS vs. control and the MS vs. control cases and uncovered the shared molecular signatures which probably play a role in linking EMS with MS. Widely used enrichment analysis methods ( Figure 1B) was adopted for the prediction of dysregulated pathways and establish subsequent possible crosstalk between EMS and MS. The enrichment analysis of differentially expressed genes (DEGs) by Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene ontology (GO) and protein-protein interactions (PPI) divulged 46 disease-related pathways commonly disturbed in both diseases (Table S10 and Figure S4). The findings of disease-related pathways confined to the immune system, signal transduction, signaling molecules and interaction, and cell growth and death as the alteration in the suggested 14 pathways (Figure 8, Table 6) is known to be associated with shared risks of pathogenesis for both EMS and MS (Nielsen et al., 2011;Mormile and Vittori, 2014). The dysregulated pathways were mainly affected through common genes and edges associated with the inflammatory/immune responses. Downstream analysis revealed a number of crosstalks of dysfunctional pathways mediated through 23 overlapping unique genes that can interact with the signal transduction pathway (Table S11). Associated genes mainly corresponded to cell adhesion molecules (CADM3, CDH1, CLDN11, ITGB8, NCAM1, NEGR1, and NRXN1), neurotransmitters (CHRM3, GABRB2, GRIA2, GRIA3, and RXFP1), cytokine receptors (TGFBR2 and LEPR), and enzyme families (ERBB3, PLD1, PPP2R5E, and PTPN11). Other shared genes connected with cyclin (CCNB1), inositol trisphosphate receptor (ITPR1), laminin receptor (LAMB1), transport protein (SLC8A1), and adaptor protein (SHC3). These are well known immunomodulatory proteins or immune checkpoints that are negative regulators of the immune system. The up-and downregulated expressions of 10 immunomodulatory proteins were detected commonly in both diseases ( Table 7). A shared gene expression signature could form a common link between these two diseases. These results are consistent with preceding reports that excessive co-stimulation and/or insufficient co-inhibition of immunomodulatory molecules can result in a collapse of self-tolerance leading to the expansion of autoimmune diseases in humans (Ramsay, 2013;Zhang and Vignali, 2016). In addition, six interaction networks through overlapping edges of common dysregulated pathways of EMS and MS were also discovered (Table 3, Figure 6). This result suggested the probable associations of these two diseases through overlapping protein interactions. The identical GO biological processes related to inflammatory/immune responses have shown the functional overlap likely to infer the co-occurrence of EMS with MS (Figure 4). We have also identified a handful of hub genes PTPN11 (degree 15, betweenness 374.27), ERBB3 (degree 11, betweenness 198.32), and CDH1 (degree 10, betweenness 176) that are shared by these two disorders (Figure 5). This is consistent with the previous assumption that hub proteins are encoded by the essential genes and are associated with disease genes (Jeong et al., 2001). The shared pathways interacted in crosstalk systems were observed to be regulated by shared upstream regulators, leading to the activated or repressed immune response. POU3F2 (POU domain, class 3, transcription factor 2), BACH1 (breast cancer type 1 susceptibility protein; Igarashi et al., 2017), and STAT3 (signal transducer and activator of transcription 3; Kortylewski et al., 2005;Ho and Ivashkiv, 2006) were top 3 identified TFs from the shared The triangular symbol denoted to the DEGs from endometriosis, whereas star symbol indicated to the shared DEGs between EMS and MS. The interactions score indicates the interaction confidence between two nodes. All scores rank from 0 to 1, with 1 being the highest possible confidence. The color of edges in the PPI network correspond to the curated databases (blue), and experimentally determined (pink) from known interactions, while gene neighborhood (green), gene fusions (red), and gene co-occurrence (blue) signify predicted interactions. Further representations include text mining (yellow), co-expression (black), and protein homology (light blue).
up-regulated genes, leading to the activated immune response when down expressed in cells (Table 4, Figure 7A). Likewise, SOX11 (transcription factor SOX-11), AR (androgen receptor; Lai et al., 2012) and TRIM28 (transcription intermediary factor 1-beta; Ozato et al., 2008;Chikuma et al., 2012) were top 3 identified TFs from the shared down-regulated genes which lead to repression of the immune response when over-expressed in cells (Table 4, Figure 7B) (Schultz et al., 2002;Sripathy et al., 2006). This suggests the involvement of TFs in the association mechanism of both EMS and MS.

Immune Modulation through Dysregulation of Cytokine
The systemic immune alteration due to dysregulation of cytokine or chemokines production is well known to be associated with the pathogenesis of EMS (Cakmak et al., 2009;Herington et al., 2011) and MS (Sellebjerg et al., 2009;Hasheminia et al., 2015;Becher et al., 2017). The sites associated with inflammation attract leukocytes through a group of low molecular weight proteins, the chemoattractant cytokines or chemokines. The role of CC chemokine receptor 5 (CCR5) and its CC chemokine ligand 3 (CCL3) has been suggested in Th1-type inflammatory/immune response (Bleul et al., 1997;Wu et al., 1997;Bonecchi et al., 1998;Patterson et al., 1999;Nansen et al., 2002). Reduced expression of CCL3 in MS was observed, suggesting a concomitant reduction of CCR5 that might contribute in the development of inflammatory demyelination (Banisor et al., 2005). Interleukin 8 (IL-8/CXCL8) is a chemoattractant for monocytes and neutrophils, which is involved in the attraction and infiltration of leukocytes at the site of inflammation. We observed increased expression of CXCL8 in MS patients, which is generally not perceived in normal patients (Lund et al., 2004;Bartosik-Psujek and Stelmasiak, 2005). We also detected elevated levels of chemokine CXCL13 and its receptor CXCR5 which involve the maintenance of pathogenic B cells in autoimmune diseases like MS (Finch et al., 2013). Likewise, the study of genetic polymorphisms of chemokines elucidates an association between rs2812378 and C-C motif chemokine ligand 21 (CCL21) in the advanced stages of endometriosis (Bellelis et al., 2015). These findings signify that chemokines and their receptors are important for the development and maintenance of innate and adaptive immunity (Esche et al., 2005;Raman et al., 2011). The results also  * TF, transcription factor; # C-score, combine score of p-value and z-score, which is used to rank the identified TFs.
suggest that altered expression of β-chemokines are involved in similar biological processes or function in both EMS and MS (Table S7B). In addition, crosstalk analysis exposed that the stimulation of cytokine-cytokine receptor interaction pathway in EMS, up-regulates a gene LEPR or activates a protein leptin receptor that is also a member of the same pathway in MS. Therefore, the down-stream response of the suggested pathways in both EMS and MS might be regulated by the same activated TFs, resulting in the up-regulated expression of LEPR genes in MS. Previous reports have confirmed that leptin (acts via leptin receptor) is negatively correlated with the production of regulatory T cells and hence associated with immune deficiency (Farooqi et al., 2002). Therefore, the elevated expression of leptin/leptin receptor in EMS prompts immune deficiency which may induce MS.

Immune Modulation through Dysregulation of Adhesion Molecules
Our evaluation indicated that patients suffering from EMS exhibited an upregulated expression of neuronal growth regulator 1 (NEGR1) with downregulated expression of Catherine 1 (CDH1) and integrin subunit beta 8 (ITGB8). A similar trend for these proteins was also observed for MS. The stimulation of cell adhesion molecules (CAMs) pathway in EMS, up-regulates a gene NEGR1 or turns on a protein neuronal growth regulator 1 that is also an element of the identical pathway in MS. Therefore, the down-movement reaction of these pathways in EMS and MS is probably regulated through the same activated TFs, resulting in concomitant up-regulation of NEGR1 genes in MS in patients with EMS. The overexpression of NEGR1 gene prevents synaptogenesis leading to autoimmune or neurodegenerative diseases. These results are consistent with the previous hypothesis that the precise expression of neuronal growth regulator 1 is critical for the neurite outgrowth in the brain, while the dysregulated expression of negr1 gene is known to play a key function in inhibiting neurite outgrowth and synapse formation (Gil et al., 1998;Schäfer et al., 2005;Hashimoto et al., 2009;Pischedda et al., 2014;Sanz et al., 2015). The inhibition of cell adhesion molecule pathway in EMS inhibits the genes CDH1/cadherin-1 and ITGB8/integrin subunit beta 8 in the pathway which co-occurs in MS. Activated TFs are expected to regulate this down-flow reaction resulting in the down-regulation of CDH1 and ITGB8 genes in MS.
The ligand αEβ7 integrin, expressed in several subsets of lymphocytes acts via its receptor cadherin-1 or E-cadherin which is expressed in epithelial cells. This interaction results in adhesion of lymphocytes to epithelial cells, probably important for T cell homing to the intestinal sites (Agace et al., 2000). Our results demonstrated the down-regulation of E-cadherin, which is known to increase cellular motility by weakening cellular adhesion within a tissue. The findings point toward the increased risk of altered intestinal immune reactions and tissue injuries connected with inflammatory diseases, including autoimmune diseases (Yoshimoto et al., 2014). This suggests that αEβ7 could be used as a potential therapeutic target for inflammatory diseases. Likewise, the cytokine transforming growth factor-beta (TGF-β) acts as an immune suppressor during homeostasis, infection and disease (Yoshimura and Muto, 2011;Worthington et al., 2012). The down-regulation of integrin subunit beta 8 (ITGB8) was observed, suggesting the interruption in alpha-V/beta-8-mediated TGF-β activation through dendritic cells which is vital for preventing immune disorder (Travis et al., 2007).

Immune Modulation through Dysregulation of Neurotransmitters
The stimulation of neuroactive ligand-receptor interaction pathway as well as calcium signaling pathway in EMS was observed. This up-regulates a gene CHRM3 or initiates a protein cholinergic muscarinic receptor subtype 3 in EMS which coexists in MS. Consequently, the CHRM3 gene was also seen to be up-regulated in MS. CHRM3 signaling modulates the activation, recruitment and differentiation of progenitor cells or oligodendrocytes precursor cells (OPCs), essential for the remyelination in the central nervous system (Abiraman et al., 2015). The crosstalk analysis disclosed the dysregulation of OPCs by the overexpression of CHRM3 gene, which inhibits remyelination and enhances the risk of MS in patients suffering from EMS.

Immune Modulation through Dysregulation of InsP3R and Transport Protein
Crosstalk research has proven that the stimulation of calcium signaling pathway in EMS, up-regulates the gene ITPR1. Since an identical pathway is also present in MS, the increased ITPR1 levels can induce MS. The protein Inositol 1,4,5-trisphosphate (InsP3/Ins3P/IP3) is an essential signal transduction element of the calcium (Ca 2+ ) signaling pathway involved in the regulation of cellular activities (Berridge, 1993;Hirota et al., 2003). Overexpression of ITPR1 gene results in calcium dysregulation accentuating the risk of autoimmune diseases (Zhuang et al., 2015). Our analysis suggested that the inhibition of the calcium signaling pathway in EMS, down-regulates the same pathway in MS. Hence, the expression of Solute carrier family 11A member 1 (SLC11A1) genes is decreased. SLC11A1 possess an immunomodulatory role in manipulating macrophage activation status (M1/M2) and Th1/Th2 biases. The transcriptional repression of SLC11A1 gene leads to cell proliferation and survival resulting in cancer and autoimmunity (Awomoyi, 2007). The reduced level of SLC11A1 genes is linked with the overexpression of cell proliferation in EMS patients. This increases the probability of developing MS in patients with EMS.

Immune Modulation through Dysregulation of Enzyme Families
The inhibition of calcium signaling/ErbB signaling, leukocyte transendothelial migration and oocyte meiosis pathways as seen by our analysis in EMS, down-regulates the genes ERBB3 (Erb-b2 receptor tyrosine kinase 3), PTPN11 (tyrosineprotein phosphatase non-receptor type 11), and PPP2R5E (phosphatase 2 regulatory subunit B'epsilon), respectively. Common TFs regulate them in MS resulting in their synergistic decreased expression. The reduced expression of ERBB3 gene could contribute to insufficient remyelination associated with the expansion of neurodegenerative diseases like MS and Alzheimer's (Bublil and Yarden, 2007). These results are consistent with the previous study that neuregulin-1/ErbB signaling plays a crucial role in of OPC proliferation, oligodendrocyte differentiation and remyelination in the central nervous system (Vartanian et al., 1997;Canoll et al., 1999;Flores et al., 2000;Calaora et al., 2001). Likewise, protein tyrosine phosphatases (PTPs) act as negative regulators of the immune response in normal and pathophysiological conditions by controlling the activation/inhibition of lymphocytes. The observed reduction in the level of PTPs could be associated with abnormal lymphocyte function in both EMS and MS (Dolton et al., 2006;Rhee and Veillette, 2012). Similarly, the downregulated expression of PPP2R5E genes, exhibit the overexpression of regulatory T cells, ensuing the risk of autoimmunity. This is consistent with reports that protein phosphatase 2A (PP2A) is critical for regulatory T cells to function, for in their absence they no longer possess the ability to suppress effector T cells and thus fail to protect against autoimmunity (Apostolidis et al., 2016).

CONCLUSIONS
The microarray gene expression data from healthy individuals and patients suffering from either of the two related autoimmune disorders, endometriosis or MS, from the GEO database was examined through KEGG, GO, and PPI methods. We identified shared differentially expressed genes in these two related diseases which were involved in common disease-related pathways. We also identified the links between these two diseases by analyzing their overlapping disease genes and edges through proteinprotein interactions. The maximum number of shared genes and edges in shared dysregulated pathways was associated with the inflammatory/immune responses. The observed dysfunctional pathways mediated through an overlapping immunomodulatory protein are anticipated to act as probable linkers in EMS and MS. The enrichment analysis of GO terms clearly showcased the functional overlap of biological processes inferring the cooccurrence of EMS with MS. This analysis also suggested that the identified disease genes not shared through protein-protein interaction might play vital roles in the same or related functions to complete the molecular links. The identified common molecular signatures (such as genes, pathways, transcription factors, and protein kinases) can be further explored as novel targets/biomarkers for the simultaneous treatment of EMS and MS. The findings from this study increase our understanding of the molecular mechanisms affecting both EMS and MS and suggest an interconnection between the two diseases.

AUTHOR CONTRIBUTIONS
AK initiated the research and performed the downstream analyses. PK conceived the idea of the study and designed the experiments. SS assisted in the analysis. AK, PK, and TPS interpreted the results and drafted the manuscript. All authors have read and approved the manuscript for publication.