Mapping Transcriptome Data to Protein–Protein Interaction Networks of Inflammatory Bowel Diseases Reveals Disease-Specific Subnetworks

Inflammatory bowel disease (IBD) is the common name for chronic disorders associated with inflammation of the gastrointestinal tract. IBD is triggered by environmental factors in genetically susceptible individuals and affects a significant number of people worldwide. Crohn’s disease (CD) and ulcerative colitis (UC) are the two distinct types of IBD. While involvement in ulcerative colitis is limited to the colon, Crohn’s disease may involve the whole gastrointestinal tract. Although these two disorders differ in macroscopic inflammation patterns, they share a variety of molecular pathogenic mechanisms, so the diagnosis can remain unclear, and it is important to reveal their molecular signatures at the network level. Improved molecular understanding may reveal disease type-specific and even individual-specific targets. To this aim, we determine the subnetworks specific to UC and CD by mapping transcriptome data to protein–protein interaction (PPI) networks using two different approaches [KeyPathwayMiner (KPM) and stringApp] and perform the functional enrichment analysis of the resulting disease type-specific subnetworks. TP63 was identified as the hub gene in the UC-specific subnet; the p63 tumor protein, which is in the same family as p53 and p73, has been studied in the literature for its association with the risk of colorectal cancer and IBD. APP was identified as the hub gene in the CD-specific subnet, and it has an important role in the pathogenesis of Alzheimer’s disease (AD). This relation suggests that some similar genetic factors may play a role in both AD and CD. Last, in order to understand the biological meaning of these disease-specific subnets, functional enrichment analysis was performed on them. Notably, chemokines (a special type of cytokine) and antibacterial response are important in UC-specific subnets, whereas cytokines and antimicrobial responses as well as cancer-related pathways are important in CD-specific subnets.
Overall, these findings reveal the differences between IBD subtypes at the molecular level and can facilitate diagnosis for UC and CD as well as provide potential molecular targets that are specific to disease subtypes.


INTRODUCTION
Inflammatory bowel disease (IBD) is a chronic disease characterized by inflammation of the gastrointestinal tract, and there are approximately 6.8 million IBD cases worldwide (Jairath and Feagan, 2019; Meyers et al., 2020). Genetic factors have been shown to contribute to the onset of the disease, so individuals with a family history are more likely to develop it (Zhu et al., 2019). IBD can develop at all stages of life and last a lifetime, significantly reducing the quality of life of patients (Pugliese et al., 2020). Moreover, patients with IBD have a high risk of developing colon cancer (Pugliese et al., 2020). Although the cause of IBD is not known exactly, various environmental factors trigger the emergence and progression of the disease in genetically susceptible individuals. IBD includes two similar types of idiopathic bowel diseases, namely, ulcerative colitis (UC) and Crohn's disease (CD), that differ in location and depth of involvement in the intestinal wall. The involvement in UC is limited to the large intestine, whereas CD may involve any part of the gastrointestinal tract from mouth to anus (Zhu et al., 2019; Winter and Weinstock, 2020). While these two subtypes share a variety of molecular pathogenic mechanisms, their macroscopic inflammation patterns are clinically different. Since UC and CD patients tend to show similar symptoms, appropriate diagnosis and treatment options may remain unclear. Improved molecular understanding can reveal disease-type-specific and even individual-specific targets (Vennou et al., 2019; Mitsialis et al., 2020); thus, it is important to reveal the molecular signatures of UC and CD at the network level.
In genomics studies, genes involved in epithelial barrier function and genes related to cellular innate immunity are found to be particularly associated with UC and CD, respectively (Fakhoury et al., 2014; Cohen et al., 2019). The NOD2 gene located at the IBD1 locus is the first gene associated with CD (Hugot et al., 2001). ATG16L1, which is necessary for autophagy in the cell, and polymorphisms of STAT3, which is involved in cellular function by regulating gene activity, are also associated with CD (Magalhaes et al., 2011; Wang et al., 2014). IRF5 polymorphisms (Gathungu et al., 2012) and IL23R variants (Duerr et al., 2006) are associated with both UC and CD. Although most gene loci found for IBD show the same direction of effect for the UC and CD subtypes, some genes have opposite effects. For example, while the NOD2 and PTPN22 genes are risk factors for CD, they showed a significant protective effect for UC (Wawrzyniak and Scharl, 2018).
Understanding the complex mechanisms of diseases is important in both the diagnosis and treatment steps. In systems biology, the molecular understanding that develops with the analysis of biological networks can serve disease type- and even individual-specific goals (Ran et al., 2013; He et al., 2017). One of the most effective approaches for disease-specific subnetwork (subnet) detection is the integration of transcriptome data into protein-protein interaction (PPI) networks (He et al., 2017). Transcriptome data include all gene transcripts (RNA molecules) expressed within the cell and represent the relationship between the information stored and encoded in DNA and the phenotype (Kaur, 2013). PPI networks, on the other hand, describe the interactions between proteins in the organism (Rao et al., 2014). PPI networks and associated experimental data, different for each organism, can form a bridge between cellular processes and disease states (He et al., 2017). By mapping transcriptome data to PPI networks, proteins encoded by genes whose expression levels change significantly under the condition of interest can be determined, and new modules can be obtained (He et al., 2017). This approach can reveal proteins and cellular mechanisms previously unknown to be related to the disease condition. Thus, with an integrated transcriptome and proteome analysis approach, the different mechanisms underlying network dynamics can be highlighted (Rakshit et al., 2014; Chen et al., 2016).
In a study comparing the performance of subnet discovery algorithms under different conditions, KeyPathwayMiner (KPM) (Alcaraz et al., 2014) has been reported to show high performance (Batra et al., 2017). The KPM algorithm was also used to investigate the effect of chemotherapy (Warsow et al., 2013) and to understand the mechanism of Huntington's disease (Alcaraz et al., 2011). Genes in the subnetwork modules extracted by KPM are defined as significantly active genes in relation to the investigated condition (Warsow et al., 2013). In this paper, the expression data of IBD subtypes (UC and CD) were compared, and differentially expressed genes (DEGs) with respect to the healthy state were determined for each subtype. Moreover, disease subtype-specific PPI networks were extracted and topologically analyzed, and transcriptome data were mapped to the PPI networks to identify UC- and CD-specific subnets. The biological significance of the disease-specific subnets was further explored by functional enrichment, revealing the differences between the IBD subtypes at the molecular level. The network-based pathway functional enrichment method can be used to discover molecular mechanisms related to diseases. With this approach, condition-specific functional modules are extracted from large interaction networks, and new modules are obtained and subjected to analysis (Batra et al., 2017). Our results can facilitate diagnosis for the IBD subtypes UC and CD, and provide potential molecular targets.

MATERIALS AND METHODS
The expression data for IBD subtypes (UC and CD) were obtained from the GEO database, and differentially expressed genes (DEGs) with respect to the healthy state were determined for each subtype. All PPI network analyses and visualization were performed by the Cytoscape (3.8.0) application. Cytoscape (Shannon, 2003; Cline et al., 2007) is an open-source software for visualizing, modeling, and analyzing molecular and genetic interaction networks. Cytoscape can be applied to any molecular component and interaction system. In recent years, its use has increased with the emergence of large databases for protein-protein, protein-DNA, and genetic interactions of humans and model organisms (Shannon, 2003; Cline et al., 2007). In our study, two different approaches were used to map the transcriptome data to the disease subtype-specific PPI networks to identify UC- and CD-specific subnetworks (subnets) (Figure 1). The first approach uses KeyPathwayMiner (KPM) to determine the modules in the disease-specific PPI networks, whereas in the second approach, disease- and DEG-related PPI networks are merged using stringApp, and their intersection yields the disease-specific subnets. Last, biological significance of the disease-specific subnets, coming from different approaches, was explored by functional enrichment, revealing the differences between the IBD subtypes at the molecular level. The details of each step are explained in the following subsections, and the workflow of our comprehensive analysis of PPI networks with transcriptome data for the discovery of UC- and CD-specific subnetworks is shown in Figure 1.

FIGURE 1 | Workflow of the comprehensive protein-protein interaction (PPI) network analysis by mapping transcriptome data for the discovery of ulcerative colitis (UC)- and Crohn's disease (CD)-specific subnetworks.

Acquisition and Analysis of Transcriptome Data
Transcriptome data for the IBD subtypes ulcerative colitis (UC) and Crohn's disease (CD) were obtained from the Gene Expression Omnibus (GEO) database under microarray datasets GSE126124 and GSE3365 (Table 1). These microarray datasets were chosen because each contains samples from the UC, CD, and normal (healthy) groups. The acquired transcriptome datasets were analyzed individually by comparing each disease condition with the healthy condition, namely UC vs. healthy tissue and CD vs. healthy tissue, using GEO2R, a web tool that can perform R-based gene expression analysis. As default in GEO2R analyses, we applied quantitative normalization and the Benjamini and Hochberg procedure for controlling the false discovery rate (FDR). This is the most frequently used adjustment for microarray data because of the good balance between limiting significant genes and false positives. Finally, for determining the differentially expressed genes (DEGs) that differ in mRNA level in the UC vs. control and CD vs. control groups, the p-value threshold was set to 0.05, and the direction of the change in gene expression was assigned according to fold change (FC) values. Genes with a logarithm of fold change (logFC) value of 1 or above were considered upregulated, and those with a value of -1 or below were considered downregulated (|logFC| ≥ 1).
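The DEG selection rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text (p-value below 0.05 and |logFC| ≥ 1), not the GEO2R implementation; the `GeneStat` record, the `classify_degs` function, and the example values are hypothetical, and the text does not specify whether the raw or the adjusted p-value was thresholded.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log_fc: float   # log fold change, disease vs. healthy
    p_value: float  # p-value reported by the differential-expression tool


def classify_degs(stats, p_cutoff=0.05, log_fc_cutoff=1.0):
    """Split genes into up- and downregulated DEGs.

    A gene counts as a DEG when p_value < p_cutoff and |log_fc| >= log_fc_cutoff.
    """
    up, down = [], []
    for s in stats:
        if s.p_value >= p_cutoff or abs(s.log_fc) < log_fc_cutoff:
            continue  # not significant, or change too small
        (up if s.log_fc > 0 else down).append(s.gene)
    return up, down


# Hypothetical values for illustration only (not taken from the datasets).
example = [
    GeneStat("GENE_A", 2.3, 0.001),
    GeneStat("GENE_B", -1.4, 0.020),
    GeneStat("GENE_C", 0.6, 0.001),   # significant but |logFC| < 1
    GeneStat("GENE_D", 1.8, 0.200),   # large change but not significant
]
up_genes, down_genes = classify_degs(example)
```

Keeping the two cutoffs as parameters makes it easy to check how sensitive the DEG lists are to the chosen thresholds.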

Identifying Ulcerative Colitis-and Crohn's Disease-Specific Subnets
Mapping Transcriptome Data to Disease-Specific Protein-Protein Interaction Networks With Cytoscape-KeyPathwayMiner
Disease-specific PPI data for UC and CD were extracted using the PSICQUIC web service (October 29, 2020). Then the KeyPathwayMiner (KPM) (version 5.0.1) plugin for Cytoscape was downloaded. The KPM plugin of Cytoscape is able to efficiently uncover all the maximal connected subnets in a biological network. The KPM algorithm takes transcriptome data as a binary indicator of "1" or "0" (Alcaraz et al., 2016), so p-values were recoded as "1" for significantly expressed genes (SEGs) (with p-value < 0.05) and "0" for non-significantly expressed genes. The K-value, which is an important parameter for KPM, indicates how many exception genes (genes that are not significantly expressed) are allowed in a module. The optimal K-value was determined as 5 by trial and error (Alcaraz et al., 2016): significant genes are missed when the K-value is below 5, and no new significant genes are added when it is above 5. In this study, after loading the disease-specific PPI networks to Cytoscape, genes and p-values obtained from the GSE126124 and GSE3365 datasets were arranged in the appropriate format for KPM and uploaded as two separate files. The K-value was set to 5; the transcriptome data in the two sets were logically connected using "AND" and mapped to the PPI network so that new modules for UC and CD were obtained.
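The preparation of the KPM input can be illustrated with a minimal Python sketch. It recodes p-values as 1/0, combines two datasets with a logical AND, and counts how many genes in a candidate module would be exceptions against the K budget. It does not reproduce KPM's search algorithm or its file format; the function names and gene labels are hypothetical, and restricting the AND to genes present in both datasets is an assumption made here for simplicity.

```python
def binarize(p_values, cutoff=0.05):
    """Map each gene's p-value to 1 (significant) or 0 (not significant)."""
    return {gene: int(p < cutoff) for gene, p in p_values.items()}


def combine_and(first, second):
    """Logical AND of two indicator maps, restricted to shared genes (assumption)."""
    shared = first.keys() & second.keys()
    return {gene: first[gene] & second[gene] for gene in sorted(shared)}


def count_exceptions(module, active):
    """Number of genes in a module that are not flagged as active.

    Genes missing from the indicator map are treated as exceptions (assumption).
    """
    return sum(1 for gene in module if active.get(gene, 0) == 0)


# Hypothetical p-values standing in for the two expression datasets.
dataset_1 = {"G1": 0.01, "G2": 0.20, "G3": 0.03}
dataset_2 = {"G1": 0.04, "G2": 0.01, "G3": 0.30, "G4": 0.001}

active = combine_and(binarize(dataset_1), binarize(dataset_2))
n_exceptions = count_exceptions(["G1", "G2", "G3"], active)
fits_budget = n_exceptions <= 5  # K = 5, as used in the text
```

With the AND rule, a gene is treated as active only when it is significant in both datasets, which is the stricter of the two usual ways of combining indicator matrices.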

Merge Analysis of Differentially Expressed Gene-and Disease-Specific PPI Networks With Cytoscape-StringApp
The STRING database is designed to comprehensively combine, evaluate, and disseminate protein-protein relationship information (Franceschini et al., 2012). With StringApp (version 1.6.0), PPI queries can be performed in four different ways: protein query, PubMed query, disease query, and protein/compound query. In this study, the PPI network of DEGs in the two datasets was constructed using the STRING database, and interactions with a composite score of > 0.95 were considered statistically significant. Likewise, when querying the disease names (ulcerative colitis and Crohn's disease) in the STRING database, the maximum number of proteins was set to 500, and interactions with a composite score of > 0.95 were considered statistically significant. The DEG- and disease-specific PPI networks for UC and CD were then merged (intersection analysis) with the Cytoscape application, and new intersection modules were obtained.
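The merge (intersection) step can be sketched as follows. This is a plain-Python illustration of thresholding scored interactions and intersecting two undirected networks, not the Cytoscape or StringApp implementation; the function names, protein labels, and scores are hypothetical.

```python
def filter_edges(scored_edges, min_score=0.95):
    """Keep undirected edges whose score exceeds min_score."""
    return {frozenset((a, b)) for a, b, score in scored_edges if score > min_score}


def nodes_of(edges):
    """All proteins that appear in at least one edge."""
    return set().union(*edges)


def intersect_networks(edges_a, edges_b):
    """Return the proteins and interactions present in both networks."""
    shared_nodes = nodes_of(edges_a) & nodes_of(edges_b)
    shared_edges = edges_a & edges_b
    return shared_nodes, shared_edges


# Hypothetical scored interactions standing in for the two STRING queries.
disease_query = [("P1", "P2", 0.99), ("P2", "P3", 0.97), ("P3", "P4", 0.90)]
deg_query = [("P2", "P1", 0.98), ("P2", "P3", 0.96), ("P5", "P1", 0.99)]

disease_net = filter_edges(disease_query)
deg_net = filter_edges(deg_query)
subnet_nodes, subnet_edges = intersect_networks(disease_net, deg_net)
```

Storing each interaction as a `frozenset` makes the comparison independent of the order in which the two proteins are listed.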

Functional Enrichment of Disease-Specific Network Modules
Last, the new modules obtained with KPM were functionally analyzed using g:Profiler (Reimand et al., 2016), which is an online, user-friendly, and comprehensive tool for functional enrichment analysis. It contains the methods commonly used in standard pipelines of biological entity (gene/protein)-centered computational analysis. g:Profiler currently includes commonly used data sources such as Gene Ontology; KEGG, Reactome, and WikiPathways for biological pathway analysis; TRANSFAC and miRTarBase for regulatory motifs in DNA; and protein databases such as the Human Protein Atlas and CORUM (Raudvere et al., 2019). Gene lists in the new disease subtype-specific modules obtained for UC and CD were given as input to g:Profiler; Benjamini-Hochberg was applied as the statistical correction method, and terms with p-values less than 0.05 were considered significant.
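For readers unfamiliar with the Benjamini-Hochberg correction mentioned above, the following is a textbook-style sketch of how adjusted p-values can be computed. It is not g:Profiler's internal code; the function name, term labels, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment p-values for four terms.
terms = ["term_a", "term_b", "term_c", "term_d"]
raw_p = [0.01, 0.04, 0.03, 0.20]

adj_p = benjamini_hochberg(raw_p)
significant = [t for t, p in zip(terms, adj_p) if p < 0.05]
```

In this toy example three terms have raw p-values below 0.05, but only one remains below the threshold after adjustment, which shows why the correction matters when many terms are tested at once.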

Disease Subtype-Specific Modules Obtained by StringApp
String Enrichment can perform functional analysis of the modules created by STRING; it performs overrepresentation tests for a total of 11 functional pathway classification frameworks. Commonly used frameworks include Gene Ontology, KEGG pathways, UniProt keywords, and Reactome pathways (Szklarczyk et al., 2020). Functional enrichment by String Enrichment was applied on the intersection module obtained after the merge analysis of DEG- and disease-specific PPI networks for UC and CD, and the results were compared.

A literature search of the common up- and downregulated disease-specific genes (Tables 2, 3) confirms their importance in IBD. The regenerating gene (REG) family shows increased expression during IBD-associated inflammation (Xu et al., 2019). OLFM4, secreted by human intestinal epithelial cells, is upregulated in the inflamed mucosa of IBD patients; however, its functional role in IBD has remained uncertain (Kuno et al., 2021). The DMBT1 gene, which is considered a candidate tumor suppressor gene for brain, lung, stomach, and colorectal cancers, has shown increased expression in inflamed tissues of IBD patients, and it has been stated that impaired DMBT1 function may be associated with the onset of Crohn's disease (Renner et al., 2007). Moreover, polymorphisms/mutations of Toll-like receptors (TLR), which are innate immune receptors, have been directly linked to IBD (Lu et al., 2018), and activation of epithelial TLR4 in IBD and colorectal cancer has been associated with upregulation of DUOX2 (Burgueño et al., 2020). It has been reported that MARK2, a master regulator of cell polarity in intestinal epithelial cells, may contribute to the initiation and progression of IBD by interfering with the protein kinase cascade (Yuan et al., 2017). Disease-specific DEGs were further analyzed by functional enrichment using ClueGO, as explained below (Figure 2).

RESULTS AND DISCUSSION
UC-specific genes were found to be mainly associated with signaling pathways such as antibacterial humoral response, positive regulation of interferon-alpha production, and phagocytosis recognition. Considering the antiviral effect of interferon alpha and the importance of phagocytosis in the early stage of bacterial infections, it can be said that UC-specific genes act on various immune system signaling pathways. Although the etiology of UC is not exactly known, it is assumed to be a multifactorial condition that causes an immune response (Tatiya-Aphiradee et al., 2018). The immune response plays an important role in the initiation and progression of UC, and any loss of immune tolerance results in inflammation (Tatiya-Aphiradee et al., 2018). CD-specific genes were observed to be related to chemokines and, again, to the immune system. It has been reported that the immunoregulatory effects of cytokines play an important role in the pathogenesis of IBDs such as Crohn's disease (CD), where they control many aspects of the inflammatory response (Stallmach et al., 2004; Neurath, 2014).

Identified Ulcerative Colitis-and Crohn's Disease-Specific Subnets
Mapping Transcriptome Data to Protein-Protein Interaction Networks With Cytoscape-KeyPathwayMiner
The visualization and topological analysis of the disease-specific PPI network modules extracted from the PSICQUIC database were done with Cytoscape. The UC-specific PPI module contains 88 proteins and 100 interactions (Figure 3A). Topological analysis revealed that there is only one hub gene (TP63) in the UC module. UC transcriptome data were mapped to the corresponding UC PPI network to find new disease-specific modules by KPM analysis (Figure 3B). The resulting UC-specific new module in the PPI network includes 11 nodes mapped to the DEGs (green circles) and one node (TP63, red circle) added by KPM, which was present in the expression dataset but was not found to be a DEG (Figure 4B). The hub node TP63 (with the highest number of connections) is in the same family as the tumor proteins p53 (TP53) and p73 (TP73), which have been studied for risk associated with both colorectal cancer and inflammatory bowel disease (IBD) (Hudspath et al., 2018). Note that the presence of this hub node in the expression dataset without differential expression in the disease condition might be due to its vital role in various cellular processes and the network not being robust to changes in its expression levels. The CD-specific PPI module contains 2,777 proteins and 3,475 interactions (Figure 4A). In the new CD-specific module obtained by KPM analysis, 170 nodes were mapped to DEGs (green circles), and the APP hub node (yellow circle) was mapped to both transcriptome datasets but was significantly expressed in only one. Note that the NOD2 gene, which is known to be related to CD, was not included in the modules; this may be because the NOD2 gene is not subtype-specific but rather generally associated with IBD. The HTT and VHL hub nodes (red circles) were mapped to both transcriptome datasets but were not significantly expressed.
The NCF1 hub node (red pentagon) was mapped to only one transcriptome dataset but was not significantly expressed. The P19711 hub node (red triangle) could not be mapped to the transcriptome datasets (Figure 4B). Similar to the UC-specific subnet, hub genes were mapped to the expression dataset but not differentially expressed in the disease condition, indicating that they are critical for network robustness. These CD-specific hub nodes and their disease relations can be listed as APP in Alzheimer's disease (Neha et al., 2008), HTT in Huntington's disease (HD) (Warby et al., 2011), and VHL as a tumor suppressor gene (Kim, 2004). Topological analysis reveals that APP was the hub gene with the highest number of interactions with the expressed genes in the datasets. In addition, a recent study reported that the APP gene is an important DEG for CD (Li et al., 2020). These results suggest that some similar genetic factors may play a role in both Alzheimer's disease (AD) and CD, as APP is a critical gene in both. Functional interpretation of the modules is given in the functional enrichment section.

Figure caption: Mapping transcriptome data to the CD-specific PPI network reveals a module containing 175 proteins and 217 interactions. Green and red colored nodes correspond to the DEGs in the expression datasets and the genes that are not significantly different from the healthy samples, respectively. Genes depicted in yellow are DEGs in only one of the datasets. Circle-shaped nodes were mapped to both of the expression datasets, nodes with pentagon shape were mapped to only one of the datasets, and triangle nodes were not mapped to any of the datasets.
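The idea of identifying hub nodes by their number of connections can be illustrated with a short degree-counting sketch. The text does not state the exact criterion used to call a node a hub in Cytoscape, so ranking by degree is an assumption here; the function names and the toy edge list are hypothetical and do not come from the UC or CD networks.

```python
from collections import Counter


def degree_table(edges):
    """Count distinct neighbors per node in an undirected network."""
    # Drop self-loops and collapse duplicate or reversed edges.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for edge in unique:
        degrees.update(edge)  # adds 1 to each endpoint
    return degrees


def top_hubs(edges, n=1):
    """Return the n nodes with the highest degree as (node, degree) pairs."""
    return degree_table(edges).most_common(n)


# Hypothetical toy network for illustration only.
toy_edges = [
    ("HUB", "A"),
    ("HUB", "B"),
    ("HUB", "C"),
    ("A", "B"),
    ("B", "HUB"),  # duplicate of HUB-B in reverse order
]
hubs = top_hubs(toy_edges)
```

Other centrality measures, such as betweenness, could be substituted in `degree_table` if a different notion of hub is preferred.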

Merged Differentially Expressed Gene-and Disease-Specific Protein-Protein Interaction Networks Using Cytoscape-StringApp
Using ulcerative colitis as the String Disease Query, a PPI network containing 500 proteins and 1,318 interactions was retrieved. When the proteins that do not interact with the rest of the network were removed, 293 proteins and 1,303 interactions remained in the UC-specific PPI network. In the String Protein Query search, a total of 280 UC-specific DEGs, including 152 up- and 128 downregulated genes, were used. As a result, a PPI network containing 246 proteins and 51 interactions was obtained, and removing the non-interacting proteins resulted in a DEG-specific PPI network of 51 proteins and 49 interactions. These two networks were subjected to merge analysis using Cytoscape, and the intersection of the two modules revealed a UC-specific subnet module containing 12 proteins and 14 interactions (Figure 5). A similar approach was also used for the discovery of a CD-specific subnetwork, and the resulting numbers of proteins and interactions in the PPI networks are shown in Figure 6. Note that using Crohn's disease as the String Disease Query yielded 500 proteins and 1,263 interactions, which reduced to 297 proteins and 1,250 interactions after the elimination of the nodes that do not interact with the rest of the network. On the other hand, using a total of 339 CD-specific DEGs, including 191 up- and 148 downregulated genes, a PPI network containing 317 proteins and 144 interactions was obtained, and removal of the proteins that do not interact with the rest of the network resulted in 80 proteins and 137 interactions. The CD-specific subnet module, which is the intersection of the disease- and DEG-specific PPI networks, includes 19 proteins and 35 interactions (Figure 6).

FIGURE 6 | Analyzing disease- and DEG-specific PPI networks for the discovery of a CD-specific subnetwork with StringApp.

Functional Enrichment of Disease-Specific Subnets
As described above, the disease-specific subnets for UC and CD were obtained by integrating transcriptome data into PPI networks using two different methods (Figures 1, 7, 8). In order to understand the biological meaning of these disease-specific subnets, functional enrichment analysis was performed on them. The functional enrichment of the UC-specific module obtained by KPM showed that these genes mostly play a role in the regulation of the activity of the tumor suppressor TP53 (Figure 7A). Studies have reported the presence of p53 overexpression in UC patients (Lu et al., 2017). Moreover, p53 expression is closely associated with colon cancer development in UC patients, and the prevalence of TP53 is reported to be high in patients with UC and colon cancer (Yashiro, 2014; Du et al., 2017; Lu et al., 2017). Most studies confirm that UC is a risk factor for colon cancer (Yashiro, 2014; Choi et al., 2016; Kobayashi et al., 2017). UC-associated colon cancer develops through the inflammation-dysplasia sequence, so early detection of any malignancy formation in patients with UC is very important (Kobayashi et al., 2017). The p53 protein overexpression that results from mutation of the p53 gene or the development of dysplasia can be used as a biomarker in the diagnosis of UC-associated colon cancer (Du et al., 2017; Kobayashi et al., 2017). On the other hand, the functional enrichment of the UC-specific module obtained by StringApp revealed that the genes mostly play a role in chemokine-related functions and disruption of cells of other organisms (Figure 7B). Chemokines are small cytokines secreted by cells that play a role in immunity and inflammation (Griffith et al., 2014). The UC-specific modules obtained by the two different approaches reveal common functional properties such as CXC chemokine receptor 1/2 and CXC chemokine, cellular response to transforming growth factor beta (TGF-beta) stimulus, T-cell proliferation involved in immune response, and identical protein binding.
The following literature findings support these signature pathways in UC. CXC chemokine receptors 1/2 (CXCR1 and CXCR2) have similar signaling mechanisms (Muthas et al., 2016). Ligands of CXC chemokine receptor 1/2, which are chemoattractants of PMN (polymorphonuclear leukocytes), have been found at elevated levels in the mucosa of UC patients, and the activation of PMN in the colonic mucosa causes tissue damage in patients with UC (Buanne et al., 2007). There is increasing evidence to suggest that CXCL8 (IL-8) has an important role in the pathogenesis of IBD (Williams et al., 2000). CXCL8 binds to CXCR1 and CXCR2 to mediate neutrophil recruitment and trigger cytotoxic action at the sites of infection (Nasser et al., 2009; Fisher et al., 2019). Some studies have reported that the mucosal levels of CXCL8 were elevated in UC but not in CD (Mahida et al., 1992; Bruno et al., 2015), and CXCL8 has been reported to mediate inflammation in UC (Zhu et al., 2021). The cytokine TGF-beta has critical functions in the fibrosis process such that it regulates the genes involved in wound healing, including enhancing extracellular matrix (ECM) formation, disordering the ECM cycle, and in the growth of connective tissue and insulin (Burke et al., 2007; Li and Kuemmerle, 2014; van Haaften et al., 2020). UC is characterized by cytokine production and T-cell infiltration (Koch Hansen et al., 2014). The pathogenesis of UC is associated with differences in immune regulatory T cells (Watanabe et al., 1997). Known for its anti-inflammatory roles, the IL-10 cytokine has an important role in suppressing the exacerbation of disease symptoms of UC (Wang et al., 2020). Since IL-7 has an important role in the proliferation and differentiation of T cells, it contributes to the disruption of immune regulatory T cells in UC (Watanabe et al., 1997).
The functional enrichment of the CD-specific module obtained by KPM mainly yielded general biological pathways, namely, protein binding, acetylation, response to organic substance, intracellular organelle lumen, and Ubl conjugation (Figure 8A). On the other hand, functional enrichment of the CD-specific module obtained by StringApp results in functional pathways such as IL-17 signaling pathway, cytokine-cytokine receptor interaction, cell killing, positive regulation of neutrophil migration, and cellular response to lipopolysaccharide (Figure 8B). The following literature findings support these signature pathways in CD. Interleukin-17 (IL-17) has been identified as a main immunoregulatory cytokine whose disturbance can cause IBD (Hudspath et al., 2018). Owing to the widespread expression of IL-17, the expression of various cytokines and chemokines is reported to be induced (Kim, 2004). The CD-specific modules obtained by the two different approaches reveal common functional properties such as cytokine production, interleukin-1 (IL-1) receptor activity, NF-kappa B (NF-κB) signaling pathway, bladder cancer, and growth factor activity. In CD patients, greater cytokine release and tissue damage were observed in inflamed tissues compared with non-inflamed tissues (Sarrabayrouse et al., 2020). Interleukin-1 (IL-1) is one of the cytokines that promote inflammation (Dinarello, 2019). IL-1α and IL-1β are proinflammatory cytokines with similar structures; they bind to the same receptor and are present in different signaling pathways such as JNK, with NF-κB as the main active pathway (Dinarello, 2011; Garlanda et al., 2013; Anka Idrissi et al., 2021). Dysregulation of the NF-κB signaling pathways involved in regulating the immune response and inflammation is directly related to CD (Shih and Targan, 2007; Buttó et al., 2015; Han et al., 2017; Nissim-Eliraz et al., 2021).
Abnormal activation of NF-κB leads to overproduction of proinflammatory cytokines that cause chronic inflammation in the gut (Han et al., 2017). Studies of CD patients have observed a significant increase in the number of NF-κB-positive cells in the inflamed area compared with the non-inflamed areas (Ellis et al., 1998). NF-κB activation status may reflect the inflammatory load in CD: CD patients with high NF-κB activation showed specific clinical signs such as higher frequency of ileocolonic involvement and lower frequency of perianal involvement compared with patients with low NF-κB activation (Han et al., 2017). Studies have shown that patients with CD are more likely to have bladder cancer than patients with UC (Geng and Geng, 2021; Zhang et al., 2021).

CONCLUSION
Approaches that integrate transcriptome data into PPI networks to obtain disease-specific subnetworks (modules) can be used to understand the complex nature of diseases and can reveal previously unknown relations of proteins and cellular mechanisms with diseases. UC and CD are subtypes of IBD. The diagnosis and treatment processes of these diseases still remain unclear due to the complexity of their pathogenesis. In this study, we identified UC- and CD-specific subnets by mapping transcriptome data to PPI networks in order to reveal the molecular signatures and important functional pathways of these IBD subtypes. First, the analysis of the GSE126124 and GSE3365 expression datasets showed UC- and CD-specific genes that significantly differ in mRNA level with respect to healthy cases (DEGs). Functional enrichment of these DEGs revealed that UC-specific genes act on various immune system signaling pathways, such as antibacterial humoral response, positive regulation of interferon-alpha production, and phagocytosis recognition. On the other hand, CD-specific genes were observed to be related to chemokines and, again, to the immune system. Then, new modules specific to the CD and UC disease subtypes were identified employing two different approaches. As a result of the topological analysis of the UC- and CD-specific modules obtained by KPM, TP63 was identified as the hub gene in the UC-specific subnet; the p63 tumor protein, which is in the same family as p53 and p73, has been studied for risk associated with both colorectal cancer and IBD. APP was identified as the most connected hub gene in the CD-specific subnet, and it has an important role in the pathogenesis of Alzheimer's disease (AD). This relation suggests that some similar genetic factors may play a role in both AD and CD. Last, in order to understand the biological meaning of these disease-specific subnets, functional enrichment analysis was performed on them.
The UC-specific modules obtained by the two different approaches reveal common functional properties such as CXC chemokine receptor 1/2 and CXC chemokine, cellular response to transforming growth factor beta (TGF-beta) stimulus, T-cell proliferation involved in immune response, and identical protein binding. The CD-specific modules obtained by the two different approaches reveal common functional properties such as cytokine production, interleukin-1 (IL-1) receptor activity, NF-kappa B (NF-κB) signaling pathway, bladder cancer, and growth factor activity. Notably, chemokines (a special type of cytokine) and antibacterial response are important in UC-specific subnets, whereas cytokines and antimicrobial responses as well as cancer-related pathways are important in CD-specific subnets. Overall, these findings reveal the differences between the IBD subtypes at the molecular level and can facilitate diagnosis for UC and CD as well as provide potential molecular signatures that are specific to disease subtypes.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
SEA conceived the work. SEA and SFM collected, analyzed and interpreted the data, and wrote the manuscript. Both authors contributed to the article and approved the submitted version.