Cellular Signaling Pathways in Insulin Resistance-Systems Biology Analyses of Microarray Dataset Reveals New Drug Target Gene Signatures of Type 2 Diabetes Mellitus

Purpose: Type 2 diabetes mellitus (T2DM) is a chronic and metabolic disorder affecting large set of population of the world. To widen the scope of understanding of genetic causes of this disease, we performed interactive and toxicogenomic based systems biology study to find potential T2DM related genes after cDNA differential analysis. Methods: From the list of 50-differential expressed genes (p < 0.05), we found 9-T2DM related genes using extensive data mapping. In our constructed gene-network, T2DM-related differentially expressed seeder genes (9-genes) are found to interact with functionally related gene signatures (31-genes). The genetic interaction network of both T2DM-associated seeder as well as signature genes generally relates well with the disease condition based on toxicogenomic and data curation. Results: These networks showed significant enrichment of insulin signaling, insulin secretion and other T2DM-related pathways including JAK-STAT, MAPK, TGF, Toll-like receptor, p53 and mTOR, adipocytokine, FOXO, PPAR, P13-AKT, and triglyceride metabolic pathways. We found some enriched pathways that are common in different conditions. We recognized 11-signaling pathways as a connecting link between gene signatures in insulin resistance and T2DM. Notably, in the drug-gene network, the interacting genes showed significant overlap with 13-FDA approved and few non-approved drugs. This study demonstrates the value of systems genetics for identifying 18 potential genes associated with T2DM that are probable drug targets. Conclusions: This integrative and network based approaches for finding variants in genomic data expect to accelerate identification of new drug target molecules for different diseases and can speed up drug discovery outcomes.


INTRODUCTION
Type 2 diabetes mellitus (T2DM) is a metabolic and complex disease that is characterized by hyperglycemia in the context of insulin resistance and relative lack of insulin (Kumar et al., 2005). Globally, it is estimated that there are more than 285 million people with T2DM making up about 90% of diabetes cases (Melmed et al., 2011). The disease mechanism is known to a considerable extent and tissues including pancreatic islets, liver, skeletal muscle, adipose tissues, gut, and the immune system play a role in its progression (Kolb and Eizirik, 2011). Although several key factors including lifestyle, diet, obesity and genetic shave been recognized in the progression of insulin resistance and T2DM (Polonsky et al., 1996;Florez, 2008;Ripsin et al., 2009), the underlying mechanisms remain unclear. It has become a progressively challenging health issue due to its high morbidity, mortality, and heightened incidence worldwide (Melmed et al., 2011).
Recent advances revealed that diabetes is a heterogeneousdisease with complex genetic mechanisms. Several biological systems seem to be connected in the progression and development of T2DM; however the limited understanding of the complications of these systems and their interactions has been a major obstruction in the progress of optimal treatments in T2DM. Most cases of diabetes involve many genes, with each being a minor contributor to an intensified possibility of becoming a type 2 diabetic (Melmed et al., 2011) and similarly genes connected with T2DM poorly signify established pathways of insulin signaling (Florez, 2008). The existing methods to find statistically significant functional classes in T2DM related genes have recognized enrichment of cell cycle regulation (McCarthy, 2010;Voight et al., 2010). Nonetheless, the functional categories and therapeutic role of the expressed genes in T2DM and molecular biology of insulin resistance has not been completely understood (Voight et al., 2010). Therefore, significant gaps in clinical outcome still remain within each of these problems, leading investigators to continue searching for more improvement.
Differential expression in islets from diabetic and control individuals explored the list of genes related to type 2 diabetes mellitus. Among the list of probable genes, CHL1, LRFN2, RASGRP1, and PPM1K were significantly associated with insulin secretion and diabetes type 2. During this global expression analysis, it was found that fifty genetic loci associated with T2DM due to genetic co-expression and proteinprotein interaction involved in insulin secretion and HbA1c (Taneera et al., 2012). It has been observed that the effect of genetic variations on incessant glycemic events in nondiabetic individuals primarily reveal perturbation of insulin secretion (Jain et al., 2013). Another systems biology approach based on genome wide association studies explored the T2DM pathophysiology and insulin signaling genes (Jain et al., 2013). Similarly, the abnormal secretion of glucagon led to islet inflammation in T2DM and it has been seen the interleukin-6 is involved to stimulate the glucagon secretion (Chow et al., 2014).
Genomic expressions in insulin signaling and integrated pathways may manifest themselves and to interrupt any one of these genes could develop the clinically significant insulin resistance and diabetes (Melmed et al., 2011). The systems biology approach potentially integrate these biological networks and will help in revealing key elements involved in pathogenesis. As genetic expression is vital to better understand the network of systems biology, thereby cDNA microarray technology is a valuable tool for analyzing expression levels of thousands of genes at the same time. The large number of expression datasets in the public domain provides a rich source for genome-wide information on T2DM and affords an opportunity to do expression study with a large number of samples.
Therefore, we executed differential analysis to show the target gene signatures associated with insulin resistance and T2DM. In particular, by probing microarray data, we attempted to find a statistically significant T2DM-related differentially expressed genes in diabetic tissue compared to normal. In our study, we constructed the metabolic pathways to uncover new drug targets. We began the analysis by aiming on insulin-signaling and associated cellular genes, a natural and well-established candidates for finding a signature set of genes (Taniquchi et al., 2006) associated with insulin resistance or diabetes. The framework established in this paper is designed to focus key questions: (1) Can biological processes be recognized that are deregulated in metabolic pathways of insulin resistance and diabetes (2) can genetic interaction networks be helpful to reveal new drug targets and biomarkers for optimizing the treatment strategies. Studying these molecular networks from the prospective of probing new drug targets can deliver valuable insights in both biological and medical research. The comprehensive illustration of our study framework has been shown in Figure 1.

Source Data
The aim of this study was to find new drug target gene signatures associated with insulin resistance and T2DM. The study design of this dataset indicated to extract RNA from the vastus lateralis of normal (NGT), glucose intolerant (IGT) and type 2 diabetic individuals (total: 118 samples). Our analyses in this study restricted to genes commonly covered by hgu133plus2 chips. We accessed the source expression data for the AffymetrixHG-U133_Plus_2 microarray GSE18732 (Gallagher et al., 2010) from Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/query/acc. cgi?acc=GSE18732). The GPL9486 Affymetrix GeneChip Human Genome U133 Plus 2.0 Array (CDF: Hs133P_Hs_ENST, version 10) (Affymetrix, Inc., Santa Clara, CA, 95051, USA, Technology: in situ oligonucleotide) platform was used, and the annotation information (hgu133plus2) of probes was used to detect the gene expression. We used computational analysis using R (http://www.r-project.org) and BioConductor (http://www.bioconductor.org) packages.

Normalization and Differential Expression Analysis
We organized the pheno-data files of this dataset in recognizable format (Troyanskaya et al., 2001). The data was normalized to the median expression level of each gene using the bioconductor "ArrayQuality Metrics" package (Bolstad et al., 2003;Fujita et al., 2006;Obenchain et al., 2014). The expression of a transcript with detection p-value 0.15 was considered marginal. We log transformed and quantile normalized the arrays to make sure that they were on the same scale, and computed the genegene covariance matrix across all arrays (54675 affyids), ignoring missing values. In order to get a summary of intensities, the Robust Multi-array Analysis (RMA) was used to correct the background (Troyanskaya et al., 2001) for perfect matches (PM) and mismatches (MM). We used the RMA-algorithm to calculate averages between probes in a probe set. To measure the quality of RNA in these samples, AffyRNAdeg, summaryAffyRNAdeg, and plotAffyRNAdeg packages was used for degradation analysis (Affymetrix, 1999(Affymetrix, , 2001. We performed relative study and identified differentially expressed genes by pair wise comparison from genomic experiments (Tusher et al., 2001) and multiple testing corrections were completed by Benjamini-Hochberg method (Benjamini and Hochberg, 1995). The Limma package, a modified statistic that is proportional to the statistic with sample variance-offsets, was used to shortlist the DEGs and duplicate spots and quality weights were measured. The moderated statistics were calculated; genes were prioritized with respect to the resulting scores and p-values. A false discovery rate (FDR) less than 0.05, p ≤ 0.05, Average Expression Level (AEL) ≥40% and an absolute log fold change (logFC) greater than 1 were set as the significant cutoffs (Jin and Da, 2013).

K-Fold Validation
We employed K-Fold study of cross-validation and bootstrap for accuracy estimation in differential analysis (Seymour, 1993). The advantage of this method is that all the samples in the dataset are eventually used for both training and testing. K-fold technique is generally better for determining approximate average error and it was used to validate the shortlisted differentially expressed genes using the bioconductor "boot" package. Boots trapping is successfully being used to correct biases in analysis (Ripley, 2010). We applied the generalized linear Gaussian models and used the "cv.glm" function to assess the k-fold cross validation for these cases. The true error is estimated as the average error rate: (1) The Gaussian function was trailed by the Leave-One-Out-Cross-Validation (LOOCV) procedure. The LOOCV method is intuitively termed as one is left out as the testing-set and remaining data are used as the training-set (Ripley, 2010). For each experiment, we used N-1 subsets for training and the remaining for testing. The true error is estimated as the average error rate on test cases: By increasing the number of folds, the bias of the true error rate estimator will be small and correct (Richard and Dennis, 1984;MAQC Consortium, 2010).

Disease-Gene Interaction and Cluster Analysis
Biomedical text mining system is useful to extract specific information from the literature based on the interactions among different types of biomedical entities (Clematide and Rinaldi, 2012). So, from the list of shortlisted DEGs, we investigated the insulin resistance and T2DM associated genes using diverse FIGURE 2 | Normalization and analysis of array quality metrics shows a color heatmap of the distances between arrays. The color scale is chosen to cover the range of distances encountered in the dataset. Patterns in this plot can indicate clustering of the arrays either because of intended biological or unintended experimental factors (batch effects). The distance d ab between two arrays a and b is computed as the mean absolute difference (L 1 -distance) between the data of the arrays (using the data from all probes without filtering). In formula, d ab = mean | M ai -M bi |, where M ai is the value of the i-th probe on the a-th array. Outlier detection was performed by looking for arrays for which the sum of the distances to all other arrays, S a = b d ab was exceptionally large. 12 such arrays were detected, and they are marked by an asterisk, *. We performed the Absolute Pearson correlation cluster analysis (Eisen et al., 1998) based on expression values in each sample of T2DM-associated differential expressed genes to explore expression profiling and biological functions (Nam and Kim, 2008) using the CIMminer tool (Scherf et al., 2000).

Gene Network Analysis and Identifying Gene Signatures
Proteins usually interact with each other to carry out biological functions (Li et al., 2004;Muhammad et al., 2014) and therefore FIGURE 3 | Side-by-side plot produced by plotAffyRNAdeg representing 5 ′ to 3 ′ trendpresenting an assessment of the severity of degradation and significance level.
gene network aims to find biological processes that are steadily deregulated across a cDNA data related with disease conditions in human tissues. In PPI network, each protein is considered as belonging to one or more gene-sets connected with biological or molecular functions (Rachlin et al., 2006). The normal function of these biological networks may show much altered activity in the disease state compared to normal.
To overview the global network of DEGs of microarray dataset, genes in the connection groups were retrieved with a high confidence score (0.999) in the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) version 10 (Szklarczyk et al., 2011) and HAPPI (Human Annotated and Predicted Protein Interaction) databases (Chen et al., 2009) for protein-protein interactions. These databases mines and annotate comprehensive physical and genetic mapping described in the primary peer-reviewed literature and includes the data that is validated by experimental studies in an inclusive form to support simulation analysis of biological networks and estimation of gene/protein functions. We used Cytoscape software (version 3.2.1) to visualize and analyze molecular and interaction networks (Cline et al., 2007). In this network, we determined the role of each gene signatures (target genes) in type 2 diabetes mellitus that interacted with T2DM-related seeder genes (source genes) by gene mapping using CTD, PubMed, OMIM, MeSH and PMC databases. The motivation for genemapping in the network is to find potentially T2DM-relatedgene signatures is the hypothesis that genes whose dysfunction contributes to a disease phenotype tend to be functionally related. The total number of gene signatures associated with each seeder protein was measured. We assembled the gene signature that are associated with pathways of interest leading to T2DM and constructed a molecular sub-network of these genes that are highly transcriptionally affected in the diabetes state. We used Network Analyzer in Cytoscape to calculate topological network properties. Nodes in the network were categorized according to the degree of association of gene signature with T2DM. Gene ontology (GO) enrichment of the network help us to show biological functions (Nam and Kim, 2008;Muhammad et al., 2015), and it was carried out using the web-based DAVID (Database for Annotation Visualization and Integrated Discovery) (Huang et al., 2009) and FunRich Annotation tools (Pathan et al., 2015). For these set of gene signatures, p-value and FDR were assigned to the number of conditions where it is enriched. The gene-sets with a substantial p-value were considered as transcriptionally affected in a wide range of diabetes associated samples.

Prediction of Gene Signature Specific MiRNA Targets
MiRNAs are considered as post-transcriptional regulators of a large set of genes involving in many biological processes and signaling pathways. So, a useful step for understanding their functional role is characterizing their influence on the gene targets that help us to understand the disease etiology (Alshalalfa and Alhajj, 2013). Using miRNA influence as a functional signature is promising to find molecular connotations between miRNAs and related gene signatures. MiRNA targets of T2DMrelated gene signatures were determined by microRNA target predictor (powered by miRanda, mirSVR) and structure duplex sequences were predicted. MiRNA targets were selected based on the mirSVR score (<= −0.1) which is considered as "good" score (Betel et al., 2008).

Integrated Genome-Scale Pathway Reconstruction with Putative T2DM Linked Genes
A major goal of systems biology is to reconstruct and model in silico the metabolic networks of disease related genes. We analyzed the integrated, interactive and metabolic network of T2DM-related gene signatures and observed the correlation between these pathways. Cellular and signaling pathways were reconstructed from the combined gene signatures using PathVisio 3tool (Kutmon et al., 2015). These genes were mapped and curated using KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways on the basis of literature and database evidence. KEGG, a public domain database generally used for gene-enrichment analysis and pathway visualization (Bergholdt et al., 2012;Califano et al., 2012), has a total of 199-unique human pathways with 5197 unique genes/proteins (http://www.genome.jp/kegg/pathway.html). In this integrated network, the potential role of each gene signature in each pathway was studied. To verify known role of these pathways in T2DM, the PubMed was curated using the key words "type and 2 and diabetes and insulin and (signaling or resistance or sensitivity) and (secretion or pancreatic or islets)" in combination with terms indicating each of these pathways. Genes interacted with disease are involved to share functional relationships and represent pathways of interest for the pathogenesis of insulin resistance and T2DM.

Drug-Gene Network
In drug-gene network, we investigated for genes that interrelate with anti diabetic-drugs using CTD (http://ctdbase.org/) database. CTD is a source of physically curated chemicalgene, chemical-disease and gene-disease interactions from the literature (Davis et al., 2011). We used chemical-gene interaction query for each gene (T2DM-related) in CTD and accessed drugs using the default parameters. In this interaction, drugs were directly linked with T2DM-associated gene were sorted. We used DrugBank database to verify the FDA-approval status of each drug in the interaction network.

Gene Expression Data and Normalization
We used human GEO dataset to find new drug target gene signatures related to insulin resistance and type 2 diabetes mellitus. The cDNA data has118-samples with 54675 genes derived from the study design of mRNA expression profiling of skeletal muscle of type 2 diabetes (Gallagher et al., 2010). The AffyBatch object comprises the size of the array 1164 × 1164 features with 54675 affyIDS. The quantile normalization of the probes showed quality metrics of the normalized distances between arrays of entire DNA chip. Patterns in this metrics revealed clustering of the arrays either because of intended biological or unintended experimental factors (Figure 2). The individual probes in a probe set was organized by location relative to the 5 ′ -end of the targeted RNA molecule. The 3 ′ /5 ′ intensity gradient has been shown to depend on the degree of competitive binding of specific and of non-specific targets to a particular probe. Poor RNA quality is related with a reduced amount of RNA quantity hybridized to the array followed by a declined total signal level. Increasing degrees of saturation decrease the 3 ′ /5 ′ intensity gradient, and we found that short probe sets near the 3 ′ -end of the transcripts (Figure 3). The function summary AffyRNAdeg produced a single summary-statistic for each array in the batch (Supplementary Table 1) indicating an assessment of the severity of RNA-degradation and significance level.

Identifying Differentially Expressed Genes (DEGS) and Cross-Validation
An automatic process was used to execute pair-wise comparison between biologically-comparable groups that found a total of 50 DEGs (all down regulated) from expression profiling in the skeletal muscle of normal (NGT), glucose intolerant (IGT) and type 2 diabetic (T2DM) samples (Supplementary Table 2). For reliable results and verification of differential analysis, we let off any sub-group without repetition from the comparisons and the "cv.glm" function of generalized linear models estimated the cross validation prediction error. The dispersion criterion for Gaussian is 0.088314 which shows the confidence level (Table 1). We obtained the same delta value of 0.08847 with K-folds estimation as we used the LOOCV method (during raw cross validation and then during adjusted cross validation). The significant codes (0.1, 0.01, 0.001, and 0.05) with minimum deviance residuals indicated the quality of differential analysis.

Identifying T2DM Associated Genes and Cluster Analysis
Among differentially expressed genes, 9 T2DM-related genes were identified including: ZEB1, USP16, IL6ST, ASPH, Eif4g1, RBL2, MEF2A, vapB, and SOS2 after disease-gene interaction using CTD, PubMed, OMIM, MeSH and PMC databases. The role of each gene in T2DM was curated and counted (Figure 4). To show the relationship between these differentially expressed genes and T2DM, we estimated the "similarity" between disease-gene interaction by calculating the Absolute Pearson correlation cluster analysis from two profiles (Figure 5). Clustering analysis has recognized to be helpful to understand gene function, gene regulation, and cellular processes. The genetic expression profiling of skeletal muscle of normal (NGT) is distinguished from the glucose intolerant (IGT) and type 2 diabetic (DM) samples, signifying that obvious differences existed among these cases (treated and untreated).

Gene Network Analysis and Finding Gene Signatures
In genetic network of differentially expressed genes, total of 885 nodes and 959 edges were retrieved from STRING and HAPPI databases (Figure 6). This entire network showed that T2DM-related DEGs were found to interact with other functionally related potential genes that are contributing to a disease phenotype. We identified 31-gene signatures associated with T2DM by disease-gene mapping using CTD, PubMed, OMIM, MeSH and PMC databases ( Figure 7A).
FIGURE 5 | Cluster analysis of diabetes type 2-related differentialy expressed genes with 1-Absolute Pearson correlation (Binning method: Equal width). Blue corresponds to small distance and Red to large distance. Lines indicate the clusters boundaries in the level of the tree.

Pathways Model with Putative T2DM Associated Genes
Genes in T2DM-interactome was studied for pathways modeling which revealed that several pathways are involved in T2DMpathophysiology. Other than insulin-signaling and T2DM pathway that relates to both insulin secretion and insulinsignaling, the other pathways such as JAK-STAT, MAPK, TGF, Toll-like receptor, p53 and mTOR, adipocytokine, FOXO, PPAR, and P13-AKT signaling pathways have all been connected in T2DM ( Figure 9A). Although we found enrichment of several pathways associated with gene signatures, insulin signaling was obvious in over-represented pathways model. Collectively, our analysis determined 11-signaling pathways as a connecting-link between gene signatures in insulin resistance and T2DM. The database was curated to verify the known role of these pathways in T2DM. In this study, we found 8-gene signatures associated with JAK-STAT signaling pathways followed by the FOXO and MAPK pathways (7 and 6-genes respectively) ( Figure 9B).

Finding Potential Anti-T2DM Drug Targets in DG-Network
We used a toxicogenomic approach for drugs-genes (DG) interaction to further explore the existing treatment and better understanding of disease etiology. Gene that interact with antidiabetic drugs metformin, mipyridamole, leptin, troglitazone, pioglitazone, acarbose, decitabine, tolbutamide, decitabine, gliclazide, vildagliptin, sitagliptin, estradiol, saxagliptin, liraglutide, exenatide, and few others were identified using the publicly available CTD database. Among them, we found 13-FDA approved drugs. In this interaction, we identified 18-genes as potential drug targets (Figure 10) involved in type 2 diabetes mellitus.

DISCUSSION
The current study signifies the important relationship of genetic variation with gene expression and functional role of these genes in disease. The analyses provide a list of potential T2DM genes based upon differential expression in skeletal muscles, interaction with known T2DM-related gene signatures and correlation with metabolic pathways. The expression profiling of these genes is indicating the obvious differences in skeletal muscle of normal samples from the glucose intolerant (IGT) and type 2 diabetic (DM) samples (Gallagher et al., 2010). We found 50 down regulated differentially expressed genes that showed the interaction with known T2DM-associated genes (ZEB1, USP16, IL6ST, ASPH, Eif4g1, RBL2, MEF2A, vapB, and SOS2) after mapping in databases. The dysregulation and functional aberration of these differential genes has also been studied (Baxter, 2008;Pihlajamäki et al., 2009;Jewell et al., 2010;Jowett et al., 2010;Nitert et al., 2012;Reddy et al., 2012;Chow et al., 2014;Liew et al., 2014;Neglia et al., 2014) in type 2 diabetes progression. The T2DM linked genes including ZEB1, USP16, IL6ST, ASPH, Eif4g1, RBL2, MEF2A, vapB, and SOS2 effect on pancreatic β-cells, peripheral glucose uptake in muscles, the secretion of multiple cytokines, β-cell gene expression, islet cells, β-cells chromatin and proliferation attenuation (Baxter, 2008;Jowett et al., 2010;Nitert et al., 2012;Chow et al., 2014;Liew et al., 2014). The genetic networks extended the analysis of transcripts to predict interactive gene signatures that have role in diabetes pathophysiology. The molecular sub-network revealed the direct interaction of functionally related 31-gene signatures with seeder genes. In this interaction, we found significant number of gene signatures (13-genes) in connection with T2DM-related IL6ST seeder gene. The variant role of family of IL6-genes has been studied in type 2 diabetes (Chow et al., 2014). Similarly, we observed that SOS2 was another seeder gene that was linked with SRC, INSRR, EGFR, FGFR1, PGFRA and PGFRB gene signatures that were significantly associated with insulin resistance and type 2 diabetes (Davidson et al., 2012;Singh and Kakkar, 2013;Zheng et al., 2013;Li et al., 2015). These observations indicated that the aberration in DEGs expression precede the disturbances in gene signatures ultimately causes type 2 diabetes. The curation and mapping of these gene signatures with T2DM further verified this relationship.
To gain insight into the direction of systems biology, these genes are considerably enriched with Insulin signaling pathway, insulin receptor signaling, interleukin-6-mediated signaling, MAPK cascade, JAK-STAT cascade, regulation of insulin secretion and triglyceride metabolic process. We find that T2DM-gene signatures identified in enrichment analysis can elucidate disease conditions when the interlinked genes are taken together with their protein and functional level interactors (Lee et al., 2011;Lakshmanan et al., 2012;Chow et al., 2014;. These signaling pathways contained the significant gene regulatory network of transcript factors families for type 2 diabetes including SP1, NFIC, ZFP161, and FOS, JUND, JUNB. The pathways modeling and integrative network based analysis of gene signatures revealed 11-signaling pathways including insulin secretion, insulin signaling, JAK-STAT, MAPK, TGF, Toll-like receptor, p53 and mTOR, adipocytokine, FOXO, PPAR, and P13-AKT signaling pathways have all been connected in T2DM. Recent reports and literature search indicated our genomic, interactomic, and toxicogenomic evidence to converge on vital pathways including insulin signaling, JAK-STAT signaling, P13-AKT signaling, FOXO signaling, and TGF-beta signaling, the vital pathway involved to play a critical role in pancreatic islets maturity and function, and insulin secretion (Jain et al., 2013). Collectively, we find that genes link directly to insulin secretion and indirectly, through communication with other genes, to insulin resistance and T2DM.
Gene-drug interactions of great interest because such association can not only expressively improve our understanding of disease pathophysiology, but also are helpful in drug discovery processes. The disease related genes-drugs association network can be improved by data mining and biomedical linkages (Chen et al., 2008). Our toxicogenomic-based approach supported this analysis. In this network, 13-FDA approved and few  FIGURE 10 | Drug-Gene network (DG-network). The DG-network is generated between the reported drugs and their target gene signatures (55-nodes and 63-edges). Circles and rectangles correspond to target genes and drugs, respectively. A dotted link is placed between a drug and a target node if the gene is a known target of that drug while solid link denotes the potential drug targets. Color codes are given in the legend. DrugBank_ID has been shown for these drugs. The drugs-gene signature assocaition was curated using PMC, CTD, and Drug Bank databases.
non-approved drugs were associated with T2DM-related genes leaving the 18-genes as potential drug targets. All drugs are FDA approved except troglitazone which has been withdrawn from the market due to its idiosyncratic reaction leading to drug-induced hepatitis. However resveratrol and leptin are FDA investigational drugs and curcumin is non-approved drug. More importantly, this network proposes many testable assumptions with potential of great success, though the real achievement can only be justified by experimental studies.
In conclusion, gene expression microarray studies have greatly improved our knowledge of genetic mechanisms of human diseases. Systems biology analysis of cDNA data helped us to find T2DM-connected genes as alternative drug targets using interactomic and toxicogenomic data that led us to link with vital metabolic and signaling pathways involved in disease pathophysiology. Our simple and integrated steps are helpful in revealing genome to phenome association in diabetes and finding potential drug targets for type 2 diabetes. Therefore this approach will support to understand the genetic basis of complex phenotypes. These findings can provide a valuable framework for developing diagnostic biomarkers and treatment strategies. However, further molecular studies can be designed to validate the role of these genes in T2DM for effective treatment.

AUTHOR CONTRIBUTIONS
SM and JC designed the study. BB, WR, and XW collected the data. SM, JC, and TN analyzed the data. JC and BB provided guidance with study design and data analysis. All authors contributed to manuscript writing and edition.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys. 2017.00013/full#supplementary-material