Study of hub nodes of transcription factor-target gene regulatory network and immune mechanism for type 2 diabetes based on chip analysis of GEO database

Identification of novel therapeutic targets for type 2 diabetes is a key area of contemporary research. In this study, we screened differentially expressed genes in type 2 diabetes through the GEO database and sought to identify the key pathogenic factors for type 2 diabetes through a transcription factor regulatory network. Our findings may help identify new therapeutic targets for type 2 diabetes. Data pertaining to the humoral (whole blood) gene expression profile of diabetic patients were obtained from the NCBI's GEO Datasets database and gene sets with differential expression were identified. Subsequently, the TRED transcriptional regulatory element database was integrated to build a gene regulatory network for type 2 diabetes. Gene Ontology (GO) functional analysis and pathway analysis of differentially expressed genes were performed using the DAVID database and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Finally, gene-disease correlation analysis was performed using the DAVID online annotation tool. A total of 236 pathogenic genes, four transcription factors related to the pathogenic genes, and 261 corresponding target genes were identified. A transcription factor-target gene regulatory network for type 2 diabetes was constructed. Most of the key factors of the transcription factor-target gene regulatory network for type 2 diabetes were found to be closely related to the immune metabolic system and the functions of cell proliferation and transformation.


Introduction
Globally, an estimated 366 million people are affected by diabetes mellitus and its incidence has shown an increasing trend (Joanne and Jose, 2020). Diabetes ranks as the third most harmful disease after cancer and coronary atherosclerotic heart disease (Chen et al., 2017). The high incidence, high disability rate, and lifelong harm caused by diabetes impose a heavy economic burden on society and families (Le et al., 2018). Type 2 diabetes accounts for over 90% of all diabetic patients (Shen et al., 2018). Therefore, the prevention and treatment of type 2 diabetes have attracted great attention from domestic and overseas scholars and governments (Wang and Wei, 2017).
Type 2 diabetes is a complex metabolic disorder involving the metabolism of sugars, proteins, fat, water, and electrolytes. The condition results from hypofunction of the pancreas and insulin resistance; the underlying etiopathogenetic mechanisms are complex and involve genetic factors, immune disorders, infection, and toxins (Guess, 2018). Traditional single gene screening for type 2 diabetes has been unable to meet the needs of clinical medicine (Yi et al., 2022; Kaul and Ali, 2015).
Homeostasis depends on pathways shared by the metabolic and immune systems, and metabolic regulation and the immune response interact. Dysfunction of these pathways can lead to chronic metabolic disorders in the body. Just as endogenous or exogenous infections can cause immune responses and metabolic disorders, the metabolic abnormalities that occur when nutrient and energy intake and expenditure are out of balance can also induce immune responses.
Transcription factors are a class of proteins with DNA-binding domains that bind to specific DNA sequences and regulate gene transcription by promoting or preventing the recruitment of RNA polymerase. Transcription factors play an important regulatory role in complex networks through thousands of genomic binding sites. Therefore, the construction of a regulatory network of transcription factors may facilitate the identification of novel diagnostic and therapeutic targets for type 2 diabetes (Da et al., 2017).
In this study, the humoral (whole blood) gene expression profile of type 2 diabetic patients was obtained through the GEO Datasets database of the National Center for Biotechnology Information (NCBI), and the differentially expressed gene sets were selected. Subsequently, the TRED transcriptional regulatory element database was integrated to build a gene regulatory network for type 2 diabetes based on the differentially expressed gene set. Gene-disease correlation analysis was performed using the DAVID online annotation tool. Our findings may help identify some novel diagnostic targets and lay the foundation for early clinical diagnosis of type 2 diabetes and the development of novel drugs.

Microarray data
The microarray data used in this study were obtained from GEO Datasets of the NCBI database (Barrett et al., 2013); the search term used was "Diabetic". The filters applied were "Homo sapiens," "CEL original document," and "Affymetrix". Three groups of microarrays were eventually identified.
We chose the gene expression profiles GSE15653, GSE64998, and GSE23343 from the GEO database, which are freely available in the public domain. The GSE15653 dataset was based on the Affymetrix GPL96 platform and included 13 samples (4 diabetic samples and 9 healthy samples). The GSE64998 dataset was based on the Affymetrix GPL11532 platform and included 13 samples (7 diabetic samples and 6 healthy samples). The GSE23343 dataset was based on the Affymetrix GPL570 platform and included 17 samples (10 diabetic samples and 7 healthy samples) (Table 1).

Microarray data processing
The Expression Console™ software tool of Affymetrix was used to perform background correction and probe fluorescence conversion on the microarray data. The Transcriptome Analysis Console tool of Affymetrix was used to normalize and log-transform the microarray data. The significance analysis of microarrays (SAM) method was used to identify differentially expressed mRNAs between healthy individuals and patients with type 2 diabetes. Fold change >1.0 or fold change < −1.0 and p-value <0.05 were used as the criteria to identify differentially expressed genes in this study.
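The thresholding step described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes the fold-change cutoff is applied on a log2 scale (the text does not state the scale), it takes p-values as already computed upstream (SAM itself is not reimplemented), and the gene names and values are made up.

```python
from typing import NamedTuple


class ProbeResult(NamedTuple):
    gene: str        # hypothetical gene symbol
    log2_fc: float   # ASSUMED log2 fold change (patients vs. healthy)
    p_value: float   # p-value from an upstream test such as SAM


def select_degs(results, fc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes with |log2 FC| > fc_cutoff and p < p_cutoff."""
    return [
        r for r in results
        if abs(r.log2_fc) > fc_cutoff and r.p_value < p_cutoff
    ]


# Toy input; names and values are illustrative only.
toy = [
    ProbeResult("GENE_A", 1.8, 0.003),   # upregulated and significant
    ProbeResult("GENE_B", -1.4, 0.020),  # downregulated and significant
    ProbeResult("GENE_C", 0.6, 0.001),   # fold change too small
    ProbeResult("GENE_D", 2.1, 0.200),   # not significant
]

degs = select_degs(toy)
up = [r.gene for r in degs if r.log2_fc > 0]
down = [r.gene for r in degs if r.log2_fc < 0]
print("up:", up, "down:", down)
```

Splitting the result into `up` and `down` mirrors the red/green coloring of a volcano plot.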

Construction of TF mRNA gene network
Based on the mRNA expression profiles obtained from the microarray analysis and a search of the Transcriptional Regulatory Element Database (TRED) (Zhao et al., 2005), we obtained four transcription factors (TFs) and 236 target genes, which were predicted to combine into a total of 261 TF-to-target pairs. The relations obtained from the differential co-expression analysis were mapped onto the TF-target gene pairs to obtain transcription regulation pairs. Finally, Cytoscape 3.9.1 software (Shannon et al., 2003) was used for plotting. In the TF-gene network, yellow rhombuses represent transcription factors and blue rhombuses represent target genes. The TFs and their target genes are connected by dotted lines with arrows indicating the direction from the source to the target.
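As a rough sketch of how TF-to-target pairs can be turned into a directed network and handed to Cytoscape, the example below collects pairs into an adjacency map and writes them as tab-separated SIF lines (source, interaction, target). The pairs are hypothetical apart from the four TF names and the statement that Pik3r1 is regulated by all four; the `TARGET_*` names and the interaction label `regulates` are placeholders chosen for illustration.

```python
from collections import defaultdict

# Hypothetical TF-to-target pairs. Only the four TF names and Pik3r1 being
# regulated by all four come from the text; TARGET_* names are placeholders.
tf_target_pairs = [
    ("Jun", "Pik3r1"),
    ("Stat1", "Pik3r1"),
    ("Fos", "Pik3r1"),
    ("Atf5", "Pik3r1"),
    ("Jun", "TARGET_1"),
    ("Fos", "TARGET_1"),
    ("Stat1", "TARGET_2"),
]


def build_network(pairs):
    """Return {tf: set(targets)}; duplicate pairs collapse automatically."""
    network = defaultdict(set)
    for tf, target in pairs:
        network[tf].add(target)
    return dict(network)


def to_sif(network, interaction="regulates"):
    """Serialize directed TF-to-target edges as tab-separated SIF lines."""
    return [
        f"{tf}\t{interaction}\t{target}"
        for tf in sorted(network)
        for target in sorted(network[tf])
    ]


network = build_network(tf_target_pairs)
sif_lines = to_sif(network)
print("\n".join(sif_lines))
```

Writing `sif_lines` to a `.sif` file gives one edge per line, which keeps the direction from source (TF) to target explicit.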

Construction of PPI network
Genes with more than 15 network nodes were input into the STRING database (https://string-db.org/) to construct the PPI network. The species was set to H. sapiens, the minimum interaction score was set to ≥0.9, and other parameters were left at their defaults.

Gene function annotation analysis
The DAVID (Database for Annotation, Visualization and Integrated Discovery) database (Huang et al., 2007) was used for the Gene Ontology (GO) function annotation enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis (Xu et al., 2014) of the screened differential genes. GO terms that were significantly enriched among the differentially expressed genes (p < 0.05) were used to analyze these genes further from a functional perspective.
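To make the idea of an enrichment p-value concrete, the sketch below computes a one-sided hypergeometric tail probability for the overlap between a gene list and an annotation term. DAVID reports a modified Fisher exact statistic (the EASE score), so this plain hypergeometric test is a simplified stand-in and its values will not match DAVID's output exactly. The background size, term size, and overlap used in the example are made up; only the list size of 236 comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(background, in_term, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `in_term` are annotated to the term."""
    upper = min(in_term, selected)
    total = comb(background, selected)
    tail = sum(
        comb(in_term, i) * comb(background - in_term, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative numbers only: 236 genes in the list (from the text); the
# background size, term size, and overlap are hypothetical.
p = hypergeom_enrichment_p(background=20000, in_term=100, selected=236, overlap=8)
print(f"p = {p:.3g}", "(enriched)" if p < 0.05 else "(not enriched)")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large backgrounds; only the final ratio is converted to a float.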

Microarray data processing
The Transcriptome Analysis Console tool of Affymetrix was used to normalize and log-transform the microarray data, while the SAM method was used to identify the differentially expressed mRNAs between healthy individuals and patients with type 2 diabetes. The numbers of differentially expressed genes on the three platforms were 1,210, 341, and 1,502, respectively. These genes were cross-screened and a total of 236 genes were identified (Figures 1, 2). Of these, 23 differentially expressed genes were identified on all three platforms, while 37, 121, and 55 differentially expressed genes were identified on two platforms.
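The cross-screening step amounts to keeping genes that are differentially expressed in at least two of the three datasets; the reported counts are consistent with this reading, since 23 + 37 + 121 + 55 = 236. A minimal sketch with placeholder gene names (not real results) is:

```python
from collections import Counter

# Toy DEG sets per dataset; gene names are placeholders, not real results.
deg_sets = {
    "GSE15653": {"G1", "G2", "G3", "G4"},
    "GSE64998": {"G1", "G2", "G5"},
    "GSE23343": {"G1", "G3", "G6"},
}


def cross_screen(sets_by_dataset, min_datasets=2):
    """Return {gene: number of datasets} for genes found in >= min_datasets."""
    counts = Counter(g for genes in sets_by_dataset.values() for g in genes)
    return {g: n for g, n in counts.items() if n >= min_datasets}


shared = cross_screen(deg_sets)
in_all_three = sorted(g for g, n in shared.items() if n == 3)
in_exactly_two = sorted(g for g, n in shared.items() if n == 2)
print("all three:", in_all_three)
print("exactly two:", in_exactly_two)
```

Because each dataset contributes a set, a gene is counted at most once per dataset, so the count equals the number of platforms on which it was detected.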

Transcription factor-target gene regulatory network for type 2 diabetes
The TRED database was used to predict the possible transcription factors for the 236 genes; a total of four transcription factors and 261 corresponding target genes were identified (Table 2).
The transcription factors and their corresponding 261 target genes were mapped using the Cytoscape software (Figure 3). In Figure 3, the number of target genes regulated by transcription factor Jun is the largest, followed by Stat1, Fos and Atf5 (Table 2).
We found that 13 target genes were regulated by two or more transcription factors (Table 3); of these, Pik3r1 (regulated by four transcription factors) was the most regulated target gene in this network. There were eight target genes regulated by three transcription factors and four target genes regulated by two transcription factors. The transcription factor Fos has a regulatory effect on all target genes.

FIGURE 1
Screening results of differentially expressed genes overlapping on two or more platforms. The blue color is GSE15653, the yellow color is GSE23343, and the green color is GSE64998.

FIGURE 2
Volcano plots of GSE15653, GSE64998 and GSE23343. Red represents upregulated differentially expressed genes, and green represents downregulated differentially expressed genes.

Node of network
We performed a statistical analysis of the network nodes in the transcription factor-target gene regulatory network for type 2 diabetes. We identified 14 genes with more than 15 network nodes; three of these (Jun, Fos and Stat1) had more than 80 network nodes, and Jun had the most (115 network nodes) (Table 4). Furthermore, we found that Lepr, Hsp90ab1, Igf1 and Pik3r1 were closely related to the regulation of multiple transcription factors.
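The node statistics can be illustrated with a small degree count. This sketch assumes that the "number of network nodes" of a gene means the number of edges touching it (its degree), which the text does not define explicitly. The edge list is hypothetical: the gene names that appear in the text are real, but the wiring and the `TARGET_*` names are invented, and the hub cutoff is lowered from 15 to 3 to suit the toy data.

```python
from collections import Counter

# Hypothetical directed edges (TF, target); the wiring is illustrative only.
edges = [
    ("Jun", "Pik3r1"), ("Fos", "Pik3r1"), ("Stat1", "Pik3r1"), ("Atf5", "Pik3r1"),
    ("Jun", "TARGET_1"), ("Fos", "TARGET_1"), ("Stat1", "TARGET_1"),
    ("Jun", "TARGET_2"), ("Fos", "TARGET_2"),
    ("Jun", "TARGET_3"),
]


def node_degrees(edge_list):
    """Count how many edges touch each node (in-degree plus out-degree)."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    return degree


def regulators_per_target(edge_list):
    """Count the distinct TFs that regulate each target gene."""
    regulators = {}
    for source, target in edge_list:
        regulators.setdefault(target, set()).add(source)
    return {target: len(tfs) for target, tfs in regulators.items()}


degree = node_degrees(edges)
# Toy cutoff of 3; the text uses a cutoff of 15 on the real network.
hubs = [gene for gene, d in degree.most_common() if d > 3]
multi_regulated = {t: n for t, n in regulators_per_target(edges).items() if n >= 2}
print("hubs:", hubs)
print("targets with two or more TFs:", multi_regulated)
```

The second helper corresponds to the tally of target genes regulated by two, three, or four transcription factors.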

Construction of PPI network
The 14 key nodes (Table 4) were input into the STRING database to construct the PPI network (Figure 4). JUN, STAT1, FOS, PIK3R1, and ATF5 were found to be closely related in the PPI network and were regarded as the key targets of type 2 diabetes.

FIGURE 3
Transcription factor regulatory network map of differentially expressed genes in diabetes. The yellow boxes are transcription factors, while the blue boxes are target genes.

GO functional annotation analysis of differentially expressed genes in diabetes
GO functional annotation was performed on the 236 differentially expressed genes and the top 10 pathways were ranked according to p-value (Figure 5). Of those 10 pathways, four (the drug metabolism-cytochrome P450 pathway, the metabolism of xenobiotics by cytochrome P450 pathway, the steroid hormone biosynthesis pathway, and the type I diabetes mellitus pathway) are closely related to the metabolic system. The influenza A pathway, Staphylococcus aureus infection pathway, toxoplasmosis pathway and leishmaniasis pathway all concern infectious diseases that are closely related to the body's immune system.

Discussion
The prevalence of diabetes has shown a rapid increase owing to sedentary lifestyles in modern society and progressive population aging. Development of new hypoglycemic drugs and the identification of novel drug targets are key areas of contemporary research (Vyas et al., 2015). With the advent of the post-genome era and the rapid development of bioinformatics, it is possible to construct information networks based on big data and to identify potential drug targets through network nodes (Li, 2012).
In this study, microarray data pertaining to type 2 diabetes were obtained from the GEO Datasets of the NCBI database. Further, the Transcriptome Analysis Console tool of Affymetrix was used to normalize and log-transform the microarray data. A total of 236 differentially expressed pathogenic genes for diabetes were identified. These 236 genes were analyzed using the TRED database, and four transcription factors (Jun, Stat1, Fos and Atf5) and their 261 corresponding target genes were predicted. Lastly, a transcription factor-target gene regulatory network for type 2 diabetes was constructed (Table 2).
Jun is closely related to systemic lupus erythematosus (SLE). SLE is a typical autoimmune disease involving multiple organs and systems. Doníz-Padilla et al. found that the expression level of Jun in peripheral blood mononuclear cells (PBMC) of patients with SLE was significantly higher than that in normal controls (Doniz et al., 2011; Linan et al., 2015). Olferiev et al. showed that Jun may play an important role in transcriptional regulation of FCGR2B promoter activity; FCGR2B has been shown to be closely related to the pathogenesis of SLE (Olferiev et al., 2007). These studies suggested that Jun may be involved in the pathogenesis of SLE.
The Stat1 gene encodes STAT1, which was initially described as a transport protein for interferon (Dale et al., 2015; Halupa et al., 2005); subsequent research showed that it is an important component of the cellular response to interferon stimulation. STAT1 belongs to the STAT transcription factor family, which includes STAT1, STAT2, STAT3, STAT4, STAT5α, STAT5β, and STAT6. STAT1 plays a key role in the cellular immune response against viruses, bacteria, and parasites (Chauche et al., 2017).
As a member of the Fos family, Fos, together with members of the Jun family and the activating transcription factor (ATF) protein family, was shown to form activator protein 1 (AP-1) (Wang et al., 2006). AP-1 is an important intranuclear transcription regulator that plays an important role in many signal transduction processes; it represents the intranuclear intersection of a series of cell signal transduction pathways.
The Atf5 gene encodes ATF5 (activating transcription factor 5), a member of the ATF/CREB (cAMP response element binding protein) family. In previous studies, full-length TRB3 was used as the bait protein to screen a human liver cDNA library and to identify the interaction between TRB3 and ATF5 (Yasuda et al., 2014). However, no studies have specifically examined the interaction between the two. Therefore, we concluded that the role of the ATF5 protein in glucose metabolism or lipid metabolism remains to be determined. In particular, the function of ATF5 in preadipocyte differentiation through its interaction with TRB3 has not been reported.
After statistical analysis of the target genes regulated by transcription factors, we found that only one target gene (Pik3r1) was regulated by all four transcription factors, while eight target genes were regulated by three transcription factors. Because Pik3r1 is regulated by all four transcription factors and has 22 network nodes, it seems to play an important role in the transcription factor regulatory network. Pik3r1 is a member of the PI3K family, an important family of kinases acting on inositol and phosphatidylinositol (PI). As an important intracellular signal transduction molecule, Pik3r1 is involved in the processes of cell proliferation, apoptosis, and differentiation (Zhang et al., 2019). An increasing body of evidence suggests that Pik3r1 plays an important role in the molecular mechanisms of tumor biology (Vander et al., 2015). Several studies have shown that diabetes, particularly type 2 diabetes, is associated with an increased risk of breast, colorectal, endometrial, pancreatic, liver, and gallbladder cancer (Sanae et al., 2019; Zhao et al., 2021).
Frontiers in Molecular Biosciences, frontiersin.org. Xu et al. 10.3389/fmolb.2024.1410004
We also performed statistical analysis pertaining to the nodes of the transcription factor-target gene regulatory network for type 2 diabetes; we found 14 genes with more than 15 network nodes, of which eight had more than 20 (Jun, Fos, Stat1, Atf5, Lepr, Hsp90ab1, Igf1, and Pik3r1). The greater the number of nodes, the more important the gene is in the regulatory network. Of the eight genes with the largest number of nodes in the network, the first four were transcription factors (as discussed earlier) that were closely related to the immune system and signal transduction. Of the remaining four genes, Lepr plays an important role in maintaining energy homeostasis in the body. Ridker et al. identified the expression of LEPR in pancreatic β cells and found that it regulates insulin secretion in concert with leptin (Ridker et al., 2008). In addition, animal experiments have shown that variation in LEPR plays an important role in the pathogenesis of obesity and diabetes in mice (Brito et al., 2016). Heat shock protein 90 (Hsp90) is widely found in eukaryotic and prokaryotic cells and is the most active molecular chaperone in the cytoplasm. Human Hsp90 is divided into two categories, Hsp90AA1 and Hsp90AB1, based on whether it contains glutamine-rich segments. The Hsp90AB1 gene has been implicated in the pathogenesis of SLE via regulating the expression of Hsp90 through translation, which increases the expression of Hsp90 and interleukin-6 (IL6); this induces the differentiation of B lymphocytes into plasmocytes, promotes the production of autoantibodies, reduces the activity of CD8+ inhibitory T cells, and increases the secretion of immunoglobulins (Stephanou et al., 1998).

FIGURE 5
KEGG pathway enrichment analysis.

Igf1 encodes insulin-like growth factor 1 (IGF1). IGF1 can promote cell proliferation and inhibit apoptosis; in addition, its role in tumor development is a hot topic in contemporary research. The biological function of IGF1 is mediated by its specific receptor on the surface of target cells (IGF1R), which plays an important role in cell transformation and tumorigenesis in many tissues including the ovaries; in addition, it can activate the MAPK and PI3K/AKT signaling pathways (Cheng et al., 2022). AKT is protein kinase B, which regulates cell proliferation, apoptosis, and the cell cycle (Jae et al., 2020). As discussed above, the Pik3r1 gene is involved in regulating a variety of cellular functions including cell growth, proliferation, transformation and survival; it also plays an important role in the molecular mechanisms of tumor biology. To summarize, these eight genes are mainly involved in the immune response, cell proliferation, transformation, and other functions, all of which are closely related to type 2 diabetes.
We performed gene ontology (GO) functional annotation analysis of the 236 differentially expressed genes, and the top 10 pathways were obtained according to the p-values (Figure 5). These 10 pathways were closely related to diabetes and related diseases; the type 1 diabetes pathway ranked sixth, and the toll-like receptor signaling pathway is closely related to immunity.
Eight network nodes, comprising the four transcription factors (Jun, Stat1, Fos and Atf5) that regulate the most target genes, the target gene regulated by the most transcription factors (Pik3r1), and the eight genes with the most network nodes (Jun, Fos, Stat1, Atf5, Lepr, Hsp90ab1, Igf1, and Pik3r1), were put together for network construction analysis (Figure 6). We found that these genes were closely related to and were regulated by Stat5a, Mapk9 and Mapk8, all of which play a role in the STAT and MAPK signaling pathways for regulation of immune and inflammatory-related functions. At the same time, we also found that the two genes Stat1 and Pik3r1 were located at the center of the networks, which indicated their importance.
Type 2 diabetes is generally believed to be related to abnormal cell structure caused by the interaction of inflammatory factors with the endocrine system, immune system, oxidative stress, abnormal fat metabolism, and other factors; in addition, insulin resistance and microvascular disease play a key role in its pathogenesis (Hiroki and Kazuhiro, 2022). We performed GO functional enrichment analysis of the 236 differentially expressed genes through the DAVID website and selected the 10 pathways with the most significant p-values. We found that most of these 10 pathways were related to immunity, metabolism, and diabetes (such as the toll-like receptor signaling pathway and the type I diabetes mellitus pathway). There were also some transcription factors and signaling pathways related to diseases and immunity, such as drug metabolism-cytochrome P450, metabolism of xenobiotics by cytochrome P450, and chemical carcinogenesis.

FIGURE 6
Network map of the relationship between transcription factors and network nodes. The circles represent transcription factors; the size of the circle is directly proportional to the number of genes it is linked with.

TABLE 2
Four transcription factors and their corresponding target genes.

TABLE 3
Target genes regulated by transcription factors.

TABLE 4
Statistical table of transcription factor network nodes (>15 nodes).