Application of multiple machine learning approaches to determine key pyroptosis molecules in type 2 diabetes mellitus

Objective: Pyroptosis, a lytic and inflammatory form of programmed cell death, has been implicated in type 2 diabetes mellitus (T2DM) and its complications. Nonetheless, it remains unclear which pyroptosis molecules play an essential role in T2DM, and this study aims to address this question.
Methods: Transcriptional profiling datasets of T2DM, i.e., GSE20966, GSE95849, and GSE26168, were acquired. Four machine learning models, namely, random forest, support vector machine (SVM), extreme gradient boosting, and generalized linear modeling, were built based on pyroptosis genes. A nomogram of key pyroptosis genes was also generated, and its clinical value was appraised via calibration curves and decision curve analysis. Immune infiltration was inferred utilizing CIBERSORT. Drug–druggable target relationships were acquired from the Drug Gene Interaction Database. Through WGCNA, key pyroptosis-relevant genes were selected.
Results: Most pyroptosis genes exhibited upregulation in T2DM relative to controls, indicating the activity of pyroptosis in T2DM. The SVM model composed of BAK1, CHMP2B, NLRP6, PLCG1, and TIRAP exhibited the best performance in T2DM diagnosis, with AUC = 1. The nomogram can predict the risk of T2DM for clinical practice. Resting NK cells were less abundant and neutrophils were more abundant in T2DM than in normal specimens. NLRP6 was positively linked with neutrophils. Drugs (keracyanin, 9,10-phenanthrenequinone, diclofenac, phosphomethylphosphonic acid adenosyl ester, acetaminophen, cefixime, aspirin, ustekinumab) potentially targeted the key pyroptosis genes. Additionally, CHMP2B-relevant genes were determined.
Conclusion: Altogether, this work proposes the key pyroptosis genes in T2DM, which might become possible molecules for the management and treatment of T2DM and its complications.


Introduction
Diabetes mellitus (DM) is a chronic metabolic disorder characterized by high blood glucose levels, which result from defective insulin function and/or decreased insulin production (1). In both type 1 DM (T1DM) and type 2 DM (T2DM), a variety of genetic and environmental factors may lead to progressive loss of β-cell number and/or function, clinically manifested as hyperglycemia (2)(3)(4). Once hyperglycemia occurs, diabetic individuals are at risk of developing the same chronic complications, though the rate of progression may differ (5). DM is connected with acute and chronic complications, which can be restrained or delayed through intensive glycemic management (6). An in-depth understanding of the pathogenesis of T2DM allows us to better predict outcomes and choose more precise treatments.
Pyroptosis is a form of programmed cell death characterized by rapid membrane rupture, cell swelling with large bubbles, and the release of proinflammatory cellular contents (7). The main role of pyroptosis is to drive a strong inflammatory response and protect the host from microbial infection (8). Accumulating evidence links pyroptosis to DM and its complications. For instance, CD74 ablation can rescue T2DM-driven cardiac remodeling and contractile dysfunction via pyroptosis-induced modulation of ferroptosis (9). HECTD3 facilitates NLRP3 inflammasome activation and pyroptosis, exacerbating DM-relevant cognitive impairment through stabilizing MALT1 (10). Mitochondrial injury and activation of the cytosolic DNA sensor cGAS-STING signaling result in cardiac pyroptosis and hypertrophy in diabetic cardiomyopathy (11). ManNAc exerts a protective effect on podocyte pyroptosis in diabetic renal injury through inhibition of mitochondrial injury and ROS/NLRP3 signaling (12). Schisandrin A mitigates ferroptosis and NLRP3 inflammasome-driven pyroptosis in diabetic nephropathy via mitochondrial damage through AdipoR1 ubiquitination (13). Nonetheless, it is still unclear which pyroptosis molecules exert a crucial function in T2DM pathogenesis. Herein, diverse machine learning algorithms were adopted for the selection of key pyroptosis molecules, which might have potential as therapeutic targets of T2DM.

Materials and methods
Datasets
Transcriptional profiling datasets of DM were acquired from the Gene Expression Omnibus. The GSE20966 dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE20966) was composed of pancreatic tissues from 10 non-diabetic controls and 10 T2DM patients on the GPL1352 platform (14). The GSE95849 dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE95849) comprised peripheral blood samples from six T2DM patients and six healthy participants on the GPL22448 platform (15). The GSE20966 and GSE95849 datasets were merged as the discovery set, batch effects were removed using the sva package, and the result was visualized by principal component analysis (PCA) (16). The GSE26168 dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26168) contained peripheral blood specimens from eight healthy subjects and nine T2DM patients on the GPL6883 platform and was used for external verification (17).
selecting the characteristic pyroptosis genes. Receiver operating characteristic curves (ROCs) were plotted for computing the area under the curve (AUC) on each established model or key pyroptosis gene.
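As an illustration of what the AUC summarizes, the following is a minimal Python sketch (the study itself was run in R) that computes the AUC as the fraction of case-control pairs in which the case receives the higher score. The function name and the example scores are hypothetical and are not taken from the study data.

```python
def auc_from_scores(scores, labels):
    """AUC as the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0).
    Tied pairs count as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical model scores: three cases followed by two controls.
labels = [1, 1, 1, 0, 0]
auc_perfect = auc_from_scores([0.9, 0.8, 0.7, 0.2, 0.1], labels)
auc_mixed = auc_from_scores([0.9, 0.4, 0.7, 0.5, 0.1], labels)
print(auc_perfect, auc_mixed)
```

Under this reading, an AUC of 1 means that every case scored above every control in the evaluated samples, which is easier to reach when the sample size is small.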

Nomogram establishment
A nomogram was generated through the integration of key pyroptosis genes utilizing the rms package. Calibration curves were drawn for the visualization and evaluation of the consistency between the actual observations and the nomogram-predicted results. Decision curve analysis (DCA) was conducted for quantifying the net benefit at distinct threshold probabilities.
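The net benefit used in DCA can be sketched as follows. This is a minimal Python illustration of the standard net benefit formula, not the code used in the study; the function names, predicted probabilities, and labels are hypothetical.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of intervening in everyone whose predicted risk is at
    or above the threshold: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: intervene in every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical nomogram-predicted risks and true disease status.
probs = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
nb_model = net_benefit(probs, labels, 0.5)
nb_all = net_benefit_treat_all(labels, 0.5)
nb_none = 0.0  # intervening in nobody yields zero net benefit by definition
print(nb_model, nb_all, nb_none)
```

A decision curve repeats this calculation over a range of threshold probabilities and plots the model against the treat-all and treat-none reference strategies.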

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) was conducted for identifying the possible functions of the selected genes (22). The reference gene sets were acquired from the Molecular Signature Database (23), with p <0.05 as the threshold.
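The running-sum idea behind the GSEA enrichment score can be sketched as below. This Python example uses the unweighted Kolmogorov-Smirnov-like variant for simplicity, whereas GSEA implementations typically weight each step by the ranking metric, so it should be read as an assumption-laden illustration rather than the procedure used here. The gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted enrichment score: walk down the ranked list, stepping
    up at genes in the set and down otherwise, and return the running
    sum with the largest absolute deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partly")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical list ranked from most positively to most negatively
# correlated with the gene of interest.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2", "g5"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
print(es_top, es_bottom)
```

A positive score indicates that the set is concentrated near the top of the ranking and a negative score that it is concentrated near the bottom, which corresponds to the positive and negative associations reported for each gene.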

Immune infiltration estimation
Through the CIBERSORT approach (24), the normalized transcriptional profiles of T2DM and normal specimens were converted into immune cell fractions based upon the LM22 reference signature matrix, with 1,000 permutations.

Drug-druggable target network
From the Drug Gene Interaction Database (DGIdb; www.dgidb.org) (25), the interactions of drugs with key pyroptosis genes were acquired. Afterward, a drug-target network was built by using the Cytoscape software (26).

Weighted correlation network analysis
The weighted correlation network analysis (WGCNA) package was adopted for building co-expression modules (27). The optimal soft thresholding value was selected through the pickSoftThreshold function. By using the dynamic tree cut method, highly connected genes were merged into one co-expression module. The structure of the co-expression modules was visualized through a heatmap plot via the TOMplot function. The interactions of modules with key pyroptosis genes were then estimated via Pearson's test, followed by the evaluation of the module membership versus gene significance.
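The role of the soft thresholding value can be illustrated with a short Python sketch of the unsigned WGCNA adjacency, in which the absolute Pearson correlation between two genes is raised to the power β. The expression values and gene names are hypothetical, and the WGCNA package performs this step (and the subsequent topological overlap calculation) internally.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every
    ordered pair of distinct genes."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression values across four samples.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with gene_a
    "gene_c": [1.0, 3.0, 2.0, 4.0],  # correlation of 0.8 with gene_a
}
adj = soft_threshold_adjacency(expr, beta=13)
print(adj[("gene_a", "gene_b")], adj[("gene_a", "gene_c")])
```

With a high power, a correlation of 0.8 shrinks to an adjacency of roughly 0.05 while a perfect correlation stays at 1, which is how soft thresholding suppresses weak connections without imposing a hard cutoff.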

Protein-protein interaction
Module genes were imported into the STRING website (28), and protein-protein interaction pairs were acquired. The key genes were selected by using molecular complex detection (MCODE), a plugin in Cytoscape.

Functional enrichment analysis
By using the clusterProfiler method (29), Gene Ontology (GO) enrichment analysis was carried out, which comprised the biological process (BP), cellular component (CC), and molecular function (MF). Afterward, enrichment of the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was implemented. Results with p <0.05 were indicative of significant enrichment.
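Over-representation analysis of this kind is commonly based on a hypergeometric test. The following Python sketch shows that calculation under the assumption of a simple one-sided test without multiple-testing adjustment; the function name and the counts are hypothetical and do not come from the study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p value P(X >= k), where X is the number of genes from
    a term of size K found in a list of n genes drawn from a background
    of N genes."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: background of 20 genes, 5 annotated to the term,
# 4 genes in the query list, 2 of which carry the annotation.
p_value = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
p_trivial = hypergeom_enrichment_p(k=0, n=4, K=5, N=20)
print(p_value, p_trivial)
```

A term is then called enriched when this p value falls below the chosen threshold, here 0.05.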

Statistical analysis
All the analyses were implemented using the R software (version 3.5.3; https://www.r-project.org/). Two groups were compared by adopting the Wilcoxon test. Pearson's test was employed for correlation analysis. p <0.05 was indicative of statistical significance.

Results
Aberrant expression of pyroptosis genes in T2DM
Figure 1 depicts the workflow of this study. This work combined two T2DM datasets, namely, GSE20966 and GSE95849, for expanding the sample size as much as possible (Figure 2A). The removal of batch effects was then implemented (Figure 2B). Figure 2C illustrates the genomic position of pyroptosis genes. The detailed information is listed in Table 1. Next, the expression differences in pyroptosis genes were estimated in T2DM and controls. Most pyroptosis genes including CHMP4A, CHMP6, GSDMD, IL1B, IRF2, TP53, CASP9, NLRC4, NOD1, NOD2, and PYCARD exhibited notable upregulation in T2DM relative to normal specimens (Figures 2D, E), indicating the activation of pyroptosis in T2DM. Both in T2DM and control tissues, pyroptosis genes closely interacted (Figure 2F).

Generation of a nomogram based upon key pyroptosis genes for T2DM risk
Five key pyroptosis genes (BAK1, CHMP2B, NLRP6, PLCG1, and TIRAP) were eventually selected for establishing the nomogram (Figure 4A). Calibration curves showed good consistency between the nomogram-predicted results and the actual observations (Figure 4B). To determine the clinical significance of the nomogram in daily clinical practice, we plotted DCA curves. As depicted in Figure 4C, in comparison to intervening in all of the patients or none of them, applying the nomogram to predict the risk of T2DM might be reasonable and offer a greater clinical net benefit across the threshold probabilities.

Verification of the diagnostic efficacy of the SVM model
The excellent diagnostic efficacy of the SVM model was also proven in the GSE26168 dataset (AUC = 1; Figure 5A). In addition, this work investigated the diagnostic performance of each key pyroptosis gene. It was demonstrated that BAK1, CHMP2B, NLRP6, PLCG1, and TIRAP can individually diagnose T2DM with relatively high AUC values (Figures 5B-F). This demonstrated the crucial significance of key pyroptosis genes in T2DM.

Molecular mechanisms underlying key pyroptosis genes and their correlations with clinical features
GSEA unveiled that BAK1 was negatively connected with basal cell carcinoma, taurine and hypotaurine metabolism, epithelial cell signaling in Helicobacter pylori infection, and maturity-onset diabetes of the young (Figure 6A). CHMP2B was negatively related to the mTOR signaling pathway, renal cell carcinoma, progesterone-mediated oocyte maturation, and glycosphingolipid biosynthesis of lacto- and neolacto-series (Figure 6B). NLRP6 exhibited positive interactions with hematopoietic cell lineage, basal cell carcinoma, galactose metabolism, hedgehog signaling pathway, epithelial cell signaling in H. pylori infection, and oxidative phosphorylation (Figure 6C). PLCG1 was negatively linked with peroxisome, fatty acid metabolism, primary bile acid biosynthesis, propanoate metabolism, O-glycan biosynthesis, and PPAR signaling pathway (Figure 6D). TIRAP presented positive connections with Vibrio cholerae infection, long-term potentiation, and hedgehog signaling pathway (Figure 6E). We also evaluated the correlations between the key pyroptosis genes and clinical features (age and BMI). Nevertheless, none of BAK1, CHMP2B, NLRP6, PLCG1, or TIRAP was significantly associated with age or BMI among T2DM patients (Figures 6F-O).
Selection of key pyroptosis genes through multiple machine learning approaches.

Interactions of key pyroptosis genes with immune infiltration
Through the implementation of CIBERSORT, the fraction of diverse immune cell types was estimated across T2DM and control tissues (Figure 7A; Supplementary Table 1). The differences between the two groups were also investigated. As illustrated in Figure 7B, resting natural killer (NK) cells exhibited lower abundance in T2DM relative to the normal specimens. In contrast, a higher abundance of neutrophils was observed in T2DM. Among the key pyroptosis genes, NLRP6 was positively connected with activated mast cells and neutrophils (Figure 7C). In addition, CHMP2B had a positive interaction with naive B cells. The interactions between key pyroptosis genes were also assessed. As shown in Figure 7D, CHMP2B negatively interacted with BAK1 and PLCG1, while the other genes presented positive interactions.

Drug-druggable target network
Drugs that potentially targeted key pyroptosis genes were inferred utilizing the DGIdb. As a result, BAK1 was a druggable target of keracyanin; PLCG1 was a druggable target of 9,10-phenanthrenequinone, diclofenac, phosphomethylphosphonic acid adenosyl ester, acetaminophen, cefixime, and aspirin; and TIRAP was a druggable target of ustekinumab (Figure 7E).
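The drug-target relationships listed above can be written as a simple edge list, as in the Python sketch below. The gene-drug pairs are those reported in this section; the variable names and the tab-separated output layout are illustrative assumptions, chosen because such a two-column table is a common input format for network tools.

```python
# Gene -> drugs, exactly as reported in the paragraph above.
drug_targets = {
    "BAK1": ["keracyanin"],
    "PLCG1": [
        "9,10-phenanthrenequinone",
        "diclofenac",
        "phosphomethylphosphonic acid adenosyl ester",
        "acetaminophen",
        "cefixime",
        "aspirin",
    ],
    "TIRAP": ["ustekinumab"],
}

# Flatten into (drug, gene) edges and count the drugs per gene.
edges = [(drug, gene) for gene, drugs in drug_targets.items() for drug in drugs]
degree = {gene: len(drugs) for gene, drugs in drug_targets.items()}

# Print a tab-separated edge table, one drug-gene interaction per line.
for drug, gene in edges:
    print(f"{drug}\t{gene}")
```

In this representation, PLCG1 is the most connected node with six candidate compounds, whereas CHMP2B and NLRP6 have no reported drug interactions in this list.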

Establishment of the key pyroptosis gene-relevant co-expression modules
To select the key pyroptosis gene-relevant co-expression modules, this work adopted WGCNA (Figure 8A). The appropriate soft thresholding value was set to 13 based on the scale independence as well as mean connectivity (Figure 8B). By using the dynamic tree cut method, eight co-expression modules were built (Figures 8C, D). Among them, the turquoise module exhibited the strongest interaction with the key pyroptosis gene CHMP2B (Figure 8E). The genes in the turquoise module were regarded as CHMP2B-relevant genes (Figure 8F; Supplementary Table 2).

Interactions between key CHMP2B-relevant genes and their biological implications
To determine the key CHMP2B-relevant genes, the MCODE method was adopted (Supplementary Table 3). As a result, 10 key genes were acquired as follows: KNTC1, NCAPG, KIAA0101, DLGAP5, GMNN, CEP55, KIF20B, ZWILCH, MCM6, and MAD2L1 (Figure 9A). Most of the genes presented differential expression in T2DM relative to the normal specimens (Figure 9B). The biological significance of CHMP2B-relevant genes was further probed. It was noted that they were notably connected with proteasome-mediated ubiquitin-dependent protein catabolic process, proteasomal protein catabolic process, etc. (Figure 9C; Table 2). In addition, RNA transport, cell cycle, T-cell receptor pathway, etc. were remarkably enriched by CHMP2B-relevant genes (Figure 9D; Table 3).

Discussion
Pyroptosis, a proinflammatory form of programmed cell death, has the features of cellular swelling, lysis, and the release of proinflammatory cytokines (30). In the present work, most pyroptosis molecules, including CHMP4A, CHMP6, GSDMD, IL1B, IRF2, TP53, CASP9, NLRC4, NOD1, NOD2, and PYCARD, presented remarkable upregulation in T2DM relative to controls, which was indicative of the activation of the pyroptosis process in T2DM, similar to prior research (31).
To select the key pyroptosis molecules exerting essential functions in T2DM, four machine learning algorithms (RF, SVM, XGB, and GLM) were applied. Among them, the SVM model presented the best efficacy in T2DM prediction. Therefore, the SVM-selected genes, namely BAK1, CHMP2B, NLRP6, PLCG1, and TIRAP, were considered key pyroptosis molecules. The nomogram built on these genes showed a clinical net benefit in risk prediction. Experimental research has demonstrated that the key pyroptosis genes are involved in DM. Targeting BAK1 alleviates diabetic cardiomyopathy (32). In addition, inhibition of PLCG1 mitigates diabetic retinopathy (33). TIRAP is associated with T2DM and insulin resistance (34).
T2DM is associated with increased systemic inflammation that results in insulin resistance, hyperglycemia, and risk of diabetic complications (35). Herein, T2DM displayed a lower level of resting NK cells as well as a higher level of neutrophils in comparison to normal specimens, consistent with prior research (36). NK cells have been reported to be linked to DM by relieving systemic inflammation and enhancing cellular insulin sensitivity (37). Neutrophils are probably the dominant leukocytes in the innate arm of the immune system with respect to the response to damage and danger signals, and they are the first leukocytes to react to and accumulate inside the target tissues of DM (38). NLRP6 presented a positive connection with neutrophils, as previously reported (39). Based upon the DGIdb, compounds potentially targeting key pyroptosis genes were determined, comprising keracyanin, 9,10-phenanthrenequinone, diclofenac, phosphomethylphosphonic acid adenosyl ester, acetaminophen, cefixime, aspirin, and ustekinumab. Prior research has proposed that aspirin pretreatment mitigates inflammasome-driven pyroptosis through downregulating NF-κB/NLRP3 signaling in ischemic stroke (40). Nevertheless, experimental verification needs to be carried out for the interactions of these compounds with druggable pyroptosis molecules in T2DM.
The limitations of this study should be pointed out. Firstly, the performance of the key pyroptosis gene-based nomogram in predicting the risk of T2DM should be validated in prospective cohorts. Secondly, the biological roles of the key pyroptosis genes in T2DM pathogenesis require further experimental investigation. Thirdly, the interactions between CHMP2B and its relevant genes should be further analyzed in T2DM.

Conclusion
In summary, by comparing distinct machine learning approaches, this work proposed BAK1, CHMP2B, NLRP6, PLCG1, and TIRAP as the key pyroptosis genes in T2DM. The key pyroptosis gene-based nomogram enabled the prediction of T2DM risk for clinical application. Possible compounds that targeted the key pyroptosis genes were screened. In addition, the key CHMP2B-relevant genes were KNTC1, NCAPG, KIAA0101, DLGAP5, GMNN, CEP55, KIF20B, ZWILCH, MCM6, and MAD2L1, which might interact with CHMP2B during T2DM. Altogether, our findings offer promising molecules for the management and therapy of T2DM and its complications.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fendo.2023.1112507/full#supplementary-material

SUPPLEMENTARY TABLE 1
Landscape of the fraction of immune cell types across T2DM and control specimens.

SUPPLEMENTARY TABLE 2
The list of CHMP2B-relevant genes.

SUPPLEMENTARY TABLE 3
Selection of key CHMP2B-relevant genes via the MCODE method.