Roles of Pyroptosis-Related Gene Signature in Prediction of Endometrial Cancer Outcomes

Endometrial cancer (EC) is one of the most common gynecological malignancies in women, accompanied by the increasing incidence and decreasing age of onset. Pyroptosis plays an important role in the occurrence and development of malignant tumors. However, the relationship between pyroptosis-related genes and tumor prognosis remains unclear. In this study, analyzing the expression levels and survival data of 33 pyroptosis-related genes in the Cancer Genome Atlas (TCGA) between normal samples and tumor samples, we obtained six pyroptosis-related prognostic differentially expressed genes (DEGs). Then, through the least absolute shrinkage and selection operator (LASSO) regression analysis, a gene signature composed of six genes (GPX4, GSDMD, GSDME, IL6, NOD2 and PYCARD) was constructed and divided patients into high- and low-risk groups. Subsequently, Kaplan-Meier (KM) plot, receiver operating characteristic (ROC) curve and principal component analysis (PCA) in two cohorts demonstrated that the gene signature was an efficient independent prognostic indicator. The enrichment analysis and immune infiltration analysis indicated that the high-risk group generally has lower immune infiltrating cells and less active immune function. In short, we constructed and validated a pyroptosis-related gene signature to predict the prognosis of EC, which is correlated to immune infiltration and proposed to help the precise diagnosis and therapy of EC.


INTRODUCTION
Endometrial cancer (EC) is one of the most common gynecological malignancies in women. In 2020, there were 417,367 new cases and 97,370 new deaths worldwide (1). In recent years, the number of new cases is increasing while its onset age is gradually decreasing (2). The prognostic outcomes of EC patients in different stages are obviously different. Early-stage EC patients generally have a good prognosis, while advanced, metastatic or recurrent EC patients usually have a poor prognosis (3)(4)(5). Therefore, EC patients should be detected as soon as possible to improve their prognosis. The most common clinical symptom of EC is postmenopausal vaginal bleeding (PMB). However, this symptom is not specific because only 9% of women with PMB are diagnosed as EC (6). Cytology and transvaginal ultrasonography are most commonly used to screen EC, but they also lack specificity unfortunately (7). Therefore, it is particularly important to screen high-efficiency biomarkers or risk model to improve the prognosis of EC patients.
Pyroptosis is an atypical form of inflammatory programmed cell death mediated by caspase family proteins. It can be triggered by infections, malignant tumors and other pathological factors. The characteristic of pyroptosis is the rapid rupture of the cell membrane and the release of pro-inflammatory substances in the cell (8)(9)(10). The three pathways of pyroptosis are the classic pathway that depends on caspase-1 (11,12), the nonclassical pathway that depends on caspase-4/5/11 (13,14), and the special pathway that depends on caspase-3 (15). In addition, the Gasdermin family proteins are known as the "executioners" of pyroptosis. They are the most important proteins that affect pyroptosis, especially Gasdermin D (GSDMD) and Gasdermin E (GSMDE) (9,10,16).
The occurrence and development of malignant tumors is an extremely complex biological process, and studies have shown pyroptosis plays a certain role in it (17)(18)(19). On one hand, promoting pyroptosis of tumor cells can effectively inhibit the occurrence and progression of tumors and enhance the effect of anti-tumor treatment (9,20); on the other hand, pyroptosis can form a microenvironment suitable for tumor cell growth, thereby promoting tumor growth (21,22). Recent studies demonstrated that pyroptosis is closely related to proliferation, invasion and metastasis of tumor cells, and can affect the therapeutic effect of chemotherapy (12,15,19,(23)(24)(25). A recent study found that hydrogen inhibits the growth of EC through the pyroptosis pathway mediated by ROS/NLRP3/caspase-1/GSDMD, revealing the close relationship between pyroptosis and EC (12). However, there is still no relevant research to clarify the prognostic value of the gene signature related to pyroptosis in EC.
Accordingly, in this study, we aim to establish a novel gene signature for predicting the prognosis of EC by mining data from the Cancer Genome Atlas (TCGA). The overall design and technical roadmap of the study are shown in Figure 1.
To this end, we intend to analyze the mRNA expression data and clinical information of EC patients in TCGA, and identify the prognostic differentially expressed genes (DEGs) related to pyroptosis. Then, a pyroptosis-related gene signature will be established through the least absolute shrinkage and selection operator (LASSO) regression analysis from the training set, and its prediction performance will be further verified through the validation set. After that, we plan to perform enrichment analysis and immune infiltration analysis on DEGs of the high-and lowrisk groups. In addition, we propose to verify mRNA and protein expression of the pyroptosis-related prognostic DEGs in the gene signature. All in all, we hope to successfully establish a gene signature related to pyroptosis, which may help the diagnosis and treatment of EC.

Data Collection
The mRNA expression profiles and corresponding clinical information of 23 normal samples and 552 EC samples were downloaded from the TCGA database (https://portal.gdc.cancer. gov/). After deleting deletions and duplications, the "sample" function in R was used to randomly divide the remaining 539 EC patient samples into the TCGA training set (n = 287) and the TCGA validation set (n = 252) in equal proportions. In addition, 33 genes related to pyroptosis were collected from previous studies and were listed in Supplementary Table S1 (26).

Identification of Pyroptosis-Related Prognostic DEGs
Firstly, using the false discovery rate (FDR) < 0.05 as the standard, we used the "limma" R package to perform mRNA differential expression analysis on EC samples and normal samples in the TCGA cohort to obtain pyroptosis-related DEGs. Then, taking p < 0.05 as the critical value standard for genes with prognostic value, we performed univariate Cox regression analysis on the overall survival (OS) of these 33 genes to obtain prognostic genes. Subsequently, the pyroptosis-related DEGs and prognostic genes were intersected to acquire the pyroptosisrelated prognostic DEGs for further analysis. In order to show the differential expression level of DEGs between tumor samples and normal samples more intuitively, we adopted the "heatmap" R package to draw a heatmap. Furthermore, we utilized the STRING online tool (http://string-db.org/) to perform proteinprotein interaction (PPI) analysis to further understand the relationship between pyroptosis-related prognostic DEGs.

Construction and Validation of Pyroptosis-Related Gene Signature
After obtaining the prognostic genes related to pyroptosis, we applied the "glmnet" R package to construct the prognostic model of pyroptosis-related genes through LASSO regression analysis. Then, the risk score was calculated through the following formula: risk score = n i Xi × Yi (X: coefficient value of each gene, Y: expression level of each gene). According to the median value of the risk score, the patients were further divided into highand low-risk groups. Kaplan-Meier (KM) curve and receiver operating characteristic (ROC) curve were drawn using the "survival" and "timeROC" R packages to evaluate the predictive efficiency of pyroptosis-associated gene signature. Further, we employed the "Rtsne" R package to perform principal component analysis (PCA) analysis on this model to visually demonstrate its predictive performance. Separately, we also carried out univariate and multivariate Cox regression analysis for further verification.

Enrichment Analysis and Immune Infiltration Analysis
We divided the EC patients into high-and low-risk groups based on the median value of the risk score. Then, with |log2FC| ≥ 1 and FDR < 0.05 as the specific criteria, we utilized the "limma" R package to filter out the DEGs between the high-and low-risk groups. Then, we further applied the "clusterProfiler" R package to perform Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis on these genes. In addition, we made use of the "gsva" package to perform single sample gene set enrichment analysis (ssGSEA) to calculate the scores of immune  infiltrating cells and immune function in the TCGA training set and validation set. Hereafter, we utilized the "limma" package to analyze the difference of the scores between the two groups.

Verification of mRNA and Protein Expression of Six Genes in the Gene Signature
The mRNA expression levels and partial representative protein expression levels of various genes in tumor samples and normal samples resourced from the UALCAN database (http://ualcan. path.uab.edu/) (27) and The Human Protein Atlas database (https:/ /www.proteinatlas.org/) (28,29). Moreover, the genetic alteration data of the six genes in the risk model was derived from the cBioPortal database (https://www.cbioportal.org/) (30,31).

Statistical Analysis
The Wilcoxon test was adopted to compare the mRNA expression levels between normal samples and EC samples. The KM curve of the two-sided log-rank test was used to compare the OS of patients between the high-and low-risk groups. In order to evaluate the predictive power of the risk model, we performed univariate and multivariate Cox regression analysis with hazard ratio (HR) and 95% confidence interval (95% CI). In addition, the Mann-Whitney U test was utilized to compare the scores of immune infiltrating cells and immune function in EC patients between high-and low-risk groups. All statistical analysis in this study were performed using R programming (v.1.4.1717). Unless otherwise specified, p < 0.05 is considered to be a statistically significant standard.

Identification of Pyroptosis-Related Prognostic DEGs
Analyzing the expression levels of 33 genes related to pyroptosis in 23 normal samples and 552 EC samples in the TCGA database, we obtained 25 DEGs (p < 0.05, Supplementary Table S2). We can speculate that pyroptosis is related to the occurrence and development of EC, because the expression of most pyroptosisrelated genes between normal samples and EC samples had statistically significant differences. The heatmap showed the expression levels of these genes (Figure 2A). Simultaneously, we performed univariate Cox regression analysis on 33 genes related to pyroptosis, and obtained six prognostic genes correlated to OS. By taking the intersection of DEGs and prognostic genes, we identified six pyroptosis-related prognostic DEGs, namely GPX4, GSDMD, GSDME, IL6, NOD2 and PYCARD ( Figure 2B). Through the heatmap, we clearly found that GSDME and IL6 were downregulated in tumors while other four genes were upregulated ( Figure 2C). In order to further explore the interaction of these prognostic DEGs related to pyroptosis, we conducted PPI analysis and correlation analysis, and the results were shown in Figures 2D,E, respectively.

Construction of Pyroptosis-Related Gene Signature in the TCGA Training Set
In the TCGA train set, by performing LASSO regression analysis on the above six DEGs, we constructed a pyroptosis-related gene signature according to the optimal λ value (Figures 2F,G). Their respective coefficient values in this model were listed in Table 1, and the corresponding risk score formula was as follows: Risk score = 0.592 × expression value of GSDME+0.150 × expression value of IL6-0.256 × expression value of GPX4-0.339 × expression value of GSDMD-0.556 × expression value of NOD2-0.080 × expression value of PYCARD. We could result from the KM curve that the survival probability of the high-risk group was significantly lower than that of the low-risk group (p < 0.001, Figure 3A). To assess the sensitivity and specificity of the risk model, the ROC curve was carried out and the area under the curve (AUC) for 3, 5, and 7 years was 0.704, 0.699, and 0.752, respectively ( Figure 3B). Based on the median value of the risk score, 143 samples were assigned in the high-risk group, while 144 cases were in the low-risk group ( Figure 3C). As the risk score increased, the survival time of patients decreased and the number of deaths increased ( Figure 3D). Separately, PCA analysis demonstrated that the two groups of patients with high and low risk could be well distributed in the two clusters ( Figure 3E). Univariate and multivariate Cox regression analysis were employed to determine whether the risk score in the gene signature could be adopted as an independent prognostic factor. The results instructed that whether it is univariate or multivariate Cox regression analysis, the risk score was a qualified independent prognostic indicator (p < 0.001). The results of univariate analysis were HR = 2.836 and 95% CI = 1.807∼4.450, while the details of multivariate analysis were HR = 2.230 and 95% CI = 1.404∼3.543 (Figures 3F,G). Moreover, we found that the risk score was significantly correlated with grade (p < 0.0001), vital status (p = 0.0009) and survival time (p = 0.0151) ( Table 2).

Validation of the Gene Signature in the TCGA Validation Set
In the TCGA validation set, 134 patients were included in the high-risk group, while the other 118 patients were assigned to the low-risk group (Figure 4A). It is obviously that the higher the risk score, the higher the patient's probability of death and the lower the patient's survival time ( Figure 4B). The KM plot revealed significant statistical differences in the survival probability of the high-and low-risk groups, and the high-risk group had a lower survival probability (p < 0.01, Figure 4C). ROC curves (AUC for 3, 5, and 7 years are 0.634, 0.648, and 0.706, respectively) further reflected the stable sensitivity and specificity of the prognostic model ( Figure 4D). Likewise, the results of PCA analysis showed that the high-and low-risk groups of patients could still be well distributed in the two clusters in the TCGA validation set, reflecting the stability of the risk model ( Figure 4E). In addition, using correlation analysis, we found that the risk score in the TCGA validation set was significantly correlated with grade (p < 0.0001) and survival time (p = 0.0480) ( Table 2).

Enrichment Analysis of DEGs in Two Risk Groups
We first divided the EC patients in the TCGA training set into high-and low-risk group, and obtained 488 DEGs between the two different risk groups through differential expression analysis (Supplementary Table S3). Then, GO enrichment analysis was performed to explore the biological functions of DEGs in two risk groups. The results demonstrated that the top five significantly enriched biological progresses were immunoglobulin complex, complementary activation (classical pathway), humoral immune response mediated by circulating immunoglobulin, complement activation and immunoglobulin mediated immune response ( Figure 5A). Besides, KEGG pathway enrichment analysis indicated DEGs-related pathways were mainly significantly enriched in cytokine-cytokine receptor interaction, hematopoietic cell lineage and cell adhesion molecules ( Figure 5C). Hereafter, we performed the same analysis on the TCGA validation set and got 463 DEGs (Supplementary Table S4). The GO analysis results in the validation set were similar to the training set ( Figure 5B), while the KEGG analysis of validation set showed that these DEGs were mainly enriched in IL-17 signaling pathway, pathogenic Escherichia coli infection, and protein digestion and absorption ( Figure 5D).

Comparison of Immune Cells and Immune Function of EC Patients in High-and Low-Risk Groups
Using ssGSEA, we further compared the enrichment scores of immune cells and immune function of EC patients in two groups on the basis of the above enrichment analysis. Comprehensive analysis of the results in the TCGA training set and the TCGA validation set, compared with the lowrisk group, the high-risk group generally has lower immune infiltrating cells, especially Dendritic cells (DCs), T helper cells and tumor-infiltrating lymphocytes (TIL) (Figures 6A,C). Similarly, patients in the high-risk group have less active immune function than the low-risk group, particularly in check-point, cytolytic activity, human leukocyte antigen (HLA), T cells co-inhibition and T cells co-stimulation (Figures 6B,D).

Verification of mRNA and Protein Expression of Six Genes in the Gene Signature
In order to further verify the expression levels of the six genes in the risk model, we utilized the UALCAN online website (http://ualcan.path.uab.edu/) to visualize their mRNA expression levels and found the expression of GPX4, GSDMD, NOD2 and PYCARD up-regulated, while GSDME and IL6 expression down-regulated ( Figure 7A). Similarly, the representative immunohistochemical results of these genes obtained from The Human Protein Atlas illustrated that their protein expression levels have similar trends ( Figure 7B).

Mutation Analysis of Six Genes in the Gene Signature
In order to deepen the understanding of the genetic characteristics of these pyroptosis-related genes, the cBioPortal online tool was utilized for mutation analysis. It turned out that although they had significant different expression between normal patients and EC patients in the TCGA database, their  mutation frequency was relatively low. Among them, the highest mutation frequency was only 7% (GSDME and NOD2), while frequency of other genes was even lower (Figure 7C).

DISCUSSION
EC is one of the most common gynecological malignancies in women with a relatively high number of incidences and deaths, and the age of onset has gradually become younger in recent years [(1, 2)]. Currently, it is commonly used to screen EC through clinical symptoms, cytology and transvaginal ultrasound, but the specificity of these methods is not satisfactory.
With the rapid development of sequencing technology, a few single genes had been unearthed as biomarkers for predicting malignant tumors (32)(33)(34). However, since the expression of a single gene is easily regulated by different signaling pathways, its predictive effect has greater uncertainty. Thence, screening multiple key factors regulated by the same signal pathway to construct a multi-gene prediction model may be a way to improve predictive performance. In recent years, pyroptosis had attracted the attention of researchers as a new form of programmed cell death, and many studies had confirmed that it is closely related to the proliferation, invasion, metastasis and chemotherapy effects of tumor cells (12,15,19,(23)(24)(25). A recent study demonstrated that hydrogen inhibited the growth of endometrial cancer through the pyroptosis pathway mediated by ROS/NLRP3/caspase-1/GSDMD, indicating there is a close relationship between pyroptosis and EC (12). However, there is no relevant reports about the connection between pyroptosis and EC prognosis.
In our study, we successfully established and verified a gene signature with six pyroptosis-related genes (GPX4, GSDMD, GSDME, IL6, NOD2, and PYCARD) that could predict the prognosis of EC.
Glutathione Peroxidase 4 (GPX4) is a member of the glutathione peroxidase (GPX) family, which converts H 2 O 2 to H 2 O and oxidizes glutathione to its disulfide form (GSSG). GPX4 has been proved to be a key regulator of nonapoptotic forms of programmed cell death such as ferroptosis. Inhibition of GPX4 can trigger uncontrolled oxidation of polyunsaturated fatty acids (PUFAs) and the production of fatty acid free radicals, which can lead to ferroptosis (35,36). Additionally, some studies had clarified that GPX4 is closely related to the occurrence and development of tumors (37)(38)(39).
Pyroptosis is also called Gasdermin (GSDM)-mediated programmed cell death. Both GSDMD and GSDME belong to the GSDM family, which are the most important proteins that affect pyroptosis. Cleaved by activated cysteine aspartate-specific protease (caspase), GSDM family releases the N-terminal domain to punch holes in cellular membranes and cause the cells to swell, burst and die (40,41).
Interleukin 6 (IL-6) is a soluble mediator that is rapidly produced in the acute phase of infection or tissue damage, and promotes host defense by stimulating hematopoiesis, acute phase inflammation and immune response. IL-6 is a typical inflammatory cytokine that plays an important role in host defense. When infection, tissue damage or cancer occurs, IL-6 can be rapidly released from monocytes, macrophages and even fat cells. IL6 plays an important regulatory role in the occurrence and development of diseases (42,43).
Nucleotide-binding oligomerization domain containing two (NOD2) is a member of the NOD1/Apaf-1 family involved in the regulation of apoptosis. The N-terminal encodes two Caspase Recruitment (CARD) domains, and the C-terminal consists of ten linked leucine-rich repeats (LRR). On one hand, NOD2 promotes apoptosis mainly by inducing the expression of Caspase-9. On the other hand, it can mediate the host immune response by recognizing bacterial pathogenic components such as lipopolysaccharide (LPS) in the cytoplasm, especially playing an important role in gastrointestinal immunity (44)(45)(46).
Encoding apoptosis-associated speck-like protein containing a CARD (ASC) protein, PYCARD gene acts as a key mediator of inflammation and apoptosis, and promotes caspase-mediated apoptosis (47). It mainly recruits and activates caspase-1, participates in the process of cell inflammation and pyroptosis, and plays an important role in inflammatory diseases and a variety of cancers (48,49).
In short, these genes are closely related to inflammation and cell death. However, how they interact with each other in the process of pyroptosis remains to be studied in depth.
Through functional enrichment analysis of DEGs between high-and low-risk groups, we found that DEGs are mainly involved in immune response, especially humoral immunity. Moreover, the immune infiltration analysis of these DEGs detected that the scores of immune cell infiltration and immune function of the high-risk group were both lower than the low-risk group. It can be speculated that the poor prognosis of patients in the high-risk group may be caused by the reduced immune level. Based on the above gene composition of the risk model and results of function analysis, we can reasonably speculate that pyroptosis is involved in the regulation of tumor immune microenvironment (TIME).
Despite our research deserves a certain degree of affirmation, there are still some limitations. First, if the predictive ability of the gene signature can be verified by RNA sequencing of tissue samples from our institution, it will be better. Second, if we can explore the regulation of these genes on the TIME through in vivo and in vitro experiments, the research will be more in-depth.
In conclusion, we constructed and validated a pyroptosisrelated gene signature associating with immune infiltration to predict the prognosis of EC. This gene signature provides a new choice for the prognosis prediction of EC and is proposed to help precise diagnosis and treatment of EC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
YC designed the study and wrote the manuscript under the guidance of SY and JL. YL, QD, and CS participated in data analysis, discussion, and language editing. SQ, KL, and QZ helped statistical analysis. SY and JL contributed to the revision of the manuscript. All authors contributed to the article and approved the submitted version.