Bioinformatics Analysis of Expression Profiles and Prognostic Values of the Signal Transducer and Activator of Transcription Family Genes in Glioma

Signal transducer and activator of transcription (STAT) family genes—of which there are seven members: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B, and STAT6—have been associated with the progression of multiple cancers. However, their prognostic values in glioma remain unclear. In this study, we systematically investigated the expression, the prognostic value, and the potential mechanism of the STAT family genes in glioma. The expression of STAT1/2/3/5A/6 members were significantly higher and positively correlated with IDH mutations, while the expression of STAT5B was lower and negatively correlated with IDH mutations in glioma. Survival analysis indicated that the upregulation of STAT1/2/3/5A/6 and downregulation of STAT5B expression was associated with poorer overall survival in glioma. Joint effects analysis of STAT1/2/3/5A/5B/6 expression suggested that the prognostic value of the group was more significant than that of each individual gene. Thus, we constructed a risk score model to predict the prognosis of glioma. The receiver operating characteristic curve and calibration curves showed good performance as prognostic indicators in both TCGA (The Cancer Genome Atlas) and the CGGA (Chinese Glioma Genome Atlas) databases. Furthermore, we analyzed the correlation between STAT expression with immune infiltration in glioma. The Protein–protein interaction network and enrichment analysis showed that STAT members and co-expressed genes mainly participated in signal transduction activity, Hepatitis B, the Jak-STAT signaling pathway, transcription factor activity, sequence-specific DNA binding, and the cytokine-mediated signaling pathway in glioma. In summary, our study analyzed the expression, prognostic values, and biological roles of the STAT gene family members in glioma, based on which we developed a new risk score model to predict the prognosis of glioma more precisely.


INTRODUCTION
Glioma is one of the most common cancers and leading causes of cancer-related deaths worldwide (Gagliardi et al., 2014;Zhang and Zhang, 2015). Despite the great improvement in therapy, the overall survival of glioma patients remains poor (Linz, 2010). Thus, it is essential to reveal the molecular mechanism involved in tumorigenesis and to identify novel prognosis biomarkers and therapeutic targets for glioma.
The Signal transducer and activator of transcription (STAT) gene family contains seven members: STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B, and STAT6 (Gao et al., 2012). The STAT proteins are large proteins (750-850 amino-acids in size) and all share five common domains: an amino-terminal domain, a coiled-coil domain, a DNA-binding domain (DBD), an SRC homology 2 (SH2) domain, and a transactivation domain (TAD) (Lorenzini et al., 2017). Among them, the coiled-coil domain contributes to nuclear localization; the DBD domain is relevant to target gene transcription; the SH2 domain mediates homo-or heterodimerization of STATs; and the TAD domain is crucial for the activity of the STAT protein (Miklossy et al., 2013;Dorritie et al., 2014). STAT proteins are regarded as cytoplasmic transcription factors that regulate the processes of cell proliferation, differentiation, apoptosis, and immune responses by modulating the transcription of target genes (Goswami and Kaplan, 2017;Villarino et al., 2017). In recent years, an increasing number of studies have demonstrated that constitutive activation of STATs participates in the pathology of glioma. Studies have also shown that the expression of STAT1 is decreased in glioma compared with normal brain tissues (Ju et al., 2013;Chen et al., 2020a). Overexpression of STAT1 strongly suppresses the growth of glioma cells and promotes cell apoptosis (Zhang et al., 2018). Ehrmann et al. (2008) reported that the expression of STAT2 was lower in low-grade astrocytomas than in high-grade astrocytomas, while the function of STAT2 in glioma has rarely been reported. STAT3 has been discovered to be frequently activated and has been identified as an oncogene in glioma. Constitutively activated STAT3 in glioma is associated with oncogenesis and cancer progression (de la Iglesia et al., 2009;Chai et al., 2016). However, recent studies have also shown that STAT3 may play a tumor suppressive function in glioma pathogenesis when associated with different levels of expression of PTEN (de la Iglesia et al., 2008).
Studies investigating the expression and function of the STAT4 gene in glioma are rare. STAT5 comprises two highly homologous isoforms, STAT5A (94 kDa) and STAT5B (92 kDa) (Trifa et al., 2013), which share considerable functional overlap but play different roles (Socolovsky et al., 1999). The expression of STAT5A showed no significant correlation with high grade glioma when compared to normal brain tissues and low-grade glioma, although the knockdown of STAT5A expression inhibited cell invasion but not cell growth in glioblastoma (GBM) (Liang et al., 2009). STAT5B was recognized as the pertinent gene contributing to the progression from low-grade glioma (LGG) to HGG. Silencing STAT5B inhibited cell growth and cell invasion in the GBM cell line (Liang et al., 2005). Higher STAT6 levels were found in glioma tissues than normal brain tissues. Importantly, knockdown of STAT6 expression in a GBM cell line decreased cell proliferation and invasion (Merk et al., 2011).
Collectively, these studies suggest that the STAT gene family plays an important role in the development and progression of glioma. However, the potential application of the entire STAT gene family remains unclear. In this study, the predictive value, potential mechanism, and correlation with tumorimmune infiltrating cells (TIICs) of the STAT gene family were comprehensively assessed using bioinformatics tools.

Expression of the STAT Gene Family in Public Databases
The Oncomine database 1 was first used to analyze the mRNA expression of the STAT gene family in different types of cancers in comparison with normal tissues (threshold p-Value = 0.01 and threshold fold change = 2) (Nie et al., 2020). Second, TCGA 2 and CGGA 3 databases were included to analyze the expression and Spearman's correlations of STAT gene family across different tumor grades (WHO grade II-IV) and different IDH mutation statuses (mutant and wild-type) of glioma samples . Next, the protein expression levels of the STAT gene family in normal brain tissues and glioma tissues were explored in the Human Protein Atlas 4 (Uhlen et al., 2010). Finally, the genomic alterations of the STAT gene family were explored using the cBioPortal 5 .

Prognostic Value of the STAT Gene Family in Glioma
The prognostic values of each member of the STAT family were evaluated in TCGA and the CGGA databases. Then, Least absolute shrinkage and selection operator (LASSO) regression analysis was performed to construct a risk score model based on the expression and prognostic value of STAT gene family members in the TCGA dataset. Cases extracted from the TCGA and CGGA datasets were stratified into high-and low-risk groups based on the risk score. The time-dependent receiver operating characteristic (ROC) curve was drawn to evaluate the predictive value of this risk score for overall survival in both datasets.

Correlation Analysis Between STAT Gene Family Members and Tumor-Immune Infiltrating Cells
The association between the STAT gene family members and all TIICs (including B-cells, CD8 + T-cells, CD4 + T-cells, macrophages, neutrophils, and dendritic cells) in LGG and GBM were analyzed using the Tumor Immune Estimation Resource (TIMER) platform 6 . Tumor purity was used to correct the Spearman-based correlation analysis (Li et al., 2017).

GeneMANIA Analysis
The GeneMANIA website was used to predict function and to construct protein-protein interaction (PPI) networks of genes or gene lists through bioinformatics methods (Warde-Farley et al., 2010). Co-expression, co-location, physical interaction, and gene enrichment were analyzed using this web-based interface.

Functional and Pathway Enrichment Analysis
The interactive genes of the STAT gene family, identified from GeneMANIA, were subject to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis using Database for Annotation, Visualization, and Integrated Discovery (DAVID) (Huang et al., 2007). A p-Value < 0.05 was considered statistically significant. 6 https://cistrome.shinyapps.io/timer/

Statistical Analysis
Graphpad Prism version 5.0 software (GraphPad Software, San Diego, CA, United States) and IBM SPSS Statistics version 20.0 (Chicago, IL, United States) were used for statistical analysis and graphing. Cox regression analysis was employed to perform univariate and multivariate survival analysis. The student's t-test and Pearson's correlation test were conducted to compare groups and for correlation analysis. A p-Value of <0.05 was considered statistically significant in our study.

Expression of the STAT Gene Family in Glioma
The Oncomine database was employed to analyze the mRNA expression of members of the STAT gene family in various types of cancers (Figure 1 and Table 1). STAT1, STAT3, STAT5A, STAT5B, and STAT6 mRNA expression levels were significantly upregulated in glioma (including GBM,  Correlation between STAT genes expression and IDH mutation status in glioma ( * * p < 0.01 and * * * p < 0.001). "NS" means "not significant" or "not statistically significant," i.e., p ≥ 0.05.
Frontiers in Genetics | www.frontiersin.org The correlation between STAT gene expression and IDH mutation status in glioma using CGGA database ( * * p < 0.01 and * * * p < 0.001). "NS" means "not significant" or "not statistically significant," i.e., p ≥ 0.05. astrocytoma, oligodendroglioma, and anaplastic astrocytoma), compared with normal brain tissues or neural stem cells. However, the expression level of STAT4 mRNA was lower in glioma than in normal controls. No study has been conducted to evaluate STAT2 mRNA in the Oncomine database. Next, TCGA datasets were used to evaluate the expression of STAT gene family members in glioma. The results showed that the expression of STAT1, STAT2, STAT3, STAT5A, and STAT6 increased with increasing tumor grades. While the expression of STAT5B was negatively correlated with tumor grades. mRNA expression of STAT4 showed no statistically significant differences between LGG (low-grade glioma) and GBM (Figures 2A-G). Spearman's correlations of the expression of each member in the STAT gene family in the TCGA dataset were also analyzed ( Figure 2H). Furthermore, the mRNA expression of STAT1, STAT2, STAT3, STAT4, STAT5A, and STAT6 were upregulated in IDH wildtype compared to IDH mutated cases, but STAT5B showed the opposite results (Figures 2I-O). Similar results were also obtained in the CGGA dataset, except that the mRNA expression of STAT4 showed no significant correlation with the IDHmutational status (Figures 3A-O). The association between STAT expression and clinicopathological characteristics was also performed (Supplementary Tables 1, 2). The results indicated that STAT expression was correlated with tumor grades, IDH genotype, and age. The protein levels of STAT gene family members in normal brain tissues and glioma samples (low-grade and high-grade) were explored using the Human Protein Atlas website ( Figure 4A). Finally, the genomic alterations of the STAT gene family in glioma were investigated through the cBioPortal website. As shown in Figure 4B, STAT6 displayed the highest incidence rate of genetic variations (1.4%), which was followed by STAT2 (1.2%), STAT5B (0.8%), STAT2 (0.6%), STAT5A (0.6%), STAT1 (0.4%), and STAT4 (0.4%).

Prognostic Potential of STAT Gene Family in Glioma
To investigate the prognostic value of the STAT gene family in glioma, survival analysis was performed to determine the correlation between the expression of STAT gene family members and the prognosis of glioma patients. High mRNA expression of STAT1, STAT2, STAT3, STAT5A, and STAT6 indicated poorer

Construction of a Prognostic Gene Signature Based on STAT Gene Family Members
Based on the expression and prognostic value of the STAT gene family in glioma derived from the TCGA dataset, we aimed to construct a new prognostic gene signature that could predict the prognosis of glioma more accurately (Figures 6A-D). The risk score of the model was calculated as follows: The area under the ROC curve (AUC) was 0.759. The prognostic value of this model was tested in the TCGA dataset. We found that high-risk patients had a significantly poorer OS than the low-risk group (p < 0.0001) ( Figure 6E). The HR was 4.997 (4.239-5.747), which was higher than the other patient groups in Figures 5A-G. Similar results were obtained using the CGGA dataset ( Figure 6F). The above findings indicated that the newly constructed model had good performance for glioma survival prediction.
To eliminate the influence of tumor grades and IDH type, we assessed the prognostic value of STATs in LGG/GBM and IDH wild-type/mutated, respectively. The result showed that STAT1/2/3/5A/6 and the new model were significantly associated with the prognosis in LGG in both TCGA and CGGA datasets. However, none of the STATs showed a significant correlation with the prognosis in the GBM group of either dataset. Only our new model showed a weak correlation with the prognosis of GBM in the CGGA dataset (p = 0.0278). In the TCGA dataset, the expression of members and our constructed model was significantly correlated with prognosis both in IDH wild-type and IDH-mutated cases. In the CGGA dataset, only our constructed model was significantly correlated with prognosis in both the IDH-wild type and IDH-mutated cases (Tables 2, 3). These findings indicated that our newly constructed model showed good performance in subgroup analysis.

Predicted Pathways of the STAT Gene Family in Glioma
To explore the potential mechanisms through which the STAT gene family participates in the pathology of glioma, GeneMANIA was used to predict genes co-expressed with the STAT gene family and to construct a PPI network. A total of 20 genes were found to be co-expressed with the STAT gene family (Figure 8). Next, these genes were subjected to GO and KEEG analyses using the DAVID database. The top 5 molecular functions (MF) enriched terms were signal transducer activity, transcription factor activity, and sequence-specific DNA binding, signaling adaptor activity, DNA binding, and SH3/SH2 adaptor activity. The top 5 biological processes (BP) enriched terms with which the STAT gene family and their cooperators were significantly associated were: the cytokine-mediated signaling pathway; the Jak-STAT cascade; intracellular signal transduction; positive regulation of mitotic cell cycle; and transcription and DNAtemplates. The top two cellular components enriched terms were cytoplasm and nuclear chromatin ( Table 4). The top 10 KEGG enriched terms were Hepatitis B, the Jak-STAT signaling pathway, measles, the prolactin signaling pathway, acute myeloid leukemia, inflammatory bowel disease (IBD), the chemokine signaling pathway, natural killer cell-mediated cytotoxicity, the neurotrophin signaling pathway, and hepatitis C ( Table 5).

DISCUSSION
The STAT gene family is a family of latent transcription factors that can be activated in response to intracellular stimuli. Over 40 different cytokines or growth factors that can activate the STAT signaling pathway have been found, such as Janus kinases (JAKs), growth factor receptor (EGFR), and the platelet-derived growth factor receptor (PDGFR) (Swiatek-Machado and Kaminska, 2020). The activation of STAT members is temporary (from a few minutes to several hours) under normal physiological conditions. However, constitutive activation of STATs has been identified in multiple cancers, including glioma.
Signal transducer and activator of transcription 1 has been implicated in the pathophysiological processes of several types of cancers and plays dual roles. In colorectal carcinoma, hepatocellular carcinoma, and pancreatic cancer, STAT1 acts as a tumor suppressor. High expression of STAT1 was indicative of a good prognosis and could induce apoptosis or cell cycle arrest (Chen et al., 2013;Gordziel et al., 2013;Sun et al., 2014). However, studies have also shown that high STAT1 mRNA levels correlated with poor prognosis in breast cancer, while high levels of STAT1 activity promoted cell growth and immune suppression in breast cancer (Tymoszuk et al., 2014;Hou et al., 2018). These findings indicated that the function of STAT1 had a tissue specificity. In glioma, previous studies showed that the expression of STAT1 was negatively correlated with the grade of glioma (Ju et al., 2013;Yang et al., 2018). Overexpression of STAT1 could significantly inhibit cell proliferation and increase cell apoptosis (Zhang et al., 2018). Mechanistically, STAT1 could negatively modulate the expression of mouse double minute 2 (Mdm2) and interact directly with p53 to induce the expression of pro-apoptotic genes, such as Bax and Fas (Townsend et al., 2004). Suppressors of cytokine signaling 1 (SOCS1), cyclin-dependent kinase inhibitor 1B (Cdkn1b), and VEGFA (Yoshimura, 2009;Zhang et al., 2018) have also been identified as downstream targets of STAT1. However, a more recent study showed that IL-8 promoted glioma migration, invasion, and mesenchymal transition by regulating the STAT1/HIF-1α/Snail axis. Upregulation of STAT1 signaling genes in GBM was associated with poor prognosis (Duarte et al., 2012;Thota et al., 2014). Furthermore, STAT1 has also been implicated in the immunosuppression of cancer cells. IFNγ could activate the STAT1/MEK/ERK signaling pathway, and promote the expression of PD-L1 . In our study, we found that the expression of STAT1 was significantly higher in GBM when compared with LGG in both TCGA and CGGA datasets. STAT1 expression was relatively higher in the IDH wildtype group. Paradoxically, high expression of STAT1 predicted a poorer prognosis. This inconsistency between the expression and prognosis of STAT1 requires further research and exploration.
Signal transducer and activator of transcription 2 has been implicated in the process of multiple cancers, such as ovarian cancer (Chen et al., 2020b) and non-small cell lung cancer (Yang et al., 2019). Some studies have shown that STAT2 may be a tumor suppressor by acting downstream of IFN-I . However, other studies have also shown that STAT2 could promote the tumorigenesis of colorectal cancer (CRC) by upregulating the expression of IL-6 and activating the STAT3 signaling pathway (Gamero et al., 2010). STAT2, along with STAT1 and interferon regulatory factor 9 (IRF9), forms ISGF3 complexes, which play important roles in the activation of immune cells, production of inflammatory cytokines, and immune cell propagation (Lee et al., 2020). In glioma, the expression of STAT2 was lower in LGG than in GBM. In this study, we found that the expression of STAT2 was much higher in the GBM and IDH wild-type groups. Higher expression of STAT2 predicted a poorer prognosis. The mechanism and function of STAT2 in glioma needs to be further elucidated.
As a result of dysregulated upstream events and a lack of negative STAT3 regulation, STAT3 has been shown to be constitutively activated in glioma. Previous studies indicate that dysregulated STAT3 promoted cell proliferation, angiogenesis, resistance to apoptosis, and immune escape. Bcl-xL, Bcl2l1, Bcl-2, cyclin D1, and c-Myc have been identified as its downstream targets (Puram et al., 2012). However, recent studies have also shown that the role of STAT3 in glioma is correlated to a degree with genetic alterations. For example, in PTEN-deficient GBM (∼35% of GBM), STAT3 played the role of tumor suppressor   rather than of oncogene (Cancer Genome Atlas Research Network, 2008). Aberrant STAT3 signaling in GBM has also been associated with the dysfunction of both the innate and adaptive components of the immune system. The activation of STAT3 in GBM-resident tumor-associated macrophages/microglia (TAMs) impaired their ability to mediate phagocytosis and led to their polarization toward an immunosuppressive phenotype. STAT3 activation inhibited the maturation of dendritic cells and the expression of key molecules necessary for T cell activation and antigen presentation (Cheng et al., 2003). In the adaptive immune system, STAT3 activation promoted an increased proportion of Tregs and decreased infiltration of CD4 + and CD8 + T cells (Zhang et al., 2009). Signal transducer and activator of transcription 4 has been identified as an oncogene in gastric cancer, ovarian cancer, and CRC (Zhou et al., 2014;Cheng et al., 2015;Zhao et al., 2017). However, high levels of STAT4 expression predict better prognosis in breast cancer, hepatocellular carcinoma, and gastric cancer (Wang et al., 2015(Wang et al., , 2018b. In our study, we found that the expression of STAT4 had no correlation with tumor grades or patient prognosis. Signal transducer and activator of transcription 5 signaling has been involved in the tumorigenesis of GBM. STAT5A and STAT5B are predominantly located in the cytoplasm of glioma cells. A study that contained nine normal cortex, 22 diffuse astrocytoma, and 15 GBM samples showed that high levels of STAT5B were detected in 57.1% of GBM samples, 27.3% of diffuse astrocytoma samples, and 22.2% of normal cortex samples. However, high levels of STAT5A were detected in 28.6%, 18.2%, and 22.2% of the GBM samples, diffuse astrocytoma samples, and normal cortex samples, respectively (Liang et al., 2009). Targeting STAT5B, but not STAT5A may suppress GBM cell growth and induce G1 cell cycle arrest. In addition, Bcl-2, p27kip1, p21waf1/cip1, VEGF, and FAK may be downstream targets of STAT5B. Recent studies have shown that STAT5A can promote the transcription of LINC01198, which promotes the proliferation of glioma cells by stabilizing DGCR8 (Tan et al., 2020). In addition, STAT5 has been reported to be involved in the maintenance of normal immune function and homeostasis by mediating the biological actions of γc family. STAT5 also plays an important role in the function and development of Tregs. Consistent activation of STAT5 leads to a suppression in antitumor immunity (Rani and Murphy, 2016). In our study, we found that STAT5A expression was positively correlated with tumor grade. Higher expression of STAT5A indicated poor prognosis, while STAT5B displayed the contrary result. These findings indicated that STAT5A and STAT5B may have opposite roles in glioma.  Studies have shown that STAT6 was expressed in many glioma tissues but not in normal brain tissue. Upregulation of STAT6 was correlated with longer survival times. A reduction of 3 H-Thymidine uptake was observed in STAT6-deficient GBM cells. Overexpression of STAT6 promoted cell proliferation and invasion. Urokinase Plasminogen activator (uPA) and matrix metalloproteinase 1 (MMP-1) were identified as the target genes of STAT6 (Merk et al., 2011), which have been reported to be expressed by T cells and B cells (Goenka and Kaplan, 2011). STAT6 plays a crucial role in Th2 cell differentiation and promotes the development and differentiation of B cells (Wang et al., 2021). We found that the expression of STAT6 positively correlated with tumor grades and high levels of STAT6 predicted poor prognosis.
In the present study, the association between the STAT gene family and glioma was investigated using the TCGA and CGGA databases. We found that the expression of STAT1, STAT2, STAT3, STAT5A, and STAT6 was upregulated in GBM compared with LGG tissues, while the expression of STAT5B was downregulated. No significant variation in expression was observed when comparing the expression of STAT4 in both datasets. High expression levels of STAT1/2/3/5A/6 indicated poorer prognosis, while the expression of STAT5B was negatively correlated with prognosis. No apparent correlation between STAT4 expression and the prognosis of glioma was observed. In addition, subgroup analysis was performed to evaluate the prognostic value of STATs. The results indicated that neither STAT expression nor our constructed model had a significant correlation with prognosis in GBM. This may be due to the median patient survival time of GBM being very short (Perry et al., 2017).
Cancer associated inflammation and immunity can promote the initiation and progression of malignant tumors (Mantovani et al., 2008), and thus, cancer immunotherapy has attracted increasing attention in recent years. Studies have shown that the STAT gene family plays an important role in inducing and maintaining cancer-associated inflammation during carcinogenesis and cancer progression (Bollrath et al., 2009;Yu et al., 2009). In this study, we showed that STAT1/2/3/5A/5B/6 were positively correlated with TIICs, while STAT4 was negatively associated with TIICs except for CD8 + cells in LGG. GBM is noted for its paucity of T cells and enrichment of macrophages and microglia. The glioma microenvironment of GBM has also been associated with tumor-intrinsic transcriptional subtypes (Cancer Genome Atlas Research Network, 2008;Louis et al., 2016). For example, NF1 deficiency resulted in increased infiltration of tumor-associated macrophages and microglia (Wang et al., 2018a). In this study, we found that CD8 + cells were negatively correlated with almost all STATs. STAT1/3/5A/5B/6 isoforms were positively correlated or had no significant correlations with other immune cells. The associations between STATs and cancer associated immunity in GBM needs to be further explored in different cell subtypes.

CONCLUSION
Our results revealed a dysregulation or disordered expression of the STAT gene family in glioma tissues. The prognostic value of STATs in glioma was also evaluated and used to construct new prognostic gene signatures with improved predictive value in predicting glioma survival. STATs and their co-expressed genes were mainly involved in the following pathways: signal transducer activity, Hepatitis B, Jak-STAT signaling pathway, measles, transcription factor activity, and sequence-specific DNA binding. The association between STATs and TIICs was also evaluated in LGG and GBM. Further investigation and validation were needed to demonstrate these results in the future.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.