Identification and Validation of an Immunological Expression-Based Prognostic Signature in Breast Cancer

Background: Emerging evidence suggests that the immune system plays a crucial role in the regulation of the response to therapy and long-term outcomes of patients with breast cancer (BRCA). In this study, we aimed to identify a significant signature based on immune-related genes to predict the prognosis of BRCA patients. Methods: The expression data were downloaded from The Cancer Genome Atlas (TCGA). The immune-related gene list, the transcription factor (TF) gene list, and the immune infiltrate scores of samples in the TCGA database were acquired from the ImmPort database, the Cistrome Cancer database, and the TIMER database, respectively. Univariate Cox regression analysis was utilized to identify prognostic immune-related differentially expressed genes (DEGs) (PIRDEGs) in BRCA. A prognostic immune signature containing 15 PIRDEGs in BRCA was established using the least absolute shrinkage and selection operator (LASSO) model with 1,000 iterations followed by a stepwise Cox proportional hazards model with a training set of 508 samples in TCGA. An independent assessment of the prognostic prediction ability of the signature was conducted using Kaplan–Meier survival analysis with a testing set of 505 samples in TCGA. Results: We identified 466 PIRDEGs and 80 TFs among the DEGs. A gene signature containing 15 PIRDEGs was constructed. Risk scores of BRCA patients were calculated using this model, which showed a high accuracy of prognosis prediction in both the training set and testing set and could be an independent prognostic factor of BRCA patients. Conclusions: Our study revealed that a PIRDEG signature could be a candidate prognostic biomarker for predicting the overall survival (OS) of patients with BRCA.


INTRODUCTION
As one of the most common malignancies, breast cancer (BRCA) threatens the wellness and health of women worldwide, and the incidence and mortality rates of BRCA are nearly 30 and 15%, respectively (Siegel et al., 2019). Owing to its high heterogeneity, BRCA harbors a plethora of molecular subtypes, which lead to a variety of treatment therapies for BRCA. Hormone exposure is the major risk factor for BRCA, and estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) are the most common endocrine markers involved in BRCA categorization. Based on the new definition of BRCA molecular subtypes issued by the 2013 St. Gallen International Breast Cancer Conference, BRCA can be classified into the following subtypes: luminal A (ER/PR + , HER2 − , Ki67 + < 20%), luminal B (ER/PR + < 20%, HER2 − , Ki67 + ≥ 20%), HER2 + B2 (ER/PR + and HER2 overexpression), HER2 overexpression (ER − , PR − , and HER2 overexpression), basal-like TNBC (ER − , PR − , and HER2 − ), and other subtypes (Goldhirsch et al., 2013). The treatment of BRCA is multidisciplinary and includes local surgery excision, radiation therapy, endocrine therapy, and other chemotherapies, such as anti-HER2 therapy. Consequently, the classification of BRCA subtypes before deciding on a treatment strategy is critical for BRCA patients. For patients with ER + or PR + BRCA, endocrine therapy is considered the most effective way to cure cancer. However, for patients with ER − , PR − , and HER2 − BRCA, traditional endocrine therapy seems to lack efficacy. Thus, screening of the ER or PR state is usually the first step in the clinic toward choosing the treatment method for BRCA patients (Harbeck et al., 2019). In recent years, with the development and application of comprehensive therapies in the clinic, the overall survival (OS) rates for BRCA patients have increased, and the 5-year survival rates have improved to some extent (local stage, >96%; regional stage, >81%; distant stage, >26%) (DeSantis et al., 2017). However, the prognosis of BRCA patients is primarily related to the molecular subtypes, and almost all patients who develop metastatic disease will succumb to it. Thus, it is important to search for novel prognostic biomarkers to predict the response rates to individual therapy of patients with different clinical characteristics or distinct molecular subtypes.
The immune system has been considered a determining factor during cancer initiation and progression (Gentles et al., 2015). Emerging evidence suggests that the immune system plays a crucial role in the regulation of response to therapy and long-term outcomes of patients with BRCA (Savas et al., 2016). Furthermore, oncology and immunology are interwoven, especially in the selection of tumor therapy and prognosis prediction. Thus, the development of oncoimmunology has revealed that tumor-infiltrating lymphocytes (TILs), which are immune cells that infiltrate tumor tissues and increase the expression of immune-related genes, are closely related to the better survival of patients with specific subtypes of BRCA (Gooden et al., 2011;Karn et al., 2011). TILs have been declared in various types of solid tumors, including BRCA, colon cancer, melanoma, cervical cancer, and lung cancer (Underwood, 1974;Savas et al., 2016). Elevated levels of lymphocytic infiltrate were reported to be associated with HER2 amplification and portended long-term clinical outcomes (Tang et al., 1990). A strong linear relationship between the TIL number in patients with triple-negative breast cancer and the recurrence-free survival endpoints that increased over time has been reported (Loi et al., 2013). A previous study showed that high infiltration of lymphocytes in BRCA tissues might predict the response to neoadjuvant therapy and may also have a significant prognostic value after adjuvant chemotherapy . An increasing number of reports have confirmed that immune cells and immune-related genes have significant prognostic and predictive values. However, little is known about the genomic features driving high or low immune infiltration in BRCA; thus, it is of great value to better understand the interaction of BRCA and the immune system and to discover more potential immuno-oncological prognostic and predictive markers.
Owing to technique limitations, an accurate assessment of TILs has not been achieved; for instance, tissue-based methods, including flow cytometry and immunohistochemistry, cannot be applied to the high-throughput examination of multiple markers and large number of samples simultaneously. Progress in the field of single-cell genomes and transcriptomes has provided us with gene expression data to assess immune-related classifications and their associations with tumor cells (Charoentong et al., 2017;Liu and Mardis, 2017). The development of bioinformatics techniques enables the integration of gene expression data into biological data; numerous emerging databases offer large-scale data with gene expression, clinical information, and biological characteristics; and various bioinformatics tools provide opportunities to merge different resources derived from different studies. For example, Li et al. (2016) created the web resource database TIMER to evaluate the clinical impact of immune cells in diverse cancers. The ImmPort database was put in place to function as a critical repository for immunology-related clinical and molecular data (Bhattacharya et al., 2014). With all these approaches, researchers have begun to quantify TILs using the expression values of immune-related genes and identify individualized immune-related signatures in the tumor prognosis of various types of cancers, including melanoma, lung cancer, glioblastoma, and BRCA (Cheng et al., 2016;Li et al., 2019;Song et al., 2019;Yang et al., 2020). Cheng et al. (2016) identified eight genes (FOXO3, IL6, IL10, ZBTB16, CCL18, AIMP1, FCGR2B, and MMP9) with the highest prognostic value in glioblastoma using The Cancer Genome Atlas (TCGA) database. Li et al. (2019) identified four immune-related genes (APOD, CXCL14, IL33, and LIFR) correlated with BRCA prognosis.
In this study, we focused on investigating the role of immune-related genes that are differentially expressed in BRCA tissues compared with healthy tissues and on exploring a model composed of immune-related genes to predict the prognosis of patients with BRCA. With this goal, we combined multiperspective databases, such as TCGA, ImmPort, TIMER, and Cistrome, with multidimensional analysis methods, such as differential analysis, univariate or multivariate Cox analysis, risk-score model construction, survival analysis, receiver operating characteristic (ROC) curve analysis, prognosis verification, and correlation analysis. As a result, we found a prognostic risk-score model for BRCA patients who contained 15 prognostic immune-related differentially expressed genes (DEGs) (PIRDEGs) in BRCA. We constructed a regulatory network of transcription factors (TFs) and IRDEGs in BRCA. We verified the application value of the risk-score model and analyzed the correlation between the risk score and the number of immune cells in BRCA tissues. Overall, our study discovered a prognostic signature model with relatively high application potential in BRCA. The workflow is shown in Figure 1.

Identification of Differentially Expressed Genes and Prognostic Immune-Related Differentially Expressed Genes in Breast Cancer
A cohort of 1,222 samples, consisting of 1,109 BRCA patients and 113 normal individuals from the TCGA database, was used to identify the DEGs between BRCA tissues and normal tissues. We downloaded the RNA-Seq data from TCGA and genes that met the criteria: |log2-fold change| > 1 and false discovery rate (fdr) < 0.05 were defined as DEGs. As shown in Supplementary Table 1, we finally identified 4,575 DEGs, of which 2,698 were upregulated and 1,877 were downregulated (Supplementary Figures 1A,B). Then, to identify the immune-related genes that play a regulatory role in BRCA, we searched and downloaded all 2,498 immune-related genes from the immunological database ImmPort (Supplementary Table 2). We matched these immune-related genes with DEGs in BRCA, and then we obtained 366 immune-related genes that also belonged to the set of DEGs in BRCA and their detailed expression pattern (Figure 2A and Supplementary Figure 2A and Supplementary Table 3).

The Regulatory Network Between Transcription Factors and Prognostic Immune-Related Differentially Expressed Genes
As key factors regulating gene expression, TFs function as gene transcription accelerators or inhibitors by modulating the transcription of downstream genes. To reveal the crucial mechanisms underlying the transcriptional regulation of immune-related genes in BRCA, we constructed a transcriptional regulatory network of immune-related genes in BRCA. First, 318 tumor-related TFs were downloaded from the Cistrome Cancer database and were matched with the identified DEGs in BRCA (Supplementary Table 4). There were 80 tumor-related TFs among the DEGs ( Figure 2B and Supplementary Table 5 and Supplementary Figure 2B). To construct a model, we randomly classified the 1,013 BRCA samples in TCGA (96 samples were  Frontiers in Genetics | www.frontiersin.org excluded owing to the lack of enough clinical information) into two groups: a training set (n = 508) and a testing set (n = 505). To discover which immune-related genes affected the prognosis of BRCA patients, we subjected the expression values of all 366 immune-related DEGs to a univariate Cox proportional hazards regression analysis in the training set and identified 41 PIRDEGs that strongly correlated with patient OS (p < 0.05). Twenty of the 41 PIRDEGs were high-risk genes, and the other 21 PIRDEGs were low-risk genes for OS in BRCA patients ( Figure 2C). All 41 PIRDEGs and 80 TFs among DEGs were used to construct the transcriptional regulatory network of PIRDEGs via Pearson's correlation analysis. To find strong correlations between TFs and PIRDEGs, we set the correlation analysis parameter filters as |correlation coefficient >0.4| and p < 0.001. As shown in Figure 3 and Supplementary Tables 6, 11 TFs among DEGs and 19 PIRDEGs constituted the transcriptional regulatory network of PIRDEGs. In this network, almost all of the TFs played a positive regulatory role, and only the regulatory relationship between TEAD1 and PSME2 was negative. In this network, the correlation degree was indicated using the color of the edge lines: the light red lines indicated that the correlation coefficient was between 0.4 and 0.6; the red lines indicated that the correlation coefficient was between 0.6 and 0.8; and the dark red lines indicated a correlation coefficient higher than 0.8.

An Immune Gene Signature Model Predicted the Overall Survival of Breast Cancer Patients
As described above, we identified 41 PIRDEGs using univariable Cox regression analysis. We used the least absolute shrinkage and selection operator (LASSO) Cox regression model to select the most useful model, and a gene model above the minimum deviance with 25 genes was identified (Figures 4A,B and Supplementary Table 7). Next, we further performed a stepwise Cox proportional hazards regression model and finally obtained a model consisting of 15 PIRDEGs -PSME2, TINAGL1, MMP9, CSRP1, ROBO3, IGHE, SEMA6D, ADM, FGF7, SCG2, TSLP, FGFR4, GHR, SSTR1, and TNFRSF8 ( Figure 4C and Table 1) -among which PSME2, CSRP1, TSLP, and TNFRSF8 with negative coefficients were low-risk PIRDEGs and TINAGL1, MMP9, ROBO3, IGHE, SEMA6D, ADM, FGF7, SCG2, FGFR4, GHR, and SSTR1 with positive coefficients were high-risk PIRDEGs. The high-risk IRDEGs negatively correlated with the prognosis of BRCA patients, while the low-risk IRDEGs positively correlated with the prognosis of BRCA patients.  Table 8).

Validation of the Immune Gene Signature for Survival Prediction
To verify whether this risk-score model could precisely predict the prognosis of BRCA patients, we utilized the training set and testing set to validate the prognosis prediction ability of this model. The cutoff value of the risk score was the median risk score in the training set. On the basis of this parameter, we divided all samples in the training set and the testing set into high-risk groups or low-risk groups. Kaplan-Meier survival analysis was performed. As expected, the high-risk group showed a poorer OS rate than the low-risk group in both the training set and testing set (Figures 5A,B). Next, we examined the predictive performance of this risk-score model for OS using ROC curves, and the results showed that the areas under the ROC curve (AUCs) in the training set and the testing set were 0.905 and 0.708, respectively (Figures 5C,D).
The nomograms of this model in the training set and the testing set are shown in Figures 5E,F, respectively. We then ranked the risk scores of patients and analyzed the distribution of OS status of each patient in the training set and the testing set. As shown in Figures 6A,B, in the upper panel, the red dot plots indicate the high-risk patients, while the green dot plots indicate the low-risk patients; in the lower panel, the pink dot plots indicate patients who were dead, and the cyan dot plots indicate patients who were alive. It was clear that, in the lower panel, the number of pink dot plots increased with the rise of risk scores of patients. The bar plots show the expression pattern of risk genes in patients in the high-and low-risk groups (Figures 6C,D). We performed univariate and multivariate Cox regression analyses to examine whether the immune gene signature was an independent prognostic factor in BRCA patients. We first analyzed the correlations between clinical factors, including age, gender, TNM stage, or risk scores, and OS of BRCA patients in the training group. In the univariate Cox analysis, age, TNM stage, and risk score were independent prognostic factors of BRCA patients (p < 0.05) (Figure 6E). Survival outcomes of tumor patients can be affected by multiple factors; thus, we defined all these clinical features as covariates and performed multivariate Cox analysis. As shown in Figure 6F, the prognostic prediction power of the 11 immune gene signatures was independent of the clinical features (HR = 1.012, 95% CI: 1.007-1.017, p < 0.001).

The Associations of the Immune Gene Signature and Clinical Characteristics in Breast Cancer
We then investigated the correlations between the risk scores derived from this model and the clinical characteristics in BRCA patients. The results indicated that risk scores in the T4 group were higher than those in the T1, T2, and T3 groups (p = 0.048); however, we observed that the sample number of the T4 group was smaller than that of the other groups; thus, this result may need further verification. Additionally, the risk scores showed a gradual increase with increasing lymph node invasion degree ( Figure 7A). We also analyzed the correlation between the specific expression level of immune

The Associations of the Immune Gene Signature and Immune Cell Infiltration
Previous studies reported that immune cell infiltration levels were associated with favorable outcomes of patients (Cheng et al., 2016;Li et al., 2019;Song et al., 2019;Yang et al., 2020). Thus, we analyzed the correlation of risk scores and the numbers of six infiltrating immune cell types in patients with BRCA, namely, B cells, CD4 + T cells, CD8 + T cells, neutrophils, dendritic cells, and macrophages. As shown in Figure 7B, the risk scores of BRCA patients displayed negative correlations with the infiltration of B cells, CD4 + T cells, CD8 + T cells, neutrophils, and dendritic cells in BRCA tissues, while the risk scores of BRCA patients displayed positive correlations with the infiltration of macrophages (p < 0.05).

DISCUSSION
Immune system dysregulation has been documented in many types of cancers. Investigators have reported the activation of immune-related genes in BRCA (Ascierto et al., 2012). An increasing number of studies have emphasized the significance of prognostic biomarkers in predicting cancer outcomes and recommended that more investigators include these biomarkers into therapeutic research (Li et al., 2016). Several studies have suggested that immune-related genes can be used as prognostic biomarkers in BRCA (Gingras et al., 2015;Li et al., 2019). In the past years, most studies about immune-related prognostic biomarkers focused on TILs, and TILs have been identified as prognostic biomarkers in multiple malignancies. High infiltration of CD45RO + (memory) and CD8 + (cytotoxic) TILs (used to calculate the immunoscore) was strongly related to the clinical outcomes of patients with lung, breast, colon, ovarian, prostate, and head and neck cancers (Vano et al., 2018). For instance, it has been reported that the densities and type of CD8 + TILs were correlated with tumor progression without interference from the tumor stage in colorectal cancer. Thus, the immunoscore was considered a valuable prognostic factor (Baxevanis et al., 2019). The core of TILs is immune cells, and related studies have focused on the cell level. However, the definition of cell type, especially an immune cell type, is still dependent on molecular protein markers. Thus, we proposed as the original aim of our study to develop a series of immune-related biomarkers composed of immune genes rather than immune cells. In our study, to identify the molecular biomarkers in BRCA, we first identified the DEGs between BRCA and healthy tissues using public datasets from TCGA. Then, our study aimed to develop immune-related biomarkers; thus, the immune-related database ImmPort was used to identify the immune genes and help us obtain the IRDEGs in BRCA. PIRDEGs were selected using univariate Cox analysis. Transcription factors play a crucial role in controlling the expression of genes by binding to a particular DNA region, such as an enhancer or promoter. TFs were also found to be associated with the progression of glioblastoma (Wei et al., 2015). To construct a regulatory network of immune genes in BRCA, we selected 80 TFs from the DEGs in BRCA on the basis of information from the Cistrome database. Then, we constructed the regulatory network of TFs-PIRDEGs in BRCA on the basis of their correlations. As shown in Figure 3 and Supplementary  Table 6, we found four key TFs -EOMES, FOXP3, FLI1, and STAT1 -which had the most downstream PIRDEGs and relatively high correlation coefficients. Among the 41 PIRDEGs in this regulatory network, 21 PIRDEGs were the downstream genes of EOMES, and the correlation coefficient between EOMES and CXCR3 was up to 0.86. EOMES has been demonstrated to be involved in the differentiation of CD8 + T cells during the immune response by regulating the expression of lytic effector genes (Atreya et al., 2007) and to be an independent good prognostic factor for progression-free survival and OS, likely due to its association with a favorable immune signature in metastatic renal cell cancer patients (Dielmann et al., 2016). FOXP3 is an essential TF in maintaining immune system hemostasis by regulating Treg lineage function, and it has been described to be a strong prognostic factor for distant metastasis-free survival of BRCA patients (Merlo et al., 2009). In the TF-PIRDEG network, we found many T-cell receptor-coding genes, including T-cell receptor beta variable and constant genes (TRBV5-6, TRBV28, TRBV18, TRBV12-3, and TRBC2).
We constructed a prognostic risk-score model using LASSO Cox regression analysis and obtained an immune gene signature to predict the OS of BRCA patients. As shown in Table 1, four low-risk IRDEGs and 11 high-risk IRDEGs constituted this risk-score model, and each patient was assigned a risk score based on this model. One of the immune genes in the signature and a high-risk factor in this model, FGF7, had the highest coefficient at 0.14. FGF7 is secreted by mesenchymal origin cells and acts as a paracrine cytokine targeting nearby cells via locally secreted signals, thus participating in immune reactions during tumor progression or inhibition (Finch and Rubin, 2006). In urothelial carcinoma of the upper urinary tract and bladder, FGF7 overexpression is an independent prognosticator (Fan et al., 2015). TSLP, TINAGL1, TNFRSF8, and PSME2 in this model exerted negative correlations with the OS of BRCA patients. TSLP is a member of cytokine resembling cytokine IL-7 and has recently been reported to be a regulator of type 2 inflammatory responses (Sims et al., 2000). The function of TSLP in various types of tumors is tumor context dependent, and its role in BRCA is also controversial. One study revealed that an IL-1β-TSLP pathway could blunt antitumor immunity. In contrast, another study proposed its pro-tumor role with the evidence that the keratinocyte-specific overexpression of TSLP (in K14-TSLP transgenic mice) could inhibit the development of early BRCA (Corren and Ziegler, 2019). In our study, TSLP showed a protective effect on the OS of BRCA patients. PSME2 is a subunit of proteasome activator PA28 participating in the generation of class I binding peptides such as antigens from viruses or tumors (Sijts et al., 2002).
All these prognostic signatures identified in our study are immune system regulators and play a significant role in the cellular immune reaction under different situations. By combining them, we obtained a prognosis prediction model. The efficacy of this model was verified in the training set and the testing set simultaneously, the OS rates of high-and low-risk-score groups of BRCA patients showed a significant difference, and the number of dead BRCA patients increased with the increase in risk scores of BRCA patients. Both univariate and multivariate Cox analyses revealed that the risk score derived from this model was an independent prognostic factor for BRCA patients. Early studies identified TILs in BRCA that mainly comprised CD8 + T cells, CD4 + T cells, CD19 + B cells, and rare NK cells (Savas et al., 2016). In our study, we found that the risk scores showed negative correlations with the numbers of B cells, CD8 + T cells, CD4 + T cells, and neutrophil cells. The prognostic value of TILs is somewhat debatable; for instance, in most human cancers, increased densities of CD3 + , CD8 + , and memory CD45RO + T cells are associated with favorable prognosis. However, the prognostic value of these cells may be influenced by other immune cells residing in the same tumor (Baxevanis et al., 2019). Hence, we may conclude that infiltrating immune cells as prognostic biomarkers in tumor patients are, to a large extent, dependent on the comprehensive assessment of various types of immune cells.
In our study, we focused on the gene level rather than the cell level by screening PIRDEGs using bioinformatics analysis in BRCA, and the TF-PIRDEG network was used to identify crucial TFs in the regulation of the transcription level of immune-related genes in BRCA. A risk-score model consisting of immune genes was built, and the efficacy of this model was examined using extensive data from real samples. However, we cannot exclude the possibility that other factors, such as age and sex, may introduce bias in the practical values of the identified genes. In addition, the data types and the random classification of the training group and test group may lead to different models. For instance, the public database TCGA offers researchers various data types, including raw gene counts, FPKM and FPKM-UQ. We used the FPKM files in our study, and the other two file types can also be used for analysis. In our study, we randomly evenly classified the TCGA samples into two groups, which is not the only way to categorize. We considered the different genes and prognostic models that may be found in other studies. Additionally, the different construction methods of the prognostic model and various algorithms that were used during the model construction may also lead to other models harboring equal or higher prediction value. Furthermore, our study was completely dependent on the reanalysis of public data, which was appropriate but lacked validation using clinical samples. We thought it would be more convincing if we added another test group using independent clinical samples, such as examination of the expression at the transcriptional level of the identified genes in our model using RT-PCR or Western blot. Then, validating the regulatory network of TFs and these genes and the molecular mechanisms of tumor immunity may also provide investigators with more accurate and convincing evidence for future application of such regulatory networks in the clinic.

Data Collection and Preprocessing
The transcriptome profiles (HTSeq-FPKM) of 1,222 samples, consisting of 1,109 breast tissues of patients with BRCA and 113 adjacent tissues, and corresponding clinical information were downloaded from the TCGA database in December 2019 1 . Expression data or clinical information of each sample were combined into corresponding matrix files using Perl language 2 . Ensemble IDs in the expression matrix profile were also converted into Gene Symbols with Perl language. ID numbers in the expression profile and clinical information profile were matched, and samples whose ID numbers did not match were excluded from our study. Finally, we obtained 1,097 BRCA cases for subsequent analysis. Immune-related genes were acquired from the ImmPort database 3 . Tumor-related TFs were obtained from the Cistrome Cancer database 4 . The immune infiltrate scores of samples in the TCGA database were obtained from the TIMER database 5 .

Screening of Immune-Related or Transcription Factor-Related Differentially Expressed Genes
The RNA-Seq profiles from TCGA were transformed using log2, and the DEGs were identified using the limma package in R software. The parameters for DEG selection were as follows: |log2-fold change| > 1 and false discovery rate <0.05. Volcano plots and heatmaps were drawn using the limma or pheatmap package in R. The genes overlapping between DEGs and immune-related genes were defined as immune-related DEGs. The genes overlapping between DEGs and tumor-related TFs were defined as TF DEGs.

Survival Analysis and Risk-Score Model Establishment Using Cox Regression Analysis
A total of 366 immune-related DEGs were subjected to univariate Cox proportional hazard regression analysis to identify PIRDEGs. PIRDEGs with statistical significance in the univariate Cox regression analysis were tested in the LASSO Cox regression analysis to generate a coefficient of each immune-related DEG.
For the Cox analysis, the cox.zph function in the survival package in R was used to guarantee that the proportional hazards assumption is appropriate. A risk-score model was built using the formula Risk Score (RS) = n i = 1 (Expi * Coei), where n is the number of IRDEGs; Expi is the expression value of each PIRDEG; and Coei is the estimated regression coefficient of each PIRDEG derived from the multivariate Cox regression analysis. Based on this formula, each patient can be assigned a risk score, which is a linear combination of the expression value of the selected PIRDEGs weighted by their coefficients.

Construction of the Transcription Factor-Prognostic Immune-Related Differentially Expressed Gene Regulatory Network
The coexpression relationship between TF DEGs and PIRDEGs was analyzed using the corrplot package in R software, and Pearson's correlation was calculated. The selection criteria of the correlation coefficient were as follows: |cor| > 0.4, p < 0.001. The network was visualized using Cytoscape software.

Value Assessment of the Risk-Score Model
Relying on the risk score of each patient from the RS formula, we first selected the median risk score of the training set as the cutoff and divided all BRCA patients of the training set or testing set into two groups: the high-risk group and the low-risk group. The survival analysis between these two groups was conducted using the survival package in R software. The accuracy of our risk-score model to predict prognosis was evaluated using the ROC curve, and the ROC curve was calculated and generated using the survivalROC package in R software. Univariate or multivariate Cox analyses were used to test whether the risk score of our model is an independent prognostic factor in BRCA. Cox analysis was performed, and forest graphs were generated using the survival package in R.

Correlation Analysis of Clinical and Immune Characteristics
The correlations between risk scores or gene expression value in the risk-score model and clinical characteristics, including tumor stages, tumor grades, and lymph node metastasis, were performed using the beeswarm package in R. The correlation between risk scores and immune infiltrate scores was assessed using the corrplot package in R software.

Statistical Analysis
All statistical analyses were performed using R software. The Wilcox test in R was applied to identify differences between the normal group and the tumor group. The two-sided log-rank test in the survival package in R was employed to assess the survival difference between the high-risk and low-risk groups, and DEG multivariate analyses were conducted using the Cox proportional hazards regression model. p < 0.05 was considered statistically significant.