A degradome-based prognostic signature that correlates with immune infiltration and tumor mutation burden in breast cancer

Introduction Female breast cancer is the most common malignancy worldwide, with a high disease burden. The degradome is the most abundant class of cellular enzymes that play an essential role in regulating cellular activity. Dysregulation of the degradome may disrupt cellular homeostasis and trigger carcinogenesis. Thus we attempted to understand the prognostic role of degradome in breast cancer by means of establishing a prognostic signature based on degradome-related genes (DRGs) and assessed its clinical utility in multiple dimensions. Methods A total of 625 DRGs were obtained for analysis. Transcriptome data and clinical information of patients with breast cancer from TCGA-BRCA, METABRIC and GSE96058 were collected. NetworkAnalyst and cBioPortal were also utilized for analysis. LASSO regression analysis was employed to construct the degradome signature. Investigations of the degradome signature concerning clinical association, functional characterization, mutation landscape, immune infiltration, immune checkpoint expression and drug priority were orchestrated. Cell phenotype assays including colony formation, CCK8, transwell and wound healing were conducted in MCF-7 and MDA-MB-435S breast cancer cell lines, respectively. Results A 10-gene signature was developed and verified as an independent prognostic predictor combined with other clinicopathological parameters in breast cancer. The prognostic nomogram based on risk score (calculated based on the degradome signature) showed favourable capability in survival prediction and advantage in clinical benefit. High risk scores were associated with a higher degree of clinicopathological events (T4 stage and HER2-positive) and mutation frequency. Regulation of toll-like receptors and several cell cycle promoting activities were upregulated in the high-risk group. PIK3CA and TP53 mutations were dominant in the low- and high-risk groups, respectively. 
A significant positive correlation was observed between the risk score and tumor mutation burden. The infiltration levels of immune cells and the expression of immune checkpoints were significantly associated with the risk score. Additionally, the degradome signature adequately predicted the survival of patients undergoing endocrinotherapy or radiotherapy. Patients in the low-risk group may achieve complete response after the first round of chemotherapy with cyclophosphamide and docetaxel, whereas patients in the high-risk group may benefit from 5-fluorouracil. Several regulators of the PI3K/AKT/mTOR signaling pathway and the CDK family/PARP family were identified as potential molecular targets in the low- and high-risk groups, respectively. In vitro experiments further revealed that the knockdown of ABHD12 and USP41 significantly inhibited the proliferation, invasion and migration of breast cancer cells. Conclusion Multidimensional evaluation verified the clinical utility of the degradome signature in predicting prognosis, risk stratification and guiding treatment for patients with breast cancer.


Introduction
According to GLOBOCAN 2020 statistics, female breast cancer is the most common malignancy worldwide, with approximately 2.3 million new cases reported annually, and is the fifth leading cause of cancer-related death (6.9%), thus imposing a huge disease burden worldwide (1). Breast cancer originates from mammary gland epithelial cells. The most common types of breast cancer are infiltrating ductal carcinoma (IDC) and infiltrating lobular carcinoma (ILC). Other types include ductal carcinoma in situ (DCIS), lobular carcinoma in situ (LCIS), mucinous carcinoma and medullary carcinoma. Breast cancer is a major focus of anti-cancer research. Since the beginning of the 21st century, remarkable progress has been achieved in the treatment of breast cancer. Some patients with breast cancer benefit significantly from targeted therapy and immune checkpoint blockade (2). Despite this progress, several challenges remain unresolved, such as chemotherapy resistance, undetected distant metastasis, the treatment of triple-negative breast cancer (TNBC) and the lack of sufficient molecular targets. More importantly, the overall prognosis of breast cancer remains unsatisfactory owing to individual heterogeneity (3). Therefore, developing accurate strategies for predicting prognosis is necessary for improving clinical management. Machine learning and bioinformatics are now widely used to develop robust models for diverse clinical endpoints (4-8). Compared with single indicators such as a clinicopathological parameter or the expression of a single gene, an integrated signature comprising several pivotal features appears to be a more robust tool for predicting prognosis. Therefore, developing and applying a reliable signature to predict the prognosis of patients with breast cancer may support clinical decision making aimed at improving survival.
The degradome is a repertoire of all proteases expressed in an organism, with over 550 protease-coding genes being identified in the human genome (9). It is preliminarily divided into five clusters according to catalytic sites: aspartyl proteases, cysteine proteases, serine proteases, threonine proteases and metalloproteases (9). The degradome provides a different insight into the functional dysregulation of cancer cells caused by protein/peptide degradation and modification, which may help to elucidate oncogenic mechanisms and identify potential drug targets. Structural alteration or aberrant expression of degradome-related genes (DRGs) has been associated with diverse human diseases, including neurodegenerative disorders (10), cardiovascular diseases (11), musculoskeletal diseases (12), bowel diseases (13) and, particularly, malignancy. Numerous DRGs have been associated with the phenotype of breast cancer. DNPEP (an aspartyl protease) is sponged by PAK5 in breast cancer cells, and its overexpression attenuates cell proliferation and invasion in vitro and suppresses tumour growth and metastasis in vivo (14). USP4 (a cysteine protease) has been identified as the downstream target of DNPEP in the PAK5/DNPEP/USP4 regulatory axis in breast cancer. High USP4 expression is associated with the poor prognosis of patients with breast cancer (14). The expression of TMPRSS13 (a transmembrane serine protease) is elevated in IDC tissues. Silencing TMPRSS13 can significantly suppress breast cancer progression both in vitro and in vivo by decreasing proliferation, enhancing apoptosis and inhibiting invasion, resulting in the inhibition of overall tumour burden and deficiency of detectable tumour growth (15). Additionally, TMPRSS13 knockdown sensitizes aggressive TNBC cells to chemotherapy agents in vitro. PRSS8 (a serine protease) accumulation mediated by TMPRSS13 knockdown is a potential tumour-suppressive mechanism (15). 
TASP1 (a threonine protease) plays an essential role in both normal mammary gland development and breast cancer progression (16). TASP1 knockdown reduces the expression of cyclins E and A in vivo, thereby blocking carcinogenesis. Mixed-lineage leukaemia has been identified as a major substrate of TASP1 and is required for the development of HER2-positive breast cancer in vitro (16). Matrix metalloproteases (MMPs) are well-known members of the metalloprotease class. MMPs may promote breast cancer progression by remodeling the tumor microenvironment (17). Therefore, constructing a comprehensive prognostic signature based on DRGs may help to clarify their prognostic value in breast cancer more broadly.
In this study, we developed and validated a prognostic signature based on DRGs. The degradome signature was investigated with respect to clinical association, functional characterization, mutation landscape, immune infiltration and immune checkpoint expression. Additionally, the clinical utility of the signature in predicting the prognosis of patients undergoing different therapies was analyzed, and potential drugs for chemotherapy and molecular targeted therapy were suggested for the different risk groups. In vitro experiments further confirmed the molecular functions of two DRGs (ABHD12 and USP41).

Identification of degradome-related genes
DRGs were selected from The Mammalian Degradome Database (degradome.uniovi.es/dindex.html) (18). The DESeq2 R package was used to screen for differentially expressed genes (DEGs) between BRCA and normal samples in TCGA cohort. Genes with |log2FC| > 1 and P-value < 0.05 were considered DEGs. The survival R package was used to screen for prognosis-related genes (PRGs) significantly correlated with overall survival (OS) in TCGA cohort. The DRGs, DEGs and PRGs were intersected, and 22 overlapping DRGs were selected for subsequent analysis. The expression pattern of the 22 DRGs and the correlation among them were analyzed. The clusterProfiler and org.Hs.eg.db R packages were used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analyses of the 22 DRGs. In addition, the ConsensusClusterPlus R package was used to perform consensus clustering, verifying the consistency of the 22 DRGs by dissecting subtypes in TCGA cohort.
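The filtering and intersection step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used R packages); the function names, the gene labels `GENE_A` to `GENE_D` and the toy statistics are hypothetical, and only the thresholds (|log2FC| > 1, P-value < 0.05) are taken from the text.

```python
from typing import Dict, Iterable, Set, Tuple


def select_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,
    p_cutoff: float = 0.05,
) -> Set[str]:
    """Keep genes whose |log2FC| exceeds lfc_cutoff and whose P-value is below p_cutoff.

    `stats` maps a gene name to (log2 fold change, P-value).
    """
    return {
        gene
        for gene, (lfc, p) in stats.items()
        if abs(lfc) > lfc_cutoff and p < p_cutoff
    }


def overlap(drgs: Iterable[str], degs: Iterable[str], prgs: Iterable[str]) -> Set[str]:
    """Return genes present in all three lists (DRGs, DEGs and PRGs)."""
    return set(drgs) & set(degs) & set(prgs)


# Toy input with hypothetical gene labels (not data from the study).
toy_stats = {
    "GENE_A": (2.3, 0.001),   # passes both thresholds
    "GENE_B": (-1.8, 0.020),  # passes (downregulated)
    "GENE_C": (0.4, 0.001),   # fails the fold-change threshold
    "GENE_D": (3.0, 0.200),   # fails the P-value threshold
}
toy_degs = select_degs(toy_stats)
toy_overlap = overlap(["GENE_A", "GENE_B", "GENE_C"], toy_degs, ["GENE_B", "GENE_D"])
print(sorted(toy_degs), sorted(toy_overlap))
```

Using sets keeps the intersection independent of gene order and removes duplicated identifiers before counting the overlap.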

Construction of degradome-based prognostic signature
TCGA cohort was used as the training cohort to construct a degradome-based prognostic signature. Univariate and multivariate Cox regression analyses were performed to identify independent prognostic predictors from the 22 DRGs. Thereafter, LASSO regression analysis was employed to construct a prognostic signature based on the 22 DRGs. The risk score was calculated as follows: Risk score = $\sum_{i} (C_i \times E_i)$, where $i$ indexes the DRGs in the signature, $C_i$ is the coefficient of DRG $i$ and $E_i$ is the expression level of DRG $i$. Patients were divided into the low- and high-risk groups according to the median risk score. The survminer and survival R packages were used to compare patient survival between the two groups. The timeROC R package was used to plot receiver operating characteristic (ROC) curves to assess the predictive ability of the degradome signature.
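As a rough illustration of the risk score formula and the median split, the following Python sketch computes the weighted sum of expression values for each patient and assigns risk groups. The coefficients, gene labels and patient identifiers are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median
from typing import Dict


def risk_score(coefs: Dict[str, float], expr: Dict[str, float]) -> float:
    """Risk score = sum over signature genes of coefficient * expression."""
    return sum(c * expr[gene] for gene, c in coefs.items())


def split_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each patient 'high' if the score is above the cohort median, else 'low'.

    Scores equal to the median are labeled 'low' (an assumption).
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values (not from the study).
toy_coefs = {"GENE_A": 0.5, "GENE_B": -0.2}
toy_expr = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 3.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 0.0, "GENE_B": 1.0},
}
toy_scores = {pid: risk_score(toy_coefs, e) for pid, e in toy_expr.items()}
toy_groups = split_by_median(toy_scores)
print(toy_scores, toy_groups)
```

A positive coefficient raises the score as expression increases (a risk factor), whereas a negative coefficient lowers it (a protective factor).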

Validation of the degradome signature
The GSE96058 dataset was used for external validation. Three cohorts from TCGA were selected for internal validation: pathological-stage-III, ER-positive and HER2-positive cohorts. The degradome signature was tested in each cohort. Kaplan-Meier (K-M) survival curves and ROC curves were plotted for each cohort. Principal component analysis (PCA) was further used to differentiate the low- and high-risk groups in both TCGA and GSE96058 cohorts. In addition, we extracted 18 clinical subgroups from TCGA cohort to assess the applicability of the degradome signature more broadly.
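For readers unfamiliar with K-M curves, the product-limit estimator behind them can be sketched in a few lines of Python. This is a generic textbook implementation written for illustration, not the code used in the study (which relied on the survival and survminer R packages); the function name and the toy follow-up data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy follow-up data: the second patient is censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(toy_curve)
```

Censored patients contribute to the number at risk until their censoring time but never count as events, which is why the curve does not drop at time 2 in the toy example.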

Clinicopathological differences between the two risk groups
The clinicopathological parameters (11 parameters) and the complete response of patients undergoing different therapies (chemotherapy, endocrinotherapy and radiotherapy) were compared between the low- and high-risk groups. K-M survival curves were plotted to compare disease-specific survival (DSS), disease-free interval (DFI) and progression-free interval (PFI) between the low- and high-risk groups.

Development of degradome-related clinicopathological nomogram
Univariate and multivariate Cox regression analyses were performed based on 11 clinicopathological parameters and risk score. Characteristics with P-value < 0.05 from the multivariate Cox regression analysis were further used to develop a nomogram to predict OS. The predictive accuracy of the nomogram was verified based on the concordance index (C-index) and calibration curves. Besides, decision curve analysis (DCA) was performed to assess the advantage in clinical benefit of the nomogram compared with traditional pathological stage.
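The C-index used to assess the nomogram can be illustrated with a short Python sketch of Harrell's concordance index. This is a generic, simplified version (pairs with tied survival times are skipped) and is not the authors' implementation; the function name and toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], scores: Sequence[float]
) -> float:
    """Harrell's C-index for risk scores where a higher score means a higher risk.

    A pair (i, j) is comparable when patient i had an observed event before
    time j. It is concordant when patient i also has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the shortest survivors have the highest scores.
toy_cindex = concordance_index([1, 2, 3, 4], [1, 1, 1, 0], [4.0, 3.0, 2.0, 1.0])
print(toy_cindex)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the C-index reported for the nomogram.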

Functional characterisation of DEGs in the two risk groups
DEGs between the low- and high-risk groups were identified and named new DEGs (nDEGs). NetworkAnalyst (http://www.networkanalyst.ca) was employed to build a protein-protein interaction (PPI) network based on nDEGs. The nDEGs were functionally characterized via GO and KEGG analyses based on a pre-determined |log2FC| threshold. Gene set enrichment analysis (GSEA) was performed to identify the significantly enriched functional pathways in the two risk groups. In addition, the DRGs and nDEGs were intersected to compare the degradome expression pattern between the low- and high-risk groups.

Mutation landscapes of the two risk groups
The Breast Invasive Carcinoma dataset (TCGA, PanCancer Atlas, 1084 samples) in the cBioPortal for Cancer Genomics database (http://www.cbioportal.org) was used for subsequent analysis. The mutation landscapes of the whole cohort, the low-risk group and the high-risk group were extracted separately. The top 10 most frequently altered genes and DRGs in the two risk groups were determined. Differences in the fraction of genome altered, mutation counts, microsatellite instability (MSI) and tumor mutation burden (TMB) were analyzed between the low- and high-risk groups. Samples were divided into the low- and high-TMB groups according to the median TMB value. Subsequently, differences in survival were analyzed between the low- and high-TMB groups with or without the consideration of risk score. Additionally, the correlation between risk score and TMB was analyzed.
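The combined stratification by TMB and risk score can be sketched as follows. This Python example only illustrates the grouping logic; the function name, the patient identifiers and the TMB values are hypothetical, and assigning values equal to the median to the low-TMB group is an assumption.

```python
from statistics import median
from typing import Dict


def combined_groups(tmb: Dict[str, float], risk_group: Dict[str, str]) -> Dict[str, str]:
    """Label each sample with its TMB group (median split) and its risk group.

    `risk_group` maps a sample to 'low' or 'high'. TMB values equal to the
    median are labeled 'low' (an assumption).
    """
    cutoff = median(tmb.values())
    labels = {}
    for sample, value in tmb.items():
        tmb_label = "high" if value > cutoff else "low"
        labels[sample] = f"{tmb_label}-TMB/{risk_group[sample]}-risk"
    return labels


# Hypothetical TMB values and risk labels (not data from the study).
toy_tmb = {"S1": 0.5, "S2": 1.2, "S3": 3.4, "S4": 0.9}
toy_risk = {"S1": "high", "S2": "low", "S3": "high", "S4": "low"}
toy_labels = combined_groups(toy_tmb, toy_risk)
print(toy_labels)
```

The four resulting strata can then be compared with K-M curves, as done for the combined TMB and risk score analysis.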

Differences in immune infiltration and immune checkpoint expression between the two risk groups
The GEPIA2021 database (gepia2021.cancer-pku.cn) was used to examine the correlation among the infiltration of 22 types of immune cells in the tumour microenvironment (TME) of breast cancer. The CIBERSORT algorithm was used to evaluate the infiltration levels of 22 types of immune cells. The correlation between risk score and the 22 types of infiltrating immune cells was analyzed, and the infiltration levels were also compared between the low- and high-risk groups. Furthermore, the expression pattern of 47 immune checkpoints was compared between the two risk groups to assess the potential value of the degradome signature in immunotherapy.

Prognostic prediction of patients undergoing different therapies
The survival of patients undergoing different therapies was compared between the low- and high-risk groups in TCGA and METABRIC cohorts. Potential drugs that may result in complete response to chemotherapy were identified in the two risk groups. Additionally, the expression pattern of 21 potential molecular targets from the PI3K/AKT/mTOR signaling pathway, CDK family and PARP family was examined to identify targets for molecular targeted therapy in the different risk groups.

Drug sensitivity analysis
Using the pRRophetic R package, we performed a broad drug screen based on the GDSC database (https://www.sanger.ac.uk/tool/gdscgenomics-drug-sensitivity-cancer) to identify drugs to which each of the two risk groups may be sensitive.

Cell culture and siRNA transfection
Two human breast cancer cell lines, MCF-7 and MDA-MB-435S, were purchased from Wuhan Procell Life Science and Technology Co., Ltd. (Wuhan, China) for parallel comparison. Cells were cultured in RPMI-1640 (Gibco-BRL) supplemented with 10% foetal bovine serum (Bioserum), 100 U/mL penicillin G and 100 µg/mL streptomycin. siRNAs targeting ABHD12 and USP41 were purchased from GeneChem (Genechem Co., Ltd, Shanghai, China). The siRNAs were transfected into MCF-7 and MDA-MB-435S cells following the recommended guidelines. For each gene (ABHD12 or USP41), three groups were formed: normal control (NC), siRNA1 and siRNA2. The siRNA sequences are provided in Supplementary Material.

Cell phenotype assays
Colony formation and CCK8 assays were performed to assess the proliferation of breast cancer cells. Cells from different groups were digested and inoculated in 6-well plates (Jet Biofilter Co., Ltd., Guangzhou, China), with 1,000 cells per well. The medium was changed every 3 days, and the cells were cultured for 10-14 days. After visible colonies had formed, they were fixed with 4% paraformaldehyde and stained with crystal violet (Solarbio Life Sciences, China). CCK8 assay (Dojindo, Tokyo, Japan) was performed according to the manufacturer's instructions. The absorbance was measured at 450 nm using a microplate reader. Wound healing and transwell assays were performed to assess the invasive and migratory abilities of breast cancer cells. Cells from different groups were digested and inoculated in 6-well plates. After the cells had reached 95% confluence, a straight scratch was made on the surface of each well with the tip of a 100-µL sterile pipette. The wound area was photographed using an inverted microscope (Nikon DS-RI2, Japan) at 100× magnification at 0, 12 and 24 h. For transwell assay, 800 µL of a medium containing 10% foetal bovine serum (Corning, USA) was added to the lower chamber, and 200 µL of a serum-free medium containing 20,000 cells was added to the upper Matrigel-coated chamber. After 24 h of incubation, cells that had crossed the membrane were fixed with 4% paraformaldehyde, washed with phosphate-buffered saline, stained with crystal violet and photographed using an inverted microscope at 200× magnification.

Statistical analysis
All statistical analyses were performed using R 4.0.3. K-M survival curves were compared via Cox regression analysis. The Wilcoxon rank sum test was used to compare gene expression between groups, and the chi-square test was used to compare the differences in clinicopathological parameters between risk groups. Spearman correlation coefficients were calculated for correlation analysis. Correlations with |r| > 0.1 were considered relevant, and P-values < 0.05 were considered statistically significant. '*' indicates P-value < 0.05, '**' indicates P-value < 0.01 and '***' indicates P-value < 0.001 throughout this study.
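The Spearman coefficient and the significance notation defined above can be illustrated with a small Python sketch. It is a generic implementation (Pearson correlation of average ranks) written for illustration rather than the R routines used in the study; the function names are hypothetical, and only the P-value thresholds for the asterisks are taken from the text.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def stars(p_value: float) -> str:
    """Map a P-value to the asterisk notation used throughout the paper."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


print(spearman([1, 2, 3, 4], [10, 20, 30, 45]), stars(0.03))
```

Because the coefficient depends only on ranks, it is unaffected by monotonic transformations of the variables being correlated.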

22 DRGs correlated with prognosis were demarcated
A total of 625 DRGs were selected from The Mammalian Degradome Database, including 24 aspartyl proteases, 169 cysteine proteases, 201 serine proteases, 28 threonine proteases and 203 metalloproteases. Meanwhile, a total of 5068 DEGs and 1596 PRGs were identified. These three gene sets were intersected (Figure 1A), and the expression patterns of the 22 overlapping DRGs were verified (Figure 1B). Furthermore, the correlation among the expression of the 22 DRGs was examined (Figure 1C). The expression of most DRGs was positively correlated. Functional enrichment analysis confirmed that the 22 DRGs are associated with protein degradation and processing (Figure 1D). Additionally, two subtypes (BRCA subtype 1 and BRCA subtype 2) were clearly separated via consensus clustering, supporting the consistency of the 22 DRGs (Supplementary Figure S1).

Degradome signature was fairly validated in internal and external cohorts
Three TCGA cohorts were used for internal validation: pathological stage III, ER-positive and HER2-positive cohorts. Survival was adequately distinguished between the low- and high-risk groups in the three cohorts, with patients in the high-risk group having poorer outcomes (Figures 3A, D, G). In the pathological stage III cohort, the AUCs for predicting survival probability at 1, 3 and 5 years were 0.777, 0.726 and 0.798, respectively (Figure 3B). In the ER-positive cohort, the AUCs for predicting survival probability at 1, 3 and 5 years were 0.771, 0.781 and 0.731, respectively (Figure 3E). In the HER2-positive cohort, the AUCs for predicting survival probability at 1, 3 and 5 years were 0.703, 0.716 and 0.688, respectively (Figure 3H). The GSE96058 dataset was used for external validation. Patients in the high-risk group had a worse prognosis (Figure 3J). The AUCs for predicting survival probability at 1, 3 and 5 years were 0.746, 0.669 and 0.636, respectively (Figure 3K). The distributions of risk score and survival time in the three TCGA cohorts and the GSE96058 cohort are shown in Figures 3C, F, I, L. The distinct risk grouping in both TCGA and GSE96058 cohorts was confirmed by PCA (Supplementary Figure S2). In addition, the AUC of the risk score exceeded that of every other clinicopathological characteristic except age in both TCGA and GSE96058 cohorts (Supplementary Figure S3). Subgroup analysis further confirmed the robust efficacy of the degradome signature in discriminating prognosis across multiple clinicopathological features (Supplementary Figure S4). Altogether, these results verified the favourable applicability of the degradome signature.

Degradome-related clinicopathological nomogram showed moderate predictive capability
The distributions of 11 clinicopathological parameters in the low- and high-risk groups are displayed together with the expression of the 10 DRGs (Figure 4). These 10 DRGs were significantly differentially expressed between the low- and high-risk groups. On comparing the differences in clinicopathological characteristics between the two risk groups (Supplementary Table S1), the high-risk group was found to have more patients with T4 stage, ER-negative, PR-negative and HER2-positive breast cancers and more post-menopausal patients. Regarding the PAM50 subtype, the high-risk group had more patients with the Her2, basal and luminal B subtypes, whereas the low-risk group had more patients with the luminal A subtype. In addition, the proportion of patients with complete response to chemotherapy was higher in the low-risk group than in the high-risk group. However, no significant differences in complete response to endocrinotherapy and radiotherapy were observed between the two risk groups. Moreover, patients in the high-risk group had worse DSS, DFI and PFI (Figures 5A-C).
Age, ER status (negative), N stage (N3) and risk score were identified as independent prognostic predictors of BRCA (Supplementary Table S2). The risk score was also identified as an independent prognostic predictor in the four validation cohorts (Supplementary Figure S5). The four factors were subsequently used to develop a nomogram for predicting OS at 1, 3 and 5 years (Figure 5D). The C-index was 0.769 (0.747-0.790), demonstrating the robust predictive ability of the nomogram. The risk score had the strongest effect on OS according to its wide point contribution. The calibration curves were close to the ideal line, suggesting excellent predictive efficacy of the nomogram (Figure 5E). Additionally, DCA revealed that the nomogram offers greater clinical decision-making advantages at 1, 3 and 5 years compared with the traditional pathological stage (Figure 5F).

TLR signaling regulation and cell cycle-promoting activities were significantly upregulated in the high-risk group
nDEGs between the two risk groups were identified and used to construct a tissue-specific PPI network (Figures 6A, B). GO/KEGG analysis revealed that four biological activities mainly associated with chemokine responses were significantly upregulated in the low-risk group (z score > 1) (Figure 6C). A total of 30 GO terms and 5 KEGG terms were significantly upregulated in the high-risk group (z score > 1) (Figures 6D-F). These terms included cornification, antimicrobial humoral response, digestion, collagen-containing extracellular matrix, cornified envelope, neuronal cell body, serine-type endopeptidase activity, receptor-ligand activity, channel activity, neuroactive ligand-receptor interaction, PPAR signaling pathway and nicotine addiction. The detailed results of functional annotation are provided in Supplementary Table S3. Subsequently, GSEA revealed the biological activities and signaling pathways enriched in the high-risk group (Figures 6G, H). These activities and pathways fell mainly into two categories: TLR signalling regulation (regulation of TLRs by endogenous ligands) and cell cycle-promoting activities (REG cascade of cyclin EXPR, polo-like kinase mediated events, PLK1 pathway, G0 and early G1 and G1 specific transcription). However, no significant GSEA results were observed in the low-risk group. Furthermore, a total of 45 degradome-related nDEGs (DR-nDEGs) were identified after the intersection of DRGs and nDEGs (Figure 6I). The expression of the 45 DR-nDEGs was compared between the low- and high-risk groups, suggesting distinct degradome patterns in the different risk groups (Figure 7).

Different risk groups had distinct mutation characteristics
The top 20 most frequently altered genes in BRCA, plus the two putative genes, BRCA1 and BRCA2, were examined (Figure 8A). The mutation frequencies of PIK3CA (34%) and TP53 (33%) were the highest, with the most common mutation type being missense mutation. The fraction of genome altered was less than 25% in most patients (Figure 8B). The gene mutation count of most patients ranged from 10 to 40 (Figure 8C). Furthermore, the gene mutation landscapes of the low- and high-risk groups were determined separately (Figures 8D, E). PIK3CA (40%) and TP53 (46%) were the most frequently altered genes in the low- and high-risk groups, respectively. This finding suggests that oncogenesis is driven by different genes in patients with different risk statuses. Subsequently, genes with significantly different alteration frequencies between the two risk groups were identified (Figures 8F, G). The frequency of significant gene mutations was higher in the high-risk group. Genes with significantly different mutation frequencies between the two risk groups were further counted (Figure 8H). For both the entire genome and the degradome, the number of significantly mutated genes was higher in the high-risk group. The top 10 significantly mutated genes in the entire genome and the degradome in the two risk groups are displayed in Figures 8I-L.
Patients in the high-risk group had a significantly higher fraction of genome altered, higher gene mutation counts, higher MSIsensor scores and higher TMB than patients in the low-risk group (Figures 9A-E). The dominant oncogenic pathway was identified as the TP53 signaling pathway in the high-risk group and the PI3K/AKT/mTOR signaling pathway in the low-risk group (Figures 9F, G). These results are consistent with those of the gene mutation landscapes. However, TMB alone was found to have no prognostic value in BRCA (Figure 9H). Survival analysis integrating risk score and TMB revealed that patients with low TMB and high risk score had the worst prognosis, whereas those with high TMB and low risk score had the best prognosis (Figure 9I). More importantly, the risk score was significantly positively correlated with TMB (r = 0.283, P < 0.001) (Figure 9J).

Degradome signature was correlated with immune infiltration and immune checkpoint expression
The infiltration levels of a majority of immune cells were positively correlated with each other in BRCA (Figure 10A). The infiltration levels of 22 types of immune cells were evaluated in each BRCA sample (Figure 10B). The risk score was significantly positively correlated with the infiltration levels of resting NK cells, M0 and M2 macrophages, activated dendritic cells and neutrophils and significantly negatively correlated with the infiltration levels of naive B cells, CD8 T cells, resting memory CD4 T cells, gamma-delta T cells, activated NK cells, M1 macrophages, resting dendritic cells and resting mast cells (Figure 10C). Furthermore, the infiltration levels of resting NK cells, M0 macrophages, M2 macrophages and activated dendritic cells were significantly higher in the high-risk group, whereas those of naive B cells, CD8 T cells, resting memory CD4 T cells, gamma-delta T cells, activated NK cells, M1 macrophages, resting dendritic cells and resting mast cells were significantly higher in the low-risk group (Figure 10D). Only CD80 was significantly upregulated in the high-risk group, while the other 35 immune checkpoints were significantly upregulated in the low-risk group, including CTLA4, PD1 and PDL1 (Figure 10E).

[Figure caption: Distribution of clinicopathological characteristics in the low- and high-risk groups.]

The efficiency of the degradome signature in predicting the prognosis of patients undergoing different therapies was examined in TCGA and METABRIC cohorts, respectively. The prognosis of patients undergoing traditional chemotherapy was not significantly different between the low- and high-risk groups in either TCGA or METABRIC cohort (Figures 11A, D).
The prognosis of patients undergoing endocrinotherapy was worse in the high-risk group than in the low-risk group in both TCGA and METABRIC cohorts (Figures 11B, E). The prognosis of patients undergoing radiotherapy was poorer in the high-risk group than in the low-risk group in the METABRIC cohort (Figure 11F). However, there was no difference in the prognosis of patients undergoing radiotherapy between the low- and high-risk groups in TCGA cohort, possibly owing to the limited sample size (Figure 11C).
We next summarized the TCGA-BRCA samples that achieved complete response after first-round traditional chemotherapy and the corresponding drug agents (Figure 11G). A total of 94 and 59 patients exhibited complete response to the first-round traditional chemotherapy in the low- and high-risk groups, respectively. Cyclophosphamide and docetaxel were found to be more beneficial for patients in the low-risk group, while 5-fluorouracil may be more suitable for patients in the high-risk group to achieve complete response. Furthermore, the expression pattern of 21 molecules in the PI3K/AKT/mTOR signaling pathway, CDK family and PARP family was compared between the risk groups (Figure 11H). Most targets from the PI3K/AKT/mTOR signaling pathway were significantly upregulated in the low-risk group, whereas most targets from the CDK family and PARP family were significantly upregulated in the high-risk group. These targets may potentially serve for molecular targeted therapy in BRCA. Interestingly, drug sensitivity analysis based on IC50 further suggested that docetaxel, epirubicin and inhibitors of the PI3K/AKT/mTOR signaling pathway (afuresertib, buparlisib, ipatasertib and dactolisib) are more beneficial to patients in the low-risk group, whereas tyrosine kinase inhibitors (ibrutinib, lapatinib and sapitinib) may better benefit patients in the high-risk group (Supplementary Figure S6).

[Figure caption: Expression pattern of DRGs between the low- and high-risk groups.]

Knockdown of ABHD12 and USP41 significantly inhibited the proliferation, invasion and migration of breast cancer cells
To examine the molecular functions of DRGs in breast cancer, ABHD12 and USP41 were selected for in vitro analysis. ABHD12 and USP41 were significantly elevated in both BRCA samples and the high-risk group. ABHD12 and USP41 were knocked down in both MCF-7 cells and MDA-MB-435S cells via siRNA transfection (Figures 12A, B).
Colony formation assay (Figures 12C-F) and CCK8 assay (Figures 12G-J) consistently verified that the knockdown of ABHD12 and USP41 significantly attenuated the proliferation of both MCF-7 cells and MDA-MB-435S cells. Transwell assay suggested that the knockdown of ABHD12 and USP41 significantly inhibited the migration of both MCF-7 cells and MDA-MB-435S cells (Figures 12K-N). Besides, wound healing assay revealed that the knockdown of ABHD12 and USP41 significantly weakened the invasion of both MCF-7 cells and MDA-MB-435S cells (Figures 13A-F).

Discussion
Previously, degradation was considered merely a destructive mechanism for proteins. Recent studies have revealed that the repertoire of proteases plays a more important role in various physiological and pathological activities than previously recognized (19). The 625 identified DRGs account for over 3% of the 19,587 currently known protein-coding genes in humans. Therefore, the degradome contains the most abundant class of enzymes that play an essential role in modulating cellular activities (18, 20, 21). Dysregulation of DRGs may contribute to the onset and progression of breast cancer (14-17, 22). However, the crucial DRGs in breast cancer remain unknown. Moreover, to the best of our knowledge, no previous study has systematically evaluated the degradome pattern in breast cancer to predict prognosis, assess treatment response and guide risk stratification. Therefore, this study may serve as a primary reference for subsequent studies.
We constructed and validated a 10-gene signature based on DRGs to predict the prognosis of patients with breast cancer. The degradome signature clearly separated the OS, DSS, DFI and PFI of patients in the low- and high-risk groups. Three DRGs (PRSS2, SPPL2C and RHBDL1) were identified as protective factors and two DRGs (USP41 and ABHD12) were identified as risk factors for BRCA. Previous in vitro experiments have shown that USP41 and ABHD12 play an essential role in breast cancer progression (23, 24), which is consistent with the results of our functional assays. Elucidating the detailed regulatory mechanisms of USP41 and ABHD12 may help to further understand their roles in breast cancer. In contrast, no previous study has reported the role of PRSS2, SPPL2C and RHBDL1 in breast cancer; thus, this study may be the first to report the potential role of these genes as independent prognostic predictors in breast cancer.
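For readers unfamiliar with how a LASSO-derived signature stratifies patients, the following is a minimal sketch and not the authors' code. It assumes that the risk score is the usual linear combination of signature-gene expression values weighted by regression coefficients, and that patients are split at the cohort median. The coefficient values, the expression values and the median cutoff are hypothetical placeholders; only five of the ten signature genes are shown, with signs chosen to match the protective and risk roles described above.

```python
from statistics import median

# Hypothetical coefficients for illustration only (not the published values).
# Negative weights mark protective genes, positive weights mark risk genes.
COEFFICIENTS = {
    "PRSS2": -0.10,
    "SPPL2C": -0.20,
    "RHBDL1": -0.15,
    "USP41": 0.25,
    "ABHD12": 0.30,
}


def risk_score(expression):
    """Return the weighted sum of expression values over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(cohort):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    The median cutoff is an assumption made for this sketch.
    """
    scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression profiles for four hypothetical patients.
cohort = {
    "P1": {"PRSS2": 5.0, "SPPL2C": 4.0, "RHBDL1": 3.0, "USP41": 1.0, "ABHD12": 1.0},
    "P2": {"PRSS2": 1.0, "SPPL2C": 1.0, "RHBDL1": 1.0, "USP41": 5.0, "ABHD12": 6.0},
    "P3": {"PRSS2": 3.0, "SPPL2C": 2.0, "RHBDL1": 2.0, "USP41": 3.0, "ABHD12": 2.0},
    "P4": {"PRSS2": 2.0, "SPPL2C": 3.0, "RHBDL1": 4.0, "USP41": 2.0, "ABHD12": 4.0},
}
groups = stratify(cohort)
```

In this toy cohort, the patient with high expression of the protective genes (P1) falls into the low-risk group, whereas the patient with high USP41 and ABHD12 expression (P2) falls into the high-risk group.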
Furthermore, the correlation between risk score and clinicopathological parameters was investigated. Patients with T4 stage, ER-negative, PR-negative or HER2-positive disease, the basal or Her2 subtype (PAM50), post-menopausal status and those without complete response to chemotherapy had higher risk scores and hence worse prognosis. The risk score was identified as an independent prognostic predictor, together with ER status (negative), N stage (N3) and age. A novel prognostic nomogram integrating these four factors was developed. The C-index, calibration curves and DCA confirmed that the nomogram shows favourable capability to predict survival and offers advantages in clinical decision-making compared with the traditional pathological stage.
The proportion of DEGs was lower in the low-risk group than in the high-risk group, which led to unsatisfactory results of GO/KEGG functional enrichment analyses and GSEA in the low-risk group. The GO/KEGG biological activities upregulated in the high-risk group included small-molecule transport, digestion and peptidase activity, which indicated a more active degradome in the high-risk group. GSEA revealed that the most enriched biological activity in the high-risk group was regulation of TLRs by endogenous ligands. Endogenous ligands of host cell origin that regulate TLRs are also called damage-associated molecular patterns (DAMPs), which can be activated and secreted in response to tissue damage and enhance inflammatory responses (25). DAMPs from breast cancer cells and other invasive cancer cells can promote cancer progression and enhance tumor aggressiveness (26, 27), which may explain the worse outcomes of patients in the high-risk group in this study. Furthermore, five cell cycle-promoting biological processes and signaling pathways were enriched in the high-risk group: REG cascade of cyclin EXPR, polo-like kinase activity, G0 and early G1, PLK1 pathway and G1 specific transcription.
Polo-like kinase 1 (PLK1) is a pivotal regulator of mitosis. Its overexpression during mitosis activates the transcription factor FOXM1, which subsequently activates genes that are involved in mitosis (28). PLK1 silencing attenuates cell proliferation and growth and induces apoptosis in breast cancer (29, 30). Therefore, the poor prognosis of patients with high risk scores in this study may be partially attributed to enhanced proliferation via more active mitosis. Additionally, the expression of DRGs differed between the two risk groups, suggesting distinct degradome patterns in the two groups. Altogether, the results of functional enrichment analyses revealed two potential mechanisms underlying breast cancer progression in the high-risk group. On the one hand, DAMPs from cancer cells are activated and secreted to generate inflammatory responses and promote tissue repair, thereby enhancing cancer progression. On the other hand, stronger PLK1 signaling enhances mitosis to promote the proliferation of cancer cells. The two mechanisms may provide novel insights into targeted therapy; however, further experimental verification is required to support these findings.
Excessive gene mutation, especially the mutation of tumour suppressor genes, is one of the triggers for tumorigenesis (31). PIK3CA and TP53 are both commonly mutated genes in breast cancer (32). In this study, PIK3CA and TP53 were identified as potential carcinogenesis-driving genes in the low- and high-risk groups, respectively. Kotoula et al. (33) showed that patients with non-lymphocyte-dominant early-stage breast cancer with PIK3CA-only mutations had a favourable DFI, those with TP53-only mutations had a worse DFI and those with PIK3CA-TP53 co-mutations had the worst DFI. They concluded that PIK3CA and TP53 mutations have diverse effects on the prognosis of patients with early-stage breast cancer (33).
Consistently, in this study, patients with TP53 mutations in the high-risk group had a worse prognosis than those with PIK3CA mutations in the low-risk group. Patients in the high-risk group had higher genome instability and mutation frequency, which may explain the unsatisfactory outcomes observed in this group. More importantly, the risk score was significantly positively correlated with TMB.
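The positive correlation between risk score and TMB can be illustrated with a minimal sketch. It assumes that TMB is expressed as somatic mutations per megabase of sequenced coding region and that the association is assessed with a Spearman rank correlation; the section above does not state which correlation test was used. The 38 Mb exome size, the mutation counts and the risk scores are hypothetical values chosen for illustration.

```python
def tmb(mutation_count, exome_size_mb=38.0):
    """Return mutations per megabase; the 38 Mb exome size is an assumption."""
    return mutation_count / exome_size_mb


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-patient risk scores and somatic mutation counts.
risk_scores = [-1.2, 0.3, 0.35, 2.6, 1.1]
mutation_counts = [20, 45, 38, 150, 60]
tmb_values = [tmb(count) for count in mutation_counts]
rho = spearman(risk_scores, tmb_values)
```

A rank-based coefficient is used in this sketch because TMB distributions are typically right-skewed, so a few hypermutated samples would otherwise dominate a Pearson correlation on the raw values.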
Immune infiltrating cells in the TME can influence the prognosis of patients with cancer (34-36). High infiltration levels of T cells are usually associated with favourable survival probability (37, 38). In breast cancer, CD4 T cells can mitigate CD8 T cell exhaustion, and high infiltration levels of CD4 and CD8 T cells indicate favourable prognosis (39, 40). Consistently, in this study, the infiltration levels of both resting memory CD8 and CD4 T cells were high in the low-risk group with better survival outcomes. Gamma-delta T cells act as a double-edged sword in breast cancer. The Vγ9Vδ2+ subtype can exert cytotoxic effects on cancer cells to suppress tumour growth and angiogenesis and induce apoptosis, whereas the γδ1+ subtype orchestrates cancer progression (41). In this study, although the infiltration levels of gamma-delta T cells were found to be high in the low-risk group with better prognosis, the subtypes were not identified. Therefore, more precise data, such as single-cell sequencing data, are required for further investigation. NK cells are the main effectors against cancer cells in innate immunity and are correlated with better survival (42). The anti-cancer effects of NK cells activated by ILs have been verified in previous studies (43-45). In this study, the infiltration levels of NK cells were found to be higher in the low-risk group, which supports the tumour-suppressing role of NK cells. Macrophages are important components of the innate immune system; however, when they infiltrate the TME, where they are called tumour-associated macrophages (TAMs), they are exploited by tumour cells to promote cancer progression, resulting in a worse clinical outcome (46). TAMs can restrict tumour-associated antigen presentation and attenuate the activation of cytotoxic T lymphocytes (CTLs) while simultaneously promoting the survival, angiogenesis and metastasis of cancer cells (46).
Consistently, in this study, the infiltration levels of macrophages were higher in the high-risk group with a poorer prognosis, which supports the tumour-promoting role of macrophages. In conclusion, this study verified the prognostic role of several tumour-infiltrating immune cells in BRCA, indicating that immune infiltrating cells in the TME are important for cancer status.
Recently, immunotherapy has emerged as a first-line anti-cancer strategy. Identification of the expression patterns of immunotherapeutic targets may help to improve the survival of patients (47, 48). Studies have demonstrated that, compared with monotherapy, combination therapy with PD1 and CTLA4 inhibitors results in better survival improvement in several cancers (49-52). In this study, the expression levels of PD1, PDL1 and CTLA4 were higher in the low-risk group, suggesting that co-blockade of these molecules may represent a new strategy for immune checkpoint blockade therapy. This finding also indicates that the degradome signature may be used to guide clinical treatment.
The aberrant expression of a single DRG has been reported to be associated with the therapeutic effects of chemotherapy (17, 53), endocrinotherapy (54) and radiotherapy (55, 56). In this study, the prognosis of patients undergoing endocrinotherapy or radiotherapy was worse in the high-risk group than in the low-risk group, possibly owing to treatment resistance induced by degradome dysfunction. Subsequently, cyclophosphamide and docetaxel were identified as beneficial chemotherapeutic agents for patients in the low-risk group, whereas 5-fluorouracil may be more suitable for patients in the high-risk group. Additionally, evaluation of the expression pattern of potential molecular targets revealed that targeting CDKs/PARPs may represent a better therapeutic strategy for patients in the high-risk group, which is consistent with the results of functional enrichment analysis. Meanwhile, targeting the PI3K/AKT/mTOR signaling pathway may serve as a better therapeutic strategy for patients in the low-risk group, which is consistent with the results of mutation analysis. These findings also indicate the utility of the degradome signature in developing individualized treatment strategies in clinical settings.
However, this study has certain limitations. First, specimens from real-world clinical patients are required for verifying the expression of the 10 DRGs. Second, prospective, multi-centre studies with a large BRCA cohort should be conducted to verify the reliability of the degradome signature and corresponding results. Third, more experimental studies are required to elucidate the regulatory mechanisms and functions of the 10 DRGs.

FIGURE 13 Effects of ABHD12 and USP41 on cell invasion. (A, C, D) Wound healing assay in MCF-7 cells. (B, E, F) Wound healing assay in MDA-MB-435S cells. '**' indicates P-value < 0.01; '***' indicates P-value < 0.001. "ns" represents non-significant.

Conclusion
A 10-gene signature based on DRGs was constructed and validated to predict the prognosis of breast cancer. A nomogram integrating clinicopathological parameters and risk score was further developed for predicting OS. The high-risk group had a higher degree of clinicopathological events, as well as higher mutation frequency. TLR regulation and several cell cycle-promoting activities were significantly upregulated in the high-risk group. The risk score was significantly correlated with the infiltration of several immune cells and TMB. The expression of various immune checkpoints, including PD1, PDL1 and CTLA4, was significantly higher in the low-risk group. Additionally, the prognosis of patients undergoing different therapies differed between the two risk groups defined by the degradome signature, possibly owing to treatment resistance. Therefore, the degradome signature may be utilized for prognostic prediction, risk stratification and clinical decision making in breast cancer.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Ethics statement
Consent from all participants was obtained through The Cancer Genome Atlas (TCGA) database, the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and the Gene Expression Omnibus (GEO) database.

Author contributions
YL processed the bioinformatic analyses, and YY performed the functional assays. YL, YY, YC, CZ and YS drafted and edited the manuscript. CW gave final approval of the revised manuscript. JO outlined the study and edited the manuscript. All authors contributed to the article and approved the submitted version.

Funding
This study was supported by the research of early screening indicators and key technology and the construction of an integrated precise diagnosis and treatment system for female breast cancer and ovarian cancer in Xinjiang Uygur Autonomous Region (grant no. 2022B03019).