Matrix Metallopeptidase 14: A Candidate Prognostic Biomarker for Diffuse Large B-Cell Lymphoma

Background Matrix metallopeptidase 14 (MMP14) is an important gene in the regulation of T-cell function. However, the correlation between MMP14 expression, prognosis, and immune cell infiltration in diffuse large B-cell lymphoma (DLBCL) remains unclear. Methods We investigated the influence of MMP14 on clinical prognosis using data obtained from three Gene Expression Omnibus (GEO) database sets (GSE98588, GSE10846, and GSE4475). The expression of MMP14 was analyzed using the Gene Expression Profiling Interactive Analysis (GEPIA). The correlation between MMP14 and immune cell infiltration was investigated using the Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT) and Tumor Immune Estimation Resource (TIMER) tools. In addition, the correlation between MMP14 expression and immune gene markers was analyzed by TIMER and GEPIA. Results MMP14 expression positively correlated with favorable progression-free survival (PFS; GSE98588, P = 0.02) and overall survival (OS; GSE98588, P = 0.003; GSE10846, P = 5.517e-05; and GSE4475, P = 9.85e-04). Moreover, MMP14 expression was higher in DLBCL tumors than in normal tissues. Regarding clinical characteristics, high MMP14 expression was found to be correlated with race. MMP14 expression was also correlated with immune cell infiltration and had a remarkable correlation with various immune marker sets. It was found that M0 macrophages were the immune cells most related to survival, decreasing with the increase in Ann Arbor clinical stage. The results especially showed that MMP14 was a prognostic biomarker and related to the macrophages M0. Conclusion The results suggest that MMP14 is a novel prognostic molecular marker for DLBCL and is related to the immune cell infiltration, especially related to the macrophages M0. Our study provides insights for understanding the potential roles of MMP14 in tumor immunology and its suitability as a prognosis biomarker in DLBCL.


INTRODUCTION
Diffuse large B-cell lymphoma (DLBCL) is the clinically and genetically heterogeneous group comprising malignant proliferations of large lymphoid B cells, according to the definition by the World Health Organization (WHO) classification (1,2). Although immunochemotherapybased combination therapy (R-CHOP; rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone) and better supportive care have improved the survival rate, the prognosis of refractory and relapsed patients remains poor (3). Thus, there is an urgent need to identify novel potential prognostic biomarkers and to investigate the mechanisms of immune involvement in tumorigenesis and tumor progression in the context of DLBCL.
Matrix metallopeptidase 14 (MMP14) is a membrane anchored MMP, which cleaves other functional proteins such as interleukin 8 and pro-tumor necrosis factor to maintain the tumor microenvironment (4)(5)(6). Several studies reported a role for MMP14 in susceptibility, pathological development, microenvironmental regulation, and prognosis of hepatocellular carcinoma (7), ovarian carcinoma (5), breast cancer (8), epithelial skin cancers (9), colorectal cancer (10), and gastric cancer (11). Thus, enhancing the immune response against the tumor could benefit patients affected with DLBCL, improving their survival (12,13). However, there is only little research about this field, and feasible immune biomarkers for the prediction of DLBCL patients' prognosis are lacking. Moreover, the role of MMP14 in the clinical prognosis and immune cell infiltration in DLBCL remains unclear. In this study, we investigate potential new prognostic markers for DLBCL using a bioinformatics approach with data already available in public datasets.

Gene Expression Omnibus Databases Analysis
Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih. gov/geo/) is an international public repository containing high-throughput microarray and next-generation sequencing functional genomic data sets (14). All genes from two independent DLBCL GEO cohorts [GSE10846, 420 samples (15); GSE98588, 137 samples (2)] were analyzed. According to the media based on each gene expression value, we subdivided DLBCL patients into low-and high-expression groups. Overall survival analyses (OS) in correlation to all genes were performed to identify overlapping genes between the two cohorts for the two different groups. The statistical cut-off value was P < 0.001. The overlap was evaluated using a Venn diagram 1 . Gene Ontology (GO) is a community-based bioinformatics resource containing information about a gene product using functionbased categories (16). The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge database for the systematic analysis of gene functions, connecting genomic information with 1 http://jvenn.toulouse.inra.fr/app/example.html higher-order functional information (17). Functional enrichment of overlapping genes based on GO and KEGG annotations (top five modules) was performed using the Metascape tool (18). The consistent candidate prognosis genes were selected among overlapping genes with a potential correlation to DLBCL, as assessed using the Oncomine database 2 and literature search.

The Expression of Matrix Metallopeptidase 14 and Survival Analysis
The Gene Expression Profiling Interactive Analysis (GEPIA; http: //gepia.cancer-pku.cn/) is a web-based analysis tool based on The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) data (19). GEPIA was used to obtain the MMP14 expression data in cancer and normal tissues (19). The correlation between MMP14 expression and the Kaplan-Meier survival analysis progression-free survival (PFS, GSE98588; OS, GSE98588, and GSE10846) in DLBCL cancer was analyzed, which was further validated in another GEO database [OS, GSE4475 (20)].

Diffuse Large B-Cell Lymphoma Data in the Cancer Genome Atlas Analysis
The Cancer Genome Atlas is a public-funded project that seeks to catalog and discover major cancer-causing genomic alterations to create a comprehensive "atlas" of cancer genomic profiles (21). RNA-Seq data of 48 patients affected with DLBCL were downloaded from TCGA. Datasets also contained clinical data such as Ann Arbor clinical stage, gender, race, body mass index (BMI), age, and survival time. Then the relationship between MMP14 expression and Ann Arbor clinical stage, gender, race, BMI, and age was investigated. We also used TCGA data to perform gene set enrichment analysis (GSEA), a statistical method to determine if predefined sets of genes are differentially expressed among selected samples (22). GSEA was performed on low-and high-expression groups, according to the median MMP14 expression values. GO terms and the KEGG pathways analyses were also performed to investigate possible novel biological functions of MMP14 in DLCBL. For this analysis, we selected the most highly enriched signaling pathways based on the normalized enrichment score (NES).

CIBERSORT and TIMER Database Analysis
Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT; http://cibersort.stanford.edu), a method for characterizing cell composition of complex tissues from their gene expression profiles (23), was used to obtain the abundance ratio matrix of 22 immune cell types using RNAseq data from TCGA-DLBCL (23). The differential abundance of immune infiltrates was obtained comparing the distribution of immune cells in low-and high-MMP14 expression groups using the R software. Then a correlation analysis was performed using the 11 immune cell types whose presence was identified with CIBERSORT in the 48 RNA-seq samples.
Tumor Immune Estimation Resource (TIMER; https:// cistrome.shinyapps.io/timer/) is a comprehensive resource for the systematic analysis of immune infiltrates. This repository includes 10,897 samples based on 32 cancer types obtained from TCGA (24). TIMER uses a published deconvolution statistical method to estimate the abundance of tumor-infiltrating immune cells (B cell, CD4 T cell, CD8 T cell, neutrophil, macrophage, and dendritic cell) from gene expression profiles (25). TIMER was used to measure the correlation between MMP14 expression and the abundance of six types of immune infiltrating cells. Tumor purity was used to reduce bias for this analysis (26). In addition, we investigated the correlation between MMP14 expression and various gene markers (27)(28)(29)(30) of tumor-infiltrating immune cells using correlation modules. Moreover, GEPIA was used to further validate significantly correlated genes identified with TIMER.

Clinical Relationship With the Immune Cell Most Related to Survival
We performed Kaplan-Meier analysis to measure the OS of the 48 patients obtained from TCGA based on the abundance ratio of the 11 identified immune cell types, for which the cut-off level was set at the median value. We also identified the immune cell most related to survival based on the results of the Kaplan-Meier survival analysis. Moreover, we investigated the relationship between the immune cell that was most related to survival and the clinical features (Ann Arbor clinical stage, gender, race, BMI, and age) in the 48 samples. The statistical significance between the two variates was calculated using t-test for independent samples, whereas for more variates, one-way ANOVA was used. Furthermore, we performed a correlation analysis between lowand high-expressed gene groups selected according to the median abundance values of the immune cell that was most related to survival. Differentially expressed genes were identified using the R package edgeR, with the following parameters: fold change >1.5 (for up-regulated genes) or fold change <2/3 (for downregulated genes), and P < 0.01. Metascape 3 was used to perform GO and KEGG pathway enrichment analysis, which was used to predict the potential biological functions of the differentially expressed genes.

The Selection of Matrix Metallopeptidase 14 Gene
We identified 29 overlapping genes (Supplementary Table S1) in two independent DLBCL cohorts (GSE10846, 420 samples; GSE98588, 137 samples, P < 0.001) via a Venn diagram. These genes significantly correlated with OS based on the median gene expression values ( Figure 1A). Interestingly, MMP14 was enriched in all the highest-ranked biological processes and KEGG pathways among the 29 overlapping genes as determined by 3 http://metascape.org/gp/index.html Metascape analysis ( Figure 1B). These potential prognostic genes were further analyzed for their presence in DLBCL cases using the Oncomine database and by the literature search. We then selected MMP14 as our candidate prognostic marker in DLBCL.

Matrix Metallopeptidase 14 Expression Predicted Prognosis in Diffuse Large B-Cell Lymphoma
Gene Expression Profiling Interactive Analysis uncovered that the MMP14 expression was higher in DLBCL tumors than in normal tissues (Figure 2A). We investigated the correlation of MMP14 expression with prognosis in different cohorts of DLBCL patients. Indeed, the high-MMP14 expression group was associated with favorable PFS (GSE98588, P = 0.02) and OS (GSE98588, P = 0.003; GSE10846, P = 5.517e-05; Figures 2B-D). Similarly, we also found that the overexpression of MMP14 is correlated to a favorable OS in the GSE4475 dataset (P = 0.008; Figure 2E). Therefore, our results suggest that MMP14 expression is associated with prognosis in DLBCL patients.

Matrix Metallopeptidase 14 Expression Correlation With Clinical Characteristics and Predicted Biological Functions
To enhance the comprehension about the relevance of MMP14 expression in DLBCL cancer, we analyzed the relationship between MMP14 expression and clinical characteristics using the TCGA-DLBCL database. We found that MMP14 expression was significantly correlated with race, with White patients showing the highest MMP14 expression (P < 0.05), but was not significantly correlated to other clinical characteristics, such as tumor stage, gender, BMI, and age (Supplementary Table S2).
To explore the potential MMP14 biological function, we then performed GO term and KEGG pathway analyses ( Table 1 and  Supplementary Table S3). GO annotation showed five positively correlated categories with high levels of MMP14: regulation of macrophage migration, macrophage migration, collagen metabolic process, iron ion homeostasis, and collagen catabolic process. The GO analysis also revealed five negatively correlated categories: nuclear-transcribed mRNA catabolic process, viral gene expression, nuclear-transcribed mRNA catabolic process nonsense-mediated decay, rRNA methylation, and translational initiation. The KEGG pathway analysis revealed the following five strongest positively correlated pathways with MMP14 expression: lysosome, glycosaminoglycan degradation, extracellular matrix receptor interaction, other glycan degradation, and complement and coagulation cascades. Conversely, the five pathways with the strongest negative correlation were RNA polymerase, ribosome, nucleotide excision repair, spliceosome, and base excision repair. These results suggest that pathways regulating macrophage migration and nuclear transcription were critically important in DLBCL patients and that macrophages seem to be strongly associated with MMP14 expression (Figures 3A,B).

Matrix Metallopeptidase 14 Expression Is Correlated With Immune Cell Infiltration in Diffuse Large B-Cell Lymphoma
The abundance ratio of 22 immune cells in the 48 TCGA samples is shown in Figure 4A and Supplementary Figure S1. We found that some immune cells were highly abundant, such as B cells naive, B cells memory, and M0 macrophages (Figure 4A). We further compared the infiltration levels of immune cells between low-and high-MMP14 expression groups. The infiltration levels of B cells naive in the group highly expressing MMP14 were lower compared with those observed in the group expressing low MMP14 levels. However, we observed the opposite situation for M0 and M1 macrophages (Figure 4B). We also found that MMP14 expression had a critically positive correlation with the infiltrating levels of neutrophils and dendritic cells in DLBCL (Figure 4C). Our results suggest that high MMP14 expression levels, other than correlating with a more favorable prognosis, are also associated with high infiltration levels of M0 and M1 macrophages, neutrophils, and dendritic cells. Thus, MMP14 might play an important role in immune infiltration in DLBCL cases.
We further investigated the relationship between MMP14 expression and various immune markers for different immune cells, including monocytes, tumor-associated macrophages (TAMs), M0, M1, and M2 macrophages, neutrophils, and dendritic cells using TIMER and GEPIA ( Table 2). After the correlation adjustment based on tumor purity, results showed that the MMP14 expression level remarkably correlated with most immune markers. Notably, results revealed that the expression levels of monocyte, M0, and M1 macrophages showed a strong correlation with MMP14 expression in TIMER (P < 0.05; Table 2). Especially M0 macrophages results were similar to those obtained using the GEPIA database (Supplementary Table S4). Important correlation was also found among MMP14 and M0, M2 macrophage markers ( Table 2). Collectively, these findings suggested that MMP14 may regulate macrophage cells in DLBCL. Thus, these results further upheld the findings that MMP14 had a correlation with immune-infiltrating cells, especially in macrophage cells or M0 macrophages in DLBCL.

Identifying the Immune Cell Most Related to Survival
For the 11 immune cells in the 48 samples, the abundance ratio and their correlation are shown in Figure 5A and their Pvalue in Supplementary Table S5. T-cell CD4 memory activated was significantly associated with M1 macrophage, while B-cell memory showed a negative correlation with T-cell CD4 and T-cell CD8 memory activated. Furthermore, we investigated the relationship between the abundance ratio of 11 types of immune cells and overall patient survival. Results showed that the abundance ratio of M0 macrophages positively correlated to survival (Figure 5B and Supplementary Table S6). This result is consistent with the correlation between MMP14 expression and survival (Figures 2B-E).

Correlation Between Clinical Characteristics, Macrophages M0, and Its Associated Genes
To better understand the relationship of the clinical characteristics to M0 macrophages, we performed a correlation analysis between the M0 macrophage abundance ratio and the following clinical characteristics: Ann Arbor clinical stage, gender, and race. We observed that the M0 macrophage abundance ratio was negatively correlated to the Ann Arbor clinical stage, and the abundance ratio in males was higher than that in female patients (Figures 6A,B). We found a significant difference regarding race, with White patients showing a higher abundance ratio than that observed for Black, African American, and Asian patients (Figure 6C), which is consistent with the results obtained for MMP14 expression (Supplementary Table S2).
We also analyzed the correlation between gene differential expression and M0 macrophages abundance and found a total of 398 upregulated (including MMP14) and 194 downregulated genes significantly associated with M0 macrophage abundance. We performed GO and KEGG enrichment analysis using these differential expression genes. The top 20 clusters and their enriched representative terms are shown in Figure 6D. Each node in the network represents an enriched term, colored according to the cluster-ID ( Figure 6E) and by its P-value ( Figure 6F). Results showed enrichment in genes involved in the humoral immune response, macrophage migration, positive regulation of adaptive immune response based on somatic recombination of immune receptors, negative regulation of immune system process, and leukocyte migration.

DISCUSSION
Recent immunotherapy breakthroughs have shown a novel strategy to effectively treat tumors, and lots of clinical studies are aimed at improving prognosis via novel combinations of immunotherapy and chemotherapy for DLBCL patients (31). The discovery of reliable and robust biomarkers, in addition to the pathological and clinical factors already in use, is required to improve the personalized therapy for patients.    Thus, immunological molecular markers should be strongly considered in the evaluation and treatment of cancer patients. The MMP14 matrix metalloproteinase is a pericellular type I collagenase (32), and previous studies demonstrated that MMP14 acts on the surroundings by promoting metastasis and tissue remodeling (33). However, before this study, the underlying roles of MMP14 in tumor immunology and its suitability as a prognostic biomarker for DLBCL were not investigated.
Here, we reported that the MMP14 expression level was correlated with favorable OS and PFS using independent datasets in the GEO database (GSE98588, GSE10846, and GSE4475; Figures 2B-E), and MMP14 expression was higher in DLBCL tumors than in normal tissues (Figure 2A), also with race in clinical characteristic. These results reached a conclusion that MMP14 was a prognostic biomarker in DLBCL. To date, diverse immune cell infiltration of MMP14 in DLBCL is unclear. Our results demonstrated that M0 macrophages might play a role in DLBCL. First, the abundance of M0 macrophages was high in DLBCL ( Figure 4A). The MMP14 expression level was remarkably correlated with most immune markers, especially in regulating M0 macrophages and M2 macrophages (Figure 4 and Table 2). Furthermore, M0 macrophages were similar to those in the GEPIA database (Supplementary Table S4). Second, GO term and KEGG pathway analyses also indicated that the pathways regulating macrophage migration were important in DLBCL patients and strongly related with MMP14 expression level in the TCGA-DLBCL databases (Figure 3). These findings showed that MMP14 might have a role in the regulation of infiltrating immune cells, specifically in macrophage cells in DLBCL. Furthermore, M0 macrophages were the immune cells that were most related to survival (Supplementary Table S3). The low abundance ratio of M0 macrophage patients had an increasing risk of death, which is consistent with the relationship between MMP14 expression and survival (Supplementary Figure S2). Moreover, we found that the abundance ratio of M0 macrophages decreased with the increase in the Ann Arbor clinical stage ( Figure 6A). Significant race difference was found and that the White race was the highest (Figures 6B,C), according to the relationship between MMP14 expression and race (Supplementary Table S2). Enrichment analysis also showed that the M0 macrophages were related to the immune  system process and leukocyte migration (Figures 6D,E). All these results supported that MMP14 was related to M0 macrophages.
Although little is known about macrophages' function in tumor immune response in the context of DLBCL, one of the histological hallmarks of B-cell lymphomas is the high number of macrophages involved in the clearance of apoptotic and tumor cells (34,35). Macrophages are extensively involved in the regulation of immune response and homeostasis, acting as sentinels of the tissue microenvironment (36). Under different stimuli, primary macrophages (M0) can be polarized into either M1 (classically activated) or M2 (alternatively activated) macrophages (37). Activated M1 macrophages might favor peritoneal fibrosis by producing various pro-inflammatory cytokines that cause tissue damage, while M2 mediates inflammation resolution and repair promotion (37). In our study, we discovered that in DLBCL, MMP14 expression is positively correlated to M0 macrophages. We found that M0 macrophage abundance is most closely correlated with survival, and it is negatively correlated with the Ann Arbor clinical stage ( Figure 6A). Our findings are consistent with the known role of MMP14, which is upregulated in several types of cancer and promote inflammation, angiogenesis, metastasis, and cancer cell invasion (32). At high clinical stages, MMP14 may promote macrophage polarization into M1 and M2, inducing tissue damage, and a systemic inflammatory response in DLBCL patients. Thus, we hypothesize that the inflammatory response and tumor immunity regulated by MMP14 and M0 macrophage density may have a connection with tumor prognosis and progression in DLBCL patients. However, our results were generated with data present in the databases. Thus, further studies need to validate the link between M0 macrophages and MMP14 by using a different approach, such as gene knockout, experiments based on pathological models, and prospective studies.
In conclusion, we found that high MMP14 expression is associated with favorable prognosis and increased immune cell infiltration in DLBCL cases. MMP14 expression potentially contributed to the immune marker set regulation of TAMs cells, especially in macrophage M0. Therefore, MMP14 likely played a role in immune cell infiltration and is a potential novel prognosis biomarker for DLBCL patients.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.