TMPRSS2 Correlated With Immune Infiltration Serves as a Prognostic Biomarker in Prostatic Adenocarcinoma: Implication for the COVID-2019

Transmembrane serine protease 2 (TMPRSS2) is a member of the serine protease family. Studies have shown that TMPRSS2 plays a role in the occurrence of prostate malignancies and is closely related to coronavirus disease 2019 (COVID-19). However, the role of TMPRSS2 in prostatic adenocarcinoma (PRAD) remains largely unclear. To better explore its function in PRAD, we examined the expression level of TMPRSS2 in the GEO, Tumor Immune Estimation Resource (TIMER), and Oncomine databases and studied the association between TMPRSS2 and overall survival (OS) in the UALCAN and Gene Expression Profiling Interactive Analysis (GEPIA) databases. In addition, we studied the correlation between TMPRSS2 expression, the level of immune infiltration, and immune cell type markers in the TIMER database, analyzed prognosis based on the expression level of TMPRSS2 in the related immune cell subsets, and determined the methylation profile of the TMPRSS2 promoter using the UALCAN database. Subsequently, we conducted a survival analysis and gene ontology (GO) pathway analysis in the TISIDB database and examined the expression of TMPRSS2 in the Human Protein Atlas (HPA) database. We also studied the protein-protein interaction (PPI) network of TMPRSS2 in the GENEMANIA database. Additionally, we used the microarray datasets GSE56677 and GSE52920 to illustrate changes in TMPRSS2 expression in vivo and in vitro after severe acute respiratory syndrome-coronavirus (SARS-COV) infection, finding that expression of TMPRSS2 decreased after SARS-COV infection in vitro. The function of TMPRSS2 in these datasets was further verified by gene set enrichment analysis (GSEA). In conclusion, the expression of TMPRSS2 is significantly increased in PRAD, and elevated TMPRSS2 is associated with immune infiltration and positively correlated with prognosis. In addition, tumor tissue from COVID-19 patients with PRAD may be more susceptible to infection with SARS-COV-2, which may worsen the prognosis.


INTRODUCTION
Prostatic adenocarcinoma (PRAD) is one of the most common causes of cancer-related death in men in the United States (Gupta et al., 2000). Prostate cancer is one of the leading causes of morbidity and mortality in men 50 years of age, and its incidence varies across countries and ethnic groups (Parsons et al., 2001). In 2012, the incidence of prostate cancer in the tumor registration areas of China was 9.92/100,000, ranking sixth among male malignant tumors. The prostate consists of two components: the epithelium and the stroma. The interaction between the epithelial cells and the stroma is a key factor in maintaining the normal function and homeostasis of the prostate (Cunha, 2008). In the past, treatment of advanced PRAD was limited by the lack of effective drugs (Pchejetski et al., 2005). Therefore, before specific drugs can be developed, exploring the mechanism of occurrence and identifying new tumor biomarkers with high sensitivity and specificity are crucial in addressing PRAD.
TMPRSS2 is a member of the serine protease family and was first reported in association with the intestine (Paoloni-Giacobino et al., 1997). Researchers later found that the gene is expressed primarily in the prostate in an androgen-dependent manner. The androgen-regulated TMPRSS2 promoter forms fusion genes with the coding regions of members of the proto-oncogenic ETS transcription factor family, which are closely related to prostate cancer and regulate many biological processes (Tomlins et al., 2005). Additionally, TMPRSS2 encodes a type II transmembrane protein containing LDL receptor class A (LDLRA), scavenger receptor cysteine-rich (SRCR), and serine protease domains (Vaarala et al., 2001). Because it is located on the surface of prostate cells, TMPRSS2 could be a potential marker for prostate cancer diagnosis. However, the prognostic value and immune mechanisms of TMPRSS2 in PRAD are still unclear.
In late December 2019, a case of viral pneumonia caused by a novel coronavirus was reported in Wuhan, Hubei province, China (Lu et al., 2020). The virus was named severe acute respiratory syndrome coronavirus 2 (SARS-COV-2). Coronaviruses are enveloped RNA viruses that can cause intestinal, respiratory, and central nervous system diseases in a variety of animals and humans (Denison et al., 2011). As of August 28, 2020, more than 24.2 million confirmed cases had been reported across more than 200 countries and territories, resulting in over 820,000 deaths (according to data from Johns Hopkins University) and causing a notable negative impact on human health and economic development. The outbreak has been declared a public health emergency of international concern by the World Health Organization. Currently, no specific antiviral drugs or clinically effective vaccines are available to prevent and treat coronavirus disease 2019 (COVID-19; Sanders et al., 2020). Several reports (mainly case series) from around the world have concluded that patients with malignant tumors seem to be more vulnerable to severe COVID-19 infection and death (Addeo and Friedlaender, 2020; Van De Haar et al., 2020), especially those with precancerous conditions (Bhowmick et al., 2020). However, the prognosis of COVID-19 patients with PRAD is unclear. Angiotensin-converting enzyme 2 (ACE2) has been identified as the cell entry receptor for SARS-COV-2, and receptor-mediated virus entry is dependent on TMPRSS2 (Hoffmann et al., 2020; Zhou et al., 2020). Studies have shown that TMPRSS2 can reduce viral control by the humoral immune response and promote viral transmission and pathogenesis (Glowacka et al., 2011).
More specifically, TMPRSS2 can cleave the SARS-COV-2 spike protein, facilitating viral entry and activation (Strope et al., 2020), and TMPRSS2-expressing cell lines are highly susceptible to SARS-COV, Middle East respiratory syndrome-coronavirus (MERS-COV), and SARS-COV-2 (Matsuyama et al., 2020), which prompted us to explore the association between TMPRSS2 and SARS-COV-2, especially in PRAD patients.
In this work, we studied the mRNA expression level of TMPRSS2, its association with overall survival (OS), and its correlation with immune cells, among other factors. We used the Tumor Immune Estimation Resource (TIMER), the Oncomine database, and the GTEx project to obtain the mRNA expression level of TMPRSS2 in PRAD. The prognostic value of TMPRSS2 and its association with OS in PRAD were analyzed via the Gene Expression Profiling Interactive Analysis (GEPIA) and UALCAN databases to explore its functional mechanism. Subsequently, we studied the correlation among TMPRSS2, immune infiltration level, and immune cell type markers in different tumors in the TIMER database. In addition, the integrated repository portal for tumor-immune system interactions (TISIDB) database was used for survival analysis and gene ontology (GO) pathway analysis, and we visualized the protein-protein interaction (PPI) network in the GENEMANIA database. The expression of TMPRSS2 was examined in the Human Protein Atlas (HPA) database. In addition, GSE56677 and GSE52920 were used to study the expression changes of TMPRSS2 in vivo and in vitro after SARS-COV infection. Based on these data, we identified and elucidated the important role of TMPRSS2 in PRAD and the underlying mechanisms associated with its immune infiltration. The sensitivity of the tumor to SARS-COV-2 and the prognosis of PRAD in patients with COVID-19 were also illustrated.

MATERIALS AND METHODS

Oncomine Database Analysis
The Oncomine database (http://www.oncomine.org) is a web-based data mining platform with a microarray database covering most human cancers (Rhodes et al., 2007). In this study, the expression level of TMPRSS2 in PRAD was analyzed using the Oncomine database. We conducted a search based on the following criteria: (A) analysis type: cancer vs. normal tissue; (B) data type: mRNA; and (C) thresholds: fold change = 1.5 and p-value = 0.01.
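The search thresholds above can be illustrated with a minimal sketch. It assumes that the fold-change threshold applies in either direction (over- or under-expression) and that the p-value cut-off is inclusive; neither detail is stated in the text. The `Comparison` record and the `passes_threshold` helper are hypothetical names introduced only for illustration and are not part of Oncomine.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One cancer vs. normal comparison (hypothetical record layout)."""
    dataset: str        # illustrative dataset label
    fold_change: float  # tumor / normal expression ratio, must be > 0
    p_value: float


def passes_threshold(c, min_fold_change=1.5, max_p=0.01):
    """Return True if the comparison meets both search thresholds.

    Assumption: a ratio below 1 is treated as under-expression and is
    compared by its reciprocal, so 0.5 counts as a 2-fold change.
    """
    if c.fold_change <= 0:
        raise ValueError("fold_change must be positive")
    magnitude = c.fold_change if c.fold_change >= 1 else 1 / c.fold_change
    return magnitude >= min_fold_change and c.p_value <= max_p


# Illustrative values only, not results from the study.
candidates = [
    Comparison("example_up", 2.0, 0.001),
    Comparison("example_weak", 1.2, 0.001),
    Comparison("example_down", 0.5, 0.01),
]
selected = [c.dataset for c in candidates if passes_threshold(c)]
```

With these illustrative inputs, `selected` keeps the first and third comparisons and drops the second, whose fold change is below 1.5.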

TIMER Database Analysis
Using the TIMER database (https://cistrome.shinyapps.io/timer/), this study analyzed the correlation between TMPRSS2 expression in PRAD patients and the abundance of six types of infiltrating immune cells (B-cells, CD4+ T-cells, CD8+ T-cells, neutrophils, macrophages, and dendritic cells). The correlation between TMPRSS2 expression and the gene markers of tumor-infiltrating immune cells was also analyzed.

GEPIA Database Analysis
In this study, the online database GEPIA was used to analyze gene expression in PRAD. GEPIA is an interactive web server for analyzing RNA sequencing expression data from 9,736 tumor samples and 8,587 normal samples from the TCGA and GTEx projects (Tang et al., 2017). The log-rank (Mantel-Cox) test was used to compare survival curves, including OS and relapse-free survival (RFS).
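For readers unfamiliar with the log-rank (Mantel-Cox) test, the following is a minimal sketch of the two-group statistic using only the Python standard library. It is not GEPIA's implementation. The function names are hypothetical, and assigning samples equal to the median to the low-expression group is an assumption, since the handling of ties is not described in the text.

```python
import math
from statistics import median


def split_by_median(expression):
    """Return (high, low) index lists; ties with the median go to 'low' (assumption)."""
    cutoff = median(expression)
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return high, low


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (event) or 0 (censored).

    Returns (chi_square, p_value) with one degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the group A event count
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

In use, the index lists from `split_by_median` would select the follow-up times and event indicators of the high- and low-expression groups, which are then passed to `logrank_test`.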

UALCAN Database Analysis
In this study, UALCAN, which is built on TCGA level 3 RNA-seq and clinical data from 31 cancer types, was used to compare tumor and normal samples and to analyze the relative expression of genes in tumor subgroups defined by individual cancer stage and other clinicopathological features (Chandrashekar et al., 2017).

GENEMANIA Database Analysis
GENEMANIA is a web interface for generating hypotheses about gene function (Warde-Farley et al., 2010). Given a query, GENEMANIA generates a list of genes with similar functions and builds an interactive functional association network to illustrate the relationships between genes and data sets. Using this database, we constructed the gene interaction network of TMPRSS2 for coexpression, colocalization, and genetic interaction and systematically evaluated its function.

Human Protein Atlas Database Analysis
The HPA is a program that aims to map all human proteins in cells, tissues, and organs by integrating various omics technologies (Uhlén et al., 2015; Uhlen et al., 2017). It provides protein expression profiles for 32 human tissues and uses antibody-based analysis to assess protein localization (Lanczky et al., 2016). In this study, we used the HPA database to analyze the protein expression and immunohistochemistry (IHC) of TMPRSS2 in normal tissues and PRAD tissues.

TISIDB Database Analysis
The TISIDB database was used to further investigate the correlation between TMPRSS2 expression and lymphocytes and immune modulators. TISIDB, a portal for tumor-immune system interactions, integrates 988 reported immune-related anti-tumor genes, high-throughput screening data, molecular profiling, para-cancerous multi-omics data, and various immunological data resources retrieved from seven public databases (Ru et al., 2019).

Microarray Data Collection
We obtained the SARS-COV-related microarray expression profiles GSE30589 (Dediego et al., 2011), GSE56677 (Selinger et al., 2014), and GSE52920 (Jimenez-Guardeno et al., 2014) and the prostate-related microarray GSE6956 (Wallace et al., 2008) from the GEO database, a public repository of high-throughput functional genomics data. The data were normalized with the limma package (Smyth et al., 2005) in R. These data were used to examine the changes of TMPRSS2 in cells and animals infected with SARS-COV and the role of TMPRSS2 in PRAD patients.

GTEx Database Analysis
The GTEx database supplies tissue RNA-Seq data and SNP information contributed by healthy donors and links SNP information to gene expression levels. This database was used to investigate the gene expression of TMPRSS2 in the prostate gland (Lonsdale et al., 2013).

GO and KEGG Functional Enrichment Analysis
To explore the pathways and functional annotations associated with TMPRSS2 in GSE30589 and GSE52920, we also conducted GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. The analyses and visualization were implemented in R (version 3.6.1). Results with p < 0.05 were selected.
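The text does not state which statistical test underlies the GO and KEGG enrichment. As an illustration only, over-representation analysis is commonly based on a one-sided hypergeometric test, which can be sketched as follows. The function name and argument names are hypothetical, and this is not the R code used in the study.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: genes in the background universe
    n_in_term:    background genes annotated to the GO term or pathway
    n_selected:   genes in the list of interest
    n_overlap:    selected genes that are annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def is_enriched(p_value, alpha=0.05):
    """Apply the p < 0.05 selection criterion described above."""
    return p_value < alpha
```

For example, with a background of 10 genes, 3 of them annotated to a term, and a list of 3 selected genes that are all annotated, the p-value is 1/120, which passes the 0.05 criterion. Real analyses usually also correct for multiple testing across terms, which this sketch omits.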

Gene Set Enrichment Analysis
Gene set enrichment analysis (GSEA) was used to analyze the function and potential pathways of the signature gene. GSEA is a computational method that determines whether an a priori defined set of genes shows statistically significant and concordant differences between two biological states (e.g., phenotypes), as described on the official GSEA website. To explore further links between TMPRSS2 and the functions of interest and to enhance our understanding of the correlation between biological events, we used GSEA software version 4.0.3 to perform single-gene GSEA on the two GSE datasets based on "C5: GO gene sets" and "C7: Immunologic Gene Sets." We set the cut-off criteria to a false discovery rate (FDR) < 25% and nominal p < 0.05.
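To make the idea behind GSEA concrete, the following is a minimal sketch of the running-sum enrichment score for a ranked gene list, together with the cut-off described above. It is a simplified illustration, not the GSEA 4.0.3 software: permutation testing, score normalization, and FDR estimation are omitted, and the function names are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes must already be sorted by descending score. Hits step the
    running sum up in proportion to |score| ** weight; misses step it down
    by a constant. The score is the largest deviation from zero.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, in_set)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all hits have zero weight")
    running = best = 0.0
    for is_hit, w in zip(in_set, hit_weights):
        running += w / total_hit_weight if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def passes_gsea_cutoff(fdr, nominal_p, max_fdr=0.25, max_p=0.05):
    """Apply the FDR < 25% and nominal p < 0.05 criteria."""
    return fdr < max_fdr and nominal_p < max_p
```

A gene set concentrated at the top of the ranked list gives a score close to +1, and one concentrated at the bottom gives a score close to −1.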

RESULTS

TMPRSS2 High Expression Level in Tumors
The expression level of TMPRSS2 in tumor and corresponding normal tissues across cancers was examined via the Oncomine database. As shown in Figure 1C, TMPRSS2 displayed a higher expression level in bladder cancer, kidney cancer, liver cancer, and prostate cancer, and a lower expression level in most other cancers. The prostate-related microarray GSE6956 contains gene expression profiles of primary prostate tumors resected from 69 patients and 18 non-tumor prostate tissues. The Wilcoxon test and t-test were used to compare expression between the tumor and normal groups. The results (Figure 1A) showed that the expression of TMPRSS2 in prostate cancer tissues was significantly higher than that in normal tissues (p < 0.001); GSE6956 thus further supports higher expression in prostate cancer than in normal tissue. The RNA-seq expression data for tumors from TCGA in the TIMER database (Figure 1D) show that TMPRSS2 is clearly highly expressed in PRAD. Data from the GTEx project show that the TMPRSS2 gene is highly expressed in the prostate (Figure 1B). We also compared the expression of TMPRSS2 between tumor and normal tissues using IHC data in the HPA database. The protein expression of TMPRSS2 was significantly lower in normal tissue and significantly elevated in tumor tissue (Figures 2A-D). As shown in Figure 3A, prostate cancer, selected renal cell cancers, urothelial carcinoma, lung cancer, colorectal cancer, and pancreatic cancer exhibit weak to moderate membranous or granular cytoplasmic immunoreactivity. The remaining cancer tissues were all negative. Based on The Cancer Genome Atlas (TCGA) data in the HPA database, the gene was enriched in prostate cancer (Figure 3B), and RNA tissue specificity was similarly enriched in prostate cancer (Figure 3C). The higher expression level suggests that TMPRSS2 possesses diverse functions in various tumors, especially in PRAD.

TMPRSS2 Prognostic Value With PRAD
We used the GEPIA database to examine the prognostic value of TMPRSS2. We calculated the Cox/log-rank p value and the hazard ratio with 95% confidence intervals, with p = 0.05 as the threshold. The patients were divided into two groups based on the median level of TMPRSS2 expression in each cohort. Univariate analysis was performed through GEPIA to assess the impact of TMPRSS2 on survival in various cancers (Figure 4A). The results showed that the expression level of TMPRSS2 had an effect on the prognosis of PRAD. Moreover, the UALCAN database was used to evaluate the effect of TMPRSS2 expression, molecular signature, race, and Gleason score on PRAD patient survival. The results showed that prostate adenocarcinoma patients with a high level of TMPRSS2 expression and Gleason score exhibited a longer survival period (Figure 4E). This result indicates that patient survival is associated with gene expression and Gleason score rather than with molecular signature and race (Figures 4B-E). Taken together, these results suggest that high expression of TMPRSS2 is related to a good prognosis in PRAD.

TMPRSS2 Immune Regulation Molecules
The higher TMPRSS2 expression level suggests that it possesses diverse functions in various tumors. We explored its GO functions in the TISIDB database and found that TMPRSS2 is involved in multiple functions related to virus entry into the host cell and the viral life cycle. The important role of TMPRSS2 in regulating viral processes suggests a potential association with immune cells in the tumor microenvironment. GO and KEGG analyses of GSE30589 and GSE52920 in R further showed that TMPRSS2 is involved in a variety of virus-related functions (Table 1) and multiple immune-related pathways (Table 2). To explore whether TMPRSS2 plays a biological role in immune infiltration, we conducted an integrated analysis based on the TIMER and TISIDB databases, analyzing the link between TMPRSS2 and immune cell infiltration as well as gene markers of immune cell subtypes in PRAD.
As shown in Figure 5, high levels of TMPRSS2 mRNA expression were associated with high immune infiltration in PRAD. The TMPRSS2 mRNA expression level was significantly negatively correlated with infiltrating levels of CD8+ T-cells (r = −0.345, p = 4.66e−13) and CD8+ T-cells (r = −0.16, p = 1.07e−03), and it was positively correlated with macrophages (r = 0.178, p = 2.55e−04; Figure 5A). After correcting for tumor purity, the immune cell type markers in PRAD were further studied. Table 3 shows that the TMPRSS2 mRNA expression level had significant correlations with markers of B-cells (CD19, CD27, and CD38), CD8+ T-cells (CD8A and CD8B), neutrophils (FCGR3B, SIGLEC5, and S100A12), macrophages (CD84 and CD163), Th1 (STAT4 and STAT1), Treg (STAT5B and TGFB1), and T-cell exhaustion (PDCD1, CTLA4, LAG3, and GZMB) in PRAD. Single-gene GSEA based on "C7: Immunologic Gene Sets" reflected the biological role of TMPRSS2 in the tumor environment more specifically, as shown in Figure 6. These results support the correlation between TMPRSS2 and immune infiltration in PRAD. In a further investigation, we found that the expression of TMPRSS2 was associated with the abundance of multiple immune cell types (Figures 7A-L). The p values for all of these cells are less than 0.001, and |rho| ≥ 0.2. We also explored the biological network between TMPRSS2 and PRAD, as shown in Figure 8. Overall, these results suggest that TMPRSS2 and its associated genes are important for immune cell infiltration in the PRAD microenvironment and possibly have a more significant effect on the prognosis of PRAD.
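The rank correlations and the |rho| ≥ 0.2, p < 0.001 criterion reported above can be illustrated with a minimal standard-library sketch of Spearman's rho. It does not reproduce the TIMER or TISIDB computations: the tumor purity adjustment is omitted, the p-value is taken as an input rather than computed, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def meets_cutoff(rho, p_value, min_abs_rho=0.2, max_p=0.001):
    """Apply the |rho| >= 0.2 and p < 0.001 criterion."""
    return abs(rho) >= min_abs_rho and p_value < max_p
```

Applied to two of the values quoted above, `meets_cutoff(-0.345, 4.66e-13)` is true, whereas `meets_cutoff(-0.16, 1.07e-3)` is false because both the effect size and the p-value fall outside the criterion.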

Promoter Methylation Levels of TMPRSS2 Decreased in PRAD
According to the above analysis, we observed a significant increase of TMPRSS2 expression in PRAD, and we therefore explored the reason for the elevated TMPRSS2. Methylation is an important event in epigenetic modification of the genome and is closely related to the course of disease. In particular, hypomethylation can lead to genome instability (Heyn and Esteller, 2012) and might activate related genes. Therefore, we used the UALCAN database to examine the methylation levels of the TMPRSS2 promoter in PRAD. As shown in Figure 9A, the methylation level of the TMPRSS2 promoter in normal tissue was significantly higher than that in PRAD. Consistent with Figure 9A, the methylation level of the TMPRSS2 promoter in the normal group was also significantly higher than in the PRAD groups stratified by race, age, lymphatic metastasis status, and TP53 mutation status (Figures 9B-E). At the same time, single-gene GSEA of TMPRSS2 was conducted in data sets GSE30589 and GSE52920, and the GSEA of GO gene sets further supported the effect of TMPRSS2 on the promoter methylation level, as shown in Figure 10. In the stratified analysis, the TMPRSS2 promoter methylation levels of older patients, the lymphatic metastasis group, and the TP53 mutation group were lower than that of the control (Figures 9B-E), suggesting that TMPRSS2 might be activated by promoter hypomethylation, increasing its expression level in PRAD.

SARS-COV-2 Infection Might Increase Expression of TMPRSS2
The interaction between TMPRSS2 and ACE2 can promote SARS-COV-2 infection (Hoffmann et al., 2020). TMPRSS2 is also closely related to prostate cancer and regulates many biological processes (Tomlins et al., 2005). Therefore, we examined how TMPRSS2 expression changes after coronavirus infection. In GSE56677, cells were infected with Human Coronavirus EMC 2012 (HCoV-EMC) or time-matched mock infected. Cells were harvested at 0, 3, 7, 12, 18, and 24 h post-infection (hpi), RNA was extracted, and the transcriptome was analyzed by microarray (Selinger et al., 2014). Due to the high homology between SARS-COV-2 and SARS-COV, changes in TMPRSS2 expression in cells or animals infected with SARS-COV can be used as a reference for SARS-COV-2 infection (Zhou et al., 2020). GSE52920 contains three biological sample types (SARS-COV-wt, SARS-COV-mutPBM, and Mock) based on mouse lung tissue. According to whether the mice were infected with SARS-COV, we divided the samples into a SARS-COV group and a Mock group and analyzed the changes of TMPRSS2 expression between the two groups. GSE56677 and GSE52920 were both used to analyze the changes of TMPRSS2 expression in Vero E6 cells and mouse lungs after SARS-COV infection. The results showed that the expression of TMPRSS2 in the control group was slightly lower than in the other group (Figure 11B), and expression in mouse lungs after SARS-COV infection was clearly higher than in the control group (Figure 11A). This finding suggests that TMPRSS2 expression might increase after SARS-COV-2 infection.

DISCUSSION
This study analyzed the changes of TMPRSS2 mRNA in PRAD via the Oncomine, TIMER, and GEO databases and explored the correlation between TMPRSS2 and immune infiltration (Figure 12). In addition, we investigated the changes in TMPRSS2 expression before and after SARS-COV infection in Vero E6 cells and mouse lungs. The TIMER database, which is based on TCGA, revealed that TMPRSS2 was also significantly elevated in PRAD (Figure 1D), suggesting that tumor tissues in PRAD may be more susceptible to SARS-COV-2 infection. We also used the GEPIA and UALCAN databases for survival analysis and found that the expression of TMPRSS2 was not directly associated with PRAD prognosis. Additionally, the correlation between TMPRSS2 and immune infiltration in PRAD was analyzed in the TIMER and TISIDB databases. The results showed that TMPRSS2 was positively correlated with CD8+ T-cells and macrophages in PRAD (Figure 5A). At the same time, single-gene GSEA was used to verify our conclusions. Further study of immune cell type markers in PRAD (Table 3) showed that the expression level of TMPRSS2 mRNA was significantly correlated with markers of B-cells (CD19, CD27, and CD38), CD8+ T-cells (CD8A and CD8B), neutrophils (FCGR3B, SIGLEC5, and S100A12), macrophages (CD84 and CD163), Th1 (STAT4 and STAT1), Treg (STAT5B and TGFB1), and T-cell exhaustion (PDCD1, CTLA4, LAG3, and GZMB), supporting a close correlation between TMPRSS2 and immune infiltration in PRAD.
To probe the cause of the increased TMPRSS2 in PRAD, we studied the methylation levels of TMPRSS2 in PRAD and found that the promoter methylation levels of TMPRSS2 in PRAD decreased significantly. Hence, TMPRSS2 might be activated and upregulated due to its hypomethylation, explaining the elevated TMPRSS2 in PRAD to a certain extent. We used GSE56677 and GSE52920 to study the in vivo and in vitro changes of TMPRSS2 after SARS-COV infection. The results show that TMPRSS2 expression levels in both GSE56677 and GSE52920 were reduced after SARS-COV infection (Figure 11), suggesting that TMPRSS2 promoter methylation might be activated and display an increased level in PRAD. As a prostate-specific gene, TMPRSS2 fuses with the transcription factor gene ERG in a large proportion of human prostate cancers (Nam et al., 2007) and plays an important role in selected pathological processes. In certain studies, immunohistochemical analysis of clinical specimens showed that TMPRSS2 has the highest expression in the apex of the prostate, the secretory epithelium of prostate cancer, and the glandular cavity, indicating that TMPRSS2 is a secreted protease that is highly expressed in the prostate and in prostate cancer, making it a potential target for the treatment and diagnosis of cancer (Afar et al., 2001). One study noted that, considering the high incidence of prostate cancer and the high frequency of this gene fusion, the TMPRSS2-ETS gene fusion is the most common genetic abnormality described thus far in human malignancies (Rubin and Chinnaiyan, 2006). In addition, TMPRSS2 is a candidate protease for the proteolytic activation of human influenza virus, which might play an important role in screening other progenitors in the future (Böttcher et al., 2006). At the same time, studies have found that TMPRSS2-expressing cells are a useful experimental system for studying the cleavage of HA by host cell proteases and its inhibition.
In addition, these cells also represent a suitable cell line for propagation of the influenza virus in the absence of trypsin (Wu et al., 2017). Interestingly, TMPRSS2 can cleave the SARS-COV-2 spike protein, thus facilitating viral entry and activation (Hoffmann et al., 2020), which suggests its correlation with SARS-COV-2. Other studies also show that TMPRSS2-expressing cell lines are highly susceptible to SARS-COV, MERS-COV, and SARS-COV-2 (Matsuyama et al., 2020). In general, TMPRSS2 primarily affects tumor metastasis by intervening in signaling pathways, but the mechanism of its influence on the prognosis of PRAD is still unclear. We found that TMPRSS2 might influence the prognosis of PRAD through a new mechanism, namely, immune infiltration, which suggests a direction for further studies. However, owing to the limitations of the databases, we did not further analyze the relationship between TMPRSS2 and immune infiltration. Moreover, it is worth noting that all analyses in this paper are based on servers or databases, and the results may differ under specific experimental conditions. In our future research, it will be important to verify the analysis results through experiments.

DATA AVAILABILITY STATEMENT
All datasets presented in this study are included in the article/supplementary material.

AUTHOR CONTRIBUTIONS
LL conceived the idea. LL, YZ, XdL, XnL, and XlL contributed to the acquisition, analysis, and interpretation of data. LL and YZ wrote the manuscript. ML, HL, LC, and LL reviewed the paper and provided comments. All authors reviewed the manuscript. All authors contributed to the article and approved the submitted version.