A Pan-Cancer Analysis of Cystatin E/M Reveals Its Dual Functional Effects and Positive Regulation of Epithelial Cell in Human Tumors

Cystatin E/M (CST6), a representative cysteine protease inhibitor, plays both tumor-promoting and tumor-suppressing functions and is pursued as an epigenetically therapeutic target in special cancer types. However, a comprehensive and systematic analysis for CST6 in pan-cancer level is still lacking. In the present study, we explored the expression pattern of CST6 in multiple cancer types across ∼10,000 samples from TCGA (The Cancer Genome Atlas) and ∼8,000 samples from MMDs (Merged Microarray-acquired Datasets). We found that the dynamic expression alteration of CST6 was consistent with dual function in different types of cancer. In addition, we observed that the expression of CST6 was globally regulated by the DNA methylation in its promoter region. CST6 expression was positively correlated with the epithelial cell infiltration involved in epithelial-to-mesenchymal transition (EMT) and proliferation. The relationship between CST6 and tumor microenvironment was also explored. In particular, we found that CST6 serves a protective function in the process of melanoma metastasis. Finally, the clinical association analysis further revealed the dual function of CST6 in cancer, and a combination of the epithelial cell infiltration and CST6 expression could predict the prognosis for SKCM patients. In summary, this first CST6 pan-cancer study improves the understanding of the dual functional effects on CST6 in different types of human cancer.


INTRODUCTION
Cystatin E/M (also known as CST6) is a member of the cystatin superfamily that performs physiological inhibitors of lysosomal cysteine proteases through forming high-affinity reversible complexes (Turk and Bode, 1991). The dysfunction of CST6 contributed to the alterations in proteolysis of tissue architecture, which might accelerate the spread of cancer cells Keppler, 2006). Increasing evidence has demonstrated the dual functional effects of CST6 in cancer progress (Lalmanach et al., 2021). For instance, the overexpression of CST6 could rescue mice from bone metastasis by suppressing proliferation, migration, and invasion (Jin et al., 2012). Moreover, the loss of CST6 expression has been observed in lung and cervical cancer, and its recovery expression resulted in growth suppression in culture (Zhong et al., 2007;Veena et al., 2008). In contrast to the protective function, Hosokawa et al. (2008) found that the upregulated expression of CST6 promoted tumor growth in vitro and in vivo in pancreatic ductal adenocarcinoma. CST6 was also shown overexpressed in triple-negative breast cancer and oral cancer, facilitating the tumor metastatic process (Vigneswaran et al., 2003;Li et al., 2018). Collectively, CST6 played crucial and disparate roles in the pathogenesis and development of cancer. However, a comprehensive research about the expression pattern and functional effects of CST6 in pan-cancer level is still lacking.
As an important epigenetically regulatory factor, DNA methylation has been implicated in the dysfunction of CST6. One study has shown that the hypermethylation status of CST6 promoter resulted in CST6 deficiency in glioma tumor-initiating cells, while the promoter was hypomethylated in normal brain tissues (Qiu et al., 2008). The aberrant methylation and downregulated expression of CST6 were also found in breast cancer patients, and the expression could reactivate after DNA demethylating agent treatment (Ai et al., 2006;Schagdarsurengin et al., 2007). However, the relationship between the expression and the DNA methylation of CST6 in pan-cancer level remains unclear.
Numerous studies have shown the important roles of tumor microenvironment in cancer therapy and diagnosis (Wu and Dai, 2017;Hinshaw and Shevde, 2019). CST6 is an epithelium-specific protease inhibitor with essential roles in epidermal differentiation (Zeeuwen et al., 2010). Zhang et al. (2004) found that CST6 was consistently expressed in normal human breast epithelial cells, while it was decreased in breast invasive carcinoma samples, and the expression of CST6 was associated with cell proliferation, migration, and invasion. The CST6 promoter was found highly methylated in cfDNA of breast cancer plasma cells but not in healthy samples (Chimonidou et al., 2013). Moreover, IL-17A, an immunotherapy targeting, could affect keratinocyte differentiation by regulating the expression of CST6 (Sato et al., 2020). However, there has been limited research that comprehensively explored the relationship between CST6 and tumor microenvironment in pancancer level.
In this study, we performed a systematic evaluation of the expression pattern of CST6 across cancer types from TCGA and MMDs. Consistent with the known dual role of CST6, we found that there was a broad spectrum of CST6 expression across cancer types. Through DNA methylation analysis, we found that the expression of CST6 was globally regulated by the methylation level of its promoter region. Moreover, the expression of CST6 was related to epithelial cell infiltration, EMT, and proliferation. Finally, the association between the CST6 expression and patient survival was also investigated. Our first pan-cancer study for CST6 provided novel insights into its dual function in the development of cancer.

Analysis of Gene Expression
We entered CST6 in the "Gene_DE" module of the TIMER2 website 1  and explored the expression of CST6 between different tumors and adjacent normal tissues in TCGA items. For some tumors with no normal sample or the number of normal tissue specimens was less than 5, we used the "Expression Analysis-Expression DIY" module of the GEPIA2 2 to compare the expression level of CST6 between tumor tissues and GTEx (Genotype-Tissue Expression) datasets (Tang et al., 2019). The gene with | log2FC| > 1 and p-value < 0.05 was considered as significantly differentially expressed by the ANOVA method. In addition, we used the pathological stage module and subtype filter module in GEPIA2 to obtain the expression of CST6 in different tumors at different stages and different subtypes. In order to verify the expression pattern of CST6 in different cancer types, we collected gene expression data of more than 8,000 samples from 11 cancers (Supplementary Table 1; Lim et al., 2019). To avoid differences between platforms to the greatest extent, only the dataset generated from the Affymetrix Human Genome U133 Plus 2.0 array was processed to develop the MMDs dataset. All datasets were processed uniformly through RMA normalization, and batch effect were corrected through the Combat R package (Leek et al., 2012). Moreover, the protein level of CST6 between tumor and normal tissue was obtained from the CPTAC analysis module of the UALCAN portal 3 (Chandrashekar et al., 2017). An external validation dataset was obtained with GEO accession GSE46517, which included 73 metastatic and 31 primary melanoma patients.

DNA Methylation Analysis
We downloaded the gene expression and HM450 DNA methylation profiles across cancer types of TCGA Pan-Cancer (PANCAN) cohort through UCSC Xena. 4 The full name, abbreviation, and sample number of cancer types for TCGA are shown in Supplementary Tables 2, 3. The methylation level of CST6 was quantified by averaging the beta values of CpGs located in the promoter region (upstream 2 kb to TSS). Then, Wilcoxon rank test was used to identify differentially methylated CST6 between tumor and normal tissue. In addition, the correlation between DNA methylation and expression of CST6 was calculated using the Pearson correlation method. The results of correlation coefficient less than −0.3 and p-value < 0.05 were identified as significant.

Cystatin E/M-Related Gene Functional Enrichment Analysis
The CST6-related genes were obtained from the "Similar Gene Detection" module of GEPIA2 by Pearson correlation method. We selected the datasets of all TCGA tumors and finally screened out the top 500 CST6-correlated genes. For a specific cancer type in TCGA and MMDs datasets, we used the Pearson correlation method to obtain CST6-related genes (Pearson correlation > 0.3). To explore the potential biological function that CST6 regulated, we passed the CST6-related genes to Metascape 5 with the setting of species ("Homo sapiens") (Zhou et al., 2019).

Estimation of the Relationship Between Cystatin E/M and Tumor Microenvironment, Epithelial-to-Mesenchymal Transition, and Tumor Proliferation
In order to assess the infiltration levels of epithelial cells in diverse cancers, we downloaded the precalculated TCGA data from xCell, 6 a method that yielded cell type enrichment score for 64 immune and stroma cell types, which included epithelial cell infiltration score (Aran et al., 2017).
We estimated the EMT score for an independent cancer type according to the method of Zhang et al. (2020). The epithelial and mesenchymal genes were obtained from a previous study (Mak et al., 2016). Then, the EMT score, which could reflect the EMT level for each sample, was calculated according to the following formula: where N represents the number of mesenchymal genes, and n represents the number of epithelial genes. The expression level of the proliferation marker ki67 (MKI67) was used to reflect tumor proliferation in TCGA samples. Then, the correlation between CST6, cell infiltration, and proliferation score were estimated through Spearman's rank-order correlation method. The relationship between EMT score and CST6 was calculated by partial correlation through "ggm" R packages considering tumor purity as concomitant variable.

Survival Prognosis Analysis
The survival analysis of OS (overall survival) and RFS (disease-free survival) for CST6 in TCGA tumors were performed through GEPIA2. The median expression of CST6 was used to divide the patients into high-expression and low-expression groups. Then, the OS and RFS of these groups were compared by log-rank test. In addition, the survival interaction between CST6 expression and epithelial cell (positive genes obtained from xCell) was explored using siGCD 7 (Cui et al., 2021).

Gene Expression Analysis of Cystatin E/M in the the Cancer Genome Atlas Datasets
CST6 could serve as a biomarker for tumor diagnosis and play a dual functional effect across cancer types (Lalmanach et al., 2021). Previous studies of CST6 expression in cancer were limited to sample sizes and focused on a single cancer type. To comprehensively characterize the expression pattern of CST6, we first applied the "Gene_DE" module of TIMER2 portal for TCGA datasets. As shown in Figure 1A, the expression level of CST6 in bladder urothelial carcinoma (BLCA), breast invasive carcinoma (BRCA), cervical and endocervical cancer (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), rectum adenocarcinoma (READ), thyroid carcinoma (THCA), and uterine corpus endometrioid carcinoma (UCEC) is significantly higher than in normal samples, while the expression level of CST6 in kidney chromophobe (KICH), kidney clear cell carcinoma (KIRC), lung adenocarcinoma (LUAD), and lung squamous cell adenocarcinoma (LUSC) is significantly decreased than adjacent normal samples. Moreover, the patients with HPV infection showed a lower expression level of CST6 in head and neck squamous cell carcinoma (HNSC), and the patients with metastasis status showed a lower expression in skin cutaneous melanoma (SKCM).
Due to the limited normal sample size for several cancer types, we included the normal tissue of the GTEx datasets and evaluated the expression difference of CST6. As shown in Figure 1B, we found that the expression of CST6 was significantly different in CESC, ovarian serous cystadenocarcinoma (OV), pancreatic adenocarcinoma (PAAD), and SKCM. These results were consistent with the tumor-promoting function of CST6 in breast cancer (Li et al., 2018), pancreatic cancer (Hosokawa et al., 2008), and papillary thyroid carcinoma (Oler et al., 2008). The tumor-suppressive function of CST6 has also been reported in lung cancer (Zhong et al., 2007), melanoma (Briggs et al., 2010), and renal cell carcinoma (Morris et al., 2010), which agreed with the loss expression of CST6 in these cancer types.
The results of the CPTAC dataset showed that the expression of CST6 total protein in BRCA and KIRC was lower than that of normal tissues (Figure 1C, t-test, p-value < 0.001). To extend the expression pattern of CST6 with tumor pathological information, we applied the "Stage Plot" and "Subtype Filter" functions of GEPIA2. The expression of CST6 was related to the stage of BLCA, COAD, kidney papillary cell carcinoma (KIRP), OV, READ, SKCM, THCA, and uterine carcinosarcoma (UCS) (Figure 1D). In addition, the

Gene Expression Analysis of Cystatin E/M in the Merged Microarray-Acquired Datasets
To verify the expression pattern of CST6 across cancer types, we collected the MMDs of more than 7,000 samples from 11 cancers with a standard process. Particularly, we revealed that CST6 showed a higher expression in breast cancer, pancreatic cancer, and prostate cancer, while the expression level of CST6 significantly decreased in liver cancer, lung cancer, melanoma, ovarian cancer, and renal cancer (Figure 2A). We found that the expression pattern of CST6 was consistent in breast, pancreatic, lung, melanoma, and renal cancer between TCGA and MMDs datasets. The expression of CST6 was upregulated in BLCA, COAD, and READ for TCGA datasets, while no significantly differently expressed was observed in bladder and colorectal cancer for the MMDs. Moreover, the expression of CST6 was specifically dysregulated in liver, ovarian, and prostate cancer for the MMDs. Regarding the most downregulated cancer type in the two datasets, the coexpression genes of CST6 (Pearson correlation > 0.3) were commonly shared between MMDs lung cancer and TCGA LUAD/LUSC datasets, showing a statistically significant overlap ( Figure 2B, hypergeometric test, p-value < 2.2e-16). We also found that the overlap of CST6related genes was enriched in the tumor microenvironmentrelated processes (such as positive regulation of cell junction assembly and protein localization to the plasma membrane, Figure 2B).
In addition, integrin α3 (ITGA3) was the most related gene in the two datasets, which showed a positively correlation with the expression of CST6 (Figures 2C,D). Moreover, a recent study revealed the important prognostic role of ITGA3 in patients with non-small cell lung cancer . Thus, CST6 and ITGA3 may be potential therapeutic targets for lung cancer. Similar results were obtained in melanoma and renal cancer samples. We found that kallikreinrelated peptidase 7 (KLK7) and keratin 7 (KRT7) were the most related genes in melanoma and renal cancer separately (Supplementary Figure 2). Aberrant expression of KLK7 has found to be related to melanoma aggressiveness by stimulating cell migration and adhesion (Delaunay et al., 2017;Haddada et al., 2018). KRT7 could distinguish the precursor lesions of papillary renal cell tumors, mucinous tubular and spindle cell carcinomas (Szponar and Kovacs, 2012). The detailed information of CST6-related genes list is shown in Supplementary Table 4. These results indicated the conservative expression pattern

The Cystatin E/M Expression Was Regulated by DNA Methylation
CST6 expression has been previously associated with its epigenetic regulation by methylation of the promoter region in several cancer types (Rivenbark et al., 2006;Pulukuri et al., 2009;Peters et al., 2014). To explore the relationship between its expression and DNA methylation in TCGA datasets, we first quantified the methylation level of CST6 by averaging the CpG beta value in its promoter region. Then the Wilcoxon test was used to evaluate differentially methylated status between tumor and adjacent normal samples. In contrast to the expression pattern, we found that the methylation level of CST6 was significantly lower in BLCA, CESC, COAD, ESCA, HNSC, KIRP, LUAD, PAAD, READ, THCA, and UCEC tumor samples, while it was higher in BRCA, KIRC, liver hepatocellular carcinoma (LIHC), and prostate adenocarcinoma (PRAD) (Wilcoxon test, p-value < 0.05, Figure 3A). Next, we observed a significant negative correlation of CST6 methylation and its expression in more than half (21 out of 32, Pearson correlation < 0 and p-value < 0.05) of the cancer types ( Figure 3B). The relationship of methylation and expression level of CST6 for the top correlated cancer types (BRCA, CHOL, KIRP, MESO, STAD, THCA, THYM, and USC, Pearson correlation < -0.3 and p-value < 0.01) is shown in Figure 3C. All these results further proved the capacity of DNA methylation in regulating the gene expression of CST6.

Functional Analysis of Cystatin E/M in Cancer
Taking advantage of integration of transcriptome and DNA methylation resource, we characterized the expression pattern and DNA methylation regulatory mechanism of CST6. To further investigate the function of CST6 across cancer types, we first obtained 500 CST6-related genes (Supplementary Table 4) for all TCGA tumor samples from the GEPIA2 "Correlation Analysis" module. Then, the functional enrichment analysis was performed through Metascape based on the CST6-related gene list. CST6related genes were significantly enriched in keratinization and positive regulation of epithelial cell migration ( Figure 4A). Given that the strong correlation between keratin genes and epithelial cell has been reported (Heatley, 2002;Moll et al., 2008), we next explored the relationship of CST6 and epithelial cell in detail. The expression level of CST6 and epithelial cell infiltrate score were varied in cancer types (Figure 4B and Supplementary Figure 3), while the positive correlation between the two variables was observed in most cancers (Figure 4C and Supplementary Table 5).
Considering the fact that CST6 has been previously associated with the GO term "positive regulation of mesenchymal stem cell proliferation, " we next explored the role that CST6 plays in EMT and proliferation. At first, we downloaded tumor purity for TCGA samples from a previous study (Thorsson et al., 2018). As the EMT score was significantly influenced by tumor purity (Supplementary Table 6), we estimated the relationship between EMT and CST6 using partial correlation to remove the confounder. In contrast to epithelial cell infiltration, CST6 showed dual functional effects on EMT and proliferation ( Figure 4C and Supplementary Table 5). Particularly, the correlation coefficients of epithelial cell and EMT score in SKCM were reversed (Figures 4D,E), which indicated a potential role of CST6 in the metastasis of melanoma. Given that SKCM has the maximum number of metastatic samples in TCGA, we next compared the expression level of CST6 between metastasis samples and primary tumor tissues. As shown in Figure 4F, the expression of CST6 was significantly higher in SKCM primary tumor tissues than that of metastatic samples. Consistent with this finding, a similar pattern was observed in another SKCM metastatic dataset (GSE46517, Figure 4G), which further proved the protective effect of CST6 in melanoma metastasis. Taken together, all these results suggest that CST6 was related to epithelial cell infiltration and tumor EMT process.
As evidence has shown the positive correlation between CST6 and epithelial cell, we next explored its relationship with other cells. As shown in Figure 5, the expression of CST6 was positively related to most epithelial cells, while the negative correlation between CST6 and plasma cell was observed in most cancer types. A previous study has found that CST6 promoter is highly methylated in cfDNA of BRCA plasma cells but not in healthy samples (Chimonidou et al., 2013). These results suggest the potential regulatory roles of CST6 in the tumor microenvironment.

Clinical Associations of Cystatin E/M in Cancer
We next explored the critical efficiency of CST6 in the survival of tumor patients. Tumor samples were divided into highexpression and low-expression groups based on CST6 expression for each TCGA tumor type. Patients with a higher expression of CST6 had worse survival in brain lower-grade glioma (LGG), LUAD, PAAD, PRAD, and stomach adenocarcinoma (STAD) (HR > 1 and log-rank p < 0.05, Figures 6A,B), while they indicated a favorable prognosis in KIRP, uveal melanoma (UVM), and diffuse large B-cell lymphoma (DLBC) (HR < 1 and log-rank p < 0.05, Figures 6A,B). The multivariate Cox regression model was also performed with several clinical factors (Supplementary Figure 4). In addition, the CST6related genes mentioned above (ITGA3, LKL7, and KRT7 corresponding to TCGA lung cancer, SKCM, and renal cancer separately) were found to be associated with clinical outcomes (Figures 6C,D). These results revealed the dual effects of CST6 on the survival of patient.
The association between epithelial and CST6 for SKCM has been examined herein before. Next, we explored whether these two important elements affected the clinical survival of melanoma patients. The expression of CST6 could not well predict the prognosis of SKCM patients (log-rank p = 0.0547, Figure 7A) using siGCD. Meanwhile, we found that the epithelial cell infiltration score of SKCM primary tumor was significantly higher than the metastatic patients (Wilcoxon test, p-value < 0.05, Supplementary Figure 5A). Moreover, the expression of CST6 could serve as a protective factor for the clinical survival of metastatic patients, which indicates the important roles of epithelial cell and CST6 in SKCM survival (Supplementary Figures 5B,C). Thus, we took the epithelial marker from xCell into consideration. In the case of the epithelial cell low subcohort, the OS of patients with high CST6 expression showed significantly better than those with low scores, while the results did not occur in the epithelial cell high subcohort ( Figures  7B,C). These results implied that the combination of molecular expression and cell infiltration could better predict the survival of cancer patients.

DISCUSSION AND CONCLUSION
The dual function of CST6 as both tumor suppressing and tumor promoting has been well appreciated (Lalmanach et al., 2021), but its global function and expression pattern in the development of cancer remain largely unknown. Here, we comprehensively characterized the expression pattern of CST6 in cancer from TCGA, and the result was verified from another large sample dataset. Consistent with prior knowledge, the expression of CST6 showed that it was downregulated with tumor-suppressing function, while it showed a reverse level with tumor-promoting function. Evidence has shown the EMT and metastasis functions of ITGA3 in lung cancer, which were similar to the function of CST6 . Meanwhile, we observed a conservative correlation between CST6 and ITGA3 in two datasets (TCGA and MMDs), providing potential therapeutic targets for lung cancer. Apart from ITGA3, we also identified KLK7 and KRT7 as CST6-related genes in melanoma and renal cancer datasets. Overexpression of KLK7 induced a significant reduction in melanoma cell proliferation and colony formation (Delaunay et al., 2017). KRT7 has been proven to be an important biomarker for kidney cancer (Williamson et al., 2017). Taken together, these results indicate the conservative expression pattern and essential interactive function of CST6 in human tumors.
Given that the expression of CST6 exhibited epigenetic inactivation in special cancer types, we explored the relationship between DNA methylation and its expression across cancer types. We found that the expression of CST6 was globally regulated by DNA methylation, especially in BRCA, KIRP, MESO, STAD, THCA, and THYM cancer types. Although we revealed the essential roles of DNA methylation in regulating CST6 expression, we cannot explain its differential expression in some cancer types. Previously, two SNPs in the 5'UTR region of CST6 have been found to be associated with fluconazole susceptibility through a genome-wide association study (Guo et al., 2020). Thus, we counted the number of SNVs located in the CST6 gene region from cBioPortal. Thirteen cancer types with CST6 mutation were identified. Among them, SKCM has the highest alteration frequency (range from 0.19 to 1.36%, Supplementary Figure 6A). To explore the effect of SNV on the alteration of CST6 expression, we search the PancanQTL to obtain the CST6related cis-eQTLs (within 1 Mb from the gene transcriptional start site) (Gong et al., 2018). We found more than 30 cis-eQTLs in STAD and THCA, and the alternation of rs619701 could improve the expression level of CST6 in THCA (Supplementary  Figures 6B,C and Supplementary Table 7). Integration of SNV data may provide a novel insight for understanding the regulatory mechanism of CST6 in cancer.
Moreover, we found that the expression of CST6 was globally positively correlated with epithelial cell infiltration, suggesting its important roles in the epithelium. Rivenbark et al. (2007) have observed a strong immunostaining phenomenon for CST6 in normal breast epithelial and myoepithelial cells, while it was negative in primary breast tumors. The relevance between CST6 and epithelium encourages us to further explore its relationship with EMT. As the EMT process plays an essential role in cancer metastasis (Brabletz et al., 2018), we found the protective function of CST6 in melanoma metastasis considering the negative correlation between CST6 and EMT score. Although the low-level internalization of CST6 that could affect the migration of melanoma cell has been proven (Wallin et al., 2017), we first revealed the potential mechanism between CST6 and EMT in the melanoma metastasis. Besides, we also found that the combination of epithelium infiltration and CST6 expression could well predict the survival of SKCM patients. Our results suggested the necessity to consider molecular and tumor microenvironment in tumor prognostics.
Our results have been partially limited by the nature of the datasets. Although the variation trend in expression and DNA methylation of CST6 was adverse in most cancer types, a few discordant events were also observed. For instance, the expression and DNA methylation level of CST6 were all upregulated in BRCA tumor samples. This may due to the unbalanced sample size between tumor and normal samples. The discordant of CST6 expression in transcriptome and proteome has also been observed (KIRC was the only cancer type that CST6 showed to be downregulated in both TCGA and CPTAC datasets). Meanwhile, there was a considerable number of lncRNA involved in the CST6-related genes. Thus, the posttranscriptional regulation like non-coding RNA may be another explanation, and this will be the further direction that we will analyze.
In summary, our comprehensive analysis of the expression pattern and dual functional effects of CST6 in pan-cancer level reveals its essential roles. The expression of CST6 was globally regulated by DNA methylation and related to epithelium infiltration. Particularly, CST6 performed a protective function in melanoma metastasis. Dysfunction of CST6 has also shown dual effects in clinical survival in different cancer types.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
KL, ZM, and XB conceived, designed the experiments, finalized, and submitted the manuscript. DX, SD, MC, XY, HW, DQ, and ZX analyzed the data. DX, SD, and MC drafted the manuscript. All authors have read and approved the final manuscript.