Integrated Analysis Reveals ENDOU as a Biomarker in Head and Neck Squamous Cell Carcinoma Progression

Background Head and neck squamous cell carcinoma (HNSCC) is a leading cancer with high morbidity and mortality worldwide. The aim is to identify genes with clinical significance by integrated bioinformatics analysis and investigate their function in HNSCC. Methods We downloaded and analyzed two gene expression datasets of GSE6631 and GSE107591 to screen differentially expressed genes (DEGs) in HNSCC. Common DEGs were functionally analyzed by Gene ontology and KEGG pathway enrichment analysis. Protein-protein interaction (PPI) network was constructed with STRING database and Cytoscape. ENDOU was overexpressed in FaDu and Cal-27 cell lines, and cell proliferation and migration capability were evaluated with MTT, scratch and transwell assay. The prognostic performance of ENDOU and expression correlation with tumor infiltrates in HNSCC were validated with TCGA HNSCC datasets. Results Ninety-eight genes shared common differential expression in both datasets, with core functions like extracellular matrix organization significantly enriched. 15 genes showed prognostic significance, and COBL and ENDOU serve as independent survival markers in HNSCC. In-vitro ENDOU overexpression inhibited FaDu and Cal-27 cells proliferation and migration, indicating its tumor-suppressing role in HNSCC progression. GSEA analysis indicated ENDOU down-stream pathways like DNA replication, mismatch repair, cell cycle and IL-17 signaling pathway. ENDOU showed relative lower expression in HNSCC, especially HPV-positive HNSCC samples. At last, ENDOU showed negative correlation with tumor purity and tumor infiltrating macrophages, especially M2 macrophages. Conclusion This study identified ENDOU as a biomarker with prognostic significance in HNSCC progression.


INTRODUCTION
HNSCC is a leading cancer with high morbidity and mortality worldwide (1,2). It can be categorized by the area of the head or neck in which they begin, like oral cavity, pharynx, oropharynx, larynx, paranasal sinuses and nasal cavity and salivary glands. The most important pathogenic risk factors of HNSCC are tobacco and alcohol consumption and at least 75% of head and neck cancers are attributed to them (3). Molecular pathogenesis is still a comprehensive but largely not understood problem in HNSCC. Due to the phenotypic heterogeneity of each individual patient, it is hard to match suitable and effective treatments. Over the past decades, the development of high-throughput sequencing, increasing researchers have focused on the biological genetic functions of genes involved in the tumorgenesis of HNSCC (4)(5)(6)(7)(8)(9)(10)(11)(12).They have illustrated a lot of largescale gene-expression profiles of HNSCC samples. The first successful therapeutic strategy for HNSCC was to inhibit the epidermal growth factor receptor (EGFR). In-depth understanding of the molecular carcinogenesis of HNSCC could help to develop novel and accurate treatment strategies for individual HNSCC patient.
The Cancer Genome Atlas (TCGA) network group of NIH generated comprehensive genomic characterization of HNSCC in 2015 (5). It profiled 279 HNSCC samples and provided a large-scale landscape of genetic and epigenetic characterizations of HNSCC, pointing the pivotal roles of PIK3CA, TRAF3, E2F1, TP53, NOTCH and other regulators in Wnt signaling pathways in tumor-genesis of HNSCC. Before that, Chen and his colleagues also compared 22-paired samples of HNSCC and normal tissues in 2004 (13). They found critically altered genes in the pathogenic processes of HNSCC through combinatorial analysis of microarray data. However, HNSCC is a cancer with complicated pathogenesis as well as diverse tissue origins. At present, most researches of HNSCC focus on star pathways or molecules, like PI3K signaling pathway or TP53 protein. Therefore, it's necessary and urgent to dig out more potential targets for HNSCC treatment.
In this study, we downloaded two published and wellgenerated profiling datasets about HNSCC tumor and normal tissues from the Gene Expression Omnibus (GEO). Then novel potential prognostic markers were dig out to find more avenues to effective clinical treatments of HNSCC via multiple bioinformatics analysis, including biological process functional annotation and pathway enrichment analysis, protein-protein interaction network analysis as well as gene expression profiling interactive survival analysis. Gene ENDOU showed differential expression, correlation with tumor infiltrates and prognostic significance, and were functionally investigated.

Data Collection
The expression profiles of RNAs were screened from the National Center of Biotechnology Information Gene

Identification of Differentially Expressed Genes (DEGs)
We utilized quartile normalization algorithm to subtract and correct background of these downloaded datasets firstly (14). Then we filtered probes without corresponding gene symbols and calculated the average value of gene symbols with multiple probes. We further used R software Limma package to screen DEGs by filtering p.adj value of student's t-test and fold change (FC) (15). Finally, with a threshold of p.adj-value <0.05 and absolute value of FC >2 (16,17), volcano plot was performed by using R software ggplot2 package to identify the DEGs with statistical significance between two groups. Hierarchical clustering and combined analyses were performed for the DEGs.

Functional Analysis of DEGs in HNSCC Pathogenesis
GO enrichment analyses of differentially expressed mRNAs were implemented with annotation, visualization and integrated discovery (DAVID) (http://david.abcc.ncifcrf.gov/). GO terms (including Molecular Function, Biological Processes and Cellular Components) with p-value less than 0.05 were considered significantly enriched by DEGs. KEGG is a database resource for understanding high-level functions and effects of the biological systems (http://www.genome.jp/kegg/). DAVID was also used to test the significantly statistical enrichment of DEGs in KEGG pathways. Cytoscape (version 3.40) was used to visualize the relationships between biological processes terms and DEGs.

Protein-Protein Interaction (PPI) Network Analysis
Protein-protein interaction analysis was further conducted in STRING database (https://string-db.org/). The higher or larger score protein-protein interaction pairs were selected to construct PPI networks. Then, the regulatory relationships between genes were visualized via Cytoscape (version 3.4.0) (18). The subnetworks were extracted from the whole PPI network by using MCODE app.

Survival Analysis and Tumor Filtrating Immune Cells Analysis
To conduct survival analysis, clinical data and RNA expression data from TCGA dataset were downloaded. Univariate and Multivariate Cox analysis was conducted with SPSS, and forest plot was conducted with R. Then Kaplan-Meier survival plot and LogRank analysis was done. The correlation of ENDOU expression with tumor infiltrating cells were conducted with TIMER (https://cistrome.shinyapps.io/timer/) (19). For macrophage markers, CCL2, CD68, and IL10 were used for tumor associated macrophages (TAM), NOS2, IRF5, and PTGS2 for M1 macrophages and CD163, VSIG4, and MS4A4A for M2 macrophages, referring to previous study (20).

Cells Culture
The FaDu and Cal-27 human carcinoma cell lines were propagated and carefully maintained in our laboratory. FaDu and Cal-27 cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum at 37°C in a humid atmosphere containing 5% CO2.

Lentivirus Mediated ENDOU Overexpression
CDS of ENDOU (NM_001172440) was inserted into a recombinant lentiviral expression vector (pGSIL) containing green fluorescent proteins (GFP) tag. To generate lentiviral particles, the recombinant expression plasmid was cotransfected with a packaging plasmid system (psPAX2 and pMD2G) into FaDu and Cal-27 cells, and viral particles were collected after 48 h. FaDu and Cal-27 cells were infected with ENDOU lentiviral vector or with a negative control (NC) for 24 h. The infection efficiency was preliminarily assessed in each experiment under a fluorescence microscope and then measured by sorting GFP-positive cells by flow cytometry (Beckman Coulter, USA). The stably infected cells were expanded and harvested for further experiments. The overexpression of ENDOU was examined with western blotting, using monoclonal antibody (Abcam, Cambridge, MA, USA). GAPDH was used as the house-keeping gene.

MTT Assay
The cell proliferation activity of FaDu and Cal-27 cells was examined with MTT Cell Proliferation and Cytotoxicity Assay Kit (Dojindo Laboratories, Tokyo, Japan). The cells were incubated for 24, 48, 72, and 96 h. After incubation, the MTT solution was removed and replaced with dimethyl sulfoxide (DMSO; 150 ml, 4%; Sigma). A microplate reader (Bio-Tek, Instruments, Neufahrn, Germany) was used to measure the absorbance at 570 nm.

Transwell Assay
Cells in logarithmic growth phase were seeded at the upper transwell chamber insert (Corning, USA) at a density of 2 × 104 cells per well. The chamber was placed in a 24-well plate in which the upper chamber contained serum-free cell culture medium and the lower chamber contained 10% FBS complete medium. The culture was continued for 24 h. The medium was discarded, and stained with a crystal violet solution to observe the number of migrated cells.

Scratch Wound Healing Assay
FaDu and Cal-27 cells were grown in complete growth medium until 90% confluence after transfection. A 3 mm wound was introduced across the diameter of each plate. Cell migration was observed by microscopy at 24 h.

Statistical Analysis
Student's t-test or ANOVA were performed to analyze the differential expression. Kaplan-Meier analysis was used to estimate overall survival rate or time; the differences between the survival curves were analyzed using the log-rank test. In all analyses, P<0.05 was considered to indicate a statistically significant result. Continuous data are presented as the mean ± standard deviation.

DEGs Identification in Two HNSCC Datasets
Two GEO datasets of HNSCC, GSE6631 and GSE107591 were first normalized ( Figure S1) and analyzed to acquire significantly differential expressed genes (DEGs), respectively. Both datasets contains more than twenty tumor and normal samples, and with the same microarray platform of Affymetrix gene expression microarray. The GSE107591 dataset contains 23 normal and 24 tumor tissues. As GSE6631 comprises 22 paired HNSCC tumor and normal samples, we conducted paired comparison. With a threshold of p.adj-value <0.05 and absolute value of FC >2, the expression of DEGs in HNSCC tumor and normal tissues were shown in Figure 1. We found 91 genes significantly up-regulated and 121 genes down-regulated in GSE6631 dataset ( Figures 1A, C).

Functional Analysis of 98 Common DEGs Enriched Hub Pathways in HNSCC
To further figure out the roles of these DEGs in HNSCC, we searched the common DEGs both altered in these two datasets. A total of 98 DEGs (37 up-regulated and 61 down-regulated DEGs) showed dysregulation in both datasets ( Figure 2A). To further detailed unravel the function of these common DEGs in HNSCC, we constructed protein-protein interaction (PPI) network via STRING database (https://string-db.org/). The genes with larger scores were selected to construct PPI networks ( Figure 2B), and their function was uploaded to DAVID database for KEGG pathway and GO enrichment analysis. As shown in Figure 2C, biological processes with the largest number of common DEGs and smallest p-value includes extracellular matrix disassemble, collagen catabolic process, extracellular matrix organization, cell adhesion, aging and angiogenesis. As for the KEGG enrichment analysis in Figure 2C, not surprisingly, items like Pathways in cancer, ECM-receptor interaction, Focal adhesion and PI3K-Akt signaling pathways are significantly enriched and showed the largest number of common DEGs. The PI3K-Akt signaling was inappropriately activated in many cancers, including head and neck cancer (21). Somatic mutations of PIK3CA had been also described before and are found in about 15% of HNSCC (22)(23)(24). These findings might help us find more new therapy targets for HNSCC.

Clinical Significance of 15 Genes Were Analyzed With TCGA HNSCC Datasets
To further explore the clinical significance of DEGs in HNSCC, we first validated the expression of 98 common DEGs with TCGA HNSCC dataset. A total of 2230 DEGs were acquired in HNSCC datasets and 92 genes (in 98 common DEGs, Table S1) showed significant dysregulation in all datasets ( Figures 3A, B). Then the prognostic significance of the 92 DEGs were analyzed with univariate analysis. As shown in Figure 3C Figure S2.
Cordon-Bleu WH2 Repeat Protein (COBL) contains WH2 domains (WASP, Wiskott-Aldrich syndrome protein, homology domain-2) and has been reported to play an important role in the reorganization of the actin cytoskeleton (25)(26)(27). In acute lymphoblastic leukemia, COBL is a hotspot for IKZF1 deletions (28), and we will study the function of COBL in HNSCC in the future. ENDOU, also known as PP11 (Placental Protein 11), encodes uridylate-specific endoribonuclease expressed in the placenta. ENDOU was reported to displays RNA binding capability and cleaves single stranded RNA in a Mn (2+)-dependent manner at uridylates (29). Studies of ENDOU in cancer dates back to 1980s (30,31). ENDOU was annotated with the function of in proteolysis (Gene Ontology term GO:0006508), and members like matrix metalloproteinases (MMPs) play an important function in cancer progression, by depredating extracellular matrix. Considering the significant enrichment of functional items of extracellular matrix in Figures 2C, D, we chose ENDOU for the following study.  At last, the clinical association between ENDOU expression and clinicopathological variables in TCGA HNSCC patients was analyzed. As shown in Table 1, ENDOU expression showed significant correlation with survival status (P=0.03), gender (P=0.011), alcohol history (P=0.017), clinical N stage (P=0.001), lymphovascular invasion status (P=0.001), neoplasm histologic grade (P<0.001) and pathologic N stage (P<0.001). In sum, ENDOU may serve as an independent prognostic marker in HNSCC.

ENDOU Inhibits HNSCC Cancer Cell Proliferation and Migration In Vitro
As ENDOU shows consistent lower expression and hazardous prognostic significance in HNSCC, to explore the function of ENDOU in HNSCC, we conducted in-vitro over-expression (OE) studies in two cell lines of Fadu and Cal-27. The proliferation and migration capability was examined. As shown in Figure 4A, the proliferation capability of Fadu and Cal-27cells in ENDOU overexpression group was significantly decreased. Then wound healing assay and transwell assay was applied to study the effect of ENDOU overexpression on cell migration and invasion. In Figures 4B, C, the migration distance of ENDOU overexpression group was significantly larger than vector groups, and cells numbers was significantly decreased, implying that ENDOU may serve as a tumor suppressor in HNSCC.
To explore the potential mechanism of ENDOU as a tumor suppressor in HNSCC, we applied Gene Set Enrichment Analysis (GSEA) analysis to enrich ENDOU related KEGG pathways and biological processes. As shown in Figure 5, DNA replication, mismatch repair, cell cycle and IL-17 showed significant enrichment, which supported the cellular phenotype of ENDOU in HNSCC.

ENDOU Expression Showed Significant Correlation With Macrophages in HNSCC
In addition to HNSCC, we analyzed the expression of ENDOU in other cancer types, as shown in Figure 6A been linked with intratumoral immune cell infiltrates (e.g. IL-17 +CD8+T lymphocytes) in HNSCC (32,33). Then we compared the expression of ENDOU in HPV-negative and HPV-positive HNSCC samples, and ENDOU showed significant lower expression in HPV-positive HNSCC samples ( Figure 6A). At last, we analyzed the expression of ENDOU with tumor immune infiltrates, in HPV-negative and HPV-positive HNSCC samples. As shown in Figure 6B, ENDOU showed significant negative correlation with neutrophils, dendritic cells and especially macrophages. As the correlation was most significant in tumor infiltrating macrophages, correlation of TAMs, M1 and M2 macrophages markers with ENDOU expression were analyzed. As shown in Figure 6C, the markers of M2 macrophages (CD163, VSIG4, and MS4A4A) all showed significant negative correlation coefficient with ENDOU, implicating potential role of ENDOU in tumor infiltrating M2 macrophages.

DISCUSSION
At present, head and neck cancer is the sixth most common cancer with a poor prognosis and high mortality over the world accounting for approximately 4% of all cancers in the United States (34). HNSCC is a kind of highly heterogeneous malignancy and appropriate clinical treatments remain a major challenge for HNSCC because of heterogeneity. Hence, personalized care for HNSCC patients requires a better understanding of the molecular mechanism of HNSCC. Though some biomarkers in head and neck squamous cell carcinoma progression have been reported yet (35,36), accepted biomarkers for HNSCC prognosis in the clinic is still raw. Effective prognostic care for HNSCC patients requires a better understanding of the molecular mechanism. With the recent development of next generation sequencing and other "omics" profiling methods, we began to focus on genomic analysis of HNSCC to illustrate the detailed or new causes of HNSCC pathogenesis and to try to develop new markers for treatments of this cancer.
In the present study, gene expression data of HNSCC were downloaded from GEO, which revealed significant differences between survival status and clinical treatments of HNSCC patients. Hence, we reanalyzed two published microarray datasets, GSE6631 and GSE107591 in this study. Finally, we found 98 common DEGs between 45 normal and 46 HNSCC tumor samples. Gene functional annotation and pathway enrichment analysis of these DEGs showed that extracellular matrix disassemble collagen catabolic process, extracellular matrix organization, cell adhesion, aging and angiogenesis were enriched in HNSCC. These results revealed that our bioinformatics analysis may help a better understanding of the regulation of these genes in HNSCC. There are also many researchers focusing on extracellular matrix organization play an important role in HNSCC (37,38). They pointed out that extracellular matrix organization was one of the most frequently altered pathways in HNSCC, consistent with our results.
By integrating the clinical information in TCGA, we screened 9 hazardous and 6 protective prognostic markers. Of the 15 genes, LAMC2  (51) has been reported as HNSCC prognostic markers, and in this study, we first uncovered the clinical significance of SEMA3C, FAP, COBL, DUSP5, ENDOU and METTL7A with HNSCC survival. Semaphorin 3C (SEMA3C) has been reported to play a pivotal role in tumor microenvironment driven neuroblastoma metastasis and progression (52). Fibroblast Activation Protein Alpha (FAP) was proven to serve as a marker in metastatic prostate cancer (53), pancreatic cancer (54) and ovarian cancer (55). Cordon-Bleu WH2 Repeat Protein (COBL) has been reported to play an Then, ENDOU not only showed significant lower expression in HNSCC cancer samples (compared with normal samples), but also decreased level in higher stage cancer samples, implicating its tumor suppressing function in HNSCC occurrence and progression. ENDOU, also known as PP11 (Placental Protein 11), encodes uridylate-specific endoribonuclease expressed in the placenta. ENDOU was reported to displays RNA binding capability and cleaves single stranded RNA in a Mn (2 +)-dependent manner at uridylates (29). Studies of ENDOU in cancer dates back to 1980s (30,31), and this is the first time investigating the role of ENDOU in HNSCC. In this study, ENDOU was negatively correlated with tumor purity and then were functionally investigated in cancer cell lines. MTT, scratch healing and transwell assay demonstrated that ENDOU indeed inhibited the proliferation and migration capability, supporting it tumor suppressing role in HNSCC cancer cells. Meanwhile, there are several limits in this part. First, the effect of ENDOU overexpression can be validated in other HNSCC cell lines, and if possible, the impact of ENDOU silencing or mutation on HNSCC cell lines to support the conclusion. Secondly, an in vivo study of ENDOU would provide more solid evidence, as the function of genes in vitro and in vivo are not always consistent. Lastly, the underlying mechanism of ENDOU inhibiting FaDu cells proliferation and migration may be explored.
At last, tumor infiltrating immune cells in tumor microenvironment plays a pivotal role in HNSCC progression and prognosis prediction (58,59). In this study, we found ENDOU negative correlates with tumor infiltrating neutrophils, dendritic cells and macrophages, especially M2 macrophages. Macrophages were reported to be involved in HNSCC metastasis (60)(61)(62), chemotherapy (63) and prognosis (64). Macrophages were polarized into M1 and M2 subtypes depending on their environment (65), and were reported to be functionally distinct in cancer (66). M2 polarized macrophages were found to be correlated with poor prognosis in nasopharyngeal carcinoma patients (67). HNSCC cancer cells were found to contribute to M2 polarization in several ways, like secreting microRNA-21 abundant exosomes (68) and lactic acid (69). Based on the founding that tumor suppressor gene ENDOU was negatively correlated with M2 markers of CD163, VSIG4, and MS4A4A, we speculated that ENDOU may also inhibit macrophages M2 polarization in tumor microenvironment. In sum, our multiple bioinformatics analysis identified 98 genes with core functions like extracellular matrix organization significantly enriched. 15 genes showed prognostic significance, and COBL and ENDOU serve as independent survival markers in HNSCC. In-vitro ENDOU overexpression inhibited FaDu and Cal-27 cells proliferation and migration, indicating its tumorsuppressing role in HNSCC progression. GSEA analysis indicated ENDOU down-stream pathways like DNA replication, mismatch repair, cell cycle and IL-17 signaling pathway. ENDOU showed relative lower expression in HNSCC, especially HPV-positive HNSCC samples. At last, ENDOU showed negative correlation with tumor purity and tumor infiltrating macrophages, especially M2 macrophages. It was the first time that ENDOU was identified as a tumor suppressor, with prognostic significance and negative correlation with tumor infiltrating M2 macrophages. However, further biological and basic studies are needed to validate our findings.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/ Supplementary Material.

ETHICS STATEMENT
The experiment was approved the Ethics Committee of Eye Nose and Throat Hospital, Fudan University.

AUTHOR CONTRIBUTIONS
CX, YZ, and LZ conceived and designed the study. YZ analyzed the data. YPS, YS, and MZ performed the experiments. YZ and LZ wrote the paper. CX and YPS reviewed and edited the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by Natural Science Foundation ofShanghai (16ZR1419300), the National Natural Science Foundation (81972529) and Science and Technology Commission of Shanghai Municipality (9411961300).