Identification of a Costimulatory Molecule Gene Signature to Predict Survival and Immunotherapy Response in Head and Neck Squamous Cell Carcinoma

Background Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignancies worldwide. Checkpoint blockade immunotherapy has made tremendous progress in the treatment of a variety of cancers in recent years. Costimulatory molecules constitute the foundation of cancer immunotherapies and are deemed to be promising targets for cancer treatment. This study attempted to evaluate the potential value of costimulatory molecule genes (CMGs) in HNSCC. Materials and Methods Based on The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) dataset, we identified the prognostic value of CMGs in HNSCC. Subsequently, CMGs-based signature (CMS) to predict overall survival of HNSCC patients was established and validated. The differences of downstream pathways, clinical outcomes, immune cell infiltration, and predictive immunotherapy responses between different CMS subgroups were investigated via bioinformatic algorithms. We also explored the biological functions of TNFRSF12A, one risk factor of CMS, by in vitro experiments. Results Among CMGs, 22 genes were related to prognosis based on clinical survival time in HNSCC. Nine prognosis-related CMGs were selected to establish CMS. CMS was an independent risk factor and could indicate the survival of HNSCC patients, the component of tumor-infiltrating lymphocytes, and the immunotherapy response rate. Functional enrichment analysis confirmed that CMS might involve immune-relevant processes. Additionally, TNFRSF12A was related to poor prognosis and enhanced malignant phenotype of HNSCC. Conclusion Collectively, CMS could accurately indicate prognosis, evaluate the tumor immune microenvironment, and predict possible immunotherapy outcomes for HNSCC patients.


INTRODUCTION
Head and neck cancer ranked as the seventh most common cancer overall (Bray et al., 2018), with an estimated 888,000 new cases globally in 2018. Head and neck squamous cell carcinoma (HNSCC) accounts for more than 90% of all head and neck tumors (Gupta et al., 2016). HNSCC is a disease with biological diversity and genomic heterogeneity, which originates from the squamous mucosa lining of the upper respiratory tract, including lip and mouth, nasal cavity, paranasal sinus, nasopharynx, oropharynx, larynx, and hypopharynx (Marur and Forastiere, 2016;Siegel et al., 2017).
More than 60% of HNSCC patients were diagnosed with cancerous lesions that were locally advanced or metastatic at the first visit. For patients suffering from locally advanced or metastatic disease, the 5-year overall survival (OS) rate is less than 50% (Lefebvre, 2005), with a local recurrence rate of 15-40% and a high risk of developing distant metastasis (Chow, 2020). Surgery remains the major treatment of choice for resectable HNSCC. For cases with unresectable tumors or in which surgery may lead to severe organ dysfunction, concurrent chemoradiotherapy is recommended as the standard treatment (Colevas et al., 2018). However, concurrent chemoradiotherapy may lead to serious long-term adverse events, including pharyngeal dysfunction, ototoxicity, neurotoxicity, and nephrotoxicity (Sklan and Collingridge, 2017).
As a new therapeutic pillar in various cancers, the PD-1 monoclonal antibody was approved by the FDA for the first-line treatment of unresectable recurrent or metastatic HNSCC in 2019 (Cohen et al., 2019). Since then, a new era of immunotherapy has begun. However, despite the advancements in cancer immunotherapies to date, there are still some unmet needs: the overall response rate was still suboptimal, and immunotherapeutic resistance also resulted in substantial barriers (Luke et al., 2017). In view of the status quo that current immunotherapy is largely based on affecting T cell function via costimulatory molecules (Waldman et al., 2020), a better understanding of the mechanisms and the roles of costimulatory molecules in these biological processes is needed to realize the full potential of this treatment approach.
In the current research, we systematically explored the expression pattern and prognostic significances of CMGs by bioinformatics analysis. The biological functions of TNFRSF12A were also investigated by in vitro experiments. Then the CMGs-based signature (CMS) was developed to predict prognosis, immunotherapy response, and immune cell infiltration for HNSCC patients.

Cell Culture and Transfection
The HNSCC cell line 6-10B and Tu 686 were purchased from the Cell Bank of the Chinese Academy of Sciences (Shanghai, China) and were grown under the recommended conditions. The vectors of plko-Puromycin-EGFP-shRNA-TNFRSF12A and plko-Puromycin-EGFP-NC were purchased from Ruoji Technology (Shanghai, China) and were transfected into 6-10B and Tu 686 cells. The shRNA1 sequences are GCAGGAGAGAGAAGTTCACCA(F) & CAAA TGCTGCAGTTCCTTAGT(R), and the shRNA2 sequences are AGGAGAGAGAAGTTCACCACC(F) & GGTGGTGAACTTC TCTCTCCT(R). These two target sequences were mixed with the same proportion. Subsequently, the cells with suitable fluorescence expression were screened with puromycin at a concentration of 4 µ g/ml.

Real-Time PCR
Total RNA was purified using Mini BEST Universal RNA extraction KIT (TaKaRa, Japan) and cDNA was synthesized using the Prime-Script RT Master Mix (TaKaRa, Japan) according to the manufacturer's instructions. Real-time PCR was performed using SYBR Green Realtime PCR Master Mix (Yeasen, China). Samples from each experiment were analyzed in triplicate. The primer sequences used in this study were as follows:

Cell Proliferation Assays and Migration Assays
Cell proliferation was detected by the cell counting kit-8 (CCK-8) and clone formation assays. In brief, cells were inoculated into 96well plates (1000 cells/well). At each time point (1st, 2nd, 3rd, 4th, and 5th day), 10 µl of CCK-8 solution (Yeason, Shanghai, China) was added to the sextuplicate wells. The plates were incubated for 1.5 h and detected.
For clone formation assays, cells were seeded in a six-well plate (1000 cells/well) with the culture medium refreshed every 3 days for 2 weeks. Following the 2 weeks, the cells were washed with PBS, fixed, stained, and counted.
For migration assays, cells were incubated using 24-well transwell plates (8-µm pore size, Corning, NY, United States). Certain cells suspended in serum-free medium were plated in the upper chambers, and 0.6 ml of RPMI-1640 medium with 10% FBS was added to the lower chamber. After incubation for a suitable amount of time, the cells were fixed, stained, and counted under a microscope.

mRNA Expression Datasets and Clinical Information
The expression profile of HNSCC and corresponding clinical information were downloaded from the Cancer Genomics Browser of University of California Santa Cruz (UCSC). This prognostic feature was further validated based on an independent data set (GSE65858) from the Gene Expression Omnibus (GEO) database (Wichmann et al., 2015). For the genes with several probes, mean expression values were recognized as the expression data. The clinical characteristics of these patients from multiple institutions are summarized in Table 1.

Differentially Expressed Genes Analysis and Function Enrichment Analysis
By package limma (Ritchie et al., 2015), differentially expressed genes (DEGs) were screened out with the cutoff value p value < 0.05 and log2 (| fold change (FC)|) > 0.5. Principal component analysis (PCA) was also employed to demonstrate expression patterns of CMGs in different samples. Pathway and function enrichment analysis was performed via clusterProfiler (Yu et al., 2012). Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al., 2017), Gene set enrichment analysis (GSEA) (Subramanian et al., 2005), and Gene Ontology (GO) enrichment analyses were employed.

Mutation Analysis and Tumor Mutation Burden (TMB) Analysis
Somatic variants data of HNSCC samples were analyzed by maftools (Mayakonda et al., 2018). Then, the tumor mutation burden (TMB) of each patient (mutations per million bases) was calculated.

Signature Identification
CMGs-based signature was constructed using 9 CMGs with a linear combination of their expression values. These inputs were weighted with the regression coefficients from the stepwise Cox regression analyses.

Construction and Validation of a Prognostic Nomogram
Nomogram plotted by RMS package Before constructing the nomogram, 4 patients were excluded because of undefined pathological diagnosis. Then CMS and nomogram were tested by receiver operating characteristic (ROC) curves and calibration curves. And area under curve (AUC) values of ROC were also calculated.

Statistical Analysis
The results were expressed as the mean ± standard deviation. Student's t-test or rank-sum test was used for comparisons between groups. Categorical data were analyzed by the chi-square test or Fisher's exact test. Correlation between two groups was determined by analysis of Pearson's or Spearman's correlation  coefficient. Survival differences between the two groups were assessed by Kaplan-Meier method and compared using logrank statistical methods. All statistical tests were bilateral with p value < 0.05 being statistically significant. R software (4.0.4) and GraphPad Prism 7 were used for data analyses.

Genomic Features and Prognostic Value of CMGs in HNSCC
After excluding TNFRSF6B for its low expression, a total of 59 CMGs were abstracted from The Cancer Genome Atlas (TCGA) HNSCC data, including 13 well-defined B7-CD28 family costimulatory molecules and 46 TNFRSF family costimulatory molecules ( Table 2). The different expression levels of CMGs between HNSCC tumor and normal tissues were exhibited in Figure 1A and Table 2. Moreover, PCA exhibited that CMGs could obviously distinguish tumor samples from normal samples ( Figure 1B). The correlation of CMGs was shown in Figure 1C.
CMGs with genetic alterations rate >3% were demonstrated based on cBioPortal database (Supplementary Figure 1). Then, Cox regression analysis revealed that 22 genes were highly associated with OS (p < 0.05, Table 2). Among them, four genes had a high hazard ratio (HR > 1) and were defined as highrisk factors, while eighteen genes had a low hazard ratio (HR < 1) and were defined as protective factors.

Construction and Evaluation of a CMGs Based Risk Model
Following stepwise Cox proportional hazards regression analysis, nine genes were selected to construct CMS (Figure 2A). Then, we established a predictive model on the basis of coefficients and expressions.
Patients were assigned to either high-or low-risk group using the median risk score as the cutoff value. The distribution of risk  scores, survival status, and survival time of patients was exhibited by scatter plots (Figures 2B,C). The expressions of nine selected genes were visualized by a heat map ( Figure 2D). Differences in clinicopathologic features between the highrisk and low-risk subgroups were displayed in Table 3. Survival analysis indicated that patients in the high-risk group exhibited poorer OS (p < 0.0001; Figure 2E). The ROC curve for CMS and other clinical indices was shown in Figure 2G. Furthermore, the accuracy of CMS was validated in another independent GSE65858 cohort (Figures 2F,H).
Meanwhile, we calculated the correlation between clinical features and the risk score on OS, as well. Log-rank analysis manifested that higher risk scores were associated with poor prognosis in most subgroups, which was in accordance to the results in the total cohort (Figure 3).

CMS Was an Independent Predictor for HNSCC
Subsequently, univariate analysis was used to examine the prognostic value of risk score and several clinicopathological features ( Figure 4A). Consequently, risk score (p < 0.001), age (p < 0.05), gender (p < 0.05), N (p < 0.05), and M stages (p < 0.05) were unfavorable factors for OS. To further verify the clinical implications of CMS, multivariate Cox regression analyses were performed ( Figure 4B). Collectively, these results revealed that the novel prognostic model could work as an independent prognostic factor related to the OS of HNSCC (p < 0.001).

Establishment of a Novel Nomogram
To provide a more accurate estimation of survival rates for HNSCC patients, a prognostic nomogram was constructed based on statistically significant factors in univariable cox regression analysis ( Figure 4C). What is more, the calibration curves of the prognostic nomogram suggested the satisfying consistency between prediction and actual 1-, 2-, and 3-year survival in the TCGA cohort ( Figure 4D).

CMS Related Biological Processes and Pathways
We then explored the down-stream pathways in different CMS stratification. A total of 220 DEGs were screened out.
Among them, 47 genes were upregulated, while 173 genes were downregulated in the high-risk group (Figure 5A).
Gene ontology analysis revealed that DEGs were highly involved in immune-relevant responses (Figure 5B), especially immune cell activation, which was validated by GSEA ( Figure 5C). Besides, KEGG and GSEA analysis jointly suggested that the PI3K-AKT pathway might be implicated during these biological processes.

CMS Predicted Immune Infiltration and Responses of Immunotherapy
Despite the great achievements of immune checkpoint inhibitors (ICIs), only a fraction of patients could expect to benefit from such burgeoning agents. The development of predictive indices is needed to optimize patients' benefit and avoid toxicity risks (Topalian et al., 2016). Hence, we evaluated the association of CMS and immunotherapy responses by accessing several biomarkers. Collectively, we enrolled five indices, including IPS, TMB, tumor purity, infiltration levels of different immune cells, and immune signature genes, to obtain a more comprehensive prospect.
Using IPS, a machine learning-based scoring scheme to predict the responses of patients to ICIs, we found that the relative probabilities of responding to anti-PD-1/PD-L1 and anti-CTLA-4 treatment were higher in the group with low-risk scores (p < 0.05, Figure 6A). This indicated that patients with lowrisk scores might receive more satisfactory clinical outcomes following immunotherapy. However, no significant differences between the groups were observed for TMB, which is a biomarker for immunotherapy ; Figure 6B).
Lymphocyte infiltration, specifically CD8+ T cell and NK cell infiltration, has been documented to be associated with improved survival in various cancers (Galon et al., 2006). Based on several algorithms, such as ESTIMATE, CIBERSORT, and XCELL, infiltrating immune cells in the tumor microenvironment (TME) were calculated and exhibited in Supplementary Figure 2. Results indicated that immune cells, especially CD8+ T cell and NK cell, were highly enriched in the low-risk group (Figures 6C-I).
Besides, we compared the expression levels of immunotherapy-related genes between the high-risk score group and the low-risk score group (Figure 6J). Patients with  low-risk scores had significantly elevated expression of PD-1 (p < 0.0001), CTLA-4 (p < 0.0001), and other signature genes.

TNFRSF12A Was a Tumor Promoter in HNSCC
To investigate the significance of our risk model, we then compared the expression levels of 9 factors in CMS and found TNFRSF12A demonstrated the most significant difference between tumor and normal tissues (Table 2 and Figure 1A). Its clinical implication was further validated by GEPIA and HPA database (Supplementary Figures 3A-C). Hence, we chose TNFRSF12A for further investigation.
Short-hairpin RNA (shRNA) was used to inhibit TNFRSF12A expression in two typical HNSCC cell lines to evaluate biological functions of TNFRSF12A. The transfection efficacies were validated by real-time PCR (Figures 7A,B) and Western Blot (Figures 7C,E). By performing CCK-8 (Figures 7D,F) and clone formation (Figures 7G-J) assays, we found that proliferation was dampened when the expression of TNFRSF12A was inhibited. Transwell assays verified that downregulation of TNFRSF12A reduced migration abilities of HNSCC cell lines in the sh-TNFRSF12A group compared to those in the negative control (NC) group (Figures 7K-N). Moreover, in order to further explore the relationship between TNFRSF12A and TME, we investigated the relevance between the expression levels of TNFRSF12A and chemokine genes, which were deemed to cast a crucial role in shaping TME and influence clinical outcomes of immunotherapy (Nagarsheth et al., 2017). Correlation analysis revealed that 5 chemokine genes, out of a total of 43 chemokine genes, exhibited significant correlations with TNFRSF12A (Supplementary Figure 3D). Among these 5 chemokine genes, CXCL2, CXCL3, and CXCL11 showed positive correlation with TNFRSF12A (R > 0.25, p < 0.05), while CXCL17 and CCL19 showed negative correlation (R < −0.25, p < 0.05). Real-time PCR revealed that TNFRSF12A could regulate expressions of CXCL2, CXCL11, and CCL19 ( Figures 7A,B). ELISA further validated that TNFRSF12A was the up-stream regulator of CXCL2, CXCL11, and CCL19 (Figures 7O,P).

DISCUSSION
Immune escape and T cell exhaustion are reckoned as features in the TME, closely associated with patients' survival (Chen and Mellman, 2017). By inhibiting the immune checkpoints, immune cells could be rejuvenated to eliminate cancer cells, which becomes a potential target for cancer immunotherapy (O'Donnell et al., 2019).
In the current study, we synchronously inspected the genome landscape and prognostic value of 59 costimulatory molecules in HNSCC patients. Through TCGA RNAseq data, 22 CMGs were discovered to be related to prognosis. Most of these CMGs had been reported to be involved in various diseases, including cancer. For instance, TNFRSF12A could result in cancer, chronic autoimmune diseases, and acute ischemic stroke via the TWEAK-TNFRSF12A axis (Winkles, 2008). Overexpression of TNFSF14 could contribute to the expansion of functional T cells to prevent the growth of human papillomavirus 16-induced tumors (Kanodia et al., 2010). High expression of ICOS in leukemia was associated with poor prognosis and might contribute to tumor proliferation by assisting tumor cells in evading antitumor immune responses (Tamura et al., 2005).
Among 9 CMGs in CMS, TNFRSF12A showed the most notable difference between tumor and normal tissues. And two independent databases also verified its clinical manifestations. Thus, TNFRSF12A was chosen for subsequent experiments. By knocking down TNFRSF12A in two typical HNSCC cell lines, we found that reduced expression of TNFRSF12A significantly inhibited cellular proliferation in vitro. Simultaneously, downregulation of TNFRSF12A also dampened the migration ability of HNSCC cell lines, consolidating its role as a tumor promoter in HNSCC carcinogenesis. Moreover, we analyzed the association between TNFRSF12A and TME, from the perspective of chemokines. By correlation analysis, we found 5 chemokines were highly associated with TNFRSF12A in HNSCC tissues. What is more, real-time PCR and ELISA verified that TNFRSF12A could regulate CXCL2, CXCL11, and CCL19 expression, highlighting the role of TNFRSF12A in modulating TME.
Following this, the risk model based on CMGs was constructed by the TCGA data set and validated by the GEO data set. The survival analysis showed that patients with low-risk scores tended to have a higher OS rate in both validation and training cohorts. Similar results amongst HNSCC patients of different stages, grades, genders, and ages highlighted the CMS's accuracy. Moreover, cox regression analysis confirmed the independent prognostic value of CMS, henceforth emphasizing the reliability of our risk model. To further enhance the accuracy of prognostic prediction, a novel nomogram combining clinical characteristics and CMS was constructed, which not only helped to predict the specific survival risk of specific patients but also contributed to individualized treatments for HNSCC patients. Then, pathway enrichment analyses were used to provide additional insights into the downstream pathways distinct in two groups. The GO and GSEA analyses jointly showed that DEGs were related to immune responses. And the KEGG analysis suggested that PI3K-AKT pathway, which has been reported to induce specific immune-inhibitory molecules' expression (Pardoll, 2012), might be involved in these biological processes.
Preliminary data from trials of ICIs in the treatment of metastatic or recurrent HNSCC led to encouraging results (Ferris et al., 2016;Gillison et al., 2016;Mehra et al., 2018). Nevertheless, with the rapid augment in the utilization of ICIs, immune-related adverse events and the limited response rate for the majority of HNSCC ensued (Lyon et al., 2018;Postow et al., 2018). As a consequence of these results, biomarkers to guide patient selection for immunotherapy are attached to most priority (Bruni et al., 2020), which prompted us to investigate how our risk model would relate to immunotherapy response. Nevertheless, it is not feasible to accurately predict the response probability for ICI immunotherapy based on merely one parameter and without taking other factors into consideration (Havel et al., 2019). This is probably due to the heterogeneity of HNSCC and its TME. Thus, through integrating a series of promising indices in immunotherapy, including IPS, TMB (Samstein et al., 2019), tumor purity (Fridman et al., 2012), immune cell component (Galon et al., 2006), and immune genes (Andrews et al., 2017;Rowshanravan et al., 2018;Wolf et al., 2020), we came to the conclusion that patients with high-risk scores were inclined to acquire miserable therapeutic outcomes of ICI immunotherapy, in accordance to the results of OS.
However, there are some limitations in our study. Firstly, although an earnest endeavor was made to recruit as many HNSCC patients as possible to establish and validate this CMS model, especially considering the relatively low incidence rate of HNSCC compared to other cancers like colon cancer, this study was still a retrospective analysis. Secondly, because of the limited accessibility to acquire paired mRNA expression data from HNSCC samples before and after immunotherapy, the prediction of immunotherapy response based on CMS was estimated indirectly. The accuracy was remained restricted. Future prospective studies based on genome data could depict a more delicate landscape of the CMGs in HNSCC.

CONCLUSION
This study identified expression pattern and prognostic value of CMGs in HNSCC. TNFRSF12A was found to be a tumor promoter via in vitro experiments. A novel scoring system based on CMGs was established to predict the clinical outcomes of HNSCC patients. It could serve as a biomarker and immunotherapy indicator for doctors to assign individualized therapeutics for patients in future clinical practices.

AUTHOR CONTRIBUTIONS
LA performed the bioinformatics analysis, drafted the manuscript, and conducted the experiments. XSo and JY collected the related references and edited the manuscript. JZ, XSu, and LH participated in the discussion. QL, HY, and DW conceived and designed the study. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
We appreciate statisticians, mathematicians, and programmers for providing biologists and physicians with such plentiful genome analysis tools.

SUPPLEMENTARY MATERIAL
The