Systematic Analysis and Identification of Dysregulated Panel lncRNAs Contributing to Poor Prognosis in Head-Neck Cancer

Head and neck cancer (HNC) is one of the most prevalent cancers worldwide, accounting for approximately 5% of all cancers. While the underlying molecules and their pathogenetic mechanisms in HNC have yet to be well elucidated, recent studies have shown that dysregulation of lncRNAs may disrupt the homeostasis of various biological pathways. However, the understanding of lncRNAs in HNC is still limited by the lack of expression profiling. In the present study, we employed a systematic strategy to identify a panel of lncRNA associated with HNC. A cancer-related lncRNA profile PCR array was screened to explore potential molecules specific for HNC. A total of 55 lncRNAs were found to be dysregulated in HNC cells when compared to normal keratinocytes. Further analysis of the prognostic significance using The Cancer Genome Atlas (TCGA) database revealed 15 lncRNAs highly correlated with overall survival in HNC patients. Additionally, clinical sample expression analysis of the TCGA-HNSC cohort revealed 16 highly dysregulated lncRNAs in HNC, resulting in a combined 31-lncRNA signature panel that could predict prognosis. Validation of these molecules confirmed the considerable level of altered expressions in HNC cells, with XIST, HOXA11-AS, TSIX, MALAT1, WT1-AS, and IPW being the most prominently dysregulated. We further selected a molecule from our panel (XIST) to confirm the validity of these lncRNAs in the regulation of cancer aggressiveness. Gene ontology (GO) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analyses demonstrated that XIST participated in various cancer-related functions, including cell proliferation and metastasis. XIST silencing with the RNAi technique substantially reduced invasion and migration in several HNC cell lines. Thus, our study defined a 31-lncRNA panel as prognostic signatures in HNC. These perspective results provide a knowledge foundation for further application of these molecules in precision medicine.


INTRODUCTION
Head and neck cancer (HNC) is a complex and difficult to treat disease. While this type of cancer encompasses dysregulations at areas including the mouth, nasal cavity, larynx, and pharynx, over 90% of all HNCs are squamous cell carcinomas, and of the oral region (1). Together, they account for approximately 5% of all cancers worldwide, according to GLOBOCAN 2020 estimates (2). Like other cancers, standard treatment methods include surgery, radiotherapy, chemotherapy, or combination therapy (3). Nevertheless, even after months of treatment, relapse is always a potential problem. Although the 5-year survival rate of HNC patients is roughly 80% when detected at the earliest stages, mid-to late-stage detection causes that number to decrease by over two-fold (4,5). Therefore, it is critical to identify and understand how these carcinogenic mechanisms and molecules affect cancer, as we may be able to uncover the mysteries behind HNC and how to treat and/or prevent it.
It is well established that while more than 75% of the human genome is transcribed, only 2% consists of coding genes (6). The remaining non-coding RNAs, previously merely labeled as transcriptional noise or garbage sequences and disregarded, have recently become much more recognized. The largest class of the non-coding RNA family, with transcripts longer than 200 nucleotides, are known as long non-coding RNAs (LncRNAs). They have gained significant attention over the past decade, as many studies have confirmed their roles in various biological processes involving transcriptional and epigenetic regulation, metabolism, and multiple cellular functions (7)(8)(9).
So far, there has been no distinctive markers for HNC. Nevertheless, many studies have shown that aberrantly expressed lncRNAs may potentially play important roles in this particular cancer type. Currently, only a handful of lncRNAs have been implicated in different cancerous functions such as migration, invasion, and metastasis of HNC, including HOTAIR, UCA1, and MALAT1 (10). A variety of lncRNAs have been discovered to play roles in various cancers. For example, lncRNA HOTAIR and UCA1 have both been found to play carcinogenic roles in multiple cancer types (11,12). A recent review by Zhou et al. also depicted various lncRNAs that were implicated in HNC metastasis (13). Moreover, since not many studies have profiled lncRNAs in cancers of the head and neck region in combination with prognosis analysis, the results of this research will allow us to better understand the mechanisms behind HNC, and provide new insights on the development of diagnostic, prognostic, or treatment markers.
However, the screening and selection of these molecules are mostly ambiguous, and their prognosis abilities have yet to be thoroughly investigated.
The Cancer Genome Atlas (TCGA) is a comprehensive database for identifying and annotating different genes across multiple cancers. With the recent development in genomic sequencing, many cancer-associated lncRNA studies have been accomplished by solely analyzing and constructing data based on these clinical datasets (14)(15)(16). While various studies have profiled lncRNAs across different cancers, including breast cancer (17), lung cancer (18), and esophageal cancer (19), very few focus on the intricacies of HNC. Additionally, although the high-throughput TCGA datasets offer a large library of potential candidate molecules, exclusively relying on database information without experimental validation may limit insight for substantiation of prognostic cancer markers, as many have noted (20)(21)(22)(23)(24). Therefore, examination of validated lncRNAs in combination with big-data analysis would be ideal.
PCR array is a relatively new method of gene expression analysis. While it may lack discovery power or high-throughput abilities, it makes up for in its sensitivity, specificity, and great dynamic range. Additionally, data analysis is quick and efficient, as opposed to the cumbersome bioinformatics analysis required for genome-wide methods. Previous studies have reported the use of PCR arrays to analyze genes in specific pathways or diseases. For example, Boone et al. performed two pathwayspecific arrays for apoptosis and neurotrophins & receptor genes to elucidate the changes in post-traumatic brain injury (25). Zhang et al. also used a panel of 54 genes specific to Alzheimer's disease to observe changes in gene expression in mice (26).
To our knowledge, there are very few reports that show a systematic profiling investigation of HNC lncRNAs, nor their correlation with prognosis. Herein, we systematically examined differentially expressed lncRNAs in HNC cells using a PCR array-based method. We further assessed our results with prognostic information obtained from a high-throughput database, the TCGA-HNSC cohort. Additionally, expression levels of the top dysregulated lncRNAs from the same TCGA dataset were parallelly assessed to provide a base foundation for our research. In combination with in silico and in vitro analysis, we defined a panel of 31-lncRNA signatures with valuable prognostic information.

LncRNA Screening via RT 2 PCR Array
Systematic gene profiling was accomplished using Qiagen's PCR array kit, according to the manufacturer's protocol. Briefly, RNA

Clinical Evaluation of LncRNAs Related to Prognosis in HNC Patients
The RNA-seq data was obtained through the UALCAN web portal (http://ualcan.path.uab.edu/) (28). The Kaplan-Meier survival curves were plotted to evaluate the prognosis of lncRNAs in high-and low-risk patient groups. The head and neck RNA-seq dataset on the Kaplan-Meier Plotter pan-cancer database (https://kmplot.com/analysis/) was selected to assess the overall survival of lncRNAs. Head-neck squamous cell carcinoma data was collected from sources including TCGA, European Genome-phenome Archive (EGA), and Gene Expression Omnibus (GEO) (n = 500), and cohorts were split by automatic sel7best cut-off for median expression values. Additionally, univariate proportional cox hazard ratios (HRs) with 95% confidence intervals, along with survival p-values calculated by log-rank test were obtained for each lncRNA. TCGA clinical expression level analysis was performed using SurvExpress (http://bioinformatica.mty.itesm.mx:8080/ Biomatec/SurvivaX.jsp), and pan-cancer analysis was performed with Gepia2 (http://gepia2.cancer-pku.cn/#index).

Molecular Targets and Pathway Analyses via Bioinformatic Methods
Potential lncRNA-binding axes were investigated through various online databases and prediction algorithms. Potential lncRNA-mRNA or protein bindings were screened using ENCORI (version 3.0, http://starbase.sysu.edu.cn/index.php) (29). Gene Ontology (GO) and pathway analysis were conducted with the GO and KEGG database from The Database for Annotation, Visualization and Integrated Discovery (DAVID, version 6.8, https://david.ncifcrf.gov/).

Knockdown LncRNA XIST Expression via Specific siRNA Transfection
Knockdown of lncRNAs was accomplished with specific siRNAs (Thermo Scientific, Thermo Fisher Scientific, Inc.). siRNA sequences are listed in Supplementary Table S1. Transfection was performed with Lipofectamine 2000 ™ reagent (Invitrogen, Thermo Fisher Scientific, Inc.) in OPTI-MEM medium (Invitrogen, Thermo Fisher Scientific, Inc.), according to the manufacturer's instructions. Briefly, cells were seeded in a 10-cm dish and incubated for 24 hours prior to transfection. 20 to 40 µg of siRNA was transfected and incubated for another 24 hours before the cells were counted and/or pelleted for functional assay examination.

Determination of Cellular Functions: Growth, Migration, and Invasion
Colony formation assay was performed by seeding 5 x 10 2 to 5 x 10 3 transfected cells into 6-well plates and incubated without disturbance for 10 to 14 days. The cells were then fixed and stained with crystal violet for 2 hours, and the number of colonies formed was counted. Cell invasion was performed using Millicell ® (Millipore) cell culture inserts. Transwell chambers were coated with matrigel, and 1 x 10 6 transfected cells were seeded into the upper chamber. The lower chamber was filled with 20% FBS culture medium to promote cell invasion. After 16 to 24 hours of incubation at 37°C, invaded cells were fixed with formaldehyde for 30 minutes, stained with crystal violet for 2 hours, and the number of invaded cells which passed through the Matrigel-coated membranes were quantified and compared to their control counterparts.
Cell migration assay was performed using the wound-healing method. 1 x 10 5 transfected cells were seeded into Ibidi ® culture inserts for 16 hours. After the cells adhered, the inserts were removed, leaving a cell-free gap in the monolayer of cells. Migration towards the gaps was then photographed and measured at 4-hour intervals, up to 12 hours.

Statistical Analysis
RT-qPCR data was performed with at least three independent experiments for each experimental cohort. Unpaired t-test was used to compare the normal and cancer groups (Graphpad Prism 8.0), and a p-value of ≤ 0.05 was considered statistically significant.

LncRNA Expression Profiling in HNC Cell Lines
To profile lncRNAs associated with HNC, a PCR array with 84 cancer-related lncRNAs was used to examine the differential expressions between three HNC cell lines (SAS, OECM1, and FaDu) and two lines of normal keratinocytes (CGHNK2 and CGHNK6). The three HNC cell line's geometric mean fold regulation (FR) and fold change (FC) of each lncRNA was compared to the mean of the normal cell lines, as summarized in Supplementary Table S2, and the relationship between cancerous and normal groups were shown in Figure 1A. A screening criterion of a mean |FR| ≥ 1.5 was established, resulting in 55 significantly dysregulated lncRNAs ( Figure 1B). Among these, 27 lncRNAs were upregulated, and 28 were downregulated ( Figure 1B and Tables 1, 2). Hierarchical clustering analysis was used to visualize these differentially expressed lncRNAs ( Figure 1C). These results suggest a panel candidate of lncRNA that may participate in the carcinogenesis of HNC.

Prognostic Significance of the Panel lncRNAs in HNC Patients
The clinical significance of the lncRNA in HNC patients was determined by examining the association between each lncRNA expression and clinical patients' prognosis. The Kaplan-Meier Plotter (KM Plotter) suite was used to analyze overall survival in HNC patients with the TCGA-HNSC dataset (n=500). A total of 41 lncRNAs was examined, which included the 55 candidates post-exclusion of nine molecules without information in KM Plotter. Figure 2A shows a few examples of highly significant results. As depicted, the upregulated lncRNAs, XIST, HOXA11-AS, and TERC, were significantly associated with poor prognosis, while IPW, a downregulated lncRNA, was correlated with good  survival. The hazard ratio (HR) of the prognostic association for each lncRNA was summarized in Figure 2B. In total, there were 24 lncRNAs with HR ≥ 1.0, implying the higher risk of these lncRNA expressions to be correlated with worse survival, whereas 17 lncRNAs were found with HR < 1.0, alluding to a lower lncRNA level, favoring good prognosis in HNC patients.

Dysregulated lncRNA Signatures in Cells Correlated With Prognosis in Patients
To parallelly assess lncRNA expression level and the prognostic significance in HNC patients, Figure 3A was plotted to show the association between these two parameters of each molecule. As shown, a total of 27 lncRNAs exhibited correlative levels of dysregulation and prognostic risk (HRs), with 16 being positiverisk and 11 negative-risk to the prognosis of HNC. To further assess the prognostic prediction power, the statistical significance on the overall survival of these 27 lncRNAs was examined. Figure 3B depicts the 15 molecules that exhibited altered expression and statistical correlation (p-value ≤ 0.05) in HNC patients. Of these lncRNAs, 9 molecules were upregulated and associated with poor prognosis, including TERC, LINC01234, CCAT1, XIST, GACAT1, WT1-AS, CCAT2, HOXA11-AS, and TSIX. In contrast, a total of 6 molecules, NEAT1, MALAT1, C D K N 2 B -A S 1 , C B R 3 -A S 1 , I P W , a n d A I R N , w e r e downregulated and related to good prognosis.

Prognostic Significance of the Top 30 Up-and Down-Regulated lncRNAs From TCGA-HNSC Database
Apart from the data collected from our PCR array panel, we also analyzed the clinical RNA-seq data from TCGA. Here, we selected the 30 most upregulated and the 30 most downregulated lncRNAs in the TCGA-HNSC cohort  ( Figure 4A). Out of the top 60 dysregulated lncRNAs, we found 24 lncRNAs with HRs significantly correlating with their expression levels, including 14 upregulated lncRNAs with corresponding HRs ≥ 1.0, and 10 downregulated lncRNAs displaying HRs < 1.0 ( Figure 4B). Further investigation of these molecules revealed 16 genes with p-values ≤ 0.05, signifying a high correlation with prognosis ( Figure 4C). The collective correlated HRs and significant p-values of the lncRNAs screened are summarized in Supplementary Table S3. In combination with our panel derived from the PCR array results, we established a comprehensive panel of 31-lncRNA signatures associated with HNC prognosis.

A Panel of 31-lncRNA Signature Can Potentially Predict HNC Prognosis
Since the expression levels of the lncRNAs from the TCGA dataset are already established from clinical HNC patients, we wanted to specifically confirm and authenticate the extensive capabilities of the 15 lncRNA signatures screened from our own PCR array sample set through RT-qPCR expression analysis. In addition to the five HNC and normal cell lines used in the PCR array, we also included four additional normal cell lines and seven additional cancer cell line samples. Figure 5A shows the expression folds of six panel lncRNAs: HOXA11-AS, IPW, MALAT1, TSIX, WT1-AS, and XIST, with statistically significant dysregulation in HNC cells compared to normal cells. As seen in the example lncRNAs shown, the panel lncRNAs are verified across multiple HNC cell lines to be significantly correlated with our PCR array results. Thus, our defined 31-lncRNA signature panel provides insight on dysregulation in cells, as well as correlation with prognosis in HNC patients.

LncRNA XIST Is Significantly Correlated With HNC
Herein, we selected lncRNA XIST for further functional analysis, due to its significant FR and correlation with TCGA datasets ( Figure 3B), as well as its high endogenous expression level in HNC cell lines. Figure 5B is a schematic representation of our screening and selection. To further verify the potential significance of XIST in cancer, we confirmed its expression in clinical sample data from TCGA (Supplementary Figure S1A). Furthermore, we examined its pan-cancer expression. TCGA clinical samples and tissues from various other carcinomas, such as lung, liver/bile duct, and thyroid cancers showed that XIST was upregulated in multiple cancers (Supplementary Figure S1B). Additionally, the pancancer overall survival for XIST was seen to correlate with poor prognosis, with an HR of 1.2 and a p-value of 0.023, signifying high expression risk, resulting in poor prognosis (Supplementary Figure S1C). The data collected from clinical resources coincided  with our results with XIST in HNC, verifying that XIST, as well as our other panel lncRNAs could be significant for prognostic analysis of HNC.
To acquire comprehensive information related to XIST modulated functional pathway, we performed Gene Ontology (GO) and KEGG (Kyoto Encyclopedia of Genes and Genomes) enrichment analysis and found that many of the genes associated with XIST participate in pathways and functions related to cancer metastasis, including various adhesion-specific functions ( Figures 6A, B).
Silencing of XIST was performed with siRNA as the cancer function model ( Figure 7A). While analysis of long-term cell growth via colony formation assay did not show any significant increase or decrease of colony formation ability across different cell lines ( Figure 7B), both migration and invasion abilities were prominently inhibited when XIST was silenced (Figures 7C, D). Migration was partially inhibited in SAS cell lines, while CGHNC9 and FaDu cells had at least a 40% inhibition rate. Similarly, all three cell lines exhibited highly repressed invasion rates in the siRNA group.  To analyze the downstream mechanisms of XIST-mediated migration and invasion in HNC, we performed RT-qPCR and Western blotting of molecules associated with EMT, including MMP2, MMP7, MMP9, E-cadherin, and N-cadherin. RT-qPCR analysis showed that silencing of XIST also decreased the levels of MMPs and mesenchymal markers, while the epithelial marker E-cadherin was significantly increased across all three cell lines ( Figure 8A). The protein levels of these genes were also similarly affected. While N-cadherin were inhibited by the knockdown of XIST, E-cadherin was significantly upregulated ( Figure 8B). Thus, as demonstrated by the analysis of XIST, our panel of lncRNAs can potentially be effectively used as biomarkers that can predict prognosis for HNC.

DISCUSSION
Cancer has recently become the most common cause of death in higher-income countries. Therefore, it is of utmost importance to find precise molecules that can detect the cancers before it   (30). However, to the best of our knowledge, no specific prognostic lncRNA(s) have been derived from a systematically verified study. In this study, we designed a comprehensive   (31). A review by Cossu et al. also pointed out various lncRNAs, such as AFAP1-AS1, PVT1, MALAT1, H19, DLEU2, CCAT1, and more, that are potentially associated with HNC, many of which also agreed with our findings (32).
Following our initial screening, we investigated the prognostic abilities of these lncRNAs. We chose to evaluate prognosis through HR and overall survival, as the risk of disease in conjunction with time represents imperative determining factors of cancer progression. Univariate cox proportional HRs have been used by multiple studies to represent prognosis potential, as it estimates the relative risk of each lncRNA (33). The results of our HR analysis showed 27 candidates with significant prognosis implications. Then, utilizing overall survival analysis (p-value ≤ 0.05), we evaluated the significance between lncRNA expression and cancer survival. Here, we discovered 15 prognosis-associated lncRNAs. To broaden the scope of our study to include data based on clinical samples, we analyzed highly dysregulated lncRNAs from the TCGA-HNSC dataset in conjunction with the PCR array screening results. A total of 16 lncRNAs were found to be significant in the cancer survival and progression of HNC patients. Altogether, we established a 31-lncRNA signature panel that predicts HNC prognosis.
Upon further detailed investigation, we found some notable lncRNAs in our panel, including XIST, TSIX, HOXA11-AS, WT1-AS, IPW, and MALAT1, with significant dysregulation in multiple HNC cell lines. A study by Yao et al. also identified HOXA11-AS and MALAT1 as potential biomarkers for HNC (34). The prognostic risk of MALAT1 has also been well established in various previous studies (32,35). Interestingly, although studies have deemed MALAT1 as an oncogene across many cancer types (35), our results indicated that it was downregulated in HNC. TCGA data analysis also showed that high expression of MALAT1 resulted in higher overall survival, which correlates with our study. Thus, these common lncRNAs may have high potential for future HNC-specific studies. On the other hand, some lncRNAs from our panel are relatively novel lncRNAs, such as WT1-AS, where only a handful of studies have proposed its carcinogenic function in lung, cervical, and breast cancer (36)(37)(38). Knowledge regarding HOXA11-AS is also sparse, although recent studies have elaborated on its role in liver (39), lung (40), head-neck (34), and other cancers (41). Not much is known about lncRNA IPW and TSIX either, but some preliminary studies have pointed out potential interactions between TSIX and the more well-known  lncRNA XIST in regards to X chromosome modulation (42). Although some early studies predicted an inverse correlation between these two molecules (43), newer studies began to disprove their correlation, focusing on their individual functions instead (44).
Numerous studies have linked XIST with multiple cancer types, such as colorectal (45), lung (46), and breast cancer (47). Various cancers such as thyroid (48) and osteoscaroma (49) have also shown XIST to act as an oncogene, which coincides with our findings in HNC. Many of these studies have also suggested that The wound-healing model was used for migration assay. SAS was partially inhibited by roughly 20%, while FaDu and CGHNC9 was inhibited by at least 60%. (D) Invasion ability was determined via Matrigel invasion assay. All three cell lines showed statistically significant inhibition rates in the XIST knockdown group. All functional experiments were performed in triplicates. (***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05, t-test, ns = not significant). XIST could act as a prognostic marker, given its various cancerous functions. Thus, we selected the lncRNA XIST for in vitro studies due to its performance in our screening results, along with its novelty in HNC. Additionally, various annotations from GO and KEGG, such as 'adherens junction', 'focal adhesion', and 'cell-cell adhesion', all strongly allude to the metastatic functions as well, which correlates with our functional analysis. Our results showed that XIST played essential roles in regulating cell migration and invasion, as silencing by siRNA significantly inhibited these functions in multiple HNC cells. These carcinogenic roles were also confirmed in previous cancer studies, such as liver (50), ovarian (51), and esophageal cancer (52). A review by Zhou et al. also highlighted the potential of XIST as a prognostic marker (53). Therefore, our findings implied that because XIST affected HNC progression through these metastatic functions, it can be used as a prognostic marker. Taken together, XIST can be used as an HNC marker that can predict prognosis, and can potentially be used as therapeutic target.
In conclusion, we established a systematic profiling method to screen for prognostic lncRNA markers in HNC. Our results from the in silico and in vitro combination analysis suggests that the panel of 31-lncRNA signatures may contribute to HNC tumorigenesis, and can provide valuable prognostic data.
Additionally, XIST demonstrated carcinogenic functions in HNC, implicating its ability as a prognostic marker. Overall, our findings greatly contribute to the knowledge of HNC and prognosis, and can potentially be expanded to applications in precision medicine.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
S-JT was responsible for experimental analysis, acquisition of data, and the writing of the manuscript. A-JC and S-JT were responsible for the conceptualization and design. A-JC and JC were responsible for resources, project administration, supervision and funding acquisition. S-JT, A-JC, and G-RY were responsible for the review of the submitted manuscript. All authors contributed to the article and approved the submitted version.