Risk Stratification and Validation of Eleven Autophagy-Related lncRNAs for Esophageal Squamous Cell Carcinoma

Esophageal squamous cell carcinoma (ESCC), the most prevalent subtype of esophageal cancer, ranks sixth in cancer-related mortality, making it one of the deadliest cancers worldwide. The identification of potential risk factors for ESCC might help in implementing precision therapies. Autophagy-related lncRNAs are a group of non-coding RNAs that perform critical functions in the tumor immune microenvironment and therapeutic response. Therefore, we aimed to establish a risk model composed of autophagy-related lncRNAs that can serve as a potential biomarker for ESCC risk stratification. Using the RNA expression profile from 179 patients in the GSE53622 and GSE53624 datasets, we found 11 lncRNAs (AC004690.2, AC092159.3, AC093627.4, AL078604.2, BDNF-AS, HAND2-AS1, LINC00410, LINC00588, PSMD6-AS2, ZEB1-AS1, and LINC02586) that were co-expressed with autophagy genes and were independent prognostic factors in multivariate Cox regression analysis. The risk model was constructed using these autophagy-related lncRNAs, and the area under the receiver operating characteristic curve (AUC) of the risk model was 0.728. To confirm that the model is reliable, the data of 174 patients from The Cancer Genome Atlas (TCGA) esophageal cancer dataset were analyzed as the testing set. A nomogram for ESCC prognosis was developed using the risk model and clinic-pathological characteristics. Immune function annotation and tumor mutational burden of the two risk groups were analyzed and the high-risk group displayed higher sensitivity in chemotherapy and immunotherapy. Expression of differentially expressed lncRNAs were further validated in human normal esophageal cells and esophageal cancer cells. The constructed lncRNA risk model provides a useful tool for stratifying risk and predicting the prognosis of patients with ESCC, and might provide novel targets for ESCC treatment.


INTRODUCTION
Esophageal cancer (EC) has an increasingly notable cancer burden, accounting for approximately 16% (Ferlay et al., 2015) cancer-related mortality worldwide (Smyth et al., 2017). According to GLOBOCAN estimates, over 604,100 new cases of EC and 544,076 EC-related deaths occurred in 2020 (Thrift, 2021). The two main histological subtypes of EC are esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). In all EC cases, the proportion of ESCC was >90%. Although its incidence has declined over the past decades, the survival ratios for EC are among the lowest for cancers, mainly because of the late stage at diagnosis and high aggressiveness of the disease. The fatality rate of ESCC is even higher than that of EAC, with a 5-year overall survival rate of <15%. Hence, there is an urgent need to search for effective screening methods and risk stratification to improve patient prognoses.
In the transcriptome, less than 1-2% of RNAs encode proteins and undergo the translation process. Thus, most RNAs are non-coding (Beermann et al., 2016); among these, RNAs with more than 200 nucleotides are called long noncoding RNAs (lncRNAs) (Djebali et al., 2012;Dykes and Emanueli, 2017). Compared with protein-coding messenger RNAs, whose functions are extensively studied, the specific function of most lncRNAs remains unknown (Johnson et al., 2005). lncRNAs are reported to be crucial in many biological processes, including epigenetic modification, cell cycle regulation, and differentiation (Beermann et al., 2016). Although the mechanism by which lncRNAs regulate physiological activities is unclear, their significance, especially in tumor growth and metastasis, has been reported (Quinn and Chang, 2016;Li et al., 2018;Chi et al., 2019). Autophagy has become a popular research topic since Yoshinori Ohsumi was awarded the Nobel Prize in Physiology or Medicine for his contribution in elucidating its mechanism in 2016. Thus, stimulation or inhibition of autophagy in cancer cells has also become a promising therapeutic strategy (Levy et al., 2017). Considering the complexity of various biological processes, the role of autophagy in tumors can be both positive and negative (Russo and Russo, 2018). Autophagy can be metaphorized as a double-edged sword (Pietrocola et al., 2016). In the specific cellular microenvironments of certain tumors (Galluzzi et al., 2015), autophagy can either promote or inhibit cancer development. Despite these paradoxical approaches, there are no reports on autophagy-related lncRNAs as prognostic biomarkers for patients with ESCC.
In this study, using transcriptional and clinical data from two databases, TCGA database and the Gene Expression Omnibus (GEO) datasets, GSE53622 and GSE53624, we first constructed an autophagy-associated lncRNA model and validated its prognostic value. Immune function, tumor mutation burden (TMB), and therapeutic response of the two risk groups were further explored. To determine the expression level of autophagy-related lncRNAs in cultured human cells, the selected lncRNAs were analyzed using quantitative real-time polymerase chain reaction (qRT-PCR).

Esophageal Squamous Cell Carcinoma Clinical and Transcriptional Datasets
The clinical data and lncRNA expression profiles of 179 ESCC patients in the GSE53622 and GSE53624 datasets were obtained from the GEO database and re-annotated using the GPL18109 platform. Data from 174 patients with EC were obtained from TCGA (https://cancergenome.nih.gov/) and used for independent external validation.

Screening of Autophagy-Related Genes and Long Non-Coding RNAs Screening
After annotation and categorization using GENCOED (https:// www.gencodegenes.org), all extracted mRNA and lncRNA expression profiles were compared against the HaDb website, an online database dedicated to collecting ARGs and proteins (http://autophagy.lu/clustering/index.html). After autophagyrelated mRNAs were selected, correlations between autophagyassociated mRNAs and their co-expressed lncRNAs were analyzed using the Pearson method, and the screening criteria were |R 2 | > 0.3 and p < 0.001. Autophagy-related lncRNAs were defined based on these criteria. Cytoscape was used to visualize the correlation network of autophagy genes and their associated lncRNAs. All mRNA sequencing data were standardized using the limma package (version 3.22.7).

Construction of a Prognostic Autophagy Long Non-Coding RNAs Model
Using univariate Cox regression analysis, 43 lncRNAs were identified from all selected autophagy-related lncRNAs. Further multivariate regression analysis revealed the statistically significant prognostic autophagy-associated lncRNAs, which were used in constructing the model. Based on the coefficients of these lncRNAs, the patient's risk value was calculated formula as follows: Risk score β 1 X 1 + β 2 X 2 + / + β n X n The β value represents the regression coefficient of each lncRNA, and the X value represents its transcriptional expression. To increase the accuracy of this risk assessment formula, lncRNA expression levels were weighted using regression coefficients for the linear combination of allocating risk scores. The β value was obtained by the logarithmic transformation of the HR value. After evaluating the risk values of all patients, they were classified into high or low groups based on the median risk value.

Assessment and External Validation of the Prognostic Model
Kaplan-Meier (KM) survival curves of the high-and low-risk groups were plotted to compare the overall survival of the two groups. Based on the clinical data of the GSE53624 and GSE53622 training sets, the receiver operator characteristic curve (ROC) of clinical features such as age, sex, grade, and other characteristics were plotted, and the predictive ability of each feature was evaluated by the area under the curve (AUC). SPSS software was used for statistical analysis and statistical significance was set at p < 0.05. Using the same standards and methods, TCGA dataset was obtained as the testing set to further confirm the stability and reliability of the model.
Nomogram was generated including risk scores and other clinical characteristics with R package of "survival", "regplot" and "rms". Calibration curves of 1-year survival, 3-year survival and 5-year survival were delineated.

Functional Analysis
Through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, we found that different signaling pathways were enriched in different groups. KEGG pathway analysis was implemented based on a gene-set enrichment analysis (GSEA) software, which was downloaded from the website, https://www.gsea-msigdb.org/ gsea/index.jsp. A false discovery set of 1000 repeats, p-value < 0.05, and q-FDR < 0.25 were considered valid. The gene ontology (GO) database was used for gene and gene product classifications.

Immune Function Heatmap and Tumor Mutational Burden Analysis
Differential analysis of immune-associated genes in the high or low risk groups of patients were performed and visualized using the R package "ssGSEA". Simple nucleotide variation (SNV) profile of the TCGAdataset was downloaded to analyze the tumor mutational burden (TMB) in high or low risk group. Survival curves of different TMB and risk subgroups were analyzed. Therapeutic Response Analysis The "pRRophetic" package was used to estimate the therapeutic sensitivity of patients in high or low risk group based on half maximal inhibitory concentration (IC50) of anticancer drugs in the Cancer Genome Project (CGP) database. Filtering criteria was p < 0.05.

Co-Expression Diagram of Autophagy-Related Long Non-Coding RNAs
The co-expression diagram of the 11 prognostic-related lncRNAs and ARGs is shown in Figure 1A. To describe the crosstalk between lncRNAs and ARGs as well as its role in patient survival outcomes, we To validate the ability of the 11 candidate lncRNAs to predict patient prognosis, KM analysis was performed using the survival data of patients with ESCC, as shown in Figure 2. Based on the Sankey analysis, patients with high AC092159.3, BDNF-AS, or ZEB1-AS1 expression showed significantly shorter survival with a lower median overall survival, whereas patients with high AC004690.2, AC093627.4, AL078604.2, HAND2-AS1, LINC00410, LINC00588, LINC02586, or PSMD6-AS2 expression had a lower risk and longer overall survival.

Development of a Prognostic Autophagy Long Non-Coding RNAs Signature in Esophageal Squamous Cell Carcinoma
Based on the derivation equation, we comprehensively determined the risk value of every patient by multiplying the expression levels of the 11 lncRNAs with their correlation coefficients. Depending on the median value of the risk score, patients were categorized into high-or low-risk groups, and the prognosis of both groups was compared using Kaplan-Meier analysis ( Figure 3A). We found that patients in the high-risk group had significantly worse outcomes, both in terms of survival and duration of survival, than those in the low-risk group. The 5year survival rate of the high-risk group was approximately 20%, compared with 60% in the low-risk group (p < 0.0001).
To evaluate the performance of the risk model, we also drew a multi-index ROC curve ( Figure 3B), with an AUC value of 0.728, which was higher than that of other clinical characteristics, indicating that the constructed risk model has the best value for predicting the outcome of patients with ESCC. Furthermore, the risk curve, in addition to the heat map of all patients, is shown in Figures 3C-E. As shown in Figures 3B-D, as the risk score of patients increases, the survival rate of patients decreases.

External Validation
To validate the model, we used data from patients in TCGA database as an external testing set. The risk analysis of the testing set, and KM survival curve are shown in Figures 4A-C. In the KM survival curve, the two groups were distinguished, and the p-value was less than 0.05. Our constructed risk model for autophagy and prognostic lncRNAs was thus validated as a significant prognostic indicator.

Independent Prognostic Analysis and Nomogram
Finally, we conducted univariate and multivariate prognostic analyses based on the risk values in both the training and testing sets. The results showed that in the univariate regression test ( Figures 5A,B), the p-value was <0.001 and the HR value was 1.224, whereas in the multivariate regression, the p-value was <0.001 and the HR value was 1.191. Briefly, independent prognostic analysis, whether univariate or multivariate, confirmed that our established risk model can be an independent risk indicator to precisely evaluate the outcome of patients with ESCC. In order to obtain a more accurate evaluation tool for predicting each patient's prognosis, we combined the autophagy-related lncRNA risk model with other clinical characteristics and built a nomogram as shown in

Gene Set Enrichment Analysis
Using GO annotation and KEGG pathway enrichment, the lncRNAs were found to be enriched in 19 pathways with 50 annotated terms; the top5 GO annotations are shown in Figure 6A. GO enrichment analysis showed that the selected lncRNAs were mainly involved in biological functions associated with various autophagy processes in cells, as well as autophagyrelated signaling pathways. The other related pathways included mTOR signaling pathways, insulin signaling pathways, and choline metabolic signaling pathways ( Figure 6A). To further investigate the underlying molecular interaction network of the screened features in esophageal squamous cell carcinoma, the gene sets were analyzed using GSEA. The filter criteria were p-value < 0.05 and q-FDR value <0.25 ( Figures 6B,C). Different enrichments resulted in significantly different risk sets. As shown in Figure 6B, the B cell receptor signaling pathway and Acute myeloid leukemia pathway were significantly enriched in the lowrisk set, which was connected with immune regulation, suggesting that activation of the B cell receptor signaling pathway can regulate immune function in the low-risk set, thus predicting a better prognosis and longer survival time. Unfortunately, in the high-risk set, we did not obtain distinct enrichment results, suggesting that the group with a low-risk score was associated with activated immune function. These data provide potential for further research on the personalized treatment of ESCC.

Immune Function Heatmap and Tumor Mutational Burden
It has been reported that autophagy plays certain role in mediating innate and adaptive immune responses. In the KEGG and GO analyses we noticed enrichments in cellular autophagy and immunity. Therefore, we compared the difference of immune functions of the high and low risk groups and the heatmap is shown in Figure 7A. Several attempts have been made to identify predictive biomarkers for immunotherapy response. One of the most intriguing and divisive is TMB. Hence, we explored the TMB of the high and low risk group as shown in Figures 7B-C. Generally, the frequency of mutation is higher in the high-risk group and we also found the majority mutation in the high-risk group was associated with mismatch-repair deficiency. Patients with high TMB had shorter OS and unfavorable prognosis. Survival analyses combining TMB and risk model of patients with ESCC is shown in Figures 7D-E.

Chemotherapy and Immunotherapy Sensitivity
Therapeutic response analysis showed patients with ESCC in the highrisk group showed lower IC50 in five commonly used chemotherapy drugs for cancer treatment (Cisplatin, Cytarabine, Docetaxel, Paclitaxel, and Vinblastine) (Figure 8), indicating that the highrisk patients are more sensitive to chemotherapy. We also found a lower IC50 of Lenalidomide in the high-risk group, suggesting that Frontiers in Genetics | www.frontiersin.org June 2022 | Volume 13 | Article 894990 10 these patients are more likely to benefit from Lenalidomide immunotherapy.

Expression of Signature Long Non-Coding RNAs in Esophageal Squamous Cell Carcinoma Cell Lines by Quantitative Real-Time Polymerase Chain Reaction
Differential expression of the 11 autophagy-realted lncRNAs in cancer versus normal tissue is shown in Figure 9A. 7 (BDNF-AS, HAND2-AS1, LINC00410, LINC00588, PSMD6-AS2, ZEB1-AS1, and LINC02586) of the 11 lncRNAs were differentially expressed in cancer and normal tissues. Here we selected four sequenceavailable lncRNAs (BDNF-AS, HAND2-AS1, LINC00588, and ZEB1-AS1) of the seven differentially expressed lncRNAs with qRT-PCR. As shown in Figures 9B-E, the expression levels of HAND2-AS1 and LINC00588 were significantly lower in KYSE30 and KYSE150 cells than that in HEEpiC, p < 0.05, whereas the expression levels of BNDF-AS and ZEB1-AS1 were higher in KYSE30 and KYSE150 compared to that in HEEpiC cells. The remaining three lncRNAs could not be quantified because of a lack of transcriptome information in NCBI.

DISCUSSION
In 2018, more than 572,000 patients were newly diagnosed with EC (Bray et al., 2018). Treatment of EC remains a challenge because of its recurrence and unfavorable prognosis. The overall 5-year survival rate of EC can be as low as 20% owing to the advanced stage at diagnosis (mainly stage III or IV) and its high invasiveness (Anderson et al., 2015). Although the regional distribution of the two main pathological subtypes has changed over the past 4 decades, ESCC still accounts for the majority of EC cases (Siegel et al., 2019). Therefore, understanding the epidemiological characteristics and risk factors of EC is crucial for public health and clinical decision making, including risk assessment, disease screening, and prevention. The current research on risk stratification mainly focuses on screening Barrett's esophagus and gastroesophageal reflux disease (GERD) symptom rating scales (Rubenstein et al., 2013;Dong et al., 2018;Rubenstein et al., 2020). A valid risk stratification tool for ESCC thus remains lacking.
Autophagy is a highly conserved pathway that plays a crucial role in both normal and cancerous cells. Multiple cancers exhibit disrupted autophagy regulation. Autophagy is responsible for alterations in cancer cell metabolic regulation and plays a critical role in promoting immune escape. Targeting the autophagy mechanisms remains a promising strategy for the treatment of an increasing number of cancers. Emerging evidence suggests that lncRNAs may also play a crucial role in tumorigenesis (Liu et al., 2022a;Liu et al., 2022b). Recently, Zhang et al. (2020) reported a co-expression pattern of lncRNA HOTAIR and MTHFR, which regulate the biological behavior of EC cells. According to Hu et al. (2021), lncRNA RP11-465B22.8 can be delivered to macrophages via exosomes to induce the M2 phenotype, thus enhancing the migration and invasion of EC cells. Their research indicates that lncRNAs are a novel target and based on their regulation of immunity in EC they can potentially provide new therapeutic strategies. In a study by Wu et al. (2022), 22 autophagyrelated lncRNAs were included in the risk assessment of patients with EAC, but none of these lncRNAs were identical to those included our stratification model. This may be caused by differences in histological classification. Prognostic lncRNAs in other cancer types have also been reported in recent years, including breast cancer , colon cancer (Zhou et al., 2020), pancreatic cancer (Deng et al., 2020), and bladder cancer (Lai et al., 2020). Most of these models have been constructed based on public databases, with other datasets used for validation. The AUC value varied from 0.6 to 0.9, and was statistically significant. In a previous study (Shi et al. (2021), also reported a prognostic model in ESCC consisted of nine autophagy-related lncRNAs. It is noticed that the 11 lncRNA signatures in our model had no overlap with theirs. This may due to different statistical approach and filtering criteria. In their study, three of the nine lncRNAs for model construction were independent prognostic factors. In comparison, all 11 lncRNAs in our model were significant by multivariate Cox analyses and the model was externally validated using other databases in addition to differential expression validation by qRT-PCR. In this study, we used autophagyassociated lncRNAs as prognostic stratification biomarkers to screen for ESCC risk. Validation in an independent database showed that the AUC value (0.647) of our signature was higher than that of other clinical characteristics. Kaplan-Meier analysis showed that the two risk groups based on our risk stratification model showed different prognoses, with a p < 0.001. A nomogram for predicting patients' OS was built with the risk model and clinicpathological features. Further TMB and therapeutic response analyses displayed significant distinctions between the two risk groups. Among the 11 lncRNAs screened for our risk model, seven (BDNF-AS, HAND2-AS1, LINC00410, LINC00588, PSMD6-AS2, ZEB1-AS1, and LINC02586) were found to be differentially expressed in adjacent normal tissues and cancer tissues. Of these, four (BDNF-AS, HAND2-AS1, LINC00588, and ZEB1-AS1) lncRNAs were further quantified in cultured human ESCC cells and normal epithelial cells; the results obtained were consistent with our data analysis. Unfortunately, the remaining three lncRNAs could not be quantified because of a lack of transcriptome information in NCBI. Among the autophagy-related lncRNAs in our risk model, differential expression of the following three lncRNAs, BDNF-AS1, HAND2-AS1, and ZEB1-AS1, in normal and cancer tissues is commonly found in several cancer types with critical roles in cancer progression. In colon cancer, low expression of BDNF-AS1 can upregulate glycogen synthase kinase-3+, thereby inhibiting the proliferation, invasion, and metastatic ability of colon cancer cells (Zhi and Lian, 2019). In esophageal cancer cell lines, BDNF-AS1 can coregulate mir-214 and thus regulating the growth and invasion of esophageal cancer cells. Further, BDNF-AS1 has diverse effects on other cancer types, mostly inhibiting epithelial-mesenchymal transition (EMT) and tumor formation in tumor cells (Cao et al., 2010). The lncRNA HAND2-AS1 inhibits tumorigenesis and is expressed in various tumor tissues at lower levels than in adjacent normal tissues . Abnormal HAND2-AS1 expression is associated with tumor progression and prognosis. Reduced HAND2-AS1 is reported to inhibit cancer growth and correlates with clinical features such as lymph node involvement, histological differentiation (Yang et al., 2017), tumor size, and staging. ZEB1-AS1 is localized on chromosome 10p11.22 with two exons and one intron in between (Gao et al., 2019), and is derived from the ZEB1 promoter (Jiao et al., 2020). Its subcellular localization is in the nucleus. ZEB1-AS1was identified early for its prominent role in promoting cancer cell proliferation (Chen and Shen, 2020). Further, ZEB1-AS1 expression is found to be significantly higher in hepatocellular tumor tissues than in normal tumor-adjacent tissues, and is abnormally elevated in metastatic HCC tissues. High ZEB1-AS1 expression has also been detected in various hepatocellular carcinoma cell lines.
The limitation of our research is that the enrolled patients' data were mainly obtained from public databases. A larger number of patients with follow-up data are needed to further validate the performance of the model. Moreover, the AUC of our model could potentially be improved by integrating multi-omics data if available. However, the biological functions and underlying mechanisms of most lncRNAs remain elusive.
To our knowledge, this study presents the first autophagyassociated lncRNA signature for risk assessment in ESCC. This autophagy-related lncRNA model provides an effective means for predicting the prognosis of patients with ESCC; moreover, some of these lncRNAs might also serve as novel targets for ESCC treatment.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.