LncRNA ST7-AS1 is a Potential Novel Biomarker and Correlated With Immune Infiltrates for Breast Cancer

Background: Long noncoding RNA (lncRNA) ST7-AS1 can be observed in various cancers, but its role in breast cancer (BRC) remains unclear. Our aim is to, on the basis of The Cancer Genome Atlas (TCGA) database, prove the correlation between lncRNA ST7-AS1 and BRC. Methods: The lncRNA ST7-AS1 expression and its roles in the prognosis of BRC were explored using data from the TCGA database. The expression level of lncRNA ST7-AS1 in BRC samples was detected using RT-PCR. The 1-, 3-, or 5-year survival rate was predicted using a nomogram established through Cox proportional hazard regression. At last, the biological function was explored through gene ontology (GO) analysis and gene set enrichment analysis (GSEA). The hallmark pathways significantly involved in hub genes were described through functional enrichment analysis. The correlation between lncRNA ST7-AS1 expression and immune infiltration was analyzed through single-sample GSEA (ssGSEA). Results: LncRNA ST7-AS1 expression was downregulated in BRC. Decreased lncRNA ST7-AS1 expression in BRC was correlated with advanced clinical pathologic characteristics (high grade, histological type, age, menopause status, and HER2 status), survival time, and poor prognosis. The nomogram was established for using lncRNA ST7-AS1 to predict 1-, 3-, or 5-year survival in patients with BRC. In addition, GO and pathway analyses suggested the involvement of lncRNA ST7-AS1 in cell cycle, DNA repair, and immune cell infiltration in the BRC immune microenvironment. We found the correlation of lncRNA ST7-AS1 with T helper cells and DC cells. Conclusion: Low expression of lncRNA ST7-AS1 indicates poor prognosis and has an impact on cell cycle, DNA repair, and proportion of infiltrating immune cells in the BRC microenvironment. Therefore, lncRNA ST7-AS1 can be used as a protective prognostic marker and a potential treatment target for BRC.


INTRODUCTION
Breast cancer is most commonly observed in female patients with cancer, which is a main cause of death induced by cancer. The proportion of BRC in all newly diagnosed cancers (Balasubramanian et al., 2019) in women reaches 30%, and cancers result in 15% of deaths in women (Siegel et al., 2019). The potential molecular mechanisms underlying tumor formation and progression which further complicate the effective treatment of BRC are poorly understood. Now, early diagnosis and prognosis can decrease the cancer-related mortality by some biomarkers, such as CEA and CA153, and image examinations (Mirabelli and Incoronato, 2013). As breast tumors are heterogeneous, BRC are classified according to their expression of key proteins ER, PR, HER2, etc., for individual therapy (Klinge, 2018). However, these biomarkers are not satisfactory for diagnosis or prognosis due to poor sensitivity and specificity. Thus, it is urgent to find effective prognostic biomarkers for diagnosing and treating BRC.
As a multiple class of transcripts, lncRNA has more than 200 bases and no protein-coding potential (Cech and Steitz, 2014). In tumors, some abnormal expressions of lncRNAs are important regulatory molecules that participate in the regulation of fundamental biological processes, affecting proliferation, apoptosis, metastasis, autophagy, etc. (Yang et al., 2014;Zeng et al., 2017;Chen et al., 2020). With more and more attention being paid to lncRNA as a tumor suppressor or oncogene in various tumors, lncRNA can be used as a therapeutic molecular target or a potential biomarker with prognostic values (Sun et al., 2017;Tian et al., 2017). LncRNA ST7-AS1 is a newly discovered one located on 7q31.2 (Qin et al., 2019). LncRNA ST7-AS1 with Copy Number Variations (CNVs) can cause gene disorder in downstream cancers, which is also strongly linked to proliferation, apoptosis, and cell migration belonging to the biological processes of cancer . Based on current evidence, ST7-AS1 has a strong correlation with human glioma and laryngeal squamous cell carcinoma (Liu et al., 2015;Tian et al., 2017). However, the functions of lncRNA ST7-AS1 in tumors remain unknown.
Here, we presented lncRNA ST7-AS1 as a positive prognostic biomarker for BRC patients in public databases such as TCGA. We also investigated the distinctive genomic alterations and functional networks associated with lncRNA ST7-AS1 expression and evaluated its role in tumor immunity. Our work could potentially reveal more direct evidence for hypothesizing lncRNA ST7-AS1 expression as a potential treatment target or a prognostic biomarker in risk stratification and provide insights into the molecular mechanisms for BRC patients.

RNA Sequencing Data and Clinical Data From the TCGA Database
We obtained the lncRNA ST7-AS1 expression (1,065 cases for the workflow type: HTSeq-FPKM and HTSeq-Counts) and clinical data of BRC projects from TCGA. Those without RNA sequencing data and an overall survival of at least 30 days were excluded. We then converted level 3 HTSeq-FPKM data into TPM (Transcripts Per Million) reads for subsequent analyses. As shown in Table 1, the data of 1,065 patients were summarized. We compared the expression data (HTSeq-Counts) of the high lncRNA expression group and the low lncRNA expression group and identified differentially expressed genes (DEGs) using the DESeq2 R package (Love et al., 2014), with thresholds of |log 2-fold change (FC)| > 1.5 and adjusted p value < 0.05.
Next, we used a total of 15 PCR samples of BRC patients admitted to the Harbin Medical University Cancer Hospital from February 2020 to June 2020. The study protocol has obtained the approval of the Ethics Committee of Harbin Medical University Cancer Hospital and is in conformity with the Declaration of Helsinki.

RNA Isolation and Quantitative RT-PCR
Based on the manufacturer's protocol, we used the Trizol reagent (Invitrogen) to isolate total RNA from clinical samples. Total RNA was reversely transcribed into cDNA for PCR amplification. By using the SYBR Green I real-time detection kit (Cwbio, Beijing, China), we performed real-time quantitative PCR (RT-PCR) on a CFX96 Detection System (Bio-Rad, California, United States). Specific primers for lncRNA ST7-AS1 and GAPDH were as follows: LncRNA ST7-AS1 forward primers: 5′-ACC CTA CTC TGC CTC CCT TAT C-3′, reverse primers: 5′-TAG CAT CTG CCA CCC AAA TC-3′; GAPDH forward primers: GAA GGT GAA GGT CGG AGT CA, reverse primers: TTG AGG TCA ATG AAGG GGTC.

Nomogram Construction and Evaluation
Based on the multivariate Cox analysis results, we established a nomogram to predict the prognosis of BRC patients. According to the prognosis model, we calculated each patient's risk score as the total score of each parameter, which could predict the prognosis of BRC patients. The accuracy estimation of nomogram prediction was obtained from a calibration plot. It was found that the bias-corrected line in the calibration plot was close to the ideal curve (Keynesian cross), indicating a strong consistency between predicted values and observed values. The nomogram discrimination was determined using a concordance index (C-index), and 1,000 resamples were used in calculation by bootstrap approach. In this study, all statistical tests were twotailed, with a statistical significance level of 0.05.

Enrichment Analysis
The GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed for high and low lncRNA ST7-AS1 expression groups on the DEGS using the clusterProfiler package (Yu et al., 2012).

Gene Set Enrichment Analysis
Using the clusterProfiler package (Yu et al., 2012), the subtype specific gene expression patterns and potential cellular pathways were elucidated on GSEA software (Subramanian et al., 2005). Based on the lncRNA ST7-AS1 expression level, we divided gene expression data into high lncRNA ST7-AS1 and low lncRNA ST7-AS1 groups, and each analysis included 1,000 times of gene set permutations. A function or pathway term with a false discovery rate (FDR) of less than 0.25 and a p value of less than 0.05 was considered statistically significant.

Immune Infiltration Analysis by ssGSEA
We used a GSVA R package to quantify the BRC immune infiltration of 24 tumor-infiltrating immune cells in tumor samples through ssGSEA. According to the 509 gene signatures of 24 tumor-infiltrating lymphocytes (TILs) (Bindea et al., 2013), comprising natural killer (NK) cell, T follicular helper cells (Tfhs), CD56 bright NK, CD56dim NK, central memory CD4 + T cell, macrophages, cytotoxic cells, dendritic cells (DCs), CD8 + B cells, effector memory T cell (Tem), eosinophils, gamma delta T cells, activated DC, immature DC, mast cells, neutrophils, plasmacytoid DC, T helper cell, regulatory T cells (Tregs), type-1 T helper cells (Thp1), Th2, and Th17, the relative enrichment score of every immunocyte was quantified. The correlation between lncRNA ST7-AS1 and infiltration levels of immune cells was analyzed by the Spearman correlation, and these immune cells with the different expression groups of lncRNA ST7-AS1 were analyzed by the Wilcoxon rank sum test.

Analysis of Protein-Protein Interactions Network
Search Tool for the Retrieval of Interacting Genes (STRING; https://string-db.org/) is an online database for predicting functional interactions between proteins. The cut-off criterion we used was an interaction score > 0.4. Hub genes were extracted from the PPI network using the Molecular Complex Detection (MCODE) algorithm (Bader and Hogue, 2003) so as to identify crucial subnetworks. Statistical Analysis R (Version 3.5.1) was used for all statistical analyses and plots. The correlations between clinicopathological characteristics and lncRNA ST7-AS1 expression were evaluated using the Chisquared test, Fisher exact test, Kruskal-Wallis (KW) test, Wilcoxon signed-rank test, Wilcoxon rank sum test, and logistic regression. Kaplan-Meier method was adopted to draw survival curves [hazard ratio (HR), 95% CI]. Through univariate and multivariate analysis combined with Cox logistic regression models, other clinical factors impacting the survival and the lncRNA ST7-AS1 expression level were found. All reported p values were two-sided, and those less than 0.05 had statistical significance in all tests. The discrimination ability of lncRNA ST7-AS1 in BRC was evaluated through receiver operating characteristic (ROC) analysis using the pROC package (Robin et al., 2011).

LncRNA ST7-AS1 Expression in BRC Patients From TCGA
To have a better understanding of the correlation of lncRNA ST7-AS1 expression in cancer, we showed the lncRNA ST7-AS1 across all TCGA differential expressions between tumor and adjacent normal tissues ( Figure 1A) or normal tissue ( Figure 1B). To further evaluate the difference of lncRNA ST7-AS1 expression in BRC, 1,065 patients' clinical and gene expression data were downloaded from TCGA ( Table 1). We used the Wilcoxon rank sum test to compare the lncRNA ST7-AS1 expression between BRC samples and adjacent normal tissues. Compared with 1,065 BRC tissues, lncRNA ST7-AS1 had a high expression in 111 healthy tissues (p < 0.001) ( Figure 1C), and lncRNA ST7-AS1 expression in 111 paired normal breast tissues was greatly higher compared with that in 111 tumor tissues (p < 0.001) ( Figure 1D). Based on the observation results of 15 patients' samples by RT-PCR, BRC tissues had a lower lncRNA ST7-AS1 expression level compared with paired normal breast tissues ( Figure 1E). ROC analysis was conducted on data of tumor tissues and normal ones to measure the discrimination value of lncRNA ST7-AS1. The AUC 0.789 ( Figure 1F).
According to univariate analysis using logistic regression, as a categorical dependent variable, the expression of lncRNA ST7-AS1 (based on the median value) was correlated with poor clinicopathological prognosis. Decreased lncRNA ST7-AS1 expression in BRC was significantly correlated with pathologic stage (OR 0.74 for Stage III/IV vs. Stage I/II, p 0.04), HER2 status (OR 0.58 for positive vs. negative, p 0.003), and histological type (OR 2.38 for infiltrating lobular carcinoma      Figure 3B). Similarly, compared with the low expression group, the high expression group had an obvious higher disease-specific survival (DSS) (HR 0.51, CI 0.33-0.80, p 0.003) ( Figure 3C). Moreover, in the presence of TNM stages, multivariable hazards models were used to evaluate the impacts of   Figures 3D-F). Prognostic values of differential expression of lncRNA ST7-AS1 in different subgroups, including anatomic neoplasm subdivision-right, menopause status-post, ER-positive, PRpositive, HER2-positive, infiltrating ductal carcinoma, infiltrating lobular carcinoma, age >60°years, race-white, M stage-M0, N stage N2 and N3, T stage-T2, T stage-T3 (all p < 0.05), are shown in Figure 4. The above data indicated decreased lncRNA ST7-AS1 level is associated with poor prognosis.

Construction and Validation of LncRNA ST7-AS1 Based Nomogram
In order to provide clinicians with predicted prognosis of BRC patients quantitatively, we established the nomogram combining lncRNA ST7-AS1 and independent clinical risk factors (pathologic stage, ER status, and age). Higher total points on the nomogram for OS, progression-free interval (PFI), and DSS, respectively, indicated a worse prognosis (Figures 5A-C).
Based on the calibration curve of nomograms for OS, PFI, and DSS, the predictions conformed well to observations in all patients, and the test showed no deviation from the perfect fit. The nomogram had a C-index of 0.750 and contained 1,000 bootstrap replicates (95% CI: 0.727-0.773) for OS. We also found PFS (C-index: 0.703, CI: 0.676-0.730) and DSS (C-index: 0.778, CI: 0.747-0.809). It was found that the bias-corrected line in the calibration plot was close to the ideal curve (Keynesian cross), indicating a strong correlation between predicted values and observed values ( Figures 5D-F). In summary, these results indicate that the nomogram can well predict short-or long-term survival of BRC patients.

Potential Mechanism of LncRNA ST7-AS1 in Regulating the Progression of BRC
The DESeq2 package was used to analyze the data from TCGA in R (adjusted p value < 0.05, |log2 FC|>1.5). We identified 1,104 DEGs (including 568 and 578 upregulated and downregulated, respectively), 369 differentially expressed lncRNAs (including 187 and 182 upregulated and downregulated, respectively), and 377 differentially expressed mRNAs (including 34 and 343 upregulated and downregulated, respectively) in total ( Figures 6A-C). Relative expression values for the top 10 DEGs between high and low lncRNA ST7-AS1 expression groups are illustrated in Figure 6D. ClusterProfiler was used for GO and KEGG enrichment analyses of the functions of lncRNA ST7-AS1 associated DEGs in BRC. The top GO enrichment items were classified into three functional groups. Molecular functions (MF) mainly involve receptor ligand activity, peptidase regulator activity, endopeptidase regulator activity, peptidase inhibitor activity, and endopeptidase inhibitor activity. The cellular components (CC) were mainly transcription regulation by intermediate filament cytoskeleton, intermediate filament, keratin filament, and cornified envelope. Epidermal development, skin development, epidermal cell differentiation, keratinocyte differentiation, keratinization, and cornification were genes related to biological processes (BP). Based on KEGG enrichment analysis, we found the correlation between retinol metabolism, metabolism of xenobiotics by cytochrome P450, drug metabolism-cytochrome P450, Staphylococcus aureus infection, chemical carcinogenesis, IL-17 signaling pathway, galactose metabolism, and pentose and glucuronate interconversions with DEGs ( Figures 7A-D). Furthermore, we analyzed the protein interactions between lncRNA ST7-AS1 associated genes through the String database to show the relationship between them in the interaction network ( Figure 7E).

GSEA Identifies a Signaling Pathway Related to LncRNA ST7-AS1
We conducted GSEA between high lncRNA ST7-AS1 expression set and low lncRNA ST7-AS1 expression set to identify different signaling pathways in BRC, which revealed  v6.1 symbols). Based on normalized enrichment score (NES), the most significantly enriched signaling pathway was selected, as shown in Figure 8. In particular, lncRNA ST7-AS1 was related to miRNA mediated inhibition of translation, mRNA binding involved in posttranscription, cell cycle, DNA repair, IL6 JAK STAT3 signaling, MYC targets v2, apoptosis, and KRAS signaling.

The Correlation Between LncRNA ST7-AS1 Expression and Immune Infiltration
The Spearman correlation was used to show the correlation of ST7-AS1 expression level (TPM) with immune cell infiltration level quantified by ssGSEA in the BRC tumor microenvironment. We found that ST7-AS1 expression was negatively correlated with the abundances of immunocytes (Th2 cells, Tgd, macrophages, Tcm, etc.) and positively correlated with abundances of innate immunocytes (pDCs, NK cells, B cells, CD8 T cells, etc.) (Figure 9).

DISCUSSION
LncRNA is emerging as a tumor suppressor or oncogene in a number of tumors, as well as a potential therapeutic molecular target or biomarker with prognostic value (Cui et al., 2017). LncRNA ST7-antisense RNA 1 (AS1) is an antisense transcript for tumorigenicity 7 protein (ST7) suppressor located on 7q31.2 (Qin et al., 2019). LncRNA ST7-AS1 may influence the occurrence and development of laryngeal squamous cell carcinoma, but its biological function is entirely unknown, especially in the progression of cancer. As far as we know, there has been no study on lncRNA ST7-AS1 expression and its potential prognostic impact on BRC. This study focuses on the potential role of lncRNA ST7-AS1 in BRC.
Numerous studies have shown that lncRNA expression level plays an important part in the occurrence and progression of cancer (Yao et al., 2019). According to our bioinformatics analysis and high-throughput RNA sequencing data in TCGA, compared with normal tissue, BRC had a significantly lower expression level of lncRNA ST7-AS1. Decreased lncRNA ST7-AS1 expression in BRC was correlated with advanced clinical pathologic characteristics (high grade, histological type, age, menopause status, and HER2 status), survival time, and poor prognosis. Thus, our results suggested that lncRNA ST7-AS1 downregulation occurs in a number of BRC cases, and its role as a potential diagnostic and prognostic marker is worthy of further clinical validation.
In order to make a more accurate prognostic prediction, nomograms have been established, which had a better performance than conventional staging systems in some cancers. In our study, a nomogram was established using the results of multivariate Cox analysis which combined lncRNA ST7-AS1 with independent clinical risk factors (pathologic stage, ER status, and age). The calibration plot showed a good consistency between actual values and predicted values for 1-, 2-, and 3-year OS, DFI, and DSS. We constructed the model based on a complementary perspective of each tumor to provide personalized scores for individual patients. There is another new finding from this study; that is, lncRNA ST7-AS1 participates in the cell cycle, apoptosis, and DNA repair. Genomic instability and mutagenesis belong to basic characteristics of cancer cells. Kinases and their associated signaling pathways conduce to the stabilization and repair of genomic DNA (Karimian et al., 2016;Yogosawa and Yoshida, 2018). LncRNA ST7-AS1 in BRC plays a role in regulating various signaling pathways related to inflammation, such as the IL6/JAK STAT3 signal pathway. It has been shown in previous studies that the IL-6/JAK STAT3 pathway is abnormally highly activated in many cancers, which is generally associated with poor clinical prognosis. In the tumor microenvironment, IL-6/JAK STAT3 signaling contributes to proliferation, invasiveness, and metastasis of tumor cells and strongly inhibits the antitumor immune response (Johnson et al., 2018). Moreover, STAT3 acts as the main mediator of RCC proliferation induced by IL-6 (Horiguchi et al., 2002).
Equally important, our study used ssGSEA and Spearman correlation to reveal the association between lncRNA ST7-AS1 expression and immune infiltration levels in BRC. We found the association between lncRNA ST7-AS1 and T helper cells and DC cells. Many tumors, including lung cancer, glioma, cervical cancer, BRC, gastric cancer, and colorectal cancer, all have Th1/Th2 balanced drift in the body, and Th2 cells are often dominant, which may be related to the immune escape of tumors (Shurin et al., 1999;Matsuzaki et al., 2005). Due to the relationship between Th1/Th2 balance drift and a variety of diseases, more and more research is tending to find and develop drugs and methods that can reverse or stabilize the Th1/Th2 balance. For example, the application of cytokines or cytokine antagonists to reverse Th1/Th2 balance drift in the treatment of tumors and other diseases: Th1 cytokines can promote the transformation of Th1/Th2 balance to Th1 state, while inhibiting the dominant expression of Th2. Th2 cytokines had the opposite effect (Hong et al., 2013;Saxena and Kaur, 2015). There is a close correlation between DC and tumor occurrence and development, and patients with a large number of infiltrating DC in most solid tumors have a good prognosis. The core of an effective antitumor immune response is to generate a cellular immune response dominated by CD8 + T cells, which are also the basis of DC as an immunotherapy (Huber et al., 2018).
Although the approach used in this study helps to understand the relationship between lncRNA ST7-AS1 and BRC, some limitations still exist. Firstly, to fully and comprehensively elucidate the specific role of lncRNA ST7-AS1 in BRC development, various clinical factors, such as details of patients' treatments, should be considered. However, because the experiments are carried out in different laboratories, the public database is lack of such information or the treatments are inconsistent. Secondly, the number of healthy subjects used as controls differs significantly from that of cancer patients in this study, so further studies should be carried out to maintain a balanced sample size. In conclusion, limitations still exist in retrospective studies, especially the inconsistency of interventions and the lack of certain information. Therefore, future prospective studies are needed to eliminate analysis bias of this study which is retrospective in nature. Because this study was based on RNA sequencing data from the TCGA database, we could not elucidate the lncRNA ST7-AS1 expression at the protein level, nor could we clearly evaluate the direct mechanism by which lncRNA ST7-AS1 participates in the development of BRC. Therefore, the direct mechanism in BRC should be further studied.

CONCLUSION
To conclude, our study showed decreased expression of lncRNA ST7-AS1 in BRC samples. Furthermore, low expression of lncRNA ST7-AS1 was associated with adverse outcomes. We also established a nomogram using lncRNA ST7-AS1 to predict 1-, 3-, or 5-year survival in patients with BRC. In terms of biological functions, we showed the involvement of lncRNA ST7-AS1 in cell cycle, DNA repair, and immune cell infiltration in the BRC immune microenvironment. We found the correlation of lncRNA ST7-AS1 with T helper cells and DC cells. In conclusion, these results indicate that lncRNA ST7-AS1 has a potential role in BRC as a prognostic marker and therapeutic target.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: The Cancer Genome Atlas (https://portal.gdc. cancer.gov/).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Harbin Medical University Cancer Hospital and are in conformity with the Declaration of Helsinki. The patients/participants provided their written informed consent to participate in this study.