A Novel RNA-Binding Protein Signature to Predict Clinical Outcomes and Guide Clinical Therapy in Gastric Cancer

Objective: This study aimed to develop an RNA-binding protein (RBP)-based signature for risk stratification and for guiding clinical therapy in gastric cancer. Methods: Based on survival-related RBPs, an RBP-based signature was established by LASSO regression analysis in the TCGA dataset. Kaplan–Meier curves were compared between high- and low-risk groups. The predictive efficacy of this signature was assessed via ROC curves for 1-, 3-, and 5-year survival. Its generalizability was verified in an external dataset. After adjustment for other clinicopathological characteristics, the independence of survival prediction was evaluated via multivariate Cox regression and subgroup analyses. GSEA was used to identify activated pathways in the two groups. Stromal score, immune score, tumor purity, and infiltration levels of 22 immune cell types were determined in each sample via the ESTIMATE and CIBERSORT algorithms. Sensitivity to chemotherapy drugs was assessed through the GDSC database. Results: High-risk patients exhibited more unfavorable clinical outcomes than low-risk patients. This signature performed well in predicting 1-, 3-, and 5-year survival and was independently predictive of patients' survival. Calcium signaling, ECM receptor interaction, and focal adhesion were highly enriched in high-risk samples. High-risk samples presented increased stromal and immune scores and reduced tumor purity. Moreover, this signature was closely related to immune infiltration. Low-risk specimens were more sensitive to sorafenib, gefitinib, vinorelbine, and gemcitabine than high-risk specimens. Conclusion: This RBP-based signature may be a promising tool for predicting clinical outcomes and guiding clinical therapy in gastric cancer.


INTRODUCTION
Gastric cancer ranks fifth in incidence and third in mortality among cancers worldwide (1,2). Patients are diagnosed histologically following endoscopic biopsy and staged by computed tomography, endoscopic ultrasound, positron emission tomography, or laparoscopy (3). Gastric cancer is highly heterogeneous at the molecular and phenotypic levels. Subjects with the same TNM stage who receive similar therapeutic regimens present varied prognoses, emphasizing that TNM stage by itself cannot provide complete prognostic information (4). Endoscopic surgery is a primary therapeutic method for early-stage subjects. Nevertheless, most patients are diagnosed at an advanced stage and have missed the optimal time for surgery. Despite adjuvant chemotherapy, immunotherapy, and targeted therapy, the median survival time of advanced subjects is <1 year (5). Hence, innovative strategies are required to improve risk stratification and the accuracy of clinical outcome prediction.
RNA-binding proteins (RBPs) are proteins that interact with a variety of RNAs. At present, 1,542 human RBP genes have been identified, which participate in posttranscriptional regulation such as RNA splicing, polyadenylation, editing, modification, and translation (6)(7)(8). Aberrant expression of RBPs may drive the progression of various malignancies, including gastric cancer (9,10). RBPs are widely expressed in tumor cells, where they affect the translation of mRNAs into proteins and carcinogenic processes (11). Increasing evidence has highlighted the clinicopathologic implications of the immune microenvironment for survival outcomes and therapeutic efficacy in gastric cancer (12). Recent findings indicate that RBPs may affect the immune microenvironment across different cancer types (13). For example, the RBP SORBS2 inhibits metastatic colonization of ovarian cancer by enhancing the stability of tumor-suppressive immunomodulatory transcripts (13). An in-depth understanding of the roles of RBPs may offer innovative ideas for immunotherapy of gastric cancer. Previously, Huang et al. (14) proposed a 6-RBP signature that predicted the survival of hepatocellular carcinoma with high accuracy. Li et al. (15) developed a 9-RBP signature with accurate predictive efficacy for the prognosis of lung squamous cell carcinoma patients. However, an RBP-based gene signature for gastric cancer is still lacking. Furthermore, the relationships between RBPs and the immune microenvironment require further analysis. Here, using public datasets, this work developed and verified an RBP-based model that performed well in predicting patients' survival and was significantly associated with the immune microenvironment.
Abbreviations: RBP, RNA-binding protein; TCGA, the cancer genome atlas; GEO, gene expression omnibus; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic curve; AUC, area under the curve; HR, hazard ratio; CI, confidence interval; GSEA, gene set enrichment analysis; ESTIMATE, estimation of STromal and immune cells in MAlignant tumour tissues using expression data; CIBERSORT, Cell type identification by estimating relative subsets of RNA transcripts; GDSC, genomics of drug sensitivity in cancer; IC50, half maximal inhibitory concentration.

Establishment and Validation of a Prognostic RNA-Binding Protein Gene Signature
Univariate Cox regression analyses were used to analyze associations between RBPs and clinical outcomes of gastric cancer. Prognosis-related RBPs with p < 0.05 were retained. Then, least absolute shrinkage and selection operator (LASSO) regression analyses were adopted to identify key prognostic RBPs (19). The risk score of each subject was determined by the formula: $\text{risk score} = \sum_{i} \beta_i \times \text{Exp}_i$, where $\text{Exp}_i$ is the expression level of gene $i$ and $\beta_i$ is the regression coefficient of gene $i$. The median risk score was used as the cutoff value, and subjects were separated into high- and low-risk subgroups. Survival probability was estimated with Kaplan-Meier curves and compared between the two groups by log-rank test. Receiver operating characteristic curves (ROCs) for 1-, 3-, and 5-year survival were generated via the ROC package in R, and the area under the curve (AUC) was then determined. With the same cutoff value, the predictive efficacy of the RBP gene signature was validated in the verification set.
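The risk-score formula and median split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the gene names, coefficients, and expression values are hypothetical, and assigning scores strictly above the cutoff to the high-risk group is an assumption, since the text does not state how ties at the median are handled.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Sum over signature genes of (regression coefficient * expression level)."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def stratify(scores, cutoff):
    """Label each score 'high' if it exceeds the cutoff, else 'low' (assumed tie rule)."""
    return ["high" if score > cutoff else "low" for score in scores]

# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
training = [
    {"GENE_A": 2.0, "GENE_B": 4.0},
    {"GENE_A": 4.0, "GENE_B": 0.0},
    {"GENE_A": 1.0, "GENE_B": 0.0},
    {"GENE_A": 8.0, "GENE_B": 4.0},
]

train_scores = [risk_score(sample, coefficients) for sample in training]
cutoff = median(train_scores)          # median of the training-set scores
train_groups = stratify(train_scores, cutoff)

# The verification set reuses the training cutoff instead of its own median.
validation_sample = {"GENE_A": 3.0, "GENE_B": 2.0}
validation_group = stratify([risk_score(validation_sample, coefficients)], cutoff)[0]
```

Reusing the training cutoff in the verification set, as the text describes, keeps the definition of "high risk" fixed across cohorts.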

Univariate and Multivariate Cox Regression Analyses
To analyze the relationships between clinical factors (age, gender, grade, stage, TNM, and risk score) and survival, univariate Cox regression analyses were carried out separately in the training and verification sets. Whether each factor predicted survival independently was evaluated via multivariate Cox regression analyses. Hazard ratios (HRs), 95% confidence intervals (CIs), and p values were calculated.

Subgroup Analyses
Patients were separated into subgroups on the basis of clinicopathological characteristics, including age (>65 and ≤65), gender (female and male), grade (grades 1-2 and grade 3), stage (stages I-II and stages III-IV), T (T1-2 and T3-4), N (N0 and N1-3), and M (M0 and M1). Within each subgroup, Kaplan-Meier curves were drawn for high- and low-risk subjects and compared by log-rank test.
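As an illustration of the log-rank comparison applied within each subgroup, the following is a minimal two-group sketch in pure Python. It is not the R implementation used in the study. The function name and the survival times are hypothetical, and the p value is taken from the chi-square distribution with one degree of freedom.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p value).

    times_*  : follow-up times
    events_* : 1 if death was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)   # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n                # expected deaths in A under H0
        if n > 1:                                # hypergeometric variance term
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # chi-square survival function, 1 df
    return chi2, p_value

# Hypothetical follow-up times (all deaths observed) for one subgroup.
high_risk_times, high_risk_events = [1, 2, 3, 4], [1, 1, 1, 1]
low_risk_times, low_risk_events = [5, 6, 7, 8], [1, 1, 1, 1]
chi2, p_value = logrank_test(high_risk_times, high_risk_events,
                             low_risk_times, low_risk_events)
```

In a subgroup analysis the same function would be called once per stratum (for example, once for patients aged >65 and once for those aged ≤65).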

Pathway Enrichment Analysis
The gene set enrichment analysis (GSEA) 4.0.3 software was utilized in identifying activated signaling pathways in high- and low-risk subgroups (21,22).

Estimation of Stromal Score, Immune Score, and Tumor Purity
Stromal score, immune score, and tumor purity for each specimen were evaluated via the Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data (ESTIMATE) algorithm (23). The differences in stromal score, immune score, and tumor purity between the two subgroups were compared through the Wilcoxon rank-sum test. Kaplan-Meier curves were conducted for estimating survival differences between different subgroups, as follows: high vs. low stromal score, high vs. low immune score, and high vs. low tumor purity.
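For readers unfamiliar with how a Kaplan-Meier curve is built, a minimal product-limit estimator is sketched below. The function name and the follow-up data are hypothetical. In the study such curves would be computed separately for each group (for example, high vs. low stromal score) and then compared.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    Returns a list of (event time, survival probability just after that time).
    events: 1 if death was observed, 0 if the subject was censored.
    """
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up: the subject at time 2 is censored.
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]
curve = kaplan_meier(times, events)
# curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

The censored subject lowers the number at risk at later times without producing a drop in the curve, which is why the step at time 3 is from 0.75 to 0.375 rather than to 0.5.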

Assessment of Immune Cell Infiltration
The infiltration levels of 22 immune cell types were quantified in gastric cancer specimens utilizing the Cell type Identification by Estimating Relative Subsets of RNA Transcripts (CIBERSORT) algorithm as well as the LM22 gene sets containing 547 markers (24). The comparison of immune cell types between the high- and low-risk groups was carried out through the Wilcoxon rank-sum test.

Estimation of Immune Checkpoint Expression
The expression levels of 47 immune checkpoints were estimated in gastric cancer samples. Their expression was compared in the high- and low-risk groups by the Wilcoxon rank-sum test.

Drug Sensitivity Assessment
The sensitivity to different chemotherapy drugs for each sample was estimated through the Genomics of Drug Sensitivity in Cancer (GDSC; https://www.cancerrxgene.org/) database (25). The calculation of half maximal inhibitory concentration (IC50) was achieved through the pRRophetic package in R (26).

Construction of a Prognostic Nomogram Model
A nomogram was constructed with the rms and survival packages in R. This nomogram contained the independent prognostic factors. Calibration curves were then plotted to evaluate how well the nomogram predicted 1-, 3-, and 5-year clinical outcomes.

Statistical Analyses
All analyses were performed with available packages in R 3.4.1 (http://www.R-project.org). Comparisons between two groups were performed by the Wilcoxon rank-sum test or Student's t-test. Values of p < 0.05 indicated statistical significance.
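The Wilcoxon rank-sum test used for the two-group comparisons can be sketched as below. This is a simplified illustration under stated assumptions: it uses the tie-corrected normal approximation without continuity correction, so its p values may differ from those of R's `wilcox.test`, which can compute exact p values for small samples and applies a continuity correction by default. The function name and the scores are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for x, two-sided p value).
    """
    values = list(x) + list(y)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0.0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0            # average rank for tied values
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u, p_value

# Hypothetical scores for two risk groups.
low_risk_scores = [1, 2, 3, 4, 5]
high_risk_scores = [6, 7, 8, 9, 10]
u_stat, p_value = rank_sum_test(low_risk_scores, high_risk_scores)
```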

Construction of a Prognostic Signature for Gastric Cancer
Herein, 350 gastric cancer specimens were employed as the training set. In total, 58 RBPs exhibited significant associations with survival of gastric cancer patients (Table 1). To avoid overfitting, coexpressed RBPs were eliminated through LASSO regression analyses (Figures 1A,B). Consequently, 33 key RBPs were retained for establishment of a prognostic signature. Table 2 lists the regression coefficients of these key RBPs. We determined the risk scores of all subjects and separated them into high- and low-risk groups (Figure 1C).
In Figure 1D, the number of deceased patients in the high-risk group was significantly higher than that in the low-risk group. Survival was then compared between the groups. Figure 1E shows that subjects with high risk had significantly shorter survival than those with low risk (p = 1.033e−14). On ROC analysis, the AUCs for 1-, 3-, and 5-year clinical outcomes were 0.779, 0.759, and 0.788, respectively (Figure 1F). These data indicated the predictive potential of the signature. To observe the interactions among the 33 key RBPs, we constructed a PPI network. In Figure 1G, 14 key RBPs showed mutual regulation.

Verification of the Prognostic Signature in an External Dataset
We further evaluated the generalizability of the signature in the GSE84437 dataset. With the same cutoff value, subjects were separated into high- and low-risk subgroups (Figure 2A). There were more deceased patients in the high-risk group than in the low-risk group (Figure 2B). Those with high risk had shorter survival than those with low risk (p = 7.208e−10; Figure 2C). The AUCs for 1-, 3-, and 5-year clinical outcomes were 0.647, 0.645, and 0.669, respectively, suggesting that this signature might be used to predict patients' survival (Figure 2D).

The Signature as an Independent Prognostic Factor for Gastric Cancer
In the training set, univariate Cox regression analyses indicated that the risk score was significantly correlated with gastric cancer prognosis (p < 0.001) [...] (Figure 3D). Collectively, this signature was an independent risk factor for gastric cancer.

Subgroup Analysis of the Signature in Predicting Gastric Cancer Patients' Survival
Subgroup analysis was performed to assess whether the signature accurately predicted patients' clinical outcomes in the training set. Subjects with high risk had more unfavorable survival than those with low risk in the subgroups defined by age (>65 and ≤65; Figures 4A,B), gender (female and male; Figures 4C,D),

Signaling Pathways Involved in High-and Low-Risk Subgroups
This study evaluated the signaling pathways enriched in high- and low-risk samples via GSEA. Data indicated that calcium signaling pathway, ECM receptor interaction, and focal adhesion were highly enriched in high-risk samples (Figure 5A). In Figure 5B, base excision repair, cell cycle, DNA replication, mismatch repair, P53 signaling pathway, as well as spliceosome were highly enriched in low-risk specimens.

Correlation Between This Signature and Drug Sensitivity
We further compared the sensitivity to chemotherapy drugs between the high- and low-risk groups. IC50 values of sorafenib (p = 5.23e−05; Figure 7A), gefitinib (p = 0.011; Figure 7B), vinorelbine (p = 0.006; Figure 7C), and gemcitabine (p = 0.011; Figure 7D) were higher in high-risk than in low-risk specimens. Because a lower IC50 indicates greater sensitivity, low-risk specimens were more sensitive to sorafenib, gefitinib, vinorelbine, and gemcitabine than high-risk specimens.

Establishment of a Nomogram Integrating Age, Stage, and Risk Score
To predict the prognosis of each individual subject, a nomogram was established by integrating age, stage, and the gene signature, which could predict 1-, 3-, and 5-year survival probability (Figure 8A). The calibration curves showed that the 1-, 3-, and 5-year outcomes predicted by this nomogram were highly consistent with the actual clinical outcomes of gastric cancer subjects in the training set (Figures 8B-D).

DISCUSSION
This study developed an RBP-based signature for predicting gastric cancer patients' survival. Subjects with high risk presented unfavorable clinical outcomes. Following verification, this signature was independently predictive of patients' prognosis. Moreover, it was distinctly related to the immune microenvironment and sensitivity to chemotherapy drugs. Hence, this RBP-based signature may be a promising tool for predicting clinical outcomes and guiding clinical therapy in gastric cancer.
The molecular heterogeneity between high- and low-risk patients was further analyzed. We found that calcium signaling pathway, ECM receptor interaction, and focal adhesion were highly activated in high-risk samples. Calcium has been reported to facilitate gastric carcinoma progression through the calcium-sensing receptor as well as TRPV4 (27). Furthermore, VPAC1 and TRPV4 channels may accelerate gastric cancer progression in a calcium-dependent manner (28). The ECM receptor contributes to carcinogenesis, progression, and unfavorable survival in gastric cancer (29). Focal adhesion-related proteins are independently predictive of poor clinical outcomes in gastric cancer (30). Meanwhile, activation of base excision repair, cell cycle, DNA replication, mismatch repair, P53 signaling pathway, as well as spliceosome was detected in low-risk specimens. The clinical implications of DNA repair pathways such as base excision repair and mismatch repair have been confirmed in gastric cancer (31). Deregulation of the p53 pathway induces malignant biological properties in gastric cancer cells (32).
Immune cell components contribute to gastric cancer initiation and progression. Moreover, immune escape plays a critical role in tumorigenesis. Immune infiltration levels distinctly affect patients' survival. The tumor immune microenvironment, which contains stromal and immune cells, is associated with immunotherapy response (5). Immune cells are correlated with tumor invasion and metastases.
Stromal cells are closely related to tumor growth, progression, response to chemotherapy, and recurrence. This study demonstrated that high-risk subjects had higher immune and stromal scores than those with low risk. Consistently, Mao et al. (33) found that subjects with high stromal scores presented unfavorable clinical outcomes. At present, novel immunotherapies such as anti-PD-1 and anti-PD-L1 have been applied in gastric cancer. Nevertheless, only a minority of subjects benefit from immunotherapies. The composition of the immune microenvironment is a key determinant of prognosis and response to immunotherapies (34). Herein, this study comprehensively analyzed the correlations between immune cell infiltration and this signature via the CIBERSORT algorithm. Compared with low-risk subjects, high-risk subjects presented increased infiltration levels of resting memory CD4 T cells, monocytes, M2 macrophages, and resting mast cells, and reduced infiltration levels of activated memory CD4 T cells and follicular helper T cells.
Moreover, we found that high risk was characterized by increased expression of immune checkpoints, including BTLA (expressed in B and T lymphocytes), BTNL2 (antigen-processing and presentation cells), CD200 (mainly B and T lymphocytes), CD200R1 (myeloid lineage cells), CD27 (T cells), CD276 (cancer cells), CD28 (T cells), CD40 (antigen-presenting cells), CD40LG (T cells), CD44 (T cells), CD48 (lymphocytes and dendritic cells), CD86 (antigen-presenting cells), HAVCR2 (T cells), LAIR1 (natural killer cells, T cells, and B cells), NRP1 (cancer cells), PDCD1LG2 (T cells and dendritic cells), TMIGD2 (T cells), TNFSF14 (T cells), TNFSF18 (T cells), TNFSF4 (T cells), and VSIR (T cells). These data suggest that this signature is closely related to immunotherapy.
For advanced subjects, surgical resection followed by adjuvant chemotherapy is a major therapeutic strategy. In recent years, a few clinical trials of postoperative chemotherapy have been launched in gastric cancer (35)(36)(37). Unfortunately, response to chemotherapy is relatively low on account of tumor heterogeneity (38). Our data indicated that subjects with low risk were more sensitive to sorafenib, gefitinib, vinorelbine, and gemcitabine than those with high risk. This RBP-based signature may therefore serve as a classification tool for making individualized therapeutic decisions. Furthermore, a nomogram was developed for individualized clinical outcome prediction. This model also showed good predictive performance for 1-, 3-, and 5-year survival.
A few limitations of this study need to be pointed out. First, this was a retrospective study based on public datasets. In future studies, we will conduct prospective multicenter clinical trials to validate this RBP signature in predicting gastric cancer patients' survival. Second, the activated signaling pathways in high- and low-risk subgroups should be verified in further basic experiments. In future research, the molecular mechanisms of RBPs in gastric cancer will be investigated. Furthermore, we will further validate the relationships of RBPs with the immune microenvironment of gastric cancer, which could be used for guiding immunotherapy in clinical practice.

CONCLUSION
This study developed and externally verified an independent RBP-based signature for predicting gastric cancer patients' survival. This signature was closely related to the tumor microenvironment and chemosensitivity, which may help extend the applications of immunotherapy and chemotherapy. A nomogram integrating this signature, age, and stage could offer individualized prediction of prognosis. Thus, this RBP signature may represent a prognostic stratification tool for gastric cancer.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.