Gastric Cancer Tumor Microenvironment Characterization Reveals Stromal-Related Gene Signatures Associated With Macrophage Infiltration

The tumor microenvironment (TME) has attracted attention owing to its essential role in tumor initiation, progression, and metastasis. With the emergence of immunotherapies for various cancers, and their high efficacy, an understanding of the TME in gastric cancer (GC) is critical. The aim of this study was to investigate the effect of various components within the GC TME, and to identify mechanisms that exhibit potential as therapeutic targets. The ESTIMATE algorithm was used to quantify immune and stromal components in GC samples, whose clinicopathological significance and relationship with predicted outcomes were explored. Low tumor mutational burden and high M2 macrophage infiltration, which are considered immune suppressive characteristics and may be responsible for unfavorable prognoses in GC, were observed in the high stromal group (HR = 1.585; 95% CI, 1.112–2.259; P = 0.009). Furthermore, weighted correlation network, differential expression, and univariate Cox analyses were used, along with machine learning methods (LASSO and SVM-RFE), to reveal genome-wide immune phenotypic correlations. Eight stromal-relevant genes cluster (FSTL1, RAB31, FBN1, ANTXR1, LRRC32, CTSK, COL5A2, and ENG) were identified as adverse prognostic factors in GC. Finally, using a combination of TIMER database and single-sample gene set enrichment analyses, we found that the identified genes potentially contribute to macrophage recruitment and polarization of tumor-associated macrophages. These findings provide a different perspective into the immune microenvironment and indicate potential prognostic and therapeutic targets for GC immunotherapies.


INTRODUCTION
Gastric cancer (GC) is the fifth most frequently diagnosed cancer and the third leading cause of cancer-related deaths worldwide (Bray et al., 2018). Advanced patients who are no longer eligible for surgery are forced to resort to other therapies. In the past five years, immunotherapy has emerged as the standard of care for many advanced cancers (Lordick et al., 2017). In GC, positive responses to immunotherapy are limited to a small fraction of patients, and, owing to tumor heterogeneity, its efficacy remains to be elucidated (Shi et al., 2012;Salati et al., 2019). Therefore, an understanding of immunotherapy mechanisms is a priority for the management and extension of positive responses to broader target populations.
The tumor microenvironment (TME) is a repertoire of ostensibly normal cells, recruited by cancer cells, that contribute to cancer initiation, growth, and dissemination (Hanahan and Weinberg Robert, 2011). Components of TME include fibroblasts, immune cells, endothelial cells, along with their secreted extracellular matrix (ECM) (Kobayashi et al., 2019). With the understanding of the diversity and complexity of TME in GC deepening, mounting evidence suggests its crucial role in tumor initiation, progression, immune evasion, and its effect on tumor response to immunotherapies . Prevalent infection of Helicobacter pylori in GC patients underlies a chronic inflammatory environment, which is considered a major risk for GC development (Polk and Peek, 2010). In addition, patients with intestinal metaplasia, gastric atrophy, and cancers demonstrated an increased incidence of genetic alterations strongly correlated with immune response (Lott and Carvajal-Carmona, 2018). Therefore, through the systematic analysis of the heterogeneity and complexity of GC TME, typical tumor characterization could be identified, and our ability for guiding and predicting immunotherapies could also be improved (Zeng et al., 2019). Previously, tumor immune infiltration was mainly studied using flow cytometry and immunohistochemistry (IHC), which require large amounts of tissues and high sample quality (Mellors, 1968;Melato, 1997). Nowadays, emerging computational methods are supporting these analyses and rapidly revealing a broader intra-tumoral immune landscape. Such methods are based on gene expression profiles and immunological features, which include Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) and Celltype Identification By Estimating Relative Subsets Of RNA Transcripts (CIOBSORT) algorithms (Yoshihara et al., 2013;Newman et al., 2015).
Consequently, on this basis, we performed a multidimensional and multi-perspective analysis to reveal the Abbreviations: CIOBSORT: Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts; DEGs: differentially expressed genes; ECM: extracellular matrix; ESTIMATE: Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data; GC: gastric cancer; GS: gene significance; HR: hazard ratio; LASSO: least absolute shrinkage and selection operator; MEs: Module Eigengens; ssGSEA: single-sample gene set enrichment analysis; SVM-RFE: support vector machine-recursive feature elimination; TAMs: tumor-associated macrophages; TCGA: The Cancer Genome Atlas; TIMER: Tumor Immune Estimation Resource; TME: tumor microenvironment; WGCNA: weighted gene co-expression network analysis. potential relationship between immune infiltration and the genome in GC using various advanced bioinformatic algorithms. In this study, we used weighted gene co-expression network analysis (WGCNA) to build gene networks, through which the connection between corresponding genes are identified and weighted based on their expression. After transforming the expression profiles into weighted networks, the genes are clustered into modules with distinct clinical characteristics, in which the genes are highly co-expressed. Compared with direct screening of differentially expressed genes (DEGs), gene sets identified with current methods are more biologically connected and significant . Furthermore, machine leaning methods including support vector machine-recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) algorithms, were adopted for the identification genes correlating with prognosis. The interactive employment of the two methods ensures the hub genes with best prognostic value and multiple characteristics. Additionally, the potential relationship between immune features and hub genes were subjected to single-sample gene set enrichment analysis (ssGSEA) for second validation, which greatly enhanced the reliability of our study.
Overall, we aimed to identify hub genes associated with GC TME by combining WGCNA with immune and stromal scores in The Cancer Genome Atlas (TCGA) database. Our results established an optimized combination of various advanced algorithms to identify TME-related genes and revealed potential mechanisms by which the TME-related genes promoted GC development. The workflow is summarized in Supplementary Figure S1.

Data Acquisition and Pre-processing
Gene expression data for 375 GC samples and 32 normal samples were downloaded as level 3 RNA-seq FPKM datasets from TCGA 1 . Ensembl IDs were converted into gene symbol matrices using online datasets 2 . Expression values for the same gene names were averaged. The mRNA expression matrix was extracted from standardized data. GC sample somatic mutation data were downloaded using the TCGAbiolinks R package and portal mutect2 (Colaprico et al., 2016). The maftools R package was used for mutational profile analysis and visualization. The corresponding clinical information from patients with GC, including data for age, gender, ethnicity, microsatellite instability, pathologic stage, histologic grade, survival status, and survival time were retrospectively collected from the TCGA database (Supplementary Table S1).

Tumor Mutational Burden Calculation
TMB, the number of somatic missense mutations per megabase of a patient's genome, is a potential biomarker for prediction of response to immunotherapy (Rizvi et al., 2015;Goodman et al., 2017). In this study, 38 Mb was used as the estimated whole exome size (Chalmers et al., 2017). We determined TMB using the equation: TMB = S i ×1,000,000 I (I represents the number of exonic bases with a coverage depth ≥100×, and S i represents the absolute number of somatic mutations).

WGCNA
The genes with upper 25% median absolute deviation were selected to guarantee data heterogeneity and accuracy of WGCNA analysis. We calculated the Pearson correlation coefficient between each paired gene using their absolute transcript expression values, defined as: L ij = cor(x i , x j ) (L represents the Pearson correlation coefficient between gene i and j). We presented this value in a co-expression similarity matrix. We exponentiated the correlation coefficient to reflect the continuous nature of the genes with potential co-expression. To maintain scale independence and mean connectivity balanced in a topology network, we calculated thresholding powers from 1 to 20. An appropriate power value was selected to emphasize strong correlations and to penalize the weak ones to achieve a scale-free topologic network. Next, the weighted adjacency matrix was constructed in power operation: a st = l β st (where a is the adjacency between genes s and t). The topological overlap matrix (TOM) was converted from the adjacency matrix to reflect the sum of direct and indirect correlations between genes in the network. Then, we performed hierarchical clustering based on the TOM-based dissimilarity value (1-TOM) and obtained module dendrograms. The minimum genome size of the module was set as 40.
Gene significance (GS) represents the correlation between gene expression and clinical characteristics. Module eigengenes, the first principal components, were calculated to identify the gene expression signature landscape of each module. Module eigengenes were correlated with clinical traits in a heatmap. To merge similar modules and augment the capacity of the modules, the cutoff value was set as 0.25. Clinical characteristics were analyzed with the gene modules. After identifying modules most related with interesting phenotypes, GS was correlated with module membership (MM), defined as the similarity between a gene expression profile and the module in which it belongs.

Functional Enrichment Analysis and Screening Differentially Expressed Genes
For genes in the module being examined, we used the clusterProfile R package for gene ontology functional annotations with a false discovery rate (FDR) threshold less than 0.05. The limma R package was used to identify genes aberrantly expressed in GC, with the selection criteria of |log2 FC| > 0.3 and FDR < 0.05 (Yu et al., 2012).

Estimation of STromal and Immune cells in MAlignant
Tumor tissues using Expression data was used to evaluate the concentration of infiltrating non-tumor cells within the TME (Yoshihara et al., 2013). The proportions of infiltrating-stromal and immune cells in GC samples were quantified by stromal and immune scores using gene expression signatures.
Tumor Immune Estimation Resource (TIMER) 3 , is an open source server for comprehensive analysis of tumor-infiltrating immune cells across various cancers (Li et al., 2017). We explored the association between the expression of selected genes and the infiltration of six major immune cells in GC: CD4+ T cells, CD8+ T cells, B cells, neutrophils, macrophages, and dendritic cells. This same analysis was used to explore the correlation between hub genes and specific immune cell markers. The immune gene markers were selected based on the CellMarker online database 4 and prior studies (Mlecnik et al., 2011;Tse and Kwong, 2015;Danaher et al., 2017).
CIBERSORT was used to predict the proportion of cells within the 22 human immune cell subsets in GC samples (Newman et al., 2015). The algorithm was suitable for microarray data analysis; therefore, we used the voom R package to correct mRNA expression values. CIBERSORT was run using the LM22 signature (downloaded from website https://cibersort.stanford. edu) and 1000 permutations, with output values of P < 0.05 preserved. The results were generated as a violin plot using the ggplot2 R package.

LASSO and SVM-RFE Algorithms
Univariate cox analysis was used to identify potential hub genes with prognostic value. Then, we used LASSO and SVM-RFE algorithms to screen the genes with the best prognostic prediction value in GC. SVM-RFE and LASSO logistic regression was performed using the glmnet and e1071 packages, respectively (Friedman et al., 2010;Huang et al., 2014). LASSO regression was employed to minimize extra redundancy and irrelevance. SVM-RFE is a feature selection algorithm that ranks the features according to the recursive feature deletion sequence based on the support vector machine. Overlapping genes identified using these two algorithms were regard as hub genes.

Verification of Predictive Performance and Differential Expression
The prognostic performance of the eight hub genes in GC patients was verified using the Kaplan-Meier plotter database 5 (Lanczky et al., 2016). The Kaplan-Meier plotter can assess the significance of more than 50,000 genes for survival in nearly 1500 GC samples, based on the HGU133 Plus 2.0 array. The online web server Oncomine 6 was used to examine hub gene expression levels across diverse cancer types and corresponding normal tissues. The threshold was set as P-value = 1E-4, fold change = 2, gene rank = 10%, data type = mRNA. We also selected GSE54129 and GSE79973 from the Gene Expression Omnibus (GEO) database 7 to compare differences in hub gene expression between GC and normal gastric tissues.
ssGSEA Single-sample gene set enrichment analysis, a deconvolution algorithm based on gene set enrichment analysis (GSEA), transforms gene expression profiles into quantified immune cell fractions in single tumor samples. We used the GSVA R package to evaluate the proportion of 24 innate and adaptive immune cell subtypes in each GC sample. These cell types included natural killer (NK) cells, neutrophils, T effector memory (Tem), Tgd, mast cells, eosinophils, plasmacytoid dendritic cells (pDC), immature DCs (iDC), dendritic cells (DCs), macrophages, T follicular helper (TFH), T central memory cells (Tcm), Th2, Th17, CD56dim NK cells, regulatory T (Treg) cells, activated DCs (aDCs), cytotoxic cells, T cells, and B cells (Hanzelmann et al., 2013). The specific immune gene signatures were previously described (Bindea et al., 2013). We divided the immune cell infiltrating patterns of each sample into low-, median-, and highinfiltration groups using hierarchical agglomerative analyses based on Ward's linkage and Euclidean distance. The relationship between the observed immune infiltrating patterns and overall survival in GC patients was also compared. Finally, we verified the correlation between hub gene expression and immune infiltrating levels in GC.

Statistical Analysis
The statistical analysis was mainly performed in R version 3.5.1. Differences between groups were assessed using an independent t-test. Benjamini and Hochberg method was used for multiple correction test. Survival curves were generated using the GraphPad Prism 7 software and Kaplan-Meier plots databases. Survival analysis results were displayed as P-values and hazard ratios (HRs) from a log-rank test. Spearman's correlation analysis was performed to evaluate the correlation between gene expression and immune infiltrating level. P-values < 0.05 were considered to indicate statistically significant results.

Stromal Scores Identified as an Immune Indicator Associated With Prognosis in GC
Gastric adenocarcinoma mutation data were analyzed based on whole-exome sequencing. The 20 genes with the most significant mutations in GC samples were identified using the MutSigCV algorithm. The clinical profiles including ethnicity, tumor grade, and gender of the patients with GC that provided each sample are listed in Figure 1A. MUC16 is one of the most significantly mutated genes in the GC cohorts. To explore its association with prognosis and TMB in patients with GC, we calculated the TMB for each GC sample. GC patients with MUC16 mutations were significantly associated with a higher TMB (P < 0.001) and better survival outcomes (HR = 1.792, P = 0.002) than patients without MUC16 mutations (Figures 1B,C). Based on the median value of TMB, we found that the high-TMB group was significantly associated with better survival prognosis than the low-TMB group (HR = 1.648, P = 0.031) in GC ( Figure 1D).
To investigate the correlation between the TME and TMB in GC, we calculated stromal and immune scores, and tumor purity of each sample using the ESTIMATE algorithm. GC cohorts were divided into two groups based on the median value of each index. The low-TMB group tended to have higher stromal and immune scores but lower tumor purity (P < 0.05), indicative of a negative correlation between stromal scores and TMB (Figures 1E-G). In Kaplan-Meier survival analysis, the high stromal scores group had significantly poorer prognosis (HR = 1.585, P = 0.009) than did the low stromal group in GC ( Figure 1H). The survival outcome was not statistically significant in the different immune (P = 0.178) and ESTIMATE (P = 0.115) scores groups (Figures 1I,J). These results suggest that TMB is negatively correlated with stromal scores, and the underlying molecular mechanisms in GC warrant further investigation.

Identification of Stromal Scores Relevant Gene Modules Using WGCNA
To identify gene modules with the most significant immunological features and to elucidate the mechanisms underlying the role of the TME in GC development and prognosis, we constructed a weighted gene co-expression network. After eliminating 23 outlier samples (Supplementary Figure S2A), genes with the highest 75% variance were placed in a cluster dendrogram. To satisfy a scale-free network (R 2 = 0.94), the soft threshold β was set at 4 and the topology model fit index was set at 0.9 (Figures 2C,D). After identifying the eigengenes of each module, 14 merged modules were obtained (Figures 2A,B).
The correlation between the modules and clinical phenotypes including survival status, age, gender, histologic grade, pathologic stage, stromal scores, immune scores, ESTIMATE scores, tumor purity, TMB, TP53 mutation status, and MUC16 mutation status were calculated and shown as a heatmap based on MMs and GS ( Figure 2E and Supplementary Figure S2B). The brown and yellow modules strongly correlated with tumor immunemicroenvironment-related phenotypes, and contained 273 and 272 genes, respectively. The brown module was most significantly associated with stromal score (Cor = 0.82, P = 2e-77) and its correlation coefficients with immune scores and tumor purity were 0.37 and −0.66, respectively (P < 0.001). Additionally, the brown module was negatively correlated with TMB (Cor = −0.21, P = 2e-4). The yellow module correlated with immune score (Cor = 0.78, P = 4e-65). Our previous analyses suggested a significant prognosis difference between different stromal scores and TMB groups. Therefore, we chose the brown module as the module of interest for further analysis. We performed correlation analysis between MM of the brown module and GS of different clinical phenotypes ( Figure 2F). The correlation between MM and stromal score GS showed a favorable linear fitting (Cor = 0.93, P = 8.3e-120).

Functional Enrichment Analysis for the Module of Interest
GO analysis was used to reveal the underlying biological mechanisms in the brown module using the "clusterProfile" R package. Consistent with WGCNA results, functional annotation  clustering of brown module genes exhibited strong association with stroma-related biological functions, including "extracellular structure organization, " "extracellular matrix disassembly, " "integrin binding, " "extracellular matrix organization, " "cellsubstrate adherens junction, " and "collagen metabolic process." Important biological functions including "leukocyte migration, " "regulation of vasculature development, " and "response to hypoxia" were also significantly enriched in the brown module ( Figure 3A).

Screening of Differentially Expressed Potential Hub Genes
We performed differential expression analysis for genes in the brown module to identify potential tumorigenic factors. Owing to the biological cascading effect, potential hub genes with low expression values within tumors might play important roles in TME development. To eliminate such bias, we adopted a relatively broad inclusion criterion. With the cutoff threshold set as |log 2 FC| > 0.3 and FDR < 0.05, 160 DEGs in the brown module were screened out. Of these DEGs, 129 were up-regulated, and 31 were down-regulated in GC tissues ( Figure 3E). To identify the most centrally connected genes within the weighted gene co-expression network, we screened out 40 brown module genes with the highest intramodular connectivity. Among them, 26 genes were significantly up-regulated and defined as potential hub genes ( Figure 3B).

Stromal-Relevant Hub Genes Are Associated With Macrophages Infiltration in GC
Based on the brown module's dual positive association with stromal and immune scores, we postulated that interplay between immune and stromal cells has important roles in carcinogenesis. We used the TIMER database to investigate the correlation between the 26 stromal-related DEGs and tumor immune infiltrates. The potential hub genes and their correlation with infiltrating immune cell concentration within the TME was assessed ( Figure 3C). Notably, High expression of these genes was moderately to strongly positively correlated with high macrophage infiltration levels (Cor = 0.4-0.8, P < 0.05).
We assessed the prognostic significance of the infiltrating levels of the six immune cells in GC using the TIMER survival module. We found that high macrophage infiltration levels were associated with a poorer prognosis in GC ( Figure 3D, P = 0.004). These results led to the hypothesis that stromal-related hub genes might promote tumor development by regulating macrophage infiltration in GC.

Survival Analysis and Gene Expression Validation
We conducted survival validation on the eight hub genes using the Kaplan-Meier plotter database. Seven of the hub genes with high expression (FSTL1, RAB31, ANTXR1, COL5A2, ENG, FBN1, and LRRC32) were found to predict poorer overall survival than their low-expression counterparts (P < 0.05) (Figure 5B). We verified hub gene mRNA expression levels across different types of cancers in the Oncomine database. These hub genes are overexpressed in GC and in other cancers including pancreatic cancer and lymphoma ( Figure 6A). The eight hub genes were all ranked at the highest 1% in GENE RANK. Five genes had four or more datasets supporting their aberrant expression in GC with the threshold of fold change = 2, P-value = 1E-4. Moreover, we used datasets GSE54129 and GSE79973 from GEO database to further examine hub gene expression differences in GC and normal tissues (Figures 6B,C). The results show that these hub genes are more highly expressed in GC than in normal gastric tissues (P < 0.05).

Revalidation of the Correlation Between Hub Genes and Tumor Immune Characterization
To validate the relationships between the hub genes with tumor immune characteristics we used ssGSEA to reevaluate the immune infiltration signature of each sample in 24 immune cell types. The 375 GC samples were distinguished as having high, medium, and low immune infiltration patterns with the corresponding clinical characteristics (Figure 7A). We compared the stromal and immune scores between the three immune infiltration patterns (Figures 7B,C). We found that high immune-infiltration groups have higher immune scores than do the low-and medium immune infiltration groups (P < 0.001). Although the difference in stromal scores between high and medium infiltration groups was not significant, the median value of the stromal scores was higher in the highinfiltration group (P < 0.001). We compared survival differences between the three groups and found no significant differences ( Figure 7D). Survival significance differences was observed when we divided the groups into high-and low macrophage infiltration levels using upper and lower quartiles ( Figure 7E, HR = 1.77, P = 0.038). Based on the results of ssGSEA analysis, we further assessed the correlation between hub gene expression and the infiltrating levels of 24 kinds of immune cells (Supplementary Figure S4). The high expression level of the selected hub genes was significantly associated with infiltration of various immune cells, especially macrophages, in GC ( Figure 7F).

Exploration of the Correlation Between Hub Genes and Macrophage-Related Immune Markers
We used CIBERSORT to evaluate the differences in immune infiltration between high-and low stromal groups in GC to elucidate the underlying connection between tumor stroma and infiltrating-immune cells within the TME. With the cutoff standard set as P-value < 0.05, we obtained 161 and 83 samples in the high-and low stromal groups, respectively. The high-stromal group had higher infiltration of monocytes (P = 0.020), M2 macrophages (P = 0.001), resting mast cells (P = 0.049), and resting DCs (P = 0.023) than the low-stromal FIGURE 4 | Identification of hub genes associated with prognosis using the machine learning method. (A) LASSO coefficient profiles of candidate hub genes. Each curve corresponds to a candidate gene. (B) The SVM-RFE algorithm identified candidate genes with the best predictive performance and lowest error bar (5 × CV accuracy = 9-0.907). (C) Venn plot showing hub genes identified in common by the two algorithms. group ( Figure 8A). This highlights the correlation between tumor stroma and tumor-associated macrophages (TAMs). It is feasible that a close interaction exists between stromalrelated hub genes and TAMs in tumor progression. Therefore, we analyzed the correlation between hub gene expression and specific gene markers of macrophages in various immune cell subtypes; these included CD86, CD14, CD16 of monocytes; CCL2, CD115, CD206, and IL10 of TAMs; INOS, IRF5, SOCS1, CCR7, TSPO, ROS, IL6, and CXCL10 of M1 macrophages; and CXCL12, MS4A4A, MAR1, CD36, VISIG4, DCIR, CD184, and CD163 of M2 macrophages. Interestingly, eight hub genes were all significantly correlated with the gene markers of TAMs, M2 macrophages, and monocytes (P < 0.01, Table 1). There was a moderate to strong correlation between five hub genes (RAB31, FBN1, CTSK, FSTL1, LRRC32) and CD86 and CD14 of monocytes; MRC1, CSF1R, and CCL2 of TAMs; and CXCL12, MS4A4A, and MSR1 of M2 macrophages. Furthermore, weak correlation between the hub genes and SOCS1, IRF5 and NOS2 of M1 macrophages was observed ( Figure 8B). These findings suggest that hub genes might regulate macrophage polarization in GC.
The present investigation of TME in GC indicates a complex relationship between the genome and immune infiltration. Overexpressed prognostic hub genes positively correlate with high stromal infiltration and may also participate in macrophage recruitment and M2 macrophage polarization. High-stromalgroup patients were also observed to have higher M2 macrophage infiltration. The adverse prognostic value of each factor was verified independently in survival analysis ( Figure 8C).

DISCUSSION
Biological behaviors of cancers are determined by genetic instability (such as TMB), cancer cell epigenetic abnormalities, and the surrounding milieu (such as TME) that the cancer cells interact with for growth, survival, proliferation, and metastasis (Hanahan and Weinberg Robert, 2011;Mlecnik et al., 2011). Infiltrating stromal and immune cells are major components of normal cells within tumor tissue (Yoshihara et al., 2013). To elucidate the components of TME in GC, we utilized the ESTIMATE algorithm to evaluate the immune phenotypes of GC with immune and stromal scores. Our results showed that stromal scores, but not immune scores, were significantly correlated with survival outcomes in GC, indicating that the stromal compartment plays an important role in GC. Previous studies have indicated that stromal cells, like fibroblasts, are much more crucial in TME formation in GC than inflammatory cells (Komohara and Takeya, 2017;Ham et al., 2019;Kobayashi et al., 2019).
Tumor mutational burden, a novel biomarker of immunotherapy response, is based on the notion that mutationassociated neoantigens can activate immune cells to eliminate cancer cells: the higher the TMB, the better the therapeutic effect (Samstein et al., 2019). Here, the low-TMB group showed significantly poorer survival compared with high-TMB group. In addition, the low-TMB group tended to have higher stromal scores in GC. Given the validity of TMB as a biomarker for immunotherapy response, we speculated that the population that was not susceptible to immunotherapies (i.e., the low-TMB group) are highly associated with stromal cells (high stromal score) within the TME, which is correlated with unfavorable prognosis.
To elucidate the potential connection between the genome and TME in GC, we systematically clustered the co-expressed genes by WGCNA analysis. This approach allowed us to identify gene modules most related to cancer immunological phenotypes, especially stromal scores and TMB. The brown module showed strong positive correlation with stromal scores, moderate correlation with immune scores, and negative correlation with TMB ( Figure 2E). Functional enrichment analysis for the brown module also confirmed its strong correlation with stromal-related phenotypes, including ECM organization and leukocyte migration. This suggests that special genomic abnormalities may facilitate malignant growth by affecting stromal structure or the immunogenicity of the TME in GC. Therefore, we performed differential expression analysis for brown module genes with the highest intramodular connectivity in order to identify potential oncogenes. We adopted a combination strategy, integrating three different algorithms, to screen the prognostic biomarkers from DEGs. Based on univariate Cox analysis determining the prognostic DEGs, LASSO, and SVM-RFE methods were used to identify biomarkers with distinct characteristics. After survival analysis and expression validation, eight brown module hub genes were identified as adverse prognostic factors in GC. Growing evidence suggests that the stromal compartment can influence antitumor immune responses and regulate tumor immunology (Komohara and Takeya, 2017). We investigated whether stromal-related hub genes play an immunomodulatory role within the TME. To this end, we used TIMER and ssGSEA to analyze the relationship between hub gene expression and tumor immune infiltration in GC. The results show a strong positive correlation between hub gene expression and macrophage infiltration in GC. The similarity of the immune correlation analysis results using two independent methodologies indicates the robustness of the results. We additionally observed that high macrophage infiltration levels represent as an adverse prognostic factor in GC (Figures 3D, 7E). To reveal the mechanisms underlying the relationship between stromal score and prognosis, we compared the immune-infiltrating cell fractions of each sample between high and low stromal groups. Intriguingly, the high-stromal group was found to have higher proportions of monocytes and M2 macrophages. Accordingly, further analysis was performed to explore the correlation between stromal-related hub genes and different immunophenotypes of macrophages in order to elucidate the role of hub genes in regulating tumor immunology in GC. Immune markers of monocytes and TAMs, such as CD86 and CSF1R, showed a strong positive correlation with hub gene expression, suggesting that the stromal-related hub genes promote the recruitment of circulating monocytes into the TME and facilitate their differentiation into TAMs. Moreover, M1 macrophage immune markers, such as SOCS1 and NOS2, demonstrated weak or even negative correlation with the expression of hub genes, while M2 macrophage immune markers, such as CXCL12, MSR1, and MS4A4A, exhibited moderate to strong correlation. These results demonstrate the underlying role of hub genes in regulating the recruitment of macrophages and polarization of TAMs in GC.
Evidence indicates that stromal cells may shape an immunosuppressive microenvironment by secreting cytokines and promoting M2 polarization of macrophages (Comito et al., 2014). Notably, Cathepsin K (CTSK), a lysosomal cysteine protease of the peptide protein C1 family, is implicated in matrix remodeling and angiogenesis (Krieg and Lipford, 2008). CTSK overexpression has been detected in breast, prostate, and gastrointestinal cancers, and correlated with tumor stroma (Kleer et al., 2008;. CTSK was identified as a metastatic-related protein regulated by gut microbiota in colorectal cancers, and its effect on M2 polarization of TAMs has been confirmed . The evidence correlating stromal-related hub genes with M2 macrophage polarization support further exploration of the potential of these genes as immunotherapeutic targets. Of the eight stromal-related hub genes, FSTL1, FBN1, COL5A2, and LRRC32 encode extracellular proteoglycans, glycoproteins, and other components within the ECM, playing a fundamental role in determining ECM composition (Andrikopoulos et al., 1995;Dubuisson et al., 2001;Carrillo-Galvez et al., 2015;Mattiotti et al., 2018). ANTXR1, Rab31, and ENG are transmembrane or membrane proteins engaged in signal transduction and transmembrane trafficking (Chen et al., 2013;Carrillo-Galvez et al., 2015). FSTL1 is involved in multiple tumor biological processes, including metastasis and regulation of the immune response. It is noteworthy that FSTL1 is significantly related to tumor metastasis, and up or down regulation of FSTL1 in different cancers greatly induces their migratory and invasive capacity, leading to tumor dissemination (Kudo-Saito et al., 2013;Ni et al., 2018;Chiou et al., 2019). FSTL1 exerts this effect on migration through the reduction of various cytokines, including metalloproteinase-2 (MMP-2), CCL2, and CXCL12 (Chan et al., 2009). Moreover, FSTL1 is considered a determinant of immune dysfunction mediated by mesenchymal stroma/stem cells (MSCs) and immunoregulatory cells, and may therefore represent a target to suppress cancer progression (Kudo-Saito et al., 2018).
Rab31 is a member of the RAB family derived from monomeric GTP-binding proteins, which regulate intracellular transportation (Cheng et al., 2005). Numerous studies have reported that Rab31 participates in tumor initiation and progression across various cancers, including breast cancer, glioblastoma, and pancreatic cancer (Ioannou et al., 2015;Pan et al., 2016;. Rab31 was found to play critical role in tumor development and is an independent prognostic factor in breast cancer (Kotzsch et al., 2017). Recently, Rab31 was recognized to function as an oncogene in GC tumorigenesis and progression by interacting with GLI1, which represents a therapeutic target in GC (Tang et al., 2018). ANTXR1, also known as tumor endothelial marker 8 (TEM8), is widely considered a potential target for cancer therapy because of its effect on tumor angiogenesis. Consistent with our results, ANTXR1 is widely expressed on tumor-associated perivascular stromal cells, which strongly promotes angiogenesis within the TME (Bagley et al., 2009). Recent studies have revealed that antibodies targeting ANTXR1 exert unique antitumor effects by selectively inhibiting stromal endothelial cells . In addition, ANTXR1 is regarded as a potential target for CAR T cell immunotherapies in gastric adenocarcinoma (Sotoudeh et al., 2019), which makes it more applicable for clinical treatment.
Using integrated bioinformatic and machine learning algorithms, we elucidated the genomic landscape of GC and its correlation with TMB, the stromal compartment, and immune cell infiltration. Eight stromal-related prognostic hub genes were found to play pleiotropic roles in the TME of GC. These hub genes inhibit the formation of an immunosuppressive microenvironment, therefore representing potential therapeutic targets. The present findings were based on retrospective datasets, and biological experiments are needed to verify our results.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene. 2020.00663/full#supplementary-material FIGURE S1 | Work-flow of the bioinformatics analysis procedure. Trait heatmap and sample dendrogram after outlier sample elimination. A total of 14 samples were included in the dendrogram. Color concentration is in proportion to survival, age, gender, stage, stromal score, immune score, ESTIMATE score, tumor purity, tumor mutation burden, TP53 mutation status, and MUC16 mutation status.