Systems Biology of Gastric Cancer: Perspectives on the Omics-Based Diagnosis and Treatment

Gastric cancer is the fifth most commonly diagnosed cancer in the world, affecting more than a million people and causing nearly 783,000 deaths each year. The prognosis of advanced gastric cancer remains extremely poor despite the use of surgery and adjuvant therapy. Therefore, understanding the mechanisms of gastric cancer development and discovering novel diagnostic biomarkers and therapeutics are major goals in gastric cancer research. Here, we review recent progress in the application of omics technologies in gastric cancer research, with a special focus on the use of systems biology approaches to integrate multi-omics data. In addition, the association between the gastrointestinal microbiota and gastric cancer is discussed, which may offer insights for exploring novel microbiota-targeted therapeutics. Finally, the application of data-driven systems biology and machine learning approaches could provide a predictive understanding of gastric cancer and pave the way to the development of novel biomarkers and the rational design of cancer therapeutics.


INTRODUCTION
Although the incidence of and deaths from gastric cancer are declining in Northern America and Western Europe, gastric cancer remains the fifth most commonly diagnosed cancer worldwide and the third leading cause of cancer death (Bray et al., 2018). Gastric cancer was responsible for over one million new cases and an estimated 783,000 deaths in 2018 (Bray et al., 2018). In Eastern Asia, gastric cancer accounts for ∼31% of all cancer incidence in men and ∼22% in women. Most patients with advanced-stage gastric cancer have an estimated 5-year survival rate of <30% (Parkin, 2001). Therefore, early detection and targeted treatment are potential strategies for increasing the 5-year survival rate of gastric cancer patients.
The vast majority of gastric cancers are adenocarcinomas, which can be classified based on their histological and etiological characteristics. Traditionally, gastric cancer is divided into two major subtypes, intestinal-type and diffuse-type adenocarcinomas, according to Lauren's criteria (Lauren, 1965). Alternatively, the World Health Organization (WHO) classification system differentiates gastric cancer into tubular, papillary, mucinous, and poorly cohesive carcinomas (Bosman et al., 2010). Both classifications enable a better understanding of the pathology of gastric cancer. However, they have had quite limited success in promoting the development of subtype-specific treatment approaches, owing to the heterogeneity of gastric cancer and their inability to identify potential molecular targets. With the development of next-generation sequencing (NGS), omics technologies have provided valuable tools to study gastric cancer at the molecular level. Omics-based data integration has been extensively applied in gastric cancer research. These studies have linked numerous mutations, gene expression differences, protein abundance differences, epigenetic alterations, and metabolite concentrations to gastric cancer heterogeneity and staging, which has significantly improved our understanding of gastric cancer.
Systems biology approaches aim to move beyond individual genes and proteins and to describe the biological system as an integrated whole, taking its intrinsic interactions into account. With more and more omics data available, many new systems biology methods and applications have been developed in gastric cancer research. In this review, we briefly summarize recent progress in "omics" technologies and their applications in gastric cancer research. We then highlight the use of omics data integration to classify gastric cancer, and the application of systems approaches and machine learning methods to discover novel biomarkers and potential therapies. Furthermore, we discuss how gastric cancer research is shifting from human omics to human-microbiota omics for current and future applications.

GENOMICS, TRANSCRIPTOMICS, AND EPIGENOMICS IN GASTRIC CANCER
Next-generation sequencing technologies are mainly based on the massively parallel sequencing of short DNA/RNA fragments and have been extensively reviewed elsewhere (Metzker, 2010). Advances in NGS enable a variety of applications in both DNA and RNA sequencing, including whole-genome, whole-exome, and targeted sequencing of DNA, and sequencing of total RNA, mRNA, and small RNA. In addition, methylation and ChIP sequencing with NGS are also commonly applied, which remove the biases and limitations of earlier microarray-based systems (Hurd and Nelson, 2009).
Comprehensive characterization at the genomic, transcriptomic, and epigenomic levels has been applied to define the molecular subgroups of almost all types of cancer. In early studies, the heterogeneity of gastric cancer was characterized by the expression of a large panel of genes (Cho et al., 2011; Tan et al., 2011). More recently, the genomic landscapes of gastric cancer have been extensively investigated and reviewed elsewhere (Lin et al., 2015; Chia and Tan, 2016; Katona and Rustgi, 2017; Wang et al., 2019). The use of genome-wide data, including the TCGA (Bass et al., 2014) and ACRG (Cristescu et al., 2015) cohorts, has enabled the development of novel and robust molecular classifiers that can guide clinical therapeutics against gastric cancer (Figure 1). With unsupervised clustering of molecular data, including array-based somatic copy number analysis, array-based DNA methylation profiling, whole-exome sequencing, mRNA sequencing, miRNA sequencing, and reverse-phase protein arrays (Bass et al., 2014), gastric cancer can be classified into four subtypes: (1) Epstein-Barr virus (EBV) positive (9%), (2) microsatellite instability (MSI, 22%), (3) genomically stable (GS, 20%), and (4) chromosomal instability (CIN, 50%). Further evaluation of the clinical and histological characteristics of these molecular subtypes revealed enrichment of the diffuse histological subtype in the GS subtype (Bass et al., 2014). The ACRG study, in contrast, developed a distinct four-subtype classification system using gene expression microarrays, genome-wide copy number microarrays, and targeted gene re-sequencing (Cristescu et al., 2015). As observed in the TCGA cohort, gene mutations (e.g., in TP53) and structural variations are frequently identified in gastric cancer (Zang et al., 2012; Wang et al., 2014; Cristescu et al., 2015; Hu et al., 2016), and these four subtypes show strong associations with clinical phenotypes.
Taken together, the accumulation of multiple omics datasets increases the complexity of gastric cancer classification, and the treatment of gastric cancer will benefit from individualized subtyping that combines clinical, pathological, and omics information.
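As a rough illustration of the unsupervised clustering idea behind such subtyping, the sketch below runs a plain k-means on toy two-feature profiles. It is not the TCGA or ACRG procedure, which combined several platforms and more elaborate methods; the function name `kmeans`, the deterministic initialization, and the sample values are all assumptions made for the example.

```python
import math

def kmeans(samples, k, n_iter=20):
    """Assign each sample to one of k clusters (plain Lloyd iterations)."""
    # Assumption: deterministic start from the first k samples, for reproducibility.
    centers = [list(s) for s in samples[:k]]
    assignment = [0] * len(samples)
    for _ in range(n_iter):
        # Step 1: assign every sample to its nearest center (Euclidean distance).
        assignment = [
            min(range(k), key=lambda j: math.dist(s, centers[j]))
            for s in samples
        ]
        # Step 2: move each center to the mean of its members.
        for j in range(k):
            members = [s for s, a in zip(samples, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Hypothetical molecular profiles (two features per tumor), not real cohort data.
profiles = [[0.1, 0.2], [5.0, 5.1], [0.0, 0.1], [5.2, 4.9], [0.2, 0.0], [4.9, 5.0]]
clusters = kmeans(profiles, k=2)
```

In practice the number of clusters and the stability of the assignment would have to be assessed separately, for example by repeated resampling.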
Transcriptomics describes the expression levels of RNA transcripts. Gene expression has been shown to change dramatically according to the clinical information of patients, which has led to the identification of novel expression biomarkers in patient groups (Tan et al., 2011; Lei et al., 2013). Expression signatures of gastric tumors derived from microarrays or NGS have been used to improve early diagnosis and prognosis prediction (Chia and Tan, 2016). Using 973-gene and 1024-gene expression signatures, gastric tumors can be distinguished from normal gastric tissues with high precision in early gastric cancer (Vecchi et al., 2007; Nam et al., 2012). As previously described, gene expression has also been applied to the stratification of gastric cancer (Shah et al., 2011; Tan et al., 2011), which revealed distinct transcriptomic subtypes. Moreover, the recent advent of single-cell DNA/RNA sequencing provides an opportunity to identify cell types and states. For instance, a recent study reconstructed a single-cell expression atlas of gastric premalignant lesions and early gastric cancer. With expression profiles at the single-cell level, the expression signatures of multiple cell types were identified across different lesions. Furthermore, the single-cell atlas revealed a panel of six high-confidence markers related to early gastric cancer, which could be used as specific biomarkers for early diagnosis to recognize the onset of gastric cancer. Interestingly, single-cell RNA sequencing has also recently been applied to explore the tumor microenvironment of gastric cancer (Sathe et al., 2020), which showed distinct expression changes in tumor samples compared with paired normal tissue. Stromal cells, macrophages, and cytotoxic T cells were significantly enriched in tumor samples, with expression of multiple immune checkpoint and costimulatory molecules (Sathe et al., 2020).
Altogether, gene expression profiling at both the population and the single-cell level elucidates the heterogeneity of gastric cancer and the complex relationship between the immune microenvironment and gastric cancer, which may provide valuable clues for developing rational diagnosis and personalized therapeutic approaches.
FIGURE 1 | The systems biology approach for gastric cancer research. Different types of omics data, including genomics, transcriptomics, proteomics, epigenomics, and metabolomics, are obtained from a gastric cancer cohort using the corresponding omics technologies. Omics data (transcriptome, proteome, or metabolome) are measured for two or more groups that differ in clinical information. Generally, a methodology based on differentially expressed genes (DEGs) is applied. DEGs are identified by comparative analysis of the measured omics data. With the integrative network-based approach, biological networks, such as the gene regulatory network (GRN), protein-protein interaction network (PPIN), and genome-scale metabolic network (GEM), are used together with omics data in an integrative way. Then, the up-regulated or down-regulated subnetworks are identified by integrating data into network models. Using network modeling tools, the key driver genes linked to clinical information can be identified. Identified key driver genes can be further applied in clinical studies. In addition, the multi-omics data are used to stratify patients into subtypes. Machine learning algorithms utilize omics features to predict potential treatment outcomes. The machine learning methodology can be applied to chemotherapy, immunotherapy, or their combinations. Finally, the key driver gene information and the predictive models can be used to design personalized treatment strategies and applied in the clinic. The clinical survival outcome can then be evaluated after personalized treatment.
Epigenomics describes the modifications of DNA or histones that influence gene expression without altering the DNA sequence (Jones and Baylin, 2007). By analyzing global CpG methylation profiles of gastric cancer and normal tissues, cancer-specific epigenetic alterations were observed in 44% of CpGs in the form of both tumor hypermethylation and hypomethylation (Toyota et al., 1999; Zouridis et al., 2012). Interestingly, regions of long-range tumor hypomethylation were strongly associated with increased chromosomal instability (Zouridis et al., 2012). Besides DNA methylation, other types of epigenetic changes, such as histone methylation and acetylation, have been found to be associated with the prognosis of gastric cancer treatment (Calcagno et al., 2019; Li et al., 2019).

PROTEOMICS AND METABOLOMICS IN GASTRIC CANCER
Proteomics complements the genomic and transcriptomic approaches, providing additional information about protein expression and post-translational modifications. Most proteomics studies in this field have so far focused on the discovery of gastric cancer-associated biomarkers from plasma samples (Uen et al., 2013; Abramowicz et al., 2015; Gao et al., 2015; Yoo et al., 2017). An early study (Uen et al., 2013) investigated the glycoprotein profiles of serum samples from gastric cancer patients and healthy subjects. Seventeen significantly differentially expressed Con A-bound glycoproteins were identified. Validation using the Con A-bound LRG1 glycoprotein revealed an AUC value of 0.65. Another comparative proteomics analysis (Yoo et al., 2017) with serum samples was performed among early gastric cancer, advanced gastric cancer, and normal control groups, leading to the identification of hundred protein biomarkers. Using clusterin isoform 1, the highest AUC values for distinguishing advanced or early gastric cancer from normal controls were 0.94 and 0.88, respectively (Yoo et al., 2017). In addition, comprehensive proteomics studies have also been employed to classify gastric cancer subtypes, as has been done with genomics data (Ge et al., 2018; Wippel et al., 2018; Mun et al., 2019). Diffuse-type gastric cancer can be further classified into three or four distinct subtypes according to proteome profiling (Ge et al., 2018; Mun et al., 2019). Moreover, integration of phosphoproteome data with other types of omics data elucidated the signaling pathways associated with somatic mutations (Mun et al., 2019). Most metabolomics studies in this field have so far focused on the discovery of biomarkers associated with gastric cancer from plasma samples (Abbassi-Ghadi et al., 2013; Jayavelu and Bar, 2014). Numerous metabolic changes in plasma, urine, gastric juice, and carcinoma tissues have been identified using targeted or untargeted metabolomics analyses.
Metabolomics thus provides efficient ways for the diagnosis, prognosis, and drug evaluation of gastric cancer, and serves as a potential strategy for developing personalized gastric cancer therapeutics.
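The AUC values quoted above for serum markers summarize how well a single measurement separates cases from controls. A minimal sketch of that quantity, computed as the fraction of case-control pairs in which the case scores higher, is given below; the function name `auc` and the marker levels are hypothetical and are not data from the cited studies.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical serum marker levels in arbitrary units.
cancer = [2.9, 3.4, 1.8, 4.1, 1.9]
healthy = [1.1, 2.0, 1.5, 2.5, 0.9]
marker_auc = auc(cancer, healthy)  # 21 of 25 pairs are ordered correctly
```

An AUC of 0.5 corresponds to a marker with no discriminative value, and 1.0 to perfect separation.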

GASTROINTESTINAL MICROBIOME IN GASTRIC CANCER
The human microbiome has been confirmed to play critical roles in human health and disease (Knight et al., 2017). The intrinsic heterogeneity of gastric cancer has been extensively explored for decades based on omics information from the human host. However, little is known about how the human microbiota is linked to gastric cancer at the functional level. Thus, exploring the gastric microbiota at the DNA, RNA, and protein levels using meta-omics technologies will help us understand the potential roles of gastric microbes in cancer development and staging (Figure 1).
Helicobacter pylori is a gastric pathogen that colonizes more than 50% of persons in the world, and 1% of persons with H. pylori infection develop gastric cancer (Wroblewski et al., 2010; Noto and Peek, 2017; Ferreira et al., 2018). However, H. pylori was not the dominant bacterial species in some gastric cancer patients, implying that other microbes might contribute to gastric cancer development (Noto and Peek, 2017). The gastrointestinal microbiota interacts directly with gastric tissue and affects gastric cancer development (Brawner et al., 2014; Nardone and Compare, 2015). Recent studies indicated that the gastric microbiota is strongly associated with gastric cancer (Dias-Jácome et al., 2016). The gastric microbiota of cancer subjects shows reduced microbial diversity, decreased Helicobacter abundance, and enrichment of other bacterial genera, mainly intestinal commensals (Ferreira et al., 2018). In addition, significant changes in the gut microbiota, including microbial richness and diversity, were observed in H. pylori-positive subjects compared with H. pylori-negative subjects. Altogether, metagenomics analyses have provided insights into the gastric microbiota and its interaction with the human host. Recently, drug-microbiota interactions have been extensively investigated (Maier et al., 2018; Vila et al., 2020). However, the influence of gastric cancer treatment, especially adjuvant chemotherapy, on the gastric and gut microbiota is still unknown. Therefore, exploring the associations between the gastrointestinal microbiota and gastric cancer may provide novel views on gastric cancer progression and on the development of microbiota-targeted nutrient supplements or drugs.
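Microbial richness and diversity, as compared between groups above, are usually computed from per-taxon read counts. The sketch below uses the Shannon index as one common diversity measure; the cited studies may have used other indices, and the function names and counts here are assumptions for illustration only.

```python
import math

def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical genus-level read counts for two gastric samples.
even_community = [25, 25, 25, 25]     # four genera, equally abundant
dominated_community = [97, 1, 1, 1]   # one genus dominates
```

A community dominated by a single genus has the same richness as an even one here but a much lower Shannon index, which is why the two measures are typically reported together.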

DATA-DRIVEN INTEGRATION APPROACHES IN GASTRIC CANCER RESEARCH
Most gastric cancer studies have concentrated on differential analysis between gastric cancer samples and normal controls using one type of omics data. The comprehensive multi-omics studies of gastric cancer (Bass et al., 2014; Cristescu et al., 2015; Mun et al., 2019) have created a molecular landscape spanning the genome, transcriptome, proteome, and even phosphoproteome. However, there is strong interdependence among the different types of omics data. In order to comprehensively understand gastric cancer and develop efficient diagnosis and treatment approaches, it is critical not only to analyze these omics data as separate layers, but also to dissect how they interact with one another by integrating them (Figure 1).
Cellular processes can be represented as networks, whose structures comprise both the species that participate in the biological processes and the interactions between these species (Chiappino-Pepe et al., 2017). Network-based multi-omics data integration thus provides the opportunity to incorporate information across multiple biological layers and describe gastric cancer (Figure 1). For transcriptome, proteome, and metabolome data, network inference, pathway enrichment analysis, and network module identification are the three principal steps in network-based integration (Chiappino-Pepe et al., 2017; Yan et al., 2017). The main strategies for inferring biological networks are top-down approaches, which use available experimental data, and bottom-up approaches, which use reconstructed networks from related organisms as a scaffold to assemble new biological networks with published data (Chiappino-Pepe et al., 2017).
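One simple top-down way to infer a network from expression data is to connect genes whose profiles are strongly correlated across samples. The sketch below builds such a co-expression network with a Pearson correlation cutoff; this is only one of many inference strategies and is not claimed to be the method of the cited works. The gene names, expression values, and the 0.9 threshold are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)

def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges

# Hypothetical expression of four genes across five samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.8],
    "GENE_C": [5.0, 4.1, 2.9, 2.0, 1.1],
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}
network = coexpression_edges(expression)
```

Correlation alone does not distinguish direct from indirect relationships, so edges inferred this way are candidates for further filtering rather than confirmed interactions.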
Pathway and network analyses are the two common procedures for exploring the functional dynamics linked to cancer. As shown in Figure 1, differentially expressed genes (DEGs) are first identified using available computational workflows, generally by comparing gastric cancer samples with normal controls. Based on the over-expression or under-expression profiles of the DEGs, the related biological pathways are associated with cancer status or stage by pathway enrichment analysis approaches such as gene set enrichment analysis (Subramanian et al., 2005; Buzdin et al., 2017). The DEG-based pathway analysis approach has been successfully applied to identify potential biomarkers that distinguish gastric cancer from normal control samples using transcriptomics, proteomics, or metabolomics data (Anvar et al., 2018). Nevertheless, the DEG-based approach still has a number of limitations that restrict its use in the clinic. Firstly, the number of DEGs identified usually exceeds the number that can be experimentally validated. Thus, in most studies only a subset of DEGs, selected according to the literature or prior knowledge, is experimentally tested. Secondly, not all of the DEGs identified are driver genes for gastric cancer. In fact, it is not easy to discover key driver genes among DEGs, and the DEG-based approach cannot guarantee the successful discovery of key gastric cancer driver genes. Considering such limitations, an integrative network-based approach may be useful for interpreting omics data and discovering cancer driver genes in the context of biological networks.
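The two steps described above, DEG selection followed by pathway enrichment, can be sketched in a few lines. The example uses a log2 fold-change cutoff on group means and a hypergeometric over-representation test, which is a simpler stand-in for gene set enrichment analysis and omits the statistical testing and multiple-testing correction that real workflows apply. All function names, gene names, and numbers are hypothetical.

```python
from math import comb, log2
from statistics import mean

def select_degs(tumor, normal, min_abs_log2fc=1.0):
    """Keep genes whose mean expression differs by at least the given log2 fold change."""
    degs = {}
    for gene in tumor:
        log2fc = log2(mean(tumor[gene]) / mean(normal[gene]))
        if abs(log2fc) >= min_abs_log2fc:
            degs[gene] = log2fc  # positive: over-expressed in tumor
    return degs

def enrichment_pvalue(n_background, n_pathway, n_deg, n_overlap):
    """P(overlap >= n_overlap) when n_deg genes are drawn at random from the background."""
    total = comb(n_background, n_deg)
    upper = min(n_pathway, n_deg)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical expression values (tumor vs. normal) for three genes.
tumor = {"G1": [8, 8], "G2": [2, 2], "G3": [1, 1]}
normal = {"G1": [2, 2], "G2": [2, 2], "G3": [4, 4]}
degs = select_degs(tumor, normal)

# Hypothetical pathway: 100 of 20,000 genes; 10 of 200 DEGs fall in it.
p_value = enrichment_pvalue(20000, 100, 200, 10)
```

With about one pathway gene expected among 200 random genes, an overlap of 10 gives a very small p-value, which is the kind of signal enrichment analysis reports.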
With predefined biological networks [e.g., protein-protein interaction network (PPIN), gene regulatory network, gene interaction network, and metabolic network], omics data can be mapped onto the networks to identify potential functional subnetworks (Figure 1). The activity of subnetworks or modules can be inferred by searching for alterations in the predefined networks, providing regulatory or interaction information linked to clinical information. Furthermore, network-based modeling approaches can be applied to relate the activities of subnetwork components to their influences and consequences on other network components (Creixell et al., 2015). Integrative network analysis utilizing gene expression data identified seven candidates for gastric carcinogenesis whose levels increased with disease progression (Takeno et al., 2008; Mansouri et al., 2018). Investigations of miRNA and mRNA expression with the human PPIN also revealed a novel miRNA that may decrease gastric tumor proliferation and metastasis through its regulated protein interaction network (Tseng et al., 2011). In summary, transforming gene-level information into network-level information may provide network biomarkers for understanding cancer biology (Takeno et al., 2008; Tseng et al., 2011; Mansouri et al., 2018).
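A minimal way to picture the mapping of omics data onto a predefined network is to restrict the network to the differentially expressed genes and collect the connected groups that remain. The sketch below does this with a breadth-first search; it is a simplified assumption of how candidate subnetworks might be extracted, and the protein names and edges are hypothetical.

```python
from collections import deque

def deg_subnetworks(edges, degs):
    """Connected components of the network restricted to DEG nodes, largest first."""
    degs = set(degs)
    neighbors = {gene: set() for gene in degs}
    for a, b in edges:
        if a in degs and b in degs:  # keep only edges between two DEGs
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, modules = set(), []
    for start in sorted(degs):
        if start in seen:
            continue
        queue, module = deque([start]), set()
        seen.add(start)
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            module.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        modules.append(module)
    return sorted(modules, key=len, reverse=True)

# Hypothetical protein-protein interactions and DEG list.
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P5", "P6")]
deg_list = ["P1", "P2", "P3", "P5"]
modules = deg_subnetworks(ppi_edges, deg_list)
```

Here P1, P2, and P3 form one candidate module, while P5 is left isolated because its only interaction partner is not differentially expressed.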

MACHINE LEARNING IN GASTRIC CANCER RESEARCH
The application of machine learning methods, which learn functional relationships from data, has increased considerably in cancer research and drug discovery (Angermueller et al., 2016; Borisov and Buzdin, 2019; Vamathevan et al., 2019; Cuocolo et al., 2020). One important application of machine learning is medical image analysis, and image-based recognition with machine learning has been increasingly applied to diagnosis in various medical fields (Cuocolo et al., 2020). Esophagogastroduodenoscopy (EGD) is the standard procedure for gastric cancer diagnosis. However, the false-negative rate of EGD detection is about 4.6-25.8% (Yalamarthi et al., 2004; Hirasawa et al., 2018). Using convolutional neural networks (CNNs), a machine learning diagnostic system was trained with >10,000 endoscopic images of gastric cancer (Hirasawa et al., 2018; Yoon and Kim, 2020). The resulting CNN correctly diagnosed 71 of 77 gastric cancer lesions, with an overall sensitivity of 92.2% (Hirasawa et al., 2018). Moreover, endoscopic images have been used by CNNs to stratify gastric cancer risk, classifying patients as low, moderate, or high risk (Nakahira et al., 2020).
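The sensitivity quoted above follows directly from the lesion counts, and its complement is the false-negative rate of the classifier on that lesion set. The short check below reproduces the arithmetic; the function name is an assumption, and the result is not directly comparable with the EGD false-negative range, which comes from different study populations.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the test detects."""
    return true_positives / (true_positives + false_negatives)

# Counts reported in the text: 71 of 77 gastric cancer lesions detected.
detected, total_lesions = 71, 77
sens = sensitivity(detected, total_lesions - detected)
false_negative_rate = 1.0 - sens

print(f"sensitivity = {sens:.1%}, false-negative rate = {false_negative_rate:.1%}")
```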
Beyond cancer diagnostics, machine learning also brings personalized treatment to the clinic (Borisov and Buzdin, 2019; Cuocolo et al., 2020). Surgery is the primary treatment for gastric cancer, but the high incidence of distant metastases and local recurrence in most gastric cancer patients, especially those with advanced gastric cancer, has paved the way for adjuvant therapy (Janunger et al., 2001; Sitarz et al., 2018). Adjuvant treatment may include chemotherapy, targeted drug therapy, or immunotherapy, either alone or in combination (Cunningham et al., 2006). In addition, neoadjuvant chemotherapy, an emerging approach that refers to preoperative chemotherapy, is recommended for the treatment of patients with resectable advanced-stage gastric cancer (Sitarz et al., 2018). The increasing amount of omics data linked to gastric cancer treatment provides opportunities to explore individual responses to chemotherapy or other types of treatment, and to predict the possible outcome using machine learning and mathematical modeling methods (Figure 1). With gene expression data from the TCGA cohort and the KUGH cohort, gene expression signatures specific to each of the four molecular subtypes were used to develop predictive models for patient stratification, and the models were tested in other large independent cohorts (Sohn et al., 2017; Oh et al., 2018). Interestingly, these results showed that the subtypes could serve as predictors of survival and response to adjuvant chemotherapy (Sohn et al., 2017). Moreover, a recent study used multi-omics data of tumor samples from patients who did or did not respond to neoadjuvant chemotherapy to characterize key mutational features, copy number alterations, and gene expression changes associated with treatment response (Li et al., 2020).
By comparing tumors from responders with those from non-responders, and pre-treatment with post-treatment samples, statistical models associated C10orf71 mutations with treatment resistance (Li et al., 2020). Taken together, such machine learning-based approaches integrate multi-omics data, providing efficient ways to predict treatment outcome based on host genetic information.
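To make the idea of signature-based patient stratification concrete, the sketch below trains a nearest-centroid classifier on toy expression signatures and assigns a new tumor to the closest subtype. This is an assumed, deliberately simple model and not the method of the cited studies; the subtype labels are those named in the text, while the four-gene signatures and all values are hypothetical.

```python
import math

def fit_centroids(samples, labels):
    """Average the expression vectors of each subtype into one centroid."""
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict_subtype(centroids, vector):
    """Return the subtype whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))

# Hypothetical training cohort: one signature gene per subtype, two tumors each.
train = [
    [5, 1, 1, 1], [6, 0, 1, 1],   # EBV
    [1, 5, 1, 0], [0, 6, 1, 1],   # MSI
    [1, 1, 5, 1], [1, 0, 6, 1],   # GS
    [1, 1, 1, 5], [0, 1, 1, 6],   # CIN
]
train_labels = ["EBV", "EBV", "MSI", "MSI", "GS", "GS", "CIN", "CIN"]
centroids = fit_centroids(train, train_labels)
new_tumor_subtype = predict_subtype(centroids, [4.5, 1.0, 0.5, 1.0])
```

A model of this kind would still need validation in independent cohorts before its predicted subtypes could be related to survival or treatment response.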
Immunotherapy has revolutionized both the cancer research and treatment landscape by targeting the host immune system (Coutzac et al., 2019; Szeto and Finley, 2019). Antibodies that block immune checkpoints such as programmed cell death-1 (PD-1), programmed death ligand-1 (PD-L1), and cytotoxic T lymphocyte-associated antigen-4 (CTLA-4) have proven efficacy in diverse solid cancers. Several studies have shown strong correlations between intra-tumoral immune cells and gastric cancer prognosis (Kang B. W. et al., 2017), and the efficacy of checkpoint inhibitors (e.g., nivolumab, pembrolizumab) and their combinations with chemotherapy has been evaluated in clinical trials (Kang Y.-K. et al., 2017; Boku et al., 2019). These results suggest that immunotherapy may be a potential option for patients with advanced gastric cancer. Machine learning has been used to build predictors of drug response and immunotherapy outcomes (Borisov and Buzdin, 2019; Leiserson et al., 2019). However, there is a lack of mechanistic understanding of the effects of gastric cancer immunotherapy on both the human host and the gastrointestinal microbiota. With the availability of immunotherapy- or chemotherapy-related multi-omics data, data-driven integration approaches and machine learning methods could integrate data with existing knowledge of gastric cancer subtypes in tumor-specific and patient-specific ways, which can help in stratifying patients before treatment. In addition, data-driven machine learning or mathematical modeling methods may also be useful for developing predictive models that provide insight into the rational design of personalized cancer therapy.

CONCLUSION AND PERSPECTIVES
Frontiers in Molecular Biosciences | www.frontiersin.org
Advances in omics technologies over recent decades enable the parallel measurement of millions of biomolecules at the same time. Omics-wide association studies have been widely applied in gastric cancer research and have revealed strong associations between omics features and gastric cancer development. With omics data from the genome, transcriptome, proteome, and epigenome levels, gastric cancer has been extensively stratified, and the resulting subtypes show strong correlations with therapeutic outcomes. Both the TCGA and ACRG classifications revealed four distinct gastric cancer subtypes, and the comparison between these two classification systems showed similarities, such as tumors with MSI in both data sets, and the TCGA GS, EBV+, and CIN subtypes were enriched in the ACRG dataset (Cristescu et al., 2015). However, strong inconsistencies between these two subtype systems were also observed, which covered most of the patient population. The wide variation in study designs and heterogeneity in study cohorts, together with variations in data analysis strategy, especially in data processing and analysis methods, make the findings of gastric cancer subtyping difficult to apply in the clinic (van den Boorn et al., 2018). Therefore, applying robust statistical methods and performing meta-analyses that pool estimates from multiple multi-omics studies may provide a powerful way to investigate gastric cancer across multiple cohorts.
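Pooling estimates across cohorts, as suggested above, is often done by inverse-variance weighting. The sketch below shows the fixed-effect version, which assumes that all cohorts estimate the same underlying effect; a random-effects model would be more appropriate when cohorts are heterogeneous. The function name and the example effect sizes (imagined as log hazard ratios) are hypothetical.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted mean of per-cohort estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log hazard ratios for one biomarker in three cohorts.
cohort_estimates = [0.20, 0.40, 0.30]
cohort_ses = [0.10, 0.10, 0.10]
pooled_estimate, pooled_se = pool_fixed_effect(cohort_estimates, cohort_ses)
```

More precise cohorts (smaller standard errors) receive larger weights, and the pooled standard error is smaller than that of any single cohort.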
With proteomics and metabolomics data, numerous gastric cancer-specific biomarkers have been identified, which pave the way for the diagnosis of gastric cancer at early stages. Systems biology-based integration of multi-omics data has provided many insights into cancer diagnosis and therapeutics. However, the application of such methods in gastric cancer still lags behind. Moreover, the application of big data and machine learning approaches in gastric cancer studies is still limited. With increasing omics data being generated in the gastric cancer research field, the application of systems biology approaches should provide a systematic picture of gastric cancer in the future.

AUTHOR CONTRIBUTIONS
YW and BJ conceived the study. X-JS, YW, and BJ wrote the manuscript. All authors contributed to the article and approved the submitted version.