Applications and Achievements of Single-Cell Sequencing in Gastrointestinal Cancer

Gastrointestinal cancer is a public health concern that seriously endangers human health. Emerging single-cell sequencing (SCS) technologies differ from bulk sequencing technologies, which average signals across many cells and can therefore obscure cell-level information. SCS is a powerful tool for deciphering cellular and molecular landscapes at single-cell resolution, revealing the features of single-cell genomes, transcriptomes, and epigenomes. Recently, SCS has been applied in gastrointestinal cancer research to clarify the origin and heterogeneity of gastrointestinal cancer, acquire microenvironmental information, and improve diagnostic and treatment methods. This review outlines the applications of SCS in gastrointestinal cancer research and summarizes the most recent advances in the field.


INTRODUCTION
Single-cell sequencing (SCS) has developed rapidly in recent years. The first single-cell mRNA sequencing experiment was performed in 2009, the first single-cell DNA sequencing experiment in human cancer cells followed two years later, and the first single-cell exome sequencing experiment was performed in 2012. In 2013, Picelli et al. improved Smart-seq, an SCS protocol, and the resulting method is called Smart-seq2 (1). In 2017, 10x Genomics and the Fred Hutchinson Cancer Research Center developed a new single-cell RNA sequencing (scRNA-seq) method, and since then, thousands of immune cells have been analyzed (2,3). SCS involves isolating individual cells from a tissue or body fluid, amplifying the extracted nucleic acid (DNA or RNA) to a detectable level, sequencing the genome and transcriptome, and finally correcting and analyzing the data. The SCS procedure is shown in Figure 1.
Gastrointestinal (GI) cancers are among the most common malignant tumors and comprise gastric cancer (GC), esophageal cancer (EC), colorectal cancer (CRC), pancreatic cancer, etc. They are characterized by high morbidity and mortality, a high degree of malignancy, and rapid progression (4). GC has a high global prevalence. It accounted for more than 1 million new cases and an estimated 769,000 deaths in 2020, ranking fifth among the causes of global cancer morbidity and fourth among those of cancer mortality (5). Due to the lack of obvious early symptoms, the mortality rate of GC remains one of the highest among malignant tumors. EC ranked seventh in incidence (604,000 new cases) and sixth in mortality (544,000 deaths) in 2020. Moreover, CRC (including anal cancer), which accounted for over 1.9 million new cases and 935,000 deaths, ranked third in terms of incidence but second in terms of mortality among all cancers in 2020. GI cancer is characterized by complex heterogeneity (6) and a specific tumor microenvironment that is highly conducive to tumor progression and metastasis (7).
SCS plays a significant role in cancer research. Bulk sequencing performs poorly in the presence of intratumoral heterogeneity because it misses rare mutations: in a mixed population of cancer cells, mutations carried by only a few cells are diluted or lost when the bulk signal is averaged (8). In contrast, SCS profiles individual cells at the molecular level and helps in obtaining more precise information about the tumor (Figure 2). Therefore, SCS is a potentially superior alternative to traditional sequencing methods. Moreover, although the combination of new resistant chemotherapy, molecular targeted therapy, and immunotherapy techniques has shown promising anti-tumor effects against advanced GI tumors, these techniques have several limitations. SCS can better help researchers investigate problems in tumor heterogeneity, the microenvironment, diagnosis, and treatment. Based on these advantages, many researchers have made important achievements in cancer research by using this technique. This review focuses on SCS and its applications and achievements in GI cancer studies.
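The dilution of rare mutations in bulk sequencing can be illustrated with a short calculation. The sketch below is only a toy model under stated assumptions: the cell counts, the heterozygous diploid genotype, and the 5% bulk detection limit are hypothetical values chosen for illustration, not figures from the cited studies, and the model ignores sequencing depth, allelic dropout, and amplification errors.

```python
def bulk_vaf(mutant_cell_fraction, mutant_alleles_per_cell=1, ploidy=2):
    """Expected variant allele fraction (VAF) when all cells are pooled.

    Assumes every cell contributes the same amount of DNA (hypothetical
    simplification) and that mutant cells carry `mutant_alleles_per_cell`
    mutant copies out of `ploidy` copies of the locus.
    """
    return mutant_cell_fraction * mutant_alleles_per_cell / ploidy


def detectable_in_bulk(mutant_cell_fraction, detection_limit=0.05):
    """True if the pooled VAF reaches the assumed bulk detection limit."""
    return bulk_vaf(mutant_cell_fraction) >= detection_limit


if __name__ == "__main__":
    n_cells = 1000   # hypothetical tumor sample size
    n_mutant = 20    # hypothetical rare subclone carrying the mutation
    fraction = n_mutant / n_cells

    # In the pooled sample the mutant allele is averaged over all cells.
    print(f"bulk VAF: {bulk_vaf(fraction):.3f}")
    print(f"detectable in bulk: {detectable_in_bulk(fraction)}")

    # Within any single mutant cell, the same heterozygous mutation
    # occupies one of two alleles, regardless of how rare the subclone is.
    print(f"VAF inside one mutant cell: {1 / 2:.3f}")
```

In this toy setting the subclone makes up 2% of cells, so its pooled allele fraction falls below the assumed detection limit, whereas the mutation occupies half of the alleles in each cell that carries it. Real single-cell data are noisier than this idealized picture suggests.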

SCS APPLICATIONS AND ACHIEVEMENTS IN GC

SCS Reveals the Origin and Heterogeneity of GC
Tumor heterogeneity arises because tumor cells develop into groups with different genotypes during growth, and these different cell groups lead to phenotypic inconsistencies. Intratumoral heterogeneity, which is observed in various cancers, is also one of the main clinical and pathological characteristics of GC. Cellular heterogeneity refers to significant genotypic differences within the same phenotype and leads to differences in the growth, invasion, metastasis, and drug sensitivity of GC cells (9,10). Many researchers have used SCS to investigate the origin and heterogeneity of GC (Table 1).

FIGURE 2 | Conventional sequencing (methods above) results in the neglect of some low-abundance information, whereas single-cell sequencing (methods below) captures cell heterogeneity.
cells by using SCS. They also identified 24 significantly mutated genes in single cells, such as CTAGE5, REC8, SORD, and PTCH2, in which single amino acid changes affected protein conformation. This study was the first to show the mutation pattern of GC cells at the intratumoral level and provided important information for understanding individualized targeted therapy and the heterogeneity of GC (12). Based on the SCS results of three primary and paired metastatic lymph node cancers in GC patients, Wang et al. identified distinct tumor characteristics and found that different patients had different microenvironment subsets; moreover, their clustering data revealed that KIF5B, NOTCH2, NOTCH2NL, and ERBB4 were highly expressed in primary carcinomas, whereas ERBB2, CDK12, and CLDN11 were highly expressed in metastatic carcinomas (13). Similarly, in Zhang's study, the authors classified a subclass of tumor-specific epithelial cells as "GC type 1" and a subclass consisting of epithelial cells and normal tissues of GC as "GC type 2". The expression of the intestinal mucosal markers MUC13, TFF3, SPINK4, FABP1, and REG4 was increased in the GC type 1 subclass, whereas the expression of the previously identified gastric cancer marker gene KRT7 was increased significantly in the GC type 2 subclass (14). A previous study used SCS and GC tumor cell clusters (C1-C5) to show that REG4, CLDN4, TFF3, and CLDN7 were upregulated in the malignant epithelium as compared with the nonmalignant epithelium. In addition, PGC, MUC5AC, LIPF, and GKN1 were highly expressed in the nonmalignant epithelium (15). In a recent study, Wang et al. used SCS to clarify the relationship between tumor cell lineage/state composition and intratumoral heterogeneity at the transcriptional, genotypic, molecular, and phenotypic levels. The study demonstrated the diversity of tumor cell lineage/state components in peritoneal carcinomatosis (PC) samples. This relationship was defined as the key factor responsible for intratumoral heterogeneity (16).

SCS Enables the Discovery of the Features of GC Microenvironment
Tumor cells and their microenvironments are interactive and coevolutionary. The tumor immune microenvironment comprises various tumor-infiltrating immune cells (such as B lymphocytes, T lymphocytes, mast cells, natural killer cells, and myeloid suppressor cells) (17,18). Tumor cells are also surrounded by the stroma, which is divided into cellular and acellular parts. Together, these compartments make up a complex tumor microenvironment that interacts with cancer cells. SCS helps to clarify the molecular mechanisms of immune cells in the tumor microenvironment during tumor cell generation, development, metastasis, drug resistance, and immune escape. It contributes to more accurate clinical diagnosis, treatment, and prognosis of solid tumors (19,20). In the past few years, several researchers have made significant advances in microenvironmental research by using SCS to analyze GC cells.

15 patients with gastric adenocarcinoma | 10x Genomics | RNA | 10x Genomics | In tumors with mixed gastrointestinal characteristics, the abundance fraction of B cells increased significantly, with a higher proportion of M1-like macrophages (proinflammatory) and a lower proportion of M2-like macrophages (anti-inflammatory).
Tumor tissue of 4 patients | C1 microfluidic system | RNA | SMART-Seq2 | TAMs from GC abundantly expressed proinflammatory cytokines and the macrophages were M2 macrophages.
Carcinogen-induced mouse model | FACS | RNA | SMART-seq2 | ILC2-derived factors were required for the reprogramming of the gastric mucosa after injury, and ILC2s performed a central role in the coordination of gastric epithelial repair after severe damage.

involved in the function of L635-treated ILC2 (type 2 innate lymphoid) cells (22). Using SCS, Fu et al. found that the expression of the transcription factor IRF8 in CD8+ tumor-infiltrating lymphocytes in GC tissue was downregulated in the late stage of the disease. These findings provided a further rationale for targeted immunotherapy in GC (23). Through SCS analysis, researchers confirmed that tumor-specific macrophages existed in a continuum of stimulus-dependent functional states and were regulated by a specific set of genes. They also found that an increased abundance of regulatory T cells (Tregs) in the gastric tumor microenvironment was related to immunosuppression (14). Kwon et al. found that dynamic tumor evolution in treatment response was mainly related to the collapse of mutant structures. By combining whole-exome sequencing (WES) with scRNA-seq, they found that distinct T-cell receptor lineages were associated with longer progression-free survival under pembrolizumab treatment. In addition, an increase in the number of PD-1+ CD8+ T cells was associated with lasting clinical benefits (24).

SCS Facilitates the Diagnosis and Treatment of GC
ScRNA-seq approaches can identify optimal combination therapies that efficiently target heterogeneous cell populations. Furthermore, scRNA-seq can identify the alterations associated with treatment resistance in different cell clusters to support individualized cancer therapy (25). The optimization of existing chemotherapeutic agents and the development of targeted therapies have provided more options for the treatment of advanced gastric cancer and further prolonged the expected survival of patients. In addition, global efforts, including the use of SCS, have been made to identify new specific, predictive, sensitive, and prognostic biomarkers and to establish innovative molecular classifications based on gene expression profiles (26). With the use of SCS, researchers have made great progress in the diagnosis and treatment of gastric cancer (Table 3). For example, Zhang et al. used SCS to construct a single-cell network based on the cellular and molecular characteristics of gastric epithelial cells with different lesions and established OR51E1 as a unique endocrine cell marker in early malignant lesions. They also suggested that HES6 may mark pre-goblet cell clusters, and these findings helped to identify metaplasia at an early stage. Zhang et al. also determined the specific characteristics of early GC, which are clinically significant for its accurate diagnosis (27). SCS can also help to identify markers related to tumor diagnosis and personalized therapy (33,34). Wang et al. performed SCS to classify PC samples into two subtypes that predicted outcome independently of clinical variables, and obtained and verified a prognostic signature of 12 genes (TPM2, FCGBP, CDK6, NCBP2, CLCX3, PIGR, BTF3, CKB, VPS28, TM4SF1, EIF3E, GPX4) in multiple large-scale gastric adenocarcinoma (GAC) cohorts (16). Another study by Bockerstett et al. established that in spasmolytic polypeptide-expressing metaplasia (SPEM) and cervical cell proliferation hypertrophy, the expression of SPEM-related transcripts was similar, and that the mechanism of drug-mediated parietal cell ablation was similar to that of SPEM induced by chronic inflammation (28). Chen performed single-cell DNA sequencing of 50 target circulating tumor cells (CTCs) and discovered that large multiploid CTCs (LCTCsmulti) and small CTCs with trisomy 8 (SCTCstri) had different gene variations. Moreover, mutations in the KRAS and Rap1 pathways were abundant in SCTCstri, while several unique mutations in the MET/PI3K/AKT pathway and SMARCB1 genes were found in LCTCsmulti. These findings highlighted the different mechanisms of drug resistance relevant to targeted therapy and could help in preventing poor prognosis in patients (29). Based on scRNA-seq data, some researchers classified GC samples into three molecular subtypes defined by GC differentiation-related genes. They found that molecular typing based on cell differentiation could successfully predict the overall survival of the patient, immune checkpoint gene expression, and clinicopathological features. This study emphasized the significance of GC cell differentiation in predicting the clinical outcomes and potential immunotherapy responses of patients (30). Furthermore, by transplanting two GC cell lines into mice and performing single-cell transcriptome sequencing of the transplanted tumors, Nagaoka confirmed that interleukin-17 (IL-17) could be a potential target for enhancing anti-PD-1 (programmed cell death protein 1) mAb treatment of GC (31). Bockerstett et al. sequenced the transcriptome of gastric mucosal epithelial cells and found that gastrin 3 mRNA was a tumor-specific marker of the gastric epithelium of intestinal metaplasia (32). Analysis of the SCS data from mice with hereditary diffuse gastric cancer (HDGC) revealed that inactivation of Cdh1 led to metastasis along the squamous cell differentiation trajectory associated with aberrant expression of GI epithelial differentiation center genes. Cytokeratin 7, encoded by the differentiation-dependent gene Krt7, was a specific marker of early neoplastic lesions in CDH1 carriers (20).
In conclusion, SCS is valuable in identifying prognostic tumor markers for predicting potential clinical outcomes and immune responses, as well as for individualized therapy.

Among the genes upregulated in the endocrine cells of the EGC lesions, OR51E1 was ranked at the top. HES6 could be used to label cells with some goblet cell features but had not been morphologically identified as goblet cells.
Carcinogen-induced mouse model | 10x Genomics | RNA | 10x Genomics | Muc6+ Gif+ epithelial cells were present in healthy stomachs, but did not express SPEM transcripts such as Tff2, Cd44, and Cftr.

SCS APPLICATIONS AND
These findings demonstrated the high heterogeneity of ESCC tumor cells in terms of gene expression and CNV status (37). Another study explained the high degree of heterogeneity in the ESCC microenvironment. Macrophages were clustered into five subsets; among them, Mac_1, Mac_2 and Mac_3 expressed higher levels of anti-inflammatory "M2"-related genes in Mac_C, with Mac_5 expressing M2-like genes (38).

SCS Enables the Discovery of the Features of EC Microenvironment
SCS plays an important role in determining the cellular characteristics of the EC microenvironment. To date, various studies have used SCS techniques to explore the microenvironment of EC (Table 5). For example, some researchers discovered that TAMs expressed not only genes related to immunosuppression (TGFB1 and COX2) but also genes related to angiogenesis (VEGFA, CXCL8, MMP9, and MMP12), and found that VEGFA was upregulated in monocytes, while MMPs were mainly expressed in TAMs. These results revealed the immunosuppressive status of the esophageal squamous cell carcinoma (ESCC) tumor microenvironment and improved our understanding of ESCC (35). Similarly, several characteristics of CD4+ T cells have been identified by using SCS. Three classes of CD4+ T cells were identified: CD4_1 upregulated TIGIT expression in tumors, CD4_2 expressed PD-1 (encoded by PDCD1) exclusively in tumors, and CD4_3 showed tumor-specific TIGIT and CD96 expression. In addition, both CD4_1 and CD4_2 were found to express high levels of CTLA-4 in tumors (38). Wu et al. recently showed that cell cycle signaling genes, such as E2F3, CHEK1, CDC20, SMC3, and TFDP1, were associated with high cancer stemness in EAC. In addition, they identified a novel cancer stem cell-associated gene, poly (ADP-ribose) polymerase 4, and validated its association with survival in a cohort study of 121 ESCC patients (39). Another study used SCS to discover a strong correlation between FGF2 and SPRY1 expression in EC. In the fibroblasts in EC tissues, high FGF2 expression was found to be associated with low overall survival, and a mouse tumor model confirmed that FGF2 overexpression in fibroblasts significantly upregulated SPRY1 expression in the depleted T cells, weakened the cytotoxic activity of T cells, and promoted tumor growth (40).
Together, several studies using SCS have identified the features of different cells in the EC microenvironment and found some exclusively expressed genes. This research has considerably improved our understanding of esophageal carcinogenesis.

SCS Facilitates the Diagnosis and Treatment of EC
SCS also plays an important role in the diagnosis and treatment of EC by facilitating the identification of diagnostic markers and the development of new treatments. Many investigators have applied SCS to make great progress in the diagnosis and treatment of EC.

SCS Reveals the Origin and Heterogeneity of CRC
SCS is important in revealing the origin and heterogeneity of CRC.
With the help of SCS, many investigators have further revealed the origin and heterogeneity of CRC (Table 7). Some researchers performed single-cell analysis of colon cancer samples using high-throughput SCS based on multiple displacement amplification. They focused on a mutant gene, SLC12A5, and found that SLC12A5 activation could promote cell proliferation and inhibit apoptosis, thus potentially promoting oncogenesis, and they demonstrated the biclonal origin of CRC cases (46). In addition, two normal or adenomatous polyps in CRC patients were studied via single-cell whole-exome sequencing and matched bulk WES. The results indicated that non-random somatic mutations accumulated in genes involved in the GPCR, PI3K-Akt, and FGFR signaling pathways. The study identified new driver mutations in OR1B1 (GPCR signaling) during adenoma evolution and in LAMA1 (PI3K-Akt signaling) and ADCY3 (FGFR signaling) during CRC evolution, and confirmed that both colorectal adenomas and CRC were of monoclonal origin (47). Davel analyzed scRNA-seq data of 2824 cells from CRC tissue, divided the cells into five clusters according to specific genes, and further analyzed the cluster data, gene ontology terms, KEGG pathways, and trajectory maps. The study found that cluster 1 was characterized by a unique set of genes, such as IGLC7, IGLC2, and IGLC3; cluster 2 was characterized by HLA-DRA, IGHM, IGHG2, and other genes; and the remaining three clusters were likewise defined by unique genes. A high degree of specificity between the different clusters was found (48). These results showed that SCS was a powerful tool for studying tumor cell heterogeneity. Owing to the unique high resolution of SCS, scTRIO-SEQ (a type of single-cell triple sequencing) can simultaneously assess somatic copy number changes, DNA methylation, and transcriptomic information and facilitate single-cell heterogeneity research. The high-throughput and high-resolution characteristics of scRNA-seq are also beneficial for the detection of tumor samples (49). Some studies have shown that mutations in ATM and GNAS, as well as deletions in the tumor suppressor gene PTEN, likely led to tumorigenesis because these genes were potential cancer driver genes. In addition, it has been suggested that mutations in TP53, ERBB2, and APC may play an important role in tumorigenesis and may serve as drug targets (50). Markers for two different subtypes of cancer-associated fibroblasts (CAFs) were identified using SCS. CAF-B cells expressed markers of myofibroblasts such as TAGLN, ACTA2, and PDGFA, and CAF-A cells expressed DCN, MMP2, and COL1A2. Only the CAF-A cells expressed FAP (fibroblast activation protein α). This indicated that the heterogeneity of CAFs may constitute a potential barrier to FAP-directed therapy (51).

New mutations were found in OR1B1 (GPCR signaling pathway) in adenoma evolution, and LAMA1 (PI3K-Akt signaling pathway) and ADCY3 (FGFR signaling pathway) in CRC evolution.
Cancer tissue obtained 2824 cells from CRC patient with stage III C | 10x Genomics | RNA | 10x Genomics | High degree of specificity existed for the genes clustered by five cells in the same tumor tissue.
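To illustrate how marker genes such as those reported for CAF-A and CAF-B can be used to label individual cells, the following is a minimal sketch of marker-based scoring. It is not the method of the cited study (51): the mean-expression score, the `min_margin` threshold, and all expression values are hypothetical assumptions introduced here for illustration, and only the gene names come from the text.

```python
# Marker genes as listed in the text for the two CAF subtypes (51).
CAF_MARKERS = {
    "CAF-A": ("DCN", "MMP2", "COL1A2"),
    "CAF-B": ("TAGLN", "ACTA2", "PDGFA"),
}


def subtype_scores(expression):
    """Mean expression of each subtype's markers in one cell.

    `expression` maps gene name -> normalized expression value.
    Genes absent from the dict are treated as not detected (0.0).
    """
    return {
        subtype: sum(expression.get(gene, 0.0) for gene in genes) / len(genes)
        for subtype, genes in CAF_MARKERS.items()
    }


def assign_subtype(expression, min_margin=0.5):
    """Label a cell with the higher-scoring subtype.

    Returns "unassigned" when the two scores differ by less than
    `min_margin` (a hypothetical cutoff, chosen arbitrarily here).
    """
    ranked = sorted(subtype_scores(expression).items(),
                    key=lambda item: item[1], reverse=True)
    (best, best_score), (_, second_score) = ranked
    if best_score - second_score < min_margin:
        return "unassigned"
    return best


if __name__ == "__main__":
    # Hypothetical normalized expression profiles for three fibroblasts.
    cells = {
        "cell_1": {"DCN": 4.2, "MMP2": 3.1, "COL1A2": 5.0, "ACTA2": 0.3},
        "cell_2": {"TAGLN": 3.8, "ACTA2": 4.4, "PDGFA": 2.0, "DCN": 0.5},
        "cell_3": {"DCN": 1.0, "ACTA2": 1.2},
    }
    for name, profile in cells.items():
        print(name, assign_subtype(profile))
```

Published analyses typically rely on unsupervised clustering followed by differential expression rather than a fixed rule like this one. The sketch only shows why subtype-specific markers make it possible to separate the two populations cell by cell.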

SCS Enables the Discovery of the Features of Tumor Microenvironment in CRC
SCS facilitates the discovery of cellular features in the CRC microenvironment (Table 8). SCS data revealed that the proportion of somatic copy number alterations (SCNAs) in cancer tissues was higher than that in adjacent normal tissues (11.1% vs. 10.6%), and five genes (BGN, RCN3, TAGLN, MYL9, and TPM2) were identified as fibroblast-specific biomarkers of poor CRC prognosis. Thus, SCS confirmed the extensive genomic alterations in cells of the CRC tumor microenvironment (52). Studies of the CRC tumor microenvironment using SCS revealed cancer type-specific T cell subsets and developmental patterns, as well as a detailed molecular characterization of tumor immune-related T cell clusters. The cellular and molecular mechanisms underlying the composition, heterogeneity, and formation of the tumor immune microenvironment were revealed (53). Comprehensive analysis of non-epithelial scRNA-seq data derived from precancerous lesions and CRC revealed that the proportion of CD8+ T cells, natural killer cells, and γδ T cells (labeled cytotoxic cells) was significantly increased in serrated polyps compared to that in adenomas (54). Another study examined TAMs in CRC and found that Bcl9 deficiency inhibited macrophage polarization from M0 to M2 and altered the CRC tumor microenvironment, further interfering with the inflammation of M0 and M1 macrophages. The cell type balance and transcriptional differences in TAMs regulated by BCL9-driven Wnt signaling affected immune surveillance and inflammation in cancer (55). Together, with the support of new technologies, SCS has greatly promoted a thorough understanding of the tumor microenvironment.

SCS Facilitates the Diagnosis and Treatment of CRC
Many researchers have used SCS to make great achievements in the diagnosis and treatment of CRC (Table 9). Using high-throughput single-cell DNA sequencing to study a late-dissemination model of metastatic CRC, some studies demonstrated that primary tumor cells evolved for a long time and acquired many mutations, such as in KRAS, NRAS, APC, and TP53, before spreading to distant sites and organs. This dissemination model could be extended to many human cancers with important clinical significance (56). Lei et al. conducted scRNA-seq analysis on immune and stromal populations from CRC patients and identified specific macrophage and conventional dendritic cell subsets as key mediators of cellular cross-talk in the tumor microenvironment. In addition, they determined that anti-CSF1R treatment preferentially depleted macrophages with inflammatory features, and CD40 agonist antibody treatment preferentially activated the conventional dendritic cell population (57).

SUMMARY AND PROSPECTS
The incidence and mortality rate of malignant tumors in China are the highest in the world, and the overall situation of GI cancer prevention and treatment is very grim. SCS has become an important technique for studying GI cancer. With its continued development and integration with other technologies, further improvements in SCS will increase its applicability in clinical settings. However, the technique has some limitations. For example, biological noise can alter single-cell sequencing data and affect the results (36). Another limitation is RNA leakage, which may occur during reverse transcription and introduce substantial bias (58). In addition, SCS is very sensitive to sample quality and is therefore not suitable for the analysis of preserved or poorly processed clinical samples, so it is difficult to translate the results of sequencing studies into the clinic (59). The high cost of SCS also limits the ability to analyze a large number of tumors, and often only a few to dozens of samples are analyzed per study.
With the development of single-cell multiplexed technologies and the miniaturization and automation of SCS instruments, these limitations may gradually be overcome, and SCS will find broader applications in GI cancer research. Moreover, with continuous innovation and optimization of methods, SCS technology will continue to promote the development of biomedicine and the accurate treatment of GI cancer and may aid in achieving high-quality long-term survival for patients with GI cancer.

AUTHOR CONTRIBUTIONS
YJ and KL participated in the conception and design. ZX, JL, PH, YZ and JY wrote the review. ZX, JL, YJ and KL revised the review. All authors contributed to the article and approved the submitted version.