ANGPTL2+ cancer-associated fibroblasts and SPP1+ macrophages are metastasis accelerators of colorectal cancer

Background: Liver metastasis (LM) is a leading cause of cancer-related death in patients with colorectal cancer (CRC), but the underlying mechanisms have not been fully elucidated. Novel metastasis accelerators and therapeutic targets for LM-CRC are therefore urgently needed. Methods: Bulk RNA sequencing data and clinicopathological information of CRC patients were obtained from the TCGA and GEO databases. Single-cell RNA sequencing (scRNA-seq) datasets of CRC were collected from and analyzed in the Tumor Immune Single-cell Hub (TISCH) database. The infiltration levels of cancer-associated fibroblasts (CAFs) and macrophages in CRC tissues were estimated by multiple immune deconvolution algorithms. The prognostic values of genes were assessed by Kaplan-Meier curves with a log-rank test. GSEA was carried out to annotate the significantly enriched gene sets. The biological functions of cells were experimentally verified. Results: Hundreds of differentially expressed genes (DEGs) were identified in LM-CRC compared to primary CRC, and these DEGs were significantly associated with the regulation of endopeptidase activity, blood coagulation, and metabolic processes. SPP1, CAV1, ANGPTL2, and COLEC11 were then identified as the characteristic DEGs of LM-CRC, and higher expression levels of SPP1 and ANGPTL2 were significantly associated with worse clinical outcomes of CRC patients. In addition, ANGPTL2 and SPP1 were mainly distributed in the tumor microenvironment (TME) of CRC tissues. Subsequent scRNA-seq analysis demonstrated that ANGPTL2 and SPP1 were markedly enriched in the CAFs and macrophages of CRC tissues, respectively. Moreover, we identified the significantly enriched gene sets in LM-CRC, especially those in the SPP1+ macrophages and ANGPTL2+ CAFs, such as HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_COMPLEMENT. Finally, our in vitro experiments showed that ANGPTL2+ CAFs and SPP1+ macrophages promote the metastasis of CRC cells. Conclusion: Our study selected four characteristic genes of LM-CRC and identified the ANGPTL2+ CAF and SPP1+ macrophage subtypes as metastasis accelerators of CRC, which provides potential therapeutic targets for LM-CRC.


Introduction
Colorectal cancer (CRC), one of the most common malignancies, ranks third in incidence and second in mortality among all tumors worldwide (1). Liver metastasis (LM) is responsible for the majority of deaths in CRC patients (1, 2). Although therapeutic strategies continue to develop, the clinical outcomes of LM-CRC patients remain unfavorable (3). The 5-year survival rate of early CRC patients who undergo complete surgical resection can reach 30%-57%, whereas the 5-year survival rate of LM-CRC patients is less than 5% (4). Therefore, the metastatic drivers and therapeutic targets of LM-CRC urgently need to be explored in depth.
Cancer cells live in a complex microenvironment, the tumor microenvironment (TME), which contains stromal cells, endothelial cells, and immune cells (5). Mounting evidence indicates that the TME stimulates cancer progression (6). Monocyte-derived macrophages (M0) are among the well-characterized innate immune cells (7) and can sometimes make up as much as 50 percent of the tumor mass (5). Typically, M0 macrophages can be polarized into classically activated M1 and alternatively activated M2 subtypes, which are collectively termed tumor-associated macrophages (TAMs) (8). TAMs promote tumor cell proliferation, invasion, and metastasis (9). For example, TAMs can stimulate angiogenesis and suppress the antitumor function of T cells (10). Cancer-associated fibroblasts (CAFs) are another important component of the TME and secrete growth factors, inflammatory ligands, and extracellular matrix proteins (11). CAFs are significantly associated with the metastasis of cancers. For instance, CAF-derived IL-32 can promote the invasion and metastasis of breast cancer cells via p38/MAPK signaling (12). Single-cell RNA sequencing (scRNA-seq) is an optimized high-throughput sequencing technology that can define the global gene expression of a single cell and facilitate the dissection of the heterogeneity of cancer tissues (13). Using scRNA-seq analysis, Li et al. determined that there are two different CAF subtypes in the TME of CRC tissues (14). In addition, Lin et al. reported that CTSB+ macrophages repress the memory immune hub in the liver metastasis site of CRC (15). However, the LM-associated subtypes of macrophages and CAFs remain elusive in CRC and deserve further investigation.
In this study, we identified four characteristic genes of LM-CRC and found that SPP1 and ANGPTL2 were significantly associated with the prognosis of CRC patients and were mainly located in the TME. Furthermore, scRNA-seq analysis showed that SPP1 and ANGPTL2 were distributed in the macrophages and CAFs of CRC tissues, respectively. In addition, we identified the significantly enriched gene sets in LM-CRC, especially those in the SPP1+ macrophages and ANGPTL2+ CAFs. Finally, the metastasis-accelerating roles of ANGPTL2+ CAFs and SPP1+ macrophages were experimentally verified.

Bulk RNA-seq data analysis
The bulk RNA sequencing data of CRC tissues were downloaded from the TCGA CRC cohort (http://tcga.xenahubs.net) and four GEO datasets (GSE81558, GSE49355, GSE178120, and GSE159216) (16)(17)(18). All gene expression values had been normalized. The differentially expressed genes (DEGs) were identified by the R "limma" package with a criterion of |log2FC| ≥ 1 and p < 0.05. The clinicopathological characteristics and prognostic information of CRC patients were downloaded from the TCGA database. The GSE81558 dataset contains 23 primary CRC tissues and 19 LM-CRC tissues. The GSE49355 dataset consists of 20 primary CRC tissues and 19 LM-CRC tissues. The GSE178120 dataset contains 105 primary CRC tissues. The GSE159216 dataset consists of 283 LM-CRC tissues. The TCGA CRC cohort includes 362 CRC patients with follow-up times.
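The DEG selection rule stated above (|log2FC| ≥ 1 and p < 0.05) amounts to a simple threshold filter on per-gene statistics. The following is a minimal Python sketch of that filter only; it is not the limma workflow used in the study, the helper `select_degs` is hypothetical, and the gene names and values are toy data made up for illustration.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fc: float  # log2 fold change, LM-CRC vs. primary CRC
    p_value: float


def select_degs(stats, min_abs_log2_fc=1.0, max_p=0.05):
    """Split genes passing |log2FC| >= 1 and p < 0.05 into up/down lists."""
    up, down = [], []
    for s in stats:
        if abs(s.log2_fc) >= min_abs_log2_fc and s.p_value < max_p:
            (up if s.log2_fc > 0 else down).append(s.gene)
    return up, down


# Toy values for illustration only (not taken from the study).
toy_stats = [
    GeneStat("GENE_A", 2.3, 0.001),   # passes both thresholds, upregulated
    GeneStat("GENE_B", -1.0, 0.020),  # passes both thresholds, downregulated
    GeneStat("GENE_C", 0.8, 0.001),   # fold change too small
    GeneStat("GENE_D", 1.5, 0.200),   # p-value too large
]
up_genes, down_genes = select_degs(toy_stats)
```

Whether the p-value is raw or adjusted depends on which limma output column is used; the text does not specify this, so the sketch leaves it as a generic `p_value`.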

ScRNA-seq data analysis
The scRNA-seq data of CRC tissues were obtained from a GEO scRNA-seq dataset (GSE166555) (19) and an EMTAB scRNA-seq dataset (EMTAB8107) deposited in the public Tumor Immune Single-cell Hub (TISCH) database (http://tisch.compgenomics.org/home/) (20). The values in the single-cell expression matrix are normalized by the NormalizeData method in "Seurat", which scales the raw counts (UMI) in each cell to 10,000. A uniform analysis pipeline (MAESTRO) was applied to each dataset to perform quality control, clustering, and cell-type annotation. The GSE166555 dataset contains 66,050 cells from 12 CRC tissues. The EMTAB8107 dataset consists of 23,176 cells from 7 CRC tissues. The scRNA-seq was conducted on the 10× Genomics platform.
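The per-cell scaling described above can be illustrated with a short sketch. The code below is a pure-Python approximation, assuming the default log-normalization behavior of Seurat's NormalizeData (counts divided by the cell total, multiplied by a scale factor of 10,000, then transformed with log1p); the log1p step is an assumption not stated in the text, `log_normalize` is a hypothetical helper, and the counts are toy values.

```python
import math


def log_normalize(cell_counts, scale_factor=10_000):
    """Scale one cell's UMI counts to `scale_factor` in total, then apply log1p."""
    total = sum(cell_counts.values())
    if total == 0:
        # An empty cell has nothing to scale; return zeros.
        return {gene: 0.0 for gene in cell_counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in cell_counts.items()
    }


# Toy UMI counts for a single cell (illustration only).
toy_cell = {"SPP1": 5, "ANGPTL2": 0, "ACTB": 95}
normalized = log_normalize(toy_cell)
```

Because every cell is brought to the same total before the log transform, expression values become comparable between cells with different sequencing depths.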

Functional enrichment analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were carried out in the online DAVID database (https://david.ncifcrf.gov). GO analysis includes the biological process (BP), molecular function (MF), and cellular component (CC) categories. The results of the GO and KEGG analyses were visualized with an online tool, OmicShare (http://www.omicshare.com/tools). GSEA was carried out based on the MSigDB database (http://www.gsea-msigdb.org/gsea/msigdb/index.jsp). Significantly enriched gene sets were selected with a criterion of |NES| > 1, nominal (NOM) p-value < 0.05, and false discovery rate (FDR) < 0.25. GSEA 4.3.0 software was used for the enrichment analysis.

Evaluating the infiltration levels of CAFs and macrophages
The infiltration abundances of CAFs and macrophages were estimated in the online tool TIMER2 (http://timer.cistrome.org) by multiple immune deconvolution methods. The infiltration levels of macrophages were assessed by the TIMER, CIBERSORT, CIBERSORT-ABS, EPIC, QUANTISEQ, MCPCOUNTER, and XCELL algorithms. The infiltration levels of CAFs were assessed by the MCPCOUNTER, EPIC, and XCELL algorithms.

Collecting tissue samples
Ten paired CRC tissues and corresponding normal tissues were obtained from CRC patients undergoing tumor resection. The clinicopathological characteristics of the CRC tissues are described in Table 1. All enrolled patients signed informed consent and had not previously received any anti-tumor treatment. This study was approved by the Ethics Committee of Nanjing First Hospital.

Isolating fibroblasts
Fibroblasts were isolated from fresh tissues as previously described (21). Briefly, minced tissue samples were digested in a mixture of 1 mg/mL collagenase (cat. #C4-BIOC, Sigma-Aldrich), Dulbecco's modified Eagle's medium (DMEM; cat. #KGM12800, KeyGen), and 10% fetal bovine serum (FBS; cat. #12106C, Sigma-Aldrich) for 2 h at 37 °C with shaking. After centrifugation, the cell pellets were resuspended and filtered through a 100 μm cell strainer. The cells were then cultured in DMEM containing 10% FBS, and the culture medium was changed 2 h later. Because fibroblasts attach to culture dishes more readily than other cells, this medium change removes the less adherent cells. Fibroblasts were identified by the detection of two positive markers (α-SMA and Vimentin) and two negative markers (KRT20 and Desmin) (22).

ELISA assay
The protein levels of ANGPTL2 in the culture media of CAFs and NFs were measured with an ELISA kit (cat. #CSB-E13881h, CUSABIO). Briefly, an equal volume of culture medium from the same number of CAFs or NFs was centrifuged for 15 min at 1000 × g and 4 °C, and the supernatant was collected and kept on ice. Standards and samples (100 μL per well) were added, and the plate was covered with an adhesive strip and incubated for 2 h at 37 °C. The liquid was then removed from each well without washing. Biotin-antibody (1×; 100 μL) was added to each well, and the plate was covered with a new adhesive strip and incubated for 1 h at 37 °C. Each well was then aspirated and washed three times; for each wash, 200 μL of Wash Buffer was added and left to stand for 2 min. After the last wash, any remaining Wash Buffer was removed and 100 μL of HRP-avidin (1×) was added to each well. The plate was covered with a new adhesive strip and incubated for 1 h at 37 °C. Subsequently, 90 μL of TMB substrate was added to each well, followed by incubation for 30 min at 37 °C in the dark. Stop Solution (50 μL) was added to each well, and the plate was gently tapped to ensure thorough mixing. Finally, the absorbance of each well at 450 nm was measured with a microplate reader (Infinite M200 PRO, TECAN).

Transwell assay
For the Transwell assay, NFs, CAFs, or macrophages were seeded in the lower well and cultured in 500 μL DMEM or RPMI-1640 medium supplemented with 10% FBS. CRC cells (2 × 10⁴) were resuspended in 200 μL DMEM and seeded into the upper well of 8-μm-pore chambers (cat. #3422, Corning) that had been coated with Matrigel (cat. #356234, Corning). After co-culture at 37 °C and 5% CO2 for 24 h, the CRC cells on the upper surface of the filter were removed, and the cells that had invaded to the opposite side of the filter were stained with 0.5% crystal violet and counted under a microscope (Nikon, Japan).

Immunohistochemistry (IHC) assay
The IHC assay was performed on paraffin sections (5 μm) of normal tissues and CRC tissues. Briefly, after deparaffinization and rehydration, antigen retrieval was performed in 0.01 M sodium citrate buffer (pH 6.0) at boiling temperature for 10 min. Thereafter, the sections were incubated with 3% hydrogen peroxide for 10 min and 5% bovine serum albumin for 1 h. The sections were then incubated with primary antibodies at 4 °C overnight. The next day, the sections were washed three times with PBS and incubated with the corresponding secondary antibodies. Finally, the DAB system was applied to visualize the signal, and hematoxylin was applied to stain the nuclei. The immunostaining images were captured using a microscope (Nikon, Japan). The degree of positivity was scored according to both the proportion of positively stained tumor cells and the staining intensity. The IHC score was calculated as previously described (24). The antibodies used in this study were as follows: anti-CAV1 (1:200, cat. #16447-1-AP, Proteintech), anti-SPP1 (1:200, cat. #sc-21742, Santa Cruz), anti-ANGPTL2 (1:200, cat. #12316-1-AP, Proteintech), and anti-COLEC11 (1:200, cat. #15269-1-AP, Proteintech).

Statistical analysis
Statistical analysis was performed using GraphPad Prism 8.0 (GraphPad, United States) and R 4.2.0 software. Kaplan-Meier (KM) curves with a log-rank test were used to compare the overall survival and disease-free survival of patients between two groups. Statistical differences between two groups were analyzed with the Wilcoxon test. The correlations between genes and cell infiltration levels were calculated by Pearson correlation analysis. A p-value < 0.05 was considered statistically significant.
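The survival comparisons described above rest on the Kaplan-Meier product-limit estimator, in which the survival probability is multiplied by (1 - d/n) at each event time, where d is the number of events and n the number of patients still at risk. The following is a minimal pure-Python sketch with made-up follow-up data; `kaplan_meier` is a hypothetical helper, it is not the GraphPad Prism or R routine used in the study, and the log-rank test itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Walk through the distinct time points in ascending order.
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data (months) for illustration only.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients (event flag 0) never lower the curve directly; they only shrink the at-risk set for later time points, which is why the drop at month 12 is computed over three remaining patients rather than six.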

Functional enrichment analysis of the DEGs between LM-CRC tissues and primary CRC tissues
To select the DEGs between LM-CRC tissues and primary CRC tissues, the GSE81558 and GSE49355 datasets were collected. Analysis of the GSE81558 dataset showed that 162 genes are significantly upregulated and 46 genes are significantly downregulated in the LM-CRC tissues (Figure 1A). Analysis of the GSE49355 dataset showed that there are 164 upregulated genes and 100 downregulated genes in the LM-CRC tissues (Figure 1B). We then overlapped the DEGs from the two datasets and identified 123 common DEGs (Figure 1C). GO analysis revealed that these DEGs are significantly associated with many biological processes, such as the regulation of endopeptidase activity, blood coagulation, complement activation, and fibrinolysis (Figure 1D). The MF analysis showed that these DEGs mainly regulate serine-type endopeptidase inhibitor activity (Figure 1D). In addition, the CC analysis demonstrated that these DEGs are mainly located in the extracellular region, extracellular exosome, and extracellular space (Figure 1D). Moreover, the KEGG analysis revealed that these DEGs mainly participate in the complement and coagulation cascades and metabolic pathways (Figure 1E).
Frontiers in Immunology, frontiersin.org

The protein levels and prognostic values of the characteristic DEGs of LM-CRC
Next, we examined the protein levels of these characteristic DEGs in CRC tissues using the Human Protein Atlas database, and the results showed that ANGPTL2 and SPP1 are upregulated in CRC tissues compared to normal tissues (Figure 3A), whereas CAV1 and COLEC11 appear to be downregulated in CRC tissues (Figure 3A). Subsequently, we conducted IHC to verify their expression levels in normal and CRC tissues. Consistently, our IHC results showed that ANGPTL2 and SPP1 are upregulated, whereas CAV1 and COLEC11 are downregulated, in CRC tissues compared to normal tissues (Figure 3B). Intriguingly, ANGPTL2 and SPP1 are enriched in the TME but not in the CRC cells (Figures 3A, B). Moreover, prognostic analysis of the TCGA CRC cohort revealed that CRC patients with higher ANGPTL2, COLEC11, and SPP1 expression tend to have worse overall survival (Figure 3C). In addition, elevated ANGPTL2 and SPP1 expression in CRC tissues is significantly associated with unfavorable disease-free survival (Figure 3D).

Identify ANGPTL2+ CAFs and SPP1+ macrophages in CRC tissues
Given the prognostic values and unusual distribution of ANGPTL2 and SPP1 in CRC tissues, we used scRNA-seq data to explore the cell types in which they are enriched. Analysis of the scRNA-seq data in the GSE166555 dataset identified 33 cell clusters and 13 cell types in CRC tissues (Figures 4A, B). The marker genes of each cell type are shown in Figure 4C. We found that ANGPTL2 is significantly enriched in the CAFs, especially in the C17 cluster (Figure 4D). In addition, SPP1 is significantly enriched in the mono/macrophages, especially in the C16 cluster (Figure 4E). The cell-cell interaction (CCI) analysis revealed that ANGPTL2+ CAFs mainly interact with malignant CRC cells, mono/macrophages, epithelial cells, and dendritic cells (DCs) (Figure 4F), and that SPP1+ macrophages mainly interact with malignant CRC cells, endothelial cells, DCs, proliferating T cells, and myofibroblasts (Figure 4G). To verify our findings, we analyzed an EMTAB scRNA-seq dataset (EMTAB8107). This analysis identified 21 cell clusters and 12 cell types in CRC tissues (Figures 4H, I). The marker genes of each cell type are shown in Figure 4J. Consistently, ANGPTL2 is significantly enriched in the CAFs, especially in the C10 cluster (Figure 4K), and SPP1 is significantly enriched in the mono/macrophages, especially in the C13 cluster (Figure 4L). Moreover, the CCI analysis revealed that ANGPTL2+ CAFs mainly interact with malignant CRC cells, mono/macrophages, epithelial cells, and endothelial cells (Figure 4M). In addition, SPP1+ mono/macrophages mainly interact with malignant CRC cells, endothelial cells, and exhausted CD8+ T cells (Figure 4N). Furthermore, we conducted IF assays to explore the distribution of ANGPTL2/SPP1 proteins and CAF/macrophage markers in CRC tissues. The results showed that ANGPTL2 and SPP1 are enriched in subclusters of CAFs and macrophages in CRC tissues, respectively (Figure 4O).

Support the correlations between ANGPTL2/SPP1 and CAFs/macrophages in CRC tissues based on bulk RNA-seq
To support the above results, we analyzed the correlation between ANGPTL2 expression and CAF infiltration levels, as well as the correlation between SPP1 expression and macrophage infiltration levels, in CRC tissues based on three bulk RNA-seq datasets. As shown in Figure 5A, analysis of the TCGA CRC cohort by the EPIC, MCPCOUNTER, XCELL, and TIDE algorithms showed that ANGPTL2 expression is negatively correlated with the purity of CRC tissues but positively correlated with the infiltration levels of CAFs. In addition, analysis of the GSE49355 and GSE81558 datasets showed that the infiltration levels of CAFs are significantly elevated in CRC tissues with high ANGPTL2 expression (Figures 5B, C). As shown in Figure 5D, analysis of the TCGA CRC cohort by the EPIC, TIMER, MCPCOUNTER, XCELL, CIBERSORT, CIBERSORT-ABS, and QUANTISEQ algorithms showed that SPP1 expression is negatively correlated with the purity of CRC tissues but positively correlated with the infiltration levels of multiple types of macrophages. Consistently, analysis of the GSE49355 and GSE81558 datasets showed that the infiltration levels of macrophages, especially M2 macrophages, are positively correlated with SPP1 expression in CRC tissues (Figures 5E, F).
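The correlations between gene expression and infiltration scores reported above are Pearson correlations (see Statistical analysis). As a minimal sketch of how such a coefficient is computed, the pure-Python function below evaluates Pearson's r for paired values; `pearson_r` is a hypothetical helper, and the expression and infiltration numbers are toy values, not data from TCGA or GEO.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy data: gene expression vs. an estimated CAF infiltration score.
toy_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_caf_score = [0.10, 0.25, 0.30, 0.45, 0.50]
r = pearson_r(toy_expression, toy_caf_score)
```

A value of r close to 1 indicates that samples with higher expression also tend to have higher infiltration scores; it does not by itself show which cells produce the transcript, which is why the scRNA-seq analysis is needed alongside it.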

Identify the LM-associated gene sets in ANGPTL2+ CAFs and SPP1+ macrophages in CRC tissues
To identify the enriched gene sets associated with LM-CRC, GSEA was conducted based on the GSE49355 and GSE81558 datasets. As shown in Figures 6A, B, gene sets including HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION, HALLMARK_COMPLEMENT, HALLMARK_KRAS_SIGNALING_UP, and HALLMARK_UV_RESPONSE_DN are significantly correlated with liver metastasis of CRC. Subsequently, GSEA of the scRNA-seq data in the GSE166555 dataset showed that, compared to other identified cell types, the gene sets of HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_UV_RESPONSE_DN are specifically enriched in the ANGPTL2+ CAFs of CRC tissues (Figures 6C, D). In addition, compared to other identified cell types, the gene sets of HALLMARK_COMPLEMENT and HALLMARK_KRAS_SIGNALING_UP are significantly enriched in the SPP1+ macrophages of CRC tissues (Figures 6E, F). To validate these findings, GSEA of the scRNA-seq data in the EMTAB8107 dataset was conducted. Consistently, the gene sets of HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_UV_RESPONSE_DN are specifically enriched in the ANGPTL2+ CAFs in CRC tissues (Figures 6G, H).

Explore the correlations between ANGPTL2/SPP1 and ANGPTL2+ CAF-/SPP1+ macrophage-specific gene sets in CRC
Subsequently, we explored the correlations between ANGPTL2/SPP1 and the ANGPTL2+ CAF-/SPP1+ macrophage-specific gene sets in CRC based on the GSE49355 and GSE81558 datasets. In the GSE49355 dataset, the expression of most genes in the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_UV_RESPONSE_DN gene sets is positively correlated with ANGPTL2 expression in CRC tissues (Figures 7A, B). In addition, the expression of most genes in the HALLMARK_COMPLEMENT and HALLMARK_KRAS_SIGNALING_UP gene sets is positively correlated with SPP1 expression in CRC tissues (Figures 7C, D). The GSE81558 dataset was used to validate these findings. Consistently, ANGPTL2 expression is positively correlated with the expression of most genes in the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_UV_RESPONSE_DN gene sets (Figures 7E, F), and SPP1 expression is positively correlated with the expression of most genes in the HALLMARK_COMPLEMENT and HALLMARK_KRAS_SIGNALING_UP gene sets (Figures 7G, H) in CRC tissues.

ANGPTL2+ CAFs promote the EMT and metastasis of CRC cells by secreting ANGPTL2
The biological function of ANGPTL2+ CAFs in CRC remains unclear. ANGPTL2 is a secreted glycoprotein (26). Therefore, we hypothesized that ANGPTL2+ CAFs promote the metastasis of CRC cells by secreting ANGPTL2. We first isolated fibroblasts from tissues and used a panel of fibroblast markers, including two positive markers (α-SMA and Vimentin) and two negative markers (KRT20 and Desmin), to define the fibroblasts (Figure 8A). Western blot and IF assays showed that the expression of ANGPTL2 is significantly upregulated in CAFs compared to NFs (Figures 8B, C). The ELISA assay showed that CAFs secrete more ANGPTL2 than NFs (Figure 8D). We then co-cultured CRC cells with NFs or CAFs (Figure 8E). Transwell assays showed that, compared to NFs, CAFs significantly promote the metastasis of CRC cells (Figure 8F). Subsequently, we overexpressed ANGPTL2 in CAFs by transfecting rANGPTL2 (Figure 8G) and co-cultured the CAFs with CRC cells. Transwell assays showed that ANGPTL2 overexpression significantly promotes the metastasis of CRC cells (Figure 8H). Moreover, the immunoblotting assay showed that overexpressing rANGPTL2 in CAFs enhanced the EMT of CRC cells (Figure 8I).

SPP1+ macrophages enhance the invasion and metastasis of CRC cells
Although the oncogenic role of SPP1+ macrophages has been reported (27), whether SPP1+ macrophages regulate the metastasis of CRC cells remains elusive. We therefore examined the metastasis-promoting role of SPP1+ macrophages in CRC using in vitro experiments. We first tested the expression of SPP1 in macrophages (Figure 9A). After overexpressing SPP1 in macrophages (Figure 9B), we co-cultured them with CRC cells (Figure 9C). The Transwell assay showed that SPP1 overexpression in macrophages significantly enhances the invasion and metastasis abilities of CRC cells (Figure 9D).

Discussion
Liver metastasis is a leading cause of cancer-related death in CRC patients (1). A growing number of LM-associated regulators have been revealed (28). In the present study, we first selected hundreds of DEGs in LM-CRC compared to primary CRC and found that these DEGs are significantly associated with the regulation of endopeptidase activity, blood coagulation, and metabolic processes. Consistently, these DEGs mainly participate in the complement and coagulation cascades and metabolic pathways. Matrix metalloproteinases (MMPs) are common zinc-dependent endopeptidases that cleave components of the extracellular matrix (ECM) and facilitate tumor cell invasion and metastasis through several mechanisms (29). For instance, MMPs remove physical barriers to invasion through the degradation of ECM macromolecules, such as collagens, laminins, and proteoglycans (30). The association between blood coagulation and cancer metastasis is well recognized. Thrombin, a pleiotropic enzyme, has been found to contribute to cancer metastasis by increasing the adhesive ability of cancer cells (31). Metabolic reprogramming is a hallmark of cancer metastasis in which cancer cells manipulate their metabolic profile to meet the dynamic energetic requirements of the tumor microenvironment (32). For example, cancer cells can preferentially utilize glycolysis or oxidative phosphorylation depending on heterogeneous intrinsic or extrinsic factors (32). Cancer metastasis is also regulated by microenvironmental and systemic processes, such as immunosurveillance. Natural killer (NK) cells play an important role in immune responses against cancer metastasis (33). Hassan et al. reported that NK cells directly interact with circulating tumor cells to control cancer metastasis (34).
Our study identified SPP1, CAV1, ANGPTL2, and COLEC11 as the characteristic DEGs of LM-CRC. SPP1 and ANGPTL2 are also significantly associated with the clinical outcomes of CRC patients. SPP1 and CAV1 have previously been identified as metastatic markers of LM-CRC through weighted gene correlation network analysis (35). In addition, Xu et al. reported that SPP1 can promote CRC metastasis by activating epithelial-mesenchymal transition (36). Conversely, CAV1 serves as a tumor suppressor in CRC: it attenuates the migration and invasion of CRC cells by inhibiting the phosphorylation of EGFR (37). COLEC11 is a member of the collectin family of C-type lectins and plays a role in innate immunity through its ability to bind non-self sugars and to activate complement through the recruitment of MASP1 (38, 39). ANGPTL2 is a widely accepted metastasis promoter in various cancers, including CRC (40)(41)(42)(43). Given that ANGPTL2 is a secreted glycoprotein, Motoyoshi and colleagues found that serum ANGPTL2 levels are significantly upregulated in breast cancer patients with metastasis and are potential predictors of metastatic breast cancer (26).
Immunohistochemistry analysis revealed that ANGPTL2 and SPP1 proteins are distributed in the TME of CRC tissues. ScRNA-seq analysis demonstrated that ANGPTL2 and SPP1 are markedly enriched in CAFs and macrophages in CRC tissues, respectively. Zhang et al. found that SPP1 can mediate the M2 polarization of macrophages and upregulate the expression of PD-L1, which attenuates CD4+ T-cell activation (44). Recently, a scRNA-seq and spatial analysis discovered that SPP1+ macrophages are a subgroup of macrophages that interact closely with FAP+ CAFs in CRC (27). In addition, CRC patients with high FAP or SPP1 expression achieved less therapeutic benefit from anti-PD-L1 therapy (27). Consistently, our CCI analysis also revealed an interaction between SPP1+ macrophages and CAFs. M2 macrophages are significantly associated with the metastasis of cancer cells. For example, several tumor-derived exosomal miRNAs can be transferred to macrophages and induce their M2 polarization, which in turn promotes CRC metastasis by enhancing epithelial-mesenchymal transition and secreting VEGF (45). It is therefore not surprising that SPP1+ macrophages are metastasis promoters of LM-CRC. CAFs also play a critical role in cancer metastasis by contributing to ECM deposition and remodeling, extensive crosstalk with cancer cells, and epithelial-to-mesenchymal transition (46). Many CAF-derived factors have been shown to enhance the metastatic ability of cancer cells, such as IL-33 (47), TGF-beta (48), and miR-500a-5p (49). Li et al. previously determined through scRNA-seq analysis that there are two different CAF subtypes in the CRC tumor microenvironment (14). However, with the development of scRNA-seq technology and increasing sample sizes, more and more CAF subtypes have been discovered (50). In this study, to our knowledge, we are the first to identify ANGPTL2+ CAFs as a novel CAF subtype and to verify that ANGPTL2+ CAFs release ANGPTL2 protein to accelerate the metastasis of CRC cells.
Furthermore, we uncovered the specifically enriched gene sets in ANGPTL2+ CAFs and SPP1+ macrophages in CRC, such as HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION in ANGPTL2+ CAFs and HALLMARK_COMPLEMENT in SPP1+ macrophages. Epithelial-mesenchymal transition is a process in which epithelial cells acquire mesenchymal features. In cancer, epithelial-mesenchymal transition is closely associated with tumor invasion and metastasis (51). The complement system, which encompasses more than 50 soluble and membrane-bound proteins, is a pillar of the innate immune response, and emerging evidence has underscored its relevance in tumor metastasis (52). Our analysis also revealed that ANGPTL2 and SPP1 levels are positively correlated with the expression of genes in HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION and HALLMARK_COMPLEMENT, respectively, in CRC tissues. Here, we validated that ANGPTL2+ CAFs promote the epithelial-mesenchymal transition of CRC cells. Although our in vitro experiments verified that SPP1+ macrophages enhance the metastasis of CRC cells, whether SPP1+ macrophages regulate the complement cascades in CRC needs to be further investigated.
In summary, our study selected four characteristic DEGs of LM-CRC and identified the ANGPTL2+ CAF and SPP1+ macrophage subtypes as metastasis accelerators of CRC, which provides novel potential therapeutic targets for LM-CRC.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found within the article/Supplementary Materials.
FIGURE 1 The functional enrichment analysis of DEGs between LM-CRC tissues and primary CRC tissues. (A) The volcano plot of the DEGs in LM-CRC based on the GSE81558 dataset. (B) The volcano plot of the DEGs in LM-CRC based on the GSE49355 dataset. (C) Overlap the DEGs in two datasets to select the common DEGs. (D) The top 20 GO analysis terms of common DEGs in LM-CRC. (E) The top 20 KEGG pathway analysis terms of common DEGs in LM-CRC.
Liu et al. 10.3389/fimmu.2023.1185208

FIGURE 2 Identify the characteristic DEGs of LM-CRC.(A) Identify the characteristic DEGs of LM-CRC through the "LASSO" algorithm based on the GSE49355 dataset.(B) Identify the characteristic DEGs of LM-CRC through the "LASSO" algorithm based on the GSE81558 dataset.(C) Overlap the characteristic DEGs in two datasets to select the common DEGs of LM-CRC.(D) The ROC curves of SPP1, CAV1, ANGPTL2, COLEC11, and their combination to differentiate LM-CRC from primary CRC based on the GSE49355 dataset.(E) The ROC curves of SPP1, CAV1, ANGPTL2, COLEC11, and their combination to differentiate LM-CRC from primary CRC based on the GSE81558 dataset.(F) The ROC curves of SPP1, CAV1, ANGPTL2, COLEC11, and their combination to differentiate LM-CRC from primary CRC based on the GSE178120 and GSE159216 datasets.

FIGURE 3 The protein levels and prognostic values of the characteristic DEGs of LM-CRC. (A) The protein levels of SPP1, CAV1, ANGPTL2, and COLEC11 in CRC tissues and normal tissues deposited in the Human Protein Atlas database. (B) The protein levels of SPP1, CAV1, ANGPTL2, and COLEC11 in CRC tissues and normal tissues revealed by IHC assay. (C) The association between SPP1, CAV1, ANGPTL2, and COLEC11 expression and the overall survival of CRC patients. (D) The association between SPP1, CAV1, ANGPTL2, and COLEC11 expression and the disease-free survival of CRC patients. *, p<0.05; **, p<0.01.

FIGURE 5 Support the correlations between ANGPTL2/SPP1 and CAFs/macrophages in CRC tissues based on bulk RNA-seq.(A) The correlations between ANGPTL2 expression and the purity of CRC tissue and the infiltrated level of CAFs based on the TCGA CRC cohort.(B) The correlations between ANGPTL2 expression and the infiltrated level of CAFs based on the GSE49355 dataset.(C) The correlations between ANGPTL2 expression and the infiltrated level of CAFs based on the GSE81558 dataset.(D) The correlations between SPP1 expression and the purity of CRC tissue and the infiltrated level of macrophages based on the TCGA CRC cohort.(E) The correlations between SPP1 expression and the infiltrated level of macrophages based on the GSE49355 dataset.(F) The correlations between SPP1 expression and the infiltrated level of macrophages based on the GSE81558 dataset.*, p<0.05; **, p<0.01; ***, p<0.001.
FIGURE 6 Identify the LM-associated gene sets in ANGPTL2+ CAFs and SPP1+ macrophages in CRC tissues. (A) The significantly enriched gene sets in LM-CRC based on the GSE49355 dataset. (B) The significantly enriched gene sets in LM-CRC based on the GSE81558 dataset. (C) The enrichment of the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION gene set in identified cell types based on the GSE166555 scRNA-seq data. (D) The enrichment of the HALLMARK_UV_RESPONSE_DN gene set in identified cell types based on the GSE166555 scRNA-seq data. (E) The enrichment of the HALLMARK_KRAS_SIGNALING_UP gene set in identified cell types based on the GSE166555 scRNA-seq data. (F) The enrichment of the HALLMARK_COMPLEMENT gene set in identified cell types based on the GSE166555 scRNA-seq data. (G) The enrichment of the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION gene set in identified cell types based on the EMTAB8107 scRNA-seq data. (H) The enrichment of the HALLMARK_UV_RESPONSE_DN gene set in identified cell types based on the EMTAB8107 scRNA-seq data. (I) The enrichment of the HALLMARK_KRAS_SIGNALING_UP gene set in identified cell types based on the EMTAB8107 scRNA-seq data. (J) The enrichment of the HALLMARK_COMPLEMENT gene set in identified cell types based on the EMTAB8107 scRNA-seq data.
Consistently, in the EMTAB8107 dataset, the gene sets of HALLMARK_COMPLEMENT and HALLMARK_KRAS_SIGNALING_UP are significantly enriched in the SPP1+ macrophages in CRC tissues (Figures 6I, J).

FIGURE 7 Explore the correlations between ANGPTL2/SPP1 and ANGPTL2+ CAF-/SPP1+ macrophage-specific gene sets in CRC. (A) The correlations between ANGPTL2 expression and the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION gene set in CRC based on the GSE49355 dataset. (B) The correlations between ANGPTL2 expression and the HALLMARK_UV_RESPONSE_DN gene set in CRC based on the GSE49355 dataset. (C) The correlations between SPP1 expression and the HALLMARK_KRAS_SIGNALING_UP gene set in CRC based on the GSE49355 dataset. (D) The correlations between SPP1 expression and the HALLMARK_COMPLEMENT gene set in CRC based on the GSE49355 dataset. (E) The correlations between ANGPTL2 expression and the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION gene set in CRC based on the GSE81558 dataset. (F) The correlations between ANGPTL2 expression and the HALLMARK_UV_RESPONSE_DN gene set in CRC based on the GSE81558 dataset. (G) The correlations between SPP1 expression and the HALLMARK_KRAS_SIGNALING_UP gene set in CRC based on the GSE81558 dataset. (H) The correlations between SPP1 expression and the HALLMARK_COMPLEMENT gene set in CRC based on the GSE81558 dataset.

FIGURE 8 ANGPTL2+ CAFs promote the EMT and metastasis of CRC cells by secreting ANGPTL2. (A) IF assays of the CAF-specific positive markers α-SMA and Vimentin and negative markers KRT20 and Desmin in fibroblasts. (B) IF assays of the CAF-specific markers α-SMA and ANGPTL2 in NFs and CAFs. (C) The protein levels of ANGPTL2 in NFs and CAFs detected by western blot assay. (D) The expression levels of ANGPTL2 in culture media of NFs and CAFs. (E) The illustration of co-culturing fibroblasts with CRC cells. (F) The invasion ability of CRC cells revealed by Transwell assays. (G) The effect of rANGPTL2 overexpression in CAFs. (H) The invasion ability of CRC cells revealed by Transwell assays. (I) The changes of EMT-associated genes in CRC cells after co-culturing with ANGPTL2-overexpressed CAFs. ***, p<0.001.

FIGURE 9 SPP1+ macrophages enhance the invasion and metastasis of CRC cells. (A) IF assays of the macrophage-specific markers CD68 and SPP1 in macrophages. (B) The effect of SPP1 overexpression in macrophages. (C) The illustration of co-culturing SPP1-overexpressed macrophages with CRC cells. (D) The invasion ability of CRC cells revealed by Transwell assays.