Tumor Derived SIGLEC Family Genes May Play Roles in Tumor Genesis, Progression, and Immune Microenvironment Regulation

Background: SIGLEC family genes expressed by immune cells can be exploited by tumors to escape immune surveillance. SIGLEC genes are also expressed by tumor cells in different cancer types, but the functions of tumor derived SIGLEC expression in the tumor microenvironment (TME), which could affect cancer patients' survival, have barely been investigated.
Methods: Using bioinformatic analysis, we explored the mutation status of SIGLEC family genes through the cBioPortal database and their expression in different tumors through the UALCAN database. The GEPIA database was used to compare SIGLEC family genes' mRNA levels between cancers and to generate a list of highly correlated genes in tumors. The KM-plotter database was used to assess the association between SIGLEC genes and patient survival. The associations between SIGLEC family genes' expression, immune infiltration, and immune regulators' expression in the TME were examined with the TIMER 2.0 database, which also provided the differential fold changes of SIGLEC family genes in specific oncogenic mutation groups of different cancer types. Networks of SIGLEC family genes and highly correlated genes were constructed with the STRING database, and gene ontology and pathway annotation of the highly correlated genes were performed through the DAVID database.
Results: SIGLEC family genes were highly mutated and amplified in melanoma, endometrial carcinoma, non-small cell lung cancer, bladder urothelial carcinoma, and esophagogastric adenocarcinoma, while deep deletion of SIGLEC family genes was common in diffuse glioma. The level of alteration of SIGLEC family genes differed between tumors, and oncogenic mutations in different cancer types could influence SIGLEC family genes' expression. Most SIGLEC family genes were related to patients' overall survival and progression free survival. Tumor derived SIGLEC family genes were also related to tumor immune cell infiltration and may regulate the TME by influencing the chemokine axis.
Conclusion: Our computational analysis showed that SIGLEC family genes expressed by tumor cells were associated with tumor behaviors, and that they may also influence the TME through the chemokine axis, playing vital roles in patients' survival. Further experiments targeting tumor derived SIGLEC family genes are needed to confirm their influence on tumor growth, metastasis, and immune environment regulation.

INTRODUCTION
SIGLEC family genes encode a group of proteins belonging to the immunoglobulin superfamily in mammals. They can be divided into a conserved group (SIGLEC1, SIGLEC2, SIGLEC4, and SIGLEC15) and a rapidly evolving group (SIGLEC3, SIGLEC5, SIGLEC6, SIGLEC7, SIGLEC8, SIGLEC9, SIGLEC10, SIGLEC11, SIGLEC14, and SIGLEC16). They are widely expressed on the membranes of immune cell populations and are mainly involved in endocytosis and immune regulation in various diseases (1-5). SIGLEC proteins on immune cells can bind to sialylated oligosaccharides carried by glycoproteins or glycolipids on self or non-self cells, which in turn activates or inhibits immune cell function, either directly or through binding to other functional kinase proteins, providing targets for immune therapy (6).
Tumor cells can express sialylated ligands for SIGLEC receptors on immune cells, such as SIGLEC7 and SIGLEC9 on natural killer cells, suppressing immune cell function to escape immune surveillance (7-12). However, SIGLEC family genes can also be expressed by tumor cells across cancer types, and recent studies have found that SIGLEC15 expressed by tumor cells or macrophages in a mouse melanoma model could directly suppress CD8+ T cell infiltration and function in the tumor microenvironment by binding to a putative target on CD8+ T cells (13-15). The explicit roles of tumor intrinsic SIGLEC family gene expression in patients' survival, disease progression, and immune regulation in the tumor microenvironment remain unknown. We therefore used bioinformatic analysis to determine whether tumor derived expression of SIGLEC family genes plays roles in those aspects, which could provide new ideas for cancer immune therapy.

MATERIALS AND METHODS
Mutation and Alteration Frequency of SIGLEC Family Genes
The cBioPortal database (https://www.cbioportal.org) is an integrative database for analysis of mutations in various cancer types, containing somatic mutation and copy number variation data from The Cancer Genome Atlas (TCGA) database (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga) and other published articles (16,17). It was used to explore the mutation frequency of SIGLEC family genes in 33 types of tumors (melanoma, endometrial carcinoma, esophagogastric adenocarcinoma, non-small cell lung cancer, colorectal adenocarcinoma, ovarian epithelial tumor, cervical squamous cell carcinoma, bladder urothelial carcinoma, esophageal squamous cell carcinoma, sarcoma, head and neck squamous cell carcinoma, pancreatic adenocarcinoma, hepatocellular carcinoma, leukemia, prostate adenocarcinoma, invasive breast carcinoma, ocular melanoma, diffuse glioma, non-seminomatous germ cell tumor, renal non-clear cell carcinoma, pleural mesothelioma, adrenocortical carcinoma, glioblastoma, renal clear cell carcinoma, cervical adenocarcinoma, cholangiocarcinoma, mature B-cell neoplasm, pheochromocytoma, miscellaneous neuroepithelial tumor, undifferentiated stomach adenocarcinoma, seminoma, well-differentiated thyroid cancer, thymic epithelial tumor). The data of each tumor in the TCGA Pan-Cancer projects were selected for analysis and demonstration. The alteration frequency data (amplification, deletion, mutation, fusion, log2 transformed expression >= 2 or log2 transformed expression <= -2) of each SIGLEC family gene were also generated and downloaded from the official website for further analysis (Figure 1).
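The alteration categories listed above can be made concrete with a small sketch. This is not cBioPortal code. It is a minimal illustration that assumes hypothetical per-sample records of the form (cancer type, gene, copy number call, mutation flag, log2 expression) and hypothetical call labels "AMP" and "HOMDEL". A sample counts as altered when at least one category applies, and fusions are omitted for brevity.

```python
from collections import defaultdict


def classify(cna_call, mutated, log2_expr):
    """Return the set of alteration labels for one gene in one sample.

    The labels and the 'AMP'/'HOMDEL' call names are illustrative assumptions.
    """
    labels = set()
    if mutated:
        labels.add("mutation")
    if cna_call == "AMP":
        labels.add("amplification")
    elif cna_call == "HOMDEL":
        labels.add("deep deletion")
    # Expression thresholds taken from the text: >= 2 or <= -2 (log2 scale).
    if log2_expr >= 2:
        labels.add("mRNA high")
    elif log2_expr <= -2:
        labels.add("mRNA low")
    return labels


def alteration_frequency(records):
    """Fraction of samples with at least one alteration, per (cancer, gene)."""
    totals = defaultdict(int)
    altered = defaultdict(int)
    for cancer, gene, cna_call, mutated, log2_expr in records:
        key = (cancer, gene)
        totals[key] += 1
        if classify(cna_call, mutated, log2_expr):
            altered[key] += 1
    return {key: altered[key] / totals[key] for key in totals}
```

Counting a sample once even when several categories apply is a design choice here; reporting each category separately would need one counter per label.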
mRNA Expression Levels of SIGLEC Family Genes and Comparison Between Normal and Tumor Tissues
mRNA expression levels of SIGLEC family genes were compared between normal and tumor tissues of each cancer type using the UALCAN database (http://ualcan.path.uab.edu), a web portal for comprehensive analysis of cancer data from the TCGA database (18). The significance of each comparison was computed by the website, and the results were marked with asterisks for illustration.

Influence of SIGLEC Family Gene mRNA Expression on Patients' Overall Survival and Progression Free Survival
The KM-plotter database (http://www.kmplot.com/analysis/index.php?p=service) was used to analyze the association between SIGLEC family genes' mRNA expression and patients' overall survival (OS) and progression free survival (PFS). The KM-plotter database is a web portal for analyzing the association between gene expression and patient survival, using data from previously performed analyses, such as data from the TCGA Pan-Cancer projects and the GEO database, and covering microRNA, long non-coding RNA, mRNA, and epigenetic information (19). The hazard ratio, 95% confidence interval, and p value of each analysis were generated by the website. In this analysis, the cutoff giving the best p value was used to split patients into high- and low-expression groups.
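The best-cutoff grouping described above can be sketched as a scan over candidate expression thresholds, scoring each split with a two-group log-rank test. This is an illustrative sketch, not KM-plotter's implementation. The function names, the `min_group` parameter, and the use of every observed expression value as a candidate cutoff are assumptions. Because many cutoffs are tried, the smallest p value is optimistic and would normally need a multiple-testing correction.

```python
import math


def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        n_all = sum(1 for ti in times if ti >= t)
        n_high = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)
        d_all = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_high = sum(
            1 for ti, e, g in zip(times, events, in_high) if ti == t and e and g
        )
        frac_high = n_high / n_all
        observed_minus_expected += d_high - d_all * frac_high
        if n_all > 1:
            variance += (
                d_all * frac_high * (1 - frac_high) * (n_all - d_all) / (n_all - 1)
            )
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_cutoff(expression, times, events, min_group=2):
    """Return (cutoff, p) for the split with the smallest log-rank p value."""
    best = None
    for cutoff in sorted(set(expression)):
        in_high = [x > cutoff for x in expression]
        n_high = sum(in_high)
        if n_high < min_group or len(expression) - n_high < min_group:
            continue  # skip splits that leave one group too small
        chi2 = logrank_chi2(times, events, in_high)
        # Upper tail of a chi-square distribution with 1 degree of freedom.
        p = math.erfc(math.sqrt(chi2 / 2))
        if best is None or p < best[1]:
            best = (cutoff, p)
    return best
```

In this sketch, `events` holds 1 for an observed event and 0 for a censored patient, and patients with expression strictly above the cutoff form the high-expression group.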

mRNA Expression of SIGLEC Family Genes in Single Tumor and the List of SIGLEC Family Highly Correlated Genes
The Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn) is a comprehensive database for tumor gene expression analysis, providing a portal for analyzing specific genes in 32 types of cancer using data from the TCGA Pan-Cancer project and the GTEx database (https://www.gtexportal.org) (20,21). In this analysis, the median expression of SIGLEC family genes (transcripts per million, TPM) in each tumor type was downloaded from the website for demonstration. The top 50 highly correlated genes of each SIGLEC family gene in tumors were generated through the website and combined as the SIGLEC family highly correlated genes for further analysis.
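Combining the per-gene lists into one set of SIGLEC family highly correlated genes amounts to a union of top-ranked genes. The sketch below shows that step on an in-memory expression table. It assumes Pearson correlation as the ranking score and a hypothetical dictionary mapping gene symbols to expression vectors, and it does not call GEPIA.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vectors carry no correlation information
    return sxy / math.sqrt(sxx * syy)


def top_correlated(expr, target, n=50):
    """Names of the n genes most positively correlated with `target`."""
    scores = {
        gene: pearson(expr[target], values)
        for gene, values in expr.items()
        if gene != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


def family_correlated_genes(expr, family, n=50):
    """Union of the top-n correlated genes of every family member."""
    combined = set()
    for gene in family:
        combined.update(top_correlated(expr, gene, n))
    return combined
```

Because the lists overlap, the union is usually smaller than the number of family members times n.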

Correlation Between SIGLEC Family Genes, Immune Infiltration, and Immune Regulators
TIMER 2.0 (http://cistrome.shinyapps.io/timer) is a web-based tool for analysis of tumor immune infiltration, which provides scores for six types of infiltrating immune cells (B cells, CD4+ T cells, CD8+ T cells, myeloid-derived dendritic cells, macrophages, and neutrophils) in tumors (22,23). The correlation between SIGLEC family genes and the six types of infiltrating immune cells was examined in each of 32 cancers using data from the TCGA database, and the results were downloaded for demonstration. The correlation between SIGLEC family genes and well-known immune regulators was also analyzed through TIMER 2.0. TIMER 2.0 can also identify differentially expressed genes between specific oncogene mutation groups, and SIGLEC family genes were submitted for analysis of well-known oncogenic mutations in specific tumors.
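The study's limitations section notes that the correlation analysis was adjusted for tumor purity. One common way to make such an adjustment is a partial Spearman correlation, sketched below. This illustrates the general technique with hypothetical function names and is not the TIMER 2.0 code.

```python
import math


def ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(expression, infiltration, purity):
    """Spearman correlation of expression and infiltration, given purity."""
    rx, ry, rz = ranks(expression), ranks(infiltration), ranks(purity)
    r_xy = pearson(rx, ry)
    r_xz = pearson(rx, rz)
    r_yz = pearson(ry, rz)
    # First-order partial correlation formula applied to rank correlations.
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

The adjustment matters because both the expression of an immune-related gene and the estimated infiltration score tend to fall as tumor purity rises, which can inflate an unadjusted correlation.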

Network of SIGLEC Family Genes and Highly Correlated Genes in Tumors
The networks of SIGLEC family genes and highly correlated genes were constructed with the STRING database (https://string-db.org), which builds networks for selected genes from previously reported associations (24). The results were downloaded from the website, and Cytoscape (version: 3.7.1) was used to illustrate the correlation (25). The cytoHubba tool in Cytoscape was used to extract the core network and the top 10 leading node genes (26).
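Ranking leading node genes can be illustrated with the simplest hub score, node degree. cytoHubba offers several ranking methods and the text does not state which one was used, so degree is only a stand-in here. The edge list format and the function name are hypothetical.

```python
from collections import defaultdict


def top_hubs(edges, k=10):
    """Return the k nodes with the most distinct neighbors.

    `edges` is an iterable of (gene_a, gene_b) pairs from an undirected
    network. Ties are broken alphabetically so the result is deterministic.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda gene: (-len(neighbors[gene]), gene))
    return ranked[:k]
```

Storing neighbors in sets means duplicate edges from an export do not inflate a node's score.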

Gene Ontology of SIGLEC Family Highly Correlated Genes
The DAVID database (https://david.ncifcrf.gov), a web tool for functional annotation of gene lists (biological process, cellular component, and molecular function) and for gene symbol conversion, was used for gene ontology annotation of SIGLEC family highly correlated genes (27). It can link to the KEGG database (https://www.kegg.jp) for pathway annotation (28).
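Enrichment of an annotation term in a gene list is usually scored with a hypergeometric (one-sided Fisher) test. The sketch below shows the plain hypergeometric upper tail. DAVID reports a more conservative variant called the EASE score, so this is an illustration of the underlying idea and not DAVID's exact calculation. The parameter names are assumptions.

```python
from math import comb


def enrichment_p(hits, list_size, term_size, background_size):
    """P(at least `hits` term genes in a random list of `list_size` genes).

    hits            -- genes in the list annotated with the term
    list_size       -- number of genes in the submitted list
    term_size       -- number of background genes annotated with the term
    background_size -- total number of genes in the background
    """
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

A small p value means the term appears in the list more often than expected from its frequency in the background. With many terms tested, the p values would still need a multiple-testing correction.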

Statistics
All statistical tests were performed by the database derived tools, and a p value under 0.05 was considered significant. All heat maps in this analysis were constructed in the R environment (version: 3.6.1), using RStudio (version: 1.2.1335) and pheatmap (29-31). The R graphics package was used to construct a forest plot of hazard ratios in the survival analysis, and ggplot2, topGO, and clusterProfiler were used to generate the gene ontology graph (32-34).

RESULTS
Alteration Frequency of SIGLEC Family Genes Across Different Cancer Types
SIGLEC family genes (SIGLEC1 or CD169, SIGLEC2 or CD22, SIGLEC3 or CD33, SIGLEC4 or MAG, SIGLEC5, SIGLEC6, SIGLEC7, SIGLEC8, SIGLEC9, SIGLEC10, SIGLEC11, SIGLEC14, SIGLEC15, SIGLEC16) are also expressed in tumor cells, so we used the cBioPortal database to examine their mutation status in different cancer types. Mutation frequencies of SIGLEC1, SIGLEC2, and SIGLEC10 were relatively high among all SIGLEC family genes, and SIGLEC family genes (except SIGLEC15 and SIGLEC16) were highly mutated in melanoma. In contrast, mutations of SIGLEC family genes were rare in ocular melanoma, miscellaneous neuroepithelial tumor, seminoma, cholangiocarcinoma, undifferentiated stomach adenocarcinoma, pheochromocytoma, well-differentiated thyroid cancer, non-seminomatous germ cell tumor, and cervical adenocarcinoma. Of note, although most SIGLEC family genes were rarely mutated in cholangiocarcinoma and undifferentiated stomach adenocarcinoma, SIGLEC10 and SIGLEC7, respectively, were highly mutated in these two tumors. Alteration frequencies (mutation, amplification, deep deletion, and multiple alterations) of SIGLEC family genes were also high in endometrial carcinoma, non-small cell lung cancer, bladder urothelial carcinoma, and esophagogastric adenocarcinoma. In endometrial carcinoma, non-small cell lung cancer, and bladder urothelial carcinoma, both mutation and amplification of SIGLEC family genes were frequent; in esophagogastric adenocarcinoma, mutation, amplification, and deep deletion were all common. In diffuse glioma, deep deletion of SIGLEC family genes was common in comparison to the 32 other types of tumor. SIGLEC15 seemed to be deeply deleted in various tumors, while SIGLEC16 was amplified in mutating cancer types (Figure 2).

Expression Levels of SIGLEC Family Genes Between Normal and Tumor Tissues Across Cancer Types
To determine whether SIGLEC family genes were differentially expressed between normal and tumor tissues in various cancer types, we used the UALCAN database to examine their expression levels. mRNA expression levels of SIGLEC family genes differed between normal and tumor tissues, most markedly in breast invasive carcinoma (BRCA), colon adenocarcinoma (COAD), head and neck squamous cell [...]

We compared the median TPM expression of SIGLEC family genes between cancer types (from the GEPIA database), and the results showed that some SIGLEC family genes were highly expressed in specific tumors (Figure 4). Using the TIMER 2.0 database, we found that tumor specific oncogenic mutations could influence expression levels of SIGLEC family genes. In the APC mutation groups of COAD and rectum adenocarcinoma (READ), most SIGLEC family genes were significantly down-regulated; in the CTNNB1 mutation group of LIHC, all SIGLEC family genes were down-regulated (10 out of 14 achieved significance). For other common tumor mutations, such as PTEN, TP53, KRAS, and PIK3CA, SIGLEC family genes also showed significant expression changes in the mutation groups of different cancer types (Figure 5).

Tumor Derived mRNA Expression of SIGLEC Family Genes Was Highly Correlated to Patients' Overall Survival and Progression Free Survival Across Cancer Types
We associated mRNA expression of SIGLEC family genes with patients' survival using the KM-plotter database and found that SIGLEC family genes were related to patients' overall survival (OS) and progression free survival (PFS) in most cancer types (Figures 6A, B). For OS, most SIGLEC family genes showed significant correlation in LUAD, THYM, KIRC, and HNSC. In HNSC and LUAD, most SIGLEC family genes were protective factors, while in THYM and KIRC, most were risk factors (Figures 6C-F). In LIHC and pancreatic adenocarcinoma (PAAD), only one (SIGLEC6) and three (SIGLEC2, SIGLEC15, and SIGLEC16) SIGLEC family genes, respectively, were related to patients' OS; for PFS, however, most SIGLEC family genes were protective factors in LIHC and risk factors in PAAD. In bladder urothelial carcinoma (BLCA) and UCEC, most SIGLEC family genes were significant protective factors for PFS, and only MAG (SIGLEC4) was a risk factor for PFS in UCEC (Figures 6G-K).

[...] carcinoma and endocervical adenocarcinoma (CESC), esophageal carcinoma (ESCA), and KIRP, most SIGLEC family genes were positively correlated with B cell and CD8+ T cell infiltration (Figures 7A-C). SIGLEC family genes were also highly correlated with dendritic cell, macrophage, and neutrophil infiltration in various cancer types (Figures 7D-F). Of note, in brain lower grade glioma (LGG), most SIGLEC family genes were highly correlated with CD4+ T cell infiltration and negatively correlated with CD8+ T cell infiltration; in LIHC, LUAD, LUSC, PAAD, READ, SARC, SKCM, and stomach adenocarcinoma (STAD), most SIGLEC family genes were positively correlated with CD8+ T cells. In addition, unlike READ, COAD tissues showed a negative correlation between SIGLEC family genes and B cell infiltration.

SIGLEC Family Genes Expressed in Tumor Cells Could Influence Immune Regulators in Tumor Microenvironment, Such as Chemokine Axis, Immune Stimulator, Immune Inhibitor, and MHC Molecules
After finding that SIGLEC family genes expressed by tumor cells of various cancer types were closely related to tumor immune infiltration, we further examined the correlation between SIGLEC family genes and immune regulators in tumors. Since former studies have found that SIGLEC1 (CD169), SIGLEC2 (CD22), SIGLEC3 (CD33), SIGLEC7, SIGLEC9, and SIGLEC15 expressed by immune cell populations could be manipulated by tumor cells to escape immune surveillance, we focused on these SIGLEC family members. We found that SIGLEC family genes were significantly correlated with a wide spectrum of immune regulators, including the chemokine axis, immune stimulators, immune inhibitors, and MHC molecules, such as CD28, CD40, CD40LG, ICOS, LAG3, PDCD1, CD274, and CTLA4. Though some SIGLEC genes showed non-significant correlation with immune regulators, most coefficients were significant with an absolute value over 0.5, indicating active roles in immune regulation in the tumor microenvironment (Figure 8).
Additionally, we generated a list of SIGLEC family highly correlated genes in tumors through the GEPIA database and constructed a network of them through the STRING database (Figures 9A-B). The 10 top-ranked node genes were TYROBP, ITGAM, ITGB2, FPR2, C3AR1, LILRB2, FCER1G, FCGR2A, GNGT2, and FPR1. Gene ontology and pathway enrichment of the SIGLEC family highly correlated genes, performed through the DAVID database, showed that those genes were mostly related to immune function and enriched in cytokine-cytokine receptor interaction, chemokine signaling pathway, and leukocyte transendothelial migration (Figure 9C).

DISCUSSION
In our analysis, the alteration status of SIGLEC family genes differed between cancer types. In melanoma, endometrial carcinoma, non-small cell lung cancer, bladder urothelial carcinoma, and esophagogastric adenocarcinoma, mutation and amplification of SIGLEC family genes were frequent, while deep deletion was commonly seen in diffuse glioma. We speculate that in epithelium derived tumors, the high proliferation status of tumor tissue may cause high frequencies of mutation and amplification of SIGLEC family genes in tumor cells, whereas in diffuse glioma, the tumor microenvironment differs from that of other cancer types, which may redirect the adaptation of tumor cells. SIGLEC family genes were further associated with specific oncogene mutations in different cancer types and were differentially expressed between patients in the mutation and non-mutation groups, such as APC in COAD and READ, or CTNNB1 in LIHC. Those mutations were previously found to be related to the degree of tumor malignant behavior or to carcinogenesis, and we suggest that the evolution of tumor cells may somehow drive expression changes of SIGLEC family genes.
SIGLEC family genes were highly correlated with patients' OS and PFS across cancer types. In LUAD, THYM, KIRC, and HNSC in particular, most SIGLEC family genes were significantly related to OS, while in LIHC and PAAD, most of them were related to PFS. The hazard ratios for SIGLEC family genes differed between cancer types: most SIGLEC family genes were protective in some cancer types and risk factors in others. Also, though the hazard ratios for OS and PFS were consistent for most SIGLEC genes, in a few tumors some SIGLEC family members were a risk factor for one endpoint and a protective factor for the other. We suggest that expression levels of SIGLEC family genes may influence tumor malignant traits through different mechanisms in different cancer types, and that their involvement in the specific tumor microenvironment of each cancer type may also contribute to the survival differences across cancer types.
Former studies of SIGLEC family genes in tumors mainly addressed their functional roles on immune cell populations, where they facilitate tumor growth and immune escape in different cancer types. Examples include inhibitory SIGLEC9 and SIGLEC7 expressed on natural killer cells, and SIGLEC6 expressed on mast cells in colorectal cancer, which could be exploited by tumor cells through increased sialylation and glycosylation (7,35-41). A recent study of SIGLEC15 also showed that expression of SIGLEC15 by macrophages or tumor cells could directly suppress the function of CD8+ T cells in a melanoma model (15). Though some studies mentioned the expression difference of SIGLEC family members between normal and tumor tissue, the functions of tumor derived SIGLEC gene expression in tumor growth and progression were rarely investigated. Our results showed that all SIGLEC family genes expressed by tumors were related to survival (OS and PFS) in various cancer types, and that their mutation frequencies and expression levels differed according to the origin or tissue type of the tumor; further experiments are needed to uncover the detailed mechanisms in different cancer types. Since SIGLEC genes expressed on immune cells can stimulate or inhibit immune cell function, we examined the correlation between tumor expressed SIGLEC genes and immune infiltration scores, as well as immune regulators in the tumor microenvironment. Enrichment analysis of SIGLEC family highly correlated genes was also performed. Tumor expressed SIGLEC genes appeared to be highly correlated with the tumor immune microenvironment, and they may regulate immune infiltration by influencing the chemokine axis. SIGLEC family genes were correlated with macrophage, neutrophil, and dendritic cell infiltration in a broad range of cancer types, and in specific tumors, correlations with B cell, CD4+ T cell, and CD8+ T cell infiltration levels were high.
Former studies showed that expression of SIGLEC genes on immune cells was positively correlated with immune checkpoints, such as PD1, and in our study, tumor expressed SIGLEC genes were also positively correlated with various immune stimulators and inhibitors (8,36,38,42). The recent study of SIGLEC15 additionally demonstrated the structural similarity of SIGLEC15 and PDL1. Since SIGLEC15 has a relatively conserved structure among the family, this raises the question of whether SIGLEC family genes expressed on tumor cells may directly influence immune cell infiltration and function in the tumor microenvironment by binding to potential targets on immune cells (15,43-45). Those results shed new light on immune blockade therapy to improve patients' prognosis by neutralizing SIGLEC receptors on tumor cells or immune cells (12,15,46,47).
Our study has several limitations. First, the analysis was performed using data from online databases, and the findings need experimental validation. Second, the immune infiltration status of different cancer types was estimated by computational methods. Though the estimation was performed with multiple algorithms and the correlation analysis was adjusted for tumor purity, the sequencing data may contain information from other cell sources, which requires confirmation in tissue samples.

CONCLUSION
Our computational analysis showed that SIGLEC family genes expressed by tumor cells in different cancer types were related to tumor formation and patients' survival, and that they may regulate the tumor immune microenvironment by influencing the chemokine axis. Targeting tumor derived SIGLEC genes may benefit patients' survival both by interfering with tumor malignant behaviors and by improving the tumor immune microenvironment.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
ZC, HL, and QY designed the investigation, and ZC, MY, QM, and ZY helped to collect data in databases. ZC, MY, LG, BIZ and SL wrote the draft of the paper, and WZ, BOZ, JY, YSX, and YFX helped to revise and make adaptations. All authors contributed to the article and approved the submitted version.

FUNDING
This study was funded by the National Natural Science Foundation of China (81871924 to QY, 81572844 to YFX).