Identification and Validation of Potential Candidate Genes of Colorectal Cancer in Response to Fusobacterium nucleatum Infection

Objective: Recent investigations have revealed a relationship between Fusobacterium nucleatum (Fn) infection and colorectal cancer (CRC). However, how changes in host gene expression contribute to CRC in response to Fn infection remains largely unknown. Materials and methods: In the present study, we comprehensively analyzed microarray data obtained from a Caco-2 infection cell model using integrated bioinformatics analysis, and then identified and validated potential candidate genes in Fn-infected Caco-2 cells and CRC specimens. Results: We identified 10 hub genes potentially involved in Fn-induced tumor initiation and progression. Furthermore, we demonstrated that the expression of centrosomal protein of 55 kDa (CEP55) is significantly higher in Fn-infected Caco-2 cells. Knockdown of CEP55 arrested cell cycle progression and induced apoptosis in Fn-infected Caco-2 cells. The expression of CEP55 was positively correlated with the Fn amount in Fn-infected CRC patients, and patients with high CEP55 expression had poorer differentiation, more frequent metastasis and a lower cumulative survival rate. Conclusion: CEP55 plays an important role in Fn-infected colon cancer cell growth and cell cycle progression and could be used as a new diagnostic and prognostic biomarker for Fn-infected CRC.


INTRODUCTION
Many malignant cancers are characterized by complex communities of potentially oncogenic, transformed cells carrying genetic and epigenetic changes caused by bacteria and viruses (Burnett-Hartman et al., 2008). Fusobacterium nucleatum (Fn) is a gram-negative obligate anaerobic bacterium that can adhere to and invade endothelial or epithelial cells through its adhesin FadA. The aggregation of Fn in the intestinal epithelium promotes the occurrence and development of colorectal adenoma and adenocarcinoma (Flanagan et al., 2014;Park et al., 2016;Yan et al., 2017;Yamaoka et al., 2018). FadA has been found to bind to the vascular endothelial adhesion factor CDH5 and activate the p38 MAPK signaling pathway to promote the progression of colorectal cancer (CRC) (Rubinstein et al., 2013). FadA can also bind to E-cadherin on epithelial cells and activate the oncogenes Myc and Cyclin D1. Recent studies indicated that Fn can bind to TLR4 through its lipopolysaccharide and activate the p-pak-1/P-β-Catenins-675/c-myc/Cyclin-D1 cascade to promote the malignant proliferation of colon cancer cells (Yang et al., 2017). Furthermore, Fn was found to enhance the growth and migration of CRC cells by inducing overexpression of microRNA-21 through the TLR4/NF-κB signaling pathway (Yang et al., 2017). Although these factors are associated with Fn-induced carcinogenesis, little is known about the genes that contribute to CRC in the Fn infection microenvironment.
High-throughput gene microarray analysis of Fn-infected and non-infected Caco-2 cells now allows us to explore global molecular changes, from transcriptome alterations to somatic mutations and epigenetic changes (De et al., 2015;Jia et al., 2017). In this study, the GSE102573 dataset was downloaded from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo) database and the differentially expressed genes (DEGs) were comprehensively identified using GEO2R. Then, a protein-protein interaction (PPI) network of these DEGs was established and 10 hub genes with a high degree of connectivity were screened out. In addition, Gene Ontology (GO) terms covering the biological processes (BPs), molecular functions (MFs), and cellular components (CCs) of these DEGs and their Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were analyzed. The correlation and expression levels of the hub genes were further analyzed via Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/index.html) and validated via quantitative reverse transcription-PCR (qRT-PCR).
Our data showed that the expression of centrosomal protein of 55 kDa (CEP55) is significantly higher in Fn-infected Caco-2 cells. Knockdown of CEP55 arrested cell cycle progression and induced apoptosis in Fn-infected Caco-2 cells. The expression of CEP55 was positively correlated with the Fn amount in Fn-infected CRC patients, and patients with high CEP55 expression had poorer differentiation, more frequent metastasis and a lower cumulative survival rate.

MATERIALS AND METHODS
Microarray Data
The gene expression profile of GSE102573 was downloaded from the free public GEO database. This microarray dataset contains a total of 5 Fn-infected and 5 Fn-non-infected Caco-2 samples and was based on the GPL17586 platform [Affymetrix Human Transcriptome Array 2.0 (transcript (gene) version)].

Data Preprocessing
In each sample, the expression values of all probes mapping to the same gene were reduced to a single mean expression value via the aggregate function method, and missing data were imputed using the k-nearest neighbor method (Li, 1991;Altman, 1992). When a probe mapped to more than one gene, it was considered to lack specificity and was removed from the analysis.
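The two preprocessing steps can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which relied on the aggregate function in R); the function names `collapse_probes` and `knn_impute`, the input layouts, the choice of k, and the distance measure are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_genes):
    """Average probe-level values into one value per gene for one sample.

    probe_values: {probe_id: expression value} (hypothetical layout)
    probe_to_genes: {probe_id: list of gene symbols the probe maps to}
    Probes that map to zero or to several genes are dropped as nonspecific.
    """
    per_gene = defaultdict(list)
    for probe, value in probe_values.items():
        genes = probe_to_genes.get(probe, [])
        if len(genes) != 1:
            continue
        per_gene[genes[0]].append(value)
    return {gene: mean(values) for gene, values in per_gene.items()}


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k most similar rows (genes).

    matrix: list of rows (genes), each a list of floats or None (samples).
    Similarity is the root-mean-square difference over the columns that
    are observed in both rows (an assumed distance, chosen for simplicity).
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled
```

In this sketch a missing value is replaced by the average of the same sample's values in the k closest genes; production analyses would normally use an established imputation package rather than this hand-written version.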

Identification of DEGs
GEO2R was used to identify the DEGs between Fn-infected and Fn-non-infected Caco-2 samples. The adjusted p-value, which helps control false positives arising from multiple testing, was applied; adjusted p < 0.01 and |log fold change (FC)| > 1 were chosen as the cutoff criteria. The heat map and volcano plot were drawn using the "gplots" package in R 3.5.3 (Ge et al., 2021;Ritchie et al., 2015). A total of 272 upregulated genes and 178 downregulated genes were found, and the top 10 genes with a high degree of connectivity were selected as hub genes.
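The cutoff rule above can be written down as a short Python sketch. The gene names and numbers in `toy_rows` are invented placeholders, not values from GSE102573, and the function names are hypothetical; only the two thresholds (adjusted p < 0.01, |logFC| > 1) come from the text.

```python
def classify_deg(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.01):
    """Label one gene as 'up', 'down' or 'not significant'.

    Uses the strict thresholds stated in the methods:
    adjusted p < p_cutoff and |logFC| > fc_cutoff.
    """
    if adj_p >= p_cutoff:
        return "not significant"
    if log_fc > fc_cutoff:
        return "up"
    if log_fc < -fc_cutoff:
        return "down"
    return "not significant"


def split_degs(rows, **cutoffs):
    """Split (gene, logFC, adjusted p) rows into up- and downregulated lists."""
    up, down = [], []
    for gene, log_fc, adj_p in rows:
        label = classify_deg(log_fc, adj_p, **cutoffs)
        if label == "up":
            up.append(gene)
        elif label == "down":
            down.append(gene)
    return up, down


# Hypothetical rows for illustration only.
toy_rows = [
    ("GENE_A", 2.3, 0.0004),   # passes both cutoffs, upregulated
    ("GENE_B", -1.8, 0.002),   # passes both cutoffs, downregulated
    ("GENE_C", 1.5, 0.03),     # fails the adjusted p cutoff
    ("GENE_D", 0.4, 0.0001),   # fails the fold-change cutoff
]
up, down = split_degs(toy_rows)
print(up, down)
```

Applied to a full GEO2R result table, the lengths of the two returned lists would correspond to the upregulated and downregulated DEG counts.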

GO and KEGG Pathway Analysis of DEGs
GO analysis can be used to annotate genes and their products with CCs, MFs, BPs, and other functions (Gaudet et al., 2017;Ning et al., 2013). The KEGG databases address genomic and biological pathways related to diseases and drugs and provide a comprehensive understanding of biological systems and genomic functional information (Kanehisa, 2002). DAVID (http://david.ncifcrf.gov) (version 6.8) integrates large amounts of biological data and related analysis tools to provide systematic and comprehensive functional annotation for high-throughput gene expression data (Huang et al., 2007). To visualize the key CCs, MFs, BPs and KEGG pathways of the DEGs, the DAVID online database was used to perform the enrichment analysis. p < 0.05 was used as the cut-off criterion for statistical significance.

PPI Network and Module Analysis
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) version 11.0 was used to evaluate and integrate physical and functional PPI information (Szklarczyk et al., 2015). The network of DEGs in STRING was drawn to evaluate the interrelationships among these DEGs, and the PPI network was visualized using Cytoscape software. The maximum number of interactors was set to 0 and a confidence score of 0.7 was used as the cut-off criterion. Additionally, the Molecular Complex Detection (MCODE) app was employed to select the PPI network modules in Cytoscape with a node score cut-off of 0.2, a degree cut-off of 2, a max. depth of 100, and a k-core of 2, and the top three modules were analyzed with DAVID.
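Ranking hub genes by degree of connectivity can be sketched as follows. This is only an illustration of the idea, not the STRING or Cytoscape implementation; the edge list and its scores are invented, the function name `hub_genes` is hypothetical, and each interaction is assumed to appear once in the input.

```python
from collections import Counter


def hub_genes(edges, min_score=0.7, top_n=10):
    """Rank genes by the number of interaction partners (degree).

    edges: iterable of (gene_a, gene_b, combined_score) with scores in [0, 1].
    Only edges at or above min_score are kept, mirroring the 0.7
    confidence cut-off; ties are broken alphabetically.
    """
    degree = Counter()
    for gene_a, gene_b, score in edges:
        if score >= min_score and gene_a != gene_b:
            degree[gene_a] += 1
            degree[gene_b] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical edges and scores for illustration only.
toy_edges = [
    ("CDK1", "CCNB1", 0.99),
    ("CDK1", "MAD2L1", 0.95),
    ("CDK1", "CEP55", 0.80),
    ("CCNB1", "MAD2L1", 0.90),
    ("CEP55", "ANLN", 0.65),  # below the cut-off, ignored
]
print(hub_genes(toy_edges, top_n=2))
```

With the real network, calling the function with `top_n=10` would return the ten most connected nodes together with their degrees.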

The Expression and Survival Analysis of the 10 Hub Genes
GEPIA is an interactive web server designed to analyze the RNA sequencing expression data of 9,736 tumors and 8,587 normal samples from the TCGA and GTEx projects (Tang et al., 2017). The correlation among the top 10 hub genes was analyzed using the GEPIA tool. Then, boxplots were used to visualize hub gene expression in CRC and normal colon tissues. The disease-free survival analysis of the 10 hub genes was also obtained from GEPIA. Gene Set Cancer Analysis (GSCA) is a website that collects cancer genomics data for 33 cancer types from the TCGA database (Liu C.-J. et al., 2018;Liu J. et al., 2018). The expression of the top 10 hub genes in different CRC stages was analyzed using the GSCA tool.

Gene Set Enrichment Analysis
Gene Set Enrichment Analysis (GSEA) was used to explore whether a given gene set is significantly enriched in a list of genes ranked by their relevance to a phenotype of interest. Gene sets with fewer than 15 genes or more than 500 genes were excluded, and the phenotype label was set as colon cancer versus control. The mean t-statistic of the genes in each KEGG pathway was assessed by a permutation test with 1,000 replications. Upregulated pathways were defined by a normalized enrichment score (NES) > 0, and downregulated pathways by an NES < 0. Pathways with a false discovery rate (FDR) < 5% (q < 0.05) were considered significantly enriched.
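The filtering and labelling rules described for GSEA can be summarized in a small Python sketch. It does not compute enrichment scores; it only encodes the gene set size limits and the NES/FDR decision rule stated above. The function names and the example inputs are hypothetical.

```python
def usable_gene_sets(gene_sets, min_size=15, max_size=500):
    """Keep gene sets whose size lies between min_size and max_size inclusive.

    gene_sets: {set_name: iterable of gene symbols}
    Sets with fewer than 15 or more than 500 genes are excluded.
    """
    return {
        name: genes
        for name, genes in gene_sets.items()
        if min_size <= len(set(genes)) <= max_size
    }


def classify_pathway(nes, fdr, fdr_cutoff=0.05):
    """Label a pathway from its normalized enrichment score and FDR q-value."""
    if fdr >= fdr_cutoff:
        return "not significant"
    if nes > 0:
        return "upregulated"
    if nes < 0:
        return "downregulated"
    return "not significant"


# Hypothetical gene sets for illustration only.
toy_sets = {
    "too_small": [f"g{i}" for i in range(14)],
    "kept": [f"g{i}" for i in range(15)],
    "too_large": [f"g{i}" for i in range(501)],
}
print(sorted(usable_gene_sets(toy_sets)))
print(classify_pathway(1.8, 0.01), classify_pathway(-2.1, 0.03))
```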

TIMER Database Analysis
TIMER, a comprehensive web resource, was used for the automatic analysis and visualization of associations between immune infiltrate levels and a series of variables (https://cistrome.shinyapps.io/timer/) (Jiang et al., 2018;Wang et al., 2021). We assessed the correlation of CEP55 expression with the abundance of six kinds of immune cells (B cells, CD8+ T cells, neutrophils, dendritic cells, CD4+ T cells and macrophages) in CRC via the TIMER algorithm.

Caco-2 Cells and Cell Transfection
Fn-infected and non-infected Caco-2 cells were seeded at a density of 1 × 10^4 cells/ml in RPMI-1640 medium and cultured for 24 h. Once the cells reached 40-60% confluence in each well of a 96-well plate, they were transfected with 2.5 nM siRNA/NC (RiboBio, Guangzhou, China) using Lipofectamine 2000 (Thermo Fisher Scientific, United States) according to the manufacturer's instructions. The culture medium was replaced with fresh medium containing 10% FBS 6 h later. The cells were harvested 24 h after transfection for the following assays.

Cell Proliferation and Apoptosis Analysis
At 24 h after siRNA interference, Fn-infected and non-infected Caco-2 cells were treated as indicated, and cell proliferation was assessed with the Cell Counting Kit-8 (CCK-8) assay (Beyotime Biotechnology, China) at 0, 24 and 48 h post treatment following the manufacturer's instructions. Optical density (OD) was recorded at 450 nm.
Fn-infected and non-infected Caco-2 cells were then harvested, centrifuged, and resuspended in 1×PBS, respectively. The cells were fixed in 70% ethanol overnight. On the second day, after being washed with 1×PBS solution and centrifuged, the cells were resuspended in 1×PBS solution and incubated with RNase A at 37°C for 30 min. Finally, the cells were stained with propidium iodide and analyzed by a FACSCalibur system (BD Biosciences, Germany) for cell cycle analysis.
Fn-infected Caco-2 cells were transfected with siRNA for 24 h, harvested, and centrifuged. Then, the supernatant was removed and the cells were resuspended in 1×PBS solution. This procedure was repeated three times with 1 × 10^6 cells per well, and then the cells were stained with an Annexin V/FITC and PI kit. After staining, the cells were analyzed with a FACSCalibur system (BD Biosciences, Germany) for apoptosis analysis.

Validation of the Expression of CEP55 in Fn-Infected CRC Samples
qRT-PCR was conducted to quantify the expression level of CEP55 in Fn-infected CRC samples (n = 30) from Shenzhen Qianhai and Shekou Free Trade Zone's hospital (Shekou People's Hospital, Shenzhen, Guangdong, China). This study was approved by the Ethics Committee of Shenzhen Qianhai and Shekou Free Trade Zone's hospital, and written informed consent was obtained from each patient before inclusion in the study. The primers used for qRT-PCR were as described previously. The correlation between CEP55 expression and Fn amount was analyzed, and the cumulative survival rate of Fn-infected CRC patients was calculated according to the method described by Aalen (Aalen, 1988). The relationship between CEP55 expression and the clinicopathologic features of Fn-infected CRC was also analyzed.
Frontiers in Genetics | www.frontiersin.org September 2021 | Volume 12 | Article 690990

Statistical Analysis
All experiments were carried out in triplicate for each condition, following the protocol and the manufacturer's instructions. Clinical characteristics were compared using the Wilcoxon rank sum test; categorical variables were compared using the Fisher exact test. Pearson's correlation test was used to analyze the correlation between CEP55 expression and Fn amount. The log-rank test was used to determine the association of high/low CEP55 expression with clinical characteristics. Cumulative survival rates were summarized using the Kaplan-Meier method. Differences with p < 0.05 were considered statistically significant.
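Two of the statistics named above, Pearson's correlation coefficient and the Kaplan-Meier survival estimate, can be sketched in plain Python. This is a teaching illustration under the standard textbook definitions, not the software the authors used; the function names and the toy follow-up data are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times: follow-up time of each patient
    events: 1 if the event (death) was observed at that time, 0 if censored
    The curve steps down only at times with at least one observed event.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up data (months) for illustration only.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for time, prob in kaplan_meier(toy_times, toy_events):
    print(time, round(prob, 3))
```

Comparing the curves of the high and low CEP55 groups would additionally require a log-rank test, which is omitted here to keep the sketch short.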

RESULTS
Identification of DEGs and Hub Genes
A total of 5 groups of Fn-infected and 5 groups of Fn-non-infected Caco-2 cells were analyzed. The series from each chip was analyzed separately using GEO2R and R software, and the DEGs were identified using adjusted p value < 0.01 and logFC ≥ 1 or logFC ≤ −1 as the cut-off criteria. A total of 450 DEGs were identified after analyzing GSE102573, 272 of which were upregulated and 178 of which were downregulated (Figure 1). In addition, 10 hub genes were identified according to their degree of connectivity, namely CDK1, CCNB1, MAD2L1, CEP55, TPX2, MELK, TRIP13, KIF4A, PRC1 and ANLN (Table 1).

GO Function and KEGG Pathway Enrichment Analysis
To obtain a comprehensive understanding of the selected DEGs, GO function and KEGG pathway enrichment were analyzed with DAVID. After importing all DEGs into DAVID, we examined the functions of the upregulated and downregulated DEGs by GO analysis. In terms of BPs, the upregulated genes were mainly enriched in cell cycle phase, cell cycle process, cell cycle, M phase and mitotic cell cycle, while the downregulated genes were mainly enriched in cell adhesion, biological adhesion and ion homeostasis. Regarding MFs, the upregulated genes were involved in nucleoside binding, adenyl nucleotide binding and purine nucleoside binding, while the downregulated genes were involved in polysaccharide binding, pattern binding and calmodulin binding. In addition, GO CC analysis revealed that the upregulated DEGs were principally enriched in the organelle lumen, intracellular organelle lumen, membrane-enclosed lumen, endoplasmic reticulum membrane and endoplasmic reticulum, while the downregulated DEGs were mainly enriched in extracellular region part, proteinaceous extracellular matrix, extracellular matrix, extracellular region and calcineurin complex (Table 2). Table 3 shows the most significantly enriched KEGG pathways of the upregulated and downregulated DEGs. The upregulated DEGs were enriched in the cell cycle and one carbon pool by folate, while the downregulated DEGs were enriched in axon guidance and the chemokine signaling pathway.

Hub Genes and Module Screening of the PPI Network
A PPI network of the top 10 hub genes was constructed using the STRING database (Figure 1 and Table 1). The top 10 hub genes with a high degree of connectivity were as follows: CDK1, CCNB1, MAD2L1, CEP55, TPX2, MELK, TRIP13, KIF4A, PRC1 and ANLN. Based on the GO function and KEGG pathway analysis, we found that CEP55, ANLN, CDK1, CCNB1 and MAD2L1 were enriched in the cell cycle and cell division. To further detect the most important modules in this PPI network, the MCODE plug-in was used and the top three modules were selected (Figure 2). KEGG pathway analysis revealed that the top three modules were mainly associated with the cell cycle, mismatch repair and the p53 signaling pathway (Table 4).

The Expression and Survival Analysis of the 10 Hub Genes
To confirm the reliability of the 10 hub genes identified from the dataset, GEPIA was used to verify the correlation between them. We found that these 10 hub genes were positively correlated with each other in CRC (Figure 3A). GEPIA was also used to determine the expression levels of the top 10 hub genes in CRC. Figure 3B shows that these genes were all significantly overexpressed in the colon cancer (COAD) samples compared to the normal samples. GSCA was used to analyze the correlation between the expression of the 10 hub genes and CRC clinical stages. Figures 4A-C show that the expression of CEP55, CCNB1, CDK1 and TRIP13 differed significantly between CRC stages (p < 0.05). Furthermore, the disease-free survival analysis of the 10 genes was obtained from GEPIA. Among these hub genes, only high expression of TRIP13 was significantly associated with a favorable outcome of CRC (HR: 0.60, p = 0.04) (Figure 5). Since these hub genes were validated in CRC samples from TCGA, further verification in Fn-infected CRC was needed.

Validation Based on Fn-Infected and Non-Infected Caco-2 Cells
The expression level of the 10 identified hub genes was further validated in Fn-infected Caco-2 cells cultured by our group. As shown in Figure 6A, the relative expression levels of all 10 hub genes were higher in Fn-infected Caco-2 cells than in Fn-non-infected Caco-2 cells. However, only the relative expression of CEP55 was significantly increased (p = 0.008). The relative mRNA and protein expression of CEP55 in Fn-infected and non-infected CRC specimens was also compared. As shown in Figures 6B,C, the relative expression of CEP55 in the Fn-infected CRC group was significantly higher than that in the Fn-non-infected CRC group (p = 0.023). The expression pattern of CEP55 was similar in Fn-infected Caco-2 cells and Fn-infected CRC, which suggested that our results for this gene are reliable. To gain further insight into the functions of CEP55, GSEA was conducted to map CEP55 into the GO database. The top two pathways were "mitotic nuclear division" and "cytokinetic process" (Figures 6D,E). The TIMER database was also searched to estimate the correlations of CEP55 mRNA expression with immune cell infiltration in CRC. As illustrated in Figure 7, the expression of CEP55 was positively correlated with immune infiltration of B cells, CD8+ T cells, neutrophils and dendritic cells. Therefore, we speculated that high CEP55 expression might affect the proliferation and differentiation of Fn-infected colon cancer cells through mitotic nuclear division, cytokinetic process and immune infiltration.

Knockdown of CEP55 Suppressed Fn-Infected Caco-2 Cell Growth by Impairing Cell Cycle Progression and Inducing Cell Apoptosis
To determine whether CEP55 could be a crucial component in Fn-induced CRC, CEP55 was silenced using siRNAs in Fn-infected Caco-2 cells. Compared to the control group, CEP55 knockdown significantly inhibited cell proliferation (Figure 8A) and CEP55 protein expression (Figure 8H). Knockdown of CEP55 increased the number of cells in S phase and decreased the cell population in the G1 and G2 phases (Figures 8B-F), which indicated that CEP55 knockdown prevented cell passage from the S phase into the G2 phase. This suggests that CEP55 promotes the S/G2 phase transition. The apoptosis assay results indicated that the number of apoptotic cells significantly increased in Fn-infected Caco-2 cells transfected with si-CEP55 (Figures 8I-L). These data indicated that CEP55 knockdown could impair cell cycle progression and induce cell apoptosis in Fn-infected Caco-2 cells. However, we also found that CEP55 knockdown had a similar effect on Fn-non-infected Caco-2 cells as on Fn-infected Caco-2 cells (Figures 8A-C,G,I,M). This meant that the effect of CEP55 was not specific to Fn-infected Caco-2 cells, and that other stimuli might also upregulate the expression of CEP55.

The Correlation Between CEP55 Expression and Fn Amount in Fn-Infected CRC Samples
The age, gender, tumor location, tumor size, clinical stage, differentiation grade and distant metastasis of Fn-infected CRC patients are shown in Table 5. The expression of CEP55 and the Fn amount in these samples were measured, and the correlation between CEP55 and Fn was analyzed. As shown in Supplementary Figure S1A, the Pearson correlation between the expression of CEP55 and Fn amount was significant (R = 0.561; p < 0.01). The expression of CEP55 increased along with the Fn amount in Fn-infected CRC.

The CEP55 Expression and Clinicopathology of Fn-Infected CRC
The relationship between CEP55 expression and clinicopathology of 30 patients with Fn-infected CRC was analyzed (Table 6). The median value of 2^-ΔCT (1.59) was chosen as the cutoff level. The high CEP55 group was defined as patients with values above the median, and the low CEP55 group as patients with values below the median. The proportions of poorly differentiated tumors and distant metastasis were significantly higher in the high CEP55 group (p = 0.031 and p = 0.028, respectively), whereas age, gender, tumor location, tumor size and clinical stage did not differ significantly between the two groups. The odds ratio (OR) and cumulative survival rate associated with high CEP55 expression in Fn-infected CRC patients were also calculated (Table 7). In the high CEP55 group, the OR was 12.25 (95% CI: 1.27-118.36) for poor tumor differentiation and 5.50 (95% CI: 1.15-26.41) for metastasis. The cumulative survival rate of Fn-infected CRC patients with high expression of CEP55 was significantly decreased (p = 0.038), as shown in Supplementary Figure S1B. These results suggested that Fn infection might promote the progression and metastasis of CRC through overexpression of CEP55.
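An odds ratio with a Wald-type 95% confidence interval can be computed from a 2×2 table as in the following Python sketch. The interval method is an assumption (the text does not state how the intervals were derived), and the counts used in the example are hypothetical placeholders: the actual counts are in Table 6, which is not reproduced here. They were chosen only so that the point estimate equals 12.25 with 15 patients per group.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: high-expression patients with the feature
    b: high-expression patients without the feature
    c: low-expression patients with the feature
    d: low-expression patients without the feature
    All four counts must be non-zero for this formula.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not taken from Table 6).
estimate, lower, upper = odds_ratio_ci(a=7, b=8, c=1, d=14)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

The wide interval produced by small cell counts illustrates why an OR estimated from 30 patients should be interpreted with caution.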

DISCUSSION
It has been increasingly accepted that CRC is the cancer type most closely associated with Fn infection (Shang and Liu, 2018). To date, several studies have reported the promoting effects of Fn on CRC initiation and progression (Rubinstein et al., 2013;Flanagan et al., 2014;Park et al., 2016;Chen et al., 2017;Yang et al., 2017;Yamaoka et al., 2018). However, the mechanism of Fn infection in CRC is not fully understood. In the present study, we mined microarray data from the GSE102573 dataset of the GEO database, which was obtained from a cellular model of Caco-2 cells infected by Fn. We identified 10 hub genes potentially involved in Fn-induced tumor initiation and progression. Our results further suggested that CEP55 might play an important role in Fn-infected colon cancer cell growth and cell cycle progression.
A total of 450 DEGs were identified, including 272 upregulated genes and 178 downregulated genes. To better explore these DEGs, we carried out GO function and KEGG pathway analysis. GO analysis showed that the upregulated DEGs were particularly enriched in "cell cycle phase," "cell cycle process," "cell cycle and mitotic cell cycle" and "M phase," while the downregulated DEGs were involved in "cell adhesion" and "biological adhesion." In addition, the KEGG pathways for the upregulated DEGs included the cell cycle and one carbon pool by folate, while the downregulated DEGs were enriched in the chemokine signaling pathway and metabolism of xenobiotics by cytochrome P450. PPI network module analysis can provide a visible framework for a better understanding of the functional organization of the proteome (Liu et al., 2009). The enriched pathways of the top three modules showed that Fn-infected Caco-2 cells were mainly associated with the cell cycle, mismatch repair and the p53 signaling pathway, which are the major pathways involved in the carcinogenesis of CRC. Ten DEGs with high connectivity were selected as hub genes for PPI network analysis. These hub genes all belonged to the upregulated DEGs. By analyzing the correlations and expression levels in GEPIA, we found that these hub genes were positively correlated and significantly overexpressed in CRC samples. GSCA analysis found that the expression of CEP55, CCNB1, CDK1 and TRIP13 was significantly increased in stage II CRC; therefore, these genes, especially CEP55, may be related to the development and proliferation of early CRC. Further analysis using GEPIA showed that only TRIP13 was significantly associated with CRC survival. The reason for this might be that different inclusion criteria for high and low mRNA expression, clinical stages and pathological grading are applied in the prognosis analysis.
We further searched the literature in PubMed for associations among the 10 hub genes in CRC. Recent studies revealed that upregulated CDK1 promotes CRC cell proliferation via the inhibition of the p53 pathway (Gan et al., 2017); CCNB1 overexpression exerts an oncogenic role in CRC cells by phosphorylating CDK1 (Fang et al., 2014); high expression of MAD2L1 drives aneuploidy and carcinogenesis in CRC (Ding et al., 2020); TPX2 promotes proliferation and tumorigenicity of colon cancer cells (Wei et al., 2013); MELK overexpression is significantly correlated with advanced tumor stage and further lymph node metastasis; TRIP13 and KIF4A could promote CRC cell proliferation, invasion and migration and subcutaneous tumor formation (Hou et al., 2018;Sheng et al., 2018); and overexpression of PRC1 and ANLN facilitates CRC tumor growth and proliferation (Wang et al., 2016;Xu et al., 2020). The increased expression of these hub genes is closely related to the occurrence and development of CRC. Recent studies have confirmed that Fn could significantly downregulate the expression of CDK1 in gingival keratinocytes (Bhattacharya et al., 2014). Aggregatibacter actinomycetemcomitans, Porphyromonas gingivalis and Neisseria meningitidis infections were found to downregulate CCNB1 and MAD2L1 expression in gingival epithelial cells and brain endothelial cells (Oosthuysen et al., 2016;Zhu et al., 2018). However, we did not find any evidence for a significant correlation between TPX2, MELK, TRIP13, KIF4A, PRC1 or ANLN and bacterial infection. Further studies are needed to verify the relationship between these genes and bacterial infection.
We speculated that Fn infection could dysregulate the abovementioned hub genes through various signaling pathways; therefore, we further conducted qRT-PCR analysis to verify the microarray results. We found that although the expression of all 10 hub genes was higher than in the control, only CEP55 was significantly increased (p < 0.05). CEP55, also known as c10orf3 or FLJ10540, was initially identified as a major player in the abscission step of cytokinesis. Bioinformatics analysis found that the top two pathways of CEP55 involved in CRC were "mitotic nuclear division" and "cytokinetic process," and the expression of CEP55 was positively correlated with immune infiltration of B cells, CD8+ T cells, neutrophils and dendritic cells, which play an important role in chronic Fn infection. Therefore, we speculated that high CEP55 expression might affect the proliferation and differentiation of Fn-infected colon cancer cells through mitotic nuclear division, cytokinetic process and immune infiltration. Recent studies have demonstrated that CEP55 could promote cancer cell stemness and tumor formation by regulating the PI3K/AKT pathway. Clinically, CEP55 has also been found to be overexpressed in many cancer types, and its overexpression has been strikingly associated with tumor stage and metastasis (Tandon and Banerjee, 2020). We demonstrated that, compared with Fn-non-infected Caco-2 cells, the relative expression of CEP55 was significantly higher in Fn-infected Caco-2 cells, and that knockdown of CEP55 inhibited cell proliferation and induced cell apoptosis in these cells. Correlation analysis showed that the expression of CEP55 was positively correlated with the Fn amount in Fn-infected CRC patients, and patients with high CEP55 expression had poorer differentiation, more frequent metastasis and a lower cumulative survival rate.
These results suggested that Fn infection might cause progression and metastasis of CRC through overexpression of CEP55, and that CEP55 has the potential to be a new biomarker for the diagnosis and prognosis of Fn-infected CRC. It has been reported that the expression of CEP55 in peripheral blood cells is significantly upregulated in septicemia and abdominal infection caused by bacteria (Alonso et al., 2017;Lu et al., 2020), which means that bacterial infection could increase the expression of CEP55. Recent studies have also found that Fn can cause DNA damage and promote cell proliferation by downregulating the expression of Ku70/p53, whereas the expression of CEP55 could be upregulated through downregulation of p53 (Chang et al., 2012;Geng et al., 2019). Overexpression of CEP55 was found to promote proliferation, metastasis and invasion of esophageal squamous cell carcinoma by activating the PI3K/Akt signaling pathway (Jia et al., 2018). Therefore, we infer that Fn infection might upregulate the expression of CEP55 by downregulating p53, and that the upregulation of CEP55 might lead to excessive proliferation, invasion and metastasis of CRC by activating the PI3K/Akt signaling pathway. We will further verify the expression of CEP55 in Fn-infected CRC cell lines, animal models and patients and elucidate the molecular mechanism of CEP55 in the proliferation, invasion and metastasis of tumor cells induced by Fn infection.
We acknowledge some limitations of our present study. In this study, DEGs in response to Fn infection were obtained from bioinformatics analysis, and candidate genes associated with tumorigenic properties were analyzed. We only preliminarily verified the expression of CEP55 in Fn-infected CRC patients; therefore, more functional assays should be applied to explore and validate the functional roles of CEP55 in Fn-infected CRC. In addition, although we validated the expression of these hub genes in a small clinical dataset of Fn-infected CRC, other datasets derived from larger-scale clinical samples covering different intestinal conditions and Fn infection prevalence rates should be used for further validation and evaluation.
In summary, using multiple bioinformatics analyses and qRT-PCR validation, our present work identified 10 hub genes among the DEGs. These upregulated DEGs are significantly enriched in several pathways that are mainly associated with the cell cycle and mitotic cell cycle in Fn-infected CRC, and might play critical roles in the development and progression of Fn-induced CRC. High expression of CEP55 was shown to be involved in Fn-infected colon cancer cell growth and cell cycle progression, and CEP55 could be used as a new diagnostic and prognostic biomarker for Fn-infected CRC.

CONCLUSION
In this study, using multiple bioinformatics analyses, we identified 10 hub genes that were significantly enriched in the cell cycle, mismatch repair and p53 signaling pathway in Fn-infected Caco-2 cells. Moreover, the expression level of CEP55 was significantly increased in Fn-infected CRC, and knockdown of CEP55 suppressed Fn-infected colon cancer cell growth by impairing cell cycle progression and inducing apoptosis. Our findings suggest that CEP55 plays an important role in Fn-infected colon cancer cell growth and cell cycle progression and could be used as a new diagnostic and prognostic biomarker for Fn-infected CRC.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Shenzhen Qianhai and Shekou Free Trade Zone's hospital (Shekou People's Hospital, Shenzhen, Guangdong, China). The patients provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JZ and HL were responsible for the bioinformatic analysis, experimental design, sample collection and specific experimental operations. ZW was responsible for statistical analysis, data collation and interpretation. JZ and GL were responsible for providing technical guidance and experimental funds.

FUNDING
This study was supported by the education and health science and technology fund of Shenzhen Nanshan District Technology Research and Development Project (Grant number: 2020081).