Bioinformatics Analysis of Differentially Expressed Rhythm Genes in Liver Hepatocellular Carcinoma

Liver Hepatocellular Carcinoma (LIHC), a malignant tumor with high incidence and mortality, is one of the most common cancers in the world. Multiple studies have found that the aberrant expression of rhythm genes is closely related to the occurrence of LIHC. This study aimed to use bioinformatics analysis to identify differentially expressed rhythm genes (DERGs) in LIHC. A total of 563 DERGs were found in LIHC, including 265 downregulated genes and 298 upregulated genes. KEGG pathway enrichment and GO analyses showed that DERGs were significantly enriched in rhythmic and metabolic processes. Survival analysis revealed that high expression levels of CNK1D, CSNK1E, and NPAS2 were significantly associated with the low survival rate in LIHC patients. Through cell experiment verification, the mRNA expression levels of CSNK1D, CSNK1E, and NPAS2 were found to be strongly upregulated, which was consistent with the bioinformatics analysis of LIHC patient samples. A total of 23 nodes and 135 edges were involved in the protein–protein interaction network of CSNK1D, CSNK1E, and NPAS2 genes. Clinical correlation analyses revealed that CSNK1D, CSNK1E, and NPAS2 expression levels were high-risk factors and independently connected with the overall survival rate in LIHC patients. In conclusion, the identification of these DERGs contributes to the exploration of the molecular mechanisms of LIHC occurrence and development and may be used as diagnostic and prognostic biomarkers and molecular targets for chronotherapy in LIHC patients in the future.


INTRODUCTION
Circadian rhythms, driven by endogenous circadian clocks, exist in almost all organisms on Earth. At the molecular level, circadian clocks are roughly similar in different organisms and consist of a series of transcription and translation feedback loops (Li and Zhang, 2015). In mammals, the transcription factors BMAL1 and CLOCK (CLK), forming heterodimers, drive Period (Per), and Cryptochrome (Cry) gene transcription through E-box elements in their promoter regions.
A high degree of similarity to the mammalian system exists in Drosophila, where the transcription activators CYCLE (CYC) and CLOCK (CLK) heterodimerize and drive the transcription of timeless (tim) and per (Lowrey and Takahashi, 2011).
Disruption of the normal circadian rhythm has adverse effects on the physiology of mammals. Some clinical and laboratory experiments have shown that these disturbances can lead to various diseases, including mental illnesses, metabolic disorders, neurodegenerative diseases, and cancers (Gamble et al., 2014;Cash et al., 2015;Kiessling et al., 2017). A database related to circadian expression patterns of specific genes in clinical disorders including tumors has been published . As one of the most common malignant tumors in the world, liver hepatocellular carcinoma (LIHC) ranks among the top 10 in terms of global mortality rate and is also a highly prevalent and harmful malignant tumor in China (Bray et al., 2018). However, the molecular mechanism underlying the carcinogenesis or progression of liver cancer is still unclear.
In the current study, we aim to identify differentially expressed rhythm genes (DERGs) through comprehensive bioinformatics analysis, which maybe benefit to explore the potential molecular mechanism underlying the occurrence and development of LIHC and may be used as diagnostic and prognostic biomarkers and molecular targets for chronotherapy in LIHC patients in the future.

MATERIALS AND METHODS
The flow diagram of this study was shown in Figure 1. DERGs were found through The Cancer and Tumor Genome Atlas (TCGA) database and a human circadian genes database. GO function, KEGG pathway enrichment, protein-protein interaction (PPI) network and survival analyses of DERGs were performed using R software. DERGs expression levels were analyzed in LIHC patient samples and was verified in cell experiments. Further clinical correlation analyses were conducted to explore the value of DERGs in the timing treatment of LIHC.

Data Collection
First, TCGA repository interface 1 was entered. Then, TCGA-LIHC was selected in the Project column. Meanwhile, transcriptome profiling was selected in the Data Category column; Gene Expression Quantification was selected as the Data Type; RNA-seq was selected in the Experimental Strategy column; and HTSeq-FPKM was selected in the Workflow Type column. Finally, all the data that matched the above conditions were downloaded.

Screening Differentially Expressed Genes
The differentially expressed genes (DEGs) between LIHC samples and non-neoplastic samples were screened using R software (version 4.0.2). DEGs were defined as those representing 1 https://portal.gdc.cancer.gov/repository FIGURE 1 | Flow diagram of this study. differences with P < 0.05. Using the Venny 2.1 online tool, 7168 DEGs were compared with the human circadian gene list (constructed by our research group) (Li et al., 2017). The intersection of both was taken; 563 DERGs were screened out and a volcano plot was drawn using R software (version 4.0.2).

GO and KEGG Pathway Analysis
Gene Ontology (GO) analysis is a general method for largescale functional enrichment research. Gene function includes biological process (BP), molecular function (MF), and cellular component (CC) (Harris et al., 2004). KEGG is a broad functional database that stores a wealth of data on biological pathways, genomes, diseases, and drugs (Ogata et al., 1999;Kanehisa et al., 2021). First, the DERGs' official symbols were converted through the org.hs.eg.db package in R software (version 4.0.2); the cluster Profiler package was used for GO and KEGG pathway analyses; and GO plot package was used for cluster analysis. Significant differences were defined when both P and Q values were less than 0.05.

Survival Analysis
The survival package of R software (version 4.0.2) was used to analyze ARNTL, CSNK1D, CSNK1E, NPAS2, NR1D1, PER1, PER2, PER3, PRKAG2, and RORA gene expression profiles and overall survival (OS) in TCGA-LIHC patients (excluding patent information with no survival time). Then, the survival curve analysis of the CSNK1D, CSNK1E, and NPAS2 genes was verified by using the KM-plotter online tool. On the main interface of the KM-plotter, "Start KM Plotter for liver cancer" in "RNA-seq" was selected; the gene name to be searched was inputted and the default values for other options were selected.

Matching Analysis
After screening and sorting the patient samples, the tumor samples were paired with the normal samples from the same patients (a total of 50 pairs). In addition, the expression levels of the CSNK1D, CSNK1E, and NPAS2 genes were extracted, and the matching graphs were drawn in R software (version 4.0.2).

Protein-Protein Interaction Network Construction
The website of Search Tool for the Retrieval of Interacting Genes Database (STRING) 2 was a biological database designed to construct a PPI network of DERGs based on the known and predicted PPIs and then analyze the functional interactions between proteins. Subsequently, the PPI network was visualized by means of Cytoscape software (version 3.7.2).

Clinical Correlation Analyses
The relevant clinical data of the hepatoma patients, namely the phenotypes of TCGA LIHC (n = 377), were downloaded from TCGA database. In addition, according to the clinical characteristics of patients (such as age, sex, and tumor stage), we excluded some patients whose clinical data were incomplete; ultimately, we obtained data from 235 patients for Cox analysis to assess the combined effects of clinical characteristics and gene expression. 2 https://www.string-db.org

Statistical Analysis
Statistical tests were performed using R software (version 4.0.2). The t-test or Mann-Whitney test was used for comparisons between the two groups. Schoenfeld individual tests were used to evaluate the proportional hazards assumption for the fit of the Cox regression model. We used univariate and multivariate Cox proportional hazards models to evaluate the survival analysis. The optimal cut-off value generated by the "surv cutpoint" function of R package was used for survival analysis, and expression levels were divided by dichotomy. In addition, statistical significance in cell line experiments was assessed in t tests in GraphPad Prism version 8 software. A value of P < 0.05 was considered statistically significant ( * P < 0.05, * * P < 0.01, * * * P < 0.001).

Identification of DERGs
A total of 7168 DEGs and 1368 circadian genes were identified in the analyses of the LIHC-FPKM and CGDB datasets, respectively. Among them, 563 DERGs were found in both datasets (Figure 2A and Supplementary Table 1). Compared with normal liver tissue samples, 265 DERGs were downregulated and 298 DERGs were upregulated in LIHC tissue samples ( Figure 2B).

GO Analysis of DERGs
To further evaluate the functions and mechanisms of these DERGs, GO enrichment analysis of BPs indicated that the DERGs were mainly enriched in cellular carbohydrate metabolic process, circadian regulation of gene expression, regulation of small molecule metabolic process, regulation of lipid metabolic process, and response to nutrient levels and other processes. GO enrichment analysis of CCs indicated that the DERGs were mainly enriched in vesicle lumen and granule lumen. GO enrichment analysis of MFs indicated that the DERGs were mainly enriched in hydrolase activity, acting on carbon nitrogen bonds, steroid hormone receptor activity, and so on ( Figure 3A).
To further analyze the enrichment results, we performed gene cluster analysis on the BP modules with the top P-values. The results revealed that DERGs were significantly associated with rhythmic and metabolic processes ( Figure 3B).

KEGG Pathway Analysis of DERGs
The KEGG pathway enrichment analysis indicated that the circadian rhythm pathway, the adipocytokine signaling pathway, and the glycine, serine, and threonine metabolism pathways were significantly affected, especially the circadian rhythm pathway, with P < 0.001 ( Figure 4A). Gene cluster analysis of the KEGG pathways showed that DERGs were also significantly associated with rhythmic and metabolic processes ( Figure 4B). In particular, the CNK1D, CSNK1E, NPAS2, PER1, PER2, PER3, ARNTL, NR1D1, PRKAG2, and RORA genes were significantly enriched in the circadian rhythm pathway (Figure 4C).

Survival Analysis
To investigate the prognostic values of DERGs, a total of 377 liver cancer samples were available for the analysis of OS (Supplementary Table 2). We found that higher expression levels of CSNK1D, CSNK1E, and NPAS2 were significantly related to shorter OS of liver cancer patients ( Figure 5). Conversely, the expression levels of ARNTL, NR1D1, PER1, PER2, PER3, PRKAG2, and RORA were not significantly related to the OS of liver cancer patients (Supplementary Figure 1). KM-plotter online tool further validated these results (Supplementary Figure 2).

Analysis of DERGs Expression Levels
To explore the expression levels of genes (CSNK1D, CSNK1E, and NPAS2) in liver cancer, we compared the mRNA expression levels of tumor and normal tissues. The results revealed that the CSNK1D, CSNK1E, and NPAS2 mRNA levels were strongly upregulated in LIHC samples ( Figure 6A). The patient samples were then further analyzed and sorted (Supplementary Table 3), and the normal and tumor samples from the same patients were matched for comparisons (a total of 50 pairs) of their expression levels. We obtained the same results as before ( Figure 6B). Additionally, in the liver cancer cell experiment, similar results were obtained, in which the CSNK1D, CSNK1E, and NPAS2 mRNA expression levels were strongly upregulated compared with those in normal liver cells ( Figure 6C).

DISCUSSION
Liver Hepatocellular Carcinoma is a common type of cancer that is associated with poor prognosis and high mortality (Yang and Roberts, 2010;Yang et al., 2019). Although research on the pathology of liver cancer has gradually deepened in the last few years and progress has been made in the management of diagnosis and treatment strategies for patients with liver cancer, the treatment success rate and clinical outcomes are still very poor (Lurje et al., 2019;Villanueva, 2019). With the continuous discovery of molecular biomarkers, prognostic indicators will become a promising clinical tool for liver cancer (Xu et al., 2018). Therefore, it is necessary to develop novel diagnostic and prognostic biomarkers for liver cancer to assist in clinical diagnosis and prediction. Epidemiological studies have shown that the disruption of normal circadian rhythms may increase the risk of many types of cancer, such as breast, prostate, rectal, and endometrial cancers (Winter et al., 2007;Taniguchi et al., 2009;Kuo and Ladurner, 2019). In addition, abnormal expression levels of circadian clock genes are closely related to tumor development. Studies have found that mutations in BMAL1, CRY1, and CRY2 are often associated with a higher risk and recurrence of acute myeloid leukemia and endometrial, ovarian, colorectal, and breast cancers    Hasakova et al., 2018;Burgermeister et al., 2019;Kwon et al., 2020). Per 1, Per 2, and Per 3 are frequently expressed in human breast and colorectal cancers, non-small-cell lung cancer (NSCLC), hepatocellular and cervical squamous cell carcinomas, and glioma (Yeh et al., 2005;Mostafaie et al., 2009;Lin et al., 2019). Studies have shown that clock genes can regulate the expression of proto-oncogenes and tumor suppressor genes and participate in the cell cycle, apoptosis, DNA damage, and other processes. These findings suggest that there is a network of mechanisms between rhythm genes and cancer and several DERGs can be used as prognostic markers for clinical diagnosis.
Through bioinformatics analysis and experimental verification, we found that the CSNK1D, CSNK1E, and NPAS2 mRNA levels were strongly upregulated in LIHC samples. Moreover, CSNK1D, CSNK1E, and NPAS2 are also three key genes in the negative feedback loop of circadian rhythm. In the core loop, two E-box specific transcription factors, CLOCK [Circadian Locomotor Output Cycles Kaput, replaced by NPAS2 (Neuronal PAS domain protein 2)] and BMAL1 (Brain and muscle Arnt-like protein-1) heterodimerize during the day and activate transcription from genes containing E-boxes, including the Pers, Crys, and clock-controlled genes (CCGs). In the cytoplasm, PERs and CRYs, phosphorylated by CSNK1D/E and from a complex. Subsequently they enter the nucleus where they inhibit CLOCK/BMAL1-mediated transcription in the evening. PERs phosphorylated by CSNK1D/E are degraded via the proteasome pathway and restart a new cycle of transcription. Secondary loops include additional pairs of antagonizing transcription factors such as DBP and E4BP4, REV-ERBs, and RORs. DBP and E4BP4 regulate the expression of Rors and other CCGs through D-boxes. REV-ERBs and RORs control the expression of Bmal1, Clock (NPAS2) and other CCGs via ROR elements (ROREs) (Figure 9). These interlinked positive and negative transcription-translation feedback loops confer daily rhythmicity for homeostasis maintenance (Lowrey and Takahashi, 2011;Shostak, 2017;Shafi and Knudsen, 2019). Abnormal expression of circadian genes in transcriptiontranslation feedback loops impacts several BPs, such as the cell cycle, apoptosis, DNA damage repair and metabolic regulation, thereby affecting tumor development and progression. However, the molecular mechanism of the occurrence and development of LIHC caused by these three rhythm genes remains to be further studied.
Many anticancer drugs are significantly affected by the time of administration (Smolensky and Peppas, 2007;Lévi et al., 2010;Ye et al., 2018), and their biological behavior, such as absorption, distribution, metabolism, and elimination, follow circadian rhythms. A recent study found that once the drugs are inside the tumor cells, their functions are regulated by the circadian rhythm of the tumor cells, with the result that these anticancer drugs show different anticancer activities and adverse reactions with different times of administration (Lévi et al., 2008;Dulong et al., 2015). Therefore, the identification of these DERGs contributes to be used as diagnostic and prognostic biomarkers and molecular targets for chronotherapy in LIHC patients, which has broad application prospects.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
HL performed most of the bioinformatics analyses and experiments. HL, SH, and ZF created the figures and tables. HL and XW wrote the manuscript. YG designed, performed, and supervised the study. SL provided suggestions and guidance of this study, and revised the manuscript. All authors contributed to the article and approved the submitted version.