Integrated WGCNA and PPI Network to Screen Hub Genes Signatures for Infantile Hemangioma

Background Infantile hemangioma (IH) is characterized by proliferation and regression. Methods Based on the GSE127487 dataset, the differentially expressed genes (DEGs) between 6, 12, or 24 months and normal samples were screened, respectively. STEM software was used to screen the continued up-regulated or down-regulated in common genes. The modules were assessed by weighted gene co-expression network analysis (WGCNA). The enrichment analysis was performed to identified the biological function of important module genes. The area under curve (AUC) value and protein-protein interaction (PPI) network were used to identify hub genes. The differential expression of hub genes in IH and normal tissues was detected by qPCR. Results There were 5,785, 4,712, and 2,149 DEGs between 6, 12, and 24 months and normal tissues. We found 1,218 DEGs were up-regulated or down-regulated expression simultaneously in common genes. They were identified as 10 co-expression modules. Module 3 and module 4 were positively or negatively correlated with the development of IH, respectively. These two module genes were significantly involved in immunity, cell cycle arrest and mTOR signaling pathway. The two module genes with AUC greater than 0.8 at different stages of IH were put into PPI network, and five genes with the highest degree were identified as hub genes. The differential expression of these genes was also verified by qRTPCR. Conclusion Five hub genes may distinguish for proliferative and regressive IH lesions. The WGCNA and PPI network analyses may help to clarify the molecular mechanism of IH at different stages.


INTRODUCTION
Infantile hemangioma (IH) is a benign tumor in children, and its rapid growth can lead to serious morbidity and even mortality (Zhu et al., 2018). It is estimated that 10% of infants have IH, and the frequency of hemangiomas in preterm infants with birth weight less than 1 kg increases to 22.9% (Blei, 2005).
The life cycle of hemangioma is divided into three stages. Samples from 6 months old infants were considered to be in the proliferative stage, while those from 24 months old infants were considered to be in the degenerative stage (Takahashi et al., 1994;Chang et al., 2008;Gomez-Acevedo et al., 2020). However, after tumor regression, 40-80% of IH will leave a permanent scar or a large amount of adipose tissue, especially in facial lesions, which can lead to deformity (Boscolo and Bischoff, 2009). Furthermore, although IHs can spontaneously regress, it is still difficult to predict the progression of some IHs. Therefore, some clinicians suggest that interventions for IHs should be performed at an early stage (Luo and Zhao, 2011).
Effective treatments for IHs include the use of corticosteroids and/or surgical removal of tumors (Chen et al., 2019). Although new advances have been made in treatment strategies for IHs, the main clinical problem remains the lack of reliable parameters to distinguish proliferative or regressive IH lesions.
At present, the etiology and pathogenesis of IH are not fully understood. In fact, there are differences in the expression of gene biomarkers in different stages of IH (Ritter et al., 2002;Greenberger et al., 2010a). P53 participated in the process of IH by regulating hypoxia-inducible factors and angiogenesis (Mabeta, 2018). In addition, immune cells are involved in the development of IHs through mesenchymal transformation of endothelium (Wu et al., 2017). Glucocorticoids affect IH growth by altering the "pro-angiogenic" environment of tumors, mainly by inhibiting high levels of vascular endothelial growth factor (VEGF)-A secreted by vascular endothelial cells (Greenberger et al., 2010b).
Although the history and progression of these lesions are clear, the etiology and exact mechanisms of occurrence and spontaneous degeneration remain unclear. A better understanding of the pathogenesis of hemangioma will provide innovative ideas for exploring more effective treatment strategies. Weighted gene co-expression network analysis (WGCNA) is a widely used method to build co-expression pairwise correlation matrices (Feltrin et al., 2019). Exclusively based on co-expression analysis will better represent genes with a small effect size acting together (Chaste et al., 2015). WGCNA provides a systems-level insight into the signaling networks that may be associated with a phenotype of interest (Liang et al., 2020). This study explored key genes and molecular mechanisms of tumor development by identifying differences of gene expression in IHs at different stages.

Data Collection
We collected IH related datasets from the gene expression omnibus (GEO) database. GSE127487 included gene expression profiling of skin tissue from six children of 6 months with hemangiomas, six children of 12 months with hemangiomas, six children of 24 months with hemangiomas and six healthy children. Matrix reports at the probe level were created through Average Normalization and background subtraction.

Analysis of Differentially Expressed Genes
The difference of gene expression between IH and healthy children were analyzed by limma R software package (Ritchie et al., 2015). The differentially expressed genes (DEGs) were obtained using the criteria of P < 0.05 .
Weighted Gene Co-expression Network Analysis (WGCNA) The WGCNA of DEGs was performed using WGCNA R software package (Botia et al., 2017) to construct co-expression network. The soft-thresholding power was used as the correlation coefficient threshold. Then, module eigengene (MEs) with > 23 genes were selected using the dynamic tree cut method. In addition, the correlations between modules and clinical trait of patients were investigated using the Pearson correlation.

Enrichment Analysis
To explore the potential biological roles of module genes, Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment were analyzed using clusterProfile R software package (Shi et al., 2018). Gene set enrichment analysis (GSEA) for module genes was performed with GSEA software (Reimand et al., 2019). P < 0.05 was chosen as the cutoff value.

Protein-Protein Interaction Network
The selected module genes were analyzed by inputting them into the Search Tool for the Retrieval of Interacting Genes (STRING) database 1 . A combined score of ≥ 0.5 was considered as significant to construct a protein-protein interaction (PPI) network. The PPI network was displayed by Cytoscape software (Shi et al., 2018). The hub genes were chosen based on a higher number of associations with other genes. (C) Venny map of differentially expressed genes for three groups. (D) Genes that are continuously up-or down-regulated of the common genes in STEM analysis. DEG_1: differentially expressed genes between 6 months and normal, DEG_2: differentially expressed genes between 12 months and normal, DEG_3: differentially expressed genes between 24 months and normal.

Quantitative Real-Time Polymerase Chain Reaction
A total of nine hemangioma skin tissue samples (three samples for 6 months, three samples for 12 months and three samples for 24 months) and three normal skin tissue samples were collected. All samples were collected with the informed consent of patients. The study was approved by the human ethics review committee of our hospital. Total RNA was extracted from samples using the TRIzol (Thermo Fisher Scientific). RNA was reverse transcribed into cDNA with Superscript II Reverse Transcriptase kit (Invitrogen) according to the supplier's instructions. The gene expression was measured through qRT-PCR using SYBR Green PCR Master Mix kit (Invitrogen). The sequence of these primers was list in Table 1. Relative expression was calculated using 2 − Ct method, and GAPDH as a reference gene. The differences between two groups were compared by Student's t-test. P < 0.05 (two-sided) was considered significant .

Single-Sample Gene Set Enrichment Analysis (ssGSEA)
The mark genes of immune cell types were collected from Bindea et al. (2013) and Zhang et al. (2020). The ssGSEA program was used to calculate the infiltration level of each immune cell. The correlation between hub genes and immune cell was calculated by Pearson correlation.

DEGs in the Development of Infantile Hemangioma
To identify the gene expression changes during the development of IH, we obtained differentially expressed genes (DEGs) between 6, 12, and 24 months hemangiomas and normal controls, respectively ( Figure 1A). There were 5,785 DEGs between 6 months hemangioma and normal, 4,712 DEGs between 12 months hemangioma and normal, and 2,149 DEGs between 24 months hemangioma and normal. Interestingly, the numbers of DEGs gradually decrease over time ( Figure 1B). Of these, 2,149 common DEGs were simultaneously present in the three groups Frontiers in Genetics | www.frontiersin.org of differences ( Figure 1C). By STEM analysis of common genes, we obtained 1,218 genes that were continuously up-or downregulated from 6 to 24 months and then to normal ( Figure 1D).

Co-expression Behavior of DEGs
The scale-free fit index and mean connectivity were calculated and the power of β = 9 (scale free R 2 = 0.8) was selected (Figure 2A). WGCNA divided the union of DEGs of three groups into different modules. Finally, 10 modules were determined ( Figure 2B). Moreover, we calculated the changing trend of genes in the module ( Figure 2C). The expression of MEred and MEturquoise decreased with time, while that of MEyellow increased. Surprisingly, the correlation between module and trait showed that MEred (module 4) had the highest correlation with IH of 6 months, and then gradually decreased ( Figure 2D). The correlation of MEyellow (module 3) increased gradually. We thought they were the most relevant modules to the development of IH.

Biological Functions and Signaling Pathways of Module Genes
To further study the function of these module genes, GO, and KEGG analysis were performed. A total of 4,633 biological  Figure 3A). In addition, the results of 264 KEGG terms were enriched for 10 module genes. The p53 signaling pathway, mTOR signaling pathway and Cellular senescence were included in module 4 and module 3 ( Figure 3B). The results of subtypeGSEA found that Tyrosine metabolism, Retinol metabolism and Chemical carcinogenesis were gradually upregulated with the improvement of IH, while Sphingolipid signaling pathway, Endocrine resistance and Notch signaling pathway were gradually down regulated ( Figure 3C).

Identification of Hub Genes
By calculating the AUC values of the genes in modules 3 and 4 at different stages, we screened 176 genes with AUC values greater than 0.8 in all three groups. Most of these genes showed gradually up-regulated or down-regulated expression trend ( Figure 4A). By constructing the PPI network, we identified the top five genes (FYN, KIF20A, POLD1, RAD54L, and TYMS) with the highest degree of connectivity in the network as hub genes ( Figure 4B). Surprisingly, these five genes were also persistently dysregulated genes screened by STEM software. The hub genes were continuously down-regulated from 6 months to normal ( Figure 4C). Their AUC values for hemangiomas at 6 months were higher than those at 24 months ( Figure 4D). Fortunately, the differential expression of these genes was also validated by qRTPCR experiment (Figure 4E).

Difference of Immune Infiltration in Hemangioma
By comparing the difference of immune cell infiltration between hemangioma and normal at different stages, we found that Tem and aDC had the most significant difference among the three groups ( Figure 5A). The results of correlation analysis showed that FYN was significantly positively correlated with NK cells, negatively correlated with Th2 cells (Figure 5B). KIF20A was negatively correlated with Mast cells, and RAD54L was positively correlated with Th1 cells.

DISCUSSION
IH is one of the most common tumors in children. Because of its complex etiology, the pathogenesis of hemangiomas remains unclear. This study was based on the different development stages of IH to explore the changes of gene expression during the development of the disease. The aim was to screen potential markers that could distinguish different stages of development. These genes were mainly associated with biological effects such as immune response, cell cycle, and apoptosis. The hub genes we obtained were significantly correlated with immune cells.
Interestingly, by comparing the number of DEGs between IH and normal at different stages, we found that the number of DEGs from the proliferative phase to the regressive phase was getting smaller and smaller. This also proved from the side that different time corresponded to different clinical periods. Nearly half of the genes in common DEGs were identified as persistently dysregulated by STEM. It suggested that these genes may be involved in the development of IH.
Co-expression analysis clusters genes with similar expression patterns into a co-expression module, each of which may characterize different molecular mechanisms (Lu et al., 2017). We found that module 4 had the highest correlation with the proliferative stage, followed by a gradual decline, and module 3 had the lowest correlation with the proliferative stage, followed by a gradual increase. These two module genes may be closely related to the development of IH. In fact, these two module genes were enriched in IH-related biological functions and signaling pathways. The neonatal immune system is a developing structure that evolves in a complex, step-by-step manner (Basha et al., 2014). The innate immune defense system plays a key protective role in the growth of infants and young children (Yu et al., 2018). As expected, innate immunity is widespread in IH (Ritter et al., 2006;Peng et al., 2020). Promoting cell cycle arrest and apoptosis are the main mechanisms of action of drugs for the treatment of IH (Chen et al., 2019). In particular, p53 signaling pathway mediated apoptosis (Yao et al., 2018). The mammalian rapamycin complex (mTOR), which converts signals in the extracellular environment, such as glucose and growth factors, into corresponding changes in basic intracellular processes, including proliferation (Laplante and Sabatini, 2012). Rapamycin, an mTOR inhibitor, has anti-angiogenic effects on endothelial cells under pathological conditions (Guba et al., 2002).
During the development of IH, we found a large number of metabolic-related signaling pathways were continuously upregulated. Dysregulation of cell metabolism is listed as a new feature of cancer (Hirschey et al., 2015). Cellular metabolic pathways are targeted as central mediators of signaling and angiogenesis in health and disease (Wong et al., 2017). Our analysis results showed that the Notch signaling pathway was gradually down-regulated with the development of IH. Notch signaling pathway plays a key role in the development and progression of IH and is a potential target for the treatment of IH (Edwards et al., 2017).
The analysis results suggested that hub genes may had the ability to distinguish different stages of IH development. FYN is involved in many types of tumor progression, including hemangiomas (Llombart et al., 2009;van Oosterwijk et al., 2013;Panagopoulos et al., 2019Panagopoulos et al., , 2020. FYN regulates TLR4 signaling in mast cells and induces the secretion of tumor necrosis factor (Martin-Avila et al., 2016). KIF20A is a mitotic kinase that affects the prognosis of bladder cancer by promoting the proliferation and metastasis of bladder cancer cells (Sishtla et al., 2018;Shen et al., 2019). POLD1 is involved in controlling DNA repair and has been shown to be associated with cancer (Nicolas et al., 2016). High expression of POLD1 may serve as a potential prognostic indicator for invasive breast cancer (Qin et al., 2018). The cell cycle gene RAD54L may be inhibited by p53 to regulate G2/M cell cycle (Fischer et al., 2016). RAD54L was reported to be associated with cancer (Qi et al., 2013). TYMS was identified as a biomarker for hepatocellular carcinoma liver transplantation, pancreatic and colorectal cancer (Zhang et al., 2016;Ntavatzikos et al., 2017;Fu et al., 2019). However, the specific mechanism of action of these hub genes in IH is not yet clear.
This study also has some limitations. The analytical data we used were from public databases with a small sample size. To improve the reliability of data analysis results, we validated the differential expression of hub genes in IH and normal by qPCR experiments. In addition, the hub genes obtained in this study need further study to elucidate the discriminative ability of different IH stages and the relevant mechanism of action.

CONCLUSION
In this study, bioinformatic methods were used to screen genes related to the development of infant hemangioma. We identified five key genes (FYN, KIF20A, POLD1, RAD54L, and TYMS) associated with the proliferative and regressive stages of IH.
These key genes may regulate the development of IH through immune response, cell cycle, and mTOR pathway. Although our study was preliminary and further studies were needed to validate these findings. This network analysis based on WGCNA and PPI provided new insights into the diagnosis and treatment of IH patients.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.