Hypoxia-Associated Prognostic Markers and Competing Endogenous RNA Co-Expression Networks in Breast Cancer

Objective Many primary tumors have insufficient supply of molecular oxygen, called hypoxia. Hypoxia is one of the leading characteristics of solid tumors resulting in a higher risk of local failure and distant metastasis. It is quite necessary to investigate the hypoxia associated molecular hallmarks in breast cancer. Materials and Methods According to the published studies, we selected 13 hypoxia related gene expression signature to define the hypoxia status of breast cancer using ConsensusClusterPlus package based on the data from The Cancer Genome Atlas (TCGA). Subsequently, we characterized the infiltration of 24 immune cell types under different hypoxic conditions. Furthermore, the differentially expressed hypoxia associated microRNAs, mRNAs and related signaling pathways were analyzed and depicted. On this basis, a series of prognostic markers related to hypoxia were identified and ceRNA co-expression networks were constructed. Results Two subgroups (cluster1 and cluster2) were identified and the 13 hypoxia related gene signature were all up-regulated in cluster1. Thus, we defined the cluster1 as “hypoxic subgroup” compared with cluster2. The infiltration of CD8+ T cell and CD4+ T cell were lower in cluster1 while the nTreg cell and iTreg cell were higher, indicating that there was immunosuppressive status in cluster1. We observed widespread hypoxia-associated dysregulation of microRNAs and mRNAs. Next, a risk signature for predicting prognosis of breast cancer patients was established based on 12 dysregulated hypoxia associated prognostic genes. Two microRNAs, hsa-miR-210-3p and hsa-miR-190b, with the most significant absolute logFC value were related to unfavorable and better prognosis, respectively. Several long non-coding RNAs were predicted to be microRNA targets and positively correlated with two selected mRNAs, CPEB2 and BCL11A. Predictions based on the LINC00899/PSMG3-AS1/PAXIP1-AS1- hsa-miR-210-3p-CPEB2 and SNHG16- hsa-miR-190b-BCL11A ceRNA regulation networks indicated that the two genes might act as tumor suppressor and oncogene, respectively. Conclusion Hypoxia plays an important role in the initiation and progression of breast cancer. Our research provides potential mechanisms into molecular-level understanding of tumor hypoxia.


INTRODUCTION
Statistics from the International Cancer Research Center show that breast cancer morbidity and mortality rank first and second among female tumors. Worldwide, the incidence of breast cancer is increasing year by year, and the age of onset is getting younger (1). Although China belongs to a region with relatively low incidence of breast cancer, the number of new cases of breast cancer has been increasing in recent years. In some large and medium-sized cities, it has risen to the top of the incidence of female malignant tumors, which seriously threatens the health and lives of women and brings a huge economic and health burden to society. With the advancement of treatment, the prognosis of breast cancer continues to improve, but breast cancer is still a dominant cause of death for women at present (2).
During the growth of malignant tumors, tumor cells grow faster than their blood vessels grow. The effective oxygen diffusion range of the capillaries around the tumor cells cannot meet the needs of rapid tumor growth, resulting in uneven supply of oxygen and nutrients in tumor tissue, thereby forming the hypoxic microenvironment (3,4). Hypoxia is one of the leading characteristics of solid tumors including breast cancer, and plays an important role in the occurrence and development of cancers (5). Hypoxia in the local microenvironment can promote the formation of new blood vessels by inducing Hypoxia-inducible factor 1-alpha (HIF-1a) (6), Vascular endothelial growth factor (VEGF) (7), C-C Motif Chemokine Ligand 28 (CCL28) (8) and other cytokines, and regulate the expression of the signal cascade and downstream related genes, thereby promoting the proliferation and invasion of tumor cells (9). For example, it has been demonstrated that the expression level of HIF-1a in breast cancer and other tumor tissues is significantly higher than that in adjacent tissues, and its increase is positively correlated with the incidence of breast cancer metastasis and mortality (10). Therefore, exploring the exact or related mechanism of hypoxia in tumorigenesis and development is expected to provide new targets and indicators for the treatment and prognostic detection of breast cancer. However, due to variations of oxygen levels in different tissues, it is still difficult to determine the hypoxia status in tumors.
Current research shows that under hypoxic conditions, tumor cells can adapt to the microenvironment on which they grow by changing the expression of enriched endogenous genes, and these gene expression signatures can reflect hypoxia status (11)(12)(13).
The discovery of a series of non-coding RNAs in recent years, including long non-coding RNA (lncRNA), microRNA (miRNA), Circular RNA (circRNA), etc., has uncovered new ways of regulating gene expression in eukaryotes. Non-coding RNAs are involved in a variety of pathological processes, especially in the occurrence and development of tumors (14). A miRNA is a small non-coding RNA molecule containing about 22 nucleotides, and can inhibit the expression of target genes by completely or incompletely binding the 3'UTR region of the target genes' mRNA. A lncRNA is a type of non-coding over 200 nucleotides in length with no protein coding potential, and can positively or negatively regulate gene expression through various mechanisms. The competitive endogenous RNAs (ceRNAs) hypothesis is one theory that explains the mechanism of lncRNA and miRNA. The theory proposes that lncRNA competes with miRNA for binding and covers the miRNA response element, thereby mitigating the inhibitory effect of miRNA on target mRNAs (15). Besides, lncRNA acts as a molecular sponge of miRNA to inhibit the expression of miRNA (16). The "lncRNA-miRNA-mRNA" network has been confirmed in many human cancers (17).
In this study, we used 13 hypoxia-related gene expression signature to characterize the different hypoxia states of breast cancer samples in The Cancer Genome Atlas (TCGA), and depicted the infiltration of 24 immune cell types in breast cancer tissues under different hypoxic conditions. Furthermore, the differentially expressed hypoxia associated miRNAs, mRNAs and related signaling pathways were analyzed and investigated. On this basis, a series of prognostic markers related to hypoxia were screened and a ceRNA co-expression network in breast cancer was constructed. These results have the potential to further improve the regulatory mechanisms under hypoxia in breast cancer.

Study Cohort
The Cancer Genome Atlas (TCGA) breast invasive carcinoma (BRCA) gene expression profile and miRNA mature strand expression RNAseq illuminaHiseq data were retrieved from UCSC Xena (18,19), as in log2(X+1) transformed RSEM normalize count. The gene expression data included 1,104 tumor samples and 114 normal samples as control. Exchanging the Accession number to the ID of miRNA was performed by miRbase database (20) and miRBaseVersions.db R package. Besides, the phenotype of BRCA samples was also gained from UCSC Xena. Among them, 703 samples have complete clinical and pathological data. This study complied with the publication guidelines of TCGA, and ethics approval and informed consent were not required. For tested cohort, mRNA expression Z scores data and clinical data of 1,356 breast cancer tumor samples were obtained from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset (21) on cBioPortal online database (22,23).

Classification of Hypoxia Status
According to the studies published, we selected 13 hypoxia related gene expression signature for our analysis: ADM, TUBB6, MRPS17, CDKN3, TPI1, ALDOA, MIF, PGAM1, LDHA, P4HA1, SLC2A1, NDRG1, and VEGFA, which have been shown to perform the hypoxia status (12,24). Cellular pathways of 13 hypoxia related gene signature were shown in Table S1 for details. Spearman's rank correlation was performed to assess the correlation among these gene by corrplot package, and the PPI network was built using the STRING database (25). Furthermore, two different hypoxia status groups (cluster1 and cluster2) among 1104 TCGA BRCA tumor samples were selected by using ConsensusClusterPlus package with 50 iterations, resample rate of 0.8. Principal component analysis (PCA) was analyzed and visualized by limma package and ggplot2 package. The differential expression of these genes between tumor samples and normal samples, between cluster1 and cluster2 were analyzed by limma package with a cut-0ff P <0.05, then visualized by pheatmap and vioplot.

Immune Cells Infiltration Analysis
The data of infiltration score and 24 immune cell types including 18 T-cell subtypes and 6 other immune cells: B cell, NK cell, Monocyte cell, Macrophage cell, Neutrophil cell and DC cell in TCGA BRCA was estimated and acquired from the ImmuCellAI database (26). Then, the relationship of these cells and hypoxia status was analyzed by limma package with a cut-0ff P <0.05. The prognostic values of CD4+ T cell, CD8+ T cell, iTreg cell and nTreg cell were calculated via the Kaplan-Meier analysis and Logrank-P test by Graphpad 8.0, and X-tile software (27) was used to determine the optimal cut-off point for the prognostic value of these four types of cells.

Differentially Expressed Genes (DEGs) Analysis
Limma package was utilized to identify differentially expressed genes between cluster1 and cluster2 and adjusted P value < 0.05 and |logFC|≥1 were considered to have a significant difference. Then, the PPI network of DEGs was constructed via STRING database (25), and the crucial sub-network was selected by the MCODE APP on Cytoscape 3.7.2 (28) according to the rules as follow: degree cut-off = 10, node score cut-off = 0.2, Max depth = 100, and k-Score = 2. Further, the pathway enrichment of the subnetwork was analyzed by the ClueGo APP on Cytoscape 3.7.2, and the Gene Ontology (GO) enrichment analysis, including: biological processes (BP), cellular components (CC) and molecular functions (MF), was performed via STRING analysis category.

Identification of Hypoxia Associated Prognostic Markers Among DEGs
The prognostic related genes were identified by univariate Cox regression analysis. After that, LASSO Cox regression was employed to select the powerful independent prognostic markers with P<0.05 for OS in BRCA. The risk score (RS) was calculated by the following formula: Where n represents the gene number in the module, Coef (i) is the coefficient of each gene; X(i) means the mRNA expression level of each gene. When Coef (i) is less than 0, it means that the corresponding gene plays a protective effect on the patient. When Coef (i) is greater than 0, the gene represents the opposite trend for survival. The TCGA BRCA tumor samples were divided into high rick and low risk groups by the cut-off of the median RS. Then, the prognostic value of RS in two groups was analyzed by Kaplan Meier method, and sensitivity and specificity assessments were estimated using the receiver operating characteristic (ROC) curves. Additionally, the relationship of RS and clinical parameters were also evaluated. This risk signature was validated by using Gene expression data and clinical data of 1356 breast cancer patients from METABRIC (21). Importantly, patients in Metabric dataset were divided into high risk and low risk groups by the optimal cut-off point of risk score which was obtained by X-tile software (27).

Differentially Expressed MicroRNAs Analysis
The differentially expressed microRNAs between cluster1 and cluster2 were analyzed by limma package with adjusted P value < 0.05 and |logFC|≥1. The prognostic values of differentially expressed microRNAs in breast cancer were assessed by the online tools and database, Kaplan-Meier Plotter (29), and patient samples were divided into two groups by the best cut-off value by the tool automatically and calculated via the Kaplan-Meier analysis and Logrank-P test for the 120 months' OS. We selected one microRNA with the highest |logFC| value in each of the up-regulated and down-regulated microRNAs for the next analysis. The targets genes of candidate microRNAs, hsa-miR-210-3p and hsa-miR-190b, were identified via the mirDIP database (30,31), an integrative Database of Human microRNA Target Predictions, with the predict score as "very high". Further, the GO enrichment analysis of the target genes was performed by STRING database, and the pathway enrichment was analyzed by the ClueGo APP on Cytoscape 3.7.2.

Identify Genes Regulated by Candidate MicroRNAs Under Hypoxia
Venn diagrams were used to select the intersection of hsa-miR-210-3p targets and down-regulated genes, as well as hsa-miR-190b targets and up-regulated genes. The correlations between the selected genes and microRNAs were calculated via the Starbase database (32) based on the date from TCGA BRCA, and the prognostic values of selected candidate genes were analyzed by Kaplan-Meier Plotter database (29).

Identify Target Long Non-Coding RNAs (lncRNAs) of Candidate MicroRNAs
StarBase database was employed to predict the target lncRNAs of candidate microRNAs, hsa-miR-210-3p and hsa-miR-190b, and the microRNAs-lncRNAs network were constructed via Cytoscape 3.7.2. Further, in order to acquire the confidence target lncRNAs in BRCA, lncRNAs were selected based on a negative relationship with candidate microRNAs (P value < 0.05, correlation coefficient <−0.1) and a positive relationship with target genes (P value < 0.05, correlation coefficient > 0.1) in the data collected from TCGA BRCA using StarBase database (32), and also chosen based on the prognostic values performed by the Kaplan-Meier Plotter (29).

Construction of the Competitive
Endogenous RNA (ceRNA) Network and Related PPI Network STRING database was used to predict protein-protein interactions of candidate genes, and PPI network has been extended until the proteins in our analysis were connected to each other (25). Candidate microRNAs and lncRNAs were then added to the network.

Statistical Analysis
Most of the statistical analysis were performed by online bioinformatic databases and tools as mentioned. Wilcox Test was employed to compare mRNA expression, infiltration score of immune cells and risk sore when comparing two sets of data. Chi-square test is used to compare clinical and pathological parameters and other categorical variables. Differentially expressed microRNAs and mRNAs were calculated by limma R package. The Kaplan-Meier curve and Log-rank P test, Univariable COX and LASSO Cox regression were used to analyze the survival outcomes. ROC curve was utilized to assess diagnostic effect. P-values < 0.05 were considered statistically significant. The visualization of the data was done by R 3.6.3, Excel 2019, Graphpad 8.0 and Cytoscape 3.7.2.

Consensus Clustering Identified Two Clusters of BRCA With Different Hypoxia Status
We selected 13 hypoxia related gene expression signature for our analysis: ADM, TUBB6, MRPS17, CDKN3, TPI1, ALDOA, MIF, PGAM1, LDHA, P4HA1, SLC2A1, NDRG1, and VEGFA. These genes were defined based on hypoxia-related gene function and were highly enriched for hypoxia-regulated pathways (11,12), and their cellular pathways and functions were shown in Table   S1 for details. In order to understand their roles in oncogenesis in breast cancer, we firstly explored expression levels of these signature in tumor samples (n=1,104) and normal samples (n=114). The results are displayed as heatmap and vioplot, suggesting that all of them are abnormally expressed in BRCA samples. More specific, ADM, NDRG1, and TUBB6 were downregulated, while other 10 genes were up-regulated in tumor samples compared with normal control (Figures 1A, B). Next, we analyzed the interrelationships and correlations between the 13 genes. Almost all of them were significantly associated with each other ( Figure 1C). Except NDRG1, RPMS17 and TUBB6, other 10 genes can be incorporated into one PPI network ( Figure 1D).
In order to define hypoxia status of 1104 TCGA BRCA tumor samples, based on the expression similarity of the 13 hypoxia related gene signature, Consensus Clustering Method was use to to cluster the samples. In the CDF curve of consensus matrix, there is a flattest middle segment of CDF curve when K=2 (Figures 1E, F and SUPPLEMENTARY METHODS). Besides, we noticed that the interference between subgroups could be reduced to minimal when K=2 was selected for consensus clustering analysis ( Figures 1G-I). Thus, two subgroups named cluster1 (n=310) and cluster2 (n=794) were identified. Then, PCA was used to compare the transcriptional profile between cluster1 and cluster2, suggesting that there was a significant distinction between the two subgroups ( Figure 1J). To better understand the hypoxia status of the two subgroups, we explore the expression of the 13 hypoxia related genes between cluster1 and cluster2. The result demonstrated that all of these 13 genes were up-regulated in cluster1. So, we could define the cluster1 as "hypoxic subgroup" compared with cluster2 ( Figures  1K, L). After excluding cases with incomplete clinical data of TCGA BRCA from UCSC Xena, 703 cases were included to analyzed the association between hypoxia status and clinicopathological characteristics through chi-square test. The results showed that the hypoxia status was significantly associated with ER status, PR status, Her2 status and PAM50 subtype. In more detail, there were more ER-, PR-, Her2+ and triple negative breast cancer patients in cluster1 than cluster2 ( Table 1).

Immune Cells Infiltration of Different Hypoxia Status in BRCA
According to published articles, hypoxia is an important feature of tumors, which can regulate the immune response in tumors. Hypoxia induces tumor cells to produce multiple mechanisms by activating downstream signaling pathways to escape recognition and attack by the immune system (33). We analyzed 24 immune cell types including 18 T-cell subtypes and 6 other immune cells: B cell, NK cell, Monocyte cell, Macrophage cell, Neutrophil cell and DC cell in BRCA based on the ImmuCellAI database. The differences of these cells between cluster1 and cluster2 are shown in Figure 2A and Figure S1. Importantly, the CD8+ T cell and CD4+ T cell, immune cells that mainly recognize and kill tumor cells, were lower in cluster1 compared with cluster2, while the nTreg cell and iTreg cell were higher in cluster1 ( Figure 2B). This result indicated that there was immunosuppressive state in cluster1. Further, the Kaplan-Meier analysis demonstrated that low CD8+ T cell and CD4+ T cell infiltration in TCGA BRCA tumor samples could predict a poor prognosis (P<0.05). Besides, there was a certain trend between poor survival prognosis and high iTreg cell and nTreg cell infiltration (P<0.05) ( Figure 2C).

Identification of Differentially Expressed Genes (DEGs) and Enrichment Analysis
We performed analysis by the gene expression profiles of BRCA to identify hypoxia associated differentially expressed genes. A total of 1,225 differentially expressed genes were selected; 566 were up-regulated and 659 were down-regulated in cluster1 relative to cluster2 ( Figure 3A). Considering that the number Log-rank p < 0.05 was considered statistical significance. HR > 1, cells infiltration was negatively associated with OS, while HR < 1, cells infiltration was positively associated with OS. **P < 0.01, and ***P < 0.001. of differential genes was too large, it was difficult to accurately make GO and pathway enrichment analysis directly. First off, the PPI network of DEGs was constructed on STRING database, and then a crucial sub-network with 58 nodes and 160 edges was selected by the MCODE APP on Cytoscape 3.7.2 ( Figure 3B). The top 10 items of each GO analysis: biological processes, molecular functions and cellular components, were shown as Bar graphs in Figures 3D-F (more details were depicted in Table  S2). The pathway enrichment analysis showed that the DEGs in the sub-network linked to hypoxia and tumor related pathways: HIF-1 signaling pathway, Transcriptional misregulation in cancer, Bladder cancer, Central carbon metabolism in cancer, Glycolysis/Gluconeogenesis, AMPK signaling pathway, etc.
( Figure 3C, Table S2). These results mean that these DEGs may play a role in promoting tumor progression through their function.

Hypoxia Associated Prognostic Markers Among DEGs and a Risk Signature Established
In order to find hypoxia associated prognostic makers among DEGs, univariate Cox regression analysis was performed based on the RNAseq illuminaHiseq data and overall survival (OS) data of 703 TCGA BRCA tumor samples. As a result, 15 up-regulated genes and 52 down-regulated genes were found significantly associated with OS (P<0.05) ( Table 2). To identify the most powerful prognostic mRNA markers, the LASSO Cox regression analysis results ( Figures 4A, B) demonstrated that 3 upregulated and 9 down-regulated genes could be the powerful prognostic markers, and the coefficient of each gene were shown in Figure 4C. The related pathways and functions of these 12 genes were revealed in Table S3. Based on the 12 powerful prognostic markers, a risk signature was constructed, Then, the      Figure 4E). In order to assess the sensitivity and specificity of the prediction, the AUC value was 0.712 obtained from Time-dependent ROC curve, suggesting well-prediction ability ( Figure 4F). The expression levels of the 13 powerful risk markers in highrisk and low-risk group were visualized in the vioplot and heatmap ( Figures 4D, G). The results showed that there were significant differences associated with risk score in terms of cluster (P<0.001), PR (P<0.001), ER(P<0.001), HER2(P<0.01) and fustat (P<0.001). The risk score was much higher in cluster1, ER-, PR-, HER2+ and TNBC patients (Figures 4H-L). Then, univariate and multivariate Cox regression analysis were performed to test whether the risk signature could be set as independent prognostic factor. As a result, the T stage, N stage, M stage, Stage and risk score were associated with OS by univariate analysis, and only M stage and risk score were still significantly related to OS in multivariate Cox regression analysis (Figures 4M, N).
According to our selected risk signature and LASSO Cox regression formula, we calculated risk score and validated our model among 1,356 breast tumor samples in METABRIC dataset. These patients were divided into high risk (n=985) and low risk (n=371) groups by the optimal cut-off point which was obtained by X-tile software. Significantly, Consistent with results of TCGA BRCA tumor samples, the high-risk group also had significantly worse prognosis compared with low-risk group in METABRIC data ( Figure S2A). AND the AUC value of 0.719 also demonstrated a well-prediction ability ( Figure S2B). The risk score was much higher in ER-, PR-, HER2+ and advanced pathological grade patients ( Figure S2C-H). Then, the results of univariate and multivariate Cox regression analysis showed that risk score still could act as an independent prognostic factor among METABRIC patients (Figures S2I, J).

Identification of Hypoxia Associated Differentially Expressed Candidate MicroRNAs
Analysis of the miRNA mature strand expression data of TCGA BRCA yielded 15 up-regulated and 2 down-regulated miRNAs in cluster1 compared to cluster2 ( Figure 5A, Table 3). We examined the prognostic value of these 17 differentially expressed miRNAs via the Kaplan-Meier Plotter database. There were 7 up-regulated and only 1 down-regulated miRNAs associated with 120 months' OS ( Table 3). We selected one up-regulated and one down-regulated miRNA with the most significant logFC value as the candidate miRNAs. Hsa-miR-210-3p was up-regulated in cluster1, while hsa-miR-190b was down-regulated in cluster1 relative to cluster2 (Figures 5B, C). Besides, high expression of hsa-miR-210-3p in BRCA was associated with worse OS (HR=1.54, logrank P=0.036), while high expressed of hsa-miR-190b was related to better OS (HR=0.64, logrank P=0.014) (Figures 5D, E).
Pathway Enrichment Analysis Revealed the Role of hsa-miR-210-3p and hsa-miR-190b in Cancer In order to understand the possible function of hsa-miR-210-3p and hsa-miR-190b in the development of BRCA, KEGG pathway enrichment was utilized to analyze their target genes. Firstly, the target genes of the 2 miRNAs were obtained from the the mirDIP database, an integrative Database of Human microRNA Target Predictions, with the predict score as "very high" (Table S4 and Figure 5F). Then, the pathway enrichment was performed based on these genes, and the results linked these genes to several cancer related pathways: MicroRNAs in cancer, Signaling pathways regulating pluripotency of stem cells, Transcriptional misregulation in cancer, MAPK signaling pathway, Ras signaling pathway, PI3K-Akt signaling pathway, Hepatocellular carcinoma, Pathways in cancer, etc. ( Figure 5G and Table S3). We also performed GO analysis of these target genes including biological processes, molecular functions and cellular components, and the results were shown as Bar graphs in Figures 5H-J (more details were shown in Table S5).

Identification of Candidate DEGs Regulated by Candidate MicroRNAs Under Hypoxia Status
Venn diagrams were used to identify the intersection between top100 hsa-miR-210-3p targets and down-regulated genes, as well as between top100 hsa-miR-190b targets and up-regulated genes in cluster1 relative to cluster2 ( Figure 6A). Among these genes, two candidate DEGs, CPEB2 and BCL11A, were selected. Expression levels were low and high for CPEB2 and BCL11A in cluster1, respectively, and might act as a tumor suppressor gene and a putative oncogene ( Figures 6B, C, F).

A Hypoxia Related Competitive Endogenous RNA (ceRNA) Regulation Network
To determine whether any lncRNAs might be involved in dysregulated expression of candidate DEGs and miRNAs, we firstly used the Starbase database to predict the target lncRNAs of hsa-miR-210-3p and hsa-miR-190b (Table S4 and Figure 6I). Then we chose the lncRNAs which are not only negatively related to candidate miRNAs but also positively related to CPEB2 and BCL11A based on ceRNA theory ( Table 4). Among the selected lncRNAs, 4 of them, SNHG16, LINC00899, PSMG3-AS1 and PAXIP-AS1, were significantly associated with survival (Figure 7). High expression of SNHG16 could predict a poor prognosis, while high expression of LINC00899, PSMG3-AS1 and PAXIP-AS1 could predict a better prognosis. We constructed a local protein network between the proteins CPEB2 and BCL11A, we then added candidate miRNAs and the 4 lncRNAs to this network. In this network, loss of LINC00899, PSMG3-AS1 and PAXIP1-AS1 leads to increased hsamiR-210-3p. When overexpressed, hsa-miR-210-3p impedes translation of CPEB2, a tumor suppressor gene in breast cancer under hypoxia status. High expressed SNHG16 can suppress hsa-miR-190b, which leads to increased expression of BCL11A, an oncogene in breast cancer ( Figure 8).
Thus, this dysregulated ceRNA network can result in progression of breast cancer under hypoxia situation.

DISCUSSION
The complete tumor tissue includes not only cancer cells, but also the surrounding vessels, lymphatic vessels, fibroblasts, inflammatory cells, and extracellular matrix. It also includes a variety of interstitial cells and biomolecules infiltrating them. These are collectively referred to as the tumor microenvironment (34,35). The abnormal vascular network in solid tumors and the excessive oxygen demand of rapidly growing cancer cells lead to hypoxia in tumor tissues. Hypoxic and acidic microenvironment is one of the most important components of tumor microenvironment. Cancer cells adapt to and rely on these microenvironments, leading to the diversity and instability of gene mutations, activating multiple signaling pathways and survival factors, which contribute to angiogenesis, metabolic reprogramming, epithelial-mesenchymal transition, invasion, metastasis, cancer stem cell maintenance, immune evasion, and resistance to chemotherapy and radiation therapy (5,36). Therefore, understanding the effect of hypoxia on molecular mechanism is essential to improve the outcome of cancer treatment. In our current study, we selected 13 hypoxia related gene signature which can well demonstrate the hypoxia status based on published studies. These 13 genes make up a common hypoxia signature which will be up-regulated and are consistently co-expressed with previously validated hypoxiaregulated genes under hypoxic conditions in various cancers. They are a small number of top-ranked genes with the highest connectivity and the most prognostic in hypoxia co-expression cancer networks, including head and neck, breast and lung cancers (12). And according to their expression, we defined the hypoxia status of breast cancer tissues to divide these breast cancer samples into two groups, namely cluster1 and cluster2. Considering that the expression of the 13 genes in cluster1 is higher than that of cluster2, we defined cluster1 as the "hypoxic subgroup". For the association between hypoxia status and clinicopathological characteristics, there were more ER-, PR-, Her2+ and triple negative breast cancer patients in cluster1 than cluster2. This result indicates that the hypoxic state is closely related to the malignant phenotype of breast cancer.
An increasing number of studies indicate that the hypoxia status of tumor tissue is an important reason for promoting tumor immunosuppression and resistance to immunotherapy. Tumor hypoxic regions can recruit immunosuppressive cells such as myeloid derived suppressor cells (MDSC), tumor-associated macrophage (TAM) and Tregs, and can inhibit the activation of CD8+T cells and CD4+T cells (33,37). Hypoxia can increase the expression and secretion of CCL28 in ovarian cancer cells. CCL28 is an inducer of Tregs and has an immunosuppressive function on CD8+ T cells (38). In the presence of hypoxia and TGF-b, CD4+ T cells upregulated Foxp3 through the binding of HIF-1 to the promoter region of Foxp3, which induced the differentiation of Tregs and enhances immunosuppression (39). Therefore, we compared the infiltration of 24 immune cell types including 18 T-cell subtypes and 6 other immune cells in cluster1 and cluster2. The results showed that, compared with cluster 2, the infiltration of CD8 + T cells and CD4 + T cells in cluster 1 was lower, while the infiltration of nTreg cells and iTreg cells in cluster 1 was higher. This result indicates that there is an immunosuppressive state in cluster1. In addition, this result in turn confirmed that we initially defined cluster1 as the "hypoxic subgroup" was correct. The recent research results of Shaoquan Zheng et al. are in strong agreement with ours. They constructed a combined hypoxia and immune index based on 3 hypoxia-related genes and 7 immune-related genes for triple-negative breast cancer samples in Gene Expression Omnibus (GEO), TCGA and METABRIC by silico analyses, and patients were divided into hypoxia high /immune low and hypoxia low / immune high groups. The key markers of hypoxia (ALDOA, ENO1, LDHA, etc.) are highly expressed in the hypoxia high / immune low group, while a higher percentage of CD8+ T cells was observed in the hypoxia low /immune high group (40). In this case, both their and our results confirmed that there is a strong positive correlation between the hypoxia status of tumor and immunosuppression. The response to hypoxia includes a series of adaptation mechanisms that promote tumor cells survival (41). MET, SLC2A1, VEGFA, and VEGFC. They are well described as part of the hypoxic transcriptome and are HIF targets involved in the response to hypoxia, positive regulation of the I-kappaB kinase/NF-kappaB cascade, carbohydrate metabolism, glycolysis and other pathways and GO functions (42). Further, we analyzed the differentially expressed genes of cluster1 and cluster2 to explore the molecular mechanism of breast cancer tissue changes under hypoxia. Among them, some important DEGs clustered significantly into pathways related to hypoxia and tumor invasion and metastasis, such as HIF-1 signaling pathway, Transcriptional misregulation in cancer, Bladder cancer, Central carbon metabolism in cancer, Glycolysis/ Gluconeogenesis, AMPK signaling pathway, etc. Similarly, these results in turn confirmed that our hypoxic classification of tumor samples was correct. The HIF-1 signaling and Glycolysis/Gluconeogenesis pathways play a vital role in promoting the occurrence and development of tumors under hypoxic conditions. As is reported, cancer cells can use both conventional oxidative metabolism and glycolytic anaerobic metabolism. However, even in the presence of oxygen, their proliferation is also characterized by increased glycolytic metabolism which is called Warburg effect. HIF 1 as a major hypoxia-inducible transcription factor can promote the dissociation between glycolysis and tricarboxylic acid cycle. This process limits the effective production of ATP and citric acid which would otherwise prevent glycolysis. The Warburg effect is also conducive to alkaline pH in tumor cells, which drives cancer cell proliferation (enhancing cell cycle progression and glycolysis) and cancer aggressiveness (resistance to immune response, cytotoxic drugs and apoptosis). This effect even leads to epigenetic and genetic changes which cause cells to appear a variety of new phenotypes, thereby enhancing the growth and aggressiveness of cancer cells (43). Therefore, our results mean that these DEGs between cluster1 and cluster2 may play a role in promoting tumor progression through their functions according to existing research results.
Enriched studies indicate that hypoxia-related gene signatures generated in vitro and in vivo have prognostic power in breast cancer and other cancers. Inna Y. Gong (45), and Sorensen (46)], and confirmed to a certain extent that hypoxia-related gene signatures had potential to be used as biomarkers to predict survival of early breast cancer (47). By contrast, Maud H W Starmans et al. identified 295 up-regulated and 164 downregulated genes under hypoxia in breast (MCF7), colon (HT29) and prostate (DU145) carcinoma cells in vitro, but they found that none of these in vitro derived signatures consisting of hypoxia-induced genes are prognostic when in a much larger cohort of breast cancer patients in vivo (48). In an effort to bolster clinical tools for hypoxic understanding of breast cancer, we also developed a prognostic signature associated with hypoxia. By univariate Cox and LASSO Cox regression analysis, we constructed a new risk signature which was not reported before based on 3 up-regulated and 9 down-regulated genes in cluster1. Besides, our 12-gene signature showed a well-prediction ability to provide new perspectives for the identification of breast cancer with a high risk of death. And the risk score is much higher in cluster1, ER-, PR-, HER2+, TNBC, and advanced Grade patients, which indicates that the increased risk score also predicts a malignant breast cancer molecular phenotype. Some studies have confirmed the role and function of genes in this risk signature. For example, CCL19 inhibited cell proliferation, migration, and invasion in gastric cancer by activating the CCR7/AIM2 signaling pathway, which could be a potential therapeutic approach (49). EGOT reduced the vitality and migration of breast cancer cells by inactivating the Hedgehog pathway (50). COL17A1 as a target of p53 can also inhibit the migration and invasion of breast cancer cells (51).
MiRNAs and lncRNAs are identified as key regulators of gene expression in various biological and pathological processes, including cancer (52). Further we use data contained in databases such as StarBase, mirDIP, Kaplan-Meier Plotter and TCGA, based on ceRNA theory, we identified potential ncRNA regulatory pathways involving a tumor suppressor and an oncogene, LINC00899/PSMG3-AS1/PAXIP1-AS1-hsa-miR-210-3p-CPEB2 and SNHG16-hsa-miR-190b-BCL11A ceRNA regulation networks, and built a local PPI network which might promote the development of breast cancer under hypoxia. Experimental results are consistent with some of our predictions. It is reported that miR-210 was upregulated by HIF-1a in the stromal cells of giant cell tumors of bone (53). Downregulation of miR-210 inhibited growth of tumors, such as glioblastoma and osteosarcoma (54,55). In addition, CPEB2 has been shown to act as a tumor suppressor gene in breast cancer. In MCF7 cells, CPEB2 gene knockdown mediated by siRNA promotes carcinogenic properties in vitro, promotes EMT, migration, invasion, proliferation and stem cell-like phenotype of cells (56). Breast cancer-derived exosomes induced CD73 + gd1 Treg cells by transmitting lncRNA SNHG16, while CD73 + gd1 Treg cells exert an immunosuppressive effect through adenosine production (57). Besides, by directly interacting with the 3'UTR of Bcl2, miR-190b induces osteosarcoma cell apoptosis and confers radiosensitivity to gastric cancer cells (58,59). According to reports, BCL11A acts as a carcinogenic gene for a variety of human cancers, such as breast cancer (60), laryngeal squamous cell carcinoma (61), high-risk neuroblastoma (62), non-small cell lung cancer (63) etc. In addition, the expression of PSMG3-AS1 in breast cancer tumor tissues and cell lines was increased, and PSMG3-AS1 as a sponge of miR-143-3p enhanced the proliferation and migration ability in the pathogenesis of breast cancer (64).
In conclusion, our research has provided an understanding of potential carcinogenesis mechanism and molecular prognostic markers of breast cancer under hypoxic conditions from multiple levels by in silico analyses. We hope that our research can provide a new theoretical basis for exploring the carcinogenic and progression mechanisms of breast cancer. However, it is undeniable that our research still has some limitations. The data from the TCGA database does not directly provide values for hypoxia status, for example, O 2 levels. At the same time, we have only analyzed and constructed relevant ceRNA regulatory networks for hsa-miR-190b and hsa-miR-210-3p, but not other miRNAs. The next important step is to use functional experiments to verify our predictions.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
P-JG and J-WZ contributed to the conception of the study. P-JG, Y-CS, and S-RH contributed to experimental technology and experimental design. P-JG, S-RH, and Y-FZ performed the data analyses. P-JG, X-NY, J-JX, and W-NY wrote the manuscript. LW and J-WZ supervised the study. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by funds from Health commission of Hubei Province scientific research project (WJ2019H020, WJ2019H028).