Identification of intrinsic genes across general hypertension, hypertension with left ventricular remodeling, and uncontrolled hypertension

The purpose of the present article is to identify intrinsic genes across general hypertension (HT), hypertension with left ventricular remodeling (HT-LVR), and uncontrolled hypertension (UN-HT). Four microarray datasets (GSE24752, GSE75360, GSE74144, and GSE71994) were downloaded from the GEO database, and differentially expressed genes (DEGs) were identified in each of them. Gene set enrichment analysis (GSEA) was then used to screen for significantly enriched biological pathways in each of the four datasets. In addition, weighted gene co-expression network analysis (WGCNA) and functional enrichment analysis were applied to screen out gene modules of interest and their potential biological functions, respectively. Finally, a Metascape-based multiple gene list meta-analysis was used to investigate intrinsic genes at different stages of the progression of hypertension. A total of 75 DEGs (63 upregulated and 12 downregulated genes) were identified in GSE24752 and 23 DEGs (2 upregulated and 21 downregulated genes) in GSE74144. However, few DEGs were identified in GSE75360, GSE71994, and part of the GSE74144 dataset. GSEA and functional enrichment of the gene modules of interest indicated that “heme metabolism,” “TNF alpha/NFkB,” “interferon alpha response signaling,” and MYC target v1/v2 were significantly enriched at different stages of hypertension progression. Notably, the multiple gene list meta-analysis suggested that FBXW4 and 13 other genes were unique to the hypertension group, TRIM11 and 40 other genes were mainly involved in the hypertension with left ventricular remodeling group, and 18 other genes, including F13A1, were significantly enriched in the uncontrolled hypertension group.
Collectively, the precise switch of the “immune-metabolic-inflammatory” loop pathway was the most significant hallmark across different stages of hypertension, thereby providing a potential therapeutic target for uncontrolled hypertension treatment.

KEYWORDS intrinsic genes, hypertension, left ventricular remodeling, uncontrolled hypertension, bioinformatics strategies

Introduction
Hypertension (HT) is the most common cardiovascular disease in public health worldwide, with a high incidence and a range of variable and complex complications (1). Moreover, HT is a complex systemic disease that not only endangers the cardiovascular system but also damages other organs and systems (2). However, the results of current treatment methods for hypertension, which depend heavily on long-acting antihypertensive drugs given once daily, are discouraging (3).
Advances in high-throughput screening technology and novel bioinformatics algorithms have facilitated the systematic investigation of research targets (4). A classic example of high-throughput technology applied to hypertension research was a genetic analysis of over 1 million people that identified 535 new loci associated with hypertension (5).
Cardiac remodeling, especially left ventricular myocardial remodeling, is considered one of the hallmarks of hypertensive heart disease progression (6). In addition, Ehlers et al. (7) suggested that the activation of AT1R triggers cardiovascular remodeling in subjects with hypertension. A considerable number of patients continue to have uncontrolled hypertension (UN-HT), despite the availability of many antihypertensive drugs and various management guidelines (8,9). A multicenter study (10) suggested that many gene variants were present in uncontrolled hypertension.
This is the first data-mining study to systematically identify peripheral blood transcriptome characteristics of hypertension at different stages, namely hypertension, hypertension with left ventricular remodeling (HT-LVR), and uncontrolled hypertension, by means of bioinformatics methods. Liew et al. (11) proposed that the expression levels of most genes in bulk tissues and peripheral blood show a good positive linear correlation, demonstrating that peripheral blood is an ideal surrogate tissue in which to discover genes and pathways. Compared with tissue samples, blood samples have great advantages in early clinical disease diagnosis.
In terms of research methods, we performed a more robust gene set enrichment analysis (GSEA) rather than the traditional enrichment analysis based on the hypergeometric distribution to find transcript functions. Our core step was to detect modules of interest through weighted gene co-expression network analysis (WGCNA) and then to apply a multi-gene list meta-analysis method to identify intrinsic genes and their interactions at different stages of the progression of hypertension.
Collectively, our study suggested that the signal conversion of the immune-metabolic-inflammatory loop pathway was the most significant hallmark at different stages of hypertension, and it identified potential targets for uncontrolled hypertension treatment.

Candidate dataset collection and preprocessing
We collected hypertension-related transcriptomic datasets with complete expression profiles and experimental design information from the Gene Expression Omnibus (12) (GEO) database in the present study by using the keyword "hypertension." The datasets GSE24752, GSE74144, and GSE71994 were downloaded using the R package (13) "GEOquery" (v2.60.0). For the GSE75360 dataset, the non-normalized expression matrix was downloaded from the GEO repository.
The R packages "hgu133plus2.db" (v3.13.0), "illuminaHumanv4.db" (v1.26.0), "hsAgilentDesign026652.db" (v3.2.3), and "hugene10sttranscriptcluster.db" (v8.8.0) were used to convert gene probes to gene symbols according to the respective microarray platform, and all gene names were then remapped to official gene symbols with the multi-symbol checker tool (https://www.genenames.org/tools/multi-symbol-checker/). On the basis of the experimental design, only GSE75360, GSE74144, and GSE71994 were selected for the subsequent weighted gene co-expression network analysis (WGCNA). Quantile normalization was applied to the gene expression matrix using the "normalizeBetweenArrays" function of the R package (16) "limma" (v3.48.3), and a log2 transformation was conducted for subsequent data analysis. For genes with multiple probes, the median value of all expressed probes was used. When one probe matched multiple genes, it was deleted. Genes with zero expression in all samples were excluded.
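The preprocessing steps above can be illustrated with a minimal Python sketch. It is not the code used in this study, which relied on R and "limma"; the function names (`quantile_normalize`, `log2_transform`, `collapse_probes`) are hypothetical and introduced only for illustration, and ties are broken by position, which is simpler than the tie handling in "limma".

```python
import math
from statistics import median


def quantile_normalize(columns):
    """Give every sample (column) the same empirical distribution.

    `columns` is a list of equal-length lists, one list per sample.
    Assumption: ties are broken by position, unlike limma's tie handling.
    """
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n_rows)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def log2_transform(columns):
    """Apply a log2 transformation to strictly positive intensities."""
    return [[math.log2(v) for v in col] for col in columns]


def collapse_probes(probe_rows, probe_to_genes):
    """Collapse probe-level rows to gene-level rows by the per-sample median.

    `probe_rows` maps probe id -> list of values (one per sample).
    Probes mapped to several genes, or to none, are dropped.
    """
    by_gene = {}
    for probe, values in probe_rows.items():
        genes = probe_to_genes.get(probe, [])
        if len(genes) != 1:
            continue
        by_gene.setdefault(genes[0], []).append(values)
    return {
        gene: [median(sample_values) for sample_values in zip(*rows)]
        for gene, rows in by_gene.items()
    }
```

In this sketch the matrix is stored sample by sample for normalization and probe by probe for collapsing; a real pipeline would keep a single matrix and apply the three steps in the order described above.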

Screening of differentially expressed genes (DEGs) and setting an optimal threshold
We used the R package "limma" (v3.48.3) to identify differentially expressed genes (DEGs) between the control group and the experimental group from the GSE24752, GSE75360, GSE74144, and GSE71994 datasets.
Since alterations of RNA levels are usually smaller in peripheral-blood samples and peripheral blood mononuclear cells than in other tissues (17), we used the R package "RVA" (v0.0.4) to visualize the number of DEGs under different thresholds. Genes with an absolute fold change of at least 2 and a p-value of < 0.05 were considered DEGs.
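How the number of DEGs depends on the chosen thresholds can be sketched as follows. This is an illustrative Python example rather than the "RVA" implementation; the function name `count_degs` and the input layout (gene, log2 fold change, p-value) are assumptions, and an absolute fold change of 2 is taken to mean |log2FC| >= 1.

```python
import math


def count_degs(results, fc_cutoffs, p_cutoffs):
    """Count up- and downregulated genes for every threshold combination.

    `results` is a list of (gene, log2_fold_change, p_value) tuples.
    Returns {(fold_change_cutoff, p_cutoff): (n_up, n_down)}.
    """
    table = {}
    for fc in fc_cutoffs:
        # A fold change of 2 corresponds to a log2 fold change of 1.
        min_abs_log2fc = math.log2(fc)
        for p_cut in p_cutoffs:
            up = sum(
                1 for _, lfc, p in results if p < p_cut and lfc >= min_abs_log2fc
            )
            down = sum(
                1 for _, lfc, p in results if p < p_cut and lfc <= -min_abs_log2fc
            )
            table[(fc, p_cut)] = (up, down)
    return table
```

Calling `count_degs` with several fold-change cutoffs (for example 1.5 and 2.0) shows how quickly the DEG count shrinks as the threshold is tightened, which is the kind of comparison the threshold visualization is meant to support.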

Identification of co-expression modules of interest using WGCNA
Weighted gene co-expression network analysis (WGCNA) is designed to identify trait-related gene modules of interest. Its workflow includes dataset preprocessing, soft-power selection, construction of the adjacency matrix and the topological overlap matrix (TOM), hierarchical clustering, and cluster partition.
The GSE24752 dataset could not be analyzed by WGCNA because it contained too few samples for both phenotypes. The gene co-expression networks of the GSE75360, GSE74144, and GSE71994 datasets were constructed with the R package (20) "WGCNA" (v1.70-3).
First, the "pickSoftThreshold" function was used to select the optimal soft power for establishing a scale-free network. Next, we calculated the adjacency matrix and the corresponding topological overlap matrix (TOM) and performed hierarchical clustering to group genes with similar expression into co-expression modules, using the one-step WGCNA method implemented in the "blockwiseModules" function. Finally, we investigated the relations between modules and external traits with the "labeledHeatmap" function to find functional modules in the co-expression network. The module with the highest correlation coefficient and a significant p-value was considered the candidate module most closely correlated with the trait of interest, and this module was used for the subsequent analysis.
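The two matrices at the core of this step can be sketched in a few lines of Python. The example assumes an unsigned network, in which the adjacency is a_ij = |cor(x_i, x_j)|^beta and the topological overlap is TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with k_i the connectivity of gene i; whether the networks in this study were unsigned is an assumption, and the function names are hypothetical rather than taken from the "WGCNA" package.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def adjacency(expr, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    `expr` is a list of gene vectors (one list of sample values per gene).
    """
    n = len(expr)
    return [
        [1.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """Topological overlap matrix for an unsigned adjacency matrix."""
    n = len(adj)
    # Connectivity k_i: sum of adjacencies to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom
```

Raising the correlation to the power beta suppresses weak correlations while keeping strong ones, which is why a larger soft power yields a sparser, more scale-free-like network; hierarchical clustering is then typically run on the dissimilarity 1 - TOM.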

Functional enrichment analysis of gene modules of interest
Functional enrichment analysis was performed on the "Metascape" (Database Last Update Date: 2021-08-01) platform (21), using the genes in the modules most closely correlated with the traits of interest in WGCNA. The analysis included Gene Ontology (GO)/Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment and transcription factor enrichment. Default settings were used, and terms with an FDR of < 5% were considered significant.
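For readers unfamiliar with how an FDR cutoff is applied to enrichment terms, the following Python sketch shows a standard over-representation test (upper tail of the hypergeometric distribution) followed by Benjamini-Hochberg adjustment. That this mirrors the internal computation of the Metascape platform is an assumption, and the function names are hypothetical.

```python
from math import comb


def hypergeom_tail(population, annotated, drawn, hits):
    """P(X >= hits) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes annotated to the term
    drawn:      size of the module gene list
    hits:       module genes annotated to the term
    """
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favorable / comb(population, drawn)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A term would then be reported as significant when its adjusted p-value falls below 0.05, corresponding to the FDR threshold of 5% stated above.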

Multi-gene list meta-analysis
In general, Venn diagrams are routinely used to identify hits that are common or unique to certain gene lists. However, Chanda SK et al. proposed a multi-gene set analysis method, implemented on the "Metascape" (Database Last Update Date: 2021-08-01) platform (21), that better reflects biological significance. Therefore, we identified intrinsic gene sets across general hypertension, hypertension with left ventricular remodeling, and uncontrolled hypertension by means of the "Multi-gene list meta-analysis" method. The procedure comprises three steps: protein-protein interaction (PPI) network construction, functional enrichment analysis, and hub gene selection. The PPI network of the gene modules of interest was established through the Metascape platform. The most significant gene module from the PPI network was visualized using the Molecular Complex Detection (MCODE) method. Cytoscape software (22)

Statistical analysis
Most of the statistical analyses were conducted using R software version 4.1.1. All the p-values and adjusted p-values were for a two-sided test and considered statistically significant when p was < 0.05.

Identification of HT-DEGs, HT-LVR-DEGs, and UN-HT
The flow diagram of the current study is presented in Figure 1, which mainly includes the transcriptomic dataset download and preprocessing (Figure 2), function enrichment analysis, and WGCNA.
According to the microarray group design, the comparison matrix for differential gene expression analysis was set as follows: In total, we obtained 75 differentially expressed probes (DEGs; 63 upregulated and 12 downregulated genes) in peripheral blood cell samples from the GSE24752 dataset (Figures 3A,G), which was consistent with Wei et al. (23). Furthermore, we identified 23 differentially expressed probes (DEGs; 2 upregulated and 21 downregulated genes) in peripheral blood cell samples from the GSE74144 dataset (hypertensive patients with a normal left ventricle vs. normal controls), as shown in Figures 3D,H. Interestingly, we detected few DEGs in the GSE75360, GSE71994, and part of the GSE74144 datasets based on an absolute 2-fold change with a p-value of < 0.05 (Figures 3B,C,E,F). This was contrary to the findings of Pang et al. (24), who identified 842 DEGs in the GSE71994 dataset (629 upregulated and 213 downregulated genes) and 28,232 DEGs in the GSE74144 dataset. To account for this contradiction, we searched further publications. Li et al. (25) suggested either lowering the threshold to increase the number of DEGs or adopting the weighted gene co-expression network analysis (WGCNA) algorithm. Therefore, we performed a more robust gene set enrichment analysis (GSEA) and a WGCNA to explore gene modules of interest. The full gene list is shown in Supplementary Table 1.
As shown in Figures 5A,B, the SkyBlue gene module exhibited a strong positive correlation with hypertension (r = 0.56, p = 0.008, soft threshold beta = 8).
Genes from the SkyBlue gene module (Figure 5C) were significantly enriched in myeloid cell homeostasis, regulation of body fluid levels, COPI-independent Golgi-to-ER retrograde traffic, and so on, and were broadly regulated by the SF1 and ZNF577 transcription factors (Figure 5D). The full list of genes is shown in Supplementary Table 2.
Furthermore, as illustrated in Figures 6A,B, the Cyan gene module showed a strong positive correlation with the hypertension with left ventricular remodeling group (r = 0.39, p = 0.02, soft threshold beta = 10). Genes from the Cyan gene module (Figure 6C) were significantly enriched in the generation of precursor metabolites and energy, TP53 regulated metabolic genes, the Ca2+ pathway, and so on, and were broadly regulated by the MPM1, ARNT2, and ZNF507 transcription factors (Figure 6D). The full list of genes is shown in Supplementary Table 3.
In addition, genes from the GreenYellow module, which had the highest association with uncontrolled hypertension (r = 0.35, p = 0.03, soft threshold beta = 12; Figures 7A,B), were involved in hemostasis, smooth muscle contraction, regulation of cell adhesion, Ras protein signal transduction, and so on, and were broadly regulated by SRF and TCF4. Genes from the Magenta module (Figure 7C) were involved in lymphocyte activation and lymphocyte migration and were broadly regulated by nuclear factor-κB 1 (NFKB1) and signal transducer and activator of transcription 4 (STAT4) (Figure 7D). The full list of genes is shown in Supplementary Table 4.

Identification of intrinsic gene set across HT, HT-LVR, and UN-HT
We performed a multi-gene-list meta-analysis across HT, HT-LVR, and UN-HT by means of the Metascape platform (21).
As illustrated in Figure 8A, the ceramide biosynthetic process, regulation of TP53 activity, histone methylation, and COPI-independent Golgi-to-ER retrograde traffic were enriched in the hypertension group. The generation of precursor metabolites and energy, Golgi organization, and the Ca2+ pathway were involved in the hypertension with left ventricular remodeling group, while negative regulation of norepinephrine secretion, smooth muscle contraction, and hemostasis were significantly enriched in the uncontrolled hypertension group.
After PPI network construction and hub gene selection, Figures 8B,C showed that the intrinsic gene sets and their interactions differ across general hypertension, hypertension with left ventricular remodeling, and uncontrolled hypertension. The full list of genes is shown in Supplementary Table 5.

Discussion
The difficulty in the treatment of hypertension lies in its high incidence and multiple organ involvement. In genetics, the polymorphism of gene loci and the heterogeneity of the population have become obstacles to the gene therapy of hypertension (27, 28). Despite the many antihypertensive drugs available, a considerable number of patients continue to have uncontrolled hypertension, which is defined as a mean systolic and/or diastolic BP ≥140/90 mmHg based on the Seventh Joint National Committee (JNCVII) on Detection, Evaluation, and Treatment of High Blood Pressure.
In this study, we comprehensively investigated the transcriptome characteristics related to hypertension from three perspectives: general hypertension, hypertension with left ventricular remodeling, and uncontrolled hypertension. Interestingly, we found that biological processes related to blood metabolism, such as "heme metabolism" and "coagulation," were also significantly enriched in the hypertension group (Figures 4A,D).
The advantages of our study are as follows: (1) This is the first data-mining study to systematically identify peripheral blood transcriptome characteristics of hypertension at different stages, namely hypertension, hypertension with left ventricular remodeling, and uncontrolled hypertension, by means of bioinformatics methods. (2) Considering the expression profile from peripheral blood, blood samples have great advantages in early clinical disease diagnosis compared with tissue samples. (3) Our study suggested that FBXW4 and 13 other genes were unique to the hypertension group, TRIM11 and 40 other genes were mainly involved in the hypertension with left ventricular remodeling group, and 18 other genes, including F13A1, were significantly enriched in the uncontrolled hypertension group.
However, our study has its own limitations. (A) More reliable results would be obtained from a larger sample size; (B) our findings were obtained mainly by bioinformatics methods and were not verified by experiments; (C) the level of gene expression in peripheral blood is low, and differential expression analysis may not be applicable; and (D) it may seem counterintuitive that we found no detectable DEGs for hypertension with left ventricular remodeling and uncontrolled hypertension vs. the control group. Because our study is a pure bioinformatics analysis based on the GEO database, further biological experiments are needed to validate our results.
In summary, our study showed that many gene changes occur at different stages of hypertension (Figure 8C). Moreover, the dynamic changes in immune metabolism and inflammation are the most significant biological changes in hypertension. Gene microarray based on peripheral blood has the potential to screen the target molecules with diagnostic and therapeutic potential for heart failure.