Original Research ARTICLE
Identification and Validation of Potential Biomarkers and Their Functions in Acute Kidney Injury
- 1Department of Nephrology, Chinese PLA General Hospital, Chinese PLA Institute of Nephrology, Beijing Key Laboratory of Kidney Disease, State Key Laboratory of Kidney Diseases, National Clinical Research Center for Kidney Diseases, Chinese People’s Liberation Army General Hospital, Beijing, China
- 2Department of Critical Care Medicine, Beijing Electric Power Hospital, Beijing, China
- 3Department of Urology, San Martino Policlinico Hospital, University of Genoa, Genoa, Italy
Acute kidney injury (AKI) is a global public health concern associated with high morbidity, mortality, and health-care costs, and the therapeutic measures are still limited. This study aims to investigate crucial genes correlated with AKI, and their potential functions, which might contribute to a better understanding of AKI pathogenesis. The high-throughput data GSE52004 and GSE98622 were downloaded from Gene Expression Omnibus; four group sets were extracted and integrated. Differentially expressed genes (DEGs) in the four group sets were identified by limma package in R software. The overlapping DEGs among four group sets were further analyzed by the VennDiagram package, and their potential functions were analyzed by the GO and KEGG pathway enrichment analyses using the DAVID database. Furthermore, the protein–protein interaction (PPI) network was constructed by STRING, and the functional modules of the PPI network were filtered by MCODE and ClusterOne in Cytoscape. Hub genes of overlapping DEGs were identified by Cyto-Hubba and cytoNCA. The expression of 35 key genes was validated by quantitative real-time PCR (qRT-PCR). Western blot and immunofluorescence were performed to validate an important gene Egr1. A total of 722 overlapping DEGs were differentially expressed in at least three group sets. These genes mainly enriched in cell proliferation and fibroblast proliferation. Additionally, 5 significant modules and 21 hub genes, such as Havcr1, Krt20, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln were screened by analyzing the PPI networks. The 5 significant modules were mainly enriched in complement and coagulation cascades and Metabolic pathways, and the top 21 hub genes were mainly enriched in positive regulation of cell proliferation. Through validation, Krt20 were identified as the top 1 upregulated genes with a log2 (fold change) larger than 10 in all these 35 genes, and 21 genes were validated as significantly upregulated; Egr1 was validated as an upregulated gene in AKI in both RNA and protein level. In conclusion, by integrated analysis of different high-throughput data and validation by experiment, several crucial genes were identified in AKI, such as Havcr1, Krt20, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln. These genes were very important in the process of AKI, which could be further utilized to explore novel diagnostic and therapeutic strategies.
Acute kidney injury (AKI) is a syndrome characterized by the rapid loss of the kidney’s excretory function and is typically diagnosed by the decreased urine output or accumulation of end products of nitrogen metabolism (urea and creatinine), or both (Bellomo et al., 2012). AKI is a global public health concern associated with high morbidity, mortality, and health-care costs (Zuk and Bonventre, 2016). It occurs in about 13.3 million people per year and is thought to contribute to about 1.7 million deaths every year (Mehta et al., 2015). Data show that 21% of hospital admissions were affected with AKI at least one time, and patients who require dialysis or those with Kidney Disease: Improving Global Outcomes (KDIGO) stage 3 had a high mortality rate (42 and 46%) (Mehta et al., 2015). About 25.7% of AKI hospitalized patients are still unable to recover after active treatment, and even if postdischarge kidney function returns to normal, AKI patients are still at substantially increased risk of developing chronic kidney disease (CKD) (Sawhney et al., 2017).
Despite extensive investigation of AKI in experimental models, the clinical outcome of AKI has not been improved substantially in the last decade (Liu et al., 2017), and the therapeutic measures for AKI are still limited; no therapeutic interventions except dialysis can reliably improve the survival outcome, limit injury, or speed recovery (Bellomo et al., 2012). In recent decades, the experimental mouse ischemia–reperfusion injury (IRI) model of AKI has been widely applied to study the pathogenesis and injury outcome of ischemic AKI (Hu et al., 2013), while the underlying mechanisms of AKI remain largely unclear.
Microarray and high-throughput sequencing have been widely used to explore the mechanisms of a series of diseases (Zhang et al., 2017). At present, a great number of studies have been performed on AKI gene expression profiles (Wei et al., 2010; Liu et al., 2014; Cui et al., 2016; Liu et al., 2017), and these studies have screened many differentially expressed genes (DEGs) that may be involved in the pathogenesis and progression of AKI. However, their results of DEGs are often not the same due to the heterogeneity among each independent experiment. Thus, single-cohort studies still have many limitations, and the integrated analysis of high-throughput data from different gene expression profiles may solve these difficult problems and enable the discovery of some reliable and effective diagnosis and therapeutic molecular markers (Yang et al., 2018).
In the present study, we aimed to obtain deep insights into the mechanisms and find some potential molecular markers for prognosis and early detection, as well as therapeutic drug targets for treating AKI. We perform an integrated analysis of two datasets GSE52004 (Liu et al., 2014) and one original public high-throughput sequencing data GSE98622 (Liu et al., 2017) downloaded from Gene Expression Omnibus (GEO) database1. Five significant modules and 21 hub genes, such as Havcr1, Krt20, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln, were screened. By experiment validation, several crucial genes were identified in AKI, such as Havcr1, Krt20, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln. These genes were very important in the process of AKI, which could be further utilized to explore novel diagnostic and therapeutic strategies.
Materials and Methods
High-Throughput Data and Normalization
In order to identify the key biomarkers in AKI, we used the keywords “AKI, mouse” to search on the GEO DataSets database1 and downloaded the gene expression profiles of GSE52004 (Liu et al., 2014) and GSE98622 (Liu et al., 2017). Both two series contained samples of mouse kidney IRI model of AKI and sham (control) group. We extracted three group sets in GSE52004 and one group set in GSE98622; each group set contained two groups, AKI 24 h group and the sham 24 h group. The detailed information of four group sets extracted from these two series is in Table 1. GSE52004 was based on the Affymetrix GPL6246 platform [(MoGene-1_0-st) Affymetrix Mouse Gene 1.0 ST Array (transcript gene version)] and provided the raw data. The preprocessing of the gene expression profile data, which includes the background correction, the quantile normalization, the median polish summarization, and the log2 transformation, was performed by R software2 and RStudio software3 using the Robust Multichip Average (RMA) algorithmin affy package, which can be downloaded on the Bioconductor website4 (Lin and Lin, 2017). GSE98622 has already provided the RMA-normalized data. The preprocessed data were filtered to include only those probe sets with annotations and export as expression data. Probe set annotation files (∗.transcript.csv) were download on the official website of Affymetrix5. These CSV files contain the design–time data of the high throughput, such as gene symbol and genomic location. We then performed principal component analysis (PCA) by scatterplot3d package and princomp function in R software.
Identification of Differentially Expressed Genes
The linear model for high-throughput data analysis (limma) in Bioconductor was applied to find DEGs by comparing expression value between AKI 24 h samples and sham 24 h samples in the four group sets. Differential expression was calculated using an empirical Bayes model. The criteria for the statistically significant difference of DEGs was | log2 fold change (FC)| ≥ 1 in expression and adjusted p-value (false discovery rate, FDR) < 0.05. Volcano plot of all DEGs was performed by ggplot2 package in R. The overlapping DEGs among four group sets were further analyzed by VennDiagram package in R software.
GO and KEGG Pathway Enrichment Analyses of DEGs
The Database for Annotation, Visualization and Integrated Discovery (DAVID)6 is a gene functional classification tool that integrates a set of functional annotation tools for investigators to analyze biological functions behind massive genes (Huang et al., 2009). GO and KEGG pathway enrichment analyses of 722 DEGs were performed based on DAVID online tool. In the GO enrichment analysis, the categories include the cellular component (CC), the biological process (BP), and the molecular function (MF) terms (Zhang et al., 2017), and adjusted p < 0.05 was regarded as statistically significant differences. In the KEGG pathway enrichment analysis, enriched pathways were identified according to the hyper geometric distribution with an adjusted p < 0.05.
PPI Network Construction and Analysis of Modules
Considering that proteins rarely work alone, it is necessary to study the interactions among proteins. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING)7 is an online biological resource database that is commonly used to identify the interactions between known and predicted proteins (Szklarczyk et al., 2015). By searching the STRING database, the PPI network of the 722 overlapping DEGs were selected with a score > 0.7, and the PPI network was visualized by Cytoscape software (Shannon et al., 2003)8. In the PPI network, each node stands for a gene or a protein, and edges represent the interactions between the nodes. We then used the plug-in named Molecular Complex Detection (MCODE) and ClusterOne (Nepusz et al., 2012) by Cytoscape to filter the modules of the PPI network. The number of nodes > 6 and MCODE scores > 5 were set as the criteria (Lin and Lin, 2017).
Identification of Hub Genes
Hub genes/proteins are a few numbers of the proteins/DEGs (hubs) that have more association with other DEGs/proteins. They are the important nodes in the PPI network. To find the hub genes of the PPI network of overlapping DEGs, Cyto-Hubba (Chin et al., 2014) and cytoNCA (Tang et al., 2015), both java plugin for Cytoscape software, were applied. In this study, we mined the top 21 hub genes by combining the above two methods, The PPI network of these 21 hub genes was constructed and visualized based on GeneMANIA (Franz et al., 2018) plugin in Cytoscape. GO and KEGG pathway enrichment analyses were performed for hub genes. An adjusted p < 0.05 was considered statistically significant.
Animals and Procedures
C57BL/6 mice (20–25 g) were purchased from the Animal Center of Chinese PLA General Hospital. All animal procedures were approved by the Institutional Animal Care and Use Committee at the Chinese PLA General Hospital and Military Medical College. The 12 male mice were randomly assigned to two groups: 6 mice underwent bilateral renal ischemia and reperfusion surgery (the AKI group), and the remaining 6 mice underwent sham surgery (the sham group). Renal ischemia (45 min) and reperfusion and renal sham surgery were performed as described previously (Hu et al., 2013). At 24 h after reperfusion, blood and kidney samples were harvested for further processing.
Assessment of Kidney Injury
Kidney injury was assessed by measuring the levels of serum creatinine and blood urea nitrogen and calculating the changes in levels. Blood samples were collected from the vena cava at the indicated time points, and the serum was separated by centrifugation at 3,000 rpm for 15 min at 4°C and then sent to the PLA General Hospital Biochemistry Department to detect serum Cr and BUN.
A quarter of the kidney was fixed in 4% formaldehyde, dehydrated, and embedded in paraffin. Tissue sections (4 mm) were stained with periodic acid–Schiff (PAS). Histological examinations were performed in a blinded manner for acute tubular necrosis (ATN) scores regarding the grading of tubular necrosis, cast formation, tubular dilation, and loss of brush border as described before (Hu et al., 2013). Fifteen non-overlapping fields (400×) were randomly selected and scored as follows: 0, none; 1, 1–10%; 2, 11–25%; 3, 26–45%; 4, 46–75%; and 5, > 76%.
Quantitative Real-Time PCR
Frozen tissue samples were lysed in TRIzol reagent (Invitrogen, Carlsbad, CA, United States), and total RNA was extracted according to the manufacturer’s instructions. The levels of transcripts were determined by quantitative real-time PCR (qRT-PCR) using TransStartTM Top Green qPCR SuperMix (AQ131, Transgen, Beijing, China) on an Applied Biosystems 7500 system PCR cycler (Applied Biosystems, Foster City, CA, United States). The data were normalized to 18S expression and further normalized to the control group. Primers were obtained from Genomics (BGI Tech, China). All of these primers are listed in Table 2.
Western Blot Analysis
EGR1 antibody (MA5-15008, Invitrogen, Thermo Fisher Scientific, United States), glyceraldehyde 3-phosphate dehydrogenase (GAPDH) mouse monoclonal antibody (AF0006, Beyotime Biotechnology, China), horseradish peroxidase (HRP)-labeled goat antirabbit immunoglobulin G (IgG) (H + L) antibody (A0208, Beyotime Biotechnology, China), and HRP-labeled goat antimouse IgG (H + L) antibody (A0216, Beyotime Biotechnology, China) were used for Western blot analysis. GAPDH served as control. Approximately 30 μg of protein of each mouse was added to validate.
Kidneys were fixed for 18 h in 4% paraformaldehyde (PFA) at 4°C, then incubated 2 h in 30% sucrose in phosphate-buffered saline (PBS) and embedded in OCT. The embedded kidney was cut into 5 μm thick sections and washed with PBS thrice for 5 min, and then permeabilized with 1% Triton X-100 buffer. After washed with PBS thrice for 5 min, the sections were blocked with 10% casein (SP5020, Vector, Burlingame, CA, United States) and incubated with primary antibodies specific for EGR1 (MA5-15008, Invitrogen, Thermo Fisher Scientific, United States) overnight in a moisture chamber and then washed with PBS to remove unbound primary antibody. Next, the sections were incubated with Cy3-labeled goat antirabbit IgG (H + L) antibody (A0516, Beyotime Biotechnology, China) and fluorescein-labeled Lotus Tetragonolobus Lectin (LTL, FL1321, Vector Laboratories, United States) for 60 min at room temperature and washed as described for the primary antibody. Finally, the sections were washed with PBS and added with 4′,6-diamidino-2-phenylindole (DAPI) (ab104139, Abcam, Cambridge, MA, United States).
Preprocessing of High-Throughput Data
Before the next analysis, we performed the quality control for every raw file (group sets A, B, and C) to guarantee the quality of every chip. The box plot results before and after normalization are shown in Supplementary Figure S1. We did not perform the box plot for group set D since GSE98622 has already provided the RMA-normalized data. The preprocessed data were filtered to include only those probe sets with annotations and export as expression data. Next, we applied PCA to the AKI 24 h group and sham-surgery 24 h group for all four group sets (Figure 1A). All biological replicates for eight groups (different shapes and colors) showed good clustering, indicative of a high degree of similarity in each group. We can also find tight clustering of biological replicates and distinct clustering between AKI 24 h and sham 24 h conditions (different colors) in all four group sets.
Figure 1. Principle component analysis (PCA) plot and differential expression data between AKI 24 h and sham 24 h groups in all four GroupSets. (A) PCA plot shows the tight clustering of biological replicates and distinct clustering between AKI 24 h and sham 24 h conditions in all four GroupSets. Differential expression data between AKI 24 h and sham 24 h groups in (B) GroupSet A data, (C) GroupSet B data, (D) GroupSet C data, and (E) GroupSet D data. The criteria for statistically significant difference of DEGs was adjusted p < 0.05 and |log2 fold change (FC)| ≥ 1 in expression.
Identification of DEGs
A total of 1,380 DEGs were obtained from the group set A (Supplementary Table S1A); among them, 646 genes were upregulated, and 734 genes were downregulated. Overall, 1,902 DEGs were screened from the group set B (Supplementary Table S1B); among them, 605 genes were upregulated, and 1,297 genes were downregulated. In addition, 2,033 DEGs were identified from the group set C (Supplementary Table S1C), including 951 upregulated genes and 1,082 downregulated genes. What is more, 4,603 DEGs were screened from the group set D (Supplementary Table S1D), including 2,973 upregulated genes and 1,630 downregulated genes. The differential expression of multiple genes in each of the four group sets is shown in Figures 1B–E. The cluster heatmap results of the top 50 upregulated and 50 downregulated DEGs are shown in Supplementary Figure S2. We then performed overlapping DEG analysis among the top 50 upregulated DEGs of four group sets; the result indicated that Krt20, Serpine1, Sprr2f, and Timp1 showed identical expression trends in all four group sets, and Fosl1, Gsta1, Havcr1, Plaur, Runx1, Slc7a11, Sox9, and Sprr2g were differentially expressed in at least three group sets. We also performed overlapping DEGs analysis among all the DEGs of these four group sets by VennDiagram package (Figure 2); among them, 245 DEGs showed identical expression trends in all four group sets, including 157 upregulated genes and 88 downregulated genes (Supplementary Table S2), and 722 DEGs were differentially expressed in at least three group sets.
Figure 2. Identification of overlapping DEGs in all four GroupSets. Each colored ellipse in the Venn diagram represents a GroupSet, and the number stands for the number of DEGs in each cross area.
GO and KEGG Pathway Enrichment Analyses of 722 DEGs
All 722 overlapping DEGs that were differentially expressed in at least three group sets were uploaded into the online software DAVID to evaluate the GO and KEGG pathways. GO analysis of DEGs was divided into three functional groups, including biological processes, cellular component composition, and molecular function. Significant results of the GO and KEGG pathway enrichment analyses of DEGs are shown in Figure 3. In the biological process group, 722 DEGs were mainly enriched in the positive regulation of cell proliferation, oxidation–reduction process, positive regulation of fibroblast proliferation, positive regulation of cell migration, metabolic process, and positive regulation of gene expression. In the cellular component group, 722 DEGs were mainly enriched in the extracellular exosome, mitochondrion, cell surface, focal adhesion, extracellular space, and peroxisome. In the molecular function group, 722 DEGs were mainly enriched in the oxidoreductase activity, catalytic activity, flavin adenine dinucleotide binding, lyase activity, integrin binding, and receptor binding. The KEGG pathway results revealed that the DEGs are significantly enriched in metabolic pathways; tumor necrosis factor (TNF) signaling pathway; valine, leucine, and isoleucine degradation; MAPK signaling pathway; glycine, serine, and threonine metabolism; and complement and coagulation cascades (Supplementary Table S3).
Figure 3. The GO and KEGG pathway enrichment analysis of 722 overlapping DEGs (top 6 were listed). Four different colors separately represent four different categories: biological process, cellular component, molecular function, and KEGG pathway. The rectangle represents−log10 (adjusted p-value); the line and point represent gene count. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
PPI Network Construction and Analysis of Modules
Based on the STRING database, the PPI network of the 722 overlapping DEGs was selected; 312 nodes and 585 edges were established in the PPI network with score > 0.7 (Figure 4). After removing the disconnected nodes in Cytoscape software, a total of 251 nodes and 527 edges were analyzed. Five modules met the criteria of MCODE score > 5 and the number of nodes > 6; the results are shown in Figure 5. All of these modules got a p < 0.01, and module 1 was still the most significant in ClusterOne plug-in. KEGG pathway of the five modules are listed in Supplementary Table S4. The results showed that the DEGs included in these five modules were mainly associated with metabolic pathways (including 6 genes from module 4 and 10 genes from module 5).
Figure 4. Protein–protein interaction network of the 722 overlapping DEGs with a score >0.7. Disconnected nodes were hiding in the network. Each node stands for a gene or a protein, and edges represent the interactions between the nodes.
Figure 5. The PPI network of five significant modules selected by MCODE. (A) Module 1, (B) module 2, (C) module 3, (D) module 4, and (E) module 5. Red nodes represent upregulated genes; blue nodes stand for downregulated genes.
Identification of Hub Genes
We mined the top 21 hub proteins by combining Cyto-Hubba and cytoNCA. These hub genes included tissue inhibitor of metalloproteinase 1 (Timp1), Quiescin Q6 sulfhydryl oxidase 1 (Qsox1), transforming growth factor beta 1 (Tgfb1), platelet-derived growth factor, B polypeptide (Pdgfb), protein S alpha (Pros1), matrix metallopeptidase 3 (Mmp3), fibrinogen alpha chain (Fga), fibrinogen gamma chain (Fgg), clusterin (Clu), serine peptidase inhibitor (Serpine1), coagulation factor II receptor (F2r), connective tissue growth factor (Ctgf), endothelin1 (Edn1), calcium-sensing receptor (Casr), annexinA1 (Anxa1), angiotensinogen (Agt), prostaglandin-endoperoxide synthase 2 (Ptgs2), myelocytomatosis oncogene (Myc), Fos-like antigen 1 (Fosl1), FBJ osteosarcoma oncogene (Fos), and Early growth response 1 (Egr1). All the hub genes were analyzed for the GO and KEGG pathway enrichment analyses (Table 3). The PPI network of these 21 hub genes was constructed and visualized based on the GeneMANIA plugin in Cytoscape (Figure 6). The network revealed that there were strong interactions among these hub genes, and their interactions may have an effect on the AKI pathophysiological process.
Figure 6. PPI network of 21 hub genes. Red nodes represent upregulated genes, blue nodes stand for downregulated genes, and green nodes represent the linker genes.
Validation of 21 Hub Genes and 14 Other Important Upregulated Genes
To confirm the key genes identified above, we performed IRI surgery in mice. The results of SCr, BUN, and ATN scores, and PAS staining in 24 h after IRI or sham surgery are shown in Figure 7. The qRT-PCR analysis was performed to validate the expression of 35 key genes including 21 hub genes and 14 other important upregulated genes in our own AKI samples (Figure 8A). Through validation, 31 genes were validated as significantly upregulated, 3 genes were significantly downregulated, and 1 gene (Qsox1) was found not significant in our own AKI samples. Krt20 was identified as the top 1 upregulated genes with a log2 (fold change) larger than 10 in all these 35 genes. Those results were consistent with the DEG analysis results, except Clu and Qsox1. Additionally, we validated the Egr1 gene as an upregulated gene not only in RNA level (Figure 8B) but also in protein level by Western blot (Figure 8D) and immunofluorescence analyses (Figure 8E); the result was also consistent with the above analysis. In conclusion, our results indicated that these key genes were significantly differentially expressed in AKI.
Figure 7. The results of SCr, BUN, and ATN scores, and PAS staining in 24 h after IRI or sham surgery. The results of (A) SCr, (B) BUN, (C) ATN scores, and (D) PAS staining in AKI and sham groups. ***p < 0.001. SCr, serum creatinine; BUN, blood urea nitrogen; ATN, tubular necrosis; PAS, Periodic acid–Schiff.
Figure 8. Expression of 35 key genes and EGR1 in AKI samples and sham samples. (A) Transcriptional levels of 35 important genes including 21 hub genes and 14 other important upregulated genes in our own AKI samples. (B) Translational levels of Egr1 in AKI samples and sham samples. (C) Western blot of EGR1. (D,E) Immunofluorescence of EGR1. *p < 0.05, **p < 0.01, ***p < 0.001.
AKI is a global public health concern associated with high morbidity, mortality, and health-care costs (Zuk and Bonventre, 2016), and the therapeutic measures for AKI are still limited (Bellomo et al., 2012). Therefore, it is very necessary to explore the molecular mechanisms and develop available treatment strategies for AKI. High throughput and high-throughput sequencing technologies have been widely used to explore potential targets for AKI diagnosis and treatment. At present, many basic research papers on the mechanisms underlying the occurrence of AKI have been reported (Kumar et al., 2015; Brossa et al., 2018), but the treatment remains very limited; one of the important reason is that the results of most studies were generated from a single-cohort study or focused on merely a single event.
In this study, we integrated four group sets that contain the AKI 24 h group and sham 24 h group data, and used R software and bioinformatics to deeply analyze these datasets. At first, we
preprocessed the raw data; the box plot results showed that these data have a good quality after preprocessing (Supplementary Figure S1). PCA results showed that all of the biological replicates for the eight groups showed well clustering, which indicates a high degree of similarity in each group (Figure 1A). Volcano plots (Figures 1B–E) showed the differentially expressed genes between AKI 24 h and sham 24 h groups in four group sets. The cluster heatmaps (Supplementary Figure S2) showed the top 50 upregulated and 50 downregulated DEGs in four group sets, which indicated that the AKI and sham groups could be distinguished by the upregulated and downregulated genes.
We then performed overlapping DEGs analysis among the top 50 upregulated DEGs of each four group sets; the result indicated that Krt20, Serpine1, Sprr2f, and Timp1 showed identical expression trends in all four group sets; Fosl1, Gsta1, Havcr1, Plaur, Runx1, Slc7a11, Sox9, and Sprr2g were differentially expressed in at least three group sets. These commonly upregulated genes may play an important role in the pathology and progress of AKI.
Furthermore, we performed integrated analysis among all the DEGs of these four group sets by VennDiagram package (Figure 2); a set of 722 overlapping DEGs that were differentially expressed in at least three group sets were identified. We then performed GO and KEGG pathway enrichment analyses of these 722 overlapping DEGs. As shown in Figure 3, many genes involved in the cell proliferation, fibroblast proliferation, positive regulation of gene expression, oxidoreductase activity, metabolic pathways, and TNF signaling pathway were activated after AKI; these results were confirmed with the pathology of AKI (Bellomo et al., 2012).
In addition, we performed a PPI network of these 722 overlapping DEGs with a score > 0.7 (Figure 4) and then mined five significant modules (Figure 5) and the top 21 hub proteins in the PPI network (Figure 6). The top 2 modules were module 1, which included 10 genes, and module 2, which included 11 genes. Nine of the top 21 hub genes, Timp1, Fgg, Serpine1, Edn1, Ctgf, Ptgs2, Myc, Fosl1, and Egr1, appeared once or more in the top 50 upregulated DEGs of the four group sets. Twelve of the top 21 hub genes, Timp1, Tgfb1, Qsox1, Pros1, Pdgfb, Fgg, Fga, Clu, F2r, Edn1, and Casr, Anxa1, also appeared in the top 2 modules, indicating the importance of these genes. In the GO and KEGG pathway enrichment analyses of the top 21 hub genes, 11 genes were enriched in positive regulation of cell proliferation.
To confirm the key genes identified above, we performed IRI surgery in mice. The qRT-PCR analysis of 35 key genes including 21 hub genes and 14 other important upregulated genes in our own AKI samples (Figure 8A) showed that 31 genes were validated as significantly upregulated, 3 genes were significantly downregulated, and Krt20 were identified as the top 1 upregulated genes with a log2 (fold change) larger than 10 in all these 35 genes. Gehrke et al. (2019) reported that Egr1 was a pioneering gene that can directly regulate other early wound-induced genes. Then, we focused on Egr1, to validate its expression not only in RNA level (Figure 8B) but also in protein level by Western Blot (Figure 8D) and immunofluorescence analyses (Figure 8E). These results showed that Egr1 was significantly upregulated in AKI 24 h kidney samples. In conclusion, our results indicated that these key genes were significantly differentially expressed in AKI.
As validated by our own AKI samples, these 21 hub genes and 14 other important upregulated genes, such as Havcr1, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln, are very important in the process of AKI, which could be used to explore some new diagnostic and therapeutic strategies. For example, Havcr1, which is also known as kidney injury molecule 1 (Kim-1), is a type 1 membrane protein. It was widely known as a biomarker for tubular injury and repair processes (Ichimura et al., 2004; Lim et al., 2013). Sex-determining region Y-box 9 (Sox9) played an important role in the development of many organ systems and cell types (Reginensi et al., 2011). Kumar et al. (2015) identified that Sox9 activation acts as a quick response to AKI within recovering cells of the damaged segment of the proximal tubule, and also Sox9 was required for the physiological repairing process. Kang et al. (2016) identified Sox9 as a cell marker of segment-specific progenitor in the recovering kidney, and cells that express Sox9 proliferated and became the loop of Henle, distal tubule, and proximal tubule after AKI. Ma et al. (2018) reported that Sox9 + cells had a proliferative capacity and could regenerate epithelial cells in the proximal tubule in a mouse unilateral partial nephrectomy model, and Sox9 activation was involved with Notch signaling pathway. Therefore, Sox9 plays a pivotal role in AKI. Timp1, which is a representative member of the TIMP family, is expressed in several tissues that act as an inhibitor of matrix metalloproteinases. The main biological function of TIMPs is to inhibit MMPs by forming a 1:1 enzyme-inhibitor complex (Guo et al., 2016). Bojic et al. (2015) reported that Timp1 was a good diagnostic biomarker of sepsis-associated AKI and may involve in the pathogenesis of sepsis and septic shock. Uehara et al. (2013) successfully identified Timp1 as a candidate genomic biomarker for the early detection of acute papillary injuries in rats.
Serpine1, also named as plasminogen activator inhibitor type-1 (PAI-1), is a very strong proteolytic enzyme. Serpine1 is regulated by various signals, including many growth factors, glucose, hormones, cytokines, catecholamine, and hypoxia (Kim et al., 2013). Serpine1 is produced in little amounts in healthy kidneys but is widely expressed in both AKI and CKD kidneys (Eddy and Fogo, 2006). Increased Serpine1 expression resulted in the accumulation of extracellular matrix, while inhibition of Serpine1 activity increased matrix turnover and could reduce glomerulosclerosis (Malgorzewicz et al., 2013). These results indicated that Serpine1 regulation would be a good strategy to treat ischemia–reperfusion (I/R)-induced inflammatory renal injury. Edn1, the predominant isoform of the endothelin peptide family, also known as endothelin-1 (ET-1), acts through the receptors type A (ETRA) and B (ETRB) (Zanatta et al., 2015). Studies showed that ET-1 was increased in renal tissue in AKI (Ong et al., 2015). Intervention with ETRA antagonists can slow down and prevent the progression from AKI to CKD (Zager et al., 2013) and provide dramatic protection. Kurata et al. (2004) reported that nitric oxide has protective effects against AKI through the suppression of ET-1 overproduction in the postischemic kidney. These results suggested that Edn1 was a potential therapeutic biomarker. The regulation of Edn1 would be a new strategy for the treatment of AKI. Apln, which belongs to the G-protein-coupled receptor family, also known as Apelin, is an endogenous ligand of the specific receptor APJ (Antushevich and Wojcik, 2018). Chen et al. (2015) reported that overexpression of Apelin significantly inhibited apoptosis, hypoxia-induced TGF-β1, to protect against kidney IRI. Apelin can also activate the p-ERK, p-AKT, and Sirt3, and protected against ischemic injury by activating the PI3K/AKT axis (He et al., 2015). Day et al. (2013) reported that apelin can prevent diabetic nephropathy’s progression. Pan et al. (2010) reported that apelin decreases the inflammatory response and cardiac deterioration in the sepsis model. Apelin could also prevent IRI kidney damage and decreased BUN and creatinine levels (Bircan et al., 2016). These results suggested that Apln was a potential therapeutic biomarker.
At present, a great number of studies have been performed on AKI gene expression profiles (Wei et al., 2010; Liu et al., 2014, 2017; Cui et al., 2016), but their results of DEGs are often not the same due to the heterogeneity among each independent experiment. In the present study, we perform an integrated analysis of high-throughput data from different gene expression profiles and validated the expression of 35 key genes including 21 hub genes and 14 other important upregulated genes in our own AKI samples. Those key genes may be potential molecular markers for prognosis and early detection, as well as therapeutic drug targets for treating AKI.
However, there were still some limitations in our present study. First, there is lack human kidney tissue sample data, which may result in not full applicability in human. Second, more high-throughput microarray and high-throughput sequencing need to be included to get more robust results. Third, the above results, such as the gene expression level and gene function, should be validated by further experiments.
In conclusion, by integrated analysis of different high-throughput data and validation, we identified several crucial genes in AKI, such as Havcr1, Krt20, Sox9, Egr1, Timp1, Serpine1, Edn1, and Apln; they might have cofunctions, and their dysregulations might alter activities of pathways such as positive regulation of cell proliferation, complement and coagulation cascades, and TNF signaling, which might finally promote AKI progression. This study is of great value for the prediction of key regulators in the process of AKI and the identification of potential diagnostic and therapeutic biomarkers for AKI.
Data Availability Statement
All of the original high-throughput data can be public achieved at the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), and other data supporting the study is within the paper. The links to all databases and software used in this study are as follows: Affymetrix official website (http://www.affymetrix.com/support/technical/annotationfilesmain.affx), Bioconductor (http://www.bioconductor.org/), Cytoscape software (http://www.cytoscape.org/), version 3_6_1 for windows_64 bit, Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.ncifcrf.gov/), GSE52004 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52004), GSE98622 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98622), R software (https://www.r-project.org/), version 3.5.1 for windows_64 bit, R Studio (https://www.rstudio.com/), version 1.1.456 for windows_64bit, Affy package, version 1.50.0, Limma package, version 3.36.3, Ggolot2 package, version 3.0.0, Scatterplot3d package, version 0.3-41, VennDiagram package, version 1.6.20, Search Tool for the Retrieval of Interacting Genes//Proteins (STRING, http://string-db.org/).
XC designed the experiments. JC and YC carried out the study, collected all the important information, and completed the manuscript. AO edited the language of the manuscript. All authors read and approved the final manuscript.
This work was supported by grants from the National Key Research and Development Program of China (Grant No. 2017YFA0103203) and the National Natural Science Foundation of China (Grant No. 81770664).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.00411/full#supplementary-material
FIGURE S1 | The box plots before and after normalization of gene expression. (A) The standardization of GroupSet A data, (B) the standardization of GroupSet B data, and (C) the standardization of GroupSet C data. The abscissa is the sample, and the ordinate is the Expression Value. The normalization process normalized the signal strength of all chips to an interval with similar distribution characteristics.
FIGURE S2 | Hierarchical clustering map of the top 50 up-regulated DEGs and the top 50 down-regulated DEGs in four GroupSets. (A) GroupSet A data, (B) GroupSet B data, (C) GroupSet C data, and (D) GroupSet D data. The red rectangle represents up-regulated genes; the green rectangle represents down-regulated genes. Each row represents a single gene; each column represents a tissue sample. DEGs, differentially expressed genes.
TABLE S1 | All of the differential expression gene information including logFC and FDR in all four GroupSets.
TABLE S2 | Genes that were differentially expressed in all four GroupSets in AKI by integrated analysis of high-throughputs.
TABLE S3 | GO and KEGG enrichment analysis of genes that were differentially expressed in ≥ 3 GroupSets (top 6 significantly enriched terms were listed).
TABLE S4 | The KEGG pathway of five significant modules selected by MCODE.
Agt, angiotensinogen; AKI, acute kidney injury; Anxa1, annexin A1; ATN, tubular necrosis; BP, biological process; BUN, blood urea nitrogen; Casr, calcium-sensing receptor; CC, cellular component; CKD, chronic kidney disease; Clu, clusterin; Ctgf, connective tissue growth factor; DAVID, Database for Annotation, Visualization and Integrated Discovery; DEGs, differentially expressed genes; Edn1, endothelin 1; Egr1, early growth response 1; ET-1, endothelin-1; F2r, coagulation factor II receptor; FC, fold change; Fga, fibrinogen alpha chain; Fgg, fibrinogen gamma chain; Fos, FBJ osteosarcoma oncogene; Fosl1, Fos-like antigen 1; GEO, Gene Expression Omnibus; GO, Gene Ontology; IRI, ischemia–reperfusion injury; KDIGO, Kidney Disease: Improving Global Outcomes; KEGG, Kyoto Encyclopedia of Genes and Genomes; Kim-1, kidney injury molecule 1; MCODE, Molecular Complex Detection; MF, molecular function; Mmp3, matrix metallopeptidase 3; Myc, myelocytomatosis oncogene; PAI-1, plasminogen activator inhibitor type-1; PAS, periodic acid–Schiff; PCA, principal component analysis; Pdgfb, platelet-derived growth factor B polypeptide; PPI, protein–protein interaction; Pros1, protein S alpha; Ptgs2, prostaglandin-endoperoxide synthase 2; Qsox1, quiescin Q6 sulfhydryl oxidase 1; RMA, robust multichip average; SCr, serum creatinine; Serpine1, serine peptidase inhibitor; Sox9, sex determining region Y-box 9; STRING, Search Tool for the Retrieval of Interacting Genes/Proteins; Tgfb1, transforming growth factor beta 1; Timp1, tissue inhibitor of metalloproteinase 1.
- ^ https://www.ncbi.nlm.nih.gov/geo/
- ^ https://www.r-project.org/
- ^ https://www.rstudio.com/
- ^ http://www.bioconductor.org/
- ^ http://www.affymetrix.com/support/technical/annotationfilesmain.affx
- ^ http://david.ncifcrf.gov/
- ^ http://string-db.org/
- ^ http://www.cytoscape.org/
Bircan, B., Cakir, M., Kirbag, S., and Gul, H. F. (2016). Effect of apelin hormone on renal ischemia/reperfusion induced oxidative damage in rats. Ren. Fail. 38, 1122–1128. doi: 10.1080/0886022X.2016.1184957
Bojic, S., Kotur-Stevuljevic, J., Kalezic, N., Stevanovic, P., Jelic-Ivanovic, Z., Bilanovic, D., et al. (2015). Diagnostic value of matrix metalloproteinase-9 and tissue inhibitor of matrix metalloproteinase-1 in sepsis-associated acute kidney injury. Tohoku J. Exp. Med. 237, 103–109. doi: 10.1620/tjem.237.103
Brossa, A., Papadimitriou, E., Collino, F., Incarnato, D., Oliviero, S., Camussi, G., et al. (2018). Role of CD133 molecule in Wnt response and renal repair. Stem Cells Transl. Med. 7, 283–294. doi: 10.1002/sctm.17-0158
Chen, H., Wan, D., Wang, L., Peng, A., Xiao, H., Petersen, R. B., et al. (2015). Apelin protects against acute renal injury by inhibiting TGF-beta1. Biochim. Biophys. Acta 1852, 1278–1287. doi: 10.1016/j.bbadis.2015.02.013
Chin, C. H., Chen, S. H., Wu, H. H., Ho, C. W., Ko, M. T., and Lin, C. Y. (2014). cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8(Suppl. 4):S11. doi: 10.1186/1752-0509-8-S4-S11
Gehrke, A. R., Neverett, E., Luo, Y. J., Brandt, A., Ricci, L., Hulett, R. E., et al. (2019). Acoel genome reveals the regulatory landscape of whole-body regeneration. Science 363:eaau6173. doi: 10.1126/science.aau6173
Guo, J., Guan, Q., Liu, X., Wang, H., Gleave, M. E., Nguan, C. Y., et al. (2016). Relationship of clusterin with renal inflammation and fibrosis after the recovery phase of ischemia-reperfusion injury. BMC Nephrol. 17:133. doi: 10.1186/s12882-016-0348-x
Hu, J., Zhang, L., Wang, N., Ding, R., Cui, S., Zhu, F., et al. (2013). Mesenchymal stem cells attenuate ischemic acute kidney injury by inducing regulatory T cells through splenocyte interactions. Kidney Int. 84, 521–531. doi: 10.1038/ki.2013.562
Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57. doi: 10.1038/nprot.2008.211
Ichimura, T., Hung, C. C., Yang, S. A., Stevens, J. L., and Bonventre, J. V. (2004). Kidney injury molecule-1: a tissue and urinary biomarker for nephrotoxicant-induced renal injury. Am. J. Physiol. Renal. Physiol. 286, F552–F563. doi: 10.1152/ajprenal.00285.2002
Kang, H. M., Huang, S., Reidy, K., Han, S. H., Chinga, F., and Susztak, K. (2016). Sox9-positive progenitor cells play a key role in renal tubule epithelial regeneration in mice. Cell Rep. 14, 861–871. doi: 10.1016/j.celrep.2015.12.071
Kim, J. I., Jung, K. J., Jang, H. S., and Park, K. M. (2013). Gender-specific role of HDAC11 in kidney ischemia- and reperfusion-induced PAI-1 expression and injury. Am. J. Physiol. Renal. Physiol. 305, F61–F70. doi: 10.1152/ajprenal.00015.2013
Kumar, S., Liu, J., Pang, P., Krautzberger, A. M., Reginensi, A., Akiyama, H., et al. (2015). Sox9 activation highlights a cellular pathway of renal repair in the acutely injured mammalian kidney. Cell Rep. 12, 1325–1338. doi: 10.1016/j.celrep.2015.07.034
Kurata, H., Takaoka, M., Kubo, Y., Katayama, T., Tsutsui, H., Takayama, J., et al. (2004). Nitric oxide protects against ischemic acute renal failure through the suppression of renal endothelin-1 overproduction. J. Cardiovasc. Pharmacol. 44(Suppl. 1), S455–S458. doi: 10.1097/01.fjc.0000166314.38258.a8
Lim, A. I., Tang, S. C., Lai, K. N., and Leung, J. C. (2013). Kidney injury molecule-1: more than just an injury marker of tubular epithelial cells? J. Cell. Physiol. 228, 917–924. doi: 10.1002/jcp.24267
Lin, Z., and Lin, Y. (2017). Identification of potential crucial genes associated with steroid-induced necrosis of femoral head based on gene expression profile. Gene 627, 322–326. doi: 10.1016/j.gene.2017.05.026
Liu, J., Krautzberger, A. M., Sui, S. H., Hofmann, O. M., Chen, Y., Baetscher, M., et al. (2014). Cell-specific translational profiling in acute kidney injury. J. Clin. Invest. 124, 1242–1254. doi: 10.1172/JCI72126
Liu, J., Kumar, S., Dolzhenko, E., Alvarado, G. F., Guo, J., Lu, C., et al. (2017). Molecular characterization of the transition from acute to chronic kidney injury following ischemia/reperfusion. JCI Insight 2:e94716. doi: 10.1172/jci.insight.94716
Mehta, R. L., Cerda, J., Burdmann, E. A., Tonelli, M., Garcia-Garcia, G., Jha, V., et al. (2015). International society of nephrology’s 0by25 initiative for acute kidney injury (zero preventable deaths by 2025): a human rights case for nephrology. Lancet 385, 2616–2643. doi: 10.1016/S0140-6736(15)60126-X
Reginensi, A., Clarkson, M., Neirijnck, Y., Lu, B., Ohyama, T., Groves, A. K., et al. (2011). SOX9 controls epithelial branching by activating RET effector genes during kidney development. Hum. Mol. Genet. 20, 1143–1153. doi: 10.1093/hmg/ddq558
Sawhney, S., Marks, A., Fluck, N., Levin, A., McLernon, D., Prescott, G., et al. (2017). Post-discharge kidney function is associated with subsequent ten-year renal progression risk among survivors of acute kidney injury. Kidney Int. 92, 440–452. doi: 10.1016/j.kint.2017.02.019
Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., et al. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504. doi: 10.1101/gr.1239303
Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., et al. (2015). STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452. doi: 10.1093/nar/gku1003
Tang, Y., Li, M., Wang, J., Pan, Y., and Wu, F. X. (2015). CytoNCA: a cytoscape plugin for centrality analysis and evaluation of protein interaction networks. Biosystems 127, 67–72. doi: 10.1016/j.biosystems.2014.11.005
Uehara, T., Kondo, C., Morikawa, Y., Hanafusa, H., Ueda, S., Minowa, Y., et al. (2013). Toxicogenomic biomarkers for renal papillary injury in rats. Toxicology 303, 1–8. doi: 10.1016/j.tox.2012.09.012
Wei, Q., Bhatt, K., He, H. Z., Mi, Q. S., Haase, V. H., and Dong, Z. (2010). Targeted deletion of Dicer from proximal tubules protects against renal ischemia-reperfusion injury. J. Am. Soc. Nephrol. 21, 756–761. doi: 10.1681/ASN.2009070718
Yang, X., Zhu, S., Li, L., Zhang, L., Xian, S., Wang, Y., et al. (2018). Identification of differentially expressed genes and signaling pathways in ovarian cancer by integrated bioinformatics analysis. Onco. Targets Ther. 11, 1457–1474. doi: 10.2147/OTT.S152238
Zager, R. A., Johnson, A. C., Andress, D., and Becker, K. (2013). Progressive endothelin-1 gene activation initiates chronic/end-stage renal disease following experimental ischemic/reperfusion injury. Kidney Int. 84, 703–712. doi: 10.1038/ki.2013.157
Zanatta, C. M., Crispim, D., Sortica, D. A., Klassmann, L. P., Gross, J. L., Gerchman, F., et al. (2015). Endothelin-1 gene polymorphisms and diabetic kidney disease in patients with type 2 diabetes mellitus. Diabetol. Metab. Syndr. 7:103. doi: 10.1186/s13098-015-0093-5
Zhang, L., Huang, Y., Zhuo, W., Zhu, Y., Zhu, B., and Chen, Z. (2017). Identification and characterization of biomarkers and their functions for Lapatinib-resistant breast cancer. Med. Oncol. 34:89. doi: 10.1007/s12032-017-0953-y
Keywords: acute kidney injury, microarray, high-throughput sequencing, integrated analysis, differentially expressed gene, interaction network analysis, validation
Citation: Chen J, Chen Y, Olivero A and Chen X (2020) Identification and Validation of Potential Biomarkers and Their Functions in Acute Kidney Injury. Front. Genet. 11:411. doi: 10.3389/fgene.2020.00411
Received: 10 October 2019; Accepted: 31 March 2020;
Published: 12 May 2020.
Edited by:Juan Caballero, Universidad Autónoma de Querétaro, Mexico
Reviewed by:Cun Liu, Shandong University of Traditional Chinese Medicine, China
Jiarui Wu, Beijing University of Chinese Medicine, China
Liping Sun, The First Affiliated Hospital of China Medical University, China
Copyright © 2020 Chen, Chen, Olivero and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xiangmei Chen, email@example.com
†These authors have contributed equally to this work