Genetic and Epigenetic Studies in Diabetic Kidney Disease

Chronic kidney disease is a worldwide health crisis, and diabetic kidney disease (DKD) has become the leading cause of end-stage renal disease (ESRD). DKD is a microvascular complication that occurs in 30–40% of diabetes patients. Epidemiological investigations and clinical observations of familial clustering and heritability in DKD have highlighted an underlying genetic susceptibility. Furthermore, DKD is a progressive, long-term diabetic complication in which epigenetic effects and environmental factors interact with an individual’s genetic background. In recent years, researchers have undertaken genetic and epigenetic studies of DKD in order to better understand its molecular mechanisms. In this review, the clinical material, research approaches and experimental designs that have been used for genetic and epigenetic studies of DKD are described. Current information from genetic and epigenetic studies of DKD and ESRD in patients with diabetes, including genome-wide association studies (GWAS), epigenome-wide association studies (EWAS) and candidate gene association analyses, is summarized. Further investigation of molecular defects in DKD with new approaches such as next-generation sequencing analysis and phenome-wide association study (PheWAS) is also discussed.


INTRODUCTION
Diabetes is a major public health problem that is approaching epidemic proportions globally. According to the latest report from the IDF, the number of people with diabetes will increase from 425 million in 2017 to 629 million by 2045 (IDF, 2017; http://www.diabetesatlas.org/). Diabetic kidney disease (DKD, previously termed diabetic nephropathy, DN) is a microvascular complication that progresses gradually over many years in approximately 30-40% of individuals with T1D or T2D (Harjutsalo and Groop, 2014; Thomas et al., 2015; Barrett et al., 2017). DKD is now the main cause of chronic kidney disease (CKD) worldwide and the leading cause of end-stage renal disease (ESRD) requiring renal replacement therapy (dialysis or transplantation). The presence of CKD is the single strongest predictor of mortality for persons with diabetes (Dousdampanis et al., 2016; Papadopoulou-Marketou et al., 2017). Pathological findings in DKD include glomerular hypertrophy, mesangial matrix expansion, reduced podocyte number, glomerulosclerosis, tubular atrophy and tubulointerstitial fibrosis. The clinical criterion used to diagnose DKD is a urine ACR higher than 300 mg/g, while microalbuminuria is diagnosed when the ACR is between 30 and 300 mg/g (Bouhairie and McGill, 2016).
Abbreviations: ACR, albumin-to-creatinine ratio; ADA, American Diabetes Association; BMI, body mass index; CNV, copy number variant; DKD, diabetic kidney disease; ESRD, end-stage renal disease; EWAS, epigenome-wide association study; GFR, glomerular filtration rate; GWAS, genome-wide association study; IDF, International Diabetes Federation; IHME, Institute for Health Metrics and Evaluation; LD, linkage disequilibrium; PheWAS, phenome-wide association study; SNP, single nucleotide polymorphism; T1D, type 1 diabetes; T2D, type 2 diabetes; UAE, urinary albumin excretion.
Accumulating evidence has indicated that podocyte loss and epithelial dysfunction play important roles in DKD pathogenesis, with further progression associated with inflammation, but the exact molecular mechanisms responsible for DKD are not fully known (Badal and Danesh, 2014; Reidy et al., 2014; Gnudi et al., 2016).
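The ACR cut-offs quoted earlier in this Introduction (higher than 300 mg/g for DKD, 30-300 mg/g for microalbuminuria) can be written as a small classification rule. The following is only a minimal sketch: the function name, the label used for values below 30 mg/g and the handling of the exact boundary values are assumptions made for illustration, not part of the cited clinical criteria.

```python
def classify_acr(acr_mg_per_g: float) -> str:
    """Classify a urine albumin-to-creatinine ratio (mg/g).

    Cut-offs follow the text: > 300 mg/g for DKD, 30-300 mg/g for
    microalbuminuria. Treating exactly 30 and exactly 300 mg/g as
    microalbuminuria is an assumption of this sketch.
    """
    if acr_mg_per_g < 0:
        raise ValueError("ACR cannot be negative")
    if acr_mg_per_g > 300:
        return "DKD"
    if acr_mg_per_g >= 30:
        return "microalbuminuria"
    # Hypothetical label for values under the microalbuminuria range.
    return "below microalbuminuria range"


if __name__ == "__main__":
    for value in (12.0, 85.0, 420.0):  # illustrative values only
        print(value, "mg/g ->", classify_acr(value))
```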
Both clinical and epidemiological studies have demonstrated that there is familial aggregation of DKD in different ethnic groups, indicating that genetic factors contribute to development of the disease. Furthermore, genetic risk factors in DKD interact with environmental factors (for example, lifestyle, diet and medication) (Freedman et al., 2007a; Murea et al., 2012; Thomas et al., 2012; Kato and Natarajan, 2014). Figure 1 is a schematic diagram representing the relationship between the genetic, epigenetic and environmental factors that are involved in the development and progression of DKD. Genetic studies of DKD are mainly focused on association analyses between genomic DNA variation (for example, single nucleotide polymorphisms, SNPs; copy number variants, CNVs; and microsatellites) and clinical phenotypes of the disease (Freedman et al., 2007a; Gu and Brismar, 2012; Thomas et al., 2012; Florez, 2016). Epigenetic studies of DKD examine potentially heritable changes in gene expression that occur without variation in the original DNA nucleotide sequence (Villeneuve and Natarajan, 2010; Kato and Natarajan, 2014; Thomas, 2016; Keating et al., 2018). Therefore, epigenetic studies of DKD may help explain how environmental factors modify the expression of genes that are involved in DKD progression. Combined genetic, epigenetic and phenotypic studies may generate information for understanding new pathogenic pathways and for finding new biomarkers for early diagnosis and prediction as part of prevention programs in DKD. The results may also be useful in finding novel targets for the treatment of DKD.
SNPs are the most common form of genomic DNA variation. The updated dbSNP database, with more than 500 million reference SNPs (rs) and allele frequency data, has provided fundamental information for genetic studies of complex diseases, including DKD. Genetic studies in DKD have implicated previously unsuspected biological pathways and thereby improved our understanding of the genetic basis of the disease. For most common traits studied in DKD, however, the identified genes and their SNPs explain only a fraction of the associated risk, suggesting that human genomic DNA variation accounts for only part of the underlying susceptibility to DKD. This has led to growing interest in epigenetics to help explain some of the missing heritability of DKD. Epigenetic mechanisms mainly consist of DNA methylation, chromosome histone modification and noncoding RNA (ncRNA) regulation (Kato and Natarajan, 2014; Allis and Jenuwein, 2016). Epigenetics-related ncRNAs include miRNA, siRNA, piRNA, and lncRNA (Holoch and Moazed, 2015).
There are more than 30,000 identified CpG islands in the human genome. Detailed information on these CpG islands can be found in the public database. CpG islands are defined as stretches of DNA > 200 bp long with a GC percentage greater than 50% and an observed-to-expected CpG ratio of more than 60%. CpG islands are often found at promoters and contain the 5′ end of the transcript, while DNA methylation occurs at the 5′-cytosines of "CpG" dinucleotides (Cross and Bird, 1995). In DKD, the effects of DNA methylation have been studied in terms of transgenerational inheritance of the disease, to explore environmental and other non-genetic factors that may influence epigenetic modifications in the genes involved in DKD (Deaton and Bird, 2011; Jones, 2012).
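The three CpG island criteria stated above (length > 200 bp, GC percentage > 50%, observed-to-expected CpG ratio > 60%) can be checked for a given sequence as in the sketch below. It assumes the commonly used definition of the expected CpG count, (number of C × number of G) / sequence length; whether the public database mentioned in the text uses exactly this formula is an assumption, and the function names are hypothetical.

```python
def cpg_island_stats(seq: str) -> dict:
    """Return length, GC fraction and observed/expected CpG ratio of a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Assumed definition: expected CpG = (#C * #G) / length.
    expected_cpg = (c_count * g_count) / n if n else 0.0
    obs_exp = cpg_count / expected_cpg if expected_cpg > 0 else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp_cpg": obs_exp}


def is_cpg_island(seq: str, min_length: int = 200,
                  min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the three thresholds quoted in the text to one sequence."""
    stats = cpg_island_stats(seq)
    return (stats["length"] > min_length
            and stats["gc_fraction"] > min_gc
            and stats["obs_exp_cpg"] > min_obs_exp)


if __name__ == "__main__":
    cpg_rich = "CG" * 150   # artificial 300 bp sequence, illustration only
    at_rich = "AT" * 150    # artificial 300 bp sequence without any CpG
    print(is_cpg_island(cpg_rich), is_cpg_island(at_rich))
```

In practice such a check is usually applied in sliding windows along a chromosome rather than to a single pre-cut sequence; that step is omitted here.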
Identification of differentially methylated CpG sites in promoters or other functional regions of genes, and analysis of the DNA methylation changes that are associated with DKD, have become the most common approaches used in epigenetic studies of the disease (Villeneuve and Natarajan, 2010; Kato and Natarajan, 2014; Thomas, 2016). Furthermore, ncRNAs, particularly long ncRNAs, are known to be involved in epigenetic processes. ncRNAs play an important role in chromatin formation, histone modification, DNA methylation and consequently the silencing of gene transcription.
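As a rough illustration of how candidate differentially methylated CpG sites might be shortlisted, the sketch below ranks sites by the difference in mean methylation level between cases and controls. The function name, the CpG identifiers and the methylation values (fractions between 0 and 1) are hypothetical. A real analysis would add statistical testing, adjustment for covariates such as cell-type composition, and multiple-testing correction, none of which is shown here.

```python
from statistics import mean


def rank_by_methylation_difference(cases: dict, controls: dict) -> list:
    """Rank CpG sites by |mean(cases) - mean(controls)|.

    Both arguments map a CpG identifier to a list of methylation
    fractions (0 = unmethylated, 1 = fully methylated), one per subject.
    Only sites present in both groups are compared.
    """
    results = []
    for cpg in cases.keys() & controls.keys():
        delta = mean(cases[cpg]) - mean(controls[cpg])
        results.append((cpg, delta))
    # Largest absolute difference first.
    results.sort(key=lambda item: abs(item[1]), reverse=True)
    return results


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    dkd_cases = {"cpg_a": [0.80, 0.90], "cpg_b": [0.50, 0.50]}
    dkd_controls = {"cpg_a": [0.40, 0.50], "cpg_b": [0.50, 0.52]}
    for cpg, delta in rank_by_methylation_difference(dkd_cases, dkd_controls):
        print(cpg, round(delta, 3))
```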
Genetic and epigenetic studies of DKD, initially using candidate gene approaches and more recently at genome-wide scale (known as GWAS and EWAS), have been undertaken to identify genes conferring susceptibility or resistance to DKD. In this review, the clinical phenotypes, research approaches and experimental designs that have been used for genetic and epigenetic studies of DKD are described. These research approaches and experimental designs can also be used for the study of CKD. Current information from genetic and epigenetic studies of DKD is summarized. Further investigation of molecular defects in DKD with new generation sequencing analyses and phenome-wide association studies (PheWAS) is also discussed.

BIOLOGICAL MATERIAL, RESEARCH APPROACHES AND STUDY DESIGNS USED IN GENETIC AND EPIGENETIC INVESTIGATIONS OF DIABETIC KIDNEY DISEASE
FIGURE 1 | Schematic diagram representing the relationship between genetic, epigenetic and phenotypic studies in diabetic kidney disease (DKD). Genetic association studies are fundamentally important for identification of susceptibility or resistance genes (G). Epigenetic studies analyzing genomic DNA methylation changes, chromosome histone modification and ncRNA regulation are useful for dissecting the interaction of the genes with environmental factors. The combined data from genetic, epigenetic and phenotypic (Phe) studies may provide the opportunity to understand new pathways underlying the pathogenesis of DKD, to discover new biomarkers for early diagnosis and to find targets for prevention and treatment programs for this disease. The different sizes of "G" and "Phe" represent the variation in genetic and phenotypic effects.
Two major research approaches, either at genome-wide scale or focused on candidate gene(s), have been widely used for comparative studies between cases (patients with DKD) and controls (diabetes patients without DKD). Case-control studies that recruit large numbers of subjects have greater statistical power to detect associations. The aim is to discover genes that differ between cases and controls in genomic structure or gene expression. Genome-wide or epigenome-wide association studies (GWAS or EWAS) are hypothesis-generating approaches (Rakyan et al., 2011; Do et al., 2017; Lappalainen and Greally, 2017). These study designs have benefited from the rapid development of human genome research, including the creation of publicly available databases of SNPs, haplotypes and CpG islands, and from rapid technical improvements in analyzing genomic variation using high-throughput techniques and high-density SNP or CpG arrays. The other approach is to focus on candidate genes and study a more limited number of genes potentially involved in the pathogenesis of DKD, selected on the basis of existing knowledge or a hypothesis.
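The basic case-control comparison behind both approaches can be illustrated for a single SNP with an allelic 2 × 2 table. The sketch below computes the odds ratio, a 95% confidence interval on the log scale and a 1-degree-of-freedom chi-square P-value. The allele counts are hypothetical, the function name is an assumption, and this simple allelic test is only one of several association models used in practice.

```python
import math


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int) -> dict:
    """Odds ratio, 95% CI and chi-square P-value for a 2 x 2 allele count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return {"odds_ratio": odds_ratio, "ci_low": ci_low, "ci_high": ci_high,
            "chi2": chi2, "p_value": p_value}


if __name__ == "__main__":
    # Hypothetical allele counts: cases (DKD) vs controls (diabetes without DKD).
    print(allelic_association(120, 80, 90, 110))
```

In a genome-wide setting the same test is repeated for every variant, which is why very large sample sizes and stringent significance thresholds are needed.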
In genetic and epigenetic studies of DKD, DNA samples are commonly extracted from peripheral blood because it is clinically accessible. Dick et al. (2014) compared DNA methylation changes related to BMI using both whole-blood DNA methylation profiling and adipose tissue-specific methylation measurement. Their data suggest that analysis of blood DNA methylation is worthwhile because the results can reflect the DNA methylation changes in tissues relevant to a particular phenotype. Nevertheless, there is still limited information concerning the correlation between whole-blood DNA methylation profiles and kidney tissue-specific DNA methylation changes, in part due to the heterogeneity of cell types within the kidney. To improve tissue-specific DNA methylation analysis of kidney diseases, including DKD, it is necessary to construct biobanks of renal biopsies. Karolinska Institutet has established a biobank in KaroKidney with more than 750 renal biopsies (http://karokidney.org). The advantages and limitations of these two approaches, as well as the clinical materials and experimental designs used in genetic and epigenetic studies of DKD, are summarized in Table 1.
genes associated with DKD in these studies are listed in Table 2A, and their potential biological relevance and genetic effects in DKD are briefly described. Of these, 34 genes were originally predicted by GWAS, and their statistical associations with DKD are summarized in Table 2B.
The CNDP1 (carnosine dipeptidase 1) gene is located on chromosome 18q22.3 and contains a 5-leucine (CTG) trinucleotide repeat length polymorphism (D18S880) in the coding region (Wanic et al., 2008). This trinucleotide repeat polymorphism has been found to show gender specificity and to confer susceptibility to DKD and ESRD in T2D (Albrecht et al., 2017b). Furthermore, serum carnosinase (CN-1) activity is negatively correlated with time on hemodialysis (Peters et al., 2016). In addition, several SNPs in this gene are also associated with DKD and ESRD (Janssen et al., 2005; Freedman et al., 2007b; McDonough et al., 2009; Alkhalaf et al., 2010; Mooyaart et al., 2010; Ahluwalia et al., 2011b; Chakkera et al., 2011; Kurashige et al., 2013). Interestingly, an experimental study in BTBR ob/ob mice has demonstrated that treatment with carnosine, the substrate of carnosinase, improves glucose metabolism and albuminuria, suggesting that carnosine may be a novel therapeutic strategy for patients with DKD (Albrecht et al., 2017a).
The ELMO1 (engulfment and cell motility 1) gene is located on chromosome 7p14.1 and encodes a member of the engulfment and cell motility protein family. The protein interacts with dedicator of cytokinesis proteins and subsequently promotes phagocytosis and cell migration. Increased expression of ELMO1 and dedicator of cytokinesis 1 may promote glioma cell invasion (Patel et al., 2010). Furthermore, several SNPs in this gene have been found to be associated with DKD in both T1D and T2D (Shimazaki et al., 2005, 2006; Craig et al., 2009; Leak et al., 2009; Pezzolesi et al., 2009a; Hanson et al., 2010; Wu et al., 2013; Alberto Ramirez-Garcia et al., 2015; Bodhini et al., 2016; Hathaway et al., 2016; Mehrabzadeh et al., 2016; Sharma et al., 2016). The variants associated with DKD, however, differ among the populations studied, suggesting the presence of allelic heterogeneity, probably resulting from the diverse ancestral genetic backgrounds of the different racial groups.
The FRMD3 (FERM domain containing 3) gene is located on chromosome 9q21.32. The FRMD3 gene is expressed in adult brain, fetal skeletal muscle, thymus, ovaries, and podocytes (Ni et al., 2003). Pezzolesi et al. (2009b) have demonstrated that FRMD3 expression in the kidneys of a DKD mouse model is decreased compared with non-diabetic mice. Genetic polymorphisms in the FRMD3 gene are associated with DKD and ESRD in T1D and T2D (Freedman et al., 2011; Al-Waheeb et al., 2016). Furthermore, members of the bone morphogenetic protein (BMP) family interact with FRMD3, which implies that FRMD3 may influence the risk of DKD through regulation of the BMP pathway (Martini et al., 2013; Palmer and Freedman, 2013).
The MMP9 (matrix metallopeptidase 9) gene is located on chromosome 20q13.12. The MMP family members are zinc-dependent endopeptidases and the major proteases involved in the breakdown of extracellular matrix (ECM) in physiological processes such as tissue remodeling, reproduction and embryonic development. MMP9, the ninth member of the family, may play an essential role in local proteolysis of the ECM and in leukocyte migration. Common variants in the promoter region, such as rs3918242 (-1562C/T) and the microsatellite (CA)n, as well as the SNPs rs481480, rs2032487, rs4281481 and rs3752462, have been found to be associated with susceptibility to DKD (Hirakawa et al., 2003; Nair et al., 2008; Ahluwalia et al., 2009; Freedman et al., 2011; Cooke et al., 2012; Zhang et al., 2015; Feng et al., 2016).

Table fragment (gene; polymorphisms; phenotype):
MMP9; (CA)n in promoter, rs481480, rs2032487, rs4281481, rs3752462, rs3918242; T2D-ESRD, T2D-DKD
NMUR2; rs982715, rs4958531, rs4958532, rs4958535; T1D-DKD

which is important for electrolyte homeostasis. Mutations in this gene cause a disorder characterized by hypokalemic alkalosis combined with hypomagnesemia and low urinary calcium, but increased renin activity. Tanaka et al. (2003) performed a GWAS in Japanese T2D subjects and reported that the SLC12A3 Arg913Gln polymorphism was associated with reduced risk of DKD. Nishiyama et al. (2005) then conducted a 10-year longitudinal study in the same population. The results confirmed that the 913Gln allele of the SLC12A3 Arg913Gln polymorphism conferred a protective effect in DKD (Nishiyama et al., 2005). More recently, Abu Seman et al. (2014) performed a further genetic study of SLC12A3 polymorphisms in a Malaysian population, including a meta-analysis of the association between the SLC12A3 Arg913Gln polymorphism and DKD across all the previous studies. The SLC12A3 Arg913Gln polymorphism was found to be associated with T2D (P = 0.028, OR = 0.772, 95% CI = 0.612-0.973) and DKD (P = 0.038, OR = 0.547, 95% CI = 0.308-0.973) in the Malaysian cohort. The meta-analysis confirmed the protective effect of the SLC12A3 913Gln allele in DKD (Z-value = −1.992, P = 0.046, OR = 0.792). In addition, the authors investigated the role of slc12a3 expression in the progression of DKD in db/db mice and in kidney development in zebrafish embryos. Knockdown of the zebrafish ortholog slc12a3 at the 1-cell stage led to structural abnormality of the pronephric distal duct. Slc12a3 mRNA and protein expression levels were upregulated in the kidneys of db/db mice at 6, 12, and 26 weeks of age. The authors thus concluded that SLC12A3 is a susceptibility gene in DKD, and that allele 913Gln, but not allele Arg913, has a protective effect in the disease (Abu Seman et al., 2014).
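The relation between the odds ratios, confidence intervals, Z-value and P-values quoted above can be checked with a short calculation. The sketch below assumes a normal approximation on the log odds ratio scale and a symmetric 95% confidence interval (critical value 1.96); it is an approximate consistency check with hypothetical function names, not a reproduction of the statistical methods used in the cited studies.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided P-value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_and_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                       z_crit: float = 1.96) -> tuple:
    """Approximate Z and two-sided P from an odds ratio and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z_crit * SE).
    """
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    return z, two_sided_p_from_z(z)


if __name__ == "__main__":
    # Values quoted in the text for the Malaysian cohort and the meta-analysis.
    print("DKD:", z_and_p_from_or_ci(0.547, 0.308, 0.973))
    print("T2D:", z_and_p_from_or_ci(0.772, 0.612, 0.973))
    print("meta-analysis P from Z = -1.992:", two_sided_p_from_z(-1.992))
```

Under these assumptions the meta-analysis Z-value of −1.992 corresponds to a two-sided P of about 0.046, matching the reported value, and the DKD odds ratio with its confidence interval gives a P of roughly 0.04, in the same range as the reported P = 0.038.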
This association of the SLC12A3 Arg913Gln polymorphism with DKD has very recently been replicated in a Chinese population.
The glycoprotein encoded by the UMOD gene is synthesized exclusively in renal tubular cells and released into urine. Furthermore, UMOD may prevent urinary tract infection and inhibit the formation of urine supersaturated with salts and the subsequent formation of salt crystals. SNPs rs4293393 and rs1297707 in the UMOD gene have been found to be associated with susceptibility to DKD in T2D (Ahluwalia et al., 2011a; Prudente et al., 2017; van Zuydam et al., 2018).
The Human Genome Project has revealed that there are more than twenty thousand protein-coding genes and probably more than one million RNA genes (https://www.genecards.org/). Genetic association studies of RNA gene polymorphisms with DKD are very limited. To date, only two SNPs, i.e., rs2910164 and rs12976445 in the genes for miRNA-146a and miRNA-125, have been found to be associated with DKD in T1D and T2D (Li et al., 2014; Kaidonis et al., 2016). Further investigation of RNA genetic variation conferring susceptibility to DKD needs to be undertaken.

CURRENT INFORMATION FROM EPIGENETIC STUDIES IN DIABETIC KIDNEY DISEASE
As in genetic association studies, epigenome-wide (EWAS) and candidate gene DNA methylation analyses have been used for epigenetic studies of DKD. Current information from epigenetic studies in DKD is presented in Table 3. An EWAS suggested that several genes, including SLC22A12, TRPM6, AQP9, HP, AGTX, and HYAL2, may have epigenetic effects in DKD (VanderJagt et al., 2015). Interestingly, SLC22A12 encodes urate anion transporter 1 (URAT1), a kidney-specific urate transporter that transports urate across the apical membrane of the proximal tubule. Loss-of-function SLC22A12 mutations are associated with renal hypouricaemia; affected persons can develop exercise-induced acute kidney injury and are at increased risk of developing urate stones (Lee et al., 2008). TRPM6 is a member of the transient receptor potential superfamily of cation channels. This gene is widely expressed in the body, including along the nephron in the kidneys. The TRPM6 channels are mainly located in the renal distal convoluted tubule, the site of active transcellular calcium and magnesium transport in the kidney (Felsenfeld et al., 2015). As described previously, several studies have implicated UMOD genetic polymorphisms in susceptibility to DKD (Ahluwalia et al., 2011a; Prudente et al., 2017; van Zuydam et al., 2018).
A recent study has demonstrated that UMOD regulates renal magnesium homeostasis through TRPM6 (Nie et al., 2018). Furthermore, analyses of candidate genes such as IGFBP1 and MTHFR have also provided evidence that DNA methylation changes in these genes may be involved in the pathogenesis of DKD (Gu et al., 2013, 2014; Yang et al., 2016). Combining and analyzing data from genetic and epigenetic studies together may help explain some of the pathophysiology of DKD.
ncRNAs regulate gene expression at the post-transcriptional level and are involved in chromatin histone modification. Most studies of histone modification and ncRNA dysregulation have been performed in diabetic animal models, while a few have been undertaken in subjects with DKD (Table 3). Reddy et al. (2014) analyzed histone modification profiles in genes associated with DKD pathology and the modified regulation of these genes following treatment with the angiotensin II type 1 receptor (AT1R) blocker losartan. The data indicate that losartan attenuates key parameters of DKD, modifies gene expression and reverses some epigenetic changes in db/db mice. Losartan also attenuates increased H3K9/14Ac at the RAGE, PAI-1, and MCP-1 promoters in mesangial cells cultured under diabetic conditions (Reddy et al., 2014). In a recent study of subjects with T2D and diabetic complications (including DKD), the methylation profiles of miR genes were compared and related to the presence of diabetic complications (Dos Santos Nunes et al., 2018). The results indicated that miRs can modulate the expression of a variety of genes, and methylation changes of miR-9-3, miR-34a, and miR-137 were found to be associated with diabetic complications (Dos Santos Nunes et al., 2018). These two studies provide evidence suggesting that therapies targeting epigenetic regulators might be beneficial in the treatment of DKD.

SUMMARY AND PERSPECTIVES
Researchers have made major efforts to undertake well-powered genetic and epigenetic studies in DKD to help understand its pathogenesis. The data, however, need to be confirmed by several strategies. For instance, replication studies could be performed with better selection of subjects of similar genetic background, to limit the influences of migration, intermarriage and cultural preferences, coupled with further investigation of DNA variation and methylation changes in RNA regulation genes and with biological experiments to determine the functional impact of these variants. Furthermore, new technologies for DNA and ncRNA sequencing analysis, such as third-generation sequencing, and the PheWAS approach have recently been developed.

New Generation Sequencing
DNA sequencing analysis is used for determining the accurate order of nucleotides along chromosomes and genomes. Second-generation sequencing, commonly known as next-generation sequencing (NGS), has become popular in DNA sequencing analysis because NGS enables a massively parallel approach capable of producing large numbers of reads at high coverage along the genome and therefore dramatically reduces the cost of DNA sequencing analysis (Treangen and Salzberg, 2011; Gu et al., 2018; Mone et al., 2018). Third-generation sequencing (often called long-read sequencing) is a newer method that reads nucleotide sequences at the single-molecule level, in contrast to the first and second generations of DNA sequencing (van Dijk et al., 2018). Moreover, it is necessary to develop the molecular instruments for whole-genome sequencing to make this new generation sequencing commercially available. These advanced sequencing technologies will improve genetic and epigenetic studies in DKD in the near future.
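The link between the number of reads and coverage mentioned above can be made concrete with the standard average-depth estimate, coverage = number of reads × read length / genome size. The read count, the read length and the rounded genome size of 3 × 10^9 bp used below are illustrative assumptions, not figures taken from the studies cited.

```python
def expected_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    # Hypothetical short-read run: 600 million reads of 150 bp
    # against a genome rounded to 3e9 bp.
    print(expected_coverage(600_000_000, 150, 3_000_000_000))
```

This is an average only; actual coverage varies along the genome, particularly in repetitive regions.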

ncRNA Genetic and Epigenetic Studies
In the human genome, RNA genes are much more abundant than protein-coding genes, and ncRNAs mainly include miRNAs and lncRNAs. Both forms of ncRNA have been found to be involved in chromatin histone modifications and can subsequently have epigenetic effects on their target genes. Therefore, identification of RNA genetic variation and investigation of biological alterations of these RNA genes should be included in research plans. Kato has very recently proposed the hypothesis that transforming growth factor-β1 (TGF-β1) may play an important role in the early-stage development of DKD, while some miRNAs and lncRNAs regulate the key molecules in the TGF-β1 pathway. These ncRNAs may serve as biomarkers for predicting potential targets for prevention and treatment in DKD (Kato, 2018). Furthermore, Smyth et al. (2018) compared Sanger sequencing and NGS to validate the five top-ranked miRNAs predicted by EWAS to be associated with DKD. This study suggests that targeted NGS may offer a more cost-effective and sensitive approach, and implies that methylated miR-329-2, in whose region SNP rs10132943 is located, and miR-429, where SNPs rs7521584 and rs112695918 are located, are associated with DKD (Smyth et al., 2018). Although these two studies are preliminary, they may be good examples to help direct further DKD research.

Phenome-Wide Association Study (PheWAS)
PheWAS is a new approach that analyzes many phenotypes in relation to a single genetic variant. This approach was originally described using electronic medical record (EMR) data linked with a DNA biobank, and it can also be combined with GWAS and EWAS. PheWAS has therefore become a powerful tool for investigating the impact of genetic variation on drug response among many individuals and may expand our knowledge of new drug targets and effects (Pendergrass and Ritchie, 2015; Denny et al., 2016; Roden, 2017). Combined with GWAS and EWAS, PheWAS offers the possibility of discovering associations with drug effects, including therapeutic response and side effect profiles, in DKD (Hebbring, 2014).
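The core idea of PheWAS, testing one genetic variant against many phenotypes, can be sketched as a loop over phenotypes with a 2 × 2 test at each step. In the sketch below the carrier status, the phenotype names and the data are hypothetical, and the Bonferroni correction is an assumption chosen for simplicity; published PheWAS analyses typically use regression models with covariates.

```python
import math


def phewas_scan(carrier: list, phenotypes: dict, alpha: float = 0.05) -> list:
    """Test one variant (carrier yes/no per subject) against several binary phenotypes.

    Returns one result per testable phenotype, sorted by P-value, with a
    Bonferroni-adjusted significance flag (alpha / number of phenotypes).
    """
    threshold = alpha / len(phenotypes)
    results = []
    for name, status in phenotypes.items():
        pairs = list(zip(carrier, status))
        a = sum(1 for g, y in pairs if g and y)          # carrier, affected
        b = sum(1 for g, y in pairs if g and not y)      # carrier, unaffected
        c = sum(1 for g, y in pairs if not g and y)      # non-carrier, affected
        d = sum(1 for g, y in pairs if not g and not y)  # non-carrier, unaffected
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            continue  # no variation in genotype or phenotype: not testable
        chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
        p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square, 1 degree of freedom
        results.append({"phenotype": name, "chi2": chi2, "p_value": p_value,
                        "significant": p_value < threshold})
    return sorted(results, key=lambda r: r["p_value"])


if __name__ == "__main__":
    # Hypothetical cohort of 40 subjects: 20 carriers followed by 20 non-carriers.
    carrier = [True] * 20 + [False] * 20
    phenotypes = {
        "pheno_a": [True] * 18 + [False] * 2 + [True] * 4 + [False] * 16,
        "pheno_b": [True, False] * 20,
    }
    for row in phewas_scan(carrier, phenotypes):
        print(row)
```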
Taken together, application of these advanced studies in DKD will be very useful not only for evaluating current data from genetic and epigenetic studies but also for generating new knowledge for dissecting the complexity of this disease.

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and has approved it for publication.

FUNDING
The study was supported by the Start Grant from China Pharmaceutical University.