Genetics of Cardiovascular Disease: Fishing for Causality

Cardiovascular disease (CVD) is still the leading cause of death in all Western countries, and genetic predisposition in combination with traditional risk factors frequently mediates its manifestation. Genome-wide association (GWA) studies have revealed numerous potentially disease-modifying genetic loci, often including several SNPs and associated genes. However, pure genetic association does not prove direct or indirect relevance of the modifier region to pathogenesis, nor does it define the exact genetic driver of the disease within the associated region. Therefore, the relevance of the identified genetic disease associations needs to be confirmed either in monogenic traits or in experimental in vivo model systems by functional genomic studies. In this review, we focus on the use of functional genomic approaches such as gene knockdown or CRISPR/Cas9-mediated genome editing in the zebrafish model to validate disease-associated genomic loci and to identify novel cardiovascular disease genes. We summarize the benefits of the zebrafish for cardiovascular research and highlight examples demonstrating the successful combination of GWA studies and functional genomics in zebrafish to broaden our knowledge of the genetic and molecular underpinnings of cardiovascular diseases.


INTRODUCTION
Cardiovascular disease (CVD) is the leading cause of mortality worldwide. CVD describes a class of diseases affecting the heart and blood vessels, such as cardiomyopathies, coronary artery disease, heart failure or arrhythmias. A variety of risk factors, such as smoking, obesity, hypertension or high cholesterol, can be causative for CVD; however, it is understood that these traditional risk factors account for only a fraction of disease cases (1). Therefore, researchers also focus on defining the genetic basis of CVD to identify disease mechanisms independent of environmental risk factors. Recent advances in next-generation sequencing (NGS) techniques now enable an unbiased, whole-genome analysis of patients to identify disease-associated genetic alterations. One of these approaches comprises genome-wide association (GWA) studies (GWAS), which have emerged as a powerful tool to identify disease-related loci and have become a valuable candidate resource for disease-causing genes and variants. A GWA study is a hypothesis-free approach utilizing the information of hundreds of thousands of genetic variants across the genome, so-called SNPs (single nucleotide polymorphisms), in large population samples. In this context, GWAS findings are purely genetic associations; nevertheless, significant associations between SNPs and the disease are excellent starting points for detailed follow-up studies. More than 10,000 such significant associations with disorders and genomic traits have been reported by GWA studies, resulting in new insights into the biology and molecular mechanisms of various diseases (2). Online platforms like the GWAS Catalog provide researchers with collected data from published GWA studies and enable open access to these genome-wide analyses (3). The GWAS Catalog comprises studies on a variety of diseases, ranging from neurological disorders and various cancer types to cardiac diseases such as cardiomyopathies or arrhythmias.
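To make the notion of a "significant association" between a SNP and a disease more concrete, the following minimal sketch tests a single SNP with a basic allelic chi-square test on allele counts in cases and controls. It is an illustration only and not the analysis pipeline of any study cited here: the function name, the example counts and the use of the conventional genome-wide significance threshold of 5 x 10^-8 are assumptions, and real GWA analyses additionally correct for covariates such as population structure.

```python
import math

# Conventional genome-wide significance threshold (an assumption for this sketch).
GENOME_WIDE_ALPHA = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio) for the alternative allele.
    All four counts are hypothetical inputs and must be positive.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts from 1,000 cases and 1,000 controls.
chi2, p_value, odds_ratio = allelic_association(600, 1400, 400, 1600)
is_genome_wide_significant = p_value < GENOME_WIDE_ALPHA
```

Even a SNP that passes this threshold only marks an associated region; whether a nearby gene is causal remains an experimental question.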
All GWA studies rely on the exact definition of the disease phenotype in patients to obtain a cohort that is as specific as possible. A mixed cohort, secondary disease mechanisms or environmental variation might lead to non-significant or underestimated results. This has been observed in particular for GWA studies focusing on heart failure mechanisms (4). Even when well designed, some GWA studies still lack clinical relevance because the causality of the candidate genes has not been demonstrated. To validate GWAS hits quickly and reliably, an adequate experimental model for follow-up studies is fundamental. Several model systems are available, ranging from cell culture to animal models, and each model has its pros and cons depending on the respective disease mechanism. During the last decades, the zebrafish (Danio rerio) has emerged rapidly as a model organism in cardiovascular research. In this review, we will focus on the use of the zebrafish to investigate the pathomechanisms of heart diseases and discuss its suitability as an experimental tool to validate the disease association of genes identified by GWA studies.

THE ZEBRAFISH: SMALL FISH, BIG IMPACT
Zebrafish possess a variety of features that are advantageous for use as an experimental model organism. Due to their small size (2-4 cm), zebrafish are easy to handle, and one female can produce around 200 eggs per week. Zebrafish embryos develop externally and very rapidly into free-swimming, feeding larvae within 5 days (5,6). The zebrafish is an excellent system for microscopic applications, as embryos are transparent and numerous transgenic fluorescent reporter lines are available or can easily be produced (7). Such reporter lines are widely used to image organ development and morphology as well as physiological parameters like membrane voltage or calcium transients (8,9). Because of their suitability for imaging applications, zebrafish are also highly interesting for high-throughput small compound screens. This is enabled by existing and continuously improving screening platforms, e.g., for the automated detection of heartbeat, heart rate and fractional shortening in embryos or isolated hearts of adult zebrafish (10-13). Such set-ups facilitate rapid and high-throughput preclinical tests of large numbers of small molecules and help to identify novel therapeutic strategies (14-17).
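The two read-outs named above, heart rate and fractional shortening, reduce to simple arithmetic once beat times and ventricular diameters have been extracted from image data. The sketch below assumes that this extraction has already been done; the function names and input values are hypothetical and do not correspond to any of the cited screening platforms. Fractional shortening is computed with the standard definition, (end-diastolic diameter - end-systolic diameter) / end-diastolic diameter, expressed in percent.

```python
def fractional_shortening(end_diastolic_diameter, end_systolic_diameter):
    """Fractional shortening in percent from two ventricular diameters.

    Both diameters must be given in the same unit (e.g., micrometers).
    """
    if end_diastolic_diameter <= 0:
        raise ValueError("end-diastolic diameter must be positive")
    shortening = end_diastolic_diameter - end_systolic_diameter
    return 100.0 * shortening / end_diastolic_diameter


def heart_rate_bpm(beat_times_s):
    """Mean heart rate in beats per minute from beat time stamps in seconds."""
    if len(beat_times_s) < 2:
        raise ValueError("at least two beats are needed")
    intervals = [b - a for a, b in zip(beat_times_s, beat_times_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


# Hypothetical measurements from one embryo.
fs_percent = fractional_shortening(100.0, 70.0)
rate_bpm = heart_rate_bpm([0.0, 0.5, 1.0, 1.5])
```

In a screening context, such values would be computed per embryo and compared between treated or genetically modified animals and their controls.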
Besides these general advantages, zebrafish exhibit characteristics that make them appropriate for studying heart development and disease (16,18-21). Zebrafish heart development proceeds fast and results in a differentiated two-chambered heart within 48 hpf (hours post fertilization) (22). In addition, zebrafish embryos, in contrast to mammalian or avian embryos, are able to cover their oxygen demand by diffusion during the first days of development and are not dependent on blood circulation. This enables the investigation of gene knockouts or knockdowns, even if they lead to severe defects of the cardiovascular system (23).
On a genetic basis, approximately 70% of human genes have at least one zebrafish ortholog, and 84% of human disease-causing genes can also be found in the zebrafish genome (24,25). However, regarding the cardiovascular system, there are basic morphological differences, as the zebrafish heart consists of only two heart chambers, one atrium and one ventricle. On the one hand, this is advantageous because it provides a simplified experimental model; on the other hand, these anatomical differences may limit the translation of findings into the mammalian system (21). Unlike mammals, which develop a coronary system during embryogenesis, zebrafish show a vasculature on the heart surface starting at 1-2 months post hatching (26). This restricts the study of coronary artery disease (CAD) to adult zebrafish, although it is possible to analyze basic mechanisms of atherosclerotic lesion development also in the vasculature of zebrafish embryos (27). There are also several parameters of the zebrafish heart that are closer to the human situation than those of mammalian model organisms such as the mouse (17). For example, the zebrafish heart rate of 120-180 bpm (beats per min) is comparable to the 60-100 bpm of the human heart, whereas the mouse heart beats 5 times faster. Furthermore, zebrafish ECG parameters are very similar to human values, enabling a direct comparison and translation of experimental findings (20,28).
In addition to the great benefits of the zebrafish with regard to organ development, physiology, handling and imaging, its suitability for genetic manipulation is another major advantage of the system (Figure 1). Here, we will give a concise overview of the repertoire of zebrafish genetic tools and highlight examples in which they have been used to demonstrate the causality of genes or loci identified by GWAS.

ZEBRAFISH GENETIC SCREENS: A (SWIMMING) POOL OF DISEASE GENES
Before GWAS data became more and more accessible, candidate-driven approaches were very successful in identifying disease-associated mutations. Here, known molecular players and/or regulators of a specific disease-related pathway are screened in a cohort or hereditary trait to find an association with the pathological outcome. Forward genetic screens in zebrafish contributed a lot to these studies, as a variety of genes responsible for cardiovascular defects were identified in zebrafish mutant lines arising from mutagenesis screens (18,29). These screens, comparable to GWAS, have the advantage of being hypothesis-free approaches that identify genetic mutations via a randomly induced phenotype. The most prominent mutagenesis screens are based on alkylating agents like N-ethyl-N-nitrosourea (ENU), which give rise to point mutations leading to nonsense or missense mutations that affect the regulatory and coding regions of genes (30-32). The benefits of the zebrafish as an experimental system can enormously help to validate the functional relevance of candidate genes identified by GWA studies. In addition, the system enables the in vivo analysis of underlying pathomechanisms and is highly suitable for high-throughput screening applications. In summary, the combination of GWAS and the zebrafish experimental system has the potential to lead to improved and specific therapeutic approaches.
Although mainly recessive and singly inherited mutations can be analyzed in such mutagenesis screens, the combination of zebrafish forward genetics followed by human genome analysis led to the identification of several disease-related genes. One example is the zebrafish mutant main squeeze (msq), which harbors a mutation in the gene encoding ILK (Integrin-linked kinase). Msq mutants display progressive loss of ventricular contractility leading to heart failure (33). Another ILK mutant line, lost contact (loc), also displays a cardiomyopathy phenotype (34). After identifying ILK mutations as causative for the loc mutant phenotype, Knöll et al. performed a mutation screen in the ILK gene of human cardiomyopathy patients. This screen revealed an ILK mutation that was associated with the disease, and its disease-causing effect could in turn be validated in loc mutants. These examples show that zebrafish can serve as (I) a resource for new candidate genes in heart failure through forward genetic screens as well as (II) a model organism to validate potential disease-causing mutations in reverse genetic analyses.

REVERSE GENETIC APPROACHES IN ZEBRAFISH
Reverse genetics can be regarded as the targeted investigation of a gene of interest by increasing, reducing or silencing its expression. A diversity of reverse genetic tools can be applied in zebrafish; however, several characteristics of zebrafish genetics have to be kept in mind. Zebrafish underwent a whole-genome duplication event, with the consequence that for many genes a partially redundant paralog is present (24,35). In addition, several transcripts of the same gene often exist, and the knockdown or knockout of several genes might be necessary to model the loss-of-function phenotype of a human ortholog. Another aspect that needs to be considered is the genetic variation between and within zebrafish strains, which might have an impact on the phenotype and on the conclusions drawn from functional analyses (36).
An important and helpful resource for reverse genetic investigations is the Zebrafish Mutation Project, which provides a growing list of fish lines with a defined mutation in a specific gene (25). These mutations are induced by chemical mutagenesis, similar to that used in forward genetic screens, and identified by high-throughput DNA genotyping, an approach called TILLING (targeting induced local lesions in genomes) (37,38). If a desired and appropriate mutation is available, this open-source platform might give scientists fast access to a loss-of-function model that can be directly used for functional studies. The reverse genetic tools that can be applied in zebrafish are mainly (A) mRNA overexpression, (B) transgenesis, (C) Morpholino-modified antisense oligonucleotide (MO)-mediated knockdown and (D) genome editing techniques such as ZFNs (zinc finger nucleases), TALENs (transcription activator-like effector nucleases) or the CRISPR/Cas9 system (clustered regularly interspaced short palindromic repeats).

mRNA Overexpression
Injecting synthetic capped mRNA encoding the protein of interest into early embryonic stages is commonly used as a standard method to induce transient overexpression of genes or gene variants (e.g., SNPs/variants identified in GWAS or next-generation sequencing) for gain-of-function or loss-of-function studies. For example, mRNA overexpression in zebrafish was used to analyze mutations in the NEXN (Nexilin) gene that were identified in human DCM (dilated cardiomyopathy) patients (39). When overexpressed in zebrafish embryos, these mutant NEXN variants induced a severe DCM phenotype, showing the suitability of the method for fast and effective testing of the impact of putative mutations. Even though mRNA overexpression is an effective way to elucidate functions of specific genes, its use is restricted to early organ development and function because of the limited stability of the injected mRNA.

Transgenesis
Transgenesis in zebrafish involves the insertion of foreign DNA into the genome and is often used to create reporter lines, in which a fluorescent reporter gene under the control of a specific promoter is used to label a particular tissue, organ or cell type. The most commonly used system to insert a transgene into the zebrafish germline is the Tol2 system derived from medaka fish. The autonomously active Tol2 element harbors a gene that encodes a transposase mediating the transposition of the Tol2 element into the genome (40). For transgenesis of zebrafish, the sequence or gene of interest needs to be flanked by 150-200 bp ends of the Tol2 element. Injection of this construct together with in vitro transcribed transposase mRNA leads to the highly efficient generation of transgenic F1 offspring (41,42). For the study of zebrafish heart development and function, a variety of transgenic lines are available, such as cmlc2 (myosin light chain 7, myl7) promoter-driven reporter lines that specifically label cardiomyocytes of both heart chambers (43,44). In addition, random insertional transgenesis of EGFP, the so-called enhancer trap, was shown to result in various reporter lines specifically labeling cardiac structures (45). A powerful combination is the use of transgenic lines in cell transplantation experiments, which are widely used in zebrafish embryos to investigate cell-autonomous mechanisms. With this approach, Sawamiphak and colleagues could, for example, analyze fusion events between cardiomyocytes during heart development that enable the exchange of mRNA or proteins between individual cells (46). Furthermore, stable transgenic expression of gene variants associated with heart diseases can serve as an appropriate in vivo model to study the underlying pathology. Huttner and coworkers, for example, showed that transgenic expression of the D1275N mutation of the human cardiac sodium channel (SCN5A), which is associated with cardiac abnormalities in humans, also leads to bradycardia and defects of the cardiac conduction system in zebrafish (47). Further developments of transgenesis techniques with regard to tissue specificity or inducibility will broaden the possibilities for transgenesis in zebrafish and help to create improved experimental systems for cardiac research (48,49).

Morpholino-Mediated Knockdown
Morpholinos (MOs) are knockdown reagents that are very stable, are resistant to nucleases and can be injected into 1-cell stage zebrafish embryos. Thus, they became a standard approach for gene knockdown in zebrafish (50,51). In a variety of cardiovascular research studies, MO-mediated knockdowns were performed to analyze the disease association of a particular gene and/or to model specific pathological features. For example, knockdown of genes that are associated with DCM progression in humans also results in cardiomyopathy in the zebrafish (39,52). However, phenotypes induced by MOs may be more severe than those of the corresponding mutants. This discrepancy can result from genetic compensation in the mutant or from off-target effects of the MO used (53-55). Therefore, proper controls for MO specificity, efficiency and toxicity should be included in all applications (51).

Genome Editing Techniques
Genome editing has evolved into a major strategy to disrupt the coding sequence of genes of interest, leading to a loss of function. During recent years, various CRISPR/Cas9, ZFN and TALEN approaches were developed and applied in zebrafish research to create gene knockouts. The detailed technical aspects are not the focus of this review and are reviewed elsewhere (56,57). ZFN and TALEN approaches were successfully applied in zebrafish cardiovascular research studies (58,59). For instance, ZFN-mediated knockout of GATA2 results in severe defects in vascular organization, highlighting the importance of this gene for cardiovascular development (60).
The discovery of CRISPR/Cas9 as a genome editing method ushered in a new era of reverse genetics (61,62). The CRISPR/Cas9 system is the most efficient genome editing method for reverse genetics in zebrafish and exhibits, due to its simplicity and applicability, many advantages compared to ZFNs and TALENs (63). The CRISPR/Cas9 system is a two-component complex composed of the Cas9 endonuclease, which induces double-strand breaks (DSBs), and a guide RNA (gRNA) recognizing specific DNA sequences (62). Because this mechanism makes CRISPR/Cas9 remarkably simple and adaptable, it has become a major genome editing tool among the technologies available in the zebrafish field (64). Its suitability for reverse genetics in zebrafish cardiovascular research could be shown, for example, by a knockout of the large transcript pr130 of the Protein Phosphatase 2 Regulatory subunit Bα (PPP2R3A) (65). Here, two pr130 knockout lines demonstrated the importance of pr130 for cardiac development and function and provide a suitable genetic model to study the underlying pathomechanisms. An aspect that needs to be considered in all genome editing approaches is the possible presence of off-target effects. Unbiased whole-genome analyses of CRISPR/Cas9 off-target effects are still missing, and researchers are most often restricted to the analysis of off-target genes that are predicted by computational approaches (66). By careful design and selection of gRNA sequences and the use of nuclease variants with high specificity, the risk of off-target effects can be minimized. Additionally, continuous outcrossing of the mutation and the comparison of at least two independently produced knockout lines help to prevent misinterpretations of a genotype-phenotype connection.
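The first step of gRNA design mentioned above, listing candidate target sites in a sequence of interest, can be sketched in a few lines. The example assumes the commonly used Streptococcus pyogenes Cas9, which requires a 5'-NGG-3' PAM directly downstream of a 20-nucleotide protospacer; the nuclease, the function names and the toy sequence are assumptions made for illustration and are not taken from the studies cited here. The sketch does not score off-target risk, which requires dedicated prediction tools and a genome-wide search.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, length=20):
    """List candidate SpCas9 target sites (protospacer followed by an NGG PAM).

    Returns tuples of (strand, start, protospacer, pam). For the "-" strand,
    start refers to the position within the reverse-complemented sequence.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead allows overlapping candidate sites to be reported.
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


# Toy sequence (hypothetical): 20 adenines followed by a TGG PAM.
candidate_sites = find_protospacers("A" * 20 + "TGG")
```

In practice, each candidate would then be filtered, for example by position within the coding sequence and by predicted off-target sites, before gRNAs are synthesized.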
Recently, a variety of improvements and new applications of the CRISPR/Cas9 system have evolved, contributing to the fast implementation of the method in many zebrafish laboratories. The classical targeted knockout strategy involves the injection of gRNA and Cas9 (mRNA or protein) into 1-cell stage embryos and the screening for germline mutations in subsequent generations (67). Another strategy uses a catalytically dead Cas9 protein (dCas9) lacking endonuclease activity to generate a DNA recognition complex that can specifically perturb transcriptional elongation, RNA polymerase binding, or transcription factor binding (68). CRISPR/Cas9 can also be used to generate defined knock-in fish lines with integrated SNPs, stop codons, HA tags, loxP sites or fluorescent proteins (69,70). The CRISPR/Cas9 toolbox is continuously growing, and recent progress has been achieved by using this method for tissue-specific blockage of gene function (71,72) or by combining the strategy with optogenetic tools to gain temporal control over Cas9 activity (73). In the context of cardiovascular research, these improvements will help to obtain heart-specific knockouts and to mimic the late onset and slow progression of many cardiomyopathy subtypes.

GWA STUDIES AND FUNCTIONAL GENOMICS IN ZEBRAFISH: A POWERFUL COMBINATION
The zebrafish functional genomics toolbox enables a defined analysis of theoretically any gene of interest in vivo. This makes the zebrafish a valuable experimental platform to validate putative disease-causing genes that are identified by GWA studies. Indeed, a variety of genome-wide surveys focusing on heart diseases have already used zebrafish to confirm their initial findings. Table 1 summarizes selected examples of genome-wide studies for which the resulting candidate genes could be confirmed by zebrafish reverse genetics. An early GWA study in 2008 identified three co-segregating genes (HBEGF, IK, and SRA1) associated with DCM (87). For Heparin-binding EGF-like growth factor (HBEGF), the linkage to DCM progression was already known from mouse knockout studies (88). The DCM association of the cytokine IK and the steroid receptor RNA activator 1 (SRA1) was a new connection arising from this study. The disease relevance of these candidate genes could be verified in zebrafish embryos. The MO-mediated knockdown of each of the three genes, HBEGF, SRA1 and IK, resulted in severe pericardial edema, accompanied by reduced fractional shortening (FS) of the ventricular chamber (87). Another study focused on the genetic basis of CAID (chronic atrial and intestinal dysrhythmia) and found a linkage to the SGOL1 gene (Shugoshin-like 1) (86). The authors could show that SGOL1 is expressed in the sinoatrial region and atrioventricular valves of the adult zebrafish heart. Consistent with its expression pattern, the knockdown of SGOL1 in zebrafish embryos resulted in bradycardia, confirming the involvement of SGOL1 in heart rhythm control (86). KCNIP1 (potassium voltage-gated channel interacting protein 1) is another example of a gene that could be linked by whole-genome analysis to heart disease, here atrial fibrillation (AF) (82). Interestingly, the reported mutation does not lead to a loss of function, but is suggested to increase KCNIP1 levels. The authors modeled this by overexpressing KCNIP1 in zebrafish and could show that increased KCNIP1 levels can result in transient atrial tachycardia and AF during high-rate pacing (82). Norton et al. (79) identified BAG3 (Bcl-2 associated athanogene 3) as a DCM-associated gene and could confirm its disease relevance by knocking down BAG3 in zebrafish embryos. BAG3-deficient fish showed severe pericardial edema and decreased fractional shortening as well as a reduced peak blood cell flow velocity (79). A second study also independently identified BAG3 as a potential DCM-causing gene (78). In addition, the functional requirement of BAG3 for heart as well as skeletal muscle function was confirmed by an independent MO-based analysis of several myopathy-related genes (80). Another gene linked to DCM that was identified by a whole-genome study is Filamin C (FLNC) (81). The authors of this study also used MO knockdown experiments to validate their findings. FLNC morphants exhibited dysmorphic or dilated heart chambers as well as impaired heart looping, confirming the importance of FLNC for heart function and development (81). Lundby et al. used a GWA approach combined with tissue-specific proteomics to analyze genes associated with LQTS (Long QT Syndrome) (83). They could identify Vinculin (VCL) as a disease-associated gene and could confirm its relevance by using a VCL knockdown approach in zebrafish. In these experiments, the authors measured cardiac repolarization in isolated embryonic hearts using fluorescent probes and could observe an impaired repolarization response upon loss of VCL (83). Additionally, by using a gene-trap mutant zebrafish line as well as a CRISPR/Cas9 knockout line of VCLb, Cheng et al. could confirm its disease relevance (85), and an MO-based knockdown of VCL in zebrafish in another independent approach also validated the role of Vinculin in heart function and structure (84).
Other studies did not use the zebrafish to validate their candidate genes; however, independent follow-up studies could confirm the disease-causing potential of some of them. For example, two independent GWA studies on DCM and heart failure identified SNPs near the HSPB7 (Heat Shock Protein Family B Member 7) gene to be associated with the disease (74,77). Three years later, Rosenfeld et al. could confirm the requirement of HSPB7 for heart function and structure by MO-based knockdown experiments. HSPB7 depletion led to impaired cardiac morphogenesis due to defects in ventricular size, but also due to an early block of heart tube formation (75). By using a TALEN-mediated knockout of HSPB7, the same group recently showed that loss of HSPB7 increases the susceptibility of adult mutant zebrafish to cardiomyopathy due to impaired protein homeostasis, providing further support for the initial GWAS findings (76). In addition, this zebrafish study and most of those mentioned above not only validate candidate genes from GWA studies, but also allow a more detailed investigation of the underlying pathomechanisms and help to identify novel disease-associated pathways and protein networks.

CONCLUSION AND FUTURE PERSPECTIVES
The zebrafish offers a variety of advantages for combination with GWAS. Zebrafish are easy to keep, to handle and to image, show many physiological and genetic similarities to humans and are highly suitable for genetic manipulation. These features help to establish valid disease models and allow a plethora of follow-up studies.
Due to obvious differences in morphology and living environment, it should be clear that a simple translation of findings from the zebrafish system to humans is not always possible. In addition, probably the biggest difference between the zebrafish heart and the mammalian heart, and also a peculiarity of the former, is its ability to regenerate after injury (89). This may result in drawbacks and limitations when comparing pathomechanisms in fish and humans, but also makes the zebrafish a highly interesting model to study the underlying mechanisms of regeneration (90,91). It is important to mention that a disease association of a particular gene that cannot be confirmed in zebrafish does not necessarily mean that the gene is not causative for the phenotype. Mechanisms like intrinsic repair processes or genetic compensation may hide a causative effect of a gene mutation. In such situations, other experimental models, like rodents or patient-derived iPSCs (induced pluripotent stem cells), might lead to clearer results (92,93). Nevertheless, all mentioned benefits make the zebrafish a valid and highly suitable model to investigate cardiovascular pathologies and to confirm findings from GWAS. Many SNPs identified in GWA studies are located in non-coding regions of the genome and might affect, for example, enhancer or repressor binding, microRNA binding sites or chromatin structure. For these kinds of mutations as well, the zebrafish system can help to identify their in vivo relevance, biological role and the underlying pathomechanisms. Madelaine et al., for example, could very recently confirm human disease-associated SNPs in CNEs (conserved non-exonic elements) by using CRISPR/Cas9-mediated deletion of the respective non-coding locus (94).
We are confident that the fruitful synergism between GWAS and zebrafish in cardiac research will expand in the future, lead to the identification of novel disease-causing genes and variants and help to screen for possible therapeutic strategies.

AUTHOR CONTRIBUTIONS
CP, FD, D-DP, WR, and SJ contributed substantially to the conception, drafting, and revision of the manuscript and approved the final version.