Shared Genetic Liability and Causal Associations Between Major Depressive Disorder and Cardiovascular Diseases

Major depressive disorder (MDD) is phenotypically associated with cardiovascular diseases (CVD). We aimed to investigate mechanisms underlying relationships between MDD and CVD in the context of shared genetic variations. Polygenic overlap analysis was used to test genetic correlation and to analyze shared genetic variations between MDD and seven cardiovascular outcomes (coronary artery disease (CAD), heart failure, atrial fibrillation, stroke, systolic blood pressure, diastolic blood pressure, and pulse pressure measurement). Mendelian randomization analysis was used to uncover causal relationships between MDD and cardiovascular traits. By cross-trait meta-analysis, we identified a set of genomic loci shared between MDD and stroke. Putative causal genes for MDD and stroke were prioritized by fine-mapping of transcriptome-wide associations. Polygenic overlap analysis pointed toward substantial overlap of genetic variation between MDD and CVD. Mendelian randomization analysis indicated that genetic liability to MDD has a causal effect on CAD and stroke. Comparison of genome-wide genes shared by MDD and CVD suggests 20q12 as a pleiotropic region conferring risk for both MDD and CVD. Cross-trait meta-analyses and fine-mapping of transcriptome-wide association signals identified novel risk genes for MDD and stroke, including RPL31P12, BORCS7, PTPN11, and PGF. Many genetic variations associated with MDD and CVD outcomes are shared, suggesting that genetic liability to MDD may also confer risk for stroke and CAD. The presented results shed light on mechanistic connections between MDD and CVD phenotypes.


INTRODUCTION
Collectively, mental disorders and cardiovascular diseases (CVD) account for a large proportion of the total disability and morbidity worldwide (1,2). Major depressive disorder (MDD), commonly referred to as depression, is characterized by the persistence of low mood. MDD is the most prevalent mental disorder and is accompanied by considerable morbidity, mortality, and a high risk of suicide (3). At some point during their lifetime, it affects 1 out of 5 adults (4). Major forms of CVD include hypertension, coronary heart disease, heart failure, stroke, and atrial fibrillation. The high rate of comorbidity between depression and CVD is well acknowledged; patients with depression are more likely to develop CVD, and patients with CVD have higher depression scores than the general population (5). Among patients with CVD, depression is a major contributor to increased healthcare cost, mortality, and reduced quality of life (6,7), and is also considered an independent risk factor for major adverse cardiovascular events (8). Specifically in coronary heart disease patients, the prevalence of depression is reported at 15-23% (9).
A prevailing measure for quantifying the genetic relationship between two traits is the genetic correlation coefficient, whose sign indicates the direction of the shared genetic effect. However, when shared genetic variants have a mixture of effect directions, genetic correlation analyses may be underpowered (10). Polygenic overlap was recently proposed as a complementary measure: the fraction of genetic variants causally associated with both traits over the total number of causal variants across the pair of traits involved (10).
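The distinction between the two measures can be illustrated with a minimal sketch. The effect sizes below are invented for illustration and are not taken from any dataset in this study; the correlation of per-variant effects is only a crude stand-in for a genetic correlation estimate, and the overlap fraction follows the verbal definition given above rather than the MiXeR model itself.

```python
from math import sqrt

def effect_correlation(beta_a, beta_b):
    """Pearson correlation of per-variant effects (crude stand-in for r_g)."""
    n = len(beta_a)
    mean_a = sum(beta_a) / n
    mean_b = sum(beta_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(beta_a, beta_b))
    var_a = sum((a - mean_a) ** 2 for a in beta_a)
    var_b = sum((b - mean_b) ** 2 for b in beta_b)
    return cov / sqrt(var_a * var_b)

def overlap_fraction(beta_a, beta_b):
    """Variants causal for both traits over variants causal for either trait."""
    shared = sum(1 for a, b in zip(beta_a, beta_b) if a != 0 and b != 0)
    either = sum(1 for a, b in zip(beta_a, beta_b) if a != 0 or b != 0)
    return shared / either

# Hypothetical effects: four shared variants, two concordant and two discordant,
# plus one variant specific to trait A and one null variant.
beta_trait_a = [0.02, 0.02, -0.02, -0.02, 0.01, 0.0]
beta_trait_b = [0.03, -0.03, 0.03, -0.03, 0.0, 0.0]

correlation = effect_correlation(beta_trait_a, beta_trait_b)
overlap = overlap_fraction(beta_trait_a, beta_trait_b)
```

In this toy example the concordant and discordant effects cancel, so the correlation is essentially zero even though four of the five causal variants are shared.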
In previous studies, MDD has been reported to be genetically correlated with coronary artery disease (CAD) (11). Nevertheless, whether these associations are causal remains to be seen. The Mendelian randomization (MR) approach tests for a causative association between an exposure and an outcome by utilizing genetic variants as instrumental variables (12,13). Several frameworks have been proposed for MR analysis, including MR-Egger methods (14). Recently, a powerful GSMR (Generalized Summary-data-based Mendelian Randomization) suite was developed to account for linkage disequilibrium (LD) by leveraging power from multiple genetic variants (15). The GSMR framework is increasingly employed in recent analyses (16-20), with reports of the causal effects of MDD on small vessel stroke, ischemic heart disease, and CAD already available (21-23).
In this study, we evaluated genetic correlation and polygenic overlap between MDD and eight cardiovascular conditions and reported their causal associations. To achieve this, a multi-SNP MR analysis was run on summary GWAS datasets. Across MDD and CVD, pleiotropic genes were extracted by comparing genome-wide genes reported for each trait. Then, in cross-trait meta-analyses, pleiotropic genomic loci shared between MDD and stroke were identified, followed by prioritizing putative risk genes by leveraging a multi-tissue eQTL database.

METHOD
GWAS Summary Datasets and Quality Control
The summary results of GWAS of MDD (20) and seven cardiovascular conditions, namely CAD (24), heart failure (25), atrial fibrillation (26), stroke (27), systolic blood pressure (28), diastolic blood pressure (28), and pulse pressure measurement (28), were used for the analyses. The summary result of a GWAS of CVD (29) was used in the validation stage. The CVD dataset included a mixture of multiple cardiovascular diseases recruited by the UKB (29). Participants from these datasets were either of European origin (for MDD, stroke, heart failure, CVD, and blood pressure) or mainly of European origin (for atrial fibrillation and CAD). Condition-specific sample sizes ranged from 332,477 to 977,323. Each SNP was analyzed across pairs of datasets after exclusion of all SNPs with conflicting alleles and after effect harmonization. Detailed information on the datasets included in this study is summarized in Table 1 and the Supplementary File.
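The allele check and effect harmonization described above can be sketched as follows. This is an illustrative outline only: the dictionary keys (`ea`, `oa`, `beta`) and the function name are hypothetical, and the sketch does not attempt strand flipping or any other step that the text does not describe.

```python
def harmonize_effect(snp_a, snp_b):
    """Align the trait-B effect to the effect allele used for trait A.

    Each SNP record is a dict with hypothetical keys:
      'ea'   effect allele, 'oa' other allele, 'beta' effect estimate.
    Returns the aligned trait-B beta, or None when the allele pairs conflict
    (such SNPs are excluded from the pairwise analysis).
    """
    pair_a = (snp_a["ea"], snp_a["oa"])
    pair_b = (snp_b["ea"], snp_b["oa"])
    if pair_a == pair_b:
        return snp_b["beta"]          # same orientation: keep the sign
    if pair_a == (pair_b[1], pair_b[0]):
        return -snp_b["beta"]         # alleles swapped: flip the sign
    return None                       # conflicting alleles: drop the SNP

# Hypothetical records for one SNP in two summary datasets.
mdd_record = {"ea": "A", "oa": "G", "beta": 0.021}
stroke_record = {"ea": "G", "oa": "A", "beta": -0.015}
aligned_beta = harmonize_effect(mdd_record, stroke_record)
```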

Genetic Correlation and Polygenic Overlap Analysis
GWAS summary results were utilized to estimate the genetic correlation of MDD with cardiovascular conditions using LD score regression software (LDSC, v1.0.1) (30,31). Polygenic overlaps were analyzed by MiXeR v1.2 using default parameters (10). The MiXeR pipeline evaluates the number of shared and trait-specific causal variants between two traits, while accounting for effects of LD structure, minor allele frequency (MAF), sample size, cryptic relationships, and sample overlap. The reported number of causal variants corresponds to the 22.6% of the total estimate that accounts for 90% of SNP heritability for each trait.

MR Analyses
We examined causal effects between MDD and the seven cardiovascular conditions, namely, CAD, heart failure, atrial fibrillation, stroke, systolic blood pressure, diastolic blood pressure, and pulse pressure measurement. GSMR v1.0.9 was used to infer bidirectional causal associations between MDD and the cardiovascular conditions, with the analysis of causal effects of cardiovascular conditions on MDD being called reverse Mendelian randomization (15). Instrumental variants were selected based on the default threshold of P ≤ 5 × 10⁻⁸. When fewer than 10 SNPs passed this threshold, a P-value threshold of 1 × 10⁻⁵ was used. As pleiotropy is a known potential source of bias and, therefore, of inflated estimates in an MR analysis (32), we used the HEIDI-outlier approach, which detects and eliminates genetic instruments with apparent pleiotropic effects on both the risk factors and the disease (15,33). Multiple tests were corrected by FDR, with a significant causal association detected at FDR < 0.05. A detailed description of the MR is provided in the Supplementary Methods section.
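The instrument-selection rule and the FDR correction can be sketched in a few lines. This is not the GSMR implementation and omits the HEIDI-outlier step; the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed here as the FDR method because the text does not name one for the MR analysis.

```python
def select_instruments(pvalues, strict=5e-8, relaxed=1e-5, min_snps=10):
    """Pick instruments at the strict threshold; fall back to the relaxed
    threshold when fewer than `min_snps` variants pass.

    `pvalues` maps SNP identifiers to exposure GWAS P-values.
    """
    chosen = [snp for snp, p in pvalues.items() if p <= strict]
    if len(chosen) < min_snps:
        chosen = [snp for snp, p in pvalues.items() if p <= relaxed]
    return chosen

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvals[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical P-values for four MR tests.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in fdr]
```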

Comparison of Genome-Wide Genes Shared Between MDD and CVD
GWAS results were obtained for MDD and four types of CVD from the GWAS Catalog database (34). For stroke, we combined the following labels: stroke as such, large artery stroke, small vessel stroke, cardioembolic stroke, and ischemic stroke. Analysis of gene overlaps among the five traits was conducted using the R package SuperExactTest (35), with the total gene number in the genome being set as 30,000.
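For a single pair of gene sets, an over-representation test of this kind reduces to a hypergeometric tail probability. The sketch below is a simplified stand-in for SuperExactTest, which also handles multi-set intersections; the function names are hypothetical, and the shared-gene count in the example is invented because the pairwise overlap counts are not given in the text.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def overlap_pvalue(n_a, n_b, n_shared, n_total=30000):
    """P(overlap >= n_shared) when n_a and n_b genes are drawn at random
    from a background of n_total genes (hypergeometric upper tail)."""
    lowest = max(n_shared, n_a + n_b - n_total, 0)
    highest = min(n_a, n_b)
    log_denominator = log_comb(n_total, n_b)
    return sum(
        exp(log_comb(n_a, k) + log_comb(n_total - n_a, n_b - k) - log_denominator)
        for k in range(lowest, highest + 1)
    )

# Gene-set sizes for MDD and stroke as reported in this study; the shared
# count of 60 is a hypothetical value used only to exercise the function.
expected_by_chance = 1653 * 426 / 30000
p_example = overlap_pvalue(1653, 426, 60)
```

With a background of 30,000 genes, about 23 shared genes would be expected by chance for sets of these sizes, so an observed overlap well above that value yields a small tail probability.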

Cross-Trait Meta-Analysis
Given that MDD has the closest relationship with stroke among the CVD, we performed a cross-trait meta-analysis of MDD and stroke using the subset-based fixed-effects method ASSET (version 2.4.0) (36). The meta-analysis pools the effect of a given SNP across K studies, weighting the effects by the size of the study under the default parameters. After subset-based meta-analysis, SNPs with P-values lower than 5 × 10⁻⁸ were considered statistically significant. FUMA was used for functional annotation and gene-mapping of variants and to identify LD-independent genomic regions in the meta-analysis result (37). Enrichment of the shared genes in the categories reported in the GWAS Catalog was calculated using FUMA (37). Gene property analysis for tissue specificity was also performed with FUMA. To ensure that sample overlap did not contribute to inflated estimates of genetic overlap between MDD and stroke, λmeta statistics were calculated (38). The λmeta statistic uses effect size concordance to detect sample overlap or heterogeneity. Under the null hypothesis, λmeta = 1 when the pair of cohorts are completely independent. When there are overlapping samples, λmeta < 1. When there is heterogeneity between datasets, the expectation is λmeta > 1. In most GWAS meta-analyses, λmeta is likely to be slightly larger than 1 due to unknown heterogeneity.
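A statistic with the properties described for λmeta can be sketched as follows. The exact formula is an assumption, since the text does not state it: here each SNP contributes a squared standardized difference of its two effect estimates, and the median of these values is scaled by the median of a chi-square distribution with one degree of freedom. The function name and example values are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def lambda_meta(beta_1, se_1, beta_2, se_2):
    """Concordance statistic for two cohorts (assumed form, see lead-in).

    Values near 1 suggest independent cohorts, values below 1 suggest
    overlapping samples, and values above 1 suggest heterogeneity.
    """
    t_stats = [
        (b1 - b2) ** 2 / (s1 ** 2 + s2 ** 2)
        for b1, s1, b2, s2 in zip(beta_1, se_1, beta_2, se_2)
    ]
    return median(t_stats) / CHI2_1DF_MEDIAN

# Hypothetical effect estimates for three SNPs in two cohorts.
example = lambda_meta(
    beta_1=[0.1, 0.2, 0.3], se_1=[0.1, 0.1, 0.1],
    beta_2=[0.0, 0.2, 0.5], se_2=[0.1, 0.1, 0.1],
)
```

In practice the statistic would be computed over a large set of approximately independent SNPs rather than three.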

Fine-Mapping of TWAS Associations
To prioritize putatively causal genes, we applied fine-mapping of causal gene sets (FOCUS v0.6.10) (39) to the results of the MDD and stroke meta-analysis in three relevant tissues: the brain, whole blood, and heart. FOCUS models predicted expression correlations and assigns a posterior inclusion probability (PIP) to genes at each transcriptome-wide association study (TWAS) region and relevant tissue type. A multi-tissue eQTL reference weight database was employed, and LD information from LDSC was used as reference. Multiple testing corrections were used to account for all gene-tissue pairs using Benjamini-Hochberg adjusted TWAS P-values (FDR < 0.05).
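The way a credible set is built from posterior inclusion probabilities can be sketched as follows. This is a generic outline and an assumption about the procedure, not the FOCUS source code: PIPs within a region are normalized, ranked, and accumulated until the requested coverage is reached. The gene labels and PIP values are hypothetical.

```python
def credible_set(pips, level=0.90):
    """Return the smallest set of top-ranked models whose normalized PIPs
    sum to at least `level`.

    `pips` maps a model label (a gene, or a null model) to its PIP.
    """
    total = sum(pips.values())
    ranked = sorted(pips.items(), key=lambda item: item[1], reverse=True)
    chosen = []
    cumulative = 0.0
    for label, pip in ranked:
        if cumulative >= level:
            break
        chosen.append(label)
        cumulative += pip / total
    return chosen

# Hypothetical PIPs for one TWAS risk region.
region_pips = {"GENE_A": 0.60, "GENE_B": 0.25, "GENE_C": 0.10, "NULL": 0.05}
genes_in_set = credible_set(region_pips)
```

A gene with a PIP above 0.90 would form a 90% credible set on its own, which is why high-PIP genes are the natural candidates to prioritize.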

RESULTS
Genetic Correlation and Polygenic Overlap Analysis
Genetic correlation analyses indicated that MDD has a significant genetic correlation with CAD, heart failure, atrial fibrillation, and pulse pressure (Table 2). Polygenic overlap analysis indicated that 15.8 thousand variants causally influence MDD, while the CVD traits were associated with much smaller numbers of causal variants, ranging from 0.5 thousand for atrial fibrillation to 2.8 thousand for heart failure. Each of the tested CVD or cardiovascular measurements shared a substantial set of causal variants with MDD (Figure 1A).

MR Analysis
MDD confers a causal effect on CAD, stroke, and pulse pressure (Table 2, Figure 1B).

Validation of Genetic Correlation and MR Analysis
In the validation stage, we examined the genetic correlation and causal associations between MDD and CVD. Our results indicate that MDD has a significant genetic correlation with CVD (r_g = 0.357, s.e. = 0.056, P = 1.79 × 10⁻¹⁰). Genetic liability to MDD confers a causal effect on CVD (b_xy = 0.26, s.e. = 0.10, P = 9.84 × 10⁻³), while genetic liability to CVD confers a causal effect on MDD (b_xy = 0.07, s.e. = 0.03, P = 4.74 × 10⁻³). However, the causal effect conferred by CVD on MDD was relatively weak.

Overlapped Genes Between MDD and CVD
There were 675, 253, 328, 426, and 1,653 genome-wide significant genes for CAD, heart failure, atrial fibrillation, stroke, and MDD, respectively. There was an over-representation of shared genes between MDD and each of the four types of CVD (Figure 1C, Supplementary Table 1). A total of seven pleiotropic genes were implicated in MDD and at least three types of CVD, including SLC39A8, MAML3, FADS2, ZFHX3, PLCG1, ZHX3, and ADI1P1 (Figure 1D). Notably, ZHX3 and ADI1P1 genes were shared by MDD with all four types of CVD.

Cross-Trait Meta-Analysis
The cross-trait meta-analysis of MDD and stroke revealed 45 loci with 104 independent significant SNPs (IndSigSNPs), including 13 loci involving 19 pleiotropic IndSigSNPs associated with both traits (Table 3, Figures 2A-D). Tissue expression analysis showed that the associations were significantly enriched in brain tissues (Figure 1F). For datasets on MDD and stroke, λmeta

Fine-Mapping of TWAS Associations
To prioritize putatively causal genes from the meta-analysis of MDD and stroke, fine-mapping of TWAS associations was performed. A total of 100 gene-tissue pairs were identified as part of the 90% credible set for the three tissues, with 71 genes in total. Four genes were identified to be in the credible set with the highest posterior probabilities (PIP > 0.90), including RPL31P12, BORCS7, PTPN11, and PGF (Supplementary Table 2, Figure 1G, Supplementary Figures 2-5).

DISCUSSION
Depression is a major cause of morbidity and poor quality of life among CVD patients (6), and an independent risk factor for major adverse cardiovascular events (8). The comorbidity of depression and adverse cardiovascular outcomes typically forms a vicious cycle, known to significantly impact both the course and the management of these common conditions. The polygenicity of MDD is much higher than that of CVD. Although the genetic correlation between MDD and CVD is relatively low, substantial polygenic overlap between MDD and CVD was evident. For each CVD or related physiological parameter, more than 60% of genetic variants overlap with those of MDD. Notably, nearly all causal variants influencing atrial fibrillation risk also affect MDD. In addition, we observed an over-representation of shared genes between MDD and all types of CVD. Interestingly, two genes located at chromosome 20q12, PLCG1 and ZHX3, were implicated in all five traits, making the chromosome 20q12 region a major pleiotropic locus for both MDD and CVD.
The PLCG1 gene encodes the protein PLCγ1, which plays a key role in the intracellular transduction of signals from receptor-mediated tyrosine kinase activators. In the brain, PLCγ is primarily activated by neurotransmitters, neurotrophic factors, and hormones. Prior studies have reported the potential role of PLCG1 in both normal brain function and brain disorders, including MDD (40,41). On the other hand, PLCγ1-dependent signaling is critical for arterial development (42), the repair of the intima after vessel injury (43), and the myogenic constriction of cerebral arteries (44). The ZHX3 gene encodes a member of the zinc fingers and homeoboxes (ZHX) gene family. Dysregulation of ZHX factors has been reported in both neurological and hematological diseases (45).
Although the high comorbidity of MDD and CVD has long been acknowledged, and their associations have been well studied and discussed (7,11), causal relationships between these two conditions have come into focus only recently. In this work, genetic liability to MDD was shown to etiologically influence the development of CAD and stroke, while liability to cardiovascular outcomes exerted no or minimal influence on MDD. Genetic correlation evaluates the relationship between two traits, and the sign of the correlation coefficient is determined by whether the directions of the shared genetic effects are predominantly the same or opposite for the two traits. Two traits can have substantial polygenic overlap with a non-significant genetic correlation between them (10,46), which may account for the causal effect of MDD on stroke in the context of no genetic correlation between them. This leads us to argue that the high rate of cardiovascular events in MDD patients may, at least partially, follow from genetic variations inherited by the patients. When compounded with an unhealthy lifestyle, including the overall reduction in physical activity commonly seen in depressed patients, this pre-existing liability may lead to the development of cardiovascular disease. In contrast, the high rate of depression seen in CVD patients may largely be due to a psychological and physical reaction that occurs after cardiovascular events, rather than to inherited genetic liability to MDD.
A recent study by Tang et al. reported a causal association of MDD with CAD (23). As the present study was conducted in a CAD dataset almost twice as large as that utilized by Tang et al. (332,477 vs. 184,305 patients) and as the analytic frameworks were different, our study may be interpreted as corroborating evidence for the causal effect of MDD on CAD. Another recent work reported that genetic risk factors for MDD may pleiotropically increase CAD risk in females (47). However, the causal effect of MDD on CAD uncovered in our study was relatively weak (b_xy = 0.06) when compared with the effect of MDD on stroke (b_xy = 0.19). Moreover, our results do not support a causal role of genetic liability to MDD in the development of hypertension but suggest that liability to MDD may result in a marked reduction of pulse pressure instead (b_xy = −0.56).
Importantly, our results point to a causal effect of MDD on stroke, thus extending findings from the study by Cai et al., which reported a causal effect of MDD on an increased risk of small vessel stroke, but not of large artery stroke (21). The high comorbidity between MDD and stroke has long been observed, with post-stroke depression constituting a common mental health issue (48,49). However, biological mechanisms underlying the phenotypic relationships between MDD and stroke remain largely elusive. Our meta-analysis of MDD and stroke identified 16 protein-coding genes as shared by the two traits. Among these genes, nine have been previously implicated in GWASs of depression, namely, AREL1, DENND1A, NR4A2, PAX5, RPS6KL1, SOX5, TMEM106B, VRK2, and YLPM1; none of these genes have been identified in any GWAS for stroke. Five genes have been described as genome-wide associated with cardiovascular traits, including PGF, PROX2, DLST, TMEM106B, and VWDE. Notably, TMEM106B was repeatedly identified as a risk gene for frontotemporal lobar degeneration (50-52). Evidence for the involvement of TMEM106B in depression is also compelling (20,53,54). Incidentally, one recent study reported TMEM106B as a genome-wide risk gene for CAD (55).
To identify potentially causal genes involved in MDD and stroke, we used the fine-mapping of TWAS hits implemented in FOCUS. In the course of estimating causality in three relevant tissues, a total of 71 genes were included in the 90% credible set, including four genes with high PIP. Specifically, the genomic region 1p31.1 (Figure 2B) containing RPL31P12 was included in the 90% credible gene set with a posterior probability of 1.00 in the brain cerebellum. It was reported that the SNP rs10789336 in the NEGR1 gene is associated with the expression level of RPL31P12 in brain tissues, and also confers risk for MDD (56). In the 10q24.32 region, BORCS7, a genome-wide risk gene for schizophrenia (57,58), blood pressure (59,60), body mass index (61), and CAD (55), had the highest PIP of 0.97 in the dorsolateral prefrontal cortex. Notably, in a PET imaging study, a SNP in this gene was associated with altered dopaminergic function (62). Given that both stroke and MDD affect the brain, both the RPL31P12 and BORCS7 loci are attractive candidates for conferring genetic liability to both diseases.
In the 12q24.13 region (Figure 2C), PTPN11 entered the credible gene set with a PIP of 0.92 for the left ventricle of the heart. Previous GWASs have implicated PTPN11 in peripheral artery disease (63), blood pressure (64,65), and multiple sclerosis (66). The PTPN11 locus encodes SHP2, a member of the protein tyrosine phosphatase family that regulates a wide variety of cellular functions including cell growth, differentiation, mitotic cycle, and oncogenic transformation. In particular, SHP2 serves as a pivotal regulator of normal cardiac development and function (67). PTPN11 mutations are the most common cause of Noonan syndrome, a relatively common autosomal dominant disorder classified as a RASopathy (68), that is, a disorder of RAS signaling commonly associated with hypertrophic cardiomyopathy or other malformations of the blood vessels. Our study provides evidence supporting the potential causal role of PTPN11 in stroke.
In the genomic locus 14q24.3 (Figure 2D), PGF had a PIP of 0.96 for the left ventricle of the heart. PGF encodes a secreted placental growth factor (PGF), which belongs to the vascular endothelial growth factor (VEGF) superfamily. PGF regulates cardiac adaptation through hypertrophy of the heart tissue by inducing capillary growth and fibroblast proliferation (69). In the heart, PGF serves as a protective paracrine effector (70). One animal study demonstrated that deficiency of Pgf in rodents affects cognitive functions, brain neuroanatomy, and cerebrovasculature (71). In human patients, reduced expression of PGF was linked to preeclampsia and to cerebrovascular and neurological aberrations occurring in fetuses; in turn, preeclampsia may impair cognitive functioning, increase the risk for stroke, and lead to adverse stroke outcomes (72). Previous genome-wide analyses identified PGF as a candidate gene both for CAD (55) and for mood instability (73). Our meta-analysis identified PGF as a risk gene for both MDD and stroke, and fine-mapping of TWAS signals further supported PGF as a possible causal gene for stroke.
In 2008, the American Heart Association (AHA) issued an advisory to screen all patients with CAD for depression (74). Later it was demonstrated that, in this group of patients, a standardized screening pathway for the assessment of depression offers the potential for early identification and improved management (75,76). Similarly, recognition of shared genetic liability between MDD and CVD suggests the need to evaluate cardiovascular risk in patients with MDD, for example, by using polygenic risk scores (PRS). Since medical comorbidities are also known to contribute to either poor response to antidepressants or treatment resistance (77), it is tempting to speculate that a stratified allocation of treatment for MDD patients with higher genetic risk for CVD may help both to achieve a better response to SSRIs and to lower the risk for an adverse outcome of CVD.
Together, our study reveals novel mechanisms by which MDD influences the risk for the development of CVD (Figure 1E). Identification of shared genetic foundations for MDD and CVD may guide drug discovery and inform early prediction and personalized treatment for these commonly comorbid conditions.
The presented study has several strengths. First, to evaluate the shared genetic liability between MDD and CVD, multiple cardiovascular outcomes were analyzed. Second, for each trait, we typically prioritized the largest available dataset as a study backbone. Furthermore, to avoid potential population heterogeneity across the studies, whenever possible, we limited our analysis to individuals of European ancestry. Finally, the genetic relationships between MDD and CVD were evaluated using multiple analytic strategies that corroborated each other.
We should acknowledge several limitations of this work. As our analyses were limited to the genetic component of the traits and to populations of European ancestry, the presented results should be interpreted cautiously. It is also worth noting that TWAS associations are not free of noise, since the gene expression levels were imputed from weighted linear combinations of SNPs. Considering that the observed causal effect of MDD on CAD was relatively weak, only stroke was included in the further gene-hunting analyses, thus minimizing the possibility of overreaching in causal inference.

CONCLUSION
MDD and major types of CVD share substantial genetic variations. Genetic liability to MDD may confer risk for stroke and CAD. The presented results shed light on mechanisms underlying phenotypic relationships between MDD and CVD and prioritize several candidate genes for future studies.

AUTHOR CONTRIBUTIONS
FZ conceived the project and analyzed the data. FZ and AB wrote the manuscript. FZ, AB, and HC contributed to the revision of the manuscript. All authors read and approved the final manuscript.

FUNDING
This work was supported by the National Natural Science Foundation of China (81471364).