Joint Analysis of Genetic Correlation, Mendelian Randomization and Colocalization Highlights the Bi-Directional Causal Association Between Hypothyroidism and Primary Biliary Cirrhosis

Background: Hypothyroidism and primary biliary cirrhosis (PBC) are often co-existed in observational epidemiological studies. However, the causal relationship between them remains unclear. Methods: Genetic correlation, Mendelian randomization (MR) and colocalization analysis were combined to assess the potential causal association between hypothyroidism and PBC by using summary statistics from large-scale genome-wide association studies. Various sensitivity analyses had been conducted to assess the robustness and the consistency of the findings. Results: The linkage disequilibrium score regression demonstrated significant evidence of shared genetic architecture between hypothyroidism and PBC, with the genetic correlation estimated to be 0.117 (p = 0.006). The OR of hypothyroidism on PBC was 1.223 (95% CI, 1.072–1.396; p = 2.76 × 10−3) in MR analysis with inverse variance weighted (IVW) method. More importantly, the results from other 7MR methods with different model assumptions, were almost identical with that of IVW, suggesting the findings were robust and convincing. On the other hand, PBC was also causally associated with hypothyroidism (OR, 1.049; 95% CI, 1.010–1.089; p = 0.012), and, again, similar results can also be obtained from other MR methods. Various sensitivity analyses regarding the outlier detection and leave-one-out analysis were also performed. Besides, colocalization analysis suggested that there existed shared causal variants between hypothyroidism and PBC, further highlighting the robustness of the results. Conclusion: Our results suggest evidence for the bi-directional causal association between hypothyroidism and PBC, which may provide insights into the etiology of hypothyroidism and PBC as well as inform prevention and intervention strategies directed toward both diseases.


INTRODUCTION
Primary biliary cirrhosis (PBC), known as primary biliary cholangitis, is a chronic inflammatory autoimmune disease of the liver (Carey et al., 2015). PBC is often resulted from progressive destruction of the small bile ducts of the liver, leading to cholestasis, fibrosis and eventually cirrhosis. The highest prevalence is estimated to be 40.2 per 100,000 people with the global prevalence being 14.6 per 100,000 people (Lu et al., 2018b;Trivedi and Hirschfield, 2021). PBC is much more common in women (Lu et al., 2018a), and can incur morbidity with the development of pruritus, fatigue, sicca symptoms (also known as Sjögren's syndrome), and abdominal discomfort (European Association for the Study of the Liver, 2017). These complex clinical symptoms of PBC can be long-lasting and will result in significant damage to quality of life (Mells et al., 2013). In addition, it was estimated that up to 40% of patients had no response to standard medications for PBC (Kuiper et al., 2009;Lindor et al., 2019). Without an effective intervention, those moderately advanced or advanced PBC patients will be at high risk of liver failure and hepatocellular carcinoma and may need liver transplantation in the late stage of disease (Trivedi et al., 2016). Thus, it is necessary to identify risk factors of PBC and then benefit for the clinical prevention.
Despite vigorous efforts in the characterization of autoantibodies and bile duct histopathology, the etiology of PBC remains unclear. One common sense is that PBC is caused by genetic as well as environmental factors (Gershwin and Mackay, 2008;Carey et al., 2015). Often, the liver has an important role in thyroid hormone metabolism and the level of thyroid hormones is also important to normal hepatic function and bilirubin metabolism (Huang and Liaw, 1995). Many studies have been conducted to investigate the relationship between thyroid function and liver disease, especially between hypothyroidism and liver cirrhosis. Previous observational epidemiological studies had shown that hypothyroidism was frequently associated with PBC (Golding et al., 1973;Elta et al., 1983;Floreani and Cazzagon, 2018), and the diagnosis of hypothyroidism may precede or follow that of PBC (Elta et al., 1983). Another study found that the high prevalence of thyroid disease, especially Hashimoto's thyroiditis, was observed in PBC patients and thyroid disease may not influence the natural history of PBC or patient survival (Floreani et al., 2017). A populationbased cohort study showed that cirrhotic patients with hypothyroidism became euthyroid after thyroxine treatment, but subsequently presented some degree of liver damage (Oren et al., 2000). While other studies found that hypothyroidism was associated with an increased risk of liver fibrosis (Kim et al., 2018;Bano et al., 2020). It was not surprising to observe the inconsistent findings regarding the relationship between hypothyroidism and liver function, given that observational studies can be susceptible to measurement error, reverse causation as well as unmeasured confounders. Although many studies have confirmed that hypothyroidism has a certain association with PBC, the observational design could not allow for proving causality.
Recently, the proliferation of publicly available genome-wide association studies (GWASs) provides a rich resource of large-scale summary data that did not involve privacy and ethical issue, which has promoted the researchers to examine previously known and novel relationships among complex traits (Welter et al., 2014;Watanabe et al., 2019). Genetic correlation, Mendelian randomization (MR) and colocalization analysis are three widely-used methods to fully utilize genetic data for better understanding the genetic relationship between two traits (Burgess et al., 2018). One main advantage of these methods is that they are less sensitive to many sources of bias commonly encountered in traditional epidemiological studies. Specifically, the genetic correlation can estimate the correlation in alleles effect between two traits (e.g., hypothyroidism and PBC) across genetic variants in the whole genome (van Rheenen et al., 2019), which is symmetric in its two traits and gives no information about the direction of the correlation. While MR can assess the causal effect of an exposure (e.g., hypothyroidism) on an outcome of interest (e.g., PBC) via focusing on genome-wide significantly associated genetic variants as instrumental variables for the exposure of interest (van Rheenen et al., 2019;Kraft et al., 2020). Since genetic variants were randomly allocated from parents to offspring at conception and would not be modified, MR can be thought of a "naturally" randomized controlled trials (Haycock et al., 2016). Thus, MR is well acknowledged to be an efficient and cost-effective method to interrogate the causal relationships among health risk factors and disease outcomes (Haycock et al., 2016) as well as among molecular traits and disease outcomes (Yuan et al., 2020;Liu et al., 2021). Colocalization analysis is to examine whether two potentially related traits share common genetic causal variants in a given region (Giambartolomei et al., 2014). Unlike MR assumes that the genetic variants affect outcome only via exposure, colocalization allows the genetic variants to be associated with both traits directly. It should be noted that these three methods, though with different focus, can complement each other. Joint analysis of these three approaches can provide comprehensive investigation and better understanding for the relationship between two complex traits.
At present study, we performed the joint analysis of genetic correlation, MR and colocalization to deeply demonstrate the relationship between hypothyroidism and PBC using large-scale GWAS summary statistics ( Figure 1). Various sensitivity analyses had been conducted to assess the robustness and the consistency of the findings. The results can advance our understanding of the etiology of PBC and hypothyroidism, as Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753352 2 well as provide guidance for prevention and intervention toward both diseases.

GWAS Data Sources
We downloaded summary statistics of hypothyroidism from the GWAS ATLAS (https://atlas.ctglab.nl/traitDB/3602), with18,740 cases and 270,568 controls from the UK Biobank (data filed 20002_ 1226) with 10,321,705 SNPs (Watanabe et al., 2019). Hypothyroidism was defined as self-reported history of hypothyroidism/myxoedema. The GWAS summary statistics of PBC was obtained from a genomewide meta-analysis, which was the largest PBC GWAS to date and integrated samples from three cohorts and in total involved 2,764 cases and 10,475 controls with 1,134,141 SNPs (Cordell et al., 2015). All PBC cases fulfilled the American Association for the Study of Liver disease criteria for PBC. The details of GWAS datasets used in the analysis were briefly summarized in Table 1.

Genetic Correlation Analysis
We applied cross-trait linkage disequilibrium score regression (LDSC) to estimate the SNP heritability (h 2 ) of hypothyroidism and PBC, and the overall genetic correlation (r g ) between them (B. . LDSC was implemented by regressing the product of z statistics from two studies of traits on LD scores, which were pre-computed using 1000 Genomes European data (B. K. . We estimated the SNP heritability on the liability scale with the sample and population prevalence being 6.48 and 4.6% (Taylor et al., 2018) for hypothyroidism, and 20.88 and 0.0146% (Trivedi and Hirschfield, 2021) for PBC. The regression slope of LDSC can provide an unbiased estimate of r g . Detailed formula of cross-trait LDSC was provided in the Supplementary Methods.

Two-Sample Mendelian Randomization
We first conducted MR analysis to investigate the causal effect of hypothyroidism on PBC. Often, three assumptions should be satisfied for the genetic variant that can be served as valid instrumental variables in an MR analysis ( Figure 2): 1) the genetic variant is strongly associated with the exposure (the relevance assumption); 2) the genetic variant is not associated with any potential confounding between the exposure and the outcome (the independence assumption); 3) the genetic variant only affects outcome through the exposure (the exclusion restriction assumption).  To choose the valid instrumental variables for MR analysis, we followed the previous MR study (Zeng and Zhou, 2019) with a stringent selection procedure. Specifically, we 1) selected SNPs which are strongly associated with exposure (e.g., hypothyroidism) at genome-wide significance (p < 5 × 10 −8 ); 2) matched these significant SNPs with GWAS summary statistics of outcome (e.g., PBC) by chromosome, position and rsid, and removed those unmatched ones; 3) screened out independent SNPs using PLINK (version 1.90) (Purcell et al., 2007), based on r 2 < 0.001 or physics distance more than 10,000 kb; 4) excluded the potentially pleiotropic SNPs that are associated with outcome at the p value less than 0.05 after Bonferroni correction; 5) harmonized the alleles and effects between the exposure and outcome datasets. Through the rigorous screening process, there were 69 SNPs selected as instrumental variables for hypothyroidism, with detailed information shown in Supplementary Figure S1 and Supplementary Table S1. It should be noted that removing the potentially pleiotropic SNPs was a conservative strategy to sufficiently ensure the validity of MR analysis. The same MR procedure was also performed in the reverse causality analysis to investigate the causal effect of PBC on hypothyroidism. Totally, 13 SNPs were selected to serve as instrumental variables for PBC, with detailed information provided in Supplementary Figure S2 and Supplementary Table S2.
The accuracy of MR results inevitably depends on whether the selected genetic variants meet the 3 MR assumptions. The violation of the relevance assumption would lead to weak instrumental bias, we thus calculated the proportion of variance explained (PVE) by an individual SNP (Shim et al., 2015) and then computed the F statistic (Zeng and Zhou, 2019) to assess this issue, with detailed formula provided in the Supplementary Methods. In practice, it is hard to directly verify whether the independence assumption and the exclusion restriction assumption are met, thus we alternatively used multiple MR methods with different model assumptions to assess the consistency and robustness of the findings: 1) inverse variance weighted (IVW) method , which requires all genetic variants to satisfy the instrumental variable assumptions; 2) weighted median method (Bowden et al., 2016a), which can provide consistent estimates when the proportion of valid instrumental variables is over 50%; 3) weighted mode-based method (Hartwig et al., 2017), which can obtain an unbiased estimate if the weights associated with the valid instrumental variables are the largest; 4) IVW method using robust regression (MR-Robust) , which is robust regression-based IVW method and can obtains consistent estimates by downweighting genetic variants with potential pleiotropic effects; 5) MR-Lasso (Slob and Burgess, 2020), which uses penalization to identify the candidate instrumental SNPs; 6) MR Pleiotropy RESidual Sum and Outlier (MRPRESSO) (Verbanck et al., 2018), which first identifies horizontal pleiotropic SNPs as the outliers and then removes these outliers to infer the causal effect; 7) MR-Egger (Bowden et al., 2015), which assumes that the genetic effect on the exposure is independent of the pleiotropic effect, and it can obtain a consistent causal estimate even when all genetic variants were invalid; 8) MR Robust Adjusted Profile Score (MR-RAPS) (Zhao et al., 2020), which is robust to both systematic and idiosyncratic pleiotropy.
We further conducted the leave-one-out (LOO) analysis by removing each genetic variant out of the MR analysis in turn, to assess the influence of individual SNP on the overall causal estimate. In addition, various diagnostic plots were depicted to illustrate the MR results including scatter plot, individual SNP effect forest plot, funnel plot and LOO forest plot.

Colocalization Analysis
Colocalization analysis was performed to assess whether hypothyroidism and PBC share common genetic causal variant in a genomic region. We implemented colocalization analysis using the commonly used Bayesian modelcoloc (Giambartolomei et al., 2014), which assumes at most one association per trait in a test region and uses Approximate Bayes Factor computation to generate posterior probabilities (PP) of all possible configurations between two traits: 1) H 0 : no association with either trait; 2) H 1 : association with trait 1, not with trait 2; 3) H 2 : association with trait 2, not with trait 1; 4) H 3 : association with trait 1 and trait 2, two distinct SNPs; 5) H 4 : association with trait 1 and trait 2, one shared SNP. The PP of each configuration is respectively denoted by PP 0 , PP 1 , PP 2 , PP 3 , and PP 4 . A large PP 4 (e.g., PP 4 > 75%) was considered to be strong support for colocalization in the original method publication (Giambartolomei et al., 2014), which indicated a shared variant between hypothyroidism and PBC. Genomic regions were defined within 200 kb of the instrumental SNP variables, including 69 SNPs for hypothyroidism and 13 SNPs for PBC. After merging overlapping regions, we totally tested for colocalization in 79 unique regions (Supplementary Table  S3). The default setting of prior probabilities in coloc were used.

Genetic Correlation Analysis
Using LDSC, the SNP heritability on the liability scale was estimated to be 0.2085 (se 0.0257) for hypothyroidism and 0.1527 (se 0.0253) for PBC, respectively. We found that there existed a significant positive genetic correlation (r g 0.177, p 0.006), implying a shared genetic architecture, between hypothyroidism and PBC. The intercept of genetic covariance was estimated to be 0.0135 (se 0.0074, p 0.0681), indicating that there was little or no sample overlap given that hypothyroidism and PBC were usually phenotypic correlated.

Causal Effect of Hypothyroidism on PBC From MR Analysis
Totally, 69 SNPs were carefully selected as valid instrumental variables for hypothyroidism (Supplementary Table S1), which together explained about 1.79% phenotypic variance of hypothyroidism. The F statistics of all these instrumental variables were above 10 with an overall F statistic of 76.40, indicating the non-existence of weak instrument bias.
One SNP (rs2111485) was identified as a potential pleiotropic outlier at the nominal significance level of 0.05 in the MRPRESSO outlier test. However, the causal effect estimate did not change substantially after outlier correction (OR 1.247; 95% CI, 1.098-1.417; p 1.19 × 10 −3 ), and MRPRESSO distortion test further suggested that there was no significant difference in the causal estimates before and after correction for outliers (p 0.768). The funnel plot indicated there was no obvious outlier, strengthening the robustness of our results (Figure 4). In addition, the LOO method also suggested that no single instrument acted as a potential outlier (Figure 4).

Causal Effect of PBC on Hypothyroidism From MR Analysis
We, following the same procedure as above, performed an MR analysis for the casual effect estimate of PBC on hypothyroidism. We identified a total of 13 SNPs as valid instrumental variables for PBC (Supplementary Table S2), which together explained about 5.43% phenotypic variance of PBC. The F statistics of all these instrumental variables were above 10 with an overall F statistic of 58.36, indicating no weak instrument bias.
Furthermore, the MRPRESSO outlier test identified no pleiotropic SNPs at the nominal significance level of 0.05. Both the funnel plot and the LOO method also indicated the non-existence of outlier instrumental SNPs, strengthening the robustness of our results ( Figure 5).

Colocalization Analysis
Among 79 unique regions, we identified two genomic regions with PP 4 greater than 0.75, one on chromosome two and the other on chromosome 17 (Supplementary Table S3 and Table 2), indicating that a common biological mechanism may exist between hypothyroidism and PBC. For the two regions showing evidence for colocalization, SNPs with the maximum PP 4 were treated as the most likely shared causal Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753352 variants, with their regional association plots displayed in Supplementary Figure S3.

DISCUSSION
At present study, we jointly conducted genetic correlation, MR and colocalization analysis with large-scale GWAS summary statistics, to comprehensively investigate the causal relationship between hypothyroidism and PBC. The genetic correlation analysis showed a significant overall correlation between hypothyroidism and PBC. Generally, the observed overall association between hypothyroidism and PBC can be explained in three different ways: causality, horizontal pleiotropy and confounding by LD in alternative genetic variants (Supplementary Figure S4). Our MR analysis supported a bi-directional causal relationship between hypothyroidism and PBC, with rigorous screening for instruments to remove the horizontal pleiotropic SNPs and to minimize the impact of potential confounding factors between hypothyroidism and PBC. The consistent results from a variety of MR methods with different model assumptions guard against possible model misspecification, to ensure the reliability of the findings. Finally, colocalization analysis further ruled out the possibility of confounding due to LD between genetic variants to further strengthen the evidence of the causal relationship between hypothyroidism and PBC. Indeed, hypothyroidism and PBC are usually co-existed (Crowe et al., 1980;Floreani et al., 2015). The bi-directional causal relationship between hypothyroidism on PBC may be biologically supported. Firstly, from the perspective of autoimmunity, the leading cause of hypothyroidism is Hashimoto's thyroiditis, which is an autoimmune thyroiditis (Hollowell et al., 2002). Hashimoto's thyroiditis and PBC might share a common autoimmune etiology, and there is cross-reactivity of anti-thyroid autoantibodies in the presence of autoreactive T cells or similar epithelial antigens in both the liver and the thyroid (Chalifoux et al., 2017). Apart from hypothyroidism, many studies have suggested that patients with PBC typically have an increasing incidence of other autoimmune diseases, such as, Sjögren's syndrome, systemic sclerosis, rheumatoid arthritis, lupus, and coeliac disease (Volta et al., 2002;Narciso-Schiavon and Schiavon, 2017). Secondly, from the perspective of metabolism, thyroid hormone is critical for tissue-organ development, growth, differentiation and metabolism (Mullur et al., 2014), especially for regulating hepatic lipid, cholesterol, and glucose metabolism (Ritter et al., 2020). Hypothyroidism has been implicated in the etiology of fibrotic diseases and a number of animal studies have reported profibrotic effects of hypothyroidism (Bano et al., 2020). It is well-described that liver has a critical role in maintaining thyroid hormone homeostasis (Malik and Hodgson, 2002), and thyroid hormone binding proteins are synthesized in the liver. Approximately 80% of daily circulating triiodothyronine is produced by deiodinase enzymes from thyroxine in the liver (Perra et al., 2016). Furthermore, changes in thyroid hormone metabolism and thyroid dysfunction have been reported in chronic liver diseases (Aizawa et al., 1980;Silveira et al., 2009).
We also identified two shared causal genetic variants through colocalization analysis, including rs2111485 on chromosome two and rs35776863 on chromosome 17. The SNP rs2111485, located in the intergenic region, behaves as a cis-eQTL that affects IFIH1 mRNA expression (Fodil et al., 2016). As a cytoplasmic sensor of  was observed in both the portal tract and liver parenchyma of PBC and type 1 IFN signaling may be a crucial molecular target for future treatment of PBC (Takii et al., 2005). In addition, IFIH1 has also been proved to be significantly associated with the risk of Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 753352 8 hypothyroidism in the UK Biobank (Emdin et al., 2018). These findings indicate that IFIH1 may play an important role in the relationship between hypothyroidism and PBC, and imply that a common autoimmune mechanism may exist. Another causal genetic variant, rs35776863, is important for predicting ACAP1 expression (Hoffman et al., 2017). ACAP1 is a GTPase-activating protein for ADP-ribosylation factor (Arf) 6. Previous evidence illustrate that Arf6 plays key roles in hepatic cord formation during liver development (Suzuki et al., 2006) and thyroid stimulating hormone trafficking (Lahuna et al., 2005).
To our best knowledge, this is the first comprehensive study to illuminate the bi-directional causal association between hypothyroidism and PBC by using genetic approaches based on large-scale GWAS summary statistics, where the large GWAS sample sizes provided us sufficient statistical power to detect causal association (Brion et al., 2013). Moreover, although some assumptions of instrumental variables cannot be fully tested, we have alleviated this issue by adopting rigorous procedures to select instruments and minimize the influence of potential pleiotropic SNPs. In addition, we have conducted various sensitivity analyses to show that our findings were consistent from different MR methods with different model assumptions. The results from MR-Egger are sometimes slightly different from the other methods, which may be due to the strict Egger assumption that all SNPs have the same horizontal pleiotropy effects (Burgess et al., 2018). The violations of the Egger assumption can lead to the low power of MR-Egger method (Bowden et al., 2016b). While another method, MR-RAPS (Zhao et al., 2020), relaxes such model assumption of pleiotropy and has high power to significantly detect the causal effect. It should be noted that the GWAS sample size of PBC is relatively small, and a larger PBC GWAS is certainly warranted to further verify the findings.

CONCLUSION
Appropriate analysis of large-scale genetic dataset can provide some clues for biomedical research. Through the joint analysis, our study rendered strong evidence for the bidirectional causal association between hypothyroidism and PBC, which may provide insights into the etiology of hypothyroidism and PBC as well as inform prevention and intervention strategies directed toward both diseases. However, further experimental study is needed to validate the findings.

DATA AVAILABILITY STATEMENT
All GWAS summary data analyzed during this study were publicly available and the article/Supplementary Material, further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
YS and ZY conceived and designed the study. YW performed the data analyses. PG, LL, YZ and RY interpreted the results of the data analyses. YW, ZY and YS wrote the manuscript. All authors read and approved the final version of manuscript.

FUNDING
This work has been supported by grants from National Natural Science Foundation of China (81673272, 81872712, 82173624, 81922016, 81870607), the Natural Science Foundation of Shandong Province (ZR2019ZD02, ZR2019JQ25, ZR2020ZD14), and the Young Scholars Program of Shandong University (2016WLJH23).

ACKNOWLEDGMENTS
We thank all the GWAS consortiums for making the summary statistics publicly available and are grateful to all the investigators and participants who contributed to those studies.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.753352/ full#supplementary-material Best causal, the SNP with the highest posterior probability to be the causal variant in the genomic region; Hypothyroidism P-value and Hypothyroidism SNP, the lowest p-value found for hypothyroidism and the corresponding SNP; N_SNPs, the number of SNPs tested in the genomic region; PBC p-value and PBC SNP, the lowest p-value found for PBC and the corresponding SNP; PP.H4.abf, the posterior probability of hypothyroidism and PBC sharing a causal variant; Region, genomic region including chromosome and the start and stop position.