Pleiotropy Complicates Human Gene Editing: CCR5Δ32 and Beyond

Publisher Rights Statement: This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


INTRODUCTION
Half a year ago, Chinese scientist He Jiankui pushed an ethical boundary by claiming to have treated two female infants for potential future HIV infection by altering a small piece of their genome. He was thereafter listed among Nature's 10 people who mattered in 2018. "He was widely criticized for ignoring important ethical considerations and exposing the girls to unknown risks for an uncertain benefit", as reported by Nature (Vol. 564, page 329).
The gene editing target that He Jiankui chose came from a study of participants of European ancestry: in a cohort of HIV-1-infected individuals, none was found to be homozygous for the CCR5Δ32 deletion, despite its relatively high allele frequency (9.2%) in the European population (Samson et al., 1996). Later studies further showed that stem cell transplantation from CCR5Δ32 homozygotes can treat HIV-1-infected individuals (Hütter et al., 2009; Gupta et al., 2019). Thus, introducing the deletion into the CCR5 gene seems to be protective against HIV-1 infection. However, the potential side effects of the deletion are far from clear.
He Jiankui was criticized for exposing the young girls to unknown risks. Cyranoski (2018) promptly pointed out in Nature News that the target variant had been reported to have negative effects on a range of human traits. Later, Lander et al. (2019) published a comment in the same journal discussing the medical, scientific, and ethical considerations of gene editing in humans, in which they pointed out that the long-term effects on genetically correlated traits need to be understood before performing gene editing on humans.
According to the literature, CCR5 loss of function has documented adverse effects on, e.g., West Nile virus infection (Glass et al., 2006) and on celiac disease and autoimmune thyroid disorders in patients with type 1 diabetes (Słomiński et al., 2017). However, it has also been reported to be favorable for multiple sclerosis (Barcellos et al., 2000; Kantor et al., 2003), spontaneous hepatitis C viral clearance (Goulding et al., 2005), and chronic and aggressive periodontitis (Cavalla et al., 2018). Although CCR5 is clearly involved in the human immune system, it is hard to assess the potential side effects of its loss.
Very recently, Wei and Nielsen (2019) reported an assessment of CCR5Δ32 homozygous carriers in UK Biobank, who were shown to have a 21% increase in mortality rate. Wei and Nielsen predicted that the Δ32 mutation could be highly pleiotropic and could potentially increase susceptibility to other common diseases.
Here, from a quantitative genetics perspective, we aim to use UK Biobank as a unified source of genomic big data to investigate additional evidence of the substantial pleiotropy of disease-associated DNA variants, starting from the CCR5 gene that He Jiankui tried to edit using CRISPR.

CCR5Δ32 Does More Harm Than Good According to UK Biobank
We first focused on the CCR5Δ32 variant, which was imputed with good quality (variant 3:46414943_TACAGTCAGTATCAATTCTGGAAGAATTTCCAG_T, info score 0.838) in the UK Biobank cohort. This deletion variant was also the one targeted by He Jiankui in his gene editing surgery, as the variant was documented to protect homozygous carriers from HIV infection (Hütter et al., 2009). The association analysis results between this variant and 131 curated disease phenotypes with at least 1000 cases were extracted from the UK Biobank round 2 genome-wide association study (GWAS) results released by Neale's lab (http://www.nealelab.is/uk-biobank/ukbround2announcement; Supplementary Table 1).
The GWAS by Neale's lab was conducted via a simple linear regression of each binary disease outcome vector $y$ (length $n$) on the CCR5Δ32 genotype dosages $g$, i.e., $y = \mu + g\,\beta_{\mathrm{obs}} + e$, where $\mu$ is the phenotypic mean parameter for CCR5 wildtype homozygotes, $\beta_{\mathrm{obs}}$ is the allelic substitution effect of the CCR5Δ32 deletion on the observed scale, and $e$ is the residual vector. When conducting the GWAS, covariates including sex, age, age², sex × age, and sex × age² were fitted to reduce residual variance, and the first 20 principal components of the genomic kinship matrix were also fitted to remove the confounding effect due to population structure. The analysis was performed on 361,194 quality-controlled individuals, with restriction to samples of white British genetic ancestry. The detailed pipeline can be found at https://github.com/Nealelab/UK_Biobank_GWAS.
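As a minimal sketch of the single-variant regression described above, the following Python snippet fits an intercept and a slope by ordinary least squares, regressing a 0/1 disease outcome on genotype dosage. It is an illustration only: the function name `allelic_effect` and the toy data are hypothetical, and the covariates and principal components used in the actual pipeline are deliberately omitted.

```python
def allelic_effect(dosages, outcomes):
    """Least-squares fit of outcome = mu + dosage * beta_obs + residual.

    Returns (mu, beta_obs). Covariates are NOT modeled here; this is a
    simplified illustration, not the Neale lab pipeline.
    """
    n = len(dosages)
    if n != len(outcomes) or n == 0:
        raise ValueError("dosages and outcomes must be non-empty and equal in length")
    g_bar = sum(dosages) / n
    y_bar = sum(outcomes) / n
    # Sum of squares of the centered dosages and their cross-product with y
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype dosages have no variance")
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(dosages, outcomes))
    beta_obs = sxy / sxx              # allelic substitution effect, observed scale
    mu = y_bar - beta_obs * g_bar     # fitted mean for dosage 0 (wildtype homozygotes)
    return mu, beta_obs


# Toy, made-up data: dosage = number of deletion alleles, outcome = 0/1 disease status
toy_dosages = [0, 0, 0, 0, 1, 1, 1, 2, 2, 0]
toy_outcomes = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0]
mu_hat, beta_hat = allelic_effect(toy_dosages, toy_outcomes)
```

Because the outcome is binary, the fitted slope is a difference in disease risk per allele on the observed (0/1) scale, which is why a further transformation is needed to obtain odds ratios.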
In order to assess the odds ratio estimates of the CCR5Δ32 deletion, we transformed the estimated genetic effect from the observed scale, $\beta_{\mathrm{obs}}$, to the logistic scale, $\beta$. Typically, the phenotypic variance explained by the genetic variant is a very small fraction, and then $\beta_{\mathrm{obs}}$, the disease prevalence, and the variant's allele frequency together form a set of sufficient statistics for $\beta$, making such a transformation feasible (see Pirinen et al., 2013, formula 3.2, and an implementation in Supplementary Table 1). This provided the odds ratio of CCR5Δ32 for each of the 131 disease phenotypes (Supplementary Table 1). Due to the lack of recorded HIV infection incidence in UK Biobank, we re-analyzed the contingency table in Samson et al. (1996), where the effect of the natural CCR5Δ32 deletion was first reported, to examine the odds ratio for HIV-1 infection in the Caucasian population. Estimated from a logistic regression, the odds ratio of a CCR5Δ32 substitution is 0.56 ($p = 1.03 \times 10^{-4}$), though CCR5Δ32 homozygotes appear to be completely immune to macrophage- and dual-tropic HIV-1 strains (Samson et al., 1996).
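To make the scale transformation concrete, here is a hedged Python sketch of one way to turn an observed-scale effect into a log odds ratio using only the three quantities named above (observed-scale effect, disease prevalence, allele frequency). It is not claimed to reproduce formula 3.2 of Pirinen et al. (2013) or the Supplementary Table 1 implementation; the function names and the example numbers are hypothetical. The first function assumes the linear model gives the disease risk for each dosage and that the mean dosage is twice the allele frequency; the second is the common first-order approximation that divides by prevalence × (1 − prevalence).

```python
import math


def observed_to_log_odds(beta_obs, prevalence, allele_freq):
    """Log odds ratio for one allele substitution (dosage 0 -> 1).

    Assumption (illustrative): risk at dosage g is mu + g * beta_obs, and the
    population mean risk equals the prevalence with mean dosage 2 * allele_freq.
    """
    mu = prevalence - 2.0 * allele_freq * beta_obs   # risk for dosage 0
    risk0, risk1 = mu, mu + beta_obs
    if not (0.0 < risk0 < 1.0 and 0.0 < risk1 < 1.0):
        raise ValueError("implied risks fall outside (0, 1)")
    logit = lambda r: math.log(r / (1.0 - r))
    return logit(risk1) - logit(risk0)


def first_order_log_odds(beta_obs, prevalence):
    """First-order approximation: beta is roughly beta_obs / (K * (1 - K))."""
    return beta_obs / (prevalence * (1.0 - prevalence))


# Made-up inputs, for illustration only
beta_logit = observed_to_log_odds(beta_obs=0.001, prevalence=0.05, allele_freq=0.1)
odds_ratio = math.exp(beta_logit)
```

For small effects the two functions agree closely, which reflects the condition stated above that the variant explains only a very small fraction of the phenotypic variance.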
The observed p value distribution across the diseases significantly deviates from what we expect under the null, indicating that the variant has effects on a significant subset of the diseases (Figure 1A). For instance, the CCR5Δ32 variant has significant effects (false discovery rate < 5%) on rheumatoid arthritis (RA), Still disease (SD), ischemic heart disease (IHD), coronary heart disease (CHD), CHD with no revascularizations (CHD_NR), spinal stenosis (SS), and bronchitis. Notably, among these seven diseases, the effects of the CCR5Δ32 deletion on the autoimmune diseases (RA and SD) and on the other diseases (IHD, CHD, CHD_NR, SS, and bronchitis) have opposite directions.
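The text does not state which procedure was used to control the false discovery rate. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Flag p values that are significant at the given false discovery rate.

    Benjamini-Hochberg step-up rule: find the largest rank k such that the
    k-th smallest p value is <= (k / m) * fdr, then flag the k smallest.
    Returns a list of booleans in the original input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Made-up p values for five phenotypes
flags = benjamini_hochberg([0.041, 0.001, 0.205, 0.008, 0.039])
```

Applied to 131 disease phenotypes, such a rule would return the subset of diseases labeled as significant at a 5% false discovery rate.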
From the estimated odds ratios, regardless of statistical significance, the CCR5Δ32 deletion appears to elevate the risk for 93 of the 131 disease phenotypes in UK Biobank and to be protective for the other 38 (Supplementary Table 1). This is a notable enrichment of harmful effects ($p = 1.55 \times 10^{-6}$, Wilcoxon signed rank test with continuity correction), assuming for simplicity that the diseases are independent. As most of these associations could be statistically zero according to the current data, in order to estimate the proportion of harmful effects across these diseases more stringently, we modeled the 131 GWAS Z scores as drawn from a mixture distribution of three Gaussian components (Figure 1B). The fitted mixture gave an estimated proportion of protective effects of 0.000 ($p = 0.9999$) and an average harmful effect Z score of $\hat{\mu}_Z = 1.003$ ($p = 4.0 \times 10^{-3}$). This is equivalent to about 30 of the 131 diseases having elevated risk due to the Δ32 mutation, while, by comparison, the mutation's protective effect is nearly none (Figure 1B).
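The exact parametrization of the mixture model is not recoverable from the text here, so the following Python sketch rests on stated assumptions: three unit-variance Gaussian components with means −μ, 0, and +μ (protective, null, and harmful effects), fitted by expectation-maximization. The function names, starting values, and iteration count are all hypothetical choices made for illustration.

```python
import math


def normal_pdf(x, mean):
    """Density of a unit-variance Gaussian with the given mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def fit_three_component_mixture(z_scores, n_iter=300):
    """EM fit of pi_neg*N(-mu,1) + pi_null*N(0,1) + pi_pos*N(mu,1).

    Assumed parametrization (illustrative). Intended for Z scores of
    moderate size; very extreme values could underflow the densities.
    Returns (pi_neg, pi_null, pi_pos, mu).
    """
    pi_neg = pi_null = pi_pos = 1.0 / 3.0
    mu = 1.0
    n = len(z_scores)
    for _ in range(n_iter):
        sum_neg = sum_null = sum_pos = 0.0
        weighted_z = 0.0
        for z in z_scores:
            # E-step: posterior membership probabilities for this Z score
            w_neg = pi_neg * normal_pdf(z, -mu)
            w_null = pi_null * normal_pdf(z, 0.0)
            w_pos = pi_pos * normal_pdf(z, mu)
            total = w_neg + w_null + w_pos
            r_neg, r_null, r_pos = w_neg / total, w_null / total, w_pos / total
            sum_neg += r_neg
            sum_null += r_null
            sum_pos += r_pos
            weighted_z += (r_pos - r_neg) * z
        # M-step: update mixing proportions and the shared effect size
        pi_neg, pi_null, pi_pos = sum_neg / n, sum_null / n, sum_pos / n
        non_null = sum_neg + sum_pos
        if non_null > 0.0:
            mu = weighted_z / non_null
    return pi_neg, pi_null, pi_pos, mu
```

Under this kind of model, the expected number of diseases with elevated risk is the harmful mixing proportion times the number of phenotypes; a proportion of about 0.23 across 131 phenotypes corresponds to roughly 30 diseases.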

Established Disease Susceptibility Loci Tend to Be Pleiotropic
One could argue that the CCR5Δ32 deletion happens to be a special case, showing substantial pleiotropic effects on a wide range of phenotypes. What about potential gene editing for other diseases? Here, based on established disease-associated variants, we examine the likelihood that gene editing would result in side effects on other phenotypes.
In order to extend the consideration of pleiotropic effects to complex diseases in general, we investigated discovered susceptibility loci for six severe diseases in the human population: breast cancer (Michailidou et al., 2017), lung cancer (McKay et al., 2017), coronary artery disease (CAD) (Nikpay et al., 2015), type 2 diabetes (T2D) (Morris et al., 2012), bipolar disorder (BIP) (Stahl et al., 2019), and major depressive disorder (MDD) (Howard et al., 2019). Again, but for a different purpose, we used the publicly available UK Biobank GWAS results by Neale's lab (http://www.nealelab.is/uk-biobank). For each SNP, we counted its number of associations across all the phenotypes ($p < 5 \times 10^{-4}$). The genome average of this quantity was 5.56 associations per SNP for all the variants (median = 5). Even for the variants with minor allele frequency larger than 0.3, the average number of associations was 5.64 per SNP (median = 5). For every disease among the six, the average number of associations of its reported susceptibility loci was larger than the genome average (Figure 1C, Supplementary Table 2). The results indicate that pleiotropic effects are ubiquitous and even enriched for many established loci associated with complex diseases.
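The per-SNP association count described above can be sketched in a few lines of Python. The data layout (a dictionary from SNP identifier to a list of p values, one per phenotype), the SNP identifiers, and the function names are hypothetical; only the threshold of 5 × 10⁻⁴ comes from the text.

```python
def count_associations(p_values, threshold=5e-4):
    """Number of phenotypes with p value below the threshold for one SNP."""
    return sum(1 for p in p_values if p < threshold)


def mean_associations(snp_to_p_values, snps=None, threshold=5e-4):
    """Average association count over all SNPs, or over a chosen subset."""
    selected = list(snp_to_p_values) if snps is None else list(snps)
    if not selected:
        raise ValueError("no SNPs selected")
    counts = [count_associations(snp_to_p_values[s], threshold) for s in selected]
    return sum(counts) / len(counts)


# Made-up p values for four hypothetical SNPs across five phenotypes
toy = {
    "snp_a": [1e-6, 2e-4, 0.3, 0.7, 1e-5],
    "snp_b": [0.2, 0.9, 0.4, 0.6, 0.1],
    "snp_c": [1e-4, 0.5, 0.8, 0.02, 0.3],
    "snp_d": [1e-8, 1e-7, 3e-4, 0.04, 4e-4],
}
genome_average = mean_associations(toy)
locus_average = mean_associations(toy, snps=["snp_a", "snp_d"])
```

Comparing the average over a disease's reported loci with the average over all variants is the kind of contrast summarized in Figure 1C.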

DISCUSSION
Starting from the CCR5Δ32 deletion, a site targeted by He Jiankui in his gene editing surgery, we investigated the pleiotropic nature of this deletion and some other disease-associated variants, using massive publicly available GWAS results from the UK Biobank. The results highlight that pleiotropy should always be carefully considered before gene editing treatment for human complex diseases.
Our results suggest that, in He Jiankui's CRISPR experiment, even if the surgery does produce a deletion with the same effect as CCR5Δ32, the treated girls would be prone to an elevated risk of cardiovascular and other potential diseases. It also seems that the surgery would be more harmful than beneficial, considering the number of diseases that it might affect. Some of these diseases are not only common but also major contributors to the mortality rate of the current human population (Timmers et al., 2019). Although the pipeline of Wei and Nielsen (2019) has been criticized, regardless of the level of statistical significance in their analysis, our additional results here do provide evidence that the Δ32 mutation's potential effect on mortality may be related to its side effects on other, more common diseases.
Figure 1. (A) … Samson et al. (1996). Diseases with false discovery rate < 5% are labeled. Phenotypes for which the CCR5Δ32 deletion elevates/reduces their risks are colored orange/blue, respectively. The sizes of the dots are proportional to the magnitudes of the deviations of their odds ratios from one. The red diagonal line represents equality between the x and y axes. (B) Distribution of CCR5Δ32 effects on complex diseases. The genetic effects are shown as Z scores, i.e., standardized by their standard errors. The gray histogram shows the observed distribution of Z scores across 131 curated disease phenotypes in UK Biobank. The colored densities give the fitted mixture distribution, consisting of three Gaussian components, where the estimated proportion of harmful effects is about 23%. (C) Pleiotropy of the established loci for six diseases is evaluated by the number of associations per locus in Neale's lab round 1 UK Biobank GWAS. Each dot represents the average number of associations per locus for the corresponding disease, and the whiskers represent standard errors. MAF: minor allele frequency.
Besides the issue of pleiotropy, gene editing in humans may lead to other unwanted consequences. Although the CRISPR-Cas9 technology has been shown to be a reliable method for introducing mutations at a target site, it appears that He Jiankui also ignored the possibility of off-target effects that might be induced in the process (Zhang et al., 2015). Furthermore, from an evolutionary perspective, we should be careful before introducing any artificial mutation into the human gene pool, even if the introduced mutation might have negligible side effects for the population. For instance, as the introduced mutation may well be in linkage disequilibrium with another functional variant under positive selection, it can gain allele frequency through genetic hitchhiking (Barton, 2000), so that its effects on the population are revealed.
However, we do not suggest a complete ban on gene editing treatments. As with the development of any treatment, what is essential is the trade-off between positive and negative effects. One can imagine that a gene editing surgery removing a severely impactful monogenic mutation could be valuable to certain individuals, provided that the side effects are known to be absent or negligible compared to the monogenic disease itself. Unfortunately, for most complex diseases, the situation is far less straightforward. Pleiotropy, i.e., a gene or genetic variant having complex effects on various phenotypes, is a very common phenomenon.
It is encouraging to foresee the potential of gene editing in humans as a treatment for diseases. However, practitioners such as He Jiankui held an uninformed view of CCR5Δ32's effect against HIV and disregarded the complexity of genome biology resulting from billions of years of evolution. The data presented here were all publicly available and sufficient to deter anyone from even considering the experiment on living human embryos. Unfortunately, all these established resources were overlooked. We provide additional evidence to evaluate He Jiankui's actions and to guide considerations in future gene editing research, as it undeniably is a field with great potential.

DATA AVAILABILITY
The datasets analyzed in this study can be found in the Supplementary Tables and references.

AUTHOR CONTRIBUTIONS
XS initiated and coordinated the study. TL and XS performed data analysis. Both authors contributed to writing the paper.