# Where is the friend's home?

- Centre of Neurogenetics and Statistical Genomics, Queensland Brain Institute, The University of Queensland, St. Lucia, QLD, Australia

A commentary on

Friendship and natural selection

*by Christakis N., and Fowler, J. (2014). Proc. Natl. Acad. Sci. U.S.A. 111, 10796–10801. doi: 10.1073/pnas.1400825111*

Using Framingham Heart Study (FHS) data (accession no. phs000153.v7.p6), Christakis and Fowler (2014) showed that friends were more genetically correlated than strangers. This finding may suggest that friends are “functional kin.” Depending on the context, functional kin may have high genetic similarity (homophily) if people choose similar individuals as friends; conversely, functional kin may have low or negative genetic similarity (heterophily) if people choose complementary individuals as friends. Christakis and Fowler reported both positive and negative genetic correlations between friends at the single nucleotide polymorphism (SNP) level, indicating heritability of friendship. This result is very interesting, particularly for the social sciences (Skyrms et al., 2014), but deserves scrutiny. This note demonstrates that (1) high genetic similarity between friends is possible if a person finds friends from his/her own cultural background, which is a surrogate for genetic similarity, or (2) low genetic similarity is possible if a person finds friends from a different cultural background. As illustrated below, the mechanism that inflates or deflates genetic similarity between friends is analogous to the way population stratification raises the false positive rate in case-control studies.

In Christakis and Fowler's study, a single-marker regression was conducted to test the association with friendship. The regression is defined as

$$g_{e.m} = a + b_{m}\,g_{f.m} + e$$

in which $g_{e.m}$ and $g_{f.m}$ are the genotypes of “ego” and his/her friend, respectively, at the $m^{th}$ locus ($m = 1, 2, 3, \ldots, M$), $a$ is the intercept, and $e$ is the residual. The regression coefficient is estimated as $\widehat{b}_{m} = \frac{cov(g_{e.m}, g_{f.m})}{var(g_{f.m})}$. In a conventional GWAS regression, the phenotype is the same for each locus, whereas in this regression, the phenotype $g_{e.m}$ is updated to match each $g_{f.m}$ locus. However, this actually models $F_{st}$ under the circumstances discussed below (blue box in Figure 1). For clarity, the subscript $m$ is hereafter omitted.
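The per-locus estimator can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names `regression_slope` and `draw_genotype`, the five toy loci, and the uniformly drawn allele frequencies are assumptions made here for illustration. Ego and friend genotypes are simulated independently from one homogeneous population, so the slopes should scatter around zero.

```python
import random

def regression_slope(ego, friend):
    """Slope of ego genotype regressed on friend genotype: cov(g_e, g_f) / var(g_f)."""
    n = len(ego)
    mean_e = sum(ego) / n
    mean_f = sum(friend) / n
    cov = sum((e - mean_e) * (f - mean_f) for e, f in zip(ego, friend)) / (n - 1)
    var_f = sum((f - mean_f) ** 2 for f in friend) / (n - 1)
    return cov / var_f

def draw_genotype(rng, p):
    """Genotype coded 0/1/2: number of reference alleles at frequency p."""
    return (rng.random() < p) + (rng.random() < p)

rng = random.Random(1)
n_pairs, n_loci = 907, 5  # 907 pairs as in the study; 5 loci is an arbitrary toy value
slopes = []
for m in range(n_loci):
    p = rng.uniform(0.1, 0.9)  # hypothetical allele frequency for locus m
    # The "phenotype" (ego genotype) is regenerated for every locus.
    ego = [draw_genotype(rng, p) for _ in range(n_pairs)]
    friend = [draw_genotype(rng, p) for _ in range(n_pairs)]
    slopes.append(regression_slope(ego, friend))

print([round(b, 3) for b in slopes])
```

With no stratification and no relatedness, each slope is expected to lie within a few multiples of $\sqrt{1/n}$ of zero.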

**Figure 1. Illustration of how an ego-friend pair from the same/different subgroup(s) may inflate/deflate genetic similarity as indicated by $\lambda_{GC}$.**

Assume that in the sample there are $S$ subgroups (e.g., $S = 2$), each of which has $n_{s}$ individuals. The proportion of the $s^{th}$ subgroup to the total sample size is $w_{s} = \frac{n_{s}}{n}$, in which $n = \sum_{s=1}^{S} n_{s}$. Of the total $n$ ego-friend pairs, $n_{s}$ are from the $s^{th}$ subgroup. The variance of $g_{f}$ can be written as the sum of the within-group variance and the between-group variance (http://en.wikipedia.org/wiki/Mixture_distribution)

$$var(g_{f}) = \sum_{s=1}^{S} w_{s}\,\sigma^{2}_{f(s)} + \sum_{s=1}^{S} w_{s}\,(p_{f(s)} - p_{f})^{2}$$

in which $p_{f(s)}$ is the reference allele frequency (RAF) of the friends from the $s^{th}$ subgroup, $p_{f} = \sum_{s=1}^{S} w_{s}\,p_{f(s)}$ is the RAF of the friends, $\sigma^{2}_{f(s)} = 2p_{f(s)}(1 - p_{f(s)})$ is the within-sub-population variance of the friends, and $(p_{f(s)} - p_{f})^{2}$ is the between-sub-population sampling variance. Similarly, the covariance between $g_{e}$ and $g_{f}$ can be written as

$$cov(g_{e}, g_{f}) = \sum_{s=1}^{S} w_{s}\,\sigma^{2}_{ef(s)} + \sum_{s=1}^{S} w_{s}\,(p_{e(s)} - p_{e})(p_{f(s)} - p_{f})$$

in which $\sigma^{2}_{ef(s)}$ is the covariance between the friends' genotypes within the $s^{th}$ subpopulation, and $(p_{e(s)} - p_{e})(p_{f(s)} - p_{f})$ is the between-sub-population covariance. $\sigma^{2}_{ef(s)} \neq 0$ if the pair of friends is related or there is heritability for friendship at the $m^{th}$ locus.

In Christakis and Fowler's study, the null hypothesis is $H_{0}: \sigma^{2}_{ef(s)} = 0$ (i.e., no heritability for friendship). Even if friendship is not heritable and relatives are not included, friends can still have inflated/deflated genetic correlation, which moves the regression coefficient away from zero, if an ego finds friends from the same/different cultural background.

If an ego finds friends from the same cultural background, $p_{e(s)}$ will be similar to $p_{f(s)}$. Assuming that $p_{e(s)} = p_{f(s)}$ and that $H_{0}$ holds, the regression coefficient can be written as

$$\widehat{b} = \frac{cov(g_{e}, g_{f})}{var(g_{f})} = \frac{\sum_{s=1}^{S} w_{s}\,(p_{f(s)} - p_{f})^{2}}{var(g_{f})} = F_{st}$$

By definition, $F_{st} = \frac{\sum_{s=1}^{S} w_{s}\,(p_{f(s)} - p_{f})^{2}}{var(g_{f})} \ge 0$. The standard deviation of $\widehat{b}$ is $\widehat{\sigma}_{b} = \sqrt{\frac{\sigma^{2}_{g_{e}}}{n\,\sigma^{2}_{g_{f}}}} \approx \sqrt{\frac{1}{n}}$.
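The argument that stratification alone produces a positive slope can be checked numerically. The following is a minimal sketch under assumptions chosen here for illustration: two equally sized subgroups with deliberately extreme allele frequencies (0.2 and 0.8), and ego and friend genotypes drawn independently within the same subgroup, so the within-subgroup covariance is zero by construction. The between-group share of variance is computed empirically from subgroup genotype means.

```python
import random

def draw_genotype(rng, p):
    """Genotype coded 0/1/2: number of reference alleles at frequency p."""
    return (rng.random() < p) + (rng.random() < p)

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(2014)
freqs = {0: 0.2, 1: 0.8}   # hypothetical subgroup allele frequencies (exaggerated)
pairs_per_group = 10000    # hypothetical sample size per subgroup

ego, friend, label = [], [], []
for s, p in freqs.items():
    for _ in range(pairs_per_group):
        # Ego and friend share a subgroup but are otherwise independent draws.
        ego.append(draw_genotype(rng, p))
        friend.append(draw_genotype(rng, p))
        label.append(s)

slope = covariance(ego, friend) / variance(friend)

# Between-group share of var(g_f), from the law of total variance.
grand_mean = mean(friend)
between = 0.0
for s in freqs:
    group = [f for f, l in zip(friend, label) if l == s]
    between += len(group) / len(friend) * (mean(group) - grand_mean) ** 2
between_share = between / variance(friend)

print(round(slope, 3), round(between_share, 3))
```

In this toy setting the two printed values should agree closely, even though no pair is related and friendship has no heritable component.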

Given $\widehat{b}$ and $\widehat{\sigma}_{b}$, a $z$-score test can be constructed as $z = \frac{\widehat{b}}{\widehat{\sigma}_{b}} = \sqrt{n}\,F_{st}$ and $z^{2} \sim \chi^{2}_{1}$, with the non-centrality parameter (NCP) $\Delta = nF^{2}_{st}$. Christakis and Fowler reported $\lambda_{GC}$, the genomic inflation factor (Devlin and Roeder, 1999), for their GWAS analysis. In their study, $\lambda_{GC} = \frac{median(\chi^{2}_{1})}{0.455} = 1.04$, which meant that the median of the observed $\chi^{2}_{1}$ values was 0.473, with NCP $\Delta = nF^{2}_{st} = 0.04$. Thus, the median of $F_{st}$ was $\widehat{F}_{st} = \sqrt{0.04/n} = 0.0066$ between the egos and the friends ($n = 907$ in Christakis and Fowler's study).
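The arithmetic in this paragraph can be reproduced in a few lines of Python. This sketch follows the text's own approximation, in which the excess of $\lambda_{GC}$ over 1 is taken as the NCP; the variable names are chosen here for illustration.

```python
import math

CHI2_1_MEDIAN = 0.455  # median of a central chi-square with 1 df, as used in the text
lambda_gc = 1.04       # genomic inflation factor reported by Christakis and Fowler
n_pairs = 907          # number of ego-friend pairs in their study

observed_median = lambda_gc * CHI2_1_MEDIAN  # median of the observed test statistics
ncp = lambda_gc - 1.0                        # text's approximation: NCP = n * F_st^2
f_st = math.sqrt(ncp / n_pairs)

print(round(observed_median, 3), round(ncp, 2), round(f_st, 4))
```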

An early study (Cavalli-Sforza et al., 1996) indicated that $F_{st} = 0.016$ for European descendants, and a recent study (Novembre et al., 2008) using high-density SNPs showed that $F_{st} = 0.004$ between European nations (the estimated mean of $F_{st} = 0.0042$ over nearly 900 thousand common HapMap SNPs between CEU and TSI). Although the estimate of $F_{st}$ depends on other factors, such as ascertainment and the sample proportion for each subgroup (Bhatia et al., 2013), the estimated $\widehat{F}_{st}$ between the egos and the friends seems to fall in-between the aforementioned values. For a loosely defined trait such as friendship, the heritability may be low, if not zero. However, as long as friendship is more frequently established within one's cultural background, $F_{st}$ will inflate genetic similarity among friendships even in the absence of heritability/“functional kin.” With an increased sample size, $\lambda_{GC}$ *will be even larger* (blue box in Figure 1), a phenomenon that resembles the manner in which population stratification inflates type I error rate in GWAS.
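The dependence of $\lambda_{GC}$ on sample size can be sketched with a small simulation. The assumptions here are illustrative only: every locus shares the same $F_{st} = 0.0066$, the test statistic is drawn as $z \sim N(\sqrt{n}F_{st}, 1)$, and the function name `simulated_lambda_gc`, the number of simulated loci, and the larger sample sizes are hypothetical.

```python
import random
import statistics

CHI2_1_MEDIAN = 0.455  # median of a central chi-square with 1 df, as in the text

def simulated_lambda_gc(n_pairs, f_st, n_loci=20000, seed=0):
    """Median-based inflation factor when z ~ N(sqrt(n) * F_st, 1) at every locus."""
    rng = random.Random(seed)
    shift = (n_pairs ** 0.5) * f_st
    chi2 = [rng.gauss(shift, 1.0) ** 2 for _ in range(n_loci)]
    return statistics.median(chi2) / CHI2_1_MEDIAN

f_st = 0.0066                       # value derived in the text, held fixed
sample_sizes = [907, 10000, 100000]  # 907 as in the study; the others are hypothetical
lambdas = [simulated_lambda_gc(n, f_st) for n in sample_sizes]

for n, lam in zip(sample_sizes, lambdas):
    print(n, round(lam, 2))
```

Holding $F_{st}$ fixed, the simulated inflation factor grows with the number of pairs, which is the behaviour described above.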

In the discussed study, the negative genetic correlation between friends was also highlighted. Similarly, if an ego finds friends from a different cultural background, the regression coefficient will tend to be negative (denoted as $\tilde{F}_{st}$), indicating reduced genetic similarity between friends (see the yellow box in Figure 1). In practice, both patterns, although not as extreme as those demonstrated here, can occur between friends and lead to the homophily and heterophily observed.
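How the sign of the coefficient depends on where friends are found can be sketched as follows. This is a toy model, not the analysis in the commented study: the mixing parameter `q` (the probability that a friend comes from the ego's own subgroup), the two subgroup allele frequencies, and the function name `slope_for_mixing` are all assumptions introduced here, and genotypes are independent given subgroup membership.

```python
import random

def draw_genotype(rng, p):
    """Genotype coded 0/1/2: number of reference alleles at frequency p."""
    return (rng.random() < p) + (rng.random() < p)

def slope_for_mixing(q, n_pairs=10000, freqs=(0.2, 0.8), seed=7):
    """Regression slope cov(g_e, g_f) / var(g_f) when a fraction q of friends
    is drawn from the ego's own subgroup and the rest from the other one."""
    rng = random.Random(seed)
    ego, friend = [], []
    for _ in range(n_pairs):
        s = rng.randrange(2)                  # ego's subgroup
        t = s if rng.random() < q else 1 - s  # friend's subgroup
        ego.append(draw_genotype(rng, freqs[s]))
        friend.append(draw_genotype(rng, freqs[t]))
    mean_e = sum(ego) / n_pairs
    mean_f = sum(friend) / n_pairs
    cov = sum((e - mean_e) * (f - mean_f) for e, f in zip(ego, friend))
    var_f = sum((f - mean_f) ** 2 for f in friend)
    return cov / var_f

for q in (1.0, 0.5, 0.0):
    print(q, round(slope_for_mixing(q), 3))
```

With `q = 1.0` the slope is positive, with `q = 0.0` it is negative, and with `q = 0.5` the two patterns roughly cancel, so intermediate values of `q` give the less extreme cases.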

In their GWAS-like analysis, principal components were used as covariates to control for $F_{st}$. However, covariates may not completely eliminate background effects such as $F_{st}$. When covariates reduce $F_{st}$, the heritability of friendship will also be reduced, potentially to zero. Although the analysis in this note does not entail the rejection of the conclusion drawn by Christakis and Fowler, it warns against misleading interpretations of the results.

## Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Acknowledgments

This note benefited greatly from the discussion at the journal club held at the Centre of Neurogenetics and Statistical Genomics, Queensland Brain Institute, The University of Queensland (http://cnsgenomics.com/journal_club/). None of the discussants is responsible for errors, nor should it be assumed that they agree with the analysis and conclusions presented in this note. The title of the note is inspired by the movie “Where Is the Friend's Home?,” directed by Abbas Kiarostami (http://en.wikipedia.org/wiki/Where_Is_the_Friend%27s_Home%3F).

## References

Bhatia, G., Patterson, N., Sankararaman, S., and Price, A. L. (2013). Estimating and interpreting FST: the impact of rare variants. *Genome Res*. 23, 1514–1521. doi: 10.1101/gr.154831.113

Cavalli-Sforza, L. L., Menozzi, P., and Piazza, A. (1996). *The History and Geography of Human Genes*. Princeton, NJ: Princeton University Press.

Christakis, N. A., and Fowler, J. H. (2014). Friendship and natural selection. *Proc. Natl. Acad. Sci. U.S.A*. 111, 10796–10801. doi: 10.1073/pnas.1400825111

Devlin, B., and Roeder, K. (1999). Genomic control for association studies. *Biometrics* 55, 997–1004.

Novembre, J., Johnson, T., Bryc, K., Kutalik, Z., Boyko, A. R., Auton, A., et al. (2008). Genes mirror geography within Europe. *Nature* 456, 98–101. doi: 10.1038/nature07331

Skyrms, B., Avise, J. C., and Ayala, F. J. (2014). In the light of evolution VIII: Darwinian thinking in the social sciences. *Proc. Natl. Acad. Sci. U.S.A*. 111, 10781–10784. doi: 10.1073/pnas.1411483111

Keywords: friendship, natural selection, Framingham Heart Study, FST statistics, GWAS

Citation: Chen G-B (2014) Where is the friend's home? *Front. Genet*. **5**:400. doi: 10.3389/fgene.2014.00400

Received: 30 August 2014; Accepted: 29 October 2014;

Published online: 13 November 2014.

Edited by:

Marshall Abrams, University of Alabama at Birmingham, USA

Reviewed by:

Sergio Tofanelli, Università di Pisa, Italy

Copyright © 2014 Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: chenguobo@gmail.com