Where is the friend's home?

Using Framingham Heart Study (FHS) data (accession no. phs000153.v7.p6), Christakis and Fowler (2014) showed that friends had more extended genetic correlation than strangers had. This finding may suggest that friends are “functional kin.” Depending on the context, functional kin may have high genetic similarity (homophilic) if people have similar functional kin as friends; conversely, functional kin may have low/negative genetic similarity (heterophilic) if people have complementary functional kin as friends. Christakis and Fowler reported that between friends there is positive/negative genetic correlation at the single nucleotide polymorphism (SNP) level, indicating heritability of friendship. This result is very interesting, particularly for the social sciences (Skyrms et al., 2014), but deserves scrutiny. This note demonstrates that (1) high genetic similarity between friends is possible if a person finds friends from his/her own cultural background, which is a surrogate for genetic similarity, or (2) low genetic similarity is possible if a person finds friends from a different cultural background. As illustrated below, the mechanism involved in inflating/deflating genetic similarity between friends is analogous to the way population stratification raises the false positive rate in case-control studies. 
 
In Christakis and Fowler's study, a single-marker regression was conducted to test the association for friendship. The regression is defined as 
 
 
$$g_{e.m} = \mu_m + b_m g_{f.m} + e_m$$
 
 
in which $g_{e.m}$ and $g_{f.m}$ are the genotypes of the “ego” and his/her friend, respectively, at the $m$th locus ($m = 1, 2, 3, \ldots, M$). The regression coefficient is estimated as $\hat{b}_m = \mathrm{cov}(g_{e.m}, g_{f.m})/\mathrm{var}(g_{f.m})$. In a conventional GWAS regression, the phenotype is the same for each locus, whereas in this regression, the phenotype $g_{e.m}$ changes with each locus to match $g_{f.m}$. However, under the circumstances discussed below, this regression actually models $F_{st}$ (blue box in Figure 1). For clarity, the subscript $m$ is hereafter omitted.
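The per-locus estimate $\hat{b}_m$ can be sketched in a few lines. This is a minimal illustration, not the code used in the original study: the function name `per_locus_slopes`, the five loci, and the allele frequencies are assumptions made for the example, and the genotypes are simulated 0/1/2 counts with no ego-friend relationship built in.

```python
import numpy as np

def per_locus_slopes(g_ego, g_friend):
    """Return b_m = cov(g_e.m, g_f.m) / var(g_f.m) for every locus (column)."""
    ge_c = g_ego - g_ego.mean(axis=0)
    gf_c = g_friend - g_friend.mean(axis=0)
    cov = (ge_c * gf_c).mean(axis=0)
    var = (gf_c ** 2).mean(axis=0)
    return cov / var

rng = np.random.default_rng(0)
n_pairs, n_loci = 907, 5                      # 907 pairs as in the study; 5 loci is arbitrary
freqs = rng.uniform(0.2, 0.8, size=n_loci)    # hypothetical allele frequencies
g_ego = rng.binomial(2, freqs, size=(n_pairs, n_loci))
g_friend = rng.binomial(2, freqs, size=(n_pairs, n_loci))

slopes = per_locus_slopes(g_ego, g_friend)    # one slope per locus
print(slopes)
```

Because the covariance and the variance use the same normalization, each entry equals the ordinary least-squares slope of the ego genotype on the friend genotype at that locus.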
 
 
 
Figure 1 
 
Illustration of how an ego-friend from the same/different subgroup(s) may inflate/deflate genetic similarity as indicated by $\lambda_{GC}$.
 
 
 
Assume that in the sample there are $S$ subgroups (e.g., $S = 2$), each of which has $n_s$ individuals. The proportion of the $s$th subgroup in the total sample is $w_s = n_s/n$, in which $n = \sum_{s=1}^{S} n_s$. Of the total $n$ ego-friend pairs, $n_s$ are from the $s$th subgroup. The variance of $g_f$ can be decomposed into the within-group variance and the between-group variance (http://en.wikipedia.org/wiki/Mixture_distribution)
 
 
$$\mathrm{var}(g_f) = \sum_{s=1}^{S} w_s\left[\sigma_{f(s)}^2 + (\bar{p}_{f(s)} - \bar{p}_f)^2\right]$$
 
 
in which $\bar{p}_{f(s)}$ is the reference allele frequency (RAF) of the friends from the $s$th subgroup, $\bar{p}_f = \sum_{s=1}^{S} w_s \bar{p}_{f(s)}$ is the RAF of the friends, $\sigma_{f(s)}^2 = 2\bar{p}_{f(s)}(1 - \bar{p}_{f(s)})$ is the within-sub-population variance of the friends, and $(\bar{p}_{f(s)} - \bar{p}_f)^2$ is the between-sub-population sampling variance. Similarly, the covariance between $g_e$ and $g_f$ can be written as
 
 
$$\mathrm{cov}(g_e, g_f) = \sum_{s=1}^{S} w_s\left[\sigma_{ef(s)}^2 + (\bar{p}_{e(s)} - \bar{p}_e)(\bar{p}_{f(s)} - \bar{p}_f)\right]$$
 
 
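in which $\sigma_{ef(s)}^2$ is the covariance between the ego's and the friend's genotypes within the $s$th subpopulation, and $(\bar{p}_{e(s)} - \bar{p}_e)(\bar{p}_{f(s)} - \bar{p}_f)$ is the between-sub-population covariance, with $\bar{p}_{e(s)}$ and $\bar{p}_e$ defined for the egos in the same way as $\bar{p}_{f(s)}$ and $\bar{p}_f$ are for the friends. $\sigma_{ef(s)}^2 \neq 0$ if the pair of friends is related or there is heritability for friendship at the $m$th locus.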
 
In Christakis and Fowler's study, the null hypothesis is $H_0: \sigma_{ef(s)}^2 = 0$ (i.e., no heritability for friendship). Even if friendship is not heritable and relatives are not included, friends can still have inflated/deflated genetic correlation, which moves the regression coefficient away from zero, if an ego finds friends from the same/different cultural background.
 
If an ego finds friends from the same cultural background, $\bar{p}_{e(s)}$ will be similar to $\bar{p}_{f(s)}$. Assuming that $\bar{p}_{e(s)} = \bar{p}_{f(s)}$, the regression coefficient can be written as
 
 
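$$\hat{b} = \frac{\mathrm{cov}(g_e, g_f)}{\mathrm{var}(g_f)} = \frac{\sum_{s=1}^{S} w_s (\bar{p}_{e(s)} - \bar{p}_e)(\bar{p}_{f(s)} - \bar{p}_f)}{\sum_{s=1}^{S} w_s\left[\sigma_{f(s)}^2 + (\bar{p}_{f(s)} - \bar{p}_f)^2\right]} \approx \frac{\sum_{s=1}^{S} w_s (\bar{p}_{f(s)} - \bar{p}_f)^2}{\sum_{s=1}^{S} w_s\left[\sigma_{f(s)}^2 + (\bar{p}_{f(s)} - \bar{p}_f)^2\right]}$$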
 
 
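By definition, $F_{st} = \frac{\sum_{s=1}^{S} w_s (\bar{p}_{f(s)} - \bar{p}_f)^2}{\mathrm{var}(g_f)} \geq 0$, so that $\hat{b} \approx F_{st}$. The standard deviation of $\hat{b}$ is $\hat{\sigma}_b = \sqrt{\sigma_{g_e}^2/(n\sigma_{g_f}^2)} \approx 1/\sqrt{n}$.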
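The claim that the slope is nonzero without any heritability can be checked by simulation. The sketch below assumes two equally sized subgroups with hypothetical allele frequencies 0.3 and 0.7 (chosen to be far apart so the effect is easy to see), draws ego and friend genotypes independently within the same subgroup, and compares $\hat{b}$ with the between-subgroup share of the friend genotype variance. Genotypes are coded 0/1/2, so the subgroup means are computed on the genotype scale.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pairs = 200_000                       # large, so sampling noise is small
p_sub = np.array([0.3, 0.7])            # hypothetical subgroup allele frequencies
w = np.array([0.5, 0.5])                # subgroup proportions w_s

# Each pair is drawn entirely within one subgroup; given the subgroup,
# ego and friend genotypes are independent (no heritability, no relatedness).
sub = rng.choice(2, size=n_pairs, p=w)
g_ego = rng.binomial(2, p_sub[sub])
g_friend = rng.binomial(2, p_sub[sub])

b_hat = np.cov(g_ego, g_friend)[0, 1] / np.var(g_friend, ddof=1)

# Between-subgroup share of the friend genotype variance (mixture formula).
# Genotypes are coded 0/1/2, so the subgroup mean genotype is 2 * p.
mean_s = 2 * p_sub
within = 2 * p_sub * (1 - p_sub)
between = (mean_s - w @ mean_s) ** 2
between_share = (w @ between) / (w @ (within + between))

print(round(b_hat, 3), round(between_share, 3))
```

With these settings the two numbers should agree up to sampling noise, even though the simulated within-subgroup covariance between ego and friend is zero.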
 
Given $\hat{b}$ and $\hat{\sigma}_b$, a z-score test can be constructed as $z = \hat{b}/\hat{\sigma}_b = \sqrt{n}F_{st}$ and $z^2 \sim \chi_1^2$, with the non-centrality parameter (NCP) $\Delta = nF_{st}^2$. Christakis and Fowler reported $\lambda_{GC}$, the genomic inflation factor (Devlin and Roeder, 1999), for their GWAS analysis. In their study, $\lambda_{GC} = \mathrm{median}(\chi_1^2)/0.455 = 1.04$, which meant that the median of the observed $\chi_1^2$ value was 0.473, with NCP $\Delta = nF_{st}^2 = 0.04$. Thus, the median of $F_{st}$ was $\hat{F}_{st} = \sqrt{0.04/n} = 0.0066$ between the egos and the friends ($n = 907$ in Christakis and Fowler's study).
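The arithmetic in this paragraph can be reproduced directly. The snippet below is a worked check only; reading the NCP as the excess of $\lambda_{GC}$ over 1 is an interpretation of the approximation used here, and the variable names are illustrative.

```python
import math

lambda_gc = 1.04          # reported genomic inflation factor
null_median = 0.455       # median of a central chi-square with 1 df, as used in the text
n_pairs = 907             # ego-friend pairs in the study

observed_median = lambda_gc * null_median
ncp = lambda_gc - 1.0     # assumed reading: excess inflation taken as the NCP n * Fst^2
fst_hat = math.sqrt(ncp / n_pairs)

print(round(observed_median, 3), round(ncp, 2), round(fst_hat, 4))
# prints: 0.473 0.04 0.0066
```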
 
An early study (Cavalli-Sforza et al., 1996) indicated that $F_{st} = 0.016$ for European descendants, and a recent study (Novembre et al., 2008) using high-density SNPs showed that $F_{st} = 0.004$ between European nations (the estimated mean of $F_{st}$ = 0.0042 over nearly 900 thousand common HapMap SNPs between CEU and TSI). Although the estimate of $F_{st}$ depends on other factors, such as ascertainment and the sample proportion for each subgroup (Bhatia et al., 2013), the estimated $\hat{F}_{st}$ between the egos and the friends seems to fall between the aforementioned values. For a loosely defined trait such as friendship, the heritability may be low, if not zero. However, as long as friendship is more frequently established within one's cultural background, $F_{st}$ will inflate genetic similarity among friends even in the absence of heritability/“functional kin.” With an increased sample size, $\lambda_{GC}$ will be even larger (blue box in Figure 1), a phenomenon that resembles the manner in which population stratification inflates type I error rate in GWAS.
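The dependence of $\lambda_{GC}$ on sample size can be illustrated with a small simulation. This sketch assumes, for simplicity, that every test statistic is a z-score drawn from $N(\sqrt{n}F_{st}, 1)$ with the same $F_{st} = 0.0066$; the sample sizes other than 907 and the number of simulated statistics are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
fst = 0.0066                  # the estimate derived in this note
n_tests = 1_000_000           # hypothetical number of simulated test statistics
null_median = 0.455           # median of a central chi-square with 1 df

def simulated_lambda_gc(n_pairs):
    """lambda_GC when every z-score is drawn from N(sqrt(n) * Fst, 1)."""
    z = rng.normal(loc=np.sqrt(n_pairs) * fst, scale=1.0, size=n_tests)
    return np.median(z ** 2) / null_median

lambdas = {n: simulated_lambda_gc(n) for n in (907, 5_000, 20_000)}
for n, lam in lambdas.items():
    print(n, round(lam, 2))
```

Under this simplification, the value at 907 pairs should land near the reported 1.04, and the inflation grows as the number of pairs increases while $F_{st}$ stays fixed.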
 
In the discussed study, the negative genetic correlation between friends was also highlighted. By the same reasoning, if an ego finds friends from a different cultural background, the regression coefficient will tend to turn negative (denoted as $\tilde{F}_{st}$), indicating reduced genetic similarity between friends (see the yellow box in Figure 1). In practice, both patterns, though less extreme than illustrated here, are possible between friends and can produce the homophily and heterophily observed.
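Intermediate cases between the two extremes can be sketched by letting only a fraction of friends come from the ego's own subgroup. In the example below, `q_same` is a hypothetical mixing parameter, the two subgroups are equally sized with assumed allele frequencies 0.3 and 0.7, and genotypes are independent given subgroup membership, so any nonzero slope comes from subgroup structure alone.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pairs = 200_000
p_sub = np.array([0.3, 0.7])      # hypothetical subgroup allele frequencies

def slope_for_mixing(q_same):
    """Slope b when a fraction q_same of friends come from the ego's subgroup."""
    ego_sub = rng.integers(0, 2, size=n_pairs)
    same = rng.random(n_pairs) < q_same
    friend_sub = np.where(same, ego_sub, 1 - ego_sub)
    g_ego = rng.binomial(2, p_sub[ego_sub])
    g_friend = rng.binomial(2, p_sub[friend_sub])
    return np.cov(g_ego, g_friend)[0, 1] / np.var(g_friend, ddof=1)

slopes = {q: slope_for_mixing(q) for q in (0.0, 0.4, 0.5, 0.6, 1.0)}
for q, b in slopes.items():
    print(q, round(b, 3))
```

The slope is negative when most friends come from the other subgroup, close to zero at an even mix, and positive when most come from the same subgroup.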
 
In their GWAS-like analysis, principal components were used as covariates to control for $F_{st}$. However, covariates may not completely eliminate background effects such as $F_{st}$. When covariates reduce $F_{st}$, the heritability of friendship will also be reduced, potentially to zero. Although the analysis in this note does not entail the rejection of the conclusion drawn by Christakis and Fowler, it warns against misleading interpretations of the results.
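The point about incomplete adjustment can be illustrated with a toy regression. This is a sketch under strong assumptions: the true subgroup label stands in for an ideal covariate, a label that is wrong for 20% of pairs stands in for an imperfect one (such as principal components that capture ancestry only partly), and friendship has no heritability in the simulation. It is not the adjustment used in the original study.

```python
import numpy as np

rng = np.random.default_rng(4)
n_pairs = 200_000
p_sub = np.array([0.3, 0.7])               # hypothetical subgroup allele frequencies

sub = rng.integers(0, 2, size=n_pairs)     # true subgroup of each pair
g_ego = rng.binomial(2, p_sub[sub]).astype(float)
g_friend = rng.binomial(2, p_sub[sub]).astype(float)

# A noisy covariate: the subgroup label is wrong for 20% of pairs.
flip = rng.random(n_pairs) < 0.2
noisy_sub = np.where(flip, 1 - sub, sub)

def adjusted_slope(covariates):
    """Coefficient on g_friend when g_ego is regressed on g_friend plus covariates."""
    design = np.column_stack([np.ones(n_pairs), g_friend, *covariates])
    coef, *_ = np.linalg.lstsq(design, g_ego, rcond=None)
    return coef[1]

b_raw = adjusted_slope([])              # no adjustment
b_exact = adjusted_slope([sub])         # ideal covariate
b_noisy = adjusted_slope([noisy_sub])   # imperfect covariate
print(round(b_raw, 3), round(b_exact, 3), round(b_noisy, 3))
```

In this setup the ideal covariate brings the slope to about zero, whereas the imperfect covariate leaves a clearly positive residual slope.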


A commentary on
Friendship and natural selection by Christakis, N., and Fowler, J. (2014). Proc. Natl. Acad. Sci. U.S.A. 111, 10796-10801. doi: 10.1073/pnas.1400825111

Frontiers in Genetics | Evolutionary and Population Genetics
November 2014 | Volume 5 | Article 400