No Evidence that 2D:4D is Related to the Number of CAG Repeats in the Androgen Receptor Gene

The length ratio of the second to the fourth digit (2D:4D) is a putative marker of prenatal testosterone (T) effects. The number of CAG repeats (CAGn) in the AR gene is negatively correlated with T sensitivity in vitro. Results regarding the relationship between 2D:4D and CAGn are mixed but have featured prominently in arguments for and against the validity of 2D:4D. Here, I present random-effects meta-analyses on 14 relevant samples with altogether 1904 subjects. Results were homogeneous across studies. Even liberal estimates (upper limit of the 95% CI) were close to zero and therefore suggested no substantial relationship of CAGn with either right-hand 2D:4D, left-hand 2D:4D, or the difference between the two. However, closer analysis of the effects of CAGn on T-dependent gene activation in vitro and of relationships between CAGn and T-dependent phenotypic characteristics suggests that normal variability of CAGn has mostly no, very small, or inconsistent effects. Therefore, the lack of a clear association between CAGn and 2D:4D has no negative implications for the latter’s validity as a marker of prenatal T effects.

Testosterone effects depend on a structure called the androgen receptor (AR), which comes in different variants, some of them leading to stronger T effects than others. The relationship between these AR variants and 2D:4D has received considerable attention, based on the notion that if 2D:4D reflects prenatal T effects and if AR variants moderate T effects, AR variants should show systematic relationships with 2D:4D [e.g., Ref. (16,17)]. The current paper seeks to describe this relationship. Before this is addressed further, however, it is necessary to look at the link between the AR and T effects in greater detail.
Testosterone regulates the transcription of genes, and this depends on the AR. In the cytoplasm, the AR is bound to heat-shock proteins and therefore inactive. When binding with T or dihydrotestosterone, the AR sheds its heat-shock proteins, changes into an active shape, and migrates to the cell nucleus. There, it connects with coactivators and another AR and then binds in this dimerized form to specific sites in the DNA, where it regulates the transcription of target genes (18,19).
The AR is produced by the AR gene, which is located on the X-chromosome. On exon 1, this gene repeats the nucleotide sequence CAG; the number of these repeats (CAGn) varies interindividually and codes for the length of a polyglutamine stretch on the N-terminal domain of the AR. Most humans have CAGn between 15 and 30; the average is about 22 with a standard deviation of about 3.5 (20). Experiments in vitro demonstrated that longer polyglutamine stretches make the AR less effective, resulting in less AR-regulated genetic activity [e.g., Ref. (21–23)]. In such studies, cell lines from either monkey kidneys or human prostate cancer are transfected with AR gene variants that differ in CAGn. Subsequent activity of a target gene is then measured in the presence and absence of androgen. How strong is the effect of CAGn on target gene activity? I used figures in relevant reports (20–26) to calculate regression slopes that reflect by what proportion target gene activity drops for each additional CAG repeat (cf. Figure 1); activity at "normal" CAGn (around 20) served as the 100% baseline in each case. Where non-linear effects occurred at CAGn outside the normal human range (20,25), I restricted the computation of the regression slope to the CAGn range that produced a linear effect. Figure 1 illustrates this process based on a fictitious in vitro study. As can be seen from Table 1, which provides an overview of the results, regression slopes averaged −2.3%. In short then, high CAGn is associated with low androgen sensitivity in vitro; hence, a positive relationship between CAGn and 2D:4D might be expected.
FIGURE 1 | [...] (21) is set to 100%. Then the regression slope (dashed line, −2.3%) is calculated to describe target gene activity as a function of CAGn. In cases like the present, where CAGn outside the human range produce a deviation from linearity (here 0 CAGn), the regression slope was calculated only for those CAGn that showed a linear function.
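The slope calculation described for the in vitro studies can be sketched in a few lines of Python. This is only an illustration: the data points are fictitious (in the spirit of Figure 1), the function names are hypothetical, and an ordinary least-squares fit restricted to the CAGn range treated as linear is assumed.

```python
def percent_of_baseline(activity, baseline):
    """Express raw target gene activity as a percentage of the baseline value."""
    return [100.0 * a / baseline for a in activity]


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Fictitious in vitro data (assumption, not taken from any cited report).
cag = [0, 12, 21, 30, 42]              # number of CAG repeats per AR construct
raw = [9.0, 6.0, 5.0, 4.0, 2.5]        # target gene activity, arbitrary units

# Activity at "normal" CAGn (here 21) is the 100% baseline.
pct = percent_of_baseline(raw, raw[cag.index(21)])

# CAGn = 0 lies outside the human range and deviates from linearity,
# so it is excluded before fitting the slope.
linear = [(c, p) for c, p in zip(cag, pct) if c >= 12]
slope = ols_slope([c for c, _ in linear], [p for _, p in linear])

print(f"Change in activity per additional CAG repeat: {slope:.1f}%")
```

With these made-up numbers the slope comes out at roughly −2.3 percentage points per repeat, which is why they were chosen; real reports differ in range and scatter.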
The first report of such a correlation (17) became one of the most frequently cited papers in the 2D:4D literature; however, later studies showed an inconsistent picture with a mixture of positive and negative findings [e.g., Ref. (27,16)]. The relationship between CAGn and 2D:4D has often played a prominent role in discussions of the validity of 2D:4D as a marker of prenatal T effects. For example, Breedlove (5) argued, "the strongest evidence that androgens affect digit ratios is the report (17) that normal polymorphism in the AR gene correlates with digit ratios in men" (p. 4117); conversely, Hampson and Sankar (16) concluded that their failure to find a positive relationship between CAGn and 2D:4D "call[s] into question the widespread assumption that small differences in the size of [. . .] [2D:4D] are an accurate gage of relative differences across individuals in fetal testosterone exposure" (p. 560). This paper has two purposes: first, to clarify the relationship between 2D:4D and CAGn, for which I present a meta-analysis of the relevant literature; and second, to discuss in greater detail the implications of this relationship for the validity of 2D:4D as a marker of prenatal T effects.

MATERIALS AND METHODS
Studies were retrieved with the search terms 2D:4D OR digit ratio in conjunction with CAG OR AR in the topics field in ISI Web of Science and in the MeSH Major Topic field in PubMed; this resulted in nine relevant studies from which 14 samples with 792 females and 1331 males entered the analyses. For all samples, CAGn was treated as a continuous measure, and I report Pearson correlations with 2D:4D in all cases. As females (but not males) have two AR gene copies, either the shorter allele, the longer allele, or the bi-allelic mean can be used. One report (28) presented all three analyses (which led to very similar results), and I used the result for the bi-allelic mean in the present analysis. For two other reports that involved females (29,30), it remained unclear on which of the three measures their analysis was based.
In line with the approach in the primary studies, separate meta-analyses were run for 2D:4Dr (right-hand 2D:4D), 2D:4Dl (left-hand 2D:4D), and Dr-l (2D:4Dr − 2D:4Dl). One longitudinal study (29) reported multiple results for each 2D:4D measure and CAG repeats in the same sample. These were averaged so that each sample contributed only one effect size in each meta-analysis. The Knickmeyer et al. (29) samples and the Loehlin et al. (30) study contained sib-pairs. Although this creates statistical dependencies, the weighting of these samples in the analyses was not corrected downwards, mostly because it did not matter, as I will discuss later. Typically, samples showed little or no ethnic heterogeneity; for one atypical study (31), results with ethnic group as a covariate were used. Where relevant information was missing in the publications, authors were contacted (cf. note in Table 1). Random-effects meta-analyses were performed (32), which model the population correlation as a random variable with mean ρ and variance τ². Due to chance effects in sampling, multiple studies into the same phenomenon are expected to produce different results, resulting in variance of the correlations in primary studies. If the observed variance exceeds the variance to be expected due to random sampling, this suggests that primary studies differ in a systematic fashion, i.e., that not all tap into the same population correlation. For example, the correlation between CAGn and 2D:4D might differ for females and males, young and old, etc. τ² reflects to what extent the observed variance in correlations exceeds the variance expected due to random sampling. The Q-statistic is used to test if this excess variance deviates significantly from zero. In the results, I report the standard deviation τ instead of the variance τ² because the former is easier to interpret. Analyses were carried out with Comprehensive Meta-Analysis (2.2.064).
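For readers who want to see the logic of ρ, τ, and Q in executable form, the following is a minimal sketch of a random-effects meta-analysis of correlations. It is not the procedure implemented in Comprehensive Meta-Analysis; it assumes Fisher z-transformed correlations and the DerSimonian-Laird estimator of τ², which is one common choice. The function name and the example data are hypothetical, and τ is returned in the Fisher z metric.

```python
import math


def random_effects_meta(rs, ns):
    """Random-effects meta-analysis of correlations (Fisher z, DerSimonian-Laird)."""
    zs = [math.atanh(r) for r in rs]          # Fisher z transform
    vs = [1.0 / (n - 3) for n in ns]          # sampling variance of each z
    w = [1.0 / v for v in vs]                 # fixed-effect weights
    sw = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sw

    # Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1

    # tau^2: variance in excess of what sampling error alone would produce.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each sampling variance.
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    lo, hi = z_re - 1.96 * se, z_re + 1.96 * se

    return {
        "rho": math.tanh(z_re),
        "ci95": (math.tanh(lo), math.tanh(hi)),
        "q": q,
        "df": df,
        "tau": math.sqrt(tau2),
    }


# Hypothetical correlations and sample sizes, for illustration only.
result = random_effects_meta([0.10, -0.05, 0.02, 0.00], [100, 150, 200, 120])
print(result)
```

When Q does not exceed its degrees of freedom, τ is estimated as zero and the random-effects result coincides with the fixed-effect result, which corresponds to the "homogeneous" case described in the text.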

RESULTS
The results for individual studies are listed in Table 2. The results of the three meta-analyses are summarized in Table 3. As can be seen from column ρ, estimates for all population correlations were close to (and not significantly different from) zero, and all upper limits of the 95% CI were r < 0.09. Estimates for random variance around ρ were zero or small, and not statistically significant. Therefore, no attempt was made to explain differences across study results via meta-regression. Results remained basically unchanged when the female and mixed-sex samples (k = 4) were removed (cf. Table 3); when an unusual sample of male-to-female transsexuals was removed (detailed results not shown here); or when all of the previous were excluded from analysis (detailed results not shown here).

DISCUSSION
Estimates for the population correlations between CAGn and 2D:4D were close to zero and not statistically significant, and even a liberal viewpoint suggests that any relationship is at best very small (largest upper limit for 95% CI in the full data set r = 0.08). Little or no (and statistically non-significant) random variance was observed. Therefore, sampling error suffices to explain the mixture of significant and non-significant findings, and there is no reason to assume that the former meaningfully differ from the latter (35). As mentioned in the method section, the Loehlin et al. (30) and the Knickmeyer et al. (29) samples contained numerous sib-pairs, and this was not reflected in the weighting of these samples in the current analyses. However, Table 1 shows that the results for these samples were either close to the estimates for ρ or else had very small sample sizes and therefore had little impact on ρ estimates in the first place; consequently, somewhat reduced weights for these samples would not have meaningfully altered the outcome of any of the analyses or any conclusions drawn. This is also illustrated by the result of the analysis that excluded mixed-sex samples [i.e., the three Knickmeyer et al. (29) samples]. Overall, the evidence is quite clear then that 2D:4D and CAGn show no substantial relationship. What does this mean for the validity of 2D:4D as a marker of prenatal T effects? Several authors argued that a relationship between CAGn and 2D:4D is to be expected if the latter indeed reflects prenatal T effects (5,16,17), the logic being that if variables A and B correlate, and variables B and C do as well, then a correlation between A and C should emerge. However, if rAB = 0.40 and rBC = 0.20, a reasonable expectation for rAC is their product, 0.08 (assuming A and C are linked only via B), and to differentiate this empirically from the null hypothesis (r = 0.00) is difficult.
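The arithmetic behind this argument can be made concrete with a short sketch. It assumes that A and C are related only through B, so that the expected correlation is the product of the two links, and it uses the standard Fisher z approximation to estimate how many subjects a single study would need to detect such a correlation. The function names, the two-tailed α of 0.05, and the power of 0.80 are illustrative assumptions, not values taken from the primary studies.

```python
import math
from statistics import NormalDist


def expected_r_ac(r_ab, r_bc):
    """Expected A-C correlation if A and C are linked only via B."""
    return r_ab * r_bc


def n_for_power(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect correlation r (two-tailed, Fisher z)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


r_ac = expected_r_ac(0.40, 0.20)
needed_n = n_for_power(r_ac)
print(f"expected r_AC = {r_ac:.2f}, N needed for 80% power = {needed_n}")
```

Under these assumptions the required sample size is on the order of 1,200 subjects, far more than any single primary sample in the present analysis provides.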
There is considerable indirect evidence that the link between CAGn and T effects is weak, which is relevant in this context. First, as discussed in the introduction, in vitro studies suggest that each additional CAG repeat lowers T effectiveness by about 2% (cf. Table 1). Thus, a change of one standard deviation in CAGn [which is about 3.5, Ref. (20)] would result in a T effect change of only about 7% in vitro. Changes of this magnitude might only have a moderate effect on 2D:4D: when Berenbaum et al. (4) looked at the effect of a 100% change in T effects by comparing typically developing men with genetic males affected by complete androgen insensitivity syndrome, the group difference in 2D:4D was about d = 0.5, which is equivalent to a correlation of r = 0.24 (see footnote 1). Moreover, in vitro studies might overestimate the effects of CAGn in vivo, where lower androgen sensitivity due to higher CAGn appears to be counterbalanced by higher circulating T levels, at least in adult men (36,37).
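The two numbers used in this paragraph can be reproduced with a small sketch. It assumes the standard conversion from Cohen's d to a point-biserial correlation for two groups of equal size, r = d / sqrt(d² + 4); the variable and function names are hypothetical.

```python
import math


def d_to_r(d):
    """Convert Cohen's d to r, assuming two groups of equal size."""
    return d / math.sqrt(d * d + 4.0)


# In vitro: about 2% drop in T effectiveness per additional CAG repeat,
# and a standard deviation of about 3.5 repeats in the population.
per_repeat_drop = 2.0   # percent per repeat
sd_cag = 3.5            # repeats
drop_per_sd = per_repeat_drop * sd_cag

# Group difference in 2D:4D between CAIS and typically developing men.
r_from_d = d_to_r(0.5)

print(f"T effect change per SD of CAGn: about {drop_per_sd:.0f}%")
print(f"d = 0.5 corresponds to r = {r_from_d:.2f}")
```

The point of the comparison is one of scale: a complete loss of T effects corresponds to a correlation of about 0.24 with 2D:4D, whereas normal CAGn variation shifts T effects by only a few percent.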
The second line of indirect evidence stems from relationships between CAGn and other T-dependent phenotypes. Androgenetic alopecia (patterned hair loss from the scalp), male infertility, polycystic ovary syndrome, and prostate cancer are conditions in the genesis of which T is clearly implicated (38–41). Following the same line of thought that led to the investigation of a potential link between CAGn and 2D:4D (17), numerous studies looked into the link between CAGn and these conditions. Recent meta-analyses of these studies show that evidence for such a link is at best tentative for prostate cancer and absent for the other three (41–43).
A similar picture emerges for the relationship between CAGn and fat-free mass (FFM). Androgens promote muscle growth and therefore affect FFM (44). Pertinent studies (45–49) report results for 11 samples (median N = 115). Statistically significant results were only obtained for the two male samples in Walsh et al. (49); in both cases, a positive relationship between CAGn and FFM was observed, which runs against expectations.
In a well-controlled intervention study by Woodhouse et al. (44), 61 eugonadal young men received either 25, 50, 125, 300, or 600 mg/week T enanthate treatment for 20 weeks. FFM gains were statistically modeled by T treatment, CAGn, age, initial strength, and other variables. T treatment explained 64% of the variance in FFM gain. The two next best predictors explained another 2 and 1% of variance, respectively, but CAGn was not among them. When T treatment was excluded as a predictor, the best three-variable model explained only 17% of the variance in FFM change, and again CAGn was not among these predictors. In sum, the results of this study do not suggest a sizable negative effect of CAGn on FFM, which is in line with the correlational studies.
Inferences from androgenetic alopecia, male infertility, polycystic ovary syndrome, prostate cancer, and FFM to 2D:4D are tentative because the former concern adult phenotypes whereas the latter is largely determined in utero (11,12). Nonetheless, these domains demonstrate that a T effect on a phenotype does not necessarily mean that CAGn correlates with this phenotype. Therefore, the lack of a substantial link between CAGn and 2D:4D observed here does not necessarily imply that 2D:4D is not affected by prenatal T. On the contrary, the absence of a strong relationship between CAGn and 2D:4D makes the interpretation of 2D:4D findings less ambiguous. If 2D:4D were substantially linked to CAGn, the former might reflect AR effectiveness to a considerable degree. Consequently, a given relationship between 2D:4D and the study variable could reflect effects of circulating T, effects of prenatal T, or both. In light of the nil or near-nil relationship between CAGn and 2D:4D, it seems less likely that observed correlations between 2D:4D and study variables reflect effects of circulating T instead of prenatal T [see also Ref. (14)].
Footnote 1: This is because r = d / √(d² + 4).
Hampson and Sankar (16) conceded that 2D:4D tracks large prenatal T differences between groups (e.g., CAIS vs. typically developing individuals) but argued that the lack of a relationship between CAGn and 2D:4D demonstrates the latter's inability to reflect finer prenatal T differences within each sex. But I showed here that a sizable relationship between CAGn and 2D:4D may not be expected even when 2D:4D reflects prenatal T effects well. Further, strong relationships between 2D:4D and performance in sports have been consistently shown (50–54), which also speaks against the idea that 2D:4D cannot explain within-sex differences. However, 2D:4D differences tend to be moderate (d about 0.4-0.8) between groups that differ strongly in prenatal T effects (4,9,10). This suggests that factors other than prenatal steroids strongly affect 2D:4D (55). Indeed, genetic factors unrelated to T have been implicated (56,57). The use of 2D:4D as a marker for prenatal T effects requires that the non-T variance in 2D:4D is not systematically related to the study variable, and at present we know next to nothing about this point. It would therefore be desirable to better understand the non-T variance in 2D:4D, which might open avenues for its statistical control. Further, a systematic review of the extent to which 2D:4D and other methods that are less accessible but also less controversial [e.g., Ref. (2)] lead to similar conclusions about prenatal T effects on human behavior would appear helpful.

CONCLUSION
A meta-analysis of the literature showed no evidence for a relationship between 2D:4D and CAGn. However, closer inspection of the effects of CAGn on T-dependent gene activation in vitro and of relationships between CAGn and T-dependent phenotypic characteristics suggests that normal variability of CAGn has no, very small, or inconsistent effects. Therefore, the observed lack of an association between CAGn and 2D:4D does not undermine the latter's validity as an indicator of prenatal T effects.

ACKNOWLEDGMENTS
I would like to thank Frank Renkewitz for insightful discussions of statistical matters and Tamsin Saxton for critical comments on the manuscript.