Misregulation of the IgH Locus in Thymocytes

Functional antigen receptor genes are assembled by somatic rearrangements that are largely lymphocyte lineage specific. The immunoglobulin heavy chain (IgH) gene locus is unique amongst the seven antigen receptor loci in undergoing partial gene rearrangements in the wrong lineage. Here we demonstrate that breakdown of lineage-specificity is associated with inappropriate activation of the Eμ enhancer during T cell development by a different constellation of transcription factors than those used in developing B cells. This is reflected in reduced enhancer-induced epigenetic changes, eRNAs, formation of the RAG1/2-rich recombination center, attenuated chromatin looping and markedly different utilization of DH gene segments in CD4+CD8+ (DP) thymocytes. Additionally, CTCF-dependent VH locus compaction is disrupted in DP cells despite comparable transcription factor binding in both lineages. These observations identify multiple mechanisms that contribute to lineage-specific antigen receptor gene assembly.


INTRODUCTION
Somatic gene rearrangements assemble B and T cell antigen receptor genes from individual gene segments during lymphocyte development. This process of V(D)J recombination is lineage specific, such that T cell receptor (TCR) genes recombine only in T lineage cells and immunoglobulin genes recombine completely only in B lineage cells. The only exception to this rule is that the immunoglobulin heavy chain (IgH) gene locus undergoes partial rearrangements in the T lineage (1,2). Functional IgH genes are assembled by two recombination events. The first juxtaposes a diversity (D H ) gene segment to a joining (J H ) gene segment, and the second recombines a variable (V H ) gene segment to the pre-assembled DJ H junction (3,4). D H rearrangements have been shown to occur at several stages of T cell development, including the double positive (CD4 + CD8 + ) stage and ∼50% of mature T cells have a DJ H rearranged IgH allele (2,5). V H recombination has never been detected in WT thymocytes. Several forms of genetic manipulation can, however, induce restricted V H recombination in DP thymocytes. For example, forced expression of Pax5 or inactivating the intergenic control region 1 (IGCR1) leads to recombination of D H -proximal V H 7183 gene segments (6)(7)(8). Additionally, introduction of a V H gene segment near DFL16.1 results in its recombination in DP cells (9). The breakdown of lineage specificity of IgH locus rearrangements remains a unique feature amongst antigen receptor genes. Our working hypothesis is that understanding this phenomenon may provide insight into regulatory mechanisms that impose specificity of V(D)J recombination and more generally into tissue-specific gene expression.
Recombination activating gene products Rag1 and Rag2 initiate V(D)J recombination at immunoglobulin and TCR loci by introducing double-strand breaks at recombination signal sequences (RSSs) associated with rearrangeable gene segments (10,11). Accessibility of the recombinase to antigen receptor loci is governed by regulated changes in chromatin structure of individual V, D, and J gene segments. This is referred to as the chromatin accessibility hypothesis which originates from observations that activation for rearrangement correlates with transcription of unrearranged loci (12,13). Subsequent studies showed that transcriptional enhancers associated with antigen receptor loci are required for lineage-specific V(D)J recombination (14)(15)(16)(17)(18)(19). Thus, enhancers are at the crux of the accessibility hypothesis.
Several studies demonstrate that breakdown of lineage-specific recombination at the IgH locus is related to enhancer activity. Ferrier et al. first showed that the IgH intronic enhancer Eµ supports TCR Dβ to Jβ recombination on a transgenic substrate in both T cells and B cells (20). These observations were extended by replacing the TCRβ enhancer (Eβ) with Eµ at the TCRβ locus, which permitted partial Dβ to Jβ rearrangements in T cells (14). Conversely, Afshar et al. reported that Eµ deletion at the IgH locus abrogated D H to J H recombination in thymocytes (21). Since Eµ is essential for efficient V(D)J recombination in pro-B cells, these observations suggest that lack of lineage specificity of Eµ underlies promiscuous D H recombination in DP thymocytes. However, the extent and basis of Eµ activity in DP thymocytes have not been addressed.
To better understand the mechanisms of partial IgH rearrangements in thymocytes, we examined transcription, recombination and epigenetic state of the IgH locus in CD4 + CD8 + (DP) thymocytes. We found the locus to be partially active in DP cells compared to pro-B cells by all criteria assayed. This state correlated with the absence of a subset of transcription factors from Eµ in DP thymocytes compared to pro-B cells, suggesting that partial locus activation resulted from inappropriate Eµ function. We also found that CTCF-dependent steps of IgH locus compaction were abrogated in DP thymocytes despite binding of this architectural protein throughout the locus, providing a plausible explanation for the lack of V H recombination in these cells. Our observations highlight lineage-specific steps of locus activation that are required for complete IgH gene rearrangements in pro-B cells.

Cell Purification
CD19 + pro-B cells were purified from Rag2 −/− C57BL/6 mice by positive selection using CD19 beads (Stem Cell Technology, Cat # 18754). CD4 + CD8 + cells were purified from thymi of TCRβ × Rag2 −/− transgenic mice by positive selection using CD8 beads per manufacturer's instruction (Stem Cell Technology, Cat # 18753). All mouse experiments were done in accordance with the Animal Care and Use Committee of the National Institute on Aging.
WT DP thymocytes were enriched from thymi of C57BL/6 mice by positive selection using CD8 beads per manufacturer's instruction (Stem Cell Technology, Cat # 18753).

Chromatin Immunoprecipitation (ChIP)
Chromatin immunoprecipitation of modified histones and transcription factors was carried out with pro-B cells derived from bone marrow of Rag2 −/− mice and DP thymocytes derived from TCRβ × Rag2 −/− transgenic mice or WT mice as described previously (22). Modified histone antibodies were purchased from Active Motif: anti-H3K4me3 (Cat # 39519), anti-H3K9ac (Cat # 39137), anti-H3K27me3 (Cat # 39155). Transcription factor antibodies were as follows: anti-E2A (Cat # Sc-349), anti-YY1 (Cat # Sc-1703), anti-Ets-1 (Cat # Sc-350), and anti-HEB (Cat # Sc-357) from Santa Cruz Biotechnology; anti-Runx1 (Cat # ab23980) and anti-Rad21 (Cat # ab992) from Abcam; and anti-CTCF (Cat # 07-729) from Millipore. Formaldehyde cross-linked and sonicated chromatin from 5 × 10^6 cells was pre-cleared with 5 µg of non-specific rabbit IgG and immunoprecipitated with the relevant antibody or an equal amount of non-specific IgG. The co-precipitated DNA was purified, and input DNA and immunoprecipitated DNA were quantified using PicoGreen (Molecular Probes/Life Technologies). For analysis of enrichment, 200 pg of DNA was used in each real-time PCR reaction; each reaction was performed in triplicate and each ChIP was performed in duplicate. The abundance of amplicons in the immunoprecipitated DNA relative to input was analyzed by real-time PCR using the primers listed in Table S1. Rag1/Rag2 ChIP was carried out using anti-Rag1 (Abcam, Cat # ab172637) and anti-Rag2 (David Schatz, Yale University) antibodies as described by Ji et al. (23).
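The enrichment calculation described above can be illustrated with a minimal sketch. It assumes that equal masses of immunoprecipitated and input DNA are compared, that amplification is close to perfectly efficient (a factor of 2 per cycle), and that enrichment is derived from the difference in Ct values; the function names and the Ct values are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def fold_enrichment(ct_ip, ct_input, efficiency=2.0):
    """Enrichment of an amplicon in IP DNA relative to an equal mass of input DNA.

    ct_ip, ct_input: technical-replicate Ct values (e.g., triplicates).
    efficiency: assumed amplification factor per cycle (2.0 = ideal PCR).
    """
    return efficiency ** (mean(ct_input) - mean(ct_ip))


def mean_and_sem(values):
    """Mean and standard error of the mean across independent experiments."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else 0.0
    return mean(values), sem


# Hypothetical Ct triplicates from two independent ChIP experiments
rep1 = fold_enrichment([24.0, 24.1, 23.9], [27.0, 27.1, 26.9])
rep2 = fold_enrichment([24.5, 24.5, 24.5], [27.0, 27.0, 27.0])
avg, sem = mean_and_sem([rep1, rep2])
print(f"fold enrichment = {avg:.2f} +/- {sem:.2f} (n = 2)")
```

With only two independent experiments, the standard error reduces to half the absolute difference between the two values, which is worth keeping in mind when reading the error bars.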

DNase I Sensitivity Assay
10^7 nuclei from Rag2 −/− CD19 + cells and CD4 + CD8 + cells from TCRβ × Rag2 −/− transgenic mice were treated with different concentrations of DNase I. Twenty-five nanograms of purified genomic DNA was used in quantitative PCR assays performed in duplicate with primer pairs shown in Table S1. The amplicons were normalized to the amount of intact β-globin alleles at each DNase I concentration as described previously (24).
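One way to express the β-globin normalization is a ΔΔCt-style calculation, sketched below. This is an assumption about the arithmetic rather than the procedure of reference (24): the signal of each test amplicon is first referenced to β-globin at the same DNase I concentration and then to the untreated sample. The function names and Ct values are hypothetical.

```python
def relative_intact(ct_test, ct_ref, ct_test_0, ct_ref_0, efficiency=2.0):
    """Fraction of intact test amplicon, normalized to the reference amplicon.

    ct_test, ct_ref: Ct values at a given DNase I concentration.
    ct_test_0, ct_ref_0: Ct values in the untreated sample.
    efficiency: assumed amplification factor per cycle (2.0 = ideal PCR).
    """
    ddct = (ct_test - ct_ref) - (ct_test_0 - ct_ref_0)
    return efficiency ** (-ddct)


def sensitivity_curve(test_cts, ref_cts):
    """Normalized signal across a DNase I series (index 0 = untreated)."""
    return [
        relative_intact(t, r, test_cts[0], ref_cts[0])
        for t, r in zip(test_cts, ref_cts)
    ]


# Hypothetical Ct values for one test amplicon and beta-globin
# at increasing DNase I concentrations
curve = sensitivity_curve([22.0, 23.0, 24.5], [21.0, 21.0, 21.5])
print(curve)
```

A curve that falls faster than that of a negative-control amplicon indicates greater DNase I sensitivity under these assumptions.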

RNA Analysis
Total RNA was extracted from Rag2 −/− CD19 + cells and CD4 + CD8 + cells derived from TCRβ × Rag2 −/− transgenic mice using RNeasy plus microkit (Qiagen). Two hundred nanograms of RNA was reverse transcribed with Superscript III (Invitrogen) and strand-specific primers were used according to manufacturer's protocol. Quantitative PCR was performed with SYBR green using primer pairs described in Table S1.
For analysis of Eµ sense and antisense transcripts, strand-specific primers were used to prime cDNA synthesis, and the resulting amplicons were used to determine the copy number of sense and antisense RNA. To calculate the number of eRNA molecules, a standard curve was generated by plotting the Ct values of known concentrations of a 1 kb DNA fragment (amplified from genomic DNA) that covers both the sense and antisense transcribed regions of the IgH locus. The copy number of sense and antisense RNA was calculated (after RT and qPCR) using the equation

$$\text{copy number} = \frac{\text{amount of cDNA (fg)} \times 6.022 \times 10^{23}}{\text{length of cDNA (bp)} \times 1 \times 10^{15} \times 650}$$
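The copy-number equation and the standard-curve step can be sketched as follows. The constants come from the equation above (Avogadro's number, 650 g/mol per base pair, 10^15 fg per g); the least-squares fit of Ct against log10(copies), the function names, and the example values are assumptions made for illustration.

```python
import math

AVOGADRO = 6.022e23        # molecules per mole
G_PER_MOL_PER_BP = 650.0   # average mass of one base pair
FG_PER_G = 1e15            # femtograms per gram


def copy_number(amount_fg, length_bp):
    """Number of DNA molecules in amount_fg femtograms of a length_bp fragment."""
    return (amount_fg * AVOGADRO) / (length_bp * FG_PER_G * G_PER_MOL_PER_BP)


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mx, my = sum(xs) / n, sum(cts) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, cts)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


# 1 fg of the 1 kb standard corresponds to roughly 926 molecules
print(round(copy_number(1.0, 1000)))

# Hypothetical dilution series of the standard
slope, intercept = fit_standard_curve([1e3, 1e4, 1e5], [30.0, 26.68, 23.36])
print(round(copies_from_ct(26.68, slope, intercept)))
```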

D H Rearrangements
Fifty nanograms of genomic DNA and 4-fold serial dilutions from pro-B cells and CD4 + CD8 + cells from wild type C57BL/6 mice were used to amplify DJ H junctions using primers listed in Table S1. An amplicon from the mouse β-globin locus was used to normalize across samples.
For deep sequencing, DSP2-J H 1 amplified products were ligated to adaptors and sequenced using an Ion Proton sequencer (ThermoFisher Scientific). FASTQ files containing single-end, variable-length reads were obtained from the Ion Proton sequencer. Adaptor contamination and low-quality bases (below a Phred quality score of 20) were removed with the Cutadapt program, leaving reads more than 160 bases long for further analysis. Duplicate reads were removed from the FASTQ files using Clumpify (from the BBTools suite of programs, Department of Energy Joint Genome Institute) with default parameters except dedupe=t (i.e., remove duplicates). Link: https://jgi.doe.gov/data-and-tools/bbtools/. After duplicate removal, the reads were aligned to a custom DSP genome from C57BL/6 (mm9) using the Bowtie2 aligner (with the --very-sensitive-local option). Reads with a minimum mapped length of 100 bases were counted against the specified regions using SAMtools. The regions used for counting were DSP2pt9_7-84, DSP2pt2_4698-4790, DSP2ptX1_9378-9457, DSP2ptX2_14022-14115, DSP2pt3_18693-18764, DSP2pt5_24559-24663.
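The counting step can be illustrated with a short sketch that is independent of SAMtools. It parses the region labels listed above, keeps alignments with at least 100 mapped bases, assigns each to the region it overlaps, and reports percent utilization. The overlap criterion, the 1-based inclusive coordinates, the function names, and the example alignments are assumptions, not details of the original analysis.

```python
REGIONS = [
    "DSP2pt9_7-84", "DSP2pt2_4698-4790", "DSP2ptX1_9378-9457",
    "DSP2ptX2_14022-14115", "DSP2pt3_18693-18764", "DSP2pt5_24559-24663",
]


def parse_region(label):
    """Split 'DSP2pt9_7-84' into ('DSP2.9', 7, 84)."""
    name, coords = label.rsplit("_", 1)
    start, end = (int(x) for x in coords.split("-"))
    return name.replace("pt", "."), start, end


def count_reads(alignments, regions, min_mapped=100):
    """Count alignments (start, mapped_length) that overlap each region."""
    parsed = [parse_region(r) for r in regions]
    counts = {name: 0 for name, _, _ in parsed}
    for pos, mapped_len in alignments:
        if mapped_len < min_mapped:
            continue  # discard short alignments
        aln_end = pos + mapped_len - 1
        for name, start, end in parsed:
            if pos <= end and aln_end >= start:
                counts[name] += 1
                break  # regions are far apart, so at most one can match
    return counts


def percent_utilization(counts):
    """Convert per-segment read counts to percentages of all counted reads."""
    total = sum(counts.values())
    return {k: (100.0 * v / total if total else 0.0) for k, v in counts.items()}


# Hypothetical alignments: (1-based start on the custom reference, mapped length)
alignments = [(1, 150), (4650, 120), (24500, 200), (24510, 180), (24520, 90)]
counts = count_reads(alignments, REGIONS)
usage = percent_utilization(counts)
print(usage)
```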

Fluorescence in situ Hybridization (FISH)
Pro-B cells from Rag2 −/− mice and DP thymocytes derived from TCRβ × Rag2 −/− mice were used for FISH analyses. FISH was performed as described by Guo et al. (22). BAC probes were named RI (115051557-115227487) (BAC 373N4), RII (115944024-116124641) (BAC 70F21), and RIII (116777388-117011222) (BAC 368C22) (all BACs were purchased from ThermoFisher Scientific). Other position-specific 4-10 kb probes were generated by PCR using genomic DNA with the primers listed in Table S1. After probe hybridization to fixed cells (22), image acquisition was carried out using a Nikon T200 epifluorescence microscope. Twenty-five to thirty optical sections spaced by 0.2-0.3 µm were acquired and the dataset was deconvolved using NIS-Elements software (Nikon). Spatial distances between probes were measured and analyzed statistically as previously described (25).
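As an illustration of how probe separation can be summarized, the sketch below converts probe centroids from voxel coordinates to micrometers, bins the distances into ranges, and builds a cumulative frequency distribution. It is not the measurement procedure of reference (25); the pixel size, z-step, bin edges, function names, and coordinates are hypothetical.

```python
import math
from bisect import bisect_right


def distance_um(p, q, xy_um, z_um):
    """3D distance between two probe centroids given in voxel coordinates.

    xy_um: lateral pixel size in micrometers.
    z_um: spacing between optical sections in micrometers.
    """
    dx = (p[0] - q[0]) * xy_um
    dy = (p[1] - q[1]) * xy_um
    dz = (p[2] - q[2]) * z_um
    return math.sqrt(dx * dx + dy * dy + dz * dz)


def percent_in_bins(distances, edges):
    """Percentage of alleles whose distance falls in each range.

    edges [a, b] define the ranges <a, a to <b, and >=b.
    """
    counts = [0] * (len(edges) + 1)
    for d in distances:
        counts[bisect_right(edges, d)] += 1
    return [100.0 * c / len(distances) for c in counts]


def cumulative_frequency(distances):
    """Sorted (distance, cumulative fraction) pairs."""
    ordered = sorted(distances)
    n = len(ordered)
    return [(d, (i + 1) / n) for i, d in enumerate(ordered)]


# Hypothetical example: two probe centroids and a small set of distances
d = distance_um((0, 0, 0), (3, 4, 0), xy_um=0.1, z_um=0.25)
bins = percent_in_bins([0.1, 0.3, 0.6, 0.9], edges=[0.2, 0.5])
cdf = cumulative_frequency([0.3, 0.1])
print(d, bins, cdf)
```

Cumulative distributions for two cell types can then be compared directly, which avoids dependence on the choice of bin edges.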

Analysis of Chromatin Accessibility at IgH Locus in DP Thymocytes
IgH rearrangements can be seen early in lymphopoiesis in the common lymphoid precursor (CLP) and early thymic progenitors (ETP) in WT mice (5). Additionally, ongoing IgH rearrangements have been noted in CD4 − CD8 − double negative (DN) and CD4 + CD8 + double positive (DP) thymocytes (6). Because thymic rearrangements are Eµ-dependent, we tested the hypothesis that inadequate Eµ activation underlies partial IgH recombination in DP thymocytes. For this, we assayed several parameters of Eµ function in these cells. To maintain the IgH locus in germline configuration, we used DP thymocytes obtained from recombinase (Rag2)-deficient mice that expressed a transgenic TCRβ chain gene (TCRβ × Rag2 −/− ) (26,27). We previously showed that Eµ regulates both H3K4me3 and H3K9ac histone modifications (associated with active chromatin) in the D H -Cµ region in primary pro-B cells (28). Both these marks were reduced in DP thymocytes compared to pro-B cells. Tcf7 and Lck gene promoters served as T lineage-specific positive controls (Figures 1A,B). γ-actin promoter and Cγ3 amplicons served as positive and negative controls, respectively. These results were similar to reduced activation-specific histone modifications on Eµ-deficient alleles in pro-B cells (28). H3K27me3, a mark associated with inactive chromatin (29-31), did not differ substantially between pro-B cells and DP thymocytes (Figure S1A). We verified these observations with ChIP analysis of an activation-specific mark (H3K4me3) in WT DP thymocytes (Figure S1B). Genome-wide ChIP-seq studies (32,33) of WT DP thymocytes also revealed activation modifications in the D H -Cµ part of the locus; however, the relative levels compared to pro-B cells could not be inferred from those studies. We conclude that the D H -Cµ part of the locus is partially active in DP cells.

FIGURE 1 | [...] Cγ3 was used as a negative control and γ-actin served as a positive control in both cell types. For each independent experiment PCR was done in triplicate. Results shown are the mean of two independent experiments. Error bars represent standard error of the mean (n = 2). Y-axis shows enrichment of respective amplicons in the immunoprecipitate compared to an equal amount of input DNA as described in the methods. (C) DNase I sensitivity analysis of the proximal part of the IgH locus in pro-B cells and DP thymocytes. 10^7 nuclei from CD19 + pro-B cells from Rag2 −/− mice and DP thymocytes from TCRβ × Rag2 −/− mice were treated with increasing concentrations of DNase I (X-axis) followed by purification of genomic DNA. Equal amounts of DNA were used for amplification with the indicated primers. The signal from each amplicon was normalized to that from a β-globin amplicon for each DNase I concentration. β2m promoter served as a positive control while Cγ3 served as negative control. TCRα enhancer was used as an additional positive control for DP thymocytes. The data represent the mean of two independent experiments. Error bars represent standard error of the mean (n = 2).

Frontiers in Immunology | www.frontiersin.org
Partial locus accessibility of the D H -Cµ domain in DP cells was further confirmed by DNase I sensitivity analysis. Nuclei from DP thymocytes or Rag2-deficient primary pro-B cells were treated with varying concentrations of DNase I, followed by quantitative PCR to query specific regions. Signals from test amplicons were normalized to a β-globin amplicon (24). We found reduced DNase I sensitivity in the region extending from DQ52 to Eµ in DP cells compared to primary pro-B cells (Figure 1C). Cγ3 and TCRα enhancer (Eα) served as negative and positive controls, respectively (Figure 1C). Overall, ChIP and accessibility assays revealed a partially active D H -Cµ domain in DP thymocytes, reminiscent of the state of Eµ-deleted IgH alleles in pro-B cells.

Assessment of IgH Intronic Enhancer Activity in DP Thymocytes
We directly examined the state of Eµ with additional ChIP and transcription studies. Based on genome wide studies, the prevailing view is that poised enhancers are marked by H3K4me1 and H3K27me3, whereas active enhancers are marked by H3K27ac (34)(35)(36). Consistent with this view, Eµ sequences were enriched for H3K27ac, with close to basal level of H3K4me1 in pro-B cells ( Figure 2B). By contrast, Eµ was marked by both H3K27ac as well as H3K4me1 in DP thymocytes. Eα and TCF7 enhancers, which served as positive controls in DP thymocytes, had high levels of H3K27ac in DP cells but not in pro-B cells. Increased H3K4me1 at Eµ in DP cells compared to pro-B cells was consistent with its being partially active in DP cells. This conclusion was substantiated by reduced Eµ-associated eRNA levels in DP cells (Figure 2C).
Eµ binds several transcription factors (37-41) (Figure 2A). To understand the basis for partial Eµ activity, we compared transcription factor occupancy at the enhancer in DP (TCRβ × Rag2 −/− ) and pro-B cells by ChIP. We found that YY1 levels at Eµ were reduced in DP cells compared to pro-B cells, RUNX1 binding to Eµ was comparable in pro-B and DP cells, whereas E2A and Ets-1 binding was substantially lower (Figure 2D). We assayed several of these factors in DP thymocytes enriched from WT C57BL/6 mice and observed similar trends. YY1 and RUNX1 bound Eµ in WT DP cells, whereas E2A binding was lower at Eµ compared to Eβ and Eα (Figures S2A-E). The observed changes were not due to differential expression of these transcription factors ( Figure 2E). To determine whether absence of E2A from Eµ could be accounted for by a different basic helix-loop-helix protein, we carried out ChIP with anti-HEB antibodies in WT DP cells. HEB binding to Eµ was easily evident in these cells ( Figure S2E). We propose that partial and perhaps inappropriate transcription factor binding to Eµ underlies its compromised activity in DP cells.

D H to J H Recombination in Thymocytes
During D H recombination in pro-B cells the 5′-most and 3′-most D H gene segments, DFL16.1 and DQ52, are used most frequently (42)(43)(44). Even amongst intervening DSP2 gene segments, those located at either end of the cluster rearrange more frequently (Figure S3A). This distribution has been proposed to be due to the looped configuration of IgH alleles that places DFL16.1 closest to the J H -associated recombination center (22,45). Though DJ H junctions were noted in DP cells many years ago, D H utilization in thymocytes has not been quantified. Because patterns of D H utilization may provide mechanistic insights, we assayed D H recombination frequency in DP cells. We used 5′ primers specific for DFL16.1, or a pan-DSP2 primer that binds to six DSP2 gene segments, with a 3′ primer located 3′ of J H 4 (Figure 3A) to amplify DJ H rearrangements using genomic DNA from bone marrow pro-B cells or DP thymocytes (Figures S3B-D). Following fractionation by agarose gel electrophoresis and quantitation, we observed the expected over-utilization of DFL16.1 in primary pro-B cells (Figure 3B, lanes 2-5). The equivalent intensity of DFL16.1 and DSP2 rearrangements in the gel image results from the pan-DSP2 primer scoring for six different DSP2 gene segments. In contrast, the proportion of DFL16.1 usage was greatly reduced in DP thymocytes (Figure 3B, lanes 6-9), as quantitated in Figure 3C. We also observed low occupancy of recombinase proteins at the IgH locus in DP thymocytes relative to pro-B cells (Figure 3D). Reduced utilization of DFL16.1 in DP thymocytes suggests that the configuration of the IgH locus within which D H rearrangements occur in DP cells differs from that in pro-B cells.
In pro-B cells increased utilization of DFL16.1 has been attributed to spatial proximity of DFL16.1 to the J H -associated recombination center (RC) via Eµ/IGCR1 interaction. Our analysis of D H utilization in DP cells suggested that Eµ may not efficiently recruit DFL16.1 to the recombination center in these cells. To test this possibility, we carried out chromosome conformation capture (3C) assays in Rag2 −/− pro-B cells and DP thymocytes derived from TCRβ × Rag2 −/− transgenic mice. Eµ interactions with IGCR1 and with HS5 were reduced in DP thymocytes compared to pro-B cells (Figure S3E). We used the Eα-TEAp interaction as a positive control in thymocytes, and Eµ-HS5 served as a positive control for pro-B cells. Eµ-RPL32 was used as an out-of-locus negative control. These data indicated that reduced DFL16.1 utilization in DP thymocytes may be the consequence of weaker association between Eµ and IGCR1.
To determine whether skewed D H usage was also evident amongst the closely related DSP2 gene segments, we amplified DSP2-J H 1 rearrangements from DP cell DNA and sequenced the resulting amplicons. We identified individual DSP2 gene segments from 5′ flanking sequences, and non-redundant sequence reads were identified as described in the Materials and Methods section. Representative sequences from DP thymocytes confirmed unique DJ H rearrangement-associated junctional diversity (Figure S3F and Table S2). Quantitation of DSP2 utilization revealed striking over-utilization of the 3′-most DSP2 gene segment DSP2.5 in these rearrangements (Figure 3E left panel). To further confirm the sequencing results, we cloned and sequenced DSP2-J H 1 junctions from DP cells. Thirty out of 50 unique cloned sequences contained DSP2.5 (Figure 3E right panel). Sequences of recombined products from colony sequencing are shown in Table S3. We conclude that the D H rearrangements in DP thymocytes preferentially utilize gene segments located near the 3′ part of the D H locus.

FIGURE 2 | [...] Enhancer-associated histone modifications were scored by ChIP using anti-H3K4me1 and anti-H3K27ac antibodies in CD19 + pro-B cells and DP thymocytes derived from TCRβ × Rag2 −/− transgenic mice. Y-axis represents fold enrichment of the indicated amplicon in the immunoprecipitate compared to an equal amount of input DNA. For each independent experiment PCR was done in triplicate. The data shown are the mean of two independent ChIP experiments. Error bars represent standard error of the mean (n = 2). γ-actin was used as a positive control, TCF7 enhancer was used as a positive control for a T-lineage-specific gene and Cγ3 was used as a negative control. (C) Levels of eRNAs that originate within the enhancer in DP thymocytes and pro-B cells. Reverse transcription was carried out with strand-specific primers, or no primer (np), followed by amplification with primers that score for sense and antisense Iµ transcripts. RNA amounts were calculated based on a standard curve obtained from serial dilutions of 1 kb DNA spanning both the sense and antisense transcribed regions. (D) Transcription factor binding to Eµ was assayed by chromatin immunoprecipitation using antibodies directed against the indicated factors. Enrichment of specific amplicons in co-precipitated DNA was calculated relative to an equal amount of input DNA (Y-axis). Error bars represent standard error of the mean (n = 2). Positive controls for each transcription factor correspond to the first set of bars in each graph. IgG served as an additional negative control. (E) YY1, E2A, Ets-1, and RUNX1 expression in DP thymocytes and Rag2 −/− pro-B cells was assayed by immunoblotting with the respective antibodies.

FIGURE 3 | [...] (Figures S3B-D) were used in PCR reactions with a 5′ primer located upstream of DFL16.1 (green arrow), or one that hybridizes to all 6 DSP2 gene segments (brown arrows), and a 3′ primer located after J H 4 (black arrow). Four-fold serially diluted DNAs were used for PCR amplification followed by separation of the products by electrophoresis through 1% agarose gels. PCR analysis was carried out with two independent preparations of pro-B and DP cells and the data shown are one representative example. An amplicon from the β-globin gene was used as a loading control and a no-DNA control is shown in lane 1. (C) DSP2 and DFL16.1 utilization in pro-B cells and DP thymocytes was calculated after band intensity quantitation from two different gels using Gene Tool. Error bars represent standard error of the mean between two independent gel quantitations. (D) Rag1 and Rag2 binding to the IgH locus was evaluated by chromatin immunoprecipitation. Immunoprecipitated genomic DNA and input DNA were used for qPCR and fold enrichment was calculated as described by Ji et al. (23). DP thymocytes were compared to the D345 pro-B cell line as indicated. γ-actin served as a positive control for Rag2 but a negative control for Rag1. Cγ3 was used as a negative control for both Rag1 and Rag2. The TCR Jβ2 gene (TRBJ2-5) was used as an additional positive control for DP thymocytes. For each independent experiment PCR was done in triplicate. Data shown are the mean of two independent experiments. Error bars represent standard error of the mean (n = 2). (E) Utilization of DSP2 gene segments in DJ H junctions in DP thymocytes from C57BL/6 mice. DSP2-J H 1 recombined products were amplified using a forward pan-DSP2 primer and a reverse primer located 3′ of J H 1. Amplification products were gel purified followed by adapter ligation and sequencing (left panel). The number of reads aligned to each DSP2 gene is shown in Figure S3F. Percentages of reads mapping to the indicated DSP2 gene segments are shown (after removal of redundant reads). DSP2-J H 1 amplification products were also cloned into pGEM-T vector and 60 clones were sequenced (right panel). The number of clones with each gene segment is shown in the bar graph. Thirty clones were sequenced from each of two different PCR amplifications. The 50 out of 60 clones that had unique junctional sequences are represented in the bar graph.

Basis for Lack of V H Recombination in Thymocytes
V H recombination does not occur in DP thymocytes on WT IgH alleles, despite availability of a DJ H junction in recombinase-expressing DP cells. One possibility is that these cells do not survive long enough to undergo two sequential recombination events. Alternatively, V H gene recombination may be prohibited for mechanistic reasons. An essential requirement for V H recombination is for these gene segments to come into proximity of the DJ H junction to permit Rag1/2-dependent synapsis. Current evidence indicates that distal and proximal V H genes are differently regulated in pro-B cells. We previously proposed that distal V H gene segments come close to the 3′ IgH domain via three inter-dependent steps regulated by three different transcription factors (25). The first step uses CTCF to fold the V H region into domains of several hundred kb. The second step further compacts the distal V H region using Pax5, and the third step brings the pre-folded V H region close to the DJ H part of the locus by a process that requires YY1. By contrast, Pax5 and YY1 are not required for proximal V H rearrangements (7,46), suggesting that CTCF-dependent interactions are sufficient. We used fluorescence in situ hybridization (FISH) to investigate the conformational state of the IgH locus in DP thymocytes. The first level of (CTCF-dependent) compaction was analyzed with probes located in different parts of the V H locus. We found that the probe pair V10-V10-3, located in the distal V H region, was comparably positioned in pro-B and DP cells but differently positioned in non-B lineage bone marrow cells (Figure 4A). By contrast, spatial proximity of the IGCR1-V3 probe pair, querying proximal V H interactions, was similar in DP and non-B cells (Figure 4A and Figure S4A). This was neither due to differential expression of CTCF between DP cells and pro-B cells (data not shown) nor due to lack of CTCF binding to the IgH locus in these cell types based either on genome-wide assays or ChIP followed by quantitative PCR (Figures S4B,C). We conclude that presence or absence of CTCF-dependent interactions in DP cells depends on their location within the IgH locus. Since 60-70% of CTCF sites in the IgH locus are also bound by RAD21, we assayed RAD21 binding at the IgH locus. We found that RAD21 binding was reduced at proximal V H gene segments (Figure S4C), consistent with a recent observation by Loguercio et al. (47). We used additional primers from the distal V H region to assay RAD21 binding in WT DP thymocytes and Rag2 −/− pro-B cells (Figure S4D). RAD21 binding varied between the 3 new amplicons, precluding any definitive conclusions regarding RAD21 recruitment to CTCF-bound sites in the distal V H J558 region. In our working model a CTCF-looped distal V H region is further compacted by Pax5. Because Pax5 is not expressed in the T lineage, it was likely that this second level of compaction would be absent in DP cells. We verified this prediction using bacterial artificial chromosome (BAC) probes RII and RIII that span the distal V H region (Figure 4B). The proximal 1 Mb of the V H region was examined using BAC probes RI and RII. This association was more similar in DP and pro-B cells than in non-B cells (Figure 4B and quantitated in Figure 4C). The final step of locus compaction brings a pre-folded V H region into proximity of the D H -Cµ domain, mediated by the transcription factor YY1. We queried this interaction with probes located near Eµ and different parts of the locus. Eµ interaction with both proximal (V H 7183) and distal (V H J558) parts of the locus was substantially disrupted in DP thymocytes (Figures S4E,F). Overall, V H region compaction is affected at multiple levels in DP cells compared to pro-B cells.

DISCUSSION
Antigen receptor genes recombine in a lineage- and stage-specific manner using enhancers to regulate chromatin structure and, thereby, locus accessibility to recombinase. One exception to this rule is that up to 50% of T lymphocytes contain partially rearranged IgH alleles, demonstrating that IgH rearrangements are not limited to B lymphocytes (2). These rearrangements likely occur at multiple developmental stages including ETP, DN and DP stages (5). Here we probed IgH locus structure in DP thymocytes to address three questions. First, why do D H gene segments recombine in DP cells, and does this step follow the same rules as in pro-B cells? Second, why does V H recombination not occur in DP cells? And third, do these properties provide insights into mechanisms that regulate tissue-specific locus activation?

Regulation of D H Recombination in Thymocytes
Earlier observations that Eµ deficiency abrogates D H recombination in thymocytes provide a preliminary answer to the first question (21). Namely, D H gene segments rearrange because Eµ makes the IgH locus accessible to recombinase in DP cells. However, we now show that the mechanism by which Eµ is activated and the consequences of Eµ activity differ between DP and pro-B cells. We demonstrate that the constellation of transcription factors that bind Eµ in DP cells differs considerably from those that activate it in pro-B cells. Significantly missing from the enhancer in DP cells are the ETS-domain protein Ets-1 and the bHLH protein E2A. The latter may be substituted by the related protein HEB. We did not check for PU.1 binding to Eµ because this protein is not expressed in DP cells (48). Additionally, YY1 binding to Eµ is reduced in DP cells. Thus, the enhancer milieu in DP thymocytes is quite different from that in pro-B cells. We suggest that this configuration results in inappropriate enhancer activity, leading to differences in epigenetic features, chromosome conformation, production of eRNAs and RAG recruitment to the IgH locus in DP cells compared to pro-B cells.

FIGURE 4 | [...] Top panel shows the unrearranged IgH locus with locations of FISH probes. Probes labeled as V10, V10-3, V3, and IGCR1 were generated by PCR amplification from bacterial artificial chromosome (BAC) DNA and range in length from 4 to 10 kb. RI-RIII are BACs. Two-color DNA-FISH was carried out for CTCF-dependent interactions using the FISH probes indicated on the top. IgH alleles were marked with BAC RP23-201H14 (light blue), which is located 200 kb 3′ of HS4. Representative nuclei are shown. Spatial distances between probes were measured as described in (22). Bar graph shows the percentage of IgH alleles in which the distance between probes fell in the ranges shown in different colors (n = 100). (B) BAC probes RI, RII, and RIII, labeled as indicated, were hybridized simultaneously to pro-B cells, DP cells, and non-B cells derived from bone marrow of Rag2 −/− mice. The percentages of IgH alleles in which distances between BAC probes fell in the indicated ranges are shown in the bar graph (n = 100). (C) Cumulative frequency distributions of spatial distances for each probe combination (RI-RII, RI-RIII, and RII-RIII) are indicated.
The ensemble-average nature of ChIP studies precludes a molecular definition of "inappropriate enhancer activity." The clearest differences between pro-B cells and DP thymocytes with regard to Eµ occupancy by transcription factors are the absence of both Ets-1 and PU.1 and the substitution of E2A by HEB at Eµ in DP cells. The current analysis does not rule out that Ets-1 and PU.1 may be replaced by another ETS-domain protein in thymocytes. These features appear to characterize most DP thymocytes. By contrast, YY1 binding in DP cells is reduced to ∼50% of the level in pro-B cells. It is impossible to distinguish whether the YY1-bound state represents 50% of cells or 50% of alleles. Adding up these features leads to a view of Eµ bound by RUNX1, HEB, and YY1 on a subset of alleles in DP thymocytes. Our working hypothesis is that absence of the right combination of ETS-domain proteins at Eµ effectively cripples optimal enhancer activity. This is reflected in maintenance of H3K4me1, a mark of poised enhancers, at Eµ in DP thymocytes.
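The ambiguity in interpreting a ∼50% YY1 signal can be made concrete with a small worked example. It assumes, as a simplification, that the population-level ChIP signal is proportional to the fraction of all alleles carrying the factor; the function name and scenario values are hypothetical.

```python
def ensemble_occupancy(fraction_cells_bound, fraction_alleles_bound_per_cell):
    """Fraction of all alleles in the population that carry the factor.

    Simplifying assumption: ChIP signal scales linearly with this fraction.
    """
    return fraction_cells_bound * fraction_alleles_bound_per_cell


# Scenario A: half of the cells have YY1 on both alleles, the rest on neither.
scenario_a = ensemble_occupancy(0.5, 1.0)

# Scenario B: every cell has YY1 on one of its two alleles.
scenario_b = ensemble_occupancy(1.0, 0.5)

print(scenario_a, scenario_b)
```

Both scenarios yield the same population average, which is why a bulk measurement alone cannot separate them.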
A functional consequence of inappropriate Eµ activity is the markedly different utilization of D H gene segments in DP cells compared to pro-B cells. DFL16.1, the 5′-most D H gene segment, no longer dominates DJ H junctions, and DSP2 rearrangements are skewed toward the 3′-most DSP2.5 gene segment in DP cells. Reduced DFL16.1 rearrangement can be explained in part by reduced Eµ/IGCR1 interactions in DP cells; in pro-B cells, these interactions bring this gene segment into spatial proximity with the recombination center. Skewed utilization of DSP2.5 is harder to explain. In pro-B cells, DSP2 gene segments at either end of the cluster rearrange more frequently than those that lie in the middle (42)(43)(44). That is, DSP2.2 and 2.9 (at the 5′ end) and DSP2.5 (at the 3′ end) recombine more than DSP2.X and 2.3 (in the middle). We have previously proposed that this pattern arises from repeat-induced gene silencing (RIGS), a form of RNA interference-initiated heterochromatin formation, of the DSP2 repeat region (49). Our working model is that extensive use of DSP2.5 in thymocytes reflects reduced DSP2.2 and 2.9 utilization (rather than specific activation of DSP2.5) due to weakened RIGS as well as movement of these gene segments away from the RC due to reduced Eµ/IGCR1 interaction. RAG proteins tracking (50) from the relatively poor IgH RC in DP cells would thus encounter the DSP2.5 RSS first to initiate rearrangements. Consistent with this, most DSP2.5 rearrangements occur by deletion rather than inversion (data not shown).

Lack of V H Recombination in Thymocytes
Our studies reveal several mechanisms that contribute to the absence of V H rearrangements in DP thymocytes. First, conformational compaction of the distal V H region is disrupted in DP cells. This may be attributed to the absence of Pax5 expression in these cells, since proximity between BAC probes RII and RIII is similarly disrupted in Pax5-deficient pro-B cells (51). Second, CTCF-dependent interaction between IGCR1 and the proximal V H genes does not occur in DP cells, even though CTCF binding throughout the IgH locus is comparable to that in pro-B cells. We surmise that absence of these interactions is due to reduced cohesin recruitment to V H region CTCF sites in DP cells. The disconnect between CTCF binding and cohesin binding at the IgH locus has been noted previously, though the molecular mechanism by which it occurs remains obscure (47,52). Our observations identify chromatin structural consequences of this disconnect that provide a plausible explanation for the lack of proximal V H recombination in DP cells. Third, an attenuated Eµ in DP cells may be unable to activate DJ H junctions for secondary V H recombination. One piece of evidence in support of this proposal is that DJ H junctions in pro-B cells become hypo-methylated at CpG residues, whereas they remain hyper-methylated in DP cells and on Eµ-deficient IgH alleles in pro-B cells (53). We hypothesize that the cumulative effect of these processes inhibits V H rearrangements in DP thymocytes. However, we cannot rule out the more mundane explanation that absence of V H recombination is the stochastic consequence of reduced RAG1/2 recruitment by a sub-optimally active Eµ.
Our working hypothesis warrants consideration of situations in which V H recombination can be induced in DP thymocytes. Two prominent circumstances have been documented. First, ectopic expression of Pax5 in DP cells results in proximal V H 7183 gene rearrangements (6,7). We surmise that this may be directed by Pax5 binding to sites within V H 7183 genes as shown by Revilla et al. (54). Whether Pax5 expression also compacts distal V H genes in DP cells remains to be determined. Second, disruption of IGCR1 leads to rearrangement of the 3′-most V H 81X in DP cells (8). Based on the observation that Eµ loops to a CTCF binding site close to V H 81X on IGCR1-mutated alleles to promote highly specific rearrangement of this gene segment in pro-B cells (55), we infer that a similar mechanism may also operate in DP thymocytes even with a sub-optimal Eµ.

Tissue-Specific Enhancer Activation
The state of Eµ in DP cells has implications for mechanisms of tissue-specific enhancer activation. One interesting feature is that differences in Eµ occupancy by many factors, such as Ets-1, E2A, and YY1, occur despite comparable expression of these factors in pro-B and DP cells. These observations substantiate the idea that one or more key tissue-specific factors direct optimal enhancer occupancy. In the case of Eµ, such a function can be ascribed to PU.1, which may recruit Ets-1 to the enhancer or stabilize its binding in pro-B cells; in its absence, Ets-1 is not recruited to the enhancer in DP cells. Extending this line of reasoning suggests that the PU.1/Ets-1 combination is required for optimal YY1 and E47 binding. Alternatively, the ETS protein milieu of DP cells may exclude Ets-1 binding by mass action. That is, DP cells may contain other ETS proteins that have higher affinity for sites in Eµ, or that are present in greater abundance, and that competitively displace Ets-1. Similar considerations may explain the substitution of E47 by HEB at Eµ in DP cells.
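The mass-action alternative can be illustrated with the standard equilibrium expression for two proteins competing for a single site. This is a sketch only: the concentrations and dissociation constants are hypothetical, dimensionless placeholders, not values measured for Ets-1 or any other ETS protein.

```python
def site_occupancy(conc_a, kd_a, conc_b=0.0, kd_b=1.0):
    """Equilibrium fraction of sites bound by protein A when protein B
    competes for the same site (simple one-site competitive binding).

    occupancy_A = (A/Kd_A) / (1 + A/Kd_A + B/Kd_B)
    """
    a = conc_a / kd_a
    b = conc_b / kd_b
    return a / (1.0 + a + b)


# Hypothetical numbers: protein A present at its Kd, no competitor.
alone = site_occupancy(conc_a=1.0, kd_a=1.0)

# Same amount of A, plus a competitor that is more abundant and binds more tightly.
with_competitor = site_occupancy(conc_a=1.0, kd_a=1.0, conc_b=4.0, kd_b=0.5)

print(alone, with_competitor)
```

In this toy calculation the expression level of protein A is unchanged, yet its occupancy falls from one half to one tenth, which mirrors the argument that comparable Ets-1 expression need not imply comparable Eµ occupancy.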
Taken together with earlier observations that Eµ-deleted IgH alleles bear certain hallmarks of locus activation in pro-B cells, our results lead to the following working hypothesis about tissue-specific Eµ function. Locus-specific changes that occur in lymphoid lineage cells permit transcription factor access to Eµ. The combination of factors present in pro-B cells results in optimal Eµ function. In the absence of the correct constellation of factors in the T lineage, Eµ is occupied by factors that are available in that milieu. However, the combination of inappropriate factor binding and empty sites leads to sub-optimal function. Our observations demonstrate that enhancers can be partially (or inappropriately) occupied by transcription factors in the wrong cell type. Such occupancy could underlie their H3K4me1 marking in many different cell types until the right set of factors binds to activate the enhancer, reduce H3K4me1 levels, and mark it with H3K27ac.