Preimmune Control of the Variance of TCR CDR-B3: Insights Gained From Germline Replacement of a TCR Dβ Gene Segment With an Ig DH Gene Segment

We have previously shown that the sequence of the immunoglobulin diversity gene segment (DH) helps dictate the structure and composition of complementarity determining region 3 of the immunoglobulin heavy chain (CDR-H3). In order to test the role of germline D sequence on the diversity of the preimmune TCRβ repertoire of T cells, we generated a mouse with a mutant TCRβ DJC locus wherein the Dβ2-Jβ2 gene segment cluster was deleted and the remaining diversity gene segment, Dβ1 (IMGT:TRBD1), was replaced with DSP2.3 (IMGT:IGHD2-02), a commonly used B cell immunoglobulin DH gene segment. Crystallographic studies have shown that the length and thus structure of TCR CDR-B3 places amino acids at the tip of CDR-B3 in a position to directly interact with peptide bound to an MHC molecule. The length distribution of complementarity determining region 3 of the T cell receptor beta chain (CDR-B3) has been proposed to be restricted largely by MHC-specific selection, disfavoring CDR-B3 that are too long or too short. Here we show that the mechanism of control of CDR-B3 length depends on the Dβ sequence, which in turn dictates exonucleolytic nibbling. By contrast, the extent of N addition and the variance of created CDR3 lengths are regulated by the cell of origin, the thymocyte. We found that the sequence of the D and control of N addition collaborate to bias the distribution of CDR-B3 lengths in the pre-immune TCR repertoire and to focus the diversity provided by N addition and the sequence of the D on that portion of CDR-B3 that is most likely to interact with the peptide that is bound to the presenting MHC.

INTRODUCTION
V(D)J rearrangement has been calculated to yield potential repertoires of more than 10^16 different T cell receptor (TCR) or immunoglobulin (Ig) antigen binding sites (1,2). A fundamental issue is the extent to which diversity is random or directed. The former would imply that diversification of the repertoire is purely a matter of chance. The latter would suggest that diversification takes place under germline control in order to optimize the creation of a functional repertoire and minimize autoreactive clones.
A major component of antigen receptor diversity comes from the inclusion of a diversity (D) gene segment into the rearrangement process. In B cells, D gene segments contribute to Ig heavy (H) chain diversity and in αβ T cells they contribute to TCRβ diversity. In both antigen receptors, amino acids encoded by the D are positioned at the center of complementarity determining region 3 (CDR-H3 for the immunoglobulin H chain and CDR-B3 for the T cell receptor beta chain), which is the direct product of V(D)J rearrangement (3). In both Ig and TCR, D gene segment-encoded amino acids within CDR3 commonly contribute directly to the recognition and binding of cognate antigens. The inclusion of a D gene segment also allows two rounds of junctional diversification during VDJ rearrangement. The somatic mechanisms of CDR3 junctional diversification include terminal exonucleolytic "nibbling," P junction extension and N nucleotide addition.
Although potential CDR3 diversity is astronomic, we have previously shown that there are constraints on the structures of the immunoglobulin CDR-H3 repertoire that can be detected through analysis of as few as ten to twenty sequences, not thousands or millions. For example, in progenitor B cells, constraints on germline DH sequence can heavily influence the structure and composition of immunoglobulin H chain CDR3 (CDR-H3) (4). Thus, constraints on DH germline content represent one mechanism by which the structural diversity of the repertoire can be directed.
In order to further test the role of germline D sequence on the shape of the preimmune CDR3 repertoire, we turned to the TCRβ locus and created a mouse with a mutant TCRβ DJC locus wherein the Dβ2-Jβ2 gene segment cluster had been deleted (Dβ2ko) and the remaining Dβ1 gene segment [ImMunoGeneTics (IMGT) database (5) (IMGT: TRBD1)] replaced with a commonly used DH gene segment, DSP2.3 [IMGT: IGHD2-7(BALB/c)].
We found that the mechanism of control of CDR-B3 length, which is important for optimal MHC:peptide interactions, depends on the Dβ sequence, which in turn dictates exonucleolytic nibbling. Conversely, the extent of N addition and the variance of created CDR3 lengths are regulated by the cell of origin, the thymocyte.

Generation of Targeted ES Cells and the DβYTL Mouse
Plasmids containing the germline C57BL/6 Dβ1 and Jβ1 loci were the kind gift of Dr. Barry Sleckman. The targeting construct was generated using a pLNtk targeting vector containing a SalI-loxP-Neo^r-loxP-XhoI-TK cassette (Supplementary Figure S1). A 4.4 kb KpnI-SacII 3′ homology arm containing the Jβ1 gene segments was subcloned into the XhoI site by blunt-end ligation.
A plasmid (BSSK5 M) containing Dβ1 was used as a substrate for PCR directed replacement of TCRβ Dβ1 by IgH DSP2.3. Overlapping 64 base pair primers containing the sequence of DSP2.3 in place of Dβ1 were generated. These were 5′ tgtataaagctgtaa cattgtg TCTACTATGGTTACGAC cacggtg attcaattctatgggaag 3′ and 5′ cttcccatagaattgaat caccgtg GTCGTAACCATAGTAGA cacaatg ttacagctttataca 3′. The sequence of the D is in caps and the heptamers are separated by spaces from the rest of the sequence. Each of these was individually paired with a forward primer (5′ ataacctctgaggacgcacagccttaggg 3′) upstream of a Bsu36I site and a reverse primer (5′ acgactcactatagggcgaattgggtaccg 3′) downstream of a HindIII site. The overlapping PCR products were then annealed and PCR amplification was performed using only the primers upstream of Bsu36I and downstream of HindIII. The resulting PCR amplified product was cut with Bsu36I and HindIII, and back cloned into the BSSK5 M plasmid, thus replacing Dβ1 with DSP2.3.
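The design of the two overlapping primers can be checked mechanically. The following is a minimal sketch, not part of the original protocol: it confirms that each primer is 64 nucleotides long, that the two are exact reverse complements of one another, and that the upper-case portion is the 17-nucleotide DSP2.3 sequence. The names PRIMER_TOP, PRIMER_BOTTOM, reverse_complement, and d_segment are hypothetical labels introduced only for this illustration; the sequences themselves are copied from the text.

```python
# Primer sequences copied from the text; the pieces are joined so that the
# spaces that set off the heptamers do not count toward the length.
PRIMER_TOP = "".join([
    "tgtataaagctgtaa", "cattgtg", "TCTACTATGGTTACGAC",
    "cacggtg", "attcaattctatgggaag",
])
PRIMER_BOTTOM = "".join([
    "cttcccatagaattgaat", "caccgtg", "GTCGTAACCATAGTAGA",
    "cacaatg", "ttacagctttataca",
])

# Translation table for complementing bases while preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, keeping case."""
    return seq.translate(_COMPLEMENT)[::-1]


def d_segment(primer: str) -> str:
    """Return the upper-case (D gene segment) portion of a primer."""
    return "".join(base for base in primer if base.isupper())


if __name__ == "__main__":
    print("primer lengths:", len(PRIMER_TOP), len(PRIMER_BOTTOM))
    print("reverse complements:", reverse_complement(PRIMER_TOP) == PRIMER_BOTTOM)
    print("D segment:", d_segment(PRIMER_TOP), len(d_segment(PRIMER_TOP)), "nt")
```

Preserving letter case in the complement table keeps the D gene segment distinguishable from the flanking heptamers and locus sequence in both strands.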
A 2.6 kb NotI-ClaI 5′ homology arm was cut from the BSSK5 M plasmid and subcloned by blunt-end ligation into a SalI site upstream of the first loxP site in the pLNtk + 3′ homology arm plasmid. The resulting 14.9 kb targeting vector was linearized with PvuI and electroporated into 129 derived DJβ2−/− (Dβ2ko) mouse ES cells (6,7). Briefly, 1 × 10^7 ES cells were electroporated with 25 µg of linearized vector DNA in a 0.4 cm cuvette at 240 V and 500 µF (Bio-Rad Gene Pulser, Bio-Rad Laboratories, Hercules, CA). Individual ES cell clones were selected with 200 µg/mL G418 (positive selection) and 2 µM Ganciclovir (negative selection) from 24 h after electroporation for a total of 2 weeks. The transfection efficiency was 3%.
ES cell clones with homologous recombination were identified by long PCR using LA Taq DNA polymerase (Takara Bio USA, Mountain View, CA, United States). The PCR program used was (1) denaturation at 94 °C for 1 min, (2) 94 °C for 20 sec, 68 °C for 7 min for 31 cycles, (3) 68 °C for 10 min, and (4) hold at 4 °C. The primers used to identify the correct 5′ end of the recombinant were a 5′ primer from the mouse TCRDβ1 region (5′ gtgagtccatcattgctagggaaaggggttgagtg 3′) and a 3′ primer from the Neo loxP region of the targeting vector (5′ gagcccagaaagcgaaggaacaaagctgctattgg 3′). The primers used to identify the correct 3′ end of the recombinant were a 5′ primer from the Neo loxP region (5′ acgggggtgggggtggggtgggattagataaatgc 3′) and a 3′ primer from the mouse TCRDβ1 region (5′ ccatggaactgcacttggcagcggaagtggttgcg 3′).
The TCRβ DJC locus resulting from this manipulation contained the original germline Dβ1 recombination signal sequences that now flanked immunoglobulin DH DSP2.3 in place of Dβ1. It contained the Jβ1 and Cβ1 locus in its entirety as well as the Cβ2 constant domain but lacked Dβ2 and Jβ2 sequences. We termed this new DH substituted TCR locus DβYTL, which refers to the central amino acids in each of its three reading frames (i.e., tyrosine, threonine, and leucine). For the purposes of this manuscript, we renamed the original DJβ2−/− gene targeted locus Dβ2ko to emphasize the deletion of Dβ2.
Two original Dβ2ko ES cell clones and two independently derived DβYTL ES cell clones were independently microinjected into C57BL/6J blastocysts. The resulting chimeric mice were bred to wild type C57BL/6J mice. The agouti offspring were genotyped by tail DNA PCR analysis to assess germline transmission of the DβYTL or Dβ2ko TCR alleles. Homozygous DβYTL mice were bred to transgenic mice expressing the Cre recombinase from the CMV promoter to delete the loxP-flanked Neo^r gene during early embryogenesis (Cre mice were obtained from Jackson Laboratories). Deletion of the Neo^r gene in the offspring was confirmed by PCR using Cre3 (5′ gaatttactgaccgtacac 3′) and Cre4 (5′ catcgccatcttccagcag 3′) primers. The homozygous progeny harboring mutant DβYTL or Dβ2ko TCR alleles were backcrossed with wild type C57BL/6J mice for 24 generations. All animal experiments were approved by the University of Alabama at Birmingham (UAB) Institutional Animal Care and Use Committee. The UAB Animal Care and Use Program is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International.

Sequence Analysis of CDR-H3 and CDR-B3
Eight of the thirteen DH gene segments in BALB/c mice and six of the ten DH in C57BL/6 mice belong to the DSP family. Due to the extensive sequence similarity among these gene segments, it is often difficult to determine exactly which DSP germline gene segment contributed to an individual CDR-H3. Thus, we grouped all of the sequences of CDR-H3 that had identifiable DSP family sequence into wild type controls.
Gene segments were assigned according to published germline sequences for the TCRβ gene segments as listed in the ImMunoGeneTics database (5). The CDR3 of the TCRβ chain was defined to include those residues located between the conserved cysteine (C104) of FR3 and the conserved phenylalanine (F118) of FR4 (10). These TCRβ sequences obtained from C57BL/6 DβYTL, Dβ2ko, and wild type DN2 thymocytes were compared to the Ig CDR-H3 sequences. In total, 47 of 50, 47 of 51, and 29 of 58 thymocyte CDR-B3 sequences from the three respective mouse strains contained identifiable DH DSP2.3 or TCR Dβ1-Jβ1 sequences (Supplementary Table S1).

Statistical Analysis
Statistical analysis was performed with JMP version 14 (SAS Institute) or GraphPad Prism version 8 (GraphPad Software, San Diego, CA, United States). Population means were compared using one-way analysis of variance (ANOVA). Variance was assessed with the O'Brien Test for Homogeneity of Variance. Categorical comparisons were performed with Fisher's exact test.
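To make the categorical test concrete, the following is a minimal sketch of a two-sided Fisher's exact test written from first principles with the Python standard library. It is an illustration of the mechanics only and is not the JMP or Prism implementation used in the study. The example table reuses the counts of sequences with and without identifiable D sequence reported above for DβYTL (47 of 50) and wild type (29 of 58); this particular comparison is an assumption chosen for illustration and is not one reported by the authors. The function name fisher_exact_two_sided is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the count in the top-left cell follows a
    hypergeometric distribution. The p-value is the total probability of
    every table that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Illustrative table (assumption): identifiable vs. not identifiable D
    # sequence in DbetaYTL (47 of 50) and wild type (29 of 58) thymocytes.
    p_value = fisher_exact_two_sided(47, 3, 29, 29)
    print(f"two-sided p = {p_value:.3g}")
```

Summing over tables that are no more probable than the observed table is one common definition of the two-sided p-value; statistical packages may differ in how they handle ties.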

RESULTS
In order to test the effect of changing the sequence of a D gene segment on VDJβ rearrangement and N addition, we replaced the Dβ1 gene segment with DSP2.3, a commonly used Ig DH gene segment, to create a new TCRβ allele we termed DβYTL, which refers to the central amino acids in each of its three reading frames (i.e., tyrosine, threonine, and leucine). To simplify the analysis, we introduced this gene substitution mutation into an ES cell that had previously undergone a gene targeted deletion of the Dβ2-Jβ2 gene segment cluster (Dβ2ko). We chose a member of the DSP DH family (Table 1 and Supplementary Table S2) for three reasons: together the members of this family make up a majority of functional DH sequences in both BALB/c and C57BL/6 mice; the DSP2.3 sequence is found in the germlines of both strains; and, unlike the case with many other DH segments, none of the three DSP2.3 reading frames rearranged by deletion contains a termination codon (11).
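The absence of termination codons can be checked directly from the DSP2.3 sequence given in the primer description. The sketch below translates the 17-nucleotide D sequence at each of the three possible forward offsets using the standard genetic code. The offsets 0, 1, and 2 are simply positions within the germline D sequence; they are an assumption of this illustration and are not meant to reproduce the RF1/RF2/RF3 nomenclature, which depends on the reading frame of the J. The names DSP2_3, CODON_TABLE, and translate are hypothetical.

```python
# DSP2.3 germline D sequence, copied from the upper-case portion of the primer.
DSP2_3 = "TCTACTATGGTTACGAC"

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(_BASES)
    for j, second in enumerate(_BASES)
    for k, third in enumerate(_BASES)
}


def translate(seq: str, offset: int) -> str:
    """Translate complete codons of seq starting at the given offset."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    for offset in range(3):
        peptide = translate(DSP2_3, offset)
        has_stop = "*" in peptide
        print(f"offset {offset}: {peptide} (stop codon: {has_stop})")
```

Under these assumptions the three translations each contain one of the residues that give the DβYTL allele its name (tyrosine, threonine, or leucine), and none contains a stop codon.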

Cloning of Representative CDR3 Sequences From Pre Selection Thymocytes
To assess the effect of the surrounding locus on natural selection of CDR-B3 and identify how the elements that contribute to CDR3 are being processed during development, we compared CDR-B3 content in DβYTL DN2 thymocytes to a panel of bone marrow proB and thymocyte DN2 controls. Most of our previous studies of the CDR-H3 repertoire were performed in BALB/c mice, hence sequences from BALB/c bone marrow proB cells were used as a major control for how DSP gene segments are normally handled. However, since the DβYTL allele was studied in C57BL/6 mice, we also compared CDR-H3 sequences from wild type C57BL/6 bone marrow proB cells as an additional control. We controlled for the effect of deleting the Dβ2-Jβ2 gene segment cluster by analyzing CDR-B3 sequences from Dβ2ko DN2 thymocytes, where the wild type Dβ1-Jβ1 locus was intact and the Dβ2-Jβ2 locus had been deleted. Together with the CDR-H3 sequences, study of these CDR-B3 sequences allowed comparison of how DSP gene segments were handled in two different strains and in the context of the TCR locus, as well as between the single DSP gene segment and Dβ1, both in the presence and absence of the Dβ2-Jβ2 locus.
FIGURE 1 | Flow diagram of the derivation and the numbers of the DSP- and Dβ1-containing CDR3 sequences analyzed. Sequences that did not meet individual criteria were discarded into the "No" pool. Sequences that met all the individual criteria were pooled into the "Yes" pool.
We had previously cloned and sequenced immunoglobulin HC transcripts from BALB/c and C57BL/6 Fraction B cells that used members of the VH7183 family (4,9). From this library of sequences, we identified 96 DSP containing sequences from BALB/c and 24 from C57BL/6 (Figure 1). To sample how the elements that contribute to CDR-B3 content are being processed, we randomly chose to analyze CDR-B3 sequences from TCR Vβ13.1 containing transcripts. We PCR amplified TCRβ sequences containing Vβ13.1 from DN2 thymocytes from DβYTL, Dβ2ko and wild type C57BL/6 mice. We identified 47, 47 and 29 CDR-B3 sequences (Figure 1), respectively. Among the sequences obtained from the wild type mice, we excluded those in which the Dβ1 gene segment had rearranged to a Jβ2 gene segment.

Loss of Terminal D Nucleotides Is D Sequence Specific
There was a greater loss of nucleotides at the 3′ end of the V and a lesser loss of nucleotides at the 5′ end of the J in DN2 thymocytes than in Fraction B proB cells, irrespective of the sequence of the D (Figure 2). Conversely, the loss of 5′ D sequence was greater and the loss of 3′ D sequence lesser in progenitor cells that contained a DSP sequence than in progenitor cells that contained a Dβ1 sequence, irrespective of the host cell type. Thus, although the sequence of the D did not control V or J nucleotide loss, it did control terminal loss of D nucleotides, irrespective of cell type. This is further evidence that the sequence of the D controls how that sequence is modified during VDJ rearrangement, with each D creating its own D-specific repertoire (8).
P junction gain at the termini of the D's also appeared D gene segment-specific. There was a greater gain of P junctions at the 5′ end of the Dβ sequence than the DSP sequence, and a greater gain of P junction sequence at the 3′ end of DSP sequences than at the 3′ end of the Dβ sequence, irrespective of cell type, although the differences at the 5′ end did not achieve statistical significance. There was a greater gain of P junction nucleotides at the 3′ terminus of V gene segments in Fraction B cells than in thymocytes irrespective of the sequence of the D. The differences in P junctions in J's were not statistically significant. V/D overlap was greater in thymocytes and D/J overlap greater in Fraction B proB cells, although again these differences did not achieve statistical significance and the absolute contribution of nucleotides was small.
Frontiers in Immunology | www.frontiersin.org
Sequences are shown in comparison to a representative gene segment. The strain of origin is indicated.
The presence of two or more microhomologous nucleotides between rearranging gene segments has been shown to influence the site of RAG mediated recombination in progenitor B cells (12,13). Neither the 5′ terminus of the DSP gene segments nor the Dβ1 segment share dinucleotide microhomology with the 3′ terminus of the Ig or TCR V's. The 3′ terminus of all DSP gene segments shares at least a two nucleotide microhomology (AC) with the 5′ sequences of JH1, JH2, and JH4. There is only a two nucleotide microhomology between DSP gene segments and Jβ1 and Jβ2. There is no shared two or more nucleotide microhomology between the other four Jβ's and the 3′ terminus of DSP. The 3′ terminus of Dβ1 ends in GC, a dinucleotide that is not found at the 5′ terminus of any of the Jβ1 gene segments. The lack of a detectable effect of changing the sequence of the D on the extent of terminal loss of V or J sequences and the preservation of patterns of D nucleotide loss irrespective of the cell types supports the view that it is the germline sequence of the V, D, and J gene segments that auto-regulates terminal P junction gain or exonucleolytic loss of sequence at the time of gene segment recombination.

The Extent of N Region Addition Is Cell Type Specific
The extent of N region addition between V and D as well as between D and J was greater in Fraction B proB cells than in DN2 thymocytes, irrespective of the sequence of the D (Figure 2).

Reading Frame Usage Is Random in DN2 Thymocytes Irrespective of D Sequence
Partly due to the microhomology between the 3′ end of the D and the 5′ end of the J, developing B cells demonstrate a bias against rearrangement into reading frame 2 (RF2). In the absence of extensive DβYTL-Jβ microhomology, RF2 usage increased to a third of the rearrangements (Figure 3). Enrichment for RF1 was greater in BALB/c than in C57BL/6 with a compensatory loss of RF3. The mechanism underlying this difference is unclear.
FIGURE 2 | Comparisons of VDJ processing in the pre-selected repertoire. Shown in the left column is the average loss of terminal nucleotides in the V, D, and J gene segments. Shown in the middle column is the average amount of P junction addition. Shown in the right column is: row 1 and 2, the average amount of N nucleotide addition between V and D and between D and J, respectively; row 3 and 4, the extent of V/D and D/J overlap in sequence among sequences lacking N addition. In each case, the data are ordered, from left to right: wild type BALB/c DSP containing sequences, wild type C57BL/6 DSP containing sequences, DβYTL containing sequences, Dβ1 containing sequences from the Dβ2ko mice and from wild type C57BL/6 mice. From left to right, the first three columns represent DH sequences and columns four and five are Dβ sequences. From left to right, the first two columns of each graph represent data from Fraction B proB cells, followed by the three columns of each graph representing data from DN2 cells from thymocytes. Error bars display the standard error of the mean. Statistical analysis was performed only for DβYTL in comparison with the four controls, and comparisons lacking a "p" value all have p > 0.05.

On Average, CDR-H3 Contains More Random N Nucleotides and Exhibits Greater Variance in Sequence Than CDR-B3
The average number of total N nucleotides (5′ plus 3′) was greater in progenitor B cells than in thymocytes (Figure 3). Although there was no statistically significant difference in total N nucleotides between DβYTL CDR-B3 and Dβ1-Jβ1 sequences from Dβ2ko, there was less N addition in wild type Dβ1-Jβ1 sequences. The trend to have less N addition was observed both at the V→DJ and D→J joins. How the presence of an additional DβJβ locus might influence total N addition is unclear.
Differences in N addition between cell types were also found in the variance and distribution of N nucleotides (Figure 4). The most variable range of lengths was observed in CDR-H3 of BALB/c transcripts and the least in CDR-B3 from wild type thymocyte DN2-derived TCRβ transcripts. Thus, the cell type clearly influenced the extent of N addition. There is also the suggestion that the absence of Dβ2-Jβ2 not only affected the absolute number of N nucleotides added, but also influenced the variance of N nucleotide addition. The greater variance of N addition in fraction B proB cells vs. DN2 thymocytes contributed to a marked difference in the variance of the number of amino acids, or the lengths of the sequences, in CDR-H3 versus CDR-B3 (Figure 4). The variance in lengths was greater in progenitor B cells than in thymocytes. In DβYTL, the variance in lengths was similar to that of Dβ2ko and wild type CDR-B3. Due in part to the difference in both the quantity and variability of N addition, the variance in the distribution of CDR-B3 lengths was significantly lower than the variance in the distribution of CDR-H3 lengths.
When viewed in toto, the greater length of the Vβ and Jβ portions of CDR-B3 coupled with the lower amount of N addition results in the average CDR-B3 containing more germline-encoded sequence than CDR-H3 (Figure 5). Intriguingly, however, despite differences in N addition and terminal nucleotide loss or gain, the average length of immunoglobulin CDR-H3s containing a DSP gene segment proved similar to that of TCR CDR-B3 containing Dβ1 and Jβ1 gene segments. This balance was affected when the sequence of the DH was lengthened. DSP2.3 is five nucleotides longer than Dβ1, and on average the length of CDR-B3 sequences containing DSP2.3 was 3.5 nucleotides longer than those containing Dβ1 gene segment sequence. Thus the length of CDR-B3 can be heavily influenced by the length and terminal sequence of the D.

DISCUSSION
Adaptive immunity in jawed vertebrates is designed to produce immunoglobulin (Ig) and T cell receptor (TCR) repertoires of astronomic diversity in developing B cells and thymocytes (1,2). These two antigen receptors have overlapping but distinct roles. While both Ig and TCR act as the antigen receptor for their respective host cells, Igs also have effector functions that require high affinity binding of high specificity. Conversely, while the interaction between Ig and antigen is bimolecular, T cell receptor recognition of antigen requires a trimolecular interaction with both peptide antigen and a member of the major histocompatibility complex (MHC). Secreted Ig clears pathogens through binding to target antigens, which induces a cascade of humoral and cellular reactions. On the other hand, T cells via their TCR can induce killing of target cells infected with the pathogens. Based on these major differences in functions, it is not surprising that the measures used to control their repertoires vary, even though genes encoding both receptors undergo the same process of VDJ recombination and N addition.
We have previously focused our efforts on testing whether the functional efficiency of the process of immunoglobulin diversification can be enhanced by means of natural selection of germline sequence (4). In addition to immunoglobulin, the diversity provided by VDJ recombination and N addition enables a broad array of TCR antigen specificity. The antigen receptors that are generated by both the immunoglobulin and TCR loci have the capacity to be autoreactive, protective, superfluous, or ineffective. Thus a completely random process would likely result in costly inefficiency. We have previously shown that the sequence of DH can heavily influence immunoglobulin diversity, and that the features of this influence can be detected in broad outline by analysis of as few as ten to twenty sequences. Moreover, the changes induced by altering DH sequence negatively impact B cell development, antibody production, protection against infection, responses to allergens, and susceptibility to autoreactive antibody production (4, 14-16).
In this work, we sought to test whether the process of VDJ recombination and N region addition would be altered in T cells should the sequence of Dβ be changed. To test for potential germline constraints on both Ig and TCR repertoires, we replaced Dβ1 with a commonly used DH in a mouse lacking the Dβ2-Jβ2 gene segment cluster. We evaluated the criteria of selection of TCR, which included terminal nucleotide loss, terminal nucleotide P junction gain, V/D and D/J overlap, N addition, reading frame use, the relative contribution of N addition versus germline encoded content in CDR3, and total CDR3 length.
We compared the repertoires expressed in proB cells from the bone marrow to DN2 cells from the thymus. In developing B cells, DH→JH rearrangement precedes VH→DHJH rearrangement. Hardy Fraction B progenitor B cells contain these initial VDJ joins and by definition do not express appreciable levels of µH chain protein. Thus, the VDJ repertoire they express is considered to be preimmune because it is unselected by the immune properties of µH chain protein, including preBCR formation and antigen binding. Similarly, in the thymus Dβ→Jβ rearrangement precedes Vβ→DJβ rearrangement and, during thymocyte development, VDJCβ transcripts are typically first found in DN2 cells. The homology between proB and DN2 cells is not exact since the definition of DN2 cells does not depend on the presence or absence of TCRβ protein. While it is possible that expression of nascent TCRβ protein could affect the fate of the cell, either directly or through its interaction with preTCRα, selection of the TCRβ chain is primarily associated with the developmental checkpoint between DN3a and DN3bc (17). Thus, as with proB cells, DN2 thymocytes are considered to primarily express a preimmune TCRβ repertoire.
We found that the sequence of the D maintains control over the outcome of the rearrangement of the D regardless of whether recombination is occurring in proB cells or thymocytes. While the sequence of the D has minimal effects on V or J gene segment loss or gain of terminal sequence, it self-directs the extent of terminal D nucleotide loss or P junction gain.
The extent and variance in N addition appears affected by cell type and by the loci surrounding the rearranging gene segments (18,19). This study was not designed to fully evaluate the independent contributions of these factors. However, it did disclose a potential role for the Dβ2-Jβ2 locus in regulating the extent and distribution of N nucleotide addition. The activity of the N region addition machinery is regulated during ontogeny by controlling TdT expression and access to exposed terminal DNA sequence (20-22). We would surmise that one or both of these mechanisms are the means by which cell type and the presence or absence of a rearranging locus could influence N region length (20).
Structural studies have shown that amino acids at the tip of the TCR CDR-B3 loop are highly likely to interact directly with peptide antigens presented on the surface of the MHC molecule. The length distribution of CDR-B3 has been proposed to be restricted largely by MHC-specific selection, disfavoring TCRs with CDR3s that are longer than 13 amino acids (23). Our data would suggest that initial restrictions are controlled by natural selection. Potential mechanisms include preservation of D region sequence and regulation of N addition (24).
On average, the 3′ termini of TCR Vβ and the 5′ termini of TCR Jβ are longer than their immunoglobulin VH and JH counterparts (e.g., Table 1 and Supplementary Table S2). Natural selection of the germline sequences of TCR Vβ, Dβ, and Jβ coupled with control of CDR-B3 length variance has the effect of focusing the diversity provided by N addition and the sequence of the D to that portion of CDR-B3 that is most likely to interact with the peptide that is bound to the presenting MHC.
This manuscript has focused on the effect of D sequence on the preimmune repertoire at the nucleotide level. The contribution of the amino acids encoded by the D, which differ in peptide signature between T cells and B cells, to repertoire selection, T cell development, and antigen responses is reported in a companion manuscript. Given our past experience with the immunoglobulin locus, we were not surprised that T cell biology is also heavily affected by violation of normal germline constraints on the amino acids encoded by the sequence of the D. We speculate that the threat of unrestrained inefficiency or potential hazard in the creation of antigen receptor repertoires by an entirely stochastic process of DNA rearrangement appears to have been constrained during evolution by controlling the sequences of the rearranging gene segments to optimize the products of recombination while engendering diversity.

DATA AVAILABILITY STATEMENT
The authors acknowledge that the data presented in this study must be deposited and made publicly available in an acceptable repository, prior to publication. Frontiers cannot accept a manuscript that does not adhere to our open data policies.

ETHICS STATEMENT
The animal study was reviewed and approved by UAB IACUC.

AUTHOR CONTRIBUTIONS
MK took the lead in analyzing and interpreting the data, and writing the manuscript. ML took the lead role in sequencing of the TCR transcripts from thymocytes. RS participated in planning the original studies, creating the mice, and performing the initial analysis of T cell and repertoire development. PK was instrumental in the creation of the mice. PB participated in the planning of the experiments, interpreting the data, and editing the manuscript. HS developed the concept of the project, directed the planning and execution of the studies, reviewed the data, and directed the writing of the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported, in part, by AI090902 and AI117703.

ACKNOWLEDGMENTS
We thank Yingxin Zhuang for assistance with the creation of the targeting constructs and the gene targeting, Ada Elgavish for stimulating discussions, and Barry Sleckman for contributing the original Dβ2ko construct and reviewing and discussing the results of our studies.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2020.02079/full#supplementary-material
FIGURE S1 | Creation of the DβYTL targeting construct.