In Silico and Structural Analyses Demonstrate That Intrinsic Protein Motions Guide T Cell Receptor Complementarity Determining Region Loop Flexibility

T-cell immunity is controlled by T cell receptor (TCR) binding to peptide major histocompatibility complexes (pMHCs). The nature of the interaction between these two proteins has been the subject of many investigations because of its central role in immunity against pathogens and cancer, in autoimmunity, and during organ transplant rejection. Crystal structures comparing unbound and pMHC-bound TCRs have revealed flexibility at the interaction interface, particularly from the perspective of the TCR. However, crystal structures represent only a snapshot of protein conformation that could be influenced through biologically irrelevant crystal lattice contacts and other factors. Here, we solved the structures of three unbound TCRs from multiple crystals. Superposition of identical TCR structures from different crystals revealed conformational differences of up to 5 Å in individual complementarity determining region (CDR) loops that are similar to those that have previously been attributed to antigen engagement. We then used a combination of rigidity analysis and simulations of protein motion to reveal the theoretical potential of TCR CDR loop flexibility in the unbound state. These simulations of protein motion support the notion that crystal structures may only offer an artifactual indication of TCR flexibility, influenced by crystallization conditions and crystal packing, that is inconsistent with the theoretical potential of intrinsic TCR motions.

Keywords: T-cells, T cell receptor, complementarity determining regions loops, protein flexibility, computational simulations, X-ray crystallography

INTRODUCTION
T-cells constitute our primary cellular defense against pathogenic challenge and play a major role in controlling neoplasms. The key molecular interface that enables T-cells to sense these threats is mediated by the clonally expressed T cell receptor (TCR) that classically distinguishes between self and foreign peptides. These peptides are derived from processed intra- and extra-cellular proteins, presented by highly diverse major histocompatibility complexes (pMHCs) on the surface of most nucleated cells (1). TCRs are required to respond to a vast number of potential foreign peptides that they have not encountered before, are unable to adapt to, and that can be presented by multiple MHCs (2,3). More recently, it has been shown that the TCR can also recognize lipid antigens and metabolites presented by the invariant MHC-like cluster of differentiation 1d and MHC class I-related molecules, respectively (4,5). Further evidence has implicated several other MHC-like molecules as antigenic targets for T-cells, exemplifying the extreme versatility of the TCR (6,7).
In order to tackle this vast antigenic milieu, the gene rearrangement process that produces the TCR provides almost limitless possible TCR sequences through the recombination of TCR variable, joining and diversity genes (the germline encoded component), as well as addition and deletion of nucleotides (the somatic component). Additionally, through the combination of two chains (α and β) to form a heterodimeric αβ TCR, it is theoretically possible to generate ~10^18 TCRs in humans. However, only a small fraction (<10^8) of these possibilities are ever expressed in any individual due to space limitations, suggesting that TCRs must be able to recognize multiple pMHCs to cover all potential pathogen encounters (8). Indeed, recent experimental evidence has confirmed this notion, demonstrating that TCRs can recognize millions of pMHCs with physiologically relevant sensitivity (9-13).
One key feature that is likely to facilitate this level of cross-reactivity is flexibility within the TCR complementarity-determining region loops (CDR loops) that form the antigen contact zone. Indeed, early thermodynamic evidence (14-16), combined with NMR spectroscopy (17-19) and fluorescence anisotropy (18, 20-22), has been used to indirectly and directly demonstrate TCR CDR loop motions. Alongside biophysical approaches, crystal structures comparing unbound and pMHC-bound TCRs have demonstrated that the TCR CDR loops can change shape, becoming stabilized upon binding (20,23,24). However, although X-ray crystallography provides unparalleled resolution of proteins too small for cryo-EM, the resulting snapshots represent only one conformational state that could be influenced by crystallographic artifacts. In order to extend the reach of atomic structures, computational modeling of protein motion has been used for almost 40 years to study a range of different systems, including enzymes, viral proteins, G protein-coupled receptors and, more recently, immune receptors (21, 25-33). Application of this approach is beginning to shed light on the malleability of the TCR during pMHC binding, although questions still remain about the intrinsic flexibility of the TCR.
Here, we focused on two important unresolved questions. First, are conformational changes between unbound and pMHC-bound TCRs biologically relevant or crystal artifacts (especially considering that the CDR loops of unbound TCRs are more "exposed" for non-biologically relevant crystal contacts)? We addressed this issue by solving the structures of three different TCRs from multiple crystals. These included 12 datasets of the F11 TCR that recognizes a peptide from the influenza hemagglutinin protein (PKYVKQNTLKLAT) presented by HLA-DR*0101 (DR1-PKY), five data sets from the HA1.7 TCR that also recognizes DR1-PKY, and five datasets from the 003 TCR that recognizes an HIV p17 Gag-derived peptide (SLYNTVATL) presented by HLA-A*0201.
These structures were compared to determine whether different loop conformations existed independent of ligand binding. Second, we investigated the intrinsic flexibility of the TCR CDR loops in the unbound state by implementing geometric simulations of flexible TCR motion using a combination of rigidity analysis and coarse-grained elastic network normal mode analysis.
Overall, our data support the notion that crystal structures may represent an artifactual indication of TCR CDR loop flexibility that offers only a snapshot of the theoretical potential of TCR CDR loop motions. These data provide additional evidence contributing toward our understanding of the molecular mechanisms that mediate T-cell antigen discrimination and cross-reactivity.

MATERIALS AND METHODS
Protein Expression, Refolding, and Purification
The F11, HA1.7, and 003 TCRs were generated as previously described (34), and the α and β chains were cloned into separate pGMT7 expression plasmids under the control of the T7 promoter. Each TCR was refolded and purified using methods that have been described previously (35).

Crystal Structure Determination
All protein crystals were grown at 18°C by vapor diffusion via the sitting drop technique. 200 nL of each TCR (10 mg/ml) in crystallization buffer (10 mM Tris pH 8.1 and 10 mM NaCl) was added to 200 nL of reservoir solution. The TCR crystals used in the structural investigations were grown in a variety of different conditions from PACT premier™ HT-96, JBScreen Classic HTS I, or TOPS (36), as detailed in Table 1. Crystallization screens were conducted using an Art-Robbins Phoenix dispensing robot (Alpha Biotech Ltd., UK) and data were collected at 100 K at the Diamond Light Source (DLS), Oxfordshire, UK using an ADSC Q315 CCD detector. Reflection intensities were estimated using XIA2 (37) and the data were analyzed with SCALA and the CCP4 package (38). Structures were solved with molecular replacement using PHASER (39). Sequences were adjusted with COOT (40) and the models were refined with REFMAC5. Graphical representations were prepared with PYMOL (41). Crystal contacts were determined using PYMOL and defined as intermolecular distances <4.0 Å. The reflection data and final model coordinates were deposited in the PDB database and are detailed in Tables 2-4.
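The crystal-contact criterion stated above (intermolecular distances <4.0 Å) can be illustrated with a minimal sketch. This is not the PyMOL procedure used in the study: the function name `crystal_contacts`, the list-of-tuples input layout, and the toy coordinates and residue labels are all assumptions made for the example.

```python
from math import dist

CONTACT_CUTOFF = 4.0  # Å, the threshold stated in the text


def crystal_contacts(molecule, symmetry_mate, cutoff=CONTACT_CUTOFF):
    """Return (label_a, label_b, distance) for every atom pair closer than cutoff.

    Each input is a list of (residue_label, (x, y, z)) tuples. This layout is
    an assumption for the example, not a PyMOL data structure.
    """
    contacts = []
    for label_a, xyz_a in molecule:
        for label_b, xyz_b in symmetry_mate:
            d = dist(xyz_a, xyz_b)
            if d < cutoff:
                contacts.append((label_a, label_b, round(d, 2)))
    return contacts


# Toy coordinates (invented): one loop atom sits close to a neighbouring molecule.
tcr = [("CDR3a-Gly97", (0.0, 0.0, 0.0)), ("CDR2b-Asn52", (10.0, 0.0, 0.0))]
mate = [("Lys120", (3.5, 0.0, 0.0)), ("Glu15", (20.0, 0.0, 0.0))]
contacts = crystal_contacts(tcr, mate)
print(contacts)
```

In this toy case only the first loop atom falls inside the cut-off, so a loop such as the hypothetical `CDR2b` entry would be classed as free from lattice contacts.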

Geometric Simulations of Flexible Motion
Amplitudes of motion in representative structures of the unbound TCRs solved here were simulated using a combination of rigidity analysis and coarse-grained elastic network normal mode analysis. Elnemo software (42) was used to obtain normal mode eigenvectors from coarse-grained elastic network modeling. FIRST/FRODA software (43,44) was used to carry out rigidity analysis (FIRST) (45), which identified the noncovalent interaction network and labeled dihedral angles as locked or variable, and template-based geometric simulations of flexible motion (FRODA) (44), which project the all-atom structure over large amplitudes of motion while maintaining local bonding and steric geometry.
Normal mode eigenvectors were generated in Elnemo in a one-site-per-residue coarse-graining using the Cα geometry of the input structure, placing springs of equal spring constant between all sites lying within an interaction distance cut-off of 12 Å. A rigidity analysis of the all-atom input structure was carried out in FIRST using the "pebble game" algorithm (43,46), which matches degrees of freedom against bonding constraints in the molecular framework of the protein.
Bonding constraints include covalent, hydrophobic, and polar (hydrogen bond and salt bridge) interactions. As the strength of the polar interactions can be gauged from their geometry, the results of the analysis depend on an "energy cut-off" which selects the set of polar interactions to include in the constraint network (45). A cut-off of −3.0 kcal/mol was used in this study for simulations of flexible motion. We explored flexible motion biased along the 10 lowest-frequency nontrivial normal modes identified by Elnemo (modes 7-16; modes 1-6 are trivial rigid body motions).
Template-based geometric simulation of flexible motion, carried out using FRODA, explores the mobility of the all-atom structure by iterative perturbation and relaxation of atomic positions in parallel and antiparallel to the direction of normal mode eigenvectors. Several thousand iteration steps were carried out to generate large motion amplitudes. The simulation generates an initial phase of "easy" motion, where the bonding geometry is easily maintained, followed by the onset of "jamming" as the motion encounters steric and bonding constraints, which naturally limit its amplitude. The conformational changes of geometric simulations of TCRs projected using this method were observed, and compared with superpositions of crystallographic models in PyMOL.
The Flexible Nature of the TCR CDR Loops. Frontiers in Immunology | www.frontiersin.org | April 2018 | Volume 9 | Article 674

RESULTS
TCR CDR Loop Flexibility Analysis Using X-ray Crystallography
Several investigators, including ourselves, have previously used the structures of unbound and pMHC-bound TCRs to explore conformational changes that occur during pMHC ligand binding (20,23,24,35). These studies have revealed that some TCRs undergo large conformational changes during binding, whereas others use a "lock-and-key"-type ligation strategy. However, questions remain over whether these changes accurately reflect how TCRs engage pMHC, or whether these observations are biased because they rely on a static image of a highly flexible protein interface that could be further affected by crystal lattice contacts, crystal packing, and/or crystallization conditions. To address this question, we solved multiple structures of three unbound TCRs at atomic resolution. We generated 12 structures of the F11 TCR, that recognizes a peptide from the influenza hemagglutinin protein (PKYVKQNTLKLAT) presented by HLA-DR*0101 (DR1-PKY), between 1.58 and 1.89 Å resolution; five structures of the DR1-PKY specific HA1.7 TCR, between 2.31 and 2.98 Å resolution; and five structures of the 003 TCR, that recognizes an HIV Gag-derived peptide (SLYNTVATL) presented by HLA-A*0201 (A2-SLY), between 1.26 and 1.37 Å resolution. All of the structures were solved with crystallographic Rwork/Rfree ratios within accepted limits as shown by the theoretically expected distribution (47). Statistical analysis and structure factors from two representative structures from each TCR are shown in Tables 2-4. The structures were refined by multiple individuals to avoid bias during refinements.
To accurately investigate TCR CDR loop movement during pMHC binding using crystal structures, the unbound TCR structures should, ideally, all be identical. We found that this was not the case for two of the three TCRs under investigation. Indeed, for the F11 TCR, we observed Cα backbone flexibility in the CDR2β loop, shifting by up to 3.6 Å in different structures of the same protein (Figure 1). For the HA1.7 TCR, we observed a larger shift in potential positions for the Cα backbone of the CDR3α loop, differing by up to 5 Å (Figure 2). In both cases, these shifts were comparable to loop movements reported in several other studies in which the unbound and pMHC-bound TCRs were compared (23). These findings demonstrate the potential for artifactual interpretation of the mechanism of TCR ligation using structures alone. However, this is not always the case. Indeed, the CDR loops of the third TCR included in our study, the 003 TCR, were superimposable in all the structures solved (Figure 3). B-factor analysis did not correlate with these loop movements (Figures 1-3).
Having observed positional differences in CDR loop positions in multiple, but not all, TCR datasets, we hypothesized that such differences might be explained by crystallographic artifacts. We therefore considered whether the resolution of the structures, or the crystal lattice contacts, had an impact on the nature of the loop movements observed in our structures. The 003 TCR structures, [...] Figure 2B). Moreover, we also observed loop movement for the F11 TCR structures, which were solved at an average resolution of 1.55 Å, comparable to the higher resolution datasets for the 003 TCR. Last, we investigated whether stabilizing crystal lattice contacts might explain why some of the CDR loops appeared identical while others could shift between structures. For the F11 TCR, only the CDR2α and CDR2β loops were free from any lattice contacts (data not shown). Thus, the ability of the CDR2β loop to shift between structures could have been partly due to extra freedom imparted by individual crystal packing. Similarly, the HA1.7 TCR CDR3α loop was free from crystal lattice contacts and shifted between structures. However, several other loops in both the F11 and HA1.7 TCR structures were also free from crystal lattice contacts and did not shift between structures. Finally, none of the 003 TCR CDR loops, which were identical in each structure, made any potentially stabilizing crystal lattice contacts.
Overall, neither the resolution of the structures nor the availability of stabilizing crystal lattice contacts was a good predictor of TCR CDR loop shifts between structures. Thus, we conclude that CDR loop movements observed between unbound and pMHC-bound TCRs require further investigation in order to confirm whether they are artifactual, or a real part of the TCR-binding mechanism.

Investigating TCR CDR Loop Flexibility Using Rigidity Analysis and Simulations of Protein Motion
Direct measurement of protein flexibility at the single loop level is highly challenging because of the size (nm) and timescale (ms) of the movements. Indirect measurements using individually labeled amino acids are possible using NMR and other techniques, but these experiments are technically challenging, time consuming, and not universally available. As an alternative approach, computational modeling has developed rapidly over the past few years and has emerged as a useful technique to investigate protein motions (31). However, for T-cell recognition studies, most of these modeling approaches have focused on flexibility at the peptide-MHC, or TCR-pMHC interface rather than exploring the motional potential of the TCR in the unbound state. In the one study that did use this method to investigate TCR flexibility, the authors found large differences in flexibility between two different TCRs that helped to explain the antigen-recognition mechanisms.
Here, we investigated the intrinsic rigidity and flexibility of the unbound TCR structure datasets using pebble-game rigidity analysis, elastic network modeling, and geometric simulations of flexible motion, using a combination of Elnemo and FIRST/FRODA software (44). FIRST software identifies the network of noncovalent constraints in the system, including both polar (hydrogen bond) and hydrophobic-tether interactions. Polar interactions are assigned a strength in the range 0 to −10 kcal/mol based on their geometry. The set of polar interactions to include in the rigidity analysis is controlled by an energy cutoff parameter Ecut. We have demonstrated that biologically significant flexibility can be explored at cutoffs in the range −2 to −4 kcal/mol (48,49). In order to determine the appropriate Ecut for TCRs, we performed rigidity analysis using the 003 TCR at cutoffs of −2 kcal/mol and −3 kcal/mol (data not shown).
Analysis at a cutoff of −3 kcal/mol, but not −2 kcal/mol, demonstrated that the structure was largely flexible, with very few large rigid clusters. However, the N- and C-terminal domains were still rich in noncovalent interactions maintaining the secondary, tertiary, and quaternary structure. We, therefore, explored flexible motion in all three TCR structures using the constraint network found at Ecut = −3 kcal/mol. We used Elnemo software to identify the 10 lowest-frequency nontrivial normal modes in each TCR structure. We then used the FRODA module of FIRST to project the structure along each mode, while retaining the local covalent and noncovalent bonding geometry of the input structure. This represents intrinsic flexible motion of the structure which can easily be explored in solution. Recent work has shown that the character of motion identified using this method is consistent with conventional molecular dynamics (MD) simulations, while requiring minimal computational expense (a few CPU-hours) (49,50). The lowest-frequency modes include substantial components of relative domain motions, in which the interdomain section of each chain (around residues 110-120) provides a flexible joint, as was also recently observed in the large dimeric enzyme DcpS (49). As a result, a structural overlay of conformations generated by FRODA (Figure 4) includes both domain-motion variations and local changes in loop geometry.
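To illustrate the normal-mode step, the following is a minimal sketch of a generic anisotropic network model, assuming one site per Cα, springs of equal constant, and the 12 Å interaction cut-off stated in the Methods. It is not Elnemo's implementation: the function name `anm_modes`, the unit spring constant, and the random toy coordinates are assumptions made for the example.

```python
import numpy as np


def anm_modes(ca_coords, cutoff=12.0, spring=1.0):
    """Eigen-decompose the Hessian of a uniform-spring C-alpha network.

    Returns eigenvalues in ascending order and eigenvectors as columns.
    """
    coords = np.asarray(ca_coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            delta = coords[j] - coords[i]
            r2 = delta @ delta
            if r2 > cutoff ** 2:
                continue  # no spring beyond the interaction cut-off
            block = -spring * np.outer(delta, delta) / r2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return np.linalg.eigh(hessian)


# Toy input (invented): 30 pseudo-residues scattered in a 15 Å box.
rng = np.random.default_rng(0)
toy_ca = rng.uniform(0.0, 15.0, size=(30, 3))
eigenvalues, eigenvectors = anm_modes(toy_ca)

# Modes 1-6 are rigid-body translations and rotations (near-zero eigenvalues);
# modes 7-16 are the ten lowest-frequency nontrivial modes.
nontrivial_modes = eigenvectors[:, 6:16]
print(np.round(eigenvalues[:8], 6))
```

Each column of `nontrivial_modes` is a displacement direction for every site. In a FRODA-style workflow such a direction would bias the geometric simulation, but that constrained relaxation step is not reproduced here.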
To isolate the loop motions specifically, we carried out an alignment on the N-terminal domain for each structure and each chain (residues 6-110). The alignment was carried out in PyMOL on the non-loop residues of each domain (Table 5). This allowed visualization of CDR loop structural variations relative to a stable base of comparison for a set of 20 generated structural variants (Figure 5). For each normal mode, we selected the variants representing the natural limit of flexible motion parallel and antiparallel to the mode direction. This natural limit is the point at which covalent and noncovalent constraints (including steric contacts) start to limit the amplitude of the motion, such that further progress along the mode direction is "jammed". This analysis demonstrated that the scope for flexible variation in the loop geometries was substantial in all of the TCRs. Measurements of the maximal amplitude of the apex of each loop were conducted to provide estimations of the potential flexibility of each loop (Table 5). Although this analysis is an approximation and should be treated as such, the average maximal loop motion was slightly less (6.4 Å) for the 003 TCR compared to F11 (8.7 Å) and HA1.7 (9.0 Å), consistent with the structural data demonstrating greater rigidity in the 003 TCR. Further dissection of the data revealed that the longer somatically rearranged CDR3 loops had the most potential for loop motion (CDR3α: 10.2 Å, CDR3β: 12.2 Å) compared to the shorter germline encoded loops (CDR1α: 7.2 Å, CDR2α: 7.7 Å, Fwα: 8.7 Å, CDR1β: 7.2 Å, CDR2β: 5.5 Å, Fwβ: 5.8 Å). These findings are consistent with the fact that the CDR3 loops generally make more interactions with the variable peptide component of the antigen, whereas the other loops are generally more focused toward the MHC surface.
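The idea of aligning on non-loop residues and then measuring loop displacement can be sketched with a Kabsch least-squares superposition. This is an illustrative stand-in for the PyMOL alignment, not the procedure used in the study: the function names, the index sets, and the synthetic coordinates (with a deliberately imposed 5 Å shift) are assumptions made for the example.

```python
import numpy as np


def kabsch_fit(mobile, reference):
    """Return rotation R and translation t minimising |R @ m_i + t - r_i|."""
    mob_centroid = mobile.mean(axis=0)
    ref_centroid = reference.mean(axis=0)
    covariance = (mobile - mob_centroid).T @ (reference - ref_centroid)
    u, _, vt = np.linalg.svd(covariance)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = ref_centroid - rotation @ mob_centroid
    return rotation, translation


def loop_displacement(mobile, reference, framework_idx, loop_idx):
    """Fit on framework sites only, then measure per-site loop displacement (Å)."""
    rotation, translation = kabsch_fit(mobile[framework_idx], reference[framework_idx])
    fitted = mobile @ rotation.T + translation
    return np.linalg.norm(fitted[loop_idx] - reference[loop_idx], axis=1)


# Synthetic C-alpha trace (invented): sites 0-9 act as framework, site 15 as a loop apex.
rng = np.random.default_rng(1)
reference = rng.uniform(0.0, 30.0, size=(20, 3))
framework_idx = np.arange(0, 10)
loop_idx = np.array([15])

# Build a variant: move the apex by 5 Å, then rotate and translate the whole model.
variant = reference.copy()
variant[15] += np.array([5.0, 0.0, 0.0])
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
variant = variant @ rot_z.T + np.array([3.0, -2.0, 7.0])

shift = loop_displacement(variant, reference, framework_idx, loop_idx)
print(shift)
```

Fitting on the framework alone keeps the loop displacement from contaminating the superposition, which is the reason for aligning on non-loop residues before comparing loop apex positions.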
More generally, this analysis also indicated that all the structures, in solution, can explore large variations in loop geometry, providing an ensemble of flexible variations for conformational selection or induced-fit binding mechanisms. Overall, this analysis demonstrated, as expected, a large degree of potential motion, with the more disordered portions of the protein flexing more than those with secondary structure. This degree of motion was not apparent from the structural analysis (Figures 4 and 5).

DISCUSSION
The TCR governs T-cell specificity by discriminating between self and foreign peptides presented by MHC molecules. The finger-like CDR loops of the TCR are thought to meld around specific pMHCs, sampling the peptide cargo, and enabling T-cell triggering by ligands with sufficient affinity/dwell-time. This binding mode is also likely to facilitate TCR cross-reactivity by enabling the TCR to explore multiple conformations during ligand interrogation. However, the mechanism(s) that underpin the ability of T-cells to respond to millions of different pMHCs are still emerging. Several experiments have used atomic resolution structures to compare TCRs in the unbound state and in complex with pMHC (23). These studies revealed conformational changes upon binding, supporting the idea that the CDR loops can flex to accommodate different peptide cargos using an induced fit mechanism. More recent data, using NMR, FRET, and MD, support this view, but also demonstrate that the TCR-pMHC interface can be far more flexible than is apparent from the static image captured during X-ray crystallography (17-22).
First, we examined a very broad but unanswered question: Is the conformation of a protein identical in every dataset collected and refined during X-ray crystallography experiments? This question is relevant to all structures solved by X-ray crystallography, but is particularly applicable when investigating ligand engagement by a receptor, as in the case of TCR-pMHC interaction. Several factors could affect the refined structure generated during this approach, including: changes in the crystal packing between crystals, alterations in lattice contacts that could artificially stabilize protein regions, interpretation of the data during refinement, and differences in the protein preparation. In order to try to test some of these factors, we solved the structure of the same three TCRs from multiple crystals, grown in a range of conditions, from several different protein preparations, refined by different scientists.
Reassuringly, the overall conformation of each structure was virtually identical in all datasets tested. However, we observed several CDR loop re-organizations between structures for the HA1.7 and F11 TCRs, while all structures of the 003 TCR were identical at the level of the Cα backbone. These observations were not linked to the resolution of the structures, the crystal growing conditions, the availability of stabilizing crystal lattice contacts, or who performed the refinement. Thus, we conclude that these loop movements represent real differences in the conformation of the CDR loops of the HA1.7 and F11 TCRs due to the intrinsic flexibility of these regions. We, therefore, recommend caution when using comparisons of unbound and pMHC-bound TCRs to describe binding mechanisms, as these comparisons assume that the unbound structure of the TCR is a representative low energy state. Our observations suggest that the snapshot provided by X-ray crystallography may not be representative of CDR loop positions because of the highly dynamic nature of these disordered regions of the TCR.
To further investigate the flexible nature of the CDR loops, we measured large-amplitude protein motions in the three TCRs under investigation using FIRST/FRODA software. As expected, this analysis demonstrated a large degree of potential motion, with the more disordered portions of the protein flexing more than those with secondary structure. This analysis was far more revealing than the structural analysis alone, which only demonstrated structural mobility in some of the loops of just two of the TCRs (F11 and HA1.7). Rather, we observed large potential motions in all of the TCR CDR loops of all three TCRs, with more rigidity detected in the non-CDR loop portions of the TCR. This flexibility, which has been assumed, but not definitively proven for the TCR CDR loops, is consistent with the notion that the mechanism by which the TCR samples pMHC epitopes relies on flexibility at the interaction interface. Finer dissection of the motions of each individual CDR loop (including the Fw loop) demonstrated different maximal amplitudes at the apex of each loop. On average, the 003 TCR CDR loops moved slightly less compared to the F11 and HA1.7 TCR CDR loops, in line with the structural analysis. Furthermore, the somatically rearranged CDR3 loops in all of the TCRs studied were more mobile compared to the germline encoded CDR1, 2, and Fw loops. These findings are consistent with the observation that (1) the CDR3 loops are generally longer than the other loops, and (2) the CDR3 loops generally form the majority of the interactions with the variable peptide cargo, compared to the more MHC-centric CDR1, 2, and Fw loops. Thus, this extra level of flexibility in the CDR3 loops may represent an important mechanism enabling TCR cross-reactivity with multiple different peptides (2, 3, 10, 11).
These data have important implications for the general analysis of crystal structures, and more specifically for TCR antigen recognition. With the recent breakthroughs in MD and other modeling approaches, technologies that will rapidly develop in the near future, it seems rational to start pairing crystal structures with this type of analysis. Although theoretical, modeling approaches can provide another dimension of information to the complex and flexible amino acid network that governs the nature of protein-ligand dynamics. Findings from these analyses may reveal new areas of interest that can be tested experimentally. Our data, demonstrating the theoretical range of motion for unbound TCRs, are consistent with other modeling approaches that have focused on TCR-pMHC complexes (28-33), or TCRs alone (21). These studies have shown that the TCR-pMHC interface is highly flexible with some fixed interactions, but others that come and go as the TCR "rocks" on top of the pMHC (33). This binding mode is also congruous with recent experimental data demonstrating that TCRs are highly degenerate and can recognize many thousands, if not millions, of different peptide sequences (9-13). This enables T-cells to cross-react, thereby allowing a limited pool of TCR sequences within an individual to afford protection against the vast milieu of potential pathogenic peptide sequences that could be encountered (2,3). Finally, our data reinforce the notion that some TCR CDR loops form a highly flexible and dynamic binding site, and that crystal structures alone may not be adequate to fully represent the complex mechanisms employed during pMHC ligation by the TCR. The fact that CDR loops can "move" between different free TCR structures of the same molecule suggests that caution is advisable when inferring binding mode based on the single snapshots provided by X-ray crystallography of ligated and unligated TCR.