A Workflow for Protein Structure Determination From Thin Crystal Lamella by Micro-Electron Diffraction

MicroED has recently emerged as a powerful method for the analysis of biological structures at atomic resolution. This technique has been largely limited to protein nanocrystals which grow either as needles or plates measuring only a few hundred nanometers in thickness. Furthermore, traditional microED data processing uses established X-ray crystallography software that is not optimized for handling compound effects that are unique to electron diffraction data. Here, we present an integrated workflow for microED, from sample preparation by cryo-focused ion beam milling, through data collection with a standard Ceta-D detector, to data processing using the DIALS software suite, thus enabling routine atomic structure determination of protein crystals of any size and shape using microED. We demonstrate the effectiveness of the workflow by determining the structure of proteinase K to 2.0 Å resolution and show the advantage of using protein crystal lamellae over nanocrystals.


INTRODUCTION
Electron diffraction has become a powerful method for structural biologists and complements now well-established X-ray crystallography methods, such as rotation and serial data collection using synchrotrons and serial femtosecond crystallography with X-ray free-electron lasers (XFEL). High quality data can be recorded from a very small number of sub-micron sized crystals using electrons. This arises from the strong interaction between electrons and matter, permitting protein crystals with a maximum thickness of only a few hundred nanometers to be used for structure determination (Nannenga et al., 2014;La Cruz De et al., 2017), and in some cases just a single crystal (Clabbers et al., 2017). This technique therefore holds an advantage in particular cases where obtaining large crystals may be a bottleneck to structure determination. In addition, electron diffraction produces electrostatic potential maps which can offer unique information not available from their X-ray crystallography equivalent. These maps can reveal details about the charge states of atoms within the protein structure which in turn provide important insight into protein function (Yonekura et al., 2015;Yonekura and Maki-Yonekura, 2016;Liu and Gonen, 2018).
The strong interaction of electrons with matter means that the electron beam is not transmissible through samples greater than several hundred nanometers in thickness when using standard cryoEM operation voltages of 200-300 kV. In addition, multiple scattering and inelastic scattering events are also more probable as sample thickness increases . These non-kinematic scattering events affect the retrieval of structure factors from measured diffraction intensities and thus impact the quality of the final structure. To overcome this problem, we and others previously explored cryo-focused ion beam (cryoFIB) milling as a sample preparation method for electron diffraction experiments (Duyvesteyn et al., 2018;Li et al., 2018;Martynowycz et al., 2019;Zhou et al., 2019). We demonstrated that the integrity of the crystal was maintained after cryoFIB milling, and more importantly, diffraction images from a crystal lamella displayed minimal dynamical scattering (Duyvesteyn et al., 2018).
However, due to the limitation of a standard Ceta detector used in that study -which has a thin scintillator and exhibits poor sensitivity when collecting data using the low doses of electrons required for effective microED data collection from sensitive biological samples -the data collection strategy was not optimal and resulted in a model with poor refinement statistics, making it difficult to fully assess the potential of the lamella in the context of high-quality protein structure determination. Highly specialized CMOS detectors, such as those produced by TVIPS, have previously been used for microED experiments (Shi et al., 2013). Hybrid pixel array detectors, such as Timepix and Medipix detectors, potentially provide a better detection method for protein crystals owing to their high dynamic range, low noise and radiation hard properties . However, these detectors are not yet commonly offered within standard high-resolution cryoEM imaging facilities. Furthermore, current microED data processing borrows software packages that were originally designed for X-ray diffraction experiments and are not optimized for handling systematic errors that are unique to electron diffraction data. To overcome these challenges and enable microED as a standard cryoEM method like single particle analysis (SPA) and cryo-electron tomography (cryoET), we established an integrated workflow for routine microED of protein crystals for implementation in most standard cryoEM imaging facilities.
The workflow includes (1) cryoFIB milling to produce wellordered crystalline lamellae (200-300 nm) from larger protein crystals; (2) a standard Ceta-D detector from Thermo Fisher for microED data collection; (3) microED data analysis using DIALS (Winter et al., 2018) which has now been optimized for electron diffraction . Using this workflow, the structure of proteinase K was determined to 2.0 Å resolution from a single protein crystal lamella. A comparison between data collected from nanocrystals and crystal lamellae suggests that higher quality structures can be obtained from proteinase K lamellae. Future automated microED data collection strategies, similar to those for SPA and cryoET, will be built upon this integrated system.

Crystallization and Grid Preparation
Lyophilized proteinase K from Tritirachium album (Sigma-Aldrich, P2308) was solubilized in 25 mM Tris pH 7.5 to a final concentration of 50 mg/mL. Microcrystals and nanocrystals of proteinase K were then grown using the batch method and their size was optimized by seeding as described by Beale et al. (2019). Crystal seeds were required to reproducibly grow crystals of a uniform size. To produce seeds, a method based on the protocols described by Luft and DeTitta (1999) was used. Crystals several hundred micrometers in size were grown using the vapor diffusion method. Crystals were grown in CrystalQuick TM X plates (Greiner) at 19 • C over approximately 18 h. The drops contained 200 nL 50 mg/mL proteinase K in 25 mM Tris pH 7.5 and 200 nL reservoir solution [20% (w/v) PEG 3350 and 0.2 M ammonium chloride]. Crystals from these conditions were harvested by aspirating with a pipette into a 1.5 mL Eppendorf tube containing 25 µL of reservoir solution and several small silicon beads. Crystals were crushed by three consecutive rounds of 30 s of vortexing followed by a 30 s incubation on ice. The seed stock, crystallization solution [20% (w/v) PEG 3350, 0.1 M ammonium chloride] and protein solution were combined in a 1:2:3 ratio. The crystallization experiment was then incubated overnight with gentle agitation using an orbital shaker at 18 • C. Different dilutions of the seed stock were used to generate crystals either hundreds of nanometers in size (nanocrystals) or approximately 10 × 10 × 12 microns in size (for lamella), where more concentrated seed stocks produced the smaller nanocrystals.
To prepare grids of nanocrystals, 3 µL of the batch crystallization solutions were then applied to the carbon side of glow discharged R1.2/1.3 Quantifoil TM grids (Quantifoil Micro Tools, Jena, Germany). Excess liquid was removed by blotting for 12 s with a Vitrobot (Thermo Fisher Scientific) under 100% humidity at 20 • C. The grids were held in the humid chamber for 30 s before plunge-freezing in liquid ethane. For the microcrystals, 2 µL of a 1 in 6 dilution of the batch crystallization was applied to the carbon side of glow discharged R2/2 Quantifoil TM grids (Quantifoil Micro Tools, Jena, Germany). Excess liquid was removed by blotting for 4-6 s with a Vitrobot (Thermo Fisher Scientific) under 100% humidity at 20 • C and immediately plunged into liquid ethane. All grids were stored under liquid nitrogen until required for cryoFIB milling or electron diffraction experiments.

CryoFIB Milling of Proteinase K Microcrystals
Milling of proteinase K crystals was carried out as previously described (Schaffer et al., 2015(Schaffer et al., , 2017Duyvesteyn et al., 2018) using a Scios TM DualBeam TM cryoFIB microscope (Thermo Fisher Scientific) equipped with a Quorum PP3010T cryotransfer system and a Quorum cryostage and shuttle. Briefly, plungefrozen grids containing crystals of proteinase K with approximate dimensions 12 × 10 × 10 µm were loaded into autogrid (Thermo Fisher Scientific) compatible with FIB-SEM applications. The grids were then coated with an organoplatinum compound using the in situ gas injection system (GIS) of the cryoFIB instrument. Lamellae were generated through a series of milling steps, where the current of the Ga beam was decreased in a stepwise fashion from 300 to 30 pA. These steps corresponded to subsequent lamella thicknesses of approximately 5 µm down to 0.2 µm, respectively. In addition to the initial organoplatinum coating, the sample was sputter coated with metallic platinum post-milling using the Quorum PP3010T system (10 mA, 3 s, argon atmosphere) to reduce beam-induced charging effects.

Microscope Set-up and Data Collection
All data were collected using a Talos Arctica TM TEM (Thermo Fisher Scientific) with an accelerating voltage of 200 kV. Low dose parallel illumination conditions were achieved through a combination of the largest gun lens and a small spot size 11, with a condenser (C2) apertures of 20 or 50 µm and operating in nanoprobe mode. The dose rate was kept constant across experiments using either the 20 or 50 µm C2 apertures so that only the resultant beamsize at the sample position changed.
A dose rate of ∼ 0.04 e − /Å 2 /s and an exposure time of 0.85 s for each diffraction pattern for all datasets were used in data collection. Each diffraction pattern therefore has a dose of 0.85 s × 0.04 e − /Å 2 /s = 0.034 e − /Å 2 . The total dose varies, from 2.7 to 4.7 e − /Å 2 , as we used multiple sweeps of data across either multiple crystals or multiple exposures of a single lamella. For nanocrystals, the maximum dose for nanocrystals was 79 × 0.034 e − /Å 2 = 2.69 e − /Å 2 . For 20 µm aperture lamellae, the maximum dose was 139 × 0.034 e − /Å 2 = 4.7 e − /Å 2 . For 50 µm aperture lamella, the maximum dose was 122 × 0.034 e − /Å 2 = 4.15 e − /Å 2 . The dose rate used was the lowest achievable in our microscope which maintained minimal radiation damage whilst still producing sufficient signal in the outer shells of the collected diffraction patterns for subsequent structure determination. Data collection essentially followed the method described by Shi et al. (2016) with some modifications. We used a small C2 aperture instead of a selected area aperture. This was done in an effort to minimize the contribution of non-crystalline areas of the illuminated sample to the diffraction pattern and to limit the size of the illuminated area at the sample. A custom script (See Supplementary Material) was used to control the rotation speed and direction of the stage such that it was continuously rotating during data collection. Shortly after this work, the microED package EPU-D (Thermo Fisher Scientific) has become commercially available for automated data collection. Data were collected on a Ceta-D detector (Thermo Fisher Scientific) operating in rollingshutter mode where each frame encompassed 0.51 • of data. Data were recorded using Technai Imaging and Analysis (TIA) software as an acquisition series in the TIA/EMISPEC series file format (.ser).
During the initial frames of the continuous-rotation data collection, the diffraction pattern often appears blurry, probably due to the beam-induced specimen charging. This can be largely mitigated by sputter coating the sample with metallic Pt after cryoFIB milling (Schaffer et al., 2017) and/or refocusing the diffraction spots during the initial frames of continuous data collection. These initial frames were excluded from further analysis.

Electron Diffraction Data Processing
Following the scheme presented in , the dxtbx library (Parkhurst et al., 2014) was extended to directly read the. ser format without further conversion. As expected, the calibrated camera length was affected by the diffraction lens adjustment used to focus the pattern. Given that the cell dimensions of the proteinase K crystals were known, we used these constraints to refine the camera length during data processing, which was carried out in DIALS following the method described in .
The default background modeling algorithm in DIALS performs outlier handling based on the assumption that the observed counts in each pixel are approximately Poisson distributed. This model is not appropriate for the Ceta-D, which is not a counting detector. Indeed, some datasets suffered from a negative bias (Supplementary Figure 1). We therefore selected the "simple" model for the background from the program options, which assumes a normal distribution of background counts. Integration was performed using DIALS 1.10, with an improved profile-fitting algorithm, compared to previous versions of the software, that is robust in the presence of negativevalued pixels.
The data were cut to a resolution where CC 1/2 in the outer resolution shell was equal to or higher than 0.5. It is to be expected, as shown in Figures 3A-C, that the generally weaker, high resolution reflections will suffer the greatest proportional disruption due to dynamical scattering. This manifests as an overall inflation of high resolution intensities compared to their values expected from kinematic scattering. This effect reduces the utility of merging statistics such as CC 1/2 and I/σ(I) in choosing a suitable resolution cut-off. Therefore a relatively conservative criterion of CC 1/2 ≥ 0.5 was chosen. The integrated intensities were then scaled and merged using AIMLESS (Evans and Murshudov, 2013).

Structure Solution and Refinement
For all datasets, the phases were determined by molecular replacement using Phaser (McCoy et al., 2007). A model of proteinase K (PDB ID: 2ID8) solved by X-ray crystallography was used as a search model with all alternative side chain conformations and ligands removed (Wang J. et al., 2006). The resultant structures were refined in Phenix using the program phenix.refine with electron scattering factors (Adams et al., 2010). Rounds of refinement were interspersed with manual building using the program Coot (Emsley et al., 2010). Figures of the resultant structures and electrostatic potential maps were generated using Pymol Schrödinger, LLC. (2015).

Sample Preparation Using cryoFIB Milling
Proteinase K was chosen as the test specimen. To reliably generate nanocrystals, seeds (Luft and DeTitta, 1999) were added to batch crystallization conditions. The larger micrometer-sized crystals used for cryoFIB milling were grown in the same batch conditions by using a larger dilution of the crystal seeds. Both the nanocrystals and the microcrystals (Figures 1A-B) were applied to Quantifoil TM cryoEM grids before vitrification in liquid ethane. The larger proteinase K microcrystals were subject to cryoFIB milling using a Scios TM DualBeam TM instrument following the protocols described in Duyvesteyn et al. (2018; Figures 1C-F). The resultant lamellae measured approximately 10 µm × 10 µm × 0.2 µm thick. A notable development in the protocol presented here is that the lamellae were sputter coated with metallic Pt after milling to improve sample stability and reduce charging during data collection.

MicroED Data Collection Using Ceta-D
Electron diffraction data were collected for both the nanocrystals and the lamellae using the continuous-rotation method (Arndt and Wonacott, 1977). The practical methods described by Shi et al., 2016 were used to set up the microscope for data collection with the following modifications: (1) A selected area aperture was not used, instead, a small condenser (C2) aperture of 20 µm was inserted to control the illuminated area which reached the sample. (2) The stage was controlled semi-automatically by a custom program (Supplementary Material) which allowed for continuous-rotation of the sample during exposure to the electron beam without the need for a shutter.
The data were recorded on a Ceta-D detector operating in rolling-shutter mode ( Table 1). Using a dose of approximately 0.04 e − /Å 2 /s, the Ceta-D was able to successfully measure microED data from nanocrystals to 2.7 Å resolution (Figure 1G), which is consistent with a previous report that also utilizes a Ceta-D detector . It is important to note, however, that these nanocrystals were estimated to be at least 500 nm in thickness, and Ceta-D was not sensitive enough to measure diffraction intensities from smaller, thinner crystals using this low-dose setup. Using a higher dose of electrons could compensate the sensitivity of Ceta-D to measure smaller, thinner crystals. However, the rate of radiation damage will also increase.
The resultant challenge being that fewer diffraction patterns per crystal could be collected then requiring a larger number of crystals to be measured in order to achieve data completeness.
In contrast to the nanocrystals, the lamellae measured approximately 200 nm in thickness. Diffraction intensities to 2.4 Å resolution were recorded with the Ceta-D under the same experimental conditions as those used for the nanocrystals (Figure 1H). In addition, we also collected diffraction data from crystal lamellae using a 50 µm C2 condenser aperture, which yielded measurable reflections extending to approximately 2.0 Å ( Table 1).

MicroED Data Processing Using DIALS
The diffraction data were processed using DIALS, which contains implementations specific for electron diffraction . The diffraction images (.ser files) were read directly, without conversion, using the dxtbx library (Parkhurst et al., 2014), while pertinent metadata describing the diffraction Numbers in brackets represent the statistics for data in highest resolution shell.
Frontiers in Molecular Biosciences | www.frontiersin.org experiment were provided separately during data importing. The diffraction images of some data sets displayed negative averaged background levels at higher resolutions (Supplementary Figure 1). This negative bias led to failures during profile fitted integration. Our investigation into these failures led to changes to the method used to estimate weights for each pixel used in profile fitting. The new method ensures positive pixel variance estimates, even in the presence of negative-valued pixels. This more robust algorithm also reduces bias in intensity estimates for weak reflections in general and became the default from DIALS version 1.10. Alongside the enhanced profile fitting algorithm, in this specific case we also applied an additive correction to the images to account for the negative bias. As the magnitude of the bias was less than 1 for all datasets, we added 1 to all pixel values on-the-fly using a dxtbx plugin (Parkhurst et al., 2014), without changes to either the image files or DIALS code. Using a 20 µm C2 aperture, four datasets recorded from two lamellae were merged to produce a single dataset that was taken forward to phasing and refinement. Likewise, eight partial datasets from individual nanocrystals were merged to form a complete dataset for further analysis (Table 1). Notably, when using a 50 µm C2 aperture, we were able to produce a complete dataset using just two rotation data collections from a single lamella. The results from the data processing, structure solution and refinement for the merged datasets are presented in Table 1. Structure solution was successful with the data collected from both the nanocrystals and the crystal lamellae, with example electrostatic potential maps shown in Figure 2. These three structures, derived from nanocrystals, crystal lamellae with 20 µm aperture or crystal lamella with 50 µm aperture, overlap well (Figure 2A). The resolutions are slightly higher with crystal lamellae (Table 1), with densities subtlely better resolved (Figures 2B-D).
Interestingly, a positive electrostatic potential was present close to the catalytic triad for both the nanocrystal and lamella structure of proteinase K when using 20 µm aperture (Supplementary Figure 2). Existing structures show this to be the binding site for inhibiting Hg atoms (Müller and Saenger, 1993;Saxena et al., 1996;Gourinath et al., 2001), but given that no heavy atoms were present in the crystallization conditions, these sites were left unmodelled.

Comparison of Nanocrystals and Crystal Lamellae
Dynamical scattering in electron diffraction is a concern when crystals are thicker than a few hundred nanometers (Subramanian et al., 2015); still, protein structures have been determined successfully from crystals measuring up to 400 nm thick (Hattne et al., 2015). We analyzed the correlation between F o and F c for the nanocrystals, measuring ∼500 nm and the crystal lamella, measuring ∼200 nm, using the plots described in  Figure 3). The nanocrystal data exhibits greater scatter around and deviation from the F o = F c line, particularly at low intensity ( Figure 3A). This is suggestive of dynamical scattering but perhaps also the influence of compound elastic and inelastic scattering events. Interestingly, the average B factor value is considerably lower than would be expected for an equivalent X-ray crystallography structure at this resolution. Indeed, the B factors of several atoms fell to zero during structure refinement. The effect of dynamical scattering on individual reflection intensities is complex (Voigt-Martin et al., 1997). The bulk trend is to increase weak intensities while reducing strong intensities (Clabbers et al., 2019). This may emulate the effect of map sharpening and partially explain these low B-factors.
The lamella data showed stronger correlation between the F o and F c values (Figures 3B,C) suggesting that the data from these samples were less influenced by dynamical scattering. This indicates one of the advantages of using protein crystal lamellae, namely that the sample thickness can be specifically tailored to the requirements of electron diffraction experiments. Higher resolution structures were determined from the lamella data FIGURE 2 | Overall model and example electrostatic potential maps for the proteinase K structures determined using electron diffraction data. (A) The models for the structures determined from nanocrystals (cyan) and the structures determined from lamella with the 20 µm (magenta) and 50 µm (green) condenser apertures are shown aligned by C-alpha residues in cartoon representation with the Ca2+ ion depicted as a sphere. (B-D) A section of the electrostatic potential maps around the disulfide bridge linking residues Cys34 and Cys123 is shown with the 2mFo -Fc maps contoured at 1.0 σ above the mean for the nancrystals (B), 20 µm C2 aperture lamella dataset (C) and the 50 µm C2 aperture lamella dataset (D). (2.4 Å and 2.0 Å) when compared to the nanocrystal data (2.7 Å). This was likely due to an improvement in signal to noise from the lamella, because the lamella filled the entire illumination area and because neither carbon support nor vitrified mother liquor was present in the exposed area. Furthermore, multiple wedges of data could be acquired from a single crystal lamella as shown here, resulting in better data merging statistics.
Using cryoFIB milled protein crystal lamella opens up the electron diffraction method to all sizes and shapes of crystals, allowing researchers to capitalize on the unique properties of microED, namely electrostatic potential maps and the ability to reveal hydrogen positions, given the data are of sufficient resolution. In addition, microED requires very little sample for protein structure determination compared with macromolecular crystallography (MX) and XFEL methods. Here, we established a streamlined workflow to enable microED for routine protein structure determination. We demonstrated cryoFIB milled lamella give higher resolution data of better quality when compared to nanocrystals under the same imaging conditions. We showed that the low cost Ceta-D camera, generally accessible in a standard cryoEM setup, worked well for microED for well diffracting samples, as reported recently by others Martynowycz et al., 2019;Wang et al., 2019;Levine et al., 2020). We show a better structure (2.0 Å) than that reported for lamella of proteinase K using Ceta-D detector (2.75 Å) . Detailed experimental conditions are compared in Supplementary Table 1. Although the workflow is demonstrated using a Ceta-D detector, it would also work for other detectors that are integrated into Thermo Fisher microscopes, such as Gatan OneView, K2 summit, and Thermo Fisher Falcon 2/3. In fact, compared with Ceta-D, Falcon 2 yielded stronger signal to noise ratios without significantly reducing the standard deviation of dark images. Future developments to our protocol using a direct electron detectors or a hybrid pixel array detector could potentially improve both data quality and resultant structures, and enable more challenging problems to be tackled.

DATA AVAILABILITY STATEMENT
Atomic models are deposited at the Protein Data Bank under accession codes PDB 6ZET for crystal structure of proteinase K nanocrystals by electron diffraction with a 20 micrometre C2 condenser aperture, PDB 6ZEV for crystal structure of proteinase K lamellae by electron diffraction with a 20 micrometre C2 condenser aperture, and PDB 6ZEU for crystal structure of proteinase K lamella by electron diffraction with a 50 micrometre C2 condenser aperture.

AUTHOR CONTRIBUTIONS
GE and PZ conceived and designed the experiments. EB and CH designed the experiments. EB, CH, and JR prepared the crystal samples. CH, EB, and JG performed the cryoFIB milling of proteinase K crystals. EB, CH, JR, FH, and BB collected the electron diffraction data. DW and JP optimized the DIALS software for microED data processing. EB and DW analyzed the data. EB, DW, GE, and PZ wrote the manuscript with support from other authors. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by the UK Wellcome Trust Investigator Award 206422/Z/17/Z (PZ) and the UK Biotechnology and Biological Sciences Research Council grant BB/S003339/1 (PZ). We acknowledge Diamond for access and support of the CryoEM facilities at the UK national electron bio-imaging center (eBIC, proposal NT23169). JMP and DIALS development is supported by Diamond Light Source, CCP4 and Wellcome Award 202933/Z/16/Z (GE).