Biophysical Screening Pipeline for Cryo-EM Grid Preparation of Membrane Proteins

Successful sample preparation is the foundation of any structural biology technique. Membrane proteins are of particular interest because they are important targets for drug design, but they are also notoriously difficult to work with. For electron cryo-microscopy (cryo-EM), the biophysical characterization of sample purity, homogeneity, and integrity, as well as biochemical activity, is a prerequisite for the preparation of good-quality cryo-EM grids, as these factors affect the result of the computational reconstruction. Here, we present a quality control pipeline for use prior to single-particle cryo-EM grid preparation. It combines biophysical techniques to assess the integrity, purity, and oligomeric states of membrane proteins and their complexes, enabling reproducible conditions for sample vitrification. Differential scanning fluorimetry following the intrinsic protein fluorescence (nDSF) is used for optimizing buffer and detergent conditions, whereas mass photometry and dynamic light scattering are used to assess aggregation behavior, reconstitution efficiency, and oligomerization. The data collected on nDSF and mass photometry instruments can be analyzed with web servers publicly available at spc.embl-hamburg.de. Case studies in which conditions were optimized prior to cryo-EM sample preparation of membrane proteins provide an example quality assessment and corroborate the usefulness of our pipeline.


IJ1 protein purification
Fig. S1: Size exclusion chromatography of IJ1 solubilised with DDM (a, Superdex 200 10/300 column) and the corresponding SDS-PAGE gel (b). The DDM-solubilised IJ1 sample was subsequently subjected to size exclusion chromatography for buffer exchange into LMNG (c, Superdex 200 10/300 column) and amphipol A8-35 (d, Superose 6 3.2/300 column). The elution volumes of the samples correlate with the sizes of the micelle-protein and amphipol-protein complexes. The size of an empty micelle in the current buffer is around 59.5 kDa for DDM and 93 kDa for LMNG. The void volumes V0 of the columns are shown as grey dashed lines in a, c, and d.
The green curve is a typical example of a poor-quality sample displaying macromolecular aggregation, in this case due to the low concentration of phospholipids added to solubilize the forming complex. The red curve is a sample prepared above the critical micellar concentration (CMC) of PIP2, which is around 200 µM. c) Intensity Mass distribution. A peak corresponding to PIP2 micelles is observed below 2.5 nm (red line). d) Mass photometry of the yeast AENTH complex in the presence of 200 µM PIP2. The peak distribution indicates the presence of different oligomeric states in the sample. The structures of the most prominent ones, a 12-mer and a 16-mer, are shown above the corresponding peaks. e) Cryo-EM micrograph of the yeast AENTH complex in the presence of 200 µM PIP2. Several assemblies were detected in this sample; their structures are described in Lizarrondo et al., 2021. Scale bar is 50 nm.

PhotoMol User Documentation
April 2022

Overview
PhotoMol was developed to estimate the masses of different species in a sample after a Mass Photometry experiment. More details about this technology can be found at https://www.refeyn.com/.

Fitting model
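The function that we use to fit is a sum of truncated Gaussians (Equation 1):
$$y = b + \sum_{i=1}^{n} g_i(x) \tag{1}$$
where y represents the histogram counts, x the masses, n is the number of truncated Gaussians g(x), b is a user-defined baseline, and g(x) is defined as follows:
$$g(x) = \begin{cases} \mathrm{amp} \cdot \exp\left(-\dfrac{(x - \mathrm{center})^2}{2\sigma^2}\right) & \text{if } x \geq x_{\mathrm{threshold}} \\ 0 & \text{otherwise} \end{cases}$$
where $x_{\mathrm{threshold}}$ is the minimum value of x (mass) that can be observed, center is the center of the Gaussian, $\sigma$ is the standard deviation, and amp is the amplitude.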
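A minimal Python sketch of the fitting function described above may help make the definition concrete. It assumes an amplitude-form Gaussian that is set to zero below the minimum observable mass, with the baseline added everywhere. The function names and the example numbers are hypothetical and are not taken from the PhotoMol source code.

```python
import math


def truncated_gaussian(x, amp, center, sigma, x_threshold):
    """One truncated Gaussian: zero below x_threshold, a Gaussian at or above it."""
    if x < x_threshold:
        return 0.0
    return amp * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def multi_gaussian(x, components, x_threshold, baseline=0.0):
    """Baseline plus the sum of truncated Gaussians.

    components is a list of (amp, center, sigma) tuples, one per species.
    Assumption: the baseline is also applied below x_threshold.
    """
    return baseline + sum(
        truncated_gaussian(x, amp, center, sigma, x_threshold)
        for amp, center, sigma in components
    )


# Illustrative values only: two species centred at 66 and 132 kDa.
components = [(100.0, 66.0, 8.0), (40.0, 132.0, 12.0)]
at_first_peak = multi_gaussian(66.0, components, x_threshold=30.0)
below_threshold = multi_gaussian(10.0, components, x_threshold=30.0, baseline=2.0)
print(at_first_peak, below_threshold)
```

Below the threshold every Gaussian contributes zero, so only the baseline remains in this sketch.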

Input file
PhotoMol accepts a '.h5' (Hierarchical Data Format) file as input. This file should contain one 1D dataset called 'masses_kDa' and can be exported using the software Refeyn DiscoverMP. In DiscoverMP versions earlier than 2.5, the file eventsFitted.h5 is written to the results folder when the results are saved. In version 2.5, the events can be exported individually under a custom file name.
Additionally, a csv (comma-separated values) file with headers can be loaded. The columns 'masses_kDa' and 'contrasts' are required for the mass distribution analysis and the calibration, respectively.
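As an illustration of the csv layout described above, the following sketch reads the 'masses_kDa' and 'contrasts' columns with the Python standard library. The helper name load_columns and the numeric values are hypothetical. The sketch only demonstrates the expected column headers, not how PhotoMol itself parses files.

```python
import csv
import io


def load_columns(csv_file, columns=("masses_kDa", "contrasts")):
    """Read the requested columns (when present) from a csv file with headers."""
    reader = csv.DictReader(csv_file)
    present = [c for c in columns if c in (reader.fieldnames or [])]
    data = {c: [] for c in present}
    for row in reader:
        for c in present:
            if row[c] not in ("", None):  # skip empty cells
                data[c].append(float(row[c]))
    return data


# Made-up example content; a real file would come from the instrument software.
example = io.StringIO("masses_kDa,contrasts\n66.2,-0.0031\n148.0,-0.0069\n")
data = load_columns(example)
print(data["masses_kDa"], data["contrasts"])
```

A file containing only 'masses_kDa' is handled as well, in which case the returned dictionary has a single key.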

Bin width
Integer value (kDa) used to group data and build the histogram.

Minimum observed mass
Integer value (kDa) that defines the left limit for the truncated multi-Gaussian.

Starting values
List of numbers separated by spaces. Each value is used to define the initial guess of the mean of a (truncated) Gaussian.

Upper limit for the standard deviation
Integer value (kDa) used to calculate fitting boundaries for the standard deviations of the Gaussians.

Tolerance to the initial guesses
Integer value (kDa) used to calculate fitting boundaries for the means of the Gaussians.
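The three parameters above (starting values, upper limit for the standard deviation, and tolerance to the initial guesses) could translate into box constraints as in the following sketch. The documentation does not spell out the exact formula, so the mapping used here is an assumption: each mean is kept within the initial guess plus or minus the tolerance, and each standard deviation between 0 and the upper limit. The function names are hypothetical.

```python
def parse_starting_values(text):
    """Turn a space-separated string such as '66 146 480' into a list of floats."""
    return [float(value) for value in text.split()]


def fitting_bounds(starting_values, tolerance, max_sigma):
    """Build (lower, upper) bounds for each Gaussian.

    Assumed mapping (not confirmed by the documentation):
      center in [guess - tolerance, guess + tolerance]
      sigma  in [0, max_sigma]
    """
    bounds = []
    for guess in starting_values:
        bounds.append(
            {
                "center": (guess - tolerance, guess + tolerance),
                "sigma": (0.0, float(max_sigma)),
            }
        )
    return bounds


guesses = parse_starting_values("66 146 480")
bounds = fitting_bounds(guesses, tolerance=20, max_sigma=30)
for guess, b in zip(guesses, bounds):
    print(guess, b["center"], b["sigma"])
```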

Window range
Set the limits (kDa) for constructing the histogram.
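How the bin width and the window range jointly define the histogram can be sketched as follows. The binning convention is an assumption made for illustration: bins are half-open, events outside the window are discarded, and a partial last bin is dropped. The function name is hypothetical.

```python
def build_histogram(masses, bin_width, window):
    """Count events per bin inside the window and return (bin centers, counts)."""
    lower, upper = window
    n_bins = int((upper - lower) // bin_width)
    counts = [0] * n_bins
    for mass in masses:
        # Keep only events inside the binned part of the window.
        if lower <= mass < lower + n_bins * bin_width:
            counts[int((mass - lower) // bin_width)] += 1
    centers = [lower + (i + 0.5) * bin_width for i in range(n_bins)]
    return centers, counts


# Made-up masses in kDa; 150 lies outside the 0-100 kDa window and is ignored.
masses = [10, 12, 19, 25, 31, 99, 150]
centers, counts = build_histogram(masses, bin_width=10, window=(0, 100))
print(centers)
print(counts)
```

The bin centers and counts returned here play the roles of x and y in the fitting model.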

Baseline
Integer value used in Equation 1 (parameter b). Useful when there is constant noise.

Curve fitting
The histogram defined by the bin width and window range is fitted using the Levenberg-Marquardt (damped least-squares) algorithm.
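To illustrate the damped least-squares idea, the sketch below fits a single Gaussian to noise-free synthetic histogram data with a bare-bones Levenberg-Marquardt loop written in plain Python. It is a teaching example under simplifying assumptions (one untruncated Gaussian, no baseline, no parameter bounds) and is not PhotoMol's implementation. All names and numbers are hypothetical.

```python
import math


def residuals_and_jacobian(params, xs, ys):
    """Residuals y - f(x) and the Jacobian of f w.r.t. (amp, center, sigma)."""
    amp, center, sigma = params
    res, jac = [], []
    for x, y in zip(xs, ys):
        e = math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
        res.append(y - amp * e)
        jac.append([e,
                    amp * e * (x - center) / sigma ** 2,
                    amp * e * (x - center) ** 2 / sigma ** 3])
    return res, jac


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (m[i][3] - sum(m[i][k] * x[k] for k in range(i + 1, 3))) / m[i][i]
    return x


def levenberg_marquardt(xs, ys, params, n_iter=100, lam=1e-3):
    """Minimise the sum of squared residuals, adapting the damping factor lam."""
    params = list(params)
    res, jac = residuals_and_jacobian(params, xs, ys)
    cost = sum(r * r for r in res)
    for _ in range(n_iter):
        jtj = [[sum(row[i] * row[j] for row in jac) for j in range(3)] for i in range(3)]
        jtr = [sum(row[i] * r for row, r in zip(jac, res)) for i in range(3)]
        # Damping: scale the diagonal of J^T J by (1 + lam).
        damped = [[jtj[i][j] + (lam * jtj[i][i] if i == j else 0.0) for j in range(3)]
                  for i in range(3)]
        step = solve3(damped, jtr)
        if max(abs(s) for s in step) < 1e-12:
            break
        trial = [p + s for p, s in zip(params, step)]
        trial_res, trial_jac = residuals_and_jacobian(trial, xs, ys)
        trial_cost = sum(r * r for r in trial_res)
        if trial_cost < cost:  # accept the step and trust the model more
            params, res, jac, cost = trial, trial_res, trial_jac, trial_cost
            lam /= 10.0
        else:  # reject the step and increase the damping
            lam *= 10.0
    return params


# Synthetic histogram: bin centers every 5 kDa, counts from a known Gaussian.
xs = [2.5 + 5.0 * i for i in range(40)]
ys = [50.0 * math.exp(-((x - 100.0) ** 2) / (2.0 * 15.0 ** 2)) for x in xs]
fitted = levenberg_marquardt(xs, ys, params=(40.0, 90.0, 20.0))
print(fitted)
```

A step is accepted only when it lowers the squared error, so the damping factor shifts the update between a Gauss-Newton step (small damping) and a short, gradient-like step (large damping).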

Calibration
Ratiometric contrasts can be converted to masses by loading a '.h5' file with known masses (at least three different species), or by using parameters from a previous calibration. In both cases, the calibration experiment should have been performed with the same buffer, at the same temperature, and using the same instrument parameters (i.e., the field of view).
The fitting function and parameters are the same as previously described for analyzing the histogram of the observed masses, with the exception that the units are now 'Ratiometric contrasts' (instead of kDa).
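Once the peak contrasts of the calibrants have been obtained from the fit, they can be related to the known masses. The sketch below assumes a linear relation between ratiometric contrast and mass, which is the usual convention in mass photometry but is not stated explicitly in this documentation. The function names and the calibrant values are made up for illustration.

```python
def fit_calibration(known_masses, contrasts):
    """Least-squares line contrast = slope * mass + intercept."""
    n = len(known_masses)
    if n < 3:
        raise ValueError("at least three calibrant species are required")
    mean_m = sum(known_masses) / n
    mean_c = sum(contrasts) / n
    s_mm = sum((m - mean_m) ** 2 for m in known_masses)
    s_mc = sum((m - mean_m) * (c - mean_c) for m, c in zip(known_masses, contrasts))
    slope = s_mc / s_mm
    intercept = mean_c - slope * mean_m
    return slope, intercept


def contrast_to_mass(contrast, slope, intercept):
    """Invert the calibration line to convert a contrast into a mass (kDa)."""
    return (contrast - intercept) / slope


# Made-up calibrants: masses in kDa and their fitted peak contrasts.
known_masses = [66.0, 146.0, 480.0]
peak_contrasts = [-0.0034, -0.0074, -0.0241]
slope, intercept = fit_calibration(known_masses, peak_contrasts)
print(slope, intercept)
print(contrast_to_mass(-0.0074, slope, intercept))
```

Storing the slope and intercept is one way the "parameters from a previous calibration" could be reused, provided the buffer, temperature, and instrument settings are unchanged.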

Packages
PhotoMol is possible thanks to: R language: R Core Team (2020