Model-Free or Not?

Zumpfe, Kai; Smith, Albert A.

doi:10.3389/fmolb.2021.727553

REVIEW article

Front. Mol. Biosci., 25 October 2021
Sec. Structural Biology
Volume 8 - 2021 | https://doi.org/10.3389/fmolb.2021.727553

Kai Zumpfe

Albert A. Smith*

Institute for Medical Physics and Biophysics, Medical Faculty, Leipzig University, Leipzig, Germany

Relaxation in nuclear magnetic resonance is a powerful method for obtaining spatially resolved, timescale-specific dynamics information about molecular systems. However, dynamics in biomolecular systems are generally too complex to be fully characterized based on NMR data alone. This is a familiar problem, addressed by the Lipari-Szabo model-free analysis, a method that captures the full information content of NMR relaxation data in case all internal motion of a molecule in solution is sufficiently fast. We investigate model-free analysis, as well as several other approaches, and find that model-free, spectral density mapping, LeMaster’s approach, and our detector analysis form a class of analysis methods, for which behavior of the fitted parameters has a well-defined relationship to the distribution of correlation times of motion, independent of the specific form of that distribution. In a sense, they are all “model-free.” Of these methods, only detectors are generally applicable to solid-state NMR relaxation data. We further discuss how detectors may be used for comparison of experimental data to data extracted from molecular dynamics simulation, and how simulation may be used to extract details of the dynamics that are not accessible via NMR, where detector analysis can be used to connect those details to experiments. We expect that combined methodology can eventually provide enough insight into complex dynamics to provide highly accurate models of motion, thus lending deeper insight into the nature of biomolecular dynamics.

Introduction

Study of biomolecular function requires understanding the dynamics of the biological system. Nuclear magnetic resonance (NMR), despite many recent technological advances in other techniques, remains a premier method for detailed dynamics characterization. In NMR, one may measure a variety of site-specific relaxation experiments, which provide timescale-sensitive information about the motion. By varying the type of experiment (T₁, T_1ρ, NOE, etc.) or experimental conditions (external magnetic field, applied field strength, magic-angle spinning (MAS) frequency, etc.), the timescale sensitivity of the measurement is modified. Then, one may resolve the dynamics both in space, via site resolution, and in timescale, via multiple experiments (Palmer, 2004; Schanda and Ernst, 2016).

However, is it possible to fully characterize the motions leading to the observed relaxation behavior? Many relaxation experiments in NMR are sensitive to the reorientational motion of anisotropic NMR interaction tensors (NMR relaxation can also be sensitive to change in scalar terms, e.g., isotropic chemical shift). For a given spin, relaxation is usually dominated by only one to two interactions. For example, relaxation of ¹⁵N in a protein backbone is determined almost entirely by the reorientation of the one-bond ¹H–¹⁵N dipole coupling and the ¹⁵N chemical shift anisotropy (CSA). But, multiple sources of motion lead to reorientation of the bond. For example, if we suppose the H–N bond to be in a protein, within a helix, then we would have local distortion of the peptide plane (one-bond libration), motion of the peptide plane within the helix, motion of the helix within the protein, and motion of the protein either in solution, in a crystal, a fibril, a membrane, etc.

This degree of complexity is illustrated in Figure 1. For a given bond in a molecule, and a given motion acting on that bond, a distribution of orientations is sampled as illustrated in Figure 1A. The orientational distribution determines the contribution of that motion to the total order parameter, $S^{2}$ . However, not only are there many orientations sampled by a bond due to a motion, but those orientations are sampled at some rate, such that the motion has an associated correlation time or distribution of correlation times (denoted $(1 - S^{2}) θ (z)$ ). We illustrate this in Figure 1B; note that not only is the width of such a distribution variable, but also the functional form of the distribution itself. This results in a correlation function that decays from 1 to $S^{2}$ , where integrating over the distribution of correlation times yields the total amplitude of the decay. Already, a single bond with just one motion acting on it yields potentially a high degree of complexity; however, we must still consider that multiple motions act on each bond, where the total correlation function is the product of the correlation functions of each individual motion (if those motions are independent from one another, Figure 1C). Finally, motion varies throughout a molecule, as a function of position, resulting in a complex, multi-dimensional description as illustrated in Figure 1D.

FIGURE 1. Complexity of reorientational dynamics. For each bond in a molecule, multiple types of motion result in orientational sampling, where the distribution of angles for each motion results in a generalized order parameter, S². Therefore, in (A) we plot a possible distribution of Euler angles for a single type of motion (population is plotted as a function of angles β and γ, where α is not required for a symmetric interaction tensor). A single motion is furthermore described by a correlation time, and may be distributed over a range of correlation times. In (B) we plot a possible distribution of correlation times $(1 - S^{2}) θ (z)$ , that is, amplitude of motion as a function of the log-correlation time, $z = \log_{10} (τ_{c} / s)$ . Each distribution is characterized by an amplitude, center, and width. Note that the integral of the distribution is $(1 - S^{2})$ , S² being determined by the distribution of angles in (A). While (A,B) illustrate aspects of a single motion, multiple motions influence a given bond, where the total correlation function is the product of individual correlation functions. In (C), we plot four distributions of motion (color). Above each motion, we plot the distribution resulting from the product of that motion and all motions below it (black), eventually resulting in the total distribution seen at the top. Finally, we note that the total distribution varies as a function of position in the molecule, resulting in the 3D plot of the distribution as a function of correlation time and position in the molecule observed in (D). While this is just an illustration, one could imagine that motion in (D) results from three α-helices in a protein, each having a slightly different behavior, and varying dynamics as one approaches the end of each helix.

While NMR is powerful, obtaining a complete description of the complex dynamics stretches beyond the limit of what is possible based on experimental data alone, especially for large molecules such as proteins. This problem is a familiar one, addressed almost 40 years ago by Lipari and Szabo (Lipari and Szabo, 1982a), who developed a method known as the model-free approach. While we will discuss the details of this approach below, the name tells us a critical advantage of such an approach: model-free analysis allows the extraction of dynamics information from NMR relaxation data without having knowledge of the specific model of motion. Furthermore, the resulting parameters have a well-defined relationship to the distribution of orientations sampled and the distribution of correlation times.

Lipari and Szabo described the internal motion of a molecule with just two parameters: a generalized order parameter related to the amplitude of motion, $S^{2}$ , and a mean effective correlation time, $〈 τ_{e} 〉$ (a third parameter, $τ_{M}$ , gives the correlation time of the molecule tumbling in solution). While only two parameters suggests a simple analysis, it is important to note that Lipari and Szabo did not intend to only describe simple motions having just a single correlation time and amplitude: theoretical tests of their model were performed on a wobbling-on-a-cone model (Kinosita et al., 1977) that results in a weighted sum of correlation times, and experimental work was performed on methyl groups in a protein, for which the total motion is determined by the product of methyl rotation and by reorientation of the methyl group’s C–C bond. Rather, the two parameters contain the aggregated information describing all motions that is available from the set of relaxation experiments alone.

The advantage of model-free analysis is that it does not require knowing the model of motion. For example, for relatively low fields (∼90 MHz, as used by Lipari and Szabo), all distributions of orientations and correlation times shown in Figure 2 should yield identical relaxation rate constants for the set of experiments. If we do not know which model is the correct model, the best we can do is to parameterize the results in a way that does not depend on the model of motion, as can be done with the model-free parameters $S^{2}$ and $〈 τ_{e} 〉$ .

FIGURE 2. Five distributions of orientations and correlation times that yield the same model-free parameters ( $(1 - S^{2})$ = 0.3, $〈 τ_{e} 〉$ = 0.1 ns). In (A–E), we plot a distribution of orientations (sphere, right); on the axes, we plot the distribution of correlation times resulting from exchange among that set of orientations. Models of motion are wobbling-on-a-cone ( $θ_{cone} = 19^{°}$ ), wobbling-in-a-cone ( $θ_{cone} = 28^{°}$ ), symmetric two-site hop ( $θ_{hop} = 39^{°}$ ), asymmetric two-site hop ( $θ_{hop} = 70^{°}$ ), and 6-site asymmetric exchange. Insets in (A,B) show correlation times with small amplitudes. $S_{resid .}^{2}$ refers to the order parameter from residual couplings (see Supplementary Section S3), which deviates from the generalized order parameter for asymmetric motion.

When analyzing data, a model provides a framework for understanding the data, and by using a model we are always adding some information to the experimental data. In some cases, we add further information depending on how we interpret a model. A model is advantageous if that information is correct, and disadvantageous if that information is wrong. Suppose, for example, we know that the correct model in Figure 2 is a symmetric two-site hop, shown in Figure 2C; then we may extract the hop angle and exchange rates from $S^{2}$ and $〈 τ_{e} 〉$ , resulting in $θ_{hop}$ = 39° and $k_{ex}^{1 \to 2}$ = $k_{ex}^{2 \to 1}$ = 5 × 10⁹/s. However, if the true model is an asymmetric two-site hop, shown in Figure 2D, the true angle and exchange rates may be significantly different (for Figure 2D, these are $θ_{hop}$ = 70° with $k_{ex}^{1 \to 2}$ = 1.3 × 10⁹/s and $k_{ex}^{2 \to 1}$ = 8.7 × 10⁹/s). Then, a model-free approach is the more reliable method when the correct model cannot be independently determined.

In this review, we will first discuss the original model-free approach, and then examine methods descended from it, including discussion of our own detector analysis, a relatively new approach that also provides a model-free analysis in the spirit of the original Lipari-Szabo approach, but can extract the full information content of relaxation data sets in instances where the model-free approach cannot. We discuss analysis of microsecond motions using R_1ρ relaxation, and finally consider how other methods, in particular molecular dynamics (MD) simulation, may be used to supply the information that NMR lacks, thus improving the interpretation of NMR parameters.

Model-Free

While dynamics analysis methods have existed for application to solid-state NMR for some years now (Chevelkov et al., 2009b; Schanda et al., 2010; Zinkevich et al., 2013; Lamley et al., 2015a; Smith et al., 2016; Lakomek et al., 2017; Kurauskas et al., 2017), most of the approaches applied have evolved from methodology first developed for solution-state NMR. Probably the most important advance in solution-state analysis was the development of the model-free approach (Lipari and Szabo, 1982a; Lipari and Szabo, 1982b), and related two-step techniques (Wennerström et al., 1979; Halle and Wennerström, 1981; Brown, 1982). Then, we begin by reviewing some of the existing methodology, to understand advantages and disadvantages to various approaches.

Model-Free Theory

Typical solution-state NMR data sets consist of relaxation rate constants for R₁ (1/T₁), R₂ (1/T₂), and nuclear Overhauser effect (NOE, $σ_{IS}$ ), acquired at one or more magnetic fields. The rate constants describe the signal decay ( $I (t) = I_{0} e^{- R_{ζ} t}$ ) or recovery ( $I (t) = I_{eq} + (I_{0} - I_{eq}) e^{- R_{ζ} t}$ ). In solid-state NMR, this behavior can be multi-exponential, in which case we use the rate constant that describes the powder-averaged value (Krushelnitsky et al., 2018). Relaxation is often driven by reorientation of a few anisotropic interactions; for example, for backbone ¹⁵N relaxation, a one-bond H–N dipole coupling and CSA are responsible for relaxation. For these experiments, the relaxation rate constants may be calculated from the spectral density, $J (ω)$ :

\begin{array}{l} R_{1}^{I} = \underbrace{{(\frac{δ^{IS}}{4})}^{2} (J (ω_{I} - ω_{S}) + 3 J (ω_{I}) + 6 J (ω_{I} + ω_{S}))}_{\text{dipolar relaxation}} + \underbrace{\frac{1}{3} {(ω_{I} Δ σ_{I})}^{2} J (ω_{I})}_{\text{CSA relaxation}} \\ R_{2}^{I} = \frac{1}{2} R_{1}^{I} + \underbrace{{(\frac{δ^{IS}}{4})}^{2} (3 J (ω_{S}) + 2 J (0))}_{\text{dipolar relaxation}} + \underbrace{\frac{2}{9} {(ω_{I} Δ σ_{I})}^{2} J (0)}_{\text{CSA relaxation}} \\ σ_{IS} = \underbrace{{(\frac{δ^{IS}}{4})}^{2} (- J (ω_{I} - ω_{S}) + 6 J (ω_{I} + ω_{S}))}_{\text{dipolar relaxation}} \end{array} (1)

Here, $ω_{I}$ is the Larmor frequency (in radians/s) of the nucleus being relaxed, $ω_{S}$ the Larmor frequency of the coupled spin (usually ¹H), and $δ^{IS}$ and $Δ σ_{I} ω_{I}$ are the anisotropies of the dipolar coupling and CSA, respectively. The dipolar anisotropy is $δ^{IS} = - 2 \frac{μ_{0}}{4 π} \frac{\hbar γ_{I} γ_{S}}{r_{IS}^{3}}$ , with $μ_{0}$ the vacuum permeability in T²m³/J, $γ_{I}, γ_{S}$ the gyromagnetic ratios of the two spins in radians/(s·T), $\hbar$ the reduced Planck constant in J·s, and $r_{IS}$ the distance between the spins in meters; the resulting $δ^{IS}$ is the full breadth of the dipolar powder pattern in radians/s. $Δ σ_{I} ω_{I}$ is similarly the full breadth ( $Δ σ_{I} = \frac{3}{2} (σ_{zz} - σ_{iso})$ ) of the CSA powder pattern in radians/s when the Larmor frequency of spin I is given by $ω_{I}$ , also in radians/s (Schanda and Ernst, 2016). The spectral density may be obtained from the Fourier transform of the correlation function of motion. The correlation function itself is the rank-2 tensor correlation function, and describes the reorientational behavior of an NMR interaction tensor in time. If we assume the correlation function is symmetric in time, we may replace $e^{i ω t}$ with $\cos (ω t)$ in the Fourier transform. We can also change the integration bounds from $(- \infty, \infty)$ to $(0, \infty)$ , and must multiply the integral by two in order to compensate for only integrating over half the space.

\begin{array}{l} J (ω) = \int_{- \infty}^{\infty} C (t) e^{i ω t} d t \\ = \int_{- \infty}^{\infty} \underbrace{C (t) \cos (ω t)}_{\text{symmetric in time}} d t + i \int_{- \infty}^{\infty} \underbrace{C (t) \sin (ω t)}_{\text{antisymmetric in time} \to 0} d t \\ J (ω) = 2 \int_{0}^{\infty} C (t) \cos (ω t) d t \end{array} (2)
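As a quick numerical illustration of Eq. 2, the sketch below evaluates the one-sided cosine transform for a mono-exponential correlation function and compares it with the Lorentzian closed form. The correlation time, the frequency, and the function names are illustrative assumptions, and the factor 1/5 that appears later in Eq. 3 is omitted.

```python
import numpy as np

def spectral_density_numeric(tau, omega, n_points=400_001):
    """J(w) = 2 * int_0^inf C(t) cos(w t) dt for C(t) = exp(-t/tau),
    evaluated with the trapezoid rule on a truncated time axis."""
    t = np.linspace(0.0, 40.0 * tau, n_points)    # C(t) ~ e^-40 at the upper limit
    y = np.exp(-t / tau) * np.cos(omega * t)
    return 2.0 * np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))

def spectral_density_analytic(tau, omega):
    """Closed form of the same integral: a Lorentzian."""
    return 2.0 * tau / (1.0 + (omega * tau) ** 2)

tau = 1e-9                    # assumed correlation time: 1 ns
omega = 2 * np.pi * 90e6      # assumed frequency: 90 MHz, in rad/s

j_num = spectral_density_numeric(tau, omega)
j_ana = spectral_density_analytic(tau, omega)
print(f"numeric: {j_num:.6e} s, analytic: {j_ana:.6e} s")
```

The two values should agree closely, which is the behavior the factor of two in the last line of Eq. 2 is meant to guarantee.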

Then, model-free analysis makes a few assumptions about the correlation function:

1) The total motion of a given bond is the result of overall tumbling of the molecule in solution and internal motion of the bond within the molecule, and these two motions are statistically independent.

2) Decay of the correlation function due to internal motion is fast compared to all $ω$ sampled by the set of experimental relaxation rate constants (i.e., the extreme narrowing limit).

The decay of the correlation due to internal motion does not need to be mono-exponential (or even multi-exponential, although we will later apply this assumption). Instead of the second assumption, we may assume that the correlation function due to internal motion is mono-exponential, in which case we do not require its decay to be fast (we will visit this case only briefly, as it is less likely to occur in practice). We also assume tumbling is isotropic, although this is not necessarily required. Note that separate methods exist in case overall tumbling and internal motion are coupled (Tugarinov et al., 2001), although we will not consider these here. As a set of equations, this yields

\begin{array}{l} C (t) = C^{intern .} (t) \cdot C^{rot .} (t) \\ C^{rot .} (t) = \frac{1}{5} e^{- t / τ_{M}} \\ C^{intern .} (t) = S^{2} + (1 - S^{2}) G (t) \\ G (0) = 1, \lim_{t \to \infty} G (t) = 0 \end{array} (3)

The first equation is the result of statistical independence of internal and overall motion, such that we may write the total correlation function, $C (t)$ , as a product of a correlation function resulting from the internal motion ( $C^{intern .} (t)$ ), and a correlation function resulting from the overall rotational tumbling ( $C^{rot .} (t)$ ). The overall motion may be described by a single decaying exponential, with correlation time $τ_{M}$ , if that overall motion is isotropic (occurring if the molecule is approximately spherical). For internal motion, $C^{intern .} (t)$ has an initial value of 1, and equilibrates at $S^{2}$ . $S^{2}$ is referred to as the generalized order parameter, and is related to, but not always equal to, order parameters that may be extracted from measurement of residual couplings, as will be discussed in Determining S². $G (t)$ is simply the decaying part of $C^{intern .} (t)$ , normalized such that its initial value is 1, and final value is 0. If the second assumption, fast decay of the correlation function due to internal motion, is fulfilled, we may calculate $J (ω)$ using the parameters $τ_{M}$ , $S^{2}$ , and $〈 τ_{e} 〉$ , where

〈 τ_{e} 〉 = \int_{0}^{\infty} e^{- t / τ_{M}} G (t) d t (4)

We calculate $J (ω)$ in order to see how it is a function of the parameters $τ_{M}$ , $S^{2}$ , and $〈 τ_{e} 〉$ .

\begin{array}{l} J (ω) = \frac{2}{5} \int_{0}^{\infty} [S^{2} e^{- t / τ_{M}} + (1 - S^{2}) e^{- t / τ_{M}} G (t)] \cos (ω t) d t \\ = \frac{2}{5} [\frac{S^{2} τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S^{2}) \int_{0}^{\infty} e^{- t / τ_{M}} G (t) \underbrace{\cos (ω t)}_{\approx 1} d t] \\ = \frac{2}{5} [\frac{S^{2} τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S^{2}) 〈 τ_{e} 〉] \end{array} (5)

We see that if $e^{- t / τ_{M}} G (t)$ decays quickly compared to $ω$ , then we may replace $\cos (ω t)$ with 1, since the exponential approaches zero more quickly than the cosine term can evolve away from 1. Then, regardless of the precise form of $G (t)$ , $J (ω)$ may always be calculated from the parameters $S^{2}$ , $〈 τ_{e} 〉$ , and $τ_{M}$ . Furthermore, if $τ_{M}$ is known (usually from the analysis of R₁ and R₂ throughout a molecule (Kay et al., 1989)), $J (ω)$ becomes a linear function of the parameters $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ .

Instead of assuming fast decay of $G (t)$ , one may alternatively assume that it is mono-exponential ( $G (t) = e^{- t / τ}$ ), yielding

\begin{array}{l} J (ω) = \frac{2}{5} \int_{0}^{\infty} [S^{2} e^{- t / τ_{M}} + (1 - S^{2}) e^{- t / τ_{M}} e^{- t / τ}] \cos (ω t) d t \\ {〈 τ_{e} 〉}^{- 1} = τ_{M}^{- 1} + τ^{- 1} \\ J (ω) = \frac{2}{5} [\frac{S^{2} τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S^{2}) \frac{〈 τ_{e} 〉}{1 + {(ω 〈 τ_{e} 〉)}^{2}}] \end{array} (6)

In the extreme narrowing limit, where decay of the correlation function is fast, we have $ω 〈 τ_{e} 〉 ≪ 1$ such that this result equals the result in Eq. 5. The expression in Eq. 6 is equivalent to Eq. 1 in (Lipari and Szabo, 1982a), and is valid either in the case of mono-exponential decay or fast decay of the internal correlation function. However, we find the case of fast, multi-exponential decay the more likely scenario, and so focus on this assumption.
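To make the connection between the spectral density and the measured rate constants concrete, the following sketch implements Eqs. 5 and 6 and inserts Eq. 6 into the standard expressions for R₁, R₂, and $σ_{IS}$ of a backbone ¹⁵N coupled to ¹H. The field (14.1 T), the H–N distance (1.02 Å), the CSA (−170 ppm), and the dynamics parameters are illustrative assumptions, as are all function names; Larmor frequencies are taken as magnitudes, which is a common convention but also an assumption here.

```python
import numpy as np

MU0_OVER_4PI = 1e-7          # T^2 m^3 / J
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad / (s T), 1H
GAMMA_N = -2.7126180e7       # rad / (s T), 15N

def j_full(omega, s2, tau_e, tau_m):
    """Eq. 6: mono-exponential (or fast) internal motion plus isotropic tumbling."""
    return 0.4 * (s2 * tau_m / (1 + (omega * tau_m) ** 2)
                  + (1 - s2) * tau_e / (1 + (omega * tau_e) ** 2))

def j_fast(omega, s2, tau_e, tau_m):
    """Eq. 5: internal term with cos(wt) replaced by 1."""
    return 0.4 * (s2 * tau_m / (1 + (omega * tau_m) ** 2) + (1 - s2) * tau_e)

def rates(j, b0, r_nh=1.02e-10, dsigma=-170e-6):
    """R1, R2, sigma_IS (all in 1/s) and delta (rad/s) for I = 15N, S = 1H.
    j is a callable returning J(omega); r_nh and dsigma are assumed values."""
    w_i, w_s = abs(GAMMA_N) * b0, abs(GAMMA_H) * b0      # magnitudes, rad/s
    delta = -2 * MU0_OVER_4PI * HBAR * GAMMA_N * GAMMA_H / r_nh ** 3
    d2 = (delta / 4) ** 2                                # dipolar prefactor
    c2 = (w_i * dsigma) ** 2                             # CSA prefactor
    r1 = d2 * (j(w_i - w_s) + 3 * j(w_i) + 6 * j(w_i + w_s)) + c2 * j(w_i) / 3
    r2 = 0.5 * r1 + d2 * (3 * j(w_s) + 2 * j(0.0)) + 2 * c2 * j(0.0) / 9
    sigma = d2 * (-j(w_i - w_s) + 6 * j(w_i + w_s))
    return r1, r2, sigma, delta

b0, s2, tau_e, tau_m = 14.1, 0.85, 20e-12, 4e-9          # assumed example values
r1, r2, sigma, delta = rates(lambda w: j_full(w, s2, tau_e, tau_m), b0)
print(f"delta/2pi = {delta / (2 * np.pi):.0f} Hz, R1 = {r1:.2f}/s, R2 = {r2:.2f}/s")

# For fast internal motion, Eq. 5 and Eq. 6 should nearly coincide.
w_n = abs(GAMMA_N) * b0
j5, j6 = j_fast(w_n, s2, tau_e, tau_m), j_full(w_n, s2, tau_e, tau_m)
print(f"relative difference of Eq. 5 and Eq. 6 at the 15N frequency: {abs(j5 - j6) / j6:.2e}")
```

Increasing `tau_e` toward the nanosecond range in this sketch makes the two spectral densities diverge, which is the regime where only Eq. 6 (and only for mono-exponential decay) remains valid.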

The notation $〈 τ_{e} 〉$ is used to indicate the average of the effective correlation time. To understand how the integral of $e^{- t / τ_{M}} G (t)$ is related to this average, we must assume that $G (t)$ is the sum of decaying exponentials. This may be achieved with a sum over a discrete number of correlation times, weighted with $A_{i}$ , or a continuous distribution, defined by the function $θ (z)$ .

\begin{array}{l} G (t) = \sum_{i} A_{i} e^{- t / τ_{i}} \\ where Σ_{i} A_{i} = 1 \\ –or– \\ G (t) = \int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z} \cdot 1 s)} d z \\ where \int_{- \infty}^{\infty} θ (z) d z = 1 \end{array} (7)

Since $G (0) = 1$ , it is clear that the sum of amplitudes ( $A_{i}$ ) must be 1. For the former equation, we take a simple sum, and for the latter form, we use a distribution of correlation times, $θ (z)$ , given on a logarithmic scale, such that $z = \log_{10} (τ_{c} / s)$ . The distribution must similarly integrate to 1. The two forms can be treated equivalently. We have recently re-introduced the latter form (Smith et al., 2018), which was previously used to describe a variety of continuous correlation time distributions, e.g., see Beckmann (1988). We may insert this expression for $G (t)$ (Eq. 7) into Eq. 4 in order to obtain the relationship between $θ (z)$ and $〈 τ_{e} 〉$ .

\begin{array}{l} 〈 τ_{e} 〉 = \int_{0}^{\infty} e^{- t / τ_{M}} \underbrace{\int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z} \cdot 1 s)} d z}_{G (t)} d t \\ = \int_{0}^{\infty} \int_{- \infty}^{\infty} θ (z) e^{- t (τ_{M}^{- 1} + {(10^{z} \cdot 1 s)}^{- 1})} d z d t \\ {(τ_{e} (z))}^{- 1} = τ_{M}^{- 1} + {(10^{z} \cdot 1 s)}^{- 1} \\ 〈 τ_{e} 〉 = \int_{0}^{\infty} \int_{- \infty}^{\infty} θ (z) e^{- t / τ_{e} (z)} d z d t = \int_{- \infty}^{\infty} θ (z) {(- τ_{e} (z) e^{- t / τ_{e} (z)}) |}_{t = 0}^{\infty} d z \\ 〈 τ_{e} 〉 = \int_{- \infty}^{\infty} θ (z) τ_{e} (z) d z \\ equivalently : 〈 τ_{e} 〉 = \sum_{i} A_{i} τ_{e}^{i}, for {(τ_{e}^{i})}^{- 1} = τ_{M}^{- 1} + τ_{i}^{- 1} \end{array} (8)

$τ_{e} (z)$ , defined by ${(τ_{e} (z))}^{- 1} = τ_{M}^{- 1} + {(10^{z} \cdot 1 s)}^{- 1}$ , is the effective correlation time, which results from decay of the correlation function due to both the internal motion, with correlation time $τ_{c} = 10^{z} \cdot 1 s$ , and the overall motion, with correlation time $τ_{M}$ . Since $θ (z)$ integrates to 1, $\int_{- \infty}^{\infty} θ (z) τ_{e} (z) d z$ yields the weighted average of the effective correlation time, $〈 τ_{e} 〉$ . Then, one fits experimental data to a correlation function having the following model:

C (t) = \frac{1}{5} (S^{2} e^{- t / τ_{M}} + (1 - S^{2}) e^{- t / 〈 τ_{e} 〉}) (9)

Applying this model does not require that the true correlation function has exactly this form, but rather, the model correlation function simply must have the same values of $S^{2}$ and $〈 τ_{e} 〉$ as the true correlation function. In this sense, the analysis itself remains model-free, although equating $〈 τ_{e} 〉$ with the averaged effective correlation time requires the true correlation function to be a sum of decaying exponentials, as in Eq. 7.
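The equivalence of Eq. 4 and Eq. 8 can be checked numerically. The sketch below builds a log-Gaussian $θ (z)$ (centered at 10 ps with a width of 0.75 orders of magnitude, values borrowed from the Figure 3C description), computes $〈 τ_{e} 〉$ once as the weighted average of $τ_{e} (z)$ and once by direct time integration of $e^{- t / τ_{M}} G (t)$ . The grids, the truncation of the distribution, and the helper names are assumptions made only for this illustration.

```python
import numpy as np

def trapz(y, x):
    """Trapezoid rule on a (possibly non-uniform) grid."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

tau_m = 4.0                                   # tumbling correlation time, ns

# Log-Gaussian distribution theta(z), z = log10(tau_c / s), truncated at +/- 4 sigma.
z = np.linspace(-14.0, -8.0, 601)
theta = np.exp(-0.5 * ((z + 11.0) / 0.75) ** 2)
theta /= trapz(theta, z)                      # normalize so that theta integrates to 1

tau_c = 10.0 ** z * 1e9                       # internal correlation times, ns
tau_eff = 1.0 / (1.0 / tau_m + 1.0 / tau_c)   # tau_e(z), third line of Eq. 8

# Eq. 8: weighted average of the effective correlation time.
tau_e_mean = trapz(theta * tau_eff, z)

# Eq. 4: direct integration of exp(-t/tau_M) * G(t), with G(t) from Eq. 7.
t = np.concatenate(([0.0], np.logspace(-6, 2, 4000)))            # ns
g = np.array([trapz(theta * np.exp(-ti / tau_c), z) for ti in t])
tau_e_direct = trapz(np.exp(-t / tau_m) * g, t)

print(f"<tau_e> from Eq. 8: {tau_e_mean:.5f} ns")
print(f"<tau_e> from Eq. 4: {tau_e_direct:.5f} ns")
```

Both routes should give the same number to within the accuracy of the grids, without any assumption on the shape of $θ (z)$ beyond it being a sum (or integral) of decaying exponentials.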

A Few Notes on Linearity

We will later note that many of the methods used for analyzing relaxation rate constants result in parameters that are linear functions of the distribution of correlation times, $(1 - S^{2}) θ (z)$ . Specifically, we mean that any parameter, $P_{m}$ , is linear to $(1 - S^{2}) θ (z)$ if it can be written as

P_{m} = (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) p_{m} (z) d z (10)

That is, for every correlation time, $z = \log_{10} (τ_{c} / s)$ , $P_{m}$ increases proportionally to $(1 - S^{2}) θ (z)$ at that correlation time, where the proportionality is defined by $p_{m} (z)$ . Furthermore, any linear combination of parameters, $P_{m}$ , is then also linear to $(1 - S^{2}) θ (z)$ , as we can see by integrating a sum of parameters, $P_{m}$ , and swapping the order of the integration and the summation.

\begin{array}{l} \sum_{m} a_{m} P_{m} = \sum_{m} a_{m} (1 - S^{2}) \int_{- \infty}^{\infty} p_{m} (z) θ (z) d z \\ = (1 - S^{2}) \int_{- \infty}^{\infty} \underbrace{[\sum_{m} a_{m} p_{m} (z)]}_{= Σ (z)} θ (z) d z \\ = (1 - S^{2}) \int_{- \infty}^{\infty} Σ (z) θ (z) d z \end{array} (11)

We define the function $Σ (z)$ to be the weighted sum of the sensitivities, $p_{m} (z)$ , which then defines the linear relationship of the sum of the $P_{m}$ to $(1 - S^{2}) θ (z)$ . This principle is one of the basic tenets of linear algebra. What can be less obvious is that a linear fit of parameters, $P_{m}$ , defined by a matrix, M, to a new set of parameters, $Q_{n}$ , is also linear to $(1 - S^{2}) θ (z)$ . This is only the case if restrictions on the fit parameters, $Q_{n}$ , are not applied (no priors are used). In this case, the parameters $Q_{n}$ should minimize the following equation.

\begin{array}{l} \min [\sum_{m} {| P_{m} - \sum_{n} {[M]}_{m, n} Q_{n} |}^{2}] \\ Q_{n} = \sum_{m} {[M^{- 1}]}_{n, m} P_{m} \end{array} (12)

One may determine the $Q_{n}$ by computing the pseudoinverse of M (denoted $M^{- 1}$ ) and multiplying by the $P_{m}$ . Linearity of the $Q_{n}$ to $(1 - S^{2}) θ (z)$ results from the fact that linear combinations defined by $M^{- 1}$ remain unchanged regardless of the value of the parameters being fit, $P_{m}$ . However, if the allowed values of the $Q_{n}$ are restricted with priors, then it can be that some values of $P_{m}$ will result in the latter formula in Eq. 12 yielding $Q_{n}$ outside of the allowed range. In this case, a linear least squares algorithm will search for a different solution than that given by Eq. 12, such that the $Q_{n}$ are no longer defined by $M^{- 1}$ , and no longer have a consistent linear relationship to $(1 - S^{2}) θ (z)$ . Note that if priors are used, but Eq. 12 does not yield $Q_{n}$ outside of the bounds defined by the priors, then Eq. 12 still remains the best solution and linearity is maintained. In general, we will find analysis methods that rely on linear combination of data have more predictable behavior than those that do not.
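The effect of restricting the fit parameters can be seen in a small numerical experiment. The sketch below uses a hypothetical 3 × 2 matrix M and compares the unconstrained pseudoinverse solution of Eq. 12 with a fit restricted to non-negative $Q_{n}$ ; non-negativity is only one assumed example of a restriction, and the exhaustive active-set search used to solve it is chosen here for transparency with few parameters, not as the method used in practice.

```python
import itertools
import numpy as np

def fit_unconstrained(M, P):
    """Second line of Eq. 12: Q = pinv(M) @ P."""
    return np.linalg.pinv(M) @ P

def fit_nonnegative(M, P):
    """Exact non-negative least squares by enumerating supports.
    Only practical for a handful of parameters."""
    n = M.shape[1]
    best_q, best_res = np.zeros(n), np.sum(P ** 2)     # empty support: Q = 0
    for k in range(1, n + 1):
        for support in itertools.combinations(range(n), k):
            cols = list(support)
            q_s = np.linalg.pinv(M[:, cols]) @ P       # least squares on this support
            if np.any(q_s < 0):
                continue                               # violates the restriction
            res = np.sum((P - M[:, cols] @ q_s) ** 2)
            if res < best_res:
                q = np.zeros(n)
                q[cols] = q_s
                best_q, best_res = q, res
    return best_q

M = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])     # hypothetical fit matrix
P_a = M @ np.array([1.0, 2.0])     # unconstrained solution lies inside the bounds
P_b = M @ np.array([3.0, -1.0])    # unconstrained solution violates Q >= 0

for name, fit in (("unconstrained", fit_unconstrained), ("non-negative", fit_nonnegative)):
    q_sum_of_fits = fit(M, P_a) + fit(M, P_b)
    q_fit_of_sum = fit(M, P_a + P_b)
    print(name, q_sum_of_fits, q_fit_of_sum, np.allclose(q_sum_of_fits, q_fit_of_sum))
```

For the unconstrained fit, fitting the sum of two data sets gives the sum of the individual fits; for the restricted fit this additivity is lost as soon as one data set pushes a parameter against its bound.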

Then, the model-free parameters $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ are linear to $(1 - S^{2}) θ (z)$ , because one can fit experimental relaxation rate constants with $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ (see Eq. 5), where the relaxation rate constants themselves are linear to the spectral density (Eq. 1), the spectral density is linear to the correlation function (via Fourier transform, Eq. 2), and the correlation function is linear to the distribution of correlation times, $(1 - S^{2}) θ (z)$ (Eqs. 3, 7). Assuming the correlation function decays quickly, this linear relationship is given by the following, where $τ_{e} (z)$ is defined in Eq. 8.

\begin{array}{l} S^{2} = 1 - [(1 - S^{2}) \int_{- \infty}^{\infty} θ (z) d z] \\ (1 - S^{2}) 〈 τ_{e} 〉 = (1 - S^{2}) \int_{- \infty}^{\infty} τ_{e} (z) θ (z) d z \end{array} (13)

Note that $〈 τ_{e} 〉$ is not itself linear to $(1 - S^{2}) θ (z)$ , but is easily obtained from the above parameters.

Fitting With Model-Free

In Figure 3, we test the performance of model-free fitting under a number of conditions. In Figure 3A, we calculate a number of relaxation rate constants from motion having a single internal correlation time and overall tumbling with $τ_{M}$ = 4 ns, and then fit the results, assuming the model-free correlation function (Eq. 9). We may calculate the spectral density exactly, or we may assume that the correlation function decays quickly, by using the spectral density given in Eq. 5, resulting in a linear fit. The former method is shown as a blue, solid line, where the input parameters always exactly match the fit parameters, whereas using a linear fit (red, dashed line) results in disagreement of input and fit parameters when the correlation function does not decay quickly compared to the frequencies sampled (that is, when the condition $ω τ_{e} ≪ 1$ is not met); in this case, Eq. 5 is no longer a good estimate of the spectral density whereas Eq. 6 has the correct form.

FIGURE 3. Model-free fit parameters as a function of input parameters. For each plot, a data set is calculated, using the experiments found in Table I of Lipari and Szabo (1982b), and the resulting rate constants are fit using the model-free approach, with the resulting ${〈 τ_{e} 〉}_{fit}$ and $(1 - S^{2})$ shown on the left and right, respectively. For all plots, the tumbling correlation time is $τ_{M}$ = 4 ns and $(1 - S^{2})$ = 0.3. One correlation time of the internal motion is varied, and we plot ${〈 τ_{e} 〉}_{in}$ on the x-axis. In each plot, we fit using the full spectral density (blue, solid, see Eq. 6) and using a linear approximation (red, dashed, see Eq. 5). In (A), the input correlation function only has a single correlation time. In (B), one correlation time is fixed to 10 ps, and the second correlation time is swept. In (C), a log-Gaussian distribution (μ = 10 ps, σ = 0.75 orders of magnitude) is combined with a correlation time that is varied (with total amplitude equal). On the left plots, black dotted lines indicate where the input value, ${〈 τ_{e} 〉}_{in}$ , matches the fit, ${〈 τ_{e} 〉}_{fit}$ . In all plots, vertical black dotted lines indicate where $ω {〈 τ_{e} 〉}_{in} = 0.5$ for $ω / 2 π$ = 90 MHz, where this frequency corresponds to the highest field used for the data set.

In Figure 3B, we include two correlation times in the input, each with equal amplitude, where one correlation time is fixed (10 ps), and a second correlation time is swept. We plot the mean effective correlation time of the input on the x-axis ( ${〈 τ_{e} 〉}_{in}$ ), and compare this to the fitted parameters on the y-axis ( ${〈 τ_{e} 〉}_{fit}$ , left plot, $1 - S^{2}$ , right plot). As expected, if the assumption that $ω τ_{e} ≪ 1$ holds for all frequencies sampled and all correlation times present, the fit parameters are in good agreement with their input values, but when this condition is violated for any of the correlation times, ${〈 τ_{e} 〉}_{fit}$ and $S^{2}$ no longer reproduce the correct values. Note that performing this fit with the full spectral density (blue, solid line) and using just a linear fit (red, dashed line) produces very similar results. In Figure 3C, we perform the same tests, but instead of fixing a correlation time to 10 ps, we have a log-Gaussian distribution of correlation times, centered at 10 ps, with a standard deviation of 0.75 orders of magnitude. Results are similar to those found in Figure 3B.
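The behavior described for Figure 3B can be reproduced qualitatively with a few lines of code. The sketch below is a simplification and not the procedure used for the figure: it fits spectral density values directly rather than relaxation rate constants, and the five sampled frequencies are hypothetical. The input has two internal correlation times of equal amplitude (10 ps and a second, variable one), $τ_{M}$ = 4 ns, and $(1 - S^{2})$ = 0.3, and the fit uses the linear form of Eq. 5.

```python
import numpy as np

TAU_M = 4.0                                              # ns
# Hypothetical sampled frequencies (MHz), converted to rad/ns.
OMEGA = 2 * np.pi * 1e-3 * np.array([0.0, 15.0, 45.0, 75.0, 90.0])

def tau_eff(tau):
    """Effective correlation time, Eq. 8 (all times in ns)."""
    return 1.0 / (1.0 / TAU_M + 1.0 / tau)

def lorentzian(omega, tau):
    return tau / (1.0 + (omega * tau) ** 2)

def j_input(omega, s2, amps, taus):
    """Exact spectral density for a discrete set of internal correlation times."""
    internal = sum(a * lorentzian(omega, tau_eff(tau)) for a, tau in zip(amps, taus))
    return 0.4 * (s2 * lorentzian(omega, TAU_M) + (1.0 - s2) * internal)

def fit_linear(j_values, omega):
    """Linear fit with Eq. 5: unknowns are S^2 and (1 - S^2) <tau_e>."""
    design = 0.4 * np.column_stack([lorentzian(omega, TAU_M), np.ones_like(omega)])
    (s2_fit, b_fit), *_ = np.linalg.lstsq(design, j_values, rcond=None)
    return s2_fit, b_fit / (1.0 - s2_fit)

results = {}
for tau_second in (0.02, 2.0):                           # ns: one fast, one slow case
    amps, taus = (0.5, 0.5), (0.01, tau_second)
    tau_in = sum(a * tau_eff(tau) for a, tau in zip(amps, taus))
    s2_fit, tau_fit = fit_linear(j_input(OMEGA, 0.7, amps, taus), OMEGA)
    results[tau_second] = (tau_in, tau_fit, s2_fit)
    print(f"second tau = {tau_second} ns: <tau_e>_in = {tau_in:.4f} ns, "
          f"<tau_e>_fit = {tau_fit:.4f} ns, S2_fit = {s2_fit:.4f}")
```

With the fast second correlation time the fitted $〈 τ_{e} 〉$ should reproduce the input, whereas with the slow one the fit is expected to deviate noticeably, because part of the slow motion is absorbed into the fitted $S^{2}$ .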

Determining S²

For model-free analysis, $〈 τ_{e} 〉$ is the average effective correlation time, and can be calculated from the distribution of correlation times. $S^{2}$ , on the other hand, is determined from the distribution of orientations sampled by internal motion. By definition, it is equal to the correlation function of internal motion in the limit as t goes to infinity. We may obtain $S^{2}$ by first considering the formula for the correlation function.

C^{intern .} (t) = {〈 P_{2} (\vec{μ} (τ) \cdot \vec{μ} (t + τ)) 〉}_{τ} (14)

$P_{2} (x)$ is the second Legendre polynomial ( $P_{2} (x) = (3 x^{2} - 1) / 2$ ), and $\vec{μ} (τ)$ is a normalized vector that gives the direction of the principal component of an NMR interaction as a function of time, due to internal motion only (without tumbling). The dot product ( $\vec{μ} (τ) \cdot \vec{μ} (t + τ)$ ) yields the cosine of the angle between the two vectors. The correlation function itself may take on a variety of complex forms, depending on the correlation times present, but $S^{2}$ , its value as $t \to \infty$ , depends only on the distribution of orientations sampled by the internal motion. This may be obtained by taking a weighted average over all possible starting orientations (p) and all possible final orientations (q), and calculating $P_{2} ({\vec{μ}}_{p} \cdot {\vec{μ}}_{q})$ for each pair. Defining $p_{eq} ({\vec{μ}}_{p})$ to be the fraction of orientation ${\vec{μ}}_{p}$ at thermal equilibrium, we obtain

S^{2} = \sum_{p} \sum_{q} p_{eq} ({\vec{μ}}_{p}) p_{eq} ({\vec{μ}}_{q}) P_{2} ({\vec{μ}}_{p} \cdot {\vec{μ}}_{q}) (15)

Then, if we have a precise description of the internal dynamics, we may calculate parameters $〈 τ_{e} 〉$ and $S^{2}$ using Eqs. 8, 15. We may not easily go backwards, to obtain a precise description of the dynamics from only these parameters. However, this is not a limitation of the method of analysis, but rather of the information content of the data.
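As a minimal sketch of Eq. 15, the double sum can be evaluated directly for a discrete set of orientations. The function names and the three-site example (equally populated sites on a cone at the tetrahedral angle, resembling a methyl-type hop) are our own illustrative assumptions and are not taken from the analyses above.

```python
import numpy as np

def p2(x):
    """Second Legendre polynomial, P2(x) = (3x^2 - 1)/2."""
    return 0.5 * (3.0 * x**2 - 1.0)

def order_parameter(vectors, populations):
    """S^2 from Eq. 15: population-weighted double sum over orientations."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)   # normalize each vector
    p = np.asarray(populations, dtype=float)
    p = p / p.sum()                                     # equilibrium fractions sum to 1
    cosines = np.clip(v @ v.T, -1.0, 1.0)               # mu_p . mu_q for all pairs
    return float(p @ p2(cosines) @ p)

# Hypothetical example: three equally populated sites on a cone
beta = np.arccos(-1.0 / 3.0)                 # tetrahedral angle, about 109.47 degrees
phi = np.deg2rad([0.0, 120.0, 240.0])
sites = np.column_stack([np.sin(beta) * np.cos(phi),
                         np.sin(beta) * np.sin(phi),
                         np.full(3, np.cos(beta))])
S2 = order_parameter(sites, [1, 1, 1])
print(round(S2, 4))  # 0.1111
```

For this geometry the result is 1/9, and a single rigid orientation returns $S^{2} = 1$ . Note that only the orientations and their populations enter; no correlation time appears anywhere in the calculation.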

In solid-state NMR, we no longer have overall tumbling motion, so the term $e^{- t / τ_{M}}$ vanishes from the correlation function and Eq. 5 becomes simply

J (ω) = \frac{2}{5} (1 - S^{2}) 〈 τ 〉 (16)

This prevents us from separating $S^{2}$ and $〈 τ 〉$ via relaxation data alone (we drop the subscript e from $τ$ , since it is no longer an effective correlation time); however, one may measure the size of residual couplings in NMR (Chevelkov et al., 2009a; Schanda et al., 2011), often via DIPSHIFT (Munowitz et al., 1981) or REDOR (Gullion and Schaefer, 1989). In this case, the ratio of the anisotropy of the motionally averaged interaction ( $δ_{resid .}$ ) to that of the rigid interaction ( $δ_{rigid .}$ ) defines $S_{resid .}$ .

S_{resid .} = δ_{resid .} / δ_{rigid .} (17)

One usually equates $S^{2}$ and $S_{resid.}^{2}$ , although for motion that does not have at least a three-fold symmetry axis, these terms are not necessarily equal (Supplementary Section S3). Examples are found in Figures 2C-E, although we see the deviation is actually quite small (e.g., $S_{resid .}^{2}$ = 0.69, vs. $S^{2}$ = 0.7), so that this approach may be used to obtain good separation of $S^{2}$ and $〈 τ 〉$ .

Alternative Methods

In the case that all internal motion is fast, such that the correlation function decays quickly, model-free analysis is an ideal approach for extracting dynamics information from relaxation data: the full information content of the relaxation data is captured in the parameters $S^{2}$ and $〈 τ_{e} 〉$ , where these parameters have simple relationships to the distribution of correlation times, $(1 - S^{2}) θ (z)$ (parameters $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ are furthermore linearly related to $(1 - S^{2}) θ (z)$ ). In case the correlation function does not decay quickly compared to the sampled frequencies, our formula for the spectral density becomes significantly more complex. To obtain it, we begin from Eq. 5 (first expression), and insert the assumed form of $G (t)$ , found in Eq. 7, yielding the equation for the solution-state spectral density.

\begin{array}{l} J (ω) = \frac{2}{5} \int_{0}^{\infty} [S^{2} e^{- t / τ_{M}} + (1 - S^{2}) e^{- t / τ_{M}} \int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z} \cdot 1 s)} d z] \cos (ω t) d t \\ {(τ_{e} (z))}^{- 1} = τ_{M}^{- 1} + {(10^{z} \cdot 1 s)}^{- 1} \\ z_{e} (z) = \log_{10} (τ_{e} (z) / s) \\ J (ω) = \frac{2}{5} \int_{0}^{\infty} [S^{2} e^{- t / τ_{M}} + (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z_{e} (z)} \cdot 1 s)} d z] \cos (ω t) d t \\ = \frac{2}{5} [\frac{S^{2} τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \frac{10^{z_{e} (z)} \cdot 1 s}{1 + {(ω \cdot 10^{z_{e} (z)} \cdot 1 s)}^{2}} d z] \end{array} (18)

The first step is to combine the two exponential terms, where we define the log-effective correlation time, $z_{e} (z)$ , as a function of the log-internal correlation time, z, and also the rotational correlation time, $τ_{M}$ . Subsequently, each exponential term is Fourier transformed to yield the familiar Lorentzian function. The spectral density for solid-state NMR can be similarly calculated, where the overall motion is omitted.

\begin{array}{l} J (ω) = \frac{2}{5} \int_{0}^{\infty} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z} \cdot 1 s)} d z \cos (ω t) d t \\ = \frac{2}{5} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \frac{10^{z} \cdot 1 s}{1 + {(ω \cdot 10^{z} \cdot 1 s)}^{2}} d z \end{array} (19)
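Eqs. 18, 19 can be evaluated numerically once a form for $θ (z)$ is chosen. The following is a sketch under our own assumptions: a log-Gaussian $θ (z)$ , a uniform grid in z, and a hand-written trapezoidal rule. All function names are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def log_gaussian(z, z0, sigma):
    """Normalized Gaussian in z = log10(tau / s); stands in for theta(z)."""
    theta = np.exp(-0.5 * ((z - z0) / sigma) ** 2)
    return theta / trapezoid(theta, z)

def spectral_density(omega, z, theta, S2, tau_M=None):
    """J(omega) from Eq. 19 (tau_M=None, solids) or Eq. 18 (tau_M given)."""
    tau = 10.0 ** z
    if tau_M is not None:
        tau = tau * tau_M / (tau + tau_M)        # effective correlation time tau_e(z)
    lorentz = tau / (1.0 + (omega * tau) ** 2)
    J = (1.0 - S2) * trapezoid(theta * lorentz, z)
    if tau_M is not None:
        J += S2 * tau_M / (1.0 + (omega * tau_M) ** 2)   # tumbling term
    return 0.4 * J

z = np.linspace(-16.0, -3.0, 26001)
theta = log_gaussian(z, z0=-11.0, sigma=0.75)    # centered at 10 ps
J0_solid = spectral_density(0.0, z, theta, S2=0.7)
J_solution = spectral_density(2 * np.pi * 60.8e6, z, theta, S2=0.7, tau_M=8.3e-9)
```

At $ω = 0$ the solid-state result reduces to $\frac{2}{5} (1 - S^{2}) 〈 τ 〉$ , which gives a convenient check of the integration grid against Eq. 16.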

The integral has a complex dependence on $ω$ , and depends on the specific form of $(1 - S^{2}) θ (z)$ , so that by using multiple relaxation experiments, we can extract more than two parameters describing the internal motion. However, we require a different approach to extract that information. We discuss four approaches developed for treating this case: the extended model-free approach (EMF), spectral density mapping (SDM), LeMaster’s approach, and IMPACT. Another approach that bears mentioning is the slowly relaxing local structure model (SRLS), which accounts for coupling of local motional modes to overall motion of a molecule in solution (Polimeno and Freed, 1992; Tugarinov et al., 2001; Mendelman and Meirovitch, 2021; Shapiro and Meirovitch, 2012). SRLS reduces to the model-free approach as coupling between local and overall motion vanishes. However, we do not include further comparison to the analytically simpler methods discussed here.

Extended Model-Free

Clore and coworkers found that, when measuring relaxation data at higher fields (up to 600 MHz), not all backbone motion could be well fit using the model-free approach for staphylococcal nuclease and interleukin-1β (Clore et al., 1990). They found that the simplest correlation function that could fit the data was obtained by adding another decaying exponential term, yielding the EMF correlation function.

C^{intern .} (t) = (1 - S_{f}^{2}) e^{- t / τ_{f}} + S_{f}^{2} (1 - S_{s}^{2}) e^{- t / τ_{s}} + S_{f}^{2} S_{s}^{2} (20)

In this correlation function, the total internal motion is separated into fast and slow components, with order parameters $S_{f}^{2}$ and $S_{s}^{2}$ , and effective correlation times, $τ_{f}$ and $τ_{s}$ , respectively. The product $S_{f}^{2} S_{s}^{2}$ should yield the total order parameter, $S^{2}$ . Also note that the faster motion’s order parameter scales the influence of the slower motion, as seen in the term $S_{f}^{2} (1 - S_{s}^{2}) e^{- t / τ_{s}}$ . Data analysis with EMF in solid- and solution-state NMR involves simply varying the parameters, $S_{f}^{2}$ , $S_{s}^{2}$ , $τ_{f}$ , and $τ_{s}$ , to find an optimal fit to experimental data. Often, one also performs a model selection step, where one may determine how many parameters should be included in the fit (Mandel et al., 1995; d’Auvergne and Gooley, 2003; Zinkevich et al., 2013; Gill et al., 2016). In Figure 4, the behavior of EMF parameters is shown for several correlation functions. In each subplot, all terms except one correlation time are fixed, and we observe the model behavior as we sweep through the variable correlation time. In Figure 4A, two correlation times are used, so that the input correlation function has the same form as the correlation function used for fitting; as expected, the fitted parameters perfectly match the input parameters, since the input and fit models match. In Figure 4B, three correlation times are input, where the fast and slow correlation times are fixed at 10 ps and 1 ns, and the intermediate correlation time is swept. In this case, when the intermediate correlation time is fast, the fitted $τ_{f}$ falls in between the fast and intermediate correlation times, and the fitted amplitude for the fast motion is the sum of the input amplitudes for the fast and intermediate motions. 
However, for longer correlation times, the fitted $τ_{f}$ again gets shorter, eventually equaling 10 ps, so that the fitted $τ_{s}$ takes over the role of fitting the intermediate correlation time. This is especially well illustrated in Figure 4(B, right), where the amplitude corresponding to the slow motion increases from 0.1 to 0.2, indicating that the slow motion in the model fits both the input intermediate and slow motions. Similar behavior is observed in Figure 4C, where a distribution of correlation times is combined with a single correlation time that is swept.
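For reference, Eq. 20 and the corresponding solution-state spectral density can be written compactly as below. This is only a sketch of the forward model: the function names and example values are our own assumptions, and the nonlinear fit of the four parameters to relaxation rate constants is not shown.

```python
import numpy as np

def emf_correlation(t, Sf2, Ss2, tau_f, tau_s):
    """Internal EMF correlation function, Eq. 20."""
    t = np.asarray(t, dtype=float)
    return ((1.0 - Sf2) * np.exp(-t / tau_f)
            + Sf2 * (1.0 - Ss2) * np.exp(-t / tau_s)
            + Sf2 * Ss2)

def emf_spectral_density(omega, Sf2, Ss2, tau_f, tau_s, tau_M):
    """Solution-state J(omega): each term of Eq. 20 is multiplied by
    exp(-t/tau_M) and Fourier transformed to a Lorentzian."""
    def lorentz(tau):
        return tau / (1.0 + (omega * tau) ** 2)
    tf = tau_f * tau_M / (tau_f + tau_M)    # effective fast correlation time
    ts = tau_s * tau_M / (tau_s + tau_M)    # effective slow correlation time
    return 0.4 * ((1.0 - Sf2) * lorentz(tf)
                  + Sf2 * (1.0 - Ss2) * lorentz(ts)
                  + Sf2 * Ss2 * lorentz(tau_M))

# Hypothetical parameters: total order parameter S^2 = Sf2 * Ss2 = 0.7
Sf2, Ss2, tau_f, tau_s, tau_M = 0.8, 0.875, 10e-12, 1e-9, 8.3e-9
C0 = float(emf_correlation(0.0, Sf2, Ss2, tau_f, tau_s))      # initial value
C_inf = float(emf_correlation(1e-3, Sf2, Ss2, tau_f, tau_s))  # long-time plateau
```

The initial value is 1 and the plateau equals the product $S_{f}^{2} S_{s}^{2}$ , as stated for the total order parameter.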


FIGURE 4. EMF parameters as a function of input correlation time (solution-state). For each plot, a data set is calculated, using the set of experiments from Clore et al. (1990), and the resulting rate constants are fitted using the EMF approach. For all plots, $τ_{M}$ = 8.3 ns, and the input $(1 - S^{2})$ = 0.3. In each subplot, the fitted correlation times (left) and amplitudes (right) are shown, as a function of an input correlation time (x-axis). In (A), the input correlation function has two correlation times (with equal amplitudes), with one fixed at 10 ps, and the other swept. In (B), the input correlation function has three correlation times, two fixed at 10 ps and 1 ns, and the third is swept. In (C), a log-Gaussian distribution of correlation times is used (μ = 100 ps, σ = 0.75 orders of magnitude), and a single correlation time is swept. Black dotted lines show the input correlation times (left plots).

To the best of our knowledge, the behavior of the fit parameters has no well-defined relationship to the distribution of correlation times, $(1 - S^{2}) θ (z)$ : if we know $(1 - S^{2}) θ (z)$ precisely, our only way to obtain the EMF parameters from it would be to explicitly calculate a set of relaxation rate constants, and then fit the results to Eq. 20. This is in sharp contrast to the original model-free parameters. Similar limitations arise for the EMF approach in solid-state NMR, as seen in Figure 5. Note that typical solution-state data sets are fairly continuous in their sensitivity to motion as a function of correlation time (Smith et al., 2019a), whereas solid-state NMR has a “blind-spot” in sensitivity centered around ∼100 ns (Schanda, 2019), which results in some of the more unusual behavior for EMF in solids (see Case 1: Extended Model-Free for a detailed discussion of the behavior of typical model-free parameters in solid-state NMR).


FIGURE 5. EMF parameters as a function of input correlation time (solid-state). For each plot, a data set is calculated, including direct measurement of S_resid. via residual couplings (Eq. 17), ¹⁵N T₁ at 400, 500, and 850 MHz, and T₂ with MAS of 60 kHz. The resulting rate constants are fitted using the EMF approach. For all plots, $(1 - S^{2})$ = 0.3. In each subplot, the fitted correlation times (left) and amplitudes (right) are shown, as a function of an input correlation time (x-axis). In (A), the input correlation function has two correlation times (with equal amplitudes), with one fixed at 3.2 ps, and the other swept. In (B), the input correlation function has three correlation times, two fixed at 3.2 ps and 32 ns, and the third is swept. In (C), a log-Gaussian distribution of correlation times is used (μ = 100 ps, σ = 0.75 orders of magnitude), and a single correlation time is swept. Black dotted lines show the input correlation times (left plots).

Spectral Density Mapping

In contrast to EMF, SDM is achieved by simple linear combination of sets of relaxation data at a single magnetic field (Peng and Wagner, 1992; Ishima et al., 1999). From a set of R₁, R₂, and NOE relaxation rate constants, one calculates

\begin{array}{l} J (0) = \frac{R_{2} - R_{1} / 2 - 0.454 σ_{IS}}{δ_{IS}^{2} / 2 + 2 {(Δ σ_{I} ω_{I})}^{2}} \\ J (ω_{I}) = \frac{R_{1} - 1.249 σ_{IS}}{3 {(δ_{IS} / 4)}^{2} + {(Δ σ_{I} ω_{I})}^{2} / 3} \\ J (0.870 ω_{S}) = 16 σ_{IS} / (5 δ_{IS}^{2}) \end{array} (21)

The above expressions yield very close approximations of the spectral density at specific frequencies: 0, $ω_{I}$ , and 0.870 $ω_{S}$ , where $ω_{I}$ is the Larmor frequency of the nuclear spin being relaxed, and $ω_{S}$ is the Larmor frequency of a spin that is dipole coupled to that spin (usually a directly bonded ¹H). Differences in the representations of the anisotropies ( $δ_{IS}$ , $Δ σ_{I} ω_{I}$ ) result in the different appearances of the normalization factors (denominators). These terms may be interpreted as being proportional to the amount of motion near the given frequency (which corresponds to the correlation time $τ = 1 / ω$ ), but otherwise they do not provide a more physical interpretation of the motion. One may subsequently fit the spectral densities to model-free parameters for better interpretation (Gill et al., 2016). If we have a precise description of the motion (e.g., $(1 - S^{2}) θ (z)$ ), the terms $J (ω)$ are easily obtained:

J (ω) = \frac{2}{5} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \frac{10^{z} \cdot 1 s}{1 + {(ω \cdot 10^{z} \cdot 1 s)}^{2}} d z (22)

The parameters resulting from SDM always behave the same way in response to a given correlation time, regardless of other correlation times present; this is a consequence of the properties of linearity discussed in A Few Notes on Linearity. This is seen in Figure 6A, where we calculate relaxation rate constants resulting from a single correlation time and analyze with SDM. In Figure 6B, we split motion over two correlation times, and observe how the terms respond to sweeping one of them, and in Figure 6C, we split motion into a distribution and a single, swept correlation time and determine how the terms respond to the swept correlation time. The result is always identical (scaling by 0.5 results from dividing the total amplitude into two parts), a very useful property that arises when data are analyzed strictly by linear combination. Unlike EMF analysis, behavior of SDM is independent of the form of the distribution of correlation times.


FIGURE 6. Behavior of SDM as a function of correlation time. In each subplot, we calculate ¹⁵N T₁, T₂, and σ_NH at 600 MHz, and analyze the results using Eq. 21. In (A), the input total correlation function consists of a single decaying exponential term (with amplitude 1), where the terms $J (ω)$ are plotted as the correlation time is varied (results are normalized). Black dotted lines show the spectral densities, $J (0)$ , $J (ω_{I})$ , $J (0.870 ω_{S})$ , calculated with Eq. 22, and colored lines show the results of the data analysis, yielding an almost exact correspondence. In (B), the total correlation function now uses two correlation times (equal amplitudes), with one fixed at 10 ps, and the second swept (x-axis). On the y-axis, we plot contribution to the terms, $Δ J (ω)$ , from the correlation time being varied. The resulting behavior is identical to that in (A), except that the amplitude is half as large, since we have split the total amplitude between the fixed and variable correlation time (dashed line marks 0.5). In (C), the same information is plotted, but the total correlation function includes a log-Gaussian distribution (μ = 630 ps, σ = 1 order of magnitude), and a single, variable correlation time.

Note that this approach describes the total motion, and does not separate out tumbling from internal motion in the case of solution-state NMR, which has an especially strong influence on $J (0)$ . The original approach only incorporates data from one field, whereas later work has extended the method to include data from more than one field, although one still requires specific sets of experiments (Skrynnikov et al., 2002; Hsu et al., 2018).

LeMaster’s Approach

LeMaster proposed an alternative to SDM analysis of R₁, R₂, and NOE data from one field, in order to separate overall tumbling from internal motion (LeMaster, 1995). In this case, LeMaster proposed fitting data to the following correlation function:

\begin{array}{l} C (t) = S_{f}^{2} S_{H}^{2} S_{N}^{2} e^{- t / τ_{M}} + S_{f}^{2} (1 - S_{H}^{2}) e^{- t / τ_{H}} + S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2}) e^{- t / τ_{N}} + (1 - S_{f}^{2}) e^{- t / τ_{f}} \\ τ_{H} = {(ω_{H} + ω_{N})}^{- 1}, τ_{N} = | ω_{N} |^{- 1} \end{array} (23)

It is assumed that $τ_{f}$ is very short so that the term $(1 - S_{f}^{2}) e^{- t / τ_{f}}$ makes only negligible contributions to the spectral density, resulting in the following formula:

\begin{array}{l} J (ω) = \frac{2}{5} S_{f}^{2} [S_{H}^{2} S_{N}^{2} \frac{τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S_{H}^{2}) \frac{τ_{H}}{1 + {(ω τ_{H})}^{2}} + S_{H}^{2} (1 - S_{N}^{2}) \frac{τ_{N}}{1 + {(ω τ_{N})}^{2}}] \\ = \frac{2}{5} [\frac{τ_{M}}{1 + {(ω τ_{M})}^{2}} + (1 - S_{f}^{2}) (- \frac{τ_{M}}{1 + {(ω τ_{M})}^{2}}) + \\ S_{f}^{2} (1 - S_{H}^{2}) (\frac{τ_{H}}{1 + {(ω τ_{H})}^{2}} - \frac{τ_{M}}{1 + {(ω τ_{M})}^{2}}) + S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2}) (\frac{τ_{N}}{1 + {(ω τ_{N})}^{2}} - \frac{τ_{M}}{1 + {(ω τ_{M})}^{2}})] \end{array} (24)

In the latter formulation, we find that the spectral density becomes a linear combination of terms, weighted by $(1 - S_{f}^{2})$ , $S_{f}^{2} (1 - S_{H}^{2})$ , and $S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2})$ . Then, one must fit these terms to the experimental relaxation rate constants. We do so in Figure 7 for calculated relaxation rate constants. Like SDM, responses as a function of correlation time are always identical (again, excepting a scaling factor of 0.5 resulting from splitting the total motion into components), although the functions themselves are different: this results from the fact that LeMaster’s approach characterizes the internal motion, and not the total motion, so that we obtain one amplitude, $(1 - S_{f}^{2})$ , which captures information about the fastest correlation times (<30 ps), one amplitude, $S_{f}^{2} (1 - S_{H}^{2})$ , which captures information for correlation times near to $τ_{H}$ , and one amplitude, $S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2})$ , which captures information for correlation times near to $τ_{N}$ .


FIGURE 7. Behavior of LeMaster’s approach as a function of correlation time. In each subplot, we calculate ¹⁵N T₁, T₂, and σ_NH at 600 MHz for motion with $(1 - S^{2})$ = 0.3 and tumbling correlation time of $τ_{M}$ = 4 ns, and analyze the results using Eq. 24. In (A) the internal correlation function consists of a single decaying exponential term (with amplitude 0.3), where the fitted amplitudes are plotted as the correlation time is varied. In (B) the internal correlation function uses two correlation times (both amplitudes are 0.15), with one correlation time fixed at 10 ps, and the second swept (x-axis). On the y-axis, we plot contributions to the terms from the correlation time being varied. The resulting behavior is identical to that in (A), except that the amplitude is half, since we have split the total amplitude between the fixed and variable correlation time (dashed line marks 0.15). In (C), the same information is plotted, but the total correlation function includes a log-Gaussian distribution (μ = 630 ps, σ = 1 order of magnitude), and a single, variable correlation time.

LeMaster’s approach is a linear fit, without priors; as discussed in A Few Notes on Linearity, this means that the fitted parameters may also be obtained by a linear combination of the experimental relaxation rate constants. Therefore, the parameters $(1 - S_{f}^{2})$ , $S_{f}^{2} (1 - S_{H}^{2})$ , and $S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2})$ are linear to $(1 - S^{2}) θ (z)$ . The parameters $S_{H}^{2}$ and $S_{N}^{2}$ themselves are not linear to $(1 - S^{2}) θ (z)$ , but may be obtained by simple arithmetic from the linear parameters. Like SDM, LeMaster’s approach is limited to data acquired at a single field.
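The linear structure of the second form of Eq. 24 can be illustrated with a small linear solve. This sketch makes several assumptions that are ours alone: spectral density values at three frequencies are taken as already available (in practice one fits relaxation rate constants, which are themselves linear in $J (ω)$ ), the ¹⁵N Larmor frequency is approximated as −0.1013 times that of ¹H, and all function names are hypothetical.

```python
import numpy as np

def lorentz(omega, tau):
    return tau / (1.0 + (omega * tau) ** 2)

def lemaster_basis(omegas, tau_M, tau_H, tau_N):
    """Columns multiply (1-Sf2), Sf2(1-SH2), Sf2*SH2*(1-SN2) in Eq. 24."""
    w = np.asarray(omegas, dtype=float)
    rigid = lorentz(w, tau_M)
    return 0.4 * np.column_stack([-rigid,
                                  lorentz(w, tau_H) - rigid,
                                  lorentz(w, tau_N) - rigid])

def lemaster_fit(J, omegas, tau_M, tau_H, tau_N):
    """Solve for the three linear amplitudes, then apply the arithmetic
    that converts them into Sf2, SH2, SN2."""
    w = np.asarray(omegas, dtype=float)
    B = lemaster_basis(w, tau_M, tau_H, tau_N)
    amps = np.linalg.solve(B, np.asarray(J, dtype=float) - 0.4 * lorentz(w, tau_M))
    Sf2 = 1.0 - amps[0]
    SH2 = 1.0 - amps[1] / Sf2
    SN2 = 1.0 - amps[2] / (Sf2 * SH2)
    return amps, (Sf2, SH2, SN2)

# Assumed conditions: 600 MHz 1H frequency, tau_M = 4 ns
omega_H = 2 * np.pi * 600e6
omega_N = -0.1013 * omega_H                  # approximate; negative gyromagnetic ratio
tau_M, tau_H, tau_N = 4e-9, 1.0 / (omega_H + omega_N), 1.0 / abs(omega_N)
omegas = np.array([0.0, abs(omega_N), 0.870 * omega_H])

# Synthetic input from the first form of Eq. 24 with known order parameters
Sf2_in, SH2_in, SN2_in = 0.9, 0.85, 0.95
J_in = 0.4 * Sf2_in * (SH2_in * SN2_in * lorentz(omegas, tau_M)
                       + (1 - SH2_in) * lorentz(omegas, tau_H)
                       + SH2_in * (1 - SN2_in) * lorentz(omegas, tau_N))
amps, (Sf2, SH2, SN2) = lemaster_fit(J_in, omegas, tau_M, tau_H, tau_N)
```

Because the solve involves no bounds or priors, the amplitudes are a fixed linear combination of the inputs, while $S_{H}^{2}$ and $S_{N}^{2}$ follow from division.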

Interpretation of Motions by a Projection onto an Array of Correlation Times Approach

Limitations of the approaches above have led Ferrage and coworkers to develop the interpretation of motions by a projection onto an array of correlation times (IMPACT) approach (Khan et al., 2015), which was applied to a protein with intrinsically disordered regions (IDR). A challenge of IDRs is that the lack of structure potentially yields a large number of distinct motions and therefore many correlation times, so that the EMF approach is not appropriate for data analysis, while the limited number of parameters obtained with SDM fails to provide a complete description of the dynamics. Then, the IMPACT approach allows analysis of large, multi-field data sets, by taking the total correlation function to be a sum of several fixed correlation times, $τ_{k}$ , such that

C (t) = \sum_{k} A_{k} e^{- t / τ_{k}} (25)

Because $C (t)$ equals 1 at $t = 0$ and decays to 0, the $A_{k}$ must sum to 1. For the Engrailed 2 protein, ¹⁵N T₁, NOE ( $σ_{NH}$ ), and transverse and longitudinal cross-relaxation rate constants at five fields (400, 500, 600, 800, 1,000 MHz) could be fit to an array of six correlation times, log-spaced between 21 ps and 21 ns. When fitting to Eq. 25, one restricts the amplitudes to remain between zero and one, and the sum of amplitudes must be set to one.
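The fixed array and the forward model of Eq. 25 can be set up in a few lines. This is a sketch only: the function names are hypothetical, the spectral density uses the 2/5 normalization of Eq. 18, and the bounded, sum-constrained fit of the $A_{k}$ is not implemented here.

```python
import numpy as np

# Six correlation times, log-spaced between 21 ps and 21 ns
tau_k = np.logspace(np.log10(21e-12), np.log10(21e-9), 6)

def impact_correlation(t, A, tau_k):
    """Total correlation function of Eq. 25 on a grid of times t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.exp(-t[:, None] / tau_k[None, :]) @ np.asarray(A, dtype=float)

def impact_spectral_density(omega, A, tau_k):
    """Sum of Lorentzians, one per fixed correlation time."""
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    L = tau_k[None, :] / (1.0 + (omega[:, None] * tau_k[None, :]) ** 2)
    return 0.4 * (L @ np.asarray(A, dtype=float))

# Hypothetical amplitudes: equal weights that sum to one
A = np.full(6, 1.0 / 6.0)
C0 = impact_correlation(0.0, A, tau_k)[0]
J0 = impact_spectral_density(0.0, A, tau_k)[0]
```

Adjacent entries of the array differ by a constant factor of $10^{0.6}$ , roughly 4, and the spectral density is linear in the $A_{k}$ ; it is only the bounds and the sum constraint that make the fit depart from a pure linear combination of the data.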

Following our procedures for SDM and LeMaster’s approach, we also examine the behavior of the IMPACT approach in Figure 8. When fitting a correlation function having a single correlation time in Figure 8A, we obtain ideal behavior from the IMPACT approach. When the input correlation time matches one of the correlation times in the IMPACT array, the corresponding amplitude is one, and all other amplitudes are zero. When the input correlation time is in between correlation times in the IMPACT array, then only the two nearest correlation times to the input value have non-zero amplitudes, and those two amplitudes sum to one (a minor deviation from this behavior occurs at 10 ns). However, if we input two correlation times in Figure 8B, or one correlation time and one distribution in Figure 8C, with motion split equally between the two correlation times or correlation time and distribution, the fit parameters’ response to the swept correlation time is not an exact reproduction of the behavior in Figure 8A, in contrast to SDM and LeMaster’s approach. While SDM and LeMaster’s approach are both linear combinations of relaxation rate constants, IMPACT is a linear fit whose behavior depends heavily on restricting the values of the fit parameters (priors), which, as discussed in A Few Notes on Linearity, means that the fit parameters are no longer linear to $(1 - S^{2}) θ (z)$ . The result is that the response of the parameters $A_{k}$ to a given correlation time does depend weakly on other motions present, thus not fully obtaining the ideal, linear behavior of SDM and LeMaster’s approach. However, IMPACT provides a good approximation to this behavior, and is more generally applicable than SDM and LeMaster’s approach.


FIGURE 8. Behavior of the IMPACT approach as a function of correlation time. In each plot, we calculate relaxation rate constants and fit the amplitudes in Eq. 25 according to the IMPACT procedure, using the set of experiments from Khan et al. (2015). In (A), the input total correlation function consists of a single decaying exponential term (with amplitude 1), where the amplitudes are plotted as the correlation time is varied. In (B), the total correlation function uses two correlation times (equal amplitudes), with one fixed at 1 ns, and the second swept (x-axis). On the y-axis, we plot contributions to the $A_{k}$ from the correlation time being varied. In (C), the same information is plotted, but the total correlation function includes a log-Gaussian distribution (μ = 630 ps, σ = 1 order of magnitude), and a single, variable correlation time.

IMPACT has not been developed for application to solid-state NMR, but it is worth investigating how such a method could work. In Figure 9A, we use an IMPACT-type approach to fitting R₁ at three fields and S², using an array of three correlation times. We restrict the fitted amplitudes ( $A_{k}$ ) to fall between zero and one, but it does not make sense to require the $A_{k}$ to sum to one, since the correlation function in solid-state NMR does not usually decay to zero. Here, we assume a motion with just one correlation time, and $(1 - S^{2})$ = 0.3. Then, we find that IMPACT in solids is similar to its solution-state behavior. Note that the amplitudes corresponding to 1.4 and 5 ns capture motion near those correlation times, whereas the amplitude corresponding to 1 ps captures all motion not in proximity to 1.4 and 5 ns, including very slow motions. As with solution-state NMR, if we split the motion over two correlation times, and determine the response to one of the two correlation times (Figure 9B), the response changes compared to fitting just the single correlation time. However, as discussed in A Few Notes on Linearity, and demonstrated with SDM and LeMaster’s approach, this dependence on other motions present vanishes if we eliminate restrictions on the fit parameters. Then, in Figure 9C, we repeat the fit from Figure 9A, without restrictions on the fit parameters, yielding reasonable behavior, excepting some negative amplitudes in the $A_{k}$ . Figure 9D shows similar results, when fitting S² and R_1ρ, although the fitted correlation times must be in the sensitive range of the R_1ρ rate constants. Unfortunately, when we attempt to fit R₁ and R_1ρ simultaneously in Figure 9E, using the same correlation times as in Figures 9C,D, we find extremely unstable behavior. Apparently, we cannot simultaneously fit data on both sides of the solid-state NMR blind spot.


FIGURE 9. IMPACT behavior in solids. In each plot, we test the behavior of the amplitudes, $A_{k}$ , using calculated solid-state NMR data (S², ¹⁵N R₁, ¹⁵N R_1ρ, with experimental conditions taken from Smith et al. (2016)). (A) plots the behavior of fitting S² and three R₁ rate constants to three correlation times (1 ps, 1.4 ns, 5 ns), where the input correlation function has a single correlation time ( $(1 - S^{2})$ =0.3), while restricting the $A_{k}$ to fall between 0 and 1. (B) shows fits under the same conditions, but includes two correlation times, with one fixed at 1 ns, and the other swept (x-axis). The y-axis plots the change in the $A_{k}$ due to the swept correlation time. (C) shows fits under the same conditions as (A), without restricting the values of the $A_{k}$ . (D) also removes restrictions on the $A_{k}$ , but fits S² and R_1ρ data, using correlation times of 1 ps, 2.5 μs, and 17.8 μs. (E) fits all data (S², R₁, R_1ρ) simultaneously without restrictions on the $A_{k}$ , with correlation times of 1 ps, 1.4 ns, 5 ns, 2.5 μs, and 17.8 μs. (F) fits R₁ data, but uses one very short correlation time (32 ps), and one very long correlation time (100 ns).

In Figures 9C,D, we have fairly good performance, excepting that some of the amplitudes become slightly negative. Interestingly, these negative amplitudes may be eliminated by placing two correlation times further away from each other. Then, in Figure 9F, we fit only R₁ data, using correlation times of 32 ps and 100 ns. The fitted correlation times no longer correspond to the center of the sensitive range of the $A_{k}$ (750 ps, 6.2 ns), and the amplitudes also far exceed the input value for $(1 - S^{2})$ . Fitting while also including $S^{2}$ data allows using an additional correlation time (1 ps), but the corresponding $A_{k}$ becomes large and negative (not shown). From this final result, we could simply renormalize the amplitudes to have a maximum of one, and report the center of the sensitive range instead of the correlation times to which we actually fitted. The result would still be a linear combination of the experimental data, and therefore linear to $(1 - S^{2}) θ (z)$ , but the result would have very little to do with the correlation times chosen to obtain that linear combination.

A New Approach for Solid-State Nuclear Magnetic Resonance

In the previous section, we investigated the behavior of a number of approaches to processing relaxation data. Of those approaches, model-free, SDM, and LeMaster’s approach provide parameters which are linear to the distribution of correlation times, $(1 - S^{2}) θ (z)$ (in some cases, some additional arithmetic operations are required to obtain the reported parameters, e.g., $〈 τ_{e} 〉$ is calculated from $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ ). IMPACT approximates this behavior, although heavy reliance on priors prevents perfect linearity. However, each approach is limited in its application to solid-state NMR data. Therefore, we have developed the detector analysis (Smith et al., 2018), which is a general method for processing relaxation data that maintains a linear relationship between fit parameters and the distribution of correlation times.

Linear Combination of Data

As we have emphasized for the above examples, one may obtain parameters that have a well-defined (linear) relationship to the distribution of correlation times by taking linear combinations of relaxation rate constants. Thus far, we have limited ourselves to very specific linear combinations: combinations that yield the spectral density, or combinations that are related to specific correlation times. However, why shouldn’t we use any linear combination that is optimized to give an ideal linear relationship to the distribution of correlation times, $(1 - S^{2}) θ (z)$ ? We first recall that the correlation function has been defined here as being a linear combination of decaying exponentials, defined by $(1 - S^{2}) θ (z)$ , and its Fourier transform (also a series of linear combinations) must then also be linear to $(1 - S^{2}) θ (z)$ .

\begin{array}{l} C (t) = S^{2} + (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) e^{- t / (10^{z} \cdot 1 s)} d z \\ J^{(θ, S)} (ω) = \frac{2}{5} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \frac{10^{z} \cdot 1 s}{1 + {(ω \cdot 10^{z} \cdot 1 s)}^{2}} d z \end{array} (26)

Here, we take $J^{(θ, S)} (ω)$ to be the spectral density resulting from $(1 - S^{2}) θ (z)$ . Then, any relaxation rate constant is a weighted sum of terms from the spectral density.

\begin{array}{l} R_{ζ}^{(θ, S)} = \sum_{p} a_{p}^{ζ} J^{(θ, S)} (ω_{p}) \\ = \sum_{p} a_{p}^{ζ} \frac{2}{5} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \frac{10^{z} \cdot 1 s}{1 + {(ω_{p} \cdot 10^{z} \cdot 1 s)}^{2}} d z \\ = (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \underbrace{\sum_{p} a_{p}^{ζ} \frac{2}{5} \frac{10^{z} \cdot 1 s}{1 + {(ω_{p} \cdot 10^{z} \cdot 1 s)}^{2}}}_{= R_{ζ} (z)} d z \\ R_{ζ}^{(θ, S)} = (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) R_{ζ} (z) d z \end{array} (27)

$R_{ζ}^{(θ, S)}$ is the relaxation rate constant for an experiment, indexed ζ, resulting from the distribution of correlation times, $(1 - S^{2}) θ (z)$ . Coefficients $a_{p}^{ζ}$ indicate the weightings of the spectral density for experiment ζ, sampled at frequencies $ω_{p}$ . Insertion of $J^{(θ, S)} (ω)$ into this linear combination allows us to express $R_{ζ}^{(θ, S)}$ as a linear function of $(1 - S^{2}) θ (z)$ , where $R_{ζ} (z)$ defines the linear relationship (we refer to this as the sensitivity).
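The sensitivity $R_{ζ} (z)$ is straightforward to tabulate for any experiment once its coefficients and frequencies are known. In the sketch below the coefficients, frequencies, and function names are illustrative assumptions (an R₁-like combination with an arbitrary overall prefactor), and the distribution is taken to be a discrete set of correlation times so that the integral in Eq. 27 becomes a sum.

```python
import numpy as np

def sensitivity(z, a, omegas):
    """R_zeta(z) = sum_p a_p * (2/5) * tau / (1 + (omega_p * tau)^2), tau = 10^z s.
    The 2/5 prefactor of the spectral density (Eq. 26) is folded in here."""
    tau = 10.0 ** np.asarray(z, dtype=float)[..., None]
    a = np.asarray(a, dtype=float)
    w = np.asarray(omegas, dtype=float)
    return np.sum(a * 0.4 * tau / (1.0 + (w * tau) ** 2), axis=-1)

def rate_from_discrete(amplitudes, z_values, a, omegas):
    """Rate constant for a discrete distribution: sum_i A_i * R_zeta(z_i)."""
    return float(np.dot(amplitudes, sensitivity(z_values, a, omegas)))

# Hypothetical R1-like experiment; coefficients are in arbitrary units
omega_H = 2 * np.pi * 600e6
omega_N = 0.1013 * omega_H        # magnitude only, approximate
a = [1.0, 3.0, 6.0]
omegas = [omega_H - omega_N, omega_N, omega_H + omega_N]

# Motion split equally (amplitude 0.15 each) over 10 ps and 1 ns
R_total = rate_from_discrete([0.15, 0.15], [-11.0, -9.0], a, omegas)
```

Each component of the distribution contributes its amplitude times the sensitivity at its own correlation time, independently of the other components.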

Then, as is the case for model-free, SDM, and LeMaster’s approach, any weighted sum of relaxation rate constants maintains linearity. Following our previous convention (Smith et al., 2018), we denote the sum as $ρ_{n}^{(θ, S)}$ .

\begin{array}{l} ρ_{n}^{(θ, S)} = \sum_{ζ} b_{ζ} R_{ζ}^{(θ, S)} \\ = \sum_{ζ} b_{ζ} (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) R_{ζ} (z) d z \\ = (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \underbrace{\sum_{ζ} b_{ζ} R_{ζ} (z)}_{= ρ_{n} (z)} d z \\ ρ_{n}^{(θ, S)} = (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) ρ_{n} (z) d z \end{array} (28)

Then, $ρ_{n} (z)$ defines the linear relationship between $(1 - S^{2}) θ (z)$ and $ρ_{n}^{(θ, S)}$ . The subsequent question is, how do we find the best linear combinations of the experimental relaxation rate constants for analyzing our relaxation data?
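The practical consequence of Eq. 28 is that the response of a linear combination to one correlation time does not depend on what other motion is present. The sketch below checks this numerically; the three single-frequency experiments, the coefficients $b_{ζ}$ , and the function names are arbitrary assumptions chosen for illustration and are not optimized detectors.

```python
import numpy as np

def sensitivities(z, experiments):
    """Stack R_zeta(z) for each experiment; rows index zeta, columns index z."""
    tau = 10.0 ** np.atleast_1d(np.asarray(z, dtype=float))
    rows = []
    for a, omegas in experiments:
        a = np.asarray(a, dtype=float)[:, None]
        w = np.asarray(omegas, dtype=float)[:, None]
        rows.append(np.sum(a * 0.4 * tau / (1.0 + (w * tau) ** 2), axis=0))
    return np.vstack(rows)

def detector_response(b, amplitudes, z_values, experiments):
    """rho_n^(theta,S) for a discrete distribution: sum_i A_i * rho_n(z_i)."""
    rho_z = np.asarray(b, dtype=float) @ sensitivities(z_values, experiments)
    return float(np.dot(amplitudes, rho_z))

# Hypothetical experiments: one spectral density term each, three frequencies
experiments = [([1.0], [2 * np.pi * 40e6]),
               ([1.0], [2 * np.pi * 50e6]),
               ([1.0], [2 * np.pi * 85e6])]
b = [1.0, -0.5, 0.25]             # arbitrary linear combination

alone = detector_response(b, [0.15], [-9.0], experiments)
fixed_only = detector_response(b, [0.15], [-11.0], experiments)
with_fixed = detector_response(b, [0.15, 0.15], [-11.0, -9.0], experiments)
delta = with_fixed - fixed_only   # contribution of the 1 ns component
```

Here `delta` matches `alone`: adding a fixed 10 ps motion leaves the contribution of the 1 ns motion unchanged, which is the behavior seen for SDM and LeMaster’s approach.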

Optimizing Detectors: The Relaxation-Rate Space Approach

While any linear combination of experimental relaxation rate constants yields a linear relationship between $(1 - S^{2}) θ (z)$ and the resulting $ρ_{n}^{(θ, S)}$ , not all combinations are equally good choices. A few guidelines are, first, non-negativity of $ρ_{n} (z)$ ; we would like $ρ_{n}^{(θ, S)}$ to always increase when amplitude of motion increases, whereas negative regions of $ρ_{n} (z)$ could cause $ρ_{n}^{(θ, S)}$ to decrease with increasing amplitudes. Second, narrowness: we would like each $ρ_{n}^{(θ, S)}$ to report on a specific range of correlation times. Third, when the full set of relaxation data is analyzed, one should be able to back-calculate the experimental data (within some tolerance) from the parameters $ρ_{n}^{(θ, S)}$ . This ensures that one captures all information in the experimental data (clearly, if the $ρ_{n}^{(θ, S)}$ can reproduce the experimental data, then the $ρ_{n}^{(θ, S)}$ must have retained the information in the experiments).

The question, then, is how to obtain optimized linear combinations satisfying the above requirements. Our initial answer to this question is the result of identifying a similar problem in a completely different field: When one sees the color of an object, its appearance depends on the distribution of wavelengths reflected (or emitted) by the object. The distribution of wavelengths is given by the spectral power distribution, $S (λ)$ . Whereas $S (λ)$ is an infinite-dimensional description of the spectral power vs. wavelength, what is “seen” is a projection of that distribution onto a three dimensional space, corresponding to the three cones that detect color in the eye. This 3D space is often described using the CIE (Commission internationale de l'Eclairage) XYZ color space (Smith and Guild, 1931; Judd, 1951; Vos, 1978).

\begin{array}{l} X = \int_{0}^{\infty} S (λ) \bar{x} (λ) d λ \\ Y = \int_{0}^{\infty} S (λ) \bar{y} (λ) d λ \\ Z = \int_{0}^{\infty} S (λ) \bar{z} (λ) d λ \end{array} (29)

The functions $\bar{x} (λ)$ , $\bar{y} (λ)$ , and $\bar{z} (λ)$ are plotted in Figure 10B. Based on the color one sees, one cannot define $S (λ)$ precisely, but certainly we learn something about the distribution of wavelengths. In the same way, based on a set of relaxation rate constants, we cannot fully define $(1 - S^{2}) θ (z)$ , but certainly we can learn something about the dynamics. The matching forms of Eqs. 27, 29 further highlight the relationship between these problems.

FIGURE 10. Similarity between the CIE XYZ colorspace and the relaxation rate constant space. (A) plots the XYZ colorspace, black lines indicate where single wavelengths fall in the colorspace (z not shown, space is normalized such that x + y + z = 1). Points connected by a triangle indicate the definition of red, green, and blue colors as defined by the sRGB standard (Anderson et al., 1996). (B) plots the sensitivity of the $\bar{x} (λ)$ , $\bar{y} (λ)$ , and $\bar{z} (λ)$ color matching functions as a function of wavelength (λ). (C) plots sRGB sensitivities resulting from transformation from the XYZ to sRGB spaces. Points connected by triangles correspond to definitions of ${\vec{r}}_{1}$ , ${\vec{r}}_{2}$ , and ${\vec{r}}_{3}$ that define the detector space. (D) shows the normalized relaxation rate space for ¹³C R₁ at 300 and 800 MHz and H–C NOE at 800 MHz. (E) shows the sensitivities of each of these experiments as a function of correlation time. (F) shows detector sensitivities resulting from transformation from the relaxation rate constant space to detector space (defined by the points in (D)).

The XYZ color space can be represented as a 2D space, shown in Figure 10A. Only x and y are shown, and z is selected so that $x + y + z = 1$ (then, a third dimension would vary this sum, corresponding to brightness). By marking points in the color space, one can indicate how the color space may be represented in another basis. Here, we have marked points corresponding to red, green, and blue of the sRGB standard (Anderson et al., 1996). Colors within the resulting triangle may be obtained with positive linear combinations of the red, green, and blue of sRGB, so that this triangle is a good estimate of colors that may be obtained with a color monitor (which creates color by combining red, green and blue pixels–this means that in Figure 10A, colors outside the triangle are not correctly represented on your screen). These points also define a transformation from the XYZ color matching functions (Figure 10B) to the sRGB functions (Figure 10C). Note that any color may be represented in the sRGB space, but only those where $S (λ)$ results in positive R, G, and B values can actually be reproduced by a typical monitor.
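The statement that colors inside the triangle correspond to positive combinations of the primaries can be illustrated with a small linear solve. This is a sketch only: the chromaticity coordinates of the sRGB primaries and of the D65 white point are the standard published values (they are not listed in the text), the "outside" test point is an arbitrary choice, and the helper names are hypothetical.

```python
import numpy as np

# Standard sRGB primary chromaticities (x, y); not given in the text
primaries_xy = {"R": (0.64, 0.33), "G": (0.30, 0.60), "B": (0.15, 0.06)}

def to_xyz(xy):
    """Normalized coordinates with x + y + z = 1."""
    x, y = xy
    return np.array([x, y, 1.0 - x - y])

# Columns of P are the normalized XYZ coordinates of the three primaries
P = np.column_stack([to_xyz(primaries_xy[name]) for name in "RGB"])

def primary_weights(xy):
    """Weights w such that P @ w reproduces the normalized color."""
    return np.linalg.solve(P, to_xyz(xy))

w_inside = primary_weights((0.3127, 0.3290))  # D65 white, inside the triangle
w_outside = primary_weights((0.05, 0.80))     # arbitrary point above the green primary

print("inside :", w_inside)
print("outside:", w_outside)
```

A point inside the triangle returns three positive weights, whereas a point outside returns at least one negative weight, so it cannot be produced by adding light from the three primaries. The same test of positivity is what motivates surrounding, rather than sampling, the normalized relaxation rate space.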

Realizing that the mathematics of relaxation rate constants was essentially equivalent to color spaces, we created analogous relaxation rate constant spaces, replacing the X, Y, and Z values with normalized rate constants. However, instead of placing points within the relaxation rate space, we surrounded the space in Figure 10D, since we wanted to describe all points in the space with positive parameters. Interestingly, by surrounding the space as closely as possible, without crossing into the space, we obtained a transformation to functions with well-separated and non-negative sensitivities, see Figure 10F. In the example here, we use three points to transform the three experimental sensitivities into detector sensitivities, resulting in three detectors. However, redundancy in the information of larger data sets often results in the space becoming narrow in a given dimension, so that the full space may also be approximately described using fewer points, resulting in fewer detectors than experimental data points, but better signal-to-noise in the resulting parameters. Full details of this approach are described in Smith et al. (2018).

Optimizing Detector Sensitivities: Automated Approach

Investigating the relaxation rate space is a powerful way to grasp the information content of a relaxation data set; however, detector optimization using this method requires manual selection of points in the space. This quickly becomes excessively tedious for large data sets, as is the case for analysis of relaxometry data (Smith A. A. et al., 2021), so we have also automated the optimization of the linear combinations (Smith et al., 2019a).

For automation, we still have two requirements: first, that we capture the information in the experiments (that is, we can fit the data) while minimizing the number of parameters needed to describe that data, and second, that we obtain detector sensitivities that are narrow and non-negative. The first requirement may be met using singular value decomposition (Golub and Kahan, 1965). Suppose we have a matrix, M, for which each row is a sensitivity of one of our experiments ( $R_{ζ} (z)$ ), where we perform a normalization to prioritize fitting of higher quality data (procedure: first, we normalize all sensitivities to a maximum of one; second, we multiply the sensitivity by the median of the experimental rate constants; and third, we divide by the median standard deviation of those rate constants). Each column then corresponds to a correlation time. For N experiments, we obtain the best approximation of M that can be achieved with a linear combination of t vectors (t ≤ N), defined by

\begin{array}{l} M \approx \tilde{M} = U_{t} \cdot Σ_{t} \cdot V_{t}^{'} \\ V_{t}^{'} = Σ_{t}^{- 1} \cdot U_{t}^{'} \cdot M \end{array} (30)

The t rows of $V_{t}^{'}$ are linear combinations of the rows of M, with recombination defined by the product $Σ_{t}^{- 1} \cdot U_{t}^{'} \cdot M$ ( $U_{t}$ and $V_{t}^{'}$ have orthonormal columns and rows, respectively, and $Σ_{t}$ is diagonal, with the largest t singular values along the diagonal). Linear combination of the rows of $V_{t}^{'}$ to yield the rows of M is an approximate relationship, but the inverse, recombination of the rows of M to yield $V_{t}^{'}$ , is exact. Then, the closer $\tilde{M}$ is to M, the better the data can be fit, but this requires t to be larger, and thus more noise is also present in the final analysis. In principle, this linear recombination could be directly applied to the experimental data, to obtain detectors with sensitivities given by the rows of $V_{t}^{'}$ . The result would capture (approximately) the maximum amount of information possible from the experiment with t parameters. However, the sensitivities found in the rows of $V_{t}^{'}$ are not narrow, and usually have large negative regions. On the other hand, a linear recombination of the vectors in $V_{t}^{'}$ would maintain the information content and fit quality, while allowing one to optimize the detector sensitivities to be separated and non-negative.
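The two relations in Eq. 30 can be reproduced with a few lines of NumPy. The following is a minimal sketch under stated assumptions: the sensitivities are toy Gaussians in z, and the per-experiment medians used for the normalization (`median_rate`, `median_sigma`) are made-up numbers that only illustrate the three normalization steps described above.

```python
import numpy as np

z = np.linspace(-14, -3, 200)   # z = log10(tau_c / s)

def toy_sensitivity(center, width=1.2):
    """Hypothetical sensitivity R_zeta(z); not a real rate formula."""
    return np.exp(-0.5 * ((z - center) / width) ** 2)

centers = (-11.0, -10.5, -9.0, -8.5, -7.0, -6.5)
R = np.vstack([toy_sensitivity(c) for c in centers])   # N experiments x n_z

# Normalization: maximum of one, times median rate, divided by median std
R = R / R.max(axis=1, keepdims=True)
median_rate = np.array([1.0, 1.2, 3.0, 3.5, 8.0, 9.0])      # made-up values
median_sigma = np.array([0.05, 0.05, 0.1, 0.1, 0.4, 0.4])   # made-up values
M = R * (median_rate / median_sigma)[:, None]

# Full (thin) SVD, then keep the t largest singular values
U, s, Vt = np.linalg.svd(M, full_matrices=False)
t = 3
U_t, S_t, Vt_t = U[:, :t], np.diag(s[:t]), Vt[:t, :]

# First line of Eq. 30: approximate reconstruction of M
M_approx = U_t @ S_t @ Vt_t
fit_error = np.linalg.norm(M - M_approx)

# Second line of Eq. 30: rows of V_t' as exact recombinations of rows of M
Vt_from_M = np.linalg.inv(S_t) @ U_t.T @ M

print("approximation error:", fit_error)
print("recombination exact:", np.allclose(Vt_from_M, Vt_t))
```

Increasing `t` reduces `fit_error` (it equals the root sum of squares of the discarded singular values), at the cost of more fitted parameters.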

[\begin{matrix} ρ_{1} (z) \\ ρ_{2} (z) \\ ⋮ \end{matrix}] = T \cdot V_{t}^{'} = T \cdot Σ_{t}^{- 1} \cdot U_{t}^{'} \cdot M (31)

Then, T defines the linear recombination of the $V_{t}^{'}$ to yield the $ρ_{n} (z)$ , where T is a square matrix. The product of a row of T with $V_{t}^{'}$ defines one of the detector sensitivities, $ρ_{n} (z)$ . A row of T is determined in order to optimize a detector sensitivity, first by choosing a single correlation time, $z_{max} = \log_{10} (τ_{c} / s)$ , for which we optimize a linear combination of the rows of $V_{t}^{'}$ such that $ρ_{n} (z_{max})$ = 1, while simultaneously minimizing $ρ_{n} (z)$ for all other correlation times, and requiring that all $ρ_{n} (z)$ remain non-negative. This can be quickly solved using a linear programming algorithm (Kantorovich, 1960; Dantzig, 1982; Virtanen, 2020). However, if we sweep through an array of correlation times, performing this optimization at each correlation time, we find that we are only successful at t correlation times (we consider the minimization as having failed if, for some z, we find that $ρ_{n} (z)$ exceeds 1). Currently, we find the best t detectors by sweeping over a large array of correlation times (200), although this algorithm could be improved to reduce the number of optimizations required (the spaces method and the automated method are both implemented in MATLAB, available for download from https://difrate.sourceforge.io).
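A single optimization of this kind can be sketched with SciPy's `linprog`. This is not the MATLAB implementation referred to above, only an illustration under stated assumptions: toy Gaussian sensitivities, t equal to the number of experiments, and the sum of the sensitivity over the grid used as the quantity to minimize. The helper name `optimize_detector` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

z = np.linspace(-14, -3, 200)   # z = log10(tau_c / s)

def toy_sensitivity(center, width=1.2):
    """Hypothetical sensitivity R_zeta(z); not a real rate formula."""
    return np.exp(-0.5 * ((z - center) / width) ** 2)

M = np.vstack([toy_sensitivity(c) for c in (-10.5, -8.5, -6.5)])
U, s, Vt = np.linalg.svd(M, full_matrices=False)
t = 3
Vt_t = Vt[:t, :]

def optimize_detector(Vt_t, k):
    """Find one row x of T so that rho(z) = x @ Vt_t has rho(z[k]) = 1,
    rho(z) >= 0 everywhere, and minimal summed sensitivity."""
    n_vec, n_z = Vt_t.shape
    cost = Vt_t.sum(axis=1)                  # sum over z of rho(z), linear in x
    res = linprog(
        cost,
        A_ub=-Vt_t.T, b_ub=np.zeros(n_z),    # -rho(z) <= 0 for every z
        A_eq=Vt_t[:, [k]].T, b_eq=[1.0],     # rho(z_max) = 1
        bounds=[(None, None)] * n_vec,       # coefficients may be negative
        method="highs",
    )
    return res.x @ Vt_t if res.success else None

k = int(np.argmin(np.abs(z + 8.5)))          # grid index closest to z_max = -8.5
rho = optimize_detector(Vt_t, k)
failed = rho is None or rho.max() > 1.0 + 1e-6   # failure criterion from the text
print("z_max =", z[k], "failed:", failed)
```

Sweeping `k` over the full grid and keeping the solutions that do not fail would mimic the search over 200 correlation times described above; the rows `x` found for the retained solutions would form the matrix T.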

In the detector analysis, once we have optimized the detectors, we apply the same linear combination to the experimental relaxation rate constants as was applied to the sensitivities in order to obtain optimized detector responses. Note that, in practice, this is implemented as a fit, allowing one to prioritize fitting relaxation rate constants with lower measurement error. Furthermore, we place bounds on the fitted detector responses, $ρ_{n}^{(θ, S)}$ . In A Few Notes on Linearity, we noted that bounds (priors) on the fit parameters can cause the fit parameters to no longer be linear in $(1 - S^{2}) θ (z)$ . This is only the case if the priors exclude the best fit. Detectors are constructed such that any allowed set of relaxation rate constants will not result in parameters that violate the priors. Allowed rate constants are any set that may occur for an arbitrary form of $(1 - S^{2}) θ (z)$ . If, due to noise or measurement error, a disallowed set of relaxation rate constants is measured, then the priors will force the fitted relaxation rate constants to fall in the allowed space.

Model-Free, or Not?

We see that the original model-free approach, SDM, LeMaster’s approach, and detector analysis all belong to a family of methods that yield parameters with well-defined relationships to the distribution of correlation times, here defined by $(1 - S^{2}) θ (z)$ . For SDM and detectors, the final parameters ( $J (ω)$ , $ρ_{n}^{(θ, S)}$ ) are linearly related to $(1 - S^{2}) θ (z)$ ; for model-free, $S^{2}$ and $(1 - S^{2}) 〈 τ_{e} 〉$ are linear, and for LeMaster’s approach, $(1 - S_{f}^{2})$ , $S_{f}^{2} (1 - S_{H}^{2})$ , and $S_{f}^{2} S_{H}^{2} (1 - S_{N}^{2})$ are linear, whereas the final parameters ( $S^{2}$ , $〈 τ_{e} 〉$ , $S_{f}^{2}$ , $S_{H}^{2}$ , and $S_{N}^{2}$ ) must be obtained via additional arithmetic operations. EMF parameters, on the other hand, may respond to changes in one motion differently, depending on other motions in the system. Still, the simplicity of EMF in analysis and interpretation (one to three pairs of correlation times and amplitudes) makes it an attractive choice for relaxation data analysis. Should we then compromise in some cases, and sacrifice well-defined parameters for more easily interpreted parameters?

Case 1: Extended Model-Free

Using detectors, we may better understand how EMF parameters in solid-state NMR depend on amplitudes of motion for particular windows of correlation times. We re-analyze relaxation data of HET-s (218–289) fibrils (Smith et al., 2016), by first performing a detector analysis on the data, shown in Figure 11A and then iteratively fitting detector responses to correlation times and amplitudes in Figure 11B, resulting in the EMF analysis in Figure 11C.

FIGURE 11. Model-free analysis from detectors. (A) shows a detector analysis of HET-s (218–289) fibrils (Smith et al., 2016), with sensitivities shown in (B) (amplitude scale not shown; sensitivities have a maximum of 1). (B) illustrates the procedure to convert 273Ser detector responses into model-free parameters. Bars give the detector responses (y-axis), plotted at the center of the corresponding detector’s sensitivity (x-axis, note that ρ₀, blue, does not have a well-defined center). At top, we find the ratio of $ρ_{3}^{(θ, S)} / ρ_{2}^{(θ, S)}$ is consistent with a correlation time of 34 ns, with corresponding amplitude of 0.075 (intermediate motion). After subtracting the contribution of this correlation time to $ρ_{0}^{(θ, S)}$ (middle), we find the ratio $ρ_{1}^{(θ, S)} / ρ_{0}^{(θ, S)}$ is consistent with a correlation time of 49 ps, and amplitude of 0.12 (fast motion). Using a fixed correlation time of 14.7 μs, we find an amplitude for the slow motion of 1.8 × 10⁻³ (bottom). (C) shows the results of EMF analysis for all residues using the procedure in (B).

Using the following procedure, illustrated in Figure 11B for residue 273Ser, we are able to closely reproduce the results of our previous direct fit using the model-free approach. The procedure is given below as a set of simple equations.

\begin{array}{ll} \text{Step 1: } z_{i} & \text{Step 2: } A_{i} \\ \frac{ρ_{2}^{(θ, S)}}{ρ_{3}^{(θ, S)}} = \frac{ρ_{2} (z_{i})}{ρ_{3} (z_{i})} & A_{i} = \frac{ρ_{2}^{(θ, S)}}{ρ_{2} (z_{i})} = \frac{ρ_{3}^{(θ, S)}}{ρ_{3} (z_{i})} \\ \text{Step 3: } z_{f} & \text{Step 4: } A_{f} \\ \frac{ρ_{0}^{(θ, S)} - A_{i} ρ_{0} (z_{i})}{ρ_{1}^{(θ, S)} - A_{i} ρ_{1} (z_{i})} = \frac{ρ_{0} (z_{f})}{ρ_{1} (z_{f})} & A_{f} = \frac{ρ_{0}^{(θ, S)} - A_{i} ρ_{0} (z_{i})}{ρ_{0} (z_{f})} = \frac{ρ_{1}^{(θ, S)} - A_{i} ρ_{1} (z_{i})}{ρ_{1} (z_{f})} \\ \text{Step 5: } A_{s} & \\ A_{s} = \frac{ρ_{4}^{(θ, S)} - A_{i} ρ_{4} (z_{i}) - A_{f} ρ_{4} (z_{f})}{ρ_{4} (z_{s})} & \\ S_{f}^{2} = 1 - A_{f} & τ_{f} = 10^{z_{f}} \cdot 1 \text{ s} \\ S_{i}^{2} = 1 - A_{i} / S_{f}^{2} & τ_{i} = 10^{z_{i}} \cdot 1 \text{ s} \\ S_{s}^{2} = 1 - A_{s} / (S_{f}^{2} S_{i}^{2}) & τ_{s} = 10^{z_{s}} \cdot 1 \text{ s} \end{array} (32)

In the first and second steps, we find a correlation time for which the ratio of sensitivities of ρ₂ and ρ₃ matches the ratio of the detector responses, and then subsequently find the correct amplitude to reproduce these detector responses. With $ρ_{2}^{(θ, S)}$ = 2.0 × 10⁻², and $ρ_{3}^{(θ, S)}$ = 2.2 × 10⁻³, we find $τ_{i}$ = 34 ns. Our first concern with this fit is that the intermediate correlation time, $z_{i} = \log_{10} (τ_{i} / s)$ , is a compromise between a detector sensitive to motions around 6 ns and a second sensitive around 2 μs. It seems unlikely that the same motion can really explain these two detector responses, which have sensitivities separated by three orders of magnitude. The second problem is that, because we use a compromise correlation time, both detector sensitivities are very low at this correlation time, which must be counterbalanced by using a large amplitude ( $A_{i}$ ) in the model-free fit. Then, in our example, $A_{i}$ = 0.075 is significantly larger than the detector responses, $ρ_{2}^{(θ, S)}$ and $ρ_{3}^{(θ, S)}$ , from which it results, so that we are very likely overestimating the amplitude of this motion.

In the third and fourth steps, we subtract the contributions from $z_{i}$ and $A_{i}$ from $ρ_{0}^{(θ, S)}$ and $ρ_{1}^{(θ, S)}$ , and similarly use the ratios of the remainder of the detector responses to obtain $z_{f}$ , and their amplitudes to obtain $A_{f}$ . Again, it is not clear if these detectors should be treated as if they describe a single motion. In particular, the relatively uniform behavior of $ρ_{0}^{(θ, S)}$ likely is a result of primarily local librational motion, which will not be described by the same amplitudes and correlation times of motions leading to greater variation in $ρ_{1}^{(θ, S)}$ . Interestingly, because the amplitudes do not vary in the same way, the variation in amplitude of $ρ_{1}^{(θ, S)}$ cannot be reproduced in the trends for $A_{f}$ , but instead has to be fitted by variation in correlation time ( $τ_{f}$ ). The result is that amplitude trends in $S_{i}^{2} = 1 - A_{i} / S_{f}^{2}$ , shown in Figure 11(C, middle) correlate well with trends in $τ_{f}$ , especially near breaks between the β-sheets of HET-s (near 235Glu, 271Gly). However, this correlation is actually coming from similar amplitude trends observed for $ρ_{1}^{(θ, S)}$ and $ρ_{2}^{(θ, S)}$ . The corresponding detector sensitivities are centered at 760 ps and 6.1 ns, and in fact overlap, suggesting that they may describe the same or at least related motions. EMF attributes these detector responses to different motions, having median correlation times of 22 ps and 42 ns (taken over all residues), thus being separated by three orders of magnitude.

In the final step, one fixes the slow correlation time to 14.7 μs (based on a fit optimization over the whole data set). In this case, the amplitude of $ρ_{4}^{(θ, S)}$ determines $A_{s}$ alone; the proximity of 14.7 μs to the center of $ρ_{4}$ (24 μs) results in fairly reasonable amplitudes (for 273Ser, $ρ_{4}^{(θ, S)}$ and $A_{s}$ fall within rounding error, yielding 1.8 × 10⁻³).
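The five steps described above can be written as a short routine. The sketch below makes several assumptions that should be kept in mind: the detector sensitivities are toy Gaussians in z (centers loosely follow the values quoted in the text, with the center of ρ₀ invented, since ρ₀ has no well-defined center), the ratio equations are solved by simple bisection, and the test responses are synthetic, built so that each detector only sees the motions that the procedure assigns to it. It is a round-trip check of the algebra, not a re-analysis of the HET-s data.

```python
import math

# Toy detector sensitivities rho_0..rho_4: equal-width Gaussians in z (assumed)
CENTERS = [-11.0, -9.12, -8.21, -5.70, -4.62]
WIDTH = 1.0

def rho(n, z):
    return math.exp(-0.5 * ((z - CENTERS[n]) / WIDTH) ** 2)

def solve_z(a, b, target_ratio, lo=-14.0, hi=-3.0):
    """Bisection for rho_a(z) / rho_b(z) = target_ratio (monotonic for this toy)."""
    def f(z):
        return math.log(rho(a, z)) - math.log(rho(b, z)) - math.log(target_ratio)
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def emf_from_detectors(r, z_s):
    """r holds the five detector responses; z_s is the fixed slow correlation time."""
    z_i = solve_z(2, 3, r[2] / r[3])                       # Step 1
    A_i = r[2] / rho(2, z_i)                               # Step 2
    rem0 = r[0] - A_i * rho(0, z_i)
    rem1 = r[1] - A_i * rho(1, z_i)
    z_f = solve_z(0, 1, rem0 / rem1)                       # Step 3
    A_f = rem0 / rho(0, z_f)                               # Step 4
    A_s = (r[4] - A_i * rho(4, z_i) - A_f * rho(4, z_f)) / rho(4, z_s)  # Step 5
    S2_f = 1.0 - A_f
    S2_i = 1.0 - A_i / S2_f
    S2_s = 1.0 - A_s / (S2_f * S2_i)
    return {"z_f": z_f, "A_f": A_f, "z_i": z_i, "A_i": A_i, "A_s": A_s,
            "S2_f": S2_f, "S2_i": S2_i, "S2_s": S2_s}

# Synthetic responses using the 273Ser correlation times and amplitudes
truth = {"A_f": 0.12, "z_f": math.log10(49e-12),
         "A_i": 0.075, "z_i": math.log10(34e-9),
         "A_s": 1.8e-3, "z_s": math.log10(14.7e-6)}
responses = [
    truth["A_f"] * rho(0, truth["z_f"]) + truth["A_i"] * rho(0, truth["z_i"]),
    truth["A_f"] * rho(1, truth["z_f"]) + truth["A_i"] * rho(1, truth["z_i"]),
    truth["A_i"] * rho(2, truth["z_i"]),
    truth["A_i"] * rho(3, truth["z_i"]),
    truth["A_s"] * rho(4, truth["z_s"]) + truth["A_i"] * rho(4, truth["z_i"])
    + truth["A_f"] * rho(4, truth["z_f"]),
]
fit = emf_from_detectors(responses, z_s=truth["z_s"])
print(fit)
```

Because the synthetic data obey the assumptions of the procedure exactly, the input parameters are recovered; with real detector responses, the concerns raised above about compromise correlation times and inflated amplitudes apply.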

Then, the major problems with this EMF analysis are intermediate correlation times falling within the NMR blind spot (∼20–600 ns), along with correspondingly inflated amplitudes, as well as similar problems due to fitting fast correlation times to $ρ_{0}^{(θ, S)}$ and $ρ_{1}^{(θ, S)}$ , which requires a compromise correlation time between librational motions (∼ps) and nanosecond motions. Furthermore, this behavior prevented comparison of EMF parameters for HET-s to MD results, whereas detectors yielded reasonable agreement (Smith et al., 2019b). As we have previously pointed out (Smith et al., 2017), the model-free parameters in HET-s fibrils are far from being atypical; in fact, they are fairly consistent across multiple protein systems, likely due to most studies utilizing similar data sets and analysis methodology (Chevelkov et al., 2009b; Schanda et al., 2010; Haller and Schanda, 2013; Zinkevich et al., 2013; Lamley et al., 2015a).

Case 2: Model-Free Analysis of μs-Motion

Microsecond motion is the result of processes having higher free-energy cost than nanosecond and picosecond dynamics. We suggest dividing these motions into local and collective motions, where the free energy cost of local motions comes from higher amplitude motions (∼10°) that require traversing a large energy barrier. In contrast, collective motions tend to be very low amplitude motion, where the high free-energy cost of the motion is not due to large amplitude dynamics or a significant energy barrier, but rather diffusive dynamics involving large numbers of atoms. Such dynamics are characterized by modes of motion, where a continuum of possible correlation lengths leads to a distribution of correlation times. In contrast, some local microsecond dynamics can be reasonably well approximated as a hopping motion between two orientations, and therefore described with a single correlation time (although effort should be made to determine whether relaxation might be due to multi-site exchange, and understand how this changes the interpretation of data analysis).

Local Dynamics

The availability of R_1ρ data, including formulas for its analysis (Trott and Palmer, 2002; Abergel and Palmer, 2003; Miloushev and Palmer, 2005; Kurbanov et al., 2011; Rovo and Linser, 2017) and improving methods for its collection (Kurauskas et al., 2017; Lakomek et al., 2017; Keeler et al., 2018; Krushelnitsky et al., 2018) has recently resulted in considerable improvement in the ability to characterize local micro- to millisecond motions (Rovó, 2020). We consider two categories of R_1ρ experiments: the first is Bloch-McConnell relaxation dispersion (BMRD), for which R_1ρ relaxation is the result of motion modulating the isotropic chemical shift, and the second is NEar Rotary-resonance Relaxation Dispersion (NERRD; Kurauskas et al., 2017), for which orientational changes in anisotropic tensors lead to R_1ρ relaxation. For two-site exchange, BMRD R_1ρ relaxation rate constants depend on exchange rate ( $k_{ex} = 1 / τ_{c}$ ), the change in isotropic chemical shift due to exchange ( $Δ ω_{12}$ ), and the population ( $p_{1}, p_{2} = 1 - p_{1}$ ). Rate constants further depend on the effective field strengths corresponding to each of the two chemical shifts, $ω_{e 1}$ and $ω_{e 2}$ , as well as the effective field for the average chemical shift. Palmer and coworkers provide us with the following expression (Trott and Palmer, 2002; Trott et al., 2003; Miloushev and Palmer, 2005), which is valid in the fast or intermediate exchange regimes:

\begin{array}{l} R_{1 ρ} = \frac{R_{1}}{2} (1 + \cos^{2} β_{e}) + R_{1 ρ}^{DD,CSA} + R_{1 ρ}^{ex} \\ R_{1 ρ}^{ex} = \frac{\sin^{2} β_{e} p_{1} p_{2} Δ ω_{12}^{2} k_{ex}}{\frac{ω_{e 1}^{2} ω_{e 2}^{2}}{ω_{e}^{2}} + k_{ex}^{2} - \sin^{2} β_{e} p_{1} p_{2} Δ ω_{12}^{2} (1 + \frac{2 k_{ex}^{2} (p_{1} ω_{e 1}^{2} + p_{2} ω_{e 2}^{2})}{ω_{e 1}^{2} ω_{e 2}^{2} + ω_{e}^{2} k_{ex}^{2}})} \\ Ω = p_{1} Ω_{1} + p_{2} Ω_{2} \\ ω_{e}^{2} = ω_{1}^{2} + Ω^{2} \\ ω_{e 1}^{2} = ω_{1}^{2} + Ω_{1}^{2}, ω_{e 2}^{2} = ω_{1}^{2} + Ω_{2}^{2} \end{array} (33)

The total $R_{1 ρ}$ relaxation has contributions from longitudinal relaxation (R₁), transverse relaxation from dipole and CSA tensors ( $R_{1 ρ}^{DD,CSA}$ ), and from chemical exchange ( $R_{1 ρ}^{ex}$ ). Kurbanov et al. (2011) give the formula for $R_{1 ρ}^{DD,CSA}$ .

\begin{array}{l} R_{1 ρ}^{DD, CSA} = \sin^{2} β_{e} \times [{(\frac{δ}{4})}^{2} (J (ω_{S}) + \frac{1}{3} J (2 ω_{r} - ω_{e}) + \frac{2}{3} J (ω_{r} - ω_{e}) + \frac{2}{3} J (ω_{r} + ω_{e}) + \frac{1}{3} J (2 ω_{r} + ω_{e})) \\ + \frac{2}{27} {(ω_{I} Δ σ_{I})}^{2} (\frac{1}{2} J (2 ω_{r} - ω_{e}) + J (ω_{r} - ω_{e}) + J (ω_{r} + ω_{e}) + \frac{1}{2} J (2 ω_{r} + ω_{e}))] \end{array} (34)

$ω_{r}$ is the magic angle spinning frequency, and $ω_{e}$ is the effective field as defined above. If one assumes the microsecond dynamics are dominated by two-site hopping, the spectral density is given simply by

J (ω) = \frac{2}{5} 3 p_{1} p_{2} (1 - \cos^{2} θ) \frac{k_{ex}}{k_{ex}^{2} + ω^{2}} = \frac{2}{5} (1 - S^{2}) \frac{τ_{c}}{1 + {(ω τ_{c})}^{2}} (35)

Then, the question is, how may we most efficiently extract the exchange rate ( $k_{ex} = 1 / τ_{c}$ ), populations ( $p_{1}, p_{2}$ ), chemical shift changes ( $Δ ω_{12}$ ), and angle changes ( $θ$ )? Fitting of $k_{ex}$ has been fairly well established using either NERRD or BMRD (Trott et al., 2003; Ma et al., 2014; Rovo and Linser, 2017; Marion et al., 2019), and combining both methods should improve the accuracy of the resulting $k_{ex}$ . However, separation of populations from either $θ$ (NERRD) or $Δ ω_{12}$ (BMRD) is non-trivial. Supposing we already know $k_{ex}$ , a given experiment’s relaxation rate constant then depends on the populations and either $θ$ or $Δ ω_{12}$ (at sufficiently fast MAS, a given effective field usually results in either $R_{1 ρ}^{DD,CSA}$ or $R_{1 ρ}^{ex}$ being dominant, although in principle both terms are active in the same experiments). Inspecting Eqs. 34, 35, we note that the terms $p_{1}$ , $p_{2}$ , and $θ$ only appear once, as the product $3 p_{1} p_{2} (1 - \cos^{2} θ)$ . Then, based on NERRD data alone, these parameters are inseparable. This is seen in Figure 12, where we plot $R_{1 ρ}$ as a function of $p_{1}$ and $θ$ . We also calculate $R_{1 ρ}^{DD,CSA}$ specifically for $p_{1}$ = 0.25 and $θ = 16^{°}$ , and then indicate all other positions resulting in the same value of $R_{1 ρ}^{DD,CSA}$ in Figures 12A,B as a black contour. In Figures 12E,F, we only show contours where $R_{1 ρ}^{DD,CSA}$ matches the value obtained for $p_{1}$ = 0.25 and $θ = 16^{°}$ , but show several different experimental conditions (varying the field strength, $ν_{1} = ω_{1} / 2 π$ ). Because this results in identical contours, we are unable to disentangle these parameters based on NERRD experiments under different conditions.
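The inseparability of population and hop angle in Eq. 35 can be made concrete with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the exchange rate and sampled frequencies are arbitrary illustrative values. Starting from $p_{1}$ = 0.25 and $θ = 16^{°}$ , it finds the hop angle that gives the same amplitude for equal populations and confirms that the spectral density is then identical at every frequency.

```python
import math

def one_minus_S2(p1, theta_deg):
    """Two-site hop amplitude from Eq. 35: 3 p1 p2 (1 - cos^2 theta)."""
    p2 = 1.0 - p1
    return 3.0 * p1 * p2 * (1.0 - math.cos(math.radians(theta_deg)) ** 2)

def spectral_density(omega, p1, theta_deg, k_ex):
    """J(omega) of Eq. 35 with tau_c = 1 / k_ex."""
    tau_c = 1.0 / k_ex
    return 0.4 * one_minus_S2(p1, theta_deg) * tau_c / (1.0 + (omega * tau_c) ** 2)

def equivalent_angle(p1_ref, theta_ref_deg, p1_new):
    """Hop angle giving the same 1 - S^2 for a different population."""
    target = one_minus_S2(p1_ref, theta_ref_deg)
    sin2 = target / (3.0 * p1_new * (1.0 - p1_new))
    if not 0.0 <= sin2 <= 1.0:
        raise ValueError("no hop angle reproduces this amplitude")
    return math.degrees(math.asin(math.sqrt(sin2)))

k_ex = 1.0e5                                   # s^-1, arbitrary example value
theta_equal_pop = equivalent_angle(0.25, 16.0, 0.5)

# Compare J(omega) for the two parameter sets at a few frequencies (rad/s)
omegas = [2.0 * math.pi * nu for nu in (5e3, 20e3, 40e3, 80e3)]
J_ref = [spectral_density(w, 0.25, 16.0, k_ex) for w in omegas]
J_alt = [spectral_density(w, 0.5, theta_equal_pop, k_ex) for w in omegas]
print(theta_equal_pop, J_ref, J_alt)
```

Since every NERRD rate constant in Eq. 34 is a linear combination of $J (ω)$ values, two parameter sets with identical $J (ω)$ cannot be distinguished by changing the spin-lock field or the spinning frequency.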

FIGURE 12. Separating population from hop angle and change in chemical shift in NERRD and BMRD experiments. Relevant parameters are shown as insets ( $ω_{r} / 2 π$ = 40 kHz for NERRD plots). In (A–D), contour plots are shown for NERRD and BMRD relaxation rate constants under various conditions, and in each plot, a contour shows all values of $p_{1}$ and $θ$ or $Δ ω_{12}$ that yield $R_{1 ρ}$ equal to the value obtained for $p_{1}$ = 0.25 and $θ = 16^{°}$ or $Δ ω_{12}$ = 500 Hz (marked as a cross on each plot). In (E–L), we only show the contour, but for a range of experimental conditions (five experiments, linearly spaced, with range indicated in the plot). In some cases, this yields nearly identical contours, such that we only see one of the five contours.

In contrast, $R_{1 ρ}$ relaxation resulting from chemical exchange has a more complex dependence on the various parameters. In particular, effective fields for each of the two states in exchange, $ω_{e 1}$ and $ω_{e 2}$ depend on the different offsets, $Ω_{1}$ , $Ω_{2}$ , but do not depend on the populations, in principle making the terms separable. Indeed, several plots in Figure 12 show that different experimental conditions lead to different contours for $p_{1}$ vs. $Δ ω_{12}$ (contours correspond to $R_{1 ρ}$ that is equal to $R_{1 ρ}$ obtained for $p_{1}$ = 0.25 and $Δ ω_{12}$ = 500 Hz, where contour intersections yield the input values). We are then able to identify the critical conditions required for separating population from chemical shift change. First, we see that if $k_{ex} ≫ Δ ω_{12}$ , contours are fully overlapped so that we are not able to separate the terms, shown in Figure 12G. This is because, in Eq. 33, $k_{ex}^{2}$ must be much larger than the last term in the denominator. If it is also larger than $ω_{e 1}^{2} ω_{e 2}^{2} / ω_{e}^{2}$ , then the critical dependence of the $R_{1 ρ}$ on $ω_{e 1}$ or $ω_{e 2}$ is lost. In case $k_{ex}^{2}$ is not larger than $ω_{e 1}^{2} ω_{e 2}^{2} / ω_{e}^{2}$ , then the effective field must be much larger than $Δ ω_{12}$ , so that this term converges on $ω_{e}^{2}$ , again losing dependence on $ω_{e 1}$ and $ω_{e 2}$ (i.e. the denominator simplifies to $ω_{e}^{2} + k_{ex}^{2}$ (Trott and Palmer, 2002)). In any case, if the effective fields become large, $ω_{e 1} \to ω_{e}$ , $ω_{e 2} \to ω_{e}$ , similarly preventing separation in terms. For example, see Figures 12J,K, where a large offset or large field strength on the spin-locking field results in overlapping contours. Finally, note that we require a frequency offset to be applied in order to obtain the sign of $Δ ω_{12}$ . If no frequency offset is applied, then all contours are symmetric as in Figure 12H.
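The loss of separability in fast exchange can be illustrated by evaluating the exchange term of Eq. 33 directly. The sketch below rests on several assumptions that are not spelled out in the text: $\sin^{2} β_{e} = ω_{1}^{2} / ω_{e}^{2}$ , $Δ ω_{12} = Ω_{2} - Ω_{1}$ , the offset is specified for the population-averaged resonance, and both site-specific effective fields use the same spin-lock field $ω_{1}$ . The spin-lock fields and exchange rates are arbitrary example values, and the function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def r1rho_ex(p1, d_omega, k_ex, omega_1, offset=0.0):
    """Exchange contribution of Eq. 33; all frequencies in rad/s.
    offset is the offset of the population-averaged resonance (assumed)."""
    p2 = 1.0 - p1
    Omega_1 = offset - p2 * d_omega          # so that p1*Omega_1 + p2*Omega_2 = offset
    Omega_2 = offset + p1 * d_omega
    w_e_sq = omega_1 ** 2 + offset ** 2
    w_e1_sq = omega_1 ** 2 + Omega_1 ** 2
    w_e2_sq = omega_1 ** 2 + Omega_2 ** 2
    sin2_beta = omega_1 ** 2 / w_e_sq        # assumed definition of beta_e
    amp = sin2_beta * p1 * p2 * d_omega ** 2
    correction = 1.0 + (2.0 * k_ex ** 2 * (p1 * w_e1_sq + p2 * w_e2_sq)
                        / (w_e1_sq * w_e2_sq + w_e_sq * k_ex ** 2))
    denom = w_e1_sq * w_e2_sq / w_e_sq + k_ex ** 2 - amp * correction
    return amp * k_ex / denom

def max_relative_difference(k_ex, fields_hz=(1000.0, 2000.0, 3000.0)):
    """Compare two parameter sets that share the product p1*p2*d_omega^2."""
    p1_a, dw_a = 0.25, TWO_PI * 500.0
    p1_b = 0.5
    dw_b = dw_a * math.sqrt(p1_a * (1.0 - p1_a) / (p1_b * (1.0 - p1_b)))
    diffs = []
    for nu_1 in fields_hz:
        r_a = r1rho_ex(p1_a, dw_a, k_ex, TWO_PI * nu_1)
        r_b = r1rho_ex(p1_b, dw_b, k_ex, TWO_PI * nu_1)
        diffs.append(abs(r_a - r_b) / r_a)
    return max(diffs)

diff_fast = max_relative_difference(k_ex=1.0e5)          # k_ex >> d_omega
diff_intermediate = max_relative_difference(k_ex=3.0e3)  # k_ex ~ d_omega
print(diff_fast, diff_intermediate)
```

In this example, the two parameter sets give rate constants that differ by well under 0.1% when $k_{ex} ≫ Δ ω_{12}$ , but by several percent when $k_{ex}$ , $Δ ω_{12}$ , and the effective field are of similar size, which is the regime where the contours for different conditions no longer coincide.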

Separability occurs only when $k_{ex}$ , $Δ ω_{12}$ , and $ω_{e}$ are of similar size. Restricting $ω_{e}$ is particularly challenging in solid-state NMR, where coherent effects may contribute to relaxation when the spin-locking field becomes too small (Öster et al., 2019). One approach would be to use increasing spinning frequencies (Penzel et al., 2015; Lakomek et al., 2017), although we note that some of the most clear improvements in Figure 12 occur in Figure 12L, where the field strength is only a few times bigger than the H–N J-couplings, which cannot be averaged by spinning.

In case we are in the fast exchange limit for BMRD experiments, we are left only with the terms $p_{1} p_{2} (1 - \cos^{2} θ)$ from NERRD experiments and $p_{1} p_{2} Δ ω_{12}^{2}$ from BMRD experiments. In this case, there is little to be done to fully separate population from the other parameters. If values of $Δ ω_{12}$ may be bounded, it is then possible to also bound $p_{1} p_{2}$ , and therefore one finds a restricted range for possible values of $θ$ (the reverse approach also works). However, if we are in the range of intermediate exchange, then we may separate populations from $Δ ω_{12}$ , and use the result to also obtain $θ$ (note that inclusion of NERRD data should additionally improve the accuracy of $k_{ex}$ , which in turn improves separation of $Δ ω_{12}$ from populations based on the BMRD data). Note that Marion et al. have recently presented similar arguments (Marion et al., 2019), although separation of terms was apparently achieved by combining NERRD and BMRD data for fairly fast exchange ( $k_{ex}$ = 18,000 s⁻¹, $Δ ω_{12} / 2 π$ = 240 Hz). While we agree that using both data sets together is beneficial, the information to separate population must come from the BMRD data, and this is only possible in the intermediate exchange regime (Marion et al. calculated R_1ρ for a set of conditions and, via a coarse grid search, were able to find the initial conditions; however, other solutions along contours as in our Figure 12 were likely overlooked in the grid search).

A final consideration when analyzing BMRD and NERRD data is whether a two-site exchange model is reasonable. In a true two-site exchange, all moving residues should have identical exchange rates and populations, but differing $Δ ω_{12}$ and $θ$ values. Then, validation of the two-site model could be achieved by independently analyzing all residues and establishing that all fits have approximately the same $p_{1}$ , $p_{2}$ , and $k_{ex}$ (or just the same $k_{ex}$ if populations cannot be determined). In case the true behavior is, for example, three-site exchange, fitting to the two-site exchange model will yield exchange rates that are a weighted average of the two non-zero eigenvalues of the 3 × 3 exchange matrix, where weighting will depend on the chemical shifts of the three sites and/or the angles sampled. In this case, it may be appropriate to apply a three-site exchange model, while jointly fitting all residues using a common set of rate constants (four to six independent parameters, depending on the model chosen). Such an approach has been demonstrated using CPMG relaxation in solution-state NMR (Korzhnev et al., 2005; Neudecker et al., 2006), with the general equations solved for CPMG (Koss et al., 2018).

Collective Dynamics

NERRD relaxation also appears throughout the whole protein in the absence of BMRD relaxation, depending on sample conditions, and is attributed to low amplitude rocking of the whole protein. This is observed very weakly in GB1 crystals (Krushelnitsky et al., 2018), and strongly in GB1 complexed with IgG (Lamley et al., 2015b), HET-s (218–289) (Smith et al., 2016), ubiquitin crystals with amplitude depending heavily on crystal form (Ma et al., 2015; Kurauskas et al., 2017; Lakomek et al., 2017), and SH3 (Krushelnitsky et al., 2018). The apparent global nature of this motion led all of these studies, with the exception of Lakomek et al., to fit R_1ρ relaxation using a slow motion with a single correlation time for all residues. In our HET-s analysis, we proposed fitting R_1ρ data using a global, slow correlation time, where the corresponding order parameter could vary, and additionally an offset term that would account for faster motion that could not be fully parameterized from R_1ρ data alone. Kurauskas et al. also followed this procedure, whereas Krushelnitsky and coworkers included explicit fitting of an additional fast motion with a distribution of correlation times. By including an offset term, and using a single correlation time globally, we again have a linear fit.

R_{1 ρ} = \underbrace{R_{1 ρ}^{0}}_{\text{ns motions}} + (1 - S_{s}^{2}) \underbrace{R_{1 ρ} (τ_{s})}_{\text{μs motion, fixed}} (36)

Then, for each residue, $R_{1 ρ}^{0}$ and $(1 - S_{s}^{2})$ are varied, where $R_{1 ρ}^{0}$ in principle compensates for relaxation due to fast, nanosecond motion, and $(1 - S_{s}^{2})$ should determine the effective amplitude of the global motion, with correlation time $τ_{s}$ , on the given residue. Practically, what happens is that the R_1ρ rate constants measured for a given residue have certain ratios. If those ratios match the ratios calculated for $τ_{s}$ , then $R_{1 ρ}^{0} = 0$ and the relaxation rate constants are fitted only with ( $1 - S_{s}^{2}$ ). In contrast, if all rate constants are approximately equal, then $(1 - S_{s}^{2}) = 0$ and $R_{1 ρ}^{0}$ accounts for the full relaxation. However, in most cases, the ratios are closer to one than predicted by $τ_{s}$ , but not exactly one and so by including contributions from $R_{1 ρ}^{0}$ and $(1 - S_{s}^{2}) R_{1 ρ} (τ_{s})$ , the data may be fit. One may investigate in more detail how the two terms vary as a function of correlation time (as in Figures 6–8). We show the behavior for the ¹⁵N and ¹³Cα R_1ρ data sets found in Smith et al. (2016) in Figure 13.
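Because Eq. 36 is linear in $R_{1 ρ}^{0}$ and $(1 - S_{s}^{2})$ , the fit reduces to ordinary linear least squares. The sketch below is an illustration under stated assumptions: the reference curve `toy_r1rho` keeps only the rotary-resonance terms of the dipolar part of Eq. 34 for an on-resonance spin-lock, uses a single-Lorentzian spectral density, and is scaled by a made-up prefactor, so the absolute rate constants are not meaningful. The MAS frequency, spin-lock strengths, and $τ_{s}$ are the ¹⁵N values quoted for Figure 13; function names are hypothetical.

```python
import numpy as np

def lorentzian_J(omega, tau_c):
    """Single-correlation-time spectral density with unit amplitude."""
    return 0.4 * tau_c / (1.0 + (omega * tau_c) ** 2)

def toy_r1rho(nu_1_khz, nu_r_khz, tau_c, prefactor=1.0e9):
    """Simplified R1rho(tau_c): rotary-resonance terms only, made-up prefactor."""
    w1 = 2.0e3 * np.pi * np.asarray(nu_1_khz, dtype=float)   # rad/s
    wr = 2.0e3 * np.pi * nu_r_khz                            # rad/s
    J = lambda w: lorentzian_J(w, tau_c)
    return prefactor * (J(2 * wr - w1) / 3 + 2 * J(wr - w1) / 3
                        + 2 * J(wr + w1) / 3 + J(2 * wr + w1) / 3)

def fit_offset_and_amplitude(r1rho_measured, r1rho_at_tau_s):
    """Least-squares solution of Eq. 36 for (R1rho^0, 1 - S_s^2)."""
    A = np.column_stack([np.ones_like(r1rho_at_tau_s), r1rho_at_tau_s])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(r1rho_measured, dtype=float), rcond=None)
    return coeffs[0], coeffs[1]

spin_locks_khz = [11, 16, 25, 38, 51]
reference = toy_r1rho(spin_locks_khz, nu_r_khz=60.0, tau_c=18.5e-6)

# Case 1: synthetic data that follow Eq. 36 exactly
measured = 2.0 + 1.8e-3 * reference
offset, amplitude = fit_offset_and_amplitude(measured, reference)

# Case 2: all rate constants equal, so only the offset term is needed
flat_offset, flat_amplitude = fit_offset_and_amplitude(np.full(5, 5.0), reference)
print(offset, amplitude, flat_offset, flat_amplitude)
```

The second case reproduces the limiting behavior described above: when all rate constants are equal, the fitted amplitude vanishes and the offset accounts for the full relaxation.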

FIGURE 13. Behavior of fitting R_1ρ data to an offset and a fixed correlation time. (A) shows the offset term, $R_{1 ρ}^{0}$, divided by 2000, and the order parameter for the slow correlation time, $(1 - S_{s}^{2})$, resulting from fitting calculated relaxation rate constants as a function of correlation time to Eq. 36. Experiments are ¹⁵N R_1ρ acquired with MAS frequency of 60 kHz and spin-lock strengths of 11, 16, 25, 38, and 51 kHz, and $τ_{s}$ is fixed at 18.5 μs. (B) shows detector sensitivities optimized using the same data set. (C) shows $R_{1 ρ}^{0}$ divided by 2000 and $(1 - S_{s}^{2})$ for ¹³C R_1ρ acquired with MAS frequency of 60 kHz and spin-lock strengths of 9, 18, 35, and 48 kHz, as well as an additional experiment with MAS frequency of 40 kHz and spin-lock strength of 25 kHz; $τ_{s}$ is fixed at 7.0 μs. (D) shows detector sensitivities optimized using the same data set.

In Figure 13A, we calculate R_1ρ relaxation rate constants for ¹⁵N relaxation, and fit to Eq. 36. $(1 - S_{s}^{2})$ reaches a maximum of approximately one at 19 μs, so that this parameter describes motion at and around the fixed correlation time of $τ_{s}$ = 18.5 μs. On the other hand, the offset term, $R_{1 ρ}^{0}$ , actually is most sensitive at 2.5 μs, far from fitting primarily fast, nanosecond motion. We see that the functional forms are similar to detector sensitivities optimized from the same data set, Figure 13B. In Figure 13C, the behavior is less ideal: $(1 - S_{s}^{2})$ reaches a maximum of 1.28 at 13 μs, somewhat removed from the fixed correlation time of 7.0 μs, and the offset term becomes negative for correlation times around 18 μs.

The sensitivity of the offset term in Figure 13A to motion near 2.5 μs as opposed to faster motions may be surprising, although perhaps it should not be. NERRD experiments are most sensitive in the μs-range of correlation times, and rate constants under different experimental conditions have nearly converged to the same value at 1.9 μs (all rate constants within 5% of each other), only slightly faster than the 2.5 μs where we find the maximum. Then, we would expect the offset term to be sensitive both near where R_1ρ is most sensitive and near where it converges, which is roughly what we find.

It is then important to note that fitting $R_{1 ρ}$ to contributions from an offset term and a fixed correlation time results in an offset term that is most sensitive not to fast (nanosecond) motions, but rather to slower (microsecond) motions. In some cases, $(1 - S_{s}^{2})$ may be overly sensitive to some correlation times, with sensitivity exceeding one at positions that are removed from $τ_{s}$. Detectors are also a better choice for characterizing broad distributions of correlation times, if one does not know the form of the distribution. In fact, we suspect that global rocking motion is the result of collective dynamics over varying length scales, where increasing the correlation length also increases the correlation time, and therefore yields a broad distribution of correlation times. We demonstrated the relationship between correlation length and correlation time window for HET-s fibrils on the nanosecond timescale using a combination of NMR and MD simulation (Smith et al., 2019b). However, the question remains whether similar behavior can fully explain rocking motion of crystalline proteins; for example, Schanda and coworkers argue that a coupling between overall rocking motion and local loop motion may exist in crystalline ubiquitin (Kurauskas et al., 2017).

Outlook: Combining Methods

We have seen that relaxation data in NMR may be processed by a variety of different methods. However, only some of these methods can really be thought of as “model-free,” such that we can establish a well-defined (linear) behavior for each parameter as a function of correlation time, independent of the actual model of the correlation function. These methods are the original model-free analysis (under the assumption that $ω τ_{i} ≪ 1$), spectral density mapping, LeMaster’s approach, and detector analysis. Of these, only detector analysis is generally applicable to solid-state NMR.

So, are detectors the last word in NMR dynamics analysis? We certainly hope not. Each detector response provides a “window” into the total reorientational motion of some NMR tensor, with the window width and center defined by $ρ_{n} (z)$ . Still, such a description is not very precise: a moderate detector response could result from a low amplitude motion near where $ρ_{n} (z)$ reaches its maximum, it could result from a high amplitude motion where $ρ_{n} (z)$ is small, or (and we suspect this is often the case), it characterizes a distribution of correlation times that overlaps the range of sensitivity of that detector. A collection of detectors, and their behavior as a function of position in a molecule gives further hints at the dynamics of a molecule, but leaves much to be desired in terms of details of motion. What we would rather have is better models of motion. If we use a good model, based on knowledge of the dynamics obtained from other methods, the information added to our experimental data should improve our interpretation of the experiment.
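The ambiguity described above can be made concrete with a small sketch. It assumes a single correlation time per motion, so that the detector response reduces to the amplitude $(1 - S^{2})$ multiplied by the sensitivity at that correlation time. The Gaussian form of `hypothetical_sensitivity` and all numerical values are assumptions made only for illustration; real sensitivities are linear combinations of the experimental rate-constant sensitivities.

```python
import math


def hypothetical_sensitivity(z, z_center=-8.0, width=0.7):
    """Illustrative bell-shaped detector sensitivity rho_n(z) with maximum 1.

    z is log10(tau_c / s). The Gaussian shape is an assumption, not the
    functional form of an optimized detector.
    """
    return math.exp(-((z - z_center) ** 2) / (2.0 * width ** 2))


def response_single_motion(amplitude, z0):
    """Detector response when (1 - S^2) * theta(z) is a single spike at z0."""
    return amplitude * hypothetical_sensitivity(z0)


# Motion A: low amplitude, located at the maximum of the sensitivity
amp_a, z_a = 0.05, -8.0
resp_a = response_single_motion(amp_a, z_a)

# Motion B: located one decade away, with the amplitude chosen so that
# the detector response is identical to that of motion A
z_b = -7.0
amp_b = resp_a / hypothetical_sensitivity(z_b)
resp_b = response_single_motion(amp_b, z_b)
```

In this sketch the two motions differ in amplitude by almost a factor of three yet yield the same response in this one detector; only the responses of neighboring detectors, or information from other methods, could distinguish them.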

Molecular dynamics simulation is particularly powerful as a complementary method to NMR. One obtains positions of all atoms as a function of time, allowing first, the direct calculation of the NMR-relevant correlation functions, and second, in principle allowing one to connect those correlation functions to specific motion in the molecule. $C (t)$ is explicitly calculated as

\begin{array}{l} C (t_{n}) = \frac{1}{N} \sum_{i = 0}^{N - n - 1} P_{2} (\vec{μ} (τ_{i}) \cdot \vec{μ} (τ_{i + n})) \\ \approx S^{2} + (1 - S^{2}) \int_{- \infty}^{\infty} θ (z) \underbrace{e^{- t_{n} / (10^{z} \cdot 1 \text{ s})}}_{R_{C (t_{n})} (z)} d z \end{array} (37)

This is the discrete form of Eq. 14, as would be applied to an MD trajectory. To obtain the nth time point in the correlation function, $C (t_{n})$ , we simply average over all pairs of frames separated by n frames. The latter equation is our assumed form for the correlation function, where we note that a given time point of the correlation function, $C (t_{n})$ , is related to the distribution of correlation times with the same functional form as the relaxation rate constants (excepting the offset, $S^{2}$ , see Eq. 27). This allows one to calculate detectors from the collection of time points in MD-derived correlation functions using a procedure nearly identical to that described in Optimizing Detector Sensitivities: Automated Approach, where the sensitivity, $R_{ζ} (z)$ (Eq. 27), is replaced by the term $R_{C (t_{n})} (z) = \exp (- t_{n} / (10^{z} \cdot 1 s))$ . In fact, not only may detector analysis be easily modified to analyze MD-derived data, but it is a general approach to numerically solving the inverse Laplace transform, which avoids some of the pitfalls of more common regularization approaches (Tikhonov and Arsenin, 1977).
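The averaging over pairs of frames in the first line of Eq. 37 can be sketched as follows. This is a minimal, unoptimized illustration, not the authors' implementation: the function names are hypothetical, the bond vectors are assumed to be already normalized, and each time point is normalized by the number of available pairs, N − n, which is how "average over all pairs of frames" is read here (Eq. 37 as printed carries the prefactor 1/N).

```python
def p2(x):
    """Second-rank Legendre polynomial, P2(x) = (3 x^2 - 1) / 2."""
    return 1.5 * x * x - 0.5


def correlation_function(vectors, max_lag):
    """Return [C(t_0), ..., C(t_max_lag)] for a list of unit bond vectors.

    vectors: one (x, y, z) tuple per trajectory frame, assumed normalized.
    """
    n_frames = len(vectors)
    ct = []
    for n in range(max_lag + 1):
        total = 0.0
        # All pairs of frames separated by n frames
        for i in range(n_frames - n):
            a, b = vectors[i], vectors[i + n]
            total += p2(a[0] * b[0] + a[1] * b[1] + a[2] * b[2])
        ct.append(total / (n_frames - n))
    return ct


# Toy trajectory: the bond vector alternates between the z- and x-axes
z_axis, x_axis = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
toy_trajectory = [z_axis, x_axis] * 3
toy_ct = correlation_function(toy_trajectory, max_lag=2)  # -> [1.0, -0.5, 1.0]
```

The double loop scales with the product of the number of frames and the number of lags; for long trajectories an FFT-based evaluation would normally be preferred, which is outside the scope of this sketch.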

When analyzing MD with detectors, one has two options: find the optimal set of detectors for describing correlation time distributions found with MD (that is, as many as possible with good signal-to-noise, and as narrow/non-overlapping as possible), or optimize the detectors to match some or all of the NMR-derived detectors. The latter approach is shown in Figure 14, where sensitivities of seven NMR experiments in Figure 14A are optimized to yield five detectors in Figure 14C, and the linear combination used to yield $ρ_{2} (z)$ is explicitly illustrated in Figure 14B. From MD, time points in the correlation function may also be linearly combined (sensitivities for 11 time points shown in Figure 14D) to match the NMR-derived detectors (Figures 14E,F). Note that in Figure 14F, the linear combination is a very good match for ρ₁ and ρ₂, with moderate success for ρ₀, but detector sensitivities in the microsecond range are badly reproduced. The detector optimization indicates (correctly) that a 1 μs trajectory cannot reasonably predict dynamics in the range of several microseconds (ρ₃, ρ₄ in red, violet), thus providing a means for determining what information can and cannot be compared across methods. Where sensitivities agree, quantitative comparison of dynamics in MD and NMR is possible. Note that in Figures 14D–F, we only show 11 time points for illustrative purposes, but this procedure is equally valid for ∼10⁶ time points (for such a long correlation function, calculating Eq. 37 takes much longer than evaluating its result with detectors).
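The simplest possible linear combination of time points, a difference of two, already yields a window-like sensitivity and may help build intuition for Figures 14D–F. The sketch below is not the detector optimization itself: the coefficients (+1, −1) and the two time points are chosen by hand, and the function names are hypothetical. Only the sensitivity $R_{C (t_{n})} (z) = \exp (- t_{n} / (10^{z} \cdot 1 s))$ is taken from the text.

```python
import math


def time_point_sensitivity(t_n, z):
    """R_{C(t_n)}(z) = exp(-t_n / (10**z * 1 s)), with t_n in seconds."""
    return math.exp(-t_n / 10.0 ** z)


def two_point_window(z, t_short, t_long):
    """Hand-picked linear combination (+1, -1) of two time-point sensitivities."""
    return time_point_sensitivity(t_short, z) - time_point_sensitivity(t_long, z)


t_short, t_long = 1e-9, 1e-8  # 1 ns and 10 ns time points (illustrative)

# Evaluate the combined sensitivity on a grid of z = log10(tau_c / s)
z_grid = [-12.0 + 0.01 * k for k in range(701)]  # -12 ... -5
values = [two_point_window(z, t_short, t_long) for z in z_grid]
z_peak = z_grid[values.index(max(values))]

# Setting the derivative with respect to tau_c to zero gives the maximum at
# tau_c = (t_long - t_short) / ln(t_long / t_short)
tau_peak_analytic = (t_long - t_short) / math.log(t_long / t_short)
```

The combination vanishes for correlation times much shorter than `t_short` (both time points have fully decayed) and much longer than `t_long` (neither has decayed), leaving a window between the two. An optimized detector uses many time points and fitted coefficients to shape such windows to match a target sensitivity.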

FIGURE 14. Combining NMR and MD. (A) plots normalized NMR sensitivities for a selection of experiments (S², ¹⁵N R₁ at 400, 500, 850 MHz, ¹⁵N R_1ρ at 850 MHz, 60 kHz MAS, ν₁ at 10, 25, and 45 kHz). (B) shows a linear combination of the normalized sensitivities (x, y positions shifted to reduce plot overlap), which yields the sensitivity of ρ₂, shown in (C) (green, bold). In color are the weighted contributions from each rate constant, and grey shows the cumulative sum (summing all sensitivities at and below the grey line). (C) shows the five sensitivities optimized from NMR data. (D) plots sensitivities of time points from MD-derived correlation functions (0 s, 10 points log-spaced from 50 ps to 1 μs). (E) shows a linear combination of those sensitivities, optimized to match the sensitivity of ρ₂ (x, y positions shifted to reduce plot overlap). (F) shows detectors optimized to match the NMR-derived detectors in (C). (G) shows spatial correlation of motion in a helix as a function of correlation time (windows for <20, ∼20, ∼100, ∼800 ns). Color intensity and bond radii indicate the correlation coefficient between that residue’s H–N motion and the motion of the black residue. (H) illustrates frames used to separate transformation from the PAS to the lab frame into four steps via three intermediate frames: a peptide plane frame, a helix frame, and a molecule frame (illustration inspired by Brown (1996), molecule plots created with ChimeraX (Pettersen et al., 2021)).

The detector analysis then provides a very reliable means of comparing NMR results to MD simulation. The ability to easily compare results across multiple methods is one of the primary advantages of detector analysis. We should note that carefully executed fitting of MD-derived correlation functions, followed by calculation of relaxation rate constants should yield similarly reliable rate constants, if the trajectory is sufficiently long (Mollica et al., 2012). However, the rate constants themselves are sensitive to a broader range of correlation times than detectors, so that the comparison has lower timescale resolution than detectors.

With MD and NMR data sets, one may then use NMR data via detectors (or relaxation rate constants) as a means of validating the MD, and potentially refining it; methods include selecting sections of trajectories that best reproduce experiment (Salvi et al., 2016), selecting the best force fields for a system (Antila et al., 2021), or validating the refinement of a force-field itself (Hoffmann et al., 2018a; Hoffmann et al., 2018b). One may also use NMR data (specifically order parameters) as a means of directing the simulation, so that the simulation returns parameters matching the experiment (Hansen et al., 2014). One should note that a major challenge of combining NMR and MD data is that, while NMR is highly sensitive to microsecond motions, for example, via R_1ρ measurements, it is challenging to obtain accurate dynamics on the microsecond timescale from MD simulations. Although MD simulations now regularly extend for multiple microseconds, or longer via enhanced sampling (Bernardi et al., 2015), one still lacks sufficient statistics to obtain reliable dynamics behavior. Consider: if we investigate a 1 μs motion using a 10 μs trajectory, we should observe 10 events, but the variance in the number of events is also 10 (assuming Poisson statistics), corresponding to a standard deviation of about 3 events, so that large errors easily occur. Additionally, correct replication of slower motions requires all the faster motion leading up to the slow motion to occur at approximately the correct rates, so that the slower motions are more susceptible to influences like force field inaccuracies, starting structure of the system, etc. This remains a significant challenge for combining experiment and simulation, requiring creative solutions to take advantage of simulation where reproduction of experimental observables is poor.
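The event-counting argument above can be turned into a small helper for judging how long a trajectory would need to be. It is a rough sketch under the same Poisson assumption stated in the text, and the function name is hypothetical; it ignores force-field and starting-structure effects entirely.

```python
import math


def event_count_statistics(trajectory_length, correlation_time):
    """Expected number of events and its Poisson uncertainty.

    Both arguments are in the same time unit (seconds here).
    Returns (expected_events, standard_deviation, relative_uncertainty).
    """
    expected = trajectory_length / correlation_time
    std = math.sqrt(expected)  # Poisson: variance equals the mean
    return expected, std, std / expected


# 1 microsecond motion observed in a 10 microsecond trajectory
expected, std, relative = event_count_statistics(10e-6, 1e-6)

# Same motion observed in a 1 millisecond trajectory
expected_long, std_long, relative_long = event_count_statistics(1e-3, 1e-6)
```

Under this assumption the relative uncertainty falls only with the square root of the trajectory length: roughly 32% for the 10 μs trajectory and roughly 3% for a trajectory one hundred times longer.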

With experimental validation or refinement of an MD simulation, one may analyze the simulation further, with improved confidence of the accuracy of the simulation. However, we want to use the simulation specifically to improve our interpretation of the experimental parameters. For example, we recently showed that it was possible to calculate the spatial correlation of motions within a given detector window between different residues in HET-s (218–289) fibrils (Smith et al., 2019b), using a modified iRED analysis (Prompers and Brüschweiler, 2001; Prompers and Brüschweiler, 2002). The result is that we could see that detector windows corresponding to longer correlation times tended to result in correlation over longer distances, providing at least some explanation for the presence of slow, low amplitude motion in fibrils. A similar correlation analysis is shown in Figure 14G, in this case for residues in an α-helix, where similarly, detector windows corresponding to longer correlation times yield longer correlation lengths. We suspect this behavior to be nearly universal: even in well-defined structures, there is always some residual flexibility. Then, both short- and long-range modes of motion should be thermally populated (in terms of modes, these are more accurately described as having short and long wavelengths). However, the longer-range modes usually have longer correlation times, resulting in the trends in Figure 14G. Note that this implies that there should almost always be distributions of correlation times due to varying correlation length, further complicating the interpretation of the two to three correlation times provided by the EMF approach.

For fairly rigid regions of a molecule, we expect detector-specific correlation analysis to help explain dynamic trends. However, what should we do for regions that are more mobile, with multiple types of motion contributing? Having all of the atom positions in an MD simulation should provide the detail that would allow us to separate different motions. Then, we could define the total motion of a bond as resulting from the product of these motions. For example, for an H–N dipole coupling in an α-helix, the total rotation of the dipole is the result of the reorientation of the principal axis system (PAS) of the dipole within the peptide plane (PP), the peptide plane reorienting within the helix, the helix reorienting within the molecule, and the molecule reorienting within the lab frame.

\begin{array}{l} \vec{v} (t + τ) = R (Ω_{τ, t + τ}) \cdot \vec{v} (τ) \\ = R (Ω_{τ, t + τ}^{mol . \to lab}) \cdot R (Ω_{τ, t + τ}^{helix \to mol .}) \cdot R (Ω_{τ, t + τ}^{PP \to helix}) \cdot R (Ω_{τ, t + τ}^{PAS \to PP}) \cdot \vec{v} (τ) \end{array} (38)

This concept is illustrated in Figure 14H. In the case that it is possible to derive a correlation function from each rotation, one then may effectively achieve an in silico model-free type separation of the motions contributing to the total correlation function. A similar approach for the specific separation of librations, φ/ψ reorientation, and peptide plane tumbling in intrinsically disorded proteins has been demonstrated by Salvi et al. (2017); however, we find that it is possible to fully generalize this concept for separation of arbitrary definitions of independent motions (manuscript under revision; Smith et al., 2021b). Then, separated motions may also be analyzed with detectors, to determine how both experimental and simulated detector responses depend on both timescale and position in the molecule. Separation of motions could also be coupled with mode analyses such as iRED (Prompers and Brüschweiler, 2001, 2002) or principal component analysis (Amadei et al., 1993; Altis et al., 2007), providing a method to better characterize distributions of correlation times arising from different motions and complex mode-like dynamics. In each proposed case, comparison of the different MD analyses is possible via the detector analysis. Our eventual goal is that one may extract enough detail from the MD to build explicit models of motion for direct application to the NMR experimental results, so that the final characterizations are no longer model-free at all, but rather yield highly detailed models based on the combined information from experiment and simulation.
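The ordering of the product in Eq. 38 can be illustrated with plain rotation matrices. The sketch below only demonstrates that the total rotation is the ordered product of the individual steps, with the rightmost factor (PAS to peptide plane) acting on the vector first. The rotation axes and angles are arbitrary hypothetical values, and the sketch does not reproduce how the frames are actually defined from atom positions.

```python
import math


def rot_z(angle):
    """Rotation matrix about the z-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """Rotation matrix about the y-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(rotation, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(rotation[i][k] * v[k] for k in range(3)) for i in range(3)]


# Hypothetical rotations between times tau and t + tau, ordered left to right
# as in Eq. 38
steps = [
    rot_z(0.30),   # molecule -> lab
    rot_y(0.20),   # helix -> molecule
    rot_z(-0.50),  # peptide plane -> helix
    rot_y(0.10),   # PAS -> peptide plane
]

# Total rotation as the ordered matrix product
total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for r in steps:
    total = matmul(total, r)

v_start = [0.0, 0.0, 1.0]
v_product = apply(total, v_start)

# Equivalent step-by-step application, innermost motion first
v_stepwise = v_start
for r in reversed(steps):
    v_stepwise = apply(r, v_stepwise)
```

Because rotations do not commute in general, swapping the order of the factors would describe a different total motion, which is why the hierarchy of frames has to be fixed before the motions are separated.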

Conclusion

We show that the original model-free approach, SDM, LeMaster’s approach, and detectors all belong to a class of methods where fit parameters result from a linear combination of experimental relaxation rate constants (potentially requiring an additional arithmetic step to yield the final parameters). IMPACT is a close approximation to this behavior, whereas EMF parameters exhibit significantly different behavior. Analysis methods belonging to this class are particularly useful because it is straightforward to estimate the resulting parameters if the distribution of correlation times, $(1 - S^{2}) θ (z)$, is known. This is particularly advantageous when determining if a model is consistent with experimentally determined parameters, and also allows easy comparison of multiple methods.

The detector analysis is the most general of these approaches, being applicable to any collection of NMR relaxation experiments probing reorientational motion, and can be generalized for other methods such as MD simulation, requiring very little modification of the analysis. Then, the resulting detector responses from NMR and MD are easily compared. With experimental validation of MD, one may then use the wealth of detail in MD simulation to better understand how experimentally derived parameters are related to specific motion, via correlation of motion, separation of motion, and other existing and yet-to-be developed techniques. This has the potential to lead to improved models of motion for NMR analysis, which in turn can help obtain a more fundamental understanding of dynamics in biomolecular systems.

Author Contributions

AS has prepared the manuscript text and KZ has prepared the figures.

Funding

The authors acknowledge funding by the Deutsche Forschungsgemeinschaft (DFG) grant SM 576/1-1 and by European Social Funds (ESF) and the Free State of Saxony (Junior Research Group UniDyn, Project No. SAB 100382164). The authors also acknowledge support from the German Research Foundation (DFG) and Universität Leipzig within the program of Open Access Publishing.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmolb.2021.727553/full#supplementary-material

References

Abergel, D., and Palmer, A. G. (2003). On the Use of the Stochastic Liouville Equation in Nuclear Magnetic Resonance: Application to R1ρ Relaxation in the Presence of Exchange. Concepts Magn. Reson. 19A, 134–148. doi:10.1002/cmr.a.10091
