Free Energy, Enthalpy and Entropy from Implicit Solvent End-Point Simulations
- 1Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Universita' di Udine, Udine, Italy
- 2Istituto Nazionale Biostrutture e Biosistemi, Rome, Italy
- 3Dipartimento di Area Medica, Universita' di Udine, Udine, Italy
- 4Science and Math Division, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates
Free energy is the key quantity to describe the thermodynamics of biological systems. In this perspective we consider the calculation of free energy, enthalpy and entropy from end-point molecular dynamics simulations. Since the enthalpy may be calculated as the ensemble average over equilibrated simulation snapshots the difficulties related to free energy calculation are ultimately related to the calculation of the entropy of the system and in particular of the solvent entropy. In the last two decades implicit solvent models have been used to circumvent the problem and to take into account solvent entropy implicitly in the solvation terms. More recently outstanding advancement in both implicit solvent models and in entropy calculations are making the goal of free energy estimation from end-point simulations more feasible than ever before. We review briefly the basic theory and discuss the advancements in light of practical applications.
Free energy is the key quantity in the description of the thermodynamics of biological systems, and therefore an important objective of biomolecular simulations. In spite of its relevance, free energy calculations are not performed for every simulation performed although most often concepts like (local) stability are used to describe the results of the simulation suggesting a thermodynamic description.
The difficulties arise obviously from the calculation of the entropy of the system which cannot be immediately computed as an ensemble average over simulation snapshots, contrary to the enthalpy. The free energy G can be formally related to an ensemble average:
where R is the gas constant, T the temperature, β is equal to , U is the potential energy and ⟨⟩ indicates thermodynamic ensemble average. However, as pointed out many years ago (Beveridge and DiCapua, 1989), this formula is of no practical use unless all configurational space enters (implicitly or explicitly) the average, because lowest probability configurations display the highest value contribution to the average and there cannot be convergence for upper unbound potentials. In molecular simulations typically only near-equilibrium configurations are sampled.
For this reason free energy differences calculation is performed using methods, like umbrella sampling (Torrie and Valleau, 1977), thermodynamic integration (Straatsma and McCammon, 1991) or metadynamics (Laio and Parrinello, 2002), that compute by various methods the free energy along a pathway connecting end-point states, even if it is often just the difference in free energy between the latter points that is actually needed.
If the two end-points are close enough a simulation of one or both end-points may be performed and free energy differences obtained by free energy perturbation (Zwanzig, 1954), but this is not possible in general.
The problem with the computation of the entropy from a single MD simulation is that only part of the configurational space is accessed by simulation.
In this short perspective we remark that the long sought calculation of free energies from end point simulations may be afforded with reasonable accuracy from implicit solvent end-point simulations. In particular, some recent developments in the way entropy is calculated in fact allow to compute free energy from a single simulation, so it is worth to review briefly the theory here and to provide simple formulae to compute free energy, entropy and enthalpy from implicit solvent simulations.
We consider here simulations sampling different states of a system A in solution. This can be done by different simulations if the states are kinetically well separated, or by post-processing a single simulation to divide microstates belonging to different states.
If the simulation is extensive enough most probable microstates are sampled and representative thermodynamic ensembles are generated for the different states which include samples of all microstates whose probability density is higher than a threshold.
Enthalpy can be computed as the energy ensemble average, whereas entropy is problematic, in particular for what concerns solvent degrees of freedom. Although solvent entropy has been taken into account in some studies (e.g., De Simone et al., 2005) the extensive correlation among solvent molecules makes solvent entropy estimation a difficult task. In order to get rid of this problem we treat the system using an implicit solvent model.
2.1. Implicit Solvent Models
Following the excellent reviews available (Gilson et al., 1997; Roux and Simonson, 1999; Wereszczynski and McCammon, 2012) we write the standard chemical potential, or molar free energy, in the following form:
where and are the solute and solvent coordinates. C includes the integrals over the reference state (1M, random orientation) and the momentum integrals (independent of conformation) that cancel when comparing different states of the same system and the term , with P0 the standard pressure and the partial volume of the solute, whose dependence on conformation provides negligible effects at standard pressure. For a derivation see Gilson et al. (1997).
For this reason in the following C will not be considered further.
Due to the difficulties in estimating the entropy of highly correlated solvent molecules the solvation potential of mean force () is defined by integrating out the solvent degrees of freedom.
where UAS denotes the energy terms that couple the solute and the solvent and US is the solvent energy.
With this definition and the assumptions made above we have:
where the dependence on the temperature of the solvation energy ΔW has been made explicit.
Implicit solvent models provide a functional form and parameters for .
Although the implicit solvent model based on the Poisson-Boltzmann equation (Fogolari et al., 2002) and the solvent accessible surface area (PBSA) could be used in molecular dynamics simulations (Fogolari et al., 2003), the method based on the Generalized Born model and solvent accessible surface area (GBSA) is the method of choice for its computational efficiency (Still et al., 1990; Onufriev et al., 2004). Limitations of both methods include treatment of small crevices and cavities, where water could not display bulk solvation properties, and neglection of curvature dependence of surface tension coefficient (Nicholls et al., 1991). For GBSA an additional limitation could be due to the dependence of empirical parameters on molecular shape (see e.g., Fogolari et al., 2015a). Compared to approaches simulating the solutes in explicit solvent and post-processing the trajectory in implicit solvent (Kollman et al., 2000), methods that generate the configurational ensemble using the same implicit solvent used for energy calculations are more consistent because they don't suffer from possible mismatches between the implicit and explicit solvent models.
In recent years some of the available implicit solvent models (including forcefield and set of parameters used (Swanson et al., 2005, 2007)) have been shown to be extremely accurate in the treatment of protein thermodynamics, as demonstrated by the study by Simmerling and coworkers, where 16 out of 17 proteins could be correctly folded using the GB-Neck2 implicit solvent model and for 14 of them the native fold was preferred over the misfolded one (Mongan et al., 2007; Nguyen et al., 2014).
2.2. Enthalpy in Implicit Solvent Models
The molar enthalpy is obtained as:
When the derivation is performed, taking into account that also depends on the temperature, we obtain:
where the symbol <> indicates the ensemble average of the quantity within brackets, i.e., the average over the simulation of the same quantity.
The above equation shows that the enthalpy, which is obtained as the energy ensemble average in MD simulations, is directly related to the ensemble average of the implicit solvent model potential energy. The difference is expressed by the term which may however be obtained by explicit derivation with respect to the temperature of the solvation energy terms in the implicit solvation model.
In the GBSA model the solvation energy is the sum of an electrostatic (ΔWel) and a surface tension (ΔWSA) term:
where ϵ(T) and ϵin are the solvent and molecular (typically 1) dielectric constants, respectively, qi is the charge of the ith atom, fij a pairwise function depending on all atoms' coordinates, γ(T) is a surface tension and A is the solvent-accessible surface area. The dependence on temperature of the solvent dielectric constant and of the surface tension has been made explicit in the above equation. The contributions to the derivative of ΔW with respect to temperature are easily obtained from the electrostatic and surface tension solvation energy terms, when the temperature dependence of the parameters of the implicit solvent model is known:
2.3. Entropy in Implicit Solvent Models
The entropy is linked to the configurational solute probability distribution.
First consider that the probability density distribution in the implicit solvent model is:
When the derivation of the free energy is performed the entropy can be written as:
Rewriting the entropy in terms of probability density allows to rewrite the free energy in turn as the sum of an ensemble average and a configurational entropy term, which is itself an ensemble average:
As said above the difficulty in estimating free energies resides in the entropy estimation which requires consideration of both sampled and non-sampled configurational space. Similarly, to estimate , consideration of non-sampled configurational space is needed. Implicit solvent models circumvent the problem treating solvent entropy implicitly, through the parameters and their temperature dependence.
In the above equation the configurational entropy is written formally as an ensemble average of the probability density. In this respect a convenient description of systems containing proteins is the bond, angle, torsion (BAT) representation (Go and Scheraga, 1976), because bonds, and to some extent also angles, contribute very little to changes in entropy for different states of proteins (e.g., notably, bound and non-bound) (Karplus et al., 1987). For this reason entropy is estimated often considering to a first approximation only torsional degrees of freedom and possibly external rotation and translation degrees of freedom.
Other descriptions, e.g., in cartesian coordinates, have been used and the entropy has been calculated assuming the system is moving harmonically or anharmonically about the energy minimum (see recent reviews Polyansky et al., 2012; Wereszczynski and McCammon, 2012). This approximation may however be poor for loops or other unrestrained parts of the molecules.
A particularly attractive method to compute entropy is the approach proposed by Singh et al. (2003) which was further developed by the same authors and others (Hnizdo et al., 2003, 2007, 2008; Darian et al., 2005; Numata et al., 2007; Wang et al., 2009; Mukherjee, 2011; Fenley et al., 2014; Huggins, 2014; Fogolari et al., 2015b, 2016). In practice the probability density ρ(x1, x2, …, xs) may be estimated considering a ball of radius ri around each configurational sample up to the k-th nearest neighbor (Figure 1), then the local probability density () is obtained by dividing the number k of the n samples which are found inside the ball over the volume of the ball Vi and n, i.e.,:
The idea is very simple and can be made rigorous (Singh et al., 2003). When equation 14, or its exact form, is substituted in equation 13 the free energy may be estimated easily from configurational samples.
The discrepancy between naive and exact treatment is limited. Note that the volume around each configurational sample is tuned to match the local density and there is no need consider regions where no configurational sample is found. Obviously the dimensionality of a protein system is large and it is not possible in practice to consider a probability density without actually assuming that most degrees of freedom are decoupled from each other.
The groups of Gilson (Killian et al., 2007) and Tidor (King and Tidor, 2009; King et al., 2012) have proposed treatments of mutual information that provide a practical estimate of global entropy (actually, an upper bound) considering only single degrees of freedom and pairwise mutual information. Depending on the number of samples the approach can be easily extended to higher orders of correlation.
We have recapitulated above the fundamentals of implicit solvent free energy calculations from end-point simulations, recalling the relationship between explicit and implicit solvent models, and showing how entropy and enthalpy can be obtained from implicit solvent simulations.
From a practical point of view we can estimate free energy from implicit solvent simulations using equation 13:
The ensemble average of U + ΔW is provided by the potential energy in the implicit solvent model used, whereas the entropy must be computed from the conformational ensemble.
The available implicit solvent models and parameters, together with forcefield parameters, have been shown to be accurate enough to reproduce complex phenomena like protein folding giving confidence in the accuracy of the implicit solvent potential of mean force. The average of the potential energy over an implicit solvent molecular dynamics trajectory thus provides the first term in the above equation.
The entropic term may be estimated using the nearest neighbor method which is emerging as an accurate entropy estimator, with many advantages over more traditional methods, including lack of hypothesis on non-sampled conformational space, no need to consider explicitly non-sampled conformational space and sound theoretical basis.
To make the application of the method straightforward we have implemented the nearest-neighbor method in two programs, PDB2ENTROPY and PDB2TRENT, available through the git-hub repository (URL: https://github.com/federico-fogolari) which allow to compute conformational and rotational-translational entropy directly from the conformational ensemble in PDB format.
In summary, the recent advancements in solvation models and entropy calculation, based on the nearest-neighbor method, are making computation of free-energy from end-point simulations significantly more accurate than before, with many possible applications in the next future.
All authors listed, have made substantial, direct and intellectual contribution to the work, and approved it for publication.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Dr. I. M. Lait and A. Makek are gratefully acknowledged for support.
Beveridge, D. L., and DiCapua, F. M. (1989). Free energy via molecular simulation: application to chemical and biomolecular systems. Annu. Rev. Biophys. Biophys. Chem. 18, 431–492. doi: 10.1146/annurev.bb.18.060189.002243
Darian, E., Hnizdo, V., Fedorowicz, A., Singh, H., and Demchuk, E. (2005). Estimation of the absolute internal-rotation entropy of molecules with two torsional degrees of freedom from stochastic simulations. J. Comput. Chem. 26, 651–660. doi: 10.1002/jcc.20198
De Simone, A., Dodson, G. G., Verma, C. S., Zagari, A., and Fraternali, F. (2005). Prion and water: tight and dynamical hydration sites have a key role in structural stability. Proc. Natl. Acad. Sci. U.S.A. 102, 7535–7540. doi: 10.1073/pnas.0501748102
Fenley, A. T., Killian, B. J., Hnizdo, V., Fedorowicz, A., Sharp, D. S., and Gilson, M. K. (2014). Correlation as a determinant of configurational entropy in supramolecular and protein systems. J. Phys. Chem. B 118, 6447–6455. doi: 10.1021/jp411588b
Fogolari, F., Corazza, A., and Esposito, G. (2015a). Accuracy assessment of the linear Poisson-Boltzmann equation and reparametrization of the OBC generalized Born model for nucleic acids and nucleic acid-protein complexes. J. Comput. Chem. 36, 585–596. doi: 10.1002/jcc.23832
Fogolari, F., Corazza, A., Fortuna, S., Soler, M. A., VanSchouwen, B., Brancolini, G., et al. (2015b). Distance-based configurational entropy of proteins from molecular dynamics simulations. PLoS ONE 10:e0132356. doi: 10.1371/journal.pone.0132356
Fogolari, F., Dongmo Foumthuim, C. J., Fortuna, S., Soler, M. A., Corazza, A., and Esposito, G. (2016). Accurate estimation of the entropy of rotationtranslation probability distributions. J. Chem. Theory Comput. 12, 1–8. doi: 10.1021/acs.jctc.5b00731
Gilson, M. K., Given, J. A., Bush, B. L., and McCammon, J. A. (1997). The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys J. 72, 1047–1069. doi: 10.1016/S0006-3495(97)78756-3
Hnizdo, V., Darian, E., Fedorowicz, A., Demchuk, E., Li, S., and Singh, H. (2007). Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules. J. Comput. Chem. 28, 655–668. doi: 10.1002/jcc.20589
Hnizdo, V., Fedorowicz, A., Singh, H., and Demchuk, E. (2003). Statistical thermodynamics of internal rotation in a hindering potential of mean force obtained from computer simulations. J. Comput. Chem. 24, 1172–1183. doi: 10.1002/jcc.10289
Hnizdo, V., Tan, J., Killian, B. J., and Gilson, M. K. (2008). Efficient calculation of configurational entropy from molecular simulations by combining the mutual-information expansion and nearest-neighbor methods. J. Comput. Chem. 29, 1605–1614. doi: 10.1002/jcc.20919
Killian, B. J., Yundenfreund Kravitz, J., and Gilson, M. K. (2007). Extraction of configurational entropy from molecular simulations via an expansion approximation. J. Chem. Phys. 127:024107. doi: 10.1063/1.2746329
King, B. M., Silver, N. W., and Tidor, B. (2012). Efficient calculation of molecular configurational entropies using an information theoretic approximation. J. Phys. Chem. B 116, 2891–2904. doi: 10.1021/jp2068123
Kollman, P., Massova, I., Reyes, C., Kuhn, B., Huo, S., Chong, L., et al. (2000). Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models. Acc. Chem. Res. 33, 889–897. doi: 10.1021/ar000033j
Mongan, J., Simmerling, C., McCammon, J. A., Case, D. A., and Onufriev, A. (2007). Generalized Born model with a simple robust molecular volume correction. J. Chem. Theory Comp. 3, 156–169. doi: 10.1021/ct600085e
Nguyen, H., Maier, J., Huang, H., Perrone, V., and Simmerling, C. (2014). Folding simulations for proteins with diverse topologies are accessible in days with a single physics-based force field and implicit solvent. J. Am. Chem. Soc. 136, 13959–13962. doi: 10.1021/ja5032776
Nicholls, A., Sharp, K. A., and Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins Struct. Funct. Genet. 11, 281–296. doi: 10.1002/prot.340110407
Onufriev, A., Bashford, D., and Case, D. A. (2004). Exploring protein native states and large-scale conformational change s with a modified generalized Born model. Proteins Struct. Func. Gen. 55, 383–394. doi: 10.1002/prot.20033
Polyansky, A. A., Zubac, R., and Zagrovic, B. (2012). Estimation of conformational entropy in protein-ligand interactions: a computational perspective. Methods Mol. Biol. 819, 327–353. doi: 10.1007/978-1-61779-465-0_21
Still, W. C., Tempczyk, A., Hawley, R. C., and Hendrickson, T. (1990). Semianalytical treatment of solvation for molecular mechanics and dynamics. J. Am. Chem. Soc. 112, 6127–6129. doi: 10.1021/ja00172a038
Swanson, J. M. J., Wagoner, J. A., Baker, N. A., and McCammon, J. A. (2007). Optimizing the Poisson dielectric boundary with explicit solvent forces and energies: lessons learned with atom-centered dielectric functions. J. Chem. Theory Comp. 3, 170–183. doi: 10.1021/ct600216k
Wang, L., Abel, R., Friesner, R. A., and Berne, B. J. (2009). Thermodynamic properties of liquid water: an application of a nonparametric approach to computing the entropy of a neat fluid. J. Chem. Theory Comput. 5, 1462–1473. doi: 10.1021/ct900078k
Wereszczynski, J., and McCammon, J. A. (2012). Statistical mechanics and molecular dynamics in evaluating thermodynamic properties of biomolecular recognition. Q. Rev. Biophys. 45, 1–25. doi: 10.1017/S0033583511000096
Keywords: free energy, enthalpy, entropy, molecular dynamics simulations, implicit solvent, continuum solvent, MM/GBSA
Citation: Fogolari F, Corazza A and Esposito G (2018) Free Energy, Enthalpy and Entropy from Implicit Solvent End-Point Simulations. Front. Mol. Biosci. 5:11. doi: 10.3389/fmolb.2018.00011
Received: 26 October 2017; Accepted: 25 January 2018;
Published: 08 February 2018.
Edited by:Paolo De Los Rios, École Polytechnique Fédérale de Lausanne, Switzerland
Reviewed by:Alfonso De Simone, Imperial College London, United Kingdom
Patrick Senet, Université de Bourgogne, France
Copyright © 2018 Fogolari, Corazza and Esposito. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Federico Fogolari, email@example.com