Surveying an Energy Landscape

We derive a formula that expresses the density of states of a system with continuous degrees of freedom as a function of microcanonical averages of squared gradient and Laplacian of the Hamiltonian. This result is then used to propose a novel flat-histogram Monte Carlo algorithm, which is tested on a three-dimensional system of interacting Lennard-Jones particles, the O(n) vector spin model on hypercubic lattices in D = 1 to 5 dimensions, and the O(3) Heisenberg model on a triangular lattice featuring frustration effects.


I. INTRODUCTION
Central to statistic physics are the canonical ensemble, its partition function, and as defining quantity the temperature on the one hand as well as the microcanonical ensemble and the density of states defined by energy on the other.While the energy in the canonical ensemble takes the form of a weighted average over all microstates, temperature in the microcanonical ensemble becomes the inverse of the derivative of the logarithmic density of states.However, in 1997 Rugh [1] showed that it too can be expressed as an average of an observable comprising various derivatives of the Hamiltonian.The same formula had already been found [2], but -judging by the number of citations -received very little attention.With Rugh's work at hand it was soon realized [3] that this relation among other things offers a way to verify the correctness of Monte Carlo (MC) computer simulations.Unfortunately, the formula involves a rather unwieldy term containing the Hamiltonian's Hessian that is computationally demanding.In this study, we develop a formula that expresses the density of states of the system as a function of microcanonical averages of squared gradient and Laplacian of the Hamiltonian while avoiding this term.
In the last decades MC simulations have become an important tool to investigate thermodynamic properties of models of complex systems.Today, many different techniques are used.In addition to the famous Metropolis algorithm [4] which samples from the canonical ensemble nowadays generalized ensemble methods have become more prevalent.Among these, flat-histogram method such as multicanonical (MUCA) [5], the Wang-Landau method [6], and Statistical Temperature MC [7] aim to create the same ensemble where the distribution as a function of energy is constant.On the one hand this allows to reweight the data to a canonical ensemble with any desirable temperature, while on the other hand even if one is only interested in low-temperature behavior the inclusion of high energies allows the system to decorrelate more easily.To bias the system's random walk such that this ensemble becomes the stationary distribution the logarithm of the density of states (DoS) or its derivative must be known with sufficient precision.This can be achieved in an iterative process [8,9] that analyses and incorporates successively created histograms or on the fly by permanently altering the estimate of the DoS [6] or its derivative [7] based only on the energy of the current state of the system.Either way, the estimate of the DoS that is created and employed to drive the algorithm solely based on the energy time series.No other information is utilized.
It is an interesting exercise and test of the accuracy and precision achieved with our formula to measure the microcanonical properties of the energy landscape and use them to estimate the DoS during a flat-histogram MC simulation that at the same time is using that estimate to achieve a flat distribution in energy.In this study we demonstrate this idea with two examples: A system of interacting Lennard-Jones particles and the O(n) vector spin model.
The paper is organized as follows: In section 2 we once more derive the formula of Gilat and Rugh and proceed to calculate our alternative.In section 3 we discuss how a flat-histogram algorithm can be formulated and develop a suitable method for numerical integration.We then apply the method to a Lennard-Jones system in section 4 and to the O(n) vector spin model in section 5.In section 6 we finish with some concluding remarks.

II. CALCULATING THE DENSITY OF STATES
The DoS or partition sum of the microcanonical ensemble at energy E is given by where H is the Hamiltonian of the system, N the number of degrees of freedom, and the integration goes over the entire state space.This can be rewritten as a surface integral with A E = {X : H (X) = E} being the surface of constant energy E and ∇ is the gradient ( d dx1 , d dx2 , ..., d dxN ) T .The microcanonical average of any observable Q(X) is given by To be able to apply the divergence theorem we rewrite (2) again: Here, n = ∇H (X)/|∇H (X)| is a unit vector perpendicular to the plane of constant energy and pointing to higher energies.It is therefore parallel to ∇H (X).The derivative of the DoS with respect to energy is We insert (5), apply the divergence theorem and integrate perpendicular to the surfaces by multiplying the thickness of the integration volume Dividing by g(E) on both sides we obtain the known result [1, 2, 10-13] with where ∆ = is the Laplace operator and H denotes the Hessian matrix of the Hamiltonian, ∂xi∂xj .The observable B which can in principle be calculated for any microstate X of the system at hand, allows us to determine the DoS up to a factor regardless of the applied algorithm: where E 0 can be chosen freely.It is worth noting that B relates directly to temperature.While it is true by definition that its microcanonical average equals the inverse microcanonical temperature if the latter is defined as it can also easily be shown that its canonical average is equal to the inverse canonical temperature: While gradient and Laplace operator can be applied to H without too much computational effort [14] the determination of the Hessian matrix is very demanding and a simpler scheme would be preferable.We start again with the microcanonical average of some observable Q(X) and now consider its energy derivative The integral can be transformed, and the derivative be calculated similar to the procedure used above.Using differential quotient and divergence theorem we find (18) We obtain for the inverse microcanonical temperature: Integrating on both sides and exponentiating gives (20) We first derived (20) in a different way: If one formally considers a random walk in configuration space with sufficiently small step length and equilibrates the system after every single step on the respective surface of constant energy, one obtains a one-dimensional stochastic process in energy with a stationary distribution proportional to g(E).The change in energy for a small step X → X ′ = X + x is given by and it follows that the drift for such a process is , where α is a constant related to the length of x.In this context (20) represents the solution of the Fokker-Planck equation.
In the context of MC simulations the DoS is virtually always determined via histograms.The distribution of energies within the employed ensemble is estimated and allows to calculate g(E).Although rare, faulty simulations with the detailed balance criterion violated are not unheard-of and it is sometimes not easy to spot such problems.It is worth noting that ( 10) and ( 20) provide an alternative way to determine the DoS and a comparison with the histogram-derived DoS can, therefore, be used to test whether an algorithm is in balance.

III. ALGORITHM
It is well known and widely used that within a Monte Carlo simulation a flat histogram can be produced if the acceptance probability for proposed moves is given by which can now be written as (23) The arguments X have been removed for the sake of clarity.
One main challenge is the accurate numeric evaluation of the integral.Since the energy is continuous, it is natural to employ a binning procedure.It might be worthwhile to use an adaptive binwidth with higher resolution in regions where the integrand changes rapidly with E, but here we use intervals of constant width h and estimate microcanonical averages of an observable O as the mean of all measurements with an energy E t ∈ I k : where t is the time index of the measurements.It is useful to also measure [E] k and use it instead of the midpoint of I k for the integration.We use the notation Following the standard approach for quadrature (numerical integration) we locally approximate the data by an analytical function and integrate the latter.However, the usual choice of polynomials does not represent the underlying mathematical relation well.Since it is we use the Ansatz which corresponds to a system with lowest energy η and constant (canonical) specific heat C = k B (µ + 1).Assuming equality in (26) it is and for two points (E i , f i ) and (E i+1 , f i+1 ) we obtain Thus we arrive at For the actual simulation we use the function with some suitable E 0 and as shown above calculate G(E k ) for all bins (intervals) early by using G(E k ) from the numerical integration and G ′ (E k ) = f k .The acceptance probability from eq. ( 23) becomes We simulate and measure ∇H and ∆H for a short while using G(E), recalculate it, and repeat.Of course, this introduces small violations of the detailed balance criterion, albeit to a much lesser extent than a Wang-Landau simulation.Nevertheless, as with any flat histogram simulation the final data should be produced in a simulation with fixed G(E) and (∇H ) 2 E .As we will show in the next section, in the form presented the algorithm is able to simulate quite large systems.However, there also is a drawback.The estimate of the DoS is necessarily based on information that has been gathered only in regions of state space that already have been sampled.This can include states that represent rare events in the converged ensemble, e.g., configurations that correspond to a supercritical gas.If the "correct" state -the condensate or droplet in the examplehasn't been found yet, then the estimates of microcanonical averages are dominated by the "wrong" data and it can take a very long time to correct this bias.Thus firstorder phase transitions or rough energy landscapes can pose a challenge for the algorithm in its basic form.A more refined method of averaging than eq.( 25) which attributes higher weight to later measurements might turn out to be a solution for this problem.

IV. LENNARD-JONES PARTICLES
Clusters of Lennard-Jones particles and their morphology at low temperature have been under study for a long time.Modeling noble gas atoms, Lennard-Jones particles are an interesting object of inquiry in their own right and they provide challenging benchmark systems for numerical optimization since their energy landscape contains numerous minima belonging to competing geometric structures.For small sizes the ground states have been determined some time ago [15][16][17] and the behavior is well understood.If the number of atoms is a few thousands or less, the low-temperature phase is dominated by icosahedral geometry [18].In many cases there are solidsolid transitions [19] where the outer layer of the cluster changes from a so-called anti-Mackay shape that maximizes entropy to a Mackay structure minimizing energy.In some rare cases N = 38, 75, 76, 77, 98, 102, 103, 104, . . .non-icosahedral states are occupied at a very low temperature leading to solid-solid transitions that can be extremely challenging to investigate by means of MC simulations [20].
We consider N particles in three dimensional-space which interact pairwise through a 12-6 Lennard-Jones potential With this parametrization the potential has its minimum at r 0 = 1.The particles are freely mobile within a cubic volume of linear extension L and we label their positions as x ∈ [0, L] 3 .The Hamiltonian reads One finds that (36) and calculating or refreshing is simpler.
We performed a simulation of N = 100 particles confined in a steric cube with L = 5r 0 .The ground-state energy of this system is E g = −557.039820[15] and we restrict the energy to −520 < E < 0. The energy as a function of simulation time in units of N single-atom displacement moves can be seen in Fig. 1.It is apparent that the simulation is able to reach all energies in the interval within a relatively short time.The early wedgeshaped blocks at low energy indicate that the averages are not converged yet and balance is established by repeatedly transitioning in and out of the low-energy state.Fig. 2 shows the microcanonical averages (∇H ) 2 E /N and ∆H E /N .Interestingly, the Laplacian shows a close to linear behavior throughout most of the energy interval, while squared gradient, as one would expect, goes to zero as E approaches the ground-state energy.Its graph also displays an inflection, signaling a transition.The integration parameters µ and η shown in Fig. 3 also strongly relate to the thermodynamic behavior of the system and might be used similarly to a microcanonical analyses of the density of states [21].Since µ is closely related to the specific heat it behaves similarly.Kinetic degrees of freedom are not taken into account in the simulation and as a consequence we observe µ ≈ 0 for high energies in the gas phase.The condensation transition towards a liquid droplet with a non-zero µ is rather weak due to the small system size.Around E ≈ −475, G(E) becomes concave which manifests as µ < 0. This signal indicates the first-order-like freezing transition which leads to the formation of an icosahedral structure [18].We suspect that the remaining signal at E ≈ −507 is caused by the rearrangement of surface atoms from a socalled anti-Mackay to a Mackay structure [19].All transitions also manifest in η.While η(E) < E in most cases if µ < 0 then the local approximation of G(E) does not become zero at E = η, but instead diverges.Since its slope is positive in these cases it is η > E.

V. O(n n n) SPIN MODEL
The O(n) spin model is the generalization of the Ising (n = 1), XY (n = 2), and Heisenberg (n = 3) spin models.In this model spins σ σ σ ∈ R n , |σ σ σ| = 1 are elements of the n-sphere and are positioned on sites of regular lattices and interact through the Hamiltonian where the sum runs over all lattice bonds and J is the spin-spin interaction strength.To evaluate ∇H and ∆H we first consider the contribution to the total energy by an individual spin σ σ σ k : where h k is the local field and nb(k) the set of neighbors of spin σ σ σ k .In the following we set J = 1 and refrain from displaying it in the formulae.This is equivalent to assuming that e k , h k , and E are measured in units of J.It is convenient to divide by |σ σ σ k | in eq. ( 40) and for the moment to relax the n-sphere constraint to |σ σ σ k | = 1 since this allows one to use the oneparticle gradient ∇ k = ( ∂ ∂σ σ σ k,1 , . . ., ∂ ∂σ σ σ k,n ) T in Cartesian coordinates with the radial component σ σ σ k (∇ k e k ) ensured to be zero.We find that and since If the system is homogeneous, i.e., if all spins are equivalent we can drop the index k and if the total number of spins is given by N it is Next we calculate that and noting that e E = simple result (49) Unfortunately, this formula does not generalize to the Ising model n = 1 since on the one hand a continuous energy scale is implicitly required and on the other hand for the Ising model We now consider hypercubic lattices in D dimensions with linear extension L, N = L D spins, and periodic boundary conditions.For these lattices the number of neighbors of any site (spin), the so-called coordination number, is z = 2D.During the simulation we use N bins and integrate after every 10 3 N individual spin updates.The single concern for selecting this value was to choose it large enough to not slow down the simulation by the computational effort of integrating.Proposed values for spins are selected randomly and independently of the current value.They are drawn using the rejection method for n = 2, Marsaglia's methods [23] for n = 3, 4 and our own technique [24] for n > 4. Time series of the energy per bond 2E/zN for different values of D and n and about N ≈ 10 3 spins are shown in Fig. 4. For these cases the simulation is able to cover most of the available energy interval within about 10 7 sweeps.We point out that for all simulations shown the ratio between the maximum of the DoS and the minimal value reached is between 10 780 and 10 2000 .Of course such values can also be achieved with state-of-the-art flat-histogram methods, but it is satisfying that this is possible with this method as well since it implies that the integration is done with adequate accuracy.The simulations fall a little bit short of the extremal energies E max = −E min = N D. We suspect that one reason is the comparatively large bin width which can become problematic if G or its derivative become very steep.From the measured densities of states g(E) ∝ exp[G(E)]/ (∇H ) 2 E shown in Fig. 5 it becomes apparent that due to the relatively low number of just 1000 bins, values in adjacent bins can differ by more than 20 orders of magnitude.It is satisfying that thanks to the linear interpolation of G(E) a relatively flat distribution inside the bins can be maintained regardless and transitions between the bins are still occurring.Another cause for decreasing performance at extremal energies will certainly be the small acceptance rate.Close to the minimal and maximal energy values spins are almost parallel to the local field and since we draw new spin values completely randomly the probability that such a proposal is accepted becomes very low.We refrained from optimizing the simulation since this study is mostly intended as a proof-of-concept.
We find that for large enough systems h 2 E depends only weakly on the dimension of spin space n.Note that the microcanonical ensemble in the thermodynamic limit directly fixes the correlation between neighboring spins and h 2 E can be expressed in terms of correlations of next-nearest neighbor spins where from any spin σ σ σ * the spins σ σ σ | , σ σ σ || , and σ σ σ are reached by one bond, two parallel bonds and two nonparallel bonds, respectively.This allows one to show that in one dimension h 2 E is even independent of n and one obtains lim which is in very good agreement with our data and would be indistinguishable from the graphs for D = 1 in Fig. 6.
For the other values of D all curves for different n in Fig. 6 also are very close to identical.Separate simulations for D = 2, 3 at energies close to the transition revealed that in the thermodynamic limit the difference in h 2 E for different values of n is of the order of 1%.This behavior is reminiscent of another case of unexpectedly small dependence on n: the critical energy density [25].
The situation is different for e 2 E which comprises z second moments of nearest-neighbor spin products (σ σ σ k σ σ σ i ) 2 E as well as z(z − 1) bond-bond correlations (σ σ σ k σ σ σ i )(σ σ σ k σ σ σ j ) E .We are able to calculate the curves for D = 1 and large N , but these do depend on n (see Appendix).The data in Fig. 7a suggest that for any D, large n and N an approximation may be given through with additional corrections.Here, f D (x) is a function that can easily be calculated in D = 1 dimension.We find However, it appears that this function is also valid for D > 1 and we are led to believe by Fig. 7b that the next correction is of the order z/n 1/D .This is of course a somewhat speculative heuristic analysis and even though the systems are of medium size N ≈ 10 3 the linear extension of the lattices for D > 2 is small.
Finally we applied the method to the Heisenberg model on a triangular lattice with 1024 spins again with 1000 bins.Now the system experiences frustration at positive energies or negative temperatures which for J = 1 correspond to the antiferromagnet that for this lattice type has a maximal energy 2E max /zN = 0.5.Again the algorithm is able to explore most of the energy range without getting trapped and the time series (not shown here) looks very similar to the previous cases.In Fig. 8 the resulting data for h 2 E , e 2 E , and the parameters µ and η are shown.

VI. CONCLUSIONS
In this study we reviewed how the density of states of a system can be calculated via the inverse microcanonical temperature, i.e., the derivative of the logarithmic density of states, and how the latter can be obtained by means of microcanonical averages.We then introduced an alternative method that avoids mixed derivatives of the Hamiltonian, such that instead of the Hessian only the Laplacian and the gradient are required thus reducing computational demands.Since the ratio of Laplacian and squared gradient needs to be integrated, preferably with high accuracy, we devised a simple method for numerical integration adapted to the mathematical properties of that function.
Once the density of states can be calculated with sufficient accuracy and precision it can be used to verify the results of established histogram-based methods or -as shown in this study -to design a novel flat-distribution Monte Carlo method.This method is similar to the multicanonical method, the Wang-Landau method, or Statistical Temperature MC with the important difference that the information required to bias the ensemble towards a flat distribution is not indirectly obtained through the distribution of energy values but directly measured from the gradient and curvature of the Hamiltonian at the surfaces of constant energy.
The simulations we conducted are intended to be a proof-of-concept and we did not focus on optimizing the algorithm.We deem it likely that improvements can be made in various ways just as there are various histogrambased methods.Even hybrid strategies are conceivable.We observe that the algorithm is able to produce flat histograms on intervals of energy over which the density of states differs by hundreds to thousands of orders of magnitude, which in turn is convincing evidence that our formula for the density of states is correct and that our method for numerical integration works well for this particular type of function.
We applied the method first to a system of one hundred interacting Lennard-Jones particles.In order to ensure a stable simulation and converging microcanonical averages we had to exclude the lowest part of the energy spectrum.Nevertheless, even in the current basic form the algorithm was able to cover all three phases -gaseous, liquid droplet, and frozen crystal-like -and also managed to map the low-temperature structural transition of the surface atoms.It turned out that the auxiliary data that are produced during the integration can be used to identity the transitions and the energies at which they occur.
Second we considered the O(n) vector-spin model.After deriving expressions for the Laplacian and gradient of the Hamiltonian it became clear that only the average squared spin energy and the average of the square of a spin's local field are required to calculate the density of states.Both of these can easily be measured during the simulation.We conducted a number of simulations for various spin and lattice dimensions and system sizes of up to about a thousand spins.In each case it was easily possible to sample most of the state space.This was even true for the case of a system with frustration: the Heisenberg model on a triangular lattice.We found that the average squared local field depends on D but surprisingly only very little on n.The defining condition of the microcanonical ensemble is of-course the system's fixed total energy which translates to a known value for the nearest-neighbor spin-spin correlation.This in turn is closely related to the quantities needed to calculate the density of states: The squared local field comprises nextnearest-neighbor spin-spin correlations and the squared local energy the second moment of nearest-neighbor spin products.A more rigorous theoretical analysis of the mutual dependencies of these quantities for the O(n) spin model would be of great interest.

FIG. 1 .FIG. 2 .
FIG. 1.Time series of the energy E throughout a simulation of N = 100 Lennard-Jones particles.The energy was restricted to E > −520.

FIG. 3 .
FIG.3.The integration parameters µ and η from the same simulation as the data in Fig.1.

8 FIG. 5 .
FIG. 5. Logarithmic density of states divided by nN for different sets of values of D, L, and n.