From real materials to model Hamiltonians with density matrix downfolding

Due to advances in computer hardware and new algorithms, it is now possible to perform highly accurate many-body simulations of realistic materials with all their intrinsic complications. The success of these simulations leaves us with a conundrum: how do we extract useful physical models and insight from these simulations? In this article, we present a formal theory of downfolding--extracting an effective Hamiltonian from first-principles calculations. The theory maps the downfolding problem into fitting information derived from wave functions sampled from a low-energy subspace of the full Hilbert space. Since this fitting process most commonly uses reduced density matrices, we term it density matrix downfolding (DMD).


I. INTRODUCTION TO DOWNFOLDING THE MANY ELECTRON PROBLEM
In multiscale modeling of many-particle systems, the effective Hamiltonian (or Lagrangian) is one of the most core concepts. The effective Hamiltonian dictates the behavior of the system on a coarse-grained level, where 'sub-grid' effects are folded into the parameters and form of the effective Hamiltonian. Many concepts in condensed matter physics can be viewed as statements about the behavior of the effective Hamiltonian. In particular, identification of 'strongly correlated' materials as materials where band theory is not an accurate representation of the systems is a statement about effective Hamiltonians. Effective Hamiltonians at different length scales also form the basis of the renormalization group [1]. A major goal in condensed matter physics is to determine what effective Hamiltonians apply to different physical situations, in particular quantum effective Hamiltonians, which lead to large-scale emergent quantum phenomena.
The dominant effective model for quantum particles in condensed systems is band structure, and for metals, Fermi liquid theory. However, a major challenge is how this paradigm should be altered when it is no longer a good description of the physical system. Examples of these include the high-T c cuprates and other transition metal oxides, which do not appear to be welldescribed by these simple effective Hamiltonians. For these systems, many models have been proposed, such as the Hubbard [2], Kanamori [3], t-J [4] and Heisenberg models. While these models have been extensively studied analytically and numerically, and have significantly enhanced our understanding of the physics of correlated electrons, their effectiveness for describing a real complex system of interest is often unclear. At the same time, more complex effective models can be commensurately more difficult to solve, so one would like to also find an accurate effective model that is computationally tractable.
To address the need for a link between ab initio electron-level models and larger scale models, downfolding has most commonly been carried out using approaches based on density functional theory (DFT). The one particle part is obtained from a standard DFT calculation which is projected onto localized Wannier functions and gives an estimate of the effective hoppings of the lattice model based on Kohn-Sham band structure calculations [5]. Then, to estimate the interactions, one assumes a model of screening of the Coulomb interactions based on constrained DFT, RPA, or some other methods. Since effects of interactions between the orbitals of interest have already been accounted for by DFT, a double counting correction is required to obtain the final downfolded Hamiltonian. The approach has been developed and widely applied [5][6][7][8]; but remains an active area of research [9]. There are other downfolding approaches that include the traditional Löwdin method, coupled to a stochastic approach [10,11] and the related method of canonical transformations [12,13]. While they have many advantages, it is typically not possible to know if a given model ansatz was a good guess or not, and it is very rare for a technique to provide an estimate of the quality of the resultant model. The situation described above stands in contrast to the derivation of effective classical models. For concreteness, let us discuss classical force fields computed from ab initio electronic structure calculations. Typically, a data set is generated using an ab initio calculation in which the positions of the atoms and molecules are varied, creating a set of positions and energies. The parameters in the force field ansatz are varied to obtain a best-fit classical model. Then, using standard statistical tools, it is possible to assess how well the fit reproduces the ab initio data within the calculation, without appealing to experiment. While translating that error to error in properties is not a trivial task, this approach has the important advantage that in the limit of a high quality fit and high quality ab initio results, the resultant model is predictive.
Naïvely, one might think to reconcile the fitting approach used in classical force fields with quantum models by matching eigenstates between a quantum model and ab initio systems, varying the model parameters until the eigenstates match [14]. However, this strategy does not work well in practice because it is often not possible to obtain exact eigenstates for either the model or the ab initio system. To resolve this, we develop a general theory for generating effective quantum models that is exact when the wave functions are sampled from the manifold of low-energy states. Because this method is based on fitting the energy functional, we will show the practical application of this theory using both exact solutions and ab initio quantum Monte Carlo (QMC) to derive several different quantum models.
The endeavor we pursue here is to develop a multi-scale approach in which the effective interactions between quasiparticles (such as dressed electrons) are determined after an ab initio simulation (but not necessarily exact solution) of the continuum Schroedinger equation involving all the electrons. The method uses reduced density matrices (RDMs), of low-energy states, not necessarily eigenstates, to cast downfolding as a fitting problem. We thus call it density matrix downfolding (DMD). In this paper, our application of DMD to physical problems employ one body (1-RDM) and two body (2-RDM) density matrices. The many-body states used in DMD will typically be generated using QMC techniques [either variational Monte Carlo (VMC) or diffusion Monte Carlo (DMC)] to come close to the low energy manifold.
The remainder of the paper is organized as follows: ✄ In Section II, we clarify and make precise what it means to downfold a many-electron problem to a few-electron problem. We recast the problem into minimization of a cost function that needs to be optimized to connect the many and few body problems. We further these notions both in terms of physical as well as information science descriptions, which allows us to connect to compression algorithms in the machine learning literature.
✄ Section III discusses several representative examples where we consider multiband lattice models and ab initio systems to downfold to a simpler lattice model.
✄ In Section IV, we discuss future prospects of applications of the DMD method, ongoing challenges and clear avenues for methodological improvements.

II. DOWNFOLDING AS A COMPRESSION OF THE ENERGY FUNCTIONAL
A. Theory

Energy functional
Suppose we start with a quantum system with Hamiltonian H and Hilbert space H. Proof. Ψ ⊕ 0|H|0 ⊕ Φ = 0 because the two states have non-overlapping expansions in the eigenstates of H. Using that fact, we can evaluate This is equivalent to noting that H is block diagonal in the partitioning of H into LE and H \ LE. Importantly, if |Ψ ∈ LE, then δE[Ψ⊕0] δ(Ψ⊕0) * = |Ψ ′ ⊕ 0, where |Ψ ′ ∈ LE. and Since the derivatives are equal, setting Eq. (3) equal to Eq. (4), This involves sampling the low-energy space, choosing the form of H ef f , and optimizing the parameters. An important implication of this is that it is not necessary to diagonalize either of the Hamiltonians; one must only be able to select wave functions from the low-energy space LE. As we shall see, this can be substantially easier than attaining eigenstates.
Some further notes about this derivation: ✄ Fitting Ψ's must come from LE. It is not enough that the energy functional E[Ψ] is less than some cutoff.
✄ In the case of sampling an approximate LE, the error comes from non-parallelity of E[Ψ] with the correct low energy manifold, up to a constant offset.
✄ While H ef f is unique, it has many potential representations and approximations.
✄ Our method can be applied to any manifold spanned by eigenstates ✄ Model fitting is finding a compact approximation to E ef f [Ψ]. This is a high-dimensional space, so we use descriptors to do this. The theory presented above maps coarse-graining into a functional approximation problem. This is still rather intimidating, since even supposing one can generate wave functions in the low-energy space, they are still complicated objects in a very large space. An effective way to accomplish this is through the use of descriptors, d i [Ψ], which map from H → R. Then we can approximate the energy functional as follows where f i are some parameterized functions. This will allow us to use techniques from statistical learning to efficiently describe E ef f .

B. Practical protocol
A practical protocol is presented in Figure 1. In this section we go through this procedure step by step.

Assess descriptors
Ansatz: Generating |Ψ i ∈ LE Ideally one would be able to sample the entire low-energy space. Typically, however, the space will be too large and it will need to be sampled. The optimal wave functions to use depend on the models one expects to fit, which we will discuss in detail in later steps. Simple strategies that we will use in the examples below include excitations with respect to a determinant and varying spin states.
Generate d j [Ψ i ] and E[Ψ i ] The choice of descriptor is fundamental to the success of the downfolding. In the case of a second-quantized Hamiltonian a set of linear descriptors by simply taking the expectation value of both sides of the equation. Then for example, the occupation descriptor for orbital k is The orbital that c k represents is part of the descriptor, and in the examples below we will discuss this choice as well. One is not limited to static orbital descriptors; they may have a more complex functional dependence on the trial function to include orbital relaxation.
Assess descriptors At this point, one has collected the data E i and d ij . If two descriptors have a large correlation coefficient, then they are redundant in the data set. This could either mean that the sampling of the low-energy Hilbert space LE was insufficient, or that they are both proxies for the same differences in states. If two data points have the same or very similar descriptor sets, but different energies, then either the descriptor set is not enough to describe the variations in the low-energy space, or the sampling has generated states that are not in the low-energy space. To resolve these possibilities, one should analyze the difference between the two wave functions.
In either case, when the model is accurate, the fits will be accurate. If descriptors values available in the reduced Hilbert space are not represented in the sampled wave functions, then intruder states can appear upon solution of the effective model. In that case, the model fitting is an extrapolation instead of an interpolation. For this reason it is desirable to have eigenstates or near-eigenstates in the sample set if possible; they are guaranteed to be on the corners of the descriptor space if the model is accurate.
Ansatz: E i ≃ i d ij p j If the descriptors are chosen well, then the model can be written in linear form: which we shorten to If this can be done, the fitting problem is reduced to a linear regression optimization. More complex functions of the descriptors are also possible, although at the cost of making the effective model more difficult to solve and complicating the fitting procedure. Fit optimal model Finally, one wishes to find a set of parameters such that Eq. (9) is satisfied as closely as possible. There are many choices to make in this step, which will often depend on the desired properties of the final model. One can imagine choosing different cost functions to minimize, which can also include a penalty for complicated models. In our tests, we have successfully used LASSO [15] and matching pursuit techniques [16] to select high quality and compact model parameters. A detailed example of using the latter technique is presented in Section III D.

III. REPRESENTATIVE EXAMPLES
Given the theoretical framework for downfolding a many orbital (or many-electron) problem to a few orbital (or few-electron) problem, we now discuss examples which elucidate the DMD method. The examples are as follows: ✄ Section III A: Three-band Hubbard → one-band Hubbard at half filling. Demonstrates finding a basis set for the second quantized operators and uses a set of eigenstates directly sampled from the low-energy space to find a one-band model.
✄ Section III B: Hydrogen chain → one-band Hubbard model at half filling. Demonstrates basis sets for ab initio systems and the possibility to use this technique to determine the quality of a model to a given physical situation.
✄ Section III C: Graphene → one-band Hubbard model with and without σ electrons. Demonstrates using the downfolding procedure to examine the effects of screening due to core electrons.
✄ Section III D: FeSe molecule → 3d, 4p, 4s system. Demonstrates the use of matching pursuit to assess the importance of terms in an effective model and to select compact effective models.
In all examples we will highlight the important ingredients associated with DMD. First and foremost is the choice of low energy space or energy window i.e. how our database of wave functions was generated. Associated with this is the choice of the one body space in terms of which the effective Hamiltonian is expressed. Finally, we discuss aspects of the functional forms or parameterizations that are expected to describe our physical problem. An important effective Hamiltonian that enters three out of our four representative examples is the one-band or single-band Hubbard model: where t and U are downfolded (renormalized) parameters, η is a spin index,d i,η is the effective oneparticle operator associated with spatial orbital (or site) i and n i,η =d † i,ηd i,η is the corresponding number operator. i, j is used to denote nearest neighbor pairs. We will sometimes drop the constant energy shift E 0 when we write equations like Eq. (10).

A. Three-band Hubbard model to one-band Hubbard model at half filling
Our first example is motivated by the high T c superconducting cuprates [17] that have parent Mott insulators with rich phase diagrams on electron or hole doping [18,19]. Many works have been devoted to their model Hamiltonians and corresponding parameter values [4,5,[20][21][22][23][24]. A minimal model involving both the copper and oxygen degrees of freedom is the three-orbital or three-band Hubbard model, where d i , p j refer to the d x 2 −y 2 orbitals of copper at site i and p x or p y oxygen at site j, respectively. sgn(p i , d j ) is the sign of the hopping t pd between nearest neighbors, shown schematically in Figure 2. ǫ d and ǫ p are orbital energies, U d and U p are strengths of onsite Hubbard interactions, and V pd is the strength of the density-density interactions between a neighboring p and d orbital. To simplify we consider only the case where ǫ p , U d and t pd are non zero; t pd is chosen throughout this section to be the typical value of 1.3 eV to give the reader a sense of overall energy scales. Since we work with fixed number of particles we set our reference zero energy to be ǫ d = 0, thus the charge transfer energy ∆ ≡ ǫ p − ǫ d equals ǫ p in our notation. We work in the hole notation; half filling corresponds to two spin-up and two spin-down holes on the 2 × 2 cell. It is our objective to determine what one-band Hubbard model [Eq. (10)] "best" describes the three-band data. The effective d-like orbitalsd i,η , that enter the low energy description are mixtures of copper and oxygen orbitals; this optimal transformation also remains an unknown. Thus the model determination involves two aspects (1) what are the composite objects that give a compact description of the low energy physics? and (2) given this choice what are the effective interactions between them? (A similar problem was posed and solved by one of us in the context of spin systems [25].) In addition, the best effective Hamiltonian description depends on the energy scale of interest. All these issues will be addressed in the remainder of the section.
We begin by encoding the relationship between the bare and effective operators as a linear transformation T,d where c j,η is the hole (destruction) operator and refers to either the bare d or p orbitals. Further generalizations of this relationship (for example, including higher body terms) are also possible, but have not been considered here. For the 2 × 2 unit cell T is a 4 × 12 matrix, which we parameterize by four distinct parameters. These correspond to mixing of a copper orbital with nearest neighbor oxygens (α 1 ), nearest neighbor coppers (α 2 ), next-nearest neighbor oxygens (α 3 ) and next-nearest neighbor coppers (α 4 ) as shown schematically in Figure 2. The explicit form of T after accounting for the symmetries of the lattice has been written out in the Appendix. These parameters are optimized to minimize a certain cost function, which will be explained shortly. All RDMs in the three-band and one-band descriptions are also related via T; the ones that we focus on are evaluated in eigenstate s and are given by, We optimize T by demanding two conditions be satisfied, (1) For the 2 × 2 cell, N ↑ = N ↓ = 2 and i = 1, 2, 3, 4. The number of states s was varied from three to six, depending on the energy window of interest. Figure 3 shows regimes of the three-band model where the lowest six eigenstates are separated from the higher energy manifold; the fourth and fifth eigenstates are degenerate. In the large U d limit, charge fluctuations are suppressed and these six states correspond to the Hilbert space of 4 an aspect we will verify in this section. The eigenstates outside of this manifold involve p-like excitations which the one-band model is not designed to capture.
We chose the lowest three eigenstates of the three-band model for minimizing the cost in Eq. (14). The four dimensional space of parameters of T was scanned for this purpose. The corresponding trace and orthogonality conditions are simultaneously satisfied with only small deviations, confirming the validity of Eq. (12). Importantly, the 1-RDM elements in the transformed basis corresponding to nearest neighbors d † 1d 2 s already provide estimates for U/t of the effective model. Since the exact knowledge of the corresponding eigenstates of the one-band Hubbard model is available for arbitrary U/t by exact diagonalization, we directly look up the U/t with the same 1-RDM value. These estimates complement the one obtained by DMD which was carried out with the same three low-energy eigenstates, using their energies and the computed values of d † 1d 2 s and ñ i,↑ñi,↓ s from Eqs. (13a) and (13b). [26] A representative example of our results for U d /t pd = 8 and ∆/t pd = 3 has been discussed in the Appendix.
Some trends in the one-band description are explored in Figure 3 by monitoring the downfolded parameters as a function of varying ∆/t pd and U d /t pd . For example, when U d /t pd = 8 is fixed and ∆/t pd is increased, we find that the effective hopping t decreases and U/t increases. This is physically reasonable since an increasing difference in the single particle energies of the copper and oxygen orbitals makes it energetically unfavorable for holes to hop between the two orbitals. When ∆/t pd = 3 is fixed and U d /t pd is increased, U/t increases. As one mechanism of avoiding the large U d , the copper orbitals are forced to hybridize more with the oxygen ones; on the other hand, hole delocalization is suppressed in a bid to maintain mostly one hole perd due to the larger U/t. The net result of these effects is that the t also increases.
An important check for the one-band model is its ability to reproduce the low energy gaps of the three-band model; these have been compared in Figure 4. For the case of ∆/t pd = 3, we observe that for all U d /t pd the lowest three eigenstates were reproduced well. This model also reproduces the states outside of the DMD energy window, although with slightly larger errors. Similar trends are seen for the case of ∆/t pd = 5, with the noticeable difference being that the energy error of the highest state has reduced. This also reflects that the parameters obtained from DMD are, in general, dependent on the energy window of interest, a point which we will highlight shortly by investigating it systematically.
A promise of downfolding is the reduction of the size of the effective Hilbert space; allowing simulations of bigger unit cells to be carried out. To show that this actually works well in practice for the three-band case, we consider the 2 √ 2 × 2 √ 2 square unit cell, comprising of 8 copper and 16 oxygen orbitals. For representative test cases, we performed exact diagonalization calculations at half filling; the Hilbert space comprises of 112,911,876 basis states. Roughly 200 Lanczos iterations were carried out, enabling convergence of the lowest four energies. We compared the lowest gaps with the corresponding calculation on the one-band model on the same square geometry, with a Hilbert space size of only 4,900, using the downfolded parameters obtained from the smaller 2 × 2 cell.
Our results are summarized in Figure 5. Panel (A) shows the six representative parameter sets of the three-band model and the corresponding downfolded one-band parameters. Panels (B) and (C) show the lowest three energy gaps for representative values of U d /t pd = 4, 8, 12 for ∆/t pd = 3 and ∆/t pd = 5 respectively. In all cases, the agreement between the three-band and one-band models is remarkably good. The energy gap error of the lowest gap is within 0.0004 eV (1% relative error). The largest error in the third gap is of the order of 0.005 eV (3% relative error). These results indicate the reliability of the downfolding procedure and highlight its predictive power.
Until this point, all our results focused on downfolding using only the lowest three eigenstates of the 2 × 2 cell. We now explore the effect of increasing the energy window, by including higher eigenstates, using our test example of U d /t pd = 8 and ∆/t pd = 3. To do so, we now use all six low energy eigenstates for optimizing the cost function in Eq. (14). We find similar (but not exactly the same) values of α i compared to the case when only the three lowest states were used. The fact that a solution with small cost can be attained confirms our expectation that the entire low energy space of six states is consistently described by a set ofd i operators.
However, as Figure 6(A) shows, the estimates of U/t and t depend on how many eigenstates are used in the DMD procedure. This is because the DMD aims to provide the one-band description that best describes all states in a given window. If the model is not perfect within a given energy window, an energy dependent model is expected, consistent with the renormalization group perspective. For our test example, increasing the number of eigenstates from three to six changed U/t from 13.8 to 9.44 and t from 0.3045 to 0.2750 eV. [27] The features associated with the energy dependence are further confirmed in Figure 6 for which is may be possible to reduce this energy dependence significantly and thus have a model that describes the smaller and larger energy scales equally well.

B. One dimensional hydrogen chain
We now move on to one of the simplest extended ab initio systems, a hydrogen chain in one dimension with periodic boundary conditions. The one-dimensional hydrogen chain has been used as a model for validating a variety of modern ab initio many-body methods [28]. We consider the case of 10 atoms with periodic boundary conditions and work in a regime where the inter-atomic distance r is in the range 1.5 − 3.0Å, such that the system is well described in terms of primarily s-like orbitals.
For a given r, we first obtain single-particle Kohn-Sham orbitals from a set of spin-unrestricted and spin-restricted DFT-PBE calculations. The localized orbital basis upon which the RDMs (descriptors) are evaluated is obtained by generating intrinsic atomic orbitals (IAO) [29] from the Kohn-Sham orbitals orthogonalized using the Löwdin procedure (see Figure 7). These are the orbitals that enter the one-band Hubbard Hamiltonian. Then, to generate a database of wavefunctions needed for the DMD, we produce a set of Slater-Jastrow wavefunctions consisting of singles and doubles excitations to the Slater determinant: where |KS is the Slater determinant of occupied Kohn-Sham orbitals, η = η ′ are spin indices, and a † i (a i ) is a single-electron creation (destruction) operator corresponding to a particular Kohn-Sham orbital. The k, l indices label occupied orbitals in the original Slater determinant, while i, j are virtual orbitals. e J is a Jastrow factor optimized by minimizing the variance of the local energy.
We compute the energies (expectation values of the Hamiltonian) and the RDMs for each wave function within DMC. By computing the trace of the resulting 1-RDMs, we verify that all the electrons present in the system are represented within the localized basis of s-like orbitals. If the trace of the 1-RDM deviates from the nominal number of electrons for a particular state by more than some chosen threshold -2% in this example -it indicates that some orbitals are occupied (2sor 2p-like orbitals for hydrogen) that are not represented within the localized IAO basis used for computing the descriptors. Hence, these states do not exist within the LE space, and cannot be described by a one-band s-orbital model. We exclude such states from the wave function set. The acquired data is then used in DMD to downfold to a one-band Hubbard Hamiltonian. Insets show the intrinsic atomic orbitals which constitute the one-body space which was used for calculating the reduced density matrices (descriptors). Figure 7 shows the fitting results of the energy functional E[Ψ] within the sampled LE for two representative distances (1.5 and 2.25Å). As we can see, the model E ef f [Ψ] reproduces the ab initio E[Ψ] up to certain error that decreases with atomic separation. That is, the fitted Hubbard model provides a more accurate description as separation distance increases, and the system becomes more atomic-like. Figure 8 shows the fitted values of the downfolding parameters t and U/t at various distances. t decreases as the interatomic distance increases, and the value of U/t increases. The single-band Hubbard model qualitatively captures how the system approaches the atomic limit, in which t becomes zero.
The R 2 values obtained from fitting the descriptors to the ab initio energy [see Figure 8(C)] also show that the single-band Hubbard model is a good description of the system at large distances, but not at small distances. This is primarily because the dynamics of other degrees of freedom (e.g. 2s and 2p orbitals) become important to the low energy spectrum at small distances. Other interaction terms beyond the on-site Hubbard U , such as nearest-neighbor Coulomb interactions and Heisenberg coupling, can also become significant. Without including higher orbitals or additional many-body interaction terms, the model gives rise to an incorrect insulator state at small distances. Conversely, at larger separations (r > 1.8Å), where the system is in an insulator phase [30], the model provides a better description.

C. Graphene and hydrogen honeycomb lattice
Our third example highlights the role of the high energy degrees of freedom not present in the low energy description but which are instrumental in renormalizing the effective interactions. We demonstrate this by considering the case of graphene, and by comparing it to artificially constructed counterparts without the high energy electrons. Although many electronic properties of graphene can be adequately described by a noninteracting tight-binding model of π electrons [31], electron-electron interactions are crucial for explaining a wide range of phenomena observed in experiments [32]. In particular, electron screening from σ bonding renormalizes the low energy plasmon frequency of the π electrons [33]. In fact a system of π electrons with bare Coulomb interactions has been shown to be an insulator instead of a semimetal [33][34][35][36]. Using DMD, we demonstrate how the screening effect of σ electrons is manifested in the low energy effective model of graphene.
In order to disentangle the screening effect of σ electrons from the bare interactions between π electrons, we apply DMD to three different systems, graphene, π-only graphene, and a honeycomb lattice of hydrogen atoms. In the π-only graphene, the σ electrons are replaced with a static constant negative charge background. The role of σ electrons is then clarified by comparing the effective model Hamiltonians of these two systems. The hydrogen system we study has the same lattice constant a = 2.46Å as graphene, which has a similar Dirac cone dispersion as graphene [33].
By constructing the one-body space by Wannier localizing Kohn-Sham orbitals obtained from DFT calculations (see Figure 9), we verify that the low energy degrees of freedom correspond to the π orbitals in graphene and its π-only system and s orbitals in hydrogen; these enter the effective one-band Hubbard model description in Eq. (10). Due to the vanishing density of states at the Fermi level, the Coulomb interaction remains long-ranged, in contrast to usual metals where the formation of electron-hole pairs screens the interactions strongly [33]. However, for certain aspects, the long ranged part can be considered as renormalizing the onsite Coulomb interaction U at low energy [37,38].
To estimate the one-band Hubbard parameters, we used the DMD method using a set of 50 Slater-Jastrow wave functions that correspond to the electron-hole excitations within the π channel for the graphene systems or s channel for the hydrogen system. In particular, for graphene, the Slater-Jastrow wave functions are constructed from occupied σ bands and occupied π bands, whereas for π-only graphene, Slater-Jastrow wave functions constructed from occupied π Kohn-Sham orbitals of graphene. The ab initio simulations were performed on a 3 × 3 cell (32 carbons or hydrogens) and the energy and RDMs of these wave functions were evaluated with VMC. The error bars on our downfolded parameters are estimated using the jackknife method [39]. The results from our calculations are summarized in Figure 10.  We find that the one-band Hubbard model describes graphene and hydrogen very well, as is seen from the fact that R 2 is closed to 1 for the fits. Our fits are shown in Figure 10. For both graphene and hydrogen, U/t is smaller than the critical value of the semimetal-insulator transition (U/t) c ≈ 3.8 for the honeycomb lattice [40], which is consistent with both systems being semimetals. The two systems indeed have similar hopping constant t, consistent with the fact that they have similar Fermi velocities at the Dirac point. However, the difference in their high energy structure manifests itself as differently renormalized electron-electrons interactions, explaining the difference in U . Most prominently, the π-only system has much larger U/t (∼ 4.9) compared to graphene, which is large enough to push it into the insulating (antiferromagnetic) phase. Thus, downfolding shows the clear significance of σ electrons in renormalizing the effective onsite interactions of the π orbitals,making graphene a weakly interacting semimetal instead of an insulator.

D. FeSe diatomic molecule
Transition metal systems are often difficult to model due to the many orbital and possibly magnetic descriptors introduced by d electrons. This is seen in the proliferation of models for tran-sition metals, which include terms like spin-spin coupling, spin-orbital coupling, hopping, Hund's like coupling, and so on. Models containing all possible descriptors are unwieldy, and it is difficult to determine which degrees of freedom are needed for a minimal model to reproduce an interesting effect. Transition metal systems are challenging to describe using most electronic structure methods because of the strong electron correlations and multiple oxidation states possible in these systems. Fixed-node DMC has been shown to be a highly accurate method on transition metal materials in improving the description of the ground state properties and energy gaps [41][42][43][44]. In this section, we apply DMD using fixed-node DMC to quantify the importance of various interactions in a FeSe diatomic molecule with a bond length equal to that of the iron based superconductor, FeSe [45], in order to help identifying the descriptors that may be relevant in the bulk material. We considered a low-energy space spanned by the Se 4p, Fe 3d, and Fe 4s orbitals. We sampled singles and doubles excitations from a reference Slater determinant of Kohn-Sham orbitals taken from DFT calculations with PBE0 functional with total spin 0, 2, and 4, which were then multiplied by a Jastrow factor and further optimized using fixed-node DMC. After this procedure, 241 states were within a low energy window of 8 eV. Of these, eight states had a significant iron 4p component, which excludes them from the low-energy subspace. This leaves us with 233 states in the low-energy subspace.
We consider a set of 21 possible descriptors consisting of local operators on the iron 4s, iron 3d states, and selenium 4p states, which is a total of 9 single-particle orbitals. We use the same IAO construction as Section III B to generate the basis for these operators. At the one-body level, we consider orbital energy descriptors: ǫ s n s , ǫ π,Se (n px + n py ), ǫ z n pz , ǫ z 2 n d z 2 , ǫ π,Fe (n dxz + n dyz ).
and the symmetry-allowed hopping terms: parameters may change at each step because the entire model is refitted when an addition parameter is included in each iteration. The parameters are smoothly varying with the inclusion of new parameters, and they take the correct signs based on symmetry (where applicable). The RMS error decreases with each additional parameter, but less so as the algorithm appends additional parameters. Eventually the diminishing improvements do not merit the additional complexity of more parameters.

IV. CONCLUSION AND FUTURE PROSPECTS
The density matrix downfolding (DMD) technique uses data derived from low-energy approximate solutions to a high energy Hamiltonian to systematically determine an effective Hamiltonian that describes the low-energy behavior of the system. It is based on several rather simple proofs which occupy a role similar to the variational principle; they allow us to know which effective models are closer to the correct solution than others. The method is very general and does not require a quasiparticle picture to apply, and neither does it have double-counting issues. It treats all interactions on an equal footing, so hopping parameters are naturally modified by interaction parameters and so on. While most of the applications have used the first principles quantum Monte Carlo method to obtain the low-energy solutions, the method is completely general and can be used with any solution method that can produce high quality energy and reduced density matrices. We have discussed several examples to present the conceptual and algorithmic aspects of DMD.
DMD, though conceptually simple, is still a method in its development stages, with room for algorithmic improvements and new applications. Advances in the field of inverse problems [69] could be incorporated into DMD to mitigate the problems associated with optimization and overfitting. Here we briefly outline some aspects that need further research: 1. The wave function database (|Ψ ∈ LE): The DMD method relies crucially on the availability of a low energy space of ab initio wave functions. While these wave functions do not have to be eigenstates, automating their construction remains challenging and realistically requires knowledge of the physics to be described.
2. Optimal choice of basis functions. The second-quantized operators in the effective Hamiltonian correspond to a basis in the continuum. The quality of the model depends on the basis describing the changes between low-energy wave functions accurately.
3. Form of the low energy model Hamiltonian. While the exact effective Hamiltonian is unique, there may be many ways of approximating it with varying levels of compactness and accuracy.
The advantage of the DMD framework is that all these can be resolved internally. Given a good sampling of LE, (2) and (3) can be resolved using regression. Given that (2) and (3) are correct or near correct, then (1) can be resolved by finding binding planes, as noted in Section II. The method thus has a degree of self consistency; it will return low errors only when 1-3 are correct. We have shown applications to strongly correlated models (3-band), ab initio bulk systems hydrogen chain and graphene, and a transition metal molecule FeSe. The technique is on the verge of being applied to transition metal bulk systems; there are no major barriers to this other than a polynomially scaling computational cost and the substantial amount of work involved in parameterizing and fitting models to these systems. Looking into the future, we anticipate that this technique can help with the definition of a correlated materials genome-what effective Hamiltonian best describes a given material is highly relevant to its behavior.
where we have defined F ≡ √ 1 − 4α 1 2 − 2α 2 2 − 4α 3 2 − α 4 2 . A concrete and representative example of our results, shown in Section III A for the 2 × 2 cell, is explained for the case of U d /t pd = 8 and ∆/t pd = 3. The first task was to obtain the optimal transformation T for which the lowest three eigenstates (s = 1, 2, 3) of the three-band model were used for computing the cost in Eq. (14). The minimum of the cost was determined by a brute force scan in the four dimensional space of α's and using a linear grid spacing of 0.002 found α 1 = 0.216, α 2 = 0.042, α 3 = 0.018 and α 4 = 0.016. The two terms in the cost i.e. the trace and orthogonality conditions are individually satisfied to a relative error of less than 0.5 percent.
c † i c j s and c j,↑ † c m,↓ † c n,↓ c k,↑ s were computed from the exact knowledge of the three-band model eigenstates and hence d † i,ηd j,η s and ñ i,↑ñi,↓ s are obtained once the optimal T has been determined. As mentioned in the main text, the value of d † 1,ηd 2,η s provides estimates for U/t of the effective model by direct comparison of its value to that in the corresponding eigenstate in the one-band model. For our test example, the absolute values of d † 1,↑d 2,↑ s in states s = 1, 2, 3 are approximately 0.159, 0.142 and 0.084 respectively which correspond to (U/t) 1 ≈ 14.1, (U/t) 2 ≈ 13.2, (U/t) 3 ≈ 12.7. Performing DMD with the three eigenenergies and their calculated RDMs gave t = 0.3025 eV and U/t = 13.45; the latter in the correct range of the other estimates.