The GW Compendium: A Practical Guide to Theoretical Photoemission Spectroscopy

The GW approximation in electronic structure theory has become a widespread tool for predicting electronic excitations in chemical compounds and materials. In the realm of theoretical spectroscopy, the GW method provides access to charged excitations as measured in direct or inverse photoemission spectroscopy. The number of GW calculations in the past two decades has exploded with increased computing power and modern codes. The success of GW can be attributed to many factors: favorable scaling with respect to system size, a formal interpretation for charged excitation energies, the importance of dynamical screening in real systems, and its practical combination with other theories. In this review, we provide an overview of these formal and practical considerations. We expand, in detail, on the choices presented to the scientist performing GW calculations for the first time. We also give an introduction to the many-body theory behind GW, a review of modern applications like molecules and surfaces, and a perspective on methods which go beyond conventional GW calculations. This review addresses chemists, physicists and material scientists with an interest in theoretical spectroscopy. It is intended for newcomers to GW calculations but can also serve as an alternative perspective for experts and an up-to-date source of computational techniques.


INTRODUCTION
Electronic structure theory derives from the fundamental laws of quantum mechanics and describes the behavior of electrons-the glue that shapes all matter. To understand the properties of matter and the behavior of molecules, the quantum mechanical laws must be solved numerically because a pen and paper solution is not possible. In this context, Hedin's GW method (Hedin, 1965) has become the de facto standard for electronic structure properties as measured by direct and inverse photoemission experiments, such as quasiparticle band structures and molecular excitations.
Electronic structure theory covers the quantum mechanical spectrum of computational materials science and quantum chemistry. The fundamental aim of computational science is to derive understanding entirely from the basic laws of physics, i.e., quantum mechanical first principles, and increasingly also to make predictions of new properties or new materials and new molecules for specific tasks. The exponential increase in available computer power and new methodological developments are two major factors in the growing impact of this field for practical applications to real systems. As a result of these advances, computational science is establishing itself as a viable complement to the purely experimental and theoretical sciences.
Hedin published the GW method in 1965, in the same time period as the foundational density-functional theory (DFT) papers (Hohenberg and Kohn, 1964;Kohn and Sham, 1965). While DFT has shaped the realm of first principles computational science like no other method today, GW's fame took a little longer to develop 1 . DFT's success has been facilitated by the computational efficiency of the local-density (Kohn and Sham, 1965) or generalized gradient approximations (Becke, 1988;Lee et al., 1988;Perdew et al., 1996a) (LDA and GGA) of the exchange-correlation functional that make DFT applicable to polyatomic systems containing up to several thousand atoms. GW, however, only saw its first applications to realistic materials 20 years after its inception (Hybertsen and Louie, 1985;Godby et al., 1986), due to its much higher computational expense. Soon after it was realized that GW can overcome some of the most notorious deficiencies of common density functionals such as the self-interaction error, the absence of long-range polarization effects and the Kohn-Sham band-gap problem.
The GW approach is now an integral part of electronic structure theory and readily available in major electronic structure codes. It is taught at summer schools along side DFT and other electronic structure methods. This review is intended as a tutorial that complements showcases of GW's achievements with a practical guide through the theory, its implementation and actual use. For GW novices, the review offers a gentle introduction to the GW concept and its application areas. Regular GW users can consult this review as a handbook in their day-to-day use of the GW method. For seasoned GW users and GW experts it might serve as a reference for key applications and some of the subtler points of the GW framework.
In this review we take a more practical approach toward the GW method. We will recap the basic theory starting from theoretical spectroscopic view point as a probe of the electronic structure. Aiming at GW practitioners, we will illustrate how the GW approach emerges from the theoretical spectroscopy framework as a practical scheme for electronic structure calculations. A more in-depth discussion of the theoretical foundations of many-body theory can be found in textbooks (e.g., Fetter and Walecka, 1971;Szabo and Ostlund, 1989;Gross et al., 1991;Bechstedt, 2014;Martin et al., 2016), while the GW theory itself is covered in excellent early reviews (Hedin and Lundqvist, 1970;Aryasetiawan and Gunnarsson, 1998;Hedin, 1999). GW began to flourish at the beginning of the 21st century and two reviews succinctly summarized the state of the field until then (Aulbur et al., 2000;Onida et al., 2002). Our review bridges the ensuing gap of almost 20 years, after which only several more specialized reviews addressed different aspects and applications of GW calculations (Rinke et al., 2008a;Giantomassi 1 The theory directly comparable to DFT is the Luttinger-Ward formalism, not the GW approximation. Here, we focus on the chronological development of practical electronic structure calculations with Kohn-Sham DFT or GW. et al., 2011;Ping et al., 2013;Bruneval and Gatti, 2014;Faber et al., 2014;Marom, 2017;Gerosa et al., 2018a;Kang et al., 2019), and complements a recent review (Reining, 2017).
A considerable part of our review is devoted to the presentation of different GW implementations. We will discuss the practical considerations that GW users have to make when they decide on a particular GW implementation or code for their work. Moreover, we will guide the reader through computational decisions that might affect the accuracy of their GW calculations and illustrate them with concrete examples from the GW literature. An important aspect in this regard is the issue of self-consistency in GW, which we cover in detail.
Although the GW method might be best known for its success in predicting the band gaps of solids, we will present its diversity and discuss a range of different quantities that can be computed with the GW method. Since no method is perfect, we will conclude with a critical outlook on the challenges faced by the GW method and discuss ways to go beyond GW.
We conclude this introduction with a quote from H. J. Monkhorst, who wrote in a laudation in 2005: It is therefore my conviction that, rather sooner then later, we will see a resurgence of the precise many-body approach to solid-state theory as we envisioned. Almost assuredly the GW method will be the tool of choice (Monkhorst, 2005). In 2019, we can say that Monkhorst was right.

Direct and Inverse Photoemission Spectroscopy
Spectroscopic measurements are an important component in the characterization of materials. Any spectroscopic technique perturbs the system under investigation and promotes it into an excited state. Experimentally, the challenge then lies in the correct interpretation of the system's response. From a theoretical point of view, however, the challenge is to find (or develop) a suitable, accurate and, most of all, computationally tractable approach to describe the response of the system. Experimental and theoretical spectroscopy are often complementary and, when combined, they are a powerful approach.
Many-body perturbation theory (MBPT) provides a rigorous and systematic quantum mechanical framework to describe the spectral properties of a system that connects central quantities like the Green's function, the self-energy, and the dielectric function with each other. The poles of the single-particle Green's function, the central object in MBPT, correspond to the electron addition and removal energies probed in direct and inverse photoemission, which is explained in detail at the end of section 2.1. In contrast, information about neutral excitations probed in optical or energy loss spectra can be extracted from the dielectric function. In this review, we will not address optical properties or other neutral excitations and instead focus on the single-particle Green's function and its connection to direct and inverse photoemission spectroscopy.
In photoelectron spectroscopy (PES) (Plummer and Eberhardt, 1982;Himpsel, 1983;Kevan, 1992), electrons FIGURE 1 | Schematic of the photoemission (PES) and inverse photoemission (IPES) process. In PES (A) an electron is excited by an incoming photon from a previously occupied valence state (lower shaded region) into the continuum (gray shaded region, starting above the vacuum level E vac ). In IPES (B) an injected electron with kinetic energy E kin undergoes a radiative transition into an unoccupied state (upper shaded region) thus emitting a photon in the process.
are ejected from a sample upon irradiation with visible or ultraviolet light (UPS) or with X-rays (XPS), as sketched in Figure 1A. The energy of the bound state ǫ s can be reconstructed from the photon energy hν, the work function and the kinetic energy E kin of the photoelectrons that reach the detector 2 IP s = −ǫ s = hν − E kin − , for ǫ s < E F . (1) The ionization potential IP s is defined as the energy that is required to remove an electron from the bound initial state s of the neutral sample, where the energy of the state is below the Fermi level (E F ). It is always a positive number and related to ǫ s as shown in Equation (1). By inverting the photoemission process, as schematically shown in Figure 1B, the unoccupied states can be probed. This technique is commonly referred to as inverse photoemission spectroscopy (IPES) or bremsstrahlung isochromat spectroscopy (BIS) (Dose, 1985;Smith, 1988;Fuggle and Inglesfield, 1992). In IPES, an incident electron with energy E kin is scattered in the sample emitting bremsstrahlung. Eventually it will undergo a radiative transition into a lower-lying unoccupied state, emitting a photon that carries the transition energy hν. The energy of the final, unoccupied state ǫ s can be deduced from the measured photon energy according to − EA s = ǫ s = E kin − hν + , for ǫ s ≥ E F . (2) EA denotes the electron affinity, which we define as the energy needed to detach an electron from a negatively charged species and which is thus the negative of ǫ s . EA can be a positive or negative number. It is negative when the additional electron is in an unbound state, and positive when the electron is bound.
2 Throughout this article the energy zero is chosen to be the top of the valence bands for extended systems and the vacuum level for finite systems.
The experimental observable in photoemission spectroscopy is the photocurrent, which is the probability of emitting an electron with the kinetic energy E kin within a certain time interval. It is related to the intrinsic spectral function A(r, r ′ , ω) of the electronic system, given by the imaginary part of the singleparticle Green's function 3 : (Almbladh and Hedin, 1983;Onida et al., 2002) A(r, r ′ , ω) = 1 π Im G(r, r ′ , ω) sgn(E F − ω), where ω denotes an energy (frequency). The single-particle Green's function, G(r, r ′ , ω), is the probability amplitude that a particle created or destroyed at r is correlated with the adjoint process at r ′ − it will be discussed in detail later. The actual dependence of the photocurrent on the spectral function is quite complicated because the coupling to the exciting light and electron loss processes in the sample, as well as surface effects, have to be taken into account. To our knowledge, no comprehensive theory yet exists for this relation and we therefore proceed with the discussion of the spectral function and will return to the photocurrent later. The energies ǫ s in Equations (1) and (2) are the removal and addition energies of the photoelectron, respectively, and we refer to the transition amplitudes from the N to the N ± 1-body states as ψ s (r) (see also section B.1 in Appendix B): ǫ s = E(N + 1, s) − E(N) ψ s (r) = N|ψ(r)|N + 1, s for ǫ s ≥ E F The states |N, s are many-body eigenstates (wave functions in real space) of the N-electron Schrödinger equationĤ |N, s = E(N, s) |N, s ,Ĥ is the many-body Hamiltonian and E(N, s) = N, s|Ĥ|N, s is the corresponding total energy. The field operator ψ(r) annihilates an electron at point r from the many-body states |N or |N + 1 . The representation given in Equations (4) and (5) is particularly insightful because it allows a direct interpretation of ǫ s as the photoexcitation energy from the N-particle ground state with total energy E(N) into an excited state s of the (N−1)particle system with total energy E(N − 1, s) upon removal of an electron in the photoemission process. Similarly, the addition energy that is released in the radiative transition in inverse photoemission is given by the total energy difference of the excited (N+1)-particle system and the ground state.
To build a practical scheme for calculating the energies in Equations (4) and (5) we introduce the definition of the singleparticle Green's function 4 G(r, σ , t, r ′ , σ ′ t ′ ) = −i N|T{ψ(r, σ , t)ψ † (r ′ , σ ′ , t ′ )}|N (6) whereT is the time ordering operator for the times t and t ′ and σ the spin.T arranges the field operators so that the earlier time 3 Atomic units 4πǫ 0 = h = e = m e = 1, where e and m e are the charge and mass of an electron, respectively, will be used in the remainder of this article. 4 We consider only the zero temperature G and assume µ = E F . is to the right and acts on the ground state |N first. G allows for both time orderings: t > t ′ or t ′ > t. This definition of the Green's function is particularly insightful because it illustrates the process of adding and removing electrons from the system, as done in photoemession spectroscopy. Assuming the time-ordering is as shown in Equation (6),ψ † (r ′ , σ ′ , t ′ ) will create an electron with spin σ ′ at time t ′ in point r ′ . This electron will then propagate through the system, until it is annihilated byψ(r, σ , t) at a later time t in position r. The Green's function is therefore also often called a propagator. We will return to this propagator picture of G in later sections of this review.
To make contact with Equations (4) and (5), we need to Fourier transform the Green's function from the time to the energy axis. For a time-independent Hamiltonian this then produces the spectral or Lehman representation of G (Fetter and Walecka, 1971;Gross et al., 1991) G(r, r ′ , ω) = lim where we have assumed a spin paired system and summed over the spin quantum number shown in Equation (6). The two terms in brackets are for the two time orderings in G. is the Heaviside step function, which is zero for negative arguments and one for positive arguments 5 . It kills any processes that do not obey the correct time ordering, as determined by the created/annihilated particle's energy relative to E F . This representation illustrates that the many-body excitations of the system that are associated with the removal or addition of an electron are given by the poles of the single-particle Green's function. The diagonal spectral function A(r, r, ω) = 1 π Im G(r, r, ω) sgn(E F − ω) (8) = s ψ s (r)ψ * s (r)δ(ω − ǫ s ) then assumes the intuitive form of a (many-body) density of states.

The Quasiparticle Concept
In periodic solids, the crystal has special crystallographic directions so that spectra are direction dependent, with the direction indexed by a wave vector k. By varying the direction of the incident beam relative to the crystallographic axes (a i ), one can map the k dependent PES spectra, as shown in Figure 2A. This technique is called angle resolved photoemission spectroscopy (ARPES). Figure 2C shows data from a typical ARPES measurement and Figure 2B a schematic of the spectral function at a single k-point. The spectra usually exhibit distinct peaks that are attributed to particle-like states but have a finite width. There can also be additional, broader peaks away from the main peak called satellites. However, the spectral function in Equation (9) contains only Dirac-delta functions which appear 5 For the remainder of the review, η is always assumed to be a positive infinitesimal.
FIGURE 2 | (A) Schematic representation of an ARPES experiment. By varying the angles θ and φ with respect to the crystallographic axes (a i ), the measured spectrum is direction, or k, dependent. In practice, the detector angle is usually varied with respect to a fixed beam. (B) A typical spectral function features a sharp peak attributed to the quasiparticle, an incoherent background, and satellites away from the single particle peak. (C) ARPES data of the upper valance bands of ZnO (Kobayashi et al., 2009). The corresponding G 0 W 0 band structure of ZnO is shown in Figure 27.
as infinitely sharp peaks. The broadening of the spectral function comes from the sum of many delta functions close in energy, which merge to form a peak of finite width. If the contributing delta functions are closely packed around one energy, the peak is attributed to a quasiparticle (Landau et al., 1980). To further motivate the association of quasiparticles with particle-like excitations it is insightful to consider noninteracting electrons. In that case, the spectral function consists of a series of delta peaks A ss ′ (ω) = ψ s |A(ω)|ψ s ′ = δ ss ′ δ(ω − ǫ s ), each of which corresponds to the excitation of a non-interacting particle, see Appendix A for the integral notation used in Equation (10). The many-body states |N and |N ± 1 become single Slater determinants so that the exact excited states are characterized by a single creation or annihilation operator acting on the ground state. The excitation energies ǫ s and the wave functions ψ s (r) are thus the eigenvalues and eigenfunctions of the single-particle Hamiltonian. When the electron-electron (or electron-ion) interaction is turned on, the exact eigenstates |N, s are no longer single Slater determinants. As a consequence, the matrix elements of the spectral function A ss ′ (ω) will contain contributions from many non-vanishing transition amplitudes. If these contributions merge into a clearly identifiable peak that appears to be derived from a single delta-peak broadened by the electronelectron interaction, this structure can be interpreted as a singleparticle like excitation-the quasiparticle. The broadening of the quasiparticle peak in the spectral function is associated with the lifetime τ of the excitation due to electron-electron scattering, whereas the area underneath the peak is interpreted as the renormalization Z of the quasiparticle. This renormalization factor quantifies the reduction in spectral weight due to the electron-electron interaction compared to an independent electron, though the total spectral weight is conserved. We can combine these various arguments and say that the quasiparticle peak for state s will exhibit the following shape: In contrast to the exact energies of the many-body states, which are poles of the Green's function on the real axis, the quasiparticle poles reside in the complex plane and are no longer eigenvalues of the N-body Hamiltonian. The real part of this complex energy is associated with the energy of the quasiparticle excitation and the imaginary part with its inverse lifetime Ŵ = 2/τ . To develop a more intuitive understanding of quasiparticles, it is insightful to adopt a real-space picture. The quasiparticle concept can be explained by analogy with a crowd of people, as shown in Figure 3. Picture a group of people, such as at a concert or festival, all crowded into the same area. Not wanting to get too close to each other to preserve their own space, people in the crowd interact with each other. If one person gets too close to another, their mutual repulsion eventually takes over and separates them again. The exact description of the crowd requires the location of each individual person at all times. This is a very difficult task because of the constant interactions, or repulsions, between individual people. This collection of people and their occasional fluctuations are grouped together and labeled the ground state.
A new person arrives and pushes their way into the crowd. We can think of this new person as the electron in inverse photoemission that is injected into the system. The new person enters in a specific direction with a certain energy. As they enter the group, they repeatedly interact with other people as they continue their trajectory, as shown in Figure 3C. These repeated interactions repel people in their immediate vicinity and form a small halo of free space around the incoming person. People seem to move out of their way on their journey, forming a polarization cloud created by the absence of other people around them. The intruder's motion and their polarization cloud can be taken together to form a new composite object, a quasi-person, which appears as a slowed-down version of the newcomer. From The new person begins to interact with other people who, in turn, interact back with the new person in (C) and form a polarization cloud. An effective, or renormalized, object, the quasi-person, moves through the crowd in (D). Even though it is an interacting system, the many-person state in (D) can still be connected to, or identified by, the single person added to the crowd. This connection allows us to identify the quasi-person. Bottom: Schematic representation of photoemission spectroscopy. far away, one does not need to describe the precise motion of all N + 1 people in the group, but only the motion of this quasi-person propagating through the crowd.
By analogy, a quasiparticle can then be considered a combination of an additional electron or hole in the system that interacts with its surrounding polarization cloud. The situation corresponding to photoemission spectroscopy is depicted in the bottom panel of Figure 3. As time increases, the bare hole left by removal of the interaction is screened. The quasiparticle therefore embodies an electron state with the perturbation of its own surrounding. The feedback via interactions of the particle with surrounding electrons is termed the self-energy. Over time, the propagating quasiparticle can decay into many different elementary excitations, giving it a finite lifetime. Essential quasiparticle properties are dispersion, lifetime, weight, and satellite spectrum. The latter arises from the collective excitations in the medium.

Comparison to Experimental Spectra
We have now identified quasiparticles as one possible source for peaks in experimental photoemission spectra. Before we introduce the GW approximation as a tractable computational approach for calculating quasiparticle energies, we will first address the photocurrent, which is the quantity measured in direct photoemission experiments. Then we will briefly discuss the reconstruction of the band structure information, as well as other sources of peaks in spectrum.
Establishing rigorous links between the spectral function and the photocurrent is still a challenge for theory (Hedin, 1999;Lee et al., 1999;Minár et al., 2011Minár et al., , 2013. The photocurrent J k (hν) is the probability per unit time of emitting a photoelectron with momentum k and energy E kin,k due to an incident photon with the energy hν. The spectral function defined in Equation (3) describes the removal of an electron from the sample, but does not include intermediate steps on the way to the detector where the electron loses energy. Therefore, it does not correspond to J k (hν). However, the spectral function can be related to the photocurrent by using the sudden approximation (Hedin, 1999;Hedin and Lee, 2002) assuming that the ejected photoelectron is immediately decoupled from the sample. J k (hν) is then given by (Hedin, 1999) where A ss ′ (ω) = ψ s |A(ω)|ψ s ′ are matrix elements of the spectral function defined in Equation (3). k,s are matrix elements of the dipole operator which describe the coupling to photons. The dipole matrix elements capture the promotion of the electron to a highly excited state (often assumed to be a plane wave), i.e., they describe the transition between the initial and the final electron state. In this final state, the electron travels to the detector. On the way, it crosses the surface of the sample, which adds a further perturbation to its path and its energy. The transition matrix elements affect the amplitudes of the spectrum and add selection rules that give rise to the suppression or enhancement of certain peaks. In practice, one compares only matrix elements of the spectral function to the experiment disregarding the effects of the dipole matrix. Furthermore, it is often assumed that only the diagonal elements of the spectral function are dominating. For the reconstruction of the band structure, e.g., with ARPES, the comparison between theory and experiment is hampered from the experimental side. In ARPES studies of crystalline materials, the emitted photons or electrons inevitably have to pass the surface of the crystal to reach the detector. Therefore, information about their transverse momentum k ⊥ is lost. This is because the crystal's translational symmetry is broken at the surface and only the in-plane momentum k is conserved. To reconstruct the three-dimensional band structure of the solid from experimental data, assumptions are often made about the dispersion of the final states (Plummer and Eberhardt, 1982;Himpsel, 1983;Hora and Scheffler, 1984;Dose, 1985;Smith, 1988). Ab initio calculations as described in this article can aid in the assignment of the measured peaks. Either way, some layer of interpretation between theoretical and experimental band structures is required.
Apart from quasiparticle excitations, a typical photoemission experiment provides a rich variety of additional information.
FIGURE 4 | X-ray photoemission spectrum with 800 eV incident energy compared to two calculated spectra. The red line shows the evGW 0 @LDA spectrum (see section 5), whereas the blue spectrum contains additional vertex corrections in form of a cumulant expansion (see section 11). The evGW 0 @LDA+C * spectrum contains the addition of the Shirley background (shown by the black dashed line) and loss effects of the outgoing photoelectron. Data retrieved from Guzzo et al. (2011), where the GW results are labeled as G 0 W 0 . However, self-consistency in the eigenvalues was in fact applied, which is in our notation evGW 0 (Private Communication).
In core-electron emission for instance, inelastic losses or multielectron excitations such as shake-ups and shake-offs lead to satellites in the spectrum. Satellites can also appear in the valence region. The outgoing photoelectron or the hole left behind can, for example, excite other quasiparticles like plasmons, phonons or magnons. This gives rise to additional peaks, the so-called plasmon or magnon satellites or phonon side bands, that are typically separated from the quasiparticle peak by multiples of the plasmon, magnon or phonon energy. The broad peak in Figure 4, which shows integrated spectra and therefore has no k dependence, near −40 eV is an example of a satellite. Satellites are collective effects that are not described within the quasiparticle picture.

HEDIN'S GW EQUATIONS
Having introduced the general Green's function framework and quasiparticle concept, we are prepared to consider the concrete formalism for GW. GW is an approximation to an exact set of coupled integro-differential equations called Hedin's equations (Hedin, 1965), the full derivation of which can be found in Appendix B.
We build up Hedin's equations from a perturbation theory perspective. We can conveniently represent the perturbation expansion for G that we introduced in the time domain in Equation (6) and in the energy domain in Equation (7) with the Feynman diagram technique. Feynman diagrams are a pictorial way of representing many-body and Green's function theory. We cover the necessary basics in this section and refer the interested reader to an excellent book on Feynman diagrams (Mattuck, 1992).
The perturbation expansion begins with the noninteracting Green's function, denoted G 0 . G 0 is the probability amplitude for FIGURE 5 | The most basic pieces of diagrammatic perturbation theory are G 0 and v. From these, all other quantities can be built. The interaction v(1, 2) is instantaneous. Therefore, the dashed line is perpendicular to the time axis. The arrows in G 0 and G point in only one direction, but both time orderings are included.
a noninteracting particle to propagate from one spacetime point to another. In the diagrammatic technique, G 0 is represented by a solid line with an arrow. The ends of G 0 indicate spacetime points. The generic notation 1 = (r 1 , t 1 , σ 1 ) refers to the spatial coordinate r 1 , time t 1 , and spin variable σ 1 . G 0 , shown in Figure 5, is one of the basic building blocks for the perturbation expansion.
G is the probability amplitude for the interacting system that a particle creation at 2 is correlated with a particle annihilation at 1. The rules of quantum mechanics dictate that we must sum over all possible paths for the particle to move from 2 to 1 − this generates the exact G, which is represented by a bold, or double, line with an arrow in the diagram language. Every possible process between 1 and 2 contributes a different amplitude to the total G. The different processes which connect 1 to 2 depend on interactions with other particles in the system at the times between t 1 and t 2 . Without these interactions, the problem is already solved with G 0 .
Recall from the definition of G in Equation (6) that G contains two time orderings. The second time ordering implies that the annihilation process may come before the creation. Remember that the field operators in Equation (6) act on the interacting ground state. In the ground state, there is some charge for the annihilation operatorψ(r) to "act" on, even without any preceding creation process, so that the reverse time ordering in G makes sense. Feynman diagrams do not explicitly show both time orderings in G, but it is important to remember that G and G 0 lines implicitly contain both time orderings.
The times between t 1 and t 2 are called internal times. We can add up all the processes contributing to G in a certain order depending on the number of times the particle interacts with other particles in the system. These interactions occur only at internal times, and the number of internal interactions is the order of the diagram. To efficiently represent all these internal interactions, we use a dashed line to represent the interaction in the diagram. At a given order n, we construct all possible processes, or diagrams, which connect n interaction lines with G 0 lines at 1 and 2. We connect all of the dashed lines appearing at internal times with additional G 0 lines. There is a very specific set of rules for how these arrangements can be done. Wick's theorem defines how to contract these pieces (Fetter and Walecka, 1971). A simple principle is enough to demonstrate the idea, however. Because the Coulomb interaction is a two-body operator, each dashed line must have two G 0 lines at each end. To compute the FIGURE 6 | The exact G contains amplitudes from all possible paths between 1 and 2. Amplitudes from all of these paths are represented by the rectangle placed between the field operators, the action of which is represented by * symbols. These terms can be calculated order-by-order with perturbation theory. At a given order n, we must connect n interaction lines at internal times in all possible − and allowed − ways. Concrete examples of diagrams are in Figure 7. All terms of the topology which can be inserted between two G 0 lines form the reducible self-energy. exact G, the expansion must be taken to infinite order, n → ∞, adding up all possible processes along the way. The process of building up all diagrams in the perturbation expansion is shown in Figure 6. A few example diagrams, as well as a couple of forbidden diagrams, are shown in Figure 7.
To go further with our analysis, we must dissect the internal structure of the diagrams and separate it into pieces. By considering the possible topologies of internal parts allowed by the contraction rules, we can group the parts into different categories. Here, the "topology" of the piece is determined by the number of G 0 lines and interactions at two different times, without considering the internal structure between the two chosen times. Those parts which have two G 0 lines sticking out are called a self-energy diagram. The full self-energy ( ) can be inserted between two G 0 lines to form G (one must also include the separate G 0 term). Topologies which connect two G 0 legs to an interaction line are labeled a vertex. Summing over all pieces with this topology creates the full vertex (Ŵ), which depends on three spacetime points. Finally, the diagram FIGURE 7 | At first order, n = 1, there are only two possible self-energy diagrams. These are the diagrams of the Hartree-Fock approximation, the direct electrostatic interaction (left) and exchange (right). Two possible n = 2 diagrams are also shown (there are others). The bottom diagrams are forbidden because they do not have two G 0 lines at each end of the interaction lines. When drawing the diagrams, a certain degree of flexibility is allowed and they must be interpreted carefully. For example, the curved interaction lines above must still be treated as instantaneous in a calculation.
parts which end in two dashed lines sum up to the effective, or screened, interaction (W).
Conceptually, the vertex is the most difficult to understand. Figure 8 demonstrates the effect of the vertex in a specific example. The diagram shown in Figure 8A is meant to contain the exact vertex, Ŵ. Ŵ has three corners and can be inserted where two G 0 lines meet an interaction line. By simply letting these three pieces meet without any internal structure, we replace Ŵ with a single spacetime point, as shown in Figure 8B. Alternatively, we could allow the vertex to include the curved interaction line shown in Figure 8C. In that case, the vertex has internal structure.
Hedin's equations can be interpreted as the self-consistent formulation of these topologically distinct building blocks. While Hedin followed a formal and systematic derivation, a heuristic motivation is to group all diagrams of a certain topology together and replace them with a single dressed, or renormalized, object with the same topology. A critical aspect of this replacement is their energy dependence. By replacing many diagrams of perturbation theory with a single object of the same shape, we reduce the number of objects to be computed. However, the information apparently missing due to the reduction in objects is encoded in the energy dependence of the dressed quantity. The FIGURE 8 | The exact vertex Ŵ, shown in (A), can be replaced with approximations to simplify the calculation. The approximation in (B) is referred to as a "single spacetime point" because the vertex has no internal structure. In contrast, the vertex in (C) has internal structure. The diagram shown here is only an example to demonstrate the role of Ŵ and does not correspond to the exact self-energy or the GW self-energy.
final result, Hedin's equations (Appendix B), are a compact and self-contained set of five integro-differential equations. Despite the reduction in the number of objects to be treated compared to the perturbation expansion, the functional differential equations coupling these pieces are extremely difficult to solve exactly.
Hedin recognized this difficulty and suggested the GW approximation. As mentioned above, the vertex is the building block which has a single interaction connected to two G 0 lines. Unlike the other building blocks, the vertex depends on three spacetime points instead of two, making it the most difficult to compute. To simplify the theory, Hedin suggested replacing Ŵ with a single spacetime point. In Hedin's equations, the exact self-energy is = iGWŴ. With the replacement Ŵ(1, 2, 3) = δ(1, 2)δ(1, 3), Hedin's approximation gives = iGW, hence the name of the GW approximation. In this approximation, Hedin's equations are χ 0 (1, 2) = −iG(1, 2)G(2, 1) (1, 2) = iG(1, 2)W(1 + , 2) (17) FIGURE 9 | Diagrammatic representation of Equations (13-17). The GW approximation reduces the self-energy to a product of G with W. where the Hartree potential is included in the solution for G 0 6 . These are the GW equations, which are translated into the diagram language in Figure 9. There is one final important point regarding the reducibility of the quantities in Hedin's equations. and χ 0 in Hedin's equations are irreducible, which means that they cannot be broken into smaller pieces with the same topology. To generate the full, or reducible, quantity from its irreducible part, the irreducible component is iterated in a series similar to the perturbation expansion for G. Series of this type are commonly called Dyson series, and Dyson's equation refers to the equation for G shown in Equation (13) (W in Equation (16) also obeys a Dyson series). Dyson's equation is of great importance in many-body physics, and we return to it in later sections in the context of self-consistent GW. It is common in the literature to use the same symbol for both reducible and irreducible components with the same topology, especially when discussing the self-energy. Almost always, the symbol refers to the quantity as it appears in Hedin's equations. When discussing or calculating the self-energy, this implies that one is interested in the irreducible self-energy.
In Hedin's equations, the screened Coulomb interaction W plays a central role. Screening is based on the simple idea that charges in the system rearrange themselves to minimize their interaction. In polarizable materials, screening is significant, and the effective interaction is noticeably weaker than the bare one. 6 1 + means the time t 1 evaluated at an infinitesimally later time. Such time infinitesimals appear in order to define the time-ordering for quantities meant to be evaluated in the instantaneous limit. The ground state density, for example, is time independent but must be written as n(1) = −iG(1, 1 + ) so that the time-ordering makes sense.
W is also dependent on frequency or a time difference. The frequency dependence of W is critical to both the physics and the numerical implementation of GW. Even though the bare interaction is instantaneous, W is time difference dependent because it is built from repeated bare interactions at different times. This series of bare interactions to form W can be built by iterating the fourth line in Figure 9. The underlying G 0 lines which connect the bare interactions in the W expansion are themselves dependent on a time difference, so that if we vary the initial or final times the entire expansion changes magnitude. This series of repeated bare interactions connected by G 0 is the microscopic mechanism for the quasiparticle screening concept developed in section 2.2. The frequency dependence of W is what allows the system to relax and screen the quasiparticle. The GW self-energy is similar to the bare exchange in Hartree-Fock theory, which can be written as the product of G with v. Given the similarity between the GW self-energy and bare exchange, GW can be thought of as a dynamically screened version of Hartree-Fock.
The GW equations should still be solved self-consistently since all four quantities are coupled to each other. As with other nonlinear equations, including the equations of meanfield theories like Kohn-Sham DFT or Hartree-Fock theory, the GW equations can be solved by iteration. In principle, the prescription is clear. Start from a given G 0 and iterate Equations (13-17) to self-consistency (scGW). However, remarkably few fully self-consistent solutions of the GW equations have been performed in the last 50 years. The first calculations for the homogeneous electron gas (HEG) were reported at the turn of the previous century (Holm and von Barth, 1998;Holm, 1999;García-González and Godby, 2001) and reported worse agreement with experiment on quasiparticle band widths and satellite structure compared to non-self-consistent calculations. They were quickly followed by calculations for real solids, like silicon and sodium (Schöne and Eguiluz, 1998;Ku and Eguiluz, 2002). Self-consistency was then dropped for several years because of its high computational expense and the success of non-self-consistent approximations. More recent scGW studies for atoms Stan et al., 2006Stan et al., , 2009, molecules Caruso et al., 2012aCaruso et al., , 2013aMarom et al., 2012), conventional solids (Kutepov et al., 2009;Grumet et al., 2018) and actinides (Kutepov et al., 2012) have been reported. In practice, non-self-consistent calculations are much more common, and even self-consistent GW calculations come in different types. scGW is discussed in more detail in section 5.

THE
The lowest rung in the hierarchy of GW approximations is the widely used G 0 W 0 approach. Starting from a mean-field Green's function, G 0 W 0 calculations correspond to the first iteration of Hedin's equations. We denote the self-energy of such single-shot perturbation calculations 0 . Since we always refer to the singleshot self-energy in section 4, we drop the label. Furthermore, we define the single-particle Hamiltonianŝ where v ext is the external potential, v H is the Hartree potential, and v MF σ is the mean-field (MF) exchange-correlation potential. The spin channel is denoted by σ . Possible mean-field Hamiltonians are the Kohn-Sham (KS) or Hartree-Fock (HF) Hamiltonians.
From Dyson's equation for G, one can derive an effective single-particle eigenvalue problem referred to as the quasiparticle (QP) equation. The solutions of the QP equation are then given bŷ The self-energy is calculated with a G 0 chosen to match the initial mean-field calculation based onĥ MF . The solution of Equation (21) where {ǫ 0 sσ } are the eigenvalues ofĥ MF . Solving Equation (22), the QP energy ǫ sσ is obtained by correcting the mean-field eigenvalue ǫ 0 sσ . To solve Equation (22), we have to calculate the G 0 W 0 selfenergy σ , where ω is the frequency at which the self-energy is computed. Equation (23) is the frequency space version of Equation (17) for the GW self-energy. The Green's function G σ 0 stems from the aforementioned mean-field Hamiltonian and is given by W 0 in Equation (23) is the screened Coulomb interaction in the random-phase approximation (RPA) with the bare Coulomb interaction v(r, r ′ ) = 1/|r − r ′ | and the dynamical dielectric function ε. The latter is given by In G 0 W 0 , the irreducible polarizability χ 0 , simplifies to the Adler-Wiser expression (Adler, 1962;Wiser, 1963) where the index i denotes an occupied and a an unoccupied (also called virtual) single-particle orbital.
For numerical convenience as well as insight into the underlying physics, the G 0 W 0 self-energy is often split into a correlation part c σ , where W c 0 is defined as and an exchange part Note that the exponential factor in Equation (31) is necessary to close the integration contour, whereas W c 0 (r, r ′ , ω) falls of quickly with increasing frequency and we can take the zero limit of η before integrating. For a derivation of Equation (32) see van Setten et al. (2013). We introduce the following notation for the (s, s)-diagonal matrix elements of the self-energy, The same notation is also used for matrix elements of the meanfield potential v MF sσ = φ 0 sσ |v MF σ |φ 0 sσ 7 .
7 For simplicity, we will drop spin variables in the following parts of section 4.
In the literature, G 0 is often referred to as the "noninteracting" Green's function. However, this is technically only correct if G 0 is constructed from an initial calculation based on h 0 . This is the definition of G 0 in formal many-body theory. However, often times in the theoretical literature, the Hartree potential is included in the G 0 solution and excluded from the self-energy. This is the case of starting the calculation fromĥ in Equation (19). For G 0 W 0 in practice, we usually start from h MF , which implies that we start from a mean-field Green's function rather than a non-interacting one. Conceptually, such a mean-field G 0 is closer to the interacting G than the true G 0 . This is precisely why the mean-field G 0 serves as such a useful starting point for GW calculations-it is closer to a self-consistent solution for G than a true non-interacting G 0 is. When consulting literature references, keep in mind that GW calculations most likely refer to a mean-field G 0 .

Procedure
G 0 W 0 calculations are usually performed on top of KS-DFT or HF calculations. A flowchart for a typical G 0 W 0 calculation starting from a KS-DFT Hamiltonian is shown in Figure 10. Note that details of the flowchart depend on the treatment of the frequency dependence discussed in section 4.3. Figure 10 starts with the KS energies {ǫ KS s }, KS orbitals {φ KS s } and the exchange-correlation potential v xc from a DFT calculation. The exchange part of the self-energy x s is directly computed from the DFT orbitals. For the correlation term c s , the frequency integral over G 0 and W 0 must be computed, see Equation (29). If the integral is evaluated numerically, W 0 is computed for a set of frequencies {ω}. The procedure to obtain W 0 is as follows: First, the irreducible polarizability χ 0 (Equation (28)) is computed with the KS energies and orbitals. Second, χ 0 is used to calculate the dielectric function ε (Equation (26)). From the inverse of ε and the bare Coulomb interaction v, we finally obtain the correlation part of the screened Coulomb interaction, see Equations (25) and (30).
Since the QP energies appear on both sides of Equation (22), an iterative procedure is required. More precisely, the correlation term of the self-energy depends on ǫ s and must be updated at each step. Note that only G 0 is a function of the QP energy, while W c 0 depends solely on the frequencies of the integration grid. Therefore, W c 0 can be pre-computed before entering the QP cycle, as displayed in Figure 10.
The correlation self-energy c s is a complex quantity. However, the imaginary part of c s is generally small for frequencies around the QP energy 8 , see Figures 11A,B, where c s (ω) is plotted for the highest occupied molecular orbital (HOMO) of the water molecule. To solve Equation (22), often only the real part of c s is used, which simplifies the matrix algebra to real operations and reduces the computational cost.
A common technique to avoid the re-calculation of the selfenergy at each iteration step of the QP cycle is the linearization of Equation (22) (Giantomassi et al., 2011;Liu et al., 2016;Wilhelm 8 While the imaginary part might be small, it is nonetheless important as its inverse is proportional to the lifetime of the state.   al., 2016). Assuming that the difference between QP and meanfield energies is relatively small, the matrix elements c s can be Taylor expanded to first-order around ǫ 0 s : The self-energy matrix elements are now only required at the mean-field energies ǫ 0 s . Z s is known as the QP renormalization  factor, because it measures how much spectral weight the QP peak carries (see also Equation (11) in section 2.2). The QP solution (main peak) is characterized by large Z s values, which lie around 0.7 to 0.8 for simple insulators, semiconductors and metals (Aulbur et al., 2000;Laasner, 2014) and around 0.9 for the molecules in Figure 12. Small Z s values indicate satellite features.
The linearization error depends on the state s. The deviation from the full iterative solution usually is in the range of 0.1 eV for the HOMO, as shown in Figure 12 for a set of small molecules. The Taylor expansion of the QP equation becomes less and less accurate for larger binding energies because the absolute distance between DFT eigenvalues and QP energies increases (i.e., the G 0 W 0 correction increases). For the deeper valence states, the linearization error is already as large as 0.5 eV (see Figure 12).
Another alternative to iterating the QP equation is to find a graphical solution. As shown in Figure 11A, the real part of the self-energy matrix elements is computed and plotted on a fine grid of real frequencies {ω} around the expected solution. All intersections of the straight line ω + v xc s − x s − ǫ 0 s with Re c s (ω) are then solutions of Equation (22). The intersection with the largest spectral weight Z s is the QP solution and is characterized by a small slope of Re c s . Another way to calculate the QP excitations is to compute the diagonal elements of the spectral function, A ss (ω), for a set of frequencies as shown in Figure 11C. This is the most accurate procedure to obtain QP energies among the methods discussed here. A ss (ω) is computed from the complex self-energy ( c s = Re c s + i Im c s ), where we employed Equation (3), the Dyson equation, G = G 0 + G 0 G, and used only the diagonal matrix elements of . Figure 11C confirms that the solution at around ≈ −12.0 eV is the main solution. The spectral weight of the other solutions, e.g., the satellite peaks in the frequency range −30 to −25 eV, is indeed very small. The aforementioned iterative procedure is computationally far more efficient than the graphical solution or the calculation of A ss (ω). The number of required QP cycles N QP typically ranges between 5 and 15 and the self-energy only has to be computed for N QP many frequencies. However, the spectral function takes also the imaginary part of the self-energy into account. This is essential for the accurate computation of satellite features in the GW spectrum (Zhou et al., 2015;Reining, 2017). Satellites fall usually in a region where Re c s has poles, as demonstrated in Figure 11A. In these regions, the imaginary part Im c s exhibits complementary peaks and is non-zero (Kramers-Kronig relation), see Figure 11B. Note that the graphical solution indicates the expected range of the satellite peaks, but does not predict their positions accurately because the imaginary part is omitted.

Frequency Treatment
The frequency integration in Equation (23) is one of the major difficulties in a G 0 W 0 calculation since both functions that are integrated, G 0 and W 0 , have poles infinitesimally above and below the real frequency axis. In principle, a numerical integration of Equation (23) is possible, but potentially unstable since the integrand needs to be evaluated in regions in which it is ill-behaved. However, a toolbox of approximate and exact alternatives is available. The most frequently used methods are summarized in the following.

Plasmon-Pole Models
The simplest way to calculate the frequency integral is to approximate the frequency dependence of the dielectric function ε and thus the screened Coulomb interaction W 0 by a plasmon pole model (PPM) (Hybertsen and Louie, 1986). The PPM approximation takes advantage of the fact that ε −1 is usually dominated by a pole at the plasma frequency ω p (Hybertsen and Louie, 1986). This pole corresponds to a collective charge-neutral excitation (a plasmon) in the material. Assuming that only one plasmon branch is excited, the shape of ε can be modeled by a single-pole function where andω are two parameters in the model, whose squares are proportional to ω 2 p , see Giantomassi et al. (2011). ε, andω are matrices typically expressed in a plane wave basis because PPMs are mostly used for periodic systems. Note that Equation (38) holds for each matrix element and that we take the square of the matrix elements in Equation (38) and not the square of the matrix itself. Using a model function for ε −1 , the expression for W 0 is greatly simplified resulting in an analytic expression for the self-energy, see Deslippe et al. (2012).
The two parameters, andω, can be determined in several ways leading to different flavors of the PPM approximation (Giantomassi et al., 2011). The most common PPMs are the Hybertsen-Louie (HL) (Hybertsen and Louie, 1986) and the Godby-Needs (GN) (Godby and Needs, 1989) model. The parameters in the HL model are obtained by requiring that the PPM reproduces the value of ε −1 in the static limit (ω = 0) and that the so-called f -sum rule is fulfilled. The f -sum rule is a generalized frequency sum rule relating the imaginary part of ε −1 to ω p and the electron density in reciprocal space (Johnson, 1974). In the HL model, the low and high real frequency limits are exact and ε has to be calculated explicitly only at ω = 0. The parameters of the GN PPM are determined by calculating ε at ω = 0 and an imaginary frequency point iω ′ p , where ω ′ p is typically chosen to be close to the plasma frequency ω p . The latter corresponds to the energy of the plasmon peak in the electron energy loss spectra (EELS) and can be obtained from experiment. Alternatively, ω ′ p follows from the average electronic density ρ 0 per volume, w ′ p = √ 4πρ 0 (Giantomassi et al., 2011).
A comparison between different PPMs, namely the HL, GN, Linden-Horsch (von der Linden and Horsch, 1988), and Engel-Farid (Engel and Farid, 1993) model, can be found in Larson et al. (2013) and Stankovski et al. (2011). There it was shown that the GN model best reproduces the inverse dielectric function and the corresponding QP energies of reference calculations with an exact full-frequency treatment, such as the contour deformation discussed in section 4.3.2. However, even the accuracy of the GN model decreases further away from the Fermi energy, i.e., for low-lying occupied and high-lying unoccupied states (Cazzaniga, 2012;Laasner, 2014).
While PPMs made the first G 0 W 0 calculations tractable Louie, 1985, 1986), full frequency calculations are now the norm, because the effects of the plasmon-pole approximation on the overall accuracy of the calculation are often hard to judge Miglio et al., 2012). Moreover, the imaginary part of the self-energy becomes nonzero only at the plasmon poles, which implies that QP lifetimes cannot properly be calculated with PPMs, see Equation (98) and section 6.3. However, PPMs are still used in large scale G 0 W 0 calculations , for example for solids (Jain et al., 2011;Reyes-Lillo et al., 2016), surfaces (Löser et al., 2012), 2D materials (Dvorak and Wu, 2015;Qiu et al., 2016;Drüppel et al., 2018), graphene nanoribbons Talirz et al., 2017) or polymers (Hogan et al., 2013;Lüder et al., 2016).
The application of PPMs to molecules is conceptually less straightforward because the dominant charge neutral excitations in molecules are not necessarily collective. This raises the question of how to define the plasma frequency ω p of a molecule. Nevertheless, PPMs have also been used in benchmark studies for molecules, where mean absolute deviations of 0.5 eV from accurate frequency integration methods were reported (van Setten et al., 2015).
The plasmon-pole model can be extended to an arbitrary number of poles, as proposed by Rehr and coworkers (Soininen et al., 2003(Soininen et al., , 2005Kas et al., 2007). If many frequencies are required to determine the parameters in the model, the computational cost for the evaluation of is not necessarily reduced compared to full-frequency methods. However, multipole models are also well-defined for finite systems since the existence of a distinct plasmon peak is no longer an inherent assumption of the model (Kas et al., 2007).

Contour Deformation
The contour deformation (CD) approach is a widely used, fullfrequency integration technique for the calculation of c (ω) (Godby et al., 1988;Lebègue et al., 2003;Kotani et al., 2007a;Gonze et al., 2009;Blase et al., 2011;Govoni and Galli, 2015;Golze et al., 2018). In the CD approach, the real-frequency integration is carried out by using the contour integral, see Figure 13. By extending the integrand to the complex plane, the numerically unstable integration along the real-frequency axis, where the poles of G 0 and W 0 are located, is avoided.
The integral along the contours shown in Figure 13 has four terms: an integral along the real (Re) and the imaginary axis (Im) and along the arcs.
The contour integral is evaluated by taking the contours to infinity, which implies that the radius of the arcs is infinite. For infinitely large ω ′ , G 0 (ω + ω ′ )W 0 (ω ′ ) vanishes and the integral along the arcs of contours Ŵ + and Ŵ − is zero. Therefore, FIGURE 13 | Contour deformation technique: Integration paths in the complex plane to evaluate c (ω). Ŵ + and Ŵ − are the integration contours, which are chosen such that the poles of G 0 , but not the poles of W 0 are enclosed. Ŵ + encircles the upper right and Ŵ − the lower left part of the complex plane. ω ′ denotes frequencies of the integration grid and ω the frequency at which c is calculated.
we can compute the real-frequency integral by subtracting the imaginary-frequency integral from the contour integral. After rearranging Equation (39) and using Equation (23), we obtain where the second term is the integral along the imaginary axis. The contours Ŵ + and Ŵ − are chosen such that only the poles of G 0 fall into D Ŵ + and D Ŵ − , which denote the subsets of the complex plane encircled by Ŵ + and Ŵ − , respectively. The location of the poles of G 0 (ω + ω ′ ) depends on the frequency ω at which the self-energy is computed. Recalling Equation (24), the poles of G 0 lie at the complex frequencies For ω < E F , these poles can enter only D Ŵ + and must arise from occupied states. Our example in Figure 13 displays a case were ω < ǫ (HOMO−1) . Two poles, namely [ǫ (HOMO) − ω] and [ǫ (HOMO−1) −ω], fall into D Ŵ + . For an even smaller ω, more poles from deeper occupied states will shift into D Ŵ + . Conversely, for ω > E F , the poles from the unoccupied states will enter D Ŵ − . We can now calculate the residues of the poles that are in D Ŵ + or D Ŵ − . Employing the residue theorem, the contour integral is then replaced by a sum over these residues: The integral along the imaginary frequency axis is smooth (Rieger et al., 1999;Giantomassi et al., 2011) and the integration is performed numerically. The size of the frequency grid for the numerical integration needs to be carefully converged. For more details and a derivation of the final CD equations see e.g., Golze et al. (2018) or Govoni and Galli (2015).

Analytic Continuation
Analytic continuation (AC) from the imaginary to the real frequency axis is another method in our toolbox that enables an integration over the full-frequency range. The AC technique exploits the fact that the integral of the self-energy along the imaginary frequency axis, is smooth and easy to evaluate, unlike the integral along the realfrequency axis. However, the QP energies and spectral functions are measured for real frequencies.
To return from the imaginary to the real frequency axis, the procedure is as follows: The self-energy is first calculated for a set of imaginary frequencies {iω} and then continued to the real-frequency axis by fitting the matrix elements c s (iω) to a multipole model. A common analytic form is, e.g., the so-called 2-pole-model (Rojas et al., 1995;Rieger et al., 1999) c s (iω) ≈ 2 j=1 a s,j iω + b s,j + a s,0 , which has been widely used for G 0 W 0 calculations of materials (Rieger et al., 1999;Friedrich et al., 2010;Pham et al., 2013) and molecules (Ke, 2011;van Setten et al., 2015;Wilhelm et al., 2016). The unknown complex coefficients a s,j and b s,j are determined by a nonlinear least-squares fit. From the identity theorem of complex analysis we know that the analytic form of a complex differentiable function on the real and imaginary axis are identical. Therefore, we can finally calculate the self-energy in the real-frequency domain by replacing iω with ω in Equation (44). An alternative multipole model function is the popular Padé approximant (van Setten et al., 2015;Liu et al., 2016;Wilhelm et al., 2018), which is more flexible but contains more parameters. In the Padé approximation, the complex fitting coefficients are not obtained by a nonlinear least-squares fit, but recursively from the matrix elements c s (iω) and the imaginary frequencies {iω} (Vidberg and Serene, 1977). In Figure 14 we compare the real self-energy matrix elements Re c s obtained from the AC approach to an implementation on the real-frequency axis such as the CD method. The Padé approximation reproduces the self-energy exactly in the frequency range around the QP energy of the HOMO. The deviation in the HOMO-QP energy is smaller than 10 −4 eV with respect to the CD results. By using a Padé approximant with a large number of parameters, even some features of the pole structure at higher and lower frequencies are reproduced, as shown in Figure 14. The 2-pole model, on the contrary, is significantly less accurate and yields an error of around 0.1 eV in the first ionization potential. For a comprehensive comparison between the Padé and 2-pole model see van Setten et al. (2015).
The reliability of the AC approach is limited to valence excitations because the self-energy structure of deeper states shows poles closer to the QP solution. Our recent work  showed that the AC technique fails drastically to describe the complicated features of the self-energy for core states resulting in errors of 10-20 eV for the core-level binding energies. Furthermore, satellite features are difficult to obtain. As discussed in section 4.2, satellites lie in regions in which Re c s has poles. As evident from Figure 14, these poles can only partly be reproduced by the AC.
The convergence parameters for the AC approach are the number of frequency points {iω}, for which the self-energy is computed, and the size of the frequency grid for the numerical integration over ω ′ . In practice, the same grid is often employed for {iω} and {iω ′ } (Ke, 2011;Wilhelm et al., 2016).

Fully Analytic Approach
The integral in Equation (23) can be carried out fully analytically. In this case, the Adler-Wiser sum-over-states representation of the polarizability introduced in section 4.1 is not used. Instead we start from the reducible polarizability χ(ω). In the spectral representation, χ(ω) is given as sum of its poles n in the complex plane The pole positions n correspond to charge neutral excitation energies and ρ n (r) denotes transition densities. Equation (45) would be exact for the exact n and ρ n (r). Both quantities are obtained by solving a conventional eigenvalue problem. The equations that are solved are identical to the Casida equations (Casida, 1995b), except that for G 0 W 0 the exchange-correlation kernel is omitted (otherwise it would be time-dependent densityfunctional theory). The reducible polarizability χ(ω) can then be expanded in terms of χ 0 in a Dyson series and we can thus rewrite W 0 given in Equation (25) in terms of Inserting Equation (45) yields a pole expansion for W 0 . The selfenergy integral can then be solved analytically and we obtain a closed expression for c s (ω): where P n (r, r ′ ) = ρ n (r)ρ * n (r ′ ). More precisely, c s (ω) also becomes a sum over the poles n . Equation (48) is therefore similar to the PPM approximation, except that we sum over the exact poles of W 0 and not over the poles of a W 0 -model function. A detailed description of the fully-analytic frequency treatment can be found in van Setten et al. (2013). Equivalent expressions are also given in Hedin's review article from 1999 (Hedin, 1999) and were applied in Tiago and Chelikowsky (2006), Bruneval (2012), and Bruneval et al. (2016).

Comparison of Accuracy and Computational Cost
In the previous sections three full-frequency integration techniques have been introduced: the CD, the AC and the fullyanalytic approach. The CD and fully analytic method compute the self-energy directly for real frequencies. By design, the fully analytic approach is in principle the most exact one since it is parameter-free, except for the dependence on the basis set and the broadening parameter η. However, the same accuracy can already be achieved with the CD using a moderately-sized numerical integration grid for the imaginary frequency term . In the AC approach the self-energy is calculated on the imaginary frequency axis, which is fairly featureless. The accuracy of the AC approach depends on the features of the self-energy on the real axis and on the flexibility of the model function, which continues the self-energy to real frequencies.
Generally, QP energies of valence states are well reproduced (van Setten et al., 2015), while the AC is likely to fail for deeper states as discussed in section 4.3.3. In the PPM approximation, the self-energy integral is simplified by introducing a model function for W 0 . The accuracy is therefore determined by the chosen model function and generally difficult to estimate.
The fully-analytic approach is the computationally most expensive method in our toolbox since solving the eigenvalue problem to obtain the poles of χ is an O(N 6 ) step, where N defines the size of the system. The scaling of the CD and AC approach is generally lower, but depends on the details of the implementation. The CD method requires more computational resources than the AC methods due to the additional sum over the residues of the poles of G 0 . The overhead is relatively small for QP energies of valence states, but increases for deeper states due to the steady increase of the number of residues. The PPM is computationally the most efficient method, because the dielectric function ε used to compute W 0 has to be calculated only at a few frequency points to determine the parameters of the PPM.

Basis Sets
In any GW implementation, the QP wave functions {ψ s } and also the mean-field orbitals {φ 0 s } are expanded in a set of normalized basis functions {ϕ j }. Since in G 0 W 0 the QP wave functions are approximated by the KS-DFT or HF ones, we expand in practice where c sj are the expansion coefficients that have to be determined. Performing the G 0 W 0 calculation in a basis transforms the expression for W 0 , ε, and χ 0 into matrix equations suitable for implementation in computer codes. The basis set choice is often guided by the type of system under investigation. In the following we will introduce the most common basis sets with brief comments on their suitability.

Plane Waves
For periodic systems, the energy spacing between discrete energy levels can vanish, in which case the single-particle eigenvalues form bands. According to Bloch's theorem (Bloch, 1929), the single-particle states can be written as Bloch waves where k is a wave vector in the first Brillouin zone and n is the band index. The index s that we had used in Equation (49) and throughout to label states now becomes the compound index nk. The functions u nk (r) have the periodicity of the lattice and can be expanded in plane waves {ϕ G }, where is the volume of the periodic cell and G is a reciprocal lattice vector. Reciprocal lattice vectors G are given by G · t = 2πn, where n is a positive integer and t is a translation vector of the unit cell. G 2 is directly proportional to the kinetic energy E of a free electron. The size of the basis set is characterized by the largest G vector and is usually given in terms of the energy E that corresponds to the largest reciprocal lattice vector, E = G 2 max /2. All G vectors with equal or smaller energies are included in the basis set.
The first GW calculations Louie, 1985, 1986;Godby et al., 1986) were performed for solids with plane wave basis sets. Also today plane waves are common in state-of-theart GW implementations, see Table 1 for a list of GW codes. The real-space representation of G 0 , W 0 , ε, and χ 0 given in Equations (24-28) can be easily transformed into a basis of plane waves by Fourier transforms. For expressions of these quantities in plane waves see, e.g., Hüser et al. (2013b).
Plane wave basis sets are suitable for describing the slowly varying electron density in the valence region, where only the valence orbitals are non-zero. However, the valence wave functions tend to oscillate rapidly close to the nuclei due to orthogonality constraint with respect to the core orbitals. Representing these oscillations requires a large number of plane waves. Plane waves are therefore used in combination with pseudopotentials or the projector-augmented-wave methods (Blöchl, 1994) to approximate the effect of the core electrons. We will introduce the pseudopotential concept in section 4.4.4 and return to plane waves in the context of the projector augmented wave scheme in section 4.4.5.

Localized Basis Sets
While plane waves are mostly used for periodic systems, they can in principle also be used for finite systems by placing, e.g., the molecule in a sufficiently large unit cell to avoid spurious interactions with the neighboring cells. However, large unit cells require a very large plane wave basis set and are therefore computationally expensive. Molecular systems can be more efficiently described by atom-centered localized basis sets.
The most common basis functions of this type are Gaussian basis sets where N l is a normalization constant and Y l,m (θ , φ) are spherical harmonic functions given in spherical coordinates (r, θ , φ). A Gaussian type orbital is characterized by the exponent α and the angular and magnetic quantum numbers l and m, which are dictated by the basis set selection. The design of Gaussian basis sets requires careful optimization regarding the number of functions, their respective angular momentum and exponents α. In quantum chemistry, Gaussian basis sets are widely used and ample experience exists in designing suitable basis sets for correlated methods such as coupled cluster theory. These Gaussian basis sets can then also be used in GW calculations. Another type of localized basis functions used in GW calculations are numeric atom-centered orbitals (NAOs), where u µ (r) are radial functions that are not restricted to any particular shape. The radial part of NAOs is tabulated on dense grids and is fully flexible. Gaussian radial functions can be considered as special types of this general NAO form. Slater type functions, which posses an exponential decay at long range and a cusp at the position of the nuclei, have been also used in GW calculations (Stan et al., 2009). However, this basis set type is less common.
Local basis functions, in particular NAOs that derive from atomic orbitals, are well suited to describe rapid oscillations of wave functions near the nucleus. They are therefore the obvious choice for QP calculations of core and semi-core states.

Augmented Basis Sets
Augmented plane waves (APW) are another basis set type that includes the rapidly varying oscillations near the nuclei. APW methods use the so-called muffin tin approximation, which is a physically motivated approximation to the shape of the potential in solid state systems (Slater, 1937;Martin, 2004). The shape of the potential resembles a muffin tin: it is peaked at the nuclei and predominantly spherical close to it, while it is flat in between. Therefore, real space is partitioned into nonoverlapping (muffin-tin) spheres MT,a centered around each nuclei a and interstitial regions I between these spheres. The valence wave functions are then expanded in localized NAO-like functions (Equation (54)) inside the spheres and plane waves in the interstitial regions.
By construction, the APW basis sets produce wave functions with a discontinuity in the first derivative at the muffin-tin boundaries. The linear APW (LAPW) was proposed to guarantee that the solution in the muffin-tin matches continuously and differentiably onto the plane wave part in the interstitial region (Andersen, 1975). With this extension, the explicit form of the LAPW basis functions is where u a l (r) and its derivativeu a l (r) are radial functions centered at the atom a. The coefficients A lm and B lm are determined such that continuity in value and derivative of the basis functions at the muffin-tin boundaries is ensured.
LAPW basis sets can be extended by additional local orbitals, LAPW+lo, that are completely localized in the muffin-tin spheres and go to zero at the boundaries. Inclusion of such local orbitals significantly improves the variational freedom, e.g., the description of d and f electrons (Singh, 1991). It has furthermore been shown that these local orbitals are particularly important for the unoccupied state convergence in GW calculations Jiang and Blaha, 2016;Jiang, 2018).
A general form of the LAPW method are full-potential LAPW (FLAPW) methods that make no approximations on the shape of the potential (Wimmer et al., 1981) and which are nowadays standard in LAPW codes. Recently a number of FLAPW GW codes have emerged (Friedrich et al., 2006(Friedrich et al., , 2010Jiang et al., 2013;Gulans et al., 2014).
Linear muffin-tin orbital (LMTO) schemes are very similar to LAPW basis sets, except that the basis functions in the interstitial region are not plane waves (Andersen, 1975), but for example smooth Hankel functions (Methfessel et al., 2000).
In these augmented basis sets it is straightforward to include core and semicore states in the Green's function G 0 (Equation (24)) and the polarizability χ 0 (Equation (28)) and therefore in the self-energy. This, in principle, improves the description of QP excitations of valence states and band gaps, even though it has been found that the difference to carefully adjusted plane wave-based projector augmented-wave (PAW) calculations (see section 4.4.5) is typically less than 100 meV (Nabok et al., 2016). However, the same study reported larger differences for deep-lying and very localized d and f states (Nabok et al., 2016). Core excitations are in principle also accessible with FLAPW basis sets. However, these have not been thoroughly investigated yet.
For local and semi-local DFT functionals, the (F)LAPW basis sets have become the ultimate accuracy reference, closely followed by NAOs (Lejaeghere et al., 2014(Lejaeghere et al., , 2016. For G 0 W 0 , first steps in systematically benchmarking solids were made only recently . For molecules, G 0 W 0 benchmark calculations emerged during the last years and we will discuss them in section 9.3. The jury is therefore still out on which basis set is most accurate for solids.

Pseudopotentials
GW calculations can be grouped in two categories: those that take all electrons of the system into consideration and those that partition into valence and core electrons. In this latter case, only the valence electrons enter the GW (and the preceding DFT) calculation explicitly, whereas the effect of the core electrons is taken into account only indirectly, for example through a pseudopotential. Such core-valence partitioning is motivated by the observation that deep core states are relatively inert and do not contribute to chemical bonding. The advantage of using a partitioning scheme is that the electron number in the GW calculation is reduced, which decreases the computational cost. An obvious disadvantage is that the core electrons may have an effect on the valence electrons, which will be difficult to include appropriately in the GW calculation and then may lead to incorrect results.
Pseudopotentials have been the default way to partition electrons (Martin, 2004;Marx and Hutter, 2009). In a pseudopotential, the core electrons are removed and the Coulomb potential of the nucleus and the inner-shell electrons is replaced by a smooth effective potential that acts on the valence electrons. The potentials are generated from calculations of isolated atoms. They are constructed such that the wave function of the valence electrons match those of an all-electron calculation outside the core region or outside a chosen radius around the nuclei. Inside the core region, the functions are smooth and nodeless. Additional norm-conservation criteria (Hamann et al., 1979;Bachelet et al., 1982), which preserve the orthonormality condition for the pseudo wave function, are usually applied (Troullier and Martins, 1991;Goedecker et al., 1996;Fuchs and Scheffler, 1999). The resulting potential is finite at the origin of the atom and shallow. Pseudopotentials are mostly used in combination with plane waves since the smooth and shallow potentials greatly reduce the required plane wave cutoff and make plane wave GW calculations with these basis sets feasible. In addition, pseudopotentials have been used for GW calculations with localized functions to reduce the basis set size Wilhelm et al., 2016).
The majority of pseudopotential development took place in DFT (Marx and Hutter, 2009). Optimizing the parameters in the pseudopotential to ensure transferability is a complex task and requires thorough testing (Shirley et al., 1990;Goedecker et al., 1996). Transferability means that one and the same pseudopotential should be adequate for an atom in different chemical environments. The parameters of pseudopotentials are precomputed, similar to localized basis sets, and then tabulated for download in libraries like the Pseudo-Dojo (García et al., 2018;van Setten et al., 2018).
In GW, the consistency between pseudopotential and allelectron calculations will almost inevitably be violated. To be fully consistent, the DFT pseudopotentials would have to be cast aside and GW pseudopotentials be used. However, no such GW pseudopotentials have been developed until now, due to the complexity of the GW self-energy, which does not lend itself easily to pseudoization. Even if we had GW pseudopotentials, they would then have to first be used in the preceding DFT calculation, in which they would break the DFT core-valence consistency. Unless we perform fully self-consistent GW calculations, we are stuck with an inconsistency dilemma.
Early efforts toward GW pseudopotentials introduced core polarization effects into DFT pseudopotentials (Shirley and Martin, 1993a;Lee and Needs, 2003). By extending the GW formalism to include core contributions in the dielectric screening and the self-energy, such core-polarization potentials have also been tried successfully in the GW method (Shirley et al., 1997). However, developments in this direction did not continue. The default procedure today for plane wave GW codes is to use only well tested DFT pseudopotentials for the required elements (Govoni and Galli, 2018). Care has to be taken that the scattering states (i.e., the unoccupied states) of the pseudopotential are described well and do not introduce ghost states (Gonze et al., 1990). If no good pseudopotentials are available, it is recommended to either generate customized pseudopotentials, use the PAW method or employ genuine all-electron basis sets. Pseudopotential approaches have to be employed with care in particular for materials with localized d and f electrons. Specific issues in GW calculation of these materials are discussed in section 6.

Projector Augmented-Wave Method (PAW)
The PAW method is commonly used in plane wave G 0 W 0 implementations, see Table 1. It enables computational feasibility and ensures transferability between different chemical environments. The PAW method has been derived by Blöchl combining ideas from the pseudopotential method and LAPW basis sets (Blöchl, 1994). The idea is to express the KS all-electron wave function φ 0 s for state s in terms of a smooth auxiliary functionφ 0 s and correction terms, which restore the oscillating behavior in the core region. Note that for Bloch waves, the label s contains the k and band index n.
To constructφ 0 s , we define a linear transformationT which establishes a connection between φ 0 s andφ 0 s , Since the all-electron wave function is already smooth at a certain distance from the nuclei, we partition the space similarly to LAPW schemes: in atom-specific augmentation regions a around the nuclei, where a is the atom index, and an interstitial region I . The augmentation regions are characterized by the cutoff radii r a c , which should be chosen such that the augmentation spheres do not overlap. Outside the augmentation regions,φ 0 s should be identical to the all-electron wave function. T should thus modify φ 0 s only in a and we definê where the atom-centered transformation,T a , acts only within a . The transformation operator is derived by introducing atomcentered functions as in LAPW, which is described in detail FIGURE 15 | Schematic representation of the projector augmented wave (PAW) scheme. The all-electron wave function φ is constructed from the smooth auxiliary functionφ and corrections from the hard and smooth atom-centered auxiliary wave functions φ a andφ a , respectively.
in Martin (2004) and . The all-electron wave function can then be rewritten as where the atom-centered hard and smooth auxiliary wave functions are denoted by φ a s andφ a s , respectively. "Hard" refers to rapidly varying functions in the core region. The concept of the PAW scheme is visualized in a simplified way in Figure 15. By adding φ a toφ 0 s we obtain the oscillating behavior in the core region, but we have to subtract the smooth functionφ a to cancel the contribution ofφ 0 s in a . That implies that the following conditions must hold The atom-centered auxiliary wave functions can be expanded in a finite set of local basis functions {ϕ a j } and {φ a j } and a set of projector functionsp a j , where '∼' indicates again smooth functions. These expansions are given by The variational object in a PAW calculation isφ 0 s . The latter is expanded using, e.g., a plane wave basis set, for which a low energy cutoff can be used due to its smoothness. The local basis sets and projector functions needed to compute the second and third terms in Equation (57) are tabulated for each element of the periodic table. For specific choices of these basis sets see, e.g., Kresse and Joubert (1999) and . Details regarding the practical implementation within a plane wave code are given in Kresse and Joubert (1999) and for real-space grid codes in Mortensen et al. (2005) and Enkovaara et al. (2010).
GW calculations within the PAW schemes employ usually the frozen core approximation (Shishkin and Kresse, 2006a;Liu et al., 2016). The core states are localized at the atoms and confined in a . In the frozen core approximation we assume that the KS core states are identical to the atomic core states α, i.e., φ 0 s = ϕ a α . In this approximation, the decomposition given in Equation (57) is not used for the core states. However, the effect of the core on the valence states is correctly described.
The accuracy of the expansion in Equation (57) depends on the completeness of the set of localized basis and projector functions ({ϕ a j }, {φ a j } and {p a j }). Achieving completeness is easy for occupied states, but practically impossible if s corresponds to a high-energy empty state, which has been discussed by Klimeŝ et al. (2014). However, for a GW calculation, many of these highenergy empty states need to be included, which is explained in more detail in section 4.5. These incompleteness issues lead to a violation of the norm-conservation for the unoccupied states, which can be the source of substantial errors, in particular for elements with d and f electrons. This error can be avoided using norm-conserving instead of the standard PAW potentials for GW calculations (Klimeŝ et al., 2014).

Basis Set Convergence
The first criteria for a reliable G 0 W 0 calculation is that the underlying DFT or HF calculation is converged. This convergence has to be checked for all basis set types. The second convergence criteria is the size of the basis set in the G 0 W 0 calculation itself. In quantum chemistry it is well established that correlated electronic structure methods converge slowly with respect to the number of basis functions (Kendall et al., 1992;Kutzelnigg and Morgan, 1992;Klopper et al., 1999). The same has been also observed for G 0 W 0 (Shih et al., 2010;Friedrich et al., 2011;Bruneval, 2012;Yan et al., 2012;Bruneval and Marques, 2013;van Setten et al., 2013van Setten et al., , 2015Jacquemin et al., 2015;Bruneval et al., 2016;Wilhelm et al., 2016). It has been demonstrated that the convergence rate of G 0 W 0 is comparable to other correlated methods such as second-order Møller-Plesset perturbation theory (MP2) and the coupled cluster singles, doubles and perturbative triples [CCSD(T)] method (Bruneval and Marques, 2013).
Converging G 0 W 0 excitations within a plane wave basis set is straightforward since the cutoff energy can be continuously increased, see Figure 16A. In addition, extrapolation schemes to the complete basis set limit have been reported to reduce the computational cost (Klimeŝ et al., 2014;Govoni and Galli, 2018). Conversely, for localized basis sets only a limited number of basis set sizes is available and a steady increase in size as for plane waves is not possible. Therefore, extrapolation techniques must always be used to obtain converged G 0 W 0 energies. This is displayed in Figure 16B, where the first IP of benzene is plotted with respect to the inverse of the basis set size. Shown are results for the Dunning basis set family cc-pVnZ (n=3-6) (Dunning, 1989;Wilson et al., 1996), which was designed to smoothly reach the complete basis set limit. Increasing values of n indicate increasingly large basis sets. The convergence is smooth, but very slow, as shown in Figure 16B.
The extrapolation to an infinite number of basis functions is performed by a linear regression with respect to the inverse of the total number of basis functions. The extrapolated value of 9.17 eV in Figure 16 is 0.07 eV larger than the IP obtained at the cc-pV6Z level showing that even the largest basis set cannot converge the G 0 W 0 energies completely. This linear fitting procedure is a well-established scheme to extrapolate G 0 W 0 energies and has been tested in large benchmark studies (van Setten et al., 2015). Alternatively, linear regression has also been performed with respect to C −3 n , where C n is the cardinal number of the basis set, i.e., 3 for cc-pVTZ, 4 for cc-pVQZ, 5 for cc-pV5Z and 6 for cc-pV6Z. Extrapolation with respect to C −3 n is well-established for correlated methods (Helgaker et al., 1997). The inverse of the basis set number corresponds roughly to C −3 n . The average difference between the two extrapolation schemes for G 0 W 0 energies is indeed very small with 0.04 eV (van Setten et al., 2015).
A common misconception in the plane wave community is that the number of unoccupied states that enter a G 0 W 0 calculation is a separate convergence parameter. The number of unoccupied states that can be resolved with a given basis set typically grows with the size of that basis set, i.e., the Hilbert space of that basis grows. This implies that more empty states enter the sums in the Green's function (Equation (24)) and the polarizability (Equation (28)). Since it is computationally expensive to generate a large number of unoccupied states in the preceding plane wave DFT or HF calculation, plane wave G 0 W 0 practitioners reduced the number of unoccupied states that enter the GW calculation in order to save computational time van Setten et al., 2017). Localized basis sets on the other hand are significantly smaller and typically all virtual states are computed even in DFT-only calculations. Figure 16A gives an impression of the scale for the plane wave case. It displays the convergence of the band gap of wurtzite ZnO with respect to the number of bands (states) (Yan et al., 2012). The convergence rate of ZnO is particularly slow Stankovski et al., 2011) compared to other semiconductors, e.g., silicon (Friedrich et al., 2006). The band gap finally converges at around 30 Ha. At this point almost 3000 bands have been included in the G 0 W 0 calculation.
While it might seem appealing to reduce the number of required unoccupied states to less than 3,000, Figure 16A illustrates that a reduction is not possible due to the slow convergence. Since the number of resolvable, unoccupied states is coupled to the plane wave cutoff Gao et al., 2016;van Setten et al., 2017), one should always include the maximum number of bands in the G 0 W 0 calculation for a given plane wave cutoff. Such a procedure also greatly simplifies the convergence study since only one and not two parameters need to be converged.

Elimination of Unoccupied State Summation
The complications around the virtual state convergence raised in the previous section for plane wave basis sets can be bypassed completely by eliminating the explicit summation over unoccupied states in the Green's function G 0 (Equation (24)) and the polarizability χ 0 (Equation (28)) Wilson et al., 2008Wilson et al., , 2009Berger et al., 2010;Giustino et al., 2010a;Umari et al., 2010). A practical method for building a perturbation theory without explicit reliance on unoccupied states was pioneered in the context of density functional perturbation theory (DFPT) (Baroni et al., 1987;Gonze, 1995Gonze, , 1997. Here we will briefly review how the DFPT concept can be transferred to G 0 W 0 . For a general introduction to the DFPT formalism see (Baroni et al., 2001).
The central object in DFPT is the response function that measures the response of a system to a perturbation V. In the G 0 W 0 context, we are interested in the response to the introduction of an additional charge to the system at point r. The additional charge perturbs the charge density of the system. The response function mediates the charge density change and the perturbation. We now wish to calculate the response function without invoking the sum over states expression introduced in Equation (28).
We start with the change in the charge density n(r, r ′ , ω) given for a spin-unpolarized system by Giustino et al. (2010a) Here φ 0 i (r, r ′ , ±ω) is the frequency-dependent variation of the occupied mean-field state i. Instead of expanding φ 0 i (r, r ′ , ±ω) in the basis of unperturbed mean-field states φ 0 i (r) it is calculated directly with the Sternheimer equation (Baroni et al., 1987;Giustino et al., 2010a) There are two possible choices for V. The first one is to set the perturbation to the bare Coulomb interaction v(r, r ′ ) ) This choice is known as non-self-consistent Sternheimer GW. The Sternheimer G 0 W 0 formalism is shown in Figure 17.
Quantities that depend on r ′ are expanded in a basis {ϕ k (r ′ )}, see Equation (49), and both sides of Equation (61) are projected onto ϕ l (r ′ ). This leads to a linear set of equations with a parametric dependence on r and ±ω. Solving the Sternheimer equation for each real-space grid point r and the frequencies ±ω yields φ 0 i (r, r ′ , ±ω) for the occupied state i. From the latter we can evaluate the induced charge density n. The dielectric function given in Equation (26) can be rewritten in terms of n Lambert and Giustino, 2013) W 0 is then calculated from the inverse of ε according to Equation (25) as usual. The second choice for V is to set it to the screened Coulomb interaction leading to the self-consistent Sternheimer GW formalism introduced by Giustino et al. (2010a). In this scheme, the (selfconsistent) induced charge density n SC generates a potential V scr , which screens the bare Coulomb potential v due to the perturbative charge in the system. From V scr we can directly calculate W 0 as It can be shown that Equation (66) is equivalent to Equation (25) (Giustino et al., 2010a). Since W 0 appears on the right-hand side of Equation (61), it must be solved self-consistently. In the first step, W 0 is initialized with v. From the solutions of the Sternheimer equation, we calculate n SC , V scr and finally W 0 . Both schemes yield W 0 , but the non-self-consistent approach requires fewer steps. However, the dimensions of the dielectric matrix increase with system size and its inversion might become a bottleneck for large systems, in particular when using plane wave basis sets. In this case the self-consistent scheme might be more efficient.
The two schemes discussed so far address the elimination of empty states in W 0 . Removing the sum over virtual states in G 0 is also possible by using a similar strategy as for W 0 , see Giustino et al. (2010a) for a detailed description. Once G 0 and W 0 have been obtained, the self-energy is composed as usual and the frequency integration is performed with the methods described in section 4.3. Sternheimer approaches have been implemented for plasmon pole models , the analytic continuation (Giustino et al., 2010a) and contour deformation (Govoni and Galli, 2015).
The Sternheimer GW approach is primarily used in plane wave implementations Nguyen et al., 2012;Lambert and Giustino, 2013;Pham et al., 2013;Govoni and Galli, 2015). We are only aware of one non-plane wave implementation using mixed representations of real space and localized basis sets (Hübener et al., 2012a,b). As discussed in section 4.5, converging a G 0 W 0 calculation with plane waves requires a very large number of empty states. The calculation of all empty states in the preceding DFT or HF calculation is computationally expensive and can easily become a computational and storage bottleneck. In the Sternheimer approach, the preceding DFT step is significantly simplified since only occupied states have to be calculated. For localized basis sets, no such benefit is found in DFPT or Sternheimer since the number of virtual states is typically not that large and rarely a bottleneck (Shang et al., 2018).
Sternheimer G 0 W 0 saves not only computational time in the preceding mean-field calculation, but also by not having to carry out the sums over states in G 0 and χ 0 . However, it concomitantly loses time in the Sternheimer iterations. To our knowledge, a detailed comparison of the computational cost to conventional G 0 W 0 implementations has not been reported yet. To speed up Sternheimer G 0 W 0 , projection techniques for representing the dielectric matrix in an optimal, smaller basis (Wilson et al., 2008(Wilson et al., , 2009Nguyen et al., 2012;Pham et al., 2013;Govoni and Galli, 2015) and Lanczos-chain algorithms that efficiently obtain the Sternheimer solution over a broad frequency range (Umari et al., 2010) have been developed. Furthermore, all the Sternheimer equations, that need to be solved for each r and ω, are independent from each other facilitating massively parallel implementation (Govoni and Galli, 2015;Schlipf et al., 2019).
The Sternheimer approach does not reduce the basis set size, i.e., the plane wave cutoff or equivalently the size of the real-space grid, nor does it change the formal scaling of G 0 W 0 with respect to system size. However, it is an elegant way to facilitate easier convergence, since the temptation of converging the number of virtual states separately is removed.
A modified Sternheimer ansatz has been developed for FLAPW basis sets which accounts for response contributions outside the Hilbert space spanned by the basis set (Betzinger et al., , 2013(Betzinger et al., , 2015. This modified approach thus allows the basis set size to be decreased, unlike the classical Sternheimer technique. The explicit summation over unoccupied states is not completely removed, but the number of empty states needed for convergence is strongly reduced.

Starting Point Dependence and Optimization
The results of a G 0 W 0 calculation depend on the wave functions {φ 0 s } and the energies ǫ 0 s that are used as input for the Green's function (G 0 ) and the screened Coulomb interaction (W 0 ). The single-particle wave functions and energies are determined by the choice of the single-particle mean-field Hamiltonianĥ MF , e.g., by the chosen DFT functional. To denote this dependence, we will introduce the notation G 0 W 0 @starting point.

How Severe Is the Dependence on the Reference State?
Until the early 2000s, the majority of all G 0 W 0 calculations were based on DFT calculations using the local-density approximation (LDA) or a generalized gradient approximation (GGA) (Aryasetiawan and Gunnarsson, 1998;Aulbur et al., 2000;Onida et al., 2002). With the advent of exact-exchange based DFT functionals in the solid state community and the proliferation of G 0 W 0 codes that are based on quantum chemical codes, a more diverse range of starting points became available. It was soon realized that G 0 W 0 can exhibit a pronounced startingpoint dependence for semiconductors (Rinke et al., 2005;Fuchs et al., 2007). In the last years, the starting point dependence has also been intensively discussed for molecules Marom et al., 2012;Bruneval and Marques, 2013;Gallandi and Körzdörfer, 2015;Gallandi et al., 2016). Figure 18 illustrates the starting-point dependence for the HOMO of the water molecule. For the underlying DFT calculations, the PBE-based hybrid (PBEh) functional family (Perdew et al., 1996b;Adamo and Barone, 1999;Ernzerhof and Scuseria, 1999) was scanned. The PBEh functional family is characterized by an adjustable fraction α of HF exchange. The exchange-correlation energy E xc is therefore α-dependent and given by where E EX x denotes the HF exchange energy. E PBE x and E PBE c are the PBE exchange and correlation energy, respectively. To illustrate the starting point dependence in G 0 W 0 , the mixing parameter α in PBEh was varied from 0 to 1 and then a subsequent G 0 W 0 calculation was performed. The resulting G 0 W 0 HOMO energies shown in Figure 18 span a range of more than 1 eV. Although a 1 eV spread appears large, it is much smaller than the range of the corresponding mean-field  (Page et al., 1988;Lias and Liebman, 2003).
All GW values are extrapolated to the exact basis set limit using the cc-pVnZ (n = 3-5) basis sets. Further computational details are given in Appendix C.
The strong dependence of G 0 W 0 on the starting point can be largely attributed to over-and underscreening. From the Adler-Wiser expression for χ 0 (Equation (28)) it can be deduced that the screening strength in G 0 W 0 is inversely proportional to the eigenvalue gap of the starting point. Since HF typically overestimates gaps, it will underscreen. PBE, on the other hand, underestimates gaps and therefore overscreens.
Another source of error in the KS orbital energies is the spurious self-interaction term (Perdew and Zunger, 1981). The one-electron self-interaction error (SIE) arises from an incomplete cancellation of the unphysical electrostatic Hartree energy of an electron with itself by the exchange-correlation term. The SIE is more pronounced for localized than delocalized orbitals (Körzdörfer et al., 2009). This can be intuitively understood: an electron in a localized orbital has a stronger self-interaction because its wave function is more confined. As a result, the localized orbitals are destabilized with respect to more delocalized orbitals. This can lead to a wrong ordering of the orbital energies in the underlying DFT calculations, which carries over to the GW spectrum. The SIE can be mitigated by a larger amount of exact exchange, which also restores the correct ordering for the QP energies (Marom et al., 2011. The G 0 W 0 starting point dependence generally lies in the range of 1.0 eV for HOMO energies of molecules , but increases for deeper states. For solids, the spread can exceed 2 eV for the band gap (Fuchs et al., 2007). This beckons for a judicious choice of the starting point in G 0 W 0 calculations or an elimination of the starting point dependence. The dependence on the preceding mean-field calculation can be eliminated or reduced by employing some form of self-consistency as discussed in section 5 or, as proposed only very recently, by replacing G 0 by a renormalized singles Green's function (Jin et al., 2019). In this section we focus on the optimal choice of the starting point. The PBEh family of DFT functionals is convenient for this purpose, since one parameter (the amount of exact exchange α) governs the behavior of the starting point. Several schemes have been developed to find the optimal α value within the PBEh functional family Atalla et al., 2013;Pinheiro et al., 2015;Dauth et al., 2016;Bois and Körzdörfer, 2017). We summarize them in the following.

Consistent Starting Point Scheme
Körzdörfer and Marom developed a consistent starting point (CSP) scheme that seeks a PBEh reference state (i.e., starting point) that best resembles the G 0 W 0 spectrum. Splitting both the G 0 W 0 self-energy and the starting hybrid functional into their respective exchange and correlation parts (see Equations (29) and (32)) allows us to rewrite the QP equation (Equation (22)) as follows Marom, 2017) FIGURE 19 | CSP scheme representative for a small molecule. v c s (Equation (70)) is plotted with respect to v x s (Equation (69)) for a set of occupied states s. The HOMO and HOMO-1 states are indicated. The new α value is obtained from the slope of the straight line fitted through the red symbols. Data retrieved from .
v PBE x and v PBE c are the exchange and correlation part of the PBE exchange-correlation potential, respectively. The optimal α is determined so that the shift between G 0 W 0 and PBEh for the occupied states is approximately a constant If Equation (71) is satisfied, the positions of the PBEh orbital energies relative to each other are as close as possible to the G 0 W 0 energies. In this case, the QP correction amounts to a rigid shift of the PBEh spectrum. The value of α for which Equation (71) is satisfied yields the optimal starting point in the CSP scheme.
If the PBEh and the G 0 W 0 @PBEh spectrum matched perfectly, the constant k would be zero. However, in general it is not possible to find a starting point whose spectrum matches the G 0 W 0 spectrum exactly. For a given guess of α, v x s and v c s are calculated according to Equations (69) and (70). v c s is plotted as a function of v x s for a set of occupied states s as data points, see Figure 19. A straight line fit determines a new α which is used to calculate new DFT eigenvalues and orbitals from PBEh(α new ). From the new eigenvalues and orbitals, a new self-energy is calculated and Equations (69) and (70) are reassessed. This procedure is continued until the α of the linear fit equals the initial α. Then the optimal α has been found. By construction, the PBEh(α) eigenvalues are now, up to the rigid shift k, consistent with the G 0 W 0 spectrum. Typical CSP α values lie in the range of 0.25 − 0.30. The CSP scheme has been tested on several organic molecules that are used in organic electronics and yields good agreement with photoemission spectra in all cases Marom, 2017).

Deviation From the Straight Line Scheme
A physically more rigorously motivated optimization scheme is based on the deviation from the straight line error (DSLE) . In 1982, Perdew and co-workers showed that the total energy E of any many-electron system should change linearly when varying the electron number continuously from N to N − 1 electrons (Perdew et al., 1982), The function E(f ) is a piecewise linear function of the occupation number f , with cusps at every integer value of f , see Figure 20. Standard DFT functionals, however, violate this piecewise linearity condition and yield energies that deviate from the straight line at fractional occupation numbers f (Mori- Sánchez et al., 2006;Ruzsinszky et al., 2006;Kraisler and Kronik, 2013). The straight-line condition applies not only to DFT, but to any total energy method (we will address GW total energies in section 10). Therefore, we can use the DSLE to find an optimal starting point for G 0 W 0 . The slope of the total energy as a function of occupation gives the Kohn-Sham eigenvalue or, in the GW case, the quasiparticle excitation energy for a given electron number, where ǫ HOMO,N is the QP energy of the HOMO for the neutral system (N). ǫ LUMO,N−1 denotes the QP energy of the lowest unoccupied orbital (LUMO) for the charged system (N − 1). It is evident from Equations (73) and (74) that the slopes must be identical for f = 0 and f = 1 in an exact theory. In other words, a necessary condition for piecewise linearity is that the energy for removing an electron from the neutral system equals the energy for adding an electron to the positively charge system, i.e., the IP for the neutral systems and the electron affinity (EA) of the charged system should be equal. The difference should thus be zero and if it is not zero it quantifies the deviation from the straight line error DSLE . The idea is now to find a PBEh starting point for which G 0 W 0 @PBEh minimizes DLSE . The optimal α value for PBEh can be found by the following procedure: We select a set of PBEh functionals with α values between 0 and 1. Then two G 0 W 0 calculations are performed for each starting point. One for the neutral system that yields ǫ HOMO,N and a second one for the cation to obtain ǫ LUMO,N-1 . Following Equation (76), we calculate the difference between these two energies to estimate the deviation from the straight line condition. The PBEh(α) functional that yields the smallest DSLE will be the optimal starting point.
This DSLE scheme has been tested for small molecular systems, where it has been shown that the optimal α values are distributed around 0.35 − 0.40 for the first IP Dauth et al., 2016). The reported deviation from the CCSD(T) reference is smaller than 0.2 eV . The drawback of the DSLE scheme is that the optimization is restricted to the HOMO. For other states, the straight line condition could still be formulated, but the corresponding G 0 W 0 calculations could not be performed because the electron occupation function would no longer correspond to an equilibrium distribution (see section 11.3 for GW calculations out of equilibrium). If we removed an electron from a lower lying occupied state, the sums over occupied and virtual orbitals in the polarizability χ 0 would no longer be rigorously defined, i.e., the energy differences in the denominator in Equation (28) would exhibit the wrong sign.

IP-Theorem Schemes
Several other schemes were developed that are, in spirit, similar to the CSP and DSLE optimization approaches, but are explicitly based on the ionization potential (IP) theorem. The latter states that in exact DFT the negative of the KS orbital can be strictly assigned to the first ionization potential IP HOMO (Levy et al., 1984;Almbladh and von Barth, 1985) IP HOMO = −ǫ KS HOMO .
This statement is not true for any other KS state and not valid for approximate DFT functionals. Atalla et al. proposed a scheme that exploits the IP-theorem and minimizes the G 0 W 0 correction for the HOMO level with respect to the amount of exact exchange α in a PBEh starting point (Atalla et al., 2013), This approach is similar to the CSP scheme in Equation (71). The difference is that we find the PBEh functional whose HOMO energy matches that of G 0 W 0 @PBEh for the same α, whereas CSP looks for the closest spectral match between PBEh and G 0 W 0 @PBEh. HOMO excitations obtained from this IPtheorem-tuned scheme agree reasonably well with CCSD(T) reference data (Bois and Körzdörfer, 2017), but are not expected to reproduce the whole excitation spectrum properly (Atalla et al., 2013). They generally yield large αs (around 0.8) and produce HOMOs and HOMO-LUMO gaps that are too large (i.e., underscreened). Finding a PBEh(α) functional that fulfills the IP-theorem by enforcing consistency with the G 0 W 0 spectrum is one option. An alternative approach is to IP tune the hybrid functionals themselves (Bois and Körzdörfer, 2017) by minimizing with respect to α. These IP-tuned hybrids already give accurate KS-HOMO energies. Recent benchmark studies for molecular systems showed that G 0 W 0 corrections on top of IP -tuned functionals provide spectral properties in good agreement with experiment for the whole excitation spectrum (Refaely-Abramson et al., 2012;Egger et al., 2014;Gallandi and Körzdörfer, 2015;Gallandi et al., 2016;Knight et al., 2016;Bois and Körzdörfer, 2017). In particular, EAs are well reproduced with a mean absolute devation (MAD) smaller than 0.2 eV from the CCSD(T) reference, while the MAD reported for IP HOMO is 0.1 eV (Knight et al., 2016).

Computational Complexity and Cost
Of all the GW schemes described in this review, G 0 W 0 is the computationally most efficient one. Only the diagonal elements of the self-energy are needed and the Green's function that enters is always G 0 . The fully interacting G, on the contrary, depends on the full self-energy and can only be obtained by iterating Dyson's equation The computational complexity of G 0 W 0 depends on the frequency integration method and design of the algorithm. The most accurate integration technique, the fully-analytic approach, requires the solution of the full Casida equations, which is an O(N 6 ) step with respect to the system size N, see section 4.3.5. In the canonical implementation, the computational cost is reduced to O(N 4 ). Different implementations with N 4 complexity have been developed employing a variety of numerical techniques specific for the respective basis set (Shishkin and Kresse, 2006a;Blase et al., 2011;Deslippe et al., 2012;Ren et al., 2012a;Govoni and Galli, 2015;Wilhelm et al., 2016). For example, the O(N 4 ) algorithm proposed by Ren et al. employs localized basis functions and the AC method . The computational and memory costs for the four-center electron repulsion integrals (4c-ERIs) are reduced by refactoring the latter in two-and three-center Coulomb integrals using a resolutionof-the-identity (RI) approach with a so-called Coulomb metric (Vahtras et al., 1993). The accuracy of this algorithm has been validated for valence excitations and EAs by comparing to the fully analytic approach (van Setten et al., 2015). However, for core states it was recently shown that AC fails and that CD is required . The computational complexity of CD remains unchanged for valence states, but increases to O(N 5 ) for the deep states.
The O(N 4 ) scaling and overall cost of canonical GW implementations restrict the tractable system size and prohibit the study of many systems that are relevant in the chemistry and physics community, such as solid-liquid interfaces, molecules in solution, complex alloys, nanostructures or hybrid interfaces, that require large simulation cells with hundreds to thousands of atoms. To make G 0 W 0 calculations feasible for larger systems, the scaling and computational complexity have been scrutinized. Developments have proceeded in two directions: (1) reducing the prefactor, i.e., the overall computational cost at O(N 4 ) scaling, or (2) reducing the scaling.
The prefactor can also be controlled by choosing an optimal basis set for the respective system under study. In the last years, several algorithms for localized basis sets have been developed Ke, 2011;Ren et al., 2012a;van Setten et al., 2013;Bruneval et al., 2016;Wilhelm et al., 2016). These basis sets are generally smaller than traditional plane wave basis sets and considerably more efficient for molecules. However, the development of reliable G 0 W 0 algorithms for periodic systems based on localized basis sets is still underway (Wilhelm and Hutter, 2017).
The reduction of the exponent to O(N 3 ) complexity has been addressed in different ways. Foerster et al. developed a cubicscaling G 0 W 0 algorithm using Gaussian basis sets and exploiting locality in the electronic structure, albeit with a high prefactor (Foerster et al., 2011). Recently, two cubic-scaling algorithms have been devised (Liu et al., 2016;Wilhelm et al., 2018), which are both variants of the O(N 3 ) GW space time method (Rojas et al., 1995). The key step of these algorithms is the computation of the irreducible polarizability in imaginary time, χ 0 (it) = −iG 0 (it)G 0 (−it) and the subsequent transformation to imaginary frequencies iω. The time-ordered non-interacting Green's function in imaginary time is given by Inserting G 0 (it), the summation over occupied and virtual states is now decoupled in χ 0 (it) and can be performed separately, which is fundamental for the reduction to O(N 3 ) complexity. Liu et al. based their cubic-scaling algorithm on a plane wave basis set in combination with a PAW scheme and reported a linear-scaling with the number of k points used to sample the Brillouin zone (Liu et al., 2016). In combination, this paves the way for GW calculations of large periodic systems. Wilhelm et al. employed a Gaussian basis set and exploited sparse matrix algebra by using an overlap metric for the RI approximation (RI-SVS) to refactor the 4c-ERIs (Vahtras et al., 1993). The step with the largest prefactor, the computation of χ 0 , is reduced from  Ren et al. (2012a) and the low-scaling algorithm is shown in Figure 21 for graphene nanoribbons. The canonical algorithm is restricted to system sizes of less than 500 atoms, while systems with more than 1,600 atoms can be addressed with the low-scaling implementation. These are some of the largest G 0 W 0 calculations with high accuracy and full-frequency integration reported so far. The mean absolute deviation with respect to the canonical reference implementation in FHIaims  is less than 35 meV for the GW100 test set , which is discussed more in detail in section 9.3.
An actual linear scaling algorithm was devised within the framework of stochastic GW (Neuhauser et al., 2014) and applied to silicon clusters with 1,000 atoms. However, the verification of its general reliability is still the subject of ongoing research (Vlček et al., 2017).

Practical Guidelines
In summary, the following points should be considered when conducting G 0 W 0 calculations:

Frequency integration technique
A sufficiently accurate method for the frequency integration of Equation (23) has to be chosen. The required precision depends on the system and in particular on the states of interest, see section 4.3.

Basis set choice
The decision should be guided by the system of interest. Localized basis sets are generally more efficient for finite systems, while plane wave G 0 W 0 codes are currently superior for extended systems (see detailed discussion in section 4.4). 3. Basis set convergence GW calculations have to be carefully converged with respect to basis set size. Extrapolation procedures to the complete basis set limits might be required as demonstrated in section 4.5.

Starting point
The QP energies depend strongly on the functional of the preceding DFT calculation as shown in section 4.7. While for solid state systems GGA functionals are often suitable starting points (see also section 6), hybrid functionals perform better for molecules. A judicious choice of the starting point is necessary. 5. Convergence of technical parameters G 0 W 0 calculations usually require the convergence of a few additional parameters, which are strongly implementation dependent. Such a parameter is, e.g., the size of the integration grid for the imaginary frequency terms in the CD and AC approach. G 0 W 0 practitioners should always carefully check their GW code to ensure the robustness and convergence of all available settings and parameters The G 0 W 0 approximation provides computationally efficient access to the whole QP spectrum. Despite these appealing features G 0 W 0 has certain drawbacks. The most severe is the dependence on the starting point discussed in section 4.7. Furthermore, the ground state energy and density cannot be computed. In sections 5 and 10, we will show how these drawbacks can be overcome.

Fully Self-consistent GW
To go beyond G 0 W 0 , one must include some level of selfconsistency in Hedin's GW equations. The conceptually purest approach to GW is to perform full self-consistency in the GW equations, denoted as scGW. It is also the most expensive. As introduced in section 3, all four quantities are iterated until self-consistency in the Green's function is achieved. Until now, self-consistent GW is the rarest form of GW for reasons of computational expense and conceptual controversy (see below), although that is slowly changing. The first scGW calculation was performed for the homogeneous electron gas (HEG) by Holm and von Barth (1998), after the same authors had previously applied partial self-consistency (scGW 0 ) (von Barth and Holm, 1996). Later studies were extended to the 2D HEG (García-González and Godby, 2001). scGW deteriorates the spectral properties of the HEG compared to G 0 W 0 . This deterioration manifests itself in a quasiparticle bandwidth that is larger than the free electron one and a broad and featureless satellite spectrum. Both results contradict experimental evidence for alkali metals which are HEG-like (von Barth and Holm, 1996). Also, band gaps of simple semiconductors are greatly overestimated by scGW (Schöne and Eguiluz, 1998;Grumet et al., 2018). scGW calculations for atoms (Stan et al., 2009) and molecules Caruso et al., 2012aCaruso et al., , 2013aMarom et al., 2012) show improvements over G 0 W 0 for the first ionization energies and transport properties  of finite systems. With regard to the whole spectrum, however, scGW is usually outperformed by G 0 W 0 with a judicious starting-point choice Caruso et al., 2013a;Knight et al., 2016). scGW is computationally more demanding than G 0 W 0 because the full Green's function must be stored and calculated (Caruso et al., 2013a), increasing memory and computation requirements. In G 0 W 0 , the full Green's function is only required in O(N 3 ) schemes (see section 4.8). Other implementations make use of the fact that intermediate quantities can be expressed in terms of the mean-field wave functions and eigenvalues, which reduces the computational complexity (see section 4.1). Furthermore, iterations of the GW equations for scGW are expensive. χ 0 , W, and must be computed at each iteration, with a potentially high computational time for even a single evaluation of .
Conceptually, the additional self-consistency in the Green's function adds more reducible diagrams compared to G 0 W 0 , as Figure 22 illustrates. In G 0 W 0 , the bare Coulomb interaction is screened by a series of sequentially interacting electron-hole pairs, or "bubbles." In scGW, this structure is preserved, but the bubbles are now composed of interacting Green's function lines instead of non-interacting G 0 lines. This effect is a general feature of iterating Green's function diagrams. By iterating diagrams for a given approximation, initial G 0 lines at internal times become interacting G lines. Already after the first iteration of the cycle (G 1 in Figure 22), the Green's function contains sequential selfenergy insertions.
Let us look more closely at how this occurs. Recall from Equation (13) that Dyson's equation is The first guess at the full G, labeled G 1 , would then be where we have inserted G 1 (4, 2) in place of G(4, 2) on the righthand-side (RHS). 0 is the first estimate to the self-energy, evaluated with G 0 wherever G lines enter the self-energy diagram. G 1 appears on both sides of Equation (82) − by replacing G 1 (4, 2) on the RHS with the entire RHS, one can generate a reducible diagram for G 1 that is O( 2 0 ). At this point, the series for G 1 contains three parts: G 0 , a term of order O( 0 ), FIGURE 22 | G 0 W 0 and scGW in terms of Feynman diagrams. In G 0 W 0 , the irreducible self-energy is constructed from G 0 and W 0 . The Green's function updated with the lowest order self-energy, G 1 (shown as the bold Green's function line), contains an infinite series of insertions. In fully self-consistent GW, the starting point dependence is removed and all quantities in the diagrammatic expansion are fully dressed. Here, we assume a true G 0 starting point instead of a mean-field G 0 so that subtraction of v MF is not necessary to include in the diagrams. and a term of order O( 2 0 ). These contributions to G 1 are shown in Figure 22. By further iterating G 1 on the RHS, one can generate all reducible diagrams which contribute to G 1 . Despite the infinite number of reducible diagrams generated by this prescription, G 1 is still computed only with the G 0 W 0 self-energy because we have not updated 0 . This example also demonstrates why it is conceptually much simpler to work only with the irreducible self-energy and avoid this infinite, reducible series. Indeed, iterating Equation (82) to find G 1 while keeping 0 fixed is equivalent to generating the entire reducible series for G 1 .
In scGW, the G 0 W 0 calculation of 0 to build G 1 is only the first step. Next, we update 0 to 1 by inserting G 1 into the selfenergy diagram. The diagram contains one obvious G line ( = iGW), but contains more that are hidden in the polarizability entering W. By updating the polarizability with G 1 in place of G 0 , the diagrams contained in G 1 enter the screened Coulomb interaction. Just as before, we can generate all reducible diagrams contributing to the updated Green's function (G 2 ) by iterating the Dyson series G 2 (1, 2) = G 0 (1, 2) + G 0 (1, 3) 1 (3, 4)G 2 (4, 2)d(3, 4) (83) for a fixed 1 . Continuing to update and iterate G introduces more and more reducible diagrams. The solution is selfconsistent when G entering is the same as G from iterating Dyson's equation.
In real scGW calculations, the procedure is slightly different. G and are updated together instead of iterating to find G i+1 at a fixed i . After the first iteration of Equation (82), the updated − but not yet self-consistent − G 1 is inserted into . This way, is updated at each iteration along with G. The combined iterations are much more efficient because G and converge together. Bear in mind that the efficient method of updating G and at each step does not form the same easy-to-interpret series for G 1 as in Figure 22. Note that even after one iteration to find G and , we would already go beyond G 0 W 0 .
Based on Figure 22, the fully dressed Green's function and screened Coulomb interaction in scGW can be interpreted as double renormalizations of G 0 and W 0 through the two Dyson's equations in the GW equations. However, the third Dyson equation that the vertex function (Equation (B26) in Appendix B) would introduce is missing from the GW equations. The absence of the vertex function has important consequences. Figure 23 shows two of the reducible diagrams that enter W in scGW, but that are not present in W 0 . The diagram on the left shows the polarization bubble with the insertion of one interaction, or scattering event, in each arch. It is part of a series of sequential scattering events and builds additional interactions into the screened Coulomb interaction. The other diagram, however, is problematic. After the creation of the first electron-hole pair, both the electron and the hole interact with a new electron-hole pair. The two new electronhole pairs are composed of the same Green's function lines as the initial electron-hole pair, even though the initial pair still exists. Therefore, the two later pairs do not account for the fact that the initial electron-hole pair has already been created − they should somehow omit the pair already created. The electron (or hole) thus interacts or correlates with itself. The problems that have been identified 9 for scGW can be attributed to diagrams like the one on the RHS of Figure 23 (Romaniello et al., 2009).
We would also like to briefly comment on partial selfconsistency in the Green's function. The above discussion points to the screened Coulomb interaction as the major source of imbalance between self-consistency and the missing vertex corrections. This imbalance can be partially fixed by keeping W fixed at the W 0 level and iterating only the Green's function to self-consistency (scGW 0 ). This is shown schematically in Figure 24. This approximation is partially motivated by a "best G, best W" philosophy that can improve agreement with experiment in certain situations. Before leaving the discussion of self-consistent GW, we introduce one of the most technical and modern aspects of 9 It has also been pointed out that the exact W from RPA does not satisfy the f-sum rule. Green's function theory being researched: the existence of multiple solutions for G from a single Dyson equation (Tandetzky et al., 2015). This issue has been studied in detail for the zero-dimensional one-point model (OPM) (Lani et al., 2012;Berger et al., 2014;Tarantino et al., 2017) and has been produced numerically (Kozik et al., 2015;Gunnarsson et al., 2017;Vučičević et al., 2018). In the analytic OPM, there exist two interacting G which can be mapped from the same G 0 (Rossi and Werner, 2015;Stan et al., 2015). One of these solutions is the physical G for all values of interaction strength. Here, the physical solution is characterized by a smooth connection to G 0 , unlike the unphysical solution for G which diverges at zero interaction strength. The reverse map, from G to G 0 , has two solutions for G 0 which must be disentangled at a certain interaction strength. At this point, the physical G 0 switches between the two solutions, so that solving the problem for all interaction strengths requires switching solution methods at this point. Otherwise, one would obtain a physical G 0 for some interaction strengths and an unphysical G 0 for others. In the OPM, this is now understood. However, it is not well understood if or how this same phenomenon emerges in more realistic systems.

Eigenvalue Self-consistency and Level Alignment
There are a few strategies to reduce the expense of scGW while still including more physics than G 0 W 0 . The simplest form of performing approximate self-consistency in GW is to iterate in the eigenvalues (evGW). After completion of the G 0 W 0 loop, the real parts of the quasiparticle energies obtained from Equations (22) or (34) are reinserted into the non-interacting Green's function G 0 (Equation (24)) in place of the starting eigenvalues. Through G 0 , the change in eigenvalues permeates through W 0 to the self-energy and eventually to the quasiparticle energies (evGW). After iterating until the input quantities equal the output, the equations are self-consistent in the eigenvalues.
Eigenvalue self-consistency was already proposed in the first G 0 W 0 calculation for real materials (Hybertsen and Louie, 1986) and has since been applied frequently (see, e.g., Shishkin and Kresse, 2006b for a more in-depth analysis). However, since only the real part of the quasiparticle energies is used and the wave functions are not updated self-consistently, the starting point dependence cannot be eliminated entirely. For example, a study for azabenzenes demonstrates that although the starting point In (A), all five quantities are iterated to self-consistency. In (B), self-consistent GW (scGW), Ŵ is set to a single spacetime point and the remaining four quantities are determined self-consistently. Eigenvalue self-consistent GW shown in (C), evGW, updates only the quasiparticle energies while leaving the wave functions unchanged. In the scGW 0 or evGW 0 procedures shown in (D), one iterates G to self-consistency in Dyson's equation but does not update χ 0 or W. dependence is reduced from 1.4 eV in G 0 W 0 , it cannot be lowered beyond 0.4 eV .
For molecular systems, it has been shown that eigenvalue self-consistency improves the HOMO-LUMO gaps, which are then in good agreement with experiment Wilhelm et al., 2016). However, examining the entire spectrum reveals that evGW does not lead to consistent improvements over G 0 W 0 . The evGW spectra are overly stretched with respect to the experimental spectra, such that large deviations (on the order of 1 eV) from experiment occur for lower lying states. Moreover, for most systems, the orbital ordering deviates from experimental observations . This is in line with observations for semiconductors and insulators, that find band gaps to be considerably overestimated in evGW (Shishkin and Kresse, 2006b). The reason for this overestimation in solids lies in the fact that the insertion of the quasiparticle energies into the screened Coulomb interaction leads to an underscreening, which should be compensated by the missing vertex corrections, as discussed previously.
Just as scGW 0 ameliorates problems in self-consistent GW, one can perform eigenvalue self-consistency only in G to circumvent underscreening errors. Iterating the eigenvalues in only G produces an evGW 0 scheme that gives band gaps in good agreement with experiment (Shishkin and Kresse, 2006b).
However, for open shell systems, it was observed that eigenvalue self-consistency in G strongly affects the calculated multiplet splittings (Lischner et al., 2012).
The effect of eigenvalue self-consistency in G on the selfenergy is demonstrated in Figure 25 for an evGW 0 calculation. Compared to G 0 W 0 , the structure of the self-energy is almost identical, but shifted to lower energies. The Green's function in the eigenvalue self-consistent GW 0 scheme is given by with ǫ mσ = ǫ 0 mσ + ǫ mσ (85) Inserting the GW corrections ǫ mσ into the Green's function results in a shift of the poles in the self-energy, see Equation (48). On the real axis, the poles of c s are located at ω n iσ = ǫ 0 iσ + ǫ iσ − nσ and ω n aσ = ǫ 0 aσ + ǫ aσ + nσ , where i indicates again occupied and a virtual states. Starting from a GGA functional, the correction ǫ mσ is negative for occupied and positive for virtual states. Compared to a G 0 W 0 scheme, the poles ω n iσ are now located at lower and the poles ω n aσ at higher frequencies. A simplified version of evGW 0 was originally suggested by Hedin (Hedin, 1965) and has been revisited a few times by others (Pollehn et al., 1998;Hedin, 1999;Martin et al., 2016). Instead of using an individual shift ǫ mσ for each state m, a global shift E is employed: where G σ 0 (ω − E σ ) = G σ E (ω). The QP equation (Equation (22)) then transforms into For metals, the shift E is chosen in such a way that the G 0 W 0 Fermi energy aligns with that of the starting point calculation, i.e., with the Fermi level of G 0 . For systems with a energy gap, the highest occupied state is aligned, i.e., the valence band maximum for solids or the HOMO for finite systems. The latter is motivated by "DFT Koopman's theorem, " which states that only the KS energy of the HOMO can be rigorously assigned to the ionization potential when starting from an exact DFT functional (Levy et al., 1984;Almbladh and von Barth, 1985). In that case E would be zero. The shift can be determined by demanding self-consistency for the highest occupied state Inserting Equation (89) into Equation (88) yields the explicit expression Adjusting the energy scale of G σ 0 by E translates to a rigid shift of the self-energy as shown in Figure 25. The results are very similar to evGW 0 in the frequency range where the quasiparticle solution is expected.
The E scheme is less frequently used for the calculation of quasiparticle energies than eigenvalue self-consistent schemes. However, it has been shown that it substantially improves satellite spectra (Pollehn et al., 1998). The same holds for the evGW 0 scheme, which has been employed to calculate satellite spectra of VO 2 (Gatti et al., 2015) and bulk sodium (Zhou et al., 2015).

Self-consistency via a New Ground State
Building on the idea of iterating in the quasiparticle energies, one can go one step further and also incorporate wave function changes. An elegant way to achieve this is to find the variationally best mean-field potential to a given self-energy (Godby et al., 1986(Godby et al., , 1987bCasida, 1995a;van Schilfgaarde et al., 2006;Kotani et al., 2007b). This mean-field potential can then be used to generate new eigenvalues and eigenfunctions to construct a new G 0 for the next iteration of the GW cycle.
If the new potential is local, this iteration can be formalized exactly in the optimized effective potential (OEP) framework (Casida, 1995a;Kümmel and Kronik, 2008), which is equivalent to the Sham-Schlüter equation (Godby et al., 1986(Godby et al., , 1987b. The OEP framework and the Sham-Schlüter equation only guarantee that the density generated by the local potential matches the GW density. The eigenvalue spectrum of the local potential will not be the same as the GW spectrum. For explicitly non-local potentials, no formally exact match between the GW self-energy and the potential has been found because the self-energy is non-local and frequency dependent, while the constructed potential is non-local but static. An approximate non-local potential can be found by introducing the GW Hamiltonianĥ GW (ω) =ĥ 0 + v H + GW (ω). The mean-field Hamiltonianĥ MF that best reproduces the effects of GW is defined asĥ MF =ĥ 0 + v H + v MF , see also  for the definitions of the Hamiltonians.
v MF can then be obtained by minimizing ||ĥ GW −ĥ MF || (van Schilfgaarde et al., 2006;Kotani et al., 2007b). An approximate minimization finally yields an analytic expression for the (static and Hermitian) mean-field potential (Faleev et al., 2004;van Schilfgaarde et al., 2006;Kotani et al., 2007b) where "Re" signifies here the Hermitian part of (ǫ k ) The quasiparticle energies for the Green's function G are then given by the self-consistent G 0 that follows from v MF ij (van Schilfgaarde et al., 2006). Satellites or the incoherent part of the spectral function are not captured by this approximation. This is why the scheme is commonly referred to as quasiparticle selfconsistent GW (QSGW). Reports of a starting point dependence for metal oxides (Liao and Carter, 2011;Isseroff and Carter, 2012) have not yet been reproduced by other groups with a different implementation. In general, the QSGW scheme converges to a unique solution.
An alternative definition for a non-local mean-field potential is given by the static Coulomb hole plus screened exchange (COHSEX) approximation to GW (Hedin, 1965;Hedin and Lundqvist, 1970): The screened exchange (SEX) term is defined in analogy to the exact-exchange self-energy in Equation (32) but with the statically screened Coulomb interaction instead of the bare one SEX σ (r, where φ iσ are eigenfunctions of the COHSEX mean-field Hamiltonian. The static Coulomb hole (COH) term, on the other hand, becomes local in space The statically screened Coulomb interaction W(r, r ′ , ω = 0), which enters in Equations (94) and (95), is obtained by inserting the COHSEX eigevalues and eigenfunctions in Equations (25-28) for ω = 0. Like in QSGW, v MF,COHSEX σ (r, r ′ ) produces new eigenvalues and eigenfunctions, which yield a new self-energy. The COHSEX equation can then be iterated until self-consistency is achieved.
COHSEX can also serve as an improved starting point compared to KS-DFT for a perturbative G 0 W 0 calculation. After completing a COHSEX calculation, one can use the self-consistent COHSEX eigenvalues and wave functions for a perturbative G 0 W 0 calculation with the full, dynamical W. In the case of VO 2 (Gatti et al., 2007), G 0 W 0 @LDA fails to open the band gap. On the other hand, G 0 W 0 @COHSEX opens a band gap, in agreement with experiment. The improved COHSEX starting point is especially important for materials with localized electrons (Aguilera et al., 2011). When comparing to benchmark coupled cluster data on organic molecules, the COHSEX starting point decreases the mean absolute error of G 0 W 0 compared to G 0 W 0 @PBE (Knight et al., 2016). In Ge under pressure, G 0 W 0 @COHSEX predicts a direct gap at the Ŵ point while G 0 W 0 @LDA predicts band overlap (Jain et al., 2014).
Self-consistency in GW is a topic that is still being researched. While the results from true self-consistent GW are usually in worse agreement with experiment than G 0 W 0 (Schöne and Eguiluz, 1998;Shishkin and Kresse, 2006b;Grumet et al., 2018), studies of self-consistency are still necessary to advance the field. We can learn about shortcomings of the theory or assess challenging materials. Most importantly, self-consistent GW implementations are a necessary foundation to go beyond GW in the future, as discussed in section 11.

SOLIDS
Solids were the first testbed of GW theory in real materials. The basic quantity to compute in solids is the band structure. Unlike in molecules with single-particle states given by molecular orbitals, single particles in solids occupy Bloch waves, defined in Equation (50) and indexed by their wave vector k. Diagonalizing the Hamiltonian at each k gives its own set of single-particle eigenvalues. One can conveniently visualize the eigenvalues at different k-points by varying k continuously along some path, placed on the x-axis, and plotting the eigenvalues on the yaxis. Eigenvalues change continuously with k, forming separate bands of states. The collection of single-particle states in bands is similar to the grouping of different combinations of bonding and anti-bonding states in molecules or polymers. Quasiparticle Hamiltonians are also k-dependent, and energies at all k form a band structure which can be compared to angle resolved PES and IPES spectra as the incident momentum is varied (see Figure 2).
Computing quasiparticle band structures with GW gives the quasiparticle band gap, analogous to the HOMO-LUMO gap in molecules. The first G 0 W 0 calculations for real materials (Strinati et al., 1980Louie, 1985, 1986) focused on semiconductors (Si, Ge) and insulators (diamond, LiCl). G 0 W 0 calculations in semiconductors and insulators give a uniform improvement in band gap over estimates with either Hartree-Fock or Kohn-Sham eigenvalues, as shown in Figure 26A. This improvement was the first major success of the GW theory.
The success of GW applied to semiconductors continued with other studies (Godby et al., 1986(Godby et al., , 1987a(Godby et al., ,b, 1988Blase et al., 1995). Many common semiconductors lack semicore states and are well described by pseudopotentials (see section 4.4.4). These factors reduce the computational complexity for GW calculations and made early, realistic GW calculations of semiconductors feasible. Screening in simple semiconductors can also be approximated by model dielectric functions like plasmon-pole models (see section 4.3.1), eliminating the need for a numerical evaluation of the self-energy integral.
It was later realized that semicore d-electrons in semiconductors such as GaN, ZnO, ZnS, ZnSe, or CdS (Rohlfing  , 1995b, 1997a, 1998) and metals such as Cu (Marini et al., 2002) and Au (Rangel et al., 2012) have a strong influence on GW calculations. Due to the strong overlap of the atomic d functions with the atomic s and p functions in the same shell, the exchange self-energy is very sensitive to the inclusion of semicore states. If only the semicore d states are explicitly included as valence state in the G 0 W 0 calculation, while the s and p states in the same shell are frozen in the core of a pseudopotential, the subsequent G 0 W 0 calculation will produce an incorrect band gap (Rohlfing et al., 1995b). This problem can be solved by explicitly including the entire shell as valence in the G 0 W 0 calculation (Rohlfing et al., 1995b;Luo et al., 2002;Tiago et al., 2003a;Fleszar and Hanke, 2005), by using exact-exchange pseudopotentials and exact-exchange starting points (Rinke et al., 2005(Rinke et al., , 2008aQteish et al., 2006) or by all-electron calculations (Friedrich et al., 2006(Friedrich et al., , 2010Shishkin and Kresse, 2006a;Gulans et al., 2014, Jiang and. Figure 26A shows the quasiparticle band gap computed with G 0 W 0 and evGW 0 with a modern all-electron LAPW code (Jiang and Blaha, 2016) for several different semiconductors and insulators. Perfect agreement between theory and experiment would place all data points on the dashed blue line. Generally, Kohn-Sham eigenvalues based on a multiplicative (local or semilocal) exchange-correlation potential (here PBE) underestimate the band gap and Hartree-Fock eigenvalues overestimate (not shown in Figure 26). Eigenvalue self-consistency (evGW 0 ) improves the agreement with experiment even further than G 0 W 0 , when starting from a local or semi-local DFT calculation. Figure 26B compares band gaps computed with different selfconsistency schemes (Grumet et al., 2018) for a different set of semiconductors and insulators than in panel (A). G 0 W 0 @PBE again provides good agreement with experiment (i.e., the red squares are close to the diagonal). Van Schilfgaarde's QSGW scheme (van Schilfgaarde et al., 2006) and fully self-consistent GW calculations consistently overestimate band gaps.

Band Gaps
With the predictive accuracy of G 0 W 0 band gaps validated, we provide a few examples in which GW calculations helped to resolve band gap controversies. One case is InN. In the early 2000s, alloys of GaN and InN were revolutionizing light-emitting diode (LED) technology. However, the band gap of InN was believed to be almost 2 eV (Butcher and Tansley, 2005), which would have severely limited the usefulness of InGaN alloys to tune the emission of LEDs. Through G 0 W 0 calculations and more refined experiments, the real value of the InN band gap was found to be 0.7 eV (Bechstedt and Furthmüller, 2002;Furthmüller et al., 2005;Rinke et al., 2006), paving the way for the LEDs we know today. Another example is hybrid perovskites that have triggered a new boom in the emergent photovoltaic materials field. The prototypical material is methylammonium lead triiodide (CH 3 NH 3 PbI 3 or in short MAPI). Unusually, local or semi-local DFT calculations already predict a band gap in good agreement with measurements, which had caused initial confusion in the field. However, when spin-orbit effects, which are particularly strong in this materials class, are incorporated in the DFT calculations, the band gap becomes significantly underestimated again. G 0 W 0 and QSGW calculations that include spin-orbit effects then predict the correct band gap (Brivio et al., 2014;Umari et al., 2014).
Finally, we consider high pressure physics. At high pressures (∼ 100 GPa), many materials experience band gap closure and transition from an insulator to a metal. There can also be many competing structural phases, each with their own metallization pressure, that are difficult to disentangle in experiments. GW is an excellent tool to theoretically predict the metallization pressure for different structural phases and help interpret experimental results (Khairallah and Militzer, 2008;Tse et al., 2008;Ramzan et al., 2010;Jin et al., 2016;Yang, 2017). Solid hydrogen is a noteworthy example of metallization at high pressure, first predicted in 1935 (Wigner and Huntington, 1935). The metallic hydrogen puzzle is an exceptionally difficult one that is still not fully understood, but GW calculations help corroborate experimental measurements and support the existence of certain structural phases (Lebègue et al., 2012;Dvorak et al., 2014;McMinis et al., 2015).

Band Structures and Band Parameters
With GW, one can compute much more than only the band gap of a solid. A typical band structure computed with GW is shown in Figure 27 for ZnO. To visualize the results, k is allowed to vary on a linear path through the Brillouin zone. The quasiparticle energy as a function of k is also called the dispersion for the system. The G 0 W 0 band structure for ZnO (red lines in Figure 27) is superimposed on the experimental photoemission results shown already in Figure 2. Experiment and G 0 W 0 agree very well both in terms of band positions as well as band curvatures.
Another example of a G 0 W 0 band structure is shown in Figure 28 for K 2 Sn 3 O 7 , a new prospective ion conductor or transparent conductor (McAuliffe et al., 2017). The unoccupied states in the PBE band structure have been shifted up for the purposes of plotting so that the bottom of the conduction bands coincides in PBE and G 0 W 0 @PBE. This removes the PBE band gap problem from the comparison and makes it easier to spot differences in band curvatures. For the valence bands the PBE and G 0 W 0 @PBE band structures agree remarkably well for this material. Toward lower energies the deviations between the band structures become larger with G 0 W 0 @PBE generally giving lower band energies than PBE. This downward shift leads to a band width widening in G 0 W 0 compared to PBE. For the conduction bands, the difference between PBE and G 0 W 0 @PBE is more pronounced. The band curvatures in G 0 W 0 @PBE are much steeper than in PBE, which subsequently leads to a significant underestimation of the PBE bands around the X, S, U, and R points in the Brillouin zone. K 2 Sn 3 O 7 is another example of a  material whose band gap and band structure were not known. The G 0 W 0 @PBE band gap amounts to 3.15 eV (McAuliffe et al., 2017), which now provides a reference value for this new material.
From the band structure, one can access the band gap, band widths, and estimate effective masses. If one models the dispersion at the band edges as parabolic, as is the case for a free particle, one can extract an effective mass from the band structure. The effective mass, labeled m * , is so that the band edge dispersion is to mimic a free particle. In a real crystal, the effective mass is a tensor, not just a scalar. The effective mass model is closely related to the quasiparticle concept, and the renormalization factor Z s (Equation (35)) is one factor contributing to m * . The quasiparticle effective mass is usually heavier than the free electron mass because of the drag induced by the surrounding electrons. Static mean-field theories like Kohn-Sham DFT also give an estimate of effective mass from their band structure. The GW band structure is typically more "curved" or concave than the Kohn-Sham structure, as shown in Figure 28, which means that GW quasiparticles are "lighter" than the KS particles. In silicon and methylammonium lead iodide, the GW level of theory is necessary to predict effective masses in good agreement with experiment (Filip et al., 2015;Poncé et al., 2018). One can also compute the single-particle density of states (DOS), which in solids is analogous to taking horizontal slices through the band structure. This gives the total number of available states at the energy of that slice. In Green's function theory, the concept of the single-particle DOS is replaced by the spectral function. The spectral function is k-dependent, so that the total effective DOS is obtained by adding up the spectral functions at all k. However, computing the spectral function requires the solution of Dyson's equation, which is often not practical computationally. Instead, a G 0 W 0 quasiparticle DOS is computed by summing up artifically broadened Gaussian peaks centered around each G 0 W 0 energy. The right panel of Figure 28 shows the G 0 W 0 @PBE DOS for K 2 Sn 3 O 7 (McAuliffe et al., 2017). In addition, this DOS is projected on the atomic angular momentum channels s, p, and d. Such information is usually extracted from DFT calculations and illustrates that both the valence band and the conduction band of K 2 Sn 3 O 7 is largely made up of p states.
Band parameters like effective masses are important characteristics of semiconductors and are key parameters for the semiconductor industry. G 0 W 0 effective masses are more accurate than those computed with DFT. Effective masses are either extracted directly from the GW band structure by fitting Equation (96) to a fine band structure path (Schleife et al., 2009) or by fitting an effective k · p Hamiltonian to GW quasiparticle energies (Rinke et al., 2008b). In this way, important band parameters have been computed for silicon and silicon under strain (Bouhassoune and Schindlmayr, 2010;Poncé et al., 2018), GaAs (Cheiwchanchamnangij and Lambrecht, 2011), AlN, GaN, and InN (Rinke et al., 2006, 2008bSvane et al., 2010;Yan et al., 2011), MgO, ZnO, andCdO (Schleife et al., 2009;Yan et al., 2012) and more recently for perovskites and hybrid perovskites (Filip et al., 2015). Such band parameters can then be used directly in device simulations to model, for example, charge carrier flows (Kivisaari et al., 2017). If one is interested in charge carrier mobilities and charge carrier densities, scattering due to phonons and impurities has to be taken into account (Kioupakis et al., 2010;Poncé et al., 2018).
GW can further be used as one of the final steps in highthroughput screening studies for new materials. In a search for transparent p-type conductors, G 0 W 0 calculations provided accurate band gaps and effective masses that screened out the final candidates (Hautier et al., 2013).

Lifetimes
Unlike mean-field theories, GW also allows one to compute the lifetimes of states from first principles. The lifetime of the quasiparticle is the characteristic time over which the added particle decays into surrounding degrees of freedom. States which are "closer" to exact eigenstates of the system have longer lifetime. The lifetime of a quasiparticle with corresponding energy ǫ s is directly related to the non-Hermiticity of the self-energy and the magnitude of its imaginary part, τ can be inferred from experimental spectra by its relation to the quasiparticle peak width, Ŵ, as τ −1 = Ŵ/2 (not to be confused with the vertex function Ŵ).
From simple arguments in Fermi liquid theory, lifetimes decrease as particle energy increases because the available phase space for scattering at a fixed energy grows with increasing energy. Studies of quasiparticle lifetimes in the GW approximation for metals (Cu, Ag, Au) show good agreement with experiment at high energies (Keyling et al., 2000;Bacelar et al., 2002;Marini et al., 2002;Yi et al., 2010). At low energies, however, the agreement is not perfect. For example, GW calculations cannot account for the sudden increase in experimental lifetimes of electrons in Cu at energies below 2 eV (Keyling et al., 2000;Yi et al., 2010). These failures are attributed to the localized, short-range interactions of d-electrons in metals that are not described well by GW.

More Challenging Solids
As computational power increased and the success of GW became more widely known, studies were extended to more challenging materials like oxides, or d-and f -electron compounds. These materials are both a theoretical challenge for the GW approximation and are numerically more difficult to compute. Broadly speaking, these materials suffer from a severe mean-field starting point problem and/or contain localized electrons which are not well described by GW. Accordingly, studies of these materials required advances in the treatment of core electrons and the evaluation of the self-energy. The first studies of metals focused on the alkali metals Na and Al (Northrup et al., 1987;Surh et al., 1988). Metals served as a valuable test on the effects of self-consistency and vertex corrections (Mahan and Sernelius, 1989;Shirley, 1996). Eventually, studies moved into oxides and materials with d-electrons (Aryasetiawan, 1992;Aryasetiawan and Gunnarsson, 1995;Massidda et al., 1995. Already in the early nineties of the previous century Aryasetiawan tackled ferromagnetic nickel (Ni) with G 0 W 0 (Aryasetiawan, 1992). He found the quasiparticle band structure and the valence bandwidth to be in good agreement with angleresolved photoemission data. However, the exchange splittings are not well reproduced by G 0 W 0 and a satellite at 6 eV is missing. Later calculations for gadolinium (Gd) revealed similar observations Aryasetiawan, 1997). For Gd, satellites were seen in the G 0 W 0 spectrum, but their spectral weight does not match experiment.
The previous millennium concluded with early explorations into transition metal oxides such as nickel oxide (NiO) and manganese oxide (MnO) (Massidda et al., 1995(Massidda et al., , 1997. They, as well as iron and cobalt oxide (FeO and CoO, respectively), were then revisited with GW in the 2000s (Li et al., 2005;Kobayashi et al., 2008;Rödl et al., 2008;Rödl et al., 2009;Jiang et al., 2010b). These oxides present a challenge to G 0 W 0 calculations because local and semilocal DFT starting points produce metallic states that then cannot be corrected into semiconductors by G 0 W 0 . Instead, DFT+U and hybrid functionals were explored as alternative starting points Jiang et al., 2010b). The resulting G 0 W 0 DOSs are in good agreement with direct and inverse photoemission measurements for the low temperature magnetically ordered phases. However, the GW method cannot describe the DOS in the paramagnetic phase nor the transition to the paramagnetic phase.
The situation is similar in the less correlated copper oxide (Cu 2 O) (Bruneval et al., 2006). G 0 W 0 @LDA again fails to give a proper account of the band structure, while QSGW provides good agreement with ARPES measurements. CuO poses more of a problem, as no starting point or self-consistency scheme produces a satisfying band gap or density of states (Rödl et al., 2015(Rödl et al., , 2017. The early 2000s saw other oxides gain rapid interest, as the semiconductor industry sought a replacement for silicon dioxide (SiO 2 ) in silicon-based microelectronic technology. To prevent gate leakage in ever-shrinking transistors, gate materials with a higher dielectric constant (k) than SiO 2 were required. Eventually hafnium dioxide won the race. During the development period, the electronic structure, in particular the band gap and the band offsets of so called high-k materials were of enormous interest (Shaltaf et al., 2008;Grüning et al., 2010;Jiang et al., 2010a;Sklénard et al., 2018). G 0 W 0 calculations of the closely related compounds zirconium oxide (ZrO 2 ) and hafnium oxide (HfO 2 ) were performed (Grüning et al., 2010;Jiang et al., 2010a). Plane-wave and FLAPW G 0 W 0 agree very well with each other for these materials. The all-electron calculations investigated the effect of the Hf f -electrons and found that they do not change the self-energy corrections in these materials (Jiang et al., 2010a). The final band gap of monoclinic HfO 2 , however, is still under debate. It was initially believed to lie around 5.8 eV and is now thought to be in excess of 6.3 eV (Sklénard et al., 2018). What remains a challenge in strongly polarizable materials such as high-k dielectrics, and could thus potentially explain remaining discrepancies between GW and experiment, is how to include ionic screening (i.e., screening due to nuclear motion) consistently in the dielectric function of a GW calculation.
The list of interesting metal oxides and metallic, semiconducting, or insulating solids is long and the number of GW calculations is steadily growing. Recent flagship applications even include defects, surface effects and solvents in their comparison to experiment (Gerosa et al., 2018b). For a recent review on the performance of different GW variants to metal oxides, we refer to (Bruneval and Gatti, 2014;Gerosa et al., 2018a).
While the f -electrons are relatively inert in HfO 2 , they assume a much more prominent role in lanthanide and actinide metals and oxides. With the exception of early explorations into Gd, GW calculations for f -electron compounds have only emerged fairly recently (Chantis et al., 2007;Jiang et al., 2009Jiang et al., , 2012Richter et al., 2011;Kutepov et al., 2012;Sakuma et al., 2012;Jiang, 2018). These calculations are almost always performed with DFT+U starting points or some form of self-consistency, as local or semi-local DFT provides a poor description of the electronic structure.
QSGW calculations for the rare-earth metals Gd and Er and the rare-earth monopnictides GdN, EuN, YbN, GdAs, and ErAs place the occupied 4f states in agreement with photoemission measurements, but then overestimate the position of the unoccupied f states (Chantis et al., 2007). Also, upon closer inspection, multiplet splittings are not reproduced with GW and require a beyond GW treatment (Richter et al., 2011). For the lanthanide sesquioxide (Ln 2 O 3 ) series, G 0 W 0 @LDA+U calculations reproduce the relative positions of the occupied and unoccupied lanthanide f states across the series and confirm the experimental conjecture derived from phenomenological arguments (Jiang et al., 2009;Jiang, 2018).
Cerium (Ce) is another paradigmatic material. With only one f electron per Ce atom, it should still be relatively easy to describe, but Ce turns out to be an intricate material full of surprises. The phase diagram exhibits an unusual iso-structural phase transition. Both the α and the γ phase have an fcc crystal structure, but the α phase has a smaller equilibrium volume (Amadon et al., 2006;Bieder and Amadon, 2014;Devaux et al., 2015). The different localization of the f -electrons in the two phases is believed to be the driving force for the phase transition (Casadei et al., 2012(Casadei et al., , 2016Devaux et al., 2015). Ce is traditionally thought to be a strongly correlated material that belongs to the realm of dynamical mean-field theory (DMFT), see section 11 for further details. However, the α and γ phases are already captured by the randomphase approximation (see section 10 for details). The G 0 W 0 spectral function of the α phase is in good agreement with photo and inverse photoemission spectra (Sakuma et al., 2012). However, in the more correlated γ phase, G 0 W 0 produces a peak at the Fermi level that is absent in the experimental spectra (Sakuma et al., 2012).
In conclusion of this section, we would like to reiterate that materials with d-or f -electrons remain one of the most challenging applications of GW. As a matter of principle, GW cannot yield an insulator with an odd number of electrons per unit cell. Such Mott insulators (Mott, 1968) are a manifestation of strong electronic correlation. Modern approaches to describing such strongly-correlated, localized states often combine GW with either a phenomenological or first-principles treatment of d-or f -electron correlation, a topic we discuss further in section 11.

Defects in Solids
So far, we have primarily discussed the performance of GW for computing the band gap in solids, which does not depend on the absolute values of the band edges. However, the locations of the valence band maximum (VBM) and conduction band minimum (CBM) are essential for understanding defect level FIGURE 29 | For systems with mid-gap defect levels, computing the band gap alone is not enough to test the material for potential applications. The position of the defect level can also be computed with GW. In these cases, the absolute position of the VBM, CBM, and defect level are important. alignment in solids. The conceptual problem is illustrated in Figure 29. Assume an initial LDA calculation and then a G 0 W 0 calculation of the band structure for a system with a defect level in the gap. We assume that the G 0 W 0 band gap is in good agreement with experiment, but what about the defect level? The position of the defect level relative to the band edges and Fermi energy is critical for determining its occupancy when the system is put in contact with an electron reservoir. The band gap alone is no longer enough to assess the accuracy of the calculation. Similar to the defect problem, band alignment at semiconductor heterojunctions depends on the absolute position of the levels, a problem which is discussed more in section 7.
The accuracy of GW for defect levels is still under investigation (Hedström et al., 2002(Hedström et al., , 2006Weber et al., 2007;Ma and Rohlfing, 2008;Bruneval, 2009;Rinke et al., 2009;Bockstedte et al., 2010;Ma et al., 2010a) and its comparison to hybrid functionals is summarized in the review of Chen and Pasquarello (2015b). With a suitable choice of reference values to align the calculation with experiment, the accuracy of G 0 W 0 is similar to that of hybrid functionals for predicting defect energy levels (Chen and Pasquarello, 2015b). Their major difference in performance can be attributed to their shift in the VBM, which has a direct effect on the defect level alignment and the calculated ionization potential. Hybrid functionals tend to symmetrically shift the VBM and CBM, while G 0 W 0 mostly shifts the VBM down in energy which can worsen agreement with experiment for ionization potentials (Chen and Pasquarello, 2015b).

Outlook on Solids
As large-scale GW implementations became more common and parallelism increased, GW calculations became an indispensable tool for ab-initio predictions in solids. Today, there are too many GW calculations for solids to count. Even so, comparing GW calculations to experiment must be done with some care because there are additional effects in the experiment that are not included in ordinary GW. For example, electronphonon coupling can have a significant effect on the band gap in real materials (Giustino et al., 2010b;Cannuccia and Marini, 2011;Botti and Marques, 2013;Antonius et al., 2014;Kawai et al., 2014). The effect of the electron-phonon interaction can also be described by a self-energy and calculated with perturbation theory (Cederbaum and Domcke, 1974;Smondyrev, 1986). Experimental spectroscopies are also surface sensitive techniques, as mentioned briefly in section 2.3, which means that the measured band structure may not correspond perfectly to the bulk states. These considerations aside, the impressive success of GW in solids encouraged studies of other systems, including surfaces and molecules.

SURFACES
The application of GW to surfaces and interfaces is not as common because these systems tend to have large unit cells with a number of atoms beyond the tractability of many GW codes. However, what makes surfaces particularly interesting from the GW perspective is a long-range polarization effect contained in the screened Coulomb interaction that is absent for bulk materials: the image effect. As illustrated schematically in Figure 30, an additional charge (hole created in the photoemission process or added electron in inverse photoemission) outside a surface induces an image charge in the surface (Deisz et al., 1993). This charge gives rise to an additional potential, the image potential, that renormalizes the energy of the electron or hole state. For metallic and dielectric surfaces it is easy to show from simple electrostatic considerations that the image potential should decay with the inverse distance from the surface. For other geometries, e.g., quantum dots or nanostructures, this decay behavior is modified .
That the GW self-energy encompasses the image effect was first shown by extracting the image potential from the GW self-energy for the Al(111) surface (White et al., 1997). Later, image resonances (Fratesi et al., 2003) and image states for semiconductors, insulators Kutschera et al., 2007) and nanoclusters  were calculated with GW. Freysoldt et al. (2009) showed that the image potential can also be probed by monitoring the excitation energies of a test molecule (see Figure 30). The test molecule (CO) can be moved along the image potential by introducing insulating spacer layers between molecule and surface. The energy of the CO states gets renormalized stronger the closer it is to the surface, i.e., the smaller the spacer layer is. Freysoldt et al. (2009) also showed that the energy of semi-core states in different NaCl layers is affected by the image potential in the same way, a result that was later corroborated by Strange and Thygesen (2012) in a model study.
The most significant effect of the image potential is that it renormalizes the energy of adsorbates such as organic molecules (Freysoldt et al., 2009;García-Lastra et al., 2009;Thygesen and Rubio, 2009;Puschnig et al., 2012). The energetic position of molecular states near or on the surface is different from the molecule in the gas phase. During the excitation process, an electron or hole is added at the molecule. The additional correlation energy due to the polarization of the surface further stabilizes the added charge. As result, occupied states move up in energy and unoccupied states down and the HOMO-LUMO gaps reduce consequently in size, see Figure 30C. The renormalization depends on the dielectric constant of the surface. The larger the dielectric constant, the larger the renormalization. Already for surfaces of insulators the HOMO-LUMO gap renormalization is of the order of 1 eV and can reach more than 3 eV for metallic surfaces (García-Lastra et al., 2009;Thygesen and Rubio, 2009).
Apart from the HOMO-LUMO gap, the position of adsorbate states relative to the substrate's Fermi level or relative to the band edges is of significant interest in surface and interface science. This relative positioning of adsorbate to substrate states is commonly referred to as level alignment. GW calculations are currently considered to be the holy grail for an accurate determination of the level alignment. However, due to the aforementioned computational reasons (i.e., very large supercells) most GW level alignment calculations reported in the literature are not converged. Careful GW cluster calculations (Patrick and Giustino, 2012;Wippermann et al., 2014;Govoni and Galli, 2015) and very large scale GW calculations report good agreement with experiment. For physisorbed molecules, FIGURE 30 | Illustration of the image effect. (B) Shows the image charge and image potential induced by an additional electron (e.g., anionic charge on a molecule) outside a surface. (A) Provides a graphic illustration how the image potential of a germanium (Ge) surface could be probed with a carbon monoxide (CO) test molecule. By adding thicker and thicker sodium chloride (NaCl) layers between CO and Ge, the CO molecule moves along the Ge image potential. The resulting CO gap will then depend on the NaCl layer thickness, which is indeed the case as (C) illustrates. Subfigure (A) adapted from Freysoldt et al. (2009) under the terms of the Creative Commons Attribution 3.0 License. Data for (C) obtained from Freysoldt et al. (2009). whose electronic states do not couple strongly to the substrate, the GW self-energy can be split into a surface and a molecular contribution. Such a simplified GW polarization model has been developed by Neaton et al. (2006) for weakly interacting molecules at metallic surfaces. This model has been used to compute GW estimates for the level alignment of aminegold junctions and interfaces (Quek et al., 2007;Tamblyn et al., 2011) as well as π-conjugated polymers at Au(111) (Giovannantonio et al., 2018).
GW calculations for surfaces and interfaces are not only challenging because of the large supercells. An additional complication is the vacuum spacing in the common repeated slab model. In GW calculations that apply periodic boundary conditions, the surface is modeled as a slab of material that is periodic in two dimensions and finite in the third. The rest of the supercell in the direction perpendicular to the surface is filled with vacuum. Since the periodic boundary conditions apply also in the dimension perpendicular to the surface, the final system is a heterostructure of repeated blocks of material and vacuum (see Figure 31). GW now couples these repeated slab images because the GW interaction is longranged. The image potential decays only with the inverse distance between the slabs (see image effect discussion above) and not exponentially fast, as local or semi-local DFT functionals do. As a result, image potential tails generated by the repeated slab images reach into the surface region we would like to model with the slab model (see Figure 31) and obscure the actual image potential. FIGURE 31 | The image potential of a repeated slab system (B) differs from that of an isolated surface (A). The dashed lines in (C) mark the difference that can be computed with a suitable correction scheme (Freysoldt et al., 2008). As the charge moves across the interface, the ratio of dielectric constants for the "charged" and "uncharged" regions changes. As a result, the image potential changes sign.
In a GW calculation, the image potential is always present, even if we are not explicitly interested in image states. Due to the long range of the interaction, the vacuum spacing cannot be converged out in any GW implementation that has to place basis functions in the vacuum region (as for example plane waves) (Freysoldt et al., 2008;Hüser et al., 2013a). Two prevalent solutions to this problem have emerged: (1) to cut the interaction range and use an effective short-range interaction or (2) to apply post-processing corrections. The easiest way to limit the range is to impose a spherical cutoff on the Coulomb interaction every time it is used in the GW equations (Onida et al., 1995a;Spataru et al., 2004b;Ismail-Beigi, 2006;Rozzi et al., 2006). The largest disadvantage of this approach is that the spherical cutoff also limits the range of the GW interaction inside the material and in the two directions parallel to the surface. The cutoff radius should therefore at least be as large as the slab is thick. This implies that the vacuum separation should at least be equal to the slab thickness, which increases the computation time again for thicker slabs.
A computationally more efficient way is to apply postprocessing corrections to a normal GW calculation that does not modify the range of the Coulomb interaction (Freysoldt et al., 2008). Care has to be taken, however, that the GW implementation correctly includes the dielectric tensor (Freysoldt et al., 2007). Otherwise, the GW calculation will not converge with respect to k-points (Freysoldt et al., 2007;Hüser et al., 2013a). Such a post-processing correction has been derived from an electrostatic model (Freysoldt et al., 2008) and is depicted in Figure 31. The true image potential is shown for two scenarios in Figure 31A: for a charge located outside or inside the slab. As the charge moves from outside the slab to inside, the image potential changes sign, as now the dielectric constant in the region where the charge resides (i.e. in the slab) is larger than where the charge is not (i.e. in the vacuum). Figure 31B shows the image potential for a periodic array of slabs in the repeated slab approach. It is notably different from the image potential of a single slab in Figure 31A. The correction derived by Freysoldt et al. (2008) is shown as black dashed lines in Figure 31C and restores the correct behavior for a single slab. The corrections can be several tenth of eV large and yield converged results already for small vacuum thicknesses (Freysoldt et al., 2008).
At surfaces, the DFT wave functions are sometimes poor approximations of certain surface states and image states. In such cases, it is desirable to calculate quasiparticle wave functions. This can be done by solving the full quasiparticle equation (Equation (21)) in a suitable basis. If this solution is performed iteratively in energy, new quasiparticle states, such as image states, can be found that are absent from the DFT spectrum. Examples where the quasiparticle wave functions differ notably from the LDA or PBE wave functions are GaAs(110) (Pulci et al., 1999) and the C(111) surface (Marsili et al., 2005) as well as image states (White et al., 1997;Rohlfing et al., 2003;Kutschera et al., 2007).
Early GW calculations for surfaces focused on surface states of simple semiconductors such as silicon (Rohlfing et al., 1995a(Rohlfing et al., , 1997bRohlfing and Louie, 1999;Hahn et al., 2001;Weinelt et al., 2004), germanium (Rohlfing et al., 1996), silicon carbide (Rohlfing, 2001), gallium phosphite (Schmidt et al., 1999), indium phosphite (Schmidt et al., 2000;Hedström et al., 2006) and insulators such as diamond (Marsili et al., 2005), lithium fluoride  and sodium chloride (Freysoldt et al., 2009). Frequently, the GW quasiparticle energies are taken as input for optical absorption or reflectance anisotropy spectroscopy (RAS) studies (Pulci et al., 1998;Schmidt et al., 2000;Hahn et al., 2001). The surface band structure and dispersion of surface states is in good agreement with available photoemission studies. Also, computed optical and RAS spectra agree well with experimental spectra for these systems. Later calculations for more complex surfaces or surface adsorbates have to be taken with a grain of salt, since they may not be fully converged with respect to all computational parameters, unless plasmon pole models, other model dielectric functions, or cluster models were used (Giorgi et al., 2011;Patrick and Giustino, 2012;Alves-Santos et al., 2014).

TWO-DIMENSIONAL MATERIALS
Research in two-dimensional materials developed rapidly after the isolation of graphene in 2004 (Novoselov et al., 2004). The crystal structure and Brillouin zone of graphene are shown in Figure 32. Two-dimensional materials have gained great fame for their interesting electronic structures, which include phenomena like Dirac fermions and topological insulators (Geim FIGURE 32 | (A) Graphene has two hexagonal sublattices (A and B) in its honeycomb structure with translation vectors a 1 and a 2 . (B) The Brillouin zone is hexagonal with two symmetry inequivalent corners labeled K and K ′ . (C) Near the Dirac points at K and K ′ , the dispersion is linear. The band structure is computed at the PBE level and taken from the Computational 2D Materials Database (Haastrup et al., 2018) with Fermi energy set to zero. and Novoselov, 2007;Neto et al., 2009;Bhimanapati et al., 2015). Models of these effects are largely in the single particle − or single quasiparticle − picture. GW serves an important purpose to parameterize such models from a fully ab-initio perspective.
Two-dimensional materials often exist at the size and interaction strength that is ideally suited for GW. They are too large (the required Brillouin zone sampling is too dense) for more expensive wave function or beyond-GW Green's function methods, but their correlation is usually weak enough that GW gives a good description of their electronic structure. Similar to GW calculations on surfaces (section 7) or molecules (section 9), two-dimensional materials can show enhanced interaction effects from reduced dimensionality and decreased screening compared to bulk solids. Technical aspects of GW calculations of twodimensional materials include the truncation of the screened Coulomb interaction between layers (similar to surfaces) and slow convergence with respect to k-points (Qiu et al., 2016;Rasmussen et al., 2016;Thygesen, 2017).
The band structure of graphene (and many other twodimensional materials) is characterized by a zero band gap and linear dispersion near the Fermi energy, E(q) ± ≈ ±v F |q| where the + (−) sign refers to electrons (holes) and q is the wave vector relative to the K or K ′ points of the Brillouin zone, see Figure 32. v F is called the Fermi velocity and is the slope of the dispersion at the band edges. This linear dispersion is strikingly different than the parabolic dispersion in Equation (97), which is the case for most materials. Not long after its discovery, GW was applied to graphene to calculate the band structure and v F from first principles (Trevisanutto et al., 2008;Park et al., 2009;Siegel et al., 2011). Compared with calculations based on the local density approximation, GW preserves band closure at the Fermi energy and increases the Fermi velocity by ∼ 17% to give a value of 1.1 × 10 6 m/s which is in good agreement with experiment. These studies also found kinks which appear in the low energy band structure from electronphonon coupling and doping level dependent kinks of purely electronic origin.
Replacing carbon with a different group IV element creates a family of graphene-like materials. By preserving the honeycomb lattice of graphene, the materials still host Dirac fermions, but their chemistry and Fermi velocities depend on the specific element. For example, GW calculations of the Fermi velocity of planar silicon, called silicene, give a value of ∼ 7.7 × 10 5 m/s . Because of silicon's tendency for sp 3 hybridization, silicene also has a buckled structure which preserves linear dispersion at the GW level of theory (Wei et al., 2013). As with graphene, the electronic structures of silicene and germanene (monolayer Ge) subject to hydrogenation, strain, and hybridization with other materials have been studied with GW (Wei and Jacob, 2013b;Drissi and Ramadan, 2015a,b;Wang et al., 2015;Yan et al., 2015;Wang and Wu, 2017).
As one goes down the group IV elements, they become heavier; this has great significance for spin-orbit coupling (SOC) in two-dimensional materials. A SOC induced band gap in twodimensional materials is critical to the topological character of their electronic structure. Stanene is a group IV monolayer (Sn) that has a sizable band gap due to SOC (Lü et al., 2012;Lu et al., 2017).
A number of functionalizations or structural modifications to graphene have been proposed for modifying its electronic structure. Much of the research in functionalized graphene is directed toward achieving semiconducting graphene or, more generally, two-dimensional semiconductors. As interesting as Dirac fermions are, semiconducting layers are necessary to build many layered electronic devices like field-effect transistors. For example, passivating graphene with hydrogen transforms it from a sheet of sp 2 bonded carbon to sp 3 . The passivated structure, called graphane, has a GW band gap of ∼ 5 eV (Lebègue et al., 2009;Leenaerts et al., 2010;Karlický and Otyepka, 2013;Hadipour and Jafari, 2015) and could be useful as a twodimensional semiconductor. Other passivated graphenes also open a band gap (Klintenberg et al., 2010;Wei and Jacob, 2013a). One can also apply strain, poke holes, or form other planar carbon allotropes by rearranging carbon bonds, many of which open an appreciable band gap (∼ 1 eV) in graphene at the GW level (Appelhans et al., 2010a,b;Liang et al., 2012;Nisar et al., 2012;Dvorak and Wu, 2015).
To fill the need for two-dimensional semiconductors, one can move away from graphene and consider materials that are intrinsically semiconducting. Elements from the third and fifth groups of the periodic table (III-V compounds) often form a semiconducting monolayer, see Figure 33. This is largely because of the A-B sublattice imbalance in these materials, which opens a band gap at the tight-binding level of theory (Wallace, 1947). A monolayer of hexagonal boron nitride (hBN) is one possibility, with a GW band gap of ∼ 7.5 eV (Wirtz et al., 2005;Berseneva et al., 2013). Because graphene and hBN have similar lattice constants, they can be layered or hybridized easily, which gives additional tunability of the electronic properties (Bernardi et al., 2012). GaAs is another example, with a GW band gap of ∼ 3 eV (Fakhrabad et al., 2014). Other III-V monolayers are also stable and have been studied with GW (Şahin et al., 2009;Wang and Shi, 2010;Fakhrabad et al., 2015;Prete et al., 2017). Phosphorene is a somewhat unusual case, as a monatomic group V material. This is reflected in its unusual structure, which has armchair-like vertical buckling, shown in Figure 33C. Phosphorene is attractive because it has a smaller band gap than many other two-dimensional semiconductors, computed to be ∼ 2 eV with GW (Tran et al., 2014;Rudenko et al., 2015;Li et al., 2016b;Steinkasserer et al., 2016;Ferreira and Ribeiro, 2017), which is well-suited for applications.
Finally, we get to the transition metal dichalcogenides (TMDs). TMDs have the chemical formula MX 2 where M is a transition metal and X is a chalcogen, commonly S, Se, or Te. In their stable two-dimensional phase, TMDs usually form a three-layered structure with the transition metal atoms in a central layer between the chalcogens (called the 2H phase). MoS 2 , MoSe 2 , WS 2 , and WSe 2 are all semiconductors with G 0 W 0 @LDA band gaps from 2.0−2.5 eV when including SOC (Rasmussen and Thygesen, 2015). The band structure of MoS 2 is shown in Figure 34. TMDs feature unusual electronic structures derived from strong SOC and lack of inversion symmetry (see Manzeli et al., 2017). GW calculations at either the perturbative or partially self-consistent levels improve the agreement with experiment for fundamental band gaps (Cheiwchanchamnangij and Lambrecht, 2012;Komsa and Krasheninnikov, 2012;Ramasubramaniam, 2012;Espejo et al., 2013;Hüser et al., 2013a;Molina-Sánchez et al., 2013;Shi et al., 2013;Debbichi et al., 2014;Ugeda et al., 2014;Qiu et al., 2016;Robert et al., 2016;Lee FIGURE 34 | Band structure of MoS 2 in the 2H phase at the PBE (black) and G 0 W 0 (red) levels. The G 0 W 0 bands include spin-orbit coupling but the PBE bands do not. The Fermi energies for each case are indicated by horizontal dotted lines. Data taken from the Computational 2D Materials Database (Haastrup et al., 2018(Haastrup et al., ). et al., 2017. However, conclusions from different GW studies on the magnitude and character (direct or indirect) of the band gap in MoS 2 are not entirely consistent. Depending on the level of self-consistency, truncation of Coulomb interaction, treatment of frequency dependence, and k-point sampling, the GW quasiparticle band gap of MoS 2 can vary by ∼ 0.44 eV (Qiu et al., 2016).
The MoS 2 case study highlights the importance of carefully converging GW calculations and the difficulties of twodimensional materials, in particular. In two-dimensional semiconductors, the dielectric function is a linear function of q which results in very slow k-point convergence (Rasmussen et al., 2016). TMDs are also commonly stacked in layered materials called van der Waals heterostructures, which allow one to tune the electronic structure for device applications Arora et al., 2017;Winther and Thygesen, 2017). GW allows one to predict band alignment in these heterostructures from first principles (Ganesan et al., 2016).

MOLECULES
The application of GW to molecules is a relatively new field of research that has developed rapidly over the last decade. The electronic screening is much weaker in molecules than in extended systems. The low charge density in molecules does not naturally fit a screening interpretation of correlation which is intrinsic to GW and replacing the bare Coulomb potential with the dynamically screened Coulomb interaction W might not be the obvious choice. Even so, a rigorous test of GW for the He atom, with only two electrons, found excellent agreement with numerically exact results . In addition, the first exploratory G 0 W 0 studies on molecular systems revealed that the inclusion of screening at the GW level substantially improves electron removal and addition energies (Grossman et al., 2001;Niehaus et al., 2005;Dori et al., 2006;Ma et al., 2009Ma et al., , 2010bRostgaard et al., 2010;Blase et al., 2011;Ke, 2011).

First Ionization Potentials and Electron Affinities
In molecules, the single-particle states {φ 0 s } correspond to molecular orbitals (MO) with discrete energies. The energy to remove an electron from an MO is referred to as ionization potential. The negative of the electron affinity (EA) corresponds to the energy needed to add an electron to the LUMO of the neutral system (−EA LUMO =ǫ LUMO ), see also Equations (1) and (2). G 0 W 0 provides access to both quantities. Furthermore, we can calculate the fundamental gap from the first ionization potential, IP HOMO , and the electron affinity The fundamental gap should not be confused with the optical gap ogap , which is the energy needed for the charge neutral excitation from the HOMO to the LUMO. The optical gap is lower in energy than fgap and can not be obtained from GW. It defines the threshold for photons to be absorbed and for the formation of a bound electron-hole pair (exciton). Conversely, the fundamental gap is the energy threshold for the formation of a separate electron-hole pair, which is not bound together. It can be considered as the molecular equivalent to the band gap, see also Bredas (2014) and Baerends et al. (2013). GW has been mainly applied to compute the IP for the HOMO and the electron affinity for π-conjugated molecules with potential for organic photovoltaic applications Faber et al., 2011Faber et al., , 2012Ke, 2011;Gallandi and Körzdörfer, 2015;Gallandi et al., 2016;Knight et al., 2016;Wilhelm et al., 2016;Marom, 2017). Examples for relevant π-conjugated organic molecules are linear acenes (linearly fused benzene rings), quinones, aromatic nitriles, anhydrides, porphyrins, and thiophene polymers. These classes of molecules are particularly suited as organic semiconductor because their EA is often positive 10 (Richard et al., 2016), i.e., they are electron acceptors and their fundamental gap is much smaller than in inorganic molecules. For example, smaller acenes have gaps between 6.0 − 7.0 eV (Richard et al., 2016), whereas the fundamental gap of a small inorganic molecule like water is larger than 14.0 eV (van Setten et al., 2015).
The fundamental gap, IP HOMO and EA are critical parameters for the charge transport in organic semiconductors. Over the last years it has been shown that GW predicts these properties well. Using an appropriate starting point (see section 4.7), the reported mean absolute deviations (MADs) of IP HOMO and EA are less than 0.2 eV from the CCSD(T) reference Knight et al., 2016). The MAD of IP HOMO with respect to experiment can be even reduced to <0.1 eV when including also vibrational effects in the GW spectra (Gallandi and Körzdörfer, 2015).
The electronic properties of π-conjugated molecular structures can be tuned by, e.g., increasing the chain length.
It has been shown that GW correctly predicts the decrease of IP HOMO in trans-polyacetylene with increasing chain length (Pinheiro et al., 2015;Bois and Körzdörfer, 2017). Similar GW studies were conducted for band gaps of linear acenes (Wilhelm et al., 2016). The photovoltaic properties can be further modulated by using two different organic semiconductors in the cell: a molecule with a low IP HOMO (electron donor) and molecule with electron-acceptor character, i.e., with a high EA (Kippelen and Brédas, 2009). The level alignment of such donor-acceptor systems has been studied with GW for tetrathiafulvalene (TTF) and tetracyanoethylene (TCNE) or tetracyanoquinodimethane (TCNQ) dimers, demonstrating the importance of well-chosen starting points or self-consistent schemes (Caruso et al., 2014;Gallandi and Körzdörfer, 2015).
The accurate prediction of charged excitations is not only important for organic semiconductors, but also for DNA and RNA nucleobases in order to study their damage following exposure to ionizing radiation. IPs and EAs for these molecules have been reported at the G 0 W 0 level in good agreement with experiment and quantum chemistry methods (Faber et al., 2011;Qian et al., 2011;Gallandi and Körzdörfer, 2015).

Ionization Spectra
The GW approximation has also been applied to calculate excitations of deeper valence states for small organic molecules Marom et al., 2012;Caruso et al., 2013a;Egger et al., 2014;Ren et al., 2015) and also mediumsized π-conjugated molecules (Dori et al., 2006). An example is shown in Figure 35, where the ionization spectrum of pyridine is displayed for the first 12 valence states. Compared are the G 0 W 0 @PBE0 spectrum and the experimental PES. The positions of the peaks are in good agreement, in particular for the first three valence excitations. Benchmark studies for azabenzenes showed that a HF starting point yields distorted spectra, while hybrid DFT functionals and self-consistent schemes yield spectra that agree well with experiment . However, it has been found that the energy spacings and positions are not always reproduced satisfactorily. For example for benzene, the spacing of the HOMO-1 and HOMO-2 is vanishingly small for all starting points and also scGW . The exact spacing is larger than 0.5 eV. It has been demonstrated that a beyond GW scheme, so-called "vertex corrections, " are necessary to separate these two peaks .
The deeper valence states are generally less valuable for characterization and chemical analysis. Core excitation energies, on the other hand, are a powerful tool to investigate the chemical structure of complex molecules and materials. They are elementspecific, but are also sensitive to the atomic environment, such as covalent bonding, hybridization or the oxidation state (Siegbahn et al., 1969;Bagus et al., 1999Bagus et al., , 2013. The application of GW to core states is more difficult than to valence states, as we will explain in more detail in section 11. Core excitations in GW are an emerging research field and appropriate numerical algorithms have only been developed recently . Lastly, we will briefly address peak broadening in GW spectra. The G 0 W 0 spectrum in Figure 35 has been artificially broadened to facilitate comparison with experiment. This broadening FIGURE 35 | Ionization spectrum of pyridine. G 0 W 0 @PBE0 QP energies compared to the experimental photoemission spectrum (Liu et al., 2011). The calculated spectrum has been artificially broadened; the position of the QP energies is indicated with vertical bars. All QP energies are extrapolated using the cc-pVnZ (n=3-6) basis sets, see Appendix C for further computational details. The QPs of the first valence states are colored in red, green and blue. mimics vibrational, experimental resolution and finite lifetime effects. With regard to electronic lifetimes, also quasiparticle (QP) excitations in molecules have finite lifetimes accompanied by a finite broadening. Such a finite broadening would be revealed in the full spectral function A(ω), see Equation (3). The peaks close to the Fermi energy are usually sharp delta-like peaks, while higher energy excitations may decay through the formation of electron-hole pairs or collective excitations resulting in broader peaks, see Caruso et al. (2013a) for a detailed discussion of lifetimes of quasiparticles in molecules.

The GW100 Benchmark Set
An important aspect in electronic structure theory is benchmarking. Benchmark sets are very common in quantum chemistry, but have not found their way into GW until recently. Molecules offer a distinct advantage compared to solids for benchmarking because accurate reference energies can be computed with high-level quantum chemical methods. For this purpose, sets of small molecules are beneficial since FIGURE 36 | GW100 benchmark comparing IP HOMO energies computed at the G 0 W 0 @PBE level. FHI-aims is set as reference: IP HOMO = IP HOMO (FHI-aims)−IP HOMO (X). (A) Comparison of extrapolated/converged results for VASP , WEST (Govoni and Galli, 2018), BerkeleyGW (van Setten et al., 2015). Shown are the results from full-frequency treatments and iterative solutions of the QP equation.
(B) Comparison of localized basis set codes using the Gaussian basis set def2-QZVP (Weigend and Ahlrichs, 2005) for Turbomole (no-RI) (van Setten et al., 2015) and the N 3 implementation in CP2K . Note that BN, O 3 , MgO, BeO, and CuCN are excluded for WEST, VASP and CP2K and that the BerkeleyGW and Turbomole data contain only a subset of 19 and 70 molecules, respectively. Box plot: Outliers represented by dots; boxes indicate the "interquartile range" measuring where the bulk of the data are.
they are computationally tractable. Moreover, they provide diversity in the electronic structure due to different types of covalent bonding.
The first systematic benchmarks were performed using a small set of 34 molecules Bruneval and Marques, 2013). Van Setten et al. took this idea further and proposed the GW100 benchmark set (van Setten et al., 2015), which is currently the largest and most popular GW benchmark set. It contains 100 molecules that feature a variety of elements from the periodic table. The original GW100 paper reports HOMO and LUMO quasiparticle energies computed at the G 0 W 0 @PBE level and the corresponding experimental references. Van Setten et al. used the test set for a quantitative comparison of the different GW methodologies implemented in the program packages Turbomole, FHI-aims and BerkeleyGW. They compared the performance of different basis sets (plane wave vs. localized), handling of core and valence electrons (allelectron vs. pseudopotentials) and different frequency integration techniques. The codes with localized basis sets (FHI-aims and Turbomole) agree to a precision of 1 meV for most molecules. The deviation of the BerkeleyGW plane wave code to the basis-set-extrapolated FHI-aims and Turbomole results is in the range of 200 meV. These numbers refer to the IPs obtained from full-frequency integration techniques available in all three codes. Based on this, van Setten et al. identified the basis set size as one important aspect for the accuracy of GW calculations.
The test set served later as a benchmark for the PAW G 0 W 0 implementation in VASP . This comparison established that the carefully converged PAW plane wave G 0 W 0 calculations agree very well with the extrapolated results from the localized basis set codes. The MAD from the FHI-aims reference values is 60 meV. GW100 investigations with the WEST code gave similar results and highlighted the need for a re-evaluation of the pseudopotentials for some elements (Govoni and Galli, 2018). Moreover, the GW100 test set has been used to validate the accuracy of the low-scaling GW algorithm based in CP2K . A comparison between the different codes is reported in Figure 36. Extrapolated values are represented in Figure 36A comparing plane wave codes to FHI-aims, whereas the comparison in Figure 36B is restricted to codes with localized functions. A list of all codes that ran the GW100 benchmark can be found in GW100 (2018).
The GW100 test set was not only used to validate the reliability of numerical techniques in G 0 W 0 implementations. It has been also used for a comprehensive assessment of different self-consistent GW methodologies: scGW, QSGW and scGW 0 . The results were compared to CCDS(T) at the polarized triple-zeta level reporting the smallest discrepancies for QSGW. A comparison of basis set extrapolated CCSD(T) and GW schemes was performed shortly afterwards for a smaller, more specialized benchmark set of 24 organic electronacceptor molecules, where G 0 W 0 based on long-range corrected hybrid functionals emerged as the best GW method Knight et al., 2016). Since then, also equation of motion (EOM) coupled cluster benchmark sets have been published that provide reference spectra (and not just HOMO or LUMO energies) for molecules (Lange and Berkelbach, 2018;Ranasinghe et al., 2019).

Molecular Crystals
Modern applications of GW comprise not only isolated molecules, but also molecules in the condensed-phase, such as organic molecular crystals. These materials are composed of weakly bonded molecular units held together by, e.g., vander-Waals interactions, dipole-dipole interactions or hydrogen bonds. Here, we summarize only some key applications of GW to molecular solids. A more comprehensive discussion can be found in the specialized review by Kronik and Neaton (2016).
Molecular solids exhibit a band gap renormalization similar to molecular adsorbates discussed in section 7. The band gap of molecular solids is significantly smaller than the fundamental gap of the isolated molecules (Sato et al., 1981). As for molecular adsorbates, the gap renormalization is a direct consequence of polarization effects. It is also present when there is no wavefunction overlap between neighboring molecular units. If an electron is added to or removed from a certain molecule, the new charge carrier is screened not only by the molecule it was added to, but also by the surrounding molecules. This renormalization effect is shown in Figure 37 for the benzene crystal. The HOMO level moves up in energy with respect to its position in the gas phase molecule, whereas the LUMO moves down resulting in a gap reduction. The gap renormalization typically lies in the range of 2-6 eV (Kronik and Neaton, 2016) and has been studied with GW for benzene (Refaely-Abramson et al., 2013), corannulenebased materials (Zoppi et al., 2011), C 60 (Refaely-Abramson et al., 2013, pentacene Refaely-Abramson et al., 2013), perylenetetracarboxylic dianhydride (PTCDA) , octaethylporphyrin (H 2 OEP) (Marsili et al., 2014), 6,13-bis(triisopropylsilylethynyl)-pentacene (TIPS-pentacene)  and oligoacenes . The gap reduction is not captured by standard DFT calculations (Refaely-Abramson et al., 2013), see also Figure 37. In fact, the DFT gap remains almost unchanged when transitioning from the gas to the crystalline phase because the long-range polarization effects responsible for the gap renormalization are not included in conventional DFT functionals.
The molecular orbitals of molecular crystals resemble those of an isolated molecule. However, the overlap between neighboring molecules is not zero resulting in a k dependence (dispersion) of the energy levels. Starting with early work on C 60 (Shirley and Louie, 1993), GW band structures have been reported for a wide range of organic crystals (Tiago et al., 2003b;Sharifzadeh et al., 2012Sharifzadeh et al., , 2015Refaely-Abramson et al., 2013Fonari et al., 2014;Yanagisawa and Hamada, 2017;Cocchi et al., 2018;Rangel et al., 2018). As for inorganic semiconductors, GW opens the band gap and increases the band width with respect to DFT. For example, GW band widths reported for pentacene (Tiago et al., 2003b;Sharifzadeh et al., 2012), PTCDA , rubrene (Yanagisawa et al., 2013), or picene (Yanagisawa et al., 2014) are larger by more than 15%. The bands of molecular crystals are relatively flat compared to inorganic semiconductors (see section 6). For example, GW-computed band widths for pentacene are only 0.4 eV for the valence and 0.7 eV for the conduction band .
Molecular crystals are an ideal testbed for GW embedding schemes since the band gap of molecular solids is mainly determined by polarization effects and significantly less by dispersion. In the spirit of quantum mechanics/molecular mechanics (QM/MM) embedding schemes the molecular crystal is partitioned into a small part that is calculated with GW and a much larger MM part. In the embedding scheme proposed by Blase and co-workers, the small part to which GW is applied consists of one or more molecules, while a continuum polarization model is used to include the response of the MM system (Duchemin et al., 2016;Li et al., 2016a). They reported GW/MM gaps for pentacene and perfluoropentacene that are in close agreement with the bulk reference (Li et al., 2018). Such embedding schemes are often computationally more efficient than periodic boundary condition calculations, especially for local orbital basis set codes.

TOTAL ENERGY AND THE ELECTRONIC GROUND STATE
In addition to the quasiparticle spectrum, the Green's function also provides information on the electronic ground state. Both the ground state density and the ground state total energy are accessible. However, very few studies have explored ground state properties with GW. Since this review mainly addresses spectroscopic properties, we will only briefly address GW ground state calculations here.

Electron Density
The ground-state density n(r) follows directly from the Green's function (Fetter and Walecka, 1971) The total electron number contained in G can be obtained through integration of the density. For a self-consistent G that has been obtained from a converged solution of Dyson's equation (Schindlmayr, 1997), this number should then equal the total number of electrons N in the system. Also scGW 0 satisfies this particle number conservation law, but all other approximate self-consistency schemes as well as G 0 W 0 violate particle number conservation. Figure 38 shows density differences compared to the Hartree-Fock method for PBE, coupled cluster singles-doubles (CCSD), and self-consistent GW for the CO molecule. Overall the pattern is similar. All three methods remove charge from the bonding region and the top of the oxygen atom and focus it on the carbon atom and a p orbital of the oxygen atom. The charge density difference pattern between CCSD, a high-level quantum chemistry method, and scGW is very similar. This indicates that the GW density is of high quality.
From the density, the dipole moment of CO can be calculated. In PBE the dipole moment amounts to 0.2 Debye, in HF to −0.13 Debye and from scGW we obtain 0.07 Debye (Caruso et al., 2012a). The CCSD dipole moment is 0.06 Debye (Caruso et al., 2012b). All values were computed at the equilibrium bond-length of the respective method and the experimental dipole moment is 0.11 Debye (NIST, 2019). CCSD and scGW again agree closely and also match experiment reasonably well, whereas PBE overestimates the dipole moment and HF gives the wrong sign. The good agreement between scGW and CCSD and experiment is further testimony for the quality of the GW density.
Since fully self-consistent GW calculations are numerically quite involved and can currently only be performed for small systems, DFT densities are still used in the majority of GW studies. However, in situations in which the underlying DFT Kohn-Sham spectrum has the wrong order of states, erroneous charge transfer can occur in the DFT calculation. This is, for example, frequently the case in molecular complexes, if the HOMO of one molecule erroneously ends up above the LUMO of another. The corresponding G 0 will not reflect the true ground state density of the complex and the subsequent G 0 W 0 calculation will be wrong. G 0 W 0 itself cannot rectify this situation because it has no access to the density. Only self-consistent schemes can correct the density and the Green's function. Examples of such molecular complexes are dimers of tetrathiafulvalene (TTF) with tetracyanoethylene (TCNE), tetracyanoquinodimethane (TCNQ) and p-chloranil. In all cases, scGW stops the erroneous charge transfer that occurs in PBE and in hybrid functionals with a low amount of exact exchange (Caruso et al., 2014). The resulting charge density reflects the molecular charge densities that are slightly perturbed where the molecules are closest to each other.

Total Energy
The total electronic energy can be obtained from the singleparticle Green's function G via the Galitskii-Migdal (GM) formula: (Galitskii and Migdal, 1958;Fetter and Walecka, 1971) whereĥ 0 contains the kinetic energy operator and the external potential. This equation can be recast into a more familiar looking form (Strinati, 1988;Caruso et al., 2013a) in which T denotes the kinetic energy, E ext the external potential energy, and E H the Hartree energy. The exchange-correlation (xc) energy is given by the self-energy, , and the Green's function. Equation (102) is appealing because it contains the same terms as the DFT total energy. Notable differences are that the kinetic energy is the fully interacting kinetic energy and not that of an auxiliary non-interacting system. Correspondingly, the exchange-correlation energy is purely due to electronic exchange and correlation and does not need to also approximate the difference between the interacting and the non-interacting kinetic energy as in Kohn-Sham DFT.
The GW total energy is closely related to the popular randomphase approximation (RPA) in DFT (Langreth and Perdew, 1977;Hesselmann and Görling, 2010;Eshuis et al., 2012;Ren et al., 2012b). The xc energy in GW and RPA can be represented in terms of topologically identical Feynman diagrams (Hellgren and von Barth, 2007;Caruso et al., 2013b) and thus have a total energy expression with the same functional dependence on the Green's function (Klein, 1961;Dahlen et al., 2006b;Hellgren and von Barth, 2007). However, the RPA energy is evaluated with a non-interacting Green's function (originating from a local Kohn-Sham potential) and the GW energy with a fully interacting Green's function. In fact, the Dyson equation results as stationary equation from the optimization of the GW total energy with respect to the Green's function in the Klein or Luttinger-Ward functionals.
Early GW calculations for the homogeneous electron gas found the total energy to be in good agreement with Quantum Monte Carlo calculations (von Barth and Holm, 1996;Holm and von Barth, 1998;Holm, 1999;Holm and Aryasetiawan, 2000;García-González and Godby, 2001). GW also captures van der Waals interactions as exemplified by the total energy curve between two jellium slabs (García-González and Godby, 2002) and by changes in the GW density of the argon dimer (Ferri et al., 2015). More recently it was shown that the lattice constants and bulk moduli of simple solids agree much better between experiment and GW than with LDAs, GGAs or HF (Kutepov et al., 2009). However, GW total energy calculations for atoms (see Figure 39) and small molecules show the opposite (Stan et al., 2009;Caruso et al., 2012aCaruso et al., , 2013a. Presumably due to the low amount of screening self-consistent GW calculations are outperformed by high-level quantum chemistry methods and even simple DFT functionals. Further analysis (Hellgren et al., 2015) reveals that the difference between G 0 W 0 @HF and scGW can be ascribed to the difference in the kinetic energy (termed here kinetic correlation in analogy with DFT) because their Coulomb correlation energies are almost identical for the small systems shown in Figure 39. Conversely, the difference between G 0 W 0 @PBE and scGW is almost entirely due to Coulomb correlation. Both Coulomb and kinetic correlation are large, as illustrated in Figure 39. Once included, the remaining difference between scGW and full FIGURE 39 | Total energy of atoms computed with three different GW variants for atoms and small molecules plotted as a difference to the essentially exact Configuration Interaction (CI) results. Data retrieved from Caruso et al. (2012a).
configuration interaction (the essentially exact solution) must be due to missing vertex corrections. This contribution is much smaller than the two correlation contributions.

Challenges
As successful as the GW approximation is for describing quasiparticle excitations, there are still technical and theoretical challenges to overcome. Core-level spectroscopy is a valuable tool for chemical analysis and characterizing materials. The operating principle is the same as PES and IPES discussed in section 2, though at higher incident energies using Xrays. The technique is then referred to as X-ray photoelectron spectroscopy (XPS). Core levels of the same type, for example, different carbon 1s states, are element-specific, but are also sensitive to the local chemical environment, i.e., bonding, hybridization or the oxidation state (Egelhoff, 1987;Bagus et al., 1999). However, these so-called chemical shifts are for second-row elements often smaller than 1 eV (Siegbahn et al., 1969). The energetic differences are particularly minute for carbon with XPS peaks that are separated by less than 0.5 eV. Such spectra are hard to resolve and interpret. Theoretical spectroscopy can be a valuable tool to aid the interpretation of experimental results.
For core levels, however, the simple, single quasiparticle picture can break down. The incident photon in PES may produce spectral features away from the single-particle peak. If the additional peak is broad, these so-called satellites can be attributed to the collective excitation of the system after the electron is excited. If the additional peak is narrow or, equivalently, has a long lifetime, the electron has spectral weight divided between multiple particle-like eigenstates of the system . This effect can also appear when probing the multiplet structure of open-shell systems (Lischner et al., 2012). In these cases, the quasiparticle equation can have multiple solutions, making both the GW calculation and interpretation of the result more difficult.
The problem also appears for more conventional valence states of small molecules, and recent work has shown that these multiple solutions lead to unphysical discontinuities in quasiparticle energies and that evGW can exacerbate the problem Véril et al., 2018). Just as for experimental spectroscopy, the sensitivity of core states to the local environment makes the GW calculation more challenging than for conventional valence states. Due to its value for chemical analysis and dearly needed support for XPS experiments, GW for core levels can yield useful insight and is an ongoing topic of research (Zhou et al., 2015. Spin dependence in GW calculations is important for understanding magnetic systems and is critical to the electronic structure of topological insulators. Already in the case of collinear spin, when the spin quantum number is either up or down, spin polarization has an effect on the excitation spectrum of MnO (Rödl et al., 2008). By including spin-orbit coupling (SOC) in the one electron Hamiltonian, single-particle states become noncollinear and can no longer be decomposed into up or down. Noncollinear calculations are important in relativistic systems with strong SOC or when describing magnetic effects (Sakuma et al., 2011;Kutepov et al., 2012;Ahmed et al., 2014;Kühn and Weigend, 2015;Scherpelz et al., 2016). For materials with heavy elements, energy shifts due to spin-orbit coupling must be included for good agreement with experiment on band gaps (Scherpelz et al., 2016). Topological insulators commonly contain heavy elements (Se, Te, Bi, Sb) and depend on spinorbit coupling for band inversion (Aguilera et al., 2013a(Aguilera et al., ,b, 2015aNechaev and Chulkov, 2013;Nechaev et al., 2015). For a detailed review of GW+SOC calculations, see Aguilera et al. (2015b). To describe spin-dependent interactions between particles, one must generalize Hedin's equations beyond the Coulomb interaction, which has no spin dependence. This generalization was recently completed Biermann, 2008, 2009) and allows one to treat magnetic dipole-dipole interactions, for example.

Quantum Chemistry
Quantum chemistry offers an established, albeit expensive, route to compute particle addition/removal energies in molecules. Ionization energies and electron affinities can be computed as the difference of total energies between the neutral molecule and the ion. In fact, GW calculations on small systems are often compared with coupled cluster results as a benchmark. A direct comparison between correlated wave-function and Green's function methods to determine the level of correlation described by each is somewhat challenging. In certain cases, it is possible: recent work compares diagrams included in GW with those included in equation-of-motion coupled cluster theory (Lange and Berkelbach, 2018).
Generally, nonperturbative wave-function methods are considered beyond GW, even if they rely on an ansatz or other approximation. In Green's function embedding theories, quantum chemistry (either full or truncated configuration interaction) can be used as a high accuracy Green's function solver in a subspace (Zgid et al., 2012). After computing the subspace wave function, one directly computes the amplitudes in Equation (5) for the subspace Green's function. With G and G 0 in hand, it is then trivial to compute the self-energy (Pavlyukh and Hübner, 2007). In this subspace, the Green's function is computed from accurate many-body wave functions so that correlation is treated beyond GW. The subspace Green's function can be self-consistently iterated with the remaining degrees of freedom described at the GW level of theory (Martin et al., 2016). Other routes to combine GW with quantum chemistry are an emerging field. A newly developed method combines GW with configuration interaction by embedding a wave function calculation inside of a Green's function calculation (Dvorak et al., 2018;Dvorak and Rinke, 2019). These developments offer valuable insight to merge these disciplines in the future.
Green's functions are also directly studied in quantum chemistry, where they are more commonly called propagators. There certainly is some overlap between the two communities in their treatment of GW or GW-like approximations. Because we primarily focus on GW and Hedin's equations in physics, we refer the interested reader to the work of Cederbaum and Domcke (2007) and Ortiz (2012) for a perspective of propagators in chemistry.

Non-equilibrium Green's Functions
The GW approach has also been applied to systems in strong external fields. These include quantum transport calculations (Thygesen and Rubio, 2008;Spataru et al., 2009) and semiconductors in strong laser fields (Spataru et al., 2004a). The problem of describing quantum transport is similar to that of level alignment at a molecule/metal interface discussed in section 7. First, the alignment of molecular states in the contact region relative to the Fermi level of the metal leads determines the overall conductance. Second, for applied biases, charge will flow from the lead into the molecule or molecules in the contact region. This charge flow will alter the electron density of the system and therefore the quantum mechanical interactions.
Self-consistent GW calculations (Thygesen and Rubio, 2008) take charge transfer and the associated change in screening (e.g., image effect) and the many-body interactions into account correctly. scGW is an appropriate tool for finite, small bias quantum transport calculations, as benchmarked for instance for thiol-and amine-linked benzene/gold  and alkane/gold junctions . Strong correlation effects in quantum transport, such as the Coulomb blockade or the Kondo effect, can, however, not be captured with the GW approach and require a beyond GW treatment of correlation Thoss and Evers, 2018).
For strong external biases in quantum transport and systems in other strong external fields, the electron distribution is perturbed so strongly that it can no longer be described by an equilibrium Fermi function at a finite temperature. In such non-equilibrium cases, the Green's function theory described in this review article is not applicable anymore. Nonequilibrium scenarios can be incorporated into the Green's function formalism, by switching to non-equilibrium Green's functions defined on the Keldysh contour (Dahlen et al., 2006a). These non-equilibrium Green's functions obey the Kadanoff-Baym and not Hedin's equations. The Keldysh contour formalism goes beyond the scope of this review article and we refer the FIGURE 40 | Diagrammatic representation of Hedin's equations. All 5 quantities are coupled to all others. Here, we omit the Hartree potential from the G diagram, though it must also be included when translating the diagrams to the equations in Appendix B.
In one application of this non-equilibrium formalism highly excited GaAs was investigated. It had been hypothesized that GaAs could become metallic if enough electrons could be promoted from the valence to the conduction band with a strong laser. Non-equilibrium GW calculations showed that the band gap was indeed decreasing with increasing laser power, but would not close completely, falsifying the hypothesis (Spataru et al., 2004a).

Vertex Corrections
To go beyond the GW approximation, one must include vertex corrections. The full set of Hedin's equations, including the vertex, are shown diagramatically in Figure 40, which can be directly compared to Figure 9. The mathematical equations are in Appendix B.2. By comparison to the GW diagrams, we see that treating the vertex, Ŵ, beyond a single spacetime point significantly complicates the equations, as demonstrated for a single diagram in Figure 8. The vertex contributions beyond Ŵ(1, 2, 3) = δ(1, 2)δ(1, 3) are commonly called vertex corrections. The exact vertex requires the functional derivative δ /δG. This functional derivative cannot be computed numerically and must be derived analytically. Any resulting vertex is extremely expensive to compute because it now depends on three spatial, spin, and time indices. There are a few reasonable options for reducing the expense: using an approximate from a different theory to simplify the derivative, using a diagrammatic but simplified Ŵ, or using an exact Ŵ in only a small subspace. As in GW, Ŵ can be selectively iterated or computed in a single-shot way to further lower the cost.
The earliest approaches to vertex corrections used an approximate vertex based on the local density approximation (LDA) to Kohn-Sham DFT (Hybertsen and Louie, 1986;Del Sole et al., 1994). In these approximations, the LDA exchangecorrelation functional is used in place of the self-energy to compute the functional derivative. The approximate vertex enters the polarizability and interaction as where n is the electron density and f xc determines the vertex correction. The advantage of the LDA is that f xc can be calculated analytically. These approaches are computationally much lighter than the true vertex and, for that reason, are still used . Approximate vertex corrections can also be extended beyond the LDA to recover a more realistic behavior (Chen and Pasquarello, 2015a). The inclusion of f xc has its roots in timedependent density functional theory (TDDFT) and is somewhat inconsistent with the Green's function formalism. The final results of such calculations can improve band gaps compared to G 0 W 0 (Chen and Pasquarello, 2015a) or band centers compared to G 0 W 0 . Shaltaf et al. found that a GGAbased vertex correction had a negligible effect on band offsets in the Si/SiO 2 interface (Shaltaf et al., 2008).
Diagrammatic vertex corrections, instead of those based on a density functional, are a more consistent and formal extension to GW. One can build upon GW in a fully diagrammatic framework by including vertex corrections in the perturbative, single-shot self-energy. These methods are analogous to the G 0 W 0 approach in that the vertex correction is computed only one time, and the quasiparticle energies are usually computed in a diagonal approximation. For example, second order screened exchange (SOSEX) and related approximations include a subset of exchange-type diagrams which are a vertex correction to GW (Shirley and Martin, 1993b;Bobbert and van Haeringen, 1994). In the approximation of Ren et al. (2015), the diagonal matrix elements of GW+SOSEX are where f q and f r are Fermi occupation factors and sr| |qp is defined in Appendix A. Notice extra matrix elements in the numerator and factors in the denominator compared to the equation for the GW self-energy in Figure 10. SOSEX-type approximations generally improve upon GW band gaps in molecules . In the perturbative approach, these calculations are relatively lightweight but have the same starting point dependence of G 0 W 0 . A systematic bridge between diagrammatic vertex corrections and TDDFT was developed by Del Sole, Reining, and others (Streitenberger, 1984a,b;Reining et al., 2002;Adragna et al., 2003;Del Sole et al., 2003;Sottile et al., 2003;Bruneval et al., 2005). In this approach, the kernel to construct the irreducible polarizability is cast as only a two-point function. This is in contrast to the exact vertex, Ŵ, which depends on four spacetime coordinates to compute, making it much more expensive (four coordinates for the derivative δ (1, 2)/δG(4, 5), see Appendix B.2). This two-point kernel can only be used inside of W. Outside of W, the three-point nature of the vertex is unavoidable. The simplified many-body approach retains the simplicity of a TDDFT kernel but has its foundation in many-body theory. By adopting the GW approximation to , an approximate, analytic f QP xc exists. Calculations of the dielectric function in Si and GaAs show good agreement between the f QP xc approach and a solution for the full vertex .
More recent work has included diagrammatic vertex corrections to solve Hedin's equations at some level of selfconsistency, though still at an approximate level (Grüneis et al., 2014;Kutepov, 2016Kutepov, , 2017. The greatest conceptual and computational difficulty to these calculations is how to update Ŵ. Because Ŵ enters in both χ 0 and , and because full self-consistency is so expensive, it is advantageous to update Ŵ in only one portion of the calculation. For example, one could evaluate the interaction W with a diagrammatic Ŵ once at the beginning of the calculation. Keeping W fixed, only G and are updated through Dyson's equation in the self-consistency cycle. While this procedure is only partially self-consistent, it incorporates a diagrammatic Ŵ while keeping the GW level of complexity through the self-consistency cycle. When applied to semiconductors and insulators, and with some practical restrictions on Ŵ, solutions of Hedin's full equations give noticeably better band gaps than GW (Kutepov, 2016(Kutepov, , 2017. Full, self-consistent solutions of Hedin's exact equations remain out of reach in real systems, and even partially self-consistent schemes are a technical challenge.

Optical Properties
Calculations of the many-body vertex have another application beyond particle addition/removal energies. Optical processes, such as photon absorption, can also be modeled in the Green's function formalism. In such a case, the relevant correlation function is the time-ordered two-particle correlation function, L. L obeys a Dyson equation like G, except that the role of the self-energy is instead played by δ /δG. The Dyson series for the full vertex to determine L is called the Bethe-Salpeter equation (BSE) in physics (Salpeter and Bethe, 1951;Held et al., 2011). BSE calculations describe a different process than ordinary GW, so they are not beyond GW in the same sense as including a vertex in the self-energy. Even so, they are closely related. The common implementation of the BSE for materials relies on the GW approximation to the self-energy. In these GW/BSE calculations, the excited electron and hole instantaneously interact via W. The effective two-particle Hamiltonian for correlated optical excitations (Strinati, 1982(Strinati, , 1984Rohlfing and Louie, 2000), called excitons, is where i and a denote again occupied and empty states, respectively, ǫ i/a is a quasiparticle energy, A P ia is the Pth exciton wave function in the single-particle basis, and P is the P th excitation energy. Equation (106) makes the common Tamm-Dancoff approximation (TDA), which ignores backward propagating electron-hole pairs that are present in the exact BSE. Matrix elements of K, δ /δG, include a direct screened interaction (W) and repulsive exchange (v). Schematic and diagrammatic representations of the optical process are shown in Figure 41. BSE calculations can be considered a first iteration of Ŵ in Hedin's equations to go beyond GW for particle addition/removal energies, if the resulting polarizability is reinserted into W.

T-Matrix
The framework of Hedin's equations, and GW in particular, places great emphasis on the screened Coulomb interaction. Indeed, many of the approximate schemes presented here frame the exact vertex as a correction (hence the term "vertex correction") to a GW calculation of the self-energy. In certain systems, it may be necessary to abandon this picture entirely. For example, in systems with low electron density and a similarly low number of electron-hole screening channels, as in very small atoms or molecules, screening of the Coulomb interaction may be insignificant. Roughly speaking, diagrams of the vertex type could be more important than screening diagrams included in GW. For these systems, we should adopt a different formalism which does not rely on screening and directly emphasizes other correlation channels. One such formalism is the T-matrix (Zhukov et al., 2004;Romaniello et al., 2012;Zhang et al., 2017), in which the self-energy is written as a product of G with a four-point kernel, T, (1, 2) = −i d3 d4 G(4, 3)T(1, 3, 2, 4).
The precise form of T depends on the choice of the particleparticle (pp) or particle-hole (ph) T-matrix, which determines the channels that are correlated alongside a third propagating channel. T obeys its own Dyson series and physically corresponds to repeated interactions, or scattering, between the selected channels (pp or ph). There are advantages of the T-matrix approach: it is exact up to second order in the bare interaction and includes many more exchange diagrams than GW, making it useful for magnetic systems. At first glance, the ph T-matrix may sound like GW. However, the two approximations correlate different particle and hole channels in the self-energy diagram. A schematic representation of the correlated channels in GW and T-matrix is shown in Figure 42. Notice the different topologies of G 0 lines correlated in GW and ph-T. The T-matrix approach has been applied to understand the role of spin-flip excitations in metals (Zhukov et al., 2004(Zhukov et al., , 2005(Zhukov et al., , 2006Müller et al., 2018;Młyńczak et al., 2019), double ionizations and Auger spectroscopy (Noguchi et al., 2007(Noguchi et al., , 2008(Noguchi et al., , 2010, as well as satellites in metals (Springer et al., 1998).

Cumulant Expansion
One long-standing problem with the GW approximation is its description of plasmon satellites in, for example, Si, Na and Al Guzzo et al., 2011Guzzo et al., , 2014Zhou et al., 2015). Plasmon satellites are peaks in the spectral function which are attributed not to a single quasiparticle, but to the coupling between a hole (in the particle removal case) and the collective excitation of the remaining electrons. This coupling leads to a quasiparticle peak in the spectral function and a series of progressively weaker plasmon replicas separated by the plasmon energy. A proven route to improve the plasmon description compared to GW is the cumulant expansion to the Green's function, which has been tested on Na, Al, graphene, Si, and the electron gas Guzzo et al., 2011;Gatti and Guzzo, FIGURE 42 | Schematic representation of diagrams included with (A) GW, (B) particle-hole T-matrix, and (C) particle-particle T-matrix. In each case, channels going into the box are correlated further with additional interactions.
2013; Lischner et al., 2013Lischner et al., , 2014Giustino, 2015, 2016;Zhou et al., 2018). Based on an exponential ansatz, somewhat analogous to the coupled cluster expansion for the wave function, the cumulant Green's function for a hole takes the form Guzzo et al., 2011;Lischner et al., 2013) G s (t) = (−t)e −iǫ 0 s t+C s (t) where ǫ 0 s is the mean-field energy that enters G 0 for state s and C s (t) is the cumulant. The exact form of the cumulant C s (t) depends on the chosen approximation to the self-energy. If one Taylor expands Equation (108) in powers of C s and compares it to Dyson's equation with the GW self-energy, an approximate closed form for the cumulant exists. This is called the GW+C method. The cumulant includes vertex corrections beyond the GW self-energy at the same computational expense as ordinary GW. These vertex corrections generally improve the description of satellites over GW when compared with experiment.
The cumulant appears to be a tremendous success − it miraculously provides vertex corrections for the same cost as GW. However, it does not improve the description of valence quasiparticle energies. The quasiparticle energy is still determined by ordinary GW. Furthermore, the cumulant ansatz in Equation (108) relies on the separation of electron and hole branches of the Green's function and produces satellites only below the Fermi energy. In general, this separation is not correct, and it becomes a worse approximation closer to the Fermi energy (Guzzo et al., 2014;Martin et al., 2016). The formal connection between GW and the cumulant is presented in Gumhalter et al. (2016).
As a case study of the cumulant, we highlight the study of doped graphene by Lischner et al. (2013). The spectral function of doped graphene on a SiC(0001) surface displays a quasiparticle peak and satellite, shown in Figure 43. With ordinary G 0 W 0 , the FIGURE 43 | Spectral function A(ω) of graphene on SiC at the Dirac point for electron doping density of n = −5.9 × 10 13 cm −2 . Data taken from Lischner et al. (2013).
splitting between the quasiparticle and satellite is 0.44 eV, which overestimates the experimental value of 0.3 eV. By including vertex corrections to the hole Green's function with the cumulant, the GW+C calculation reduces the splitting to 0.27 eV. GW+C also redistributes spectral weight away from the quasiparticle and to the satellite. Additionally, GW+C eliminates a spurious plasmaron − coupling between a hole and plasmon − solution that appears in GW.

Local Vertex
The treatment of localized electrons in physics has become its own subfield (Held et al., 2011;Hirayama et al., 2017;Tomczak et al., 2017). In strongly-correlated physics, localized d-or felectrons usually indicate a need to go beyond GW (Nohara et al., 2009;Gatti and Guzzo, 2013;Sakuma et al., 2013), with transition metal oxides being typical test cases. Much of the previous discussion applies just as well to localized electrons as any others − Hedin's equations are exact. However, the localized nature of these states lends themselves to model Hamiltonians, particularly the Hubbard model (Hubbard, 1963), which describes localized interactions by a parameter U. U is a measure of the repulsive interaction, or energetic cost, for electrons occupying the same spatial orbital. When combined with the LDA, the LDA+U method can improve upon the poor description of localized states by mean-field and GW theories (Jiang, 2018).
In the Green's function formalism, including diagrams beyond GW is made tractable by approximating the true Coulomb interaction by U. The combination of GW with dynamical meanfield theory (DMFT) (Georges and Kotliar, 1992;Kotliar et al., 2006), the GW+DMFT method (Biermann et al., 2003;Biermann, 2014), is a rigorous way of combining diagrams of higher order with the GW approximation. GW+DMFT describes long-range correlation with GW and local d-or f -electron correlation via the Hubbard U. GW+DMFT correlates a small set of states (the d-or f -electrons) using a non-perturbative vertex, called a "local" vertex since single site DMFT only includes diagrams which are local in orbital space. The first studies of SrVO 3 with GW+DMFT demonstrated its potential for predicting spectral properties of strongly-correlated solids (Tomczak et al., 2012(Tomczak et al., , 2014. The GW+DMFT method continues to be developed (Biermann, 2014;Choi et al., 2016).

CONCLUSION
Photoelectron and inverse photoelectron spectroscopy will remain some of the most powerful probes of matter available to scientists. While experimental spectroscopy gives the "answer" in the form of the measured spectrum, it may not give the full understanding of the underlying physics. In this regard, theoretical spectroscopy plays a huge role as a complement to experimental techniques.
We have introduced the Green's function formalism and many-body theory from a perturbation theory perspective. The formalism is exact, in principle, and allows one to calculate both ground and excited state properties. From the Feynman diagram construction, we have given a heuristic motivation for Hedin's equations, which are themselves nonperturbative. Hedin's equations place emphasis on the screened Coulomb interaction. The intuitive nature of screening-the simple idea that charges rearrange themselves, or respond, to an added charge-is the major reason behind the appeal and success of the GW approximation. The frequency dependence of the screened Coulomb interaction is largely what sets GW apart from density functional or Hartree-Fock theories.
Impressive advances in code development and computing resources have pushed GW calculations to a new scale. At the computationally lowest level of theory, G 0 W 0 calculations remain the most widely used and can be routinely applied to systems with hundreds of atoms. Within the G 0 W 0 approach, we have outlined the practical considerations presented to the user before performing any calculation: starting point, basis set, evaluation of the self-energy, and convergence are all for the user to decide. For a broad class of systems, G 0 W 0 already gives electron addition and removal energies in good agreement with experiment. These successes give GW an impressive ranking in computational value, or accuracy for computational cost, on any list of first-principles electronic structure methods. The versatility of GW assures that new applications in physics, chemistry, and materials science will continue to emerge in the future.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial contribution to the work. DG focused on numerical aspects, MD on theoretical concepts. PR contributed to all parts.

FUNDING
This work is supported by the Academy of Finland through grant Nos. 284621, 305632, 316347, and 316168.

ACKNOWLEDGMENTS
The authors acknowledge the CSC-IT Center for Science, Finland, for generous computational resources and the Aalto University School of Science Science-IT project for computational resources. We would like to thank Fabio Caruso, Jan Wilhelm, Leeor Kronik, and Lucia Reining for fruitful discussions and valuable feedback.

APPENDIX A: INTEGRAL NOTATION
We adopt the following integral notation ×φ k (r)φ l (r ′ ) whereÔ(r, r ′ ) is an operator that depends on the spatial variables r = (x, y, z) and r ′ = (x ′ , y ′ , z ′ ). Furthermore the following notation for Coulomb integrals are used where v(r, r ′ ) = 1/|r − r ′ | is the Coulomb operator. The antisymmetrized Coulomb integrals are defined as = φ i φ j |φ k φ l − φ i φ j |φ l φ k .

APPENDIX B: THE MANY-BODY PROBLEM
In first principles electronic structure theory, the aim is to solve the Schrödinger equation. For simplicity we focus on the nonrelativistic time-independent Schrödinger equation. For a system of N electrons and M nuclei, the Schrödinger equation is given byĤ with the many-body Hamiltonian r i and R a are the positions of the electrons and the nuclei, respectively, and Z a is the charge of the nuclei. To make this system of coupled electrons and nuclei more tractable, the Born-Oppenheimer (BO) approximation of clamped nuclei is frequently introduced. In this case, we only need to consider the electronic Hamiltonian by itself, H elec elec = E elec elec , in which the many-electron wave function elec depends parametrically on the position of the nuclei. The electronic Hamiltonian is then given bŷ We use v to denote the bare Coulomb interaction and the external potential is the same for every electron v ext (r) = − M a=1 Z a |r − R a | .
The kinetic energy and external potential are grouped together inĥ 0 (r i ) in Equation (B5). Approaches to solve the electronic Schrödinger equation (B3) can be largely grouped according to the basic variable they work with: the many-body wave function, the density-matrix, the density, or Green's functions. Each choice has its advantages and disadvantages and no consensus has been reached on the optimal choice. Green's functions have a natural connection, however, to the particle addition/removal problem and theoretical spectroscopy.

B.1. Green's Function Formalism
To derive equations for the single-particle Green's function that are more amenable to approximations than definitions in Equations (6) and (7), we start from the equation of motion for the field operators, which relates the time derivative ofψ to the commutator ofψ andĤ elec (Fetter and Walecka, 1971;Gross et al., 1991): where x contains also the spin variable σ , i.e., x = (r, σ ). Evaluating the commutator in the Heisenberg picture and applying the anti-commutation relations yields for the equation of motion i ∂ ∂tψ (x, t) = ĥ 0 (r) + ψ † (x ′ , t)v(r, r ′ )ψ(x ′ , t)dx ′ ψ (x, t) . (B8) The equation of motion for the Green's function (6) then follows from (B8) The term under the integral contains the two-particle Green's function, G 2 (xt, x ′′ t, x ′′ t + , x ′ t ′ ), and includes all two-body correlations in the system. In order to calculate the oneparticle Green's function we would therefore require the equation of motion for the two-particle Green's function, which in turn introduces the three-particle Green's function. Applied iteratively, this procedure creates an infinite series of higher order Green's functions and thus describes all the possible many-body interactions in the system. In practice, however, the resulting recurrence relation for the n th order Green's function is impossible to solve for large n. Instead we introduce the non-local, time-dependent self-energy¯ (xt, Analogous to other electronic structure methods, we separate out the most dominant term, the Hartree potential v H (r) = dr ′ v(r, r ′ ) N|ψ † (r ′ , t)ψ(r ′ , t)|N (B11) = dr ′ v(r, r ′ )n(r ′ ) , where n(r) is the electron density. With this definition, the equation of motion for the Green's function (B9) adopts the much more convenient form of an integral equation involving the self-energy, =¯ − v H : If G 0 now denotes the Green's function that is a solution tô h =ĥ 0 + v H (r) (the kinetic energy, the external potential, and the Hartree potential), then Equation (B13) can be further rewritten as G(1, 2) = G 0 (1, 2) (B14) + G 0 (1, 3) (3, 4)G(4, 2)d(3, 4) .
Here we changed to the abbreviated notation (1, 2, . . .) for the set of position, time and spin variables (x 1 t 1 , x 2 t 2 , . . .). Accordingly d(1) is a shorthand notation for the integration in all three variables of the corresponding triple(s). In this context 1 + implies the addition of a positive infinitesimal to the time argument 1. Equation (B14) is again Dyson's equation (Dyson, 1949a,b) (see also Equations (13) and (82)) and links the non-interacting Green's function, G 0 , to the interacting one, G. Dyson's equation gives a physical interpretation to the self-energy instead of simply its definition by Equation (B10). The self-energy quantifies the difference between a bare and a fully interacting electron, or quasielectron. This brings us back to our phenomenological consideration in section 2. An additional electron or hole drags a dynamically adjusting polarization cloud through the system that alters its energy. Hence the name self-energy. If instead G 0 is the solution to the mean-field Hamiltonianĥ MF [e.g. of Kohn-Sham density-functional theory (Kohn and Sham, 1965)] then the selfenergy, , embodies the difference between a quasielectron and an electron in the static mean-field.
At this stage in the derivation, the self-energy as well as the Green's function are still exact and contain the electron-electron interaction to all orders. A full solution of Equations (B13) and (B14) is not tractable and approximations are required.
Dyson's equation (B14) closes this set of integro-differential equations, which is shown pictorially in Figure 24A.
For a polarizable medium, like a solid, screening is large and the screened Coulomb interaction will differ considerably from the bare one. It therefore makes sense to build a perturbation series on W instead of v. Hedin's equations are physically transparent in this sense. Electron-hole pairs are created in the polarizability (one Green's function for the electron, a separate Green's function for the hole). They interact through the vertex function, which is determined by the change in potential upon excitation. The polarizability, in turn, determines the dielectric function, which screens the Coulomb interaction. The self-energy quantifies the energy contribution that the added electron or hole experiences through the interaction with its surrounding. Figures 11, 12, 14, 16B, 18, 25, and 35 present original content. All calculations are performed with the FHI-aims program package Havu et al., 2009), which expands the all-electron KS equations in numeric-atom-centered orbitals (NAOs), see section 4.4. The structures have been optimized at the DFT level using the Perdew-Burke-Ernzerhof (PBE) functional (Perdew et al., 1996a) to model the XC potential and NAOs of tier 2 quality  to represent core and valence electrons. Dispersion corrections are accounted for by employing the Tkatchenko-Scheffler van der Waals correction (Tkatchenko and Scheffler, 2009). G 0 W 0 calculations have been performed with the contourdeformation (CD) technique  if not indicated otherwise. Calculations with the analytic continuation (AC)  have been conducted for the G 0 W 0 self-energies in Figure 14 and for the scGW result in Figure 18. For both methods, CD and AC, a modified Gauss-Legendre grid with 200 grid points is used for the numerical integration of the integral over the imaginary frequency axis. In case of the AC approach, the same set of grid points {iω} is employed to calculate c s (iω), which has been fitted to a Padé approximant with at least 16 parameters (Vidberg and Serene, 1977), unless stated otherwise.

APPENDIX C: COMPUTATIONAL DETAILS
Quasiparticle energies have been computed by iteratively solving Equation (22). Furthermore, all quasiparticle energies have been extrapolated to the complete basis set limit using the Dunning basis set family cc-pVnZ (n = 3 − 6) (Dunning, 1989;Wilson et al., 1996). The cc-pVnZ basis sets are allelectron Gaussian basis sets, which can be considered as a special case of an NAO and are treated numerically in FHI-aims. The extrapolation has been performed by a linear regression against the inverse of the total number of basis functions.