A Practical Guide to Surface Kinetic Monte Carlo Simulations

This review article is intended as a practical guide for newcomers to the field of kinetic Monte Carlo (KMC) simulations, and specifically to lattice KMC simulations as prevalently used for surface and interface applications. We will provide worked out examples using the kmos code, where we highlight the central approximations made in implementing a KMC model as well as possible pitfalls. This includes the mapping of the problem onto a lattice and the derivation of rate constant expressions for various elementary processes. Example KMC models will be presented within the application areas surface diffusion, crystal growth and heterogeneous catalysis, covering both transient and steady-state kinetics as well as the preparation of various initial states of the system. We highlight the sensitivity of KMC models to the elementary processes included, as well as to possible errors in the rate constants. For catalysis models in particular, a recurrent challenge is the occurrence of processes at very different timescales, e.g., fast diffusion processes and slow chemical reactions. We demonstrate how to overcome this timescale disparity problem using recently developed acceleration algorithms. Finally, we will discuss how to account for lateral interactions between the species adsorbed to the lattice, which can play an important role in all application areas covered here.


INTRODUCTION
As the witty name suggests, Monte Carlo is a wide umbrella term that covers a numerous family of approaches with one simple central idea in common: the resolution of complex problems using random numbers. Given the versatility of the concept, it is no surprise that Monte Carlo based approaches have gained popularity in computational chemistry and materials science (cf. e.g., Frenkel and Smit, 2001), most prominently for the simulation of ensemble properties using Metropolis Monte Carlo, or methods derived from the latter such as Basin Hopping for global geometry optimization. In addition to equilibrium properties, the Monte Carlo idea can also be exploited to tackle dynamical properties. In this sense, a number of approaches emerged over the decades under different names, until the term kinetic Monte Carlo (KMC) became universally used in this context.
Nowadays KMC is a popular tool to describe a variety of phenomena related to e.g., transport (diffusion), structures and properties of materials (e.g., crystal growth) or equilibrium and nonequilibrium chemistry (catalysis). As will become apparent throughout the text, in the context of atomistic simulations KMC can be considered as a form of coarse graining. This renders it particularly suitable to find its place in hierarchical multiscale modeling approaches, where information at different levels of accuracy or detail is integrated to provide a more comprehensive description. In this context, KMC is an essential method to bridge the gap between the microscopic world (elementary processes such as atomistic diffusion jumps or the making and breaking of chemical bonds) and the meso-to macroscopic world (e.g., a diffusion constant or a reaction rate).
Let us illustrate this concept by considering for example heterogeneous catalysis, which is one of the fields where KMC, as well as hierarchical approaches in general, have made considerable impact (Sabbe et al., 2012;Reuter, 2016). The design of a catalytic process requires a deep understanding of phenomena ranging from the reactive chemistry to the optimization of heat and mass transport within the reactor. Numerical simulations have become an integral part in this design process, which requires appropriate models at multiple time and length scales, and perhaps most importantly, concepts on how these models should be connected. One can imagine "zooming in" from the macroscopic world where we live and an industrial reactor operates, all the way down to the microscopic scale where the events are ultimately governed by electronic structure: adsorption and desorption of atoms and molecules at surfaces, diffusion, bond breaking and bond forming (see Figure 1). All of the latter constitute elementary processes that can occur at the interface between the catalyst and the reaction fluid, and can nowadays be described individually to a great level of detail by first-principles electronic structure calculations. At this scale, what one essentially needs is a mechanistic description in terms of the Potential Energy Surface (PES) of the system (vide infra). Hereby, an appropriate quantum mechanical approach is required to capture chemical subtleties in a predictive manner. The workhorse for this remains to this day largely Density-Functional Theory (DFT) (Hohenberg and Kohn, 1964;Kohn and Sham, 1965), thanks to its unique compromise between accuracy and efficiency that allows to access system sizes as relevant for heterogeneous catalysis.
If one now "zooms out" a little and into the mesoscopic scale, one can see how what globally happens is the result of an intricate interplay of the above elementary processes as well as the interaction with the surrounding environment. Here, the spatiotemporal evolution at the interface is dominated by collective behavior; dynamics and thermodynamics come into play. At this stage, one may employ the microscopic information (e.g., reaction barriers, adsorption energies etc. of the elementary processes) and embed it into microkinetic models as a form of coarse-graining. A plethora of approaches of different complexity are available, from Sabatier analysis to mean-field models and kinetic Monte Carlo. Finally, at the macroscale, one needs to take into account transport phenomena and gradients of mass and temperature in the reactor geometry. This is a realm that is presently largely covered by continuumtype fluid dynamical models (Janardhanan and Deutschmann, 2011). At this stage, again the information from the lower scales can be embedded, i.e., here the outcome of the microkinetic models is integrated as an input, for instance in form of a boundary condition to the differential equations describing the flow phenomena (Matera and Reuter, 2010;Matera et al., 2014). The development of appropriate hierarchical models to effectively describe events at such different scales, and more so of "bridges" to transfer information between them, constitutes the core of modern multiscale modeling approaches (cf. Raimondeau and Vlachos, 2002;Reuter et al., 2005;Vlachos, 2012).
Similar kind of multiscale approaches have been or are being developed in many other areas of chemistry and materials sciences. Microkinetic models, and there prominently KMC simulations, are generally the approach used to capture the statistical interplay between elementary processes whenever the mesoscopic property or functionality to be described is outside of thermodynamic equilibrium. Besides catalysis, notable such application areas with a similar focus on surfaces or interfaces are diffusion and crystal growth. In this understanding of the use of KMC simulations we will concentrate in the following on this particular technique in these particular application areas. The present is, however, not intended as yet another extensive account of the fundamental methodology. For this we refer the reader to the many excellent reviews available in literature (Chatterjee and Vlachos, 2007;Voter, 2007;Reuter, 2011;Stamatakis and Vlachos, 2012;Stamatakis, 2014). Instead, what we want to provide is a practical guide of how to carry out such simulations (especially in the context of surface kMC), with particular emphasis on best practice recommendations as well as a discussion of current challenges and perspectives.

Rare-Event Dynamics: A Bottleneck Which Enables Its Own Solution
Many elementary processes involved at surfaces of solids exhibit high activation barriers (even despite a possible reduction of the barrier due to the presence of the catalyst in heterogeneous catalysis). These barriers are usually much larger than k B T and the corresponding processes are thus classified as rare events, if only thermal energy is there to drive them. While the motion of individual atoms (e.g., their vibrations, but also the actual reaction events, i.e., crossing an activation barrier once the system has reached the transition state) occurs on picosecond time scales, the time between consecutive high-barrier events can therefore be many orders of magnitude longer, possibly requiring simulations up to seconds or more in order to arrive at meaningful conclusions about the effect of the statistical interplay within an ensemble of multiple possible elementary processes. The "life" of our system in the long time span between these rare events is filled with vibrational motion around a single minimum on the PES. The relevant transitions to other (meta)stable states aka PES basins occur only occasionally. On a mesoscopic time scale, the time evolution of our system therefore manifests itself as a series of consecutive jumps from state to state (see Figure 2). Additionally, it is intuitive to assume that the longer the time the system spends in one basin, the more it "forgets" how it actually got there. After a while, each possible way to escape from the basin therefore becomes completely independent of the entire preceding history before entering the basin. In other words, the state-to-state jumps FIGURE 2 | Coarse-graining of a molecular dynamics (MD) trajectory into a Markov chain. (Left) A possible MD trajectory (black) overlayed on the underlying potential energy surface (PES) of the system with red regions representing lower-energy basins. A large fraction of time is spent in these PES basins in vibrational motion around the respective minimum. At a certain moment in time, the systems finds an escape route to the next basin. (Center) Coarse-graining of PES minima into positions on a suitably defined lattice. Each lattice position represents the basin of attraction of a PES minimum. (Right) Coarse-graining of the continuous MD trajectory into a Markov chain of discrete hops between the basins/lattice positions.
of the system constitute a so-called Markov chain (cf. e.g., van Kampen, 2007).
In consequence, the change of the probability P i (t) of the system to actually be in state i at time t depends only on the probabilities of hopping out of the current state i into any other state j, k ij , and on the probabilities of hopping into state i from any other state j, k ji . In the present context of chemical kinetics, these hopping probabilities are expressed as rate constants of the elementary processes with units time −1 . The overall change in P i (t) is thus governed by a simple balancing equation, called a Markovian master equation, that only contains these rate constants: From a mathematical standpoint, Equation (1) is a system of coupled differential rate equations. Seemingly simple, it unfortunately becomes quickly unfeasible to solve explicitly for the number of possible states typically involved in surface catalysis (or diffusion or growth). For a rough estimate, let us consider a system with 100 surface sites (e.g., an fcc(100) slab of ten atoms per edge in an otherwise periodic boundary condition cell to simulate an extended surface). In the course of a KMC simulation modeling a simple catalytic reaction A + B → AB, each of these sites can assume 3 possible occupation states; it can be empty, occupied by species A or occupied by species B (assuming that a formed product AB immediately desorbs into the gas phase). Already the number of possible configurations of such a trivial toy system is then 3 100 ! The matrix k ij containing all the possible rate constants between system states will thus have (3 100 ) 2 ≈ 2.66 10 95 elements, making even its comprehensive storage impossible, let alone its diagonalizationand even considering that the limited number of accessible states renders it largely sparse.
As an ingenious way out of this mess, the practical Monte Carlo type idea behind KMC is to never even attempt to deal with the entire matrix, but instead to generate stochastic trajectories that propagate the system from state to state (i.e., a Markovian sequence of discrete hops to random states happening at random times). From this, the correct time evolution of the probabilities P i (t) is then obtained by ensemble averaging over these trajectories, or, if the system is in a steady state and ergodicity is ensured, by time averaging over a singular, sufficiently long trajectory. The KMC method thus replaces the analytical solution of Equation (1) with a numerical approach based on stochastic dynamics. The "only" thing that is needed to make this work is an algorithm that generates suitable such trajectories that (once ensemble or time averaged) yield the correct probabilities P i (t). This algorithm hence needs to determine at each step along such a state-to-state trajectory to which state the system should jump next and after what time step this next jump should happen. The KMC algorithm does so by selecting elementary processes according to their rate constants, followed by an updating of the time. We will come back to these algorithmic details in section 3.

Mapping the Problem Onto a Lattice
The challenges in applying KMC in practice are largely connected to the plethora of minima on the relevant PESs, and more so, the even larger number of elementary processes connecting them (Margraf and Reuter, 2019). In the initially sketched multiscale view it would be desirable to employ rate constants calculated from first principles for all of these processes in order to establish models of predictive power. If one considers that each rate constant calculation requires in principle the determination of a transition state to get the activation barrier (vide infra), a brute-force KMC approach that requires at each step of a KMC trajectory a large number of such first-principles rate constants is at present and for the foreseeable future in general not feasible. There are currently a number of routes pursued to overcome this showstopper. One obvious remedy would be to recycle firstprinciples rate constants that have already been computed at previous KMC trajectory steps and thus build up an increasing database. In practice, this requires an unambiguous recognition scheme though that allows to identify that elementary processes that are possible at the present KMC step are identical to processes that were possible in previous steps. Another possibility is to make the rate constant calculations less demanding. This could either be done by resorting to lower levels of theory like using appropriate interatomic potentials instead of DFT or by using more approximate activation barriers e.g., from Brønsted-Evans-Polanyi relationships (vide infra) that circumvent the costly determination of the transition state. The crucial issue here is always whether then sufficient accuracy is retained to maintain the desired predictive power. This applies most prominently to surface catalysis, where activity and even worse selectivity are highly sensitive to small changes in activation barriers. Finally, one could consider only selective parts of all possible elementary processes, for instance in catalysis by focusing on certain reaction mechanisms only. If such considerations are based on reliable insight (e.g., from experimental evidence or other simulations), this can be very elegant. The grain of salt here is that KMC simulations are often employed precisely to find out which of all possible elementary processes crucially govern the statistical interplay. In other words, the objective of KMC is to identify the important parts of process space rather than to assume them in the first place.
In this situation, a prevalent school of KMC implementations resolves the problem by exploiting a crystalline order in the studied system. Under such order, it is possible to map the relevant PES minima onto some suitable form of periodic lattice. Different system states, for instance in a surface catalysis KMC simulation, differ then only in their distribution of adsorbates on the various lattice positions. This type of KMC is referred to as lattice KMC. Let us illustrate the idea with the simple example of a surface process such as the diffusion of an adatom on an fcc (100) surface. If the stable PES minima correspond to the fourfold hollow sites, we can immediately establish a lattice model where we include only diffusion processes that allow the adatom to start and end up in one of the (hollow) lattice sites. For each state of the system it is usually necessary to relax the geometric structure, and the atoms may not be exactly on the lattice positions after that. However, one has to choose the lattice in such a way so that atoms at least end up close enough to the lattice positions in order to allow for an unambiguous assignment to one lattice site. This already significantly reduces (and ensures the finiteness of) the number of possible processes, which however remains rather large.
Further reduction of the number of required rate constants can be achieved by exploiting the translational symmetry of a crystalline lattice. Let us first assume that in our diffusion toy system there is only one single adatom. The translational symmetry of the crystalline lattice then tells that the elementary processes out of any hollow site are all the same. Once we have computed the rate constants for these processes once, we can simply reuse them for the elementary processes out of another hollow site. They will be the same. In the example of our toy system and if we assume that hopping diffusion over a bridge site is the only diffusion mechanism possible, then this means that we have to simply compute just one single rate constant (for such a bridge hopping process). Here, the symmetry refers to the symmetry of the lattice positions per se, since (remember) we considered only one adatom in our toy diffusion example. If we have other species occupying nearby lattice positions, this symmetry will readily be broken. What helps in this case is the nearsightedness of chemical interactions as formulated by Walter Kohn. The interaction to some nearby species on the lattice may thus already be so small, that it has a negligible influence on the rate constant. If we neglect such influences between neighboring species to an increasing degree, we restore a complete locality of the elementary processes to the level that we have an as simple and high-symmetry situation as we had in the case of the isolated adatom. For this particular example, the most local approximation would for instance be to completely neglect any interaction with nearby other species-except for preventing diffusion processes in which an adatom would end up in an already occupied site (so-called site blocking). Despite the enormous number of possible configurations of a system with an arbitrary number of adatoms on the lattice, still only one single rate constant for hopping processes over bridge sites would be required. If we consider a (10 × 10) lattice, then the state-to-state matrix k ij in Equation (1) is still of the order 2 100 . However, it only contains elements that are either zero or have this single rate constant as their value. More generally, translational (and maybe any rotational) symmetry does thus not reduce the state space. However, it can dramatically reduce the number of inequivalent rate constants that need to be computed.
As a less drastic approximation we may consider only the lateral interactions of the adsorbate with species in nearest neighbor sites. As illustrated in Figure 3 in our simple diffusion example we will then need to compute five distinct rate constants, one for each possible occupation of the four nearest neighboring sites: one with no nearest neighbors, two with one neighboring site occupied, one with two and one with three neighboring sites occupied. The generalization to different lattices is a straightforward exercise. Augmenting the local environment that would affect the rate constants would improve the accuracy of the simulation (if such further reaching lateral interactions are indeed still non-negligible), but obviously requires even more rate constant computations. In practice, this gradual inclusion of lateral interactions on a lattice proceeds through clusterexpansion techniques, which we will further discuss in section 9. Essentially, cluster expansions then allow to modify the fundamental rate constant of a given elementary process in the absence of other species to the rate constant of the same process with any distribution of nearby neighbors.
In the lattice approximation one can therefore scan all possible configurations and transitions, compute the associated rate constants for any local configuration beforehand and save the latter in a so-called rate catalog. During the KMC run, the current configuration at a trajectory step is examined and the possible processes and their rate constants are extracted from the rate catalog. An alternative is to identify the possible processes and calculate the cluster-expansion correction to the fundamental process rate constants only on-the-fly at each KMC step. This can be numerically more efficient when many lateral interactions are taken into account, and the size and cost of searching a comprehensive rate catalog becomes intractable. Examples of this will be provided in section 9.
The approximations used so far can be extremely efficient and may allow for the possibility of using very large supercells in the performed simulations. However, detailed PES information is required and an ordered lattice is assumed as the structural motif. If the lattice model is not suitably chosen, it may neglect important minima of the PES. Simultaneously, any changes of the lattice induced by the simulated dynamical processes cannot be captured by construction (Reuter, 2016). This includes important aspects such as reaction-induced surface reconstruction, other surface morphological transitions or loading-induced lattice transformations in intercalation diffusion. The purpose of offlattice KMC is precisely to overcome such limitations and we will discuss in section 5 how the number of rate constant calculations can then be dealt with.

Mean-Field Approximation
An alternative to the full numerical solution of the Master equation (Equation 1) with KMC is to introduce further approximations (on top of the lattice approximation) that make the equation easier to solve or allow even for an analytical solution. The most common of such approaches is the meanfield approximation (MFA), where the detailed spatial resolution over the extended lattice is sacrificed and replaced by the mean coverage of each considered species at any of the site types exhibited by the lattice. Mathematically speaking, the MFA assumes that the occupation of the different sites on the lattice is statistically independent, i.e., that there are no correlations between different sites on the lattice. In the context of surface adlayers, one says that the adlayer is well mixed.
Let us begin from the (time-dependent) rate r ij (t) of an elementary process, which is given by (2) For a start, let us consider only first-and second-order processes, i.e., we assume that at most two lattice sites are involved in the elementary process. Per definition, first-order processes do not involve more than one lattice site, i.e., the assumption of uncorrelated lattice sites holds trivially. As an example of a second-order process, let us consider the reaction of two neighboring species A and B. We will denote the (timedependent) pair-probability of finding species A at site a and species B at a neighboring site b with P ab (A, B, t). In the absence of correlations, and assuming that the distribution of the species FIGURE 3 | Hopping diffusion process (red arrow) of a particle on a quadratic lattice for all possible configurations of the nearest neighbor lattice sites in the initial state. Configurations that are equivalent by symmetry are shown only once. The site where the particle diffuses to must be empty for the process to be possible.
Frontiers in Chemistry | www.frontiersin.org on the lattice is thus spatially homogeneous, we can write this pair probability as a simple product where a (A, t) and b (B, t) are the (time-dependent) spatially averaged coverages of species A at sites of type a and species B at sites of type b, respectively. Generalizing this to any reaction order, the MFA hereby condenses the high-dimensional Master equation into much simpler rate equations of the form where N ij is a geometrical factor accounting for the connectivity of the sites involved in the initial and final states i and j and the species A occupies the site a in the initial state i (Matera et al., 2011). The MFA thus yields a set of coupled differential equations, which are solvable by standard algorithms (Medford et al., 2015). In catalysis, this is often combined with certain assumptions about the rate-determining step (see section 7) to arrive even at an analytical solution for the reaction rate. While these approximations and the MFA approximation itself simplify the problem at hand enormously, one should always keep in mind that they do represent approximations. In particular the MFA is in general only fulfilled for infinitely fast diffusion and in the complete absence of lateral interactions. We will come back to this point in section 8. The first-principles input required for a MFA model is largely the same as for a KMC model, that is, all possible processes and their associated rate constants. However, due to the assumption of (infinitely) fast diffusion inherent to the MFA model, kinetic barriers for diffusion between sites of identical type do not need to be explicitly calculated.

Codes
Even though in particular in the context of surface catalysis MFA microkinetic models are still prevalent, one can clearly discern a trend toward KMC simulations. In one way or the other, it is often said that the maturity of a new simulation technique can be judged by the emergence of general-purpose software packages that are user friendly (maybe even up to providing a graphical user interface). If one takes the latter argument at face value, KMC has indeed matured dramatically in the last couple of years. A number of high-end KMC codes have appeared in an astonishingly short period of time. Referring for a more detailed (and maybe exhaustive) account of such codes to the review by Stamatakis (2014), we here compile only a brief presentation of a number of such codes to provide an impression of what is presently available: One of the first general-purpose KMC implementations was provided by Lukkien et al. in the code CARLOS (Gelten et al., 1998). In CARLOS one can specify any kind of reaction as input, then the program uses pattern recognition to identify possible reactions. CARLOS also implements time-dependent rate constants. Slepoy et al. (2008), implements several KMC solvers and is structured modularly to facilitate expansion and implementation of new models and solvers. Currently implemented models include both lattice and offlattice applications, as well as a general purpose model for the simulation of biochemical reaction networks. Stamatakis and Vlachos (2011) developed an approach that employs graph-theoretical ideas to overcome the limiting assumption that each participating species occupies a single site and that elementary events involve a maximum of two sites. Here, lattice structure and elementary events are represented as graphs, and lattice processes are identified by solving subgraph isomorphism problems during the simulation. Building on the latter code, Nielsen, Nielsen et al. (2013) developed ZACROS, which incorporates cluster expansion Hamiltonians in order to accurately account for long-range lateral interactions. The latter two approaches are suitable for treating systems with rather complex surface chemistry, including organic adlayers and more generally situations where the reactants may adsorb on the surface on multiple sites.

SPPARKS, developed by
The recently developed MonteCoffee (Jørgensen and Grönbeck, 2018) exploits similar ideas to the graph-theoretical approach, geared toward the simulation of nanoparticles. The code uses neighbor lists to represent the site connectivity, rather than mapping the problem onto a lattice. With respect to the graph-theoretical approaches, hereby the user directly controls the site connectivity.
Adaptive KMC (aKMC) approaches (vide infra) were mostly developed by Henkelmann and coworkers (Henkelman and Jónsson, 2001;Xu and Henkelman, 2008). The code EON currently includes a set of algorithms to model mesoscale dynamics (parallel replica dynamics, hyperdyamics, and basin hopping as well as aKMC). For aKMC, the code implements a server-client architecture where the client processes are responsible for saddle point searches (see section 4) to escape the current basin, then report the calculated rates back to the server which executes the KMC algorithm.
kmos  is an application programming interface based KMC framework that facilitates the generation of an abstract model definition in Python, which is then used to automatically generate efficient Fortran code. The code was instigated by Max Hoffmann and is mainly being developed in our group. We will use it for the hands-on examples in the later sections, providing a detailed practical account on how to run KMC simulations. In the following section we will further describe the KMC algorithm underlying kmos as one example of present-day codes, while in general we emphasize that different codes may implement different algorithms to solve the Master equation.

GETTING PRACTICAL: ALGORITHMS AND INPUT DATA
As discussed above, the real trick of KMC is the KMC algorithm that generates stochastic trajectories in such a way that their appropriate averaging yields the time evolution of the probability P i (t) in the Master Equation (1). One of the most commonly used such KMC algorithm, initially developed by Bortz et al. (1975) for Ising spin systems, is known as the BKL algorithm (after the authors) or the "n-fold way". It also goes under the names of Variable Step Size Method (Jansen, 1995) or Direct Method (Gillespie, 1976). For a simple, practical rationalization of the algorithm, let us consider a toy system which can assume only two states A and B connected by a barrier with associated rate constants k AB and k BA for the forward and backward transitions, respectively. In this system, only two elementary events are possible; if, say, the system is sitting in state A, it can hop to state B. As the rate constant for this hop is k AB (with unit time −1 ), one may naively think that the average time that will have passed until such a hop occurs is t AB = k −1 AB . Obviously for the hop back from state B to state A, the average time would be t BA = k −1 BA . Accordingly, a KMC algorithm would generate a trajectory where after each hop the time is incremented by t AB or t BA (depending on what hop occurred).
Mathematically, this naive thinking is not entirely correct. In reality, while being in state A for each short increment of time the system will have the same probability of finding the escape path. This generates an exponentially decaying survival statistics, whose derivative represents the probability distribution p AB for the true time of first escape: The average escape time thus has to be appropriately weighted by this Poisson distribution. It can be shown (Gillespie, 1976;Fichthorn and Weinberg, 1991) that this is achieved by advancing the system clock by where ρ 2 ∈ [0, 1] is a randomly drawn number.
When we now generalize this to an arbitrary system, where at each step along a KMC trajectory a multitude of processes i → j from the current state i to other states j are possible, the theory of Poisson processes allows to straightforwardly derive the average time that will have passed until any process has occurred. Since the elementary processes are independent, each has its own probability distribution p ij given by and the probability of the time of first escape from state i through any of the processes i → j follows as where is the total escape rate constant obtained as the sum of all the individual elementary rate constants. Just as for the single process case, the average escape time is then where ρ 2 ∈ [0, 1] is a randomly drawn number. It is crucial to highlight that this escape time only depends on the total rate constant, and is independent of the actual process that actually occurs to bring the system out of the current state i. This actual process nevertheless needs to be identified, since this is what determines to which state j the system propagates. This state j is then the starting point for the next KMC step, i.e., the next KMC step evaluates the escape from this particular state j.
The BKL algorithm determines this executed process out of all possible processes again by rolling the dice. Imagine a stack of segments of height proportional to the rate constants k ij of the possible processes, which correspondingly sums up to a total height of k tot . We can choose a process to execute out of this stack by drawing a random number ρ 1 and multiplying it by k tot . The resulting number will "point" at the process with correct probability in the process stack, as illustrated in Figure 4. The selected process is then executed, bringing our system into a different state for the next KMC step. This way of choosing out of the stack ensures that faster processes are selected with a higher probability than slower ones: They have a larger rate constant, have correspondingly a thicker segment in the stack and are correspondingly chosen more often. We therefore have a recipe to generate trajectories that satisfy the Master equation, obtained stochastically via the extraction of only two random numbers ρ 1 , ρ 2 ∈ [0, 1]. An important parameter is thereby also the random number seed used to generate the sequence of random numbers, as recalculating the trajectory with a new seed value will lead to the new trajectory that is statistically independent of the former.
The algorithm, also illustrated in Figure 4, may thus be summarized as follows: • Make a list of possible processes p for the system to escape the current state i, with associated rate constants k p ; • draw two random numbers ρ 1 , ρ 2 ∈ [0, 1]; • calculate k tot = p k p ; • extract process q, which has to fulfill the constraint q p=1 k p ρ 1 k tot q−1 p=1 k p ; • execute randomly drawn process; • update clock: t → t − ln(ρ 2 )/k tot .

RATE CONSTANTS FROM FIRST PRINCIPLES: TRANSITION STATE THEORY
As discussed in the previous sections, KMC requires rate constants for all considered elementary processes as an input. For the surface applications focused on here, these rate constants are at present predominantly obtained through Transition State Theory (TST). Making a number of assumptions, for instance that the flux of trajectories passing the transition state (TS) from state i to j will never come back to the state i (norecrossing rule) and that the barrier crossing is a purely classical event (no tunneling), TST provides a simple expression for the rate constant, known as Eyring (or Eyring-Polanyi) equation: Laidler (1987) where T is the absolute temperature, h is Planck's constant, q vib TS and q vib i are the partition functions at the transition state and at the initial state, respectively, and E ij is the activation barrier of the process. The latter is directly available from PES information and therefore accessible to first-principles calculation (e.g., semilocal DFT). The prefactor k o k B T h may in principle be calculated. Most popular is harmonic TST, where the partition functions are obtained from the vibrational modes at the initial state i and at the TS. In the predominant number of studies the considerable computational costs of these vibrational calculations are avoided though, and one simply approximates k o ≃ 1 − 10, yielding a prefactor in the range 10 12 − 10 13 s −1 . This is never really fully justified, and if at all only when the vibrational properties of the TS do not differ much from those of the initial state. A prominent class of processes where this common approximation is not valid are non-activated adsorption processes, where the prefactor needs to account for the strong entropy reduction from the gas phase to the surface-bound state. In this case the rate constant is better estimated as: (Reuter and Scheffler, 2006;Chorkendorff and Niemantsverdriet, 2017) k ads n,B (T, p n ) = S n,B (T) where p n is the partial pressure of species n of mass m, and the local sticking coefficient S n,B (T) governs the fraction of impinging particles sticking to a site B located in a surface unit cell of area A uc .

Master Equation and Detailed Balance
We will next motivate some practical guidelines that the input (processes and rate constants) to any microkinetic model (KMC or MFA) must adhere to. Let us consider a system that has reached steady state. This imposes the constraint of vanishing derivative in the Master equation (Equation 1), which leads to the condition where P * i (P * j ) is the time-independent probability that the system is in state i (j). This condition is a conservation law stating that the sum of the rates of all transitions out of any state i (j) must equal the sum of the rates of all transitions into state i (j). At thermodynamic equilibrium, microscopic reversibility and the principle of detailed balance (Tolman, 1925) imposes the even stronger constraint that the average rate of every microscopic process must exactly balance its reverse process The right-hand side of Equation (14) is thereby proportional to the states' Boltzmann weights and can thus be expressed in terms of the free energy difference between states i and j: The above derivation motivates two practical guidelines for constructing KMC (or MFA) models. The first guideline is that every microscopic process must have defined a corresponding reverse process, and the second guideline is that the rate constant expressions used for the forward and reverse processes must fulfill (Equation 15). For the latter point it is particularly crucial to realize that this also extends to computing the free energies of both states F i and F j with the same numerical approximations. In practice, this is often neglected (different supercells/configurations, mix of first-principles and empirical data etc.) and can then have drastic consequences as the kinetic model is not thermodynamically consistent (Schmitz, 2000;Mhadeshwar et al., 2003). Quite some work has therefore been devoted to achieve an overall thermodynamic consistency, e.g., Mhadeshwar et al. (2003) and Nielsen et al. (2013).

Obtaining Rate Constants: Transition State Search
In the context of determining rate constants, it is natural to put a primary focus on the lowest activation barriers that need to be overcome, e.g., in catalysis on the minimum energy path connecting reactants and products. From the mathematical standpoint, locating the lowest barrier(s) translates into locating the lowest first-order saddle point(s) on the PES, which is a particularly challenging task for which-in contrast to locating PES minima-there is yet no general approach that is guaranteed to work. In the following we will briefly discuss the reliability, accuracy and performance of different available methods. Always keeping in mind that "your mileage may vary, " there is nevertheless a general guideline as to which family of approaches would be preferable depending on the nature of the problem at hand. For lattice KMC one would assume or derive the mechanism (e.g., the reaction mechanism in the context of heterogeneous catalysis) and subsequently compile a list of the elementary processes that constitute this mechanism. In this case, initial and final states are therefore pre-determined, enabling the use of so-called interpolation methods. For adaptive KMC, more often than not no previous assumption of mechanism is made and one may need to blindly explore the possible and most probable escape pathways from a current state. So-called local methods are then mandatory, as only information on the initial state is available.

Interpolation Methods
The simplest form of interpolation-based TS search consists in identifying a reaction coordinate guess in one or a small number of internal degrees of freedom, preferring those that describe the main structural differences between initial and final state. The selected coordinates are subsequently constrained to specific values between the initial and final structures, while all remaining degrees of freedom are optimized. Such TS searches are often referred to as coordinate-driving (drag) methods (Halgren and Lipscomb, 1977;Rothman and Lohr, 1980;Williams and Maggiora, 1982). The success of drag methods depends critically on the ability to choose a good set of reaction coordinates and on the topology of the PES in the direction of the remaining degrees of freedom. In general, if the reaction path is dominated by only one or two degrees of freedom, the coordinate driving may work, and the constrained optimized geometry (with the smallest residual gradient) is a good approximation to the TS. On the other hand, a bad choice of reaction variable(s) may lead to hysteresis and converge to (unphysical) discontinuous reaction paths (Halgren and Lipscomb, 1977;Henkelman et al., 2002).
Drag methods operate on one structure at the time. A significant improvement is achieved by simultaneously optimizing multiple points along the initial guess of the reaction path. An example is the ridge method (Ionova and Carter, 1993), which iteratively refines an initial guess of the TS by simultaneously relaxing two replicas of the latter, slightly displaced across the ridge of the PES, until they contract to the actual TS. Methods that operate with more than two structures are often referred to as "chain-of-state" methods. The initial distribution of structures will typically be along a linear interpolation of coordinates between the initial and final, or any convenient form of continuous variation along a chosen reaction path. All intermediate states or images are then optimized simultaneously in some concerted way, providing not only the saddle point, but also a good approximation of the entire reaction path.
Among those, the Nudged Elastic Band (NEB) method (Jónsson et al., 1998; is arguably most popular as it incorporates strong points of older approaches in order to cure their shortcomings. After initializing an initial chain of images R i , NEB minimizes a target function ("elastic band") defined as the sum of energies of all intermediate images and an additional penalty term which distributes the points along the path through a single spring constant k (see Figure 5): In general, a straightforward minimization of S EB would exhibit a tendency to "cutting corners" if the spring constant k is too large, and "sliding down" toward the extrema if k is too small (thus undersampling the actual TS region). These problems are alleviated by "nudging" the elastic band, i.e., by using only the component of the spring force parallel to the tangent of the path (to cure for corner-cutting), and only the perpendicular component of the energy force (to cure for down-sliding). The total force acting on each image is then τ i being the tangent unit vector at image i. In the Climbing-Image NEB (CI-NEB)  variant, the image with the highest energy is selected after a few iterations and driven up toward the saddle point by turning off its spring force and reversing the component of the potential force parallel to the chain. This yields exactly the saddle point (which in the nonclimbing version is obtained by interpolation) at no additional computational cost.
The NEB method is still likely to run into trouble when dealing with a PES for which the energy varies largely along the reaction path, but very little perpendicular to it. Regarding this problem, it has been pointed out that for CI-NEB corner cutting does not affect the accuracy of the TS, and that a more robust relaxation to the TS may be achieved by using the full spring force rather than only the component parallel to the tangent of the path (Kolsbjerg et al., 2016). Furthermore, as all such algorithms in its category, NEB comes at a high computational cost as it involves the optimization of many structures and typically requires a rather large number of iterations. Technical parameters such as the number of images and the value of the spring constant must be wisely chosen beforehand. The latter issue in particular may prove to crucially influence the optimization efficiency: a small value causes an erratic coverage of the reaction path, while a large value focuses the effort on distributing the points rather than on finding the reaction path, and consequently slows down the convergence. In the traditional NEB method the number of images are fixed during the simulation and it can be challenging to reach a good compromise between a sufficient coverage of the reaction path and the computational cost. To mitigate this, an automated NEB (AutoNEB) algorithm has been proposed (Kolsbjerg et al., 2016), which can save computational resources by focusing first on converging a rough path before improving on the resolution around the TS. An alternative solution to the same problem has been proposed in the truncation-based energy weighting string method (Carilli et al., 2015), which uses energy weighting to focus the computational effort on the physically interesting images within the barrier region. Finally, it is also worth mentioning the growing string method (GSM) (Peters et al., 2004), which circumvents the need for a (good!) initial guess of the reaction path by separately evolving two string fragments, one associated with the reactants and the other with the products, until the fragments converge and thereby define the reaction path. The combination of GSM with an eigenvector following TS search (Zimmerman, 2013) has shown promising results for a benchmark set of more than 100 elementary reactions.

Local Methods
Keeping in mind that transition states are points where the gradient vanishes, they may in principle be located by minimizing the gradient norm. This is exactly the working concept behind the so-called local methods. In contrast to interpolation methods, local methods only use information of the PES function and its first and possibly also second derivatives at each point, i.e., they require no knowledge of the initial-state and/or final-state geometries. They do, however, usually require a good estimate of the TS to use as a starting geometry in order to converge. This is one of their main limitations.
A most common member of the local group of methods is the Newton-Raphson (NR) approach, which locates the TS directly, given that one starts the search sufficiently close to the TS. Sufficiently close here means already in the harmonic region with the Hessian having exactly one negative eigenvalue. Under these conditions, computing the Hessian and inverting the second order Taylor expansion directly yields the step which maximizes the energy along the TS eigenvector and minimizes the energy along all other directions, converging exactly to the TS. The main drawback of the NR method is the need for generating and manipulating the full Hessian matrix.
However, the main function of the Hessian for saddle point optimization is to provide the direction along which the energy should be maximized (lowest ascent if sufficiently near the TS). The dimer method (Henkelman and Jónsson, 1999) can be employed to determine this direction without calculating the Hessian explicitly, employing two symmetrically displaced replicas-the dimer (see Figure 5), which is used to transform the force in such a way that optimization leads to convergence at a saddle point rather than at a minimum. In general, the strategy for the dimer method is to try many different initial configurations around a minimum, usually taking them from the extrema of a short high-temperature MD trajectory in order to find the saddle points that lead out of that basin. In a first step, the dimer is minimized with respect to orientation by imposing a constrained distance between the images. The lowest mode direction is then given by the line connecting the two images, and this can be used for displacing the central structure, i.e., translating the dimer, which is then followed by a new dimer optimization and so on. The force acting on the center of the dimer gets modified by inverting the component in the direction of the dimer: minimization of this force drives the dimer to a saddle point. A dimer optimization can be done using only first derivatives, and thus alleviates the need for calculating the Hessian matrix. In general, however, performance scaling with system size is not really known. It is unclear in particular whether the added computational cost of optimizing each dimer configuration eventually outweighs the saving by not requiring an explicit Hessian.

BEP and Scaling Relations
An alternative, cheaper approach to determine activation barriers, which still retains grounds in first principles, is provided by the employment of approximate energy relations. The most prominent example is the Brønsted-Evans-Polanyi (BEP) relation (Nørskov et al., 2002;Michaelides et al., 2003), which yields linear relationships of the kind where c 1 and c 2 are constants and E FS and E IS are the total energies of the final and initial states, respectively. The latter are obtainable from local geometry relaxations, and hence significantly cheaper than even the sloppiest TS search. The two parameters c 1 and c 2 need to be determined by linear fitting to appropriate first-principles calculations. The parameters are in general only transferable to site types that are similar to those used in the fitting, e.g., the fcc(111) sites of transition metal surfaces, but they can be rather universal among different kinds of reactions (Wang et al., 2011). Furthermore, it has been shown that the binding energies of many reactants, products and intermediates at transition metal surfaces correlate with the binding energies of the few base elements (mostly C, N, O, S, halogens) with which the molecules typically bind to the surface (Abild- Pedersen et al., 2007). The employment of such scaling relations, combined with BEP relations, enables an enormous reduction of the computational cost of getting first-principles rate constants for applications such as catalyst screening, where a large number of rate constants have to be computed for "similar" processes.

GARBAGE IN-GARBAGE OUT
As always in modeling, it is important to realize that the predictions made from a model are limited by the quality of the input data to the model, commonly expressed as the garbage in -garbage out (GIGO) principle. The necessary input data to a KMC model are the possible processes as well as their associated rate constants. As discussed above, in the context of first-principles KMC simulations where the rate constants are determined by DFT or other electronic structure calculations, these input data are at present typically determined beforehand. In other words, rather than the KMC simulation identifying by itself what processes are important or should be considered and at which accuracy, the simulation depends on a fixed given list of such processes with rate constants that come with the typical uncertainty imposed by the underlying DFT calculation (the rate catalog). This rigid setup leads to a high sensitivity of the KMC simulation results on this input data.
As already mentioned initially, advanced KMC approaches that overcome at least some of the limitations of this prevalent rigid input-data setup are a long sought goal and topic of current research. One possibility to automatically identify the relevant processes in a system would be to use (accelerated) MD approaches like hyperdynamics, temperature-accelerated dynamics or replica-exchange dynamics beforehand (Voter et al., 2002). In adaptive (on-the-fly) KMC a process search using the dimer method or high-temperature MD simulations is directly integrated into the KMC algorithm (Henkelman and Jónsson, 2001;Chill and Henkelman, 2014). A great advantage of the latter methods is also that the KMC model does not necessarily have to be implemented on a fixed lattice. However, they are generally also much more computationally demanding, as they require many orders of magnitude more energy and force evaluations to determine all processes and their barriers. Applications have therefore been limited to rather simple systems or systems where the energy and force evaluations can be done with classical force fields instead of DFT (Xu and Henkelman, 2008;Konwar et al., 2011;Pedersen et al., 2015). At least for the time being, firstprinciples KMC simulations in the application areas covered here are instead in practice only tractable within the rigid setup, which is why we concentrate on it from now on in this practical guide.
In the following we will illustrate the above-described concepts using a simple model for the diffusion of Au adatoms on a Au(100) surface. This will highlight possible pitfalls, but will also provide guidance for best practice. All discussed KMC models have been implemented in the kmos software package  and are available in the Supplemental Data.

Adatom Diffusion on Au(100)
The simplest diffusion process one can think of in this system consists of the hopping of Au atoms between the 4-fold hollow adsorption sites offered by the Au(100) surface lattice (see Figure 6A). For a simple KMC model of this system we will consider a (20×20) square lattice of adsorption sites. The possible processes are the hops of particles from one site into one of the four neighboring sites up, down, left and right. Neglecting any possible lateral interactions between Au adatoms, the barrier for any of these processes is the same and at the DFT-LDA level it has been calculated to E diff = 0.83 eV (Yu and Scheffler, 1997). We will use the TST expression (Equation 11) to express the rate constant of the diffusion processes in terms of this barrier. For simplicity we will ignore any entropic corrections to the barrier and zero-point vibrational energy corrections, i.e., k o = 1. In order to run the KMC model we need to fill some of the lattice sites with particles. If we were to initialize the simulation with an empty lattice, we would directly hit a deadlock where no processes are possible. In this example we prepare the initial state by randomly filling sites with Au adatoms corresponding to a coverage of 0.25 monolayer (ML), i.e., every fourth surface site is occupied. As the output of the simulation we calculate the diffusion coefficient of the Au adatoms by tracking the mean squared displacements (MSD) (Garhammer, 2017) as a function where t is the simulation time and d is the lattice dimension (2 in this case). The averaging is performed over all Au adatoms. In order to improve the statistics 25 independent simulations were run, which differed from each other in the random initialization of the lattice and in the random number seed used. The pink line in Figure 7 shows the average of these 25 simulations from which the diffusion coefficient is determined to be 0.0022 nm 2 /s. Such hopping diffusion processes are, however, not the only mechanism possible for self-diffusion on metal surfaces. Both experimentally and theoretically an alternative exchange diffusion mechanism has been described (Bassett and Webber, 1978;Wrigley and Ehrlich, 1980;Feibelman, 1990;Yu and Scheffler, 1997), in which a metal adatom replaces a surface layer atom and pushes it to a neighboring adsorption site (see Figure 6B). With the adatom and surface atom of the same species, this effectively also results in a net displacement. In Yu and Scheffler (1997) it was found that this process occurs with the lower barrier of only 0.65 eV on Au(100), and should therefore dominate over hopping diffusion. When we additionally allow for this diffusion process in the model, the output of the simulation changes indeed dramatically, since the barrier enters the rate constant for diffusion exponentially. Including exchange diffusion, the diffusion coefficient is now determined to be 4.7 nm 2 /s, see brown line in Figure 7, i.e., more than three orders of magnitude higher than before. It can also be noted that the timescale reached in the simulation (for the same number of total KMC steps) is much shorter with exchange diffusion. In general, the time advanced with a KMC step, t escape , and therefore the total timescale that is computationally reachable during a simulation, depends on the sum of all rate constants (see Equation 10), which is dominated by the fastest process in the system.
The above example thus highlights the extent to which the outcome of a KMC simulation is dependent on knowing and allowing for all relevant processes in the model. This is a severe limitation and requires utmost caution and care in setting up a KMC model. The likelihood to overlook a non-intuitive process such as exchange diffusion cannot be overemphasized and thenalas-the GIGO principle applies with full might. Another drawback of first-principles KMC is that the DFT energies entering the rate constants can have rather large errors associated with them. For processes at extended surfaces often low-rung semi-local DFT functionals (possibly with some +U correction) are still the state-of-the-art. This means that DFT errors on barriers may easily be on the order of 0.1-0.2 eV. Considering that these barriers enter the rate constants exponentially, see Equation (11), the associated rate constants could be wrong by orders of magnitude. However, in general, errors associated with the various rate constants entering a KMC model do not all have a similar effect on the output of the simulation. To illustrate this, we plot in Figure 7 the output of the diffusion model including both hopping and exchange diffusion, when lowering the barrier for hopping diffusion by 0.05 eV (dark green line) and by 0.1 eV (blue line). As can be seen, the result only begins to change significantly when the barrier for hopping diffusion comes within few tens of meV of the barrier for exchange diffusion. In contrast, lowering the rate constant for the dominant exchange diffusion process by only 0.05 eV (light green line) leads to a seven times higher diffusion constant. Again, this is a result of the exponential dependence of the rate constant on the barrier. In other words, a DFT error of 0.1 eV on hopping diffusion would not change the output of the simulation, while the same DFT error on exchange diffusion would lead to huge changes in the output.
While this behavior is trivial to guess in this simple twoparameter model, KMC models, particularly in catalysis, can be much more complicated with very many reaction steps and competing pathways. It is then of interest to identify more systematically those processes and their input rate constants that are most important for the outcome of the simulation. Within the more general context of multiscale modeling, viewing first-principles KMC simulations as a hierarchical multiscale modeling setup combining an electronic structure with a mesoscopic statistical technique, such endeavors are called sensitivity analyses. As has hopefully become clear from this simple adatom diffusion model, for the rigid KMC setup such analyses are absolutely pivotal to assess the meaningfulness of the obtained results. In addition, such analyses also provide important insight into which processes are those that control the kinetics, as it is the rate constants of these processes that critically determine the simulation outcome. Such mechanistic insight is another important reason for conducting KMC simulations. In catalysis, these controlling processes are called rate-determining steps, and identifying rather than assuming them for a catalytic cycle is one of the big assets of comprehensive KMC simulations. We will come back to this topic in section 7 after having discussed examples for such more complex catalysis KMC models in the next section.

STEADY-STATE AND TRANSIENT SIMULATIONS FOR SURFACE CATALYSIS
For catalysis applications one is often interested in the behavior of the system once it has reached steady state (see Equation 13). Since the system is open (constant influx of reactants, constant outflux of products), this still requires kinetic simulations even though the quantities of interest are per se generally not time-dependent. Major such quantities of interest are the surface composition (for instance in form of averaged coverages of different adsorbates/reaction intermediates), reaction mechanisms and production rates of various chemicals. The latter reaction rate is often expressed as a turnover frequency (TOF), which is the average rate of production of a certain molecule per second per surface site (or surface area). Alternatively, for analysis methods like temperature programmed reaction (Jansen, 1995;Rieger et al., 2008), cyclic voltammetry (Rai et al., 2006) or titration (Piccinin and Stamatakis, 2014) one might be interested in the transient behavior of a system prepared in a given initial state. In the next section we will introduce a simple catalysis KMC model for CO oxidation on RuO 2 (110) and illustrate the preparation of various initial states and possible pitfalls with the relaxation to the steady-state solution.

CO Oxidation on RuO 2 (110)
The CO oxidation model is taken from Reuter et al. (2004) and we refer to this publication for its more detailed motivation. In this model, the RuO 2 (110) surface is considered to contain two types of active sites, br (bridge) and cus (coordinately unsaturated) sites, arranged in an alternating rectangular lattice as shown in Figure 8. Each site can be either empty or occupied by O or CO. A total of 26 processes are possible in this system, covering non-dissociative CO adsorption/desorption, dissociative O 2 adsorption/desorption, diffusion of O and CO, as well as CO 2 formation, where for each reaction type all combinations of site types are taken into account. The formed CO 2 is assumed to desorb instantaneously and irreversibly due to its weak binding to the surface.
In Figure 9A we show the temporal evolution of the system beginning from an initial state corresponding to an empty (20 × 20) lattice and at 1 bar O 2 and CO pressure and a temperature of 450 K. At these high pressures, it is clear that a substantial fraction of the surface will be covered with O or CO in the final steady state. Indeed, in the very first KMC steps (i.e., on nanosecond timescales!), this coverage builds up quickly. The O coverage builds up roughly double as fast as the CO coverage, since every O 2 adsorption event leads to the appearance of two O atoms on the surface in contrast to only one CO in a CO adsorption event. Already after about 20-30 ns one could be mistaken to assume that the system has reached the steady-state solution, since both the TOF and the surface coverages become constant. These coverages roughly reflect the impingement situation, with about 2/3 of all cus and br sites covered by O and about 1/3 of all cus and br sites covered by CO. However, this is not the true steady state! During prolonged KMC simulation (now on a timescale of seconds!) the coverages change again dramatically, until only about 1/3 of all br sites are covered by O (the rest by CO) and essentially all cus sites are covered by CO. In the course of these longer term changes, the TOF decreases by more than three orders of magnitude as compared to the premature apparent steady state. The reason for this two-timescale behavior is that the double as high impingement of O atoms onto the initially empty lattice fills a lot of br sites with oxygen. These O br atoms are very strongly bound and quite unreactive. They will definitely not desorb readily and it requires CO oxidation reactions with comparably high barriers (small rate constants) to remove these O br atoms once they have formed. The latter processes therefore take much longer than the initial filling, and since in this longer-term transformation CO also replaces almost all of the highly reactive O cus species, the TOF also collapses.
As impressively demonstrated by this example, identifying what is the true steady state in a KMC simulation can be a major challenge. In fact, this holds even more since what one would really want is to have some form of automatized algorithm that would flag once a simulation has reached steady state. In practical applications, dozens, if not hundreds of different KMC simulations need to be conducted, for instance evaluating different gas phase conditions or determining the influence of changes in the catalyst surface composition (doping with certain metal atoms etc.). It becomes increasingly impractical and cumbersome to having to monitor central quantities of interest during an ongoing KMC simulation and then judge manually whether steady state has been reached. Unfortunately, fully automatic and fool-proof such steady-state detection (SSD) is not yet available for KMC simulations. In contrast, quite some knowledge is for instance available in the area of signal processing and process control (Cao and Rhinehart, 1995;Kelly and Hedengren, 2013). Only very recently such algorithms have also found their way into the KMC community (Núñez et al., 2017;Nellis et al., 2018). Typically, they are applied to several properties of interest (reaction rates, coverages, total lattice energy etc.) in order to avoid a false-positive detection of the steady state. Even so, further testing and method development will probably be needed to ascertain whether they can always be applied in a foolproof manner.
In the absence of such sophisticated SSD algorithms, one pragmatic approach is often to have a lot of knowledge of the studied system and be really cautious (certainly not a foolproof and elegant solution though). The other, complementary approach is to start simulations from mindfully chosen varying initial conditions and then monitor if the same steady state is reached. This implies that the system does not exhibit true multiple steady states. Such behavior is well known from the solution of differential equations, for instance in the context of MFA microkinetic models (Ramachandran et al., 1981). For KMC simulations in the context of surface catalysis such true multiple steady states that are not the result of a prematurely assumed convergence of the simulation have not been reported, and differentiating between the two cases would likely also be involved. In any case, it never hurts to run several KMC simulations starting from different initial conditions and then monitor where they converge to. Obviously, the closer the initialization is chosen to the final steady state, the faster the simulation will likely converge.
Notwithstanding, one might also be trapped in preconceived configurations, when for instance initializing a simulation with a steady-state configuration determined in a preceding (similar) simulation. For the CO oxidation model we illustrate in Figure 9B how starting from a completely different initial population does in fact lead to the same steady state. Here, the initial state corresponded to 1 ML oxygen coverage on the cus sites and vacant bridge sites. For nanosecond timescales we see again the appearance of a (different!) quasi-steady-state solution where the cus sites remain occupied by O, while the bridge sites are covered by about 0.35 ML CO and about 0.65 ML O. However, similar as for the other initialization, on the timescale of seconds the system then transforms again and we arrive at the same steady-state solution as found there. Having understood the reason for this two-timescale behavior, a most suitable initialization would instead be to prepare all br sites with CO to prevent the initial massive buildup of O br . In this case, the true steady state is in fact reached extremely quickly (not shown). Yet, such detailed insight into the chemistry of the system is rarely present in the first place and the approach of starting from largely different initializations is often the best and hopefully reliable shot we have.
Once the steady-state solution has been reached, one can make use of the (hopefully present) ergodicity of the KMC simulation to calculate the desired quantities (surface composition, occurrence of various elementary steps, TOFs) as time averages instead of ensemble (trajectory) averages. The average reaction rate, r β , for the production of a given molecule β can for instance be calculated as where t KMC is the total KMC simulation time, the first sum runs over all KMC steps n (up to a total of N KMC steps), the second sum runs over all states j that are accessible from the current state i, k β ij is the rate constant for a process involving the production of the molecule β, and t escape,n is the escape time for KMC step n. The total simulation time should be chosen long enough to reduce the statistical error on the sampled quantities to a desired value. When calculating statistical errors over successive trajectory fractions, the fraction simulation time must also be chosen long enough that each trajectory fraction is statistically independent of the other fractions. The required time is known as the decorrelation time and it describes the time it takes before the current system configuration is uncorrelated from the initial system configuration in the trajectory fraction. In case the quantity of interest is the transient behavior of the system prepared in a given initial state (e.g., to simulate the aforementioned temperature programmed reaction Jansen, 1995;Rieger et al., 2008, cyclic voltammetry Rai et al., 2006, or titration Piccinin and Stamatakis, 2014, several statistically independent trajectories must be calculated and averaged. The statistical independence can for example be achieved by using different random number seeds, as was done for the adatom diffusion model in section 5.1 above.

SENSITIVITY ANALYSIS AND UNCERTAINTY QUANTIFICATION
As already motivated in section 5.1 it can be of particular interest to know the extent to which the rate constants of the various processes in a model influence the model predictions and to thereby quantify the uncertainty of those predictions. This is generally known as sensitivity analysis and uncertainty quantification (UQ). Some of the main questions these methodologies seek to answer are the following: (i) Error propagation: How do errors introduced at the level of theory used to calculate rate constants propagate to the model predictions? Which rate constants are most important to calculate with a high accuracy (maybe then recursively refine such calculations once the critical rate constants are known)? What conclusions can be reliably drawn despite the errors? (ii) Design and optimization: What are the limitations to achieve an optimal performance of e.g., a given catalyst or battery material? How should rate constants be varied (e.g., by varying the material) to achieve such optimal performance? One sensitivity measure introduced specifically for catalysis is Campbell's degree of rate control (DRC) X RC,I for reaction step I (Campbell, 1994) where the average reaction rate r β should be calculated for the product β of interest and the derivative is evaluated holding constant the rate constants k J of all other reactions steps J and the equilibrium constant K I for step I. A positive (negative) DRC signifies that the reaction rate will increase (decrease) when increasing k I , whereas a value of zero signifies that the reaction rate is insensitive to variations in k I . The DRCs follow a sum rule, which states that the sum of all DRCs is equal to one (Meskine et al., 2009;Hoffmann et al., 2017). A single non-zero DRC equal to one then signifies a single rate-limiting step in the reaction mechanism, while in general several steps can be rate-limiting at the same time. The fact that the equilibrium constant for step I is held constant means that both the forward and reverse rate constants for step I are varied simultaneously, which can also be viewed as a variation of the TS energy of step I (for activated processes). Later, the DRC concept was extended to a thermodynamical version where instead the energies of reaction intermediates are varied (Stegelmann et al., 2009). Obviously, these sensitivity measures can easily be extended to other quantities of interest than reaction rates. From a practical point of view, the main challenge in evaluating the derivative entering the DRC expression is that we don't have an analytical expression for the reaction rate in KMC. Relying instead on a finite-difference sampling, very long simulation times are typically required to reduce the statistical error sufficiently (Meskine et al., 2009). A more efficient three-stage approach has recently been proposed (Hoffmann et al., 2017), which allows for the direct sampling of sensitivity measures from a single KMC trajectory. DRC sensitivity measures are formulated as a linear response theory, meaning that the result is only valid locally in the input parameter space. However, kinetic models are most often highly non-linear and the DRC can change substantially over rate constant variations corresponding to a DFT error of e.g., ±0.2 eV on reaction barriers. Recently, a number of methods have therefore been developed for global sensitivity analysis and UQ, i.e., to assess which conclusions about the model can be reliably drawn despite uncertainties in the input parameters (Sutton et al., 2016;Döpking and Matera, 2017). Sutton et al. (2016) furthermore addressed the fact that the errors in the input parameters for KMC models are often highly correlated. Such correlations can arise because the used DFT functional might generally over-or underestimate certain kinetic parameters. Corresponding DFT functional correlations have been exploited to assess the uncertainty of reaction rates in a MFA microkinetic model for ammonia synthesis (Medford et al., 2014) carried out using the Bayesian error estimation functional with van der Waals correlation (BEEF-vdW) (Wellendorff et al., 2012). The latter functional provides not only a single value for a given kinetic parameter, but an ensemble of values generated by sampling known uncertainties in the exchange-correlation model parameters. Another source of correlations in kinetic parameters is the existence of correlations in the adsorption energies of chemically related intermediates and TSs, generally known as scaling relations (Nørskov et al., 2002;Michaelides et al., 2003;Abild-Pedersen et al., 2007) (see section 4.2.3). Both sources of correlation were considered in the sensitivity analysis and UQ carried out in Sutton et al. (2016) and applied to a kinetic model for ethanol steam reforming. The method allows to assess whether a proposed reaction mechanism can be considered to agree with experimental data within the known DFT errors and can reveal limitations such as the failure to take into account support effects in heterogeneous catalysis models.
To be quite honest, no such sophisticated sensitivity analysis is yet really on the agenda of the large majority of practitioners. As a very crude and simple advise in the context of this practical guide to surface Monte Carlo simulations, we would thus recommend to always at least vary some of the key rate constants in a KMC model by hand and see how this changes the simulation result. This is straightforward to do and it provides a crude (possibly incomplete, but still better than none) picture of what could be the critical input and what are the dependencies in the studied reaction network. In case this flags a critical sensitivity, one can and should escalate from there. Honestly, if the gist of the story one extracts from KMC simulations depends critically on highly specific numerical values of one or a few rate constants and one knows that there is a large uncertainty in these values, then who wants to stick their neck out?

TIMESCALE DISPARITY PROBLEM
KMC achieves an enormous speedup over MD simulations by avoiding the explicit treatment of the vibrational degrees of freedom of the system, and instead considering only the rare events such as adsorption/desorption, diffusion or reaction steps (see section 2.1). However, also those rare events that are treated in the KMC simulation can occur at largely different timescales. When this is the case, almost all of the CPU time is spent simulating the fast processes, while the (maybe truly important) dynamics arising from the slow processes is sampled insufficiently or not at all. This is especially problematic for KMC models of surface reactions on metals, where often very fast surface diffusion processes and slow surface reactions occur simultaneously in the reaction network.
A wide variety of methods has been developed to deal with this timescale disparity problem. The τ -leap method is an approximate procedure in which the KMC simulation is accelerated by the firing of multiple processes at once (Gillespie, 2001). The underlying assumption (leap condition) of this method is, however, only fulfilled when surface populations are approximately constant during the coarse time increment τ . Hence, the method does not apply to surface reactions on microscopic lattices, where site populations can change dramatically (e.g., from zero species to one species) from one KMC step to the next. In practice, the method is only applied to coarse-grained lattices where the concentration of species within one larger coarse-grained cell is approximately constant in time. Other methods rely on a separation of the processes into "slow" and "fast" processes. They treat then only the slow process dynamics stochastically at the KMC level, while the fast process dynamics is treated deterministically or using a Langevin equation (Haseltine and Rawlings, 2002;Salis and Kaznessis, 2005). A main drawback of this class of methods is that the process timescale separation has to be specified by the user in advance and remains fixed throughout the simulation.
In practice, recent KMC works have therefore often relied on simple acceleration schemes that function by decreasing the rate constants of the fastest processes in the system in order to make the span in process timescales tractable (see Figure 10). For simple reaction networks such rate constant scaling, together with verification that the model output is not affected by the scaling, can be carried out manually Piccinin and Stamatakis, 2014;Lorenzi et al., 2016;Jørgensen and Grönbeck, 2017). Recently, algorithms have also appeared that take care of the scaling of fast processes automatically, without the user having to specify those processes in advance (Chatterjee and Voter, 2010;Dybeck et al., 2017). A main assumption of these methods is that fast processes become quasi-equilibrated after a limited number of executions, i.e., it is assumed that continued simulation of these processes is not necessary to correctly describe the system dynamics. The accelerated superbasin KMC (AS-KMC) method from Chatterjee and Voter (2010) defines a superbasin as a set of lattice configurations that the system can jump between through the execution of quasi-equilibrated processes only (see Figure 10). The execution of a non-equilibrated process then defines the entering of a new superbasin. The goal of the acceleration algorithm thereby becomes to encourage the system to leave the current superbasin at an earlier time through the scaling of fast, quasi-equilibrated processes. A drawback of this method for complex systems is that the total number of system configurations can be very large, which may cause the algorithm to become inefficient since the full sampling of even a single superbasin becomes exceedingly slow.
This problem was addressed in the method presented in Dybeck et al. (2017), where rather than tracking both system configurations (superbasin states) and processes, only the executions of some user-defined reaction channels are tracked. A reaction channel could for example be the adsorption/desorption of some species at a given site type, independently of where on the lattice this reaction occurs, and the scaling of rate constants is then applied to the whole reaction channel. Scaling still only FIGURE 10 | Potential energy surface (PES) for a system with processes occurring at disparate timescales due to large differences in the barriers. A set of states connected by fast (low-barrier), quasi-equilibrated processes constitutes a superbasin. The system can escape from one superbasin to another through the execution of a slow (high barrier), non-equilibrated process. The KMC simulation is accelerated by scaling the rate constants (increasing the barriers) of fast, quasi-equilibrated processes. This decreases the time it takes for the system to leave the current superbasin and thereby enhances the sampling of neighboring regions of the PES (other superbasins).
occurs for processes that have been executed a minimum number of times within the current superbasin and for which the number of forward executions is roughly equal to the number of reverse executions to within some tolerance. Once a non-equilibrated process was executed, the rate constants are unscaled again to allow for sufficient sampling of the new superbasin-and the process is started over again. The algorithm was shown to work well for a reaction model of Fischer-Tropsch synthesis over ruthenium nanoparticles (Dybeck et al., 2017). Very similar approaches to KMC acceleration, but excluding the unscaling step, have been followed in Núñez et al. (2017) and Hoffmann and Bligaard (2018). In Núñez et al. (2017) the acceleration scheme was employed together with efficient sensitivity analysis beyond finite differences (similar to the method from Hoffmann et al. (2017) that was discussed in the preceding section) for improved sampling of sensitivity measures also in KMC models characterized by large disparities in the timescales.
In Andersen et al. (2017) the algorithm from Dybeck et al. (2017) was implemented in the kmos code and applied to a trend study of CO methanation over stepped surfaces of the transition metal series using scaling-relation-based rate constant expressions. Also for this more complex reaction mechanism and the consideration of many different catalyst surfaces, the acceleration algorithm generally performed well. However, some challenging cases leading to a breakdown of the algorithm were also identified. A particularly problematic case for the algorithm was shown to be the occurrence of reactions between two lowcoverage species, which are both produced in independent, quasi-equilibrated reaction steps. In this case the algorithm may scale the rate constants of the quasi-equilibrated steps too aggressively, leading to insufficient or no sampling of lattice configurations where the two low-coverage species are found at neighboring lattice sites, as required for their reaction. This problem is likely related to the fact that the algorithm from Dybeck et al. (2017) does not track system configurations and therefore cannot verify if all configurations within a superbasin have been sufficiently sampled. A simple correction scheme that takes into account lattice configurations of the nearest neighbor sites in the definition of the reaction channel was proposed in Andersen et al. (2017). This was shown to work well for the simple case of a reaction between two low-coverage species formed directly at neighboring sites. However, it does not apply if the low-coverage species are produced at distant lattice sites and rely on diffusion steps before the reaction step.
In some sense it is ironic that KMC is so particularly challenged by fast diffusion steps, considering that its effective competitor in form of MFA microkinetic models is challenged by slow diffusion steps. Even in the absence of lateral interactions, which would generally be used to argue in favor of the validity of the MFA, such slow diffusion processes can prevent the system to ever reach the well-mixed state assumed in the MFA. At metals, this was shown to happen at strongly binding step sites (Andersen et al., 2017), whereas at oxides this might arise simply from the higher diffusion barriers at these more open compound materials. For the KMC model of CO oxidation on RuO 2 (110) discussed in section 6.1 a corresponding MFA breakdown could for instance be traced back to a relatively high barrier of approximately 1.6 eV for O diffusion (Temel et al., 2007;Matera et al., 2011;Exner et al., 2015). In this respect, this leaves essentially no system of interest in surface catalysis where one could generally expect mean-field kinetics to yield the right answers: At metals, the MFA is typically invalidated by strong lateral interactions at nearby sites (see next section), at compounds like oxides high diffusion barriers prevent a sufficient mixing of the adsorption layer. The problem is then that even though modern KMC implementations like those discussed in section 3 have become dramatically more efficient, they are typically still more demanding than mean-field models. The timescale disparity problem adds significantly to this and even though most recent acceleration algorithms have become better, this problem is not yet fully solved. This still leaves many users to resort to MFA microkinetic modeling, even though it is likely not correct. Alternatively, algorithms that are intermediate between MFA and KMC in terms of accuracy and computational cost such as the quasichemical approximation have also been applied to catalytic reactions (Hellman and Honkala, 2007) and have in some cases been shown to reproduce KMC simulations at significantly reduced cost.

LATERAL INTERACTIONS
Lateral interactions are interactions between species adsorbed to a lattice. They can be either attractive or repulsive depending on the chemical nature of the involved species and the surface or bulk material defining the KMC lattice. Several recent studies have found that lateral interactions are essential to take into account for a correct description of the system dynamics (Stamatakis and Piccinin, 2016;Jørgensen and Grönbeck, 2017;Piccinin and Stamatakis, 2017;Vignola et al., 2017;Hus and Hellman, 2019). As already discussed in section 2.2, lateral interactions are taken into account within the lattice approximation in KMC by assigning an individual hopping rate constant to each neighbor configuration (see Figure 3). In case of repulsive interactions, the particle will be more likely to jump away from an initial configuration with occupied neighboring sites as compared to empty neighboring sites. This will cause the particles to spread out over the lattice to maximize the inter-particle distances. For attractive interactions, the situation is exactly opposite and the particles will show a tendency for island formation. Both cases can lead to the formation of ordered structures on the lattice. They may therefore lead to inaccuracies of mean-field treatments of the system kinetics, since the MFA assumes a random distribution of the particles without any correlations between sites (Liu et al., 2016;Stamatakis and Piccinin, 2016;Pineda and Stamatakis, 2017).
In practice, lateral interactions can be accounted for in lattice KMC models through the general cluster expansion method (Sanchez et al., 1984;Stampfl et al., 1999;Müller, 2003). In this method, the lattice energy is expanded into a sum of discrete interactions (clusters) such as pairwise interactions, three-body interactions etc. through a lattice-gas Hamiltonian. For any adsorbate on the lattice, one would thus evaluate how many neighbors of which type sit at which kind of distances (nearest-neighbor, next nearest neighbor etc.). For each such neighbor the adsorption energy (or more generally rate constant) of an isolated particle on the lattice at this site type would be corrected for by a certain amount prescribed by the lattice-gas Hamiltonian. Summing up all these contributions defines the pairwise interactions. Then one looks up all possible motifs of two simultaneously present neighboring species, for which again there are energy (rate constant) corrections. This defines the three-body corrections to the pairwise interaction correction, and formally this goes on to higher and higher many-body interactions. While cluster expansions considering up to three-and four-body terms have been parametrized with DFT for simple systems (Jansen and Offermans, 2005;Zhang et al., 2007;Schmidt et al., 2012;Wu et al., 2012;Piccinin and Stamatakis, 2014), presently cluster expansions are typically truncated already after the first nearest neighbor pairwise interaction term in more complex systems with many species and site types in order to keep the computational cost tractable (Stamatakis and Vlachos, 2011;Yang et al., 2013). A particularly crude form of this are so-called site-blocking rules Liu et al., 2016;Lorenzi et al., 2016), where strongly repulsive first neighbor interactions are simply modeled by suppressing any KMC processes that would lead to such immediately neighboring species. Furthermore, cluster expansions are typically used only for adsorbates in their stable and metastable adsorption sites, since taking into account also transition states, i.e., changes in barrier heights due to lateral interactions, could make the DFT parametrization intractable. In the ZACROS KMC code cluster expansion is built-in for adsorbates and the effect of lateral interactions on transition states is instead taken into account through an approximate Brønsted-Evans-Polanyi relation (Nielsen et al., 2013). A main drawback of the cluster expansion method is that the computational cost of both the DFT parametrization and the KMC simulation can quickly become intractable for complex systems. A benchmarking of the cluster expansion parametrized in Wu et al. (2012) showed that the computational cost of the KMC simulation increased with about 5 orders of magnitude when the number of clusters considered increased from 3 to 12 (Nielsen et al., 2013). Of course, this depends on the actual KMC implementation and algorithm used, and in particular codes like ZACROS are written to mitigate the additional costs when considering lateral interactions.
Despite the added cost, we emphasize again that lateral interactions often play an important role not only for surface diffusion, but also for crystal growth and heterogeneous catalysis. In the following we will thus provide two example kmos models for the application areas crystal growth and heterogeneous catalysis that take into account lateral interactions. For more realistic growth and lateral interaction models we refer to the literature (Fichthorn et al., 2002;Ruan and Schuh, 2010;Shirazi and Elliott, 2014).

kmos Models With Lateral Interactions
We will begin this section with a short description of two of the available kmos backends and the way they each handle lateral interactions. The local smart backend is the original backend and has been used as a basis and inspiration for the other backends. It was built with the implicit objective of offering the best run time possible at the expense of memory usage. For this reason, a key element in this backend is a pre-calculated list of rate constants, i.e., a rate catalog. Together with an efficient local updating of the available processes after each KMC step, this makes it the most efficient backend when the number of different rate constants in the list is reasonable small. The local smart backend implements the most simple treatment of lateral interactions, in which any new neighbor configuration defines a new KMC process with a rate constant associated with this particular configuration. There are several drawbacks of this approach for models with lateral interactions since these generally feature an exponentially growing number of processes with the number of interactions taken into account: (i) Several routines whose execution time scale with the total number of processes can become slow, (ii) the bookkeeping data structures, which scale in size with the total number of processes, can become too big for available memory, (iii) the size of the source code can become very large, making compilation very slow or even impossible due to memory requirements.
The on-the-fly backend was constructed to alleviate the problems encountered in the local smart backend and to enable KMC simulations of complex lateral interaction models. As the name implies, it calculates rate constants on-the-fly instead of making use of a pre-calculated rate catalog. In the on-thefly backend the local environment is taken into account for each process through a bystander list. Here a bystander is a neighboring site whose occupation influences the rate constant of the process according to the on-the-fly-calculated rate constant. The benefit of this approach is that the total number of processes in the model is constant with respect to the number of interactions taken into account. On the other hand, the drawback is that now the compute time of each KMC step scales linearly with the system size due to the need to scan through the lattice and sum up all rate constants explicitly in order to evaluate the total escape rate constant k tot (Equation 9). Now moving to examples, we will first present a kmos model for crystal growth with lateral interactions that is simple enough that it can be handled efficiently in the local smart backend. For this model we consider a three-dimensional quadratic lattice and a single species that is deposited onto a solid substrate with a constant adsorption rate constant k ads = 3 · 10 −3 s −1 . The lowcoverage desorption barrier E 0 is set to 1 eV. In addition, we consider attractive pairwise lateral interactions ǫ int = 0.5 eV with the nearest neighbor species. Thereby, the rate constant for the desorption process k des becomes where n NN is the number of nearest neighbor species in the lattice configuration. For the desorption process to be possible the species must be located in the surface layer, i.e., the site just below it must be occupied and the site just above it must be empty. Lateral interactions are then taken into account for the four neighboring sites at the same z height (see Figure 11A). Since there are only two possible configurations for each neighbor site (empty or occupied) this leads to merely 16 inequivalent desorption process types and this model can therefore be efficiently treated in the local smart backend.
In Figure 11B we show snapshots of the grown crystal structure for temperatures of 350 K and 450 K. In both simulations the system is prepared in an initial state corresponding to one layer of fixed substrate species (blue atoms) onto which adsorption can take place. As expected, the grown structure becomes smoother at higher temperatures, where atoms deposited onto unfavorable adsorption sites with no attractive interactions to neighboring species will be more likely to desorb. The model could be made more realistic by including also diffusion processes and by considering a more detailed cluster expansion model for the lateral interactions.
As the next example we return to the catalysis model for CO oxidation on RuO 2 (110) already discussed in section 6.1. To explore how the kmos performance of the local smart and on-the-fly backends are each affected by increasingly complex lateral interaction models, we step-wise add pairwise interactions to each process in the model. Possible pairwise interactions are illustrated with black arrows in Figure 11C for a second-order process (e.g., O 2 desorption) involving two neighboring sites (marked with black circles). The performance of each backend as a function of the number of interactions included is shown in Figure 11D For the local smart backend (rate catalog) the cost grows exponentially with the number of interactions and we are effectively limited to simple interaction models with a maximum of four pair-wise interactions. This happens because the number of processes in the rate catalog grows exponentially with the number of interactions, i.e., the three possible occupations of each neighboring site (O, CO or empty) to the power of the number of neighboring sites (interactions) taken into account. In the on-the-fly backend the cost grows only linear with the number of interactions and remains tractable at all considered FIGURE 11 | (A) Illustration of the lateral interaction model for the desorption process of the crystal growth model. The site just below the desorbing species is always occupied and its interaction energy is therefore included in the low-coverage desorption barrier. The four neighboring sites at the same z height can be either empty or occupied and modify the desorption rate constant accordingly. (B) Snapshots of grown crystal structures at two different temperatures. (C) Illustration of pairwise interactions in the CO oxidation on RuO 2 (110) model. (D) kmos performance for the CO oxidation model as a function of the number of pairwise interactions considered for two different backends (rate catalog or on-the-fly calculation of rate constants). Using a rate catalog, the performance is independent of the lattice size. In the on-the-fly implementation the cost instead grows linearly with the lattice size (quadratic growth with the length N of an (N × N) simulation cell) as illustrated for N equal to 10, 20, 30, 40, 50 (different red lines). system sizes, despite the quadratic scaling of the cost with the length N of the (N × N) simulation cell used (linear scaling with system size). It is worth noting in this context that the on-the-fly algorithm presented here specifically for kmos is not new. For example, also the ZACROS code (Nielsen et al., 2013) calculates rate constants on-the-fly.
As has hopefully become evident from the above examples, lateral interactions are a double-edged sword for KMC models. On the one hand, the ability to systematically take into account effects of the local environment through systematic cluster expansion enables models of potentially very high accuracy and constitutes a great advantage over MFA models. On the other hand, the inclusion of lateral interactions counteracts the reduction in the number of needed first-principles rate constants that was obtained by making use of the lattice approximation (see section 2.2) and introduces additional complexity and cost to the KMC simulation. In practice, the choice of lateral interaction model therefore (unfortunately) often becomes a pragmatic one determined by what is computationally feasible, even if it is wellknown that an appropriate interaction model can be crucial for the validity of the kinetic model.

SUMMARY AND OUTLOOK
The last decade has witnessed an impressive growth, not only in the number of studies employing predictive, first-principles KMC modeling, but also in the number of new codes that have become available. Especially for heterogeneous catalysis applications, the employed models are increasingly able to deal with complexity, both in the employed reaction mechanisms and in the structure of the catalytic material modeled. These advances have been powered by algorithmic developments for the determination of the input processes and rate constants, as well as for the actual algorithms that carry out the KMC simulation.
With many new KMC users entering this exciting field, the aim of this tutorial review has been to provide the necessary practical guidelines and examples for constructing and evaluating KMC models, and to highlight the pitfalls met along the way as well as current challenges and perspectives. We discussed in detail how to make use of the lattice approximation in order to exploit a crystalline symmetry of the underlying surface lattice and thereby reduce the number of required first-principles rate constants. This also involves pitfalls and challenges, in particular when a dynamical restructuring of the surface takes place, or if well-defined adsorption sites of the surface species do not exist. We briefly discussed how off-lattice (adaptive) KMC attempts to overcome these limitations, as well as the limitations associated with a pre-defined-possibly incompletelist of possible processes, through the automatic identification of possible states and processes. However, despite this exciting perspective, these methods are still hampered by their high cost in practice.
Even for lattice KMC models the number of required first-principles rate constants can be daunting. Prevalent TST approaches require the location of the TS and often also the determination of prefactors and zero-point vibrational energies, at least to the level of determining the harmonic frequencies of the system. While it is desirable to carry out these calculations by DFT or other electronic structure methods to achieve a predictive-quality model, the development of cheaper methods could dramatically increase the complexity that it is possible to tackle. To this end, we discussed commonly employed BEP and scaling relations as one prevalent example today. For the future, the development of increasingly accurate semi-empirical methods such as Density-Functional Tight-Binding (DFTB), reactive force fields or machine-learning-based force fields is an interesting perspective to which we ascribe a high potential. The actual determination of the TS is typically the computational bottleneck, and for this we discussed a number of methods and their respective strong and weak points. Furthermore, we discussed how the accuracy of the employed rate constants can be systematically improved in lattice KMC models through the cluster expansion method in cases where the local environment significantly influences the rate constant.
For KMC models constructed using first-principles DFT rate constants, one still has to bear in mind that the expected error on barriers can easily be on the order of several hundreds of meV. This may lead to rate constants that are potentially wrong by orders of magnitude. A practical way to tackle this significant drawback is to estimate the sensitivity of the model predictions on the input rate constants and to quantify the uncertainties on those predictions. For this, we discussed various approaches ranging from a simple parameter variation to sophisticated models taking into account correlations between the input rate constants. An exciting perspective here is a recurrent refinement of the used rate constants, possibly also regarding the lateral interaction model employed, as information about the importance of the various input processes becomes increasingly available.
While KMC has come a long way, there are also a number of challenges to be met in the future. One example of this is the inevitable timescale disparity problem, which continues to challenge practical applications of KMC. We discussed a wide variety of methods that have appeared in the last decades, with a particular focus on recent acceleration algorithms that automatically identify the simulation bottlenecks, i.e., the fast, quasi-equilibrated processes. While these approaches can work well in many cases, there are also examples where they break down. Likely, this is an area where further improvements are to be seen in the coming years.