Analyzing the Nuclear Interaction: Challenges and New Ideas

This review presents some of the challenges in constructing models of atomic nuclei starting from theoretical descriptions of the strong interaction between nucleons. The focus is on statistical computing and methods for analyzing the link between bulk properties of atomic nuclei, such as radii and binding energies, and the underlying microscopic description of the nuclear interaction. The importance of careful model calibration and uncertainty quantification of theoretical predictions is highlighted.


INTRODUCTION
The ab initio approach to describe atomic nuclei and nuclear matter is grounded in a theoretical description of the interaction between the constituent protons and neutrons. The long-term goal of this approach is to construct models that describe and analyze the properties of nuclear systems with maximum predictive power. It is of course well-known that the elementary particles of the strongly interacting sector of the Standard Model are quarks and gluons, not protons and neutrons. However, since the relevant momentum scales of typical nuclear structure phenomena are low enough to not resolve the internal degrees of freedom of nucleons, it is reasonable to model the nucleus as a collection of strongly interacting and point-like nucleons. This idea has inspired significant efforts aimed at developing algorithms and mathematical approaches for solving the many-nucleon Schrödinger equation in a bottom-up fashion and with as few uncontrolled approximations as possible (see e.g., references [1][2][3][4][5][6][7][8][9][10]), as well as a multitude of theoretical descriptions of the interaction between nucleons, at various levels of phenomenology (see e.g., references [11,12], and references [13][14][15] for comprehensive reviews on (chiral) effective field theory (EFT) methods). Reference [16] also offers a historical account of various approaches to understand the nuclear interaction.
Currently, ab initio modeling of atomic nuclei faces two main challenges:
• We have limited knowledge about the details of the interaction between nucleons, which in turn limits our ability to predict nuclear properties.
• Given a microscopic description of the interaction between nucleons inside a nucleus, a quantum-mechanical solution of the nuclear many-body problem is exacerbated by the curse of dimensionality.
There is however continuous progress on both frontiers, and attempts at quantifying the uncertainty of model predictions are beginning to emerge in the community. Rapid algorithmic advances in combination with a dramatic increase in available computational resources make it possible to employ several complementary mathematical methods for solving the nuclear Schrödinger equation. We can nowadays generate numerical representations of microscopic many-nucleon wavefunctions, for selected medium-mass and heavy-mass nuclei, with rather impressive precision. Although several observables remain beyond the reach of state-of-the-art models, e.g., most properties associated with highly collective states, we can still describe certain classes of observables rather well, such as total ground-state binding energies and radii, and sometimes low-energy excitation spectra. We are thus capable of analyzing experimentally relevant nuclei directly in terms of a quantum mechanical description of the interaction between their constituent nucleons. Indeed, the list of, sometimes glaring, discrepancies between theory and experiment furnishes some of the most interesting nuclear physics questions at the moment (see e.g., references [17][18][19][20][21][22][23]). Many of these efforts are aimed at understanding the nuclear binding mechanism, the location of the neutron dripline, the existence of shell-closures and magic numbers in exotic systems, and the emergence of nuclear saturation. State-of-the-art theoretical analyses of experimental data indicate a sizeable systematic uncertainty in the description of bulk nuclear observables (see e.g., reference [24]). Given the high precision of modern many-body methods, much of this uncertainty can be traced to the description of the interaction potential. Although there exist ab initio models that describe nuclei rather well, albeit in a limited domain, it is less clear why other models sometimes fail. Indeed, the NNLO sat interaction potential [25] reproduces several key experimental binding energies and charge radii for nuclei up to mass A ∼ 50 [23,[26][27][28], while the so-called 1.8/2.0 (EM) interaction potential [29,30] reproduces binding energies and low-energy spectra up to mass A ∼ 100 [26,[31][32][33][34][35], although radii are underestimated. The origin of the differences between these potentials is unknown. It is of course the role of nuclear theory to close the gap between theory and experiment by developing and refining the theoretical underpinnings of the model. But given the complex nature of atomic nuclei, there is significant value in trying to quantify, or estimate, the detailed structure of the observed theoretical uncertainty. This might provide important clues about where we should focus our efforts. There exist well-defined statistical inference methods that can provide additional guidance, and several ongoing projects are currently focused on applying statistical computing methods in the field of ab initio modeling. The topic of uncertainty quantification in nuclear physics has been discussed at a series of workshops on Information and Statistics in Nuclear Experiment and Theory (ISNET). Recent developments in this field are documented in the associated focus issue published in Journal of Physics G [36]. A second focus issue has just been announced, and the first few papers are already published.
In sections 2 and 3 of this paper I will review a selection of recent results and commonly applied methods for calibrating ab initio models. In sections 4 and 5 I will discuss some of the recently emerging strategies for making progress using statistical computing and Bayesian inference methods. The aim is to provide an overview of selected accomplishments in the field of statistical inference and statistical computing with ab initio models of atomic nuclei. Hopefully, this paper can serve as a brief introduction for practitioners who wish to learn about ongoing developments and possible future directions.
As a final remark, in this paper I will try to consistently use the word model when referring to any current method for theoretically describing the properties of atomic nuclei, including descriptions that claim to build on more fundamental underpinnings, such as EFT. One can certainly make a finer distinction between models, EFTs, and theories. As outlined in reference [37]: theories provide a unified framework, categorization, and the joint language used for discussions; EFTs capture physics at a given momentum scale; and models can be used to study aspects of a theory, increase understanding, and provide intuition.

AB INITIO MODELS OF NUCLEAR MANY-BODY SYSTEMS
An ab initio model is here defined as a description that is based on a state |Ψ⟩ that solves the many-nucleon Schrödinger equation

(T̂ + V̂(α)) |Ψ⟩ = E |Ψ⟩.    (1)

In this schematic representation, T̂ is the total kinetic energy operator for the A-nucleon system, V̂(α) is the potential energy operator for the interaction between the nucleons, and E is the total energy of the system in the state represented by |Ψ⟩. The potential operator depends on a set of parameters α that governs the strengths of the various interaction pieces in the potential. In the context of EFT, these parameters are often referred to as low-energy constants (LECs). Given a particular expression for the potential V̂, with numerical values for the parameter vector α, and a mathematical method to solve Equation (1) for, e.g., the state |Ψ⟩ with lowest energy, it is in principle possible to quantitatively compute the expectation value of any observable Ô with respect to this state, e.g., its charge radius. Of course, the trustworthiness of the result and its level of agreement with experimental data can vary dramatically between different models, i.e., combinations of potentials and many-body methods.
I will denote an ab initio model with M(α, x). It is defined as the combination of a definite expression for the potential V̂(α) and a method for solving the Schrödinger equation. The vector x is a set of control inputs that specify all necessary settings, such as nucleon numbers, which observable to compute, values of the fundamental physical constants, and algorithmic settings for the mathematical method used for solving Equation (1). Once a set of numerical values for α has been determined, a subset of the control inputs x of the model can be varied to make model predictions, preferably at some physical setting, e.g., for exotic nuclei where we cannot easily make measurements. Provided that the form of the potential operator V̂ and relevant physical constants remain the same, and the model parameters α were calibrated carefully, it is of course possible to transfer the vector α between ab initio models based on different methods for solving the many-nucleon Schrödinger equation. This is also in line with a physical interpretation of the parameters α that elevates them to a status beyond being simple tunable parameters inherent to a specific model with the sole purpose of achieving a good fit to calibration data. This will be discussed further in section 3.
One of the most exciting developments in nuclear theory is that we nowadays have access to a range of methods for solving Equation (1) with very high numerical precision for selected isotopes and observables. This gives us the opportunity to compare model predictions with experimental data to learn more about the elusive structure of the interaction between nucleons. However, such analyses require careful statistical interpretation of the theoretical results, in particular a sensible estimate of the uncertainty associated with a theoretical prediction. Indeed, only with reliable theoretical errors is it possible to infer the significance of a disagreement between experiment and theory, which in turn may hint at new physics.

Chiral Potentials and the Strong Interaction Between Nucleons
On a fundamental level, the atomic nucleus is a quantum mechanical and self-bound system of interacting nucleons. In turn, these particles are composed of three quarks whose mutual interactions are described well by the Standard Model of particle physics. As such, starting from the Standard Model it should be possible to account for all observed phenomena also in atomic nuclei, apart from possible signals of beyond-Standard-Model physics. However, to theoretically understand the emergence of nuclei from the Standard Model is an open problem, and linking the quantitative predictions of atomic nuclei to the dynamics of quarks and gluons is a central challenge in low-energy nuclear theory. Moreover, viewing the atomic nucleus as a (color-singlet) composite multi-quark system is not the most economical choice. Indeed, the strong interaction, which is the most important component for nuclear binding and well-described by quantum chromodynamics (QCD), is nonperturbative in the low-energy region inhabited by atomic nuclei. Non-perturbative Monte Carlo sampling of the quantum fields of QCD amounts to a computational problem of tremendous proportions. This strategy, referred to as lattice QCD, is expected to require at least exascale resources for a realistic analysis of even the lightest multi-nucleon systems. Without any unforeseen disruptive technology, this approach will not provide an operational method for routine analyses of nuclei. For the cases where numerically converged results can be obtained, lattice QCD offers a unique computational laboratory for theoretical studies of QCD in a low-energy setting (see e.g., references [38,39]).
The description of nuclei should nevertheless build on QCD, or the Standard Model in general. A turning point in the development of QCD-based descriptions of the nuclear interaction came when EFTs of QCD [40] arrived also to many-nucleon physics [41]. An EFT formulates the dynamics between low-energy degrees of freedom, e.g., nucleons and pions, in harmony with some assumed symmetries of an underlying theory, e.g., QCD, and any high-energy dynamics, e.g., quark-gluon interactions, are integrated out of the theory. The resulting chiral effective Lagrangian models the low-energy interactions between two or more nucleons in terms of pion exchanges between nucleons, and the high-energy dynamics is incorporated as zero-ranged contact interactions. This approach introduces several model parameters referred to as low-energy constants (LECs). They were denoted with α above, and play a central role during the model calibration discussed below. The notion of high- and low-energy scales in EFT requires the presence of at least two scales in the physical system under study. An EFT formally exploits this scale separation to expand observables in powers of the low-energy (soft) scale over the high-energy (hard) scale, and in chiral EFT the resulting ratio is often denoted

Q = max(m_π, k)/Λ_b,    (2)

where, in the case of chiral EFT, the soft scales are m_π and k, the pion mass and a typical external momentum scale, respectively. The hard scale is denoted Λ_b and is set by, e.g., the nucleon mass M_N. Depending on the system under study, one can always try to exploit existing scale separations to construct other kinds of EFTs in nuclear physics, e.g., pion-less EFT [42], vibrational EFT [43], or chiral perturbation theory (the prototypical EFT of QCD) [44]. In the following, I will only discuss results from ab initio models based on chiral EFT, i.e., a pion-full EFT, but many of the methods can be generally applied.
In chiral EFT, the nuclear interaction potential V is analyzed as an order-by-order expansion in terms of Q^ν and organized following the principles of an underlying power counting (PC). Terms at higher chiral expansion orders ν should be less important than terms at lower orders, and potentials expanded to higher orders are expected to describe data better. Higher chiral orders contain more involved pion exchanges and nucleon contact terms that are polynomial in momenta of increasing degree, and therefore more undetermined model parameters α to handle during the calibration stage. To provide some detail about the chiral potentials: the leading order (LO) typically consists of the familiar one-pion exchange interaction plus a nucleonic contact potential. The structure of the contact potential, and the exact treatment of sub-leading orders, vary depending on the PC. Still, typical chiral potentials include at most contributions up to a handful of chiral orders, e.g., next-to-next-to-leading order (NNLO) and next-to-next-to-next-to-leading order (N3LO), and the total number of LECs, i.e., undetermined model parameters, ranges between ∼10 and 20, sometimes a few more, at such chiral orders. Several important contributions to the two-, three-, and four-nucleon interactions at higher orders in the chiral expansion have also been worked out [45][46][47][48][49][50]. At N5LO, a new set of 26 contact LECs appears, bringing the total number of contacts to 50. Some of the unique advantages of chiral EFT descriptions of the nuclear interaction are the natural emergence of two-, three-, and many-nucleon interactions [51][52][53][54][55], the consistent formulation of quantum currents, e.g., with respect to electroweak operators [56][57][58][59][60][61][62], and a clear connection with the pion-nucleon Lagrangian which makes it possible to link nuclei with low-energy pion-nucleon scattering processes [63]. For a detailed account of chiral EFT potentials, see references [13][14][15].
To ensure steady progress toward a realistic ab initio model for atomic nuclei, we need to critically examine and evaluate the quality and predictive power of different theoretical approaches and model predictions. To this end it is crucial to equip all quantitative theoretical results with uncertainties, and this is where another advantageous aspect of EFT comes into play: it promises to deliver a handle on the systematic uncertainty of a theoretical prediction. Indeed, on a high level the EFT expansion for an observable O can be written

O = O_0 Σ_{ν=0}^{∞} c_ν Q^ν,    (3)

where O_0 is the first term in the above expansion, and c_ν are dimensionless expansion coefficients. Here, and in the following, the LO result (O_0) was pulled out in front of the sum to set the overall scale. One could equally well use the experimental value for O or the highest-order calculation to set the scale of the observable expansion. If we are dealing with an EFT, one should expect the expansion coefficients to be of natural size such that contributions at successive chiral orders are smaller by a factor of Q. See also references [64,65] for discussions on how to assess the convergence of such expansions. In an actual calculation, the order-by-order description of O is truncated at some finite order k, which induces a truncation error δ_k in the prediction. The underlying EFT description then, in principle, allows us to determine the formal structure of the truncation error

δ_k = O_0 Σ_{ν=k+1}^{∞} c_ν Q^ν.    (4)

This type of handle on the theoretical uncertainty in a prediction is not present in purely phenomenological descriptions of the nuclear interaction, such as the Argonne V18 potential [11] or the CD-Bonn potential [12]. Despite all of the promised advantages of chiral EFT, it should be pointed out that much work remains to be done regarding the analysis and theoretical underpinnings of chiral EFT, in particular the formulation of a PC that, arguably, fulfills the field theoretic requirements for an EFT of QCD (see e.g., references [66][67][68][69][70][71][72][73] for various views on this topic). Indeed, one cannot yet confidently claim that the uncertainty estimates in ab initio predictions of nuclear observables based on proposed chiral EFT interactions are linked to missing physics at the level of the effective Lagrangian. The details of the PC, the regularization approach, and the chosen maximum chiral order k in Equation (3) are some of many possible choices that give rise to the rich landscape of different chiral interactions in nuclear theory. Although there is a flurry of activity, and it is far from clear which is the best way to proceed, there is tremendous overarching value in organizing the model analysis according to the fundamental ideas and expectations of EFT, most importantly the promise of order-by-order improvement.
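To make Equation (3) concrete, the short Python sketch below extracts the dimensionless expansion coefficients c_ν from a set of order-by-order predictions of an observable. It is a minimal illustration only: the numerical values and the assumed expansion parameter Q are hypothetical placeholders, not results from any actual chiral potential.

    import numpy as np

    def expansion_coefficients(order_by_order, Q):
        """Dimensionless coefficients c_nu of Equation (3), extracted from
        cumulative order-by-order predictions of an observable."""
        O = np.asarray(order_by_order, dtype=float)
        O0 = O[0]                                 # LO result sets the overall scale
        contributions = np.diff(O, prepend=0.0)   # what each order adds to the prediction
        nu = np.arange(O.size)
        return contributions / (O0 * Q**nu)

    # Hypothetical order-by-order predictions (nu = 0, 1, 2) of some observable
    c = expansion_coefficients(order_by_order=[-25.0, -29.0, -28.2], Q=0.3)
    print(c)   # naturalness check: the coefficients should be of order one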

MODEL CALIBRATION
The goal of model calibration is to learn about the parameters of the model using a pool of calibration data. This can mean many different things depending on the situation, and in this section I will discuss a few representative model calibration examples from ab initio nuclear theory.
Assume that we have a model M(α, x) that consists of a method for solving the Schrödinger equation and some theoretical description of the nuclear interaction, e.g., a particular interaction potential from chiral EFT, and that we do not know the permissible values for α. The vector α = [α_1, α_2, . . . , α_N] denotes the N physically relevant and adjustable calibration parameters of model M, and the vector x denotes the set of control inputs. The adjustable parameters of interest will typically correspond to the LECs of the nuclear interaction potential, and the vector x will contain, e.g., proton and neutron numbers, observable type, or some kinematical setting. In principle the model might contain additional adjustable parameters that for various reasons can be considered constants. For instance, we typically do not consider the pion mass as a calibration parameter, although the variation of such fundamental properties can also play an important role (see e.g., references [74,75]). The choice of many-body method will depend on which class of observables is targeted, either during prediction or calibration. For instance, coupled-cluster theory will perform very well for nuclei in the vicinity of closed shells, and Faddeev integration will be able to access the positive energy spectrum of the three-nucleon Hamiltonian. Throughout, I will implicitly assume that the model is realized only on a computer, i.e., M is defined through some computer code, and there is no stochastic element present in the output. This means that each time the model is evaluated with the same input and settings, we will get the same result.
To calibrate the parameters, suppose that we have a set of n experimental observations compiled in a data vector D = [z_1, z_2, . . . , z_n]. They correspond to particular settings x_1, x_2, . . . , x_n of the control variables, producing model outputs for, e.g., ground-state energies of light nuclei or scattering cross sections at selected scattering momenta. We can link the data points to the model outputs via the following relation

z_i = M(α, x_i) + δ(x_i) + ε_i.    (5)

This expression relates the reality of measurement with our model, and includes a so-called model discrepancy term δ that depends on the control variable x_i. The measurement error is denoted with ε_i. In cases where the measurement is accompanied by zero uncertainty, something that is highly unlikely of course, the model discrepancy term represents the entire difference between the model and reality. The theoretical discrepancy δ is not physics per se, but should rather be interpreted as a random variable of statistical origin, informed via domain knowledge.
The model discrepancy term can be partitioned into at least three terms

δ(x) = δ_interaction(x) + δ_many-body(x) + δ_numerical(x),    (6)

and they represent the neglected or missing physics in the theoretical description of the nuclear interaction, neglected or missing many-body correlations in the mathematical solution of the many-body Schrödinger equation, and any numerical errors arising due to algorithmic approximations in the implementation of the computer model, respectively. We are currently most interested in understanding δ_interaction in situations where we, to a good approximation, can neglect δ_many-body and δ_numerical. Thus, in most of the literature, the dominant part of the model discrepancy originates from the chiral EFT description of the nuclear interaction. It should be pointed out that the discrepancy term of the many-body method can be quite large for many types of observables. However, ab initio methods are often applied wisely, and there exists plenty of domain knowledge regarding which many-body methods are best suited for different kinds of observables. Yet, it is not easy to set bounds on this discrepancy a priori. Comparison between several complementary ab initio models provides important validation [76][77][78]. Although challenging, it would be of great value to quantify the many-body discrepancy more carefully. Finally, the last term in Equation (6) is currently not the dominant part of the discrepancy, provided that the computer code has been benchmarked.
Two related questions immediately arise: (i) what is the impact of the discrepancy term δ(x_i) on the inference about the model parameters α? and (ii) what happens if we neglect all sources of model discrepancy during model calibration?
Let us consider the second question, since it is easier and also sheds light on the first one. Ignoring δ(x_i) in Equation (5) leaves us with the following expression

z_i = M(α, x_i) + ε_i.    (7)

This is the conventional starting point in nuclear model calibration. If one also assumes that the measurement errors ε_i have finite variance, then the principle of maximum entropy dictates that the likelihood of the data is normally distributed. For independent errors, this leads to the canonical expression for the likelihood

P(D|α, M) = ∏_{j=1}^{n} (2πσ_j²)^{−1/2} exp[ −(z_j − M(α, x_j))² / (2σ_j²) ] ∝ exp[ −χ²(α)/2 ],    (8)

with χ²(α) = Σ_{j=1}^{n} [z_j − M(α, x_j)]²/σ_j². Here, the notation P(X|Y) denotes the probability density function (pdf) of X conditioned on Y. The structure of the likelihood remains the same for correlated measurement errors, although one must employ the full covariance matrix instead of only the diagonal terms σ_j² to represent the variance of the data. Model calibration in ab initio nuclear theory is typically formulated as a maximum likelihood problem. This boils down to finding the optimal, or best-fitting, parameters α⋆ that minimize the exponent in Equation (8). We are thus facing a mathematical optimization (minimization) problem of finding the point that fulfills χ²(α⋆) ≤ χ²(α) for all α in the parameter domain. In general, this is an intractable problem unless we have detailed information about α or the parameter domain is discrete and contains a finite number of points. In reality, we are trying to find local minimizers of χ²(α), i.e., points α⋆ for which χ²(α⋆) ≤ χ²(α) for all α close to α⋆. For ab initio models, optimization of the likelihood function typically proceeds in several steps [11,12,[79][80][81][82]. First, the parameters, i.e., the LECs in chiral EFT, are calibrated such that the model optimally reproduces nucleon-nucleon scattering phase-shifts from published partial-wave analyses [83,84]. This typically yields model parameters confined to some narrow range of values. Although each scattering phase-shift only depends on a limited subset of the entire vector of model parameters α, this stage still benefits from using mathematical optimization algorithms, such as the derivative-free algorithm called pounders [85,86]. In a next step, the results from the phase-shift optimization serve as the starting point for a second round of parameter optimization where all model parameters are varied to best reproduce thousands of nucleon-nucleon scattering cross sections up to scattering energies in the vicinity of the pion-production threshold.
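As a schematic illustration of this workflow, the sketch below defines the χ² objective of Equation (8) for a generic model and minimizes it with a derivative-free method from scipy. Everything here is a hypothetical placeholder: the toy model stands in for an actual ab initio calculation, and Nelder-Mead is used simply because the pounders algorithm mentioned above is not part of scipy.

    import numpy as np
    from scipy.optimize import minimize

    def chi_squared(alpha, model, x_list, z, sigma):
        """Conventional chi^2 of Equation (8), i.e., no model discrepancy term."""
        residuals = [(z_i - model(alpha, x_i)) / s_i
                     for x_i, z_i, s_i in zip(x_list, z, sigma)]
        return float(np.sum(np.square(residuals)))

    # Hypothetical stand-in for an ab initio model M(alpha, x)
    def toy_model(alpha, x):
        return alpha[0] + alpha[1] * x + alpha[2] * x**2

    rng = np.random.default_rng(1)
    x_list = np.linspace(0.0, 1.0, 20)                   # control inputs
    sigma = 0.05 * np.ones_like(x_list)                  # "experimental" uncertainties
    z = toy_model([1.0, -2.0, 0.5], x_list) + sigma * rng.normal(size=x_list.size)

    # Derivative-free minimization of the chi^2 surface
    result = minimize(chi_squared, x0=np.zeros(3),
                      args=(toy_model, x_list, z, sigma), method="Nelder-Mead")
    print(result.x, chi_squared(result.x, toy_model, x_list, z, sigma) / x_list.size)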
Minimizing the χ² in Equation (8) for nucleon-nucleon interaction potentials with respect to nucleon-nucleon scattering data has been the workhorse of model calibration in nuclear theory for decades. Since long, the figure of merit for a nuclear interaction potential has been the χ²-per-datum value. If this value is close to unity for some particular parameterization α⋆, then the corresponding potential is dubbed "high-precision." This is beginning to change. Only for models M where the model discrepancy is in fact negligible can this approach be justified. Otherwise, chasing a low χ² leads down the path of significant over-fitting, with unreliable predictions as a consequence. For the calculation of nucleon-nucleon scattering phase shifts and cross sections it is valid to ignore δ_many-body and δ_numerical since the corresponding equations can be solved more or less exactly numerically. However, since we clearly cannot claim to have a zero-valued δ_interaction term, the χ²-per-datum with respect to nucleon-nucleon scattering data is not the optimal measure to guide future efforts in nuclear theory. Before and during the development of ab initio many-body methods and EFT principles, when it was very unclear how to understand the concept of model discrepancy in nuclear theory, it was certainly more warranted to benchmark nuclear potentials based solely on a straightforward χ² value.
State-of-the-art interaction potentials also contain three-nucleon force terms. Although some of the parameters in chiral EFT are shared between two- and three-nucleon terms, there exists a subset of parameters inherent only to the three-nucleon interaction. Such parameters must be determined using observables from A > 2 systems. Arguably, all parameters of a chiral potential should be optimized simultaneously to a joint dataset D. The easiest approach is to also employ, e.g., the binding energies and charge radii of 3,4He and 3H. Unfortunately, there exists a universal correlation between the binding energies of 3H and 4He, the so-called Tjon line [88]. Also the radii and binding energies exhibit a strong correlation. Altogether, this reduces the information content of this data. Fortunately, it was demonstrated in reference [89] that the beta decay of 3H can add valuable, although limited [90], information about the parameters in the three-nucleon interaction, and this has been employed in several works, as indicated by the long list of citations of reference [89]. Recently, selected three-nucleon scattering observables have been added to the pool of calibration data [91,92], however not routinely since it is still computationally quite costly to evaluate ab initio models for such observables. There are indications that it is necessary to include also data from nuclei heavier than 4He to learn about the parameters in ab initio models. This is discussed in section 3.2.
Ignoring the δ_interaction discrepancy term during model calibration can have serious consequences. Most importantly, this reduces the LECs to tuning parameters without any physical meaning. Indeed, in the drive to replicate the data at any cost, the numerical values can be driven far away from their true values. At some point, continued tuning of the parameters induces over-fitting and the model will pick up on the noise in the data. Naturally, this leads to poor predictive power. With increasing amounts of data, the optimization process will converge with increasing certainty to false values for α. A pedagogical introduction to the statistics of model discrepancies and a physics example is provided in reference [93].
A total model discrepancy, according to Equation (6), was included in ab initio model calibration for the first time in reference [80]. The parameters in a set of chiral interactions at LO, NLO, and NNLO were optimized using nucleon-nucleon and pion-nucleon scattering data. The terms in the three-nucleon interaction were simultaneously informed using bound-state observables from A = 2, 3 nuclei. The details of the analysis and results can be found in the original paper. The discrepancy terms were interpreted as uncorrelated errors and added in quadrature with the data uncertainties, leading to a slight modification of the corresponding χ² function

χ²(α) = Σ_{j=1}^{n} [z_j − M(α, x_j)]² / (σ_j² + (σ_j^model)²).    (10)

The interaction discrepancy was constructed from the EFT assumption that the external momenta flowing through the interaction diagrams scale as some power corresponding to the truncation of the chiral expansion, in accordance with Equation (4). The intrinsic scale of this error was solved for self-consistently by requiring that the χ²-per-datum should approach unity provided that the model error is correctly estimated. This implicitly assumes a correct estimate of the number of statistical degrees of freedom, something that cannot be easily estimated for non-linear χ² functions [94].
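A minimal sketch of this modification, under the stated assumptions: the model error enters in quadrature as in Equation (10), each datum carries a weight encoding its assumed EFT momentum scaling, and the overall scale of the model error is adjusted (here by bisection) until the χ²-per-datum approaches unity. All quantities are hypothetical placeholders.

    import numpy as np

    def chi2_quadrature(residuals, sigma_exp, sigma_model):
        """Equation (10): experimental and model errors added in quadrature."""
        return np.sum(residuals**2 / (sigma_exp**2 + sigma_model**2))

    def solve_model_error_scale(residuals, sigma_exp, weights, tol=1e-10):
        """Find the scale C such that sigma_model_i = C * weights_i gives a
        chi^2 per datum of one; weights_i encode the assumed EFT scaling."""
        n = residuals.size
        lo, hi = 0.0, 10.0 * np.max(np.abs(residuals) / weights)
        while hi - lo > tol * (1.0 + hi):
            mid = 0.5 * (lo + hi)
            chi2 = chi2_quadrature(residuals, sigma_exp, mid * weights)
            lo, hi = (mid, hi) if chi2 > n else (lo, mid)  # too-small model error -> chi2 too large
        return 0.5 * (lo + hi)

    # Hypothetical residuals, experimental errors, and EFT scaling weights
    rng = np.random.default_rng(0)
    res = rng.normal(scale=0.3, size=50)
    sig = 0.1 * np.ones(50)
    w = np.ones(50)
    C = solve_model_error_scale(res, sig, w)
    print(C, chi2_quadrature(res, sig, C * w) / res.size)   # second number should be close to 1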
To summarize, although the inclusion of model discrepancies is preferred, it is not without problems. To blindly include a term δ(x_i) to capture model discrepancies in the process of model calibration can lead to statistical confounding between α and a general discrepancy function δ(·) [93]. This means that the model parameters and the discrepancy term are not identifiable and we only recover some joint pdf for the two components. Indeed, for any α there is a δ(·) given by the difference between model and reality. To make progress requires us to specify some a priori ranges for α and/or δ(·). Or, in the language of Bayesian inference, we need to specify the prior pdf for the model parameters and the theory uncertainties. This is partly related to approaches where one augments the χ² function with a penalty term to constrain the values of the model parameters (see e.g., reference [95]). For EFT descriptions of the nuclear interaction one can argue that the LECs should maintain values of order unity, if expressed in units of the breakdown scale, and the discrepancy could follow the pattern of Equation (4). How to adequately represent the discrepancy term in nuclear models is ongoing research, and it appears advantageous to reformulate model calibration as a Bayesian inference problem, see section 4.

Hessian Error Analysis
At the optimum parameter point α⋆, a Taylor expansion of the χ² function to second order gives

χ²(α) ≈ χ²(α⋆) + (1/2) (α − α⋆)^T H (α − α⋆),

where H denotes the Hessian matrix, the inverse of which is proportional to the covariance matrix for the model parameters [96]. Contracting the parameter-Jacobian of any model prediction with this covariance matrix yields the standard error propagation result of the parameter uncertainties. For the conventional χ² function, the parameter covariances reflect the impact of the experimental uncertainties on the precision of the optimum and predicted observables. Sometimes this is referred to as statistical uncertainties, which is a bit confusing since all uncertainties are statistical in nature. See Figure 1 for an example result of applying a parameter covariance matrix to obtain the joint pdf for the 4He ground-state energy and the 2H point-proton radius, two important few-nucleon observables. This particular result is taken from reference [80], where in fact a model discrepancy term δ(·) was incorporated during the optimization; thus, in this particular case, the covariances reflect more than just the measurement noise. See e.g., references [97][98][99][100][101][102] for details about statistical error analysis and illuminating examples of forward error propagation in ab initio nuclear theory.
To extract the covariance matrix requires computation of the second-order derivatives of the χ² function with respect to the model parameters. The general process of numerically differentiating an ab initio model with respect to α is significantly simplified, and numerically much more precise, with the use of automatic differentiation (AD) [80]. This corresponds to applying the chain rule of differentiation to a function represented as a computer code. It relies on the principle that any computer code, no matter how complicated, always executes a set of elementary arithmetic operations on a finite set of elementary functions (exponentials, logarithms, etc.). To implement AD requires modification of the original computer code, e.g., operator overloading via third-party libraries. Once implemented, AD also enables application of more advanced derivative-based optimization algorithms and Markov chain Monte Carlo methods [103] with the computer model M. An alternative, and derivative-free, approach to computing the Hessian matrix for forward error propagation is to employ Lagrange multipliers [104]. This method is more robust, but also more computationally demanding to carry out. From a practical and computational perspective, if one considers using Lagrange multipliers, one should also look into performing a Bayesian analysis (see section 4).
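The Hessian-based error propagation described above can be sketched as follows, here using the jax library for automatic differentiation. The χ² surface and the derived observable are hypothetical placeholders for a real, differentiable implementation of an ab initio model, and the covariance is taken as twice the inverse Hessian of the χ² at its minimum (appropriate for a Gaussian likelihood of the form in Equation (8)).

    import jax
    import jax.numpy as jnp

    # Hypothetical chi^2 surface and derived observable, standing in for a real model
    alpha_star = jnp.array([1.0, -0.5, 0.2])          # assumed optimum

    def chi2(alpha):
        return jnp.sum((alpha - alpha_star)**2 / 0.01)

    def observable(alpha):
        return alpha[0] + 2.0 * alpha[1] * alpha[2]

    hessian = jax.hessian(chi2)(alpha_star)           # second derivatives via AD
    covariance = 2.0 * jnp.linalg.inv(hessian)        # cov = 2 H^{-1} for a chi^2 objective

    jacobian = jax.grad(observable)(alpha_star)       # sensitivity of the prediction to alpha
    variance = jacobian @ covariance @ jacobian       # linear (Gaussian) error propagation
    print(observable(alpha_star), jnp.sqrt(variance))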

Selecting Calibration Data
It is preferable to use data corresponding to observables that are computationally cheap to evaluate, and if possible with model settings corresponding to small δ_many-body(x) discrepancies. One should also strive to include data with highly complementary information content that constrain a maximum number of linearly independent combinations of model parameters.
The conventional approach to calibrate ab initio models is to use only data from A ≤ 4 nuclei, as was discussed above. It was observed in reference [25] that the additional inclusion of ground-state energies and charge radii of selected carbon and oxygen isotopes dramatically increases the predictive power of models for bulk properties of nuclei up to the medium-mass nickel region (see Figure 2). This calibration strategy led to the construction of the so-called NNLO sat interaction. The strategy to include data from selected A > 4 nuclei was also used in the construction of the Illinois 3NF presented in reference [105]. From a quantitative perspective, the advent of models capable of accurate predictions is of course an important step forward and has proven very useful [26,27,106,107].
The major drawback of any model based on the NNLO sat interaction is the lack of quantified theoretical uncertainties. This is quite common also for ab initio models based on other interaction potentials. At the moment, the best we can do is to estimate the truncation error using Equation (4). This requires additional potentials at lower chiral orders, constructed using the same optimization protocol, e.g., LO sat and NLO sat, which do not exist. The calibration of such models requires an even more careful inclusion of model discrepancies. This is discussed more in section 4. One can certainly argue that it becomes even more important to quantify the theory errors for models that we strongly believe will make accurate predictions, like the ones based on the NNLO sat interaction. Otherwise we are limited in our ability to assess discrepancies with respect to experiment. This argument applies equally well to models based on, e.g., the 1.8/2.0 interaction from references [29,30], which typically yields good predictions for binding energies and low-energy spectra. In reference [26], the predictions from ab initio models based on different interactions, NNLO sat and the 1.8/2.0 interactions amongst others, were compared to estimate the overall theoretical uncertainty.
It is difficult to judge the degree of over-fitting to finite nuclei in NNLO sat. It was noted during calibration that this interaction fails to reproduce experimental nucleon-nucleon scattering cross sections for scattering momenta larger than ∼ m_π. Enforcing a good reproduction of all scattering data up to, e.g., the pion-production threshold most likely corresponds to over-fitting in the A = 2 sector. It is the role of the model discrepancy term, with appropriate priors, to balance this.
One clearly gains predictive power regarding saturation properties by including additional medium-mass data during the calibration stage. This was also observed in a lattice EFT analysis of the nuclear binding mechanism [108]. The related topic of possibly emergent nuclear phenomena like saturation, binding, and deformation of atomic nuclei is discussed further in reference [109]. Although the inclusion of a model discrepancy term while calibrating to heavier-mass data will be important, it does not solve the underlying problem of having a systematically uncertain model. It was noted in references [110][111][112] that the explicit inclusion of the Δ isobar in the chiral description of the nuclear interaction dramatically improves the description of nuclei while also reproducing nucleon-nucleon scattering data. A possibly fruitful way forward is to employ improved models, i.e., with explicit inclusion of the Δ isobar, that are calibrated using also data from selected heavy-mass nuclei, while systematically accounting for model discrepancies. Furthermore, it will be interesting to see how much additional information is contained in three-nucleon scattering data [91,92,113].

BAYESIAN INFERENCE
The previous section introduced the concept of model calibration and the fundamental expression in Equation (5) that relates a model with measured data. In this section I will outline the Bayesian strategy for learning about the model parameters and some existing estimates of the discrepancy term. The overarching goal is still to calibrate an ab initio model M(α, x), and reliably predict properties of atomic nuclei. However, instead of finding a single point α⋆ in parameter space that maximizes the likelihood for the data, we can use Bayes' theorem to relate the data likelihood to a pdf for the model parameters themselves

P(α|D, M, I) = P(D|α, M, I) P(α|M, I) / P(D|M, I),    (12)

where P(α|M, I) denotes the prior pdf for the parameters, P(D|α, M, I) denotes the likelihood of the data, the denominator P(D|M, I) denotes the marginal likelihood of the data, and P(α|D, M, I) denotes the sought-after posterior pdf of the model parameters. The additional I represents any other information at hand. The Bayesian reformulation of the inference problem can at first sight appear as a subtle point, and it is easy to overlook the fundamental difference between computing the pdf for the parameters and maximizing the likelihood, i.e., frequentist inference. From a practical perspective, it is clearly advantageous to obtain a pdf for the model parameters P(α|D, M). This quantity is also intuitively straightforward to interpret compared to frequentist interval estimates that might contain the true value of the unknown model parameters, e.g., confidence intervals. The prior pdf P(α|M, I) for the parameters α given a model M offers an up-front possibility to incorporate any prior knowledge (or belief) about the parameters, before we look at the data. In the case of ab initio modeling, an underlying EFT description of the nuclear interaction embodies substantial prior knowledge, such as the typical magnitude of the model parameters as well as a handle on the systematic uncertainty. The Bayesian requirement of prior specification also ensures full transparency regarding the assumptions that go into the analysis.
The existence of priors in Bayesian inference is sometimes criticized, and one can argue that the scientific method should let the data speak for itself, without the explicit insertion of subjective prior belief. Inference about model parameters in terms of hypothesis tests or confidence intervals, derived from the frequency of the data, is referred to as frequentist inference. Note, however, that the likelihood also rests on initial subjective choice(s) regarding the data model. In this review, I will maintain a practical perspective, and just recognize the usefulness of the Bayesian approach to encode prior information about the model parameters and the model discrepancy terms, which is also required in order to handle possible confounding between the discrepancy and the model parameters [93]. Either way, it is difficult to avoid subjective choices in statistical inference involving uncertainties and limited data. In fact, one can even argue that only subjective probabilities exist [114].
Bayesian model calibration, sometimes called Bayesian parameter estimation, is currently emerging in ab initio modeling [115][116][117]. To get some intuition about this topic, let us look at Bayesian parameter estimation in its most simple version. This amounts to assuming a (bounded) uniform prior pdf for the model parameters α, i.e.,

P(α|M, I) ∝ 1 for α_i within the specified bounds (i = 1, . . . , N), and 0 otherwise,

and adopting a data likelihood as in Equation (8).
In practice, what remains is to explicitly evaluate P(α|D, M, I) in Equation (12) by computing the product of the two terms in the numerator. The denominator can be neglected since it does not explicitly depend on α. This marginal likelihood does however matter for the absolute normalization of the posterior pdf. The evaluation of the posterior can be done via brute-force evaluation in some simple cases, but for computationally expensive models and/or high-dimensional parameter spaces more clever strategies are typically required, such as Markov chain Monte Carlo. With uniform priors, the point of maximum posterior coincides exactly with the point obtained using maximum likelihood methods, which for normal likelihood distributions is nothing but least-squares. The advantages of Bayesian parameter estimation become apparent once we include non-uniform prior knowledge, and in most cases we know a bit more about the parameters than what a simple uniform pdf reflects. The general strategies for application of Bayesian methods to calibrate EFTs are pedagogically outlined in reference [116]. To exemplify the use of priors and some of the related techniques, let us assume a Gaussian prior with zero mean for the model parameters α = [α_1, α_2, . . . , α_N], i.e.,

P(α|ā, M, I) = ∏_{i=1}^{N} (2πā²)^{−1/2} exp(−α_i²/(2ā²)),

where the parameter ā² denotes the prior variance. This is not an unreasonable prior for the model parameters in chiral EFT. The impact of this parameter prior is to penalize model parameters that are too large, which would typically signal over-fitting. For situations where there exists a large amount of precise data, the prior specification for the parameters matters less. Nevertheless, the question remains: what value should we pick for ā? This can be dealt with straightforwardly by marginalizing over ā, i.e., we express the prior for the parameters as P(α|M, I) = ∫ dā P(α|ā, M, I) P(ā|M, I), which only forces us to specify a prior for the variance of our belief about the model parameters; here we could choose a rather broad range if we like. With an appropriate analytical form for the prior on ā, it is even possible to carry out this marginalization step analytically. See reference [117] for illuminating examples about the impact of different priors in model calibration with scattering phase shifts.
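The sketch below illustrates this simplest form of Bayesian parameter estimation: a zero-mean Gaussian prior with variance ā² for each parameter, a Gaussian likelihood as in Equation (8), and a basic random-walk Metropolis sampler to draw from the posterior. The toy model, the data, and all numerical settings are hypothetical placeholders; in practice one would use a well-tested MCMC library and check convergence carefully.

    import numpy as np

    rng = np.random.default_rng(0)

    def log_prior(alpha, abar=5.0):
        """Zero-mean Gaussian prior with variance abar^2 for each parameter."""
        return -0.5 * np.sum(alpha**2) / abar**2

    def log_likelihood(alpha, model, x, z, sigma):
        """Gaussian likelihood corresponding to Equation (8)."""
        return -0.5 * np.sum(((z - model(alpha, x)) / sigma)**2)

    def metropolis(log_post, alpha0, n_steps=20000, step=0.02):
        """Random-walk Metropolis sampling of the posterior pdf."""
        current = np.asarray(alpha0, dtype=float)
        lp = log_post(current)
        chain = [current.copy()]
        for _ in range(n_steps):
            proposal = current + step * rng.normal(size=current.size)
            lp_new = log_post(proposal)
            if np.log(rng.uniform()) < lp_new - lp:       # accept or reject the proposal
                current, lp = proposal, lp_new
            chain.append(current.copy())
        return np.array(chain)

    # Hypothetical toy model and synthetic calibration data
    model = lambda alpha, x: alpha[0] + alpha[1] * x
    x = np.linspace(0.0, 1.0, 15)
    sigma = 0.1 * np.ones_like(x)
    z = model([0.5, -1.0], x) + sigma * rng.normal(size=x.size)

    log_post = lambda a: log_prior(a) + log_likelihood(a, model, x, z, sigma)
    chain = metropolis(log_post, alpha0=[0.0, 0.0])
    print(chain[5000:].mean(axis=0), chain[5000:].std(axis=0))   # posterior means and widths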

Prediction and Calibration Including Model Discrepancies
Observables computed with potentials from chiral EFT should exhibit a pattern where contributions from successive orders ν = 0, 1, 2, 3, . . . are smaller by factors Q^ν. This is reflected in Equation (3), and it was shown in references [118,119] how to extract a pdf for the EFT truncation error δ_k in Equation (4) using this information. First, we factor out the overall scale and define δ̃_k as the dimensionless truncation error, i.e., δ_k = O_0 δ̃_k. We now seek an expression for P(δ̃_k|c_0, c_1, . . . , c_k) given the known values for the first k + 1 coefficients. It turns out that for independent, bounded, and uniform prior pdfs for the expansion coefficients, the integrals can be solved analytically if one also approximates δ̃_k by its leading term. Thus, we assume

δ̃_k ≈ δ̃_k^(1) = c_{k+1} Q^{k+1}.

The posterior pdf P(δ̃_k^(1)|c_0, c_1, . . . , c_k) is given in reference [119] (Equation 22), and explicitly derived in the appendix of reference [118]. This posterior pdf is the complete inference about δ̃_k^(1). If the pdf is multi-modal or otherwise non-trivial one should use it in its entirety in forward analyses. However, we can sometimes use a so-called degree of belief (DOB) value to quantify the width of a pdf. This is the probability p%, expressed in percent, that the value of an uncertain variable η, distributed according to the pdf P(η), falls within an interval [a, b]. This interval is then referred to as a credible interval with p% DOB, where

p% = ∫_a^b P(η) dη.    (18)

The posterior pdf for δ̃_k^(1) is not Gaussian; however, it is symmetric and has zero mean. Therefore, we can define a smallest interval [−d_k^(p), +d_k^(p)] that captures p% of the probability mass and solve for d_k^(p). This will define the width of the credible interval within which the next term in the EFT expansion will fall with p% DOB, i.e., an estimate of the truncation error. The expression is derived in references [118,119]; for p% ≤ n_c/(n_c + 1) × 100% it is given by

d_k^(p) = max(|c_0|, |c_1|, . . . , |c_k|) Q^{k+1} × (n_c + 1)/n_c × p%/100,    (19)

where n_c denotes the number of available coefficients. Thus, with n_c/(n_c + 1) × 100% DOB, the EFT truncation error for the observable O, in dimensionful units, is straightforwardly estimated by

δ_k ≈ O_0 × max(|c_0|, |c_1|, . . . , |c_k|) Q^{k+1}.    (20)

This estimate also corresponds to the prescription employed in reference [120]. This a posteriori truncation error estimate essentially boils down to guessing the largest number that one can expect based on a series of numbers drawn from the same underlying distribution. For example, given only one (n_c = 1) expansion coefficient, c_0, we have a 50% DOB that we have encountered the largest coefficient in the series. This procedure has been applied to estimate the truncation error in several ab initio model calculations; see the long list of papers citing references [119,120].
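A compact numerical illustration of this estimate (a minimal sketch under the assumptions above: known coefficients c_0, . . . , c_k, an assumed expansion parameter Q, and the leading-omitted-term approximation; all numbers are hypothetical):

    import numpy as np

    def truncation_error_estimate(c, Q, O_0):
        """EFT truncation error O_0 * max|c_nu| * Q^(k+1) of Equation (20),
        carrying a DOB of n_c/(n_c + 1) for uniform coefficient priors."""
        c = np.asarray(c, dtype=float)
        n_c = c.size                       # number of available coefficients
        k = n_c - 1                        # highest included order
        cbar = np.max(np.abs(c))
        dob = n_c / (n_c + 1.0)
        return abs(O_0) * cbar * Q**(k + 1), dob

    # Hypothetical coefficients for an observable known at orders nu = 0, 1, 2
    delta, dob = truncation_error_estimate(c=[1.0, -0.7, 1.3], Q=0.3, O_0=-28.0)
    print(delta, dob)    # error estimate in the units of the observable, and its DOB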
The procedure for estimating the EFT truncation error, i.e., part of the model discrepancy, requires an estimate of the high-energy scale Λ_b of the underlying EFT. For the models discussed here, the results are based on chiral EFT, for which the naive estimate of Λ_b is roughly M_N ∼ 1 GeV. This was analyzed more carefully for semi-local chiral potentials [45,120] in reference [121]. The posterior pdf for Λ_b indicated that a more probable value is Λ_b ≈ 500 MeV. This value was also used for the breakdown scale in the truncation error analysis of nucleon-nucleon scattering phase shifts from the Δ-full models at LO, NLO, and NNLO chiral orders in reference [110]. The results are presented in Figure 3. This result also strengthens the observation made earlier, that the inclusion of the Δ degree of freedom tends to improve model descriptions of nuclear systems. This is more clearly seen when employing the same potentials to make model predictions for the ground-state energies and charge radii of selected finite nuclei (see Figure 4), and the energy per nucleon in symmetric nuclear matter (see Figure 5).
The model predictions for finite nuclei and nuclear matter indicate that the Δ-full models on average agree better with experimental energies and radii. The uncertainty bands for the predictions were extracted under the additional assumption that the relevant soft scales for finite and infinite nuclear systems are given by the pion mass and the Fermi momentum, respectively. Although these are rough estimates of the soft scales, it is important to note that the truncation error in Equation (20) only holds up to factors of order unity. A comparison of theoretical error estimates based on different statistical methods provides additional validation. The Bayesian method for estimating the truncation error and the model errors estimated using the modified χ² function in Equation (10) are quite different in nature. Nevertheless, the theoretical errors in nucleon-nucleon cross sections at high scattering energies agree very well for these two methods [80,119]. The link between the two approaches for estimating the model uncertainties is discussed further in reference [117]. A complete Bayesian parameter estimation including model discrepancy will hopefully reveal more details about the structure of the chiral EFT error.
At the moment, most model discrepancies in ab initio modeling based on chiral EFT are extracted a posteriori using predictions based on calibrated models. This is possible based on the expectation that the predictions might follow an EFT pattern. This of course remains to be validated on theoretical grounds. However, under the assumption that the interaction potential actually gives rise to an EFT pattern for the observable, we can build on Equation (4) to include a discrepancy term in the likelihood for calibrating ab initio models. See reference [122] for a discussion about correlated truncation errors in nucleon-nucleon scattering observables following this line of thought, where it is also observed that the expansion parameters behave largely as expected.

SUMMARY AND OUTLOOK
Statistical representation of a sound model discrepancy term is certainly challenging. Still, the assumption of zero model discrepancy is a rather extreme position. Almost any reasonable guess is better than nothing in order to avoid false values for the model parameters and to minimize over-fitting.
The importance of acknowledging model discrepancies is neatly summarized in the famous quote of George E. P. Box: "Essentially, all models are wrong, but some are useful" [123], with the additional comment in reference [93]: "But a model that is wrong can only be useful if we acknowledge the fact that it is wrong." Fortunately, most of the ab initio models of atomic nuclei are built on methods from EFT, which by construction promises extra information about the expected impact of the neglected or missing physics in theoretical predictions. Bayesian inference is a natural choice for accounting for model discrepancies and prior knowledge, especially when the priors have a physical basis. Indeed, extracting the posterior pdf for the model parameters via Bayesian inference methods makes it possible to abandon the notion of having a single parameterization of a particular interaction potential and instead build models based on a continuous pdf of parameters. Developments along these lines are already taking place in e.g., density functional theory for atomic nuclei [124].
At the moment, most theoretical analyses of atomic nuclei proceed in the following fashion. Given a potential V(α⋆), optimized to reproduce some set of calibration data D, we set up a model M(α⋆, x) to analyze an experimental result corresponding to the control setting x_i, i.e., we evaluate M(α⋆, x_i). In a few cases we propagate uncertainties originating from the measurement errors present in the data vector D, and sometimes we estimate the EFT truncation error using a series of models at different chiral orders. This takes a lot of effort. Indeed, ab initio nuclear models are represented by complex computer codes, implemented through years of dedicated work by several people, and are computationally expensive to evaluate. On top of that, understanding the underlying nuclear interaction is, arguably, one of the most difficult problems in all of physics. Still, we would like to answer questions like: How much should we trust a model prediction? Is the model M over-fitted? Why is it not agreeing with observed data, and how do we understand this discrepancy?
We should strive to use Bayesian methods for calibrating our models M(α, x) to obtain posterior pdfs P(α|M, D, I) for the parameters. Subsequent evaluations of an observable O_i, corresponding to setting the model control variable to x_i, should be marginalized over the parameter posterior pdf to produce a posterior predictive pdf

P(O_i|D, M, I) = ∫ dα P(O_i|α, M, I) P(α|D, M, I).

This quantity will best reflect our state of knowledge, and is quite meaningful to compare with data. Various marginalizations with respect to subsets of the parameters can provide better insights into the qualities of the ab initio model. Bayesian inference also allows us to compare different models via the computation of Bayes factors [125], which in turn enables us to address questions like: which PC in chiral EFT has the strongest support by data? It is also theoretically straightforward to compute the posterior predictive pdf averaged over a set of different models M = [M_1, M_2, . . .] [126], each weighted by their probability of being true, in the finite space spanned by M, given data D.
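Schematically, given a set of posterior samples of α (e.g., from an MCMC run), the posterior predictive pdf above is approximated by evaluating the model at each sample and collecting the resulting distribution of predictions. The toy model and the posterior samples below are hypothetical placeholders.

    import numpy as np

    def posterior_predictive(model, x_i, posterior_samples):
        """Monte Carlo approximation of the posterior predictive distribution:
        evaluate M(alpha, x_i) for every posterior draw of alpha."""
        return np.array([model(alpha, x_i) for alpha in posterior_samples])

    # Hypothetical toy model and fake posterior samples for two parameters
    model = lambda alpha, x: alpha[0] + alpha[1] * x
    rng = np.random.default_rng(2)
    samples = rng.normal(loc=[0.5, -1.0], scale=[0.05, 0.1], size=(5000, 2))

    predictions = posterior_predictive(model, x_i=0.7, posterior_samples=samples)
    print(predictions.mean(), np.percentile(predictions, [16, 84]))   # central value and 68% interval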

The Computational Challenge
There are several challenges connected with the outlook presented above: working out the theoretical underpinnings of chiral EFT, specifying prior information, formulating model discrepancy terms, and performing challenging Markov chain Monte Carlo evaluations of complicated posterior pdfs. From a practical point of view, handling the computational complexity is the most difficult one. Indeed, evaluating models of medium- and heavy-mass atomic nuclei typically requires vast high-performance computing resources. This clearly puts the feasibility of the Bayesian scenario presented above into question. Without any unforeseen disruptive computer technologies or dramatic algorithmic advances, it will be necessary to employ, where possible, fast emulators that accurately mimic the response of the original ab initio models. This is where we can draw from advances in machine learning. Possibly useful methods are, e.g., Gaussian process regression and artificial neural networks. Both of these approaches can be challenging since they introduce hyperparameters that require additional optimization. Although it can be difficult to assess how well such methods will work, there exist several examples of useful surrogate interpolation and extrapolation in nuclear modeling (see e.g., references [122,124,[127][128][129][130][131]). A new method called eigenvector continuation [132] turns out to be a promising tool for accurate extrapolation and fast emulation of nuclear properties [133]. In a recent paper [134], this method proved capable of emulating (with a root mean squared error of 1%) more than one million solutions of an ab initio model for the ground-state energy and radius of 16O in one hour on a standard laptop. An equivalent set of exact ab initio coupled-cluster computations would require 20 years.
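As a simple illustration of the emulator idea, the sketch below trains a Gaussian process on a modest number of full model evaluations scattered over the parameter domain, and then predicts the observable, with an uncertainty estimate, at new parameter values. The "expensive model" and the two-dimensional parameter space are hypothetical placeholders for an actual ab initio calculation; scikit-learn is used purely for convenience.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    rng = np.random.default_rng(3)

    # Hypothetical "expensive model": stands in for, e.g., a full coupled-cluster run
    def expensive_model(alpha):
        return -100.0 + 5.0 * alpha[0] - 3.0 * alpha[1] + 0.5 * alpha[0] * alpha[1]

    # A small design of training points in a two-dimensional LEC space
    alpha_train = rng.uniform(-1.0, 1.0, size=(40, 2))
    y_train = np.array([expensive_model(a) for a in alpha_train])

    kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
    emulator = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    emulator.fit(alpha_train, y_train)

    # Fast emulated predictions (with uncertainty) at new parameter values
    alpha_new = np.array([[0.2, -0.4], [0.8, 0.1]])
    mean, std = emulator.predict(alpha_new, return_std=True)
    print(mean, std)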

AUTHOR CONTRIBUTIONS
The author has made a substantial contribution to the work, and approved it for publication.