The case for the fundamental $M_{\rm BH}$-$\sigma$ relation

Strong scaling relations between host galaxy properties (such as stellar mass, bulge mass, luminosity, effective radius etc) and their nuclear supermassive black hole's mass point towards a close co-evolution. In this work, we first review previous efforts supporting the fundamental importance of the relation between supermassive black hole mass and stellar velocity dispersion ($M_{\rm BH}$-$\sigma_{\rm e}$). We then present further original work supporting this claim via analysis of residuals and principal component analysis applied to some among the latest compilations of local galaxy samples with dynamically measured supermassive black hole masses. We conclude with a review of the main physical scenarios in favour of the existence of a $M_{\rm BH}$-$\sigma_{\rm e}$ relation, with a focus on momentum-driven outflows.


INTRODUCTION
Observational evidence suggests that most local galaxies host a central supermassive black hole (henceforth simply 'black hole', not to be confused with an 'ordinary' stellar mass black hole). Indeed, galaxies for which high-resolution data can be acquired show stellar kinematic patterns strongly suggesting the presence of a central massive dark object [Ferrarese andFord, 2005, Kormendy andHo, 2013]. The central black hole masses, inferred from dynamical measurements of the motions of stars and/or gas in the host galaxies, appear to scale with galaxy-wide properties (or perhaps bulge-wide properties), such as stellar mass [Magorrian et al., 1998, Häring andRix, 2004] and stellar velocity dispersion [Gebhardt et al., 2000, Ferrarese, 2002, Gültekin et al., 2009, McConnell and Ma, 2013, Savorgnan and Graham, 2015. The existence of such correlations is remarkable, as the black hole's (sub-parsec scale) sphere of influence is orders of magnitude smaller than the scale of it's host galaxy (kilo-parsec scale). These correlations suggest a close link (a "co-evolution") between black holes and host galaxies [Silk andRees, 1998, Granato et al., 2004].
The existence of massive black holes at the centre of galaxies also lends further support to the widelyaccepted paradigm that quasars, and more generally Active Galactic Nuclei (AGN), are powered by matter accreting onto a central black hole. The release of gravitational energy from an infalling body of mass m approaching the Schwarzschild radius R s = 2GM/c 2 of a compact object of mass M , is in fact one of the most efficient known processes to release enough energy to explain the large luminosities in AGN. As discussed by Peterson [1997], the emission from release of gravitational energy increases with the compactness of the source M/r. Assuming that most of the gravitational energy E powering the emission from an accreting black hole originates from within a few times R s , say r = 5R s , one could set E = GM m/5R s , implying E = 0.1mc 2 . The latter efficiency η ∼ 0.1 of energy conversion in units of the rest-mass energy, is orders of magnitude higher than the efficiency in stellar fusion (η ∼ 0.008). Theoretical models have also suggested that the energy/momentum release from the central black hole, routinely known as "AGN feedback", could have profound consequences on the fate of its host galaxy, potentially driving out a galaxy's gas reservoir, quenching star formation, and shaping the above-mentioned scaling relations [Silk andRees, 1998, King, 2019].
The most prominent and studied scaling relations relate the black hole mass M BH to the stellar velocity dispersion σ e [Ferrarese and Merritt, 2000] and the (stellar) mass of the host bulge, M bulge (and by extension the luminosity of the bulge L bulge , see Marconi and Hunt [2003]). Other types of correlations have been proposed in the literature, such as correlations with the bulge light concentration c bulge [Graham et al., 2001] and even the mass of the surrounding dark matter halo M halo [Ferrarese, 2002]. This review will focus on the M BH -σ e relation, where σ e is the stellar velocity dispersion inferred from spectral absorption lines (see Mo et al. [2010], Chapter 2).
The M BH -σ e relation has attracted the attention of the astronomical community since its discovery [Ferrarese and Merritt, 2000], as it is believed to be closely connected to the galaxy/halo gravitational potential well, and thus may be related to the above-mentioned AGN feedback process , as further discussed in Section 3.2. The relation is of the form: (1) Ferrarese and Merritt [2000] initially retrieved a normalization and slope of, respectively, α = 8.14 ± 1.80 and β = 4.80 ± 0.54, whereas more recent work [e.g., Tundo et al., 2007] suggests α = 8.21 ± 0.06 and β = 3.83 ± 0.21. There is some debate in the literature concerning the exact shape of the M BH -σ e and its dependence on, for example, morphological type or even environment (see, e.g., Lauer et al. [2007a], Wyithe [2006] and Hu [2008] for more details). It has been noted (e.g. van den Bosch et al. [2012]) that several overmassive black holes exist on this relation, hosted by galaxies that have undergone fewer than usual mergers, in tension with semi-analytic models [Savorgnan and Graham, 2015]. However, these outliers could simply be the result of incorrect modelling of the galactic bulge/disc .
Several groups have noted that the M BH -σ e relation only weakly evolves with redshift (if at all) [e.g. Gaskell, 2009, Salviander and Shields, 2013, Shen et al., 2015. Supporting work by other groups base their conclusions on direct estimates of the M BH -σ e relation on high redshift quasar samples [Woo et al., 2008], and studies based on comparing the cumulative accretion from AGN with the local black hole mass density, retrieved from assigning to all local galaxies a black hole mass via the M BH -σ e relation (e.g. Shankar et al. [2009a], Zhang et al. [2012]).
On the assumption that all local galaxies host a central black hole, scaling relations could in principle allow us to assign black hole masses to all local galaxies without a direct dynamical mass measurements, thus generating large-scale black hole mass statistical distributions such as black hole mass functions or correlation functions (see Shankar [2009] and Graham and Scott [2015] for more focused reviews on this topic). For example, a number of groups have used luminosity, as performed by Shankar et al. [2004], Salucci et al. [1999] and Marconi et al. [2004], or even Sersic index, as performed by Graham et al. [2007]), to generate black hole mass functions. This procedure of course relies on two assumptions: firstly, that the observer has correctly identified the surrogate observable of black hole mass, and secondly that the established scaling relation is reliable. For example, the M BH -σ e and M BH -L bulge , probably the most commonly used relations, have led to different black hole mass function estimates (Lauer et al. [2007b], Tundo et al. [2007]).
An important question is whether the same black hole-galaxy scaling relations hold for both active and inactive galaxies. Several groups suggest that this is indeed the case [e.g. Reines and Volonteri, 2015, Caglar et al., 2019, Shankar et al., 2019b. It is important to stress that the samples of nearby (inactive) galaxies on which the black hole-host galaxy relations are based, still remain relatively small, only comprising around ∼ 70-80 objects. A key difficulty relies of course in acquiring sufficiently high-resolution data to allow for dynamical black hole mass measurements (see Faber [1999], Ferrarese and Ford [2005] and Kormendy and Ho [2013] for reviews of observational challenges).
Indeed, there is a growing body of work , van den Bosch et al. [2015], Shankar et al. [2016], Shankar et al. [2017], Shankar et al. [2019b], Shankar et al. [2019a]) supporting the view that that current dynamical black hole mass samples may indeed be "biased-high", possibly due to angular resolution selection effects (see Merritt [2013]), with meaningful consequences for any study based on the "raw" relations. Interestingly, Shankar et al. [2016] showed that, via aimed Monte Carlo simulations, irrespective of the presence of an underlying resolution bias, the raw and "de-biased" scaling relations would still share similar slopes and overall statistical properties (e.g., very similar residuals around the mean), with (noticeable) differences arising only in the normalization between observed and de-biased scaling relations. In particular, the M BH -σ e was shown to be more robust and the least affected by possible angular resolution effects.
The main aim of this work is threefold: i) to review the evidence in favour of the primary importance of the M BH -σ e relation above other black hole scaling relations, ii) to provide further support to velocity dispersion as the main host galaxy property driving the connection between black holes an their hosts, and iii) to review the main theoretical scenarios that give rise to the M BH -σ e relation, with a focus on momentum-driven outflows. In sections 2.2 and 2.3 we will describe original evidence based on residuals and principle component analysis (respectively) in support of the primary role played by M BH -σ e . In section 3 we include a description of the theoretical scenarios behind the physical origin of the M BH -σ e relation. We then conclude in section 4.

Review of previous work
Standard regression analyses showed that the M BH -σ e has the lowest intrinsic scatter of any black hole scaling relation; e.g. Gültekin et al. [2009], Saglia et al. [2016] and van den Bosch [2016]. This alone suggests σ e is different from other variables. Beifiori et al. [2012] came to the conclusion that M BH was fundamentally driven by σ e due to its relative tightness. This work also tested the possibility for multi-dimensional relations, concluding that the introduction of additional variables barely reduced the scatter with respect to the M BH -σ e , suggesting that stellar velocity dispersion remains a fundamental driving parameter. The amount of scatter characterizing diverse black hole scaling relations has been studied by several groups (Marconi and Hunt [2003], Hopkins et al. [2007b]). Marconi and Hunt [2003] and Hopkins et al. [2007b] explored the addition of the effective radius R e to σ e to create a "fundamental plane" in the black hole scaling relations, further discussing in Hopkins et al. [2007a] how this relation naturally arises in their simulations, as a (tilted) correlation between black hole mass and bulge binding energy. This conclusion was supported by Saglia et al. [2016], who argued for a multidimensional relation deriving from the bulge kinetic energy (M bulge σ 2 e ), as originally suggested by Feoli and Mele [2005].
de Nicola et al.
[2019] presented a systematic study of black hole scaling relations on an improved sample of local black holes, confirming that "the correlation with the effective velocity dispersion is not significantly improved by higher dimensionality". The authors concluded that the M BH -σ e is fundamental over multidimensional alternatives, independent of bulge decompositions. This is in line with van den Bosch [2016], who claimed that the M BH -M bulge is mostly a projection of the M BH -σ e relation.
On more general grounds it has been suggested that, in terms of galactic scaling relations, velocity dispersion may be statistically more significant and relevant than other galaxy observables (e.g., Bernardi et al. [2011a], Bernardi et al. [2011b]). Bernardi et al. [2005] analysed the color-magnitude-velocity dispersion relation of a early-type galaxy sample of the Sloan Digital Sky Survey (SDSS), concluding that color-magnitude relations are entirely a consequence of the combination of more fundamental correlations with velocity dispersion. Bernardi et al. [2007] noted that the M BH -σ e and M BH -L bulge predict different abundances of black holes, with the former predicting a smaller number of more massive black holes. Interestingly, the combined σ e -L relation (for the dynamically measured black hole sample, e.g. Yu and Tremaine [2002]) is inconsistent with the same relation from the SDSS, with smaller L bulge for given σ e (regardless of the band used to estimate luminosity). This suggests that the dynamical sample of local black holes may be biased towards objects with higher velocity dispersion when compared to local galaxies of similar luminosity, which obviously calls into question the accuracy of the raw M BH -σ e and M BH -L bulge relations. While unable to identify the source of the bias, modelling of this effect by Bernardi et al. [2007] and Shankar et al. [2016] suggested that the bias in the M BH -σ e is likely to be small, whereas the M BH -L bulge is likely to predict over-massive black holes at a fixed galaxy (total) luminosity/stellar mass.

Residuals analysis
We start by revisiting the residual analysis on the black hole scaling relations following the method outlined in Shankar et al. [2016], Shankar et al. [2017] and Shankar et al. [2019b]. Residuals in pairwise correlations [Sheth and Bernardi, 2012] allow for a statistically sound approach to probe the relative importance among variables in the scaling with black hole mass. Residuals are computed as where the residual is computed in the Y variable (at fixed X) from the log-log-linear fit of Y (X) vs X, i.e. log Y | log X . For each pair of variables, each residual is computed 200 times, and at each iteration five objects at random are removed from the original sample. From the full ensemble of realizations, we then measure the mean slope and its 1σ uncertainty.
Our results are shown in Figures 1 and 2, which show the residuals extracted from the recent homogeneous sample calibrated by de Nicola et al. [2019]. Figure 1 shows that black hole mass strongly correlates with velocity dispersion at fixed galaxy luminosity with a Pearson coefficient r ∼ 0.7 (top left panel), and even more so at fixed effective radius with r ∼ 0.8 (bottom left panel), while the corresponding correlations with stellar luminosity or effective radius are significantly less strong with r ∼ 0.4 at fixed velocity dispersion (right panels). Figure 2 shows the residuals restricting the analysis to only early type galaxies (red circles). The residuals appear quite similar in both slopes and related Pearson coefficients. These results further support the findings by Shankar et al. [2016] (shown, for comparison, in figure 3) that velocity dispersion is more fundamental than effective radius and stellar mass, and that even disc-dominated galaxies follow similar scaling relations.

PCA analysis
We will now present additional original work in favour of the M BH -σ e being more fundamental, via Principal Component Analysis (PCA; Jolliffe, 1986), which is a powerful complementary statistical technique to the residuals analysis presented above. PCA is a mathematical procedure that diagonalises the covariance matrix of variables in a dataset, providing a set of uncorrelated linearly transformed parameters, called principal components, defined by a set of orthogonal eigenvectors. The new orientation ensures that the first principal component (PC1) contains as much as possible of the variance in the data, and that the maximum of the remaining variance is contained in each succeeding orthogonal principal component (PC2, PC3,etc.). In other words, PCA finds the optimal projection of a number of (possibly correlated) physical observables into a smaller number of uncorrelated variables, revealing which quantities are more responsible for the variance (or, in some sense, for the information) in the dataset. PCA has been widely adopted in extragalactic astronomy, for instance to search for possible dimensionality reduction of the parameter space necessary to describe a sample (e.g., Lara-López et al., 2010, Hunt et al., 2012 or to study the mutual dependencies between observed gas-and metallicity-based galaxy scaling relations (e.g., Bothwell et al., 2016, Hunt et al., 2016, Ginolfi et al., 2019. Here we use PCA as an alternative technique to explore the black hole scaling relations. In detail, by quantifying through PCA the robustness of the correlations between M BH and, in turn, σ e , L (total luminosity) and R e (the bulge effective radius), we can infer which of these observables provides a more fundamental scaling relation.

Black Hole scaling relations
In the PCA analysis we continue to use the dataset from de Nicola et al. [2019]. To ensure that quantities with a higher dispersion are not over-weighted, we normalize our variables to their mean values, dividing by the standard deviation of their distributions. We therefore define the new variables (for convenience, in what follows we simply define L = L K ): We perform three different PCA on the 2D-space datasets formed by M BH and, in turn, one among σ e , L and R e . The resulting principal component coefficients are reported in Table 1. We account for uncertainties in our results following a commonly adopted method (see e.g., Bothwell et al., 2016, Ginolfi et al., 2019. We perform a Monte Carlo bootstrap running 10 5 iterations, in each of which we perturb all the analysed quantities by an amount randomly extracted in a range of values defined by their respective measurement errors. Thus, the reported principal component's coefficients and their errors are computed, respectively, from the average and the standard deviation of the values obtained over all the iterations. In the upper panels of Figure 4 we show the determined mutually orthogonal eigenvectors drawn onto the planes defined by the 2D-space datasets consisting of log(M BH ) PCA and, in turn, log(σ e ) PCA , log(L) PCA and log(R e ) PCA . The three datasets, projected into the principal components, are shown in the lower panels of Figure 4. We find that, although in all three cases PC2 contains only a small fraction of the total variance (see Table 1), confirming that an overall good physical correlation exists among the variables, in the M BH -σ e relation PC2 is minimised and the dataset can be very well described uniquely by the PC1. In detail, we find that in the M BH -σ e relation 95.4 ± 0.4 % of the variance is contained into PC1 (with the little remaining information contained in PC2), while lower amount of variance are contained in the PC1 of the M BH -L relation (89.7 ± 0.5 %) and in the PC1 of the M BH -R e relation (88.2 ± 0.3 %).
Since in all the three cases PC2 contains a little variance, we can set it to zero to obtain a linear approximation of the correlation among our observables from the PCA projected datasets. Thus, we obtain the following PCA model predictions: where the non-normalized variables are restored (as defined in the equations discussed above), and the errors on the parameters are computed propagating the uncertainties on the principal component coefficients and on the mean values of the distributions.
In the upper panels of Figure 5 we show a comparison between the observations in our 2D-space datasets and the PCA model relations, while in the lower panels we show the distributions of the corresponding residuals. We find that the relation for which our PCA model can better reproduce the data is the M BH σ e , with a Gaussian 1σ scatter of σ ∼ 0.47. For the M BH -L and M BH -R e relations, our PCA model yields larger scatters in the residuals, respectively σ ∼ 0.63 and σ ∼ 0.7. These larger scatters are linked to the lower variance contained by PC1 in the samples respect to the M BH -σ e case, and therefore a more significant loss of information when setting PC2 to zero.
Altogether, our 2D-space PCA analysis suggests that, among σ e , L and R e , σ e is the observable that better correlates with M BH . The M BH -σ e is the more fundamental scaling relation, with more than 95% of the information contained in the PC1 and only a scatter of σ 0.5 in the residuals between the data and the PCA model relation.

4D-space PCA
As a complementary way of exploring the mutual dependencies among the observables in our sample, we perform a PCA in the 4D-space defined by M BH , σ e , L and R e .
We find that PC1 contains 86.1 ± 0.4 % of the variance, confirming that the full set of four observables can be approximately well described by a 2D surface. PC2 contains 9.8 ± 0.4 % of the variance, meaning that accounting for a third dimension could recover ∼10 % of the information, while PC3 and PC4 contains only ∼ 2 % (see Table 2). Following the same scheme discussed in Sec. 2.3.1, setting to zero the PC that contain less variance, we obtain the best PCA model relation that expresses M BH in terms of the other observables in the dataset: Consistently with the result obtained in Section 2.3.1, we find that the primary dependence is attributed to σ e , i.e., the quantity that better describes M BH . L and R e have a secondary and tertiary dependence respectively, with relatively much lower weights (∼ 16% and ∼ 10%, computed as the ratios between the coefficients) with respect to σ e Interestingly, as shown in Figure 6, the residuals obtained from the 4D-space PCA model relation are worse (σ ∼ 0.55) than in the 2D-space PCA model relation obtained trough the optimal projection of the M BH -σ e space. This effect is likely to be ascribed to some intrinsic noise introduced when adding L k and R e in a 4D-space.

THEORETICAL PERSPECTIVE
As we have seen, a growing body of work is pointing to the fundamental importance of the M BH -σ e . A key perspective that we have so far neglected in regards to black hole scaling relations is that of the theoretical modeller, which we will explore in this section.
The parameters of the galaxy that correlate with M BH tell us which physical processes are most important in setting the black hole mass. Each parameter is related to certain physical quantities. For example, velocity dispersion is naturally related to the mass of the galaxy's spheroidal component, and by extension to its gravitational potential. In the simplest case, modelling the bulge as an isothermal density profile, gas density is ρ ∝ σ 2 e and its weight (the product of the gas mass and gravitational acceleration) is W ∝ σ 4 e . Therefore, modelling a connection between the upper limit of the black hole mass and the weight of the gas surrounding it may indeed be a good starting point to explaining the correlation. Alternatively, if the SMBH mass were controlled by stellar processes, such as turbulence driven by stellar feedback, we would expect a strong correlation between M BH and stellar mass. Similarly, if the rate of SMBH feeding from large-scale reservoirs were an important constraint, a correlation with the bulge size R e or dynamical timescale t dyn R e /σ e might emerge. The fact that such correlations are not seen suggests that these processes are secondary to the host's gravitational potential.
A very promising group of models that have emerged over the past two decades are those based on AGN feedback [Silk and Rees, 1998, Harrison, 2017, Morganti, 2017. The common argument is that AGN luminosity transfers energy to the surrounding gas and at some point drives it away, quenching further black hole growth. These models are generally capable of explaining not only the σ e relation, but also the presence of quasi-relativistic nuclear winds and large-scale massive outflows observed in many active galaxies. Other models that presume either no causal connection between galaxy and black hole growth [Peng, 2007, Jahnke andMacciò, 2011] or those that claim the black hole to be merely a passive recipient of a fraction of the gas used to build up the bulge [Haan et al., 2009, Anglés-Alcázar et al., 2013 make no predictions regarding outflows and generally connect the black hole mass to the mass, rather than velocity dispersion, of the galaxy bulge.
There are several ways of transferring AGN power to the surrounding gas, e.g. radiation, winds and/or jets [Morganti, 2017]. Jets are typically efficient on galaxy cluster scales, heating intergalactic gas and prevent it from falling back into the galaxy [McNamara and Nulsen, 2007]. This process, referred to as "maintenance mode" of feedback, prevents the SMBH mass from growing above the limit established by the M BH -σ e relation. Jets are considered to be the primary form of feedback in AGN that accrete at low rates and have luminosities L < 0.01L AGN [Merloni and Heinz, 2007]. The opposite type of feedback is known as "quasar mode", and it is believed to be most efficient in more luminous AGN. Here, again, there are two possibilities in which energy can be transferred. Directly coupling AGN luminosity to the gas in the interstellar medium is possible if the gas is dusty (due to a very high opacity, see Fabian et al. [2008]). On the other hand, dust evaporates when shocked to the temperatures expected within AGN outflows [Barnes et al., 2018], potentially limiting the impact of radiation-driven outflows. A much more promising avenue is to connect the AGN with the surrounding gas via a quasi-relativistic wind [King and Pounds, 2015a]. Such a model naturally produces both a M BH -σ e relation similar to the observed one, and outflow properties in excellent agreement with observations, both within galaxies [Zubovas and King, 2012a, Cicone et al., 2014, Menci et al., 2019 and on intergalactic scales in galaxy groups [Lapi et al., 2005].

AGN wind-driven feedback
AGN are highly variable on essentially all timescales and are known to occasionally reach the Eddington luminosity where κ e.s 0.346 cm 2 g −1 is the electron scattering opacity. Under such circumstances, the geometrically thin accretion disc produces a quasi-spherical wind that self-regulates to an optical depth τ ∼ 1 [King and Pounds, 2003]. Therefore each photon emitted by the AGN will, on average, scatter only once before escaping to infinity, and the wind carries a momentum ratė whereṀ w is the wind mass flow rate, v w is the wind velocity and L AGN ≡ lL Edd is the AGN luminosity, where l is the Eddington ratio. By writing L AGN = ηṀ BH c 2 , we find the wind velocity to be whereṁ ≡Ṁ w /Ṁ BH . The value ofṁ is highly uncertain, but should not be extremely different from unity. To see this, consider the extreme ends of the possible range ofṀ BH . If the accretion rate on to the accretion disc is significantly below Eddington, no wind is produced, while if the accretion rate rises above the Eddington limit, the wind moderates the accretion flow. Overall, the highest possible average accretion rate is the dynamical rate:Ṁ where f g 0.16 is the cosmological gas fraction and σ ≡ 200σ 200 km s −1 is the velocity dispersion in the galaxy [King, 2010a, King andPounds, 2015a]. In deriving the second equality, we used the M BH − σ relation that is derived below, in eq. 19. Therefore, in most cases, the SMBH feeding rate is not significantly higher than the Eddington rate, unless M BH is well below the observed relation. As a result, we takeṁ ∼ 1 for the rest of this section. This leads to the final expression for the AGN wind velocity v w ηc 0.1c, which is very close to the average velocity in observed winds [Tombesi et al., 2010a,b]. The kinetic power of the wind isĖ The wind rapidly reaches the interstellar medium (ISM) surrounding the AGN and shocks against it. The shock is strong, since v w /σ 1, and the wind heats up to a temperature where m p is the proton mass, and k b is the Boltzmann constant. The most efficient cooling process at this temperature is Inverse Compton (IC) cooling via interaction with AGN photons [King, 2003b, Faucher-Giguère andQuataert, 2012]. Most of the photons interact with electrons in the shocked wind, and a two-temperature plasma develops [Faucher-Giguère and Quataert, 2012]. The actual cooling timescale then depends on the timescale for energy equilibration between electrons and protons. As a result, cooling is highly inefficient and the shocked wind can expand as an approximately adiabatic bubble.
The subsequent evolution of the expanding bubble depends on the density structure of the ISM. Most of the energy stored in the hot wind bubble escapes through the low-density channels and creates a large-scale outflow [Zubovas and Nayakshin, 2014]. Denser clouds, however, remain and are mainly affected by the direct push of the wind material. These two situations create two kinds of outflow, known as energy-driven and momentum-driven, respectively. The latter kind is responsible for establishing the M BH − σ e relation.

The predicted relation
Momentum-driven outflows push against the dense clouds surrounding the black hole. These clouds are the most likely sources of subsequent black hole feeding, therefore their removal quenches further black hole growth for a significant time and establishes the M BH -σ e relation [King, 2003b, Murray et al., 2005, King, 2010b. Considering the balance between AGN wind momentum and the weight of the gas W gas leads to a critical AGN luminosity required for clearing the dense gas: Frontiers where the second equality assumes that the gas distribution and the background gravitational potential are isothermal, i.e. ρ = σ 2 / 2πGR 2 [Murray et al., 2005]. Equating this critical luminosity with the Eddington luminosity of the black hole allows us to derive a critical mass [King, 2010b]: This value is very close to the observed one, although it has a slightly shallower slope. This discrepancy may be explained by the fact that the black hole still grows during the time while it drives the gas away [Zubovas and King, 2012b]. As the gas is pushed away, it joins the energy-driven outflow. This outflow coasts for approximately an order of magnitude longer than the AGN phase inflating it and stalls at a distance [King et al., 2011] where t AGN is the duration of the driving phase and the energy-driven outflow velocity is [King, 2005, Zubovas andKing, 2012a] v e = 2ηc 3σ 0.16 f g By equating R stall with either the bulge radius or the virial radius of the galaxy, we obtain the time t AGN for which the galaxy must be active in order to quench further accretion on to the black hole and find t AGN ∝ Rσ −1/3 ∝ σ 2/3 , since R ∝ σ e on average [this relation arises from the Fundamental plane of galaxies, see Djorgovski andDavis, 1987, Cappellari et al., 2013]. Note that this growth does not need to happen all at once: as long as the outflow is still progressing by the time the next episode begins, the system behaves as if it was powered by a continuously shining AGN [Zubovas, 2019].
This extra growth steepens the M BH -σ e relation beyond the simpler analytical prediction and brings it more in line with observations [Zubovas and King, 2012b]. Furthermore, it shows that galaxy radius may be an important secondary parameter determining the final black hole mass.
As a final note, the extra black hole growth while clearing the galaxy also depends on its spin. Since a rapidly spinning black hole produces more luminosity and drives a faster outflow than a slow-spinning one, the latter has to be active for longer and grow more before it clears the gas from the galaxy. Although present-day estimates of black hole spins are not robust or numerous enough to test this prediction in detail, this might become possible in the near future [Zubovas and King, 2019].
In general, theoretical models based on momentum-driven outflows are capable of naturally explaining the relationship between black hole mass and velocity dispersion, primarily due to the latter acting as a tracer of the host's gravitational potential well. In addition, these models could account for secondary, weaker dependencies on, e.g., galaxy stellar mass or size, which may still be allowed by current data as discussed above (see, e.g., Figures 1 and 5 and Shankar et al. 2016).

DISCUSSION AND CONCLUSIONS
In this paper we have reviewed previous evidence for the M BH -σ e being the most fundamental of all black hole-host galaxy scaling relations (among those discovered so far) and we have presented new evidence based on the statistical analysis of the sample recently compiled by de Nicola et al. [2019]. Both residuals (e.g., Shankar et al. [2016]) and PCA analyses point to σ e being more fundamental than both stellar luminosity/mass or effective radius in their correlation to central black hole mass.
Theoretically, as reviewed by King and Pounds [2015b], the M BH -σ e arises as a consequence of AGN feedback. In short, the black hole in these models is expected to grow until it becomes massive enough to drive energetic/high-momentum large-scale winds that can potentially remove residual gas, inhibiting further star formation and black hole growth. The limiting mass reached by the black hole, which ultimately depends on the potential well of the host, naturally provides an explanation for the existence of the M BH -σ e relation.
Its fundamental nature and lower inclination towards selection biases (in comparison to other scaling relations, e.g. Shankar et al. [2016]) make the M BH -σ e relation the ideal benchmark for statistical studies of black holes in a variety of contexts. The M BH -σ e relation should always be the one adopted to constrain the f vir factor used in reverberation mapping studies (see e.g. Vestergaard and Peterson [2006]) to infer black hole masses from active galaxies (e.g., Shankar et al. [2019b]). The M BH -σ e relation also provides more robust large-scale clustering predictions in black hole mock catalogues [Shankar et al., 2019a]. Furthermore, pulsar timing array predictions of the gravitational wave background (e.g. Kramer and Champion [2013]) are strongly dependent on the normalization of the black hole scaling relations (Sesana et al. [2008], Shankar et al. [2016]), but they should be based on the M BH -σ e rather than on the M BH -M * relation (see Rosado et al. [2015]).
The shape and scatter of the M BH -σ e relation could yield important information on the evolutionary channels of black hole growth. For example, its scatter could retain memory of the merger histories of the host galaxies [Savorgnan and Graham, 2015]. More broadly speaking, global star formation and black hole growth from continuity equation argument modelling is known to peak at around z ∼ 2 (e.g. Shankar et al. [2009b], Delvecchio et al. [2014]). This is in itself consistent with the idea that black holes and their hosts may be co-evolving, and understanding how the M BH -σ e relation precisely evolves over cosmic time or change as a function of environment could set invaluable constraints on the mechanisms behind black hole growth (e.g. Hirschmann et al. [2014], Fontanot et al. [2015] and Sijacki et al. [2015]).

CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.