Toward Snow Cover Estimation in Mountainous Areas Using Modern Data Assimilation Methods: A Review

The snow cover is a key component of land surface hydrology, especially in mountain areas where it governs the amount and timing of water availability in downstream areas. It is involved in relevant climate feedbacks and natural hazards such as avalanches and floods. Monitoring and forecasting snow cover characteristics is challenging. While snow cover extent is relatively easy to retrieve from satellite data, remote sensing retrievals of the snow water equivalent (SWE) is often inaccurate, particularly in complex mountainous terrain. Model-based snow cover estimates, driven by meteorological data, often bear significant uncertainties due to both input data and model errors. Data assimilation can combine both approaches to improve SWE estimates. In this paper, we review current state-of-the-art data assimilation methodologies used to optimally combine measurements with snow cover models in order to reduce uncertainties. The suitability of a given data assimilation method varies with the numerical complexity of snow models as well as the availability and the type of observations. This review describes the issues and challenges associated with data assimilation applied to the mountain snow cover, providing recommendations for existing and upcoming monitoring and prediction systems of snow hydrology in mountainous regions.


INTRODUCTION
Snow on the ground plays a major role in the hydrological cycle, especially in mountain areas. The seasonal snow cover is the primary water source for human use and ecosystems in mountain regions and represents up to 95% of the water supply in some mountain basins (Liniger et al., 1998). In these regions, the runoff is heavily dependent on snowmelt amount and timing (Bowling et al., 2003;Bales et al., 2006). Monitoring of snow mass, most often referred to snow water equivalent or SWE, and more generally the state of the snow cover is particularly relevant for the management of water resources and risks from natural hazards, such as avalanches and snowmelt-induced floods (Sui and Koehler, 2001;Viviroli et al., 2011;Finger et al., 2012;Freudiger et al., 2014). Due to the influence of the snow cover on the surface energy and water balance (Hansen and Nazarenko, 2004;Flanner and Zender, 2006;Hall and Qu, 2006;Qu and Hall, 2014), its representation in land surface models is essential for predicting the thermal state of the underlying soil, atmospheric interactions and the hydrological cycle (Gouttevin et al., 2012).
Snow is a complex medium that is highly variable both in time and space. This variability is especially enhanced in mountainous areas, where it is governed by both the variability in the accumulation processes, i.e., amount of solid precipitation, wind transportation, and in the ablation processes, e.g., incoming shortwave and longwave radiation, wind speed and direction, air temperature and humidity. In mountainous terrain, the complex topography and the high spatial variability in land cover (e.g., forest, rocks, pasture) are responsible for a high spatial and temporal variability in both accumulation and ablation processes, resulting in a high variability in mountain snow cover properties (Clark et al., 2011).
The spatial variability of the snow cover in mountains areas prevails across all spatial scales from the hill-slope scale (1-100 m, e.g., preferential deposition, topographic shading) to the watershed scale (100-10,000 m, e.g., orographic precipitation, temperature gradient with elevation) (Mott et al., 2018). Accurately representing the spatial variability in snow cover properties at the hill-slope scale is crucial to monitor the changes in snow mass at the scale of a watershed since it controls the nonuniform snow disappearance (Clark et al., 2011;Revuelto et al., 2016). This is also essential for assessing the mechanical stability of the snowpack (Schweizer et al., 2003). As a consequence, observations or simulations useful to depict the spatial variability of the snow cover in mountainous regions typically require a spatial resolution better than a few hundred meters (Winstral et al., 2014). For example, Baba et al. (2018) identified a 250 m grid spacing to be the best compromise for the spatial resolution of simulations in the High Atlas. Thus, snow cover monitoring is facing unique challenges in mountain areas, compared to other snow-covered landscapes, on the observational and modeling sides (e.g., Bormann et al., 2018).
The monitoring of the evolution of the snow cover (e.g., snow depth) and its surface properties (e.g., albedo, surface temperature) is possible through in situ (Essery et al., 2013;Lejeune et al., 2018;Ménard et al., 2019) and remotely-sensed observations (Nolin, 2010;De Lannoy et al., 2012;Dietz et al., 2012a;Gascoin et al., 2015;Du et al., 2019). In situ datasets are essential for the understanding of snow physical processes, and for the evaluation of snow cover model and remotesensing products (e.g., Vionnet et al., 2012;Krinner et al., 2018). However, point measurements inherently limit the spatial extent of data thus not fully capturing information on the spatial heterogeneity. Most snow cover properties, e.g., snow depth, exhibit a very high spatial variability and consequently, most of these point measurements are also not applicable at large scales. To monitor the evolution of snow cover over large areas, the use of remote-sensing observations, either ground-based, airborne or spaceborne are needed (Hüsler et al., 2012;Painter et al., 2016;Arslan et al., 2017). Remote sensors have different measurement wavelengths and employ various operating principles (active or passive). For instance, multispectral images from optical sensors provide information on the visible and near infra-red wavelengths with a spatial resolution ranging from tens of centimeters to hundred of meters from space. Under cloud-free conditions, these images provide snow cover extent information, but also information on albedo and light absorbing particles, as well as information relevant to the state of the snow microstructure (Hall et al., 1995;Hall and Riggs, 2007;Painter et al., 2009Painter et al., , 2012Dietz et al., 2012b;Gascoin et al., 2015). However, the retrieval of the snow mass from satellite data only has remained problematic especially in mountainous areas (Lettenmaier et al., 2015).
Snow cover properties can also be monitored and predicted using numerical models of snow cover evolution. The snow cover was first included in land surface models to obtain improved time varying estimation of the surface energy balance over cold regions (Loth et al., 1993;Lynch-Stieglitz, 1994;Douville et al., 1995;Slater et al., 1998;Yang et al., 1998). The representation of snow on the ground in land surface models is indeed crucial in view of its role in energy exchanges on the climate system and the hydrological cycle. More complex snow cover models, that included a detailed representation of physical processes occurring within the snowpack, were developed in the 1990s for specific applications such as avalanche hazards forecasting (Brun et al., 1992;Lehning et al., 1999). In order to better represent and account for the interactions between snow on the ground and the climate system, some of these detailed processes have been progressively incorporated into land surface model (Boone and Etchevers, 2001;Dutra et al., 2010;Vionnet et al., 2012;Decharme et al., 2016), leading to the development of numerous snow cover models.
Observations and numerical simulations represent two sources of information that can be exploited simultaneously, but inherently carry uncertainties (Dong, 2018). Data assimilation aims to optimally combine observed and simulated snow properties in order to provide improved estimates of the snow cover state. A key advantage of data assimilation is that its output can have lower uncertainty than either the observation or the model. Data assimilation techniques can be used to improve initial conditions of the snow cover for operational Numerical Weather Prediction (NWP) or hydrological prediction (Brasnett, 1999;Lehning et al., 1999;Barret, 2003;Drusch et al., 2004) and avalanche hazard forecasting . In data assimilation, observations are used to correct a set of parameters or a physical state given by a model in order to improve the overall representation of a particular system (Andreadis and Lettenmaier, 2006;Clark et al., 2006;Nagler et al., 2008;Leisenring and Moradkhani, 2011;Liu et al., 2013). Various data assimilation methods exist, each characterized by advantages and drawbacks.
The utility of and interest in data assimilation approaches has not gone unnoticed in the snow cover modeling community and brought about an increasing number of publications on application of data assimilation in very specific settings, determined by previous model choices and/or data availability (Girotto et al., 2020). In this paper, we aim to provide an overview of data assimilation methods used for snow related applications in mountainous regions, in view of the unique challenges associated with this complex environment. The study discusses the strengths and weaknesses of these data assimilation systems and provides recommendations for future snow cover monitoring and prediction systems in mountainous areas. Note that assimilation in atmosphere-surface coupled systems in view of NWP applications is beyond the scope of this review. This article firstly describes the different types of snow cover observations and models along with their associated uncertainties and limitations in mountainous environments. The study then reviews existing data assimilation methodologies, including a description of the limits of each method for snow cover applications. The study also discusses the dependencies of data assimilation methods on the characteristics of the observations and snow cover models.

SNOW COVER OBSERVATIONS
This section describes the different observations commonly used to determine the evolution of the snow cover, with an emphasis on its water equivalent, because of its critical relevance to mountain snow hydrology. This section only includes observations that are relevant to data assimilation systems.

Ground-Based Observations
Site measurements describe the characteristics of the snow cover at a given location at one point (fixed instruments) or over an area using techniques such as terrestrial laser scanning, webcam photography or airborne sensors. Groundbased observations are performed manually or automatically by fixed measuring instruments that provide nearly continuous data. These observations are performed either at conventional operational meteorological stations or at dedicated experimental snow sites.

Meteorological Stations
Weather stations provide continuous meteorological information at a particular location, such as wind speed, temperature, incoming radiation, specific humidity, and precipitation rate. Such weather stations cover the whole globe with widely variable spatial densities. These sites often provide all data needed to run a simple snow cover model. Complementary snow cover measurements from these sites, such as snow depth and density, therefore form the basis for single-point data assimilation experiments. Manual and automatic snow cover measurements present in weather stations in Europe are listed in detail in the European Snow Booklet (ESB) (Haberkorn, 2019).

Dedicated Snow Cover Monitoring Sites
The monitoring of the snow cover at conventional weather stations is often insufficient to improve process understanding and to perform a thorough snow cover model evaluation. To overcome this lack of information, dedicated monitoring sites have been installed and include observation systems relevant to address key snow cover processes. These sites typically include automatic instruments providing daily and hourly measurements of meteorological and snow cover variables such as presence of snow, snow depth, SWE, snow density, albedo, snow temperatures, and runoff (Pirazzini et al., 2018). Discontinuous measurements are often made seasonally, especially during winter, such as snow stratigraphy profiles (Fierz et al., 2009;Morin et al., 2012). Only a few sites measure most of the meteorological and snow cover variables required to address the snow surface energy balance. These experimental sites, such as Col de Porte (Lejeune et al., 2018), Weissfluhjoch , Sodankyla (Essery et al., 2016), or Reynolds Creek (Reba et al., 2011), are particularly relevant for studying the detailed evolution of the snow cover (e.g., Dumont et al., 2017;Würzer et al., 2017). These sites have been used extensively for snow model evaluation (Wang et al., 2013;Magnusson et al., 2015;Decharme et al., 2016;Piazzi et al., 2018) and model-intercomparison, such as SnowMIP (Etchevers et al., 2004) and ESM-SnowMIP (Krinner et al., 2018). An overview of the observation datasets at ten such sites is provided in Ménard et al. (2019).
For some of these sites, distributed measurements of SWE (see next section) or snow depth are also available. Such distributed measurements can be obtained via manual snow survey or via new remote sensing techniques. Snow depth are now commonly obtained with sub-meter spatial resolution and with high accuracy, i.e., a few cm, using terrestrial laser scanning (e.g., Prokop et al., 2008;Grünewald et al., 2010;Deems et al., 2013), Unmanned Aerial Vehicles (UAV), structure from motion (e.g., Bühler et al., 2017) and airborne LIDAR (see section 2.3). Note that mobile precipitation radar can also be used to monitor the small-scale variability of snowfall (Scipión et al., 2013).

SWE Measurements: Methods and Uncertainties
Various methods can be used to measure SWE, automatically and manually, as reviewed in detail in the WMO guide to meteorological instruments and methods of observations (WMO, 2018). SWE manual measurements are based on bulk density samples made with a snow sampling tube (e.g., Leppänen et al., 2016) or by integrating the SWE within each snow layer from snow pits. The relatively sparse nature (in space and time) of manual measurements has led to the use of more automated measurements. The most commonly used technique for continuous SWE measurements is the snow pillow (Serreze et al., 1999;Smith et al., 2017), where pressure sensors measure the hydrostatic pressure of the overlying snow (Beaumont, 1965). Gamma-ray sensors can also be used to measure SWE continuously using in situ sensors or from airborne measurements (Cho et al., 2020). Originally, these sensors used a radiation source (Harding, 1986). However, for safety and environmental reasons, this instrument has been gradually replaced by sensors sensitive to natural emitted radiation such as cosmic-ray sensors and passive gamma radiation (Potassium and Thallium) sensors Martin et al., 2008). SWE measurements are also accessible through the attenuation of the Global Navigation Satellite System signal (e.g., Koch et al., 2019) with reasonable accuracy (41-73 kg m −2 depending on snow wetness). Distributed measurements of SWE can also be accessed via Ground Penetrating Radar measurements (GPR) (e.g., Marchand et al., 2001;Griessinger et al., 2018). An assessment of the different SWE measurements methods was performed during the WMO SPICE experiment (Smith et al., 2017). Recent efforts have been made to provide detailed descriptions of existing measurement methods (Pirazzini et al., 2018;Haberkorn, 2019).
The uncertainties associated with in situ datasets are not only due to the intrinsic uncertainty of snow cover measurements, but also to the representativeness of the measurements with respect to the model. In mountain environments, the large spatial variability of snow depth around a given observation point requires that this point scale data is used with great caution (Grünewald et al., 2013). In the studies of Lafaysse et al. (2017) and Lejeune et al. (2018) on snow depth measurements, the instrumental error was determined to be low (about 1 cm) compared to the spatial variability of snow depth of the surrounding environment (up to 20 cm). Conversely, the instrumental uncertainty of SWE measurements can be of the same order of magnitude as their spatial variability (Lafaysse et al., 2017;Smith et al., 2017;Lejeune et al., 2018;Nitu et al., 2018). SWE measurement uncertainties also vary as a function of the type of instruments, methodologies and snow cover state (e.g., Egli et al., 2009). Many early studies (e.g., Goodison, 1978) found that snow tubes over-estimated SWE (Johnson et al., 2015) whereas Sturm et al. (2010) estimated an average underestimation of 7.1%. Dixon and Boon (2012) also found general underestimations of SWE. Hill et al. (2019) summarized these findings suggesting that coring devices should be viewed as having an accuracy of ±10%. Snow pillow SWE measurements, on the other hand, bear uncertainties which range from 6 to 12% in optimal conditions. This method is particularly suitable in cold climates. The uncertainty of snow pillow measurements increases in case of snow bridging and with the presence of liquid water in the snowpack, therefore they can be particularly unsuitable during the snowmelt season (Engeset et al., 2000;Johnson and Marks, 2004;Smith et al., 2017). SWE measurements obtained by cosmic-ray sensors have an accuracy around 5-10% (Gottardi et al., 2013). Passive gamma radiation method has shown RMSE values from 18 to 59 kg m −2 compared to manual measurements for Finnish and Canadian sites (Smith et al., 2017), where the comparison between automated and manual observations were carried out over a distance ranging from 5 to 25 m. These differences can partly be explained by the spatial variability of the snow cover (Smith et al., 2017;Lejeune et al., 2018).

Satellite Remote Sensing Observations
Remote sensing observations are crucial for monitoring the evolution of the snow cover over large areas. A wide variety of satellite sensors exist with various spatial resolutions and wavelengths. The choice of remote sensing techniques for snow cover products varies with the desired temporal and spatial resolution. This section describes the different types of satellite sensors (optical and microwave satellite sensors) used for monitoring snow hydrology and, indirectly, SWE. Examples of space-borne instruments or satellites used in snow cover studies with their general properties classified by type of satellites sensors are summarized in Table 1. Note than an overview of existing satellite snow products over Europe and the Northern Hemisphere is available in Bartsch (2018).

Optical Sensors
Passive optical sensors use a multi-spectral imaging system in the visible and infrared domain. By design, data is only available during daytime and under cloud-free conditions. The spatial resolution of optical satellites varies from 50 cm to 1 km. Optical sensor measurements are useful for providing observations of the surface area covered by snow. They are mainly used to monitor the snow cover area (SCA) providing a binary information on the presence or absence of snow per pixel, and the snow cover fraction (SCF), which provides the percentage of snow coverage per pixel. SCA and SCF retrievals are mostly based on two different methods. The normalized difference snow index (NDSI) (Dozier, 1989;Hall et al., 1995) uses reflectance data from both visible (VIS) and short wave infrared (SWIR), thereby taking advantage of the high contrast in reflectance between the two wavelengths. SCA can be estimated by using a threshold on NDSI to determine the presence of snow and SCF can be estimated from a linear relationship to NDSI value (Salomonson and Appel, 2004). The spectral unmixing method recognizes that the total reflectance in a pixel corresponds to the sum of the reflectances due to snow, rock and vegetation and provides a decomposition of their relative contributions (e.g., Rosenthal and Dozier, 1996;Painter et al., 2009;Sirguey et al., 2009). In mountainous regions, snow mapping by optical satellites is affected by errors induced by the complex terrain, clouds and forest (Masson et al., 2018).
Snow reflectance varies significantly with wavelength and the structure of the snowpack. Optical satellites in the visible and infrared range are thus sensitive to snow impurities and snow microstructure. Optically equivalent grain size (aka specific surface area) and light absorbing impurities content at the snow surface can be retrieved from the reflectance (Painter et al., 2012;Mary et al., 2013;Kokhanovsky et al., 2019). However direct SWE measurements are not possible solely based on optical data. Optical satellite data are largely used to retrieve SCF and snow albedo covering the entire Earth at a near daily frequency, for example based on MODIS products (Hall et al., 2002;Klein and Stroeve, 2002) distributed through the US National Snow and Ice Data Center. It should be noted that the use of these images remains challenging owing to difficulties in interpreting snow properties in forest, complex terrain, shaded areas, and cloudy conditions and frequently, inaccurate image geo-registration (Rittger et al., 2013).
Stereoscopic satellites (e.g., Pleiades 1a/1b and SPOT5/6/7) cover a specific region with different view angles with a very high resolution (VHR, 50 cm to 2 m), and the acquisition of such images is generally upon request only. These images can be used for 3D mapping by photogrammetry, thereby providing topographic information. Tri-stereoscopic satellites, such as Pleiades, have been used to retrieve snow depth over open alpine catchment by comparing snow-covered (i.e., winter) and snowfree images (Marti et al., 2016). The snow depth measurement uncertainty is estimated to be on the order of 50-80 cm at 3 m spatial resolution (Marti et al., 2016;Deschamps-Berger et al., in press). The uncertainty decreases by a factor of 2 when the data are averaged over 20 m grid cells, demonstrating that satellite photogrammetry is a promising tool to monitor snow depth distribution in mountains areas (Deschamps-Berger et al., in press).
The passive optical satellites most often used to detect snow have varying spatial resolution (1 m to 1 km) and revisit times (daily to 16 days). In general, the higher spatial resolution is paired with lower revisit times. The longevity of sensors like MODIS or AVHRR makes them extremely useful and unique since they provide decades-long time series enabling climatological studies (e.g., Hüsler et al., 2014). High resolution satellites used for snow cover mapping (SPOT 6/7, Sentinel 2, Landsat 8, Pleiades 1A and 1B, see e.g., Gascoin et al., 2019 for data access) are limited by the revisit time. In the case of SPOT 6/7 and Pleiades, the acquisition of satellite images is performed on demand, and can be superseded by higher priority requests. Monoscopic (i.e., one viewing angle) optical satellite data with most frequent revisit times are currently MODIS, PROBA-V, Sentinel 3 (S3) and VIIRS. Higher revisit times also increase the probability of obtaining cloud-free images.
Active optical sensors, namely LIDAR (e.g., Ice Sat, Ice Sat 2), can be used for regional-scale snow depth mapping using differences in acquired data between snow-free and snow-covered dates (Kwok and Cunningham, 2008;Treichler and Kaab, 2017). Treichler and Kaab (2017) demonstrated a RMSE of 0.15 to 0.6 m when comparing in situ and airborne measurements, given the availability of an accurate DEM for the study area. However, the revisit time for Ice Sat-2 can be as high as 91 days. This long revisit time along with the necessity for an high accuracy DEM strongly limit the use of such techniques for mapping snow depth, especially in mid-latitude mountainous areas where the number of overpasses is very limited.

Thermal Infrared Sensors
Thermal infrared satellites have also proven to be useful to monitor the evolution of the surface temperature of snow and ice covered surfaces (e.g., Dozier and Painter, 2004;Fréville et al., 2014). They are not restricted to daylight periods but the spatial resolution of the thermal bands combined with the revisit time is often too coarse for mountain regions. However, recent work from Lundquist et al. (2018) has demonstrated that MODIS can be used to estimate both snow and forest temperature in mixed pixels at 1 km resolution, with an accuracy of better than 1 K during the night. During daylight, retrieval of snow surface temperature is complicated by reflected sunlight. To overcome the issue linked with the 1 km spatial resolution (e.g., MODIS or Sentinel-3), sensors like Landsat-8 (30 m spatial resolution) can provide higher resolution snow temperature maps but with a lower revisit time. This might be an issue since, as recently demonstrated by Colombo et al. (2019) monitoring the daily amplitude of snow surface temperature at high resolution from space could be extremely informative for SWE prediction.

Passive Microwave
Passive microwave sensors (10 −3 to 1 m wavelength) have the advantage of providing data under all atmospheric conditions and during day and night-time. The spectral luminance energy measured by passive microwave sensors can be utilized to calculate brightness temperature values which, in turn, can be used to estimate snow depth (Chang et al., 1982) or SWE. The response of each sensor band to snow cover properties has been detailed by Rango (1993). The characteristics of passive microwave satellites used in snow cover studies is summarized by König et al. (2001). The estimation of SWE from passive microwave data is either based on the brightness temperature alone or on the brightness temperature combined with measurements or model estimates of snow depth and snow density (Chang et al., 1987;Boone et al., 2006;Davenport et al., 2012;Cho et al., 2017;Smith and Bookhagen, 2018).
Passive microwave satellites have proven to be useful for monitoring snow cover at global and regional scales (Chang et al., 1982;Armstrong and Brodzik, 1995). These satellites can be used for wet snow mapping (Picard and Fily, 2006). Studies during the GlobSnow project have used a combination of passive microwave and meteorological stations to produce daily SWE maps (Luojus et al., 2014;Pulliainen et al., 2020). The highest uncertainties associated with passive microwave retrieval occur during the melting period, when the microwave signal is perturbed by liquid water contained in the snowpack. Strong limitations also arise from the sensitivity of the signal to snow layer properties (e.g., Helmert et al., 2018). The coarse spatial resolution of passive microwave sensors (around 25 km) does not permit the representation of fine scale processes, thus limiting the applicability of these observations for mountainous areas.

Active Microwave
Active microwave radar (RAdio Detection And Ranging) contains its own source of radiation. The detection of the scattered signal emitted by the active radar provides information on surface structure in terms of the snow cover roughness and microstructure. Like passive microwave, radars are not disturbed by cloud cover and can record during the day and night. Radar frequencies relevant for snow cover monitoring typically vary from 1 to 40 Ghz, namely L bands to Ka bands. The snow properties to which the backscattered signal are sensitive change with frequency and with the presence of liquid water in the snowpack. Typically for bands L and C, bulk snow density or snow depth could be retrieved (Shi and Dozier, 2000;Lievens et al., 2019). Higher frequencies such as X, Ku and Ka are more sensitive to snow microstructure and stratigraphy Rott et al. (2010). The radar signal is extremely sensitive to the presence of liquid water, leading to lower backscatter for wet snow conditions (Magagi and Bernier, 2003;Schmid et al., 2014). In case of wet snow, the signal is then often saturated. A comparison of the different characteristics of active imaging synthetic-aperture radar sensors (SAR) is given by König et al. (2001).
Very high resolution imaging (up to 1 m) is obtained with SAR. Several studies have used SAR imaging system for the production of wet snow maps (Shi et al., 1994;Eckerstorfer et al., 2016). Recent studies have attempted to use Sentinel-1 C-band radar measurements to map snow depth at high spatial and temporal resolution (Conde et al., 2019;Lievens et al., 2019). The backscatter signal of the SAR sensors allows the estimation of the snow depth, but some snow property information (such as liquid water content) must be known beforehand. Spaceborne SAR interferometry (inSAR) has shown great potential for retrieving snowdepth with high accuracy in Band C (Conde et al., 2019) or Band Ka (Moller et al., 2017). The high spatial and temporal resolution of SAR sensors makes their use potentially welladapted for snow hydrological applications (Bernier and Fortin, 1998;Nagler and Rott, 2000;Leinss et al., 2014;Rondeau-Genesse et al., 2016). However, the oblique geometry viewing of SAR systems enhance geometric distortions which makes it particularly challenging to interpret in mountainous regions (Small, 2011;Veyssière et al., 2019).

Airborne Observations
Most of the satellite sensors used for snow cover monitoring can be found on airborne payload. In the case of airborne observations, the spatial resolution is usually increased compared to satellite measurements. The drawbacks of such observations with high spatial accuracy are their limited domain and revisit time. Planes thus enable the use of sensors that are not relevant when used from space, e.g., passive microwave such as in Kim et al. (2019) or of sensors that are not available on satellite such as gamma-ray sensors to monitor SWE (Cho et al., 2020). The Airborne Snow Observatory (ASO) initiative has demonstrated how relevant airborne observations can be for snow cover monitoring in mountain areas (Painter et al., 2016). ASO supports frequent flights over the same large domain with LIDAR and hyperspectral airborne measurements. In this case, hyperspectral measurements provide an estimation of snow broadband albedo and LIDAR an estimation of snow depth with a few cm accuracy at 50 m spatial resolution. The LIDAR snow depth estimates are however still challenging below the canopy (Deems et al., 2013).

SNOW COVER MODELING
The complexity of a snow cover model varies from the most simple degree-day snow schemes (often used for hydrological applications) to multi-layer snow cover evolution models, which can even include explicit representations of snow microstructure (e.g., for avalanche hazards forecasting). The physical processes represented in the different snow cover models are shown in Figure 1. This section gives a brief overview of the different levels of complexity of snow cover models used in the scientific community with their corresponding applications, limits, and uncertainties.

Degree-Day Models
Snow degree-day models are the simplest category of snow cover models, simulating snow mass evolution solely from precipitation and temperature. Individual surface energy fluxes are not explicitly represented, all fluxes are integrated and parameterized as a function of air temperature, and internal processes are rarely accounted for. This method is reliable for computing snowmelt depth from daily to seasonal periods where and when temperature is a sufficient proxy of the surface energy balance (Rango and Martinec, 1995;Ohmura, 2001). This approach has been used for more than 80 years (Clyde, 1931;Collins, 1934) and is still used actively (Tobin et al., 2013;Riboust et al., 2019). It is mainly used for snowmelt runoff forecast, especially for mountain basins and for studies of glacier surface mass balance (Finsterwalder, 1887;Singh et al., 2000;Braithwaite, 2008;Reveillet et al., 2017). Such models however require calibration limiting direct transferability to other sites and the ability to predict responses to future changes. They are however computationally efficient and thus adapted to applications requiring very large number of model runs (large spans of points, time and also ensemble simulations). They may be useful for testing new computationally demanding data assimilation approaches.

Surface Energy Balance Models
Surface energy balance models incorporate a more detailed representation of the exchange of mass and energy at the snow cover interfaces by calculating radiative, turbulent, and advective heat exchanges at the air/snow interface as well as conductive exchanges at the snow/ground interface. Surface energy balance models are able to estimate snow accumulation and snowmelt, including the rate of snowmelt from minutes to seasonal time scales, through full or partial surface energy balance computation. In the following, they are referred to as surface energy balance models or physically-based snow cover models. These models include processes such as snow accumulation, snowmelt, compaction, albedo variations, surface temperature evolution, sublimation as well as liquid water retention and refreezing processes (Tarboton and Luce, 1996;Mahat and Tarboton, 2012). The internal physical processes of snow are described according to a single or a multi-layer snow scheme in a more or less simplified way, e.g., bucket approach vs. Richards equation for liquid water percolation . These two categories are detailed below.

Bulk/Single-Layer Models
Single-layer snow cover models were used in the first versions of representations of the snow cover in land surface model, and in some cases they are still being used, e.g., the operational snow scheme of the European Centre for Medium range Weather Forecasting (ECMWF) Integrated Forecast System (IFS) (Dutra et al., 2010). In such cases, the physical properties controlling the energy balance such as albedo, roughness, and thermal properties are represented. The snowpack itself was initially represented as a composite snow-soil layer (Pitman et al., 1991;Douville et al., 1995;Yang et al., 1997). The surface energy in a composite snow-soil layer model combines the soil, vegetation, and snow energy balances for representing vegetation-soilsnow-atmosphere interactions. It must be emphasized that the representation of these interactions bears high uncertainties, whose evaluation is challenging, especially when the calculation uses empirical fractional snow cover parameterizations. Slater et al. (2001) compared snow cover representations in 21 land surface model revealing several weaknesses in macroscale snow cover modeling. In recent years, the increase in computer performance has led to the incorporation of more physicallybased snow processes and a separate snow energy balance within land surface model, the so-called explicit single-layer model, which integrates a separate surface energy balance for snow interactions with the atmosphere.
Explicit single layer models compute the snow surface temperature, albedo, snow density, snow depth and SWE evolution with time but internal processes are usually not considered. Since the snow cover is represented by a single layer, snow density is considered vertically constant for the whole snowpack (Chalita and Le Treut, 1994;Gouttevin et al., 2012) and usually increases with the age of the snowpack (Douville et al., 1995). Single-layer snow cover schemes have been specifically used in snow hydrology, NWP and climate applications (Dutra et al., 2012) and are often used for operational and/or large scale applications.

Multi-Layer Models of Intermediate Complexity
The continuous scientific progress and the increase of computer performance has led to the development of snow cover models representing the different layers of the snowpack. Multi-layer snow cover models of intermediate complexity represent the snowpack with a prescribed number of layers (typically 2 to tens) (Anderson, 1968;Kondo and Yamazaki, 1990;Loth et al., 1993;Marks et al., 1999;Sun et al., 1999;Yang and Niu, 2003;Wang et al., 2013;Ekici et al., 2014). In order to obtain a satisfactory representation of the heat fluxes between the atmosphere and the snow, such models often impose a higher vertical resolution for the layers near its interfaces (Lynch-Stieglitz, 1994;Boone and Etchevers, 2001;Decharme et al., 2016). Each layer of snow is characterized by state variables (snow density, snow thickness, temperature, and liquid water content), allowing the calculation of the vertical gradient of temperature and density. Some of these models also explicitly account for wind-driven blowing snow (Pomeroy et al., 1993;Liston et al., 2007). A list of the physical processes and state variables used in multi-layer snow cover models is given in Figure 1. Multi-layer snow cover models account for heat conduction as well as more detailed physical snow processes such as snow compaction and percolation of liquid water.

Detailed Multi-Layer Models
The highest level of complexity is reached for detailed snow cover models which are multi-layers and account for all the processes described in the previous section. These detailed models include additional internal snow processes (snow metamorphism) such as in Crocus, SNTHERM and SNOWPACK (Brun et al., 1989;Jordan, 1991;Lehning et al., 1999). In addition Crocus and SNOWPACK uses a Lagrangian multi-layer framework in which the layering evolves in time (Lehning et al., 1999;Vionnet et al., 2012). In these detailed models, the microstructure of snow layers and its time evolution are represented and expressed by state variables such as specific surface area and sphericity (Lehning et al., 2002;Carmagnola et al., 2014).

Uncertainties and Limits Associated With Snow Cover Models
Snow cover simulations are generally affected by several sources of uncertainties. First, the reliability of model results depend, to a large extent, on the quality of the driving data. Meteorological forcing data from NWP model output contain errors impacting the snow simulations. This can be partly mitigated through data assimilation in atmospheric reanalysis at the global (Dee et al., 2011) or local scale (Durand et al., 1993;Magnusson et al., 2014). Meteorological input is often the dominating source of uncertainties in snow cover models outputs (Fekete et al., 2004;Bosilovich et al., 2008;Magnusson et al., 2015;Raleigh et al., 2015). The second source of uncertainties comes from the model itself i.e., the chosen parameters and simplifications in the description of the physical processes (Essery et al., 2013;Lafaysse et al., 2017;Günther et al., 2019). The uncertainties of snow cover simulations are also related to the sub-grid variability of the study area, especially in complex mountainous terrain, with features such as topography and slope as well as non-represented processes such as lateral or gravitational redistribution of snow by blowing snow and avalanches, respectively. Interactions between vegetation and the snow cover are also a substantial source of uncertainties (Rutter et al., 2009;Boone et al., 2017;Todt et al., 2018). The additional parameterizations (e.g., canopy model, snow interception, unloading, etc.) make up for missing processes but also add uncertainties (Mahat and Tarboton, 2014).
Despite these additional uncertainties, they generally increase the accuracy of estimates of the snow cover in forest areas compared to estimated obtained with models not taking into account any vegetation effects (Todt et al., 2018).
Model intercomparison projects include a large variety of models suitable for a wide range of climate conditions with varying levels of complexity. Physical snow processes in snow cover models are represented by different equations and parameterizations, and this leads to a spread in the predicted evolution of the snow cover under similar vegetation, soil and meteorological forcing conditions. In this most recent model intercomparison project ESM-SnowMIP (Krinner et al., 2018), the results show that for the 10 sites considered, the individual models which performed best relative to observed SWE at each site featured different complexities. This illustrates that model complexity is not necessarily correlated with performance. However, it is important to highlight that detailed snow models with greater physical basis are more likely to be transferable in space and time although some of their parameterizations still suffer from these issues (e.g., optimal parameter values for albedo and turbulent fluxes parameterizations often vary in time and space). It is also known that snow cover model performance is generally better at open sites than forested sites (Rutter et al., 2009). The performance of a snow cover model generally differs during the accumulation and the ablation periods with often larger discrepancies during the melt period (Etchevers et al., 2004). This induces a greater spread in model performance during warm winters and overall warm sites due to a higher frequency of melt events (Etchevers et al., 2004).
Ensemble approaches are increasingly used to estimate uncertainties pertaining to model output, helping to better describe the best configuration options and the reasons explaining differences in model performance. Ensemble simulations can be used to quantify the various contributions to the overall error distribution of the model. The latter is estimated through a number of simulations where each member uses slightly different initial conditions or physical options. Multiple physics ensemble systems, such as JIM (JULES Investigation Model) (Essery et al., 2013), FSM (Essery, 2015), or ESCROC (Lafaysse et al., 2017), have increasingly been driven by ensemble of NWP predictions (Molteni et al., 1996;Descamps et al., 2015;Vernay et al., 2015) to represent uncertainties in both the snow model and in the meteorological forcings. However, ensemble models frequently remain under-dispersive leading to underestimates of the actual errors (Vannitsem et al., 2018). Essery et al. (2013) demonstrated using JIM at Col de Porte, based on several winter seasons, that there is no single configuration of a model that provides the best simulation in all years. An ensemble of configurations yielded the best results, however, some configurations were systematically worse. Models including prognostic albedo and snow density as well as liquid water retention were shown to lead to better performance (Boone et al., 2004;Essery et al., 2013).

From Modeling to Observation Spaces
When combining models and observations, it necessary to estimate observed variables based on model variables. The model used to bridge the observations and the model spaces is usually called an observation operator. In some cases, for instance snow depth, this operator is reduced to identity. However, for some observations such as radar basckscattering coefficients, brightness temperature and top-of-atmosphere radiance, more complex models are required. For such observations, models for radiative transfer (RT) in snow are used to link model and observations. The SMRT model  includes various RT models and can be used both for active and passive microwaves over a wide range of frequencies. For the solar spectrum, broadband albedo is most of the time directly available from the snow models, but this is often not the case for spectrally resolved reflectance. Furthermore, it is needed to account for the interactions between the solar radiation and the complex topography of mountainous terrain (Lamare et al., 2020, and references therein). Besides the inherent uncertainties of these RT models, it must be underlined that they generally require as inputs snow state variables describing the snow microstructure that are only available from detailed snow cover models.

COMBINING OBSERVATION AND MODELING: DATA ASSIMILATION
Observations and numerical simulations represent two sources of information, both affected by uncertainties, that can be exploited simultaneously, thereby reducing the overall uncertainties of the output products. Data assimilation is the statistical and methodological approach used to achieve this goal. In other words, data assimilation is the technique whereby observational data are combined with output from a numerical model to produce an optimal estimate of the evolving state of the system. Data assimilation was firstly used in weather forecasting systems to improve initial conditions of NWP, later also used for estimating model parameters. A set of parameters, a physical state of the model or the input data can be corrected by a series of observed quantities to obtain a more accurate representation of the system under consideration.

Data Assimilation Methods Used for the Mountain Snow Cover
A wide range of data assimilation methods exist which are more or less adapted to different applications depending on the limitations, the required level of complexity, the characteristics of the system, the type of observations and their frequencies. Sequential data assimilation methods assimilate observations fixed in time whereas non-sequential methods assimilate all the observations available during a given time window. This section reviews existing data assimilation applications in mountain snow cover studies and discusses the advantages and disadvantages of each of these methods.

Ad-hoc and Interpolation Methods
Direct insertion consists of directly replacing model states by observed values. The adjustment of other, non-observed variables is then performed through the model integration. Another approach, designed to better mitigate physical imbalance between variables, is to apply ad-hoc adjustments based on physical considerations to non-observed variables after insertion. But at some point of sophistication in these adjustments, direct insertion can turn into a more advanced method: the frontier is not clearly established. Direct insertion has been and is still widely used with snow observations (Liston et al., 1999;Rodell and Houser, 2004;Malik et al., 2012;Hedrick et al., 2018;López-Moreno et al., 2020).
Cressman interpolation (Cressman, 1959;Drusch et al., 2004;Dee et al., 2011) and optimal interpolation (Gandin, 1965) are simplified data assimilation methods widely used in the context of snow cover simulations (Helmert et al., 2018), mostly for NWP applications. Cressman interpolation is based on successive corrections of the model state using the observations weighted by their distance to the grid points. Optimal interpolation is based on the Best Linear Unbiased Estimate (BLUE, Talagrand, 1997). Optimal interpolation is optimal only for nearly linear systems with Gaussian error statistics, does not track error changes, and performs poorly with complex, highly non-linear snow cover models featuring error covariances that significantly evolve in time. Winstral et al. (2019) introduced the bias-detecting-ensemble (BDE), a method to dynamically incorporate snow observations. This technique used in situ snow depth observations from a dense network across Switzerland to adjust model forcings in a surface energy balance model to bring simulations into better agreement with observations. The method corrects for potential errors in both the meteorological forcings and snow cover model structures. In situ observations were used to update only the model forcings rather than simulated states making BDE suitable for snow cover models of any complexity. BDE improved simulations at unobserved sites as well, indicating its appropriateness for spatial propagation. The technique as presented however, does not account for observation uncertainties.

Filtering Methods
Filters provide an estimate of the state at a given time, with an uncertainty estimate, based on past and present observations. They implement a sequential process alternating forecast steps, where the state and uncertainty estimates are advanced in time, and analysis steps, where those are updated using available observations.
A first class of filters is based on the Kalman filter, where the uncertainty estimate is given by its covariance matrix. The Kalman filter is optimal for linear systems with Gaussian errors. Linearity actually drives the Kalman Filter implementation. For non-linear systems, the Extended Kalman Filter uses local linearizations of the operators around the mean estimate (Jazwinski, 1970;Miller et al., 1994;Dong et al., 2007). The propagation of the covariance matrix involves the tangent linear and adjoint models, which make the extended kalman filter unpractical if those are not readily available, and inappropriate for strongly non-linear systems. A much more convenient implementation of the Kalman Filter for non-linear systems is the Ensemble Kalman Filter (EnKF; Evensen, 1994) that represents the state estimate and its uncertainty with an ensemble. The EnKF uses a Monte Carlo sampling to propagate the error information (Burgers et al., 1998;Evensen and van Leeuwen, 2000). This approach loses accuracy for highly nonlinear systems and increases in computation cost when larger ensembles are needed. And because the correction of nonobserved variables are based only on statistical relationships with observed variables, physical imbalances or inconsistencies can sometimes be introduced by the analysis, and must be corrected too. The EnKF technique has been however widely used in snow hydrology studies (e.g., Andreadis and Lettenmaier, 2006;Clark et al., 2006;Slater and Clark, 2006;Durand et al., 2008;Su et al., 2008;De Lannoy et al., 2012;Magnusson et al., 2014;Huang et al., 2017).
Another ensemble-based data assimilation method is the particle filter with sequential importance resampling (PF-SIR; Gordon et al., 1993;Thirel et al., 2013;Charrois et al., 2016). It uses an ensemble of simulations to represent the probability density function (PDF) of the system with no assumptions on the nature of the PDF. When an observation is available, each member of the ensemble, or particle, is weighted and then selected according to its distance to the observations. This method retains only the particles closest to the observations. The retained particles are then resampled according to their respective weights to refill the ensemble. This relatively simple method is particularly suitable for non-linear systems where the dynamics of the model are respected. However the spatial propagation of the assimilated observation (e.g., to non-observed pixels) is not trivial to ensure although localization methodology are being developed (Farchi and Bocquet, 2018).
Variational assimilation solves the analysis problem through the optimization of a given criterion (minimization of a costfunction). It generally requires an adjoint model, which is challenging to obtain for complex models and is of limited efficiency for nonlinear systems. As a consequence, such methods have been only rarely used in snow cover applications (Dumont et al., 2012).

Retrospective Data Assimilation Methods
Data assimilation methods can also be applied using so-called smoother approaches. In such methods, rather than updating the states at a measurement time and predicting forward until the next measurement, smoothers assimilate a batch of observations over a pre-defined retrospective window. Contrary to filters, smoothers include observations ulterior to the states to estimate. The most common approach is the fixed-interval smoothing, where the state trajectory over a time interval is estimated through the assimilation of the observations available in that same interval. Four-dimensional variational methods (i.e., 4D-Var) is a batch smoother whereby measurements are assimilated by minimizing a least squares cost function that incorporates information about measurement errors, prior information and the model constraint. While 4D-Var can handle non-linear models, its application requires the derivation of an adjoint model (as for 3D-Var), which can be difficult for complex nonlinear models such as many of the snow models described above. As such, to our knowledge, 4D-Var has not been applied in snow cover applications. Other smoother methods, namely the Ensemble Kalman Smoother (EnKS) and the Particle Batch Smoother (PBS), which are essentially smoother extensions of the filtering approaches outlines above, have been applied in snow hydrology recently with a variety of data assimilated including passive/active microwave and SCA/SCF (e.g., Durand et al., 2009;Bateni et al., 2013Bateni et al., , 2015Girotto et al., 2014;Margulis et al., 2015Margulis et al., , 2016Margulis et al., , 2019bCortés and Margulis, 2017;Li et al., 2017;Aalstad et al., 2018;Baldo and Margulis, 2018). In the EnKS, a Kalman update can be performed on all states over a given window, while in the PBS, particle weights are updated by conditioning on all measurements over the assimilation window. Because of the retrospective nature of smoother data assimilation applications, they are well-suited to deriving snow cover reanalysis datasets.

Advantages and Disadvantages of Data Assimilation Methods
The suitability of a given data assimilation method relies on several criteria: • The characteristics of the observation to assimilate: is the model state variable directly observed or is there a need for an observation operator to convert observation to model state? what is the spatial distribution of the observations (point vs. distributed)? • The level of complexity of the snow model, e.g., models with multiple, dynamic layers are challenging to update when only bulk properties are observed; this is also the case for models with multiple state variables (e.g., snow temperature, liquid water content), which are typically not observed and weakly correlated with the observed states (e.g., snow depth). • The foreseen application (e.g., avalanche hazards prediction, runoff forecasting, SWE reanalysis) and the potential necessity to propagate the analysis in space (e.g., to non-observed pixels).
Data assimilation systems can be built for different purposes: (i) update model parameters (e.g., Piazzi et al., 2018) (ii) update model state variables (e.g., Charrois et al., 2016), (iii) update the meteorological forcing (e.g., Magnusson et al., 2017;Winstral et al., 2019) or (iv) combinations thereof (e.g., Margulis et al., 2016). Each option has advantages and drawbacks and is not necessarily compatible with all the data assimilation methods described above. Options (iii) and (iv) can ease the spatial propagation of the assimilation benefit to non-observed pixels. The comparison of the advantages and disadvantages of each category of methods applied to snow cover observations and models have been summarized in Table 2. The data assimilation scheme complexity increases when using smaller spatial resolution of the model and for longer prediction time horizons, as described in the Figure 9 of Carrassi et al. (2018). The difficulties of optimal interpolation and extended kalman filter in dealing with non-linearities make them generally not suited for complex non-linear snow cover models. As for variational methods, although they are able to deal with non-linearities, the development of the adjoint model, which can be non-trivial in presence of non-differentiable and non-linear models, can be a barrier to their use with detailed snow cover models. The capability of the EnKF to deal with large dimension error covariance matrices could be suitable for explicit snow cover models but this approach can also lead to issues related to its linearity assumption. To avoid non-linearity issues, the particle filter is one of the most adequate data assimilation method for explicit detailed snow cover models. Contrary to the other methods, particle filters are adequate to deal with ensemble detailed snow cover simulations in which members may have a variable number of layers, i.e., variable dimension problem. However, in large dimension (numerous variables or large domain), this method has limitations when the resampling leads to very close particles which can lead to degeneracy. Because of its easy implementation and its ability to process non-linear system, this technique has been used for detailed multi-layer snow cover models and has shown promising results (e.g., Charrois et al., 2016;Magnusson et al., 2017).
In applications focused on seasonal snow processes, batch smoothers can be an attractive option for deriving historical reanalysis datasets (e.g., Margulis et al., 2016;Cortés and Margulis, 2017). Smoother methods also have the ability to extract more information than filters, especially where instantaneous relationships between measurements and state variables (i.e., fSCA and SWE) is relatively low, but the seasonal correlations are strong (i.e., seasonal depletion of fSCA and peak SWE). The application of batch EnKS methods is straightforward, and has the same limitations as the EnKF, but with the primary additional cost being the increase in state vector size (i.e., across time). This can be mitigated, for example, by updating model forcings rather than states (e.g., Girotto et al., 2014) based on the fact that uncertainty of the snow cover state is to a large extent determined by input (precipitation) uncertainty. The PBS method, like the PF, generalizes the problem to allow for both non-linearities and non-Gaussianity, and is computationally feasible by focusing on updating ensemble weights rather than state vectors (e.g., Margulis et al., 2015).

Benefit of Data Assimilation Systems Used for Monitoring the Snow Cover in Mountains
This section provides a brief history and overview of snow data assimilation studies in mountain areas. For the sake of clarity, the studies are sorted here by type of observations used.

In situ Observations
Due to increasingly available observations, data assimilation has been progressively integrated into land surface, hydrological and snow cover models during the last 20 years. Assimilation of snow cover observations have proven to be an essential tool to reduce the uncertainties for water resources. The first studies used simple conceptual hydrological and snow models integrating direct in situ measurements, such as snow depth and SWE (Carroll, 1978;Day, 1990). Their integration has led to an improvement in the forecast of river flow (Day, 1990) and snow cover runoff uncertainties were reduced by as much as 40% in a region where snow cover represents up to 80% of the water resource (Carroll, 1978). Large-scale operational numerical weather prediction have started to assimilate snow cover data but using only in situ snow measurements (Brasnett, 1999). The assimilation of snow cover observations in an operational system is usually made with observations of snow depth rather than SWE measurements due to the limited availability of the latter. Research on the simultaneous assimilation of various snow variables has been limited, and has been tested only on local and offline simulations (Piazzi et al., 2018). The latter study has shown that multivariate assimilation of snow surface temperature, surface albedo, snow depth and SWE has potential but raise new degeneracy challenges when using a particle filter. Detailed assimilation of in situ observations of snow depth over all Switzerland at 1,000 m spatial resolution has been set up in Magnusson et al. (2014) and Winstral et al. (2019). Such systems are highly efficient but rely of an extremely dense network of in situ observations (one station every 100 km 2 ) that enables proper spatial propagation of the information through the input meteorological forcings . in situ snow depth assimilation has been shown to improve both SWE and snow density estimates (Smyth et al., 2019). Subsequent runoff simulations were shown to greatly benefit from the assimilation of these in situ snow depth observations (Griessinger et al., 2016(Griessinger et al., , 2019. When the in situ snow observation network is not dense enough, their assimilation cannot recover the spatial heterogeneity of the snow cover. For this reason, the combination of in situ snow measurement with remotely-sensed observations is needed especially in regions with complex topography (Magnusson  , 2014). This issue also highlights the question of the representativeness of point-scale measurements and how the information of the point-scale measurements can be propagated in space via the assimilation systems.

Airborne Data
Snow cover observations from airplanes, although only available on limited domains, provides an alternative to satellite data, overcoming some of the shortcomings of the latter, e.g., limited spatial resolution. The Airborne Snow Observatory (Painter et al., 2016) has provided a valuable framework for testing and comparing different assimilation systems. Painter et al. (2016) used snow albedo maps from airborne imagery as inputs for a physically-based snow model and combined the simulated snow density with snow depth maps from airborne observations to obtain SWE maps at high resolution in mountain areas. The approach is though restricted to the area covered by the planes. Hedrick et al. (2018) tested the first integration of LIDAR-derived distributed snow depth data into a physics-based snowcover model using direct insertion and obtained largely improved simulation of the snow depth spatial distribution at high resolution (50 m). More recently, Margulis et al. (2019a) demonstrated the benefit of even infrequent assimilation of LIDAR snow depths, using PBS and a physically-based snow cover model, at estimating SWE throughout the water year. Recent studies have also been carried out by assimilating airborne observations of high-resolution passive microwave imagery for SWE estimations for deep mountainous snowpacks (Li et al., 2017;Kim et al., 2019). The latter study demonstrated that accurate SWE simulations are obtained with the use of a detailed snow cover model that includes snow microstructure evolution, and a particle filter approach. Data assimilation of the same observations with a three-layer snow model and a Kalman filter lead to lower accuracy in the SWE estimates.

Satellite Data
The most commonly used satellite data for assimilation are probably SCA/SCF products. The direct insertion method has been used extentively with MODIS SCA/SCF products (Rodell and Houser, 2004;Kumar et al., 2008;Hall et al., 2010;Fletcher et al., 2012;Liu et al., 2013). This approach has resulted in an improvement of the simulated snow cover in comparison to simulations without assimilation, especially in the lowest elevation bands and during the melting period (Andreadis and Lettenmaier, 2006;Arsenault et al., 2013). In contrast, the direct insertion method has degraded performance compared to the EnKF assimilation (Arsenault et al., 2013) and does not allow a backward correction of the simulation. Assimilation of optical SCA/SCF is also performed using particle filters approach (e.g., Thirel et al., 2013;Baba et al., 2018) leading to improved or similar performance to EnKF approaches. Some studies combine the assimilation of SCA from satellite and in situ snow depth measurements leading to accurate snow maps over Scandinavia (Saloranta, 2016). SWE reanalysis over long periods have been produced using assimilation of SCA/SCF with different types of batch smoothers (e.g., Girotto et al., 2014;Margulis et al., 2015;Cortés and Margulis, 2017;Aalstad et al., 2018). SCA, in contrast to SCF, consists of only binary information on the absence or presence of snow on the ground. Consequently, the link between this observed variable and the model state variable, a.k.a. observation operator, is complex without violating physics especially for physically-based snowpack models. The sequential assimilation of SCA/SCF is also not very efficient during midwinter where most of the surface is covered by snow.
Aside from SCA/SCF, optical satellite data can also be used to retrieve albedo data which plays an important role in the melting processes of the snowpack. Several studies have assimilated albedo products by using the direct insertion method, which shows improved estimations of snow depth, albedo and SWE (Malik et al., 2012;Wang et al., 2015). Another approach is the assimilation of optical MODIS albedo products by using a 1D-Var method, which led to a 40% root-mean-square-error reduction in the mass balance of an alpine glacier (Dumont et al., 2012). Snow cover simulations can be improved by combining different products such as an EnKF assimilation of SCF products and a direct insertion of albedo (Xu et al., 2016). In addition, the assimilation of reflectance (spectral albedo) has been shown to further improve the simulated snowcover physical properties (e.g., Charrois et al., 2016). The uncertainties related to the conversion of raw optical satellite imagery into snow products remain the main limitation of surface reflectance and albedo assimilation (Zaitchik and Rodell, 2009;Hall et al., 2010;De Lannoy et al., 2012;Cluzet et al., 2020). The calculation of the observation errors is thus critical for the efficiency of the system. Moreover, the assimilation of optical images is limited by the presence of clouds (Hall and Riggs, 2007) which may be problematic in some mountainous regions (Charrois et al., 2016). Charrois et al. (2016) demonstrates that despite the limited time resolution due to clouds, the assimilation of spectral albedo is beneficial to the simulations systems and the most useful observations are the one obtained after a long period without precipitation. In contrast to EnKF, the use of particle filter over large domain may trigger degeneracy issues and does not provide a simple way forward to the spatial propagation of the assimilation from one pixel to the other. Localization methods may solve this issue (e.g., Farchi and Bocquet, 2018), but their implementation, based on ensemble weighted averaging, is challenging with multi-layer models due to the dynamic layering.
A recent effort was made to assimilate the synthetic aperture radar (SAR) backscattered coefficients using for instance variational approach with a detailed multi-layer snow model (Phan et al., 2014). However, large uncertainties on the observed backscattered signal make this approach difficult to use, and it requires an unbiased estimate of the observed data (Veyssière et al., 2019). Moreover, the radiative transfer model used to simulate the backscattered signal is more complex for radar data than for optical data generally leading to larger uncertainties (Helmert et al., 2018;Picard et al., 2018).
Several studies have combined the assimilation of SWE products from passive microwave AMSR-E observations and SCA/SCF products or reflectance from MODIS images, which leads to better estimations of the snow cover at mid-latitude (Durand and Margulis, 2007;Kumar et al., 2008;De Lannoy et al., 2012;Fletcher et al., 2012;Liu et al., 2013). The extended kalman filter and EnKF method have been often used for the assimilation of SWE products from passive microwave satellites, such as AMSR-E and SMMR (Andreadis and Lettenmaier, 2006;Kumar et al., 2008;De Lannoy et al., 2012;Liu et al., 2013).
A recent study has assimilated passive microwave AMSR-2 satellite observations in a detailed snow cover model using the particle filter, which was shown to reduce the bias of SWE estimation by up to 71% over flat areas (Larue et al., 2018). However, the coarse spatial resolution of passive microwave satellites limits their usefulness for snow products over mountain regions.

SUMMARY AND RECOMMENDATIONS
The monitoring of snow is essential to monitor, predict and manage water resources and snow-related natural hazards. This monitoring can be partially achieved by using in situ and remotely-sensed observations from the ground, air or space. In situ snow cover measurements at the point-scale provide critical data for characterizing the snow cover at a given location, however these measurements do not provide information on the spatial heterogeneity of mountain snow. Limitations related to the representativeness and scarcity of point-scale in situ data makes the use of distributed observations from satellite, airplane and terrestrial sensors, key for enabling the monitoring of snow spatial variability.
The snow properties to which a remotely sensed observation is sensitive vary with the sensor frequencies and type (optical, thermal infrared and microwave). Monoscopic and stereoscopic optical sensors are limited to daytime cloud-free measurements. Monoscopic optical satellites provide multi-spectral images in the visible and near infrared wavelengths with a relatively high spatial resolution (20 -500 m). They generally provide snow cover area (SCA) and snow cover fraction (SCF) information, surface properties such as albedo, and light absorbing particles of the snow and snow microstructure, but they cannot directly measure water equivalent of snow cover (SWE). Stereoscopic optical satellites provide snow depth information at very high spatial resolution (0.5-2 m), but this technique contains substantial uncertainties (±50 cm), typically only operates ondemand for a reduced set of observation dates, and is generally restricted to small areas. The energy radiation from passive microwave has the advantage to provide information under all-weather conditions (clouds, rain and night), and is directly sensitive to snow depth and/or SWE. However, the coarse resolution of passive microwave satellites (20-40 km) is not sufficient for mountain areas and its use is limited to noncomplex terrain, such as flat areas. In addition, the sensitivity of the type of sensors is lower for deep and/or wet snowpacks. For mountain applications, the complex topography requires the use of high spatial resolution satellite data, such as optical and SAR observations. Disentangling the contribution of the different snowpack properties (density, microstructure, . . . ) to the SAR signal remains challenging without a priori knowledge of the structure and the physical properties of the snow cover. One recent study has shown that Band C SAR signal is directly related to snow depth, highlighting the necessity of further theoretical investigations to support the physical understanding of this direct link.
The use of numerical models is required to monitor the evolution of the snow cover continuously in space and time since observations generally lack one of the two aspects. The physical characteristics and processes are represented in snow cover models with varying levels of complexity. The choice of the model complexity is determined by applications as well as the spatial and temporal extent of the application. It is further constrained by computational cost and the availability of meteorological forcings, where the number of meteorological inputs is higher for the more complex schemes, such as Surface Energy Balance models. The increase of computing performance has fostered the use of detailed multi-layer snow cover models, which represent snow cover properties with higher accuracy, despite the necessary introduction of new sources of uncertainties in the forcing data and model parameters. The most recent snow cover model intercomparison studies have shown that the best performance can be achieved by models with various levels of complexity. Despite these advances in predicting SWE and snow characteristics using numerial simulations, many uncertainties persist. This leads to the need for the combined use of snow cover models and snow observations. The assimilation of in situ and remotely-sensed observations of snow has been increasingly used in snow applications and provides the potential for a reduction of the uncertainties associated with unconstrained snow cover simulations.
• Data assimilation of in situ observations of snow depth or SWE is an efficient method to provide SWE estimates when the observation network is dense enough which is the case only for some rare regions of the Earth. The main challenge remains to extrapolate the information at measurement locations to grid cells without observations. Some solutions exist, especially when the data assimilation method is updating the distributed meteorological forcings, but there is still significant room for improvement. • Remotely-sensed data are thus the most natural source of information to constrain snow cover simulations in regions where in situ observations are too sparse. The satellite data most commonly used in data assimilation for SWE estimation in mountains are SCA/SCF from optical satellites. SCA/SCF assimilation is well-established but still complex since it mostly provides a binary information and the definition of the observation operator for a physically-based snow model cover without violating physics is challenging. In the case of sequential data assimilation, the impact of the assimilation of such products is low in periods where snow is covering most of the surface. However, SWE reanalysis using SCA/SCF with batch smoothers, where the relationship between accumulated SWE, and the full depletion record is higher, has been very successful. • Ensemble assimilation of raw or low level satellite products such as radiances from optical sensors or SAR combined with surface energy balance snow cover models is one of the most powerful tools to improve SWE prediction at high resolution. The use of raw or low level satellite data is likely to be preferred to higher level satellite products in order to avoid the propagation of existing biases in the satellite retrieval products in the data assimilation system. Though promising results have been shown, especially for optical data, several issues need to be tackled to build highly efficient prediction systems. Firstly, the assimilation of such data requires an a priori knowledge of the vertical structure of the snowpack which fosters the use of detailed snow cover models and accurate observation operators that includes both the terrain effects in link with the rugged topography and the electromagnectic interactions within the snowpack. Second, such system are highly nonlinear which plays in favor of data assimilation methods such as batch smoothers, particle filters and to a lesser extent EnKF. Third, snow partial coverage and exposed vegetation are more complex to treat and are still open research questions. • Assimilation of distributed SWE or snow depth observations is more straighforward. However, distributed SWE measurements are rare and SWE measurements cannot be directly obtained via satellite data. Snow depth mapping from satellite methods are currently improving and we can expect more accurate data sources to become available, and with that the development of suitable data assimilation frameworks. In such a case, even infrequent snow depth maps could be used to significantly improve SWE simulations at longer time scales. • In all cases, the question of how to propagate the assimilation benefit to non-observed pixels, especially in the case of particle filters, is an ongoing research area, which requires more investigation. • Data assimilation methodologies capable of using multivariate observations are also an area of promise. New satellite observations such as high resolution surface temperature (e.g., from the Franco-Indian mission Thermal infraRed Imaging Satellite for High-resolution Natural resource Assessment -TRISHNA) would most likely largely improve the reliability of such prediction systems. • Data assimilation of snow cover observations in forested regions has almost never been investigated in detail and is still an open question.
To conclude, prediction systems combining snow observations and physically-based snow cover modeling would enable not only accurate probabilistic SWE prediction, for instance for runoff and flow predictions, but also a better knowledge of the snowpack physical properties, that is a prerequisite for instance for avalanche hazards forecasting. In the future, integrated solutions with assimilation of both snow and meteorological observations in a coupled atmosphere/snow model systems are likely to provide the most robust estimates of snow conditions, in particular SWE, at scales of practical and scientific interest.

AUTHOR CONTRIBUTIONS
CL and MD wrote the initial draft of the paper and provided equal contributions. All authors contributed to the writing-review and editing of the final written version.

FUNDING
The CNRM/CEN and IGE are part of Labex OSUG@2020. The research was partly funded by CNES (APR MIOSOTIS), MAGELLIUM and by the ANR program ANR-16-CE01-0006 EBONI. SMe was funded by ANR through contract number ANR-17-CE01-0009-01. Partial funding for SAM was provided by the NASA High Mountain Asia Grant NNX16AQ63G.