Climatological and Hydrological Observations for the South American Andes: In situ Stations, Satellite, and Reanalysis Data Sets

Modern hydrology relies on multiple sources of information combined with climatological, hydrological and glaciological data. These data are collected by various sources, such as private companies, research programs, and both national and international organizations. They also vary by type, e.g., in situ measurements, satellite, reanalysis and simulated data. Recently the ANDEX research project, a GEWEX regional program, was created to understand the processes related to the hydrological cycle and energy fluxes in the Andean region from Colombia to Patagonia. Carrying out this program is challenging given the complex orography and the diversity of climates, from tropical to sub-polar. This review article compiles the various databases that are useful for hydrometeorological research in the South American Andes. The National Meteorological and Hydrological Services in Bolivia, Chile, Colombia, Ecuador, Peru, Venezuela and Argentina provide a large amount of data; however, the high-elevation areas are poorly instrumented and the number of stations varies greatly between countries. National databases are only partially shared with the international bodies responsible for summarizing the existing data, which causes problems in terms of data product assimilation. Across the entire continent, too few radiosondes are launched, even though these data are crucial for validating atmospheric models and identifying their problems. An increasing amount of satellite data is available, but it is difficult to assimilate them into hydroclimatological models at suitable spatial and temporal resolutions. Specifically, for precipitation, we recommend merged products that account for the high spatial and temporal variability across the Andes. Finally, the international ANDEX program could be an excellent opportunity to increase knowledge of the hydrological processes in the Andes.


INTRODUCTION
Modern hydrology relies on multiple sources of information. In recent decades, data availability has increased, contributing to a new paradigm of data-intensive science (Butler, 2014; Chen and Wang, 2018). This study is related to ANDEX, a prospective Regional Hydrological Project (RHP) of the Global Energy and Water Cycle Experiment (GEWEX 1 ) project of the World Climate Research Program (WCRP 2 ). This RHP aims to understand the processes related to the hydrological cycle and energy fluxes in the Andean region, which extends from Venezuela to Patagonia and covers a large range of climates from tropical to sub-polar (Garreaud et al., 2009; Sarricolea et al., 2017).
To accomplish this, ANDEX requires bio-geophysical data, such as air temperature, precipitation, evaporation, evapotranspiration and runoff, as well as data on vegetation, glaciers, etc. Due to the complex physiographic characteristics of the region, additional high-resolution data are needed on topography, soils, land cover and land use. This would allow for a more detailed and accurate representation of the hydrological processes in basins. Therefore, it is necessary to create an inventory of the sources of data and information. The goal of this review is to compile the existing hydrological and meteorological information for the South American Andes. The assessment of the data and observations is based on publications by the World Meteorological Organization (WMO) (WMO, 2007, 2010), the portals of various international and national institutions or universities, as well as information provided by the National Meteorological and Hydrological Services (NMHSs) of Bolivia, Chile, Colombia, Ecuador, Peru, Venezuela and Argentina. With the exception of Colombia, information from private networks was not included in this study, mainly because it was not available or not reported by the NMHSs. In addition to these in situ measurements, we present a compilation of the satellite-based products and reanalysis data useful for climatological and hydrological purposes in the Andes. The purpose of this article is to provide a concise review of the available sources of information that are useful or relevant for the objectives of the ANDEX program. Besides showing data availability, this inventory identifies the gaps to be filled and the current limitations that must be addressed in order to implement new research programs on the water cycle and energy balance in the Andean region.
The available data sources and information are separated according to their types: in situ observations and measurements from the current hydrometeorological network operated by governmental institutions (Section 2); remote sensing data (satellite sensors) (Section 3); reanalysis data (Section 4); physiographic and morpho-topographic data (Section 5). The main knowledge gaps and some research directions are presented in Section 6. Finally, we conclude with research opportunities through the ANDEX program (Section 7). Although this is a renewed effort to update the inventory of existing observations, it cannot be considered complete. Data on future climatological and hydrological scenarios for the Andes are beyond the scope of this article.

IN SITU OBSERVATION AND MEASUREMENTS
The following information is available for different time periods and different spatial and temporal resolutions. On the one hand, data for some places and regions have been obtained through regular observations or measurements over long periods, while other data are obtained by specific research programs over a limited time period. Regular observations/measurements of hydrometeorological variables have been made by governmental institutions in countries in this region under the standard methods used by the WMO for its operational Integrated Global Observing System (WIGOS). These observations include, among others, the Regional Basic Synoptic Network (RBSN) and the Regional Basic Climatological Networks (RBCN). A description of the whole Integrated Global Observing System may be found in WMO (2017). For the Andes, these regional networks are operated by Argentina, Bolivia, Chile, Colombia, Ecuador, Peru and Venezuela; i.e., countries that are members of the WMO Regional Association III (RA-III) (WMO, 2018).

Observing Systems Capability Analysis and Review Tool -OSCAR
The official information from the NMHSs is compiled in the Observing Systems Capability Analysis and Review Tool (OSCAR) 1 . This database is developed by the WMO at the global scale in support of Earth Observation applications, and provides physical variables related to weather, water and climate. Additionally, the portal details information concerning all Earth observation satellites. However, for South America, the NMHSs have included only a limited number of observations in OSCAR. For the seven Andean countries mentioned, there are only 451 stations that collect hydrometeorological variables. In Table 1, the stations reported to OSCAR are broken down by country (NMHS) and by variable. Argentina has the highest number of stations in the system with 147 stations, while Bolivia has the lowest with 39. In general, almost all of the stations report temperature and precipitation measurements at different time steps (hourly, daily, and monthly). Argentina, Chile, Colombia and Venezuela report both potential evaporation and solar insolation, Ecuador reports only potential evaporation, Peru reports only insolation, and Bolivia does not include these variables in the OSCAR system. This could be due to a lack of records or because the data were not uploaded to the OSCAR system. Very few glacier stations are reported in the database along the Andes from Colombia to Patagonia. None of the Andean countries reported river/lake level and discharge observations to OSCAR, indicating a lack of information for these two critical hydrological components.
Table 1. Stations reported to OSCAR by country (NMHS) and by variable.
Country    NMHS     Stations  Temp  Pr   EVP  Insol  Lev  Disch  Glacier
Argentina  SMN      147       116   111  28   88     0    0      13
Bolivia    SENAMHI  39        37    38   0    0      0    0      1
Chile      DMC      58        36    36   8    27     0    0      2
Colombia   IDEAM    59        51    51   17   28     0    0      5
Ecuador    INAMHI   49        43    42   7    0      0    0      0
Peru       SENAMHI  55        50    50   0    8      0    0      6
Venezuela  INAMEH   44        38    44   36   37     0    0      0
Variables: Temp - surface air temperature; Pr - Precipitation; EVP - Evaporation; Insol - Insolation; Lev - River/Lake levels; Disch - River/Lake discharges; Glacier - Glacier variables. Source: OSCAR/Surface WMO's official repository of WIGOS metadata https://oscar.wmo.int/surface//index.html#/.

National Meteorological and Hydrological Services Data
OSCAR is an official repository, but it lacks meteorological and hydrological data for South America. To estimate the information available in western South American countries, we compiled the information provided directly by the NMHSs and through their official websites. The summary of this information from Bolivia, Chile, Colombia, Ecuador, Peru, Venezuela and Argentina indicates a total of 14,595 stations (meteorological and hydrological), which is much higher than the number of stations reported to OSCAR. The details of this compilation, broken down by country and altitude range, are presented in Table 2. The stations are distributed as follows: 7,471 stations between an altitude of 0 and 1,000 m; 3,053 stations between 1,001 and 2,000 m; 2,436 stations between 2,001 and 3,000 m and 1,635 stations above an altitude of 3,000 m. Colombia has a total of 8,858 stations, representing 60.7% of the total stations. It is the only country to have included records from the available private stations, which explains the difference with the other countries.
Peru follows with 2,236 stations (15.3%); Ecuador is third with 1,447 stations (9.9%); Chile has 959 stations (6.6%); Venezuela has 820 stations (5.6%); next comes Bolivia with a total of 166 stations, representing 1.1% of the total; and Argentina ranks last with a total of 109 stations, which represents 0.7% of the total. In general terms, the spatial density of the stations could be explained by the local geography, such as the presence of deserts and extended plains, and by how easy it is to access these areas to acquire data.
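The shares quoted above follow directly from the station counts. As a minimal sketch, the snippet below recomputes them from the numbers given in the text; the dictionary and variable names are illustrative and are not part of any NMHS product.

```python
# Station counts per country, as reported in the text (NMHS compilation).
stations = {
    "Colombia": 8858,
    "Peru": 2236,
    "Ecuador": 1447,
    "Chile": 959,
    "Venezuela": 820,
    "Bolivia": 166,
    "Argentina": 109,
}

# Station counts per altitude band, also taken from the text.
altitude_bands = {
    "0-1,000 m": 7471,
    "1,001-2,000 m": 3053,
    "2,001-3,000 m": 2436,
    "> 3,000 m": 1635,
}

total = sum(stations.values())

# Share of each country in the regional total, in percent.
shares = {country: 100.0 * n / total for country, n in stations.items()}

for country, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{country:<10s} {stations[country]:>6,d} {pct:5.1f}%")

# Both breakdowns should add up to the same regional total.
print("total by country:", total)
print("total by altitude band:", sum(altitude_bands.values()))
```

Both breakdowns sum to the 14,595 stations stated in the text, which is a quick consistency check on the compiled figures.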
Observations Between 0 and 1,000 m
The percentage of stations between 0 and 1,000 m in comparison with the total per country is given in Table 2. Chile, Ecuador, Peru and Venezuela have very similar values at this level, representing close to 10% for each country. Colombia has the highest number of stations in this altitude range with 57.4% of the total stations, while Bolivia and Argentina represent less than 1.0%. For Chile, Colombia, Ecuador and Venezuela, the majority of the stations are located in the 0-1,000 m altitude range. For Venezuela and Chile, this range is especially important as it represents 93.5 and 80.6%, respectively, of their total stations. The spatial distribution of the stations between 0 and 1,000 m is shown in Figure 1. In this altitude range, good coverage is found in the Caribbean basins (Colombia and Venezuela) and on the western slope of the Andean mountains to the coast. This dense meteorological station network contrasts with the network on the eastern slope and in the Orinoco and Amazon basins. The same pattern is observed in Ecuador and Peru, where most of the stations are located from the western slope of the Andes to the coast and only a few stations are located in the Amazon Basin. In Bolivia, the density of stations is relatively low with major gaps depending on the
Observations Between 1,001 and 2,000 m
Once again, Colombia has the highest share of the Andean region's stations between 1,001 and 2,000 m. The spatial distribution of the stations for this altitude range is shown in Figure 2, where we observe good coverage with a dense network of stations for Colombia. A significant gap exists over the Guiana Shield ("Macizo Guayanés") located in southern Venezuela. A low density is also observed in the Andean region of Ecuador and in the southern Andes of Peru, Bolivia and Chile.
Once again, Argentina shows a higher density of stations in its western-central region, on the eastern side of the Andes.
Observations Between 2,001 and 3,000 m
The percentage of stations between 2,001 and 3,000 m is shown in Table 2. The countries with the fewest stations in this range, compared to the other ranges, are Chile, Argentina and Venezuela. Conversely, Colombia stands out as having the highest percentage compared with the other countries (71.5% of the total). The spatial distribution of the stations for this altitude range is shown in Figure 3; there is a lack of stations over the Andean Cordillera in Bolivia, Chile and Venezuela. In contrast, there is a high density of stations over the Andean Cordillera in Colombia, Ecuador and Peru.

Observations Above 3,000 m
In general, very few stations exist above 3,000 m along the Andes. Nevertheless, these stations represent 48.2% of Bolivia's total number of stations and 39.8% of Peru's. In contrast, Argentina, Colombia and Venezuela have the smallest shares of stations in this range, with 6.4, 4.3, and 0.1% of each country's total, respectively. In absolute numbers, Colombia still has more stations in this altitude range than Argentina, Bolivia, Chile, Ecuador and Venezuela (which has only one station). The spatial distribution of stations above 3,000 m is illustrated in Figure 4, where we generally note a low density of stations. In the case of Chile, the stations are only located in the northern region due to the orography.

Reported Observations for the WMO Information System
CLIMAT is an additional database in which each station has a code for reporting the monthly climatological data assembled at land-based meteorological surface observation sites to data centers. Each NMHS exchanges data via the Global Telecommunication System (GTS) (WMO, 2017). The stations reported in CLIMAT are a subset of the NMHS stations mentioned in the previous section. Figure 5 shows the percentage of stations that are part of the Global Climate Observing System Surface Network (GSN) 2 and that reported data to the CLIMAT 3 database from June 2018 to May 2019. Although the observational capacity in the Andean countries has increased over time, the availability of data transmitted through the CLIMAT reports is limited in countries such as Ecuador, Peru and Venezuela. This example illustrates how operational limitations can exacerbate the existing gaps and reduce the number of available stations.
Frontiers in Earth Science | www.frontiersin.org

Global Precipitation Climatology Centre -GPCC
The GPCC provides three datasets (Schneider et al., 2011). The first is a quality-controlled dataset that compiles the information from 7,000 stations and covers the period from 2007 to the present. The second is a quality-controlled dataset from 67,200 stations worldwide with record lengths of at least 10 years over the 1901-2013 period (Full Data Product, V7). This product contains gridded monthly rainfall with spatial resolutions of 0.5°, 1.0°, and 2.5°. The last product, a near real-time first guess, is a gridded product with a spatial resolution of 1.0° at a daily time scale; its drawback is that it is based on a limited number of stations. These three gridded products are not yet bias-corrected for systematic gauge measurement errors, e.g., wind-induced undercatch. Nevertheless, the GPCC provides the number of gauges used to generate the grids as well as climatological estimates of the error. All of the products can be downloaded from the GPCC 3 portal. A limitation of the GPCC dataset in South America, and the Andes in particular, is the number of stations used for the gauge interpolation. The number of stations has declined considerably during the last two decades (Figures 6A,B). Figure 6B illustrates the scarcity of data along the Andes, with more data available in the Northern Andes. Because gauges are scarce in this region, the gridded product may be nonhomogeneous. Problems can also arise in mountainous areas when satellite-based rainfall estimates (SREs), such as TMPA (TRMM Multi-satellite Precipitation Analysis), use GPCC to adjust their estimates. Ultimately, this could degrade the results of hydrological modeling that uses these data.
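To make the notion of "number of gauges per grid cell" concrete, the sketch below bins station coordinates onto a regular latitude/longitude grid and counts the stations in each cell. It is only an illustration of the idea, not the GPCC processing chain; the function name and the station coordinates are hypothetical.

```python
import math
from collections import Counter


def gauges_per_cell(lons, lats, resolution=0.5):
    """Count stations in each cell of a regular lat/lon grid.

    Each cell is identified by the (lat, lon) of its south-west corner.
    """
    counts = Counter()
    for lon, lat in zip(lons, lats):
        cell_lat = math.floor(lat / resolution) * resolution
        cell_lon = math.floor(lon / resolution) * resolution
        counts[(cell_lat, cell_lon)] += 1
    return counts


# Hypothetical station coordinates (degrees east, degrees north).
station_lons = [-78.48, -78.30, -68.15, -70.65]
station_lats = [-0.22, -0.40, -16.50, -33.45]

counts = gauges_per_cell(station_lons, station_lats, resolution=0.5)
for (cell_lat, cell_lon), n in sorted(counts.items()):
    print(f"cell SW corner lat={cell_lat:7.2f} lon={cell_lon:7.2f}: {n} gauge(s)")
```

Cells that do not appear in the result contain no gauge, so the gridded value there depends entirely on interpolation from neighboring cells.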

Radiosonde
Point upper-air measurements for a given time and date can be taken using a radiosonde. This instrument, carried by a balloon through the troposphere, is equipped with devices to measure one or several meteorological variables (pressure, temperature, humidity, horizontal wind) at different altitudes, and with a radio transmitter for sending this information to the observing station, usually at a fixed location (WMO, 2014). Radiosondes take measurements at one location, ideally twice a day, up to heights of roughly 25 to 35 km. Ascent rates range between 5 and 8 m s−1. Biases exist for each variable: 1 hPa for pressure, 0.4 to 1 K for temperature and 5 to 14% for relative humidity. One of the strengths of these data is their precise location in the atmospheric column. These data are used in forecast models as well as in reanalysis. Radiosonde launches are limited by their high cost and labor requirements. In the Andean region and adjacent areas there are only 16 sites in five countries with regular upper-air measurements, with a mean frequency of one launch per day (Table 3). Generally, the launch time is 12Z. Data from these radiosonde stations may be accessed through each NMHS or through the University of Wyoming 5 , by searching in South America or using the station code number. Only five stations launch from an altitude higher than 1,000 m above sea level, demonstrating the need for developing more scientific projects and programs in the Andes. If intensive radiosonde campaigns were carried out in the Andean region, this would lead to a better understanding of the vertical structure of precipitation events and cloud microphysical processes (Perry et al., 2014).
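The ascent rates and maximum heights quoted above imply a flight duration of roughly one to two hours. The short calculation below is a sketch that assumes a constant ascent rate, which is a simplification; the function name is illustrative.

```python
def ascent_time_minutes(top_height_km, ascent_rate_m_per_s):
    """Time needed to reach a given height at a constant ascent rate."""
    return top_height_km * 1000.0 / ascent_rate_m_per_s / 60.0


# Bounds taken from the text: 25 to 35 km, 5 to 8 m/s.
shortest = ascent_time_minutes(25.0, 8.0)  # lowest top, fastest ascent
longest = ascent_time_minutes(35.0, 5.0)   # highest top, slowest ascent

print(f"shortest ascent: {shortest:.0f} min")
print(f"longest ascent:  {longest:.0f} min")
```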

Hydrological Data
The hydrological data used and disseminated by the NMHS are also collected, processed and analyzed according to the WMO's standard methodologies as part of the WMO Hydrological Observing System (WHOS 6 ), the hydrological component of WIGOS. The concept of this resource is to pool and share historical and real-time data from water information systems. Through the web portal 3 , it is possible to access each NMHS database and the corresponding website. In addition to the NMHS, other institutions and specific research programs also operate hydrometeorological stations based on the concept of citizen science, which implies that stakeholders help to define the research or application objectives for this kind of network.
For example, a few years ago, Celleri et al. (2010) proposed an Andean Network of Research for a participatory (or collaborative) hydrological monitoring network, including the collection, processing, storage and dissemination of data for research and application purposes; this network should include (a) densely instrumented basins for identifying and quantifying hydrological processes and their spatial variability and (b) baseline instrumented basins managed by local stakeholders. More recently, Ochoa-Tocachi et al. (2017) presented the initial results of the Regional Initiative for Hydrological Monitoring of Andean Ecosystems (iMHEA), a network with over 30 local stakeholders at 15 sites located in Bolivia, Ecuador, Peru and Venezuela. iMHEA has produced data for precipitation, stream flows, meteorological variables (temperature, humidity and wind speed/direction), soil properties, geological characterizations, land use and tracer monitoring. The data and analysis tools generated by iMHEA are freely available on the internet (Ochoa-Tocachi et al., 2018). In Chile, the Center for Climate and Resilience Research shares an extensive, free hydro-meteorological database through its portal 7 . These data can be used to improve knowledge about hydrological processes in much more detail than is possible at the scale provided by the NMHS. The Global Runoff Data Centre (GRDC 8 ) provides another free hydrological database compiling runoff data at the global scale. The GRDC is an international data center operating under the auspices of the World Meteorological Organization (WMO) that was established in 1988 to support research on global and climate change and integrated water resources management. Furthermore, cryosphere monitoring provides relevant information for hydrology, especially in the Andean region.
To share this information, the WMO created an international mechanism for supporting in situ and remote sensing observations: the Global Cryosphere Watch (GCW), which is also a component of WIGOS. The objective is to group and share the raw data, information and analyses on the past, present and future state of the cryosphere. As part of WMO, data obtained from extensive monitoring are collected through standardized measurements 9 .

Regional Climate Centers of South America
In addition to the NMHS, South America has two Regional Climate Centers (RCCs), the RCC-WSA and the RCC-SSA, which carry out a set of obligatory functions to create regional climate products, including prediction, and to support climate activities at regional and national scales. In western South America, a relevant source of information is the Latin American Climate Assessment & Dataset (LACA&D 13 ), which receives data from nine member countries with 1,225 weather stations. It presents information on rainfall rates, rainfall extremes and climatic trends. These data are shared by the NMHSs from their official databases. The RCC-WSA is also responsible for receiving and publishing the seasonal predictions made by the NMHS with monthly in situ measured data. Similarly, the RCC-SSA has its own database with information from the meteorological stations of the member countries. This information is updated on a monthly basis and can be used for climate diagnostics. The historical information in this database may also be used for hydroclimatological studies. Additionally, both institutions develop climate analyses from global and regional data in order to provide a better understanding of the impacts of extreme climate events and climate variability and change in the region. The present challenges for the different Centers are: (i) to have a system of data rescue and to provide high-quality, up-to-date datasets of essential climate variables; (ii) to create new gridded time series products merging station data, reanalysis and satellite observations; (iii) to work with sector-based research teams to develop application models.

DATA FROM SATELLITE SENSORS
In addition to the data produced by in situ observations and measurements, data are also generated by remote sensors placed on satellite platforms. By scanning the Earth's surface and atmosphere, these devices produce data on the surface temperature, soil and air moisture, water vapor, cloudiness, precipitation and several indices that represent the vegetation and the soil. This methodology provides new and useful information for monitoring and modeling hydroclimatic variables, due to the higher spatial resolution and longer time ranges of the data generated. As huge amounts of data become available, cloud computing is becoming increasingly necessary. These data can be accessed through portals such as Google Earth Engine, NASA Earth Exchange (NEX) and/or Earth Observation Data for Water Resources Monitoring (EODC) (Chen and Wang, 2018). Satellite-derived precipitation estimates use indirect data and do not accurately represent the precipitation in mountainous areas where orography plays an important role (Derin and Yilmaz, 2014). Sun et al. (2018) reviewed global precipitation data sets, evaluating 30 different products, and highlighted that large differences exist, particularly in complex mountain areas. One recent study (Nijzink et al., 2018) highlights the advantage of taking various information sources into account in order to better understand the hydrological processes. Using five different conceptual hydrological models applied to 27 catchments with areas between 91 and 1,587 km2, these authors demonstrated that the models became more accurate with an increasing number of data sources, including Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) and Advanced Scatterometer (ASCAT) soil moisture, gravity measurements and Moderate Resolution Imaging Spectroradiometer (MODIS) data. Table 4, modified from McCabe et al. (2017), compiles the most used satellite data to study the water cycle, and we detail each of them in Sections 3.1 to 3.6.
Details are provided regarding the most used satellite-based precipitation products and the MODIS data (Table 5). The MODIS instruments are on board the Terra (EOS AM-1) and Aqua (EOS PM-1) satellites, launched in 1999 and 2002, respectively. Since then, they have been producing data on the land, oceans and lower atmosphere. To date, time series are available for these variables over a period of about two decades, with a spatial resolution from 500 m to 1 km. Some of the variables measured by these different components are presented in Table 4.

Tropical Rainfall Measuring Mission (TRMM) and Global Precipitation Measurement -Integrated Multi-SatellitE Retrievals (GPM -IMERG)
A highly valuable source of data for hydroclimate studies is the Tropical Rainfall Measuring Mission (TRMM), a joint program of the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) (Huffman et al., 2007). The use of a Microwave Sounder Unit (MSU) on board a research satellite covering tropical and subtropical regions generated operational data on the total available precipitation. Three-hourly data are available over a 25-year period with a spatial resolution of 0.5° × 0.5°. There are several versions of TRMM data and many studies have been conducted on the Andes to determine their usefulness for hydrological applications (Collischonn et al., 2008). Depending on the TRMM product, some corrections for the high-elevation zones are needed (Condom et al., 2011). A relevant study by Zulkafli et al. (2014) demonstrates the value of version 7 of the TRMM 3B42 precipitation estimates (or TRMM Multi-satellite Precipitation Analysis, TMPA) as forcing data for hydrological modeling in Andean-Amazon River Basins. The TRMM 3B43-V7 product is also able to reproduce the spatio-temporal variability of the precipitation for the Pacific slope of Ecuador but tends to overestimate the ground observations (Erazo et al., 2018).
In addition, Ochoa et al. (2014) found that for the Pacific-Andean River Basins in Ecuador, TMPA accurately represents the rainfall amounts related to advective-convective precipitation regimes in southern Ecuador. The TMPA data are available on the Precipitation Measurement Mission's (PMM) web page 14 . Manz et al. (2017) showed that the Integrated Multi-satellitE Retrievals for GPM (IMERG) has better detection levels and can provide better quantitative rainfall intensity estimates than TMPA (TRMM Multi-satellite Precipitation Analysis), particularly in the high Andes of Ecuador and Peru. However, IMERG did not show any improvement in terms of either rainfall detection or rainfall rate estimation along the dry Peruvian coastline.

2017) daily precipitation products over the northwestern South America Andean region (Colombia) using in situ observations from 185 rain gauges. Both products are able to capture the spatial and temporal distribution of the daily mean precipitation; however, the amounts are biased, with an overestimation over the Andes Mountains and an underestimation in areas of medium to low altitude, which also correspond to areas with very high levels of precipitation near the coast of the Pacific Ocean. Over southern South America (Chile/Argentina), Hobouchian et al. (2017) carried out a systematic evaluation of satellite estimates of the daily precipitation in relation to terrain complexity. Over a period of 7 years and using more than 400 in situ weather stations as a reference, they found that CMORPH, TRMM3B42RT (real-time version) and TRMM3B42-V7 (research version) are able to reproduce the spatio-temporal variability of the precipitation, with more accurate estimates in the wettest and coldest season (June-July-August). The bias varies with altitude, with an underestimation at low elevations and an overestimation at high altitudes. More generally, the best performances of the estimates are found downwind of the terrain. These studies underline the necessity of evaluating the bias of the different satellite precipitation estimates for each Andean region (from Venezuela to Patagonia) and, where necessary, correcting them before using these estimates in hydrological models.
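As an illustration of what such a bias evaluation involves, the sketch below compares a daily satellite series with a co-located gauge series using a few common scores (relative bias, root mean square error, probability of detection and false alarm ratio). The function name, the 1 mm wet-day threshold and the data are hypothetical, and the cited studies may have used different metrics.

```python
import math


def evaluate_estimates(gauge, satellite, wet_threshold=1.0):
    """Compare satellite estimates with gauge observations (same units).

    Assumes paired daily values and a non-zero gauge total.
    """
    # Keep only the days where both values are available.
    pairs = [(g, s) for g, s in zip(gauge, satellite)
             if not (math.isnan(g) or math.isnan(s))]

    gauge_total = sum(g for g, _ in pairs)
    sat_total = sum(s for _, s in pairs)
    relative_bias = 100.0 * (sat_total - gauge_total) / gauge_total
    rmse = math.sqrt(sum((s - g) ** 2 for g, s in pairs) / len(pairs))

    # Contingency counts for rain / no-rain detection.
    hits = sum(1 for g, s in pairs if g >= wet_threshold and s >= wet_threshold)
    misses = sum(1 for g, s in pairs if g >= wet_threshold and s < wet_threshold)
    false_alarms = sum(1 for g, s in pairs if g < wet_threshold and s >= wet_threshold)

    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return {"relative_bias_pct": relative_bias, "rmse": rmse, "pod": pod, "far": far}


# Hypothetical daily precipitation (mm/day) at one station.
gauge_mm = [0, 5, 12, 0, 3, 20, 0, 8]
satellite_mm = [0, 7, 15, 2, 0, 26, 0, 10]

scores = evaluate_estimates(gauge_mm, satellite_mm)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Computing such scores separately for each altitude band or region is one simple way to reveal the elevation dependence of the bias discussed above.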

Climate Hazards Group InfraRed Precipitation With Station (CHIRPS)
CHIRPS 15 is a quasi-global (50°N-50°S) precipitation dataset from 1981 to the present. The dataset combines a climatology, developed from predictors of precipitation such as topography, elevation, geography, and estimated precipitation data from meteorological stations and satellites, with current in situ data from stations and satellites. The result is a blended gridded dataset with a spatial resolution of 0.05° and a temporal resolution of 1 day. The CHIRPS station blending procedure is a modified Inverse Distance Weighting (IDW) algorithm that has several unique characteristics. The first of these is the use of climatology to define a local decorrelation distance; this is the distance at which the estimated point-to-point correlation is zero (Funk et al., 2015). This dataset has a rapid preliminary version followed by a final version made available at a slightly later time. The preliminary version is available within 1 day, while the final version is available sometime after the 15th of the following month. As with the GPCC, a limitation of CHIRPS is the number of stations included in the blended final data, especially for South America. However, as a quasi-global dataset, it can be used to carry out analyses at a regional scale (Segura et al., 2019).
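The role of a decorrelation distance in an IDW scheme can be sketched as follows. This is a plain IDW with a distance cutoff, not the actual CHIRPS blending algorithm: here the decorrelation distance is a fixed, hypothetical parameter rather than one derived from climatology, coordinates are assumed to be planar (in km), and all names and values are illustrative.

```python
import math


def idw_with_cutoff(target, stations, decorrelation_km, power=2.0):
    """Inverse distance weighted estimate at `target` = (x_km, y_km).

    `stations` is an iterable of (x_km, y_km, value). Stations at or beyond
    the decorrelation distance are ignored. Returns None if no station
    lies within that distance.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist >= decorrelation_km:
            continue  # assumed uncorrelated with the target point
        if dist == 0.0:
            return float(value)  # target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical stations: (x_km, y_km, precipitation in mm).
stations = [(0.0, 10.0, 10.0), (20.0, 0.0, 40.0), (200.0, 0.0, 100.0)]

estimate = idw_with_cutoff((0.0, 0.0), stations, decorrelation_km=100.0)
print(f"estimate with 100 km cutoff: {estimate:.1f} mm")
print("estimate with 5 km cutoff:", idw_with_cutoff((0.0, 0.0), stations, 5.0))
```

In this example the distant station (200 km away) is excluded, so it cannot influence the estimate even though its value is large; with sparse networks such as those in the Andes, many grid points may have few or no stations within the cutoff.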

Precipitation Estimation From Remotely Sensed Information Using Artificial Neural Networks-Climate Data Record (PERSIANN-CDR)
The PERSIANN-CDR 16 product is developed in two steps: first, the PERSIANN algorithm, which uses an artificial neural network, is applied to the Gridded Satellite Infrared Data (GridSat-B1); second, the bias is corrected with the 2.5° monthly grid of the Global Precipitation Climatology Project (GPCP) (Sorooshian et al., 2000; Ashouri et al., 2015). The final product covers the major part of the world (60°S-60°N), has a daily time step and a spatial resolution of 0.25°. The data are available through the Climate Data Record of the National Oceanic and Atmospheric Administration's (CDR NOAA) website 17 .
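The idea of correcting daily estimates with a monthly reference can be illustrated with a simple multiplicative scaling for a single grid cell, in which the daily values are rescaled so that their monthly total matches the reference. This is a sketch of the general principle under that assumption, not the exact PERSIANN-CDR procedure; the function name and the numbers are hypothetical.

```python
def scale_to_monthly_reference(daily_estimates, monthly_reference):
    """Rescale daily values so that their sum equals the monthly reference.

    If the daily estimates sum to zero, they are returned unchanged
    because no scaling factor can be defined.
    """
    total = sum(daily_estimates)
    if total <= 0:
        return list(daily_estimates)
    factor = monthly_reference / total
    return [value * factor for value in daily_estimates]


# Hypothetical daily satellite estimates (mm) and monthly reference (mm).
daily_mm = [0.0, 4.0, 6.0, 0.0, 10.0]
corrected_mm = scale_to_monthly_reference(daily_mm, monthly_reference=30.0)
print(corrected_mm)
```

Such a scaling preserves the day-to-day pattern of the satellite estimate and only adjusts the amounts, so dry days remain dry.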

Snowfall
Snowfall estimates are still challenging in mountainous regions, particularly when precipitation is estimated from space. Three main satellite estimates can be used to quantify this variable: CHIRPS, GPM-IMERG and CloudSat. In the following, we focus on the last two products because CHIRPS has a poor ability to detect snowfall (Bai et al., 2018). With its dual-frequency precipitation radar (DPR) operating at 13.6 and 35.5 GHz, GPM was designed to have a high sensitivity to low amounts of precipitation and to discriminate solid from liquid precipitation (You et al., 2017). For CloudSat, at a global scale, Kulie et al. (2016) propose a separation between the different kinds of clouds generating snowfall, i.e., shallow cumuliform or nimbostratus clouds. Only a few regions are investigated in detail (e.g., Greenland, inland Russia, among others). As no evaluations against ground measurements have been made, they conclude that the next steps should be to use reanalysis and ground-based observational datasets to identify and quantify the dataset biases. Skofronick-Jackson et al. (2019) compare active and passive snowfall estimates for GPM and CloudSat at the global scale, with an exhaustive analysis of the differences arising from classification, sampling, instrumentation and algorithms, and they propose a unified approach to evaluate the global snowfall datasets. They also conclude that it is necessary to validate the GPM and CloudSat estimates with ground-based observations over contrasting regions and different kinds of surfaces. At the Andean scale, the problem is that such data are scarce. Kidd and Levizzani (2019) recently provided an exhaustive analysis of precipitation estimation from Earth observation satellites.
These authors highlight the good accuracy and mention that the resolution (spatial and temporal) is suitable for hydrological modeling and water resource assessment. They also list the future improvements that are needed. The TRMM PR and CloudSat radar systems will be added to the upcoming radars on the GPM-core satellites and the European Space Agency's Clouds, Aerosol and Radiation Explorer (EarthCare). Finally, to improve the identification and quantification of frozen precipitation, effort should be made to correctly quantify the radiative transfer modified by the snow particles. This would be possible by combining both active and passive radar observations. The high variability of the surface ground state (rugosity, type of soil cover and humidity) will have to be considered.

Land Surface Temperature and Evaporation
In the hydrological cycle, the Land Surface Temperature (LST) is essential for discerning the physical processes occurring on the Earth's surface at the regional and global scale. The variability of LST is due to soil-surface-atmosphere interactions and is important for environmental studies and water management (Li et al., 2013). More specifically, LST can be used to quantify climate warming. In this context, MODIS LST is commonly used to estimate the air temperature (Zhang et al., 2016b) because the Terra and Aqua satellites each pass over a given area at approximately the same local time every day; the data representing daylight conditions are therefore comparable between days (Mao et al., 2017). Nevertheless, caution should be taken with cloudy scenes because cloud cover affects the relationship between LST and air temperature (Zhang et al., 2016b). Aguilar-Lome et al. (2019) used monthly MODIS LST data at a resolution of 1 km over the Andes (between 7 and 20°S) for the 2000-2017 period to characterize the regional air temperature trends and determine their altitude dependence. They limit their study to the austral winter (June-July-August) to avoid problems with clouds. The main conclusions are that, at the regional scale, the temperature trend is positive in most areas, with an average rate of 1.0°C/decade, and that this increase depends on elevation and is more pronounced at higher elevations. This kind of analysis could be planned for the whole Andean region, as temperature changes affect Andean ecosystems.
The actual evaporation (E) and evapotranspiration (ET) are calculated using the atmospheric state (temperature, humidity, wind, etc.), the land cover type (vegetation, bare soil, bare rocks, open water, etc.) and the soil moisture. Three approaches are used to calculate the actual evapotranspiration from remote sensing imagery (Zhang et al., 2016a): (i) Land Surface Temperature Vegetation Index (LST-VI) space methods; (ii) empirical methods; and (iii) Surface Energy Balance (SEB) methods. Chen and Liu (2020) review these methods, and one of their conclusions is the need to improve the regional in situ ET monitoring capacity. Two satellite systems are mostly used to calculate E and ET: MODIS and Landsat (7 and 8). Olivera-Guerra et al. (2020) propose a simple model to estimate the daily root zone soil moisture as well as ET with Landsat 7 and 8 data and show that the model is suitable at a monthly time scale but is limited by the frequency of the Landsat images. MODIS is more suitable at the regional scale and can be used to calculate daily ET (Zou et al., 2020). All of the methods that estimate ET from remote sensing imagery require large storage and computing capacities. Efforts are being made to parallelize the algorithms so that they can be run on either public or private clouds (Cunha et al., 2020). Furthermore, refining hydrological modeling will require shorter revisit times from the satellites that will be launched in the future.

Snow Cover
At high altitudes, snowfall, accumulation and melt are important in the hydrological cycle. Remote sensing data are useful for estimating snow cover evolution, and the main satellites and sensors for this field are MODIS, VIIRS, SPOT, Landsat and SENTINEL, among others. The major advantage of MODIS is the time step, resolution and duration of the time series, which can be used to calculate climatic trends. For instance, over the 2000-2016 period for the Andes (8 to 36°S), Saavedra et al. (2018) identify a significant rise of the snowline of approximately 10-30 m/yr south of 30°S. Concerning the snow persistence (the fraction of time with snow cover for 1 year) over the same period, a distinction should be made between the region situated to the north of 29°S, which has limited snow cover, and the areas found between 29 and 36°S, which have significant seasonal snow cover. For the latter region, the snow season decreases by 5 days over the period. Malmros et al. (2018) provide details concerning the central Chilean and Argentinean Andes for the same period. They use the MODIS MOD10A1 C6 daily snow product in a hydrological model. The Snow Cover Extent (SCE) and the snow persistence decrease by an average of ± 2% and 43 ± 20 days, respectively. Snapir et al. (2019) used a data fusion technique with MODIS and Sentinel-1 data for a large Himalayan catchment (55,000 km²) over ∼2.5 years to characterize the changes in the snow cover area as well as its fraction of wet snow. This kind of data is pertinent as input to hydrological models and will help improve runoff analyses (water resources, flood risk, etc.). This methodology could be applied in the Andean region when SENTINEL data are available. The results identify a significant inter-annual variability for these two variables but, contrary to the other studies mentioned above, no long-term trend is detected.
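The snow persistence defined above (the fraction of time with snow cover for 1 year) can be sketched for a single pixel as follows. This is a minimal illustration, assuming a daily binary snow flag and assuming that days without a valid observation (e.g., cloud cover) are simply excluded from the count; the function name and sample values are hypothetical, and the cited studies may treat data gaps differently.

```python
def snow_persistence(daily_snow_flags):
    """Fraction of valid daily observations in one year flagged as snow covered.

    daily_snow_flags: one entry per day, True (snow), False (no snow)
    or None (no valid observation, e.g. cloud cover).
    """
    valid = [flag for flag in daily_snow_flags if flag is not None]
    if not valid:
        return None  # no usable observation for this pixel
    return sum(1 for flag in valid if flag) / len(valid)


# Hypothetical pixel: 90 snow days, 260 snow-free days, 15 days without data.
flags = [True] * 90 + [False] * 260 + [None] * 15
sp = snow_persistence(flags)
print(round(sp, 3))
```

Computing this value for each year of a time series gives the per-pixel series from which a persistence trend can then be estimated.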

Water Storage in Soils
Changes in soil moisture and groundwater storage are crucial components of the hydrological cycle and essential for water management. At a large spatial scale, two recent studies aimed to characterize the changes in water storage in the soil using GRACE data (Khaki and Awange, 2019; Ndehedehe and Ferreira, 2020). In the first study, South America is divided into 15 major catchments and the water balance is calculated for each one using a hydrological model at a 3° × 3° spatial resolution and a monthly time step. In this model, the assimilation of GRACE and SMOS data improves the quality of the simulations. Nevertheless, this kind of analysis cannot be carried out for small mountainous catchments because of the low temporal and spatial resolutions. In the second study, the groundwater changes in the floodplain are analyzed in detail, but some discrepancies between the simulations and observations in the arid zone highlight the difficulty of quantifying complex hydrogeological processes from space where an anthropogenic influence could play an important role.

Vegetation Indices and Land Cover Changes
Land use and land cover change affect regional climate through modifications in the water balance and energy budget. Remote sensing data offer a great opportunity to quantify these changes at the regional scale. For example, MODIS provides time series of vegetation indices (MODIS-VI) and a normalized difference vegetation index at 1 km and 500 m, respectively (Tucker et al., 2005). Huete et al. (2002) demonstrate the performance of these two indices against in situ measurements for the semiarid grass, shrub, savanna and tropical forest biomes. For small mountainous catchments, some limitations exist due to the relatively low spatial resolution of the products. One possibility is to use Landsat 5 TM, Landsat 7 ETM+ and Landsat 8 data, as shown by López et al. (2017), who investigate the connections between land use and climate variation at the decadal time scale for the Cotopaxi region in Ecuador between 1976 and 2013. On the other hand, Salazar et al. (2015) propose an exhaustive analysis of land use and land cover change at the continental scale over South America, with a link to climatic variability. The data used, MODIS and Landsat TM, make it possible to quantify precisely where severe changes occurred in the region over the past few decades, e.g., the Amazon deforestation, the Chilean Matorral or the Atlantic forest. In their conclusion, they highlight the fact that more studies are needed in order to estimate the changes in non-Amazonian South America and to ascertain the link with the duration and intensity of climate extremes.
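As a reminder of how the normalized difference vegetation index is obtained, the following minimal sketch applies its standard definition, (NIR - Red) / (NIR + Red), to a pair of surface reflectances. The function name and the reflectance values are hypothetical and only serve as an illustration; operational products add atmospheric correction, compositing and quality screening.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from two surface reflectances."""
    total = nir + red
    if total == 0:
        return None  # undefined when both reflectances are zero
    return (nir - red) / total


# Hypothetical reflectances, for illustration only.
vegetated = ndvi(nir=0.5, red=0.1)  # strong near-infrared response
bare = ndvi(nir=0.3, red=0.3)       # equal response in both bands
print(round(vegetated, 3), bare)
```

Values close to 1 indicate dense green vegetation, while values near 0 are typical of bare surfaces.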

DATA PRODUCED BY RE-ANALYSIS AND MODELING
Reanalysis is a systematic approach to producing datasets for climate monitoring, and this option can be used in hydroclimatic studies (Van den Hurk, 2012). Reanalyses are created by processing observational data with an assimilation scheme and a model, which ingest all available observations every 3, 6 or 12 h over the period being analyzed. The data, produced on a grid at a given spatial resolution, provide a dynamically consistent estimate of the climate state at each time step. However, it is necessary to know the technical specifications and to be aware of the limitations (Parker, 2016). The main reanalysis products are detailed in the following section (see Sun et al., 2018 for an exhaustive list).
European Centre for Medium-Range Weather Forecasts (ECMWF; ERA-Interim and ERA5 Data)
The European Centre uses its forecast systems to generate global data sets for the atmosphere, land surface and oceans. The reanalysis data highlighted for this institution are ERA-Interim and ERA5. ERA-Interim provides global atmospheric data with a 6-h analysis window, a spatial resolution of approximately 80 km and a time period of 1979-2019. These reanalysis products are updated once a month with a delay of 2 months to guarantee quality and to correct any possible technical problems. In 2017, a new version of the ECMWF reanalysis data (ERA5) was released, with a spatial resolution of 30 × 30 km and improved data quality, including information about uncertainties for all variables. Compared with ERA-Interim, this new dataset has a higher spatial and temporal resolution, a better tropospheric description, a better global balance between precipitation and evaporation, better precipitation over land and in the tropical zone, better soil moisture and more consistent Sea Surface Temperature and sea ice. It also covers a longer period, from 1950 until the present. ERA5 data will eventually replace ERA-Interim, and uncertainty estimates for ERA5 can be found in Laloyaux et al. (2018). These data can be downloaded on the Copernicus webpage 18 .

NCEP/NCAR Reanalysis 1 Project
The NCEP/NCAR reanalysis is available from 1948 to the present with a 6-h temporal resolution. This reanalysis is produced by an analysis/forecast system with the assimilation of in situ data. However, the data from 1948 to 1957 differ from the rest of the time series due to time step differences. The data and related technical specifications can be found on the PSD webpage 19 .

JRA-55 Japanese Reanalysis
The Japanese global atmospheric reanalysis project covers 55 years, extending back to 1958. The data were produced in a regular latitude-longitude Gaussian grid (320 latitudes by 640 longitudes, nominally 0.56 degree). For detailed information about this project, see Kobayashi et al. (2015). JRA-55 data and technical information can be consulted on the University Corporation for Atmospheric Research (UCAR) webpage 20 .

Multi-Source Weighted-Ensemble Precipitation (MSWEP)
Another important precipitation dataset developed for hydrology is the Multi-Source Weighted-Ensemble Precipitation (MSWEP; Beck et al., 2017). The latest version of MSWEP provides a fully global historic precipitation dataset (1979-2017) with a 3-hourly temporal and 0.1° spatial resolution. The key feature of the MSWEP product is that it provides a worldwide precipitation estimate by merging seven complementary precipitation datasets: three based on satellite remote sensing, two on gauge observations and two on atmospheric reanalysis. For each grid cell, the weight assigned to the gauge-based estimates is calculated from the gauge network density, while the weights assigned to the satellite and reanalysis based estimates are calculated from their comparative performance at the surrounding gauges 21 . The long-term mean of MSWEP is based on the Climate Hazards Group's Precipitation Climatology (CHPclim) dataset. It also includes a correction for gauge under-catch and orographic effects.
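The principle of a weighted merge for a single grid cell and time step can be sketched as below. This is only an illustration, assuming that the weights are already known and that sources with missing values are skipped with the remaining weights renormalized; the function name, source labels, weights and values are hypothetical and do not reproduce the actual MSWEP weighting scheme of Beck et al. (2017).

```python
def merge_estimates(estimates_mm, weights):
    """Weighted average of precipitation estimates for one grid cell.

    estimates_mm: mapping from source name to estimate (mm), None if missing.
    weights: mapping from source name to a non-negative weight.
    Sources with a missing value or a zero weight are skipped and the
    remaining weights are renormalized.
    """
    pairs = [
        (value, weights.get(name, 0.0))
        for name, value in estimates_mm.items()
        if value is not None and weights.get(name, 0.0) > 0
    ]
    total_weight = sum(weight for _, weight in pairs)
    if total_weight == 0:
        return None  # no usable source for this cell
    return sum(value * weight for value, weight in pairs) / total_weight


# Hypothetical cell: a dense gauge network gives the gauge estimate a high weight.
estimates = {"gauge": 12.0, "satellite": 8.0, "reanalysis": 10.0}
weights = {"gauge": 0.5, "satellite": 0.25, "reanalysis": 0.25}
merged = merge_estimates(estimates, weights)
print(merged)
```

In a poorly gauged Andean cell, the gauge weight would be low and the merged value would be driven mostly by the satellite and reanalysis estimates.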

Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2)
The MERRA-2 product, provided by NASA's Global Modeling and Assimilation Office (GMAO), has a spatial resolution of 50 km, an hourly temporal resolution, and is available from 1980 to the present. Several observation types (GPS radio occultation, among others) are assimilated into MERRA-2 with the GEOS model. Data are available at the Goddard Space Flight Center 22 and detailed information concerning this product can be found in Gelaro et al. (2017).

Global Land Evaporation Amsterdam Model (GLEAM v3)
GLEAM is a set of algorithms that separately estimate the different components of land evaporation at the global scale. The eight products are: (i) transpiration, (ii) bare-soil evaporation, (iii) interception loss, (iv) open-water evaporation, (v) sublimation, (vi) surface and root-zone soil moisture, (vii) potential evaporation, and (viii) evaporative stress conditions. The algorithm, based on the Priestley and Taylor equation, uses observations of surface net radiation and near-surface air temperature to calculate the potential evaporation for the land fractions of tall canopy, short canopy and bare soil. These potential evaporation estimates are then converted into actual evaporation using a multiplicative evaporative stress factor based on observations of microwave Vegetation Optical Depth (VOD) and estimates of root-zone soil moisture. Two recent products are available at a daily resolution and a 25 km spatial resolution: GLEAM v3.3a (1980-2018 period) and GLEAM v3.3b (2003-2018 period), which are mainly based on satellite data. Some limitations are inherent to this kind of global product for mountainous regions due to the spatial resolution and the limited knowledge of the soil properties. Details concerning the data can be found in Miralles et al. (2011) and Martens et al. (2017), and the data can be downloaded on the GLEAM website 23 .
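The two-step logic described above (potential evaporation from the Priestley and Taylor equation, then scaling by a stress factor) can be sketched as follows. This is a simplified illustration and not the GLEAM code: the coefficient alpha = 1.26, the psychrometric constant, the latent heat value and the neglect of the ground heat flux are assumptions, the saturation vapour pressure slope follows a commonly used Tetens-type approximation, and the function names and input values are hypothetical. Interception loss, which GLEAM treats separately, is not included.

```python
import math


def priestley_taylor_pet(net_radiation, air_temp_c, alpha=1.26, gamma=0.066):
    """Potential evaporation (mm/day) from the Priestley and Taylor equation.

    net_radiation: surface net radiation in MJ m-2 day-1 (ground heat flux neglected).
    air_temp_c: near-surface air temperature in degrees Celsius.
    alpha: Priestley-Taylor coefficient (assumed value).
    gamma: psychrometric constant in kPa/degC (assumed; lower at high altitude).
    """
    latent_heat = 2.45  # MJ/kg, approximate latent heat of vaporization
    # Saturation vapour pressure (kPa) and slope of its curve (kPa/degC).
    e_sat = 0.6108 * math.exp(17.27 * air_temp_c / (air_temp_c + 237.3))
    slope = 4098.0 * e_sat / (air_temp_c + 237.3) ** 2
    return alpha * slope / (slope + gamma) * net_radiation / latent_heat


def actual_evaporation(potential_mm, stress_factor):
    """Scale potential evaporation by a stress factor between 0 and 1."""
    if not 0.0 <= stress_factor <= 1.0:
        raise ValueError("stress_factor must lie between 0 and 1")
    return stress_factor * potential_mm


# Hypothetical inputs, for illustration only.
pet = priestley_taylor_pet(net_radiation=10.0, air_temp_c=10.0)
aet = actual_evaporation(pet, stress_factor=0.6)
print(round(pet, 2), round(aet, 2))
```

Because the psychrometric constant decreases with atmospheric pressure, a site-specific value would be needed for high-elevation Andean catchments.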

PHYSIOGRAPHIC AND MORPHO-TOPOGRAPHIC DATA
Several global and open datasets are available to estimate physiographic data relevant for hydrology. Recently, land cover and land use data with a 30-meter resolution were released by the Global Land Cover Characterization (GLCC) Project. This land cover map is one of the best available open datasets. GLCC is based on 1 year of Advanced Very High Resolution Radiometer (AVHRR) data and an unsupervised classification. Data are available on the GISGeography webpage 24 . In addition, most of the physiographic catchment characteristics can be estimated from the European Space Agency (ESA) Climate Change Initiative (CCI) program or from NASA, among others. The global databases cover the principal physical controlling factors of the hydrological cycle with land cover characteristics, soil properties, glaciers, water bodies (lakes, rivers, etc.) and flooded areas. These databases are listed in Table 6 (non-exhaustive list).
In addition, Digital Elevation Models (DEM) are crucial in climatological and hydrological studies; they can be used to delineate the catchment limits, to determine flow routings, aspects, etc. The most common DEMs are: (i) the Global Digital Elevation Model (GDEM v2), with a resolution of 30 m; (ii) the Shuttle Radar Topography Mission (SRTM v4), with a resolution of 90 m (Satgé et al., 2015); (iii) JAXA's global ALOS World 3D, with a resolution of 30 m; and (iv) the TanDEM-X DEM, with a resolution of 12 m, distributed by the German Aerospace Center DLR 3 . Since 2010, higher spatial resolution DEMs at the decimetric scale can be obtained with the Pleiades satellites. In addition, the Global Width Database for Large Rivers (GWD-LR), with a resolution of three arcseconds (Yamazaki et al., 2014), can be used for flow routing.

KNOWLEDGE GAPS AND NEXT STEPS
A recent evaluation of multiple satellite products in Chile reported that CHIRPS and MSWEP data are quite promising. One way to address the complexity of spatio-temporal patterns is to use dynamical downscaling of global climate models for South America, as done in the CORDEX/CLARIS project (Solman, 2013). It is also possible to use the Weather Research and Forecasting (WRF) model, as done in the study by Armenta-Porras and Pabón Caicedo (2016), where a dataset of meteorological and hydrological variables was produced for the northern Andes at a horizontal resolution of 10 × 10 km, over 39 levels, and at a 3-hourly time step. Finally, this kind of Regional Climate Model can be used to describe the spatio-temporal variability of the precipitation, taking the orography into account (Trachte et al., 2018); however, any biases should be corrected before using these data in hydro-glaciological models (Mourre et al., 2016). A recent atmospheric modeling framework was proposed over the Himalayas and Middle Eastern mountains at the kilometer scale and can be used to reproduce the effect of the orography on atmospheric flow (Van Niekerk et al., 2018). This kind of methodology could guide future studies at the scale of the Andes. To date, the development of precipitation products tends to combine the strengths of multiple and complementary data sources (gauge, satellite and reanalysis-based data, as well as model simulations) in order to produce reliable precipitation estimates (Heredia et al., 2018). Combined products with local weather stations and TRMM data can also integrate the normalized difference vegetation index, as shown by Yarleque et al. (2016) in the Peruvian Andes.

CONCLUSION AND PERSPECTIVES
Despite the asymmetries among countries and capabilities in the Andean region, there is strong evidence of increasing observational capacity and data availability. However, in several meetings of the RA III (WMO, 2018), the NMHSs identified that some of the current limitations on weather and climate prediction are related to the limited understanding of the physical processes that define weather and climate in particular regions. Data availability could be increased by identifying and prioritizing scientific questions so that a sound research agenda for the region can be put into place, under which the NMHSs would acquire the data and send them to the OSCAR database. Since one of the main challenges in the region is the sustainability of the observation networks and the increase in available data for applications and research, potential activities should focus on promoting and facilitating meeting points between the operational community represented by the NMHSs and the scientific community in the region. ANDEX could be a suitable opportunity to foster this approach and to discuss jointly the relevance and urgency of addressing societal needs, saving lives, reducing the adverse impacts of hydrometeorological hazards and adapting to a changing climate. This is particularly challenging in the Andean region, where many scientific questions are still open, which limits prediction and early warning capabilities.
The international context provides a unique opportunity to move forward. The new WCRP Strategic Plan 2019-2029, the recently approved WMO Strategy and the current reform in the organization aim to promote a better and more efficient interaction between the operational and research communities, with a strong focus on the regions. The Regional Climate Centers currently operating in South America, CIIFEN/RCC-WSA and RCC-SSA, are suitable infrastructures that ANDEX could use to foster collaborative hydroclimate research. It is highly recommended that the operational and research communities meet, work together and demonstrate the relevance of ANDEX to governments and funding agencies. This will help maintain and enhance the NMHSs' capabilities and obtain the necessary funding to foster research activities in this field. Considering the expanding GSM network, future efforts should be made to increase remote data transmission. This would be a prerequisite for near-real-time forecasts in climatology and hydrology. Knowledge of the physical processes that govern the climate in mountainous regions could be improved if there were more radiosondes in the area. In addition, from a hydrological point of view, scientific programs could focus on a better understanding of groundwater and water use data, as these are crucial in the hydrological cycle but still poorly known.

AUTHOR CONTRIBUTIONS
RM requested station data from the NMHSs in Bolivia, Colombia, Chile, Ecuador, Peru, Venezuela, and Argentina. He also analyzed the collected data. JP helped describe the hydrological data and the hydrometeorological Andean network. FC incorporated and described the OSCAR data. FL generated the maps with GIS processing. LP, TC, and JN incorporated the satellite information and physiographic datasets used in hydrology. All of the authors helped write the article.

FUNDING
This review paper was produced as part of the ANDEX program (www.gewex.org/project/andex/), which is a prospective Regional Hydroclimate Project (RHP) of the GEWEX Hydroclimatology Panel (GHP). This study was supported by the 'Centro Internacional para la Investigación del Fenómeno de El Niño,' the 'Universidad Nacional de Colombia,' the 'Universidad Yachay Tech,' the 'Escuela Politécnica Nacional,' and the French 'Institut de Recherche pour le Développement' (IRD) through the Andean part of the French glacier observatory service, GLACIOCLIM (https://glacioclim.osug.fr/spip.php?rubrique30&lang=en). It was conducted in the framework of the International Joint Laboratory GREAT-ICE, a joint initiative of the IRD and universities and institutions in Bolivia, Peru, Ecuador, and Colombia.