Evaluation of State and Bias Estimates for Assimilation of SMOS Retrievals Into Conceptual Rainfall-Runoff Models

For an accurate estimation of land surface state variables through remote sensing data assimilation, it is important to estimate the forecast and observation biases as well. This study focuses on the evaluation of a methodology to estimate land surface state variables, together with model forecast and observation biases. Two conceptual rainfall-runoff models (HBV and GRKAL) are used for this purpose. Soil moisture data, retrieved by the Soil Moisture Ocean Salinity (SMOS) mission, are assimilated into these models for 59 unregulated sub-basins of the Murray-Darling basin in Australia. When both models simulate similar soil moisture values, the methodology results in similar forecast and observation bias estimates for both models. The same behavior is obtained when the temporal evolution of the soil moisture simulations is different, but with a similar long-term mean climatology. However, when the long-term mean climatology of both models is different, but with a similar temporal evolution, the bias estimates from both models have a different climatology as well, but with a high temporal correlation. The overall conclusion from this paper is that observation bias estimation is of key importance when updating internal state variables in a conceptual rainfall-runoff system that is calibrated to produce realistic discharge output for possibly biased internal state variables, and that the relative partitioning of bias into forecast and observation bias remains a model-dependent challenge.


INTRODUCTION
Soil moisture is a key variable in the hydrologic cycle, through its crucial role in the partitioning of the available radiation into latent and sensible heat fluxes, and the partitioning of rainfall into surface runoff and infiltration. Through these mechanisms, forecasted precipitation is highly sensitive to the land surface wetness conditions (Betts et al., 1996). Given the importance of soil moisture, the hydrological community has devoted significant efforts to the estimation of the surface soil moisture state at large spatial scales, and has identified a number of microwave missions that can be used for soil moisture estimation. Recent examples are the Sentinel-1 (Attema et al., 2007), the Advanced Scatterometer (ASCAT) (Figa-Saldana et al., 2002), the Soil Moisture Ocean Salinity (SMOS) (Kerr et al., 2010), the Soil Moisture Active Passive (SMAP) (Entekhabi et al., 2010), and the Global Change Observation Mission (GCOM) (Imaoka et al., 2010) missions. The major drawback of satellite soil moisture retrievals is the relatively large interval between overpasses, which can vary from a few days to multiple weeks (Li et al., 2016). The large spatial scale makes the use of these products for watershed management problematic. Furthermore, because of observation noise and errors in the inversion algorithm, remotely sensed soil moisture values will always be prone to a certain level of error. For these reasons, it has been suggested that the best way to estimate soil moisture values is the merging of satellite remote sensing and hydrologic modeling (Kostov and Jackson, 1993), which is commonly referred to as soil moisture data assimilation. Since the pilot study of Entekhabi et al. (1994), a large number of studies have put soil moisture data assimilation into practice.
Over the last few years, a number of studies have focused on the assimilation of SMOS data into hydrologic models. These studies can be classified into studies that focused on SMOS soil moisture data assimilation with the objective to obtain improved soil moisture values (Brocca et al., 2013;Dumedah and Walker, 2014b;Dumedah et al., 2014Dumedah et al., , 2015Ridler et al., 2014;Zhao et al., 2014;Xu et al., 2015;Blankenship et al., 2016), soil moisture assimilation to improve discharge predictions (Wanders et al., 2014;Alvarez-Garreton et al., 2015;Lievens et al., 2015b), crop yields (Chakrabart et al., 2015), evapotranspiration estimates (Martens et al., 2015), atmospheric CO 2 simulations (Scholze et al., 2016), and model parameters (Dumedah and Walker, 2014a;Han et al., 2014;Lee et al., 2014). Furthermore, other studies have focused on brightness temperature assimilation (De Lannoy and Reichle, 2016b), and a comparison of brightness temperature to soil moisture assimilation (De Lannoy and Reichle, 2016a;Lievens et al., 2016). One of the outstanding issues with the assimilation of SMOS data into hydrologic models is the partitioning of biases into forecast and observation biases (Kornelsen et al., 2016).
The question of how to best deal with the bias between model forecasts and satellite observations has received significant attention. One practical approach consists of rescaling the observations so their distribution matches the distribution of the model state variables, a method commonly referred to as Cumulative Distribution Function (CDF)-matching. This was first introduced by Reichle and Koster (2004). A number of problems have been found with this approach. First, as demonstrated by Kumar et al. (2015), this approach could lead to the exclusion of unmodeled processes in the analysis results. Further, Pauwels and De Lannoy (2015) showed that CDFmatching removes the bias between observations and forecasts, but it does not correct the absolute value of the analyses or forecasts. The logical consequence is that CDF-matching may not be sufficient for coupled models, when the prediction of e.g., discharge depends on the absolute value of a model variable. Finally, a historical time series is not always available at the beginning of a new mission and data reprocessing or updates to the retrieval algorithm requires an update to the CDF-matching.
A second approach consists of a separate estimation of the model state variables and the forecast bias through the assimilation algorithm, which was first suggested by Friedland (1969). A number of studies have since then focused on separate state and forecast bias estimation, either through the inclusion of the forecast bias in the state vector (state augmentation) (Drécourt et al., 2006;Kollat et al., 2011) or the use of a dual filter (Dee and Da Silva, 1998;Dee and Todling, 2000;Dee, 2005;De Lannoy et al., 2007). Both state augmentation and the dual filter approach were compared by Drécourt et al. (2006), leading to the conclusion that both approaches lead to better model performance than when the bias is not taken into account. A number of studies have also attempted to estimate observation biases and the model state variables through the assimilation algorithm (Derber and Wu, 1998;Auligné et al., 2007;Dee and Uppsala, 2009;Montzka et al., 2013;Draper et al., 2015). Pauwels et al. (2013) developed a framework to estimate both observation and forecast biases, in addition to the model state variables, using the ensemble Kalman filter.
The question of how to best deal with biases between the model state variables and the observations for the assimilation of SMOS retrieval data remains unanswered. Most studies focusing on SMOS retrieval data assimilation removed bias through rescaling of the observations. Lievens et al. (2015b) compared a number of prior rescaling methods, and concluded that only matching the first moment of the soil moisture led to the strongest improvement in the modeled discharge. The objective of this study is to assess the results of a dual bias and state estimation algorithm for two different conceptual rainfall-runoff models. The study focuses on one-dimensional soil moisture data assimilation, using SMOS data from 2010 to 2016, across all unregulated subcatchments of the Murray-Darling basin in Australia.

STUDY SITE AND DATA DESCRIPTION
The study is performed on the Murray-Darling basin in Australia, which is discussed in detail in Lievens et al. (2015a). Figure 1 shows the location of the test site and the modeled subbasins. The basin has a drainage area of ∼1 million km 2 , and has been the subject of regular flooding over the last years. It is one of the nation's most important agricultural areas, accounting for roughly 40% of Australia's food production. Generally the catchment receives low precipitation amounts, but isolated heavy rainfall events do lead to significant flooding. Rainfall amounts vary very strongly throughout the basin. For example, in 2013 the Southern areas received ∼1,200 mm, while the Western parts received <200 mm. This variability in rainfall amounts, both spatially and temporally, combined with the availability of data, make this basin both an ideal and challenging test bed for this study. Daily streamflow records have been collected for 169 gauge stations, which are mainly located along the East boundary of the basin. The drainage area for the available stations ranges between 50 and 52,000 km 2 , with an average of 3,400 km 2 . Daily streamflow measurements are available from 2009 to 2016 from the Water Data Online project of the Bureau of Meteorology (http://www.bom.gov. au/waterdata/). Rainfall and potential evapotranspiration were available, with a spatial resolution of 0.05 • , from the Bureau of Meteorology Australian Water Availability Project (AWAP, http://www.bom.gov.au/jsp/awap/). The SMOS SMUDP2 v620 retrieval data were mapped onto a 36 km EASEv2 grid and conservatively screened for the impact of open water, radio frequency interference, retrieval accuracy, etc., as detailed in Koster et al. (2016). The bottom panel of Figure 1 shows the overlay of the SMOS grid on the study area.
For each subcatchment, spatially averaged soil moisture observations and meteorological forcings were used in the study. These were calculated by summing the SMOS observations (or the AWAP data) for all SMOS (or AWAP) grids located inside the subcatchment, and dividing by the number of grids inside the subcatchment.
SMOS data from June 2010 to December 2016 were used. The number of available SMOS observations varied strongly between the studied catchments. In order to draw reliable conclusions, only the catchments with at least 100 soil moisture observations throughout the simulation period were retained. Based on this criterion, the results for 59 subcatchments of the Murray Darling basin were analyzed.

MODEL DESCRIPTION
To study the relative effect of various biases in various rainfallrunoff models, and to check the robustness of the assimilation system, two models with a different representation of soil moisture are used in this study.

The HBV Model
One model used in this study is the Hydrologiska Byråns Vattenbalansavdelning (HBV) model, that was originally developed by Linström et al. (1997), and is described in detail in Pauwels et al. (2013). In its original formulation, the model uses observed precipitation (R tot ) and potential evapotranspiration (ETP) as input. The catchment is divided into a soil reservoir (with storage S), a slow reservoir (with storage S 1 ), and a fast reservoir (with storage S 2 ). The state vector in the assimilation procedure consists of these three variables, and for each of these variables the forecast bias is calculated. Table 1 lists the model parameters and their minimum and maximum value for the model parameter estimation.
The modeled discharge is routed to the catchment outlet using a triangular unit hydrograph. The amount of time steps in the unit hydrograph is referred to as the base time, and the value of the unit hydrograph increases linearly from zero at time zero to the maximum at half the base time. After this, the unit hydrograph decreases linearly from the maximum value at half the base time to zero at the base time. The unit hydrograph value at half the base time is calculated so that the integral of the unit hydrograph is equal to one. The base time of the unit hydrograph is a calibrated parameter as well.

The GRKAL Model
The second model is the Génie Rural Kalman model (GRKAL), as described in Francois et al. (2003). This is based on the daily conceptual model GR4 (Loumagne et al., 1996). The rainfall is partitioned into infiltration into the soil column and a subterranean layer. The soil column consists of a thin upper layer and a deeper lower layer. The state vector in the assimilation algorithm consists of the water content of this thin layer, the deeper lower layer, and the subterranean layer. Again, for each of these storages the forecast bias is calculated. The root fraction of these two layers determines the partitioning of the evapotranspiration between the layers. Diffusive exchange of water between the two layers is modeled based on their water contents. A oneparameter baseflow formulation is used. The baseflow is then added to the subterranean layer. The overland flow is then routed to the outlet through a triangular unit hydrograph, and the baseflow is routed using a linear reservoir. The sum of the routed baseflow and surface runoff is the catchment discharge. For a detailed model description we refer to Francois et al. (2003). Table 1 lists the model parameters and their minimum and maximum value for the model parameter estimation.

MODEL PARAMETER ESTIMATION
Prior to the assimilation of SMOS soil moisture retrievals, both models are calibrated to ensure that both models simulate similar, realistic, levels of discharge.

The Parameter Estimation Algorithm
The parameter estimation algorithm used in this paper, Particle Swarm Optimization, PSO (Kennedy and Eberhart, 1995), is based on the complex, collective behavior of individuals in decentralized, self-organizing systems. These systems are created through a population of individuals that interact locally with each other and with the community. These interactions lead to global behavior, which can result in the achievement of certain objectives. Examples of such systems in nature are abundant: ant colonies, swarms of birds, and schools of fish. For a complete description of the algorithm we refer to Scheerlinck et al. (2009).

Application
For each of the 59 modeled sub-basins of the Murray-Darling basin, the model was applied from January 1, 2009, to December 31, 2016, using a daily time step. The Root Mean Square Error (RMSE) between model simulations and observations of discharge (Q o and Q s , respectively, in m 3 s −1 ) was calculated as follows: (1) N q is the amount of discharge observations. The model parameters listed in Table 1 were estimated, as well as the base time of the unit hydrograph of the HBV model. Two different calibration strategies have been applied. First, the model was calibrated using discharge data from the entire simulation period. The results of the SMOS data assimilation algorithms will then be evaluated during the calibration period. Second, the model was calibrated using the discharge observations from 2009 to 2013. The data assimilation algorithms will then be validated using data from the validation period (2014)(2015)(2016). In the second calibration strategy, SMOS data were assimilated only during 2014-2016, while in the first SMOS data from 2010 onwards were used. This strategy allowed an understanding of the robustness of the methodology, by applying the model to a time period that was used to estimate the model parameters, and a second time period that was not used in the model calibration. Unless noted otherwise, the remainder of the paper will mainly refer to the results from the first strategy.

Short Description of the Assimilation Algorithm
In this study, the dual observation/forecast bias and state variable estimation algorithm described in Pauwels et al. (2013) was used. This algorithm consists of an ensemble Kalman filter, estimating the system state variables, and two discrete Kalman filters, one for the observation and forecast biases, respectively. Figure 2 shows an overview of the methodology. The biased system state vector x k is propagated as follows: k is the time step, f k,k−1 the non-linear model propagating the state from time step k − 1 to time step k,x k is the biased state vector, u k the model forcings, and w k the model error. For the remainder of the paper, variables indicated with a [.] indicate biased variables, with biased a priori forecast error covarianceP − . The unbiased state vector x k is: with unbiased a priori forecast error covariance P − . b f k is the forecast bias. The system is observed as follows: H k is the observation operator (assumed linear),ỹ k is the vector with the biased observations, and v k the observation noise. v k is Gaussianly distributed with mean zero and variance R k . The unbiased observations are: b o k is the observation bias. The forecast and observation biases are propagated as follows: with a priori forecast bias error covariance P f − k and a priori observation bias error covariance P o− k . b f k−1 consists of the forecast bias estimate for the three model variables for both models, while b o k−1 consists of the estimate of the bias in the observed soil moisture. Pauwels et al. (2013) describe in detail the state and bias update equations. For these updates, the biased and unbiased background error covariances, and the observation and forecast bias error covariances are needed. The biased a priori background error covarianceP − k can be diagnosed from ensemble forecast integration (Reichle et al., 2002), based upon which the unbiased background error covariance P − k and the forecast bias error covariance P f − k can be calculated as: The superscript − indicates a prior value, before the updating. γ f is a filter parameter, between zero and one, which determines the amount of update to the state and the bias, and which can be obtained through calibration. Note that the forecast bias errors are thus assumed to be correlated in the same way as the forecast state errors, allowing an estimation of the bias in unobserved variables. The observation bias error covariance is calculated as: The matrix H kP − k H T k is the error covariance of the observation predictions, and can be calculated in a computationally efficient way using the ensemble statistics, as described in Pauwels and De Lannoy (2009). κ o is a filter parameter and can be estimated through calibration as well.

Terminology
For the remainder of the paper, the following consistent terminology is used.
The term "biased observations" refers to SMOS soil moisture retrievals, without bias removal. On the other hand, "unbiased observations" refers to the SMOS soil moisture observations, from which the estimated observation bias has been removed. "Observation bias" is the bias in the SMOS soil moisture observations, but it can also refer to the more general bias between SMOS observations and observations predictions. In other words, the observation bias includes both soil moisture retrieval bias and representativeness bias.
For the model results, a similar terminology is used. "Biased simulations" and "biased results" refer to the modeled soil moisture (from either GRKAL or HBV), again without bias removal. "Unbiased simulations" and "unbiased results" are the soil moisture simulations, from which the estimated forecast bias has been removed. "Forecast bias" means the bias in the modeled (GRKAL or HBV) soil moisture. "Baseline simulations" or "baseline results" refer to the model simulations without SMOS retrieval assimilation. "Assimilation run" refers to model simulations with the assimilation of SMOS soil moisture data.
In the sections below, all evaluations are based on ensemble mean forecasts and analyses.

Filter Application
The filter parameters, γ f and κ o , are calibrated through examining the Gaussianity of the innovations. These parameters are assumed constant throughout the entire simulation, and are determined through minimizing an objective function. More specifically, the normalized innovations for the bias and state updates need to be Gaussianly distributed. When one observation is assimilated, these requirements for the bias and state innovations can be respectively written as: [.] indicates an estimate of the state or bias vector, i is the ensemble member number, N is the ensemble size, and the overline refers to ensemble means. The superscript + indicates a posterior value, after the updating, and the error covariance matrices effectively collapse to scalar error variances. Sections 3.1 and 3.2 describe the three state variablesx. Consequently, the state and forecast bias vectors consist of three entries. Since the water content of the upper layer in each model can be directly compared to the SMOS soil moisture observations, and because only one observation is assimilated at each assimilation time step (leading to observation and observation bias vectors with one entry), the observation operator can be written as: An ensemble of 32 members was selected for the study, to dynamically estimateP − k (and to derive P − k , P f − k , and P o− k , following Equations 7 and 8). Pauwels et al. (2013) and Pauwels and De Lannoy (2015) have shown that this is a sufficiently large ensemble for data assimilation studies with the HBV model. A Gaussian random number with mean zero and standard deviation of 10% of the parameter or forcing input value is added to each model parameter for the first model time step, and to each forcing variable (precipitation and potential evapotranspiration) at each time step. This is the same strategy as was applied in Pauwels and De Lannoy (2015). A lower and upper bound was set for all parameters and forcing variables, in order to avoid physically unrealistic values. For the model parameters, these bounds are the values in Table 1. For the precipitation, a lower bound of 0 was used. 6. RESULTS

Estimation of the Model Parameter
The left hand side panel of Figure 3 shows the distribution of the Nash-Sutcliffe Efficiency (NSE) obtained using the optimal model parameter set for the 59 sub-basins of the Murray-Darling basin, for the model calibration over the entire simulation period. This efficiency is calculated as: σ 2 0 is the variance of the observed discharge values. MSE is the mean square error, which is the square of the value calculated in Equation (1). The average NSE for all 59 catchments is roughly similar for the two models. Figure 4 shows the comparison of the modeled discharge to the observations for the smallest modeled catchment in the basin (401015, 781 km 2 ), the catchment with the median drainage area in the basin (410048, 1,406 km 2 ), and the largest catchment in the basin (422201E, 81,250 km 2 ). The NSE values for these three catchments are within one standard deviation of the average, and can thus be considered as values typically obtained for the 59 catchments.
The right hand side panel of Figure 3 shows the distribution of the Nash-Sutcliffe Efficiency (NSE), for the validation period (2014)(2015)(2016), when calibrating the model using data from 2009 to 2013. As can be expected, these NSE values are lower than the results without a separate validation.
For a number of subcatchments a relatively low NSE has been obtained. This can be explained by a number of issues. First, there is the rainfall forcings, which will inevitably have a relatively large uncertainty for this very large area with a relatively coarse coverage of stations. The catchments are also characterized by extreme changes between dry and wet conditions and agricultural water management practices, which are very difficult to model. Overall, the models perform relatively well, but there are catchments where the performance is low.
The model performance can be assumed, for some catchments, to be too weak for operational flood management purposes. However, to assess the impact of soil moisture assimilation on the bias and state variable estimates of the models from a methodological point of view, these results can be assumed to be sufficiently accurate.

Estimation of the Filter Parameters
The filter parameters γ f and κ o were estimated in order to fulfill Equation (9). The objective function (OF) that was minimized for each river basin individually is: ). For γ f , relatively consistent estimates were obtained across all basins, with an average and standard deviation of 0.051 and 0.008, respectively, for the HBV model. For the GRKAL model, the average and standard deviation were 0.05 and 0.004. However, for κ o , more variable results were obtained. For the HBV model the average was 556.9 with a standard deviation of 421.5. For GRKAL the average and standard deviation were 33.8 and 39.7, respectively. This can be explained by the way the models simulate soil moisture. In other words, the variable soil moisture for the HBV model does not have the same physical meaning as for the GRKAL model. However, the same observation is used for both models, and it has again a different physical meaning. It can thus expected that the parameters used to estimate the observation bias will be different for both models. This further implies that the estimated forecast bias uncertainty (∼0.95/0.05P − k , Equation 7) tends to be relatively much smaller than the observation uncertainty (up until ∼1,000 P − k , Equation 8), and both are larger than the a priori forecast state error covariance. The larger spread in the observation bias errors is a strong indicator that the observation bias will be the most updated in the following results.

Analysis of the Filter Parameters
In order to analyze the impact of the filter parameters, a sensitivity analysis was performed. Figure 5 shows, for both hydrologic models, the results of a sensitivity analysis for the filter parameters for one example river basin. Similar results were obtained for all other sub-basins. More specifically, the NSE is calculated between the observation biases obtained using predefined filter parameters, and the observation biases obtained by varying filter parameters, as follows: N b is the number of time steps with bias estimates (2,182), and b v,i and b c,i are the bias estimates with variable and predefined filter parameters, respectively. These predefined filter parameters were assigned rather high values, more specifically 0.3 for γ f , and 500 and 150 for κ o for the HBV and GRKAL models, respectively. A strongly varying NSE with respect to the parameter values thus indicates a high sensitivity of the filter results to the parameter values. A similar analysis has also been performed for the forecast bias. An examination of Figure 5 leads to the conclusion that the biases are sensitive to the value of γ f only for relatively low values of κ o . For higher values of κ o the obtained biases are no longer sensitive to the value of γ f . Further, as can be expected, the forecast bias is more sensitive to the value of γ f than the observation bias. Figure 6 shows the statistics of the innovations for the first selected sub-basin. Following Equation (9), the mean of the bias innovations has to be equal to zero, and the standard deviation of the state and bias innovations has to be equal to one. These optimum values are obtained for high values of κ o . Again, similar results were obtained for the remaining sub-basins. The statistics N s is the number of time steps with soil moisture observations, and G i and H i are the results for the GRKAL and HBV models, respectively. The top and middle panel show the observation and forecast bias estimates, respectively. From these plots it can be seen that the observation bias estimates are an order of magnitude larger than the estimates of the forecast bias. Furthermore, there is a stronger agreement in absolute terms between the observation bias estimates for both models than between the forecast bias estimates, even though the correlations are similar. This can be expected, as the observation bias depends in the first place on the remotely sensed observations, and secondly on the models. Representativeness error, meaning that each model has a different definition of soil moisture, also contributes to the observation bias. The forecast biases for the two models are very different, as indicated by the very low NSE. This can be explained by the different soil moisture dynamics of the two models, which is demonstrated in the bottom panel of Figure 7. Even though the models show very different soil moisture estimates, the forecast bias estimates do not compensate for this large difference and the bias-corrected forecasts are very different and each approaching a different "truth." This is because each model is calibrated to produce the right discharge with each a different soil moisture climatology. The conclusion that can be drawn from Figure 7 is that the agreement between the observation bias estimates for both models is relatively high and dominated by retrieval bias, whereas the forecast bias and modeled soil moisture estimates are strongly different.

Analysis of the Bias Estimates
For the analysis of the bias estimates, three different stations were selected. Figure 8 shows the comparison of the observation and forecast bias estimates and the upper layer soil moisture for these three selected sub-basins. It should be clarified that these soil moisture estimates were obtained from the baseline run. A first station (403241A), shows very similar upper layer soil moisture estimates for the two models. A second station, 405246A, shows a very different upper layer soil moisture climatology for the two stations, but a relatively similar temporal evolution. Finally, a third station (405269A), shows the opposite behavior: a relatively similar upper layer soil moisture climatology, but a strongly different temporal evolution. For station 403241A, the soil moisture estimates from both models are very similar. Consequently, for station 403241A the observation and forecast biases are very similar as well. For station 405246A the mean bias estimates for both models are very different, but with a relatively similar temporal evolution, which is indicated by the value of 0.61 and 0.53 for the correlation between the observation and forecast bias estimates, respectively, for both models. For station 405269A again relatively similar observation and forecast bias estimates are obtained for the two models, even though the temporal evolution These examples represent the three extreme cases for all the modeled sub-catchments. Either the soil moisture estimates are very similar, in which case the biases are also similar. In case the soil moisture estimates are completely different, two different behaviors of the biases can be observed. When the average soil moisture for the two models is similar, similar biases are obtained. When the means are different, different bias values result, but still with a relatively high agreement in the temporal evolution of the bias.
These results provide support to a framework in which both observation bias and model forecast bias are estimated, but the joint identifiability of both biases remains a topic to be further explored in more studies. Its feasibility can be expected to be strongly dependent on the types and amounts of available observed data.  Figure 9 shows the distribution of the NSE values, correlation coefficients, and mean average difference between the soil moisture values, and the observation and forecast biases, estimated by both models, for all modeled sub-basins. For the soil moisture values, the correlation coefficients tend to be rather high, with relatively low NSE values, indicating that the middle panel of Figure 8 is the most representative for the modeled basins. This is supported by an analysis of the observation biases, which show relatively high correlation coefficients as well, with a relatively low NSE. The forecast biases also show relatively high correlation coefficients (albeit not as high as the observation bias), with very low NSE values. This can be explained by the middle panel of Figure 7, which shows a more variable forecast error for GRKAL than for the HBV model. As the NSE is calculated as one minus the RMSE divided by the variance, this low standard deviation will lead to very low NSE values. The conclusion from Figure 9 is that the soil moisture estimates from both models, as well as the observation and forecast biases, in general show a relatively high temporal correlation.
The observation bias estimates from both models are in better agreement than the forecast bias estimates. Figure 10 shows the same distributions and statistics, but for the validation period for the applications with a separate calibration and validation period. Essentially the same conclusions can be drawn as for Figure 9, indicating the robustness of the methodology.

Impact on the Discharge Forecasts
The impact of the data assimilation on the discharge forecasts has yielded mixed results. For the HBV model, the average NSE of the baseline and assimilations runs was unaltered, regardless of whether a separate calibration and validation period was used or not. In the case of separate calibration and validation periods, the data assimilation had a positive impact on the discharge estimation for 31 stations, and a negative impact for 24 stations. During the separate calibration period, a positive impact was obtained for 35 stations, and a negative impact for 19 stations.
However, in both cases, the data assimilation only changed the NSE by 0.05 or less on average.
A stronger impact was found for the GRKAL model. In the case of no separate calibration and validation, the NSE between the observed and simulated discharge reduced to −1.52, with a reduction of more than 0.05 for 19 stations, and more than 0.5 for 8 stations. A positive and negative impact was obtained for 5 and 50 stations, respectively. In the separate validation period, the NSE reduced to −6.60, with a positive and negative impact for 8 and 47 stations, respectively. For 17 stations a reduction of more than 0.05 was obtained, with a reduction of more than 0.5 for 12 stations, respectively. Overall, the impact of the DA is negative for the GRKAL model, with this average negative impact being caused by a small number of strongly negatively impacted stations.
This discrepancy can be explained by the calibration of the models, which did not take into account the soil moisture content, and only focussed on optimizing the discharge forecasts.
Furthermore, both models calculate the soil moisture contents in strongly different ways. In other words, they define soil moisture in different ways, not consistent with the definition of soil water content by the SMOS satellite. Despite the bias estimation, it is thus not surprising that the assimilation of external soil moisture values did not improve the discharge forecasts. The potentially negative impact of soil moisture assimilation on discharge forecasts was also noted by Mao et al. (2019).
FIGURE 10 | Same as Figure 9, but results for the validation period for the applications with a separate calibration and validation period.

DISCUSSION AND CONCLUSIONS
The objective of this paper was to evaluate a strategy to assimilate SMOS data into conceptual hydrologic models. Both the state variables and observation and forecast biases are estimated using a dual state and observation/forecast bias estimating ensemble Kalman filter. The GRKAL and HBV models were used for this purpose. When the soil moisture estimates from both models are in good agreement, the observation and forecast bias estimates are in good agreement as well. When the soil moisture estimates from both models are different, with a different climatology but a relatively high temporal correlation, the bias estimates still show a relatively high temporal correlation. On the other hand, if the soil moisture estimates have a low temporal correlation but a relatively similar climatology, the bias estimates are again relatively similar. Further research can focus on the assimilation of remote sensing data into physically-based land surface models.

DATA AVAILABILITY STATEMENT
All data used in this study are publicly available and can be downloaded from the Bureau of Meteorology (BoM) Australian Water Availability Project (AWAP), http://www.bom.gov.au/jsp/awap/index.jsp.

AUTHOR CONTRIBUTIONS
VP performed the modeling aspects of the work and wrote the paper. GD and H-JH helped in the interpretation of the results and the structuring of the manuscript.

FUNDING
For this study, VP was funded by ARC Future Fellow grant FT130100545.