Machine Learning vs. Physics-Based Modeling for Real-Time Irrigation Management

Real-time monitoring of soil matric potential has now become a common practice for precision irrigation management. Some crops, such as cranberries, are susceptible to both water and anoxic stresses. Excessive variations in soil matric potential in the root zone may reduce plant transpiration, due to either saturated or dry soil conditions, thereby reducing productivity. A timely supply of the right amount of water is, therefore, fundamental for efficient irrigation management. In this paper, we compare the capabilities of a machine learning-based model and a physics-based model to predict soil matric potential in the root zone. The machine learning model is a random forest algorithm, while the physics-based model is a two-dimensional solver of Richards equation (HYDRUS 2D). After training and calibration on a dataset collected in a cranberry field located in Québec (Canada), the performance of the two models is evaluated for 30 different time frames of 72-h soil matric potential forecasts. The results highlight that both models can accurately forecast the soil matric potential in the root zone. The machine learning-based model can achieve better performance when compared to the physics-based model, but forecasting accuracy decreases rapidly toward the end of the 72-h lead time, while the error for the Richards equation-based model does not increase with time and remain small compared to the typical measurement error.


INTRODUCTION
Meeting future food demands for a rising global population while minimizing environmental impacts remains a challenge, for which precision irrigation strategies will play a critical role (Provenzano and Sinobas, 2014). Precision irrigation is based on real-time measurements of soil moisture conditions and the capacity of models to predict the amount to be applied for irrigation scheduling (Autovino et al., 2018). When applying water-saving irrigation strategies, one must observe water stress indicators, such as soil water content, soil matric potential in the root zone, leaf water potential, etc. These indicators may help to prevent water stress conditions and excessive water use (Rallo et al., 2014). However, they do not inform growers on how much water is needed to bring the plant root zone to optimal conditions. Besides, most of the time, when the plant water stress can be observed, it is already too late to act (Rekika et al., 2014), especially in highly sensitive crops such as lettuce, tomatoes and cranberries (Lafond et al., 2015;Pelletier et al., 2017). That is why soil-waterplant-atmosphere models (SWPA) are often used to provide information on the soil and plant water status (Minacapilli et al., 2009;Cammalleri et al., 2013;Aguilera and Ruiz-Valenzuela, 2019). Several physically-based and process-based models, such as HYDRUS (Autovino et al., 2018) and CATHY (Camporese et al., 2015), are currently used to describe water processes in the SWPA continuum. After appropriate calibration and validation, these models can be used to support irrigation water management, providing not only the timing of irrigation, but also the amount of water needed, therefore helping to increase productivity and reduce water use. For example, Bigah et al. (2019) proposed a process-based model to predict the water needs for different cranberry farm operations such as irrigation, frost protection and harvest. Rekika et al. (2014) proposed an analytical model to estimate soil critical matric potential thresholds for irrigation management for various highly sensitive crops (onions, celery, and baby spinach seed germination).
In the last few years, along with physics-based models, many machine learning (ML) based models have been developed and applied to water management. Artificial Neural Networks (ANN) have been used to model the water table dynamics of various agricultural systems (Dibike and Coulibaly, 2006;Yoon et al., 2011). Li et al. (2016) have compared the capacity of Random Forest (RF) and ANN models to predict lake water levels. The results demonstrated that the RF model has superior predictive capabilities with fewer parameters and training time. Decision tree-based models have been successfully applied in groundwater hydrological modeling (Singh et al., 2014;Wang et al., 2018). Marques et al. (2005) used ML algorithms to optimize water supply for crops at different growing stages, given the water cost, the market price, cost of irrigation, crop expenses, and expected yield reduction due to under-or overirrigation throuhout the entire growing period. The resulting optimization models were incorporated into a decision support system (DSS) for precision irrigation, but did not account for field-specific characteristics. Literature also comments on models for determining irrigation needs based on field-specific data. For example, Hedley et al. (2013) used ML to predict soil water status and water table depth based on soil resistivity mapping. Smith and Peng (2009) used ML to classify soil textural composition based on infrared measurements as an input to a deficit irrigation control system. However, none of these studies have compared the performance between physicsbased and ML-models for predicting soil water status using the same dataset.
Given the increased popularity of ML approaches, the main objective of this paper was to test the efficiency of a physics-based model, HYDRUS-2D, and a ML-based model, a Random Forest (RF) model, in predicting soil water status for a highly sensitive and valuable crop. Using a dataset collected at an experimental site in Québec (Canada), HYDRUS and RF were used, after appropriate calibration and training, to predict the water status in the root zone with up to a 72-h lead time, in the context of realtime precision irrigation management. The comparison between the two approaches was made from the perspective of a grower or an irrigation manager, who typically only measures soil matric potential at one location to manage irrigation in their crop-fields.

Study Area
The study area was located on a cranberry farm, under a warm and humid summer climate, near Québec City, Québec, Canada (46 • 14 ′ N, 72 • 02 ′ W). The average rainfall in June and September is 48.0 mm and 70.6 mm, while the average temperatures are 17.1 and 13.3 • C, respectively. The location of the farm is shown in Figure 1, and the studied cranberry fieldis ∼18 ha (40-m wide by 450-m long). This site is equipped with a subsurface drainage system with four drainage pipes, equally spaced at 12 m, at an average depth of 0.9 m below the soil surface. These subsurface drains, which are designed to maintain a 0.6-m deep water table throughout the field, discharge into a control chamber south of the site. This setup is designed to ensure adequate moisture conditions, ideally keeping the plant root zone between −6.5 and −4 kPa. Three PVC pipes were inserted vertically at mid-spacing between the subsurface drains, serving as observation wells for water table measurements. The site was also equipped with a rain gauge.

Cranberry Production
Cranberry (Vaccinium macrocarpon Ait.) is a high-value perennial crop that has historically been grown on wetlands (peat soils), but is also commonly cultivated on sandy anthropogenic soils. The cranberries require about 60 mm of water per month during the growing season in Québec (Pelletier et al., 2015). The water table must be controlled during the growing season because too dry or too wet conditions can adversely affect cranberry growth and root development (Périard et al., 2017). A water table depth between 0.3 and 0.5 m for sandy soils and around 0.6 m for peat soils can meet cranberry water requirements by capillary rise (Périard et al., 2017). A soil with high a hydraulic conductivity is preferred to maintain optimal water conditions for cranberry growth by water table control or irrigation . As mentioned previously, for cranberry production in Québec, the optimal range of soil water matric potential in the root zone (i.e. 10-cm deep) is between −6.5 and −4.0 kPa . Thus, matric potential and hydraulic conductivity represent the soil properties with the largest impact on cranberry yields.

Hydrometeorological Data and Soil Matric Potential
Precipitation was measured every 15 min using a rain gauge (WatchDog 1120 Rain Gauge, Spectrum Technology, Inc., IL US) combined with an ST-4 Hortau datalogger (Lévis, Québec, www.hortau.ca). In summer, sprinkler irrigation was used for cooling crops. Thus, the rain gauge also measured the amount of irrigation. The total precipitation used in this study is the sum of irrigation and rainfall.
Water table (WT) depth was continuously measured using a submersible pressure transducer (TDH80, Transducersdirect Inc., Ellington Court Cincinnati, Ohio, USA) and a datalogger FIGURE 1 | Site localization in Quebec (Canada) and location of the tensiometers (T1, T2, and T3 installed at 10 cm bellow the soil surface), the water table pressure gage (N3) and the drains within the cranberry field.
(ST-4, Hortau Inc., Lévis, Québec, Canada). This sensor automatically corrects data for atmospheric pressure. The soil surface was the elevation datum for the pressure depth measurements. Data were collected every 15 min and sent wirelessly to Hortau's Irrolis (Irrolis 3 v.3.5.1, Hortau Inc., Lévis, Québec, Canada) website. For the physics-based model, WT values were only used to assign the boundary conditions at the drain location.

Overview of the Random Forest (RF) Method
Random Forest (RF) is a supervised ML method with multiple building blocks (decision trees) producing an ensemble of predictive models (Rodriguez-Galiano et al., 2014). Amongst supervised ML algorithms, RF has proven to be suitable for either classification or regression, depending on the nature of the targeted variable. For simulating soil matric potential in the root zone, we only used regression trees (RT).
The regression algorithm is built around a set of hierarchically structured restrictions, sequentially applied to a root node up to terminal nodes of the decision tree (Breiman, 2001). To prevent correlations between the different RTs, RF develops each tree independently using different bootstrap samples extracted from the learning dataset and a subset of randomly selected variables out of the predictive variables (Breiman, 2001). Specifically, the RF builds each tree by making them grow from two-thirds of each bootstrap sample (in-bag). About one-third of the samples are excluded from the bootstrap sample (out-of-bag) for a nonbiased estimation of the regression error. A prediction of the outof-bag (OOB) data is generated for each tree. These predictions are subsequently averaged to obtain an estimation of the error rate of the OOB. The generalization error of the RF model depends on the weight of the individual trees and the correlations between them (Qu et al., 2019). The building of a RT follows a recursive binary partition approach starting at the root node and divides the non-correlated variables into two new branches. This recursive division is performed on the dependent variable as a function of the most significant independent variable, leading to the best ensemble of homogeneous populations. Each node is then divided using the most homogeneous population among the subsets of predictive variables selected randomly at the node. FIGURE 2 | Observed hydrometeorological data collected in the study area. From the top, we report rainfall, potential evapotranspiration, soil matric potential measured at T1, T2, and T3 locations, and water table near the drain.
This division process continues until a predefined number of observations at the terminal node (node size) is reached. The result obtained using the RF is, in the case of a regression, an average of the predictions from all decision trees.
The RF regression algorithm can be summarized into three basic steps as follows (Breiman, 2001): randomly drawn from the original dataset W. Two-thirds of the samples are included in a bootstrap sample and one-third as the OOB samples. Each tree is constructed to correspond to a particular subset of the bootstrap. • At a node in each tree, a new split is randomly selected from all indices, and the input variable with the lowest mean squared error (MSE) is chosen as the splitting criterion of the regression tree.
• The data splitting process at each internal node is repeated according to the above steps until all randomized trees have been grown, and a stop condition is achieved.
The final results of the regression are calculated as follows, where N stands for the number of trees in the forest and T n represents a tree:ŷ

Input selection
The RF-model was implemented in R Development Core Team (2010) using the package caret (Kuhn, 2020). Soil matric potential (SMP) at the root zone was predicted hourly from 1 to 72 h ahead. As input, we considered rainfall (P), potential evapotranspiration (ET 0 ) and SMP at the T1 position only (Figure 1) in the root zone of previous time steps. The T1 position was chosen because most growers typically measure soil matric potential in one location only, and the most significant position in cranberry fields is indeed in the middle of two drains. The model was built with separate combinations of input variables. Lag times of input variables were generated by means of the sliding window method (Brédy et al., 2020), which allowed us to restructure the time series of P, ET 0 , and SMP as a supervised learning problem by using a size of the lag (d) equal to 168 h. This lag size was chosen to consider the influence of the SMP values at position T1 over the previous 168-h time period on the predictions. Therefore, we constructed the ML-based model maps an input window of width d into an individual output value y. The model predicts the SMP values at the T1 position (y i,t+j , with j the time step forecast) using the window: (x i,t+j , x i,t+j−1 , . . . ,x i,t , . . . , x i,t−d+1 , x i,t−d ) for P and ET 0 and (y i,t , . . . , y i,t−d+1 , y i,t−d ) for the antecedent SMP value. Multi-step forecasts from 1 to 72-time steps were used for forecasting the SMP value in the root zone.
Evapotranspiration (ET 0 ) was estimated using the formula proposed by Baier and Robertson (1965): Where R a is the extraterrestrial radiation (cal cm −2 d −1 ), ET 0day is the daily evapotranspiration (mm d −1 ) and T max and T min are daily maximum and minimum emperatures ( • C).
The temperatures were measured using the Hortau's weather station, which includes the WatchDog 2900ET (Spectrum Technology, Inc., IL USA). To obtain hourly ET (used as input), the results from Baier and Robertson's formula were multiplied by a ratio of extraterrestrials radiations for hourly (R h ) and daily (R d ) periods using the equation, ET 0 = ET 0day · R h R a . The extraterrestrial radiation for the daily and hourly periods were estimated using the equations of Allen et al. (1998), which for each day of the year and for different latitudes, can be estimated from the solar constant, G sc = 4.92 MJm −2 d −1 . Because of computational time constraints, the number of decision trees (ntree) was set to 200; as reported by Rodriguez-Galiano et al. (2014) and (Brédy et al., 2020), the gain in accuracy is negligible for ntree > 200.

Data splitting
The experimental dataset consisted of 2,000 hourly observations for each variable (P, ET 0 , and SMP at T1 position) ranging from July to September 2018. This dataset was split by the holdout method, using the "CreateDataPartition" caret package function Kuhn (2020) in two subsets: a training subset, which contained 70% of data randomly selected for each dataset and the remaining 30% that were used as test subset. For each step-ahead prediction,this data division was repeated 30 times using a uniform distribution. This allowed the models to be trained and validated with a total of 30 different training and test subsets for each training pattern (t + 1 to t + 72) and to estimate the variability of their results.

Model calibration and validation
The calibration procedure aims to optimize a set of model parameter values that enable the model to map the relationship between the inputs and outputs of a given dataset (Wu et al., 2014). For the RF method, two parameters (ntree and mtry) were assigned using results and recommendations from the literature: 200 trees and 1/3 of the predictive variables (Liaw and Wiener, 2002), respectively. The model was trained to determine the optimal parameter values that minimize the training error. The test data were used to independently assess the generalization ability of the trained model, an essential step to evaluate the model quality and accuracy with data that were not used in training.

Richards Equation Numerical Solver
Assuming that the air phase does not affect the liquid flow processes and that thermal gradients are negligible, the water movement in partially saturated porous media can be described by Richards equation (Richards, 1931):

S w s s
where S w = θ/φ is water saturation, θ and φ being the volumetric soil water content and porosity [/], respectively, S s is the specific , ∇ is the gradient operator, K s is the saturated hydraulic conductivity tensor [L/T], [L/T], K r is the relative hydraulic conductivity function [/], η z = (0, 0, 1) 1 is the vertical direction vector, z is the vertical coordinate directed upward [L], and q s represents a sink term [L 3 /L 3 T] [L 3 /L 3 T] accounting for evapotranspiration. The unsaturated hydraulic properties are taken into account by means of the van Genuchten functions S w (ψ) and K r (ψ)K r (ψ) (Van Genuchten, 1980): where S wr = θ r /φ is the residual water saturation, with θ r the residual water content, α is an empirical constant [L −1 ] related to the inverse of the air entry suction, while the dimensionless shape parameters n and m are linked by the expression m = 1 − 1/n. These parameters are often referred to as the van-Genuchten Mualem (VGM) parameters. The sink term (q s ) in Richards equation accounts for depthdependent root water uptake. The potential evapotranspiration (ET 0 ) is distributed across the root depth according to a root distribution function. Actual evapotranspiration depends on soil water content and, hence, on the soil matric potential in the root zone. If the soil is dry, vegetation can experience water stress, and transpiration would reduce to limit water losses; in nearly saturated conditions, the low availability of oxygen to roots might also cause a decrease in the transpiration rate. We modeled the effect of low and high soil moisture on root water uptake by multiplying the potential root water uptake by a reduction function, α rw (Feddes et al., 1974). The reduction function, commonly referred to as Feddes function, is zero at pressure heads higher or equal to ψ 1 , close to saturation when oxygen stress is inhibiting root water uptake. As ψ decreases, α rw is assumed to increase linearly up to 1 at the anaerobiosis point (ψ 2 ). When pressure head falls below ψ 3 , associated with incipient water stress, transpiration is assumed to decrease linearly, reaching zero at the wilting point, ψ 4 . Between ψ 2 and ψ 3 the soil is under well-watered conditions and roots can take up water at their potential rate (i.e., α rw = 1).

Model Setup
A vertical cross-section of a half cranberry field is simulated. The domain (Figure 1) is 1-m deep and 6-m wide with a subsurface drain at one end and the middle of the field at the other end.
The computational domain represents mid-drain spacing. It is discretized with a 2D finite element mesh of 4,000 nodes and 8,000 elements. No-flow boundary conditions are assumed on the sides and bottom of the domain, while seepage face or Dirichlet  boundary conditions, according on whether the water table is above or below the drain, respectively, are used to represent the drain. At the surface, measured precipitation and estimated ET 0 rates are imposed. A hydrostatic profile corresponding to a water table of 0.5 m below the surface was used as initial condition for all the simulation scenarios. In addition, a warm-up period of five days was always applied before the start of each calibration; such a period was found to be sufficient to dissipate the influence of the arbitrary initial conditions.
The parameters of the Feddes model for cranberries are known , with values for ψ 1 , ψ 2 , ψ 3 , and ψ 4 of −0.1, −0.45, −0.75, and −3.0 m, respectively. The maximum rooting depth is 0.15 m and the root distribution is linear. The VGM soil parameters ( K s , φ , θ r , α , and n ) were calibrated according to the procedure described in the next section.

Model Calibration and Validation
The model was first calibrated using the entire data set of SMP within the root zone measured in the T1 position (Figure 1). This calibration was carried out to evaluate the possibility of obtaining a well performing model. In order to verify the accuracy and the physical validity of this simulation, we show in Figure 3 the performance of the model for all the three tensiometers available, although T2 and T3 were not used for calibration. To minimize the RMSE (Equation 5) between observed and simulated soil matric potentials, we used the Marquardt-Levenberg parameter estimation technique implemented in HYDRUS (Šimunek and Hopmans, 2002).
The same optimization method was used to calibrate the soil parameters in 30 different simulation scenarios, whereby a time window of 408 h was used for calibration and the subsequent 72 h for forecasting, in order to consistently compare the forecasting capacity of the calibrated model with that of the trained ML algorithm.

Model Performance Criteria
Three statistical evaluation criteria were used to evaluate both models predictive power and efficiency: (1) The root mean square error (RMSE): (2) The Nash-Sutcliffe efficiency (NSE): And (3) the coefficient of determination (R 2 ): whereỹ average simulated value,ŷ i are the observed data, y i the simulated data, y i the mean of observed values and N the number of observations. For a model to yield a good fit between simulated and observed data the RMSE should be close to zero, while NSE and R 2 close to one. In this study, RMSE and NSE statistics are used to measure the model performance for forecasting SMP fluctuation whereas R 2 is used to analyze the linear regression goodness of fit between observed and simulated data.  Figure 3 shows the calibration results for tensiometer T1, ranging from 0 to 2018 h, between July 2018 and September 2018. The resulting RMSE, NSE and R 2 values, obtained during training, are 6.18 cm, 0.74, and 0.79, respectively. They suggest a very good agreement between observed and simulated values of SMP; giving confidence about the possibility of successfully calibrating the model and the subsequent comparison with the ML-based model. Figure 4 shows the comparison between simulated and observed SMP time series at the T1 location (root zone) for 27 of the 30 scenarios. Figure 5 presents a scatter plot of simulated vs. observed SMP values at the same location for the same scenarios. The temporal dynamics following calibration is well represented by the physics-based model, showing average performance criteria for RMSE, NSE and R 2 of 6.46 cm, 0.72 and 0.82, respectively. Three simulations, out of the 30 scenarios, did not converge due to numerical instabilities, corresponding to 10% of the runs. The reasons for such instabilities can probably be found in a complex interplay of initial conditions (a 2days warm-up used for every simulation) and soil parameters whose search space was limited and therefore did not allow for a comprehensive representation of the whole parameter space. Another possible explanation is that, due to the very short calibration windows (i.e., 360 h) which did not represent the total range of possible situations (i.e., from very dry to very wet periods), the calibration would have needed a larger parameter space to reasonably simulate the observed values. Figure 6 shows box plots of the calibrated van Genuchten-Mualem (VGM) parameters and saturated hydraulic conductivity for the 27 scenarios that converged. The VGM parameters were calibrated individually for each scenario using the Marquardt-Levenberg type parameter optimization algorithm for inverse modeling included in HYDRUS-2D. The initial VGM parameters were measured in the field and were fixed to the same value for each scenario (θ r = 0.072, θ s = 0.423, α = 0.03 cm −1 , n = 1.56, m = 1 − 1/n, and K sat = 19.62cm/h). The resulting variability of the soil hydraulic properties (VGM) is relatively large, with some runs having parameters equal to the lower or upper bounds, an additional clue of a limited search space that could have led to the three simulations not converging. Once again, this variation may be due to the initial small windows selected, making the soil hydric condition very different from window to window, and causing a wide range of soil parameters required for simulating the water dynamics. Another possible explanation could be hysteresis, not accounted for in the model, due to the fact that for some windows the system is mainly draining while for others the water table is mainly rising. Figure 7 shows the scatter-plot of the relationship between observed and simulated SMP values for the forecasting period for all the 27 successful scenarios. The model performance is still very good, with RMSE = 5.72 cm, NSE = 0.78, R 2 = 0.82 indicating an excellent agreement between simulated and observed values of SMP in the forecasting period.   Figure 9 shows the out-of-the-bag (OOB) squared error over the ntree parameter for the testing period. For ntree > 50 the error is constant, showing no gain for ntree > 200. Figure 10 shows the scatter plot between observed and simulated SMP values for the 30 testing periods, i.e., 30% of the 408 h of data for each period. The performance metrics over all simulation scenarios are RMSE = 8.49 cm, NSE = 0.55, and R 2 = 0.58, showing a drastic reduction of ML-model efficiency for testing. Figure 11 shows the RMSE as a function of forecasting time. RMSE values vary from 2 to 8 cm, with a clear increasing trend over time, confirming a strong reduction of the ML-based model prediction capabilities as the forecasting time increases.

DISCUSSION
Both physics-based and ML-based models were used here to predict SMP values in the root zone of a cranberry field. The ML-based model shows a strong dependency on the forecasting period. This dependence is somehow expected because the model is based on an autoregressive relationship of the input variables.
The RMSE values increased quickly in the first 3 h and leveled off from hours 20 to 60 in the forecasting period, before increasing again. These results are consistent with the findings by Nayak et al. (2006), who showed that the performance of ML-based models (ANN, RF, Gradient Boosting) was better for shorter lead times, but became worse as the prediction time increased. Compared with ANN models, the tree-based ones do not stay  in a local minimum which causes over-adjustment during the training period (Maier et al., 2010). Also, decision-tree models are not affected by missing and correlated data and they have been widely used in modeling natural phenomena.
For the physics-based model, the RMSE values appear to be completely random, mostly oscillating between 4 and 7 cm, with a single peak reaching 8 cm around 20 h of the forecasting period. Interestingly, after 50 h in the forecasting period, the physicsbased model shows better performance than the ML-based model. However, overall, both models display good performances in terms of RMSE and NSE values during the calibration and validation (training and testing, for ML) steps, which indicates that both models can provide satisfactory forecasting SMP values for irrigation management. Indeed, for cranberries, an error of less than 10 cm in SMP prediction is acceptable for irrigation management , considering a period of reaction of 24 or less for the irrigation to be activated into the concerned fields.
The physics-based model has the advantage of being able to consistently represent the water dynamics over the entire domain, not only where it was calibrated, as is the case for the ML-based model for which no extrapolation is possible beyond the range of observed values. On the other hand, physicsbased models need additional efforts to assign proper initial and boundary conditions, as well as parameter heterogeneity, in order to represent the soil water dynamics accurately. Also, physicsbased models would likely need to be constantly calibrated as new data are collected to maintain an effective prediction capability. For this study, where the time windows were short, the range of variability in the calibrated hydraulic parameters is quite large, except for the soil hydraulic conductivity. Data assimilation could represent a valid alternative strategy, as it would allow model users to regularly update not only the system state, but also the model parameters while keeping the advantages of physicsbased models.

CONCLUSIONS
In this paper, we presented a comparison between a physicsbased model and an ML-based model to predict the soil matric potential in the root zone for real-time irrigation management of a cranberry field. Both models were calibrated (trained) and validated (tested) with the same dataset of SMP for 72 h of forecasting scenarios. Both models presented acceptable simulation errors in terms of RMSE, R 2 and NSE values when compared with observed values. From an operational point of view, both models could be used for irrigation management and scheduling. The ML-based model should be used with care for long lead times, as it showed significant performance degradation over time, whereas the error of the physics-based model was mostly random throughout the forecasting period. The MLmodel has the advantage of being easy to implement, as it needs a smaller computational effort for training, that is practically negligible compared to the same step (calibration) of a physicsbased model. On the other hand, after proper calibration, the physics-based model could be used to simulate water dynamics everywhere within the domain for which it was calibrated, since it is derived from physical laws of mass conservation and hydrodynamics. Future developments should include the comparison between both approaches and data assimilation techniques, such as the Ensemble Kalman filter, implemented within an operational context.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
SG contributed to the model development and application as well as paper writing. MC and AB contributed to the model calibration as well as paper writing. JL, CP, and AR proofreading and paper writing. JG machine learning model development and proofreading.

FUNDING
This research was supported by a grant (RDCPJ 477937-14 -Gestion intégrée des ressources en eau dans la production de canneberges) from the Natural Sciences and Engineering Research Council (NSERC) of Canada.