Iterative Forecasting Improves Near-Term Predictions of Methane Ebullition Rates

Near-term, ecological forecasting with iterative model refitting and uncertainty partitioning has great promise for improving our understanding of ecological processes and the predictive skill of ecological models, but to date has been infrequently applied to predict biogeochemical fluxes. Bubble fluxes of methane (CH4) from aquatic sediments to the atmosphere (ebullition) dominate freshwater greenhouse gas emissions, but it remains unknown how best to make robust near-term CH4 ebullition predictions using models. Near-term forecasting workflows have the potential to address several current challenges in predicting CH4 ebullition rates, including: development of models that can be applied across time horizons and ecosystems, identification of the timescales for which predictions can provide useful information, and quantification of uncertainty in predictions. To assess the capacity of near-term, iterative forecasting workflows to improve ebullition rate predictions, we developed and tested a near-term, iterative forecasting workflow of CH4 ebullition rates in a small eutrophic reservoir throughout one open-water period. The workflow included the repeated updating of a CH4 ebullition forecast model over time with newly-collected data via iterative model refitting. We compared the CH4 forecasts from our workflow to both alternative forecasts generated without iterative model refitting and a persistence null model. Our forecasts with iterative model refitting estimated CH4 ebullition rates up to 2 weeks into the future [RMSE at 1-week ahead = 0.53 and 0.48 loge(mg CH4 m−2 d−1) at 2-week ahead horizons]. Forecasts with iterative model refitting outperformed forecasts without refitting and the persistence null model at both 1- and 2-week forecast horizons. Driver uncertainty and model process uncertainty contributed the most to total forecast uncertainty, suggesting that future workflow improvements should focus on improved mechanistic understanding of CH4 models and drivers. Altogether, our study suggests that iterative forecasting improves week-to-week CH4 ebullition predictions, provides insight into predictability of ebullition rates into the future, and identifies which sources of uncertainty are the most important contributors to the total uncertainty in CH4 ebullition predictions.


INTRODUCTION
Near-term (day to year) ecological forecasting can improve our understanding and quantification of ecosystem processes (Dietze et al., 2018). The repeated updating of models used to generate forecasts (often called data assimilation or data-model fusion; Luo et al., 2011) is crucial for forecasting because it can iteratively improve forecast skill by updating model initial conditions or recalibrating model parameters (Luo et al., 2011;Dietze, 2017a). The iterative process of making a forecast with a model, collecting observations to compare with the forecast, and then updating the forecast models before making a new forecast embodies the scientific method and can substantially improve our representation of ecosystem processes in models (Dietze, 2017b). If the observations match the forecast, the hypothesis (as represented by the forecast model) is supported, otherwise, it provides an opportunity for revising the hypothesis by, e.g., updating model parameters, modifying the model's driver variables, or changing the model's equations. Thus, near-term, iterative forecasting creates a model-data feedback loop that evaluates how effectively a model predicts future ecosystem states, with the forecasts evolving as the ecosystem experiences different environmental conditions. While near-term, iterative forecasting has been successfully applied to predict other ecological variables, to date the approach has not been broadly used in aquatic biogeochemistry. One biogeochemical process that near-term, iterative forecasts may improve predictions for is methane (CH 4 ) ebullition, or bubble fluxes of CH 4 from organic-rich sediments to the waterbody surface. Freshwater ecosystems emit large quantities of CH 4 to the atmosphere (currently estimated between 117 and 212 Tg CH 4 yr −1 ; Saunois et al., 2020). Within human-made reservoirs alone, ebullition rates can vary substantially in emissions, ranging from 1 to 1,000 mg CH 4 m −2 d −1 (Deemer et al., 2016) and up to 8.9 Tg CH 4 yr −1 (Johnson et al., 2021). Among the different types of freshwater CH 4 emissions, ebullition can make up anywhere from 0 to 99.6% of total emissions to the atmosphere (Deemer and Holgerson, 2021) and it is considered one of the most uncertain fluxes in both inland water and global CH 4 budgets (Wik et al., 2016;Saunois et al., 2020).
Ebullition's high spatial and temporal variation makes it difficult to quantify experimentally and create appropriate models, resulting in three challenges that stand in the way of developing robust predictions of CH 4 ebullition fluxes from inland waters. These include 1) model transferability across time and space (i.e., the ability to apply ebullition models across years and ecosystems); 2) identification of the forecast horizons for which future estimates provide useful information (Petchey et al., 2015); and 3) quantification of uncertainty in model predictions. First, while there are multiple different CH 4 ebullition models for inland waters, e.g., process-based models that couple physical, biogeochemical, and bubble plume modules (Schmid et al., 2017), empirical models that use physical, chemical and biological drivers as predictors of CH 4 ebullition fluxes (DelSontro et al., 2016;Aben et al., 2017;Grasset et al., 2021), and auto-regressive (AR) time series models (McClure et al., 2020a), it remains unknown how well they can predict CH 4 ebullition dynamics over time within the same ecosystem. Moreover, given the clear interannual variability in ebullition rates across inland water systems (Burke et al., 2019;Männistö et al., 2019;Linkhorst et al., 2020), quantifying the transferability of CH 4 ebullition model predictions from 1 year to another is critical for CH 4 ebullition modeling. Knowing, for example, if the same model was robust for predicting CH 4 ebullition across multiple years vs. if new models were needed to predict CH 4 ebullition each year would greatly assist CH 4 scaling efforts.
Second, the ability of current CH 4 ebullition models to predict future out-of-sample observations across varying forecast horizons remains untested. An important step for improving the robustness of CH 4 ebullition predictions is to use models to generate out-of-sample predictions for CH 4 ebullition rates at different future horizons (e.g., days, months, or years into the future) and then compare these predictions with observations when they are available. It is possible that CH 4 ebullition models may successfully simulate observed data at a given forecast horizon, but not effectively provide insight as to how far into the future the predictions provide useful information (Petchey et al., 2015). Here, we propose that exploring the transferability and effective forecast horizon of CH 4 ebullition rates using nearterm forecasting can serve as a robust tool for improving our understanding of future CH 4 ebullition rate predictions.
Third, near-term iterative forecast workflows need to account for uncertainty in predictions, which provides a critical metric of confidence in forecasts. Moreover, partitioning the different sources of uncertainty can provide insight into model performance and measurements, which has the potential to improve aspects of the model and predictions (Dietze, 2017a;Carey et al., 2021a). In particular, forecast workflows that apply state-space Bayesian methods with iterative model refitting enables the partitioning of sources of uncertainty (e.g., parameters, initial conditions, driver data, and model processes; Dietze, 2017a;Luo et al., 2011), which is crucial for identifying strategies for improving the design for future measurements and models. For example, Thomas et al. (2018) used a state-space hierarchical Bayesian model to forecast biomass change in loblolly pine (Pinus taeda) forests in the southeastern United States and then evaluated the relative contribution of different forms of uncertainty to the total forecast uncertainty to guide improvements on future biomass predictions. However, accounting for and partitioning uncertainty still remains uncommon in ecological modeling (Harrison et al., 2017;Lewis et al., 2021), especially in forecasts of future biogeochemical processes like CH 4 ebullition (Lewis et al., 2021). Thus, developing a forecast workflow that implements uncertainty partitioning at every time step will provide key insight to future model development for CH 4 ebullition rates.
The goal of this work was to assess the potential of forecasting CH 4 ebullition as a case study to evaluate its potential more broadly. Given the complexity of ebullition as a biogeochemical process with many different mechanisms that control its fluxes, our goal was to see if we could successfully forecast CH 4 , which would support its application to other aquatic biogeochemical processes. We developed an iterative forecasting workflow of weekly CH 4 ebullition rates in a freshwater reservoir up to 2 weeks into the future using forecasted water temperatures, ebullition rate observations, and a state-space model fit in a Bayesian framework with iterative model refitting. We refit our model with new observations to update our ebullition model's states, parameters, initial conditions, and driver data on each model time step and tested our ebullition rate forecast workflow against a similar forecast workflow that did not use iterative model refitting and a persistence null model. We also quantified the contributions of different sources of uncertainty to CH 4 ebullition forecasts at each time step. Our objectives were: 1) develop an iterative, near-term forecasting workflow of CH 4 ebullition rates and apply it to assess a CH 4 ebullition model's interannual transferability; 2) determine the effective forecast horizon of CH 4 ebullition; and 3) quantify and partition the sources of uncertainty in our forecasts to inform future workflow development. Altogether, our goal was to examine the potential of near-term iterative forecasting for improving the representation of CH 4 ebullition in models and providing insight into this important biogeochemical flux.

Site Description
We developed near-term, real-time, iterative CH 4 ebullition rate forecasts in Falling Creek Reservoir (FCR, Figure 1) in summer 2019 and tested the efficacy of different forecasting workflows against observations of CH 4 ebullition rates at different forecast horizons. FCR is a small (0.119 km 2 ), shallow (Z max 9.3 m), eutrophic, drinking water reservoir located in southwestern Virginia, United States (37.30°N, 79.84°W). FCR is owned and operated by the Western Virginia Water Authority as a drinking water supply and is in a completely forested watershed (Gerling et al., 2016). FCR's water level was managed to stay at a constant level and did not experience substantial fluctuations during this study (Carey et al., 2021a).
The forecasting workflow used a model developed from a previous summer sampling season (2017) of CH 4 ebullition monitoring data collected at FCR to build and calibrate our forecast models (McClure et al., 2020b). In 2017, four ebullition traps were deployed and monitored weekly from 8 May to 24 October along a shallow upstream transect (McClure et al., 2020a; Figure 1). In 2019, the year of this forecasting study, we redeployed all four traps as close as possible to their original locations in 2017 and forecasted measurements collected weekly throughout the summer. Ebullition rates were natural log-transformed (log e ) + 0.1 to meet the assumptions of normality and avoid forecasts of negative ebullition rates in log-space.

Forecast Model and Model-Fitting
Our forecast model was an auto-regressive (AR) time series model with sediment-water interface (SWI) temperatures as a driver: where E t log e (mg CH 4 m −2 d −1 ) is the log e -transformed mean CH 4 ebullition rate from the four sites at the upstream transect, E t−1 log e (mg CH 4 m −2 d −1 ) is the log e -transformed CH 4 ebullition rate at the previous measurement (the AR term), and T t (°C) is the water temperature averaged from measurements at the SWI below the upstream transect between each time step of the model (weekly). There were three parameters in this model: the intercept term (β 0 ), the parameter governing the effect of the AR term (β 1 ), and the parameter governing the effect of SWI temperature (β 2 ). This model was chosen based on prior modeling work at the site, which demonstrated that ≥60% of the total reservoir-wide CH 4 ebullition was emitted from the shallow upstream transect in FCR during the ice-free period, and that there was a strong positive relationship between ebullition and sediment-water interface (SWI) temperatures (McClure et al., 2020a).

FIGURE 1 | The bathymetry of Falling
Creek Reservoir superimposed by a depiction of how our forecast workflow mapped spatially from the dam to the upstream transect where CH 4 ebullition rates were forecasted. Our forecast workflow used water temperature forecasts from the dam site in the reservoir (following Thomas et al., 2020). The water temperature forecasts were scaled to an upstream transect (denoted by red dots on map) to generate near-term iterative CH 4 ebullition rate forecasts. Simultaneously, direct observations from ebullition traps were used to iteratively update the forecast model via data assimilation.
Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 We estimated the posterior distributions of the parameters in the model using a state-space Bayesian framework. This framework includes latent states that represent the "true" ebullition rate before observation uncertainty is added. The latent state has a distribution that represents uncertainty in our capacity to model the true ebullition rate. Following Eq. 1, the state-spaced framework uses the previous time step's latent state in the AR term: The latent state is normally distributed with a mean equal to the prediction from Eq. 2 and standard deviation of Δtσ proc : σ proc is an unknown parameter that represents process uncertainty that accumulates over a day. Process uncertainty is uncertainty that arises due to the inability of a particular model structure to represent the real world and reproduce observed conditions ( Table 1, Dietze, 2017a). σ proc is multiplied by the number of days between time steps (Δt) to allow for the standard deviation to scale with the time step length. This was necessary because data were collected approximately weekly but the exact number of days between measurements varied. The observations (Y E t ) are modeled as a normal distribution with a mean equal to the latent state and standard deviation (σ obs t ) that was set to be the standard error of the mean ebullition rate across the four upstream sites: σ obs t was time-dependent because standard error of the observations varied by measurement period. Although we include Y E t as a source of error, we acknowledge that the actual Y E t uncertainty is likely higher due to our inability to include other inherent sources of error (e.g., sample measurement, sample transport, instrument uncertainty, etc.). However, these other inherent sources of uncertainty are likely substantially smaller than the sampling uncertainty currently represented for Y E t (Eq. 4).
We set the priors for the β 0 , β 1 , and β 2 parameters as uninformative with normal distributions with a mean of zero and a large standard deviation (1,000). The prior on σ proc was also uninformative with a uniform distribution between 0 and 10,000. State-space models require a prior distribution for the states at the first time step (initial conditions), which we assumed to be normally-distributed with a mean equal to the observed mean Y E 0 ebullition rate and standard deviation equal to the standard error of the mean ebullition rate across the four upstream sites at the first observation period: Our model-fitting process involved four steps. First, we estimated posterior distributions of the parameters using data from summer 2017 that included weekly measurements of CH 4 ebullition rate and SWI temperature using Markov chain Monte Carlo (MCMC) analyses. MCMC analyses are particularly well-suited for forecasting because they numerically estimate probability distributions for model parameter, the latent states, and process uncertainty. These distributions can subsequently be used to quantitatively assess the uncertainty associated with each aspect of the model (parameters, initial conditions, and process error; see Table 1). The MCMC analyses were carried out using the "rjags" and "R2jags" packages (Plummer, 2018) within the R statistical environment (R Core Team, 2021). We ran three MCMC chains for 70,000 iterations, including 20,000 initial iterations that were discarded as a burn-in. We assessed convergence using the potential scale reduction factor (PSRF) of the Gelman-Rubin statistic, and parameters were determined to have converged if the PSRF was between 1 and 1.05 (Gelman and Rubin, 1992). Second, we generated prior distributions for a refitting of the state-spaced model using 2019 data that were derived from posteriors from the 2017 model-fitting. The posterior distributions from the β 0 , β 1 , and β 2 parameters from the 2017 model-fitting followed a multivariate normal distribution that was used as the prior in the 2019 calibration. The posterior TABLE 1 | Definitions of uncertainty sources that can contribute to total forecast uncertainty and what parameter and variables from the Eqs 2, 3, 4, 6, 7 correspond to the total uncertainty (derived from Dietze, 2017a).

Source of uncertainty Definition
Forecast model parameters associated with uncertainty source Model process Uncertainty that arises due to the inability of a particular model structure (equations) to represent the real world and reproduce observed conditions. This uncertainty is not associated with a particular mechanism and is represented by adding random noise to forecasts between time-steps Δt, σ proc Eq. 3 Model Parameter Uncertainty in the model parameter values; represented by randomly sampling from the parameter distributions at the beginning of a forecast Uncertainty in the observed conditions when a forecast is created; represented using a distribution of model states at the first model time step Uncertainty in the forecasted estimates of the model covariates (i.e., sediment-water interface temperatures); represented using an ensemble for model drivers Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 of σ proc was approximately log-normally distributed in the 2017 model-fitting and we used moment-matching (i.e., a method of matching parameters of the distribution to produce a particular mean and variance) to estimate the parameters for the log-normal prior distribution in the 2019 recalibration. Third, during the summer of 2019, we refitted the model weekly as new data were collected. The refitting involved starting with the priors from the 2017 model-fitting, appending the new data to the previously collected 2019 data, and re-running the MCMC estimation of posteriors using the 2019 data. The first measurement of 2019 was used as the initial conditions for Eq. 2. As a result of this analysis design, we expected the posterior parameter distributions to reflect the prior distributions (i.e., the 2017 posteriors) early in the summer, when data from 2019 were limited, but the posterior parameter distribution would increasingly be influenced by data in 2019 as more data were collected.
Finally, the posterior estimates for the parameters (β 0 , β 1 , β 2 , and σ proc ) from each recalibration in 2019 were used to generate 1-week ahead and 2-week ahead forecasts (Overview of Forecasting Workflow).

Overview of Forecasting Workflow
Here, we outline the implementation of the forecasting workflow based on the model development and fitting described in Forecast Model and Model-Fitting, and then describe how the forecasted driver data were generated in Forecasting SWI Temperature. Approximately each week between 17 June and November 7, 2019, we generated forecasts of CH 4 ebullition rates that extended approximately 2 weeks into the future ( Figure 2). The forecast horizon varied between 13 and 16 days ahead depending on the exact days that the observations were ultimately collected. The forecasts presented here are technically hindcasts because they were generated after the observations were collected; however, a preliminary version of the forecasts were generated using a similar approach in real-time during the summer of 2019 (and thus those are considered true forecasts).
The forecasts were generated using an MCMC approach to numerically represent forecast uncertainty. Because MCMC permits estimation of model parameters, latent states, and process uncertainty parameter as distributions rather than fixed values, an MCMC approach enables estimation of the uncertainty associated with each of these model components by making multiple draws from estimated distributions to FIGURE 2 | Conceptual illustration of the forecast workflow using iterative model refitting that generates near-term, iterative CH 4 ebullition forecasts. The model refitting stage occurred at the time when new ebullition rates were manually collected from the reservoir. In the shaded part of the figure, the blue boxes represent processes, and the yellow and purple boxes represent the data products that resulted from each process. The yellow boxes are manually-collected data products that were used to update the forecast model and the purple boxes represent forecasted data products.
Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 build an ensemble of possible forecast outcomes. Our approach used an ensemble with 210 members. First, to represent initial condition uncertainty, the forecast was initialized using 210 random draws from a normal distribution with a mean equal to the observation at the start of the forecast (Y t m , where m is the week number in the 2019 time series at the time the forecast was initialized) and a standard deviation equal to σ obs t m . Second, to represent parameter uncertainty, we randomly drew a set of 210 parameter values from the posterior distributions for each ensemble member from the most recent recalibration. As a result, parameter values varied among ensemble members but were constant over time for a given ensemble member. Third, to represent driver uncertainty, each Bayesian state-space ensemble member was assigned an ensemble member from the sediment-water interface (SWI) temperature forecast (Forecasting SWI Temperature). Fourth, using the initial conditions and parameter values described above in steps one and two, respectively, one time step (∼1 week) was simulated for each ensemble member. Normal random noise from a distribution with a mean equal to 0 and a standard deviation equal to Δtσ proc was then added to each ensemble member to account for process uncertainty and generate the 1-week forecast. Finally, the 1-week forecast was used as initial conditions for the second week forecast and step 4 was repeated to generate the 2-week forecast ( Figure 2). Overall, this numerical forecasting approach quantifies the contribution of initial condition uncertainty, parameter uncertainty, process uncertainty, and driver uncertainty ( Table 1).

Forecasting SWI Temperature
Because we generated forecasts of future CH 4 ebullition rates, forecasts of the model covariates (i.e., SWI temperatures) were needed as part of our workflow (Figure 1, 2). To produce forecasts of SWI temperature, we used the existing Forecasting Lake And Reservoir Ecosystems (FLARE; Thomas et al., 2020) water temperature forecasting framework and infrastructure deployed in FCR to generate predictions of future water temperatures as driver data for our CH 4 ebullition rate forecasts at all four traps on the transect. FLARE uses nearreal time temperature observations, weather forecasts, and a hydrodynamic model (General Lake Model) to forecast water temperatures up to 16 days into the future at the "dam site" at FCR, which was located ∼700 m downstream from the ebullition traps ( Figure 1). To generate SWI forecasts at the upstream ebullition transect, which had a depth of 1-3 m, we developed a temperature scaling model that converted the FLARE forecasts between 1 and 3 m at the dam site to forecasts at the transect. We estimated the posterior distributions of the parameters in the temperature scaling model using a simple (non-state space) Bayesian framework where: and T t is the predicted upstream transect SWI temperature at time t ϕ 0 and ϕ 1 are the slope and intercept parameters, d t is the observed mean water temperature between 1 and 3 m at the dam site, and Y swi t is the observed mean upstream transect SWI. We estimated the posterior distribution for the parameters using a Bayesian framework applied to data from 2017. The temperature scaling model ran three MCMC chains with a burn-in period of 1,000 iterations and a sample size of 10,000 iterations. Prior distributions for the parameters (ϕ t ) were uninformative by using a normal distribution with a mean of zero and a large standard deviation (1,000). The prior distribution for σ swi was also uninformative by using uniform with a large range (0-1,000). Similar to recalibration of the parameters in the ebullition model, the SWI temperature model was iteratively recalibrated in 2019 using the priors derived from the posteriors of the 2017 model-fitting.
To generate the forecast of SWI temperature, first we forecasted water temperature at the dam site using FLARE (Thomas et al., 2020) up to 16-days into the future. This produced a 210-member ensemble with uncertainty primarily derived from uncertainty in weather predictions from the NOAA Global Ensemble Forecasting System and process uncertainty associated with the hydrodynamic lake model underlying FLARE (Thomas et al., 2020). Second, we collected the mean temperature forecasts at 1-and 2-week horizons of the 1 and 3 m depth forecasts for each FLARE member ensemble and applied these values as d t in Eq. 6. Third, we sampled 210 members from the joint posterior distribution of ϕ 0 , ϕ 1 , and σ swi . Fourth, using sampled ϕ 0 and ϕ 1 and FLARE water temperature forecasts, we predicted the SWI temperature at the upstream site using Eq. 6. Finally, for each of the 210 ensemble members we added the normal distributed error from Eq. 7. The 210 ensembles of forecasted water temperatures from the scaling model for the upstream transect were then used as driver data for our CH 4 ebullition rate forecast model (T t ) in Eq. 2.

Field Data Collection
Approximately every week during the forecast period, we measured CH 4 ebullition rates and SWI temperature manually to update the daily forecasts ( Figure 2). We collected 10-min SWI temperature data with HOBO temperature loggers (HOBO Pendant Temperature/Light Data Logger, Bourne, MA, United States) deployed below each of the four ebullition traps during the 2017 training period and 2019 forecast period. Each logger was sunk using a stainless-steel weight and a nylon string to sit ∼0.1 cm above the SWI. We affixed the bottom logger ∼1 m horizontally away from each ebullition trap to prevent potential disturbance of sediments under the traps. We simultaneously downloaded the temperature logger data as ebullition bubbles which had accumulated in the traps since the last sampling date were being collected. The temperature loggers were downloaded using HOBOware version 3.7.13.
The dam site water temperature data ( Figure 1) were collected using a CTD (Conductivity, Temperature, and Depth) profiler in 2017 and water temperature thermistors in 2019. In 2017, the dam site water temperature depth profiles were collected with the CTD on the same days ebullition rates were collected from the Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 traps in the upstream transect. The CTD measured temperature every 0.25 s during the profile, resulting in temperature depth profiles at approximately 10-cm resolution through the water column. Further information and all thermistor and CTD data are available in the Environmental Data Initiative (EDI) repository following Carey et al. (2021a) and Carey et al. (2021b). We sampled the CH 4 ebullition rates at the four upstream traps following McClure et al. (2020b). The ebullition samples from each trap on the transect were extracted across a septum stopper using a needle attached to a 10-ml syringe. We injected 10 ml of ebullition gas into a 20-ml crimped top glass vial that was pre-filled with saturated salt brine solution. A secondary exit syringe extracted the salt brine solution as the sample was injected to generate 10 ml of gas headspace in the vial. Two replicates were collected from each trap on a sampling day if enough gas sample was available, resulting in up to eight possible ebullition rate samples among the four traps. The vials were stored upside down until analysis, so the remaining 10 ml of salt brine solution acted as a barrier to prevent any gas from escaping. We extracted any remaining gas from each ebullition trap using a 30-ml syringe and summed the total volume of ebullition gas collected each week.
We analyzed the manually-collected ebullition gas from the traps for its CH 4 concentration using a Shimadzu Nexus-2030 Gas Chromatography-Flame Ionization Detector (GC-FID; Shimadzu Corporation; Kyoto, Japan) within 24 h of collection (following McClure et al., 2018;McClure et al., 2020b). We determined the observed CH 4 ebullition rate (Y E t ) as follows: where V gas was the volume of ebullition gas collected in the trap (L), [CH 4 ] is the CH 4 concentration of the gas (mg CH 4 L −1 ), Δt is the duration of time the trap was deployed (in days), and A F is the cross-sectional area of the funnel (0.26 m 2 ). Following McClure et al. (2020b), we calculated the daily ebullition rates separately for each trap and then averaged the rates from the four traps within the transect to determine a mean daily transect ebullition rate.

Forecast Analysis
To evaluate the interannual transferability of the ebullition model (Objective 1), we compared the skill of the forecast workflow without iterative model refitting with the forecast workflow with iterative model refitting. If the forecast workflow with iterative model refitting performed substantially better than the workflow without iterative refitting, then model transferability from year to year is low. The alternate forecasting workflow without iterative model refitting used the same AR model (Eq. 2) and the 2019 CH 4 ebullition rates as initial conditions for each 2-week forecast but sampled parameters from the posteriors that were only based on the 2017 model-fitting. As a result, the parameters did not iteratively update as new data were collected in 2019.
Next, to evaluate the effect of the forecast horizon of CH 4 ebullition rates in our system (Objective 2), we compared the performance of the forecast workflow with iterative model refitting with the persistence null model. If the null model performed better than the forecast workflow with iterative model refitting, then our forecast model of CH 4 ebullition rates was no better than a simple model that assumed next week's rates would be the same as the previous week's rates. The persistence null model was developed using a similar structure as the forecast models and assumed that CH 4 ebullition rates would remain similar into the future with propagated uncertainty. The persistence null model replaced Eq. 2 with: The persistence null model used the same MCMC methods for estimating posterior distributions as the forecasting model. Similar to the forecasting model, the null model was fit to 2017 data and the posteriors were used as priors in the weekly recalibration of the model using 2019 data.
For both of the analyses described above, we quantified the Nash-Sutcliffe efficiency (NSE) and root-mean-square-error (RMSE) of the CH 4 ebullition rate forecasts from the forecast workflow with refitting, the forecast workflow without refitting, and the persistence null forecast model. The NSE is a normalized metric that evaluates a model's performance relative to the mean of the observed time series (Nash and Sutcliffe, 1970). We calculated the NSE coefficient (Nash and Sutcliffe, 1970) as follows: where F t was the mean of the CH 4 ebullition rate forecast ensembles, Y E t was the observed mean CH 4 ebullition rate, and Y E was the average of all the observed CH 4 ebullition rates. NSE values range between -∞ to 1, where one indicates a perfect score (i.e., the model perfectly recreates observations; Nash and Sutcliffe, 1970;Moriasi et al., 2007), 0 indicates the model performs the same as the mean of the observations, and <0 indicates the model performs worse than the mean of the observations.

Forecast Uncertainty Partitioning
We quantified the total forecast uncertainty in CH 4 ebullition rate forecasts among all forecast cycles as standard deviation (SD) throughout the forecasting period using the variance of the 210member ensemble at the 1-and 2-week forecast horizons (Objective 3). Additionally, we partitioned the relative contributions of four different sources of uncertainty, model parameters (β 0 , β 1 , and β 2 ), model process error, initial conditions, and driver data (d t ) (Dietze, 2017a) among all forecasts (Eq. 5).
We used a One-At-a-Time Sensitivity (OATS) analysis to determine the relative contribution of the uncertainty sources to total uncertainty (as total forecast variance) in each forecast cycle. An OATS analysis holds all sources of uncertainty at their mean except for one, and then numerically evaluates the sensitivity of the forecast to that specific source of uncertainty (Dietze, 2017b). For example, to partition parameter uncertainty, we held all other sources of uncertainty (model process, driver data, and initial condition) at their mean and only allowed the parameter values to vary among ensemble members. The relative proportion of each source of uncertainty was determined by dividing the total variance of each isolated source of uncertainty by the sum of the variance of all uncertainty sources at both the 1-and 2-week forecast horizons across the forecasting period.

Observed CH 4 Ebullition Rates
Throughout the forecasting period, we observed high temporal variation in mean CH 4 ebullition rates at FCR's upstream transect (Figure 3). Mean ebullition rates at the upstream transect increased from −0.48 to 1.76 log e (mg CH 4 m −2 d −1 ) from 27 May though 24 June (Figure 3). Between 24 June and 15 July, mean CH 4 ebullition rates at the transect continued to increase until the maximum CH 4 ebullition rate for the forecast period was observed on 15 July [3.88 log e (mg CH 4 m −2 d −1 )]. Between 15 July and 16 October, the observed CH 4 ebullition rates ranged between 1.56 and 3.74 log e (mg CH 4 m −2 d −1 ). After 16 October, CH 4 ebullition rates dropped to ≤1.32 log e (mg CH 4 m −2 d −1 ) for the remainder of the forecasting period, which ended on 7 November.

Forecast Evaluation and Model Transferability
All three forecast workflows -the forecast model with iterative refitting, the forecast model without refitting, and the null persistence model -generated 95% predictive intervals that encompassed out-of-sample CH 4 ebullition rate observations at both 1-and 2-week forecast horizons ( Figure 3). However, we found that there were differences in performance among the workflows based on our evaluation metrics ( Table 2). At both 1and 2-week forecast horizons, the forecast workflow with iterative model refitting exhibited both higher NSE and lower RMSE values than the forecast workflow without refitting (Table 2). Moreover, the forecast workflow with refitting performed better than the null model at both 1-and 2-week forecast horizons ( Table 2). FIGURE 4 | Parameter estimates of β 1 , the autoregressive parameter (A), β 2 , the sediment-water interface temperature parameter (B), and β 0 , the intercept parameter (C) of the forecast workflow recalibrated with new data. The black line represents the mean of the parameter ensembles, and the grey area represents the standard deviation of the ensembles (±1 S.D. mean).
Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 Our result that the iterative model with refitting performed better than the iterative model without refitting indicates that model parameters estimated from 2017 ebullition rates did a relatively poor job of predicting 2019 ebullition rates, and as such, the model's interannual transferability among years is low. Our finding of low year-to-year transferability informs methods for scaling ebullition models predicting fluxes in one waterbody at one time point to multiple waterbodies over multiple years. Specifically, this result suggests that a single prediction model of ebullition likely will not apply to other years, motivating future work to test this hypothesis in other waterbodies.
Although the forecast workflow without refitting exhibited low transferability across years, the forecast workflow with refitting successfully predicted future CH 4 ebullition rates well because of the evolution of the model parameters throughout the 2019 forecasting period. We observed a substantial evolution of the intercept, AR, and SWI temperature parameters (β 0 , β 1 , and β 2 , respectively) during the 2019 forecast period (Figure 4), highlighting how iteratively refitting the model to update the parameters can lead to improved forecast performance. The SWI temperature (β 2 ) parameter was at its highest value in the early stages of the forecasting period at 0.41 ± 0.15 log e (mg CH 4 m −2 d −1 ) × C −1 (mean ±1 S.D.) on 17 June, and then steadily decreased throughout the forecasting period to 0.07 ± 0.06 log e (mg CH 4 m −2 d −1 ) × C −1 by 7 November. This decrease over time coincided with an increase in SWI temperatures, and indicates that the relative importance of the temperature scaling FIGURE 5 | The total forecast uncertainty (as measured by standard deviation of the forecast ensembles) for each week aggregated across forecast cycles for 1 and 2 weeks into the future Panel (A). In panel (B), each different colored line represents the number of weeks into the forecast horizon. The small grey lines are included as a guide to represent each forecast and how the uncertainty of each forecast mapped across the forecasting period.
Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 parameter was highest when temperatures at the SWI were low. Conversely, the unitless AR parameter (β 1 ) was 0.48 ± 0.17 and 0.63 ± 0.09 between 17 June and 2 September, and then steadily increased to a maximum of 0.78 ± 0.09 on 7 November. Finally, the intercept (β 0 ) parameter changed throughout the forecasting period and was −6.67 ± 2.59 and −2.11 ± 1.26 log e (mg CH 4 m −2 d −1 ) between 17 June and 2 September, and then steadily increased to a maximum of −1.11 ± 1.27 log e (mg CH 4 m −2 d −1 ) on 7 November. These changes indicate that the relative importance of these three model parameters evolved throughout the forecasting period.

Effective Forecast Horizons
The comparison of the workflows and the persistence null model indicated that our model's effective forecast horizon of CH 4 ebullition with iterative model refitting extended up to 2 weeks into the future (Table 2). However, at both 1-and 2-week forecast horizons the null persistence model outperformed the workflow without refitting ( Table 2). To understand why the forecast workflow with iterative model refitting consistently performed better than the workflow without refitting and the persistence null model, we explored the sources of forecast uncertainty in the workflow with iterative model refitting.

Forecast Uncertainty and Uncertainty Partitioning
Over the forecasting period, the forecast uncertainty (as measured by standard deviation, SD) consistently increased from 1 to 2 weeks into the future, on average by 32% ( Figure 5A). Simultaneously, total uncertainty in both forecast horizons decreased throughout the forecasting period from June to November ( Figure 5B). The first forecast (starting 17 June) exhibited the largest uncertainty in both 1-and 2-week forecast horizons, with a SD of 1.67 log e (mg CH 4 m −2 d −1 ) at 1-week and 2.46 log e (mg CH 4 m −2 d −1 ) at the 2-week horizon ( Figure 5B). By 15 July, the total forecast SD across the 1-and 2week horizons decreased to 1.42 and 1.73 log e (mg CH 4 m −2 d −1 ) respectively, a 15 and 30% decrease in forecast SD. The total forecast uncertainty continued to decrease though the forecasting season, and by 7 November, the final forecast of the season, the forecast uncertainty was 0.84 log e (mg CH 4 m −2 d −1 ) at the 1- week horizon and 1.50 log e (mg CH 4 m −2 d −1 ) at the 2-week horizon. When partitioned, driver data and model process uncertainty were consistently the largest sources of uncertainty in our forecasts at both 1-and 2-week forecast horizons among all forecasts (Figure 6). At the 1-week forecast horizon, driver data uncertainty was the largest source of uncertainty between 17 June and 11 September and then model process uncertainty was the largest source from 11 September to the last forecast on 7 November ( Figure 6A). At the 2-week forecast horizon, driver data uncertainty was also the largest source of uncertainty between 17 June and 2 September and then model process and parameter uncertainty became the largest sources of uncertainty from 2 September to the last forecast on 7 November ( Figure 6B). When aggregated across the 2019 period for both horizons, driver data uncertainty contributed a mean of 60% of total forecast uncertainty ( Figure 6C), indicating that uncertainty in the forecasted water temperatures from FLARE and the uncertainty in the SWI temperature scaling model contributed most to uncertainty in our forecasts. Model process uncertainty was the next highest contributed source of uncertainty at 25% over the entire forecasting period. Both model parameter and initial conditions uncertainty contributed smaller proportions of total variance, with seasonal means of 11 and 4%, respectively ( Figure 6C).

DISCUSSION
Our CH 4 ebullition forecasting workflow demonstrated that iterative model refitting improved summer CH 4 ebullition rate predictions at both 1-and 2-week forecast horizons (Table 2; Figure 3). The utility of iterative model refitting was demonstrated by the evolution of the model parameter distributions over time (Figure 4) and the improvement of the forecast workflow with refitting over the null model during the forecasting period ( Table 2). In addition, the forecasts developed by iterative model refitting exhibited lower overall forecast uncertainty as the forecasting period progressed, as indicated by the reduction in the total SD ( Figure 5B). The primary drivers of uncertainty in the forecasts were driver data and model process. Altogether, our CH 4 ebullition rate forecasts provide insight to: the low transferability of CH 4 ebullition rate prediction models among years (Objective 1), the effective forecast horizon of CH 4 ebullition can extend up to 2 weeks in the future (Objective 2), and that improvements to the driver data and the CH 4 ebullition models themselves would be most useful for improving CH 4 ebullition rate predictions (Objective 3), as discussed below.

Model Interannual Transferability
Our comparison of the forecasting workflows with and without iterative model refitting provided useful insight on the transferability of our CH 4 ebullition prediction models among years. The forecast workflow that relied on the fitted 2017 posterior distributions (without iterative refitting) performed worse than the forecasts with iterative model refitting at both 1-and 2-week horizons ( Table 2). This difference is likely due to lower overall ebullition rates in 2019 than 2017, which caused the calibrated 2017 posterior distributions to consistently overestimate CH 4 ebullition rates in 2019 (Figure 3). These results support previous work observing high interannual variability in CH 4 ebullition rates (Burke et al., 2019;Männistö et al., 2019;Linkhorst et al., 2020), and suggest that adopting a modeling workflow that is iteratively refitted with new incoming data throughout the season will improve CH 4 ebullition predictions. Given our empirical forecast model (Eq. 2), its low transferability is not surprising and points to the importance of additional covariates that may be important for driving variability in CH 4 ebullition rates (e.g., Deemer and Holgerson, 2021). The addition of meaningful covariates would likely increase temporal transferability by increasing the mechanistic ability of the model to recreate observed CH 4 ebullition dynamics, as well as decrease total forecast uncertainty, as discussed below. The evolution of model parameters throughout the forecasting period ( Figure 4) highlights the ability of iterative refitting to respond to the changing environmental conditions that occurred between June and November. For example, an increase in the observed ebullition rates (Figure 3) in early July was concomitant with a decrease in the SWI temperature parameter as the reservoir warmed. Between 15 July and 11 October, both the SWI temperature and AR term parameters remained largely stationary as the ebullition rates exhibited a seasonal increase and then decrease (Figure 3). After 11 October, the SWI temperature parameter decreased while the AR term parameter increased to its maximum value. The dynamic parameter values highlight how SWI temperature is likely most important for predicting CH 4 ebullition in early summer, prior to the onset of stable thermal stratification. Conversely, SWI temperatures were much less important during the autumn, when there are large changes occurring week-to-week as water at the SWI undergoes dynamic fall mixing conditions.

Effective Forecast Horizons
The better performance of our forecast workflow with iterative model refitting over the persistence null model at both the 1-and 2-week forecast horizons ( Table 2) provides insight on how to evaluate the predictive horizon of CH 4 ebullition rates. Although CH 4 ebullition is often considered a stochastic process (Bastviken et al., 2004;though see;West et al., 2016;Bezerra et al., 2020), our study suggests that CH 4 ebullition may be successfully predicted up to 2 weeks ahead at the weekly time step. However, our forecasts do not provide evidence on whether CH 4 ebullition rates can be accurately predicted beyond 2 weeks. Additionally, the 32% increase in the forecast ensemble uncertainty ( Figure 5) between the 1 and 2-week forecast horizons in our iterative model refitting workflow (Figure 5A), indicates that the 2-week forecasts provide less meaningful predictions than the 1-week forecasts. Thus, even if the mean of the forecast ensemble closely matches the out-of-sample predictions at the 2-week horizon, our 2-week forecasts should be interpreted carefully given the substantial increase in the ensemble SD between the 1-and 2week horizons. Altogether, we advocate for the inclusion of persistence null models and the quantification of forecast Frontiers in Environmental Science | www.frontiersin.org December 2021 | Volume 9 | Article 756603 ensemble uncertainty as means for identifying the effective forecast horizon in future forecasting workflows for CH 4 ebullition to inform the performance and meaningful horizon of the models being tested.

Uncertainty Partitioning
Partitioning different sources of uncertainty (initial conditions, parameter, process, and driver uncertainty; Dietze, 2017a) in our CH 4 ebullition rate forecasts allowed us to examine areas where our forecasting workflow with iterative model refitting could be improved ( Figure 6). The large relative contribution of driver uncertainty in the 1-and 2-week forecasts ( Figures 6A,B) during the 2019 forecasting period was likely due to uncertainty in the forecasted water temperatures from FLARE and the propagated uncertainty that arose in the SWI temperature scaling model from the downstream dam site to upstream transect site (Eqs 6, 7; Forecasting SWI Temperature). Thomas et al. (2020) found that meteorological driver data contributed the largest proportion of uncertainty in FLARE's water temperature forecasts in FCR's surface waters at horizons greater than 2 days into the future. Similarly, Dietze (2017a) found that meteorological driver data uncertainty contributed substantial uncertainty to net ecosystem exchange forecasts at forecast horizons greater than 1 week. Consequently, driver data uncertainty remains an important source of uncertainty that should be quantified in future forecasts of CH 4 ebullition. We also observed a larger contribution of model process uncertainty at the 1-and 2-week forecast horizons toward the end of the 2019 forecasting period, indicating that improvements to the model itself would improve CH 4 ebullition rate predictions ( Figure 6A). Model process uncertainty refers to the unexplained variability in the model itself, and encompasses multiple sources of variability in the model, including model structure, parameter heterogeneity, and stochastic variation (Dietze, 2017a). In the case of our forecast workflow for CH 4 ebullition rates, the large contributions from model process uncertainty could have resulted from the model structure, interannual variability that occurred between 2017 and 2019, or other factors that were not accounted for in our state-space model. New forecasts of CH 4 ebullition rates could be improved by specifically partitioning these potential sources of model process uncertainty using a hierarchical framework (Dietze, 2017a;Harris et al., 2018).

Forecasting System Limitations and Areas for Improvement
There are several limitations of our near-term, iterative CH 4 ebullition rate forecasts that should be considered. First, there are many other potential drivers of CH 4 ebullition, such as chlorophyll a, nutrients, and water pressure, among others (e.g., DelSontro et al., 2016;West et al., 2016;Harrison et al., 2017;Davidson et al., 2018). Alternative model structures (e.g., DelSontro et al., 2016;Schmid et al., 2017;Peltola et al., 2018;Grasset et al., 2021) could also improve our CH 4 ebullition forecasts and improve their transferability to other freshwater ecosystems and different temporal scales. Because our forecasts relied on an AR model developed from earlier work in FCR (McClure et al., 2020a), we were unable to quantify model selection uncertainty by using an ensemble of models in our forecasting workflow (e.g., process-based ebullition models; Peltola et al., 2018;Schmid et al., 2017), which is an important step for future work. Testing different forecasting workflows with different CH 4 ebullition model covariates and model types in inland water ecosystems at varying temporal and spatial scales will further improve our understanding of this important biogeochemical flux.
In addition to testing other CH 4 ebullition rate models, additional improvements to our forecasting system would also include the addition of automated sensors to improve temporal coverage, incorporation of other statistical methods into the forecasting workflow (e.g., sequential data assimilation methods like particle filters and Kalman filters; Evensen, 2009;Carpenter et al., 1999;Del Moral et al., 2006;Doucet and Johansen, 2009), and other forecast model covariates which might provide more insight to mechanisms of CH 4 ebullition. While our forecasting case study showed that near-term, iterative forecasts with iterative model refitting are possible with manually-collected CH 4 ebullition rates, recent technological improvements in automated sensors hold great potential for advancing the future of CH 4 ebullition forecasting at daily or sub-daily scales by increasing the timescale at which new data can be assimilated and at which forecasts can be made (e.g., Varadharajan et al., 2010;Delwiche and Hemond, 2017;Maher at al., 2019). Together, integrating automated CH 4 ebullition rate sensors, an ensemble of predictive models, and alternate data assimilation methods have the potential to substantially advance forecasting of CH 4 ebullition fluxes at sub-weekly timescales.

Scaling our CH 4 Ebullition Forecasting Workflow to Other Sites
Our near-real time forecasts showed that out-of-sample CH 4 ebullition rates at our littoral transect in FCR were adequately predicted at 1-and 2-week horizons using a state-space Bayesian workflow with iterative model refitting and forecasted SWI temperatures as a model covariate (Figure 3). While we acknowledge that we have only applied this forecasting workflow to one reservoir site, we envision that it has the potential to be adapted for predicting CH 4 ebullition in other freshwater ecosystems. Here, we identify three critical components we think would enable others to begin making near-real time forecasts of CH 4 ebullition rates from inland waters and inform ecological understanding of the processes governing ebullition rates: 1) A routine ebullition field monitoring program. The foundation of near-term, iterative forecasting is a coupled model-data feedback loop (Luo et al., 2011;Dietze, 2017b). Developing a field monitoring program that routinely samples ebullition from a waterbody of interest is necessary to develop and iteratively update ebullition forecast models. For example, this could include one automated ebullition trap with high temporal resolution or multiple passive traps that are limited in temporal resolution but are less expensive and allow for more spatial coverage. It is ideal if the monitoring data collection occurs within the maximum time horizon of the forecasted driver data to enable iterative forecast evaluation and model refitting. 2) A model to generate future predictions of ebullition that can be updated as observations become available. While we used a relatively simple state-space model fit in a Bayesian framework to account for the computational requirements to run forecasts with iterative refitting, there are many possible model types and data assimilation methods than can be applied like process-based models (Schmid et al., 2017;Peltola et al., 2018), and machine learning approaches such as neural networks (Abbasi et al., 2020) to develop CH 4 ebullition forecasts. We suggest starting with something simple like a persistence null model (Eq. 9) and then adding in additional complexity when resources are available to do so.
3) Forecasted driver data. Depending on model structure, near-real time ebullition forecasts will likely rely on forecasted driver data to predict ebullition at a future time step. In lieu of having access to a water temperature forecasting system as we did for this study, meteorological forecasts (e.g., NOAA's Global Ensemble Forecast System) can be integrated into an ebullition forecasting workflow if the CH 4 ebullition model is driven by meteorological variables such as air temperature, barometric pressure, shortwave radiation, or wind speed (Joyce and Jewell 2003;Tokida et al., 2007;Wik et al., 2014;Peltola et al., 2018;McClure R. et al., 2020). For example, Tokida et al. (2007) showed how decreasing atmospheric pressure triggered substantial increases in CH 4 ebullition from wetlands. Similarly, Wik et al. (2014) associated seasonal CH 4 ebullitive fluxes from thermokarst lakes with shortwave radiation and Joyce and Jewell (2003) suggested that wind speed could influence bottom currents, thereby changing the shear stress at the sediments releasing ebullition bubbles.
There are several additional drivers of ebullition that have the capacity to be applied in a forecasting workflow in addition to those included in this study. However, in order to include a new driver variable in a forecasting workflow, one must produce a forecast of the driver variable, which can add additional uncertainty [e.g., our forecasting application here that translated weather forecasts into future SWI temperatures using FLARE and an empirical scaling model (Eq. 6)]. For example, changes in reservoir water level (Beaulieu et al., 2018), chemical and biological drivers such as chlorophyll a (West et al., 2016), sedimentation quantity (Wik et al., 2018), sediment quality (Wik et al., 2018), macrophyte abundance, and diel Chaoborus spp. migration (Bezerra et al., 2020), are known drivers of ebullition but rarely forecasted. This provides an opportunity for future model studies to begin generating forecasts of these drivers and testing how they perform with near-term forecasts of ebullition. Among the different predictors, chlorophyll a may be a useful one to start with as a forecast driver for CH 4 ebullition as there are multiple existing workflows that have generated chlorophyll a forecasts for management (Page et al., 2018;Rousso et al., 2020). Integrating chlorophyll a forecasts into CH 4 ebullition forecast workflows, similar to our example of integrating water temperature forecasts to our CH 4 ebullition workflow, would likely improve forecasts of CH 4 ebullition, and enable testing of how well these drivers perform at varying spatial and temporal scales.

Utility of Adopting Near-Term Iterative Forecasting Workflows
In conclusion, we successfully generated near-real time forecasts of CH 4 ebullition rates, a highly variable biogeochemical flux in inland water ecosystems, using forecasted SWI temperatures and near-real time observations of ebullition rates. Further, our forecasts that iteratively updated parameters through model refitting performed better than forecasts that did not update parameters and a persistence null model at both 1-and 2week forecast horizons, indicating that forecasting workflows with iterative model refitting have the potential to improve our estimates of this highly variable biogeochemical flux up to 2 weeks into the future. Additionally, quantifying and partitioning uncertainty provided an additional line of evidence that iterative forecasting can improve the skill of predictions with time. Uncertainty partitioning showed that our forecast model is highly sensitive to process uncertainty, highlighting the need for more resolved partitioning of variation within our predictive AR model and testing of different CH 4 ebullition predictive models to improve predictions and transferability across multiple spatial and temporal scales. Altogether, we show that near-term iterative forecasting with model refitting and uncertainty partitioning has the potential to improve future CH 4 ebullition rate predictions, an important step to informing our upscaling estimates of CH 4 ebullition fluxes from inland waters to the global carbon cycle.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary materials, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
RM and CC developed the original research idea. RM and RT developed the forecasting workflow. WW and ML contributed to code development, field data collection, and forecast model testing. RM wrote the manuscript with equal writing contributions from CC and RT. All authors provided feedback and approved the final version.