Using maximum entropy production to describe microbial biogeochemistry over time and space in a meromictic pond

The maximum entropy production (MEP) conjecture posits that systems with many degrees of freedom will likely organize to maximize the rate of free energy dissipation. Previous work indicates that biological systems can outcompete abiotic systems by maximizing free energy dissipation over time by utilizing temporal strategies acquired and refined by evolution, and over space via cooperation. In this study, we develop an MEP model to describe biogeochemistry observed in Siders Pond, a phosphate limited meromictic system located in Falmouth, MA that exhibits steep chemical gradients due to density-driven stratification that supports anaerobic photosynthesis as well as microbial communities that catalyze redox cycles involving O, N, S, Fe and Mn. The MEP model uses a metabolic network to represent microbial redox reactions, where biomass allocation and reaction rates are determined by solving an optimization problem that maximizes entropy production over time and a 1D vertical profile constrained by an advection-dispersion-reaction model. We introduce a new approach for modeling phototrophy and explicitly represent aerobic photoautotrophs, anoxygenic photoheterotrophs and anaerobic photoautotrophs. The metabolic network also includes reactions for heterotrophic bacteria, sulfate reducing bacteria, sulfide oxidizing bacteria and aerobic and anaerobic grazers. Model results were compared to observations of biogeochemical constituents collected over a 24 hour period at 8 depths at a single 15 m deep station in Siders Pond. Maximizing entropy production over long (3 d) intervals produced results more similar to field observations than short (0.25 d) interval optimizations, which support the importance of temporal strategies for maximizing entropy production over time. Furthermore, we found that entropy production must be maximized locally instead of globally where energy potentials are degraded quickly by abiotic processes, such as light absorption by water. This combination of field observations with modeling results show that microbial systems in nature can be accurately described by the maximum entropy production conjecture applied over time and space.

The maximum entropy production (MEP) conjecture posits that systems with many degrees of 10 freedom will likely organize to maximize the rate of free energy dissipation. Previous work indicates 11 that biological systems can outcompete abiotic systems by maximizing free energy dissipation over 12 time by utilizing temporal strategies acquired and refined by evolution, and over space via 13 cooperation. In this study, we develop an MEP model to describe biogeochemistry observed in 14 Siders Pond, a phosphate limited meromictic system located in Falmouth, MA that exhibits steep 15 chemical gradients due to density-driven stratification that supports anaerobic photosynthesis as well 16 as microbial communities that catalyze redox cycles involving O, N, S, Fe and Mn. The MEP model 17 uses a metabolic network to represent microbial redox reactions, where biomass allocation and 18 reaction rates are determined by solving an optimization problem that maximizes entropy production 19 over time and a 1D vertical profile constrained by an advection-dispersion-reaction model. We 20 introduce a new approach for modeling phototrophy and explicitly represent aerobic photoautotrophs, 21 anoxygenic photoheterotrophs and anaerobic photoautotrophs. The metabolic network also includes 22 reactions for heterotrophic bacteria, sulfate reducing bacteria, sulfide oxidizing bacteria and aerobic 23 and anaerobic grazers. Model results were compared to observations of biogeochemical constituents 24 collected over a 24 hour period at 8 depths at a single 15 m deep station in Siders Pond. Maximizing 25 entropy production over long (3 d) intervals produced results more similar to field observations than 26 short (0.25 d) interval optimizations, which support the importance of temporal strategies for 27 maximizing entropy production over time. Furthermore, we found that entropy production must be 28 maximized locally instead of globally where energy potentials are degraded quickly by abiotic 29 processes, such as light absorption by water. This combination of field observations with modeling 30 results show that microbial systems in nature can be accurately described by the maximum entropy 31 production conjecture applied over time and space. 32 Introduction 34 Mass and energy flow associated with the growth and predation of bacteria, archaea and eucaryotes 35 in microbial food webs coupled with abiotic reactions and transport processes define biogeochemical 36 cycles that occur in ecosystems ranging in size from less than a liter (Marino et al., 2016) to the entire 37 planet. Because organisms are ultimately responsible for most observed biogeochemical 38 transformations, it is customary and natural to focus on the bioenergetics and growth and predation of 39 the organisms that constitute food webs in order to understand and predict biogeochemical 40 transformations. This organismal focus has a long history and has advanced our understanding and 41 prediction of ecosystem dynamics and the mass and energy flow through them (Riley, 1946 the growth, predator-prey and cooperative interactions of organisms will continue to contribute to our 44 understanding of ecosystem bioenergetics, there are some challenges that limit this approach for 45 microbial systems that form the basis of biogeochemical cycles. Microbial communities are diverse, 46 complex and abundant, consisting, for example, of upwards of 10 9 microorganisms per liter of 47 seawater, with estimates that Earth hosts close to 1 trillion microbial species (Sogin et al., 2006;48 Locey and Lennon, 2016). With the advent of metagenomics, next generation sequencing and 49 bioinformatics, the challenging task of deciphering and annotating the metabolic capabilities and 50 activities of bacteria and other microorganisms has begun, but untangling the Gordian knot will take 51 significantly more time and investment , and determining how 52 information in genomes contributes to competition and cooperation is still in its infancy (Hallam and 53 McCutcheon, 2015; Pasternak et al., 2015;Worden et al., 2015). Even more challenging is 54 understanding and predicting community composition dynamics and succession as environmental 55 conditions change, both from exogenous and endogenous drivers (Konopka et al., 2015). To develop 56 predictive models of biogeochemistry based on organisms and the genes they carry, exchange 57 horizontally and differentially express requires an enormous amount of information, which will likely 58 take many decades to compile and decipher, but good progress is being made (Reed et al., 2014;59 Coles et al., 2017). While we believe this reductionist approach is essential, there is also a 60 complementary approach to understanding microbial biogeochemistry that is less studied and uses a 61 more thermodynamic, or whole systems, approach. 62 Understanding how ecosystems function at the systems level has a long tradition in theoretical 63 ecology (Chapman et al., 2016;Vallino and Algar, 2016), but the underlying premise is that 64 ecosystems organize so as to maximize an objective function, such as maximizing power proposed 65 by Lotka (1922) nearly 100 years ago. The advantage of the systems approach is that the variational 66 principle can be used to determine how an ecosystem will organize and function without the 67 knowledge of which organisms are present and how their population changes over time, provided 68 there is sufficient diversity. That is, understanding and modeling of ecosystems can focus on function 69 rather than on organisms, and there is growing support that stable function can arise from dynamic 70 communities (Fernandez et al., 1999;Fernandez-Gonzalez et al., 2016;Louca and Doebeli, 2016;71 Needham and Fuhrman, 2016; Coles et al., 2017). Here, we build upon the conjecture that microbial 72 systems organize to maximize entropy production. The maximum entropy production (MEP) 73 principle proposes that systems (biotic systems included) with sufficient degrees of freedom will 74 likely organize to maximize the dissipation of Gibbs free energy (Dewar, 2003;Lorenz, 2003; 75 Martyushev and Seleznev, 2006). The MEP principle has been applied to both abiotic and biotic 76 processes ( investigate metabolic switching in nitrate reducing environments . 79 I n r e v i e w Energy dissipation drives microbial function 3 An unresolved aspect of MEP involves the temporal and spatial scales over which it operates. MEP 80 theory has only been derived for nonequilibrium steady-state systems, but microbial systems are 81 dynamic, so steady state does not apply. Consequently, we have proposed that abiotic processes, such 82 as fire or a rock rolling down a hill, maximize instantaneous entropy production; that is, they follow a 83 steepest descent trajectory down a potential energy surface. Biological systems, however, use 84 information acquired by evolution and culled by natural selection to follow alternate pathways that 85 maximize entropy production over time, which allows biology to sometimes outcompete abiotic 86 processes (Vallino, 2010). Similarly, when considering a spatial domain, entropy production can 87 either be maximized locally at each point in the domain or entropy production at each point can be 88 adjusted so that entropy production is maximized globally over the entire domain. A simple 89 numerical study indicated that when a system organizes over space, entropy produced by global 90 optimization can exceed that from local optimization, but this requires spatial coordination by the 91 community, while abiotic systems are likely to only maximize entropy production locally (Vallino, 92 2011). This paper seeks to examine both of these ideas using spatially and temporally distinct 93 observations collected from a field site. 94 kinetics and thermodynamics are determined so as to maximize internal entropy production over a 119 specified interval of time and space. One of the underlying objectives of the approach is to replace 120 what are typically poorly constrained biological parameters, such as maximum specific growth rates, 121 substrate affinities, prey preferences, reaction efficiencies, etc, with optimal control variables whose 122 values are determined by maximizing free energy dissipation. Consequently, these MEP-based 123 I n r e v i e w This is a provisional file, not the final typeset article models contain relatively few adjustable parameters, but see Section 2.1.3.1 below. We describe 124 briefly below the three model components, but we focus first on the representation of the metabolic 125 network, as this forms the foundation of the approach, and includes the concentrations of 11 chemical 126 constituents and 8 functional groups (Fig. 1) The catalysts and reactions included in the metabolic network represent the capabilities of the entire 139 microbial community, but the reactions are distributed across S catalysts (8 for the case of Siders  140 Pond, Fig. 1), just as functions are distributed across phyla in natural communities (Vallino, 2003). 141 The reactions are highly simplified and condensed, and consist of two essential components: an 142 anabolic reaction that synthesizes catalyst from environmental resources, and a catabolic reaction that 143 provides free energy to drive the anabolic reaction forward. A highly simplified list of metabolic 144 reactions for the Siders Pond model is given in where vector is the concentration of substrates and products, , is the number of electrons 181 transferred in the redox reaction, * and * are universal constants (we have used the same values of 182 350 d -1 for * and 5000 mmol m -3 for * for all models to date) and Ω , is the second optimal 183 control variable class. Since a catalyst, S , can catalyze , sub-reactions, Ω , determines how 184 much of catalyst S is allocated to reaction , at a given time and location in space. For instance, the 185 sulfate reducing bacteria are also given two reactions in addition to Eq. (3) ( Table 1). Reactions 186 2, and 3, allow SRB to decompose recalcitrant organic carbon, C , and phosphorous, P , into 187 labile organic carbon, C , and inorganic phosphate, respectively, but to do so they must deallocate 188 from biosynthesis reaction, 1, , Eq. (3). That is, each Ω , is bound between 0 and 1, and Ω , must 189 sum to unity over for each S . An example of how , ∆ , , and vary over time and space to 190 influence reaction rate is given in SM Section 4.1, Fig. S3 for growth of sulfate reducing bacteria, 191 1,

I n r e v i e w
This is a provisional file, not the final typeset article In the formulation above, the free energy of reaction, ∆ , , for the pure anabolic reaction, Eq. (1), 194 might be positive (but it can be negative, see Vallino, 2010). To improve computational consistency 195 between anabolic reactions with differing free energies, the reaction stoichiometries are augmented to 196 insure that ∆ , goes to zero as goes to 1 for the anabolic component of the reaction. For an 197 anabolic reaction that has a strongly positive ∆ , , the reaction is coupled to the catabolic reaction 198 based on the ratio of the anabolic to catabolic reaction free energies, defined as , in the 199 stoichiometric equations. Consequently the true stoichiometry for 1, , as well as other reactions in 200 the network, is slightly more cluttered (e.g., Eq. S103), but still readily calculated. 201

Phototrophy 202
In this version of the MEP model we introduce catalysts associated with phototrophic growth, 203 specifically phytoplankton (Phy), S ℎ , anaerobic green sulfur bacteria (GSB), S , and 204 photoheterotrophs (PH), S . Both Phy and GSB are modeled similarly using two sub-reactions 205 (Table 1). One sub-reaction couples photon capture that drives CO 2 fixation into carbohydrates, C ℎ 206 and C , with elemental composition CH 2 O, and a second sub-reaction converts carbohydrates into 207 biomass in a manner analogous to growth of heterotrophic bacteria and GSB previously developed. 208 We focus on phytoplankton here (but see SM Section 2.7.2) because development for GSB is similar, 209 and can be found in SM Section 2.7.3. 210 The carbon fixation reaction for Phy is given by, 211 where ℎ are captured photons of frequency , ℎ is Plank's constant and 1, ℎ is the mmoles of 212 photons needed to fix 1 mmol of CO 2 at 100% efficiency (i.e., ℎ = 1) (note, units for reaction 213 rates in the model are in mmol m -3 d -1 , so we use mmol instead of mol in this description). The Gibbs 214 free energy for a mmol of photons is given by 215 where is Avogadro's number (in mmol) and is the thermodynamic efficiency for converting 216 electromagnetic radiation into work (see Eq. (S26) and Candau, 2003). If we define ∆ ℎ as the 217 Gibbs free energy for fixing 1 mmol of H 2 CO 3 into C ℎ plus O 2 (see Eq. (S29)), then 218 and the Gibbs free energy of reaction for Eq. (6) is given by, 219 In this formulation, ℎ governs the efficiency for the conversation of electromagnetic potential into 220 chemical potential. If 100% of solar radiation is converted to chemical potential, no entropy is 221 produced, and the free energy of reaction for Eq. (6) is zero, so the reaction will not proceed due to 222 thermodynamic constraints. However, as ℎ is reduced below 1, some photons in the 223 photosynthetic active radiation (PAR) spectrum are simply dissipated as heat, and all are dissipated 224 I n r e v i e w as heat when ℎ = 0. In Eq. (6), the classic quantum yield, , (Kishino et al., 1986) is given by 225 , which obviously varies as a function of the optimal control variable, ℎ . 226 The reaction rate for Eq. (6) depends on the rate of photon capture, which is given by, 227 where is the coefficient for light absorption by particles, Ω 1, ℎ [S ℎ ] is the fraction of 228 phytoplankton catalytic machinery allocated to capturing photons (i.e., chlorophyll, other pigments, 229 electron transfer chains, etc), and ( , ) is the light intensity (mmol photons m -2 d -1 ) at time and 230 depth (see SM Section 2.5). Consequently, the reaction rate for Eq. (6) is given by, 231 This reaction has similarities to those typically used to describe phytoplankton growth (Macedo and  232 Duarte, 2006), but our derivation focuses not on the local light intensity level, but rather on how 233 much light is actually intercept by the phytoplankton, as governed by Ω 1, ℎ [S ℎ ], and how 234 much of that free energy is actually used to drive carbon fixation, as governed by ℎ . As evident in 235 Eq. (11), the maximum rate is directly tied to the rate of photon interception, ∆ ℎ , not by an 236 arbitrary maximum specific growth parameter. The second reaction used by Phy and GSB (Table 1)  237 is simply the conversion of reduced organic carbon, C ℎ and C , into more catalyst, but of course 238 C ℎ and C can also just be combusted to produce entropy depending of the value of ℎ . The 239 carbohydrate pools serve a similar function as cell quota in Droop's (1973) model. 240 The reaction for photoheterotrophs (PH) differs slightly from that above (SM Section 2.7.8). In this 241 case only one reaction is used ( 1, , Table 1), where the photon capture is linked to the conversion 242 of labile carbon into PH catalyst. As above, photons captured can also be dissipated as heat for 243 < 1, or the free energy can be used to drive biosynthesis ( > 0), and photon energy replaces 244 catabolic reaction used in chemotroph reactions. In the currently formulation, reaction rate is also tied 245 to photon capture rate, so growth can only proceed during the day (Eq. (S131)). A second 246 chemotrophic sub-reaction (like 1, , Eq. (S93)) could have been added to S , but this would 247 increase the number of optimal control variables in the problem, and would be redundant with S , 248 so this option was not explored. 249

Entropy production 250
It is outside the scope of this manuscript to delve into the thickets of entropy (see Vallino and Algar,251 2016), but a few definitions are in order, since entropy accounting is the foundation of the MEP 252 approach. Simply stated, entropy production occurs when an energy potential is destroyed and 253 dissipated as heat to the environment, but not when the potential is converted from one potential to 254 another, if done so reversibly, which technically is possible, but never occurs in practice. For 255 example, a flame converting chemical potential into heat or light being absorbed by water both result 256 in maximum entropy production; these are irreversible processes. On the other hand, entropy is not 257 produced by converting electromagnetic potential into chemical potential, which phytoplankton do 258 partially, but they also convert a fraction of that potential to heat, as dictated by ℎ in the model. As 259 ℎ approaches 1, electromagnetic potential is converted to chemical potential without loss, but 260 I n r e v i e w This is a provisional file, not the final typeset article (Eq. S24) drivers the reaction rate to 0. For organisms to grow at a non-zero rate an energy potential 261 must be partially dissipated and some entropy must be produced. 262 For the Siders Pond model, entropy production associated electromagnetic radiation can be produced 263 along three pathways. Photons intercepted by water, ( , ), and particles lacking photochemical 264 properties, such as bacteria, [S ] ( , ) and grazers, [S ] ( , ), as well as by the non-265 photosynthetic components of phytoplankton, Ω 2, ℎ [S ℎ ] ( , ) and Ω 2, [S ] ( , ), are 266 simply dissipated as heat (long wave radiation), and the entropy produced from them is readily 267 calculated. Photons intercepted by the photosynthetic machinery of phototrophs can be either 268 dissipated as heat, = 0, or the energy potential can be transferred to chemical potential, = 1, but 269 in general it is a combination of those processes, 0 < < 1. This latter route is considered entropy 270 production by reaction in the Results Section, even though it involves photons. Consequently, total 271 entropy production, Σ (Eq. (S178) first expression) is the sum of three processes: entropy production 272 by photon absorption by water, ̇ (Eq. (S176)), entropy production by light absorption by particles, 273 ̇ (Eq. (S177)) and entropy production by reaction, ̇ (Eq. (S178) second expression). In the photic 274 zone of Sider Pond, it is the production of biological particles through growth that greatly increases 275 local entropy production. 276

Transport model 277
Siders Pond is horizontally well mixed, so an advection-dispersion-reaction (ADR) model that 278 includes particle sinking was used to approximate vertical transport of the 19 state variables (Fig. 1). 279 The origin of the vertical coordinate, , is defined at the pond's surface, and the axis points in the 280 positive direction downward towards the benthos and reaches a maximum depth of 15 m. Siders 281 Pond 3D bathymetry surface (Fig. S1) was rendered from a contour plot in Caraco (1986), and an 282 equation for cross-section area as a function of depth, ( ), was derived therefrom (Eq. S3). was based on a model of solar zenith angle (Brock, 1981), which assumes a cloudless sky. To predict 295 PAR light intensity as function of time and depth, a standard light adsorption model was used that 296 includes coefficients for light absorption by water, , and particulate material, (Eq. S7). 297 To complete the ADR transport model, Neumann boundary conditions were used for state variables 298 at the pond's surface, but particles were not allowed to sink into the boundary from above (Eq. S9), 299 and gas transport for O 2 , CO 2 and H 2 S across the air-water interface was accounted for using a 300 stagnant film model (Eq. S10). Robin boundary conditions were used at the bottom boundary based 301 on the flux of material entering the boundary associated with the intrusion of seawater diluted with 302 groundwater from below. However, aerobic and anaerobic decomposition of sinking organic matter 303 I n r e v i e w from the water column contributed to a sink for O 2 and H 2 SO 4 , and a source for H 3 PO 4 , H 2 CO 3 and 304 H 2 S to the overlying water (Eq. S23). 305

Parameter adjustments 306
One of the objectives in using a variational approach is to remove model parameters, but a few 307 required adjustments to get simulations to within an order of magnitude of Siders Pond observations. 308 Our objective was not to minimize error between predictions and observation, so the few parameters 309 described in this section were adjusted in a rather ad hoc manner. No formal data assimilation was 310 attempted, and the parameters were adjusted using fixed nominal values for the optimal control 311 parameters, and Ω , (i.e., simulations without optimization). All biological structures, S , detritus, 312 C and P , and associated phytoplankton carbohydrates, C ℎ and C , were allowed to sink 313 through the water column (Eq. (S1)). However, sinking velocities can vary greatly across and within 314 organism size classes (Bach et al., 2012) , so velocities where set through trial and error to obtain 315 profiles consistent with those observed in Siders Pond (Table S1). In addition, coefficients for light 316 absorption by water, , and particles, , were adjusted so that light profiles were similar to 317 observations (SM Section 2.5). Computational approach used for solving the ADR equation is 318 described in SM Section 2.9. 319

Optimization and MEP 320
The stoichiometry, thermodynamics and kinetics of the 28 reactions that comprise the metabolic 321 network (Table 1, Fig. 1, SM Section 2.7) vary as a function of the optimal control variables, and 322 the partitioning of biological structure, S , to sub-reactions , ( Table 1)  The critical aspect of the optimization is choosing the appropriate time interval over which to 327 maximize entropy production and whether local or global optimization should be used (Vallino,328 2011); consequently, these two aspects are the focus of this manuscript and form the bases of the 329 Results Section. Computational approach used for solving the optimal control problem is described in 330 SM Section 3.2. 331

Siders Pond site description 332
Siders Pond is a small coastal meromictic kettle hole (volume: 10 6 m 3 ; area: 13.4 ha; maximum 333 depth: 15 m) that receives approximately 1x10 6 m 3 of fresh and 0.15 x10 6 m 3 of saltwater each year 334 (Caraco, 1986). The latter input occurs via episodic inputs during high tides and storm events via a 335 small creek that connects the pond to Vineyard Sound approximately 550 m to the south (Fig. S1). 336 Tritium-helium water dating confirmed vertical mixing across two observed chemoclines, but 337 permanent stratification is maintained because the saltwater inputs enter the pond at depth, mix 338 upward and become entrained with freshwater before exiting the pond (Caraco, 1986). Caraco 339 (1986) also characterized N and P loading to the pond (50 g N m 2 y -1 and 1.3 g P m -2 y -1 , 340 respectively), and an N+P enrichment study (Caraco et al., 1987) shows phytoplankton to be P 341 limited, especially in the low salinity surface waters. Previous studies show Siders Pond is eutrophic 342 averaging 16 mg m -3 chlorophyll a (Chl a) in surface waters (but can exceed 100 mg m -3 at times) and 343 an annual primary productivity of 315 g C m -2 (Caraco, 1986;Caraco and Puccoon, 1986). In anoxic 344 bottom waters bacterial Chl c, d and e associated with photosynthetic green sulfur bacteria averages 345 20 mg m -3 (purple sulfur bacteria were not found in high concentration), but BChl cde was also 346 observed to reach high concentrations at times (> 75 mg m -3 ). Even though green sulfur bacteria 347 I n r e v i e w This is a provisional file, not the final typeset article could attain high concentrations, their productivity was only 6% of the oxygenic photoautotroph 348 (cyanobacteria + algae) production (Caraco, 1986). Siders Pond was chosen for this study because 349 extensive redox cycling occurs over a 15 m deep water column, which greatly facilitates sampling 350 due to the large water volumes that can be readily collected without perturbing the system. 351

Sampling and measurements 352
Samples were collected from Siders Pond, Falmouth, MA over a 24 h period starting at 6:45 on Jun 353 25 th and ending at 7:37 on Jun 26 th , 2015 from a single station located within the deepest basin of the 354 pond (41.548212ºN, 70.622412ºW). A total of 7 casts were conducted over the 24 hr period, and 355 each cast sampled 8 depths to generate a 2D sampling grid designed for comparison to model outputs 356 (Fig. 2) (Fig. 2). Our results focus on how these two solutions compare to 404 observations as well as to each other. Future discounting was not used in solutions presented here 405 (i.e., = 0 and = ∆ ). See SM Section 3.2 for details on optimization. 406

Simulations compared to observations 407
In this section, solutions obtained from the short (0.25 d) and long (3 d) interval optimizations are 408 compared to biogeochemical observations from Siders Pond that were collected on Jun 25 th and 26 th , 409 2015 (Fig. 2). Simulated profiles for photosynthetic active radiation (PAR), Chl a and dissolved 410 oxygen over 6 days for the short and long interval optimizations are compared to 1 day of 411 observations in Siders Pond in Figure 3. The short interval optimization (SIO) shows PAR extending 412 to nearly the bottom of the pond (Fig. 3a), while PAR from the long interval optimization (LIO) (Fig.  413 3b) more closely matches observations (Fig. 3c); however, high intensity PAR (~ 100 µmol m -2 s -1 ) 414 observations extend a few meters deeper than the LIO simulation predicts. The prediction for Chl a in 415 both simulations do not match observations very well (Fig. 3d-f), but this is partially due to how Chl 416 a was estimated in the model, since Chl a is not specifically modeled (see Section 2.3). The Chl a in 417 vivo observations show a peak Chl a around 5 m, while both simulations have peaks around 12 m, but 418 those peaks are due to accumulation of sinking phytoplankton rather than productivity (see below). 419 The LIO simulation does show a secondary Chl a peak developing around 3 m, but it is weaker than 420 observations (~ 6 vs 40 µg L -1  429  4), but some discrepancies are apparent. The phosphate chemoclines occur around 10 m, 11 m and 8 430 m for the SIO, LIO and observations, respectively (Fig. 4a-c). Above the phosphate chemocline, the 431 SIO simulation shows slightly elevated levels of H 3 PO 4 (~0.3 mmol m -3 ) in the surface and 5 m 432 layers, compare to observations (Fig. 4c)

I n r e v i e w
This is a provisional file, not the final typeset article simulations show much lower DIC concentrations below 10 m (3,000 and 5,400 mmol m -3 , 437 respectively) than observed in Siders Pond (16,500 mmol m -3 ), which would indicate either much 438 higher anaerobic respiration in the pond than occurs in the simulations, or the bottom boundary 439 condition for the simulations is incorrect (see SM Section 2.6). The SIO and LIO also show greater 440 draw down of DIC in the surface water above 2 and 4 m respectively, and the LIO simulation shows 441 minimum DIC approaching just 2 mmol m -3 at 1 m, while minimum observed value is above 660 442 mmol m -3 . 443 Simulations of hydrogen sulfide show a chemocline at approximately 9 and 10 m for the SIO and 444 LIO solutions, respectively, which are comparable to the H 2 S chemocline observed in Siders Pond at 445 about 8 m (Fig. 5a-c). However, H 2 S reaches concentrations as high as 7,000 mmol m -3 in Siders 446 Pond, while maximum concentrations only reaches 900 and 2,100 mmol m -3 in the SIO and LIO 447 simulations, respectively. Simulations also show a peak H 2 S concentration at 12 m, and a decrease in 448 concentration below 12 m, which seems unlikely and indicates that the bottom boundary condition 449 may not be correct. That is, the flux of H 2 S into the bottom water is lower than it should be, but it is 450 also possible that the rate of anaerobic metabolism in the simulations is too low, which would be 451 consistent with the low simulated DIC concentration compare to observations mentioned above.  (Fig. 6a). The LIO simulation shows a similarly high DOC concentration at 14 m, but above 12 465 m, the DOC concentration ranges from 2 to 140 mmol m -3 , which is comparable to those observed in 466 Siders Pond, which range from 200 to 300 mmol m -3 above 10 m, and increase to a maximum of 467 about 1,000 mmol m -3 at 12 m. Similar to DOC, POC in the SIO simulation shows low values (< 30 468 mmol m -3 ) above 6 m, but POC peaks to 1,000 mmol m -3 at 12 m (Fig. 6d). The POC concentrations 469 from the LIO simulation are closer to observations, but the mid-water POC maximum in the LIO 470 simulation is approximately 500 mmol m -3 , while the observations peak at 280 mmol m -3 around 4 m. 471 Like the SIO simulation, the LIO simulation also shows high POC concentrations below 10 m, which 472 was not observed in Siders Pond. It is possible that DOC and POC may reach higher concentrations 473 in the funnel-like basin of Siders Pond below 12 m (Fig. S1), but samples were not collected there. 474

Comparison between SIO and LIO simulations 475
Since the MEP optimization model generates a large number of outputs, this section highlights some 476 of those outputs to contrast the simulations based on the short (0.25 d) interval optimization (SIO) to 477 that from the long (3 d) interval optimization (LIO). Consider entropy production (Eq. S178), which 478 is the variable that is being sequentially maximized over either a 0.25 d interval (SIO) or a 3 d 479 interval (LIO) at 10 different depths (see SM Section 3.1, Eq. (S180)) (Fig. 7). Total entropy 480 production, Σ, for the SIO and LIO simulations differ significantly, in that Σ in the SIO solution is 481 I n r e v i e w spread out over a 12 m water column (Fig. 7a) versus the LIO solution (Fig. 7e) where most of the 482 entropy production occurs in the top 3 m of the pond. Furthermore, peak entropy production in the 483 LIO simulation is 7.5 times great than the SIO solution (9.8 vs 1.3 GJ m -1 K -1 d -1 ), and the total 484 integrated entropy produced over the water column and 6 d simulation, (Eq. (S182)), was 21.3 GJ 485 K -1 and 36.5 GJ K -1 for the SIO and LIO simulations, respectively. For comparison, if the pond was 486 sterile, would equal 12.4 GJ K -1 from light absorption in the water column. The higher total 487 integrated entropy, , produced by the LIO simulation illustrates that extending the time scale 488 results in greater entropy production, as has been previously shown in experimental mesocosms 489 . 490 The different contributors to total entropy production, namely by reaction, ̇, water, ̇, and 491 particles, ̇, also differ significantly between the two simulations. For instance, total integrated 492 entropy production associated with reactions (Table 1) was actually greater in SIO than the LIO 493 simulation (Fig. 7b vs f), as well as exhibited a greater maximum ̇ (0.28 vs 0.26 GJ m -1 K -1 d -1 for 494 SIO versus LIO); however, entropy production by reaction during the day is rather small (< 25% in 495 the upper 5 m) compared to light dissipation by water or particles, but the two simulations differ here 496 as well. The SIO simulation dissipates most of the incoming radiation by water absorption in the 497 upper 5 m of the water column ( Fig. 7c vs g), while the LIO simulation dissipates most of the 498 electromagnetic potential via absorption by particles (Fig. 7d vs h). As evident in the POC (Fig. 6) 499 and Chl a (Fig. 3) concentrations, the LIO simulation produces more biomass in the upper portion of 500 the water column, and biomass is effective at absorbing and dissipating light. The high biomass in 501 the LIO simulation is due to phytoplankton, as describe next. 502 An analysis of phytoplankton growth by the SIO and LIO simulations (Fig. 8) illustrates not only 503 how the two simulations differ, but also some of the mechanics of the MEP-based optimization 504 approach. Phytoplankton density attains a maximum of 20 mmol m -3 in the LIO simulation by June 505 27 th , but Phy are effectively absent in the SIO simulation, attaining a maximum of only 0.1 mmol m -3 506 ( Fig. 8a and f). While it is possible that the low phytoplankton (Phy) density in the SIO simulation 507 could be due to extensive predation, this is not the case because the rates of CO 2 fixation ( 1, ℎ , 508 Table 1) and conversion of fixed C to biomass ( 2, ℎ , Table 1) are two orders of magnitude lower in 509 the SIO versus LIO simulation (Fig. 8b, c versus g, h). Of course the differences in phytoplankton 510 density and reaction rates between the SIO and LIO simulations are due to how the optimal control 511 variables change over time and space (Fig. 8d, e, i and j). Consider how reaction 512 efficiency, ℎ ( , ), varies over time and space in the two simulations ( Fig. 8d and i). The SIO 513 simulation shows rapid switching between very high efficiencies (>0.98) and very low efficiencies 514 (<0.02) over time. While not on all days, reaction efficiency drops to very low levels around noon 515 (Fig. 8d), which results in high entropy production, but at the sacrifice of fixing CO 2 , which is 516 consistent with a short term optimization objective: burn fast now instead of investing for the future. 517 On the contrary, the LIO solution (Fig. 8i) shows much more gradual changes in ℎ ( , ), 518 operating between 0.3 and 0.4 for most of the simulation. There is rapid changing of Ω 1, ℎ ( , ) in 519 the LIO solution (Fig. 8j), but this makes sense, because the control variable partitions phytoplankton 520 biomass to the light requiring carbon fixation reaction, 1, ℎ , during peak daylight, then switches to 521 the biosynthesis reaction, 2, ℎ , at night (Fig. 8h). Changes in biomass partitioning is also observed 522 in the SIO solution (Fig. 8e), but sometimes Ω 1, ℎ selects for the light-requiring CO 2 fixation 523 reaction at night. This nonsensical behavior is likely due to the nature of optimization. For the SIO 524 simulation, phytoplankton do not contribute significantly to free energy dissipation, so values of the 525 optimal control variable are freer to randomly fluctuate because they do not strongly impact the 526 optimization objective. Based on Fig. 7c, the SIO solution places more weight on using water for the 527 This is a provisional file, not the final typeset article short term dissipation of electromagnetic potential in the pond's surface, but in deeper water the SIO 528 solution does grow biomass. 529 Instead of growing phytoplankton, the SIO solution produces more green sulfur bacteria (GSB), 530 which reach a maximum concentration of 735 mmol m -3 , compare to only 23 mmol m -3 in the LIO 531 solution (Fig. 9a and d). Furthermore, GSB increase during the SIO simulation, while they decrease 532 in the LIO, which is evident in the greater biosynthesis reaction, 2, in SIO versus LIO solutions 533 (Fig. 9b and e) as well as in 1, (not shown). However, the value of the reaction efficiency control 534 variable, ( , ), does switch to low values around noon on several days in the SIO simulation, 535 which implies again the solution favors entropy production over growth (Fig. 9c and f). This SIO 536 simulation also favors much higher reaction rates of photoheterotrophs (PH), 1, , compared to the 537 LIO solution, but interestingly, this does not lead to greater concentrations of S (Fig. 10). The 538 reason is because high rates for 1, are coupled with extremely low values of ( Fig. 10b and c). 539 Based on the adaptive Monod equation that changes substrate affinity as a function of 4 (Eq. 540 (S131)), uptake of labile organic carbon by PH can occur at extremely low concentrations of C 541 when is close to zero, but small values means very little biomass is produced as a results 542 (Eq. (S123)). This is an interesting result, as light energy is being used to scavenge organic carbon at 543 low concentrations, which contributes to the low labile organic carbon, C , observed in the SIO 544 simulation (Fig. 6a). Furthermore, the high entropy production from reaction, ̇ (Fig. 7b), is almost 545 entirely due to PH. One of the reasons both green sulfur bacteria and photoheterotrophs are more 546 active in the SIO versus LIO simulations is that light is not being intercepted by phytoplankton like it 547 is in the LIO simulation ( Fig. 8a vs f). 548 Bacterial densities are similar in the two simulations ( Fig. S4a and d), but there is significantly higher 549 growth rate by bacteria in the LIO simulation between 4 and 10 m (Fig. S4b and e). Both simulations 550 allocate almost all of S to detrital carbon decomposition, 2, , below 13 m ( Fig. S5c and f), but 551 the SIO simulation also allocates biomass to 2, sporadically throughout the water column to 552 produce C (Fig. S4c), which is limiting (Fig. 6a). The anaerobic, sulfate reducing bacteria function 553 similarly as bacteria, but of course operate in the anaerobic portion of the water column (Figs. S6 and 554 S7). However, while their growth occurs predominately below 12 m ( Fig. S6b and e), there is 555 significant biomass accumulation at and above 11 m, which indicates advection is transporting them 556 upwards. Like bacteria (Bac), the sulfate reducing bacteria (SRB) allocate biomass to 2, below 13 557 m in both simulations ( Fig. S7c and f) and sporadically throughout the water column in the SIO 558 simulation (Fig. S7c). Overall, bacteria and SRB function in a complementary mode across the 559 aerobic and anaerobic portions of the water column. Similar to phytoplankton, the control variables 560 for both S and S show more rapid (bang-bang) control in the SIO compared to the LIO 561 simulation (Figs. S5 and S7). The sulfide oxidizing bacteria, S , are largely unimportant in either 562 of the 6 day simulations. 563 Another significant difference between the SIO and LIO simulations is a greater importance of 564 predation, S , in the SIO solution (Fig. 11). Because predation is abstracted in the MEP model, it 565 represents all predation mechanisms, including protists, predatory bacteria, viruses and cannibalism. 566 In addition to dissipating chemical potential stored as biomass, predation serves a more important 567 task of turning over biomass locked into metabolic functions, such as sulfate reduction, that are no 568 longer of use under prevailing conditions. The SIO and LIO simulations show that the concentration 569 of S is more the 4 times higher in SIO than LIO solutions, and S increases over time with the 570 SIO objective. The primary prey items in the SIO solution are S (i.e., cannibalism), S and S 571 (Fig. 11b, c and d), while only S predation is of significance in the LIO solution (Fig. 11g). 572 I n r e v i e w Closer inspection of Fig. 11b-c reveals that predation occurs predominately at night, which is a result 573 of temporal changes in the partitioning control variables, Ω , , rather than prey concentration (not 574 shown). It is also worth noting that while the high concentration of S is found below 10 m, 575 predation actually occurs between 4 and 7 m, which indicates the high S concentration is due to 576 sinking and accumulation. Accumulation of biomass in the deep, anaerobic, portion of the water 577 column becomes food for anaerobic predators, S , which are important in both SIO and LIO 578 simulations, but are slightly more active in the LIO simulation (Fig. S8). The dominate prey items for 579 AGz are GSB (Fig. S8b and e), AGz (cannibalism) (Fig. S8c and f), SRB (Fig. S9a and c) and PH 580 ( Fig. S9b and d). 581 An interesting result that derives from the focus on energy dissipation rather than organisms is the 582 importance of chemotrophs on dissipating electromagnetic potential. For instance, heterotrophic 583 bacteria, S , are well understood as dissipaters of chemical potential as they typically respire more 584 reduced organic carbon than converting it biomass (i.e., < 0.5, Eq. (S84)), where the latter 585 reaction conserves chemical potential (Table 1). In both the SIO and LIO simulations, however, far 586 more free energy is dissipated by passive light absorption than it is by respiration (Fig. 12), although 587 the difference is more striking in the SIO simulation ( Fig. 12a versus b). Of course, some prokaryotes 588 in nature harness the abundant light energy via expression of proteorhodopsin (Beja et al., 2000;589 Giovannoni, 2017). 590

4
Discussion 591 The two primary objectives of this modelling study were to demonstrate that 1) a model based on 592 free energy dissipation reasonably describes microbial community organization and function and 2) 593 that microbial systems operate collectively over characteristic timescales that are likely longer than 594 what common wisdom would suggest. The secondary objectives were to demonstrate how the model 595 can be used in systems with spatial dimensions and to extend the approach to include phototrophs. 596 While improvements could be made with explicit data assimilation (Edwards et al., 2015), the MEP 597 model did a reasonable job at simulating biogeochemistry in Siders Pond, and the better fit of the 598 long interval optimization (LIO) simulation to observations indicates that the microbial community 599 has evolved to function over time scales that are longer than 0.25 days. While sinking velocities and 600 the two light absorption coefficients were crudely tuned (Section 2.1.3.1), the model otherwise has 601 very few adjustable parameters compared to conventional biogeochemical models that often contain 602 dozens (Ward et al., 2010). Furthermore, the MEP model is expected be more robust to extrapolation 603 beyond the envelope of calibration data, as the MEP objective is applicable regardless of a systems 604 operating point. The main drawback to the MEP approach is in the prediction of details. Since the 605 MEP conjecture is based on statistical mechanics (Dewar, 2003;Lorenz, 2003), it does not provide 606 information on the mechanisms by which a system attains an MEP state. If fact, the reason systems 607 tend to follow an MEP trajectory is because there are many ways to achieve that objective. The 608 approach is unlikely to predict details correctly. For instance, MEP models are unlikely to distinguish 609 an algal bloom from a harmful algal bloom, although predicting the latter is of great importance. 610 Likewise, an MEP model is just as likely to predict biomass turnover by jellyfish versus turnover by 611 salmon, but from an economics perspective, these are two vastly different solutions. In essence, 612 MEP is best used at predicting overall function, not the details that describe that function, largely 613 because predicting details is very difficult and prediction accuracy decays rapidly. MEP modeling is 614 similar to weather versus climate prediction. Climate models can predict far into the future, but not 615 at the details of weather, and weather models provide details, but their prediction decays quickly. 616 Perhaps the most useful aspect is that MEP provides a different perspective to view biology (Skene,617 This is a provisional file, not the final typeset article 2017). For the Siders Pond model, the perspective was microbial functional activity over time and 618 space. 619 Temporal strategies, such as circadian clocks (Wolf and Arkin, 2003), anticipatory control (Mitchell 620 et al., 2009; Katz and Springer, 2016), energy and resource storage (Schulz et al., 1999;Grover, 621 2011), and dormancy (Lewis, 2010) are hallmarks of biology, yet they are often not given much 622 consideration when theory and models are developed for understanding biogeochemistry, even 623 though temporal strategies have also been observed in microbial communities (Ottesen et al., 2014). 624 Here our results demonstrate that different organizational timescales dramatically impact 625 biogeochemistry and how microbial communities function. For instance, the SIO simulation does not 626 invest resources in phytoplankton growth, because over the short 0.25 d optimization, water 627 dissipates more electromagnetic potential than a small increase in phytoplankton biomass over the 628 short interval. Instead, the SIO solution allocates resources to green sulfur bacteria (GSB) and 629 photoheterotrophs (PH) growth, possibly because sinking losses near the surface of the pond are 630 greater than they are at depth due to advection and the pond's geometry. Near the surface of the 631 pond, upward advective velocity that counteracts sinking is only 0.01 m d -1 , while at 12 m velocities 632 reach 0.06 m d -1 , which facilitates biomass accumulation. Specific growth rates need to be higher at 633 the surface. The SIO solution also places more resources on decomposing refractory carbon, but also 634 on respiring the produced labile carbon, which results in low standing-stock concentrations of C . 635 Grazing rates, especially under aerobic conditions, are higher in the SIO simulations as well. These 636 types of resource allocations in the SIO solution are consistent with R* or resource-ratio theory 637 (Tilman, 1982) and r-selection (Pianka, 1970;Fierer et al., 2007); that is, emphasis on fast growth. 638 On the contrary, the LIO solution appears more similar to K-selection where resources are invested 639 for longer term outcomes. We know that terrestrial systems function on long timescales, at least on the order of the four 661 seasons, but their time scales are likely much longer given that individual trees can live hundreds 662 years, and forest succession takes longer than that (Odum, 1969;Finegan, 1984). At this time, we do 663 not have an answer to the question regarding an appropriate optimization time scale, but it is likely 664 I n r e v i e w related to the time scales over which biological predictions, acquired by evolution, can be reliably 665 made, such as the sun will return tomorrow, or winter is coming. Predicting the future has obvious 666 evolutionary advantages, but it also results in greater dissipation of free energy and entropy 667 production. Some specific temporal predictions can be quite long, such as periodical cicadas that 668 have strategies that operate on 13 and 17 year cycles, which clearly have been successful to date. We 669 believe that determining the temporal scales that microbial communities operate over will be 670 important for developing predictive biogeochemistry models, but time is not the only scale of 671 importance. 672 Strategies that coordinate function over space also lead to greater free energy dissipation (Vallino, 673 2011). Some examples of such coordination include quorum sensing (Goo et  did not produce biologically relevant results due to the speed at which electromagnetic potential is 681 dissipated abiotically. Because water rapidly absorbs light, any water column of sufficient depth will 682 dissipate all incoming radiation as entropy regardless of the presence of particles, so an infinite 683 number of solutions exist that all produce the same global entropy production. For instance, a water 684 column devoid of particles will produce entropy at an exponentially decreasing rate with depth, but 685 eventually all the light will be dissipated. A solution in which the surface is particle rich will result 686 in rapid entropy production with depth, but the same total amount of entropy will be produced. 687 Particles placed lower in the water column would produce a different profile of entropy production, 688 but the result would be the same, with all of the electromagnetic potential destroyed. Consequently, 689 maximizing entropy production summed over the water column (i.e., global entropy maximization) 690 does not produce a unique solution as it does for systems where energy potentials are stable, such as 691 chemical potentials (Vallino, 2011). However, maximizing local entropy does select for a unique 692 solution, which our results show is biologically relevant. In local maximization of entropy 693 production, the objective is to maximize entropy production, ( ), at every depth simultaneously. 694 The order in which ( ) is maximized at each grid point does not matter, but in theory the 695 optimization should be repeated until no further improvement occurs at any depth. Since light 696 intensity is greatest at the surface, entropy production there can be greatly increased with an increase 697 in particle density, which then lowers the amount of entropy that can be produced at depth due to 698 shading. Of course, if the particles are phytoplankton and other organisms, resource availability will 699 often limit biomass accumulation and local entropy production, but the system will organize in an 700 attempt to minimize resource limitations, such as grazing and remineralization, and this is the 701 solution observed in both the SIO and LIO simulations, with the latter outperforming the former. 702 This first MEP study involving light dissipation and phototrophy generated an additional interesting 703 question--under what circumstances should entropy production be maximized locally versus 704 globally? Our previous study with purely chemical reactions showed that global entropy production 705 results in greater chemical potential destruction (Vallino, 2011). This study indicates that if abiotic 706 processes can destroy an energy potential faster than biotic ones, then local MEP optimization will be 707 the preferred solution. Global maximization of entropy production is more likely to be found in 708 systems where energy potentials are stable with respect to abiotic decay, such as chemical potentials. 709 However, it seems possible that both local and global optimization could be operating simultaneously 710 is a system given that both light and chemical potentials exist. If so, one possible means of finding 711 I n r e v i e w This is a provisional file, not the final typeset article the correct solution is to first conduct local optimization followed by global optimization. In the case 712 of the Siders Pond MEP model, a solution could first be determined from local optimization as was 713 described in the Results Section, then that solution could be used as an initial condition for global 714 optimization . If the global solution cannot improve on the local optimization, then the local  715  optimization is the final solution. However, if a solution found by local optimization can be  716 improved upon by a global optimization then that solution should be chosen. We did not conduct this 717 study, as the MEP modeling approach is computationally demanding, and other components of the 718 approach could use further development, which we briefly explain next. 719 The adaptive Monod equation that servers as the kinetic driver, (e.g., Eq. (S45) and others) is 720 based on simple observations of bacterial growth under oligotrophic to copiotrophic conditions, but 721 there must be underlying biophysical constraints that form the basis of the tradeoff between growing 722 fast versus taking up nutrients at low concentrations (Kirchman, 2016 refinement. Finally, the current approach uses non-linear optimal control to locate MEP solutions, but 739 this is a computationally intensive problem, and falls under the class of problems known as control of 740 partial differential equations (Coron, 2007). Building on expertise from that field could reduce 741 computation requirements, especially for problems involving two and three dimensions, but it might 742 be possible to avoid the formal optimization problem altogether. One possibility would be to use 743 Darwinian inspired trait-based modeling approaches (Follows et  SIO) interval optimization supports the hypothesis that biological systems maximize entropy 764 production over long timescales. The modeling presented here extends the MEP approach to include 765 an explicit spatial dimension, and new metabolic reactions were introduced to model phototrophs and 766 entropy production associated with the destruction of electromagnetic potential. By considering the 767 dissipation of both chemical and electromagnetic potentials, the MEP model shows that heterotrophs, 768 such as bacteria, dissipate far more free energy in the photic zone by passive light absorption than by 769 chemical respiration. Short interval optimization (SIO, 0.25 d) results in higher grazing rates and 770 turnover of organic carbon, as well as rapid (bang-bang) changes in the reaction efficiency control 771 variables, while long interval optimization supports higher phytoplankton growth and standing stocks 772 near the surface of the pond. We also found that maximization of local entropy production, as 773 opposed to global entropy production, must be used for energy potentials that are quickly dissipated 774 by abiotic processes, such as light absorption by water and particles. However, our general 775 conjecture that biological systems evolve and organize to maximize entropy production over the 776 greatest possible spatial and temporal scales, while abiotic processes maximize instantaneous and 777 local entropy production, remains valid. Conflict of Interest 781 The authors declare that the research was conducted in the absence of any commercial or financial 782 relationships that could be construed as a potential conflict of interest. 783

7
Author Contributions 784 Both JV and JH contributed equally to project concepts and design, and both were full participants of 785 the 24 hour endurance sampling of Siders Pond. Preliminary results from metatranscriptomics and 786 metagenomics from JH were used to improve development of the metabolic network. JV developed 787 the model and wrote first draft of the manuscript with editing and inputs during the process from JH. 788 Both authors contributed to manuscript revision, read and approved the submitted version. supported the participation of JH. This is C-DEBI contribution number XXX. 795

9
Acknowledgments 796 We thank Petra Byl for assistance in field work preparation and sampling of Siders Pond, as well as 797 insightful conversations about the project. We also thank Jane Tucker for dissolved inorganic carbon 798 analyses, Suzanne Thomas for hydrogen sulfide analyses, Sam Kelsey for dissolved organic carbon 799 analyses, Kate Morkeski for sulfate, phosphate and particulate organic carbon analyses, and Rich 800 McHorney, Leslie Murphy, Emily Reddington, and Gretta Serres for assistance with Siders Pond 801 sampling logistics. We also thank the Semester in Environmental Science Program at MBL for use of 802 their Hydrolab and Jon Boat use for sampling and a special thank you to Tom Gregg who allowed us 803 to access Siders Pond through his property as well as for providing logistical support. 804 805 I n r e v i e w I n r e v i e w