Comparing Log-Linear and Best-Fit Models to Evaluate the Long-Term Persistence of Enteric Markers in Sewage Spiked River Water

Water quality models use log-linear decay to estimate the inactivation of fecal indicator bacteria (FIB). The decay of molecular measurements of FIB does not follow a log-linear pattern. This study examined the factors associated with the persistence of Escherichia coli uidA, enterococci 23S rDNA, and Bacteroides thetaiotaomicron 1,6 alpha-mannanase in microcosms containing 10% (vol/vol) sewage spiked river water stored at 4°C for up to 337 days. The study estimated the markers' persistence with log-linear models (LLMs) and with the best-fit models, biphasic exponential decay (BI3) and log-logistic (JM2), and compared the estimates from the models. Concentrations of B. thetaiotaomicron decreased to levels below detection after 31 days in storage and were not fit to models. BI3 and JM2 were fit to E. coli and enterococci, respectively. LLMs had larger Bayesian information criterion values than the best-fit models, indicating poorer fit. LLMs overestimated the time required for 90% reduction of the indicators (T90) and did not consider dynamic rates of decay. Time in storage and indicator species were associated with the persistence of the markers (p < 0.001). Using the T90 values of the best-fit models, enterococci were the most persistent indicator. Our data support the use of best-fit models with dynamic decay rates in water quality models to evaluate the decay of enteric markers.


INTRODUCTION
In the United States, environmental waters that are impaired or in danger of being impaired are assigned remediation plans aimed at restoring designated uses (United States, 2016). Forecasting the restoration of an impaired watershed is partially dependent on the development of water quality models that estimate the concentrations of fecal indicator bacteria (FIB) while simultaneously calculating their transport and fate (particularly survival) in the aquatic ecosystem. Water quality models have been slow to adopt data generated from molecular measurements of FIB, even as these measurements are finding use in rapid "swimmable" assessments. Compared to concentrations of culturable enterococci, measurements of enterococci 23S rDNA (Entero1) had a stronger correlation with the incidence of gastrointestinal illness among children visiting beaches near point sources of treated sewage at Lake Michigan, Michigan and Indiana; and Lake Erie, Ohio (Wade et al., 2008). Therefore, water quality models that incorporate molecular measurements of FIB could provide better resolution of the risk of waterborne illness.
Microbial source tracking (MST) with molecular methods can identify and quantify pollution associated with specific hosts in watersheds of impaired rivers (Ballesté et al., 2020;Brooks et al., 2020). Identification of host-associated pollution is important because different origins of pollution are suggested to pose varying risks of adverse health effects (Soller et al., 2010). Multiple studies have demonstrated that human-associated markers can be detected in various types of environmental waters (Reischer et al., 2007;Rusiñol et al., 2014;Ahmed et al., 2019). The concentrations of the markers can also be incorporated into water quality models to improve forecasting of an impaired watershed (Jeong et al., 2019). Such analysis could offer better assessment of pollution associated with higher risk to public health.
Two commonly used water quality models, HSPF and SWAT, use log-linear models (LLMs) to calculate the survival of culturable FIB in aquatic environments (Johnson et al., 1984;Gassman et al., 2007). A meta-analysis determined that there are three general patterns of survival curves of culturable E. coli in water microcosms: biphasic log-linear, shouldering followed by log-linear decay, and log-linear (Blaustein et al., 2013). These studies suggest that persistence models could accommodate the dynamic decay of enteric markers.
Few studies have evaluated the persistence of FIB in water over longer timescales. A previous study determined that the time needed for a 90% reduction (T90) of culturable E. coli (EC) and enterococci (ENT) was > 60 days in microcosms containing sediment covered with freshwater spiked with sewage effluent and stored at 10°C (Pote et al., 2009). Further research on the long-term persistence of FIB would provide better forecasting power for water quality models.
In order to increase the accuracy and precision of water quality models, we describe the long-term persistence (up to 337 days) of three enteric molecular markers, B. thetaiotaomicron 1,6 alpha-mannanase, enterococci 23S rDNA, and E. coli uidA, in microcosms of 10% (vol/vol) sewage spiked river water stored in the dark at 4°C. Enterococci 23S rDNA was chosen because it is the target of a standard method to evaluate fecal pollution in recreational waters (USEPA, 2015). E. coli uidA was chosen because the marker is species specific and the United States Environmental Protection Agency established criteria regarding the concentration of culturable E. coli in recreational waters (USEPA, 2012). B. thetaiotaomicron 1,6 alpha-mannanase was chosen to evaluate the persistence of human-associated pollution. The temperature was chosen because it represented a conservative estimate of the persistence of enteric markers in many temperate freshwater lakes. Microcosms were stored in the dark for the duration of this study. Sunlight inactivation was not evaluated because previous studies of freshwater microcosms containing either human or cattle sewage found no significant difference in the decay of molecular targets in the presence of sunlight compared to microcosms stored in the dark (Dick et al., 2010;Korajkic et al., 2019).
The specific objectives of this study were to:
1. Evaluate the mathematical relationships of experimental data of Escherichia coli uidA, enterococci 23S rDNA, and Bacteroides thetaiotaomicron alpha-mannanase measured from microcosms of 10% sewage spiked river water stored for up to 337 days at 4°C using log-linear and best-fit methods. Compare the mathematical relationships of the best-fit models to log-linear models. Compare the times required for 90% and 99% reductions of the initial concentrations of the markers using best-fit models and a corresponding log-linear model.
2. Evaluate the associations of time in storage and indicator species with the observed persistence of the three markers.

Microcosm Set-Up
Water from the Red Cedar River (6 L) was autoclaved for 1 h in multiple 2 L plastic containers. After cooling to room temperature, the water was spiked with 10% (vol/vol) raw sewage from the East Lansing Wastewater Treatment and equally divided into 12 autoclaved 1 L Nalgene polypropylene bottles (ThermoFisher Scientific, Waltham, MA). Neither the river water nor the raw sewage was evaluated for the three genetic markers prior to the set-up of the microcosms. Each microcosm contained ∼500 mL of liquid and no sediment. All microcosms were stored in the dark at 4°C.

Sample Processing and DNA Extraction
On each of the following days after seeding the raw sewage into the river water (0, 31, 61, 93, 128, 155, 187, 219, 248, 279, 306, and 337), one microcosm was shaken 25 times (EPA-DNA, USEPA, 2015) and triplicate 100 ml aliquots from that microcosm were membrane filtered onto three separate Nucleopore Track Etch polycarbonate membrane filters (0.45 µm pore size, 47 mm diameter, Whatman Inc., Piscataway, NJ). The three filters per sampling day represented technical replicates. DNA extraction from all samples was performed using USEPA Method 1611.1 (EPA-DNA, USEPA, 2015). Specifically, three filters were used to concentrate cells at each time point. At each time point, a filtration negative control of autoclaved phosphate-buffered saline accompanied the filtration of the samples. The extraction method in USEPA Method 1611.1 mechanically shears the cells. The first step is 1 min of bead beating of the filters at maximum speed in 600 µL of DNA extraction buffer with 0.3 mm glass beads. Two centrifugation steps were completed to separate glass beads and other cell debris from the 350 µL of supernatant containing the DNA. The supernatants were diluted 1:5 in AE buffer and stored at −80°C.

qPCR Quantification of Three Indicators of Fecal Pollution
The primers, probes, amplified sequence length, and qPCR protocols for Bacteroides thetaiotaomicron alpha-mannanase (BT-am), Escherichia coli uidA (EC-uidA), and Enterococcus spp. 23S rDNA (ENT-23) are described in Table 1. [...] probe (0.1 µM BT-am probe), 2 mg/ml bovine serum albumin (BSA), 1 mM of MgCl2 (EC-uidA only), 5 µl of DNA sample, and enough nuclease-free water to bring the reaction volume to 20 µl. The following components were analyzed in each qPCR assay: three wells of one dilution step from the standard curve (10^6 or 10^7 copies/rxn for ENT, 10^5 or 10^7 copies/rxn for EC, and 10^3 or 10^4 copies/rxn for BT), a well containing nuclease-free water, and duplicates of method blanks for each storage time point (DNA extracts of sterile phosphate-buffered saline membrane filtered at each time point). Each DNA extract from the experiment was run in duplicate to represent analytical replicates. The analytical replicates were averaged to represent each technical replicate. Information regarding the performance of the standard curves, including the theoretical limit of detection, limit of quantification, and efficiency of the standard curve, is included in Supplementary Table 1.
Quantification was established with a standard curve. Overnight cultures of Enterococcus faecalis ATCC strain 19433 and E. coli ATCC strain 15597 were made, and genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen Inc, Valencia, CA). The qPCR measurements were reported as cellular equivalents (CE) of BT, ENT, or EC per 100 ml-water sample to compare the data to survival curves of culturable FIB in previous studies. All DNA extracts were diluted 1:5 before addition to the qPCR reagents to decrease the chances of inhibition, which also decreased the lowest detected concentration of BT, EC, and ENT to 2.68 × 10^2, 1.30 × 10^5, and 1.38 × 10^3 CE/100 ml-water sample, respectively. Genomic DNA of B. thetaiotaomicron (ATCC 29148) was quantified with a NanoDrop 1000 (ThermoFisher Scientific, Waltham, MA) to obtain copy number per rxn. Subsequently, 1:10 serial dilutions of the genomic DNA of the three indicators were made. The dilution steps spanned from 10^0 to 10^7 copies/rxn. One standard curve was used for both qPCR runs and contained triplicates of each dilution step. For all standard curves, the R^2 was > 0.96. qPCR inhibition (defined as technical duplicates differing by > 3.32 Ct) was not noted in the experiment.
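As an illustration of how a standard curve of this kind is commonly summarized, the sketch below fits Ct against log10 copies/rxn and derives the slope, R^2, and amplification efficiency. The Ct values are hypothetical (an idealized curve, not data from this study), and the function name `fit_standard_curve` is introduced only for this example. At 100% efficiency the slope is about −3.32 Ct per tenfold dilution, which is why a difference of 3.32 Ct corresponds to roughly one log10 in template.

```python
import numpy as np

def fit_standard_curve(log10_copies, ct):
    """Fit Ct = slope * log10(copies) + intercept by least squares.

    Returns slope, intercept, R^2, and amplification efficiency,
    where efficiency = 10**(-1/slope) - 1 (1.0 means 100%).
    """
    x = np.asarray(log10_copies, dtype=float)
    y = np.asarray(ct, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    efficiency = 10.0 ** (-1.0 / slope) - 1.0
    return slope, intercept, r2, efficiency

def ct_to_copies(ct, slope, intercept):
    """Invert the standard curve to estimate copies/rxn from a Ct value."""
    return 10.0 ** ((ct - intercept) / slope)

# Hypothetical, idealized standard curve spanning 10^0 to 10^7 copies/rxn.
log10_copies = np.arange(0, 8)
ct_values = 38.0 - 3.32 * log10_copies

slope, intercept, r2, efficiency = fit_standard_curve(log10_copies, ct_values)
copies_at_ct_28 = ct_to_copies(28.0, slope, intercept)
```

In practice each dilution step would contribute its triplicate Ct values to the fit rather than a single idealized point.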

Persistence Modeling and Statistical Analysis
The statistical and modeling analyses considered all data points, including technical replicates, that measured ENT, EC, and BT (BT in the statistical analysis only). The persistence of BT was not evaluated using a best-fit model or log-linear model because only three time points were above the detection limit (Figure 1A). In the EC dataset, non-detects were evaluated at the detection limit in the persistence modeling and statistical analyses, as previously advised (USEPA, 1991). Model selection and the shape of the best-fit models of EC were affected by non-detect samples that occurred at t = 219 and 248 days (Figure 1C).
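The handling of non-detects can be sketched as follows: replicates below detection are set to the detection limit, the technical replicates are averaged, and the mean is expressed relative to the mean concentration at t = 0 on a log10 scale. The concentrations, the detection limit, and the function name are hypothetical values chosen for illustration, and representing a non-detect as `None` is an assumption of this sketch.

```python
import math

def log10_relative_persistence(replicates_ce, n0_ce, detection_limit_ce):
    """Return log10(N/N0) for one sampling day.

    replicates_ce: technical-replicate concentrations (CE/100 ml);
                   a non-detect is given as None and is replaced by
                   the detection limit before averaging.
    n0_ce:         mean concentration at t = 0 (CE/100 ml).
    """
    filled = [detection_limit_ce if c is None else c for c in replicates_ce]
    mean_ce = sum(filled) / len(filled)
    return math.log10(mean_ce / n0_ce)

# Hypothetical sampling day: two detects and one non-detect.
example = log10_relative_persistence(
    replicates_ce=[2.0e4, None, 1.0e4],
    n0_ce=1.0e6,
    detection_limit_ce=1.0e3,
)
```

Because a non-detect is only known to lie at or below the detection limit, a value computed this way is an upper bound on the true relative persistence for that day.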
The average CE concentrations of each indicator were transformed to the relative persistence of the original concentration, N/N0, where N was the average concentration of the indicator from three technical replicates (CE/100 ml-sample) after t days in storage, and N0 was the average concentration of the indicator from three technical replicates at initial sampling (t = 0). A model fitting tool in R (R Development Core Team, 2013) provided by Drs. Kyle Enger and Jade Mitchell used maximum likelihood estimation to fit 17 linear and non-linear models (qmra.canr.msu.edu) to the data. The best-fit model with the lowest value of the Bayesian information criterion (BIC) was chosen to represent the persistence of the indicators. If the BIC value of the biphasic log-linear model (BI3) was ≤ 2 units from the smallest BIC value, then BI3 was chosen, as differences of ≤ 2 units are considered equivalent (Bolker, 2008). Two best-fit models were selected for the indicators: the biphasic log-linear model, BI3 (Carret et al., 1991), and the log-logistic model, JM2 (Juneja et al., 2003). The model fitting tool also evaluated the persistence of the indicators using the log-linear model, LLM (Chick, 1908). Descriptions of the equations are in Table 2. The T90 and T99 values (the times required for 1 and 2 log reductions of the initial concentrations of the indicators, respectively) were calculated by setting log10(N/N0) = −1.0 (or −2.0 for T99) and solving for t.
FIGURE 1 | The relative persistence of (A) Bacteroides thetaiotaomicron, BT; (B) enterococci, ENT; and (C) Escherichia coli, EC measured from microcosms containing sewage spiked river water stored at 4°C for up to 337 days. Filled data points indicate measurements below the detection limit and are plotted at the indicator-specific detection limit. Each data point represents the average relative persistence of three technical replicates. The error bars represent one standard error. The lines represent the following models: log-linear (LLM), biphasic log-linear (BI3), or log-logistic models (JM2).
Frontiers in Water | www.frontiersin.org
TABLE 2 | Descriptions and properties of the log-linear model and best-fit equations (biphasic log-linear and log-logistic) that mathematically evaluated the relationship of enterococci 23S rDNA and Escherichia coli uidA from microcosms containing sewage spiked river water stored at 4°C for up to 337 days.
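The model selection rule described in this section can be sketched in a few lines. This is a minimal illustration rather than the fitting tool itself: the BIC values are hypothetical, only three of the 17 candidate models are shown, and the helper names `bic` and `select_model` are introduced here. The BIC is computed with its usual definition, k ln(n) − 2 ln(L), where k is the number of fitted parameters, n the number of observations, and L the maximized likelihood.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k*ln(n) - 2*ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def select_model(bic_by_model, preferred="BI3", tolerance=2.0):
    """Pick the model with the lowest BIC, except that the preferred
    model is chosen when its BIC is within `tolerance` units of the
    lowest value (differences of <= 2 units treated as equivalent)."""
    lowest = min(bic_by_model.values())
    if preferred in bic_by_model and bic_by_model[preferred] - lowest <= tolerance:
        return preferred
    return min(bic_by_model, key=bic_by_model.get)

# Hypothetical BIC values for illustration only.
example_a = {"LLM": 60.0, "BI3": 41.5, "JM2": 40.0}  # BI3 within 2 units
example_b = {"LLM": 60.0, "BI3": 45.0, "JM2": 40.0}  # BI3 more than 2 units away

choice_a = select_model(example_a)
choice_b = select_model(example_b)
```

In the first case BI3 is returned even though JM2 has the smaller BIC, because the 1.5-unit gap falls inside the equivalence band; in the second case JM2 is returned.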

Multiple Linear Regression Analysis
The regression equation for the dataset was: log10(N/N0) = 0.006 × S + 0.307 × I − 0.146, with R^2 = 0.549 (p < 0.001; n = 108), where S is time in storage and I is indicator species. Time in storage and indicator species (p < 0.001) were significantly associated with the relative persistence of the three indicators. This indicates that the longer the time in storage at 4°C, the greater the decrease in the concentration of the genetic marker compared to the initial time. The persistence of each marker was significantly different: BT was the least persistent and ENT was the most persistent.

Description of the Observed Persistence of the Three Molecular Markers
At initial sampling, the average concentrations of BT, EC, and ENT (and standard error) were: 6.81 × 10^4 (5.08 × 10^3), 3.18 × 10^8 (3.99 × 10^7), and 1.01 × 10^7 (2.58 × 10^6) CE per 100 ml-water sample, respectively. The average concentrations of BT and EC were below detection at 9 and 2 time points, respectively, for the complete data set. The most and least persistent markers were EC and BT, respectively (Figures 1A-C). Concentrations of BT were above detection at t = 0, 31, and 219 days (Figure 1A). At t = 219 days, the concentration of BT was above the detection limit in one technical replicate (Figure 1A). At t = 337 days, the observed relative persistence of BT was below detection, representing ≤ −1.17 log decay. Overall, the observed relative persistence of EC and ENT declined over the duration of the experiment, reaching −1.84 log and −3.17 log, respectively, at t = 337 days (Figures 1B,C). The observed relative persistence of EC fluctuated around −2 log during t = 61–337 days (Figure 1C).
Table 3 outlines the parameters of the selected best-fit models, log-logistic (JM2) and biphasic log-linear (BI3), and of the LLM, along with their corresponding BIC values and T90 and T99 values. JM2 and BI3 were selected for ENT and EC, respectively. Non-detects were observed in the EC dataset and affected the decay rates calculated from the best-fit models and LLMs (Figure 1C). After 78.25 days in storage, LLM calculated a faster rate of decay for EC than BI3 (Table 3). Also, LLM calculated a faster decay rate for ENT after ∼250 days in storage than JM2. For both datasets, LLMs had larger BIC values than the corresponding best-fit models (Table 3).

Comparison of the Persistence of Three Enteric Markers Predicted From the Best-Fit and Log-Linear Models
The best-fit models demonstrated that the decay of ENT and EC during the experiment's duration was dynamic (Figures 1B,C). During t = 0–78.25 days, BI3 calculated a rapid decay of EC; afterwards, during t = 78.25–337 days, the decay waned (Figure 1C, Table 3). JM2 estimated that the relative persistence of ENT experienced shouldering followed by rapid decay, and slower decay toward the end of the experiment (Figure 1B). The best-fit models calculated that the concentrations of EC and ENT at t = 337 days relative to the initial concentrations were −2.27 log and −3.12 log, respectively, and were larger than the concentrations derived from the corresponding LLMs (−3.15 log and −3.66 log, respectively).
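The LLM values at t = 337 days can be checked directly from the LLM T90 values reported in Table 3 (92.1 days for ENT and 107.01 days for EC). Because a log-linear model loses exactly one log10 unit every T90 days, log10(N/N0) at time t equals −t/T90. The function name below is introduced only for this sketch.

```python
def llm_log10_reduction(t_days, t90_days):
    """log10(N/N0) predicted by a log-linear model with the given T90."""
    return -t_days / t90_days

# T90 values of the log-linear models as reported in Table 3.
ent_at_337 = llm_log10_reduction(337.0, 92.1)    # about -3.66 log
ec_at_337 = llm_log10_reduction(337.0, 107.01)   # about -3.15 log
```

Both results agree with the LLM-derived values quoted above, which confirms that the constant-rate model continues to remove markers at the initial pace even where the best-fit curves flatten.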

Comparison of the Predicted T 90 and T 99 Values From the Best-Fit and Log-Linear Models
The T90 and T99 values, the times required to reduce the initial concentrations of the indicators by 90% and 99%, respectively, were calculated from the best-fit models and LLMs of the ENT and EC datasets (Table 3). The T90 and T99 values of ENT calculated from LLM (92.1 and 184.19 days, respectively) were larger than the values from JM2 (62.93 and 138.94 days, respectively). The LLM of EC predicted T90 and T99 values of 107.01 and 214.02 days, respectively, which were larger than the values predicted from BI3 (35.80 and 71.59 days, respectively). Based on the T90 values calculated from the best-fit models, the relative order of persistence of the indicators was EC < ENT, while the order of persistence based on the T90 values from the LLMs was ENT < EC.
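For a model without a closed-form inverse, the time to a given log reduction can be found numerically. The sketch below uses bisection on any non-increasing curve of log10(N/N0) and applies it to a log-linear curve of the assumed form N/N0 = exp(−k t), with k back-calculated from the reported ENT T90 of 92.1 days. The solver, its name, and the 337-day search window are choices made for this illustration, not the procedure of the model fitting tool.

```python
import math

def time_to_log_reduction(model, target_log10, t_max=337.0, tol=1e-6):
    """Bisection for the time t at which model(t) == target_log10.

    model: function returning log10(N/N0) at time t; assumed to be
           non-increasing on [0, t_max].
    Returns None when the target is not reached within t_max.
    """
    if model(t_max) > target_log10:
        return None
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model(mid) > target_log10:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_linear(k_per_day):
    """Log-linear decay, N/N0 = exp(-k t), expressed as log10(N/N0)."""
    return lambda t: -k_per_day * t / math.log(10)

# Decay constant back-calculated from the reported LLM T90 for ENT.
k_ent = math.log(10) / 92.1
ent_llm = log_linear(k_ent)

t90_ent = time_to_log_reduction(ent_llm, -1.0)   # about 92.1 days
t99_ent = time_to_log_reduction(ent_llm, -2.0)   # about 184.2 days
t9999_ent = time_to_log_reduction(ent_llm, -4.0) # None: not reached in 337 days
```

For a log-linear model T99 is always twice T90, whereas for a curve with shouldering or tailing the two values need not be proportional, as with the JM2 values for ENT (62.93 and 138.94 days).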

DISCUSSION
This is one of the first studies to use experimental data to describe the long-term persistence of naturally occurring Bacteroides thetaiotaomicron alpha-mannanase, enterococci 23S rDNA, and Escherichia coli uidA measured from microcosms containing sewage spiked river water stored at 4°C. Using linear regression analysis, the time in storage and indicator species (p < 0.001) were significantly related to the persistence of the markers. The order of persistence of the markers was BT < EC < ENT. These results indicate that the persistence of the indicators is specific to the species in a long-term evaluation of persistence. In a 28-day evaluation of the persistence of fecal indicator markers in liquid microcosms and in storage on solid surfaces, time in storage was also significantly associated with the persistence of the molecular markers in water stored at 4°C, while indicator species was not significant (Brooks et al., 2015). The findings in this study and in Brooks et al. (2015) indicate that indicator species significantly affects marker persistence over longer time frames, an effect that may not be apparent in shorter time frames such as up to 28 days.
We compared the mathematical relationships between the concentrations of indicators and time in storage (up to 337 days) from log-linear decay models with a constant decay rate to those from the best-fit models, the biphasic log-linear decay model (BI3) and the log-logistic model (JM2). Overall, the best-fit models responded well to changes in decay rates over the course of the experiment, including the modeling of effects such as shouldering and slower decay as the experiment progressed (Figures 1B,C). Similarly, a meta-analysis of the inactivation of culturable E. coli in water determined that the most popular inactivation model was biphasic log-linear decay, with an initial rapid decay followed by slower decay (Blaustein et al., 2013). Another meta-analysis of the inactivation of culturable E. coli in water observed that a large portion of the dataset was better represented by "a piecewise loglinear model" (Stocker et al., 2014). Likewise, inactivation of culturable enterococci and ENT-23 in marine water during winter for up to 8 days demonstrated an initial shouldering of the concentrations of the indicator organisms and markers (Mattioli et al., 2017).
There are a few explanations for the variation of the decay over time. The presence of BT in one technical replicate at t = 219 days could be due to a sub-population more resistant to degradation. Another possible scenario is that cells may have attached to a solid surface such as particulate matter in the water, which could enhance persistence. Previously, a simulation of water columns of brackish and freshwater estimated that the attachment of cells of enterococci to a particle can increase their persistence in the water column (Myers and Juhl, 2020). Additionally, the rapid decay of ENT and EC at the beginning of the experiment may be due to the presence of sub-populations with distinct decay rates (Rogers et al., 2011). Specifically, the survival of cultivable E. coli in microcosms of filtered estuarine water was increased in cells from the B1 phylotype (Berthe et al., 2013). DNA fingerprinting of cultivable enterococci isolated from the Lake Superior watershed indicated that sub-populations of Enterococcus sp. have differing decay rates and some populations can grow in the watershed (Ran et al., 2013). The rapid decay of the indicators could also be a result of ENT and EC exceeding the carrying capacity of the microcosm, causing rapid decay until the carrying capacity is achieved (Easton et al., 2005). This theory may explain the > 1 log variation of the concentration of EC during t = 61–306 days. The increase in the concentration of EC during t = 248–337 days could indicate a sub-population of E. coli that is able to utilize resources to grow in the microcosm to achieve a higher cell density. The above studies suggest that the evaluation of the survival or persistence of FIB in environmental waters may be better represented with water quality models that have dynamic decay rates and scenarios for possible regrowth.
In our study, we compared the T90 and T99 values calculated from the best-fit models and LLMs. Overall, the T90 and T99 values from LLMs were larger than those from the corresponding best-fit models, indicating that LLMs overestimated the persistence of ENT and EC. This overestimation could delay progress toward restoring impaired environmental waters. Additionally, the T90 values calculated from LLMs and best-fit models predicted different relative orders of persistence for the enteric markers. A previous study used LLMs to estimate that genetic markers from sewage-sourced ENT and EC seeded into marine and freshwater in-situ microcosms had T90 values that ranged 51–335 days and 21–54 days, respectively (Sagarduy et al., 2019). In our study, LLMs calculated a similar finding for EC (T90 = 107.01 days), and the T90 values of the best-fit models for ENT and EC (62.93 and 35.8 days, respectively) were within the ranges reported in the abovementioned study.
Currently, LLMs are often used to estimate the persistence and loads of fecal indicators in water quality models that forecast restoration of impaired environmental waters. Our findings indicate that sub-populations of ENT and EC are persistent in microcosms at 4°C for up to 337 days (Figures 1B,C), which provides further evidence that FIB can persist in water at temperatures ≤ 4°C (Davenport et al., 1976). The data from our study indicate that water quality models should make provisions to estimate the role of sub-populations of persistent FIB in maximum daily load assessments and utilize best-fit models with variable decay rates to better estimate the decay of enteric markers in impaired environmental waters. Such information can guide water quality models toward more realistic maximum loads of fecal indicators and improve forecasting of the time required to remediate impaired environmental waters. Such improvements in the models can allow for better protection of public health by increasing the accuracy of the estimation of pollution present in impaired watersheds.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.