Original Research ARTICLE
Predictive Modeling of a Batch Filter Mating Process
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, Canada
Quantitative characterizations of horizontal gene transfer are needed to accurately describe gene transfer processes in natural and engineered systems. A number of approaches to the quantitative description of plasmid conjugation have appeared in the literature. In this study, we seek to extend that work, motivated by the question of whether a mathematical model can accurately predict growth and conjugation dynamics in a batch process. We used flow cytometry to make time-point observations of a filter-associated mating between two E. coli strains, and fit ordinary differential equation models to the data. A model comparison analysis identified the model formulation that is best supported by the data. Identifiability analysis revealed that the parameters were estimated with acceptable accuracy. The predictive power of the model was assessed by comparison with test data that demanded extrapolation from the training experiments. This study represents the first attempt to assess the quality of model predictions for plasmid conjugation. Our successful application of this approach lays a foundation for predictive modeling that can be used both in the study of natural plasmid transmission and in model-based design of engineering approaches that employ conjugation, such as plasmid-mediated bioaugmentation.
Horizontal gene transfer by plasmid conjugation can result in rapid change in bacterial populations, as exemplified by the increased prevalence of plasmid-borne antibiotic resistance genes in response to the widespread use of antibiotics post-World War II (Davies and Davies, 2010). This capacity for rapid spread of genetic elements could be harnessed as a tool, for instance in the design of cellular computing platforms (Goñi-Moreno et al., 2013) or the modification of environmental bacterial populations through plasmid-mediated bioaugmentation (Top et al., 2002).
Efforts to quantitatively characterize conjugation processes have focused on describing the transfer rate (or “fertility”) of specific plasmids in specific conditions. Early descriptions of transfer rates consisted of reports of the ratio of plasmid-donating to plasmid-receiving cells (Watanabe, 1963; Curtiss et al., 1969) or of threshold populations below which conjugation was not observed (Anderson, 1975). A systematic approach to describing transfer rates was presented by Levin et al. (1979), who developed a population-based ordinary differential equation (ODE) model to capture the temporal dynamics of conjugation in suspended batch cultures. Follow-up work (Simonsen et al., 1990) presented a simple “end-point method” to estimate a conjugation rate from a single time-point measurement.
Subsequent projects made use of this end-point method for estimating conjugation efficiency in both suspended and attached cultures (Gordon, 1992; Duncan et al., 1995; Normander et al., 1998; Licht et al., 1999; Lilley and Bailey, 2002). The ordinary differential equation model of Levin et al. (1979) employs mass-action kinetics, which are based on an assumption of spatial homogeneity. The utility of this model for describing conjugation in spatially heterogeneous conditions was defended by Simonsen (1990), whose analysis suggested that the mass-action based model accurately captures the dynamics of attached cultures inoculated from well-mixed suspensions provided the cell density is sufficiently high.
The model developed by Levin et al. (1979) was calibrated by just two parameters: a growth rate (shared by all subpopulations) and a mass-action conjugation rate. Over the years, a number of model variants have been explored, including the following. Simonsen et al. (1990) considered allowing for (i) transition to statonary phase, (ii) segregative loss, and (iii) transitory derepression of conjugation from newly formed transconjugants (zygotic induction), but found that, for the systems they were considering, none of these features significantly improved the accuracy of their end-point estimate of transmission rate. Lundquist and Levin (1986) considered the specific effects of transitory derepression in the context of plasmid maintenance. Freter et al. (1983) developed a model to describe the behavior of cultures in continuous flow reactors (as a proxy for the mammalian gut). Clewlow et al. (1990) fit a model that incorporated Verhulst logistic growth limitations in an attempt to describe populations growing in soil microcosms. Massoudieh et al. (2010) used a partial differential equation approach to address bacterial activity on granular porous media. A stochastic differential equation approach was described by Philipsen et al. (2010).
Recently, attempts to characterize conjugation dynamics have focused on capturing the spatio-temporal dynamics of the process using individual-based models (Krone et al., 2007; Merkey et al., 2011; García and Rodríguez-Patón, 2015; Goñi-Moreno and Amos, 2015). Using simulations of an individual-based model, Zhong et al. (2012) analyzed the potential accuracy of the ODE-based end-point method (and other non-spatial metrics) for estimating transfer efficiency. They found that, as expected, spatial effects can play a major role in conjugation, but that, for sufficiently dense populations, surface-associated conjugation dynamics can be reasonably well described by non-spatial models.
As highlighted by Sørensen et al. (2005), mathematical modeling approaches are crucial for characterizing and ultimately manipulating horizontal gene transfer in situ. As described above, mathematical characterizations of conjugation processes have focused on quantifying the rate of plasmid transfer. In this study, we pursued the more general goal of accurately characterizing all significant aspects of a proof-of-principle batch conjugation process involving two E. coli strains. This project was carried out toward the long-term goal of using such models in the engineering of systems involving plasmid transfer. Consequently, our primary aim was to confirm whether an ODE model of the process could be accurately calibrated and thus provide confident predictions of system behavior. We used model comparison and uncertainty analysis to arrive at a model formulation that is well-supported by our data, and tested our final calibration against experiments that demand extrapolation from the training data. (Similar model assessment techniques were employed in Philipsen et al. (2010), but the goal in that study was to accurately estimate the noise model.) Our approach is facilitated by the use of flow cytometry to assay culture subpopulations. As confirmed by del Campo et al. (2012), cytometric measurements agree with the more traditional colony counting approach, and provide improved precision without loss of accuracy.
2. Materials and Methods
2.1. Laboratory Experiments
2.1.1. Strains, Plamids, and Media
All experiments were carried out using Escherichia coli strains DH5α and CSH26. A CSH26 strain bearing the self-conjugative plasmid pKJK10, which harbors tetracycline, streptomycin, kanamycin resistance and P(A1-04/03)::gfpmut3b genes, was a gift from the Sørensen lab (Sengeløv et al., 2001). A DH5α strain bearing plasmid pSB1C3, which carries chloramphenicol resistance and Plac::mRFP genes, was a gift from the University of Waterloo iGEM team (Biobrick ID J04450; parts.igem.org). Cultures were grown in Luria-Bertani (LB) broth or on LB agar at 37°C. Selective media was prepared with antibiotics in the following concentrations: chloramphenicol (Cm) 10 μg/ml; tetracycline (Tet) 10μg/ml. Phosphate buffered saline (PBS) was prepared as a 10-fold dilution of stock consisting of 40 g NaCl, 1 g KCl, 5.7 g Na2HPO4, and 1 g NaH2PO4 in 500 ml MilliQ water.
2.1.2. Conjugation Assays
Donor (CSH26) and recipient (DH5α) strains were inoculated in 5 ml of LB supplemented with appropriate antibiotic (DH5α in Cm, CSH26 in Tet) and grown overnight at 37°C at 250 rpm. To set up the experiment, 1 ml each of donor and recipient overnight cultures were centrifuged at 13,000 rpm for 1 min, decanted and resuspended in 1 ml of fresh LB broth (non-selective). Then, 400 μl of each resuspension was inoculated in 9.6 ml of fresh LB and grown in a 250 rpm shaking incubator at 37°C to an optical density (OD600) of 0.5 (Genesys 20 Visible Spectrophotometer, Thermo Fisher Scientific).
Filter matings were carried out in 47 mm diameter petri plates (Fisher Scientific) containing 1.5% LB agar upon which were laid 47 mm diameter 0.4 μm polycarbonate filters (Fisher Scientific). Eight filter mating experiments were carried out, numbered 1 through 8 below. In each case, donor and recipient suspensions were diluted to achieve desired cell densities prior to loading.
For experiments #1–4 the donor and recipient cultures were loaded nearly simultaneously. To begin, 200 μl of recipient cell suspension were manually spread evenly on each filter. The filters were then dried for 1 min before loading and spreading 200 μl of donor cell suspension. Plates were then kept at 37°C until the desired time-point (specified below). Experiments #5–8 followed a staggered loading protocol; the procedure was identical to experiments #1–4 except that the second suspension (donor or recipient as specified in the caption of Figure 3) was loaded 120 min after the first suspension was loaded.
To begin each time-point measurement, the filters were removed from the plates and suspended in PBS. Suspension volumes were chosen to account for increasing cell populations over time, at ten time-points as follows [Time (min), PBS Volume (ml)]: (0, 2), (40, 2), (80, 2), (120, 2), (160, 2), (200, 5), (240, 10), (280, 20), (320, 25), (360, 30). Suspended filters were vortexed for 2 min after which the filter was removed from the suspension. The OD600 of the PBS-suspended culture was recorded and the samples were analyzed by flow cytometry, as described next. Observations were made in triplicate (biological replicates—separate plates) at each time-point.
2.1.3. Flow Cytometric Analysis
Analysis was carried out on an Amnis Imagestream MkII flow cytometer (EMD Millipore). Data acquisition was implemented with the Amnis Inspire software package (EMD millipore). All samples were gated against gradient root mean square in the brightfield channel (labeled as gate R0), accepting images with scores between 50 and 80. Fluorescence was excited by a 488 nm solid state laser at 45 mW intensity. Green fluorescence was detected by a 533/55 nm band pass filter. Red fluorescence was detected at 610/30 nm. As a negative control, separate CSH26 and DH5α cultures (expressing GFP+/RFP- and GFP-/RFP+, respectively) were analyzed. For each, 10,000 R0 samples were collected; the results were used to construct a compensation matrix with the Amnis IDEAS software package (EMD Millipore). Gates for the three subpopulations of interest were then defined by setting thresholds for compensated non-fluorescence in the red and green channels, as shown in Figure 1. Cells were taken to be RFP- (gate R2) if the red fluorescence intensity was below 256.63. RFP+ cells were considered GFP- if the green fluorescence intensity was below 1,036.2 (gate R1). At each experimental time point, for each replicate, 20,000 R0 events were collected and sorted into the three subpopulations based on the threshold gating.
Figure 1. Populations and flow cytometry gates. (A) Three populations were involved in the experiments: donor CSH26::pKJK10 cells (GFP+), recipient DH5α::pSB1C3 cells (RFP+), and transconjugant DH5α cells harboring both plasmids (RFP+/GFP+). (B) Single-population controls were used to establish green and red fluorescence thresholds for identifying plasmid-bearing cells (regions R2 and R1, respectively). During the mating experiments, a population of RFP+/GFP+ transconjugants was established (region R3). The data is compensated (see Methods) and is plotted on a bi-exponential (logicle) scale (Tung et al., 2007), linear in the range [−1000, 1000].
2.1.4. Estimating Population Sizes
For each measurement, the flow cytometry results indicate the population distribution among three subpopulations: donors (GFP+/RFP−), recipients (GFP−/RFP+), and transconjugants (GFP+/RFP+). Together with the OD600 readings, these give relative measures of population size which were used to calibrate and analyse the model. To estimate corresponding absolute population sizes, we made use of the rule-of-thumb scaling for E. coli of 1.0 OD600 to 8 × 108 cells per ml in LB broth (www.genomics.agilent.com/biocalculators/calcODBacterial.jsp, see also Milo and Phillips, 2015). To do so, we determined a scaling between OD600 in PBS and OD600 in LB. Using cultures from preliminary filter mating experiments with OD600 in PBS ranging from 0.073 to 1.20, we resuspended and measured OD600 in LB. Thirty-six samples (data not shown) were fit, resulting in the relation (OD600[LB]) = 0.869 × (OD600[PBS]) - 0.0057, with R2 of 0.995. Finally, to estimate populations on the filters, we assayed the efficiency of cell recovery by loading filters as described above and then comparing the OD600 of the recovered population with the original suspension; these tests (data not shown) indicated a recovery efficiency of about 50%. Consequently, to estimate the cell counts on the filter, we multiplied our cell counts in suspension by a factor of two after accounting for the corresponding suspension volume to yield a cell count per ml. Finally, we divided by the filter area (17.35 cm2) to arrive at a measure of cell density (cells/cm2).
2.2. Mathematical Modeling
2.2.1. Simulation and Calibration
Models were simulated with the ode23s numerical integration routine in MATLAB (Mathworks). As described above, triplicate observations were made of the three culture subpopulations over a range of time points. For each model variant considered, and for each parameterization p, model predictions were compared against the training data (experiments #1–4) via the weighted sum of squares function:
where is the mean of the replicate measurements of subpopulation i at time tk, is the standard deviation of those measurements, and is the model prediction of subpopulation i at time tk. Here i runs over the four training experiments (#1–4) and the three subpopulations (donors, recipients, transconjugants), while tk runs over the time-points: 0, 40, 80, 120, 160, 200, 240, 280, 320, 360 min (from initial loading of the filter).
For each model variant, the SSE was minimized by application of global optimization routines (simulated annealing, MATLAB function simannealbnd, and interior point algorithm, MATLAB function fmincon). Multiple starting points were employed to improve the chance of finding the global minimum. Parameter values were bound to the range [0, 500]. The initial values of the donor and recipient populations were treated as free parameters, to avoid over-weighting the measurements at these time-points. Transconjugant populations were initialized to zero.
2.2.2. Model Comparison
We used the Akaike Information Criterion (AIC) (Akaike, 1974; Burnham et al., 2011) to assess the suitability of each model. AIC describes a trade-off between the quality of the fit and the number of degrees of freedom granted by the model formulation. The corrected form of AIC was used to guard against overfitting:
where m is the number of model parameters (kinetic parameters plus eight free initial conditions: donors and recipients in each of four experiments), n = 116 is the number of observations (four experiments, ten time-points, three populations, less the four initial transconjugant populations assumed to be zero), and SSE is the minimal weighted sum of squared errors (Equation 1). Presuming the errors are independent and normally distributed, the AICc provides a relative measure of the strength of evidence for each model; smaller AICc values correspond to increased model suitability.
2.2.3. Uncertainty Analysis
Uncertainty analysis was applied to gauge confidence in the best-fit estimates of the kinetic parameters. The initial conditions were not included in this analysis, as model-based predictions will not rely on estimates of initial conditions.
184.108.40.206. Sensitivity coefficients
Local absolute sensitivity coefficients were defined as:
where is the i-th model output at time-point tk and pj is the j-th parameter. These derivatives were approximated by finite differences of 1% in pj. Absolute sensitivity coefficients were scaled to generate dimensionless relative sensitivity measures:
The overall sensitivity measure:
where i runs over all subpopulations and experiments, and tk runs over all time-points, describes the degree to which each model parameter pj influences the model outputs.
220.127.116.11. Identifiability scores
To account for correlation among the parameters, the orthogonalization approach of Yao et al. (2003) was applied to arrive at practical measures of identifiability, as follows. A sensitivity coefficient matrix was constructed by arranging the relative sensitivity coefficients for each parameter column-wise:
where there are np parameters, no observables (running over experiments and subpopulations), and nT time-points. The column with the largest 2-norm (square root of sum of squared entries) is labeled X1. The corresponding parameter is judged the most identifiable; its identifiability score is the 2-norm of the corresponding column. Each column of is then projected onto X1, and the residuals are collected in matrix R2:
The column of R2 with the largest 2-norm corresponds to the next most identifiable parameter. The matrix X1 is then concatenated with the column of that corresponds to that parameter, to form matrix X2. The residuals of the projection of onto X2 are then determined (organized into matrix R3), and the third most identifiable parameter is identified. This process is iterated to provide an identifiability score for each parameter (i.e., the 2-norm of the corresponding column vector of matrix Rj).
18.104.22.168. Confidence intervals
Two complementary approaches were applied. Both begin with construction of a matrix S as in Equation (5) but using the absolute (unscaled) sensitivity coefficients Si,j (Equation 3) as entries. Assuming the experimental errors are independent and normally distributed, the least squares error in Equation (1) can be used to provide a lower bound on the radius of the 95% confidence interval for parameter pj of Ashyraliyev et al. (2009):
where is the parameter estimate (i.e., the minimizer of SSE(p)), m is the number of parameters, n = 108 is the number of observations (four experiments, three subpopulations, nine non-initial time-points), F0.05(m,n − m) is the value at 0.95 of the inverse of the cumulative distribution function for the Fisher's distribution with m and n − m degrees of freedom, and (·)jj is the jj-th matrix element. These results are reported as relative estimates () in the following section.
where W is the inverse of the measurement covariance matrix. Then assuming the measurement errors are independent and normally distributed, a lower bound on the radius of the 95% confidence interval for parameter pj is given by:
Given the samples sizes available in this study (triplicate observations), we could not estimate the full measurement covariance matrix. The measurement variances (diagonal entries in W) were calculated for each observation; the covariances (off-diagonal terms) were set to zero. Again, the relative estimate is reported.
3.1. Model Development and Model Comparison
Model development began with the three-state model of Levin et al. (1979) that describes dynamics of a recipient population (R), a donor population (D), and a transconjugant population (T). The model dynamics depend on the following assumptions: (1) the conjugation rate is jointly proportional to the abundance of plasmid-bearing and plasmid-free cells, (2) plasmid loss by segregation is negligible, (3) newly formed transconjugants can donate plasmids immediately, (4) the original donors and the transconjugants transmit plasmids at the same rate, (5) all subpopulations grow at the same exponential rate. The model takes the form:
where ψ is the growth rate and γ is the conjugation rate. This model captures the essential features of conjugation dynamics, but a more complex model might be needed for accurate prediction of mating dynamics in specific conditions; in particular, we wanted to allow for distinct growth rates and growth profiles for all subpopulations. Consequently, we considered model extensions that incorporate (i) growth lag, as in Baranyi et al. (1993), (ii) transition to stationary phase, as in Simonsen et al. (1990), and (iii) distinct transmission and growth kinetics for each subpopulation. The most general model formulation we considered involves the three subpopulations (measured in cells/cm2) and the fractional abundance of a limiting resource C (dimensionless measure, with initial value one):
where, for each index i = D, R, T, the parameter ei is a measure of resource depletion (cell−1 cm2), and the growth rates ψi and plasmid transfer rates γj (j = D, T) are given by:
with ψi,max (min−1) and γj,max (min−1 cell−1 cm2) the maximal growth and transfer rates, respectively, KG,i and KT,j the resource levels at which growth and transfer are half maximal (dimensionless Monod constants), KL,i the time at which the growth rate is half maximal (min), and the Hill coefficient ni characterizing the abruptness of the end of lag phase (dimensionless).
This fully general model involves nineteen kinetic parameters, compared to just two for the original model of Levin et al. (1979) (Equation 9). We expected that a suitable model could be found between these two extremes. To identify the most suitable formulation, we applied the Akaike information criterion, as described in the Methods section, to a range of model variants. Each model variant was fit against the training data as described in the Methods section (116 triplicate data points collected from experiments #1–4). For each experiment, the initial abundance of recipient and donor subpopulations were treated as free variables, to avoid overweighting the measurements at these time-points. Thus, each model fit addressed these eight free variables in addition to the free kinetic parameters. The model comparison results are shown in Table 1; the model variants are described below and in the Table caption.
The full model (Equation 10), labeled as model variant 0 (V0), resulted, of course, in the best fit (lowest SSE), but at the cost of a high degree of parameterization, as indicated by a high AICc value. An exhaustive comparison of all model formulations (which, through combinatorial combination would number in the thousands) was not feasible. Instead, we carried out a strategic comparison of a targeted set of model variants, as follows.
Our first steps toward model reduction were motivated by the parameter fits. Estimates of ni were close to one, while those of KG,i and KT,j were large relative to one. Eliminating these parameters led to models V1 and V2, with a modest reduction in quality of fit but a significant improvement in AICc score. Next, our expectations of system behavior led us to consider model V3, in which the resource-depletion rates of recipients and transconjugants are constrained to be equal, and V4, in which all three resource-depletion rates are equal. Likewise, models V5 and V6 incorporate assumptions of shared growth lag. Next, model V7 was reached by constraining the transmission rates of donors and transconjugants to be equal (reflecting an absence of transitory derepression); this simplification was motivated by similarity in the values of the corresponding parameter estimates.
As expected, each reduction in model complexity corresponds to a reduction in the quality of fit (i.e., larger SSE). The AICc values in Table 1 indicate that, for some simplifications, the loss of accuracy was compensated by reduction in degrees of freedom; for others, it was not. In particular, model V7 has the lowest AICc value of the first eight variants.
Models V8 through V12 represent further reductions of model V7: constraining all resource-depletion rates to be identical (V8); constraining all lag times to be zero (V9); constraining pairs of growth rates to be identical (V10, V11, V12). In each case the AICc indicates that the loss in accuracy is not compensated by the reduction in model complexity.
At this point, we did not expect to achieve an improved AICc value by further model reduction. Nevertheless, for completeness, we carried out comparisons with the model of Simonsen et al. (1990) (V13, with four kinetic parameters—common growth, transmission, resource-depletion rates and Monod constant) and the model of Levin et al. (1979) (V14, with two kinetic parameters, Equation 9). The corresponding AICc values indicate that these models are less suitable than model V7.
A final variation came after a preliminary identifiability analysis indicated that, contrary to our expectations, the transconjugant growth rate could not be confidently distinguished from zero. (Indeed the estimated value of this parameter was orders of magnitude smaller than the estimated growth rates for the other two populations.) Model V15 resulted from fixing the transconjugant growth rate to zero in model V7.
The most suitable model formulation (V15) is:
with six kinetic parameters: ψR,max, ψD,max, KL, γmax, eR and eD. The model fits against experiments #1–4 are shown in Figure 2 (the full dataset is provided in Supplementary File S1); best-fit kinetic parameter values are reported in the next section.
Figure 2. Model fits. Data points correspond to time-point flow cytometric and OD600 readings as described in Methods. Error bars correspond to standard deviation of triplicate observations. Curves are best-fit model simulations. Best-fit initial conditions (R0, D0) in units of cells/cm−2 are: Exp. #1: (1.96 × 106; 1.77 × 106); Exp. #2: (8.05 × 106; 8.42 × 106); Exp. #3: (4.02 × 106; 2.31 × 106); Exp. #4: (6.35 × 105; 5.57 × 106). Initial transconjugant populations are taken as zero.
3.2. Model Assessment
3.2.1. Uncertainty Analysis
Uncertainty analysis was carried out on model V15 (equation 11) as described in the Methods section. The initial conditions were not considered free parameters in this analysis, because we are not concerned with accurately estimating the initial conditions when using the model to predict behavior. To determine the sensitivity coefficients, simulations were run from the best-fit initial conditions (reported in the caption of Figure 2). The results of the uncertainty analysis are presented in Table 2.
3.2.2. Comparison to Test Data
Figure 3 shows the results of using model V15 (Equation 11) to predict the results of the staggered loading experiments #5–8, in which one of the two subpopulations was loaded 120 min after the first (the full dataset is provided in Supplementary File S1). For the simulations, the initial donor and recipient populations were set to the observed values. The transconjugant population was set at zero up to time 120.
Figure 3. Model predictions. Data points correspond to time-point flow cytometric and OD600 readings as described in the Methods Section. Error bars correspond to standard deviations of triplicate observations. Curves are simulations of model (11). Initial conditions correspond to mean observations, as follows (in units of cells/cm−2): Exp. #5: R(0) = 5.92 × 106, D(120) = 3.51 × 107; Exp. #6: R(0) = 5.24 × 106, D(120) = 6.71 × 105; Exp. #7: D(0) = 5.51 × 106, R(120) = 1.00 × 107; Exp. #8: D(0) = 4.34 × 105; R(120) = 8.07 × 106. Simulations of delayed loading incorporated the delay into the growth lag (i.e., time t was replaced by (t − 120) for the populations that were loaded at t = 120).
4.1. Model Development
By comparing model variants by the Akaike information criterion (rather than relying on expectations of system behavior) we arrived at an unbiased selection of the model formulation that is best supported by the experimental results. AIC-based model comparison is technically only justified when the errors are independent and normally distributed. That condition is only approximately satisfied in this case. (The errors, not shown, are symmetric, but are more heavy-tailed then a normal distribution.) Nevertheless, the AIC provides a useful guide to model comparison. Notably, it led to a model for which (i) there is no transitory derepression of the conjugative machinery, and (ii) the newly-formed transconjugants exhibit negligible growth over the experimental time-frame. These findings represent hypotheses regarding the plasmid-recipient pair under investigation. The Sørensen lab has carried out a number of studies addressing conjugation of pKJK10's parental plasmid pKJK5 (e.g., Bahl et al., 2007a,b), but the specific dynamic features represented by the model have not been previously investigated.
The estimated growth kinetics in Table 2 must be interpreted carefully. Considered individually, the estimates correspond to unreasonable maximal doubling times of only 18 min for donors, 12 min for recipients, and a long lag phase—half maximal growth reached after 145 min. These best-fit estimates cannot be interpreted in isolation; they provide accurate descriptions of behavior in the specific context of a rapidly-depleting limiting resource. (In all simulations the growth rates were always well below their maximal values). Alternative experimental conditions (e.g., fed-batch) could allow the lag and maximal growth rates to be independently assessed. (The model variants that do not include lag or resource limitation provide some indication of what might be expected; e.g., for model V14, Equation 9) the common doubling time is estimated to be about 50 min).
4.2. Model Assessment
The overall sensitivity measure in Table 2 is a relative score indicating the degree to which the observations are influenced by each parameter; small scores indicate that the available data may be insufficient to accurately constrain parameter estimates. The corresponding identifiability scores discount the sensitivity measures by accounting for correlation between parameteric effects. Low scores indicate low potential for confidently identifying parameter values. In Yao et al. (2003), an identifiability cut-off value of 0.04 was recommended. All model parameters are identifiable by that criterion. (When model V7 was analyzed, the transconjugant growth rate was not above this identifiability threshold, and so it was eliminated to yield model V15.)
The last two columns of Table 2 contain estimates of 95% confidence intervals on the parameter estimates. These results, like the AIC-based model comparison, depend on an assumption that the errors are independent and normally distributed, which holds only approximately. The FIM-based estimate is derived under an assumption that the model itself is accurate (so that experimental error is the only source of variance). The complementary nonlinear regression approach incorporates model mismatch into its estimate. (A complementary upper bound calculation (Ashyraliyev et al., 2009) was not informative as it yielded estimates uniformally larger than ±100%.) While these results clearly indicate there is room for improvement, they suggest that there is strong potential for accurate estimation of the system kinetics with sufficient data.
Finally, Figure 3 demonstrates that the model can provide reasonably accurate predictions of system behavior. To reiterate, the data from the test experiments (#5–8) was not used for model fitting or model development. Moreover, the protocol followed for experiments #5–8 was dynamically distinct from the training experiments (#1–4). The training data (Figure 2) was collected from simultaneous loading experiments. The testing data (Figure 3) was collected from staggered loading experiments. Figure 3 thus demonstrates the model's ability to extrapolate to distinct dynamic regimes. The fits are not perfect, the worst prediction being the donor population in experiment #6 (perhaps as a result of the donor-recipient ratio at loading being too far beyond those of the training data). Nevertheless, the overall quality of the predictions speaks to the potential of the chosen model variant to predict system behavior.
4.3. Alternative Modeling Frameworks
Of course, any ODE-based model cannot be expected to provide highly accurate descriptions of surface-associated culture behavior, as spatial heterogenity is bound to have an effect on system dynamics. While the success of our approach indicates that such spatial features are not crucial to predicting behavior in the high-density mixed-distribution cultures observed in this study, spatial aspects will no doubt dominate in more heterogeneous environments or when subpopulations are not evenly distributed. Spatially explicit models offer appropriate frameworks in these cases: partial differential equation (PDE) models offer valuable descriptions of continuous spatial distributions, e.g., Massoudieh et al. (2010), while individual-based models provide increased resolution by describing the population in terms of individual cells (Kreft et al., 2013). Simple individual-based models represent cells as occupying positions in a regular lattice, e.g., Krone et al. (2007) and García and Rodríguez-Patón (2015), while more complex models, such as the DiSCUS model developed by Goñi-Moreno and Amos (2015) capture cell morphology. Related individual-based modeling frameworks, presented in Lardon et al. (2011) and Rudge et al. (2012), focus on biofilm formation but do not currently describe gene transfer. Importantly, these modeling frameworks have the capacity to capture inevitable variability within populations. Continued model developments will facilitate investigations of natural microbial populations (Marino et al., 2014; Widder et al., 2016) and the design of synthetic microbial ecologies (Zomorrodi and Segrè, 2016).
In this study we expanded on previous modeling efforts by assessing the quality of parameter estimates and the accuracy of extrapolative predictions. This project represents a step toward a modeling framework that can be used for confident prediction of the behavior of engineered microbial communities, such as those employed in the conjugation-based cellular computing systems described in Goñi-Moreno et al. (2013). In particular, plasmid-mediated bioaugmentation holds promise in a range of applications, including bioremediation (Top et al., 2002) and suppression of antibiotic resistance (Baquero et al., 2011; Gooding-Townsend et al., 2015) (see also Yosef et al., 2015). The success of such efforts will depend upon the development of techniques for accurate model-based design in cellular bioengineering.
Conceived and designed the study: BI, AM. Performed the experiments: AM, PS, AN. Analyzed the results: AM, AN, BI. Wrote and revised the manuscript: AM, AN, PS, BI.
This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Canada Foundation for Innovation (CFI) and a MITACS Globalink Fellowship to AM.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/article/10.3389/fmicb.2017.00461/full#supplementary-material
Ashyraliyev, M., Fomekong-Nanfack, Y., Kaandorp, J. A., and Blom, J. G. (2009). Systems biology: parameter estimation for biochemical models. FEBS J. 276, 886–902. doi: 10.1111/j.1742-4658.2008.06844.x
Bahl, M. I., Hansen, L. H., Licht, T. R., and Sørensen, S. J. (2007a). Conjugative transfer facilitates stable maintenance of incp-1 plasmid pkjk5 in Escherichia coli cells colonizing the gastrointestinal tract of the germfree rat. Appl. Environ. Microbiol. 73, 341–343. doi: 10.1128/AEM.01971-06
Bahl, M. I., Hansen, L. H., and Sørensen, S. J. (2007b). Impact of conjugal transfer on the stability of incp-1 plasmid pkjk5 in bacterial populations. FEMS Microbiol. Lett. 266, 250–256. doi: 10.1111/j.1574-6968.2006.00536.x
Baquero, F., Coque, T. M., and de la Cruz, F. (2011). Ecology and evolution as targets: the need for novel eco-evo drugs and strategies to fight antibiotic resistance. Antimicrob. Agents Chemother. 55, 3649–3660. doi: 10.1128/AAC.00013-11
Burnham, K. P., Anderson, D. R., and Huyvaert, K. P. (2011). AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behav. Ecol. Sociobiol. 65, 23–35. doi: 10.1007/s00265-010-1029-6
García, A. P., and Rodríguez-Patón, A. (2015). “A preliminary assessment of three strategies for the agent-based modeling of bacterial conjugation,” in 9th International Conference on Practical Applications of Computational Biology and Bioinformatics. Advances in Intelligent Systems and Computing, Vol. 375, eds R. Overbeek, M. Rocha, F. Fdez-Riverola, and J. De Paz (Cham: Springer). Available online at: http://link.springer.com/chapter/10.1007/978-3-319-19776-0_1
Goñi-Moreno, A., and Amos, M. (2015). “DiSCUS: a simulation platform for conjugation computing,” in Unconventional Computation and Natural Computation. UCNC 2015. Lecture Notes in Computer Science, Vol. 9252, eds C. Calude and M. Dinneen (Cham: Springer). Available online at: http://link.springer.com/chapter/10.1007/978-3-319-21819-9_13
Kreft, J., Plugge, C. M., Grimm, V., Prats, C., Leveau, J. H., Banitz, T., et al. (2013). Mighty small: observing and modeling individual microbes becomes big science. Proc. Natl. Acad. Sci. U.S.A. 110, 18027–18028. doi: 10.1073/pnas.1317472110
Lardon, L. A., Merkey, B. V., Martins, S., Dötsch, A., Picioreanu, C., Kreft, J., et al. (2011). iDynoMiCS: next-generation individual-based modelling of biofilms. Environ. Microbiol. 13, 2416–2434. doi: 10.1111/j.1462-2920.2011.02414.x
Licht, T. R., Christensen, B. B., Krogfelt, K. A., and Molin, S. (1999). Plasmid transfer in the animal intestine and other dynamic bacterial populations: the role of community structure and environment. Microbiology 145, 2615–2622. doi: 10.1099/00221287-145-9-2615
Marino, S., Baxter, N. T., Huffnagle, G. B., Petrosino, J. F., and Schloss, P. D. (2014). Mathematical modeling of primary succession of murine intestinal microbiota. Proc. Natl. Acad. Sci. U.S.A. 111, 439–444. doi: 10.1073/pnas.1311322111
Massoudieh, A., Crain, C., Lambertini, E., Nelson, K., Barkouki, T., L'amoreaux, P., et al. (2010). Kinetics of conjugative gene transfer on surfaces in granular porous media. J. Contam. Hydrol. 112, 91–102. doi: 10.1016/j.jconhyd.2009.10.009
Merkey, B. V., Lardon, L. A., Seoane, J. M., Kreft, J., and Smets, B. F. (2011). Growth dependence of conjugation explains limited plasmid invasion in biofilms: an individual-based modelling study. Environ. Microbiol. 13, 2435–2452. doi: 10.1111/j.1462-2920.2011.02535.x
Normander, B., Christensen, B. B., Molin, S., and Kroer, N. (1998). Effect of bacterial distribution and activity on conjugal gene transfer on the phylloplane of the bush bean (Phaseolus vulgaris). Appl. Environ. Microbiol. 64, 1902–1909.
Sengeløv, G., Kristensen, K. J., Sørensen, A. H., Kroer, N., and Sørensen, S. J. (2001). Effect of genomic location on horizontal transfer of a recombinant gene cassette between Pseudomonas strains in the rhizosphere and spermosphere of barley seedlings. Curr. Microbiol. 42, 160–167. doi: 10.1007/s002840010197
Sørensen, S. J., Bailey, M., Hansen, L. H., Kroer, N., and Wuertz, S. (2005). Studying plasmid horizontal transfer in situ: a critical review. Nat. Rev. Microbiol. 3, 700–710. doi: 10.1038/nrmicro1232
Top, E. M., Springael, D., and Boon, N. (2002). Catabolic mobile genetic elements and their potential use in bioaugmentation of polluted soils and waters. FEMS Microbiol. Ecol. 42, 199–208. doi: 10.1111/j.1574-6941.2002.tb01009.x
Tung, J. W., Heydari, K., Tirouvanziam, R., Sahaf, B., Parks, D. R., Herzenberg, L. A., et al. (2007). Modern flow cytometry: a practical approach. Clin. Lab. Med. 27, 453–468. doi: 10.1016/j.cll.2007.05.001
Widder, S., Allen, R. J., Pfeiffer, T., Curtis, T. P., Wiuf, C., Sloan, W. T., et al. (2016). Challenges in microbial ecology: building predictive understanding of community function and dynamics. ISME J. 10, 2557–2568. doi: 10.1038/ismej.2016.45
Yao, K. Z., Shaw, B. M., Kou, B., McAuley, K. B., and Bacon, D. (2003). Modeling ethylene/butene copolymerization with multi-site catalysts: parameter estimability and experimental design. Polym. React. Eng. 11, 563–588. doi: 10.1081/PRE-120024426
Yosef, I., Manor, M., Kiro, R., and Qimron, U. (2015). Temperate and lytic bacteriophages programmed to sensitize and kill antibiotic-resistant bacteria. Proc. Natl. Acad. Sci. U.S.A. 112, 7267–7272. doi: 10.1073/pnas.1500107112
Zhong, X., Droesch, J., Fox, R., Top, E. M., and Krone, S. M. (2012). On the meaning and estimation of plasmid transfer rates for surface-associated and well-mixed bacterial populations. J. Theor. Biol. 294, 144–152. doi: 10.1016/j.jtbi.2011.10.034
Keywords: mathematical modeling, horizontal gene transfer, plasmid conjugation, uncertainty analysis, batch processing, model comparison, sensitivity analysis, identifiability
Citation: Malwade A, Nguyen A, Sadat-Mousavi P and Ingalls BP (2017) Predictive Modeling of a Batch Filter Mating Process. Front. Microbiol. 8:461. doi: 10.3389/fmicb.2017.00461
Received: 01 December 2016; Accepted: 06 March 2017;
Published: 21 March 2017.
Edited by:Dimitrios Georgios Karpouzas, University of Thessaly, Greece
Reviewed by:Venu Kamarthapu, New York University Medical Center, USA
Spyridon Ntougias, Democritus University of Thrace, Greece
Copyright © 2017 Malwade, Nguyen, Sadat-Mousavi and Ingalls. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Brian P. Ingalls, email@example.com