# Quantitative Examination of Five Stochastic Cell-Cycle and Cell-Size Control Models for *Escherichia coli* and *Bacillus subtilis*

^{1}Theory Group, Chan Zuckerberg Biohub, San Francisco, CA, United States

^{2}Department of Physics, University of California, San Diego, San Diego, CA, United States

^{3}Division of Biology and Biological Engineering, Broad Center, Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA, United States

^{4}Section of Molecular Biology, Division of Biology, University of California, San Diego, San Diego, CA, United States

We examine five quantitative models of cell-cycle and cell-size control in *Escherichia coli* and *Bacillus subtilis* that have been proposed over the last decade to explain single-cell experimental data generated with high-throughput methods. After presenting the statistical properties of these models, we test their predictions against experimental data. Based on simple calculations of the defining correlations in each model, we first dismiss the stochastic Helmstetter-Cooper (sHC) model and the Initiation Adder (IA) model, and show that both the Replication Double Adder (RDA) and the Independent Double Adder (IDA) models are more consistent with the data than the other models. We then apply a recently proposed statistical analysis method and find that the IDA model is the most likely model of the cell cycle. By showing that the RDA model is fundamentally inconsistent with size convergence by the adder principle, we conclude that the IDA model is most consistent with the data and the biology of bacterial cell-cycle and cell-size control. Mechanistically, the IDA model is equivalent to two biological principles: (i) balanced biosynthesis of the cell-cycle proteins, and (ii) their accumulation to a respective threshold number to trigger initiation and division.

## Introduction

Quantitative microbial physiology has been marked by close interactions between experiment and modeling since its birth in the mid twentieth century (see Jun et al., 2018 for a review of the history with extensive literature). In particular, bacterial cell-size and cell-cycle control has attracted renewed modeling interest with the advent of microfluidics techniques that allow tracking of thousands of individual cells over a hundred division cycles (see, for example, Wang et al., 2010; Moffitt et al., 2012; Long et al., 2013; Vashistha et al., 2021). The adder principle has re-emerged from the new single-cell data (Campos et al., 2014; Jun and Taheri-Araghi, 2015; Taheri-Araghi et al., 2015). It states that individual cells grow by adding a fixed size from birth to division, independently of their size at birth. This principle has characteristic repercussions on cell size homeostasis. Specifically, upon perturbation, the cell size at birth relaxes toward its steady-state value according to a first-order recurrence relation with a correlation coefficient equal to 1/2 (Voorn et al., 1993; Amir, 2014; Taheri-Araghi et al., 2015).
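A minimal numerical sketch of this relaxation is given below. It assumes perfectly symmetric division and a Gaussian added size, so that the birth size obeys *S*_{b}^{(n+1)} = (*S*_{b}^{(n)} + *Δ*^{(n)})/2. The function name `simulate_adder` and all parameter values are illustrative and are not taken from the cited studies.

```python
import numpy as np

def simulate_adder(n_gen=200_000, mean_added=1.0, cv_added=0.1, s0=3.0, seed=0):
    """Iterate S_b(n+1) = (S_b(n) + Delta(n)) / 2 with Gaussian Delta (assumed)."""
    rng = np.random.default_rng(seed)
    added = rng.normal(mean_added, cv_added * mean_added, size=n_gen)
    birth = np.empty(n_gen + 1)
    birth[0] = s0  # deliberately perturbed initial birth size
    for n in range(n_gen):
        birth[n + 1] = 0.5 * (birth[n] + added[n])
    return birth

birth = simulate_adder()
steady = birth[100:]  # discard the initial transient
rho = np.corrcoef(steady[:-1], steady[1:])[0, 1]
print(f"mean birth size ~ {steady.mean():.3f}, mother-daughter rho ~ {rho:.3f}")
```

In this sketch any deviation from the steady-state birth size is halved at each generation, which is why the mother-daughter correlation comes out close to 1/2.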

Although the adder principle was originally proposed and statistically tested almost three decades ago by Voorn et al. (1993) before its recent revival, its mechanistic origin has remained elusive until recently because direct experimental tests were not available for a long time (Jun et al., 2018; Si et al., 2019). Several models have been proposed so far (Campos et al., 2014; Iyer-Biswas et al., 2014; Taheri-Araghi et al., 2015; Harris and Theriot, 2016; Wallden et al., 2016; Amir, 2017; Micali et al., 2018; Si et al., 2019; Witz et al., 2019; Bertaux et al., 2020; Serbanescu et al., 2020; Zheng et al., 2020), and we expect a consensus to emerge as more experimental data become available.

The main purpose of this article is to derive and present steady-state statistical properties of quantitative bacterial cell-cycle and cell-size control models that we are currently aware of and, where relevant, critically examine them against single-cell data from our lab’s mother machine experiments accumulated over the last decade in *E. coli* and *B. subtilis*. These models are (i) the stochastic Helmstetter-Cooper model (sHC) (Si et al., 2019), (ii) the initiation adder (IA) model (Amir, 2014, 2017; Ho and Amir, 2015), (iii) the Replication Double Adder (RDA) model (Witz et al., 2019), (iv) the Independent Double Adder (IDA) model (Si et al., 2019), and (v) the concurrent cell-cycle processes (CCCP) model (Boye and Nordström, 2003) and its stochastic version (Micali et al., 2018).

Some of these models are graphically illustrated in Figure 1. Briefly, the sHC model is a literal extension of the textbook Helmstetter-Cooper model that allows independent Gaussian fluctuations in each of the initiation mass, the *τ*_{cyc} = *C*+*D* period (from initiation to division), and the cell elongation rate. The IA model assumes that replication initiation is the sole implementation point of cell-size control, and division is strictly coupled to initiation such that division is triggered after a fixed *τ*_{cyc} = *C*+*D* minutes have elapsed since initiation. The RDA model is similar to the IA model in that it also assumes that initiation is the reference point for cell-size control. Its main difference from the IA model is that it assumes division is triggered after the cell elongates by a constant length per origin of replication, *δ*_{id}, rather than after a constant time since initiation. In other words, *δ*_{id} is the added size during the *C*+*D* period. Both the IA and the RDA models assume the initiation adder, i.e., the cell grows by a nearly fixed size per replication origin between two consecutive initiation cycles, irrespective of the cell size at initiation (initiation mass).

**Figure 1.** Physiological parameters that can be measured from single-cell experiments. **(A)** Time-lapse images of a single *Escherichia coli* cell growing in a microfluidics channel. The cell boundaries are segmented from phase contrast images whereas the replication forks are visualized using a functional fluorescently labeled replisome protein (DnaN-YPet). **(B)** Multifork replication: in most growth conditions, several replication cycles overlap. The arrows do not indicate the direction of time; they illustrate that the HC model’s core idea is to trace replication initiation backward in time by *C*+*D* from division. **(C)** Four models of the *E. coli* cell cycle and their control variables, which can be measured from single-cell experiments. The sHC model describes cell size and cell cycle using three parameters: elongation rate *λ* = dln(*l*)/d*t*, where *l* is the cell length (not shown), *τ*_{cyc} = *C*+*D*, and the initiation size per origin of replication *s*_{i}. The IA model uses *λ*, *τ*_{cyc} and the added size per origin of replication between consecutive replication initiation events *δ*_{ii}. The RDA model uses *λ*, *δ*_{ii} and the added size per origin of replication from initiation to division *δ*_{id}. The IDA model uses *λ*, *δ*_{ii} and the added size from birth to division *Δ*_{d}. Note that both *δ*_{id} and *τ*_{cyc} can span multiple generations. The prefactor before *s*_{i}, *δ*_{ii}, *δ*_{id} reflects multiple replication origins at initiation as depicted above.

The IDA model states that initiation and division are independently controlled by their respective initiator proteins. Mechanistically, the IDA model assumes that these proteins are produced in a balanced manner (i.e., for every protein, the mass synthesis rate is a fixed fraction of the total mass synthesis rate; Scott et al., 2010), and that initiation and division are triggered when the cell has accumulated their respective initiator proteins to their respective threshold numbers. The CCCP model states that the replication and division cycles progress independently, but checkpoints or their equivalent are activated to ensure cell division (Boye and Nordström, 2003).

This article is structured as follows. In section “Statistical Properties of Five Bacterial Cell-Size and Cell-Cycle Control Models,” we summarize the five models and derive some of their statistical properties. In section “Test of the Models Against Data,” we test the predictions of these models against the data. In section “Discussion,” we critically examine one of the recent correlation analysis methods (the *I*-value analysis) used to justify the RDA model. We conclude that the IDA model is as of today the model most consistent with data, which also provides a falsifiable mechanistic picture.

## Statistical Properties of Five Bacterial Cell-Size and Cell-Cycle Control Models

### The Stochastic Helmstetter-Cooper Model

The original HC model (Cooper and Helmstetter, 1968) is based on the experimental observation that the average duration of chromosome replication (“*C* period”) can be longer than the average doubling time of the cells in fast-growing *E. coli*. In such growth conditions, *E. coli* must initiate a new round of replication before the ongoing replication cycle is completed. The core of the HC model is the recipe to trace replication initiation backward by *τ*_{cyc} = *C*+*D* > *τ* minutes starting from cell division during overlapping cell cycles (Figure 1).

Thus, the HC model introduces three control parameters for a complete description of replication and division cycles: two temporal parameters (the doubling time *τ* and the duration of cell cycle *τ*_{cyc} = *C*+*D*) and one spatial parameter (e.g., cell size at division or initiation). It was Donachie who showed that, if (i) *τ*_{cyc} = *C*+*D* is invariant under different nutrient conditions and (ii) the average cell size increases exponentially with respect to the nutrient-imposed growth rate λ = ln 2/τ as *S* = *s*_{i} exp(*αλ*) (where *s*_{i} and *α* are constant, and *S* is the average cell size of a steady-state population), then the cell size at initiation per replication origin *s*_{i} (or, the “initiation mass”) must be mathematically invariant in all growth conditions (Donachie, 1968). This result was later generalized to all steady-state growth conditions with and without growth inhibition (Si et al., 2017).

Since the original HC model is deterministic and can be defined in terms of *λ*, *τ*_{cyc} and *s*_{i} (or division size *S*_{d}), one possible stochastic extension is to make these three physiological variables stochastic. Together, they completely determine cell sizes including the size at division, assuming perfectly symmetric division (Figure 1). For simplicity, we draw *λ*, *τ*_{cyc} and *s*_{i} at cell birth from a multivariate Gaussian distribution, which also encodes cross- and mother-daughter correlations in the covariance matrix (Si et al., 2019).

The recursion relation for the cell size at division in this “stochastic” Helmstetter-Cooper (sHC) model can be written as follows:

$$
S_d^{(n)} = s_i^{(n)} \exp\left(\lambda^{(n)} \tau_{cyc}^{(n)}\right) \tag{1}
$$

where *n* denotes the generation index. If we assume that cells elongate exponentially at the growth rate *λ* (Wang et al., 2010), the number of overlapping cell cycles *p*+1 is completely determined by the relation:

where *a*_{i} is the time duration elapsed between cell birth and replication initiation. It follows that *p* is the integer part of *τ*_{cyc}/*τ*, where *τ* = ln2/*λ* is the generation time, so that *p*+1 is the number of overlapping cell cycles [unless noted otherwise we will adopt the convention that *X*^{(n)} denotes the value of a physiological variable in generation *n* whereas *X* is the average over the whole lineage].
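As a hedged illustration of this bookkeeping, the sketch below computes *p* as the integer part of *τ*_{cyc}/*τ*. The relation *a*_{i} + *τ*_{cyc} = (*p*+1)*τ* used to recover *a*_{i} is an assumption made for this example, and the function name and numerical values are hypothetical.

```python
import math

def overlapping_cycles(tau_cyc, growth_rate):
    """Return (p, a_i) from tau_cyc = C + D and the elongation rate lambda.

    Assumes tau = ln(2) / lambda and a_i + tau_cyc = (p + 1) * tau.
    """
    tau = math.log(2) / growth_rate
    p = math.floor(tau_cyc / tau)      # integer part of tau_cyc / tau
    a_i = (p + 1) * tau - tau_cyc      # time from birth to initiation
    return p, a_i

# Hypothetical example: tau_cyc = 70 min, doubling time 30 min
p, a_i = overlapping_cycles(70.0, math.log(2) / 30.0)
print(f"overlapping cell cycles: {p + 1}, a_i = {a_i:.1f} min")
```

With these illustrative numbers the cell cycle spans three generations, so initiation takes place two generations before the division it controls.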

In the sHC model, consecutive sizes at initiation are correlated through *ρ*_{i} = *ρ*(*s*_{i}^{(n)}, *s*_{i}^{(n+1)}), where *ρ*(*A,B*) stands for the Pearson correlation coefficient between variables *A* and *B*. In the absence of mother-daughter correlations for all three physiological variables, the cell should behave as a sizer, *ρ*_{d} = *ρ*(*S*_{d}^{(n)}, *S*_{d}^{(n+1)}) = 0. However, additional cross- or auto-correlations among *λ*, *τ*_{cyc} and *s*_{i} [such as cross-correlations between *s*_{i}^{(n)} and *τ*_{cyc}^{(n)} and/or mother-daughter correlations between *s*_{i}^{(n)} and *s*_{i}^{(n+1)}] can have a non-trivial effect on size homeostasis. Analytical expressions for *ρ*_{i}, *ρ*_{d} and *ρ*_{id} = *ρ*(*s*_{i}, *S*_{d}) are derived in Supplementary Appendix C. Importantly, *ρ*_{d} is particularly sensitive to the mother-daughter initiation-size correlation *ρ*_{i} in the sHC model (Si et al., 2019). This prediction motivated an experimental study that perturbed *ρ*_{i} by periodic expression of DnaA in order to break balanced biosynthesis of the DnaA protein (one of the two conditions to produce an adder phenotype) and thus break the adder phenotype in *E. coli*. Experiments rejected this prediction of the sHC model: *E. coli* maintained its size homeostasis following the adder behavior despite periodic oscillations of the *dnaA* expression level (Si et al., 2019). An important conclusion from the oscillation experiments is that replication initiation and cell division are independently controlled in steady-state conditions in both *E. coli* and *Bacillus subtilis*, thus firmly refuting this particular version of the sHC model.

### The Initiation Adder Model

The IA model is a variant of the sHC model in which the constraint on the initiation mass (*s*_{i} = constant in all growth conditions) is replaced by an adder mechanism running between consecutive replication initiations (Sompayrac and Maaloe, 1973; Ho and Amir, 2015). Specifically, the cell initiates replication following the adder principle, i.e., the size added per origin between two consecutive initiation cycles, *δ*_{ii}, is independent of the cell size at initiation (Amir, 2017). Yet, as in the sHC model, the IA model assumes that division is triggered after a fixed duration of time, *τ*_{cyc}, has elapsed since initiation. The three stochastic control parameters in the IA model are therefore: *λ*, *τ*_{cyc} and *δ*_{ii}. A given cell size at replication initiation determines the next replication initiation event and one division event.

The recursion relation for cell size at division is the same as in the sHC model (Eq. 1). However, this relation is complemented with the following adder recursion relation determining the cell size per origin at replication initiation:

$$
s_i^{(n+1)} = \frac{1}{2}\, s_i^{(n)} + \delta_{ii}^{(n)} \tag{3}
$$

As before, *λ*^{(n)}, *τ*_{cyc}^{(n)}, and *δ*_{ii}^{(n)} are random variables associated with the *n*-th generation. To derive statistical properties of the IA model (Table 1), we will assume that the *δ*_{ii}^{(n)} are independent Gaussian variables. At steady-state, Eq. (3) implies that *s*_{i} = 2*δ*_{ii}. Therefore, the number of overlapping cell-cycles is also determined by Eq. (2) in the IA model (namely, *p*+1).

Cell sizes at consecutive initiations are correlated as *ρ*_{i} = 1/2 (Supplementary Appendix A). Therefore the IA model can be seen as a specific case of the sHC model, for which there are no cross-correlations between physiological variables, and for which the only non-zero auto-correlation is *ρ*_{i} = 1/2. In general, *ρ*_{d} < 1/2, and the IA model reproduces the adder correlation only in the deterministic limit where *λτ*_{cyc} is a constant.
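These statements can be checked numerically with a small sketch. It assumes the initiation recursion *s*_{i}^{(n+1)} = *s*_{i}^{(n)}/2 + *δ*_{ii}^{(n)} and the division size *S*_{d}^{(n)} = *s*_{i}^{(n)} exp(*λ*^{(n)}*τ*_{cyc}^{(n)}), and, as a simplifying shortcut, it draws the product *λτ*_{cyc} directly from a single Gaussian. The function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_ia(n_gen=200_000, cv_delta=0.1, cv_growth=0.1, seed=1):
    """Return (rho_i, rho_d) for an illustrative initiation-adder lineage."""
    rng = np.random.default_rng(seed)
    delta_ii = rng.normal(1.0, cv_delta, size=n_gen)
    lam_tau = rng.normal(1.0, cv_growth, size=n_gen)  # lambda * tau_cyc (assumed Gaussian)
    s_i = np.empty(n_gen)
    s_i[0] = 2.0                                      # steady-state mean, s_i = 2 * delta_ii
    for n in range(n_gen - 1):
        s_i[n + 1] = 0.5 * s_i[n] + delta_ii[n]       # adder between initiations
    s_d = s_i * np.exp(lam_tau)                       # division a time tau_cyc after initiation

    def lag1(x):
        return np.corrcoef(x[:-1], x[1:])[0, 1]

    return lag1(s_i), lag1(s_d)

rho_i, rho_d = simulate_ia()
rho_i0, rho_d0 = simulate_ia(cv_growth=0.0)           # deterministic lambda * tau_cyc
print(f"noisy: rho_i = {rho_i:.2f}, rho_d = {rho_d:.2f}; deterministic: rho_d = {rho_d0:.2f}")
```

In this sketch the noise in *λτ*_{cyc} inflates the variance of the division size without adding any mother-daughter covariance, which is what pulls *ρ*_{d} below 1/2.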

### The Replication Double Adder Model

The RDA model states that the cell simultaneously follows two types of adder. The first adder is between two consecutive initiation cycles (“initiation adder”), same as in the IA model. The second adder states that the size added between initiation and division is independent of the cell size at initiation (“initiation-to-division” adder). This second initiation-to-division adder makes the RDA model different from the IA model, although both models can be considered initiation-centric. This model was developed to explain one specific data set with non-overlapping cell cycles in *E. coli* (Witz et al., 2019). In section “Test of the Models Against Data,” we will use the same statistical analysis method that was used in Witz et al. (2019) to establish the RDA model.

In the RDA model, the cell size per origin of replication, *s*_{i}, follows the same recursion relation as in the IA model (Eq. 3). As for the initiation-to-division adder, the cell size at division is determined by the following recursion relation:

$$
S_d^{(n)} = s_i^{(n)} + 2\,\delta_{id}^{(n)} \tag{4}
$$

where *δ*_{id} represents the added size per origin of replication from initiation to division.

While Eq. (4) is straightforward to understand for a non-overlapping cell cycle, it is worth checking its validity for overlapping cell cycles. Let *S*_{i}^{(n)} be the cell size at initiation for the *n*-th generation. Let us first emphasize how *S*_{i}^{(n)} is measured. In the sHC model, *S*_{i}^{(n)} is measured at a duration of time *τ*_{cyc}^{(n)} before division occurs. For a non-overlapping cell cycle, *τ*_{cyc}^{(n)} < *τ*^{(n)}, therefore *S*_{i}^{(n)} is measured in generation *n*. For two overlapping cell cycles, 2*τ*^{(n)} > *τ*_{cyc}^{(n)} > *τ*^{(n)}, initiation therefore occurs in the (*n*-1)-th generation, meaning that *S*_{i}^{(n)} refers to a size measured in the mother cell (i.e., generation *n*-1) as shown in Figure 1. Cells are born with 2 origins of replication, therefore we have *S*_{i}^{(n)} = 2 *s*_{i}^{(n)}. In this example, the mass synthesized between two consecutive replication initiation events must take into account one division event. Back to the RDA model, and using the same convention, the total added size from initiation to division is 2 *S*_{d}^{(n)} − *S*_{i}^{(n)} = 4 *δ*_{id}^{(n)}. The factor of 4 accounts for the 4 origins of replication present after replication initiation. Dividing by 2, we obtain Eq. (4). This reasoning generalizes to any number of overlapping cell cycles. From Eq. (4), we also obtain that the average cell size at division is *S*_{d} = 2 (*δ*_{ii} + *δ*_{id}). An argument similar to Eq. (2) yields the number of overlapping cell cycles *p*+1 as a function of the mean of the physiological variables: *p* is the integer part of log_{2}(1 + *δ*_{id}/*δ*_{ii}).

The RDA model is not compatible with size convergence by the adder principle. While *ρ*_{i} = 1/2 as in the IA model, the division size mother-daughter correlation is given by:

$$
\rho_d = \frac{1}{2}\,\frac{1}{1 + 3\,\sigma_{id}^2/\sigma_{ii}^2} \tag{5}
$$

where *σ*_{ii}^{2} and *σ*_{id}^{2} are the variances for *δ*_{ii} and *δ*_{id}, respectively (see Supplementary Appendix A). Since the adder principle is equivalent to *ρ*_{d} = 1/2 (see Supplementary Appendix A), the RDA model converges to the adder only in the deterministic limit *σ*_{id} → 0. In addition, we can also compute the correlation between initiation size per origin and division size *ρ*_{id} = *ρ*(*s*_{i}^{(n)}, *S*_{d}^{(n)}) and obtain:

$$
\rho_{id} = \frac{1}{\sqrt{1 + 3\,\sigma_{id}^2/\sigma_{ii}^2}} \tag{6}
$$
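A quick numerical check of the RDA correlations is sketched below. It assumes the recursions *s*_{i}^{(n+1)} = *s*_{i}^{(n)}/2 + *δ*_{ii}^{(n)} and *S*_{d}^{(n)} = *s*_{i}^{(n)} + 2*δ*_{id}^{(n)} with independent Gaussian *δ*_{ii} and *δ*_{id}. Under these assumptions the closed forms *ρ*_{d} = (1/2)/(1 + 3*σ*_{id}^{2}/*σ*_{ii}^{2}) and *ρ*_{id} = (1 + 3*σ*_{id}^{2}/*σ*_{ii}^{2})^{-1/2} follow from the stationary variance Var(*s*_{i}) = 4*σ*_{ii}^{2}/3. Function names and parameter values are illustrative.

```python
import numpy as np

def rda_simulated(sigma_ii=0.1, sigma_id=0.1, n_gen=200_000, seed=2):
    """Simulate an illustrative RDA lineage and return (rho_d, rho_id)."""
    rng = np.random.default_rng(seed)
    delta_ii = rng.normal(1.0, sigma_ii, size=n_gen)
    delta_id = rng.normal(1.5, sigma_id, size=n_gen)
    s_i = np.empty(n_gen)
    s_i[0] = 2.0
    for n in range(n_gen - 1):
        s_i[n + 1] = 0.5 * s_i[n] + delta_ii[n]   # initiation adder
    s_d = s_i + 2.0 * delta_id                    # initiation-to-division adder
    rho_d = np.corrcoef(s_d[:-1], s_d[1:])[0, 1]
    rho_id = np.corrcoef(s_i, s_d)[0, 1]
    return rho_d, rho_id

def rda_closed_form(sigma_ii, sigma_id):
    """Closed forms derived under the independence assumptions stated above."""
    r = 3.0 * sigma_id**2 / sigma_ii**2
    return 0.5 / (1.0 + r), 1.0 / np.sqrt(1.0 + r)

sim_d, sim_id = rda_simulated()
th_d, th_id = rda_closed_form(0.1, 0.1)
print(f"rho_d: simulated {sim_d:.3f}, closed form {th_d:.3f}")
print(f"rho_id: simulated {sim_id:.3f}, closed form {th_id:.3f}")
```

With equal noise in *δ*_{ii} and *δ*_{id}, the division-size correlation in this sketch falls far below 1/2, which illustrates how quickly noise in *δ*_{id} erodes the adder signature.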

### The Independent Double Adder Model

The IDA model states that, in steady state, initiation and division independently follow the adder principle. That is, the size added between two consecutive initiations is independent of the size at initiation (as in the IA and RDA models), whereas the size added between two division cycles is independent of the cell size at birth (or division). The recursion relation for the division size can be written as:

$$
S_d^{(n+1)} = \frac{1}{2}\, S_d^{(n)} + \Delta_d^{(n)} \tag{7}
$$

It follows that the average cell size at division is *S*_{d} = 2 *Δ*_{d}. An argument similar to Eq. (2) yields the number of overlapping cell cycles *p*+1, where *p* is the integer part of log_{2}(*Δ*_{d}/*δ*_{ii}).

We have *ρ*_{i} =1/2 and ρ_{d} = 1/2 as expected by the definition of the model. Furthermore, since initiation and division follow two independent processes (Eqs. 3 and 7), division and initiation sizes are independent from each other, namely ρ_{id} = 0.

Mechanistically, the IDA model is based on (i) balanced biosynthesis of cell-cycle proteins and (ii) their accumulation to respective threshold numbers to trigger initiation and division (Si et al., 2019).
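The three correlations quoted for the IDA model can be reproduced with a short simulation. The sketch assumes two independent recursions of the same form, *x*^{(n+1)} = *x*^{(n)}/2 + increment, one for the initiation size per origin and one for the division size, with Gaussian added sizes. The helper name `half_and_add` and the numerical values are hypothetical.

```python
import numpy as np

def half_and_add(increments, start):
    """Iterate x(n+1) = x(n)/2 + increments(n)."""
    x = np.empty(len(increments))
    x[0] = start
    for n in range(len(increments) - 1):
        x[n + 1] = 0.5 * x[n] + increments[n]
    return x

def lag1(x):
    """Mother-daughter (lag-1) Pearson correlation."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

rng = np.random.default_rng(3)
n_gen = 200_000
s_i = half_and_add(rng.normal(1.0, 0.1, n_gen), 2.0)  # initiation adder (delta_ii)
s_d = half_and_add(rng.normal(4.0, 0.4, n_gen), 8.0)  # division adder (Delta_d), independent draws

rho_i, rho_d = lag1(s_i), lag1(s_d)
rho_id = np.corrcoef(s_i, s_d)[0, 1]
print(f"rho_i = {rho_i:.3f}, rho_d = {rho_d:.3f}, rho_id = {rho_id:.3f}")
```

Because the two sequences never share a random draw, the vanishing *ρ*_{id} in this sketch is a direct consequence of the independence assumption rather than of the particular parameter values.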

### The Concurrent Cell-Cycle Processes Model

The CCCP model is an adaptable model with several adjustable parameters (as in the sHC model) and lies somewhere in between the IA and the IDA model. The adaptability is analogous to the presentation by Amir (2014) so that the model can be continuously adjusted between sizer and timer depending on the mother-daughter size correlations between −1 and +1. To ensure 1-1 correspondence between the replication cycle and the division cycle, the model explicitly implements a constraint that division must wait until after replication termination. Biologically, the model follows the view by Boye and Nordström (2003).

We discuss the specific case of the adder by fixing the mother-daughter size correlation coefficients to 1/2 as explained throughout this section “Statistical Properties of Five Bacterial Cell-Size and Cell-Cycle Control Models.” That is, the cell size at initiation follows the recursion relation:

where *A*^{(n)} is the logarithmic added size between consecutive replication initiations. As mentioned above, the CCCP model was originally introduced in a more general form than Eq. (8), with an adjustable correlation parameter (see Supplementary Appendix D). However, as explained by the authors, a value of 1/2 is the most consistent with experimental data. Equation (8) is very similar to Eq. (3): it is an adder on the logarithmic sizes rather than on the actual sizes at replication initiation. Denoting by *C* the time to replicate the chromosome, a candidate size for the division size is:

where as before *λ* is the elongation rate. If chromosome replication were the only process determining the size at division, *S*_{R}^{(n)} would be the division size. However, another process, namely the division adder, also constrains the division size, resulting in a second candidate size for division:

where *B* is the added logarithmic size between consecutive division adder cycles. Equation (10) is similar to Eq. (7) and represents the division adder. Finally, cell size at division is determined by the slowest of the two processes from Eqs. (9) and (10):

$$
S_d^{(n)} = \max\left(S_R^{(n)},\, S_H^{(n)}\right) \tag{11}
$$

Equation (11) simply means that division should start only after replication termination. Denoting *f* as the fraction of cases in which division size is limiting (namely *S*_{R}<*S*_{H}=*S*_{d}), the average time elapsed between replication initiation and cell division can be expressed as [assuming that $\langle \ln x \rangle \approx \ln \langle x \rangle$ (Ho and Amir, 2015)]:

where *A*, *B*, *C* stand for means. Therefore, the number of overlapping cell cycles *p*+1 is determined by Eq. (2). Equation (12) has a functional dependence on growth rate compatible with experimental reports (Wallden et al., 2016).

### Similarities and Differences Between the Stochastic Helmstetter-Cooper, Initiation Adder, Replication Double Adder, Independent Double Adder, and the Concurrent Cell-Cycle Processes Models

The question of implementation point for cell size control has been controversial in the past. In the sHC, IA, and RDA models, replication initiation is the implementation point of cell size control. By contrast, the IDA and CCCP models assume that the division and replication cycles are controlled by independent processes.

These models reflect a major challenge for identifying a cell-size control model that is compatible with the new plethora of high-throughput single-cell data (Wang et al., 2010; Wallden et al., 2016). Although the sHC and IA models can be dismissed by experimental evidence (section “Comparison of Correlations From Model Predictions and From Experimental Data”), the other models require more thorough analysis. For example, in contrast to the IDA model, the RDA model only ensures the initiation adder and it only reproduces the division adder behavior in the deterministic limit where *δ*_{id} is constant. The essential difference between the IDA and RDA models comes from the correlation between the size per oriC at initiation and the added size per oriC from initiation to division. Specifically, *ρ*(*s*_{i}, *δ*_{id}) is zero for the RDA model whereas it takes negative values for the IDA.
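The sign difference can be made explicit with a short worked calculation. It assumes the per-origin bookkeeping *S*_{d} = *s*_{i} + 2*δ*_{id} as the definition of *δ*_{id}, and that *s*_{i} and *S*_{d} of the same cycle are independent in the IDA model; the symbols $\sigma_s^2$ and $\sigma_S^2$ for their steady-state variances are introduced here for illustration only.

$$
\mathrm{Cov}(s_i, \delta_{id}) = \tfrac{1}{2}\left[\mathrm{Cov}(s_i, S_d) - \mathrm{Var}(s_i)\right] = -\tfrac{1}{2}\,\sigma_s^2,
\qquad
\rho(s_i, \delta_{id}) = -\frac{\sigma_s}{\sqrt{\sigma_s^2 + \sigma_S^2}} < 0 .
$$

In the RDA model, by contrast, *δ*_{id} is drawn independently of *s*_{i}, so the same correlation vanishes by construction.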

In the next section, we test these models against data in more detail.

## Test of the Models Against Data

### Description of the Experimental Data Used in This Study

We use datasets from our previous studies for *E. coli* and *B. subtilis* (Sauls et al., 2019; Si et al., 2019). We have also performed additional experiments for this study (see Supplementary Methods). All data and numerical analysis are available (Supplementary Information: Numerical Methods). In total, we have 15 experimental datasets from our studies. We have also analyzed the 4 experimental datasets made available by Witz et al. (2019).

### Comparison of Correlations From Model Predictions and From Experimental Data

We first set out to test the different cell-cycle and cell-size control models. Specifically, we computed the four correlations *ρ*(*S*_{b}, *Δ*_{d}), *ρ*(*s*_{i}, *δ*_{ii}), *ρ*(*s*_{i}, *δ*_{id}), and *ρ*(*s*_{i}, *τ*_{cyc}) (Figure 2). The correlation *ρ*(*S*_{b}, *Δ*_{d}) is important because *ρ*(*S*_{b}, *Δ*_{d}) = 0 defines adder-based cell-size homeostasis. Indeed, *ρ*(*S*_{b}, *Δ*_{d}) is zero in virtually all experimental data. The correlation *ρ*(*s*_{i}, *δ*_{ii}) is also close to zero, although deviations are seen for some experiments. These results suggest that both the IDA and RDA models are possible. By contrast, *ρ*(*s*_{i}, *τ*_{cyc}) is consistently negative. This refutes the sHC and IA models, which both assume that the initiation-to-division duration and the initiation size per origin are independent control parameters, and thus predict *ρ*(*s*_{i}, *τ*_{cyc}) = 0. In addition, *ρ*(*s*_{i}, *δ*_{id}) is also close to zero, in favor of the RDA model, but it is slightly negative for several conditions, in agreement with the IDA model (Supplementary Appendix B). We did not test the CCCP model because it is designed to be adjusted to the data. The candidate models are therefore the RDA and IDA models. Hereafter, we focus on these two models.

**Figure 2.** The four correlations *ρ*(*S*_{b}, *Δ*_{d}), *ρ*(*s*_{i}, *δ*_{id}), *ρ*(*s*_{i}, *δ*_{ii}), and *ρ*(*s*_{i}, *τ*_{cyc}) are computed for the 4 experimental datasets by Witz et al. (2019), and 15 experimental datasets that we produced (see Supplementary Methods). While the first three vanish for most experimental data, the *ρ*(*s*_{i}, *τ*_{cyc}) displays a consistent negative correlation, inconsistent with the sHC and IA models. Variables in each dataset were normalized by their mean. Numerical values for the Pearson correlation coefficients are given in Supplementary Data 2 file. Slopes can be inferred from the Pearson correlation coefficients and CVs in the approximation of bivariate Gaussian variables.
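As a worked relation for the last sentence of the caption: under the bivariate Gaussian approximation, and for two variables normalized by their means (so that each standard deviation equals the corresponding CV), the conditional mean is linear. The symbols *x* and *y* below are generic placeholders for any pair of variables plotted in Figure 2.

$$
\langle y \mid x \rangle = 1 + \rho(x, y)\,\frac{\mathrm{CV}_y}{\mathrm{CV}_x}\,(x - 1),
\qquad
\text{slope} = \rho(x, y)\,\frac{\mathrm{CV}_y}{\mathrm{CV}_x}.
$$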

### Statistical Analysis and the Case Study of the *I*-value Analysis

In their recent paper, Witz et al. (2019) tracked replication and division cycles at the single-cell level, using experimental methods similar to previous works (Adiciptaningrum et al., 2015; Wallden et al., 2016; Si et al., 2019). They computed correlations between all pairs of measured physiological parameters, and attempted to identify the set of most mutually uncorrelated physiological variables by computing the “*I*-value,” a metric to measure the statistical independence of the measured variables. They then assumed that statistically uncorrelated physiological parameters must represent biologically independent controls. Such approaches previously facilitated the discovery of the adder principle and its formal description (Taheri-Araghi et al., 2015). Based on this correlation analysis or *I*-value analysis, Witz et al. (2019) concluded that the RDA model is the most likely model of the cell-cycle and cell-size control.

To compute the *I*-value for a given model, one needs to identify the control parameters of the model and their characteristic features. For the RDA model, they are (i) the absence of correlation between the cell size at initiation and the added size between initiation and division, namely *ρ*(*s*_{i}, *δ*_{id}) = 0, and (ii) the absence of correlation between the cell size at initiation and the added size between consecutive initiations, namely *ρ*(*s*_{i}, *δ*_{ii}) = 0. In addition to these size variables, it is known that the growth rate is mostly independent of the other physiological variables, namely: *ρ*(*λ*, *δ*_{id}) = 0 and *ρ*(*λ*, *δ*_{ii}) = 0. Witz et al. (2019) hence proposed a scalar metric that summarizes these four correlations being equal to zero, namely the determinant *I* (or *I*-value) of the matrix of correlations between the 4 variables *s*_{i}, *δ*_{id}, *δ*_{ii} and *λ* (Eq. 13). When *I* ≪ 1, some cross-correlations exist and *ρ*(*λ*, *δ*_{id}) and *ρ*(*λ*, *δ*_{ii}) cannot both vanish. On the other hand, when *I* = 1, the RDA model holds. Although it has been known since the work by Cooper and Helmstetter (1968) that the progression of cell size and cell cycle can be completely described using three variables (Wallden et al., 2016; Si et al., 2017), four variables are necessary here to encompass the correlation structure characterizing the RDA model, as explained by Witz et al. (2019, 2020). In summary, to measure the statistical independence of each set of parameters, the *I*-value analysis needs a correlation matrix of the following form (Eq. 13).

$$
I = \det
\begin{pmatrix}
1 & \rho(s_i, \delta_{id}) & \rho(s_i, \delta_{ii}) & \rho(s_i, \lambda) \\
\rho(s_i, \delta_{id}) & 1 & \rho(\delta_{id}, \delta_{ii}) & \rho(\delta_{id}, \lambda) \\
\rho(s_i, \delta_{ii}) & \rho(\delta_{id}, \delta_{ii}) & 1 & \rho(\delta_{ii}, \lambda) \\
\rho(s_i, \lambda) & \rho(\delta_{id}, \lambda) & \rho(\delta_{ii}, \lambda) & 1
\end{pmatrix} \tag{13}
$$

The diagonal elements are 1’s and the off-diagonal elements are cross-correlations between pairs of parameters. Therefore, if all parameters are statistically independent of each other, the off-diagonal elements should be 0, and the determinant *I* of the matrix should be 1. Based on this observation, Witz et al. (2019) used the determinant *I* ≤ 1 of the matrix as a metric for statistical independence of the hypothetical control parameters, with *I* = 1 corresponding to the set of most independent parameters.
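The metric itself is simple to compute. The sketch below is a minimal illustration on synthetic data, not the analysis pipeline of Witz et al. (2019): the function name `i_value`, the array layout (one row per cell, one column per variable), and the synthetic Gaussian samples are all assumptions made for this example.

```python
import numpy as np

def i_value(samples):
    """Determinant of the Pearson correlation matrix.

    samples: array of shape (n_cells, n_variables).
    """
    corr = np.corrcoef(samples, rowvar=False)
    return float(np.linalg.det(corr))

rng = np.random.default_rng(4)
independent = rng.normal(size=(50_000, 4))   # four uncorrelated synthetic variables
coupled = independent.copy()
coupled[:, 1] += 0.8 * coupled[:, 0]         # induce a single cross-correlation

print(f"I (independent) = {i_value(independent):.3f}")
print(f"I (one coupled pair) = {i_value(coupled):.3f}")
```

A single correlated pair is enough to pull *I* well below 1 in this sketch, even though the other variables remain independent, so the scalar does not reveal which correlation is responsible.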

In section “Discussion,” we will come back to some of the limitations of the *I*-value analysis.

### Test of the *I*-value Analysis With Various Models

Our 4-variable *I*-value analysis of the models described in section “Statistical Properties of Five Bacterial Cell-Size and Cell-Cycle Control Models” (except the CCCP) for all 19 datasets is shown in Figure 3 (top). We computed *I*-values for the following four models: RDA, IDA, IA, and sHC (see Supplementary Methods). The results of the analysis indicate that 7 out of our 15 experiments support the IDA rather than the RDA model, whereas 1 supports the IA model. Furthermore, when we applied the same analysis to all 4 datasets from Witz et al. (2019), we found that all 4 experiments support the IDA model. Note that Witz et al. (2019) had only analyzed one dataset. The sHC and IA models were included for completeness, and they show systematically lower *I*-values (except in one *B. subtilis* condition in which the IA model had the largest score for reasons we do not understand). Overall, the results of this analysis suggest that the IDA model is most consistent with the 19 datasets we have analyzed.

**Figure 3.** We applied the *I*-value analysis proposed by Witz et al. (2019) to 4 models (RDA, IDA, IA, and sHC) using 15 experimental datasets that we produced (see Supplementary Methods), and the 4 datasets published by Witz et al. (2019). In the top bar graphs, the 4 variables are *λ, Δ*_{d}, *δ*_{ii}, *S*_{b} for the IDA model and *λ, δ*_{id}, *δ*_{ii}, *s*_{i} for the RDA model (see the choice of variables in Supplementary Methods). The length of the arrows indicates how much the IDA or RDA model is favored by the data.

## Discussion

### Limitations of the *I*-value Analysis

We noticed that the *I*-values for the RDA and IDA models were in general very close, although these two models point to two fundamentally different mechanisms of the cell cycle. We therefore asked to what extent the *I*-value analysis could be used to effectively identify meaningful models of the cell cycle. In the spirit of the ranking performed in the study by Witz et al. (2019), we considered all possible combinations of 4 among 18 physiological variables (see Supplementary Methods; see Supplementary Figure 1), and computed the *I*-values for each of the 4 datasets of Witz et al. (2019) and for each of our 15 datasets. Although the IDA and RDA models have high scores, we found that many other combinations have higher *I*-values, including combinations that do not correspond to any meaningful model of the cell cycle (Figure 3 bottom). Since *I*-values cannot be used to distinguish sound from unsound models of the cell cycle, we conclude that this analysis lacks predictive power.
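The exhaustive ranking described above can be sketched as follows. This is an illustrative outline on synthetic data rather than the code used for Figure 3: the function name `rank_i_values`, the placeholder variable names, and the random stand-in for single-cell measurements are all assumptions.

```python
from itertools import combinations
from math import comb

import numpy as np

def rank_i_values(data, names, k=4):
    """Return (I-value, variable names) for every k-subset, highest I first."""
    corr = np.corrcoef(data, rowvar=False)
    scores = []
    for idx in combinations(range(len(names)), k):
        sub = corr[np.ix_(idx, idx)]          # k x k correlation sub-matrix
        scores.append((float(np.linalg.det(sub)), tuple(names[i] for i in idx)))
    return sorted(scores, reverse=True)

rng = np.random.default_rng(5)
names = [f"x{j}" for j in range(18)]          # placeholder variable names
data = rng.normal(size=(2_000, 18))           # synthetic stand-in for measurements
ranking = rank_i_values(data, names)

print(f"{len(ranking)} combinations ranked; expected {comb(18, 4)}")
print("top-ranked combination:", ranking[0][1])
```

Even with only 18 variables there are thousands of candidate sets, so some high-scoring combinations are expected whether or not they correspond to a mechanistic model.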

Furthermore, the *I*-value analysis can only be employed to compare models with the same number of variables in their defining correlations. To see this, let us consider the RDA and IDA models in Figure 4. The RDA model can be defined by the 3 parameters {*δ*_{ii}, *δ*_{id}, *λ*} (Figure 4A). Indeed, from an initial condition consisting of an initiation size, only those 3 parameters need to be known at each generation to construct a whole lineage. Yet the defining correlations of the RDA model are given by *ρ*(*s*_{i}, *δ*_{id}) = 0 and *ρ*(*s*_{i}, *δ*_{ii}) = 0. Thus, we need a total of 4 variables {*s*_{i}, *δ*_{ii}, *δ*_{id}, *λ*} to characterize the RDA model, which leads to the 4x4 correlation matrix shown in Figure 4C.

**Figure 4.** Each model **(A)** is characterized by a set of two correlations **(B)**. Note that these defining correlations require 1 additional parameter *S*_{b} for the sHC model, 1 additional parameter *s*_{i} for the RDA and the IA models, whereas the IDA model requires 2 additional parameters, *s*_{i} and *S*_{b}. **(C)** As a result, the covariance matrix for the *I*-value analysis of these models, according to Witz et al. (2019), would be 4x4 for the sHC, IA and RDA models and 5x5 for the IDA model. Therefore, the *I*-values of these models cannot be meaningfully compared.

The problem with the above procedure is that the size of the correlation matrix becomes model dependent, and thus the *I*-value analysis cannot compare different models. For example, following the same reasoning, the IDA model would require 5 variables for the *I*-value analysis, because of the two defining correlations (*s*_{i}, *δ*_{ii}) and (*S*_{b}, *Δ*_{d}) (Figure 4B). Therefore, in addition to the three independent control parameters {*δ*_{ii}, *Δ*_{d}, *λ*}, the *I*-value analysis would require two additional parameters {*s*_{i}, *S*_{b}} from the defining correlations. The resulting correlation matrix would then be 5x5, built from {*s*_{i}, *S*_{b}, *δ*_{ii}, *Δ*_{d}, *λ*}, instead of 4x4 (Figure 4C). Since *I*-values obtained from correlation matrices of different sizes cannot be meaningfully compared, the *I*-value analysis is fundamentally limited to comparing models within a specific class.
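A simple worked example may help to see why the matrix size matters. It rests on two assumptions that are ours: that the *I*-value is the determinant of the correlation matrix, and that all *n* variables share the same hypothetical pairwise correlation *r*. The determinant of such an equicorrelation matrix *R*_{n} is

$$\det R_{n} = (1 - r)^{\,n-1}\,\bigl[\,1 + (n-1)\,r\,\bigr].$$

For *r* = 0.2, this gives

$$\det R_{4} = 0.8^{3} \times 1.6 = 0.8192, \qquad \det R_{5} = 0.8^{4} \times 1.8 \approx 0.7373.$$

The 5x5 matrix yields a lower score than the 4x4 matrix even though every pair of variables is equally correlated in both cases, so in this illustration the difference reflects only the number of variables.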

Finally, it is worthwhile mentioning that the *I*-value analysis employed by Witz et al. (2020) is only valid for non-overlapping cell cycles.

### The Replication Double Adder Model Does Not Produce the Adder Principle

From Eq. (5), size convergence according to the RDA model is incompatible with the adder principle. More specifically, in the presence of fluctuations, the RDA model is skewed toward a sizer behavior. Using experimentally measured values for the variance of *δ*_{ii} and *δ*_{id}, we computed *ρ*_{d} according to Eq. (5) (Figure 5). The RDA model would predict a deviation from the adder principle, in contradiction to several experimental results (Campos et al., 2014; Taheri-Araghi et al., 2015; Le Treut et al., 2020; Treut et al., 2020).

**Figure 5.** The theory based on the initiation-centric model predicts a more sizer-like behavior. We used Eq. (5) and the experimental values of σ_{id}/σ_{ii} from Si et al. (2019) and Witz et al. (2019).

The simulation of the RDA model by Witz et al. (2019) showed good agreement with the experimental data. Yet further investigation showed that the agreement was a direct consequence of introducing yet another adjustable parameter, namely the variance of the septum position (Le Treut et al., 2020; Treut et al., 2020). Indeed, for perfectly symmetric division, the simulation results also show deviations from the experimental adder behavior (Supplementary Figure 2), in agreement with Eq. (5).
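The skew toward sizer behavior can be reproduced with a minimal lineage simulation. The sketch below is not the simulation of Witz et al. (2019) or the one reported in Supplementary Figure 2. It assumes perfectly symmetric division, Gaussian adders with unit mean, sizes expressed per origin, and a division size given by *s*_{i} + *δ*_{id}. All function names are hypothetical. Under these assumptions, the correlation between division sizes of consecutive generations works out to (1/2)/(1 + 3σ_{id}²/σ_{ii}²), an expression derived for this simplified model that may differ in detail from Eq. (5).

```python
import math
import random


def lag1_correlation(series):
    """Pearson correlation between consecutive elements of a series."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_rda(n_gen, sigma_ii, sigma_id, seed=0):
    """Division sizes (per origin) along a lineage in a simplified RDA model.

    Assumptions: symmetric division, no septum noise, Gaussian adders with
    unit mean, and division size set by s_i + delta_id.
    """
    rng = random.Random(seed)
    s_i = 1.0  # initiation size per origin
    sizes = []
    for _ in range(n_gen):
        delta_ii = rng.gauss(1.0, sigma_ii)
        delta_id = rng.gauss(1.0, sigma_id)
        sizes.append(s_i + delta_id)   # division size tied to initiation
        s_i = 0.5 * (s_i + delta_ii)   # initiation-to-initiation adder
    return sizes


def simulate_ida(n_gen, sigma_d, seed=0):
    """Division sizes along a lineage with a birth-to-division adder."""
    rng = random.Random(seed)
    s_b = 1.0
    sizes = []
    for _ in range(n_gen):
        s_d = s_b + rng.gauss(1.0, sigma_d)  # Delta_d added from birth
        sizes.append(s_d)
        s_b = 0.5 * s_d                      # symmetric division
    return sizes


def rda_prediction(sigma_ii, sigma_id):
    """Consecutive-generation correlation derived for the simplified RDA model."""
    return 0.5 / (1.0 + 3.0 * (sigma_id / sigma_ii) ** 2)


n_gen = 50_000
for ratio in (0.0, 0.5, 1.0):
    rho = lag1_correlation(simulate_rda(n_gen, 0.1, 0.1 * ratio))
    print(f"RDA sigma_id/sigma_ii={ratio}: rho_d={rho:.3f} "
          f"(derived {rda_prediction(0.1, 0.1 * ratio):.3f})")
print(f"IDA: rho_d={lag1_correlation(simulate_ida(n_gen, 0.1)):.3f}")
```

In this sketch the RDA lineage recovers the adder value of 1/2 only when *δ*_{id} does not fluctuate, whereas the birth-to-division adder of the IDA model gives 1/2 irrespective of the noise amplitude.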

### Mechanistic Origin of the Independent Double Adder Model

The IDA model is a mechanistic model based on two experimentally verified hypotheses. First, the cell cycle proteins are produced in a balanced manner (i.e., the synthesis rate of each protein is the same as the growth rate of the cell):

$$\frac{dN}{dt} = c^{*}\,\frac{dV}{dt}, \tag{14}$$

where *N* is the protein copy number in the cell, *V* is the cell volume and *c*^{∗} is the steady-state protein concentration. Second, initiation or division is triggered when the respective initiator protein reaches a threshold, namely:

$$N = N_{0}. \tag{15}$$

The adder phenotype is a natural consequence of these two assumptions, provided that the initiator proteins are equally partitioned at division between daughter cells (Si et al., 2019). The requirement that *N*_{0}/2 proteins must be synthesized between birth and division results in an added volume from birth to division of:

$$\Delta_{d} = \frac{N_{0}}{2c^{*}}. \tag{16}$$

This model was substantiated in an experimental study showing that perturbing the first condition, namely balanced biosynthesis, was enough to break the adder phenotype (Si et al., 2019). Balanced biosynthesis was perturbed in two orthogonal ways: (i) by oscillating the production rate of the FtsZ protein through periodic induction and (ii) by relieving FtsZ degradation through ClpX inhibition.

A similar mechanism is thought to apply to the initiation process, through the initiator protein DnaA, which accumulates at the origin of replication to trigger replication initiation. An important difference with the division mechanism, however, is that this results in a threshold to be reached at each origin of replication, thus Eq. (15) is modified to:

$$N = N_{0} \times (\text{number of origins}). \tag{17}$$

Provided that once again proteins are equally partitioned at division, the balanced biosynthesis condition of Eq. (14) and Eq. (17) result in a fixed added volume per origin of replication between consecutive initiation events, hence the initiation-to-initiation adder *δ*_{ii}.
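The division part of this threshold mechanism can be illustrated with a short deterministic sketch. It assumes noise-free balanced biosynthesis, a fixed threshold, and perfectly symmetric division with equal partitioning of the proteins. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def grow_until_threshold(birth_volume, birth_copies, c_star, n_threshold):
    """Volume at division for a cell under balanced biosynthesis.

    Balanced biosynthesis (dN = c_star * dV) means each unit of added volume
    contributes c_star proteins, so reaching n_threshold copies from
    birth_copies requires a fixed added volume regardless of birth_volume.
    """
    added_volume = (n_threshold - birth_copies) / c_star
    return birth_volume + added_volume


def threshold_lineage(first_birth_volume, c_star, n_threshold, n_gen):
    """Birth volumes along a lineage with symmetric division."""
    volumes = [first_birth_volume]
    copies = n_threshold / 2.0  # proteins are split equally at division
    for _ in range(n_gen):
        v_div = grow_until_threshold(volumes[-1], copies, c_star, n_threshold)
        volumes.append(v_div / 2.0)
    return volumes


c_star, n_threshold = 100.0, 200.0       # hypothetical values
delta_d = n_threshold / (2.0 * c_star)   # expected added volume
lineage = threshold_lineage(3.0 * delta_d, c_star, n_threshold, n_gen=10)
deviations = [v - delta_d for v in lineage]
```

Starting from a birth volume three times larger than its steady-state value, the deviation is halved at every generation, which is the size convergence expected from the adder principle.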

### Agreement of the Models With the Full Correlation

We computed the three experimental correlations *ρ*_{i}, *ρ*_{d}, and *ρ*_{id} and compared them to the predictions of the models shown in Table 1 (we did not include the CCCP model because its extra parameter *f* leaves the expressions undetermined). Unfortunately, this analysis failed to discriminate between the IDA and RDA models (see Supplementary Data 2): the IDA model accounted better for the experimental *ρ*_{d} correlation, while the RDA model accounted better for the experimental *ρ*_{id} correlation. This correlation study is therefore not sufficient to discriminate between the proposed models. We do not discuss the *ρ*_{i} correlation because 3 of the 4 models considered predict the same value of 1/2.

### Is the (*s*_{i}, *S*_{d}) Correlation Real?

The decoupling between replication initiation and cell division was shown by performing independent perturbations to each of those two processes (Si et al., 2019). Specifically, replication initiation was periodically delayed by knocking down the expression of the initiator protein DnaA, yet cell division was left unaffected. Similarly, division was periodically delayed by repressing the expression of the division protein FtsZ, yet replication initiation was left unaffected. This decoupling supports the view that replication and division are independent processes, as advocated in the IDA model. However, several nutrient-limitation growth conditions show a (*s*_{i}, *S*_{d}) correlation which is slightly positive, somewhere between the zero correlation predicted by the IDA model and the value predicted by the RDA model (Supplementary Data 2). This suggests that refinements to the IDA model are still needed to agree with all experimental correlations. In that regard, some recent developments in cell-cycle modeling are promising (Nieto et al., 2020; Jia et al., 2021). For example, in these models the cell cycle is divided into several stages, and transitions between consecutive stages occur with a cell volume-dependent rate. At the expense of a larger number of parameters, such theories can reproduce the observed correlations more finely, for example the mild deviation from the adder principle toward sizer behavior in slow growth conditions (Wallden et al., 2016). Similarly, although *B. subtilis* follows the adder principle, it appears that its cell cycle can be divided into two phases, one exhibiting a sizer correlation and the other exhibiting a timer correlation (Nordholt et al., 2020).

## Perspective and Concluding Remarks

While applying a recently proposed correlation analysis, namely the *I*-value analysis, we became aware of some of its limitations, as explained above. In our view, this illustrates some of the caveats one may encounter when applying correlation analysis. While valuable in various contexts, correlation analysis can lead to erroneous conclusions when additional sources of variability such as experimental and measurement errors are not properly taken into account. For example, adder correlations can emerge from non-adder mechanisms due to measurement errors in the cell radius (Facchetti et al., 2019). Therefore, while correlation analysis is useful to confront models with experimental data, we believe it is important to seek a molecular understanding of a model of the cell cycle.

The question of whether the implementation point of the cell cycle is birth or replication initiation has a long history. Although cell-size control was initially thought to be division centric because the CV of the division size was smaller than that of the doubling time (Schaechter et al., 1962), many have long interpreted the HC model (Cooper and Helmstetter, 1968) and Donachie’s theoretical observation (Donachie, 1968) as supporting an initiation-centric view. The rediscovery of the adder principle, which cannot be explained by the sHC model, has revealed the need to revisit models of the bacterial cell cycle. In this article, we have reviewed one of the latest controversies that has emerged in the field of quantitative bacterial physiology, namely the question of the implementation point of the cell cycle. Based on recent results and single-cell experimental data that we have generated over the last decade, we favor a mixed implementation strategy with two independent adders, namely the IDA model, as the most likely mechanism ruling the *E. coli* and *B. subtilis* cell cycle. Furthermore, the fact that both initiation and division share the same adder phenotype suggests to us that they also must share the same mechanistic principles. That is, initiation and division must require (i) balanced biosynthesis of their initiator proteins, such as DnaA for initiation and FtsZ for division, and (ii) their accumulation to a respective threshold number to trigger initiation and division (Si et al., 2019). Ultimately, these predictions should be tested experimentally to gain mechanistic understanding and to establish their generality beyond correlation analysis.

## Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

## Author Contributions

GLT performed research and wrote the manuscript. FS performed research. DL performed research. SJ supervised the project and wrote the manuscript. All authors contributed to the article and approved the submitted version.

## Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

## Acknowledgments

We thank Martin Howard, Sandeep Krishna, Vahid Shahrezaei, and members of the Hwa and Jun labs at UCSD for critical feedback and discussions. This work was supported by the National Science Foundation (MCB 2016090).

## Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2021.721899/full#supplementary-material

## References

Adiciptaningrum, A., Osella, M., Moolman, M. C., Cosentino Lagomarsino, M., and Tans, S. J. (2015). Stochasticity and homeostasis in the *E. coli* replication and division cycle. *Sci. Rep.* 5:18261. doi: 10.1038/srep18261

Amir, A. (2014). Cell size regulation in bacteria. *Phys. Rev. Lett.* 112:208102. doi: 10.1103/PhysRevLett.112.208102

Bertaux, F., von Kügelgen, J., Marguerat, S., and Shahrezaei, V. (2020). A bacterial size law revealed by a coarse-grained model of cell physiology. *PLoS Comput. Biol.* 16:e1008245. doi: 10.1371/journal.pcbi.1008245

Boye, E., and Nordström, K. (2003). Coupling the cell cycle to cell growth. *EMBO Rep.* 4, 757–760. doi: 10.1038/sj.embor.embor895

Campos, M., Surovtsev, I. V., Kato, S., Paintdakhi, A., Beltran, B., Ebmeier, S. E., et al. (2014). A constant size extension drives bacterial cell size homeostasis. *Cell* 159, 1433–1446. doi: 10.1016/j.cell.2014.11.022

Cooper, S., and Helmstetter, C. E. (1968). Chromosome replication and the division cycle of *Escherichia coli*. *J. Mol. Biol.* 31, 519–540. doi: 10.1016/0022-2836(68)90425-7

Donachie, W. D. (1968). Relationship between cell size and time of initiation of DNA replication. *Nature* 219, 1077–1079. doi: 10.1038/2191077a0

Facchetti, G., Knapp, B., Chang, F., and Howard, M. (2019). Reassessment of the basis of cell size control based on analysis of cell-to-cell variability. *Biophys. J.* 117, 1728–1738. doi: 10.1016/j.bpj.2019.09.031

Harris, L. K., and Theriot, J. A. (2016). Relative rates of surface and volume synthesis set bacterial cell size. *Cell* 165, 1479–1492. doi: 10.1016/j.cell.2016.05.045

Ho, P.-Y., and Amir, A. (2015). Simultaneous regulation of cell size and chromosome replication in bacteria. *Front. Microbiol.* 6:662. doi: 10.3389/fmicb.2015.00662

Iyer-Biswas, S., Wright, C. S., Henry, J. T., Lo, K., Burov, S., Lin, Y., et al. (2014). Scaling laws governing stochastic growth and division of single bacterial cells. *Proc. Natl. Acad. Sci. U.S.A.* 111, 15912–15917. doi: 10.1073/pnas.1403232111

Jia, C., Singh, A., and Grima, R. (2021). Cell size distribution of lineage data: analytic results and parameter inference. *iScience* 24:102220. doi: 10.1016/j.isci.2021.102220

Jun, S., and Taheri-Araghi, S. (2015). Cell-size maintenance: universal strategy revealed. *Trends Microbiol.* 23, 4–6. doi: 10.1016/j.tim.2014.12.001

Jun, S., Si, F., Pugatch, R., and Scott, M. (2018). Fundamental principles in bacterial physiology-history, recent progress, and the future with focus on cell size control: a review. *Rep. Prog. Phys.* 81:056601. doi: 10.1088/1361-6633/aaa628

Le Treut, G., Si, F., Li, D., and Jun, S. (2020). Comment on “Initiation of chromosome replication controls both division and replication cycles in *E. coli* through a double-adder mechanism.” *bioRxiv* [Preprint]. doi: 10.1101/2020.05.08.084376

Long, Z., Nugent, E., Javer, A., Cicuta, P., Sclavi, B., Cosentino Lagomarsino, M., et al. (2013). Microfluidic chemostat for measuring single cell dynamics in bacteria. *Lab Chip* 13, 947–954. doi: 10.1039/c2lc41196b

Micali, G., Grilli, J., Osella, M., and Lagomarsino, M. C. (2018). Concurrent processes set *E. coli* cell division. *Sci. Adv.* 4:eaau3324. doi: 10.1126/sciadv.aau3324

Moffitt, J. R., Lee, J. B., and Cluzel, P. (2012). The single-cell chemostat: an agarose-based, microfluidic device for high-throughput, single-cell studies of bacteria and bacterial communities. *Lab Chip* 12, 1487–1494. doi: 10.1039/c2lc00009a

Nieto, C., Arias-Castro, J., Sánchez, C., Vargas-García, C., and Pedraza, J. M. (2020). Unification of cell division control strategies through continuous rate models. *Phys. Rev. E* 101:022401. doi: 10.1103/PhysRevE.101.022401

Nordholt, N., van Heerden, J. H., and Bruggeman, F. J. (2020). Biphasic cell-size and growth-rate homeostasis by single *Bacillus subtilis* cells. *Curr. Biol.* 30, 2238–2247.e5. doi: 10.1016/j.cub.2020.04.030

Sauls, J. T., Cox, S. E., Do, Q., Castillo, V., Ghulam-Jelani, Z., and Jun, S. (2019). Control of *Bacillus subtilis* replication initiation during physiological transitions and perturbations. *MBio* 10, e02205–19. doi: 10.1128/mBio.02205-19

Schaechter, M., Williamson, J. P., Hood, J. R. Jr., and Koch, A. L. (1962). Growth, cell and nuclear divisions in some bacteria. *J. Gen. Microbiol.* 29, 421–434. doi: 10.1099/00221287-29-3-421

Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., and Hwa, T. (2010). Interdependence of cell growth and gene expression: origins and consequences. *Science* 330, 1099–1102. doi: 10.1126/science.1192588

Serbanescu, D., Ojkic, N., and Banerjee, S. (2020). Nutrient-dependent trade-offs between ribosomes and division protein synthesis control bacterial cell size and growth. *Cell Rep.* 32:108183. doi: 10.1016/j.celrep.2020.108183

Si, F., Li, D., Cox, S. E., Sauls, J. T., Azizi, O., Sou, C., et al. (2017). Invariance of initiation mass and predictability of cell size in *Escherichia coli*. *Curr. Biol.* 27, 1278–1287. doi: 10.1016/j.cub.2017.03.022

Si, F., Le Treut, G., Sauls, J. T., Vadia, S., Levin, P. A., and Jun, S. (2019). Mechanistic origin of cell-size control and homeostasis in bacteria. *Curr. Biol.* 29, 1760–1770.e7. doi: 10.1016/j.cub.2019.04.062

Sompayrac, L., and Maaloe, O. (1973). Autorepressor model for control of DNA replication. *Nat. New Biol.* 241, 133–135. doi: 10.1038/newbio241133a0

Supplementary Information: Numerical Methods (2021). Available online at: https://github.com/junlabucsd/DoubleAdderArticle/tree/frontiers (commit 4d3b79dc8492324f2ee4fc789d6e9a24a8c98c82)

Taheri-Araghi, S., Bradde, S., Sauls, J. T., Hill, N. S., Levin, P. A., Paulsson, J., et al. (2015). Cell-size control and homeostasis in bacteria. *Curr. Biol.* 25, 385–391. doi: 10.1016/j.cub.2014.12.009

Treut, G. L., Le Treut, G., Si, F., Li, D., and Jun, S. (2020). Single-cell data and correlation analysis support the independent double adder model in both *Escherichia coli* and *Bacillus subtilis*. *bioRxiv* [Preprint] doi: 10.1101/2020.10.06.315820

Vashistha, H., Kohram, M., and Salman, H. (2021). Non-genetic inheritance restraint of cell-to-cell variation. *Elife* 10:e64779. doi: 10.7554/eLife.64779.sa2

Voorn, W. J., Koppes, L. J. H., and Grover, N. B. (1993). Mathematics of cell division in *Escherichia coli*: comparison between sloppy-size and incremental-size kinetics. *Curr. Top. Mol. Gen.* 1, 187–194.

Wallden, M., Fange, D., Lundius, E. G., Baltekin, Ö, and Elf, J. (2016). The synchronization of replication and division cycles in individual *E. coli* cells. *Cell* 166, 729–739. doi: 10.1016/j.cell.2016.06.052

Wang, P., Robert, L., Pelletier, J., Dang, W. L., Taddei, F., Wright, A., et al. (2010). Robust growth of *Escherichia coli*. *Curr. Biol.* 20, 1099–1103. doi: 10.1016/j.cub.2010.04.045

Witz, G., Julou, T., and van Nimwegen, E. (2020). Response to comment on “Initiation of chromosome replication controls both division and replication cycles in *E. coli* through a double-adder mechanism.” *bioRxiv* [Preprint]. doi: 10.1101/2020.08.04.227694

Witz, G., van Nimwegen, E., and Julou, T. (2019). Initiation of chromosome replication controls both division and replication cycles in *E. coli* through a double-adder mechanism. *Elife* 8:e48063. doi: 10.7554/eLife.48063.sa2

Keywords: adder, bacterial cell cycle, bacterial cell size control, quantitative microbial physiology, bacterial physiology

Citation: Le Treut G, Si F, Li D and Jun S (2021) Quantitative Examination of Five Stochastic Cell-Cycle and Cell-Size Control Models for *Escherichia coli* and *Bacillus subtilis*. *Front. Microbiol.* 12:721899. doi: 10.3389/fmicb.2021.721899

Received: 07 June 2021; Accepted: 06 October 2021; Published: 26 October 2021.

Edited by:

Monika Glinkowska, University of Gdańsk, Poland

Reviewed by:

Shiladitya Banerjee, Carnegie Mellon University, United States

Ramon Grima, University of Edinburgh, United Kingdom

Copyright © 2021 Le Treut, Si, Li and Jun. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Guillaume Le Treut, guillaume.letreut@czbiohub.org; Suckjoon Jun, suckjoon.jun@gmail.com