Directions for Optimization of Photosynthetic Carbon Fixation: RuBisCO's Efficiency May Not Be So Constrained After All

The ubiquitous enzyme Ribulose 1,5-bisphosphate carboxylase-oxygenase (RuBisCO) fixes atmospheric carbon dioxide within the Calvin-Benson cycle that is utilized by most photosynthetic organisms. Despite this central role, RuBisCO's efficiency surprisingly struggles, with both a very slow turnover rate to products and also impaired substrate specificity, features that have long been an enigma as it would be assumed that its efficiency was under strong evolutionary pressure. RuBisCO's substrate specificity is compromised as it catalyzes a side-fixation reaction with atmospheric oxygen; empirical kinetic results show a trend to tradeoff between relative specificity and low catalytic turnover rate. Although the dominant hypothesis has been that the active-site chemistry constrains the enzyme's evolution, a more recent study on RuBisCO stability and adaptability has implicated competing selection pressures. Elucidating these constraints is crucial for directing future research on improving photosynthesis, as the current literature casts doubt on the potential effectiveness of site-directed mutagenesis to improve RuBisCO's efficiency. Here we use regression analysis to quantify the relationships between kinetic parameters obtained from empirical data sets spanning a wide evolutionary range of RuBisCOs. Most significantly we found that the rate constant for dissociation of CO2 from the enzyme complex was much higher than previous estimates and comparable with the corresponding catalytic rate constant. Observed trends between relative specificity and turnover rate can be expressed as the product of negative and positive correlation factors. This provides an explanation in simple kinetic terms of both the natural variation of relative specificity as well as that obtained by reported site-directed mutagenesis results. 
We demonstrate that the kinetic behaviour shows a lesser rather than more constrained RuBisCO, consistent with growing empirical evidence of higher variability in relative specificity. In summary our analysis supports an explanation for the origin of the tradeoff between specificity and turnover as due to competition between protein stability and activity, rather than constraints between rate constants imposed by the underlying chemistry. Our analysis suggests that simultaneous improvement in both specificity and turnover rate of RuBisCO is possible.

INTRODUCTION
Ribulose 1,5-bisphosphate carboxylase-oxygenase (RuBisCO) is the enzyme responsible for the fixation of carbon derived from atmospheric CO2 as part of the Calvin-Benson cycle that leads to production of the glucose essential for growth in most photosynthetic organisms. However, RuBisCO has a low turnover rate in higher plants (∼3 s⁻¹) and the efficiency of carbon fixation by the enzyme is compromised by a competing reaction with atmospheric O2 that leads to photorespiration at high cost to the organism in terms of both energy and loss of carbon. A recent analysis of $k_{cat}$ and $K_M$ values of several thousand enzymes (Bar-Even et al., 2011) has shown that RuBisCO's catalytic rate, $k_{cat}$, and efficiency ($k_{cat}/K_M$) are not unusually low compared with values of the "average" enzyme (see their Figure 1), even though much lower than fast enzymes at the diffusion-controlled limit, for a variety of reasons including absence of strong evolutionary selection pressure and substrate properties, especially low molecular mass and hydrophobicity, limiting $K_M$ optimization. A later analysis (Bar-Even et al., 2015) showed that enzyme-substrate encounters for the "average" enzyme are not productive ("futile"), again for various reasons. The insights from these analyses are useful in placing RuBisCO's catalytic rate and efficiency in the context of all enzymes, especially the significant dissociation rate for CO2 we find in this work, but nonetheless puzzles remain as RuBisCO has been subject to very strong evolutionary pressure.
To mitigate this apparent torpidity of the enzyme, organisms have co-evolved other strategies for maintaining levels of photosynthesis. The observed large variations in RuBisCO kinetic parameters from photosynthetic organisms in different kingdoms down to different species (Ogren, 1981, 1983) are a consequence of co-evolution with resource allocation into other strategies that lead to enhanced photosynthesis (largely by way of more efficient CO2 and nitrogen utilization) and suppressed photorespiration (Badger and Andrews, 1987; Badger et al., 1998).
Cyanobacterial RuBisCOs are characterized by lower values of activity with CO 2 relative to that of O 2 (the relative specificity, S C/O ) and higher catalytic turnover rates (k C cat ). These organisms utilize a carbon-concentrating mechanism (CCM) which compensates for the lower S C/O and limits photorespiration by increasing the CO 2 /O 2 ratio at the site of fixation, while taking advantage of the higher k C cat by reducing RuBisCO concentration and hence the requirement for nitrogen. Some non-green algae with higher S C/O do not express a CCM but instead the lower k C cat is mitigated by increasing RuBisCO and, hence, higher investment of nitrogen in RuBisCO protein. In higher plants, the kinetic balances and photosynthetic pathways lie somewhere in the middle of these two extremes. In C 3 plants S C/O is generally greater and k C cat less than in C 4 plants expressing CCMs (Yeoh et al., 1980;Seemann et al., 1984;Ghannoum et al., 2005), while others are characterized as C 3 -C 4 intermediate or C 4 -like (Kubien et al., 2008).
Understanding the nature of constraints imposed on RuBisCO's intrinsic efficiency is important for directing future research on photosynthesis. Study of RuBisCO activity has become a focus for improving photosynthesis (Bainbridge et al., 1995;Peterhansel et al., 2008;Gready and Kannappan, 2009;Whitney et al., 2011;Parry et al., 2013;Carmo-Silva et al., 2015) with a major aim of improving crop yields. However, some doubt has been cast on whether it can be significantly improved via mutation because of a hypothesis of "underlying constraints" in the chemistry of the reaction (Tcherkez et al., 2006;Savir et al., 2010;Tcherkez, 2013).
In the present study, we argue that this conclusion may have resulted from unsupported assumptions of the kinetic models and limited data sets used in the analyses. Resolving the precise nature of the constraints imposed on RuBisCO kinetics is clearly pivotal to providing direction of future research into improving photosynthesis. The rate constants (Figure 1) determine, and therefore ultimately limit, the physical binding of substrates, the breaking and formation of chemical bonds, and finally the release of products (Lorimer, 1981;Cleland et al., 1998;Andersson, 2008;Kannappan and Gready, 2008).
Although methods for computing individual rate constants from kinetic data have not been widely implemented for RuBisCO (McNevin et al., 2006), the more commonly measured kinetic parameters are generally functions of these. Here we derive the equations for the kinetic mechanism (Figure 1) and estimate the mean (or expected) values for rate constants using regression analysis. Utilizing the compilation in Table 1, which includes the data used by Savir et al. (2010) in their analysis, we performed our own linear regression analysis on a wider range of data sets. This analysis was extended to other plant data (Galmés et al., 2014; Prins et al., 2016) to assist in validating the results. We found that the rate constants for dissociation of the CO2 and O2 substrates ($k_6$ and $k_{12}$ in Figure 1) are much larger relative to the corresponding catalytic rate than previously assumed and consequently have a significant effect on the kinetics. We also suggest the constraints on RuBisCO may be better explained by competing selection pressures, rather than by positive selection within hypothetical constraints (Tcherkez et al., 2006; Tcherkez, 2013) imposed by the chemical mechanism.
Our results and conclusions are indicative of a less constrained RuBisCO and are consistent with observed variations in the kinetics of a wider range of wild type and mutant RuBisCO that are now available, although such kinetic data is regrettably still sparse.

METHODS
We consider the rate constants $k_i$ for the kinetic mechanism (Figure 1) to be a set of general random variables (Koralov and Sinai, 2007). The expected value, $E(k_i) \equiv \langle k_i \rangle$, is the mean value of $k_i$, i.e., averaged over a number of sequences. In principle these averages can be extracted using both linear and non-linear regression methods to establish functional relationships between the RuBisCO kinetic parameters. As $K_C$ and $K_O$ depend explicitly on $k^C_{cat}$ and $k^O_{cat}$, respectively, we restrict the independent variables (predictors) to $k^C_{cat}$ and $k^O_{cat}$. The dependent (response) variables whose expected values, conditional on $k^C_{cat}$ or $k^O_{cat}$, are determined by regression are then $K_C$, $K_O$ and $S_{C/O}$, e.g., $E(K_C \mid k^C_{cat}) \equiv \langle K_C \rangle$. The Generalized Extreme Studentized Deviate (ESD) test (Rosner, 1983) was used with a P-value of 0.05 to eliminate multiple outliers in the data prior to regression analysis. The regression parameters were then used to estimate the expected values of various terms in the kinetic equations.

FIGURE 1 | The kinetic mechanism of RuBisCO. RuBisCO must first be activated by carbamylation and binding of Mg2+ before it processes three substrates, ribulose bisphosphate (RuBP), and carbon dioxide or oxygen, the complete reactions taking place over several stages (Lorimer, 1981; Cleland et al., 1998; Andersson, 2008; Kannappan and Gready, 2008). RuBP binds first forming a complex (ER) with the activated form of the enzyme (E), followed by enolization of RuBP (ER*) which facilitates binding with the carbon dioxide or oxygen molecule to form the ERC or ERO enzyme-substrate complexes. After hydrolysis, the six-carbon compound formed by the addition of carbon dioxide to RuBP breaks at a C-C bond forming a product complex (EP) which dissociates into two three-carbon compounds, 3-phosphoglyceric acid (PGA), with the addition of two protons. Oxygenation proceeds through analogous steps except that the dissociation products are one PGA molecule and one of 2-phospho-glycolate (PG). Atoms originating from free CO2 and O2 are shown in red, and the oxygen atom originating from the water molecule used for hydration is shown in aqua blue.

We can illustrate the procedure by considering a simpler single-intermediate kinetic mechanism where the Michaelis constant is given by $K_M = (k_{cat} + k_{off})/k_{on}$ (e.g., Roberts, 1977; Farquhar, 1979). Enzyme assays typically provide $K_M$ and $k_{cat}$ but insufficient data to determine $k_{on}$ and $k_{off}$, which are, respectively, the rate constants for the binding and dissociation of substrate (e.g., CO2 or O2). However, if we consider that the rate constants $k_{cat}$, $k_{on}$ and $k_{off}$ randomly fluctuate over a number of sequences, a linear correlation, $\langle K_M \rangle$, may be obtained between $K_M$ and $k_{cat}$ from which the gradient and intercept give the expected values $\langle 1/k_{on} \rangle$ and $\langle k_{off}/k_{on} \rangle$, respectively, and using the approximation $\langle xy \rangle \approx \langle x \rangle \langle y \rangle$ for a finite number of random variables $x$ and $y$, we can hence determine the expected values of the rate constants $k_{on}$ and $k_{off}$. Although $K_M$ is linearly dependent on $k_{cat}$, we should not necessarily expect to observe any correlation, as high variances may be associated with the other two terms, $k_{on}$ and $k_{off}$. Where a linear correlation exists, we may infer that the rate constants $k_{on}$ and $k_{off}$ are fairly constant (low variance), while a non-linear correlation would be consistent with an additional correlation between $k_{cat}$ and at least one of these other two terms. Statistical (regression) methods are here used to show how these different scenarios are represented in the available kinetic data.
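The single-intermediate illustration can be sketched numerically. The following is a minimal sketch and not the authors' analysis code: it generates synthetic ($k_{cat}$, $K_M$) pairs from hypothetical values of $k_{on}$ and $k_{off}$ (`TRUE_K_ON`, `TRUE_K_OFF`, the ±10% noise range and all helper names are assumptions made for illustration), fits a straight line by ordinary least squares, and recovers the rate constants from the gradient (an estimate of 1/k_on) and the intercept (an estimate of k_off/k_on).

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + gradient * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def rate_constants_from_fit(gradient, intercept):
    """Recover k_on and k_off from K_M = (k_cat + k_off) / k_on.

    gradient  ~ <1/k_on>
    intercept ~ <k_off/k_on>, treated as <k_off><1/k_on> (the <xy> ~ <x><y> step)
    """
    k_on = 1.0 / gradient
    k_off = intercept / gradient
    return k_on, k_off


# Hypothetical "true" values, chosen only for this illustration.
TRUE_K_ON = 0.5   # uM^-1 s^-1
TRUE_K_OFF = 3.0  # s^-1

rng = random.Random(0)  # fixed seed so the sketch is reproducible
k_cat = [rng.uniform(1.0, 10.0) for _ in range(200)]
# Each synthetic "sequence" gets k_on and k_off perturbed by up to +/-10%.
K_M = [
    (kc + TRUE_K_OFF * rng.uniform(0.9, 1.1)) / (TRUE_K_ON * rng.uniform(0.9, 1.1))
    for kc in k_cat
]

gradient, intercept = fit_line(k_cat, K_M)
k_on_est, k_off_est = rate_constants_from_fit(gradient, intercept)
print(f"estimated k_on  = {k_on_est:.3f} uM^-1 s^-1")
print(f"estimated k_off = {k_off_est:.3f} s^-1")
```

With low variance in $k_{on}$ and $k_{off}$ the fit recovers values close to the ones used to generate the data; widening the noise range in the sketch degrades the correlation, which is the scenario the text describes for highly variable rate constants.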

Kinetic Equations
In deriving the following kinetic equations for this mechanism (Figure 1) we assumed only that both $k_{10}$ and $k_{16}$ are very much smaller than any of the remaining rate constants (effectively, $k_{10} = k_{16} = 0$). We emphasize that no such approximations ($k_i = 0$) were made anywhere else in the derivation. The Michaelis constants ($K_M$) for carboxylation and oxygenation are

$$K_C = \frac{k^C_{cat} + \gamma_C k_6}{K_R k_5} \qquad (1)$$

$$K_O = \frac{k^O_{cat} + \gamma_O k_{12}}{K_R k_{11}} \qquad (2)$$

The general equation for the specificity of carboxylation relative to that of oxygenation (relative specificity) is then (Equation A25)

$$S_{C/O} = \frac{k^C_{cat}/K_C}{k^O_{cat}/K_O} = \frac{k_5}{k_{11}} \cdot \frac{k^C_{cat}}{k^C_{cat} + \gamma_C k_6} \cdot \frac{k^O_{cat} + \gamma_O k_{12}}{k^O_{cat}} \qquad (3)$$

In Equation (3), the relative specificity ($S_{C/O}$) is formally a function of 10 rate constants $k_5 .. k_9$, $k_{11} .. k_{15}$, five for each of the carboxylation and oxygenation reactions.
In Equations (1, 2), $K_R$ is a function only of rate constants for the enolization step (Equation A22), i.e., independent of carboxylation or oxygenation, and $0 < \gamma < 1$. Both $k^C_{cat}$ and $\gamma_C$ are formally functions of $k_3$, $k_7$, $k_8$ and $k_9$ (Equation A26). Equation (A26) simplifies if $k_7$ is the slow step that determines the maximum catalytic rate ($k^C_{cat} = k_7$). However, we need not make these types of assumptions here, and simply regard $\gamma_C k_6$ and $\gamma_O k_{12}$ as effective dissociation rate constants.

Michaelis Constants
The results of linear regression analysis performed on a number of data sets are summarized in Table 2. The green algae, bacteria and cyanobacteria data in Table 1 and other plant species (Galmés et al., 2014) could not be considered individually for analysis due to the small numbers of observations (N < 3). The log-scale plots (Figures 2A,B) of $K_C$ over the full range of $k^C_{cat}$ values in Table 1 suggest a linear correlation and hence regression analysis of $\ln(K_C)$ on $k^C_{cat}$ ("All data" sets in Table 2). P < 0.05 for both coefficients was obtained only for carboxylation using the "All data" sets (Figures 2A,B), carboxylation using a subset of the C3 plants (Galmés et al., 2014), oxygenation using Triticeae data (Prins et al., 2016) and oxygenation using only the higher plant data (Figure 2C). The residuals were found to be near-normally distributed (Figure 3). Reliable expected values for effective CO2 and O2 dissociation rate constants can be derived from the coefficients in regressions (Table 2) that yield P < 0.05 for both coefficients (i.e., both the gradient and intercept). The results are given in Table 3. For the regression of $\ln(K_C)$ on $k^C_{cat}$, equating the first terms ($a_1 + a_1 b_1 k^C_{cat}$) in the expansion of the exponential form ($a_1 e^{b_1 k^C_{cat}}$) with Equation (1) we find that the value of $K_C$ at $k^C_{cat} = 0$ is given by $\gamma_C k_6/(K_R k_5) = a_1$. From the regression analysis carried out using the full data set in Table 1 (Figure 2A) and the subset utilized by Savir et al. (2010) (Figure 2B), we obtain values of $a_1$ = 9.7 µM and $a_1$ = 4.5 µM, respectively. From the expansion of the exponential we also find that $1/(K_R k_5) \approx a_1 b_1$ at $k^C_{cat} = 0$, where the two estimates are $a_1 b_1$ = 2.2 µM.s and $a_1 b_1$ = 1.5 µM.s, respectively. In Figures 2A,B, Equation (1), which will obviously deviate from the trend line as $k^C_{cat}$ increases, has been graphed using these values.
Combining these results obtained for $\gamma_C k_6/(K_R k_5)$ and $1/(K_R k_5)$ we estimate (at $k^C_{cat} = 0$) expected effective rate constants for CO2 dissociation ($\langle \gamma_C k_6 \rangle$) of 4.3 s⁻¹ and 3.0 s⁻¹, respectively. Assuming the scheme (Figure 1) correctly describes the kinetic mechanism, the deviation from linear behavior suggests there exists at least one type of correlation between rate constants. From Equation (1), the expected value of $K_R k_5$ conditional on $k^C_{cat}$ in terms of regression parameters $a_1$ and $b_1$ is then given by (Figure 4A)

$$\langle K_R k_5 \mid k^C_{cat} \rangle = \frac{k^C_{cat} + \langle \gamma_C k_6 \rangle}{a_1 e^{b_1 k^C_{cat}}} \qquad (4)$$

Therefore, we may also use Equation (4) to define the expected effective dissociation constant conditional on $k^C_{cat}$ as (Figure 4B)

$$\langle K^C_D \mid k^C_{cat} \rangle = \frac{\langle \gamma_C k_6 \rangle}{\langle K_R k_5 \mid k^C_{cat} \rangle} \qquad (5)$$
In Figure 4 it is assumed (Tcherkez et al., 2006) that the exponential increase in K C conditional on k C cat arises from K R k 5 (one correlation effect, i.e., due to CO 2 binding) while γ C k 6 is a constant in Equation (4). Alternatively, in Figure 5 we have assumed that variation arises from γ C k 6 (another correlation effect i.e., due to CO 2 dissociation) while K R k 5 is now the constant. Here the respective constants are the values of γ C k 6 and K R k 5 at k C cat = 0 as determined from the regression (Figure 2B). There is, of course, also the possibility that variability in both K R k 5 and γ C k 6 contribute to the non-linear behavior of K C , i.e., both K R k 5 and γ C k 6 are conditional on k C cat . In general, therefore, we could ascribe any functional dependence for either K R k 5 or γ C k 6 to this nonlinear behavior.
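The estimates at zero catalytic rate can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded regression values quoted in the text (intercept $a_1$ and gradient $a_1 b_1$), and the function and dictionary names are assumptions introduced here. The effective CO2 dissociation rate constant is the intercept divided by the gradient, and the two helper functions compare the exponential trend line with its first-order (linear) expansion.

```python
import math


def effective_dissociation_rate(intercept_uM, gradient_uM_s):
    """gamma_C*k6 at k_cat = 0: (gamma_C*k6 / (K_R*k5)) / (1 / (K_R*k5))."""
    return intercept_uM / gradient_uM_s


def k_c_exponential(k_cat, a1, b1):
    """Regression trend line: K_C = a1 * exp(b1 * k_cat)."""
    return a1 * math.exp(b1 * k_cat)


def k_c_linear(k_cat, a1, b1):
    """First two terms of the expansion: K_C ~ a1 + a1*b1*k_cat."""
    return a1 + a1 * b1 * k_cat


# (a1 in uM, a1*b1 in uM.s), rounded values as quoted in the text.
FITS = {
    "All data (Figure 2A)": (9.7, 2.2),
    "Savir et al. subset (Figure 2B)": (4.5, 1.5),
}

for label, (a1, a1_b1) in FITS.items():
    b1 = a1_b1 / a1  # s
    gamma_k6 = effective_dissociation_rate(a1, a1_b1)
    print(f"{label}: gamma_C*k6 ~ {gamma_k6:.1f} s^-1, b1 ~ {b1:.2f} s")
    for k_cat in (1.0, 3.0, 10.0):
        print(
            f"  k_cat = {k_cat:4.1f} s^-1: exponential K_C = "
            f"{k_c_exponential(k_cat, a1, b1):6.1f} uM, "
            f"linear K_C = {k_c_linear(k_cat, a1, b1):6.1f} uM"
        )
```

From the rounded inputs the first ratio comes out near 4.4 s⁻¹ rather than the reported 4.3 s⁻¹; the small difference presumably reflects rounding of the published coefficients. The growing gap between the exponential and linear values at higher catalytic rates is the deviation from Equation (1) that the text attributes to correlation between rate constants.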
For the regression of $K_O$ on $k^O_{cat}$ (Figure 2C), we have included only the data for all higher plants (Table 1). Unlike the above regressions of $K_C$ on $k^C_{cat}$ there are no indications of any deviations from linear behavior. The graph of $K_O$ on $k^O_{cat}$ for the higher plants in particular clearly conforms to a linear function, and the residuals of regressed $K_O$ data are near normally distributed (Figure 3). From the intercept we find the expected value of the dissociation constant, $\gamma_O k_{12}/(K_R k_{11})$ (Equation 6), and from the gradient we obtain the constant $1/(K_R k_{11}) \approx 280$ µM.s (Equation 7).
From Equations (6, 7) we estimate the expected value of the effective O2 dissociation rate constant, $\langle \gamma_O k_{12} \rangle \approx 0.3$ s⁻¹. Finally, from the above determinations of $1/(K_R k_5)$ (from Figure 2B) and $1/(K_R k_{11})$ we can estimate the expected CO2 to O2 ratio of the rate constants for binding at $k^C_{cat} = 0$ as $k_5/k_{11} \approx 190$.
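As a worked check of the binding ratio, the factor $K_R$ cancels when the two regression gradients are divided. The short derivation below uses only the rounded values quoted in the text (1.5 µM.s from Figure 2B and 280 µM.s from Figure 2C), so it should be read as an arithmetic illustration rather than a re-analysis.

```latex
\[
\frac{k_5}{k_{11}}
  = \frac{1/(K_R k_{11})}{1/(K_R k_5)}
  \approx \frac{280\ \mu\mathrm{M\,s}}{1.5\ \mu\mathrm{M\,s}}
  \approx 187
  \approx 1.9 \times 10^{2}
\]
```

To two significant figures this reproduces the ratio of about 190 given for $k^C_{cat} = 0$.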

Relative Specificity
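The graph of reciprocal relative specificity, $S_{O/C} = 1/S_{C/O}$, against $k^C_{cat}$ (Figure 2D) suggests a linear dependence. The residuals of regressed $S_{O/C}$ data are near normally distributed (Figure 3). We first consider the expected value of $S_{C/O}$ conditional on $k^C_{cat}$ as the reciprocal of the equation for the straight line that describes $S_{O/C}$, i.e.,

$$\langle S_{C/O} \mid k^C_{cat} \rangle = \frac{1}{a_2 + b_2 k^C_{cat}} \qquad (8)$$

where $a_2 = 7.4 \times 10^{-3}$ mol/mol and $b_2 = 1.2 \times 10^{-3}$ s are the regression parameters (Figure 2D). Although Equation (8) generally provides a good fit to the data (Figure 6), it clearly does not display the correct limiting behavior as $k^C_{cat}$ approaches zero (Equation 3). Using the regression of $K_C$ on $k^C_{cat}$ (Figure 2A), the expected value of $S_{C/O}$ conditional on $k^C_{cat}$ can be written as

$$\langle S_{C/O} \mid k^C_{cat} \rangle \approx \frac{k^C_{cat}}{a_1 e^{b_1 k^C_{cat}}} \, \langle S_O^{-1} \mid k^C_{cat} \rangle \qquad (9)$$

where $S_O = k^O_{cat}/K_O$ is the specificity of oxygenation. As there are no correlations between $k^C_{cat}$ and $k^O_{cat}$ (Figure 7A) or $K_O$ (Figure 7B), $S_O$ is also not correlated (Figure 7C), and so (Figure 6)

$$\langle S_{C/O} \mid k^C_{cat} \rangle \approx \frac{k^C_{cat}}{a_1 e^{b_1 k^C_{cat}}} \, \langle S_O^{-1} \rangle \qquad (10)$$

Assuming correlation (Figure 2B) arises from CO2 binding, the factor $k_5/k_{11}$ implicit in Equation (10), that is also conditional on $k^C_{cat}$, is estimated by (Equations 4, 7, Figure 4A)

$$\left\langle \frac{k_5}{k_{11}} \,\Big|\, k^C_{cat} \right\rangle \approx \langle K_R k_5 \mid k^C_{cat} \rangle \left\langle \frac{1}{K_R k_{11}} \right\rangle \qquad (11)$$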
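The limiting behavior of the reciprocal straight-line fit can be seen by evaluating it directly. The sketch below assumes Equation (8) has the form the text describes, the reciprocal of the line $a_2 + b_2 k^C_{cat}$ fitted to $S_{O/C}$, and uses the quoted regression parameters; the function name and the sample $k^C_{cat}$ values are chosen here for illustration.

```python
A2 = 7.4e-3  # mol/mol, intercept of the S_O/C regression (Figure 2D)
B2 = 1.2e-3  # s, gradient of the S_O/C regression (Figure 2D)


def specificity_reciprocal_line(k_cat_c, a2=A2, b2=B2):
    """Expected S_C/O as the reciprocal of the fitted line a2 + b2*k_cat."""
    return 1.0 / (a2 + b2 * k_cat_c)


for k_cat_c in (0.0, 0.81, 3.43, 10.0):
    s = specificity_reciprocal_line(k_cat_c)
    print(f"k_cat = {k_cat_c:5.2f} s^-1 -> S_C/O ~ {s:6.1f} mol/mol")
```

At $k^C_{cat} = 0$ this form returns $1/a_2$, roughly 135 mol/mol, instead of falling to zero, which is the incorrect limit the text points out; at the wild-type tobacco value of 3.43 s⁻¹ it gives a value in the high 80s, close to the measured 81 mol/mol.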

Mutant Example
We use Equation (3) to rationalize the in vitro kinetic data for the Leu to Val mutation at position 335 (L335V) in tobacco (Whitney et al., 1999). The decrease in $k^C_{cat}$ from 3.43 s⁻¹ in the wild type to 0.81 s⁻¹ in the mutant is accompanied by a large decrease also in $S_{C/O}$ from 81 to 20 mol/mol. In Figure 8, $S_{C/O}$ is plotted against $k^C_{cat}$ assuming that in Equation (3) the term multiplying $k^C_{cat}/(k^C_{cat} + \gamma_C k_6)$ is constant on the curve, i.e.,

$$S_{C/O} \propto \frac{k^C_{cat}}{k^C_{cat} + \gamma_C k_6} \qquad (12)$$

We determine the constant factor such that $S_{C/O}$ = 81 mol/mol for the wild-type tobacco at the two limits ($k^C_{cat} \gg \gamma_C k_6$ and $k^C_{cat} \ll \gamma_C k_6$) for specific values of $\gamma_C k_6$ = 1, 2, 3 and 4 s⁻¹. Note that in the limit $k^C_{cat} \gg \gamma_C k_6$ we obtain $S_{C/O} = k_5/k_{11}$ = 81 mol/mol, while the lower limit for $k^C_{cat} \ll \gamma_C k_6$ gives $S_{C/O}$ = 0. The remaining kinetic parameters [$K_C$ = 10.7 µM, $k^O_{cat}$ = 1.17 s⁻¹, $K_O$ = 295 µM for wild type, and $K_C$ = 5.1 µM, $k^O_{cat}$ = 0.39 s⁻¹, $K_O$ = 48.9 µM for the mutant] (Whitney et al., 1999) can be used to simply determine the expected value of the ratio $k_5/k_{11}$ (Equation 13) from the differences, $\Delta$, between wild type and mutant.
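The dependence of relative specificity on the effective CO2 dissociation rate constant can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for Figure 8: it takes the factor $k^C_{cat}/(k^C_{cat} + \gamma_C k_6)$ from Equation (3), holds everything else constant, and fixes that constant so each curve passes through the wild-type tobacco point (3.43 s⁻¹, 81 mol/mol). That normalization and all function names are choices made here for illustration.

```python
WT_K_CAT = 3.43   # s^-1, wild-type tobacco (Whitney et al., 1999)
WT_S = 81.0       # mol/mol, wild-type tobacco
MUT_K_CAT = 0.81  # s^-1, L335V mutant
MUT_S = 20.0      # mol/mol, L335V mutant


def dissociation_factor(k_cat_c, gamma_k6):
    """Fraction of bound CO2 that proceeds to catalysis rather than dissociating."""
    return k_cat_c / (k_cat_c + gamma_k6)


def specificity_curve(k_cat_c, gamma_k6, ref_k_cat=WT_K_CAT, ref_s=WT_S):
    """S_C/O along a curve of constant prefactor passing through the reference point."""
    constant = ref_s / dissociation_factor(ref_k_cat, gamma_k6)
    return constant * dissociation_factor(k_cat_c, gamma_k6)


for gamma_k6 in (1.0, 2.0, 3.0, 4.0):
    predicted = specificity_curve(MUT_K_CAT, gamma_k6)
    print(
        f"gamma_C*k6 = {gamma_k6:.0f} s^-1: S_C/O at k_cat = {MUT_K_CAT} s^-1 "
        f"~ {predicted:.1f} mol/mol (observed {MUT_S:.0f})"
    )
```

Under this normalization, larger values of $\gamma_C k_6$ give a steeper fall in $S_{C/O}$ as $k^C_{cat}$ drops from the wild-type to the mutant value, which is the qualitative trend the text attributes to CO2 dissociating before catalysis.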

Significant Dissociation of CO 2 and O 2 Substrates
The trend lines (Figure 2) clearly intercept the vertical axes well above zero, indicating significant expected values for the dissociation constants $\gamma_C k_6$ and $\gamma_O k_{12}$. However, the rate constant for CO2 dissociation has been previously estimated as not more than about 5% of $k^C_{cat}$ (Pierce et al., 1986; McNevin et al., 2007), so that it has generally been assumed that $k^C_{cat}/(k^C_{cat} + \gamma_C k_6) \approx 1$. Our estimates (Figures 2A,B, Table 3) of the expected value (at least for low $k^C_{cat}$) are much higher, and find support in the kinetics modeling study of RuBisCO from spinach. We find that the expected values of dissociation rate constants ($\gamma_C k_6$) for the substrate CO2 are 4.3 s⁻¹ (Figure 2A), 3.0 s⁻¹ (Figure 2B), and 3.1 s⁻¹ for a subset of C3 plants (Galmés et al., 2014; Prins et al., 2016), noting that the differences are not statistically significant (Table 3). These values can be compared with 1.6 ± 1.1 s⁻¹ estimated for the CO2 dissociation rate constant in spinach (McNevin et al., 2006, 2007), and the 5–10 µM range of $K^C_D$ for lower values of $k^C_{cat}$ (Figures 4B, 5B) is also consistent with a $K^C_D = k_6/k_5$ of 3 µM for spinach RuBisCO (McNevin et al., 2006). The effective CO2 dissociation rate constant, $\gamma_C k_6$, impacts the $k^C_{cat}$ dependence of $S_{C/O}$ (Figure 8). As $k^C_{cat}$ approaches $\gamma_C k_6$, Equation (12) describes the rapid decline in $S_{C/O}$ due to increasing probability that the CO2 will dissociate from RuBP before catalysis takes place. The observed values of $k^C_{cat}$ and $S_{C/O}$ for the L335V mutant (Whitney et al., 1999) are entirely consistent with a $\gamma_C k_6$ greater than $k^C_{cat}$. The expected value of $k_5/k_{11}$ as given by Equation (13) is also consistent with the value obtained when averaged over a larger number of RuBisCOs with lower $k^C_{cat}$ (Figure 6). Thus, changes in the gas-substrate binding in the mutant RuBisCO appear to be minimal, the bulk of the effect being described by Equation (12).
The dissociation rate constant of O2 is generally considered effectively zero (Tcherkez, 2013, 2016). However, although the expected value of 0.4 ± 0.4 s⁻¹ for $\gamma_O k_{12}$ in higher plants obtained here (Table 3) is significantly lower than the mean $k^O_{cat}$ of 1.3 ± 0.2 s⁻¹ (from data in Table 1) it is still sufficient to have an impact on $K_O$ (Equation 2). Additionally, the expected value of 2.3 ± 1.9 s⁻¹ for $\gamma_O k_{12}$ in Triticeae (Table 3) and the corresponding mean $k^O_{cat}$ of 0.83 ± 0.16 s⁻¹ (Prins et al., 2016) are not significantly different. Statistical analysis of the available data therefore suggests the expected (or average) value of the dissociation rate is not significantly lower than that of the catalytic rate. Moreover, a knowledge of rate differences in any particular RuBisCO requires more kinetic data than is currently available. Consequently, there is no justification for generally neglecting either of the dissociation rate constants, $\gamma_C k_6$ or $\gamma_O k_{12}$, i.e., assuming they are an order of magnitude or more lower than the corresponding catalytic rates, as has been done previously (Tcherkez, 2013, 2016).

The Tight-Binding Hypothesis
Assuming $k_5/k_{11}$ decreases with increasing $k^C_{cat}$ (Equation 11, Figure 6), it could be regarded as a proxy for $S_{C/O}$ (Tcherkez, 2013). Also as the specificity of oxygenation, $S_O$, is not correlated with $k^C_{cat}$ (Figure 7C), the variation in $k_5/k_{11}$ would be largely constrained to the dependence of $k_5$ on $k^C_{cat}$ (Figure 4A). It has been hypothesized (Tcherkez et al., 2006; Tcherkez, 2013) that such a constraint is to be expected from the predicted energetics of the reaction as tighter binding of CO2 to ribulose bisphosphate (increasing $k_5$) would necessarily raise the activation free energy (decreasing $k^C_{cat}$) required for the subsequent steps leading to turnover of product. However, the generality of this tight-binding (TB) hypothesis has come under question (Hanson, 2016) for its inability to explain the variations in $S_{C/O}$ that have been observed in some RuBisCOs (Young et al., 2016). It would seem that the TB hypothesis suffers from a more fundamental problem in that it is based on an incomplete and unrepresentative data distribution. In the present analysis, Equation (8) provides the better fit ($R^2$ = 0.63) to the selected data (Figure 2D), although it is not the more general equation for $S_{C/O}$ (Equation 10, Figure 6). Similar types of relationships that provide an even tighter fit to the data have been reported elsewhere (Savir et al., 2010).
The TB hypothesis is posited on $k_5/k_{11}$ determining the dependence of $S_{C/O}$ on $k^C_{cat}$. Significantly, all of these analyses are in fact conditional on $k^C_{cat} \gg \gamma_C k_6$, i.e., neglect of the CO2 dissociation rate constant, $k_6$. However, the high level of variance in $K_C$ and $S_{C/O}$ (Figures 2A, 6, respectively) argues for a more cautious data interpretation in the regression analysis. Statistically, the quadratic (Savir et al., 2010) and exponential (Figure 2B) forms both describe the dependence of $K_C$ on $k^C_{cat}$ equally well, but only the latter, more general case (Equations 3, 10), allows nonzero values for $\gamma_C k_6$ (Figure 5A).

Rate Constants May Not Be Highly Correlated
The deviation of any given data point (Figure 6) from the expected value (Equation 10) can be attributed to variations in the parameters of Equation (3). We expect that $S_O$ will generally produce random variations in $S_{C/O}$ (Figure 7C), although possibly lower $k_{11}$ (higher $K_O$, Figure 2C) for the cyanobacteria may in part account for a systematic reduction in $S_{C/O}$. The CO2 dissociation term, $\gamma_C k_6$, will certainly become apparent at low enough $k^C_{cat}$ values (Figures 4, 5). In particular, variations in $\gamma_C k_6$ may contribute significantly to the large variance seen in the non-green algae (Figures 2A, 6). If the catalytic rate correlates with $k_5$, regression analysis defines only the first moment, $\langle k_5 \rangle$, of the distribution (Figure 4A and Equation 11, Figure 6), and provides no information on the variance. In the absence of any coupling, mutations produce random changes in the underlying rate constants, $k_i$. Irrespective of whether rate constants are correlated, the expected value of $k_i$ is given by $\langle k_i \rangle = \frac{1}{n} \sum_{s}^{n} k^s_i$ where $k^s_i$ is the value of a rate constant for a given sequence ($s$). In reality, the composition of the sequence space, Ω (i.e., any number of known sequences), will be determined in varying degrees by genetic drift and natural selection, as these determine the probability that a mutation becomes fixed. If the variations in $k^s_i$ themselves are entirely random (zero correlation), we might expect both $S_{C/O}$ and $k^C_{cat}$ at the high end of their observed values, as there is nothing to constrain them and the combined effect should have become fixed in some species by positive selection. The TB hypothesis attempts to explain this absence of both high $S_{C/O}$ and high $k^C_{cat}$ by positive selection processes occurring within particular constraints (Figures 4A, 6) imposed on the chemical reaction steps (Tcherkez et al., 2006), but it may also be explained by competing selection pressures.
The essential difference is that the origin of the evolutionary constraints is shifted from k s i to Ω.

Competing Selection Pressures May Constrain RuBisCO
From a biophysical perspective, thermodynamic stability is recognized as the most important constraint on the evolution of proteins and their ability to acquire new function (Tokuriki and Tawfik, 2009;Sikosek and Chan, 2014). The necessity of a protein to maintain the integrity of its folded structure despite the destabilizing effects of accumulated mutations results in only a small percentage being fixed by positive selection. Consequently, in the evolution of C 3 to C 4 plants, destabilizing mutations that are selected on the basis of improved activity are followed by mutations that restore stability with little impact on activity (Studer et al., 2014). This leads to an apparent tradeoff between activity and stability that may well limit the ability of RuBisCO to fix the number of mutations required to increase both S C/O and k C cat . Depending on the sub-cellular CO 2 /O 2 ratio, the fixed mutations increase specificity (for low ratio) or catalytic rate (for high ratio), or a varying combination of both, whichever best optimizes photosynthesis.

Potential for Optimizing Carbon Fixation
The origin of the constraint(s) has significant implications for the optimization of RuBisCO activity. If the constraint is on Ω (i.e., from competing selection pressures) rather than k s i , greater variability may be exhibited. To what extent the functional limits of RuBisCO are reflected in the minimum and maximum values of kinetic parameters is not yet clear for RuBisCOs with higher k C cat because of the absence of empirical data. Much effort has been directed toward research on higher plants with particular emphasis on the evolution of C 3 to C 4 plants with their associated CCMs, although the recent work on diatoms may now help stimulate investigations into a more diverse range of photosynthetic organisms (Hanson, 2016;Young et al., 2016). Diatoms and C 3 plants share very similar k C cat , although the variance, var(S C/O ), for diatoms is relatively large (with corresponding variations in CCM expression), whereas for C 3 plants var(S C/O ) is barely significant (Figure 6, Table 1). This could raise the possibility of improving specificity, if not k C cat , in higher plants. It is perhaps not surprising that the non-green (red) algae, from which diatoms have evolved with somewhat lower k C cat values, also exhibit high var(S C/O ) (Figure 6). The data distributions are incomplete (Figures 2A, 6, Table 1); there is a scarcity of data for green algae, photosynthetic bacteria and cyanobacteria, with k C cat values between 6 s −1 and 14 s −1 . Discoveries of significant variance among these also may provide important clues on how to achieve increases in both k C cat and S C/O in higher plants.

CONCLUSION
The results of our regression analysis of updated RuBisCO kinetic data sets suggest that CO2 dissociation from the RuBisCO gas-addition complex is generally more important in rationalizing the observed variations in the kinetics of RuBisCO than hitherto assumed (Tcherkez et al., 2006; Tcherkez, 2013). Moreover, we have identified significant variations in the statistical correlations between $K_M$ and $k_{cat}$ in higher plants, i.e., the non-linear correlation for carboxylation as opposed to the linear correlation for oxygenation. These findings cast doubt on the hypothesis (Tcherkez et al., 2006; Savir et al., 2010; Tcherkez, 2013) that RuBisCO is so tightly constrained by the active-site chemistry that its activity is effectively optimized. Rather, the current body of kinetic parameters exhibits far more plasticity than this hypothesis predicts. We suggest that more attention be given to the possibility that the apparent tradeoff observed between $k^C_{cat}$ and $S_{C/O}$ arises from competing selection pressures on RuBisCO activity and stability (Studer et al., 2014). The relative strengths of these selection pressures would determine the strength of the constraints and, thus, the possibilities of improving the kinetics of RuBisCO by site-directed mutagenesis. Indeed, although published comments (Griffiths, 2006; Gutteridge and Pierce, 2006) on the paper of Tcherkez et al. (2006) noted the vastness of sequence space that would need to be sampled, neither expressed optimism that a rational method to increase the efficiency of such a search was possible, merely noting (Griffiths, 2006) directed evolution as a possibility. However, a method to reduce the sequence-search space for RuBisCO has since been reported in a patent (Gready and Kannappan, 2009).
In summary, there is still wide conjecture in the literature regarding the mechanisms by which plants ultimately regulate photosynthesis (Igamberdiev, 2015), and the absolute limitations of RuBisCO functionality have only been partly explored, as recent studies (Hanson, 2016;Young et al., 2016) suggest. Consequently, the potential for increasing both the catalytic turnover and relative specificity in higher plants with the view to improving photosynthesis remains to be fully tested. As argued (Hanson, 2016), kinetic data for a wider diversity of RuBisCOs are much needed and will likely prove useful in guiding the reengineering of higher-plant RuBisCOs with both significantly higher turnover rate and specificity. Our analysis suggests that such simultaneous improvement in both specificity and turnover rate is possible, and that competing selection pressures of activity and stability better explain the nature of constraints. Improved understanding of these competing selection pressures is much needed.

AUTHOR CONTRIBUTIONS
PC, BK, and JG designed and performed the research, wrote the paper and approved it for submission.