Occlusion of the Ribosome Binding Site Connects the Translational Initiation Frequency, mRNA Stability and Premature Transcription Termination

Protein production is controlled by ribosome binding to the messenger RNA (mRNA), quantified in part by the binding affinity between the ribosome and the ribosome binding sequence on the mRNA. Using the E. coli lac operon as a model, Ringquist et al. (1992) found a more than 1,000-fold difference in protein yield when varying the Shine-Dalgarno sequence and its distance to the translation start site. Their proposed model accounted for this large variation solely by a variation in the binding affinity and the subsequent initiation rate. Here we demonstrate that the decrease in protein yield with weaker ribosome binding sites is additionally caused by decreased mRNA stability and by an increased rate of premature transcription termination. Using different ribosome binding site sequences of the E. coli lacZ gene, we found that an approximately 100-fold span in protein expression could be subdivided into three mechanisms that each affected expression 3- to 6-fold. Our experiments are consistent with a two-step ribosome initiation model, in which occlusion of the initial part of the mRNA by a ribosome simultaneously protects the mRNA from both premature transcription termination and degradation. We suggest that premature termination is coupled to the absence of occlusion, which allows binding of a transcription termination factor, possibly Rho. The mRNA stability is explained by occlusion that prevents binding of the degrading enzymes. In our proposed scenario, an mRNA with a lower translation initiation rate would at the same time be “hit” by increased premature termination and a shorter lifetime. Our model further suggests that the transcription from most if not all natural promoters is substantially influenced by premature termination.


INTRODUCTION
The protein yield per mRNA is traditionally thought to be a relatively simple function of ribosome binding to the ribosome binding site (RBS). A study of the lacZ gene from E. coli (Ringquist et al., 1992) extended this view and quantified how the expression of a gene was affected by the Shine-Dalgarno (SD) sequence, by its distance from the translation start, and by the first codons.
In that study the amount of β-galactosidase produced spanned a more than thousand-fold range. This huge range in protein yield indicated that the mRNAs with a weak RBS were only rarely exposed to a ribosome. Given the current knowledge of coupling between the first ribosome and the RNA polymerase (Burmann et al., 2010; Proshkin et al., 2010), one might question how the RNA polymerase is able to transcribe such low-yield mRNAs at all. In the study of Ringquist et al., only the β-galactosidase expression levels in the various lacZ variants were measured. These data are therefore not sufficient to determine whether an altered translation initiation rate is the only direct cause of the differences in the amount of expressed protein.
We have previously shown that the functional stability of an mRNA depends on the ribosome occupancy in the initial codon region, extending to include the first 50 codons on the mRNA. The mechanism for this, we envisaged, was that a lower ribosome occupancy on this part of the mRNA generated a larger target for degradation of the mRNA (Pedersen et al., 2011). It is therefore natural to suggest that the reported low expression for mRNAs with a weak RBS was also associated with the message stability. In this paper, we report the mRNA stabilities from a selected subset of the lacZ messages from the study of Ringquist et al. (1992).
In bacterial transcription, it is essential that a contact between the elongating RNA polymerase and the initial ribosome is established. If the contact is broken, for example by a translational stop codon, the transcription is terminated by transcription termination factor Rho (Beckwith, 1963; Newton et al., 1965). Altering the RBS will affect the time it takes for the first ribosome to contact the RNA polymerase. For a weak RBS, it may take so long that the transcription terminates. We therefore also measured the premature transcription termination in these lacZ SD variants, and found substantial termination for weak SD sequences.
To facilitate cloning of the altered RBS in the previous study (Ringquist et al., 1992), a change of the early lacZ codons was needed in order to generate a restriction enzyme site. Such a change might alter the occlusion time (the time a preceding ribosome masks the RBS) of the ribosome binding site or the binding constant itself, and thereby alter the initiation frequency. Thanks to the development of the recombineering technique (Sawitzke et al., 2007) one can now insert sequences at will. In order to compare the results both to the previous study and to the expression of the wild-type (wt) lacZ gene used in our study (Pedersen et al., 2011), we constructed seven lacZ variants: sd5-lacZ-rq, SD12-lacZ-rq, and SD8-lacZ-rq, with the lacZ coding sequence used by Ringquist et al. (1992), and also sd5-lacZ-wt, SD12-lacZ-wt, SD8-lacZ-wt, and wt-SD-lacZ-wt. Throughout our paper we use the nomenclature introduced by Ringquist et al. (1992), where sd5 means the reasonably good SD sequence AAGGA located 5 nucleotides from the start codon, and SD8 and SD12 mean the very strong SD sequence TAAGGAGGA located 8 and 12 nucleotides from the start codon, respectively. The relevant parts of these lacZ variants were sequenced and are shown in Table 1.
All plasmids are derived from pMAS2, a derivative of pBR322 carrying the wt lacZ (Sørensen et al., 1989). The variants in lacZ were made by recombineering (Sawitzke et al., 2007) in HME70 F' lacIq1 lacZ::Tn5. We first introduced a UAG stop codon at codon 3 of the lacZ gene in pMAS2 and isolated the resulting variant, pMET1 (white on X-gal plates). We then made variants with the desired sequences, screening for blue recombinants on X-gal plates.
We next reconstructed the complete lac operon in the above variants by cloning an EcoRI-EcoRI fragment (from the EcoRI site late in lacZ to the EcoRI site in the cam gene). This fragment contained lacY and lacA and came from pSB4027, which carries the complete lac operon on the vector pACYC184 (pSB4027 was provided by S. Brown).
The assays for LacZ and LacA enzyme activity were performed essentially as described (Miller, 1972), but without the inducer, IPTG, in the overnight culture. The activities of both enzymes after induction were plotted against the OD of the culture at the sampling time. The slope of the resulting straight line is given in "Miller Units." The lacZ mRNA functional half-life was determined at an ambient temperature of 37°C as described (Pedersen et al., 2011). We induced the cultures for 5 or 10 min and then removed the inducer by filtration, washed the filter twice, and re-suspended the cells in inducer-free medium. Next, one-milliliter samples were labeled at 20 s intervals with approximately 5 µCi carrier-free [35S]-methionine, followed by a chase with a more than 1,000-fold excess of unlabeled methionine. After electrophoresis on a standard 7.5% SDS-acrylamide gel, the protein bands were visualized on a phosphor image screen and quantified using the NIH-ImageJ software: imagej.nih.gov/ij/. Bands corresponding to the LacZ protein were normalized to the bands of the β/β′ RNA polymerase subunits. The labeling of these was not detectably influenced by the presence or absence of the lac inducer, 1 mM IPTG. The mRNA half-life was determined from the slopes in semi-logarithmic plots of the labeling of β-galactosidase relative to the radioactivity in the β and β′ subunits of the RNA polymerase vs. the time of labeling.
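The half-life determination from a semi-logarithmic plot can be sketched as follows. This is only an illustration under stated assumptions: the function name `half_life_from_decay` is hypothetical, the fit is an ordinary least-squares line through ln(signal) vs. time, and the data points are synthetic (generated from a 116 s half-life), not measurements from this study.

```python
import math

def half_life_from_decay(times_s, relative_labeling):
    """Fit a straight line to ln(signal) vs. time by ordinary least squares
    and convert the (negative) slope to a half-life in seconds."""
    logs = [math.log(y) for y in relative_labeling]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_s, logs))
    slope = sxy / sxx                 # units: 1/s, negative for a decaying signal
    return -math.log(2) / slope

# Synthetic, noise-free example (NOT experimental data): samples every 20 s,
# LacZ labeling relative to the beta/beta' bands decaying with a 116 s half-life.
times = [20 * i for i in range(10)]
signal = [math.exp(-math.log(2) * t / 116.0) for t in times]
estimated_half_life = half_life_from_decay(times, signal)
```

With real gel quantifications the points scatter around the line, so the quality of the fit should be inspected before the slope is trusted.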

Experimental Results
As a model for quantifying translation efficiency we use variants of the lacZ mRNA that differ in the strength of the RBS, in the distance of this sequence to the translational start site, and in the first few codons (two sets). To differentiate the mechanisms that regulate the protein yield from each of these seven sequences we measured three different quantities: (1) the total protein yield, (2) the stability of the mRNA, and (3) the extent of premature transcription termination.

Protein Yield: Expression of β-galactosidase
The measured specific enzymatic activities from these lacZ variants, normalized to the enzymatic activity measured from lacZ-wt, are shown in Table 2, column A. We compared the specific enzymatic activity of the LacZ-rq enzyme to that of the wt enzyme and found it to be 85% of that of the wt enzyme. This was done by calibrating the enzymatic activity of the two enzymes to their actual amount as measured by the [35S]-methionine-labeled β-galactosidase bands in steady state of induction (data not shown). Table 2, column A shows the normalized protein yield corrected by this factor.
The results confirm that the protein expression is strongly influenced by the SD sequence, by its distance to the start site, and by changes of the early codons, in agreement with the previous results (Ringquist et al., 1992).

mRNA Stability
The production of proteins depends on the lifetime of functional mRNA, which in turn can be measured by recording how the capacity for protein synthesis decays after transcription is suddenly stopped. The protocol for the pulse-chase experiment is described in the methods section. The measured half-lives for the seven lacZ variants are given in Table 2, column B. One sees that the approximately 4-fold variation in the mRNA half-life can explain part of the variation in the total protein yield. However, the measured β-galactosidase yields in column A differed more than the mRNA half-lives. In column C, we highlight this difference by tabulating the relative protein yield per functional mRNA. The variation in protein yield per functional mRNA in column C may be attributed to either a variation in the translational initiation frequency or a change in the frequency of successful transcriptions.

Premature Transcription Termination
Even though our experimental setup is aimed at quantifying differences in translation while keeping the promoter and the first 30 nucleotides of the untranslated region of the mRNA unchanged, one should not forget that the total protein yield is proportional to the number of completed transcription events, which may be reduced by transcription termination. One mechanism would be premature transcription termination in lacZ, e.g., associated with a lack of coupling between the first ribosome and the RNA polymerase.
TABLE 2 | (fragment of the table notes; the table itself is not reproduced here) ... (Mitarai et al., 2008; 2.2/column E). Column G, the first ribosome initiation time $t_{\mathrm{first}}$, obtained by subtracting the occlusion time (∼1 s) from column F.
The lac operon consists of the genes lacZ, lacY, and lacA, and the lac mRNA has been found to be cleaved intergenically between lacZ and lacY-lacA (Cannistraro et al., 1986; Li and Altman, 2004). The lacZ and lacA mRNAs thus have independent half-lives (Petersen, 1987; Murakawa et al., 1991). Consequently, lacA expression will be unaffected by changes in lacZ unless an event in lacZ terminates the RNA polymerase and in this way lowers transcription of lacA. This assay was developed as the classical assay for polarity: termination by transcription termination factor Rho if the leading ribosome fails to contact the polymerase, for instance if translation is terminated by an early stop codon (Beckwith, 1963; Newton et al., 1965; Richardson et al., 1975). Such a premature termination event might also affect messengers with a weak RBS if the leading ribosome in these takes too long to contact the polymerase because of the reduced initiation frequency.
To measure premature termination, we completed the lac operon in these lacZ variants by inserting an EcoRI-EcoRI fragment that contained the lacYA part of the lac operon. We next measured the LacA transacetylase activity in these lacZ variants. Table 2, column E shows the fraction of successful transcription events relative to that of the wt. For the weakest SD sequence, sd5, 16 or 84% of the RNA polymerases terminate before reaching lacA, depending on the lacZ coding sequence. Notably, the stronger SD sequences are less sensitive to premature termination. Note that the lacA expression in the variant with the strongest lacZ RBS (SD8-lacZ-wt) is higher than that of the wt lacZ operon, indicating that only approximately 71% of the wt lacZ transcription initiations will lead to full-length transcripts (see later).

Variation of the Translation Initiation
The expression of β-galactosidase differs more than the combined effect of mRNA half-life and premature transcription termination can explain. The remaining differences in protein yield are explained by variation in translation initiation, as shown in Table 2, column E.
One now observes a translation initiation frequency that varies at most 5-fold among these seven lacZ variants. Notice that the estimate for the sd5-lacZ-rq variant is uncertain because of the low stability of this mRNA. Previously, we estimated the translation initiation frequency of the wt lacZ to be one initiation per 2.2 s (Mitarai et al., 2008). Using this estimate, the average time intervals between subsequent translation initiations in the lacZ variants are shown in Table 2, column F. As we estimated the occlusion time of the ribosome to be about 1 s (Mitarai et al., 2008), we explicitly list in column G the translation initiation time of the first ribosome, which is 1 s shorter than that of the later ribosomes. In Figure 2 we plot the estimated fraction of successful transcripts (from Table 2, column D) as a function of the average time for the first ribosome initiation (column G).
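The bookkeeping behind this decomposition can be sketched as follows. It is a minimal sketch that assumes the three contributions combine multiplicatively and that all inputs are normalized to the wt (wt = 1.0). The function name `decompose` and the second set of input values are hypothetical and are not taken from Table 2.

```python
WT_INITIATION_INTERVAL_S = 2.2   # wt lacZ: one initiation per 2.2 s (Mitarai et al., 2008)
OCCLUSION_TIME_S = 1.0           # estimated ribosome occlusion time (Mitarai et al., 2008)

def decompose(rel_yield, rel_half_life, rel_transcription_success):
    """Split a relative protein yield into its assumed multiplicative parts.

    All three arguments are relative to the wt construct (wt = 1.0).
    """
    yield_per_mrna = rel_yield / rel_half_life
    rel_initiation = yield_per_mrna / rel_transcription_success
    t_between = WT_INITIATION_INTERVAL_S / rel_initiation   # s between initiations
    t_first = t_between - OCCLUSION_TIME_S                  # first ribosome is not occluded
    return {
        "yield_per_mrna": yield_per_mrna,
        "rel_initiation": rel_initiation,
        "t_between_s": t_between,
        "t_first_s": t_first,
    }

wt = decompose(1.0, 1.0, 1.0)
# Hypothetical weak-RBS construct: 10% yield, half the wt half-life,
# half of the transcripts completed.
weak = decompose(0.10, 0.5, 0.5)
```

For the wt inputs this returns the 2.2 s interval and a first-initiation time of 1.2 s; the hypothetical weak construct comes out at 5.5 s and 4.5 s.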

Model
This paper revisits the relationship between the RBS and the protein yield, using the lac system in E. coli as a model system. This has been done previously, most notably in the papers by Ringquist et al. (1992) and Yarchuk et al. (1992). In Ringquist et al., the explanation of the protein yield per message was entirely focused on the ribosome initiation frequency, while implicitly assuming that the number of mRNA molecules and their stability were constants that were not influenced by translation. In Yarchuk et al. (1992), a connection between the efficiency of the initiation region, mRNA stability, and premature transcription termination was established, but because of the large sequence variation among the initiation regions used, no mathematical model describing the connection between these mechanisms was attempted. The present paper includes wt SD and wt coding sequences and combines the systematic variation of the SD strength from the Ringquist et al. paper with measurements of mRNA stability, protein yield per mRNA, and the number of completed mRNA molecules to arrive at a simple mathematical model for the interplay between these three mechanisms. Furthermore, our results indicate that the synthesis of most, if not all, natural mRNAs is substantially affected by premature transcription termination.
We analyze our measurements in terms of a model that is inspired by the one used by Ringquist et al. (1992). As in that paper, we assume that ribosome binding and the subsequent initiation are central steps in determining the translation initiation frequency. In addition, we take into account the fact that ribosomes that have initiated translation will occlude the binding of subsequent ribosomes until they have moved one ribosome diameter. This is important because the time between initiations and the occlusion time differ only approximately 2-fold. We estimated previously that subsequent ribosomes will initiate at about 2 s intervals, whereas it takes a ribosome about 1 s to translate the approximately 11 codons that make space for binding of the next ribosome (Mitarai et al., 2008).
To simplify the presentation of our model we describe translation initiation as a 2-step process (and not the 3-step model used by Ringquist et al.). We assume that the ribosome binds to the RBS with a rate $k_1$ and unbinds with a rate $k_{-1}$. In other words, each time a ribosome binds, it stays at the RBS for a time interval of average duration $1/k_{-1}$. During this time, the ribosome can transit to start translation with a rate $k_2$, as illustrated in the inset of Figure 3. If the ribosome starts translation we assume that the ribosome is now committed, and will stay on the mRNA for as long as translation is possible. The average time for the first ribosome to start translation is then
$$t_{\mathrm{first}} = t_{\mathrm{unbound}} + \frac{1}{k_2} = \frac{1}{k_1}\left(1 + \frac{k_{-1}}{k_2}\right) + \frac{1}{k_2}$$
Here the first term, $t_{\mathrm{unbound}}$, is the average time during which the ribosome is uncommitted/unbound, which is calculated by multiplying the time for binding, $1/k_1$, by the average number of binding attempts before initiation wins over unbinding, $(1 + k_{-1}/k_2)$. The second term, $1/k_2$, is the time it typically takes for a ribosome to start translating, given that it is bound at the RBS. This model is a simpler version of the 3-step model used by Ringquist et al. and could equally well describe their data by rephrasing the above rate constants $k_2$ and $k_{-1}$ in terms of combinations of the additional rate constants that were introduced by Ringquist et al. (1992).
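The two-term expression for the first initiation time can be written out as a small function. This is only a sketch of the verbal description above: the function name and the rate values in the example are hypothetical, chosen to show how a larger unbinding rate lengthens the time before the first initiation.

```python
def first_initiation_time(k1, k_minus1, k2):
    """Average time (s) until the first ribosome starts translating.

    k1:       binding rate of a ribosome to the RBS (1/s)
    k_minus1: unbinding rate from the RBS (1/s)
    k2:       rate at which a bound ribosome starts translation (1/s)
    """
    attempts = 1.0 + k_minus1 / k2          # mean number of binding attempts
    t_unbound = (1.0 / k1) * attempts       # total time spent waiting to bind
    return t_unbound + 1.0 / k2             # plus the committed start-up time

# Hypothetical rate constants (not fitted values from this study).
t_tight = first_initiation_time(k1=10.0, k_minus1=0.0, k2=1.0)   # never unbinds
t_leaky = first_initiation_time(k1=10.0, k_minus1=5.0, k2=1.0)   # often unbinds
```

In this parameterization a weaker RBS (larger `k_minus1`) only affects the `t_unbound` term, which is the part of the cycle during which the RBS is unprotected.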
The time before the first ribosome starts translating should be shorter than the time intervals between translation starts of the subsequent ribosomes, because the previous ribosome occludes binding of the subsequent ones for the average time, $t_{\mathrm{occlusion}}$, that it takes a ribosome to clear the RBS. The occlusion time $t_{\mathrm{occlusion}}$ was estimated to be 1 s (Mitarai et al., 2008). We therefore expect that the time between subsequent ribosome initiations is given by
$$t_{\mathrm{between}} = t_{\mathrm{first}} + t_{\mathrm{occlusion}}$$
This relation predicts that there is an upper limit for translation initiation; it cannot be faster than once per occlusion time, or approximately once per second. It also implies that when this limit is approached, the distance between subsequent ribosomes will be small, which will likely cause ribosome collisions and queuing and also exclude any other factors from access to the mRNA.
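The upper limit on the initiation rate follows directly from adding the occlusion time to the first initiation time, as the following sketch illustrates. The function name and the sampled values of the first initiation time are hypothetical; only the 1 s occlusion time and the 2.2 s wt interval come from the text.

```python
def initiation_rate(t_first_s, t_occlusion_s=1.0):
    """Steady-state initiation rate (1/s) when each initiating ribosome
    blocks the RBS for t_occlusion_s before the next one can bind."""
    t_between = t_first_s + t_occlusion_s
    return 1.0 / t_between

# Shrinking the first-initiation time pushes the rate toward, but never
# beyond, one initiation per occlusion time.
rates = [initiation_rate(t) for t in (10.0, 1.2, 0.1, 0.0)]
wt_rate = initiation_rate(1.2)   # 1.2 s + 1 s occlusion = one initiation per 2.2 s
```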
The notion that occlusion will have an increasing relative effect with faster ribosome initiation rates suggests that all our data could be explained in terms of occlusion by ribosomes, as schematically described in Figure 1. The premature termination is coupled to the absence of occlusion, allowing binding of transcription termination factors, possibly transcription termination factor Rho (Figure 1, top). The mRNA stability will then be explained by occlusion that prevents binding of the degrading enzymes. An mRNA with a lower translation initiation rate would then at the same time be "hit" by increased premature termination and a shorter lifetime.
In the simplest formulation of such a model, we assume that both mRNA stability and premature termination are determined at the beginning of the mRNA. This is motivated by the observation that only 0.4% of peptides are tagged by the tmRNA rescue mechanism (Moore and Sauer, 2005). If there were significant mRNA cleavage between translating ribosomes, the use of the tmRNA rescue mechanism should be larger, because the estimated number of peptides produced per mRNA for the wt lac mRNA is approximately 75, based on the wt lac mRNA mean life (116/ln 2 s) and the initiation frequency (1 per 2.2 s). This suggests that mRNA degradation starts early on the mRNA, so that it does not interfere with the translating ribosomes (Pedersen et al., 2011). Premature termination by Rho is also believed to terminate mRNA synthesis if the mRNA is not protected by the contact between the first translating ribosome and the RNA polymerase (e.g., Newton et al., 1965).
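The estimate of peptides per mRNA can be checked with a few lines of arithmetic. The variable names are illustrative; the inputs are the 116 s half-life and the 2.2 s initiation interval quoted in the text.

```python
import math

half_life_s = 116.0               # wt lac mRNA functional half-life
initiation_interval_s = 2.2       # one translation initiation per 2.2 s

# Mean life of an exponentially decaying species is half-life / ln 2.
mean_life_s = half_life_s / math.log(2)

# Each initiation during the mRNA's mean life yields one peptide.
peptides_per_mrna = mean_life_s / initiation_interval_s
```

The mean life comes out at about 167 s and the peptide count at about 76, in line with the "approximately 75" quoted above. The 0.4% tmRNA tagging figure would then correspond to far fewer than one rescued peptide per mRNA on average.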
Within this framework we first assume that premature termination can act in the time interval before the first ribosome has reached the elongating RNA polymerase. This time of exposure to termination factors is proportional to $t_{\mathrm{first}}$:
$$t_{\text{exposure to termination}} = \text{constant} \times t_{\mathrm{first}}$$
The "constant" depends on the difference in speed between the first ribosome and the RNAP when it is transcribing without the help of an associated ribosome. In this simple formulation, the probability that the mRNA survives this termination is
$$S = \exp(-k \times t_{\text{exposure to termination}}) = \exp(-t_{\mathrm{first}} / 7\ \mathrm{s})$$
Here $k$ is the rate for premature termination factors to engage. The final fit, which uses an exponential decay with a decay time of approximately 7 s, is shown as a dashed line in Figure 2.
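The exponential survival form can be sketched as a one-line function. This illustrates only the functional form: the function name is hypothetical, the 7 s decay time is the approximate fitted value quoted above, and the sample times are arbitrary rather than values from Table 2.

```python
import math

def survival_probability(t_first_s, decay_time_s=7.0):
    """Probability that a transcript escapes premature termination, assuming
    exposure to termination factors grows linearly with t_first."""
    return math.exp(-t_first_s / decay_time_s)

# Arbitrary first-initiation times (s), for illustration only.
sample_times = [0.0, 1.0, 3.5, 7.0, 14.0]
survival = [survival_probability(t) for t in sample_times]
terminated = [1.0 - s for s in survival]   # fraction lost to premature termination
```

The lumped decay time absorbs both the engagement rate $k$ and the proportionality "constant", so the two cannot be separated from this fit alone.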
Remarkably, this fit predicts that even the lacZ-wt sequence is exposed to approximately 29% premature termination, while the SD8-lacZ-wt sequence is still exposed to 20% premature termination. Further, the above equation is simplified in the sense that we assume that the time between initiation of subsequent translations is fixed, corresponding to a near-deterministic sequence of events for each initiation. If each translation start is primarily governed by a single rate-limiting event, the functional fit becomes a hyperbola, $S = 1/(1 + t_{\mathrm{first}}/\tau)$, and the best fit for the constant $\tau$ predicts a premature termination of wt lacZ mRNA of about 33% (analysis not shown).
FIGURE 1 | Schematic description of the RBS occlusion model. First, an RNA polymerase (green) on double-stranded DNA (purple) starts transcribing an mRNA (black). The race between occlusion by the first ribosome (blue) and the arrival of transcription-terminating enzymes, which may be the Rho factor (red), determines the extent of the premature termination. If the first ribosome successfully starts translation it occludes the transcription-terminating enzyme, and the mRNA stability will subsequently be affected by initiating ribosomes occluding the RBS, which prevents binding of the mRNA-degrading enzymes (yellow) around the RBS.
FIGURE 2 | The probability, S, for a transcription event to be successful is plotted as a function of the initiation time $t_{\mathrm{first}}$ of the first ribosome. The dashed line shows the fit $S = \exp(-t_{\mathrm{first}}/(7\ \mathrm{s}))$ with an S value for the wt of 0.71. The vertical axis is therefore the values in Table 2 column E multiplied by 0.71. Error bars for the data are based on the SEM in the LacA transacetylase activity measurement, where at least 3 independent measurements were made. Black: wt-lacZ-wt, red: sd5-lacZ-rq, cyan: sd5-lacZ-wt, orange: SD8-lacZ-rq, blue: SD8-lacZ-wt, magenta: SD12-lacZ-rq, navy: SD12-lacZ-wt.
Frontiers in Microbiology | www.frontiersin.org
In our occlusion-based model, the stability of the mRNA is also determined by translation initiation, as the mRNA is exposed to degradation during the times when the ribosome is not bound, i.e., during the time $t_{\mathrm{unbound}}$ between subsequent ribosome initiations. In other words, we can express the degradation rate as proportional to the probability that the translation start site is free from a ribosome, $P_{\mathrm{free}} = t_{\mathrm{unbound}}/t_{\mathrm{between}}$. Table 2 gives us $t_{\mathrm{between}}$ for each construct, but does not give direct information about the individual rates $k_1$, $k_{-1}$, and $k_2$, which are needed to calculate $t_{\mathrm{unbound}}$. The ribosome on-rate $k_1$ should be independent of the RBS, and substantially faster than the shortest of our $t_{\mathrm{between}}$ allows for. For simplicity, we fix its value to $k_1 = 10$/s. We then assume that, when the RBS of an mRNA is not occluded by a ribosome, mRNA degradation enzymes act on the mRNA with the same rate $\eta$. The lifetime of each mRNA is now fitted by the mRNA degradation rate $P_{\mathrm{free}} \times \eta$. Given that the wild-type mRNA has a half-life of approximately 116 s (Petersen, 1987), each choice of $\eta$ corresponds to a particular value of $t_{\mathrm{unbound}}$ (hence $P_{\mathrm{free}}$) for the wt lacZ. Conversely, we can determine $\eta$ by assuming a fraction of time that the wild-type sequence is exposed to the degrading enzymes. In Figure 3 we present our data in this way: a $P_{\mathrm{free}}$ of 5% for the wt RBS corresponds to a much higher $\eta$ than the case where the wt RBS is exposed 15% of the time. For each value of this exposure (and thereby of $\eta$) our model provides a value of the dissociation constant $k_{-1}/k_1$ and the firing rate of a bound ribosome $k_2$ that match the measured half-life (Table 2, column B) and the measured translation initiation frequency (Table 2, column F).
While all the curves presented in Figure 3 represent possible fits to our data, the figure also provides some constraints on plausible parameters. Notice for example that some of the very weak start sites, such as SD12-lacZ-rq or sd5-lacZ-rq, have high values of both the firing rate $k_2$ and the dissociation constant $k_{-1}/k_1$ for larger $P_{\mathrm{free}}$ (hence smaller $\eta$). This is because, to fit their short lifetimes, $t_{\mathrm{unbound}}$ for these sequences needs to become large. It is not plausible that the relatively weak start sites of SD12-lacZ-rq or sd5-lacZ-rq should have a faster $k_2$ rate than the other sequences, which in turn suggests that the wt RBS is exposed to degradation for less than 10% of the time. For this relatively high activity of the degradation enzymes, the overall set of suggested parameters is less extreme.
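The parameter inference described here, with the relations listed in the Figure 3 caption, can be sketched as follows. The sketch assumes that the term subtracted alongside $t_{\mathrm{unbound}}$ in the expression for $k_2$ is the occlusion time, which follows from the time between initiations being the first-initiation time plus the occlusion time. The function names are hypothetical, and the only inputs used in the example are the wt values quoted in the text (116 s half-life, 2.2 s between initiations, $k_1$ = 10/s, 5% exposure).

```python
import math

def eta_from_wt_exposure(p_free_wt, wt_half_life_s=116.0):
    """Degradation rate of a free RBS (1/s) implied by assuming that the wt
    RBS is free a fraction p_free_wt of the time."""
    return math.log(2) / wt_half_life_s / p_free_wt

def infer_rates(t_half_s, t_between_s, eta, k1=10.0, t_occlusion_s=1.0):
    """Return (P_free, t_unbound, k2, k_-1/k1) for one construct."""
    degradation_rate = math.log(2) / t_half_s      # inverse mean life
    p_free = degradation_rate / eta                # since rate = eta * P_free
    t_unbound = p_free * t_between_s               # P_free = t_unbound / t_between
    k2 = 1.0 / (t_between_s - t_occlusion_s - t_unbound)
    dissociation = k2 * (t_unbound - 1.0 / k1)     # k_-1 / k1
    return p_free, t_unbound, k2, dissociation

eta = eta_from_wt_exposure(0.05)
p_free, t_unbound, k2, dissociation = infer_rates(116.0, 2.2, eta)
```

For the wt inputs with 5% exposure this gives a free time of 0.11 s per cycle and a firing rate of about 0.92/s. A construct whose inferred free time falls below $1/k_1$ would yield a negative dissociation constant, which marks that choice of $\eta$ as inconsistent with the fixed $k_1$.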

DISCUSSION
We propose a model for ribosome initiation and the resulting protein yield. Our model extends the Ringquist et al. model for translation initiation to also take into account ribosome occlusion, which, we suggest, influences translation initiation, premature transcription termination, and the functional stability of the mRNA. More recently, Richards et al. (2012) observed a similar effect for the chemical half-life of mRNA based on experiments where the average ribosome distance was varied by altering the RBS. This work also indicated a role for the translation machinery in influencing the mRNA stability. Our model was supported by experiments that allowed us to quantify each of the factors that determine the overall protein yield by systematically altering the RBS. Overall we saw that a 120-fold change in protein yield could be explained by a product of three different and roughly equally contributing factors: translation initiation rate, premature transcription termination, and mRNA stability. These three factors can be consistently explained by assuming that the parameters of the initiation rate are the prime determinants, which in turn influence the two other factors.
FIGURE 3 | A 2-step model of translation initiation and mRNA degradation. Inset: we assume that a ribosome (blue) binds to the RBS on the mRNA at a rate $k_1$, dissociates at a rate $k_{-1}$, or starts initiation at a rate $k_2$. After initiation, it takes an occlusion time $t_{\mathrm{occlusion}}$ of 1 s before the next ribosome can bind. The mRNA-degrading enzyme complex is shown in yellow, and it can attack the mRNA at a constant rate $\eta$ if the RBS is free. The mRNA degradation rate is assumed to be $\eta \times P_{\mathrm{free}}$, where $P_{\mathrm{free}}$ is the probability that an RBS is free from a ribosome. Experimental data give the time between initiations, $t_{\mathrm{between}}$, and to calculate $P_{\mathrm{free}}$ we also need the effective dissociation constant ($k_{-1}/k_1$) and the effective firing rate $k_2$. From the relations given in the main text, and knowing that the degradation rate is the inverse of the mean lifetime $t_{1/2}/\ln 2$, we obtain $t_{\mathrm{unbound}} = t_{\mathrm{between}} \times (\ln 2/t_{1/2})/\eta$, $k_2 = 1/(t_{\mathrm{between}} - t_{\mathrm{occlusion}} - t_{\mathrm{unbound}})$, and $(k_{-1}/k_1) = k_2 \times (t_{\mathrm{unbound}} - 1/k_1)$. These relations enable us to plot ($k_{-1}/k_1$) and $k_2$ for all the sequences using the measured values of $t_{1/2}$ and $t_{\mathrm{between}}$ for a given $\eta$. In (A,B), we plot them as a function of the probability $P_{\mathrm{free}}$ that the RBS is free for the wild-type sequence. This probability can be converted to the rate of degradation for naked mRNA, $\eta$, by using the wt mRNA's half-life of 116 s as $P_{\mathrm{free}} = (\ln 2/116\ \mathrm{s})/\eta$.
We speculate that our finding from Figure 2, that the strength of the RBS influences the mRNA level in a cell, may provide a more versatile regulation compared to a cell that relies on promoter control only. This figure suggests that the mRNA level can be directly modulated by the relative speeds of the ribosome and the RNA polymerase. The mRNA level may therefore be regulated by external conditions that result in altered nucleotide triphosphate concentrations and/or tRNA charging levels. The codon usage in the early coding region in general has a lower Codon Adaptation Index (Sharp and Li, 1987; Bulmer, 1988), which in all likelihood means that the ribosomes translating this part of the mRNA are unsaturated with substrate. The speed of the translating ribosome should therefore be able to respond to changes in external conditions. Finally, our suggested role of premature transcription termination in fine-tuning mRNA transcription might explain at least part of the discrepancy between the observed total number of RNA polymerases and the number estimated for synthesis of mRNA and stable RNA. The seemingly "inactive RNA polymerases" are about twice as numerous as the "active polymerases" (Bremer and Dennis, 1996). As shown in Figure 2, even an mRNA with an extremely good SD such as SD8-lacZ-wt is still subject to approximately 20% premature termination. The premature transcription termination of most natural mRNAs can therefore be expected to be substantially higher. Compared to regular promoter control, a premature transcription termination mechanism may respond better to external signals and might therefore be advantageous for the cell.

AUTHOR CONTRIBUTIONS
ME and SP designed and performed the experiments; NM and KS performed the modeling; NM, KS, ME, and SP wrote the manuscript.