Region 4 of Rhizobium etli Primary Sigma Factor (SigA) Confers Transcriptional Laxity in Escherichia coli

Sigma factors are RNA polymerase subunits engaged in promoter recognition and DNA strand separation during transcription initiation in bacteria. Primary sigma factors are responsible for the expression of housekeeping genes and are essential for survival. RpoD, the primary sigma factor of Escherichia coli, a γ-proteobacterium, recognizes consensus promoter sequences highly similar to those of some α-proteobacterial species. Despite this resemblance, RpoD is unable to sustain transcription from most of the α-proteobacterial promoters tested so far. In contrast, we have found that SigA, the primary sigma factor of Rhizobium etli, an α-proteobacterium, is able to transcribe E. coli promoters, although it exhibits only 48% identity (98% coverage) to RpoD. We have called this the transcriptional laxity phenomenon. Here, we show that SigA partially complements the thermo-sensitive deficiency of RpoD285 from E. coli strain UQ285 and that the SigA region σ4 is responsible for this phenotype. Sixteen out of 74 residues (21.6%) within region σ4 are variable between RpoD and SigA. Mutating these residues significantly improves the ability of SigA to complement E. coli UQ285. Only six of these residues fall into positions already known to interact with promoter DNA and to comprise a helix-turn-helix (HTH) motif. The remaining variable positions are located on previously unexplored sites inside region σ4, specifically in the first two α-helices of the region. None of the variable positions confined to these helices seems to interact directly with the promoter sequence; instead, we infer that these residues participate allosterically by contributing to correct region folding and/or positioning of the HTH motif. We propose that transcriptional laxity is a mechanism for ensuring transcription in spite of naturally occurring mutations from endogenous promoters and/or horizontally transferred DNA sequences, allowing survival and fast environmental adaptation of α-proteobacteria.


INTRODUCTION
The bacterial DNA-dependent RNA polymerase (RNAP) holoenzyme (Eσ) consists of a core enzyme (subunits α2ββ′ω; E) and one sigma factor (σ) subunit, which recognizes DNA promoters to initiate sequence-specific transcription (Lee et al., 2012). During transcription, sigma factors provide the most fundamental mechanism for orchestrating broad changes in the gene expression profile, making them key proteins during this process (Wösten, 1998).
Almost every sigma region contains a tract that interacts with E (Nagai and Shimamoto, 1997;Murakami, 2002;Schroeder et al., 2007). Among the seven distinct sigma factors of E. coli, RpoD has the highest affinity for E (Maeda et al., 2000).
Rhizobium etli, an α-proteobacterium, can be found as a free-living soil organism or as a symbiont in nitrogen-fixing root nodules of Phaseolus vulgaris (common bean). The whole genomic sequence of R. etli CFN42 consists of a circular chromosome and six large plasmids, with an average G+C content of 61.5%. The α-proteobacteria class encompasses not only a wide variety of lifestyles but also a broad range of genome sizes. Many of its members show a multireplicon genome structure (Capela et al., 2001;Galibert et al., 2001;Mackenzie et al., 2001;Wood et al., 2001;González et al., 2006;Strnad et al., 2010). Additionally, R. etli contains a large number of sigma factors (one primary sigma gene, two copies of rpoH, two copies of rpoN, and 18 genes of the extracytoplasmic factor group), a feature shared with other nitrogen-fixing organisms, like Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti (Mittenhuber, 2002).
The R. etli primary sigma factor gene (sigA) encodes a protein of 685 amino acid residues with a molecular weight of 77.18 kDa. The amino acid sequence of SigA shows 48% identity (98% coverage) to RpoD. Like other α-proteobacteria (S. meliloti, Caulobacter crescentus, Rhodobacter sphaeroides, Rhodobacter capsulatus), R. etli primary sigma factor is capable of transcribing most of the RpoD-dependent promoters tested so far (transcriptional laxity). On the other hand, R. etli, S. meliloti, C. crescentus, R. sphaeroides, and R. capsulatus primary sigma dependent promoters are poorly or not transcribed at all by RpoD (Karls et al., 1993;Malakooti et al., 1995;Cullen et al., 1997;MacLellan et al., 2006;Ramírez-Romero et al., 2006), which suggests some differences between the E. coli and the α-proteobacterial transcriptional machineries, perhaps at the level of promoter recognition by the primary sigma factor.
The characterization of the transcriptional molecular basis in organisms with agricultural importance like R. etli is fundamental, because it could provide information for future biotechnological applications. Among the potential applications are: heterologous expression of genes contributing to enhance symbiosis or nitrogen fixation, design of promoters that ensure transcription among other symbiotic α-proteobacteria and engineering of sigma factors to gain broad transcriptional capacities. We chose R. etli SigA gene as a model of transcriptional laxity in α-proteobacteria.
To identify SigA regions involved in transcriptional laxity we made a library of chimeric genes exchanging the regions of RpoD and SigA. We constructed the 14 possible non-redundant combinations of regions from both sigma factors. We then tested their ability to complement the thermo-sensitive deficiency of the RpoD285 primary sigma factor of E. coli strain UQ285 (E. coli rpoD285; Harris et al., 1978;Hu and Gross, 1983). The results show that whenever SigA region σ4 is present, the carrier chimera is able to sustain growth of E. coli rpoD285 at the restrictive temperature (42 °C). Mutating residues at variable positions within the first two α-helices enhances the ability of SigA to complement the E. coli rpoD285 phenotype. We propose that these helices participate in correct folding and positioning of the HTH motif found in region σ4. The HTH motif is responsible for promoter recognition.

Bacterial Strains and Plasmids
Relevant information about bacterial strains and plasmids is listed in Table 1.

Wild Type Primary Sigma Factor Genes Amplification and Cloning
Total DNA was purified from E. coli DH5α and R. etli CFN42 strains (Sambrook and Russell, 2001). The forward oligonucleotide sequence included an XbaI recognition site and an optimal ribosome binding site (RBS), 5′-AGGAGA-3′, six base pairs away from the start codon. The reverse oligo contained a KpnI recognition sequence. Oligonucleotides were designed to amplify only the coding sequence of the corresponding primary sigma factor genes. Once amplified and purified, wild type genes were digested with KpnI-XbaI and cloned into pRK415. The pRK415 cloned fragments are under the control of the E. coli lactose promoter (Plac). Transformants were selected on LB/Tet plates grown at 37 °C. The oligonucleotides described above were also used in the last amplification step for the production of a chimeric library. Consequently, all constructs involving the expression vector pRK415 share the same RBS sequence and cloning sites. First members of the library were pRK415rpoD (pRKrpoD) and pRK415sigA (pRKsigA). Wild type genes were amplified using Platinum Taq High Fidelity DNA polymerase (Invitrogen). Restriction enzymes were obtained  (Harris et al., 1978).
Table 1 fragment: R. etli CFN42; growth at 30 °C; template for amplification of the wild type sigA gene; host strain for the pBBR1MCS5PnRFP library; nalidixic acid resistant.

In silico Oligonucleotide Design and Chimeric Gene Assembly
Amino acid and nucleotide sequences of E. coli rpoD and R. etli sigA were obtained using Artemis Genome Browser (release 15.0.0, RRID:SCR_004267; Rutherford et al., 2000) from complete chromosome sequences of E. coli strain K12 substrain MG1655 (GenBank: NC_000913) and R. etli CFN42 (NC_007761), respectively. SigA region identification was done in accordance with the RpoD amino acid sequence (Gruber and Gross, 2003) and oligonucleotide design corresponded to those boundaries. Fourteen non-redundant chimeric genes were created, shuffling regions between the two wild type genes (Figure 1). Resembling functional primary sigma factors, each chimeric gene included four regions (σ1-σ2-σ3-σ4). σNCR and σ2 constitute domain 2 in RpoD (Gruber and Gross, 2003); for that reason, these two regions were considered as one and designated by σ2 alone. Sequence alignments were performed using MUSCLE (version 3.8.31, RRID:SCR_011812; Edgar, 2004). Chimeric genes were assembled using ad hoc scripts written in R language (version 2.15.1; R Development Core Team, 2008, RRID:SCR_001905) and subsequently manually verified. All programs mentioned above were run locally.
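The count of 14 non-redundant chimeras follows from the design: each of the four regions comes from one of two parents, giving 2^4 = 16 combinations, of which two are the wild type genes. The following is a minimal sketch of that enumeration in Python, not the authors' R scripts; the function names and the dictionary layout are illustrative assumptions.

```python
from collections import Counter
from itertools import product

REGIONS = ("σ1", "σ2", "σ3", "σ4")  # σNCR is treated as part of σ2, as in the text
PARENTS = ("RpoD", "SigA")

def enumerate_chimeras():
    """Return every region-wise mix of the two parents, excluding both wild types."""
    chimeras = []
    for combo in product(PARENTS, repeat=len(REGIONS)):
        if len(set(combo)) == 1:  # all four regions from one parent: a wild type gene
            continue
        chimeras.append(dict(zip(REGIONS, combo)))
    return chimeras

def count_by_rpod_regions(chimeras):
    """Group the chimeras by how many of their regions come from RpoD."""
    return Counter(list(c.values()).count("RpoD") for c in chimeras)

chimeras = enumerate_chimeras()
by_rpod = count_by_rpod_regions(chimeras)
```

Grouping by the number of RpoD-derived regions gives four, six, and four designs, which matches a library organized into one, two, and three regions exchanged at a time.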

Construction of Chimeric Genes
Chimeric sequences were assembled by three different approaches: overlapping PCR products, plasmid recovery (modified from Vos and Kampinga, 2008) and gene synthesis. Fourteen different chimeric genes were obtained, according to the in silico design. All chimeric genes, wild type genes and sigA mutants were cloned into pRK415 (Keen et al., 1988;Hülter and Wackernagel, 2008) using KpnI-XbaI restriction enzymes (pRK415sigma library). Platinum Taq High Fidelity DNA polymerase was used for all amplification reactions. Restriction enzymes were obtained from NEB.

Chimera Assembly by Modified Plasmid Recovery Technique
The two wild type genes, rpoD and sigA, were cloned independently in vector pUC19 (Norrander et al., 1983) between XbaI and KpnI restriction sites. These constructs, pUCrpoD and pUCsigA, were used as DNA templates. Oligonucleotide sequences were designed to obtain two complementary segments of each construct, disrupting the ampicillin resistance gene (present on the vector) and the sigma factor gene (on the desired target region; Vos and Kampinga, 2008). After performing amplification reactions, DpnI digestion of the template DNA and gel purification of the corresponding bands, we ended up with four fragments: pUC19rpoDσ1-σ2_partA, pUC19rpoDσ3-σ4_partB, pUC19sigAσ1-σ2_partA, and pUC19sigAσ3-σ4_partB. Each DNA fragment was digested with AflII-SpeI restriction enzymes; then, complementary DNA fragments were mixed (e.g., chimera01: pUC19rpoDσ1-σ2_partA and pUC19sigAσ3-σ4_partB) and ligated overnight at 16 °C.
The ligation reaction was used to transform DH5α cells and transformants were selected on LB/Amp plates at 37 °C. After verifying them by PCR and re-digestion with KpnI-XbaI enzymes, candidate constructs (pUCch01 and pUCch02) were used as templates for another round of amplification with external oligonucleotides. Finally, KpnI-XbaI digestion and cloning into pRK415 were done with these fragments, producing the constructs pRK415chim01 (pRKch01) and pRK415chim02 (pRKch02). Transformants were selected on LB/Tet plates at 37 °C. DNA fragments and final products were amplified using Phusion High Fidelity DNA polymerase, which yields blunt-ended products only. DNA polymerase, ligase, and restriction enzymes were obtained from NEB.

Chimera Assembly by Gene Synthesis
The DNA sequences of chimeras 03 and 04 were assembled in silico according to protein region delimitation based on RpoD (Gruber and Gross, 2003) and secondary structure prediction results from the psipred server (Buchan et al., 2013, RRID:SCR_010246). These chimeras were synthesized by GeneArt™ Gene Synthesis (ThermoFisher Scientific). Synthetic chimeric gene sequences were re-amplified using Platinum Taq High Fidelity DNA polymerase, digested with KpnI-XbaI and cloned into pRK415. In this way, we obtained the constructs pRK415chim03 (pRKch03) and pRK415chim04 (pRKch04). DH5α transformants were selected on LB/Tet plates at 37 °C.

Chimera Assembly by Overlapping PCR Products
Primers used for this technique required two important features: target site amplification and overlap sequences. Each part has a minimum length of 20 base pairs (bp). Oligonucleotide design considered the 3′-A overhang ends produced by Platinum Taq High Fidelity DNA polymerase (Table 2). In order to fuse two DNA segments, we proceeded as follows: (1) amplify each fragment independently by PCR, (2) purify each band from agarose gel (Purification kit, Roche), (3) mix the purified fragments into the assembly PCR, where the overlapping region will produce the 3′-OH DNA end required for polymerization, (4) make an enrichment reaction using external oligonucleotides, and (5) purify the assembled DNA from an agarose gel. After purification, the DNA was digested with KpnI-XbaI restriction enzymes, cloned into pRK415 and transformed into DH5α. Transformants were selected on LB/Tet plates at 37 °C. Chimeras 05 through 14 were assembled by this method, yielding the constructs pRK415chim05 to chim14 (pRKch05-ch14).
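The outcome of the assembly step (3) can be illustrated in silico. The sketch below is a simplified, single-strand model written in Python, assuming that the two fragments share an identical overlap of at least 20 bases at their junction; the function name and the default threshold are illustrative, and the code does not model polymerase behavior, 3′-A overhangs, or reverse complements.

```python
def fuse_by_overlap(left, right, min_overlap=20):
    """Join two fragments whose junction shares an identical overlap.

    The end of `left` must equal the start of `right` over at least
    `min_overlap` bases; the longest such overlap is used and written once.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("fragments do not share an overlap of the required length")

# Hypothetical toy fragments with a 21-base overlap (not sequences from this study).
overlap = "GATTACAGATTACAGATTACA"
fused = fuse_by_overlap("AAACCC" + overlap, overlap + "GGGTTT")
```

Checking designed fragments this way before ordering oligonucleotides can catch overlaps that are too short or mismatched.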

SigA Mutant Constructs
An amino acid sequence alignment between RpoD and SigA showed that they differ in 16 out of the 74 residues along region σ4. SigA residues were replaced by their corresponding RpoD counterparts (changing corresponding codons) at these variable positions. Three sigA mutants were obtained with an average of five amino acid substitutions each. After the in silico mutagenesis, the resulting sequences and the pRK415
Restriction enzyme sites are underlined, except for NheI sites that appear in lowercase. RBS are in bold. Eco denotes oligonucleotides for RpoD; Ret, oligonucleotides for SigA. The -35 and -10 boxes of promoter consensuses appear in underlined bold. FWD, forward primer; REV, reverse primer; Vos, plasmid recovery technique for assembling chimeras; Amp, ampicillin resistance gene primers; UNI, universal primer; pBBR, primers for cloning into pBBR1MCS5 vector; pUC, primers for cloning into pUC19 vector; RFP, red fluorescent protein gene primers.
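The variable positions in region σ4 and the replacement of SigA residues by their RpoD counterparts can be expressed as a small procedure over two aligned sequences. The sketch below is in Python and assumes gap-free, equal-length aligned segments; the toy sequences and function names are hypothetical and do not reproduce the actual σ4 sequences.

```python
def variable_positions(reference, query):
    """Return 1-based positions at which two aligned sequences differ."""
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(reference, query)) if a != b]

def substitute(query, reference, positions):
    """Copy the reference residue into the query at the given 1-based positions."""
    residues = list(query)
    for position in positions:
        residues[position - 1] = reference[position - 1]
    return "".join(residues)

def percent_variable(n_variable, length):
    """Percentage of variable positions, rounded to one decimal place."""
    return round(100.0 * n_variable / length, 1)

# Hypothetical toy segments standing in for the RpoD and SigA alignments.
rpod_toy, siga_toy = "MKLVDE", "MRLVEE"
differing = variable_positions(rpod_toy, siga_toy)
mutant = substitute(siga_toy, rpod_toy, differing[:1])
```

With the figures reported here, `percent_variable(16, 74)` gives the 21.6% variability stated for region σ4.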

RFP Reporter Gene Constructs
Three promoters (RpoD and SigA consensuses and a promoterless sequence) were independently fused to the Red Fluorescent Protein (RFP) gene (PnRFP). PnRFP constructs were cloned into two different vectors, pBBR1MCS5 (Kovach et al., 1995) and pUC19 (Norrander et al., 1983). We constructed the pBBR1MCS5 set first. Consensus promoter sequences of RpoD (PEco; Hawley and McClure, 1983;Harley and Reynolds, 1987;Shultzaberger et al., 2007;Shimada et al., 2014) and SigA (PRet; Ramírez-Romero et al., 2006) were introduced in the reverse oligonucleotide and a promoter-less intergenic region (pD00022 gene) of the R. etli genome was used as template for the amplification reaction (this sequence ended upstream of the promoter). Separately, the RFP coding sequence flanked by an upstream RBS (5′-AGGAGA-3′) and a downstream strong transcriptional terminator was amplified by PCR from plasmid pJ61002 (iGEM repository). These two DNA fragments were digested with EcoRI and ligated with T4 DNA ligase (NEB). After ligation, another round of DNA amplification was carried out using the external oligonucleotides. Constructs were cloned into pBBR1MCS5 between SmaI and ApaI sites (inside the multiple cloning site, MCS). Insert orientation was chosen in order to minimize the effect of outer promoters. DH5α transformant cells were selected on LB/Gen plates. The pBBR1MCS5PrpoDconsRFP (pBBPEco) plasmid was digested with NheI to remove the DNA spacer and -10 box of the RpoD consensus promoter sequence, purified from agarose gel, and re-ligated. In this manner, pBBR1MCS5PlessRFP (pBBPless) was assembled. Once the pBBR1MCS5PnRFP (pBBPn) set was verified by sequencing, new external oligonucleotides were used to amplify the PnRFP fragment. BamHI and SphI restriction enzymes were used for the pUC19PnRFP (pUCPn) constructs and DH5α transformants were selected on LB/Amp plates at 37 °C. All DNA amplification reactions were carried out with Platinum Taq High Fidelity DNA polymerase.
We used the same DNA spacer sequence (located between -35 and -10 promoter boxes) for both consensuses (corresponding to promoter Bba_J23119 of the iGEM repository).

Bacterial Growth Curves Measurements
E. coli rpoD285 was transformed with the pRK415sigma library and growth curves were recorded in a Synergy 2 Microplate Reader (Biotek). The reader was set to record optical density at a wavelength of 600 nm (OD600nm) every 30 min for 24 h. Experiments were performed with five randomly selected colonies from each library member. We carried out four technical replicates for each colony, giving a total of 20 growth curves per library member. For each experiment, bacterial colonies were picked from solid plates and inoculated into their corresponding microplate wells (pre-culture at 30 °C). The pre-culture microplate was then replicated into a fresh one (30 °C), and finally the preceding microplate was replicated and grown at 42 °C.

DNA Isolation and Manipulation
Total DNA from E. coli DH5α and R. etli CFN42 was purified (Sambrook and Russell, 2001).

DNA Sequencing
The pRK415sigma library was sequenced with Sanger technology by Macrogen Inc. Each construct was sequenced in both forward and reverse strands. The pRK415sigAmutant set was sequenced by GenScript. The P n RFP library was sequenced at the Unidad de Sintesis y Secuenciacion de ADN, Instituto de Biotecnologia-UNAM.

Data Handling and Statistical Analysis
For data standardization, the ratio of measurements obtained at 42 °C vs. those obtained at 30 °C was calculated for each genetic construct, data type and time point. Perl ad hoc scripts were used to format raw output files from the Biotek microplate reader. Sequence alignments were performed using MUSCLE (Edgar, 2004). PostScript and jpg files of the alignments were created with SeaView (Galtier et al., 1996) and Apache OpenOffice, respectively. Growth curve graphs and statistical analyses were done using suitable R software packages (ggplot2, statmod, stats; R Development Core Team, 2008, RRID:SCR_001905). We used the R package grofit to mathematically model growth curves (Kahm et al., 2010). Permutation tests were implemented with the R function compareGrowthCurves (Elso et al., 2004;Baldwin et al., 2007) using growth curve data. For hierarchical clustering, we chose Minkowski distances to determine groups according to the median and median absolute deviation of standardized integral values. We performed Shapiro-Wilk, Anderson-Darling and Jarque-Bera tests to analyze normality. Wilcoxon U tests were performed to determine groups of similar behavior among constructs using standardized integral values (42/30 °C) for E. coli rpoD285. Kendall rank correlation tests were carried out to assess correlation between: (1) OD and CFU standardized values, (2) integral parameters obtained from standardized RFP values (RFP/OD) and (3) integral parameters from E. coli rpoD800 and DH5α. Due to extreme variation during the first 15 measurements of RFP activity, statistical analyses were done excluding these data points. The extremely high values of RFP activity could have arisen because the cultures were started from overnight pre-cultures that had reached stationary phase. Cells in this phase are expected to overexpress the RFP.
All Sanger sequence reads were handled using TraceTuner for base-calling (Paracel/Celera, 2006), Lucy for sequence quality/trimming (Chou and Holmes, 2001) and MIRA (Chevreux et al., 1999, RRID:SCR_010731) for assembling contigs. BLAST was used to obtain identity percentages (Altschul et al., 1990, RRID:SCR_004870). The Psipred web server was employed for secondary structure prediction (Buchan et al., 2013). All programs, except for psipred, were run locally.
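The standardization and summary statistics described in this section can be sketched as follows. This Python example is a stand-in under stated assumptions, not the Perl and R code used in the study: the integral is computed with the trapezoidal rule rather than with grofit, the 0.5 h step reflects the 30 min reading interval, and all function names and example readings are hypothetical.

```python
from statistics import median

def standardize(od_42, od_30):
    """Point-by-point ratio of restrictive (42 °C) to permissive (30 °C) readings."""
    if len(od_42) != len(od_30):
        raise ValueError("both curves must have the same number of time points")
    return [hot / cool for hot, cool in zip(od_42, od_30)]

def trapezoid_integral(values, step_h=0.5):
    """Area under an evenly sampled curve by the trapezoidal rule (step in hours)."""
    return sum((a + b) / 2.0 * step_h for a, b in zip(values, values[1:]))

def median_and_mad(values):
    """Median and median absolute deviation of a set of values."""
    centre = median(values)
    spread = median([abs(v - centre) for v in values])
    return centre, spread

# Hypothetical readings for one construct at three consecutive time points.
ratios = standardize([0.2, 0.4, 0.6], [0.4, 0.8, 0.8])
area = trapezoid_integral(ratios)
```

A reading of zero at 30 °C would make the ratio undefined, so blank-corrected data would need a guard before this step.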

Protein Purification
E. coli rpoD285 was used for this experiment. Three independent colonies of each pRK415sigma construct (excluding sigA mutants) were grown overnight at 30 °C. New flasks containing 100 ml of fresh medium were inoculated with the previous cultures, adjusting the OD600nm to 0.03, and incubated at 42 °C. The OD600nm of these cultures was periodically monitored and samples were taken when they had reached 0.6-0.8. We adjusted the OD600nm of each culture to obtain 40 ml at an OD600nm of 0.6. From this point on, samples were kept on ice. The vector and constructs pRKch02, ch03, ch07, ch10, and ch11 showed an OD600nm ranging from 0.04 to 0.06 after 72 h at 42 °C. These library members were excluded from this experiment. Cultures were washed with 1X PBS, re-suspended in ice-cold milliQ H2O supplemented with Protease Inhibitor Cocktail (Sigma-Aldrich) and sonicated three times (25 s, 13 microns). Then, 5 ml of absolute acetone was added, and samples were frozen overnight at -80 °C and centrifuged. Pellets were re-suspended in 5 ml of Extraction Buffer (0.7 M Sucrose, 0.5 M Tris-base, 0.1 M KCl, 30 mM HCl, 50 mM EDTA, 2% β-Mercaptoethanol and PVPP), mixed with 6 ml of phenol and centrifuged. The aqueous phase was recovered, suspended in 15 ml of ammonium acetate, frozen overnight at -80 °C and centrifuged. Samples were washed twice using 5 ml of 80% acetone. Samples were finally suspended in Solubilization Buffer (7 M Urea, 2 M Thiourea, 4% CHAPS, 2 mM TBP, 2% Ampholine, 600 mM DTT). Protein was quantified by Bradford assay, using bovine serum albumin as standard. All E. coli rpoD285 cultures were grown aerobically (220 rpm agitation) in liquid LB/Tet/0.5 mM IPTG medium. All centrifugation steps were done at 4 °C and 7000 rpm for 20 min.
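The adjustment to 40 ml at an OD600nm of 0.6 can be illustrated with the usual dilution relation C1·V1 = C2·V2. The Python sketch below assumes that the adjustment is a simple dilution with fresh medium and that OD600nm scales linearly with cell density in the 0.6-0.8 range; neither assumption is stated in the protocol, and the function name is hypothetical.

```python
def culture_volume_ml(measured_od, target_od=0.6, final_volume_ml=40.0):
    """Volume of culture to bring to `final_volume_ml` so that it reads `target_od`."""
    if measured_od < target_od:
        raise ValueError("culture is below the target OD and cannot be diluted to it")
    return target_od * final_volume_ml / measured_od

# Example: a culture harvested at OD 0.8 (upper end of the sampling window).
needed = culture_volume_ml(0.8)
medium_to_add = 40.0 - needed
```

For a culture harvested at exactly 0.6, the whole 40 ml comes from the culture and no medium is added.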

Phylogenetic Analysis
We selected a total of 74 sigma protein sequences belonging to 54 α-proteobacteria and 20 Enterobacteria species for the phylogenetic analysis. The protein sequences were aligned with ClustalW2 (Thompson et al., 2002, RRID:SCR_002909) and the evolutionary model (WAG) that best fit the data was obtained with ProtTest (version 2.4; Abascal et al., 2005). The phylogenetic reconstruction was made with PhyML software (version 3.0; Guindon and Gascuel, 2003).

Sigma Factor RpoD Region σ4 Crystal Model
We selected and downloaded the PDB file 4YLN (Zuo and Steitz, 2015) from the PDB database (PDB, RRID:SCR_012820). This file contains the crystallographic structures of E. coli transcription initiation complexes comprising a complete transcription bubble. We extracted chain F, which belongs to RpoD, and selected its region σ4 (amino acids 540 to 613). The structure of RpoD region σ4 and the promoter's -35 box DNA were displayed using the Swiss-PdbViewer software (version 4.1; Guex and Peitsch, 1997).
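The selection of region σ4 from chain F can also be done programmatically. The Python sketch below reads ATOM records using the fixed-column layout of the PDB format (residue name in columns 18-20, chain identifier in column 22, residue number in columns 23-26). It is an illustration only and not the procedure used here, which relied on Swiss-PdbViewer; the function name is hypothetical and the example records are synthetic, not taken from 4YLN.

```python
def residues_in_range(pdb_lines, chain="F", first=540, last=613):
    """Collect (residue number, residue name) pairs for one chain and number range."""
    seen = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                      # skip HETATM, TER, remarks, etc.
        if line[21] != chain:
            continue                      # column 22: chain identifier
        number = int(line[22:26])         # columns 23-26: residue sequence number
        if first <= number <= last:
            seen.setdefault(number, line[17:20].strip())  # columns 18-20: residue name
    return sorted(seen.items())

def atom_record(serial, atom, residue, chain, number):
    """Build a synthetic ATOM record prefix with PDB column alignment."""
    return "ATOM  %5d %-4s %3s %1s%4d    " % (serial, atom, residue, chain, number)

# Synthetic records for illustration only.
example = [
    atom_record(1, "N", "GLY", "F", 539),
    atom_record(2, "N", "ALA", "F", 540),
    atom_record(3, "CA", "ALA", "F", 540),
    atom_record(4, "N", "LYS", "A", 541),
    atom_record(5, "N", "SER", "F", 613),
    atom_record(6, "N", "THR", "F", 614),
]
selected = residues_in_range(example)
```

The default range 540 to 613 spans 74 residue numbers, consistent with the length of region σ4 given in this article.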

R. etli SigA is Laxer in Promoter Recognition than E. coli RpoD
As described in Ramírez-Romero et al. (2006), the functional comparison between E. coli RpoD and R. etli SigA revealed that RpoD is stricter for promoter recognition, which is reflected in a robust consensus sequence (Table 3). The opposite case is observed among α-proteobacteria, where lax primary sigma factors allow a larger variation in their promoter structure (Karls et al., 1993;Malakooti et al., 1995;Cullen et al., 1997;MacLellan et al., 2006;Ramírez-Romero et al., 2006). This observation suggests that E. coli RpoD is unable to recognize R. etli SigA-dependent promoters or recognizes them less efficiently. In order to test this possibility, the red fluorescent protein (RFP) gene was put under the control of three different promoters: RpoD consensus sequence (PEco), SigA consensus sequence (PRet), and an RpoD consensus that lacks the spacer and -10 box (Pless). Each of these constructs was cloned into pBBR1MCS5 (pBBPn) and pUC19 (pUCPn). The constructs pBBPEco and pBBPRet conferred a red color to R. etli CFN42 under previously described growth conditions, while the negative control pBBPless remained white (Figure 2). These data showed that SigA is able to use the RpoD consensus promoter sequence to sustain expression of the reporter gene under the tested growth conditions. At the same time, pUCPn constructs were introduced into E. coli DH5α. Only pUCPEco displayed red colored colonies. Both pUCPRet and pUCPless exhibited only white colored colonies (Figure 2). In order to exclude the participation of RpoS, the alternative sigma factor whose consensus promoter sequence partially resembles that of RpoD (Peano et al., 2015), we repeated the pUCPn experiments using an E. coli rpoS strain (E. coli BW28465; Zhou et al., 2003). RFP activities in the E. coli rpoS strain showed good correlation with those observed in DH5α (Kendall τ = 0.72, P-value = 0.0091; Figure 2).
Taken together, these results confirm previous observations and provide new information supporting the following: (1) E. coli RpoD is unable to sustain gene expression under the control of the R. etli SigA consensus promoter sequence, (2) SigA is a primary sigma factor with a lax promoter recognition pattern, and (3) alternative sigma factors of E. coli recognize neither the RpoD nor SigA consensus promoter sequences.
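The Kendall τ statistic used to compare RFP activities between strains measures how often two series rank their observations in the same order. The Python sketch below computes the tie-free form (τ-a) for illustration; it is not the R routine used in the study, which also handles ties and returns a P-value, and the example values are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall rank correlation (tau-a): (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both series order this pair the same way
        elif direction < 0:
            discordant += 1      # the series disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical activity values for the same constructs measured in two strains.
tau = kendall_tau_a([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Values near 1 indicate that both strains rank the constructs in nearly the same order, which is the sense in which the correlations in this section are interpreted.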

R. etli SigA Gene Complements the Heat-Sensitive Phenotype of an E. coli RpoD Mutant
Previous results showed that transcriptional fusions containing 33 different R. etli SigA-dependent promoters are not recognized by E. coli RpoD. However, the E. coli lactose promoter (Plac) is recognized by SigA, suggesting that the latter is a laxer primary sigma factor (Ramírez-Romero et al., 2006, and previous section). If this hypothesis were true, then SigA would be able to substitute for RpoD in vivo. To test this, we used E. coli rpoD285, which carries a thermo-sensitive RpoD allele. rpoD285 has a 42 bp in-frame deletion in its σNCR (Hu and Gross, 1983). This strain is unable to grow at 42 °C (restrictive temperature) due to the unfolding of its primary sigma factor. At 30 °C (permissive temperature) E. coli rpoD285 grows normally (Harris et al., 1978;Hu and Gross, 1983). RpoD and SigA were separately cloned into the pRK415 plasmid, where they are expressed under the control of Plac. Constructs pRK415rpoD (pRKrpoD; positive control), pRK415sigA (pRKsigA), and the vector (pRK415, negative control) were independently transformed into E. coli rpoD285. All constructs grew at the permissive temperature, presenting lag, exponential and stationary growth phases. Maximal stationary OD600nm ranged between 0.8 and 1.0 (Figure 3A). At the restrictive temperature, only the two primary sigma factor constructs sustained the growth of E. coli rpoD285, reaching maximal OD600nm between 0.7 and 0.9 (Figure 3B). We also reproduced these experiments using another RpoD thermo-sensitive E. coli strain, CAG1 (E. coli rpoD800; Liebke et al., 1980). E. coli rpoD800 complementation results show moderate correlation with those from rpoD285 (Kendall τ = 0.669, P-value = 2.38 × 10⁻⁷). Maximal OD600nm values for the E. coli rpoD800 experiments were 1.0-1.2 at the permissive temperature (Figure 3C) and 0.75-1.0 at the restrictive temperature (Figure 3D). Neither E. coli rpoD285 nor rpoD800 was complemented by the vector at the restrictive temperature. These results showed that R. etli SigA is able to complement the E. coli RpoD mutant phenotype.

Construction of Chimeric Genes Swapping RpoD and SigA Regions
To identify regions involved in the transcriptional laxity phenotype of SigA, we implemented a strategy based on the assembly of functional chimeric genes in E. coli. To this end, we constructed a library of 14 chimeras exchanging protein regions of RpoD and SigA (Figure 1). Each construct was designed in frame, keeping the ORF intact. Chimeric genes were cloned into pRK415. To determine the functionality of these chimeras in E. coli, we chose E. coli rpoD285 as a host strain for complementation experiments.

Substitution of SigA Regions, One At a Time
SigA σ1 and Non-Conserved Regions Are Not Involved in Transcriptional Laxity
The results described above suggest that SigA can recognize promoters associated with indispensable E. coli growth-related genes. The comparison of the predicted structure of SigA to known RpoD domains (Gruber and Gross, 2003) indicated that regions σ2.4 and σ4.2, responsible for promoter recognition, are identical and different by only three amino acids, respectively. SigA regions σ1 and σNCR comprise 72 extra residues, suggesting that differences located in these regions could be related to the SigA lax promoter recognition reported previously (Ramírez-Romero et al., 2006). If this were true, then the interchange of these SigA regions by their RpoD counterparts would change its promiscuous promoter recognition to a more stringent one. Chimeric constructs 04 and 05 (pRKch04 and pRKch05) represent this design (Figure 1). E. coli rpoD285 growth complementation experiments showed the following: (1) At the permissive temperature, these constructs displayed clearly discernible growth phases. Maximal stationary OD600nm ranged between 0.9 and 1.0 (Figure 4A). (2) At the restrictive temperature, all chimeras grew similarly to SigA, displaying growth phases. Maximal OD600nm ranged from 0.8 to 0.9 (Figure 4B). According to these results, SigA regions σ1 and σNCR do not contain elements related to transcriptional laxity.

Neither SigA σ2 nor σ3 Regions Participate in Transcriptional Laxity
Given that SigA and RpoD regions σ2 share 100% identity (100% coverage), region σ2 was discarded as a strong candidate for explaining transcriptional laxity. Construct pRKch04 supported this notion (see previous part). Furthermore, regions σNCR and σ2 are part of the same domain in RpoD (Gruber and Gross, 2003). For this reason, the design of chimeras 05 through 14 considered regions σNCR and σ2 as a unit. Construct pRKch08 exchanged region σ3. It exhibited discernible growth phases at both temperatures. Maximal stationary OD600nm reached 1.2 at the permissive and 0.8 at the restrictive temperature (Figure 4). Because construct pRKch08 was able to complement the E. coli rpoD285 thermo-sensitive phenotype, we concluded that SigA region σ3 does not participate in transcriptional laxity.

SigA σ4 Region May Participate in Transcriptional Laxity
The last chimeric construct (pRKch10) swapped regions σ4. E. coli rpoD285 complementation experiments revealed the following: (1) At the permissive temperature, this construct showed discernible growth phases, reaching a maximal OD600nm of 1.0 (Figure 4A) and (2) At the restrictive temperature, this construct was unable to sustain cell growth (Figure 4B). This was the first observation suggesting that SigA region σ4 participates in transcriptional laxity.
To test possible interactions between regions regarding transcriptional laxity, we replaced two and three regions at a time in the remaining constructs (Figure 1). In this way, we can unveil the potential participation of more than one SigA region in transcriptional laxity.

Substitution of SigA Regions, Two At a Time
This part of the pRK415sigma library was built exchanging every pair of regions according to a non-redundant combination design. In this way, six constructs were obtained. We replaced SigA regions in the following order: σ1/σ2 (pRKch01), σ1/σ3 (pRKch13), σ1/σ4 (pRKch11), σ2/σ3 (pRKch12), σ2/σ4 (pRKch14), and σ3/σ4 (pRKch02). All constructs behaved as expected at the permissive temperature, displaying clear growth phases. Maximal stationary OD600nm ranged between 0.8 and 1.1 (Figure 5A). At the restrictive temperature, constructs pRKch02 and pRKch11 showed no growth. The other paired library members (pRKch01, ch12, ch13, and ch14) displayed lag, exponential, and stationary growth phases with maximal OD600nm ranging from 0.6 to 0.8 (Figure 5B). From previous results, we expected that constructs exchanging SigA region σ4 (pRKch02, ch11, and ch14) would be unable to complement the E. coli rpoD285 phenotype. However, construct pRKch14 exhibited complementation capacity. We believe this ability is explained by the presence of both RpoD promoter recognition regions (σ2 and σ4) and the functional compatibility between all its protein regions.

Substitution of SigA Regions, Three At a Time
In the remaining part of the pRK415sigma library, we simultaneously interchanged three regions per construct. The replaced SigA regions were: σ1/σ2/σ3 (pRKch09), σ1/σ2/σ4 (pRKch07), σ1/σ3/σ4 (pRKch03), and σ2/σ3/σ4 (pRKch06). At the permissive temperature, all constructs showed growth, reaching maximal OD600nm between 1.0 and 1.1 (Figure 6A). At the restrictive temperature, constructs pRKch06 and ch09 displayed growth (maximal OD600nm of 0.9), but pRKch03 and ch07 were unable to complement the E. coli rpoD285 thermo-sensitive phenotype (Figure 6B). For pRK415sigma library constructs that were able to sustain growth at 42 °C, a general tendency of OD600nm decay was observed after the culture had reached its maximum (Figures 3-7, sections B). This observation may be explained by the accumulation of metabolic by-products caused by prolonged heat stress. This may cause cell death and/or arrest in the tested conditions.
Frontiers in Microbiology | www.frontiersin.org
Constructs pRKch06, ch07, and ch14 hold the two RpoD regions known to be involved in promoter recognition (σ2 and σ4). Only pRKch07 was unable to complement the E. coli rpoD285 phenotype at restrictive temperature, although it displays 96% identity (100% coverage) to RpoD. The inability of pRKch07 to complement the E. coli rpoD285 phenotype suggests that this particular combination of domains renders the chimeric protein non-functional, perhaps through defects in RNAP core binding, in the transition from abortive RNA synthesis to transcription elongation (σ3), or through misfolding of the entire protein. Given the previous results, the existence of chimeras that were unable to complement suggests incompatibility between RpoD and SigA regions, possibly due to allosteric interactions.
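The region bookkeeping for the pair and triple substitutions can be tabulated programmatically. The following is a minimal sketch and not part of the original analysis: the names `REPLACED`, `GREW_AT_42C`, `carries_rpod`, and `retains_siga` are illustrative assumptions, while the region assignments and growth outcomes at restrictive temperature are transcribed from the text above. Single-region chimeras are not included.

```python
# SigA regions replaced by their RpoD counterparts in each chimera
# (pair and triple substitutions only, as listed in the text).
REPLACED = {
    "ch01": {1, 2}, "ch13": {1, 3}, "ch11": {1, 4},
    "ch12": {2, 3}, "ch14": {2, 4}, "ch02": {3, 4},
    "ch09": {1, 2, 3}, "ch07": {1, 2, 4},
    "ch03": {1, 3, 4}, "ch06": {2, 3, 4},
}

# Constructs reported to grow at restrictive temperature.
GREW_AT_42C = {"ch01", "ch12", "ch13", "ch14", "ch06", "ch09"}


def carries_rpod(regions, construct):
    """True if every region in `regions` comes from RpoD in `construct`."""
    return set(regions) <= REPLACED[construct]


def retains_siga(region, construct):
    """True if `region` is still the SigA version in `construct`."""
    return region not in REPLACED[construct]


# Chimeras holding both RpoD promoter recognition regions (σ2 and σ4).
both_rpod_recognition = sorted(c for c in REPLACED if carries_rpod({2, 4}, c))
# Chimeras that keep SigA region σ4.
siga_sigma4 = sorted(c for c in REPLACED if retains_siga(4, c))

print(both_rpod_recognition)  # ['ch06', 'ch07', 'ch14']
print(siga_sigma4)            # ['ch01', 'ch09', 'ch12', 'ch13']
print(all(c in GREW_AT_42C for c in siga_sigma4))  # True
```

Within this subset, every chimera that keeps SigA region σ4 grew at restrictive temperature, whereas the chimeras carrying RpoD region σ4 gave mixed outcomes.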

Chimeric and Wild-Type Genes are Translated in E. coli rpoD285
To establish whether every pRK415 construct was translated in E. coli rpoD285, total protein was extracted from 14 library members grown at restrictive temperature. Constructs pRKch02, ch03, ch07, ch10, and ch11, together with the empty vector, were discarded because of their inability to sustain growth at 42 °C. Protein samples were collected, electrophoresed, electro-transferred to nitrocellulose membranes, and incubated with primary antibody 2G10 (Creative Biomart Cat# CABT-36751ME, RRID:AB_11443551), which targets the amino acid region mapped from residues 470 to 486 on E. coli RpoD (inside region 3.1) and also cross-reacts with primary sigma factors of other bacterial species (Batut et al., 1991; Severinova et al., 1996; Breyer et al., 1997; Cullen et al., 1997; Bowman and Kranz, 1998). Western blot results showed a band corresponding to the molecular mass of RpoD (reference line two, E. coli EσD), suggesting the expression of all constructs that complemented growth of E. coli rpoD285 at restrictive temperature (Supplementary Figure 1).

Colony Forming Units Assay Confirms the Growth Curve Data
To support the growth curve data, we performed a colony forming units (CFU) assay. CFU results showed general agreement with the OD 600nm growth curve data (for statistical correlation tests please see Supplementary Information), i.e., only constructs that displayed growth in liquid media also did so on solid plates (Supplementary Figure 2).

Growth Curve Analysis of the pRK415sigma Library
To analyze the growth curves of the pRK415sigma library, we mathematically modeled the observed data with the gcFitModel function of the R package grofit (Kahm et al., 2010). In this way, descriptive growth parameters such as the lag phase length (λ), growth rate or maximum slope (µ), maximum cell growth (A), and the area under the curve (integral) were obtained. To review the goodness of the fit and statistical tests applied to these parameters please see Supplementary  The parameter integral was chosen to compare growth kinetics because it encompasses all the others. Clustering of standardized integral values (see Materials and Methods) allowed us to propose three groups of kinetic behaviors. The no-growth group comprised the vector and chimeras ch02, ch03, ch07, ch10, and ch11. The intermediate-growth group comprised sigA and chimeras ch01, ch04, ch05, ch08, ch12, and ch13. The high-growth group consisted of rpoD and chimeras ch06, ch09, and ch14. For a more detailed description please see Supplementary
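To illustrate what the integral parameter measures, the sketch below computes the area under an OD curve with the trapezoidal rule and then standardizes the areas as z-scores. This is an assumption-laden stand-in, not the grofit procedure: grofit derives the integral from its fitted curves, the exact standardization used in Materials and Methods is not restated here, and the readings and the names `trapezoid_area` and `standardize` are hypothetical.

```python
from statistics import mean, stdev


def trapezoid_area(times, od):
    """Area under an OD curve by the trapezoidal rule."""
    if len(times) != len(od) or len(times) < 2:
        raise ValueError("need two or more paired time/OD points")
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(times, times[1:], od, od[1:])
    )


def standardize(values):
    """Z-scores: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical readings (hours, OD 600nm); NOT data from the study.
hours = [0, 2, 4, 6, 8]
curves = {
    "no_growth": [0.05, 0.05, 0.05, 0.05, 0.05],
    "intermediate": [0.05, 0.10, 0.30, 0.50, 0.60],
    "high": [0.05, 0.20, 0.60, 0.90, 0.90],
}

areas = {name: trapezoid_area(hours, od) for name, od in curves.items()}
z = dict(zip(areas, standardize(list(areas.values()))))

for name in areas:
    print(name, round(areas[name], 2), round(z[name], 2))
```

Because the area grows with a shorter lag, a steeper slope, and a higher plateau, a single standardized value per construct is enough to rank the toy curves; any clustering method could then be applied to those values.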

SigA Region σ4 Aids Transcriptional Laxity
Growth curve and parameter data revealed that whenever SigA region σ4 was present in a genetic construct, the chimera carrying it was able to complement E. coli rpoD285 growth at restrictive temperature (Table 4). This feature was not observed with any other SigA region, i.e., regions σ1, σ2, and σ3 of the R. etli primary sigma factor were present in chimeras that either sustained or impaired growth at restrictive temperature. To assess the sequence conservation of SigA region σ4 among the other α-proteobacteria that showed transcriptional laxity, we aligned the amino acid sequences of primary sigma factors from R. etli, S. meliloti, R. capsulatus, R. sphaeroides, C. crescentus, and E. coli using MUSCLE (Supplementary Figure 5). Sequence alignment revealed 16 mismatches out of 74 residues along region σ4 of these primary sigma factors. For this reason, we targeted these sites on SigA for mutational analysis.
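Counting variable positions in an alignment is simple to express in code. The sketch below is a simplified pairwise version under stated assumptions: it compares only two aligned sequences rather than the six-species MUSCLE alignment, it skips gapped columns by choice, and the toy fragments and the names `mismatch_positions` and `percent` are hypothetical. The final line only restates the arithmetic behind 16 variable residues out of 74.

```python
def mismatch_positions(seq_a, seq_b, gap="-"):
    """Return 0-based alignment columns where two aligned sequences differ.

    Columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != gap and b != gap and a != b
    ]


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Toy aligned fragments (hypothetical, NOT the real region σ4 sequences).
toy_a = "MKLTEREAEV"
toy_b = "MRLTDREAKV"

print(mismatch_positions(toy_a, toy_b))  # [1, 4, 8]
print(percent(16, 74))                   # 21.6
```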

SigA Region σ4 Mutants
We decided to substitute the 16 mismatched positions found along region σ4 by exchanging SigA residues for their RpoD counterparts. These sequence changes were expected to improve the complementation capacity of SigA in E. coli. Secondary structure prediction using Psipred mapped these changes to four different α-helices and to some of their associated coil/loop regions. For SigA mutant 01 (sigAm01), five sequence changes were introduced into the first α-helix of region σ4; SigA mutant 02 (sigAm02) carried four changes within the second α-helix; and SigA mutant 03 (sigAm03) carried seven changes along the helix-turn-helix (HTH) motif. The HTH motif is responsible for recognition of the promoter −35 box. Once the amino acid sequences of the mutants were designed, we identified the corresponding codons and exchanged them for those of RpoD. In silico designed sigA mutant sequences and pRK415 plasmid DNA were sent to GenScript (NJ, USA) to be chemically synthesized and cloned into the expression vector.
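The design logic, copying RpoD residues into SigA one structural segment at a time, can be sketched as follows. This is an illustrative assumption rather than the authors' procedure: it works on amino acid strings instead of codons, the 12-residue toy alignment and the window coordinates are hypothetical, and `swap_segment` and `count_changes` are made-up helper names.

```python
def swap_segment(siga, rpod, start, end):
    """Copy RpoD residues into SigA over the half-open window [start, end).

    Assumes an ungapped, equal-length alignment.
    """
    if len(siga) != len(rpod):
        raise ValueError("sequences must be aligned to equal length")
    return siga[:start] + rpod[start:end] + siga[end:]


def count_changes(original, mutant):
    """Number of positions at which the mutant differs from the original."""
    return sum(a != b for a, b in zip(original, mutant))


# Hypothetical toy alignment and windows; NOT the real region σ4.
siga_toy = "AAKLMAAQRSAA"
rpod_toy = "AAELMAADRTAA"
windows = {"m01": (0, 6), "m02": (6, 12)}

mutants = {
    name: swap_segment(siga_toy, rpod_toy, start, end)
    for name, (start, end) in windows.items()
}
for name, seq in mutants.items():
    print(name, seq, count_changes(siga_toy, seq))

# The per-mutant changes reported in the text cover all variable sites.
print(5 + 4 + 7)  # 16
```

Restricting each mutant to one window keeps the changes in a helix separable from those in the HTH motif, which is what allows the phenotype of each segment to be read independently.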
OD 600nm ranged between 1.1 and 1.3 at permissive temperature and between 0.81 and 0.86 at restrictive temperature. All SigA mutants were able to sustain growth of E. coli rpoD285 at restrictive temperature and exhibited OD decay during stationary phase (after reaching maximal growth), as previously seen for other pRK415sigma constructs. Growth parameters and statistical analysis for sigA mutants were computed as formerly described. The three SigA mutants fell into the high-growth cluster where RpoD resides (

DISCUSSION
The R. etli primary sigma factor, SigA, is able to transcribe most of the previously tested E. coli RpoD-dependent promoters (Ramírez-Romero et al., 2006), although these proteins fall into clearly separated phylogenetic clusters (Supplementary Figure 6). The same behavior was observed between other α-proteobacteria and the enterobacterial model (Karls et al., 1993; Malakooti et al., 1995; Cullen et al., 1997; MacLellan et al., 2006; Ramírez-Romero et al., 2006). This capacity may be explained, at least in part, by adaptive demands derived from the wide range of environmental conditions that α-proteobacteria inhabit.
In this work, we showed that SigA can complement E. coli rpoD285 (rpoD thermo-sensitive strain) growth at restrictive temperature (42 °C). We have called this the transcriptional laxity phenomenon. Moreover, the R. etli transcription machinery is also able to transcribe the RFP reporter gene from an RpoD consensus promoter sequence. These results imply that the following conditions are met in the E. coli rpoD285 host: (1) SigA is transcribed and translated, (2) SigA folds in a functional manner, (3) SigA is able to interact with RNAP core, assembling functional hybrid holoenzymes, (4) the hybrid holoenzymes are capable of transcribing indispensable genes for survival and growth at 42 °C, and (5) wild-type R. etli holoenzyme can transcribe the E. coli RpoD-dependent consensus promoter.
To identify the protein region(s) responsible for transcriptional laxity, we made a chimeric gene library exchanging regions of SigA and RpoD. In this way, all 14 non-redundant combinations between the two wild-type genes were obtained. Chimeric library constructs were tested for their ability to complement E. coli rpoD285 growth at 42 °C, and these data were mathematically modeled to generate descriptive parameters of the construct kinetics.
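The count of 14 follows from swapping one, two, or three of the four regions at a time (4 + 6 + 4), leaving out the two wild-type genes. A minimal sketch of that enumeration is given below; the function name `chimera_designs` is an illustrative assumption, and the output does not reproduce the pRKch numbering used for the actual constructs.

```python
from itertools import combinations

REGIONS = ("σ1", "σ2", "σ3", "σ4")


def chimera_designs(regions=REGIONS):
    """All proper, non-empty subsets of regions to swap, grouped by size."""
    return {k: list(combinations(regions, k)) for k in range(1, len(regions))}


designs = chimera_designs()
for k, combos in designs.items():
    print(k, len(combos), ["/".join(c) for c in combos])

total = sum(len(combos) for combos in designs.values())
print(total)  # 14
```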
We found that whenever SigA region σ4 is present in a chimera, the construct is able to complement in vivo the thermo-sensitive misfolding of RpoD285. Sequence alignment between E. coli RpoD and lax primary sigma factors from α-proteobacteria (Karls et al., 1993; Malakooti et al., 1995; Cullen et al., 1997; MacLellan et al., 2006; Ramírez-Romero et al., 2006) revealed 16 mismatches along region σ4. Sequence identity at these positions is conserved among α-proteobacterial proteins only. The SigA mutant library was designed according to sequence alignment and secondary structure prediction data (Liebke et al., 1980; Buchan et al., 2013), resulting in three different mutant proteins. Although sigA and its mutants exhibit the lowest sequence identity percentage to RpoD among library members, the introduced changes significantly improved the phenotypic complementation ability of SigA mutants (Wilcoxon U-test P-values: 1.02 × 10⁻¹⁰, 1.06 × 10⁻⁷, and 2.9 × 10⁻¹¹) on E. coli rpoD285. This result shows that amino acid sequence identity alone is not enough to predict transcriptional laxity of a primary sigma factor.
Table footnotes: Integral values represent the area under the curve for each library member. RpoD sequence was used as query in BLASTP searches to determine identity percentages. Growth group determined by clustering library members. NA, not applicable.
In this study, we show that region σ4 is involved in transcriptional laxity, at least among α-proteobacterial primary sigma factors. Moreover, sequence changes in this protein region alone can enhance its transcriptional capacity in the target host organism. This transcriptional improvement can be achieved by altering the first two α-helices of region σ4, although they do not appear to interact directly with promoter DNA. Comparison of primary sigma-dependent promoter consensus sequences between E. coli and α-proteobacteria revealed that −35 boxes are strongly conserved, while −10 boxes display higher sequence variation in the latter bacterial class (Table 3). These observations support the relevance of region σ4 in the transcriptional laxity phenomenon among α-proteobacteria.
We propose that primary sigma factors of the α-proteobacteria class rely mostly on region σ4 to achieve transcription. It is also the main target region for sequence changes that enhance functional fitness of the protein to its host organism. The first two α-helices of region σ4 may help in efficient folding and positioning of the HTH motif, so that recognition of the −35 box can be accomplished. We also hypothesize that α-proteobacterial primary sigma factors depend predominantly on finding and binding to the −35 box of the promoter sequence during closed complex formation. The interaction between region σ4 and the −35 box anchors Eσ at the promoter long enough to start the transcription bubble, lessening the need for a well-conserved −10 box. In this way, α-proteobacteria manage to sustain transcription from a wide variety of promoter sequences, even from those of other bacterial classes of their phylum. Transcriptional laxity may have arisen during α-proteobacterial evolution to: (1) ensure expression of essential genes under a wide range of environmental conditions, (2) exploit possibly advantageous sequences obtained by horizontal gene transfer, (3) adapt to and colonize the diverse environments they inhabit today, and (4) counteract the effect of naturally occurring mutations at endogenous promoter sequences.

AUTHOR CONTRIBUTIONS
OS, MR, and GD designed the study. OS performed the statistical analysis and the majority of the experiments.
OS and AC purified the protein samples and carried out Western Blotting. LL built the phylogenetic trees and crystal model figures of region σ4. OS wrote the paper and prepared figures. MR, GD, and SE critically edited the manuscript.