<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2018.01344</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Mining Novel Constitutive Promoter Elements in Soil Metagenomic Libraries in <italic>Escherichia coli</italic></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Westmann</surname> <given-names>Cau&#x000E3; A.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/529747/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Alves</surname> <given-names>Luana de F&#x000E1;tima</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/572612/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Silva-Rocha</surname> <given-names>Rafael</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/243152/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Guazzaroni</surname> <given-names>Mar&#x000ED;a-Eugenia</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/529725/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Cellular and Molecular Biology, FMRP, University of S&#x000E3;o Paulo</institution>, <addr-line>Ribeir&#x000E3;o Preto</addr-line>, <country>Brazil</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Biology, FFCLRP, University of S&#x000E3;o Paulo</institution>, <addr-line>Ribeir&#x000E3;o Preto</addr-line>, <country>Brazil</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Biochemistry, FMRP, University of S&#x000E3;o Paulo</institution>, <addr-line>Ribeir&#x000E3;o Preto</addr-line>, <country>Brazil</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Angel Angelov, Technische Universit&#x000E4;t M&#x000FC;nchen, Germany</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Ram&#x000F3;n Alberto Batista-Garc&#x000ED;a, Universidad Aut&#x000F3;noma del Estado de Morelos, Mexico; Justin Joseph Donato, University of St. Thomas, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Mar&#x000ED;a-Eugenia Guazzaroni <email>meguazzaroni&#x00040;ffclrp.usp.br</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology</p></fn></author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>06</month>
<year>2018</year>
</pub-date>
<pub-date pub-type="collection">
<year>2018</year>
</pub-date>
<volume>9</volume>
<elocation-id>1344</elocation-id>
<history>
<date date-type="received">
<day>18</day>
<month>02</month>
<year>2018</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>05</month>
<year>2018</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2018 Westmann, Alves, Silva-Rocha and Guazzaroni.</copyright-statement>
<copyright-year>2018</copyright-year>
<copyright-holder>Westmann, Alves, Silva-Rocha and Guazzaroni</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Although functional metagenomics has been widely employed for the discovery of genes relevant to biotechnology and biomedicine, its potential for assessing the diversity of transcriptional regulatory elements of microbial communities has remained poorly explored. Here, we experimentally mined novel constitutive promoter sequences in metagenomic libraries by combining a bi-directional reporter vector, high-throughput fluorescence assays and predictive computational methods. Through the expression profiling of fluorescent clones from two independent soil sample libraries, we have analyzed the regulatory dynamics of 260 clones with candidate promoters as a set of active metagenomic promoters in the host <italic>Escherichia coli</italic>. Through an in-depth analysis of selected clones, we were able to further explore the architecture of metagenomic fragments and to report the presence of multiple promoters per fragment with a dominant promoter driving the expression profile. These approaches resulted in the identification of 33 novel active promoters from metagenomic DNA originating from very diverse phylogenetic groups. The <italic>in silico</italic> and <italic>in vivo</italic> analysis of these individual promoters allowed the generation of a constitutive promoter consensus for exogenous sequences recognizable by <italic>E. coli</italic> in metagenomic studies. The results presented here demonstrate the potential of functional metagenomics for exploring environmental bacterial communities as a source of novel regulatory genetic parts to expand the toolbox for microbial engineering.</p></abstract>
<kwd-group>
<kwd>functional metagenomics</kwd>
<kwd>bi-directional reporter</kwd>
<kwd>constitutive promoters</kwd>
<kwd>synthetic biology</kwd>
<kwd>high-throughput screening</kwd>
</kwd-group>
<contract-num rid="cn001">2015/04309-1</contract-num>
<contract-num rid="cn001">2012/21922-8</contract-num>
<contract-num rid="cn001">2016/05472-6</contract-num>
<contract-num rid="cn001">2016/06323-4</contract-num>
<contract-num rid="cn002">472893/2013-0</contract-num>
<contract-num rid="cn002">441833/2014-4</contract-num>
<contract-sponsor id="cn001">Funda&#x000E7;&#x000E3;o de Amparo &#x000E0; Pesquisa do Estado de S&#x000E3;o Paulo<named-content content-type="fundref-id">10.13039/501100001807</named-content></contract-sponsor>
<contract-sponsor id="cn002">Conselho Nacional de Desenvolvimento Cient&#x000ED;fico e Tecnol&#x000F3;gico<named-content content-type="fundref-id">10.13039/501100003593</named-content></contract-sponsor>
<counts>
<fig-count count="4"/>
<table-count count="2"/>
<equation-count count="0"/>
<ref-count count="85"/>
<page-count count="15"/>
<word-count count="10588"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>The study of prokaryotic transcriptional regulation is essential for understanding the molecular mechanisms underlying decision-making processes in microorganisms (Ishihama, <xref ref-type="bibr" rid="B38">2010</xref>), comprising populational, ecological and pathogenic behaviors. The activity of most bacterial promoters is usually dependent on the combined action of transcription factors and sigma factors in response to multiple environmental stimuli (Browning and Busby, <xref ref-type="bibr" rid="B8">2016</xref>). For instance, in <italic>Escherichia coli</italic>, the compilation of decades of experimental data indicates that &#x0007E;50% of its promoters are under the control of a single specific regulator, while all other genes are regulated by at least two transcription factors (Gama-Castro et al., <xref ref-type="bibr" rid="B30">2016</xref>). Moreover, the recent development of experimental and large-scale sequencing techniques, together with powerful computational approaches, has allowed both the discovery of insightful information about other bacterial transcriptional systems and the development of novel approaches for studying them in greater depth (Shen-Orr et al., <xref ref-type="bibr" rid="B70">2002</xref>; Mart&#x000ED;nez-Antonio and Collado-Vides, <xref ref-type="bibr" rid="B57">2003</xref>; Covert et al., <xref ref-type="bibr" rid="B12">2004</xref>; Shimada et al., <xref ref-type="bibr" rid="B71">2005</xref>). However, despite technical innovations, most studies are still centered on <italic>E. coli</italic>, a single bacterial species among at least 30,000 others already sequenced (Land et al., <xref ref-type="bibr" rid="B48">2015</xref>), in an estimated total of 1 trillion species (Locey and Lennon, <xref ref-type="bibr" rid="B51">2016</xref>).</p>
<p>With the advent of metagenomics (Handelsman et al., <xref ref-type="bibr" rid="B36">1998</xref>), the exploration of unculturable bacteria (&#x0007E;99% of a bacterial community; Amann et al., <xref ref-type="bibr" rid="B2">1995</xref>) has widely expanded the available genomic information, providing resourceful data about populational structures and genetic diversity in a myriad of environmental samples (Torsvik and &#x000D8;vre&#x000E5;s, <xref ref-type="bibr" rid="B79">2002</xref>; Venter, <xref ref-type="bibr" rid="B84">2004</xref>; Tringe, <xref ref-type="bibr" rid="B80">2005</xref>). Two main approaches are commonly adopted for those metagenomic studies (Singh et al., <xref ref-type="bibr" rid="B75">2009</xref>): the sequence-based metagenomic approach, which relies on massive sequencing of metagenomic DNA and powerful bioinformatics tools for extracting information from the metagenomic sequences; and functional metagenomics (Cowan et al., <xref ref-type="bibr" rid="B13">2005</xref>; Li and Qin, <xref ref-type="bibr" rid="B49">2005</xref>), which directly explores the functionality of enzymes and other structural elements through a wide range of stress/substrate/product-based assays (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>; Uchiyama and Miyazaki, <xref ref-type="bibr" rid="B82">2010</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B32">2013</xref>). In this context, although a large number of genes/ORFs has been discovered through the previously described approaches, the detection of novel bacterial regulatory elements using high-throughput technologies has been poorly explored, presenting so far a single well-defined method for the discovery of substrate-inducible regulatory sequences&#x02014;SIGEX (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>)&#x02014;and a direct assay for prospecting promoters for industrial applications (Han et al., <xref ref-type="bibr" rid="B35">2008</xref>). 
This scarce number of methodologies is directly related to the biased search toward novel enzymatic activities and to a lack of both experimental and computational tools for finding and validating promoter sequences in metagenomic libraries (Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>).</p>
<p>Unraveling novel bacterial promoters is essential for understanding the regulatory diversity of microorganisms, addressing important questions, such as the abundance of both constitutive and inducible elements in a metagenomic library, the bottlenecks regarding host choices (i.e., the constraints limiting the diversity of exogenous promoters that can be recognized by different hosts) and the correlation between promoter strength, transcriptional noise and the functional role of the regulated gene/operon (Ekkers et al., <xref ref-type="bibr" rid="B20">2012</xref>; Silander et al., <xref ref-type="bibr" rid="B73">2012</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>; Vester et al., <xref ref-type="bibr" rid="B85">2015</xref>). Furthermore, prospecting and characterizing novel promoters is crucial for expanding the current Synthetic Biology toolbox and generating novel biotechnological applications, as there is a high demand for constitutive and inducible promoters responding to process-specific parameters (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>; Silva-Rocha and de Lorenzo, <xref ref-type="bibr" rid="B74">2008</xref>; Boyle and Silver, <xref ref-type="bibr" rid="B7">2009</xref>; Blount et al., <xref ref-type="bibr" rid="B5">2012</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>).</p>
<p>In this context, the most common strategy for prospecting promoters is the use of trap-vectors, which consist of transcriptional fusions between DNA fragments and a reporter gene. This method has been widely employed for assessing promoters in genomic DNA (Kubota et al., <xref ref-type="bibr" rid="B46">1991</xref>; Dunn and Handelsman, <xref ref-type="bibr" rid="B18">1999</xref>; Lu et al., <xref ref-type="bibr" rid="B53">2004</xref>; Chen et al., <xref ref-type="bibr" rid="B9">2007</xref>); however, its application to metagenomic DNA fragments has remained poorly explored (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>; Han et al., <xref ref-type="bibr" rid="B35">2008</xref>). Furthermore, most adopted promoter trap-systems are unidirectional, while bacterial genomes present a large variation in the percentage of their leading-strand genes, ranging from &#x0007E;45 to &#x0007E;90% (Mao et al., <xref ref-type="bibr" rid="B56">2012</xref>, <xref ref-type="bibr" rid="B55">2015</xref>), suggesting that a bi-directional promoter reporter system would be preferable. Therefore, in the present work, we merge this strategy into an integrative approach for exploring bacterial communities through the lens of their regulatory dynamics, focusing on the study of bacterial promoter elements from environmental soil samples.</p>
<p>Although both constitutive and inducible promoters can potentially be detected by the bi-directional method, we have focused exclusively on the study of the former, as a proof of concept, avoiding substrate-based induction assays (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>; Williamson et al., <xref ref-type="bibr" rid="B86">2005</xref>; Uchiyama and Miyazaki, <xref ref-type="bibr" rid="B82">2010</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B32">2013</xref>). We have collected soil samples from two differentially biomass-enriched sites of a Secondary Atlantic Forest in South-eastern Brazil and generated metagenomic libraries in a bi-directional probe vector for primary screenings. We have characterized the expression behaviors of a large set of GFPlva-expressing clones from both libraries and narrowed down our selection to 10 clones for an in-depth analysis regarding potential ORFs and endogenous promoters. By cross-validating <italic>in silico</italic> analyses and experimental data of predicted constitutive promoters, we have located and profiled the expression of 33 endogenous promoters within the selected clones, providing resourceful information concerning the architecture and transcriptional dynamics of promoters from metagenomic fragments. Through the identification of novel constitutive, natural promoters, our work contributes to the expansion of the toolbox of synthetic biology, which, in turn, can be used for genetic modification of microorganisms relevant to biotechnology.</p></sec>
<sec sec-type="materials and methods" id="s2">
<title>Materials and methods</title>
<sec>
<title>Bacterial strains, primers, plasmids, and general growth conditions</title>
<p><italic>Escherichia coli</italic> DH10B (Invitrogen) cells were used for cloning and experimental procedures. <italic>E. coli</italic> strains were routinely grown at 37&#x000B0;C in Luria&#x02013;Bertani (LB) medium or M9 minimal medium (Sambrook et al., <xref ref-type="bibr" rid="B65">1989</xref>) (6.4 g/L Na<sub>2</sub>HPO<sub>4</sub>&#x000B7;7H<sub>2</sub>O, 1.5 g/L KH<sub>2</sub>PO<sub>4</sub>, 0.25 g/L NaCl, and 0.5 g/L NH<sub>4</sub>Cl) supplemented with 2 mM MgSO<sub>4</sub>, 0.1 mM casamino acids, and 1% glycerol as the sole carbon source. When required, chloramphenicol (Cm) (34 &#x003BC;g/mL) was added to the medium to ensure plasmid retention. When cells were grown in minimal medium, antibiotics were used at half concentration. Transformed bacteria were recovered in liquid LB medium for 1 h at 37&#x000B0;C and 180 r.p.m., followed by plating on LB-agar plates at 37&#x000B0;C for at least 18 h. All constructs were cloned into the pMR1 bi-directional reporter vector (Guazzaroni and Silva-Rocha, <xref ref-type="bibr" rid="B33">2014</xref>), which carries mCherry and GFPlva, a short-lived variant of GFP.</p>
</sec>
<sec>
<title>Study site, soil sampling, and DNA extraction</title>
<p>Soil samples were obtained from a parcel in the southeast region of Brazil (South America), from a Secondary Atlantic Forest at the University of S&#x000E3;o Paulo (Ribeir&#x000E3;o Preto, S&#x000E3;o Paulo, Brazil; 21&#x000B0;09&#x00027;58.4&#x0201D;S, 47&#x000B0;51&#x00027;20.1&#x0201D;W, at an altitude of 540 m). The soil from this parcel is geologically classified as an Oxisol (Schaefer et al., <xref ref-type="bibr" rid="B67">2008</xref>), i.e., a clay soil that presents a red or yellowish color due to its high concentration of iron (III) and aluminum oxides and hydroxides. The top soil from two sections of the parcel (herein referred to as USP1 and USP3) was sampled at a depth of 0&#x02013;15 cm in July 2015 (soil temperature 23&#x000B0;C). Three replicates (0.2 kg each) were collected within a 1 m distance, and the samples were stored at &#x02212;20&#x000B0;C until DNA was extracted. Each sample was differentially enriched regarding tree species abundance on plant-litter composition: (i) enriched in leaves from <italic>Phytolacca dioica</italic> and (ii) from <italic>Anadenanthera</italic> spp. DNA was extracted from soil samples using the UltraClean&#x02122; Soil DNA isolation Kit (Mo Bio Laboratories, Solana Beach, CA, USA). DNA was visualized by 0.7% (w/v) agarose gel electrophoresis and quantified spectrophotometrically (260 nm).</p>
</sec>
<sec>
<title>Metagenomic library construction and screening for fluorescent clones</title>
<p>For the construction of the libraries, metagenomic DNA was partially digested using Sau3AI, and fragments from 1.5 to 7 kb were extracted from an agarose gel for ligation into the dephosphorylated and BamHI-digested pMR1 vector. Ligation mixtures were transformed by electroporation into <italic>E. coli</italic> DH10B cells. To amplify the libraries, transformants were grown on LB agar plates containing Cm and incubated for 18 h at 37&#x000B0;C. Both green and red clones were manually isolated from LB-agar plates exposed to blue light (&#x0007E;470 nm) on a transilluminator (Safe Imager&#x02122; 2.0 Blue Light Transilluminator). Ten fluorescent and 20 non-fluorescent clones were randomly picked from each library and had their plasmids extracted, followed by digestion with EcoRI and SmaI enzymes to check for the presence/absence of inserts and their sizes. Cells from the same library were collected and pooled together in LB supplemented with 10% (wt/vol) glycerol for storage at &#x02212;80&#x000B0;C. The plasmids from the 10 selected clones were isolated from individual clones and transformed into new <italic>E. coli</italic> DH10B cells to reconfirm expression patterns.</p>
</sec>
<sec>
<title>Nucleic acid techniques</title>
<p>DNA preparation, digestion with restriction enzymes, analysis by agarose gel electrophoresis, isolation of DNA fragments, ligations, and transformations were done by standard procedures (Sambrook et al., <xref ref-type="bibr" rid="B65">1989</xref>). Plasmid DNA was sequenced on both strands by primer walking using the ABI PRISM Dye Terminator Cycle Sequencing Ready Reaction kit (PerkinElmer) and an ABI PRISM 377 sequencer (Perkin-Elmer) according to the manufacturer&#x00027;s instructions.</p>
</sec>
<sec>
<title>GFP fluorescence assay and data processing</title>
<p>To measure promoter activity, freshly plated single colonies were grown overnight in M9 medium supplemented with required antibiotics. Samples were diluted 1:20 (v/v) in M9 medium for a final volume of 200 &#x003BC;L in 96-well microplates. Cell growth and GFP fluorescence were quantified using a Victor X3 plate reader (PerkinElmer, Waltham, MA, USA). Promoter activities were expressed as the emission of fluorescence at 535 nm upon excitation with 485 nm light and then normalized by the optical density at each time point (reported as fluorescence/OD<sub>600</sub>) after background correction. Background signal was evaluated with non-inoculated M9 medium and used as a blank for adjusting the baseline of measurements. <italic>E. coli</italic> DH10B harboring the pMR1 empty plasmid was used as a negative control. Three different positive controls were used, consisting of <italic>E. coli</italic> DH10B harboring the pMR1 plasmid with one of the following synthetic constitutive promoters from the iGEM BBa_J23104 Anderson&#x00027;s catalog (<ext-link ext-link-type="uri" xlink:href="http://parts.igem.org/Promoters/Catalog/Anderson">http://parts.igem.org/Promoters/Catalog/Anderson</ext-link>) (Kelly et al., <xref ref-type="bibr" rid="B42">2009</xref>) upstream of a GFPlva reporter: J23100, J23106, and J23114 (referred to here as p100, p106 and p114, respectively; Sanches-Medeiros et al., <xref ref-type="bibr" rid="B66">2018</xref>). Unless otherwise indicated, measurements were taken at 30 min intervals over 8 h. All experiments were performed with both technical and biological replicates; biological triplicates were evaluated as independent measurements on different dates. Raw data were processed and plots were constructed using Microsoft Excel. All data were normalized by background values and transformed to a log<sub>2</sub> scale for better visualization. 
Heatmap dendrograms with expression profiles were generated by using MeV2 (<ext-link ext-link-type="uri" xlink:href="http://mev.tm4.org/">http://mev.tm4.org/</ext-link>) software.</p>
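<p>As an illustration of the data processing described above, the following minimal Python sketch applies blank subtraction, fluorescence/OD<sub>600</sub> normalization and log<sub>2</sub> transformation to a single well. It is not the authors&#x00027; implementation (the original processing was done in Microsoft Excel). The function name promoter_activity, the choice to subtract the blank from both fluorescence and optical density, and the handling of readings at or below background are assumptions made for this example.</p>
<preformat><![CDATA[
```python
import math


def promoter_activity(fluorescence, od600, blank_fluorescence, blank_od600):
    """Return log2(background-corrected fluorescence / OD600) per time point.

    Assumption: the blank (non-inoculated M9 medium) is subtracted from both
    the fluorescence and the OD600 readings before taking the ratio.
    Time points at or below background are reported as None.
    """
    activities = []
    for f, od in zip(fluorescence, od600):
        f_corr = f - blank_fluorescence
        od_corr = od - blank_od600
        if f_corr <= 0 or od_corr <= 0:
            activities.append(None)  # signal not distinguishable from blank
            continue
        activities.append(math.log2(f_corr / od_corr))
    return activities
```
]]></preformat>
<p>In use, each list would hold the readings of one well taken at 30 min intervals over 8 h, and the resulting profiles could then be averaged across replicates before clustering.</p>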
</sec>
<sec>
<title>Generation and screening of small-DNA-insert libraries</title>
<p>In order to experimentally find and validate the promoter regions from each of the 10 selected metagenomic clones, an experimental technique was developed based on the previously described methodology of metagenomic library construction. All selected clones had their plasmids extracted and pooled together in an equimolar ratio. The pooled sample was amplified in a single PCR reaction using a high-fidelity polymerase (Phusion) and previously described primers flanking the multiple cloning site (MCS) of the pMR1 vector, into which the metagenomic inserts were cloned. The resulting amplicons were first submitted to an analytical digestion followed by electrophoretic analysis to find the optimal concentration of Sau3AI enzyme for obtaining fragments ranging in size from 0.1 to 0.5 kb. Then, the purified pooled samples were fragmented by Sau3AI in a preparative digestion and thereafter excised from a 1% agarose gel in the region between 0.1 and 0.5 kb. These small DNA fragments, in turn, were ligated into the pMR1 vector. Aliquots of electrocompetent <italic>E. coli</italic> DH10B cells were transformed with the ligated DNA. A total of 100 fluorescent clones (80 expressing GFP and 20 expressing mCherry) were isolated by screening under blue light excitation and had their plasmids extracted for sequencing reactions. Fluorescent clones were stored at &#x02212;80&#x000B0;C in LB medium supplemented with required antibiotics and 10% glycerol (v/v).</p>
</sec>
<sec>
<title><italic>In silico</italic> analysis of ORFs and promoter regions</title>
<p>The inserts of selected clones were sequenced on both strands as previously described. Sequences were manually assembled for the generation of 10 contigs. All sequences were analyzed for taxonomic origins by using the <italic>PhylopythiaS</italic> Web Server (Patil et al., <xref ref-type="bibr" rid="B62">2012</xref>) (<ext-link ext-link-type="uri" xlink:href="http://phylopythias.bifo.helmholtz-hzi.de/index.php?phase=wait">http://phylopythias.bifo.helmholtz-hzi.de/index.php?phase=wait</ext-link>), a sequence composition-based classifier that utilizes the hierarchical relationships between clades. Putative ORFs were identified and analyzed using the online ORF Finder platform, available at the NCBI website (<ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/orffinder/">https://www.ncbi.nlm.nih.gov/orffinder/</ext-link>). Comparisons of nucleotide and translated amino acid sequences were performed against public databases (NCBI) using BlastN, BlastX, and BlastP (BLAST, basic local alignment search tool) at the NCBI on-line server. For translation to protein sequences, the bacterial code was selected, allowing ATG, GTG, and TTG as alternative start codons. All the predicted ORFs longer than 270 bp were translated and used as queries in BlastP. Sequences with significant matches were further analyzed with PSI-BLAST, and their putative function was annotated based on their similarities to sequences in the COG (Clusters of Orthologous Groups) and Pfam (Protein Families) databases. Predicted general cellular functions were annotated only for known ORFs based on the MultiFun classification (Serres and Riley, <xref ref-type="bibr" rid="B68">2000</xref>). All sequences with an E-value higher than 0.001 in the BlastP searches and longer than 300 bp were considered to be unknown. 
Transmembrane helices were predicted with TMpred (<ext-link ext-link-type="uri" xlink:href="http://www.ch.embnet.org/software/TMPRED_form.html">http://www.ch.embnet.org/software/TMPRED_form.html</ext-link>) and signal peptides with the SignalP 3.0 server (<ext-link ext-link-type="uri" xlink:href="http://www.cbs.dtu.dk/services/SignalP/">http://www.cbs.dtu.dk/services/SignalP/</ext-link>). A complete table can be found in Table <xref ref-type="supplementary-material" rid="SM1">S1</xref>. Promoter prediction was based on the analysis of the 10 contigs by using both the BPROM (<ext-link ext-link-type="uri" xlink:href="http://www.softberry.com/berry.phtml?topic=bprom&#x00026;group=programs&#x00026;subgroup=gfindb">http://www.softberry.com/berry.phtml?topic=bprom&#x00026;group=programs&#x00026;subgroup=gfindb</ext-link>) and bTSSfinder (<ext-link ext-link-type="uri" xlink:href="http://www.cbrc.kaust.edu.sa/btssfinder/">http://www.cbrc.kaust.edu.sa/btssfinder/</ext-link>) web-based platforms. Both methods searched for rpoD-related sequences, and we considered as valid only those predictions returned by both approaches. Those filtered sequences were used to cross-validate 23 out of 33 experimentally defined regulatory regions by comparing the positions of predicted and experimental sequences in the metagenomic fragments. The positions of the 33 small DNA fragments were obtained by a multiple alignment of the original contigs (queries) against those selected sequences, which also allowed the validation of each promoter&#x00027;s directionality (forward or reverse) by observing the matched strands (Plus/Plus or Plus/Minus). The consensus Logo sequence was based on the alignment of the 33 experimentally validated promoters, using the WebLogo platform (<ext-link ext-link-type="uri" xlink:href="http://weblogo.berkeley.edu/logo.cgi">http://weblogo.berkeley.edu/logo.cgi</ext-link>).</p>
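<p>The two-step filter described above (keeping only predictions returned by both tools, then comparing their positions with the experimentally defined fragments) can be sketched in Python as follows. This is an illustrative example and not the procedure used in this study: the text does not specify how a match between tools or between a prediction and a fragment was defined, so simple coordinate overlap on the same contig is assumed here, and the function names overlaps, consensus_predictions and cross_validated are hypothetical.</p>
<preformat><![CDATA[
```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def consensus_predictions(tool_a, tool_b):
    """Keep predictions from tool_a that overlap at least one prediction from tool_b.

    Assumption: a prediction counts as matched by both tools when the
    predicted regions overlap on the same contig.
    """
    return [p for p in tool_a if any(overlaps(p, q) for q in tool_b)]


def cross_validated(fragments, predictions):
    """Return the experimental fragments that overlap a consensus prediction."""
    return [f for f in fragments if any(overlaps(f, p) for p in predictions)]
```
]]></preformat>
<p>In this sketch, each interval is a pair of contig coordinates; fragments that overlap no consensus prediction would correspond to experimentally active regions that the prediction tools did not support.</p>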
</sec>
<sec>
<title>Criteria for the choice of sample sizes</title>
<p>The sample sizes chosen in this work were based on a seminal study regarding the characterization of random promoter libraries (Cox and Elowitz, <xref ref-type="bibr" rid="B14">2007</xref>) in which &#x0007E;1% (288) of the total set of promoters (22,000) was selected for further analysis. In our study, we have selected a much higher fraction of the population for sampling (&#x0007E;25% of 1,100 screened clones). Furthermore, using classical statistics for determining optimal sample sizes and reducing the uncertainty caused by sampling error (Nakagawa and Cuthill, <xref ref-type="bibr" rid="B59">2007</xref>), we have found that sampling 260 clones from a total of 1,100 clones would result in a confidence level of 99% with a confidence interval of 0.07. Each selected clone was manually streaked on LB agar and microbiologically purified two times for further validation in plate reader assays, which were done with biological and technical triplicates. Regarding the 10 clones selected for the in-depth analysis, we have adopted the same sample fraction as Cox and Elowitz (<xref ref-type="bibr" rid="B14">2007</xref>), i.e., &#x0007E;1% of the total number of positive clones (10 in 1,100 clones). In this context, from each of the 10 analyzed clones containing metagenomic fragments we have obtained at least three promoters, which were individually characterized in plate reader assays. The choice of 100 clones from the small-fragment library was based on the following rationale: (i) the combined size of the 10 selected clones in this analysis was 30 kb, (ii) each small fragment had an average size of 0.4 kb, thus, (iii) 100 fluorescent clones from the small-insert library would represent &#x0007E;40 kb, providing enough coverage for all 10 original clones. 
Furthermore, as each fluorescent clone would represent a single promoter sequence at a specific region in the original clones, it was highly improbable that the 100 selected clones would cover the 10 original clones. Thus, our intention by choosing a sample size of 100 clones was to enrich the single promoters. This assumption was further supported by the discovery of only 33 promoters among those 100 sequences (promoter sequences were overrepresented).</p></sec></sec>
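<sec>
<title>Illustrative check of the sample-size criterion</title>
<p>The sample-size statement given above (260 of 1,100 clones, 99% confidence level, interval of 0.07) can be checked with a short calculation. The Python sketch below is an assumption-laden illustration: the text does not state which formula was used, so Cochran&#x00027;s formula with a finite-population correction is assumed, together with a maximally conservative proportion of 0.5 and the reading of 0.07 as the half-width (margin of error) of the interval. The function name sample_size is hypothetical.</p>
<preformat><![CDATA[
```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.99, margin=0.07, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumptions: Cochran's formula with finite-population correction,
    two-sided normal quantile, and p = 0.5 (the most conservative value).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z value
    n0 = z ** 2 * p * (1 - p) / margin ** 2             # infinite population
    n = n0 / (1 + (n0 - 1) / population)                # finite correction
    return math.ceil(n)
```
]]></preformat>
<p>Under these assumptions, sample_size(1100) returns 260, which agrees with the number of clones analyzed in this work.</p>
</sec>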
<sec sec-type="results" id="s3">
<title>Results</title>
<sec>
<title>Generating metagenomic libraries and screening for fluorescent clones</title>
<p>We have constructed and assessed two metagenomic libraries hosted in <italic>E. coli</italic> DH10B strain for the analysis of bacterial promoters in environmental samples (Figure <xref ref-type="fig" rid="F1">1</xref>). The libraries were generated from soil microbial communities of two sites bearing differential tree litter composition (<italic>Anadenanthera</italic> spp. and <italic>Phytolacca dioica</italic>) within a Secondary semi-deciduous Atlantic Forest zone at the University of S&#x000E3;o Paulo, Ribeir&#x000E3;o Preto, Brazil (see Materials and Methods for further details). Both metagenomic DNA samples were cloned into the pMR1 (Guazzaroni and Silva-Rocha, <xref ref-type="bibr" rid="B33">2014</xref>) bi-directional reporter vector, which carries <italic>GFPlva</italic> and <italic>mCherry</italic> reporter genes in opposite orientations flanking a multiple cloning site, a chloramphenicol resistance marker and a <italic>p15a</italic> origin of replication for low/medium copy number. Each metagenomic library presented about 250 Mb of environmental DNA distributed into &#x0007E;60,000 clones harboring insert fragments ranging in size from 1.5 to 7 kb, with an average size of 4.1 kb (Table <xref ref-type="table" rid="T1">1</xref>). We have chosen fragments of 1.5&#x02013;7 kb in order to validate our strategy on standard-sized functional metagenomic libraries based on plasmid vectors (Gabor et al., <xref ref-type="bibr" rid="B28">2004</xref>; Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>; Pushpam et al., <xref ref-type="bibr" rid="B63">2011</xref>; Jim&#x000E9;nez et al., <xref ref-type="bibr" rid="B40">2012</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B32">2013</xref>). In total, 1,100 fluorescent clones, resulting in a rate of approximately one fluorescent clone every 150 clones (USP1) or every 90 clones screened (USP3), were manually selected under blue light exposure. 
Then, these fluorescent clones were directly recovered from LB agar plates supplemented with chloramphenicol. The direct screening was preferred over the use of metagenomic clone pools from stocks as it reduces the chances of both biased clone enrichment (e.g., clones with higher growth rates, usually clones bearing small inserts or without insert) and dilution of positive clones with impaired growth (e.g., clones with high expression of GFP and/or other exogenous genes), avoiding thus clonal amplification.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Schematic representation of the workflow for finding, characterizing and cross-validating novel bacterial cis-regulatory elements in environmental samples. From left to right: first, we generated metagenomic libraries from soil samples in <italic>E. coli</italic> DH10B. The DNA fragments were cloned into a bi-directional reporter trap-vector, pMR1 (bearing <italic>mCherry</italic> and <italic>GFPlva</italic> fluorescent reporters), which allowed screening for promoters on both DNA strands. Second, we manually screened all visibly fluorescent clones from our metagenomic libraries and analyzed the expression patterns of all green fluorescent clones on a microplate reader over 8 h. Lastly, we selected 10 clones based on their GFPlva expression patterns for an in-depth analysis combining experimental (small DNA insert library generation) and <italic>in silico</italic> promoter prediction. This integrated strategy allowed us to identify, validate and estimate the accessibility of novel promoter regions from metagenomic libraries.</p></caption>
<graphic xlink:href="fmicb-09-01344-g0001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Features of the generated metagenomic libraries.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Metagenomic library</bold></th>
<th valign="top" align="center"><bold>USP1</bold></th>
<th valign="top" align="center"><bold>USP3</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Total number of clones</td>
<td valign="top" align="center">100,000</td>
<td valign="top" align="center">90,000</td>
</tr>
<tr>
<td valign="top" align="left">Percentage of clones with insert (%)</td>
<td valign="top" align="center">60</td>
<td valign="top" align="center">70</td>
</tr>
<tr>
<td valign="top" align="left">Number of clones with insert</td>
<td valign="top" align="center">60,000</td>
<td valign="top" align="center">63,000</td>
</tr>
<tr>
<td valign="top" align="left">Total number and rate<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref> of fluorescent clones</td>
<td valign="top" align="center">400 (1:150)</td>
<td valign="top" align="center">700 (1:90)</td>
</tr>
<tr>
<td valign="top" align="left">Total number and rate<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref> of green clones</td>
<td valign="top" align="center">270 (1:220)</td>
<td valign="top" align="center">400 (1:157)</td>
</tr>
<tr>
<td valign="top" align="left">Total number and rate<xref ref-type="table-fn" rid="TN1"><sup>&#x0002A;</sup></xref> of red clones</td>
<td valign="top" align="center">130 (1:460)</td>
<td valign="top" align="center">300 (1:210)</td>
</tr>
<tr>
<td valign="top" align="left">Average insert size (kb)</td>
<td valign="top" align="center">4.5</td>
<td valign="top" align="center">3.7</td>
</tr>
<tr>
<td valign="top" align="left">Total metagenomic library size (Mb)</td>
<td valign="top" align="center">270</td>
<td valign="top" align="center">233</td>
</tr>
<tr>
<td valign="top" align="left">Estimated number of genomes<xref ref-type="table-fn" rid="TN2"><sup>&#x0002A;&#x0002A;</sup></xref></td>
<td valign="top" align="center">60</td>
<td valign="top" align="center">52</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TN1"><label>&#x0002A;</label><p><italic>Rate represented by the number of fluorescent clones divided by the total number of clones with inserts</italic>.</p></fn>
<fn id="TN2"><label>&#x0002A;&#x0002A;</label><p><italic>Assuming 4.5 Mb per genome (Raes et al., <xref ref-type="bibr" rid="B64">2007</xref>)</italic>.</p></fn>
</table-wrap-foot>
</table-wrap>
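<p>The dependent rows of Table <xref ref-type="table" rid="T1">1</xref> follow from the primary counts by simple arithmetic. The snippet below is a minimal sketch of that calculation; the dictionary layout and the names <monospace>LIBRARIES</monospace> and <monospace>library_summary</monospace> are illustrative assumptions, and only the numbers are taken from the table.</p>
<preformat>
```python
# Inputs copied from Table 1; the dictionary layout and names are illustrative.
LIBRARIES = {
    "USP1": {"total_clones": 100_000, "pct_with_insert": 60,
             "avg_insert_kb": 4.5, "fluorescent": 400},
    "USP3": {"total_clones": 90_000, "pct_with_insert": 70,
             "avg_insert_kb": 3.7, "fluorescent": 700},
}
GENOME_MB = 4.5  # average genome size assumed in the Table 1 footnote


def library_summary(lib):
    """Derive the dependent rows of Table 1 from the primary counts."""
    with_insert = lib["total_clones"] * lib["pct_with_insert"] // 100
    size_mb = with_insert * lib["avg_insert_kb"] / 1000  # kb -> Mb
    return {
        "clones_with_insert": with_insert,
        "library_size_mb": size_mb,
        "genome_equivalents": size_mb / GENOME_MB,
        "clones_per_fluorescent": with_insert / lib["fluorescent"],
    }


SUMMARY = {name: library_summary(lib) for name, lib in LIBRARIES.items()}
for name, row in SUMMARY.items():
    print(name, {key: round(value, 1) for key, value in row.items()})
```
</preformat>
<p>Under these inputs the calculation gives 270 Mb and 60 genome equivalents for USP1, and about 233 Mb and 52 genome equivalents for USP3, in agreement with the tabulated values.</p>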
</sec>
<sec>
<title>Evaluating the expression dynamics of fluorescent clones</title>
<p>To analyze the expression patterns of the isolated clones, we evaluated the intrinsic dynamics of GFPlva and mCherry by randomly selecting 20 clones expressing each reporter (as schematically represented in Figures <xref ref-type="fig" rid="F1">1</xref>, <xref ref-type="fig" rid="F2">2A</xref>). As shown in Figures <xref ref-type="fig" rid="F2">2B,C</xref>, clones expressing mCherry were not suitable for standard 8 h microplate assays, as their fluorescence intensity values differed dramatically between 8 and 24 h after the beginning of the experiment. The slow kinetics of mCherry expression has already been reported as a consequence of a two-step oxidation process for protein maturation, in contrast to the one-step maturation process found in GFP reporters (Hebisch et al., <xref ref-type="bibr" rid="B37">2013</xref>). We highlight that although mCherry clones were not suited to dynamic profiling, they were essential for quantifying the total number of metagenomic fragments harboring promoters accessible to <italic>E. coli</italic>, that is, the sum of green and red fluorescent clones in the library. In contrast, the clones expressing GFPlva were better suited to microplate assays, as shown by the very similar fluorescence intensities at the two time points tested. Furthermore, GFPlva carries an LVA degradation tag at its C-terminus, which reduces GFP accumulation and increases protein turnover, generating a more precise fluorescence output for the analysis of expression patterns (Andersen et al., <xref ref-type="bibr" rid="B3">1998</xref>).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Evaluating the expression dynamics of fluorescent clones. <bold>(A)</bold> LB-agar plate under blue light excitation comprising a subset of isolated metagenomic clones expressing GFPlva (top) and mCherry (bottom) fluorescent reporters. A few clones were observed to express both reporters. All isolated clones were initially considered to hold at least one endogenous promoter. <bold>(B,C)</bold> Indirect assessment of maturation times of the fluorescent reporters GFPlva <bold>(B)</bold> and mCherry <bold>(C)</bold> at 8 h (light bars) and 24 h (dark bars) after the beginning of the experiment. Maturation times are substantially longer for mCherry than for GFPlva, which excluded the former from further analyses. Positive controls for GFP and mCherry are represented by p100 and pRED, respectively. Fluorescence data were normalized by OD<sub>600</sub> values for each sample and then by the values from the negative control (empty pMR1). Data were transformed to log2 scale to allow better visualization of fluorescence variation. <bold>(D)</bold> Hierarchical representation of a metaconstitutome, i.e., all expression profiles from a single metagenomic library (USP3) in <italic>E. coli</italic>. Fluorescence time-lapse dynamics were measured over 8 h for each clone and represented as heat maps. Promoter activities (calculated as GFP/OD<sub>600</sub>) were normalized by the negative control (<italic>E. coli</italic> DH10B harboring empty pMR1) and transformed to log2 scale in order to facilitate the visualization of subtle activities. Expression profiles of the positive controls (p100, p106, and p114: strong, medium and low expression, respectively) and of the negative control (pMR1) are indicated by black arrows at the left side of the heat map. Data are representative of three independent experiments.</p></caption>
<graphic xlink:href="fmicb-09-01344-g0002.tif"/>
</fig>
<p>Thus, 260 clones expressing GFPlva (160 clones from the USP1 library and 100 from USP3; see Experimental Procedures for further information about chosen sample sizes) were selected for further analysis of expression patterns in microplate reader assays with biological and technical triplicates. The dynamic profiles for each clone were converted into heat maps and hierarchically clustered into a dendrogram using Euclidean distance, concisely representing the expression patterns of each metagenomic library. In order to assess the diversity of promoter strengths among the generated metagenomic libraries, three previously characterized constitutive promoters (see Experimental Procedures for further information) positioned upstream of a GFPlva reporter were used as standards for strong, medium and weak expression profiles (referred to here as p100, p106, and p114, respectively).</p>
<p>Considering both metagenomic libraries, we found a total of 30 strong promoters with a strength similar to the p100 control, 40 medium-strength promoters similar to the p106 control, 60 weak promoters similar to the p114 control, and a wide range of promoters with particular expression patterns that did not cluster with any of these positive controls (Figure <xref ref-type="fig" rid="F2">2D</xref> and Figure <xref ref-type="supplementary-material" rid="SM1">S1</xref>). Moreover, the dynamic expression profiles allowed us to observe a few clones that, although constitutively active, had their GFPlva expression levels increased during certain time frames (Figure <xref ref-type="fig" rid="F2">2D</xref>). Concerning the hierarchical organization of the expression profiles, the dendrogram of the USP3 library (Figure <xref ref-type="fig" rid="F2">2D</xref>) could be subdivided into at least four well-defined expression clusters comprising: (i) high, (ii) medium, (iii) low and (iv) very low expression profiles. A very similar pattern was identified in the expression dendrogram independently generated for the USP1 metagenomic library (see Figure <xref ref-type="supplementary-material" rid="SM1">S1</xref>).</p>
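<p>The normalization and distance measure underlying these heat maps can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline used here: normalization by the negative control is taken to be a ratio, the function names are hypothetical, all numbers are invented, and assigning a clone to its nearest control is a simplification of full hierarchical clustering.</p>
<preformat>
```python
import math


def log2_relative_activity(gfp, od600, neg_gfp, neg_od600):
    """log2 of a clone's GFP/OD600 over the negative control's GFP/OD600."""
    profile = []
    for g, od, ng, nod in zip(gfp, od600, neg_gfp, neg_od600):
        activity = g / od        # promoter activity of the clone
        background = ng / nod    # same quantity for the empty-vector control
        profile.append(math.log2(activity / background))  # assumed: ratio
    return profile


def euclidean(a, b):
    """Euclidean distance between two equally sampled time-course profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_control(profile, controls):
    """Name of the control profile at the smallest Euclidean distance."""
    return min(controls, key=lambda name: euclidean(profile, controls[name]))


# Invented numbers, for illustration only (three time points per profile).
controls = {
    "p100": [6.0, 6.5, 7.0],
    "p106": [3.0, 3.5, 4.0],
    "p114": [1.0, 1.2, 1.5],
}
clone = log2_relative_activity(
    gfp=[2000.0, 6000.0, 16000.0], od600=[0.25, 0.5, 1.0],
    neg_gfp=[250.0, 500.0, 1000.0], neg_od600=[0.25, 0.5, 1.0],
)
print([round(v, 2) for v in clone], closest_control(clone, controls))
```
</preformat>
<p>A dendrogram would be built from the same pairwise distances, computed between all clone profiles rather than only against the controls.</p>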
</sec>
<sec>
<title><italic>In silico</italic> analysis of DNA metagenomic fragments from selected clones</title>
<p>From the 260 assessed samples, we selected 10 clones displaying particular profiles that depict the diversity of expression behaviors found in both libraries (see Figure <xref ref-type="supplementary-material" rid="SM1">S2</xref>, and Experimental Procedures for further information about chosen sample sizes). The inserts from the selected clones were sequenced and analyzed for G&#x0002B;C content, taxonomic origin, potential ORFs and RpoD-related promoter regions (&#x02212;10 and &#x02212;35 conserved regions). The G&#x0002B;C content of each insert (Table <xref ref-type="table" rid="T2">2</xref>) had a median of 54% and ranged from 43 to 61%, indicating diverse phylogenetic affiliations. Using the <italic>PhyloPythiaS</italic> sequence classifier for metagenomic sequences (Koonin, <xref ref-type="bibr" rid="B45">2009</xref>; Patil et al., <xref ref-type="bibr" rid="B62">2012</xref>), the DNA fragments were assigned to their most closely related phyla (Table <xref ref-type="table" rid="T2">2</xref> and Figure <xref ref-type="supplementary-material" rid="SM1">S3</xref>). The most abundant assigned phyla were Proteobacteria (46%), followed by Actinobacteria (23%), Verrucomicrobia (15%), Chloroflexi (8%) and Bacteroidetes (8%) (see Figure <xref ref-type="supplementary-material" rid="SM1">S3</xref>).</p>
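<p>As a minimal sketch, the summary statistics above can be recomputed from the values listed in Table <xref ref-type="table" rid="T2">2</xref>. The helper <monospace>gc_percent</monospace> and the variable names are hypothetical, and counting an insert with two candidate phyla once for each phylum is an assumption that happens to reproduce the reported percentages.</p>
<preformat>
```python
from collections import Counter
from statistics import median


def gc_percent(sequence):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 100.0 * gc / (gc + at)


# G+C values and phylum assignments as listed in Table 2 (pCAW1 to pCAW10).
GC_TABLE2 = [55, 52, 53, 61, 54, 57, 46, 57, 43, 56]
PHYLA_TABLE2 = [
    ["Proteobacteria", "Verrucomicrobia"], ["Actinobacteria"],
    ["Proteobacteria"], ["Proteobacteria"], ["Verrucomicrobia"],
    ["Chloroflexi", "Proteobacteria"], ["Actinobacteria"],
    ["Actinobacteria"], ["Bacteroidetes", "Proteobacteria"],
    ["Proteobacteria"],
]

# Assumption: an insert with two candidate phyla adds one count to each.
counts = Counter(phylum for entry in PHYLA_TABLE2 for phylum in entry)
total = sum(counts.values())
shares = {phylum: round(100 * n / total) for phylum, n in counts.items()}

print(round(gc_percent("ATGCGC"), 2))  # toy sequence, not from the libraries
print(median(GC_TABLE2), min(GC_TABLE2), max(GC_TABLE2))
print(shares)
```
</preformat>
<p>With the rounded table values the median evaluates to 54.5, in line with the reported median of about 54%, and the 13 phylum assignments give shares of 46, 23, 15, 8, and 8%.</p>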
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Description of the ORFs contained in plasmids from the selected clones (pCAW1 to pCAW10) and their sequence similarities.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Clone_Sample [insert bp]</bold></th>
<th valign="top" align="left"><bold>G &#x0002B; C %</bold></th>
<th valign="top" align="left"><bold>GenBank accession No</bold>.</th>
<th valign="top" align="left"><bold>Phylum<xref ref-type="table-fn" rid="TN3"><sup>a</sup></xref></bold></th>
<th valign="top" align="left"><bold>ORF<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></bold></th>
<th valign="top" align="left"><bold>Strand</bold></th>
<th valign="top" align="left"><bold>Length (aa<xref ref-type="table-fn" rid="TN5"><sup>c</sup></xref>)</bold></th>
<th valign="top" align="left"><bold>Closest similar protein<xref ref-type="table-fn" rid="TN6"><sup>d</sup></xref> (Length in aa)</bold></th>
<th valign="top" align="left"><bold>Closest Organism/Phylum<xref ref-type="table-fn" rid="TN7"><sup>e</sup></xref></bold></th>
<th valign="top" align="left"><bold>Identity (%)</bold></th>
<th valign="top" align="left"><bold>Putative function</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><bold>pCAW1 (2,367 bp)</bold></td>
<td valign="top" align="left">55%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939589">KY939589</ext-link></td>
<td valign="top" align="left">Proteobacteria or Verrucomicrobia</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">131</td>
<td valign="top" align="left">Hypothetical protein (416)</td>
<td valign="top" align="left"><italic>Bacteroidetes bacterium/Proteobacteria</italic></td>
<td valign="top" align="left">68%</td>
<td valign="top" align="left">Alginate lyase</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">271</td>
<td valign="top" align="left">Hypothetical protein (261)</td>
<td valign="top" align="left"><italic>Acidobacteria bacterium/Acidobacteria</italic></td>
<td valign="top" align="left">73%</td>
<td valign="top" align="left">17-B-hydroxysteroid dehydrogenase</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">295</td>
<td valign="top" align="left">Beta-glucosidase (777)</td>
<td valign="top" align="left"><italic>Caulobacter</italic> sp. <italic>OV484/Proteobacteria</italic></td>
<td valign="top" align="left">66%</td>
<td valign="top" align="left">Beta-glucosidase</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW2 (2,069 bp)</bold></td>
<td valign="top" align="left">52%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939590">KY939590</ext-link></td>
<td valign="top" align="left">Actinobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">304</td>
<td valign="top" align="left">Unknown<xref ref-type="table-fn" rid="TN5"><sup>c</sup></xref></td>
<td valign="top" align="left"><italic>Hyphomicrobium</italic> sp. <italic>NDB2Meth4/Proteobacteria</italic></td>
<td valign="top" align="left">33%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">249</td>
<td valign="top" align="left">Unknown</td>
<td valign="top" align="left"><italic>Hungatella hathewayi/Firmicutes</italic></td>
<td valign="top" align="left">33%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW3 (4,404 bp)</bold></td>
<td valign="top" align="left">53%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939591">KY939591</ext-link></td>
<td valign="top" align="left">Proteobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">318</td>
<td valign="top" align="left">IS4 family Transposase (320)</td>
<td valign="top" align="left"><italic>Escherichia coli/Proteobacteria</italic></td>
<td valign="top" align="left">96%</td>
<td valign="top" align="left">IS4 family transposase</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">1011</td>
<td valign="top" align="left">DNA-directed RNA polymerase subunit beta&#x00027; (1430)</td>
<td valign="top" align="left"><italic>Sphingobacteriales bacterium 44-61/Bacteroidetes</italic></td>
<td valign="top" align="left">83%</td>
<td valign="top" align="left">RNA polymerase - Beta Subunit</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">120</td>
<td valign="top" align="left">Uncharacterised protein (135)</td>
<td valign="top" align="left"><italic>Bordetella pertussis/Proteobacteria</italic></td>
<td valign="top" align="left">47%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">4</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">151</td>
<td valign="top" align="left">Uncharacterised protein (130)</td>
<td valign="top" align="left"><italic>Bordetella pertussis/Proteobacteria</italic></td>
<td valign="top" align="left">37%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">5</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">94</td>
<td valign="top" align="left">Uncharacterised protein (64)</td>
<td valign="top" align="left"><italic>Bordetella pertussis/Proteobacteria</italic></td>
<td valign="top" align="left">82%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">6</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">96</td>
<td valign="top" align="left">Uncharacterised protein (86)</td>
<td valign="top" align="left"><italic>Vibrio cholerae/Proteobacteria</italic></td>
<td valign="top" align="left">48%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">7</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">173</td>
<td valign="top" align="left">predicted protein (585)</td>
<td valign="top" align="left"><italic>Ruminococcus</italic> sp. <italic>CAG:403/Proteobacteria</italic></td>
<td valign="top" align="left">26%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW4 (4,002 bp)</bold></td>
<td valign="top" align="left">61%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939592">KY939592</ext-link></td>
<td valign="top" align="left">Proteobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">245</td>
<td valign="top" align="left">Inosine monophosphate cyclohydrolase (246)</td>
<td valign="top" align="left"><italic>Ktedonobacter racemifer/Chloroflexi</italic></td>
<td valign="top" align="left">63%</td>
<td valign="top" align="left">IMP cyclohydrolase</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">214</td>
<td valign="top" align="left">Phosphodiesterase (498)</td>
<td valign="top" align="left"><italic>Candidate division NC10 bacterium/NC10</italic></td>
<td valign="top" align="left">40%</td>
<td valign="top" align="left">Phosphodiesterase</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">402</td>
<td valign="top" align="left">Hypothetical protein A2Y08_02680 (625)</td>
<td valign="top" align="left"><italic>Planctomycetes bacterium GWA2_40_7/Planctomycetes</italic></td>
<td valign="top" align="left">43%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">4<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">142</td>
<td valign="top" align="left">Gentisate 1,2-dioxygenase (349)</td>
<td valign="top" align="left"><italic>Pseudomonas</italic> sp. <italic>21C1/Proteobacteria</italic></td>
<td valign="top" align="left">60%</td>
<td valign="top" align="left">Gentisate 1,2-dioxygenase</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW5 (2,724 bp)</bold></td>
<td valign="top" align="left">54%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939593">KY939593</ext-link></td>
<td valign="top" align="left">Verrucomicrobia</td>
<td valign="top" align="left">1<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">642</td>
<td valign="top" align="left">Pyruvate:ferredoxin oxidoreductase (1565)</td>
<td valign="top" align="left"><italic>Uncultured bacterium HF770_11D24/Acidobacterium</italic></td>
<td valign="top" align="left">80%</td>
<td valign="top" align="left">Pyruvate:ferredoxin oxidoreductase</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW6 (2,125 bp)</bold></td>
<td valign="top" align="left">57%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939594">KY939594</ext-link></td>
<td valign="top" align="left">Chloroflexi or Proteobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">159</td>
<td valign="top" align="left">Hypothetical protein BGO39_33875 (215)</td>
<td valign="top" align="left"><italic>Chloroflexi bacterium 54-19/Chloroflexi</italic></td>
<td valign="top" align="left">65%</td>
<td valign="top" align="left">MerR family</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">336</td>
<td valign="top" align="left">Hypothetical protein BGO39_33870 (347)</td>
<td valign="top" align="left"><italic>Chloroflexi bacterium 54-19/Chloroflexi</italic></td>
<td valign="top" align="left">78%</td>
<td valign="top" align="left">PrsW intramembrane metalloprotease</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">163</td>
<td valign="top" align="left">Hypothetical protein BGO39_33865 (173)</td>
<td valign="top" align="left"><italic>Chloroflexi bacterium 54-19/Chloroflexi</italic></td>
<td valign="top" align="left">75%</td>
<td valign="top" align="left">Chromate transporter</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW7 (2,558 bp)</bold></td>
<td valign="top" align="left">46%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939595">KY939595</ext-link></td>
<td valign="top" align="left">Actinobacteria</td>
<td valign="top" align="left">1<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">391</td>
<td valign="top" align="left">Hypothetical protein A2X07_06330 (480)</td>
<td valign="top" align="left"><italic>Flavobacteria bacterium GWF1_32_7/Bacteroidetes</italic></td>
<td valign="top" align="left">45%</td>
<td valign="top" align="left">Por secretion system sorting domain</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">250</td>
<td valign="top" align="left">Hypothetical protein (586)</td>
<td valign="top" align="left"><italic>Chitinophagaceae bacterium PMP191F/Bacteroidetes</italic></td>
<td valign="top" align="left">65%</td>
<td valign="top" align="left">Polysaccharide Lyase</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW8 (4,480 bp)</bold></td>
<td valign="top" align="left">57%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939596">KY939596</ext-link></td>
<td valign="top" align="left">Actinobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">508</td>
<td valign="top" align="left">Hypothetical protein AUH20_02325 (597)</td>
<td valign="top" align="left"><italic>Rokubacteria bacterium/Rokubacteria</italic></td>
<td valign="top" align="left">76%</td>
<td valign="top" align="left">5-oxoprolinase / Hydantoinase_B</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">348</td>
<td valign="top" align="left">Oxidoreductase (336)</td>
<td valign="top" align="left"><italic>Rokubacteria bacterium/Rokubacteria</italic></td>
<td valign="top" align="left">61%</td>
<td valign="top" align="left">Flavin-utilizing monooxygenases</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">314</td>
<td valign="top" align="left">Hypothetical protein ETSY1_46935 (279)</td>
<td valign="top" align="left"><italic>Candidatus Entotheonella</italic> sp. <italic>TSY1/Tectomicrobia</italic></td>
<td valign="top" align="left">76%</td>
<td valign="top" align="left">Cellulose biosynthesis BcsQ</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW9 (2,573 bp)</bold></td>
<td valign="top" align="left">43%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939597">KY939597</ext-link></td>
<td valign="top" align="left">Bacteroidetes or Proteobacteria</td>
<td valign="top" align="left">1<xref ref-type="table-fn" rid="TN4"><sup>b</sup></xref></td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">81</td>
<td valign="top" align="left">Hypothetical protein (129)</td>
<td valign="top" align="left"><italic>Janthinobacterium/Proteobacteria</italic></td>
<td valign="top" align="left">50%</td>
<td valign="top" align="left">Unknown</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">303</td>
<td valign="top" align="left">Formylglycine-generating enzyme (379)</td>
<td valign="top" align="left"><italic>Mucilaginibacter</italic> sp.<italic>/Bacteroidetes</italic></td>
<td valign="top" align="left">65%</td>
<td valign="top" align="left">Formylglycine-generating enzyme</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Minus</td>
<td valign="top" align="left">457</td>
<td valign="top" align="left">Acetylglucosamine-6-sulfatase (504)</td>
<td valign="top" align="left"><italic>Flavihumibacter solisilvae/Bacteroidetes</italic></td>
<td valign="top" align="left">67%</td>
<td valign="top" align="left">Acetylglucosamine-6-sulfatase</td>
</tr>
<tr>
<td valign="top" align="left"><bold>pCAW10 (2,076 bp)</bold></td>
<td valign="top" align="left">56%</td>
<td valign="top" align="left"><ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="KY939598">KY939598</ext-link></td>
<td valign="top" align="left">Proteobacteria</td>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Plus</td>
<td valign="top" align="left">204</td>
<td valign="top" align="left">Hypothetical protein (195)</td>
<td valign="top" align="left"><italic>Luminiphilus syltensis/Proteobacteria</italic></td>
<td valign="top" align="left">50%</td>
<td valign="top" align="left">Unknown</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TN3"><label>a</label><p><italic>Classification based on the PhyloPythiaS (Patil et al., <xref ref-type="bibr" rid="B62">2012</xref>) webserver</italic>.</p></fn>
<fn id="TN4"><label>b</label><p><italic>Truncated proteins</italic>.</p></fn>
<fn id="TN5"><label>c</label><p><italic>aa, amino acids</italic>.</p></fn>
<fn id="TN6"><label>d</label><p><italic>Sequences with an E-value higher than 0.001 in Blastp searches were considered to be unknown proteins</italic>.</p></fn>
<fn id="TN7"><label>e</label><p><italic>Classification based on Blastp</italic>.</p></fn>
</table-wrap-foot>
</table-wrap>
<p>Regarding putative genes, 29 ORFs with significant <italic>E-values</italic> (&#x0003C;0.001) were found (Table <xref ref-type="table" rid="T2">2</xref>), unevenly distributed between the two DNA strands, in line with a lack of strong directional trends in bacterial genome organization (Koonin, <xref ref-type="bibr" rid="B45">2009</xref>). The ORFs were also classified into functional classes (delineated by MultiFun; Serres and Riley, <xref ref-type="bibr" rid="B68">2000</xref>) and taxonomic groups based on the closest similar proteins (Table <xref ref-type="table" rid="T2">2</xref>). Regarding gene function, the most abundant ORFs were related to unknown functions (31%) and metabolism (31%), followed by stress adaptation cell processes (17%) (Table <xref ref-type="table" rid="T2">2</xref>).</p>
<p>The <italic>in silico</italic> promoter prediction also provided relevant information concerning the potential number of regulatory regions in each selected fragment. The BPROM software (Solovyev, <xref ref-type="bibr" rid="B77">2011</xref>) has been extensively employed in other promoter prediction studies and is based on the analysis of the &#x02212;35 and &#x02212;10 consensus sequences of RpoD promoters. The main sigma subunit, sigma-70, encoded by <italic>rpoD</italic>, plays a major role in the transcription of growth-related genes, the so-called housekeeping genes (Lonetto et al., <xref ref-type="bibr" rid="B52">1992</xref>; Gruber and Gross, <xref ref-type="bibr" rid="B31">2003</xref>; Paget and Helmann, <xref ref-type="bibr" rid="B61">2003</xref>). From the <italic>in silico</italic> analysis, a total of 140 promoters were predicted among the 10 selected clones, suggesting an average of approximately 5 RpoD-related promoters/kb. This led us to question whether most of the expression profiles previously described (Figure <xref ref-type="fig" rid="F2">2D</xref> and Figure <xref ref-type="supplementary-material" rid="SM1">S1</xref>) represented the dynamics of a single &#x0201C;dominant&#x0201D; promoter or the combined effect of multiple adjacent promoters present in the metagenomic fragment. We therefore devised a strategy to experimentally assess the number and location of accessible promoters in our selected clones, contrasting the experimental results with the <italic>in silico</italic> data.</p>
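<p>The predicted promoter density can be checked against the insert lengths listed in Table <xref ref-type="table" rid="T2">2</xref>. The snippet below is a minimal sketch of that arithmetic; the variable names are illustrative, and the count of 140 is the total number of predictions reported for the 10 selected clones.</p>
<preformat>
```python
# Insert lengths (bp) as listed in Table 2 for pCAW1 to pCAW10.
INSERT_BP = [2367, 2069, 4404, 4002, 2724, 2125, 2558, 4480, 2573, 2076]
PREDICTED_PROMOTERS = 140  # total number of in silico predictions reported

total_kb = sum(INSERT_BP) / 1000
density = PREDICTED_PROMOTERS / total_kb
print(f"{total_kb:.1f} kb sequenced, {density:.1f} predicted promoters per kb")
```
</preformat>
<p>The ten inserts add up to about 29.4 kb, which gives roughly 4.8 predicted promoters per kb.</p>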
</sec>
<sec>
<title>Experimental identification, characterization, and cross-validation of promoter regions</title>
<p>To explore the potential set of accessible promoter regions in our metagenomic libraries, we developed an approach based on generating a small DNA insert library (Figure <xref ref-type="fig" rid="F1">1</xref>). First, the plasmids from the 10 previously selected clones (original clones) were pooled for insert amplification in a single PCR. The resulting amplicons were fragmented by Sau3AI digestion, and DNA fragments ranging from 0.1 to 0.5 kb were selected for subsequent cloning into the pMR1 vector. This sub-fragment library allowed screening for both red and green fluorescent colonies, which represent the accessible set of promoters among the metagenomic DNA fragments studied. Because cloning was not directional, a small fragment bearing a promoter region had an equal (50%) chance of being cloned in either orientation; clones expressing mCherry were therefore also isolated for sequencing. A total of 100 clones from the small DNA insert library (80 expressing GFPlva and 20 expressing mCherry; see Experimental Procedures for further information about chosen sample sizes) were sequenced and aligned against the original metagenomic fragments. As a result, we identified at least 33 promoter regions within the initial set of selected metagenomic clones (Figure <xref ref-type="fig" rid="F3">3</xref>, Figure <xref ref-type="supplementary-material" rid="SM1">S4</xref>, and Table <xref ref-type="supplementary-material" rid="SM1">S1</xref>).</p>
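<p>The fragmentation and size-selection step can be illustrated with an idealized <italic>in silico</italic> digest. This is a sketch under stated assumptions rather than a model of the bench protocol: digestion is treated as complete, the amplicon is a linear toy sequence invented for the example, and the function names are hypothetical. Sau3AI recognizes GATC and cuts immediately upstream of the G.</p>
<preformat>
```python
SAU3AI_SITE = "GATC"  # Sau3AI cuts immediately 5' of the G in GATC


def digest(sequence, site=SAU3AI_SITE):
    """Split a linear sequence at every occurrence of the recognition site."""
    seq = sequence.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    # A site at position 0 produces no fragment on its left, so skip it.
    bounds = [0] + [c for c in cuts if c > 0] + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_bp=100, max_bp=500):
    """Keep fragments whose length lies between min_bp and max_bp inclusive."""
    return [f for f in fragments if len(f) in range(min_bp, max_bp + 1)]


# Toy amplicon (invented sequence) with three GATC sites.
amplicon = "A" * 50 + "GATC" + "T" * 146 + "GATC" + "C" * 596 + "GATC" + "G" * 96
fragments = digest(amplicon)
print([len(f) for f in fragments])
print([len(f) for f in size_select(fragments)])
```
</preformat>
<p>For this toy sequence the digest yields fragments of 50, 150, 600, and 100 bp, of which only the 150 and 100 bp fragments fall within the 0.1 to 0.5 kb window.</p>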
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Schematic representation of six metagenomic inserts (contigs) showing predicted ORFs and experimentally validated/characterized promoters. Each contig is identified on the far left of each subfigure. Promoters are indicated by elbow-shaped arrows and named according to their relative position in the contig. Promoter directionality, regarding the leading and lagging strands, is represented by yellow and blue colors, respectively. Asterisks over specific promoters indicate regulatory regions which were cross-validated by matching <italic>in silico</italic> predictions. Dark arrows represent predicted ORFs, according to their relative positions in each contig (see Table <xref ref-type="table" rid="T2">2</xref> for more information). All genetic features respect their original relative sizes, following the 1 kb scale depicted at the bottom of this figure. Beneath each metagenomic insert, there is a heat map cluster representing the whole set of promoter activities measured during 8-h fluorescence assays. The first line of each cluster shows the original expression profile initially measured for each metagenomic insert. All other lines represent expression activities from <italic>de novo</italic> experimentally validated promoters within each contig (small DNA fragments). The second line of each cluster represents the endogenous promoter showing the most similar activity with respect to the original expression profile for each contig. All expression profiles are identified at the rightmost side of each line, following their respective contig/promoter name. For the supplementary set of analyzed contigs, see Figure <xref ref-type="supplementary-material" rid="SM1">S4</xref>.</p></caption>
<graphic xlink:href="fmicb-09-01344-g0003.tif"/>
</fig>
<p>Additionally, the current experimental approach allowed us not only to identify novel promoter regions but also to determine promoter directionality. The evaluation of promoter localization within the 10 selected clones revealed that, of the 33 experimentally selected small fragments, 7 (21%) were considered intragenic promoters while the remaining 26 (79%) were considered primary promoters, defined as the furthest upstream promoter in a gene/operon (Conway et al., <xref ref-type="bibr" rid="B11">2014</xref>). For the sake of comparison, the <italic>E. coli</italic> K-12 genome presents the following proportions: primary (66.3%), secondary (19.6%), intragenic (9.8%), and antisense (4.2%) promoters (Cho et al., <xref ref-type="bibr" rid="B10">2009</xref>; Conway et al., <xref ref-type="bibr" rid="B11">2014</xref>).</p>
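<p>The proportions quoted above can be recomputed from the raw counts. The short Python sketch below is only an illustration: the variable and function names are hypothetical, and the values are those stated in the text.</p>
<preformat>
```python
# Counts reported in the text for the 33 experimentally validated promoters.
metagenomic_counts = {"primary": 26, "intragenic": 7}

# Proportions (%) quoted in the text for the E. coli K-12 genome.
ecoli_k12_percent = {
    "primary": 66.3,
    "secondary": 19.6,
    "intragenic": 9.8,
    "antisense": 4.2,
}


def percentages(counts):
    """Convert raw counts to percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


metagenomic_percent = percentages(metagenomic_counts)
for name, value in metagenomic_percent.items():
    # Side-by-side view of this study's set and the K-12 reference values.
    print(f"{name}: {value}% here vs {ecoli_k12_percent[name]}% in E. coli K-12")
```
</preformat>
<p>With only 33 promoters, these percentages are coarse estimates, so the comparison with the K-12 values should be read as indicative.</p>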
<p>Based on the alignment results, we selected a defined set of small fragment clones related to each original sequence for dynamic expression profiling on a microplate reader. The results showed that, for each set of small fragments belonging to a metagenomic clone, there was at least one with an expression pattern corresponding to that previously observed for the original clone (Figure <xref ref-type="fig" rid="F3">3</xref> and Figure <xref ref-type="supplementary-material" rid="SM1">S4</xref>). Similarly, we identified other clones bearing small inserts with individual profiles different from the one originally observed, representing alternative promoter regions in the original sequence that were not mapped in the initial approach (Figure <xref ref-type="fig" rid="F3">3</xref>). The data also suggest that, under our experimental conditions, a single promoter (usually the closest to the reporter gene) makes the major contribution to the observed gene expression pattern in each case. This can be concluded because, in each case, only one promoter mapped from the small-insert library produced the same expression profile observed for the original full-length fragment.</p>
<p>Regarding <italic>in silico</italic> cross-validation, of the 33 experimentally validated promoters, 23 RpoD-related promoters (70%) were supported by the algorithmic analysis as they were aligned to their respective original sequences (Figure <xref ref-type="fig" rid="F3">3</xref>). On the other hand, the remaining 10 sequences (30%) were considered as promoters exclusively identified by experimental approaches. This could indicate that these promoters, which do not match the RpoD consensus, are recognized by alternative sigma factors. This hypothesis will be investigated in future studies. Finally, sequences of the above experimentally validated promoters were characterized according to previous studies reported in the literature. For this, we adopted an <italic>in silico</italic> classification proposed by Shimada et al. (<xref ref-type="bibr" rid="B72">2014</xref>), in which constitutive promoters present a high-level conservation of the consensus sequence for the major sigma factor RpoD, that is, the elements TTGACA (&#x02212;35) and TATAAT (&#x02212;10) separated by &#x0007E;17 bp (Figures <xref ref-type="fig" rid="F4">4A,B</xref>). Constitutive promoters are defined as promoters active <italic>in vivo</italic> in all circumstances, whereas inducible promoters are switched ON and OFF by transcription factors depending on the <italic>in vivo</italic> conditions (Shimada et al., <xref ref-type="bibr" rid="B72">2014</xref>). The Logo pattern (Crooks et al., <xref ref-type="bibr" rid="B15">2004</xref>) generated from the alignment of the 33 identified metagenomic promoters (Figure <xref ref-type="fig" rid="F4">4C</xref>) indicated that positions &#x02212;35 and &#x02212;34 (&#x02212;35 box) and positions &#x02212;8, &#x02212;7, and &#x02212;3 (&#x02212;10 box) were highly conserved.
Additionally, when the promoters were analyzed in sub-groups based on the level of strength (high, medium and low), we could notice a variation in the consensus sequence obtained for each group (Figure <xref ref-type="supplementary-material" rid="SM1">S5</xref>). These variances in the consensus sequences could explain the different promoter expression profiles observed experimentally.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Consensus of RpoD-related metagenomic promoters. <bold>(A)</bold> Known consensus sequences of the RpoD-dependent promoter determined <italic>in vitro</italic>, TTGACA (&#x02212;35) and TATAAT (&#x02212;10) separated by 17 plus/minus 2 bp in <italic>E. coli</italic> (Shimada et al., <xref ref-type="bibr" rid="B72">2014</xref>). <bold>(B)</bold> Known consensus sequences of 582 promoters experimentally validated in <italic>E. coli</italic> (Shimada et al., <xref ref-type="bibr" rid="B72">2014</xref>; Gama-Castro et al., <xref ref-type="bibr" rid="B30">2016</xref>; Keseler et al., <xref ref-type="bibr" rid="B43">2017</xref>). <bold>(C)</bold> The sequences of the 33 promoters experimentally validated in this study were aligned and subjected to Logo analysis (Crooks et al., <xref ref-type="bibr" rid="B15">2004</xref>). The consensus from the metagenomic set <bold>(C)</bold> is very similar to the one from the experimentally validated set from <italic>E. coli</italic> <bold>(B)</bold>.</p></caption>
<graphic xlink:href="fmicb-09-01344-g0004.tif"/>
</fig>
</sec></sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<sec>
<title>Meta-expression profiles for studying microbial communities</title>
<p>The similar expression clusters found between the two independent metagenomic libraries might suggest broader trends of organizational expression patterns in nature. Independent studies on microbial communities from aquatic environments have described similar patterns by evaluating gene expression through metatranscriptomic analysis (Frias-Lopez et al., <xref ref-type="bibr" rid="B27">2008</xref>; Stewart et al., <xref ref-type="bibr" rid="B78">2012</xref>; Dupont et al., <xref ref-type="bibr" rid="B19">2015</xref>; Fortunato and Crump, <xref ref-type="bibr" rid="B25">2015</xref>), indicating that our observations are not restricted to the assessed soil samples. It has also been computationally demonstrated by Fernandez et al. (<xref ref-type="bibr" rid="B21">2014</xref>) that the microbial metaregulome (the whole set of regulons of an environmental sample) is shaped by the physicochemical conditions of the environment as an adaptive process. Thus, we suggest that expression profiling of an environmental sample might bear great potential for revealing insightful trends regarding the transcriptional diversity of microbial communities and for aiding in the design of efficient microbial communities for therapeutic or ecological needs (Fernandez et al., <xref ref-type="bibr" rid="B21">2014</xref>; Fredrickson, <xref ref-type="bibr" rid="B26">2015</xref>; Sol&#x000E9;, <xref ref-type="bibr" rid="B76">2015</xref>; Johns et al., <xref ref-type="bibr" rid="B41">2016</xref>).</p>
<p>Regarding the explanation for the diversity of expression profiles found among the metagenomic clones, it is important to stress that regulatory patterns have a multifactorial nature, being ruled by many different processes. Firstly, the regulatory dynamics are inherently interconnected with the function of the originally regulated gene (e.g., housekeeping, adaptive etc.) (Silander et al., <xref ref-type="bibr" rid="B73">2012</xref>). Secondly, the transcriptional bias imposed by the <italic>E. coli</italic> molecular machinery might constrain the recognition of promoter elements and/or not necessarily reproduce the original behaviors found in natural hosts (Gabor et al., <xref ref-type="bibr" rid="B28">2004</xref>; Liebl et al., <xref ref-type="bibr" rid="B50">2014</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>). Another point to be taken into consideration is that artificial juxtaposition of the exogenous promoter to the ribosome-binding site of the fluorescent reporter might increase expression as a consequence of the cloning process. Finally, another factor that could influence the detection of active clones in <italic>E. coli</italic> is that the expression of many heterologous genes is toxic to this host (Kimelman et al., <xref ref-type="bibr" rid="B44">2012</xref>). This would also limit the cloning of some fragments in this host for functional metagenomics approaches.</p>
<p>Our observations also suggested transcriptional regulation beyond the control of the RpoD sigma factor for those clones (i.e., adjacent transcription factors), introducing novel niches for the exploration of regulated promoters. Since the discovery of distinct expression behaviors is essential for expanding the current set of commercial promoters, the diversity of expression profiles highlighted in this study has supported the current framework as a promising strategy for finding novel promoters for downstream applications. We also believe the developed strategy could greatly benefit from the combination with other high-throughput screening methods, such as SIGEX (Uchiyama et al., <xref ref-type="bibr" rid="B81">2005</xref>), providing innovative possibilities for the prospection of both inducible and constitutive promoters. Finally, we emphasize our observations are always constrained, to a certain extent, by the perspective of the chosen microbial host (Neufeld et al., <xref ref-type="bibr" rid="B60">2006</xref>; Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>; Alves Ld et al., <xref ref-type="bibr" rid="B1">2017</xref>) (i.e., the set of constitutive promoters active in <italic>E. coli</italic>) and might represent only a fraction of the effective environmental metaconstitutome. Future studies systematically applying our methodology to a range of environmental samples and hosts will greatly contribute to understanding this relationship between regulatory diversity and environmental adaptation in bacteria.</p>
</sec>
<sec>
<title>Regulatory architectures and host compatibility for promoter exploration</title>
<p>Through the generation of a small-DNA insert library combined with <italic>in silico</italic> platforms we were able to analyse taxonomic and architectural features of the metagenomic fragments. We have also provided a consensus of recognizable exogenous constitutive promoters in an <italic>E. coli</italic> host. The nucleotide composition of the metagenomic fragments was in agreement with previous G-C content diversity analyses of soil samples, which ranged from 50 to 61% (Foerstner et al., <xref ref-type="bibr" rid="B24">2005</xref>; Bohlin et al., <xref ref-type="bibr" rid="B6">2010</xref>; Mann and Chen, <xref ref-type="bibr" rid="B54">2010</xref>), suggesting an environmental influence on G-C content and taxonomic predominance of microbiomes. Although phylogenetic affiliation based on ORFs at the protein level is not as suitable as sequence-composition-based classifiers (such as <italic>PhylopythiaS</italic>) for predicting taxonomic origins, we could observe that there was an agreement between both methods in a few samples (e.g., pCAW3, pCAW6, pCAW9 and pCAW10). Furthermore, the abundance of bacterial groups and gene functions predicted in this work was also similar to previous high-throughput studies in soil microbial communities (Janssen, <xref ref-type="bibr" rid="B39">2006</xref>; Fierer et al., <xref ref-type="bibr" rid="B22">2007</xref>, <xref ref-type="bibr" rid="B23">2012</xref>). Considering the above, the proposed experimental methodology has allowed us to directly assess the different bacterial groups that had promoter sequences recognizable by the host, as the metagenomic fragments from these predicted taxa have allowed GFP expression in <italic>E. coli</italic>.</p>
<p>Regarding the in-depth search for promoters <italic>in vivo</italic> (small-DNA library) and <italic>in silico</italic>, the experimental finding of at least 33 promoter regions within the initial set of the selected metagenomic clones suggested that the <italic>in silico</italic> prediction (140 RpoD-related promoters) was an overestimate. This can be explained by the fact that it is not uncommon for prediction algorithms to underestimate or overestimate results due to a lack of information regarding the diversity and variability of natural <italic>cis</italic>-regulatory sequences (Vanet et al., <xref ref-type="bibr" rid="B83">1999</xref>; de Jong et al., <xref ref-type="bibr" rid="B16">2012</xref>; Shahmuradov et al., <xref ref-type="bibr" rid="B69">2016</xref>). Furthermore, the positions/architectures of the metagenomic promoters diverged slightly from those of the <italic>E. coli</italic> K-12 genome, suggesting a diversity of genomic architectures in metagenomic libraries and a current underestimation of bacterial intragenic promoters that extends far beyond the <italic>E. coli</italic> model.</p>
<p>Regarding the promoter consensus obtained from the small-DNA fragments, we hypothesized that these sequences could either be recognized by sigma factors other than RpoD or present unusual consensus sequences for the &#x02212;10 and &#x02212;35 boxes that bypassed the algorithmic analysis. However, experimental validation in <italic>E. coli</italic> strains lacking diverse sigma factor genes would be necessary for a more accurate conclusion. Although the observed logo pattern was distant from the <italic>E. coli</italic> consensus proposed for the RpoD-dependent constitutive promoters identified <italic>in vitro</italic> (Figure <xref ref-type="fig" rid="F4">4A</xref>; Shimada et al., <xref ref-type="bibr" rid="B72">2014</xref>), it was very similar to the previously described consensus from experimentally validated promoter (Mitchell, <xref ref-type="bibr" rid="B58">2003</xref>) sets from RegulonDB (Gama-Castro et al., <xref ref-type="bibr" rid="B30">2016</xref>) and EcoCyc (Keseler et al., <xref ref-type="bibr" rid="B43">2017</xref>) databases (Figure <xref ref-type="fig" rid="F4">4B</xref>), suggesting a certain degree of degeneracy for the recognition of constitutive promoters in <italic>E. coli</italic>. Thus, it has allowed us to identify a consensus for exogenous promoter recognition in <italic>E. coli</italic>, which can be an important resource for defining host-dependent constraints in functional metagenomics. Yet, it is possible that promoters that do not match the known consensus for RpoD are recognized by alternative sigma factors, but this needs to be further explored in the future.</p>
<p>A seminal study in functional metagenomics by Gabor et al. (<xref ref-type="bibr" rid="B28">2004</xref>) estimated on a theoretical basis that 40% of the enzymatic activities present in a soil metagenomic library could be readily accessed using <italic>E. coli</italic> as a host in an independent gene expression mode. This prediction implies that at least 40% of the metagenomic promoters would also be recognized by <italic>E. coli</italic>. In contrast, recent empirical studies on <italic>E. coli</italic> and other hosts have shown that functional expression faces a myriad of challenges (Bernstein et al., <xref ref-type="bibr" rid="B4">2007</xref>; Ekkers et al., <xref ref-type="bibr" rid="B20">2012</xref>; Vester et al., <xref ref-type="bibr" rid="B85">2015</xref>), reflecting significantly lower rates than those proposed by Gabor and collaborators (Gabor et al., <xref ref-type="bibr" rid="B28">2004</xref>). In agreement with those studies, our work stresses the gap between theoretical estimations and experimental results, as we have observed that only a small portion of the whole set of promoters is accessible to <italic>E. coli</italic> in metagenomic libraries (&#x0007E;1% of the clones assayed displayed detectable fluorescence in the plates), in contrast to the previously predicted recovery rate of enzymatic activities (&#x0007E;40%) (Gabor et al., <xref ref-type="bibr" rid="B28">2004</xref>). Thus, we highlight the importance of basing predictions on a combination of both experimental and computational data.</p>
</sec>
<sec>
<title>Intrinsic challenges in functional metagenomic studies for promoter exploration</title>
<p>In order to address the constraints underlying our observations and predictions, we have selected some caveats raised during this study, which are intrinsic to functional metagenomics and regulatory studies. Firstly, functional metagenomics investigates a system (the bacterial community) based on its genetic parts (metagenomic fragments); thus, it can provide only a blurred (and somewhat biased) depiction of the whole. For example, some promoters observed as constitutive might be repressed by the structural conformation of bacterial chromatin in the original organism (Dillon and Dorman, <xref ref-type="bibr" rid="B17">2010</xref>), but not in the plasmid context in the host. Secondly, the metagenomic host will always bias the results as it filters biological information according to its own molecular machinery (Guazzaroni et al., <xref ref-type="bibr" rid="B34">2015</xref>; Lam et al., <xref ref-type="bibr" rid="B47">2015</xref>; Alves Ld et al., <xref ref-type="bibr" rid="B1">2017</xref>); e.g., a promoter might be considered constitutive when its exogenous repressor is not expressed in the host. Another potential limitation of the strategy used here is that the direct cloning of DNA fragments and screening for fluorescent clones would be biased toward the identification of promoters located near the fluorescent reporter. Yet, since we were able to identify promoters located more than 1 kb away from the reporter gene, this potential limitation does not appear to be a major concern here.
Lastly, the line between constitutive and regulated promoters has become rather arbitrary among studies as it usually relies on the experimental design and concepts adopted by each research group&#x02014;e.g., some authors consider constitutive bacterial promoters as those that are active <italic>in vivo</italic> in all circumstances, while others define them as the promoters recognized <italic>in vitro</italic> by RNA polymerase RpoD holoenzyme alone in the absence of additional regulatory proteins (Shimada et al., <xref ref-type="bibr" rid="B72">2014</xref>).</p>
</sec></sec>
<sec sec-type="conclusions" id="s5">
<title>Conclusions</title>
<p>In summary, we have focused on integrating experimental and <italic>in silico</italic> approaches to exploit the regulatory diversity of metagenomic DNA fragments by prospecting and characterizing novel promoter sequences in <italic>E. coli</italic>. From this, we were able to identify novel constitutive promoters using real-sized metagenomic DNA fragments, and a further dissection of individual clones allowed us to demonstrate that a number of internal promoters can be recognized by the host to drive gene expression <italic>in vivo</italic>. Further studies could investigate which types of sigma factors contribute to the expression of the identifiable active promoter fragments. Despite the intrinsic limitations previously described, our strategy can be further optimized by high-throughput studies, which will be essential for expanding our current estimations into a more holistic landscape. Finally, we highlight that this work should also be useful for the applied sciences, expanding the current biotechnological toolbox through the discovery and characterization of novel regulatory features.</p>
</sec>
<sec id="s6">
<title>Data availability</title>
<p>The nucleotide sequences obtained for the plasmid inserts have been deposited in the GenBank database under the Accession numbers (KY939589 to KY939598), which are also shown in Table <xref ref-type="table" rid="T2">2</xref>.</p></sec>
<sec id="s7">
<title>Author contributions</title>
<p>CW, LA, M-EG, and RS-R: designed the experiments; CW and LA: performed the experiments; CW: analyzed the data; CW and RS-R: prepared the figures. CW and M-EG wrote the manuscript. All authors reviewed the manuscript.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec></sec>
</body>
<back>
<ack>
<p>The authors thank their lab colleagues for insightful discussions about this manuscript.</p>
</ack>
<sec sec-type="supplementary-material" id="s9">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fmicb.2018.01344/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fmicb.2018.01344/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Alves Ld</surname> <given-names>F.</given-names></name> <name><surname>Silva-Rocha</surname> <given-names>R.</given-names></name> <name><surname>Guazzaroni</surname> <given-names>M.-E.</given-names></name></person-group> (<year>2017</year>). <article-title>Enhancing metagenomic approaches through synthetic biology</article-title>, in <source>Functional Metagenomics: Tools and Applications</source>, eds <person-group person-group-type="editor"><name><surname>Charles</surname> <given-names>T.</given-names></name> <name><surname>Liles</surname> <given-names>M.</given-names></name> <name><surname>Sessitsch</surname> <given-names>A.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>75</fpage>&#x02013;<lpage>94</lpage>.</citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amann</surname> <given-names>R. I.</given-names></name> <name><surname>Ludwig</surname> <given-names>W.</given-names></name> <name><surname>Schleifer</surname> <given-names>K. H.</given-names></name></person-group> (<year>1995</year>). <article-title>Phylogenetic identification and <italic>in situ</italic> detection of individual microbial cells without cultivation</article-title>. <source>Microbiol. Rev</source>. <volume>59</volume>, <fpage>143</fpage>&#x02013;<lpage>169</lpage>. <pub-id pub-id-type="pmid">7535888</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andersen</surname> <given-names>J. B.</given-names></name> <name><surname>Sternberg</surname> <given-names>C.</given-names></name> <name><surname>Poulsen</surname> <given-names>L. K.</given-names></name> <name><surname>Bjorn</surname> <given-names>S. P.</given-names></name> <name><surname>Givskov</surname> <given-names>M.</given-names></name> <name><surname>Molin</surname> <given-names>S.</given-names></name></person-group> (<year>1998</year>). <article-title>New unstable variants of green fluorescent protein for studies of transient gene expression in bacteria</article-title>. <source>Appl. Environ. Microbiol</source>. <volume>64</volume>, <fpage>2240</fpage>&#x02013;<lpage>2246</lpage>. <pub-id pub-id-type="pmid">9603842</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernstein</surname> <given-names>J. R.</given-names></name> <name><surname>Bulter</surname> <given-names>T.</given-names></name> <name><surname>Shen</surname> <given-names>C. R.</given-names></name> <name><surname>Liao</surname> <given-names>J. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Directed evolution of ribosomal protein S1 for enhanced translational efficiency of high GC <italic>Rhodopseudomonas palustris</italic> DNA in <italic>Escherichia coli</italic></article-title>. <source>J. Biol. Chem</source>. <volume>282</volume>, <fpage>18929</fpage>&#x02013;<lpage>18936</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M701395200</pub-id><pub-id pub-id-type="pmid">17412688</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blount</surname> <given-names>B. A.</given-names></name> <name><surname>Weenink</surname> <given-names>T.</given-names></name> <name><surname>Vasylechko</surname> <given-names>S.</given-names></name> <name><surname>Ellis</surname> <given-names>T.</given-names></name></person-group> (<year>2012</year>). <article-title>Rational diversification of a promoter providing fine-tuned expression and orthogonal regulation for synthetic biology</article-title>. <source>PLoS ONE</source> <volume>7</volume>:<fpage>e33279</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0033279</pub-id><pub-id pub-id-type="pmid">22442681</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bohlin</surname> <given-names>J.</given-names></name> <name><surname>Snipen</surname> <given-names>L.</given-names></name> <name><surname>Hardy</surname> <given-names>S. P.</given-names></name> <name><surname>Kristoffersen</surname> <given-names>A. B.</given-names></name> <name><surname>Lagesen</surname> <given-names>K.</given-names></name> <name><surname>D&#x000F8;nsvik</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2010</year>). <article-title>Analysis of intra-genomic GC content homogeneity within prokaryotes</article-title>. <source>BMC Genomics</source> <volume>11</volume>:<fpage>464</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-11-464</pub-id><pub-id pub-id-type="pmid">20691090</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boyle</surname> <given-names>P. M.</given-names></name> <name><surname>Silver</surname> <given-names>P. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Harnessing nature&#x00027;s toolbox: regulatory elements for synthetic biology</article-title>. <source>J. R. Soc. Interface</source> <volume>6</volume>, <fpage>S535</fpage>&#x02013;<lpage>S546</lpage>. <pub-id pub-id-type="doi">10.1098/rsif.2008.0521.focus</pub-id><pub-id pub-id-type="pmid">19324675</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Browning</surname> <given-names>D. F.</given-names></name> <name><surname>Busby</surname> <given-names>S. J.</given-names></name></person-group> (<year>2016</year>). <article-title>Local and global regulation of transcription initiation in bacteria</article-title>. <source>Nat. Rev. Microbiol</source>. <volume>14</volume>, <fpage>638</fpage>&#x02013;<lpage>650</lpage>. <pub-id pub-id-type="doi">10.1038/nrmicro.2016.103</pub-id><pub-id pub-id-type="pmid">27498839</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>S.</given-names></name> <name><surname>Bagdasarian</surname> <given-names>M.</given-names></name> <name><surname>Kaufman</surname> <given-names>M. G.</given-names></name> <name><surname>Walker</surname> <given-names>E. D.</given-names></name></person-group> (<year>2007</year>). <article-title>Characterization of strong promoters from an environmental <italic>Flavobacterium hibernum</italic> strain by using a green fluorescent protein-based reporter system</article-title>. <source>Appl. Environ. Microbiol</source>. <volume>73</volume>, <fpage>1089</fpage>&#x02013;<lpage>1100</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.01577-06</pub-id><pub-id pub-id-type="pmid">17189449</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cho</surname> <given-names>B.-K.</given-names></name> <name><surname>Zengler</surname> <given-names>K.</given-names></name> <name><surname>Qiu</surname> <given-names>Y.</given-names></name> <name><surname>Park</surname> <given-names>Y. S.</given-names></name> <name><surname>Knight</surname> <given-names>E. M.</given-names></name> <name><surname>Barrett</surname> <given-names>C. L.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Elucidation of the transcription unit architecture of the <italic>Escherichia coli</italic> K-12 MG1655 genome</article-title>. <source>Nat. Biotechnol.</source> <volume>27</volume>, <fpage>1043</fpage>&#x02013;<lpage>1049</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.1582</pub-id><pub-id pub-id-type="pmid">19881496</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Conway</surname> <given-names>T.</given-names></name> <name><surname>Creecy</surname> <given-names>J. P.</given-names></name> <name><surname>Maddox</surname> <given-names>S. M.</given-names></name> <name><surname>Grissom</surname> <given-names>J. E.</given-names></name> <name><surname>Conkle</surname> <given-names>T. L.</given-names></name> <name><surname>Shadid</surname> <given-names>T. M.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing</article-title>. <source>MBio</source> <volume>5</volume>, <fpage>1</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1128/mBio.01442-14</pub-id><pub-id pub-id-type="pmid">25006232</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Covert</surname> <given-names>M. W.</given-names></name> <name><surname>Knight</surname> <given-names>E. M.</given-names></name> <name><surname>Reed</surname> <given-names>J. L.</given-names></name> <name><surname>Herrgard</surname> <given-names>M. J.</given-names></name> <name><surname>Palsson</surname> <given-names>B. O.</given-names></name></person-group> (<year>2004</year>). <article-title>Integrating high-throughput and computational data elucidates bacterial networks</article-title>. <source>Nature</source> <volume>429</volume>, <fpage>92</fpage>&#x02013;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1038/nature02456</pub-id><pub-id pub-id-type="pmid">15129285</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cowan</surname> <given-names>D.</given-names></name> <name><surname>Meyer</surname> <given-names>Q.</given-names></name> <name><surname>Stafford</surname> <given-names>W.</given-names></name> <name><surname>Muyanga</surname> <given-names>S.</given-names></name> <name><surname>Cameron</surname> <given-names>R.</given-names></name> <name><surname>Wittwer</surname> <given-names>P.</given-names></name></person-group> (<year>2005</year>). <article-title>Metagenomic gene discovery: past, present and future</article-title>. <source>Trends Biotechnol</source>. <volume>23</volume>, <fpage>321</fpage>&#x02013;<lpage>329</lpage>. <pub-id pub-id-type="doi">10.1016/j.tibtech.2005.04.001</pub-id><pub-id pub-id-type="pmid">15922085</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cox</surname> <given-names>R. S.</given-names> <suffix>III.</suffix></name> <name><surname>Surette</surname> <given-names>M. G.</given-names></name> <name><surname>Elowitz</surname> <given-names>M. B.</given-names></name></person-group> (<year>2007</year>). <article-title>Programming gene expression with combinatorial promoters</article-title>. <source>Mol. Syst. Biol</source>. <volume>3</volume>:<fpage>145</fpage>. <pub-id pub-id-type="doi">10.1038/msb4100187</pub-id><pub-id pub-id-type="pmid">18004278</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crooks</surname> <given-names>G. E.</given-names></name> <name><surname>Hon</surname> <given-names>G.</given-names></name> <name><surname>Chandonia</surname> <given-names>J. M.</given-names></name> <name><surname>Brenner</surname> <given-names>S. E.</given-names></name></person-group> (<year>2004</year>). <article-title>WebLogo: a sequence logo generator</article-title>. <source>Genome Res.</source> <volume>14</volume>, <fpage>1188</fpage>&#x02013;<lpage>1190</lpage>. <pub-id pub-id-type="doi">10.1101/gr.849004</pub-id><pub-id pub-id-type="pmid">15173120</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Jong</surname> <given-names>A.</given-names></name> <name><surname>Pietersma</surname> <given-names>H.</given-names></name> <name><surname>Cordes</surname> <given-names>M.</given-names></name> <name><surname>Kuipers</surname> <given-names>O. P.</given-names></name> <name><surname>Kok</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>PePPER: a webserver for prediction of prokaryote promoter elements and regulons</article-title>. <source>BMC Genomics</source> <volume>13</volume>:<fpage>299</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-13-299</pub-id><pub-id pub-id-type="pmid">22747501</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dillon</surname> <given-names>S. C.</given-names></name> <name><surname>Dorman</surname> <given-names>C. J.</given-names></name></person-group> (<year>2010</year>). <article-title>Bacterial nucleoid-associated proteins, nucleoid structure and gene expression</article-title>. <source>Nat. Rev. Microbiol</source>. <volume>8</volume>, <fpage>185</fpage>&#x02013;<lpage>195</lpage>. <pub-id pub-id-type="doi">10.1038/nrmicro2261</pub-id><pub-id pub-id-type="pmid">20140026</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dunn</surname> <given-names>A. K.</given-names></name> <name><surname>Handelsman</surname> <given-names>J.</given-names></name></person-group> (<year>1999</year>). <article-title>A vector for promoter trapping in <italic>Bacillus cereus</italic></article-title>. <source>Gene</source> <volume>226</volume>, <fpage>297</fpage>&#x02013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1016/S0378-1119(98)00544-7</pub-id><pub-id pub-id-type="pmid">9931504</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dupont</surname> <given-names>C. L.</given-names></name> <name><surname>McCrow</surname> <given-names>J. P.</given-names></name> <name><surname>Valas</surname> <given-names>R.</given-names></name> <name><surname>Moustafa</surname> <given-names>A.</given-names></name> <name><surname>Walworth</surname> <given-names>N.</given-names></name> <name><surname>Goodenough</surname> <given-names>U.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Genomes and gene expression across light and productivity gradients in eastern subtropical Pacific microbial communities</article-title>. <source>ISME J</source>. <volume>9</volume>, <fpage>1076</fpage>&#x02013;<lpage>1092</lpage>. <pub-id pub-id-type="doi">10.1038/ismej.2014.198</pub-id><pub-id pub-id-type="pmid">25333462</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ekkers</surname> <given-names>D. M.</given-names></name> <name><surname>Cretoiu</surname> <given-names>M. S.</given-names></name> <name><surname>Kielak</surname> <given-names>A. M.</given-names></name> <name><surname>van Elsas</surname> <given-names>J. D.</given-names></name></person-group> (<year>2012</year>). <article-title>The great screen anomaly&#x02014;a new frontier in product discovery through functional metagenomics</article-title>. <source>Appl. Microbiol. Biotechnol</source>. <volume>93</volume>, <fpage>1005</fpage>&#x02013;<lpage>1020</lpage>. <pub-id pub-id-type="doi">10.1007/s00253-011-3804-3</pub-id><pub-id pub-id-type="pmid">22189864</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fernandez</surname> <given-names>L.</given-names></name> <name><surname>Mercader</surname> <given-names>J. M.</given-names></name> <name><surname>Planas-F&#x000E8;lix</surname> <given-names>M.</given-names></name> <name><surname>Torrents</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>Adaptation to environmental factors shapes the organization of regulatory regions in microbial communities</article-title>. <source>BMC Genomics</source> <volume>15</volume>:<fpage>877</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-15-877</pub-id><pub-id pub-id-type="pmid">25294412</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fierer</surname> <given-names>N.</given-names></name> <name><surname>Bradford</surname> <given-names>M. A.</given-names></name> <name><surname>Jackson</surname> <given-names>R. B.</given-names></name></person-group> (<year>2007</year>). <article-title>Toward an ecological classification of soil bacteria</article-title>. <source>Ecology</source> <volume>88</volume>, <fpage>1354</fpage>&#x02013;<lpage>1364</lpage>. <pub-id pub-id-type="doi">10.1890/05-1839</pub-id><pub-id pub-id-type="pmid">17601128</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fierer</surname> <given-names>N.</given-names></name> <name><surname>Leff</surname> <given-names>J. W.</given-names></name> <name><surname>Adams</surname> <given-names>B. J.</given-names></name> <name><surname>Nielsen</surname> <given-names>U. N.</given-names></name> <name><surname>Bates</surname> <given-names>S. T.</given-names></name> <name><surname>Lauber</surname> <given-names>C. L.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Cross-biome metagenomic analyses of soil microbial communities and their functional attributes</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>109</volume>, <fpage>21390</fpage>&#x02013;<lpage>21395</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1215210110</pub-id><pub-id pub-id-type="pmid">23236140</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foerstner</surname> <given-names>K. U.</given-names></name> <name><surname>von Mering</surname> <given-names>C.</given-names></name> <name><surname>Hooper</surname> <given-names>S. D.</given-names></name> <name><surname>Bork</surname> <given-names>P.</given-names></name></person-group> (<year>2005</year>). <article-title>Environments shape the nucleotide composition of genomes</article-title>. <source>EMBO Rep.</source> <volume>6</volume>, <fpage>1208</fpage>&#x02013;<lpage>1213</lpage>. <pub-id pub-id-type="doi">10.1038/sj.embor.7400538</pub-id><pub-id pub-id-type="pmid">16200051</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fortunato</surname> <given-names>C. S.</given-names></name> <name><surname>Crump</surname> <given-names>B. C.</given-names></name></person-group> (<year>2015</year>). <article-title>Microbial gene abundance and expression patterns across a river to ocean salinity gradient</article-title>. <source>PLoS ONE</source> <volume>10</volume>:<fpage>e0140578</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0140578</pub-id><pub-id pub-id-type="pmid">26536246</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fredrickson</surname> <given-names>J. K.</given-names></name></person-group> (<year>2015</year>). <article-title>Ecological communities by design</article-title>. <source>Science</source> <volume>348</volume>, <fpage>1425</fpage>&#x02013;<lpage>1427</lpage>. <pub-id pub-id-type="doi">10.1126/science.aab0946</pub-id><pub-id pub-id-type="pmid">26113703</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Frias-Lopez</surname> <given-names>J.</given-names></name> <name><surname>Shi</surname> <given-names>Y.</given-names></name> <name><surname>Tyson</surname> <given-names>G. W.</given-names></name> <name><surname>Coleman</surname> <given-names>M. L.</given-names></name> <name><surname>Schuster</surname> <given-names>S. C.</given-names></name> <name><surname>Chisholm</surname> <given-names>S. W.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Microbial community gene expression in ocean surface waters</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>105</volume>, <fpage>3805</fpage>&#x02013;<lpage>3810</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0708897105</pub-id><pub-id pub-id-type="pmid">18316740</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gabor</surname> <given-names>E. M.</given-names></name> <name><surname>Alkema</surname> <given-names>W. B.</given-names></name> <name><surname>Janssen</surname> <given-names>D. B.</given-names></name></person-group> (<year>2004</year>). <article-title>Quantifying the accessibility of the metagenome by random expression cloning techniques</article-title>. <source>Environ. Microbiol.</source> <volume>6</volume>, <fpage>879</fpage>&#x02013;<lpage>886</lpage>. <pub-id pub-id-type="doi">10.1111/j.1462-2920.2004.00640.x</pub-id><pub-id pub-id-type="pmid">15305913</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gama-Castro</surname> <given-names>S.</given-names></name> <name><surname>Salgado</surname> <given-names>H.</given-names></name> <name><surname>Santos-Zavaleta</surname> <given-names>A.</given-names></name> <name><surname>Ledezma-Tejeida</surname> <given-names>D.</given-names></name> <name><surname>Muniz-Rascado</surname> <given-names>L.</given-names></name> <name><surname>Garcia-Sotelo</surname> <given-names>J. S.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond</article-title>. <source>Nucleic Acids Res.</source> <volume>44</volume>, <fpage>D133</fpage>&#x02013;<lpage>D143</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkv1156</pub-id><pub-id pub-id-type="pmid">26527724</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gruber</surname> <given-names>T. M.</given-names></name> <name><surname>Gross</surname> <given-names>C. A.</given-names></name></person-group> (<year>2003</year>). <article-title>Multiple sigma subunits and the partitioning of bacterial transcription space</article-title>. <source>Annu. Rev. Microbiol</source>. <volume>57</volume>, <fpage>441</fpage>&#x02013;<lpage>466</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.micro.57.030502.090913</pub-id><pub-id pub-id-type="pmid">14527287</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guazzaroni</surname> <given-names>M. E.</given-names></name> <name><surname>Morgante</surname> <given-names>V.</given-names></name> <name><surname>Mirete</surname> <given-names>S.</given-names></name> <name><surname>Gonzalez-Pastor</surname> <given-names>J. E.</given-names></name></person-group> (<year>2013</year>). <article-title>Novel acid resistance genes from the metagenome of the Tinto River, an extremely acidic environment</article-title>. <source>Environ. Microbiol.</source> <volume>15</volume>, <fpage>1088</fpage>&#x02013;<lpage>1102</lpage>. <pub-id pub-id-type="doi">10.1111/1462-2920.12021</pub-id><pub-id pub-id-type="pmid">23145860</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guazzaroni</surname> <given-names>M. E.</given-names></name> <name><surname>Silva-Rocha</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>Expanding the logic of bacterial promoters using engineered overlapping operators for global regulators</article-title>. <source>ACS Synthetic Biol.</source> <volume>3</volume>, <fpage>666</fpage>&#x02013;<lpage>675</lpage>. <pub-id pub-id-type="doi">10.1021/sb500084f</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guazzaroni</surname> <given-names>M. E.</given-names></name> <name><surname>Silva-Rocha</surname> <given-names>R.</given-names></name> <name><surname>Ward</surname> <given-names>R. J.</given-names></name></person-group> (<year>2015</year>). <article-title>Synthetic biology approaches to improve biocatalyst identification in metagenomic library screening</article-title>. <source>Microb. Biotechnol.</source> <volume>8</volume>, <fpage>52</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1111/1751-7915.12146</pub-id><pub-id pub-id-type="pmid">25123225</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Han</surname> <given-names>S. S.</given-names></name> <name><surname>Lee</surname> <given-names>J. Y.</given-names></name> <name><surname>Kim</surname> <given-names>W. H.</given-names></name> <name><surname>Shin</surname> <given-names>H. J.</given-names></name> <name><surname>Kim</surname> <given-names>G. J.</given-names></name></person-group> (<year>2008</year>). <article-title>Screening of promoters from metagenomic DNA and their use for the construction of expression vectors</article-title>. <source>J. Microbiol. Biotechnol.</source> <volume>18</volume>, <fpage>1634</fpage>&#x02013;<lpage>1640</lpage>. <pub-id pub-id-type="pmid">18955811</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Handelsman</surname> <given-names>J.</given-names></name> <name><surname>Rondon</surname> <given-names>M. R.</given-names></name> <name><surname>Brady</surname> <given-names>S. F.</given-names></name> <name><surname>Clardy</surname> <given-names>J.</given-names></name> <name><surname>Goodman</surname> <given-names>R. M.</given-names></name></person-group> (<year>1998</year>). <article-title>Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products</article-title>. <source>Chem. Biol</source>. <volume>5</volume>, <fpage>R245</fpage>&#x02013;<lpage>R249</lpage>. <pub-id pub-id-type="doi">10.1016/S1074-5521(98)90108-9</pub-id><pub-id pub-id-type="pmid">9818143</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hebisch</surname> <given-names>E.</given-names></name> <name><surname>Knebel</surname> <given-names>J.</given-names></name> <name><surname>Landsberg</surname> <given-names>J.</given-names></name> <name><surname>Frey</surname> <given-names>E.</given-names></name> <name><surname>Leisner</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>High variation of fluorescence protein maturation times in closely related <italic>Escherichia coli</italic> strains</article-title>. <source>PLoS ONE</source> <volume>8</volume>:<fpage>e75991</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0075991</pub-id><pub-id pub-id-type="pmid">24155882</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ishihama</surname> <given-names>A.</given-names></name></person-group> (<year>2010</year>). <article-title>Prokaryotic genome regulation: multifactor promoters, multitarget regulators and hierarchic networks</article-title>. <source>FEMS Microbiol. Rev</source>. <volume>34</volume>, <fpage>628</fpage>&#x02013;<lpage>645</lpage>. <pub-id pub-id-type="doi">10.1111/j.1574-6976.2010.00227.x</pub-id><pub-id pub-id-type="pmid">20491932</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Janssen</surname> <given-names>P. H.</given-names></name></person-group> (<year>2006</year>). <article-title>Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes</article-title>. <source>Appl. Environ. Microbiol</source>. <volume>72</volume>, <fpage>1719</fpage>&#x02013;<lpage>1728</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.72.3.1719-1728.2006</pub-id><pub-id pub-id-type="pmid">16517615</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jim&#x000E9;nez</surname> <given-names>D. J.</given-names></name> <name><surname>Monta&#x000F1;a</surname> <given-names>J. S.</given-names></name> <name><surname>&#x000C1;lvarez</surname> <given-names>D.</given-names></name> <name><surname>Baena</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>A novel cold active esterase derived from Colombian high Andean forest soil metagenome</article-title>. <source>World J. Microbiol. Biotechnol</source>. <volume>28</volume>, <fpage>361</fpage>&#x02013;<lpage>370</lpage>. <pub-id pub-id-type="doi">10.1007/s11274-011-0828-x</pub-id><pub-id pub-id-type="pmid">22806812</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johns</surname> <given-names>N. I.</given-names></name> <name><surname>Blazejewski</surname> <given-names>T.</given-names></name> <name><surname>Gomes</surname> <given-names>A. L.</given-names></name> <name><surname>Wang</surname> <given-names>H. H.</given-names></name></person-group> (<year>2016</year>). <article-title>Principles for designing synthetic microbial communities</article-title>. <source>Curr. Opin. Microbiol.</source> <volume>31</volume>, <fpage>146</fpage>&#x02013;<lpage>153</lpage>. <pub-id pub-id-type="doi">10.1016/j.mib.2016.03.010</pub-id><pub-id pub-id-type="pmid">27084981</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kelly</surname> <given-names>J. R.</given-names></name> <name><surname>Rubin</surname> <given-names>A. J.</given-names></name> <name><surname>Davis</surname> <given-names>J. H.</given-names></name> <name><surname>Ajo-Franklin</surname> <given-names>C. M.</given-names></name> <name><surname>Cumbers</surname> <given-names>J.</given-names></name> <name><surname>Czar</surname> <given-names>M. J.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Measuring the activity of BioBrick promoters using an <italic>in vivo</italic> reference standard</article-title>. <source>J. Biol. Eng.</source> <volume>3</volume>:<fpage>4</fpage>. <pub-id pub-id-type="doi">10.1186/1754-1611-3-4</pub-id><pub-id pub-id-type="pmid">19298678</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Keseler</surname> <given-names>I. M.</given-names></name> <name><surname>Mackie</surname> <given-names>A.</given-names></name> <name><surname>Santos-Zavaleta</surname> <given-names>A.</given-names></name> <name><surname>Billington</surname> <given-names>R.</given-names></name> <name><surname>Bonavides-Mart&#x000ED;nez</surname> <given-names>C.</given-names></name> <name><surname>Caspi</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>The EcoCyc database: reflecting new knowledge about <italic>Escherichia coli</italic> K-12</article-title>. <source>Nucleic Acids Res</source>. <volume>45</volume>, <fpage>D543</fpage>&#x02013;<lpage>D550</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkw1003</pub-id><pub-id pub-id-type="pmid">27899573</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kimelman</surname> <given-names>A.</given-names></name> <name><surname>Levy</surname> <given-names>A.</given-names></name> <name><surname>Sberro</surname> <given-names>H.</given-names></name> <name><surname>Kidron</surname> <given-names>S.</given-names></name> <name><surname>Leavitt</surname> <given-names>A.</given-names></name> <name><surname>Amitai</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>A vast collection of microbial genes that are toxic to bacteria</article-title>. <source>Genome Res.</source> <volume>22</volume>, <fpage>802</fpage>&#x02013;<lpage>809</lpage>. <pub-id pub-id-type="doi">10.1101/gr.133850.111</pub-id><pub-id pub-id-type="pmid">22300632</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koonin</surname> <given-names>E. V.</given-names></name></person-group> (<year>2009</year>). <article-title>Evolution of genome architecture</article-title>. <source>Int. J. Biochem. Cell Biol.</source> <volume>41</volume>, <fpage>298</fpage>&#x02013;<lpage>306</lpage>. <pub-id pub-id-type="doi">10.1016/j.biocel.2008.09.015</pub-id><pub-id pub-id-type="pmid">18929678</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kubota</surname> <given-names>M.</given-names></name> <name><surname>Yamazaki</surname> <given-names>Y.</given-names></name> <name><surname>Ishihama</surname> <given-names>A.</given-names></name></person-group> (<year>1991</year>). <article-title>Random screening of promoters from <italic>Escherichia coli</italic> and classification based on the promoter strength</article-title>. <source>Jpn. J. Genetics</source> <volume>66</volume>, <fpage>399</fpage>&#x02013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.1266/jjg.66.399</pub-id><pub-id pub-id-type="pmid">1954034</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lam</surname> <given-names>K. N.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Engel</surname> <given-names>K.</given-names></name> <name><surname>Neufeld</surname> <given-names>J. D.</given-names></name> <name><surname>Charles</surname> <given-names>T. C.</given-names></name></person-group> (<year>2015</year>). <article-title>Current and future resources for functional metagenomics</article-title>. <source>Front. Microbiol.</source> <volume>6</volume>:<fpage>1196</fpage>. <pub-id pub-id-type="doi">10.3389/fmicb.2015.01196</pub-id><pub-id pub-id-type="pmid">26579102</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Land</surname> <given-names>M.</given-names></name> <name><surname>Hauser</surname> <given-names>L.</given-names></name> <name><surname>Jun</surname> <given-names>S. R.</given-names></name> <name><surname>Nookaew</surname> <given-names>I.</given-names></name> <name><surname>Leuze</surname> <given-names>M. R.</given-names></name> <name><surname>Ahn</surname> <given-names>T. H.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Insights from 20 years of bacterial genome sequencing</article-title>. <source>Funct. Integr. Genomics</source> <volume>15</volume>, <fpage>141</fpage>&#x02013;<lpage>161</lpage>. <pub-id pub-id-type="doi">10.1007/s10142-015-0433-4</pub-id><pub-id pub-id-type="pmid">25722247</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Qin</surname> <given-names>L.</given-names></name></person-group> (<year>2005</year>). <article-title>Metagenomics-based drug discovery and marine microbial diversity</article-title>. <source>Trends Biotechnol</source>. <volume>23</volume>, <fpage>539</fpage>&#x02013;<lpage>543</lpage>. <pub-id pub-id-type="doi">10.1016/j.tibtech.2005.08.006</pub-id><pub-id pub-id-type="pmid">16154653</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liebl</surname> <given-names>W.</given-names></name> <name><surname>Angelov</surname> <given-names>A.</given-names></name> <name><surname>Juergensen</surname> <given-names>J.</given-names></name> <name><surname>Chow</surname> <given-names>J.</given-names></name> <name><surname>Loeschcke</surname> <given-names>A.</given-names></name> <name><surname>Drepper</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Alternative hosts for functional (meta)genome analysis</article-title>. <source>Appl. Microbiol. Biotechnol</source>. <volume>98</volume>, <fpage>8099</fpage>&#x02013;<lpage>8109</lpage>. <pub-id pub-id-type="doi">10.1007/s00253-014-5961-7</pub-id><pub-id pub-id-type="pmid">25091044</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Locey</surname> <given-names>K. J.</given-names></name> <name><surname>Lennon</surname> <given-names>J. T.</given-names></name></person-group> (<year>2016</year>). <article-title>Scaling laws predict global microbial diversity</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>113</volume>, <fpage>5970</fpage>&#x02013;<lpage>5975</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1521291113</pub-id><pub-id pub-id-type="pmid">27140646</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lonetto</surname> <given-names>M.</given-names></name> <name><surname>Gribskov</surname> <given-names>M.</given-names></name> <name><surname>Gross</surname> <given-names>C. A.</given-names></name></person-group> (<year>1992</year>). <article-title>The sigma 70 family: sequence conservation and evolutionary relationships</article-title>. <source>J. Bacteriol</source>. <volume>174</volume>, <fpage>3843</fpage>&#x02013;<lpage>3849</lpage>. <pub-id pub-id-type="doi">10.1128/jb.174.12.3843-3849.1992</pub-id><pub-id pub-id-type="pmid">1597408</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lu</surname> <given-names>C.</given-names></name> <name><surname>Bentley</surname> <given-names>W. E.</given-names></name> <name><surname>Rao</surname> <given-names>G.</given-names></name></person-group> (<year>2004</year>). <article-title>A high-throughput approach to promoter study using green fluorescent protein</article-title>. <source>Biotechnol. Prog</source>. <volume>20</volume>, <fpage>1634</fpage>&#x02013;<lpage>1640</lpage>. <pub-id pub-id-type="doi">10.1021/bp049751l</pub-id><pub-id pub-id-type="pmid">15575693</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mann</surname> <given-names>S.</given-names></name> <name><surname>Chen</surname> <given-names>Y. P. P.</given-names></name></person-group> (<year>2010</year>). <article-title>Bacterial genomic G &#x0002B; C composition-eliciting environmental adaptation</article-title>. <source>Genomics</source> <volume>95</volume>, <fpage>7</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1016/j.ygeno.2009.09.002</pub-id><pub-id pub-id-type="pmid">19747541</pub-id></citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mao</surname> <given-names>X.</given-names></name> <name><surname>Ma</surname> <given-names>Q.</given-names></name> <name><surname>Liu</surname> <given-names>B.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name></person-group> (<year>2015</year>). <article-title>Revisiting operons: an analysis of the landscape of transcriptional units in <italic>E. coli</italic></article-title>. <source>BMC Bioinformatics</source> <volume>16</volume>:<fpage>356</fpage>. <pub-id pub-id-type="doi">10.1186/s12859-015-0805-8</pub-id><pub-id pub-id-type="pmid">26538447</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mao</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Yin</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name></person-group> (<year>2012</year>). <article-title>The percentage of bacterial genes on leading versus lagging strands is influenced by multiple balancing forces</article-title>. <source>Nucleic Acids Res</source>. <volume>40</volume>, <fpage>8210</fpage>&#x02013;<lpage>8218</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gks605</pub-id><pub-id pub-id-type="pmid">22735706</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mart&#x000ED;nez-Antonio</surname> <given-names>A.</given-names></name> <name><surname>Collado-Vides</surname> <given-names>J.</given-names></name></person-group> (<year>2003</year>). <article-title>Identifying global regulators in transcriptional regulatory networks in bacteria</article-title>. <source>Curr. Opin. Microbiol</source>. <volume>6</volume>, <fpage>482</fpage>&#x02013;<lpage>489</lpage>. <pub-id pub-id-type="doi">10.1016/j.mib.2003.09.002</pub-id><pub-id pub-id-type="pmid">14572541</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mitchell</surname> <given-names>J. E.</given-names></name></person-group> (<year>2003</year>). <article-title>Identification and analysis of &#x02018;extended &#x02212;10&#x02019; promoters in <italic>Escherichia coli</italic></article-title>. <source>Nucleic Acids Res</source>. <volume>31</volume>, <fpage>4689</fpage>&#x02013;<lpage>4695</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkg694</pub-id><pub-id pub-id-type="pmid">12907708</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nakagawa</surname> <given-names>S.</given-names></name> <name><surname>Cuthill</surname> <given-names>I. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Effect size, confidence interval and statistical significance: a practical guide for biologists</article-title>. <source>Biol. Rev. Camb. Philos. Soc.</source> <volume>82</volume>, <fpage>591</fpage>&#x02013;<lpage>605</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-185X.2007.00027.x</pub-id><pub-id pub-id-type="pmid">17944619</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neufeld</surname> <given-names>J. D.</given-names></name> <name><surname>Mohn</surname> <given-names>W. W.</given-names></name> <name><surname>de Lorenzo</surname> <given-names>V.</given-names></name></person-group> (<year>2006</year>). <article-title>Composition of microbial communities in hexachlorocyclohexane (HCH) contaminated soils from Spain revealed with a habitat-specific microarray</article-title>. <source>Environ. Microbiol</source>. <volume>8</volume>, <fpage>126</fpage>&#x02013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1111/j.1462-2920.2005.00875.x</pub-id><pub-id pub-id-type="pmid">16343328</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paget</surname> <given-names>M. S. B.</given-names></name> <name><surname>Helmann</surname> <given-names>J. D.</given-names></name></person-group> (<year>2003</year>). <article-title>The sigma70 family of sigma factors</article-title>. <source>Genome Biol</source>. <volume>4</volume>:<fpage>203</fpage>. <pub-id pub-id-type="doi">10.1186/gb-2003-4-1-203</pub-id><pub-id pub-id-type="pmid">12540296</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Patil</surname> <given-names>K. R.</given-names></name> <name><surname>Roune</surname> <given-names>L.</given-names></name> <name><surname>McHardy</surname> <given-names>A. C.</given-names></name></person-group> (<year>2012</year>). <article-title>The PhyloPythiaS web server for taxonomic assignment of metagenome sequences</article-title>. <source>PLoS ONE</source> <volume>7</volume>:<fpage>e38581</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0038581</pub-id><pub-id pub-id-type="pmid">22745671</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pushpam</surname> <given-names>P.</given-names></name> <name><surname>Rajesh</surname> <given-names>T.</given-names></name> <name><surname>Gunasekaran</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Identification and characterization of alkaline serine protease from goat skin surface metagenome</article-title>. <source>AMB Express</source> <volume>1</volume>:<fpage>3</fpage>. <pub-id pub-id-type="doi">10.1186/2191-0855-1-3</pub-id><pub-id pub-id-type="pmid">21906326</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raes</surname> <given-names>J.</given-names></name> <name><surname>Korbel</surname> <given-names>J. O.</given-names></name> <name><surname>Lercher</surname> <given-names>M. J.</given-names></name> <name><surname>von Mering</surname> <given-names>C.</given-names></name> <name><surname>Bork</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). <article-title>Prediction of effective genome size in metagenomic samples</article-title>. <source>Genome Biol</source>. <volume>8</volume>:<fpage>R10</fpage>. <pub-id pub-id-type="doi">10.1186/gb-2007-8-1-r10</pub-id><pub-id pub-id-type="pmid">17224063</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sambrook</surname> <given-names>J.</given-names></name> <name><surname>Fritsch</surname> <given-names>E. F.</given-names></name> <name><surname>Maniatis</surname> <given-names>T.</given-names></name></person-group> (<year>1989</year>). <source>Molecular Cloning: A Laboratory Manual.</source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Cold Spring Harbor</publisher-name>.</citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sanches-Medeiros</surname> <given-names>A.</given-names></name> <name><surname>Monteiro</surname> <given-names>L. M. O.</given-names></name> <name><surname>Silva-Rocha</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <article-title>Calibrating transcriptional activity using constitutive synthetic promoters in mutants for global regulators in <italic>Escherichia coli</italic></article-title>. <source>Int. J. Genomics</source> <volume>2018</volume>:<fpage>9235605</fpage>. <pub-id pub-id-type="doi">10.1155/2018/9235605</pub-id><pub-id pub-id-type="pmid">29750145</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schaefer</surname> <given-names>C. E. G. R.</given-names></name> <name><surname>Fabris</surname> <given-names>J. D.</given-names></name> <name><surname>Ker</surname> <given-names>J. C.</given-names></name></person-group> (<year>2008</year>). <article-title>Minerals in the clay fraction of Brazilian Latosols (Oxisols): a review</article-title>. <source>Clay Miner.</source> <volume>43</volume>, <fpage>137</fpage>&#x02013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1180/claymin.2008.043.1.11</pub-id></citation></ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serres</surname> <given-names>M. H.</given-names></name> <name><surname>Riley</surname> <given-names>M.</given-names></name></person-group> (<year>2000</year>). <article-title>MultiFun, a multifunctional classification scheme for <italic>Escherichia coli</italic> K-12 gene products</article-title>. <source>Microb. Comp. Genomics</source> <volume>5</volume>, <fpage>205</fpage>&#x02013;<lpage>222</lpage>. <pub-id pub-id-type="doi">10.1089/omi.1.2000.5.205</pub-id><pub-id pub-id-type="pmid">11471834</pub-id></citation></ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shahmuradov</surname> <given-names>I. A.</given-names></name> <name><surname>Mohamad Razali</surname> <given-names>R.</given-names></name> <name><surname>Bougouffa</surname> <given-names>S.</given-names></name> <name><surname>Radovanovic</surname> <given-names>A.</given-names></name> <name><surname>Bajic</surname> <given-names>V. B.</given-names></name></person-group> (<year>2016</year>). <article-title>bTSSfinder: a novel tool for the prediction of promoters in Cyanobacteria and <italic>Escherichia coli</italic></article-title>. <source>Bioinformatics</source> <volume>33</volume>, <fpage>334</fpage>&#x02013;<lpage>340</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btw629</pub-id><pub-id pub-id-type="pmid">27694198</pub-id></citation></ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen-Orr</surname> <given-names>S. S.</given-names></name> <name><surname>Milo</surname> <given-names>R.</given-names></name> <name><surname>Mangan</surname> <given-names>S.</given-names></name> <name><surname>Alon</surname> <given-names>U.</given-names></name></person-group> (<year>2002</year>). <article-title>Network motifs in the transcriptional regulation network of <italic>Escherichia coli</italic></article-title>. <source>Nat. Genet.</source> <volume>31</volume>, <fpage>64</fpage>&#x02013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1038/ng881</pub-id><pub-id pub-id-type="pmid">11967538</pub-id></citation></ref>
<ref id="B71">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shimada</surname> <given-names>T.</given-names></name> <name><surname>Fujita</surname> <given-names>N.</given-names></name> <name><surname>Maeda</surname> <given-names>M.</given-names></name> <name><surname>Ishihama</surname> <given-names>A.</given-names></name></person-group> (<year>2005</year>). <article-title>Systematic search for the Cra-binding promoters using genomic SELEX system</article-title>. <source>Genes Cells</source> <volume>10</volume>, <fpage>907</fpage>&#x02013;<lpage>918</lpage>. <pub-id pub-id-type="doi">10.1111/j.1365-2443.2005.00888.x</pub-id><pub-id pub-id-type="pmid">16115199</pub-id></citation></ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shimada</surname> <given-names>T.</given-names></name> <name><surname>Yamazaki</surname> <given-names>Y.</given-names></name> <name><surname>Tanaka</surname> <given-names>K.</given-names></name> <name><surname>Ishihama</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>The whole set of constitutive promoters recognized by RNA polymerase RpoD holoenzyme of <italic>Escherichia coli</italic></article-title>. <source>PLoS ONE</source> <volume>9</volume>:<fpage>e90447</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0090447</pub-id><pub-id pub-id-type="pmid">24603758</pub-id></citation></ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Silander</surname> <given-names>O. K.</given-names></name> <name><surname>Nikolic</surname> <given-names>N.</given-names></name> <name><surname>Zaslaver</surname> <given-names>A.</given-names></name> <name><surname>Bren</surname> <given-names>A.</given-names></name> <name><surname>Kikoin</surname> <given-names>I.</given-names></name> <name><surname>Alon</surname> <given-names>U.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>A genome-wide analysis of promoter-mediated phenotypic noise in <italic>Escherichia coli</italic></article-title>. <source>PLoS Genet.</source> <volume>8</volume>, <fpage>1</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1371/annotation/73cf6e53-2141-4918-926b-8d07b073884d</pub-id><pub-id pub-id-type="pmid">22275871</pub-id></citation></ref>
<ref id="B74">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Silva-Rocha</surname> <given-names>R.</given-names></name> <name><surname>de Lorenzo</surname> <given-names>V.</given-names></name></person-group> (<year>2008</year>). <article-title>Mining logic gates in prokaryotic transcriptional regulation networks</article-title>. <source>FEBS Lett.</source> <volume>582</volume>, <fpage>1237</fpage>&#x02013;<lpage>1244</lpage>. <pub-id pub-id-type="doi">10.1016/j.febslet.2008.01.060</pub-id><pub-id pub-id-type="pmid">18275855</pub-id></citation></ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Singh</surname> <given-names>J.</given-names></name> <name><surname>Behal</surname> <given-names>A.</given-names></name> <name><surname>Singla</surname> <given-names>N.</given-names></name> <name><surname>Joshi</surname> <given-names>A.</given-names></name> <name><surname>Birbian</surname> <given-names>N.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Metagenomics: concept, methodology, ecological inference and recent advances</article-title>. <source>Biotechnol. J</source>. <volume>4</volume>, <fpage>480</fpage>&#x02013;<lpage>494</lpage>. <pub-id pub-id-type="doi">10.1002/biot.200800201</pub-id><pub-id pub-id-type="pmid">19288513</pub-id></citation></ref>
<ref id="B76">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sol&#x000E9;</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <article-title>Bioengineering the biosphere?</article-title> <source>Ecol. Complexity</source> <volume>22</volume>, <fpage>40</fpage>&#x02013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1016/j.ecocom.2015.01.005</pub-id></citation></ref>
<ref id="B77">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Solovyev</surname> <given-names>V.</given-names></name> <name><surname>Salamov</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>Automatic annotation of microbial genomes and metagenomic sequences</article-title>, in <source>Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies</source>, ed <person-group person-group-type="editor"><name><surname>Li</surname> <given-names>R. W.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Nova Science Publishers</publisher-name>), <fpage>61</fpage>&#x02013;<lpage>78</lpage>.</citation></ref>
<ref id="B78">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stewart</surname> <given-names>F. J.</given-names></name> <name><surname>Ulloa</surname> <given-names>O.</given-names></name> <name><surname>Delong</surname> <given-names>E. F.</given-names></name></person-group> (<year>2012</year>). <article-title>Microbial metatranscriptomics in a permanent marine oxygen minimum zone</article-title>. <source>Environ. Microbiol</source>. <volume>14</volume>, <fpage>23</fpage>&#x02013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.1111/j.1462-2920.2010.02400.x</pub-id><pub-id pub-id-type="pmid">21210935</pub-id></citation></ref>
<ref id="B79">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Torsvik</surname> <given-names>V.</given-names></name> <name><surname>&#x000D8;vre&#x000E5;s</surname> <given-names>L.</given-names></name></person-group> (<year>2002</year>). <article-title>Microbial diversity and function in soil: from genes to ecosystems</article-title>. <source>Curr. Opin. Microbiol.</source> <volume>5</volume>, <fpage>240</fpage>&#x02013;<lpage>245</lpage>. <pub-id pub-id-type="doi">10.1016/S1369-5274(02)00324-7</pub-id><pub-id pub-id-type="pmid">12057676</pub-id></citation></ref>
<ref id="B80">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tringe</surname> <given-names>S. G.</given-names></name></person-group> (<year>2005</year>). <article-title>Comparative metagenomics of microbial communities</article-title>. <source>Science</source> <volume>308</volume>, <fpage>554</fpage>&#x02013;<lpage>557</lpage>. <pub-id pub-id-type="doi">10.1126/science.1107851</pub-id><pub-id pub-id-type="pmid">15845853</pub-id></citation></ref>
<ref id="B81">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uchiyama</surname> <given-names>T.</given-names></name> <name><surname>Abe</surname> <given-names>T.</given-names></name> <name><surname>Ikemura</surname> <given-names>T.</given-names></name> <name><surname>Watanabe</surname> <given-names>K.</given-names></name></person-group> (<year>2005</year>). <article-title>Substrate-induced gene-expression screening of environmental metagenome libraries for isolation of catabolic genes</article-title>. <source>Nat. Biotechnol.</source> <volume>23</volume>, <fpage>88</fpage>&#x02013;<lpage>93</lpage>. <pub-id pub-id-type="doi">10.1038/nbt1048</pub-id><pub-id pub-id-type="pmid">15608629</pub-id></citation></ref>
<ref id="B82">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uchiyama</surname> <given-names>T.</given-names></name> <name><surname>Miyazaki</surname> <given-names>K.</given-names></name></person-group> (<year>2010</year>). <article-title>Product-induced gene expression, a product-responsive reporter assay used to screen metagenomic libraries for enzyme-encoding genes</article-title>. <source>Appl. Environ. Microbiol</source>. <volume>76</volume>, <fpage>7029</fpage>&#x02013;<lpage>7035</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.00464-10</pub-id><pub-id pub-id-type="pmid">20833789</pub-id></citation></ref>
<ref id="B83">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vanet</surname> <given-names>A.</given-names></name> <name><surname>Marsan</surname> <given-names>L.</given-names></name> <name><surname>Sagot</surname> <given-names>M. F.</given-names></name></person-group> (<year>1999</year>). <article-title>Promoter sequences and algorithmical methods for identifying them</article-title>. <source>Res. Microbiol</source>. <volume>150</volume>, <fpage>779</fpage>&#x02013;<lpage>799</lpage>. <pub-id pub-id-type="doi">10.1016/S0923-2508(99)00115-1</pub-id><pub-id pub-id-type="pmid">10673015</pub-id></citation></ref>
<ref id="B84">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Venter</surname> <given-names>J. C.</given-names></name></person-group> (<year>2004</year>). <article-title>Environmental genome shotgun sequencing of the Sargasso Sea</article-title>. <source>Science</source> <volume>304</volume>, <fpage>66</fpage>&#x02013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1126/science.1093857</pub-id><pub-id pub-id-type="pmid">15001713</pub-id></citation></ref>
<ref id="B85">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vester</surname> <given-names>J. K.</given-names></name> <name><surname>Glaring</surname> <given-names>M. A.</given-names></name> <name><surname>Stougaard</surname> <given-names>P.</given-names></name></person-group> (<year>2015</year>). <article-title>Improved cultivation and metagenomics as new tools for bioprospecting in cold environments</article-title>. <source>Extremophiles</source> <volume>19</volume>, <fpage>17</fpage>&#x02013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1007/s00792-014-0704-3</pub-id><pub-id pub-id-type="pmid">25399309</pub-id></citation></ref>
<ref id="B86">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Williamson</surname> <given-names>L. L.</given-names></name> <name><surname>Borlee</surname> <given-names>B. R.</given-names></name> <name><surname>Schloss</surname> <given-names>P. D.</given-names></name> <name><surname>Guan</surname> <given-names>C.</given-names></name> <name><surname>Allen</surname> <given-names>H. K.</given-names></name> <name><surname>Handelsman</surname> <given-names>J.</given-names></name></person-group> (<year>2005</year>). <article-title>Intracellular screen to identify metagenomic clones that induce or inhibit a quorum-sensing biosensor</article-title>. <source>Appl. Environ. Microbiol</source>. <volume>71</volume>, <fpage>6335</fpage>&#x02013;<lpage>6344</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.71.10.6335-6344.2005</pub-id><pub-id pub-id-type="pmid">16204555</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported by the National Council for Technological and Scientific Development (CNPq, grant numbers 472893/2013-0 and 441833/2014-4) and by Young Research Awards from the S&#x000E3;o Paulo State Foundation (FAPESP, award numbers 2015/04309-1 and 2012/21922-8). CW and LA are beneficiaries of FAPESP fellowships (award numbers 2016/05472-6 and 2016/06323-4, respectively).</p></fn>
</fn-group>
</back>
</article>