Functional validation of putative toxin-antitoxin genes from the Gram-positive pathogen Streptococcus pneumoniae: phd-doc is the fourth bona-fide operon

Bacterial toxin-antitoxin (TA) loci usually consist of two genes organized as an operon, whose products are bound together and inert under normal conditions. However, under stressful circumstances the antitoxin, which is more labile, is degraded more rapidly, thereby unleashing its cognate toxin to act on the cell. This, in turn, causes cell stasis or cell death, depending on the type of TA and/or the time of toxin exposure. Based on previous in silico analyses, we proposed that Streptococcus pneumoniae, a pathogenic Gram-positive bacterium, may harbor between 4 and 10 putative TA loci, depending on the strain. Here we have chosen the pneumococcal strain Hungary19A-6, which contains all 10 possible TA loci. In addition to the three well-characterized operons, namely relBE2, yefM-yoeB, and pezAT, we show here the functionality of a fourth operon that encodes the pneumococcal equivalent of the phd-doc TA. Transcriptional fusions with the gene encoding Green Fluorescent Protein showed that the promoter was slightly repressed by the Phd antitoxin, and exhibited almost background values when Phd and Doc were expressed together. These findings demonstrate that phd-doc shows the negative self-regulatory features typical of an authentic TA. Further, we also show that the previously proposed TAs XreA-Ant and Bro-XreB, although they exhibit a genetic organization resembling that of typical TAs, did not behave as bona fide TAs. In addition, we report new bioinformatics results for the known pneumococcal TAs RelBE2 and PezAT. A global analysis of the four identified toxins-antitoxins in the pneumococcal genomes (PezAT, RelBE2, YefM-YoeB, and Phd-Doc) showed that RelBE2 and Phd-Doc are the most conserved ones. Further, there was good correlation among TA types, clonal complexes and sequence types in the 48 pneumococcal strains analyzed.


INTRODUCTION
Toxins-antitoxins (TAs) were discovered in the early 1980s, and their cellular functions have been studied extensively, especially in the past two decades. Nevertheless, the world of TAs remains perplexing. TAs are found abundantly in prokaryotes, especially in free-living bacteria, as well as in archaea, and probably in fungi but, so far, not in other eukaryotes (Pandey and Gerdes, 2005). In general, a TA locus consists of two genes organized as an operon. The antitoxin gene, which generally precedes the toxin gene, encodes either an untranslated RNA or a labile protein that neutralizes the toxicity of its cognate toxin, whereas the toxin gene encodes a toxin protein that binds to its cellular targets and halts essential cell processes. Under certain circumstances such as stress, the system is triggered by rapid degradation of the antitoxin, thus liberating the toxin to act on its target. Most known toxins function as RNases (Christensen et al., 2001; Nariya and Inouye, 2008; Jørgensen et al., 2009; Yamaguchi and Inouye, 2009), whereas other toxins target essential cellular components such as DNA gyrase (Van Melderen, 2002), the cell wall (Mutschler et al., 2011), and the elongation factor EF-Tu (Castro-Roa et al., 2013). TAs have been classified into five types, depending on how the antitoxin counteracts the toxin. Type II TAs, in which the antitoxin is a protein that binds avidly to the toxin protein to form an inert complex, are the most widely studied (Makarova et al., 2009; Leplae et al., 2011).
Abbreviations: TAs, toxins-antitoxins; SD, Shine-Dalgarno; ant, antirepressor; bro, baculovirus repeated orfs; GFP, Green Fluorescent Protein; OD, optical density; STs, sequence types; ICEs, integrative and conjugative elements; MLST, multilocus sequence typing; CCs, clonal complexes; Fic, filamentation induced by cyclic AMP.
When found on plasmids, TAs function primarily in the stable maintenance of the plasmid by post-segregationally killing any plasmid-free daughter cells that arise. The functions of the chromosomally encoded bacterial TAs are less clear. They might not be essential, but they have been reported to be involved in a variety of cellular processes related to global stress responses (Christensen et al., 2001), programmed cell death (Engelberg-Kulka and Glaser, 1999), maintenance of mobilomes (Rowe-Magnus et al., 2003; Szekeres et al., 2007), persistence (Gerdes and Maisonneuve, 2012), biofilm formation (Harrison et al., 2009; Soo and Wood, 2013), niche colonization (Norton and Mulvey, 2012), virulence (Ren et al., 2012), phage abortive infection systems (Fineran et al., 2009; Dy et al., 2014), and other important cellular processes (Goeders and Van Melderen, 2014). Some bacterial genomes are rich in TAs, e.g., Mycobacterium tuberculosis encodes 30 functional TAs (Ramage et al., 2009), which makes their roles difficult to disentangle. The ability of TAs to impart growth control may contribute to the slow growth and dormant state that are the hallmarks of latent tuberculosis infection (Schifano and Woychik, 2014), but the mechanisms involved, and how many TAs may be responsible for these phenotypic traits, are still unknown. It is also unclear whether the TAs within the same host can cross-talk, although some studies have shown that expression of one TA might activate another (Garcia-Pino et al., 2008). Whether numerous TAs provide a cumulative or synergistic effect to the cells is worth pondering.
Streptococcus pneumoniae (the pneumococcus) is a Gram-positive bacterium that is pathogenic for humans, responsible for both respiratory tract infections and invasive infections, and associated with significant morbidity and mortality (Chan et al., 2012). Furthermore, clinical isolates of pneumococci show a high degree of variability, probably due to the recombinogenic nature of this bacterium (Claverys et al., 2006; Baquero, 2009). We had previously identified up to 10 putative Type II pneumococcal TAs based on in silico data mining (Chan et al., 2012), and three of them had already been proven to be functional, namely relBE2, yefM-yoeB, and pezAT (Chan et al., 2013). Here, we have undertaken the validation and characterization of the remaining putative pneumococcal TAs. We show that S. pneumoniae encodes a fourth functional TA operon, phd-doc, which corresponds to a bona fide TA. Further, we experimentally ruled out some of the predicted pneumococcal gene pairs as typical TAs, since some of these putative toxins turned out to be either non-toxic, or possibly toxic but lacking a cognate antitoxin gene.

DNA MANIPULATIONS, SEQUENCING, AND SEQUENCE DATA ANALYSIS
DNA manipulations and other molecular biology techniques were done following standard protocols (Sambrook and Russel, 2001). S. pneumoniae genomic DNA was extracted using the Bacterial Genomic DNA Isolation Kit (Norgen Biotech Corp.). Plasmid DNA from both S. pneumoniae and E. coli was isolated using the High Pure Plasmid Isolation Kit (Roche); for S. pneumoniae, the plasmid extraction protocol was slightly modified as described (Ruiz-Cruz et al., 2010). DNA fragments and PCR products were purified with NZYGelpure (NZYTech). Automated Sanger sequencing was performed at Secugen S.L., Centro de Investigaciones Biológicas, CSIC, Madrid. DNA sequences were analyzed using the BioEdit Sequence Alignment Editor version 7.0.4.1 (Hall, 1999).

CONSTRUCTION OF RECOMBINANT PLASMIDS WITH S. PNEUMONIAE HUNGARY19A-6 PUTATIVE TAs
All plasmids and primers used in this study are shown in Table 1, and the details are described below.

pFUS2SD and derivatives
The arabinose-inducible plasmid pFUS2 (Lemonnier et al., 2003) was used to construct recombinant plasmids carrying the seven putative toxin genes from S. pneumoniae Hungary19A-6 (as described above), in order to validate their functionality in the heterologous host E. coli.

Plasmids: descriptions and primers used
pFUS2 (Lemonnier et al., 2003): A 4.5 kb vector with ori-pBR322 which harbors an arabinose-inducible PBAD promoter upstream of a multiple cloning site.
pFUS2SD: An SD consensus AGGAGG was inserted downstream of the PBAD promoter and upstream of the EcoRI restriction site of the pFUS2 plasmid. This recombinant plasmid was used for the cloning of putative toxins from S. pneumoniae Hungary19A-6 and subsequently for overexpression assays in E. coli Top10.
The underlined sequences were the restriction sites used for cloning.
Since TA operons are usually co-transcribed and some of the toxin genes located downstream of the antitoxin genes do not have an obvious SD sequence, we constructed plasmid pFUS2SD (this study) by inserting an SD consensus sequence (AGGAGG) downstream of the arabinose-inducible PBAD promoter of pFUS2 by reverse PCR followed by self-ligation. This recombinant plasmid was subsequently used for the cloning of the seven putative toxin genes from strain Hungary19A-6. In addition, the known toxin gene yoeB (GI: 15903627) from S. pneumoniae R6 was also cloned into pFUS2SD to serve as a positive control (Chan et al., 2011). All these genes were inserted at the EcoRI restriction site downstream of the SD sequence, so that the start codon of each gene was 6 bp away from the SD sequence. The recombinant plasmids were transformed into E. coli Top10 (Sambrook and Russel, 2001) and used to conduct overexpression assays.

pAST derivatives
To verify the promoter activity and to characterize the transcriptional regulation of the promoter by pneumococcal Phd and Phd-Doc, transcriptional fusions were constructed to measure Green Fluorescent Protein (GFP) fluorescence. Plasmid pAST (Ruiz-Cruz et al., 2010), which harbors a promoter-less gfp gene, was used for this study. Three fragments were each inserted upstream of the gfp gene of pAST: the putative promoter of the phd-doc operon, the phd gene along with its promoter, and the phd-doc genes along with their promoter. All the recombinant plasmids were transformed into S. pneumoniae R6, and promoter activities were then evaluated by measuring GFP fluorescence.

OVEREXPRESSION ASSAYS OF PUTATIVE TAs IN S. PNEUMONIAE AND E. COLI
Cell cultures of S. pneumoniae R6 harboring the different recombinant plasmids were grown in AGCH medium supplemented with 0.3% sucrose and 0.2% yeast extract (Lacks et al., 1986) to an optical density at 650 nm (OD650) of ∼0.3 and subsequently subcultured to an OD650 of ∼0.02 in the same medium with the addition of 0.2% maltose to induce promoter PM. The OD of the cultures was measured every hour for 8 h. E. coli Top10 cells carrying the different recombinant plasmids were inoculated in TY medium (Maniatis et al., 1982) supplemented with 0.4% glucose to repress expression from the PBAD promoter, and were grown to an OD600 of ∼0.3. Cells were harvested and washed twice with TY medium, resuspended in TY medium to an OD600 of ∼0.02, and then induced with 0.4% arabinose. The assays were repeated three times, and the mean values and standard errors were calculated.
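The summary statistics used for the growth curves can be sketched as follows. This is only a minimal illustration, assuming three replicate OD readings per time point and taking the standard error as the sample standard deviation divided by the square root of the number of replicates; the function name and the readings are hypothetical placeholders, not data from this study.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate readings."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical OD readings: three replicates per hourly time point.
replicates_by_hour = {
    0: [0.020, 0.021, 0.019],
    1: [0.040, 0.044, 0.042],
    2: [0.081, 0.079, 0.086],
}

for hour, readings in sorted(replicates_by_hour.items()):
    mean, sem = mean_and_sem(readings)
    print(f"{hour} h: mean OD = {mean:.3f}, SEM = {sem:.4f}")
```

Plotting the mean with the standard error as error bars at each time point would give growth curves of the kind summarized in Figure 1.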

GFP FLUORESCENCE ASSAYS
To characterize the promoter activities of pneumococcal phd-doc, fluorescence assays were conducted as previously described (Ruiz-Cruz et al., 2010).
PHAST (Zhou et al., 2011) was used to identify TAs in prophage sequences within pneumococcal genomes. For strains whose sequence types (STs) were not directly available, the sequences of the seven loci used in the multilocus sequence typing (MLST) scheme (Enright and Spratt, 1998) were retrieved from genomic data and submitted to the online database http://pubmlst.org/spneumoniae/ (28th August 2014, date last accessed) to obtain allele numbers and STs. The position of the STs of all strains was then analyzed within the population snapshot constructed from the content of the whole MLST database (28th August 2014, date last accessed) using the eBURST approach (Feil et al., 2004) at http://eburst.mlst.net/, and the resulting clonal complexes (CCs) were named after the predicted founder of the group. The largest observed CC, which includes ST156 as the proposed primary founder, was additionally divided into lineages named according to the predicted secondary founders.
Between 4 and 10 putative TAs were found in our previous in silico analysis of S. pneumoniae genomes (Chan et al., 2012). The genes of the putative TAs analyzed here were derived from strain Hungary19A-6. It is a pathogenic strain that harbors all 10 putative TA pairs, thus covering most (if not all) of the putative pneumococcal TAs reported to date. Three of the 10 TA operons, namely relBE2, yefM-yoeB and pezAT of strain R6, had previously been shown to encode functional TAs in S. pneumoniae and also in E. coli (Nieto et al., 2006, 2007; Khoo et al., 2007). Thus, these three pairs were not included here, with the exception of the yoeB toxin gene, which was used as a positive control for the overexpression assays in the E. coli host.

VERIFICATION OF THE PUTATIVE TAs FROM S. PNEUMONIAE
The functionality of the seven putative toxins was first analyzed in their natural host S. pneumoniae, and those that retarded cell growth were further analyzed together with their cognate antitoxins. In the case of the pneumococcal doc gene, the pLS1ROM-MCS construct harboring doc carried a mutation in the SD sequence, in which AGGAGG was changed to AGGCGG. However, this mutation did not prevent the Doc protein from exerting its toxicity on the cell. Overproduction of Doc strongly retarded the growth of S. pneumoniae, although growth resumed after 6 h (Figure 1A). Nonetheless, its toxicity was clearly neutralized by co-expression of its cognate antitoxin Phd in cis (Figure 1B). This finding demonstrated that the pneumococcal phd-doc pair constitutes a bona fide functional TA. Overproduction of Ant or Bro also inhibited cell growth, although both were less toxic than Doc (Figure 1A). Growth of S. pneumoniae cells resumed a few hours after induction of the toxins (Figure 1A), indicating that the toxicity was transient. Unexpectedly, co-expression of their presumptive cognate antitoxins XreA and XreB did not neutralize the toxicity of Ant and Bro, respectively (Figure 1B). Expression of XreA or XreB alone was not toxic to the cells but, intriguingly, the pairs XreA-Ant and Bro-XreB seemed to be slightly more toxic to the cells than the Ant and Bro toxins alone (Figure 1B). In contrast, overproduction of the remaining putative pneumococcal toxins HicA, RelE1, COG2856CA, and COG2856B had no effect on the growth of S. pneumoniae (Figure 1A).
In the heterologous host E. coli, as in the native host S. pneumoniae, overexpression of HicA, RelE1, COG2856CA, and COG2856B had no effect on the cells (Figure 1C). We were unable to obtain the recombinant pFUS2SD plasmid carrying the putative pneumococcal toxin gene doc, most likely because of its high toxicity to the host. As observed for YoeB, overexpression of the Ant and Bro proteins suppressed the growth of E. coli; moreover, in contrast to S. pneumoniae, the cultures showed no sign of resuming growth even after 8 h of incubation, indicating that both proteins were very toxic to this host (Figure 1C).

FIGURE 1 | Overexpression of pneumococcal putative TAs. (A) Overexpression of pneumococcal putative toxins in the homologous host S. pneumoniae showed that HicA, RelE1, COG2856CA, and COG2856C were not harmful to the cells, whereas overexpression of Doc, Ant and Bro inhibited cell growth, which resumed slowly after a few hours. (B) Since Doc, Ant and Bro were toxic to the cells, they were further analyzed to determine whether their putative cognate antitoxins were also functional in the S. pneumoniae host. For Doc, co-expression of its cognate antitoxin Phd in cis neutralized the toxicity of Doc, signifying that Phd-Doc is a functional TA. On the other hand, overexpression of Ant and Bro repressed the growth of the cells, but co-expression of their putative cognate antitoxins XreA and XreB, respectively, did not neutralize the toxic effect of the toxins and was instead more toxic to the cells. Note that XreA or XreB alone was inert to the cells. Thus, we have ruled out XreA-Ant and Bro-XreB as typical TAs. (C) Overexpression assays in the heterologous host E. coli. As in the natural host S. pneumoniae, overexpression of the pneumococcal putative toxins HicA, RelE1, COG2856CA, and COG2856C did not affect cell growth. We were not able to clone the doc toxin gene in the pFUS2SD plasmid, likely due to its high toxicity. YoeB from strain R6, which is a known functional pneumococcal toxin (Nieto et al., 2007; Chan et al., 2011), was used as a positive control (pFUS2SD_YoeB), whereas pFUS2SD without insert served as a negative control. As with the pneumococcal YoeB toxin, overexpression of pneumococcal Ant and Bro inhibited growth of E. coli and no recovery was observed even after 8 h of incubation, in contrast to the results in S. pneumoniae, in which cell growth resumed a few hours after toxin overproduction.
We conclude that, of the seven gene pairs investigated here, only Phd-Doc was demonstrated to be a bona fide pneumococcal TA. Overexpression of Ant or Bro inhibited growth of both S. pneumoniae and E. coli; however, the products of their adjacent genes, XreA and XreB, respectively, were unable to counteract their toxicity. Therefore, we do not classify them as typical TA pairs. Accordingly, we have re-evaluated the number of functional bona fide TAs for the 48 pneumococcal strains whose genomic sequences are available in the NCBI genome databases (Table 2, see below).

THE PNEUMOCOCCAL PHD-DOC: THE FOURTH FUNCTIONAL TA PAIR IN S. PNEUMONIAE
As mentioned above, the pneumococcal Doc was shown to be toxic to its natural host, and co-expression of the cognate Phd antitoxin was able to counteract the toxicity of Doc. The pneumococcal phd-doc gene pair was present in a single copy in almost all the genomes of pneumococcal strains examined (43 out of 48 strains; Table 2) (Chan et al., 2012). The phd-doc genes are structured as an operon, which was consistently surrounded by genes encoding Type I restriction-modification systems; further, a gene that encodes a putative integrase/recombinase was also conserved downstream of doc (Figure 2A). However, the genomic context of phd-doc was not entirely conserved since, in some strains, transposases were also evident further downstream. The pneumococcal phd antitoxin and doc toxin genes overlap by four nucleotides, and both genes are likely co-transcribed from a single promoter located upstream of phd (Figure 2A). An inverted repeat of 8 nt (GTAA·TTAC) was identified, which partly covers the −35 consensus sequence (Figure 2A). This differs from the promoter of phd-doc of bacteriophage P1, in which two palindromic sequences (TGTGT·ACACA and CGAGT·ACACG) are located between the −10 consensus sequence and the start codon of Phd, with one of the palindromes overlapping the SD (Magnuson et al., 1996). We postulate that the inverted repeat functions as the operator to which the pneumococcal Phd antitoxin binds to repress its own transcription, as in other TAs (Magnuson et al., 1996; Tian et al., 2001; Kedzierska et al., 2007). To test the regulation of transcription of pneumococcal phd-doc, we constructed a set of pneumococcal plasmids based on the pAST promoter-probe vector that harbors the gfp gene (Ruiz-Cruz et al., 2010). These transcriptional fusions allowed us to measure the activity of the phd-doc promoter.
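The inverted repeats quoted above can be checked with a short script. The sketch below is illustrative only and is not the procedure used to identify the repeats: it assumes that each repeat is given as two arms of equal length and counts the positions at which the right arm differs from the reverse complement of the left arm. The function names are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def arm_mismatches(left_arm, right_arm):
    """Count positions where right_arm differs from the reverse
    complement of left_arm (0 means a perfect inverted repeat)."""
    if len(left_arm) != len(right_arm):
        raise ValueError("both arms must have the same length")
    expected = reverse_complement(left_arm)
    return sum(a != b for a, b in zip(expected, right_arm.upper()))


# Arms as quoted in the text (left arm, right arm).
repeats = {
    "pneumococcal phd-doc": ("GTAA", "TTAC"),
    "P1 phd-doc, first palindrome": ("TGTGT", "ACACA"),
    "P1 phd-doc, second palindrome": ("CGAGT", "ACACG"),
}

for name, (left, right) in repeats.items():
    print(f"{name}: {arm_mismatches(left, right)} mismatch(es)")
```

With the sequences as transcribed here, the pneumococcal repeat and the first P1 palindrome are perfect under this check, whereas the second P1 palindrome differs from a perfect inverted repeat at one position.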
In addition, to determine whether the Phd antitoxin and the Phd-Doc pair regulate their own transcription, either the antitoxin gene or the TA gene pair was fused to the promoter-less gene encoding GFP. Measurement of the levels of fluorescence showed that the promoter region alone resulted in GFP expression of ∼8 arbitrary units, which was slightly repressed by synthesis of the antitoxin (∼6 arbitrary units) and repressed further, to ∼1 arbitrary unit, by the joint production of the Phd-Doc pair, approaching the basal level of ∼0.1 arbitrary units (Figure 2B). In agreement with these results, observation of these cells under phase-contrast and fluorescence microscopy also showed that the fluorescence was not much different between cells harboring promoter-gfp and promoter-phd-gfp, whereas the fluorescence of cells harboring promoter-phd-doc-gfp was visibly lower (Figure 2C). The use of a GFP reporter to detect TA activation was previously shown for the Salmonella sehAB operon (De La Cruz et al., 2013). In our case, it has turned out to be a simple and reliable way to show auto-repression of the phd-doc operon. Further, it could be used for any TA to be tested and also to perform in vivo studies on the conditions that trigger a particular TA. To our knowledge, this is the first time that this strategy has been applied to a Gram-positive bacterium. Like other Doc homologs, the pneumococcal Doc has a Fic (filamentation induced by cyclic AMP) domain (Utsumi et al., 1982) at the N-terminal moiety. In general, the conserved Fic motif, HxFx(D/E)GNGRxxR (Engel et al., 2012; Goepfert et al., 2013), possesses adenylylation (or "AMPylation") activity (Itzen et al., 2011; Woolery et al., 2012). By comparison, the consensus residues of the Fic motifs in all Doc toxins are slightly different, HxFx(D/N)(A/G)NKR (Cruz et al., 2014), as is the pneumococcal Fic motif HVFANANKR.
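The two consensus motifs above translate directly into regular expressions, which makes it easy to see that the pneumococcal motif fits the Doc-type consensus and not the canonical one. The following is a minimal sketch, assuming that "x" stands for any residue and that bracketed alternatives occupy a single position; the function name and the canonical test peptide are hypothetical and serve only as illustration.

```python
import re

# "x" in the consensus = any residue; (D/E) etc. = alternatives at one position.
CANONICAL_FIC = re.compile(r"H.F.[DE]GNGR..R")   # HxFx(D/E)GNGRxxR
DOC_FIC = re.compile(r"H.F.[DN][AG]NKR")         # HxFx(D/N)(A/G)NKR


def classify_fic_motif(sequence):
    """Return the list of Fic consensus types found in a protein sequence."""
    seq = sequence.upper()
    labels = []
    if CANONICAL_FIC.search(seq):
        labels.append("canonical Fic")
    if DOC_FIC.search(seq):
        labels.append("Doc-type Fic")
    return labels


# Pneumococcal Doc motif as quoted in the text.
print(classify_fic_motif("HVFANANKR"))
# Hypothetical peptide built to satisfy the canonical consensus.
print(classify_fic_motif("HPFREGNGRTQR"))
```

The same patterns could be applied to full-length protein sequences, since `search` scans for the motif at any position.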
The target of the Doc toxin from bacteriophage P1 (Magnuson and Yarmolinsky, 1998) was revised recently: Doc was reported to be a new type of protein kinase [instead of AMPylating like other Fic proteins (Kinch et al., 2009) or obstructing ribosomes (Liu et al., 2008)] that inhibits bacterial translation by phosphorylating the conserved threonine (Thr382) of the translation elongation factor EF-Tu, thereby rendering it unable to bind aminoacylated tRNAs (Castro-Roa et al., 2013; Cruz et al., 2014). Interestingly, it was also demonstrated that the Phd antitoxin can only block the de novo phosphorylation of EF-Tu, but cannot revert it, and that dephosphorylation of EF-Tu was probably catalyzed by the Doc toxin itself (Castro-Roa et al., 2013). This finding is consistent with our observation that the growth inhibition mediated by the pneumococcal Doc toxin was slowly relieved a few hours after overexpression (Figure 1B). It has been shown that ectopic overexpression of Doc in E. coli might lead to Lon-dependent activation of RelE, allowing it to exert its mRNA cleavage activity (Garcia-Pino et al., 2008). Since S. pneumoniae harbors both RelBE and Phd-Doc TAs, it would be interesting to find out whether both TAs could cross-talk as described by Garcia-Pino et al. (2008).
Frontiers in Microbiology | Evolutionary and Genomic Microbiology, December 2014 | Volume 5 | Article 677

MORPHOLOGY OF E. COLI AND S. PNEUMONIAE CELLS AFTER TOXIN OVEREXPRESSION
In many cases where overexpressed genes cause a toxic effect, the bacterial cells change their morphology and may exhibit filamentation or other gross changes in cell shape (Scott et al., 2010). To test whether overexpression of the various toxins analyzed here had any influence on the cell morphology of S. pneumoniae and E. coli, cultures were subjected to toxin overexpression under the above conditions, and cells were observed using phase-contrast microscopy. In S. pneumoniae, no prominent morphological changes were observed in cells that overexpressed any of the different toxins (Supplementary Figure S1). In E. coli, on the other hand, filamentation was observed when Ant or Bro was overexpressed, indicating that the cells were not dividing properly and thus cell growth was impaired (Figures 3C,D). The filamentation was not reversible for at least 8 h, as the inhibition of cell growth by Ant or Bro showed no sign of being relieved even after 8 h (Figure 1C). Overexpression of the YoeB toxin, which is likely an endoribonuclease like its E. coli homolog (Christensen et al., 2004; Zhang and Inouye, 2009), did not result in obvious differences in morphology when compared to the wild type (Figures 3A,B).

The relBE1 operon
A previous study showed that RelE1 of S. pneumoniae had no RNase activity (Christensen and Gerdes, 2003), but the cell growth profile after overexpression of RelE1 was not assessed. Here, we confirmed that expression of RelE1 was not detrimental to its homologous host or to the heterologous host E. coli (Figure 1), even though RelE1 shares 52% similarity with RelE2 (in strain Hungary19A-6), which has been demonstrated to be functional (Nieto et al., 2006). In spite of the high conservation of both relB1 and relE1 in all the strains (not shown), two (R65 and R85) of the five crucial residues responsible for RelE toxicity in Pyrococcus horikoshii were missing in the pneumococcal RelE1 (Takagi et al., 2005; Nieto et al., 2010; [...] et al., 2012). In addition, truncation of relE1 was observed in some strains, such as ATCC 700669 and 70585. In general, the gene arrangement around the relBE1 genes was similar in all the strains, i.e., these genes were flanked upstream by genes that encode DNA polymerase III, elongation factor G and 30S ribosomal proteins, whereas the downstream genes encode a possible aminopeptidase, an integral membrane protein and ribosomal small subunit pseudouridine synthase. In some instances, transposases were identified further upstream of relB1 (strains ATCC 700669 and INV104). Conservation of the relBE1 gene pair in many of the pneumococcal genomes could point to a function not related to type II TAs, but this remains to be investigated.
FIGURE 3 | [...] pFUS2SD_Bro, were examined after 8 h of overexpression. Filamented cells were observed in cultures overexpressing Ant or Bro, but no prominent differences were found for cells overexpressing YoeB when compared to wild type.

The gene pairs xreA-ant and bro-xreB
These two gene pairs were previously proposed as putative TAs in bioinformatics searches of the S. pneumoniae genome (Makarova et al., 1999; Chan et al., 2012). Overexpression of the pneumococcal Ant and Bro proteins inhibited cell growth in both S. pneumoniae and E. coli, but co-expression of their respective putative cognate genes xreA and xreB did not counteract their toxicity. However, we do not rule out the possibility that Ant and Bro are solitary toxins, similar to MazF-mx of M. xanthus, which lacks a co-transcribed antitoxin (Nariya and Inouye, 2008). MrpC, which is expressed from another part of the M. xanthus genome, serves as an antitoxin by forming complexes with MazF-mx, and it also positively regulates MazF-mx expression (Nariya and Inouye, 2008). We used a web server designed to identify prophages (PHAST; Zhou et al., 2011) to determine whether the pneumococcal xreA-ant and bro-xreB gene pairs are located within pneumococcal prophages. We found that they are located within a 94.4 kb intact prophage, termed Streptococcus phage MM1, of S. pneumoniae strain Hungary19A-6 (Supplementary material). Analysis with Pfam showed that Ant and Bro each have two domains: Ant (232 amino acids) has an AntA (AntA/AntB antirepressor) domain at the N-terminal moiety and a phage antirepressor KilAC domain at the C-terminal moiety (Figure 4A), whereas Bro (237 amino acids) has a Bro-N domain at the N-terminus and an ORF6C (C-terminal of bacteriophage bIL2850 ORF6) (Iyer et al., 2002) domain at the C-terminus (Figure 4B). Both KilA and Bro domains are widely prevalent in bacteriophages, bacteria, and eukaryotic viruses of the nucleo-cytoplasmic large DNA virus and baculovirus classes (Iyer et al., 2002). A study by Kuan et al. showed that the protein ANT8 (ORF8) of the lytic corynephage P1201, isolated from Corynebacterium glutamicum, harbors a Bro domain at the N-terminus and an antirepressor domain at the C-terminal end (http://research.nchu.edu.tw/upfiles/ADUpload/oc downmul2272130635.pdf). In this case, the DNA-binding ability of ANT8 was contributed by the Bro domain, whereas overexpression of the antirepressor domain inhibited E. coli cell growth and also killed the cells, as shown by a decrease in colony-forming units. Under microscopic examination, the cells that overexpressed the ANT8 protein became filamentous, with irregular multiple nucleoids, indicating inhibition of cell division; this result is in agreement with our findings.

GENOMIC CONTEXT OF THE FUNCTIONAL PNEUMOCOCCAL TAs relBE2 AND pezAT, AND THEIR DISTRIBUTION IN S. PNEUMONIAE POPULATION
The distribution of functional TAs showed a very good correlation with pneumococcal STs and CCs (Table 2). As described in our previous report, RelBE2 is present in all 48 pneumococcal database strains, seven of which have two copies. Moreover, six different gene organizations were found for this TA (Chan et al., 2012), which fall into two groups encompassing the previously defined Types I-IV and Types V-VI (Chan et al., 2012; Figure 5A). The latter are flanked by xre upstream and COG2856B downstream, an arrangement that is not observed in Types I-IV. RelE2 of Types V and VI also clustered together in the phylogenetic analyses (Figure 5B). In addition, in strains with two copies of RelBE2, one copy belonged to Types I-IV whereas the second copy belonged to either Type V or VI. Integrases and IS elements were evident in Types III-VI. The particular gene arrangement also showed very good agreement with STs and CCs, with a single exception in the Taiwan19F-14 international clone (Table 2). All these observations suggest that the second copy of relBE2 was likely acquired horizontally by some strains rather than being a product of gene duplication.
In our previous bioinformatics search, 28 out of the 48 annotated pneumococcal strains harbored a single copy of pezAT, whereas three strains harbored two copies (Chan et al., 2012). The two copies of pezAT within the same strain were almost identical (similarities of 96-97%), and the catalytically important residues (Meinhart et al., 2003) were also conserved. The gene organization of pezAT among the various S. pneumoniae strains is even more varied than that of relBE2. This TA is flanked by a number of genes that contribute to site-specific recombination and even genes that play roles in conjugative transfer. The pezAT locus is known to be present on a genomic island known as the pneumococcal pathogenicity island 1 (Meinhart et al., 2003; Nieto et al., 2006) but, interestingly, a search using ICEberg, a web-based resource for bacterial ICEs, led us to discover the presence of a previously unreported pezAT operon in a putative pneumococcal ICE designated Tn5253 (Ayoubi et al., 1991). This was corroborated when a recent analysis of the complete sequence of Tn5253 from S. pneumoniae DP1322 indicated the presence of two copies of pezAT (Iannelli et al., 2014) [...] of a 1169 bp direct repeat that flanks the cat(pC194) element within Tn5253. Experimental evidence showed that excision of cat(pC194) from Tn5253 occurred by a recombination event within the pezAT-containing repeat segment, leaving one copy in Tn5253 and another copy in the newly excised plasmid pC194 (Iannelli et al., 2014). In addition, ectopic integration of cat(pC194) into other regions of the S. pneumoniae genome was also detected, likely by homologous recombination involving the pezAT-containing direct repeat segments (Iannelli et al., 2014). Thus, pezAT appeared to play a direct role in the transfer of the cat(pC194) element and may also help in the stable maintenance of this element in its hosts. It is therefore not surprising that phylogenetic analyses showed that PezAT found in Tn5253 formed a separate cluster from PezAT located in the pneumococcal pathogenicity island 1 (Supplementary Figure S2 and Table S1).
FIGURE 5 | [...] In general, the relBE2 operon was flanked upstream by vicX (metal-dependent hydrolase), whereas a RUP (repeat unit of pneumococcus) element, ldh (lactate dehydrogenase), and gyrA (the A subunit of DNA gyrase) are located downstream for Types I-IV. For Types V-VI, the relBE2 cassette is flanked by xre-like protein upstream and COG2856B downstream. Mobile elements are seen in Types III-VI. Other abbreviations used: restriction-modification (RM); restriction-endonuclease (RE); hypothetical protein (HP). (B) Phylogenetic analyses of RelE2 homologs with the neighbor-joining algorithm, in which the determined start codon of relE2 was considered, showed that the Type V and Type VI gene organizations of RelE2 cluster together, indicating that RelE2 sequences from both types are more similar to each other. The type of relBE2 gene organization is indicated. For the strain Canada MDR 19F, the type of gene arrangement is unknown due to data unavailability in the NCBI databases.

CONCLUSIONS
Previous searches for putative type II TAs in S. pneumoniae (Chan et al., 2012, 2013) led us to conclude that their number was greatly underestimated. However, our present in vivo validation has shown that only four of the TAs, namely RelBE2 (Nieto et al., 2006), YefM-YoeB (Nieto et al., 2007; Chan et al., 2011), PezAT (Mutschler et al., 2011) and now Phd-Doc, can be considered bona fide TAs in the "classical" sense that the harmful effects of the toxins must be counteracted by their cognate antitoxins and that their two genes should be organized as an operon (Gerdes, 2013). Bioinformatics search approaches have helped us to identify various putative TA homologs; however, the actual number of bona fide TAs might be overestimated, as in our case: pneumococcal relBE2 is a functional TA but pneumococcal relBE1 apparently is not. In the case of M. tuberculosis, of the 88 putative TAs identified, only 30 are functional (Ramage et al., 2009). Other approaches, such as the shotgun cloning presented by Sberro et al. (2013), could be an alternative and more thorough way to identify TAs. Nevertheless, there might be more than these four TAs if we consider Ant and Bro as possible solo toxins whose respective antitoxins might be located elsewhere in the pneumococcal chromosome. The finding that these two novel proteins might be associated with lysogenic bacteriophages is, indeed, appealing and merits an in-depth study of their features.