Genetic islands in pome fruit pathogenic and non-pathogenic Erwinia species and related plasmids

New pathogenic bacteria of the genus Erwinia associated with pome fruit trees (Erwinia pyrifoliae, E. piriflorinigrans, E. uzenensis) have been described with increasing frequency in recent years, and comparative analyses have found that all these species share several genetic characteristics. Studies at different levels (whole-genome comparison, virulence genes, plasmid content, etc.) show high intraspecies homogeneity (i.e., among E. amylovora strains) as well as abundant similarities between the different Erwinia species: plasmids of similar size in the pathogenic species, and high similarity in several chromosomal genes associated with exopolysaccharide production, and hence with virulence, as well as in some other genes. Many genetic similarities have also been observed between some of the plasmids (and genomes) of the pathogenic species and those of E. tasmaniensis or E. billingiae, two epiphytic species found on the same hosts. The genetic material shared within this genus ranges from individual genes to clusters, genomic islands, and material that may even constitute a whole plasmid. Recent research on the evolution of erwinias points to the acquisition by horizontal transfer of some genomic islands that were subsequently lost in some species, and of several pathogenic traits that are still present. How this common material was obtained and is efficiently maintained in different species of the same genus sharing a common ecological niche offers insight into the origin and evolution of the pathogenic Erwinia species, their interaction with the non-pathogenic species present in the same niche, and the role of the genes conserved in all of them.


Introduction
The genus Erwinia belongs to the Enterobacteriaceae family and essentially comprises plant-associated bacteria that are pathogenic or epiphytic on pome fruit trees (Palacio-Bielsa et al., 2011). The most important species is Erwinia amylovora, the causal agent of fire blight on rosaceous hosts, which is present worldwide and causes very high economic losses (Bonn and van der Zwet, 2000). Other pathogenic Erwinia species have been described in recent decades: Erwinia pyrifoliae, a pathogen of Asian pear isolated in South Korea (Rhim et al., 1999; Shrestha et al., 2003); E. piriflorinigrans, isolated in 1999 in Spain, which causes necrosis of pear blossoms; and Erwinia uzenensis from Japan, which produces bacterial black shoot disease (BBSDP) on European pear (Matsuura et al., 2012). Other related Erwinia species, E. tasmaniensis and E. billingiae, are epiphytes on the same hosts. All these species are genetically and phenotypically closely related, although they can be distinguished by taxonomic criteria (Mergaert et al., 1999; Kube et al., 2008; Palacio-Bielsa et al., 2011).
In recent years, several sequencing projects have been carried out that included all the Erwinia species except E. uzenensis. All have provided interesting clues about the relationships among these organisms and the exchange of genetic material (Sundin, 2007; Kube et al., 2008, 2010; Sebaihia et al., 2010; Smits et al., 2010a,b; Powney et al., 2011; Smits et al., 2013; Ismail et al., 2014; Smits et al., 2014). Because E. amylovora is a very important pathogen, most of the information relates to it. Genetic studies have divided E. amylovora strains into two major groups with different host ranges: strains isolated from Spiraeoideae and strains isolated from Rosoideae (Rubus spp.; Mann et al., 2013). The genomes of Spiraeoideae-infecting strains are highly homogeneous, and greater diversity was observed between Spiraeoideae- and Rubus-infecting strains, most of which was attributed to variable genomic islands (Triplett et al., 2006; Smits et al., 2010b).
Comparative genomics of E. amylovora strains from different origins showed that the pan-genome is highly conserved relative to that of other phytopathogenic bacterial species, with 99.99% identity at the nucleotide level (Smits et al., 2010b; Mann et al., 2013). The genomes of the two sequenced E. pyrifoliae strains, Ep1/96 and DSM 12163 (Ep16/99), are almost identical, whereas the two saprophytic species are distantly related to the pathogenic species, with E. tasmaniensis more closely related than E. billingiae (Kube et al., 2010; Smits et al., 2010a). Comparison of the genomes of Japanese (Ejp617) and Korean (Ep1/96) E. pyrifoliae strains revealed a high level of genome conservation (more than 95% nucleotide sequence identity) despite numerous insertion/deletion rearrangements and inversions associated with insertion sequences (IS). The differences lie mainly in transposases, phage-related genes, and single genes (Smits et al., 2010a; Thapa et al., 2013). Genes acquired by horizontal gene transfer (HGT) are introduced via mobile genetic elements (MGEs) and incorporated into the chromosome by homologous or illegitimate recombination.
Other observed characteristics relate to genome size. Differences in size between the E. pyrifoliae and E. tasmaniensis genomes are due to the insertion, in the E. pyrifoliae genome, of MGEs that encode transposases, integrases, and phage-related proteins. The high number of mobile elements in Ep1/96 may suggest frequent genomic changes and a higher rate of evolution (Smits et al., 2010a; Thapa et al., 2013).
Ancestral origins have been found for several virulence factors. The two major virulence determinants required for E. amylovora to infect and cause disease are the genes involved in amylovoran biosynthesis and the type III secretion systems (T3SS). Other genes that could have been acquired after the divergence of the pathogenic species are the flagellar genes and the type VI secretion systems (T6SS; Smits et al., 2011; Mann et al., 2013). In this review, I will discuss the presence of transfer elements, ranging from individual genes to entire plasmids, and how these genetic transfers contribute to the emergence of characteristics such as pathogenicity, virulence, and fitness in the pome fruit erwinias.

Exopolysaccharide Biosynthesis
Exopolysaccharide (EPS) is a pathogenicity factor that contributes to biofilm formation in E. amylovora (Koczan et al., 2009) and is encoded by the ams operon. This gene cluster is present in the genomes of E. amylovora, E. pyrifoliae, and E. piriflorinigrans (Bernhard et al., 1996; Kube et al., 2008). In the sequenced E. tasmaniensis and E. billingiae genomes these genes are present in a different cluster (cps) yielding an EPS more closely related to stewartan of Pantoea stewartii subsp. stewartii (Coplin et al., 1996; Kim et al., 2002; Smits et al., 2010a; Malnoy et al., 2012). One hypothesis is that an Erwinia ancestor produced an EPS similar to stewartan of P. stewartii (Coplin et al., 1996; Kube et al., 2008) and that the differentiation took place at or after the separation of the pathogenic Erwinia from E. tasmaniensis. This would indicate that the genes involved in amylovoran production were probably acquired (Bernhard et al., 1996; Smits et al., 2011).

Type III Secretion Systems
The T3SS is a protein complex found in several Gram-negative bacteria, with a needle-like structure used as a sensory probe to detect the presence of eukaryotic organisms and to secrete and inject virulence factors into host cells (Galan and Collmer, 1999). In plant pathogens, T3SS genes are located in pathogenicity islands (PAIs) integrated into the genome (Toth et al., 2006). In all E. amylovora isolates analyzed, the PAI is divided into four distinct DNA regions and is delimited by genes suggesting horizontal gene transfer; the remnants of an integrative conjugative element (ICE) are present at the flank of the Hrp cluster (Wei and Beer, 1993; Mann et al., 2012, 2013; Vrancken et al., 2013). The Hrp region is conserved in the Spiraeoideae strains sequenced (CFBP 1430, ATCC 49946; Smits et al., 2010b), whereas the genomes of several Rubus strains (ATCC BAA-2158, Ea644, and MR-1; Powney et al., 2011) showed larger sizes. Similarly, the island transfer (IT) regions of Spiraeoideae-infecting strains of E. amylovora (IT: a group of MGEs that reside in a host chromosome but retain the ability to excise and to transfer by conjugation) are highly conserved (>99% nucleotide identity and identical synteny), but the IT regions of the Rubus-infecting strains all vary in sequence identity and length. Comparative genome analyses revealed two additional T3SS PAIs (PAI2 and PAI3) and two flagellar T3SS systems (Fla1 and Fla2; Zhao et al., 2009; Smits et al., 2011). PAI2 and PAI3 have a significantly lower G+C content and are closely related to those of Sodalis glossinidius (an endosymbiont of the tsetse fly) and of the pathogens Salmonella and Yersinia (Dale et al., 2001; Young and Young, 2002; Gendlina et al., 2007; Zhao et al., 2009; Smits et al., 2010b). Sequences upstream of PAI2 and PAI3 contain genes associated with MGEs; the insertion of a mobile element deleted part of PAI2 in E. tasmaniensis Et1/99 (Kube et al., 2008), and PAI2 has been lost in E. pyrifoliae (Smits et al., 2010a). It could be speculated that E. amylovora acquired these novel T3SS PAIs during evolution from other bacteria associated with its insect vectors, or that these novel T3SS PAIs contribute to the association of E. amylovora with its insect vectors (Zhao and Qi, 2011).
Two other sets of genes encoding flagellar assembly and chemotaxis-related proteins were found in the genome of E. pyrifoliae. One set is tightly clustered, and the encoded proteins show high identity with the corresponding proteins of Salmonella and Escherichia spp. (Kube et al., 2008). This suggests that the entire region was acquired as a genomic island via horizontal gene transfer (Smits et al., 2010a).
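The lower G+C content reported above for PAI2 and PAI3 is the kind of compositional signal commonly used to flag candidate horizontally acquired regions. As an illustration only, and not the method used in the cited studies, the following minimal Python sketch scans a sequence in sliding windows and reports those whose G+C fraction falls below the sequence-wide value. The function names, the window and step sizes, and the `drop` threshold are all assumptions chosen for the example.

```python
def gc_fraction(seq: str) -> float:
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def low_gc_windows(genome: str, window: int = 1000, step: int = 500,
                   drop: float = 0.05):
    """List (start, end, gc) for windows at least `drop` below the overall G+C.

    `window`, `step` and `drop` are illustrative defaults, not published values.
    """
    baseline = gc_fraction(genome)
    hits = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(genome))
        gc = gc_fraction(genome[start:end])
        if baseline - gc >= drop:
            hits.append((start, end, gc))
    return hits


if __name__ == "__main__":
    # Toy sequence: a GC-rich backbone with an AT-rich insert in the middle.
    toy = "GC" * 50 + "AT" * 50 + "GC" * 50
    for start, end, gc in low_gc_windows(toy, window=100, step=100):
        print(f"{start}-{end}: G+C = {gc:.2f}")
```

A deviation in G+C content is only suggestive: in practice it would be weighed together with the other evidence discussed here, such as flanking MGE-associated genes, integrase remnants, and sequence similarity to distantly related bacteria.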

Type VI Secretion Systems
Type VI secretion systems (T6SS) have been identified in at least a quarter of the sequenced Gram-negative bacteria (Boyer et al., 2009; Records, 2011), and three gene clusters (T6SS1-3) have been found in the genome of E. amylovora CFBP 1430 (Smits et al., 2010b). Comparison of the T6SS clusters among Erwinia and Pantoea species has identified conserved core regions and variable islands (De Maayer et al., 2011). T6SS clusters 1 and 2 are highly similar to those of E. pyrifoliae DSM 12163T and E. tasmaniensis Et1/99, with the exception of some genes encoding hypothetical proteins that do not belong to the core genes of T6SS (Bingle et al., 2008). E. amylovora showed variations within non-conserved islands of T6SS-1, in regions that share high sequence similarity with bacteria of the genus Pantoea, including the plant pathogen Pantoea stewartii subsp. stewartii (Braun, 1982; Smits et al., 2010a; Figures 1A,B). The third T6SS cluster was identified only in E. amylovora CFBP 1430; it is located within a putative genomic island and might therefore have been acquired by horizontal gene transfer (Smits et al., 2010b).

Gene Transfer in Related Plasmids in Erwinia spp
One of the most obvious differences among strains of Erwinia spp. is the presence of different plasmids, which occur in all genome-sequenced Erwinia spp. New plasmids have been described during the latest genome sequencing projects, and plasmid content is the largest factor influencing the pan-genome size of E. amylovora (McGhee et al., 2002; Maxson-Stein et al., 2003; Foster et al., 2004; Sebaihia et al., 2010; Powney et al., 2011; Kamber et al., 2012; Llop et al., 2012; Ismail et al., 2014). The analyses performed have revealed strong relationships with other plant- and human-pathogenic and non-pathogenic bacteria, and plasmids represent the genetic exchange with the widest host range in Erwinia spp. Some of the genetic material exchanged via different plasmids is described below.

Streptomycin Resistance in pEA34
Streptomycin-resistant (SmR) E. amylovora strains were found to harbor transposon Tn5393, which includes the strA-strB gene pair (Chiou and Jones, 1993; McGhee and Sundin, 2011). This transposon is also present in Pantoea agglomerans and is thought to have been introduced into E. amylovora on the conjugative plasmid pEA34 (Chiou and Jones, 1991). Plasmid pEA34 could have originated from the insertion of Tn5393 into a 28 kb plasmid unrelated to pEA29 (Chiou and Jones, 1993; Jones and Schnabel, 2000), and the transposon probably moved from pEA34 to pEA29, because it was also found in pEA29 (Burr et al., 1988; Chiou and Jones, 1993; McManus and Jones, 1994; Sundin and Bender, 1995, 1996; McGhee and Sundin, 2011). An additional insertion of Tn5393 was found in the thiO gene of pEA29 (McGhee and . Other E. amylovora isolates were found to harbor the strA-strB genes within the same Tn5393 transposon in a different plasmid (pEA8.7; Sundin and Bender, 1996). This plasmid is identical to the SmR plasmid RSF1010, which has a broad distribution among enteric bacteria and also encodes the strA-strB gene pair (Palmer et al., 1997). These genes are identical to the streptomycin resistance genes found in at least 14 genera of Gram-negative animal and human pathogens. In the clinical strains, the genes reside on small broad-host-range plasmids such as RSF1010 and pBP1 or on large self-transmissible R plasmids such as pGS05 and pCJ004 (Sundin and Bender, 1996; McGhee and Sundin, 2011).

Plasmids and Genetic Transfer Elements
Erwinia amylovora isolates from Poland and Belgium were found to carry a new plasmid of 68 kb (pEA68). It is closely related to plasmids from other E. amylovora strains: pEA72 from the USA (Sebaihia et al., 2010) and pEA78 from Mexico (Smits et al., 2014). The amino acid sequence identity between these plasmids ranges from 60 to 90% for the shared CDS, which include transfer and mobilization genes (mob, tra, and trb). Large regions between the repA gene and the mobABC gene cluster are divergent in all three plasmids, indicating that these regions are highly variable, probably because of horizontal gene transfer (Ismail et al., 2014).
Plasmid pEI70 was found in Spanish strains of E. amylovora (Llop et al., 2006). It is conjugative and widespread in European countries and, like pEA29, it increases symptom development although it carries no specific pathogenicity genes (Llop et al., 2008). pEI70 is almost entirely included in plasmid pEB102 from E. billingiae, with nucleotide sequence identities above 98%. The organization is also identical, with only a 36-kb region of pEB102 absent from pEI70. Another major feature of pEI70 is the presence of an integrative conjugative element (ICE) that shares similarities with specific regions of Pseudomonas fluorescens Pf-5 (Mavrodi et al., 2009) and Pectobacterium atrosepticum SCRI 1043 (Toth et al., 2006; Llop et al., 2011; Figure 2A) and that may play a role in the fitness of the bacteria.
Plasmid pEL60 was reported in E. amylovora strains from Lebanon, in half of the isolates analyzed. pEL60 is strongly related to other enterobacterial plasmids and is highly similar to the Citrobacter freundii plasmid pCTX-M3, with which it shares 66 of the 68 ORFs it contains (Foster et al., 2004). Its transfer genes are similar to genes on plasmid pACM1 from Klebsiella oxytoca (Preston et al., 2000) and to tra genes from plasmids isolated from other enteric bacteria and from the plant pathogen Pseudomonas syringae pv. tomato DC3000. These observations suggest that the environmental enteric plant pathogen E. amylovora can access the horizontal gene pool shared among clinical enteric bacteria.
In E. pyrifoliae strains, plasmids pEP36 and pEJ30 were found in South Korean and Japanese isolates, respectively. They are nearly identical: both contain the insertion element IS285, but the transposon Tn5394 is missing in pEJ30. Other significant similarities to pEP36 were found in the Yersinia pestis genome and in Shigella flexneri SHI-2 PAIs (Mann et al., 2013). In E. piriflorinigrans strains, sequencing of a common 37-kb plasmid (pEPIR37) revealed high similarity to plasmid pEA29 of E. amylovora and to plasmids pEP36 and pEJ30 of E. pyrifoliae (McGhee et al., 2002; Maxson-Stein et al., 2003; Barbé et al., 2012). The replication origin has high homology with those of plasmids pET46 of E. tasmaniensis and pPag3 of Pantoea vagans (Kube et al., 2008; Smits et al., 2010c), but a fragment of the RepA protein present in plasmids pEA29, pEJ30, and pEP36 is also present in pEPIR37 (94% sequence identity). Twelve CDS are similar to genes present in the genomes of E. pyrifoliae, E. tasmaniensis, and E. billingiae.
A novel plasmid, pEA30, was found in E. amylovora strain CFBP 2585, and nucleotide searches showed that it is closely related to the RA3 plasmid of Aeromonas hydrophila (64-81% identity). RA3 is a broad-host-range, self-transmissible plasmid that is stably maintained in Alpha-, Beta-, and Gammaproteobacteria, and the high genetic similarity of pEA30 to it is another example of how the mobilome of pome fruit erwinias may be generated from plasmids present in the environment (Kulinska et al., 2008; Mann et al., 2013; Figure 2B).
In plasmid pEA29 from E. amylovora, apart from the similarities with plasmids pEP36, pEJ30, and pEPIR37, remnants of several IS were detected that resemble insertion elements previously identified in other, unrelated bacteria. A vestige of Tn2501 and direct repeats found in IS911 were detected in all derivatives of pEA29 (McGhee and . A 108-aa CDS with similarity to ParA from Agrobacterium tumefaciens was also found. The presence of a partial parA gene may explain the occurrence of the 8-bp repeats found consistently in this plasmid. A striking feature of E. pyrifoliae strain Ep1/96 is the presence of a putative non-ribosomal peptide synthetase (NRPS), EppT, with high similarity to a protein of unknown function from Photorhabdus luminescens (syn. Xenorhabdus luminescens), an enterobacterial pathogen of insects, and to an NRPS of the distantly related soft rot pathogen P. atrosepticum that is encoded in an island typical of horizontal gene transfer (Duchaud et al., 2003; Bell et al., 2004; Kube et al., 2010). Similar NRPS proteins are also encoded in the genomes of E. tasmaniensis strain Et1/99 and E. billingiae strain Eb661, but they differ in size and domain content. The presence of the eppT gene in Ep1/96 may be the result of an integration event, as indicated by a phage integrase located upstream (Kube et al., 2010).

Discussion
Many genomic studies on almost all the pome fruit Erwinia species, performed in recent years, have unveiled the occurrence of transposition events related to HGT and the presence of different genetic elements. They have made it possible to infer the evolution and relatedness of the species within this genus, and they offer broad information about the mobilome of the plant-associated erwinias, both pathogenic and epiphytic. The pathogenic species show the acquisition of a large range of pathogenicity and virulence factors and a reduction of chromosome size through significant gene loss. The important pathogenicity factor EPS is present only in the genomes of E. amylovora, E. pyrifoliae, and E. piriflorinigrans (Smits et al., 2010b; Kamber et al., 2012).
FIGURE 2 | (A) Comparison of plasmid pEB102 of E. billingiae Eb661 (1) with E. amylovora ACW 56400 plasmid pEI70 (2), and the conserved region of GAI-2 of Pectobacterium atrosepticum SCRI 1043 (3). Orthologous genes are indicated by blue shading (conserved ICE element genes) and shading. Genes in white do not have orthologs in these regions (Illustration from Llop et al., 2011). (B) Comparison of plasmid pEA30 of CFBP 2585 (Ea495) to the RA3 plasmid of Aeromonas hydrophila. The RA3 plasmid is the archetype of the IncU plasmids, which are a distinct group of mobile elements with highly conserved backbones and variable antibiotic resistance gene cassettes. Conservation between pEA30 and RA3 (represented by the gray shaded lines) is limited to the conserved backbone of replication, maintenance, and transfer-related genes. Nucleotide similarity searches against known sequences in GenBank indicate that pEA30 has 70% total sequence coverage and 64-81% identity for all high-scoring segment pair matches related to the RA3 plasmid (Illustration from Mann et al., 2013).
All these features demonstrate that horizontal gene transfer is responsible for many of the differences between these species and may have led to the emergence of pathogenic species from non-pathogenic ones. It is interesting to point out that the largest factor accounting for the genetic variability influencing the pan-genome of this genus is the presence of plasmids (Malnoy et al., 2012). Plasmid sequencing has uncovered the relationships among some of the medium and small plasmids in Erwinia species. Thus, partial sequences of plasmids present in some E. amylovora strains most probably originated from plasmids of human and animal pathogens (plasmids pEA30 and RA3, Figure 2B). In addition, the high genetic identity between plasmid pEI70 from E. amylovora and pEB102 from the epiphytic species E. billingiae indicates that lateral transfer of almost entire extrachromosomal elements could take place between species sharing host and niche, and that the transferred material could be stably maintained (Figure 2A). Several of the plasmids reported show the potential for conjugal transfer, such as pEL60 and pEI70 from E. amylovora, pEB170 of E. billingiae, and several plasmids from E. tasmaniensis; others carry mob genes and may contain an oriT, allowing them to be mobilized by the Tra proteins of other plasmids (plasmids pEP05 and pET46 from E. pyrifoliae and E. tasmaniensis; Kube et al., 2010). The chromosome of E. tasmaniensis Et1/99 encodes part of the central region of E. pyrifoliae plasmid pEP36, carrying the thiOSGF and betB genes, but not the entire plasmid (Smits et al., 2010a). The most recent sequencing projects are still unveiling new plasmids related to other plasmids and/or genomes from other genera, indicating that in this genus the main source of genetic variability is this extrachromosomal material (Smits et al., 2014; Ismail et al., 2014).
As reflected in this review, the mobilome of this genus shows a widespread presence of transferable elements of different sizes, which affect all the species and are situated mainly in PAIs related to virulence determinants but also to the fitness of the bacterium. This information helps explain how the Erwinia pathogens of pome fruits could access the gene pools of other enteric bacteria through horizontal transfer. Special features of HGT in this genus are the role of plasmid-based MGEs in the acquisition of new traits related not only to classical aspects (e.g., antibiotic resistance) but also to bacterial fitness, leading to better survival in epiphytic species or increased aggressiveness in pathogenic ones. The spread and stability of entire plasmids in both pathogenic and non-pathogenic species is also remarkable. Many different factors can influence the prevalence of HGT in a community, but it is not known how the composition of the community shapes the likelihood of HGT events. Knowledge of the composition of the bacterial community present at a particular site would therefore be of great interest, because HGT might be easier in a community composed of closely related species, and some species seem to be more prone to HGT events than others. This would provide useful information for illuminating patterns of gene transfer in the microbial world.