# EVOLUTIONARY TRAJECTORIES IN PLANT-ASSOCIATED *PSEUDOMONAS* AND *XANTHOMONAS* STRAINS, 2nd Edition

EDITED BY: Marco Scortichini, Dawn Arnold, Olivier Pruvost, Marie-Agnès Jacques and Adriana J. Bernal

PUBLISHED IN: Frontiers in Microbiology and Frontiers in Plant Science

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-8325-4822-6 DOI 10.3389/978-2-8325-4822-6

Topic Editors:

Marco Scortichini, Council for Agricultural and Economics Research (CREA), Italy

Dawn Arnold, University of the West of England, United Kingdom

Olivier Pruvost, UMR Peuplement Végétaux et Bio-agresseurs en Milieu Tropical (CIRAD), France

Marie-Agnès Jacques, INRA Centre Angers-Nantes Pays de la Loire, France

Adriana J. Bernal, University of Los Andes, Colombia

The close relationships between bacteria and plants represent one of the major facets of terrestrial ecology. Depending on the type of interaction and the amount of metabolic advantage one organism can obtain from such relationships, these are classified as mutualistic, commensal or parasitic interactions. Within this context, Pseudomonas and Xanthomonas are bacterial genera with a worldwide spread, capable of establishing all of the above-mentioned interactions with plants. Therefore, they represent good models for studying different lifestyles and, accordingly, deciphering the distinct evolutionary trajectories followed by different lineages of a single genus to infect and/or to establish a mutualistic relationship with the plant. Some members of these two genera are regulated pests that are recognized as major economic threats to their host crop(s) in both temperate and tropical environments.

Some Pseudomonas and Xanthomonas are key examples of different lifestyles (i.e., mesophyll or vessel-colonizing pathogens, epiphytic pathogens, plant growth-promoting rhizobacteria, non-pathogenic strains of recognized pathogenic species, etc.). Refining our knowledge of the ecology and epidemiology of these bacterial groups, as well as deciphering their evolutionary dynamics, is key to understanding their contrasting lifestyles and consequently improving plant disease control. At the same time, insights into the activation of different plant defense mechanisms, as challenged by the different repertoires of virulence factors displayed by pseudomonads and xanthomonads, would yield new achievements to reduce the threats they pose to cultivated and wild plant species.

This Research Topic focuses on the microbial and evolutionary ecology of plant-associated Pseudomonas and Xanthomonas, as well as the genomic and molecular diversity of lineages and the virulence and fitness features involved in the interaction with the host plant. Most of the research available on this topic has been performed on strains isolated in temperate zones. In line with the long-recognized high social and environmental impact of pests and pathogens in tropical countries, we have welcomed submissions of studies covering such situations for these areas. This Research Topic gathers high-quality contributions (Original Research, Methods, Protocols, Hypothesis & Theory, Reviews, Mini Reviews, Focused Reviews). To promote complementary and original research approaches that improve our knowledge of pseudomonad- and xanthomonad-host interactions and their control, it benefited from the scientific communities currently working on Pseudomonas and Xanthomonas, such as the teams dealing with the Pseudomonas syringae species complex and the French Network on Xanthomonads (FNX).

Publisher's note: This is a 2nd edition due to an article retraction.

Citation: Scortichini, M., Arnold, D., Pruvost, O., Jacques, M.-A., Bernal, A. J., eds. (2024). Evolutionary Trajectories in Plant-Associated *Pseudomonas* and *Xanthomonas* Strains, 2nd Edition. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-8325-4822-6

# Table of Contents

*Genetic and Phenotypic Characterization of Indole-Producing Isolates of* Pseudomonas syringae *pv.* actinidiae *Obtained From Chilean Kiwifruit Orchards*

Oriana Flores, Camila Prince, Mauricio Nuñez, Alejandro Vallejos, Claudia Mardones, Carolina Yañez, Ximena Besoain and Roberto Bastías

*Plant Microbiome and its Link to Plant Health: Host Species, Organs and* Pseudomonas syringae *pv.* actinidiae *Infection Shaping Bacterial Phyllosphere Communities of Kiwifruit Plants*

Witoon Purahong, Luigi Orrù, Irene Donati, Giorgia Perpetuini, Antonio Cellini, Antonella Lamontanara, Vania Michelotti, Gianni Tacconi and Francesco Spinelli

*Inference of Convergent Gene Acquisition Among* Pseudomonas syringae *Strains Isolated From Watermelon, Cantaloupe, and Squash*

Eric A. Newberry, Mohamed Ebrahim, Sujan Timilsina, Nevena Zlatković, Aleksa Obradović, Carolee T. Bull, Erica M. Goss, Jose C. Huguet-Tapia, Mathews L. Paret, Jeffrey B. Jones and Neha Potnis

*Corrigendum: Inference of Convergent Gene Acquisition Among* Pseudomonas syringae *Strains Isolated From Watermelon, Cantaloupe, and Squash*

Eric A. Newberry, Mohamed Ebrahim, Sujan Timilsina, Nevena Zlatković, Aleksa Obradović, Carolee T. Bull, Erica M. Goss, Jose C. Huguet-Tapia, Mathews L. Paret, Jeffrey B. Jones and Neha Potnis

*Multiple Recombination Events Drive the Current Genetic Structure of* Xanthomonas perforans *in Florida*

Sujan Timilsina, Juliana A. Pereira-Martin, Gerald V. Minsavage, Fernanda Iruegas-Bocardo, Peter Abrahamian, Neha Potnis, Bryan Kolaczkowski, Gary E. Vallad, Erica M. Goss and Jeffrey B. Jones

*Molecular Evolution of* Pseudomonas syringae *Type III Secreted Effector Proteins*

Marcus M. Dillon, Renan N.D. Almeida, Bradley Laflamme, Alexandre Martel, Bevan S. Weir, Darrell Desveaux and David S. Guttman


Marisa A. S. V. Ferreira, Sophie Bonneau, Martial Briand, Sophie Cesbron, Perrine Portier, Armelle Darrasse, Marco A. S. Gama, Maria Angélica G. Barbosa, Rosa de L. R. Mariano, Elineide B. Souza and Marie-Agnès Jacques


José A. Gutiérrez-Barranquero, Francisco M. Cazorla and Antonio de Vicente

*Analyses of Seven New Genomes of* Xanthomonas citri *pv.* aurantifolii *Strains, Causative Agents of Citrus Canker B and C, Show a Reduced Repertoire of Pathogenicity-Related Genes*

Natasha Peixoto Fonseca, José S. L. Patané, Alessandro M. Varani, Érica Barbosa Felestrino, Washington Luiz Caneschi, Angélica Bianchini Sanchez, Isabella Ferreira Cordeiro, Camila Gracyelle de Carvalho Lemes, Renata de Almeida Barbosa Assis, Camila Carrião Machado Garcia, José Belasque Jr., Joaquim Martins Jr., Agda Paula Facincani, Rafael Marini Ferreira, Fabrício José Jaciani, Nalvo Franco de Almeida, Jesus Aparecido Ferro, Leandro Marcio Moreira and João C. Setubal

# Genetic and Phenotypic Characterization of Indole-Producing Isolates of Pseudomonas syringae pv. actinidiae Obtained From Chilean Kiwifruit Orchards

Oriana Flores<sup>1</sup>, Camila Prince<sup>1</sup>, Mauricio Nuñez<sup>1</sup>, Alejandro Vallejos<sup>2</sup>, Claudia Mardones<sup>2</sup>, Carolina Yañez<sup>1</sup>, Ximena Besoain<sup>3</sup> and Roberto Bastías<sup>1</sup>\*

<sup>1</sup> Laboratorio de Microbiología, Instituto de Biología, Facultad de Ciencias, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile, <sup>2</sup> Departamento de Análisis Instrumental, Facultad de Farmacia, Universidad de Concepción, Concepción, Chile, <sup>3</sup> Laboratorio de Fitopatología, Escuela de Agronomía, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile

#### Edited by:

Marco Scortichini, Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria (CREA), Italy

#### Reviewed by:

Stefania Tegli, Università degli Studi di Firenze, Italy

David John Studholme, University of Exeter, United Kingdom

> \*Correspondence: Roberto Bastías roberto.bastias@pucv.cl

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology

Received: 02 April 2018 Accepted: 30 July 2018 Published: 22 August 2018

#### Citation:

Flores O, Prince C, Nuñez M, Vallejos A, Mardones C, Yañez C, Besoain X and Bastías R (2018) Genetic and Phenotypic Characterization of Indole-Producing Isolates of Pseudomonas syringae pv. actinidiae Obtained From Chilean Kiwifruit Orchards. Front. Microbiol. 9:1907. doi: 10.3389/fmicb.2018.01907

In recent years, Chilean kiwifruit production has been affected by the phytopathogen Pseudomonas syringae pv. actinidiae (Psa), which has caused losses to the industry. In this study, we report the genotypic and phenotypic characterization of 18 Psa isolates obtained from Chilean kiwifruit orchards between 2012 and 2016 from different geographic origins. Genetic analysis by multilocus sequence analysis (MLSA) using four housekeeping genes (gyrB, rpoD, gltA, and gapA) and the identification of type III effector genes suggest that the Chilean Psa isolates belong to the Psa Biovar 3 cluster. All of the isolates were highly homogeneous in regard to their phenotypic characteristics. None of the isolates were able to form biofilms over solid plastic surfaces. However, all of the isolates formed cellular aggregates at the air–liquid interface. All of the isolates, except for Psa 889, demonstrated swimming motility, while only isolate Psa 510 demonstrated swarming motility. The biochemical profiles of the isolates revealed differences in 22% of the tests in at least one Psa isolate when analyzed with the BIOLOG system. Interestingly, all of the isolates were able to produce indole using a tryptophan-dependent pathway. PCR analysis revealed the presence of the genes aldA/aldB and iaaL/matE, which are associated with the production of indole-3-acetic acid (IAA) and indole-3-acetyl-ε-L-lysine (IAA-Lys), respectively, in P. syringae. In addition, IAA was detected in the cell-free supernatant of a representative Chilean Psa strain. This work represents the most extensive analysis in terms of the time and geographic origin of Chilean Psa isolates. To our knowledge, this is the first report of Psa being able to produce IAA. Further studies are needed to determine the potential role of IAA in the virulence of Psa during kiwifruit infections and whether this feature is observed in other Psa biovars.

Keywords: Pseudomonas syringae pv. actinidiae, Psa Biovar 3 (Psa-V), MLSA, kiwifruit, IAA, IAA production, indoleacetic acid lysine, IAA-L

### INTRODUCTION

Pseudomonas syringae pv. actinidiae (Psa) is the causal agent of bacterial canker in Actinidia deliciosa and Actinidia chinensis and has caused severe losses in all of the major areas of kiwifruit cultivation, including Italy, China, New Zealand, and Chile (Scortichini et al., 2012; Ferrante and Scortichini, 2015). This bacterium infects host plants by entering natural openings and wounds, moving inside the plant, and promoting the appearance of necrotic leaf spots, red exudate production, and canker and necrosis in the trunk. In the late stages of the infection, the plants wilt and desiccate, which leads to the death of the kiwifruit vine (Vanneste et al., 2012; Cellini et al., 2014). Since its identification in Japan in 1984, successive outbreaks of Psa have been observed worldwide, and therefore it is now considered to be a pandemic phytopathogen (Scortichini et al., 2012; McCann et al., 2017). Comparative analyses using multilocus sequence analysis (MLSA) and the detection of type III secretion system effector genes and phytotoxins (phaseolotoxin or coronatine) in Psa isolates from different geographic origins have revealed the existence of five clusters of biovars (Marcelletti et al., 2011; Ciarroni et al., 2015; Ferrante and Scortichini, 2015; Fujikawa and Sawada, 2016; McCann et al., 2017): biovar 1, comprising Japanese strains which are able to produce phaseolotoxin; biovar 2, including only South Korean strains which produce coronatine; biovar 3 or Psa-V, which includes the most virulent strains, which do not produce phytotoxins, were first isolated in Italy (2008–2009), and have subsequently been reported to cause outbreaks in different countries (Butler et al., 2013; Ciarroni et al., 2015; Cunty et al., 2015a); biovar 4, which contains strains with low virulence and was recently proposed to be a new pathovar called P. syringae pv. actinidifoliorum (Psaf) (Abelleira et al., 2015; Cunty et al., 2015b); and finally, biovar 5, with Japanese strains isolated in 2012 which do not produce phytotoxins. Recently, a potential new biovar was described in Japan, which produces both phaseolotoxin and coronatine (Fujikawa and Sawada, 2016).

The genetic analysis of the Psa biovars described a set of genes that participate in distinct phases of kiwifruit infection and niche colonization, both outside and inside of the host plant. These genes are related to bacterial motility, biofilm formation, copper and antibiotic resistance, siderophore production, and the degradation of lignin (Marcelletti et al., 2011; Scortichini et al., 2012; Ghods et al., 2015; Gao et al., 2016; Colombi et al., 2017; Patel et al., 2017). However, the mechanisms that determine infection and the interactions between Psa and the kiwifruit plant remain unknown. The production of the phytohormone indole-3-acetic acid (IAA) is another virulence factor that has been described in Pseudomonas savastanoi and P. syringae pathovars. This compound can perturb the regulation of the hormone balance in the plant and increase its susceptibility to infection (Glickmann et al., 1998; Cerboneschi et al., 2016). IAA production using the indole-3-acetamide (IAM) pathway is the most common mechanism in phytopathogenic bacteria, including P. syringae, and has mostly been characterized in P. savastanoi pv. savastanoi (Psav) (Baltrus et al., 2011; Aragón et al., 2014), where IAA biosynthesis begins from L-tryptophan (Trp) and involves the activity of the enzymes tryptophan-2-monooxygenase (IaaM) and IAM hydrolase (IaaH), encoded by the iaaM and iaaH genes, respectively. However, in other P. syringae pathovars, IAA production involves other genes that lack homology to iaaM and iaaH (Glickmann et al., 1998), and recently aldehyde dehydrogenase family proteins encoded by the genes aldA and aldB were associated with IAA synthesis in P. syringae pv. tomato (McClerklin et al., 2018). In addition, P. savastanoi pv. nerii can conjugate IAA to the amino acid lysine, producing indole-3-acetyl-ε-L-lysine (IAA-Lys), due to the action of the enzyme IAA-Lys ligase encoded by the iaaL gene (Cerboneschi et al., 2016). This gene has been found in several P. syringae pathovars, where it is arranged in synteny with the gene matE, which encodes a putative MATE family transporter, and has been implicated in the fitness and virulence of P. syringae pv. tomato (Pst) in tomato plants (Glickmann et al., 1998; Castillo-Lizardo et al., 2015).

Pseudomonas syringae pv. actinidiae was first reported in Chile in 2010 following its isolation from kiwifruit orchards in the Maule Region, and since 2011, it has been considered to be a pest under the official control of the Agricultural and Livestock Service (SAG) of the Government of Chile (McCann et al., 2013). Previous studies classified the first Chilean Psa isolates in biovar 3 together with strains from China, Europe, and New Zealand (Butler et al., 2013; McCann et al., 2013; Cunty et al., 2015a). However, the scope of these studies was limited by the number of Chilean strains. In this study, we report the genotypic and phenotypic characterization of Chilean Psa isolates obtained between 2012 and 2016 from the regions that account for more than 80% of the Psa-positive orchards in Chile. In addition, we show the first evidence of Psa strains producing IAA.

### MATERIALS AND METHODS

#### Bacterial Strains and Culture Conditions

Chilean Psa isolates are listed in **Table 1** and were obtained from the SAG from kiwi orchards of different geographic areas in the central-south of Chile in 2012, 2013, and 2016. P. syringae pv. tomato DC3000 was provided by Dr. Paula Salinas of the Universidad Santo Tomás (Santiago, Chile). Escherichia coli DH5α, E. coli K12, Pseudomonas aeruginosa PAO1, Azospirillum brasilense SP7, Salmonella bongori X9617, and Cupriavidus metallidurans CH34 were obtained from the bacterial collection of the Laboratory of Microbiology of the Pontificia Universidad Católica de Valparaíso (PUCV). Pseudomonas antarctica S63 (Vásquez-Ponce et al., 2018) was provided by Dr. Jorge Olivares from the PUCV. The bacteria were grown at 25°C in Luria-Bertani (LB) medium except when another medium is specified. Growth curves were performed in 96-well plates at 25°C for 30 h in an Infinite® M200 NanoQuant microplate spectrophotometer (TECAN). Optical density (OD600 nm) was determined every 30 min. All curves were performed in biological triplicates.


<sup>∗</sup>Accession number of the nucleotide sequences of the gapA, gltA, gyr, and rpoD genes added to GenBank (NCBI) are included in Supplementary Table S3.

#### Molecular Identification and Characterization of the Psa Isolates

Molecular identification of the Pseudomonas syringae pv. actinidiae strains was performed using RG-PCR and duplex-PCR as previously described (Rees-George et al., 2010; Gallelli et al., 2011). For RG-PCR, specific primers were used to amplify the internal transcribed spacer (ITS) between the 16S and 23S rRNA sequences, and for duplex-PCR, specific primers against the ompP1 (Outer Membrane Protein P1) and avrD1 (effector) genes were used. All 18 isolates produced bands of the expected size (**Supplementary Figure S1**). In addition, the identity of these isolates was also confirmed by partial 16S rDNA sequences. For genomic DNA isolation, the bacteria were grown in LB media for 16 h until the stationary phase. Total genomic DNA was extracted using a Wizard® Genomic DNA Purification Kit (Promega) according to the manufacturer's instructions. The DNA concentration was determined using a MaestroNano MN-913 (Maestrogen, Inc.). For the molecular identification of the type III effector genes, the reference genome of the Psa ICMP 18884 biovar 3 strain (GenBank accession number: NZ\_CP011972.2) (Templeton et al., 2015) and contigs of the Chilean Psa genomes, ICMP 19439 (ANJM00000000.1) and ICMP 19455 (ANJK00000000.1), available in GenBank (NCBI) were used to design specific primers for the PCRs. Comparative sequence analysis was performed using the Geneious R11 software (Kearse et al., 2012). The amplicons of effector genes obtained from strain Psa 743 were purified using an E.Z.N.A.® Cycle Pure Kit (Omega Bio-Tek, Inc.) and sequenced using the Sanger method by Macrogen, Inc. (South Korea). The quality and assembly of the sequences were analyzed using Geneious R11 software, and the sequences were compared with the NCBI database using BLASTN and BLASTX to identify the genes. Primers and annealing temperatures used in the PCRs are listed in **Supplementary Table S1**. In all cases, PCR was performed on a SureCycler 8800 Thermal Cycler (Agilent Technologies) using SapphireAmp Fast PCR Master Mix (Takara Bio) according to the manufacturer's instructions. PCR products were separated using electrophoresis in agarose gel (1.5% agarose in 1× TAE buffer) stained with GelRed™ (Biotium), and the bands were visualized under UV light. PCRs were performed in triplicate. The genomic DNA of P. syringae pv. tomato DC3000 and E. coli DH5α were used as the control reactions. The sequences of the effector genes of a selected strain (Psa 743) were deposited in GenBank (NCBI), and the accession numbers are listed in **Supplementary Table S2**.

#### Phylogenetic Analysis by MLSA

The gapA, gltA, gyrB, and rpoD genes, encoding glyceraldehyde-3-phosphate dehydrogenase, citrate synthase, DNA gyrase B, and sigma factor 70, respectively, were amplified from the genomic DNA of Psa isolates using the primers listed in **Supplementary Table S1** as previously described (Ferrante and Scortichini, 2010). PCR was performed in triplicate using a SureCycler 8800 Thermal Cycler (Agilent Technologies) with GoTaq G2 Flexi polymerase (Fermentas) according to the manufacturer's instructions. The PCR products were visualized using electrophoresis in agarose gels and purified using an E.Z.N.A.® Gel Extraction Kit (Omega Bio-Tek, Inc.). The automated sequencing of the amplicons was performed by Macrogen, Inc. (South Korea), and the sequences were analyzed using the Geneious R9 software package (Biomatters Limited) (Kearse et al., 2012). The nucleotide sequences of the gapA, gltA, gyrB, and rpoD genes of Chilean Psa strains were added to GenBank (NCBI) and are listed in **Supplementary Table S3**. The sequences of other Psa biovars available in GenBank (NCBI) were included in the analysis and are listed in **Supplementary Table S4**. In addition, sequences of P. syringae pv. tomato strain DC3000 were included: gapA (AE016853.1:1415258-1416259), cts (AE016853.1:2414332-2415621), gyrB (AE016853.1:4147-6564), and rpoD (AE016853.1:588846-590696) (Buell et al., 2003). The sequences of each locus were aligned using CLUSTALW as implemented in the MEGA7 software (Kumar et al., 2016). A dendrogram from four-locus concatenated sequences was generated using neighbor-joining (UPGMA) and 1,000 bootstrap iterations.
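The concatenation step described above can be illustrated with a minimal Python sketch. It joins per-locus alignments in a fixed locus order and computes pairwise p-distances (the proportion of differing sites), which is one simple input for a distance-based tree. The isolate names and toy sequences are hypothetical placeholders, not data from this study, and the tree construction and bootstrapping performed in MEGA7 are not reproduced here.

```python
from itertools import combinations

# Locus order used for concatenation (the four MLSA genes named in the text).
LOCI = ["gapA", "gltA", "gyrB", "rpoD"]


def concatenate(alignments):
    """Join per-locus aligned sequences into one sequence per isolate.

    `alignments` maps locus -> {isolate name: aligned sequence}.
    """
    isolates = sorted(alignments[LOCI[0]])
    return {
        iso: "".join(alignments[locus][iso] for locus in LOCI)
        for iso in isolates
    }


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping positions that contain a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(concatenated):
    """Pairwise p-distances keyed by (isolate_1, isolate_2) in sorted order."""
    return {
        (i, j): p_distance(concatenated[i], concatenated[j])
        for i, j in combinations(sorted(concatenated), 2)
    }


# Hypothetical toy alignments (8 sites per locus), for illustration only.
toy = {
    "gapA": {"isolate_A": "ATGGCTAA", "isolate_B": "ATGGCTAA", "outgroup": "ATGACTAA"},
    "gltA": {"isolate_A": "CCGTTAGC", "isolate_B": "CCGTTAGC", "outgroup": "CCGTTAGC"},
    "gyrB": {"isolate_A": "GATTACAG", "isolate_B": "GATTACAG", "outgroup": "GATCACAG"},
    "rpoD": {"isolate_A": "TTGACCGA", "isolate_B": "TTGACCGA", "outgroup": "TT-ACCGT"},
}

concatenated = concatenate(toy)
dist = distance_matrix(concatenated)
```

Two isolates with identical sequences at all four loci get a distance of zero, which corresponds to the 100% identity situation reported for the Chilean isolates.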

#### Biochemical Characterization

The bacteria were streaked out from a −80°C stock onto LB plates and incubated at 25°C for 48 h. Biochemical patterns were determined using the Biolog GEN III MicroPlate™ system (Biolog™, United States) according to the manufacturer's instructions. BIOLOG plates were read in an Infinite M200 PRO plate reader (TECAN). Reactions were considered positive if the OD590 nm was greater than 50% of the positive control (∼0.7). Reactions indistinguishable from the negative control and with an OD590 nm below 25% of the positive control (∼0.35) were considered to be negative. Reactions between these two parameters were considered borderline.
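The scoring rule above can be written as a small Python sketch. The function name and the example readings are hypothetical; the positive-control reading of 1.4 is only inferred from the approximate cut-offs quoted in the text (∼0.7 at 50% and ∼0.35 at 25%), and the additional criterion that negative wells be indistinguishable from the negative control is not modeled.

```python
def classify_well(od590, positive_control_od):
    """Score one well as 'positive', 'negative' or 'borderline'.

    Thresholds follow the text: above 50% of the positive control is
    positive, below 25% is negative, anything in between is borderline.
    """
    if positive_control_od <= 0:
        raise ValueError("positive control reading must be > 0")
    ratio = od590 / positive_control_od
    if ratio > 0.50:
        return "positive"
    if ratio < 0.25:
        return "negative"
    return "borderline"


# Hypothetical readings; 1.4 is an inferred positive-control value.
POSITIVE_CONTROL = 1.4
calls = {od: classify_well(od, POSITIVE_CONTROL) for od in (0.8, 0.5, 0.2)}
```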

#### Determination of Streptomycin and Copper Susceptibility

The copper and streptomycin susceptibility was determined using the broth microdilution method (Biebl and Pfennig, 1978; Mergeay et al., 1985). Bacterial strains were grown in Tris minimal (for the copper assay) or Mueller–Hinton (for the streptomycin assay) media for 18 h, and the optical density at 600 nm (OD600 nm) was adjusted to 0.7. For the copper susceptibility assays, 10 µL of each bacterial culture was inoculated in Tris minimal agar media (1.5% agar) supplemented with the corresponding copper sulfate concentration (0, 75, 100, 125, 150, 175, 200, 225, 250, 275, and 300 µg/mL). To assess the streptomycin susceptibility, bacterial strains were inoculated in Mueller–Hinton agar media supplemented with the corresponding antibiotic concentration (0, 3.9, 7.8, 15.7, 31.25, 62.5, 125, 250, 500, 1,000, and 2,000 µg/mL). Plates were incubated for 5 days at 25°C, and the bacterial growth was observed. C. metallidurans CH34 and P. antarctica S63 were used as experimental controls (von Rozycki and Nies, 2009; Vásquez-Ponce et al., 2018). All experiments were performed in biological and technical triplicates.
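One common way to summarize a dilution series like the one above is the minimal inhibitory concentration (MIC), the lowest tested concentration at which no growth is seen. The text does not state how the growth readout was summarized, so the sketch below is an assumption about the analysis, and the growth pattern used in the example is hypothetical. Only the concentration lists are taken from the text.

```python
# Concentration series from the text, in µg/mL.
COPPER_UG_ML = [0, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300]
STREPTOMYCIN_UG_ML = [0, 3.9, 7.8, 15.7, 31.25, 62.5, 125, 250, 500, 1000, 2000]


def minimal_inhibitory_concentration(concentrations, growth):
    """Return the lowest tested concentration with no visible growth.

    `growth` holds one boolean per concentration (True = growth observed).
    Returns None when growth occurs at every tested concentration.
    """
    if len(concentrations) != len(growth):
        raise ValueError("one growth call is needed per concentration")
    inhibited = [c for c, grew in zip(concentrations, growth) if not grew]
    return min(inhibited) if inhibited else None


# Hypothetical readout: growth up to 150 µg/mL copper sulfate, none above.
example_growth = [True] * 5 + [False] * 6
copper_mic = minimal_inhibitory_concentration(COPPER_UG_ML, example_growth)
```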

#### Biofilm Production

Microtiter plate biofilm production was performed and adapted as previously described (Merritt et al., 2011; O'Toole, 2011; Ueda and Saneoka, 2015). Briefly, overnight bacterial cultures were adjusted to an optical density of 0.1 (OD600 nm) and diluted 10-fold. Aliquots (100 µL) of the dilution were added to each well (96-well microtiter plates), and the plates were incubated for 7 days at 25°C. After incubation, the liquid supernatant was removed and the plates were washed with distilled water. The wells were stained with 0.1% crystal violet solution, and the biofilm was solubilized with a 30% acetic acid solution. The biofilm production was quantified spectrophotometrically (550 nm) in a Tecan Infinite® M200 microplate reader. For the air–liquid interface biofilm assay, 1 mL of the bacterial dilution was added to each well (12-well plates), and the plates were incubated at 25°C for 96 h. Surface biofilm formation was monitored and photo-documented every 24 h. All of the experiments were performed in biological and technical triplicates, and P. aeruginosa PAO1 was used as the positive control (Ghafoor et al., 2011).

#### Bacterial Motility Assay

Motility assays were adapted for the Psa assays as described by Hosseinidoust et al. (2013). Swimming motility assays were performed by inoculating 2 µL of stationary-phase bacterial culture (OD600 nm ∼1.3) into the center of 0.3% LB agar plates. Swarming motility assays were performed utilizing the same procedure except that 0.5% LB agar plates were used. The zone sizes were measured after incubation at 30°C for 72 h. The assays were performed in biological and technical triplicates. E. coli K12 was used as the experimental control (Swiecicki et al., 2013). Statistical analysis was performed using one-way ANOVA and Dunnett's multiple comparison test with p ≤ 0.05.
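For readers unfamiliar with the first step of the statistical analysis named above, the following Python sketch computes the one-way ANOVA F statistic by hand from group means and sums of squares. The zone diameters are hypothetical, not measurements from this study. Dunnett's post hoc comparison against the control and the conversion of F to a p-value are not reproduced, since both need distribution functions from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical swimming-zone diameters (mm), three replicates per strain.
control = [10, 11, 12]
isolate_1 = [10, 12, 11]
isolate_2 = [20, 21, 22]
f_stat, df_b, df_w = one_way_anova_f([control, isolate_1, isolate_2])
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one group mean differs, after which a test such as Dunnett's compares each isolate with the control.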

#### Indole Production and Identification of IAA Pathway Genes

The indole production was determined using Salkowski's method as previously described (Mazzola and White, 1994; Mohite, 2013). Briefly, each strain was grown in LB media supplemented with Trp (2 g/L) and incubated at 25°C for 24 h. After incubation, the bacterial density was measured (OD600 nm), and the cultures were centrifuged at 10,000 rpm for 10 min. Cell-free supernatants were mixed with 0.5 mL of Salkowski's reagent (12 g of FeCl<sub>3</sub> per liter in 7.9 M H<sub>2</sub>SO<sub>4</sub>). The mixture was incubated for 30 min at room temperature in the dark, and the absorbance at 530 nm was determined. The concentration of indole in each sample was determined using a standard curve of indoleacetic acid (Sigma) (0–30 µg/mL) (**Supplementary Figure S3**). IAA concentrations were normalized to the cell density. A. brasilense SP7 (Bar and Okon, 1993) and S. bongori X9617 (De La Rosa Fraile et al., 1980) strains were used as experimental positive and negative controls, respectively. All of the analyses were performed in biological and technical triplicates. Statistical analysis was performed using one-way ANOVA and Dunnett's multiple comparison test with p ≤ 0.05. The detection of the iaaL, matE, iaaH, iaaM, aldA, and aldB genes in the Chilean Psa isolates was performed using specific primers designed on the basis of conserved regions from the sequences of different P. syringae pathovars (**Supplementary Table S5**). The primers designed are listed in **Supplementary Table S1**. PCRs were performed on a SureCycler 8800 Thermal Cycler (Agilent Technologies) using a SapphireAmp Fast PCR Master Mix (Takara Bio) according to the manufacturer's instructions. The PCR conditions were as follows: 5 min at 95°C, followed by 35 cycles of 30 s at 95°C, 30 s at the annealing temperature (**Supplementary Table S1**), and 2 min at 72°C, and a final elongation step of 5 min at 72°C. Sanger automated sequencing of the amplicons from Psa 743, Psa 598, and Psa 889 was performed by Macrogen, Inc. (South Korea). The sequences were compared with those in the NCBI database using BLASTN and BLASTX for gene identification. The sequences obtained were deposited in GenBank (NCBI), and the accession numbers are listed in **Supplementary Table S2**.
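The Salkowski quantification described in this section (standard curve, then normalization to cell density) can be sketched in Python as follows. The standard-curve readings and the sample values are hypothetical and chosen only so that the arithmetic is easy to follow; the function names are illustrative, and a simple least-squares line is assumed for the curve.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def indole_per_od(a530, od600, slope, intercept):
    """Indole equivalents (µg/mL) from A530, normalized to cell density."""
    concentration = (a530 - intercept) / slope
    return concentration / od600


# Hypothetical IAA standards spanning the 0-30 µg/mL range used in the text.
standards_ug_ml = [0, 5, 10, 20, 30]
standards_a530 = [0.02, 0.12, 0.22, 0.42, 0.62]
slope, intercept = fit_line(standards_ug_ml, standards_a530)

# Hypothetical sample: A530 = 0.32 from a culture at OD600 = 1.5.
normalized = indole_per_od(0.32, 1.5, slope, intercept)
```

With these toy values the curve has a slope of 0.02 absorbance units per µg/mL, the sample corresponds to 15 µg/mL, and the normalized value is 10 µg/mL per OD600 unit.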

### LC-ESI-MS/MS Analysis

To detect IAA, Psa strain 743 was grown in minimal media (4.5 g/L KH2PO4, 10.5 g/L K2HPO4, 1 g/L (NH4)2SO4, and 0.5 g/L sodium citrate) supplemented with Trp (2 g/L) and incubated at 25°C for 72 h. After incubation, the bacterial density was measured (OD600 nm), and the cultures were centrifuged at 10,000 rpm for 10 min. The supernatant was filtered (0.22 µm). Methanol and acetic acid were added to the cell-free supernatant at a final concentration of 10 and 0.05%, respectively, and the mixture was then filtered through a PVDF filter (0.22 µm). Finally, the sample was subjected to LC-ESI-MS/MS analysis using indoleacetic acid and lysine (Sigma) as standards. The analysis was performed using a Shimadzu Nexera HPLC system coupled to a 3200Q TRAP mass spectrometer equipped with a turbo ion spray interface (Applied Biosystems/MDS Sciex, ON, Canada). A Kinetex C18 core shell column (150 mm × 4.6 mm i.d.; 2.6 µm particle size; Kinetex, Phenomenex) protected by a C18 UHPLC Ultra column guard (0.5 µm porosity × 4.6 mm i.d.; Phenomenex, United States) was used. The elution gradient was adapted from Matsuda et al. (2005) and consisted of a mixture of methanol:water containing 0.05% acetic acid (methanol gradient: 10–90% in 13 min; 95% from 13.1 to 28 min) at a flow rate of 0.4 mL/min and a column temperature of 30°C. MS was conducted in the positive ion mode under the following conditions: curtain gas (CUR), 10 psi; collision activated dissociation (CAD), medium; ion spray voltage (IS), 4500 V; nebulizer gas (Gas1), 60 psi; turbo gas (Gas2), 40 psi; temperature (TEM), 400°C. The detection was performed using multiple reaction monitoring (MRM). The data obtained were processed using Analyst 1.3 software (Applied Biosystems).
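The elution program stated above can be written as a piecewise function of time. The sketch below assumes a linear ramp from 10 to 90% methanol over the first 13 min and a linear transition between the 13 and 13.1 min set points; the text gives only the set points, so both interpolations are assumptions, and the function name is illustrative.

```python
def methanol_percent(t_min):
    """Methanol percentage of the mobile phase at time t_min (minutes)."""
    if t_min < 0 or t_min > 28:
        raise ValueError("time is outside the 0-28 min run")
    if t_min <= 13.0:
        # Assumed linear ramp: 10% at 0 min to 90% at 13 min.
        return 10.0 + (90.0 - 10.0) * t_min / 13.0
    if t_min < 13.1:
        # Assumed linear transition between the 13 and 13.1 min set points.
        return 90.0 + (95.0 - 90.0) * (t_min - 13.0) / 0.1
    # Hold at 95% from 13.1 to 28 min.
    return 95.0


# Total mobile phase consumed per run at the stated flow rate of 0.4 mL/min.
total_volume_ml = 0.4 * 28

for t in (0, 6.5, 13, 20):
    print(t, "min:", round(methanol_percent(t), 1), "% methanol")
print("mobile phase per run:", round(total_volume_ml, 1), "mL")
```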

### RESULTS

#### Phylogenetic Analysis and Molecular Characterization of the Chilean Psa Isolates

The 18 Chilean Psa isolates used in this study were collected from kiwi plants with canker disease symptoms by the SAG. These isolates were obtained between 2012 and 2016 from orchards in central-south Chile (Bío Bío and Maule Regions), which is the site of the vast majority of kiwifruit production in the country (Oficina de Estudios y Políticas Agrarias [ODEPA], 2018) and accounts for more than 50% of the Psa-infected orchards in Chile (**Figure 1** and **Table 1**). All of the isolates were confirmed as Psa strains by PCR using different sets of primers (see section "Materials and Methods").

The first Chilean Psa isolates had been previously assigned to the biovar 3 group (Butler et al., 2013; McCann et al., 2013; Cunty et al., 2015a). An MLSA using the housekeeping genes gyrB (DNA gyrase B), rpoD (sigma factor 70), gltA (citrate synthase), and gapA (glyceraldehyde-3-phosphate dehydrogenase) showed that the genes sequenced have 100% identity with the corresponding genes in different Psa strains belonging to biovar 3, including Chilean strains obtained in 2010. The phylogenetic analysis including other Psa strains shows a clear clustering of different biovars except for biovars 2 and 5, which are grouped together (**Figure 2**). The results show that all the Chilean Psa isolates group together with the other Psa biovar 3 isolates, confirming the findings of previous studies. These results were also confirmed by the PCR detection of the 16 type III effector genes that have been described in Psa biovar 3 strains (McCann et al., 2013; Ferrante and Scortichini, 2015). Type III effector genes were detected in all of the Chilean Psa strains, including those encoded in plasmid DNA in Psa biovar 3. The identity of these genes was confirmed by sequencing the amplicons of Psa strain 743 as a representative of the other Chilean Psa strains (**Supplementary Table S2**). These results also suggest that no new biovars have been introduced to Chile during this period.
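The identity comparison underlying the MLSA can be sketched as follows: the four housekeeping loci are concatenated in a fixed order and the percentage of identical positions is computed over the aligned columns. The sequences below are short hypothetical placeholders, not the sequences obtained in this study, and the sketch assumes the loci have already been aligned to equal length.

```python
GENES = ("gyrB", "rpoD", "gltA", "gapA")


def concatenate(loci):
    """Join the aligned loci of one strain in a fixed gene order."""
    return "".join(loci[gene] for gene in GENES)


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions, ignoring columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical toy alignments (6 nt per locus) for illustration only.
isolate = {"gyrB": "ATGGCT", "rpoD": "ATGAAA", "gltA": "ATGCCC", "gapA": "ATGTTT"}
same_biovar = dict(isolate)
other_biovar = {**isolate, "rpoD": "ATGAAG"}  # one substitution

print(percent_identity(concatenate(isolate), concatenate(same_biovar)))
print(round(percent_identity(concatenate(isolate), concatenate(other_biovar)), 2))
```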

### Phenotypic Characterization of the Psa Isolates

Different features implicated in the fitness and virulence of Psa were evaluated in the 18 Chilean isolates. None of the strains showed differences in their growth parameters (data not shown). However, their biochemical profile determined using the Biolog GEN III MicroPlate revealed differences in 22% of the different tests in at least one of the 18 strains (**Supplementary Table S6**). All of the strains were able to use different carbon sources such as D-glucose, D-mannose, D-galactose, glycerol, D-mannitol, L-arginine, L-serine, acetic acid, and citric acid. However, they varied in their ability to use sucrose, D-fructose, inosine, L-glutamic acid, and formic acid. In addition, all of the strains were resistant to antibiotics such as rifamycin SV, lincomycin, and vancomycin, while they were sensitive to minocycline and troleandomycin and showed variable sensitivity to aztreonam, nalidixic acid, and fusidic acid. Despite these differences, all of the strains were identified as P. syringae pathovars according to the Biolog GEN III database (version 2.8). Interestingly, all of the isolates were susceptible to copper (MIC 75 µg/mL Cu2+) and streptomycin (MIC 3.9 µg/mL), suggesting that no resistance has developed in these strains despite the use of copper compounds as antimicrobials in the Chilean kiwifruit industry.
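The proportion of variable tests reported above corresponds to counting the tests whose result differs in at least one strain. A minimal sketch of that count is given below; the strain labels and the positive/negative calls are hypothetical placeholders rather than the data in Supplementary Table S6, and only four of the plate's tests are shown.

```python
def variable_tests(profiles):
    """Return the tests whose result is not the same in every strain."""
    strains = list(profiles.values())
    tests = list(strains[0].keys())
    return [test for test in tests if len({strain[test] for strain in strains}) > 1]


# Hypothetical calls: True = growth (substrate used or compound tolerated).
profiles = {
    "strain_A": {"D-glucose": True, "sucrose": True, "vancomycin": True, "aztreonam": False},
    "strain_B": {"D-glucose": True, "sucrose": False, "vancomycin": True, "aztreonam": False},
    "strain_C": {"D-glucose": True, "sucrose": True, "vancomycin": True, "aztreonam": True},
}

variable = variable_tests(profiles)
percent_variable = 100.0 * len(variable) / len(profiles["strain_A"])
print(variable, percent_variable)
```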

Biofilm production has been proposed to be an important virulence factor in P. syringae (Ghods et al., 2015; Ueda and Saneoka, 2015). Therefore, the ability to produce biofilm was evaluated in the different Chilean Psa isolates. The results showed that none were able to produce biofilm over an abiotic surface. However, they do produce a thin layer of biofilm (pellicle) in the air–liquid interface. Initially a thin layer of cells was observed in the center of static cultures after 24 h of incubation, turning to a fully grown biofilm after 96 h (**Supplementary Figure S2**). Swimming and swarming motility were also evaluated among the different Psa isolates. The results show that all of the isolates exhibit swimming motility except for strain Psa 889, which shows a significantly reduced displacement in comparison to the other strains (p < 0.05). In contrast, none of the strains except for Psa 510 demonstrated swarming motility under the experimental conditions (p < 0.05) (**Figure 3**). These results show that the Chilean Psa strains demonstrate a high phenotypic homogeneity with specific differences in particular strains.

#### Indole Production in the Psa Isolates

Indole-3-acetic acid production has been described in different P. syringae pathovars and P. savastanoi (Glickmann et al., 1998; Cerboneschi et al., 2016) but not in Psa. It is produced mostly from Trp via IAM by enzymes encoded by the genes iaaM and iaaH. Therefore, all 18 isolates were evaluated for their ability to produce IAA (Glickmann and Dessaux, 1995). The results show that all of the Chilean Psa isolates can produce indole at different concentrations (**Figure 4A**). In addition, some of the Chilean Psa isolates (Psa 882 and Psa 394) produce indole concentrations similar to those of A. brasilense (63 µg/mL IAA), which produces exceptionally large amounts of IAA (Bar and Okon, 1993). In all cases, indole was produced only in the presence of Trp, suggesting that, as observed in other P. syringae, this amino acid is the precursor of IAA synthesis in Psa. IAA production was also confirmed in the Chilean Psa strain 743 using LC-ESI-MS/MS analysis, showing a strong signal for IAA in the supernatant of the Psa 743 cell-free cultures (**Supplementary Figure S4**). The iaaM and iaaH genes were not detected in the Chilean Psa isolates using PCR and specific primers, suggesting that an alternative route of synthesis exists in these strains. Recently, a novel IAA synthesis pathway was reported in P. syringae pv. tomato DC3000 (Pst), which involves the participation of an indole-3-acetaldehyde dehydrogenase encoded by the gene aldA and its homolog, aldB (McClerklin et al., 2018). Comparative analysis by BLASTN showed 95 and 97% identity between the aldA and aldB genes and, respectively, an aldehyde dehydrogenase sequence (GenBank accession number: CP011972.2: 149109–150602) and a carnitine dehydratase/3-oxoadipate enol-lactonase sequence (GenBank accession number: CP011972.2: 3182732–3184213) encoded in the biovar 3 Psa strain ICMP 18884. PCR with specific primers revealed that the aldA and aldB genes were also present in all of the Chilean Psa strains, suggesting that they are likely to be responsible for the synthetic route of IAA. The identity of genes aldA and aldB was confirmed in strains Psa 889, Psa 743, and Psa 598 using Sanger sequencing (**Supplementary Table S2**).
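As a quick consistency check on the two GenBank coordinate ranges quoted above, the locus lengths can be computed directly. The sketch below assumes the usual GenBank convention of 1-based, inclusive coordinates; whether each range spans exactly one complete open reading frame is an assumption that the text does not state.

```python
def locus_length(start, end):
    """Length in bp of a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinate ranges on CP011972.2 as quoted in the text.
loci = {
    "aldehyde dehydrogenase": (149109, 150602),
    "carnitine dehydratase/3-oxoadipate enol-lactonase": (3182732, 3184213),
}

lengths = {name: locus_length(start, end) for name, (start, end) in loci.items()}
for name, length in lengths.items():
    frame = "a multiple of 3" if length % 3 == 0 else "not a multiple of 3"
    print(f"{name}: {length} bp ({frame})")
```

Both ranges are whole multiples of three nucleotides, which is what would be expected if each covers a complete coding sequence.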

It has also been reported that IAA can be conjugated to the amino acid lysine to produce IAA-Lys by the enzymatic activity of the iaaL gene product (Glickmann et al., 1998; Castillo-Lizardo et al., 2015; Cerboneschi et al., 2016). Our analysis detected the presence of the genes iaaL and matE in all of the Chilean Psa isolates (**Figure 4B**), which are in tandem in the Hrp regulon and are associated with IAA-Lys production. However, IAA-Lys production was not detected using the LC-ESI-MS/MS analysis. The identity of the iaaL and matE genes was also confirmed using Sanger sequencing in strains Psa 889, Psa 743, and Psa 598 (**Supplementary Table S3**). Taken together, these results strongly suggest that the Chilean Psa isolates produce IAA using a Trp-dependent pathway.

## DISCUSSION

### Genetic Analysis of the Chilean Psa Isolates

Pseudomonas syringae pv. actinidiae was first isolated in Chile in 2010, and since then, it has been considered to be a quarantine pest under the official control of the SAG of Chile. The 18 Chilean Psa isolates included in this study were obtained as part of the monitoring program established by the SAG. They were isolated from the central-south region of Chile, which is the zone that accounts for the majority of Psa infections reported in the country (Servicio Agrícola y Ganadero [SAG], 2018). These strains were obtained between the years 2012 and 2016, representing the most extended study performed on Psa in Chile. All of these strains were identified by the SAG and then confirmed by the standard molecular techniques used with this pathovar (Rees-George et al., 2010; Gallelli et al., 2011). As reported previously, ITS amplification with specific primers was not specific to Psa and also amplified a fragment from P. syringae (Vanneste, 2013). Therefore, a duplex-PCR analysis was necessary to positively identify the Psa isolates.

The MLSA confirmed that the Chilean Psa isolates belong to biovar 3. In this case, four housekeeping genes were used (gyrB, rpoD, gltA, and gapA), which seems to be sufficient to discriminate between biovar 3 and the other biovars; however, it is not sufficient to distinguish between biovars 2 and 5, which according to previous research, are very closely related (Fujikawa and Sawada, 2016). This phylogenetic analysis included sequences from several Psa strains with different biovars and origins, including some older Chilean strains that were also grouped in biovar 3. This suggests that this "hypervirulent" group (Ciarroni et al., 2015) is the only one found in Chile, and no other biovar has entered or emerged. The conclusions of this study are consistent with previous research in which the Chilean Psa isolates were classified in the Psa Biovar 3 cluster using different approaches: REP-PCR fingerprinting, MLVA (multiple locus variable number of tandem repeats analysis) assay and MLST (Ferrante and Scortichini, 2010, 2015; Vanneste et al., 2010; Ciarroni et al., 2015; Biondi et al., 2017). Genomic analyses of the Chilean Psa strains suggest that they originated from China, forming a sub-group in biovar 3 (Butler et al., 2013; Ciarroni et al., 2015).

Nearly 50 putative effector genes have been identified in Psa and are found in most of the biovars (McCann et al., 2013; Ferrante and Scortichini, 2015; Fujikawa and Sawada, 2016). Sixteen type III effector genes, among others, were identified in all of the Chilean Psa isolates, including genes that were reported in conjugative DNA plasmids in other biovar 3 Psa strains (hopAV1 and hopAU1). The emergence of resistant strains as an evolutionary response to the use of antimicrobial compounds was observed in countries affected by recent outbreaks of Psa biovar 3 strains (Han et al., 2004; Vanneste, 2013; Colombi et al., 2017).

#### Phenotypic Features of the Chilean Psa Isolates

The results of this study show a high phenotypic homogeneity. However, it is still possible to observe differences between specific features and specific strains. For instance, the biochemical profile shows differences between the various Chilean Psa strains (**Supplementary Table S6**). These differences are related to carbon source utilization and chemical susceptibility assays. Moura et al. (2015) reported similar results with different Psa isolates from Portugal. Using the BIOLOG system, they observed differences in the ability to use at least 12 different carbon sources among the Portuguese strains. Interestingly, both the Chilean and Portuguese strains varied in their ability to use methyl pyruvate, bromo-succinic acid, and acetoacetic acid as carbon sources showing that variations in the biochemical repertory are not exclusive to the Chilean strains. Both groups of strains are susceptible to minocycline, lithium chloride, and sodium butyrate. The Chilean Psa strains are also resistant to antibiotics not used in agriculture such as rifamycin SV or vancomycin. However, curiously they were susceptible to streptomycin (MIC 3.9 µg/mL) that, in the past, has been authorized for use to control Psa infections in Chile. This suggests that no resistance has evolved among the Chilean Psa strains, in contrast to what has been reported by others where Psa strains can have a MIC for streptomycin greater than 2,000 µg/mL (Cameron and Sarojini, 2014). A similar situation has been observed for copper resistance in which other studies have reported Psa strains with a MIC from 100 µg/mL to more than 1,000 µg/mL (Cameron and Sarojini, 2014), while the Chilean strains have a MIC of 75 µg/mL. The absence of resistance among the Chilean Psa strains could be due to multiple factors such as low selective pressures from the environment or low plasticity in the Psa genome of these strains. 
However, it is not possible to rule out the existence of resistant Chilean Psa strains in the environment. Our results do not show a clear correlation between these differences in the biochemical profiles and the origin or isolation year of the strains, but it would be interesting to determine if these differences have any relevance for fitness or niche colonization in the natural environment of Psa.

All of the Chilean isolates demonstrate a similar range of swimming motility (**Figure 3**) with strain Psa 889 being the only exception that lacks motility. In contrast, none of the Chilean Psa strains show swarming motility, except for strain Psa 510 that demonstrates a slightly but significantly greater amount of displacement than the other strains. The differences observed between strains Psa 889 and Psa 510 are probably related to alterations in their flagella, since no differences were observed in the growth of any of the strains according to our analysis (**Supplementary Figure S5**). Flagellar motility is an important virulence factor that allows the infection of plants through natural openings on their tissue surfaces (Ichinose et al., 2013). Therefore, it remains to be determined if these differences in strains Psa 889 and Psa 510 are correlated with alterations in their virulence.

Psa infections are very persistent, and once they are detected in a region, it is very difficult or even impossible to eradicate the bacteria (Vanneste, 2017). This persistence could be related to the ability to endure environmental conditions through biofilm formation (Danhorn and Fuqua, 2007; Renzi et al., 2012). It has been reported that Psa can form biofilm (Ghods et al., 2015). However, our analysis showed that the Chilean Psa strains are not able to form biofilms over abiotic solid surfaces. This and other differences observed between the Chilean Psa strains and the other Psa are probably related to the unique clonal origin of the Psa strains present in Chile (Butler et al., 2013). However, the low affinity to form biofilms over solid surfaces has been observed in the P. syringae pathovars (Ueda and Saneoka, 2015). Therefore, it seems that biofilm formation is not a hallmark of this species. Interestingly the Chilean strains do form a thin layer of cells at the air–liquid interface in liquid cultures. This phenomenon has been described for other Pseudomonas species where an air–liquid interface would represent a favorable environment due to the oxygen access enabling a more rapid rate of growth (Constantin, 2009; Ueda and Saneoka, 2015). All of these results confirm the high degree of homogeneity among the different Chilean Psa strains. Further studies are needed to determine if the differences between the Chilean strains affect the colonization and infection of the kiwifruit plants.

#### Indole Production in Psa Isolates

Several phytopathogens, including P. syringae pathovars, produce auxins that can alter the host's physiology and promote plant susceptibility to infection (Glickmann et al., 1998; Cerboneschi et al., 2016). To our knowledge, this is the first report showing that Psa can produce indole using a Trp-dependent pathway. All of the Chilean Psa strains evaluated produce indole, some of them at levels similar to A. brasilense, which is a plant growth promoting bacterium (Masciarelli et al., 2013). The common route for IAA production in P. syringae pathovars is via the IAM pathway using the enzymes IaaM and IaaH. This pathway has been studied in P. syringae pv. syringae (Pss) and Psav (Glickmann et al., 1998; Baltrus et al., 2011; Aragón et al., 2014; Cerboneschi et al., 2016), and the only related report in Psa is from a strain isolated in 1984 belonging to biovar 1 which has putative ORFs of an IAM pathway (Baltrus et al., 2011). The Chilean Psa strains have the genes aldA and aldB, which are associated with an alternative synthesis route of IAA recently found in P. syringae pv. tomato (McClerklin et al., 2018). Therefore, this is the most probable pathway in the Chilean Psa strains. Interestingly, bioinformatics analysis revealed that the genes iaaH and iaaM, associated with the common synthesis route of IAA, are only found in the Psa strains from biovar 4, which are now considered to be a new pathovar designated P. syringae pv. actinidifoliorum that is characterized by low virulence in kiwifruit plants (Abelleira et al., 2015). In this regard, the presence of the IAM pathway represents another distinctive feature differentiating the former biovar 4 from the other Psa biovars.

Pseudomonas syringae pv. tomato and other species, such as P. savastanoi pv. nerii, also produce the enzyme IAA-lysine ligase, encoded by the iaaL gene, which is responsible for IAA-Lys production (Glickmann et al., 1998; Castillo-Lizardo et al., 2015; Cerboneschi et al., 2016). In the P. syringae pv. tomato (Pst) genome, iaaL is found in synteny with the matE gene that encodes a multidrug transporter of the MatE family. The analysis of the Chilean Psa strains revealed that all of the strains contain the genes iaaL and matE. A bioinformatic analysis showed that the iaaL gene was first annotated as a pre-protein translocase subunit Tim44 in several P. syringae pathovars; however, later it was annotated as an indoleacetate-lysine ligase gene in P. syringae pv. tomato (Castillo-Lizardo et al., 2015). According to this analysis, the matE and iaaL genes are conserved in Psa Biovar 1, 2, 3, and 5 strains with near 100% identity in their amino acid sequences (**Supplementary Figure S6**). There are reports on the importance of IAA production, and IAA-Lys in particular, in the virulence of P. syringae. For instance, mutations in the IAM pathway of Pss affect its growth in Phaseolus vulgaris (Mazzola and White, 1994), and the deletion of the aldA, aldB, iaaL, or matE genes in P. syringae pv. tomato results in a reduction in fitness, colonization, and virulence in infected tomato plants (Castillo-Lizardo et al., 2015; McClerklin et al., 2018). In addition, studies on the IAA-Lys effect on plants suggest that IAA conjugation can modulate hormone action and suppress the immune response (Romano et al., 1991). Our results show that the Chilean Psa strains produce IAA. However, we were not able to demonstrate IAA-Lys production. Despite this, the presence of the genes iaaL and matE in the Chilean and other Psa strains, including different biovars, raises the possibility that this compound could be produced in conditions other than those evaluated in this study.
To date, the exact mechanism of action of IAA and IAA-Lys in the virulence of P. syringae species is not totally understood. The results presented here show that the Chilean Psa strains produce IAA, but it is unknown if this feature is shared with other Psa strains of biovar 3 and other biovars. The results represent the starting point to determine the mechanisms and regulation of IAA production (and possibly IAA-Lys) in Psa and its participation during infection in kiwifruit plants.

### CONCLUSION

The results of this study confirm that the Chilean Psa isolates belong to biovar 3. The isolates exhibit high homogeneity with phenotypic differences in specific isolates. This study is also the first report of Psa strains producing IAA using a Trp-dependent pathway. Several reports suggest that this compound may be related to virulence in P. syringae pathovars. Therefore, it would be interesting to determine whether this feature plays a role during bacterial canker in kiwi plants and to evaluate whether this is a common characteristic in different biovars of this pathovar.

### AUTHOR CONTRIBUTIONS

OF, CY, XB, and RB conceived and designed the study, and analyzed the results. OF, CP, and MN performed the experiments. AV and CM performed the LC-ESI-MS/MS analysis. OF and RB wrote the manuscript. All authors reviewed and approved the final manuscript.

### FUNDING

This work was financially supported by CONICYT grants FONDEF/II Concurso IDeA en Dos Etapas ID15I10032 and FONDECYT Postdoctorado 2017 No. 3170567.

### ACKNOWLEDGMENTS

The authors wish to acknowledge the Agricultural and Livestock Service (SAG) for facilitating the Psa isolate collection, the Chilean Kiwifruit Committee for support assistance, and Dr. Paula Salinas and Dr. Jorge Olivares for providing bacterial strains.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.01907/full#supplementary-material

#### REFERENCES

Abelleira, A., Ares, A., Aguin, O., Peñalver, J., Morente, M. C., López, M. M., et al. (2015). Detection and characterization of Pseudomonas syringae pv. actinidifoliorum in kiwifruit in Spain. J. Appl. Microbiol. 119, 1659–1671. doi: 10.1111/jam.12968


sequencing and comparative genomics of 19 Pseudomonas syringae isolates. PLoS Pathog. 7:e1002132. doi: 10.1371/journal.ppat.1002132


bacterial canker on yellow kiwifruit (Actinidia chinensis) in central Italy. Plant Pathol. 59, 954–962. doi: 10.1111/j.1365-3059.2010.02304.x


pv. actinidiae provides insight into the origins of an emergent plant disease. PLoS Pathog. 9:e1003503. doi: 10.1371/journal.ppat.1003503


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Flores, Prince, Nuñez, Vallejos, Mardones, Yañez, Besoain and Bastías. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Plant Microbiome and Its Link to Plant Health: Host Species, Organs and Pseudomonas syringae pv. actinidiae Infection Shaping Bacterial Phyllosphere Communities of Kiwifruit Plants

#### Edited by:

Marco Scortichini, Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria (CREA), Italy

#### Reviewed by:

Vardis Ntoukakis, University of Warwick, United Kingdom Dawn Arnold, University of the West of England, United Kingdom Brian H. Kvitko, University of Georgia, United States

#### \*Correspondence:

Francesco Spinelli francesco.spinelli3@unibo.it

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Plant Science

Received: 06 June 2018 Accepted: 05 October 2018 Published: 07 November 2018

#### Citation:

Purahong W, Orrù L, Donati I, Perpetuini G, Cellini A, Lamontanara A, Michelotti V, Tacconi G and Spinelli F (2018) Plant Microbiome and Its Link to Plant Health: Host Species, Organs and Pseudomonas syringae pv. actinidiae Infection Shaping Bacterial Phyllosphere Communities of Kiwifruit Plants. Front. Plant Sci. 9:1563. doi: 10.3389/fpls.2018.01563

Witoon Purahong<sup>1</sup>†, Luigi Orrù<sup>2</sup>†, Irene Donati<sup>3</sup>, Giorgia Perpetuini<sup>3</sup>, Antonio Cellini<sup>3</sup>, Antonella Lamontanara<sup>2</sup>, Vania Michelotti<sup>2</sup>, Gianni Tacconi<sup>2</sup> and Francesco Spinelli<sup>3</sup>\*

<sup>1</sup> Department of Soil Ecology, Helmholtz Center for Environmental Research - UFZ, Halle, Germany, <sup>2</sup> CREA Research Centre for Genomics and Bioinformatics – Fiorenzuola d'Arda, Italy, <sup>3</sup> Department of Agricultural and Food Sciences, Alma Mater Studiorum – Università di Bologna, Bologna, Italy

Pseudomonas syringae pv. actinidiae (Psa) is the causal agent of bacterial canker, the most devastating disease of kiwifruit vines. Before entering the host tissues, this pathogen has an epiphytic growth phase on kiwifruit flowers and leaves; thus, the ecological interactions within the epiphytic bacterial community may greatly influence the onset of the infection process. The bacterial community associated with the two most important cultivated kiwifruit species, Actinidia chinensis and Actinidia deliciosa, was described both on flowers and leaves using Illumina massive parallel sequencing of the V3 and V4 variable regions of the 16S rRNA gene. In addition, the effect of plant infection by Psa on the epiphytic bacterial community structure and biodiversity was investigated. Psa infection affected the phyllosphere microbiome structures in both species; however, its impact was more pronounced on A. deliciosa leaves, where a drastic drop in microbial biodiversity was observed. Furthermore, we showed that Psa was always present in syndemic association with Pseudomonas syringae pv. syringae and Pseudomonas viridiflava, two other kiwifruit pathogens, suggesting the establishment of a pathogenic consortium leading to a higher pathogenesis capacity. Finally, the analyses of the dynamics of bacterial populations provided useful information for the screening and selection of potential biocontrol agents against Psa.

Keywords: Actinidia chinensis, Actinidia deliciosa, epiphytic community, metagenome, bacterial biocoenosis, biocontrol, bacterial canker

## INTRODUCTION

The plant microbiome plays a crucial role in plant health and productivity and, thus, has received significant attention in recent years (Turner et al., 2013). Plant microbiome studies have mainly focused on model plants, such as Arabidopsis thaliana, and on economically important crop species including barley (Hordeum vulgare), corn (Zea mays), rice (Oryza sativa), soybean (Glycine max), and wheat (Triticum aestivum), whereas less attention has been given to fruit crops and tree species (Busby et al., 2017). Plant microbiomes are shaped by both plant-related (i.e., genotype, organ, species, health status etc.) and environmental factors (i.e., management, land use and climate) (Bringel and Couée, 2015). Although plant health status is reported in some studies to be reflected in or linked to its microbiome (Berendsen et al., 2012; Turner et al., 2013; Berg et al., 2014), this aspect is still unclear and requires further empirical evidence. Thus, in fruit crop species, it is still uncertain how infectious diseases alter the microbiome of the infected organs.

Pseudomonas syringae pv. actinidiae (Psa) is the causal agent of bacterial canker of kiwifruit, which is the major threat to kiwifruit production worldwide (Scortichini et al., 2012; Vanneste, 2012). The pathogen can infect both Actinidia chinensis and A. deliciosa plants, the two most important commercial species (Donati et al., 2014). So far, no resistant genotype has been found, but, generally, A. deliciosa varieties are considered less susceptible than the ones belonging to A. chinensis (Spinelli et al., 2011). Before infecting the plant, the pathogen grows on the epiphytic surfaces of Actinidia flowers and leaves. After this epiphytic phase, infection occurs via natural openings such as stomata on leaves or stylar tissues on flowers (Donati et al., 2018), or via natural wounds, such as broken trichomes (Spinelli et al., 2011). Once Psa enters the host tissues, the infection rapidly becomes systemic, leading to the death of the host plant (Renzi et al., 2012; Scortichini et al., 2012; Donati et al., 2014). Therefore, the understanding of Psa interactions with the phyllosphere microbial community could provide essential information for developing innovative, effective and long-lasting control strategies. To date, no sustainable and completely effective control methods have been developed for this disease, and control mainly relies on the use of copper formulations (Collina et al., 2016; Scortichini, 2016). However, the increasing concerns about the environmental risks caused by the widespread use of xenobiotic pesticides have led institutions, such as the European Commission, to develop regulations to restrict their use (Commission Regulation [EC], 2008. 396/2005/EC, 149/2008/EC). A sustainable and environment-friendly alternative to chemical pesticides for controlling disease in the phyllosphere is the use of biological control agents (BCAs) (Wicaksono et al., 2018).
Indeed, the phyllosphere represents an ecological niche with pivotal agricultural and biological significance (Whipps et al., 2008; Vorholt, 2012), and the bacterial epiphytic community can positively impact plant health, physiology and environmental fitness (Kim et al., 2011; Vorholt, 2012; Dees et al., 2015). Several epiphytic bacterial species isolated from the phyllosphere have been reported to be strong competitors against plant pathogens, thus acting as BCAs (Volksch and May, 2001). In addition to direct competition for limited space and nutrients, some BCAs can also inhibit pathogen growth by secreting antimicrobial compounds (e.g., Pantoea agglomerans, Lactobacillus plantarum), or interfering with the pathogen signalling system (Volksch and May, 2001). Finally, other epiphytic bacteria are known to exert a plant growth-promoting activity and induce natural plant resistance against pathogens (Ryu et al., 2003; Ottesen et al., 2013; Rastogi et al., 2013). For the control of Psa, strains of Pseudomonas fluorescens, Bacillus subtilis, Bacillus amyloliquefaciens and, more recently, Lactobacillus plantarum have been tested as possible BCAs (Gould et al., 2015; Collina et al., 2016; Yakhin et al., 2017).

Screening and selection of new BCAs have mainly focused on the identification of single bacterial species effective against a specific pathogen. However, under natural conditions, bacteria live in communities regulated by interspecies signalling (Ryan and Dow, 2008) and, thus, the modern approach to enhance plant growth and health is to elucidate the effect of small microbial consortia against pathogens or on plant host resistance induction (Sarma et al., 2015). Several studies highlighted that, in comparison to the use of single beneficial species, the application of microbial consortia may improve efficacy, reliability and consistency of the growth and health promotion under a wider range of environmental conditions (Stockwell et al., 2011). In this view, the beneficial effect on plant health is the result of the combined and synergistic interaction of multiple bacterial species, each with a specific positive effect (Kim et al., 2011; Sarma et al., 2015).

Understanding the dynamics and evolution of the bacterial community on the phyllosphere may also provide crucial information on other factors influencing the Psa infection process. In fact, symbiotic interactions among different microbial species, leading to a pathogenic consortium, may increase disease incidence and development (Lamichhane and Venturi, 2015). Growing evidence indicates that pathogens do not operate independently, but that their virulence is mediated by their interaction with other pathogens (Singer, 2010; Lamichhane and Venturi, 2015). This phenomenon has led researchers to develop the concept of the pathobiome, i.e., a community in which pathogens participate in complex interactions with their biotic environment (Vayssier-Taussat et al., 2014). The importance of the interactions among pathogens is well recognised in human health (Singer, 2010). For example, in the medical field, the term syndemic indicates the synergistic interactions among diseases (Singer and Clair, 2003). Even though some cases of bacterial pathogen co-occurrence have been described in plants, the impact of pathogen interactions on plant diseases has received far less attention (Kùdela et al., 2010; Lamichhane and Venturi, 2015). In kiwifruit, Pseudomonas pathogens, such as P. syringae pv. syringae and P. viridiflava, often occur together but their interaction is still unclear (Balestra et al., 2008; Petriccione et al., 2017).

The main aims of this study were (i) to investigate the bacterial phyllosphere communities on leaves and flowers of two kiwifruit species (A. deliciosa cv. Hayward and A. chinensis cv. Hort16A) using Illumina sequencing of the V3 and V4 variable regions of the 16S ribosomal gene; (ii) to verify the relative importance of plant species, organ and Psa infection in shaping bacterial phyllosphere communities; (iii) to quantify (by quantitative real-time polymerase chain reaction, qPCR) the abundance in different plant organs of Pseudomonas pathogens (i.e., Psa, P. syringae pv. syringae and P. viridiflava) and BCAs (i.e., Lactobacillus plantarum, Pantoea agglomerans, Bacillus subtilis, B. amyloliquefaciens, and P. fluorescens) in relation to plant health status. The experimental approach allowed us to highlight the possible contribution of the species-specific microbiome to the different susceptibility of A. deliciosa cv. Hayward and A. chinensis cv. Hort16A to Psa.

#### MATERIALS AND METHODS

#### Sample Collection


Leaves and flowers were collected from A. deliciosa cv. Hayward and A. chinensis cv. Hort16A plants grown in commercial orchards located in the Faenza region (Emilia Romagna, Italy). In those orchards, the average disease incidence in the previous season was 8 and 21% in Hayward and Hort16A, respectively. At shoot emergence and beginning of blooming, an extensive screening was performed to discriminate uninfected from infected plants. For this purpose, 10 flowers and 10 leaves per plant were sampled and Psa contamination was assessed according to Gallelli et al. (2014). The study of the microbial community in the phyllosphere was performed separately on uninfected and infected plants. For this purpose, sampling was performed at full bloom for both kiwifruit species. Standard orchard pruning, fertilisation and irrigation were applied, and no chemical or biological pesticides were applied in the 6 months preceding sampling. Moreover, the orchards were naturally pollinated, with no assisted pollen application. Leaves were sampled in groups of 10 per plant, randomly chosen either in uninfected or infected vines. To minimise the effect of leaf position and age, only the third fully expanded leaf from each shoot of the same age was collected. Flowers were sampled at anther dehiscence in groups of 5 per plant, randomly selected either in uninfected or infected vines. Each leaf or flower sample was washed with 10 or 15 ml of sterile 10 mM MgSO<sub>4</sub> solution, respectively. To concentrate the bacterial load, the same washing solution was used for all samples inside a specific group. To extract all bacteria associated with the phyllosphere, washing was carried out for 15 min under gentle agitation (100 rpm) at 4°C to avoid mechanical tissue damage and bacterial multiplication. To verify the efficacy of the washing process, each washed leaf or flower was transferred to a new sterile solution of MgSO<sub>4</sub> and processed again as previously described. This washing solution was subsequently plated on LB-agar medium. Finally, the grouping of the different samples according to the presence or absence of Psa was confirmed by homogenising each group of leaves or flowers in a batch and processing according to Gallelli et al. (2014).

#### DNA Extraction and Sequencing

Libraries were prepared using the Illumina (San Diego, CA, United States) 16S metagenomic sequencing library preparation protocol, which allows the sequencing of the variable V3 and V4 regions of the 16S rRNA gene. Briefly, washing solutions were pelleted by centrifugation at 20,000 × g for 20 min at 4°C, then the supernatant was discarded. Pellets were pooled, immediately frozen in liquid nitrogen and stored at −80°C. From frozen pellets, genomic DNA was extracted and purified using the NucleoSpin® soil kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany) following the manufacturer's instructions. After determining its concentration and purity by spectrophotometer, the extracted genomic DNA was used as template for V3–V4 region amplification with the 16S Amplicon PCR Forward (5′ TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG) and Reverse (5′ GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC) primers following the PCR protocol suggested by Illumina.
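Both amplicon primers contain degenerate IUPAC positions (N, W, H, V), so each stands for a small family of oligonucleotides. The following minimal sketch shows how such a primer can be expanded into a regular expression for locating primer sites in reads. It is an illustration only, not part of the published pipeline: the split between overhang adapter and locus-specific portion, and the names `primer_to_regex` and `degeneracy`, are assumptions introduced here.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumed locus-specific 3' portions of the two primers quoted in the text
# (the bases that follow the overhang ending in ...AGAGACAG).
FORWARD_LOCUS = "CCTACGGGNGGCWGCAG"
REVERSE_LOCUS = "GACTACHVGGGTATCTAATCC"


def primer_to_regex(primer):
    """Translate a degenerate primer into a compiled regular expression."""
    parts = []
    for base in primer.upper():
        allowed = IUPAC[base]
        parts.append(allowed if len(allowed) == 1 else "[" + allowed + "]")
    return re.compile("".join(parts))


def degeneracy(primer):
    """Number of distinct non-degenerate sequences the primer represents."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


if __name__ == "__main__":
    print(primer_to_regex(FORWARD_LOCUS).pattern)
    print(degeneracy(FORWARD_LOCUS), degeneracy(REVERSE_LOCUS))
```

The compiled pattern can be used with `search` to find the primer site in a read before trimming it.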

PCR products were purified using the Agencourt® AMPure® XP Beads (Beckman Coulter Company, Brea, CA, United States). The quality of the final products was assessed using a Bioanalyzer 2100 (Agilent Technologies, Waldbronn, Germany) and quantified with a Qubit® fluorometer (Thermo Fisher Scientific, Waltham, MA, United States) following the manufacturer's protocol. Dual indices and Illumina sequencing adaptors were attached to the amplicons using the Nextera XT Index Kit (Illumina Inc., San Diego, CA, United States); the indexed amplicons were pooled in equal proportions and sequenced paired-end on an Illumina MiSeq (Illumina Inc., San Diego, CA, United States) at IGA Technology Services (Udine, Italy). To prevent focusing and phasing problems due to the sequencing of "low diversity" libraries such as 16S amplicons, 30% PhiX genome was spiked into the pooled library.

#### Bioinformatic Analysis

Raw reads were first processed with Trimmomatic (Bolger et al., 2014) to remove low-quality reads using a sliding window of 5 bp length with an average phred score ≥ 20. Sequences shorter than 100 bases were discarded. The 16S rRNA sequences were analysed using the Mothur software package version 1.35.1 (Schloss et al., 2009). The paired-end reads were assembled and aligned to the SILVA 16S rRNA sequence database (Pruesse et al., 2007). Sequences were de-noised to remove sequencing errors with the command "pre.cluster" and chimeric sequences were removed using the Uchime algorithm (Edgar et al., 2011) implemented in Mothur. Sequences were clustered into OTUs at 96% sequence identity using the nearest neighbour clustering method. The sequences were classified using the reference Ribosomal Database Project (RDP) database provided in Mothur. OTUs that were singletons and doubletons were removed. The samples were normalised to 6,886 sequences each (the size of the smallest sample) to ensure that the analysis was not influenced by differential sequencing depths. The bacterial 16S rRNA gene Illumina sequencing data are deposited in the NCBI BioProject library (Accession: PRJNA472855, ID: 472855). The bacterial taxonomic table (with bacterial relative abundance data) is given in **Supplementary Table S1**.
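The last two filtering steps (removal of singleton and doubleton OTUs, then normalisation to the size of the smallest sample) can be illustrated with a short sketch. This is not the Mothur implementation; it assumes that rare OTUs are defined by their total count over all samples and that normalisation is a random subsampling without replacement. The function names and the toy table are hypothetical.

```python
import random


def drop_rare_otus(table, min_total=3):
    """Drop OTUs (columns) whose summed count over all samples is below
    min_total; min_total=3 removes singletons and doubletons."""
    n_otus = len(table[0])
    keep = [j for j in range(n_otus)
            if sum(row[j] for row in table) >= min_total]
    return [[row[j] for j in keep] for row in table]


def rarefy(table, depth=None, seed=0):
    """Subsample each sample (row) to `depth` reads without replacement.
    If depth is None, the size of the smallest sample is used."""
    if depth is None:
        depth = min(sum(row) for row in table)
    rng = random.Random(seed)
    out = []
    for row in table:
        if sum(row) < depth:
            raise ValueError("sample has fewer reads than the requested depth")
        # One OTU index per read, then draw `depth` reads at random.
        reads = [j for j, n in enumerate(row) for _ in range(n)]
        picked = rng.sample(reads, depth)
        out.append([picked.count(j) for j in range(len(row))])
    return out


if __name__ == "__main__":
    # Invented counts: rows are samples, columns are OTUs.
    toy = [[50, 30, 1, 0], [10, 5, 0, 1], [20, 20, 0, 0]]
    filtered = drop_rare_otus(toy)
    print(rarefy(filtered))
```

After this step every sample has the same number of reads, so richness and diversity estimates are compared at equal sequencing depth.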

#### qPCR Analysis

The primer sets used in this study are listed in **Table 1**. New primer sets were designed based on L. plantarum WCFS1 and P. syringae pv. actinidiae RC3 sequences available at the National Center for Biotechnology Information (NCBI<sup>1</sup>). Appropriate primers were designed using the online programme Primer3Plus<sup>2</sup> (Untergasser et al., 2012). The BLAST search software (Basic Local Alignment Search Tool<sup>3</sup>) was used to check the specificity of each primer set. Properties of each primer were verified by OligoAnalyzer 3.1. Primer specificity was validated by melt curve analysis and end point PCR performed with the same protocol adopted for qPCR (see below), using AmpliTaq Gold® 360 enzyme and Master Mix (Thermo Fisher Scientific, Waltham, MA, United States).

<sup>1</sup>http://www.ncbi.nlm.nih.gov

<sup>2</sup>http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi/

TABLE 1 | Primers used in this study to reveal the presence of the indicated bacterial species.

qPCR analyses were performed using SYBR Green fast master mix chemistry (Applied Biosystems, Foster City, CA, United States) in a 96-well spectrofluorometric thermal cycler StepOnePlus® (Thermo Fisher Scientific, Waltham, MA, United States). DNA concentration was adjusted to 100 ng. All reactions were performed in triplicate, with the following thermal profile: 1 cycle at 50°C (2 min), 1 cycle at 95°C (10 min), 40 cycles of 95°C (15 s) and 60°C (30 s). The temperature was raised by 0.3°C every 10 s from 63 to 95°C to obtain the melting temperature. To quantify the bacterial titre of the samples, standard curves were generated for each bacterial species tested by plotting cycle threshold (Ct) values versus bacterial cell titre, as measured by plating 10-fold dilutions of the same sample on LB-agar medium (Lyons et al., 2000). Upon verification by genome BLAST, the average number of detector gene copies per genome was assumed to be 1.0 for each species.
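The standard-curve quantification described above amounts to fitting a straight line of Ct against log<sub>10</sub> cell titre and inverting it for unknown samples. The sketch below illustrates this under the usual assumption of a log-linear relationship; the function names are hypothetical and the dilution series is invented for the example, not taken from the study.

```python
def fit_standard_curve(log10_cells, ct_values):
    """Least-squares fit of Ct = slope * log10(cells) + intercept."""
    n = len(log10_cells)
    mean_x = sum(log10_cells) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_cells)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_cells, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 = perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


def ct_to_cells(ct, slope, intercept):
    """Invert the standard curve to estimate the cell titre for a Ct value."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Invented 10-fold dilution series (log10 cells) and Ct readings.
    dilutions = [3, 4, 5, 6, 7]
    cts = [30.0, 26.68, 23.36, 20.04, 16.72]
    slope, intercept = fit_standard_curve(dilutions, cts)
    print(slope, intercept, amplification_efficiency(slope))
    print(ct_to_cells(23.36, slope, intercept))
```

A slope close to −3.32 corresponds to an efficiency close to 100%, which is a common sanity check before a curve is used for quantification.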

#### Meta-Analysis of Bacterial Association

The correlation among natural epiphytic populations of Psa and other bacteria (P. syringae pv. syringae, P. viridiflava, P. fluorescens, Pantoea agglomerans/vagans, and Lactobacillus spp.) was evaluated based on data obtained between 2012 and 2016, relating to A. deliciosa cv. Hayward samples collected in the same area and season as the samplings for metagenomic analysis. Each sample was individually washed in 10 ml of sterile 10 mM MgSO<sub>4</sub> solution. Bacterial quantification was performed on the wash by qPCR as described above.

#### Statistical Analysis

To assess the coverage of the sequencing depth, individual rarefaction analysis was performed for each sample using the "diversity" function in PAST 3.0 (Hammer et al., 2001). Alpha diversity indices (Pielou, Inverse Simpson and Shannon) were analysed after normalisation using Mothur. Similarity Percentages (SIMPER) analysis in PAST was used to calculate the average dissimilarity and to obtain the identity and relative abundances of the bacterial taxa that contributed most to the observed pair-wise variation in the bacterial community composition due to different kiwifruit species (healthy plants), organs (healthy plants) and pathological status (healthy vs. diseased plants). Principal component analysis (PCA) based on the correlation matrix was carried out in PAST to display the clustering of samples according to the variance in qPCR population analysis.
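As an illustration of what a SIMPER analysis computes, the following sketch breaks the Bray-Curtis dissimilarity between two groups of samples down into per-taxon contributions. It is a simplified stand-in for the PAST routine, assuming Bray-Curtis as the dissimilarity measure and averaging over all between-group sample pairs; the function name and the toy abundances are hypothetical.

```python
def simper(group_a, group_b):
    """Return the overall average Bray-Curtis dissimilarity (%) between two
    groups of samples and the percentage of it contributed by each taxon.
    Each group is a list of samples; each sample is a list of abundances."""
    n_taxa = len(group_a[0])
    contrib = [0.0] * n_taxa
    n_pairs = 0
    for a in group_a:
        for b in group_b:
            denom = sum(a) + sum(b)
            # Per-taxon term of the Bray-Curtis dissimilarity for this pair.
            for i in range(n_taxa):
                contrib[i] += abs(a[i] - b[i]) / denom
            n_pairs += 1
    contrib = [c / n_pairs for c in contrib]
    overall = sum(contrib)
    if overall == 0:
        # Identical groups: nothing to apportion.
        return 0.0, [0.0] * n_taxa
    return 100 * overall, [100 * c / overall for c in contrib]


if __name__ == "__main__":
    # Invented abundances for three taxa in one sample per group.
    overall, shares = simper([[80, 20, 0]], [[20, 20, 60]])
    print(overall, shares)
```

Sorting the taxa by their share gives the ranked list of OTUs that drive the difference between the two groups.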

Multiple regression was performed on bacterial populations to test their association with Psa, using Statistica ver. 7.0 (Statsoft, Inc., Tulsa, OK, United States). The analysis was restricted to samples positive for Psa, and the data were log<sub>10</sub>-transformed before analysis. Statistical significance was assumed for P < 0.05.
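The regression set-up described above (Psa-positive samples only, log<sub>10</sub>-transformed populations) can be sketched as an ordinary least-squares fit. This is an illustrative reconstruction rather than the Statistica procedure: it assumes that all predictor populations are strictly positive, it does not compute P values, and the function names and example data are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def log10_multiple_regression(psa, predictors):
    """Least-squares fit of log10(Psa) on log10(predictor populations),
    using Psa-positive samples only. Returns [intercept, b1, b2, ...].
    Assumes every predictor value is > 0 (zeros would need special handling)."""
    rows = [(math.log10(y), [1.0] + [math.log10(v) for v in xs])
            for y, xs in zip(psa, predictors) if y > 0]
    k = len(rows[0][1])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(x[i] * x[j] for _, x in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(x[i] * y for y, x in rows) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Invented populations; the last sample is Psa-negative and is skipped.
    psa = [10 ** 1.5, 10.0, 10 ** 1.5, 1.0, 1.0, 0.0]
    others = [[10, 1], [100, 10], [1000, 10], [100, 100], [10000, 1000], [10, 10]]
    print(log10_multiple_regression(psa, others))
```

A negative coefficient for a predictor indicates that its population tends to decrease as the Psa population increases, on the log scale.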

### RESULTS

#### Description of the Epiphytic Bacterial Microbiome of Kiwifruit Plant

In this study, the leaf- and flower-associated microbiota of two kiwifruit species, A. deliciosa and A. chinensis, were analysed. After the normalisation step at 6,886 sequences per sample, a total of 1,050 bacterial OTUs were retrieved. The rarefaction curves were close to saturation, suggesting that the OTUs recovered in this study nearly represented the whole bacterial genetic diversity (**Supplementary Figure S1**). The OTUs were assigned to 16 phyla and 220 different genera (**Supplementary Table S1**). Proteobacteria was the most abundant phylum, representing about 77.4% of the total contigs, followed by Firmicutes (10.7%), Actinobacteria (6.1%), and Bacteroidetes (3.5%) (**Figure 1**). At the OTU level, the most abundant OTUs were identified as Pseudomonas OTU 00001 (31.5%), two unclassified genera from the Enterobacteriaceae family, OTU 00002 and OTU 00004, representing together about 15.3% of the total sequences, Sphingomonas OTU 00003 (5.8%) and Massilia OTU 00005 (4.9%). The 10 most abundant OTUs accounted for approximately 70% of total sequences (**Figure 1** and **Supplementary Table S1**).

<sup>3</sup>http://www.ebi.ac.uk/blastall/nucleotide.html

#### Species-Specificity of Epiphytic Bacterial Microbiome

The bacterial community was shaped by the species of kiwifruit plants (**Figure 2**). In fact, the overall average dissimilarity between leaves or flowers of the two species was 78.27 and 63.26%, respectively. In leaves, two unclassified genera (OTU 00004 and 00011) belonging to Enterobacteriaceae accounted for about 23% of the dissimilarity. The genus Pseudomonas (OTU 00001) also accounted for approximately 20% of the dissimilarity, being 13 times more abundant on A. deliciosa than on A. chinensis leaves.

On flowers, Enterobacteriaceae (OTU 00002) accounted for approximately 27% of dissimilarity, followed by the genus Sphingomonas (8.4%, OTU 00003). The genera Arthrobacter (OTU 00013) and Bacillus (OTU 00009) accounted for 5 and 4% of dissimilarity, respectively. Pseudomonas (OTU 00001) accounted for only 2.7% of dissimilarity.

The influence of kiwifruit plant species was confirmed also by the three biodiversity indices determined: Shannon (H′), Inverse Simpson (1/D′) and Pielou (J′). A higher biodiversity was observed in A. chinensis leaves (H′ = 4.02; 1/D′ = 11.93; J′ = 0.65) than in A. deliciosa ones (H′ = 2.99; 1/D′ = 7.16; J′ = 0.54). Similarly, higher values of all the considered indices were found for A. chinensis (H′ = 4.13; 1/D′ = 30.14; J′ = 0.70) than for A. deliciosa (H′ = 2.76; 1/D′ = 5.73; J′ = 0.49) flowers (**Table 2**).
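For reference, the three indices reported above can be computed from the OTU counts of a single sample as in the following sketch. It uses the textbook definitions (natural logarithm for Shannon, 1/Σp² for Inverse Simpson, H′/ln S for Pielou); the calculators in Mothur may apply different estimators, so the values need not match exactly. The function name and the example counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Shannon H', inverse Simpson 1/D' and Pielou J' from OTU counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    props = [c / total for c in counts]
    shannon = -sum(p * math.log(p) for p in props)
    inv_simpson = 1.0 / sum(p * p for p in props)
    # Pielou evenness is undefined for a single OTU; 0.0 is returned here
    # to denote complete dominance.
    pielou = shannon / math.log(len(props)) if len(props) > 1 else 0.0
    return shannon, inv_simpson, pielou


if __name__ == "__main__":
    # Invented counts: a perfectly even and a strongly dominated community.
    print(alpha_diversity([25, 25, 25, 25]))
    print(alpha_diversity([97, 1, 1, 1]))
```

In the even community J′ equals 1 and 1/D′ equals the number of OTUs, whereas dominance by one OTU pushes both indices down, which is the pattern observed after Psa infection.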

### Organ-Specific Epiphytic Bacterial Microbiome

In each kiwifruit species, flowers and leaves harboured a distinct bacterial microbiome. In fact, in A. chinensis the overall average dissimilarity between leaves and flowers was 71.16%, while for A. deliciosa it was 58.55% (**Figure 3**). In A. chinensis, an unclassified genus (OTU 00004) belonging to Enterobacteriaceae accounted for 16.48% of dissimilarity, followed by the genus Bacillus (OTU 00009, 5.13%). On the other hand, in A. deliciosa OTU 00002 belonging to Enterobacteriaceae accounted for about 29% of the dissimilarity, being 13 times more abundant on flowers, while the genus Pseudomonas (OTU 00001) accounted for approximately 19% of dissimilarity and was more abundant on leaves than on flowers (**Figure 3**).

These data are in agreement with the trend of the biodiversity indices. A. chinensis flowers hosted a more biodiverse epiphytic bacterial community (H′ = 4.13; 1/D′ = 30.14; J′ = 0.70) than leaves (H′ = 4.02; 1/D′ = 11.93; J′ = 0.65). On the other hand, in A. deliciosa, the bacterial community presented a higher complexity in leaves (H′ = 2.99; 1/D′ = 7.16; J′ = 0.54) than in flowers (H′ = 2.76; 1/D′ = 5.73; J′ = 0.49) (**Table 2**).


Ac, Actinidia chinensis var. HORT16A; Ad, Actinidia deliciosa var. Hayward; D, diseased; H, healthy; F, flower; L, leaf. For each sample, 6,886 sequences were analysed.

### Effect of Psa Infection on the Epiphytic Bacterial Microbiome

A detailed comparison of the bacterial community changes related to Psa infection showed that the overall average dissimilarity between the leaves and flowers of healthy and infected plants ranged from 79.10 to 66.70% in A. chinensis and from 64.45 to 35.41% in A. deliciosa (**Figures 4A,B**).

In A. chinensis, the substantial increase in the genus Pseudomonas (OTU 00001) in infected leaves and flowers contributed 8.25 and 47.86% of the dissimilarity, respectively (**Figure 4A**). A similar result was observed in A. deliciosa, where the genus Pseudomonas (OTU 00001) increased up to three and two times in leaves and flowers, respectively (**Figure 4B**).

A reduction in diversity of the bacterial community was observed after Psa infection, with the only exception of A. deliciosa flowers, as indicated by the biodiversity indices (**Table 2**). Psa infection caused a marked drop in population evenness and biodiversity in infected A. chinensis (flowers and leaves) and A. deliciosa (leaves only), with the dominance of few genera, mainly Pseudomonas (OTU 00001).

The Venn diagrams in **Figure 5** describe the distributions of unique and shared OTUs in healthy and diseased plants in the two species and tissues analysed. Psa infection had the strongest impact on the bacterial community of A. deliciosa leaves compared to the other conditions analysed, with only 37 OTUs being shared between healthy and diseased leaves. Furthermore, on infected leaves of A. deliciosa we identified some specific OTUs belonging to Oxalobacteraceae (OTU 00043, OTU 00186), Haemophilus (OTU 00076) and Moraxellaceae (OTU 00323) that were not present in healthy A. deliciosa leaves (**Supplementary Table S1**). Finally, a considerable number of high-abundance OTUs disappeared or showed a dramatic reduction on leaves from infected plants (**Supplementary Table S1**). A similar trend was observed in A. chinensis flowers and leaves. Also in this case, healthy plants were characterised by the presence of characteristic OTUs absent in diseased ones (e.g., Epilithonimonas OTU 00016, Porphyromonas OTU 00110, Bacteroides OTU 00111).
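The counts of unique and shared OTUs shown in a Venn diagram follow directly from presence/absence in the two samples being compared. A minimal sketch, assuming an OTU counts as present when it has at least one read after normalisation; the function name and the OTU labels are hypothetical:

```python
def venn_counts(healthy, diseased):
    """Count OTUs found only in the healthy sample, only in the diseased
    sample, or in both. Each argument maps an OTU label to its read count."""
    present_h = {otu for otu, n in healthy.items() if n > 0}
    present_d = {otu for otu, n in diseased.items() if n > 0}
    return {
        "healthy_only": len(present_h - present_d),
        "shared": len(present_h & present_d),
        "diseased_only": len(present_d - present_h),
    }


if __name__ == "__main__":
    # Invented read counts for four placeholder OTUs.
    healthy = {"otu_a": 120, "otu_b": 15, "otu_c": 0, "otu_d": 4}
    diseased = {"otu_a": 900, "otu_b": 0, "otu_c": 7, "otu_d": 0}
    print(venn_counts(healthy, diseased))
```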

On the other hand, in A. deliciosa flowers some OTUs were more abundant or only present in diseased samples (e.g., Pseudomonas OTU 00001, Propionibacterium OTU 00012, Weissella OTU 00029) (**Supplementary Table S1**).

#### Abundance and Dynamics of Pseudomonas spp. Pathogens and Putative Biocontrol Bacterial Agents in Relation to Plant Pathological Status

Quantitative PCR was applied to detect the occurrence, in relation to plant pathological status, of two other kiwifruit pathogenic bacteria, P. syringae pv. syringae (Pss) and P. viridiflava (Pv), and of bacterial species with potential biocontrol activity, such as P. agglomerans/vagans, L. plantarum, B. subtilis, B. amyloliquefaciens, and P. fluorescens (Choudhary and Johri, 2009; Savitha et al., 2013; Bonaterra et al., 2014; Dutkiewicz et al., 2016; Sharifazizi et al., 2017). Novel primer sets were developed for L. plantarum and Psa. Melting curve analysis (**Supplementary Figure S1**) revealed the specificity of the designed primers: a unique peak was observed, suggesting the specificity of the amplification, i.e., each primer pair amplified a unique locus targeted on the genome.

FIGURE 4 | Overall average dissimilarity between leaves and flowers of healthy and infected A. chinensis plants (A) and A. deliciosa (B) (H, healthy; D, diseased, infected with Pseudomonas syringae pv. actinidiae). For each combination of species and organ, the numbers of OTUs specific to H or D samples, or present in both of them, are indicated in the Venn diagrams. Top ten OTUs which contribute most to the overall average dissimilarity and their relative abundances are shown. The colour scale indicates the abundance ranking of the relative OTU: highest (red), mid-point with 50% percentile (yellow), lowest (green).

In Psa-infected plants, all three pathogens were present, although Pss and Pv populations were generally lower than Psa (**Figure 6**). Psa and Pss/Pv were associated in 62.5% of flowers. In non-infected samples, none of the pathogens was detected, with the only exception of a small amount of Pss on the leaves of A. chinensis (**Figure 6**). PCA revealed that healthy plants clustered together. In particular, diseased flowers of both species were mainly characterised by the presence of Pv, while Psa and Pss were mainly associated with diseased leaves (**Figure 7**). A significant correlation was found between epiphytic Psa and Pss/Pv populations, for values lower than 10<sup>9</sup> bacterial cells per flower (**Figures 7B,C**).

Regarding the bacterial species with potential biocontrol activity (P. agglomerans/vagans, L. plantarum, B. subtilis, B. amyloliquefaciens, and P. fluorescens), all of them were found in the healthy plant samples, while only L. plantarum appeared also in the corresponding infected organs, and P. fluorescens was present in diseased leaves, but not flowers (**Figure 8**). B. amyloliquefaciens appeared in some diseased samples, without connection to the plant species or organ. The other species were not present in diseased samples. P. agglomerans/vagans and P. fluorescens population sizes were inversely correlated with Psa, for pathogen populations lower than 10<sup>5</sup> and 10<sup>6</sup> bacterial cells per gramme of tissue, respectively. No significant correlation was found with Lactobacillus spp. Principal component analysis showed that diseased plants were well differentiated from healthy ones and that the presence of L. plantarum and of P. agglomerans/vagans was mostly distinctive of A. chinensis and A. deliciosa, respectively (**Figure 9**).

## DISCUSSION

### Microbial Biodiversity in Actinidia Phyllosphere

The phyllosphere supports complex microbial populations, and the phyllosphere microbiota can promote plant growth or exhibit biocontrol against various plant pathogens (Thapa et al., 2017). The abundance and spatial distribution of phyllosphere microbiota is to a large extent influenced by environmental factors, but host plant genotype also plays a key role (Bringel and Couée, 2015). Indeed, in previous research, a core community of 31 bacterial species, amounting to 99.8% of total sequences, was found on kiwifruit pollen samples regardless of the different geographical origins and year of collection (Kim M.J. et al., 2018). The present study provides a comprehensive description of the epiphytic bacterial communities on flowers and leaves of A. chinensis and A. deliciosa, the two main kiwifruit commercial species, and highlights their variability in relation to Psa infection. The differences in the microbiota structures were investigated also through the determination of three biodiversity indices. Shannon and Inverse Simpson indices were used to extrapolate the total richness from the observed OTUs. The former is the most widely used index based on species richness and is sensitive to changes in rare species, while the latter is preferred over other measures of alpha-diversity because it accounts for evenness in addition to the number of species. Finally, Pielou index provides information on species evenness, ranging from 0 to 1, with 1 representing perfect evenness and 0 complete dominance. In kiwifruit phyllosphere, Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes were the most abundant phyla. These phyla are considered as phyllosphere-associated generalists and have been found to be the most abundant phyla in the phyllosphere of several plant species (Bulgarelli et al., 2013; Bringel and Couée, 2015; Kim M.J. et al., 2018). The most frequent genera were Pseudomonas, Sphingomonas, and Massilia. Their presence has been reported also by other authors in other host plants (López-Velasco et al., 2011; Bodenhause et al., 2013; Bogas et al., 2015). The predominant genus was Pseudomonas.

FIGURE 6 | Abundance in each sample (mean ± standard error, n = 5) of Pseudomonas syringae pv. actinidiae (Psa), Pseudomonas syringae pv. syringae (Pss), and Pseudomonas viridiflava (Pv) determined by qPCR.

function) showing the distribution of samples from healthy and infected plants on the basis of pathogens quantification by qPCR. (B) Linear regression between Psa and Pss populations. (C) Linear regression between Psa and Pv populations in infected flowers. In (B,C), the data sets were restricted to samples presenting a Psa population lower than 10<sup>9</sup> bacterial cells flower<sup>−1</sup>.

regression between Psa and Pantoea agglomerans/vagans populations. (C) Linear regression between Psa and P. fluorescens populations. (D) Scatterplot representing Psa and Lactobacillus spp. populations on leaf samples. In (B–D), the data sets were restricted to samples presenting a Psa population lower than 10<sup>6</sup> bacterial cells g<sup>−1</sup> tissue.

In general, pseudomonads colonise plant surfaces, and many strains harbour interesting potential biocontrol actions (Thapa et al., 2017). Some species, such as P. putida, are known for their phosphate solubilisation ability and IAA production (Adesemoye et al., 2008; Gholami et al., 2009; Ahemad and Khan, 2012; Ahemad and Kibret, 2014). Strains of P. fluorescens, P. aeruginosa, P. asplenii, and P. protegens are also used as biocontrol agents against different pathogens (Krishnamurthy and Gnanamanickam, 1998; Akter et al., 2016; Michavila et al., 2017).

The Actinobacteria phylum also represents a reservoir of potential BCAs. Members of this phylum are well known for their ability to produce secondary metabolites with application in the agricultural, pharmaceutical and medical industries (Himaman et al., 2016). Several studies proposed their use as BCAs (El-Tarabily et al., 2000; Kunoh, 2002; Cao et al., 2005; Prapagdee et al., 2008; Mingma et al., 2014). They play key roles as plant growth promoters, disease resistance inducers and drought tolerance stimulators (Himaman et al., 2016).

The genus Sphingomonas generally acts as a plant-protective genus by suppressing disease symptoms and decreasing pathogen growth (Kim D. et al., 2018). Innerebner et al. (2011) showed that the inoculation with a Sphingomonas sp. strain reduced the population size of the plant pathogens Pseudomonas syringae pv. tomato DC3000 and Xanthomonas campestris pv. campestris LMG 568 on Arabidopsis leaves. The genus Massilia belongs to the family of Oxalobacteraceae. The presence of Massilia spp. was reported in the phyllosphere of different plants, including lettuce and apple (Bassas-Galia et al., 2012; Rastogi et al., 2012; Yashiro and McManus, 2012).

#### Host-Specific Bacterial Communities

Both on flowers and on leaves, the epiphytic bacterial community differed according to kiwifruit species. The main taxa contributing to differences were Enterobacteriaceae, Pseudomonas, Acinetobacter, and Sphingomonadaceae on leaves, and Sphingomonas, Arthrobacter, Bacillus, and Bradyrhizobiaceae on flowers. The detection of this last taxon has been reported also in the phyllosphere of spinach, rice, and tobacco, providing evidence for vertical transmission of bacteria from seed to the phyllosphere (Chi et al., 2005; Li et al., 2010; Lopez-Velasco et al., 2013).

Data obtained here are in agreement with other studies showing that different plant genotypes of the same species can host different bacterial communities (De Costa et al., 2006; Vorholt, 2012). The observed differences could be related to anatomical differences of leaves and flowers of the two kiwifruit species. In fact, bacteria are not uniformly distributed across leaf surfaces: instead, they form scattered microcolonies in proximity of trichomes, stomata, epidermal cell wall junctions and grooves along veins (Lindow and Brandl, 2003; Vorholt, 2012), where water and nutrients are most available (Kinkel, 1997; Leveau and Lindow, 2001; Monier and Lindow, 2004; Vorholt, 2012). One of the most considerable anatomical differences between A. deliciosa and A. chinensis is related to trichome structure and density. In A. chinensis, the leaves show a higher trichome density compared to A. deliciosa cultivars (He et al., 2000; Spinelli et al., 2011). Moreover, trichomes in A. chinensis are characterised by a higher central peduncle (He et al., 2000). These differences in trichome abundance may affect the microbial community composition and structure.

### Organ-Specific Bacterial Communities

Differences in the microbial community were also observed in different organs of the same plant species. A. chinensis leaves were populated mostly by Enterobacteriaceae and Pseudomonas, while on flowers, Pseudomonas and Bacillus were the most abundant genera, and a higher overall biodiversity was observed. In contrast, in A. deliciosa, the influence of organs on microbial composition was less evident than in A. chinensis, as Pseudomonas and Enterobacteriaceae were the dominant groups on both leaves and flowers. Organ-specific pathogenic consortia were observed in Psa-infected hosts. In fact, Pss, which primarily induces leaf spot symptoms (Petriccione et al., 2017), was more abundantly found on Psa-infected leaves, while Pv, responsible for blossom blight disease (Balestra et al., 2008), was more closely associated with Psa-infected flowers.

Since flowers are composed of different tissues (stigmas, styles, anthers, ovaries, nectarthodes), each of them providing a favourable and unique environment for the resident microbial community (Howpage et al., 1998; Spinelli et al., 2005; Aleklett et al., 2014), a higher biodiversity on flowers than on leaves could be expected. However, some flower parts may be less conducive to bacterial epiphytes than leaves. In fact, Junker et al. (2011) observed a lower biodiversity on petals than on leaves of Lotus corniculatus and Saponaria officinalis. In this perspective, the relative organisation, proportion, and chemical features of Actinidia spp. flower parts may be invoked to explain differences in bacterial colonisation. Although sharing the same basic structure, A. chinensis and A. deliciosa flowers also show evident morphological differences. A. deliciosa flowers present a higher number of styles and stamens, and the perianth and androecium are closer to the gynoecium, resulting in the nectar cup being more protected than in A. chinensis (Harvey and Fraser, 1988; Huang, 2014). It is likely that this feature could pose an obstacle to the formation of a highly diversified community, as observed on A. deliciosa flowers compared to leaves. Moreover, the production of volatile compounds may further act as a selective agent on the epiphytic microflora (Junker et al., 2011). The sesquiterpene α-farnesene, for instance, was found to play a role in plant defence (Huelin and Murray, 1966; Pare and Tumlinson, 1999; Yang et al., 2011), acting as a feeding deterrent to insects (Aharoni et al., 2003) and exhibiting toxicity to bacteria (Chorianopoulos et al., 2004) and fungi (Terzi et al., 2007). This compound is a major constituent of the flower odour bouquet of A. deliciosa, while it is not or scarcely emitted by A. chinensis (Tatsuka et al., 1990; Crowhurst et al., 2008; Nieuwenhuizen et al., 2009; Green et al., 2012).

### Antagonistic and Synergic Relationships of Psa With Other Microbes

Pseudomonas syringae pv. actinidiae infection had several effects on the diversity and taxonomic structure of the bacterial communities, with the only exception of A. deliciosa flowers. The mechanism by which Psa antagonises indigenous bacteria, determining its dominance on A. chinensis and A. deliciosa leaves and the disappearance of most of the dominant microbial species, is probably based on competition for limiting nutrient resources, since a mechanism based on antibiosis would not be affected by the plant host. In this view, a specialised pathogen such as Psa should be able to outcompete, in the Actinidia phyllosphere, other non-specialised residents. The extensive screening of plant material collected from infected orchards for 4 years confirmed a negative correlation between Psa and P. agglomerans/vagans or P. fluorescens populations when the population of these two bacteria ranges between 10<sup>4</sup> and 10<sup>6</sup>. Thus, even though in highly infected leaves Psa overwhelms other bacterial competitors, in the early stages of Psa epiphytic growth the competition with P. agglomerans/vagans or P. fluorescens may prevent the pathogen from reaching the infection threshold (approximately 10<sup>5</sup> Psa cells per gramme of tissue) (Donati et al., 2018). On the other hand, no antagonism could be observed between Psa and Lactobacillus spp., speculatively suggesting that bacteria of the latter group are poorly affected by Psa competition. In comparison to leaves or A. chinensis flowers, Psa exerts a weak antagonism toward the other residents in A. deliciosa flowers (**Figure 1B** and **Table 2**). Thus, it may be concluded that this niche is not highly conducive for Psa, or that other bacteria are as specialised and adapted to the floral niche as Psa. This observation, together with the higher relative abundance of Psa on A. chinensis flowers, could also contribute to explaining the higher susceptibility of these flowers in comparison with A. deliciosa ones (Donati et al., 2018), in spite of the similar Psa population sizes that can be attained on the two species (**Figure 6**).

Quantitative real-time polymerase chain reaction analysis showed that Psa formed a syndemic association with Pss and Pv. In fact, Psa-infected flowers also harboured detectable Pss and Pv populations in 62.5% of the cases. It is reasonable to hypothesise that these three species form a consortium which may compete more effectively in the epiphytic niche (Buonaurio et al., 2015). When occurring in association, Psa and Pss have been shown to infect the host plant more efficiently (Petriccione et al., 2017). Such observations have interesting implications for the control of bacterial canker of kiwifruit. For instance, treatments aimed at reducing Pss and/or Pv populations may be envisaged to limit Psa pathogenicity. In addition, hacking the signalling network among different pathogen species may be a strategy to repress their virulence in field conditions. Many Gram-negative, plant-associated bacterial pathogens have been reported to regulate their virulence by N-acyl-homoserine lactones (AHLs) (Ma et al., 2013). The AHL quorum-sensing model includes an AHL synthase, which belongs to the LuxI protein family, and AHL receptors/transcriptional regulators (LuxR) (Papenfort and Bassler, 2016). Psa does not produce AHLs and a complete LuxI/R system is absent. However, it possesses three putative LuxR solos (Patel et al., 2014). Two of them respond to exogenous AHLs, while the third is most likely involved in interkingdom signalling (Patel et al., 2014). The presence of three LuxR solos is rather unusual, as proteobacteria most commonly possess only one; therefore, they could represent an evolutionary advantage for Psa, favouring communication with other epiphytic bacteria (Patel et al., 2014).

Finally, inside this bacterial consortium, horizontal gene transfer may be facilitated, thus allowing a faster adaptation to environmental changes and stresses. In fact, the accessory genome of the P. syringae complex is characterised by genomic islands and various mobile elements, such as insertion sequences (IS elements), transposons, plasmids and integrative conjugative elements (ICEs) (Butler et al., 2013). These mobile elements include genes related to ecological fitness and virulence, such as toxin production (Murillo et al., 2011), copper and antibiotic resistance (Butler et al., 2013; Colombi et al., 2017), siderophore production and phenolics degradation (Scortichini et al., 2012). In this sense, the possibility of characterising the "phyllobiome" through the application of high-throughput, next generation sequencing technologies allows the strategic selection of BCAs highly specialised for the host's most sensitive pathways of infection. Among the putative BCAs studied in this work, for instance, only L. plantarum colonised both infected and healthy flowers and leaves in the two Actinidia species, while B. amyloliquefaciens could be occasionally found in infected tissues, although without relation to plant organ or species. Furthermore, coupling the complementary information acquired by next generation sequencing technologies with a meta-analysis of population association allowed the identification of other BCA candidates, such as P. agglomerans/vagans or P. fluorescens. The ability of tested BCAs to persist in these organs in spite of Psa infection could be related to the ability of members of this species to exert a strong antagonistic effect against the other microbes inhabiting the same niche through the release of a wide array of anti-microbial compounds, opening new possibilities for its exploitation as BCAs.

## CONCLUSION

The data obtained in this work highlighted for the first time the impact of Psa on bacterial communities associated with kiwifruit plants. The results reported here provide new insight into how Psa influences the microbial communities associated with the leaf and the flower in kiwifruit. The complex interactions between host, environment, and microbes play a role in determining the outcome of the infection process and in defining niches important for the resident bacteria. The main causes underlying these changes remain unclear. A better understanding of the process will require a greater interdisciplinary effort, as well as an integrative approach to detect the triggers of disease outbreaks in kiwifruit, in order to develop more sustainable strategies for the control of bacterial canker.

## AUTHOR CONTRIBUTIONS

FS conceived the experiment and supervised the work. ID contributed to the design of the experiments and performed all the sampling and the classical microbiological and pathological analysis. AC and ID performed DNA extraction, qPCR and bacterial identification. LO, AL, and WP analysed the next generation sequencing data. FS, LO, ID, and GP drafted the manuscript. All authors critically contributed to the review of the manuscript and discussion of the data.

#### FUNDING

The work was funded by the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 613678 (Dropsa - Strategies to develop effective, innovative and practical approaches to protect major European fruit crops from pests and pathogens).

#### ACKNOWLEDGMENTS

We thank Dr. Simona Nardozza and the Photography Team of Plant & Food Research, New Zealand for the photographs of Actinidia deliciosa and Actinidia chinensis flowers and leaves.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01563/full#supplementary-material

#### REFERENCES

biocontrol agents of fire blight. Acta Hortic. 1056, 117–122. doi: 10.17660/ActaHortic.2014.1056.16


covered by Annex I thereto. Available at: http://data.europa.eu/eli/reg/2008/149/oj


Huang, H. (2014). The Genus Actinidia: a World Monograph. Beijing: Science Press.

Huelin, F. E., and Murray, K. E. (1966). Alpha-farnesene in the natural coating of apples. Nature 210, 1260–1261. doi: 10.1038/2101260a0


biosynthesis of the phytotoxin phaseolotoxin in Pseudomonas syringae suggests at least two events of horizontal acquisition. Res. Microbiol. 162, 253–261. doi: 10.1016/j.resmic.2010.10.011


communities. Appl. Environ. Microbiol. 75, 7537–7541. doi: 10.1128/AEM.01541-09



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Purahong, Orrù, Donati, Perpetuini, Cellini, Lamontanara, Michelotti, Tacconi and Spinelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Inference of Convergent Gene Acquisition Among Pseudomonas syringae Strains Isolated From Watermelon, Cantaloupe, and Squash

Eric A. Newberry<sup>1,2</sup>, Mohamed Ebrahim<sup>3,4</sup>, Sujan Timilsina<sup>3</sup>, Nevena Zlatković<sup>5</sup>, Aleksa Obradović<sup>5</sup>, Carolee T. Bull<sup>6</sup>, Erica M. Goss<sup>3,7</sup>, Jose C. Huguet-Tapia<sup>3</sup>, Mathews L. Paret<sup>2</sup>, Jeffrey B. Jones<sup>3</sup>\* and Neha Potnis<sup>1</sup>\*

#### Edited by:

Dawn Arnold, University of the West of England, United Kingdom

#### Reviewed by:

Brian H. Kvitko, University of Georgia, United States

David John Studholme, University of Exeter, United Kingdom

#### \*Correspondence:

Jeffrey B. Jones, jbjones@ufl.edu

Neha Potnis, nzp0024@auburn.edu

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology

Received: 27 November 2018 Accepted: 01 February 2019 Published: 19 February 2019

#### Citation:

Newberry EA, Ebrahim M, Timilsina S, Zlatković N, Obradović A, Bull CT, Goss EM, Huguet-Tapia JC, Paret ML, Jones JB and Potnis N (2019) Inference of Convergent Gene Acquisition Among Pseudomonas syringae Strains Isolated From Watermelon, Cantaloupe, and Squash. Front. Microbiol. 10:270. doi: 10.3389/fmicb.2019.00270

<sup>1</sup> Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, United States, <sup>2</sup> Department of Plant Pathology, North Florida Research and Education Center, University of Florida, Quincy, FL, United States, <sup>3</sup> Department of Plant Pathology, University of Florida, Gainesville, FL, United States, <sup>4</sup> Department of Plant Pathology, Faculty of Agriculture, Ain Shams University, Cairo, Egypt, <sup>5</sup> Faculty of Agriculture, University of Belgrade, Belgrade, Serbia, <sup>6</sup> Department of Plant Pathology and Environmental Microbiology, Pennsylvania State University, State College, PA, United States, <sup>7</sup> Emerging Pathogens Institute, University of Florida, Gainesville, FL, United States

Pseudomonas syringae sensu stricto (phylogroup 2; referred to as P. syringae) consists of an environmentally ubiquitous bacterial population associated with diseases of numerous plant species. Recent studies using multilocus sequence analysis have indicated the clonal expansion of several P. syringae lineages, located in phylogroups 2a and 2b, in association with outbreaks of bacterial spot disease of watermelon, cantaloupe, and squash in the United States. To investigate the evolutionary processes that led to the emergence of these epidemic lineages, we sequenced the genomes of six P. syringae strains that were isolated from cucurbits grown in the United States, Europe, and China over a period of more than a decade, as well as eight strains that were isolated from watermelon and squash grown in six different Florida counties during the 2013 and 2014 seasons. These data were subjected to comparative analyses along with 42 previously sequenced genomes of P. syringae strains collected from diverse plant species and environments available from GenBank. Maximum likelihood reconstruction of the P. syringae core genome revealed the presence of a hybrid phylogenetic group, comprising cucurbit strains collected in Florida, Italy, Serbia, and France, which emerged through genome-wide homologous recombination between phylogroups 2a and 2b. Functional analysis of the recombinant core genome showed that pathways involved in the ATP-dependent transport and metabolism of amino acids, bacterial motility, and secretion systems were enriched for recombination. A survey of described virulence factors indicated the convergent acquisition of several accessory type 3 secreted effectors (T3SEs) among phylogenetically distinct lineages through integrative and conjugative element and plasmid loci. Finally, pathogenicity assays on watermelon and squash showed qualitative differences in virulence between strains of the same clonal lineage, which correlated with T3SEs acquired through various mechanisms of horizontal gene transfer (HGT). This study provides novel insights into the interplay of homologous recombination and HGT toward pathogen emergence and highlights the dynamic nature of P. syringae sensu lato genomes.

Keywords: horizontal gene transfer, homologous recombination, pathogen emergence, Pseudomonas syringae sensu stricto, cucurbits

#### INTRODUCTION

The Gram-negative bacterial species, Pseudomonas syringae sensu lato (in the largest sense), embodies both a pathogenic and phylogenetic complex of strains, which are responsible for numerous plant diseases of economic importance worldwide. Because many of the phytopathogenic bacteria found within this species complex could not be differentiated using traditional phenotypic and biochemical tests, they were classified into distinct pathogenic populations (i.e., pathovars) as defined by their host specificity (Dye et al., 1980). Currently, over 50 pathovars have been described within the seven named species and one genomospecies in P. syringae sensu lato (Gardan et al., 1999). These can be distinguished by multilocus sequence analysis (MLSA; Hwang et al., 2005; Young, 2010; Bull et al., 2011; Berge et al., 2014) and whole genome sequence analysis (Marcelletti and Scortichini, 2014; Nowell et al., 2014; Gomila et al., 2017) into phylogroups which correspond to distinct species.

Aside from its role as a plant pathogen, P. syringae sensu lato is common in a variety of habitats outside of the agricultural context, including in precipitation, water, soil, and wild plants as a facultative saprophyte (Hirano and Upper, 2000; Morris et al., 2013). Given the ubiquitous nature of this bacterial species, it is not surprising to note that P. syringae sensu lato may exhibit a variety of interactions with plants ranging from commensal leaf inhabitant, to opportunistic, and host-specialized phytopathogen. Similarly, some P. syringae sensu lato lineages have evolved differing modes of transmission to plants, including via seed and water, which may be reflected in their ecology, metabolic versatility, and other forms of microbial physiology (Baltrus et al., 2017). Several well-characterized plant diseases such as bacterial speck of tomato, bleeding canker of European horse chestnut, or bacterial canker of kiwifruit were each linked to the expansion of a genetically monomorphic pathogen lineage (Green et al., 2010; Cai et al., 2011a; McCann et al., 2013). In some cases, the clonal lineages associated with these diseases were closely related to strains collected from environmental sources that were less virulent and had a broader host range than their host-specialized relatives (Cai et al., 2011b; Monteil et al., 2013). This observation has led to the hypothesis that P. syringae sensu lato displays an epidemic population structure, whereby novel pathogen lineages emerge from recombining ancestral populations through the acquisition of genes or alleles that provide an adaptive benefit (Vinatzer et al., 2014). Consistent with this hypothesis, gene content fluctuation occurs at an over 100-fold greater rate than amino acid sequence divergence in P. syringae sensu lato genomes (Nowell et al., 2014).

Among the various species found within P. syringae sensu lato, P. syringae sensu stricto (phylogroup 2; referred to as P. syringae in the rest of the manuscript) possesses many traits that are characteristic of the species complex as a whole. The strains described here are commonly recovered from environmental sources, maintain large epiphytic populations, are active ice-nucleators, and cause disease on a wide range of plant species (Canfield et al., 1986; Morris et al., 2008; Berge et al., 2014). A distinguishing feature of this group is the production of the phytotoxins syringomycin, syringopeptin, and syringolin, which are virulence factors that exhibit antimycotic activity and facilitate host colonization (Scholz-Schroeder et al., 2001; Misas-Villamil et al., 2013; Nowell et al., 2016). Although a number of agriculturally relevant pathovars have been described within P. syringae (Bull and Koike, 2015), strains are commonly identified as P. syringae pv. syringae based on the detection of genes associated with the biosynthesis of syringomycin (Little et al., 1998; Sorensen et al., 1998; Bultreys and Gheysen, 1999). P. syringae pv. syringae, which was named for its original host of isolation (Syringa vulgaris), has been recorded as a pathogen of over 40 different plant species and has a host range that is distinct from, but overlaps with, those of many of the other pathovars found within the same phylogenetic group (Young, 2010). As a result, it is unclear to what degree many of the plant pathogenic bacteria described here exhibit host-specificity and/or represent ecologically separate populations.

Bacterial leaf spot of watermelon (Citrullus lanatus), cantaloupe (Cucumis melo), and squash (Cucurbita pepo) is a common early spring disease that has a worldwide distribution and can cause significant economic losses under cool, wet environmental conditions (Morris et al., 2000; Riffaud et al., 2003; Newberry et al., 2017). The disease was recently recognized as a seedborne disorder of squash (Manceau et al., 2011); however, its etiology in various cucurbit species is likely to have multiple sources (Monteil et al., 2016). Recently, we characterized the P. syringae population responsible for bacterial leaf spot epidemics that occurred in commercial production fields of watermelon and squash throughout Florida. Analysis of the population structure indicated that this newly emerging disease was primarily associated with the expansion of a clonal P. syringae lineage throughout the state that was most closely related to the P. syringae pv. syringae type/pathotype strain, LMG 1247<sup>PT</sup> (= ICMP 3023<sup>T</sup>), within phylogroup 2b. Additionally, we identified two other clonal lineages, collected from either the same or previous bacterial leaf spot epidemics in the United States, that shared a recent common ancestor with the aforementioned epidemic clone but were located in a separate phylogroup within the same species, namely phylogroup 2a (Newberry et al., 2016, 2018). Although we were able to precisely classify these pathogens within the phylogenetic structure of P. syringae sensu lato using MLSA, we were unable to delineate them from other strains collected from a diverse group of plant species or attribute them to any pathovar previously associated with cucurbit hosts, other than P. syringae pv. syringae.

Here, we investigated the evolutionary processes that led to the emergence of these similar, yet distinct P. syringae lineages as successful pathogens of watermelon, cantaloupe, and squash, as well as the genetic factors that distinguish them from other members of this environmentally ubiquitous bacterial population. We obtained high-quality draft genomes for 11 P. syringae strains collected from bacterial leaf spot epidemics in the United States, as well as for three strains isolated from symptomatic squash grown in Italy, Serbia, and China over various years. In order to investigate these strains in the context of the larger diversity of P. syringae, we analyzed these data together with the genomes of 42 additional P. syringae strains collected from diverse plant species and environments available from public sequence databases. We examined the population structure and analyzed patterns of homologous recombination within P. syringae. The distribution of previously described virulence factors, including type 3 secreted effectors (T3SEs), phytotoxins, and other biologically relevant features, was surveyed computationally. Finally, pan-genome association analysis was carried out to identify orthologous groups potentially involved in niche adaptation of the cucurbit lineages. The combined results of this study demonstrate the presence of a hybrid P. syringae clone associated with watermelon, cantaloupe, and squash in the United States and Europe and provide evidence for the convergent adaptation of two phylogenetically distinct P. syringae populations to cucurbit hosts.

#### MATERIALS AND METHODS

#### Bacterial Strains and Sequencing

We selected 14 P. syringae strains that were isolated from the symptomatic tissue of watermelon, cantaloupe, and squash over a period of several years for shotgun sequencing, as detailed below. Most of these strains were described previously and, altogether, comprised three different multilocus haplotypes located in phylogroups 2a and 2b of P. syringae (**Table 1**). Eight strains collected from bacterial leaf spot epidemics that occurred in Florida between 2013 and 2014 were sequenced. These strains were isolated from watermelon and squash grown in six different Florida counties during these epidemics and comprised haplotypes one and two (Newberry et al., 2018). Three strains included for sequencing comprised haplotype three and were isolated from cantaloupe and squash grown in Florida, Georgia, and California between 2000 and 2006 (Newberry et al., 2016). Finally, three additional strains that were isolated from symptomatic squash grown in Italy, Serbia, and China between 2005 and 2013 were also sequenced because they were found to be identical at four partial housekeeping gene sequences to one of the previously mentioned haplotypes (Hwang et al., 2005).

Bacterial strains were purified from single colonies and cultured overnight in nutrient broth. Genomic DNA was extracted using the CTAB-NaCl method (Ausubel et al., 1994), checked for quality using a NanoDrop 2000 (Thermo Scientific, Waltham, MA, United States) and gel electrophoresis, then quantified using a Qubit 3.0 fluorometer (Thermo Fisher, Waltham, MA, United States). Genomic libraries were prepared using a Nextera library preparation kit (Illumina Inc., San Diego, CA, United States) and the DNA was sequenced using the Illumina MiSeq platform at the Interdisciplinary Center for Biotechnology Research, University of Florida. The raw sequence data were subjected to adapter and quality trimming with Scythe<sup>1</sup> and SolexaQA, respectively (Cox et al., 2010). The quality-trimmed reads were then de novo assembled into contigs using the SPAdes Genome assembler (v3.5.0) with the "--careful" option to reduce mismatches in the assembly (Bankevich et al., 2012). The draft genome assemblies were submitted to the Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP) and the Joint Genome Institute (IMG-JGI) server for annotation (Markowitz et al., 2012; Tatusova et al., 2016). The sequencing, assembly statistics, and collection information are presented in **Table 1**.

TABLE 1 | Draft genome sequencing and assembly statistics for P. syringae strains isolated from watermelon, cantaloupe, and squash.

<sup>a</sup>WM, watermelon; CL, cantaloupe; SQ, squash. <sup>b</sup>Multilocus sequence type.

TABLE 2 | List of genomes included in comparative analyses including pathovar classification, host of isolation, and phylogenetic classification based on MLSA.

#### Pan-Genome Association Analysis

The strains sequenced in this study were investigated in the context of P. syringae by including 42 additional genomes that were publicly available from the National Center for Biotechnology Information (NCBI) GenBank for analysis (**Table 2**). These genomes were selected to represent the diversity of phytopathogenic bacteria previously described within the species and are distributed among phylogroups 2a, 2b, and 2d (Berge et al., 2014; Bull and Koike, 2015). Representatives of phylogroup 2c, otherwise known as P. congelans, were not included in this analysis as they are not known to be phytopathogenic (Mohr et al., 2008). The genome assemblies were re-annotated with Prokka to generate GFF3 files (Seemann, 2014), which were then used as input for the Roary pan-genome pipeline (v.3.6.1, Page et al., 2015). Orthologous groups were clustered using the CD-Hit and MCL algorithms with a BLASTp cut-off set to 95% along with the "-s" option to prevent the splitting of orthologous groups containing paralogs. The output of Roary was further analyzed using Scoary to test for associations between the presence/absence of orthologous groups and cucurbit-associated lineages (Brynildsrud et al., 2016). The population structure (as described below) was used to control for spurious associations in the estimation of probabilities, and orthologous groups with a Bonferroni-corrected p ≤ 10<sup>−5</sup> were reported.

<sup>1</sup>https://github.com/vsbuffalo/scythe
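The presence/absence test described above can be illustrated with a minimal sketch. It scores a single orthologous group against a binary trait (cucurbit-associated or not) with a two-sided Fisher's exact test and then applies a Bonferroni correction. This is a simplified stand-in and not Scoary itself: it omits the correction for population structure, and the function names and the counts in the usage note are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: trait-positive strains carrying the orthologous group
    b: trait-positive strains lacking it
    c: trait-negative strains carrying it
    d: trait-negative strains lacking it
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among trait-positive strains.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)
```

For example, `bonferroni(fisher_exact_two_sided(14, 0, 1, 41), n_tests=10000)` would score a hypothetical group present in all 14 cucurbit strains and in one of 42 other genomes, assuming 10,000 orthologous groups were tested.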

#### Analysis of Population Structure and Interlineage Recombination

Initial analysis of the population structure was conducted by calculating the average nucleotide identities between genomes using the MUMmer algorithm (Marçais et al., 2018), with the Python package pyani<sup>2</sup>. Subsequently, a core genome alignment was constructed using the program Parsnp (Treangen et al., 2014). Locally collinear blocks (LCBs) of maximal unique matches shared across all genome assemblies were identified and aligned against the gold standard reference genome of P. syringae pv. syringae strain B728a (NC\_007005.1). Single nucleotide polymorphisms (SNPs) located on LCBs < 200 bp or in other regions of poor alignment were removed from the data set to generate a concatenated alignment of high-quality core genome SNPs. This concatenated SNP alignment was used to infer a maximum likelihood phylogeny using IQ-TREE (v.1.6.4) with the Jukes-Cantor model of nucleotide substitution (Nguyen et al., 2015). Branch support was assessed using the ultrafast bootstrap method with 1,000 replicates (Minh et al., 2013) and the phylogenetic tree was visualized and annotated using FigTree (v.1.4.2<sup>3</sup>).
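As background to the substitution model named above, the Jukes-Cantor model assumes equal base frequencies and equal rates for all substitutions, so the expected number of substitutions per site between two sequences follows from the observed proportion of differing sites p as d = −(3/4) ln(1 − 4p/3). The sketch below computes this pairwise distance. It only illustrates the model and is not the maximum likelihood procedure run by the tree-building software; the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)
```

Note that p should be computed over all aligned sites; computed over an alignment that contains only variable sites, it overstates the per-site divergence.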

To analyze patterns of homologous recombination within P. syringae, the core genome alignment generated with Parsnp was used as input for analysis with fastGEAR using the default settings. The fastGEAR algorithm employs BAPS, a Bayesian hierarchical clustering method, to infer sequence clusters in an alignment (Corander et al., 2003). This was followed by a hidden Markov model (HMM), which was used to collapse clusters that share a common ancestry in at least 50% of the sites into lineages. The program detects "recent" recombination events between lineages using an HMM and the origin of the recombinant sequence is assigned to the lineage with the highest probability at that position. Similarly, "ancestral" recombination events that are shared by all strains which comprise a lineage are identified; however, the origin of these recombination events cannot be inferred due to their conserved nature (Mostowy et al., 2017). The statistical significance of recombination predictions was tested using a Bayes factor (BF) > 1 for recent recombination events and BF > 10 for ancestral recombination events. This analysis was conducted with two alternative genome alignments using completed reference genomes of P. syringae pv. syringae strains UMAF0158 (NZ\_CP005970.1) and HS191 (NZ\_CP006256.1) to validate the lineage prediction and proportion of gene flux between lineages. Finally, to examine the evolutionary history of P. syringae in the absence of recombination, the positions in the core genome alignment which corresponded to the predicted recent recombination events were removed and SNPs extracted using the program SNP-sites (Page et al., 2016). This recombination-filtered SNP alignment was then used to construct a maximum likelihood phylogeny as described above.
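The final filtering step can be sketched as follows, under stated assumptions: the alignment is held as a dictionary of equal-length, upper-case sequences, and the predicted recombinant regions are supplied as 0-based, half-open (start, end) column intervals. That interval format and the function names are hypothetical and do not reproduce the output format of the recombination-detection program or the implementation of the SNP-extraction tool.

```python
def remove_columns(alignment, intervals):
    """Drop every alignment column covered by a half-open (start, end) interval."""
    length = len(next(iter(alignment.values())))
    dropped = set()
    for start, end in intervals:
        dropped.update(range(max(0, start), min(length, end)))
    kept = [i for i in range(length) if i not in dropped]
    return {
        name: "".join(seq[i] for i in kept)
        for name, seq in alignment.items()
    }


def extract_snps(alignment):
    """Keep only columns where at least two different unambiguous bases occur."""
    names = list(alignment)
    columns = zip(*(alignment[name] for name in names))
    variable = [
        col for col in columns
        if len({base for base in col if base in "ACGT"}) > 1
    ]
    return {
        name: "".join(col[i] for col in variable)
        for i, name in enumerate(names)
    }
```

Calling `extract_snps(remove_columns(alignment, intervals))` then yields a recombination-filtered SNP alignment of the kind used for the second phylogeny.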

### Functional Analysis of Recent Recombination Events

To characterize the functional impact of the recombinant genes leading to the emergence of phylogroup 2b-a, BLASTn (E-value ≤ 1e−50) was used to map the recent recombination events predicted by fastGEAR back to the genome assemblies of three strains (13-140A, ZUM3584, HS191), representative of the different recombination profiles in the dataset. The Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology IDs corresponding to recombinant and non-recombinant genes were subsequently extracted from the IMG annotations and a linear regression analysis was conducted to compare the composition of recombinant vs. non-recombinant core genes across the COG functional categories. Additionally, the KEGG orthology IDs were used to compare the distribution of recombinant genes across the KEGG BRITE functional annotations among the three different recombination profiles.
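One possible reading of the regression comparison is sketched below, as an assumption rather than a description of the analysis actually performed: the number of recombinant genes in each COG category is regressed on the number of non-recombinant genes in the same category, and categories with large positive residuals contain more recombinant genes than the overall trend predicts. The function names and the counts in the usage note are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed minus fitted values; positive values lie above the trend line."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
```

With hypothetical per-category counts such as `non_recombinant = [120, 80, 60, 200]` and `recombinant = [40, 15, 30, 50]`, the category with the largest residual from `residuals(non_recombinant, recombinant, *fit_line(non_recombinant, recombinant))` would be the strongest candidate for enrichment.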

After noting evidence for extensive recombination affecting the hrp/hrc pathogenicity island of phylogroup 2b-a cucurbit strains in the fastGEAR output, amino acid sequence alignments for the 27 open reading frames that comprise the hrp/hrc gene cluster, as described by Alfano et al. (2000), were constructed using a sub-set of 54 P. syringae genomes. Individual gene alignments were subjected to a phylogenetic analysis with FastTree 2 using the gamma time reversible model (Price et al., 2010). A gene was considered to be recombinant among the 2b-a lineages if it clustered into a monophyletic group with other members of phylogroup 2a, rather than 2b, as predicted by the fastGEAR algorithm. To illustrate these relationships, a phylogenetic network was constructed with a concatenated alignment of the entire hrp/hrc gene cluster using the NeighborNet method and p-distance in SplitsTree 4 (Huson and Bryant, 2006).

### Analysis of Type 3 Secreted Effectors (T3SEs) and Phytotoxins

Reference sequences were obtained from the public T3SE database<sup>4</sup> and used to query the genome assemblies using BLASTn (E-value ≤ 1e−5). A given T3SE was considered to be present in a genome if the alignment with the reference sequence displayed at least 80% sequence identity over 80% of the query length. If genomic sequences orthologous to a given reference sequence were split between two contigs, then the sequences were concatenated to determine their presence in the assemblies. For each T3SE, an alignment of the subject sequence with the curated reference sequence was used to record putative frameshift mutations or other disruption of the coding sequence. The distribution of T3SEs was then used to construct a binary matrix, where 1 indicated the presence of a gene, 0.5 indicated a putative pseudogene, and 0 indicated that the gene was absent. Hierarchical clustering of the binary matrix was performed with the Python package hclust2<sup>5</sup>. A similar strategy described by Baltrus et al. (2011) was used to identify genes involved in the biosynthesis of the following phytotoxins: tabtoxin, phaseolotoxin, mangotoxin, syringomycin, syringolin, and syringopeptin.

<sup>2</sup>https://github.com/widdowquinn/pyani

<sup>3</sup>http://tree.bio.ed.ac.uk/software/figtree/

<sup>4</sup>http://www.pseudomonas-syringae.org/
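The presence criteria and matrix coding described above can be expressed as a short sketch. It assumes that the best alignment for each strain and effector has already been summarised as a percent identity, an aligned length, and a query length, and that disruption of the coding sequence was recorded separately. The function names, argument names, and identifiers in the usage note are hypothetical.

```python
def score_effector(identity_pct, aligned_len, query_len, disrupted=False,
                   min_identity=80.0, min_coverage=0.80):
    """Return 1.0 (present), 0.5 (putative pseudogene), or 0.0 (absent)."""
    coverage = aligned_len / query_len
    if identity_pct < min_identity or coverage < min_coverage:
        return 0.0
    # A hit passing both thresholds counts as present unless the coding
    # sequence carries a putative frameshift or other disruption.
    return 0.5 if disrupted else 1.0


def build_matrix(hits, strains, effectors):
    """Build a strain-by-effector matrix; pairs without a hit are scored 0.0.

    hits maps (strain, effector) to the keyword arguments of score_effector.
    """
    return [
        [
            score_effector(**hits[(strain, effector)])
            if (strain, effector) in hits else 0.0
            for effector in effectors
        ]
        for strain in strains
    ]
```

For instance, `build_matrix({("strainA", "effector1"): {"identity_pct": 95.0, "aligned_len": 900, "query_len": 1000}}, ["strainA", "strainB"], ["effector1"])` returns `[[1.0], [0.0]]`, which can then be passed to a clustering routine.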

#### Identification of Plasmids and Genomic Islands

The presence of plasmids was predicted for the strains sequenced in this study using plasmidSPAdes (Antipov et al., 2016). Assembled contigs were screened using Microbial Genome BLAST against the complete plasmid database to identify putative plasmid sequences. These contigs were subsequently screened for T3SEs and other virulence factors as described above. Genomic islands (GIs) and other signatures of horizontal gene transfer (HGT) were detected using the IslandViewer 4 server (Bertelli and Brinkman, 2018). GIs were predicted using the reference strains P. fluorescens SBW25, P. protegens Pf-5, P. aeruginosa PAO1, and P. syringae pv. syringae B728a. The predicted GIs were analyzed for gene content and extracted from the genome assemblies to construct multiple sequence alignments using the Mauve software package (Darling et al., 2004).

#### Pathogenicity Assays

The pathogenicity of most of the bacterial strains sequenced in this study was examined previously (Newberry et al., 2016, 2018). However, to make direct comparisons with strains collected from independent disease outbreaks, a sub-set of 10 P. syringae strains representative of the genetic diversity and geographic source of isolation in the collection were selected for further pathogenicity testing. A strain that was originally isolated from diseased proso millet, P. syringae pv. syringae HS191 (Gross and DeVay, 1976; Ravindran et al., 2015), was also included in these experiments due to its genetic similarity to the cucurbit strains examined here. Bacterial strains were cultured overnight on King's medium B agar, suspended in a sterile MgSO<sub>4</sub>·7H<sub>2</sub>O solution (10 mM), and adjusted to ∼1 × 10<sup>8</sup> colony-forming units (CFU) ml<sup>−1</sup> spectrophotometrically (OD<sub>600</sub> = 0.1). Two-week-old seedlings of watermelon cv. Troubadour (Harris Seeds, Rochester, NY, United States) and squash cv. Conqueror III (Seminis Vegetable Seeds) were sprayed with the bacterial suspension until runoff, incubated in a humidity chamber at 100% RH for 48 h, then placed under growth room conditions at 21°C and ∼70% RH with a 12 h photoperiod. Plants sprayed with the sterile MgSO<sub>4</sub>·7H<sub>2</sub>O solution served as a negative control. Two weeks after inoculation, the total proportion of necrotic/symptomatic leaf tissue was rated visually from 0 to 8 using a modified version of the Horsfall–Barratt scale (Horsfall and Barratt, 1945): 0 = no symptoms, 1 = 1–3%, 2 = 3–6%, 3 = 6–12%, 4 = 12–25%, 5 = 25–50%, 6 = 50–75%, 7 = 75–87%, 8 = 87–100%. Five replicates were included for each strain/host combination, and the experiment was conducted twice. A non-parametric analysis of variance (ANOVA) was used to test for differences among the treatments and all statistical analyses were conducted using JMP Pro 13 (SAS Institute, Cary, NC, United States).
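The rating scale above can be written as a simple lookup. Because adjacent classes share a boundary value (for example, 3% appears in both class 1 and class 2), the sketch below assumes that each upper bound is inclusive, and that any visible symptom below 1% is scored as class 1. Both choices are assumptions made for illustration, as is the function name.

```python
from bisect import bisect_left

# Upper bound (percent of symptomatic tissue) of each class, from 0 to 8.
UPPER_BOUNDS = [0, 3, 6, 12, 25, 50, 75, 87, 100]


def disease_score(percent):
    """Map percent necrotic/symptomatic leaf tissue to a 0-8 rating."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    # bisect_left returns the first class whose upper bound is >= percent,
    # which makes each upper bound inclusive.
    return bisect_left(UPPER_BOUNDS, percent)
```

Under these assumptions, a leaf with 10% symptomatic tissue falls in class 3 (6–12%) and one with 30% falls in class 5 (25–50%).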

### RESULTS

#### Population Structure of P. syringae sensu stricto

Previous studies using MLSA described three phylogroups of phytopathogenic bacteria within P. syringae sensu stricto (Berge et al., 2014; Bull and Koike, 2015). Maximum likelihood reconstruction of the P. syringae core genome based on 246,510 polymorphic sites extracted from an alignment of 2.43 Mb revealed the presence of two well-supported phylogenetic groups in addition to those delineated through MLSA (2a, 2b, and 2d), designated here as 2b-a and Pav (**Figure 1**). The between-phylogroup average nucleotide identities (ANI) were broadly similar and ranged from 94.61 to 95.54% ANI. However, phylogroup 2b-a showed intermediate identity to 2b (97.43% ANI), Pav (96.15% ANI), and 2a (96.00% ANI). Similar levels of sequence identity (between 98.16 and 98.98% ANI) were recorded within all P. syringae phylogroups (**Table 3**). Phylogroup 2b contained a number of strains collected from diverse monocot and herbaceous dicot plant species including the type strain, P. syringae pv. syringae ICMP3023<sup>T</sup>, as well as other previously described pathovars such as P. syringae pvs. aptata ICMP459<sup>PT</sup>, atrofaciens ICMP4394<sup>PT</sup>, pisi PP1, and lapsa ICMP3947<sup>PT</sup>. Phylogroup 2b-a branched as a sister group to 2b and contained many cucurbit strains that were isolated from plants grown in Florida, Italy, France, and Serbia between 2003 and 2014. All of these strains were of the same clonal lineage, except for ZUM3584, and shared a recent common ancestor with P. syringae pv. syringae HS191, which was isolated in Australia from diseased proso millet in 1969 and was more distantly related. The other novel P. syringae phylogroup inferred in this analysis, phylogroup Pav, consisted of strains associated exclusively with hazelnut decline that were classified primarily as P. syringae pv. avellanae (O'Brien et al., 2012) and branched between phylogroups 2b and 2d.
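To make the notion of "polymorphic sites extracted from an alignment" concrete, the following minimal sketch identifies variable columns in a toy nucleotide alignment. It is an illustration under stated assumptions, not the pipeline used in this study: the function name `polymorphic_sites` and the toy sequences are hypothetical, and gaps or ambiguous bases are simply ignored when deciding whether a column varies.

```python
def polymorphic_sites(alignment: dict[str, str]) -> list[int]:
    """Return 0-based column indices at which the aligned sequences differ.

    Assumption: only unambiguous bases (A, C, G, T) are compared; gaps and
    ambiguity codes do not make a column polymorphic on their own.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in an alignment must have equal length")
    sites = []
    for i in range(length):
        bases = {s[i].upper() for s in seqs} & set("ACGT")
        if len(bases) > 1:
            sites.append(i)
    return sites


if __name__ == "__main__":
    # Hypothetical three-strain alignment (not real data).
    toy = {
        "strain_1": "ATGCCGTA",
        "strain_2": "ATGTCGTA",
        "strain_3": "ATGCCGAA",
    }
    print(polymorphic_sites(toy))
```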
The cucurbit strains located in phylogroup 2a branched from each other into two distinct lineages that were nested among other strains isolated from diverse plants and environments. A well-supported split was observed between the clones collected in California and China (BS2121 and ZUM3984) and those in Florida and Georgia (03-19A and 200-1), which differed by fewer than 300 SNPs in the core genome alignment and represented distinct haplotypes within the lineage (**Figure 1**).

<sup>5</sup>https://bitbucket.org/nsegata/hclust2

support was ≥98% except where otherwise indicated and the scale bar shows the number of substitutions per site.

<sup>a</sup>Percent identity was calculated after correcting for clones in the dataset.

#### Patterns of Homologous Recombination Between P. syringae sensu stricto Phylogroups

The sequence clusters identified by the Bayesian statistical clustering method were congruent with the phylogroups supported by the maximum likelihood phylogeny. However, fastGEAR collapsed phylogroups 2b and 2b-a into a single lineage, while the others remained distinct. The average proportion of recent recombination in the core genome of P. syringae phylogroups (± the standard deviation) was estimated to be 0.71 ± 1.13% for 2a, 1.44 ± 1.16% for 2b, 2.03 ± 0.06% for Pav, and 0.81 ± 0.20% for 2d. In contrast, the proportion of recent recombination in the core genome of phylogroup 2b-a ranged from 27.98 to 30.54%, with ≥98.69% of the recombinant sequences predicted to have originated from phylogroup 2a. Between 178 and 213 recent recombination events were dispersed across the core genome of 2b-a lineages, where ZUM3584 and HS191 displayed variation in the distribution of recent recombination events in relation to the strains collected in Florida, France, and Serbia (**Figure 2A**). Although evidence of ancestral recombination was minimal or undetected among the 2a (0.49%), 2b (0.00%), and 2d (1.32%) lineages, 40.09% of the phylogroup Pav core genome was acquired through ancestral recombination events (**Supplementary Figure S1**). After removing 1,592,717 positions affected by recent recombination from the core genome alignment (64% of the original alignment), a second phylogenetic analysis was conducted using 81,421 recombination-free SNPs. Overall, the topology of the recombination-filtered phylogeny was congruent with that of the unfiltered tree. However, phylogroup 2b-a appeared as a branch within phylogroup 2b, rather than as a distinct group, and with considerably shorter branch lengths as compared to the unfiltered phylogeny (**Figures 2B,C**).
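The filtering step described above, in which positions affected by recent recombination are removed before re-inferring the phylogeny, can be sketched as interval masking of an alignment. This is a minimal illustration and does not reproduce fastGEAR output handling: the function names and the toy intervals are hypothetical, intervals are assumed to be 0-based and half-open, and a column is assumed to be dropped if it is recombinant in any strain.

```python
def recombinant_mask(intervals: list[tuple[int, int]], length: int) -> list[bool]:
    """Mark alignment columns covered by any recombinant interval.

    Assumption: intervals are 0-based, half-open (start, end) pairs pooled
    across all strains; overlapping intervals are merged implicitly.
    """
    mask = [False] * length
    for start, end in intervals:
        for i in range(max(start, 0), min(end, length)):
            mask[i] = True
    return mask


def filter_alignment(alignment: dict[str, str], intervals: list[tuple[int, int]]):
    """Drop masked columns from every sequence; also return the fraction removed."""
    length = len(next(iter(alignment.values())))
    mask = recombinant_mask(intervals, length)
    kept = {
        name: "".join(base for base, masked in zip(seq, mask) if not masked)
        for name, seq in alignment.items()
    }
    return kept, sum(mask) / length


if __name__ == "__main__":
    # Hypothetical alignment and recombinant segments (not real data).
    toy = {"strain_1": "ACGTACGTAC", "strain_2": "ACGTTCGTAC"}
    filtered, fraction = filter_alignment(toy, [(2, 5), (4, 7)])
    print(filtered, fraction)
```

Pooling intervals across strains is the conservative choice, since a column is retained only if no strain shows a recombination signal there.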

#### Functional Analysis of Recombinant Genes Leading to the Emergence of Phylogroup 2b-a

The recombinant regions predicted by fastGEAR corresponded to 3,094 coding sequences among strains 13-140A, ZUM3584, and HS191, of which 2,812 were assigned to a COG functional category. Likewise, 6,875 of 7,654 coding sequences were assigned to COG categories corresponding to regions of the core genome where no signal of recombination was detected. No significant difference in the effect of COG category on recombination was observed between strains (P = 0.15) and a strong linear relationship between recombinant and recombination-free core genes was noted across the general functional groups (R<sup>2</sup> = 0.886; **Figure 3A**). Although a Grubbs' test for outliers indicated that no COG category was significantly over- or underrepresented with recombinant genes (data not shown), the top three categories skewed by recombination included amino acid transport and metabolism (E), cell motility (N), and extracellular structures (W). Functional categorization of recombinant core genes utilizing the KEGG BRITE database supported these observations. Approximately 25% of the classifiable genes impacted by recombination were ATP-dependent transporters associated with numerous metabolic pathways, followed by bacterial motility proteins and secretion systems, which each comprised approximately 7–8% of the recombinant genes in each strain (**Figure 3B**).
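As a worked illustration of the linear relationship reported above, the coefficient of determination for a simple linear regression of recombinant gene counts on recombination-free gene counts per COG category can be computed as the squared Pearson correlation. The sketch below uses hypothetical counts (the per-category values are not given in this section), and the function name `r_squared` is an assumption.

```python
def r_squared(x: list[float], y: list[float]) -> float:
    """Coefficient of determination for a simple linear regression of y on x.

    For a single predictor with an intercept, R^2 equals the squared
    Pearson correlation coefficient, which is what is computed here.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical gene counts per COG category (not the study's data).
    recombination_free = [410, 520, 130, 275, 90]
    recombinant = [170, 240, 45, 100, 60]
    print(round(r_squared(recombination_free, recombinant), 3))
```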

Examination of the specific secretion systems impacted by recombination indicated that many of the genes associated with the biosynthesis of the type III secretion system (i.e., hrp/hrc gene cluster) were recombinant among strains 13-140A and ZUM3584, whereas no signals of recombination were detected in the hrp/hrc cluster of strain HS191. Phylogenetic analyses showed that 17 hrp/hrc genes clustered the cucurbit strains isolated in Florida, France, and Serbia into a monophyletic group with other members of phylogroup 2a and were therefore considered recombinant, while strain ZUM3584 carried 11 recombinant alleles (data not shown). A phylogenetic network constructed with a concatenated alignment of the entire hrp/hrc cluster (7,509 aa) was largely concordant with that of the core genome phylogeny. This analysis showed the phylogroup 2b-a cucurbit strains branching from other 2a lineages, while ZUM3584 was placed in a hybrid position in the phylogenetic network and HS191 clustered among members of phylogroup 2b (**Figure 4**).

core genome of three strains representative of phylogroup 2b-a recombination profiles (B).

#### Distribution of Type 3 Secreted Effectors (T3SEs) and Phytotoxins

The T3SE repertoires of the 56 genomes analyzed in this study are presented in **Figure 5**. The cucurbit strains carried between 17 and 21 potentially functional or disrupted effector genes, while an average of 13 T3SEs were present among other P. syringae lineages. Hierarchical clustering based on the presence/absence of effector genes revealed a loose correlation between core genome evolution and effector repertoires, with most members of the primary P. syringae phylogroups clustering into distinct groups. The phylogroup 2b-a and 2a cucurbit strains carried similar effector profiles and clustered together along with other 2a lineages. The T3SE hopA1 was exclusive to most of the strains within this group and was noted to be the only gene present in the exchangeable effector locus of phylogroup 2b-a strains (except for HS191 which carried hopA2), along with its corresponding chaperone, shcA. The effector hopZ5 was present exclusively in the genomes of all but two cucurbit strains (13-509 and PS711) within P. syringae sensu stricto and displayed 98.27% amino acid sequence identity to the hopZ5 allele present in the P. syringae pv. actinidiae biovar 3 of phylogroup 1. This effector was linked to hopH1, which carried a point deletion resulting in a frameshift mutation in the gene. Several effector genes found to be sparsely distributed across P. syringae were also identified among the cucurbit-associated lineages. Most of these genes were present on an integrative and conjugative element (ICE) or plasmid loci (see results below) and included avrRpt2 in strains 13-509A and PS711; hopAR1 in strains 200-1 and 03-19A; avrPto1, hopAU1, and hopX2 in strains 13-139B and 13-429; and hopAW1, which was common to all phylogroup 2a cucurbit strains except 200-1 and 03-19A.
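Clustering strains by effector presence/absence, as described above, can be sketched with a Jaccard distance and a small average-linkage agglomeration. This is an illustration only and is not the tool or settings used for **Figure 5**: the choice of Jaccard distance and average linkage is an assumption, the function names are hypothetical, and the three example repertoires are invented for demonstration (only the effector names come from the text).

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """1 - |intersection| / |union| of two effector sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def average_linkage(profiles: dict[str, set]) -> list[tuple]:
    """Agglomerate strains by average Jaccard distance between clusters.

    Returns the merges in order as (cluster_1, cluster_2, distance), where
    each cluster is a tuple of strain names.
    """
    clusters = [(name,) for name in sorted(profiles)]

    def dist(c1, c2):
        pairs = [(p, q) for p in c1 for q in c2]
        total = sum(jaccard_distance(profiles[p], profiles[q]) for p, q in pairs)
        return total / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters and merge them.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


if __name__ == "__main__":
    # Hypothetical repertoires (not the study's data).
    toy = {
        "strain_1": {"hopA1", "hopZ5", "hopH1"},
        "strain_2": {"hopA1", "hopZ5", "hopZ1"},
        "strain_3": {"hopC1", "hopH1"},
    }
    for merge in average_linkage(toy):
        print(merge)
```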

We also investigated the presence of phytotoxin biosynthetic gene clusters in the genome assemblies, which displayed a simpler distribution. The mangotoxin gene cluster was conserved across P. syringae. Evidence for the complete syringomycin, syringolin, and syringopeptin biosynthetic pathways was also present in most P. syringae genomes including the cucurbit strains examined here, although these pathways were notably absent from strains associated with diseases of woody hosts including all members of phylogroup Pav, P. cerasi 58<sup>T</sup>, and P. syringae pv. papulans CFBP1754<sup>PT</sup>. Additionally, phylogroup 2a cucurbit strains 03-19A, 200-1, BS2121, and ZUM3984 carried dcd2, which is involved in phaseolotoxin biosynthesis, but lacked other components of this biosynthetic pathway (**Supplementary Figure S2**). Ravindran et al. (2015) previously noted that phylogroup 2b-a strain HS191 was negative for ice-nucleation activity and carried an ∼2 Kb truncation in the center of the ice-nucleation protein, inaZ. As the 2b-a cucurbit strains were previously found to be negative for ice-nucleation activity (Newberry et al., 2018), we investigated the presence of inaZ in the genome assemblies of these strains and found the same truncation (data not shown).

#### Pan-Genome Association Analysis

We investigated the pan-genome of P. syringae to identify orthologous groups (OGs) associated with the cucurbit strains and, therefore, potentially involved in niche adaptation. Among the 19,613 OGs that comprised the P. syringae pan-genome, only seven were significantly associated with both phylogroup 2a and 2b-a cucurbit strains (Bonferroni p ≤ 10<sup>−5</sup>). Among these, two displayed 100% specificity, indicating that these OGs were completely absent from the genomes of related P. syringae strains; these were the T3SE hopZ5 and a hypothetical protein also present in the ICE locus. Several other OGs with known functions in virulence displayed significant associations and included the T3SE hopA1, its corresponding chaperone shcA, and the type VI secreted effector vgrG. The gene that displayed the strongest association with the cucurbit lineages was a hypothetical protein of 63 aa in length, adjacent to a putative iron-sulfur binding gene cluster (**Table 4**).
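The association metrics used in this analysis can be illustrated with a short sketch. Sensitivity is the proportion of target (cucurbit) genomes carrying an OG and specificity is the proportion of non-target genomes lacking it; a Bonferroni adjustment multiplies a raw p-value by the number of tests. The function names and toy genome labels below are hypothetical, and using the number of OGs (19,613) as the number of tests is an assumption made only for the example.

```python
def sensitivity_specificity(present_in: set, targets: set, non_targets: set):
    """Return (sensitivity, specificity) for one orthologous group.

    sensitivity: fraction of target genomes in which the OG is present.
    specificity: fraction of non-target genomes in which the OG is absent.
    """
    sensitivity = len(present_in & targets) / len(targets)
    specificity = len(non_targets - present_in) / len(non_targets)
    return sensitivity, specificity


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical genome labels (not the study's data).
    cucurbit = {"c1", "c2", "c3", "c4"}
    other = {"n1", "n2", "n3", "n4", "n5"}
    og_present = {"c1", "c2", "c3", "n1"}
    print(sensitivity_specificity(og_present, cucurbit, other))
    # Assumption: one test per OG in the pan-genome.
    print(bonferroni(1e-10, 19613))
```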

#### Convergent Acquisition of T3SEs and Other Putative Virulence Factors Through Integrative and Conjugative Elements and Plasmids

Analysis with IslandViewer 4 revealed that all cucurbit strains carried a predicted genomic island between approximately 90 and 125 Kb in size that was similar in structure to an ICE described in P. syringae pvs. actinidiae and phaseolicola of phylogroups 1 and 3, respectively (Pitman et al., 2005; McCann et al., 2013). The boundaries of the island were delineated by parA and xerC at opposite ends, each flanking a repeat region overlapping tRNA<sup>Lys</sup>, which serve as the attachment sites for the island (Pitman et al., 2005). The contigs corresponding to predicted ICE loci were subsequently extracted from the genomes of representative cucurbit strains (divided between one and five contigs per genome) and used to construct a multiple sequence alignment along with the complete ICE sequences of P. syringae pv. actinidiae (Psa) SR121 (Accession no. KX009066) and P. syringae pv. syringae HS191 (**Figure 6**). We were unable to confirm whether the predicted ICE carried by strains 13-429 and 13-139B was intact due to the presence of the pyoverdine biosynthetic gene cluster (∼60 Kb) in the center of the ICE region of these two strains; they were therefore not included in the alignment.

This analysis revealed that the putative ICEs displayed a mosaic structure characterized by differing mobile genetic elements and gene cassettes. It also revealed that the overall synteny of the ICEs was conserved and that the core ICE genes displayed a high degree of homology (ranging from 96 to 99% average pairwise identity) with the ICE of Psa strain SR121. Similar to the ICE described in P. syringae pv. phaseolicola strain 1302A (Pitman et al., 2005), hopAR1 was carried in the ICE of strain 200-1 between a predicted helicase and pilL, while avrRpt2 was found at the same locus in strain 13-509A and HS191 carried a predicted type VI secretion protein (hcp1). Similarly, hopZ5 and a disrupted copy of hopH1 were present in the ICEs of all cucurbit strains except 13-509A, which carried hopC1 and an intact copy of hopH1 at the same locus. Both of these effector pairs were flanked by a predicted phage integrase and site-specific recombinase (xerD) that shared 97.05 and 93.13% aa identity, respectively.

The effector hopZ1 was also present in the ICE of all but two cucurbit strains (ZUM3584, 200-1) and was located approximately 10 Kb downstream of hopZ5/hopC1, near the boundary of the island. Interestingly, hopZ1 was also carried by strain HS191 and was linked to a similar integrase/recombinase gene cassette. However, in HS191, hopZ1 was present in the exchangeable effector locus, as indicated by the presence of queA and tRNA<sup>Leu</sup>, which delineate the boundary of this pathogenicity island (**Supplementary Figure S3**). Several other accessory genes and putative virulence factors were identified in the ICEs. Strain ZUM3985 carried hopAW1 at a locus unique to the ICE found in this strain. It also carried a predicted calcium binding protein (1,691 aa) of the RTX toxin superfamily, which is a protein family commonly secreted via the type I secretion system (Linhartová et al., 2010). Likewise, a cassette of small peptides between 62 and 83 aa in length with a predicted papB domain, which is involved in the regulation of adhesin biosynthesis (Xia et al., 2000), was conserved in the ICEs of several strains including 13-140A, ZUM3584, 200-1, HS191, and SR121 (**Figure 6**).

<sup>a</sup>IMG locus tags are reported for strain 13-140A; the identifier Ga0170668\_ precedes each locus tag number. <sup>b</sup>ICE shown in parentheses indicates that a gene was present in the integrative and conjugative element locus. <sup>c</sup>Sensitivity indicates the proportion of the target population (cucurbit lineages) in which an orthologous group was present, and specificity shows the proportion of the non-target population in which an orthologous group was absent.

were not aligned and contain sequence elements specific to a particular ICE. Gene annotations and general functional annotations are labeled where available.

Analysis with plasmidSPAdes provided no evidence of plasmid sequences in phylogroup 2b-a cucurbit strains, whereas several were assembled among strains within phylogroup 2a. These results were consistent with the size of genome assemblies, which averaged 5.92 and 6.23 Mb for the 2b-a and 2a cucurbit strains, respectively (**Table 1**). A 52.42 Kb contig was assembled in strains 200-1 and 03-19A, which displayed 93% nucleotide identity with 66% query coverage to the complete plasmid sequence from strain HS191 (NZ\_CP006257.1). This plasmid carried the effectors hopC1 and hopH1, which marked the second copy of hopH1 for these two strains, in addition to the disrupted allele present in the ICE. A putative virulence plasmid of 16.51 Kb in size was assembled in strains 13-429 and 13-139B, which harbored four accessory effector genes including hopAU1, hopAW1, hopAF1, and avrPto. The top hit for this contig in the NCBI complete plasmid database was plasmid PP3 (NZ\_LT963405.1) of P. syringae pv. avii strain CFBP3846, which displayed only distant homology (95% nucleotide identity and 39% query coverage). Strains BS2121 and ZUM3984 also carried two putative plasmids, for which the top BLAST hits included the complete plasmids pCC1557 (NZ\_CP007015.1) and pB13-200A (NZ\_CP019872.1). However, no apparent virulence factors were identified in these contigs. A summary of the origin of T3SEs among representative cucurbit strains is presented in **Figure 7C** and a compiled list of putative plasmid sequences and BLAST results is available in **Supplementary Table S1**.

#### Correlation Between T3SE Repertoires and Pathogenicity

The two independent pathogenicity experiments produced similar results and showed significant differences in disease severity between the treatments (P < 0.0001). Phylogroup 2b-a strains 13-140A, 13-C2, and ZUM3854 induced expanding lesions and foliar blighting on watermelon and squash, with mean severity ratings ranging from 5.60 to 7.50 among the three strains. In contrast, the squash strains 13-509A and PS711 produced only superficial lesions on both hosts and mean severity ratings that did not significantly differ from that of the millet strain, HS191 (mean severity ratings ranging from 0.70 to 2.40). Two effector pairs differentiated the virulent from weakly virulent strains within this group and included hopZ5 and the disrupted copy of hopH1 among the virulent strains, while the weakly virulent strains carried hopC1 and an intact copy of hopH1. Within phylogroup 2a, no significant differences in disease severity were observed among strains BS2121, ZUM3984, and 13-139B, which produced moderate to high levels of disease severity on both hosts. While strains 03-19A and 200-1 also induced expanding lesions and foliar blighting on squash (between 5.00 and 7.00 mean severity), they were weakly virulent on watermelon (between 1.40 and 2.00 mean severity). Four T3SEs were unique to the squash-limited strains of phylogroup 2a and included hopC1, an intact copy of hopH1, hopAR1, and hopM1 disrupted by a frameshift mutation. Likewise, hopAW1, hopZ1, and an intact hopM1 allele were unique effectors common among the strains virulent to both watermelon and squash (**Figure 7**).

among strains (C).

### DISCUSSION


Bacterial leaf spot of watermelon, cantaloupe, and squash, caused by P. syringae (referring to sensu stricto unless otherwise stated), is a sporadic plant disease that is often attributed to contaminated seed. Here, we provide the first genomic insights into the P. syringae strains associated with these three cucurbit hosts, with a focus on three apparently clonal lineages located in phylogroups 2a and 2b. We sequenced the genomes of six strains that were isolated from diseased cucurbits grown in the United States, Europe, and China over a period of more than a decade, as well as eight strains that were isolated from watermelon and squash grown in six different Florida counties during the 2013 and 2014 seasons (**Table 1**). Reconstruction of the P. syringae core genome revealed that the cucurbit-associated lineages of phylogroup 2b formed a novel phylogenetic group, designated here as phylogroup 2b-a (**Figure 1**), which emerged through genome-wide homologous recombination between phylogroups 2a and 2b (**Figure 2**). While the majority of this group consisted of a single clonal lineage isolated from plants grown in the United States, France, and Serbia, strains ZUM3584 and HS191 were more distantly related and displayed variation in the distribution of recombinant loci that were dispersed across the core genome (**Figures 1**, **2**). As the overall proportion of recombinant sequences was similar among all 2b-a lineages (ranging from 27.98 to 30.54% of the core genome), the distribution of recent recombination events, rather than the quantity, was the primary factor driving diversification within this group. These observations were supported by the recombination-filtered core-genome phylogeny, which showed phylogroup 2b-a as a branch within phylogroup 2b, rather than as a distinct group, and more recently diverged as compared to the unfiltered core-genome phylogeny (**Figures 2B,C**). In contrast to the 2b-a strains, the P. syringae strains isolated from watermelon, cantaloupe, and squash in phylogroup 2a were nested among the lineages of other strains isolated from corn, wheat, switchgrass roots, and various tree species (**Figure 1**).

Unexpectedly, another example of genome-wide homologous recombination was inferred in our analysis among the P. syringae pv. avellanae (Pav) lineages of phylogroup Pav, whereby an estimated 40.09% of the core genome was acquired through ancestral recombination events (**Supplementary Figure S1**). These results were particularly interesting as the four Pav genomes analyzed were not clonal and displayed levels of sequence divergence equivalent to that of other P. syringae phylogroups (**Table 3**), suggesting that the fixation of recombinant alleles across phylogroup Pav (i.e., inference of ancestral recombination) was not an artifact of small sample size. These results were striking given the minimal impact of recombination on the evolution of the primary P. syringae phylogroups, which displayed an average of 95% pairwise identity between groups and were at the edge of the proposed species delimitation boundary for prokaryotes (**Table 3**). Both the recent and ancestral recombination profiles suggested admixture between phylogroups Pav and 2b (**Figure 2A** and **Supplementary Figure S1**). However, because phylogroup Pav did not display levels of pairwise identity intermediate between two different phylogroups, as was observed for 2b-a (**Table 3**), this could indicate that recombination has occurred with a currently unsampled P. syringae population. Taken together, these observations indicate that P. syringae phylogroups may inhabit overlapping environmental niches and are consistent with an epidemic population structure.

The link between macro-scale recombination events and the emergence of hybrid epidemic clones has been well documented among human and animal bacterial pathogens (Spoor et al., 2015). As such, genome-wide recombination events are associated with niche adaptation and remodeling of the host-pathogen interaction. Functional analysis of the recent recombination events that led to the emergence of phylogroup 2b-a painted a similar picture. Although a strong linear relationship between the COG category compositions suggested an overall proportional change among the general functional groups (**Figure 3A**), analysis utilizing the KEGG BRITE database revealed that pathways involved in the ATP-dependent transport of amino acids and other organic compounds, bacterial motility, and secretion systems were enriched for recombinant genes (**Figure 3B**). Interestingly, most of the specific pathways found within these functional groups were similarly affected by recombination in the millet pathogen, P. syringae pv. syringae HS191, except for those genes encoding the type 3 secretion system, otherwise known as the hrp/hrc pathogenicity island. A phylogenetic analysis confirmed that the recombinant hrp/hrc genes shared a common evolutionary history with that of phylogroup 2a (**Figure 4**) and, therefore, may be a significant factor contributing to the convergent pathogenicity of 2a and 2b-a lineages to cucurbit hosts. This finding was reminiscent of bacterial etiolation and decline of creeping bentgrass, for which the host-specificity of two phylogenetically distinct Acidovorax avenae populations was associated with three ancestral recombination events affecting the hrp/hrc gene cluster (Zeng et al., 2017).

In addition to genome-wide recombination, we found evidence for the remodeling of T3SE repertoires of phylogroup 2b-a. This remodeling was marked by the acquisition of hopA1, which was carried exclusively by most phylogroup 2a lineages within P. syringae, including those isolated from cucurbit hosts. The hopA1 and shcA genes, which form an effector-chaperone complex, were the only two genes occupying the exchangeable effector locus of phylogroup 2b-a lineages (except for the millet strain HS191 which carried hopA2), suggesting that this effector was recently acquired. Previous studies have demonstrated that strains of P. syringae sensu stricto carry fewer T3SEs than other Pseudomonas spp. and this was associated with the production of broad host-range toxins such as syringomycin (Baltrus et al., 2011; Hulin et al., 2018). Although we found evidence for the syringomycin, syringopeptin, and syringolin biosynthetic clusters in the genome assemblies (**Supplementary Figure S2**), on average, phylogroup 2b-a and 2a cucurbit strains carried more T3SEs (between 17 and 21 potentially functional or disrupted effector genes) than other P. syringae lineages (**Figure 5**). Most of these accessory T3SEs were acquired through independent mechanisms of HGT. These included an ICE, present in both 2a and 2b-a lineages (**Figure 6**), as well as two different plasmids harboring effector genes among phylogroup 2a lineages (**Supplementary Table S1**). Interestingly, a positive correlation between the proportion of recent recombination in the core genome and the number of T3SEs was observed in other P. syringae strains, including P. cerasi 58<sup>T</sup>, P. syringae pv. papulans CFBP1754<sup>PT</sup>, and P. syringae pv. pisi PP1, highlighting the interplay of homologous recombination and HGT in pathogen emergence (**Figures 2, 5**).

Despite the open pan-genome exhibited by P. syringae (Nowell et al., 2014), pan-genome association analysis identified only a handful of orthologous groups that were significantly associated (Bonferroni p ≤ 10<sup>−5</sup>) with both the 2a and 2b-a cucurbit strains (**Table 4**). Four of the seven significantly associated genes were previously described virulence factors (the T3SE hopZ5, the hopA1/shcA effector-chaperone complex, and the type VI secreted effector vgrG), suggesting a role in pathoadaptation. The effector hopZ5 was among several accessory T3SEs acquired through the ICE (**Figure 6**) and was the only virulence factor which distinguished the cucurbit-associated lineages from related pathogens of various plant species (**Table 4**). Interestingly, hopZ5 was also among the few virulence factors acquired by the pandemic strains of P. syringae pv. actinidiae (Psa) biovar 3 (which may be considered a species distinct from P. syringae sensu stricto) and to date, has only been recorded in the genomes of Pseudomonas spp. associated with diseases of woody hosts (McCann et al., 2013; Nowell et al., 2016). Although hopZ5 was not linked to the ICE described in Psa biovar 3, it was curious to find that many of the core ICE genes associated with the horizontal transfer of this effector among the cucurbit strains shared a recent evolutionary history with the ICE carried by Psa strain SR121 (**Figure 6**).

The acquisition of novel virulence factors through HGT is commonly attributed to changes in bacterial phenotypes. Hence, it was striking to note the contrasting pathogenicity among the cucurbit strains (**Figure 7**). As these differences in pathogenicity were observed among strains of the same clonal lineage, this indicated that components of the accessory genome, rather than the underlying genetic background, were likely the key factor influencing these phenotypes. We found that two phylogroup 2b-a strains collected in Florida and Serbia (13-509A and PS711, respectively) were both weak pathogens of watermelon and squash and produced superficial lesions like that of the millet strain HS191 (**Figures 7A,B**). Interestingly, 13-509A and PS711 also carried effector repertoires more like that of the millet strain than other virulent 2b-a strains isolated from cucurbits (**Figure 5**). This difference was primarily accounted for by the presence of hopC1, in place of hopZ5 in the ICE locus, while hopC1 was carried in the exchangeable effector locus of strain HS191 (**Figure 7C**). Both hopZ5 and hopC1 were adjacent to a second T3SE, hopH1 (**Figure 6**). However, the hopH1 allele present in the virulent cucurbit strains carried a point deletion that introduced a frameshift mutation in the gene and was therefore not predicted to be translocated or expressed.

A similar negative association between the hopC1 and hopH1 effector pair and virulence was observed among phylogroup 2a strains 03-19A and 200-1. Both strains were weakly pathogenic to watermelon and carried hopC1 and an intact copy of hopH1 on a putative ∼52 Kb plasmid (**Figure 7C**). While the disruption of hopH1 among highly virulent strains within phylogroups 2a and 2b-a may serve to avoid host recognition, we cannot discount the possibility that the disruption of this effector was an artifact due to its horizontal transfer and linkage to hopZ5, rather than selection pressure. Furthermore, we observed that strains 03-19A and 200-1 induced severe foliar blighting on squash, rather than being weakly pathogenic in general (**Figures 7A,B**). This indicates a difference in the nature of the host-pathogen interaction and perhaps suggests induction of effector-triggered immunity in watermelon. These strains also carried hopAR1 in the ICE locus (**Figure 7C**), which is an avirulence gene that has been well described in P. syringae pv. phaseolicola (Tsiamis et al., 2000) and is another candidate potentially limiting the host-range of these two strains to squash. Further analysis is required to determine whether the expression of any of the gene candidates identified here, including hopC1, hopH1, and hopAR1, may serve as negative pathogenicity factors in cucurbits, and conversely, whether hopZ5 promotes the virulence of these pathogens.

The widespread distribution of two genetically monomorphic P. syringae populations in association with multiple cucurbit hosts suggests a role for natural selection in maintaining these populations. Furthermore, the convergent acquisition of alleles through homologous recombination and multiple virulence factors through HGT among the cucurbit strains examined here may be interpreted as genomic signatures of host adaptation (Sheppard et al., 2018). Although the evolutionary processes inferred here mirrored those of other plant–pathogenic bacteria responsible for an array of emerging diseases, we do not know if these recombination events occurred upon colonization of cucurbit hosts or if these events happened prior to exposure to this ecological niche. Clues as to the answer to this question may be found in the recombinant genome of the millet pathogen, P. syringae pv. syringae HS191, which lacked similar signatures in ecologically significant loci such as the hrp/hrc pathogenicity island and ICE locus. The significance of the numerous other functional pathways affected by recent recombination events remains to be explored and provides further evidence for the emerging paradigm that plant–pathogen compatibility is not defined solely by T3SE repertoires but is likely a multifactorial process involving the acquisition/metabolism of plant-derived nutrients, bacterial chemotaxis, and evasion of plant innate immunity, among other processes (Jacques et al., 2016). Ultimately, this hypothesis will need to be tested through more extensive sampling of P. syringae strains from multiple environments coupled with functional analysis.

#### AUTHOR CONTRIBUTIONS

EN, ME, JJ, EG, and MP conceived the project. MP, CB, NZ, EN, and AO provided P. syringae strains. EN conducted the pathogenicity assays. ME prepared the genomic DNA. ME, ST, and JJ oversaw the sequencing experiments. ST and NP assembled the draft genomes. ME conducted the toxin analysis and EN performed all other computational analyses with support from JH-T and NP. ME submitted the genome sequences to NCBI-GenBank and JGI-IMG. EN wrote the manuscript and all authors provided a critical review of the paper.

#### FUNDING

This research was supported in part by the Southern IPM Center, Florida Watermelon Association, the National Watermelon Association, and USDA-NIFA. NZ and AO were supported by the national project III46008. NP and EN were supported by funding from NIFA-Hatch and Alabama Agriculture Experiment Station. Publication of this manuscript was supported by the University of Florida Open Access Publishing Fund.

#### ACKNOWLEDGMENTS

Thanks to Dr. Bert Woudt for providing P. syringae strains collected in Italy and China. The authors declare that all standard biosecurity and institutional safety procedures were adhered to during this work.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2019.00270/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Newberry, Ebrahim, Timilsina, Zlatković, Obradović, Bull, Goss, Huguet-Tapia, Paret, Jones and Potnis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Inference of Convergent Gene Acquisition Among Pseudomonas syringae Strains Isolated From Watermelon, Cantaloupe, and Squash

Eric A. Newberry<sup>1,2</sup>, Mohamed Ebrahim<sup>3,4</sup>, Sujan Timilsina<sup>3</sup>, Nevena Zlatković<sup>5</sup>, Aleksa Obradović<sup>5</sup>, Carolee T. Bull<sup>6</sup>, Erica M. Goss<sup>3,7</sup>, Jose C. Huguet-Tapia<sup>3</sup>, Mathews L. Paret<sup>2</sup>, Jeffrey B. Jones<sup>3</sup>\* and Neha Potnis<sup>1</sup>\*

#### Approved by:

*Frontiers Editorial Office, Frontiers Media SA, Switzerland*

> \*Correspondence: *Jeffrey B. Jones jbjones@ufl.edu Neha Potnis nzp0024@auburn.edu*

#### Specialty section:

*This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology*

Received: *04 April 2019* Accepted: *16 April 2019* Published: *03 May 2019*

#### Citation:

*Newberry EA, Ebrahim M, Timilsina S, Zlatković N, Obradović A, Bull CT, Goss EM, Huguet-Tapia JC, Paret ML, Jones JB and Potnis N (2019) Corrigendum: Inference of Convergent Gene Acquisition Among Pseudomonas syringae Strains Isolated From Watermelon, Cantaloupe, and Squash. Front. Microbiol. 10:963. doi: 10.3389/fmicb.2019.00963*

*<sup>1</sup> Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, United States, <sup>2</sup> Department of Plant Pathology, North Florida Research and Education Center, University of Florida, Quincy, FL, United States, <sup>3</sup> Department of Plant Pathology, University of Florida, Gainesville, FL, United States, <sup>4</sup> Department of Plant Pathology, Faculty of Agriculture, Ain Shams University, Cairo, Egypt, <sup>5</sup> Faculty of Agriculture, University of Belgrade, Belgrade, Serbia, <sup>6</sup> Department of Plant Pathology and Environmental Microbiology, Pennsylvania State University, State College, PA, United States, <sup>7</sup> Emerging Pathogens Institute, University of Florida, Gainesville, FL, United States*

Keywords: horizontal gene transfer, homologous recombination, pathogen emergence, Pseudomonas syringae sensu stricto, cucurbits

#### **A Corrigendum on**

#### **Inference of Convergent Gene Acquisition Among Pseudomonas syringae Strains Isolated From Watermelon, Cantaloupe, and Squash**

by Newberry, E. A., Ebrahim, M., Timilsina, S., Zlatković, N., Obradović, A., Bull, C. T., et al. (2019). Front. Microbiol. 10:270. doi: 10.3389/fmicb.2019.00270

In the original article, we neglected to acknowledge the University of Florida Open Access Publishing Fund in supporting the publication of this manuscript. The authors apologize for this error and state that this does not change the scientific conclusions of the article in any way. The original article has been updated.

Copyright © 2019 Newberry, Ebrahim, Timilsina, Zlatković, Obradović, Bull, Goss, Huguet-Tapia, Paret, Jones and Potnis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multiple Recombination Events Drive the Current Genetic Structure of Xanthomonas perforans in Florida

Sujan Timilsina<sup>1</sup> , Juliana A. Pereira-Martin<sup>2</sup> , Gerald V. Minsavage<sup>1</sup> , Fernanda Iruegas-Bocardo<sup>1</sup> , Peter Abrahamian<sup>2</sup> , Neha Potnis<sup>3</sup> , Bryan Kolaczkowski<sup>4</sup> , Gary E. Vallad<sup>2</sup> \*, Erica M. Goss1,5 \* and Jeffrey B. Jones<sup>1</sup> \*

<sup>1</sup> Department of Plant Pathology, University of Florida, Gainesville, FL, United States, <sup>2</sup> Gulf Coast Research and Education Center, University of Florida, Gainesville, FL, United States, <sup>3</sup> Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, United States, <sup>4</sup> Microbiology and Cell Science, University of Florida, Gainesville, FL, United States, <sup>5</sup> Emerging Pathogens Institute, University of Florida, Gainesville, FL, United States

#### Edited by:

Dawn Arnold, University of the West of England, United Kingdom

#### Reviewed by:

Prabhu B. Patil, Institute of Microbial Technology (CSIR), India Marcus Michael Dillon, University of Toronto, Canada

#### \*Correspondence:

Gary E. Vallad gvallad@ufl.edu Erica M. Goss emgoss@ufl.edu Jeffrey B. Jones jbjones@ufl.edu

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology

Received: 30 November 2018 Accepted: 20 February 2019 Published: 13 March 2019

#### Citation:

Timilsina S, Pereira-Martin JA, Minsavage GV, Iruegas-Bocardo F, Abrahamian P, Potnis N, Kolaczkowski B, Vallad GE, Goss EM and Jones JB (2019) Multiple Recombination Events Drive the Current Genetic Structure of Xanthomonas perforans in Florida. Front. Microbiol. 10:448. doi: 10.3389/fmicb.2019.00448

Prior to the identification of Xanthomonas perforans associated with bacterial spot of tomato in 1991, X. euvesicatoria was the only known species in Florida. Currently, X. perforans is the Xanthomonas sp. associated with tomato in Florida. Changes in pathogenic race and sequence alleles over time signify shifts in the dominant X. perforans genotype in Florida. We previously reported recombination of X. perforans strains with closely related Xanthomonas species as a potential driving factor for X. perforans evolution. However, the extent of recombination across the X. perforans genomes was unknown. We used a core genome multilocus sequence analysis approach to identify conserved genes and evaluated recombination-associated evolution of these genes in X. perforans. A total of 1,356 genes were determined to be "core" genes conserved among the 58 X. perforans genomes used in the study. Our approach identified three genetic groups of X. perforans in Florida based on the principal component analysis (PCA) using core genes. Nucleotide variation in 241 genes, referred to as Phylogenetic-group Defining (PgD) genes, defined these groups. Furthermore, alleles of many of these PgD genes showed 100% sequence identity with X. euvesicatoria, suggesting that variation likely has been introduced by recombination at multiple locations throughout the bacterial chromosome. Site-specific recombinase genes along with plasmid mobilization and phage-associated genes were observed at different frequencies in the three phylogenetic groups and were associated with clusters of recombinant genes. Our analysis of core genes revealed the extent, source, and mechanisms of recombination events that shaped the current population and genomic structure of X. perforans in Florida.

Keywords: core genome multilocus sequence typing, bacterial evolution, recombination, horizontal gene transfer (HGT), Xanthomonas perforans, bacterial spot

### INTRODUCTION

Bacterial pathogens challenge the sustainability and economics of agricultural production. The most damaging bacterial plant pathogens combine rapid evolution with a tendency for emerging strains to spread quickly over long distances (Carroll et al., 2014). Characterizing bacterial strains associated with disease outbreaks advances our understanding of changes in pathogen populations and geographic distribution of genetic variation as well as the potential to trace the source of outbreaks. Technological advancements in both sequencing and computational tools have facilitated translational research for bacterial disease management via epidemiological and resistance-based approaches in hosts ranging from humans to plants (Köser et al., 2012; Gétaz et al., 2018).

Evolutionary and epidemiological studies of bacterial populations use core genomes, pan-genomes, and intergenic regions to uncover patterns and processes of strain emergence and spread (Biek et al., 2015; McNally et al., 2016; Jibrin et al., 2018). The process of whole genome sequencing followed by gene-by-gene comparisons to identify core genes, which are present in all sampled genomes, expands MLSA (Multilocus Sequence Analysis) from a half-dozen to several hundred or even a thousand genes (Bialek-Davenet et al., 2014). This reproducible approach for phylogenetic comparisons is termed core genome MLSA (cgMLSA) (Maiden and Harrison, 2016; Ghanem et al., 2017; Moura et al., 2017).

The genus Xanthomonas is comprised of plant pathogenic bacteria affecting multiple plant hosts. Fresh market tomato production in Florida is severely affected by bacterial spot disease of tomato caused by Xanthomonas perforans (Jones et al., 2004; Horvath et al., 2012). Previous studies on X. perforans strains isolated from Florida have shown shifts in the bacterial population with regards to species, races, bactericide resistance, bacteriocin production, effector profiles, and phylogenetic groups (Timilsina et al., 2014; Schwartz et al., 2015; Abrahamian et al., 2018). Prior to the initial identification of X. perforans in 1991, only tomato race 1 (T1) strains of Xanthomonas euvesicatoria were reported on tomato in Florida (Horvath et al., 2012; Timilsina et al., 2016). The first X. perforans strains from Florida were identified as tomato race 3 (T3) strains (Jones, 2004; Timilsina et al., 2016). T3 strains carry the functional XopAF (avrXv3) and XopJ4 (avrXv4) effectors. In 1998, a tomato race 4 (T4) X. perforans strain was identified (Minsavage et al., 2003) that lacked a functional XopAF effector. Various surveys and independent isolations over the last two decades determined that T4 X. perforans has become the dominant pathogen causing bacterial spot on tomato in Florida (Horvath et al., 2012; Vallad et al., 2013). While selection for widespread copper tolerance in bacteria is expected due to the historical reliance on copper-based bactericides for the management of bacterial spot disease (Vallad et al., 2010), the drivers of tomato race change (in the absence of host resistance), host expansion, and introduction of novel effector genes are less obvious.

We previously identified at least two phylogenetic groups of X. perforans in strains isolated from Florida in 2006 and 2012 using MLSA of six housekeeping genes (Timilsina et al., 2014). Of the two groups, group 2 strains appeared recombinant based on the sequences of two housekeeping genes that were identical to X. euvesicatoria strain Xe85-10, isolated from pepper (Timilsina et al., 2014). Although X. perforans strains are regarded as tomato specific, a group 2 X. perforans strain, Xp2010, was isolated from pepper, and other group 2 strains from tomato were shown to cause disease on pepper (Timilsina et al., 2014; Schwartz et al., 2015). The phenotypic and genotypic changes in group 2 strains suggest that the genomic impact of recombination likely extends beyond the few genes we have previously reported (Jibrin et al., 2018).

Phylogenetic methods are commonly applied to the study of bacterial strain ancestry and diversification (Didelot and Falush, 2007). However, most phylogenetic analysis methods assume recombination is absent, and the presence of recombination in the history of a sample can cause incorrect phylogenies. For multilocus sequence analysis of bacterial populations, the tendency has been to remove recombination in order to correctly interpret ancestral relationships for the unrecombined portion of the genome, the "clonal frame" (Wicker et al., 2012; Croucher et al., 2014; Lu et al., 2016). However, considering the ubiquity and impact of recombination on bacterial genetic diversity and evolution, the effect of recombination on phylogenetic relationships should be considered (Didelot and Wilson, 2015; Mostowy et al., 2017). Horizontal gene transfer can expedite evolution and may influence host-specificity in bacteria (Ochman et al., 2000; Yan et al., 2008). Genetic transfer may result in trait convergence due to shared genes acquired by horizontal gene transfer, or lead to the formation of distinct lineages or phylogroups (McNally et al., 2016). Transduction via virus, transformation by donor DNA, and conjugation with the donor are the three mechanisms by which bacteria acquire genetic material (Ochman et al., 2000). The acquisition of genomic DNA can leave specific signals surrounding the introduced genes at the integration sites (Ochman et al., 2000). For example, genomic movement between bacterial species by transduction is limited by phage-host specificity and the events are mediated by mobile DNA vectors observed along with the translocated genomic DNA (Popa et al., 2017).

Our objectives were to determine the extent of recombination in X. perforans genomes from Florida strains, identify recombined genes that contribute to the observed population structure in Florida, and evaluate putative mechanisms of genetic transfer of recombined regions. Using a cgMLSA approach, our study provides insights into the extent of recombination and mechanisms of horizontal gene transfer affecting the core genes that constitute the majority of the genomic background of phylogenetically divergent X. perforans genomes. The presence of multiple recombination mechanism signals throughout the genome, affecting both core and pathogenicity associated genes, is consistent with high genome plasticity in X. perforans. We provide empirical evidence that recombination of core genes has defined the existing phylogenetic groups of X. perforans in Florida. The observed genomic patterns appear to be correlated with traits like host-specificity and overall pathogen fitness and indicate that recombination has an extraordinary impact on evolutionary processes in X. perforans.

#### MATERIALS AND METHODS

#### Bacterial Strains, Genome Assembly, and Genome Similarity

The genomes of 58 X. perforans strains isolated from Florida in 1991, 2006, 2012/13, and 2015 were used in this study (**Table 1**). Draft whole genome sequences of 33 strains, including reference strain Xp91-118, were previously published (Potnis et al., 2011; Schwartz et al., 2015). The remaining 25 X. perforans strains collected in 2015 are also publicly available (**Supplementary Table 1**). The raw Illumina MiSeq 2 × 250 bp reads were reassembled using SPAdes v.3.11 with read error correction and the "--careful" switch (Bankevich et al., 2012). The assembled sequences were validated using filter-spades.py<sup>1</sup> and Bowtie2 was used to align the assembled reads to identify inconsistencies (Langmead and Salzberg, 2012). Pilon (Walker et al., 2014) was used to remove the inconsistencies identified by Bowtie2. The assembled sequences were filtered to remove sections with coverage less than 2 and contig size less than 500 nucleotides. CheckM identified more than 99% genome completeness with less than 0.6% contamination per genome (Parks et al., 2015; **Supplementary Table 1**). The genomes were annotated using the IMG/JGI platform (Markowitz et al., 2013). Following assembly, pairwise Average Nucleotide Identity (ANI) based on BLAST was calculated using jSpecies v 1.2.1 (Richter and Rosselló-Móra, 2009).

#### TABLE 1 | List of strains used in this study.

### Pan-Genome Size

To evaluate the pan-genome, all genes were extracted from the 58 X. perforans genomes using roary (Page et al., 2015) following gene annotation with prokka (Seemann, 2014). The method yielded a total of 7,245 genes. The pan-genome matrix of gene presence/absence in each genome was used as input for a rarefaction analysis to calculate the average number of genes added with each additional genome (Méric et al., 2014). The calculation was randomized by resampling 100 times. The Heaps law function was fitted to the rarefaction curve using the micropan package in R (Tettelin et al., 2008; Snipen and Liliand, 2015). The Heaps law model estimates the parameter alpha: alpha > 1 suggests a closed pan-genome and saturated sampling of the gene pool, while alpha < 1 suggests an open pan-genome.
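The rarefaction and Heaps law steps described above can be illustrated with a minimal Python sketch. It is not the micropan implementation: it assumes the presence/absence matrix is a list of 0/1 rows (one per genome), and it estimates alpha by a simple log-log least-squares fit of n = kappa × N^(−alpha), where n is the mean number of new genes seen when the N-th genome is added. The function names are hypothetical.

```python
import math
import random


def new_genes_per_genome(matrix, n_perm=100, seed=0):
    """Mean number of new genes found when genome N = 2..G is added.

    matrix: list of genomes, each a list of 0/1 flags (one per gene).
    The genome order is shuffled n_perm times and the counts averaged.
    """
    rng = random.Random(seed)
    n_genomes = len(matrix)
    totals = [0.0] * (n_genomes - 1)
    for _ in range(n_perm):
        order = list(range(n_genomes))
        rng.shuffle(order)
        seen = {g for g, present in enumerate(matrix[order[0]]) if present}
        for step, idx in enumerate(order[1:]):
            genes = {g for g, present in enumerate(matrix[idx]) if present}
            totals[step] += len(genes - seen)
            seen |= genes
    return [t / n_perm for t in totals]


def fit_heaps_alpha(mean_new):
    """Fit log(n) = log(kappa) - alpha * log(N) by least squares.

    mean_new[i] is the mean number of new genes at N = i + 2.
    Simplification (assumption): points with n == 0 are skipped.
    Returns (kappa, alpha).
    """
    pts = [(math.log(i + 2), math.log(n)) for i, n in enumerate(mean_new) if n > 0]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sxy / sxx
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope
```

Under this convention an estimated alpha below 1 would be read as an open pan-genome, matching the interpretation given in the text.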

#### Core Gene Identification and Alignment

The IMG/JGI annotated sequences were used to identify core genes among the 58 genomes. Nucleotide and amino acid sequences of annotated genes were used as input for core gene identification using get\_homologues v.2.0.1.9 (Contreras-Moreira and Vinuesa, 2013). Genes present in at least 95% of the genomes and with 75% pairwise alignment coverage were retained. The genes were parsed using python scripts to strictly define core genes as genes present in 100% of the genomes with intact start and stop codons. This approach was taken to limit the core genes to those most likely to be functional, based on genome annotation, in all strains. Genes with multiple copies were also removed. A total of 1,356 genes met the above criteria. Nine genes annotated as functional by the get\_homologues built-in annotation algorithm were not annotated by NCBI nor IMG/JGI, but were included in the analysis. The resulting nucleotide sequences of single copy core genes were individually aligned by MAFFT (Katoh and Standley, 2013) using a biopython script (Cock et al., 2009). Individual gene alignments were concatenated using sequence matrix software (Vaidya et al., 2011) to create a circa 1.09 megabases long sequence for each strain.
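The strict core-gene filter described above (present in every genome, single copy, intact start and stop codons) might look like the following sketch. The input layout, the function names, the set of accepted start codons (ATG, GTG, TTG), and the requirement that gene length be a multiple of three are assumptions made for illustration; the authors' own scripts are not shown in the article.

```python
# Assumed bacterial start codons; the article only says "intact start and stop codons".
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_intact(seq):
    """True if the sequence starts with a start codon, ends with a stop codon,
    and has a length that is a multiple of three (assumed criterion)."""
    seq = seq.upper()
    return (
        len(seq) >= 6
        and len(seq) % 3 == 0
        and seq[:3] in START_CODONS
        and seq[-3:] in STOP_CODONS
    )


def strict_core(clusters, n_genomes):
    """Return sorted IDs of clusters that pass the strict core definition.

    clusters: dict cluster_id -> dict genome_id -> list of nucleotide sequences
              (one list entry per gene copy in that genome).
    """
    core = []
    for cluster_id, members in clusters.items():
        if len(members) != n_genomes:
            continue  # absent from at least one genome
        if any(len(copies) != 1 for copies in members.values()):
            continue  # multi-copy in at least one genome
        if all(is_intact(copies[0]) for copies in members.values()):
            core.append(cluster_id)
    return sorted(core)
```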

### Sequence Typing and Gene Mapping

Individual core genes were sequence typed based on nucleotide sequence identity using a python script. Genes with identical sequences were assigned the same number, representing the sequence type. The process was repeated in a loop for all core genes and an output sequence type matrix was generated. This allowed quick comparison of core genes based on allelic variation. Invariable genes were stripped from the matrix to generate a heat map of allelic profiles using the ggplot2 package (Wickham, 2010) in R (R Core Team, 2013). The heat map was color coded to illustrate the allelic patterns for variable core genes, thus providing a genetic fingerprint.
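As a rough illustration of the sequence typing loop described above, the sketch below assigns the same integer to identical sequences of a gene, builds a gene-by-strain matrix, and drops invariable genes. The dictionary layout and function names are hypothetical, and numbering alleles in order of first appearance is an assumption.

```python
def sequence_types(alleles):
    """Assign a sequence type (ST) number to each strain for one gene.

    alleles: dict strain -> nucleotide sequence.
    Identical sequences receive the same number; numbers start at 1 and
    follow the order in which distinct sequences are first seen.
    """
    st_of_seq = {}
    result = {}
    for strain, seq in alleles.items():
        key = seq.upper()
        if key not in st_of_seq:
            st_of_seq[key] = len(st_of_seq) + 1
        result[strain] = st_of_seq[key]
    return result


def sequence_type_matrix(genes):
    """genes: dict gene -> dict strain -> sequence. Returns gene -> strain -> ST."""
    return {gene: sequence_types(alleles) for gene, alleles in genes.items()}


def variable_genes(matrix):
    """Keep only genes with more than one sequence type across strains."""
    return {g: sts for g, sts in matrix.items() if len(set(sts.values())) > 1}
```

The filtered matrix is the kind of input that could then be passed to a plotting library to draw the allelic-profile heat map.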

The relative positions of the core genes were mapped based on the complete genome of X. perforans Xp91-118 (NCBI accession number: GCA\_000192045.3). We used the collated nucleotide sequences of the core genes of Xp91-118 as queries to BLAST (Zhang et al., 2000) against the complete genome. The output was configured to list the start and end positions in the complete genome for all core genes, which were sorted by position using a python script. BRIG (BLAST Ring Image Generator) software v. 0.95 (Alikhan et al., 2011) was used to visualize the positions of individual core genes in Xp91-118.
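The position-sorting step can be sketched as follows, assuming the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), which the article does not specify. The function name is hypothetical; keeping only the highest-bitscore hit per query is also an assumption.

```python
def core_gene_positions(blast_tabular_text):
    """Return (query, start, end) tuples sorted by start position on the genome.

    Expects BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore.
    For each query only the highest-bitscore hit is kept; start/end are
    reported in ascending order regardless of strand.
    """
    best = {}
    for line in blast_tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query = fields[0]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        start, end = min(sstart, send), max(sstart, send)
        if query not in best or bitscore > best[query][2]:
            best[query] = (start, end, bitscore)
    return sorted(
        ((q, s, e) for q, (s, e, _) in best.items()),
        key=lambda item: item[1],
    )
```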

<sup>1</sup>https://github.com/drpowell/utils/blob/master/filter-spades.py

### Phylogenetic Analysis


Single gene evolution may be different from the evolution of the organism as a whole, particularly when there is horizontal gene transfer (Gogarten and Townsend, 2005). PhyML v.3.1 (Guindon et al., 2010) was used to construct maximum likelihood phylogenetic trees for single gene and concatenated core gene sequences. Nucleotide substitution models were estimated independently for individual genes and selected based on the log likelihood Akaike Information Criterion result calculated using jModelTest2 (Darriba et al., 2012). General time reversible model with gamma distributed rates and invariant sites (GTR+G+I) was identified as the best nucleotide substitution model for the concatenated sequence. Maximum likelihood trees were constructed with 500 bootstrap samples for both concatenated and single genes using the suggested substitution model. ClonalFrameML (Didelot and Wilson, 2015) was used to reconstruct maximum likelihood trees while accounting for recombination. ClonalFrameML calculates R/theta, nu, and 1/delta, which represent the relative rate of recombination to mutation, new polymorphisms introduced from recombination, and the inverse of average tract length of recombination (Didelot and Falush, 2007; Didelot and Wilson, 2015). The three parameters were calculated for all single gene trees and for the concatenated sequence tree (hereafter referred to as core genome tree). Additionally, genomic clustering observed in the phylogenetic trees was confirmed by principal component analysis (PCA). The sequence types of the core genes were used as input to conduct PCA using micropan package in R (Snipen and Liliand, 2015).
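The three ClonalFrameML parameters are commonly combined into a single ratio of substitutions introduced by recombination relative to mutation, r/m = (R/theta) × delta × nu, where delta is the mean tract length (the reciprocal of the reported 1/delta). The sketch below applies that relation; the function name and the example values are hypothetical and are not taken from this study.

```python
def recombination_to_mutation(r_over_theta, inv_delta, nu):
    """Relative effect of recombination versus mutation (r/m).

    r_over_theta: relative rate of recombination to mutation (R/theta)
    inv_delta:    inverse of the mean recombination tract length (1/delta)
    nu:           mean divergence of imported DNA
    """
    if inv_delta <= 0:
        raise ValueError("1/delta must be positive")
    delta = 1.0 / inv_delta
    return r_over_theta * delta * nu


# Hypothetical example: R/theta = 0.5, mean tract length 100 bp, nu = 0.02
example = recombination_to_mutation(0.5, 0.01, 0.02)
```

A value above 1 would indicate that recombination introduced more substitutions than mutation for that gene or alignment.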

### Detecting Genes Driving Phylogenetic Relationships

Phylogenetic distances between the unrooted single gene trees and core genome tree, along with the sequence type matrix, were used to determine the genes influencing the core genome tree topology of the 58 strains of X. perforans. Congruency of single gene trees to the core genome tree was assessed using Robinson-Foulds (RF) symmetry. This index represents the distance between two phylogenetic trees by evaluating the number of nodes in a tree that are shared with a reference tree (Robinson and Foulds, 1981). We used the core genome tree as the reference tree. The RF symmetry values range between 0 and 1, such that 0 indicates identical tree topology and 1 indicates completely different tree topologies. For example, phylogenetic trees for genes that were identical in nucleotide sequence among all the 58 X. perforans strains did not share any nodes with the core genome tree and the RF value was 1. Alternatively, if any nodes in a single gene tree supported a node in the core genome tree, the resulting RF value was less than 1. RF symmetry was calculated using ETE3 Toolkit (Huerta-Cepas et al., 2016) for each single gene tree against the core genome tree to determine the genes that supported some part of the topology of the core genome tree. The sequence types of genes with RF symmetry < 1 were extracted. The variable genes that exhibited RF values < 1 and supported the phylogenetic grouping in the core genome topology are hereafter referred to as Phylogenetic group-Defining (PgD) genes. Maximum likelihood phylogenetic trees were constructed using concatenated sequences of 241 PgD and 1,115 non-PgD genes separately. The total tree length of the two phylogenetic trees were computed using Analysis of Phylogenetics and Evolution (ape) package in R (Paradis and Schliep, 2018) to confirm the role of PgD genes in phylogenetic grouping of X. perforans in the core genome tree.
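To make the RF convention above concrete, here is a small pure-Python sketch of a normalized Robinson-Foulds distance between two unrooted trees. It is not the ETE3 implementation: trees are assumed to be nested tuples of leaf names, only non-trivial bipartitions are compared, and the distance is the number of unshared bipartitions divided by the total number in both trees. All names are hypothetical.

```python
def leaf_set(tree):
    """Collect the leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, viewed as unrooted.

    Each bipartition is stored as the side that does NOT contain an
    arbitrary anchor leaf, so the same split always has the same key.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def normalized_rf(gene_tree, reference_tree):
    """0 = identical topology, 1 = no shared non-trivial bipartitions."""
    a, b = splits(gene_tree), splits(reference_tree)
    max_rf = len(a) + len(b)
    if max_rf == 0:
        return 0.0  # both trees unresolved; treated as identical (assumption)
    return len(a ^ b) / max_rf
```

With this definition a star-like tree from an invariant gene shares no bipartitions with a resolved reference tree and scores 1, while a gene tree that recovers at least one reference bipartition scores below 1, which is the property used to flag PgD genes.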

#### Identifying Recombination Sources and Recombination Mechanisms

We used two methods to determine if PgD genes may have been horizontally transferred. The sequences of the PgD genes were compared to the NCBI sequence database to determine if alleles were shared with other closely related Xanthomonas species. We also calculated the relative impact of recombination to mutation on nucleotide substitution using ClonalFrameML. Clusters of genes identified as variable or PgD, particularly those with high recombination values, were identified and their gene neighborhoods and flanking regions were examined. Gene neighborhood regions from representatives of each phylogenetic group were aligned and examined for the presence of genes suggestive of prior transfer events, including features of plasmids, phages, and transposable elements (Chiu and Thomas, 2004). In addition to clusters of core genes, we confirmed the presence of these recombination-associated signatures in neighborhood regions of effector genes that were previously suspected to be horizontally transferred (Timilsina et al., 2016).

### RESULTS

#### Xanthomonas perforans Pan-Genome

A total of 7,245 genes were identified in the pan-genome of 58 X. perforans strains and 2,866 genes were considered as core genes by roary (**Supplementary Figure 1A**). The Heaps law estimate, fitted to the rarefaction curve of the number of new genes identified after randomly adding a genome, suggested an open pan-genome (**Supplementary Figure 1B**). The estimate for alpha was 0.813, suggesting that additional genes will be found upon sampling more X. perforans genomes.

### Core Genome Phylogeny

The get\_homologues pipeline identified 2,031 genes as core genes present in at least 95% of the sampled genomes. The variation in the core genes identified from roary and get\_homologues is likely due to the two different annotation pipelines used to generate inputs for these programs. Roary used the annotation output from prokka whereas annotation based on Clusters of Orthologous Groups of proteins (COG) downloaded from IMG/JGI were used for the get\_homologues platform for core gene extraction. We manually curated these to 1,356 core genes that were present and intact in all 58 X. perforans genomes (**Supplementary Table 2**). Nucleotide sequence comparison revealed that 783 genes were identical among all 58 genomes. At least two allele types were found in the remaining 573 genes (**Supplementary Table 3**).

The core genome phylogenetic analysis identified a third phylogenetic group in addition to the two previously described groups of X. perforans in Florida (**Figure 1A**). PCA of sequence types of the 1,356 core genes confirmed the three phylogenetic groups (**Figure 2**). Strain Xp17-12, which was previously considered to be within group 1 (Schwartz et al., 2015), clustered with 10 strains isolated in 2015/16 to form a separate phylogenetic clade that we refer to as group 3. Among the 58 strains, 18 strains were designated group 1, 29 strains were designated group 2, and 11 strains were designated group 3. Group 1 is a heterogeneous group that includes 6 strains from 2006 and 11 from 2012 along with the reference tomato race 3 (T3) strain Xp91-118 from 1991. Strains from the 2015/16 season were in group 2 (15 strains) or group 3 (10 strains).

Within groups, the majority of strains shared more than 99.8% pairwise nucleotide sequence identity in the core genome (**Supplementary Table 4**). Between the groups, identity was reduced to 99.5%. The group 1 strains had relatively lower sequence identities of ∼99.5% in pairwise comparisons, and some strains had group 2 sequence types for several genes as observed in the heatmap and sequence type table (**Figure 1B** and **Supplementary Table 3**). Group 2 strains formed a monophyletic group with sequence identity above 99.7% among core genomes except for comparisons with Xp8-16 and Xp2010 (**Figure 1A** and **Supplementary Table 4**). Group 3 showed relatively low polymorphism with the majority of strains sharing core genome sequence identity above 99.9%. Average nucleotide identity based on BLAST using the whole genomes of these strains showed similar pairwise sequence identities to core genome comparisons (**Supplementary Table 5**).

#### Variable Core Genes by Phylogenetic Group

Core genes were distributed throughout the Xp91-118 genome (**Figure 3**). Among the 573 genes that had at least two allele types, referred to as variable genes, allelic variation was often between phylogenetic groups (**Figure 1B**). Sequences were generally monomorphic or had a single SNP at low frequency within phylogenetic groups. While only 783 genes were monomorphic across the 58 genomes, the numbers of genes with identical nucleotide sequences within groups were 1,124, 1,195, and 1,239 for groups 1, 2, and 3, respectively.

#### Phylogenetic Group Defining Genes

We identified 241 core genes that supported one or more branches of the core genome tree topology that defined the phylogenetic groups (RF < 1). These genes are collectively distinguished as the phylogenetic group defining (PgD) genes as they drive the observed phylogenetic grouping of the 58 strains (**Figure 1A** and **Supplementary Figure 2A**). In particular, we observed that these PgD genes carried allele types that were often specific to phylogenetic groups (**Figure 1B**, **Supplementary Table 3**, and **Supplementary Figure 2A**). The annotations of genes identified as PgD are listed in **Supplementary Table 2**, among which 59 were annotated as hypothetical proteins and 2 as genes with domains of unknown function. A total of 1,115 single gene trees did not support the core genome tree topology (RF = 1). These included 783 genes that were identical among all the X. perforans strains, plus 332 genes that had a variant allele type for at least one strain. The total length of the tree based on the 241 PgD genes was six times the length of the tree based on the remaining 1,115 core genes, signifying the larger contribution of PgD genes to strain variation (**Supplementary Figure 2B**). The non-PgD genes contribute to variation shared among a small number of strains.

Allele types distinguishing different groups were found in the PgD genes. Among the 241 PgD genes, 96 genes carried an allele specific to group 2 strains (different from the allele in group 1 and 3 strains), and 78 (81%) of those group 2 alleles were identical to X. euvesicatoria reference strain Xe85-10 (NCBI Accession no. GCA\_000009165.1). We found 142 genes with group 3-specific alleles, out of which 64 genes (45%) had alleles identical to X. euvesicatoria Xe85-10. An additional five PgD genes with group 3-specific alleles were identical to those of X. axonopodis pv. citrumelo strain F1 (NCBI Accession no. GCA\_000225915.1), which included two hypothetical proteins, a protease modulator HflC, an anti-anti-sigma factor, and a type VI secretion system associated gene. Finally, group 3 alleles of two PgD genes were identical to those of X. perforans strain LH3 (NCBI Accession no. GCA\_001908855.1), which was isolated from Mauritius in 2010 (Richard et al., 2017). These genes were N-acetyl-gamma-glutamyl-phosphate reductase (AQS75037.1) and aminoglycoside phosphotransferase (AQS78190.1). Therefore, LH3 is the only group 1 strain to contain these group 3 allele types. BLAST searches did not produce exact sequence matches to group 3 alleles for 71 genes. Unique allele types of PgD genes were distributed among group 1 strains. Group 1 strains isolated in 2006 carried specific allele types for 13 PgD genes that were identical to Xe85-10. Xp4-20 and Xp5-6 carried an additional 51 and 19 unique allele types, respectively. The remaining four group 1 strains isolated in 2006 (Xp4B, Xp15-11, Xp11-2, and Xp18-15) had specific allele types for 15 additional genes. Group 1 strains isolated in 2012 were homogeneous, with 15 genes among the PgD genes identical to Xe85-10. Some of the allele types carried by group 1 strains collected in 2006 were identical to group 2 but different from the reference strain Xp91-118. Among all 241 PgD genes, we found three genes that each had three alleles, one specific to each group: endopeptidase (AQS77891.1), TonB-dependent siderophore receptor (AQS78913.1), and septum formation protein Maf (AQS76051.1).

Mapping PgD genes to the complete genome of Xp91-118 identified the positions and proximity of these genes (**Figure 3**). For example, a ∼22 kb region between tryptophan-tRNA ligase (AQS77329.1) and catalase (AQS77307.1), encompassing 16 core genes (14 designated as PgD genes), exhibited diverged haplotypes specific to group 2 strains compared to group 1 and 3 strains. Similarly, an ∼8 kb region, between co-chaperone YbbN (AQS78328.1) and peptidyl-prolyl cis-trans isomerase (AQS78967.1) genes, exhibited a distinct haplotype in group 3 strains compared to the other two groups. The overall ratio of changes introduced by recombination relative to mutation in the concatenated core genome tree was estimated to be 16.75 by ClonalFrameML. These values ranged between 0.063 (AQS77927.1) and 184.159 (AQS77019.1) among the individual PgD gene trees (**Supplementary Table 2**).

### Recombining Genes and Mechanism of Horizontal Gene Transfer

Genomic regions acquired via horizontal gene transfer may have signatures of integration associated with different modes of horizontal gene transfer (Ochman et al., 2000). We examined genomes representing each phylogenetic group for signals of recombination flanking clusters of PgD genes. We used two criteria to select genomic regions. First, we were interested in alleles of closely clustered PgD genes that were identical to Xe85-10, indicating gene transfer from X. euvesicatoria. Second, we focused on gene trees that exhibited higher ratios of recombination to mutation than the concatenated gene tree. Gene neighborhood comparisons around PgD genes showed the presence of multiple tRNAs, phage-associated and plasmid mobilization genes, along with the site-specific two-component recombinases XerC and XerD, which were previously described to be associated with horizontal gene transfer (Aussel et al., 2002). For instance, group 1 strains isolated in 2012 have ∼9.6 kb of phage-associated genes between the tRNA-Leucine and tRNA-Glycine adjacent to the PgD genes AQS75151.1–AQS78555.1 (distinct alleles in group 2 and group 1 strains isolated in 2006). This unique ∼9.6 kb region, found only in group 1 strains isolated in 2012, includes an AAA-domain containing protein, phage major capsid protein, phage portal protein, phage terminase-like protein with HTH domain, and hypothetical proteins. A nucleotide BLAST search in NCBI revealed 98% sequence similarity only with strain LH3. The site-specific recombinase (XerD) gene was observed in all the group 1 strains between the tRNA-Leucine and phage associated genes, suggesting the integration of the unique genomic region was facilitated by bacteriophages (**Figure 4A**). In general, multiple copies of XerD, ranging from four to eight, were observed in all X. perforans genomes except for the 1991 reference strain, Xp91-118, which carries only one copy.

Group 3 strains carried 71 PgD genes with unique alleles specific to the group. For example, group 3 alleles of six PgD genes between locus tags AQS76085.1–AQS76101.1 were distinct from the other two groups and were unique among sequences available in NCBI. In groups 1 and 2, this genomic region is adjacent to XerD, mobile element protein (MobA), conjugal transfer protein (traD) and phage integrase family protein, suggesting integration via plasmid mobilization and site-specific recombination mechanisms (**Figure 4B**). The genomic region between the multiple tRNA sites and XerD present in groups 1 and 2 included several unique hypothetical genes and a DNA methyltransferase gene. Furthermore, the genetic variation in group 3 strains indicates the acquisition of novel genomic traits via recombination from multiple donors in addition to X. euvesicatoria.

### Recombination Affecting Type III Effectors

Following the observation of at least two different mechanisms of genetic exchange that were associated with phage transfer and plasmid mobilization in core genes, we examined these signals throughout representative genomes. Interestingly, signatures of horizontal gene transfer were found surrounding effectors that show presence/absence polymorphism among strains and were previously predicted to be acquired via recombination. Among the effectors found in bacterial spot-causing X. perforans, avrXv4 (xopJ4) is found in all tomato pathogenic strains but not in strain Xp2010, which was isolated from pepper. We found phage associated genes flanking xopJ4 (**Figure 5**). In Xp2010, both the phage associated genes and the xopJ4 effector are absent. Similarly, the gene coding another XopJ family effector, avrBsT (XopJ2), found in the majority of tomato race 4 X. perforans, is flanked by genes for type IV secretion system proteins, conjugative transfer, chromosome partitioning protein, and hypothetical proteins (**Supplementary Figure 3**).

## DISCUSSION

Using a cgMLSA approach, we determined that X. perforans strains isolated in Florida at different times form three groups that are differentiated by hundreds of variable genes, the majority of which appear to be recombinant. All of the strains we analyzed were tomato race 4 X. perforans that were collected in the past two decades, except for the 1991 T3 reference strain, Xp91-118, and a pepper strain, Xp2010 (Schwartz et al., 2015). A high-resolution phylogeny of X. perforans strains was constructed as well as a core genome fingerprint, which shows allelic variation affecting genes throughout the chromosome. Core genome multilocus sequence analysis allows a holistic comparison of phylogenetic groups and their evolution while minimizing the individual differences between the strains (Maiden et al., 2013; Maiden and Harrison, 2016). Recombination was inferred to be widespread in core genes showing allelic variation, with X. euvesicatoria strains as a major donor. Furthermore, we observed plasmid and phage associated site-specific recombination mechanisms surrounding clusters of putatively recombinant genes as well as genes that influence host-specificity and pathogen fitness. The open pangenome further suggests high genomic plasticity in X. perforans. We hypothesize that these recombinant strains are epidemic lineages that have emerged in Florida tomato production from a highly recombinogenic source population.

We previously showed two major phylogenetic groups of X. perforans in Florida (Timilsina et al., 2014, 2016; Schwartz et al., 2015), but strains collected in the 2015/16 growing season revealed a third monophyletic group of 11 strains. Consistent with recent emergence, it was the most homogenous of the three groups as shown by pairwise nucleotide identity and PCA of sequence types. One strain from this group, Xp17-12, was isolated in 2006 and was previously described as an outlier within group 1 (Schwartz et al., 2015). The relative homogeneity among group 3 strains over time suggests that the 2015/16 group 3 strains were from the same source population. Group 1, which includes strains isolated in 1991, 2006 and 2012, is more heterogeneous than other groups (**Figures 1**, **2**). Group 2 strains have been dominant in Florida since at least 2006 (Timilsina et al., 2016) but appear to be largely clonal (**Figure 2**). Three genes with alleles specific to each phylogenetic group were observed that could be used for assigning strains to groups to monitor their prevalence in Florida populations going forward. In general, we identified only a quarter of the total genes being shared among all X. perforans genomes, which suggests high genome plasticity in the species and is consistent with our finding of an open genome and is in agreement with another study of X. perforans (Jibrin et al., 2018).

We inferred recombination to have been the major source of genetic variation in the core genome. Horizontal inheritance of genes from multiple transfer events can obscure ancestral relationships among bacterial strains (Gogarten and Townsend, 2005). Consequently, recombinant loci are often removed from alignments prior to phylogenetic analysis. However, these loci also define the evolution of the recombinant strains (Didelot and Wilson, 2015). Phylogenetic reconstruction without these potentially recombined PgD genes significantly altered the observed phylogenetic relationships among the X. perforans groups (**Supplementary Figure 2A**). This variation, likely due to recombination from multiple donors, reinforces the necessity to include recombination in bacterial population studies. For group 2, most of the recombinant sequences appeared to have been acquired from X. euvesicatoria, which was displaced in Florida by X. perforans producing antagonistic bacteriocins against X. euvesicatoria (Hert et al., 2005; Timilsina et al., 2016). For group 3, X. euvesicatoria was one of multiple donors. The earliest strains of X. perforans isolated in Florida belonged to group 1, which had the fewest recombination signatures relative to the other two groups, as reflected by the copy numbers of site-specific recombinase XerCD genes. Correspondingly, the frequency of recombination observed with X. euvesicatoria and other closely related Xanthomonas was relatively low in these strains. Although horizontal gene transfer is largely associated with acquisition of new traits, the tracts of homologous genes potentially acquired from closely related species show the impact of recombination on the tempo and direction of X. perforans genome evolution. Recombination affecting the genetic background of a pathogen can affect niche adaptation, fitness, and microbial competition, and thus the establishment of recombinant strains (Alizon et al., 2009).

Our observations suggest horizontal gene transfer of long fragments of shared recombinant alleles through different genomic vectors. The horizontally introduced genomic fragments can be regulated by the carrying capacity of plasmid or phage vectors (Ochman et al., 2000; Bobay and Ochman, 2017). These two mechanisms of vector associated horizontal gene transfer were clearly visible in our X. perforans genomes. Phage and plasmid associated genes were present in the flanking regions of clusters of core genes showing evidence of recombination. Among the three modes of horizontal gene transfer, we were not able to directly identify genomic fragment acquisition via transformation. Several variable genes had variation attributed to recombination, but without any evidence of phage or plasmid associated genes in the flanking regions. These genes may have been transferred via transformation.

Phage mediated gene transfer appears to play a major role in influencing genomic diversity and evolution of X. perforans. We found phage genes throughout the genome in regions with potentially recombined core and effector genes. One such example is the XopJ family effector, avrXv4 (xopJ4). XopJ4 is a member of the XopJ effector family, which is similar to the YopJ, serine/threonine acetyltransferase superfamily (Szczesny et al., 2010). A previous study reported that xopJ4 was conserved in all X. perforans strains except for the Xp2010 strain that was isolated from pepper (Schwartz et al., 2015). The xopJ4 gene was located between phage associated genes. The whole genomic region including phage associated and XopJ4 genes is missing in the Xp2010 pepper strain. A similar XopJ family effector (98% amino acid identity) in X. citri pv. vignicola strain CFBP 7112 (Ruh et al., 2017) is also located between phage associated genes. Effector avrBsT is another XopJ family effector found in X. perforans that is generally associated with plasmids (Ciesiolka et al., 1999; Kay and Bonas, 2009; White et al., 2009). The avrBsT effector was not found in X. perforans until 1998 (Timilsina et al., 2016). The nucleotide sequence of avrBsT in X. perforans is identical to that in the more distantly related bacterial spot species, X. vesicatoria (Timilsina et al., 2016). An identical allele type of the plasmid-borne avrBsT gene in different Xanthomonas species, including X. perforans, suggests the gene is horizontally transferred across the genus (Timilsina et al., 2014, 2016; Jibrin et al., 2018). The majority of T4 X. perforans strains carry avrBsT and the gene has been found to provide a competitive advantage to bacterial strains in field conditions (Abrahamian et al., 2018). The variation created by mobile genetic elements in the core and accessory genes could influence host preference and pathogenicity of X. perforans strains.

Several PgD genes also showed evidence of plasmid associated horizontal gene transfer. An intriguing pattern that was evident in the genomes was the density of tRNAs and site-specific tyrosine recombinase genes flanking these recombined regions. The two-component site-specific tyrosine recombinase, XerC and XerD, catalyzes crossover and recombination at specific sites (Midonet and Barre, 2015). Site-specific recombination is characterized by cleavage of both DNA strands at two recombination sites that are later joined to new DNA partners without DNA degradation and phosphodiester hydrolysis (Subramanya et al., 1997). The XerD recombinase works together with XerC, both of which belong to the λ-integrase family. XerD is reported to initiate recombination by strand exchange to form Holliday junctions, which are then reconstructed by XerC (Aussel et al., 2002; Crozat et al., 2015). In X. perforans genomes, the site-specific XerD gene was co-located with a plasmid mobilization and transfer (tra) gene where a ∼9.6-kb genomic region was present in group 1 genomes but absent in groups 2 and 3. Sequence comparison showed this region is specific to group 1 X. perforans strains. Along with XerCD genes, this genomic island was flanked by tRNAs (**Figure 4A**). The tRNAs serve as a gateway for integration of foreign DNA (Ochman et al., 2000; Williams, 2002). Boyd et al. (2009) reported that tRNA-Arg, -Leu, -Thr, and -Ser were commonly observed insertion sites. The genomic islands introduced by phage or plasmid in X. perforans seem to have specific attachment sites, facilitated by tRNAs and site-specific recombinase genes, that altered the core genomes and ultimately shaped the population of X. perforans in Florida. These specific sites serve as recombination hotspots in the bacterial genome.

Bacterial spot disease of tomato has posed a series of management challenges in Florida, including the introduction of X. perforans. In this study, we have begun to tease apart the genetic mechanisms driving population changes in X. perforans since its emergence in 1991, which appear primarily due to phage and plasmid-mediated horizontal gene transfer followed by integration into the chromosome. Our findings indicate rapid genomic evolution in the X. perforans population in Florida, which together with our previous findings of extensive recombination in strains from Nigeria and Italy (Jibrin et al., 2018), suggests a pathogen with a high probability of overcoming management practices, i.e., X. perforans poses a high "evolutionary risk" (McDonald and Linde, 2002). However, the clonal structure of the Florida population also indicates that a limited number of recombinant genotypes have been introduced to or have successfully established in Florida tomato production. If we could determine the population or populations that are highly recombinogenic, these populations could be specifically managed or movement out of these populations curtailed. For example, recombination events could be occurring in seed sources or other production regions that are not closely connected to Florida, thus exchanging few migrants. Efforts are also needed to understand why X. perforans readily recombines when other bacterial phytopathogens, including the bacterial spot pathogen X. gardneri, appear highly clonal (Schwartz et al., 2015; Timilsina et al., 2014).

#### DATA AVAILABILITY

The datasets generated for this study can be found in NCBI, PRJNA436012.

### AUTHOR CONTRIBUTIONS

ST, NP, GV, JJ, and EG conceived the project. PA collected additional bacterial strains and provided their genomes. JP-M and FI-B oversaw the genome assembly. GM and ST conducted the sequencing experiments. ST conducted all computational analyses and interpreted with the help of BK, EG, GV, and JJ. ST, EG, JJ, and GV wrote the final manuscript. All authors approved the final manuscript.

### FUNDING

This research was supported in part by the United States Department of Agriculture, National Institute of Food and Agriculture under award number 2015-51181-24312. Publication of this article was funded in part by the University of Florida Open Access Publishing Fund.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2019.00448/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Timilsina, Pereira-Martin, Minsavage, Iruegas-Bocardo, Abrahamian, Potnis, Kolaczkowski, Vallad, Goss and Jones. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Molecular Evolution of Pseudomonas syringae Type III Secreted Effector Proteins

Marcus M. Dillon<sup>1</sup>, Renan N.D. Almeida<sup>1</sup>, Bradley Laflamme<sup>1</sup>, Alexandre Martel<sup>1</sup>, Bevan S. Weir<sup>2</sup>, Darrell Desveaux<sup>1,3</sup> and David S. Guttman<sup>1,3</sup>\*

<sup>1</sup> Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada, <sup>2</sup> Landcare Research, Auckland, New Zealand, <sup>3</sup> Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, ON, Canada

Diverse Gram-negative pathogens like Pseudomonas syringae employ type III secreted effector (T3SE) proteins as primary virulence factors that combat host immunity and promote disease. T3SEs can also be recognized by plant hosts and activate an effector triggered immune (ETI) response that shifts the interaction back toward plant immunity. Consequently, T3SEs are pivotal in determining the virulence potential of individual P. syringae strains, and ultimately help to restrict P. syringae pathogens to a subset of potential hosts that are unable to recognize their repertoires of T3SEs. While a number of effector families are known to be present in the P. syringae species complex, one of the most persistent challenges has been documenting the complex variation in T3SE contents across a diverse collection of strains. Using the entire pan-genome of 494 P. syringae strains isolated from more than 100 hosts, we conducted a global analysis of all known and putative T3SEs. We identified a total of 14,613 putative T3SEs, 4,636 of which were unique at the amino acid level, and show that T3SE repertoires of different P. syringae strains vary dramatically, even among strains isolated from the same hosts. We also find substantial diversification within many T3SE families, and in many cases find strong signatures of positive selection. Furthermore, we identify multiple gene gain and loss events for several families, demonstrating an important role of horizontal gene transfer (HGT) in the evolution of P. syringae T3SEs. These analyses provide insight into the evolutionary history of P. syringae T3SEs as they co-evolve with the host immune system, and dramatically expand the database of P. syringae T3SE alleles.

Keywords: Pseudomonas syringae, type III secreted effectors, type III secretion system, plant–pathogen, host–microbe interactions, virulence, immunity

#### Edited by:

Dawn Arnold, University of the West of England, United Kingdom

#### Reviewed by:

Brian H. Kvitko, University of Georgia, United States David Baltrus, The University of Arizona, United States

> \*Correspondence: David S. Guttman david.guttman@utoronto.ca

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Plant Science

Received: 21 December 2018 Accepted: 19 March 2019 Published: 05 April 2019

#### Citation:

Dillon MM, Almeida RND, Laflamme B, Martel A, Weir BS, Desveaux D and Guttman DS (2019) Molecular Evolution of Pseudomonas syringae Type III Secreted Effector Proteins. Front. Plant Sci. 10:418. doi: 10.3389/fpls.2019.00418

#### INTRODUCTION

Over the past three decades, type III secreted effectors (T3SEs) have been recognized as primary mediators of many host–microbe interactions (Michiels and Cornelis, 1991; Salmond and Reeves, 1993; Hueck, 1998; Coburn et al., 2007; Deng et al., 2017; Hu et al., 2017; Rapisarda and Fronzes, 2018). These proteins are translocated directly from the pathogen cell into the host cytoplasm by the type III secretion system (T3SS), where they perform a variety of functions that generally promote virulence and suppress host immunity (Coburn et al., 2007; Zhou and Chai, 2008; Cunnac et al., 2009; Oh et al., 2010; Buttner, 2016; Khan et al., 2018). However, T3SEs can also be recognized by the host immune system, which allows the host to challenge the invading microbe. In plants, this immune response is called effector triggered immunity (ETI) (Jones and Dangl, 2006; Dodds and Rathjen, 2010; Khan et al., 2016). The interaction between pathogen T3SEs and the host immune system results in an evolutionary arms race, where pathogen T3SEs evolve to avoid detection while still maintaining their role in the virulence process, and the host immune system evolves to recognize the diversity of T3SEs and their actions, while maintaining a clear distinction between self and non-self to avoid autoimmune activation.

One of the best studied arsenals of T3SEs is carried by the plant pathogenic bacterium Pseudomonas syringae (Lindeberg et al., 2009, 2012; Mansfield et al., 2012). P. syringae is a highly diverse plant pathogenic species complex responsible for a wide range of diseases on many agronomically important crop species (Mansfield et al., 2012). While the species as a whole has a very broad host range, individual strains can only cause disease on a small range of plant hosts (Sarkar et al., 2006; Lindeberg et al., 2009; Baltrus et al., 2017; Xin et al., 2018). A growing number of P. syringae strains have also recently been recovered from non-agricultural habitats, including wild plants, soil, lakes, rainwater, snow, and clouds (Morris et al., 2007, 2008, 2013; Clarke et al., 2010). This expanding collection of strains and the increased availability of comparative genomics data present unique opportunities for obtaining insight into the determinants of host specificity in P. syringae (Baltrus et al., 2011, 2012; O'Brien et al., 2011, 2012; Dillon et al., 2019).

Pseudomonas syringae T3SEs have been the focus of both fundamental and applied plant pathology research for decades, going back to some of the early work on gene-for-gene resistance and avirulence proteins (Mukherjee et al., 1966; Staskawicz et al., 1984, 1987; Keen and Staskawicz, 1988; Kobayashi et al., 1989; Keen, 1990; Jenner et al., 1991; Fillingham et al., 1992). Since then, over 1000 publications have focused on P. syringae T3SEs (Web of Science {"P. syringae" AND [avirulence OR ("type III" AND effector)]}, October 2018), making it one of the most comprehensively studied T3SE systems. To date, a total of 66 T3SE families and 764 T3SE alleles have been cataloged in the P. syringae Genome Resources Homepage<sup>1</sup>. Many of these T3SE families are small, relatively conserved, and only distributed in a subset of P. syringae strains, while others are more diverse and distributed across the majority of sequenced P. syringae strains (Baltrus et al., 2011; O'Brien et al., 2011; Dillon et al., 2019). Given the irregular distribution of T3SEs among strains and their frequent association with mobile genetic elements, it has long been recognized that horizontal transfer plays an important role in the dissemination of T3SEs among strains (Kim and Alfano, 2002; Rohmer et al., 2004; Stavrinides and Guttman, 2004; Lovell et al., 2009, 2011; Godfrey et al., 2011; Neale et al., 2016). Nucleotide composition and phylogenetic analyses of a subset of T3SEs identified eleven P. syringae T3SE families that were acquired by recent horizontal transfer events. However, the remaining thirteen families appeared to be ancestral and vertically inherited, suggesting that pathoadaptation may also play a major role in T3SE evolution through mutations that modify the function of T3SEs (Rohmer et al., 2004; Stavrinides et al., 2006; O'Brien et al., 2011). While T3SE repertoires are thought to be key determinants of host specificity, strains with divergent repertoires are at times capable of causing disease on the same host (Almeida et al., 2009; O'Brien et al., 2011, 2012; Lindeberg et al., 2012), signifying that we have much to learn about the ways in which T3SEs contribute to P. syringae virulence.

Two major issues impact our understanding of T3SE diversity in P. syringae: sampling and nomenclature. From a sampling perspective, the current catalog of T3SEs listed on the P. syringae Genome Resources Homepage was identified from approximately 120 strains that represent only a subset of the phylogroups and overall diversity in the P. syringae species complex. Expanding this strain collection to include more diversity, including less biased agricultural collections and more natural isolates, will undoubtedly expand our understanding of diversity within T3SE families and reveal as-yet unidentified families.

Nomenclature problems are certainly less interesting from a biological perspective, but are very important operationally since poor classification and naming practices can lead to substantial confusion and even spurious conclusions. Efforts to address this problem have been made in the past. Most notably, a standardized set of criteria for the identification and naming of P. syringae T3SEs was agreed upon by the majority of labs heavily invested in T3SE research in 2005 (Lindeberg et al., 2005). Specifically, we proposed that new T3SEs be classified into existing families using a BLASTP analysis against previously characterized T3SEs assigned to designated families (E < 10<sup>−5</sup>, alignment length > 60%), at which point each family would be subject to rigorous phylogenetic analyses to assign subgroups and truncations. T3SEs that did not fit into any existing families were assigned a new family designation. While the guidelines proposed by this work were widely accepted and implemented, they were not universally applied to all new candidate T3SEs for a number of reasons. Some of the problems stemmed from the inherent biological complexity found in many T3SEs, which are often multidomain proteins that share similarity with multiple divergent T3SE families (Stavrinides et al., 2006; McCann and Guttman, 2008). Further, at the time of their discovery, many families also had fewer than three T3SE alleles, making robust phylogenetic analyses impossible. Regardless of the root cause, we are currently in a situation where many T3SEs are annotated without family assignment, some very similar T3SEs have been assigned to different T3SE families, and some highly divergent T3SEs are assigned to the same family based on short tracts of local similarity. This situation should be rectified in order to facilitate more comprehensive analyses of the role of T3SEs in the outcomes of host–pathogen interactions, particularly in light of the growing database of P. syringae genomics resources.

Here, we present an expanded catalog of T3SEs in P. syringae and an updated phylogenetic analysis of the diversity within each T3SE family. We identified a total of 14,613 T3SEs from 494 P. syringae whole genomes that include strains from 11 of the 13 P. syringae species complex phylogroups. These strains allowed us to redefine evolutionarily distinct family barriers for T3SEs, examine the distribution of each family across the P. syringae species complex, quantify the diversity within each T3SE family, and explore how T3SEs are inherited. By expanding and diversifying the database of confirmed and predicted P. syringae T3SEs and placing all alleles in an appropriate phylogenetic context, these analyses will ultimately enable more comprehensive studies of the roles of individual T3SEs in pathogenicity and allow us to more effectively explore the contribution of T3SEs to host specificity.

<sup>1</sup>www.pseudomonas-syringae.org/

#### MATERIALS AND METHODS

#### Genome Sequencing, Assembly, and Gene Identification

Four hundred and ninety-four P. syringae species complex strains were analyzed (**Supplementary Dataset S1**), of which 102 assemblies were obtained from public sequence databases, including NCBI/GenBank, JGI/IMG-ER, and PATRIC (Markowitz et al., 2012; Wattam et al., 2014; NCBI Resource Coordinators, 2018), and 392 strains were sequenced in house by the University of Toronto Centre for the Analysis of Genome Evolution and Function (CAGEF). Two hundred and sixty-eight of these sequenced strains were provided by the International Collection of Microorganisms from Plants (ICMP). For the strains sequenced by CAGEF, DNA was isolated using the Gentra Puregene Yeast and Bacteria Kit (Qiagen, MD, United States), and purified DNA was then suspended in TE buffer and quantified with the Qubit dsDNA BR Assay Kit (ThermoFisher Scientific, NY, United States). Paired-end libraries were generated using the Illumina Nextera XT DNA Library Prep Kit following the manufacturer's instructions (Illumina, CA, United States), with 96-way multiplexed indices and an average insert size of ∼400 bp. All sequencing was performed on either the Illumina MiSeq or GAIIx platform using V2 chemistry (300 cycles). Following sequencing, read quality was assessed with FastQC v.0.11.5 (Andrews, 2010) and low-quality bases and adapters were trimmed using Trimmomatic v0.36 (Bolger et al., 2014) (ILLUMINACLIP: NexteraPE-PE.fa, Maximum Mismatch = 2, PE Palindrome Match = 30, Adapter Read Match = 10, Maximum Adapter Length = 8; SLIDINGWINDOW: Window Size = 4, Average Quality = 5; MINLEN = 20). All genomes were then de novo assembled into contigs with CLC v4.2 (Mode = fb, Distance mode = ss, Minimum Read Distance = 180, Maximum Read Distance = 250, Minimum Contig Length = 1000). Raw reads were then re-mapped to the remaining contigs using samtools v1.5 with default settings to calculate the read coverage for each contig (Li and Durbin, 2009). Any contigs with a coverage depth more than two standard deviations below the average contig coverage were filtered out of the assembly. Finally, gene prediction was performed on each genome using Prodigal v2.6.3 with default settings (Hyatt et al., 2010).
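The coverage filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `filter_low_coverage`, the dictionary input, and the use of the sample standard deviation are ours and are not taken from the original pipeline, which computed per-contig coverage from re-mapped reads.

```python
from statistics import mean, stdev


def filter_low_coverage(contig_coverage, n_sd=2.0):
    """Drop contigs whose depth is more than n_sd SDs below the mean depth.

    contig_coverage maps a contig name to its mean read depth.
    Assumption: the sample standard deviation is used; the source text
    does not say which estimator was applied.
    """
    depths = list(contig_coverage.values())
    if len(depths) < 2:
        # A standard deviation is undefined for fewer than two contigs.
        return dict(contig_coverage)
    threshold = mean(depths) - n_sd * stdev(depths)
    return {name: d for name, d in contig_coverage.items() if d >= threshold}
```

With a threshold defined this way, only contigs with unusually low depth are removed; contigs with unusually high depth (for example, multi-copy plasmids) are retained.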

#### Identifying, Annotating, and Delimitation of Type III Secreted Effector Families

To characterize the effector repertoire of each of the 494 P. syringae strains used in this study, we first downloaded all available P. syringae effector, helper, and chaperone sequences from three public databases: NCBI<sup>2</sup> (18,120), Bean 2.0 (225) (Dong et al., 2015), and the P. syringae Genome Resources Homepage<sup>3</sup> (843). Additional type III associated proteins were recovered from NCBI by querying the NCBI protein database with "type III effector," "type III helper," and "type III chaperone" along with "P. Syringae." These were combined with all available type III secretion associated genes from the Bean 2.0 database and the P. syringae Genome Resources database, which were downloaded directly. Using this database of 19,188 T3SE associated sequences in P. syringae, we then performed an all-vs-all BLASTP analysis to ensure that all sequences that we downloaded were assigned to appropriate families, which was essential given that many of the sequences downloaded from NCBI are ambiguously labeled. Any unassigned T3SE associated gene that had significant reciprocal BLAST hits (E < 1e−24) with an assigned T3SE associated gene was assigned to the corresponding family. This strict E-value cutoff was chosen to avoid incorrectly assigning families to sequences based on short tracts of similarity that are common in the N-terminal region of T3SEs from different families (Stavrinides et al., 2006). Sequences that had reciprocal significant hits from multiple families were assigned to the family where they had more significant hits, which means that smaller families could be dissolved into a larger family if all sequences from the two families were sufficiently similar. However, this only occurred in one case, which resulted in all HopF and HopBB sequences being assigned to the HopF/HopBB family. In sum, our final seed database of P. syringae T3SEs contained a total of 7,974 effector alleles from 66 independent families, 1,585 discontinued effector alleles from 6 independent families, 2,230 helper alleles from 23 independent families, and 1,569 chaperone alleles from 10 independent families. Any sequences that could not be assigned to an appropriate T3SE family were discarded because of the possibility that these are not true T3SE associated genes.
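The seed-database assignment rule (reciprocal hits at E < 1e−24, with the family contributing the most hits winning) could be sketched as below. The data structures and function name are hypothetical, and the tie-breaking behavior is an assumption, since the text does not describe how ties were resolved.

```python
from collections import Counter

E_CUTOFF = 1e-24  # threshold stated in the text


def assign_families(hits, known_family):
    """Assign unlabeled sequences to the family with the most reciprocal hits.

    hits: dict mapping (query, subject) -> E-value (hypothetical parsed BLASTP output).
    known_family: dict mapping already-assigned sequence ID -> family name.
    Assumption: ties are resolved by Counter.most_common order (first family seen).
    """
    votes = {}
    for (query, subject), evalue in hits.items():
        # Only unassigned queries hitting assigned subjects are of interest.
        if query in known_family or subject not in known_family:
            continue
        reverse = hits.get((subject, query))
        # Require significance in both directions (reciprocal hit).
        if evalue < E_CUTOFF and reverse is not None and reverse < E_CUTOFF:
            votes.setdefault(query, Counter())[known_family[subject]] += 1
    return {seq: counts.most_common(1)[0][0] for seq, counts in votes.items()}


# Illustrative values only.
hits = {
    ("u1", "a1"): 1e-30, ("a1", "u1"): 1e-40,
    ("u1", "a2"): 1e-26, ("a2", "u1"): 1e-26,
    ("u1", "b1"): 1e-30, ("b1", "u1"): 1e-5,   # not reciprocal at the cutoff
}
known = {"a1": "HopF", "a2": "HopF", "b1": "HopX"}
assigned = assign_families(hits, known)
```

Sequences with no qualifying reciprocal hit are simply absent from the result, mirroring the decision to discard sequences that could not be assigned.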

#### Identifying and Classifying Individual Type III Secreted Effectors

We identified all putative T3SE protein sequences from the 494 P. syringae genomes by querying each predicted protein against the T3SE seed database using BLASTP and retaining those with a hit that passed a significance threshold of E < 1e−24. This resulted in the annotation of 14,613 T3SEs across the 494 P. syringae strains. Family names were initially assigned to these T3SEs based on the name that had been assigned to the hit in the seed database, and then later refined based on more rigorous criteria. First, we blasted each T3SE amino acid sequence against a database of all 14,613 T3SEs and retained only hits with an E-value of less than 1e−24 and a length that covered at least 60% of the shorter sequence. Sequences that had multiple non-contiguous hits (i.e., high-scoring segment pairs) with an E-value less than 1e−24 whose cumulative lengths covered at least 60% of the shorter sequence were also retained. As was the case above, the strict E-value cutoff prevents us from assigning significant hits between T3SE sequences that only share strong local identity, which is most commonly seen in the N-terminal secretion signal. The 60% length cutoff prevents chimeric T3SEs from linking the two unrelated T3SE families that combined to form the chimera.

<sup>2</sup>https://www.ncbi.nlm.nih.gov

<sup>3</sup>https://pseudomonas-syringae.org
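The retention rule for pairwise hits (E < 1e−24 and at least 60% coverage of the shorter sequence, with multiple high-scoring segment pairs counted cumulatively) might be expressed as in the sketch below. The input layout and function names are hypothetical, and counting overlapping segments only once is an assumption not stated in the text.

```python
E_CUTOFF = 1e-24      # threshold stated in the text
MIN_COVERAGE = 0.60   # fraction of the shorter sequence, as stated in the text


def covered_length(intervals):
    """Residues covered by 1-based inclusive intervals, counting overlaps once."""
    total = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end + 1)  # skip residues already counted
        if end >= start:
            total += end - start + 1
            current_end = end
    return total


def pair_passes(hsps, len_a, len_b):
    """Decide whether two sequences count as significantly similar.

    hsps: list of (evalue, start, end), coordinates on the shorter sequence
          (hypothetical layout; real BLAST output needs parsing first).
    len_a, len_b: positive lengths of the two sequences.
    """
    significant = [(start, end) for evalue, start, end in hsps if evalue < E_CUTOFF]
    shorter = min(len_a, len_b)
    return bool(significant) and covered_length(significant) / shorter >= MIN_COVERAGE
```

For example, a single segment covering residues 1 to 70 of a 100-residue protein passes, whereas a strong but short N-terminal match covering residues 1 to 40 does not.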

Second, a final list of all T3SE pairs that shared significant hits was gathered and T3SE sequences were collectively binned based on their similarity relationships. With this method, T3SE families were built based on all-by-all pairwise similarity between T3SEs rather than the similarity between individual T3SEs and an arbitrary seed T3SE or collection of centroid T3SEs, as is the case with some clustering methods. Importantly, our approach binned all significantly similar T3SEs regardless of whether any two T3SEs were connected through direct or transitive similarity. For example, if T3SE sequence A was significantly similar to T3SE sequence B, and sequence B was significantly similar to sequence C, all three sequences would be binned together, regardless of whether there was significant similarity between sequence A and sequence C. This is important for appropriately clustering particularly diverse T3SE families, which may contain highly divergent alleles that have intermediate variants.
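Binning by direct or transitive similarity is equivalent to finding connected components in a graph whose edges are the significant pairs. The following is a minimal union-find sketch of that idea; the function name and inputs are hypothetical and this is not the authors' script.

```python
def bin_by_transitive_similarity(sequence_ids, significant_pairs):
    """Group sequences linked by direct or transitive significant similarity.

    sequence_ids: iterable of unique sequence identifiers.
    significant_pairs: iterable of (id_a, id_b) pairs that passed the thresholds;
                       every ID must also appear in sequence_ids.
    Returns a sorted list of bins, each a sorted list of IDs.
    """
    parent = {seq: seq for seq in sequence_ids}

    def find(x):
        # Follow parent links to the bin representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in significant_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two bins

    bins = {}
    for seq in parent:
        bins.setdefault(find(seq), set()).add(seq)
    return sorted(sorted(members) for members in bins.values())


# A~B and B~C place A, B, and C in one bin even though A and C were never linked directly.
bins = bin_by_transitive_similarity(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```

Here `bins` is `[["A", "B", "C"], ["D"]]`, the A/B/C case described in the text plus one unlinked singleton.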

Finally, we assigned the same T3SE family designation to all T3SEs within each cluster based on the most commonly assigned T3SE family name that had initially been assigned to sequences within that cluster. In the majority of cases, all sequences in a single cluster had the same initially assigned T3SE family. However, for cases where there were multiple family names assigned to sequences within a single cluster, both Hop designations (e.g., HopD/HopAO) were assigned to all sequences in the cluster, with the lower designation being considered the short form for the family. Conversely, for cases where T3SEs that had initially been assigned the same family designation formed two separate clusters, T3SEs from the larger cluster were assigned the initial family name, and T3SEs from the smaller cluster(s) were assigned a novel family name, starting with HopBO, which is the first available Hop designation. In these cases, the initial family designation of the T3SEs in the family was also kept following a forward slash ("/") so that the source of the family was known (e.g., HopBO/HopX). Ultimately, this method allowed us to effectively delimit all T3SEs in this dataset into separate families with consistent definitions and performed considerably better at partitioning established T3SE families than standard orthology delimitation software like PorthoMCL (Tabari and Su, 2017) (**Supplementary Dataset S2**), likely because of the widespread presence of chimeric T3SEs in the P. syringae species complex.
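The naming rule for a merged cluster can be illustrated with a short sketch that puts the most common initial name first and appends the others after a "/". The function name is hypothetical, and ordering ties and secondary names alphabetically is an assumption; the renaming of split families (e.g., HopBO/HopX) is not covered here.

```python
from collections import Counter


def name_cluster(initial_names):
    """Build a cluster designation from the initial family names of its members.

    The most frequent name comes first; remaining names follow after '/'.
    Assumption: equal counts are ordered alphabetically.
    """
    counts = Counter(initial_names)
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return "/".join(ordered)


# Illustrative member labels only.
merged_name = name_cluster(["HopF", "HopF", "HopBB"])
single_name = name_cluster(["HopM", "HopM", "HopM"])
```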

We then validated our approach and assessed whether the reliance on protein queries (e.g., BLASTP) substantially increased the likelihood of missing T3SEs without predicted protein sequences due to the lack of a properly called coding sequence. To do this we examined whether we recovered all T3SEs from the well characterized repertoires of P. syringae strains PtoDC3000, PsyB728a, Pph1448a, and PtoT1 (Cunnac et al., 2009; Xin et al., 2018). Of the 123 non-pseudogene T3SEs carried by these four strains, we only failed to identify a single T3SE; namely, AvrPto1 from PsyB728a. This T3SE was not identified because in this region of the chromosome an alternative coding sequence was identified on the opposite strand and the AvrPto1 region was not called as a coding region. While we can identify these additional T3SEs by directly querying the genome assemblies with TBLASTN, this approach increases our false positive rate by pulling non-effectors via their chimeric relationships with known T3SEs and often results in the inaccurate identification of start and stop codons, which muddles downstream evolutionary analyses. Therefore, only T3SEs that were identified as coding sequences in this study were analyzed.

In order to classify short chimeric relationships between families, as illustrated in **Figure 2**, we used a similar approach to the one outlined above. Specifically, we parsed our reciprocal BLASTP results to capture hits that occurred between alleles that had been assigned to different families. Here, we used a significance cutoff of E-value < 1e−10, with no length limitation. All chimeric relationships are shown in **Figure 2**, with the length of each allele and their overlapping regions shown proportionally. These local relationships between alleles in distinct families were not considered in the following evolutionary analyses.

#### Phylogenetic Analyses

We generated three separate phylogenetic trees in this study to ask whether core-genome diversity, pan-genome content, or effector content could effectively sort P. syringae strains based on their host of isolation. For the core genome tree, we clustered all protein sequences from the 494 P. syringae genomes used in this study into ortholog families using PorthoMCL v3 with default settings (Tabari and Su, 2017). All ortholog families that were identified in at least 95% of the P. syringae strains in our dataset were considered part of the soft-core genome and each of these families was independently aligned using MUSCLE v3.8.31 with default settings (Edgar, 2004). These alignments were then concatenated end-to-end using a custom python script and a maximum likelihood phylogenetic tree was constructed based on the concatenated alignment using FastTree v2.1.10 with default parameters (Price et al., 2010). For the pan-genome tree, we generated a binary presence-absence matrix for all ortholog families that were identified in more than one P. syringae strain. This presence-absence matrix was used to compute a distance matrix in R v3.3.1 using the "dist" function with the Euclidean distance method. The phylogenetic tree was then constructed using the "hclust" function with the complete linkage hierarchical clustering method. We used the same approach to generate the effector content tree, except the input binary presence-absence matrix contained information on the 70 effector families rather than all ortholog families that made up the P. syringae pan-genome.
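The gene-content trees rest on two steps: a Euclidean distance between binary presence-absence profiles and complete-linkage agglomeration. The sketch below illustrates both in plain Python. It is not a reimplementation of R's `dist` and `hclust`; in particular, how ties are broken is an assumption, and the strain names and profiles are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length presence-absence vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping strain name -> tuple of 0/1 values (one per family).
    Returns a list of merges as (cluster_a, cluster_b, height) tuples.
    """
    clusters = [(name,) for name in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: distance between clusters is the largest
            # distance between any member of one and any member of the other.
            height = max(euclidean(profiles[x], profiles[y]) for x in a for y in b)
            if best is None or height < best[0]:
                best = (height, a, b)
        height, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height))
    return merges


# Hypothetical presence-absence profiles across four families.
profiles = {"s1": (1, 1, 0, 0), "s2": (1, 1, 0, 1), "s3": (0, 0, 1, 1), "s4": (0, 0, 1, 0)}
merges = complete_linkage(profiles)
```

For binary data the squared Euclidean distance equals the number of families in which two strains differ, so strains with similar repertoires merge at low heights.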

#### Estimating Pairwise Ka, Ks, and Ka/Ks

Evolutionary rate parameters were calculated independently for each T3SE family. First, amino acid sequences were aligned with MUSCLE v3.8.31 using default settings (Edgar, 2004). Each multiple alignment was then reverse translated based on the corresponding nucleotide sequences using RevTrans v1.4 (Wernersson and Pedersen, 2003) and all pairwise Ka and Ks values were calculated for each family using the Nei-Gojobori Method, implemented by MEGA7-CC (Kumar et al., 2016). Output files were parsed using custom python scripts to convert the Ka and Ks matrices to stacked data frames with four columns: Sequence 1 Header, Sequence 2 Header, Ka, and Ks. The alignment-wide ratio of non-synonymous to synonymous substitutions (Ka/Ks) was then calculated for all T3SE pairs that had both a Ka and a Ks value greater than 0 in each family. For codon-level analysis of positive selection in each family, we used Fast Unconstrained Bayesian Approximation (FUBAR) with default settings to detect signatures of positive selection in all families that were present in at least five strains (Murrell et al., 2013).
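The conversion from pairwise matrices to a stacked, four-column table and the Ka/Ks filter can be sketched as follows. The nested-list matrix layout and the function names are assumptions made for illustration; the actual MEGA output format and the custom scripts are not reproduced here.

```python
def stack_ka_ks(headers, ka, ks):
    """Convert square Ka and Ks matrices into one row per unordered sequence pair.

    headers: list of sequence headers, in matrix order.
    ka, ks: nested lists where matrix[i][j] (j < i) holds the pairwise value,
            or None if it could not be estimated (hypothetical layout).
    """
    rows = []
    for i in range(len(headers)):
        for j in range(i):  # lower triangle only, so each pair appears once
            rows.append({"seq1": headers[i], "seq2": headers[j],
                         "ka": ka[i][j], "ks": ks[i][j]})
    return rows


def add_ka_ks_ratio(rows):
    """Keep pairs with Ka > 0 and Ks > 0 and attach their Ka/Ks ratio."""
    kept = []
    for row in rows:
        ka, ks = row["ka"], row["ks"]
        if ka is not None and ks is not None and ka > 0 and ks > 0:
            kept.append({**row, "ka_ks": ka / ks})
    return kept


# Illustrative values only.
headers = ["a", "b", "c"]
ka = [[0, 0, 0], [0.1, 0, 0], [0.0, 0.3, 0]]
ks = [[0, 0, 0], [0.2, 0, 0], [0.5, 0.6, 0]]
stacked = stack_ka_ks(headers, ka, ks)
ratios = add_ka_ks_ratio(stacked)
```

Requiring both values to be greater than 0 avoids division by zero and excludes identical or near-identical pairs for which the ratio is not informative.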

For comparisons between T3SE family evolutionary rates and core genome evolutionary rates, we converted each individual core genome family alignment that was generated with MUSCLE to a nucleotide alignment with RevTrans, then concatenated these alignments end-to-end as described above. As was the case with each T3SE family, we then calculated Ka and Ks for all possible pairs of core genomes using the Nei-Gojobori Method and parsed the output files into stacked data frames using our custom python script. The core genome data frame was then merged with each T3SE family data frame independently based on the genomes that the two T3SE sequences were from so that the evolutionary rates between these two T3SEs could be directly compared to the evolutionary rates of the corresponding core genomes.
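Pairing each T3SE comparison with the core-genome comparison for the same two strains amounts to a join on an unordered genome pair. A minimal sketch follows; the row layout, the `genome_of` lookup, and the choice to skip pairs of alleles from the same genome are all assumptions.

```python
def genome_pair_key(genome_a, genome_b):
    """Order-independent key for a pair of genomes."""
    return tuple(sorted((genome_a, genome_b)))


def merge_with_core(t3se_rows, core_rows, genome_of):
    """Attach core-genome Ka and Ks to each T3SE pair from the same two genomes.

    t3se_rows: dicts with seq1, seq2, ka, ks (T3SE allele headers).
    core_rows: dicts with seq1, seq2, ka, ks (genome names as headers).
    genome_of: dict mapping T3SE allele header -> genome name.
    Assumption: pairs of alleles from the same genome are skipped because
    there is no corresponding core-genome comparison.
    """
    core = {genome_pair_key(r["seq1"], r["seq2"]): r for r in core_rows}
    merged = []
    for row in t3se_rows:
        g1, g2 = genome_of[row["seq1"]], genome_of[row["seq2"]]
        if g1 == g2:
            continue
        match = core.get(genome_pair_key(g1, g2))
        if match is not None:
            merged.append({**row, "core_ka": match["ka"], "core_ks": match["ks"]})
    return merged


# Illustrative values only.
t3se = [{"seq1": "g1_hopX", "seq2": "g2_hopX", "ka": 0.20, "ks": 0.40},
        {"seq1": "g1_hopX", "seq2": "g1_hopX_copy2", "ka": 0.01, "ks": 0.02}]
core = [{"seq1": "g2", "seq2": "g1", "ka": 0.02, "ks": 0.30}]
genome_of = {"g1_hopX": "g1", "g2_hopX": "g2", "g1_hopX_copy2": "g1"}
merged = merge_with_core(t3se, core, genome_of)
```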

#### Gain-Loss Analysis

We used Gain Loss Mapping Engine (GLOOME) to estimate the number of gain and loss events that have occurred for each T3SE family over the course of the evolution of the P. syringae species complex (Cohen et al., 2010). The gain-loss analysis implemented by GLOOME integrates the presence-absence data for each gene family of interest across the phylogenetic profile to estimate the posterior expectation of gain and loss across all branches. These events are then summed to calculate the total number of gene gain and loss events that have occurred for each family across the phylogenetic tree. We performed this analysis on each T3SE family using the mixture model with variable gain/loss ratio and a gamma rate distribution. The phylogenetic tree that we used for this analysis was the concatenated core genome tree, which gives us the best estimation of the evolutionary relationships between strains, given the ample recombination known to occur within the P. syringae species complex (Dillon et al., 2019).

#### RESULTS

In this study, we analyzed the collective type III effector repertoire of the P. syringae species complex using whole-genome assemblies from 494 strains representing 11 of the 13 established phylogroups and 72 distinct pathovars (**Supplementary Dataset S1**). These strains were isolated from 28 countries between 1935 and 2016, and include 62 P. syringae type and pathotype strains (Thakur et al., 2016). Although the majority of the strains were isolated from a diverse collection of more than 100 infected host species, we also included strains isolated from environmental reservoirs, which have been dramatically under-sampled in P. syringae studies (Morris et al., 2007, 2013; Mohr et al., 2008; Clarke et al., 2010; Demba Diallo et al., 2012; Monteil et al., 2013, 2016; Karasov et al., 2018). As per Dillon et al. (2019), we designate phylogroups 1, 2, 3, 4, 5, 6, and 10 as primary phylogroups and 7, 9, 11, and 13 as secondary phylogroups (we have no representatives from phylogroups 8 or 12, although presumably they would also be secondary phylogroups) (Berge et al., 2014). The primary phylogroups are phylogenetically quite distinct from the secondary phylogroups and include all of the well-studied P. syringae strains. Nearly all of the primary phylogroup strains carry a canonical P. syringae type III secretion system and were isolated from plant hosts. In contrast, many of the strains in the secondary phylogroups do not carry a canonical P. syringae type III secretion system and were isolated from environmental reservoirs (e.g., soil or water).

All of the P. syringae genome assemblies used in this study were downloaded directly from NCBI or generated in-house by the University of Toronto Centre for the Analysis of Genome Evolution & Function using paired-end data from the Illumina GAIIx or the Illumina MiSeq platform. There was some variation in the genome sizes, contig numbers, and N50s among strains due to the fact that the majority of the genomes are de novo assemblies in draft format (**Supplementary Figure S1**); however, the number of coding sequences identified in each strain was largely consistent with the six finished (closed and complete) genome assemblies in our dataset. Given the large size of the P. syringae pan-genome, the fact that some strains have acquired large plasmids, and the relatively high frequency of horizontal gene transfer in the P. syringae species complex (Baltrus et al., 2011; Dillon et al., 2019), we expect there to be some variation in genome size and coding content of different strains.

#### Distribution of Type III Secreted Effectors in the P. syringae Species Complex

To explore the distribution of T3SEs across the P. syringae species complex, we first identified all putative T3SEs present in each of our 494 genome assemblies using a BLASTP analysis (Altschul et al., 1997), where all protein sequences from each P. syringae genome were queried against a database of known P. syringae T3SEs obtained from the P. syringae Genome Resource Database<sup>4</sup>, the Bean 2.0 T3SE Database<sup>5</sup>, and the NCBI Protein Database<sup>6</sup>. In sum, we identified a total of 14,613 confirmed and putative T3SEs (**Supplementary Datasets S2–S4**), 4,636 of which were unique at the amino acid level, and 5,127 of which were unique at the nucleotide level. We consider these T3SEs to be putative because their presence within a genome does not necessarily mean that they are expressed and translocated into the host cytoplasm. Individual P. syringae strains in the dataset harbored between one and 53 putative T3SEs, with a mean of 29.58 ± 10.13 (SEM), highlighting considerable variation in both the composition and size of each strain's suite of T3SEs (**Figure 1**). As expected, primary phylogroup strains tended to harbor substantially more T3SEs than secondary phylogroup strains (30.55 ± 8.97 vs. 3.89 ± 1.64, respectively), which frequently do not contain a canonical T3SS (Dillon et al., 2019). However, a subset of strains from phylogroups 2 and 3, and all strains from phylogroup 10 harbored fewer than 10 T3SEs, more closely mirroring secondary phylogroup strains in their T3SE content. The extensive T3SE repertoires found in most primary phylogroup strains support the idea that these effectors play an important role in the ecological interactions of the majority of strains in this species complex.

<sup>4</sup>https://pseudomonas-syringae.org

<sup>5</sup>http://systbio.cau.edu.cn/bean

<sup>6</sup>https://www.ncbi.nlm.nih.gov

Objective criteria are required for partitioning and classifying T3SEs prior to any study of their distribution and evolution. In 2005, an effort was made to unify the disparate classification and naming conventions applied to P. syringae T3SEs (Lindeberg et al., 2005). While this effort was very successful overall, the criteria have not been universally or consistently applied, resulting in some problematic families. For example, the HopK and AvrRps4 families have high similarity over the majority of their protein sequences, but are assigned to distinct families, while the HopX family contains highly divergent subfamilies that only share short tracts of local similarity.

We reassessed the relationship between all 14,613 T3SEs using a formalized protocol in order to objectively delimit families and suggest new family designations for some similar families. While the selection of the specific delimiting criteria is arbitrary and open to debate, we have elected to use a well-established protocol with fairly conservative thresholds. We identified shared similarity using a BLASTP-based pairwise reciprocal best hit approach (Altschul et al., 1997; Eisen, 2000; Daubin et al., 2002), with a stringent Expect-value acceptance threshold of E < 1e−24 and a length coverage cutoff of ≥60% of the shorter sequence (regardless of whether it is query or subject). It should be noted that since this approach uses BLAST it requires only local similarity between family members. Nevertheless, our stringent E-value and coverage thresholds select for matches that share more extensive similarity than would typically be observed when proteins only share a single domain. We feel that these criteria provide a reasonable compromise between very relaxed local similarity criteria (using default BLAST parameters) and very conservative global similarity criteria. All T3SEs that exceeded our acceptance thresholds were sorted into family bins. T3SEs in each bin can therefore be either connected through direct similarity or transitive similarity. Finally, we assigned a primary name to all T3SEs in each bin based on the most common effector family name in that bin and included all secondary family names following a "/".

Our analysis identified 70 T3SE families and sorted T3SEs into their historical families in the majority of cases. We found that our particular delimitation criteria created T3SE family partitions with the best match to those commonly used in the literature. Unfortunately, it was impossible to find any one objective set of standards that did not require some family renaming and shuffling of alleles among families. These exceptions include merging existing effector families that shared significant local similarity (**Table 1**) and assigning previously named T3SEs into new families (**Table 2**). A number of these new families only contain a single allele, so it is likely that they are recent pseudogenes still annotated as coding sequences by Prodigal. Finally, in two cases, a subset of alleles from one T3SE family were assigned to a different family due to the extent of shared local similarity. This included the assignment of all originally designated HopS1 subfamily alleles to HopO, and the assignment of all originally designated HopX3 alleles to HopF/HopBB.



<sup>1</sup>New family short forms were assigned based on the first assigned Hop designation.

#### TABLE 2 | New T3SE Families.



<sup>1</sup>These new families only contain a single allele.

It is important to emphasize that the new criteria do not bin T3SEs whose significant similarity covers less than 60% of the shorter sequence. This was done to prevent families from being combined due to short chimeric relationships between a subset of the alleles in distinct families (Stavrinides et al., 2006). These relationships could be considered super-families, although the reticulated nature of these relationships makes this unwieldy. We list families that share these short regions of similarity in **Figure 2**, although it is important to recognize that some of these chimeric relationships are only found in a subset of alleles in each family.

The distribution of each of these 70 T3SE families across the P. syringae species complex reveals that the majority of families are present in only a small subset of P. syringae strains, typically from a few primary phylogroups (**Figure 3** and **Supplementary Figure S2**). Among T3SE families, only AvrE, HopB/HopAC, HopM, and HopAA are considered part of the soft-core genome of P. syringae (present in >95% of strains). This designation of core T3SE is not impacted by the inclusion or exclusion of the secondary phylogroup strains. Interestingly, three of these core families, AvrE, HopM, and HopAA, are part of the conserved effector locus (CEL), a well characterized and evolutionarily conserved sequence region that is present in most P. syringae strains (Alfano et al., 2000; Dillon et al., 2019). The fourth CEL effector, HopN, is only present in 14.98% of strains, all of which are from phylogroup 1. HopB/HopAC emerged as a core effector family in our analysis because of the merging of HopB with the broadly distributed HopAC family, which is present in nearly all P. syringae strains (491/494). The significant similarity between HopB and HopAC (E < 1e−24) occurs across the full length of HopB (the shorter of the two). Despite the high prevalence of this family throughout the P. syringae species complex, very little is known about its function. It would be particularly interesting to see if there has been neofunctionalization between the shorter HopB alleles, which are generally localized between the HrpK locus protein and a 315 amino acid hypothetical protein, and the longer alleles formally classified as HopAC, localized between a nebramycin 5′ synthase and a 481 amino acid hypothetical protein. While the remainder of T3SE families are also mostly present in a small subset of strains, there is a wide distribution in the number of strains harboring individual T3SE families, further highlighting the dramatic variation in T3SE content across P. syringae strains (**Supplementary Figure S3**).

Following family and strain T3SE classification, we also performed hierarchical clustering using the T3SE content of each strain to determine if T3SE profiles are a good predictor of host specificity. We previously reported that in P. syringae, neither the core genome nor gene content phylogenetic trees correlate well with the hosts from which the strains were isolated (Dillon et al., 2019). This remains true in this study, where we've updated the core and pan-genome analyses with an expanded set of strains (**Supplementary Figures S4**, **S5**). The T3SE content tree is not as well resolved due to the smaller number of phylogenetically informative signals in the T3SE dataset. However, we were able to largely recapitulate the established P. syringae phylogroups with this analysis, suggesting that more closely related strains do tend to have more similar T3SE repertoires (**Supplementary Figure S6**). We also see that the phylogroup 2, phylogroup 3, and phylogroup 10 strains that have smaller T3SE repertoires than other primary phylogroups cluster more closely with secondary phylogroup strains in the effector content tree. However, as was the case in the core genome and gene content trees, hierarchical clustering based on effector content did not effectively separate strains based on their host of isolation. We therefore conclude that overall T3SE content is not a good predictor of host specificity.

#### Diversification of Type III Secreted Effectors in the P. syringae Species Complex

Substantial genetic and functional diversity has been shown to exist within individual T3SE families (Lewis et al., 2014; Dillon et al., 2019). While some T3SE families are relatively small, restricted to only a subset of P. syringae strains, and present in only a single copy in each strain, others are found in nearly all strains, and often appear in multiple copies within a single genome (**Figure 4**). Many of the largest families, including those that are part of the core genome (AvrE, HopB/HopAC, HopM, and HopAA), are among those that are often present in multiple copies. However, we also found that some families that are present in less than half of P. syringae strains (e.g., HopF/HopBB, HopO, HopZ, and HopBL) frequently appear in multiple copies. While the average copy number of individual T3SEs per strain across all families is 1.30 and some families are present in copy numbers as high as six, it is important to recognize that in many cases these multi-copy effectors are not full-length T3SEs. Rather, they are partial copies derived from the same full-length allele that have been split due to the introduction of a premature stop codon (**Supplementary Dataset S2**). Therefore, at least in these cases, we think it is unlikely that both copies retain function. Nevertheless, the fact that both of these two coding sequences retain enough protein similarity to be linked to other T3SEs in the same family suggests that these disruption events either occur very commonly, or that these regions are still subject to purifying selection due to the retention of some functional role.

FIGURE 2 | Interfamily blast hits (E < 1e–10) that did not pass our E-value and/or length cut-offs for combining T3SEs into families. Each superfamily represents a cluster of families that have some overlapping sequence. Colored blocks represent the regions of the representative sequence pairs that are homologous, where the length of the blocks is proportional to the length of the homologous sequence. Black lines represent the remainder of each representative sequence that is not homologous, where the length of the lines is proportional to the length of the 5′ and 3′ non-homologous regions. Not all families within a superfamily need to contain a significant blast hit with all other families in the superfamily because they can be homologous to the same intermediate sequence in different regions. Short form family names are used for merged or separated families.

FIGURE 4 | Total number of P. syringae strains harboring an allele from each T3SE family. Color categories denote the copy number of each effector family in the corresponding strains. While the majority of families are mostly present in a single copy, some of the more broadly distributed families have higher copy numbers in a subset of P. syringae genomes. Short form family names are used for merged or separated families.

To quantify the extent of genetic diversification within each T3SE family, we aligned the amino acid sequences of all members from each family with MUSCLE, then reverse translated these amino acid alignments and calculated all pairwise non-synonymous (Ka) and synonymous (Ks) substitution rates for all pairs of alleles within each family. There was a broad range of pairwise substitution rates in the majority of T3SE families, which is expected given the range of divergence times in the core-genomes of strains from different P. syringae phylogroups (Dillon et al., 2019). The three families with the highest non-synonymous substitution rates were HopF/HopBB, HopAB/HopAY, and HopAT/HopAV (**Figure 5A**), which all have an average Ka greater than 0.5. These families also tended to have relatively high synonymous substitution rates, but several other families also have Ks values that are greater than 1.0 (**Figure 5B**).

While some pairwise comparisons of effector alleles did yield a Ka/Ks ratio greater than 1, the predominance of purifying selection operating in the conserved domains of these families likely overwhelms signals of positive selection at individual sites. Indeed, the average global pairwise Ka/Ks values were less than 1 for all T3SE families (**Figure 5C**). Therefore, we also analyzed the Ka and Ks on a per codon basis using FUBAR to search for site-specific signals of positive selection in each family (Bayes Empirical Bayes P-Value ≥ 0.9; Ka/Ks > 1) (Murrell et al., 2013). We find that 37 out of the 64 (57.81%) T3SE families with at least five alleles have at least one positively selected site. The number of positively selected sites in these families was relatively low, ranging from 1 to 17, with the percentage of positively selected sites in a single family never rising above 2.29% (**Table 3**). By comparison, we found that only 3,888/17,807 (21.83%) ortholog families from the pangenome of P. syringae that were present in at least five strains demonstrated signatures of positive selection at one or more sites (Dillon et al., 2019), suggesting that T3SE families experience elevated rates of positive selection. Nevertheless, these results should be interpreted cautiously given the variable rates of recombination and horizontal transfer among strains and the confounding impact this can have on detecting selection (O'Reilly et al., 2008; Betancourt et al., 2009).

Finally, to explore whether T3SE families display different levels of diversity than core gene families carried by the same P. syringae strains, we compared all pairwise Ka and Ks values within each effector family to the pairwise Ka and Ks values for the core genes carried in the corresponding genomes. We would expect T3SEs and core genes to share the same Ka and Ks values if they were evolving under the same evolutionary pressures. Deviation from this null expectation could be due to either differences in selective pressures, or the movement of the T3SE via horizontal gene transfer (HGT). We find that the pairwise Ka values for T3SEs are substantially higher than those of the corresponding core genes for the majority of T3SEs (**Figure 6A** and **Supplementary Figure S7**). This was also true for pairwise Ks values, although the differences between T3SE pairs and core genes were not as high and there were many more examples of T3SE pairs that had lower Ks values than the corresponding core genes (**Figure 6B** and **Supplementary Figure S8**).

#### Gene Gain and Loss of Type III Secreted Effectors in the P. syringae Species Complex

Both the patchy distribution of T3SE families across the P. syringae species complex and the inconsistent relationships between T3SE and core gene substitution rates suggest that HGT may be an important evolutionary force contributing to the evolution of T3SEs in the P. syringae species complex. Therefore, we also sought to analyze the expected number of gene gain events across the P. syringae phylogenetic tree in order to more accurately quantify the extent to which HGT has actively transferred T3SEs between P. syringae strains over the evolutionary history of the species complex. We used the Gain Loss Mapping Engine (GLOOME) to estimate the number of gain and loss events (Cohen et al., 2010; Cohen and Pupko, 2010), and found extensive evidence for HGT in several T3SE families, with some families experiencing as many as 40 HGT events over the course of the history of the P. syringae species complex (**Figure 7**). Outlier T3SE families that did not appear to have undergone much HGT in P. syringae include the smallest families, like HopU, HopBE, and HopBR/HopBN, and the largest families, like AvrE, HopB/HopAC, HopM, and HopAA. Smaller families were less likely to have undergone HGT because they were only identified in a subset of closely related strains, so are not expected to have been part of the P. syringae species complex through the majority of its evolutionary history. Larger families may experience less HGT because they are more likely to already be present in the recipient strain and therefore will quickly be lost following an HGT event. However, because GLOOME only identifies HGT events that result in the gain of a new family, we cannot be certain whether P. syringae genomes with multiple copies were generated by HGT or gene duplication.

An opposing evolutionary force that is also expected to have a disproportional effect on the evolution of T3SE families is gene loss. Specifically, loss of a given T3SE may allow a P. syringae strain to infect a new host by shedding an effector that elicits the host's ETI response. Indeed, we found that gene loss events were also common in many T3SE families, with more than 50 events estimated to have occurred in the HopAT/HopAV and HopAZ families (**Figure 7**). T3SE families that experienced more gene loss events also tended to experience more gene gain events, as demonstrated by a strong positive correlation between gene loss and gene gain in T3SE families (**Supplementary Figure S9**) (linear regression; F = 140.50, df = 1, 68, p < 0.0001, r<sup>2</sup> = 0.67). However, as was the case with gene gain events, we observed few gene losses in the smallest and the largest T3SE families. For small families, this is again likely to be the result of the fact that they have spent less evolutionary time in the P. syringae species complex. For large families, we are again blind to gene loss events that occur in a genome that has multiple copies of the effector prior to the loss event. Therefore, there are likely many more T3SE losses occurring in larger families than we observe here because these T3SE families tend to be present in multiple copies within the same genome.

substitution rates in the family that are not identified as outliers (> 1.5 times the interquartile range). Average pairwise Ka, Ks, and Ka/Ks values for each family are denoted by red diamonds. Short form family names are used for merged or separated families.

#### TABLE 3 | Positive Selection among T3SE Families.

<sup>1</sup>Unique DNA sequences.

Finally, we also observed that there is a significant positive correlation between both evolutionary rate parameters and the rates of gene gain and loss for T3SE families (Ka-Gene Gain: F = 8.48, df = 1, 63, p = 0.0050, r<sup>2</sup> = 0.1186; Ka-Gene Loss: F = 16.15, df = 1, 63, p = 0.0002, r<sup>2</sup> = 0.2041; Ks-Gene Gain: F = 6.46, df = 1, 63, p = 0.0135, r<sup>2</sup> = 0.0930; Ks-Gene Loss: F = 7.70, df = 1, 63, p = 0.0072, r<sup>2</sup> = 0.1089) (**Supplementary Figure S10**). This implies that the same evolutionary forces resulting in diversification of T3SEs are also causing them to undergo elevated rates of gain or loss. However, there was substantial unexplained variance in these correlations, resulting in some T3SE families that have high evolutionary rates and low levels of gain and loss, and other T3SE families that have low evolutionary rates and high levels of gain and loss. These families tended to be the same for all correlations.
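The regression statistics quoted for these correlations can be cross-checked against one another. For a simple linear regression with a single predictor, the F statistic and r<sup>2</sup> are linked by the identity F = r<sup>2</sup> · df<sub>2</sub> / (1 − r<sup>2</sup>), where df<sub>2</sub> is the residual degrees of freedom. The sketch below applies that identity to the values reported in the text. The function and variable names are ours, and the small tolerance allows for rounding in the published r<sup>2</sup> values.

```python
def f_from_r2(r2, df_resid):
    """F statistic of a one-predictor linear regression, derived from r^2."""
    return r2 / (1.0 - r2) * df_resid


# (r^2, residual df, reported F) as quoted in the text above.
reported = {
    "Ka-Gene Gain": (0.1186, 63, 8.48),
    "Ka-Gene Loss": (0.2041, 63, 16.15),
    "Ks-Gene Gain": (0.0930, 63, 6.46),
    "Ks-Gene Loss": (0.1089, 63, 7.70),
}

for label, (r2, df_resid, f_reported) in reported.items():
    f_derived = f_from_r2(r2, df_resid)
    print(f"{label}: derived F = {f_derived:.2f}, reported F = {f_reported:.2f}")
```

The derived and reported F values agree to within rounding, and the low r<sup>2</sup> values (roughly 0.09 to 0.20) correspond to the substantial unexplained variance noted above.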

#### DISCUSSION

Bacterial T3SEs are primary virulence factors in a wide range of plant and animal pathogens (Hueck, 1998; Desveaux et al., 2006; Zhou and Chai, 2008; Block and Alfano, 2011; Buttner, 2016; Khan et al., 2016; Hu et al., 2017; Khan et al., 2018; Xin et al., 2018). T3SEs are particularly interesting from an evolutionary perspective due to their dual and diametrically opposed roles in host–pathogen interactions. While T3SEs have evolved in order to promote bacterial fitness, usually via the suppression of host immunity or disruption of host cellular homeostasis, hosts have evolved mechanisms to recognize the presence or activity of T3SEs, and this recognition often elicits an immune response that shifts the interaction back into the host's favor. To explore the distribution and evolutionary history of P. syringae T3SEs and gain insight into their role in host specificity, we cataloged the T3SE repertoires of a large and diverse collection of 494 P. syringae isolates. These phylogenetically diverse strains allowed us to generate an expanded database of more than 14,000 putative T3SE alleles and investigate the evolutionary mechanisms through which these important molecules have enabled P. syringae to become one of the most globally important bacterial plant pathogens (Mansfield et al., 2012).

then calculating pairwise substitution rates in MEGA7 with the Nei-Gojobori Method. Each point on the scatter plot represents the average of these pairwise rates for a single family and the red dotted lines represent the null-hypothesis that the substitution rates in the effector family will be the same as the substitution rates of the core genes in the same collection of genomes.

#### Expanded Database of Type III Secreted Effectors in P. syringae

This study increases the number of confirmed and putative T3SE alleles available in the P. syringae Genome Resources Database by 20-fold, resulting in a final database of 14,613 T3SE alleles from the P. syringae species complex, 5,127 of which are unique at the nucleotide level. Although these new, putative T3SEs all share an ancestral sequence with known T3SE families, we did not confirm expression or translocation of any of these T3SEs, so some of these coding regions may represent recently pseudogenized effectors. However, the extensive diversification that has occurred within many of these families clearly indicates that some level of functional diversification has occurred.

Consistent with our earlier analysis, we find that primary phylogroup strains harbor considerably larger repertoires of T3SEs than secondary phylogroup strains (Baltrus et al., 2011; O'Brien et al., 2011; Dudnik and Dudler, 2014; Dillon et al., 2019). We also find that a small number of primary phylogroup strains have significantly smaller effector repertoires, including phylogroup 10 strains, which were primarily isolated from non-agricultural sources similar to most secondary phylogroup strains, and the phylogroup 2 strain Psy642, which has previously been highlighted as an outlier in its T3SE content and has been characterized as non-pathogenic (Clarke et al., 2010; O'Brien et al., 2011). In general, phylogroup 2 strains have somewhat smaller T3SE repertoires and employ a greater number of phytotoxins relative to other primary phylogroup strains (Baltrus et al., 2011; O'Brien et al., 2011; Dillon et al., 2019). This may indicate that phylogroup 2 strains have evolved a different host-microbe lifestyle than other P. syringae primary phylogroup strains, e.g., one tending toward low virulence, epiphytic interactions, rather than high virulence, invasive pathogenesis (Hirano and Upper, 2000).

Among the 70 T3SE families that were delimited in this study, seven of the newly assigned families had fewer than five total members (HopBR/HopBN, HopBS/HopAV, HopBT/HopAB, HopBU/HopAB, HopBV/HopAJ, HopBW/HopBH, HopBX/HopL). These families all consist of alleles that were separated from a larger T3SE family during the delimitation stage of our analysis because they shared only very limited regions of local similarity with the larger family. The small size of these families suggests that they may be pseudogenes degenerating due to a lack of selective constraints. The 63 remaining families are similar to the ∼60 families that have been discussed in earlier studies (Baltrus et al., 2011; Lindeberg et al., 2012). While we do merge seven families based on our delimitation analysis, seven new families have been discovered in the past 5 years (McCann et al., 2013; Hockett et al., 2014; Lam et al., 2014; Matas et al., 2014; Mucyn et al., 2014). Furthermore, our objective delimitation analysis separated HopX2 from HopX, HopZ3 from HopZ, and HopH3 from HopH, forming the HopBO/HopX, HopBP/HopZ, and HopBQ/HopH families, respectively. Despite these differences, we arrive at several similar conclusions to prior work on the distribution of individual T3SEs across P. syringae strains. Specifically, we find that few T3SE families are considered part of the core genome (Baltrus et al., 2011; O'Brien et al., 2011; Lindeberg et al., 2012), with only AvrE, HopB/HopAC, HopM, and HopAA being present in more than 95% of strains. Three of these families (AvrE, HopM, and HopAA) are part of the CEL, while the other CEL effector, HopN, is only present in 14.98% of P. syringae strains, all from phylogroup 2. This suggests that HopN arose in the CEL after the divergence of this phylogroup. Other families that have previously been characterized as core T3SEs in P. syringae include HopI and HopAH (Baltrus et al., 2011), which are only present in 79.76 and 89.07% of strains from our study, respectively. Neither HopB nor HopAC has been highlighted as a core T3SE in prior studies, but the HopB/HopAC family in this study was one of the largest and most broadly distributed T3SE families. Although HopB and HopAC do vary substantially in length and occur in different genomic contexts, they typically share reciprocal BLASTP hits across more than 80% of the HopB sequence with E-values less than 1e−24, indicating shared ancestry. The remainder of T3SE families have a considerably sparser distribution across the P. syringae species complex, ranging in frequency from 1.62% to 80.97%. This demonstrates that different T3SE families were likely acquired episodically throughout the evolutionary history of the P. syringae species complex and are subject to strong evolutionary pressures for gain and loss due to the widespread and diverse ETI surveillance system of plants (Cunnac et al., 2009; Xin et al., 2018).

FIGURE 7 | Expected number of gene gain and gene loss events for each T3SE family. The posterior expectation for gain and loss events was estimated for each family on each branch of the P. syringae core-genome tree using GLOOME with the stochastic mapping approach. The sum of these posterior expectations across all branches yields the total expected number of events for each family. Short form family names are used for merged or separated families.

Finally, we find that highly divergent combinations of T3SEs can enable P. syringae to infect the same host (**Supplementary Figure S6**). While this observation is consistent with prior studies in P. syringae (Baltrus et al., 2011; Lindeberg et al., 2012; O'Brien et al., 2012), it is in contrast to the convergence in T3SE repertoires that has been observed in Xanthomonas, another phytopathogen that employs a T3SS (Hajri et al., 2009). Importantly, this limits our ability to detect and differentiate P. syringae pathogens of different hosts using this fairly crude application of comparative genomics. The lack of correlation between T3SE repertoires and host specificity may be a direct result of the fact that there is substantial functional redundancy among P. syringae T3SEs from different families, or that certain T3SEs in combination can mask the detection of other T3SEs in a given P. syringae background (Cunnac et al., 2009; Cunnac et al., 2011; Lindeberg et al., 2012; Wei et al., 2018). However, it will be important moving forward to assess the true host range of a broader collection of P. syringae strains in order to determine whether specific T3SEs promote or suppress growth on particular hosts.

#### Genetic and Functional Evolution of P. syringae Type III Secreted Effectors

Given the broad array of unique T3SEs that exist within the P. syringae species complex, mining this untapped diversity is likely to reveal a number of new functions and interactions for T3SEs in P. syringae. By quantifying Ka, Ks, and Ka/Ks for each pair of T3SE alleles in each family, we identified substantial genetic diversity in most T3SE families (**Figure 5**). Our codon-level analysis of positive selection also revealed that T3SE families were substantially more likely than non-T3SE families to contain positively selected sites (**Table 3**). Finally, we confirmed that this divergence is not simply a reflection of the immense diversity exhibited by the strains used in this study, since the divergence observed for T3SE families is consistently higher than the divergence observed across core genes (**Supplementary Figures S7**, **S8**). Elevated nonsynonymous substitution rates in T3SE families imply that there may be elevated positive selection operating on these families, while elevated synonymous substitution rates show that this elevated positive selection may extend to synonymous sites, that many T3SEs arose prior to the last common ancestor (LCA) of the P. syringae species complex, and/or that T3SEs undergo considerably higher rates of HGT than core genes. However, it is difficult to pinpoint the timing and strength of positive selection on T3SEs because of the confounding effects of variable rates of recombination and horizontal transfer throughout their evolutionary history.
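As an illustration of how Ka, Ks, and Ka/Ks relate to one another, the sketch below implements only the final step of a Nei-Gojobori style calculation: the proportions of synonymous and nonsynonymous differences are converted to distances with the Jukes-Cantor correction, d = −(3/4) ln(1 − 4p/3). The counts of sites and differences are assumed to have been obtained already by codon-by-codon comparison, and the numbers used are hypothetical rather than taken from this study. This is a minimal sketch, not a reproduction of the MEGA7 workflow.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (Ka, Ks, Ka/Ks) from counts of differences and sites."""
    ks = jukes_cantor(syn_diffs / syn_sites)
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ratio = ka / ks if ks > 0 else float("nan")
    return ka, ks, ratio


# Hypothetical counts for one pair of alleles (illustrative only).
ka, ks, ratio = ka_ks(syn_diffs=12, syn_sites=150, nonsyn_diffs=30, nonsyn_sites=450)
print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka/Ks = {ratio:.3f}")
```

A ratio above 1 is conventionally read as a signature of positive selection and a ratio below 1 as purifying selection, although averaging over a whole gene can mask selection acting on individual codons.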

Fast-evolving T3SEs will also provide numerous opportunities for studying Red Queen dynamics (Van Valen, 1973). Under Fluctuating Red Queen (FRQ) dynamics, fluctuating selection drives oscillations in allele frequencies at the focal genetic loci in both the pathogen and the host, resulting in rapid evolutionary change on both sides (Brockhurst et al., 2014). In the case of P. syringae and their plant hosts, bacterial T3SEs are the key players on the pathogen side, and plant resistance genes are the key players on the host side. These FRQ dynamics are expected to maintain high levels of within-population genetic diversity at focal loci, as we've observed in many T3SE families. The majority of T3SE families in P. syringae are highly divergent and display strong signatures of positive selection, likely in response to intense host-imposed selection to evade recognition (Rohmer et al., 2004; Baltrus et al., 2011; Lindeberg et al., 2012). This implies that few T3SEs are broadly unrecognized, making interactions between individual T3SEs and the corresponding plant resistance genes an excellent resource for exploring FRQ dynamics.

The highly dynamic nature of T3SE evolution is also seen in our analysis of T3SE gain and loss across the P. syringae phylogenetic tree. More than five gene gain events are estimated to have occurred in 52 out of the 70 T3SE families analyzed in this study, with a maximum of 41 HGT events estimated in the HopZ family. Gene loss events were even more common, with 57 out of 70 T3SE families experiencing more than five loss events and a maximum of 53 events in the HopAZ family. Earlier studies have also suggested that both gene gain and loss were quite common among T3SE families. One specific study using nucleotide composition and phylogenetics found that members from 11 out of 24 tested P. syringae T3SE families were recently acquired by HGT (Rohmer et al., 2004). These families included AvrA, AvrB, AvrD, AvrRpm, HopG, HopQ, HopX, HopZ, HopAB, HopAF, and HopAM (although AvrD is not a T3SE; Leach and White, 1996; Mucyn et al., 2014). The T3SEs from this dataset were also highlighted by this study as undergoing considerably higher rates of gene gain and loss within the P. syringae species complex. Specifically, all of these T3SEs were demonstrated to have undergone at least ten gene gain events and many were among the most dynamic T3SEs in our dataset. Other studies have shown that many T3SEs are present on mobile genetic elements and that T3SEs from the same family are often found at different genomic locations (Kim and Alfano, 2002; Charity et al., 2003; Lovell et al., 2009, 2011; Godfrey et al., 2011; Neale et al., 2016), which may both promote and be a consequence of the high rates of gene gain and loss for particular T3SE families. From a selective perspective, it is also likely that host immune recognition can drive selection for gene gain or loss (Vinatzer et al., 2006), while the functional redundancy of different T3SE families carried in the same genetic background may limit the negative impacts of the loss of such T3SEs (Kvitko et al., 2009; Cunnac et al., 2011; Wei et al., 2018). Finally, as has been previously reported (Baltrus et al., 2011), we find that there is a significant positive correlation between rates of evolution and rates of gene gain and loss (**Supplementary Figure S10**), suggesting that similar evolutionary forces that cause the diversification of T3SEs are contributing to the loss and gain of T3SEs. However, not all T3SEs fit this model, which could reflect that T3SEs vary in their mutational robustness and/or that the genomic context of different T3SEs makes them more or less prone to HGT. In any event, the extensive gene gain and loss that occurs in the majority of T3SE families lends further support to the hypothesis that few T3SE alleles are broadly unrecognized (Baltrus et al., 2011).

Given the highly dynamic nature of T3SE evolution, we predict that there are still numerous T3SEs that will be found to elicit ETI. Most research on ETI elicitation to date has focused on a small number of T3SE families, and an even smaller number of alleles from each family (Mansfield, 2009). The immense diversification that we observe in many T3SE families points to strong selective pressures that may be explained by as-yet-undiscovered ETI responses. If this prediction holds true, it will be particularly interesting to study T3SE families with alleles that induce different ETI responses in the same host. These patterns will help reveal how strains shift onto new hosts or break immunity in an existing host, perhaps explaining the evolutionary driving force behind new disease outbreaks.

### AUTHOR CONTRIBUTIONS

MD, DD, and DG designed the research. MD, BL, AM, and BW performed the experiments. MD and RA analyzed the data. MD and DG wrote the manuscript.

### FUNDING

This work was supported by Natural Sciences and Engineering Research Council of Canada Discovery grants (DG and DD), Canada Research Chairs in Comparative Genomics (DG) and Plant-Microbe Systems Biology (DD), and the Center for the Analysis of Genome Evolution and Function (DG and DD).

### ACKNOWLEDGMENTS

We thank all members of the Guttman and Desveaux labs for helpful discussion and valuable input on this project.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00418/full#supplementary-material

#### REFERENCES



a minimal functional repertoire of type III effectors in Pseudomonas syringae. Proc. Natl. Acad. Sci. U.S.A. 108, 2975–2980. doi: 10.1073/pnas.1013031108


phaseolicola and Phaseolus. Mol. Plant Microbe Interact. 4, 553–562. doi: 10.1094/MPMI-4-553



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Dillon, Almeida, Laflamme, Martel, Weir, Desveaux and Guttman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Fitness Features Involved in the Biocontrol Interaction of Pseudomonas chlororaphis With Host Plants: The Case Study of PcPCL1606

Eva Arrebola<sup>1,2</sup>, Sandra Tienda<sup>1,2</sup>, Carmen Vida<sup>1,2</sup>, Antonio de Vicente<sup>1,2</sup> and Francisco M. Cazorla<sup>1,2</sup>\*

<sup>1</sup> Departamento de Microbiología, Facultad de Ciencias, Universidad de Málaga, Málaga, Spain, <sup>2</sup> Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora" IHSM, UMA-CSIC, Málaga, Spain

#### Edited by:

Marco Scortichini, Council for Agricultural and Economics Research, Italy

#### Reviewed by:

Vittoria Catara, Università degli Studi di Catania, Italy; Anastasia L. Lagopodi, Aristotle University of Thessaloniki, Greece; Monika Maurhofer, ETH Zürich, Switzerland

> \*Correspondence: Francisco M. Cazorla cazorla@uma.es

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology

Received: 26 November 2018 Accepted: 21 March 2019 Published: 10 April 2019

#### Citation:

Arrebola E, Tienda S, Vida C, de Vicente A and Cazorla FM (2019) Fitness Features Involved in the Biocontrol Interaction of Pseudomonas chlororaphis With Host Plants: The Case Study of PcPCL1606. Front. Microbiol. 10:719. doi: 10.3389/fmicb.2019.00719

The goal of this mini review is to summarize the relevant contribution of some beneficial traits to the behavior of the species Pseudomonas chlororaphis, and using that information, to give a practical point of view using the model biocontrol strain P. chlororaphis PCL1606 (PcPCL1606). Among the group of plant-beneficial rhizobacteria, P. chlororaphis has emerged as a plant- and soil-related bacterium that is mainly known because of its biological control of phytopathogenic fungi. Many traits have been reported to be crucial during the multitrophic interaction involving the plant, the fungal pathogen and the soil environment. To explore the different biocontrol-related traits, the biocontrol rhizobacterium PcPCL1606 has been used as a model in recent studies. This bacterium is antagonistic to many phytopathogenic fungi and displays effective biocontrol against fungal phytopathogens. Antagonistic and biocontrol activities are directly related to the production of the compound 2-hexyl, 5-propyl resorcinol (HPR), despite the production of other antifungal compounds. Furthermore, PcPCL1606 has displayed additional traits regarding its fitness in soil and plant root environments such as soil survival, efficient plant root colonization, cell-to-cell interaction or promotion of plant growth.

Keywords: Pseudomonas chlororaphis, root colonization, biocontrol, avocado, antifungals

#### INTRODUCTION

Since the earliest studies, soil has been described as an infinite source of microorganisms with beneficial activities that promote plant health (Waksman and Woodruff, 1940). Within the soil, the rhizosphere is considered the soil-plant root interface where potentially beneficial rhizobacteria become established. The plant-beneficial microbial life can be actively recruited by the plant rhizosphere (Berendsen et al., 2018) and can finally result in the biological control of the disease (Babalola, 2010). These biocontrol rhizobacteria can use a wide range of mechanisms involved in the suppression of plant pathogens. A diverse range of bacterial genera, such as Bacillus, Pseudomonas, Serratia, Stenotrophomonas, and Streptomyces, has been commonly described as beneficial rhizobacteria (Berg, 2009). Among them, representatives of the Pseudomonas genus have been commonly associated with the rhizosphere and soil habitats (Lugtenberg and Dekkers, 1999). This bacterial genus has also been widely studied due to its ability to produce antifungal compounds, compete for niche and/or nutrients in the rhizosphere, and elicit induced systemic resistance in plants (Haas and Défago, 2005). Currently, many strains belonging to the group of fluorescent Pseudomonas are known to promote plant growth and reduce the severity of various diseases (Ganeshan and Kumar, 2005; Mercado-Blanco and Bakker, 2007; Weller, 2007).

The Pseudomonas fluorescens complex is one of the most diverse bacterial groups within the Pseudomonas genus and comprises more than fifty validly named species and many unclassified isolates (Garrido-Sanz et al., 2017). Many strains of this complex have been isolated from plant-related environments, and several species can be considered beneficial since many are described as plant growth-promoting rhizobacteria (PGPR) and/or minimize the effects of phytopathogens (Kang et al., 2006; Raaijmakers et al., 2009). The beneficial effects displayed by some bacteria result from the expression of multiple activities that act directly and indirectly, inhibiting pathogen activities and promoting plant health (McSpadden, 2007). To date, a number of studies have characterized the environmental factors that affect the abundance of different pseudomonad populations below ground (Berg et al., 2002; Ownley et al., 2003; Mazzola et al., 2004; Bergsma-Vlami et al., 2005). The Pseudomonas species most commonly reported to include plant-beneficial rhizospheric strains are Pseudomonas aureofaciens, Pseudomonas brassicacearum, Pseudomonas chlororaphis, P. fluorescens, Pseudomonas protegens, and Pseudomonas putida.

#### BENEFICIAL TRAITS OF RHIZOSPHERIC Pseudomonas chlororaphis STRAINS

Among the beneficial Pseudomonas spp., P. chlororaphis has evolved to be a common inhabitant of the root environment of many plants. Moreover, the role of specific traits that allow this bacterium to be used as an inoculant for biofertilization, phytostimulation, and biocontrol purposes has been extensively reported (Bloemberg and Lugtenberg, 2001).

#### Subspecies of Pseudomonas chlororaphis

Pseudomonas chlororaphis, P. aureofaciens, and Pseudomonas aurantiaca were initially included in the Approved List of Bacterial Names (Skerman et al., 1980) and considered separate species in the first edition of Bergey's Manual of Systematic Bacteriology by Palleroni (1984). However, the results obtained by Peix et al. (2007) from fatty acid analysis, phenotypic characterization, 16S rRNA gene sequencing, and DNA-DNA relatedness, together with the results obtained by Hilario et al. (2004) on the phylogenetic analysis of several housekeeping genes, support the reclassification of P. aurantiaca as a later heterotypic synonym of P. chlororaphis. The results published by Peix et al. (2007) also reveal that strains of P. aurantiaca, P. aureofaciens, and P. chlororaphis form three clearly distinguishable groups within P. chlororaphis that merit the status of subspecies. Therefore, the current classification is P. chlororaphis subsp. chlororaphis subsp. nov., P. chlororaphis subsp. aureofaciens subsp. nov., comb. nov. and P. chlororaphis subsp. aurantiaca subsp. nov., comb. nov. (Peix et al., 2007). Three years later, Burr et al. (2010) added a new subspecies to P. chlororaphis and placed it on a distinct branch within this species with the name P. chlororaphis subsp. piscium subsp. nov. The current reports of sequenced bacterial genomes of P. chlororaphis strains (Calderón et al., 2015; Deng et al., 2015; Town et al., 2016; Moreno-Avitia et al., 2017; Biessy et al., 2019) will help to refine the current classification of the P. chlororaphis group.

### Main Traits Involved in Biocontrol by P. chlororaphis

These aerobic, Gram-negative bacteria are associated with soil and plant roots (González-Sánchez et al., 2010; Calderón et al., 2015; Vida et al., 2017a). Typically, this species possesses plant-colonizing and antagonistic activities against soilborne plant pathogens. Products from secondary metabolism usually mediate antagonism and can be regulated by the GacS-GacA two-component regulatory system. The GacS-GacA system governs a complex signal transduction pathway involving regulatory RNAs and translational repression (Yan et al., 2018; Jahanshah et al., 2019). Simultaneously, quorum sensing (QS) is a regulatory system involved in the general biology of P. chlororaphis, including biofilm formation, antifungal production and exoenzyme secretion. QS is a mechanism of intercellular signaling that allows the bacterial population to act in a coordinated manner, based on the secretion of diffusible signal molecules (mainly acyl homoserine lactones, or AHL; Venturi, 2006). The use of OMICs and functional studies has revealed a more complex scenario, in which several QS systems can coexist inside the same bacterial cell (Morohoshi et al., 2017) and secondary metabolites (such as the antifungal phenazines and/or the resorcinol-related compounds) participate in final QS regulation (Selin et al., 2010; Brameyer et al., 2015).

Recent reports using OMICs techniques have allowed a more comprehensive understanding of the potential weaponry that the P. chlororaphis group could use to interact with the plant root. Examples include different antimicrobial and insecticidal compounds, cyclic peptides, siderophores, bacteriocins, molecules involved in beneficial plant-bacteria interactions, secretion systems, and antibacterial proteins (Loper et al., 2012; Chen et al., 2015; Biessy et al., 2019). The most relevant are summarized below (**Table 1**).

Phenazines are among the most copious secondary metabolites produced by fluorescent pseudomonads, and phenazine-producing microorganisms represent a ubiquitous group of antibiotic-producing bacteria in the environment (Chin-A-Woeng et al., 2000; Mavrodi et al., 2013). Phenazine compounds are redox-active nitrogen-containing heterocyclic molecules, and their beneficial role in plant biology is not limited to antibiosis against phytopathogenic microbes (Pierson and Pierson, 2010; Biessy and Filion, 2018; Biessy et al., 2019). Additional effects have been shown for these compounds, such as triggering induced systemic resistance in plants, reducing the expression of key pathogenicity-related genes of the phytopathogen, or involvement in root persistence (Biessy and Filion, 2018). In relation to the bacterial interaction with the plant root, phenazines can be crucial for biofilm formation (Selin et al., 2010). Extensive colonization of the rhizosphere is a prerequisite for efficient disease suppression because it prevents the pathogen from accessing the root (Lugtenberg and Kamilova, 2009). The involvement of phenazines in root colonization has been strengthened because some phenazine compounds could be terminal signaling factors in the QS network of some bacteria and are directly involved in biofilm formation on biotic surfaces (Dietrich et al., 2006; Selin et al., 2012).

TABLE 1 | Summary of main compounds produced by Pseudomonas chlororaphis subspecies with beneficial effects in plant pathogen control.

Reference strains published are included. <sup>1</sup>Pa: P. chlororaphis subsp. aurantiaca; Pe: P. chlororaphis subsp. aureofaciens; Pc: P. chlororaphis subsp. chlororaphis; Pp: P. chlororaphis subsp. piscium.

Pyrrolnitrin and the volatile compound hydrogen cyanide are also among the additional antifungal compounds typically produced by P. chlororaphis strains. Pyrrolnitrin is considered a key compound for fungal biocontrol (Hill et al., 1994) and is becoming even more relevant than phenazines, extending its action to eukaryotic organisms (Nandi et al., 2015; Huang et al., 2018). The same observation can be applied to the volatile compound hydrogen cyanide, which also has a broad spectrum of prokaryotic and eukaryotic targets (Nandi et al., 2017; Kang et al., 2018). The biological importance of the broad spectrum of both active compounds would be related to the environmental persistence of the producing bacteria, for example by allowing them to escape predation (Nandi et al., 2017). Regarding the insecticidal activity of this bacterial species, the most studied virulence factor against insects is the Fit toxin, which is similar to Mcf1 of the entomopathogenic bacterium Photorhabdus luminescens (Ruffner et al., 2015). Fit mutants of P. chlororaphis PCL1391 further showed reduced virulence, and the residual toxicity could be assigned to the wide range of other antimicrobial compounds produced by P. chlororaphis (previously listed) or to cyclic lipopeptides (Flury et al., 2017).

Cyclic lipopeptides (Clps) can be involved in many biological functions, such as motility, biofilm formation, protection against predators and antagonism (De Souza et al., 2003; Raaijmakers et al., 2010). Clps produced by plant-beneficial bacteria were found to induce plant resistance and to contribute to plant protection against root pathogenic fungi (Olorunleke et al., 2015). Interestingly, Clps were also demonstrated to be additional insect pathogenicity factors in P. chlororaphis strains (Flury et al., 2016, 2017).

The production of exoenzymes has also been described to have a role in biocontrol activity (Haran et al., 1996). Enzymes such as chitinases, lipases or proteases have a broad distribution among the soil bacterial community and are probably related to general metabolism, but they also inhibit the pathogen (by degrading some cell structures) and stimulate plant growth by providing additional resources from the degradative activity (Vida et al., 2017a). Remarkably, P. chlororaphis strains can produce 1-aminocyclopropane-1-carboxylate (ACC) deaminase (Nadeem et al., 2007), an enzyme produced by plant-associated bacteria that decreases ethylene levels and protects the plant from its effects, which results in a general beneficial activity (Glick, 2014). In addition, the production of the biofertilizer hormone indole-3-acetic acid (IAA) has also been reported for P. chlororaphis strains (Dimkpa et al., 2012); its production is important in microbe-microbe and microbe-plant signaling and can also result in the promotion of plant growth (Kang et al., 2006).

Other compounds can also have an important role for P. chlororaphis, such as siderophores, whose production can be considered a general beneficial activity, at least for all the soil-related Pseudomonas spp. (Zhang and Rainey, 2013). These molecules are secondary metabolites involved in iron chelation. The best known is pyoverdine, a water-soluble fluorescent pigment produced by fluorescent Pseudomonas species (Barelmann et al., 2003). However, recent comparative genomic studies of P. chlororaphis genomes revealed the putative presence of various secondary siderophores, such as achromobactin and hemophore (Biessy et al., 2019).

#### THE BENEFICIAL RHIZOBACTERIUM Pseudomonas chlororaphis PCL1606 (PcPCL1606) AS A MODEL

In order to find potential bacterial biocontrol agents against avocado white root rot, caused by Rosellinia necatrix, a collection of bacterial isolates belonging to the genera Bacillus and Pseudomonas was obtained from the avocado rhizosphere (Cazorla et al., 2006, 2007; Pliego et al., 2011). Interestingly, a number of P. chlororaphis strains were consistently isolated from avocado roots (Cazorla et al., 2006). Crop management could increase the presence of P. chlororaphis isolates on avocado roots, since it has been reported that the application of organic amendments can enhance the presence of specific groups of beneficial microbes, including antagonistic P. chlororaphis (Vida et al., 2016).

### PcPCL1606 as a Biological Control Agent

Nearly all the P. chlororaphis strains isolated from avocado roots were antagonistic and produced a broad range of antimicrobials, including phenazines. Among them, strain PcPCL1606 does not produce phenazines; instead, it produces proteases, lipases and the antifungal metabolite 2-hexyl 5-propylresorcinol (HPR; **Figure 1**). Another unusual characteristic of this strain is the absence of plant growth promotion in the assayed plant models; however, siderophore production and phosphorus solubilization were detected (among other PGPR-related traits; Vida et al., 2017a). This strain displayed strong antagonism to many phytopathogenic fungi and showed biocontrol of crown and root rot of tomato, caused by Fusarium oxysporum f. sp. radicis-lycopersici, and of avocado white root rot, caused by R. necatrix (Cazorla et al., 2006; González-Sánchez et al., 2013). The effectiveness of biocontrol was directly related to the compound HPR (Cazorla et al., 2006; Calderón et al., 2013). HPR production depends on three biosynthetic genes located in a cluster (darA, darB, and darC), followed by two independent regulatory genes (darS and darR; Nowak-Thompson et al., 2003; Calderón et al., 2013). Further experiments revealed that HPR production is also under transcriptional regulation by the GacS-GacA two-component regulatory system, as previously described for other antifungal antibiotics (Haas and Keel, 2003), and is also modulated by growth parameters such as temperature, pH and the presence of salts in the medium (Calderón et al., 2014a).

### Main Features of PcPCL1606 Involved in Pathogen and Plant Interaction

PcPCL1606 showed strong antifungal activity (**Figure 1**), and HPR production was the main determinant of the antagonistic and biocontrol phenotypes (Calderón et al., 2013). In addition to HPR, PcPCL1606 can produce other antifungals, such as pyrrolnitrin (PRN) or hydrogen cyanide (HCN), as well as several exoenzymes such as proteases, chitinases or phosphatases (Vida et al., 2017a). Nevertheless, HPR is more than a powerful compound against pathogenic fungi in the soil and could have additional roles. Some alkylresorcinols (the class to which HPR belongs) have been reported to behave as quorum sensing-like signal molecules in the genus Photorhabdus (Brameyer et al., 2015), and HPR could therefore have a similar role in HPR-producing P. chlororaphis strains. Thus, additional HPR-dependent traits other than antagonism, such as root colonization or biofilm formation, could have an essential role in the beneficial effects of PcPCL1606 on the plant (Calderón et al., 2014b, 2019).

With regard to the possibility of physically excluding the pathogen from the plant root habitat (**Figure 1**), biological processes such as biofilm formation or chemotaxis are crucial for PcPCL1606. PcPCL1606 is strongly attracted to avocado root exudates by chemotactic processes (Polonio et al., 2017). As a result of this attraction, PcPCL1606 efficiently colonizes avocado roots (González-Sánchez et al., 2010) and can be found forming a biofilm on avocado root surfaces, in the same area where R. necatrix can be found during the early stages of infection (Calderón et al., 2014b). Moreover, two bacteriocins (R-tailocins 1 and 2) recently described in PcPCL1606 would contribute to better competition against other rhizosphere-associated bacteria (Dorosky et al., 2017). PcPCL1606 cells also displayed direct chemotaxis toward fungal exudates and ultimately established direct contact with the hyphae of R. necatrix. This cell-to-cell contact increases stress symptoms in the hyphae, in part through the direct release of antifungal substances, which leads to accelerated ageing and death of the hyphae (Calderón et al., 2014b; Moore-Landecker, 1996). Moreover, the root colonization ability and biofilm formation of the wild-type strain were also related to HPR production, and the absence of HPR resulted in reduced root colonization levels and no biofilm formation by PcPCL1606 (Calderón et al., 2014b, 2019).

To gain insight into the features of PcPCL1606, its complete genome was sequenced. Phylogenetic studies clustered this strain into the P. chlororaphis clade, which is placed within the fluorescent Pseudomonas complex; however, as previously mentioned, PcPCL1606 is not a typical P. chlororaphis strain (Biessy et al., 2019). Phylogenetic analysis revealed clear differences from the genomes of other biocontrol P. chlororaphis strains, such as PcPCL1601 or PcPCL1607, also isolated from avocado roots (Calderón et al., 2015; Vida et al., 2017b; Biessy et al., 2019). Analysis of the PcPCL1606 genome confirmed the lack of biosynthetic genes for phenazines and for cyclic lipopeptides, which are related to surfactant and insecticidal properties and are typical of P. chlororaphis (Raaijmakers et al., 2006). However, PcPCL1606 carries a complete Fit toxin (fit) cluster (Calderón et al., 2015).

#### FUTURE PROJECTS AND RESEARCH

The future of P. chlororaphis as a biocontrol agent is very promising. P. chlororaphis is ubiquitous in the environment, lacks known toxic or allergenic properties, and has a history of safe use in agriculture and in food and feed crops. P. chlororaphis is considered non-pathogenic to humans, wildlife or the environment by the United States Environmental Protection Agency (EPA), and commercial products based on P. chlororaphis strains are already available. For example, Cedomon<sup>®</sup> (P. chlororaphis, BioAgri AB, Sweden), Spot-Less<sup>®</sup> (P. aureofaciens Tx-1, Turf Science Laboratories, Carlsbad, United States) and AtEze<sup>®</sup> (P. chlororaphis 63-28, Turf Science Laboratories, Carlsbad, United States) are based on P. chlororaphis strains, and many other products based on other Pseudomonas spp. are already on the market. These facts point to a promising future for the use of biocontrol agents belonging to the species P. chlororaphis.

Regarding the model bacterium PcPCL1606, studies revealed that PcPCL1606, as well as other P. chlororaphis isolates from avocado roots, displayed high persistence and reached a population density sufficient to reduce disease (González-Sánchez et al., 2013). Under commercial greenhouse conditions, applications of PcPCL1606 cells resulted in biocontrol of R. necatrix. Moreover, other P. chlororaphis isolates from avocado roots that have different beneficial traits (such as phenazine production or plant growth promotion) could also provide plant protection. These findings suggest that a promising approach to improving P. chlororaphis-based biocontrol would be to develop consortia that combine strains with complementary traits, resulting in more stable or even enhanced beneficial effects on plants.

#### AUTHOR CONTRIBUTIONS

EA and FC designed the review content. EA, ST, CV, AV, and FC wrote the manuscript. All authors read and approved the final manuscript.

### FUNDING

This research was supported by the Spanish Plan Nacional I+D+I, grant AGL2017-83368-C2-1-R, and partially supported by the European Union (FEDER). CV and ST were supported by a grant from FPI, Ministerio de Ciencia e Innovación, Spain.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Arrebola, Tienda, Vida, de Vicente and Cazorla. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Xanthomonas citri pv. viticola Affecting Grapevine in Brazil: Emergence of a Successful Monomorphic Pathogen

Marisa A. S. V. Ferreira<sup>1</sup> \*, Sophie Bonneau<sup>2</sup> , Martial Briand<sup>2</sup> , Sophie Cesbron<sup>2</sup> , Perrine Portier<sup>2</sup> , Armelle Darrasse<sup>2</sup> , Marco A. S. Gama<sup>3</sup> , Maria Angélica G. Barbosa<sup>4</sup> , Rosa de L. R. Mariano<sup>3</sup> , Elineide B. Souza<sup>3</sup> and Marie-Agnès Jacques<sup>2</sup> \*

<sup>1</sup> Departamento de Fitopatologia, Universidade de Brasília, Brasília, Brazil, <sup>2</sup> IRHS, INRA, AGROCAMPUS-Ouest, SFR4207 QUASAV, Université d'Angers, Beaucouzé, France, <sup>3</sup> Laboratório de Fitobacteriologia, Departamento de Agronomia, Universidade Federal Rural de Pernambuco, Recife, Brazil, <sup>4</sup> Embrapa Semiarido, Petrolina, Brazil

#### Edited by:

Giorgio Gambino, Institute for Sustainable Plant Protection, Italian National Research Council (IPSP-CNR), Italy

#### Reviewed by:

Sang-Wook Han, Chung-Ang University, South Korea; Fabiano Sillo, University of Turin, Italy

#### \*Correspondence:

Marisa A. S. V. Ferreira, marisavf@unb.br; Marie-Agnès Jacques, marie-agnes.jacques@inra.fr

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Plant Science

Received: 29 November 2018 Accepted: 29 March 2019 Published: 18 April 2019

#### Citation:

Ferreira MASV, Bonneau S, Briand M, Cesbron S, Portier P, Darrasse A, Gama MAS, Barbosa MAG, Mariano RdLR, Souza EB and Jacques M-A (2019) Xanthomonas citri pv. viticola Affecting Grapevine in Brazil: Emergence of a Successful Monomorphic Pathogen. Front. Plant Sci. 10:489. doi: 10.3389/fpls.2019.00489

The pathovar viticola of Xanthomonas citri causes bacterial canker of grapevine. This disease was first recorded in India in 1972, and later in Brazil in 1998, where its distribution is currently restricted to the northeastern region. A multilocus sequence analysis (MLSA) based on seven housekeeping genes and a multilocus variable number of tandem repeat analysis (MLVA) with eight loci were performed in order to assess the genetic relatedness among strains from India and Brazil. Strains isolated in India from three related pathovars affecting Vitaceae species and pathogenic strains isolated from Amaranthus sp. found in bacterial canker-infected vineyards in Brazil were also included. MLSA revealed a lack of diversity in all seven genes and grouped grapevine and Amaranthus strains in a monophyletic group in X. citri. The VNTR (variable number of tandem repeat) typing scheme conducted on 107 strains detected 101 haplotypes. The total number of alleles per locus ranged from 5 to 12. A minimum spanning tree (MST) showed that Brazilian strains were clearly separated from Indian strains, which showed unique alleles at three loci. The two strains isolated from symptomatic Amaranthus sp. presented unique alleles at two loci. STRUCTURE analyses revealed three groups congruent with the MST and a fourth group with strains from India and Brazil. Admixture among populations was observed in all groups. MST, STRUCTURE and e-BURST analyses showed that the strains collected in 1998 belong to two distinct groups, with predicted founder genotypes from two different vineyards in the same region. This suggests that one introduction of grape planting materials contaminated with genetically distinct strains took place, which was followed by pathogen adaptation. Genome sequencing of one Brazilian strain confirmed typical attributes of pathogenic xanthomonads and allowed the design of a complementary VNTR typing scheme dedicated to X. citri pv. viticola that will allow further epidemiological survey of this genetically monomorphic pathovar.

Keywords: MLVA, MLSA, Vitis vinifera, grapevine bacterial canker, Xanthomonas campestris pv. viticola

## INTRODUCTION

fpls-10-00489 April 17, 2019 Time: 16:54 # 2

Xanthomonas citri pv. viticola, the causal agent of grapevine bacterial canker, was first described in India as Pseudomonas viticola sp. nov. (Nayudu, 1972). For many years, its occurrence was restricted to India and the disease was regarded as being of secondary importance until outbreaks in the late 1980s (Chand and Kishun, 1990). In 1998, a new disease was reported affecting vines of Vitis vinifera cultivar Red Globe in the irrigated areas of the São Francisco River valley in Pernambuco and Bahia states, northeastern Brazil. This region accounts for a significant percentage of table grape production in Brazil. Disease symptoms were leaf spots and cankers observed on stems, twigs and petioles. The causal agent was identified through biochemical and pathogenicity tests as Xanthomonas campestris pv. viticola. Additionally, rep-PCR fingerprinting analysis of strains collected in Brazil showed highly similar profiles to the Indian pathotype strain (NCPPB 2475) (Trindade et al., 2005). Infected grapevines were later detected in other states in Brazil (Halfeld-Vieira and Nechet, 2006; Rodrigues Neto et al., 2011), and eradication procedures were adopted since the pathogen is of quarantine significance and subject to regulatory measures. Besides India and Brazil, the pathogen was reported in Africa in 2005 (Midha and Patil, 2014). The pathogen may be disseminated by infected propagating material, and an association with seeds and berries was demonstrated, suggesting systemic colonization and spread (Tostes et al., 2014). Natural hosts of pathovar viticola are V. vinifera varieties. Nayudu (1972) also reported natural infection of Azadirachta indica (neem, Meliaceae) and Phyllanthus maderaspatensis (Euphorbiaceae), which may represent alternative sources of inoculum for infection of grapevines. Plants in the Anacardiaceae family, such as mango (Mangifera indica), have also been described as potential hosts through inoculation (Chand and Kishun, 1990). 
In Brazil, some weed species belonging to the genera Alternanthera, Amaranthus, Glycine, and Senna have been identified as potential alternative hosts as well (Peixoto et al., 2007).

Diagnosis of grapevine bacterial canker is based on symptom observation followed by bacterial isolation and identification tools, including induction of a hypersensitive reaction (HR) on tomato leaves, pathogenicity tests on susceptible varieties, serology with polyclonal antibodies and/or molecular identification tests based on PCR (Trindade et al., 2005, 2007; Gama et al., 2018). Primers have been designed on partial sequences of the hrp cluster that differentiate Xanthomonas strains at both pathovar and species levels (Leite et al., 1994) and were shown to be useful for detection and identification of pathovar viticola in culture and plant tissue (Trindade et al., 2007).

The pathovar viticola, a non-pigmented xanthomonad, has been referred to as Xanthomonas campestris sensu lato since it was not included in the Xanthomonas reclassification study of Vauterin et al. (1995). Sequence analysis of the housekeeping gene gyrase B (gyrB) for over 200 xanthomonads, including 67 poorly characterized pathovars of X. campestris, placed the pathotype strain from India and three pathovars associated with hosts formerly classified in the genus Vitis in the X. citri subsp. citri clade (Parkinson et al., 2009), along with several members of group 9.5 of X. axonopodis, such as pathovars citri, glycines, and mangiferaeindicae (Ah-You et al., 2009; Mhedbi-Hajri et al., 2013). Consistently, it was recently included in the newly proposed X. citri species that encompasses the so-called 9.5 and 9.6 groups (Rademaker et al., 2005; Constantin et al., 2016). A taxonomic repositioning as X. citri pv. viticola comb. nov. has been proposed (Gama et al., 2018). In addition, phylogenomic analysis revealed that several pathovars, including pathovar viticola, form a monophyletic cluster and belong to one species, X. citri (Bansal et al., 2017).

Multilocus variable number of tandem repeats analysis (MLVA) is a high-resolution method for monitoring epidemics and assessing population structure and diversity for many bacterial species. A typical variable number of tandem repeats (VNTR) locus shows a large range of copy numbers even among highly related bacterial strains (Jackson, 2010). MLVA has been used as a typing tool in outbreaks of numerous human and animal pathogens, but also in food microbiology, for example in the winemaking process (Claisse and Lonvaud-Funel, 2012). For human pathogens of medical interest, it has been regarded as a powerful tool for outbreak detection and source tracing in several European countries (Lindstedt et al., 2013). Resources for the discovery of polymorphic loci, such as VNTR databases, are freely accessible (Chang et al., 2007). For plant-associated bacteria, several MLVA schemes have been described for important pathogens, including several species and pathovars of Xanthomonas spp. such as X. citri pv. citri (Bui Thi Ngoc et al., 2009; Pruvost et al., 2014; Leduc et al., 2015); X. oryzae (Poulin et al., 2015); X. arboricola (Cesbron et al., 2014; Essakhi et al., 2015; López-Soriano et al., 2016); X. fragariae (Gétaz et al., 2018); and X. axonopodis pv. manihotis (Arrieta-Ortiz et al., 2013). VNTR typing has been recognized as the best tool to type recently emerged bacteria with limited genetic diversity and to better understand their patterns of long-distance dissemination (Bühlmann et al., 2013; Cunty et al., 2015; Nakato et al., 2018).

The objectives of this study were to assess the genetic relatedness among Xanthomonas citri pv. viticola strains from India and Brazil, related pathovars (pv. vitiscarnosae, vitistrifoliae, vitiswoodrowii) affecting other host plants in the family Vitaceae, and three pathogenic strains from Amaranthus sp. collected close to bacterial canker-infected grapevines in Brazil. We characterized pathovar viticola strains based on a concatenated sequence of seven housekeeping genes (MLSA) to allow comparisons with strains from other pathovars. An MLVA typing scheme with eight VNTR loci derived from X. citri pv. citri was validated with 107 strains of pathovar viticola and used to assess the genetic structure of this pathovar in Brazil. Genome sequencing of one Brazilian strain confirmed typical attributes of pathogenic xanthomonads and allowed the design of a complementary VNTR typing scheme dedicated to X. citri pv. viticola that will allow further epidemiological survey of this genetically monomorphic pathovar.

### MATERIALS AND METHODS

#### Bacterial Strains


A collection of 102 strains isolated from grapevine (Vitis vinifera) and three strains isolated from symptomatic Amaranthus sp. plants growing in bacterial canker-infected vineyards in Brazil were used in this study (**Table 1**). These strains were isolated over a period of 14 years (1998–2012). The pathovar viticola pathotype strain from India, two other Indian strains from grapevine (V. vinifera) and the pathotype strains (CFBP 7658, CFBP 7659, and CFBP 7657) of the three pathovars affecting other species in the Vitaceae family (vitiscarnosae, vitistrifoliae, and vitiswoodrowii, respectively) were also included in this study. These six strains were isolated in India from 1951 to 1972 (**Table 1**). The strains isolated from grapevine plants in Brazil were previously identified by PCR with specific primers targeting a 240 bp-sequence of the hrcN (hrpB) gene (Trindade et al., 2007) and tested for hypersensitive response (HR) induction on tomato and/or pathogenicity on a susceptible grapevine cultivar.

All strains were recovered on LPGA medium (yeast extract 7 g liter<sup>−1</sup>; peptone 7 g liter<sup>−1</sup>; glucose 7 g liter<sup>−1</sup>; agar 15 g liter<sup>−1</sup>; pH 7.2) and transferred to 10% TSA medium (1.7 g liter<sup>−1</sup> tryptone, 0.3 g liter<sup>−1</sup> soybean peptone, 0.25 g liter<sup>−1</sup> glucose, 0.5 g liter<sup>−1</sup> NaCl, 0.5 g liter<sup>−1</sup> K<sub>2</sub>HPO<sub>4</sub>, and 15 g liter<sup>−1</sup> agar). Cells were grown at 28°C for 24 h. For long-term storage, strains were preserved frozen in 40% glycerol at −80°C. DNA was obtained from 1 × 10<sup>7</sup> CFU ml<sup>−1</sup> bacterial cell suspensions with a heating step at 94°C for 10 min before the amplification program.

### Multilocus Sequencing Analysis (MLSA)

Two strains from India (CFBP 7660 and CFBP 7691) and a subcollection of 26 strains from the 105-strain collection from Brazil were selected to represent the diversity in terms of year, host, and geographical origin. Portions of seven housekeeping genes [atpD: ATP synthase beta chain; dnaK: 70-kDa heat shock protein; efp: elongation factor P; fyuA: transmembrane protein (TonB-dependent transporter); glnA: glutamine synthetase I; gyrB: DNA gyrase subunit B; and rpoD: RNA polymerase sigma 70 factor] were amplified by PCR with the primers designed by Mhedbi-Hajri et al. (2013), except for gyrB, from which a 904-bp portion was amplified with the forward primer XgyrB1F (ACGAGTACAACCCGGACAA) and the reverse primer XgyrB1R (CCCATCARGGTGCTGAAGAT) (Young et al., 2008).
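The reverse primer XgyrB1R contains the IUPAC ambiguity code R (A or G), so it is in practice a mixture of two oligonucleotides. The following Python sketch, which is an illustration and not part of the published protocol, expands a degenerate primer into the unambiguous sequences it encodes; the function name `expand_degenerate` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for this example).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text (Young et al., 2008).
XGYRB1F = "ACGAGTACAACCCGGACAA"
XGYRB1R = "CCCATCARGGTGCTGAAGAT"

variants = expand_degenerate(XGYRB1R)
for seq in variants:
    print(seq)
```

The forward primer has no ambiguity codes and therefore expands to itself only.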

PCR amplifications were performed in a 25-µl reaction containing 1X Go Taq Buffer (Promega), 200 µM dNTPs, 0.5 µM of each primer, 0.375 U of Go Taq Polymerase, and 5 µl of boiled bacterial cell suspension. The amplification program was carried out in a PE 9600 thermocycler (Applied Biosystems) with an initial denaturation at 94°C for 5 min; 35 cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s (or 62°C for efp), and extension at 72°C for 1 min; and a final extension at 72°C for 7 min. Quality and yield of PCR products were checked by loading 5 µl of the reaction on 1% agarose gels in 1X Tris acetate EDTA (TAE), followed by staining with ethidium bromide. PCR products (20 µl) from each strain/gene combination were sequenced with reverse and forward primers at Genoscreen (Lille, France). Sequences obtained from forward and reverse primers were assembled and edited using GENEIOUS Pro 4.8.5 (Biomatters, New Zealand). Consensus sequences were generated, and codon-based multiple alignments were obtained using the CLUSTALW (Thompson et al., 1994) application in BioEDIT (Hall, 1999) with default parameters. Initial phylogenetic analyses were performed on individual rpoD and gyrB sequences for comparisons with sequences from Xanthomonas (type and pathotype strains) from the CFBP/PhyloSearch tool database<sup>1</sup> using the Neighbor Joining (NJ) method available in MEGA 5.05 (Tamura et al., 2007). As the concatenated data sets were identical for all 27 strains tested, we selected the sequences from the pathotype strain (CFBP 7660) for comparisons with DNA sequences of 131 strains of X. axonopodis representing 21 pathovars (Mhedbi-Hajri et al., 2013) from all six of Rademaker's genetic groups (Rademaker et al., 2005). Phylogenetic analysis was performed for each gene individually and on the concatenated data set. 
Concatenated alignments of the seven gene sequences, arranged in alphabetical order, were generated in GENEIOUS to a final sequence of 4,759 bp (1–738 for atpD, 739–1485 for dnaK, 1486–1832 for efp, 1833–2473 for fyuA, 2474–3352 for glnA, 3353–4054 for gyrB, and 4055–4759 for rpoD). Separate and concatenated trees were constructed by NJ and maximum-likelihood (ML) reconstruction methods. For the latter, the model of nucleotide substitution was estimated with the hierarchical likelihood ratio test (hLRT) and the Akaike Information Criterion (AICc) to select the best model from 56 candidate models, using Modeltest 3.7 in PAUP (Swofford, 2002). Phylogenetic trees were obtained by the PhyML method, and Xanthomonas campestris pv. campestris strain CFBP5241 (ATCC 33913) was used to root the tree as it is more distantly related to the other xanthomonads (X. citri and related species). The SH test (Shimodaira and Hasegawa, 1999) from the DNAml program in PHYLIP (Felsenstein, 1989) was performed to test whether the ML tree topology based on each separate gene fell within the same confidence limits. For both NJ and ML trees, bootstrap analyses were performed with 1,000 replications and the trees were generated with MEGA 5.05.
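The partition coordinates of the concatenated alignment follow directly from the per-gene alignment lengths. The short Python sketch below recomputes them as a consistency check; the per-gene lengths are derived from the coordinates quoted in the text, and the function name `partitions` is hypothetical.

```python
# Per-gene alignment lengths (bp), derived from the coordinates given in the
# text, in the alphabetical order used for concatenation.
GENE_LENGTHS = {
    "atpD": 738,
    "dnaK": 747,
    "efp": 347,
    "fyuA": 641,
    "glnA": 879,
    "gyrB": 702,
    "rpoD": 705,
}


def partitions(lengths):
    """Map each gene to its 1-based inclusive (start, end) in the concatenation."""
    coords = {}
    start = 1
    for gene, n_sites in lengths.items():
        coords[gene] = (start, start + n_sites - 1)
        start += n_sites
    return coords


PARTS = partitions(GENE_LENGTHS)
TOTAL = sum(GENE_LENGTHS.values())

for gene, (start, end) in PARTS.items():
    print(f"{gene}\t{start}-{end}")
print("total", TOTAL)
```

Coordinates in this form are what most phylogenetic programs expect when a partitioned analysis is set up, although the study itself does not state that a partitioned model was used.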

#### HR Induction and Pathogenicity Tests on Vitis vinifera

Upon isolation from plant material, strains were tested for induction of HR on tomato (Solanum lycopersicum "Santa Clara") by leaf infiltration of a 1 × 10<sup>9</sup> CFU ml<sup>−1</sup> bacterial suspension in sterile distilled water. Pathogenicity of isolated strains was confirmed following infiltration of V. vinifera cv. Red Globe leaves with a bacterial suspension at 1 × 10<sup>8</sup> CFU ml<sup>−1</sup>. Bacterial suspension (100 µL −1 ) was infiltrated into four points of the abaxial surface of the leaves with the aid of a hypodermic syringe without needle. These qualitative tests were conducted with two replicates per isolate. Inoculated plants were maintained in a greenhouse at 28°C and the pathogen was reisolated from typical lesions 7–10 days after inoculation.

<sup>1</sup>http://147.99.127.226/pub/cfbp/user\_tool.php

#### TABLE 1 | Xanthomonas strains used in this study.


<sup>a</sup>States in Brazil: PE, Pernambuco; BA, Bahia; PI, Piauí; PR, Paraná; RR, Roraima; <sup>b</sup>PCR with specific primers targeting a 240 bp-sequence of the hrcN gene (Trindade et al., 2007); <sup>c</sup>na, information not available; <sup>d</sup>not determined; <sup>e</sup> year added to NCPPB.

The three pathotype strains (CFBP 7657, 7658, and 7659) were tested on 90-day-old V. vinifera plants cv. Sauvignon under controlled conditions (28°C, 98% RH, photoperiod of 16 h). Two methods of inoculation were employed: leaf infiltration (two spots per leaf) with 200 µl of a 1 × 10<sup>7</sup> CFU ml<sup>−1</sup> suspension, and deposition of 25 µl of a 1 × 10<sup>8</sup> CFU ml<sup>−1</sup> suspension on the stems at three points after wounding with a needle. Two plants and six leaves per plant were inoculated with each strain. Plants were kept at 100% humidity for 48 h and were evaluated for symptom development until 35 days after inoculation. Three additional plants each were inoculated with pathovar viticola strains from India, CFBP 7660 and CFBP 7694, and one Brazilian strain (CFBP 7764) as positive controls. One plant was inoculated with water and kept as a negative control. All inoculation tests were carried out following quarantine procedures at IRHS, France. The assay was evaluated qualitatively by scoring the presence or absence of necrotic symptoms over 35 days. Isolations from inoculated leaves and stems were attempted 35 days after inoculation, and colony growth was recorded after 48–72 h on 100% TSA medium.

#### Selection of VNTR Loci for MLVA

Bacterial suspensions of each strain were prepared at 1 × 10<sup>7</sup> CFU ml<sup>−1</sup> and were boiled at 95°C for 10 min before PCR. Aliquots of boiled cells were kept at −20°C. Primers for amplification of 14 VNTR loci from X. citri pv. citri (Bui Thi Ngoc et al., 2009) were tested with five strains of pathovar viticola and strain 306 of pathovar citri. The reaction mix contained 1X GoTaq Flexi buffer (Promega), 1.5 or 3.0 mM MgCl<sub>2</sub>, 62.5 µM each dNTP, 0.125 µM of each primer, 0.25 U of GoTaq Flexi DNA polymerase and 1 µl of bacterial cell suspension. Conditions for amplification were as follows: 95°C for 5 min, followed by 32 cycles of 95°C for 30 s; 60, 64 or 68°C, depending on the primer set, for 30 s; and 72°C for 30 s; and a final extension at 72°C for 10 min. PCR products were separated on 2.5% agarose gels in 1X Tris acetate EDTA buffer (TAE) and visualized after ethidium bromide staining. When poor amplification occurred, PCR was optimized by testing different MgCl<sub>2</sub> concentrations (1.5 and 3.0 mM) and annealing temperatures (60, 64, 68°C). VNTR loci were selected based on reproducibility (amplicons of the same size produced in different PCR runs) and polymorphism detection among pathovar viticola strains, verified by gel electrophoresis.

#### VNTR Genotyping

Eight VNTR loci were selected and the forward primers were tagged with one of four fluorescent dyes, 6-FAM, VIC, NED or PET (Eurofins MWG/Operon). Primer pairs were combined in four duplexes, with the respective annealing temperature, as follows: XL-1 FAM and XL-4 VIC, 64°C; XL-13 NED and XL-15 PET, 60°C; XL-3 FAM and XL-5 PET, 64°C; XL-6 NED and XL-8 VIC, 68°C.

PCR products (1 µl of products marked with 6-FAM, VIC or NED; and 2 µl of products marked with PET) were diluted in ultrapure water to a final volume of 32 µl. An aliquot of 2.4 µl was mixed with 0.15 µl of the GeneScan™ 500 LIZ™ size standard (Applied Biosystems) and 9.35 µl formamide in a 96-well tray, followed by denaturation at 94°C for 10 min in a thermocycler. Capillary electrophoresis was conducted in the ABI3130 sequencer using the GeneMapper application. Chromatograms were visualized with PeakScanner™ software v. 1.0 (Applied Biosystems). Fragment sizes were estimated and converted into copy number for each VNTR. To confirm the copy number for each VNTR locus, PCR products of strains CFBP 7660 and CFBP 7764 were sequenced at Genoscreen. Sequences were edited with GENEIOUS and the search tools were used to detect the tandem repeat sequences.
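The conversion of a fragment size into a repeat copy number can be sketched as follows. The text does not give the exact formula used, so this is an assumption: the amplicon is taken to consist of the flanking sequence (including primers) plus an integer number of repeat units, and the estimate is rounded to the nearest integer because capillary sizing is approximate. The flank and repeat lengths in the example are hypothetical values, not those of the XL loci.

```python
def copy_number(fragment_bp, flank_bp, repeat_bp):
    """Estimate the number of tandem repeat units in an amplicon.

    fragment_bp: fragment size estimated by capillary electrophoresis
    flank_bp:    combined length of both flanking regions, primers included
    repeat_bp:   length of one repeat unit
    """
    if repeat_bp <= 0:
        raise ValueError("repeat_bp must be positive")
    estimate = (fragment_bp - flank_bp) / repeat_bp
    # Sizing error is usually well below one repeat unit, so round to nearest.
    return max(0, round(estimate))


# Hypothetical locus: 120 bp of flanking sequence and a 7-bp repeat unit.
for size in (162.3, 169.4, 176.0):
    print(size, "->", copy_number(size, flank_bp=120, repeat_bp=7))
```

Sequencing a few reference amplicons, as was done for CFBP 7660 and CFBP 7764, is what anchors such a conversion to the true copy number.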

#### Stability Test

In order to test the stability of the pathovar viticola VNTR types after successive culture transfers, four distinct strains were tested: three pathovar viticola strains (CFBP 7660, 7764, and 5869) and X. axonopodis pv. citri strain 306. A starter bacterial cell suspension (1 × 10<sup>8</sup> CFU ml<sup>−1</sup>) was prepared and 50 µl were transferred to 5 ml of 10% TS liquid medium. After 24 h of growth at 28°C, a new aliquot of 50 µl was transferred to a new tube. This procedure was repeated every 24 h for 4 days. Each day, a bacterial suspension of 1 × 10<sup>7</sup> CFU ml<sup>−1</sup> was prepared from each culture and tested for the eight VNTR markers. Serial dilutions and colony counts were performed on 10% TSA after 48 h to assess the number of generations from the starter culture.
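The number of generations per transfer can be estimated from colony counts as the base-2 logarithm of the ratio between final and initial cell densities. The Python sketch below illustrates this under stated assumptions: the 50 µl into 5 ml transfer is taken from the text, whereas the stationary-phase density of 1 × 10^9 CFU ml^-1 is a hypothetical value chosen only for the example.

```python
import math


def density_after_transfer(density, inoculum_ml=0.05, medium_ml=5.0):
    """Cell density (CFU/ml) right after transferring inoculum into fresh medium."""
    return density * inoculum_ml / (inoculum_ml + medium_ml)


def generations(n_start, n_end):
    """Number of doublings needed to grow from n_start to n_end (same units)."""
    if n_start <= 0 or n_end <= 0:
        raise ValueError("densities must be positive")
    return math.log2(n_end / n_start)


# Hypothetical culture that regrows to 1e9 CFU/ml after each 24-h passage.
stationary = 1e9
start = density_after_transfer(stationary)
per_transfer = generations(start, stationary)
print(f"{per_transfer:.2f} generations per transfer")
print(f"{4 * per_transfer:.1f} generations over four transfers")
```

With a 1:101 dilution at each passage, a culture that returns to the same density undergoes about 6.7 doublings per transfer; the measured colony counts would replace the assumed density in a real calculation.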

#### Analyzing VNTR Data

The VNTR data obtained from 107 strains of pathovar viticola were analyzed with BioNumerics (version 6.5, Applied Maths, Sint-Martens-Latem, Belgium). The copy numbers for each VNTR were used as character data and submitted to cluster analysis. A minimum spanning tree (MST) was generated. This tool creates a tree that connects all strains in such a way that the summed distance of all branches is minimized. Clonal complexes were defined using BioNumerics. The Bayesian clustering approach was used to infer population structure and assign individuals to groups characterized by distinct allele frequencies (Pritchard et al., 2000). It was implemented in the software STRUCTURE 2.3.4. The method estimates a probability of ancestry for each individual from each of the groups. Individuals are assigned to one cluster, or jointly to two or more clusters if their genotypes indicate that they are admixed. Twenty independent runs of STRUCTURE were performed by setting the number of subpopulations or groups (K) from 1 to 10, with 10,000 burn-in replicates and a run length of 20,000 replicates, to decide which value of K best fits the data (Evanno et al., 2005). Clustering of isolates of pathovar viticola was evaluated for the inferred number of groups. STRUCTURE was run using the admixture model without prior population information, which assumes correlated allele frequencies for our MLVA data. The founder genotype, which is the one from which most single locus variants (SLV) arose (Feil et al., 2004; Spratt et al., 2004), was identified using eBURST v3<sup>2</sup>. The discriminatory power of MLVA was calculated using an online tool<sup>3</sup>.
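To make the MST idea concrete, the following Python sketch builds a minimum spanning tree from VNTR profiles using a categorical distance (the number of loci at which two profiles differ) and Prim's algorithm. It is a simplified illustration, not the BioNumerics implementation, which applies additional priority rules. The sketch also computes a diversity index with the Hunter and Gaston formula; the text only says that an online tool was used, so treating the discriminatory power as this index is an assumption. All strain names and profiles are hypothetical.

```python
from collections import Counter


def categorical_distance(a, b):
    """Number of loci at which two VNTR profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns a list of (strain_u, strain_v, distance) edges."""
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge joining a strain in the tree to one outside it.
        dist, u, v = min(
            (categorical_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


def hunter_gaston_index(profiles):
    """Simpson-type discriminatory index: 1 - sum(n_j(n_j-1)) / (N(N-1))."""
    counts = Counter(tuple(p) for p in profiles.values())
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two strains are required")
    return 1 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical copy-number profiles at four loci.
PROFILES = {
    "s1": (5, 7, 3, 9),
    "s2": (5, 7, 3, 10),
    "s3": (5, 8, 3, 10),
    "s4": (6, 8, 4, 10),
    "s5": (5, 7, 3, 9),
}

EDGES = minimum_spanning_tree(PROFILES)
HGDI = hunter_gaston_index(PROFILES)
print(EDGES)
print(round(HGDI, 3))
```

In this toy data set, strains s1 and s5 share a haplotype, so they are joined by an edge of length zero and the index falls below 1.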

#### CFBP 7764 Genome Sequencing and in silico Design of New VNTRs

The genome of Xanthomonas citri pv. viticola strain CFBP 7764 was sequenced using the Illumina HiSeq 2000 platform (Genoscreen, France). The previously published genomic sequence of another X. citri pv. viticola strain, LMG 965 (GCA\_000723725.1) (Midha and Patil, 2014), was used for comparison. Annotation of both genomes was performed using EuGene-PP (Sallet et al., 2014). The genome sequences were mined for CDSs encoding functions of interest for xanthomonads. A set of almost 1,800 CDSs identified mostly in xanthomonads, but also in various pathogenic bacterial genera (Xanthomonas, Pseudomonas, Ralstonia, Erwinia, Escherichia, Salmonella), was used to screen for homologs of these proteins using tBLASTN (identity higher than 80% on at least 80% of CDS length). Genes encoding proteins involved in chemotaxis, motility, lipopolysaccharide and exopolysaccharide biosynthesis, TonB-dependent transporters (TBDTs), two-component systems (TCSs), the different secretion systems (T1SS, T2SS, T3SS, T4SS, T6SS) and their effectors, fibrillar and afibrillar adhesins, and insertion sequences (ISs) belonging to different families were included in this list. Furthermore, reciprocal tBLASTN searches (identity higher than 80% on at least 80% of CDS length) were performed between the 4,572 and 4,233 CDSs that were predicted in the CFBP 7764 and LMG 965 genomes, respectively.
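The identity and coverage thresholds described above can be expressed as a small filter. This is a sketch under stated assumptions, not the pipeline used in the study: each hit is a plain dictionary whose keys `pident`, `qstart`, `qend`, and `qlen` follow the names of BLAST+ tabular output fields, while the `query` and `subject` keys and all function names are hypothetical.

```python
def passes_thresholds(hit, min_identity=80.0, min_coverage=0.80):
    """Identity higher than 80% over at least 80% of the query CDS length."""
    aligned = hit["qend"] - hit["qstart"] + 1
    coverage = aligned / hit["qlen"]
    return hit["pident"] > min_identity and coverage >= min_coverage


def reciprocal_pairs(a_vs_b, b_vs_a):
    """Pairs (cds_a, cds_b) supported by hits in both search directions."""
    forward = {(h["query"], h["subject"]) for h in a_vs_b if passes_thresholds(h)}
    backward = {(h["subject"], h["query"]) for h in b_vs_a if passes_thresholds(h)}
    return forward & backward


def without_ortholog(cds_ids, pairs, position=0):
    """CDSs of one genome that appear in no reciprocal pair.

    position: 0 for the first genome of each pair, 1 for the second.
    """
    matched = {pair[position] for pair in pairs}
    return sorted(set(cds_ids) - matched)
```

Requiring support in both directions is a simplification; it treats every hit above the thresholds as acceptable rather than keeping only the best hit per CDS.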

<sup>2</sup>http://eburst.mlst.net/

The Tandem Repeats Finder web tool (Benson, 1999) was used to search for variable number of tandem repeats (VNTRs) in the genome of CFBP 7764. Selected VNTRs had at least two copies, a period size shorter than 100 nucleotides, and at least 95% matches between the different copies. VNTRs were then checked on the LMG 965 genome, and only VNTRs that had a different number of copies in the two genomes were selected. Primers conserved in both strains were designed in the 500 bp flanking regions using the Primer3 web site (Untergasser et al., 2012) in order to amplify DNA fragments with a final size between 100 and 350 bp, compatible with the ABI3130 capillary electrophoresis sequencer.
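The selection criteria above amount to a simple filter over candidate repeats. The sketch below assumes that each candidate has already been parsed into a record; the `TandemRepeat` class, its field names, and the example values are hypothetical and do not reproduce the Tandem Repeats Finder output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TandemRepeat:
    locus: str              # hypothetical locus label
    period: int             # repeat unit length in nucleotides
    copies_cfbp7764: float  # copy number in the CFBP 7764 genome
    copies_lmg965: float    # copy number at the same locus in LMG 965
    percent_matches: float  # percentage of matches between copies


def is_candidate(tr, min_copies=2, max_period=100, min_matches=95.0):
    """Apply the four selection criteria described in the text."""
    return (
        tr.copies_cfbp7764 >= min_copies
        and tr.period < max_period
        and tr.percent_matches >= min_matches
        and tr.copies_cfbp7764 != tr.copies_lmg965
    )


def amplicon_size_ok(size_bp, smallest=100, largest=350):
    """Amplicon length window compatible with capillary electrophoresis."""
    return smallest <= size_bp <= largest


# Invented records, for illustration only.
EXAMPLES = [
    TandemRepeat("locus_1", 7, 12.0, 9.0, 100.0),   # kept
    TandemRepeat("locus_2", 7, 6.0, 6.0, 100.0),    # same copy number in both
    TandemRepeat("locus_3", 120, 4.0, 3.0, 98.0),   # period too long
    TandemRepeat("locus_4", 9, 5.0, 4.0, 90.0),     # copies too divergent
]
SELECTED = [tr.locus for tr in EXAMPLES if is_candidate(tr)]
```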

#### Nucleotide Sequence Accession Numbers

The GenBank accession numbers for the partial sequences of the grapevine strain CFBP 7764 and the Amaranthus strain Am-1 used in this study are, respectively: for atpD MH171285 and MH171286; for dnaK MH171287 and MH171288; for efp MH171289 and MH171290; for fyuA MH171291 and MH171292; for glnA MH171293 and MH171294; for gyrB MH171295 and MH171296; and for rpoD MH171297 and MH171298.

#### Accession Number

The whole-genome shotgun sequence of Xanthomonas citri pv. viticola strain CFBP 7764 has been deposited in GenBank under accession no. PPHE00000000.

#### RESULTS

#### Xanthomonas Strains Causing Grapevine Bacterial Canker in Brazil Belong to X. citri pv. viticola and Are Monomorphic Based on MLSA

Neighbor Joining tree-based phylogeny determined from rpoD and gyrB concatenated sequences alignments showed the relatedness among types and pathotypes of 15 Xanthomonas species and 27 strains of pathovar viticola and strains of the three other pathovars affecting plants in the Vitaceae family (**Figure 1**). All pathovar viticola strains (**Table 1**) had identical sequences for both gene sequences, including the Brazilian and Indian strains and two strains from Amaranthus sp. These strains, as well as those from the related pathovars affecting Vitaceae species from India, were assigned to the newly described X. citri species that encompasses the previously described 9.5 and 9.6 groups. The pathovar viticola strains were all distinct from these other pathovars. Based on housekeeping gene sequences, the closest relative to pathovar viticola is pathovar vitistrifoliae.

Sequences of all seven genes were identical for all strains collected from grapevine and from Amaranthus. Therefore, only one sequence type was used for comparisons with 131 gene sequences from X. axonopodis pathovars from Mhedbi-Hajri et al. (2013) and sequences from the three Vitaceae-associated pathotypes. ML (**Figure 2**) and NJ trees were constructed based on the 4,759 bp concatenated sequences of the seven genes. Both trees showed congruent assignments for most pathovars according to Rademaker's genetic groups 9.1–9.6, except for pathovars alfalfae and allii from the 9.2 group. ML trees showed higher bootstrap values compared to the NJ trees. Both methods assigned the pathovar viticola and the other related pathovars to the 9.5 clade. The SH test performed on the ML trees showed that the concatenated tree topology was congruent with the individual gene trees for five genes, but not for glnA and rpoD (**Supplementary Table S1**). For these two genes, different positions of pathovars in the 9.1 and 9.2 groups were evident, suggesting the occurrence of recombination events.

<sup>3</sup>http://insilico.ehu.es/mini\_tools/discriminatory\_power/index.php

Xanthomonas species. Bootstrap values (1,000 replicates) are shown at each node. The 28 strains include 24 strains isolated in Brazil from bacterial canker-infected grapevines, two (Am-1 and Am-3) from Amaranthus plants grown in the vicinity of grapevines, and two strains from India (CFBP 7660 and CFBP 7691). CFBP 7660 is the pathotype strain of X. citri pv. viticola.

FIGURE 2 | Maximum likelihood tree of 131 Xanthomonas strains based on the 4,759 bp concatenated sequences of atpD, dnaK, efp, fyuA, glnA, gyrB, and rpoD. The tree was constructed with PhyML, and bootstrap values higher than 50 (1,000 replicates) are shown at each node. Sequences of the pathotype strain (CFBP 7660) of X. citri pv. viticola were compared to sequences of 131 strains formerly assigned to X. axonopodis representing 21 pathovars (Mhedbi-Hajri et al., 2013) and all six Rademaker's genetic groups 9.1–9.6 (Rademaker et al., 2005). Correspondence between Rademaker's groups and the four Xanthomonas species (according to Constantin et al., 2016) is indicated. Xanthomonas campestris pv. campestris strain CFBP 5241 (ATCC 33913) was included as outgroup.

FIGURE 3 | Pathogenicity test on Vitis vinifera cultivar Sauvignon carried out by leaf and stem inoculations with Xanthomonas strains. (A) Symptoms 35 days after inoculation of: CFBP 7764 on leaf and stem (a,b); CFBP 7657 (c), CFBP 7658 (d), CFBP 7659 (e) and negative control at 21 days after infiltration (f). (B) Symptom development was recorded as (+) necrosis at the point of infiltration; (++) necrosis at the point of infiltration followed by multiple necrotic spots on the leaves and leaf veins, or development of canker-like lesions on the stems; strain CFBP 7694 was received as X. campestris pv. viticola, but is related to X. hortorum according to gyrB and rpoD sequencing (as shown in Figure 1). Scale bar = 1.0 cm.

Based on MLSA the four pathovars affecting distinct species in the family Vitaceae were distinct from each other, while still belonging to X. citri, more precisely to the 9.5 Rademaker's group (**Figure 2**). Pathovars vitistrifoliae and viticola fell into one clade supported by high bootstrap values. On the other hand, pathovar vitiswoodrowii fell into a different clade, closest to pathovar bilvae. These three strains from India are reported as non-pathogenic on V. vinifera. However, due to the differences in their phylogenetic positions (**Figure 2**) we tested them for pathogenicity on V. vinifera, cv. Sauvignon. These pathogenicity tests confirmed the non-host status of V. vinifera only for pathovar vitiswoodrowii. Symptoms developed on stems of plants inoculated with pathovars vitiscarnosae and vitistrifoliae. For pathovar vitiscarnosae, necrotic spots at the point of infiltration also developed on leaves (**Figure 3A**). After 35 days, the bacterium was isolated from both leaves and stems of all plants, but isolations were unsuccessful from plants inoculated with pathovar vitiswoodrowii. While symptoms incited by these two pathovars were mild and did not progress beyond the point of infiltration, the plants inoculated with pathovar viticola strains CFBP 7660 and 7764 showed more severe symptoms. Besides leaf perforation, several spots appeared on the leaf veins and in interveinal areas of the leaf that gradually enlarged becoming necrotic. Grape leaves inoculated with pathovar vitistrifoliae did not show any symptoms, but the isolation was positive, yielding pure colonies of the bacterium (**Figure 3B**).

It should be noted that strain CFBP 7694 (NCPPB 3642), received as Xanthomonas campestris pv. viticola, was assigned to the X. hortorum clade, and for this reason it was not included in the MLVA study. In contrast to all pathovar viticola strains, which are non-pigmented, CFBP 7694 was a yellow-pigmented strain isolated in India and added to NCPPB in 1990. Due to its atypical characteristics, this strain was also tested for pathogenicity on grapevine plants. The pathogenicity test showed that it is not pathogenic on this host. This strain was identified as X. hortorum, according to gyrB and rpoD sequence analysis (**Figure 1**). No symptoms were observed on leaves or stems 35 days after inoculation. However, bacterial colonies were isolated from inoculated leaves, suggesting that this strain can survive in grape leaves (**Figure 3B**).

#### VNTR Markers From X. citri pv. citri Have Sufficient Resolution to Detect Diversity in Pathovar viticola

Out of 14 VNTR loci from X. citri pv. citri (Bui Thi Ngoc et al., 2009), 13 were PCR-amplified from DNA of pathovar viticola strains. Primers for marker XL-2 did not produce any visible fragments on agarose gels. Using a subset of five pathovar viticola strains and strain 306 of X. axonopodis pv. citri, polymorphism was observed with eight markers (**Table 2**). The stability of these markers was checked in vitro after 32 generations for four strains, three pathovar viticola strains (CFBP 7660, 7764 and 5869) and X. axonopodis pv. citri strain 306. No variation in fragment sizes was observed throughout the experiment. Hence, all eight markers remained stable after 32 generations of these four strains.

TABLE 2 | Number of alleles, range of repeat numbers, strain frequency for each dominant allele and allelic diversity for the eight VNTR loci tested on 107 pathovar viticola strains.


7691), based on MLVA with 8 VNTR markers. The circles represent a MLVA type. The types that are connected by a thick solid line differed by 1 VNTR locus; MLVA types connected by thin solid lines differed by 2–3 VNTR loci, and the types that differed by 4 or more loci are connected by dashed and dotted lines. (A) The gray zone represents clonal complexes comprising MLVA types that differ from one another by one locus, (B) the gray zone groups types that differ by one or two loci.

### MLVA Typing Revealed Diversity in the Pathogen Population From Brazil

In a collection of 107 pathovar viticola strains, the number of alleles per locus ranged from 5 to 12 and the copy numbers of the repeat sequences ranged from 3 to 29. Four VNTR loci were the most diverse (diversity indexes over 80%): XL-1, -4, -3, and -6. The other four revealed less diversity in the collection. For example, for XL-8, 95 strains (88.8%) presented the same allele (**Table 2**). A total of 101 haplotypes were detected, but none of them was overrepresented in this set of strains. The discriminatory power of the MLVA was calculated and showed a level of discrimination of 0.9563 for 107 typed strains. An MST based on repeat copy numbers shows the relationships among the 107 strains in relation to the year of isolation and a subdivision into several clusters (**Figure 4**). The VNTR markers clearly separated the two Indian strains from the Brazilian strains. These strains from India, isolated in 1969 and 1972, had unique alleles at three loci (XL-4, -8, and -5) and differed from each other by one mismatch at locus XL-1. Three larger clonal complexes composed of strains that differed by only one VNTR were detected in the Brazilian set of strains. Two of these complexes contained older strains, which were isolated in 1998, the year of the first disease outbreak in Brazil. Bayesian clustering performed in Structure supported four groups (K = 4). Analysis of these groups revealed one population with greater admixture containing the Indian strains, two groups containing isolates from 1998 to 2012, and one group with isolates from 2006 to 2012, with some overlap among groups (**Figure 5**). The eBURST algorithm also identified three clusters of related genotypes and several singletons (**Figure 6**). The predicted founders for the three clusters are strains 1193, 1194 and 54. Strains 1193 and 1194 were both isolated from Red Globe vines in 1998 in Petrolina, state of Pernambuco, but from two different vineyards. However, when grouping the strains that shared identical alleles at 6 or 7 loci, one single large clonal complex appeared (**Figures 4B**, **6**). The predicted founder of this larger complex was strain 1194, from which the largest number of single and double locus variants emerged.
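For readers who wish to reproduce this kind of figure, the sketch below computes a discriminatory index from a list of type assignments. It assumes the Hunter and Gaston formulation, D = 1 − [1 / (N(N − 1))] Σ n_j(n_j − 1), where N is the number of strains and n_j the number of strains of type j; whether the online tool used here applies exactly this formula is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Hunter-Gaston index: probability that two strains drawn at random
    without replacement belong to different types."""
    n_strains = len(type_labels)
    if n_strains < 2:
        raise ValueError("at least two strains are required")
    same_type_pairs = sum(n * (n - 1) for n in Counter(type_labels).values())
    return 1.0 - same_type_pairs / (n_strains * (n_strains - 1))
```

The same function can be applied to the alleles observed at a single locus (one label per strain) to obtain a per-locus diversity value.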

The two strains (4779B and 482) that were more geographically distant (i.e., detected in the states of Paraná, south of Brazil, and Roraima, northwest of Brazil, respectively) had unique alleles at loci XL-6 and XL-1, respectively. Several strains appear as singletons not belonging to the three major clonal complexes (eBURST and MST), for example, strain 26 (**Figure 4A**). This strain has one unique allele at locus XL-15 and is also unique with respect to its collection site. It was collected in 2009 in the same municipality (Petrolina) as most others, but it is the only strain in the collection from one specific grape-producing area.

Regarding the host of isolation, the three strains from Amaranthus (Am-1, Am-2, and Am-3) had three distinct MLVA profiles, and these were not identical to those of any of the strains collected from grapevines in the same area and in the same year (P1S5, P1S6, P1S9, P1S12, and P1S16). Strain Am-1, for example, had different copy numbers for 3 VNTRs compared to strains P1S5 and P1S16, which share the same MLVA type. Only one VNTR locus (XL-13) was monomorphic among the three Amaranthus and the five grape strains collected in the same area and year. Interestingly, strain Am-3 and the founder genotype of one of the clonal complexes (strain 1194) were identical at 7 loci (**Figure 4**).
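The locus-by-locus comparisons used here, and the notion of a founder genotype, can be illustrated with a short sketch. It assumes that the founder is simply the MLVA type with the largest number of single locus variants (SLVs); eBURST applies additional tie-breaking rules that are not reproduced, and the type names and profiles below are invented.

```python
def locus_differences(profile_a, profile_b):
    """Number of VNTR loci at which two profiles carry different alleles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def slv_counts(profiles):
    """For each MLVA type, count the other types differing at exactly one locus."""
    return {
        name: sum(
            1
            for other, other_profile in profiles.items()
            if other != name and locus_differences(profile, other_profile) == 1
        )
        for name, profile in profiles.items()
    }


def predicted_founder(profiles):
    """Type with the most SLVs (the first one listed wins in case of a tie)."""
    counts = slv_counts(profiles)
    return max(counts, key=counts.get)


# Hypothetical three-locus MLVA types, for illustration only.
EXAMPLE_TYPES = {
    "T1": (1, 1, 1),
    "T2": (1, 1, 2),
    "T3": (1, 2, 1),
    "T4": (3, 1, 1),
    "T5": (1, 1, 3),
}
```

Two profiles that are identical at 7 of 8 loci, as reported for Am-3 and strain 1194, would have a `locus_differences` value of 1 and therefore count as SLVs of each other.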

#### CFBP 7764 Genome Features Are Typical of Plant Pathogenic Xanthomonads

Strain CFBP 7764 was chosen for whole genome sequencing because it was isolated in Brazil more than 40 years after the Indian pathotype strain. Shotgun sequencing yielded 8,390,830 100-bp paired-end reads with an insert size of 250 bp. A combination of Velvet (Zerbino and Birney, 2008), SOAPdenovo, and SOAP Gapcloser (Luo et al., 2012) yielded 76 contigs (N50, 592,828 bp), with the largest contig being 791,586 bp, for a total assembly size of 5,311,793 bp. The genomic sequence of strain CFBP 7764 showed a typical Xanthomonas gene content (Alegria et al., 2005; Potnis et al., 2011). The genes encoding the main secretion systems described in Gram-negative bacteria were detected in the genome of strain CFBP 7764. Genes encoding at least two T1SSs and two more putative T1SSs were identified. Genes encoding proteins involved in the Tat and Sec pathways, in two complete T2SSs (Xcs and Xps), and 77 putative T2-secreted cell wall degrading enzymes were predicted. The hrp cluster encoding the T3SS-Hrp2 family and 17 T3E genes (avrBs2, xopA, xopAE, xopAI, xopAQ, xopB, xopC2, xopE1, xopE3, xopK, xopL, xopN, xopP, xopQ, xopV, xopX, and xopZ1), a T4SS gene cluster similar to the chromosomal cluster of Xac306 (Alegria et al., 2005), and a single T6SS cluster belonging to group 3 (Potnis et al., 2011) were predicted. Strain CFBP 7764 is fully equipped with genes necessary to sense and move in its environment, to protect itself, and to acquire nutrients, through a complete flagellar system, at least 25 MCPs (methyl-accepting chemotaxis proteins), complete type I and type IV pili, several T5SSs, including fhaB, fhaC, shlB, and yapH, xanthan biosynthesis, and nearly 70 TBDTs. Almost 120 genes encoding TCSs could be involved in the detection of and response to environmental signals. Comparison of the genomic sequences of the two strains of X. citri pv. viticola did not show any differences in these functions. The draft quality of the genome sequences did not allow an exhaustive analysis of IS content. It was, however, possible to observe some diversity between the two strains. CFBP 7764 harbored partial sequences homologous to ISXac2 and ISXc8 from the IS3 family that were not detected in the LMG 965 sequence. Reciprocally, LMG 965 had sequences homologous to IS1477 from the IS5 family that were not detected in CFBP 7764 (**Supplementary Table S2**).
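The N50 value quoted for the assembly follows a standard definition: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. A minimal sketch of that calculation is given below; the function name is hypothetical and the contig lengths of the CFBP 7764 assembly are not reproduced here.

```python
def n50(contig_lengths):
    """Length of the contig at which the cumulative length of contigs,
    sorted from longest to shortest, first reaches half of the assembly."""
    if not contig_lengths:
        raise ValueError("at least one contig is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

Because the statistic depends only on contig lengths, it describes contiguity rather than correctness, which is consistent with the caution expressed above about the draft quality of the sequence.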

As in strain LMG 965, the xanthomonadin biosynthesis gene cluster showed a truncated gene that can explain the white aspect of the colonies, in contrast to the yellow colonies of most species of the genus Xanthomonas (Midha and Patil, 2014). Comparison based on reciprocal tBLASTN searches of the genomic sequences of CFBP 7764 and LMG 965 revealed 26 CDSs predicted in the LMG 965 genome that had no orthologs in CFBP 7764; most of them were probably located on a 16 kb plasmid in LMG 965. Conversely, 233 CDSs predicted in the CFBP 7764 genome had no orthologs in LMG 965 (**Supplementary Tables S2**, **S3**). These CDSs were distributed in several clusters, corresponding to almost 15 whole contigs of various sizes (between 0.8 and 64.4 kb). Around 100 of these 233 CDSs had no orthologs in the NCBI nr database. Most of the remaining CDSs had orthologs in plasmid sequences, such as plasmid pB07007 of X. hortorum strain B07-007, plasmid C of X. citri pv. fuscans strain 4834R, plasmid pICMP7383.2 of X. gardneri, and plasmid pLH3.1 of X. euvesicatoria pv. perforans strain LH3. Apart from numerous CDSs encoding proteins involved in conjugation, these putative plasmids carried CDSs encoding functions such as toxin-antitoxin, restriction and anti-restriction proteins, multidrug efflux systems, and copper resistance genes (**Supplementary Table S3**). A copLAB gene cluster was thus identified in CFBP 7764.

### Availability of Epidemiologically Contrasted Genome Sequences to Design New VNTRs

The eight VNTRs used in this study were initially developed for X. citri pv. citri (Bui Thi Ngoc et al., 2009). These VNTRs were found within the two genomes of X. citri pv. viticola strains, although they were slightly divergent (**Supplementary Tables S4**, **S5**). All VNTRs had a 7-nucleotide repeat motif and were distributed among six different contigs in both genomes. Genome mining showed that five of these eight VNTRs (XL-3, XL-4, XL-5, XL-8, and XL-15) have different repeat numbers in these two strains isolated at a 43-year interval on different continents (**Supplementary Table S4**). Except for XL-5 and XL-6, the numbers of repeats and the number of loci with different copy numbers between the two strains were greater in the experiments (amplicon sequencing) than in the genome mining-based prediction. This was due to degenerate repeats that were not taken into account in the prediction using the Tandem Repeats Finder tool. Taking advantage of these two genome sequences, we designed a set of 32 new VNTRs (**Table 3**). VNTRs were selected based on a repetition number higher than two, a length shorter than 100 bp, a high motif conservation within the VNTR (95%), and different repetition numbers between the two genome sequences. This VNTR scheme included repeats with motifs varying from three to 16 nucleotides and covering 11 different contigs. In particular, five and nine VNTRs were designed in the two large contigs from CFBP 7764 (G102 and G103) that were not targeted by the X. citri pv. citri VNTR scheme, giving a wider representation of the entire genome sequence.

### DISCUSSION

Although it was first described in 1972, the emergence of X. citri pv. viticola as a grapevine pathogen is relatively recent with outbreaks in India (1990) and Brazil (1998). Analysis of a Brazilian collection of strains showed that this pathovar lacks genetic diversity in seven housekeeping genes and confirms its status as a monophyletic pathovar of X. citri species. Further knowledge of the diversity of this pathogen was possible through a MLVA scheme with eight VNTR loci which allowed a better understanding of the genetic structure of the Brazilian strains.

Primers for amplification of VNTR loci in bacterial plant pathogens have been designed from draft or complete genome sequences (Arrieta-Ortiz et al., 2013; Cesbron et al., 2014; Cunty et al., 2015; Poulin et al., 2015) or from genomes of close relatives (Pruvost et al., 2011). VNTR markers developed from a specific pathovar genome can be successfully used for genotyping other pathovars belonging to the same species, as shown for X. arboricola pv. pruni and related pathovars (Cesbron et al., 2014). For pathovar viticola, the genome sequence of the reference strain was not available at the beginning of this study; consequently, VNTR markers designed for the citrus canker pathogen X. citri pv. citri (Xanthomonas axonopodis pv. citri) were tested. Pathovar citri is phylogenetically related to pathovar viticola based on gyrB sequences (Parkinson et al., 2009) and sequences from other housekeeping genes (Gama et al., 2018; this study). A closer relationship between these two pathovars had been previously demonstrated by whole-cell fatty acid methyl esters (FAMEs) following a comprehensive study of 975 xanthomonad strains (Yang et al., 1993). In fact, diseases caused by both pathovars, viticola and citri, were first noted in India, and recent whole genome comparisons confirm that the two pathovars are members of the same species but with different host specificity (Midha and Patil, 2014; Bansal et al., 2017).

Eight out of 14 VNTR markers described for pathovar citri were polymorphic for pathovar viticola. Six out of these eight markers can also reveal polymorphism among strains from the pathovars mangiferaeindicae and malvacearum (Bui Thi Ngoc et al., 2009), which also belong to the rep-PCR group 9.5 (Rademaker et al., 2005; Mhedbi-Hajri et al., 2013). Both pathovars have been included in the newly described X. citri species (Constantin et al., 2016).

Compared to other methods for deciphering population structures and diversity, MLVA has much higher resolution and can be applied to human pathogens that lack diversity in housekeeping genes, i.e., monomorphic pathogens (Achtman, 2008). Among plant pathogens, examples of monomorphic pathogens are Pseudomonas syringae pv. actinidiae biovar 3 (Cunty et al., 2015) and X. citri pv. citri (Pruvost et al., 2014; Leduc et al., 2015). Similarly, the MLSA approach lacks the resolution to distinguish among strains of pathovar viticola. Strains from India and Brazil were identical in all seven genes. Four (dnaK, fyuA, gyrB, and rpoD) out of these seven loci were also used in the MLSA scheme proposed by Young et al. (2008) for species differentiation in Xanthomonas. Consequently, the VNTR markers were chosen to gain an overview of the genetic structure of pathovar viticola strains isolated in Brazil since the 1998 outbreak and to understand how these strains are linked to the Indian strains. The MLVA scheme with eight loci proved to be an efficient tool for discriminating strains that had identical housekeeping gene sequences (**Figure 4**). Even though strains isolated in the same year, in the same location, and from the same cultivar (many isolated from Red Globe vines) are overrepresented in the collection, MLVA had enough resolution to distinguish strains from the same area, strains from a weed host and grapevine, and to distinguish most Brazilian strains from the strains from India.

TABLE 3 | VNTR scheme designed based on CFBP 7764 and LMG 965 genome sequences.

Grapevine bacterial canker is a disease with limited distribution around the globe. It was reported from India more than 40 years ago, but it has gained economic importance only in the last 20–25 years. Serious disease outbreaks occurred in India in the late 1980s and were linked to increases in the area cultivated with susceptible seedless cultivars (Chand and Kishun, 1990). The reported yield losses in severely infected vineyards were up to 60 or 80%. In 1998, the disease was first noted in Brazil, affecting mostly seedless varieties. The pathogen is currently regarded as a quarantine pest in Brazil, and control measures based on surveillance and eradication have been adopted (Naue et al., 2014). The detection of infected plants in other states and regions of the country (Halfeld-Vieira and Nechet, 2006; Rodrigues Neto et al., 2011) reveals pathogen spread by asymptomatic propagating material, which ultimately leads to eradication procedures. In the state of São Paulo, approximately 4,700 plants were destroyed due to a disease outbreak in 2009 (Rodrigues Neto et al., 2011).

A possible introduction event associated with propagating material originating from India has been hypothesized to explain the emergence of this disease in Brazil (Rodrigues Neto et al., 2011). This event should have taken place at least 3 years before the disease outbreak in 1998, since the first symptoms were observed on young vines up to 3 years of age. The lack of sequence variation in seven housekeeping genes among Brazilian and Indian strains shows that, globally, it is a monomorphic pathogen. A genetically monomorphic pathogen may arise from a strong reduction in the population size of the ancestors of the existing strains due to a recent bottleneck (Achtman, 2008). Housekeeping genes encode essential metabolic enzymes for species survival; thus, they may undergo strong purifying selection, as demonstrated for most phylotypes of the plant pathogen Ralstonia solanacearum (Castillo and Greenberg, 2007).

A panel of eight polymorphic VNTR markers derived from X. citri pv. citri was developed for X. citri pv. viticola and showed genetic diversity in a set of 105 strains from Brazil. The high discriminatory power of MLVA revealed patterns of genetic diversity not detected by previous studies with rep-PCR (Trindade et al., 2005; Gama et al., 2018). MST and Structure analyses identified three congruent major genetic groups in the Brazilian collection. The epidemic-related strains from 1998 were separated into two groups, while the two strains from India clustered together. A fourth group (red) detected by Structure (**Figure 5**) was not clearly understood, as it groups the two strains from India with 23 strains from Brazil that were mostly not connected to the major MST clonal complexes and appear as singletons. Furthermore, admixture among populations was observed (**Figure 5**).

Some strains found in the same field, from the same grape cultivar and year of collection, shared the same haplotype (P1S5 and P1S16; 1193 and 1195). However, identical haplotypes were also shared by strains collected from neighboring states (CFBP 7676 and 1192; 191 and 119), which suggests dissemination by the planting or grafting of symptomless contaminated plant material. That would also explain the disease outbreaks in two more distant states in the country (Paraná, in the south, and Roraima, in the north region).

Most Brazilian strains are members of three larger clonal complexes (**Figure 4A**). The predicted founder genotypes of two clonal complexes are strains 1194 and 1193, which were isolated in 1998 from Red Globe vines. These strains are not linked to the two Indian strains isolated in 1969 and in 1972. The fact that the Brazilian strains from 1998 belong to two distinct clonal complexes suggests that the 1998 outbreak of grapevine bacterial canker in Brazil probably occurred through one introduction event of two distinct grapevine planting materials contaminated with genetically distinct strains. The development of the irrigation projects in the São Francisco River valley in Brazil started in the 1970s, and the introduction and exchange of propagating material of different grape varieties occurred over time.

The lack of a more diverse and recent collection from India did not allow us to draw conclusions about the events that led to the emergence of this pathogen in Brazil. We hypothesize that the 1998 outbreak-related strains from Brazil are probably epidemiologically linked to the strains that caused the severe disease outbreaks in India in the late 1980s (Chand and Kishun, 1990), which were highly aggressive on seedless varieties, but not linked to the ancient strains (1969/1972), as shown by the results.

Environments where conditions are variable may favor the existence of more genetically diverse populations, from which new crop strains emerge, often as highly virulent clones (Goss et al., 2013). Alternative hosts harboring potential sources of inoculum may contribute to amplifying the diversity observed in Brazil. In Brazil, xanthomonad-like bacteria have been isolated from several weeds growing in the vicinity of vineyards. Their pathogenicity was confirmed in the original host and in Red Globe grapevines (Peixoto et al., 2007). In the present study we provide further evidence on the identification of three Amaranthus strains, collected in a Red Globe area in 2012. Pathogenicity of these strains on grapevine was confirmed. Based on MLSA these strains have 100% identity to the grape strains, confirming the potential of pathovar viticola to survive and infect weeds such as Amaranthus sp. as alternative hosts. Neem is often employed as a windbreak in vineyards in Brazil and has been described as a natural host in India (Nayudu, 1972). Mango is also grown in the same region in Brazil and can develop symptoms upon inoculation with pathovar viticola (Chand and Kishun, 1990). However, natural populations of pathovar viticola infecting neem or mango have never been reported. Isolations from neem have been unsuccessful (Peixoto et al., 2007), and whether pathovar viticola strains can survive epiphytically and/or infect mango under natural conditions remains unknown.

Genome mining revealed that strain CFBP 7764 had all the genes necessary for a Xanthomonas strain to sense and move in its environment, to protect itself, and to acquire nutrients. The presence of the different types of secretion systems (T1SS to T6SS) and their numerous effectors confirmed the pathogenic nature of strain CFBP 7764. Comparison of the genomic sequences of the two strains of X. citri pv. viticola did not show any differences in these functions but revealed differences mostly in plasmid content. Indeed, sequences that matched one or several plasmids were detected in strain CFBP 7764, and these sequences had no orthologs in LMG 965; reciprocally, the sequences from one plasmid of LMG 965 had no orthologs in CFBP 7764. However, the sequencing technology used did not yield a sequence of sufficient quality to properly assemble the putative plasmids. Plasmids allow phytopathogenic bacteria to maintain a dynamic, flexible genome and may confer an advantage in host–pathogen and other environmental interactions (Sundin, 2007).

We propose a new VNTR scheme, based on the analysis of genomic sequences of two strains representing epidemics from India and Brazil. This scheme could complement the previously proposed X. citri pv. citri VNTR scheme. Indeed, the VNTRs from the newly proposed scheme were chosen specifically to have different copy numbers between the two sequenced strains in order to increase the probability of finding variable loci in Brazilian vs. Indian strain collections. This scheme encompasses a majority of VNTRs with a short repeat motif (≤7) that should be particularly well suited for epidemiologically related strains, as previously mentioned (Pruvost et al., 2014). Furthermore, we designed some VNTRs with a longer repeat motif (up to 16). Together, these should allow epidemiological surveys at various scales, with shorter repeat motifs being suited for small to medium spatio-temporal scales and larger ones for global surveillance (Poulin et al., 2015). This study is the first step toward an MLVA scheme suitable for assessing the genetic structure of pathovar viticola, which may help to identify inoculum sources and understand how this pathogen disseminated at both local and intercontinental scales. Comparing the population diversity of this bacterial pathogen in its native area (India) and in invaded regions in Brazil may contribute to our knowledge of how bacterial plant pathogens emerge and adapt in new environments. Furthermore, the recent availability of two complete genome sequences of pathovar viticola (this study; Lima et al., 2017) will improve our understanding of genome diversity and the relationships among strains from different geographical origins.

#### CONCLUSION

In this study, we used sequences of housekeeping genes to confirm the taxonomic status of strains pathogenic on grapevine and Amaranthus as members of Xanthomonas citri pv. viticola. We demonstrated that pathovar viticola is a well-defined and monophyletic pathovar, distinct from three other pathovars from India that affect plants in the Vitaceae family. Based on MLSA, Brazilian strains do not differ from two ancient strains from India. In contrast, eight polymorphic VNTR markers allowed us to assess the genetic structure of the pathogen in Brazil and suggested one introduction event of two genetically distinct groups of strains that led to adaptation of this pathogen in the country. MLVA showed that Brazilian strains from 1998 and the two ancient Indian strains are not epidemiologically linked. Whole genome comparisons between two strains from India and Brazil, collected 43 years apart, revealed new VNTR markers that could be useful to assess diversity at various scales. Our results provided novel information and insights into how this pathogen emerged in Brazil. Validation of this method with a larger collection of strains, especially from India, could be the subject of future studies. This is the first report of an MLVA scheme for rapidly assessing diversity in this plant pathogen.

#### AUTHOR CONTRIBUTIONS

MF and M-AJ conceived and designed the study. MF and SB performed the multilocus sequencing analysis. AD and MB performed the genome analysis. PP, MG, MAB, ES, and RM contributed with strain collection, characterization and pathogenicity tests. MF and SC designed and performed the VNTR analysis. MF, M-AJ, AD, and SC wrote and critically reviewed the manuscript. All authors read and approved the final manuscript.

### FUNDING

Capes-MEC (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Ministério da Educação), Brazil, provided financial support for MF.

#### ACKNOWLEDGMENTS

We would like to thank Karine Durand, Jacky Guillaumes, and Geraldine Taghouti for their technical assistance. We would also like to thank Muriel Bahut and the ANAN platform from SFR Quasav for VNTR sequencing, and CIRM-CFBP (https://www6.inra.fr/cirm\_eng/CFBP-Plant-Associated-Bacteria) for strain preservation and supply.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00489/full#supplementary-material

#### REFERENCES

and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. manihotis strain CIO151. PLoS One 8:e79704. doi: 10.1371/journal.pone.0079704

bacterium, Xanthomonas citri pv. citri. Mol. Ecol. Res. 9, 125–127. doi: 10.1111/j.1755-0998.2008.02242.x

Xanthomonas citri pv. citri, an emerging citrus pathogen in Mali and Burkina Faso. Environ. Microbiol. 17, 4429–4442. doi: 10.1111/1462-2920.12876

for Xanthomonas. Phytopathology 95, 1098–1111. doi: 10.1094/PHYTO-95-1098

by rep-PCR fingerprinting. Fitopatol. Bras. 30, 46–54. doi: 10.1590/S0100-41582005000100008


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ferreira, Bonneau, Briand, Cesbron, Portier, Darrasse, Gama, Barbosa, Mariano, Souza and Jacques. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


## A Pathovar of Xanthomonas oryzae Infecting Wild Grasses Provides Insight Into the Evolution of Pathogenicity in Rice Agroecosystems

Jillian M. Lang<sup>1,2†</sup>, Alvaro L. Pérez-Quintero<sup>1,2†</sup>, Ralf Koebnik<sup>2</sup>, Elysa DuCharme<sup>1</sup>, Soungalo Sarra<sup>3</sup>, Hinda Doucoure<sup>4</sup>, Ibrahim Keita<sup>4</sup>, Janet Ziegle<sup>5</sup>, Jonathan M. Jacobs<sup>1,2,6</sup>, Ricardo Oliva<sup>7</sup>, Ousmane Koita<sup>4</sup>, Boris Szurek<sup>2</sup>, Valérie Verdier<sup>1,2\*</sup> and Jan E. Leach<sup>1\*</sup>

<sup>1</sup> Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO, United States, <sup>2</sup> IRD, Cirad, Univ. Montpellier, IPME, Montpellier, France, <sup>3</sup> Centre Régional de Recherche Agronomique de Niono, Institut d'Economie Rural, Bamako, Mali, <sup>4</sup> Laboratoire de Biologie Moléculaire Appliquée, Université des Sciences Techniques et Technologiques de Bamako, Bamako, Mali, <sup>5</sup> Pacific Biosciences, Menlo Park, CA, United States, <sup>6</sup> Department of Plant Pathology, Infectious Disease Institute, Ohio State University, Columbus, OH, United States, <sup>7</sup> International Rice Research Institute, Los Baños, Philippines

Xanthomonas oryzae (Xo) are globally important rice pathogens. Virulent lineages from Africa and Asia and less virulent strains from the United States have been well characterized. Xanthomonas campestris pv. leersiae (Xcl), first described in 1957, causes bacterial streak on the perennial grass, Leersia hexandra, and is a close relative of Xo. L. hexandra, a member of the Poaceae, is highly similar to rice phylogenetically, is globally ubiquitous around rice paddies, and is a reservoir of pathogenic Xo. We used long read, single molecule real time (SMRT) genome sequences of five strains of Xcl from Burkina Faso, China, Mali, and Uganda to determine the genetic relatedness of this organism with Xo. Novel transcription activator-like effectors (TALEs) were discovered in all five strains of Xcl. Predicted TALE target sequences were identified in the Leersia perrieri genome and compared to rice susceptibility gene homologs. Pathogenicity screening on L. hexandra and diverse rice cultivars confirmed that Xcl are able to colonize rice and produce weak but not progressive symptoms. Overall, based on average nucleotide identity (ANI), type III (T3) effector repertoires, and disease phenotype, we propose to rename Xcl to X. oryzae pv. leersiae (Xol) and use this parallel system to improve understanding of the evolution of bacterial pathogenicity in rice agroecosystems.

Keywords: Xanthomonas oryzae, transcription activator-like effectors (TALEs), agroecosystem, cutgrass, rice

#### Edited by:

Dawn Arnold, University of the West of England, United Kingdom

#### Reviewed by:

Gongyou Chen, Shanghai Jiao Tong University, China

David J. Studholme, University of Exeter, United Kingdom

#### \*Correspondence:

Valérie Verdier, valerie.verdier@ird.fr

Jan E. Leach, Jan.Leach@colostate.edu

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Plant Science

Received: 21 December 2018 Accepted: 02 April 2019 Published: 30 April 2019

#### Citation:

Lang JM, Pérez-Quintero AL, Koebnik R, DuCharme E, Sarra S, Doucoure H, Keita I, Ziegle J, Jacobs JM, Oliva R, Koita O, Szurek B, Verdier V and Leach JE (2019) A Pathovar of Xanthomonas oryzae Infecting Wild Grasses Provides Insight Into the Evolution of Pathogenicity in Rice Agroecosystems. Front. Plant Sci. 10:507. doi: 10.3389/fpls.2019.00507

### INTRODUCTION


Rice is a staple crop for more than half of the world's population. Severe rice diseases, such as bacterial leaf streak (BLS), caused by Xanthomonas oryzae pv. oryzicola (Xoc), and bacterial blight (BB), caused by X. o. pv. oryzae (Xoo), are increasing in prevalence in parts of Asia and sub-Saharan Africa and cause significant yield losses. In Asia, perennial weeds are considered an important source of primary pathogen inoculum for these two diseases (Ou, 1985; Mew, 1987).

Southern cutgrass (Leersia hexandra Swartz) is a common grass found in the southern United States, South America, Africa, and Asia. It is a member of the Poaceae family and is closely related to rice, but diverged from Oryza approximately 14 mya (Guo and Ge, 2005). L. hexandra is an invasive species that frequently grows along rivers and canals surrounding rice paddies. Because of its close relationship to Oryza spp., Leersia spp. are included as outgroups in phylogenetic studies. Recent genome investigations of Leersia perrieri, a cutgrass found in Madagascar, were done to compare repetitive elements and transposable elements among Oryza sp. and to uncover orthologs of the important submergence tolerance gene, SUB1 (Copetti, 2013; Copetti et al., 2015; dos Santos et al., 2017). One-third of the L. perrieri genome was found to consist of repeats. The high amount of newly discovered repeats (35%) indicates that the L. perrieri genome is evolving rapidly relative to the Oryza genus (Copetti, 2013).

Early reports of phytoremediation by L. hexandra showed this grass' capacity to sequester Cr, Cu, and Ni, and it has now been proposed as a tool in wastewater treatment (Liu et al., 2011; Wang et al., 2012; You et al., 2013). Interestingly, this and other grasses in this genus, such as Leersia sayanuka, Leersia oryzoides, and Leersia japonica, are susceptible to Xoo (Noda and Yamamoto, 2008) and can serve as reservoirs for inoculum. Xoo strains isolated from symptomless L. hexandra cause BB symptoms in rice, and, in artificially inoculated weed plants, Xoo multiplied without evidence of disease (Gonzalez et al., 1991) implicating this grass as an alternative host for the pathogen. These and other findings reinforce that effective integrated management of crop diseases must incorporate knowledge of pathogen interactions with weedy species.

The species Xo is highly diverse, and is represented by distinct lineages of Xoo from Asia and Africa, Xoc from Asia and Africa, and strains from the United States that are not assigned to a pathovar, Xo (Gonzalez et al., 2007; Triplett et al., 2011; Hajri et al., 2012; Poulin et al., 2015; Triplett and Leach, 2016). The term pathovar is used to refer to a strain or set of strains with the same or similar characteristics, differentiated at the infrasubspecific level from other strains of the same species or subspecies on the basis of distinctive pathogenicity to one or more plant hosts<sup>1</sup>. Multi-locus variable-number tandem-repeat analysis (MLVA) has been used to investigate the genetic structures of microbial populations (Zhao et al., 2012; Poulin et al., 2015). Using MLVA, Poulin et al. (2015) suggested that Xoo and Xoc from Africa had a common Asian ancestor; this conclusion was based on the fact that the allelic richness, or number of alleles, was significantly lower in these populations. However, further analyses on an extensively sampled set of isolates are needed to confirm this ancestral hypothesis.
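To make the allelic-richness argument concrete, the sketch below counts the distinct alleles (repeat copy numbers) seen at each locus in a set of MLVA profiles. The profiles and locus names are hypothetical, and the count is a raw one; published analyses typically also correct for unequal sample sizes, which this sketch does not attempt.

```python
def allelic_richness(profiles):
    """Count the distinct alleles (repeat copy numbers) observed at each locus."""
    alleles = {}
    for profile in profiles:
        for locus, copies in profile.items():
            alleles.setdefault(locus, set()).add(copies)
    return {locus: len(values) for locus, values in sorted(alleles.items())}


# Hypothetical MLVA profiles: locus name -> repeat copy number.
source_population = [
    {"locus_1": 5, "locus_2": 3},
    {"locus_1": 6, "locus_2": 3},
    {"locus_1": 7, "locus_2": 4},
]
derived_population = [
    {"locus_1": 5, "locus_2": 3},
    {"locus_1": 5, "locus_2": 3},
]

print(allelic_richness(source_population))
print(allelic_richness(derived_population))
```

A population that carries fewer alleles per locus than another, as the second toy population does here, is the pattern that the cited study interpreted as evidence of a derived population.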

Xanthomonas spp. inject effector proteins into plant host cells via a type-III (T3) secretion system to elicit disease (White et al., 2009). These proteins can confer pathogenicity and/or dictate host specificity (Jacques et al., 2016). Xanthomonas spp. are most notable for production of transcription activator-like effectors (TALEs). TALEs influence host gene expression by directly binding to specific sequences [effector binding elements (EBEs)] in the target promoter, as dictated by the repeat-variable di-residues (RVDs) at amino acid positions 12 and 13 of each repeat in the central repeat region (CRR) (Boch et al., 2009; Moscou and Bogdanove, 2009; White et al., 2009; Bogdanove and Voytas, 2011). The CRR contains different numbers of repeats, each with 33–35 amino acids. TALEs may enhance disease by targeting susceptibility (S) genes, or may trigger a resistance response by activating expression of an "executor" resistance (R) gene (Boch et al., 2014; Hutin et al., 2015).
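The relationship between repeats, RVDs, and the predicted binding sequence can be sketched in a few lines. The mapping below covers only four commonly cited RVD preferences and is a simplification, since real specificities are degenerate; the repeat sequences are built from an illustrative 34-residue template and are not taken from any strain in this study.

```python
# Simplified RVD-to-nucleotide preferences (an assumption for illustration).
RVD_CODE = {
    "HD": "C",
    "NI": "A",
    "NG": "T",
    "NN": "R",  # IUPAC code R = G or A
}


def extract_rvd(repeat):
    """Return residues 12 and 13 (1-based) of a single repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat shorter than 13 residues")
    return repeat[11:13]


def predict_ebe(repeats):
    """Translate an ordered list of repeats into a degenerate DNA string.
    RVDs missing from the table are written as N (any nucleotide)."""
    return "".join(RVD_CODE.get(extract_rvd(repeat), "N") for repeat in repeats)


def make_repeat(rvd):
    """Build an illustrative 34-residue repeat carrying the given RVD."""
    return "LTPEQVVAIAS" + rvd + "GGKQALETVQRLLPVLCQAHG"


example_repeats = [make_repeat(rvd) for rvd in ("HD", "NI", "NG", "NN")]
print([extract_rvd(repeat) for repeat in example_repeats])
print(predict_ebe(example_repeats))
```

Only the two residues at positions 12 and 13 differ between the example repeats, which is why the order of RVDs alone is enough to write down a candidate EBE.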

The presence of TALE effectors is variable in the genus (Jacques et al., 2016). Xoo contain nine to 20 TALEs while Xoc can contain up to 29. US Xo do not contain any TALEs, and due to this absence, have been employed as a tool to study TALE effector biology in rice (Ryba-White et al., 1995; Verdier et al., 2012). New sequencing technologies and predictive algorithms have accelerated the characterization of TALEs and their host gene targets. In particular, long read, single molecule real time (SMRT) sequencing (Pacific Biosciences, Menlo Park, CA, United States) has enabled the rapid assembly of TALE sequences that are otherwise laborious to capture due to their highly repetitive structure (Eid et al., 2009; Booher et al., 2015; Wilkins et al., 2015; Grau et al., 2016; Peng et al., 2016; Quibod et al., 2016; Tran et al., 2018). Collectively, TALE repertoires (TALomes) encoding polymorphic groups that have contrasted abilities to induce susceptibility target genes potentially underlie host adaptation at a small evolutionary scale (Doucouré et al., 2018). The Xo group has clearly undergone significant evolution influenced by geography, environment, and host, and TALomes can provide critical insight into how these events occur.

Xanthomonas campestris pv. leersiae (Xcl), a pathogen of L. hexandra, was previously shown to group distinctly from Xo by host range and phylogenetic analysis (Fang et al., 1957; Parkinson et al., 2009). However, using multi-locus sequence analysis (MLSA), Triplett et al. (2011) showed that Xcl strain NCPPB4346, which was isolated from southern cutgrass in China, groups within the Xo cluster, yet it could not be assigned to any described pathovar. A more recently isolated strain, BAI23, from weeds in Burkina Faso, showed high sequence similarity with Xcl NCPPB4346 based on MLSA as well as the presence of TALEs (Wonni et al., 2014). Together, the two strains form a distinct genetic cluster within X. oryzae.

Prediction algorithms based on the TALE's specific RVD pattern and the corresponding degenerate DNA code have facilitated identification of plant target genes whose promoters contain EBEs for TALE binding (Doyle et al., 2012; Grau et al., 2013, 2016; Pérez-Quintero et al., 2013; Booher and Bogdanove, 2014; Yang et al., 2014). Many S genes are transporters (sugar or sulfate) or transcription factors and upon induction facilitate bacterial colonization and symptom development (Hutin et al., 2015; Tran et al., 2018). Although a large body of work is available on TALEs from Xoo and Xoc, no information has been reported on the TALEs from Xcl, how they compare to those in other X. oryzae, and the nature of their predicted targets within Leersia spp. In this study, we used comparative genomics, identification of T3 effectors, TALomes, and disease phenotyping to characterize Xcl. We used gene target prediction algorithms to identify potential Xcl TALE gene targets in draft Leersia genome sequences. Finally, we provide evidence to support renaming Xcl to X. oryzae pv. leersiae (Fang et al., 1957) and will refer to this organism as Xol throughout this work.

<sup>1</sup>www.isppweb.org/about\_tppb\_naming.asp

### MATERIALS AND METHODS

#### Bacterial Strains and Plant Varieties

Bacterial strains included in this study are listed in **Table 1**. Bacteria were cultured on peptone sucrose agar (PSA) at 28°C for plant inoculations. Genomic DNA for sequencing was isolated from Xol strains BAI23, BB 151-3, BB 156-2, NCPPB4346, and NJ 6.1.1 grown for 48 h on nutrient agar at 28°C (Lang et al., 2014; Wonni et al., 2014). Barley (Hordeum vulgare L. cultivar Morex), wheat (Triticum aestivum cv. Chinese Spring) and tobacco (Nicotiana tabacum) were grown in a growth chamber at 22°C, 50% relative humidity, and 16 h of light. Rice (Oryza sativa) varieties included in pathogenicity assays were Azucena, Carolina Gold, Cypress, IR64, and Nipponbare.

Southern cutgrass (L. hexandra) was collected in Texas, United States, and seed was propagated at Colorado State University. The seed was scarified then germinated in porous ceramic silica (Greens Grade, Profile Products, LLC, Buffalo Grove, IL, United States) and 0.5x Hoagland's solution with the following modifications: 2.5 mM KNO3, 1 mM MgSO4, 3 mM KH2PO4, 2.5 Ca(NO3)2, 0.05 mM FeSO4, and 0.1 mM (Na)2EDTA (Hoagland and Arnon, 1950). Seed was incubated in a petri dish in the dark for 8 days, then in the light for 30 days at 28°C. Germinated seeds were transplanted into 1-gallon pots with equal parts Greens Grade and ProMix BX (ProMix, Quakertown, PA, United States), and grown in a greenhouse (27 ± 1°C, 16 h day length, and 80–85% relative humidity). Additional plants were obtained from the two mother plants via rhizome propagation. Propagation began 30 days after planting with subsequent propagation every week to allow time for new growth. To promote root growth, plants were placed in the dark for 24 h.

#### Pathogenicity Assays

Rice varieties were inoculated at 4–5 weeks old with suspensions (10<sup>8</sup> CFU mL<sup>−1</sup>) of bacterial strains listed in **Table 1**. Bacterial suspensions were both infiltrated into the intercellular spaces of rice leaves on either side of the main vein with a needleless syringe and inoculated by leaf clipping as described (Kauffman et al., 1973; Reimers and Leach, 1991). Two leaves were inoculated on each of three to six separate plants; water was included as a negative control. The entire experiment was conducted twice. Lesions were measured at 12 days post inoculation (dpi), and bacterial numbers in planta were quantified as previously described (Verdier et al., 2012).

### Molecular Diagnostics

To test relationships of Xol to Xo (Notomi et al., 2000; Triplett et al., 2011; Wonni et al., 2014), a diagnostic multiplex PCR and a loop-mediated isothermal amplification (LAMP) assay were used (Lang et al., 2010, 2014; Wonni et al., 2014). Previously described universal US Xo primers were also tested to differentiate Xol from this novel US clade within the species (Triplett et al., 2011). UniqPrimer was employed to compare draft Xol genomes (BAI23 and NCPPB4346) and generate primers specific to Xol as previously described (Ash et al., 2014; Lang et al., 2014, 2017; Juanillas et al., 2018). Specificity was validated by screening Xol primers against diverse pools of bacterial genomic DNA (**Table 1**). Primers, expected product size, and optimal annealing temperatures are listed in **Supplementary Table S1**.

### Genome Sequencing and Assembly

Long read, SMRT sequencing (PacBio, Menlo Park, CA, United States) data were generated for five Xol strains and for the Xo strain X11-5A, to be used as an outgroup. DNA for SMRT sequencing was isolated by culturing strains on nutrient agar for 48 h, then using the Genomic DNA buffer set and Genomic-tips according to the manufacturer's instructions (Qiagen, Valencia, CA, United States). SMRT sequences were assembled using HGAP v4 (PacBio, Menlo Park, CA, United States). Genomes were circularized using Circlator (Hunt et al., 2015). Assemblies and raw data have been deposited in NCBI (BioProject IDs PRJNA522807 and PRJNA522811; BioSample accessions SAMN03862116, SAMN02469650, SAMN10956066-68, SAMN10956070; raw sequencing files SRX5417793-98; Assembly accessions CP036251-56). Assembly CP036251 (X11-5A) replaces draft assembly GCF000212755.1 and assembly CP036253 (NCPPB4346) replaces draft assembly GCF001276975.1 (also in BioProject PRJNA257008). Accessions for all genomes used in this study are listed in **Supplementary Table S2**.

#### Phylogenomics and Bioinformatic Analyses

In addition to the five Xol sequenced strains, all completely sequenced X. oryzae genomes were obtained from the NCBI to be used for comparisons, as well as representative genomes from other Xanthomonas species. All genomes and accessions can be found in **Supplementary Table S2**.

Average nucleotide identity (ANI) values were obtained using the ANI-matrix script from the enveomics collection (v1.3) (Rodriguez-R and Konstantinidis, 2016). Parsimony trees based on pan-genome SNPs were obtained using kSNP (v3.0) (Gardner et al., 2015). Multi-locus sequence analysis was performed by identifying 33 housekeeping genes in all genomes using Amphora2 (Wu and Scott, 2012). Concatenated amino acid sequences of these genes were then aligned using MUSCLE (v3.8.31) (Edgar, 2004), and neighbor-joining bootstrapped trees were generated using functions (dist.ml; model = "Blosum62", NJ, bootstrap.pml) of the R package phangorn (v2.4.0) (Schliep, 2011).

#### TABLE 1 | Bacterial strains used in phenotyping and molecular diagnostics.

<sup>a</sup>Strains tested for pathogenicity to Oryza sativa and Leersia hexandra.

Automated annotation of proteins used in this manuscript for all genomes was made using Prokka (v1.14-dev) (Seemann, 2014); annotation for public versions available in the NCBI was made with the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Groups of orthologs and trees based on presence/absence of ortholog genes were generated using OrthoFinder (v 2.2.6) (Emms and Kelly, 2015). Dotplots of whole genome alignments were obtained using Gepard (v. 1.4, using word length 100) (Krumsiek et al., 2007). Genome duplications were then quantified using minimap (v.1) (Li, 2016) as implemented in minidot<sup>2</sup> (parameters = -g 1000 -k 50 -w 5 -L 100). Colinear gene regions were identified using DAGchainer (Haas et al., 2004) (parameters = -E 1e-20 -s -g 2000 -x 200 -A 4) using the BLAST results from OrthoFinder as inputs. Insertion sequences (ISs) were identified using ISEScan (v1.6) (Xie and Tang, 2017).

<sup>2</sup>https://github.com/thackl/minidot

Non-TALE T3 effectors were identified by BLASTP (v. 2.6.0+; results were filtered keeping hits with -evalue < 0.0001, > 0% identity in >40% of the query length) (Boratyn et al., 2013) of consensus effector sequences obtained from http://xanthomonas.org/ against the protein sequences obtained using Prokka. TALE sequences were extracted from each genome using in-house Perl scripts<sup>3</sup>. Neighbor-joining trees were generated from concatenated nucleotide TALE N- and C-terminal sequences; alignments were made using MUSCLE (Edgar, 2004) and trees were generated using functions (dist.ml; model = "JC", NJ, bootstrap.pml) of the R package phangorn (v2.4.0) (Schliep, 2011).
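The hit-filtering step described above can be sketched as a small predicate over parsed BLASTP hits. This is an illustration under stated assumptions: the `Hit` record, its field names, and the example values are hypothetical, and query coverage is approximated from the aligned query span. The identity threshold defaults to the value printed in the text and is exposed as a parameter so it can be adjusted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str      # consensus effector name (hypothetical)
    subject: str    # annotated protein identifier (hypothetical)
    pident: float   # percent identity of the alignment
    qstart: int     # 1-based start of the alignment on the query
    qend: int       # 1-based end of the alignment on the query
    qlen: int       # full length of the query sequence
    evalue: float


def passes_filters(hit, max_evalue=1e-4, min_identity=0.0, min_query_cov=0.40):
    """Apply the three thresholds as strict inequalities."""
    query_coverage = (hit.qend - hit.qstart + 1) / hit.qlen
    return (
        hit.evalue < max_evalue
        and hit.pident > min_identity
        and query_coverage > min_query_cov
    )


# Hypothetical hits, for illustration only.
example_hits = [
    Hit("XopD_consensus", "prot_1", 85.0, 1, 300, 400, 1e-50),   # kept
    Hit("XopO_consensus", "prot_2", 90.0, 1, 100, 400, 1e-20),   # coverage too low
    Hit("XopK_consensus", "prot_3", 60.0, 10, 390, 400, 1e-2),   # e-value too high
]
kept = [hit.subject for hit in example_hits if passes_filters(hit)]
print(kept)
```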

Transcription activator-like effector repeat sequences were aligned using DisTAL v1.2 (Pérez-Quintero et al., 2015) and neighbor-joining trees were generated based on DisTAL genetic distances using the R package ape (Paradis et al., 2004). TALE groups were defined using the function cutree in R (height = 4.8) on the DisTAL tree. Predictions for TALE binding sites were made using Talvez on the promoters (1 kb upstream of the translation start site) of annotated genes in the L. perrieri (v1.4) and O. sativa cv. Nipponbare (v. MSU7) genomes as previously described (Pérez-Quintero et al., 2013).
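The general idea of scanning a promoter for a TALE binding site can be illustrated with a deliberately simple match-counting scan. This is not the Talvez scoring scheme: the degenerate pattern, the scoring by number of compatible positions, and the toy promoter are all assumptions made for illustration.

```python
# Nucleotides allowed by each symbol of the degenerate pattern.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "N": "ACGT"}


def score_site(pattern, site):
    """Count positions where the site is compatible with the pattern."""
    return sum(1 for symbol, base in zip(pattern, site) if base in IUPAC[symbol])


def best_ebe(pattern, promoter):
    """Slide the pattern along the promoter and return the first
    best-scoring window as (start, score, window), or None if the
    promoter is shorter than the pattern."""
    best = None
    for start in range(len(promoter) - len(pattern) + 1):
        window = promoter[start:start + len(pattern)]
        score = score_site(pattern, window)
        if best is None or score > best[1]:
            best = (start, score, window)
    return best


# Toy promoter and pattern, for illustration only.
print(best_ebe("CATR", "GGTTCATGAA"))
```

In practice the scan would be run over the 1 kb upstream region of every annotated gene, and candidate targets would be ranked by score across the whole promoterome.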

#### RESULTS

#### Xol Is Pathogenic to Rice and Southern Cutgrass

To better understand the biology of Xol, we established a host range by screening for pathogenicity on rice and southern cutgrass (L. hexandra) using different inoculation techniques. Relative to the virulent Philippine Xoc strain BLS256, the Xol strains were less aggressive to rice, and caused less expansive water-soaked leaf streaking on several rice varieties. Xol BAI23 was more aggressive than NCPPB4346 on rice varieties Cypress and IR64, and caused more disease than the US Xo strain X11-5A on Azucena. Rice variety Carolina Gold was resistant to Xol, exhibiting a hypersensitive response and no lesion expansion after infiltration. Both Xol strains caused longer lesions on southern cutgrass than Xo, but were not as aggressive as Xoc BLS256 on rice. Xo produced water-soaked spots at the point of infiltration that did not expand on southern cutgrass (**Figure 1**).

After clip inoculations, lesions caused by Xol did not expand on rice or southern cutgrass, unlike those caused by the vascular pathogen Xoo (**Supplementary Figure S1**). When infiltrated into leaves, populations of Xol were equivalent to Xoc BLS256 and Xo X11-5A on rice cvs. Nipponbare or Azucena after 72 hpi (**Figure 2**). On southern cutgrass, their native host, Xol grew to a significantly higher population than Xo X11-5A. Xol did not cause disease on wheat or barley (**Supplementary Figure S2**). Phenotyping Nicotiana species can serve as a screen for the ability of microbes, particularly Xanthomonas spp., to elicit a non-host resistance response (Gonzalez et al., 2007). Xol caused minor chlorosis, but not water soaking nor a hypersensitive response, when infiltrated into N. tabacum, similar to the phenotype caused by Xoo PXO99A and Xoc BLS256 while Xoo BAI3 and Xo US 11-5A caused a strong hypersensitive response at the site of infiltration on N. tabacum similar to prior reports (Gonzalez et al., 2007) (**Supplementary Figure S2**).

Our studies show that Xol is pathogenic to southern cutgrass and mildly pathogenic on diverse varieties of rice, but does not cause disease on barley or wheat, and that some rice varieties exhibit resistance to Xol. After inoculation, Xol causes symptoms on rice that are most similar to Xoc, i.e., expanding lesions when introduced into the intercellular spaces (leaf infiltration) and no spreading lesions when introduced into the xylem vessels (clipping).

#### X. campestris pv. leersiae Belongs to the X. oryzae Species and Is Phylogenetically Close to Xoc

Amplification of Xol DNA with primers specific for Xo but not Xoc, Xoo, or US Xo suggested that Xol were related to X. oryzae, but distinct from other Xo pathovars (Lang et al., 2010; Triplett et al., 2011) (**Supplementary Figure S3**). We then used SMRT sequencing technology to derive complete genomes of five available Xol strains from China (NCPPB4346), Burkina Faso (BAI23), Mali (NJ611), and Uganda (BB151-3 and BB156-2). We calculated pairwise ANI among these and all fully sequenced X. oryzae genomes (Rodriguez-R and Konstantinidis, 2016). This analysis showed that Xol strains were 99–100% identical to one another, and were 97–99% identical to US Xo, Xoc, and Xoo. Xol were most similar to Xoc (∼98.5%). They were 76–91% similar to other Xanthomonas species. X. vasicola was the next most similar Xanthomonas species to X. oryzae, sharing 91% ANI with Xol and all Xo (**Figure 3**). We generated parsimony phylogenetic trees based on pan-genome SNPs using kSNP3 (Gardner et al., 2015), which showed again that Xol strains were closely related to Xoc (**Figure 3**). Neighbor-joining trees generated based on MLSA and on presence/absence of ortholog families showed similar groupings, although the placement of African Xoo in the tree was variable (**Figure 3** and **Supplementary Figure S4**).
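The ANI values above can be read against a species cutoff. The sketch below assumes the commonly used cutoff of roughly 95% ANI for species delineation; that cutoff is an assumption and is not stated in this article. The two ANI values are the approximate figures quoted in the text.

```python
SPECIES_ANI_CUTOFF = 95.0  # assumed cutoff, not taken from the article

# Approximate pairwise ANI values (%) between Xol and two relatives,
# as quoted in the text above.
ani_to_xol = {
    "Xoc": 98.5,
    "X. vasicola": 91.0,
}


def same_species(ani_percent, cutoff=SPECIES_ANI_CUTOFF):
    """Return True when an ANI value meets the assumed species cutoff."""
    return ani_percent >= cutoff


calls = {name: same_species(value) for name, value in ani_to_xol.items()}
print(calls)
```

Under this assumed cutoff, Xol falls on the same side of the species boundary as Xoc, while X. vasicola falls outside it.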

These combined data indicate that our sequenced strains are more closely related to Xo than other Xanthomonas species and therefore, combined with previously reported MLSA data (Triplett et al., 2011; Wonni et al., 2014), we recommend that the formal taxonomic placement of Xcl be included in the species "oryzae." Further biological support for this shift from host range and effector repertoires is described below.

To avoid misidentification of Xol as Xoc or Xoo in future studies, genomes were compared with UniqPrimer (Juanillas et al., 2018) to identify Xol-specific regions on which to base diagnostic primer design. Two primer sets were validated for specificity against over 30 closely and distantly related bacteria (**Supplementary Table S1**). Both primer sets consistently amplified only control Xol strains and did not amplify any other bacterial strain tested (**Table 1**).

<sup>3</sup>https://github.com/alperezq/experimenTAL

varieties and (B,C) wild Leersia hexandra were measured 12 days post infiltration inoculation. An asterisk denotes a significant difference between strains on each variety (p ≤ 0.05). Error bars represent ± SD.

### Rice Associated X. oryzae Show High Genome Plasticity Compared to X. oryzae pv. leersiae

We generated dotplots of pairwise whole genome alignments to further compare the genomes of X. oryzae strains. These alignments showed several genome rearrangements between the different Xol strains with respect to each other and to other X. oryzae groups. Curiously, we noticed that self-alignments of genomes of pathovars oryzae and oryzicola overall showed more genomic duplication than those of Xol (**Figure 4A**). To quantify this, we identified colinear regions within each genome. This showed that these possible genomic duplication events occur at different frequencies in each X. oryzae lineage: they are more frequent in Xoc and Asian Xoo, and less frequent in Xol and African Xoo (**Figure 4B**). Notably, all X. oryzae examined exhibited overall more genomic duplications than other Xanthomonas species (**Figure 4B**).
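One way to total the duplicated sequence in a genome self-alignment is to merge the off-diagonal hit intervals and sum their lengths. The sketch below illustrates that idea only; the hit tuple layout, the half-open coordinates, and the example values are assumptions, and it is not a reimplementation of the minimap and minidot workflow used in the study. The 100 bp minimum follows the Figure 4B legend.

```python
def total_duplicated_bp(hits, min_length=100):
    """Sum the query positions covered by non-trivial self-alignment hits.

    hits: iterable of (q_start, q_end, t_start, t_end) tuples with
    0-based, half-open coordinates on the same genome.
    """
    intervals = []
    for q_start, q_end, t_start, t_end in hits:
        if (q_start, q_end) == (t_start, t_end):
            continue  # a region aligned to itself is not a duplication
        if q_end - q_start < min_length:
            continue  # shorter than the minimum duplicated length
        intervals.append((q_start, q_end))

    intervals.sort()
    total = 0
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)  # extend the merged block
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical self-alignment hits, for illustration only.
example_hits = [
    (0, 5000, 0, 5000),       # trivial diagonal match, ignored
    (100, 400, 2000, 2300),
    (300, 600, 3000, 3300),   # overlaps the previous query interval
    (2000, 2300, 100, 400),
    (700, 750, 900, 950),     # below the minimum length, ignored
]
print(total_duplicated_bp(example_hits))
```

Merging before summing avoids counting the same stretch of the genome twice when it matches several other locations.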

We also identified colinear arrangements of homologous genes within each genome, that is, genome duplications that involve multiple genes, and again, these were more frequent in Xoc and Xoo than in Xol (**Figure 4C**). Since some of these duplications may be mediated by duplicative transposition, we annotated and quantified insertion sequences (ISs) in the different X. oryzae genomes, which indeed revealed a higher frequency of ISs in the groups with higher amounts of duplicated regions (**Figure 4**). The distribution of IS families was overall similar among X. oryzae, with that of Xol most closely resembling the distribution in African Xoo (**Figure 4E**).

### Non-TAL T3E Repertoires of Xol Are Similar to Xoc

Computational prediction of T3E among annotated proteins in each genome revealed that the Xol T3E repertoire resembles more closely that of Xoc. Nonetheless, some features are unique to Xol, specifically all Xol strains possess a xopD gene that is absent in all other Xo genomes. XopAH is also present in all Xol strains and is shared only by two Xoc strains. On the other hand, Xol strains lack xopO which is present in all Xoc strains, and some Xol strains seem to lack the otherwise universal effectors xopW and xopK (**Supplementary Figure S5**).

### Xol Strains Have Distinct TALE Repertoires With Some Similarities to African Xoo

Previous Southern blot analyses using conserved TALE probes predicted that Xol BAI23 and NCPPB4346 had five and four TALEs, respectively (Wonni et al., 2014). However, Southern blot analysis cannot resolve TALEs that are close in size. We found in our whole genome sequences that Xol strains contained 12 or 13 TALEs each, which is more than the nine TALEs per genome found in African Xoo, and fewer than what is commonly found in Xoc (22–29 TALEs) and Asian Xoo (13–20 TALEs). No truncated TALEs were identified in any Xol genome. As previously reported, no TAL effectors were found in the Xo strain X11-5A (Ryba-White et al., 1995; Triplett et al., 2011).

FIGURE 3 | X. oryzae pv. leersiae (Xol) is a member of X. oryzae (Xo) and is closely related to X. oryzae pv. oryzicola (Xoc). The heatmap shows pairwise average nucleotide identity (ANI) values between fully sequenced X. oryzae genomes. (Left) Consensus parsimony tree generated based on shared pan-genome SNPs; numbers in gray indicate node support as output by kSNP3, and heatmap rows are ordered according to this tree. (Top) MLSA neighbor-joining tree based on concatenated alignments of 33 housekeeping genes; numbers in gray indicate bootstrap support for branches, and heatmap columns are ordered according to this tree. X. oryzae pv. oryzae is abbreviated as Xoo and X. vasicola pv. vasculorum as Xvv. All species abbreviations are followed by strain name.

pairwise whole genome alignments between representative genomes of each X. oryzae group, blue squares highlight self-alignments showing high amounts of duplication and rearrangements in X. oryzae pv. oryzicola and pv. oryzae when compared to X. oryzae pv. leersiae. (B) Boxplot showing total duplicated nucleotide sequences (at least 100 bp) in self-alignments of whole genomes for each group, each dot represents the total regions detected for one strain aligned to itself. (C) Gene duplication involving at least four co-linear genes were identified based on alignments of annotated proteins in a genome against itself, boxplot shows total duplication events detected for each strain. (D) Total number of insertion sequences (ISs) identified in genomes of each group. (E) Distribution of IS families identified within each group, percentages are calculated based on the average of each family in all strains analyzed for each group.

Phylogenetic trees based on the N and C termini of X. oryzae TALEs showed that Xol TALEs form a distinct group but seem to be close to African Xoo TALEs, despite the overall genomic similarities with Xoc (**Supplementary Figure S6**). We also constructed trees based on similarities in the CRR using DisTAL, which grouped Xol and African Xoo together in a subgroup that also includes Xoc (**Figure 5A**). We used the DisTAL tree to define TALE groups based on repeat region similarities; Xol TALEs were classified into 12 groups (**Figure 5B**).

shows presence/absence of each TALE group in X. oryzae strains; darker colors indicate multiple TALEs from one group found in one strain. The tree to the left indicates average DisTAL distance between repeat arrangements for each TALE group. Top trees show hierarchical clustering of each X. oryzae group based on the presence/absence pattern of TALE groups. (C) Dotplot shows pairwise distances between all TALE repeat arrangements from each strain as calculated with DisTAL; each dot represents a pair of TALEs within one genome.

One of the groups contained TALEs from Xoc and Xol strains, while all the others were exclusive to Xol. Of these, seven groups were present in all five Xol strains. One of these groups, present in the four African Xol strains, contains the RVD combination "TI" which has not been previously reported in other species.

All Xol strains contained at least two TALEs that were classified within the same group, that is, within each Xol genome there are at least two TALEs with nearly identical repeat arrangements (**Figure 5B**). By examining pairwise genetic distances between the repeat regions of TALEs, we saw that TALEs from Xol are on average more similar to each other than TALEs within other X. oryzae groups, indicating that they are possibly less diversified and/or more redundant (**Figure 5C**).

### Xol TALE Targets in Cutgrass and Rice Are Different From Xoc and Xoo Targets

Talvez (Pérez-Quintero et al., 2013) was used to predict host targets for each TALE in our dataset in the promoters of annotated genes in the L. perrieri (v1.4) and O. sativa (vMSU7) genomes. We identified ortholog pairs between both genomes using reciprocal BLAST. Comparisons of the predictions between both genomes for Xol TALEs revealed very few cases where an ortholog pair was predicted to be a target in both genomes (**Supplementary Figure S7**), and highlighted that the promoters (and thus the targeted genes) in these two hosts are very different.
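The reciprocal BLAST step can be illustrated with a minimal sketch. It assumes that each BLAST search has already been reduced to (query, subject, bit score) tuples, and it treats two genes as an ortholog pair only when each is the other's top-scoring hit. The function names, gene identifiers, and scores below are hypothetical, and the actual BLAST parameters and filtering criteria used in this study are not specified here.

```python
from typing import Dict, List, Tuple

# One BLAST hit reduced to (query_id, subject_id, bit_score).
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        # Keep the first hit seen, or replace it if this one scores higher.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: List[Hit], b_vs_a: List[Hit]) -> List[Tuple[str, str]]:
    """Return (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy data: identifiers and scores are invented for illustration.
leersia_vs_rice = [
    ("Lp1", "Os1", 200.0),
    ("Lp1", "Os2", 150.0),
    ("Lp2", "Os2", 90.0),
    ("Lp3", "Os2", 120.0),
]
rice_vs_leersia = [
    ("Os1", "Lp1", 198.0),
    ("Os2", "Lp3", 118.0),
    ("Os2", "Lp2", 80.0),
]
ortholog_pairs = reciprocal_best_hits(leersia_vs_rice, rice_vs_leersia)
```

In this toy case, Lp2 is excluded because its best rice hit (Os2) has a better match elsewhere in the Leersia set, which is the behavior that makes reciprocal best hits a conservative proxy for orthology.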

numbers for genomes used in this analysis are available in Supplementary Table S2.

We then compared predictions for all Xol lineages in both genomes and calculated the overlap between predictions for each strain. As a result, we saw that each X. oryzae group has a distinct group of predicted targets with relatively little overlap with other groups. In the case of Xol, the highest overlap was found with predictions for Xoc TALEs (10–11% shared predicted targets) (**Figure 6**). Given the differences in the genomes of their hosts and that Xol TALEs have unique repeat sequences, we expect their targets to be likewise unique. Additionally, we looked for orthologs of known susceptibility (S) genes targeted by Xoc or Xoo TALEs (e.g., SWEETs, OsSULTR3;6) and queried whether they were among the top predictions for Xol TALEs. No known Xoc or Xoo target homolog was found. It is, however, possible that, while not targeting the direct orthologs of these S genes, Xol TALEs may be inducing similar functions, since the predicted targets contain genes annotated with similar functions to known S genes, including sulfate transporters, nodulins and various families of transcription factors (**Supplementary Table S3**). Expression data and further experiments are necessary to effectively identify the biological mechanisms of these TALEs.
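The overlap between predicted target sets can be expressed as a simple set calculation. The sketch below assumes one plausible definition, the percentage of one strain's predicted targets that are also predicted for another strain; the exact overlap measure used for Figure 6 is not stated here, and the function name and gene identifiers are hypothetical.

```python
from typing import Set


def percent_shared(targets_a: Set[str], targets_b: Set[str]) -> float:
    """Percentage of predicted targets in A that are also predicted in B.

    Returns 0.0 when A is empty, so the function never divides by zero.
    """
    if not targets_a:
        return 0.0
    return 100.0 * len(targets_a & targets_b) / len(targets_a)


# Hypothetical predicted target sets for two strains.
strain_1_targets = {"gene1", "gene2", "gene3", "gene4"}
strain_2_targets = {"gene3", "gene4", "gene5"}
shared = percent_shared(strain_1_targets, strain_2_targets)
```

Note that this measure is asymmetric: the value depends on which strain's target set is used as the denominator, so a symmetric alternative (for example, dividing by the size of the union) may be preferable when the two sets differ greatly in size.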

## DISCUSSION

Xanthomonas oryzae pv. leersiae, which has been isolated from the pervasive weed species L. hexandra surrounding rice paddies (Fang et al., 1957; Wonni et al., 2014), was historically grouped as a distinct species and pathovar (Vauterin et al., 1995; Triplett et al., 2011; Wonni et al., 2014). We compared the pathogenicity of multiple strains on diverse cereal hosts and the complete genomes of five Xol strains from Burkina Faso, China, Mali, and Uganda. Similar to Xoc, Xol strains caused water-soaked lesions on rice and L. hexandra but were not virulent on wheat and barley. ANI is a widely accepted baseline beyond DNA–DNA hybridization for taxonomic placement of prokaryotes into a species, not a pathovar, at a threshold of >95–96% (Konstantinidis and Tiedje, 2005; Goris et al., 2007; Richter and Rossello-Mora, 2009; Kim et al., 2014; Bull and Koike, 2015). In phylogenetic analyses, these five Xol strains, representing geographic and temporal diversity, grouped more closely with Xo pathovars than with other members of the genus, and were above the species delineation threshold in ANI analyses. Therefore, we propose renaming these strains from Xcl to Xol (Fang et al., 1957) comb. nov.
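As a minimal sketch of how an ANI threshold of this kind might be applied to a pairwise comparison, the function below sorts an ANI value into three categories. The "borderline" label for values inside the 95–96% band, the function name, and the example values are illustrative assumptions and are not part of the cited taxonomic guidelines or of the analysis reported here.

```python
def classify_by_ani(ani_percent: float, lower: float = 95.0, upper: float = 96.0) -> str:
    """Classify a pairwise ANI value against a species threshold band.

    Values above `upper` are called "same species", values below `lower`
    are called "different species", and values inside the band are
    flagged as "borderline" (an assumption made for this sketch).
    """
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    if ani_percent > upper:
        return "same species"
    if ani_percent >= lower:
        return "borderline"
    return "different species"


# Hypothetical ANI values, invented for illustration only.
example_calls = [classify_by_ani(value) for value in (98.2, 95.5, 91.0)]
```

Because ANI delineates species rather than pathovars, a "same species" call in a sketch like this says nothing about pathovar assignment, which still depends on host range and symptoms.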

Xanthomonas oryzae pv. leersiae strains colonize rice leaves and cause water-soaking, but the southern cutgrass isolates are not as aggressive as Xo isolated from rice. The lesions caused by Xol were phenotypically similar to rice BLS caused by Xoc, and these strains did not cause disease when introduced into rice or southern cutgrass by leaf-clipping. Thus, we suggest that Xol strains are not systemic pathogens and are more like Xoc than the systemic relative Xoo.

T3Es, as important contributors to bacterial pathogenicity, may define host range, and can also inform lineages and evolutionary relationships among different populations of related bacteria (Arlat et al., 1991; Hajri et al., 2012; Wonni et al., 2014; Schwartz et al., 2015). Studies in X. oryzae have shown that different lineages are shaped by T3E repertoires and reflect phenotypic adaptation to their agroecosystems (Hajri et al., 2012; Quibod et al., 2016). Likewise, this study sought to uncover similarities between Xo and Xol effector repertoires that could further inform their evolutionary lineage. Out of a set of previously defined core effectors for X. oryzae (avrBs2, avrBs3 [TALEs], xopL, xopN, xopP, xopQ, xopV, xopW, xopY, xopAA, xopAB, xopAE, and xopF1) (Hajri et al., 2012), Xol strains contain all except xopW, which is absent in three of the five strains. Although present in strains BAI23 and NJ611, xopW contains a large IS. This IS most likely prevented its amplification in previous studies (Wonni et al., 2014). Interestingly, this same IS was identified in other African Xoc strains, consistent with a shared evolutionary origin with Xol (Hajri et al., 2012). On the other hand, xopD is present in Xol but absent in other X. oryzae. XopD is a SUMO protease mimic that suppresses host defense responses during Xanthomonas euvesicatoria infection (Kim et al., 2008, 2013). In addition to Xol, xopD is predicted to be present in X. campestris pv. campestris, X. euvesicatoria, and Acidovorax citrulli<sup>4</sup>. Its absence in other Xo suggests an independent acquisition by Xol. Alternatively, Xoo or Xoc could have lost this effector at some point, but further validation of these hypotheses is necessary. Since it is not known what level of virulence and/or host specificity any T3E confers on Xol, future work should include functional validations.

We also assessed TALE diversity in Xol strains, using genomic assemblies based on long reads generated using SMRT sequencing. While according to phenotyping, phylogenomics, and non-TALE Type III effector repertoires, Xol most closely resembles Xoc, the Xol TALE sequences more closely resemble African Xoo. Xol and African Xoo also possess, on average, smaller TALomes (TALEs per genome) than Xoc or Asian Xoo. Furthermore, their TALEs also have less diverse repeat arrangements as evidenced by shorter genetic distances in pairwise TALE repeat alignments. A possible explanation for this feature is that TALEs from Xol and African Xoo more closely resemble the ancestral TALE repertoire for the group and have not undergone the extensive diversification found in Xoc and Asian Xoo. A hypovirulent strain of Xoc was isolated from rice in the Yunnan province of China that also contains nine TALEs. This strain was used in heterologous expression assays to determine targets and function of Tal7 (Cai et al., 2017). Unfortunately, the TALome of this strain is unavailable at this time and it is unclear if this strain is related to the Xol strains characterized in this study.

The expansion of Xoc and Xoo TALomes is curious since many of the TALEs they carry seem not to be required for virulence and some may even have redundant functions (Pérez-Quintero et al., 2013; Cernadas et al., 2014). Given their repetitive nature and a general tendency toward homogenization of repeats, TALEs have been proposed as being selected for evolvability, that is, selected for their ability to quickly recombine (Schandry et al., 2016). Having an expanded TALome would further allow frequent recombination and generation of new TALE variants, as reflected in bigger evolutionary distances between repeat arrangements. This selection of a bigger TALome has been proposed to be driven by extensive breeding for resistance in the host plant (Schandry et al., 2018). When exposed to a resistant host population, a pathogen population can benefit from carrying a heterogeneous and redundant set of effectors, since preexisting isolates harboring a set advantageous for the new conditions would then be selected, in what has been proposed as a type of evolutionary "bet-hedging" strategy (Win et al., 2012). It is then possible that Xoc and Asian Xoo have historically encountered more resistance, possibly related to domestication of its primary host, than Xol and African Xoo, leading to a selection for expanded TALomes.

At least two resistance genes (Xa1 and Xo1) in O. sativa specifically recognize TALEs (Ji et al., 2016; Triplett et al., 2016), and Xoc and Xoo seem to have benefited from their expanded TALomes by selecting TALE variants (iTALEs or truncTALEs) that can specifically suppress resistance mediated by these genes (Ji et al., 2016; Read et al., 2016). Meanwhile Xol infecting L. hexandra, as African Xoo originally infecting Oryza glaberrima (Gonzalez et al., 2007), may have never encountered similar resistance in its host, and thus lacks these TALE variants and cannot cause disease on varieties carrying Xo1 (**Figure 1**).

TALome expansion may have been accompanied by, or be a consequence of, higher genome plasticity in the X. oryzae clade, as evidenced by a high amount of genome and gene duplication and a high frequency of IS elements, these measures being generally lower in Xol. Within the clade, differences may once more be related to resistance in the host, with the more plastic genomes (Asian Xoo) matching more variable host populations. The requirement of plastic genomes in the clade as a whole has been hypothesized to be a consequence of rice cultivation through millennia (Salzberg et al., 2008; Bogdanove and Voytas, 2011), and suggests that the ancestor of the clade faced an already variable population. This raises the question of where Xol falls within this scenario of adaptation to a cultivated crop.

<sup>4</sup>www.xanthomonas.org

Rice and southern cutgrass are closely related members of the Poaceae. Leersia species are often used as an outgroup in phylogenetic and, most recently, genomic investigations (Copetti et al., 2015). Evidence of genome duplication events and defense response genes shared by rice and Leersia species has been reported (Jacquemin et al., 2009; Xiao et al., 2009). It is plausible, based on their genetic and evolutionary relatedness, that the respective pathogens of Oryza and Leersia evolved independently. The US strains of Xo, which do not contain intact TALEs, could have been progenitors of other Xo pathovars, having acquired TALEs over time to enhance virulence on rice or Leersia spp. (Triplett et al., 2011). Significant trade between the United States, Africa, and Asia could allow for movement of strains across continents. However, it is not clear if Xol is currently present in the United States, despite early reports of L. hexandra being an alternate host for Xoo (Gonzalez et al., 1991). Alternatively, this system may represent a sympatric scenario where Xoo, Xoc, Xo, and Xol all lived in the same habitat, providing the opportunity to exchange genetic material and to adapt to their respective hosts while maintaining basic homology (Jacques et al., 2016). In this scenario, given its similar infection biology and overall similarities to Xoc, Xol may be a specialized subgroup originating from a Xoc-like population able to colonize southern cutgrass, or vice versa.

How specifically adapted Xol strains are to either L. hexandra or rice remains a fascinating question, since Xol could potentially represent an emerging pathogen for rice. Here we have shown that Xol can, to some extent, infect rice, and two of the isolates sequenced in this work were originally recovered from symptomatic rice leaves in a field (BB 151-3 and BB 156-2). To gain insights into host adaptation, we attempted to predict targets for Xol TALEs in L. hexandra, using L. perrieri (the only available genome from the genus) as a proxy, and in O. sativa. Overall, the predictions indicate that Xol TALEs induce different sets of genes than other X. oryzae, and that different genes are induced in rice and Leersia sp., given that relatively few orthologues are predicted as targets in both genomes. Of particular interest, given the phenotypic similarities with Xoc, is the predicted targeting of genes annotated as sulfate transporters. Four genes in the L. perrieri genome corresponding to orthologs of sulfate transporters from rice (LOC\_Os03g09940, LOC\_Os03g09970, LOC\_Os03g09980, and LOC\_Os09g06499) were predicted to be targeted by TALEs from at least one Xol strain with high prediction scores (**Supplementary Table S3**). The primary virulence target of Tal2g from Xoc is OsSULTR3;6, which is a member of the sulfate transporter family 3 (Takahashi et al., 2011; Cernadas et al., 2014). It is feasible that the TALEs targeting sulfate transporters in Leersia mirror the virulence function of Tal2g. However, transcriptomic data and further biological validation are required, since it is still possible that the few shared genes in the predictions are true targets and that similar functions are required for virulence in both hosts.

### CONCLUSION

In summary, we propose Xol, which was isolated from rice and southern cutgrass (L. hexandra), a weedy grass closely related to rice, as a new member of the X. oryzae species. Genomic analysis and disease phenotyping on various hosts demonstrated the close relationship of Xol to the rice pathogens Xoo and Xoc. The T3E and TALE content of Xol indicated that this group of organisms uses virulence mechanisms similar to those of the rice pathogens. While weeds such as southern cutgrass are not agronomic crops, they compete for resources and are potential reservoirs of pathogen inoculum, and are therefore important in management considerations for rice growers. Interfering in any agroecosystem requires comprehensive consideration. Certain Leersia sp. are used as banker plants for the critically important rice brown plant hopper (Zheng et al., 2017); therefore, integrated management of weeds surrounding rice paddies will require prospecting and balance of all possible pests. The fact that they harbor a pathogen group that can also impact rice emphasizes that, in general, more attention should be paid to the surrounding ecosystem in rice production and, more broadly, in any crop rotation as a general management strategy. Research contributing toward understanding the Xol/rice/southern cutgrass pathosystem will be significant for all rice-producing countries.

## AUTHOR CONTRIBUTIONS

JML, AP-Q, RK, BS, VV, and JEL conceived and designed experiments. JML, AP-Q, RK, ED, and JJ performed the experiments. SS, HD, IK, RO, OK, and VV collected and provided new Xol strains. JML, AP-Q, RK, and JZ analyzed the data. RK, RO, OK, BS, VV, and JEL provided resources and supervision. JML, AP-Q, BS, VV, and JEL developed the manuscript.

### FUNDING

This research was supported by a Marie Curie IOF Fellowship (EU Grant PIOF-GA-2009-235457 to VV); the Embassy of France in the United States, Office of Science and Technology – STEM Chateaubriand Fellowship Program (to JML); USDA NIFA Postdoctoral Fellowship Award No. 2016-04706 (to JJ); USDA's National Institute of Food and Agriculture, award # 2018-67013-28490 (to JJ, JML, and JEL); and by IRD JEAI Coana (to SS and OK).

#### ACKNOWLEDGMENTS

We are grateful to Dr. Geoffrey Onaga for strain collection in Uganda; Dr. Bing Yang for Xol BB 151-3 and BB 156-2 genome sequence; and Emily Luna for technical support. We thank Drs. Mathilde Hutin, Céline Pesce, Sébastian Cunnac, and Tuan T. Tran for seed, constructive discussions, and technical support and Dr. Young-Ki Jo for providing Leersia hexandra. This article is based upon work from COST Action CA16107 EuroXanth, supported by COST (European Cooperation in Science and Technology).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00507/full#supplementary-material

#### REFERENCES

Copetti, D. (2013). "Genomic resources for Leersia perrieri: an outgroup species for the genus Oryza," in Proceedings of the Plant and Animal Genome VIII (Plant and Animal Genome), (San Diego, CA).

Copetti, D., Zhang, J., El Baidouri, M., Gao, D., Wang, J., Barghini, E., et al. (2015). RiTE database: a resource database for genus-wide rice genomics and evolutionary biology. BMC Genomics 16:538. doi: 10.1186/s12864-015-1762-3

TAL effector design and target prediction. Nucleic Acids Res. 40, W117–W122. doi: 10.1093/nar/gks608


throughout the Oryza genus and beyond. BMC Plant Biol. 9:146. doi: 10.1186/1471-2229-9-146


Ou, S. H. (1985). Rice Diseases, 2nd Edn. Surrey: Association Applied Biology.

Paradis, E., Claude, J., and Strimmer, K. (2004). APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289–290. doi: 10.1093/bioinformatics/btg412


TALomes reveals a new susceptibility gene in bacterial leaf blight of rice. PLoS Pathog. 14:e1007092. doi: 10.1371/journal.ppat.1007092


White, F. F., Potnis, N., Jones, J. B., and Koebnik, R. (2009). The type III effectors of Xanthomonas. Mol. Plant Pathol. 10, 749–766. doi: 10.1111/j.1364-3703.2009.00590.x


and Burkina Faso reveals a high level of genetic and pathogenic diversity. Phytopathology 104, 520–531. doi: 10.1094/PHYTO-07-13-0213-R


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Lang, Pérez-Quintero, Koebnik, DuCharme, Sarra, Doucoure, Keita, Ziegle, Jacobs, Oliva, Koita, Szurek, Verdier and Leach. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Pseudomonas syringae pv. syringae Associated With Mango Trees, a Particular Pathogen Within the "Hodgepodge" of the Pseudomonas syringae Complex

José A. Gutiérrez-Barranquero, Francisco M. Cazorla and Antonio de Vicente\*

Departamento de Microbiología, Facultad de Ciencias, Universidad de Málaga, Instituto de Hortofruticultura Subtropical y Mediterránea "La Mayora" (IHSM-UMA-CSIC), Málaga, Spain

#### Edited by:

Olivier Pruvost, UMR Peuplements Végétaux et Bio-Agresseurs en Milieu Tropical (CIRAD), France

#### Reviewed by:

Boris Alexander Vinatzer, Virginia Tech, United States Cindy E. Morris, INRA Centre Provence-Alpes-Côte d'Azur, France

> \*Correspondence: Antonio de Vicente adevicente@uma.es

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Plant Science

Received: 18 December 2018 Accepted: 15 April 2019 Published: 08 May 2019

#### Citation:

Gutiérrez-Barranquero JA, Cazorla FM and de Vicente A (2019) Pseudomonas syringae pv. syringae Associated With Mango Trees, a Particular Pathogen Within the "Hodgepodge" of the Pseudomonas syringae Complex. Front. Plant Sci. 10:570. doi: 10.3389/fpls.2019.00570

The Pseudomonas syringae complex comprises different genetic groups that include strains from both agricultural and environmental habitats. This complex group has been used for decades as a "hodgepodge," including many taxonomically related species. More than 60 pathovars of P. syringae have been described based on distinct host ranges and the disease symptoms they cause. These pathovars cause disease by relying on an array of virulence mechanisms. Among them, P. syringae pv. syringae (Pss) is the most polyphagous bacterium in the P. syringae complex, based on its wide host range, which primarily comprises woody and herbaceous host plants. In the early 1990s, bacterial apical necrosis (BAN) of mango trees, a critical disease elicited by Pss in Southern Spain, was described for the first time. Pss exhibits important epiphytic traits and virulence factors, which may promote its survival and pathogenicity in mango trees and in other plant hosts. Over more than two decades, Pss strains isolated from mango trees have been comprehensively investigated to elucidate the mechanisms that govern their epiphytic and pathogenic lifestyles. In particular, the vast majority of Pss strains isolated from mango trees produce an antimetabolite toxin, called mangotoxin, whose leading role in virulence has been clearly demonstrated. Moreover, phenotypic, genetic and phylogenetic approaches support that the Pss strains that produce BAN symptoms on mango trees all belong to a single phylotype within phylogroup 2, are adapted to the mango host, and produce mangotoxin. Remarkably, a genome sequencing project of the Pss model strain UMAF0158 revealed the presence of other factors that may play major roles in its different lifestyles, such as two different type III secretion systems, two type VI secretion systems and an operon for cellulose biosynthesis. The role of cellulose in increasing mango leaf colonization and biofilm formation, and in impairing the virulence of Pss, suggests that cellulose may play a pivotal role in the balance between its different lifestyles. In addition, 62-kb plasmids belonging to the pPT23A family of plasmids (PFPs) have been strongly associated with Pss strains that inhabit mango trees. Further, complete sequence and comparative genomic analyses revealed major roles of PFPs in detoxification of copper compounds and ultraviolet radiation resistance, both improving the epiphytic lifestyle of Pss on mango surfaces. Hence, in this review we summarize the research that has been conducted on Pss by our research group to elucidate the molecular mechanisms that underpin the epiphytic and pathogenic lifestyle on mango trees. Finally, future directions in this particular plant–pathogen story are discussed.

Keywords: Pseudomonas syringae pv. syringae, mango tree, epiphytic fitness, virulence strategies, mangotoxin, pPT23A family plasmid, ultraviolet radiation and copper resistance

#### Pseudomonas syringae pv. syringae STRAINS ISOLATED FROM MANGO TREES BELONG TO A SINGLE PHYLOTYPE AND HAVE FEATURES DISTINGUISHING THEM FROM THE REST OF THE Pseudomonas syringae COMPLEX

The Pseudomonas syringae complex has traditionally been used as a taxonomic hodgepodge that currently includes 15 recognized bacterial species and more than 60 different pathovars of the sensu stricto species P. syringae (Gomila et al., 2017). The taxonomy of the P. syringae complex has been widely discussed over the last 40 years, yet it remains controversial. Classification within this group is based on host range and symptomatology, dividing P. syringae species into pathogenic varieties known as pathovars (Dye et al., 1980; Young, 2010). The pathovar-based classification is widely accepted even today, but it does not reveal the genetic relationships between pathovars. Initial genomic studies were based on DNA-DNA hybridization methods (Palleroni et al., 1972; Pecknold and Grogan, 1973; Denny et al., 1988; Gardan et al., 1992; Janse et al., 1996). Gardan et al. (1999) described nine discrete genomospecies, a classification that was widely accepted until recently. Phylogenetic approaches based on multilocus sequence typing analysis (MLST) have had a significant impact on P. syringae classification (Sarkar and Guttman, 2004; Hwang et al., 2005; Almeida et al., 2010; Bull et al., 2011; Berge et al., 2014). Although the classification proposed by Berge et al. (2014) is generally accepted, a recent comparative genomics study of whole genome sequences proposed a phylogenomic delineation of the P. syringae complex and confirmed, as one might expect, that a high proportion of strains were misclassified (Gomila et al., 2017). Significantly, P. syringae strains isolated from different sources (i.e., snow, irrigation water, and a diseased crop) have been identified as belonging to the same evolutionary lineage (Monteil et al., 2016). This suggests that the evolutionary history of the plant pathogen P. syringae is linked to the water cycle, which promoted the colonization of agricultural and non-agricultural habitats (Morris et al., 2008).

Pseudomonas syringae species possess a great diversity of virulence factors, such as a type III secretion system (T3SS) and its effector repertoires, toxic compounds, exopolysaccharides, ice nucleation activity, cell-wall-degrading enzymes and plant hormones, that make it the model phytopathogenic bacterium for understanding plant–pathogen interactions. Additionally, adaptation mechanisms to its plant hosts and microbial evolution have more recently become of great interest to many research groups (Xin et al., 2018). In particular, Pseudomonas syringae pv. syringae (Pss) has been described as the most polyphagous bacterium within the P. syringae complex due to its broad host range (Kennelly et al., 2007). Pss strains isolated from mango trees were identified as the causative agent of bacterial apical necrosis (BAN) disease of mango trees, which is the most limiting factor for the mango crop in the Mediterranean region (Cazorla et al., 1998). A novel antimetabolite toxin called "mangotoxin" was reported to be intimately associated with all Pss strains isolated from mango trees, and with a few Pss strains from other hosts (Arrebola et al., 2003). Different variants of copper resistance genes, as well as ultraviolet resistance determinants, were found to be associated with 62-kb plasmids belonging to the pPT23A family plasmids (PFPs) (Cazorla et al., 2002, 2008; Gutiérrez-Barranquero et al., 2013b). In addition, several studies have attempted to unravel the biosynthesis pathway and the regulatory mechanisms of mangotoxin production (Arrebola et al., 2007; Arrebola et al., 2012; Carrión et al., 2012, 2014). A molecular evolutionary approach using the mangotoxin biosynthetic operon revealed that this operon is specifically distributed within P. syringae Genomospecies 1 and was acquired only once during evolution (Carrión et al., 2013). Moreover, a diversity survey of Pss strains isolated from mango trees was performed using phenotypic, genetic and phylogenetic approaches based on MLST analysis (Gutiérrez-Barranquero et al., 2013a) in order to understand the epidemiology of BAN disease. This study strongly indicated that Pss strains isolated from mango trees form a single phylotype within Pss, characterized mainly by its adaptation to the mango host and by the production of mangotoxin. Subsequently, thanks to the genome sequencing project of the model strain Pss UMAF0158 (Martínez-García et al., 2015), a gene cluster involved in the production of cellulose was discovered (Arrebola et al., 2015). This study demonstrated that cellulose is an important exopolysaccharide (EPS) for attachment to the mango surface and could also act as a switch modulating the transition from the epiphytic to the pathogenic phase of Pss on the mango host. Finally, a PFP sequencing project determined the importance of the 62-kb plasmids in improving the epiphytic survival of Pss strains isolated from mango trees (Gutiérrez-Barranquero et al., 2017a).

Therefore, this review summarizes the work that has been conducted on Pss strains isolated from mango trees over more than two decades of research. This phytopathogenic bacterium has emerged as a particular pathogen that has developed important features that modulate its epiphytic and pathogenic lifestyle phases on the mango tree surface.

#### Pseudomonas syringae pv. syringae, THE CAUSAL AGENT OF BACTERIAL APICAL NECROSIS OF MANGO TREES

Mango crops (Mangifera indica L.) are present in many tropical and subtropical regions and represent one of the most important subtropical fruit crops distributed worldwide (Galán-Saúco, 2015). This crop was established in Southern Spain, in Malaga, in the early 1980s. Planting has expanded rapidly in recent years, from 800 hectares (ha) in 2004 to 4,500 ha in 2016 in Spain, of which more than 2,000 ha are in full production (Gutiérrez-Barranquero et al., 2017b). Very recent data claim that there are more than 6,000 ha, of which more than 3,000 are currently in full production, which would break the historical record of more than 30,000 tons of mango fruit harvested (Anonymous, 2018). Thus, the mango crop has been considered one of the most promising crops in Southern Spain, mainly in the tropical coastal areas of Malaga and Granada. As new crops are deployed in new regions, there might be spill-over effects and the emergence of new diseases. The commercial viability of this crop has been threatened by different bacterial and fungal plant pathogens (Bradbury, 1986; Gagnevin and Pruvost, 2001; Gutiérrez-Barranquero et al., 2019). In Southern Spain, the fungal pathogen Fusarium mangiferae, which causes mango malformation disease (Crespo et al., 2012), and Pss, the causal agent of BAN disease (Cazorla et al., 1998), are the most severe phytopathogens, causing important economic losses. The main symptomatology associated with BAN disease, the isolation and identification of Pss as the causal agent of BAN disease, and the control methods specifically tested to limit and prevent Pss infections are discussed in detail below.

#### BAN Disease Symptomatology

The mango crop develops well at temperatures between 20 and 25°C, reaching a dormancy period when the temperature is below 15°C (Samson, 1986; Galán-Saúco, 2015). Thus, cool temperatures and wet periods play an important role in favoring the development of BAN symptoms, which has also been described in other infections caused by P. syringae in other woody hosts (Kennelly et al., 2007). Rain or dew are essential for inoculum dissemination to other buds and leaves, and wind exposure facilitates BAN development by causing microinjuries (Cazorla et al., 1998). BAN disease on mango trees is characterized by rapidly expanding necrotic spots on buds and leaves from October–November. January–February are the coolest and rainiest months in Southern Spain, giving rise to the highest incidence of necrotic symptoms, which is consistent with the period with the largest Pss population on mango trees (Cazorla et al., 1998). Additionally, at this time the symptoms can extend from buds through the leaf petiole to reach the leaves and stems. Typically, lesions on leaves start as interveinal, angular, water-soaked spots that may coalesce, becoming black and slightly raised. Importantly, favorable weather conditions for the pathogen that are maintained throughout the winter and even into the spring season can promote the appearance of wood necrosis on branches to such a degree that, in extreme cases, this can lead to the death of the tree. These symptoms are quite similar to those described for blossom blast of pear and stone fruits (English et al., 1980). Additionally, a white milky gum exudate can also be observed. Necrotic symptoms affecting flower panicles are less frequently observed but can become very apparent in years with severe attacks. These symptoms cause the most severe economic losses due to decreases in fruit yield (Cazorla et al., 1998). The typical symptoms of BAN disease of mango trees are summarized in **Figure 1**.

### Unraveling the Causative Agent of BAN Disease

FIGURE 1 | Typical symptoms of bacterial apical necrosis (BAN) disease on mango trees. (A) Healthy mango tree. (B) Mango tree affected by BAN disease. (C) Healthy mango apical bud. (D) Typical gum exudate on mango apical bud. (E) Initial necrotic spots on mango apical bud. (F) Severe necrosis of mango apical bud. (G) Progression of necrotic symptoms from apical bud to leaves through the petiole. (H) Dead mango apical bud and surrounding leaves. (I) Flower panicles. Yellow arrow: healthy mango flower panicle; red arrow: necrosis on mango flower panicle.

The phytopathogenic bacterium P. syringae has the ability to survive as an epiphyte on plant surfaces. During its epiphytic phase, P. syringae has to cope with different abiotic factors by using different mechanisms (Sundin and Jacobs, 1999; Yu et al., 1999; Lindow and Brandl, 2003), which allow it to achieve large population sizes before starting an infection process (Hirano and Upper, 2000). Although P. syringae can elicit disease symptoms in a wide variety of woody and herbaceous plants, it has been considered a weak pathogen because the infection process on its plant hosts can be strongly enhanced by frost damage or mechanical injury. Thus, P. syringae can elicit disease outbreaks in temperate regions distributed worldwide in important crops, causing significant yield losses (Kennelly et al., 2007). Since the early 1990s, necrotic symptoms have been observed in apical buds, leaves and stems in mango trees in Southern Spain and Portugal (Cazorla et al., 1998). In years with severe attacks, which correlate with cool and wet winters, necrotic symptoms were more evident in the mango tree canopy and could cause a reduction of 30–50% in mango fruit production (Gutiérrez-Barranquero et al., 2012). Preliminary isolation from the edge of necrotic tissues of mango trees revealed that over 90% of bacterial isolates recovered were fluorescent Pseudomonas. Similar necrotic symptoms have been reported in many other woody hosts infected by Pss, such as peaches (Endert and Ritchie, 1984), citrus (Mirik et al., 2005; Ivanović et al., 2017), cherry (Sundin et al., 1989; Wenneker et al., 2013), almond (Lindow and Connell, 1984), apple (Mansvelt and Hattingh, 1986; Gasic et al., 2018) and pear (Montesinos and Vilardell, 1991; Xu et al., 2008). Different biochemical and physiological characteristics suggested the tentative identification of P. syringae. Furthermore, ice nucleation activity (INA), a virulence trait well-documented in P. syringae that the bacterium uses to cause micro-wounds on the plant surface and so gain an entryway into the plant (Hirano and Upper, 1995; Hwang et al., 2005), was found in bacterial isolates following a protocol previously described by Cazorla et al. (1995). 
The production of lipodepsipeptidic toxins typically associated with P. syringae, such as syringomycin and syringopeptins, was also confirmed in bacterial isolates from mango (Gross and DeVay, 1977; Ballio et al., 1991; Arrebola et al., 2003). All the results obtained confirmed that the bacterial isolates associated with necrotic symptoms in mango trees belonged to the P. syringae species (Cazorla et al., 1992, 1998). P. syringae is a highly heterogeneous species comprising more than 60 pathovars (Young, 2010). To determine which pathovar was the causal agent of necrotic symptoms, different pathogenicity tests were performed in tomato and lilac plants, immature lemon and pear fruits, and bean pods (Lelliott and Stead, 1987). All P. syringae strains assayed induced symptoms typical of Pss in all plant hosts. Once the bacterial strains associated with necrotic symptoms in mango trees were identified, a pathogenicity test in mango plants was carried out in order to fulfill Koch's postulates. Two different experiments under field conditions were performed using 2-year-old mango plants growing in pots. Buds and stems were inoculated with 10 µl of bacterial suspensions using a microsyringe. Necrotic symptoms developed in the inoculated mango trees, and the incidence and severity of necrotic symptoms differed between experiments (i.e., between years), indicating the importance of the weather conditions in symptom development, as has been previously observed for P. syringae in other hosts (Hirano and Upper, 2000). Re-isolation from the necrotic lesions artificially reproduced in mango tissues, followed by identification of the isolates, confirmed that Pss was the causal agent of bacterial apical necrosis (BAN) of mangos (Cazorla et al., 1998).

Therefore, the life cycle of Pss on mango trees is clearly divided into two phases: first, an epiphytic phase, in which Pss has to survive and grow under harsh environmental conditions, and second, a pathogenic phase, in which BAN symptoms are produced. In both phases, different genetic traits are expressed either to improve survival or to enhance the infection process (**Figure 2**).

#### Control Options for BAN Disease

Management of woody plant diseases caused by P. syringae, and particularly those provoked by Pss, is a major concern for growers worldwide due to the broad host range. Sprays of copper compounds have been used for decades as standard bactericides to combat many bacterial diseases, but their use is subject to a number of constraints (Kennelly et al., 2007). The most common treatment for controlling BAN disease in Southern Spain is the spraying of a copper compound with a film-forming mode of action known as Bordeaux mixture (BM). However, different copper-based compounds fail to protect against BAN. Unfortunately, continuous treatments with copper sprays can lead to many problems. The efficacy of copper treatments for the control of bacterial diseases is often limited, largely due to the selection of copper-resistant strains; this has previously been described for Pss strains isolated from mango trees (Cazorla et al., 2002). Another serious problem associated with the excessive usage of copper is that copper is a major heavy metal contaminant that accumulates in soil from different sources (Wang, 1997; Xiong, 1998; Kabata-Pendias, 2001). Copper has demonstrated toxicity to roots and young shoots and leaves (Kairu et al., 1985; Alva and Graham, 1991; Iannotta et al., 2007), and has sustained bioaccumulation effects (Xiong and Wang, 2005). Finally, the European Union has introduced legislation limiting the use of copper compounds in Regulation No. 473/2002 (Anonymous, 2002). For all of these reasons, growers and extension services have expressed an urgent need for alternative treatments to copper compounds that may be effective for the control of BAN disease. In this context, Cazorla et al. (2006) evaluated the capacity of several different control treatments to cope with BAN disease in mango crops. In addition, the mechanisms of action of the different treatments were examined by analyzing their effect on Pss population levels. 
The treatments assayed in this work included BM, fosetyl-Al, gibberellic acid, acibenzolar-S-methyl, silicon gel (soluble potassium silicate 34%) and combined treatments (Cazorla et al., 2006). Interestingly, treatments reduced symptoms but did not reduce the size of the pathogen population, suggesting a non-bactericidal mode of action of these compounds. After evaluation of the different treatments, this study concluded that the best treatment to control BAN disease was the conventional copper-based treatment BM. However, other assayed treatments showed promising effects against BAN disease, indicating that a few of them could be interesting alternatives to traditional chemical control (Cazorla et al., 2006). The silicon gel was highly relevant, because its reduction of necrotic symptoms in apical buds was similar to the levels obtained with BM; it also has potential for use in organic farming.

FIGURE 2 | The life cycle of Pseudomonas syringae pv. syringae on mango trees. (A) The epiphytic phase of P. syringae pv. syringae on mango trees develops mainly in the spring/summer seasons, when high temperature and high UV radiation are present. At the population level, P. syringae pv. syringae is present mainly on bud and leaf surfaces forming microcolonies (1), which subsequently form a mature biofilm with the biosynthesis of an extracellular matrix (2). At the single-cell level, the rulAB operon, encoded by the 62-kb PFP plasmid and involved in resistance to UV radiation, and the wss operon, present on the chromosome and involved in the biosynthesis of cellulose, are both highly expressed. In contrast, the copABCD operon, encoded by the 62-kb PFP plasmid and involved in copper resistance, and the mbo operon, located on the chromosome and involved in mangotoxin biosynthesis, are less expressed. (B) The pathogenic phase of P. syringae pv. syringae on mango trees arises primarily in the autumn/winter seasons, when low temperatures, low UV radiation and high rainfall are present. At the population level, the infection process on mango leaves and buds is the following: (3) epiphytic survival and biofilm formation; (4) biofilm disassembly and bacterial migration; (5) ice nucleation activity to damage mango surfaces; (6) bacterial entry into cells through microinjuries; (7) bacterial entry into cells through stomata; (8) release of phytotoxins; and (9) release of type III effectors by using the type III secretion system. At the single-cell level, the copABCD operon, involved in detoxification of copper compounds, is first highly expressed in response to copper treatments applied by farmers. Then, all genes that encode virulence factors (mangotoxin, lipodepsipeptidic toxins, and the type III secretion system and its effectors) are highly expressed to elicit the typical BAN disease symptoms.

The limitations on the use of copper compounds, together with the increasing demand for organic crops, have led to in-depth analysis of different alternative treatments to combat plant diseases. In particular, Gutiérrez-Barranquero et al. (2012) analyzed different alternative treatments, including the silicon gel that had previously shown potential to control BAN disease. Trials at different scales (small, semi-commercial, and commercial) confirmed the efficacy of silicon gel in controlling BAN disease, reducing the occurrence of necrotic symptoms to a level similar to that of the conventional treatment BM. Moreover, mango growers directly observed the effectiveness of silicon gel, and thus, this treatment has been registered for commercial use in mango crops in Spain as a phytostrengthener compatible with organic farming (Gutiérrez-Barranquero et al., 2012). Interestingly, silicon gel failed to reduce the bacterial population in mango trees, suggesting a film-forming mode of action that acts as a physical barrier against the entry of the pathogen, as previously reported for BM (Becerra, 1995). A similar mode of action has been previously described for silicon protective effects in other plant hosts against fungal and bacterial pathogens (Diogo and Wydra, 2007; Guével et al., 2007; Sun et al., 2010). However, other putative modes of action for silicon gel cannot be ruled out, such as the induction of systemic resistance (ISR) (Bélanger et al., 2003; Rodrigues et al., 2003; Rodgers-Gray and Shaw, 2004; Fauteux et al., 2005) and the enhancement of cell wall lignification (Kim et al., 2002).

#### EPIPHYTIC FITNESS DETERMINANTS: IMPROVING SURVIVAL OF P. syringae pv. syringae ON MANGO SURFACES

Plant surfaces are hostile and dynamic environments for plant-associated bacteria due to rapidly changing climatic conditions (Lindow and Brandl, 2003). P. syringae is an epiphytic bacterium and an opportunistic plant pathogen that needs to survive on plant surfaces (Hirano and Upper, 2000). Before initiating infection, P. syringae has to face environmental abiotic stressors via different survival mechanisms (Sundin and Jacobs, 1999; Yu et al., 1999; Lindow and Brandl, 2003). The life cycle of Pss on mango plant surfaces (as depicted in **Figure 2**) involves an epiphytic phase mainly during the spring and summer seasons, that subsequently leads to an infection process during the autumn and winter seasons, when the weather conditions are favorable for the disease development (Cazorla et al., 1998).

Pseudomonas syringae pv. syringae isolated from mango trees has therefore developed different strategies to survive on the mango plant surface. Where present, the 62-kb PFP plasmids play a key role (Cazorla et al., 2002, 2008; Arrebola et al., 2009; Gutiérrez-Barranquero et al., 2013b, 2017a). More recently, other important genes located on the chromosome have been described as having a primary role in adhesion and subsequent biofilm formation on mango plant surfaces (Arrebola et al., 2015).

#### Copper and Ultraviolet Resistance Genes Mainly Encoded by PFP Plasmids Are Essential for Epiphytic Survival on Mango Tree Surfaces

Plasmids have been reported to be one of the most important sources for bacterial evolution, due to their ability to acquire foreign DNA and be rapidly transmitted among bacteria via the horizontal gene transfer process (Vivian et al., 2001; Norman et al., 2009). Plasmids are part of the flexible genome and represent a portion of the genome that does not contribute to basic survival functions. However, plasmids encompass important genes that can improve the ecological fitness of their bacterial hosts (Medini et al., 2005; Sundin, 2007) and improve virulence mechanisms (Jackson et al., 1999; Arnold et al., 2001). The PFPs are a family of native plasmids that appear to be indigenous to P. syringae. All PFP plasmids share a major replication protein, gene repA (Sesma et al., 1998, 2000). Apart from specific genes involved in self-maintenance and replication processes of PFPs, different genes implicated in virulence and/or ecological fitness are encoded. In particular, copper- and ultraviolet radiation-resistance genes are two of the most widely distributed genes in this family of plasmids, which play a fundamental role in epiphytic survival (Sundin, 2007).

As mentioned previously, the use of copper compounds has been strongly associated with agriculture (Lamichhane et al., 2018). The extensive use of copper by growers led to an increase in the dosage and frequency of applications, giving rise to the emergence of copper-resistant strains, a concerning issue that is very common among plant pathogenic bacteria, such as P. syringae (Sundin et al., 1989; Andersen et al., 1991; Sundin and Bender, 1993; Scheck and Pscheidt, 1998). In Southern Spain, different copper compounds have been widely used to control BAN disease in mango trees, as well as other plant diseases. This suggests that the selection of copper-resistant strains could be a major reason for further control failures with copper bactericides. The copABCD operon is the most common genetic determinant associated with copper resistance in P. syringae and has been reportedly associated with conjugative native PFP plasmids (Bender and Cooksey, 1986; Cooksey, 1987; Lim and Cooksey, 1993; Sundin and Bender, 1996). The copABCD operon encoded by a 35-kb plasmid from P. syringae pv. tomato was the first of these genes to be sequenced (Mellano and Cooksey, 1988). Based on this background, a study was performed to analyze the role of the copABCD operon in copper treatment tolerance, as well as its association with PFP plasmids in Pss strains isolated from mango trees (Cazorla et al., 2002). Over 75% of the copper-resistant strains harbored 62-kb plasmids that showed a hybridization signal by Southern blot analysis with the copABCD probe obtained from P. syringae pv. tomato PT23 (Bender and Cooksey, 1987). The copABCD operon is also encoded, albeit to a lesser extent, in the chromosome, as well as in 120- and 45-kb plasmids. This observation suggested that different variants of copper resistance determinants could be found in Pss mango populations, as has previously been reported in other Pss populations (Sundin and Bender, 1993; Rogers et al., 1994). These data were also supported by the restriction profiles of the 62-kb plasmids, which differed among both copper-resistant and copper-sensitive plasmids. Moreover, mating experiments proved that those plasmids were conjugative and were involved in the copper resistance phenotype. The presence of copper-resistant conjugative plasmids could be considered the main cause of control strategy failures when treating with copper bactericides. Thus, field experiments where copper treatments were applied to mango trees once per month, from September to June, were analyzed to assess the emergence of copper-resistant strains. It was clearly demonstrated that excessive usage of copper in mango trees to control BAN disease promoted an increase in copper-resistant strains, which could be mainly due to the ability of these plasmids to be transmitted by conjugative processes (Cazorla et al., 2002).

Subsequently, based on a PFP sequencing project that included strains that harbored different variants of copper-resistance determinants (Gutiérrez-Barranquero et al., 2017a), it was shown that the presence of a novel genetic structure in the Pss UMAF0081 strain isolated from mango increased copper-resistance phenotypes. This novel genetic structure encoded the cusCBA genes (detoxifying monovalent cations of silver and copper) and copG, a putative metal-transporting P-type ATPase, both inserted within the copABCD operon (Gutiérrez-Barranquero et al., 2013b). Furthermore, the novel genetic structure was found in another strain of Pss analyzed in this study (Pss 6–9 strain isolated from sweet cherry), and was also present in another two strains from the database that belonged to different pathovars (ATCC1128 pv. tabaci and NCPPB1108 pv. tomato). This structure encompassed 15 genes that were more than 17 kb in size, according to recently updated data (Gutiérrez-Barranquero et al., 2017a). To determine whether those extra genes were responsible for the increase in copper resistance, the minimal inhibitory concentrations (MICs) of copper and other heavy metals were investigated. A collection of Pss strains isolated from mangos and other hosts, two strains from different pathovars, a transconjugant strain obtained previously (FF5-km + 62-kb 0081 plasmid; Cazorla et al., 2002), and two Pss FF5 transformants that harbored copG and cusCBA were independently evaluated. It was observed that the transconjugant strain showed the same MIC value for copper as the original 0081 strain; the transformed strains also showed increased MIC values in comparison with the copper-sensitive parental FF5 strain (Sundin and Bender, 1993). A growth curve performed in minimal medium supplemented with 0.8 mM of copper sulfate clearly demonstrated that copG and cusCBA were responsible for the increase in copper resistance. 
The role of cusCBA in detoxifying heavy metals has been previously reported in Cupriavidus metallidurans (Mergeay et al., 2003; Von Rozycki and Nies, 2009), Escherichia coli (Franke et al., 2003) and Pseudomonas putida KT2440 (Cánovas et al., 2003; Leedjärv et al., 2008). Finally, qRT-PCR experiments were performed to analyze the expression profiles of copG and cusA in the presence or absence of 0.8 mM copper sulfate. The results showed that the expression levels of cusA and copG increased 13- and 100-fold, respectively, in the presence of copper, and the expression of cusA was 3-fold higher than copG. These results confirmed the previous results obtained in the MIC and growth curve experiments, supporting the hypothesis that the novel rearrangement of three different genetic determinants into a conjugative plasmid increases copper resistance in P. syringae (Gutiérrez-Barranquero et al., 2013b). Thus, the presence of different copper-resistance structures associated primarily with 62-kb PFP plasmids has been demonstrated in Pss strains isolated from mango trees. However, little is known concerning the dynamics of maintenance or preference of the different types of 62-kb plasmids in Pss mango populations.
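As a rough illustration of how fold changes like those reported above are commonly derived from qRT-PCR data, the sketch below applies the widely used 2^-ΔΔCt calculation. The text does not state which quantification method or reference gene was used in the cited study, so the method, the function name `fold_change`, and the Ct values in the usage note are assumptions made purely for illustration and are not data from that work.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression (treated vs. control) by the 2^-ddCt method.

    Assumes ~100% amplification efficiency for both primer pairs, so each
    cycle of difference corresponds to a two-fold change in template.
    All arguments are threshold-cycle (Ct) values:
      *_target_* : gene of interest (e.g. a copper-resistance gene)
      *_ref_*    : reference (housekeeping) gene used for normalization
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    # A lower Ct means more transcript, hence the negative exponent.
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values, `fold_change(20.0, 15.0, 22.0, 15.0)` describes a target gene that crosses the threshold two cycles earlier with copper than without, which under these assumptions corresponds to a 4-fold induction.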

UV radiation affects bacterial communities that are intimately associated with plant surfaces; to overcome this growth-limiting environmental stress, different mechanisms have been developed (Beattie and Lindow, 1995; Sundin and Jacobs, 1999; Jacobs and Sundin, 2001). Among the different mechanisms described for avoiding UV damage, DNA repair mechanisms, such as the rulAB operon encoded by PFP plasmids, are the most relevant in Pss (Sundin et al., 1996; Sesma et al., 1998; Sundin and Murillo, 1999; Sundin et al., 2000). In Southern Spain, mango crops are exposed to high UV radiation, especially in the spring and summer seasons. These highly restrictive solar radiation conditions suggest that a similar rulAB-like operon could play an indispensable role in the epiphytic survival of Pss associated with mango trees. As noted above, there was a high incidence of 62-kb plasmids associated with Pss isolated from mango trees that belong to the PFP family, which were also strongly associated with the copper resistance phenotype (Cazorla et al., 2002). In this context, Cazorla et al. (2008) analyzed the presence of the rulAB-like operon and its role in UV radiation tolerance in the 62-kb PFP plasmids. Over 62% of the strains analyzed harbored a 62-kb plasmid. Additionally, it was observed that the Pss strains harboring 62-kb plasmids, rather than those lacking plasmids or having a different plasmid, were more tolerant to UVC exposure and were able to maintain higher population levels in vitro. However, UVC wavelengths do not naturally reach the earth's surface; thus, their impact on ecological fitness is low (Kim and Sundin, 2000). Subsequently, two different exposure conditions of UVA+B (high irradiation, and radiation similar to that of a summer day in Southern Spain) were tested, and in both conditions, the role of plasmids in UVA+B tolerance was demonstrated. 
This result reinforced the importance of 62-kb plasmids in epiphytic survival of Pss isolated from mango trees in Southern Spain. Finally, the role of 62-kb plasmids in UV tolerance was tested in vivo on mango leaf surfaces, evaluating different conditions (leaves in sunny and shady areas, and adaxial and abaxial parts of the leaves). Once again, a greater surviving population of Pss was observed in the strains harboring 62-kb PFP plasmids, although this difference was only notable on the adaxial side of leaves exposed to direct sunlight (Cazorla et al., 2008). Therefore, it has been clearly demonstrated that rulAB+ Pss strains show an advantage in epiphytic fitness, and thus this operon plays a relevant role in the growth and dispersion of Pss on mango surfaces during the harsh epiphytic phase experienced in Southern Spain. This competitive advantage may promote the selection and dispersion of these plasmids within the mango microbiome.

### Cellulose Production Modulates the Epiphytic and Pathogenic Lifestyle of Pseudomonas syringae pv. syringae on Mango Surfaces

Exopolysaccharides (EPS) have been reported to play essential roles in plant colonization and epiphytic survival of plant-associated bacteria (Pfeilmeier et al., 2016), including P. syringae (Yu et al., 1999). Different EPS have been associated with different functions of P. syringae during the epiphytic phase on the plant surface, as well as with its pathogenic lifestyle. Alginate is one of the most-studied EPS in P. syringae, and its involvement in osmotic stress tolerance, epiphytic survival, and virulence has been well-established (Yu et al., 1999; Freeman et al., 2013). Although the role of alginate and levan are not directly related to biofilm formation (Laue et al., 2006), their role in the initial stages of adhesion prior to biofilm development cannot be ignored (Yu et al., 1999). In addition, the putative role of levan as a nutrient source in mature biofilms, as well as its activity as a barrier blocking the recognition by the plant during pathogenesis, have been proposed (Kasapis et al., 1994; Laue et al., 2006). Cellulose is an important EPS that is well-documented in many bacterial species (Römling and Galperin, 2015). It is an integral part of extracellular matrix components of biofilms, mainly in environmental and pathogenic Pseudomonas (Ude et al., 2006; Römling et al., 2013). It is noteworthy that cellulose also exhibits major roles in the modulation of virulence mechanisms in both human and plant pathogenic bacteria (Römling et al., 2013). Based on a "genome mining" approach using the complete genome sequence of the model strain Pss UMAF0158 (Martínez-García et al., 2015), an orthologous gene cluster to the operon wss of Pseudomonas fluorescens SBW25 involved in cellulose biosynthesis was identified in the chromosome (Rainey and Travisano, 1998; Spiers et al., 2002). This gene cluster is organized as an operon, and encompasses 14,642 bp that encodes nine genes with putative functions associated with cellulose production and acetylation. 
Additionally, the evolutionary history of this gene cluster revealed that it was present in both pathogenic and non-pathogenic Pseudomonas. In addition, the flanking regions of the cellulose gene cluster were consistent between Pss UMAF0158 and other P. syringae cellulose-producing strains, suggesting an identical chromosome location.

Epiphytic colonization by P. fluorescens SBW25 and its survival on plant surfaces is primarily due to cellulose overproduction by the wss operon (Gal et al., 2003; Spiers et al., 2003). The role of cellulose in biofilm formation of P. syringae pv. tomato DC3000 has also been shown (Pérez-Mendoza et al., 2014). To determine the role of cellulose in the life cycle of Pss isolated from mangos, insertional mutants in the biosynthetic genes of the wss cluster, wssB and wssE (Römling, 2002), were constructed and proved to be impaired in cellulose production. Furthermore, a cellulose-overproducing strain was obtained via the transformation of Pss UMAF0158 with plasmid pVS61-WsR19 that contained wspR19 from P. fluorescens SBW25 (Ude et al., 2006). Scanning electron microscopy on mango buds and tomato leaves revealed the formation of microcolonies of the wild-type and overproducing strains immersed in the extracellular matrix, but not of the cellulose-defective mutants. Furthermore, adhesion experiments on mango leaves revealed that the amount of bacteria recovered was higher for the wild-type and overproducing strains than for the wss mutants. In contrast, although growth curves on minimal medium for the different strains exhibited similar patterns, the incidence (number of necrotic points developed) and the severity (necrotic area developed) on tomato leaflets were higher in the wss mutants, lower in the wild-type, and practically abolished in the cellulose-overproducing strain. The competitive index analysis supported these results, showing that the competitiveness of the overproducing strain was decreased during the plant infection experiments (Arrebola et al., 2015). These results indicate that cellulose plays a dual role in the epiphytic and pathogenic lifestyles of Pss on mango tree surfaces, which suggests that this trait is maintained in Pss mango populations for mango leaf and bud colonization and adaptation. 
The mechanisms regulating cellulose biosynthesis in Pss isolated from mango trees have not yet been determined, but some clues have been discovered in related Pseudomonas. The second messenger c-di-GMP controls cellulose biosynthesis in P. fluorescens SBW25 (Spiers et al., 2002) and regulates the switch between the static and motile phases in many different bacterial species (Römling et al., 2013). More recently, the transcriptional regulator AmrZ has been reported to be a key regulator in the biosynthesis of cellulose in P. syringae pv. tomato DC3000 (Prada-Ramírez et al., 2016). The regulon of the AmrZ transcriptional regulator includes putative c-di-GMP proteins such as AdcA and MorA; thus, AmrZ could be directly involved in cellulose biosynthesis by modulating the available pool of c-di-GMP.
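The competitive index analysis mentioned above is usually based on a simple ratio of ratios from a mixed inoculation. The sketch below shows the commonly used definition; the text does not give the exact formula or counts used by Arrebola et al. (2015), so the function name `competitive_index`, its parameters, and the CFU numbers in the usage note are hypothetical and serve only to illustrate the calculation.

```python
def competitive_index(test_input_cfu: float, ref_input_cfu: float,
                      test_output_cfu: float, ref_output_cfu: float) -> float:
    """Competitive index (CI) of a test strain versus a reference strain.

    CI = (test/reference ratio recovered from the plant)
         / (test/reference ratio in the inoculum)

    CI < 1 : the test strain is outcompeted by the reference strain
    CI = 1 : both strains are equally competitive
    CI > 1 : the test strain outcompetes the reference strain
    """
    if min(test_input_cfu, ref_input_cfu, ref_output_cfu) <= 0:
        raise ValueError("input counts and reference output must be positive")
    if test_output_cfu < 0:
        raise ValueError("test output count cannot be negative")
    output_ratio = test_output_cfu / ref_output_cfu
    input_ratio = test_input_cfu / ref_input_cfu
    return output_ratio / input_ratio
```

For example, with a hypothetical 1:1 inoculum of 1e6 CFU per strain and recovery of 2e5 CFU of the test strain against 1e6 CFU of the wild type, `competitive_index(1e6, 1e6, 2e5, 1e6)` gives a value of about 0.2, which would be read as reduced competitiveness of the test strain in planta.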

#### VIRULENCE FACTORS ASSOCIATED WITH Pseudomonas syringae pv. syringae STRAINS ISOLATED FROM MANGO TREES

As described by Salmond (1994), a virulence factor could be any molecule present on the bacterial cell surface or released from the cell that could influence the growth of the pathogen in plants, enhancing infection and subsequent disease development. Plant pathogenic bacteria have developed many different and specific virulence strategies to successfully infect their plant hosts. The identification, characterization and dissection of the modes of action of different virulence factors remain complex, despite the efforts of many research groups (Mansfield et al., 2012; Pfeilmeier et al., 2016). Whereas the traits that confer P. syringae pathogenicity are numerous and well-studied, the mechanisms underlying susceptibility of mango are unknown. This imbalance in our understanding of the mechanisms involved (well understood for the pathogen, poorly understood for the host) leads us to focus on the role of the pathogen during the interaction with the host. P. syringae in general, and Pss strains isolated from mango specifically, show a broad and sophisticated armament of different virulence factors (Ichinose et al., 2013), among which bacterial toxins are some of the most studied in depth.

#### Bacterial Toxins

Bacterial toxins are important virulence factors of P. syringae (Mitchell, 1991) and have been described to be involved in the development of chlorotic and necrotic disease symptoms in its plant hosts (Volksch and Weingart, 1998; Scholz-Schroeder et al., 2001). Lipodepsipeptidic toxins, such as syringomycins and syringopeptins, have been strongly associated with several pathovars of P. syringae and are mainly related to the production of necrotic symptoms (Gross and DeVay, 1977; Ballio et al., 1991; Adetuyi et al., 1995; Vassilev et al., 1996; Bender et al., 1999; Scholz-Schroeder et al., 2001). Pss strains isolated from mango trees were found to produce syringomycin by using growth inhibition tests toward Geotrichum candidum (Gross and DeVay, 1977) and Rhodotorula pilimanae (Iacobellis et al., 1992), and a specific gene involved in its biosynthesis was detected by a PCR protocol (Sorensen et al., 1998). In addition, Pss strains isolated from mango trees were also found to produce syringopeptins by using a growth inhibition bioassay of Bacillus megaterium (Lavermicocca et al., 1997). Another group of important toxins described in several pathovars of P. syringae are the so-called "antimetabolite toxins." This group of toxins blocks the function of enzymes involved in the biosynthetic pathways of crucial amino acids, as well as the biosynthesis of polyamine (Bender et al., 1999; Arrebola et al., 2011a,b). These toxins produce chlorotic symptoms in plant tissue due to the accumulation of different intermediates (Patil et al., 1972; Turner and Debbage, 1982; Bachmann et al., 1998). The best-known antimetabolite toxins produced by different pathovars of P. syringae are tabtoxin, phaseolotoxin, and the recently identified mangotoxin (Arrebola et al., 2003). 
Mangotoxin was initially identified to be produced mainly by Pss strains isolated from mango trees, although its production was also reported in a few Pss strains from other hosts (Arrebola et al., 2003). The biosynthesis pathway of mangotoxin, its regulation, and the role that this toxin plays in the different lifestyles of Pss-mango interactions are discussed extensively in the next section.

#### MANGOTOXIN, AN ANTIMETABOLITE TOXIN MAINLY ASSOCIATED WITH P. syringae pv. syringae STRAINS ISOLATED FROM MANGO TREES

Mangotoxin is the most recently discovered antimetabolite toxin and was first described as being produced mainly by Pss strains isolated from mango trees. The toxin was called "mangotoxin" after the plant host (mango tree) from which most of the mangotoxin-producing Pss strains were isolated (Arrebola et al., 2003). As mentioned above, antimetabolite toxins block the function of enzymes involved in the biosynthetic pathways of crucial amino acids and in the biosynthesis of polyamines (Bender et al., 1999; Arrebola et al., 2011a,b). The toxic activity of mangotoxin is reversed by the addition of ornithine, and thus, its target enzyme was identified as ornithine N-acetyl transferase (OAT) (Arrebola et al., 2003). In **Figure 3A**, a schematic representation of the arginine-glutamine and polyamine biosynthesis pathways shows the target enzymes of the different antimetabolite toxins, including mangotoxin. To decipher the chemical structure of mangotoxin, a physicochemical characterization was initially performed using cell-free filtrates. It revealed that mangotoxin is a small, secreted, hydrophilic molecule of less than 3 kDa that is extremely resistant to high pH and high temperature but sensitive to protease treatments. High-performance liquid chromatography (HPLC) analysis of a Tn5 mutant defective in mangotoxin production (UMAF0158-3aE10) and of the wild-type strain Pss UMAF0158 revealed a specific peak associated with mangotoxin activity (Arrebola et al., 2003). Other chemical separation techniques, such as hydrophilic interaction liquid chromatography (HILIC) and ion-exchange chromatography (FPLC), have also been applied to decode the mangotoxin structure (data not published). However, the efforts to unravel the chemical structure of mangotoxin have been in vain to date, largely due to its high chemical instability.

To understand the molecular basis of mangotoxin production, three mutants impaired in mangotoxin production (Pss UMAF0158-3γH1, -6γF6, and -5αC5) were studied in depth. These mutants were obtained from a genomic library and displayed growth characteristics and production of lipodepsipeptidic toxins similar to those of the wild-type strain UMAF0158 (Arrebola et al., 2007). The insertion in the mutant UMAF0158-6γF6 was located in a DNA region that showed high similarity to a non-ribosomal peptide synthetase (NRPS) present in Pss B728a, P. syringae pv. tomato DC3000 and P. syringae pv. phaseolicola 1448A. This ORF, called mgoA, is 3447 bp long, and the amino acid sequence of the encoded protein comprises an activation module with conserved domains typical of NRPSs (Stein and Vater, 1996; Marahiel et al., 1997). The role of mgoA in the virulence of Pss was demonstrated in tomato leaflets, where this mutant showed a lower disease incidence than the wild-type. Therefore, the NRPS gene mgoA was confirmed to be involved in mangotoxin biosynthesis and also in virulence (Arrebola et al., 2007). Furthermore, three additional genes were detected together with mgoA, and the four genes were designated mgoB, mgoC, mgoA, and mgoD, in accordance with the name mangotoxin-generating operon. Insertional mutants in mgoC, mgoA, and mgoD had altered mangotoxin production. Additionally, RT-PCR showed that all mgo genes were co-transcribed as a single polycistronic mRNA, thus forming an operon. Complementation experiments with the mgo operon restored the ability of the mutants to produce mangotoxin, and these results therefore strongly confirmed that the mgo operon was necessary for mangotoxin production (Arrebola et al., 2007, 2012). The mgo operon has been found to be well-distributed in the majority of Pseudomonas species, including different pathovars of P. syringae (Lindeberg et al., 2008; Vallet-Gely et al., 2010). A gene cluster homologous to the mgo operon, pvf, has been proposed to act as a regulator of virulence factors in Pseudomonas entomophila (Vallet-Gely et al., 2010). Recently, the family of pyrazine N-oxides (PNOs), including a novel dihydropyrazine N,N′-dioxide metabolite, was shown to be produced by the pvf gene cluster in P. entomophila, suggesting that these molecules could be involved in Pseudomonas signaling and virulence (Kretsch et al., 2018). In addition, the biosynthesis of fragin, the main antifungal compound produced by Burkholderia cenocepacia H111, is under the control of valdiazen, a novel quorum-sensing signaling molecule produced by a gene cluster homologous to the mgo and pvf operons (Jenul et al., 2018). Although the structure of the putative signaling molecule produced by the mgo operon in Pss isolated from mango trees remains unknown, its function as a regulator of the biosynthesis of mangotoxin, and likely of other secondary metabolites, is quite feasible.

Interestingly, two other Tn5 mutants in which mangotoxin production was abolished (UMAF0158-5αC5 and UMAF0158-4βA2), and which were therefore affected in virulence (as tested in a virulence assay on tomato leaflets), were studied in depth because the mutated regions did not show homology with the genome sequences of Pss B728a, P. syringae pv. tomato DC3000 or P. syringae pv. phaseolicola 1448A. The involvement of mangotoxin in the epiphytic survival of Pss strains isolated from mango was demonstrated by Arrebola et al. (2009). Epiphytic survival experiments on tomato leaflets revealed no difference between the wild-type Pss UMAF0158 and either mutant. Nevertheless, when the wild-type was co-inoculated with each of the mutants individually, a slight but significant decrease was observed in the mutant populations, and the difference reached almost one order of magnitude. Thus, in addition to its virulence function, mangotoxin could also play a role in improving the ecological fitness of Pss strains isolated from mango trees. Furthermore, screening of the genomic library for both mutant insertions showed that both were located in a cluster of six genes present in the wild-type strain Pss UMAF0158 but not in Pss B728a, P. syringae pv. tomato DC3000 or P. syringae pv. phaseolicola 1448A. Complementation experiments restored the ability of both mutants to produce mangotoxin (Carrión et al., 2012). These six genes were named mboA, B, C, D, E, and F, in accordance with the name mangotoxin biosynthetic operon, and RT-PCR and Northern blot analyses confirmed that they were co-transcribed as a single polycistronic mRNA molecule and thus form an operon. Furthermore, site-directed insertional mutations in each gene resulted in a complete abolition of mangotoxin production in the mboA, B, C, and D mutants and in altered phenotypes in the mboE and F mutants.
Transformation of different non-producing Pseudomonas strains with pLAC-AF (pBBR1-MCS5 + mboA-F), a plasmid that contains the six mbo genes, yielded mangotoxin producers. Therefore, all experiments strongly confirmed that the mbo genes are essential for full production of mangotoxin.

Carrión et al. (2014) demonstrated that mangotoxin production is under the control of both the gacS/gacA and mgo genes and, additionally, that the mgo genes are regulated by the gacS/gacA genes. Tn5 mutants that were all defective in mangotoxin production (mgoA mutant, mboD mutant, mboB mutant, gacS mutant, and gacA mutant) were used to unravel the regulation of the mangotoxin biosynthetic pathway. Transcriptional analysis by qRT-PCR showed that expression levels of the mboA, C, and E genes were significantly lower in the gacA and mgoA mutants than in the wild-type; however, the mgo and mbo mutations did not affect the transcription levels of the gacS/gacA genes. These results suggested that the gacS/gacA system controls both the mgo and mbo operons and that, downstream, the mgo operon controls the mbo operon and thereby mangotoxin production. Promoter fusion experiments using the mbo promoter showed high levels of β-galactosidase activity in the wild-type, whereas expression was significantly lower in the mgoA, gacA, and gacS mutants, supporting the previous results. Taken together, these data led to a proposed model for the regulation of mangotoxin production (**Figure 3B**) (Carrión et al., 2014). In this model, it is proposed that mgo molecules could serve as signaling molecules, as has been previously described in similar bacteria, and may be involved in the regulation of other virulence traits in Pss strains isolated from mango trees. Moreover, other functions in addition to virulence have been described for mangotoxin, and its putative role as a signaling molecule has been hypothesized.
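The hierarchy described above can be made concrete with a small sketch. The Python snippet below encodes the proposed regulatory relationships as a directed graph and lists everything downstream of a given regulator. It is only an illustration of the logic of the model, not part of the original study: the dictionary `REGULATION` and the functions `downstream` and `affects` are hypothetical names, and the edge from gacS/gacA to mbo is drawn as direct purely for simplicity, since the text does not settle whether that control is direct or runs only through mgo.

```python
from collections import deque

# Regulatory relationships as summarized in the text (Carrión et al., 2014).
# Each key positively controls the elements in its list. The names and the
# graph layout are illustrative assumptions, not data from the paper.
REGULATION = {
    "gacS/gacA": ["mgo", "mbo"],
    "mgo": ["mbo"],
    "mbo": ["mangotoxin"],
    "mangotoxin": [],
}


def downstream(graph, node):
    """Return every element reachable from `node` (breadth-first search)."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def affects(graph, regulator, target):
    """True if a defect in `regulator` is expected to propagate to `target`."""
    return target in downstream(graph, regulator)
```

Under these assumptions, `affects(REGULATION, "gacS/gacA", "mangotoxin")` is true while `affects(REGULATION, "mbo", "gacS/gacA")` is false, which mirrors the observation that mgo and mbo mutations did not alter gacS/gacA transcription.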

A diversity survey using different approaches (genetic, phenotypic, and phylogenetic) showed that Pss strains isolated from mango trees form a single phylotype within the pathovar syringae that is associated with the mango host, produces mangotoxin, and is distributed worldwide in areas where mango is grown and BAN is a relevant disease (Gutiérrez-Barranquero et al., 2013a). Although Pss strains isolated from mango are more similar to each other than to Pss strains isolated from other hosts or to other pathovars, phenotypic (including degree of virulence) and genetic variability has been observed among them (Gutiérrez-Barranquero et al., 2013a). Then, to determine the evolutionary history of the mbo operon, a phylogenetic analysis using the housekeeping genes rpoD and gyrB grouped all strains belonging to Genomospecies 1 together but separated them into three different clusters. Two of these clusters were associated with the presence of the mbo operon (Carrión et al., 2013). Group I mbo+ was mainly composed of strains of the pathovar syringae isolated from woody hosts, predominantly from mango trees; this group corresponds to the single phylotype of Pss strains associated with mango trees described by Gutiérrez-Barranquero et al. (2013a). Group II mbo+ was composed of five different pathovars of P. syringae isolated from herbaceous and woody plants (aptata, avellanae, japonica, pisi, and syringae), and group III was mainly composed of strains of the pathovar syringae that were negative for the presence of mbo genes. Interestingly, group III (the group that lacked the mbo operon) diverged before the separation of groups I and II. These results suggested that the mbo operon was acquired by groups I and II in only one or two acquisition events after their separation from group III. Thus, this work strongly suggested that the mbo operon was horizontally acquired only once during the evolution of the P. syringae complex and is specifically distributed within P. syringae Genomospecies 1 (Carrión et al., 2013). In the last few years, the number of P. syringae genome sequences available in the databases has grown explosively (Baltrus et al., 2011; Thakur et al., 2016), which has also contributed to a novel classification of the P. syringae complex into 13 different phylogroups (Berge et al., 2014). A more in-depth phylogenetic analysis has been performed including 150 strains of the P. syringae complex belonging to the phylogenetic groups 1, 2, 3, 4, 5, 6, 7, and 11 (**Figure 4**). Inside phylogenetic group 2, where the Pss strains isolated from mango are present, it is possible to observe the differentiation of three main groups, similar to those previously reported by Carrión et al. (2013). Group I mbo+ was mostly composed of strains of the pathovar syringae, mainly isolated from mango trees, corresponding to the single phylotype described. Group II mbo+ was composed of the five pathovars previously identified in this group, and the new analysis added two more pathovars to it (atrofaciens and coryli). Finally, a third group was composed mainly of strains of the pathovar syringae that were negative for the presence of mbo genes. This new phylogenetic analysis confirms the previous assumption that Pss strains isolated from mango form a single phylotype inside Genomospecies 1-phylogenetic group 2.

### Ice Nucleation Activity

Pseudomonas syringae infections tend to be favored by cool and wet conditions due to the ability of this bacterium to induce ice nuclei formation at warm, subfreezing temperatures (−2 to −4°C) (Lindow et al., 1982; Hirano and Upper, 1995). Ice nucleation activity (INA) is considered an important virulence factor, widespread throughout the P. syringae complex, that plays a major role in the early stages of infection by causing wounds that can facilitate disease, particularly in woody plant species (Lindow et al., 1982; O'Brien and Lindow, 1988; Hwang et al., 2005; Lamichhane et al., 2014). In this sense, Cazorla et al. (1995) developed a simple alternative multiple-tube test that detected active ice-nucleating bacteria with higher sensitivity than the traditional drop-freezing methods (Lindow et al., 1978). This method revealed that all Pss strains isolated from mango trees were positive for INA. Although the INA virulence factor could be important at the initial stages of BAN development, the low probability of frost in mango-producing areas makes its role in virulence largely anecdotal.

### Type III Secretion System

The most-studied and best-characterized virulence factor associated with P. syringae is the T3SS (Lindeberg et al., 2012). The T3SS is a complex nanomolecular machinery used by P. syringae and many other plant and animal pathogens to inject effector proteins into host cells to subvert the host immune system and induce disease development (Lindeberg et al., 2012). While the T3SS is the most-studied virulence factor in P. syringae-plant interactions (Collmer et al., 2000; Oh et al., 2010; Cunnac et al., 2011; Lo et al., 2017), the role that this secretion system might play in the development of BAN disease has not been examined in depth to date. At this stage, a genome sequencing project performed on the Pss model strain isolated from mango trees, UMAF0158, revealed the presence of two different T3SSs (Martínez-García et al., 2015). The first T3SS (T3SS-1) is similar to the Hrp-1 T3SS family (Egan et al., 2014) found in different pathovars of P. syringae (Lindeberg et al., 2012) and represents the canonical T3SS widely distributed in pathogenic P. syringae strains (Block and Alfano, 2011; Lindeberg et al., 2012). Pss strains isolated from mango trees were able to induce a hypersensitivity response (HR) in tobacco plants (Cazorla et al., 1998). The capability of P. syringae to provoke an HR in non-host plants is dependent on a functional T3SS (Huang et al., 1992). Thus, in Pss UMAF0158, a deletion mutant constructed in the hrpL gene (UMAF0158ΔhrpL), which encodes an alternative sigma factor that binds to the hrp box promoter sequence of the T3SS genes and upregulates their expression, confirmed the involvement of T3SS-1 in HR development (Martínez-García et al., 2015). The role of this particular T3SS in overall virulence has been widely recorded in Pss B728a, P. syringae pv. tomato DC3000, and many others (Schechter et al., 2006; Vinatzer et al., 2006; Kvitko et al., 2009; Lee et al., 2012).

sequences of rpoD and gyrB housekeeping genes using MEGA 7 software. Bootstrap values (1,000 repetitions) are shown on branches and evolutionary distances are in units of nucleotide substitutions per site. One hundred and fifty strains belonging to the phylogenetic groups 1, 2, 3, 4, 5, 6, 7, and 11 of the P. syringae complex are depicted in the circular phylogenetic tree. The strains belonging to phylogenetic group 2, where the P. syringae pv. syringae strains isolated from mango are found, are marked in blue. (B) Exclusive representation of phylogenetic group 2. Three main groups are defined according to the presence or absence of the mbo genes necessary for mangotoxin production. The topology was similar among phylogenetic trees produced by the maximum-parsimony and maximum-likelihood methods. Supplementary Table S1 provides the phylogenetic groups, the host of isolation and the accession numbers of the DNA sequences used for each strain represented in this phylogenetic analysis.

Additionally, bioinformatics analysis highlighted the presence of an additional T3SS (called T3SS-2) in the chromosome of Pss UMAF0158 (Martínez-García et al., 2015) that has also been found in different strains from different pathovars (Reinhardt et al., 2009; Studholme et al., 2009; Clarke et al., 2010; Matas et al., 2014). This T3SS-2 shows high similarity to the rhizobial-like T3SS (Rhc) of the Rhizobiales (Gazi et al., 2012; Egan et al., 2014). The typical hrp box promoter regulatory sequences of the HrpL regulon, which precede the genes of the canonical T3SS (Fouts et al., 2002), are missing in the T3SS-2. As has been demonstrated in other P. syringae strains, the T3SS-2 is dispensable for pathogenicity, although a possible role in plant surface colonization or interaction with insects cannot be ruled out (Lindgren et al., 1986; Clarke et al., 2010; Pérez-Martínez et al., 2010; Silby et al., 2011). Specific mutants in the T3SS-1, in the T3SS-2, and in both systems combined, constructed in Pss UMAF0158, did not reveal the function of the T3SS-2 in Pss isolated from mango trees, which remains unknown to date (Martínez-García et al., 2015).

With the release of the Plant-bacteria Interaction FActors Resource (PIFAR), an open-access web-based resource for genetic factors involved in bacterial interactions with plant hosts<sup>1</sup> (Martínez-García et al., 2016), the detection of type 3 effectors (T3Es) has become more accurate than with the method previously used to identify T3Es in Pss UMAF0158 (Martínez-García et al., 2015). Using the PIFAR tool, 15 putative T3Es have been identified in Pss UMAF0158, 4 more than the 11 previously identified. A Venn diagram analysis of the core T3Es has been performed (**Figure 5A**), comparing the Pss UMAF0158 genome with the genome sequences of three Pss strains (B728a, HS191, and B301D) and P. syringae Cit7, all belonging to Genomospecies 1 (Gardan et al., 1999) and Phylogenetic Group 2 (Berge et al., 2014). In addition, the presence or absence of the different T3Es in these strains is depicted (**Figure 5B**). hopA1, hopAX1, hopAZ1, and hopBK1 are T3Es that Pss UMAF0158 shares only with some of the other strains (Ps Cit7 and Pss HS191). On the other hand, hopA1, hopAX1, and hopBK1 have been found in other pathovars of P. syringae belonging to different phylogroups (Berge et al., 2014). Remarkably, the effector hopAX1 appears to be mainly associated with a few strains of different pathovars, all belonging to Genomospecies 1-Phylogenetic Group 2 (pv. aptata, pv. pisi, pv. aceris, and pv. syringae). Dillion et al. (2019) have recently described the high specificity of the hopAX1 T3E for Phylogenetic Group 2.

<sup>1</sup>http://bacterial-virulence-factors.cbgp.upm.es/PIFAR
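The repertoire comparison described above is, computationally, a set problem: the core T3Es are the intersection of all repertoires, and each region of a Venn diagram collects the effectors carried by exactly one combination of strains. The sketch below shows that logic in Python. It is a minimal illustration under stated assumptions: the function names `core_effectors` and `venn_regions` are hypothetical, and the strain and effector labels in `toy` are placeholders that do not represent the real T3E content of any P. syringae genome.

```python
def core_effectors(repertoires):
    """Effectors present in every strain (the centre of a Venn diagram)."""
    sets = [set(effectors) for effectors in repertoires.values()]
    return set.intersection(*sets)


def venn_regions(repertoires):
    """Group effectors by the exact set of strains that carry them."""
    regions = {}
    for effector in set().union(*repertoires.values()):
        carriers = frozenset(
            strain
            for strain, effectors in repertoires.items()
            if effector in effectors
        )
        regions.setdefault(carriers, set()).add(effector)
    return regions


# Hypothetical toy repertoires: placeholder names only, NOT real genome data.
toy = {
    "strain_1": {"effA", "effB", "effC"},
    "strain_2": {"effA", "effB"},
    "strain_3": {"effA", "effD"},
}

core = core_effectors(toy)
regions = venn_regions(toy)
shared_1_2_only = regions[frozenset({"strain_1", "strain_2"})]
```

With real presence/absence calls (for example, exported from an effector prediction resource), the same two functions would return the core set and the effectors that a strain shares with only a subset of the others.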

### P. syringae pv. syringae STRAINS ISOLATED FROM MANGO TREES IN THE GENOMIC ERA

High-throughput sequencing (HTS) technologies have had a large impact on plant pathology and other research areas. In recent years, there has been substantial growth in the genome sequencing of bacterial plant pathogens (Studholme et al., 2011), which can provide a strong basis for a better understanding of plant–microbe interactions that one day will contribute to the eradication of plant diseases. P. syringae is the model plant pathogen par excellence, the one most often used worldwide to dissect plant–pathogen interactions (Baltrus et al., 2017). Since the first genome sequence, that of the model strain P. syringae pv. tomato DC3000, the landscape has changed markedly, with many groups interested in P. syringae comparative genomics and evolution (Lovell et al., 2009; Green et al., 2010; Baltrus et al., 2011; McCann et al., 2013; Thakur et al., 2016; Hulin et al., 2018). Currently, the complete genome sequences of 29 P. syringae strains, along with more than 400 draft genome sequences, are included in the NCBI database<sup>2</sup>,<sup>3</sup>. To date, there is only one complete genome sequence of a Pss strain isolated from mango trees (chromosome + 62-kb PFP plasmid), that of the model strain Pss UMAF0158 (Martínez-García et al., 2015). This work revealed a high degree of conservation with other Pseudomonas from the P. syringae complex; however, different genetic factors were identified for their potential involvement in the epiphytic or pathogenic lifestyle, and these factors have been described in depth in this review. Among these factors, the most important were the presence of the mbo operon (mangotoxin biosynthetic operon), the presence of the wss operon (operon involved in cellulose biosynthesis), the additional type III-like rhizobial secretion system, the additional type VI secretion system, and a particular T3E repertoire.

Recently, a PFP sequencing project that included four 62-kb PFP plasmids from different Pss strains isolated from mango trees was carried out (Gutiérrez-Barranquero et al., 2017a). This work revealed that the main functions of the 62-kb plasmids of Pss strains isolated from mango trees are related to increased tolerance to UV radiation and copper treatments. The backbone of the different plasmids, comprising the genes involved in maintenance, replication and conjugation, was similar and showed a high degree of synteny. Interestingly, these plasmids were included in the previously described subgroup B (Ma et al., 2007), sharing more than the repA gene (the replicase gene shared by all PFP plasmids; Sundin, 2007). In addition, a novel genetic structure likely related to a cell-to-cell communication signaling system appeared in those plasmids upstream of the type IV secretion system, suggesting that the conjugation process could be under the regulation of this signaling mechanism (Gutiérrez-Barranquero et al., 2017a). On the other hand, there is a relatively low degree of homology among the remaining genes found in each 62-kb PFP plasmid.

#### CONCLUDING REMARKS AND FUTURE DIRECTIONS

The enormous efforts carried out over the last two decades have led to a more in-depth understanding of the P. syringae pv. syringae-mango host interactions. Pss causes important economic losses in mango crop production in the Mediterranean region. Pss strains isolated from mango trees form a single phylotype within the pathovar syringae and exhibit important factors that contribute to the establishment of the epiphytic and pathogenic phases on the mango plant, revealing a deep interaction between the pathogenic microbe and the host plant. It is worth noting that, although the traits of P. syringae involved in pathogenic and epiphytic lifestyles have been studied in depth, the mechanisms underlying the association of Pss with the mango host in particular are poorly understood. Thus, the major traits analyzed in depth in this review would help Pss to interact successfully with mango trees, but some of them are also useful in the interaction of other P. syringae strains with other plant hosts. Mangotoxin is the main virulence factor of this particular group of bacteria, and although much attention has been paid to it, the structure of this toxic molecule remains elusive. In addition, the possible role of mangotoxin as a signaling molecule modulating specific gene expression has been hypothesized. Further experiments are currently being carried out to confirm this hypothesis. Additionally, another important virulence factor that has not been well studied in these strains is the T3SS. In Pss isolated from mango trees, an extra copy of the T3SS is present. However, despite the efforts made, its role in the ecology of Pss remains unknown. Another relevant factor recently discovered in Pss strains isolated from mango trees is the presence of a cellulose biosynthetic gene cluster. The cellulose gene cluster has been described as involved in adhesion and biofilm formation by Pss on mango leaf surfaces. This gene cluster is only present in a few strains of P. syringae but is present in all Pss strains isolated from mango trees, suggesting that it is a crucial factor in the adaptation to the mango host. Its role in modulating epiphytic and pathogenic phases on mango surfaces has also been addressed. In addition, 62-kb PFP plasmids have been shown to play a key role in the epiphytic survival of Pss on mango, harboring UV and copper resistance determinants, among others. This long-lasting interaction between Pss and mango led us to search for effective control methods to allow farmers to deal with BAN symptoms. The efficacy of the alternative treatment silicon gel compared to spraying with the copper compound BM has been demonstrated, and silicon gel has finally been registered for commercial use in mango crops in Spain to combat BAN disease.

<sup>2</sup>https://www.ncbi.nlm.nih.gov/genome/185

<sup>3</sup>https://www.ncbi.nlm.nih.gov/genome/2253

Given all of this, the future directions of this research are currently focused on two aims: (1) to unravel the signaling mechanisms of Pss in interactions with other bacterial members of the mango microbiome by analyzing transcript-level expression using in vitro and in vivo approaches; and (2) to carry out comparative genomics and evolutionary history analyses. In spite of the massive development of genomic sequencing technologies, there is a lack of genomic data from Pss strains isolated from mango trees. Thus, a great effort is currently being made to perform a major genome sequencing project involving a number of different strains to unravel the evolutionary processes that have occurred in mango-associated populations from different geographical regions, separated in time. Phylogenetic and evolutionary approaches will open new windows of research that allow us to better understand why this phytopathogenic bacterium is so peculiar.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.


#### FUNDING

This work was supported by grants from CICE-Junta de Andalucía, Proyecto de Excelencia (P12-AGR-1473) co-financed by FEDER (EU). JG-B was supported by a Postdoctoral Fellowship from the Research Own Plan of the University of Malaga "Ayuda de Incorporación de Doctores 2017".

#### ACKNOWLEDGMENTS

We are very grateful to all the people who were directly involved in the development of this research: José María Farré, José María Hermoso, Emilio Guirado, David Sarmiento, Alejandro Pérez-García, Juan C. Codina, Eva Arrebola, Victor J. Carrión, and Jesús Murillo. We would like to thank SAT 2803 TROPS and all collaborating farmers. We also extend a special thanks to Irene Linares for her collaboration and technical support. This work is especially dedicated to the memory of our colleague Juan A. Torés Montosa, who sadly passed away in July 2018. He was one of the original researchers responsible for the discovery of BAN disease, and the development of this research line in our laboratory.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00570/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gutiérrez-Barranquero, Cazorla and de Vicente. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Analyses of Seven New Genomes of Xanthomonas citri pv. aurantifolii Strains, Causative Agents of Citrus Canker B and C, Show a Reduced Repertoire of Pathogenicity-Related Genes

Natasha Peixoto Fonseca<sup>1</sup>†, José S. L. Patané<sup>2</sup>†, Alessandro M. Varani<sup>3</sup>, Érica Barbosa Felestrino<sup>1</sup>, Washington Luiz Caneschi<sup>1</sup>, Angélica Bianchini Sanchez<sup>1</sup>, Isabella Ferreira Cordeiro<sup>1</sup>, Camila Gracyelle de Carvalho Lemes<sup>1</sup>, Renata de Almeida Barbosa Assis<sup>1</sup>, Camila Carrião Machado Garcia<sup>1</sup>, José Belasque Jr.<sup>4</sup>, Joaquim Martins Jr.<sup>5</sup>, Agda Paula Facincani<sup>3</sup>, Rafael Marini Ferreira<sup>3</sup>, Fabrício José Jaciani<sup>6</sup>, Nalvo Franco de Almeida<sup>7</sup>, Jesus Aparecido Ferro<sup>3</sup>, Leandro Marcio Moreira<sup>1,8</sup>\* and João C. Setubal<sup>5</sup>\*

<sup>1</sup> Programa de Pós-graduação em Biotecnologia, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil, <sup>2</sup> Laboratório Especial de Ciclo Celular, Instituto Butantan, São Paulo, Brazil, <sup>3</sup> Departamento de Tecnologia, Universidade Estadual Paulista, UNESP, Campus de Jaboticabal, Jaboticabal, Brazil, <sup>4</sup> Departamento de Fitopatologia e Nematologia, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba, Brazil, <sup>5</sup> Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, Brazil, <sup>6</sup> Fundo de Defesa da Citricultura (FUNDECITRUS), São Paulo, Brazil, <sup>7</sup> Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Campo Grande, Brazil, <sup>8</sup> Departamento de Ciências Biológicas, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, Brazil

Xanthomonas citri pv. aurantifolii pathotype B (XauB) and pathotype C (XauC) are the causative agents of citrus canker B and C, respectively, diseases of citrus plants related to the better-known citrus canker A, caused by Xanthomonas citri pv. citri. The study of the genomes of strains of these related bacterial species has the potential to bring new understanding to the molecular basis of citrus canker as well as their evolutionary history. Up to now only one genome sequence of XauB and only one genome sequence of XauC have been available, both in draft status. Here we present two new genome sequences of XauB (both complete) and five new genome sequences of XauC (two complete). A phylogenomic analysis of these seven genome sequences along with 24 other related Xanthomonas genomes showed that there are two distinct and well-supported major clades, the XauB and XauC clade and the Xanthomonas citri pv. citri clade. An analysis of 62 Type III Secretion System effector genes showed that there are 42 effectors with variable presence/absence or pseudogene status among the 31 genomes analyzed. A comparative analysis of secretion-system and surface-structure genes showed that the XauB and XauC genomes lack several key genes in pathogenicity-related subsystems. These subsystems, the Types I and IV Secretion Systems, and the Type IV pilus, therefore emerge as important ones in helping explain the aggressiveness of the A type of citrus canker and the apparent dominance in the field of the corresponding strain over the B and C strains.

#### Edited by:

Dawn Arnold, University of the West of England, United Kingdom

#### Reviewed by:

Jeffrey Jones, University of Florida, United States David J. Studholme, University of Exeter, United Kingdom

#### \*Correspondence:

Leandro Marcio Moreira lmmorei@gmail.com João C. Setubal setubal@iq.usp.br

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Microbe Interactions, a section of the journal Frontiers in Microbiology

Received: 01 March 2019 Accepted: 27 September 2019 Published: 11 October 2019

#### Citation:

Fonseca NP, Patané JSL, Varani AM, Felestrino ÉB, Caneschi WL, Sanchez AB, Cordeiro IF, Lemes CGdC, Assis RdAB, Garcia CCM, Belasque J Jr, Martins J Jr, Facincani AP, Ferreira RM, Jaciani FJ, Almeida NFd, Ferro JA, Moreira LM and Setubal JC (2019) Analyses of Seven New Genomes of Xanthomonas citri pv. aurantifolii Strains, Causative Agents of Citrus Canker B and C, Show a Reduced Repertoire of Pathogenicity-Related Genes. Front. Microbiol. 10:2361. doi: 10.3389/fmicb.2019.02361

Keywords: effectors, adaptation, virulence, Xanthomonas evolution, genome sequencing

#### INTRODUCTION

fmicb-10-02361 October 10, 2019 Time: 17:10 # 2

Citrus is an important worldwide crop (FAS, 2019) that has been threatened by various diseases over many decades. Even though citrus Huanglongbing (greening) is today the major citrus threat (Wang et al., 2017), citrus canker (Goto, 1992; Schubert and Miller, 1996; Schubert et al., 2001) is still an important disease (CAB-International, 2019), especially in Brazil (Mendonça et al., 2017).

Three species of bacteria of the genus Xanthomonas are associated with citrus canker diseases in citrus: Xanthomonas citri subsp. citri (Xcc) pathotypes A, A<sup>∗</sup> and Aw, X. citri subsp. aurantifolii pathotypes B and C (XauB and XauC), and X. alfalfae subsp. citrumelonis (Xacm). Xcc, XauB and XauC are respectively the causative agents of citrus canker A, B, and C, which cause small necrotic raised lesions surrounded by a water-soaked margin (Civerolo, 1984). Citrus canker A, the most aggressive, remains a concern in all citrus growing regions in Asia and South America (CAB-International, 2019). XauB strains are less aggressive, and XauC strains have a more restricted host range, when compared with symptoms and host range of Xcc, respectively. Canker B is currently known to be present only in Argentina, Paraguay, and Uruguay (Civerolo, 1984); moreover, XauB may have been eradicated even from this restricted region by competition from Xcc (Chiesa et al., 2013). Canker C is limited to the state of São Paulo, Brazil (Malavolta Júnior et al., 1984); the most recent field report dates to 2009 (Jaciani et al., 2009). Xacm is the causal agent of citrus bacterial spot, which induces symptoms very similar to those of canker, but the lesions are flat and not raised.

Sequencing of the X. citri subsp. citri strain A306 genome (A306) allowed the characterization of important properties of this more aggressive pathotype (da Silva et al., 2002). Following that work, genomes of the other pathotypes were sequenced and compared with each other (Jalan et al., 2013; Bodnar et al., 2017).

Given the phylogenetic relatedness of the causal pathogens of cankers A, B, and C, the comparative study of XauB and XauC strains at the genomic level offers the opportunity of achieving a better understanding of citrus canker disease in general. Up to now, only one XauB and only one XauC strain genome have been sequenced (Moreira et al., 2010). We therefore decided to sequence the genomes of additional strains of XauB and XauC. The newly sequenced isolates were selected because they showed significant differences in pathogenicity and aggressiveness when inoculated in different citrus genotypes and/or had different genetic characteristics (Jaciani, 2012; **Table 1**). The isolates XauB 1561 and XauB 1566 showed less virulence with respect to the other isolates and absence of clear infection symptoms, suggesting a probable loss of pathogenicity, besides being genetically different by AFLP and ERIC-PCR (Jaciani, 2012). The selection of XauC strains was based on the ability of some isolates to produce dark pigment when cultivated in NB or NA culture media (NB: 0.5% peptone, 0.3% beef extract; NA: 0.5% peptone, 0.3% beef extract, 1.5% agar), also observed in X. citri pv. fuscans and X. campestris pv. vignicola (Schaad et al., 2005; Schaad et al., 2006). XauC 535 and XauC 1609 cause lesions only in Mexican lime [C. aurantifolia (Christm.) Swingle] and Swingle citrumelo [C. paradisi Macfad. × Poncirus trifoliata (L.) Raf.], and in both hosts with mild symptoms. XauC 1609 was found to infect Swingle citrumelo under natural conditions (Jaciani et al., 2009), despite the fact that prior work suggested that only Mexican lime was susceptible to XauC (Malavolta Júnior et al., 1984). Additionally, XauC 535 and XauC 1609 were also differentiated by AFLP and BOX-PCR (Jaciani, 2012). The isolates XauC 763, XauC 867, and XauC 1559, which do not produce pigment, were distinguished in terms of pathogenicity and aggressiveness. 
XauC 763 and XauC 1559, which are also Mexican lime pathogens, caused injuries in Swingle citrumelo and Cravo mandarin (C. reticulata Blanco), and when inoculated in high concentration they infected Rangpur lime (C. limonia Osbeck), Persian lime [C. latifolia (Yu. Tanaka) Tanaka], lemon [Citrus limon (L.) Burm. f.], Grapefruit (C. paradisi Macfad.), and Cleopatra mandarin (C. reshni hort. ex Tanaka) (Jaciani, 2012). Finally, XauC 1559 was slightly more aggressive than XauC 763 when inoculated in Cravo mandarin, and XauC 867 presented a slightly more restricted pathogenicity and lower aggressiveness in Mexican lime compared to XauC 763 and XauC 1559 (Jaciani, 2012).

Altogether, based on the information above, we have sequenced two new XauB and five new XauC genomes, with the aim of achieving a better understanding of the genomic basis of citrus canker and the evolutionary history of these strains. Together with 24 other public and closely related genomes, this allowed us to carry out a phylogenomic analysis as well as an investigation of selected gene families relevant in bacteria-plant interactions in general and in citrus canker in particular (Ryan et al., 2011), which we present here.

A note on taxonomic nomenclature: Xanthomonas species that are pathogenic to citrus were described in this study using names as proposed by Schaad et al. (2006), since this classification is adopted for all cited references found until the present. The other Xanthomonas species were described as proposed by Bui Thi Ngoc et al. (2010) and Constantin et al. (2016).


#### MATERIALS AND METHODS

#### Media and Culture Conditions

The new genomes presented here were sequenced from strains stored both in autoclaved tap water at room temperature and at −80 °C in NB medium (3 g/L meat extract, 5 g/L peptone) containing 25% glycerol. Each strain was recovered from a −80 °C stock, streaked on solid NA medium (3 g/L meat extract, 5 g/L peptone, and 15 g/L agar) and cultivated for 48 h at 29 °C. For each strain, colonies were inoculated into 10 mL of liquid NB medium in a sterile 50 mL Falcon conical centrifuge tube and incubated at 29 °C in a rotary shaker at 180 rpm for 16 h (final OD600 nm ∼1.0).

#### DNA Extraction and Quantification

A volume of 2 mL of the culture was centrifuged at 16,000 g for 10 min at 4 °C in a refrigerated benchtop microcentrifuge. The supernatant was discarded and the cell pellet was resuspended in 600 µL of the Nuclei Lysis Solution supplied with the Promega Wizard Genomic DNA purification kit (Promega Corporation, Madison, United States). Total DNA extraction was performed using the same kit according to the manufacturer's instructions. DNA quantity and quality were determined using a NanoDrop ND-1000 spectrophotometer (NanoDrop Tech, Wilmington, DE, United States), a Qubit 2.0 fluorometer (Invitrogen, Life Technologies, CA, United States) and 0.8% agarose gel electrophoresis. Each extraction yielded at least 5 µg of high-quality genomic DNA.

#### Genome Sequencing and Assembly

The genomes of the XauC 535, XauC 763, and XauC 867 strains were sequenced using the Illumina HiScanSQ platform at NGS Soluções Genômicas (Piracicaba, Brazil). An average of ∼20 M reads (2 × 100 bp) was generated for each genome. The raw reads were trimmed with seqyclean software<sup>1</sup>, using a minimum Phred value of 23, a minimum read length of 30 bp, and removing custom Illumina TruSeq adapters. Genome assembly was carried out with SPAdes v3.8.1 (Bankevich et al., 2012) with default parameters. Contig sequences were assigned to plasmids using plasmidSPAdes v3.8.1 (Antipov et al., 2016).

The genomes of the XauB 1561, XauB 1566, XauC 1559, and XauC 1609 strains were sequenced using PacBio single molecule real-time (SMRT) technology at the Duke Center for Genomic and Computational Biology (United States). One SMRT library was sequenced for each sample using P6-C4 chemistry, generating reads with an average length of 10–15 kb and yielding ∼150X coverage of each genome. De novo assembly was conducted using SMRT® Analysis Server v2.3.0<sup>2</sup>. Raw PacBio reads were mapped against the resulting contigs using the blasR aligner, and SNP corrections were conducted with variant-caller software using the quiver algorithm (both part of the Analysis Server).
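The relationship between read yield and depth mentioned above can be illustrated with a short calculation. This is only a sketch: the function name is ours, and the read count and genome size in the usage example are hypothetical round numbers chosen for illustration, not values reported in this study.

```python
def mean_coverage(n_reads: int, mean_read_length_bp: float, genome_size_bp: int) -> float:
    """Expected mean sequencing depth: total sequenced bases / genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * mean_read_length_bp / genome_size_bp


if __name__ == "__main__":
    # Hypothetical inputs (NOT from the paper): 60,000 reads averaging
    # 12.5 kb over an assumed 5 Mb genome.
    depth = mean_coverage(n_reads=60_000, mean_read_length_bp=12_500, genome_size_bp=5_000_000)
    print(f"approximate depth: {depth:.0f}X")
```

The same function applies to paired-end Illumina data if `n_reads` counts individual reads rather than pairs.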

The rationale for having some genomes sequenced using PacBio technology and some using Illumina technology was as follows. We wanted to ensure that we could provide complete genomes for both XauB and XauC, given that prior to this work only draft genomes were available for these pathotypes (Moreira et al., 2010). On the other hand our budget was limited, and we could afford PacBio sequencing for only four genomes. Under these constraints, the choice of which genomes to sequence by PacBio was arbitrary.

<sup>1</sup>https://bitbucket.org/izhbannikov/seqyclean

<sup>2</sup>https://www.pacb.com/documentation/smrt-analysis-software-installation-v2-3-0/

All assembled genomes were verified with CheckM (Parks et al., 2015), resulting in 100% completeness and 0% contamination for all of them.

#### Genome Selection


For the purposes of phylogenomic analyses, we searched for genomes in NCBI/GenBank using "Xanthomonas citri" as a keyword, then looked at the automatic dendrogram generated by genomic distances on the NCBI website<sup>3</sup>, which reveals all genomes within this group, including all subspecies/lineages/varieties available as reference sequences (RefSeq). After downloading this tree, we searched for all different lineages, and then downloaded up to three genomes from each such lineage, if available, preferentially drawing from separate clades where the lineage appears in NCBI's dendrogram, to avoid pseudoreplication (i.e., avoiding picking two closely related genomes). This led to a final dataset of 31 genomes.
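The selection rule described above (at most three genomes per lineage, preferring genomes from different clades of the dendrogram) was applied by inspection, but it can be sketched programmatically. In the sketch below the record format `(genome_id, lineage, clade)` and the function name are assumptions made for illustration only; they do not correspond to any NCBI data structure.

```python
from collections import defaultdict


def select_genomes(records, max_per_lineage=3):
    """Pick up to `max_per_lineage` genomes per lineage, preferring distinct clades.

    `records` is an iterable of (genome_id, lineage, clade) tuples
    (a hypothetical format used only for this illustration).
    """
    records = list(records)
    chosen = defaultdict(list)
    seen_clades = defaultdict(set)

    # First pass: take at most one genome from each clade of a lineage.
    for genome_id, lineage, clade in records:
        if len(chosen[lineage]) < max_per_lineage and clade not in seen_clades[lineage]:
            chosen[lineage].append(genome_id)
            seen_clades[lineage].add(clade)

    # Second pass: if a lineage still has room, fill it with remaining genomes.
    for genome_id, lineage, _clade in records:
        if len(chosen[lineage]) < max_per_lineage and genome_id not in chosen[lineage]:
            chosen[lineage].append(genome_id)

    return dict(chosen)
```

The two-pass design mirrors the stated preference: clade diversity is satisfied first, and closely related genomes are used only when a lineage has no further distinct clades to draw from.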

#### Phylogenomic Reconstruction

In order to generate comparable sets of gene families, Prokka (Seemann, 2014) (with default parameters) was employed for annotation of each genome. Get\_Homologues (Contreras-Moreira and Vinuesa, 2013) was used for multiple local BLAST-directed comparisons among all genes (of all genomes), and these were further clustered by the OMCL method which drives the OrthoMCL algorithm (Li et al., 2003) within Get\_Homologues. Subsequently compare\_clusters.pl (a script within the same software) was used for retrieval of the set of orthologous genes uniquely present in all genomes (hereafter denominated the unicopy set). Mafft (Katoh and Standley, 2013) was used for multiple alignment of each unicopy gene, and then concatenation of all genes was done using FASconCAT (Kück and Meusemann, 2010). IQTree (Nguyen et al., 2015) was used for maximum likelihood (ML) estimation, with model choice employed before tree search, and branch support computed by UFBoot (Hoang et al., 2017).
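The definition of the unicopy set (families with exactly one gene in every genome) can be made concrete with a minimal sketch. This is not the Get\_Homologues implementation; the input format (a mapping from a cluster identifier to a list of `(genome, gene_id)` pairs) and the function name are assumptions introduced here for illustration.

```python
from collections import Counter


def unicopy_clusters(clusters, genomes):
    """Return IDs of clusters with exactly one member gene in every genome.

    `clusters` maps a cluster ID to a list of (genome, gene_id) pairs;
    this layout is hypothetical and chosen only for the example.
    """
    required = set(genomes)
    selected = []
    for cluster_id, members in clusters.items():
        per_genome = Counter(genome for genome, _gene in members)
        present_everywhere = set(per_genome) == required
        single_copy = all(count == 1 for count in per_genome.values())
        if present_everywhere and single_copy:
            selected.append(cluster_id)
    return sorted(selected)
```

Families with a paralog in any genome, or missing from any genome, are excluded, which is what makes the retained families suitable for concatenation into a single alignment.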

#### Effector Analysis

The amino acid sequences of 62 effectors were retrieved from the Xanthomonas.org site (AvrBs2, AvrXccA1, AvrXccA2, HpaA, HrpW, XopA, XopAA, XopAB, XopAC, XopAD, XopAE, XopAF1, XopAF2, XopAG, XopAH, XopAI, XopAJ, XopAK, XopAL1, XopAL2, XopAM, XopAP, XopAU, XopAV, XopAW, XopAX, XopAY, XopAZ, XopB, XopC1, XopC2, XopD, XopE1, XopE2, XopE3, XopF1, XopF2, XopG1, XopG2, XopH1, XopI1, XopJ1, XopJ2, XopJ3, XopJ4, XopJ5, XopK, XopL, XopM, XopN, XopO, XopP, XopQ, XopR, XopS, XopT, XopU, XopV, XopW, XopX, XopY, and XopZ1) (**Supplementary Table S1**), to infer their evolution across the 31 genomes. For each genome, we assessed whether each effector without a premature stop codon had a frameshift or not. In order to do so, we performed local tBLASTn searches within the BLAST+ suite (Camacho et al., 2009) with an e-value of 1e-50 (a threshold obtained by trial and error that minimized the number of extra hits bearing indels and mismatches without compromising detection of supposedly functional copies), discarding alignments in which the subject sequences aligned to less than 60% of the query length or with less than 60% identity. After the tBLASTn runs, multiple alignments were generated for each effector by an in-house Python script, and each alignment was manually checked in Aliview (Larsson, 2014) for detection of frameshifts and premature stop codons. Optimization of character evolution for each effector along the ML tree (i.e., presence, frameshifts without premature stop codons, sequences with premature stop codons, and absence) was obtained by the ace function within the R library phytools (Revell, 2012).
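The three filtering criteria stated above (e-value, query coverage, and identity) can be sketched as a post-processing step on tabular BLAST output. This is not the in-house script used in the study. It assumes the standard 12-column tabular format (`-outfmt 6`) and a separately supplied dictionary of query lengths; the function name and that dictionary are our own illustrative choices, and coverage is computed per alignment rather than summed over several alignments.

```python
def filter_tblastn_hits(lines, query_lengths,
                        max_evalue=1e-50, min_coverage=0.60, min_identity=60.0):
    """Keep alignments passing the e-value, query-coverage and identity cut-offs.

    `lines` are rows in the default 12-column tabular BLAST format:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    `query_lengths` maps each query ID to its length in residues (assumed input).
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        # Fraction of the query (effector protein) covered by this alignment.
        coverage = (abs(qend - qstart) + 1) / query_lengths[qseqid]
        if evalue <= max_evalue and coverage >= min_coverage and pident >= min_identity:
            kept.append((qseqid, sseqid, pident, coverage, evalue))
    return kept
```

In practice the e-value cut-off would normally be applied by BLAST itself; it is repeated here only so that the function states all three criteria in one place.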

#### Additional Gene Analyses

Additional gene families were investigated based on OrthoMCL clustering (Li et al., 2003) and STRING (Snel et al., 2000). OrthoMCL was run with default parameters, and results were then processed in the OrthologSorter pipeline (Setubal et al., 2018). Additionally, we created an Ortholog Alignment using gene families provided by OrthoMCL, with the A306 strain as anchor and all the XauB and XauC genomes, plus X. citri pv. fuscans 4384. This alignment is useful to visualize syntenic regions among genomes. The parts of this alignment that were used in reporting results in this work are shown in a simplified version in **Supplementary Table S2**. In the case of STRING, for each family of interest, the relevant genes as present in A306 were used as queries.

#### Gum Production Assay

The xanthan gum production assays were performed as described by Moreira et al. (2010), without modification.

#### Biofilm Production Assay

Biofilm production assays were performed following O'Toole (2011), with a few modifications. The bacterial isolates were grown in liquid LB or XVM2 medium at 28 °C. Bacterial density was standardized for all isolates to an OD600 nm of 1.0. The samples were diluted 1:10 in liquid LB and 100 µL of each sample was placed in a 96-well plate and grown for 96 h at 28 °C. After the incubation period, the plate was washed with distilled water to remove the cells and left to dry for 2 h. Subsequently, 125 µL of 0.1% crystal violet (CV) solution was transferred to each well and left for 45 min. The plate was then washed again with distilled water and left to dry once more. Next, 125 µL of 95% ethanol was added to each well and left for 45 min to dissolve the CV completely. Absorbance was read at 550 nm. Six replicates were performed for each bacterial isolate.

<sup>3</sup>https://www.ncbi.nlm.nih.gov/projects/treeview/

#### Autoaggregation Assay


The autoaggregation assay was adapted from Alamuri et al. (2010), with modifications. Cultures of the different bacterial isolates were grown in triplicate at 28 °C in liquid LB medium or in XVM2 medium (1.16 g/L NaCl, 1.32 g/L (NH4)2SO4, 0.021 g/L KH2PO4, 0.055 g/L K2HPO4, 0.0027 g/L FeSO4·7H2O, 1.8 g/L fructose, 3.423 g/L sucrose, 5 mM MgSO4, 1 mM CaCl2, 0.03% casamino acids, pH 6.7). A 10 mL sample of each culture was placed in a sterile 20 mL tube. All cultures were initially shaken vigorously for 15 s, after which the tubes remained static throughout the experiment. Every hour, a 100 µL aliquot was removed from within approximately 1 cm of the top of each culture and its optical density was measured at 600 nm.
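The hourly optical density readings from such an assay are often summarized as a percentage of cells that have settled out of the upper layer. The text above does not state which summary statistic was used, so the sketch below adopts a common convention as an explicit assumption: percent autoaggregation at time t is 100 × (1 − OD_t / OD_0), where OD_0 is the reading taken immediately after shaking. The function name is ours.

```python
def autoaggregation_percent(od_readings):
    """Convert a time series of OD600 readings into percent autoaggregation.

    Assumed convention (not stated in the paper): 100 * (1 - OD_t / OD_0),
    with OD_0 the first reading, taken right after shaking.
    """
    if not od_readings:
        raise ValueError("at least one OD reading is required")
    od_initial = od_readings[0]
    if od_initial <= 0:
        raise ValueError("the initial OD must be positive")
    return [100.0 * (1.0 - od / od_initial) for od in od_readings]
```

Applied to each of the three replicate tubes separately, the resulting curves can then be averaged per isolate and medium.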

### RESULTS

Information about the genomes that were sequenced for this work is given in **Table 2**. The additional genomes listed there were included in the analysis of pathogenicity-related genes.

### Phylogenomic Analyses

For the phylogenomic analyses we used 31 genomes (**Table 3**). Gene family computation resulted in 2,449 single-copy shared families, leading to a concatenated alignment of 2,516,841 bp. The best ML model was GTR + G + R2 (where R2 means a mixed model of rate variation with two rate classes), with most nodes with support ≥ 95%. The resulting phylogeny is shown in **Figure 1**.

### Xanthan Gum, Biofilm, and Autoaggregation Analysis

In an attempt to understand which physiological factors could contribute to the respective virulence phenotypes of the investigated strains, xanthan gum production, biofilm formation, and cell self-aggregation were analyzed (**Supplementary Table S3**). As expected, A306 is the strain with the highest production of xanthan gum per unit of bacterial mass. On the other hand, XauC 535 and XauC 1609 showed respectively the highest biofilm production and self-aggregation capacity in virulence-inducing medium (XVM2).

### Type III Secretion System Effector Analysis

Out of the 62 effectors investigated, four were present in all genomes and 16 were absent from all genomes, leaving 42 effectors with variable presence/absence across lineages (**Figure 1**). For 11 effectors we observed interesting patterns of presence, absence, or pseudogenization. For these effectors we inferred their evolutionary history in terms of gains, losses or pseudogenization (**Supplementary Figure S1**).
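The partition reported above (62 = 4 + 16 + 42) follows from sorting each effector by the set of states it shows across genomes. The sketch below illustrates that bookkeeping; the state labels, the input layout, and the function name are assumptions for illustration, and an effector is counted as "present in all genomes" only when every genome carries an apparently intact copy.

```python
def classify_effectors(status_by_effector):
    """Split effectors into (present in all, absent from all, variable).

    `status_by_effector` maps effector name -> {genome: state}, where state is
    one of "present", "frameshift", "stop", "absent" (hypothetical labels).
    """
    in_all, in_none, variable = [], [], []
    for effector, states in status_by_effector.items():
        observed = set(states.values())
        if observed == {"present"}:
            in_all.append(effector)
        elif observed == {"absent"}:
            in_none.append(effector)
        else:
            # Any mixture of states, including pseudogenes, counts as variable.
            variable.append(effector)
    return in_all, in_none, variable
```

With the counts given in the text, the variable class is simply what remains after removing the two invariant classes from the 62 effectors examined.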

### Other Pathogenicity-Related Genes

Individual genes or genes that collectively encode proteins that compose cell complexes involved in virulence and adaptation were analyzed in all genomes listed in **Table 2**. The virulence and adaptation genes were grouped into two broad categories: (1) secretion systems (other than Type III effectors); (2) surface structure. The analysis framework we have adopted is as follows. The A306 strain has several genes in each of the categories analyzed (da Silva et al., 2002). On the other hand, as will be seen, the XauB and XauC genomes that we have analyzed lack many or all of the genes in some of these categories. In order to better understand the potential impact that the lack of these genes may have in the pathogenicity and/or survival capabilities of the XauB and XauC strains, for each category in which XauB and XauC lack genes we first describe the A306 gene content. We then note the differences exhibited by XauB and XauC (as given by the Ortholog Alignment of 11 genomes, with A306 as anchor, as described in section Materials and Methods), followed by a network analysis based on the A306 genes, using the tool STRING (Snel et al., 2000).

### Secretion Systems

We verified that all the analyzed genomes retain all orthologous genes belonging to the two gene clusters associated with synthesis of the type II secretion system (T2SS, XAC0694-XAC0705, and XAC3534-3544), all the genes involved in structuring the type III secretion system apparatus (T3SS, XAC0393-XAC0417), all the genes associated with the type VI secretion system (T6SS, XAC4119-20-24, XAC4139-40-45), as well as complete Sec and Tat secretion systems. The main differences observed are related with the type I secretion system (T1SS), the type IV secretion system (T4SS) and effectors of the type III secretory system (T3SS). Results for T3SS effectors were already presented above.

### XauB and XauC Lack Key Genes in the Type I Secretion System

The T1SS corresponds to an ABC transporter system and is basically composed of two proteins, HlyB (an ABC transporter) and HlyD (a membrane fusion protein), whose main function, together with TolC, is to promote the secretion of toxins (Koronakis et al., 2004). In A306 two copies of the gene encoding the toxin presumably secreted by this system, hemolysin (type-calcium, XAC2197-98), are upstream of the genes hlyDB (XAC2201-02), separated by two hypothetical proteins (XAC2199-2200). These gene families (XAC2197-2202) were not found in the XauB genomes. The XauC genomes on the other hand do not have orthologs of XAC2197-98, but they do have hlyB and hlyD. Other genes associated with synthesis and regulation of hemolysin in these genomes were also analyzed. All genomes have orthologs of XAC4303 and XAC1668 (cryptic hemolysin transcriptional regulator), XAC3043 and XAC0079 (hemolysin III, hly3), and XAC1709 (hemolysin, tlyC). However, in XauB and XauC strains we did not find orthologs for the genes XAC1814 (outer membrane hemolysin activator protein) and XAC1918 (hemolysin-like protein).

Analysis of possible interactions of the products of genes hlyB and hlyD (**Figure 2**) revealed two well-defined interaction networks for the A306 hlyB gene used as query to STRING. One of these groups, in orange background, comprises the genes/proteins associated with T1SS composition and functionality. The other network (green background) is composed primarily of membrane genes/proteins, essentially ABC transporters. Eight genes/proteins represented by the nodes of the network composing the T1SS apparatus correspond to the same genes in the cluster discussed above, including the gene encoding the lytic enzyme (XAC0466) present in the XauC 10535 genome. We observed that the genomes of XauB strains do not have any of these genes. However, the loss of a single gene of hemolysin in XauC strains would have a small effect, since this loss could be compensated by paralogous genes in their genomes. Concerning the cluster of membrane proteins, three of the ABC transporters are associated with resistance to acriflavin (XAC3994-95 and XAC3850) and two have the hlyD domain (PF00529), involved with secretion of toxins. Some of the genes in this network were not found in the genomes of strains XauC 1609 and XauC 535.

TABLE 2 | List of genomes used in the comparative analysis section, including information about the seven newly sequenced XauB and XauC genomes.

### XauB and XauC Genomes Lack the Chromosomal Copy of Type IV Secretion System Genes

The genes encoding the T4SS in A306 are found in two similar gene clusters, one in the chromosome (XAC2607-2623) and another in the plasmid (XACb0036-b0047) (da Silva et al., 2002). The genomes of XauB and XauC have only the plasmid-borne cluster (**Supplementary Table S2**). Note that in the cases of XauB 11122 and XauC 10535, whose contigs are not distinguished as belonging to the chromosome or to a plasmid, it is our inference based on synteny that the T4SS genes actually belong to a plasmid (**Supplementary Table S2**).

### XauB and XauC Genomes Lack Key Genes in the Synthesis and Regulation of Type IV Pilus

A306 has at least four clusters of genes involved with synthesis and regulation of type IV pilus (T4p). One of these clusters, pilE-Y1-X-W-V-fimT (XAC2664-2669) is found between a set of prophage genes upstream and a transposase downstream, suggesting possible horizontal gene transfer. We observed that two of these genes (pilX-pilV) are missing in the XauB and XauC genomes (**Supplementary Table S2**). In the case of cluster pilS-R-B-A-A-C-D (XAC3237-3243) (Yang et al., 2004), the XauB and XauC genomes lack the two copies of pilA. The pilA gene encodes pilin, an essential T4p component that contributes to twitching motility and biofilm development in A306 (Dunger et al., 2014; Petrocelli et al., 2016). Another gene whose product has a function related to T4p is pilL (XAC2253). In A306 this gene is found in a large genomic island (XAC2176 to XAC2286), but is absent from the XauB and XauC genomes.

We carried out an analysis of predicted interactions of pilA (XAC3240) (**Figure 3**). In orange background we observe that XAC3240 interconnects five other networks and that the pilin subunits (XAC3240 and XAC3241) are connected to one another, and connect to another pilA (XAC3805). As expected, one of the networks starting from pilins refers to genes/proteins associated with the pilus structure and with the T2SS (cyan background), as it is known that both are evolutionarily related (Peabody et al., 2003). Moreover, three genes share the same genomic region of pilins in the chromosome of A306 (1, 4, and 5). Close to the pilin network (purple background) there is a network involving genes associated with quorum sensing (rpf), gum synthesis (gum) and the plant tissue degrading enzyme polygalacturonase (pglA), known to be virulence-related (Wang et al., 2008). Likewise, this network reflects the interaction profiles of DSF production mediated by rpf genes, which act as signaling molecules of gum synthesis and consequently of the production of plant cell wall degrading enzymes, as is the case for PglA mediated by T2SS (Vojnov et al., 2001; An et al., 2013). In addition, another network expands from rpfC. Indeed, this network (pale green background), associated with chemotaxis-related genes, includes phoB, which is involved in phosphate regulons, essential for adaptation and virulence induction in members of the genus Xanthomonas (Pegos et al., 2014; Moreira et al., 2015).

### XauB and XauC Genomes Lack an Alginate Biosynthesis Gene

In A306, the first gene downstream of pilE-Y1-X-W-V-fimT, XAC2670, encodes an alginate biosynthesis protein, which is absent from the XauB and XauC genomes (**Supplementary Table S2**). The A306, XauB, and XauC genomes encode another gene whose product is involved with the metabolism of alginate, alginate lyase: algL (XAC4349).

Analysis of predicted interactions of the gene XAC2670 with other genes/proteins revealed two distinct clusters (**Figure 4**). The first one on blue background contains seven nodes whose genes/proteins are directly related to synthesis and regulation of T4p (pil genes previously described). In this group, excepting pilO (XAC3383), all other genes are present in a cluster (XAC2664-XAC2669) downstream of a transposase and a phage insertion (numbers 1–6), and upstream of the gene XAC2670. The second cluster, on yellow background, contains 12 nodes, with most genes/proteins related to regulatory functions, especially algZR (XAC0620-21), encoding a two-component system, respectively for sensor and regulatory proteins (Okkotsu et al., 2014), and algC (phosphomannomutase) (Davies and Geesey, 1995), all described as essential to alginate synthesis. Another two-component system, lytST (XAC2142-2141, sensor-regulator), is also present in this network. LytT, as well as rpfD, also present in the yellow background network (XAC1874) and member of the rpf gene cluster, exhibit the lytR domain, also present in proteins such as AlgR with DNA binding function (Nikolskaya and Galperin, 2002). Finally, rpoE (XAC1319) connects the two clusters (**Figure 4**), and therefore may be directly associated with both by regulating the EPS synthesis and/or by modulating T4p expression.

### XauB and XauC Lack Several Genes Related to Hemagglutinin and Hemolysin Synthesis

A further set of genes with significant differences in terms of presence and absence in the analyzed genomes is related to hemagglutinin and hemolysin synthesis. These genes are located in two regions in the genome of A306 (XAC4112-XAC4125 and XAC1810-XAC1819). The first region is flanked by genes that are part of the T6SS, both downstream and upstream. All genes of this region are present in all genomes of XauB and XauC. However, the genes in the second region (XAC1810-XAC1819) are totally absent in the XauB and XauC genomes. Among these genes we highlight fhaC (XAC1814), which codes for an outer membrane hemolysin activator, fhaB (XAC1815), which codes for a filamentous hemagglutinin, XAC1816, which codes for a hemagglutinin/hemolysin-related protein, XAC1818, which codes for hemagglutinin, and the genes in the operon HmsHFR-hp (XAC1813-1810).

Analysis of predicted interactions of the gene fhaB (XAC1815) allowed the characterization of two major interaction networks (**Figure 5**). One of these networks (pink background) is associated with adhesion, whereas the other network (gray background) basically contains hypothetical genes/proteins. Furthermore, other genes/proteins in the adhesion network are located in the two regions related to hemagglutinin and hemolysin synthesis mentioned in the previous paragraph.

Using hmsF (XAC1812), we obtained an interaction network made of three clusters, two of which seem to be functionally related (**Figure 5**). One of the clusters (green background) contains genes/proteins associated with carbohydrate metabolism. The other cluster (orange background) contains the hmsFHR genes, related to biofilm formation.

A summary of these results is presented in **Figure 6**, which includes some additional pathogenicity-related genes also lacking in the XauB and XauC genomes: vapBC, a toxin-antitoxin module in Acidovorax citrulli (Shavit et al., 2015); and tspO, which encodes a protein with a potential role in the oxidative stress response, iron homeostasis, and virulence expression in Pseudomonas (Leneveu-Jenvrin et al., 2014).

## DISCUSSION

Our results show that the three lineages inflicting citrus canker (A strains and XauB and XauC strains) can be robustly separated into two well defined clades, with A strains in one clade, which we call the Citri-citri (C-c) clade, and XauB and XauC in another clade, which we call the aurantifolii clade (**Figure 1**); furthermore, XauB and XauC were shown to be in a paraphyletic clade, with X. citri pv. anacardii being closer to XauB (**Figure 1**). It is noteworthy to observe that the C-c and aurantifolii clades contain strains that are pathogenic in taxonomically disparate plant hosts, such as citrus (X. citri pv. citri and X. citri pv. aurantifolii), leguminosae (X. citri pv. glycines and X. citri pv. fuscans emerging from more basal nodes), cashew (X. citri pv. anacardii), mango (X. citri pv. mangiferaindicae), and cotton (X. citri pv. glycines). Curiously, X. citri pv. anacardii (infecting cashew) apparently evolved within a citrus-associated clade, suggesting a possible host jump.

BioCyc resource (Karp et al., 2017). The arrows represent the genes and the gray background the transcriptional units. The blue numbers correlate the position of a given gene in the genomic context and the same gene on the network. The number in red has a similar purpose and simply highlights the query gene in the network inference.

We have made an extensive analysis of the presence and absence of effectors in the 31 genomes we sampled to reconstruct the phylogeny. We now discuss the main results of this analysis. XopS was shown to be the only effector among the 62 investigated that is present in the C-c clade (although in some cases as a pseudogene) but absent in the aurantifolii clade (**Figure 1**). XopS is completely dependent on HpaB to be translocated; it contributes to disease symptoms and bacterial growth and suppresses pathogen-associated molecular pattern (PAMP)-triggered plant defense gene expression (Schulze et al., 2012). XopF1 was found to have the opposite pattern compared to xopS: present in the aurantifolii clade (in some cases as a pseudogene) and absent in the C-c clade. In Xanthomonas oryzae pv. oryzae, XopF1 has been shown to repress basal PAMP-triggered immunity response in rice (Mondal et al., 2015). A third interesting case is xopK, which is present in the C-c clade, but was found to be a pseudogene in all genomes of the aurantifolii clade. XopK has been shown to inhibit PAMP-triggered immunity upstream of mitogen-activated protein kinase cascades in Xanthomonas oryzae pv. oryzae (Qin et al., 2018). **Figure 1** makes clear that there are many other differences in effector repertoires among the 31 genomes analyzed; 11 of these other effectors have been studied in terms of their gains and losses across the evolution of the 31 strains (**Supplementary Figure S1**). Because the pattern of gains, losses and pseudogenization is more intricate, additional studies are required to correlate these inferred histories to known phenotypic traits of the affected strains.
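The clade-level patterns described for xopS, xopF1, and xopK can be illustrated with a minimal sketch. It is not the pipeline used in this study: the strain labels (`Cc_1`, `Aur_1`, and so on), the per-strain calls, and the function names are hypothetical placeholders chosen only to mirror the three patterns reported in the text.

```python
from collections import defaultdict

PRESENT, PSEUDO, ABSENT = "present", "pseudogene", "absent"

# Hypothetical strain labels; the real analysis covered 31 genomes.
clade_of = {
    "Cc_1": "C-c", "Cc_2": "C-c",
    "Aur_1": "aurantifolii", "Aur_2": "aurantifolii",
}

# Toy calls that only mimic the patterns described in the text
# (they are illustrative, not the published per-strain results).
calls = {
    "xopS":  {"Cc_1": PRESENT, "Cc_2": PSEUDO,  "Aur_1": ABSENT,  "Aur_2": ABSENT},
    "xopF1": {"Cc_1": ABSENT,  "Cc_2": ABSENT,  "Aur_1": PRESENT, "Aur_2": PSEUDO},
    "xopK":  {"Cc_1": PRESENT, "Cc_2": PRESENT, "Aur_1": PSEUDO,  "Aur_2": PSEUDO},
}


def states_by_clade(effector_calls, clade_of):
    """Collect the set of observed states for each clade."""
    states = defaultdict(set)
    for strain, state in effector_calls.items():
        states[clade_of[strain]].add(state)
    return dict(states)


def classify(effector_calls, clade_of, clade_a="C-c", clade_b="aurantifolii"):
    """Label an effector by how its states split between two clades."""
    states = states_by_clade(effector_calls, clade_of)
    a = states.get(clade_a, set())
    b = states.get(clade_b, set())
    detected = {PRESENT, PSEUDO}  # intact or pseudogenized copy found
    if a and a <= detected and b == {ABSENT}:
        return f"detected only in {clade_a}"
    if b and b <= detected and a == {ABSENT}:
        return f"detected only in {clade_b}"
    if a == {PRESENT} and b == {PSEUDO}:
        return f"intact in {clade_a}, pseudogene in {clade_b}"
    return "mixed"


summary = {effector: classify(c, clade_of) for effector, c in calls.items()}
for effector, label in summary.items():
    print(effector, "->", label)
```

Effectors whose states do not split cleanly between the two clades fall into the `mixed` category, which corresponds to the more intricate gain, loss, and pseudogenization histories that require tree-based reconstruction.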

In addition to effectors, we have carefully analyzed the gene content in the broad categories of secretion systems-related genes and surface structure-related genes. Our main tool in this analysis, in addition to presence/absence results, was the prediction of possible interactions. These analyses revealed several noteworthy differences between the XauB and XauC strains and the A306 genome.

### Type I Secretion System

The T1SS is responsible for secreting toxins, such as hemolysins in E. coli (Thomas et al., 2014). In the three XauB strains investigated, neither the genes coding for secretion apparatus proteins (hlyB and hlyD) nor the genes coding for hemolysins (hlyA) were found. This absence might contribute to a decrease in the elicitation of the plant immune response as well as to decreased competitive capability with other organisms due to the inability to secrete these toxins.

### Type IV Secretion System

The T4SS has multiple functions, including transport of a variety of substrates from DNA and protein-DNA complexes to proteins, and it plays fundamental roles in both bacterial pathogenesis and adaptation to the cellular milieu in which bacteria live (Darbari and Waksman, 2015). Jacob et al. (2014) reported that the T4SS in A306, unlike the T3SS, is not associated with virulence induction, but rather with cell-cell interactions. This finding was confirmed by Souza et al. (2015), who demonstrated the involvement of the chromosomal T4SS in bacterial killing, showing that this special class of T4SS is a mediator of both antagonistic and cooperative interbacterial interactions. We speculate that the lack of the T4SS chromosomal gene cluster in XauB and XauC genomes may have a consequence in the ability of these strains to compete with other bacteria, in particular with A306 itself. If this speculation is correct, this may be an explanation for the apparent disappearance of XauB strains from the field (Chiesa et al., 2013).

on a figure generated by the BioCyc resource (Karp et al., 2017).

#### Synthesis and Regulation of Type IV Pilus

Among the protein complexes involved in biofilm formation is the type IV pilus (T4p) (Dunger et al., 2014). Besides actively participating in this matrix, the T4p is of fundamental importance in the adhesion process to the host tissue in the early stages of infection and in flagellum-independent displacement, called twitching motility (Mattick, 2002). In P. aeruginosa, the ortholog of the A306 gene cluster pilE-Y1-X-W-V-FimT (XAC2664-2669) has been shown to be involved in negative regulation of swarming motility (Giltner et al., 2010; Kuchma et al., 2012). In this same context, inactivation of pilA, located in the pilS-R-B-A-A-C-D cluster (XAC3238-3243), interfered with twitching motility, biofilm development, and adherence of XAC (Dunger et al., 2014). Thus, the lack of the genes pilX, pilV, pilA, pill, and fimT in the XauB and XauC genomes, all involved with T4p apparatus structuring, seems to explain at least in part the decreased production of biofilm and self-aggregation capability in some XauB and XauC strains (**Supplementary Table S3**). On the other hand, these same results show that XauC 535 and XauC 1609 presented, respectively, the highest biofilm production and self-aggregation capability in virulence-inducing media (XVM2) among all strains, even in the absence of the genes listed; this result requires further investigation.

FIGURE 5 | Networks of putative interactions between genes/proteins using as queries the A306 genes fhaB (XAC1815) and hmsF (XAC1812). For additional details see legend of Figure 2. The network figure was generated by the program STRING (Snel et al., 2000) and the genomic context figure was based on a figure generated by the BioCyc resource (Karp et al., 2017).

#### Alginate Biosynthesis


Alginate is an EPS related to biofilm formation and produced by bacteria of the genus Pseudomonas (Baker and Svanborg-Eden, 1989; Orgad et al., 2011). The function of alginate lyase is to hydrolyze the bonds that hold the structured polymer together, thereby enabling the bacterium to leave the biofilm structure and spread through the colonized tissue (Boyd and Chakrabarty, 1994).

The intricate network we inferred for XAC2670 (which codes for an alginate biosynthesis protein in A306) may be depleted in XauB and XauC due to the lack of key genes/proteins in the composition of these clusters, as is the case for pilX, pilV, and XAC2670 itself, which could impair the synthesis and regulation of the T4p apparatus and EPS production. Importantly, there are no reports in the literature mentioning any Xanthomonas species as an alginate producer. However, it is interesting to note the presence of at least nine genes in A306 that may be involved in the synthesis and regulation of this polymer, of which four are present in the interaction networks described above.

#### Hemagglutinin and Hemolysin Synthesis

The hemagglutinin gene (XAC1818) has been described as fundamental to the virulence process in many organisms, including Xylella fastidiosa (Caserta et al., 2010; Voegel et al., 2010), another plant pathogen of the Xanthomonadaceae family, and in A306 (Gottig et al., 2009). The hmsHFR-hp genes (XAC1813-1810) are involved in adaptation and virulence, and have been reported to be homologous, respectively, to the E. coli genes pgaABCD (Wang et al., 2004). Mutations in genes from this operon in members of the genera Chromobacterium, Yersinia, and Xanthomonas have resulted in reduction of biofilm formation and consequent reduction in virulence induction (Becker et al., 2009; Abu Khweek et al., 2010; Wang et al., 2012). Therefore, the absence of XAC1810-XAC1819 in the genomes of strains XauB and XauC might contribute to less efficient tissue adhesion processes and biofilm formation, and reduce cell-to-cell aggregation dependent on adhesin and exopolysaccharide molecules; this in turn would lead to a reduction in tissue colonization capabilities in these strains.

#### CONCLUSION

Taken together, our results show that the XauB and XauC genomes lack many genes that are known to play a role in host infection, either in A306 or in other pathosystems. This result is consistent with the attenuated citrus canker phenotypes of the XauB and XauC strains. In addition, the lack of recent reports about the presence of XauB and XauC strains in the field suggests a scenario in which A306 or other A strains may have outcompeted the XauB and XauC strains, possibly leading to their eradication. If so, this would be a process similar to what has taken place with Candidatus Liberibacter americanus, a causative agent of citrus huanglongbing, which has reportedly been eradicated in South America by Candidatus Liberibacter asiaticus (Wulff et al., 2014). It is to be hoped that such knowledge can be put to practical use in the efforts to eradicate the A strains from the field as well.

#### DATA AVAILABILITY STATEMENT

Raw reads are available at the Short Read Archive at NCBI at the following URLs: https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA273983; https://www.ncbi.nlm.nih.gov/sra/?term=PRJNA273983.

#### AUTHOR CONTRIBUTIONS

JS, LM, JF, JB, and FJ conceived the study. JB and FJ selected and prepared strains for sequencing. AF, RF, and JF did the genome sequencing. AV and NA assembled the genomes. NF, ÉF, WC, AS, IC, CL, RA, CG, JM, JP, LM, and JS analyzed the data and interpreted the results. JS, LM, and JP wrote the manuscript.

#### FUNDING

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001 (the BIGA project). NF was funded in part by grant Fundect-MS (007/2015 SIAFEM 025139). JS, LM, NF, and AV were funded in part by Researcher Fellowships from CNPq.

#### ACKNOWLEDGMENTS

We thank Carlos Morais Piroupo for providing computational assistance.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2019.02361/full#supplementary-material

FIGURE S1 | Trees with the reconstruction of gains, losses, and pseudogenization events for 11 effector genes. The effector name is shown at the top of each tree frame.

TABLE S1 | Type III Secretion System Effectors investigated.

TABLE S2 | Representation of an alignment of ortholog genes having as anchor the Xac306 genome.

TABLE S3 | Results of biochemical assays related to xanthan gum and biofilm production, and aggregation (left-hand side).

#### REFERENCES




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Fonseca, Patané, Varani, Felestrino, Caneschi, Sanchez, Cordeiro, Lemes, Assis, Garcia, Belasque, Martins, Facincani, Ferreira, Jaciani, Almeida, Ferro, Moreira and Setubal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.