Genetic and Phylogenetic Characteristics of Pasteurella multocida Isolates From Different Host Species

Pasteurella multocida is a leading cause of respiratory diseases in many host species. To understand the genetic characteristics of P. multocida strains isolated from different host species, we sequenced the genomic DNA of P. multocida isolated from pigs and analyzed the genetic characteristics of strains from avian species, bovine species, pigs, and rabbits using whole genome sequence (WGS) data. Our results found that a capsular: lipopolysaccharide (LPS): multilocus sequence typing (MLST) genotype A: L1: ST129 (43.75%) was predominant in avian P. multocida; while genotypes B: L2: ST122 (60.00%) and A: L3: ST79 (30.00%) were predominate in bovine P. multocida; genotype D: L6: ST50 (37.50%) in porcine P. multocida; and genotype A: L3: ST9 (76.47%) in rabbit P. multocida. Comparative genomic analysis of P. multocida from different host species found that there are no genes in the P. multocida genome that are specific to any type of host. Phylogenetic analysis using either whole-genome single nucleotide polymorphisms (SNPs) or the set of SNPs present in all single-copy core genes across genomes showed that P. multocida strains with the same LPS genotype and MLST genotype were clustered together, suggesting the combining both the LPS and MLST typing schemes better explained the topology seen in the P. multocida phylogeny.


INTRODUCTION
Pasteurella multocida infects a wide spectrum of domestic and wild animals such as poultry and wild birds, pigs, cattle and buffaloes, rabbits, small ruminants, cats (including house cats and large wild cats), dogs and other mammals (Wilkie et al., 2012;Wilson and Ho, 2013). P. multocida isolates are associated with clinical manifestations ranging from asymptomatic or mild chronic upper respiratory inflammation to acute, pneumonic and/or disseminated disease (Wilson and Ho, 2013). While less common, P. multocida can also cause infections in humans through animal bites and/or scratches (Wilson and Ho, 2013).
There are many virulence factors contributing to the pathogenesis of P. multocida. Currently known virulence factors include genes involved in the formation of the capsule, lipopolysaccharide (LPS), fimbriae and adhesins, toxins, iron regulated and iron acquisition proteins, sialic acid metabolism, hyaluronidase, and outer membrane proteins (OMPs) (Harper et al., 2006). The two key surface components, capsule and LPS, form the main typing basis of P. multocida. Serologically, the bacterium is classified into five capsular serogroups (A, B, D, E, F) and/or 16 somatic serotypes according to its capsule and/or LPS antigens, respectively (Carter, 1955;Heddleston et al., 1972). These traditional serological typing methods are too complicated to be used, because the preparation of the high-sensitive antiserum required for these methods is very difficult (Peng et al., 2016). Therefore, molecular typing methods have been developed to help assign P. multocida into five capsular genotypes (A, B, D, E, F) (Townsend et al., 2001) and eight LPS genotypes (L1-L8) . In addition, the multilocus sequence typing method has been also designed for P. multocida (Subaaharan et al., 2010), and it has been widely used in epidemiology and surveillance (Hotchkiss et al., 2011;Moustafa et al., 2013;Fernández-Rojas et al., 2014;Peng et al., 2018).
It is documented that the infection of P. multocida isolates displays host predilection (Wilkie et al., 2012;Wilson and Ho, 2013). For example, a P. multocida B: 2 strain can kill cattle and buffaloes at a low dose, but it has no effect on chickens, even at very high doses (Aktories et al., 2012). Isolates from nonavian hosts generally do not cause symptoms of fowl cholera in birds (Ahir et al., 2011;Peng et al., 2017). However, little is known about the genetic characteristics of P. multocida strains circulating in different hosts. In this study, the genomic DNA of P. multocida strains isolated from pigs was sequenced, and the genetic characteristics of P. multocida isolates from avian hosts, bovine hosts, pigs and rabbits were determined using the previously published whole genome sequences (WGSs) data. The aim of this study is to understand the genetic characteristics of P. multocida strains circulating in different host species.

P. multocida Strains and Whole Genome Sequencing
A total of 47 P. multocida strains were selected for whole genome sequencing in this study ( Table 1). Most of these strains were isolated from pigs with respiratory disorders in China (Peng et al., 2018), with the exception of strain HN04, which is a capsular type B isolate from a swine haemorrhagic septicaemia case. The capsular types and LPS genotypes of these 47 P. multocida isolates were determined through PCR assays, as described previously (Townsend et al., 2001;Harper et al., 2015).
The genomic DNA (gDNA) of the 47 P. multocida strains were isolated using a QIAGEN Blood and Cell Culture DNA Midi Kit (Lot NO. 157037192, QIAGEN, Germany) with QIAGEN Genomic-tip 100/G (Lot NO. 157024372, QIAGEN, Germany). DNA quantity and quality were evaluated by electrophoresis on a 1% agarose gel and evaluation using a NanoDrop2000 (Thermo Scientific, Waltham, USA), respectively. DNA libraries were constructed using NEBNext R Ultra TM II DNA Library Prep Kit. Sequencing was carried out on an Illumina Hiseq Xten platform (Illumina Inc., San Diego, USA) at Guangdong Magigene Biotechnology Co. LTD (Guangzhou, China), using the pair-end 150-bp sequencing protocol. The strategy yielded 5,308,336∼12,159,104 raw reads. The raw reads were then filtered to eliminate reads with low quality according to the following criteria: low quality base pairs at each terminal of the reads (Quality-Value < 20) were removed; reads with over short length (parameter setting at 50 bp), > 10% Ns (parameter setting at 10 bp) or > 15 bp overlap overlap with Illumina TruSeq adapter sequences (parameter setting at 15 bp) were removed. After that, 4,732,078∼10,715,272 clean reads (Q20% = 100, Q30% ≥ 95.64) were produced. High quality reads were de novo assembled via SPAdes v3.9.0 (Bankevich et al., 2012) to generate contigs.
In addition to the 47 porcine P. multocida sequenced herein, we also retrieved genome sequences of P. multocida from different hosts from the National Center for Biotechnology Information (NCBI) genome database (https://www.ncbi.nlm. nih.gov/genome/genomes/912). By the time of this study design (Assessed at December 31, 2017), there are 132 P. multocida genome sequences publicly available. To ensure the reliability and quality of our analysis, only those high-quality genomes assemblies with contigs numbers ≤ 50 plus coverage ≥50× were selected. While not meeting the criterion of selection, the two avian P. multocida isolates P1059 (GenBank accession number: CM001581) and X73 (GenBank accession number: CM001580) were still included, because their phenotypical and genetic characteristics have been published (Johnson et al., 2013). The strategy resulted in 62 P. multocida genome sequences, which include 16 avian, 20 bovine, 9 porcine, and 17 rabbit isolates ( Table 1). A total of 109 P. multocida genome sequences were finally used in this study, including 47 sequenced herein plus 62 retrieved from NCBI database ( Table 1).
Draft genome sequences of the 47 porcine P. multocida were annotated using prokka v1.12 (Seemann, 2014) following the NCBI Prokaryotic Genome Annotation Pipeline (Angiuoli et al., 2008). The 47 annotated genomes were submitted to GenBank accession numbers are displayed in Table 1, which are highlighted in blue text.

Genotyping
The capsular genotypes and LPS genotypes of the 47 P. multocida strains sequenced were determined using PCR methods (Townsend et al., 2001;Harper et al., 2015) and are documented on our previous publication (Peng et al., 2018). Capsular genotypes and LPS genotypes of the 62 P. multocida genomes from NCBI genome database were determined by blasting the oligonucleotide sequences designed for capsular typing (Townsend et al., 2001) and LPS genotyping  against their WGSs, respectively. MLST genotypes of the 109 P. multocida strains were assigned by performing blast of their WGSs against the P. multocida MLST Databases (https:// pubmlst.org/pmultocida). Currently, there are two separate

Phylogenetic Trees
The phylogenetic relationship between P. multocida strains from different host species was predicted by analyzing the whole-genome single nucleotide polymorphisms (WG-SNPs) as well as the set of SNPs present in all single-copy core genes across genomes (CG-SNPs). The WG-SNPs were identified by comparing each of the WGSs against the reference P. multocida ATCC 43137 genome sequence (GenBank accession NO. CP008918) using MAFFT (version 7.222) software (Katoh and Standley, 2013). A phylogenetic tree based on these WG-SNPs was exported using Parsnp (version 1.2) software (Treangen et al., 2014). Another phylogenetic tree based on CG-SNPs was also constructed. In this strategy, gene contents of each of the 109 P. multocida genomes were clustered using Roary v3.11.0 software (Page et al., 2015) with a criterion of 90% DNA identity plus 90% gene coverage as the minimum criteria for a match to define core genes at first, then SNPs within all the singlecopy core genes were compared to generate a neighbor-joining tree using MEGA6 with Kimura 2-parameter model and 1000 bootstrap (Tamura et al., 2013).

Core and Pan-Genome Analysis
Gene contents of each of the 109 P. multocida genomes were clustered using Roary v3.11.0 (Page et al., 2015). Homologous genes were identified by using 90% DNA identity and 90% gene coverage as the minimum criteria for a match. Because potential genes might be disrupted by the gaps between the contigs, we defined those genes present in 99∼100% of the strains as core genes. Dispensable genes were defined as those genes shared by at least two strains but shared by <99% strains, while unique genes were defined those present only in a single strain. Hostspecific genes were defined as those genes shared by all strains isolated from the same host species but absent in the strains from the other host species. Core genes, dispensable genes, and strain-specific were annotated using the Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih. gov/COG/) for functional analysis (Galperin et al., 2015).

Distribution of Main Virulence Factors Associated Genes
Genes identified as virulence factors (VFGs) were predicted by performing BLAST analysis of the genome sequence against the bacterial Virulence Factor Database (VFDB) (Chen et al., 2005(Chen et al., , 2012. After that, a BLAST score ratio (BSR) analysis was performed on these predicted VFGs, and the presence of a VFG in a genome was determined based upon the BSR analysis with a normalized ratio ≥0.8, as described previously (Rasko et al., 2008). A heat-map showing the BSR ratio values of each gene across the genomes of all bacterial strains was generated.

General Features of the 47 Porcine P. multocida Genomes
Whole genome sequencing yielded approximately 796.25∼1823.87 Mbp raw data for the 47 porcine P. multocida isolates. The data filter strategy produced approximately 701.86∼1589.23 Mb clean data for assembly. Sequences assembled using SPAdes v3.9.0 (Bankevich et al., 2012) generated approximately 18∼66 contigs for the 47 porcine P. multocida isolates, with an average of 25 contigs for each strain. These contigs represented a 161∼631-kb N50 fragment with an average contig size of approximately 35∼130 kb. The predicted genome sizes ranged from 2.22 to 2.49 Mbp, with average G+C content ranging from 40.04 to 40.67% for the 47 P. multocida isolates. The average genome size and G+C content for each of the P. multocida sequenced were 2.35 Mbp and 40.3%, respectively. The number of coding DNA sequences among the genome sequences ranged from 2,027 to 2,330. The general features of the 47 sequenced P. multocida isolates from pigs and their accession numbers as well as those public available genomes of P. multocida from different hosts are shown in Table 1.

Capsular Genotypes
Capsular typing strategy identified two categories of capsular genotypes (A and F) for the 16 avian P. multocida isolates, two categories of capsular genotypes (A and B) for the 20 bovine P. multocida isolates, four categories of capsular genotypes (A, B, D, and F) for the 56 porcine P. multocida isolates, and two categories of capsular genotypes (A and D) for the 17 rabbit P. multocida isolates. Type A was the most common capsular genotype for the avian P. multocida isolates (93.75% of the isolates) and rabbit P. multocida isolates (94.12% of the isolates); types A and B were the common capsular genotypes for the bovine P. multocida isolates (40.00% of the isolates and 60.00% of the isolates, respectively); types A and D were the common capsular genotypes for the porcine P. multocida isolates (48.21% of the isolates and 42.86% of the isolates, respectively; see Table 1 and Image 1 in Supplementary Materials).

Phylogeny
We generated two phylogenetic trees among the P. multocida isolates from different isolates collected in this study, and they were based on two criteria, respectively. The first tree was generated using 21,765 SNPs across the WGSs of the P. multocida comparing against the reference P. multocida ATCC 43137 genome. Using this strategy, it is clearly showed that P. multocida strains with the same LPS plus MLST genotypes were phylogenetically clustered together (Figure 3). The second tree was constructed based on the SNPs within all single-copy core genes among the comparison genomes. Based on this tree, P. multocida strains with the same LPS plus MLST genotypes were also phylogenetically clustered together (Figure 4).

Core and Pan-Genome Analysis
All genes encoded by the 109 P. multocida genomes were used for the determination of homologous and unique genes. These analyses identified a shared set of 1,806 genes (core genes), 1,841 dispensable genes, and 609 strain-specific genes (Figure 5). According to COG functional prediction, the core genes mainly participated in translation, ribosomal structure and biogenesis (J), amino acid transport and metabolism (E), carbohydrate transport and metabolism (G), cell wall/membrane/envelope biogenesis (M), energy production and conversion (C), coenzyme transport and metabolism (H), inorganic ion transport and metabolism (P), and posttranslational modification, protein turnover, chaperones (O); while the dispensable genes mainly involved in mobilome (prophages, transposons) (X), replication, recombination and repair (L), and transcription (K); the strainspecific genes mainly participated in mobilome (X), replication,  recombination and repair (L), defense mechanisms (V), and transcription (K) (Figure 5).
To figure out genes carried by P. multocida genome that are associated with host predilection, we first defined a host predilection related gene as one present among the strains from one host species but absent in strains from all the other host species. This criterion found that there were no genes specific to any hosts. Therefore, we next defined a host predilection related gene was determined as one gene present among 90% of the strains from one host species but absent in 90% of the strains from all the other host species. However, this criterion also found that there were no genes specific to any hosts.

Distribution of Virulence Factor-Associated Genes
A total of 432 putative VFGs were identified by performing BLAST analysis of the 109 P. multocida genome sequences against the VFDB database, and these VFGs were assigned into 366 VFG categories (see Table S1 in Supplementary Materials). According to BSR analysis, the distribution of VFGs between different P. multocida strains was mainly reflected on the genes responsible for the synthesis of capsule, LPS and fimbrial lowmolecular-weight proteins; however, none of these genes were specific to any hosts (see Image 4 in Supplementary Materials). Interestingly, some VFGs displayed a certain level of LPS: MLST FIGURE 3 | Phylogenetic tree generated based on the SNPs across the whole genome sequences of the P. multocida comparing against the reference P. multocida ATCC 43137 genome. P. multocida capsular type A strains were marked in red, type B strains were marked in green, type D strains were marked in blue, and type F strains were marked in purple. Avian isolates are marked with " "; Bovine isolates are marked with " "; Porcine isolates are marked with " "; Rabbit isolates are marked with " ". genotype-preference. For example, the hyaluronidase encoding gene pmHAS was absent in the genotypes L6: ST50, L6: ST287, and L2: ST122 strains; the surface fibrils encoded by the gene hsf-2 was missing in genotypes L6: ST50 and L3: ST13 strains (see Image 5 in Supplementary Materials).

DISCUSSION
Birds, pigs, cattle and buffaloes, as well as rabbits are the natural hosts of P. multocida (Wilkie et al., 2012;Wilson and Ho, 2013). In this study, we genotyped P. multocida strains FIGURE 4 | Phylogenetic tree generated based on the SNPs within the single copy core genes among the comparison P. multocida genomes. P. multocida capsular type A strains were marked in red, type B strains were marked in green, type D strains were marked in blue, and type F strains were marked in purple. Avian isolates are marked with " "; Bovine isolates are marked with " "; Porcine isolates are marked with " "; Rabbit isolates are marked with " ". from these host species using their WGS data. Types A: L1 (56.25%) and A: L3 (18.75%) were the common capsule: LPS genotypes for avian P. multocida. It is widely documented that P. multocida strains causing avian pasteurellosis (fowl cholera) are most frequently designated Carter: Heddleston serotypes A:1, A:3, or A:4 (Adler et al., 1999;Wilson and Ho, 2013), and P. multocida Heddleston serotype 1 and 3/4 strains produced LPS genotypes L1 and L3, respectively (Harper et al., 2011(Harper et al., , 2013 . Therefore, our results are in agreement with those previously documents (Adler et al., 1999;Wilson and Ho, 2013). It has FIGURE 5 | Distribution of core and dispensable genes among the 109 P. multocida and their COG functions.
The predominate MLST genotype of P. multocida from avian species was ST129 (43.75%), and this result is in agreement with Wang et al. (2013). The difference between the two studies is that Wang et al. only defined ST129 from the avian P. multocida strains but we identified multiple MLST genotypes in addition to ST129; other MLST genotypes such as ST8, ST9, ST27, ST53, ST60 are also common in avian P. multocida (Subaaharan et al., 2010). These findings suggest the prevalence of P. multocida genotypes from different geographic regions might be different. ST122 was the common MLST genotype in P. multocida isolates from bovine species (Table 1). This genotype is widely documented to be associated with bovine haemorrhagic septicaemia, a common pasteurellosis in bovine species (Hotchkiss et al., 2011). Interestingly, the MLST genotype of the porcine isolate HN04 from swine haemorrhagic septicaemia case was also ST122, suggesting that ST122 might be particularly associated with the disease other than the host species. The mostly common MLST genotypes of P. multocida from pigs defined in this study were ST50 (37.50%), ST13 (23.21%), and ST74 (19.64%). These results are quite different from the date of a recent molecular epidemiological study (Peng et al., 2018), but this could be explained by the two studies using two different MLST databases to determine the MLST genotypes. When we used the Multiple host MLST database to re-assigned these strains, we found that all ST50, ST13, and ST74 strains were re-assigned as ST11, ST3, and ST10, respectively. These results are in agreement with the latest epidemical data from China (Peng et al., 2018), and also in agreement with the data from Spain (García-Alvarez et al., 2017).
The predominate MLST genotype in P. multocida isolates from rabbit species was ST9 (76.47%), which was re-assigned as ST12 using the Multiple host MLST database. Previous studies have documented that ST12 was the common MLST genotype in rabbit P. multocida (García-Alvarez et al., 2015;Massacci et al., 2018). Interestingly, a previous study documented that the ST11 (assigned using Multiple host MLST) clone of P. multocida is widely spread among farmed rabbits in the Iberian Peninsula and demonstrates respiratory niche association (García-Alvarez et al., 2015). However, the percentage of the ST11 clone (defined as ST50 using RIRDC MLST) defined in this study was very low, only 5.88% (one isolate). According to NCBI genome database (https://www. ncbi.nlm.nih.gov/genome/genomes/912), the WGSs data from the 17 rabbit P. multocida using in this study are from France, these findings also suggest that the prevalence of P. multocida genotypes in different geographic regions might be different.
When combining the capsular genotypes, LPS genotypes and the MLST genotypes, it is interesting to see that although some P. multocida strains from different hosts share the same capsule: LPS genotypes, their MLST genotypes are quite different. The MLST genotypes of P. multocida capsule: LPS genotype A: L1 strains from avian species were ST129 and ST69, but the MLST genotype of those type A: L1 strains from bovine species was ST65 (Table 1). However, P. multocida strains with the same capsule: LPS: MLST genotype are able to spread in different host species but cause similar disease symptoms. For example, the capsule: LPS: MLST genotype B: L2: ST122 strain is associated with both bovine haemorrhagic septicaemia and swine haemorrhagic septicaemia; the F: L3: ST9 clone can lead to both avian respiratory disease and porcine respiratory disease; the D: L6: ST50 clone is associated with both the porcine respiratory disease and the rabbit respiratory disease ( Table 1).
Although many publications have documented that the infection of P. multocida isolates displays host predilection (Kubatzky, 2012;Wilkie et al., 2012;Wilson and Ho, 2013), these analyses using comparative genomics found that there were no genes or VFGs specific to any isolates for any specific type of host (see Image 4 in Supplementary Materials). Though the dermonecrotic toxin encoding gene toxA was only present in one isolate from pigs (see Image 5 in Supplementary Materials), this gene has been also identified in P. multocida isolates from bovine species, avian species, small ruminants, sheep, and goats (Ewers et al., 2006;Sarangi et al., 2016;Shirzad Aski and Tabatabaei, 2016). These findings suggest that toxA is not associated with host predilection. Interestingly, it is documented that the presence of toxA displays a significant association to the disease status in swine, even though it is detected in P. multocida isolates from small ruminants, cattle, and poultry as well (Ewers et al., 2006). This suggests that the disease symptoms caused by P. multocida with specific VFGs might be also associated with the host itself and/or the locations that the bacteria are found on the host. However, more studies are necessary for further confirmation. Similar to toxA, the transferrin binding protein A encoding gene tbpA was also present in isolates from bovine species (see Image 5 in Supplementary Materials). However, this gene has been also reported to be identified in P. multocida isolates from ovine species and rabbits (Ewers et al., 2006;Ferreira et al., 2012). These findings suggest that tbpA is also not host specific.
It has been proposed that there is little or no correlation between the phylogenetic groups and phenotypic characteristics Moustafa et al., 2015;Cao et al., 2017). However, in this study, using both the SNPs across the WGSs and the SNPs within all single-copy core genes among the comparison genomes, we found that P. multocida isolates with the same LPS genotype and MLST genotype could be phylogenetically grouped, though these P. multocida strains were isolated from different geographic regions or host species, and/or associated with different disease symptoms (Figures 3, 4). While more data are needed for the further confirmation, these findings suggest the combining both the LPS and MLST typing schemes might better explain the topology seen in the P. multocida phylogeny.
In conclusion, we performed whole genome sequencing on P. multocida strains isolated from pigs and determined the genetic characteristics of P. multocida isolates from avian species, bovine species, pigs and rabbits in this study. Genotyping using WGSs data suggest that P. multocida isolates from different host species displayed a certain preference for "capsule/LPS/MLST genotypes." Although host predilection in P. multocida infections are widely documented, our analyses found there were no genes specific to any hosts. In addition, our phylogenetic analyses found that P. multocida isolates with the same LPS genotype and MLST genotype could be phylogenetically grouped, which suggests that the combining both the LPS and MLST typing schemes might better explain the topology seen in the P. multocida phylogeny.

AUTHOR CONTRIBUTIONS
ZP, WL, RZ, HC, and BW participated in the conception and design of the work; ZP, WL, ZfX, ZhX, and ZL performed whole genome sequencing and data analysis; FW and LH contributed to bacterial genomic DNA isolation and purification; ZP wrote the manuscript; RZ, HC, and BW helped to revised the manuscript. All authors read and approved the final manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. 2018.01408/full#supplementary-material Table S1 | Virulence factor-associated genes of the 109 P. multocida isolates predicted by the VFDB database.
Image 1 | Distribution of different capsule genotypes in P. multocida isolated from different host species.
Image 2 | Distribution of different LPS genotypes in P. multocida isolated from different species.
Image 3 | Distribution of different MLST genotypes in P. multocida isolated from different species.
Image 4 | Heat-map showing the distribution of virulence factor-associated genes among P. multocida isolates from different host species.
Image 5 | Heat-map showing the distribution of the 27 main kinds of virulence factor-associated genes among P. multocida isolates from different host species.