Long-Read Sequencing Reveals Evolution and Acquisition of Antimicrobial Resistance and Virulence Genes in Salmonella enterica

Salmonella enterica is a significant and phylogenetically diverse zoonotic pathogen. To understand its genomic heterogeneity and antimicrobial resistance, we performed long-read sequencing on Salmonella isolated from retail meats and food animals. A collection of 134 multidrug-resistant isolates belonging to 33 serotypes was subjected to PacBio sequencing. One major locus of diversity among these isolates was the presence and orientation of Salmonella pathogenic islands (SPI), which varied across different serotypes but were largely conserved within individual serotypes. We also identified insertion of an IncQ resistance plasmid into the chromosome of fourteen strains of serotype I 4,[5],12:i:- and the Salmonella genomic island 1 (SGI-1) in five serotypes. The presence of various SPIs, SGI-1 and integrated plasmids contributed significantly to the genomic variability and resulted in chromosomal resistance in 55.2% (74/134) of the study isolates. A total of 93.3% (125/134) of isolates carried at least one plasmid, with isolates carrying up to seven plasmids. We closed 233 plasmid sequences of thirteen replicon types, along with twelve hybrid plasmids. Some associations between Salmonella isolate source, serotype, and plasmid type were seen. For instance, IncX plasmids were more common in serotype Kentucky from retail chicken. IncC and IncHI plasmids carried on average more than five antimicrobial resistance genes, whereas IncX plasmids carried on average fewer than one. Overall, 60% of multidrug-resistant (MDR) strains that carried >3 AMR genes also carried >3 heavy metal resistance genes, raising the possibility of co-selection of antimicrobial resistance in the presence of heavy metals. We also found nine isolates representing four serotypes that carried virulence plasmids with the spv operon.
Together, these data demonstrate the power of long-read sequencing to reveal genomic arrangements and integrated plasmids with a high level of resolution for tracking and comparing resistant strains from different sources. Additionally, the findings from this study will help expand the reference set of closed Salmonella genomes that can be used to improve genome assembly from short-read data commonly used in One Health antimicrobial resistance surveillance.

INTRODUCTION
Salmonella enterica is an important zoonotic pathogen that causes over one million illnesses in the United States each year (Scallan et al., 2011). S. enterica is classically subdivided into serotypes, and over 2,600 serotypes have been identified thus far. While many serotypes may be capable of causing infections in humans and animals, a limited number of serotypes cause most human infections in the United States. Recent advancements in whole genome sequencing (WGS) offer a unique opportunity to dissect and investigate Salmonella serotypes at the nucleotide level and to further our understanding about notable evolutionary changes. The main features associated with S. enterica evolution include acquisition and recombination of mobile genetic elements such as genomic islands, transposons, integrons, and plasmids, among others (Partridge et al., 2018). An in-depth analysis of these features will help us to understand drivers of resistance, host and environmental adaptations, and sources of resistant Salmonella infections. While most Salmonella infections are self-limiting, serious infections can require antimicrobial therapy (Tack et al., 2020). Antimicrobial resistance (AMR) can compromise therapy, increase healthcare costs, and cost lives (Abebe et al., 2020). In Salmonella, AMR is typically attained by horizontal acquisition of antimicrobial resistance genes (ARGs), although chromosomal mutations also play a role (McDermott et al., 2016).
One way that Salmonella strains acquire ARGs is through acquisition of plasmids (Emond-Rheault et al., 2020). Plasmids carry not only ARGs, but also heavy metal and disinfectant resistance genes, which may contribute to co-selection for AMR (Vijayakumar and Sandle, 2019). The types of plasmids that Salmonella carry can vary considerably, as they may include species-specific non-conjugative plasmids, or conjugative plasmids found widely among Enterobacterales (Redondo-Salvo et al., 2020). Some plasmid types are highly associated with specific serotypes and sources (Zhao et al., 2020), thus plasmids provide important information for outbreak investigations and AMR source attribution. Traditionally, incompatibility plasmid types have been used to assign plasmids into different groups based on plasmid replication machinery (Carattoli, 2013). This approach does not account for all plasmid types and it is often unclear which replication machinery is dominant, especially in hybrid plasmids arising from recombination (Hsu et al., 2019).
Plasmids and other resistance elements in Salmonella have been characterized extensively by WGS. The use of short-read sequencing in conjunction with programs such as PlasmidSpades, PLACnet, and others has helped expand analyses of genomes derived from short-read sequencing data (de Toro et al., 2014; Antipov et al., 2016). There have been relatively few large-scale, long-read sequencing studies, which can yield more complete genomic information with higher resolution.
Aside from plasmids, ARGs are also commonly carried by chromosomally encoded Salmonella genomic islands (SGIs). SGI-1 was first reported in S. Typhimurium DT104 in 2001. It contained a 27 kb backbone plus a 15 kb complex with a class 1 integron, with ARGs conferring resistance to five antimicrobial classes (Boyd et al., 2000). Different variants of SGI-1 have been described, with a diversity of ARG alleles in multidrug resistance (MDR) regions (Hall, 2010). Additional SGIs, including SGI-0, SGI-2, SGI-3, and SGI-4, have been identified based on genomic structure and resistance gene content. Both SGI-0 and SGI-2 are in the same location as SGI-1 and share the SGI-1 backbone sequence (Levings et al., 2008; de Curraize et al., 2020). SGI-3 and SGI-4 were initially described as distinct SGIs, but they are in the same chromosomal location, have the same sequence backbone structure, and are considered the same SGI (Arai et al., 2019; Branchu et al., 2019). SGI-4 did not carry AMR genes; instead, it carried 24 heavy metal resistance genes (HMRGs) (Arai et al., 2019). Together, the acquisition of AMR determinants and of mobile genetic elements contributes to the genomic diversity found in Salmonella.
Salmonella pathogenic islands (SPIs) play a pivotal role in Salmonella virulence (Hensel, 2004; Rychlik et al., 2009). There are 24 known SPIs and several are associated with particular mechanisms of virulence (Cheng et al., 2019). SPI-1 and SPI-2, which encode type III secretion systems, are conserved across Salmonella, along with SPIs 3-6, SPI-9, and SPI-11 (Jennings et al., 2017; Lou et al., 2019; Zhao et al., 2020). Most other SPIs are variable across serotypes and may account for differences in virulence among Salmonella serotypes (Zhao et al., 2020). Thus, the presence and organization of these SPIs are important to understanding the evolution and pathogenicity of Salmonella.
The history of Salmonella epidemiology has relied on various features to categorize strains. Following biochemical profiling, serotyping has long been the basis of Salmonella strain typing and tracking. Later, plasmid profiling by electrophoresis and pulsed field gel electrophoresis (PFGE) were used. For AMR monitoring in national programs such as the United States National Antimicrobial Resistance Monitoring System (NARMS), minimum inhibitory concentration (MIC) testing followed by multiplex PCR and conjugation assays have been commonly used to track phenotypic and genotypic resistance. With the advent of affordable WGS, surveillance of AMR Salmonella and other pathogens can now be done routinely using short read DNA sequencing chemistries. While this provides a comprehensive picture of strain relatedness and gene carriage, closed genomes are needed to reveal the detailed gene arrangements and structural changes. In this report, we describe the use of PacBio long-read sequencing to characterize 134 isolates, representing 33 Salmonella serotypes, isolated from raw meats and food animals. This study helped us to elucidate the genomic structure and location of virulence and resistance genes, their colocation on mobile DNA elements, and how these traits relate to Salmonella evolution. We also proposed a new approach to simplify naming of SGIs based on their genomic position.

Isolate Sources and PacBio Sequencing
One hundred thirty-four isolates, representing 33 serotypes, were collected as part of routine surveillance by the NARMS. The sources of these isolates were chicken, turkey, beef, and pork products as well as cecal/gut samples collected at slaughter from swine, turkey, cattle, and chicken from 2016-2018 across 31 different states. Isolates were selected for Pacific Biosciences (PacBio) long-read sequencing to represent diverse resistance patterns including three pansusceptible isolates, diverse serotypes and different NARMS sources (Supplementary Table 1).
For long-read sequencing, DNA libraries were prepared using a 10 kb template preparation protocol with SMRTbell template prep kit v 1.0. Sequencing was performed using Pacific Biosciences technology on the Sequel platform with sequencing kit 3.0, as described previously

Resistance Gene and Plasmid Identification
Antimicrobial resistance genes, biocide resistance genes, and HMRGs were identified with AMRFinder Plus version 3.8 (Feldgarden et al., 2019). The AMRFinder Plus virulence genes and ARGs outside the AMRFinder core genes were not reported, due to their limited relevance to this Salmonella study.
To identify plasmid replicon sequences, we used PlasmidFinder with cutoffs of 90% identity and 60% length (Carattoli et al., 2014). The sequence of the spvRABCD operon was extracted from the plasmid pOU1115 carried by a S. Dublin strain (Accession DQ115388). A local blastn analysis with the same cutoffs was performed to identify the presence of this spv operon.
Integrated plasmids were also identified by a similar approach, with PlasmidFinder being used to identify replicons. In some cases, blastn analysis was conducted to identify similarity with known plasmids (Table 2).
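The identity and length cutoffs described above can be applied to tabular alignment output with a few lines of code. The following is a minimal sketch, not the pipeline used in this study. It assumes a five-column, tab-separated hit table (reference name, contig, percent identity, alignment length, reference length); the `Hit` type, the function names, and the example values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    reference: str   # reference sequence (e.g., a replicon or the spv operon)
    contig: str      # contig in the assembly that was hit
    pident: float    # percent identity of the alignment
    aln_len: int     # alignment length in bp
    ref_len: int     # full length of the reference sequence in bp


def parse_hits(lines):
    """Parse tab-separated hit lines (assumed five-column layout)."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        ref, contig, pident, aln_len, ref_len = line.rstrip("\n").split("\t")[:5]
        hits.append(Hit(ref, contig, float(pident), int(aln_len), int(ref_len)))
    return hits


def passes_cutoffs(hit, min_identity=90.0, min_length_pct=60.0):
    """Keep a hit only if it meets both the identity and the length cutoff."""
    length_pct = 100.0 * hit.aln_len / hit.ref_len
    return hit.pident >= min_identity and length_pct >= min_length_pct


# Illustrative values only (not data from this study).
example = [
    "IncQ1\tcontig_1\t99.5\t700\t796",   # passes both cutoffs
    "IncX1\tcontig_2\t88.0\t370\t374",   # fails identity (< 90%)
    "IncC\tcontig_3\t97.0\t200\t417",    # fails length (< 60%)
]
kept = [h.reference for h in parse_hits(example) if passes_cutoffs(h)]
print(kept)
```

Requiring both criteria at once avoids reporting short, high-identity fragments or long, divergent alignments as replicon hits.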

Salmonella Pathogenic Islands and Salmonella Genomic Island Identification
Sequences of 24 SPIs were downloaded from GenBank to a local database (Fookes et al., 2011; Hayward et al., 2014; Cheng et al., 2019; Hsu et al., 2019). The size of the SPIs ranged from 1.7 to 133.3 kb, encoding 1-21 virulence genes. Because SPIs vary in length and gene content, an SPI was called present if any of its associated virulence genes in the previously developed database (Blondel et al., 2009; Suez et al., 2013; Zhao et al., 2020) was detected by blastn v.2.7 at 85% identity and at least 70% length, and if its position was anchored by aligning the chromosome sequence against the reference SPI sequence at 85% identity and at least 10% length. SPI-8 and SPI-13 share the same genomic location adjacent to tRNA-pheV (Espinoza et al., 2017). Seven isolates from four serovars carried 17 kb (70%) of the SPI-13 sequence but lacked its virulence genes; this variant is designated SPI-13* in this paper.
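The two-part rule above (a virulence-gene hit plus a positional anchor against the reference island) can be written as a small decision function. This is a sketch under stated assumptions rather than the authors' implementation: the function name `call_spi`, the input layout, and the three output labels are hypothetical, and the "backbone only" label is simply one way to flag cases such as SPI-13*, where island sequence is present without its virulence genes.

```python
def call_spi(gene_hits, anchor_hit,
             gene_identity=85.0, gene_length=70.0,
             anchor_identity=85.0, anchor_length=10.0):
    """Classify one SPI in one chromosome.

    gene_hits  -- list of (identity %, length %) tuples, one per virulence-gene hit
    anchor_hit -- (identity %, length %) for the reference SPI sequence, or None
    """
    # At least one associated virulence gene must meet the gene cutoffs.
    gene_ok = any(ident >= gene_identity and length >= gene_length
                  for ident, length in gene_hits)
    # The reference island sequence must also align at the anchor cutoffs.
    anchor_ok = (anchor_hit is not None
                 and anchor_hit[0] >= anchor_identity
                 and anchor_hit[1] >= anchor_length)

    if gene_ok and anchor_ok:
        return "present"
    if anchor_ok:
        return "backbone only"  # island sequence without its virulence genes
    return "absent"


# Illustrative inputs only.
print(call_spi([(97.0, 95.0)], (97.0, 21.0)))
print(call_spi([], (90.0, 70.0)))
print(call_spi([(80.0, 100.0)], None))
```

The low length cutoff for the anchor tolerates islands whose content has diverged substantially from the reference while still requiring a recognizable backbone at the expected location.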
SGI-1 and potential variant sequences were initially identified by blast with 85% identity and 70% length to 47 kb of reference SGI-1 sequence from S. Typhimurium DT104 (Boyd et al., 2000). Further analysis to identify additional SGI-1 sequences involved identifying insertions between yidY (5′) and thdF (3′), with seven additional SGI variants identified. The existence of SGI-4 was discovered using a blast query of chromosomal sequences against the SGI-4 reference (MN730129.1) with 85% coverage and 70% homology cutoffs.
The additional SGI-4 variant was discovered by aligning the chromosome of N16S319 with those of other S. Alachua isolates without resistance genes. The other islands with ARGs and/or HMRGs were discovered by comparison with chromosomes of the same serotype from isolates lacking ARGs and HMRGs.
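The position-based search for SGI-1 variants, which looks for inserted sequence between the flanking genes yidY and thdF, can be sketched as a simple interval check. This is an illustration only: the function name, the `max_empty_gap_bp` threshold, and the coordinates are hypothetical, and the sketch assumes both genes lie on the same contig with yidY upstream of thdF in the chosen orientation.

```python
def insertion_between(flank5_end, flank3_start, max_empty_gap_bp):
    """Return the size (bp) of a putative insertion between two flanking genes.

    flank5_end       -- coordinate where the 5' flanking gene (e.g., yidY) ends
    flank3_start     -- coordinate where the 3' flanking gene (e.g., thdF) starts
    max_empty_gap_bp -- largest gap expected when no island is present
                        (hypothetical parameter; must be chosen by the user)
    Returns None when the gap is no larger than the expected empty gap.
    """
    if flank3_start < flank5_end:
        raise ValueError("flanking genes overlap or are in the wrong order")
    gap = flank3_start - flank5_end
    return gap if gap > max_empty_gap_bp else None


# Illustrative coordinates only.
print(insertion_between(1_000_000, 1_044_000, 2_000))
print(insertion_between(1_000_000, 1_000_500, 2_000))
```

A candidate interval found this way would still need to be inspected for its gene content, since the check reports only that extra sequence is present, not what it encodes.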

Phylogenetic Tree
The program kSNP3.0 was used to generate single nucleotide polymorphisms (SNPs) from a subset of 44 complete chromosomes to represent all 33 serotypes and cases of chromosomal heterogeneity within serotypes (Gardner et al., 2015). Prior to SNP generation, the k-mer size was chosen with Kchooser, which is included in kSNP3.0 (Gardner et al., 2015). The maximum likelihood phylogenetic tree was constructed in MEGA7.0 with 250 bootstrap replicates. The clade including S. Montevideo and S. Schwarzengrund was placed at the root based on established literature (Worley et al., 2018).

Presence of Salmonella Pathogenic Islands and Arrangement in the Chromosome
Assembly of long-read sequences produced circular closed chromosomes in 116 of the 134 Salmonella isolates, ranging in size from 4,492,868 bp in S. Bredeney to 5,073,615 bp in S. I 4,[5],12:i:-. Nine isolates had genomes >5,000,000 bp: six I 4,[5],12:i:-, two S. Agona, and one of another serotype (Table 1). Our data showed that additional accessory genomic elements, such as phage, plasmids integrated into the chromosome, or SGIs, contributed to the larger genomes. Salmonella pathogenic islands contain a variety of genes that contribute to Salmonella virulence as part of Salmonella serotype evolution (Marcus et al., 2000). To assess the complement of SPIs that contribute to the diversity of chromosomal sequences from different serotypes, we constructed a phylogenetic tree using SNPs across 44 chromosomes from 34 serotypes (Figure 1A). This tree reflects the phylogenetic relationships among different Salmonella serotypes and is reflective of their entire genomic content.
Salmonella pathogenic islands were largely conserved within serotypes but showed varying degrees of diversity between serotypes, consistent with previous reports (Hsu et al., 2019; Zhao et al., 2020). As expected from the close relationship between S. Typhimurium and its monophasic variant S. I 4,[5],12:i:- (Ido et al., 2014), their complement of SPIs was identical (Figure 1). In fact, the large clade of related serotypes from Typhimurium to Litchfield all had almost identical combinations of SPIs (Figure 1B). Only a few serotypes displayed SPI variability among their strains. Interestingly, none of the SPIs was serotype-specific, as each of the 21 identified SPIs was distributed among multiple serotypes. Large inversions were observed within some serotypes. For example, in one of two S. Enteritidis strains, a large fragment from SPI-6 to SPI-17 was inverted, and in one of the S. Infantis strains the region from SPI-12 to SPI-14 was inverted (Figure 1B).

Acquisition of Genomic Islands and Associated Antimicrobial Resistance Genes and Heavy Metal Resistant Genes
In this study, SGI-1 and SGI-1 variants were found among 12 strains, including serotypes Typhimurium (n = 5), Senftenberg (n = 2), Derby (n = 3), Saintpaul (n = 1), and Alachua (n = 1) (Table 1). The insertion site of SGI-1 was consistent across these serotypes, lying between SPIs 4 and 3 in the assembled chromosomes. The backbone and ARGs of the SGIs varied greatly among strains. All five S. Typhimurium SGI-1s carried a similar backbone structure and ARGs, aadA2-qacEdelta-sul1-floR-tet(G)-blaCARB-2-qacEdelta, as previously reported (GenBank accession number AF261825), except for that in N17S016, which was 9 kb shorter and contained only the sul1 and blaCARB-2 ARGs.
The three S. Derby strains carried an SGI-1 backbone structure (43.9-44.9 kb) similar to that of the S. Typhimurium strains, and all three carried a similar resistance gene cassette (Table 1), with 2 to 4 ARGs and 4 mercury resistance genes.
The remaining four SGI-1 sequences varied greatly in size, from 29.7 kb in a S. Saintpaul strain to 117.9 kb in a S. Senftenberg strain (Table 1). The structures and resistance gene content varied considerably, with 5-6 ARGs and 0-6 mercury resistance genes (Table 1). Interestingly, although the top NCBI blast hits were for Salmonella, all SGI-1s were found to be homologous or partially homologous (Table 1) to genomic islands in Proteus. The S. Senftenberg SGI-1 was particularly notable since it appeared to have a hybrid origin, with 51 kb aligning with other SGI-1 sequences and 67 kb homologous to Citrobacter sequence (Table 1).
An additional potential genomic island, estimated at 84 kb, was found in S. Alachua (Table 1). It has some homology to SGI-4, with 95% identity and 40% length. Although not experimentally validated as an SGI, it has many similarities to SGI-4, including its location between SPI-4 and SPI-6 and the presence of the sil and pco HMRGs. This region also has genes related to conjugative transfer and partitioning, indicating that it is likely a mobile element.

Integration of Plasmids Into the Chromosome and Associated Antimicrobial Resistance Genes and Heavy Metal Resistant Genes
IncQ plasmid replicons were detected in chromosomal sequences of fourteen S. I 4,[5],12:i:- isolates (Table 2). Interestingly, the IncQ plasmids were located in the region occupied in S. Typhimurium by the fljB gene, which encodes the phase two H antigen, and the fljB region was missing in S. I 4,[5],12:i:- (Figure 2). This finding suggests a plasmid-to-chromosome recombination event that transformed strains from serotype Typhimurium to I 4,[5],12:i:-. In each of the S. I 4,[5],12:i:- genomes in this study, the recombination event had introduced sul2, and in all but one isolate it had also introduced tet(B), blaTEM-1, and aph(3'')-Ib/aph(6)-Id.
There are additional examples of ARGs or HMRGs in the chromosome that may have resulted from plasmid integrations (Table 2). For three S. Typhimurium strains from retail chicken (N18S0597, N18S1595, and N18S2170), there are two chromosomal regions with ARGs and HMRGs. The first is a region of up to 107 kb located between SPI-6 and SPI-16 that carries merR, tet(A), and sul2 and can be traced back to an IncC plasmid, pN18S1634-2. The second region is about 85 kb and has over 70% alignment to a previously published IncHI plasmid (nucleotide accession MH287085.1). This 85 kb region carries five ARGs and 16 HMRGs. Another common insertion was a 31.6 kb element with silver and copper resistance genes. This insertion was found in the chromosomes of nine isolates, including serotypes Muenster, Johannesburg, Senftenberg, Heidelberg, Schwarzengrund, and Agona, and across multiple animal sources (Table 2). This insertion has high sequence homology with the IncHI plasmid pF18S044-1 (Table 2). These findings reveal how plasmids or their remnants can contribute to the chromosomal acquisition of ARGs and HMRGs.
The most common serotypes with chromosomal ARGs were Typhimurium (20/24 isolates), I 4,[5],12:i:- (15/16), Agona (10/10), and Heidelberg (7/7). In contrast, serotypes Kentucky (0/11) and Reading (0/6) did not have any chromosomal ARGs. Among the nine isolates that did not carry any plasmids, no chromosomal ARGs were found in three isolates (N17S192, N17S312, and N18S1429), which were pan-susceptible to all antimicrobials tested; fosA was found in two (N18S0476 and N18S0722); and the remaining four (N17S107, N17S834, N18S0173, and N18S2170) carried multiple chromosomal ARGs (Supplementary Table 1). These isolates encompassed seven different serotypes. Chromosomal HMRGs were found in 51 of 134 isolates, including genes encoding resistance to copper, silver, mercury, and arsenic. These comprised 22 turkey, 14 swine, 11 chicken, and 4 cattle isolates. Thirty-nine of the fifty-one isolates with more than three chromosomal HMRGs also had chromosomal ARGs. In many cases, HMRGs and ARGs were physically linked (Table 2).
FIGURE 2 | IncQ plasmid integrated into the Salmonella I 4,[5],12:i:- chromosome. The circle on the left represents the chromosome structure of S. Typhimurium strain F18S013, with SPIs (in red) distributed on the chromosome. The green area is the sequence homologous to Salmonella I 4,[5],12:i:- strain F18S010. The gray area shows differences between the two strains. The detailed comparison of the genomic structure of the two strains on the right is from the area marked by the black box on the chromosome.
Together these findings among 134 Salmonella genomes show that the maintenance and spread of chromosomal ARGs and HMRGs in Salmonella is accomplished through a complex interplay of genomic islands and integrated plasmids. Further work will be needed to understand whether acquisition of these genes is specifically selected for by exposure to heavy metals and/or antimicrobials or connected to other survival and fitness challenges faced by Salmonella.

Plasmid Types and Association With Resistance Genes, Sources, and Serotypes
From the 134 Salmonella isolates we developed high-quality sequences for 285 plasmids, 245 of which we fully resolved into circular structures. On average there were two plasmids per isolate, although nine isolates had no plasmids and one had seven (Table 1). We sought to identify the ARGs and HMRGs on these plasmids and evaluated whether any plasmids had animal source or serotype specificity. Although some plasmid types were most common in certain sources, most were widely distributed across all animal sources (Figure 3); some association between plasmid type and serotype was also noticed (Table 3).

IncC
The 36 IncC plasmids varied considerably in size, from 52 to 232 kb, and were found among all animal sources (Table 3). These plasmids were found among 11 different serotypes, with the most prevalent being S. Typhimurium. Some common resistance patterns emerged: 34 of the plasmids had sul2 and tet(A), and 22 had all of blaCMY-2, floR, and aph(6)-Id/aph(3'')-Ib. Twelve of the isolates also had at least one qac gene and 29 had at least one mercury resistance gene.

IncHI
There were 15 IncHI plasmids identified among twelve different serotypes, present in all food animal sources, with swine being most common (7/15, 47%). All were large plasmids of 145-354 kb and possessed from one to fourteen ARGs. Seven also had the qacE biocide resistance gene, and all had at least one metal resistance operon, including those conferring resistance to silver, copper, mercury, and arsenic.

IncI
There were 39 IncI plasmids in our collection of 134 genomes, from 19 different serotypes, with at least one ARG found in 33 of the plasmids. They ranged in size from 62 to 125 kb and were found in all four food animal sources. The most common ARG was the clinically relevant blaCMY-2, which was found in 15 isolates, including six S. Kentucky from retail chicken. Only two of the plasmids had HMRGs, and the qacE gene was seen in eight of the plasmids. The most common animal source for IncI plasmids was turkey, with 22 isolates, and 6 of these were in serotype I 4,[5],12:i:-.

Other Plasmid Types
Several other plasmid types were identified, including IncR, IncA, IncB/O/K/Z, IncQ, IncN, IncP, IncF, and IncY, listed in descending order of ARG prevalence. The 27 IncF plasmids most commonly encoded either no ARGs (10/27) or only aminoglycoside and tetracycline resistance genes (8/27). Only 4/15 IncX plasmids encoded ARGs. Small Col-type plasmids contained few ARGs, with only 7/72 encoding any ARGs, and were spread widely across different sources (Figure 3). There were 11 IncQ plasmids (aside from those integrated into the chromosome) found among diverse serotypes and sources, nine of which had the genes sul2, aph(6)-Id/aph(3'')-Ib, and tet(A). There were also three IncN plasmids, two IncA, and one plasmid each of IncP, IncR, IncY, and IncB/O/K/Z. Twelve plasmids had multiple replicon types, indicating likely recombination between multiple plasmid types. These included IncHI, IncC, IncI, IncN, IncF, IncP, IncQ, and IncX replicons (Supplementary Table 1). All but one of these hybrid plasmids had ARGs for at least three different drug classes.
A total of 44/285 plasmids did not have hits based on PlasmidFinder, indicating a failure of conventional typing techniques to identify them. Even though these plasmids were un-typeable, three contained ARGs (Supplementary Table 1).
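The three outcomes of replicon typing described above (a single replicon type, multiple replicon types, or no hit) can be expressed as a short classification rule. This is a minimal sketch with hypothetical names; it assumes the caller has already reduced replicon hits to family-level labels (for example, "IncF" rather than individual IncF subtypes), which is an assumption of this example and not a statement about how the study grouped them.

```python
def classify_plasmid(replicons):
    """Classify a plasmid from its list of family-level replicon labels."""
    families = sorted(set(replicons))  # collapse repeated hits to one label
    if not families:
        return "untypeable"            # no replicon hit at the chosen cutoffs
    if len(families) == 1:
        return families[0]             # single replicon type
    return "hybrid (" + "+".join(families) + ")"  # multiple replicon types


# Illustrative inputs only.
print(classify_plasmid(["IncC"]))
print(classify_plasmid(["IncQ", "IncHI"]))
print(classify_plasmid([]))
```

Keeping "untypeable" as an explicit class, rather than dropping such plasmids, preserves cases like the three un-typeable plasmids here that nonetheless carried ARGs.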
Overall, there were 33 Salmonella serotypes represented, with some interesting serotype-specific findings. Serotype Agona had blaCMY-2 as part of multidrug-resistant IncC plasmids in six of ten isolates. Similarly, MDR IncC plasmids were found in all four S. Montevideo and all five S. Newport isolates.
Serotype Kentucky also had IncX1 plasmids in ten of eleven isolates, with each of these being from chicken sources. There was one S. Kentucky isolate from turkey that did not have this plasmid, and it did not possess any ARGs (Supplementary Table 1). Seven S. Kentucky isolates also had IncF plasmids with aph(3'')-Ib/aph(6)-Id and tet(A); this combination of plasmid type and resistance genes was not observed in any other serotypes.
Other serotypes also had specific plasmid/resistance gene combinations. For instance, five of six S. Reading isolates had IncQ plasmids with the ARGs aph(3'')-Ib/aph(6)-Id, sul2, and tet(A), among only nine total isolates with this combination of plasmid type and resistance genes. All but one of these nine isolates were from turkey sources. Of importance to Salmonella virulence, the spv operon was found in nine isolates of four serotypes, including five Typhimurium, one I 4,[5],12:i:-, two Dublin, and one Enteritidis. IncF (IncFII(S) subtype) replicons were present in all but one plasmid with the spv operon; the exception likely lacked this replicon because the plasmid was not fully circularized. IncC and IncHI plasmid types were most strongly associated with ARGs (Table 3).
Some genes were plasmid-specific, including catA and mcr-9, found only on IncHI plasmids, and qnrB19, found only on Col plasmids. Other genes, such as blaTEM-1 and tet(B), were more widely distributed across different plasmid types (Figure 4). Overall, there was an association between the presence of ARGs and the presence of HMRGs on the same plasmid (Figure 5).

DISCUSSION
Here we present the results from long-read sequencing of 134 Salmonella genomes, including over two hundred plasmids with circularized sequences. This sequencing effort produced several important findings, including insight into the acquisition of virulence and AMR determinants.
The size of SPIs ranged from 1.7 to 133.3 kb, encoding 1-21 virulence genes. A phylogenetic tree based on the presence and absence of genes in each SPI from our previous study showed that, aside from serotype Choleraesuis, many other serotypes showed great diversity, particularly Newport, Derby, Agona, Anatum, and Typhimurium (Zhao et al., 2020). Some SPIs were more conserved across serotypes, such as SPIs 1-5, which had ≥96% identity and 72% length alignment with the reference SPI in all isolates. Some SPIs had greater variability, such as SPI-6, which in some cases had 97% identity but only 21% length alignment with the reference SPI, although its position relative to other SPIs was consistent in all isolates. Despite the great diversity of SPIs, the virulence genes are highly specific to individual SPIs. The presence of SPIs varied in different serotypes but was largely consistent within individual serotypes (Figure 1). Several publications have reported different SPI profiles obtained with different methods of identifying SPIs (Amavisit et al., 2003; Suez et al., 2013). In this study, SPIs were identified with a combined method based on virulence gene presence, genomic location, and SPI reference structures.
Rarely, some virulence genes were found in multiple locations; for example, gtrA genes were shared by both SPI-16 and SPI-17. Since the genomic location of each SPI is conserved, the presence of SPI-16 or SPI-17 was readily determined by this combined method.
In this study, it was found that the relative position of SPIs to each other was generally stable. Even SPIs that are in phylogenetically distant serotypes, such as SPI-18 in Kiambu and Bredeney, had conserved insertion locations. This was true even though the number of SPIs was variable across different serotypes, indicating a key role for the SPIs in the evolution of serotypes.
Additional elements could also be readily identified by reference to the SPI positions, as shown in Figure 1. The conserved SPI genomic locations can help to identify large insertions, such as SGIs or other accessory elements. In all cases the combinations of SPIs were consistent within serotypes, which can potentially be used to orient assemblies from short-read data, acting as a scaffold to understand the arrangement of the genome.
Recent research shows that SGI-1 is an integrative mobile element that plays an important role in introducing antibiotic resistance in various Gram-negative bacteria, including S. enterica, Proteus mirabilis, and Acinetobacter baumannii (Cummins et al., 2020). We identified twelve isolates with SGI-1 sequences, either by homology to the reference SGI-1 sequence or from insertions in the same region. All 12 SGIs were similar to SGIs in Proteus mirabilis, but their close relatives also included sequences from Citrobacter and Enterobacter. This finding further helped us to understand how these genomic islands were horizontally acquired (Table 1).
FIGURE 4 | Plasmid-resistance gene associations. The information above relates to plasmids with typing information and does not include those with zero or multiple replicon types.
FIGURE 5 | Correlation between presence of antimicrobial resistance genes (ARGs) and heavy metal resistance genes (HMRGs). This depicts information related to plasmids and the presence of these genes.
We had other novel findings, including an SGI-1 sequence in serotype Alachua and large SGI-1s of different origin in serotypes Senftenberg and Saintpaul. Those SGI-1s had almost no homology to previously reported SGI-1 sequences. This great diversity made it impossible to name the variants alphabetically, as is typically done. In this study, we address the naming of SGI sequences by proposing a new approach based on their relative position in the genome. For instance, the previously named SGI-0 and SGI-2 (Levings et al., 2008; de Curraize et al., 2020) can both be named SGI-1 because their positions are consistent with those of other SGI-1 variants. An additional example of SGI diversity is a novel island containing HMRGs in S. Alachua. Even though it has limited homology with the previously reported SGI-4 (95% identity and 40% length), it was named an SGI-4 variant because of its genomic location. By naming SGIs based on location, we hope to streamline SGI nomenclature in future work as diverse SGI sequences are identified, and to help identify potential new variants. It would be interesting to further investigate the prevalence and distribution of SGIs in Salmonella isolated from other sources, including humans and sick animals.
In this study, we also found ARGs and HMRGs in many chromosomal sequences with 100% homology to plasmids, indicating integration of plasmid fragments into the chromosome (Table 2). In fourteen isolates of S. I 4,[5],12:i:-, MDR IncQ plasmids were inserted into the same location in the chromosome (Figure 2). This monophasic serotype often results from different insertions, deletions, or other disruptions of the fljB gene in serotype Typhimurium (Yang et al., 2015). We believe our results are the first to identify insertion of IncQ at this site, which may have resulted in conversion of S. Typhimurium to serotype I 4,[5],12:i:-.
As expected, most AMR was mediated by plasmids. A total of 147 plasmids had one or more ARGs, compared to only 74 chromosomes, some of which carried integrated plasmids as well as genomic islands. Although certain plasmid type/resistance gene associations were found only in particular serotypes, most were not typically source- or serotype-specific, and isolates carried diverse plasmid types linked with different ARGs and HMRGs. In addition, we found that 60% of MDR strains (>3 ARGs) also carried >3 HMRGs, including those conferring resistance to copper, gold, mercury, silver, arsenic, and tellurium (Supplementary Table 1 and Figure 5). This co-occurrence of ARGs and HMRGs is of interest because the presence of any of these metals in food animal production has the potential to co-select for AMR. This is of particular significance because these clustered HMRGs were found on the newly discovered SGI-4 variant and on plasmids conferring resistance to three or more antimicrobial classes (Figure 5).
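The co-carriage proportion described above (MDR strains with >3 ARGs that also carry >3 HMRGs) can be illustrated with a minimal Python sketch. This is not the analysis pipeline used in this study; the record type, function name, and toy isolate counts are hypothetical and serve only to show how the strict ">3" thresholds and the denominator are applied.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class IsolateGeneCounts:
    """Per-isolate gene tallies (hypothetical record layout)."""
    isolate_id: str
    n_args: int    # number of antimicrobial resistance genes (ARGs)
    n_hmrgs: int   # number of heavy metal resistance genes (HMRGs)


def co_carriage_fraction(
    isolates: Iterable[IsolateGeneCounts],
    arg_threshold: int = 3,
    hmrg_threshold: int = 3,
) -> Optional[float]:
    """Fraction of isolates with more than `arg_threshold` ARGs that also
    carry more than `hmrg_threshold` HMRGs.

    Returns None when no isolate exceeds the ARG threshold, because the
    proportion is undefined in that case.
    """
    # Denominator: isolates exceeding the ARG threshold (strictly greater).
    high_arg = [iso for iso in isolates if iso.n_args > arg_threshold]
    if not high_arg:
        return None
    # Numerator: the subset that also exceeds the HMRG threshold.
    both = [iso for iso in high_arg if iso.n_hmrgs > hmrg_threshold]
    return len(both) / len(high_arg)


# Toy data invented for illustration only; not taken from the study isolates.
toy_isolates = [
    IsolateGeneCounts("iso1", n_args=5, n_hmrgs=6),
    IsolateGeneCounts("iso2", n_args=4, n_hmrgs=0),
    IsolateGeneCounts("iso3", n_args=7, n_hmrgs=4),
    IsolateGeneCounts("iso4", n_args=2, n_hmrgs=9),  # <=3 ARGs: not in denominator
    IsolateGeneCounts("iso5", n_args=6, n_hmrgs=3),  # exactly 3 HMRGs: not ">3"
]

print(co_carriage_fraction(toy_isolates))  # 2 of 4 qualifying isolates -> 0.5
```

Using strict inequalities mirrors the ">3" wording in the text; an isolate with exactly three genes of either kind does not count toward the corresponding group.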
There were some limitations in this study. Only 134 Salmonella isolates were sequenced, they were exclusively from food animals and retail meats, and the isolates were not randomly chosen. As a result, findings from this study may not be broadly applicable to all Salmonella serotypes or genomes from different animals, foods, or environments. In addition, we focused our sequencing on multidrug-resistant isolates, so some plasmids found to be frequently associated with AMR may have lower associations in a broader context of Salmonella serotype or genomic diversity. Also, our work highlights a drawback of using incompatibility typing to identify plasmid types, as plasmids often have either multiple replicon sequences or none. Furthermore, the pESI plasmid in S. Infantis isolates was identified only by its IncF replicons, despite the fact that this plasmid resulted from a combination of multiple plasmid types. Given these challenges, alternative approaches such as the use of Plasmid Taxonomic Units could help address at least some of these issues (Redondo-Salvo et al., 2020). Despite the limitations, this study represents perhaps the largest collection of closed Salmonella genomes reported to date and advances our understanding of Salmonella genomics, including its genomic plasticity and evolution.
Future work to evaluate the applicability of long-read data to short-read datasets, including the use of reference-assisted assemblies, will increase the level of detail in genome-based AMR surveillance such as that done in NARMS and other national surveillance programs. Greater efforts to close Salmonella genomes will also help improve our understanding of genomic plasticity, evolution, and virulence. This work will help refine risk assessments by revealing the associations of resistance and virulence on mobile DNA elements, making it possible to target interventions more precisely to limit the spread of the most problematic Salmonella serovars, including those less likely to respond to antimicrobial therapy.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.