High-Resolution Comparative Genomics of Salmonella Kentucky Aids Source Tracing and Detection of ST198 and ST152 Lineage-Specific Mutations

Non-typhoidal Salmonella (NTS) is a major cause of foodborne illness globally. Salmonella Kentucky is a polyphyletic NTS serovar comprising two predominant multilocus sequence types (STs): ST152 and ST198. Epidemiological studies have revealed that ST152 is most prevalent in US poultry whereas ST198 is more prevalent in international poultry. Interestingly, ST152 is only sporadically associated with human illness, whereas ST198 is more commonly associated with human disease. The goal of this study was to develop a better understanding of the epidemiology of ST198 and ST152 in Washington (WA) State. We compared the antimicrobial resistance phenotypes and genetic relationships, using pulsed-field gel electrophoresis (PFGE), of 26 clinical strains of S. Kentucky isolated in WA State between 2004 and 2014, and 140 poultry-associated strains of S. Kentucky mostly recovered from the northwestern USA between 2004 and 2014. We also sequenced whole genomes of representative human clinical and poultry isolates from the northwestern USA. Genome sequences of these isolates were compared with a global database of S. Kentucky genomes representing 400 ST198 and 50 ST152 strains. The phenotypic, genotypic, and case report data on food consumption and travel show that human infections caused by fluoroquinolone-resistant (FluR) S. Kentucky ST198 in WA State originated from outside of North America. In contrast, fluoroquinolone-susceptible (FluS) S. Kentucky ST198 and S. Kentucky ST152 infections have a likely domestic origin, with domestic cattle and poultry being the potential sources. We also identified lineage-specific non-synonymous single nucleotide polymorphisms (SNPs) that distinguish ST198 and ST152. These SNPs may provide good targets for further investigations on lineage-specific traits such as variation in virulence and metabolic adaptation to different environments, and for the development of intervention strategies to improve the safety of food.


INTRODUCTION
Non-typhoidal Salmonella (NTS) is one of the leading causes of bacterial foodborne illness in the United States (Dewey-Mattia et al., 2018), and the second most commonly reported zoonotic bacterial infection in the European Union (EFSA and ECDC, 2018;Troeger et al., 2018). Globally, NTS causes an estimated 153 million cases of gastroenteritis annually leading to 57,000 deaths (Healy and Bruce, 2019). In the USA, an estimated 1.2 million cases of salmonellosis are caused by NTS annually, leading to 23,000 hospitalizations and 450 deaths (CDC, 2016). Poultry is considered a major reservoir host and source for human infection for several clinically significant NTS serotypes (Shah et al., 2017). In general, poultry infected with NTS do not typically show clinically discernible illness (except for poultry-adapted serovars), but handling and consumption of contaminated poultry and poultry products including eggs are considered a major risk factor for human infection (Löfström et al., 2010;Ziehm et al., 2013). Common symptoms of salmonellosis in humans include diarrhea, fever, and stomach cramps, with occasional invasive infection and death (Löfström et al., 2010).
Periodic shifts in population dynamics of NTS in poultry are known to result in the emergence and re-emergence of different serotypes or strains that cause human infection [reviewed in (Foley et al., 2011;Shah et al., 2017)]. Currently, among the 12 most prevalent poultry-associated Salmonella serovars reported in the United States, Salmonella enterica subsp. enterica serovar Kentucky (S. Kentucky) is the most predominant serovar in US poultry (Shah et al., 2017). S. Kentucky shows unique ecological and epidemiological characteristics. Unlike most other NTS serovars, comparative whole genomic and MLST analysis has revealed that S. Kentucky is polyphyletic and the two predominant sequence types (STs), ST198 and ST152 represent the two distinct genetic lineages (Sukhnanand et al., 2005;Timme et al., 2013;Haley et al., 2016;Tasmin et al., 2017;Luhmann et al., 2021). Globally, ST198 and ST152 are also among the most predominant STs isolated from a variety of poultry sources and people (Le Hello et al., 2011;Haley et al., 2016;Tasmin et al., 2017;Xiong et al., 2020). Interestingly, S. Kentucky ST152 is the most prevalent ST found in US poultry and poultry products; however, it has only been associated with sporadic cases of human illness and not yet linked to any major outbreaks of foodborne salmonellosis in the US (Shah et al., 2017). Multiple studies have reported that multidrug resistance (MDR) is common among the S. Kentucky isolates recovered from US poultry sources (Shah et al., 2017;Rauch et al., 2018). However, resistance to fluoroquinolones such as nalidixic acid and ciprofloxacin has not yet been reported in S. Kentucky isolated from US poultry (Shah et al., 2017).
Unlike ST152, which has been consistently isolated from US poultry for more than two decades, S. Kentucky ST198 has been isolated from US chickens only once, in 1937 (Edwards, 1938;Le Hello et al., 2011). A recent report shows that ST198 is sporadically isolated from US cattle sources (Haley et al., 2016). Interestingly, S. Kentucky ST198 isolated from US food animal sources such as poultry and cattle does not exhibit resistance to fluoroquinolones (Haley et al., 2016;Shah et al., 2017).
Conversely, two studies published by the CDC and the Maryland Department of Health showed that fluoroquinolone-resistant (FluR) S. Kentucky ST198 is commonly isolated from human patients in the US (Rickert-Hartman and Folster, 2014;Haley et al., 2019). Similarly, the Public Health Agency of Canada reported that FluR S. Kentucky ST198 is commonly isolated from human patients in Canada (Mulvey et al., 2013). Although limited numbers of cases associated with FluR S. Kentucky ST198 infection were reported in these studies, the majority were linked to international travelers returning from Africa, Southeast Asia, and the Middle East, where FluR S. Kentucky ST198 is commonly isolated from poultry and poultry products (Le Hello et al., 2011;Xiong et al., 2020). FluR S. Kentucky ST198 is also commonly associated with human illnesses in other parts of the world including Europe, Asia, Africa, and the Middle East (Le Hello et al., 2011, 2013). Moreover, FluR S. Kentucky ST198 is now endemic in poultry in France, Poland, and other European countries. As a result, contaminated poultry and poultry products in these countries represent a significant risk to public health and food safety. The increasing reports of detection of FluR S. Kentucky ST198 in travelers returning to North America suggest that FluR S. Kentucky ST198 is disseminating internationally and likely represents a significant risk to public health in the United States (Le Hello et al., 2011, 2013). Thus, monitoring of FluR S. Kentucky ST198 in people and food animal systems in the US is needed; however, conventional serotyping does not distinguish between ST198 and ST152, making surveillance of FluR ST198 difficult. The goal of this study was to develop a better understanding of the epidemiology of ST198 and ST152 in WA State. We compared the antimicrobial resistance phenotypes and genetic relationships, using pulsed-field gel electrophoresis (PFGE), of 26 clinical strains of S. Kentucky isolated in Washington State between 2004 and 2014, and 140 poultry-associated strains of S. Kentucky mostly recovered from the northwestern USA between 2004 and 2014. We also compared whole-genome sequences of representative human clinical and poultry isolates from the northwestern US with a global database of S. Kentucky genomes representing 400 ST198 and 50 ST152 strains. This comparison aided the identification of a set of lineage-specific single nucleotide polymorphisms (SNPs) that distinguish ST198 and ST152 and provide a strong foundation for follow-up epidemiologic investigations and foundational studies to determine potential differences in virulence, metabolism, and/or host adaptation among these lineages.

PFGE and PCR
Single-enzyme (XbaI) PFGE was performed on all isolates in this study using a modified CDC PulseNet protocol as described previously (Ribot et al., 2006). Results were analyzed using BioNumerics software version 6.6 (Applied Maths, Sint-Martens-Latem, Belgium). All isolates were tested for the presence of a Salmonella genomic island 1 (SGI-1) insertion by PCR using both previously described primer pairs (Boyd et al., 2001;Doublet et al., 2008) and newly designed primer pairs corresponding to an SGI-1 insertion site between thdF and yidY (PCR-1), following previously published PCR conditions (Supplementary Table 2). Isolates that tested negative by PCR-1 (Supplementary Table 2) were considered to carry the SGI-1 insertion, and isolates that tested positive were considered to lack it. All isolates were also screened for the presence of the pColV plasmid using the PCR primers (Supplementary Table 2) and procedure described previously (Johnson et al., 2010).

Whole-Genome Sequencing and Comparative Genomics Analysis
Whole-genome sequencing (WGS) was performed on a subset of S. Kentucky isolates selected based on PFGE clustering (Supplementary Figure 1), fluoroquinolone resistance, presence or absence of SGI-1 insertion and ColV plasmid (Supplementary Table 1), and associated epidemiological metadata (Supplementary Table 3). Genomic DNA was extracted from the representative isolates using the DNeasy Tissue Kit (Qiagen, USA) following the manufacturer's protocol. The isolates were sequenced by the FDA Center for Food Safety and Applied Nutrition (CFSAN). Briefly, paired-end sequencing libraries (2 × 250 bp) were prepared using the Nextera XT kit (Illumina, San Diego, CA) following the protocol in the DNA library reference guide, size selected to be in the range of 600-1,000 bp (average peaks of ∼800 bp), and sequenced using the MiSeq Illumina version 2 kit, according to the manufacturer's instructions. Whole-genome MLST was performed as previously described by Nadon et al. (2017).
The raw sequences of 678 S. Kentucky strains (and metadata including ST when available) were downloaded from the NCBI Sequence Read Archive (SRA) as of December 31, 2018. When ST was not available, the PubMLST sequence query using the Salmonella enterica (Achtman) MLST locus/sequence definitions was used to determine ST in silico. STs other than ST198 or ST152 were excluded. Sequence quality was evaluated using FastQC version 0.11.8 (Babraham Bioinformatics, Cambridge, United Kingdom). Next, sequences were imported into Geneious Prime version 11 (Biomatters Ltd., Auckland, New Zealand) as inward-pointing paired-end FASTQ files. Read sets with >3,000,000 reads were randomly subsampled without replacement. Reads were trimmed for quality using BBDuk version 37.64 with the following parameters: minimum Phred quality score of 30; minimum length of 30; trim both ends; trim adapters from the right end with a minimum overhang length of 24. Duplicate reads were removed with Dedupe version 37.64. Paired reads were then merged using BBMerge version 37.64 with default settings. Sequences with mean read coverage ≥30x were used for downstream analysis. Reads were mapped to the reference genome using the Geneious Prime mapper (medium sensitivity, iterated up to 5 times; multiple best matches mapped randomly). Finally, after removing sequences that lacked mean read coverage ≥30x when mapped to the reference, 450 sequences remained for further examination (ST198, n = 400; ST152, n = 50). Together with the isolates sequenced in the current study, a total of 464 sequences were evaluated (ST198, n = 406; ST152, n = 58).
To identify ST152 and ST198 lineage-specific single nucleotide polymorphisms (SNPs), the genome sequences of all S. Kentucky ST152 isolates were mapped to the reference genome (chromosome) of FluR S. Kentucky ST198 str. PU131 (NCBI RefSeq accession: NZ_CP026327.1) (Shah et al., 2018). Likewise, the genome sequences of all S. Kentucky ST198 isolates were mapped to the reference genome (chromosome) of S. Kentucky ST152 str. SA20030505 (NCBI RefSeq accession: NZ_CP022500.1) (Yoshida et al., 2016). All known horizontally acquired genetic elements including phages and insertion elements were manually annotated and excluded from further analysis. Additionally, any other genetic region that was absent in one or more of the test strains was identified as a region of difference (ROD) and excluded from downstream genetic analysis, resulting in a core genome alignment. Core genome SNPs inside and outside of coding sequences were then identified with a minimum coverage of 10 and a minimum variant frequency of 0.8; the effects of variants on translation were analyzed using the bacterial genetic code.
Phylogenies were constructed from core genome SNPs identified in ST152 and ST198 sequences mapped to the ST152 reference sequence SA20030505 and the ST198 reference sequence PU131, respectively, using PhyML version 3.3.20180621 with the HKY85 nucleotide substitution model, with invariant sites masked and excluded from analysis, and with 100 bootstrap replicates (Guindon et al., 2010). Strains within the phylogenetic clusters of interest were analyzed for the carriage of resistance genes in silico using ResFinder version 3.2 with default options (90% ID threshold; 60% minimum length threshold) (Zankari et al., 2012), and the results were correlated with the available metadata including host and geographic location of origin.
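The numeric thresholds stated above (mean read coverage ≥30x per sequence; minimum coverage of 10 and minimum variant frequency of 0.8 per SNP) can be illustrated with a minimal Python sketch. This is not the Geneious Prime implementation used in the study; the `Site` record, the function names, and the example numbers are hypothetical and serve only to show how the cut-offs combine.

```python
from dataclasses import dataclass

# Thresholds taken from the methods text.
MIN_MEAN_COVERAGE = 30.0  # mean read coverage (x) required to keep a sequence
MIN_SITE_COVERAGE = 10    # minimum read depth at a candidate SNP site
MIN_VARIANT_FREQ = 0.8    # minimum fraction of reads supporting the variant


@dataclass
class Site:
    """Hypothetical per-position summary of a read mapping."""
    position: int
    ref_base: str
    alt_base: str
    depth: int      # reads covering the position
    alt_count: int  # reads supporting alt_base


def passes_mean_coverage(total_mapped_bases: int, reference_length: int) -> bool:
    """Keep a sequence only if its mean coverage is at least 30x."""
    return total_mapped_bases / reference_length >= MIN_MEAN_COVERAGE


def is_confident_snp(site: Site) -> bool:
    """Apply the depth and variant-frequency cut-offs to one site."""
    if site.depth < MIN_SITE_COVERAGE:
        return False
    if site.alt_base == site.ref_base:
        return False
    return site.alt_count / site.depth >= MIN_VARIANT_FREQ


# Illustrative numbers only: 150 Mb mapped onto a 4.8 Mb chromosome is ~31x.
print(passes_mean_coverage(150_000_000, 4_800_000))
print(is_confident_snp(Site(100, "A", "G", depth=20, alt_count=18)))
```

A site with 18 of 20 reads supporting the variant passes both cut-offs, whereas one with 15 of 20 (frequency 0.75) or with a depth of only 8 would be rejected.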

Statistical Analyses
Statistical analyses (two-sample z-test comparison of means) were conducted using an open-access Z score calculator (Stangroom, 2021). For all analyses, p < 0.05 was considered significant.
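Because the quantities compared in this study are counts of resistant isolates out of group totals, the test can be sketched as a two-sample z-test on proportions. The Python sketch below assumes a pooled standard error and a two-sided p-value; the exact formula used by the online calculator is not stated in the text, so this is an assumption, and the function name is hypothetical.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-sample z-test for proportions x1/n1 versus x2/n2.

    Returns (z, two-sided p-value). The pooled standard error is an
    assumption about how the comparison was computed.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# MDR in human clinical (15/26) versus poultry (6/140) isolates.
z, p = two_proportion_z(15, 26, 6, 140)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With the multidrug-resistance counts reported for human clinical (15/26) and poultry (6/140) isolates, this gives z of about 7.52, which agrees with the reported z = 7.519 to within rounding.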

RESULTS AND DISCUSSION
A total of 23 unique antimicrobial resistance phenotypes were detected among a collection of 166 S. Kentucky isolates (Supplementary Figure 1). The majority of S. Kentucky isolates were resistant to at least one antimicrobial (114/166, 68.7%), with streptomycin resistance being the most common (107/166, 64.5%). Multidrug resistance (MDR), defined as resistance to ≥3 classes of antimicrobials, was more frequent in isolates from human clinical sources (15/26, 57.7%) than in poultry isolates (6/140, 4.3%) (p < 0.00001, z = 7.519, two-sample z-test). None of the poultry isolates showed resistance to fluoroquinolones (0/140, 0%); however, 16 (61.5%) human clinical isolates were resistant to fluoroquinolones, including ciprofloxacin (13/26, 50%) and/or nalidixic acid (16/26, 61.5%) (p < 0.00001, z = −9.761, two-sample z-test). In contrast, all human clinical isolates were susceptible to ceftiofur (26/26, 100%), whereas 6 (4.3%) poultry isolates showed ceftiofur resistance. In silico identification of resistance genes from a subset of 11 human and 4 poultry whole genome-sequenced isolates correlated with antimicrobial phenotypes (Supplementary Table 1) and confirmed that MDR was predominantly associated with human clinical isolates. These results corroborate a recent study in which fluoroquinolone resistance was reported in ST198 but not in ST152 isolates. Similarly, third-generation cephalosporin (e.g., ceftiofur, ceftriaxone) resistance was reported in ST152 but not in ST198.
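The MDR definition used above (resistance to ≥3 classes of antimicrobials) counts classes rather than individual drugs. A minimal Python sketch follows; the drug-to-class mapping is an illustrative assumption and is not the antimicrobial panel or class scheme used in the study.

```python
# Illustrative mapping only; the study's actual panel and class
# assignments are not given in the text.
DRUG_CLASS = {
    "streptomycin": "aminoglycoside",
    "ciprofloxacin": "quinolone",
    "nalidixic acid": "quinolone",
    "ceftiofur": "cephalosporin",
    "tetracycline": "tetracycline",
    "ampicillin": "penicillin",
}


def is_mdr(resistant_drugs: set[str], min_classes: int = 3) -> bool:
    """Return True if resistance spans at least `min_classes` drug classes."""
    classes = {DRUG_CLASS[drug] for drug in resistant_drugs}
    return len(classes) >= min_classes


# Three drugs but only two classes: not MDR under this definition.
print(is_mdr({"ciprofloxacin", "nalidixic acid", "streptomycin"}))
# Three drugs in three classes: MDR.
print(is_mdr({"ciprofloxacin", "streptomycin", "tetracycline"}))
```

Under this definition, an isolate resistant to ciprofloxacin, nalidixic acid, and streptomycin is resistant to three drugs but only two classes in this illustrative mapping, so it would not be counted as MDR.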
Based on PFGE, S. Kentucky isolates were grouped into two clusters (A and B) at a similarity of ∼60% (Supplementary Figure 1). Cluster-A contained 18 human clinical isolates and was further divided into two subclusters, A1 and A2. All isolates within cluster-A1 were negative by pColV-specific PCR but contained a variant form of SGI-1 (Supplementary Figure 1). SGI-1 is epidemiologically important because it is a mobile genetic element known to confer multidrug resistance to Salmonella (Doublet et al., 2008). Two isolates from within cluster-A1 (PU129 and PU107) were selected for whole-genome sequence-based MLST, which identified both isolates as ST198, indicating that cluster-A1 likely represents FluS S. Kentucky ST198. To the best of our knowledge, FluS S. Kentucky ST198 has only been isolated once from a US chicken source, in 1937 (Edwards, 1938;Le Hello et al., 2011). Recently, however, FluS S. Kentucky ST198 strains harboring variant forms of SGI-1 were sporadically isolated from domestic cattle and turkey sources within the USA (Haley et al., 2016). These data suggest that cluster-A1 likely represents FluS S. Kentucky ST198 strains circulating within domestic poultry or cattle.
TABLE 1 | Footnotes: Cip, Ciprofloxacin; Nal, Nalidixic acid; n/a, not applicable; *Other travelers have missing information for poultry/eggs, did not report "no" to poultry and eggs.
In striking contrast to cluster-A1 isolates, all clinical isolates within cluster-A2 (n = 15) showed an MDR phenotype and consistently exhibited a FluR phenotype (Supplementary Figure 1). Overall, 13/15 isolates (86.7%) within cluster-A2 showed ciprofloxacin resistance and all isolates (15/15, 100%) showed nalidixic acid resistance (Table 1). Whole-genome MLST of representative cluster-A2 isolates (5/15, 33.3%) identified these isolates as ST198 (Supplementary Figure 1). One (6.7%) cluster-A2 isolate (PU117) was identified as pColV positive. All isolates within cluster-A2 contained SGI-1 (Supplementary Figure 1), which is consistent with the well-documented presence of SGI-1 in internationally circulating strains of FluR S. Kentucky ST198 (Doublet et al., 2008;Le Hello et al., 2011). Collectively, these data suggest that FluR S. Kentucky within cluster-A2 likely represents the internationally circulating ST198 strains. Unlike cluster-A, cluster-B contained isolates from both poultry (n = 140) and human (n = 8) sources with variable antimicrobial resistance phenotypes (Supplementary Figure 1). Interestingly, only one (PU67) out of 8 (12.5%) human isolates within cluster-B showed a FluR phenotype (resistance to nalidixic acid) (Table 1). The frequency of the FluR phenotype among human isolates in cluster-B (1/8, 12.5%) was significantly lower (p < 0.0006, z = 3.430, two-sample z-test) than among the human isolates from cluster-A (15/18, 83.3%). None of the cluster-B isolates recovered from US poultry sources showed a FluR phenotype (Supplementary Figure 1). Whole-genome MLST of representative poultry (n = 4) and human (n = 4) isolates within cluster-B revealed that all sequenced isolates were ST152 (Supplementary Figure 1). Moreover, SGI-1 insertion was not detected in any of the isolates within cluster-B, irrespective of the source (poultry or human). These results corroborate previous reports that S. Kentucky isolates recovered from chicken sources in the US largely belong to the ST152 lineage, which does not exhibit a FluR phenotype and lacks SGI-1 (Parveen et al., 2007;Le Hello et al., 2011;Haley et al., 2016;Ladely et al., 2016). Furthermore, pColV was detected in one (12.5%) out of 8 human clinical isolates and 51 (36.4%) out of 140 poultry isolates within cluster-B. Previously, 72.9% of S. Kentucky isolated from US poultry and other domestic animal sources were reported to carry pColV (Johnson et al., 2010). It has been hypothesized that pColV in S. Kentucky isolates is likely linked to its ability to successfully colonize and persist in the poultry population in the US (Johnson et al., 2010). However, given that pColV is not uniformly found across all ST152 poultry isolates, poultry colonization may be multifactorial, with pColV likely enhancing but not solely responsible for successful poultry colonization. Collectively, these data suggest that the SGI-1-negative, FluS S. Kentucky within cluster-B represents the S. Kentucky ST152 lineage circulating within domestic poultry. Why FluR S. Kentucky ST198 is not found in US poultry sources remains an intriguing question. However, these data raise the possibility that exposure to SGI-1-negative, FluS S. Kentucky ST152 isolates in US poultry may provide immunity and protect against SGI-1-positive FluR S. Kentucky ST198 strains. Continued monitoring is needed to ensure that SGI-1-positive S. Kentucky is not introduced into US poultry and other food-producing animals.
WGS-based phylogenetic analysis of the subset of 14 S. Kentucky strains (ST198, n = 6; ST152, n = 8) sequenced in this study showed isolates clustering into four clades, A through D (Figure 1). Clade-A contains all FluS and SGI-1-negative ST152 isolates recovered from human patients from WA State and from US poultry sources (Figure 1) and correlates with PFGE cluster-B (Supplementary Figure 1). Interestingly, epidemiological source tracing showed that only 1 out of 8 (13%) clinical isolates of S. Kentucky within cluster-B originated from human patients with a history of international travel before the onset of illness (Table 1). These data support our inference of the domestic origin of these isolates, with poultry or cattle being likely sources of infection. Clade-B contains FluS S. Kentucky ST198 isolates carrying a variant SGI-1 (Figure 1) and correlates with PFGE cluster-A1 (Supplementary Figure 1). Epidemiological source tracing revealed that all FluS S. Kentucky isolates within PFGE cluster-A1 (n = 3) originated from human clinical cases with no history of international travel (Table 1). These data collectively support our inference of a domestic origin of PFGE cluster-A1 isolates, with domestic poultry or cattle being the likely source of infection, most likely through the consumption of undercooked, or handling of contaminated, poultry products or beef. Interestingly, representative FluR and SGI-1-positive ST198 isolates from PFGE cluster-A2 formed two separate clades, C and D (Figure 1). Clade-C contains 2 nalidixic acid-resistant but ciprofloxacin-susceptible ST198 isolates, whereas clade-D contains 2 ciprofloxacin-resistant ST198 isolates (Figure 1), and these clades correlate with PFGE cluster-A2 (Supplementary Figure 1). These results suggest that the population of FluR S. Kentucky ST198 comprises multiple genetically divergent lineages, which corroborates published reports (Sukhnanand et al., 2005;Timme et al., 2013;Haley et al., 2016;Tasmin et al., 2017;Luhmann et al., 2021). Epidemiologic source tracing revealed that 11 out of 15 (73%) FluR S. Kentucky clinical isolates within PFGE cluster-A2 originated from patients with a history of travel to different international destinations before the onset of illness (Table 1). These data support our inference of the international origin of FluR S. Kentucky isolates.
Overall, 20 (77%) out of 26 S. Kentucky clinical isolates originated from human patients who indicated a history of eating poultry and/or eggs before the onset of illness (Table 1). These included 100% (3/3) of isolates within PFGE cluster-A1, 73% (11/15) of isolates within cluster-A2, and 75% (6/8) of isolates within PFGE cluster-B. Given that S. Kentucky is most frequently isolated from contaminated poultry and poultry products, these data suggest that consumption of contaminated poultry or poultry products, domestically and during international travel, is likely a major risk factor for acquiring S. Kentucky infections. S. Kentucky has also been reported as a contaminant of foods imported into the US, and this route of infection cannot be ruled out in this scenario (Kiessling et al., 2002;Zhao et al., 2006;van Doren et al., 2013).
FIGURE 1 | (A-D) Phylogenetic tree of Salmonella Kentucky ST198 and ST152 strains sequenced in this study. Maximum-likelihood phylogenetic tree based on core-genomic SNPs in S. Kentucky ST198 and ST152 strains mapped to the genome of ST198 reference strain PU131 was generated using PhyML. The tree is rooted in the ST152 cluster. Branch names are labeled by strain number and ST type. Bootstrap values represent the percentage of concurring bootstraps from 100 iterations.
To further confirm source attribution, we expanded the WGS-based phylogenetic analysis of 8 S. Kentucky ST152 isolates sequenced in this study to a larger collection of S. Kentucky isolates, which included a total of 50 S. Kentucky ST152 sequences available from the NCBI SRA global database. Sequences of 58 ST152 isolates were mapped to the genome of S. Kentucky ST152 reference str. SA20030505 isolated from poultry in Canada (Yoshida et al., 2016). S. Kentucky ST152 sequences could be classified into three different clades (Figure 2). Clade-A comprises four ST152 strains isolated from US cattle sources. S. Kentucky ST152 has only been occasionally isolated from domestic cattle sources; thus, the zoonotic transmission of ST152 from US cattle to humans remains unknown (Haley et al., 2016). Clade-B comprises five S. Kentucky ST152 strains isolated from human sources in the United Kingdom. The epidemiology of S. Kentucky ST152 is not well-studied in the UK, and the source of human infection is not currently known. Clade-C is the largest clade, comprising 49 S. Kentucky ST152 strains isolated from US poultry sources with one exception (sequence SRR1840634 from a cattle source in Maryland). Interestingly, clade-C also contains all representative human (n = 4) and poultry (n = 4) ST152 strains isolated in WA State and sequenced in this study (Figure 2), along with 17 other US poultry isolates (15 chicken and 2 turkey) previously described by Haley et al. (2016). These ST152 US poultry isolates and human clinical isolates do not form source-specific clusters. Therefore, it can be reasonably inferred that the cases of human salmonellosis caused by S. Kentucky ST152 in WA State were likely acquired from a domestic poultry source.
FIGURE 2 | Phylogenetic tree of S. Kentucky ST152. Maximum-likelihood phylogenetic tree based on core-genomic SNPs in S. Kentucky ST152 strains mapped to the genome of ST152 reference strain SA20030505 was generated using PhyML. Branch names are labeled by NCBI SRA identifiers; metadata for strains are included in Supplementary Table 1.
As with ST152, we expanded the phylogenetic analysis of 6 S. Kentucky ST198 human strains sequenced in this study to the sequences of 400 isolates downloaded from the NCBI GenBank database. A total of 406 sequences of ST198 were mapped to the ST198 reference strain S. Kentucky str. PU131 (Shah et al., 2018). Unlike S. Kentucky ST152, ST198 appears genetically more diverse, as a greater number of clades and subclades were identified for S. Kentucky ST198 (Supplementary Figure 2), presumably because a large number of the strains sequenced originate from diverse sources globally. To determine the FluR genotype, we screened a subset of sequences from representative clades for mutations in the gyrA and parC genes via ResFinder (Zankari et al., 2012). Quinolone and fluoroquinolone antimicrobials work by inhibiting DNA replication, and resistance in bacteria is mediated by a stepwise mechanism in which two chromosomal mutations develop: a mutation in gyrA (DNA gyrase subunit A) and a mutation in parC (DNA topoisomerase IV subunit A). Both mutations must occur for fluoroquinolone resistance to be present (Hooper, 2001). Given that 4 FluR S. Kentucky ST198 clinical isolates sequenced in this study clustered within four different clades (clade-A, -B, -C, and -D), we focus our discussion on these four clades (Figure 3). These clades comprised sequences derived from international human sources, the majority of which were sequenced by Public Health England (PHE) with no other epidemiologic metadata. Thus, it is currently unknown whether these isolates originated within the UK or were derived from outside of the UK.
Nevertheless, these results further support the international origin (either travel-related or import of contaminated food) for the WA State human clinical isolates, since they cluster with other, non-US, human-origin sequences. Two chromosomal mutations leading to FluR were identified in clades-A, -B, and -C. Interestingly, clade-D, which comprises 2 human clinical isolates sequenced in this study (PU129-SRR5113856 and PU107-SRR5113866) with no known history of international travel (corresponding to PFGE cluster-A1), clustered with isolates of US food animal origin (cattle = 3, turkey = 1) and two human-origin isolates from Asia (China SRR1106466, and Taiwan SRR1106467) (Figure 3). Screening of sequences within clade-D for their resistance gene repertoire revealed that all isolates were FluS. These results suggest that a FluS S. Kentucky ST198 population is circulating in food animals, including cattle and poultry (turkey), in the US and is likely associated with sporadic human illnesses. Moreover, the clustering of FluS S. Kentucky ST198 isolates from Asia within clade-D raises the possibility that the illnesses were either acquired during travel to the US or are related to the importation of contaminated US food products. It is also possible that a genetically related FluS S. Kentucky ST198 lineage is circulating in eastern Asia. Further investigation of larger numbers of such isolates from Asia and the US is needed to trace the epidemiologic source of FluS S. Kentucky ST198.
FIGURE 3 | The maximum-likelihood phylogenetic tree of S. Kentucky ST198 mapped to the genome of ST198 reference strain PU131 was generated using PhyML. Branch names are labeled by NCBI SRA identifiers; metadata for strains can be found in Supplementary Table 1. Bootstrap values represent the percentage of concurring bootstraps from 100 iterations.
Comparative genomics analysis of a large collection of S. Kentucky isolates in this study also confirmed that ST152 and ST198 form two distinct genetic lineages. While others have reported that ST152 and ST198 are genetically distinct (Sukhnanand et al., 2005;Timme et al., 2013;Haley et al., 2016;Tasmin et al., 2017;Luhmann et al., 2021), lineage-specific genetic polymorphisms that clearly distinguish these two lineages have not been well identified and described. Thus, we aimed to identify lineage-specific SNPs that differentiate ST152 and ST198 strains (Table 2). To accomplish this, we mapped all ST152 sequences to the genome of ST198 str. PU131 as a reference and all ST198 sequences to the genome of ST152 str. SA20030505 as a reference (see methods). SNPs identified in the subset populations of ST152 (n = 8) and ST198 (n = 6), as well as the total study populations of ST152 (n = 58) and ST198 (n = 406), are summarized in Table 2. Of particular interest are lineage-specific SNPs that are conserved across the entire population examined; that is to say, SNPs that are identified in 100% of the sequences examined in a specific lineage (ST152 n = 58, ST198 n = 406; Supplementary Table 3). These conserved SNPs differentiate ST152 and ST198 into two unique lineages. In both lineages, more conserved SNPs were identified within coding regions than within non-coding regions (Table 2). The majority of the conserved SNPs within the large populations examined were identified within the coding regions of ST152 (n = 26,414) or ST198 (n = 4,640) and resulted in synonymous substitutions with no predictable protein effects.
A smaller number of conserved SNPs were identified as non-synonymous substitutions with predicted protein effects (e.g., amino acid substitutions, protein truncation, protein extension, and start codon loss). Interestingly, a greater number of non-synonymous substitutions (n = 4,543) distinguished the ST152 population from the ST198 reference genome, whereas fewer non-synonymous substitutions (n = 1,560) distinguished the ST198 population from the ST152 reference genome (Supplementary Table 4). Although these SNPs distinguished ST198 from ST152 populations, it is important to recognize that the identities of the proteins encoded by the genes in which these SNPs occur are shared between ST198 and ST152. In addition, some conserved SNPs were found in mobile genetic elements such as transposases. These mobile genetic elements were not excluded from the core genome analysis because they were found in 100% of sequences within a given S. Kentucky lineage. Of particular interest may be the conserved non-synonymous SNPs, i.e., the SNPs that not only distinguish ST152 from ST198 but may also impact the function of the corresponding protein. These include high-impact non-synonymous SNPs resulting in loss of a start codon, extension, or truncation of the encoded protein. Therefore, the SNPs that result in non-synonymous substitutions, including the high-impact SNPs, are likely good targets for further investigation to determine what role they play in differential virulence, host adaptation, or metabolic adaptation of ST152 and ST198 lineages in poultry, cattle, and/or human hosts (Supplementary Table 4).
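The two ideas in this analysis, SNPs conserved in 100% of the sequences of a lineage and their classification by predicted protein effect, can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the Geneious Prime workflow used in the study: each isolate is represented by a hypothetical set of (position, alternate base) pairs relative to the reference, codons are assumed to be supplied in frame, and start codon loss is not handled because it requires gene coordinates. The amino acid assignments are those of the standard code, which the bacterial genetic code (NCBI translation table 11) shares.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard assignments,
# shared by the bacterial code, NCBI translation table 11).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def conserved_snps(snps_by_isolate):
    """Return SNPs present in every isolate of a lineage.

    `snps_by_isolate` maps a hypothetical isolate ID to a set of
    (position, alt_base) tuples called against the same reference.
    """
    if not snps_by_isolate:
        return set()
    return set.intersection(*snps_by_isolate.values())


def classify_codon_change(ref_codon, alt_codon):
    """Classify the predicted protein effect of an in-frame codon change."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "truncation"    # new stop codon
    if ref_aa == "*":
        return "extension"     # stop codon lost
    return "substitution"      # amino acid change


# Toy example with two hypothetical isolates.
calls = {
    "isolate_1": {(10, "G"), (20, "T")},
    "isolate_2": {(10, "G")},
}
print(conserved_snps(calls))
print(classify_codon_change("TGG", "TGA"))
```

In the toy example only the SNP shared by both isolates is retained as conserved, and the TGG to TGA change is classified as a truncation because it introduces a stop codon.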

CONCLUDING REMARKS
Based on comparative phenotypic, genotypic, and epidemiologic analysis, this study demonstrates that human FluR S. Kentucky ST198 infections in WA State reported during 2004-2014 were linked to international travel. In contrast, human FluS S. Kentucky ST198 infections in WA State during the same period could be linked to domestic food products (cattle or turkey origin) or other unknown sources; sequencing of a larger collection of these strains will be required to determine the specific sources for these strains. Similarly, human FluS S. Kentucky ST152 infections in WA State are likely linked to domestic poultry sources. The increasing incidence of FluR S. Kentucky ST198 in the US and globally raises intriguing questions about whether this lineage is undergoing metabolic or virulence adaptation in a specific niche. Lineage-specific non-synonymous SNPs identified in ST198 and ST152 may serve as good targets for further investigations on variation in virulence and metabolic adaptation to different environments, and for the development of intervention strategies to improve the safety of food. Furthermore, these SNPs can also serve as targets for epidemiologic monitoring and source tracing of these lineages domestically and internationally.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.