Comparative Genomics of Staphylococcus Reveals Determinants of Speciation and Diversification of Antimicrobial Defense

The bacterial genus Staphylococcus comprises diverse species with most being described as colonizers of human and animal skin. A relational analysis of features that discriminate its species and contribute to niche adaptation and survival remains to be fully described. In this study, an interspecies, whole-genome comparative analysis of 21 Staphylococcus species was performed based on their orthologues. Three well-defined multi-species groups were identified: group A (including aureus/epidermidis); group B (including saprophyticus/xylosus) and group C (including pseudintermedius/delphini). The machine learning algorithm Random Forest was applied to prioritize orthologs that drive formation of the Staphylococcus species groups A-C. Orthologues driving staphylococcal intrageneric diversity comprised regulatory, metabolic and antimicrobial resistance proteins. Notably, the BraSR (NsaRS) two-component system (TCS) and its associated BraDE transporters that regulate antimicrobial resistance showed limited distribution in the genus and their presence was most closely associated with a subset of Staphylococcus species dominated by those that colonize human skin. Divergence of BraSR and GraSR antimicrobial peptide survival TCS and their associated transporters was observed across the staphylococci, likely reflecting niche specific evolution of these TCS/transporters and their specificities for AMPs. Experimental evolution, with selection for resistance to the lantibiotic nisin, revealed multiple routes to resistance and differences in the selection outcomes of the BraSR-positive species S. hominis and S. aureus. Selection supported a role for GraSR in nisin survival responses of the BraSR-negative species S. saprophyticus. Our study reveals diversification of antimicrobial-sensing TCS across the staphylococci and hints at differential relationships between GraSR and BraSR in those species positive for both TCS.


Staphylococcus Species and Genomics
The existence of taxonomically distinct species groups was first proposed for Staphylococcus based on differential DNA-DNA hybridization methods . These groups were supported by 16S rDNA sequence analysis of 38 taxa (Takahashi et al., 1999) and multilocus sequence data of around 60 species and subspecies (Lamers et al., 2012).
A comparative analysis that utilized next generation genome sequencing data of staphylococci to probe phylogenetic relationships with 491 shared orthologues across 12 Staphylococcus species (Suzuki et al., 2012) proposed S. pseudintermedius and S. carnosus as the most basal lineages. Moreover, with 10 species in their analysis being residents of human skin, the authors proposed that evolution selected for human adaptation after branching from S. carnosus. The relationships between the strains generated from shared orthologues were maintained using total gene content (Suzuki et al., 2012). However, in contrast to the conclusions of 16S rDNA and multilocus data (Takahashi et al., 1999;Lamers et al., 2012) their analysis revealed discrete clustering of Staphylococcus species. In contrast with this analysis, no distinct clustering of S. hominis with S. haemolyticus was observed, and S. saprophyticus was assigned to the S. epidermidis group of species (Suzuki et al., 2012). Currently, there is a knowledge gap in Staphylococcus species comparisons with a need to determine if this clustering of staphylococcal species is supported using whole genome data. Our findings here begin to close this gap.

Two Component Systems
Prokaryotes are receptive to environmental stimuli through diverse sensory and transducing two component systems (TCS). These TCS archetypically comprise a sensor histidine kinase (HK) that spans the cell membrane to interact with the external environment. Stimulus perception causes conditional autophosphorylation that is relayed to an interacting response regulator (RR) to enable DNA-binding directed transcription modulation (Stock et al., 2000).
While TCS are widespread and diverse across prokaryotes, the intramembrane-sensing histidine kinases (IM-HK) are specific to the Firmicutes. This family of small HKs has a short, 25 amino acid linker region between each 400 amino acid transmembrane helix. S. aureus GraSR uses a IM-HK to regulate a global network responsible for resistance to antimicrobial peptides (AMPs). GraSR modulates the expression of DltABCD and MprF that in concert alter the S. aureus surface charge to evade electrostatic interaction-mediated targeting of cationic AMPs .
An orthologous TCS to GraSR described in S. aureus was concurrently designated BraSR and NsaSR by two different groups (Blake et al., 2011;Hiron et al., 2011). Serial passage in sub-MIC concentrations of the lantibiotic nisin was shown to select increased nisin MIC due to a SNP in nsaS gene encoding sensor histidine kinase of NsaRS (nisin susceptibilityassociated sensor regulator) (Blake et al., 2011). The TCS was separately designated BraSR (bacitracin resistance-associated sensor regulator) from the reduced MIC of bacitracin and nisin determined for the TCS gene mutant . BraR binding sites were revealed upstream of the ABC transporter genes braDE and vraDEH (Popella et al., 2016) that were not transcribed in the mutant but induced in the presence of bacitracin. The transporter BraDE contributes to the detection of nisin and bacitracin and subsequent signal transduction via BraSR, whereas VraDE is more directly involved in detoxification by efflux . Transcription of braSR is increased following exposure to multiple antibiotics, including ampicillin, phosphomycin and nisin. Inactivation of braS (nsaS) revealed differential transcription of 245 genes (Kolar et al., 2011), revealing the TCS might report cell envelope stress to directly regulate biofilm formation, cellular transport and responses to anoxia.
In this study, a comparative genome analysis of 21 Staphylococcus species was performed based upon their orthologous gene content. Species groups were revealed and then interrogated using the Random Forest algorithm to identify group-contributing genes. The operon encoding the BraSR TCS was found to differentiate the S. aureus/S. epidermidis species group from other species groups determined in the study and the TCS was found to have restricted distribution across 49 species of Staphylococcus. Experimental evolution of representative braSR-positive and -negative species with nisin selection identified differential selection of BraSR and GraSR to produce resistance to this AMP.

Analysis of Orthologous Gene Content Across the Staphylococci
The orthologous gene content of 21 sequenced staphylococcal species' genomes ( Table 1) was determined using OrthoMCL to group orthologous genes (homologues separated by speciation) into clusters across the different species. The number of shared orthologous clusters between the different species' genomes was then represented as a heatmap (Figure 1). The output from this analysis revealed the assembly of three major groups of species, each with high numbers of shared orthologous clusters. An associated cladogram supported three groups (groups A, B, and C) when defined as containing three or more species (Figure 1). This supported previous reported groupings from 16S rDNA and multilocus analyses (Takahashi et al., 1999;Lamers et al., 2012). Additionally, three species pairs showed a high degree of shared orthologous clusters of genes and branched together in the cladogram: S. aureus/S. simiae, S. simulans/S. carnosus, and S. lentus/S. vitulinus. S. aureus and S. simiae were proposed as members of the S. aureus group of staphylococci from gene content (Takahashi et al., 1999).
The largest and least well-defined, species group comprises S. epidermidis, S. capitis, S. warneri, S. haemolyticus, S. hominis, S. lugdunensis, S. pettenkoferi, S. aureus, and S. simiae (Figure 1). Designated group A, it is dominated by species that colonize human skin (Kloos, 1980;Schleifer and Bell, 2015). The Genomes were sequenced for this study, as indicated, or retrieved from NCBI for analysis; genome integrity is indicated.
likelihood of a strain-dependent effect structuring group A was investigated by substituting S. epidermidis, S. hominis, and S. aureus strains based on multiple available genomes (Table 1 and Supplementary File S1). Substituting these individual species with alternative strains and repeating the OrthoMCL analysis did not alter species groupings. Groups B and C were similarly unaffected by switching strains of S. saprophyticus and S. pseudintermedius, respectively. The smaller species group B comprises S. equorum, S. arlettae, S. cohnii, S. saprophyticus, and S. xylosus (Figure 1). Though not universal, a frequent lifestyle identified in the group B species is human or animal host colonization; several species are associated with meat products and novobiocin resistance (Devriese et al., 1985;Bannerman et al., 1991) with commonalities in their cell wall composition (Schleifer et al., 1984).
Species group C comprises S. pseudintermedius, S. delphini, and S. intermedius and this collective was previously designated the S. intermedius group (SIG); the species cause opportunistic infection of companion animals and equids (Devriese et al., 1985). Emerging antibiotic resistance in the SIG species group is a clinical veterinary concern (Stull et al., 2014) and their routine speciation is complicated by their high degree of 16S rRNA locus sequence identity (Slettemeås et al., 2010).
While preserving known species groupings, the whole genome analysis identified discrete species groups of staphylococci (A-C) and an explanation for their formation was sought. Genetic determinants directing the formation of species group A were tested in R using machine learning with the Random Forests algorithm for classification (Breiman, 2001). This algorithm was used to identify variables, in this case OrthoMCL clusters (data not shown), that contributed to formation of the groups, based on a forest of trees generated from these variables. A gene from each cluster was then determined and mapped back to a representative genome and the PROKKA annotation of each protein coding sequence was verified using BLAST, for group A this representative was S. epidermidis. Contributing variables were investigated for group A, based on the strain set described in Table 1, where permutations were used to verify the existence and reproducibility of species groups (Supplementary File S1).

Clusters Driving Formation of Group a Species
The presence of 7 and absence of 6 OrthoMCL clusters collectively contribute to defining group A, with differing levels of support (Mean Decrease in Accuracy [MDA] values) ( Table 2 and Supplementary File S2). Four orthologs that are sequentially encoded in the genome as an operon (epi_02134 -epi_02137; MDA 3.2, 3.0, 2.6, 2.2, respectively) were also the most strongly supported in this analysis ( Table 2). The latter cluster pair epi_02136/epi_02137 was annotated by PROKKA as a TCS sensor/regulator ( Table 2 and Supplementary File S2) and shares ∼100% similarity with BraSR (SA2417/SA2418 of S. aureus N315), a TCS associated with resistance to AMPs nisin and bacitracin . The adjacent clusters encoded in the same operon (epi_02134, epi_02135) comprise the BraD/BraE ABC transporter subunits with 98 and 99% similarity with SA2415/SA2416 of S. aureus N315, respectively . We demonstrate as a key finding of our analysis that BraSR and BraDE FIGURE 1 | Heat map representation of shared orthologous proteins across Staphylococcus species. Presence is indicated using a color scale from red (highest number of shared clusters of orthologous proteins) to white (lowest number). Major groups of species observed in the analysis are highlighted as groups A-C.
are associated with genomes of group A Staphylococcus species.
The presence of orthologue epi_00542 (MDA 2.2; Table 2 and Supplementary File S2) contributes to species group A, with support that the protein functions as a putative cell wall hydrolase from the Nlp-P60 family hydrolase domain that is associated with hydrolysis of peptidoglycan. Also, contributing to defining group A are the absences of two orthologue clusters (sap_00398; MDA 3.3 and sap_00399; MDA 1.4; Table 2 and Supplementary File S2) that are annotated as multidrug ABC transporters. A range of cytotoxic molecules are mobilized across the cell membrane by multidrug ABC transporters where certain families of these can also act as sensors (Stock et al., 2000;Rietkötter et al., 2008). Across staphylococcal groups, differential repertoires of ABC transporters associated with antimicrobial survival are consistent with the importance of community competition in species evolution.
Sequence variation of the NADP-dependent succinate semialdehyde dehydrogenase (SSADH) between group A staphylococci versus groups B and C was identified by the association of cluster sap_00201 (MDA 2.9, Table 2 and Supplementary File S2) with group A species; this variation might be allied to differences in glutamate metabolism across the genus. Glutamate is involved in multiple metabolic processes and bacterial glutamate dehydrogenase catabolizes glutamate, which contributes to acid tolerance. NADP-SSADH catalyzes catabolism of γ-aminobutyrate, a product of glutamate dehydrogenase activity (Feehily and Karatzas, 2013); this pathway is oxidative stress sensitive owing to the catalytic cysteine residue of SSADH.  With respect to clusters driving formation of group B and C species, the size of species input groups B and C (Figure 1) limit use of the random forest algorithm. Consequently, a similar species-defined analysis of groups B and C is not included and a broader species comparison of staphylococci could be considered in future.

Diversity of Cationic AMP Survival Loci Across the Staphylococci
The described comparative genomic analysis revealed that while BraSR TCS is associated with group A species of staphylococci, the GraSR TCS is distributed across all species groups. Supporting predictions from the Random Forest analysis, low sequence identity of BraR/BraS with GraR/GraS was confirmed. BraR mean sequence identity with GraR of group A (44%) and group B/C species (40%) was greater than that of BraS compared with GraS of group A and groups B/C (mean ∼30% and ∼26%, respectively) ( Table 3).
High mean sequence identity (84-98%) of GraR regulator protein occurs within each of the three species groups ( Table 3) with divergence of GraR between species groups identified by lower mean sequence identity (67%). GraS sensor histidine kinase was less conserved within species groups A (mean 69%) and B (mean 66%), compared with GraS of species group C that shared greatest mean sequence identity (88%), albeit that group C defined here is a small, related species set. Both BraS and GraS sensor proteins have lower sequence conservation across staphylococci than BraR and GraR ( Table 3). The reduced divergence of these response regulators might reflect their relative isolation from selection by the external environment and differential stimuli.
Responses to cationic AMPs in the staphylococci are complex (Peschel and Collins, 2001;Herbert et al., 2007) and ligand specificity could account for species divergence of GraSR and BraSR TCS. This evolutionary outcome could be explained with strong selection pressure driven by ubiquity and diversity of cAMPs in staphylococcal niches. One intrigue in our analysis is the absence of GraSR and presence of only BraSR TCS in the group A species, S. pettenkoferi, with the sole related sensor protein having a mean sequence identity of 27% with group A GraS but 58% with group A BraS. S. pettenkoferi BraR has a mean sequence identity of 47% with GraR and 73% with BraR from group A. These values support the S. pettenkoferi TCS is a BraSR ortholog. GraSR was also absent and BraSR present in the four additional publicly available S. pettenkoferi genome sequences (strains 1286_SHAE, 589_SHAE, UMB0834 and CCUG 51270). The absence of GraSR in S. pettenkoferi raises questions about the evolution of BraSR in group A staphylococci. Gene duplication of GraSR in a group A species, with subsequent sequence divergence over time to BraSR and spread throughout group A species by horizontal gene transfer, is tempting to suggest. S. pettenkoferi having BraSR but not GraSR presents a challenge to this paralog hypothesis. We propose two possibilities; S. pettenkoferi may have suffered deletion of graSR following acquisition of braSR, or S. pettenkoferi never acquired braSR, but rather its TCS evolved from ancestral genes. Such a scenario would enable group A organisms to acquire braSR from S. pettenkoferi as an additional and sufficiently divergent TCS locus.
Staphylococcus species genomes sequenced recently were investigated for their encoded GraS and BraS protein homologues, which supported the limited distribution of BraS in staphylococci as identified in the Random Forest analysis ( Table 4 and Supplementary File S3). Furthermore, it revealed additional species encoding BraSR but not GraSR (S. agnetis, S. auricularis, S. chromogenes, S. hyicus, S. massiliensis). Regardless of the origins of both TCSs, the divergence between and within GraSR and BraSR likely reflect specificities for their ligands and selection driven by the niches to which the staphylococci are specialized.

GraSR and BraSR-Associated ABC Transporters
Both GraSR and BraSR, as members of the BceS-like IM-HK family of TCS, are activated by AMP ligand bound to an associated ABC transporter (Mascher, 2014). Given the important function of these TCS, the conservation of their associated transporter protein sequences was compared across the staphylococci.
VraFG is the GraSR-associated ABC transporter (Meehl et al., 2007) and in the genomes encoding VraFG (absent from group B species and S. pettenkoferi) there is a high degree of shared protein sequence conservation. VraF has a mean sequence identity of 68% across the staphylococci examined (Table 1), with greatest conservation within species groups (group A, 79% identity; group B, 85.3% identity; group C, 96.8% identity). Shared sequence identity among the VraG proteins was 47.5%, with 88%, 65.2 and 61.9% identity within groups A, B, and C, respectively. The BraDE ABC transporter associated with BraSR Previous studies demonstrated that selection by experimental evolution identified mutations conferring antimicrobial resistance in overarching regulators, notably SNPs in braS revealed roles for BraSR in nisin sensing and survival . Following our identified species association of BraSR to group A staphylococci, we adopted an experimental evolution strategy to interrogate the contributions of GraSR and BraSR TCS under selection for nisin resistance. Strains of group A species, S. aureus and S. hominis plus group B S. saprophyticus were each serially passaged in triplicate cultures with increasing concentrations of nisin using a microtiter plate method, with an equivalent sodium citrate buffer control passaged in parallel. Stepwise increases in nisin MIC were observed for all strains tested with no obvious pattern in the rate of resistance acquisition between the species. After selection, both S. aureus 171 and S. aureus SH1000 strains exhibited ∼100-fold increases in nisin MIC, a greater fold increase in resistance than that observed by Blake et al (2011;Hiron et al., 2011), which may be due to experimental design differences. Selection of both S. hominis strains increased nisin MIC ∼25-fold, and S. saprophyticus strains CCM_883 and CCM_349 showed 80fold and 5-fold increases, respectively. Multiple clones of S. aureus 171, S. hominis J31 and S. saprophyticus CCM883 were genome sequenced to identify sequence variants that potentially contributed to increased nisin MIC. T0 genomes were assembled and annotated, then reads from three pools (each comprising 5 independent clones) and one individual clone of each experimentally evolved species were aligned to their respective assembled genomes to identify sequence variants (SNPs, insertions/deletions) specific to nisin selection (Tables 5-7).

Nisin-Selected SNPs in Staphylococci
Experimental evolution of S. saprophyticus identified a SNP in graS (GraS: A 160 S; Table 5) that was present in two clone pools, and SNP graS G 209 C in a third pool. A single clone sequenced from the latter pool identified only one SNP in graS (GraS: G 209 C) and an upstream variant associated with ptsG ( Table 5). These data provide support for GraSR contributing to nisin resistance in S. saprophyticus given the absence of the BraSR TCS in this group B Staphylococcus species. Aside from TCS, other regulators may contribute to the nisin response in S. saprophyticus as evidenced by an identical SNP identified in two separate nisin resistance selections (pools 2 and 3) corresponding to a T 62 I change in an uncharacterized MarR transcriptional repressor.
In both S. aureus and S. hominis there are multiple pathways to high-level nisin resistance. Each species revealed SNPs in TCS systems, but these differed across the parallel selection experiments (Tables 5-7). In S. aureus, a non-synonymous SNP in braS (BraS: T 175 I) was present in 100% of reads from one sequenced pool, differing from previous work that identified a discrete braS SNP (BraS: A 208 E) . Evidence for a second TCS contributing to nisin resistance arose from a walK non-synonymous SNP (WalK: H 364 R) within the diverse and flexible signal sensing PAS domain of WalK in S. aureus (Hefti et al., 2004). WalKR is essential and functions to maintain cell wall metabolism (Delaune et al., 2012) and SNPs in this TCS contribute to vancomycin and daptomycin resistance due to cell wall-thickening (Howden et al., 2011). Should this cell wall phenotype be associated with the H 364 R WalK variant it could similarly limit nisin interaction with its lipid II target to abrogate pore formation. A large overlap was reported between the WalKR and GraSR regulatory networks in S. aureus .
In S. hominis, a graS SNP (GraS: S 120 L) was present in 2 clones of sequence pool 2 and no SNPs or other sequence variants were identified in braSR (Tables 5-7). S. hominis has both braSR and graSR loci and therefore it is intriguing nisin resistance selection resulted in SNPs in a different TCS to S. aureus despite encoding both, potentially reflecting differences in their contribution across group A staphylococci. A further transcriptional regulator might contribute to nisin resistance in both S. aureus and S. hominis, where the uncharacterized yhcF revealed SNPs producing G 73 R and N 47 * , respectively; the presence of SNPs in yhcF of both species supports a role for this regulator. The YhcF transcriptional regulator proteins of S. aureus and S. hominis have 75% similarity and their cognate genes are adjacent to an ABC transporter locus with potential specificity for GlcNAc, which might catalyze recycling of cell wall substrates from nisin damage. The role of this operon is currently being investigated.
In summary, we have identified differential encoding and diversity of antimicrobial resistance regulators and their associated transporters across the staphylococci. Our previous studies of the nasal microbiome correlated cumulative antimicrobial production with community structure, limitation of invasion and S. aureus exclusion Libberton et al., 2014Libberton et al., , 2015. Further dissection of antimicrobial sensing and discrimination via the TCS systems BraSR and GraSR combined with analysis of their associated transport specificities will provide information that can be layered with niche-relevant antimicrobial activities from competing species. Such analyses are now emerging and will provide a more holistic determination of Staphylococcus ecology.

Staphylococcus Orthologous Gene Content
Representative genomes of 21 different Staphylococcus species available at the time of analysis (Table 1) were either sequenced (see later section) or retrieved from the NCBI FTP repository 1 . Complete genomes were used where possible. Draft genomes available as NCBI scaffolds were reordered against an appropriate reference using a bespoke perl script. Genomes were annotated using PROKKA (version 1.5.2) (Seemann, 2014) to ensure consistent gene calling and annotation. OrthoMCL (version 1.4) was used to cluster orthologous proteins , with input parameters, e-value cut-off: 1e-5, percentage identity cut-off: 30, percentage match cut off: 20. Briefly, initial BLAST steps of orthoMCL used the latter two low stringency cut-off values; these values were used to retain more proteins for clustering from these BLAST stages. Inparalog, ortholog and co-ortholog pairwise relationships were generated through reciprocal best and better hits in subsequent stages that used the p-value cut-off of 1e-5. Finally, the MCL (Markov clustering) aspect of the tool was applied to these pairwise relationships to allow clustering into orthologous Names and functions of genes containing SNPs in the sequenced clone pools are shown with their locations and nucleotide change. Cognate amino acid change or stop ( * ) is indicated. Pools comprised 5 clones from each of three independent experiments and allele frequencies were determined from numbers of corresponding reads in these pools. Nisin MICs of clones in each pool were confirmed to ensure they were similar.
groups Fischer et al., 2011). A bespoke python script was used to create a table describing the presence or absence of each OrthoMCL cluster within every genome. These data were converted to a matrix for analysis in the statistical package R and a heatmap was generated from the matrix. To control for gross strain-specific effects on the heat map (and thus OrthoMCL clusters), this step was repeated by substituting with alternative strains (Supplementary Table S1) and all permutations were analyzed in subsequent steps of the analysis.

Drivers of OrthoMCL Group Formation
The R library, Random Forest (version 4.6-7) (Liaw and Wiener, 2003) was used to investigate the genetic inputs directing classification of the species into their OrthoMCL groups. A presence/absence table of each of the orthologous groups obtained from the United States300 permutation of the OrthoMCL analysis was generated using a bespoke python script and used as the input data for the Random Forest algorithm. The data was split into a test and training data set with both sets including equal proportions of group A species. The optimum value for mtry was found to be 66 using the tuneRF function (ntree = 1001, stepFactor = 1.5, improve = 0.001). These mtry and ntree parameters resulted in a model with an out of bag (OOB) error rate of 9.09% and area under ROC curve (AUC) of 0.96. Data output was summarized using the variable importance plot function and the numeric mean decrease in accuracy (MDA) resulting from the permutation of each variable was obtained through the importance function; these data were used as the measure of the importance of each variable. The maximum MDA in this analysis was 3.3. Clusters were mapped back to the genome and the annotation of protein sequence for a species representative of each cluster was retrieved. Protein sequences of clusters identified as important were retrieved and their annotations curated and verified against published annotations. In addition, outputs were generated by substituting strains of species in the analysis to compare conservation of identified clusters between the variable importance plots. Sequences of protein clusters from the single species representative in Table 2 and identified by Random Forest output are listed in Supplementary Files S2, S3. Protein sequences were retrieved from their respective genomes and alignments were performed using ClustalW2 (version 2.1).

Minimum Inhibitory Concentration Assay
Nisin (Sigma-Aldrich Company Ltd, United Kingdom) was prepared as a 20 mg mL −1 solution in 10 mM sodium citrate (Sigma-Aldrich Company Ltd, United Kingdom) at pH 3 and stored at 4 • C. MIC assay used microtiter plates with doubling dilutions of nisin in BHI (Thermo Scientific) inoculated 1 in 2  with 100 µL bacterial suspension adjusted to OD 600 0.2 ± 0.005. The lowest concentration with an optical density ≤ to that of the initial optical density was taken as the minimum inhibitory concentration (MIC).

Selection for Nisin Resistance
Experimental evolution was performed by serial passage in broth containing doubling dilutions of nisin in triplicate wells of a microtiter plate. For selection of S. aureus and S. saprophyticus, the maximal assay concentration of nisin was 5 mg mL −1 and for S. hominis 50 µg mL −1 . Control selection experiments with equivalent sodium citrate concentrations were performed in parallel. Experiments were initiated with inoculation of bacteria to OD 600 = 0.2 for the first passage and plates were incubated static at 37 • C. Bacteria growing at the highest concentration of nisin after 24-48 h were passaged forward to the next plate; subsequent passages were inoculated with a 1:1000 dilution of culture. Serial passage was continued until growth occurred at the maximal nisin concentration (for strains S. saprophyticus = 10 mg ml −1 , S. aureus = 10 mg ml −1 , and S. hominis = 250 µg ml −1 ) or for a period of 12 days. All passaged cultures were collected and stored at −80 • C in 20% (v/v) glycerol (Fisher Scientific) after each passage and the T 0 time point served as comparator strain. Colonies were randomly selected for sequencing after plating from independent biological replicate cultures that had reached an equivalent maximum level of nisin resistance. Clones from each repeat were selected and cultured in 10 mL of BHI at 37 • C with shaking at 200 rpm overnight. Increased MICs were confirmed by using the MIC assay described above at the highest nisin concentrations. Selection was performed for a corresponding citrate control time point for each of the three species.

DNA Extraction, Library Preparation and Sequencing
Cells were harvested from overnight culture and lysed in buffer containing 12.5 µg ml −1 lysostaphin (Sigma-Aldrich) and 10 U mutanolysin (Sigma-Aldrich). DNA was purified using a DNeasy Blood and Tissue Kit (Qiagen). DNA (30 ng) from each of five selected clones was pooled to make Illumina Truseq DNA libraries with an insert size of 350 bp. In addition to three separate clone pools, a single clone was selected for sequencing from the clones used to constitute the pools. Single clones were selected on the basis of the highest DNA quality. The single clones and the T 0 isolates were also sequenced using Illumina Truseq nano DNA libraries with 350 bp inserts.

Identification of SNPs and INDELS
T 0 comparator strains were assembled using VelvetOptimiser (version 2.2.5; Victoria Bioinformatics Consortium) with Kmer sizes from 19 to 99 and Velvet version 1.2.06 (Zerbino and Birney, 2008). Annotation was carried out using PROKKA version 1.5.2 (Seemann, 2014). The PacBio assembly of S. hominis strain J31 (Accession FBVO01000000) (Coates-Brown and Horsburgh, 2017) was used as the comparator assembly for this strain. Good quality filtered reads from experimentally evolved pools and single clones were aligned to respective comparator strains using the BWA (version 0.5.9-r16) (Li and Durbin, 2009) packages aln and sampe, and also using BWA (version 0.7.5a-r405) mem package. SAM files were converted to bcf (binary variant call) files with samtools for SNP calling using the mpileup package. The bcf output file from mpileup was then converted to vcf (variant call format) files and quality filtered. For SNPs, only this quality filtered vcf file from the pooled clones, along with mpileup output without base data, were used to further filter the SNPs to include only those present in 33.33% of reads, which equates to the SNP being present in more than one clone. To reduce falsely called SNPs, SNPs not called from both alignments (from either BWA aln and sampe or BWA mem) were removed from the data set, as recommended by Li (Li, 2014). SNPs called in the control data and evolved isolates were filtered from the data.

AVAILABILITY OF DATA AND MATERIALS
Genomes resulting from this work can be retrieved from the ENA database at EMBL-EBI (https://www.ebi.ac.uk/ena/data/view) under the bioproject accession PRJEB22856, including data from experimental evolution of S. aureus 171; Parental S. aureus 171 data accession: LT963437. Individual genome assembly accessions used in Figure 1 are listed in Table 1 and Supplementary File S1. Strains not already publicly archived are available on request. This manuscript was submitted to bioRxiv ahead of review (Coates-Brown et al., 2018).

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

FUNDING
RC-B was funded by BBSRC training grant BB/J500768/1 awarded to MH with support from Unilever PLC. JM was funded by BBSRC research grant BB/L023040/1 awarded to MH with support from Unilever PLC. The funders were not involved in the study design, collection of samples, analysis of data, interpretation of data, the writing of this report or the decision to submit this report for publication.