A Population-Based Descriptive Atlas of Invasive Pneumococcal Strains Recovered Within the U.S. During 2015–2016

Invasive pneumococcal disease (IPD) has greatly decreased since implementation in the U.S. of the 7-valent conjugate vaccine (PCV7) in 2000 and 13-valent conjugate vaccine (PCV13) in 2010. We used whole genome sequencing (WGS) to predict phenotypic traits (serotypes, antimicrobial phenotypes, and pilus determinants) and determine multilocus genotypes from 5334 isolates (~90% of cases) recovered during 2015–2016 through Active Bacterial Core surveillance. We identified 44 serotypes; 26 accounted for 98% of the isolates. PCV13 serotypes (inclusive of serotype 6C) accounted for 1503 (28.2%) isolates, with serotype 3 most common (657/5334, 12.3%), while serotypes 1 and 5 were undetected. Of 305 isolates from children <5 yrs, 60 (19.7%) were of PCV13 serotypes 19A, 19F, 3, 6B, and 23F (58/60 were 19A, 19F, or 3). We quantitated MLST-based lineages first detected during the post-PCV era (since 2002) that potentially arose through serotype-switching. The 7 predominant emergent post-PCV strain complexes included 23B/CC338, 15BC/CC3280, 19A/CC244, 4/CC439, 15A/CC156, 35B/CC156, and 15BC/CC156. These strains accounted for 332 isolates (6.2% of total) and were more frequently observed in children <5 yrs (17.7%; 54/305). Fifty-seven categories of recently emerged (in the post-PCV7 period) putative serotype-switch variants were identified, accounting for 402 isolates. Many of these putative switch variants represented newly emerged resistant strains. Penicillin-nonsusceptibility (MICs ≥ 0.12 μg/ml) was found among 22.4% (1193/5334) of isolates, with higher penicillin MICs (2–8 μg/ml) found in 8.0% (425/5334) of isolates that were primarily (372/425, 87.5%) serotypes 35B and 19A. Most (792/1193, 66.4%) penicillin-nonsusceptible isolates were macrolide-resistant, 410 (34.4%) of which were erm gene positive and clindamycin-resistant.
The proportion of macrolide-resistant isolates increased with increasing penicillin MICs; even isolates with reduced penicillin susceptibility (MIC = 0.06 μg/ml) were much more likely to be macrolide-resistant than basally penicillin-susceptible isolates (MIC < 0.03 μg/ml). The contribution of recombination to strain diversification was assessed through quantitating 35B/CC558-specific bioinformatic pipeline features among non-CC558 CCs and determining the sizes of gene replacements. Although IPD has decreased greatly and stabilized in the post-PCV13 era, the species continually generates recombinants that adapt to selective pressures exerted by vaccines and antimicrobials. These data serve as a baseline for monitoring future changes within each invasive serotype.


INTRODUCTION
The two major emphases of IPD strain surveillance, identification of serotype distributions and antimicrobial resistance phenotypes, have not changed over several decades. The distributions of these two basic pneumococcal strain features inform strategic formulation of next-generation vaccines, evaluation of current vaccines, and establishment of appropriate antibiotic usage for clinical cases. Pneumococcal conjugate vaccines (PCVs) are very effective in reducing incidence of PCV-type IPD (Pilishvili et al., 2010; Moore et al., 2015) and pneumonia in children (Nelson et al., 2008; Olarte et al., 2017a), and widespread use in children has substantially reduced rates of IPD and pneumonia among adults (Griffin et al., 2013; Simonsen et al., 2014). PCVs prevent acquisition of vaccine-type pneumococcal carriage in children, which serves to reduce transmission to other children and adults. Pneumococcal conjugate vaccines have not only greatly reduced IPD, but have preferentially targeted antimicrobial-resistant strains. The introductions of PCV7 (in 2000) and PCV13 (in 2010) both served to dramatically decrease IPD caused by antibiotic-resistant strains in all ages, particularly strains resistant to penicillins (Kyaw et al., 2006; Tomczyk et al., 2016).
Neither of the past two PCVs had entirely predictable effects. While PCV7 greatly decreased overall disease through reduction of PCV7-type disease, the dramatic emergence of the highly resistant 19A/ST320 was an unexpected negative event. Besides impacting the emergence of pre-existing uncommon strains (for example 19A/ST320 before PCV7 implementation), circumstantial data indicate that serotype-switch strains arising through recombinational replacement of the cps locus can be amplified through PCV selective pressure. For example, the 19A/ST695 variant resulting from a PCV7 serotype 4 to non-PCV7 type 19A switch was first detected in the post-PCV7 period and spread throughout the country to become the 2nd most frequent 19A strain complex (Pai et al., 2005b; Brueggemann et al., 2007; Beall et al., 2011; Golubchik et al., 2012). Similarly, while PCV13 has dramatically decreased PCV serotype IPD, the multi-resistant 35B/ST156 variant was detected soon after PCV13 implementation (Metcalf et al., 2016b; Olarte et al., 2017b) and quickly became the 2nd leading 35B strain complex. It will not be surprising if detected recombination events involving non-PCV serotypes continue to increase, since carriage of these serotypes within the principal nasopharyngeal reservoir in children has substantially increased in the post-PCV7 and post-PCV13 periods (Sharma et al., 2013; Desai et al., 2015).
Here we reveal distributions of serotypes and key antimicrobial phenotypes for invasive disease isolates identified through routine surveillance in 2015-2016. In addition, we depict basic strain structures of each individual serotype and quantitate strain complexes that have only become apparent within the post-PCV period.

IPD Isolates
All 5334 isolates characterized were identified through CDC's Active Bacterial Core surveillance (ABCs) during 2015-2016. ABCs is an active population and laboratory-based system which covers a population of approximately 32.2 million individuals. ABCs areas, methods, and key surveillance data through 2016 are described at https://www.cdc.gov/abcs/reportsfindings/survreports/spneu16.html. WGS accessions are only provided for the 5212 isolates that yielded high quality sequencing metrics (sTable 1).

Genomic Sequencing
Genomic DNA preparation, library construction, whole genome sequencing (WGS), and bioinformatics pipeline features have been previously described (Li et al., 2016; Metcalf et al., 2016a,b). Streptococcus pneumoniae strains were cultured on Trypticase soy agar supplemented with 5% sheep blood and incubated overnight at 37°C in 5% CO2. Genomic DNA for short-read WGS was extracted manually using a modified QIAamp DNA mini kit protocol (Qiagen, Inc., Valencia, CA, USA). Nucleic acid concentration was quantified by an Invitrogen Qubit assay (Thermo Fisher Scientific Inc., Waltham, MA, USA) and samples were sheared using a Covaris M220 ultrasonicator (Covaris, Inc., Woburn, MA, USA) programmed to generate 500-bp fragments. Libraries were constructed on the SciCloneG3 (PerkinElmer Inc., Waltham, MA, USA) using a TruSeq DNA PCR-Free HT library preparation kit with 96 dual indices (Illumina Inc., San Diego, CA, USA) and quantified by a KAPA qPCR library quantification method (Kapa Biosystems Inc., Wilmington, MA, USA). WGS was generated employing two MiSeq instruments and the MiSeq v2 500 cycle kit (Illumina Inc.).

Strain Features
Strain features were predicted using our bioinformatics pipeline (Metcalf et al., 2016a,b). Strain features included capsular serotypes, multilocus sequence typing (MLST), pilus prediction, and minimum inhibitory concentrations (MICs) for antibiotics. Strain features associated with isolate identifiers and genome accession numbers are listed for all isolates in sTable 1. Of the 5334 isolates, 5212 (∼98%) were judged of sufficient quality to release to the sequence read archive (SRA). For ∼2% of isolates, critical strain parameters, when missing, were obtained through phenotypic testing, and for these no WGS data are publicly available. Critical WGS assembly metrics (contig number, N50 [an indicator of typical contig size], longest contig length, sum of contig lengths) are provided in sTable 1. These SRA submissions are available at https://trace.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA284954&go=go.
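As a point of reference for the assembly metrics listed above, the following minimal sketch shows how contig number, N50, longest contig length, and sum of contig lengths can be derived from a list of contig lengths. The function names and the example lengths are illustrative assumptions and are not taken from the pipeline used in this study.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def assembly_metrics(contig_lengths):
    """Summarize an assembly with the four metrics named in the text."""
    return {
        "contig_number": len(contig_lengths),
        "n50": n50(contig_lengths),
        "longest_contig": max(contig_lengths),
        "sum_of_contig_lengths": sum(contig_lengths),
    }


# Hypothetical contig lengths (bp), for illustration only.
example = [400, 300, 200, 100]
metrics = assembly_metrics(example)
```

Because N50 weights long contigs more heavily than a simple mean does, it is less sensitive to many very short contigs in a fragmented assembly.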
Penicillin-resistance (penR) equates to an MIC of ≥2 µg/ml, with intermediate resistance (penI) at 0.12-1 µg/ml and reduced susceptibility at 0.06 µg/ml. All year 2015 isolates were also subjected to conventional broth dilution testing as previously described (Metcalf et al., 2016a). Year 2016 isolates recovered from Minnesota were also subjected to conventional MIC testing and serotyping, which were in close agreement with pipeline-predicted MICs and serotypes (the latter with 100% concordance). Approximately 3% of the remaining year 2016 isolates were subjected to conventional MIC testing due to sub-optimal assemblies affecting penicillin binding protein (PBP) typing. About 0.3% of isolates were assigned serotypes serologically.
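The penicillin categories defined above can be expressed as a simple threshold function. This is a minimal sketch rather than pipeline code: the function name is hypothetical, and it assumes that any MIC below 0.06 µg/ml is treated as basally susceptible.

```python
def penicillin_category(mic_ug_per_ml):
    """Map a penicillin MIC (µg/ml) to the categories used in the text."""
    if mic_ug_per_ml >= 2:
        return "penR"                     # resistant: MIC >= 2
    if mic_ug_per_ml >= 0.12:
        return "penI"                     # intermediate: 0.12-1
    if mic_ug_per_ml >= 0.06:
        return "reduced susceptibility"   # "first-step" MIC of 0.06
    return "basally susceptible"          # assumption: everything below 0.06


def is_nonsusceptible(mic_ug_per_ml):
    """Penicillin-nonsusceptible isolates are those classed as penI or penR."""
    return penicillin_category(mic_ug_per_ml) in ("penI", "penR")
```

For example, an isolate with a predicted MIC of 0.25 µg/ml falls in the penI category and is counted as penicillin-nonsusceptible.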

Genetic Analyses of Strains
eBURST (Feil et al., 2004) was employed using the website http://eburst.mlst.net/v3/enter_data/single/ with the number of loci set at 10. The 10 loci included the 7 conventional multilocus sequence typing housekeeping gene fragment identifiers (provided by https://pubmlst.org/spneumoniae/) to which we added the previously described 3-digit PBP type (Metcalf et al., 2016b). eBURST complexes consisted of isolates sharing at least 5 of the 7 housekeeping loci with other member(s) of the set.
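The membership rule stated above (sharing at least 5 of the 7 housekeeping loci with another member of the set) can be illustrated with a simplified single-linkage grouping. This sketch is not the eBURST algorithm itself, which also infers founders and link patterns; the function names and the 10-locus profiles (7 housekeeping alleles followed by 3 PBP subtypes) are made up for illustration.

```python
from itertools import combinations


def shared_loci(profile_a, profile_b):
    """Count positions at which two allelic profiles carry the same allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def group_profiles(profiles, n_housekeeping=7, min_shared=5):
    """Single-linkage grouping: two profiles are linked when they share at
    least `min_shared` of the first `n_housekeeping` loci."""
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find lookup with path compression.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        hk_a = profiles[a][:n_housekeeping]
        hk_b = profiles[b][:n_housekeeping]
        if shared_loci(hk_a, hk_b) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)


# Hypothetical profiles: 7 housekeeping alleles + 3 PBP subtypes.
profiles = {
    "A": (1, 1, 1, 1, 1, 1, 1, 4, 7, 7),
    "B": (1, 1, 1, 1, 1, 2, 2, 0, 1, 1),  # shares 5 of 7 housekeeping loci with A
    "C": (9, 9, 9, 9, 9, 9, 9, 4, 7, 7),  # shares only PBP subtypes with A
}
complexes = group_profiles(profiles)
```

In this example A and B fall into one complex despite different PBP types, while C remains separate even though it carries the same PBP type as A.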
For phylogenetic analysis of strain sets, a reference sequence was selected for the major ST from the year 2015 dataset. Snippy (https://github.com/tseemann/snippy) was used to read map genomes against the corresponding reference sequence and to build a core SNP alignment. Isolates that had under 90% coverage of the reference sequence were excluded from the analysis. Trees were built using RAxML with ascertainment correction. PlotTree (https://github.com/katholt/plotTree) was used for the visualization of metadata in trees.
Recombinant region sizes within individual strains exhibiting typing (MLST, cps, PBP subtype) features characteristic of 35B/CC558 strains were measured through comparison with the 35B/ST558 reference sequence from isolate 20154871 (sTable 1). Contiguous regions of > 1,000 bases between 2 SNPs sharing > 99.9% sequence identity with 20154871 were measured.
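One simplified way to delimit such regions is to look for long SNP-free intervals between consecutive SNP positions called against the reference. The sketch below assumes this stricter reading (100% identity between the two flanking SNPs, rather than >99.9%); the function name and the positions are hypothetical.

```python
def snp_free_regions(snp_positions, min_length=1000):
    """Return (start, end, length) for each interval between two consecutive
    SNPs whose SNP-free interior is longer than `min_length` bases.

    Positions are 1-based reference coordinates; start and end are the first
    and last bases of the SNP-free interior.
    """
    positions = sorted(set(snp_positions))
    regions = []
    for left, right in zip(positions, positions[1:]):
        length = right - left - 1  # bases strictly between the two SNPs
        if length > min_length:
            regions.append((left + 1, right - 1, length))
    return regions


# Hypothetical SNP positions relative to a reference sequence.
regions = snp_free_regions([100, 150, 2000, 2500, 30000])
```

Here two intervals qualify, of 1849 and 27,499 bases; the intervals of 49 and 499 bases are too short to be counted.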
Figures 3A,B depict rate differences by serotype comparing IPD incidence pre-PCV13 (2007-2008) to post-PCV13 (2015-2016) in children <5 yrs and in adults >65 yrs. PCV13 serotypes, with the exception of serotype 19F, decreased during 2015-2016 relative to 2007-2008 in young children and elderly adults. Significant reductions were observed for serotypes 19A and 7F, which were the predominant IPD serotypes during 2007-2009, and for serotype 6C. Slight increases in non-PCV13 serotypes and PCV13 serotype 19F were observed among both young children and elderly adults in 2015-2016 when compared to 2007-2008, with a significant increase for serotype 23B among the elderly. Serotype-specific incidence changes showed no consistent correlations with antimicrobial resistance or the presence/absence of pili.
Most combined resistance to erythromycin and clindamycin was conferred by ermB, although 7 isolates (4 different serotypes) contained the 23S rRNA A2061G substitution conferring this resistance trait (sTable 1). The proportions of isolates with macrolide-resistance increased with increasing penicillin MICs (Figure 4). Only 19.1% of basally penicillin-susceptible isolates were macrolide-resistant, while 51.1% with the "first-step" penicillin MIC (reduced susceptibility) were macrolide-resistant. Of intermediately penicillin-resistant isolates, 56.8% were macrolide-resistant, while 83.5% of isolates with penicillin MICs ≥2 µg/ml were macrolide-resistant. The distributions of combined resistance features for penicillin, erythromycin, and clindamycin are diagrammatically presented within different
FIGURE 1 | (A) Serotype frequency, patient age distributions, and proportion of most common clonal complex (CC) within each serotype for 5284 isolates. The ranking of the 10 serotypes causing the most IPD in individuals <5 yrs of age is shown in parentheses. The major MLST-based clonal complex within each serotype is shown within each column. There were no pediatric isolates within serotypes indicated with an asterisk. (B) Numbers of isolates within each serotype with individual combinations of penicillin, erythromycin, and clindamycin resistance phenotypes. (C) Pilus backbone determinant frequency (presence or absence of PI-1 and/or PI-2) within each serotype.

Overview of Individual Serotypes and Serotype Switch Events
The sections that follow provide general depictions of clonal complexes, resistance features, and recently documented (2002 or later) putative serotype-switch variants within each serotype. sTable 2 provides a listing of year 2015-2016 switch variant progeny, along with their major MLSTs, PBP types, and resistance features. Each CC/serotype combination is designated as a category of switch variant.
Certain strain complexes that were represented by numerous progeny accounted for multiple switch events. For example, examination of PBP and MLST alleles within 35B/CC156 progeny revealed that this complex originated from at least 3 independent serotype-switch events. Similarly, the CC338 strains 23B/ST1373 and 23B/ST338 originated from different ancestral events.

35B/CC558 as a Substrate for Recombination
There were numerous examples of apparent hybrid strains that exhibited typing pipeline features highly associated with two different strains of known lineages (data intuited from sTable 1). Such strains included many putative serotype-switch strains within sTable 2, but also included other strains with altered beta lactam MICs conferred by horizontal transfer of PBP loci. Several recombinant strains that involved the 35B/CC558 lineage were easily recognized due to PBP typing determinants and MLST loci that are highly specific for this lineage (Table 1). The MLST determinants recP44, xpt77, and ddl97 were all initially documented from within invasive 35B/ST558 circulating within the United States during 1995-2001.
Similarly, we have found that the 3 resistance-conferring mosaic PBP genes associated with PBP type 4-7-7 are highly associated with 35B/CC558, although as indicated within Table 1, 2b-7 is also found within year 2015-2016 serogroup 6/CC1092 and 19A/CC230 IPD isolates from 1998-present. Table 1 depicts 16 putative recombination events (multifragment recombinations counted as single event) that implicate 35B/CC558 as a genetic donor (of cps35B, MLST loci, or PBP type determinants) in 15 genetic exchanges and as a genetic recipient (of cps6B) in a single exchange, resulting in the indicated progeny strains shown in the right column. Each exchange is defined by 1-3 specific double crossover events deduced in 1-25 progeny isolates.
There were 5 clear instances of serotype switching events involving a 35B/CC558 parental donor strain, involving 5 different donor serotype lineages. The most important serotype 35B switch variant, accounting for 25 isolates, arose from a single event where the 9V/ST156 pbp2x-cps9V-pbp1a region was replaced with the corresponding sequences from 35B/ST558 through two closely situated recombination events (Table 1, number 1). This strain was initially reported from a year 2012 ABCs isolate (Metcalf et al., 2016b) and was subsequently found within all 10 ABCs surveillance sites during 2015-2016. The second serotype switch, also determined previously to share the same 2 parental lineages (Metcalf et al., 2016b), resulted in a 35B/ST10174 strain that was discovered as a year 2009 IPD isolate. Within this strain two unlinked recombination events were revealed: one encompassing the contiguous 1A-4 and cps35 loci, and the other encompassing the contiguous 2B-7 and ddl97 markers as an example of ddl "hitchhiking" with a resistance-conferring pbp2b allele (Enright and Spratt, 1999). We subsequently recovered two 35B/ST10174 IPD isolates during 2016-2017. The third serotype switch shown in Table 1 implicates a 35B/CC558 cps35B donor on the basis of 3 unlinked recombination events that replaced the recP, gdh, and cps regions of the common 23/ST338/PBP type 0-1-1 with recP44, gdh12, and cps35B, resulting in a 35B/ST13255/PBP type 0-1-1 progeny strain. Table 1, number 4 involved a 35B/CC558 cps35B donor and a serogroup 6 recipient, where one recombination event replaced cps6 and the second closely linked double crossover resulted in a hybrid pbp2x fusion gene creating a new PBP2x subtype (2x-36). The fifth serotype switch event involved a 35B/CC558 cps35B donor and a 15A/ST11818 recipient. It is interesting that there were 4 15A/ST11818 isolates recovered, 3 of which shared the common 15A/CC63 PBP type 24-31-114.
The PBP type of the progeny 35B/ST11818 (4-31-114) and its cps35B region sequence are consistent with a gene replacement event effecting a serotype switch event and a change in PBP type. Here, a single chromosomal fragment carrying cps35B and the 3′ 544 codons of pbp1a (encompassing PBP subtype 1a-4) from a 35B/CC558 strain replaced the corresponding region from a 15A/ST11818/PBP type 24-31-114 strain.
A sixth serotype switch depicted in Table 1 (entry 17) involved an unknown lineage cps6B locus donor and a 35B/ST558 recipient strain (last row). Comparison of the cps locus from this recombinant to a cps6B reference locus (CR931639) and the 35B/ST558 cps35B reference revealed potential crossover sites corresponding to bases 2321 (within wzg) and 16033 (immediately downstream of 4 gene rhamnose biosynthetic gene cluster) of the cps6B reference, with 99.2% sequence identity shared within the 13,743 bp overlap of these two cps6B loci. As expected, sequences upstream and downstream of the cps6B locus that encompassed the full-length pbp2x and pbp1a genes in the recombinant strain were highly similar to the 35B/ST558 recipient reference strain (>99.5% sequence identity over 14,433 bp flanking the cps6B switch fragment, including complete identity with full-length 2160 bp pbp1a and 1762 bp pbp2x genes).
For each serotype, a descriptive section is provided in the text. Designations with dashes simply indicate differences in 1-3 of the PBP loci (e.g., ST180-2 vs. ST180-8, ST433 vs. ST433-1). Each lists a legend where the serotype is underlined and indicates the ratio of year 2016 isolates of that serotype to 2015 isolates in parentheses. Below each serotype individual clonal complexes are listed, also with the ratio of 2016 to 2015 isolates. Entries where there are more year 2016 isolates than year 2015 isolates are indicated in red font. Strain complexes that have only been detected within the post-PCV era and are likely to have originated through serotype switching are indicated, with the most likely serotype of the parental recipient strain indicated. Black, green, and red lines between nodes indicate variation at 1, 2, and 3 of the 10 loci, respectively. Clonal complexes within dotted rectangle include MLST types that differ in 1-2 of the 7 housekeeping loci, yet differ in 3 or more of the 10 locus eBURST scheme. A legend depicting relevant resistance phenotypes for penicillin, erythromycin, and clindamycin is included for each serotype. It was informative to show eBURST of serotypes 23A and 23B together (C), and also between serotypes 6C, 6A, 6B, and 6D together (L) due to intrinsic close relationships. In 5A and 5C PBP type-driven increased penicillin MIC are depicted (see 180-4 in 5A and 338-3 in Fig 5C).
The remaining 9 entries in Table 1 (entries 6-14) do not involve cps switches, but introduced changes in MLST and/or PBP loci that were observed in the progeny. Except for entry 10, where the short length (565 bp) of the pbp2b putative replacement fragment does not provide strong association with a 35B/CC558 donor, these putative recombination events appeared non-subjective, involving replacement fragments of 1424 bp to >26,001 bp that closely matched the 35B/ST558 reference. Four of these hybrid 15A/CC63 strains (entries 6-9) had higher predicted MICs for beta lactams due to the introduced changes in PBP type. Replacement of 2-3 PBP markers resulted in a change in the median 15A/CC63 pen MIC from 0.25 to 2 µg/ml (entries 6 and 7). The predicted change from PBP type 13-115-73 (the closest matching PBP type among our 15A isolates) to 13-7-73 results in a change of pen MIC from 1 to 2 µg/ml. All 3 33F hybrid progeny appeared to result from an identical strain mixing that effected 3 large and unlinked gene replacements within a 33F/ST10491 recipient (entry 12 in Table 1). Two of these replacements dramatically increased beta lactam MICs through changing the 33F/ST10491 PBP type from 2-23-6 (associated with pen MIC < 0.03 µg/ml) to 4-23-7 (pen MIC = 1 µg/ml).

Serotype 3
Despite being a PCV13 serotype, serotype 3 was the major IPD serotype overall during 2015-2016. It is interesting that the clade centering upon ST180-2 coincides with the recently described globally emergent CC180 clades I-β and II (Azarian et al., In Press). STs 180-4 and 3798-1, which feature ermB expression combined with reduced penicillin-susceptibility due to a first-step PBP-2x substitution (0.06 µg/ml indicated), coincide with the finer-detailed phylogeny shown in sFigure 18.

Serotype 22F
Serotype 22F displayed 2 significant mef+ lineages (196 isolates) that included the ST7314-focused clade and ST433-2 (Figure 5B). These eBURST-based clades are in agreement with finer phylogenetic resolution and generally show much deeper branching within the ST433 clade relative to the ST7314 clade, which is consistent with more recent emergence of the major macrolide-resistant ST7314 clade (sFigure 19).

Serotypes 23A and 23B
Serotypes 23A and 23B are depicted together (Figure 5C) since two broad MLST-based clonal complexes (CC338 and CC439) represent the majority of both serotypes (99.2% of 23A and 93.4% of 23B). Although ST338 was first noted in association with serotype 23F, within ABCs it has been found almost exclusively within 23A even before PCV7 implementation (Pai et al., 2005a). This observation is potentially indicative of 2 shared major ancestral lineages encompassing both serotypes. Clonal shifts occurred within 23A, 23B and other non-PCV serotypes during the post-PCV7 years such that more ABCs isolates were of the penicillin-nonsusceptible CC338 (Gertz et al., 2010).
Although 23A and 23B ranked 3rd and 12th in overall incidence, respectively, there were 25 23B isolates from children and only 9 23A isolates from children. Serotype 23B was the only NVT serotype that increased in children (also in the elderly).
It is interesting that of the numerous STs represented in the broad complexes CC338 and CC439, one (ST338-1) was found in both serotypes (there were 4 23B/ST338-1 isolates and 126 23A/ST338-1 isolates). Phylogenetic analysis was consistent with 23B/ST338-1 arising from at least one serotype switch with a 23A/ST338-1 recipient (sFigure 20, sTable 2, category 1); however, the major CC338 genotype within 23B was ST1373, which has only been traced within ABCs 23B isolates in the post-PCV7 era (Gertz et al., 2010; Metcalf et al., 2016b) and was originally reported from a 19F isolate recovered in 2001 (www.mlst.net). It is interesting that 23B/ST1373 has been associated with a divergent cps locus (Andam et al., 2017) previously described as sequence subtype 23B1 (Kapatai et al., 2016). The short branch lengths among the majority of the 23B/ST1373 isolates are consistent with the recent emergence and expansion of this strain complex (sFigure 20).

Serotype 35B
The serotype 35B isolates include the recently described 35B serologic subtype 35D caused by mutations within the wciG35B-encoded acetyltransferase gene (Geno et al., 2017). Potential 35D isolates represented 7.7% of the serotype 35B isolates based solely upon substitutions and indels within wciG35B. Indels accounted for 2.3% of these isolates and were generally reliably distinguished as serotype 35D. The variety of individual wciG35B changes (not shown), almost all of which occurred within single isolates, possibly indicates the lack of a selective advantage for this serologic variant of 35B.
CC156/35B included isolates from at least 3 independent serotype switch events, including the previously described 35B/ST156 major variant and the independent resistant switch variant 35B/ST10174 carrying the closely linked ddl97 and pbp2b-7 markers (Enright and Spratt, 1999) derived from the 35B/ST558 serotype donor strain (Metcalf et al., 2016b; Chochua et al., 2017). As with ST10174, ST162 shares 6 of 7 housekeeping loci with ST156; however, 35B/ST162 was likely to have originated from a third serotype switch between penicillin-susceptible 35B and 9V strains (sTable 2, category 14, discussed more fully in the section below).

Serotype 16F
The four outlier isolates of the three largely susceptible CCs included a penI, eryR serotype derivative (single locus variant) of a serogroup 6 lineage (sTable 2, category 57) that shared the same resistance features as 14 6C/ST1292 isolates in this study (PBP type 19-31-8, mef+).

Serogroup 6
These genetically heterogeneous isolates, consisting primarily (90%) of serotype 6C, were grouped together (Figure 5L). The arrows in Figure 5L indicate CCs representing both 6C and other serogroup 6 serotypes in this study, since 6C strains appear to have emerged within several different lineages previously associated with serotypes 6A and 6B (Carvalho et al., 2009). There were 2 serotype 6D isolates (Bratcher et al., 2010), which are the first of this serotype documented in ABCs. Both of the 6D isolates were ST1379, which has been previously associated with serotype 6A (Carvalho et al., 2009). Five recent serotype-switch variants (7 isolates) were observed within serotypes 6C, 6A, and 6B (sTable 2, categories 7, 20, 39, 49, and 54). A more detailed description of the single penR 6B/ST558 variant of the major 35B complex is provided below and in Table 1.
Although under intense selective vaccine pressure in the post-PCV13 era, serotype 7F revealed no obvious serotype switch variants.
Within serotype 17F, two variants were observed that included single isolate variants potentially derived from recipient strains of 19F/ST177 and 15A/ST63 (sTable 2, categories 23 and 29).
Although serotype 21 comprised only 22 isolates, 7 were pediatric isolates, including 2 from meningitis cases. Five of the 22 isolates were from meningitis cases, including 3 of the 5 ST3689 isolates and 2 ST432 isolates.
The PCV13 serotypes 9V, 23F, and 14 cumulatively accounted for 36% of ABCs isolates (1442/4046 total isolates, including 477/1044 (46%) pediatric isolates) recovered during 1999 from a surveillance population of about 18.6 million people (Kim et al., 2016). During 2015-2016 isolates of these 3 serotypes were rare (23 total, sFigures 14-16), with no pediatric isolates of these serotypes recovered. The most common serotype during the pre-PCV7 period was serotype 14, which accounted for approximately one third of IPD in children <5 yrs. Although there were only 6 total serotype 14 isolates during 2015-2016, these residual CCs were common in the pre-PCV7 era and included CC156 (SLVs ST3819 and ST334), CC13, and CC230 (SLV ST5912). Serotype 9V in the US has primarily consisted of both penR and penS sublineages of CC156 throughout the pre and post PCV periods, reflected by these 2 sublineages accounting for all 9 isolates from 2015-2016. Five of the 8 serotype 23F isolates were of the longstanding and highly resistant CC81 lineage (Wyres et al., 2012).

DISCUSSION
This initial depiction of our first 2 years of WGS-based ABCs strain surveillance provides an encompassing data framework for comparison with future strain surveillance. Here WGS has allowed us a closer glimpse into the active process of strain mixing through recombination. The amount of recombination deduced from simply observing 35B/CC558 markers distributed among non-serotype 35B strains suggests a species continually engineering and presenting new experimental strains for "testing" the property of fitness. Here we assume that a basic level of fitness is described by the emergence of a strain in our IPD surveillance to the level of detection. For this to have occurred, we assume that the strain was generated and carried within the upper respiratory reservoir for some period of time. For this reason we view with concern the two 3/ST271 isolates described in this study (a third also recovered during 2018; unpublished data). This highly resistant strain combines the current most successful invasive serotype with the most successful invasive clonal complex of the decade before PCV13 implementation (Beall et al., 2011). At this point in time only a minority of newly recognized strain complexes have successfully emerged from these recombinational "experiments" to become detected within multiple ABCs cases. The current IPD burden imposed by the recombinant strains revealed in this study is significant; however, our study is limited in detection of such events by the use of recognizable typing patterns gleaned from short-read sequence and incomplete genome assemblies. Increased carriage of serotype 35B by children has been observed in post-PCV years (Sharma et al., 2013; Desai et al., 2015; Kaur et al., 2016). Performing a screen for typing elements consistent with 35B/CC558 within non-35B/CC558 strains allowed us to qualitatively assess the impact of this single common strain complex in recombination events.
Although most of the serotype-switch classes that we have found among these 2015-2016 ABCs isolates were represented by single recombinant strains, there is longstanding evidence that certain serotype-switch "experiments" generate highly adapted and successful strains. While we depict several putative vaccine-escape recombinant strain complexes (e.g., 23B/ST1373, 15A/ST3811, 15BC/ST3820, and 35B/ST156) that have only recently shown successful emergence, serotype switching has been a key mechanism in shaping the pneumococcal population structure long before PCV introduction (Coffey et al., 1991;Wyres et al., 2013). These events often are reflective of recipient and/or donor parental strains that have been known to be abundant within the carriage reservoir, and have often involved gene replacement events that included the cps and nearby resistance-conferring pbp1a and pbp2x alleles (Coffey et al., 1991;Brueggemann et al., 2007;Chochua et al., 2017).
Increasing resistance to beta lactams has been shown to be conducive to the acquisition of additional resistance features, and dually resistant (to penicillin and erythromycin) strains have emerged faster than strains that are resistant to one of these antibiotics alone (Jacobs et al., 1978;McCormick et al., 2003). Within our recent strain surveillance, the emergence of the penI, eryR, cliR 23A/ST338-3 clade has outpaced the eryS, cliS 23A/ST338-1 clade that expresses a lower penicillin MIC (unpublished data). The clade 3/ST180-4 is another example of an emergent offshoot that expresses higher resistance to beta lactams and macrolides than the ancestral clonal complex.
Although many, and possibly all, of the 57 putative recent serotype-switch variants described in this paper may have originated in the pre-PCV era, they were only detected after PCV implementation. Certain of these hybrid strains, as well as older well-documented strains, warrant close scrutiny. Preferential increase or persistence of a specific PCV13 lineage could be consistent with some sort of localized immunologic or epidemiologic advantage. For example, the vaccine serotype 4 isolates from 2015 to 2016 surveillance, primarily the putative switch 4/ST10172 genotype and the well-established 4/ST244 genotype, are almost entirely from western surveillance areas (exclusively California for ST244, Colorado and New Mexico for the ST10172; data not shown) and each genotype constitutes a distinct highly related phylogenetic cluster (sFigure 8). It is striking that 43% of these isolates were from homeless individuals (unpublished ABCs data).
For unknown reasons, pan-susceptible serotype 19F/CC251 strains have declined more slowly than other 19F CCs that were abundant before PCVs and actually increased in IPD slightly in 2016 relative to 2015. It might prove important to examine such strains in PCV13-based opsonophagocytosis assays to evaluate whether antibodies generated in response to type 19F antigen in the vaccine have functionality against these specific type 19F strains. Serotype 6C is an example where such investigation proved important. Originally mistyped as serotype 6A, 6C proved to be a distinct serotype not targeted by PCV7 that differed within its primary repeating unit structure (Park et al., 2007, 2008). Conversely, the continued decreases shown within the major 19A CCs are consistent with uniform targeting of these strains by PCV13. It might be relevant to test next generation PCVs for effective targeting of clonal complexes that have recently emerged. For example, although CC199/15BC has been predominant within the U.S. for the past 20 years, it might be prudent to assess the rapidly expanding antimicrobial-resistant 15BC/ST3280 (Andam et al., 2017) for its behavior in next generation PCV-based opsonophagocytic killing assays.
Our primary objective was to provide a meaningful descriptive account of IPD isolates recovered through population-based surveillance in the post-PCV13 era. We hope that these strain data will support a wide array of studies to assist in prevention efforts and contribute to our basic understanding of IPD strains.

AUTHOR'S NOTE
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

AUTHOR CONTRIBUTIONS
All authors reviewed the manuscript and provided constructive feedback. BB wrote the study, performed many of the analyses provided, and guided bioinformatics pipeline development. SC performed, directed, and evaluated all whole genome sequencing data. BM developed the bioinformatics pipeline and provided periodic updates. YL provided the bioinformatics approach for prediction of β-lactam MICs conferred by newly encountered PBP types. ZL, TT, and HW performed whole genome sequencing. LM oversaw all laboratory operations. JR provided phylogenetic analysis for selected serotypes. RG maintained typing antisera as well as performed conventional testing of selected isolates. TP assisted in the writing and provided population-based disease rates for the ABCs data.

FUNDING
This work was performed as part of our normal responsibilities at the CDC.

sFigure 19 | 22F. Phylogenetic resolution of serotype 22F isolates (refer to Figure 5B). Note the separation of clades ST433 (blue) and ST7314 (red-violet, corresponding to eryR isolates). A separate node (ST433-2) of eryR isolates is also indicated.
sFigure 21 | Phylogenetic resolution of serotype 12F isolates (corresponds to Figure 5F).
sFigure 22 | Resolution of 33F showing the relatively short branch lengths within the ST2705-focused clade compared to the ST100 clade (refer to Figure 5H). In addition, the very high relatedness of the 5 penicillin nonsusceptible ST11856 isolates within CC100 is consistent with very recent emergence.
sTable 2 | CCs are listed in order of decreasing number of progeny isolates. The MLST/PBP type combinations within each CC are listed such that existing PBP relationships are optimally aligned.