Integrated genomic analysis of antibiotic resistance and virulence determinants in invasive strains of Streptococcus pneumoniae

Introduction Streptococcus pneumoniae is an important human pathogen that may cause severe invasive pneumococcal diseases (IPDs) in young children and the elderly. A comprehensive comparative whole-genome analysis of invasive and non-invasive serotype strains offers great insights that are applicable to vaccine development and disease control. Methods In this study, 58 invasive (strains isolated from sterile sites) and 71 non-invasive (serotypes that have not been identified as invasive in our study) pneumococcal isolates were identified among the 756 pneumococcal isolates obtained from seven hospitals in Zhejiang, China (2010–2022). Serotyping, antimicrobial resistance tests, and genomic analyses were conducted to characterize these strains. Results and discussion The three most invasive serotypes were 23F, 14, and 6B. The invasive pneumococcal isolates' respective resistance rates against penicillin, ceftriaxone, tetracycline, and erythromycin were 34.5%, 15.5%, 98.3%, and 94.7%. Whole-genome sequencing indicated that the predominant invasive clonal complexes were CC271, CC876, and CC81. The high rate of penicillin non-susceptible Streptococcus pneumoniae (PNSP) is related to the clonal distribution of resistance-conferring penicillin-binding proteins (PBP). Interestingly, we found a negative correlation between invasiveness and resistance in the invasive pneumococcal serotype strains, which might be due to the proclivity of certain serotypes to retain their β-lactam resistance. Moreover, the mutually exclusive nature of zmpC and rrgC+srtBCD suggests their intricate and potentially redundant roles in promoting the development of IPD. These findings reveal significant implications for pneumococcal vaccine development in China, potentially informing treatment strategies and measures to mitigate disease transmission.


Introduction
Streptococcus pneumoniae is an important pathogen that causes invasive pneumococcal diseases (IPDs) such as invasive pneumonia, sepsis, bacteremia, and meningitis (Weiser et al., 2018). The most affected populations are young children, older adults, and immunocompromised patients (O'Brien et al., 2009). IPDs can be effectively controlled by introducing pneumococcal conjugate vaccines (PCVs), which are designed to target the most prevalent and virulent pneumococcal serotypes (Whitney et al., 2003; O'Brien et al., 2009; Pneumococcal Disease: Prevention | CDC, 2023). Epidemiological studies of IPD have consistently reported specific serotypes that cause invasive diseases, such as the recently reported emergence of serotype 4 and 24F IPD in Israel and France, respectively (Kellner et al., 2021; Lo et al., 2022). To identify important virulent strains, a comparative investigation between isolates from different origins is widely accepted (Sharew et al., 2021; Yan et al., 2021). However, this approach risks overlooking important differences between invasive and non-invasive serotype strains. Investigations comparing such strains appear to be relatively scarce in the current literature, although they are crucial for understanding the disease-causing potential of invasive pneumococcal serotypes and for contributing to vaccination strategies and vaccine development.
The selection of anti-infective treatment regimens should consider the drug resistance status of the strain. The increasing resistance of S. pneumoniae to β-lactams, macrolides, and tetracyclines has led to increasingly limited options for treating IPD (Thummeepak et al., 2015; Suaya et al., 2020). The high resistance of S. pneumoniae to macrolides and tetracyclines in China means that these drugs are seldom prescribed, and β-lactams are the first-line treatment for IPDs (Zhou et al., 2022b). However, macrolides are still recommended for regions with resistance rates below 25% (Gregory and Davis, 2020), and the global increase in the non-susceptibility of S. pneumoniae to β-lactams highlights the critical need to monitor the antibiotic resistance status of invasive pneumococci.
The development of IPD usually starts with bacterial colonization of the upper respiratory tract, where pneumococcus may asymptomatically colonize the host (Bogaert et al., 2004). If pneumococcal strains evade the host immune defense and reach the lower respiratory tract, they can cause inflammation and fluid accumulation in the lungs, which is typically diagnosed as pneumonia (Coonrod, 1989; Bogaert et al., 2004). Once the pneumococcus spreads to the bloodstream or cerebrospinal fluid, it causes sepsis or meningitis, which can lead to organ failure and death (Mook-Kanamori et al., 2011; Weiser et al., 2018). During this process, various pneumococcal virulence factors contribute to immune evasion (capsule polysaccharides, Cps), epithelial cell adhesion (pneumococcal rrg pathogenicity island), and host tissue invasion (zinc metalloproteinase, ZmpC) (Jonsson et al., 1985; Camilli et al., 2006; El Mortaji et al., 2012; Cremers et al., 2014). Comparative whole-genome sequencing (WGS) analysis of virulence factors among invasive and non-invasive serotype strains, for example to identify critical virulence genes present in invasive isolates but not in non-invasive serotype strains, would be a valuable contribution to our understanding of IPD development.
In this study, we identified 58 invasive pneumococcal strains, isolated from sterile sites, from a pool of 756 isolates collected from seven hospitals in Zhejiang, China, during the period 2010-2022. These 58 invasive pneumococcal strains cover 16 serotypes. A comparative genomic analysis was conducted between these strains and a set of 71 strains from serotypes that had not been identified as invasive in our collected cases. This study aimed to investigate the serotype distribution, antimicrobial susceptibility, molecular epidemiology, and virulence factors of invasive pneumococcal serotype strains in Zhejiang Province, China.

Streptococcus pneumoniae isolation and serotyping
S. pneumoniae isolates (n=129) were collected from seven tertiary hospitals in Zhejiang, China, from July 2010 to January 2022, and included 58 invasive and 71 non-invasive serotype strains. Invasive isolates were collected from patients' sterile sites, for instance, blood, bronchoalveolar lavage fluid (BALF), and cerebrospinal fluid (CSF) specimens. Non-invasive serotype isolates were strains obtained from sputum, nasopharynx, and oropharynx specimens that belong to serotypes never identified in IPDs among all of our pneumococcal-positive cases. All isolates were obtained by culturing the clinical samples on blood agar plates at 37°C with 5% CO2 and were identified by optochin, bile solubility, and lytA PCR tests. Thereafter, all isolates were serotyped by the latex agglutination test and Quellung reaction (SSI Diagnostica, Denmark). We also conducted in silico serotyping after WGS (detailed below) using SeroBA software (https://github.com/sanger-pathogens/seroba) (Epping et al., 2018).

Clinical information collection
The clinical information of the patient in each IPD case, including sex, age, and primary diagnosis, was retrospectively extracted from medical records with the approval of the Sir Run Shaw Hospital Ethics Review Committee (Zhejiang University School of Medicine, 20201112-32) (Table 1). The specimen types of all cases are summarized in

Antimicrobial susceptibility test
Broth microdilution assays were used to determine the minimal inhibitory concentrations (MICs) of the tested antimicrobial agents according to the Clinical and Laboratory Standards Institute (CLSI) protocols, as described previously (Lewis, 2023). The antimicrobial agents tested were penicillin (PEN), ceftriaxone (CRO), erythromycin (ERY), and tetracycline (TET). S. pneumoniae strain ATCC 49619 was used as the quality control strain. The results were interpreted according to the 2023 CLSI guidelines M100-Ed33 (Lewis, 2023).

Statistical analysis
To assess the difference in antibiotic resistance between invasive and non-invasive isolates, we conducted a Mann-Whitney U test on each column, where p<0.0001 was considered statistically significant. The correlation between the invasive and PEN insensitivity ratios across serotypes was assessed by Pearson correlation with a two-tailed test. A Pearson correlation coefficient (r)<-0.7 was considered a strong negative correlation, r<-0.5 a moderate negative correlation, and r<-0.3 a weak negative correlation. A two-tailed p<0.05 was considered statistically significant. All analyses were performed using GraphPad Prism v9.5.0.
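The group comparison described above can be illustrated with a minimal sketch. The study used GraphPad Prism; the pure-Python version below is not the authors' procedure, only an illustration of a two-sided Mann-Whitney U test under stated assumptions: it uses the normal approximation with tie correction and no continuity correction, and the MIC values are hypothetical, not data from this study.

```python
import math

ALPHA = 0.0001  # significance threshold stated in the text for this test


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u, p_two_sided


# Hypothetical PEN MICs (ug/mL), for illustration only.
invasive_mics = [0.03, 0.06, 0.06, 0.12, 0.25, 0.5, 1, 2]
non_invasive_mics = [1, 2, 2, 4, 4, 4, 8, 8]
u_stat, p_value = mann_whitney_u(invasive_mics, non_invasive_mics)
print(f"U = {u_stat}, p = {p_value:.4g}, significant at {ALPHA}: {p_value < ALPHA}")
```

For small samples an exact test is preferable to the normal approximation, so p-values from this sketch may differ slightly from those reported by statistical packages.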

Clinical characteristics of patients with IPD
The demographic and clinical characteristics of patients with IPD are summarized in Tables 1 and 2. Among the 58 patients with IPD, males accounted for the majority (69.0%). By age group, 44.8% (26/58) of IPD cases were identified in young children (0-5 years), and 19.0% (11/58) occurred in patients ≥65 years old. The three most common primary diagnoses were bronchitis, pneumonia, and sepsis. The specimen types were blood (49/58, 84.5%), BALF (5/58, 8.6%), and cerebrospinal fluid (CSF, 4/58, 6.9%). Bronchitis, fever, and sepsis were the primary diagnoses in cases that were later confirmed as IPD based on a positive pneumococcal blood culture. In cases of non-invasive pneumococcal infection, the most common primary diagnosis was pneumonia (Supplementary Table 1).

Antimicrobial susceptibility
Antimicrobial susceptibility results for the 58 invasive S. pneumoniae isolates are presented in Table 3. According to the non-meningitis breakpoints, the non-susceptibility rates of the isolates to PEN, CRO, TET, and ERY were 34.5%, 15.5%, 98.3%, and 94.7%, respectively. According to the meningitis breakpoints, the non-susceptibility rates of the isolates to PEN and CRO were 82.8% and 43.1%, respectively. Most PEN- and CRO-non-susceptible strains belonged to PCV-covered serotypes, with serotypes 19F and 14 accounting for the majority. The MIC values of each isolate are presented in Supplementary Tables 3 and 4, showing that the MIC90 values of invasive and non-invasive pneumococcal isolates against PEN were 8 and 4 µg/mL, respectively.
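How a non-susceptibility rate and an MIC90 follow from a list of MICs can be sketched as below. This is an illustration, not the authors' pipeline. The susceptible cut-offs are assumptions based on our reading of CLSI breakpoints for S. pneumoniae and should be checked against M100-Ed33; the nearest-rank definition of MIC90 is likewise an assumption, and the MIC list is hypothetical.

```python
# Assumed upper MIC limits (ug/mL) for the "susceptible" category.
# Verify against CLSI M100 before any real use.
SUSCEPTIBLE_MAX = {
    ("PEN", "non-meningitis"): 2.0,
    ("PEN", "meningitis"): 0.06,
    ("CRO", "non-meningitis"): 1.0,
    ("CRO", "meningitis"): 0.5,
}


def non_susceptible_rate(mics, drug, setting):
    """Percentage of isolates whose MIC exceeds the susceptible cut-off."""
    cutoff = SUSCEPTIBLE_MAX[(drug, setting)]
    non_susceptible = sum(1 for mic in mics if mic > cutoff)
    return 100.0 * non_susceptible / len(mics)


def mic_percentile(mics, pct=90):
    """Lowest MIC inhibiting at least pct% of isolates (nearest-rank method)."""
    ordered = sorted(mics)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct*n/100
    return ordered[rank - 1]


# Hypothetical PEN MICs (ug/mL) for ten isolates, for illustration only.
example_mics = [0.03, 0.06, 0.12, 0.25, 0.5, 1, 2, 4, 4, 8]

print(non_susceptible_rate(example_mics, "PEN", "non-meningitis"))
print(non_susceptible_rate(example_mics, "PEN", "meningitis"))
print(mic_percentile(example_mics, 90))
```

The same isolates give a much higher non-susceptibility rate under the meningitis cut-off than under the non-meningitis one, which is the pattern reported above for PEN and CRO.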

Correlation of beta-lactam resistance and invasion
Analysis of the above datasets together with our total pneumococcal strain collection (n=756) revealed that the serotype 19F strains presented the highest PEN insensitivity rate of 62.8% (108/172) and the lowest invasive ratio of 2.9% (5/172). In contrast, 80% (4/5) of the serotype 4 strains in our study were invasive isolates and none were PEN-insensitive (Figures 3A, B). This finding prompted us to perform a correlation analysis for all invasive serotypes. As shown in Figure 3C, the invasive and PEN insensitivity ratios of the invasive serotypes were moderately negatively correlated (r=-0.5444, p=0.292). Among all invasive serotypes, 19F and 19A were the most resistant and least invasive, while serotypes 4 and 8 were the most invasive and least resistant. Because 19F is the most prevalent serotype in our study, which results in its high isolation rate among the invasive strains, we conducted a separate analysis that included all available genome data of serotype 19F strains (n=124) in our laboratory. As shown in Supplementary Figure 1, most serotype 19F isolates belonged to CC271. The PEN insensitivity of this serotype is very high (62.8%) and is mediated by the same PBP1a-2b-2x type (13-11-33). Among more than one hundred 19F strains, only five were invasive.
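The per-serotype ratios and the correlation strength categories used here can be reproduced with a short sketch. The counts for serotypes 19F and 4 are those stated in the text; the function names and the small demonstration vectors are hypothetical and serve only to show the calculation, not to reproduce Figure 3C.

```python
import math


def ratio_percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def negative_strength(r):
    """Label r using the thresholds given in the Statistical analysis section."""
    if r < -0.7:
        return "strong"
    if r < -0.5:
        return "moderate"
    if r < -0.3:
        return "weak"
    return "none"


# Counts stated in the text (total isolates, invasive, PEN-insensitive).
serotype_counts = {
    "19F": {"total": 172, "invasive": 5, "pen_insensitive": 108},
    "4": {"total": 5, "invasive": 4, "pen_insensitive": 0},
}

for serotype, c in serotype_counts.items():
    invasive = ratio_percent(c["invasive"], c["total"])
    insensitive = ratio_percent(c["pen_insensitive"], c["total"])
    print(f"{serotype}: invasive {invasive:.1f}%, PEN-insensitive {insensitive:.1f}%")

# The reported coefficient falls in the "moderate" band.
print(negative_strength(-0.5444))

# Hypothetical ratio vectors, only to exercise pearson_r.
demo_invasive = [80.0, 50.0, 20.0, 3.0]
demo_insensitive = [0.0, 10.0, 40.0, 60.0]
print(negative_strength(pearson_r(demo_invasive, demo_insensitive)))
```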

Virulence factors analysis
To characterize the virulence determinants of invasive pneumococci, we screened both invasive and non-invasive strains for virulence genes. As shown in Figure 4, the choline-binding protein gene cbpA was present in only one strain. The virulence factor-encoding genes pitAB (iron uptake) and srtG1,2 (cognate sortase) were present only in the CC271 strains. The genes cpsA (capsule synthesis), hysA (hyaluronidase), lytABC (autolysin), nanAB (neuraminidase), pavA (fibronectin-binding protein), pce (phosphorylcholine esterase), ply (pneumolysin), and psaA (pneumococcal surface adhesin A) were present in both invasive and non-invasive strains with no clone specificity. The non-invasive strains carried slightly more cbpGD, pfbA (plasmin- and fibronectin-binding protein A), and pspAC (pneumococcal surface protein A) genes. Moreover, the genes rrgABC (pilus adhesin), srtBCD (sortase), and zmpC (zinc metalloproteinase) tended to be carried by invasive strains. Interestingly, zmpC and rrgC+srtBCD were mutually exclusive: zmpC was mainly carried by serotypes 4 and 8, whereas rrgC+srtBCD appeared more frequently in the 19F and 19A strains. Again, the analysis targeting only serotype 19F showed that rrgABC+srtBCD was clonally distributed in these isolates (Supplementary Figure 2).
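A mutual-exclusivity observation of this kind can be checked from a gene presence/absence table with a few lines of code. The sketch below is an assumption about how such a check might look, not the screening method used in the study; the strain names and gene sets are hypothetical, and rrgC+srtBCD is treated as present only when all four genes are found.

```python
def cooccurrence_table(strains, gene_set_a, gene_set_b):
    """Count strains carrying all of set A, all of set B, both, or neither."""
    table = {"both": 0, "a_only": 0, "b_only": 0, "neither": 0}
    for genes in strains.values():
        has_a = gene_set_a <= genes  # subset test: every gene of A present
        has_b = gene_set_b <= genes
        if has_a and has_b:
            table["both"] += 1
        elif has_a:
            table["a_only"] += 1
        elif has_b:
            table["b_only"] += 1
        else:
            table["neither"] += 1
    return table


def is_mutually_exclusive(table):
    """True when both sets occur somewhere but never in the same strain."""
    return table["both"] == 0 and table["a_only"] > 0 and table["b_only"] > 0


ZMPC = {"zmpC"}
PILUS_SORTASE = {"rrgC", "srtB", "srtC", "srtD"}

# Hypothetical presence/absence calls, for illustration only.
toy_strains = {
    "serotype4_isolate": {"zmpC", "ply", "psaA"},
    "serotype8_isolate": {"zmpC", "ply", "psaA"},
    "serotype19F_isolate": {"rrgC", "srtB", "srtC", "srtD", "ply", "psaA"},
    "serotype19A_isolate": {"rrgC", "srtB", "srtC", "srtD", "ply", "psaA"},
    "serotype11A_isolate": {"ply", "psaA"},
}

table = cooccurrence_table(toy_strains, ZMPC, PILUS_SORTASE)
print(table, is_mutually_exclusive(table))
```

With real data, a zero in the "both" cell is more convincing when the other cells are large, so the counts themselves are worth reporting alongside the yes/no answer.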

Discussion
The identification of prevalent invasive serotypes is crucial for effective prevention and management of IPD. The data presented in this study indicate that the most prevalent IPD serotypes in Zhejiang, China, are 23F, 14, and 6B. However, the CSF-isolated pneumococcal serotypes were 34, 8, and 15C, of which 34 and 15C were the most commonly identified non-vaccine serotypes. Owing to the low PCV coverage in China, the prevalence of IPD serotypes differs from that in developed countries (Pick et al., 2020; Suaya et al., 2020; Yanagihara et al., 2021; Wu et al., 2022). A recent multicenter study of 300 invasive S. pneumoniae isolates, predominantly from northern China, indicated that the most prevalent invasive serotypes are 23F, 19F, and 19A, and the top CSF-isolated serotypes are 23F, 19F, and 14 (Zhou et al., 2022a). A similar study from western China showed that the most prevalent pneumococcal serotypes are 19F, 19A, and 6B (Yan et al., 2021). The inconsistent results in the literature and in our study highlight the variation in serotype distribution across different regions of China. Therefore, it is crucial to conduct a national surveillance study on invasive S. pneumoniae to accurately assess this situation. The region-specific dominance of certain serotypes may call for different vaccine strategies. While PCV13 and PCV20 cover the majority of IPD serotypes in Zhejiang Province, non-vaccine serotypes 34 and 15C have caused highly virulent cases of meningitis that require continued vigilance.
types from a global pneumococcal database (Li et al., 2016). For this reason, according to the non-meningitis breakpoint, we reported a very high PNSP rate of over 30% for the invasive S. pneumoniae tested in the current study. A study conducted in China that collected 993 strains of S. pneumoniae up to 2017 showed that the non-susceptibility rate of IPD strains to penicillin reached 22.4%, which was significantly higher than that of the non-invasive strains (Yan et al., 2021). However, another pneumococcal epidemiological study collected 300 invasive pneumococci from 2010 to 2015 and reported a very low PNSP rate (4.3%) (Zhou et al., 2022b). Later, the same group performed a deeper analysis of PBPs using conventional PCR and showed that the PBP site substitutions in PNSP strains matched our findings of non-susceptible PBPs, which were distributed mostly in prevalent clones. Although the in silico predicted penicillin MIC of strains carrying non-susceptible PBPs was 4 µg/mL (Li et al., 2016), most of the isolates in their study had an MIC of 1-2 µg/mL. These inconsistent results might be due to differences in MIC value interpretation, sample collection periods, and regions. Macrolide and tetracycline resistance was maintained at high levels in both invasive and non-invasive pneumococci, which is mediated by the nationwide distribution of ermB and tetM in China (Zhou et al., 2022b), indicating the limited clinical value of such drugs in China. Notably, some serotypes had very high invasive rates, but none of their strains presented a penicillin-resistant phenotype; therefore, we performed further analysis to examine the relationship between invasiveness and penicillin resistance. Among all the collected strains of invasive serotypes, we found a negative correlation between invasive and penicillin non-susceptibility rates. To date, no relevant reports have been published regarding S. pneumoniae. However, in the gram-negative bacterium Klebsiella pneumoniae, hypervirulent strains have been reported to exhibit a relatively low ability to acquire antibiotic resistance plasmids and rarely exhibit virulence and resistance simultaneously (Tian et al., 2022). An ongoing study in our laboratory has shown that serotype 19F strains display a greater proclivity to maintain their β-lactam resistance phenotype than other strains. Among the top three invasive serotypes (4, 8, and 33B), none of the strains was found to be resistant to PEN in our study. Recent reports demonstrated the same penicillin-susceptible (PSSP) phenotype for 190 invasive serotype 4 strains and 90 invasive serotype 8 strains (Hansen et al., 2021; Kellner et al., 2021). No report of PEN resistance was found for pneumococcal serotype 33B strains. The mechanism behind this phenotype needs further investigation. The negative correlation observed between the invasiveness and non-susceptibility of S. pneumoniae may be attributed to the presence and retention of specific virulence factors and resistance determinants in the highly invasive and resistant serotype strains, respectively. A more comprehensive understanding of this newly discovered epidemiological phenomenon requires further investigation into its underlying mechanisms.

The phylogenetic tree and antibiotic resistance determinants of invasive and non-invasive serotype pneumococcal strains. (A) The phylogenetic tree of all tested pneumococcal isolates (n=129) was constructed in PopPUNK, where the three major clonal complexes (CCs) for invasive and non-invasive serotype isolates are shaded in light red and blue, respectively. The metadata, including specimen type, sequence type (ST), serotype, and antibiotic susceptibility test (AST) results, are aligned for all sequenced pneumococcal isolates, followed by the detected antibiotic resistance determinants (PBPs, tetM, ermB, and mefA). (B) Minimal inhibitory concentration (MIC) comparison between invasive and non-invasive serotype strains. "****" indicates a significant difference with a p-value less than 0.0001, "ns" indicates that no significant difference was detected, and the red dashed line marks the resistance MIC cut-off.

During the invasive infection process, various virulence factors participate at different stages. We did not find a difference in carriage between invasive and non-invasive pneumococcal serotype strains for several classical virulence factor-encoding genes, such as those for capsule synthesis, pneumococcal surface adhesin, autolysin, fibronectin-binding protein, and pneumolysin, which is similar to previous reports (Yan et al., 2021). However, rrgABC (pilus adhesin), srtBCD (sortase), and zmpC (zinc metalloproteinase) are mostly carried by invasive pneumococci, and the rrgABC+srtBCD locus is clonally distributed in CC271 strains. Two points regarding these results are intriguing. First, except for CC271, one of the pilus type 1 (P1) encoding genes, rrgC, would be independently expressed in the invasive strains together with the three P1-specific sortase-encoding genes, srtBCD. RrgC has been reported as a lectin targeting different host glycosylations; however, it is the least understood pilus protein in S. pneumoniae (Day et al., 2017). Our findings point to the importance of the sortases for pilus expression and of the small pilus protein RrgC for S. pneumoniae to cause invasive disease. Future pneumococcus-host interaction studies focusing on these virulence factors would be valuable. Second, rrgABC+srtBCD was initially identified in the highly invasive serotype 4 strain TIGR4 (Barocchi et al., 2006). However, none of our serotype 4 invasive isolates was found to carry P1-related genes; instead, they encoded a zinc metalloproteinase. zmpC and rrgC+srtBCD were mutually exclusive, and zmpC was mainly carried by serotypes 4 and 8, which are highly correlated with IPDs. A study conducted in Italy (Camilli et al., 2006) demonstrated that serotypes 8 and 11A carried zmpC, and another study from the Netherlands (Cremers et al., 2014) revealed that zmpC was predominantly associated with serotypes 8, 4, 33A/F, and 11A/D. In contrast, our laboratory isolates of serotype 11A were collected from patients without IPD, and no zmpC was detected in these strains. Pneumococcal ZmpC is involved in the breakdown of host tissues, whereas the pilus biogenesis proteins RrgC and SrtBCD play a role in the adhesion and colonization of the bacterium. There is no clear explanation for the mutual exclusion of these two different virulence factors; they may interfere with each other's expression or function, or they may be carried on different mobile genetic elements that are not transferable between different clones. Nevertheless, our findings suggest that both are important for the invasiveness of different invasive pneumococcal serotypes.
Research on serotype-specific virulence mechanisms of S. pneumoniae may be more informative than studying overall pneumococcal virulence.

FIGURE 3 The correlation between invasiveness and penicillin non-susceptibility of invasive serotypes. The proportion of invasive (A) and PEN non-susceptible (B) isolates in each invasive serotype (including all isolates in our strain bank, n=756). (C) Correlation analysis between invasiveness and PEN non-susceptibility across all invasive serotypes. A Pearson correlation coefficient (r)<-0.7 was considered a strong negative correlation, r<-0.5 a moderate negative correlation, and r<-0.3 a weak negative correlation. A two-tailed p<0.05 was considered statistically significant.

In conclusion, this study provides valuable insights into invasive S. pneumoniae serotypes and their management. A national surveillance study is necessary to understand the variation in invasive serotype distribution across different regions of China, which is crucial for PCV vaccination strategies. A high PNSP rate was related to the clonal distribution of non-susceptible PBP types, and we found a negative correlation between invasiveness and resistance in invasive pneumococcal strains. The mutually exclusive nature of zmpC and rrgC+srtBCD suggests their intricate and potentially redundant roles in promoting the development of IPD. Further mechanistic studies will contribute to the development of pneumococcal vaccines.
FIGURE 1 Serotype distribution and vaccine serotype coverage of pneumococcal isolates. All tested S. pneumoniae isolates were serotyped by Quellung reaction and in silico from whole-genome sequences using SeroBA. (A) The distribution of invasive pneumococcal vaccine serotypes in different age groups: 0-5 (blank), 6-65 (grey), and >65 (light grey); (B) The proportion of pneumococcal conjugate vaccine (PCV) 7, PCV13-add, PCV20-add, and non-vaccine serotype (NVT) invasive pneumococcal isolates; (C) Serotype distribution of NVT invasive pneumococcal isolates; (D) Serotype distribution of pneumococcal isolates belonging to serotypes that have never been detected in invasive cases (non-invasive serotypes).
FIGURE 4 The virulence factor analysis of invasive and non-invasive serotype pneumococcal strains. (A) The detected virulence factors are shown for each strain alongside the phylogenetic tree of invasive and non-invasive serotype strains. (B) Carriage proportion of virulence factors for invasive (black) and non-invasive (grey) serotype strains. Several virulence factors are clonally distributed (cbpA and pitAB) or carried by all strains (ply and psaA) and were excluded from the analysis. (C) Distribution and proportion of zmpC in the carrying serotypes 3, 4, 8, and 14. (D) Distribution and proportion of rrgC+srtBCD in the carrying serotypes 6A, 6B, 15C, 19A, 19F, and 34.

TABLE 1
Demographic and clinical characteristics of IPD patients.

TABLE 2
Invasive specimen type and related pneumococcal diseases.

TABLE 3
Antibiotic susceptibility of invasive S. pneumoniae.