Group B streptococcus virulence factors associated with different clinical syndromes: Asymptomatic carriage in pregnant women and early-onset disease in the newborn

Background Group B streptococcus (GBS) harbors many virulence factors, but there are limited data regarding their importance in colonization in pregnancy and early-onset disease (EOD) in the newborn. We hypothesized that colonization and EOD are associated with different distribution and expression of virulence factors. Methods We studied 36 GBS EOD isolates and 234 colonizing GBS isolates collected during routine screening. The presence and expression of virulence genes (pilus-like structures PI-1, PI-2a, and PI-2b; rib; and hvgA) were determined by PCR and qRT-PCR. Whole genome sequencing (WGS) and comparative genomic analyses were used to compare coding sequences (CDSs) of colonizing and EOD isolates. Results Serotype III (ST17) was significantly associated with EOD and serotype VI (ST1) with colonization. The hvgA and rib genes were more prevalent among EOD isolates (58.3 and 77.8%, respectively; p < 0.01). The pilus locus PI-2b was more prevalent among EOD isolates (61.1%, p < 0.01), while the pilus loci PI-2a and PI-1 were more prevalent among colonizing isolates (89.7 and 93.1% vs. 55.6 and 69.4%, p < 0.01). qRT-PCR analysis revealed that hvgA was barely expressed in colonizing isolates, even though the gene was detected. Expression of the rib gene and PI-2b was two-fold higher in EOD isolates compared to colonizing isolates. Transcription of PI-2a was three-fold higher in colonizing isolates compared to EOD isolates. ST17 isolates (associated with EOD) had a smaller genome than ST1 isolates, and the ST1 genome was more conserved relative to the reference strain than the ST17 genome. In a multivariate logistic regression analysis, serotype III was independently associated with EOD, whereas PI-1 and PI-2a were protective. Conclusion There was a significant difference in the distribution of hvgA, rib, and PI genes among EOD (serotype III/ST17) and colonizing (serotype VI/ST1) isolates, suggesting an association between invasive disease and these virulence factors. Further study is needed to understand the contribution of these genes to GBS virulence.


Introduction
Group B streptococcus (GBS), also known as Streptococcus agalactiae, is a beta-hemolytic, aerobic, Gram-positive, polysaccharide-encapsulated streptococcus. GBS isolates can be divided into ten distinct serotypes (Ia, Ib, and II to IX) based on a serological reaction directed against the polysaccharide capsule (Shabayek and Spellerberg, 2018; Filkins et al., 2021). Multilocus sequence typing (MLST) demonstrates that most human GBS isolates are clustered into six clonal complexes (CCs) (Melin, 2011; Russell et al., 2017; Shabayek and Spellerberg, 2018; Schindler et al., 2020; Filkins et al., 2021). GBS is a commensal bacterium that belongs to the human microbiota colonizing the gastrointestinal and genitourinary tract. GBS colonization in humans is usually not clinically significant, but GBS can cause severe diseases, in particular neonatal sepsis and meningitis (Dermer et al., 2004). Two GBS-associated syndromes are described in neonates: early-onset disease (EOD) and late-onset disease (LOD) (Dermer et al., 2004; Joubrel et al., 2015). EOD, which is mostly associated with bacteremia, occurs in the first week of life (0-6 days) and results from vertical transmission of GBS from the colonized mother to her newborn through contaminated amniotic or vaginal secretions during, or just before, delivery. The EOD rate varies between 0.3 and 0.6 infants per 1,000 live births (Dermer et al., 2004; Joubrel et al., 2015). The rates of GBS EOD and vaginal colonization vary between geographical areas, as do the predominant serotypes (Russell et al., 2017). The most recent report regarding global vaginal GBS colonization estimates a prevalence of 18%, with the lowest prevalence in Southern and Eastern Asia (11-13%) and the highest prevalence in the Caribbean (35%) (Russell et al., 2017; Shabayek and Spellerberg, 2018).
Serotypes Ib, II, and V are prevalent colonizers in the United States (US) and Europe (Gherardi et al., 2007; Bergal et al., 2015; Genovese et al., 2020), whereas serotypes VI and VIII are prevalent in Japan (Lachenauer et al., 1999; Kawaguchiya et al., 2022). ST17 isolates mostly belonging to serotype III have been described in some geographical regions as a hypervirulent clone responsible for a large proportion of EOD (Melin, 2011; Teatero et al., 2014; Filkins et al., 2021; McGee et al., 2021).
Since 2017, we have noticed an increase in GBS-associated neonatal sepsis and meningitis in Mayanei Hayeshua Medical Center (MHMC), which serves an Orthodox Jewish community in central Israel. The incidence of neonatal EOD increased significantly from 0.25/1,000 live births (three cases) in 2016 to 0.51/1,000 live births (seven cases) in 2017, significantly higher than the annual GBS rate of 0.27/1,000 live births reported nationally by the Israeli Ministry of Health. The GBS colonization rate in MHMC was significantly higher (26.1%) than the overall prevalence reported among pregnant Israeli women (21.6%) (data from the Israeli Ministry of Health). In a previous study, we found that the dominant colonizing serotype among asymptomatic pregnant women was serotype VI, while serotype III was the most prevalent among EOD cases. Furthermore, ST1 was associated with colonization, while ST17 was associated with EOD (Schindler et al., 2020). The GBS genome is approximately 2.2 Mbp long and has over 2,100 predicted coding regions. GBS may produce a wide variety of virulence factors, including adhesins that enable penetration of epithelial and endothelial cellular barriers, factors that inhibit immunological clearance, and toxins that directly injure or disrupt host tissue components (Supplementary Appendix Table 1; Rosini et al., 2006; Carreras-Abad et al., 2020). Although many GBS virulence factors are known, there is limited information about virulence factors in the context of pregnancy and birth. Here we chose to focus on recently studied virulence factors, namely pili structures, the Rib protein, and the hypervirulent adhesin HvgA, because they are considered important candidates in the development of vaccines against GBS (Carreras-Abad et al., 2020; McGee et al., 2021).
Pili promote colonization of epithelial cell surfaces, support biofilm formation, and facilitate translocation across the blood-brain barrier (Lauer et al., 2005; Dramsi et al., 2006; Rosini et al., 2006). The Rib protein belongs to a family of highly repetitive, high-molecular-weight cell wall proteins and is found predominantly in serotype III strains. Rib protein elicits protective immunity in humans and is already being tested in a phase I vaccine study (Fischer et al., 2021; Pulido-Colina et al., 2021). HvgA efficiently supports bacterial adhesion and transfer across the intestinal wall, and mediates transfer across the blood-brain barrier, specifically the vascular endothelium and the choroid plexus (Tazi et al., 2010). We hypothesized that different clinical syndromes, namely asymptomatic vaginal colonization during pregnancy and EOD in the newborn, are related to different distributions of virulence factors.

GBS isolates
The study included a total of 270 GBS isolates (Supplementary Appendix Figure 1 and Supplementary Appendix Tables 2, 3). In a previous study (Schindler et al., 2020), 240 colonizing isolates and 19 EOD isolates were studied. Of these, we randomly chose 126 colonizing isolates and included all 19 EOD isolates for further study. To enrich the sample, we also obtained 17 additional EOD isolates from MHMC and 108 additional colonizing isolates from Meuhedet Health Maintenance Organization (MHMO), the third largest health plan in Israel. Altogether, we studied 36 GBS isolates obtained from blood cultures of neonates with EOD during 2013-2019 and 234 colonizing GBS isolates collected during routine screening of asymptomatic pregnant women at 35-37 weeks of gestation. The study was approved by the Ethics Committee of MHMC (approval number 0023-18-MHMC).

Bacterial strains and growth conditions
Group B streptococcus isolates were obtained from clinical cultures and preserved in sterile Brain Heart Infusion broth with 15% glycerol (HY-labs, Israel) at −70 °C for long-term storage and at 4 °C for short-term maintenance.

DNA extraction
Genomic DNA from GBS isolates was extracted using the EZ1 Virus Mini kit (Qiagen) according to the manufacturer's instructions. Frozen GBS strains were revived and subcultured twice on trypticase soy agar plates with 5% sheep blood (TSA, HY-labs) at 37 °C with 5% CO2. One to three colonies were suspended in 1 ml of saline. Following centrifugation at 3,200×g, the bacteria were lysed in a 15 mg/ml lysozyme solution (Lysozyme, Geneaid). A 400 µL aliquot of lysed bacteria was transferred for DNA extraction. Extracted DNA was stored at −20 °C.

Serotyping
All GBS isolates were serotyped using TaqMan real-time PCR. Amplification was performed using specific primers for every serotype and a fluorescent probe labeled with 6-FAM (Breeding et al., 2016). Positive and negative controls were included in every run.

Virulence gene detection
The presence of genes encoding GBS surface proteins potentially associated with virulence, namely pilus islands (PI-1, PI-2a, PI-2b), Rib protein (rib), and hypervirulent GBS adhesin (hvgA), was evaluated among EOD isolates (n = 36) and colonizing GBS isolates from MHMC (n = 126) and MHMO (n = 108). The sets of primers used for detection of virulence genes are listed in Supplementary Appendix Table 3. The reaction mixture, in a final volume of 20 µL, contained 10 µL of TaqMan fast advanced master mix (Applied Biosystems), 0.2 µL of each primer (10 µM), 7.6 µL of nuclease-free PCR water, and 2 µL of DNA. Conditions for amplification were as follows: initial denaturation at 94 °C for 3 min, followed by 39 cycles of denaturation at 95 °C for 30 s, primer annealing at 55 °C for 30 s, and extension at 72 °C for 30 s. Amplifications were performed in a CFX96 thermocycler (Bio-Rad). Each run included a negative control consisting of the reaction mixture with nuclease-free water in place of DNA. A 10 µL aliquot of each PCR product was visualized by 2% agarose gel electrophoresis with SYBR Safe. The size of each PCR product was estimated using standard molecular size markers (100-bp ladder) (GeneDireX, Hylabs). Control GBS isolates were used as positive controls and were included in each PCR run.
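As a quick arithmetic check on the reaction mixture described above, the listed component volumes can be tallied and scaled to a master mix. The sketch below is illustrative only: it assumes one forward and one reverse primer per target, and the 10% pipetting overage and the function names are hypothetical choices, not values taken from the study.

```python
from decimal import Decimal

# Per-reaction component volumes in microliters, as listed in the text.
# Assumption: "each primer" means one forward and one reverse primer.
COMPONENTS_UL = {
    "TaqMan fast advanced master mix": Decimal("10"),
    "forward primer (10 uM)": Decimal("0.2"),
    "reverse primer (10 uM)": Decimal("0.2"),
    "nuclease-free water": Decimal("7.6"),
    "template DNA": Decimal("2"),
}


def reaction_volume(components):
    """Return the total volume of a single reaction in microliters."""
    return sum(components.values(), Decimal("0"))


def master_mix(components, n_reactions, overage=Decimal("0.10")):
    """Scale every component except template DNA for n reactions.

    The 10% overage is a hypothetical default used for illustration.
    """
    factor = Decimal(n_reactions) * (Decimal("1") + overage)
    return {
        name: volume * factor
        for name, volume in components.items()
        if name != "template DNA"
    }


total = reaction_volume(COMPONENTS_UL)
mix = master_mix(COMPONENTS_UL, n_reactions=10)

print(f"Per-reaction volume: {total} uL")
for name, volume in mix.items():
    print(f"{name}: {volume} uL")
```

Decimal arithmetic is used so that the tally is exact rather than subject to floating-point rounding; template DNA is left out of the master mix because it is added to each tube separately.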

Genome sequencing and bioinformatics analysis
A sample of GBS isolates was studied: 24 EOD isolates and 25 colonizing isolates (Supplementary Appendix Table 4). Genomic libraries of the GBS isolates were prepared using Nextera XT kits (Illumina, San Diego, CA, USA) and sequenced using the Illumina MiSeq Reagent Kit v3 (600-cycle). Bioinformatics analysis of GBS whole genome sequencing (WGS) data was performed using the bacterial bioinformatics database and analysis resource PATRIC (https://www.patricbrc.org/). The reads obtained for each sample were trimmed, the quality of the FASTQ reads was examined using the FASTQ Utilities Service, and the reads were finally assembled by SPAdes using the PATRIC website. A high-quality representative genome of S. agalactiae 2603 V/R ATCC BAA611 (serotype V, ST19) was used as the reference for genomic comparisons (Tettelin et al., 2002; Ondov et al., 2016). The "Phylogenetic Tree Building" service in the PATRIC website provided the Codon Tree method, which selects a single copy of the amino acid and nucleotide sequences from a defined number of PATRIC's global Protein Families (PGFams), picked randomly, to build an alignment, and then generates a tree based on the differences within those selected sequences. We performed the protein sequence-based genome comparisons using bidirectional BLASTP by the PATRIC server. This tool provides information about conserved genomic contexts and the presence of insertions or deletions. We compared the genomic coding sequences (CDSs) of ST17 strains and ST1 strains to the reference strain (Tettelin et al., 2002). The results of representative ST17 and ST1 strains are displayed with color-coding for protein percent identity relative to the best hit on the reference genome. We searched for virulence factor genes by an exhaustive bioinformatic screening of the Virulence Factors Database (VFDB) (Chen et al., 2005), available at the PATRIC website.

Distribution of virulence genes: surface adhesins (rib and hvgA) and pili structures (PI-1, PI-2a, PI-2b) among colonizing group B streptococcus (GBS) isolates from MHMC (n = 126) and MHMO (n = 108). Overall, virulence gene distribution was similar (p > 0.05), except for PI-2a, which was less prevalent in MHMC (p < 0.028).

FIGURE 1
The frequency of virulence genes encoding surface adhesins (hvgA and rib) and pilus islands (PI-1, PI-2a, PI-2b) among 270 group B streptococcus (GBS) isolates: early-onset disease (EOD) (n = 36) and colonizing isolates (n = 234).
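The idea behind a bidirectional BLASTP comparison can be illustrated with a generic reciprocal-best-hit sketch. This is not the PATRIC implementation, whose internals are not described here; the `Hit` record, the function names, and the toy protein identifiers are all hypothetical. The sketch assumes BLASTP hits have already been parsed into (query, subject, percent identity, bit score) records for both search directions.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    query: str       # protein ID in the query genome
    subject: str     # protein ID in the subject genome
    identity: float  # percent amino-acid identity
    bitscore: float  # BLAST bit score


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-scoring hit for every query protein."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


def reciprocal_best_hits(
    forward: Iterable[Hit], reverse: Iterable[Hit]
) -> Dict[Tuple[str, str], float]:
    """Return {(reference_protein, isolate_protein): identity} for pairs
    that are each other's best hit in both search directions."""
    fwd = best_hits(forward)   # reference proteins searched against isolate
    rev = best_hits(reverse)   # isolate proteins searched against reference
    pairs: Dict[Tuple[str, str], float] = {}
    for ref_id, hit in fwd.items():
        back = rev.get(hit.subject)
        if back is not None and back.subject == ref_id:
            pairs[(ref_id, hit.subject)] = hit.identity
    return pairs


def unmatched(reference_ids: Iterable[str], pairs) -> Set[str]:
    """Reference proteins without a reciprocal partner (candidate deletions)."""
    return set(reference_ids) - {ref for ref, _ in pairs}


# Toy data with hypothetical protein IDs.
forward_hits = [
    Hit("refA", "isoA", 99.9, 500.0),
    Hit("refA", "isoX", 60.0, 120.0),
    Hit("refB", "isoB", 98.0, 400.0),
    Hit("refC", "isoB", 45.0, 80.0),
]
reverse_hits = [
    Hit("isoA", "refA", 99.9, 500.0),
    Hit("isoB", "refB", 98.0, 400.0),
    Hit("isoX", "refA", 60.0, 120.0),
]

rbh = reciprocal_best_hits(forward_hits, reverse_hits)
missing = unmatched(["refA", "refB", "refC"], rbh)
```

In this toy example, `refC` has no reciprocal partner, which is the kind of signal that would be displayed as a gap (possible insertion or deletion) relative to the reference, while the identity values of the matched pairs correspond to the color-coded percent identity.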

RNA isolation, reverse transcription, and qRT-PCR
qRT-PCR analysis of bacterial gene expression was performed as described previously (Sullivan et al., 2017). Primers were designed using Primer3 Plus software and used at a final concentration of 10 µM (Supplementary Appendix Table 5). The GBS isolates used for this analysis are listed in Supplementary Appendix Table 4. RNA was extracted from GBS cultures grown at 37 °C to exponential growth phase in BHI medium. RNA was purified using the RNeasy Mini kit (Qiagen) according to the manufacturer's instructions. Purified RNA was treated with the DNAse kit (HY-labs, Israel) according to the manufacturer's instructions. cDNA was synthesized using the Hy-RT-PCR kit (HY-labs, Israel) according to the manufacturer's instructions. cDNA was diluted 1:150 to further reduce bacterial DNA contamination, and qPCR was performed using Hy-SYBR power mix (HY-labs, Israel) and the CFX96 Real-Time System (Bio-Rad). RNA from three independent biological replicates was analyzed, and the final cycle threshold (Ct) for each strain was calculated as the mean value of the three experiments. We compared the expression of the tested genes between separate groups (EOD vs. colonizing isolates and ST17 vs. ST1 strains). For this purpose, the final Ct of each group was determined by calculating the mean Ct value of all strains belonging to the group. Relative quantification of gene expression was performed using the comparative 2^−ΔΔCT method. The growth rates of colonizing and EOD isolates were tested in BHI medium, and similar growth rates were observed. Results were normalized using the rpoB gene as the housekeeping gene. The expression of each gene was then expressed relative to its expression in the reference strain ATCC BAA611.
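The comparative Ct calculation can be sketched as follows, using the standard definitions ΔCt = Ct(target) − Ct(rpoB), ΔΔCt = ΔCt(sample) − ΔCt(calibrator), and fold change = 2^−ΔΔCt. The sketch assumes roughly 100% amplification efficiency for both target and housekeeping gene, which the comparative method requires; all Ct values and function names below are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_ct: float, reference_ct: float) -> float:
    """Normalize the target gene Ct to the housekeeping gene (here rpoB)."""
    return target_ct - reference_ct


def fold_change(
    sample_target_ct: float,
    sample_ref_ct: float,
    calibrator_target_ct: float,
    calibrator_ref_ct: float,
) -> float:
    """Relative expression by the comparative 2^-ddCt method.

    Assumes ~100% PCR efficiency for both target and reference genes.
    """
    ddct = delta_ct(sample_target_ct, sample_ref_ct) - delta_ct(
        calibrator_target_ct, calibrator_ref_ct
    )
    return 2.0 ** (-ddct)


# Hypothetical Ct values from three biological replicates of one strain.
strain_target_ct = mean([22.0, 22.5, 21.5])   # target gene, e.g. a pilus gene
strain_rpob_ct = mean([18.0, 18.5, 17.5])     # housekeeping gene rpoB

# Hypothetical Ct values for the calibrator (reference strain, defined as 1).
calibrator_target_ct = 23.0
calibrator_rpob_ct = 18.0

relative_expression = fold_change(
    strain_target_ct, strain_rpob_ct, calibrator_target_ct, calibrator_rpob_ct
)
print(f"Relative expression vs. calibrator: {relative_expression:.2f}")
```

With these illustrative numbers the target Ct in the sample is one cycle lower than in the calibrator after normalization, which corresponds to a two-fold higher transcript level.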

Statistical analyses
Means and medians were computed for continuous variables. Numbers and percentages were used to describe categorical variables. Variables were compared using Student's t-test or the chi-square test as appropriate. For small samples, Fisher's exact test was used. Statistical significance was set at p < 0.05. The Mantel-Haenszel common odds ratio estimate was used to calculate odds ratios with 95% confidence intervals.
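A minimal sketch of a 2×2 comparison of this kind is given below. The counts are an assumption: they are back-calculated from the PI-2a percentages reported in the abstract (55.6% of 36 EOD isolates and 89.7% of 234 colonizing isolates) and are not taken from the study's tables. For a single unstratified table the Mantel-Haenszel estimate reduces to the crude odds ratio; the confidence interval here uses Woolf's logit method, which may differ from the interval produced by the authors' software, and the chi-square statistic is computed without continuity correction.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio with a Woolf (logit) confidence interval.

    a, b: gene present / absent among cases (EOD isolates)
    c, d: gene present / absent among controls (colonizing isolates)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (1 df, no continuity correction) and its p-value."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts for PI-2a, back-calculated from the reported percentages.
eod_present, eod_absent = 20, 16      # 20/36 = 55.6%
col_present, col_absent = 210, 24     # 210/234 = 89.7%

or_pi2a, ci_low, ci_high = odds_ratio_ci(
    eod_present, eod_absent, col_present, col_absent
)
chi2, p = chi_square_2x2(eod_present, eod_absent, col_present, col_absent)

print(f"OR = {or_pi2a:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

An odds ratio below 1 with an interval that excludes 1 indicates that the gene is less frequent among EOD isolates than among colonizing isolates.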

Data availability
The sequences are publicly available at the NCBI under BioProject ID PRJNA861829 (http://www.ncbi.nlm.nih.gov/bioproject/861829).

*Chi-square p-value < 0.01.

Association between ST type and presence of virulence genes was determined among 68 GBS isolates from MHMC. There was a significant association of hvgA and rib with ST17 (p < 0.0074 and p < 0.0001, respectively) and of PI-2a with ST1 (p < 0.008). Statistically significant values are highlighted in bold.

Genomic metadata of 49 GBS isolates from MHMC (24 EOD and 25 colonizing isolates). The data for each isolate are sorted by genome size (from largest to smallest) and include ST type, number of contigs, guanine-cytosine (GC) content, and number of coding sequences (CDS). COL, colonizing isolates.

FIGURE 2
Phylogenetic codon tree (for 1,000 shared genes) of the group B streptococcus (GBS) isolates, built together with the most closely related ATCC strain, Streptococcus agalactiae 2603 V/R (ATCC BAA611). The 49 isolates were clustered into five groups, which corresponded to their multilocus sequence typing (MLST) sequence types (STs). The first branch was mainly composed of ST17, ST19, and ST27 strains, belonging to early-onset disease (EOD) and colonizing isolates, which were genetically very close to each other. The second branch was composed of two main clusters that mostly contained colonizing GBS isolates corresponding to various STs (ST1, ST4, ST8, ST12, and ST130). The phylogenetic distribution of the two branches suggests that the clusters composing these branches belong to divergent lineages.

Genomic analysis of GBS isolates
We assessed genome sequences of 24 EOD isolates and 25 colonizing GBS isolates from MHMC. General features of the GBS genome sequences, including genome size, number of contigs, and guanine-cytosine (GC) content, are summarized in Table 5. We identified a correlation between the ST type and genome size. The genome of ST17 strains, which were mostly associated with EOD, was significantly smaller (2,001,326 bp) than the genome of ST1 strains causing asymptomatic colonization (2,100,416 bp) (p < 0.0001).
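The general assembly features mentioned above (genome size, number of contigs, and GC content) can be derived directly from an assembled FASTA file. The following is a minimal sketch, not the PATRIC pipeline; the function name and the toy sequences are hypothetical, and it assumes a plain multi-FASTA text in which each record is one contig.

```python
def assembly_stats(fasta_text: str) -> dict:
    """Compute genome size (bp), contig count, and GC percent from FASTA text."""
    contigs = []
    current = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous contig, if any sequence was read.
            if current:
                contigs.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        contigs.append("".join(current))

    size = sum(len(contig) for contig in contigs)
    gc = sum(contig.count("G") + contig.count("C") for contig in contigs)
    return {
        "genome_size_bp": size,
        "contigs": len(contigs),
        "gc_percent": 100.0 * gc / size if size else 0.0,
    }


# Toy two-contig assembly (hypothetical sequences).
toy_fasta = """>contig_1
ATGCGC
AT
>contig_2
ggcc
"""

stats = assembly_stats(toy_fasta)
print(stats)
```

Applied to real assemblies, the same three values per isolate would allow genome sizes to be grouped by ST and compared between groups.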
We constructed a phylogenetic tree of 49 isolates and a reference strain to assess the evolutionary relationships among colonizing and EOD GBS isolates. The tree demonstrated five clusters, which corresponded to their MLST STs (Figure 2). The first cluster was mainly composed of ST17, ST19, ST23, and ST27 strains, belonging to EOD and colonizing isolates. The second cluster included mainly colonizing isolates corresponding to various STs (ST1, ST4, ST8, ST12, and ST130). The phylogenetic distribution of the two clusters suggests that they belong to divergent lineages.

FIGURE 3
Comparison of nine representative ST17 and nine representative ST1 strains relative to the reference group B streptococcus (GBS) genome S. agalactiae 2603 V/R (ATCC BAA611). Colored vs. white regions highlight insertions/deletions. Colors indicate conservation relative to the reference genome (from blue, representing the highest protein sequence similarity, to red, representing the lowest). Each circle represents the genome of one GBS isolate.
To further study the genetic differences between EOD (ST17) and colonizing (ST1) GBS isolates, we performed protein sequence-based genome comparisons using bidirectional BLASTP by the PATRIC server. This tool provides information about conserved genomic contexts and the presence of insertions or deletions. The results for nine representative ST17 and nine representative ST1 strains are displayed in Figure 3 and demonstrate different clusters of conserved genes between the two ST types. There were significant differences in gene sequences between ST17 and ST1 strains. ST1 strains showed a conserved genome relative to the reference strain (>99.9% protein sequence identity), while ST17 strains had less similarity (≤98%). Four mutant gene islands (Q1-Q4) were characterized in both ST17 and ST1 strains, compared to the reference strain. In ST17 strains, the Q3 and Q4 regions contained both gene deletions and point mutations, while in ST1 strains these regions contained only point mutations. The sequences in these mutant gene islands mainly encoded phage-related proteins, hypothetical proteins with unknown function, and enzymes related to gene recombination. ST17 isolates also carried one mutation-enriched region (Y1) that contained point mutations with sequence identity below 50%. Most of the genes in this region were associated with metabolism and transport (Supplementary Appendix Table 6).

Virulence factors
To further characterize the presence and role of virulence genes in GBS isolates, we analyzed the 49 sequenced GBS isolates from MHMC against the VFDB database using the PATRIC website. All GBS isolates tested shared the following virulence factors, which contribute to adhesion (lmb, hylB), protease activity (scpB), biofilm formation (hasC), and toxin production (cfa, cyl) (Table 6). The main differences between colonizing and EOD isolates were found regarding the Rib surface protein (rib) and fibrinogen binding protein type B (fbsB). These proteins were widespread among EOD isolates and rarely identified among colonizing isolates. Pilus island 2a (PI-2a) was frequently identified among colonizing isolates (genes encoding the backbone pilin protein, PilA and PilC, were found in 80% of colonizing isolates). PI-1 had the same distribution across all the isolates regardless of clinical presentation. PI-2b was not detected by this approach.

Expression profiling of virulence genes in EOD and colonizing GBS isolates
We performed comparative experiments using quantitative PCR to measure the expression of the virulence genes in EOD (n = 8) and colonizing (n = 8) isolates. We found that the differences between EOD and colonizing isolates are reflected not only by the presence of virulence genes, but also by their expression (Figure 4). hvgA was barely expressed in colonizing ST17 strains, although the presence of this gene was confirmed by PCR. The expression of the rib gene was two-fold higher in EOD isolates compared to colonizing isolates. Transcription of PI-2a was about three-fold higher in colonizing isolates compared to EOD isolates. In contrast, the transcription of PI-2b was two-fold higher in EOD compared to colonizing isolates (Figure 5). This suggests a difference in the regulation of the pilus loci between GBS isolates responsible for invasive disease (EOD) and non-invasive (colonizing) isolates. We also compared the expression levels of PI-2a, PI-2b, and PI-1 genes in ST17 and ST1 strains. RT-qPCR analysis was performed with six ST17 and six ST1 GBS strains. Transcription of PI-2a was about two-fold higher in ST1 strains compared to ST17 strains, while the transcription of PI-2b was two-fold higher in ST17 compared to ST1 strains (Figure 6). In a multivariate logistic regression analysis, PI-1, PI-2a, and serotype III were independently associated with EOD status, with serotype III associated with EOD and PI-1 and PI-2a protective (Table 7). The r² of the model was 0.511.

Frontiers in Microbiology, frontiersin.org

Discussion
We studied aspects of GBS virulence related to molecular differences between GBS isolates derived from different clinical syndromes: vaginal colonization during pregnancy (non-invasive strains) and EOD in neonates (invasive strains). In general, ST17 GBS strains are overrepresented in EOD in the newborn and are considered hypervirulent, while ST1 strains are associated with asymptomatic carriage during pregnancy (Shabayek and Spellerberg, 2018; Bobadilla et al., 2021).
Available evidence on the distribution of virulence genes relates mainly to geographic area, and there is less information regarding the association with specific clinical syndromes (Bobadilla et al., 2021).
In our data, there were significant differences in the distribution of virulence genes between EOD/ST17 GBS isolates and ST1 GBS isolates from colonized pregnant women. Genes encoding surface adhesins (hvgA and rib) were more prevalent among EOD isolates than among colonizing isolates (58.3 and 77.8% of EOD isolates, respectively). These results are consistent with other studies (Brzychczy-Włoch et al., 2012; Campisi et al., 2016; McGee et al., 2021). HvgA is a hypervirulent adhesin suggested to promote meningeal tropism in neonates, whereas the Rib protein confers protective immunity. Both of these surface-anchored proteins act not only as bacterial adhesins, but can also mediate penetration of the intestinal and blood-brain barriers, thus allowing the migration of GBS into the circulatory and central nervous systems (Pereira et al., 2018; Fischer et al., 2021; Pulido-Colina et al., 2021). Our findings also demonstrated that the surface adhesins HvgA and Rib were associated mainly with serotype III, which is strongly associated with EOD. Previous data from Southeast Asian countries and Europe support this association (Lohrmann et al., 2020; Pulido-Colina et al., 2021).
Interestingly, the genes encoding the pilus locus PI-2b were more prevalent among EOD isolates, while the genes encoding the pilus loci PI-2a and PI-1 were more prevalent among colonizing isolates. These pilus loci mediate interaction with host cells and are involved in bacterial invasion and paracellular translocation, mediating resistance to phagocytic killing and virulence (Pezzicoli et al., 2008; Pietrocola et al., 2018). Previous studies using in vitro models of GBS infection have shown that PI-2a was important for biofilm formation (Mandlik et al., 2008; Rinaudo et al., 2010), while the PI-2b protein increased intracellular survival in macrophages (Pezzicoli et al., 2008; Motallebirad et al., 2021). Another study demonstrated that PI-1 and PI-2a islands were the most frequently detected surface proteins (88.2 and 82%, respectively) and were found in all tested serotypes (Motallebirad et al., 2021). Our findings regarding virulence gene distribution were not unique to isolates obtained from MHMC, as a similar distribution was found among isolates obtained from MHMO, representing the Israeli pregnant population.

The evidence on whether detecting virulence genes helps predict the invasiveness of strains is conflicting. Smith et al. (2007) and Eskandarian et al. (2015) could not demonstrate any correlation between the virulence genes and the clinical status of the patients from whom the isolates were obtained. In contrast, Manning et al. (2006) found that invasive strains were associated with specific serotype/gene combinations, but the association was only marginally significant. It is possible that the differences in pathogenicity are not directly related to the presence of virulence genes, but to differences in their expression. RT-qPCR analysis revealed that hvgA was barely expressed in colonizing isolates, even though the gene was detected. The expression of the rib gene was two-fold higher in EOD isolates compared to colonizing isolates. Transcription of PI-2a was about three-fold higher in colonizing isolates compared to EOD isolates. In contrast, the transcription of PI-2b was two-fold higher in EOD compared to colonizing isolates. These differences in the levels of transcription suggest a difference in the regulation of pilus loci expression in different GBS isolates according to GBS type and clinical illness: invasive (EOD isolates) or colonizing isolates.

To our knowledge, this is the first report describing that the genome of ST17 strains is smaller than the ST1 genome. This may be due to evolutionary processes, such as genome size reduction (Didelot et al., 2016; Gatt and Margalit, 2021; Grote and Earl, 2022).

FIGURE 4
The expression of the rib and hvgA genes in early-onset disease (EOD) and colonizing isolates from Mayanei Hayeshua Medical Center (MHMC) was analyzed by qRT-PCR. Amounts of each transcript were normalized to the rpoB gene and expressed relative to this gene in reference strain ATCC BAA611. The expression level of the ATCC BAA611 strain was defined as 1. Columns represent the relative mRNA expression level of each gene from each strain. Values are represented as means ± SD (n = 8) from three independent qRT-PCR experiments. Error bars show SD. Asterisk indicates a significant difference (*p < 0.05 by t-test).

FIGURE 5
The expression of pili island genes in early-onset disease (EOD) and colonizing group B streptococcus (GBS) isolates from Mayanei Hayeshua Medical Center (MHMC) was analyzed by qRT-PCR. Amounts of each transcript were normalized to the rpoB gene and expressed relative to this gene in reference strain ATCC BAA611. The expression level of the ATCC BAA611 strain was defined as 1. Columns represent the relative mRNA expression level of each gene from each strain. Values are represented as means ± SD (n = 8) from three independent qRT-PCR experiments. Error bars show SD. Asterisk indicates a significant difference (*p < 0.05 by t-test).

FIGURE 6
The expression of pili island genes in ST17 and ST1 group B streptococcus (GBS) strains from Mayanei Hayeshua Medical Center (MHMC) was analyzed by qRT-PCR. Amounts of each transcript were normalized to the rpoB gene and expressed relative to this gene in reference strain ATCC BAA611. The expression level of the ATCC BAA611 strain was defined as 1. Columns represent the relative mRNA expression level of each gene from each strain. Values are represented as means ± SD (n = 8) from three independent qRT-PCR experiments. Error bars show SD. Asterisk indicates a significant difference (*p < 0.05 by t-test).
Bacterial genome size is mainly determined by gains or losses of genes (Gressmann et al., 2005). Bacteria acquire new genes through duplication of genes or horizontal gene transfer (Moran, 2002; Furuta et al., 2011). Genes may be deleted through mutations or recombination events (Furuta et al., 2011). Being small comes with some advantages, such as needing fewer resources or having more opportunities to hide or escape from predators. Genome reduction occurs when gene losses prevail over gene gains (Björkholm et al., 2001; Moran, 2002; Gressmann et al., 2005). The ST17 lineage is derived from a bovine GBS ancestor (Sørensen et al., 2019). Theoretically, it is possible that, due to neutral drift among genes that are no longer needed, ST17 strains became more restricted to humans. The same process was identified in Paratyphi A and Typhi, human-restricted serovars of Salmonella enterica (McClelland et al., 2004). Recently, Murray et al. (2021) published a systematic study of the claim that pathogenicity of various bacteria is associated with genome reduction and gene loss. However, there are no data on a correlation between genome reduction and GBS pathogenesis.
The phylogenetic analysis of GBS isolates revealed two main clusters: cluster 1, which included the EOD-associated STs (ST17, ST23, ST19, and ST27), and cluster 2, which included the colonization-associated STs (ST1, ST4, ST8, and ST12). This means that colonizing (ST1) and EOD (ST17) isolates belong to different evolutionary branches with distinct evolutionary characteristics. The phylogenetic distribution of the two branches supports the MLST findings that the clusters composing these branches are not close to each other and belong to divergent lineages. Additionally, we performed a genomic sequence alignment between ST1 and ST17 strains. Our results identified significant differences in gene sequences between them. The genome of GBS ST1 strains was more conserved than the genome of ST17 strains. Mutant gene islands were identified in both genomes, but only ST17 strains carried an additional mutation-enriched region, which mainly encoded metabolism- and transport-related genes. Previous studies have found that genome recombination was a major driver of GBS genetic diversity (Brochet et al., 2006), which may explain the hypervirulence of ST17 clones and their ability to colonize and invade different host tissues.
Vaccination is one of the strategies most likely to be implemented to prevent GBS infections. For designing GBS vaccines, knowledge of the prevalence of immunization targets, such as capsular serotypes, and of alternative targets, such as virulence genes, is essential (Bobadilla et al., 2021). In our recent study, we reported that serotype VI (ST1) was the dominant strain among colonizing GBS isolates, whereas serotype III (ST17) strains were significantly associated with EOD. These data could be of interest in the perspective of a future vaccine (Russell et al., 2017).
Our study has several limitations. We performed gene expression tests in a laboratory setting. Although we did our best to standardize the tests, gene expression in vitro may differ from expression in vivo because of differences in environmental conditions. The sample selection process may also have introduced biases into the interpretation of the results.
To conclude, this study provides information on the distribution of virulence genes among colonizing and invasive GBS strains in Israel. We demonstrated significant differences in the distribution of hvgA, rib, and PI-2b genes among EOD (serotype III/ST17) and colonizing (serotype VI/ST1) isolates, suggesting an association between virulence and clinical syndromes. Additional studies are needed to investigate the role of these virulence factors in immune evasion, host-cell interactions, and environmental persistence mechanisms such as biofilm formation. These data can help to define better interventional programs, such as vaccine development and preventive therapeutic targets against GBS.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: BioProject ID PRJNA861829.

Ethics statement
The studies involving human participants were reviewed and approved by the Mayanei Hayeshua Medical Center (approval number: 0023-18-MHMC). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions
YS participated in developing the idea for this work, performed the genetic analyses, and wrote the Introduction, Methods, and Results of the manuscript. IN, GP, BA, RR, and GV performed some of the molecular work and edited the manuscript. GR and DT-M developed the idea, reviewed and edited the manuscript, and contributed to the funding. YM developed the idea, wrote the Discussion, and reviewed and edited the manuscript. All authors contributed to the article and approved the submitted version.

Funding
This study was funded by internal funds of the Laboratory of Microbiology, Mayanei Hayeshua Medical Center, Bnei Brak, Israel and the Infectious Disease Unit, Sheba Medical Center.