Molecular Characteristics of Streptococcus pyogenes Isolated From Chinese Children With Different Diseases

Streptococcus pyogenes is a bacterial pathogen that causes a wide spectrum of clinical diseases exclusively in humans. The distribution of emm type, antibiotic resistance and virulence gene expression for S. pyogenes varies temporally and geographically, resulting in distinct disease spectra. In this study, we analyzed antibiotic resistance and resistance gene expression patterns among S. pyogenes isolates from pediatric patients in China and investigated the relationship between virulence gene expression, emm type, and disease categories. Forty-two representative emm1.0 and emm12.0 strains (n = 20 and n = 22, respectively) isolated from patients with scarlet fever or obstructive sleep apnea-hypopnea syndrome were subjected to whole-genome sequencing and phylogenetic analysis. These strains were further analyzed for susceptibility to vancomycin. We found a high rate and degree of resistance to macrolides and tetracycline in these strains, which mainly expressed ermB and tetM. The disease category correlated with emm type but not superantigens. The distribution of vanuG and virulence genes were associated with emm type. Previously reported important prophages, such as φHKU16.vir, φHKU488.vir, Φ5005.1, Φ5005.2, and Φ5005.3 encoding streptococcal toxin, and integrative conjugative elements (ICEs) such as ICE-emm12 and ICE-HKU397 encoding macrolide and tetracycline resistance were found present amongst emm1 or emm12 clones from Shenzhen, China.


INTRODUCTION
Streptococcus pyogenes (group A Streptococcus, GAS) is an important gram-positive bacteria that ranks among the 10 main causes of death from infectious diseases worldwide, with more than 517,000 deaths annually (Carapetis et al., 2005). GAS causes a wide spectrum of clinical diseases ranging from mild pharyngitis to life-threatening invasive infections (Carapetis et al., 2005;Walker et al., 2014). Although antibiotics are effective and widely used for treating GAS infections, antibiotic resistance, especially to macrolides, is increasing in several countries (Lu et al., 2017;Bhardwaj et al., 2018). The rise of antibiotic resistance leads to an increase in mortality, which has become a public health issue of global concern (Wajima et al., 2014;Silva-Costa et al., 2015). This issue has also received close attention in China (You et al., 2018(You et al., , 2020. Genotyping is an effective method for monitoring bacterial strains in microbiology research (Steer et al., 2009), and sequence analysis of the emm gene is currently used for GAS genotyping (Facklam et al., 1999;Gherardi et al., 2018). In recent years, emm cluster analysis has also been widely used in GAS molecular epidemiology analysis. The typespecific M protein, encoded by the emm gene, and superantigens (SAgs), encoded by the sAg genes, are important virulence factors in S. pyogenes (Golińska et al., 2016). Currently, stains of S. pyogenes are often tested for the expression of speA, speC, speG, speH, speI, speJ, speK, speL, speM, ssa, smez, and the enzymeencoding speB and speF genes as genes encoding SAgs, even though speB and speF have been confirmed to encode cysteine protease and streptococcal DNase proteins (Strus et al., 2017). However, the correlations between sAg distribution, emm type, and disease spectrum for GAS have not yet been established (Rantala et al., 2012;Imöhl et al., 2017;González-Abad and Alonso Sanz, 2020). In this study, the antimicrobial sensitivity of GAS strains as well as the relationships among sAg distribution, emm types, and disease categories were analyzed. Moreover, 42 representative strains of the two main epidemic emm types were analyzed for population structure, genetic diversity, phylogeny, and susceptibility to vancomycin.

Bacterial Strains and Antimicrobial Susceptibility Testing
A total of 342 GAS strains were isolated from children aged < 18 years who were admitted to Shenzhen Children's Hospital from 2016 to 2018 for treatment of one of 15 diseases. Of these strains, 87 were isolated in 2016 (4 invasive and 83 non-invasive strains), 138 were isolated in 2017 (13 invasive and 125 non-invasive strains), and 117 were isolated in 2018 (15 invasive and 102 non-invasive strains). The strains were isolated from 262 throat swabs, 47 sputum samples, 21 pus samples (abscess, surgical wound infections, and skin burn infections), 5 wound secretions, 3 vulvar secretions, 3 blood samples, and 1 urine sample. The streptococcus grouping kit (Oxoid Limited) was used to identify these strains, as previously reported (Liang et al., 2012).
We tested the susceptibility of the 342 strains to antimicrobial agents including penicillin, azithromycin, erythromycin, clarithromycin, clindamycin, tetracycline, levofloxacin, and chloramphenicol (Oxoid Limited). Susceptibility to vancomycin was analyzed in 42 representative strains. Minimum inhibitory concentration (MIC) values were determined according to the guidelines of the Clinical and Laboratory Standards Institute (CLSI) (2019) by using the broth dilution method. Quality control was performed using Streptococcus pneumoniae ATCC 49619, which was provided by the Clinical Test Center of the Ministry of Health and maintained by the Microbiology Laboratory of Shenzhen Children's Hospital.

Emm Sequence Typing
The emm sequence types were determined using a previously reported protocol 1 . Amplicons were sequenced by Guangzhou BGI Genomics Co. Ltd. Emm type was determined based on the sequence identity (> 95%) of the first 180 bp of the emm gene between the tested sequence and the reference emm gene.

Whole-Genome Sequencing and Phylogenetic Analysis
We selected 42 strains that represented the major emm types for WGS according to the following criteria: (i) the major emm types (emm1.0, emm12.0), (ii) the two most common diseases [scarlet fever and obstructive sleep apnea syndrome (OSAS)], and (iii) 3 strains each year from 2016 to 2018. For each subtype of emm1 and emm12, we choose one strain from each year, and then chose strains from different seasons in each year. Forty-two strains isolated from children with OSAS or scarlet fever (20 emm1.0 strains and 22 emm12.0 strains) were used for the WGS analysis. Genomic DNA was sequenced using the DNBSEQ platform (BGI-Shenzhen, China). DNA quality was assessed by electrophoresis, and then the DNA was fragmented and processed by end repair, A-tailing, adapter ligation, DNA size selection, circulation, and DNA nanoball formation according to an in-house library. DNA libraries with an insert size of 300 bp were sequenced using paired-end 100-bp reads (PE100). Lowquality sequences were trimmed using SOAPnuke (Chen et al., 2018). The remaining short reads were assembled into contigs using SPAdes version 3.11.1 (Lagesen et al., 2007).
Genes were predicted using Glimmer 3.02 (Delcher et al., 2007). Virulence genes were identified by searching against the Virulence Factor Database (Liu et al., 2019), Antibiotic Resistance Genes Database (Liu and Pop, 2009), and Comprehensive Antibiotic Resistance Database (Alcock et al., 2020). The core genes of the sequenced and reference genomes were identified by clustering proteins with a sequence identity > 50% and a coverage > 70% using CD-HIT . Single nucleotide polymorphisms (SNPs) among the core genes were aligned pairwise and subjected to phylogenetic tree inference using TreeBeST and the NJ method (TreeSoft, 2021).

Statement of Ethics
This study was approved by the research ethics committee of the Shenzhen Children's Hospital. Informed consent was obtained from patients or their guardians before sample collection.

Statistical Analysis
Data were analyzed using SPSS version 22.0. Differences in the distributions of diseases and emm types and comparisons between diseases and a specific emm type were analyzed using the independent-samples Kruskal-Wallis test. The exact Mann-Whitney U-test was applied to identify differences in distributions between streptococcal sAg expression, emm types, and disease categories. Two-sided P-values < 0.05 indicated a statistically significant difference between groups. Values of P adj < 0.05/m indicated that a within group difference was statistically significant, where m represents the total number of Bonferroni corrections within the group.

Antimicrobial Susceptibility Patterns and Resistance Genes
All strains were highly susceptible to penicillin and chloramphenicol, whereas 98.5% of the tested strains were sensitive to levofloxacin. The results regarding the rates and degree of resistance to azithromycin, erythromycin, clarithromycin, clindamycin, and tetracycline are shown in Table 1. In addition, 90.9 and 6.1% of strains expressed ermB and ermA, respectively. Additionally, 16.4% were mefA-positive, and 85.4% were tetM-positive. All 42 representative strains were susceptible to vancomycin.

Distribution of emm Genotypes and Their Correlation With Disease Categories
In this survey, 10 emm types, including 7 subtypes, were identified among the 342 GAS strains from 15 different diseases. Figures 1, 2 show the details of the 10 emm types, as well as the 7 subtypes, for the different disease categories in strains collected between 2016 and 2018. The distribution of disease categories differs significantly between emm12.0 and emm2.0 stains ( Table 2).

Correlations Between Superantigen Expression Profiles, Disease Categories, and emm Types
The positivity rates for speA, speB, speC, speF, speG, speH, speI, speJ, speK, speL, speM, ssa, and smeZ differed among the strains, and the distributions of speA, speH, speI, and speJ expression were related to the emm types ( Table 3). The distributions of sAgs in different disease categories did not differ significantly ( Table 4).
Among the 342 strains, 79.5% expressed six or more sAgs. The five major gene profiles (A-E) associated with emm types were identified according to the sAg combinations (Figure 3).

DISCUSSION
GAS is a gram-positive bacterial pathogen that causes a wide range of clinical diseases, and antibiotics are effective agents for treating GAS infections. Although no GAS strains resistant to β-lactam antibiotics have been found, strains that are highly resistant to macrolides, lincosamides and streptomycin B have been identified globally since 1990 (Gherardi et al., 2015). All strains analyzed in this study were susceptible to penicillin and chloramphenicol, and most were susceptible to levofloxacin. Therefore, penicillin can remain the first-line drug for treating GAS infections in pediatric patients in China. The use of chloramphenicol and levofloxacin in children is limited due to the side effects of these antibiotics on the hematopoietic system and bone and joint development, respectively. Due to the high rate and degree of resistance to macrolides observed for GAS, these antibiotics cannot be used to effectively treat children who are allergic to lactam antibiotics. The use of tetracycline is also limited by a high rate and level of resistance among GAS as well as the side effects of tetracycline on bone and teeth. In the present study, 98.5% of the isolated GAS strains were sensitive to levofloxacin, and this finding is consistent with the rate observed among GAS strains isolated from children in Shanghai, China, but significantly different from the rate observed among GAS strains isolated from Chinese adults (Shen et al., 2018). The reason for this difference may be related to the medication habits of Chinese people. Levofloxacin is not widely prescribed to children due to its side effects on bone and joint development.    The PCR and WGS results in the present study indicate that macrolide-resistant and tetracycline-resistant strains mainly expressed ermB and tetM, respectively, which is consistent with the findings of previous studies (Feng et al., 2010;Liang et al., 2012;Tsai et al., 2020). Unexpectedly, expression of lmrP, a broad-spectrum drug efflux gene, and expression of pmrA, which potentially confers resistance to fluoroquinolone through drug efflux, were identified in all 42 strains by WGS. No penicillinresistant strains or mutations in pbp2x were found. Nonetheless, a relationship exists between pbp2x gene variations and MIC values (Hayes et al., 2020;Musser et al., 2020;Vannice et al., 2020). Our WGS and MIC results showed that the vanB+ vanrG+ vanuG-and vanB+ vanrG+ vanuG+ strains were susceptible to vancomycin. These results underscore the need to better understand the relationship between antibiotic resistance phenotypes and resistance genes. ICEs have been proposed to play major roles in the selection and expansion of emm12 scarletfever outbreak lineages in Hong Kong and mainland China where antimicrobial usage patterns are elevated, highlighting their importance for S. pyogenes population structure (Davies et al., 2015;Shen et al., 2018;You et al., 2018;Jespersen et al., 2020). However, the role of ICEs in driving the global population structure of S. pyogenes has not been fully explored (Jespersen et al., 2020). The majority of characterized S. pyogenes exotoxins are carried by prophages, i.e., bacteriophages integrated in the bacterial chromosome. These SAg toxins, termed streptococcal pyrogenic exotoxins (spe), streptococcal mitogenic exotoxin Z (smeZ) and streptococcal superantigen (ssa), are amongst the most potent known activators of T cells (Blake et al., 2019). A previous study evaluated time-dependent changes in the emm type prevalence of S. pyogenes (Meisal et al., 2010). In the  : speA-, speB+, speC+, speF+, speG+, speH+, speI+, speJ-, speL-, speM-, ssa+, smeZ+, speK-. Profile B: speA+, speB+, speC+, speF+, speG+, speH-, speI-, speJ+, speL-, speM-, ssa+, smeZ+, speK-. Profile C: speA-, speB+, speC+, speF+, speG+, speH-, speI-, speJ-, speL-, speM-, ssa+, smeZ+, speK-. Profile D: speA-, speB+, speC+, speF+, speG-, speH+, speI+, speJ-, speL-, speM-, ssa+, smeZ+, speK-. Profile E: speA+, speB+, speC+, speF+, speG-, speH-, speI-, speJ+, speL-, speM-, ssa+, smeZ+, speK-. present study, 10 different emm types, including 7 subtypes, were identified. The most prevalent types and subtypes were emm12.0 and emm12, followed by the emm1.0 and emm1 (Figure 1). The prevalence of these two emm types is known to change over time (Ma et al., 2009;Liang et al., 2012;Li et al., 2020;You et al., 2020;Yu et al., 2020).
Previous studies confirmed the relationship between diseases caused by GAS infection and emm types (González-Abad and Alonso Sanz, 2020), although some other studies did not support this relationship. In the present study, the distribution of 15 diseases among the emm12.0 and emm2.0 types differed significantly. The emm2.0 strains were isolated from patients with 15 different diseases, whereas the emm12.0 strains were isolated mainly from patients with three diseases. These results indicate that the emm types of GAS strains infecting children in Shenzhen from 2016 to 2018 were associated with specific diseases.
S. pyogenes can cause infection by crossing human mucosal membranes and skin barriers. Prophage exotoxins enhance colonization fitness in epidemic scarlet fever-causing S. pyogenes (Brouwer and Barnett, 2020). The M protein and SAgs play a crucial role in the pathogenesis of S. pyogenes infections, and a close relationship exists between emm type and sAg distribution (Imöhl et al., 2017). The results of this study revealed differences in the frequency of sAg expression among emm types. Indeed, five sAg profiles were identified among GAS strains carrying six or more sAgs. Several sAg profiles were observed for each emm type, but only one or two genes were predominant for each type. Profiles B and E were most common among emm1.0 FIGURE 4 | Phylogenetic relationship of selective emm1 and emm12 strains using whole genome core-gene SNPs (A,C) and their antimicrobial drug resistance gene and virulence factor variation profiles (b, d). References indicate S. pyogenes strain MGAS5005 (Genbank accession CP000017.2) and S. pyogenes strain TJ11-001 (CP028148.1), respectively (B,C). The S. pyogenes strain Bra048, assembly accession GCA_900984775, is also included in the phylogeny of emm1 for reference. Strains containing 2 copies (emm1) and 3 copies (emm12), rather than the majority 4 copies, formed clusters and are indicated in bold font. The numbers in cells (B,D) indicate gene copy numbers. and its subtypes, whereas profiles A, C, and D were most common among emm12.0 and its subtypes. Thus, sAgs and their combinations are closely related to emm types. In this study, only a few strains were isolated for some diseases. We believe this could be related to the pathogenicity of S. pyogenes or to a lack of attention by researchers. In the past 30 years, most investigators have focused only on the relationship between S. pyogenes and postpartum infections, while the relationship with neonatal infections and vaginitis in girls has not received sufficient attention. Recent research on S. pyogenes shows that the pathogenicity of S. pyogenes in the vulva and vaginal mucosa of girls has become a concern (Donders et al., 2021;Hu et al., 2021). Although the amount of relevant information in this study is limited, the results provide insight into the relationship between emm type and vulvitis and vaginitis. More comprehensive research is needed.
The pathogenicity of GAS is related to the various virulence factors it produces. Pathogenic GAS has evolved to generate a large number of virulence factors, which promote its adhesion to host cells and invasion of deep tissues, ultimately leading to disease. SpeB can promote the spread of bacteria and their products in host tissues by degrading the tissue structure and can also degrade proteins and antimicrobial peptide LL-37 in order to resist immunity (Walker et al., 2014). SpeF is the main cause of pulmonary vascular permeability, which is sufficient to cause acute respiratory distress syndrome under the conditions of toxic shock-like syndrome caused by S. pyogenes (Matsumoto et al., 1999). Among all virulence factors, SAg is of particular concern. SAgs are proteins synthesized by ribosomes that have a relatively low molecular mass (∼22-28 kDa) and contain classic signaling peptides that are cleaved after secretion to release the mature toxin. SAgs function by activating T cells and are among the most powerful T-cell activators identified to date. At present, at least 14 genetically distinct SAgs have been characterized, and many of them are encoded within lysogenic bacteriophage or putative bacteriophage elements (Blake et al., 2019). Therefore, different strains encode different repertoires typically consisting of 3-6 distinct SAgs (Blake et al., 2019). Some studies reported a strong correlation between SAgs and disease categories (Smoot et al., 2002;Türk Dagı et al., 2018), and additional research showed that the distribution of sAgs correlates with disease categories and differs considerably among emm types (Murakami et al., 2002;Gergova et al., 2019). However, in the present study, we found that the sAg distributions were closely correlated to emm types but not to disease categories. The emm12.0 and especially emm2.0 strains were significantly associated with disease categories. Therefore, we speculate that variation in the distribution of sAgs is mainly due to the emm type, rather than the disease category.
In the present study, we generated the genome sequences of 42 S. pyogenes emm1 and emm12 strains. Streptococcal toxin-encoding prophage ϕHKU16.vir, ϕHKU488.vir, 5005.1, 5005.2, and 5005.3 in addition to the macrolide-and tetracycline-resistant ICE-emm12 and ICE-HKU397 elements were found amongst the Shenzhen strains. These results confirm our previous findings (You et al., 2018) of ϕHKU.vir, ICE-emm12 and ICE-HKU397 elements amongst multi-clonal emm12 strains of mainland China. Sequencing of more strains from China in the future will be important to determine the evolutionary pathway and population structure of the predominant emm1 and emm12 S. pyogenes strains.
The virulence factor streptococcal DNase sda1 was previously shown to interfere with the entrapment of bacteria through neutrophil extracellular traps and Toll-like receptor 9 (TLR9) signaling. This factor impairs plasmacytoid dendritic cell recruitment by reducing interferon (IFN)-1 levels at the site of infection and destabilizes DNA via the host protein HMGB1 (high mobility group box 1), which may decrease IFN-1 levels at the site of infection (1 levels at the site of infection (Uchiyama et al., 2012;Keller et al., 2019). Our WGS data showed that sda was specifically expressed by emm12.0 strains. In both emm12.0 and emm1.0 strains, the expression of sda showed no correlation with disease. hylP increases the virulence of GAS via the digestion of hyaluronic acid capsules (Singh et al., 2014). Our WGS data showed that the hylP expression in both emm12.0 and emm1.0 strains was specifically correlated with emm genotype but not with disease. Two copies of hylP were mainly distributed in the emm1.0 strains, and three copies of hylP were mainly distributed in the emm12.0 strains (Figure 4). Our study showed that variations in hylP might play roles in epidemic cloning expansion, but this needs to be further investigated.
In the present study, emm12.0 and its subtypes were present in 58.8% (201/342) of strains, and emm1.0 and its subtypes were present in 30.7% (105/342) of strains, which suggests that emm1.0 and emm12.0 were widespread and causing diseases in Chinese children from 2016 to 2018. The population structure and genetic diversity of these emm types were characterized by sequence typing and WGS (Supplementary Figure 1). The results showed that emm1.0 and emm12.0 each contained a stable clone, suggesting they had genetically diverged without recombination. However, the emm12.0 strains showed higher genetic diversity than the emm1.0 strains, suggesting that longer circulation of the former led to several robust clades with > 700 bootstrap replicates (Supplementary Figure 1). These observations are consistent with their epidemiological history. Horizontal transmission of virulence genes can occur between emm types. This should be considered in future GAS surveillance studies.

CONCLUSION
In conclusion, antimicrobial agents commonly used to treat GAS infections are highly active against clinical strains. However, increasing macrolide resistance warrants analysis of the epidemiological characteristics of these strains. The molecular epidemiology of GAS in China has shown many differences from earlier reports, based on the present study along with the above-mentioned reports evaluating the molecular profiles of GAS strains collected in different time periods and geographical regions. In the future, multicenter studies including various diseases are necessary to assess whether our findings are affected by temporal and geographical changes.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/bioproject/, PRJNA743366.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by this study was approved by the Research Ethics Committee of the Shenzhen Children's Hospital. Written informed consent to participate in this study was provided by the participants' legal guardian/nex of kin.

AUTHOR CONTRIBUTIONS
YZ and YY contributed to conception, design, and administrative support. QM, RZ, and YC provided study materials and patients. DY, YL, QL, WW, LH, and YB contributed to the collection and assembly of data, data analysis, and interpretation. DY and YL contributed to the manuscript writing. All authors contributed the final approval of manuscript.