Prevalent emm Types among Invasive GAS in Europe and North America since Year 2000

Background
Streptococcus pyogenes, or group A streptococcus (GAS), is an important human pathogen responsible for a broad range of infections, from uncomplicated to more severe and invasive diseases with high mortality and morbidity. Epidemiological surveillance has been crucial to detect changes in the geographical and temporal variation of the disease pattern; for this purpose, M protein gene (emm) typing is the most widely used genotyping method, with more than 200 emm types recognized. Molecular epidemiological data have also been used for the development of GAS M protein-based vaccines.
Methods
The aim of this paper was to provide an updated picture of the most prevalent GAS emm types responsible for invasive infections in developed regions, namely Europe and North America (US and Canada), from 1st January 2000 to 31st May 2017. The search, performed in PubMed by the combined use of the terms (“emm”) and (“invasive”), retrieved 264 articles, of which 38 (31 from Europe and 7 from North America) met the inclusion criteria and were selected for this study. Five additional papers cited in the European articles but not retrieved by the search were included.
Results
emm1 was the dominant type in both Europe and North America and was replaced by other emm types on only a few occasions. The seven major emm types identified (emm1, emm28, emm89, emm3, emm12, emm4, and emm6) accounted for approximately 50–70% of the total isolates; less common emm types accounted for the remaining 30–50% of the cases. Most of the common emm types are included in the 26-valent vaccine, the 30-valent vaccine, or both, though some well-represented emm types found in Europe are not.
Conclusion
This study provides a picture of the prevalent emm types among invasive GAS (iGAS) in Europe and North America from the year 2000 onward. Continuous surveillance of the emm-type distribution among iGAS infections is strongly encouraged, also to determine the potential coverage of the multivalent vaccines under development.

INTRODUCTION
Streptococcus pyogenes, or group A streptococcus (GAS), is a Gram-positive human bacterial species that can exist as a commensal or can be responsible for a broad range of infections, from uncomplicated throat and skin infections to more severe and invasive diseases, such as bacteremia, soft tissue infections, necrotizing fasciitis, and septic shock; on a global scale it is therefore an important cause of morbidity and mortality (1). In addition, symptomatic infections can trigger alterations of the immune system leading to post-streptococcal autoimmune sequelae, such as acute rheumatic fever (ARF), chronic rheumatic heart disease (RHD), and post-streptococcal glomerulonephritis. ARF/RHD is uncommon today in most high-income countries, but it remains the major cause of acquired heart disease in adolescents and young adults in the developing world, responsible for at least 350,000 premature deaths per year (2). Due to the size and clinical severity of the GAS disease burden, epidemiological surveillance has been crucial to detect changes in the disease pattern in various populations. Typing of collected bacterial isolates has been an important part of this surveillance, and several different GAS typing methods have been described (3,4). Classical serotyping methods based on the different forms of the M, T, and OF surface antigens were replaced in the late 1990s by sequence typing of the N-terminal part of the M protein (emm) gene (4-6), which is now the most widely used genotyping method for GAS. So far, more than 200 emm types have been identified (http://www.cdc.gov/streplab/MProteinGene-typing.html), indicating that the M protein is the most polymorphic bacterial protein. Large epidemiological studies on pharyngitis and invasive disease have been performed worldwide using emm sequence typing (3,7,8).
DNA molecular typing techniques that consider multiple genome markers, such as pulsed-field gel electrophoresis (9) and multilocus sequence typing (10), have also been used for GAS genotyping. These methodologies have proved to be of particular importance in defining the clonal structure of particular GAS populations (11). Recently, the emm-cluster typing system, which groups most emm types into 48 different functional emm clusters on the basis of their structural properties, and multiple-locus variable-number tandem repeat analysis have been proposed as promising additional GAS typing tools (12,13). Recent advances in whole-genome sequencing (WGS) technologies, with reduced costs and turnaround times, along with the development of bioinformatics tools able to manage the large amount of generated data, have made this technology accessible to reference microbiology (14). WGS coupled with appropriate bioinformatic pipelines has recently proved to be a reliable tool for the assignment of emm types and subtypes from genomic data (15-17).
Available molecular epidemiological data have been used for the design of a GAS vaccine. Although no GAS vaccine is licensed, several vaccine candidates have been considered; they can be divided into M protein-based and non-M protein-based vaccines. The M protein-based vaccines under clinical trial investigation include the 26-valent and 30-valent multivalent formulations as well as the conserved M protein vaccines (18-20); the non-M protein-based candidate vaccines, at various stages of development, include either cell wall components or several secreted virulence factors (21,22).
The present review aimed to provide a picture of the emm-type distribution among invasive GAS (iGAS) strains in high-income regions of the Western world, namely Europe and North America, as retrieved from the literature since the year 2000. This study had two main objectives: first, to update the picture of the most prevalent emm types causing invasive infections for epidemiological purposes and, second, to outline the possible coverage of the M protein-based vaccine formulations under development.

METHODOLOGY
Search Strategy and Study Selection Criteria
We used a systematic approach to search for studies that described the epidemiology of invasive GAS based on emm typing. Searches were done in PubMed Medline for papers published from 1st January 2000 to 31st May 2017 by the combined use of the search terms "emm" and "invasive." The search was planned in order to include only GAS isolates responsible for invasive infections according to the case definition (23). Exclusion criteria were studies that involved other beta-hemolytic streptococci, surveys that were not population-based, outbreak and case-report studies, and surveillance studies focused on specific emm types or limited to the analysis of antibiotic-resistant strains. Reports considering small numbers of strains (approximately 40 GAS strains or fewer) were also not considered, except for papers that were the only representative of a given European country. Moreover, all studies that were part of major cited studies and reviews were excluded from the analysis. Studies for which clear data were not available were not considered in this review; these included studies with uncertainty about the time period of strain collection, the origin of strains, the emm-type distribution, or the geographical origin, and studies involving strains collected across 2000 but starting earlier in the 1990s for which emm types after 2000 were not precisely provided. Finally, the search was restricted to papers in the English language. Overall, 264 articles were retrieved. A total of 38 articles, 31 from European and 7 from North American studies, were selected by the criteria described above. Five additional articles, all European studies, which were cited in the references although not retrieved by the search criteria, were also included.
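The search described above could be reproduced programmatically. The following is only a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint is used; the review does not state how the PubMed query was submitted, and the function name `build_search_url` and the `retmax` value are illustrative assumptions. The block only builds the request URL and tallies the article counts reported in the text; it does not contact PubMed.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed tool; the review only says "PubMed").
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(terms, start, end, retmax=500):
    """Build an esearch URL combining all terms with AND over a publication-date window.

    `build_search_url` and `retmax=500` are hypothetical choices for illustration.
    """
    query = " AND ".join(f'"{t}"' for t in terms)
    params = {
        "db": "pubmed",
        "term": query,
        "datetype": "pdat",   # filter on publication date
        "mindate": start,     # YYYY/MM/DD
        "maxdate": end,
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Search terms and date window as stated in the text.
url = build_search_url(["emm", "invasive"], "2000/01/01", "2017/05/31")
print(url)

# Article counts reported in the text.
retrieved = 264          # articles returned by the search
selected = 31 + 7        # European + North American articles meeting the criteria
added_from_refs = 5      # European articles found only through reference lists
included = selected + added_from_refs
print(f"{retrieved} retrieved, {included} included")
```

Rerunning such a query today would not necessarily return the same 264 records, since PubMed indexing changes over time.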

Data Extraction and Analysis
A database was created to record the country, the period (years) of isolation, the geographic area involved (local, regional, or nationwide survey), the age groups of patients affected by iGAS, the overall number of genotyped iGAS, and the emm-type distribution. Only one isolate per patient was included, and the relative frequency of each emm type was determined or extrapolated from each study. All the information obtained was incorporated into an Excel file and summarized in Table S1 in Supplementary Material.
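The frequency calculation described above can be sketched as follows. This is an illustrative example only: the function name `emm_frequencies`, the record layout, and the six sample records are invented for the sketch and are not data from any of the reviewed studies. It keeps one isolate per patient and then expresses each emm type as a percentage of the genotyped isolates.

```python
from collections import Counter


def emm_frequencies(isolates):
    """Return the relative frequency (%) of each emm type, one isolate per patient.

    `isolates` is a list of (patient_id, emm_type) pairs; the first isolate
    recorded for each patient is kept and later ones are ignored.
    """
    first_per_patient = {}
    for patient_id, emm_type in isolates:
        first_per_patient.setdefault(patient_id, emm_type)

    counts = Counter(first_per_patient.values())
    total = sum(counts.values())
    # Most common types first, percentages rounded to one decimal place.
    return {emm: round(100 * n / total, 1) for emm, n in counts.most_common()}


# Hypothetical records: patient p1 contributes two isolates, only one is counted.
records = [
    ("p1", "emm1"), ("p1", "emm1"),
    ("p2", "emm28"),
    ("p3", "emm1"),
    ("p4", "emm89"),
    ("p5", "emm1"),
]
freqs = emm_frequencies(records)
for emm, pct in freqs.items():
    print(f"{emm}: {pct}%")
```

Deduplicating before counting matters because repeated isolates from a single patient would otherwise inflate the share of that patient's emm type.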

RESULTS
Europe
All countries located in the European continent were considered. Therefore, besides the 28 European Union (EU) countries, Iceland, Liechtenstein, Norway, and Switzerland were included in the search.
No data on iGAS emm types were retrieved from 15 countries (Austria, Belgium, Bulgaria, Croatia, Cyprus, Estonia, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Slovakia, Slovenia, Liechtenstein, and Switzerland), while for the other 17 countries at least one study on iGAS emm-type distribution could be retrieved and taken into consideration. The most prevalent emm types encountered in each European country are summarized in Figure 1 and in Table S1 in Supplementary Material.

Northern Countries
Denmark
Three studies were available for Denmark, covering the period 2001-2011. Overall, the major emm types were emm1, 28, 89, and 3, with the latter commonly found only from 2003 onward. The most recent study described a nationwide laboratory-based surveillance between 2005 and 2011 and analyzed 910 iGAS isolates (24). A total of 58 different emm types were identified, with emm types 1 (26%), 28 (17%), 3 (13%), 89 (11%), and 12 (9%) representing 76% of all isolates. emm1 predominated in most years, except in 2010, when emm28 was the most prevalent, and in 2011, when emm1 predominated along with emm3 (24).
Another report was a nationwide prospective surveillance study performed between January 2001 and August 2002 on GAS collected from patients with invasive infections, patients with noninvasive infections, and asymptomatic carriers. Among the 200 invasive isolates, a total of 27 different emm types were found, of which emm1, emm28, emm12, emm6, and emm4 accounted for 69% of the isolates (26). emm1 and emm28 alone constituted 32 and 20% of the isolates, respectively, with emm1 increasing significantly during 2002 (26).

Finland
Four studies met the criteria used in this review. These encompassed the period 2000-2013, and the major emm types were emm28, 89, and 1. The most recent paper was a nationwide study conducted during the period 2008-2013 on 1,112 iGAS (27). A total of 72 different emm types were identified, of which emm28 (26%), emm89 (17%), and emm1 (12%) were the most common (27). An increase in erythromycin and clindamycin resistance was observed at that time, paralleling the emergence of the novel clone emm33, detected for the first time in 2012 (27). This study revealed that from 2008 to 2013 emm89 increased and emm1 decreased, while emm84, which was common in 2008, was not detected in 2013 (27).
Another nationwide survey was performed to assess the population-based incidence and outcomes of pediatric iGAS infections (28). During the period of 1996-2010, a total of 151 children with iGAS infection were identified (28). Overall, 60 isolates were genotyped and emm1 strains were the most prevalent in the years 2000-2010, showing a typical fluctuation by herd immunity, followed by emm28, emm12, and emm89 (28).

Norway
Two nationwide and three regional surveillance studies on iGAS in Norway have been reported since 2000. The studies comprised iGAS collected over the period 2000-2014 and the major emm types identified were emm1, 28, 3, and 89.

England and Wales
Two studies from the United Kingdom were included in this survey: one on a large collection of iGAS isolated in 2014, the other on a small collection of strains recovered in the early 2000s. The larger study was conducted after an increase in scarlet fever episodes observed in England in 2014 and involved 252 iGAS. emm3 was the most prevalent emm type, representing 28.2% of all iGAS, followed by emm1 (22.6%), emm89 (9.1%), and emm12 and emm28 (7.1% each) (38).
The other report was a small surveillance study performed on blood GAS isolates obtained from injecting drug users compared to isolates from non-drug users at the Royal Sussex Hospital in Brighton over the period 2000-2003 (39). Overall, 44 iGAS were available for the study, and the most frequent emm types were emm83 (25%), emm82 (18.2%), emm1 (15.9%), emm89 (13.6%), and emm87 (11.4%) (39). Differences in emm-type distribution were noted between injecting drug users and non-drug users, with emm82 and emm83 found almost exclusively in the injecting drug user population (39).

Scotland
Only one national study from Scotland was retrieved; it reported the distribution of GAS emm types from invasive and noninvasive sites in patients of all age groups over a 4-year period (2011-2015). This study revealed that among 329 iGAS strains recovered from sterile sites, emm1 (66% of all iGAS) was the most prevalent and was strongly associated with iGAS infections (40). The other most common emm types identified in invasive infections were, in decreasing order, emm76 (7.4%), emm89 (6.7%), and emm3 (5.8%).

Ireland
The only study from Ireland that met the chosen criteria documented an increased incidence of iGAS in the years 2012-2013; emm1 dominated over the entire study period (41). In 2012, 176 iGAS were genotyped, and the predominant types were emm1 (48.6%), emm12 (9.2%), and emm28 (7.3%). In 2013, a further increase in iGAS infections was noted; it was associated with a notable rise in emm3 isolates in the first half of 2013, which reached 22% from the 4% observed in 2012 (41). The relative percentages of emm types found over the entire study period were not clearly indicated.

Eastern Countries
Czech Republic
The only study on iGAS was an epidemiological survey of 215 iGAS isolated during the years 2001-2005 and obtained from 34 hospitals throughout the country (42). The emergence of the uncommon, macrolide-resistant GAS emm53 type was reported, with the highest proportion observed during 2003 (42). The most prevalent emm type was emm1, followed by emm81, emm28, emm53, emm3, and emm66, although the relative frequency of each emm type was not reported (42).

Hungary, Poland, and Romania
One national surveillance study from each country was retrieved according to the criteria used; all three included only a limited number of iGAS isolates.
The nationwide laboratory-based surveillance study from Hungary described the molecular characterization of 26 iGAS isolates obtained in 2004 and 2005. The most prevalent emm types encountered in this study were emm1 (50%) and emm80 (19.2%) (43).
The study from Poland was a national laboratory-based survey involving 17 hospitals distributed in different parts of the country (44). The major limitation of this study for our analysis was the lack of separate data on the emm-type distribution from 2000 onward, but because it was the only molecular study on iGAS from this country, it was included. Overall, 41 iGAS clinical isolates were obtained between 1997 and 2005, with a total of 23 different emm types identified, of which emm1 and emm12 (19.5% each) were the most frequent, followed by emm81 (7.3%) (44).

Western Countries
France
Several studies on the epidemiology and emm-type distribution of iGAS isolates recovered from 2000 to 2013 met the criteria used in this review. Overall, the most prevalent emm types were emm1, 28, and 89.
Four other studies on French iGAS strains isolated in the years 2006-2011 found emm1 to be the predominant type. Two of these studies were nationwide surveillances of a large number of strains recovered from patients of all age groups in France (48,49); one assessed the epidemiology of 623 iGAS strains isolated in 2007, and the other involved more than 2,600 iGAS collected in the period 2007-2011. The most prevalent emm types were very similar between the two studies, with slight differences: emm1 predominated, followed by emm28 or emm89, emm4, and emm12 (48,49). The other two French studies were a report on 1,542 iGAS recovered from adults in the years 2006 and 2010 (50) and a report on 125 iGAS strains isolated from children in the period 2009-2011 (51). The first study identified emm1 as the most common type, accounting for 24% of the isolates, followed by emm28 (17%), emm89 (15%), and emm4, emm3, and emm12 (5% each); the second revealed that emm1 predominated (24.8%), followed by emm12 and emm28 (15.2% each), emm6 (12%), and emm3 (9.6%).

Southern Countries
Greece
Two national surveillance reports have been selected, covering the period 2003-2007, and they both indicated emm1 and emm12 as the most prevalent types. A multicenter laboratory-based surveillance study was conducted between the years 2003 and 2007; among the 138 iGAS available for genotyping the two most prevalent emm types were emm1 (28.2%), mainly isolated in adults, and emm12 (8.5%) (54). The other study was a multicenter surveillance involving a total of 101 isolates obtained between the years 2003 and 2005 and the most common emm types were emm1 (26.5%), emm12 (8.9%), emm4, emm6, and emm95 (5% each) (55).

Portugal
Two major consecutive surveillance studies involving iGAS have been performed in the period 2000-2009. emm1, emm3, emm6, and emm89 represented the four most common types, with differences in their order of frequency.

Spain
Only one study met the criteria and it was a regional survey on invasive strains recovered from two regions in Spain between the years 1998 and 2009. Among the 215 isolates, emm1, associated with speA and ssa genes, largely predominated (27.9%) and was responsible for the majority of fatal outcomes, followed by emm3 (9.8%), emm4 (6.5%), and emm28, emm12, and emm89 (6% each) (58).

North America
Four studies from the US (including Alaska) and three from Canada on the emm-type distribution of iGAS were considered in this review. The emm-type distributions and relative frequencies reported in North America are depicted in Figure 2 and in Table S2 in Supplementary Material.

United States
Three major studies were retrieved on iGAS collected by the Centers for Disease Control and Prevention's Active Bacterial Core surveillance, a population-based network that includes geographically different US states. Altogether, these three studies encompassed a period of 13 years, from 2000 to 2012. emm1 was the predominant type in all studies, and four other emm types, namely emm3, emm12, emm28, and emm89, were among the most common types, with some fluctuations over time.

Canada
Overall, three studies on the molecular epidemiology of iGAS infections were found, encompassing a 9-year period (between 2006 and 2014), with large geographical differences in the emm-type distribution.
Another regional surveillance study of iGAS in the province of Ontario was started because of an outbreak of iGAS infections due to emm59 strains that occurred in Thunder Bay District, North-Western Ontario, in 2008. All iGAS obtained in Thunder Bay District from 2011 to 2013 were studied and compared to iGAS strains recovered during the same period from the metropolitan area of Toronto/Peel and the province of Ontario (63). Most iGAS cases in Thunder Bay District were caused by strains belonging to skin or generalist emm types, while those from the province of Ontario and the Toronto metropolitan area were caused by emm types frequently associated with invasive GAS infections (63). In particular, among the 66 iGAS obtained from Thunder Bay District, the six most prevalent emm types were emm87 (12.3%), emm82 (10.8%), emm1, emm101, and emm83 (9.2% each), and emm114 (7.7%). By contrast, in the metropolitan area of Toronto/Peel and the rest of the province of Ontario, emm1 predominated (23.5%), followed by emm89 (12.7%), emm3 (11.3%), emm12 (8.2%), and emm28 (5.7%) (63).
The third study was a nationwide surveillance conducted from January 2006 to December 2009 and included 4,143 iGAS obtained from 10 provinces and 3 territories in Canada (64). Among all iGAS cases, 539 (13%) were attributed to emm59, mainly circulating in the province of Ontario. emm1, emm28, emm3, emm89, and emm12 were other well-represented types, although the relative percentages of each emm type were not provided (64).

DISCUSSION AND CONCLUSION
In this review, we provided a picture of the most prevalent emm types among iGAS reported in Europe and North America from the year 2000 to May 2017 (last updated on June 1st 2017). It is interesting to note that the major emm types were almost the same in all European countries and in the US. emm1 was the dominant type in nearly all countries, and in only a few cases was it replaced by other common emm types, such as emm28, reported in Denmark, Finland, and Norway; emm89 and emm77, reported in Sweden and Finland, respectively; and emm3 in the United Kingdom. Other common emm types were emm28, emm89, emm3, emm12, emm4, and emm6, with differences in their prevalence among countries. Some "uncommon" emm types were sporadically encountered as prevalent types, such as emm84 and emm119 in Finland; emm11 in Norway; emm75 in Finland, Norway, and Spain; emm81 in Finland, Sweden, Czech Republic, Hungary, Poland, and Romania; emm82 in Norway and Alaska; emm66 in Norway and Czech Republic; emm53 in Finland and Czech Republic; emm18 in Italy; emm64 in Portugal; emm49 and emm108 in Alaska; and emm59 in Canada. Interestingly, emm81 was commonly found in all Eastern European countries, suggesting the likely spread of this type between neighboring countries.
Overall, the major emm types (emm1, emm28, emm89, emm3, emm12, emm4, and emm6) accounted for only 50-70% of the total isolates. It is therefore of some concern that several other, less represented emm types nevertheless accounted for the remaining 30-50% of the cases. Some of these minor emm types could emerge and become among the most prevalent ones in the future.
In North America, emm1 was the most prevalent type in most national studies from the US, followed by emm12, emm28, and emm3, but in remote regional communities in Canada other, uncommon emm types predominated, such as emm11, emm87, emm101, emm114, and emm118.
It is important to be aware of the increase in iGAS infections and associated mortality observed in recent years in several European countries (40,65-67), which reinforces the need for a European multinational surveillance network that could compensate for the scattered information available on the iGAS disease burden (68).
Data on the emm-type distribution from population-based GAS surveillance have also been used for the development of multivalent M protein-based GAS vaccine candidates. The 26-valent and 30-valent M protein-based vaccines were developed to maximize the "coverage" of circulating emm types (19). Epidemiologic surveys suggest that the 26-valent vaccine would provide good coverage of circulating GAS strains in industrialized countries (over 72%) but poor coverage in many developing countries due to differences in emm distribution (3). Similarly, the 30-valent vaccine has limited coverage in many developing countries where GAS infections are endemic, but this limitation is likely mitigated by the demonstration that the 30-valent vaccine induces protection not only against the emm types contained in the vaccine but also, through cross-reactivity, against some non-vaccine emm types (69).
It is of some concern that some emm types found to be well-represented in some European studies are not included in the vaccine formulations, as is the case for emm types 53, 64, 66, 84, 108, and 119; on the other hand, sporadic emm types such as emm49, 81, and 82, identified in the European studies, and emm types 87, 101, and 118, isolated in regional Canadian surveys, are included in one or the other of the two vaccines (19,70).
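The notion of vaccine "coverage" used here is simply the share of isolates whose emm type is contained in a formulation. The sketch below illustrates the arithmetic under stated assumptions: the distribution is taken from the Danish 2005-2011 study summarized in the Results (24), whereas `PLACEHOLDER_VACCINE_TYPES` is a hypothetical set chosen for illustration and is not the actual composition of the 26-valent or 30-valent vaccine. Because the remaining 24% of isolates are not broken down by type, the result is only a lower bound for that placeholder set.

```python
def vaccine_coverage(distribution, vaccine_types):
    """Sum the percentage of isolates whose emm type is in the vaccine formulation."""
    return sum(share for emm, share in distribution.items() if emm in vaccine_types)


# Percentages reported for the Danish nationwide study, 2005-2011 (24).
denmark_2005_2011 = {
    "emm1": 26,
    "emm28": 17,
    "emm3": 13,
    "emm89": 11,
    "emm12": 9,
    "other": 24,  # remaining emm types, not itemized in the text
}

# HYPOTHETICAL formulation for illustration only; NOT the real 26- or 30-valent list.
PLACEHOLDER_VACCINE_TYPES = {"emm1", "emm3", "emm12", "emm28", "emm89"}

coverage = vaccine_coverage(denmark_2005_2011, PLACEHOLDER_VACCINE_TYPES)
print(f"Lower-bound coverage for the placeholder formulation: {coverage}%")
```

A real estimate would substitute the published vaccine composition and a fully itemized emm-type distribution, and it would not capture any cross-protection against non-vaccine types.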
Fluctuations in emm-type distribution have been attributed to multiple factors, such as different frequencies of the most prevalent circulating clones, influence of the herd immunity, different clinical manifestations of iGAS infections associated with specific emm types, age, racial, and seasonal differences (25,28,32). Recently, seasonal, geographic, and temporal trends of specific emm clusters associated with iGAS infections have been found, due to a probable different capacity for transmission or infection (71).
The factors contributing to the fluctuations and/or success of specific epidemic clones in invasive disease have always been of priority interest for prevention, control, and the study of GAS population dynamics. The predominance of particular emm types in invasive disease could be a consequence of their high prevalence in the entire population, as demonstrated for emm1, which suggests that specific clones are more successful as well-adapted human colonizers (72). In the last few years, WGS has proved to be a useful way to investigate the evolutionary history of these highly successful GAS epidemic or pandemic clones belonging to specific emm types. Specific virulence factors, such as streptococcal pyrogenic exotoxin (spe) A and its alleles, streptokinase, streptolysin O (slo), and NAD glycohydrolase (nga), have been associated with the shaping and spreading of successful strains. For example, the emergence of the emm1 epidemic clone has been associated with three consecutive gene transfer events, namely the acquisition of the DNase sdaD2, the expression of speA2, and the upregulation of the virulence factors slo and nga (73). Similarly, for emm89, the emergence of a third clonal lineage (clade 3 clone), through modification of the nga/slo locus and loss of the capacity to synthesize the capsule, is the cause of an ongoing epidemic of invasive infections in Europe and North America (74). Another recent study from England reported an increase in iGAS infections associated with emm3 isolates. Analysis of the whole-genome single-nucleotide polymorphism-based phylogeny indicated that the main factor responsible for this upsurge was the expansion of a genetic lineage characterized by the presence of a prophage carrying the speC exotoxin and spd1 DNase genes and by the loss of two other prophages considered typical of emm3 strains (75).
Notwithstanding the importance of the loss or acquisition of genes by horizontal transfer (mostly mediated by bacteriophages or pathogenicity islands) as a fundamental evolutionary force shaping the structure of GAS genomes, it is also evident that there is no strict correlation between defined patterns of virulence genes associated with exogenous elements and invasiveness or disease severity. The main contribution to the generation and expansion of emm-type specific invasive clones seems to be more related to mutations in some important genetic regulatory circuits governing the global expression of GAS virulence. Among them are the two-component system CovR/S (76), the regulator of protease B ropB (77), the regulator of Cov RocA (78), the Fas system (79), and the RofA-like protein IV regulator RivR (80).
In this context, it is clear that emm type (i.e., M serotype) continues to be the GAS epidemiological reference marker for tracking the clonal radiation of epidemic lineages from a common genetic background. Following recent advances in WGS technologies, with reduced costs and turnaround times, WGS has been proposed as the method of choice to type bacterial strains, GAS included. The design of bioinformatics components to emulate current methods, however, is still laboratory-restricted, and the standardization of the different steps still needs improvement to ensure backward compatibility with the classical "gold standard" typing methods and schemes (15,17). The potential of WGS data analysis is huge, as already demonstrated for GAS, where temporal and geographic relatedness between isolates could be deduced (16). These new "omics" approaches may also provide rapid assessment of outbreaks by discriminating between closely related isolates, and produce extensive data on the virulome, including the associated regulatory mechanisms, and on the antibiotic resistome (81,82). Hopefully, this review can be a reference for all those who work in the field of molecular epidemiology of GAS and may also be useful to those who are approaching the problem using recent next-generation sequencing methodologies.
In any case, continuous surveillance of the emm-type distribution of iGAS is strongly encouraged to monitor the prevalent emm types responsible for the disease burden and the potential coverage of the multivalent vaccines under development.
AUTHOR CONTRIBUTIONS
GG and RC conceived the project. GG and RC searched the database for potentially eligible articles, extracted the data, and performed the analyses. GG, LV, and RC interpreted the results and wrote the manuscript. All the authors reviewed the final version of the manuscript prior to submission for publication.