The Role of Whole Genome Sequencing in the Surveillance of Antimicrobial Resistant Enterococcus spp.: A Scoping Review

Enterococcus spp. have arisen as important nosocomial pathogens and are ubiquitous in the gastrointestinal tracts of animals and the environment. They carry many intrinsic and acquired antimicrobial resistance genes. Because of this, surveillance of Enterococcus spp. has become important with whole genome sequencing emerging as the preferred method for the characterization of enterococci. A scoping review was designed to determine how the use of whole genome sequencing in the surveillance of Enterococcus spp. adds to our knowledge of antimicrobial resistance in Enterococcus spp. Scoping review design was guided by the PRISMA extension and checklist and JBI Reviewer's Guide for scoping reviews. A total of 72 articles were included in the review. Of the 72 articles included, 48.6% did not state an association with a surveillance program and 87.5% of articles identified Enterococcus faecium. The majority of articles included isolates from human clinical or screening samples. Significant findings from the articles included novel sequence types, the increasing prevalence of vancomycin-resistant enterococci in hospitals, and the importance of surveillance or screening for enterococci. The ability of enterococci to adapt and persist within a wide range of environments was also a key finding. These studies emphasize the importance of ongoing surveillance of enterococci from a One Health perspective. More studies are needed to compare the whole genome sequences of human enterococcal isolates to those from food animals, food products, the environment, and companion animals.


INTRODUCTION
A variety of Enterococcus spp. are commensals within the gastrointestinal tract (GIT) of humans and animals, while others exist within the broader environment; some enterococcal species have also emerged as important human pathogens, especially in nosocomial infections (1). Two enterococcal species, Enterococcus faecium and Enterococcus faecalis, are most commonly implicated in human disease (2). These species, in particular, can acquire antimicrobial resistance (AMR) and harbor virulence genes that give them advantages as opportunistic pathogens (3). Their acquisition of antimicrobial resistance genes (ARGs) can be chromosomally- and plasmid-mediated, arising from selection pressure through antimicrobial use and the transfer of ARGs on mobile genetic elements (MGEs) such as plasmids and transposons (4). Of particular concern is the rising emergence of vancomycin-resistant enterococci (VRE) (4). Acquired vancomycin resistance is mediated by various gene clusters termed VanA/B/D/E/G/L/M/N (4,5). Each gene cluster codes for a different resistance mechanism. VanA and VanB are the most common clusters seen, are often hospital-acquired, and can be plasmid- or chromosomally-mediated (5-7). Two species, Enterococcus gallinarum and Enterococcus casseliflavus, have intrinsic vancomycin resistance that is chromosomally-mediated by the VanC gene cluster (4,5).
These two species, along with others such as Enterococcus villorum, Enterococcus thailandicus, Enterococcus durans, and Enterococcus hirae, are more typical of animal- and environmentally-adapted species (8,9). Their genomes reflect adaptations to specific niches; for example, clusters of orthologous groups (COGs) have been found for ethanolamine utilization as a carbon source in environmental species that are not found in E. faecium (8).
Other antimicrobials can co-select for vancomycin-resistance genes if there are multiple ARGs on a single mobile genetic element. This means that the use of other antimicrobials can lead to the acquisition of vancomycin-resistance genes, even if vancomycin is not used (10). Thus, the VRE that arise in this manner are also multi-drug resistant (MDR) or multiclass resistant (7). These terms refer to organisms that have acquired resistance to two or more antimicrobials and to organisms that have acquired resistance to two or more classes of antimicrobials, respectively.
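The distinction between multi-drug and multiclass resistance can be illustrated with a minimal Python sketch. The drug-to-class lookup and the function name below are hypothetical and are included only for illustration; the thresholds follow the definitions given above.

```python
from typing import Dict, Set

# Hypothetical drug-to-class lookup, for illustration only.
DRUG_CLASS: Dict[str, str] = {
    "vancomycin": "glycopeptide",
    "teicoplanin": "glycopeptide",
    "erythromycin": "macrolide",
    "linezolid": "oxazolidinone",
}


def resistance_labels(acquired_resistances: Set[str]) -> Dict[str, bool]:
    """Label an isolate from the set of antimicrobials it has acquired resistance to."""
    # Collapse individual drugs into their antimicrobial classes.
    classes = {DRUG_CLASS[drug] for drug in acquired_resistances}
    return {
        # Multi-drug resistant: two or more antimicrobials.
        "multi_drug": len(acquired_resistances) >= 2,
        # Multiclass resistant: two or more antimicrobial classes.
        "multi_class": len(classes) >= 2,
    }
```

Under these definitions, an isolate resistant to vancomycin and teicoplanin would be multi-drug resistant but not multiclass resistant, because both drugs are glycopeptides.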
As AMR pathogens have emerged, the use of antimicrobials for prophylaxis and metaphylaxis in food producing animals has come under scrutiny for its potential to apply a selective pressure that contributes to the dissemination of AMR and MDR enterococci (7,11). Perhaps the most classic example is in poultry and swine where the previous use of a vancomycin-related glycopeptide, avoparcin, as a growth promoter was associated with carriage of vancomycin-resistant E. faecium (VREfm) in treated herds or flocks through cross-resistance (7,12). The occurrence of VREfm in food animals decreased after avoparcin use in animals was banned (12); however, prolonged persistence of specific VREfm clusters in agricultural settings (e.g., VanA gene cluster on the Tn1546 transposon) was observed, possibly due to co-selection for VRE through continued use of other antimicrobials, such as macrolide use in swine (7,12,13). Retrospective molecular and genetic studies have demonstrated that the VREfm isolated from hospital and agricultural settings are usually separate sequence types (7,12,13). In contrast, the same sequence types of vancomycin-resistant E. faecalis can be isolated from hospital settings and from farm animals (7,12,14). In human medicine, enterococci are often opportunistic pathogens that acquire resistance and arise in immunocompromised individuals in hospital settings.
Often these patients have been treated with multiple classes of antimicrobials in an effort to control difficult-to-treat infections (6). These antimicrobials may include those deemed critically important for use in human medicine by the WHO (10), leading to enterococci resistant to these antimicrobials circulating in the human population (10).
Due to the importance of Enterococcus spp. as potential human pathogens, their ability to easily acquire ARGs, and their ubiquitous nature in the GIT and the broader environment, many countries have added enterococcal species to their list of pathogens under surveillance. Their role as a GIT commensal also makes enterococci useful as fecal indicator bacteria (15 (6,20,21). The European Antimicrobial Resistance Surveillance Network (EARS-Net) is a large surveillance program based on clinical antimicrobial resistance data from laboratories across Europe (22). The pharmaceutical industry also runs some important post-marketing surveillance programs to comply with licensing requirements of new antimicrobials, looking at potency and spectrum. Examples of these programs are the Zyvox Annual Appraisal of Potency and Spectrum (ZAAPS) and the Linezolid Experience and Accurate Determination of Resistance (LEADER) (23). Integrated surveillance programs survey and address AMR in humans, animals, and the environment from a One Health perspective, emphasizing the interfaces within the system. These programs collect samples and process isolates from various sources, including animal fecal samples, human screening samples, retail meats, wastewater, surface water, groundwater and soils (15). Human screening samples include those from hospital surveillance programs of in-patients and samples submitted to laboratories performing surveillance (6,24-26). The data generated from the wide range of samples can be integrated with antimicrobial use data, allowing for the monitoring of changes in antimicrobial resistance found in bacteria important to public and animal health (15). Information gleaned from surveillance can then inform policy and risk mitigation strategies to combat increasing AMR and protect antimicrobials important to human health (15).
For example, surveillance of VRE through DANMAP allowed for the detection of VRE strains in broiler chickens and human isolates connected to the use of avoparcin for growth promotion in broilers. This detection led to the ban of avoparcin use in food animal production (7,15).
Early surveillance was primarily based on traditional microbiology to determine phenotypic antimicrobial susceptibility profiles, molecular genomics methods such as polymerase chain reactions (PCR) to assess for the presence of resistance genes, and pulsed-field gel electrophoresis (PFGE) for DNA fingerprinting. Multi-locus sequence typing (MLST) arose more recently to better assess genetic relationships among isolates (7,27). The use of these technologies allowed for the phylogenetic study of sequence types, epidemiologic investigation and determination of the presence of specific ARGs. However, the development of whole genome sequencing (WGS) has provided a more in-depth and detailed analysis of enterococcal ARGs, phylogenetics, and virulence (7,27). As whole genome sequencing has become more widely available and less expensive, many archived isolate collections are being reanalyzed and their genomes compared with new isolates (13,28). Following WGS, it has become possible to utilize new sequence typing methods for enterococci, such as core-genome multi-locus sequence typing (cgMLST), allowing for better analysis of isolate relatedness across sample sources (28). WGS also allows for the identification of emerging strains, analysis of outbreaks, and the characterization of resistance and virulence genes and their locations and context in the bacterial genome. Due to these advantages, many surveillance research groups have been transitioning to WGS-based approaches of isolate characterization. Sequencing complements traditional microbiology approaches and offers a reliable method of characterizing ARGs and sequence types (27).
With the increasing popularity of WGS for bacterial pathogen surveillance, it is now imperative to review the progress that has been made toward surveillance methods and identify gaps in our surveillance knowledge. We have undertaken this scoping review to investigate and summarize the extent to which whole genome sequencing in surveillance studies has advanced our understanding of AMR in Enterococcus spp.

METHODS
To investigate our research question, a scoping review was designed following the PRISMA-ScR extension for scoping reviews (29) and the guidelines laid out by the JBI Reviewer's Manual (30). This protocol was not registered with an online registration platform.

Search Terms and Strategy
The Population, Concept, Context (PCC) framework (30) was used to develop the research question and search strategy. A search strategy was developed to return a broad range of studies that fit within the following population, concept, and context:
- Population: Enterococcus spp. that underwent whole genome sequencing
- Context: Enterococcus spp. isolates derived from surveillance-type studies (as described below)
- Concept: use of whole genome sequencing to better understand antimicrobial resistance.
After consultation with a librarian, three separate databases were selected for searching: PubMed (NCBI), Web of Science, and CAB Abstracts. All databases within Web of Science were included in order to include both the Web of Science Core Collection and BIOSIS. Searches were performed by a single reviewer (LR) on November 17, 2019, and email search alerts from each database were implemented to inform the reviewer of any new articles. New articles from alerts up to and including December 31, 2020 were included. The same search terms were used for all three databases, except for differences due to specific database formatting. The search terms and resulting number of articles for one database (Web of Science) are described in Table 1. To capture relevant gray literature, a manual search of the reference lists of included articles was completed during data extraction. Gray literature is information produced and distributed outside of academic publications such as government reports. The gray literature underwent the same screening process; however, the full text was read for screening if no abstract was available. The web application, Rayyan, was used for the organization of articles during the screening process (31).

Article Screening and Selection
Only journal articles and abstracts in the English language, published after 2002 were eligible for screening. The entire genome of E. faecium was first sequenced in 2000 (32); however, the assembly was not completed until 2012 (33,34). The earliest available genome sequence of an Enterococcus sp. from the NCBI database is from 2002 (35). Given this information, 2002 was considered the earliest year that a publication would contain the information relevant to this scoping review. Any publication that was not an article or abstract (e.g., textbook, poster, or conference presentation) was excluded. Relevant gray literature articles were included and searched for, as described above. All screening was done independently by two reviewers (LR and KS). Any discrepancies were resolved in discussion with two other reviewers (SLC and SCC).

Title Screening
The article titles were screened initially and any article that was clearly about bacteria other than Enterococcus spp. was excluded. These articles needed to explicitly include the name of bacteria other than Enterococcus spp. in the title and not include terms relating to the taxonomy of enterococci. All articles that did not meet this exclusion criterion were included for the next screening step. The title screening was intentionally left broad to maximize the number of articles included.

Abstract Screening
Two screening steps were applied to the included abstracts. The first abstract screening step was performed to exclude any articles that did not include whole genome sequencing and antimicrobial resistance of Enterococcus spp. The abstract had to include all three pieces of information (i.e., WGS, AMR, and Enterococcus) in the abstract text. This step was also intentionally left broad and any articles that mentioned sequencing without providing information about whether or not the whole genome was sequenced were included to be screened based on methodology (described below). Following this, the abstracts were screened a second time using the following question, "did all or a portion of the Enterococcus spp. isolates in this study result from surveillance or screening for enterococci?" The following criteria for surveillance or screening were used:
• Isolates were from a collection maintained by a surveillance group (a surveillance group is defined as an organization collecting and analyzing bacterial isolates for surveillance of those particular bacteria such as CIPARS, SENTRY, or DANMAP). OR,
• A statement was included that the isolates were collected for screening or surveillance purposes. OR,
• Isolates were collected for the sole purpose of genomic comparison. OR,
• The article was published in a journal which included "Surveillance" in the journal name.
Articles needed to meet one or more criteria. Articles that did not meet these criteria, such as those with only clinical isolates, were excluded. Articles that were unclear if they met the surveillance inclusion criteria through their abstract were screened based on methodology as described below.
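The "one or more criteria" rule above can be sketched as a simple predicate. This is a minimal illustration rather than the tool used in this review (screening was done by reviewers in Rayyan); the record fields and function name are hypothetical, and the case-insensitive match on the journal name is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScreeningRecord:
    """Hypothetical reviewer judgments recorded for one abstract."""
    from_surveillance_collection: bool      # isolates held by a surveillance group
    states_surveillance_purpose: bool       # explicit screening/surveillance statement
    collected_for_genomic_comparison: bool  # sole purpose was genomic comparison
    journal_name: str


def meets_surveillance_criteria(record: ScreeningRecord) -> bool:
    """Return True if at least one of the four surveillance criteria is met."""
    return any((
        record.from_surveillance_collection,
        record.states_surveillance_purpose,
        record.collected_for_genomic_comparison,
        # Assumption: the journal-name check ignores letter case.
        "surveillance" in record.journal_name.lower(),
    ))
```

A record with all three flags set to False and a journal name lacking the word "Surveillance" would be excluded, which corresponds to the clinical-isolate-only articles described above.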

Methods Screening
As stated previously, some abstracts did not contain enough detail to determine if they met the inclusion/exclusion criteria. These articles were further screened through the reading of their methods sections, following the same criteria as for abstract screening. Articles that did not meet the inclusion criteria during the methods review were excluded. The number of articles excluded at each screening step is displayed in Figure 1.

Data Extraction and Charting
Data were extracted independently by two reviewers (LR and KS) to answer the research question. The chart was trialed with 15 articles to ensure the reviewers were extracting comparable information. The completed tables from data extraction were compared by one reviewer (LR), and any discrepancies in information were resolved in discussion between the two original reviewers (LR and KS). No critical appraisal of articles was performed and all articles were included regardless of study quality. The methodology of data extraction is outlined in Table 2.

Article Characteristics
Seventy-two articles were included after the full-text review (Figure 1). Of these, 70 were primary research articles and two were gray literature (government reports) (25,36). All articles were published in 2015 or later. The corresponding authors were from seventeen (17) different countries with most from Australia (16.7%), Denmark (13.9%), and Germany (13.9%), followed by the USA (12.5%), then the UK (6.9%) and 5.6% from each of Canada, China, the Netherlands and Portugal. Corresponding authors were also from Brazil  Table 1.
Just under half of the studies were not associated with a specified surveillance group (48.6%). The remaining articles were associated with government-funded programs (e.g., DANMAP, NARMS), within-hospital screening or surveillance programs, or private/industry-funded surveillance programs (Table 3 and Supplementary Table 1).
All 72 articles provided some information on the AMR phenotypes of their isolates. The majority of articles (80.6%) described isolates with resistance to glycopeptide antibiotics (vancomycin or teicoplanin). Twenty-one articles (29.2%) described isolates with resistance to oxazolidinones (linezolid or tedizolid). Other antimicrobial classes with identified phenotypic resistance included fluoroquinolones, macrolides, aminoglycosides, penicillins, and tetracyclines (Figure 3 and Supplementary Table 3). Methods used to assess phenotypic resistance in each article are outlined in Supplementary Table 4. Nineteen articles (26.4%) provided no information on the methodology used to define phenotypic resistance.

WGS Platforms and Results
All articles selected performed whole genome sequencing of the isolates (as a part of the inclusion criteria) and Illumina was the most commonly used platform for sequencing. Sixty-four articles (88.9%) used a version of Illumina for sequencing, whereas four articles did not provide sequencing methodology. Illumina MiSeq was the most commonly used Illumina instrument. Other platforms included PacBio, Ion Torrent PGM, or Illumina in combination with PacBio, Ion Torrent PGM, or MinION platforms (Table 6). Sixty-two articles (86.1%) provided archive accession numbers for access to their resultant sequences.
Whole genome sequencing generated information about AMR genes that was described in almost all articles, with two articles (2.8%) failing to report AMR genes (Figure 4 and Supplementary Table 3). Forty-one articles (56.9%) reported the vanA gene cluster, 26 articles (36.1%) reported vanB, and 15 articles (20.8%) reported optrA.
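The percentages in this section are article counts expressed over the 72 included articles, rounded to one decimal place. A short Python sketch of that arithmetic, using the counts reported above (the function and variable names are illustrative):

```python
TOTAL_ARTICLES = 72  # number of articles included in the review


def percent_of_articles(count: int, total: int = TOTAL_ARTICLES) -> float:
    """Express an article count as a percentage of all included articles."""
    return round(100 * count / total, 1)


# Article counts taken from the text above.
reported = {"vanA": 41, "vanB": 26, "optrA": 15}
for gene, count in reported.items():
    print(f"{gene}: {count}/{TOTAL_ARTICLES} = {percent_of_articles(count)}%")
```

Because articles could report more than one gene, these percentages are not expected to sum to 100%.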
From the sequencing, 61 articles (84.7%) described the sequence type (ST) and/or clonal complex (CC) of isolates. Over 60 sequence types were reported, including   Figure 5 and Supplementary Table 3.
Many articles (75.0%) reported other molecular techniques, including PCR, PFGE, and MALDI-TOF MS. These techniques were primarily used for speciation or screening for AMR genes (Supplementary Table 4).

Article Findings and Conclusions
The articles' findings or conclusions related to antimicrobial resistance were summarized into categories, and articles could fit into more than one category. Of all articles, 31.9% reported a new or uncommon finding, which could include a novel strain or gene, or a previously described finding in a novel location. A fifth (20.8%) of articles reported how Enterococcus spp. were optimized for adaptation and survival in their environment. Twelve articles (16.7%) specifically stated that WGS was a better means of detecting or differentiating Enterococcus spp. than other genomic methods (such as PCR or PFGE). The other summarized findings are described in Table 7, with more details available in Supplementary Table 2.

DISCUSSION
In this scoping review, we aimed to assess the value added by the use of WGS in the surveillance of enterococcal AMR. The use of WGS for the surveillance of Enterococcus spp. has added to our knowledge about Enterococcus spp. through the detection of previously unidentified strains and finding that WGS was better at detecting and differentiating Enterococcus spp. than other genomic methods. European countries provided the most surveillance studies, with many of these coming from the Danish DANMAP program. This is perhaps unsurprising as DANMAP is an extensive and well-established program implemented in 1995 (25).

Importance of WGS to Detect AMR Enterococcus spp.
While the majority of Enterococcus spp. are adapted to the natural environment and animal GITs, rarely causing disease in humans, E. faecium and E. faecalis are the species most likely to cause human disease (2). Thus, it was not surprising that they were the most studied species in the included articles and the majority of articles were conducted in hospital settings. Nearly half of the included articles were studies specific to VREfm, showing the importance of vancomycin resistance in enterococci. WGS was important in studies to better understand VRE, especially in the assessment for vancomycin-variable enterococci (VVE). These are enterococcal isolates that are phenotypically sensitive to vancomycin but carry vancomycin-resistance genes. These isolates become phenotypically resistant to vancomycin when exposed to the antibiotic in vivo (25). Even though the resistance genes could be identified via PCR, the importance of WGS to further characterize VVE isolates was shown in the 2018 DANMAP report. The use of WGS and cgMLST allowed for the identification of new complexes and sequence types. DANMAP can now perform surveillance specific to these VVE strains. This should allow for earlier detection of VVE in patients and more appropriate treatment (25). WGS also allowed for a better understanding of E. faecium as it determined new sequence types, including pstS-null types, through cgMLST (37). The pstS locus is a housekeeping gene used for MLST but is missing in some VREfm strains. The use of WGS and cgMLST allowed for more robust sequence typing and identification of these isolates (37). Surveillance of enterococci within a hospital using WGS also allowed for the identification of a VREfm outbreak. A combination of sequencing data and an epidemiological investigation allowed for the identification of the transmission route and the implementation of measures to prevent further outbreaks (20).
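The VVE definition above combines a phenotypic result with a genotypic one, which can be sketched as a small decision rule. This is a simplified illustration, not a laboratory or DANMAP protocol; the function name, argument names, and returned labels are hypothetical.

```python
def classify_isolate(phenotypically_resistant: bool, carries_van_genes: bool) -> str:
    """Classify an enterococcal isolate from vancomycin phenotype and genotype.

    Simplifying assumption: any phenotypically resistant isolate is labelled
    VRE, regardless of whether resistance genes were detected.
    """
    if phenotypically_resistant:
        return "VRE"
    if carries_van_genes:
        # Phenotypically sensitive but carrying vancomycin-resistance genes.
        return "VVE"
    return "vancomycin-susceptible"
```

The middle branch is the case that phenotypic testing alone would miss, which is why the genotype is needed to flag VVE.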

A One-Health Approach
The ubiquitous nature of Enterococcus spp. naturally requires a One-Health approach to the surveillance of AMR in enterococci (38). This means a transdisciplinary approach across the human-animal-environment continuum in order to better understand the problem of AMR in enterococci (97). While enterococcal species other than E. faecium and E. faecalis were discussed in a few studies, in general, there were relatively few studies using a One-Health approach to compare animal, environmental (e.g., water and soil), and human samples (9,11,13,39-41,68-71).
No studies sampled companion animals or equids, even though these animals live in close proximity to humans. This could be due to the complexity and cost of coordinating a study with so many sample sources or that human health studies are more easily funded. A collaborative transdisciplinary approach would bring a broad perspective to study design and allow for a better interpretation of the results. This would ease the complexity of designing and coordinating a One-Health surveillance study and create a more robust understanding of the issue of AMR in enterococci. Two studies included in this review did produce very informative results across multiple sample sources to address the One-Health continuum (38,39). These studies pulled samples from livestock, retail meat, wastewater, and human bloodstream infections and showed limited sharing of genes between isolates from humans and animals (38,39). Research using this One-Health approach will provide a means to assess the risk of AMR enterococci moving from food animals, through the food chain, into human populations as well as through the environment.

Importance of Surveillance of Enterococci
Many articles included in this study stressed the importance of the surveillance of Enterococcus spp., especially in hospital settings. This is because enterococci are optimized for adaptation and survival in their environment, whether the hospital environment or natural environment (42,72). Both targeted surveillance of at-risk patients (e.g., immunocompromised) and passive surveillance of incoming hospital patients allowed for early recognition of outbreaks. Outbreaks could be controlled before becoming a significant problem and new hospital protocols surrounding cleaning and isolation could be developed (20,72,87). WGS allowed for more accurate sequence typing and identification of AMR genes (27,73,88).

Sequencing Platforms
From this review, Illumina sequencing platforms are currently the most popular for whole genome sequencing studies. They are historically reliable, with low error rates and have become accessible, abundant, and cost-effective (98). The combination of Illumina short-read with long-read sequencing (usually PacBio) was occasionally used to close a chromosome or a plasmid to accomplish genomic integrity as well as complete understanding of MGEs and their context. Unfortunately, the cost of running large numbers of isolates on a PacBio system is prohibitive (99). The use of long-read sequencing is likely to increase as inexpensive and portable bench-top platforms such as the Nanopore MinION become more reliable with lower error rates (100,101). This will allow for the rapid identification of an isolate and its genetic composition, including MGEs (99). The majority of papers (86.1%) also provided archive accession numbers for their sequences, highlighting the importance of sharing raw genomic data with the scientific community and the requirements for publication in many journals.

Limitations
This scoping review held limitations similar to other review papers in the possible omission of relevant literature, such as gray literature or articles written in a language other than English. Findings from government surveillance programs may be published online or as peer-reviewed articles, but in order to maintain an efficient and reproducible search method, gray literature was only searched from the reference lists in the primary research articles included in the review. No separate gray literature search was performed, which may have resulted in the omission of relevant information. Two non-English articles were excluded in our search which otherwise might have been included. In order to minimize the omission of articles, several databases were searched and inclusion criteria were intentionally left broad until the abstract screening steps. The authors did maintain a rigid definition of surveillance, which could have excluded epidemiological articles that did not fit the selection criteria. This largely eliminated studies on human clinical isolates of Enterococcus spp. as the isolates would have been selected for a study based on certain characteristics.
Another limitation of the study is that a critical appraisal of the included articles was not conducted. This was intentional as one of the objectives of the present study was to identify gaps in the literature, but it means that studies of lower and higher quality would carry equal weight. The findings of some studies may not share the same validity based on their study design, but this was not determined in this review.

CONCLUSION AND FUTURE DIRECTIONS
Whole genome sequencing has added value to the surveillance efforts of Enterococcus spp. by identifying new genes and strains, adding to the knowledge about its prevalence in various settings, and finding that WGS is a better means of detecting and differentiating Enterococcus spp. than other molecular methods. The ability of Enterococcus spp. to adapt and survive in its environment was frequently stated as a reason for the importance of using WGS for the surveillance of this bacterium. Future studies should focus on the state of Enterococcus spp. in companion animal veterinary medicine and determining the link between humans, animals, food products, and the environment for a better One-Health approach to Enterococcus spp. surveillance.

AUTHOR CONTRIBUTIONS
LR, KS, SLC, and SCC developed the research question and scoping review protocol. LR and KS performed the literature search, article screening, and data extraction of included articles. LR drafted the complete manuscript. SCC and SLC were secondary reviewers of articles and primary reviewers of the manuscript. All authors assisted with editing and content review of the manuscript.

FUNDING
LR and KS stipend funding was through the University of Calgary, Faculty of Veterinary Medicine One Health Award, and the University of Calgary Master's Award. This research is part of the AMR - One Health Consortium, partially funded by the Major Innovation Fund program of the Ministry of Jobs, Economy and Innovation, Government of Alberta. The funders had no role in the study design, data interpretation, or writing of the manuscript.