High Throughput Sequencing for Detection of Foodborne Pathogens

High-throughput sequencing (HTS) is becoming the state-of-the-art technology for typing of microbial isolates, especially in clinical samples. Yet, its application is still in its infancy for monitoring and outbreak investigations of foods. Here we review the published literature, covering not only bacterial but also viral and Eukaryote food pathogens, to assess the status and potential of HTS implementation to inform stakeholders, improve food safety and reduce outbreak impacts. The developments in sequencing technology and bioinformatics have outpaced the capacity to analyze and interpret the sequence data. The influence of sample processing, nucleic acid extraction and purification, harmonized protocols for generation and interpretation of data, and properly annotated and curated reference databases including non-pathogenic “natural” strains are other major obstacles to the realization of the full potential of HTS in analytical food surveillance, epidemiological and outbreak investigations, and in complementing preventive approaches for the control and management of foodborne pathogens. Despite significant obstacles, the achieved progress in capacity and broadening of the application range over the last decade is impressive and unprecedented, as illustrated with the chosen examples from the literature. Large consortia, often with broad international participation, are making coordinated efforts to cope with many of the mentioned obstacles. Further rapid progress can therefore be prospected for the next decade.


INTRODUCTION Foodborne Pathogens and Their Impact
Foodborne pathogens (FBPs) cause foodborne diseases (FBDs) either directly (by infectious agents) or indirectly (by toxic metabolites, i.e., bacterial toxins and mycotoxins; EMAN, 2015;Martinovic et al., 2016) and can have devastating health and economic consequences in both developed and developing countries (Pires et al., 2012;EFSA, 2014;ECDC, 2015a;Henao et al., 2015). A major fraction of FBDs are diarrheal diseases, with particularly high impact on children (Pires et al., 2015). Typical FBPs are bacteria and several viruses, but also parasites and some fungi can cause FBDs.

Conventional Microbiological Food Analyses
Microbiological analyses of foods are carried out for verification and control, surveillance, investigation of disease outbreaks or sporadic cases, or for research (Figure 1). Time and labor consuming culture dependent methods, including enrichment and/or selective steps, are often used. Selective enrichment may be crucial to capture the species of interest. Isolation of the pathogen of interest is an optimal starting point for further characterization and research, and contributes to ensure that public health agencies can perform their basic mandate of FBD surveillance and response (Forbes et al., 2017). Over the last decades traditional culture dependent methods have gradually been complemented with molecular analytical methods. Speed and costs are the main drivers of this development. Concerns about the possible lack of necessary sensitivity, specificity or correspondence between molecular findings and presence or absence of viable, pathogenic microorganisms are among factors restraining this development. Polymerase chain reaction (PCR) analysis of enriched samples from food is established for a range of FBPs improving the efficiency of screening of samples, but viable microorganisms are, in most cases, still required for definitive confirmation of positive samples. High throughput sequencing (HTS) based workflows now gradually emerge as options also for routine applications to FBP detection and characterization (Figure 1; Table 1).

High Throughput Sequencing
HTS can generate thousands to millions of sequence reads, and up to several hundred billion base pairs (bp) of sequence information per sample. The read length, error rate and number of reads and sequenced bases vary substantially. Selective amplification (targeted) and non-selective, random (shotgun) approaches exist. The number of high quality genomes for the most important food pathogens is already high and rapidly growing, in part benefiting from the relatively small genome sizes of most microorganisms (≤100 Mbp). In cases where a sequenced reference genome is unavailable it is necessary to perform de novo sequencing and genome assembly (Figures 2, 3). De novo assembly to obtain a draft quality genome based on high quality, short-read (<250 bp) sequence data from single, cultured isolates is complex, but can be (semi-)automated (Emond-Rheault et al., 2017). Closing a genome often requires highly skilled bioinformaticians and long sequence reads (up to several kbp) or PCR and Sanger sequencing of the unknown gaps in scaffolds (Goodwin et al., 2015;Rhoads and Au, 2015). Even this can sometimes be (semi-)automated (Emond-Rheault et al., 2017). Analysis of re-sequencing data employing a mapping strategy (Figures 2, 3) is less complex, and can be (semi-)automated. It is, however, time and computer intensive and interpretation of the mapped data can be quite challenging (Goodwin et al., 2015;Rhoads and Au, 2015). Alignment-independent comparisons using statistic (probabilistic) approaches (k-mers; Figures 2, 3) can be much faster and automated and emerge as an attractive option at the cost of detailed resolution (Ondov et al., 2016). The optimal FIGURE 1 | Four sectors are considered here as potential users of high throughput sequencing (HTS) technologies for detection and characterization of foodborne pathogens (FBPs). Research (upper left) is a knowledge driver providing exploitable reference data and detection methods among others to the other three sectors (green arrows), and receives valuable data and material back from the other sectors (yellow arrows). The food industry (upper right) is legally obliged to take preventive measures and to monitor its products and production systems to prevent contamination with FBPs, with economy as a main priority driver. Documentation of the systematic efforts to maintain low risk (goal = pathogen free) products must be available for inspection. The health sector (lower left) treats patients and is usually the first to isolate and characterize outbreak-associated strains, thereby providing key information necessary for the other sectors to investigate and minimize the impact of outbreaks. The competent authorities (lower right) enforce the food law and surveil the food industry and products, but also coordinate the outbreak investigations based on data provided by the other sectors. Epidemiological data, legal acts and quality control documents are the main information sources used and shared by the competent authorities (blue arrows). Outbreak investigations have a strong focus on specific source tracking (red arrow).
HTS strategy is therefore purpose dependent. Large sequence databases including thousands of completely sequenced genomes are now available, facilitating identification of common as well as rare but potentially important genes by mapping of sequence reads to the database sequences. Low error rate, long reads and high coverage facilitate sequence assembly (Figures 2, 3). Altogether, this has expanded our ability to gain a more comprehensive insight into the genetics of individual strains or species, and the microbiota (microbial species composition) and microbiome (functional gene pool of microbiota) of a very broad spectrum of sample types, including environmental, food and clinical samples.
Tremendous progress in HTS technology developments has been made during the recent decade and this review will not discuss per se the sequencing technologies, because several excellent reviews are available (Loman and Pallen, 2015;Goodwin et al., 2016). Understanding advantages and disadvantages associated with the different HTS platforms, (in particular the Molecular targets A narrow range of pre-defined risk indicators, e.g., particular species specific markers, serotype specific markers, virulence genes A broader range of risk indicators, pre-defined or not. May also include focus on different sequence variants At least all case specific markers identified from characterization of patient isolate(s). Often also including a broad range of other risk indicators, pre-defined or not Purpose of analysis Ensure that own food product is legally compliant and does not pose an unacceptable risk to consumer (can be perceived as safe) Ensure that food production/products are legally compliant and safe, and monitoring of prevalence/distribution of contaminants on the market Identify contaminated product(s) and retract the product(s) from the market as well as clearing non-contaminated products from suspicion How? Analytical verification: pre-defined risk indicators are not detectable at a pre-defined limit of detection or in a pre-defined sample size. Sampling based on HACCP c approach Analytical detection and typing of risk indicators, analysis of traceability documentation and epidemiological analysis. A pre-defined sample size or LOD d and quantitative analyses are often applied. Sampling based on specific sampling plans, standards or guidelines Interviews of patients/family : analytical detection targeting case specific markers from patient isolate(s) complemented with epidemiological analysis and sampling partly based on interviews. A pre-defined sample size is often applied, but a pre-defined LOD is usually not

Required time from test to result
Minutes or usually several hours, but may be justifiable with up to days/weeks in case of re-emerging contamination problem Exceptionally weeks, usually days, but may be shorter in case of products with short shelf life and high turnover As soon as possible, preferably minutes to hours, but can be days

Resource limitations
Routine on-site detection. Price sensitive market : testing costs per unit produced must be very low, but may be justifiable with high costs in case of re-emerging contamination problem Routine laboratory analyses complemented with in-depth analyses in well-equipped (reference) laboratories. Testing costs per unit can be low to moderate, depending on specific control program. High to very high costs can be justified in particular cases read length, read number, sequencing error rate and costs) may, however, help readers to better understand the choices made in the following cited examples. Illumina sequencing is currently the prevailing HTS technology and also offers the highest fidelity. It provides very large data sets of relatively short reads (100-300 bp) at an error rate per sequenced base of approximately 1%. Roche 454 (phased out in 2016) was the previously dominating technology and provided smaller data sets of longer reads (≤700 bp) with higher error rates. Much of the initial HTS literature reports on Roche 454 data (Liu et al., 2012;Mayo et al., 2014;van Dijk et al., 2014). Long read sequences, up to several kbp, can be obtained using other platforms (see below) but the error rates are high (5-40%). All HTS technologies allow for assembly of good quality draft bacterial genomes with up to 100 contigs. A fully closed genome sequence is usually obtained through application of different technologies combining the accuracy of short reads with the ability of long reads to span gaps in assembly scaffolds (Tallon et al., 2014). Due to the capacity of the Single Molecule Real Time sequencing technology of Pacific Biosciences to provide reads of 1-10 kbp, this technology has recently been popular for de novo assembly of completely closed genomes, long repetitive sequences, plasmids and bacteriophages (Rhoads and Au, 2015). The Oxford Nanopore MinION has competitive potential, as it is claimed to be able to provide even longer reads in real-time at low costs (Goodwin et al., 2015;Quick et al., 2015). The short read technologies generally provide significantly higher coverage than the long read technologies, at much lower costs per sequenced base.

HTS Based Microbiological Analyses
Foods, feeds, clinical and environmental samples harbor complex and diverse microbial communities. We use the expression "metagenomics" for analysis of such samples by HTS. For FIGURE 2 | Metagenomics data analysis. At least three different approaches for analysis of HTS sequence reads can be selected, but combinations are often preferred. (A) Assembly of sequences (e.g., reads) into contigs (consensus sequences) requires mapping. Sets of contigs are often further assembled into scaffolds (not shown), where the relative position of contigs is known but gaps of ± known size between the contigs remain to be closed. The example shows that four of the six reads can be assembled into a consensus contig while the two remaining reads cannot be assembled with any of the others. (B) Mapping of sequences (e.g., reads) to other sequences (e.g., in database) also requires mapping. The example shows one perfect and one partial match between two query sequences (e.g., reads) and a reference sequence. The mismatch in the partial match is shown in red. (C) Any sequence larger than one nucleotide can be divided into subsequences of length k ≥ 1. The size of k will affect the likelihood of any random k-mer being unique to a data set. A small k will reduce the number of unique k-mers. This is demonstrated in the example, as for the given reference sequence two k-mers will not be unique with k = 3, while with k = 5 all k-mers are unique. Rare k-mers or k-mer frequencies can be used to estimate relationships between two sets of sequences (e.g., two shotgun metagenomes or a sequence isolate and a reference genome).
further distinction we use the term "shotgun metagenomics" for whole genome sequencing (WGS) or whole-sample-DNA based metagenomics (Figure 4). Targeted amplicon analysis of various ribosomal RNA coding DNA (rDNA) and other conserved markers is, as suggested by Marchesi and Ravel (2015), denoted "metataxonomics" throughout this review. "Metabarcoding" is a related term sometimes used in the literature (e.g., Jones et al., 2013;Staats et al., 2016). Shotgun metagenomics, and "metatranscriptomics" (i.e., shotgun sequencing of RNA transcripts) enables full genome or transcriptome sequencing, respectively, in a complex sample. RNA based shotgun metagenomics (i.e., sequencing of all RNA in the sample) and metatranscriptomics are often combined into a single approach "RNAseq". We therefore only refer to metatranscriptomics or RNA based shotgun metagenomics when we need to distinguish from RNAseq. FIGURE 3 | Approaches to HTS sequence read analysis and their dependence on alignment, time and coverage. At least three different approaches for analysis of HTS sequence reads can be selected, but combinations are often preferred. Top right: Assembly of sequences into contigs (see also Figure 2), scaffolds and complete genome assemblies is alignment dependent, time consuming and the success probability is usually correlated with the coverage. This approach is typically taken when time is not the limiting factor and a complete assembly is desired for successive analysis and reference applications. Bottom: Mapping (see also Figure 2) of reads to existing assembly/assemblies is also alignment dependent and time consuming but can also be performed successfully at low coverage (a single read can be mapped to a reference assembly). The size of the reference (e.g., database or genome) and the degree to which mismatches are accepted will have a significant impact on the time required for data analysis (olive arrows). This approach is typically taken to determine functional aspects of metagenome and transcriptome sequences and in metataxonomics. Top left: K-mer analysis (see also Figure 2) is a fast, alignment independent, statistical (probabilistic) approach to investigate properties of a sequenced genome such as its similarity and relationship to other (reference) genomes. It is typically used to screen sequenced genomes to identify genomes of particular interest for more comprehensive analysis.
Metataxonomics is generally more sensitive than shotgun metagenomics, due to enrichment of targets by amplification. Metataxonomics is, due to the targeted enrichment, prone to bias and may fail to detect novel variants of relevant targets, e.g., 16S rDNA with mismatches in the PCR primer binding motifs or genes involved in previously unknown but relevant biosynthetic pathways. Shotgun metagenomics on the other hand presumably has a low bias, is independent of a priori knowledge of target sequences, and can be used to monitor alterations in the microbiome that may not be evident from the composition of the microbiota. RNAseq is biased by the reverse transcription used to synthesize cDNA prior to sequencing, possibly affecting detectability of RNA viruses. Metatranscriptomics is further biased by RNA transcription rates. RNAseq is otherwise comparable to shotgun metagenomics. The main drawbacks of shotgun metagenomics FIGURE 4 | The difference between shotgun metagenomics and amplicon based metataxonomic sequencing. (A) Six different genomes (1-6) shown in different colors with five of the six (1 and 3-6) containing a shared genomic region. The shared genomic region in all five has a conserved motif on the left side (red circle = forward primer binding site), but one of them has a significant change in the conserved motif on the right side (yellow circle = reverse primer binding site) resulting in primer mismatch (black circle). (B) The sequenced fragments with shotgun metagenomics are random motifs from the six different genomes, and only one of the conserved (primer binding) motifs will exceptionally be included. (C) The sequenced fragments with amplicon sequencing are only those delimited both by a conserved left and right motif (red and yellow circles). The difference in mean coverage per nucleotide is significant. Assuming that each genome has the same length (e.g., 10 8 bp) and is present in equal concentration in the original sample, a read length of 200 bp, and an invariant length of the shared genomic region delimited by the primer sites of 250 bp, the mean coverage per nucleotide of the targets will be: (B) R × L/N × G = 10 6 × 200 bp/6 × 10 8 bp = 1/3 where R = number of reads, L = read length, N = number of genomes and G = genome size. (C) R × L/A × D = 10 6 × 200 bp/4 × 250 bp = 2 × 10 5 where R and L are the same as for B, while A = number of genomes flanked both by conserved forward and reverse primer sites, and D = the length of the shared genomic region delimited by the primer sites. In this example C is 6 × 10 5 times more sensitive than B. Shotgun reads may be analyzed applying all bioinformatics approaches (assembly, mapping and k-mer analysis, alignment dependent and alignment free; cf. Figures 2, 3). Amplicon reads are usually analyzed by mapping, clustering and phylogenetic approaches, while assembly is only exceptionally applied. are: several logs higher (inferior) limit of detection (LOD; Figure 4) and complex data analyses (bioinformatics) that, at present, are difficult to automate/standardize. The composition of the sample's microbiota and/or the causative agent is typically not well known in advance (limiting the possibility to apply targeted approaches such as metataxonomics), and relevant reference sequences may be lacking from public databases (limiting the possibility to perform mapping and assign function to sequences).

SELECTED, ILLUSTRATIVE EXAMPLES OF APPROACHES FOR SPECIFIC PATHOGEN TAXA AND APPLICATIONS
FBDs are caused by bacterial, viral, fungal or parasitic pathogens entering the body via contaminated foods and have entered the food chain at some point from farm to fork. Bacteria and viruses are the most commonly reported sources of disease. Bacteria are more easily identified than viruses because the former can often be cultured. The vast majority of published HTS based FBP studies have focused on particular taxa, mainly bacterial, and specific applications, e.g., outbreak investigation starting with a clinical isolate. The organization of the following sub-chapters reflects this. Future advancements, with some included pioneer examples, are expected to allow for simultaneous detection of multiple higher and lower level taxa.

Bacterial Foodborne Pathogens
Many FBPs are well-studied and the use of genomic data and large scale WGS have become important for studies in epidemiology, evolution, surveillance and outbreak investigations. On August 16th 2017, there were 104,667 (84,726) genome assemblies and 8,119 (6,286) complete bacterial genomes (number in brackets was 7 months earlier) available from National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/genome/browse/). The contribution of bacteriophages to bacterial genome size, evolution and virulence is very significant (Brüssow et al., 2004;Salmond and Fineran, 2015). Shotgun metagenomics can provide new insights into these aspects and the possible relevance of phages to FBDs (Nieuwenhuijse and Koopmans, 2017).
Bacterial FBPs are often present in foods in low numbers heterogeneously spread in the product. Consequently, ability to detect very low levels of FBPs in various food sources is important. Current detection methods normally involve one or more enrichment steps, screening (e.g., by PCR), followed by an isolation step. Isolation of a FBP from a food matrix may be challenging due to low recovery of isolates;-a sample can be positive with PCR screening, yet isolation of a corresponding bacterial strain may not be achieved. Some of the most important bacterial FBPs are Salmonella, Listeria monocytogenes and Shiga toxin-producing E. coli (STEC) causing many outbreaks and sporadic cases with severe or fatal outcome (Crim et al., 2014;Astridge et al., 2015;EFSA, 2015b). These three FBPs are used in the following as illustrative examples. For Salmonella infections contaminated food sources typically include poultry, eggs, swine and ready-to-eat foods, and affect people at all ages (EFSA, 2015b). L. monocytogenes is commonly detected in ready-to-eat foods such as smoked fish and soft cheeses and often affect elderly, immunocompromised patients, pregnant women and have high mortality rate (EFSA, 2015b). The main food vehicles of STEC infections are bovine meat followed by vegetables and juice (EFSA, 2013(EFSA, , 2015b. STEC can cause severe complications like acute kidney failure (hemolytic uremic syndrome) and often affects children under the age of five, elderly and immunocompromised people (Davis et al., 2014).
For outbreak investigation the pathogen must be linked to the correct food product (source of infection). Food producers on the other hand, need to determine if their products or production line is contaminated, how an unwanted pathogen entered their production facilities, and/or if it is a persistent household strain (Figure 1; Table 1). Several strategies can be applied for comparison of isolates, e.g., pulsed-field gel electrophoresis (PFGE), multi-locus variable number of tandem repeats analysis (MLVA) and multi-locus sequence typing (MLST). For many FBPs the traditional typing or subtyping offers too low (phylogenetic) resolution to distinguish closely related but distinct strains. High resolution is required to discriminate parallel outbreaks or to separate sporadic cases from an outbreak, but also to assess if a reemerging contamination problem is caused by a persistent strain or reintroduction of similar strains. WGS will provide highly discriminatory data for subtyping of strains by single nucleotide polymorphism (SNP) analysis or extended (core genome or whole genome) MLST (cgMLST, wgMLST) for strain comparison for outbreak investigations and surveillance purposes. Among possibilities beyond traditional molecular fingerprinting is the reanalysis of complete genome sequences when subsets (e.g., MLST) provide insufficient information/resolution. Polymorphisms can be investigated with or without mapping of the HTS data to a reference genome (Figures 2, 3). WGS-based analyses can also aid in the identification of other relevant factors such as virulence and antibiotic resistance genes (Joensen et al., 2014;Holmes et al., 2015;Octavia et al., 2015;Forbes et al., 2017). By standardizing the workflow of the actual HTS and bioinformatics analysis, this can take only a few days. However, comparable data and standardized protocols and pipelines are required, a topic discussed in further detail below.

Outbreak Investigations
The starting points for outbreak investigations with strain typing are access to clinical isolates. WGS has been used many times in recent years for comparison of isolates in outbreak investigations. Most published studies were retrospective, but a few were performed in real-time. A selection of examples is summarized in Table 2. In an early prospective study (2009) isolates from human patients and animals associated with an STEC O157:H7 outbreak were selected for WGS for comparison of isolates and source identification (Underwood et al., 2013; Table 2). A combination of Roche 454 and Illumina data were used to generate a reference assembly from the strain with best quality data. The hybrid assembly resulted in 463 contigs with average size of 12,028 bp and served as a reference genome for the successive analysis. Shotgun metagenome reads from 16 isolates associated with the outbreak were mapped and examined for SNPs over the entire genome. Based on the SNP results five subtypes of the outbreak strain were identified, providing for design of assays for detection of six specific SNPs. These assays were used to follow the outbreak, including analysis of 106 additional isolates obtained from the outbreak, demonstrating that the five subtypes were widely distributed on the involved farm prior to the first human clinical case (Underwood et al., 2013). Variable number of tandem repeats and PFGE typing indicated that there were two different strains in the sample collection, but the data from each typing method did not overlap and were therefore inconclusive. The HTS data on the other hand documented that the outbreak was caused by a strain differing in four SNPs from the hypervirulent O157:H7 ST11 clade 8. This early study demonstrated that HTS can provide better resolution (five subtypes vs. two subtypes) and therefore can be superior to more traditional characterization methods. HTS in addition provided data suitable for design of specific diagnostic assays that improved the monitoring of the outbreak.
Specific diagnostic sequence motifs are not always available or known, and a specific pathogenic agent may exhibit new and unexpected combinations of involved virulence genes for which current tests are not optimally designed. This was for example the case in the large STEC O104:H4 outbreak in Germany and other European countries in 2011 . In mid-May the public health authority in Germany was informed about a cluster of three cases with hemolytic uremic syndrome and raw WGS data from a patient derived isolate were already on June the 2nd published by Beijing Genome Institute (NCBI accession no. SRX067313; Kupferschmidt, 2011). Public release of these data incited a huge joint effort from bioinformaticians and researchers around the world, very quickly resulting in in-depth knowledge of the strain. This also facilitated design of specific diagnostic tools for further investigation of the outbreak (Struelens et al., 2011).
A complex outbreak investigation in the UK identified watercress as the source of STEC O157 in two simultaneous outbreaks with different sources of contamination (Jenkins et al., 2015; Table 2). SNP positions of high quality in all genomes of the Public Health England STEC O157 database were then extracted. Pseudo sequences of polymorphic positions were used to create maximum-likelihood trees and compared to the WGS data of additional strains held in the database. Phylogenetic analysis supported a foreign source for the outbreak, but no microbiological link to a specific country of origin was identified. Only one isolate was identified from the irrigation water from the implicated watercress, indicating a low level of contamination.  This isolate was compared to the human isolates, and a maximum of 3 SNP differences were reported for the second outbreak, confirming the source of this outbreak (Jenkins et al., 2015). One of the first published studies using WGS in outbreak investigations concerned a large L. monocytogenes outbreak in Canada in 2008 (Gilmour et al., 2010; Table 2). Two clinical isolates with similar but distinct PFGE patterns were subjected to WGS to assess the genetic diversity of these isolates. Altogether 28 SNPs and three indels, including a 33-kbp motif corresponding to presence/absence of a prophage were observed. The additional information obtained with WGS compared to PFGE indicated that not one, but three distinct, yet closely related strains were possibly involved in the outbreak.
Investigations of an Australian hospital outbreak of L. monocytogenes in 2013 identified a chocolate profiterole from a specific food manufacturer as the common food consumed by the patients. A follow-up WGS study identified more SNP differences in the environmental isolates from the food manufacturing facility than the patients' isolates from the outbreak (Wang et al., 2015b; Table 2). However, the five outbreak isolates shared multiple distinctive genetic features including five prophage insertions. Wang et al. (2015b) suggested that the human isolates were less divergent because of successful adaptation to the relatively stable human environment while the environmental strains (19-20 SNP differences from human isolates) were under increased survival pressure due to less favorable conditions. Schmid et al. (2014 ; Table 2) investigated a cluster of listeriosis in Austria and Germany by WGS where the human isolates shared PFGE and fluorescent amplified fragment length polymorphism profiles. Gene-by-gene comparison or cgMLST based on 2,298 genes revealed that four of the human isolates belonged to a single cluster differing by ≤6 alleles (genes). This cluster was distinct from but related to food isolates from two Austrian producers (differing by ≤8 and ≤19 alleles, respectively). The study did not explain if the allelic differences corresponded to SNPs or were more substantial. The other three human isolates were more distinct and unrelated to the outbreak cluster. Octavia et al. (2015) used SNP analysis in an attempt to define whether an isolate was part of an outbreak or not and identify whether one or more strains were implicated in an outbreak ( Figure 5). They modeled the mutation rate in S. typhimurium using 250 bp paired-end reads and estimated a cutoff value for the intra-strain number of SNPs the bacteria could have. When using a high or low substitution rate, and including a time limit of an outbreak from less than a month to up to 3 months the number of SNPs was estimated to differ from 2 to 9. Other studies have identified variable numbers of SNPs in Salmonella outbreaks. Several studies have reported 0-3 SNP differences in one outbreak Taylor et al., 2015;Wuyts et al., 2015). However, some outbreaks have reported to have larger SNP variation based on the core genome (Leekitcharoenphon et al., 2014). In concordance with many of the Salmonella reports, a low intra-strain number of SNPs (0-7) have been reported from epidemiologically linked cases of STEC O157:H7 (Turabelidze et al., 2013;Underwood et al., 2013;Joensen et al., 2014;Holmes et al., 2015;Jenkins et al., 2015; Figure 5).

Applications of Strain and Isolate Typing To Surveillance and Control
Surveillance of specific FBPs has been ongoing in public health laboratories for a long time, and can benefit from access to clinical and/or food derived isolates. A few countries and laboratories have implemented WGS as a routine typing tool for public health surveillance (i.e., on clinical isolates) for selected FBPs (Joensen et al., 2014;Ashton et al., 2016;Chattaway et al., 2016;Lindsey et al., 2016). Implementation of WGS as a standard typing tool for isolates from foods as well, is still in a start-up phase and routinely done in very few countries.
Denmark has implemented WGS typing of L. monocytogenes isolates from patients and the food surveillance program. Two unexpected genetic clusters, as classified by MLST type, were identified through the WGS analysis during 2013-2015 and further analyzed for SNP differences by mapping to a reference genome of the same MLST sequence type (Lassen et al., 2016). Another study on L. monocytogenes has developed a geneby-gene (cgMLST) method based on 1,748 loci among 957 genomes (Moura et al., 2016). High robustness was shown as different DNA extraction methods, library preparations and sequencing instruments were used as well as assembly-free and de novo assembly-based methods to ensure that the allelic profiles generated were the same despite differences in the WGS methodology. FIGURE 5 | Isogenic or non-isogenic isolates? The distance or number of observed differences between isolates, usually measured as single nucleotide polymorphisms (SNPs) in HTS studies, can provide clues to determine if isolates belong to the same strain, i.e., whether they are isogenic or not. This is important for outbreak investigations, epidemiology and to assess if a persistent strain is present in a food production system. Fewer than ten SNPs is often interpreted as evidence of an isogenic origin of bacterial isolates (see examples and discussion in the main text of this paper). Practice is currently not harmonized and also depends on the taxon in question, how SNPs or other differences are calculated, and which part of the genome the study covers (e.g., core or whole genome). An inferred phylogenetic relationship between nine isolates (A-I; terminal nodes) is shown. For each isolate, a blue letter (a-i) indicates the number of unique SNPs associated with each individual isolate. Internal nodes labeled X-Z connect three clusters of isolates, while internode N connects all isolates. Brown letters (x-z) indicate the number of shared SNPs separating each individual cluster of isolates from the others. The distance ( ) between any pair of isolates is the sum of SNPs (i.e., blue and brown letters) separating them, e.g., if a = 3, d = 2, x = 2 and y = 4 then AD = 3 + 2 + 2 + 4 = 11. The following two examples serve to illustrate the difference between putatively isogenic and non-isogenic clusters of isolates (with a threshold of 9 for isogenics): If a = b = c = d = e = f = 2, g = h = i = 3, x = 2, and y = z = 3 then all the isolates A-F might be considered isogenic (internal distance between any pair of isolates max ≤ 9), as might G-I ( max = 6), whereas A-F might not be considered isogenic with G-I (internal distance between any members from two different clusters min ≥10).
then only isolates G-I ( max = 2) might be considered isogenic (any other pair of isolates would yield min ≥11).
WGS and alignment-free SNP analysis were used to differentiate between persistent and repeatedly reintroduced strains of L. monocytogenes in a longitudinal study of foodassociated environments (Stasiewicz et al., 2015). The PFGE patterns suggested reintroduction due to observed differences. Patterns unique to single retailers or single states supported persistence or clonal spread. However, the WGS analysis revealed that the observed PFGE differences were caused by a single mobile element, suggesting persistent contamination. Identifying clonal isolates from different food-associated environments emphasize the importance of strong epidemiological data in traceback of foodborne outbreaks.
Both SNP-based approaches and cgMLST yield a high discriminatory power and are reproducible for comparison of isolates. A prerequisite for detailed typing methods is a clear understanding of what makes two bacteria isogenic (belong to the same strain or clonal lineage; Figure 5). Lack of harmonization complicates the conclusive linking of clinical and food isolates, epidemiology and tracing and tracking of contamination in food processing facilities. Expert opinions will depend on the bacterial species and how SNPs are calculated (whole, core or extended genome). A prerequisite when performing reference-based SNP analysis is the availability of good quality reference genomes from strains closely related to the target strain(s). Unfortunately, in case of outbreaks due to rare variants of the causative agent, such reference genomes are not always available. This is true even for some variants of pathogenic E. coli, Salmonella spp. and L. monocytogenes.

Metagenomics for Typing of Bacterial Communities and FBPs
Isolates are often unavailable and may be difficult to obtain. Most foods harbor complex and composite microbial communities. Contaminating pathogens are often heterogeneously dispersed and represent a minority of the microorganisms present in the sometimes complex food sample. Metagenomics approaches offer the opportunity to investigate the composition of microbes in food matrices in toto without selective isolation, including the detection of non-viable and "viable but not cultivable" microbes (Bergholz et al., 2014), and will capture a broader range of the microbial community than classical microbiology. Most of the numerous HTS metagenomics studies report on 16S rDNA analysis, i.e., what we refer to as metataxonomics. This approach has proven useful for identification of bacteria, phylogenetic studies and characterization of bacterial communities in different foods, water and other environments (Mayo et al., 2014;Kergourlay et al., 2015;Tan et al., 2015). However, the 16S rDNA has limited resolution power and cannot be used to detect non-bacterial taxa. Reliable 16S rDNA based classification of bacteria rarely extends beyond phylum, group or genus level (Livezey et al., 2013). A few FBPs can be identified by 16S rDNA sequencing, but only exceptionally at the species level and never to pathotype. All Salmonella species are considered as pathogens (Jarvis et al., 2015;Zhang et al., 2015), but only two of the Listeria species known so far are pathogenic, i.e., L. monocytogenes and L. ivanovii. These species are partly distinguishable based on 16S rDNA sequence. In contrast, 16S rDNA sequences cannot distinguish STEC from non-diarrheagenic or commensal E. coli. For identification of STEC, detection of virulence-associated genes including the Shiga toxin-and intimin-encoding stx and eae genes will be essential for meaningful strain typing. Database limitations (not all relevant taxa and haplotypes represented + possible erroneous sequences and taxonomic annotations) and the common use of only a subsection of the 16S rDNA (missing or covering only some of the inter-strain variation) further reduces the fitness of metataxonomics for FBP detection (Adeolu et al., 2016;Singer et al., 2016).
Shotgun metagenomics of the entire DNA present in a sample offers a more comprehensive insight into the microbial diversity of a sample with regard to the richness of microbial taxa at all levels, or with regard to the presence of gene families or biomarkers in general (Ferri et al., 2015;Blagden et al., 2016;Ranjan et al., 2016). Shotgun metagenomic approaches used for improved FBP detection have been tested in a few studies. Bioinformatics methods to distinguish two genomes of the same species in a complex sample are needed, but it seems possible to determine whether one or more FBP strains is involved (Leonard et al., 2016). In investigations where it is essential to detect a specific organism assumed present in low numbers in a complex matrix, culture-based bacterial enrichment is still necessary. Then, inevitably, an enrichment bias of the composition of the bacterial community relative to the original sample will emerge.

Shotgun Metagenomics in Outbreak Investigations
When screening of food products and even in outbreak investigations, the genome sequence of the specific FBP strain/strains is usually not available. HTS technology may be used to identify the genome sequence of the causative agent of the outbreak by de novo assembly of sequence reads from complex samples with high prevalence of the agent, e.g., clinical specimens or food samples (Figures 6, 7). Theoretically, the approach shown in Figure 6 can also be applied to control of food products by the manufacturers or enforcement authorities. However, the current costs and other resource requirements (skills, lack of standardized data analyses, time) prevent justification of the approach for routine controls.
In a retrospective study of the German/European STEC O104:H4 outbreak in 2011 it was demonstrated that such an approach could be used to identify the infectious agent in human fecal samples (Loman et al., 2013; Table 2). Fortyfive fecal samples from patients were sequenced by shotgun metagenomics. Human DNA was subtracted in silico and assembly was performed to create environmental gene tags (EGTs). EGTs found in more than 20 of the fecal samples were selected for further analysis. The total outbreak metagenome was screened for sequence reads from healthy humans and matching EGTs were subtracted. A set of 450 outbreak-specific EGTs were then subjected to taxonomic analysis and almost 65% were assigned to the Enterobacteriales. The original metagenomics data sets were then used in an attempt to reconstruct the E. coli outbreak strain genome. Functional annotation confirmed the presence of important strain-specific and virulence-associated genes, and ten samples had more than 10× coverage of reads mapped to the reference genome of the specific outbreak strain. The coverage was > 1 in 26 samples and Shiga toxin genes were detected in 27 of 40 STEC-positive samples. In some of the individual samples sequences from other human pathogens such as Campylobacter, Salmonella, and Clostridium difficile were also identified. This study indicates the potential of shotgun metagenomic analyses for the culture-independent identification of bacterial pathogens in samples with complex microbial composition.
FIGURE 6 | Reference guided metagenome sequencing based approach for identification and characterization of pathogenic and outbreak associated strain(s). In case of an outbreak, fecal samples from patients are subjected to culturing, in order to isolate the outbreak strain. Patients are also interviewed in order to try to identify food products that may be the source(s) of infection. The metagenomes of stool samples, food products and cultured strains can then be amplicon sequenced (metataxonomics) or shotgun sequenced (metagenomics) and the data mapped to reference databases for identification of virulence markers. Shotgun reads can also be assembled into larger contigs or genomes for identification of pathogenic strains. The latter is facilitated if the sequence data are derived from single isolates. Black arrows indicate forward flow direction of the analysis, while gray arrows indicate feedback changing the premises for earlier steps. Feedback from the sequencing analysis can be used to refine and narrow the search for a specific FBP. If successful, the outbreak will be terminated. This review includes multiple examples of the application of the described approach to outbreak investigations.

Shotgun Metagenomics for Food Surveillance and Control
A selection of studies is described below and details are presented in Table 3. Tomatoes have been implicated in Salmonella outbreaks several times, but isolation of Salmonella from tomatoes has only been successful a few times. Ottesen et al. (2013) used shotgun metagenomics to describe taxa associated with pre-enrichment and throughout the enrichment steps of a protocol for Salmonella detection in environmental tomato samples ( Table 3). DNA was extracted prior to enrichment and the remaining tomato samples were enriched overnight in a universal pre-enrichment broth and aliquots successively added to two different growth media. The sequencing depth was insufficient to capture the majority of the diversity within the samples. To achieve about 1× coverage of all genomes Ottesen et al. (2013) estimated that they would have needed approximately 250× more sequence data. Variation among samples suggested differences in the microbial community in the starting material of the samples. An important biological FIGURE 7 | Reference independent shotgun metagenome sequencing based approach for identification and characterization of outbreak strain(s). In case of smaller outbreaks the possibility to compare metagenomes from affected people (patients) and healthy controls is limited. In these cases the availability of clinical isolates may be required to avoid exhaustive open ended bioinformatics (in silico) analyses, as exemplified by Brzuszkiewicz et al. (2011) and Rasko et al. (2011). Environmental gene tags (EGTs) from metagenomes of people affected by the outbreak and controls (people not affected) can be compared in case of a larger outbreak, as exemplified by Loman et al. (2013). In that study, EGTs present only in affected patients were characteristic of the outbreak strain and provided sufficient information to near complete characterization of its genome. Scaffolds and in particular assembled genomes may and should be uploaded to reference database(s), for successive use in analytical approaches like those described in Figure 6.
finding was the significant enrichment of Paenibacillus sp. from uncultured to cultured samples. This taxon is known to inhibit and kill Salmonella. The study also identified a number of sequences as Salmonella-specific despite negative PCR and culture results when those samples were tested for Salmonella. A comparison of results from the two different applied assembly approaches showed that increased read length, contrary to what might be expected, reduced the ability to assign taxonomy. Others have made similar observations . It is not clear if this phenomenon is associated with database limitations. Jarvis et al. (2015) aimed to characterize the microbiota in cilantro (coriander leaves), and simultaneously identify Salmonella from the samples ( Table 3). Metataxonomics based on sequencing of the 16S rDNA from 91 samples was complemented with shotgun metagenomics. Gram-negative Proteobacteria dominated in the cilantro samples before enrichment. After 24 h of enrichment the microbial composition had shifted to mainly Gram-positive Firmicutes, as described above for tomato (Ottesen et al., 2013). These findings suggest that the culture-based method should be optimized for the detection of the organisms of interest. Low detection of Salmonella by metataxonomics was thought to be due to low sequencing depth and/or reduced amplification efficiency caused by imperfect match in one of the primers. Shotgun metagenomics was performed on six cilantro samples culture-positive for Salmonella (enriched samples), and variable levels of Salmonella were identified. The genomes of the Salmonella isolates from these samples were already fully sequenced and were therefore included in the reference database used in the similarity analysis. The variable levels of Salmonella detected after 24 h enrichment illustrate the challenge of detecting Salmonella in matrices with a complex microbial background. Again, the sequencing depth of the analysis and levels of contamination were reported to influence the ability to detect the suspected agent.
Similarly, predominance of other species than L. monocytogenes was observed until the end of the enrichment procedure (after 40 h) in a study on L. monocytogenes and associated microbiota in naturally contaminated ice cream (Ottesen et al., 2016; Table 3). Leonard et al. (2015) applied shotgun metagenomics to detection of STEC in bagged spinach ( Table 3). Spiked samples with known concentrations of a STEC O157:H7 were sequenced and sufficient coverage of the genome required spiking with at least 10,000 colony forming units (CFU) of STEC per 100 g spinach followed by enrichment for 5 h to enable full pathogen characterization. However, enrichment for 23 h allowed the full pathogen characterization by shotgun metagenomics from as little as 10 CFU of STEC spiked into 100 g of spinach. Then, the sequencing coverage of the STEC strain was 184× and the consensus sequence after reference-based assembly covered the whole reference genome with only six gaps. However, reliable detection to low levels like 10 CFU of STEC per 100 g of spinach, required enrichment for at least 8 h. Then, approximately 2.9% of the reads could be mapped to the reference genome with coverage of approximately 10×. Leonard et al. (2015) concluded that this should be sufficient to enable DNA sequence-based determination of the serotype and essential virulence genes of the contaminating pathogen.
The same team demonstrated the possibility to detect and identify STEC down to strain-level in spinach samples spiked at 10 CFU/100 g of spinach using a variety of STEC strains (Leonard et al., 2016; Table 3). A shotgun metagenomics approach as described in the abovementioned study (Leonard et al., 2015) was applied. For microbial community analysis a database of unique 25-mers for species identification was used. This k-mer approach could also differentiate between E. coli phylogroups and demonstrated presence of more than one E. coli phylogroup in some of the samples. Conserved chromosomal E. coli genes (2,542) were extracted from WGS data from the STEC strains used in the spiking experiment as well as the metagenomic assemblies for whole genome phylogeny and SNP analysis. When the metagenomic assemblies only included the spiked STEC strain or the abundance of other E. coli was much lower than the spiked strain, the number of mismatches from the SNP analysis was less than 20, i.e., at or close to the intra-strain SNP-variability level (Figure 5).
The above-mentioned studies indicate that at present, it is difficult to achieve the large number of reads required for ≥1× coverage. These studies also showed how enrichment will bias the microbial composition, potentially favoring other taxa or strains than those intended for enrichment. Since multiple samples often need to be analyzed, it would be cost-efficient if <1× coverage would be sufficient for most samples. K-mer approaches to screen samples after a minimum of enrichment (see e.g., Ondov et al., 2016) to classify samples according to risk of presence of FBP, could be used to reduce the number of samples for which more in-depth sequencing and analysis is needed. Significant correspondence between a mapped read and FBP specific reference sequences may also provide enough evidence even at <1× coverage (Spilsberg et al., 2017). Better sample preparation and enrichment protocols could also contribute. Results from metataxonomics and shotgun metagenomics could aid in optimization of such protocols.

Viral Food Pathogens
Complete genome assemblies for 7,409 viruses were available from NCBI on August 16th 2017 (an increase of nearly 500 in 7 months). Viruses lack the genes necessary to transcribe protein-coding genes and replicate and reproduce, and therefore depend completely on the biological machinery of their host cells (Moreira and Lopez-Garcia, 2009). They are highly diverse and lack a common genetic constitution such as genes coding for ribosomal RNAs (Moreira and Lopez-Garcia, 2009), limiting the metataxonomic options. Some viruses can be cultured, but cultivation require advanced protocols, suitable host cells for propagation, and is time-consuming (Rodríguez-Lazaro et al., 2012). Viruses play a dual role in food pathogenesis. Some viruses like bacteriophages can affect the virulence and population structure of microorganisms (Hayes et al., 2017). Other viruses can be FBPs themselves (Newell et al., 2010;EFSA, 2011). For a virus to be transmissible through foods, it must have some environmental stability and remain infectious for some time on or in a food matrix. As viruses are unable to replicate in the food matrix they must have a low infectious dose. Foodborne viruses are commonly shed in large amounts by a fecal route and viral contamination of foods is primarily via human fecal material (Rodríguez-Lazaro et al., 2012). Normal cooking or frying inactivates viruses. Food sources of infections are typically raw foods like fresh produce, soft berry fruits, herbs, shellfish, ready-to-eat products in general and undercooked meat or foods  served cold that are contaminated by an infected food-handler post cooking (Halliday et al., 1991;Hedberg and Osterholm, 1993;de Wit et al., 2003;Fiore, 2004;EFSA, 2011EFSA, , 2015a. Most viruses that can infect through a foodborne route can also utilize a person-to-person infectious route. Many outbreaks, potentially the majority, are simultaneously propagated via combined person-to-person and foodborne infectious routes. Food and water associated transmission is also suspected to enhance the spread of zoonotic viruses and facilitates the occurrence of zoonotic events, e.g., through the handling of bushmeats (Nieuwenhuijse and Koopmans, 2017 and refs. therein). The viral FBP load is often low and heterogeneously dispersed in the food matrix while it is high in clinical patients. It is not trivial to distinguish between foodborne and person-to-person infections, unless initial cases are identified and analyzed. The reporting and surveillance of foodborne viruses is limited, and disease symptoms can be very similar for some viruses and for other viruses differ substantially between infected individuals. All this results in low documentation of the real impact and diversity of foodborne viruses (Nieuwenhuijse and Koopmans, 2017). The most notable FBP viruses are norovirus (NoV), hepatitis A virus (HAV) and hepatitis E virus (HEV) which are all positivesense, single-stranded, non-enveloped RNA viruses, and the double-stranded RNA rotavirus (RV) (Newell et al., 2010;EFSA, 2011). Severe acute and Middle East respiratory syndrome (SARS and MERS) and Ebola viruses are examples of other (zoonotic) RNA viruses suspected to be transmissible via food (Newell et al., 2010;Nieuwenhuijse and Koopmans, 2017 and refs. therein). Most FBP viruses are RNA viruses and require special sample processing and nucleic acid extraction methods in contrast to the vast majority of bacteriophages, which are double-stranded DNA viruses.
Viruses, and in particular RNA viruses, evolve rapidly, both via genetic drift and in response to active selection pressures (Holland et al., 1982). Detection and genotyping with PCR approaches, including metataxonomic HTS approaches, can therefore fail. Particular sample processing steps can significantly improve the probability of detection, e.g., by concentration of the viruses and/or removal of interfering substances [reviewed in Hartmann and Halden (2012); see also (EFSA, 2011)]. Detailed understanding of virus stability, inactivation times and temperatures is lacking due to limitations in model systems (Cook, 2013).

Outbreak Investigations
Isolates of viruses are rare in clinical settings, but clinical samples from viral outbreaks can contain high titers of the causative virus strain(s). The starting points for outbreak investigations are therefore availability of clinically derived samples. Due to the relative heterogeneity of samples compared to isolates, the analytical approaches derive from metagenomics approaches. The small genome size and high mean coverage per nucleotide, however, usually allows for characterization at strain level.
The severity of HAV infection varies strongly with age. HAV is endemic in regions with inadequate sanitation and limited access to clean water, creating population wide immunity. Contrastingly, regions with high quality sanitation and water supply can experience outbreaks of hepatitis A unless broad scale vaccination has been performed. Nearly 300,000 persons were infected in a clam-related epidemic of HAV in Shanghai in 1988 (Halliday et al., 1991) while 1,589 were reported infected, including 2 casualties, in a recent outbreak in Europe associated with frozen berries (Severi et al., 2015). In a recent epidemiological case study, a single food product was associated with a HAV outbreak (Collier et al., 2014; Table 2). HAV was extracted from serum and fecal samples from 120 patients. Metataxonomics targeting a 315 bp HAV fragment yielded 117 (98%) positive for HAV genotype IB. Of these, 99 (85%) were identical in the 315 bp sequenced segment. Attempts to isolate HAV from the food product (frozen pomegranate arils imported from Turkey) were unsuccessful. This demonstrates the challenge of establishing etiologies. Chiapponi et al. (2014) analyzed two samples of frozen berries from two apparently unrelated HAV outbreaks in Italy, and successfully detected HAV by reverse transcription, quantitative PCR ( Table 2). RNA was then extracted from the two samples and complete subgenotype IA HAV genomes of 7,398 and 7,393 nucleotides, respectively, were obtained by a combined RNAseq and amplicon HTS strategy. Chiapponi et al. were able to link the two outbreaks and also link the food derived sequences to an existing patient derived sequence by ≥99.9% nucleotide identities.
RV primarily infects children, is the most common source of gastroenteritis among infants (Desselberger, 2014) and is estimated to cause approximately 5% of total child deaths worldwide. An outbreak of foodborne gastroenteritis in Japan in 2012 caused by RV was probably associated with consumption of raw sliced cabbage (Mizukoshi et al., 2014; Table 2). Samples from patients and food handlers were positive for RV. The study combined a broad spectrum of tests. One clinical, fecal sample was subjected to RNAseq. Sequences of all 11 segments of the viral genome, sufficient to determine the specific viral strain, were identified. No food, however, was assessed for presence of pathogens.
HTS has accelerated the study of viral genetics, the assembly of novel viral genomes, and molecular epidemiology of viral outbreaks (Finkbeiner et al., 2009;Kundu et al., 2013;Wong et al., 2013;Smits et al., 2014;Ganova-Raeva et al., 2015). Studies of virus variants and quasispecies, analyses of vaccine escapes and drug resistance and viral evolution are now almost exclusively performed with HTS (Barzon et al., 2011).
The etiology of virus-associated outbreaks will often remain unknown, for various reasons. Finding and characterizing the causative virus may not be included among the analytical objectives, or the applied method is not always sufficiently sensitive. Other reasons for failure can be genetic drift in the virus, or an outbreak caused by a novel virus. Discovery of pathogens almost exclusively starts with samples from a diseased hosts and not a food matrix. There are many examples of investigations unraveling the etiology of viral outbreaks, but very few where the source of infection is identified or a specific food suspected and tested. It is difficult to establish if an outbreak started as a food contamination event, and it is reasonable to assume that these events are underreported. The small genome sizes of viruses may partly explain why the massive capacity of HTS is rarely used in viral FBP outbreak investigations, as may the limited availability of effective enrichment methods. The potency of new HTS platforms to provide valuable epidemiological information to large viral outbreaks was recently demonstrated by Quick et al. (2016), although on clinical isolates. A similar approach could be applied to epidemiological monitoring and source identification in case of a foodborne outbreak.

Surveillance and Control Purposes
The viral titers in food products are commonly much lower than in clinical samples, and isolates are not available. As for outbreak investigations the approaches applied for surveillance and control are metagenomics derived while the end-point is strain characterization. Aw et al. (2016) used assembly of RNAseq and shotgun metagenomics reads (mean contig size 680 bp) and mapping to a database of viral reference genomes on samples from field grown and retail lettuce ( Table 3). A small fraction of the reads corresponded to RV and other viruses that infect humans. In the study, 16S rDNA metataxonomic screening was used to verify absence of contaminating bacterial DNA. Fernandez-Cassi et al. (2017) used RNAseq and mapping to examine the viral contamination of fresh parsley plants irrigated with fecally tainted river water (Table 3). A small fraction (<1%) of the reads was related to FBPs, including among others HEV and NoV.
NoV is extremely contagious, and can cause large outbreaks of gastroenteritis (de Wit et al., 2003). NoV is the no. 1 cause of diarrheal disease and mortalities in the world (Pires et al., 2015) and the leading source of foodborne illness in the USA (Scallan et al., 2011). The etiology is reviewed elsewhere (Moore et al., 2015). Imamura et al. (2016b) used HTS to characterize NoV diversity in shellfish from two commercial producers in Japan, using a combination of RNAseq on virus suspensions and PCR enriched targeted sequencing (genotyping approach similar to metataxonomics; Table 3). NoV genotypes GI.3 and GI.4 were most prevalent and identified in a surprisingly high proportion of 20-25% of the samples. This proof of concept study could not actually address the diversity of NoV in single shellfish as 3 individuals were pooled to one sample prior to analysis to obtain sufficient starting material. The same genotyping approach was applied to a study of the efficiency of removing NoV from shellfish by depuration (Imamura et al., 2016a; Table 3). Depuration is used to decontaminate commercial shellfish. The study demonstrated that depuration is insufficient with respect to NoV.

Fungal Food Pathogens
On August 16th 2017, there were 2,515 genome assemblies and 29 complete fungal genomes available from NCBI. So far, there are very few examples of application of HTS to studies of epidemiology and virulence of fungal FBPs Lee et al., 2014;Litvintseva et al., 2014Litvintseva et al., , 2015Vaux et al., 2014). Only two published studies relate to specific cases of fungal food pathogenesis (Lee et al., 2014;Vaux et al., 2014). Molecular typing of fungal isolates and mycobiotas in relation to FBD is almost exclusively metataxonomic, but usually limited to PCR amplification of one or a few genetic loci followed by Sanger sequencing with exceptional examples of MLST analysis in the published literature (e.g., Byrnes et al., 2010;Desnos-Ollivier et al., 2015;Wang et al., 2015c). The relevance of fungi as FBPs is debated, significantly lower than that of bacteria and viruses, but possibly also understudied.
Plants are commonly infected in the field by "field fungi." Some of these produce toxic metabolites and may consequently cause disease when plant derived products are consumed Stoev, 2015). Typical examples are Fusarium spp. on cereal grains, producing zearalenone, fumonisins and trichothecenes (e.g., deoxynivalenol, T-2 and HT-2 toxin), and Penicillium spp. on fruits, producing patulin. Other fungi infect a broad range of food and feed products during storage ("post-harvest fungi"). Typical examples are Aspergillus spp. and Penicillium spp., producing acute or chronically toxic compounds such as aflatoxins, ochratoxins and citrinin, and multiple antimicrobials with indirect health effects via modulation of the gut microbiota (Gillings et al., 2015;Stoev, 2015). Some mycotoxins are persistent to food processing (EMAN, 2015). They can exacerbate the infections with a wide range of non-fungal and fungal pathogens (Antonissen et al., 2014;Stoev, 2015), and it is hypothesized that mycotoxins may contribute to fungal infections (mycoses; Withlow and Hagler, 2016).
Several fungi are opportunistic, infective FBPs (Clemons et al., 2010;Iriart et al., 2010;Gurgui et al., 2011;Kazan et al., 2011;Benedict et al., 2016). Invasive fungal infections are primarily a problem in immunocompromised people Bitar et al., 2014;Benedict et al., 2016). Reported examples of verified foodborne fungal infections are sparse (Benedict et al., 2016), and the problem is perhaps under-investigated. Hitherto, there is only one example of the use of HTS to investigate a fungal FBP outbreak. We suspect that the investigation of several other outbreaks or sporadic cases of fungus associated FBD reported in the literature (Benedict et al., 2016) could have benefited from application of HTS approaches similar to those described for other FBP taxa in this review.

Outbreak Investigation
As with bacteria the starting points for fungal foodborne outbreak investigations are usually clinically derived isolates. Both strain characterization and metagenomics approaches can be used and are described in the literature, but only one published example applied HTS.
Mucoralean fungi cause mucormycosis (zygomycosis), fatal fungal infections in humans whose incidence has been increasing lately (Roden et al., 2005;Spellberg, 2012). Gastrointestinal mucormycosis is rare and thought to be secondary to ingestion of fungi (Roden et al., 2005). In 2013 a strain of Mucor circinelloides f. circinelloides, the most virulent subspecies of M. circinelloides, was found as a contaminant in a batch of yogurt in the USA (FDA, 2013). More than 200 consumers became ill, although no fatalities were recorded (Lee et al., 2014). The affected consumers were immunocompetent. An isolate obtained from a yogurt container was subjected to WGS in order to characterize its genetic potential to cause significant infections, and to establish its genetic relationship to other strains of M. circinelloides (Lee et al., 2014; Table 2). A reference genome assembly was obtained from a strain of M. circinelloides isolated from human skin (Findley et al., 2013). Reads from the yogurt isolate were mapped to the reference genome for SNPs analysis. Comparison with a third isolate of a Mucor sp. using whole-genome alignments, and pathogenicity studies in murine models contributed to verify the pathogenicity of the yogurt strain. No clinical isolate from the outbreak was included in this study.

Parasitic Food Pathogens
Parasites are a diverse (polyphyletic) group of small animals or animal-like Eukaryotes, and their pathogenic potential is less frequently linked with metabolites and more frequently with their energy consumption and predation on host tissues than bacteria and fungi. World-wide, more than 100 species of foodborne parasites cause disease in humans (Orlandi et al., 2002). Complete or draft genome assemblies were available from NCBI for at least 40 of these species on August 16th 2017. Globalization, i.e., the movement of people, animals and food and feed increases the risk of moving and spreading parasites that are originally endemic, to new countries and hosts (Robertson et al., 2014). The lack of suitable enrichment methods for parasites, as opposed to most of the known bacterial and fungal FBPs, means that recovery/isolation steps are particularly important (Robertson et al., 2014). This also suggests that molecular detection can be useful to monitor presence, distribution and epidemiology, as well as the efficiency of clinical treatments after parasite infections.
Published studies applying HTS technologies to parasites are with few exceptions limited to characterization of genomes and transcriptomes, a topic not covered here. These studies, however, provide for detailed insight into the genetics of adaptations to specialized parasitism, and provide urgently needed reference sequence data for detection and identification purposes and clues to possible target/drug combinations (e.g., Tsai et al., 2013;Foth et al., 2014;Young et al., 2014;Barratt et al., 2015).
The presence of a multitude of other taxa and the frequently low abundance of parasites or derived DNA in clinical fecal samples challenge the detectability and may prevent effective sampling and purification of DNA for detection of parasites. Specialized protocols may be required to purify and enrich the parasite relative to a background matrix such as food or feces. Improved sample preparation and enrichment methods are discussed later.
Parasites are Eukaryotes with substantially more genetic similarity to their human and animal hosts than Bacteria. The size of parasite genomes (10-1,000 Mbp) ranges from 10× the size of bacterial genomes and up to nearly the size of the human genome. New databases are in development, collecting genomic information for various parasites (Martin et al., 2015). The lack of annotated genomes and transcriptomes has been a major obstacle to the effective use of HTS for detection of parasites. Molecular discrimination is dependent on detailed knowledge of genomes and genetic variation. However, many of the parasites are detectable by visual, macro-or microscopic inspection, and this is likely to be a more cost-efficient approach in many instances.

Outbreak Investigation
The intraspecific variation in virulence among foodborne parasites is, to our knowledge, not reported to be high or significant. Combined with the large genome size of Eukaryotes, this suggests that metataxonomic approaches can be sufficient for detection and outbreak investigations.
More than 200 seafood poisoning cases of unknown etiology were reported in Japan from 2008 to 2010. Victims commonly reported to have ingested raw Paralichthys olivaceus (a flounder). Kawai et al. (2012) therefore extracted total DNA or RNA from frozen P. olivaceus filets for shotgun metagenomics and RNAseq ( Table 2). In parallel, muscle tissue was sieved to recover spores from suspected parasite infections. The presence of spores and 18S rDNA from Kudoa septempunctata in the fish samples was observed. The pathogenicity of this myxosporean was confirmed in suckling mice and house musk shrews.

Vectors of Foodborne Pathogens
In addition to pathogens naturally associated with the food producing organisms such as gut microbes, dermal yeasts and bacteria, plant pathogenic fungi, etc. many of the most severe food pathogeneses are caused by incidental transfer from animal vectors (pests; Olsen et al., 2001;Jones et al., 2013). The Food and Drug Administration has identified the 22 most common pests contributing to the spread of FBD in the USA (Olsen et al., 2001). Four of these are rodents (mouse and rat species), while the remaining 18 are insects (cockroaches, ants and flies). The traditional approach to detect and identify these is microscopy.
Recently it was proposed to use metataxonomics by sequencing of the mitochondrial cytochrome c oxidase subunit I (COI) as a faster, more reliable, sensitive and cost-efficient approach (Jones et al., 2013). COI metataxonomics can be performed using HTS (see Figure 4) and may therefore suit as an attractive approach to screen routinely for presence of these vectors in raw materials for food production, to reduce the risk of introducing pathogens in the production. Vectors other than, or in addition to the 22 identified by Olsen et al. (2001) may be identified as particularly relevant, e.g., in other parts of the world. The introduction of additional sequence targets in an HTS based metataxonomic screening should be feasible (Lammers et al., 2014;Leray and Knowlton, 2015;Arulandhu et al., 2017) despite the limitations of COI and metataxonomic approaches in general (Deagle et al., 2014;Staats et al., 2016;Arulandhu et al., 2017).

Other Applications of HTS with Potential Relevance to FBPs
The discovery and characterization of biosynthetic gene clusters in microorganisms is radically facilitated with the availability of HTS (Cacho et al., 2015). Secondary metabolites are often toxic, and many of the microorganisms producing them are FBPs. Characterization of the biosynthetic pathways provides a basis for faster detection of agents producing the metabolites, as well as for interception of undesirable biological effects and exploitation of the biosynthetic pathways for production of new bioactive compounds.
Direct shotgun metagenomics on DNA purified from microorganisms isolated without enrichment culturing is an attractive approach. Such an approach applied to clinical, polymicrobial urine samples was found to have comparable identifiability of bacteria as more conventional and much more time consuming approaches . The complexity of such urine samples may be comparable to that of some food matrixes. Several approaches to data analysis were taken. First: identification of microorganisms by presence of specific k-mer motifs identified in a database of complete bacterial genomes. Secondly: alignment based subtraction of host-DNA derived reads and estimation of relative bacterial species distribution. Third: mapping of reads against a larger database of complete and draft bacterial, archaeal, fungal, protozoan and viral genomes. Finally, MLST and resistance gene identification performed against relevant databases. In a follow up study, direct shotgun metagenomics on toilet waste samples from long-distance flights was used to identify bacteria and antimicrobial resistance genes (Petersen et al., 2015) by mapping of reads to the abovementioned databases.
Host-specific genetic markers may be present in strains of pathogenic and non-pathogenic taxa, and then hold potential for source tracking (Gomi et al., 2014). See also Franz et al. (2016) for a more detailed discussion on source attribution.

Acknowledging the Importance of Sample Preparation
A major challenge for detection of FBPs in complex matrixes is the recovery of the organism or its genes. Foods are physically and chemically complex matrices hosting complex microbial communities. The extraction and purification of DNA from the samples is of major importance as the presence of inhibitors may hamper the analysis further down the line (Ceuppens et al., 2014;Moore et al., 2015). The sample storage conditions and preparation steps can introduce taxonomic biases linked to recovery (Ceuppens et al., 2015;Menke et al., 2017). Similar or even stronger effects on the microbiome and RNA population of samples are predictable, as also the intraspecific diversity and gene expression can be affected. The size of isolated/purified nucleic acid fragments is also relevant. Longer fragments are usually superior and complementary to shorter fragments for genome assembly. The purity and relative concentration of target isolated nucleic acids affect their detectability and certainly their quantifiability (Holst-Jensen et al., 2003). Different taxa and life stages of potential pathogens require different treatments for detachment from substrates and complex structures, recovery and lysis. Protocols required for extraction and purification of DNA from strongly attaching, biofilm-forming taxa with tough cell walls can cause shearing of nucleic acids of other relevant taxa, yielding undesirably short fragments or totally degraded RNA/DNA. The stability of nucleic acids is a critical parameter, and this topic is reviewed by Ceuppens et al. (2014). Enrichment processes required to achieve the necessary target concentration can also bias the post-enrichment microbial composition in undesirable ways, as illustrated with Salmonella spp. in examples discussed earlier (Ottesen et al., 2013;Jarvis et al., 2015).
Examples of non-culturing based enrichment of particular organisms include size based filtering, buoyancy, affinity based columns or immuno-magnetic separation (Hadfield et al., 2015). Molecular enrichment can be achieved by subtraction hybridization (Galbraith et al., 2004) and various combinations of digestion with restriction enzymes, adapter ligation and sequence specific amplification (Leichty and Brisson, 2014;Arulandhu et al., 2016).

Specific Challenges Related to Molecular Analyses
Ability to discriminate viable/infective agents from dead/noninfective agents and to obtain isolates of the agent(s) from food samples for comparison purposes is often important (Ceuppens et al., 2014;Forbes et al., 2017). Molecular analytical methods can potentially circumvent need for enrichment culturing and further selective steps for the detection of many of the most important FBPs, but the issues of viability, recovery and LOD remain critical. Transcriptomics or use of propidium monoazide prior to sequencing are options to discriminate living from dead cells (Weinmaier et al., 2015). There is no harmonized requirement for detection and/or identification of an FBP by HTS, neither with respect to the number of reads, coverage of a specific diagnostic target motif, minimum length of (assembled) contig or maximum number of mismatches. Depending on the choice of actual minimum performance parameters and associated acceptance values it is possible to calculate the probability of detection (POD) of relevant target(s) and perform a statistical comparison of the POD to the actual observations Spilsberg et al., 2017). This would provide clues to the probability and reliability of findings. POD calculations performed prior to analyses can be useful to assess the cost-efficiency of alternative approaches. HTS is generally not quantitative, and complementary tools are needed to verify the LOD and recovery. Direct sequencing of nucleic acids from a clinical or food derived sample can be faster than culturedependent analytical approaches. However, successive analysis of HTS data, necessary for interpretation, may be too timeconsuming to render HTS really competitive in many cases. For bacteria in particular, the combination of limited enrichment culturing and fast HTS with (semi-)automated bioinformatics is currently the most optimal, realistic option that can provide very detailed information while simultaneously offering sufficient sensitivity and speed. For those FBPs that cannot be enriched by culturing, future pipeline developments may improve the situation, at least for certain types of matrixes and scenarios.

Harmonization and Validation of HTS Approaches
The use of different technologies and methods negatively affect the comparability of results (Junemann et al., 2013), as does lack of harmonized terminology and data interpretation (Lambert et al., 2017;Taboada et al., 2017). Recent studies have tried to overcome this variability in end-point results by establishing standard WGS data sets from outbreaks with Salmonella, STEC and L. monocytogenes (Timme et al., 2015) and a benchmarking dataset consisting of 101 whole genome sequences from one E. coli hypermutator strain (Ahrenfeldt et al., 2017). Use of these data sets is proposed to facilitate standardization and harmonization of bioinformatics pipelines (Timme et al., 2015;Ahrenfeldt et al., 2017).
Traditional approaches to standardization and validation cannot be directly applied to HTS approaches. There is for example no common cognition as to what is required for "identification" of a FBP, such as the number of organismspecific reads (positive calls), sequence depth (confidence), LOD, etc. for each platform and protocol, or number of differences distinguishing isogenic from non-isogenic isolates (Figure 5). There are, however, several efforts to improve the situation. Guidelines for harmonization of clinical testing by application of HTS have been published and may provide useful guidance also to other sectors including FBP detection (Gargis et al., 2012(Gargis et al., , 2015Weiss et al., 2013;Aziz et al., 2015). The Global Microbial Identifier (GMI) was initiated in 2011 (GMI, 2011;Aarestrup et al., 2012). The background for this initiative was the high number of microbiological isolates that are characterized annually with very diverse and expensive typing systems, the increasing number of infectious diseases with global epidemiology requiring rapid detection and identification of microbial agents, the likelihood that microbiological laboratories (primarily clinical) will have DNA sequencers available in the foreseeable future, and that the likely future limiting factor is not the HTS cost but the assembly, processing and data handling in a standardized way to make the information useful (GMI, 2011). In 2015 the GMI launched the first HTS proficiency testing scheme in this respect (GMI, 2015;Moran-Gilad et al., 2015). Recent progress by GMI is reported by Taboada et al. (2017). The World Organization for Animal Health also recently launched its first standards for HTS, bioinformatics and computational genomics (OIE, 2017). Two other examples are the International Organization for Standardization's initiative to standardize WGS for typing and genomic characterization (ISO/TC 34/SC 9/WG 25) and the initiative to standardize the format of HTS derived SNP data (in a cancer context; Pipan and Kunaj, 2015), respectively.

Results Interpretation
HTS can generate a vast amount of data describing the microbial community of foods. When technical issues such as speed and standardization of laboratory and bioinformatics methods are improved, HTS and especially shotgun metagenomics may find applications in routine analysis of food, also for control purposes, as it has already done in clinical settings. Interpretation of the results is an important issue, for both the agri-food industry and the competent authorities, but also how to act on the results (Lambert et al., 2017;Taboada et al., 2017).
In the quality control of products and production lines, the detection of a pathogen, a genus that includes pathogenic species, and/or specific virulence genes/factors may, but will not necessarily lead to withdrawal of the product or lead to decisions to decontaminate the entire production line. The decision will depend on the specific context, i.e., the specific combination of pathogen marker, how well characterized the pathogen is, the type of product and the intended use. Quantitative data may also play a role, but sometimes are not or cannot be made available. Legal requirements can direct the decision processes. In the European Union (EU), both food safety criteria and process hygiene criteria are listed with details on sampling plans, limits, analytical reference methods and stage where the criterion applies (European Commission, 2005). Similarly, the safety of the food supply chain in the USA is regulated (FDA, 2011).
The application of WGS in microbiological risk assessments of foods is largely unexplored and faces important challenges. The number of hazards increase exponentially when zooming from serovars or serotypes to genotypes. The translation of multidimensional genotypic data into reduced information on phenotypes to ultimately generate a measure of risk that matches the requirements of food safety authorities and policy makers is therefore necessary (Franz et al., 2016). The presence or absence of specific genetic markers in a bacterial isolate or in a sample can direct the acceptance or rejection of a particular product. Furthermore, specific detection of FBP-markers may only be required within a narrow concentration range. For example, the observed presence of the STEC-associated virulence genes stx1 or stx2 in ground beef can lead to rejection or at least to particular caution and further examination to preclude the presence of STEC. We suggest that the observed background level of E. coli can provide a basis for decision in the case of STEC. The presence of E. coli itself is undesirable and un-tolerable background levels of E. coli, e.g., ≥ 10 4 CFU g −1 should lead to rejection. Thus, we believe it is most critical to ensure that the POD for STEC (markers) is acceptable within the entire range of concentrations at tolerable background levels, e.g., POD ≥ 99% at < 10 4 CFU E. coli g −1 . We suggest that such an approximation can be used to guide both method developments and the enrichment efforts needed to obtain conclusive data. The infectious dose of FBPs vary, and for some FBPs it can be < 100 cells (Tilden et al., 1996;Tuttle et al., 1999;Hall, 2012) and must be considered for this type of calculation.
Recently, it has been questioned if all protists and helminths, historically thought of as "parasites" are indeed parasites (Lukes et al., 2015). Growing evidence of the diverse functional roles that specific bacteria can take on in humans, has forced us to acknowledge that beneficial and pathogenic may be two sides of the same organism, and that it must be understood in the broader contexts of ecology, symbiosis and diversity, among others. Collecting data on the entire microbial communities associated with their hosts may provide for better understanding of the functional interplay between hosts and hosted microorganisms, though so far no clear trend is emerging (Martin et al., 2015). In the future, perhaps, by learning more aided by HTS studies, it will be possible to stimulate beneficial and prevent pathogenic behavior by microorganisms with broad phenotypic potentials.

Capacity Challenges
Low Sequencing Costs, but High Costs Overall HTS has profound impacts on both academic research and practical diagnostic and clinical surveillance. Despite this, there are several obstacles to its full adaptation in the routine detection of FBPs. The ideal HTS solution for the detection of FBPs should be rapid, accurate, operable, and economic. Obstacles include the purchase and implementation of the HTS platforms, but perhaps more importantly the subsequent data analysis and results interpretation (Loman et al., 2013;ECDC, 2015b). Sequencing costs have declined exponentially in the past decade, a development expected to continue. However, also costs associated with sample collection and preparation, nucleic acid extraction and purification, as well as any other steps in the preparation of sequencing libraries, sequencing, data management and downstream analyses must be considered (Sboner et al., 2011). As long as the rapid decline in the cost of data generation is not matched by a corresponding reduction in data storage, maintenance and processing costs, there will remain a substantial economic barrier preventing the full implementation of HTS in research and routine for FBP identification.

Computational Capacity is a Major Hurdle
HTS requires large computational capacities, i.e., various tools for data storage and data analysis, including data management, quality control, mapping and alignment, de novo assembly, scaffolding, gene annotation, metagenomics and biological significance interpretation (Cuccuru et al., 2014;Mayo et al., 2014). These bioinformatics tools, particularly their proper usage and output interpretation appear as an impenetrable barrier facing almost all scientists in either fields of food science and microbiology lacking or having weak bioinformatics background. A substantial effort will therefore be required to train relevant staff in use of these tools (Taboada et al., 2017). The application of HTS in FBP research and surveillance is also impeded or slowed down by the complexity of different formats of diverse software tools and limitations of computational capacities Xia et al., 2012;Cuccuru et al., 2014;Mayo et al., 2014). Several integrative frameworks consisting of publicly available research software and specifically designed pipelines are available for visualizing, querying and downloading the data released, and for HTS data processing and analysis for diverse microorganisms Wang et al., 2012;Cuccuru et al., 2014;Taboada et al., 2017). We believe it is essential that a corresponding user friendly web-based framework for FBPs detection using HTS is created. To fulfill this purpose, a close and effective collaboration between different scientists from related fields is urgently needed (Stapleton, 2014). These should jointly develop more automatic and reliable bioinformatics tools for sequencing reads' quality control (Edgar and Flyvbjerg, 2015), data compression and extraction (Wang and Zhang, 2011;Lassmann, 2015), data analysis on cloud computation (Kwon et al., 2015), and downstream analysis (Davis et al., 2015;Gweon et al., 2015;Müenz et al., 2015;Ratan et al., 2015;Wang et al., 2015a).

Database Limitations on Access and Contents
The establishment of a reference database for each major FBP species or subspecies poses yet another challenge. An accurate and complete reference database for all major FBPs is fundamental to epidemiological investigations and design of pathogen detection assays. The genome sequence is the prerequisite for understanding the molecular basis of a given phenotype of a FBP. Thus, genome sequencing is invaluable in advancing our understanding of virulence mechanisms and epidemiology of FBPs. Many complete or draft genome assemblies of the most common FBPs have been released (Mellmann et al., 2011;Timme et al., 2012;Schmitz-Esser and Wagner, 2014;Quick et al., 2015;Wu et al., 2015). This development is rapidly progressing and open data sharing including uploading of sequence data to public high-quality databases should be encouraged by all stakeholders. The availability of reference genomes is expected to speed up diagnosis and shorten FBD outbreaks, thus promoting public health. Additional complementary efforts are needed for more endemic FBPs.

Cost-Efficiency of Data Generation
The massive capacity of HTS can be utilized to find a causative agent or other target of relevance by unbiased ultradeep sequencing, but this "brute force" approach will generate massive non-target data and, with current sequence pricing, is only applicable to research applications. Late availability of interpretable data can be a drawback of HTS, depending on the protocols used (Loman et al., 2012;Reuter et al., 2013;Quick et al., 2015). Quick et al. (2015) demonstrated (on Salmonella) that the time to answer in a hospital outbreak can be reduced significantly without impact on the results using draft shotgun metagenomics by rapid MiSeq sequencing, compared to standard protocols for MiSeq and HiSeq. The same study also demonstrated that samples could be assigned to species level within 20 min, serotype within 40 min and whether the isolate was part of the outbreak in less than an hour using the MinION sequencing technology and a mapping approach. With alignment-free bioinformatics the speed of analysis is further improved (Ondov et al., 2016). Third generation sequencing platforms can report sequences in real-time and this opens the possibility to selectively sequence only the nucleic acid of interest (Loose et al., 2016). That could permit more cost-efficient use of the sequencing capacity. These examples illustrate some recent improvements to the potential of HTS in management of outbreaks.

Global Imbalance in Capacity vs. Needs
The populations in developing countries and rural areas are more commonly struggling with FBPs than the inhabitants of major cities in developed, Western countries. The access to advanced analytical technologies is inversely distributed. HTS technologies are only exceptionally developed for point-of-care and in-field applications, but recent examples of outbreak investigations of Ebola and Zika viruses using the MinION (Quick et al., 2016(Quick et al., , 2017 indicate that more robust and portable platforms may be available in the foreseeable future. However, even with access to the sequencers, the access to necessary consumables, computer capacities including databases, and competent manpower will likely remain obstacles to the implementation of HTS based analytical methods for many years. This is a serious challenge that must be solved in order to ensure that initiatives to build up global, collaborative networks and systems to cope with emerging FBPs are successful.

CONCLUSION
Despite the obvious benefits associated with potential independence from culturing, and the broad spectrum and level of detailed characterization of targets that can be detected simultaneously, we believe that in a short perspective HTS is unlikely to find application outside the most advanced laboratories. The reductionist tradition prevails in many laboratories, i.e., of testing analytically for single agents, in particular in the food sector. Detailed sequencing based characterization of reference strains and representative clinical, food and environmental samples is still in its relative infancy, with many FBP species still practically missing in the databases, and few species where the range of genetic diversity is close to completely characterized. However, as HTS studies accumulate so does the data pool and the appreciation of the potentials of the technology. Third generation HTS technology, as exemplified with the Oxford Nanopore MinION sequencer promise to permit on-site, long-read, real-time sequencing (Mikheyev and Tin, 2014;Greninger et al., 2015;Quick et al., 2015), but still struggles with laborious sample preparation requirements and high error rates. As these obstacles are mitigated, this type of technology will inevitably lead to a paradigm shift in FBP detection.

AUTHOR CONTRIBUTIONS
CS drafted most of the introduction and section on bacterial food pathogens and contributed to the discussion; AH coordinated the manuscript drafting, wrote the sections on fungal and parasitic food pathogens, and vectors and other HTS applications, and contributed to the introduction, discussion and sections on bacterial and viral pathogens; UD, GJ, and WL contributed to the introduction, the section on bacterial pathogens and the discussion; BS drafted the section on viral pathogens and contributed to the introduction and discussion; and WL and JS initiated the review. All authors (CS, AH, UD, GJ, WL, BS, and JS) critically revised the manuscript.

ACKNOWLEDGMENTS
This review was undertaken in connection with research activities in the Decathlon project (FP7-KBBE-2013-7-613908-Decathlon) funded by the European Commission in the 7th Framework Programme. This publication and all its contents reflect the views of the authors only, and the Commission cannot be held responsible for any use which may be made of the information contained herein. Complementary financial support was obtained from the Research Council of Norway. The authors are grateful to a reviewer for critical and useful comments and recommendations.